SourceForge.net Logo

 

Mouse: from Parsing Expressions to a practical parser

Version 1.6.1


Parsing Expression Grammar (PEG) is a new way to specify recursive-descent parsers with limited backtracking. The use of backtracking lifts the LL(1) restriction usually imposed by top-down parsers. In addition, PEG can define parsers with integrated lexing.
An example of PEG:

Sum     = Space Sign Number (AddOp Number)* !_ {} ;
Number  = Digits? "." Digits Space {}
        / Digits Space {} ;
Sign    = ("-" Space)? ;
AddOp   = [-+] Space ;
Digits  = [0-9]+ ;
Space   = " "* ;

Mouse is a tool to transcribe PEG into an executable parser written in Java. Unlike some existing PEG generators (e.g., Rats!), Mouse does not produce a storage-hungry "packrat parser", but a collection of transparent recursive procedures.
An example of parsing procedure generated by Mouse:

//=====================================================================
//  Sum = Space Sign Number (AddOp Number)* !_ {} ;
//=====================================================================
private boolean Sum()
  {
    begin("Sum");
    Space();
    Sign();
    if (!Number()) return reject();
    while (Sum_0());
    if (!aheadNot()) return reject();
    sem.Sum();
    return accept();
  }

An integral feature of Mouse is the mechanism for specifying semantics (also in Java). This makes Mouse a convenient tool if one needs an ad-hoc language processor. Being written in Java, the processor is operating-system independent.
A pair of brackets {} at the end of a PEG rule indicates that the rule has an associated semantic procedure. The procedure may look like this:

//-------------------------------------------------------------------
//  Sum = Space Sign Number AddOp Number ... AddOp Number
//          0     1    2      3     4         n-2   n-1
//-------------------------------------------------------------------
void Sum()
  {
    int n = rhsSize();
    int s = (Double)rhs(2).get();
    if (!rhs(1).isEmpty()) s = -s;
    for (int i=4;i<n;i+=2)
    {
      if (rhs(i-1).charAt(0)=='+')
        s += (Double)rhs(i).get();
      else
        s -= (Double)rhs(i).get();
    }
    System.out.println(s);
  }
Special functions like "rhsSize()" and "rhs(i)" provide access to parts of the parsed phrase.

Download user's manual / tutorial (PDF file).
Download the complete package (gzipped TAR file).

License: Apache Version 2.


 

Grammars for Java and C

Parsing Expression Grammars for Java and C in the format accepted by Mouse are found here.
They are also included in the TAR file.


Latest change 2014-05-13