Scannerless parsing

Scannerless parsing

Scannerless parsing (also called "lexerless parsing") refers to the use of a single formalism to express both the lexical and context-free syntax used to parse a language.

This parsing strategy is suitable when a clear lexer-parser distinction is not required. examples of when this is appropriate include TeX, most wiki grammars, makefiles, and simple per application control languages.

Advantages

* Only a single metasyntax is required
* Non-regular lexical structure is handled easily
* "Token classification" is not required which eliminates the need for design accommodations such as "the lexer hack" and language keywords (such as "while" in C)
* Grammars are compositional (can be merged without human intervention) ref|compositional

Disadvantages

* since the lexical scanning and syntactic parsing processing is combined, the resulting parser tends to be harder to understand and debug for more complex languages

Required Extensions

Unfortunately, when parsed at the character level, most popular programming languages are no longer strictly context-free. Visser identified five key extensions to classical context-free syntax which handle almost all common non-context-free constructs arising in practice:

* Follow restrictions, a limited form of "longest match"
* Reject productions, a limited form of negative matching (as found in boolean grammars)
* Preference attributes to handle the dangling else construct in C-like languages
* "Per-production transitions" rather than per-nonterminal transitions in order to facilitate:
** Associativity attributes, which prevent a self-reference in a particular production of a nonterminal from producing that same production
** Precedence/priority rules, which prevent self-references in higher-precedence productions from producing lower-precedence productions

Implementations

* [http://dparser.sourceforge.net/ dparser] generates ANSI C code for scannerless GLR parsers
* the [http://www.cwi.nl/htbin/sen1/twiki/bin/view/SEN1/SGLR SGLR] component of ASF+SDF generates and interprets parse tables for scannerless grammars
* SGLR is also a component of the Stratego/XT program transformation system
* Spirit allows for both scannerless and scanner-based parsing
* SBP is a scannerless parser for boolean grammars (a superset of context-free grammars), written in Java.

Related Material

[http://citeseer.ist.psu.edu/visser97scannerles.html Visser, E. (1997b). Scannerless generalized-LR parsing. Technical Report P9707, Programming Research Group, University of Amsterdam]

# since lexerless parsers handle the entire class of context-free languages, which is closed under composition.


Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • Comparison of parser generators — This is a list of notable lexer generators and parser generators for various language classes. Contents 1 Regular languages 2 Deterministic context free languages 3 Parsing expression grammars, deterministic boolean grammars …   Wikipedia

  • Parser — Ein Parser [ˈpɑːʁzɐ] (engl. to parse, „analysieren“, bzw. lateinisch pars, „Teil“; im Deutschen gelegentlich auch Zerteiler) ist ein Computerprogramm, das in der Computertechnik für die Zerlegung und Umwandlung einer beliebigen Eingabe in ein für …   Deutsch Wikipedia

  • GLR parser — In computer science, a GLR parser ( Generalized Left to right Rightmost derivation parser ) is an extension of an LR parser algorithm to handle nondeterministic and ambiguous grammars. First described in a 1986 paper by Masaru Tomita, it has also …   Wikipedia

  • Maximal munch — In computer programming and computer science, maximal munch or longest match is the principle that when creating some construct, as much of the available input as possible should be consumed. It is similar, though not entirely identical, to the… …   Wikipedia

  • ASF+SDF Meta Environment — Infobox Software name = ASF+SDF Meta Environment caption = developer = SEN1 [http://www.cwi.nl/htbin/sen1/twiki/bin/view/SEN1/] group at the CWI [http://www.cwi.nl] latest release version = 2.0.1 latest release date = 08 September 2008 latest… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”