GLR parser

GLR parser

In computer science, a GLR parser ("Generalized Left-to-right Rightmost derivation parser") is an extension of an LR parser algorithm to handle nondeterministic and ambiguous grammars. First described in a 1986 paper by Masaru Tomita, it has also been referred to as a "parallel parser".

Though the algorithm has evolved since its original form, the principles have remained intact: Tomita's goal was to parse natural language text thoroughly and efficiently. Standard LR parsers cannot accommodate the nondeterministic and ambiguous nature of natural language, and the GLR algorithm can.

Algorithm

Briefly, the GLR algorithm works in a manner similar to the LR parser algorithm, except that, given a particular grammar, a GLR parser will process all possible interpretations of a given input in a breadth-first search. On the front-end, a GLR parser generator converts an input grammar into parser tables, in a manner similar to an LR generator. However, where LR parse tables allow for only one state transition (given a state and an input token), GLR parse tables allow for multiple transitions. In effect, GLR allows for shift/reduce and reduce/reduce conflicts.

When a conflicting transition is encountered, the parse stack is forked into two or more parallel parse stacks, where the state corresponding to each possible transition is at the top. Then, the next input token is read and used to determine the next transition(s) for each of the "top" states -- and further forking can occur. If any given top state and input token do not result in at least one transition, then that "path" through the parse tables is invalid and can be discarded.

A crucial optimization allows sharing of common prefixes and suffixes of these stacks, which constrains the overall search space and memory usage required to parse input text. The complex structures that arise from this improvement make the stack more like a lattice of nodes.

Advantages

When implemented carefully, the GLR algorithm has the same time complexity as the CYK algorithm and Earley algorithm -- "O"("n"3). However, GLR carries two additional advantages:

* The time required to run the algorithm is proportional to the degree of nondeterminism in the grammar -- on deterministic grammars the GLR algorithm runs in "O"("n") time (this is not true of the Earley and CYK algorithms)
* The GLR algorithm is "on-line" -- that is, it consumes the input tokens in a specific order and performs as much work as possible after consuming each token.

In practice, most programming languages are deterministic or "nearly deterministic," meaning that any nondeterminism is usually resolved within a small (though possibly unbounded) number of tokens. Compared to other algorithms capable of handling the full class of context-free grammars (such as Earley or CYK), the GLR algorithm gives better performance on these "nearly deterministic" grammars, because only a single stack will be active during the majority of the parsing process.

Implementations

The freely available [http://dparser.sourceforge.net/ dparser] , the Scannerless Boolean Parser, the ASF+SDF framework, and GNU Bison support GLR parser generation.


Wikimedia Foundation. 2010.

Игры ⚽ Нужен реферат?

Look at other dictionaries:

  • GLR — can refer to:* BBC London 94.9 * GLR parser * The great lakes region of Africa. * The initials of George Lincoln Rockwell. * Gateway Location Register GSM term a proxy for HLR (Home Location Register) to optimise signalling of subscriber profile… …   Wikipedia

  • Parser-Generator — Ein Parsergenerator ist ein Computerprogramm, das unter Eingabe einer Spezifikation einen Parser erzeugt. Inhaltsverzeichnis 1 Grundlagen 2 Algorithmen 3 Siehe auch 4 Weblinks // …   Deutsch Wikipedia

  • GLR-парсер — (от англ. Generalized Left to right Rightmost derivation parser Обобщенный восходящий магазинный анализатор) в информатике расширенный алгоритм LR парсера, предназначенный для разбора по недетерменированным и неоднозначным грамматикам.… …   Википедия

  • GLR — steht für: Central Mountain Air, kanadische Fluggesellschaft (ICAO Code) Gaylord (Michigan) in den Vereinigten Staaten (IATA Code des Flughafens) Generalized LR(k), ein Parseverfahren für kontextfreie Grammatiken; siehe Tomita Parser Greater… …   Deutsch Wikipedia

  • Parser — Analyse syntaxique Pour les articles homonymes, voir Analyseur. L analyse syntaxique consiste à mettre en évidence la structure d un texte, généralement un programme informatique ou du texte écrit dans une langue naturelle. Un analyseur… …   Wikipédia en Français

  • Scannerless Boolean Parser — Infobox Software name = SBP developer = Adam Megacz operating system = Java Virtual Machine genre = Parser generator license = BSD license website = [http://research.cs.berkeley.edu/project/sbp/] The Scannerless Boolean Parser is a free software… …   Wikipedia

  • Comparison of parser generators — This is a list of notable lexer generators and parser generators for various language classes. Contents 1 Regular languages 2 Deterministic context free languages 3 Parsing expression grammars, deterministic boolean grammars …   Wikipedia

  • Tomita-Parser — Ein Tomita Parser (nach Masaru Tomita) ist ein Parsverfahren für kontextfreie Grammatiken, das eine Verallgemeinerung des LR(k) Verfahrens ist. Das Verfahren wird deshalb auch GLR(k) Verfahren (für Generalized LR(k)) genannt. Ausgangspunkt des… …   Deutsch Wikipedia

  • Context-free grammar — In formal language theory, a context free grammar (CFG) is a formal grammar in which every production rule is of the form V → w where V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals (w can be empty). The… …   Wikipedia

  • Parsing — In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a sequence of tokens to determine their grammatical structure with respect to a given (more or less) formal grammar.Parsing is also… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”