Parsing table

Parsing table

A parsing table is the part of a parser that makes decisions about how to treat input tokens in compiler development.

: "This page is about LR parsing tables, there are also LL parsing tables which are substantially different."

Overview

A parsing table is a table describing what action its parser should take when a given input comes while it is in a given state. It is a tabular representation of a pushdown automaton that is generated from the context-free grammar of the language to be parsed. This is possible because the set of viable prefixes (that is, the sequence of input that can be set aside before a parser needs to either perform a reduction or return an error) of a context-free language is a regular language, so the parser can use a simple parse state table to recognize each viable prefix and determine what needs to be done next.

A parsing table is used with a stack and an input buffer. The stack starts out empty, and symbols are shifted onto it one by one from the input buffer with associated states; in each step, the parsing table is consulted to decide what action to take.

More simply, an Action table defines what to do when a terminal symbol is encountered, and a goto table defines what to do when a nonterminal is encountered.

The parsing table consists of two parts, the action table and the goto table. The action table takes the state at the top of the stack and the next symbol in the input buffer (called the "lookahead" symbol) and returns the action to take, and the next state to push onto the stack. The goto table returns the next state for the parser to enter when a reduction exposes a new state on the top of the stack. The goto table is needed to convert the operation of the deterministic finite automaton of the Action table to a pushdown automaton.

For example, a parsing table might take a current position in state 1 and an input of "+" from the input buffer, and return instructions to shift the current symbol onto the stack and move to state 4. More precisely, the parser would push the token "+" onto the stack, followed by the symbol 4. Or, for an example of the use of a goto table, the current stack might contain "E+E", which is supposed to reduce to "T". After that reduction, looking up "T" in the goto table row corresponding to the state currently at the top of the stack (that is, the state that was under the "E+E" popped off to do the reduction) would tell the parser to push state 2 on top of "T". Action may be a Reduction, a Shift, or Accept or None.

Reduction

: Happens when the parser recognizes a sequence of symbols to represent a single nonterminal, and substitutes that nonterminal for the sequence. In an LR parser, this means that the parser will pop 2*N entries from the stack (N being the length of the sequence recognized - this number is doubled because with each token pushed to the stack, a state is pushed on top, and removing the token requires that the parser also remove the state) and push the recognized nonterminal in its place. This nonterminal will not have an associated state, so to determine the next parse state, the parser will use the goto table.

: This explanation can be made more precise using the "E+E" example above. Suppose the input buffer is currently empty and the contents of the stack are, from bottom to top:

: 0 E 1 + 5 E 1

: The parser sees that there are no more tokens in the input stream and looks up the empty token in the parsing table. The table contains an instruction, for example, "r4" - this means to use reduction 4 to reduce the contents of the stack. Reduction 4, part of the grammar used to specify the parser's context-free language, should be something along the lines of:

: T -> E+E

: If the parser were to reduce the "E+E" to a "T", it would pop six symbols from the stack, leaving the stack containing:

: 0

: and then push the new "T", followed by the state number found in the goto table at row 0, column T.

hift

: In this case, the parser makes the choice to shift the next symbol onto the stack. The token is moved from the input buffer to the stack along with an associated state, also specified in the action table entry, which will be the next parser state. The parser then advances to the next token.

Accept

: When a parse is complete and the sequence of tokens in question can be matched to an accept state in the parse table

ee also

* Deterministic finite state machine
* LR parser
* LL parser
* Compiler

References

* [http://www.inf.unibz.it/~artale/Compiler/slide4.pdf "Lecture slides about generating parsing tables"] , by Professor Alessandro Artale, researcher and lecturer at the Free University of Bolzano (from page 41).


Wikimedia Foundation. 2010.

Игры ⚽ Нужен реферат?

Look at other dictionaries:

  • Bottom-up parsing — (also known as shift reduce parsing) is a strategy for analyzing unknown data relationships that attempts to identify the most fundamental units first, and then to infer higher order structures from them. It attempts to build trees upward toward… …   Wikipedia

  • Control table — This simple control table directs program flow according to the value of the single input variable. Each table entry holds a possible input value to be tested for equality (implied) and a relevant subroutine to perform in the action column. The… …   Wikipedia

  • LL parser — An LL parser is a top down parser for a subset of the context free grammars. It parses the input from Left to right, and constructs a Leftmost derivation of the sentence (hence LL, compared with LR parser). The class of grammars which are… …   Wikipedia

  • LR parser — In computer science, an LR parser is a parser for context free grammars that reads input from Left to right and produces a Rightmost derivation. The term LR( k ) parser is also used; here the k refers to the number of unconsumed look ahead input… …   Wikipedia

  • CYK algorithm — The Cocke–Younger–Kasami (CYK) algorithm (alternatively called CKY) is a parsing algorithm for context free grammars. It employs bottom up parsing and dynamic programming. The standard version of CYK operates only on context free grammars given… …   Wikipedia

  • LALR parser — In computer science, a lookahead LR parser or LALR parser is a specialized form of LR parser that can deal with more context free grammars than Simple LR (SLR) parsers. It is a very popular type of parser because it gives a good trade off between …   Wikipedia

  • Simple precedence parser — In computer science, a Simple precedence parser is a type of bottom up parser for context free grammars that can be used only by Simple precedence grammars.The implementation of the parser is quite similar to the generic bottom up parser. A stack …   Wikipedia

  • Memoization — Not to be confused with Memorization. In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs.… …   Wikipedia

  • GOLD (parser) — Infobox Software name = GOLD Parsing System caption = developer = Devin Cook [http://www.devincook.com/goldparser/contributors Multiple Contributors] latest release date = 2007 07 29 latest release version = 3.4.4 operating system = Windows… …   Wikipedia

  • Index (search engine) — Search engine indexing collects, parses, and stores data to facilitate fast and accurate information retrieval. Index design incorporates interdisciplinary concepts from linguistics, cognitive psychology, mathematics, informatics, physics, and… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”