Shunting yard algorithm

The shunting yard algorithm is a method for parsing mathematical expressions specified in infix notation. It can be used to produce output in Reverse Polish notation (RPN) or as an abstract syntax tree (AST). The algorithm was invented by Edsger Dijkstra and named the "shunting yard" algorithm because its operation resembles that of a railroad shunting yard.

Like the evaluation of RPN, the shunting yard algorithm is stack-based. Infix expressions are the form of mathematical notation most people are used to, for instance 3+4 or 3+4*(2−1). The conversion uses two text variables (strings), the input and the output, with the output treated as a queue. There is also a stack that holds operators not yet added to the output queue. To convert, the program reads each token in order and acts according to the kind of token it is.

A simple conversion

:Input: 3+4
# Add 3 to the output queue (whenever a number is read, it is added to the output).
# Push + (or its ID) onto the operator stack.
# Add 4 to the output queue.
# After reading the expression, pop the operators off the stack and add them to the output. In this case there is only one, "+".
# Output: 3 4 +

This already shows a couple of rules:
* All numbers are added to the output when they are read.
* At the end of reading the expression, pop all operators off the stack and onto the output.
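
As a minimal sketch, the two rules above can be written in Python as follows. The function name `naive_to_rpn` and the regular-expression tokenizer are assumptions made for illustration. Because the sketch ignores precedence, associativity, and parentheses, it is only meant for inputs as simple as the example above.

```python
import re

def naive_to_rpn(expression):
    """Apply only the two rules shown so far (no precedence handling)."""
    output = []     # the output queue
    operators = []  # the operator stack
    # Assumed tokenizer: unsigned integers and the ASCII operators + - * /
    for token in re.findall(r"\d+|[-+*/]", expression):
        if token.isdigit():
            output.append(token)      # rule 1: numbers go straight to the output
        else:
            operators.append(token)   # operators wait on the stack
    while operators:
        output.append(operators.pop())  # rule 2: flush the stack at the end
    return " ".join(output)

print(naive_to_rpn("3+4"))  # 3 4 +
```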

The algorithm in detail

* While there are tokens to be read:
:* Read a token.
:* If the token is a number, then add it to the output queue.
:* If the token is a function token, then push it onto the stack.
:* If the token is a function argument separator (e.g., a comma):
::* Until the topmost element of the stack is a left parenthesis, pop the element from the stack and push it onto the output queue. If no left parenthesis is encountered, either the separator was misplaced or parentheses were mismatched.
:* If the token is an operator, o1, then:
::* while there is an operator, o2, at the top of the stack, and either
:::: o1 is associative or left-associative and its precedence is less than or equal to that of o2, or
:::: o1 is right-associative and its precedence is less than that of o2,
::: pop o2 off the stack, onto the output queue;
::* push o1 onto the stack.
:* If the token is a left parenthesis, then push it onto the stack.
:* If the token is a right parenthesis:
::* Until the token at the top of the stack is a left parenthesis, pop operators off the stack onto the output queue.
::* Pop the left parenthesis from the stack, but not onto the output queue.
::* If the token at the top of the stack is a function token, pop it onto the output queue.
::* If the stack runs out without finding a left parenthesis, then there are mismatched parentheses.
* When there are no more tokens to read:
:* While there are still operator tokens in the stack:
::* If the operator token on the top of the stack is a parenthesis, then there are mismatched parentheses.
::* Pop the operator onto the output queue.
* Exit.
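
The steps above can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation: the operator table, the function names in `FUNCTIONS`, the tokenizer pattern, and the name `to_rpn` are all assumptions chosen for the example. The tokenizer accepts only ASCII operators and silently skips characters it does not recognize.

```python
import re

# Assumed operator table: symbol -> (precedence, associativity)
OPERATORS = {
    "+": (2, "left"),
    "-": (2, "left"),
    "*": (3, "left"),
    "/": (3, "left"),
    "^": (4, "right"),
}
FUNCTIONS = {"max", "min", "sin"}  # hypothetical function tokens

# Assumed tokenizer: numbers, identifiers, operators, parentheses, commas
TOKEN_PATTERN = re.compile(r"\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*/^(),]")

def to_rpn(expression):
    """Convert an infix expression string to a list of RPN tokens."""
    output = []  # the output queue
    stack = []   # the operator stack
    for token in TOKEN_PATTERN.findall(expression):
        if token in FUNCTIONS:
            stack.append(token)
        elif token == ",":
            # Function argument separator: unwind to the left parenthesis
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("misplaced separator or mismatched parentheses")
        elif token in OPERATORS:
            prec1, assoc1 = OPERATORS[token]
            while stack and stack[-1] in OPERATORS:
                prec2 = OPERATORS[stack[-1]][0]
                if (assoc1 == "left" and prec1 <= prec2) or \
                   (assoc1 == "right" and prec1 < prec2):
                    output.append(stack.pop())
                else:
                    break
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("mismatched parentheses")
            stack.pop()  # discard the left parenthesis
            if stack and stack[-1] in FUNCTIONS:
                output.append(stack.pop())
        else:
            output.append(token)  # a number (or a variable name)
    while stack:
        top = stack.pop()
        if top in ("(", ")"):
            raise ValueError("mismatched parentheses")
        output.append(top)
    return output

print(" ".join(to_rpn("3+4*(2-1)")))  # 3 4 2 1 - * +
```

Treating any identifier that is not in `FUNCTIONS` as an operand is a design choice of this sketch; it lets variable names pass through to the output in the same way as numbers.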

Complex example

If you were writing an interpreter, the RPN output would be tokenized and written to a compiled file to be interpreted later. Conversion from infix to RPN can also make it easier to simplify expressions. To do this, evaluate the RPN expression as usual, but treat every variable as having a null value. Whenever an operator has a null operand, the operator and its operands are written to the output (this is a simplification; problems arise when the operands are themselves the results of operators). When an operator has no null operands, its value can simply be written to the output. This method does not capture every possible simplification; it amounts to a constant folding optimization.
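
The following Python sketch illustrates this kind of constant folding on a list of RPN tokens. The name `fold_constants`, the helper `as_tokens`, and the restriction to four binary operators are assumptions made for the example. To sidestep the problem with operands that are themselves operator results, the sketch keeps each unevaluated sub-expression on the stack as a list of tokens instead of a null value; this is one possible approach, not necessarily the one the text has in mind.

```python
import operator

# Assumed set of binary operators; no handling of division by zero
BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def as_tokens(entry):
    """Turn a stack entry (a float or a token list) back into RPN tokens."""
    if isinstance(entry, list):
        return entry
    return [str(int(entry))] if entry.is_integer() else [repr(entry)]

def fold_constants(rpn_tokens):
    """Evaluate the constant parts of an RPN token list, keeping the rest."""
    stack = []  # each entry is a float (known value) or a token list (unknown)
    for token in rpn_tokens:
        if token in BINARY_OPS:
            right = stack.pop()
            left = stack.pop()
            if isinstance(left, float) and isinstance(right, float):
                # Both operands are known: compute the value now
                stack.append(BINARY_OPS[token](left, right))
            else:
                # At least one operand is unknown: keep the sub-expression
                stack.append(as_tokens(left) + as_tokens(right) + [token])
        else:
            try:
                # Note: names float() accepts, such as "inf", count as numbers
                stack.append(float(token))
            except ValueError:
                stack.append([token])  # a variable: its value is unknown
    result = []
    for entry in stack:
        result.extend(as_tokens(entry))
    return result

print(fold_constants("3 4 + x *".split()))  # ['7', 'x', '*']
```

As the text notes, this does not find every simplification: the input `x 1 + 2 +` is returned unchanged, because the sketch never reorders operands.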

See also

*Operator-precedence parser
*Reverse Polish notation

External links

* [http://www.chris-j.co.uk/parsing.php Java Applet demonstrating the Shunting yard algorithm]
* [http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm Parsing Expressions by Recursive Descent] Theodore Norvell (C) 1999–2001. Accessed September 14, 2006.
* [http://montcs.bloomu.edu/~bobmon/Information/RPN/infix2rpn.shtml Infix to RPN Algorithm]
* [http://www.cs.utexas.edu/~EWD/MCReps/MR35.PDF Original description of the Shunting yard algorithm]
* [http://www.kallisti.net.nz/blog/2008/02/extension-to-the-shunting-yard-algorithm-to-allow-variable-numbers-of-arguments-to-functions/ Extension to the ‘Shunting Yard’ algorithm to allow variable numbers of arguments to functions]


Wikimedia Foundation. 2010.
