Wirth syntax notation

Wirth syntax notation (WSN) is a metasyntax, that is, a formal way to describe formal languages. Niklaus Wirth proposed it in 1977 as an alternative to Backus-Naur form (BNF). It has several advantages over BNF: it can be defined using itself, it contains an explicit iteration construct, and it avoids the use of an explicit symbol for the empty string (such as ε). [Niklaus Wirth, "What Can We Do about the Unnecessary Diversity of Notations for Syntax Definitions?", Communications of the ACM, vol. 20, no. 11, November 1977, pp. 822–823, doi:10.1145/359863.359883, http://doi.acm.org/10.1145/359863.359883]

WSN has been used in several international standards, starting with ISO 10303-21. ["ISO DIS 10303-21 Product Data Representation and Exchange, Part 21: Implementation Methods, Clear Text Encoding of the Exchange Structure (Annex B: WSN notational conventions)". TC 184/SC4 N204. Secretariat, NIST, 1993-05-28] It was also used to define the syntax of EXPRESS, the data modelling language of STEP.

WSN defined in itself

SYNTAX     = { PRODUCTION } .
PRODUCTION = IDENTIFIER "=" EXPRESSION "." .
EXPRESSION = TERM { "|" TERM } .
TERM       = FACTOR { FACTOR } .
FACTOR     = IDENTIFIER
           | LITERAL
           | "[" EXPRESSION "]"
           | "(" EXPRESSION ")"
           | "{" EXPRESSION "}" .
IDENTIFIER = letter { letter } .
LITERAL    = """" character { character } """" .

The equals sign indicates a production. The element on the left is defined to be the combination of elements on the right. A production is terminated by a full stop (period).
* Repetition is denoted by curly brackets, e.g., {a} stands for ε | a | aa | aaa | ....
* Optionality is expressed by square brackets, e.g., [a] b stands for ab | b.
* Parentheses serve for grouping, e.g., (a|b)c stands for ac | bc.
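Because WSN is defined in itself, a small recursive-descent parser can follow the productions above almost one-to-one. The following Python sketch illustrates that correspondence and is not a reference implementation. It assumes that letter means an ASCII letter and that a doubled quote inside a literal stands for a single quote character; the names tokenize, Parser and WSN_IN_WSN are hypothetical.

```python
import re

# Assumptions: "letter" is an ASCII letter, and a doubled quote ("") inside
# a literal stands for one quote character.
TOKEN = re.compile(r'\s*(?:("(?:[^"]|"")+")|([A-Za-z]+)|([=.|()\[\]{}]))')

def tokenize(text):
    """Split WSN source into (kind, value) pairs."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character near offset {pos}")
        if m.group(1):
            tokens.append(("LITERAL", m.group(1)))
        elif m.group(2):
            tokens.append(("IDENTIFIER", m.group(2)))
        else:
            tokens.append((m.group(3), m.group(3)))  # punctuation is its own kind
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i][0] if self.i < len(self.tokens) else None

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind!r}, found {self.peek()!r}")
        value = self.tokens[self.i][1]
        self.i += 1
        return value

    def syntax(self):        # SYNTAX = { PRODUCTION } .
        rules = {}
        while self.peek() is not None:
            name, expr = self.production()
            rules[name] = expr
        return rules

    def production(self):    # PRODUCTION = IDENTIFIER "=" EXPRESSION "." .
        name = self.expect("IDENTIFIER")
        self.expect("=")
        expr = self.expression()
        self.expect(".")
        return name, expr

    def expression(self):    # EXPRESSION = TERM { "|" TERM } .
        terms = [self.term()]
        while self.peek() == "|":
            self.expect("|")
            terms.append(self.term())
        return ("alt", terms)

    def term(self):          # TERM = FACTOR { FACTOR } .
        factors = [self.factor()]
        while self.peek() in ("IDENTIFIER", "LITERAL", "[", "(", "{"):
            factors.append(self.factor())
        return ("seq", factors)

    def factor(self):        # FACTOR = IDENTIFIER | LITERAL | "[" ... "]" | ...
        kind = self.peek()
        if kind in ("IDENTIFIER", "LITERAL"):
            return (kind.lower(), self.expect(kind))
        closers = {"[": ("opt", "]"), "(": ("group", ")"), "{": ("repeat", "}")}
        if kind in closers:
            tag, close = closers[kind]
            self.expect(kind)
            inner = self.expression()
            self.expect(close)
            return (tag, inner)
        raise SyntaxError(f"unexpected token {kind!r}")

WSN_IN_WSN = '''
SYNTAX     = { PRODUCTION } .
PRODUCTION = IDENTIFIER "=" EXPRESSION "." .
EXPRESSION = TERM { "|" TERM } .
TERM       = FACTOR { FACTOR } .
FACTOR     = IDENTIFIER | LITERAL | "[" EXPRESSION "]"
           | "(" EXPRESSION ")" | "{" EXPRESSION "}" .
IDENTIFIER = letter { letter } .
LITERAL    = """" character { character } """" .
'''

rules = Parser(tokenize(WSN_IN_WSN)).syntax()
print(list(rules))
```

Each curly-bracket repetition in the grammar becomes a while loop and each alternative in FACTOR becomes a branch, which is one practical benefit of having an explicit iteration construct in the notation.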

We take these concepts for granted today, but they were novel and even controversial in 1977. Wirth later incorporated some of the concepts (with a different syntax and notation) into Extended Backus-Naur form.

Notice that letter and character are left undefined. This is because numeric characters (digits 0 through 9) may be included in both definitions or excluded from one, depending on the language being defined, for example:

digit      = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" .
upper-case = "A" | "B" | ... | "Y" | "Z" .
lower-case = "a" | "b" | ... | "y" | "z" .
letter     = upper-case | lower-case .

If character goes on to include digit and other printable ASCII characters, then it diverges even more from letter, which one can assume does not include the digit characters or any of the special (non-alphanumeric) characters.
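The choice between these definitions can be made concrete with character sets. The Python sketch below is illustrative only: it assumes that character covers the printable ASCII range (space included), and is_identifier is a hypothetical helper that checks the production IDENTIFIER = letter { letter } against whichever set is chosen for letter.

```python
import string

digit = set(string.digits)
upper_case = set(string.ascii_uppercase)
lower_case = set(string.ascii_lowercase)
letter = upper_case | lower_case          # letter = upper-case | lower-case .

# Assumption: "character" is every printable ASCII character, space included.
character = {chr(code) for code in range(0x20, 0x7F)}

def is_identifier(text, letter_set):
    """IDENTIFIER = letter { letter } . for a caller-supplied definition of letter."""
    return len(text) > 0 and all(ch in letter_set for ch in text)

print(is_identifier("x1", letter))          # digits excluded from letter
print(is_identifier("x1", letter | digit))  # digits included in letter
```

Under the first definition an identifier such as x1 is rejected, and under the second it is accepted, which is why the language being described has to settle the matter.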

Another example

The syntax of BNF can be represented with WSN as follows, obtained by translating BNF's definition of itself:

syntax         = rule [ syntax ] .
rule           = opt-whitespace "<" rule-name ">" opt-whitespace "::="
                 opt-whitespace expression line-end .
opt-whitespace = { " " } .
expression     = list [ "|" expression ] .
line-end       = opt-whitespace EOL | line-end line-end .
list           = term [ opt-whitespace list ] .
term           = literal | "<" rule-name ">" .
literal        = """" text """" | "'" text "'" .

This definition appears overly complicated because the concept of "optional whitespace" must be explicitly defined in BNF, but it is implicit in WSN. Even in this example, text is left undefined, but it is assumed to mean "ASCII-character { ASCII-character }". (EOL is also left undefined.) Notice how the kludge "<" rule-name ">" has been used twice because text was not explicitly defined.

One of the problems with BNF which this example illustrates is that by allowing both single-quote and double-quote characters to be used for a literal, there is an added potential for human error in attempting to create a machine-readable syntax. One of the concepts migrated to later metasyntaxes was the idea that giving the user multiple choices made it harder to write parsers for grammars defined by the syntax, so computer languages in general have become more restrictive in how a "quoted-literal" is defined.
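The cost of allowing two quote styles shows up even in a tiny scanner. The Python sketch below is a hedged illustration: scan_literal is a hypothetical helper, and it assumes that a doubled quote inside a literal stands for one quote character, in the spirit of the `""""` literal used above. With a single permitted quote character the closing delimiter is known in advance; with two, the scanner must remember which one opened the literal.

```python
def scan_literal(text, pos, quotes='"'):
    """Scan a quoted literal starting at text[pos]; return (value, next_pos).

    `quotes` lists the permitted quote characters. Assumption: a doubled
    quote inside the literal stands for a single quote character.
    """
    opener = text[pos] if pos < len(text) else ""
    if opener == "" or opener not in quotes:
        raise SyntaxError(f"expected one of {quotes!r} at offset {pos}")
    value, i = [], pos + 1
    while i < len(text):
        ch = text[i]
        if ch == opener:
            if text[i + 1:i + 2] == opener:   # doubled quote: keep one copy
                value.append(opener)
                i += 2
                continue
            if not value:
                raise SyntaxError("empty literal")
            return "".join(value), i + 1      # closing quote found
        value.append(ch)                      # the other quote style is plain text
        i += 1
    raise SyntaxError(f"unterminated literal starting at offset {pos}")

# WSN style: only the double quote delimits a literal.
print(scan_literal('"abc" rest', 0))
# BNF style: either quote may open a literal, so the opener must be tracked.
print(scan_literal("'it''s'", 0, quotes="\"'"))
```

Calling scan_literal on a single-quoted literal with the default quotes argument raises SyntaxError, which mirrors a stricter notation rejecting the second style outright.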

Wikimedia Foundation. 2010.
