Syntactic predicate

Syntactic predicate

A syntactic predicate specifies the syntactic validity of applying a production in a formal grammar and is analogous to a semantic predicate that specifies the semantic validity of applying a production. It is a simple and effective means of dramatically improving the recognition strength of an LL parser by providing arbitrary lookahead. In their original implementation, syntactic predicates had the form “( α )?” and could only appear on the left edge of a production. The required syntactic condition α could be any valid context-free grammar fragment.

Syntactic predicates implicitly order the productions so that predicated productions specified earlier have higher precedence than predicated productions specified later within the same decision. This conveys an ability to disambiguate ambiguous productions because the programmer can simply specify which production should match.

Parsing expression grammars, invented by Bryan Ford, are a powerful formalization and extension to these simple predicates (e.g., PEGs include "not predicates" and allow predicates anywhere within a production). Morever, Bryan invented packrat parsing to handle these grammars in linear time (at the cost of heap space). ANTLR version 3 keeps the same simple predicate structure, but uses rule memoization to also guarantee linear time for predicated parsing.

More formally, a syntactic predicate is a form of production intersection, used in parser specifications or in formal grammars. In this sense, the term "predicate" has the meaning of a mathematical indicator function. If "p1" and "p2," are production rules, the language generated by "both" "p1" and "p2" is their set intersection.

Overview

Terminology

The term "syntactic predicate" was coined by Parr & QuongParr, Terence J.& Quong, Russell, "Adding Semantic and Syntactic Predicates to LL(k) parsing: pred-LL(k)," Army High Performance Computing Research Center Preprint No. 93-096, October 1993.] and differentiates this form of predicate from semantic predicates (also discussed in that paper).

Syntactic predicates have been called "multi-step matching", "parse constraints", and simply "predicates" in various literature. (See References section below.) This article uses the term "syntactic predicate" throughout for consistency and to distinguish them from semantic predicates.

Formal Closure Properties

Bar-Hillel "et al." [Bar-Hillel, Y. "et al.", "On Formal Properties of Simple Phrase Structure Grammars," "Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung", Vol. 14 No. 2, pp. 143-172, 1961.] show that the intersection of two regular languages is also a regular language, which is to say that the regular languages are closed under intersection.

The intersection of a regular language and a context-free language is also closed, and it has been known at least since Hartmanis [Hartmanis, Juris, "Context-Free Languages and Turing Machine Computations," "Proceedings of Symposia in Applied Mathematics", Vol. 19, "Mathematical Aspects of Computer Science", AMS, pp. 42-51, 1967.] that the intersection of two context-free languages is not necessarily a context-free language (and is thus not closed). This can be demonstrated easily using the canonical Type 1 language, L = { a^n b^n c^n : n ge 1 } :

Let L_1 = { a^m b^n c^n : m,n ge 1 } (Type 2) Let L_2 = { a^n b^n c^m : m,n ge 1 } (Type 2) Let L_3 = L_1 cap L_2

Given the strings "abcc", "aabbc", and "aaabbbccc", it is clear that the only string that belongs to both L1 and L2 (that is, the only one that produces a non-empty intersection) is "aaabbbccc".

Other Considerations

In most formalisms that use syntactic predicates, the syntax of the predicate is noncommutative, which is to say that the operation of predication is ordered. For instance, using the above example, consider the following pseudo-grammar, where "X ::= Y PRED Z" is understood to mean: "Y" produces "X" if and only if "Y" also satisfies predicate "Z":

S ::= a X X ::= Y PRED Z Y ::= a+ BNCN Z ::= ANBN c+ BNCN ::= b [BNCN] c ANBN ::= a [ANBN] b

Given the string "aaaabbbccc", in the case where "Y" must be satisfied "first" (and assuming a greedy implementation), S will generate "aX" and "X" in turn will generate "aaabbbccc", thereby generating "aaaabbbccc". In the case where "Z" must be satisfied first, ANBN will fail to generate "aaaabbb", and thus "aaaabbbccc" is not generated by the grammar. Moreover, if either "Y" or "Z" (or both) specify any action to be taken upon reduction (as would be the case in many parsers), the order that these productions match determines the order in which those side-effects occur. Formalisms that vary over time (such as adaptive grammars) may rely on these side effects.

Examples of Use

;ANTLR

Parr & Quong give this example of a syntactic predicate: stat: (declaration)? declaration
expression ;

which is intended to satisfy the following informally statedStroustrup, Bjarne & Ellis, Margaret A., "The Annotated C++ Reference Manual, Addison-Wesley, 1990."] constraints of C++:

# If it looks like a declaration, it is; otherwise
# if it looks like an expression, it is; otherwise
# it is a syntax error.

In the first production of rule stat, the syntactic predicate (declaration)? indicates that declaration is the syntactic context that must be present for the rest of that production to succeed. We can interpret the use of (declaration)? as "I am not sure if declaration will match; let me try it out and, if it does not match, I shall try the next alternative." Thus, when encountering a valid declaration, the rule declaration will be recognized twice--once as syntactic predicate and once during the actual parse to execute semantic actions.

Of note in the above example is the fact that any code triggered by the acceptance of the "declaration" production will only occur if the predicate is satisfied.

Canonical Examples

The language L = {a^n b^n c^n | n ge 1} can be represented in various grammars and formalisms as follows:

;Parsing Expression Grammars

S ← &(A !b) a+ B !c A ← a A? b B ← b B? c

;§-Calculus

Using a "bound" predicate:

S → {A}B A → X 'c+' X → 'a' [X] 'b' B → 'a+' Y Y → 'b' [Y] 'c'

Using two "free" predicates:

A → <'a+'>"a" <'b+'>"b" &Psi;("a" "b")X <'c+'>"c" &Psi;("b" "c")Y X → 'a' [X] 'b' Y → 'b' [Y] 'c'

;Conjunctive Grammars

(Note: the following example actually generates L = {a^n b^n c^n | n ge 0}, but is included here because it is the example given by the inventor of conjunctive grammars.Okhotin, Alexander, "Conjunctive grammars," "Journal of Automata, Languages and Combinatorics", Vol. 6, No. 4, pp. 519-535, 2001] ):

S → AB&DC A → aA | &epsilon; B → bBc | &epsilon; C → cC | &epsilon; D → aDb | &epsilon;

;Perl 6 rules

rule S { > a+ &lt;B&gt; } rule A { a ? b } rule B { b &lt;B&gt;? c }

Parsers/Formalisms Using Some Form of Syntactic Predicate

Although by no means an exhaustive list, the following parsers and grammar formalisms employ syntactic predicates:

; ANTLR (Parr & Quong):As originally implemented, syntactic predicates sit on the leftmost edge of a production such that the production to the right of the predicate is attempted if and only if the syntactic predicate first accepts the next portion of the input stream. Although ordered, the predicates are checked first, with parsing of a clause continuing if and only if the predicate is satisified, and semantic actions only occurring in non-predicates.Parr, Terence & Quong, Russell, "ANTLR: A Predicated-"LL(k)" Parser Generator," "Software--Practice and Experience", Vol. 25, No. 7, pp. 789-810, July 1995.] ; Augmented Pattern Matcher (Balmas):Balmas refers to syntactic predicates as "multi-step matching" in her paper on APM.Balmas, Françoise, "An Augmented Pattern Matcher as a Tool to Synthesize Conceptual Descriptions of Programs," "Proceedings of the Ninth Knowledged-Based Software Engineering Conference", pp. 150-157, Monterey, California, 20-23 September 1994.] As an APM parser parses, it can bind substrings to a variable, and later check this variable against other rules, continuing to parse if and only if that substring is acceptable to further rules.; Parsing expression grammars (Ford):Ford's PEGs have syntactic predicates expressed as the "and-predicate" and the "not-predicate".Ford, Bryan, "Packrat Parsing: a Practical Linear-Time Algorithm with Backtracking", Master’s thesis, Massachusetts Institute of Technology, September 2002.] ; §-Calculus (Jackson):In the §-Calculus, syntactic predicates are originally called simply "predicates", but are later divided into "bound" and "free" forms, each with different input properties.Jackson, Quinn Tyler, "Adapting to Babel: Adaptivity & Context-Sensitivity in Parsing", Ibis Publishing, Plymouth, Massachusetts, March 2006.] ; Perl 6 rules:Perl 6 introduces a generalized tool for describing a grammar called "rules", which are an extension of Perl 5's regular expression syntax.Wall, Larry, " [http://dev.perl.org/perl6/doc/design/syn/S05.html Synopsis 5: Regexes and Rules] ," 2002-2006.] Predicates are introduced via a lookahead mechanism called "before", either with "" or "" (that is: "not" before"). Perl 5 also has such lookahead, but it can only encapsulate Perl 5's more limited regexp features.; ProGrammar (NorKen Technologies):ProGrammar's GDL (Grammar Definition Language) makes use of syntactic predicates in a form called "parse constraints".NorKen Technologies website, " [http://www.programmar.com/grammar.htm Grammar Definition Language] " page.] ; Conjunctive and Boolean Grammars (Okhotin):Conjunctive grammars, first introduced by Okhotin,Okhotin, Alexander, "On Augmenting the Formalism of Context-Free Grammars with an Intersection Operation," (in Russian), "Proceedings of the Fourth International Conference "Discrete Models in the Theory of Control Systems," pp. 106-109, 2000.] introduce the explicit notion of conjunction-as-predication. Later treatment of conjunctive and boolean grammarsOkhotin, Alexander, "Boolean Grammars: Expressive Power and Algorithms", Doctoral thesis, School of Computing, Queens University, Kingston, Ontario, August 2004.] is the most thorough treatment of this formalism to date.

References

External links

* [http://www.antlr.org/ ANTLR site]
** Paper: [http://www.antlr.org/article/1055550346383/antlr.pdf "ANTLR: A Predicated Parser-Generator"] (PDF)
* [http://users.utu.fi/aleokh/conjunctive/ Alexander Okhotin's Conjunctive Grammars Page]
* [http://users.utu.fi/aleokh/boolean/ Alexander Okhotin's Boolean Grammars Page]
* [http://pdos.csail.mit.edu/~baford/packrat/ The Packrat Parsing and Parsing Expression Grammars Page]
** Thesis: [http://pdos.csail.mit.edu/~baford/packrat/thesis/ "Packrat Parsing: a Practical Linear-Time Algorithm with Backtracking"] (PDF and Postscript, as well as various source files)


Wikimedia Foundation. 2010.

Игры ⚽ Поможем написать реферат

Look at other dictionaries:

  • Predicate — or predication may refer to:*Predicate (mathematics), a relation, or the boolean valued characteristic function or indicator function of a relation *Predicate (logic), a fundamental concept in first order logic **in Bertrand Russell s theory of… …   Wikipedia

  • predicate — predication, n. predicational, adj. predicative /pred i kay tiv, keuh /; Brit. /pri dik euh tiv/, adj. predicatively, adv. v. /pred i kayt /; adj., n. /pred i kit/, v., predicated, predicating …   Universalium

  • predicate — pred•i•cate v. [[t]ˈprɛd ɪˌkeɪt[/t]] adj., n. [[t] kɪt[/t]] v. cat•ed, cat•ing, adj. n. 1) to proclaim; declare; affirm; assert 2) pho logic a) to affirm or assert (something) of the subject of a proposition b) to make (a term) the predicate of… …   From formal English to slang

  • predicate — Synonyms and related words: IC analysis, advance, affirm, affirmance, affirmation, allegation, allege, announce, announcement, annunciate, annunciation, appositive, argue, assert, assertion, assever, asseverate, asseveration, attribute,… …   Moby Thesaurus

  • Parsing expression grammar — A parsing expression grammar, or PEG, is a type of analytic formal grammar that describes a formal language in terms of a set of rules for recognizing strings in the language. A parsing expression grammar essentially represents a recursive… …   Wikipedia

  • Memoization — Not to be confused with Memorization. In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs.… …   Wikipedia

  • Glue Semantics — Glue Semantics, or simply Glue (Dalrymple et al. 1993; Dalrymple 1999, 2001) is a linguistic theory of semantic composition and the syntax semantics interface which assumes that meaning composition is constrained by a set of instructions stated… …   Wikipedia

  • Perl 6 — Infobox programming language name = Perl paradigm = Multi paradigm year = 2000 designer = Larry Wall latest release version = pre release latest release date = typing = dynamic, static influenced by = Perl 5, Haskell, Smalltalk influenced =… …   Wikipedia

  • Adaptive grammar — An adaptive grammar is a formal grammar that explicitly provides mechanisms within the formalism to allow its own production rules to be manipulated. Contents 1 Overview 1.1 Early history 1.2 Collaborative efforts …   Wikipedia

  • Syntax — Syntactic redirects here. For another meaning of the adjective, see Syntaxis. For other uses, see Syntax (disambiguation). Linguistics …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”