- Quex
Infobox_Software
name = quex
developer = Frank-Rene Schäfer
latest_release_version = 0.23.8
latest_release_date = March 6, 2008
operating_system = Cross-platform
genre = Lexical analyzer generator
license = LGPL (with military use exclusion)
website = [http://quex.sourceforge.net/ quex.sourceforge.net]
Quex is a lexical analyzer generator implemented in the Python programming language that creates C++ lexical analyzers. Significant features include the ability to generate lexical analyzers that operate on Unicode input, the creation of direct coded (non-table based) lexical analyzers and the use of inheritance relationships in lexical analysis modes.
Features
Direct Coded Lexical Analyzers
Quex uses the traditional steps: Thompson construction to create nondeterministic finite-state machines from regular expressions, conversion to a deterministic finite-state machine, and then Hopcroft optimization to reduce the number of states to a minimum. These mechanisms, however, have been adapted to deal with character sets rather than single characters, which significantly reduces the calculation time. Since the Unicode character set contains far more code points than plain ASCII, these optimizations are necessary in order to produce lexical analyzers in a reasonable amount of time.
Instead of constructing a table based lexical analyzer, where the transition information is stored in a data structure, quex generates C/C++ code to perform the transitions. Direct coding creates lexical analyzers that structurally resemble typical hand-written lexical analyzers more closely than table based lexers do. Direct coded lexers also tend to perform better than analogous table based lexical analyzers.
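The difference between the two styles can be illustrated with a minimal sketch. It is written in Python for brevity and is not output produced by quex, which emits C/C++. The automaton (identifiers and integers over ASCII), the names `TABLE`, `match_table` and `match_direct`, and the token names are all assumptions made for this illustration. Transitions are labelled with code point intervals rather than single characters, in the spirit of the character-set adaptation described above.

```python
# Table based variant: transitions live in a data structure.
# Each state maps to (first, last, target) code point intervals, so one
# entry covers a whole character set instead of a single character.
TABLE = {
    "start": [(0x30, 0x39, "int"), (0x41, 0x5A, "ident"),
              (0x5F, 0x5F, "ident"), (0x61, 0x7A, "ident")],
    "int":   [(0x30, 0x39, "int")],
    "ident": [(0x30, 0x39, "ident"), (0x41, 0x5A, "ident"),
              (0x5F, 0x5F, "ident"), (0x61, 0x7A, "ident")],
}
ACCEPTING = {"int": "INTEGER", "ident": "IDENTIFIER"}


def match_table(text, pos=0):
    """Return (token, length) of the longest match at pos, or (None, 0)."""
    state, best, i = "start", (None, 0), pos
    while i < len(text):
        cp = ord(text[i])
        for first, last, target in TABLE[state]:
            if first <= cp <= last:
                state = target
                break
        else:
            break  # no interval contains cp: stop scanning
        i += 1
        if state in ACCEPTING:
            best = (ACCEPTING[state], i - pos)
    return best


# Direct coded variant: the same automaton, but every state is a piece
# of control flow, much like a hand-written scanner.
def _is_digit(cp):
    return 0x30 <= cp <= 0x39


def _is_ident_start(cp):
    return 0x41 <= cp <= 0x5A or cp == 0x5F or 0x61 <= cp <= 0x7A


def match_direct(text, pos=0):
    """Return (token, length) of the longest match at pos, or (None, 0)."""
    n, i = len(text), pos
    if i >= n:
        return (None, 0)
    cp = ord(text[i])
    if _is_digit(cp):                       # state "int"
        i += 1
        while i < n and _is_digit(ord(text[i])):
            i += 1
        return ("INTEGER", i - pos)
    if _is_ident_start(cp):                 # state "ident"
        i += 1
        while i < n and (_is_ident_start(ord(text[i]))
                         or _is_digit(ord(text[i]))):
            i += 1
        return ("IDENTIFIER", i - pos)
    return (None, 0)
```

Both functions accept the same language. The sketch only shows the structural difference; the performance advantage mentioned above concerns generated C/C++ code and is not something this Python version demonstrates.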
Unicode Input Alphabets
Quex can handle input alphabets that contain the full Unicode code point range (0 to 10FFFFh). This is augmented by the ability to specify regular expressions that contain Unicode properties as expressions. For example, Unicode code points with the binary property XID_Start can be specified with the expression \P{XID_Start} or \P{XIDS}. Quex can also generate code to call iconv to perform character conversion. Quex relies directly on the databases as they are delivered by the Unicode Consortium. Updating to a new release of the standard consists only of copying the corresponding database files into quex's corresponding directory.
Lexical Analysis Modes
Like traditional lexical analyzer generators such as Lex or Flex, quex supports multiple lexical analysis modes in a lexer. In addition to pattern actions, quex modes can specify event actions: code to be executed during events such as entering or exiting a mode or when any match is found. Quex modes can also be related by inheritance, which allows modes to share common pattern and event actions.
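The idea of modes that inherit pattern and event actions can be sketched as follows. This is a Python illustration under stated assumptions, not quex's input syntax or generated code: the `Mode` class, the `tokenize` function, the event names `on_entry` and `on_exit`, and the rule that a mode's own patterns are tried before inherited ones are all hypothetical choices made for the example. Mode transitions during scanning are left out to keep the sketch short.

```python
import re


class Mode:
    """Hypothetical mode: pattern actions plus event actions, with inheritance."""

    def __init__(self, name, bases=(), patterns=(), on_entry=None, on_exit=None):
        self.name = name
        self.bases = tuple(bases)
        self.own_patterns = [(re.compile(rx), action) for rx, action in patterns]
        self.on_entry = on_entry
        self.on_exit = on_exit

    def patterns(self):
        """Own patterns first, then those inherited from base modes (assumed order)."""
        result = list(self.own_patterns)
        for base in self.bases:
            result.extend(base.patterns())
        return result

    def event(self, name):
        """Look up an event action, falling back to the base modes."""
        handler = getattr(self, name)
        if handler is not None:
            return handler
        for base in self.bases:
            handler = base.event(name)
            if handler is not None:
                return handler
        return None


def tokenize(text, mode, log):
    """Scan text in a single mode; the longest match wins, earlier patterns break ties."""
    entry = mode.event("on_entry")
    if entry is not None:
        entry(log)
    pos, tokens = 0, []
    while pos < len(text):
        best_len, best_action = 0, None
        for rx, action in mode.patterns():
            m = rx.match(text, pos)
            if m and m.end() - pos > best_len:
                best_len, best_action = m.end() - pos, action
        if best_action is None:
            raise ValueError(f"no pattern matches at offset {pos}")
        token = best_action(text[pos:pos + best_len])
        if token is not None:          # actions may skip input by returning None
            tokens.append(token)
        pos += best_len
    leave = mode.event("on_exit")
    if leave is not None:
        leave(log)
    return tokens


# BASE contributes whitespace skipping and an entry event; EXPR inherits both
# and adds its own patterns and an exit event.
BASE = Mode("BASE",
            patterns=[(r"[ \t\n]+", lambda s: None)],
            on_entry=lambda log: log.append("enter"))
EXPR = Mode("EXPR", bases=[BASE],
            patterns=[(r"[0-9]+", lambda s: ("NUMBER", s)),
                      (r"[A-Za-z_]\w*", lambda s: ("IDENT", s))],
            on_exit=lambda log: log.append("exit"))

log = []
tokens = tokenize("x 42", EXPR, log)
```

Here `EXPR` never mentions whitespace or an entry action, yet both apply because they are inherited from `BASE`. This is the kind of sharing between related modes that the paragraph above describes.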
Sophisticated Buffer Handling
Quex provides sophisticated mechanisms for buffer handling and reload that are both efficient and flexible. Quex provides interfaces that allow virtually any character set converter to be plugged in. The converters are activated only on demand, i.e. when the buffer needs to be refilled. By default quex can plug in the iconv library. With this backbone quex is able to analyze a large set of character encodings.
See also
* Lexical analysis
* Descriptions of Thompson construction and Hopcroft optimization can be found in most textbooks on compiler construction.
External links
* [http://quex.sourceforge.net/ Quex - A Mode Oriented Directly Coded Lexical Analyser Generator]
Wikimedia Foundation. 2010.