- String operations
In
computer science , in the area offormal language theory , frequent use is made of a variety ofstring functions ; however, the notation used is different from that used oncomputer programming , and some commonly used functions in the theoretical realm are rarely used when programming. This article defines some of these basic terms.Alphabet of a string
The alphabet of a string is a list of all of the letters that occur in a particular string. If "s" is a string, its alphabet is denoted by
:operatorname{Alph}(s)
tring substitution
Let "L" be a language, and let Sigma be its alphabet. A string substitution or simply a substitution is a mapping "f" that maps letters in Sigma to languages (possibly in a different alphabet). Thus, for example, given a letter ain Sigma, one has f(a)=L_a where L_asubsetDelta^* is some language whose alphabet is Delta. This mapping may be extended to strings as
:f(varepsilon)=varepsilon
for the
empty string varepsilon, and:f(sa)=f(s)f(a)
for string sin L. String substitution may be extended to the entire language as
:f(L)=igcup_{sin L} f(s)
An example of string substitution occurs in
regular language s, which are closed under string substitution. That is, if the letters of a regular language are substituted by other regular languages, the result is still a regular language.tring homomorphism
A string homomorphism is a string substitution such that each letter is replaced by a single string. That is, f(a)=s, where "s" is a string, for each letter "a". String homomorphisms are
homomorphism s, preserving thebinary operation ofstring concatenation . Given a language "L", the set f(L) is called the homomorphic image of "L". The inverse homomorphic image of a string "s" is defined as:f^{-1}(s)={wvert f(w)=s}
while the inverse homomorphic image of a language "L" is defined as
:f^{-1}(L)={svert f(s)in L}
Note that, in general, f(f^{-1}(L)) e L, while one does have
:f(f^{-1}(L)) subseteq L
and
:L subseteq f^{-1}(f(L))
for any language "L". Simple single-letter
substitution cipher s are examples of string homomorphisms.tring projection
If "s" is a string, and Sigma is an alphabet, the string projection of "s" is the string that results by removing all letters which are not in Sigma. It is written as pi_Sigma(s),. It is formally defined by removal of letters from the right hand side:
:pi_Sigma(s) = egin{cases} varepsilon & mbox{if } s=varepsilon mbox{ the empty string} \pi_Sigma(t) & mbox{if } s=ta mbox{ and } a otin Sigma \ pi_Sigma(t)a & mbox{if } s=ta mbox{ and } a in Sigma end{cases}
Here varepsilon denotes the
empty string . The projection of a string is essentially the same as aprojection in relational algebra .String projection may be promoted to the projection of a language. Given a
formal language "L", its projection is given by:pi_Sigma (L)={pi_Sigma(s) vert sin L }
Right quotient
The right quotient of a letter "a" from a string "s" is the truncation of the letter "a" in the string "s", from the right hand side. It is denoted as s/a. If the string does not have "a" on the right hand side, the result is the empty string. Thus:
:sa)/ b = egin{cases} s & mbox{if } a=b \varepsilon & mbox{if } a e bend{cases}
The quotient of the empty string may be taken:
:varepsilon / a = varepsilon
Similarly, given a subset Ssubset M of a monoid M, one may define the quotient subset as
:S/a={sin M vert sain S}
Left quotients may be defined similarly, with operations taking place on the left of a string.
yntactic relation
The right quotient of a subset Ssubset M of a monoid M defines an
equivalence relation , called the rightsyntactic relation of "S". It is given by:sim_S ;,=, {(s,t)in M imes M vert S/s = S/t }
The relation is clearly of finite index (has a finite number of equivalence classes) if and only if the family right quotients is finite; that is, if
:S/m vert min M}
is finite. In this case, "S" is a
recognizable language , that is, a language that can be recognized by afinite state automaton . This is discussed in greater detail in the article onsyntactic monoid s.Right cancellation
The right cancellation of a letter "a" from a string "s" is the removal of the first occurrence of the letter "a" in the string "s", starting from the right hand side. It is denoted as sdiv a and is recursively defined as
:sa)div b = egin{cases} s & mbox{if } a=b \(sdiv b)a & mbox{if } a e bend{cases}
The empty string is always cancellable:
:varepsilon div a = varepsilon
Clearly, right cancellation and projection
commute ::pi_Sigma(s)div a = pi_Sigma(s div a )
Prefixes
The prefixes of a string is the set of all prefixes to a string, with respect to a given language:
:operatorname{Pref}_L(s) = {t vert s=tu mbox { for } uin L}
The prefix closure of a language is
:operatorname{Pref} (L) = igcup_{sin L} operatorname{Pref}_L(s)
A language is called prefix closed if operatorname{Pref} (L) = L. Clearly, the prefix closure operator is
idempotent ::operatorname{Pref} (operatorname{Pref} (L)) =operatorname{Pref} (L)
The prefix relation is a
binary relation sqsubseteq such that ssqsubseteq t if and only if s in operatorname{Pref}_L(t).Prefix grammar s generate languages that are prefix-closed.ee also
*
String functions (programming)
*Levi lemma References
* John E. Hopcroft and Jeffrey D. Ullman, "Introduction to Automata Theory, Languages and Computation", Addison-Wesley Publishing, Reading Massachusetts, 1979. ISBN 0-201-029880-X. "(See chapter 3.)"
Wikimedia Foundation. 2010.