DFA minimization

In computer science, more specifically in the branch of automata theory, DFA minimization is the task of transforming a given deterministic finite automaton (DFA) into an equivalent DFA that has a minimum number of states. Here, two DFAs are called equivalent if they recognize the same regular language. Several different algorithms accomplishing this task are known and described in standard textbooks on automata theory.[1]

Minimum DFA

For each regular language that can be accepted by a DFA, there exists a DFA with a minimum number of states (and thus a minimum programming effort to create and use), and this DFA is unique up to the naming of its states.[2]

There are three classes of states that can be removed or merged in the original DFA without affecting the language it accepts.

  • Unreachable states are those states that are not reachable from the initial state of the DFA, for any input string.
  • Dead states are those nonaccepting states whose transitions for every input character lead back to themselves. These are also called trap states because once entered there is no escape.
  • Nondistinguishable states are those that cannot be distinguished from one another for any input string.

DFA minimization is usually done in three steps, corresponding to the removal/merger of the relevant states. Since the elimination of nondistinguishable states is computationally the most expensive one, it's usually done as the last step.
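As a small illustration of the dead-state definition above, the following Python sketch finds the nonaccepting states whose every transition loops back to themselves. The function name `trap_states`, the dictionary encoding of δ, and the six-state example automaton are assumptions made for this illustration only.

```python
def trap_states(states, alphabet, delta, accepting):
    """Return nonaccepting states whose every transition is a self-loop.

    `delta` is assumed to be a total dict mapping (state, symbol) -> state.
    """
    return {
        q for q in states
        if q not in accepting and all(delta[(q, c)] == q for c in alphabet)
    }


# Hypothetical example DFA over {0, 1}; "a" is initial, "c", "d", "e" accept.
states = {"a", "b", "c", "d", "e", "f"}
accepting = {"c", "d", "e"}
delta = {
    ("a", "0"): "b", ("a", "1"): "c",
    ("b", "0"): "a", ("b", "1"): "d",
    ("c", "0"): "e", ("c", "1"): "f",
    ("d", "0"): "e", ("d", "1"): "f",
    ("e", "0"): "e", ("e", "1"): "f",
    ("f", "0"): "f", ("f", "1"): "f",
}
dead = trap_states(states, "01", delta, accepting)
```

In this example only state "f" qualifies: it is nonaccepting and both of its transitions return to "f".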

Unreachable states

A state p of the DFA M=(Q, Σ, δ, q0, F) is unreachable if there is no string w in Σ* for which p=δ(q0, w). Reachable states can be obtained with the following algorithm:

let reachable_states:= {q0};
let new_states:= {q0};
do {
    temp := the empty set;
    for each q in new_states do
        for all c in Σ do
            temp := temp ∪ {p such that p=δ(q,c)};
        end;
    end;
    new_states := temp \ reachable_states;
    reachable_states := reachable_states ∪ new_states;
} while(new_states ≠ the empty set);
unreachable_states := Q \ reachable_states;

Unreachable states can be removed from the DFA without affecting the language that it accepts.
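The pseudocode above translates fairly directly into Python. The sketch below assumes the transition function is stored as a dictionary keyed by (state, symbol) pairs; the function name `reachable_states` and the four-state example are hypothetical.

```python
def reachable_states(start, alphabet, delta):
    """Return the set of states reachable from `start`.

    `delta` is assumed to be a dict mapping (state, symbol) -> state;
    missing entries (a partial transition function) are simply skipped.
    """
    reachable = {start}
    frontier = {start}
    while frontier:
        # States one step away from the current frontier.
        successors = {
            delta[(q, c)]
            for q in frontier
            for c in alphabet
            if (q, c) in delta
        }
        frontier = successors - reachable
        reachable |= frontier
    return reachable


# Hypothetical DFA over {0, 1}: no path leads from "a" to "d".
states = {"a", "b", "c", "d"}
delta = {
    ("a", "0"): "b", ("a", "1"): "c",
    ("b", "0"): "a", ("b", "1"): "c",
    ("c", "0"): "c", ("c", "1"): "c",
    ("d", "0"): "a", ("d", "1"): "b",
}
reach = reachable_states("a", "01", delta)
unreachable = states - reach
```

Each state enters the frontier at most once, so the loop inspects every transition a constant number of times.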

Nondistinguishable states

Hopcroft's algorithm

One algorithm for merging the nondistinguishable states of a DFA, due to Hopcroft (1971), is based on partition refinement, partitioning the DFA states into groups by their behavior. These groups represent equivalence classes of the Myhill–Nerode equivalence relation, whereby every two states of the same partition are equivalent if they have the same behavior for all the input sequences. That is, for every two states p1 and p2 that belong to the same equivalence class within the partition P, it will be the case that for every input word w, if one follows the transitions determined by w from the two states p1 and p2 one will either be led to accepting states in both cases or be led to rejecting states in both cases; it should not be possible for w to take p1 to an accepting state and p2 to a rejecting state or vice versa.
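To make the notion of distinguishability concrete, the following Python sketch searches for a shortest word that separates two given states, by breadth-first search over pairs of states. It is not part of Hopcroft's algorithm; the function name `distinguishing_word`, the dictionary encoding of δ, the example automaton, and the assumption that input symbols are single-character strings are all choices made for this illustration.

```python
from collections import deque


def distinguishing_word(p, q, alphabet, delta, accepting):
    """Return a shortest word accepted from exactly one of p and q.

    Returns None when the two states are nondistinguishable.
    `delta` is assumed to be a total dict (state, symbol) -> state.
    """
    seen = {(p, q)}
    queue = deque([(p, q, "")])
    while queue:
        a, b, word = queue.popleft()
        if (a in accepting) != (b in accepting):
            return word
        for c in alphabet:
            pair = (delta[(a, c)], delta[(b, c)])
            if pair not in seen:
                seen.add(pair)
                queue.append((pair[0], pair[1], word + c))
    return None


# Hypothetical example DFA over {0, 1}; "c", "d", "e" are accepting.
accepting = {"c", "d", "e"}
delta = {
    ("a", "0"): "b", ("a", "1"): "c",
    ("b", "0"): "a", ("b", "1"): "d",
    ("c", "0"): "e", ("c", "1"): "f",
    ("d", "0"): "e", ("d", "1"): "f",
    ("e", "0"): "e", ("e", "1"): "f",
    ("f", "0"): "f", ("f", "1"): "f",
}
word_a_f = distinguishing_word("a", "f", "01", delta, accepting)
word_a_b = distinguishing_word("a", "b", "01", delta, accepting)
```

Here "a" and "f" are separated by the word "1", whereas no word separates "a" from "b", so those two states belong to the same equivalence class.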

The following pseudocode describes the algorithm:

P := {{all accepting states}, {all nonaccepting states}};
Q := {{all accepting states}};
while (Q is not empty) do
     choose and remove a set A from Q
     for each c in Σ do
          let X be the set of states for which a transition on c leads to a state in A
          for each set Y in P for which X ∩ Y is nonempty and Y \ X is nonempty do
               replace Y in P by the two sets X ∩ Y and Y \ X
               if Y is in Q
                    replace Y in Q by the same two sets
               else
                    add the smaller of the two sets to Q
          end;
     end;
end;

The algorithm starts with a partition that is too coarse: every pair of states that are equivalent according to the Myhill–Nerode relation belong to the same set in the partition, but pairs that are inequivalent might also belong to the same set. It gradually refines the partition into a larger number of smaller sets, at each step splitting sets of states into pairs of subsets that are necessarily inequivalent. The initial partition is a separation of the states into two subsets of states that clearly do not have the same behavior as each other: the accepting states and the rejecting states. The algorithm then repeatedly chooses a set A from the current partition and an input symbol c, and splits each of the sets of the partition into two (possibly empty) subsets: the subset of states that lead to A on input symbol c, and the subset of states that do not lead to A. Since A is already known to have different behavior than the other sets of the partition, the subsets that lead to A also have different behavior than the subsets that do not lead to A. When no more splits of this type can be found, the algorithm terminates.
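The refinement loop described above can be sketched in Python as follows. This is a simplified version that uses ordinary sets rather than the specialised partition refinement data structure, so it illustrates the logic but does not by itself achieve the running time of the original algorithm. The function name `hopcroft_partition`, the dictionary encoding of δ, and the example automaton are assumptions; the input is taken to be a complete DFA with unreachable states already removed.

```python
def hopcroft_partition(states, alphabet, delta, accepting):
    """Partition `states` into classes of nondistinguishable states.

    `delta` is assumed to be a total dict (state, symbol) -> state.
    """
    accepting = frozenset(accepting)
    rejecting = frozenset(states) - accepting
    partition = {block for block in (accepting, rejecting) if block}
    worklist = {accepting} if accepting else set()

    # Reverse transitions: inverse[(c, q)] = {p : delta(p, c) = q}.
    inverse = {}
    for (p, c), q in delta.items():
        inverse.setdefault((c, q), set()).add(p)

    while worklist:
        a = worklist.pop()
        for c in alphabet:
            # X: states whose transition on c leads into A.
            x = set()
            for q in a:
                x |= inverse.get((c, q), set())
            for y in list(partition):
                inside, outside = y & x, y - x
                if not inside or not outside:
                    continue  # A does not split Y on symbol c
                partition.remove(y)
                partition.update((inside, outside))
                if y in worklist:
                    worklist.remove(y)
                    worklist.update((inside, outside))
                else:
                    worklist.add(min(inside, outside, key=len))
    return partition


# Hypothetical example DFA over {0, 1}; "c", "d", "e" are accepting.
states = {"a", "b", "c", "d", "e", "f"}
delta = {
    ("a", "0"): "b", ("a", "1"): "c",
    ("b", "0"): "a", ("b", "1"): "d",
    ("c", "0"): "e", ("c", "1"): "f",
    ("d", "0"): "e", ("d", "1"): "f",
    ("e", "0"): "e", ("e", "1"): "f",
    ("f", "0"): "f", ("f", "1"): "f",
}
classes = hopcroft_partition(states, "01", delta, {"c", "d", "e"})
```

On this example the initial block {a, b, f} is split once, on symbol 1, into {a, b} and {f}; no further splits are possible, leaving three classes.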

The worst case running time of this algorithm is O(ns log n), where n is the number of states and s is the size of the alphabet. This bound follows from the fact that, for each of the ns transitions of the automaton, the sets drawn from Q that contain the target state of the transition have sizes that decrease relative to each other by a factor of two or more, so each transition participates in O(log n) of the splitting steps in the algorithm. The partition refinement data structure allows each splitting step to be performed in time proportional to the number of transitions that participate in it.[3] This remains the most efficient algorithm known for solving the problem, and for certain distributions of inputs its average-case complexity is even better, O(n log log n).[4]

Once Hopcroft's algorithm has been used to group the states of the input DFA into equivalence classes, the minimum DFA can be constructed by forming one state for each equivalence class. If S is a set of states in P, s is a state in S, and c is an input character, then the transition in the minimum DFA from the state for S, on input c, goes to the set containing the state that the input automaton would go to from state s on input c. The initial state of the minimum DFA is the one containing the initial state of the input DFA, and the accepting states of the minimum DFA are the ones whose members are accepting states of the input DFA.
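The construction of the minimum DFA from the equivalence classes can be sketched as below. The partition is given explicitly here so that the example stands alone; the function name `quotient_dfa`, the use of frozensets as state names of the new automaton, and the example data are assumptions made for illustration.

```python
def quotient_dfa(partition, alphabet, delta, start, accepting):
    """Build the DFA whose states are the blocks of `partition`.

    `partition` is assumed to be a set of frozensets of nondistinguishable
    states, and `delta` a total dict (state, symbol) -> state.
    """
    block_of = {q: block for block in partition for q in block}
    new_delta = {}
    for block in partition:
        # Any member works: equivalent states lead to the same block.
        representative = next(iter(block))
        for c in alphabet:
            new_delta[(block, c)] = block_of[delta[(representative, c)]]
    new_start = block_of[start]
    new_accepting = {block for block in partition if block & set(accepting)}
    return set(partition), new_delta, new_start, new_accepting


# Hypothetical example DFA over {0, 1} and its equivalence classes.
delta = {
    ("a", "0"): "b", ("a", "1"): "c",
    ("b", "0"): "a", ("b", "1"): "d",
    ("c", "0"): "e", ("c", "1"): "f",
    ("d", "0"): "e", ("d", "1"): "f",
    ("e", "0"): "e", ("e", "1"): "f",
    ("f", "0"): "f", ("f", "1"): "f",
}
partition = {frozenset("ab"), frozenset("cde"), frozenset("f")}
min_states, min_delta, min_start, min_accepting = quotient_dfa(
    partition, "01", delta, "a", {"c", "d", "e"}
)
```

The resulting automaton has three states: the initial class {a, b}, the accepting class {c, d, e}, and the trap class {f}.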

Moore's algorithm

Moore's algorithm for DFA minimization is due to Edward F. Moore (1956). Like Hopcroft's algorithm, it maintains a partition that starts off separating the accepting from the rejecting states, and repeatedly refines the partition until no more refinements can be made. At each step, it replaces the current partition with the coarsest common refinement of s + 1 partitions, one of which is the current one and the others are the preimages of the current partition under the transition functions for each of the input symbols. The algorithm terminates when this replacement does not change the current partition. Its worst-case time complexity is O(n²s): each step of the algorithm may be performed in time O(ns) using a variant of radix sort to reorder the states so that states in the same set of the new partition are consecutive in the ordering, and there are at most n steps since each one but the last increases the number of sets in the partition. The instances of the DFA minimization problem that cause the worst-case behavior are the same as for Hopcroft's algorithm. The number of steps that the algorithm performs can be much smaller than n, so on average (for constant s) its performance is O(n log n) or even O(n log log n) depending on the random distribution on automata chosen to model the algorithm's average-case behavior.[4]
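One round of Moore's refinement can be expressed by giving every state a signature: its current block together with the blocks reached on each input symbol. States with equal signatures stay together. The Python sketch below groups signatures with a dictionary instead of the radix sort mentioned above, so it illustrates the refinement rather than the stated time bound. The function name `moore_partition` and the example automaton are assumptions.

```python
def moore_partition(states, alphabet, delta, accepting):
    """Refine the accepting/rejecting partition until it stabilises.

    `delta` is assumed to be a total dict (state, symbol) -> state.
    """
    alphabet = list(alphabet)
    # block_id[q] labels the block of q; start with accepting vs. rejecting.
    block_id = {q: int(q in accepting) for q in states}
    while True:
        # Signature: own block, then the block reached on each symbol.
        signature = {
            q: (block_id[q],) + tuple(block_id[delta[(q, c)]] for c in alphabet)
            for q in states
        }
        ids = {}
        refined = {q: ids.setdefault(signature[q], len(ids)) for q in states}
        # A refinement with the same number of blocks is the same partition.
        if len(ids) == len(set(block_id.values())):
            break
        block_id = refined
    blocks = {}
    for q, b in block_id.items():
        blocks.setdefault(b, set()).add(q)
    return {frozenset(block) for block in blocks.values()}


# Hypothetical example DFA over {0, 1}; "c", "d", "e" are accepting.
states = {"a", "b", "c", "d", "e", "f"}
delta = {
    ("a", "0"): "b", ("a", "1"): "c",
    ("b", "0"): "a", ("b", "1"): "d",
    ("c", "0"): "e", ("c", "1"): "f",
    ("d", "0"): "e", ("d", "1"): "f",
    ("e", "0"): "e", ("e", "1"): "f",
    ("f", "0"): "f", ("f", "1"): "f",
}
classes = moore_partition(states, "01", delta, {"c", "d", "e"})
```

On this example the first round separates "f" from "a" and "b", and the second round changes nothing, so the loop stops after two rounds.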

Brzozowski's algorithm

As Brzozowski (1963) observed, reversing the edges of a DFA produces an NFA for the reversal of the original language, and converting this NFA to a DFA using the standard powerset construction (constructing only the reachable states of the converted DFA) leads to a minimal DFA for the same reversed language. Repeating this reversal operation a second time produces a minimal DFA for the original language. The worst-case complexity of Brzozowski's algorithm is exponential, as there are regular languages for which the minimal DFA of the reversal is exponentially larger than the minimal DFA of the language,[5] but it frequently performs better than this worst case would suggest.[4]
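The double reversal can be sketched in Python by writing a single helper that reverses the edges and applies the powerset construction to the reachable subsets, and then calling it twice. The names `reverse_determinize`, `brzozowski` and `accepts`, the dictionary encoding of δ, and the example automaton are assumptions made for this illustration. The empty subset, when reachable, acts as a trap state of the resulting complete DFA.

```python
def reverse_determinize(alphabet, delta, starts, accepting):
    """Reverse the edges, then determinize (reachable subsets only).

    `delta` maps (state, symbol) -> state; `starts` is a set of initial
    states. The states of the result are frozensets of input states.
    """
    # Reversed edges: rev[(q, c)] = {p : delta(p, c) = q}.
    rev = {}
    for (p, c), q in delta.items():
        rev.setdefault((q, c), set()).add(p)

    start = frozenset(accepting)  # old accepting states become initial
    new_delta, seen, stack = {}, {start}, [start]
    while stack:
        subset = stack.pop()
        for c in alphabet:
            target = frozenset(p for q in subset for p in rev.get((q, c), ()))
            new_delta[(subset, c)] = target
            if target not in seen:
                seen.add(target)
                stack.append(target)
    # A subset accepts if it contains an old initial state.
    new_accepting = {s for s in seen if s & set(starts)}
    return new_delta, start, new_accepting


def brzozowski(alphabet, delta, start, accepting):
    """Reverse and determinize twice, as described above."""
    d1, s1, f1 = reverse_determinize(alphabet, delta, {start}, accepting)
    return reverse_determinize(alphabet, d1, {s1}, f1)


def accepts(word, delta, start, accepting):
    """Run a DFA on `word` and report whether it ends in an accepting state."""
    state = start
    for c in word:
        state = delta[(state, c)]
    return state in accepting


# Hypothetical example DFA over {0, 1} for the language 0*10*.
delta = {
    ("a", "0"): "b", ("a", "1"): "c",
    ("b", "0"): "a", ("b", "1"): "d",
    ("c", "0"): "e", ("c", "1"): "f",
    ("d", "0"): "e", ("d", "1"): "f",
    ("e", "0"): "e", ("e", "1"): "f",
    ("f", "0"): "f", ("f", "1"): "f",
}
min_delta, min_start, min_accepting = brzozowski("01", delta, "a", {"c", "d", "e"})
min_states = {state for (state, _) in min_delta}
```

For this six-state input the result has three states, and it accepts exactly the words containing a single 1.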

NFA minimization

While the above procedures work for DFAs, the method of partitioning does not work for non-deterministic finite automata (NFAs). No polynomial-time algorithm for minimizing NFAs exists unless P=NP.[6]

Notes

  1. Hopcroft, Ullman (1979)
  2. Hopcroft, Motwani & Ullman (2001), Section 4.4.3, "Minimization of DFA's", p. 159.
  3. Hopcroft (1971); Aho, Hopcroft & Ullman (1974)
  4. Berstel et al. (2010).
  5. For instance, the language of binary strings whose nth symbol is a one requires only n + 1 states, but its reversal requires 2ⁿ states. Leiss (1981) provides a ternary n-state DFA whose reversal requires 2ⁿ states, the maximum possible. For additional examples and the observation of the connection between these examples and the worst-case analysis of Brzozowski's algorithm, see Câmpeanu et al. (2001).
  6. Hopcroft, Motwani & Ullman (2001), Section 4.4, Figure labeled "Minimizing the States of an NFA", p. 163.

References

  • Aho, Alfred V.; Hopcroft, John E.; Ullman, Jeffrey D. (1974), "4.13 Partitioning", The Design and Analysis of Computer Algorithms, Addison-Wesley, pp. 157–162.
  • Berstel, Jean; Boasson, Luc; Carton, Olivier; Fagnot, Isabelle (2010), "Minimization of Automata", Automata: from Mathematics to Applications, European Mathematical Society, arXiv:1010.5318.
  • Brzozowski, J. A. (1963), "Canonical regular expressions and minimal state graphs for definite events", Proc. Sympos. Math. Theory of Automata (New York, 1962), Polytechnic Press of Polytechnic Inst. of Brooklyn, Brooklyn, N.Y., pp. 529–561, MR0175719.
  • Câmpeanu, Cezar; Culik, Karel, II; Salomaa, Kai; Yu, Sheng (2001), "State Complexity of Basic Operations on Finite Languages", 4th International Workshop on Automata Implementation (WIA '99), Lecture Notes in Computer Science, 2214, Springer-Verlag, pp. 60–70, doi:10.1007/3-540-45526-4_6.
  • Hopcroft, John (1971), "An n log n algorithm for minimizing states in a finite automaton", Theory of machines and computations (Proc. Internat. Sympos., Technion, Haifa, 1971), New York: Academic Press, pp. 189–196, MR0403320.
  • Hopcroft, John E.; Motwani, Rajeev; Ullman, Jeffrey D. (2001), Introduction to Automata Theory, Languages, and Computation (2nd ed.), Addison-Wesley.
  • Leiss, Ernst (1981), "Succinct representation of regular languages by Boolean automata", Theoretical Computer Science 13 (3): 323–330, doi:10.1016/S0304-3975(81)80005-9, MR603263.
  • Moore, Edward F. (1956), "Gedanken-experiments on sequential machines", Automata studies, Annals of mathematics studies, no. 34, Princeton, N. J.: Princeton University Press, pp. 129–153, MR0078059.

Wikimedia Foundation. 2010.
