Converting ε-NFA to DFA

From last post ε-NFA in Java we saw some nice properties of the ε-NFA. Mainly involving their design simplicity. Unfortunately, ε-NFA can be much more slower compared to the equivalent DFA. In this post we will discuss few different ways of converting ε-NFA to DFA. In other words we will remove nondeterminism from ε-NFA.

Two forms of nondeterminism

We have observed nondeterminism in two slightly different forms. The most apparent is when given a state $$q$$ and an input symbol $$a$$ there are more than one transition from $$q$$ on input $$a$$. This form is clearly visible in Image 1.0.3 . Another, slightly different form involves ε-transitions. A state $$q_0$$ which has ε-transition to a state $$q_1$$ may or may not follow it, thus its next state is not determined.

The good news is that both forms of nondeterminism can be eliminated and obtained machine will be deterministic. But before getting our hands dirty and starting to convert ε-NFA into DFA, let's proof that this is indeed possible. That is, proof that every nondeterministic finite automaton has an equivalent deterministic finite automaton. This will give us better idea how to come up with an algorithm for converting ε-NFA into DFA.

Proof of equivalence between ε-NFAs and DFAs

We will prove equivalence between ε-NFA and DFA using Constructive Proof (Wikipedia) . That is given some arbitrary chosen ε-NFA or NFA we will construct DFA from it. First let's consider easier case - NFA. Let $$N = (Q, Σ, δ, q_0, F)$$ be NFA recognising some language $$A$$. We will construct a DFA $$M = (Q^\prime, Σ, δ^\prime, q_0^\prime, F^\prime)$$ recognising $$A$$ (see Definition 1.0.6 and Definition 1.0.0 ).

First we need to find out what states $$Q^\prime$$ $$M$$ will have. We can use the fact, that given a symbol $$a$$, NFA can transition from state $$q$$ to set of states $$D$$. If we treat $$D$$ as a single state in $$M$$, we will have a transition function from $$q$$ to $$D$$ on input $$a$$. In other words, $$M$$ states constructed from $$N$$ will be all the subsets of $$N$$ states, thus $$Q^\prime = P(Q)$$. Notice that theoretically, DFA have $$2^k$$ more states than NFA with $$k$$ states which is enormous increase, but in practice not all states are reachable.

Next, let's find out what transition function $$δ^\prime$$ will be. We know that a state $$R$$ of $$M$$ is a set of states, where each state $$q$$ in $$R$$ is a member of $$N$$ states $$Q$$. So, passing $$R$$ and a symbol $$a$$ from alphabet $$Σ$$ to the $$δ^\prime$$, is the same as passing each $$q$$ in $$R$$ with symbol $$a$$ to the transition function $$δ$$ of $$N$$. The result will be the union of all returned subsets. More formally, $$$ δ^\prime(R, a) = \bigcup_{q\in R} δ(q,a) $$$ In other words, above equation says that a transition function $$δ^\prime(R, a)$$ takes one state $$R$$ from set of states $$Q^\prime$$ and one symbol $$a$$ from the alphabet $$Σ$$ and outputs a set of states. That set is a union between all the subsets of $$P(Q)$$ which we would get, if we would pass all the states $$q$$ in $$R$$ and symbol $$a$$ to the transition function $$δ$$. Don't worry if this is still unclear, we will go through concrete example very soon.

At the moment we constructed states $$Q^\prime$$ of $$M$$ and a transition function $$δ^\prime$$ but we still need to find out a start state $$q_0^\prime$$ and accept states $$F^\prime$$. Given a fact that NFA can have only one start state, $$q_0^\prime$$ should be equal to $$q_0$$. Unfortunately, this is not the case because only a member of $$Q^\prime$$ can be a start state and $$q_0 \notin Q^\prime$$. But $$\{q_0\} \in Q^\prime$$, thus $$q_0^\prime =\{q_0\}$$.

Lastly, we need to find out accept states $$F^\prime$$ of $$M$$. We say that NFA accepts string if resulting set of states has an accept state. Similarly, we can reasoning about state $$R \in Q^\prime$$. If it contains an accept state of $$N$$, then $$R$$ is accepting state as well. More formally, $$$ F^\prime = \{R \in Q^\prime | R \text{ contains an accept state of } N\} $$$

NOTE: An alphabet $$Σ$$ is the same for both NFA and DFA, thus we don't need to consider it.

And we are done. By showing that DFA can be constructed from arbitrary chosen NFA, we proved that they are equivalent, meaning they recognise the same language. At this point you should see that construction of DFA from NFA is based on removing nondeterminism from NFA. Similarly, we will prove that any ε-NFA has equivalent DFA. The steps involved converting are all the same with an exception for a transition function $$δ^\prime$$ and a start state $$q_0^\prime$$.

As you may guessed, the main reason of this exception is a second form of nondeterminism, that is ε-transitions of ε-NFA. To be able construct DFA we have to take consideration of ε-transitions as well. Remember, ε-closure from Definition 1.1.1 . It is defined as set of states $$E(R)$$ which can be reached from members of $$R$$ by following only along ε-arrows, including members of $$R$$ themselves. Thus, $$δ^\prime$$ of $$M$$ will be set of states which can be reached given state $$R$$ and a symbol $$a$$ followed by their ε-transitions. More formally, $$$ δ^\prime(R, a) = \bigcup_{q\in R} E(δ(q,a)) $$$

Additionally, we need to ε-close the start state of $$M$$ to include all the states which can be reached from start state following ε-arrows, $$q_0^\prime = E(\{q_0\})$$. We have now completed the construction of the DFA $$M$$ that simulates the ε-NFA $$N$$. Theorem 1.0.0 shows summarised version of the proof.

Theorem 1.0.0

Every nondeterministic finite automaton has an equivalent deterministic finite automaton.

$$PROOF:$$

Let $$N = (Q, Σ, δ, q_0, F)$$ be NFA recognising some language $$A$$. We will construct a DFA $$M = (Q^\prime, Σ, δ^\prime, q_0^\prime, F^\prime)$$ recognising $$A$$.

  1. $$Q^\prime = P(Q)$$
  2. $$δ^\prime(R, a) = \bigcup_{q\in R} E(δ(q,a)) $$
  3. $$q_0^\prime = E(\{q_0\})$$
  4. $$F^\prime = \{R \in Q^\prime | R \text{ contains an accept state of } N\}$$
We have now completed the construction of the DFA $$M$$ that simulates the ε-NFA $$N$$.

An example of converting ε-NFA to DFA

Let's get our hands dirty and convert ε-NFA in Image 1.1.3 to DFA.

ε-NFA to be converted to DFA.
Image 1.1.3: ε-NFA to be converted to DFA.
  • First we need to find all the states of DFA. We can use the fact that states of DFA $$Q^\prime$$ will be all the subsets of NFA states $$Q$$. Therefore, $$$ \begin{align} Q^\prime &= P(Q) \\ &= P(\{q_0, q_1, q_2\}) \\ &=\{\emptyset, \{q_0\},\{q_1\},\{q_2\}, \{q_0, q_1\},\{q_0, q_2\},\{q_1, q_2\}, \{q_0, q_1, q_2\}\} \end {align} $$$
  • The start state will be all the states which can be reached from NFA start state $$q_0$$ following zero or more ε-transitions, including itself. From Image 1.1.3 we see that $$q_0$$ has only one ε-transition to $$q_2$$, therefore DFA start state $$q_0^\prime$$ will be $$$ q_0^\prime = E(\{q_0\}) = \{q_0, q_2\} $$$
  • The new accept states $$F^\prime$$ will be those containing at least one NFA's accept state. There is only one accept state $$q_0$$, thus any state in $$Q^\prime$$ containing $$q_0$$ will be an accept state. $$$ F^\prime =\{\{q_0\},\{q_0, q_1\},\{q_0, q_2\},\{q_0, q_1, q_2\}\} $$$
  • The only thing what is left is to find DFA transition function $$δ^\prime$$. To do that, we need to find all transitions from each state in $$Q^\prime$$ given symbols $$a$$ and $$b$$. For example, given symbol $$a$$ state $$\{q_1, q_2\}$$ will transition to state $$\{q_0, q_1, q_2\}$$. More formally, $$$\begin{align} δ^\prime(\{q_1, q_2\}, a) &= \bigcup_{q\in \{q_1, q_2\}} E(δ(q,a)) \\ &=E(δ(q_1,a)) \cup E(δ(q_2,a)) \\ &=E(\{q_1, q_2\}) \cup E(\{q_0\}) \\ &=\{q_1, q_2\} \cup \{q_0, q_2\} \\ &=\{q_0, q_1, q_2\} \end{align} $$$ All the rest transitions can be found similarly.

Constructing DFA from ε-NFA using transition table

Unfortunately, finding all the transitions manually using above method is too verbose, especially if DFA has lots of states. Better approach is to use transition table of NFA.

$$q$$ $$δ(q, a)$$ $$δ(q, b)$$ $$δ(q, ε)$$ $$E(δ(q, a))$$ $$E(δ(q, b))$$
$$q_0$$ $$\emptyset$$ $$\{q_1\}$$ $$\{q_2\}$$ $$\emptyset$$ $$\{q_1\}$$
$$q_1$$ $$\{q_1, q_2\}$$ $$\{q_2\}$$ $$\emptyset$$ $$\{q_1, q_2\}$$ $$\{q_2\}$$
$$q_2$$ $$\{q_0\}$$ $$\emptyset$$ $$\emptyset$$ $$\{q_0, q_2\}$$ $$\emptyset$$
Table 1.0.3: Transition table of ε-NFA.

Using Table 1.0.3 we can build DFA much more faster. We start with the start state of DFA $$\{q_0, q_2\}$$.

Initial DFA with only start state.
Image 1.1.4: Initial DFA with only start state.

Now for each symbol in the alphabet we will find transition from $$\{q_0, q_2\}$$ to other two states. We will do this by combining ε-closed transitions from Table 1.0.3 for each state in $$\{q_0, q_2\}$$. For example, first state $$q_0$$ has $$\emptyset$$ for $$E(δ(q_0, a))$$ and $$q_2$$ has $$\{q_0, q_2\}$$, thus by combining them both, we get $$\emptyset \cup \{q_0, q_2\} = \{q_0, q_2\}$$. State $$\{q_0, q_2\}$$ is already defined, so we just add an arrow with label $$a$$ to itself. Similarly, we find transition for symbol $$b$$, $$\emptyset \cup \{q_1\} = \{q_1\}$$. State $$\{q_1\}$$ is not in our diagram, thus we create new one and add transition from state $$\{q_0, q_2\}$$ with label $$b$$ to our new state Image 1.1.5 .

DFA after adding transitions from start state $$\{q_0, q_2\}$$.
Image 1.1.5: DFA after adding transitions from start state $$\{q_0, q_2\}$$.

Next step is to repeat all the previous steps for each newly created state, in our case $$\{q_1\}$$. Once again by inspecting Table 1.0.3 we find that state $$\{q_1\}$$ will transition to state $$\{q_1, q_2\}$$ on symbol $$a$$ and to $$\{2\}$$ on symbol $$b$$. Image 1.1.6 illustrates it more clearly.

DFA after adding transitions from state $$\{q_1\}$$.
Image 1.1.6: DFA after adding transitions from state $$\{q_1\}$$.

Previous transitions introduced two new states $$\{q_1, q_2\}$$ and $$\{q_2\}$$, thus we need to find transitions for these to states as well. We already know that on symbol $$a$$ state $$\{q_1, q_2\}$$ will transition to $$\{q_0, q_1, q_2\}$$ and on symbol $$b$$ to state $$\{q_2\}$$.

DFA after adding transitions from state $$\{q_1, q_2\}$$.
Image 1.1.7: DFA after adding transitions from state $$\{q_1, q_2\}$$.
NOTE: While constructing DFA don't forget about accept states. At the moment we have two of them start state $$\{q_0, q_2\}$$ and $$\{q_0, q_1, q_2\}$$.

Let's do the same thing for state $$\{q_2\}$$. On symbol $$a$$ it goes to start state $$\{q_0, q_2\}$$ and on symbol $$b$$ to $$\emptyset$$. You may wonder how empty set can be valid state? Remember that $$\emptyset$$ was included in the set of DFA states $$Q^\prime$$, hence it is perfectly valid state.

DFA after adding transitions from state $$\{q_2\}$$.
Image 1.1.8: DFA after adding transitions from state $$\{q_2\}$$.

Once again we have to find transitions for $$\emptyset$$ and $$\{q_0, q_1, q_2\}$$. State $$\emptyset$$ doesn't contain any state of ε-NFA, thus no matter what symbol is given it will transition to $$\emptyset$$ in other words to itself. State $$\{q_0, q_1, q_2\}$$ transitions to the state $$\{q_1, q_2\}$$ on symbol $$b$$ and to itself on symbol $$a$$.

Final DFA.
Image 1.1.9: Final DFA.

Last transitions did not introduced any new state, therefore DFA construction is complete. Notice that new automaton has less states than expected. Missing states are $$\{q_0\}$$ and $$\{q_0, q_1\}$$. Hence, constructing DFA using this method may result in less state than using Theorem 1.0.0 .

NOTE: DFA construction is considered complete, if every state in DFA has transition for each symbol in the alphabet.

Implementation

For implementation of both algorithms in Java see .