# Lecture 4: Deterministic Finite Automata

The following figure shows the *state diagram* of a deterministic finite automaton (DFA),

The automaton has three *states* \(q_0\), \(q_1\), and \(q_2\). The state \(q_0\) is designated as the *start state* and \(q_1\) is designated as an *accept state*. The directed edges between states specify the *transitions* of the automaton when reading an input string.

This automaton accepts the set of strings that contain \(1\) and have an even number of \(0\)’s after the last \(1\).

## Languages

An *alphabet* \(\Sigma\) is a finite set of symbols, e.g., \(\Sigma=\{0,1\}\) or \(\Sigma={a,b,c}\). (We will usually content ourselves with the binary alphabet.) A *string* \(w\) over \(\Sigma\) is a finite sequence of symboles, written as \(w=x_1\cdots x_n\) for \(x_1,\ldots,x_n\in \Sigma\). The *empty string* (string of length zero) is denoted \(\varepsilon\). The set of all strings is denoted \(\Sigma^\ast\). A *language* \(L\) is any subset of strings, so that \(L\subseteq \Sigma^\ast\).

We will use languages to encode computational problems. (A language \(L\) can be thought to encode the function \(f{\colon}\Sigma^\ast\to\{0,1\}\) with \(f(w)=1\) if and only if \(w\in L\)).

**Operations on languages:** Let \(A,B\subseteq \Sigma^*\) be two languages. The following operations on languages are useful:

- Basic set-theoretic operations,
*union*\(A\cup B\),*intersection*\(A\cap B\), and*complementation*\(\overline A:=\Sigma^* \setminus A\). (These operations correspond to \({\mathrm{\scriptstyle OR}},{\mathrm{\scriptstyle AND}},{\mathrm{\scriptstyle NOT}}\).) *Concatentation*\(AB := \{uv \mid u\in A, v\in B\}\).*Kleene star*$A^*:=A^{0A}1A^2 $, where \(A^i\) denotes the \(i\)-fold concatenation of \(A\) with itself. (The corner cases \(i=0,1\) are defined so that the rule \(A^{i}A^j=A^{i+j}\) holds. Concretely, \(A^0=\{\varepsilon\}\) and \(A^1=A\)).

(Notice that the Kleene star notation is compatible with the notation for the set of all strings.)

## Deterministic Finite Automaton

Definition:Adeterministic finite automaton\(M\) consists of the following parts:

- a
transition function\(\delta{\colon}{} Q\times \Sigma \to Q\) for analphabet\(\Sigma\) and a set ofstates\(Q\),- a
start state\(q_0\in Q\),- a set of
accept states\(F\subseteq Q\).

## Language of a DFA

Definition:A DFA \(M\)acceptsa string \(w=x_1\cdots x_n\) if there exists a sequence of states \(r_0,r_1,\ldots,r_n\in Q\) such that

- the first state is the start statem \(r_0=q_0\)
- the \(i\)th state is obtained by reading the \(i\)th symbol, \(r_i=\delta(r_{i-1},x_i)\)
- the final state is an accept state, \(r_n\in F\).

The language of \(M\) is defined as \[
L(M) = \{ w \mid \text{$M$ accepts $w$}\}.
\] If \(L\) is the language of some deterministic finite automaton, we say that \(L\) is *regular*.

## Example proof

Recall the automaton \(M\) from the beginning:

Lemma:The language of the automaton \(M\) is the set of all strings \(w\in\{0,1\}^*\) such that \(w\) contains a \(1\) and ends with an even number of \(0\)’s after the last \(1\).

We will prove this lemma by inductions. To make the proof easier, we will show a stronger statement. (Proofs by induction often become easier if the statement is strengthened.)

Define the *repeated transition function* \(\delta^*{\colon}\Sigma^*\to Q\), \[
\delta^*(w) := \text{state of $M$ after reading $w$}
\] This function satisfies \(\delta^\ast(w x)=\delta(\delta^\ast(w),x)\) for every string \(w\in \Sigma^\ast\) and symbol \(x\in \Sigma\). Furthermore, \(L(M) = \{w \mid \delta^\ast(w) \in F \}.\) Hence, the following claim implies the lemma.

**Claim:** The repeated transition function of \(M\) is \[
\delta^\ast(w) = \begin{cases}
q_0 & \text{ if $w$ does not contains $1$ ($A$)},\\
q_1 & \text{ if $w$ contains $1$ and ends with an even number of $0$'s after the the last $1$ ($B$)},\\
q_2 & \text{ if $w$ contains $1$ and ends with an odd number of $0$'s after the the last $1$ ($C$)}.
\end{cases}
\] *Proof of claim:* By induction on the length of \(w\). (At this point, the proof is completely mechanical.)

If \(w=\varepsilon\), then \(\delta^*(w)=q_0\) and the claim is true (because \(\varepsilon\in A\)).

Suppose length of \(w=ux\) for \(x\in \{0,1\}\) is greater than \(0\). The induction hypothesis is that the claim holds for all strings that are shorter than \(w\). (In particular, the claim holds for the substring \(u\) of \(w\).)

We consider three cases:

- \(u\in A\): By induction hypothesis, \(\delta^\ast(u)=q_0\). Therefore, . \[ \delta^\ast(w)=\delta(q_0,x) = \begin{cases} q_0 & \text{ if $x=0$,}\\ q_1 & \text{ if $x=1$.} \end{cases} \] The claim is true in this case because \(u0\in A\) and \(u1\in B\).
- \(u\in B\): By induction hypothesis, \(\delta^\ast(u)=q_1\). Therefore, \[ \delta^\ast(w)=\delta(q_1,x) = \begin{cases} q_2 & \text{ if $x=0$,}\\ q_1 & \text{ if $x=1$.} \end{cases} \] The claim is true in this case because \(u0 \in C\) and \(u1\in B\).
- \(u\in C\): By induction hypothesis, \(\delta^\ast(u)=q_1\). Therefore, \[ \delta^\ast(w)=\delta(q_1,x) = \begin{cases} q_1 & \text{ if $x=0$,}\\ q_1 & \text{ if $x=1$.} \end{cases} \] The claim is true in this case because both \(u0 \in B\) and \(u1\in B\).

We conclude that the claim is true in all three cases, which proves the claim.