# Lecture 4: Deterministic Finite Automata

The following figure shows the state diagram of a deterministic finite automaton (DFA),

The automaton has three states $q_0$, $q_1$, and $q_2$. The state $q_0$ is designated as the start state and $q_1$ is designated as an accept state. The directed edges between states specify the transitions of the automaton when reading an input string.

This automaton accepts the set of strings that contain $1$ and have an even number of $0$’s after the last $1$.

## Languages

An alphabet $\Sigma$ is a finite set of symbols, e.g., $\Sigma=\{0,1\}$ or $\Sigma={a,b,c}$. (We will usually content ourselves with the binary alphabet.) A string $w$ over $\Sigma$ is a finite sequence of symboles, written as $w=x_1\cdots x_n$ for $x_1,\ldots,x_n\in \Sigma$. The empty string (string of length zero) is denoted $\varepsilon$. The set of all strings is denoted $\Sigma^\ast$. A language $L$ is any subset of strings, so that $L\subseteq \Sigma^\ast$.

We will use languages to encode computational problems. (A language $L$ can be thought to encode the function $f{\colon}\Sigma^\ast\to\{0,1\}$ with $f(w)=1$ if and only if $w\in L$).

Operations on languages: Let $A,B\subseteq \Sigma^*$ be two languages. The following operations on languages are useful:

• Basic set-theoretic operations, union $A\cup B$, intersection $A\cap B$, and complementation $\overline A:=\Sigma^* \setminus A$. (These operations correspond to ${\mathrm{\scriptstyle OR}},{\mathrm{\scriptstyle AND}},{\mathrm{\scriptstyle NOT}}$.)
• Concatentation $AB := \{uv \mid u\in A, v\in B\}$.
• Kleene star $A^*:=A0A1A^2$, where $A^i$ denotes the $i$-fold concatenation of $A$ with itself. (The corner cases $i=0,1$ are defined so that the rule $A^{i}A^j=A^{i+j}$ holds. Concretely, $A^0=\{\varepsilon\}$ and $A^1=A$).

(Notice that the Kleene star notation is compatible with the notation for the set of all strings.)

## Deterministic Finite Automaton

Definition: A deterministic finite automaton $M$ consists of the following parts:

• a transition function $\delta{\colon}{} Q\times \Sigma \to Q$ for an alphabet $\Sigma$ and a set of states $Q$,
• a start state $q_0\in Q$,
• a set of accept states $F\subseteq Q$.

## Language of a DFA

Definition: A DFA $M$ accepts a string $w=x_1\cdots x_n$ if there exists a sequence of states $r_0,r_1,\ldots,r_n\in Q$ such that

• the first state is the start statem $r_0=q_0$
• the $i$th state is obtained by reading the $i$th symbol, $r_i=\delta(r_{i-1},x_i)$
• the final state is an accept state, $r_n\in F$.

The language of $M$ is defined as $L(M) = \{ w \mid \text{M accepts w}\}.$ If $L$ is the language of some deterministic finite automaton, we say that $L$ is regular.

## Example proof

Recall the automaton $M$ from the beginning:

Lemma: The language of the automaton $M$ is the set of all strings $w\in\{0,1\}^*$ such that $w$ contains a $1$ and ends with an even number of $0$’s after the last $1$.

We will prove this lemma by inductions. To make the proof easier, we will show a stronger statement. (Proofs by induction often become easier if the statement is strengthened.)

Define the repeated transition function $\delta^*{\colon}\Sigma^*\to Q$, $\delta^*(w) := \text{state of M after reading w}$ This function satisfies $\delta^\ast(w x)=\delta(\delta^\ast(w),x)$ for every string $w\in \Sigma^\ast$ and symbol $x\in \Sigma$. Furthermore, $L(M) = \{w \mid \delta^\ast(w) \in F \}.$ Hence, the following claim implies the lemma.

Claim: The repeated transition function of $M$ is $\delta^\ast(w) = \begin{cases} q_0 & \text{ if w does not contains 1 (A)},\\ q_1 & \text{ if w contains 1 and ends with an even number of 0's after the the last 1 (B)},\\ q_2 & \text{ if w contains 1 and ends with an odd number of 0's after the the last 1 (C)}. \end{cases}$ Proof of claim: By induction on the length of $w$. (At this point, the proof is completely mechanical.)

If $w=\varepsilon$, then $\delta^*(w)=q_0$ and the claim is true (because $\varepsilon\in A$).

Suppose length of $w=ux$ for $x\in \{0,1\}$ is greater than $0$. The induction hypothesis is that the claim holds for all strings that are shorter than $w$. (In particular, the claim holds for the substring $u$ of $w$.)

We consider three cases:

• $u\in A$: By induction hypothesis, $\delta^\ast(u)=q_0$. Therefore, . $\delta^\ast(w)=\delta(q_0,x) = \begin{cases} q_0 & \text{ if x=0,}\\ q_1 & \text{ if x=1.} \end{cases}$ The claim is true in this case because $u0\in A$ and $u1\in B$.
• $u\in B$: By induction hypothesis, $\delta^\ast(u)=q_1$. Therefore, $\delta^\ast(w)=\delta(q_1,x) = \begin{cases} q_2 & \text{ if x=0,}\\ q_1 & \text{ if x=1.} \end{cases}$ The claim is true in this case because $u0 \in C$ and $u1\in B$.
• $u\in C$: By induction hypothesis, $\delta^\ast(u)=q_1$. Therefore, $\delta^\ast(w)=\delta(q_1,x) = \begin{cases} q_1 & \text{ if x=0,}\\ q_1 & \text{ if x=1.} \end{cases}$ The claim is true in this case because both $u0 \in B$ and $u1\in B$.

We conclude that the claim is true in all three cases, which proves the claim.