# Lectures 9 & 10: Limitations of Finite Automata

## Pumping Lemma

The following theorem, known as pumping lemma gives a way to show that a language is not regular.

Theorem: Suppose $L$ is the language of a finite automaton $M$. Then, there exists a number $p$ (called pumping length) such that for every word $w \in L$ of length at least $p$, there exists a decomposition $w=xyz$ with the following properties:

• $x y^i z\in L$ for every integer $i\ge 0$,
• $\lvert y \rvert > 0$,
• $\lvert xy \rvert \le p$.

Proof: We may assume that $M$ doesn’t have $\varepsilon$-transitions. (We could even assume $M$ is deterministic.) We choose $p$ as the number of states of $M$. Let $w\in L$ be a word in the language $L$. We are to show a decomposition of $w$ with the properties above.

Consider a computation path of $M$ that accepts $w$. Since $w$ has length at least $p$, one state of $M$ appears at least twice on this computation path (by the pigeonhole principle). Let $q$ be the first repeated state on the computation path. We divide the computation path into three parts: from the start state to the first occurance of $q$, from the first occurance of $q$ to its second occurance, and from the second occurance of $q$ to an accept state.

Let $x,y,z$ be the strings corresponding to these three parts of the computation path. Then, $w=xyz$. We are to check that this decomposition satisfies the above properties. First, $\lvert y \rvert > 0$ because $M$ doesn’t have $\varepsilon$-transitions. Second, $\lvert xy \rvert \le p$ because $q$ is the first repeated state and therefore all other states on the first and second part of the path are distinct. Finally, $x y^i z\in L$ for every integers $i\ge 0$, because we can obtain accepting computation paths for these strings by repeating the second part of the paths $i$ times. $\square$

## Examples

Lemma: The language $\{0^n1^n \mid n \in {\mathbb N}\}$ is not regular.

Proof: We will show that for any potential pumping length $p$, there exists a word $w\in L$ of length $\lvert w \rvert \ge p$ that does not admit a decomposition as in the pumping lemma. Then, from the pumping lemma it follows that the language cannot be the language of a finite automata.

Let $p$ be a natural number. Consider the word $w=0^p1^p$ in $L$. Let $w=xyz$ be any decomposition of this word that satisfies the second and third condition of the pumping lemma. Then, $x=0^s$ and $y=0^t$ for integers $s\ge 0$ and $t\ge 1$ (because $\lvert x y \rvert \le p$ and $\lvert y \rvert \ge 1$). But then $xy^2z=0^{p+t}1^p\notin L$, which means that the decomposition doesn’t satisfy the first condition of the pumping lemma. $\square$

\

The next example shows that sometimes one has to be careful to choose the right word $w$ in the language to show that the conditions of the pumping lemma are not satisfied.

Lemma: The language $\{ ww\mid w\in \{0,1\}^\ast \}$ is not regular.

Proof: Same proof strategy as before.

Let $p$ be a natural number. Consider the word $0^p10^p1$ in $L$. Let $w=xyz$ be any decomposition of this word that satisfies the second and third condition of the pumping lemma. As before, $x=0^s$ and $y=0^t$ for integers $s\ge 0$ and $t\ge 1$. Again $x y^2z=0^{p+t}10^{p}1$ is not in $L$, which means that the decomposition doesn’t satisfy the first condition of the pumping lemma. (Convince yourself that for example, for $w=(01)^p\in L$ the proof would fail.) $\square$

\

The following example shows that sometimes it is useful to “pump down” (which means to choose $i=0$ to show that the first condition in the pumping lemma is violated).

Lemma: The language $\{0^i 1^j \mid i \geq j \}$ is not regular.

Proof: Same proof strategy as before.

Let $p$ be a natural number. Consider the word $0^p1^p$. Let $w=xyz$ be any decomposition of this word that satisfies the second and third condition of the pumping lemma. As before, $x=0^s$ and $y=0^t$ for integers $s\ge 0$ and $t\ge 1$. But then, the word $xy^0z=0^{p-t}1^p$ is not in $L$, which means that the first condition in the pumping lemma is not satisfied. (Convince yourself that $xy^iz$ is in the language for all integers $i\ge 1$.) $\square$