Lecture 3: Hard functions

Generic lower bound

Theorem: For every \(n\in{\mathbb N}\) with \(n\ge 100\), there exists an \(n\)-bit Boolean function that requires circuits of size at least \(2^{n}/100n\).

Proof: The idea is to compare the number of \(n\)-bit Boolean functions \(f{\colon}\{0,1\}^n\to \{0,1\}\) to the number of circuits of size at most \(s=2^n/100n\). If the number of functions is larger than the number of such circuits, it means that there exists a functions that is not computed by such a circuit.

Let us first count the number of \(n\)-bit Boolean functions. In general, the number of functions from a finite set \(X\) to a finite set \(Y\) is equal to \(|Y|^{|X|}\). (You should convince yourself of this fact.) Hence, the number of \(n\)-bit Boolean functions is \(2^{2^n}\).

Now let’s count the number of size-\(s\) circuits. A circuit of size \(s\) consists of a directed acyclic graph (DAG) on \(s'=s+n\le 2s\) nodes (\(n\) input nodes and \(s\) gates). Each node has in-degree at most \(2\). (The in-degree of a node is the number of ingoing edges.) Furthermore, each input node (in-degree \(0\)) is labeled with an input variable \(i\in[n]\). Each gate is labeled with a Boolean operation, either \({\mathrm{\scriptstyle AND}}\), \({\mathrm{\scriptstyle OR}}\), or \({\mathrm{\scriptstyle NOT}}\).

We claim that we can describe such a circuit completely by a bit string of length at most \(10s\log s\). First, we give every node a unique ID. Since there are only \(s'\) nodes in the circuit, we only need \(\log s'\) bits for the ID. (We can use the numbers from \(1\) to \(s'\) as IDs.) Our description of the circuit will consist of \(s'\) parts, each of length at most \(10\log s'\) (one part per vertex). To describe a single vertex \(u\), we store the ID of \(u\) (\(\log s'\) bits), the IDs of the vertices that have incoming arcs to \(u\) (\(2 \log s'\) bits) and the label of \(u\), which is either a Boolean function or the index of an input/output variable (at most \(\log n\le \log s'\) bits). We don’t need to store the outgoing arcs of a vertex, because those will be stored as incoming arcs by other vertices. In total, we store at most \(4\log s'\) bits per vertex. Hence, the description of the whole circuit is indeed shorter than \(10s\log s\) bits.

It follows that the number of circuits of size \(s\) is at most \(2^{10 s\log s}\) (the number of bit strings of length \(10 s \log s\)). For \(s=2^n/100n\), this number is much less than \(2^{2^n}\). (Check this calculation.) It follows that an overwhelming majority of \(n\)-bit Boolean functions does not have circuits of size \(2^n/100n\).


Remark: While the theorem shows that some functions require large circuits, the proof of the theorem does not give an explicit construction of such a function. (The proof is non-constructive.) It’s a famous open problem to show strong lower bounds on the circuit size of concrete functions. For example, the following funtion is conjectured to require circuits of exponential size \[ \mathrm{\scriptstyle CIRCUIT-SAT}_n(x) := \begin{cases} 0 & \text{ if $x$ is the binary encoding of a circuit}\\ & \text{ that computes the constant $0$ function,}\\ 1 & \text{ otherwise}. \end{cases} \]

Straight-line Programs

In this part, we will discuss an exact correspondence between circuits and a simple kind of programs.

Here is an example of a straight-line program:

Input: \(x_1,x_2\)

  • \(y_1 := \neg x_1\)
  • \(y_2 := \neg x_2\)
  • \(y_3 := x_1 \land y_2\)
  • \(y_4 := y_1 \land x_2\)
  • \(y_5 := y_3 \lor y_4\)

Output: \(y_5\)

This program outputs the parity of the two input bits. We can construct a circuit that corresponds to this program in the following way: For every instruction, we have a gate in the circuit that applies the same Boolean operation and has incoming edges from nodes corresponding to the right-hand-side variables of the instruction. The size of the resulting circuit is exactly equal to the number of instructions of the straight-line program.

Definition: A straight-line program is a sequence of instructions of the following form: Each instruction applies a basic Boolean operation (\({\mathrm{\scriptstyle NOT}},{\mathrm{\scriptstyle AND}},{\mathrm{\scriptstyle OR}}\)) to input variables or previously assigned variables and assigns the outcome of this operation to a new variable. The output of the program is the outcome of last instruction. The length of a straight-line program is the number of instructions.

Exercise: A Boolean function has a straight-line program of length at most \(\ell\) if and only if it has a Boolean circuit of size at most \(\ell\).