Exercise 1 (Size of binary representation)
- Show that the size of the binary representation of a number is logarithmic with respect to the number itself. More precisely, show that for every word x \in \{1\}\{0,1\}^*, |x| = \lfloor \log_2(\mathtt{value}_2(x)) \rfloor + 1.
- Let \mathtt{Composite} be the language of all numbers (written in binary) that are composite, i.e., that have a non-trivial factor. Give an algorithm for deciding memebership in \mathtt{Composite} and analyze its running time in terms of the size of the input.
- Let \mathtt{Factoring} be the language of all pairs of numbers (N,k) (written in binary) such that N has a non-trivial factor smaller than k. Give an algorithm for deciding membership in \mathtt{Factoring} and analyze its running time in terms of the size of the input.
- Show that both \mathtt{Composite} and \mathtt{Factoring} are in \mathbf{NP}.
Exercise 2 (Travelling salesperson problem) Suppose that we are given an instance of the Travelling Salesperson Problem (\mathtt{TSP}) with n cities and distances d_{ij}. For each subset S of the cities excluding city 1, and for each j \in S, define c[S,j] to be the shortest path that starts from city 1, visits all cities in S and ends up in city j.
- Give an algorithm that calculates c[S,j] by dynamic programming, that is, progressing from smaller to larger sets S and using a recurrent definition of c[S,j]. Show that this algorithm solves the \mathtt{TSP} in time O(n^2 2^n). What are the space requirements of the algorithm?
- Suppose we wish to find the shortest (in the sense of sum of weights) path from 1 to n, not necessarily visiting all cities. Argue why this problem can be solved in polynomial time.
Exercise 3 (Searching for cliques) A k-clique in a graph G = (V,E) is a complete subgraph of order k of G, that is, a subgraph on k vertices having all possible edges between them. We say that G' is a clique in G if G' is a k-clique in G for some k. Recall the following well-known NP-complete problem. \mathtt{Clique}\text{: Given an undirected graph $G$ and a positive integer $k$, determine whether $G$ contains a $k$-clique}.
Some problems related to \mathtt{Clique} are the following.
Maximal clique. A clique is maximal if it cannot be enlarged, that is, if there is no larger clique containing it. Describe an algorithm as efficient as possible that, given an undirected graph G, finds a maximal clique of G. What’s the cost of your algorithm?
Maximum clique. A clique is maximum if there is no other clique having more vertices (a maximum clique is always maximal but the reverse inclusion is not always true). Describe an algorithm as efficient as possible that, given an undirected graph G, finds a maximum clique of G. What’s the cost of your algorithm?
Planar clique. Recall that a graph is planar if it can be drawn on the plane without edge crossings. Let \mathtt{Planar Clique} be the following problem: given a planar undirected graph G and a positive integer k, does G have a k-clique? Show that \mathtt{Planar Clique} has a polynomial-time algorithm.
NotePlanarity testing can be done in linear time (on the number of vertices). You can assume the existence of these algorithms for the exercise.
Half clique. Let \mathtt{Half Clique} be the following problem: given an undirected graph G, does G have a clique with at least \big\lceil \frac{|V(G)|}{2} \big\rceil vertices? Prove that \mathtt{Half Clique} is \mathbf{NP}-complete.
Exercise 4 (Closure properties of \mathbf{P}, \mathbf{NP} and \mathbf{coNP}) The goal of this exercise is to revise some closure properties of \mathbf{P}, \mathbf{NP}, and \mathbf{coNP}.
- (closure w.r.t union) Prove the following implications
- Given A and B in \mathbf{P}, A\cup B\in \mathbf{P}.
- Given A and B in \mathbf{NP}, A\cup B\in \mathbf{NP}.
- Given A and B in \mathbf{coNP}, A\cup B\in \mathbf{coNP}.
- (closure w.r.t intersection) Prove the following implications
- Given A and B in \mathbf{P}, A\cap B\in \mathbf{P}.
- Given A and B in \mathbf{NP}, A\cap B\in \mathbf{NP}.
- Given A and B in \mathbf{coNP}, A\cap B\in \mathbf{coNP}.
- (closure w.r.t concatenation) Prove the following implications
- Given A and B in \mathbf{P}, A\cdot B\in \mathbf{P}.
- Given A and B in \mathbf{NP}, A\cdot B\in \mathbf{NP}.
- Given A and B in \mathbf{coNP}, A\cdot B\in \mathbf{coNP}.
Exercise 5 (Polynomial-time reductions) Consider the relation \le_m^p among languages and justify your answers to the following questions.
- Is \le_m^p reflexive? That is, does it hold that A \le_m^p A for any language A?
- Is \le_m^p symmetric? That is, does it hold that if A \le_m^p B, then B \le_m^p A for any languages A, B?
- Is \le_m^p antisymmetric? That is, does it hold that A \le_m^p B and B \le_m^p A imply A = B for any languages A, B?
- Is \le_m^p transitive? That is, does it hold that A \le_m^p B and B \le_m^p C imply A \le_m^p C for any languages A, B, and C?
Recall that given two languages A, B over the same alphabet \Sigma, we say that A polynomial-time reduces to B (denoted as A\leq_m^p B or A\leq_p B) if there exists a total function f:\Sigma^*\to\Sigma^* computable in polynomial time s.t. for every w\in \Sigma^*, w\in A if and only if f(w)\in B.
A useful property is the closure of the classes \mathbf{P}, \mathbf{NP} i \mathbf{coNP} under polynomial-time reductions, that is, given A\leq_m B,
- if B\in \mathbf{P}, then A\in \mathbf{P},
- if B\in \mathbf{NP}, then A\in \mathbf{NP}, and
- if B\in \mathbf{coNP}, then A\in \mathbf{coNP}.
The m in \leq_m stands for the fact that f is many-one, that is, not necessarily injective.
Exercise 6 (Closure under the Kleene star)
Show that \mathbf{P} is closed under the Kleene star.
TipUse dynamic programming. Given a set A \in \mathbf{P} over \Sigma and an input x = x_1 \dots x_n for x_i \in \Sigma, build a table indicating for each i < j whether the substring x_i \dots x_j is in A^*.
Show that \mathbf{NP} is closed under the Kleene star.
Exercise 7 (Search vs decision) Show the following consequences of the hypothesis \mathbf{P} = \mathbf{NP}.
There is a polynomial-time algorithm that produces a satisfying assignment when given a satisfying Boolean formula.
Integers can be factored in polynomial time.
There is a polynomial-time algorithm that takes an undirected graph as input and finds a largest clique contained in that graph.
The algorithms you are asked to provide compute a function, but \mathbf{NP} contains languages, not functions. The \mathbf{P} = \mathbf{NP} assumption implies that deciding satisfiability, compositeness, and the existence of cliques of a given size is all solvable in polynomial time. But even though the assumption does not show how solutions are found, you must show that you can find them anyway.
Exercise 8 (Berman’s theorem) A language is called unary if every string in it is of the form 1^n for some n \ge 0.
Prove Berman’s theorem (1978): if a unary language is \mathbf{NP}-complete, then \mathbf{P} = \mathbf{NP}.
Use the fact that \mathtt{SAT} is \mathbf{NP}-complete and, by hypothesis, reducible to a unary language via some function f. To decide \mathtt{SAT}, given an input Boolean formula \phi, consider the tree whose root (level 0) is \phi and whose formulas at level k are obtained from the ones at level k-1, where the k-th variable takes the 2 possible truth values. Show how f allows us to explore this tree in polynomial time.
Exercise 9 (\mathbf{NP} and \mathtt{HALT}) Show that \mathtt{HALT} is \mathbf{NP}-hard. Is it \mathbf{NP}-complete?
Recall that \mathtt{HALT}=\{\langle x,y \rangle \mid M_x(y)\!\downarrow\}, where M_x is the Turing machine with Gödel number x and the \downarrow means that the machine terminates.
Exercise 10 (\mathtt{UniqueSAT}) The class \mathbf{DP} (for difference polynomial time) is defined as the set of languages L for which there are two languages L_1 \in \mathbf{NP}, L_2 \in \mathbf{coNP} such that L = L_1 \cap L_2. Notice that since \overline{L_2} \in \mathbf{NP}, L is the difference of two \mathbf{NP} sets (but do not confuse \mathbf{DP} with \mathbf{NP} \cap \mathbf{coNP}, which may seem superficially similar).
Consider the problem \mathtt{Unique SAT} on determining whether a Boolean formula has a unique satisfying assignment (or model) and show the following facts.
- \mathtt{Unique SAT} \in \mathbf{DP}.
- If \mathtt{Unique SAT} is in \mathbf{NP}, then \mathbf{NP} = \mathbf{coNP}.
Exercise 11 (Are they in \mathbf{P}?) Prove that the following languages on undirected graphs are in \mathbf{NP}. Which ones of them are in \mathbf{P}?
- Two coloring. \mathtt{2COL} = \{ G \mid graph G has a coloring with two colors\}, where a k-coloring of G is an assignment of a number in \{1,\dots,k\} to each vertex of G such that no adjacent vertices get the same number.
- Three coloring. \mathtt{3COL} = \{ G \mid graph G has a coloring with three colors\}.
- Hamiltonian path. \mathtt{HP} = \{ G \mid graph G has a Hamiltonian path\}, where a path is said to be Hamiltonian if it visits every vertex exactly once.
- Connectivity. \mathtt{CONNECTED} =\{ G \mid graph G is a connected graph\}.
Exercise 12 (Representation of languages over a binary alphabet) Let A and B be two languages over the same alphabet \Sigma, and let e : \Sigma \to \{0,1\}^k be an injective function that maps each symbol in \Sigma to a binary string of length k. For a string x=x_1x_2\cdots x_n \in \Sigma^*, we define the encoding of x as e^+(x) = e(x_1)e(x_2)\cdots e(x_n) of length k\cdot n, that is the string where each symbol of x is mapped under e. Let A'=\{e^+(x) \mid x\in A\} and B'=\{e^+(x) \mid x\in B\}.
Prove the following statements:
- The language A is in \mathbf{P} if and only if the language A' is in \mathbf{P}.
- The language A is in \mathbf{NP} if and only if the language A' is in \mathbf{NP}.
- The language A is in \mathbf{coNP} if and only if the language A' is in \mathbf{coNP}.
- A\leq_m^p B if and only if A'\leq_m^p B'.
Exercise 13 (Subset-Sum/Knapsack) For n\in \mathbb N let [n]=\{1,\dots,n\}. Consider the following languages.
\texttt{SUBSET-SUM}=\left\{\langle c, x_1, \ldots, x_n\rangle :\ \begin{array}{c} c, x_1, \ldots, x_n\in \mathbb N \text{ and } \\ \exists S \subseteq [n]\ \sum_{i \in S} x_i = c\end{array}\right\}
\texttt{SUBSET-SUM-INT}=\left\{\langle c_1, c_2, x_1, \ldots, x_n\rangle :\ \begin{array}{c} c_1, c_2, x_1, \ldots, x_n\in \mathbb N \text{ and } \\ \exists S \subseteq [n]\ c_1\leq \sum_{i \in S} x_i \leq c_2\end{array}\right\}
\texttt{KNAPSACK}=\left\{\langle w_1, \ldots, w_n, v_1, \ldots, v_n, W, V\rangle :\ \begin{array}{c} w_1, \ldots, w_n, v_1, \ldots, v_n, W, V\in \mathbb N \text{ and } \\ \exists S \subseteq [n]\ \sum_{i \in S} w_i \leq W \text{ and } \sum_{i \in S} v_i \geq V\end{array}\right\}
Unless stated otherwise, in the following exercises we assume that the input is encoded in binary.
- Show that if the input is encoded in unary, then \texttt{SUBSET-SUM} can be solved in polynomial time. (Hint: use dynamic programming.)
- Show that if the input is encoded in unary, then \texttt{KNAPSACK} can be solved in polynomial time. (Hint: use dynamic programming.)
- Show that \texttt{SUBSET-SUM} is \mathbf{NP}-complete. (Hint: reduce from \texttt{3-SAT}.)
- Show that \texttt{KNAPSACK} is \mathbf{NP}-complete.
- Show that \texttt{SUBSET-SUM-INT} is \mathbf{NP}-complete.
Exercise 14 (Proof systems) A proof system for a language L\subseteq \{0,1\}^* is a polynomial-time computable function P : \{0,1\}^* \to L.
A proof system P is polynomially bounded if there exists a polynomial p such that for every x\in L, there exists a string w of length at most p(|x|) such that P(w) = x. One can think of w as a proof of the fact that x\in L, and being polynomially bounded means every x\in L has a polynomial-sized proof.
- Prove that a non-empty language L is in \mathbf{NP} if and only if there is a polynomially bounded proof system for L.
- Prove that if there is a polynomially bounded proof system for \texttt{TAUT}, then \mathbf{NP} = \mathbf{coNP}.
- When P and P' are two proof systems for the same language L, we say that P' simulates P if there is a polynomial p s.t. for all w\in \{0,1\}^*, there exists a string w' such that P'(w') = P(w) and |w'| \leq p(|w|). Show that if P is a polynomially bounded proof system for L, and P' simulates P, then P' is also polynomially bounded.
Recall that \texttt{TAUT} =\{\langle \varphi \rangle : \varphi \text{ is a Boolean formula in DNF and a tautology}\}.
Exercise 15 (On padding and Ladner’s theorem) The goal of this guided exercise is to see the first steps needed to prove Ladner’s theorem, that is the following statement: If \mathbf{P}\neq \mathbf{NP}, then there exists a language L such that L\in \mathbf{NP}\setminus \mathbf{P} and L is not \mathbf{NP}-complete.
Given two Boolean strings w,z with w\circ z we denote the concatenation of them. Recall that \mathtt{SAT} (Boolean Satisfiability Problem) is the problem of determining if there exists a True/False assignment that satisfies a given Boolean formula.
- Let \mathtt{SAT}'=\{ \varphi\circ 1^{2^n} : |\varphi|=n \land \varphi\in \mathtt{SAT} \}. Show that \mathtt{SAT}' is in \mathbf{P}.
- Let \mathtt{SAT}^{\prime\prime}=\{ \varphi\circ 1^{n^c} : |\varphi|=n \land \varphi\in \mathtt{SAT} \}, where c is a constant. Show that \mathtt{SAT}^{\prime\prime} is \mathbf{NP}-complete.
- For any function H(n), let \mathtt{SAT}_H=\{ \varphi\circ 1^{H(n)} : |\varphi|=n \land \varphi\in \mathtt{SAT} \}. Show that \mathtt{SAT}_H is in \mathbf{NP}.
- For the rest of this exercise we focus on a particular function H. Let H(n) be the smallest natural number i\leq \log\log n such that the TM M_i solves x\in \mathtt{SAT}_H in time i|x|^i, for all inputs x\in \{0,1\}^* with |x|\leq \log n. If there is no such i, then let H(n)=\log\log n.
- H(n) is defined referencing itself, however this definition is not cyclic. That is, show that H(n) is well defined.
- Show that, if \mathtt{SAT}_H\in \mathbf{P}, then there exists a constant c such that H(n)\leq c for all n.
- Show that if \mathtt{SAT_H}\in \mathbf{P}, then \mathtt{SAT}\leq_p \mathtt{SAT}_H and hence \mathbf{P}=\mathbf{NP}.
To prove Ladner’s theorem then it it only remained to argue that if \mathtt{SAT}_H were \mathbf{NP}-complete, then \mathtt{SAT}\in \mathbf{P} (this is a bit more technical, and beyond the scope of this exercise).
Putting everything together, if \mathbf{P}\neq \mathbf{NP}, then \mathtt{SAT}_H is a language in \mathbf{NP}\setminus \mathbf{P} that is not \mathbf{NP}-complete.
Exercise 16 (Oracles) An oracle Turing machine is a Turing machine M that has a special tape called the oracle tape and three special states q_{query}, q_{yes}, and q_{no}. To execute M, in addition to the input we specify a language O\subseteq \{0,1\}^* that is used as an oracle for M. Whenever during the execution M enters the state q_{query}, the content of the oracle tape is interpreted as a query to the oracle O and, in one step, M transitions to q_{yes} if x\in O and to q_{no} otherwise. If M is an oracle machine, O \subseteq \{0,1\}^* a language, and x \in \{0,1\}^*, then we denote the output of M on input x and oracle O by M(x)^O. Nondeterministic oracle TMs are defined analogously.
For any O \subseteq \{0,1\}^*, \mathbf{P}^O is the class of languages that can be decided by deterministic Turing machines with oracle access to O in polynomial time and \mathbf{NP}^O is defined analogously for nondeterministic oracle Turing machines. Given a class {\cal C} of languages, we write \mathbf{P}^{\cal C} = \bigcup_{O \in {\cal C}} \mathbf{P}^O and \mathbf{NP}^{\cal C} = \bigcup_{O \in {\cal C}} \mathbf{NP}^O.
- Show that for every oracle O\in \mathbf{P}, \mathbf{P}^O = \mathbf{P}.
- Show that \mathbf{NP} \cup \mathbf{coNP}\subseteq \mathbf{P}^{\mathtt{SAT}}.
- Define the language \mathtt{EXPCOM} = \{(M,x,1^N) \mid M is a deterministic Turing machine that accepts x within 2^N steps\}. Show that \mathbf{P}^{\mathtt{EXPCOM}} = \mathbf{NP}^{\mathtt{EXPCOM}} = \mathbf{EXP}.
- Show that \mathbf{NP}^{\mathbf{NP} \cap \mathbf{coNP}} = \mathbf{NP}.
The Baker-Gill-Solovay theorem states that there are oracles A and B such that \mathbf{P}^A = \mathbf{NP}^A and \mathbf{P}^B \neq \mathbf{NP}^B. This shows that the question of whether \mathbf{P} = \mathbf{NP} cannot be resolved by relativizing techniques, that is, techniques that hold for all oracles, such as diagonalization.