Open and Closed Problems in NP-Completeness David S. Johnson Columbia University


January, 1979

Current Amazon Best Sellers Rank: #479,283 in Books; #3 in Books > Computers & Technology > Programming > APIs & Operating Environments > Device Drivers


Stephen Cook and Leonid Levin: Invented the concept of NP-completeness contemporaneously and independently, published in 1971 [Cook] and 1973 [Levin].

Richard Karp: First illustrated the breadth of applicability of NP-completeness [1972].

The Cartoons

Definitions

P = Set of decision problems (ones with yes-no answers) that can be solved in polynomial time.

– O(N), O(N log N), O(N^2), O(N^3), …, O(N^100), …

NP = Set of decision problems such that, if the answer is yes, there exists a proof of that fact that can be verified in polynomial time.

Example 1: SATISFIABILITY

Given: Logical expression in conjunctive normal form, e.g.

(x or ¬y or z) and (¬z or y or ¬x) and (y or ¬z or ¬x)

Question: Is the expression “satisfiable”? That is, is there an assignment of “true” or “false” to the variables such that all the clauses are true?

In this case, the answer is yes.

For instance, one satisfying truth assignment is

x = true, y = true, z = true

and this “proof” can be verified in polynomial time.
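To make the verification step concrete, here is a minimal sketch of a checker for the example above. The encoding of clauses as lists of (variable, polarity) pairs and the name `verify_assignment` are assumptions made for illustration; they do not come from the talk.

```python
# A clause is a list of literals; a literal is (variable_name, is_positive).
# This encodes the three-clause example expression from the slide.
FORMULA = [
    [("x", True), ("y", False), ("z", True)],    # (x or not y or z)
    [("z", False), ("y", True), ("x", False)],   # (not z or y or not x)
    [("y", True), ("z", False), ("x", False)],   # (y or not z or not x)
]


def verify_assignment(formula, assignment):
    """Return True if every clause contains at least one true literal.

    Each literal is inspected at most once, so the check takes time
    linear in the length of the formula.
    """
    return all(
        any(assignment[var] == positive for var, positive in clause)
        for clause in formula
    )


print(verify_assignment(FORMULA, {"x": True, "y": True, "z": True}))
```

The checker only confirms a proposed assignment; it says nothing about how hard it is to find one.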

Example 2: The Traveling Salesman Problem

Given: Set of cities {c1,c2,…,cN }.

For each pair of cities {ci,cj}, an integer distance d(ci,cj).

Find: Permutation π: {1,2,…,N} → {1,2,…,N} that minimizes

$\sum_{i=1}^{N-1} d(c_{\pi(i)}, c_{\pi(i+1)}) + d(c_{\pi(N)}, c_{\pi(1)})$

Decision Problem Version: Given an integer k and a TSP instance, is there a tour of length k or less?

Exercise: If the Decision Problem Version can be solved in polynomial time, then an optimal tour can be found in polynomial time.
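One standard way to approach the exercise is sketched below, under stated assumptions: distances are symmetric non-negative integers, N ≥ 3, and the decision procedure is passed in as a function. The names `optimal_tour`, `brute_force_decision`, and the sample matrix `DIST` are hypothetical, and the brute-force oracle is only an exponential-time stand-in so the sketch can run. The idea is to binary-search for the optimal length, then "price out" every edge the oracle says is not needed.

```python
from itertools import permutations


def tour_length(dist, order):
    """Length of the closed tour visiting cities in the given order."""
    n = len(order)
    return sum(dist[order[i]][order[(i + 1) % n]] for i in range(n))


def brute_force_decision(dist, k):
    """Stand-in oracle: is there a tour of length k or less? (Exponential time.)"""
    n = len(dist)
    return any(tour_length(dist, (0,) + p) <= k for p in permutations(range(1, n)))


def optimal_tour(dist, decide):
    """Find an optimal tour using only polynomially many calls to `decide`."""
    n = len(dist)
    # Step 1: binary search for the optimal tour length.
    lo, hi = 0, n * max(max(row) for row in dist)
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(dist, mid):
            hi = mid
        else:
            lo = mid + 1
    opt = lo
    # Step 2: raise each edge to opt + 1 unless that destroys every optimal tour.
    work = [row[:] for row in dist]
    kept = []
    for i in range(n):
        for j in range(i + 1, n):
            saved = work[i][j]
            work[i][j] = work[j][i] = opt + 1
            if not decide(work, opt):
                work[i][j] = work[j][i] = saved  # edge is indispensable
                kept.append((i, j))
    # Step 3: the indispensable edges form a single optimal tour; walk it.
    adj = {v: [] for v in range(n)}
    for i, j in kept:
        adj[i].append(j)
        adj[j].append(i)
    tour, prev, cur = [0], None, 0
    while len(tour) < n:
        nxt = adj[cur][0] if adj[cur][0] != prev else adj[cur][1]
        tour.append(nxt)
        prev, cur = cur, nxt
    return opt, tour


DIST = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]
print(optimal_tour(DIST, brute_force_decision))
```

The number of oracle calls is O(log(N · max distance)) for the search plus one per pair of cities, so a polynomial-time oracle would yield a polynomial-time optimizer.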

NP-Completeness

Mathematical Observation: P ⊆ NP

Empirical Observation: There are thousands of well-studied problems in NP (including SATISFIABILITY and the TSP) for which we know no polynomial-time algorithms, so many people assume that P ≠ NP.

Behavioral Definition: The “NP-complete” problems are problems in NP that can only be in P if P = NP.

Early Open Problem

How do we formally define and name this class?

Why “NP-Completeness”?

First, some terminological history:

The idea that the class of problems solvable in polynomial time was a useful subject of study was first formulated in [Cobham, 1964] and [Edmonds, 1965].

A different [Edmonds, 1965] paper introduced the concept of a “good characterization,” which essentially captured the idea of what we now call NP.

[Photos: Jack Edmonds, 2014; Jack Edmonds, 1957; Alan Cobham, 2010]


[Diagram: the classes FP and FNP]

[Cook, 1971] made the useful restriction to decision problems (“languages”), defined the classes P and NP without giving them names, and stated his famous theorem as being about problems complete for NP under what we now call “polynomial-time Turing reductions”:

• A → B if problem A can be solved in polynomial time when given a subroutine for solving B that provides answers in constant time.

[Diagram: the classes FP, FNP, NP, and P, with the problems complete for NP under Turing reductions marked]

[Karp, 1972] introduced the names P and NP, and observed that Cook’s theorem would still hold if we used a simpler notion of reduction, which we now call a “polynomial transformation”:

• A → B if, given an instance X of A, we can in polynomial time construct an instance f(X) of B such that the answer for X is “yes” if and only if the answer for f(X) is also “yes”.
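As an illustration of a polynomial transformation, the sketch below uses a textbook pair of problems that the talk does not single out here: INDEPENDENT SET (A) and VERTEX COVER (B). The map f sends (G, k) to (G, |V| - k), because a vertex set is independent exactly when its complement covers every edge. The function names and the brute-force checkers are assumptions added only so the equivalence can be tested on a small graph.

```python
from itertools import combinations


def transform(vertices, edges, k):
    """f: INDEPENDENT SET instance (G, k) -> VERTEX COVER instance (G, |V| - k)."""
    return vertices, edges, len(vertices) - k


def has_independent_set(vertices, edges, k):
    """Brute force: is there a set of k vertices containing no edge?"""
    return any(
        not any(u in s and v in s for u, v in edges)
        for s in map(set, combinations(vertices, k))
    )


def has_vertex_cover(vertices, edges, k):
    """Brute force: is there a set of at most k vertices touching every edge?"""
    return any(
        all(u in s or v in s for u, v in edges)
        for size in range(k + 1)
        for s in map(set, combinations(vertices, size))
    )


# A 5-cycle: its largest independent set has 2 vertices.
V = [0, 1, 2, 3, 4]
E = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
for k in range(len(V) + 1):
    answer_a = has_independent_set(V, E, k)
    answer_b = has_vertex_cover(*transform(V, E, k))
    print(k, answer_a, answer_b)
```

Only `transform` needs to run in polynomial time; the two checkers are exponential and exist purely to confirm that the yes/no answers agree.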

[Diagram: the classes FP, FNP, NP, and P, with two regions marked: problems complete for NP under Turing reductions, and problems complete for NP under polynomial transformations]

Cook did not provide a name for his class, but subsequently several others did:

• “Polynomial Complete problems” [Karp, 1971]

• “Polynomially Complete problems” [certain grammarians].

• “Universal sequential search problem” [Levin, 1973]

– although this referred to a larger class consisting of all search problems to which SAT could be reduced by polynomial-time Turing reductions, and which, in turn, could be reduced to SAT.

• “P-complete problems” [Sahni, 1974]

– which referred to essentially the same class as Levin’s.

[Diagram labels: Problems complete for NP under polynomial transformations; Problems equivalent to SAT under polynomial-time Turing reductions]

Today’s terminology is in large part a result of the efforts of Don Knuth.


Stanford, 1975

In 1974, primarily because he wanted a better term than “at least as hard as the polynomial complete problems” to use in Volume 4 of The Art of Computer Programming, Knuth privately circulated a poll, reporting on the results in “A terminological proposal,” SIGACT News, 6:1 (1974), 12-18.

In the letter that contained the poll, Knuth rejected the term “P-hard,” which was Sahni’s proposal, for semantic reasons: if P ≠ NP, the problems would be much harder than P. Instead, he asked correspondents to rank his proposals of

(a) Herculean

(b) formidable

(c) arduous


[Diagram labels: Problems to which SAT reduces by polynomial-time Turing reductions; NP-Complete Problems; Levin’s Class]

Result: No winner.

Fortunately, Knuth had asked for write-in votes.

Some of the more entertaining:

• The “hard-boiled” problems

- (in honor of Cook - Ken Steiglitz)

• The “hard-assed” problems

- (“Hard as Satisfiability” - Albert Meyer)

• The “PET” problems (Shen Lin)

- They were his favorite problems

- Currently they were “Probably Exponential Time”

- If it turned out that P ≠ NP, they would be “Provably Exponential Time”

- And if it turned out that P = NP, they would be “Previously Exponential Time”


The write-in winners (proposed by a variety of people, in particular those of us at Bell Labs):

• “NP-complete” for the class of problems complete for NP

• “NP-hard” for problems to which SAT reduces via polynomial-time Turing reductions.

Advantages:

• Suggests to the reader that this is a well-defined technical concept.

• Consistent with standard language and recursion theory terminology.

• Relatively easy to explain (although newspapers and others regularly screw it up…)

Disadvantages:

• It requires an explanation.

• Although in this case one can resort to Albert Meyer’s shorthand explanation “a class of problems notorious for their computational intractability.”


A still-missing name:

NP-Hard Problems: Problems to which SAT reduces by polynomial-time Turing reductions

NP-Complete Problems: Problems complete for NP under polynomial transformations

NP-Equivalent Problems? Levin’s Class: Problems equivalent to SAT under polynomial-time Turing reductions

Open Problems about Problems

Appendix A13: Open Problems

[1] GRAPH ISOMORPHISM
[2] SUBGRAPH HOMEOMORPHISM (FOR A FIXED GRAPH H)
[3] GRAPH GENUS
[4] CHORDAL GRAPH COMPLETION
[5] CHROMATIC INDEX
[6] SPANNING TREE PARITY PROBLEM
[7] PARTIAL ORDER DIMENSION
[8] PRECEDENCE CONSTRAINED THREE-PROCESSOR SCHEDULING
[9] LINEAR PROGRAMMING
[10] TOTAL UNIMODULARITY
[11] COMPOSITE NUMBER
[12] MINIMUM LENGTH TRIANGULATION


Green = Polynomial-Time Solvable Red = NP-Complete

CHORDAL GRAPH COMPLETION

INSTANCE: Graph G = (V,E) and a positive integer K.

QUESTION: Is there a superset E’ containing E of unordered pairs of vertices from V that satisfies |E’-E| ≤ K and such that G’ = (V,E’) is chordal, i.e., such that for every simple cycle of more than 3 vertices in G’, there is some edge in E’ that is not involved in the cycle but joins two vertices in the cycle?

Comment: Corresponds to the problem of minimizing “fill-in” when applying Gaussian elimination to a symmetric matrix.

• NP-COMPLETE: M. Yannakakis, "Computing the Minimum Fill-In is NP-Complete," SIAM Journal on Algebraic and Discrete Methods, 2 (1981), 77-79.


COMPOSITE NUMBER

INSTANCE: A positive integer N, written in binary notation.

QUESTION: Is N composite, that is, do there exist integers A and B, 1 < A,B < N, such that N = AB?

Comment: Known to be in NP ∩ co-NP. (Note that “polynomial time” in this case means polynomial in log(N).)

• Polynomial-Time Solvable: M. Agrawal, N. Kayal, and N. Saxena, “PRIMES is in P,” Annals of Mathematics 160 (2004), 781-793.

• The initially announced running time bound was

• Current best algorithm [Pomerance & Lenstra, 2005] is

The Other Closed Open Problems from G&J

Problem | Answer | Reference

SUBGRAPH HOMEOMORPHISM (FOR A FIXED GRAPH H) | P | Robertson & Seymour [1983++]

GRAPH GENUS | NPC | Thomassen [1989] (in P for fixed K: Filotti, Miller, & Reif [1979])

CHROMATIC INDEX | NPC | Holyer [1981]

SPANNING TREE PARITY PROBLEM | P | Lovász [1980]

PARTIAL ORDER DIMENSION | NPC | Yannakakis [1982]

LINEAR PROGRAMMING | P | Khachiyan [1979]

TOTAL UNIMODULARITY | P | Seymour [1980]

COMPOSITE NUMBER | P | Agrawal, Kayal, & Saxena [2004]

MINIMUM WEIGHT TRIANGULATION | NPC | Mulzer and Rote [2008]

MINIMUM WEIGHT TRIANGULATION

INSTANCE: Collection C = {(a_i,b_i): 1 ≤ i ≤ n} of pairs of integers, giving the coordinates of n points in the plane, and a positive integer B.

QUESTION: Is there a triangulation of the set of points represented by C that has total “discrete Euclidean” length of B or less? Here a triangulation is a collection of non-intersecting line segments, each joining two points of C, that divides the interior of the convex hull into triangular regions. The discrete Euclidean length of a line segment joining (a_i,b_i) to (a_j,b_j) is the ceiling of ((a_i-a_j)^2 + (b_i-b_j)^2)^(1/2).
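The “discrete Euclidean” length can be computed exactly in integer arithmetic, which is presumably the point of taking the ceiling: it avoids comparing sums of irrational square roots. A small sketch follows; the function name is an assumption made for illustration.

```python
from math import isqrt


def discrete_euclidean_length(p, q):
    """Ceiling of the Euclidean distance between integer points p and q."""
    squared = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    root = isqrt(squared)  # floor of the exact square root
    return root if root * root == squared else root + 1


print(discrete_euclidean_length((0, 0), (3, 4)))
print(discrete_euclidean_length((0, 0), (1, 1)))
```

The total length of a candidate triangulation is then the sum of this value over its segments, which is compared against B.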


Is this the right metric?

In most applications the important factor is how well-behaved the triangulation is, not how short. In polynomial time one can find triangulations that

(a) Maximize the minimum internal angle over all triangles
(b) Minimize the maximum angle
(c) Minimize the maximum aspect ratio
(d) Minimize the maximum edge length
(e) Minimize the maximum triangle height

Closed “Open Problems” from the Columns

“The NP-Completeness Column: An Ongoing Guide”

J. Algorithms, 1981 to 1992: 23 columns

ACM Trans. Algorithms, 2005 to present: 3 columns

IMPERFECT GRAPH (Column 1, 1981)

INSTANCE: Graph G = (V,E).

QUESTION: Is G not a perfect graph, that is, is there a subset V’ ⊆ V such that the subgraph of G induced by V’ has chromatic number larger than the size of its maximum clique?

Comment: Membership in NP is non-obvious and requires an application of the Ellipsoid method (Grötschel, Lovász, & Schrijver [1981]).

Now known to be in P, as a consequence of the Strong Perfect Graph Theorem (Chudnovsky, Robertson, Seymour & Thomas [2006]) and an algorithm for recognizing whether a graph or its complement contains an “odd hole” (Chudnovsky, Cornuéjols, Liu, Seymour, & Vušković [2005]).

EVEN COVER (Column 3, 1982)

INSTANCE: Collection C of subsets of a given finite set X, positive integer K.

QUESTION: Is there a nonempty subcollection C’ ⊆ C with |C’| ≤ K, such that each element of X is in an even number (possibly zero) of sets from C’?

Combinatorial restatement of the classic coding theory problem that asks whether the minimum weight of a non-zero codeword in a binary linear code is K or less. Proved NP-complete by A.Vardy [1997].

SHORTEST VECTOR IN A LATTICE (Column 18, 1986)

INSTANCE: Collection of vectors v_1, …, v_n, each a member of Q^n, integer B > 0.

QUESTION: Is there a nonzero vector a = (a_1, …, a_n) in Z^n such that if x = ∑_i a_i v_i, then the Euclidean length |x| = (∑_i x_i^2)^(1/2) is B or less?

“NP-complete” under randomized reductions (Ajtai [1998]). Hence, in RP if and only if NP ⊆ RP. The difficulty of this problem has potential cryptographic implications.

The NP-Completeness Column: An Ongoing Guide (16), J. Algorithms, 6 (1985), 434-451

Still Open After All These Years

GRAPH ISOMORPHISM (G&J, 1979)

INSTANCE: Two graphs G1 = (V1,E1) and G2 = (V2,E2).

QUESTION: Is there a one-to-one onto function f: V1 → V2 such that {u,v} ∈ E1 if and only if {f(u),f(v)} ∈ E2?


If Graph Isomorphism is NP-complete, then the Polynomial Hierarchy collapses.

PRECEDENCE CONSTRAINED 3-PROCESSOR SCHEDULING (G&J, 1979)

INSTANCE: Set T of unit length tasks, a partial order ≺ on T, and a deadline D.

QUESTION: Can T be scheduled on 3 processors so as to satisfy the precedence constraints given by ≺ and meet the overall deadline D? That is, is there a schedule σ: T → {0,1,…,D-1} such that t ≺ t’ implies σ(t) < σ(t’), and such that for each integer i, 0 ≤ i ≤ D-1, there are at most three tasks t ∈ T for which σ(t) = i?
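The three conditions on a schedule σ translate directly into a polynomial-time checker, which is why the problem is in NP. The sketch below is illustrative only: the function name, the representation of ≺ as a list of (earlier, later) pairs, and the six-task instance are all assumptions, not the example from the slide.

```python
from collections import Counter


def is_valid_schedule(tasks, precedes, sigma, deadline, processors=3):
    """Check a proposed schedule sigma mapping each task to a time slot."""
    # Every task must finish by the deadline: slots are 0, ..., deadline - 1.
    if any(not (0 <= sigma[t] < deadline) for t in tasks):
        return False
    # Precedence: t before t' requires sigma(t) < sigma(t').
    if any(sigma[a] >= sigma[b] for a, b in precedes):
        return False
    # Capacity: at most `processors` tasks share any time slot.
    load = Counter(sigma[t] for t in tasks)
    return all(count <= processors for count in load.values())


# Hypothetical instance: six unit-length tasks, pairs (a, b) meaning a precedes b.
TASKS = [1, 2, 3, 4, 5, 6]
PRECEDES = [(1, 4), (2, 4), (3, 5), (4, 6), (5, 6)]
SIGMA = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 2}
print(is_valid_schedule(TASKS, PRECEDES, SIGMA, deadline=3))
```

Because < on time slots is transitive, it is enough to list pairs that generate the partial order. Checking a schedule is easy; the open question is whether one can be found in polynomial time.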

[Figure: example precedence graph on tasks 1 through 9, with a corresponding schedule on processors P1, P2, and P3]


No news.

FACTORING (Column 26, 2007)

INSTANCE: Integers N and k, k < N.

QUESTION: Does N have a factor f, 2 ≤ f ≤ k?

Comment: This problem is in NP∩co-NP. The (unique) prime decomposition of N provides a short proof for both yes and no answers (and can itself be verified using the polynomial-time algorithm for primality testing).

If FACTORING is NP-complete, then the Polynomial Hierarchy collapses.

GRACEFUL GRAPH (Column 6, 1983)

INSTANCE: Graph G = (V,E), integer K.

QUESTION: Is there a one-to-one function g: V → {1,2,…,K} that yields distinct values of |g(u)-g(v)| for all {u,v} ∈ E?

Comment: If the answer is “yes” for K = |E|, the graph is called “graceful”.

Surprisingly, this actually has applications (from X-ray crystallography to missile control, at least according to Bloom and Golomb [1977, 1978]).

Kotzig-Ringel Conjecture: Every tree is graceful (verified for all trees with |V| ≤ 35).
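Checking a proposed labeling g is straightforward, as the sketch below shows. The label range is left as a parameter because conventions differ: the definition above uses {1,…,K}, while the usual convention for graceful labelings, assumed in the example here, draws labels from {0,…,|E|}. The function name and the path example are hypothetical.

```python
def is_valid_labeling(edges, g, low, high):
    """Check that g is one-to-one into {low..high} with distinct edge differences."""
    labels = list(g.values())
    if len(set(labels)) != len(labels):
        return False  # g is not one-to-one
    if any(not (low <= x <= high) for x in labels):
        return False  # a label falls outside the allowed range
    diffs = [abs(g[u] - g[v]) for u, v in edges]
    return len(set(diffs)) == len(diffs)


# A path on four vertices (a tree with 3 edges), labels taken from {0, ..., 3}.
PATH_EDGES = [("a", "b"), ("b", "c"), ("c", "d")]
LABELS = {"a": 0, "b": 3, "c": 1, "d": 2}
print(is_valid_labeling(PATH_EDGES, LABELS, 0, len(PATH_EDGES)))
```

In the example the edge differences are 3, 2, and 1, so every value from 1 to |E| appears exactly once.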

Biggest Remaining Open Problem

Does P = NP?

Solution is worth $1,000,000

Opinion Polls

• Bill Gasarch, “Guest Column: The P =? NP poll,” SIGACT News 33:2 (2002), 34-47.

• Bill Gasarch, “Guest Column: The second P =? NP poll,” SIGACT News 43:2 (2012), 53-57.

Year | Number of respondents | P≠NP | P=NP | Ind | Don’t Know | Don’t Care | Other

2002 | 100 | 61% | 9% | 4% | 22% | 1% | 3%

2012 | 151 | 83% | 9% | 3% | 1% | 3% | 1%

• 63% thought the problem would still be unresolved in 2100.

• Several thought it would take serious mathematics from other fields, as in Ketan Mulmuley’s proposed algebraic geometry approach.

Opinion Polls

Some of the comments:

– Richard Karp: I believe intuitively P≠NP but it is only an intuition.

– Boaz Barak: I am almost certain that P≠NP. I tend to agree with Scott Aaronson that, given all the evidence for it, in other fields such as physics, P≠NP would have already been declared a law of nature.

– Richard Lipton: P = NP will be proved on Dec 12, 2012.

Donald Knuth

Paraphrased from “Twenty Questions for Donald Knuth” (May, 2014)

http://www.informit.com/articles/article.aspx?p=2213858

I've come to believe that P = NP. My main point, however, is that I don't believe that the equality P = NP will turn out to be helpful even if it is proved, because such a proof will almost surely be nonconstructive. Although I think an O(n^M) algorithm for some fixed M probably exists, I also think human beings will never know the value of M. I even suspect that nobody will even know an upper bound on M.

Jack Edmonds

The P-versus-NP Page

http://www.win.tue.nl/~gwoegi/P-versus-NP.htm

Milestones

1 [Equal]: In 1986/87 Ted Swart (University of Guelph) wrote a number of papers (some of them had the title: "P=NP") that gave linear programming formulations of polynomial size for the Hamiltonian cycle problem. Since linear programming is polynomially solvable and Hamiltonian cycle is NP-hard, Swart deduced that P=NP. In 1988, Mihalis Yannakakis closed the discussion with his paper "Expressing combinatorial optimization problems by linear programs."

27 [Not equal]: In November 2005, Ron Cohen proved that P is not equal to NP. In addition, his paper shows that P is not equal to the intersection of NP and co-NP. Finally, the exact inclusion relationships between the classes P, NP and co-NP are discussed. The paper is available at http://www.arxiv.org/abs/cs.CC/0511085. The title of the paper is "Proving that P is not equal to NP and that P is not equal to the intersection of NP and co-NP".

96 [Unprovable]: In November 2012, Natalia L. Malinina put the paper "On the principal impossibility to prove P=NP" onto the arxive, at http://arxiv.org/abs/1211.3492. On page 19, she writes: "Summarizing all that was said, it can be concluded that such dividing of the graphs into three classes and the behavior of the complicated vertexes at the converting (they turn into the independent cycles) gives us the infallible fact that it is impossible to prove that P=NP."

106 [Not equal]: In June 2014, Samuel C. Hsieh showed that P is not equal to NP. The paper "A Lower Bound for Boolean Satisfiability on Turing Machines" establishes a lower bound for deciding the satisfiability of the conjunction of any two Boolean formulas from a set called a full representation of Boolean functions of n variables, a set containing a Boolean formula to represent each Boolean function of n variables. Corollary 2.1 on page 13 of the paper implies the well-known exponential time hypothesis. The paper is available at http://arxiv.org/abs/1406.5970.

53 for Equals, 45 for Not Equals, 1 for both, 3 for Unprovable/Undecidable, 1 for NP=coNP, 3 Others


Only one paper from the list has appeared in a refereed journal:

1 [Equal]: In 1986/87 Ted Swart (University of Guelph) wrote a number of papers (some of them had the title: "P=NP") that gave linear programming formulations of polynomial size for the Hamiltonian cycle problem. Since linear programming is polynomially solvable and Hamiltonian cycle is NP-hard, Swart deduced that P=NP. In 1988, Mihalis Yannakakis closed the discussion with his paper "Expressing combinatorial optimization problems by linear programs" (Proceedings of STOC 1988, pp. 223-228). Yannakakis proved that expressing the traveling salesman problem by a symmetric linear program (as in Swart's approach) requires exponential size. The journal version of this paper was published in Journal of Computer and System Sciences 43, 1991, pp. 441-466.

Yannakakis’s Theorem

• The TSP polytope cannot be expressed by a symmetric linear program with less than an exponential number of variables plus constraints.

• Recently extended to asymmetric LP’s by Fiorini, Massar, Pokutta, Tiwari, & de Wolf in “Linear vs. Semidefinite Extended Formulations: Exponential Separation and Strong Lower Bounds” (STOC 2012).

• And to Perfect Matchings by Thomas Rothvoss in “The matching polytope has exponential extension complexity” (STOC 2014).

My Favorite P = NP Proof

• Observation 1. The SATISFIABILITY problem is NP-complete for instances in conjunctive normal form.

(x or ¬y or z) and (u or y or ¬x) and …

• Observation 2. Here is a linear-time algorithm for solving instances of SATISFIABILITY in disjunctive normal form

(x and ¬y and z) or (u and y and ¬x) or …
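Observation 2 can be made concrete: a disjunctive-normal-form expression is satisfiable exactly when some term avoids containing both a variable and its negation. The sketch below is one way to write that test; the (variable, polarity) encoding and the function name are assumptions made for illustration.

```python
def dnf_satisfiable(terms):
    """Return True if some term contains no complementary pair of literals.

    terms: list of terms, each a list of (variable_name, is_positive) literals.
    Each literal is examined once, so the running time is linear in the
    length of the expression (using hashing for the per-term lookups).
    """
    for term in terms:
        required = {}
        consistent = True
        for var, positive in term:
            # Record the first polarity seen; a later mismatch is a conflict.
            if required.setdefault(var, positive) != positive:
                consistent = False
                break
        if consistent:
            return True
    return False


# (x and not y and z) or (u and y and not x)
EXAMPLE = [
    [("x", True), ("y", False), ("z", True)],
    [("u", True), ("y", True), ("x", False)],
]
print(dnf_satisfiable(EXAMPLE))
```

The catch in the “proof” lies between the two observations: rewriting a conjunctive-normal-form expression as an equivalent disjunctive-normal-form one can increase its length exponentially, so the linear-time test does not carry over.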

Standard Scheme for Proving P ≠ NP

1. Problem A is NP-hard.

2. Any algorithm that solves A must use technique X.

3. Any algorithm that uses technique X must take exponential time.

Major Open Problem: How to convince people who use this scheme that they have to actually prove (2).

Open Problems about Problems

• Garey and Johnson (1979):
– “Is problem X NP-hard?”

• Today:
– “How good can approximation algorithms be for problem X?”

The Lost Cartoon

Levels of Approximation

• Bounded Difference: A(I) ≤ OPT(I) + B
– Chromatic Index: B = 1

• Sublinear Difference (Asymptotic Optimality): A(I) ≤ OPT(I) + o(OPT(I))
– Bin Packing: Difference = O(log^2(OPT(I)))

• Fully Polynomial-Time Approximation Scheme (FPTAS): A(I) ≤ (1+ε)OPT(I) in time polynomial in the instance size and 1/ε
– Knapsack Problem

• Polynomial-Time Approximation Scheme (PTAS): A_ε(I) ≤ (1+ε)OPT(I) in polynomial time for any fixed ε
– Euclidean and Rectilinear TSP

Levels of Approximation II

• Bounded ratio: A(I) ≤ r·OPT(I), r > 1
– Metric TSP

• Ratio bounded by f(n) for a growing function f
– Asymmetric k-Center: f(n) = O(log*(n))
– Point Set Width: f(n) = O(sqrt(log n))
– Set Cover: f(n) = log n
– Group Steiner Tree on Trees: f(n) = O(log^2 n)
– Node Capacitated Unsplittable Flow: f(n) = sqrt(n) polylog n
– Chromatic Number: f(n) = O(n (log log n)^2 / log^3 n)

Question, Revised

What is the best type of approximation that can be obtained, assuming P ≠ NP?

Example: Chromatic Number

Theorem [Garey & Johnson, 1976].

Unless P = NP, there is no polynomial-time approximation algorithm for CHROMATIC NUMBER that, for all graphs G = (V,E), guarantees

A(G) < 2OPT(G).

Theorem [Lund & Yannakakis, 1993] … [Hastad, 1999] … [Zuckerman, 2006].

Unless P = NP, there is no polynomial-time approximation algorithm for CHROMATIC NUMBER that, for all graphs G = (V,E), guarantees

A(G) < |V|^(1-ε) OPT(G),

for any ε > 0.

CHROMATIC NUMBER (Optimization Version)

INSTANCE: Graph G = (V,E).

QUESTION: What is the minimum k such that there is a partition of V into k sets, with no edges in E having both endpoints in the same set?

Proof Technique: Gap Transformation

Given an instance I of an NP-hard problem A, construct an instance I’ of your problem such that

1. If the answer to I is “yes”, then OPT(I’) = k,

2. If the answer to I is “no”, then OPT(I’) ≥ f(|I’|)k.

Key Technique [Garey & Johnson, 1976]: Graph Products and Multicoloring

Key Technique [since 1991]: Probabilistically Checkable Proofs (PCP’s)

[Arora & Safra, 1992], [Arora, Lund, Motwani, Sudan, & Szegedy, 1992], …

Probabilistically Checkable Proofs

A PCP for a statement S of length n is a string L and a polynomial-time “verification algorithm” that, given S and a mechanism for accessing specific bits of L, computes probabilistically and in finite time reaches a conclusion about the truth of statement S, satisfying the following properties.

1. If S is true, the verifier always says “true.”

2. If S is false, the verifier says “true” with probability ≤ 1/4.

Definition: PCP(f,g) is the set of decision problems with proof lengths and verifier running times polynomially bounded in n, and for which the verifier only uses O(f(n)) random bits and looks at only O(g(n)) bits of the proof.

Theorem [ALMSS, 1991]. PCP(log,1) = NP.

A wide variety of “inapproximability” results can be proved by using this result and various strengthenings of it, together with their implicit gap theorems.

Example: SET COVER

Theorem [Johnson, 1974], [Lovasz, 1975]: The Greedy algorithm guarantees a solution within a factor ln(n) of OPT.

Theorem [Lund & Yannakakis, 1993]. No poly-time algorithm can guarantee A(I)/OPT(I) < (1/4)log(n) unless NP ⊆ DTIME(n^polylog(n)).

Theorem [Feige, 1998]. No poly-time algorithm can guarantee A(I)/OPT(I) < (1-o(1))ln(n) unless NP ⊆ DTIME(n^(log log n)).

Theorem [Alon, Moshkovitz, & Safra, 2006]. No poly-time algorithm can guarantee A(I)/OPT(I) < 0.2267ln(n) unless P = NP.

SET COVER (Optimization Version)

INSTANCE: Set X with n elements and a collection C of subsets S ⊂ X such that ∪_{S∈C} S = X.

QUESTION: What is the minimum k such that there is a subcollection C’ ⊆ C with |C’| = k and ∪_{S∈C’} S = X?
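The Greedy algorithm in the first theorem is usually described as: repeatedly pick the subset that covers the most still-uncovered elements. A minimal sketch follows; the function name and the small instance are assumptions made for illustration, and ties are broken by taking the first best subset.

```python
def greedy_set_cover(universe, subsets):
    """Return indices of subsets chosen greedily until the universe is covered."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the subset covering the most elements that are still uncovered.
        best = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
        if not uncovered & subsets[best]:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


X = {1, 2, 3, 4, 5, 6}
C = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
print(greedy_set_cover(X, C))
```

On this instance greedy happens to find a minimum cover; the hardness theorems above say that in the worst case no polynomial-time method can do substantially better than its ln(n) factor under the stated assumptions.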

Question (Revised again)

What is the best type of approximation that can be obtained, assuming plausible hypothesis H?

Problem (at least for Surveyors): Explosion of Hypotheses on which Hardness Results Depend

The Current BIG Open Problem in the Hardness of Approximation

The “Unique Games” Conjecture

[Figure: example LABEL COVER instance on vertices u, v, w, and x]

LABEL COVER

INSTANCE: Bipartite graph G = (V1, V2, E), positive integers N and M, a map πu,v : {1,2,...,M} → {1,2,...,N} for each edge {u,v} ∈ E where u ∈ V1 and v ∈ V2, and a bound B ≤ |E|.

QUESTION: Are there label assignments L1 and L2, where L1 : V1 → {1,2,...,M} and L2 : V2 → {1,2,...,N} such that at least B edges are covered, where an edge {u,v} ∈ E is covered by a labeling if πu,v(L1(u)) = L2(v)?

Example: M = 3, N = 2

πu,v = (1,1), (2,1), (3,1)

πw,v = (1,2), (2,1), (3,2)

πw,x = (1,1), (2,1), (3,2)
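The example can be evaluated mechanically. The sketch below reads each pair (a, b) as π(a) = b and takes u, w ∈ V1 and v, x ∈ V2, which is what the subscripts suggest; both readings are assumptions, as are the function names. It counts the edges covered by a labeling and, by brute force, the best achievable count.

```python
from itertools import product

M, N = 3, 2
V1, V2 = ["u", "w"], ["v", "x"]
# One map per edge; each pair (a, b) from the example is read as pi(a) = b.
PI = {
    ("u", "v"): {1: 1, 2: 1, 3: 1},
    ("w", "v"): {1: 2, 2: 1, 3: 2},
    ("w", "x"): {1: 1, 2: 1, 3: 2},
}


def covered_edges(l1, l2):
    """Number of edges {a, b} with pi_{a,b}(L1(a)) == L2(b)."""
    return sum(1 for (a, b), pi in PI.items() if pi[l1[a]] == l2[b])


def max_covered():
    """Brute force over all labelings (exponential; fine for a tiny example)."""
    best = 0
    for labels1 in product(range(1, M + 1), repeat=len(V1)):
        for labels2 in product(range(1, N + 1), repeat=len(V2)):
            l1 = dict(zip(V1, labels1))
            l2 = dict(zip(V2, labels2))
            best = max(best, covered_edges(l1, l2))
    return best


print(covered_edges({"u": 1, "w": 2}, {"v": 1, "x": 1}), max_covered())
```

Under this reading, labeling w with 2 and both v and x with 1 covers every edge, so the answer is “yes” for any B ≤ |E|.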

Theorem [Lund & Yannakakis, 1993]: For any ε there is a constant kε such that it is NP-hard to distinguish between the case where there is a labeling that covers all edges in E and the case where no labeling covers more than ε|E| edges.

UNIQUE GAMES CONJECTURE (UGC) [Khot, 2002]: For any ε, δ ∈ (0, 1/2) there is a constant k = kε,δ such that, for LABEL COVER instances with M = N = k in which all the maps πu,v are permutations, it is NP-hard to distinguish between the case where there is a labeling that covers at least (1 − δ)|E| edges and the case where no labeling covers more than ε|E| edges.

If the Unique Games Conjecture is true

For no ε > 0 can any polynomial-time algorithm guarantee a solution to VERTEX COVER that is less than (2-ε)OPT unless P = NP [Khot & Regev, 2003].

Note:

• Just taking both endpoints of every edge in a maximal matching guarantees a solution that is no more than 2OPT.

• Assuming P ≠ NP but not the Unique Games Conjecture, the best lower bound we have is (1.3606…-ε)OPT [Dinur & Safra, 2005].
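The maximal-matching guarantee can be sketched in a few lines: scan the edges, and whenever an edge has neither endpoint selected yet, add both endpoints. The edges that trigger an addition form a maximal matching, and any cover must contain at least one endpoint of each of them, which gives the factor of 2. The function name and the path example are assumptions made for illustration.

```python
def matching_vertex_cover(edges):
    """Vertex cover made of both endpoints of a greedily built maximal matching."""
    cover = set()
    for u, v in edges:
        # If neither endpoint is covered, this edge joins the matching.
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# A path a-b-c-d: the optimal cover {b, c} has size 2.
EDGES = [("a", "b"), ("b", "c"), ("c", "d")]
cover = matching_vertex_cover(EDGES)
print(sorted(cover))
```

On this path the scan order makes the method return all four vertices, exactly twice the optimum, so the factor of 2 is tight for this simple approach.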

More UGC Results

Problem | Best Guarantee | Bound with UGC & P≠NP | Bound with P≠NP only

Vertex Cover | 2.0 OPT | (2-ε)OPT [Khot & Regev, 2003] | (1.3606…-ε)OPT [Dinur & Safra, 2005]

Max2-SAT | (0.940…)OPT [Feige & Goemans, 1995] | (0.940…+ε)OPT [Khot et al., 2007] | (0.954…+ε)OPT [Håstad, 1999]

MaxCut | (0.878…)OPT [Goemans & Williamson, 1995] | (0.878…+ε)OPT [Khot et al., 2007] | (0.941…+ε)OPT [Håstad, 1999]

Theorem [Raghavendra, 2008]. For any constraint satisfaction problem (CSP) whose objective is to satisfy the maximum possible number of constraints, there is a simple approximation algorithm based on semi-definite programming which provides the best possible guarantee, assuming UGC and P≠NP.

My Favorite Open Approximation Questions

Bin Packing:

Is there a constant B such that, given P ≠ NP (or some other plausible assumption) no polynomial-time bin packing algorithm can guarantee

A(I) ≤ OPT(I) + B?

Note: The algorithm of [Karmarkar & Karp, 1982] guarantees A(I) ≤ OPT(I) + O(log^2(OPT(I))).

This was improved last year by Thomas Rothvoss [FOCS 2013] to A(I) ≤ OPT(I) + O(log(OPT(I)) loglog(OPT(I))).

Traveling Salesman Problem:

Can the metric TSP be approximated to within a constant smaller than 3/2?

Note: Christofides’ Algorithm guarantees A(I) ≤ (3/2)OPT(I).

Best current lower bound (assuming P≠NP) is that no polynomial-time algorithm can guarantee A(I) ≤ (220/219)OPT(I) [Papadimitriou & Vempala, 2006].

Asymmetric TSP:

Can the asymmetric TSP with directed triangle inequality be approximated to within a constant factor?

Note: The classic “Iterated Matching” algorithm of [Frieze, Galbiati, & Maffioli, 1982] guarantees a solution whose length is no more than (log N)·OPT.

No polynomial-time algorithm with a better guarantee was known until recently, when [Asadpour, Goemans, Madry, Gharan, & Saberi, 2010] presented one that provided a guarantee of O(log N / loglog N)·OPT.

Neither algorithm provides tours that are competitive in practice.

Final Open Question

Is any of this relevant in the “real world”?

Remember: NP-completeness is a concept that is both

1. Worst-Case, and

2. Asymptotic.

Successes of SAT Solvers

• There are many applications of SATISFIABILITY in fields such as model checking for software verification, etc.

• It is claimed that such real-world instances, even ones of truly huge size, offer no challenge to modern day SAT codes, which run in near linear time.

• Does this mean that SAT is efficiently solvable for all practical purposes?

• No – it just means that current applications seem to be yielding easy instances (and note that there may still be some holdouts).

• Also, just try those solvers on a transformed version of a large CLIQUE problem, and see how well they do…

Concorde & the TSP

• “Branch-and-Cut” approach exploiting linear programming to determine lower bounds on optimal tour length, developed by David Applegate, Bob Bixby, Vasek Chvatal, and Bill Cook.

• Based on 30+ years of theoretical developments in the “Mathematical Programming” community, plus some very good data structures and heuristics work from computer science.

• For surprisingly large instances (2000 and more cities), it finds an optimal tour and proves its optimality in reasonable running times.

• Executables and source code can be downloaded from http://www.math.uwaterloo.ca/tsp/concorde/index.html

N = 1000

Running times (in seconds) for 10,000 Concorde runs on random 1000-city planar Euclidean instances (2.66 Ghz Intel Xeon processor in dual-processor PC, purchased late 2002).

Range: 7.1 seconds to 38.3 hours


Concorde Asymptotics [Hoos and Stützle, 2009 draft]

• Estimated median running time for random Euclidean instances.

• Based on
– 1000 samples each for N = 500, 600, …, 2000
– 100 samples each for N = 2500, 3000, 3500, 4000, 4500
– 2.4 GHz AMD Opteron 2216 processors with 1 MB L2 cache and 4 GB main memory, running Cluster Rocks Linux v4.2.1.

0.21 · 1.24194^(√N)

Actual median for N = 2000: ~57 minutes, for N = 4,500: ~96 hours
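The fitted estimate can be evaluated directly to see how it compares with the measured medians. The sketch below assumes the formula is 0.21 · 1.24194^(√N) with the result in seconds; the unit is not stated on the slide and is inferred here because it brings the N = 2000 estimate close to the reported median. The function name is hypothetical.

```python
from math import sqrt


def estimated_median_seconds(n, a=0.21, b=1.24194):
    """Evaluate a * b**sqrt(n); seconds are an assumed unit, not given in the source."""
    return a * b ** sqrt(n)


for n in (1000, 2000, 4500):
    s = estimated_median_seconds(n)
    print(f"N = {n}: about {s / 60:.0f} minutes ({s / 3600:.1f} hours)")
```

Growth of the form b^(√N) is subexponential in N but still superpolynomial, which fits the talk's point that good practical behavior on moderate instances does not contradict worst-case hardness.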

Conclusion

Proving those NP-completeness results is still worthwhile, although you can still hope for

• special cases that avoid them.

• exponential time algorithms that are still fast enough to handle the instances you care about.

• fast algorithms that get good-enough results in practice (even if not in theory).

Photo Credits

• Alan Cobham: Jeff Shallit

• Jack Edmonds, 1957: Jeff Edmonds

• Jack Edmonds’ Boulder: Bill Cook

• All Others: David Johnson
