Probabilistically Checkable Proofs and Hardness of Approximation


1

S.Safra

some slides borrowed from Dana Moshkovits

2

The Crazy Tea Party

Problem: To seat all guests at a round table, so that people who sit in adjacent seats like each other.

[Figure: the "likes" graph on the guests John, Mary, Bob, Jane and Alice]

3

Solution for the Example

Problem: To seat all guests at a round table, so that people who sit in adjacent seats like each other.

[Figure: a seating of Alice, Bob, Jane, John and Mary around the table in which every adjacent pair likes each other]

4

Naive Algorithm

• For each ordering of the guests around the table:
  – Verify that each guest likes the guest sitting in the next seat.
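The naive algorithm above fits in a few lines of Python; the guest names and the "likes" pairs below are illustrative example data, not taken from the slides' figure.

```python
from itertools import permutations

def find_seating(guests, likes):
    """Try every ordering around the table (the naive algorithm)."""
    for order in permutations(guests):
        n = len(order)
        # check every adjacent pair, including the wrap-around pair
        if all(frozenset((order[i], order[(i + 1) % n])) in likes
               for i in range(n)):
            return order
    return None

guests = ["John", "Mary", "Bob", "Jane", "Alice"]
likes = {frozenset(p) for p in [("John", "Mary"), ("Mary", "Alice"),
                                ("Alice", "Bob"), ("Bob", "Jane"),
                                ("Jane", "John")]}
print(find_seating(guests, likes))
```

Since the table is round, fixing the first guest would save a factor of n; as written the loop may inspect all n! orderings.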

5

How Much Time Should This Take? (worst case)

guests (n)   steps ((n-1)!)
5            24
15           87,178,291,200
100          ≈9·10^155

Say our computer is capable of 10^10 instructions per second; this will still take about 3·10^138 years!
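The table's numbers can be checked directly; the 10^10 instructions-per-second rate is the slide's assumption.

```python
import math

# Steps of the naive algorithm: (n-1)! orderings to inspect
for n in (5, 15, 100):
    print(n, math.factorial(n - 1))

# At 10**10 instructions per second, n = 100 is hopeless
seconds_per_year = 60 * 60 * 24 * 365
years = math.factorial(99) / 10**10 / seconds_per_year
print(f"about {years:.0e} years")
```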

6

Tours

Problem: Plan a trip that visits every site exactly once.

7

Solution for the Example

Problem: Plan a trip that visits every site exactly once.

10

Is a Problem Tractable?

• YES! And here's an efficient algorithm for it
• NO! And I can prove it

…and what if neither is the case?

12

Growth Rate: Sketch

[Figure: time as a function of input length for 10n, n², 2ⁿ and n! = 2^O(n lg n)]

13

The World According to Complexity

reasonable: polynomial, n^O(1)
unreasonable: exponential, 2^(n^O(1))

14

Could One be Fundamentally Harder than the Other?

Tour  ?  Seating

15

Relations Between Problems

If, assuming an efficient procedure for problem A, there is an efficient procedure for problem B, then B cannot be radically harder than A.

16

Reductions

B ≤p A

B cannot be radically harder than A.
In other words: A is at least as hard as B.

17

Which One is Harder?

Tour  ?  Seating

18

Reduce Tour to Seating

First Observation: The problems aren't so different.

site ↔ guest
"directly reachable from…" ↔ "liked by…"

19

Reduce Tour to Seating

Second Observation: Completing the circle

• Let's invite to our party a very popular guest,
• i.e., one who can sit next to everybody else.

20

Reduce Tour to Seating

• If there is a tour, there is also a way to seat all the imagined guests around the table.

[Figure: the tour path, with the popular guest closing it into a cycle]

21

Reduce Tour to Seating

• If there is a seating, we can easily find a tour path (no tour, no seating).

[Figure: cutting the seating at the popular guest yields a tour path]
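The whole reduction of the last few slides can be written out directly; the site names below are illustrative, not from the slides.

```python
def tour_to_seating(sites, reachable):
    """Reduce the tour problem to the seating problem.

    Sites become guests, "directly reachable from" becomes "liked by",
    and one popular guest, liked by everybody, closes the path into
    the cycle a round table requires."""
    popular = "popular guest"
    guests = list(sites) + [popular]
    likes = set(reachable) | {frozenset((popular, s)) for s in sites}
    return guests, likes

sites = ["museum", "harbor", "castle"]
reachable = {frozenset(("museum", "harbor")),
             frozenset(("harbor", "castle"))}
guests, likes = tour_to_seating(sites, reachable)
print(guests)
print(len(likes))
```

Any seating of the new party visits the popular guest once, and removing that guest from the cycle leaves a tour path.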

22

Bottom Line

The seating problem is at least as hard as the tour problem

23

What have we shown?

• Although we couldn't come up with an efficient algorithm for the problems,
• nor prove they don't have one,
• we managed to show a very powerful claim regarding the relation between their hardness.

24

Furthermore

• Interestingly, we can also reduce the seating problem to the tour problem.

• Moreover, there is a whole class of problems which can be pairwise efficiently reduced to each other.

26

NPC

[Figure: the class NPC contains thousands of distinct problems, each reducible to all the others; exponential algorithms are known for them, efficient algorithms are not: ?]

27

How can Studying P vs NP Make You a Millionaire?

• This is the most fundamental open question of computer science.
• Resolving it would grant the solver a great honor
• … as well as substantial fortune… www.claymath.org/prizeproblems/pvsnp.htm
• Huge philosophical implications:
  – No need for human ingenuity!
  – No need for mathematicians!!!

28

Constraints Satisfaction

Def: Constraint Satisfaction Problem (CSP):
– Instance:
  • Constraints: A set of constraints Φ = { φ1, …, φl } over two sets of variables, X of range RX and Y of range RY
  • Determinate: each constraint determines the value of a variable y∈Y according to the value of some x∈X:
    φx→y : RX → RY, satisfied if φx→y(x) = y
  • Uniform: each x∈X appears in dX of Φ, and each y∈Y appears in dY of Φ, for some global dX and dY
– Optimize:
  • Define ε(Φ) = the maximum, over all assignments A: X → RX; Y → RY, of the fraction of satisfied constraints
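To make the definition concrete, here is a toy encoding of a determinate CSP and a brute-force computation of ε(Φ); the representation (a list of (x, y, pi) triples, where pi plays the role of φx→y) is my own, not the slides'.

```python
from itertools import product

def csp_value(X, Y, RX, RY, constraints):
    """epsilon(Phi): the maximum fraction of satisfied constraints.

    Each constraint is a triple (x, y, pi) with pi: RX -> RY; it is
    satisfied iff pi[A[x]] == A[y].  Brute force over all assignments,
    so only usable for toy instances."""
    best = 0.0
    for xs in product(RX, repeat=len(X)):
        for ys in product(RY, repeat=len(Y)):
            A = dict(zip(X, xs))
            A.update(zip(Y, ys))
            sat = sum(pi[A[x]] == A[y] for x, y, pi in constraints)
            best = max(best, sat / len(constraints))
    return best

# Two contradictory constraints on the same pair: at most half satisfiable
constraints = [("x1", "y1", {0: 0, 1: 1}),
               ("x1", "y1", {0: 1, 1: 0})]
print(csp_value(["x1"], ["y1"], [0, 1], [0, 1], constraints))  # 0.5
```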

29

Cook’s Characterization of NP

Thm: It is NP-hard to distinguish between ε(Φ) = 1 and ε(Φ) < 1.

For any language L in NP, testing membership in L can be reduced to…

CSP

31

Showing hardness

From now on, to show a problem NP-hard, we merely need to reduce CSP to it:

any NP problem
  can be reduced to… (Cook's Thm)
CSP
  can be reduced to…
new, hard problem

This will imply the new problem is NP-hard.

33

Max Independent-Set

Instance: A graph G=(V,E) and a threshold k.
Problem: To decide if there is a set of vertices I = {v1,…,vk} ⊆ V, s.t. for any u,v∈I: (u,v)∉E.

34

Max I.S. is NP-hard

Proof: We'll show CSP ≤p Max I.S.


35

The reduction: Co-Partite Graph

• G comprises k=|X| cliques of size |RX|: a vertex for each plausible assignment to x:

E ⊇ {(⟨i,j1⟩, ⟨i,j2⟩) | i∈[k], j1≠j2∈RX}

An edge also connects any two assignments that determine a different value for the same y.

36

Proof of Correctness

An I.S. of size k must contain exactly one vertex in every clique.

A satisfying assignment implies an I.S. of size k.
An I.S. of size k corresponds to a consistent, satisfying assignment.
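A sketch of the co-partite construction. The `conflicting` predicate (do assignments j1 to the i1-th variable and j2 to the i2-th determine different values for a shared y?) is left to the caller, since it depends on the CSP instance; it is a hypothetical interface, not the slides'.

```python
from itertools import combinations

def csp_to_independent_set(k, RX, conflicting):
    """Build the co-partite graph: k cliques, one vertex <i, j> per
    assignment j to the i-th x-variable; extra edges join assignments
    that determine different values for a shared y."""
    vertices = [(i, j) for i in range(k) for j in RX]
    edges = set()
    for i in range(k):                      # the k cliques
        for j1, j2 in combinations(RX, 2):
            edges.add(((i, j1), (i, j2)))
    for v1, v2 in combinations(vertices, 2):
        if v1[0] != v2[0] and conflicting(*v1, *v2):
            edges.add((v1, v2))
    return vertices, edges

# With no conflicts, the graph is just k disjoint cliques
vertices, edges = csp_to_independent_set(2, [0, 1], lambda *a: False)
print(len(vertices), len(edges))  # 4 2
```

An independent set of size k then picks one vertex per clique, i.e. one assignment per x-variable, and the conflict edges force those picks to be consistent.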

37

Generalized Tour Problem

• Add prices to the roads of the tour problem.
• Ask for the least costly tour.

[Figure: a road map with prices $8, $10, $12, $13, $13, $17, $19 and $3 on the roads]

38

Approximation

• How about approximating the optimal tour?
• I.e., finding a tour which costs, say, no more than twice as much as the least costly.

[Figure: the same priced road map]

39

Hardness of Approximation

40

Promise Problems

• Sometimes you can promise something about the input.
• It doesn't matter what you say for infeasible inputs.

"I know my graph has a clique of size n/4! Does it have a clique of size n/2?"

41

Promise Problems & Approximation

• We'll see that promise problems of a certain type, called gap problems, can be utilized to prove hardness of approximation.

46

Gap Problems (Max Version)

• Instance: …

• Problem: to distinguish between the following two cases:

The maximal solution ≥ B
The maximal solution ≤ A

51

Idea

• We've shown "standard" problems are NP-hard by reductions from CSP.
• We want to prove gap-problems are NP-hard.
• Why not prove some canonical gap-problem is NP-hard and reduce from it?
• If a reduction reduces one gap-problem to another we refer to it as gap-preserving.

52

Gap-CSP[ε]

Instance: Same as CSP.
Problem: to distinguish between the following two cases:
• There exists an assignment that satisfies all constraints.
• No assignment can satisfy more than an ε fraction of the constraints.

53

PCP (Without Proof)

Theorem [FGLSS, AS, ALMSS]: For any ε>0, Gap-CSP[ε] is NP-hard, as long as |RX|,|RY| ≥ ε^-O(1).

54

Why Is It Called PCP? (Probabilistically Checkable Proofs)

CSP has a polynomial-size membership proof, checkable in polynomial time.

[Figure: a CSP instance and an assignment to its variables x1,…,x8, …, yn-3,…,yn]

My formula is satisfiable!

Prove it!

This assignment satisfies it!

55

Why Is It Called PCP? (Probabilistically Checkable Proofs)

…Now our verifier has to check the assignment satisfies all constraints…

56

Why Is It Called PCP? (Probabilistically Checkable Proofs)

While for gap-CSP the verifier would be right with high probability, even by:

(1) picking at random a constant number of constraints, and
(2) checking only those.

In a NO instance of gap-CSP, a 1-ε fraction of the constraints are not satisfied!
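The sampling verifier is a one-liner; `samples` is the constant number of probed constraints. In a NO instance, where at least a 1-ε fraction of the constraints is violated, each probe survives with probability at most ε, so the verifier wrongly accepts with probability at most ε^samples.

```python
import random

def gap_verifier(constraints, assignment, samples=10):
    """Check `samples` randomly chosen constraints instead of all.

    `constraints` is a list of predicates on the assignment."""
    for _ in range(samples):
        phi = random.choice(constraints)
        if not phi(assignment):
            return False          # caught a violated constraint: reject
    return True                   # accept

satisfied = [lambda A: True] * 100
violated = [lambda A: False] * 100
print(gap_verifier(satisfied, {}), gap_verifier(violated, {}))  # True False
```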

57

Why Is It Called PCP? (Probabilistically Checkable Proofs)

• Since gap-CSP is NP-hard, all NP problems have probabilistically checkable proofs.

59

Hardness of Approximation

• Do the reductions we've seen also work for the gap versions (i.e., are they approximation-preserving)?
• We'll revisit the Max I.S. example.

60

The Same Max I.S. Reduction

An I.S. of size k must contain exactly one vertex in every part.

A satisfying assignment implies an I.S. of size k.
An I.S. of size k corresponds to a consistent assignment satisfying an ε fraction of Φ.

61

Corollary

Theorem: Independent-Set is NP-hard to approximate to within any constant factor.

62

Chromatic Number

• Instance: a graph G=(V,E).
• Problem: To minimize k, so that there exists a function f:V→{1,…,k} for which

(u,v)∈E ⇒ f(u)≠f(v)

63

Chromatic Number

Observation: Each color class is an independent set.

64

Clique Cover Number (CCN)

• Instance: a graph G=(V,E).
• Problem: To minimize k, so that there exists a function f:V→{1,…,k} for which

f(u)=f(v) ⇒ (u,v)∈E

65

Clique Cover Number (CCN)

66

Observation

Claim: The CCN problem on graph G is the CHROMATIC-NUMBER problem on the complement graph Gc.

67

Reduction Idea

[Figure: a clique-preserving reduction mapping a CLIQUE instance G with threshold m to a CCN instance G′ on q vertices, whose edge set is the same under cyclic shift]

68

Correctness

• Given such a transformation:
  – MAX-CLIQUE(G) = m ⇒ CCN(G′) = q
  – MAX-CLIQUE(G) < εm ⇒ CCN(G′) > q/ε

69

Transformation

T: V → [q]

For any v1,v2,v3,v4,v5,v6:

T(v1)+T(v2)+T(v3) ≡ T(v4)+T(v5)+T(v6) (mod q)
⇒ {v1,v2,v3} = {v4,v5,v6}

T is unique for triplets.

70

Observations

• Such a T is unique for pairs and for single vertices as well:
• If T(x)+T(u) ≡ T(v)+T(w) (mod q), then {x,u} = {v,w}
• If T(x) ≡ T(y) (mod q), then x = y
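The slides promise a polynomial-time construction of T; the greedy search below is only a brute-force illustration of the defining property (distinct triple sums). It searches over the integers and then sets q above three times the maximum value, so that integer sums and sums mod q coincide.

```python
from itertools import combinations_with_replacement

def triple_sums_unique(values, q):
    """Check T's property: equal triple sums (mod q) force equal triples."""
    seen = {}
    for triple in combinations_with_replacement(values, 3):
        s = sum(triple) % q
        if s in seen and seen[s] != triple:
            return False
        seen[s] = triple
    return True

def build_T(n):
    """Greedily pick n values with all triple sums distinct (brute force,
    only to illustrate the property; not the slides' construction)."""
    values = []
    candidate = 0
    while len(values) < n:
        if triple_sums_unique(values + [candidate], 10**9):
            values.append(candidate)
        candidate += 1
    q = 3 * max(values) + 1      # all triple sums are below q
    return values, q

values, q = build_T(6)
print(values, q)
```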

71

Using the Transformation

[Figure: vertices vi, vj of the CLIQUE graph mapped to positions T(vi)=1, T(vj)=4 among 0, 1, 2, 3, 4, …, (q-1) in the CCN graph]

72

Completing the CCN Graph Construction

(s,t) ∈ E_CLIQUE ⇒ (T(s),T(t)) ∈ E_CCN

73

Completing the CCN Graph Construction

Close the set of edges under shift:

For every (x,y)∈E, if x'-y' ≡ x-y (mod q), then (x',y')∈E.

74

Edge Origin Unique

First Observation: This edge comes only from (s,t).

75

Triangle Consistency

Second Observation: A triangle only comes from a triangle.

76

Clique Preservation

Corollary: {T(c1),…,T(ck)} is a clique in the CCN graph iff {c1,…,ck} is a clique in the CLIQUE graph.

77

What Remains?

• It remains to show how to construct the transformation T in polynomial time.

78

Corollaries

Theorem: CCN is NP-hard to approximate within any constant factor.

Theorem: CHROMATIC-NUMBER is NP-hard to approximate within any constant factor.

79

Max-E3-Lin-2

Def: Max-E3-Lin-2
– Instance: a system of linear equations L = { E1, …, En } over Z2, each equation over exactly 3 variables (whose sum is required to equal either 0 or 1)
– Problem: Compute ε(L), the maximal fraction of equations satisfiable by a single assignment
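A toy representation and brute-force computation of ε(L); equations are encoded as ((i, j, k), b), meaning x_i + x_j + x_k = b over Z2 (my encoding, not the slides').

```python
from itertools import product

def lin2_value(equations, n_vars):
    """epsilon(L): the maximum fraction of satisfied equations."""
    best = 0
    for assign in product((0, 1), repeat=n_vars):
        sat = sum((assign[i] ^ assign[j] ^ assign[k]) == b
                  for (i, j, k), b in equations)
        best = max(best, sat)
    return best / len(equations)

# A contradictory pair: no assignment satisfies both
equations = [((0, 1, 2), 0), ((0, 1, 2), 1)]
print(lin2_value(equations, 3))  # 0.5
```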

81

Main Theorem

Thm [Hastad]: gap-Max-E3-Lin-2(1-ε, ½+ε) is NP-hard.

That is, for every constant ε>0 it is NP-hard to distinguish between the case where 1-ε of the equations are satisfiable and the case where only ½+ε are.

[It is therefore NP-hard to approximate Max-E3-Lin-2 to within 2-ε, for any constant ε>0]

82

This bound is Tight!

• A random assignment satisfies half of the equations.

• Deciding whether a set of linear equations has a common solution is in P (Gaussian elimination).

83

Proof Outline

The proof proceeds with a reduction from gap-CSP[ε], known to be NP-hard for any constant ε>0.

Given such an instance Φ, the proof shows a poly-time construction of an instance L of Max-E3-Lin-2 s.t.:

ε(Φ) = 1 ⇒ ε(L) ≥ 1 - εL
ε(Φ) < ε ⇒ ε(L) ≤ ½ + εL

Main Idea: Replace every x and every y with a set of variables representing a binary code of their assigned values. Then test consistency within the encoding and of every φx→y, using linear equations over 3 bits.

85

Long-Code of R

• One bit for every subset of R

86

Long-Code of R

• One bit for every subset of R
• To encode an element e∈R, set the bit of a subset F to 1 iff e∈F

[Example from the figure, over five subsets: 0 0 1 1 1]
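A legal long-code word is a dictatorship on subsets. The sketch below encodes e and checks the linearity property f(F)+f(G) ≡ f(FΔG) that legal code words satisfy (for frozensets, `^` is symmetric difference).

```python
from itertools import combinations

def long_code(e, R):
    """Long-code word of e: one bit per subset F of R, f(F) = 1 iff e in F."""
    subsets = [frozenset(c) for r in range(len(R) + 1)
               for c in combinations(R, r)]
    return {F: int(e in F) for F in subsets}

word = long_code(2, range(3))
print(len(word))  # 2**3 = 8 bits

# A legal code word is linear: f(F) + f(G) = f(F symmetric-difference G) mod 2
assert all((word[F] + word[G]) % 2 == word[F ^ G]
           for F in word for G in word)
```

Note the blow-up: an element of R is encoded by 2^|R| bits, which is why the PCP theorem's |RX|, |RY| being constants matters.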

87

The Variables of L

Consider an instance Φ of CSP[ε], for a small constant ε (to be fixed later).

L has 2 types of variables:

1. a variable z[y,F] for every variable y∈Y and every subset F ∈ P[RY]
2. a variable z[x,F] for every variable x∈X and every subset F ∈ P[RX]

In fact we use a "folded" long-code, s.t. f(F) = 1 - f([n]\F)

88

Linearity of a Legal-Encoding

A Boolean function f: P[R] → Z2, if a legal long-code word, is a linear function; that is, for every F, G ∈ P[R]:

f(F) + f(G) ≡ f(FΔG)

where FΔG ∈ P[R] is the symmetric difference of F and G.

Unfortunately, any linear function (a sum of a subset of variables) will pass this test.

89

The Distribution

Def: denote by με the biased, product distribution over P[RX], which assigns probability to a subset H as follows: independently, for each a∈RX, let
– a∉H with probability 1-ε
– a∈H with probability ε

One should think of με as a multiset of subsets, in which every subset H appears with the appropriate probability.

90

The Linear Equations

L's linear equations are the union, over all φx→y, of the following set of equations:

for every F ∈ P[RY], G ∈ P[RX] and H ∈ με,
denote F* = φx→y^-1(F); then

z[y,F] + z[x,G] ≡ z[x, F* Δ G Δ H]

91

Correctness of Reduction

Prop: if ε(Φ) = 1 then ε(L) ≥ 1-ε.

Proof: let A be a satisfying assignment to Φ. Assign all of L's variables according to the legal encoding of A's values. A linear equation of L, corresponding to φx→y, F, G, H, would be unsatisfied exactly if A(x)∈H, which occurs with probability ε over the choice of H.

LLC-Lemma: ε(L) = ½+δ/2 ⇒ ε(Φ) > 4δ², where δ = 2ε(L)-1.

Note: independent of ε! (Later we use that fact to set ε small enough for our needs.)

92

Denoting an Assignment to L

Given an assignment AL to L's variables:

For any x∈X, denote by fx : P[RX] → {-1,1} the function comprising the values AL assigns to z[x,·] (corresponding to the long-code of the value assigned to x).

For any y∈Y, denote by fy : P[RY] → {-1,1} the function comprising the values AL assigns to z[y,·] (corresponding to the long-code of the value assigned to y).

(Here we replace 1 by -1 and 0 by 1.)

93

Distributional Assignments

Consider a CSP instance Φ. Let Δ(R) be the set of all distributions over R.

Def: A distributional-assignment to Φ is A: X → Δ(RX); Y → Δ(RY).

Denote by ε̃(Φ) the maximum, over distributional-assignments A, of the average probability for φ∈Φ to be satisfied, if variables' values are chosen according to A.

Clearly ε̃(Φ) ≥ ε(Φ). Moreover,

Prop: ε(Φ) ≥ ε̃(Φ)

94

The Distributional-Assignment A

Def: Let A be a distributional-assignment to Φ according to the following random processes:

• For any variable x∈X:
  – Choose a subset S⊆RX with probability f̂x(S)²
  – Uniformly choose a random a∈S
• For any variable y∈Y:
  – Choose a subset S⊆RY with probability f̂y(S)²
  – Uniformly choose a random b∈S

For such functions, the squares of the Fourier coefficients constitute a distribution.

95

What’s to do:

Show that AL's expected success on φx→y is > 4δ², in two steps:

First show that AL's success probability, for any φx→y, is at least

Σ_{S⊆RX} f̂y(odd(S))² · f̂x(S)² / |S|

Then show that value to be ≥ 4δ².

odd(φx→y(S)) = {b | #{a∈S | φx→y(a) = b} is odd}

96

Claim 1

Claim 1: AL's success probability, for any φx→y, is at least

Σ_{S⊆RX} f̂y(odd(S))² · f̂x(S)² / |S|

Proof: That success probability is

Σ_{Sy⊆RY, Sx⊆RX} f̂y(Sy)² · f̂x(Sx)² · Pr_{a∈Sx, b∈Sy}[φx→y(a) = b]

Now, taking the sum over only the cases in which Sy = odd(φx→y(Sx)) results in the claimed inequality.

100

High Success Probability

E_{F,G,H}[fy(F) · fx(G) · fx(F* Δ G Δ H)]

= Σ_{Sy⊆RY, Sx⊆RX, Sx'⊆RX} f̂y(Sy) f̂x(Sx) f̂x(Sx') · E[U_Sy(F) · U_Sx(G) · U_Sx'(F* Δ G Δ H)]

= Σ_{Sy, Sx, Sx'} f̂y(Sy) f̂x(Sx) f̂x(Sx') · E[U_Sy(F) U_Sx'(F*)] · E[U_Sx(G) U_Sx'(G)] · E[U_Sx'(H)]

= Σ_{Sx⊆RX} f̂y(odd(Sx)) · f̂x(Sx)² · (1-2ε)^{|Sx|}

102

Related Work

• Thm (Friedgut): a Boolean function f with small average-sensitivity is an [ε,j]-junta
• Thm (Bourgain): a Boolean function f with small high-frequency weight is an [ε,j]-junta
• Thm (Kindler&Safra): a Boolean function f with small high-frequency weight in a p-biased measure is an [ε,j]-junta
• Corollary: a Boolean function f with small noise-sensitivity is an [ε,j]-junta
• [Dinur, S] Showing Vertex-Cover hard to approximate to within 10√5 - 21
• Parameters: average-sensitivity [BL,KKL,F]; high-frequency weight [H,B]; noise-sensitivity [BKS]

103

Boolean Functions and Juntas

A Boolean function f: P([n]) → {T,F}, equivalently f: {-1,1}^n → {-1,1}

Def: f is a j-junta if there exists J⊆[n], where |J| ≤ j, s.t. for every x:

f(x) = f(x ∩ J)

• f is an (ε, j)-junta if ∃ a j-junta f' s.t.

Pr_x[f(x) ≠ f'(x)] ≤ ε

104

Motivation – Testing Long-code

• Def (a long-code test): given a code-word w, probe it in a constant number of entries, and
  – accept w.h.p. if w is a monotone dictatorship
  – reject w.h.p. if w is not close to any monotone dictatorship

105

Motivation – Testing Long-code

• Def (a long-code list-test): given a code-word w, probe it in a constant number of entries, and
  – accept w.h.p. if w is a monotone dictatorship,
  – reject w.h.p. if there is no junta J⊆[n] s.t. f is close to some f' with f'(F) = f'(F∩J) for all F

• Note: a long-code list-test distinguishes between the case where w is a dictatorship, and the case where w is far from a junta.

106

Motivation – Testing Long-code

• The long-code test and the long-code list-test are essential tools in proving hardness results (examples follow).

• Hence finding simple sufficient conditions for a function to be a junta is important.

107

Noise-Sensitivity

• Idea: check how the value of f changes when the input is changed not on one, but on several coordinates.

108

Noise-Sensitivity

• Def (ε,p,x-perturbation): Let 0<ε<1, and x∈P([n]). Then y ~ ε,p,x if y = (x\I) ∪ z, where
  – I ~ε [n] is a noise subset (each coordinate belongs to I independently with probability ε), and
  – z ~ μp over I is a replacement.

• Def (ε-noise-sensitivity): let 0<ε<1, then

ns_ε(f) = Pr_{x~μp, y~ε,p,x}[f(x) ≠ f(y)]

• Note: this deletes a coordinate of x w.p. ε(1-p), and adds a coordinate to x w.p. εp.

Hence, when p=½: equivalent to flipping each coordinate of x w.p. ε/2.

109

Noise-Sensitivity – Cont.

• Advantage: very efficiently testable (using only two queries) by a perturbation-test.

• Def (perturbation-test): choose x~μp and y~ε,p,x, and check whether f(x)=f(y). The success probability is proportional to the noise-sensitivity of f.

• Prop: the ε-noise-sensitivity is given by

2·ns_ε(f) = 1 - Σ_S (1-ε)^{|S|} f̂(S)²
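For p = ½ the definition can be checked by exact enumeration. For a dictatorship f(x) = x_0, the ε-noise-sensitivity comes out to ε/2, the probability that the single relevant coordinate flips (the double loop is 4^n work, so small n only).

```python
from itertools import product

def noise_sensitivity(f, n, eps):
    """Exact eps-noise-sensitivity at p = 1/2: x is uniform and each
    coordinate is flipped independently with probability eps/2."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        for flips in product((0, 1), repeat=n):
            pr = 1.0
            for b in flips:
                pr *= eps / 2 if b else 1 - eps / 2
            y = tuple(xi ^ b for xi, b in zip(x, flips))
            if f(x) != f(y):
                total += pr / 2 ** n
    return total

dictator = lambda x: x[0]
print(noise_sensitivity(dictator, 3, 0.2))  # ~0.1
```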

110

Related Work

• [Dinur, S] Showing Vertex-Cover hard to approximate to within 10√5 - 21

• [Bourgain] Showing that a Boolean function with weight < 1/k on characters of size larger than k is close to a junta of size exponential in k ([Kindler, S]: similar for a biased, product distribution)
