
The satisfiability threshold and clusters of solutions in the 3-SAT problem

Elitza Maneva
IBM Almaden Research Center


Page 2

3-SAT

Variables: x1, x2, …, xn take values {TRUE, FALSE}

Constraints: (x1 or x2 or not x3), (not x2 or x4 or not x6), …

$(x_1 \lor x_2 \lor \lnot x_3) \land (\lnot x_2 \lor x_4 \lor \lnot x_6) \land \dots$

[Figure: graph of the formula on variables x1, …, x8, with an example 0/1 assignment]
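
A minimal sketch of how such a formula can be represented and checked. The encoding is an assumption made for illustration, not something the slides specify: a literal is a nonzero integer, +i for x_i and -i for "not x_i", and a clause is a tuple of literals.

```python
from typing import Dict, List, Tuple

Clause = Tuple[int, ...]


def literal_is_true(lit: int, assignment: Dict[int, bool]) -> bool:
    """A positive literal is true when its variable is TRUE, a negative one when it is FALSE."""
    value = assignment[abs(lit)]
    return value if lit > 0 else not value


def satisfies(formula: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Every clause must contain at least one true literal."""
    return all(any(literal_is_true(lit, assignment) for lit in clause) for clause in formula)


# The two constraints written out on the slide.
formula = [(1, 2, -3), (-2, 4, -6)]
assignment = {1: True, 2: False, 3: True, 4: False, 5: False, 6: True}
print(satisfies(formula, assignment))
```

Here the first clause is satisfied by x1 and the second by "not x2", so the check succeeds.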

Page 3

Random 3-SAT

[Figure: random 3-SAT with n variables x1, …, xn and $m = \alpha n$ clauses. The axis of clause densities $\alpha$ is marked at 0, 1.63, 3.52, 3.95, 4.27 and 4.51, with "Satisfiable" and "Not satisfiable" regions, and is labelled with the algorithms PLR, Myopic, Random walk, Belief propagation and Survey propagation.]

Red = proved, green = unproved

Page 4

Rigorous bounds for random 3-SAT

1999: [Friedgut] there is a sharp threshold of satisfiability $\alpha_c(n)$

2002: [Kaporis, Kirousis, Lalas] and [Hajiaghayi, Sorkin]: lower bound 3.52

Page 5

Rigorous bounds for random 3-SAT

[Figure: axis of clause densities $\alpha$ marked at 0, 1.63, 3.52, 4.51, 5.19]

Pure Literal Rule Algorithm:
If any variable appears only positive or only negative, assign it 1 or 0 respectively
Simplify the formula by removing the satisfied clauses
Repeat

[Figure: worked example on a formula with four clauses over the variable triples (x1, x2, x3), (x2, x4, x5), (x1, x2, x4), (x3, x4, x5)]
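
The three steps of the Pure Literal Rule can be sketched as follows. This is an illustrative sketch, not the author's code: literals are assumed to be signed integers (+i for x_i, -i for its negation), and the signs in the example formula are hypothetical, since the negation bars of the slide's example are not recoverable.

```python
from typing import Dict, List, Tuple

Clause = Tuple[int, ...]


def pure_literal_rule(formula: List[Clause]) -> Tuple[Dict[int, bool], List[Clause]]:
    """Repeatedly assign pure literals and drop the clauses they satisfy.

    Returns (assignment, remaining_clauses). The rule succeeds when no
    clauses remain; otherwise it is stuck on the returned clauses.
    """
    clauses = list(formula)
    assignment: Dict[int, bool] = {}
    while True:
        literals = {lit for clause in clauses for lit in clause}
        # A literal is pure when its negation does not occur anywhere.
        pure = {lit for lit in literals if -lit not in literals}
        if not pure:
            return assignment, clauses
        for lit in pure:
            assignment[abs(lit)] = lit > 0
        # Simplify: remove every clause satisfied by a pure literal.
        clauses = [c for c in clauses if not pure.intersection(c)]


# Hypothetical signs on the variable triples of the slide's example.
example = [(1, -2, 3), (-2, 4, -5), (1, 2, -4), (3, -4, 5)]
assignment, remaining = pure_literal_rule(example)
print(assignment, remaining)
```

On this example x1 and x3 are pure in the first round, which leaves a single clause whose literals are then all pure, so no clauses remain.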

Page 6

Rigorous bounds for random 3-SAT

[Figure: axis of clause densities $\alpha$ marked at 0, 1.63, 3.52, 4.51, 5.19]

Myopic Algorithms:
Choose a variable according to # positive and negative occurrences
Assign the variable the more popular value
Simplify the formula by
1. removing the satisfied clauses
2. removing the FALSE literals
3. assigning variables in unit clauses
4. assigning pure variables
Repeat

Best rule: maximum |# positive occurrences – # negative occurrences|
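
A rough sketch of a myopic algorithm using the "best rule" above. It makes several simplifying assumptions that are not from the slides: literals are signed integers, ties are broken by the smallest variable index, there is no backtracking, and pure variables are not handled in a separate step (a pure variable simply receives its pure value whenever the rule selects it).

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Clause = Tuple[int, ...]


def simplify(clauses: List[Clause], lit: int) -> List[Clause]:
    """Set `lit` to TRUE: drop satisfied clauses and delete the FALSE literal -lit."""
    return [tuple(l for l in c if l != -lit) for c in clauses if lit not in c]


def myopic(formula: List[Clause]) -> Optional[Dict[int, bool]]:
    """Return a partial assignment satisfying the formula, or None on a contradiction."""
    clauses = [tuple(c) for c in formula]
    assignment: Dict[int, bool] = {}
    while clauses:
        if any(len(c) == 0 for c in clauses):
            return None  # an empty clause cannot be satisfied
        units = [c[0] for c in clauses if len(c) == 1]
        if units:
            lit = units[0]  # unit clauses force their literal
        else:
            counts = Counter(l for c in clauses for l in c)
            variables = sorted({abs(l) for l in counts})
            # Best rule: maximise |# positive occurrences - # negative occurrences|.
            var = max(variables, key=lambda v: abs(counts[v] - counts[-v]))
            lit = var if counts[var] >= counts[-var] else -var
        assignment[abs(lit)] = lit > 0
        clauses = simplify(clauses, lit)
    return assignment


print(myopic([(1, -2, 3), (-2, 4, -5), (1, 2, -4), (3, -4, 5)]))
```

Variables missing from the returned dictionary never had to be set and may take either value.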

Page 7

Rigorous bounds for random 3-SAT

[Figure: axis of clause densities $\alpha$ marked at 0, 1.63, 3.52, 4.51, 5.19]

$E[\#\text{solutions}] = 2^n \Pr[00\ldots0 \text{ is a solution}] = 2^n (1 - 1/8)^m = \bigl(2\,(7/8)^{\alpha}\bigr)^n$

For $\alpha > 5.191$, $E[\#\text{solutions}] \to 0$, so $\Pr[\text{satisfiable}] \to 0$
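
The 5.19 mark can be checked numerically. The sketch below only evaluates the first-moment formula above; the function and variable names are made up for illustration.

```python
import math


def expected_solutions_base(alpha: float) -> float:
    """Base b(alpha) with E[# solutions] = b(alpha) ** n when m = alpha * n."""
    return 2.0 * (7.0 / 8.0) ** alpha


# b(alpha) < 1 exactly when alpha > ln 2 / ln(8/7), so the expectation
# decays exponentially in n above this density.
first_moment_threshold = math.log(2.0) / math.log(8.0 / 7.0)
print(round(first_moment_threshold, 3))
```

Just below this density the expected number of solutions still grows exponentially, which is why the first moment alone cannot locate the true threshold.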

Page 8

Rigorous bounds for random 3-SAT

[Figure: axis of clause densities $\alpha$ marked at 0, 1.63, 3.52, 4.51, 4.67, 5.19]

$E[\#\text{positively prime solutions}] \to 0$

Positively prime solution: a solution in which no variable assigned 1 can be flipped to 0 without violating a clause.

Fact: If there exists a solution, there exists a positively prime solution.

Page 9

Rigorous bounds for random 3-SAT

[Figure: axis of clause densities $\alpha$ marked at 0, 1.63, 3.52, 4.51, 4.67, 5.19]

$E[\#\text{symmetrically prime solutions}] \to 0$

Page 10

Random 3-SAT

[Figure: the random 3-SAT roadmap again: n variables, $m = \alpha n$ clauses, clause densities $\alpha$ marked at 0, 1.63, 3.52, 3.95, 4.27 and 4.51, labelled with PLR, Myopic, Random walk, Belief propagation and Survey propagation.]

Red = proved, green = unproved

Page 11

Random Walk Algorithms

[Alekhnovich, Ben-Sasson `03]

Simple Random Walk:
Pick an unsatisfied clause
Pick a variable in the clause
Flip the variable

Theorem: Finds a solution in O(n) steps for $\alpha < 1.63$.

[Seitz, Alava, Orponen `05] [Ardelius, Aurell `06]

ASAT:
Pick an unsatisfied clause
Pick a variable in the clause
Flip it if the number of unsatisfied clauses does not increase; otherwise flip it only with prob. p

Experiment: Takes O(n) steps for $\alpha < 4.21$.
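
The simple random walk can be sketched in a few lines. The details below are assumptions made for illustration and are not taken from the cited papers: a uniformly random starting assignment, uniform choices of clause and variable, a fixed step budget, and signed-integer literals.

```python
import random
from typing import Dict, List, Optional, Tuple

Clause = Tuple[int, ...]


def random_walk_sat(formula: List[Clause], n: int,
                    max_steps: int = 100_000, seed: int = 0) -> Optional[Dict[int, bool]]:
    """Simple random walk: flip a random variable of a random unsatisfied clause."""
    rng = random.Random(seed)
    assignment = {v: rng.random() < 0.5 for v in range(1, n + 1)}

    def is_satisfied(clause: Clause) -> bool:
        return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

    for _ in range(max_steps):
        unsatisfied = [c for c in formula if not is_satisfied(c)]
        if not unsatisfied:
            return assignment
        clause = rng.choice(unsatisfied)   # pick an unsatisfied clause
        var = abs(rng.choice(clause))      # pick a variable in the clause
        assignment[var] = not assignment[var]  # flip the variable
    # The last flip may have produced a solution; check once more.
    return assignment if all(is_satisfied(c) for c in formula) else None


solution = random_walk_sat([(1, 2, -3), (-2, 4, -6)], n=6)
print(solution)
```

Recomputing the list of unsatisfied clauses at every step keeps the sketch short; a practical implementation would maintain it incrementally.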

Page 12

Random 3-SAT

[Figure: the random 3-SAT roadmap again: n variables, $m = \alpha n$ clauses, clause densities $\alpha$ marked at 0, 1.63, 3.52, 3.95, 4.27 and 4.51, labelled with PLR, Myopic, Random walk, Belief propagation and Survey propagation.]

Red = proved, green = unproved

Page 13

We can find solutions via inference

Suppose the formula is satisfiable.

Consider the uniform distribution over satisfying assignments:

$\Pr[x_1, x_2, \dots, x_n] \propto \phi(x_1, x_2, \dots, x_n)$

Simple Claim: If we can compute $\Pr[x_i = 1]$, then we can find a solution fast.

Decimation: Assign variables one by one to a value that has highest probability.
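
To illustrate the claim, here is a toy decimation loop in which the marginals are computed exactly by enumerating all assignments. This is an assumption-laden sketch: it takes exponential time and only shows why exact marginals never lead decimation into a dead end; the helper names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Clause = Tuple[int, ...]


def solutions(formula: List[Clause], n: int, fixed: Dict[int, int]) -> List[Tuple[int, ...]]:
    """All satisfying 0/1 assignments that agree with the already fixed variables."""
    found = []
    for bits in product((0, 1), repeat=n):
        if any(bits[v - 1] != val for v, val in fixed.items()):
            continue
        if all(any(bits[abs(l) - 1] == (1 if l > 0 else 0) for l in c) for c in formula):
            found.append(bits)
    return found


def decimate(formula: List[Clause], n: int) -> Optional[Dict[int, int]]:
    """Fix, one at a time, the (variable, value) pair with the highest marginal."""
    fixed: Dict[int, int] = {}
    while len(fixed) < n:
        sols = solutions(formula, n, fixed)
        if not sols:
            return None  # the formula is unsatisfiable
        candidates = [(v, val) for v in range(1, n + 1) if v not in fixed for val in (0, 1)]
        # Pr[x_v = val] is proportional to the number of solutions with x_v = val.
        v, val = max(candidates, key=lambda p: sum(1 for s in sols if s[p[0] - 1] == p[1]))
        fixed[v] = val
    return fixed


print(decimate([(1, 2, -3), (-2, 4, -6)], n=6))
```

Because the chosen value always has positive probability under the current conditional distribution, at least one solution survives every step.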

Page 14

Fact: We cannot hope to compute $\Pr[x_i = 1]$ exactly

Heuristics for guessing the best variable to assign:

1. Pure Literal Rule (PLR): Choose a variable that appears always positive / always negative.
2. Myopic Rule: Choose a variable based on number of positive and negative occurrences.
3. Belief Propagation: Estimate $\Pr[x_i = 1]$ by belief propagation and choose variable with largest estimated bias.

Page 15

Computing $\Pr[x_1 = 0]$ on a tree formula

[Figure: a tree formula rooted at x1. Pairs (#solutions with 0, #solutions with 1) are propagated from the leaves to the root: (1, 1) at the leaves, then (3, 4), (4, 3), (12, 12) and (36, 48), and finally (108, 192) at x1.]

Page 16

Vectors can be normalized

[Figure: the same tree with every pair normalized to sum to 1: (.5, .5), (.43, .57), (.57, .43), and (.36, .64) at the root x1.]

Page 17

Vectors can be normalized … and thought of as messages

[Figure: the tree rooted at x1 with the normalized vectors drawn as messages]

Page 18

What if the graph is not a tree?

Belief propagation

Page 19

Belief propagation

[Figure: factor graph on the variables x1, …, x11; one factor is labelled $\psi(x_1, x_2, x_3)$]

$\Pr[x_1, \dots, x_n] \propto \prod_a \psi_a(x_{N(a)})$

Page 20

Belief Propagation [Pearl ’88]

[Figure: bipartite graph with n variable nodes x1, …, x7 and m factor nodes]

Given: $\Pr[x_1 \dots x_7] \propto \psi_a(x_1, x_3)\, \psi_b(x_1, x_2)\, \psi_c(x_1, x_4) \cdots$, i.e. a Markov Random Field (MRF)

Goal: Compute $\Pr[x_1]$ (i.e. marginal)

Message passing rules:

$M_{i \to c}(x_i) = \prod_{b \in N(i) \setminus c} M_{b \to i}(x_i)$

$M_{c \to i}(x_i) = \sum_{x_j:\, j \in N(c) \setminus i} \psi_c(x_{N(c)}) \prod_{j \in N(c) \setminus i} M_{j \to c}(x_j)$

Estimated marginals:

$\mu_i(x_i) = \prod_{c \in N(i)} M_{c \to i}(x_i)$

Belief propagation is a dynamic programming algorithm.
It is exact only when the recurrence relation holds, i.e.:
1. if the graph is a tree.
2. if the graph behaves like a tree: large cycles
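
The message passing rules above can be sketched for the special case where each factor is the 0/1 indicator that a clause is satisfied. Everything beyond the three rules is an assumption made for illustration: signed-integer literals, uniform initial messages, in-place updates, a fixed number of sweeps, and clauses whose variables are distinct.

```python
from itertools import product
from typing import Dict, List, Tuple

Clause = Tuple[int, ...]


def normalise(p):
    total = p[0] + p[1]
    return (p[0] / total, p[1] / total)


def belief_propagation(formula: List[Clause], n: int, sweeps: int = 20) -> Dict[int, float]:
    """Sum-product BP; returns the estimated marginal Pr[x_i = 1] for i = 1..n."""
    clauses = [tuple(c) for c in formula]
    nbrs = {i: [a for a, c in enumerate(clauses) if i in map(abs, c)] for i in range(1, n + 1)}
    # Each message is a pair (weight of x_i = 0, weight of x_i = 1).
    var_to_cl = {(i, a): (0.5, 0.5) for i in nbrs for a in nbrs[i]}
    cl_to_var = {(a, i): (0.5, 0.5) for i in nbrs for a in nbrs[i]}
    for _ in range(sweeps):
        # Clause -> variable: sum over the other variables of psi_c times incoming messages.
        for (a, i) in cl_to_var:
            others = [abs(l) for l in clauses[a] if abs(l) != i]
            out = [0.0, 0.0]
            for xi in (0, 1):
                for xs in product((0, 1), repeat=len(others)):
                    values = dict(zip(others, xs))
                    values[i] = xi
                    if any(values[abs(l)] == (1 if l > 0 else 0) for l in clauses[a]):
                        weight = 1.0
                        for j, xj in zip(others, xs):
                            weight *= var_to_cl[(j, a)][xj]
                        out[xi] += weight
            cl_to_var[(a, i)] = normalise(out)
        # Variable -> clause: product of the messages from the other clauses.
        for (i, a) in var_to_cl:
            out = [1.0, 1.0]
            for b in nbrs[i]:
                if b != a:
                    out[0] *= cl_to_var[(b, i)][0]
                    out[1] *= cl_to_var[(b, i)][1]
            var_to_cl[(i, a)] = normalise(out)
    marginals = {}
    for i in nbrs:
        p = [1.0, 1.0]
        for a in nbrs[i]:
            p[0] *= cl_to_var[(a, i)][0]
            p[1] *= cl_to_var[(a, i)][1]
        marginals[i] = normalise(p)[1]
    return marginals


# Two clauses sharing x3 form a tree, so BP is exact here.
print(belief_propagation([(1, 2, -3), (-3, 4, 5)], n=5))
```

For this tree formula 9 of the 25 solutions have x3 = 1, and the estimate for x3 agrees with 9/25 up to floating-point error. On graphs with short cycles the same code still runs, but its output is only an approximation.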

Page 21

Applications of belief propagation

• Statistical learning theory
• Vision
• Error-correcting codes (Turbo, LDPC, LT)
• Lossy data-compression
• Computational biology
• Sensor networks

Page 22

Limitations of BP

[Figure: the random 3-SAT roadmap: n variables, $m = \alpha n$ clauses, clause densities $\alpha$ marked at 0, 1.63, 3.52, 3.95, 4.27 and 4.51, labelled with PLR, Myopic, Random walks, Belief propagation and Survey propagation.]

Page 23

Reason for failure of Belief Propagation

• Messages from different neighbors are assumed to be almost independent, i.e. there are no long-range correlations

[Figure: the roadmap axis (0, 1.63, 3.52, 3.95, 4.27, 4.51) divided into a region labelled "No long-range correlations" and a region labelled "Long-range correlations exist"]

Page 24

Reason for failure of Belief Propagation

• Messages from different neighbors are assumed to be almost independent, i.e. there are no long-range correlations

Fix: 1-step Replica Symmetry Breaking Ansatz

• The distribution can be decomposed into “phases”
• There are no long-range correlations within a phase
• Each phase consists of similar assignments – “clusters”
• Messages become distributions of distributions
• An approximation yields 3-dimensional messages: Survey Propagation [Mezard, Parisi, Zecchina ‘02]
• Survey propagation finds a phase, then WalkSAT is used to find a solution in the phase

Page 25

Reason for failure of Belief Propagation

• Messages from different neighbors are assumed to be almost independent, i.e. there are no long-range correlations

Fix: 1-step Replica Symmetry Breaking Ansatz

• The distribution can be decomposed into “phases”

$\Pr[x_1, x_2, \dots, x_n] = \sum_{\gamma} p_{\gamma} \Pr_{\gamma}[x_1, x_2, \dots, x_n]$

Page 26

[Figure with the label "fixed variables"]

Page 27

Space of solutions

Satisfying assignments in $\{0, 1\}^n$

[Figure: satisfying assignments (bit strings such as 01011100) grouped into phases]

Page 28

Survey propagation

[Figure: a survey message as a distribution over three values, 0, 1 and *, with example weights .12, .81 and .07]

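Page 29

Survey propagation

$M_{c \to i} = \prod_{j \in N(c) \setminus i} \dfrac{M^u_{j \to c}}{M^u_{j \to c} + M^s_{j \to c} + M^*_{j \to c}}$

$M^u_{i \to c} = \Bigl(1 - \prod_{b \in N^u_c(i)} (1 - M_{b \to i})\Bigr) \prod_{b \in N^s_c(i)} (1 - M_{b \to i})$

$M^s_{i \to c} = \Bigl(1 - \prod_{b \in N^s_c(i)} (1 - M_{b \to i})\Bigr) \prod_{b \in N^u_c(i)} (1 - M_{b \to i})$

$M^*_{i \to c} = \prod_{b \in N(i) \setminus c} (1 - M_{b \to i})$

Here $N^s_c(i)$ and $N^u_c(i)$ are the other clauses in which $x_i$ appears with the same sign as in c and with the opposite sign, respectively.

[Figure: factor graph on variables x1, …, x8 with two example messages]

Clause to variable: "You have to satisfy me with prob. 60%"

Variable to clause: "I’m 0 with prob 10%, 1 with prob 70%, whichever (i.e. *) 20%"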

Page 30

Combinatorial interpretation

• Can survey propagation be thought of as inference on cluster assignments? Not precisely, but close.
• We define a related concept of core/cover assignments
• Assignments in the same cluster share the same core
• However, different clusters may have the same core

Pages 31 to 37

Finding the core of a solution

[Figure, animated over seven slides: starting from a satisfying 0/1 assignment, the unconstrained variables are turned into stars step by step until no unconstrained variable remains.]

Such a fully constrained partial assignment is called a cover.
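
The peeling procedure can be sketched as follows. The definition of "constrained" used here is an assumption taken from the standard formulation rather than from the slide text: an assigned variable is constrained if some clause is satisfied by it alone while all other literals of that clause are assigned and false. Literals are signed integers and "*" marks a star.

```python
from typing import Dict, List, Tuple, Union

Clause = Tuple[int, ...]
Value = Union[int, str]
STAR = "*"


def is_constrained(var: int, sigma: Dict[int, Value], formula: List[Clause]) -> bool:
    """True if some clause is satisfied only by `var`, with all its other literals assigned and false."""
    for clause in formula:
        for lit in clause:
            if abs(lit) == var and sigma[var] == (1 if lit > 0 else 0):
                others = [l for l in clause if abs(l) != var]
                if all(sigma[abs(l)] != STAR and sigma[abs(l)] != (1 if l > 0 else 0)
                       for l in others):
                    return True
    return False


def core(solution: Dict[int, int], formula: List[Clause]) -> Dict[int, Value]:
    """Peel a satisfying assignment: star unconstrained variables until none is left."""
    sigma: Dict[int, Value] = dict(solution)
    changed = True
    while changed:
        changed = False
        for var in sigma:
            if sigma[var] != STAR and not is_constrained(var, sigma, formula):
                sigma[var] = STAR
                changed = True
    return sigma


# A single clause has a trivial core: every variable is eventually starred.
print(core({1: 1, 2: 0, 3: 0}, [(1, 2, 3)]))
# Three clauses that each pin one variable give a core with no stars.
print(core({1: 1, 2: 1, 3: 1}, [(1, -2, -3), (-1, 2, -3), (-1, -2, 3)]))
```

In the first example x1 is constrained at the start, but it becomes unconstrained once x2 and x3 are starred, which is why peeling has to be repeated until nothing changes.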

Page 38

Extending the space of assignments

[Figure: the $\{0, 1\}^n$ assignments extended to partial assignments in $\{0, 1, *\}^n$, arranged by # stars, with the cores marked]

Page 39

Survey propagation is a belief propagation algorithm

Theorem: Survey propagation is equivalent to belief propagation on the uniform distribution over cover assignments.

[Maneva, Mossel, Wainwright ‘05] [Braunstein, Zecchina ‘05]

But, we still need to look at all partial assignments.

Page 40

Peeling Experiment for 3-SAT, $n = 10^5$

[Figure: plot of # unconstrained (vertical axis, 0 to 100000) against # stars (horizontal axis, 0 to 100000), with curves labelled 2, 2.5, 3, 3.5, 4, 4.1 and 4.2]

Page 41

Clusters and partial assignments

[Figure: the $\{0, 1\}^n$ assignments and the partial assignments, arranged by # stars]

Page 42

Definition of the new distribution

1. Includes all assignments without contradictions or implications

2. Weight of partial assignments:

$\Pr[\sigma] \propto \rho^{n_*(\sigma)} (1 - \rho)^{n_o(\sigma)}$

where $n_*(\sigma)$ is the number of stars and $n_o(\sigma)$ the number of unconstrained variables in $\sigma$.

3. A family of belief propagation algorithms: $\rho = 0$ is vanilla BP, $\rho = 1$ is SP

[Figure: an example formula and the lattice of its partial assignments]

Page 43

[Figure: the $\{0, 1\}^n$ assignments and the partial assignments arranged by # stars, with the cores marked; the weighting $\Pr[\sigma] \propto \rho^{n_*(\sigma)} (1 - \rho)^{n_o(\sigma)}$ runs from vanilla BP at $\rho = 0$ to SP at $\rho = 1$]

This is the correct picture for 9-SAT and above. [Achlioptas, Ricci-Tersenghi ‘06]

Page 44

Clustering for k-SAT

What is known?

2-SAT: a single cluster

3-SAT to 7-SAT: not known

8-SAT and above: exponential number of clusters (with second moment method) [Mezard, Mora, Zecchina `05] [Achlioptas, Ricci-Tersenghi `06]

9-SAT and above: clusters have non-trivial cores (with differential equations method) [Achlioptas, Ricci-Tersenghi `06]

Page 45

Convex geometry / Antimatroid

[Figure: the lattice of partial assignments of the example formula]

Total weight is 1 for every $\rho$

Page 46

Rigorous bounds for random 3-SAT

[Figure: axis of clause densities $\alpha$ marked at 0, 3.52, 4.51, 4.9, 5.19]

$E[\text{total weight of partial assignments}] \to 0$ ($\rho = 0.8$)

Fact: If there exists a solution, the weight of partial assignments is at least 1.

Page 47

Rigorous bounds for random 3-SAT

[Figure: axis of clause densities $\alpha$ marked at 0, 3.52, 4.453, 4.51, 5.19]

Theorem [Maneva, Sinclair]: For $\alpha > 4.453$ one of the following holds:
1. there are no satisfying assignments with high probability;
2. the core of every satisfying assignment is $(*, *, \dots, *)$.

Page 48

Random 3-SAT

[Figure: the random 3-SAT roadmap again: n variables, $m = \alpha n$ clauses, clause densities $\alpha$ marked at 0, 1.63, 3.52, 3.95, 4.27 and 4.51, labelled with PLR, Myopic, Random walk, Belief propagation and Survey propagation.]

Red = proved, green = unproved

Page 49

Challenges

• Improve the bounds on the threshold

• Prove algorithms work with high probability

• Find an algorithm for certifying that a formula with $\alpha n$ clauses for large $\alpha$ has no solution

Page 50

Thank you