Provably hard problems below the satisfiability threshold

Page 1: Provably hard problems below the satisfiability threshold

1

Provably hard problems below the satisfiability threshold

Paul Beame Univ. of Washington

Michael Molloy Univ. of Toronto

Dimitris Achlioptas Microsoft Research

A sharp threshold in proof complexity yields lower bounds for satisfiability search

Page 2: Provably hard problems below the satisfiability threshold

2

CNF Satisfiability

• (x1 ∨ x2 ∨ x4) ∧ (x1 ∨ x3) ∧ (x3 ∨ x2) ∧ (x4 ∨ x3)

• NP-complete, but many heuristics exist because of its practical importance
  – presumably exponential in the worst case

• If you know the formula is satisfiable
  – How hard is it to find an assignment?
  – No lower bounds known for interesting heuristics.

Page 3: Provably hard problems below the satisfiability threshold

3

Satisfiability Algorithms

• Local search (incomplete)
  – GSAT [Selman, Levesque, Mitchell 92]
  – Walksat [Kautz, Selman 96]

• Backtracking search (complete)
  – DPLL [Davis, Putnam 60] [Davis, Logemann, Loveland 62]
  – DPLL + “clause learning”

Page 4: Provably hard problems below the satisfiability threshold

Backtracking search/DPLL

• Free step: Select* a literal l (some x or ¬x)
  – Remove all clauses containing l
  – Shrink all clauses containing ¬l

• While there are 1-clauses
  – Pick some (arbitrary) 1-clause, satisfy it and simplify

• If there is a 0-clause (contradiction)
  – Backtrack to last free step

Each simplification yields a ‘residual formula’

*many options for select
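
The loop above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation analyzed in the talk: it assumes clauses are stored as frozensets of nonzero integers (v for x_v, -v for ¬x_v), it runs the 1-clause loop at the start of each recursive call rather than right after the free step, and `first_literal` is a hypothetical stand-in for select.

```python
def assign(formula, lit):
    """Set `lit` true: remove clauses containing lit, shrink clauses containing -lit."""
    return [c - {-lit} for c in formula if lit not in c]


def first_literal(formula):
    """Placeholder select rule (an assumption): lowest-index variable, tried True first."""
    return min(abs(l) for c in formula for l in c)


def dpll(formula, select=first_literal, model=None):
    """Return a satisfying partial assignment {var: bool}, or None if unsatisfiable."""
    model = dict(model or {})
    while True:
        if any(len(c) == 0 for c in formula):
            return None                      # 0-clause: contradiction, so backtrack
        unit = next((c for c in formula if len(c) == 1), None)
        if unit is None:
            break
        (lit,) = unit                        # satisfy the 1-clause and simplify
        model[abs(lit)] = lit > 0
        formula = assign(formula, lit)
    if not formula:
        return model                         # no clauses left: all are satisfied
    lit = select(formula)                    # free step
    for choice in (lit, -lit):               # second iteration = after backtracking
        extended = {**model, abs(choice): choice > 0}
        result = dpll(assign(formula, choice), select, extended)
        if result is not None:
            return result
    return None
```

Passing a different `select` function changes only the free step; the residual formula is whatever `assign` returns after each simplification.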

Page 5: Provably hard problems below the satisfiability threshold

5

Resolution

• Start with clauses of CNF formula F

• Resolution rule
  – Given (A ∨ x), (B ∨ ¬x) can derive (A ∨ B)

• F is unsatisfiable ⇔ 0-clause derivable
  – Proof size = # of clauses

Running DPLL (with any select) on an unsatisfiable formula F results in a tree-resolution proof that F is unsatisfiable
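
As a small worked instance of the resolution rule, the sketch below resolves two clauses on a variable and derives the 0-clause from a three-clause formula. The integer encoding of literals (v for x_v, -v for ¬x_v) and the function name `resolve` are assumptions made for illustration.

```python
def resolve(c1, c2, x):
    """From (A ∨ x) and (B ∨ ¬x) derive (A ∨ B); clauses are frozensets of ints."""
    if x not in c1 or -x not in c2:
        raise ValueError("c1 must contain x and c2 must contain -x")
    return (c1 - {x}) | (c2 - {-x})


# Refuting (x1) ∧ (¬x1 ∨ x2) ∧ (¬x2) in two resolution steps.
step1 = resolve(frozenset({1}), frozenset({-1, 2}), 1)   # derives (x2)
step2 = resolve(step1, frozenset({-2}), 2)               # derives the 0-clause
```

Counting the three input clauses and the two derived ones, this refutation has size 5.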

Page 6: Provably hard problems below the satisfiability threshold

6

Random CNF formulas

• Random 2-CNF formula with sn clauses
  – is satisfiable w.h.p. for s < 1
    • and simple DPLL will find a satisfying assignment in linear time w.h.p.
  – is unsatisfiable w.h.p. for s > 1
    • and simple DPLL will finish and yield a resolution proof of unsatisfiability in linear time w.h.p.
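
For experimenting with these densities, a random formula can be sampled as below. This is a sketch under one common convention, which is an assumption here: each clause picks k distinct variables uniformly at random and negates each independently with probability 1/2, and clauses are drawn independently. The name `random_k_cnf` is hypothetical.

```python
import random


def random_k_cnf(n, m, k, rng=None):
    """Sample m random k-clauses over variables 1..n as frozensets of signed ints."""
    rng = rng or random.Random()
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)        # k distinct variables
        clause = frozenset(v if rng.random() < 0.5 else -v for v in variables)
        formula.append(clause)
    return formula


# A random 2-CNF with sn clauses for s = 0.8 (below the threshold s = 1).
n, s = 1000, 0.8
below = random_k_cnf(n, int(s * n), 2, random.Random(0))
```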

Page 7: Provably hard problems below the satisfiability threshold

7

DPLL on random 3-CNF*

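[Figure: probability satisfiable (from 0 to 1) and # of DPLL backtracks, plotted against the ratio of clauses to variables; the transition is at ratio 4.26]

* n = 50 variables

Can prove 2^{Ω(n/Δ)} time is required for unsatisfiable formulas above the threshold (Δ = ratio of clauses to variables)

What about satisfiable formulas below threshold?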

Page 8: Provably hard problems below the satisfiability threshold

8

Phase transitions and algorithmic complexity

• Easy connection
  – Hardest random problems will always be at a monotone sharp threshold Δ*n, if it exists
    • Can randomly reduce satisfiable problems of lower density to those at the threshold
      – Given a formula with Δn clauses (Δ < Δ*), can always add (Δ* − Δ)n random clauses to make it a random problem nearly at the threshold, and use that solution
    • Can reduce unsatisfiable problems of larger density to those at the threshold
      – Given a formula with Δn clauses (Δ > Δ*), ignore all but the first Δ*n of them
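
Both reductions amount to a few lines of list manipulation. The sketch below is illustrative only: the function names are hypothetical, clauses are frozensets of signed integers, and the threshold ratio is passed in by the caller as `target_ratio` because its exact value is not assumed here.

```python
import random


def random_clause(n, k, rng):
    """One random k-clause over variables 1..n (assumed model: uniform, random signs)."""
    variables = rng.sample(range(1, n + 1), k)
    return frozenset(v if rng.random() < 0.5 else -v for v in variables)


def pad_to_density(formula, n, k, target_ratio, rng):
    """Satisfiable side: append random clauses until there are target_ratio * n."""
    target = int(target_ratio * n)
    extra = [random_clause(n, k, rng) for _ in range(max(0, target - len(formula)))]
    return list(formula) + extra


def truncate_to_density(formula, n, target_ratio):
    """Unsatisfiable side: keep only the first target_ratio * n clauses."""
    return list(formula)[: int(target_ratio * n)]
```

Any assignment satisfying the padded formula also satisfies the original, since the original clauses are a subset; likewise any refutation of the truncated formula refutes the original.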

Page 9: Provably hard problems below the satisfiability threshold

9

Hard satisfiable formulas?

With non-deterministic select we could simply guess n correct value assignments.

… How can a satisfiable formula possibly be hard?

Any implementation of select must run in polynomial time.

… Very simple heuristics are used in practice

Page 10: Provably hard problems below the satisfiability threshold

Some standard select rules for DPLL algorithms

• UC
  – Pick variables in a fixed order
  – Always set True first

• UCwm
  – Pick variables in a fixed order
  – Apply a majority vote among 3-clauses for assigning each value

• GUC
  – Pick a variable v in a shortest clause C
  – Set v to satisfy C
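
The three rules can be written as functions that take the residual formula and return the literal to set true first. This is a hedged sketch: clauses are assumed to be frozensets of signed integers with no 0-clause present, "fixed order" is taken to be variable index, and the tie-breaking (ties vote True in UCwm; first shortest clause and lowest-index variable in GUC) is an assumption rather than something stated on the slide.

```python
def select_uc(formula):
    """UC: next variable in fixed (index) order, set True first."""
    return min(abs(l) for c in formula for l in c)


def select_ucwm(formula):
    """UCwm: same variable as UC; its value is a majority vote among 3-clauses."""
    v = min(abs(l) for c in formula for l in c)
    # +1 for each 3-clause containing v, -1 for each 3-clause containing ¬v.
    votes = sum((v in c) - (-v in c) for c in formula if len(c) == 3)
    return v if votes >= 0 else -v


def select_guc(formula):
    """GUC: take a shortest clause C and return one of its literals (satisfying C)."""
    shortest = min(formula, key=len)
    return min(shortest, key=abs)
```

Each function returns a signed literal, so a DPLL driver can try that literal first and its negation after backtracking.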

Page 11: Provably hard problems below the satisfiability threshold

Contributions

These natural DPLL algorithms take exponential time on satisfiable formulas

∃ a family of unsatisfiable random formulas, parametrized by s, s.t. w.h.p.

s > 1 ⇒ linear size resolution proofs

s < 1 ⇒ only exponential size resolution proofs possible

Page 12: Provably hard problems below the satisfiability threshold

12

Key property of each of the select rules we’ve seen

• On random 3-CNF, before the first backtrack occurs, the residual formula is a uniformly random mix of 2-clauses and 3-clauses
  – If it has m2 2-clauses and m3 3-clauses then it is equally likely to be any formula with these properties

• This key property underlies the proofs of these algorithms’ success without backtracking

Page 13: Provably hard problems below the satisfiability threshold

What do long runs look like?

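[Figure: DPLL search tree with one node marked]

• Residual formula at each node is a mix of 2- and 3-clauses
• Residual formula at the marked node is unsatisfiable
• Every resolution proof of its unsatisfiability has size 2^{Ω(n)}
• So the algorithm’s proof of unsatisfiability is exponentially long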

Page 14: Provably hard problems below the satisfiability threshold

14

Proof Complexity

Theorem. A random CNF formula with Δn 3-clauses [Chvátal-Szemerédi 88] and sn 2-clauses where s < 1 [Achlioptas, B., Molloy 2001] w.h.p. requires resolution refutations of size 2^{Ω(n)}.

Formula is unsatisfiable w.h.p. for
  – Δ ≥ 4.57 [Chvátal-Szemerédi 88]
  – s ≥ 1 − ε and Δ ≥ ???? [Achlioptas, B., Molloy 2001]

Page 15: Provably hard problems below the satisfiability threshold

15

Non-rigorous results

[Kirkpatrick, Monasson, Selman, Zecchina 97]

[Figure: phase diagram of 2-clause ratio s (tick at 1) against 3-clause ratio (ticks at 2/3, 4.26, 4.57), divided into SAT and UNSAT regions]

We can add 2/3 n 3-clauses but not εn 2-clauses

Page 16: Provably hard problems below the satisfiability threshold

16

Rigorous results [Achlioptas, Kirousis, Kranakis, Krizanc 97]

[Figure: phase diagram of 2-clause ratio s (tick at 1) against 3-clause ratio (ticks at 2/3, 2.28, 8/3, 4.57), with proven SAT and UNSAT regions and the unresolved region marked “?”]

We can add 2/3 n 3-clauses but not εn 2-clauses

Page 17: Provably hard problems below the satisfiability threshold

17

Proof Complexity

Theorem [Achlioptas, B., Molloy 2001]. A random CNF formula with Δn 3-clauses and sn 2-clauses where s < 1 w.h.p. requires resolution refutations of size 2^{Ω(n)}.

Formula is unsatisfiable w.h.p. for
  – Δ ≥ 4.57
  – Δ ≥ 2.28 and s ≥ 1 − ε, for ε = .0001

Sharp threshold, since resolution is linear for s ≥ 1 + ε

Page 18: Provably hard problems below the satisfiability threshold

18

These DPLL algorithms follow trajectories

[Chao, Franco 88] [Frieze, Suen 95] [Achlioptas 00] [Achlioptas, Sorkin 00]

[Figure: trajectories of UC and GUC in the plane of 2-clause ratio s (tick at 1) against 3-clause ratio (ticks at 2/3, 8/3, 3.26)]

Page 19: Provably hard problems below the satisfiability threshold

19

DPLL crossing into the bad zone

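[Figure: an algorithm trajectory in the plane of 2-clause ratio s (tick at 1) against 3-clause ratio (ticks at 3.26, 4.26, 4.57), passing from the “Provably SAT & Easy” region into the “Provably UNSAT & Hard” region]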

Page 20: Provably hard problems below the satisfiability threshold

Exponential lower bounds far below the threshold.

Theorem. Let A ∈ {UC, UCwm, GUC}. Let

Δ_UC = 3.81

Δ_UCwm = 3.83

Δ_GUC = 4.01

W.h.p. algorithm A takes 2^{Ω(n)} steps on a random 3-CNF with Δ_A n clauses

Lower bound also applies to any resolution-based algorithm that extends the ‘first’ branch of the execution of A

Page 21: Provably hard problems below the satisfiability threshold

21

Related Work

• Experiments suggested DPLL algorithms may not be polynomial all the way to the threshold

• [Cocco, Monasson 01] applied non-rigorous methods to suggest exponential GUC behavior below the threshold
  – Assumed every branch of GUC tree operates like an independent version of the first branch
  – Independent of our work

Page 22: Provably hard problems below the satisfiability threshold

22

Implications for phase transitions and algorithmic complexity

• Difference between polynomial and exponential hardness is not necessarily a function of the phase transition
  – Applies in both phases, not just the over-constrained phase
  – Algorithmically dependent

• A good algorithm will have a transition in a different place from a bad algorithm

• Can’t study the hardness transition in the absence of the study of algorithms

Page 23: Provably hard problems below the satisfiability threshold

23

Proof Ideas

• Connection between pure literals and resolution proof size [Chvátal, Szemerédi 88] [Ben-Sasson, Wigderson 99]
  – pure literals are those that occur only positively or only negatively in a formula

• Digraph structure of random 2-CNF subformula
  – New graph-theoretic notion “clan”
    • generalization of connected component
  – Sharp concentration properties for clan size
    • moment generating function argument
  – Amortization of pure literals across clans
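
The definition of a pure literal translates directly into code. In this sketch the encoding (clauses as sets of signed integers) and the name `pure_literals` are assumptions for illustration.

```python
def pure_literals(formula):
    """Return the literals whose negation appears in no clause of the formula."""
    literals = {l for clause in formula for l in clause}
    return {l for l in literals if -l not in literals}


# x1 occurs only positively; x2 and x3 occur with both signs.
example = [frozenset({1, -2}), frozenset({2, 3}), frozenset({-3, 1})]
pure = pure_literals(example)
```

Here `pure` contains only the literal 1.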

Page 24: Provably hard problems below the satisfiability threshold

24

Resolution proof size and pure literals [Ben-Sasson,Wigderson 99]

• If formula has an ε s.t.
  – Every subformula with ≤ εn clauses has at least one pure literal
  – Every subformula with between ½εn and εn clauses has a linear # of pure literals

• Then
  – all resolution proofs of the formula require size 2^{Ω(n)}

Page 25: Provably hard problems below the satisfiability threshold

25

Basic idea of argument

• By sparsity of the 2-clause part of the formula, any subset of the 2-clauses will have lots of pure literals
  – Clan size analysis & amortization

• In a subformula involving both 2-clauses and 3-clauses, either there are
  • so many 3-clauses that they create lots of new pure literals on their own (easy case), or
  • so few 3-clauses that they can’t cover all the pure literals in the 2-clauses (analysis of clans)

Page 26: Provably hard problems below the satisfiability threshold

26

2-CNF Digraph on literals

[Figure: digraph on the literals x, y, z, w, c, d and their negations, built from the 2-clauses (d ∨ y), (y ∨ x), (z ∨ y), (c ∨ w), (x ∨ w), (w ∨ z)]
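
A digraph on literals can be built from 2-clauses with the standard implication convention, in which a clause (a ∨ b) contributes the edges ¬a → b and ¬b → a. That convention, the integer encoding of literals, and the name `implication_digraph` are assumptions of this sketch; the drawing on the slide may orient edges differently.

```python
from collections import defaultdict


def implication_digraph(two_clauses):
    """Map each literal to the set of literals it implies under the given 2-clauses."""
    graph = defaultdict(set)
    for a, b in two_clauses:
        graph[-a].add(b)    # if a is false, b must be true
        graph[-b].add(a)    # if b is false, a must be true
    return dict(graph)


# (x1 ∨ x2) ∧ (¬x2 ∨ x3), with v for x_v and -v for ¬x_v.
g = implication_digraph([(1, 2), (-2, 3)])
```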

Page 27: Provably hard problems below the satisfiability threshold

27

Hyper/Digraph on literals

[Figure: hyper/digraph on the literals x, y, z, w, a, b, c, d, f, g, with hyperedges for the 3-clauses (a ∨ b ∨ z) and (f ∨ g ∨ w)]

Page 28: Provably hard problems below the satisfiability threshold

28

Pure literals

[Figure: pure literals in the hyper/digraph on the literals x, y, z, w, a, b, c, d, f, g]

Page 29: Provably hard problems below the satisfiability threshold

29

Pure cycle

[Figure: a pure cycle in the hyper/digraph on the literals x, y, z, w, a, b, c, d, f, g]

Page 30: Provably hard problems below the satisfiability threshold

30

Pure Items & Clans of G

• Clans
  – small subgraphs of G
    • one clan per vertex; they cover G
    • analog of connected components in sparse random graphs
  – pure items: typically two per clan (analog of leaves in acyclic connected components in an ordinary graph)
  – mostly constant size
  – never more than log³ n vertices
    • if x ∈ clan(y) then y ∈ clan(x)

Page 31: Provably hard problems below the satisfiability threshold

31

What are clans?

Simpler notion first

in(y) for vertex y in an ordinary digraph

Page 32: Provably hard problems below the satisfiability threshold

32

in(y) in ordinary digraph

[Figure: digraph on vertices x, y, z, w, v, u, t with in(y) marked]

in(y) = subgraph of vertices that can reach y = ancestors of y

Page 33: Provably hard problems below the satisfiability threshold

33

clan(y) in ordinary digraph

[Figure: digraph on vertices x, y, z, w, v, u, t with clan(y) marked]

clan(y) = descendants of ancestors of y
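
For an ordinary digraph, both notions reduce to reachability. The sketch below assumes the digraph is a dict mapping each vertex to its set of out-neighbours and counts y as its own ancestor; the example graph is made up, and the extra complications of the 2-CNF digraph are not modelled.

```python
def reachable(graph, start):
    """All vertices reachable from any vertex in `start`, including `start` itself."""
    seen, stack = set(start), list(start)
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def reverse(graph):
    """Digraph with every edge flipped."""
    rev = {}
    for u, outs in graph.items():
        for v in outs:
            rev.setdefault(v, set()).add(u)
    return rev


def in_set(graph, y):
    """in(y): the vertices that can reach y (ancestors of y)."""
    return reachable(reverse(graph), {y})


def clan(graph, y):
    """clan(y): the descendants of the ancestors of y."""
    return reachable(graph, in_set(graph, y))


# Hypothetical example: z -> x -> y, z -> w -> v, and a separate edge u -> t.
g = {"x": {"y"}, "z": {"x", "w"}, "w": {"v"}, "u": {"t"}}
```

In this example v lies in clan(y) and y lies in clan(v), matching the symmetry property of clans.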

Page 34: Provably hard problems below the satisfiability threshold

34

clan(y) in 2-CNF digraph

[Figure: clan(y) in the 2-CNF digraph, with the literals y and ¬y labeled]

Page 35: Provably hard problems below the satisfiability threshold

35

A complication - bad events

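[Figure: digraph on the literals x, y, z, w, c, d and their negations, for the 2-clauses (d ∨ y), (z ∨ y), (c ∨ w), (x ∨ w), (w ∨ z) and the additional clause (w ∨ d)]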

Page 36: Provably hard problems below the satisfiability threshold

36

in(y) in a bad case

[Figure: in(y) in a bad case, with the literals y and ¬y labeled]

Page 37: Provably hard problems below the satisfiability threshold

37

clan(y) in a bad case

[Figure: clan(y) in a bad case, with the literals y and ¬y labeled]

This can cascade and get even worse!

Page 38: Provably hard problems below the satisfiability threshold

38

Analysis

• If we ignore bad edges, |in(y)| is dominated by a component process in a sub-critical random undirected graph
  – like trimmed out-trees [Bollobás, Borgs, Chayes, Kim, Wilson]

• Ignoring bad edges, |clan(y)| is dominated by a 2-level process
  – run a component process to get in(y)
  – take the union of |in(y)| independent component processes added to in(y)

Page 39: Provably hard problems below the satisfiability threshold

39

Analysis

• w.h.p. no more than one bad event happens per clan
  – |in(y)| is always dominated by the 2-level component process

• w.h.p. no more than C log n bad events occur in the whole digraph
  – fewer than polylog n literals interact with bad clans
  – rest of clans dominated by 2-level process

Page 40: Provably hard problems below the satisfiability threshold

40

Analysis

• Ordinary sub-critical component process on 2n vertices: w.h.p.
  – # of vertices with component size i is at most 2n(1−ε)^i for some fixed ε > 0

• We show for the sub-critical 2-level component process on 2n vertices: w.h.p.
  – for i ≥ i_0, # of vertices with 2-level size i is at most 2n(1−ε)^i for some fixed ε > 0

This is false for a 3-level component process!

Page 41: Provably hard problems below the satisfiability threshold

41

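Open problem

Conjecture. For every Δ > 2/3 there exists an s < 1 such that a random (2,3)-CNF with Δn 3-clauses and sn 2-clauses is w.h.p. unsatisfiable

[Figure: phase diagram of 2-clause ratio (tick at 1) against 3-clause ratio (ticks at 2/3, 3.26, 4.57), with SAT and UNSAT regions and the unresolved region marked “?”]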

Page 42: Provably hard problems below the satisfiability threshold

42

Open problem

Conjecture. For every Δ > 2/3 there exists an s < 1 such that a random (2,3)-CNF with Δn 3-clauses and sn 2-clauses is w.h.p. unsatisfiable

Implies. For every card-game algorithm A there exists a critical density Δ_A such that for random 3-CNF formulas with Δn clauses
  – For Δ < Δ_A, w.h.p. A takes linear time
  – For Δ > Δ_A, w.h.p. A takes exponential time