Proving Non-Reconstruction on Trees by an Iterative Algorithm Elitza Maneva University of Barcelona...


Elitza Maneva, University of Barcelona

joint work with N. Bhatnagar, Hebrew University


Optimal Algorithm for Reconstruction: Belief Propagation computes the distribution at the root given the boundary

Random variable $L_R(n)$: colors at vertices at level $n$ (when the root has color $R$).

For what degree is $\lim_{n\to\infty} E[\Pr[R \text{ at the root} \mid L_R(n)]] = 1/q$?

For what degree is $\lim_{n\to\infty} d_{TV}(L_R(n), L_G(n)) = 0$?

Total variation distance:

$d_{TV}(\mu, \nu) = \frac{1}{2} \sum_{L \in D} |\mu(L) - \nu(L)|$
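As a small illustration of the total variation distance, here is a minimal sketch that represents each distribution as a dictionary from boundary configuration to probability. The function name `total_variation` and the toy example (q = 3, a single child, one level) are assumptions made for illustration and do not come from the slides.

```python
def total_variation(mu, nu):
    """Half the L1 distance between two distributions given as dicts
    mapping outcome -> probability (missing outcomes have probability 0)."""
    support = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in support)


# Toy example (assumed): 3 colors R, G, B and one child at level 1.
# Given root R the child is uniform on {G, B}; given root G it is uniform on {R, B}.
level_given_R = {"G": 0.5, "B": 0.5}
level_given_G = {"R": 0.5, "B": 0.5}
distance = total_variation(level_given_R, level_given_G)
```

In this toy case the two boundary distributions overlap only on the outcome B, so the distance is 1/2.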

Interest in reconstruction (in chronological order)

• Probability: Extremality of the free-boundary Gibbs measure (since 1960s)

• Phylogeny: reconstructing ancestry tree of collection of species

• Physics: Replica symmetry breaking (dynamical transition) in spin-glasses

• Computer Science: Glauber dynamics, MCMC, message-passing algorithms

Space of solutions of random Constraint Satisfaction Problems

n variables

dn constraints are chosen at random

[Figure: an axis in d starting at 0, with markers d_r and d_s and regions labeled Easy, Hard, Unsat]

Tree: Random graph:

$\lim_{n\to\infty} d_{TV}(L_R(n), L_G(n)) = 0$ [figure]

$\lim_{n\to\infty} d_{TV}(L_R(n), L_G(n)) > 0$ [figure]

Conjecture

Other models

• Potts model: q colors, parameter $\lambda \in [0,1]$

color of child node = same as parent’s color with prob. $\lambda$, each other color with prob. $(1-\lambda)/(q-1)$

• Asymmetric channels: q colors, $q \times q$ matrix $M$, $M_{i,j}$ = Prob [child gets color j | parent is color i]

• Same on Galton-Watson trees (i.e. random degrees)

• 3-SAT: [figure: tree on variables x1, …, x5 with 0/1 values]

RSB, clustering of solutions and reconstruction highlights

• [Mezard-Parisi-Zecchina ‘03] Survey Propagation algorithm and satisfiability threshold calculation for 3-SAT and coloring, based on Replica Symmetry Breaking Ansatz.

• [Mezard-Montanari ‘06] The dynamic replica symmetry breaking threshold is the same as the threshold for reconstruction on the tree.

• [Achlioptas--Coja-Oghlan ‘08] For sufficiently large q, there exists a sequence $\varepsilon_q \to 0$ s.t. the space of colorings is clustered for random graphs of average degree

$(1+\varepsilon_q)\, q \log q < d < (2-\varepsilon_q)\, q \log q$.

• [Sly ‘08] For sufficiently large q: $q(\log q + \log\log q + 1 - \ln 2 - o(1)) < d_r < q(\log q + \log\log q + 1 + o(1))$


Easy lower bound by coupling

$d_{\text{reconstruction}} > q-1$

Upper bound: boundary forcing the root

$d_{\text{reconstruction}} \le \mathrm{soln}\Big( \sum_{j=0}^{q-1} (-1)^j \binom{q-1}{j} \frac{(q-1-jx)^d}{(q-1)^d} = x \Big)$
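One way to read the forcing bound: if each child's boundary forces its color independently with probability x, then the root is forced exactly when every one of the q-1 other colors appears on some forced child, and inclusion-exclusion gives the sum above. The sketch below iterates that map from x = 1 (a leaf is trivially forced) and searches for the smallest integer d with a nonzero limit. Reading soln(·) as "the smallest d with a nonzero fixed point" is an assumption, as are the function names, the iteration count and the tolerance.

```python
from math import comb


def forcing_map(x, q, d):
    """Probability that the root is forced when each of its d children is
    forced independently with probability x (inclusion-exclusion over the
    q-1 colors that must each appear on some forced child)."""
    total = sum((-1) ** j * comb(q - 1, j) * (1 - j * x / (q - 1)) ** d
                for j in range(q))
    return min(1.0, max(0.0, total))  # clamp tiny floating-point errors


def has_forcing_fixed_point(q, d, iterations=5000, tol=1e-9):
    """Iterate from x = 1 and report whether the limit stays away from 0."""
    x = 1.0
    for _ in range(iterations):
        x = forcing_map(x, q, d)
    return x > tol


def forcing_upper_bound(q, d_max=500):
    """Smallest integer d in 1..d_max with a nonzero forcing fixed point."""
    for d in range(1, d_max + 1):
        if has_forcing_fixed_point(q, d):
            return d
    return None


bound_q3 = forcing_upper_bound(3)
bound_q4 = forcing_upper_bound(4)
```

Under these assumptions the search returns 5 for q = 3 and 9 for q = 4, which agrees with the "Upper bound" row of the coloring table in these slides. For large q the alternating sum may lose floating-point precision, so exact rational arithmetic might be preferable there.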

The bounds for coloring

• [Zdeborova-Krzakala ‘07]
– Heuristic algorithm for computing threshold for general models
– Heuristic analysis predicted the asymptotic result of Sly

• [Sly ‘08] [Bhatnagar-Vera-Vigoda ‘08] For q sufficiently large:

Hard step (need large q): After sufficiently many iterations $d_{TV} < 2/q$.

Easy step: Given that $d_{TV} < 2/q$, $d_{TV}$ goes to 0.

• [Bhatnagar-Maneva ‘09]
– Rigorous algorithm for getting upper bounds on $d_{TV}$ for general models
– Concrete bounds on the threshold for Potts model with small q.

q                 3    4    5    6    7    8    9   10   20
Lower bound       2    3    4    5    6    7    8    9   19
Upper bound       5    9   14   19   24   29   35   41  104
[ZK07] (heur.)    5    8   13   17   22   28   33   38  100

Here we need: Recursion on the distribution over possible distributions at the root when the boundary is chosen at random.


BP: recursion on the distribution at the root given the boundary

$f : (\mathbb{R}^q)^d \to \mathbb{R}^q$

$f_i(\eta_1, \eta_2, \dots, \eta_d) = \prod_{t=1}^{d} \Big( \sum_{j \ne i} \eta_t^j \Big)$

$\|\eta\| = \sum_i \eta^i$
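The BP map at a single node can be sketched directly from its definition for proper colorings: coordinate i of the output multiplies, over the d children, the mass each child puts on colors other than i. The names `bp_unnormalized` and `normalize`, and the representation of each child message as a plain list of q numbers, are assumptions made for this illustration.

```python
from math import prod


def bp_unnormalized(children):
    """f_i = product over children t of (sum over j != i of eta_t[j]).
    `children` is a list of d vectors of length q."""
    q = len(children[0])
    return [prod(sum(eta[j] for j in range(q) if j != i) for eta in children)
            for i in range(q)]


def normalize(vector):
    """Divide by the sum of the coordinates, i.e. by ||vector||."""
    total = sum(vector)
    return [value / total for value in vector]


# Two children whose colors are known to be 0 and 1 (q = 3):
# the root cannot be 0 or 1, so all mass goes to color 2.
forced_root = normalize(bp_unnormalized([[1, 0, 0], [0, 1, 0]]))

# Two completely uninformative children leave the root uniform.
uniform_root = normalize(bp_unnormalized([[1 / 3] * 3, [1 / 3] * 3]))
```

The first call illustrates a boundary that forces the root; the second shows that uninformative children carry no information about it.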

Some notation

Recursion on the tree depth

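$Q^G_n$: random q-dim vector

$Q^G_n(R) := \text{Prob}[R \mid L]$, where $L \sim L_G(n)$

$\Pr[Q^G_{n+1} = \eta] = \frac{1}{(q-1)^d} \sum_{c_1, \dots, c_d \in \{1, \dots, q\} \setminus G} \Pr[f(Q^{c_1}_n, \dots, Q^{c_d}_n) \propto \eta]$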

Population dynamics

$\Pr[Q^G_{n+1} = \eta] = \frac{1}{(q-1)^d} \sum_{c_1, \dots, c_d \in \{1, \dots, q\} \setminus G} \Pr[f(Q^{c_1}_n, \dots, Q^{c_d}_n) \propto \eta]$

• Keep “populations” of N samples each from the distributions of $Q^R_n$, $Q^G_n$, … etc.

• Generating the population of $Q^G_{n+1}$:

– Choose d colors $c_1, \dots, c_d$ from $\{1, \dots, q\} \setminus G$ independently

– Choose $\eta_1, \dots, \eta_d$ randomly, respectively from the populations for $Q^{c_1}_n, \dots, Q^{c_d}_n$

– Save $f(\eta_1, \dots, \eta_d)/\|f(\eta_1, \dots, \eta_d)\|$ into the population for $Q^G_{n+1}$
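The population dynamics steps above can be sketched as follows for proper colorings. This is a heuristic sampler under stated assumptions, not the rigorous algorithm of the talk: colors are numbered 0..q-1, the level-0 population for color g is the point mass on g, and the names `bp_normalized`, `next_populations`, `population_dynamics` and `mean_correct_mass` are hypothetical.

```python
import random
from math import prod


def bp_normalized(children, q):
    """Normalized BP update f/||f|| for proper colorings."""
    f = [prod(sum(eta) - eta[i] for eta in children) for i in range(q)]
    total = sum(f)
    return [value / total for value in f]


def next_populations(pops, q, d, rng):
    """One level of the recursion: build the population of Q^g_{n+1}
    from the populations of Q^c_n for every color c != g."""
    new_pops = {}
    for g in range(q):
        other_colors = [c for c in range(q) if c != g]
        samples = []
        for _ in range(len(pops[g])):
            colors = [rng.choice(other_colors) for _ in range(d)]
            children = [rng.choice(pops[c]) for c in colors]
            samples.append(bp_normalized(children, q))
        new_pops[g] = samples
    return new_pops


def population_dynamics(q, d, depth, pop_size=1000, seed=0):
    """Run `depth` levels starting from point masses (the boundary is known)."""
    rng = random.Random(seed)
    pops = {g: [[1.0 if i == g else 0.0 for i in range(q)]
                for _ in range(pop_size)]
            for g in range(q)}
    for _ in range(depth):
        pops = next_populations(pops, q, d, rng)
    return pops


def mean_correct_mass(pops, g):
    """Sample estimate of E[Q^g_n(g)]; compare against 1/q."""
    return sum(eta[g] for eta in pops[g]) / len(pops[g])


one_level = population_dynamics(q=3, d=1, depth=1, pop_size=50)
```

With q = 3, d = 1 and a single level, the one child is known to differ from the root, so every sample puts mass 1/2 on the true root color. For deeper trees the estimate is random and carries sampling error, which is why such estimates are only heuristic.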

Recursions on the tree depth

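• Conditional recursion: $Q^G_n$ random q-dim vector

$Q^G_n(R) := \text{Prob}[R \mid L]$, where $L \sim L_G(n)$

$\Pr[Q^G_{n+1} = \eta] = \frac{1}{(q-1)^d} \sum_{c_1, \dots, c_d \in \{1, \dots, q\} \setminus G} \Pr[f(Q^{c_1}_n, \dots, Q^{c_d}_n) \propto \eta]$

• Unconditional recursion: $Q_n$ random q-dim vector

$Q_n(R) := \text{Prob}[R \mid L]$, L is a random boundary

$\Pr[Q_{n+1} = \eta] \propto E\big[ \|f(Q_n^{(1)}, \dots, Q_n^{(d)})\| \cdot \mathrm{Ind}[f(Q_n^{(1)}, \dots, Q_n^{(d)}) \propto \eta] \big]$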

Discrete surveys algorithm

$\Pr[Q_{n+1} = \eta] \propto E\big[ \|f(Q_n^{(1)}, \dots, Q_n^{(d)})\| \cdot \mathrm{Ind}[f(Q_n^{(1)}, \dots, Q_n^{(d)}) \propto \eta] \big]$

• Keep a “survey” of the distribution of $Q_n$

• Generate the survey of $Q_{n+1}$ by applying the recursion to the survey of $Q_n$.

[Figure: two panels, "distrib. of Qn" and "survey of Qn", each drawn on axes R and G ranging from 0 to 1; weights shown include 0.08, 0.11, 0.05, 0.25]

Definition of a discrete survey

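• Let $\mathcal{P}$ be the space of q-dim probability vectors.

• Let $S = (S_1, \dots, S_k) \subset \mathcal{P}$, and let $\langle S \rangle$ be the convex hull of S.

• Let $\sigma_1, \dots, \sigma_k$ be functions $\sigma_i : \langle S \rangle \to [0,1]$, s.t. for every $\eta \in \langle S \rangle$:

1. $\sum_i \sigma_i(\eta) = 1$

2. $\eta = \sum_i \sigma_i(\eta)\, S_i$ (the $\sigma_i$ define a convex decomposition of $\eta$).

• Let P be a random element of $\mathcal{P}$ with support in $\langle S \rangle$.

• Let C be a random element of S with $\Pr[C = S_i] = E[\sigma_i(P)]$.

• Then we say that C is a survey of P on the skeleton $(S, \sigma_1, \dots, \sigma_k)$.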
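To make the definition concrete, here is a minimal sketch for the simplest case q = 2, where a probability vector is determined by a single number p in [0, 1]. The skeleton is a sorted list of grid points and the decomposition functions are taken to be linear-interpolation ("hat") weights. That choice of skeleton, and the names `hat_weights` and `survey`, are assumptions for illustration; the random vector P is represented by a list of equally likely samples.

```python
from bisect import bisect_right


def hat_weights(p, skeleton):
    """Convex decomposition of p over a sorted 1-D skeleton with at least
    two distinct points: returns {index: weight} for the two neighbours."""
    if not skeleton[0] <= p <= skeleton[-1]:
        raise ValueError("p lies outside the convex hull of the skeleton")
    hi = min(bisect_right(skeleton, p), len(skeleton) - 1)
    lo = hi - 1
    weight_hi = (p - skeleton[lo]) / (skeleton[hi] - skeleton[lo])
    return {lo: 1.0 - weight_hi, hi: weight_hi}


def survey(samples, skeleton):
    """Pr[C = S_i] = E[weight_i(P)], with the expectation taken over samples."""
    mass = [0.0] * len(skeleton)
    for p in samples:
        for index, weight in hat_weights(p, skeleton).items():
            mass[index] += weight / len(samples)
    return mass


skeleton = [0.0, 0.5, 1.0]
samples = [0.25, 0.75]
survey_mass = survey(samples, skeleton)  # mass on each skeleton point
```

Here each sample sits halfway between two skeleton points, so the survey places mass 1/4, 1/2, 1/4 on 0, 1/2, 1. The mean is preserved (both are 1/2), while the survey is more spread out than the original samples.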

Properties of discrete surveys

• Transitivity: If C is a survey of P and D is a survey of C then D is a survey of P.

• Mixing: If $C_1$, $C_2$ are surveys respectively of $P_1$ and $P_2$ then the r.v. with distribution the mixture $p\, C_1 + (1-p)\, C_2$ is a survey of the r.v. with distribution $p\, P_1 + (1-p)\, P_2$.

• For any multi-affine function $f : \mathcal{P}^d \to \mathcal{P}$, if $C_1, \dots, C_d$ are surveys of $P_1, \dots, P_d$ then the r.v. D defined by

$\Pr[D = \eta] \propto E\big[ \|f(C_1, \dots, C_d)\| \times \mathrm{Ind}[f(C_1, \dots, C_d) \propto \eta] \big]$ is a survey of the r.v. Q defined by

$\Pr[Q = \eta] \propto E\big[ \|f(P_1, \dots, P_d)\| \times \mathrm{Ind}[f(P_1, \dots, P_d) \propto \eta] \big]$

• For a convex function g on $\mathcal{P}$, if C is a survey of P then $E[g(P)] \le E[g(C)]$.
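The convexity property follows in one line from Jensen's inequality applied to the convex decomposition. The derivation below writes $\sigma_i$ for the decomposition functions of the skeleton and $S_i$ for its points; the symbol $\sigma_i$ is a naming assumption.

```latex
\begin{align*}
E[g(C)] &= \sum_{i=1}^{k} \Pr[C = S_i]\, g(S_i)
         = \sum_{i=1}^{k} E[\sigma_i(P)]\, g(S_i)
         = E\Big[ \sum_{i=1}^{k} \sigma_i(P)\, g(S_i) \Big] \\
        &\ge E\Big[ g\Big( \sum_{i=1}^{k} \sigma_i(P)\, S_i \Big) \Big]
         = E[g(P)] .
\end{align*}
```

The inequality is Jensen's, applied for each fixed value of P to the convex combination with weights $\sigma_i(P)$; the last equality uses $P = \sum_i \sigma_i(P)\, S_i$. This is why a bound computed on a survey can serve as an upper bound for the original random vector.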

Manual part of the algorithm

• Selection of skeletons of small size k
– complexity of the algorithm: $O(n k^d)$
– k generally needs to be exponential in q
– the skeleton can be refined progressively

• Examples on Potts model:
– q=3, d=3, λ=0: n=14, k=19 was enough
– q=3, d=2, λ=0.79: n<100, k≤208 was enough
– q=3, d=3, λ=0.7: n<100, k≤85 was enough
– q=3, d=2 or 3, λ=0.74: n<100, k≤61 was enough

A proof that $d_{TV}$ is small implies $d_{TV} \to 0$

• There is no general strategy

• For Potts model, due to [Sly ‘09]:

$x_n := E_{L \sim L_R(n)}[\text{Prob}[R \mid L] - 1/q] = q\, E_L[(\text{Prob}[R \mid L] - 1/q)^2]$

we have $x_{n+1} \le d \lambda^2 x_n + c_2(q,d,\lambda)\, x_n^2 + \dots + c_d(q,d,\lambda)\, x_n^d$

• Thus we could find $\delta > 0$ and $c < 1$ such that if $x_n < \delta$ then $x_{n+1} < c\, x_n$
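The choice of the two constants in the last step can be spelled out as follows. This sketch assumes that the coefficient of the linear term is strictly below 1, which the slide does not state but which is needed for any contraction factor $c < 1$ to exist. It writes $a$ for that linear coefficient and $c_k$ for the higher-order coefficients; the names $a$ and $\delta$ are assumptions.

```latex
% Assumption: the linear coefficient a satisfies 0 <= a < 1.
% Fix any c with a < c < 1 and choose \delta > 0 small enough that
\sum_{k=2}^{d} |c_k|\, \delta^{k-1} < c - a .
% Then, for 0 < x_n < \delta,
x_{n+1} \le a\, x_n + \sum_{k=2}^{d} c_k\, x_n^{k}
        \le \Big( a + \sum_{k=2}^{d} |c_k|\, \delta^{k-1} \Big) x_n
        < c\, x_n .
```

Since $x_{n+1} < c\, x_n < \delta$, the condition is preserved at the next level and $x_n$ decays geometrically to 0.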

Bound of Formentin and Külske ‘09

• α: stationary vector of positive matrix M

• S(p|α) := Σi p(i) log( p(i)/α(i) )

• L(p) := S(p|α) + S(α|p)

• M_rev(i,j) := α(j) M(j, i)/α(i)

• c(M) := sup_p L(p M_rev)/L(p)

Theorem: If E[d] c(M) < 1 then no reconstruction.
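The constant c(M) can be explored numerically by sampling. The sketch below evaluates the ratio L(p M_rev)/L(p) at random points of the simplex and keeps the largest value seen, which gives only a lower estimate of the supremum and is not a certificate. The Potts matrix used as an example, the uniform stationary vector and all function names are assumptions for illustration.

```python
import math
import random


def sym_entropy(p, alpha):
    """L(p) = S(p|alpha) + S(alpha|p) = sum_i (p_i - alpha_i) log(p_i / alpha_i)."""
    return sum((pi - ai) * math.log(pi / ai) for pi, ai in zip(p, alpha))


def reversed_chain(M, alpha):
    """M_rev(i, j) = alpha(j) M(j, i) / alpha(i)."""
    q = len(M)
    return [[alpha[j] * M[j][i] / alpha[i] for j in range(q)] for i in range(q)]


def apply_channel(p, M):
    """Row vector times matrix: (pM)(j) = sum_i p(i) M(i, j)."""
    q = len(M)
    return [sum(p[i] * M[i][j] for i in range(q)) for j in range(q)]


def estimate_c(M, alpha, trials=20000, seed=0):
    """Largest sampled value of L(p M_rev) / L(p); a lower estimate of c(M)."""
    rng = random.Random(seed)
    M_rev = reversed_chain(M, alpha)
    best = 0.0
    for _ in range(trials):
        raw = [rng.expovariate(1.0) for _ in alpha]
        total = sum(raw)
        p = [value / total for value in raw]
        if min(p) <= 0.0:
            continue  # the logarithms need strictly positive entries
        denominator = sym_entropy(p, alpha)
        if denominator > 1e-12:  # skip points numerically equal to alpha
            ratio = sym_entropy(apply_channel(p, M_rev), alpha) / denominator
            best = max(best, ratio)
    return best


# Assumed example: Potts channel with q = 3 and same-color probability 0.5.
q, same = 3, 0.5
potts = [[same if i == j else (1 - same) / (q - 1) for j in range(q)]
         for i in range(q)]
uniform = [1.0 / q] * q
c_estimate = estimate_c(potts, uniform, trials=2000)
```

Because both relative entropies contract under a Markov kernel that fixes α, the ratio never exceeds 1. Multiplying the estimate by the mean degree and comparing with 1 can suggest whether the theorem might apply, but an actual proof needs a true upper bound on the supremum.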

Important Questions

• For the Potts model better bounds were obtained by [Formentin-Külske ‘09]. Could they be tight? Can their method be generalized to models with hard constraints?

• The design of the Survey Propagation algorithm also includes a discretization step - could this step be done in a controlled manner too?

• How are reconstruction on trees and clustering of solutions related?

[Mezard, Montanari `05]

The dynamical transition at d corresponds to the phase transition for reconstruction on the tree.

[Allan Sly `08]

For q-coloring:

d ≤ q(log q + log log q + 1 + o(1)) (also [Zdeborova, Krzakala ’07])

d ≥ q(log q + log log q + 1 – ln 2 –o(1))

For constant q it is open. Estimates can be obtained with population dynamics.

• What phenomena on the tree are described by the other transitions?

• Can we make population dynamics official? Find rigorous approximations for it?

(it would imply that 4.267 is an upper bound on the threshold, by the results of [Franz, Leone `03] and [Talagrand, Panchenko `03])

• About clustering: is “phase” and “cluster” really the same thing?
