Perfect Sampling: The Basics
Mark Huber
Dept. of Mathematics and Inst. of Statistics and Decision Sciences
Duke University
[email protected]
www.math.duke.edu/~mhuber
The Problem

■ Start with a state space Ω and a finite measure μ
  For discrete Ω, know the measure of singletons: μ({x}), ∀x ∈ Ω
  For continuous Ω, know a density f: μ(A) = ∫_A f(x) dx
■ Goal: generate random variates from π, where π(A) = μ(A)/μ(Ω)
Usual approach

■ Construct a Markov chain with π as its stationary distribution
■ Can use Metropolis-Hastings or the Gibbs sampler without knowing μ(Ω)
■ Problem: difficult to find the mixing time of the Markov chain
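The point that Metropolis-Hastings needs only unnormalized weights can be made concrete. A minimal sketch (the toy target, the proposal on a 10-cycle, and all function names are my own, not from the slides):

```python
import random

def metropolis_step(x, weight, neighbors):
    """One Metropolis step on a discrete space.

    Only the ratio weight(y)/weight(x) is used, so the
    normalizing constant of the target is never computed.
    """
    y = random.choice(neighbors(x))           # symmetric proposal
    if random.random() < min(1.0, weight(y) / weight(x)):
        return y                              # accept
    return x                                  # reject

# Hypothetical example: target proportional to (i+1)^2 on {0,...,9},
# with a random-walk proposal on the cycle.
weight = lambda i: (i + 1) ** 2
neighbors = lambda i: [(i - 1) % 10, (i + 1) % 10]

x = 0
counts = [0] * 10
for _ in range(200_000):
    x = metropolis_step(x, weight, neighbors)
    counts[x] += 1
```

The catch, as the slide says, is deciding how long to run: nothing in this loop tells you when the chain has mixed.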
Where this arises...

■ Computer Science: approximation algorithms for #P-complete problems; permanent of a 0-1 matrix
■ Numerical Integration: acceptance/rejection needs a tight envelope; self-reducibility for somewhat smooth functions
■ Statistics: exact p-values, approximate probability intervals
■ Statistical Physics: Ising, hard-core, random cluster models
Numerical Integration

Monte Carlo Integration
Step 1) Draw N points X_1, ..., X_N from the area under f
Step 2) Form the order statistics X_(1), ..., X_(N)
Step 3) Let X_(N/4), X_(3N/4) be the new limits (they capture 1/2 of the area)
Numerical Integration Part II

Monte Carlo Integration
Step 4) Draw N points from the new area under f
Step 5) Use the median as the new limit
Step 6) Repeat until the interval is small
Numerical Integration Part III

Let r1 = initial length, r2 = b − a, r = number of times the area was split

Final area estimate ≈ 2^r · f((a+b)/2) · (b − a)
Numerical Integration Part IV

Let δ = probability the algorithm fails, ε = error from Monte Carlo

Total work ≈ 3 · log₂(r1/r2) · N · log(1/δ)

Best news: with d dimensions, repeat d times — work grows linearly with dimension!
Direct is not perfect

Exact Sampling:
  Direct Sampling — requires computing μ(Ω)
  Perfect Sampling — draws exactly from π without knowing μ(Ω)
Acceptance Rejection

Acc/Rej Algorithm:
1) Let X ~ Unif(Ω)
2) If X ∈ A, accept; else goto step 1

Features:
  Running time is geometric: P(T > k·E[T]) ≤ e^(−k)
  No need to know μ(A)
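The algorithm above fits in a few lines. A minimal sketch (the membership test and the divisible-by-7 toy set are my own illustration, not from the slides):

```python
import random

def acc_rej(omega, in_A):
    """Acceptance/rejection: draw X ~ Unif(omega) until X lands in A.

    The accepted draw is exactly uniform on A, and the size of A
    (the normalizing constant) is never computed.
    """
    while True:
        x = random.choice(omega)
        if in_A(x):
            return x

# Hypothetical example: a uniform point of {0,...,99} divisible by 7.
omega = list(range(100))
sample = acc_rej(omega, lambda x: x % 7 == 0)
```

The number of trials is geometric with success probability μ(A)/μ(Ω), which is where the running-time tail bound comes from.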
Properties of Perfect Sampling

■ Generates exactly from the desired distribution
■ Running time is random (Las Vegas algorithm): P(T > 2k·E[T]) ≤ 1/2^k
■ No knowledge of the normalizing constant needed
■ (Direct sampling uses knowledge of μ(Ω))
The Good News

■ Generates exactly from the desired distribution
■ Can be used for continuous or discrete Ω
■ True algorithms (Markov chain methods are not algorithms unless the mixing time is known)
■ Useful even if the running time is unknown
The Bad News

■ Not a magic solution to slow Markov chains
■ Requires more effort than Metropolis-Hastings
■ Methods are more complex
Perfect Sampling Methods

Protocols: general frameworks for creating perfect sampling algorithms
Techniques: specific tricks and methods for turning the protocols into algorithms
Protocols

Coupling Markov chains:
  Coupling from the past (Propp, Wilson 1996)
  Fill, Machida, Murdoch, Rosenthal (1999)
  Read-Once CFTP (Wilson 2000)
  High noise CFTP (Häggström, Steif 2000)

Modified Acceptance/Rejection:
  Popping Algorithms (Propp, Wilson 1998)
  Randomness Recycler (Fill, Huber 2001)
Techniques

How to build a better coupler:
  Monotonicity (Propp, Wilson 1996)
  Multigamma coupling (Murdoch, Green 1998)
  Bounding chains (Häggström, Nelander 1999), (H. 1999, 2004)
  Multishift coupling (Wilson 2000)
Coupling from the Past

How to describe a Markov chain?
  Update function (Propp, Wilson 1996)
  Stochastic Recursive Scheme (Borovkov, Foss 1992)
  Complete coupling (H. 2004)

Given a sequence of independent, identical uniforms U_1, U_2, ... ~ Unif[0,1], a deterministic function f, and a starting state x_0:
  X_0 = x_0,   X_{t+1} = f(X_t, U_t)
A simple example: Transposition chain on permutations

Say card i is in position σ(i). Example permutation: 3 4 7 2 1 6 5

The chain just swaps two cards at random: if σ1, σ2 differ by one transposition,
  P(X_{t+1} = σ1 | X_t = σ2) = 1/n²
More than one complete coupling...

Method 1: Let i ~ Unif{1,2,...,n}, j ~ Unif{1,2,...,n}. Swap cards i and j.

Method 2: Let i ~ Unif{1,2,...,n}, j ~ Unif{1,2,...,n}. Swap the cards at positions i and j.
The best one...

Best method: Let i ~ Unif{1,2,...,n}, j ~ Unif{1,2,...,n}. Swap card i and the card in position j.

The Key Fact: This chain can be run without knowing X_0!
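The best method above is easy to state as an update function. A minimal sketch (the 0-indexed positions and the name `update` are my own conventions):

```python
import random

def update(perm, i, j):
    """One step of the 'best' coupling: swap card i with the card
    currently in position j.  perm[p] = card sitting in position p
    (cards are 1-indexed, positions 0-indexed)."""
    pos_i = perm.index(i)            # where card i currently sits
    perm[pos_i], perm[j] = perm[j], perm[pos_i]
    return perm

# Run the chain from an arbitrary start.
n = 7
perm = list(range(1, n + 1))
for _ in range(1000):
    i = random.randint(1, n)         # a uniform card
    j = random.randrange(n)          # a uniform position
    perm = update(perm, i, j)
```

Note the useful property: immediately after `update(perm, i, j)`, card i is in position j no matter what the state was, which is what lets the chain be run from an unknown X_0.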
Bounding chain

Begin with unknown stationary state:
  ? ? ? ? ? ? ?

Choose card i = 3 and position j = 4. Swap card i into position j:
  ? ? ? 3 ? ? ?
Continuing...

The next steps (each line is the state after swapping card i into position j):

  (i, j)            state
                    ? ? ? 3 ? ? ?
  (2, 1)            2 ? ? 3 ? ? ?
  (5, 6)            2 ? ? 3 ? 5 ?
  (3, 3)            2 ? 3 ? ? 5 ?
  (2, 6)            5 ? 3 ? ? 2 ?
  ⋮
                    5 3 2 1 6 7 4
Once we have a coupler

Defn: F_t(x) := f(f(⋯f(x, U_0), ⋯, U_{t−1}), U_t)   (so X_0 = x ⟹ X_t = F_t(x))

CFTP(T)
1) Generate U_0, U_1, ..., U_T iid Unif[0,1]
2) If F_T(Ω) = {X}, then return X
3) Else let X_0 ← CFTP(2T), return F_T(X_0)
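A sketch of CFTP for the transposition chain with the card/position coupling. For illustration only, every starting state is tracked explicitly, which is feasible only for tiny n (real implementations use monotonicity or bounding chains instead); the function names and iterative doubling layout are my own:

```python
import random
from itertools import permutations

def step(perm, u):
    """Complete-coupling update: u = (card i, 0-indexed position j)."""
    i, j = u
    perm = list(perm)
    p = perm.index(i)
    perm[p], perm[j] = perm[j], perm[p]
    return tuple(perm)

def cftp(n):
    """Coupling from the past; returns an exactly uniform permutation
    of 1..n.  updates[k] holds the (i, j) used at time -(k+1); the same
    randomness MUST be reused each time T is doubled."""
    states = list(permutations(range(1, n + 1)))
    updates = []
    T = 1
    while True:
        while len(updates) < T:      # extend randomness further into the past
            updates.append((random.randint(1, n), random.randrange(n)))
        # Run every start state from time -T up to time 0.
        results = set()
        for s in states:
            x = s
            for u in reversed(updates[:T]):
                x = step(x, u)
            results.add(x)
        if len(results) == 1:        # F_T is constant: all starts coalesced
            return results.pop()
        T *= 2

sample = cftp(3)
```

Reusing the stored updates when T doubles is the "read-twice" requirement from the next slide: forgetting it (drawing fresh randomness each round) biases the output.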
It works

Thm: As long as F_T is constant with positive probability for some T, CFTP terminates with probability 1 and is a perfect sampling algorithm

Drawbacks:
  Read-Twice: need to store U_0, U_1, ..., U_T
  Noninterruptible: cannot abort the algorithm without biasing the sample
  As slow as the underlying Markov chain
Bounding chain: formal definition

Original state space: Ω ⊆ C^V
Bounding chain state space: Ω* ⊆ (2^C)^V

A chain on Ω* bounds one on Ω if they can be coupled so that
  X_t(v) ∈ Y_t(v) ∀ v ∈ V  ⟹  X_{t+1}(v) ∈ Y_{t+1}(v) ∀ v ∈ V
Other coupling methods

Monotonicity: an update function is monotonic if (for some partial ordering of Ω)
  X_t ≤ Y_t  ⟹  f(X_t, U_t) ≤ f(Y_t, U_t)

Track the minimum and maximum states; the stationary state stays sandwiched between them, and once they meet the chain has coupled.
More useful facts

■ Techniques exist for continuous state spaces: specially designed multigamma and multishift couplers
■ Bounding chains with Metropolis-Hastings work with continuous or discrete spaces
■ CFTP is always at least as good as (and usually faster than) acceptance/rejection
Multishift coupler

Use Metropolis-Hastings with a proposal uniform centered at the current location:
if the proposal is accepted for the entire interval, then the whole interval couples to a single point
Acceptance/Rejection Revisited

Acc/Rej(n)
1) For i ∈ {1, ..., n}
2)   Generate X(i) ~ Unif{1, ..., n}
3)   If X(i) = X(j) for some j < i, start the algorithm over again
4) Return X
Why no one does this

Running Time: n^n / n! ≈ e^n / √(2πn)

Solution: Recycle!
Randomness Recycler

Framework (Fill, Huber 2001):
  Build up the variate one coordinate at a time
  If you accept that coordinate, keep going
  Else recycle what you can, keep going

Example: permutations — Acc Acc Acc Rej
What is the distribution of [X(1) X(2) X(3)] given that X(4) was rejected?
Effect of rejection

The events of interest:
  A := ([X(1) X(2) X(3)] = [x1 x2 x3])
  B := (X(4) rejected)

The calculation (with n = 7):
  P(A | B) = P(A, B)/P(B) = [(n(n−1)(n−2))⁻¹ · (3/7)] / (3/7) = (n(n−1)(n−2))⁻¹

The result: the prefix is still uniform over distinct triples — can recycle all but the last element!
The RR algorithm

Randomness Recycler for Permutations(n)
1) For i ∈ {1, ..., n}
2)   Generate X(i) ~ Unif{1, ..., n}
3)   If X(i) = X(j) for some j < i, goto step 2)
4) Return X
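The only change from Acc/Rej(n) is that a collision redraws one coordinate instead of restarting everything. A minimal sketch (function name is my own):

```python
import random

def rr_permutation(n):
    """Randomness Recycler for a uniform permutation of 1..n.

    On a collision, only the offending coordinate X(i) is redrawn;
    the accepted prefix is kept, since (per the P(A|B) calculation)
    it remains uniform over distinct values after a rejection.
    """
    x = []
    for _ in range(n):
        xi = random.randint(1, n)
        while xi in x:               # rejected: recycle the prefix, redraw X(i)
            xi = random.randint(1, n)
        x.append(xi)
    return x

perm = rr_permutation(10)
```

Expected work drops from roughly e^n restarts to the harmonic-sum bound on the next slides.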
RR for general problems

Bivariate chain: (X*_t, X_t), where X_t ~ π_{X*_t}

At each step, (X*_t, X_t) moves to
  (X*_acc, X_{t+1})   on acceptance, or
  (X*_rej, X_t)       on rejection

Quit when X*_t equals the original target distribution
Notes on RR

■ The user gets to choose X*_acc
■ The closer X*_acc is to X*_t, the higher the probability of acceptance
■ Generally, the closer X*_acc, the farther away X*_rej must be
Compare and Contrast

Randomness Recycler:
  Faster
  Interruptible
  Read-Once
  Forward direction (no recursion)
  Harder to build
  Related to strong stationary times of Markov chains

Coupling from the past:
  Can utilize existing Markov chains for problems
  Noninterruptible, Read-Twice, slower (can fix exactly one of these with modifications)
  Related to coupling times of Markov chains
Running times for permutations

Randomness Recycler:
  n/n + n/(n−1) + ⋯ + n/1 ≤ n(1 + ln n)

Coupling from the past:
  (n/n)² + (n/(n−1))² + ⋯ + (n/1)² ≤ n²·π²/6
Another Example

Hard core gas model (physics: gases, computer science: network failures)

Assign each node v in a graph a value x(v) ∈ {0,1}. Given constants λ_v:

  π(x) ∝ [∏_{nodes v} λ_v^{x(v)}] · [∏_{v~w} 1(x(v) + x(w) ≤ 1)]

(the first factor is the activity; the second forces x to be an independent set)
In pictures

[Figure: with λ_v identically small, a sample is nearly all 0's; with λ_v identically large, a sample is a dense independent set of 1's]
Markov chain

Gibbs Sampler
1) Let v ~ Unif(set of nodes)
2) Let U ~ Unif[0,1]
3) If all neighbors w of v have X(w) = 0 and U ≥ 1/(1+λ_v), let X(v) ← 1
4) Else let X(v) ← 0

[Figure: example update with U = .8]
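The Gibbs update above translates directly into code. A minimal sketch (the adjacency-list representation, the 4-cycle example, and the function name are my own):

```python
import random

def gibbs_step(x, graph, lam):
    """One Gibbs update for the hard core model.

    graph[v] lists the neighbors of node v; x[v] in {0,1};
    lam[v] is the activity of node v.
    """
    v = random.randrange(len(graph))
    u = random.random()
    # Set x[v] = 1 with probability lam[v]/(1+lam[v]),
    # but only if no neighbor currently holds a 1.
    if all(x[w] == 0 for w in graph[v]) and u >= 1.0 / (1.0 + lam[v]):
        x[v] = 1
    else:
        x[v] = 0
    return x

# Hypothetical example: a 4-cycle with all activities equal to 1.
graph = [[1, 3], [0, 2], [1, 3], [0, 2]]
lam = [1.0] * 4
x = [0, 0, 0, 0]
for _ in range(10_000):
    x = gibbs_step(x, graph, lam)
```

Starting from the all-zero state, every update preserves the independent-set constraint, so the chain never leaves the support of π.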
Bounding chain

A Good Move: when U ≤ 1/(1+λ_v), always set X(v) ← 0 — the new value at v is known even when neighboring entries are still '?'
Bounding chain

A Bad Move: if U ≥ 1/(1+λ_v) and some neighbor of v is still '?', we don't know X(v)

When λ ≤ 2/(Δ−2), there are more good moves than bad moves
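The good and bad moves can be sketched as a bounding-chain update over states in {'0', '1', '?'}. This is my own minimal rendering of the idea (names and the 4-cycle example are assumptions, not from the slides):

```python
import random

def bounding_step(y, graph, lam):
    """One bounding-chain update for the hard core Gibbs sampler.

    y[v] is '0', '1', or '?' ('?' = the true chain's value is unknown).
    """
    v = random.randrange(len(graph))
    u = random.random()
    if u <= 1.0 / (1.0 + lam[v]):
        y[v] = '0'                        # good move: X(v)=0 regardless of state
    elif all(y[w] == '0' for w in graph[v]):
        y[v] = '1'                        # every neighbor is surely 0, so X(v)=1
    elif any(y[w] == '1' for w in graph[v]):
        y[v] = '0'                        # a neighbor is surely 1, so X(v)=0
    else:
        y[v] = '?'                        # bad move: outcome depends on a '?'
    return y

# Hypothetical example: run on a 4-cycle until no '?' remains;
# at that point the bounding chain pins down the true state exactly.
graph = [[1, 3], [0, 2], [1, 3], [0, 2]]
lam = [0.5] * 4
y = ['?'] * 4
steps = 0
while '?' in y:
    y = bounding_step(y, graph, lam)
    steps += 1
```

When every '?' is gone, the bounding state determines the underlying chain's state, which is exactly the coalescence event CFTP waits for.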
Randomness Recycler

RR for Hard Core Gas model
1) Start with λ'(v) = 0, ∀v
2) While ∃v : λ'(v) = 0
3)   Let U ~ Unif[0,1], X(v) ← 1(U ≤ λ_v/(1+λ_v))
4)   Let λ'(v) ← λ_v
5)   If no conflicts, accept and keep λ'(v) = λ_v
6)   Else reject and recycle (reset λ'(v) = 0 where needed)
Recycle

Suppose U = .4 causes a conflict: the newly drawn 1 at v sits next to an existing 1.

Recycle by resetting the neighbors of v (their λ' values go back to 0).

[Figure: node values before and after recycling]
RR Type II

Hard core gas model (physics: gases, computer science: network failures)

Assign each node v in a graph a value x(v) ∈ {0,1}. Given constants λ_v:

  π(x) ∝ [∏_{nodes v} λ_v^{x(v)}] · [∏_{v~w} 1(x(v) + x(w) ≤ 1)]

The first RR relaxed the activity constraint (the first factor); the second RR relaxes the independent set constraint (the second factor)
Randomness Recycler II

RR for Hard Core Gas model using edges
1) Start with no edges in the graph
2) While some edges are not in the graph
3)   Add an edge back to the graph
4)   If no conflicts, accept the edge
5)   Else reject and recycle
Recycle Edges Part I

[Figure: adding back an edge between two nodes that both hold a 1 causes a conflict]
Recycle Edges Part II

1) Remove the contaminated edges
2) Reroll values for the affected nodes

[Figure: the graph after removing contaminated edges and rerolling]
Analysis

Randomness Recycler:    runtime O(n),       valid for λ ≤ 4/(3Δ−2)
Coupling from the past: runtime O(n ln n),  valid for λ ≤ 2/(Δ−2)

Δ := maximum degree of the graph
Applications of perfect sampling

■ Ising model (random cluster model)
■ Proper colorings of a graph (Potts model)
■ Widom-Rowlinson model
■ Move ahead 1 chain
■ Hard core gas model (discrete and continuous)
■ Soft (penetrating) core gas models
■ Linear extensions of permutations
■ Regular, dense restricted permutations of a graph
■ Sink free orientations of a graph
■ Bayesian analysis: unknown mixture problems
■ Multivariate normals in the positive orthant
■ Exact p-values for nonparametric regression
■ Orthonormal model selection
What we know

■ True algorithms: no need to know the mixing time of anything
■ Several different types: Coupling from the past (and variants), Randomness Recycler
■ No knowledge of the normalizing constant needed
■ Works on continuous and discrete problems
What we would like to know

■ Crossover potential: could monotonicity be used with RR? How about bounding chains?
■ Conductance: couplings are related to CFTP, strong stationary times to RR. The third major method for proving rapid mixing of Markov chains is conductance. Can we design a protocol that uses conductance?
■ Must the running time be random?
References

A. A. Borovkov and S. G. Foss. Stochastically recursive sequences and their generalizations. Siberian Advances in Mathematics, 2(1):16–81, 1992.

J. A. Fill and M. L. Huber. The Randomness Recycler: a new approach to perfect sampling. In Proc. 41st Sympos. on Foundations of Comp. Sci., 503–511, 2000.

O. Häggström and J. E. Steif. Propp-Wilson algorithms and finitary codings for high noise Markov random fields. Combin. Probab. Computing, 9:425–439, 2000.

M. Huber. Perfect sampling using bounding chains. Annals of Applied Probability, 2004, to appear.

M. Huber. Perfect sampling with bounding chains. PhD thesis, Cornell University, 1999.
References

D. J. Murdoch and P. J. Green. Exact sampling from a continuous state space. Scand. J. Statist., 25(3):483–502, 1998.

J. G. Propp and D. B. Wilson. Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures & Algorithms, 9(1–2):223–252, 1996.

D. B. Wilson. How to couple from the past using a read-once source of randomness. Random Structures & Algorithms, 16(1):85–113, 2000.

My website: http://www.math.duke.edu/~mhuber
David Wilson's perfect sampling page: http://dbwilson.com/exact/