1
Efficient Stochastic Local Search for MPE Solving
Frank Hutter
The University of British Columbia (UBC), Vancouver, Canada
Joint work with Holger Hoos (UBC) and Thomas Stützle (Darmstadt University of Technology, Germany)
2
SLS: general algorithmic framework for solving combinatorial problems
3
MPE in graphical models: many applications
4
Outline
Most probable explanation (MPE) problem
  Problem definition
  Previous work
SLS algorithms for MPE
  Illustration
  Previous SLS algorithms
  Guided Local Search (GLS) in detail
From Guided Local Search to GLS+
  Modifications
  Performance gains
Comparison to state-of-the-art
5
MPE - problem definition (in most general representation: factor graphs)
Given a factor graph:
  Discrete variables X = {X1, ..., Xn}
  Factors Φ = {φ1, ..., φm} over subsets of X
  A factor φi over variables Vi ⊆ X assigns a non-negative number to every complete instantiation vi of Vi
Find: a complete instantiation {x1, ..., xn} maximizing ∏_{i=1}^{m} φi[x1, ..., xn]
NP-hard (simple reduction from SAT)
Also known as Max-product or Maximum a posteriori (MAP)
[Figure: factor graph with variables X1, ..., X6 connected to factors φ1, ..., φ8, a table for φ8 over X4, X5, X6, and an example instantiation (2, 1, 0, 0, 1, 0)]
6
Previous approaches for solving MPE
Variable elimination / junction tree
  Exponential in the graphical model's induced width
Approximation with loopy belief propagation and its generalizations [Yedidia, Freeman, Weiss '02]
Approximation with Mini-Buckets (MB) [Dechter & Rish '97] → also gives lower & upper bounds
Search algorithms
  Local search
  Branch and Bound with various MB heuristics [Dechter's group, '99-'05]
  UAI '03: B&B with MB heuristic shown to be state-of-the-art
7
Motivation for our work
B&B clearly outperforms the best SLS algorithm so far, even on random problem instances [Marinescu, Kask, Dechter, UAI '03]
MPE is closely related to weighted Max-SAT [Park '02]
For Max-SAT, SLS is state-of-the-art (at the very least for random problems)
Why is SLS not state-of-the-art for MPE?
  Additional problem structure inside the factors
  But for completely random problems?
SLS algorithms should be much better than they currently are
We took the best SLS algorithm so far (GLS) and improved it
8
Outline
Most probable explanation (MPE) problem
  Problem definition
  Previous work
SLS algorithms for MPE
  Illustration
  Previous SLS algorithms
  Guided Local Search (GLS) in detail
From Guided Local Search to GLS+
  Modifications
  Performance gains
Comparison to state-of-the-art
9
SLS for MPE - illustration
φ1(X1): 0 → 0; 1 → 21.2; 2 → 0.1
φ2(X1, X2): (0,0) → 21; (0,1) → 0.7; (1,0) → 0; (1,1) → 1; (2,0) → 0.9; (2,1) → 0.2
φ3(X1, X3): (0,0) → 1.1; (0,1) → 23; (1,0) → 0; (1,1) → 0.7; (2,0) → 2.7; (2,1) → 42
φ4(X3): 0 → 0.9; 1 → 0.1
φ5(X2, X3, X4): (0,0,0) → 10; (0,0,1) → 0.9; (0,1,0) → 0; (0,1,1) → 100; (1,0,0) → 33.2; (1,0,1) → 0; (1,1,0) → 23.2; (1,1,1) → 13.7
Instantiation: (X1, X2, X3, X4) = (2, 1, 0, 0)
∏_{i=1}^{m} φi[2,1,0,0] = 0.1 * 0.2 * 2.7 * 0.9 * 33.2
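The product on this slide can be computed directly from the factor tables. A minimal sketch in Python (the tables and the instantiation are the ones shown on the slide; the dictionary encoding is my own):

```python
# Factor tables from the illustration, keyed by the variables they range over.
factors = {
    ("X1",): {(0,): 0.0, (1,): 21.2, (2,): 0.1},
    ("X1", "X2"): {(0, 0): 21.0, (0, 1): 0.7, (1, 0): 0.0,
                   (1, 1): 1.0, (2, 0): 0.9, (2, 1): 0.2},
    ("X1", "X3"): {(0, 0): 1.1, (0, 1): 23.0, (1, 0): 0.0,
                   (1, 1): 0.7, (2, 0): 2.7, (2, 1): 42.0},
    ("X3",): {(0,): 0.9, (1,): 0.1},
    ("X2", "X3", "X4"): {(0, 0, 0): 10.0, (0, 0, 1): 0.9, (0, 1, 0): 0.0,
                         (0, 1, 1): 100.0, (1, 0, 0): 33.2, (1, 0, 1): 0.0,
                         (1, 1, 0): 23.2, (1, 1, 1): 13.7},
}

def objective(x):
    """Product over all factors of the entry selected by instantiation x."""
    prod = 1.0
    for scope, table in factors.items():
        prod *= table[tuple(x[v] for v in scope)]
    return prod

x = {"X1": 2, "X2": 1, "X3": 0, "X4": 0}
print(objective(x))  # 0.1 * 0.2 * 2.7 * 0.9 * 33.2 ≈ 1.61352
```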
10
SLS for MPE - illustration (after flipping X2: 1 → 0)
[Same factors φ1, ..., φ5 as on the previous slide]
Instantiation: (X1, X2, X3, X4) = (2, 0, 0, 0)
∏_{i=1}^{m} φi[2,0,0,0] = ∏_{i=1}^{m} φi[2,1,0,0] * 0.9/0.2 * 10/33.2
Only the factors containing X2 (here φ2 and φ5) change their value, so the objective can be updated incrementally.
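The incremental update on this slide, as a sketch: after a single-variable flip, only the factors whose scope contains that variable contribute ratios, so a full recomputation is unnecessary. The factor tables are the ones from the illustration; the function names are mine:

```python
# Factor tables from the illustration, keyed by the variables they range over.
factors = {
    ("X1",): {(0,): 0.0, (1,): 21.2, (2,): 0.1},
    ("X1", "X2"): {(0, 0): 21.0, (0, 1): 0.7, (1, 0): 0.0,
                   (1, 1): 1.0, (2, 0): 0.9, (2, 1): 0.2},
    ("X1", "X3"): {(0, 0): 1.1, (0, 1): 23.0, (1, 0): 0.0,
                   (1, 1): 0.7, (2, 0): 2.7, (2, 1): 42.0},
    ("X3",): {(0,): 0.9, (1,): 0.1},
    ("X2", "X3", "X4"): {(0, 0, 0): 10.0, (0, 0, 1): 0.9, (0, 1, 0): 0.0,
                         (0, 1, 1): 100.0, (1, 0, 0): 33.2, (1, 0, 1): 0.0,
                         (1, 1, 0): 23.2, (1, 1, 1): 13.7},
}

def objective(x):
    """Full recomputation: product of all selected factor entries."""
    prod = 1.0
    for scope, table in factors.items():
        prod *= table[tuple(x[v] for v in scope)]
    return prod

def flip_value(x, var, new_value, old_objective):
    """Incremental update: rescale the old objective by the ratio of each
    factor containing var (valid as long as no old entry is zero)."""
    ratio = 1.0
    for scope, table in factors.items():
        if var in scope:
            old_entry = table[tuple(x[v] for v in scope)]
            new_entry = table[tuple(new_value if v == var else x[v] for v in scope)]
            ratio *= new_entry / old_entry
    return old_objective * ratio

x = {"X1": 2, "X2": 1, "X3": 0, "X4": 0}
old = objective(x)                 # 0.1 * 0.2 * 2.7 * 0.9 * 33.2
new = flip_value(x, "X2", 0, old)  # old * (0.9 / 0.2) * (10 / 33.2)
x["X2"] = 0
assert abs(new - objective(x)) < 1e-9  # matches a full recomputation
```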
11
Previous SLS algorithms for MPE
Iterative Conditional Modes [Besag '86]
  Just greedy hill climbing
Stochastic Simulation
  Sampling algorithm, very poor for optimization
Greedy + Stochastic Simulation [Kask & Dechter '99]
  Outperforms the above & simulated annealing by orders of magnitude
Guided Local Search (GLS) [Park '02] (Iterated Local Search (ILS) [Hutter '04])
  Outperforms Greedy + Stochastic Simulation by orders of magnitude
12
Guided Local Search (GLS) [Voudouris 1997]
Subclass of Dynamic Local Search [Hoos & Stützle, 2004]:
Iteratively:
  1) Local search → local optimum
  2) Modify evaluation function
In local optima: penalize some solution features
  Solution features for MPE are partial assignments
  Evaluation fct. = objective fct. − sum of respective penalties
  Penalty update rule experimentally designed
Performs very well across many problem classes
13
GLS for MPE [Park 2002]
Initialize penalties to 0
Evaluation function: objective function − sum of penalties of current instantiation:
  ∏_{i=1}^{m} φi[x1,...,xn] − Σ_{i=1}^{p} λi[x1,...,xn]   (λi: penalties)
In local optimum:
  Choose partial instantiations (according to GLS update rule)
  Increment their penalty by 1
Every N local optima:
  Smooth all penalties by multiplying them with ρ < 1
  Important to eventually optimize the original objective function
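The loop above can be sketched on the toy factors from the illustration slides. This is a simplified sketch, not Park's implementation: every factor entry selected by a local optimum is penalized (Park's utility-based feature selection is omitted), moves are deterministic best-improvement, and the parameter values are arbitrary:

```python
# Toy factor tables and domains from the illustration slides.
factors = {
    ("X1",): {(0,): 0.0, (1,): 21.2, (2,): 0.1},
    ("X1", "X2"): {(0, 0): 21.0, (0, 1): 0.7, (1, 0): 0.0,
                   (1, 1): 1.0, (2, 0): 0.9, (2, 1): 0.2},
    ("X1", "X3"): {(0, 0): 1.1, (0, 1): 23.0, (1, 0): 0.0,
                   (1, 1): 0.7, (2, 0): 2.7, (2, 1): 42.0},
    ("X3",): {(0,): 0.9, (1,): 0.1},
    ("X2", "X3", "X4"): {(0, 0, 0): 10.0, (0, 0, 1): 0.9, (0, 1, 0): 0.0,
                         (0, 1, 1): 100.0, (1, 0, 0): 33.2, (1, 0, 1): 0.0,
                         (1, 1, 0): 23.2, (1, 1, 1): 13.7},
}
domains = {"X1": (0, 1, 2), "X2": (0, 1), "X3": (0, 1), "X4": (0, 1)}

def objective(x):
    prod = 1.0
    for scope, table in factors.items():
        prod *= table[tuple(x[v] for v in scope)]
    return prod

def evaluation(x, penalties):
    """GLS evaluation: objective minus the penalties of the selected entries."""
    pen = sum(penalties[scope, tuple(x[v] for v in scope)] for scope in factors)
    return objective(x) - pen

def gls(start, steps=200, n_smooth=5, rho=0.99):
    x = dict(start)
    penalties = {(scope, entry): 0.0
                 for scope, table in factors.items() for entry in table}
    best_x, best_val = dict(x), objective(x)
    num_local_optima = 0
    for _ in range(steps):
        # Deterministic best-improvement step on the evaluation function.
        best_move, best_eval = None, evaluation(x, penalties)
        for var, dom in domains.items():
            for val in dom:
                if val != x[var]:
                    y = dict(x, **{var: val})
                    e = evaluation(y, penalties)
                    if e > best_eval:
                        best_move, best_eval = (var, val), e
        if best_move is None:
            # Local optimum: penalize the factor entries of the current state.
            for scope in factors:
                penalties[scope, tuple(x[v] for v in scope)] += 1.0
            num_local_optima += 1
            if num_local_optima % n_smooth == 0:
                for key in penalties:  # smoothing: re-focus on the objective
                    penalties[key] *= rho
        else:
            x[best_move[0]] = best_move[1]
        if objective(x) > best_val:    # incumbent tracks the true objective
            best_x, best_val = dict(x), objective(x)
    return best_x, best_val

best_x, best_val = gls({"X1": 2, "X2": 1, "X3": 1, "X4": 1})
# The global optimum of this toy model is (X1,X2,X3,X4) = (2,0,1,1), value 37.8.
```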
14
Outline
Most probable explanation (MPE) problem
  Problem definition
  Previous work
SLS algorithms for MPE
  Illustration
  Previous SLS algorithms
  Guided Local Search (GLS) in detail
From Guided Local Search to GLS+
  Modifications
  Performance gains
Comparison to state-of-the-art
15
GLS → GLS+: Overview of modified components
Modified evaluation function
  Pay more attention to the actual objective function
Improved caching of evaluation function
  Straightforward adaptation from SAT caching schemes
Tuning of smoothing parameter ρ
  Over two orders of magnitude improvement!
Initialization with Mini-Buckets instead of random
  Was shown to perform better by [Kask & Dechter, 1999]
16
GLS → GLS+ (1): Modified evaluation function
GLS:
  ∏_{i=1}^{m} φi[x1,...,xn] − Σ_{i=1}^{p} λi[x1,...,xn]
  Product of entries minus sum of penalties
  ≈ zero minus sum of penalties → almost neglecting the objective function
GLS+:
  Σ_{i=1}^{m} log(φi[x1,...,xn]) − Σ_{i=1}^{p} λi[x1,...,xn]
  Use logarithmic objective function
  Very simple, but much better results
  Penalties are now just new temporary factors that decay over time!
  Could be improved by dynamic weighting of the penalties
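A small numeric illustration of the "zero minus sum of penalties" point (the entry values and the factor count are made up for the demo): with a few hundred factors, the product of entries underflows to zero in double precision, so the GLS evaluation can no longer distinguish assignments, while the sum of logs still ranks them correctly.

```python
import math

# Hypothetical factor entries selected by two assignments in a 500-factor model.
entries_a = [0.01] * 500
entries_b = [0.02] * 500   # b is better in every single factor

prod_a = math.prod(entries_a)
prod_b = math.prod(entries_b)
print(prod_a, prod_b)      # both underflow to 0.0: the GLS evaluation is blind

log_a = sum(math.log(e) for e in entries_a)
log_b = sum(math.log(e) for e in entries_b)
print(log_a < log_b)       # True: the GLS+ evaluation still prefers b
```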
17
GLS → GLS+ (1): Modified evaluation function
Much faster in early stages of the search
Speedups of about 1 order of magnitude
[Figure: solution quality over time for GLS and GLS+]
18
GLS → GLS+ (2): Speedups by caching
Time complexity for a single best-improvement step:
  Previously best caching: Θ(|V| × |D_V| × δ_V)
  Improved caching: Θ(|V_improving| × |D_V|)
[Figure: caching speedups]
19
GLS → GLS+ (3): Tuning the smoothing factor ρ
[Park '02] stated GLS to have ``no parameters´´
Changing ρ from Park's setting 0.8 to 0.99:
  Sometimes from unsolvable to milliseconds
  Effect increases for large instances
[Figure: solution quality over time for ρ = 0.99, ρ = 0.999, and ρ = 1]
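Why ρ matters so much can be seen with a little arithmetic (the round count of 50 is illustrative): a penalty is scaled by ρ at every smoothing round, so after k rounds a fraction ρ^k of it remains. With Park's ρ = 0.8, penalties are essentially forgotten after a few dozen smoothings; with ρ = 0.99 they persist far longer and keep diversifying the search.

```python
# Fraction of a penalty remaining after k smoothing rounds, for two settings of rho.
def remaining(rho, k):
    return rho ** k

print(remaining(0.8, 50))   # ~1.4e-05: the penalty is effectively gone
print(remaining(0.99, 50))  # ~0.6: the penalty is still in force
```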
20
GLS → GLS+ (4): Initialization with Mini-Buckets
Sometimes a bit worse, sometimes much better
Particularly helps for some structured instances
21
Outline
Most probable explanation (MPE) problem
  Problem definition
  Previous work
SLS algorithms for MPE
  Illustration
  Previous SLS algorithms
  Guided Local Search (GLS) in detail
From Guided Local Search to GLS+
  Modifications
  Performance gains
Comparison to state-of-the-art
22
Comparison based on [Marinescu, Kask, Dechter, UAI '03]
Branch & Bound with MB heuristic was state-of-the-art for MPE, even for random instances!
  Scales better than original GLS with
    number of variables
    domain size
  Both as anytime algorithm and in terms of time needed to find the optimum
On the same problem instances, we show that our new GLS+ scales better than their implementation with
  number of variables
  domain size
  density
  induced width
23
Benchmark instances
Randomly generated Bayes nets
  Graph structure: completely random / grid networks
  Controlled number of variables & domain size
Random networks with controlled induced width
Bayesian networks from the Bayes net repository
24
Original GLS vs. B&B with MB heuristic:
relative solution quality after 100 seconds for random grid networks of size N×N
[Figure: results for small, medium, and large networks]
25
GLS+ vs. GLS and B&B with MB heuristic:
relative solution quality after 100 seconds for random grid networks of size N×N
[Figure: results for small, medium, and large networks]
26
GLS+ vs. B&B with MB heuristic:
solution time with increasing domain size on random networks
[Figure: results for small, medium, and large domain sizes]
27
Solution times with increasing induced width on random networks
[Figure: solution times for s-BBMB, d-BBMB, original GLS, and GLS+]
28
Results for Bayes net repository
GLS+ shows overall best performance
  Only algorithm to solve Link network (in 1 second!)
  Problems for Barley and especially Diabetes
Preprocessing with partial variable elimination helps a lot
  Can reduce #(variables) dramatically
29
Conclusions
SLS algorithms are competitive for MPE solving
  Scale very well, especially with induced width
  But they need careful design, analysis & parameter tuning
SLS and Machine Learning (ML) people should talk
  SLS can perform very well for some traditional ML problems
Our C source code is online
  Please use it
  There's also a Matlab interface
30
Extensions in progress
Real problem domains
  MRFs for stereo vision
  CRFs for sketch recognition
Domain-dependent extensions
  Hierarchical SLS for problems in computer vision
Automated parameter tuning
  Use Machine Learning to predict runtime for different settings of algorithm parameters
  Use parameter setting with lowest predicted runtime
31
The End
Thanks to
  Holger Hoos & Thomas Stützle
  Radu Marinescu for their B&B code
  You for your attention