AI-Augmented Algorithms – How I Learned to Stop Worrying and Love Choice

Lars Kotthoff
University of Wyoming · [email protected]

TU Eindhoven, 09 July 2018
1
Outline
▷ Big Picture
▷ Motivation
▷ Algorithm Selection and Portfolios
▷ Algorithm Configuration
▷ Outlook
2
Big Picture
▷ advance the state of the art through meta-algorithmic techniques
▷ rather than inventing new things, use existing things more intelligently – automatically
▷ invent new things through combinations of existing things
3
Motivation – What Difference Does It Make?
4
Prominent Application
Fréchette, Alexandre, Neil Newman, and Kevin Leyton-Brown. “Solving the Station Repacking Problem.” In Association for the Advancement of Artificial Intelligence (AAAI), 2016.
5
Performance Differences
[Scatter plot: runtimes on a log scale (0.1 to 1000 s) of the Virtual Best SAT solver vs. the Virtual Best CSP solver, showing large performance differences between the two approaches.]
Hurley, Barry, Lars Kotthoff, Yuri Malitsky, and Barry O’Sullivan. “Proteus: A Hierarchical Portfolio of Solvers and Transformations.” In CPAIOR, 2014.
6
Leveraging the Differences
Xu, Lin, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. “SATzilla: Portfolio-Based Algorithm Selection for SAT.” J. Artif. Intell. Res. (JAIR) 32 (2008): 565–606.
7
Performance Improvements
Configuration of a SAT Solver for Verification [Hutter et al., 2007]
Ran FocusedILS, 2 days × 10 machines
– On a training set from each benchmark
Compared to manually-engineered default
– 1 week of performance tuning
– Competitive with the state of the art
– Comparison on unseen test instances
[Two scatter plots comparing runtimes on a log scale (10^−2 to 10^4 s): SPEAR, original default (s) vs. SPEAR, optimized for IBM-BMC (s) — 4.5-fold speedup on hardware verification; and SPEAR, original default (s) vs. SPEAR, optimized for SWV (s) — 500-fold speedup, which won category QF BV in the 2007 SMT competition.]

(Slide adapted from Hutter & Lindauer, AC Tutorial, AAAI 2016, Phoenix, USA.)
Hutter, Frank, Domagoj Babic, Holger H. Hoos, and Alan J. Hu. “Boosting Verification by Automatic Tuning of Decision Procedures.” In FMCAD ’07: Proceedings of the Formal Methods in Computer Aided Design, 27–34. Washington, DC, USA: IEEE Computer Society, 2007.

8
Common Theme
Performance models of black-box processes
▷ also called surrogate models
▷ replace expensive underlying process with cheap approximate model
▷ build approximate model based on real evaluations using machine learning techniques
▷ no knowledge of what the underlying process does is required (but can be helpful)
▷ allow better understanding of the underlying process through interrogation of the model (a sketch follows below)
9
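To make this concrete, here is a minimal sketch of a surrogate model in Python, assuming we already have measured costs for a set of configurations; the synthetic data, variable names, and the choice of a random forest are illustrative, not the specific setup used in the talk.

```python
# Minimal surrogate-model sketch: fit a cheap model to real evaluations
# of an expensive black-box process, then interrogate the model instead.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Pretend these are real evaluations of the expensive process:
# each row is a parameter configuration, each entry of `costs` its cost.
configs = rng.uniform(0, 1, size=(50, 3))
costs = np.sin(configs).sum(axis=1) + rng.normal(0, 0.1, size=50)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(configs, costs)

# The model is now a cheap stand-in: query it anywhere without
# running the underlying process.
candidate = rng.uniform(0, 1, size=(1, 3))
print(model.predict(candidate))
```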
Algorithm Selection
10
Algorithm Selection
Given a problem, choose the best algorithm to solve it.
Rice, John R. “The Algorithm Selection Problem.” Advances in Computers 15 (1976): 65–118.
11
Algorithm Selection
[Diagram: a portfolio of algorithms (Algorithm 1–3) and a set of training instances (Instance 1–3) pass through feature extraction to train a performance model; for new instances (Instance 4, 5, 6, …), feature extraction plus the model yield a selection, e.g. Instance 4: Algorithm 2, Instance 5: Algorithm 3, Instance 6: Algorithm 3.]
12
Algorithm Portfolios
▷ instead of a single algorithm, use several complementary algorithms
▷ idea from Economics – minimise risk by spreading it out across several securities
▷ same for computational problems – minimise risk of an algorithm performing poorly
▷ in practice often constructed from competition winners
Huberman, Bernardo A., Rajan M. Lukose, and Tad Hogg. “An Economics Approach to Hard Computational Problems.” Science 275, no. 5296 (1997): 51–54. doi:10.1126/science.275.5296.51.
13
Algorithms
“algorithm” used in a very loose sense
▷ algorithms
▷ heuristics
▷ machine learning models
▷ …
14
Parallel Portfolios
Why not simply run all algorithms in parallel? (a naive sketch follows below)
▷ not enough resources may be available/waste of resources
▷ algorithms may be parallelized themselves
▷ memory contention
15
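As a toy illustration of the naive approach this slide argues against, the sketch below runs two made-up stand-in “solvers” concurrently and takes the first result; solve_a and solve_b are hypothetical, and the slower run keeps consuming resources until it finishes — exactly the waste noted above.

```python
# Naive parallel portfolio: launch every solver on the same instance,
# return the first answer. The losers keep burning CPU.
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
import time

def solve_a(instance):
    time.sleep(2.0)            # pretend to work slowly
    return ("A", instance * 2)

def solve_b(instance):
    time.sleep(0.5)            # pretend to work quickly
    return ("B", instance * 2)

def portfolio_solve(instance, solvers):
    with ThreadPoolExecutor(max_workers=len(solvers)) as pool:
        futures = [pool.submit(s, instance) for s in solvers]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()

print(portfolio_solve(21, [solve_a, solve_b]))  # ('B', 42)
```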
Building an Algorithm Selection System
▷ most approaches rely on machine learning
▷ train with representative data, i.e. performance of all algorithms in the portfolio on a number of instances
▷ evaluate performance on a separate set of instances (see the sketch below)
▷ potentially large amount of prep work
16
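A minimal sketch of this protocol, with synthetic features and runtimes standing in for real benchmark data: train one regression model per portfolio algorithm on a training split, then compare the selector on held-out instances against the virtual best (oracle) and the single best algorithm.

```python
# Train/test protocol for an algorithm selection system (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
feats = rng.uniform(size=(200, 5))         # instance features
perf = rng.exponential(10, size=(200, 3))  # runtimes of 3 algorithms

F_tr, F_te, P_tr, P_te = train_test_split(
    feats, perf, test_size=0.3, random_state=1)

# One regression model per portfolio algorithm.
models = [RandomForestRegressor(random_state=1).fit(F_tr, P_tr[:, a])
          for a in range(perf.shape[1])]

# On unseen instances, pick the algorithm with the lowest predicted
# runtime; compare to the oracle and to always using the best algorithm.
pred = np.column_stack([m.predict(F_te) for m in models])
selected = P_te[np.arange(len(P_te)), pred.argmin(axis=1)].mean()
virtual_best = P_te.min(axis=1).mean()
single_best = P_te.mean(axis=0).min()
print(f"selector {selected:.2f}  vbest {virtual_best:.2f}  sbest {single_best:.2f}")
```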
Key Components of an Algorithm Selection System
▷ feature extraction
▷ performance model
▷ prediction-based selector/scheduler

optional:
▷ presolver
▷ secondary/hierarchical models and predictors (e.g. for feature extraction time)
17
Types of Performance Models
▷ regression models: one model per algorithm predicts its performance (e.g. A1: 1.2, A2: 4.5, A3: 3.9); select the algorithm with the best prediction
▷ classification model: a single model predicts the best algorithm for an instance directly
▷ pairwise classification models: one model per pair of algorithms (A1 vs. A2, A1 vs. A3, …) predicts which of the two is better; each prediction counts as a vote (e.g. A1: 1 vote, A2: 0 votes, A3: 2 votes) and the algorithm with the most votes is selected (a sketch follows below)
▷ pairwise regression models: one model per pair predicts the performance difference (A1 − A2, A1 − A3, …); the differences are aggregated per algorithm (e.g. A1: −1.3, A2: 0.4, A3: 1.7)

[Diagram: training instances (Instance 1, 2, 3, …) feed each model type; the output maps each instance to an algorithm, e.g. Instance 1: Algorithm 2, Instance 2: Algorithm 1, Instance 3: Algorithm 3.]
18
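A minimal sketch of the pairwise-classification scheme on the same kind of synthetic data as above: one classifier per algorithm pair, with votes aggregated to pick the winner. All names and data are illustrative.

```python
# Pairwise-classification selector: one "does i beat j?" model per pair.
import numpy as np
from itertools import combinations
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
feats = rng.uniform(size=(200, 5))
perf = rng.exponential(10, size=(200, 3))
n_algos = perf.shape[1]

pair_models = {}
for i, j in combinations(range(n_algos), 2):
    labels = (perf[:, i] < perf[:, j]).astype(int)  # 1 if i beats j
    pair_models[(i, j)] = RandomForestClassifier(random_state=2).fit(feats, labels)

def select(x):
    # Each pairwise prediction casts one vote; most votes wins
    # (ties broken arbitrarily by argmax).
    votes = np.zeros(n_algos)
    for (i, j), m in pair_models.items():
        winner = i if m.predict(x.reshape(1, -1))[0] == 1 else j
        votes[winner] += 1
    return votes.argmax()

print(select(rng.uniform(size=5)))
```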
Benchmark Library – ASlib
▷ currently 29 data sets/scenarios with more in preparation
▷ SAT, CSP, QBF, ASP, MAXSAT, OR, machine learning…
▷ includes data used frequently in the literature that you may want to evaluate your approach on
▷ performance of common approaches that you can compare to
▷ http://aslib.net
Bischl, Bernd, Pascal Kerschke, Lars Kotthoff, Marius Lindauer, Yuri Malitsky, Alexandre Fréchette, Holger H. Hoos, et al. “ASlib: A Benchmark Library for Algorithm Selection.” Artificial Intelligence Journal (AIJ), no. 237 (2016): 41–58.
19
(Much) More Information
http://larskotthoff.github.io/assurvey/
Kotthoff, Lars. “Algorithm Selection for Combinatorial Search Problems: A Survey.” AI Magazine 35, no. 3 (2014): 48–60.

20
Algorithm Configuration
21
Algorithm Configuration
Given a (set of) problem(s), find the best parameter configuration.
22
Parameters?
▷ anything you can change that makes sense to change
▷ e.g. search heuristic, variable ordering, type of global constraint decomposition
▷ not random seed, whether to enable debugging, etc.
▷ some will affect performance, others will have no effect at all
23
Automated Algorithm Configuration
▷ no background knowledge on parameters or algorithm
▷ as little manual intervention as possible
▷ failures are handled appropriately
▷ resources are not wasted
▷ can run unattended on large-scale compute infrastructure
24
Algorithm Configuration
Frank Hutter and Marius Lindauer, “Algorithm Configuration: A Hands-on Tutorial”, AAAI 2016.
25
General Approach
▷ evaluate algorithm as a black-box function
▷ observe effect of parameters without knowing the inner workings
▷ decide where to evaluate next
▷ balance diversification/exploration and intensification/exploitation
26
When are we done?
▷ most approaches incomplete
▷ cannot prove optimality, not guaranteed to find the optimal solution (in finite time)
▷ performance highly dependent on configuration space

→ How do we know when to stop?
27
Time Budget
How much time/how many function evaluations?
▷ too much → wasted resources
▷ too little → suboptimal result
▷ use statistical tests
▷ evaluate on parts of the instance set
▷ for runtime: adaptive capping (sketched below)
28
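A minimal sketch of adaptive capping, with a made-up true_runtime standing in for actually executing a solver under a time limit: a challenger configuration is cut off as soon as its accumulated runtime can no longer beat the incumbent's total on the instance set.

```python
# Adaptive capping: stop evaluating a challenger once it cannot win.
def true_runtime(config, instance):
    # Stand-in for executing the solver; just a made-up formula.
    return (config + 1) * instance

def run_with_cap(config, instance, cap):
    # Pretend to run under a time limit: observe min(true runtime, cap).
    return min(true_runtime(config, instance), cap)

def evaluate_challenger(config, instances, incumbent_total):
    total = 0.0
    for inst in instances:
        remaining = incumbent_total - total
        if remaining <= 0:           # cannot beat the incumbent any more
            return float("inf")
        total += run_with_cap(config, inst, cap=remaining)
    return total if total < incumbent_total else float("inf")

instances = [0.5, 1.0, 2.0, 4.0]
incumbent_total = sum(true_runtime(0.3, i) for i in instances)
print(evaluate_challenger(0.1, instances, incumbent_total))  # wins
print(evaluate_challenger(2.0, instances, incumbent_total))  # capped early
```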
Grid and Random Search
▷ evaluate certain points in parameter space (a random-search sketch follows below)
Bergstra, James, and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” J. Mach. Learn. Res. 13, no. 1 (February 2012): 281–305.
29
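A minimal sketch of random search over a two-parameter space; cost is a made-up stand-in for measuring the configured algorithm's performance.

```python
# Random search: sample configurations uniformly, keep the best.
import random

def cost(params):
    x, y = params["x"], params["y"]
    return (x - 0.3) ** 2 + (y - 0.7) ** 2   # made-up response surface

random.seed(0)
best, best_cost = None, float("inf")
for _ in range(100):                          # fixed evaluation budget
    params = {"x": random.uniform(0, 1), "y": random.uniform(0, 1)}
    c = cost(params)
    if c < best_cost:
        best, best_cost = params, c

print(best, best_cost)
```

Grid search would replace the sampling loop with a fixed lattice of points; Bergstra and Bengio's result is that random sampling typically covers the important parameters better for the same budget.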
Model-Based Search
▷ evaluate small number of configurations
▷ build model of parameter-performance surface based on the results
▷ use model to predict where to evaluate next
▷ repeat (the loop is sketched below)
▷ allows targeted exploration of new configurations
▷ can take instance features into account like algorithm selection
Hutter, Frank, Holger H. Hoos, and Kevin Leyton-Brown. “Sequential Model-Based Optimization for General Algorithm Configuration.” In LION 5, 507–23, 2011.
30
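A minimal sketch of this loop on a made-up black box: refit a random forest surrogate after each evaluation and evaluate the candidate with the best prediction next. A real configurator would optimise an acquisition function such as expected improvement (see the sketch after the example plots) rather than making this purely exploitative choice.

```python
# Sequential model-based optimisation loop (greedy variant, synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)

def cost(x):                                  # expensive black box (made up)
    return np.sin(3 * x) + 0.5 * x

X = list(rng.uniform(-1, 1, size=5))          # initial design
y = [cost(x) for x in X]

for _ in range(20):
    model = RandomForestRegressor(random_state=3)
    model.fit(np.array(X).reshape(-1, 1), y)  # surrogate of results so far
    cands = rng.uniform(-1, 1, size=100)      # random candidate proposals
    preds = model.predict(cands.reshape(-1, 1))
    nxt = cands[preds.argmin()]               # most promising candidate
    X.append(nxt)
    y.append(cost(nxt))

print(min(y))
```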
Model-Based Search
[Plot: the actual performance surface vs. the model's approximation, with parameter values on the x-axis and performance on the y-axis.]
31
Model-Based Search Example

[Sequence of ten plots, one per iteration: the surrogate's prediction (yhat) against the true values (y) and the expected improvement (ei) over x ∈ [−1.0, 1.0], with initial design points (init) and sequentially proposed points (prop, seq). The reported gap to the optimum is 1.9909e−01 at iterations 1–3, 1.9992e−01 at iterations 4–5, 1.9996e−01 at iteration 6, and 2.0000e−01 at iterations 7–10.]

32–41
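The "ei" curves in the plots above are the expected-improvement criterion. Here is a minimal sketch of its computation, assuming a surrogate that provides a predictive mean and standard deviation per candidate; the input numbers are made up.

```python
# Expected improvement for minimisation:
#   EI = (best - mu) * Phi(z) + sigma * phi(z),  z = (best - mu) / sigma
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best):
    # Clamp sigma to avoid division by zero; in the deterministic limit
    # EI reduces to max(best - mu, 0).
    sigma = np.maximum(sigma, 1e-12)
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

mu = np.array([0.2, 0.5, 0.1])       # predicted means
sigma = np.array([0.05, 0.3, 0.0])   # predicted uncertainties
print(expected_improvement(mu, sigma, best=0.15))
```

High EI comes either from a good predicted mean or from high uncertainty, which is how the criterion balances intensification and diversification.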
Benchmark Library – AClib
▷ ASP, MIP, planning, machine learning, …
▷ 4 algorithm configuration tools from the literature already integrated
▷ https://bitbucket.org/mlindauer/aclib2
Hutter, Frank, Manuel López-Ibáñez, Chris Fawcett, Marius Lindauer, Holger H. Hoos, Kevin Leyton-Brown, and Thomas Stützle. “AClib: A Benchmark Library for Algorithm Configuration.” In Learning and Intelligent Optimization, 36–40. Cham: Springer International Publishing, 2014.
42
Outlook
43
Quo Vadis, Software Engineering?
[Figure: software development workflow – Run]
Hoos, Holger H. “Programming by Optimization.” Communications of the Association for Computing Machinery (CACM) 55, no. 2 (February 2012): 70–80. https://doi.org/10.1145/2076450.2076469.
44
Quo Vadis, Software Engineering?
[Figure: the same workflow with AI in the loop – Run + AI]
Hoos, Holger H. “Programming by Optimization.” Communications of the Association for Computing Machinery (CACM) 55, no. 2 (February 2012): 70–80. https://doi.org/10.1145/2076450.2076469.
44
Meta-Algorithmics in the Physical Realm – AI and Lasers
45
Tools and Resources
LLAMA https://bitbucket.org/lkotthoff/llama
SATzilla http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/
iRace http://iridia.ulb.ac.be/irace/
mlrMBO https://github.com/mlr-org/mlrMBO
SMAC http://www.cs.ubc.ca/labs/beta/Projects/SMAC/
Spearmint https://github.com/HIPS/Spearmint
TPE https://jaberg.github.io/hyperopt/
autofolio https://bitbucket.org/mlindauer/autofolio/
Auto-WEKA http://www.cs.ubc.ca/labs/beta/Projects/autoweka/
Auto-sklearn https://github.com/automl/auto-sklearn
46
Summary
Algorithm Selection: choose the best algorithm for solving a problem

Algorithm Configuration: choose the best parameter configuration for solving a problem with an algorithm

▷ mature research areas
▷ can combine configuration and selection
▷ effective tools are available
▷ COnfiguration and SElection of ALgorithms group COSEAL
http://www.coseal.net
Don’t set parameters prematurely, embrace choice!
47
I’m hiring!
Several funded graduate positions available.
48