37
HEURISTIC OPTIMIZATION Automatic Algorithm Configuration Thomas St¨ utzle

Automatic Algorithm Configuration - ULBiridia.ulb.ac.be/~stuetzle/Teaching/HO16/Slides/Lecture... · 2017-02-13 · Design choices and parameters everywhere Todays high-performance

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

HEURISTIC OPTIMIZATION

Automatic Algorithm Configuration

Thomas Stutzle

Optimization problems arise everywhere!

Most such problems are computationally very hard (NP-hard!)Heuristic Optimization 2016 2

The algorithmic solution of hard optimizationproblems is one of the OR/CS success stories!

I Exact (systematic search) algorithms

I Branch&Bound, Branch&Cut, constraint programming, . . .

I powerful general-purpose software available

I guarantees on optimality but often time/memory consuming

I Approximate algorithms

I heuristics, local search, metaheuristics, hyperheuristics . . .

I typically special-purpose software

I rarely provable guarantees but often fast and accurate

Much active research on hybrids betweenexact and approximate algorithms!

Heuristic Optimization 2016 3

Design choices and parameters everywhere

Todays high-performance optimizers involve a largenumber of design choices and parameter settings

I exact solversI design choices include alternative models, pre-processing,

variable selection, value selection, branching rules . . .I many design choices have associated numerical parametersI example: SCIP 3.0.1 solver (fastest non-commercial MIP

solver) has more than 200 relevant parameters that influencethe solver’s search mechanism

I approximate algorithmsI design choices include solution representation, operators,

neighborhoods, pre-processing, strategies, . . .I many design choices have associated numerical parametersI example: multi-objective ACO algorithms with 22 parameters

(plus several still hidden ones)

Heuristic Optimization 2016 4

Example: Ant Colony Optimization

Heuristic Optimization 2016 5

ACO, Probabilistic solution construction

i

j

k

g

! ij " ij

?,

Heuristic Optimization 2016 6

Applying Ant Colony Optimization

Heuristic Optimization 2016 7

ACO design choices and numerical parameters

I solution constructionI choice of constructive procedureI choice of pheromone modelI choice of heuristic informationI numerical parameters

I ↵,� influence the weight of pheromone and heuristicinformation, respectively

Iq0 determines greediness of construction procedure

Im, the number of ants

I pheromone update

I which ants deposit pheromone and how much?I numerical parameters

I ⇢: evaporation rateI ⌧0: initial pheromone level

I local searchI . . . many more . . .

Heuristic Optimization 2016 8

Parameter types

I categorical parameters design

I choice of constructive procedure, choice of recombinationoperator, choice of branching strategy,. . .

I ordinal parameters design

I neighborhoods, lower bounds, . . .

I numerical parameters tuning, calibration

I integer or real-valued parametersI weighting factors, population sizes, temperature, hidden

constants, . . .I numerical parameters may be conditional to specific values of

categorical or ordinal parameters

Design and configuration of algorithms involves settingcategorical, ordinal, and numerical parameters

Heuristic Optimization 2016 9

Traditional approaches

I trial–and–error design guided by expertise/intuition

prone to over-generalizations, implicit independenceassumptions, limited exploration of design alternatives

I indications through theoretical studies

often based on over-simplifications, specific assumptions,few parameters

Can we make this approach more principled and automatic?

Heuristic Optimization 2016 10

Designing optimization algorithms

Challenges

I many alternative design choices

I nonlinear interactions among algorithm componentsand/or parameters

I performance assessment is di�cult

Traditional design approach

I trial–and–error design guided by expertise/intuition prone to over-generalizations, implicit independenceassumptions, limited exploration of design alternatives

Can we make this approach more principled and automatic?

Heuristic Optimization 2016 11

Towards automatic algorithm configuration

Automated algorithm configuration

I apply powerful search techniques to design algorithms

I use computation power to explore design spaces

I assist algorithm designer in the design process

I free human creativity for higher level tasks

Heuristic Optimization 2016 12

Example of application scenario

I Mario collects phone orders for 30 minutes

I scheduling deliveries is an optimization problem

I a di↵erent instance arises every 30 minutes

I limited amount of time for scheduling, say one minute

I good news: Mario has an SLS algorithm!

I . . . but the SLS algorithm must be tuned

I You have a limited amount of time for tuning it, say one week

Criterion:Good configurations find good solutions for future instances!

Heuristic Optimization 2016 13

Automatic o✏ine configuration

Typical performance measures

I maximize solution quality (within given computation time)

I minimize computation time (to reach optimal solution)Heuristic Optimization 2016 14

O✏ine configuration and online parametercontrol

O✏ine configuration

I configure algorithm before deploying it

I configuration on training instances

I related to algorithm design

Online parameter control

I adapt parameter setting while solving an instance

I typically limited to a set of known crucial algorithmparameters

I related to parameter calibration

O✏ine configuration techniques can be helpful to configure(online) parameter control strategiesHeuristic Optimization 2016 15

Approaches to configuration

I experimental design techniquesI e.g. CALIBRA [Adenso–Dıaz, Laguna, 2006], [Ridge&Kudenko,

2007], [Coy et al., 2001], [Ruiz, Stutzle, 2005]I numerical optimization techniques

I e.g. MADS [Audet&Orban, 2006], various [Yuan et al., 2012]I heuristic search methods

I e.g. meta-GA [Grefenstette, 1985], ParamILS [Hutter et al., 2007,2009], gender-based GA [Ansotegui at al., 2009], linear GP [Oltean,2005], REVAC(++) [Eiben & students, 2007, 2009, 2010] . . .

I model-based optimization approachesI e.g. SPO [Bartz-Beielstein et al., 2005, 2006, .. ], SMAC [Hutter et

al., 2011, ..]I sequential statistical testing

I e.g. F-race, iterated F-race [Birattari et al, 2002, 2007, . . .]

General, domain-independent methods required: (i) applicable to all variable

types, (ii) multiple training instances, (iii) high performance, (iv) scalable

Heuristic Optimization 2016 16

Approaches to configuration

I experimental design techniquesI e.g. CALIBRA [Adenso–Dıaz, Laguna, 2006], [Ridge&Kudenko,

2007], [Coy et al., 2001], [Ruiz, Stutzle, 2005]I numerical optimization techniques

I e.g. MADS [Audet&Orban, 2006], various [Yuan et al., 2012]I heuristic search methods

I e.g. meta-GA [Grefenstette, 1985], ParamILS [Hutter et al., 2007,2009], gender-based GA [Ansotegui at al., 2009], linear GP [Oltean,2005], REVAC(++) [Eiben & students, 2007, 2009, 2010] . . .

I model-based optimization approachesI e.g. SPO [Bartz-Beielstein et al., 2005, 2006, .. ], SMAC [Hutter et

al., 2011, ..]I sequential statistical testing

I e.g. F-race, iterated F-race [Birattari et al, 2002, 2007, . . .]

General, domain-independent methods required: (i) applicable to all variable

types, (ii) multiple training instances, (iii) high performance, (iv) scalable

Heuristic Optimization 2016 17

The racing approach

i

I start with a set of initial candidates

I consider a stream of instances

I sequentially evaluate candidates

I discard inferior candidatesas su�cient evidence is gathered against them

I . . . repeat until a winner is selectedor until computation time expires

Heuristic Optimization 2016 18

The F-Race algorithm

Statistical testing

1. family-wise tests for di↵erences among configurations

I Friedman two-way analysis of variance by ranks

2. if Friedman rejects H0, perform pairwise comparisons to bestconfiguration

I apply Friedman post-test

Heuristic Optimization 2016 19

Iterated race

Racing is a method for the selection of the best configuration andindependent of the way the set of configurations is sampled

Iterated racing

sample configurations from initial distribution

While not terminate()

apply racemodify sampling distributionsample configurations

Heuristic Optimization 2016 20

The irace Package

Manuel Lopez-Ibanez, Jeremie Dubois-Lacoste, Thomas Stutzle, and Mauro

Birattari. The irace package, Iterated Race for Automatic Algorithm

Configuration. Technical Report TR/IRIDIA/2011-004, IRIDIA, Universite

Libre de Bruxelles, Belgium, 2011.

http://iridia.ulb.ac.be/irace

I implementation of Iterated Racing in R

Goal 1: flexibleGoal 2: easy to use

I but no knowledge of R necessaryI parallel evaluation (MPI, multi-cores, grid engine .. )I initial candidates

irace has shown to be e↵ective for configuration taskswith several hundred of variables

Heuristic Optimization 2016 21

Other tools: ParamILS, SMAC

ParamILS

I iterated local search in configuration space

I requires discretization of numerical parameters

I http://www.cs.ubc.ca/labs/beta/Projects/ParamILS/

SMAC

I surrogate model assisted search process

I encouraging results for large configuration spaces

I http://www.cs.ubc.ca/labs/beta/Projects/SMAC/

capping: e↵ective speed-up technique for configuration targetrun-time

Heuristic Optimization 2016 22

Example, configuration of “black-box” solvers

Mixed-integer programming solvers

Heuristic Optimization 2016 23

Mixed integer programming (MIP) solvers

I powerful commercial (e.g. CPLEX) and non-commercial (e.g.SCIP) solvers available

I large number of parameters (tens to hundreds)

I default configurations not necessarily best for specificproblems

Benchmark set Default Configured SpeedupRegions200 72 10.5 (11.4± 0.9) 6.8Conic.SCH 5.37 2.14 (2.4± 0.29) 2.51CLS 712 23.4 (327± 860) 30.43MIK 64.8 1.19 (301± 948) 54.54QP 969 525 (827± 306) 1.85

FocusedILS tuning CPLEX, 10 runs, 2 CPU days, 63 parameters

Heuristic Optimization 2016 24

Example, bottom-up generation of algorithms

Automatic design of hybrid SLS algorithms

Heuristic Optimization 2016 25

Automatic design of hybrid SLS algorithms

Approach

I decompose single-point SLS methods into components

I derive generalized metaheuristic structure

I component-wise implementation of metaheuristic part

Implementation

I present possible algorithm compositions by a grammar

I instantiate grammer using a parametric representation

I allows use of standard automatic configuration toolsI shows good performance when compared to, e.g., grammatical

evolution [Mascia, Lopes-Ibanez, Dubois-Lacoste, Stutzle, 2014]

Heuristic Optimization 2016 26

General Local Search Structure: ILS

s0 :=initSolutions⇤ := ls(s0)repeat

s 0 :=perturb(s⇤,history)s⇤0 :=ls(s 0)s⇤ :=accept(s⇤,s⇤0,history)

until termination criterion met

I many SLS methods instantiable from this structureI abilities

I hybridizationI recursionI problem specific implementation at low-level

Heuristic Optimization 2016 27

Grammar

<algorithm> ::= <initialization> <ils>

<initialization> ::= random | <pbs_initialization>

<ils> ::= ILS(<perturb>, <ls>, <accept>, <stop>)

<perturb> ::= none | <initialization> | <pbs_perturb>

<ls> ::= <ils> | <descent> | <sa> | <rii> | <pii> | <vns> | <ig> | <pbs_ls>

<accept> ::= alwaysAccept | improvingAccept <comparator>

| prob(<value_prob_accept>) | probRandom | <metropolis>

| threshold(<value_threshold_accept>) | <pbs_accept>

<descent> ::= bestDescent(<comparator>, <stop>)

| firstImprDescent(<comparator>, <stop>)

<sa> ::= ILS(<pbs_move>, no_ls, <metropolis>, <stop>)

<rii> ::= ILS(<pbs_move>, no_ls, probRandom, <stop>)

<pii> ::= ILS(<pbs_move>, no_ls, prob(<value_prob_accept>), <stop>)

<vns> ::= ILS(<pbs_variable_move>, firstImprDescent(improvingStrictly),

improvingAccept(improvingStrictly), <stop>)

<ig> ::= ILS(<deconst-construct_perturb>, <ls>, <accept>, <stop>)

<comparator> ::= improvingStrictly | improving

<value_prob_accept> ::= [0, 1]

<value_threshold_accept> ::= [0, 1]

<metropolis> ::= metropolisAccept(<init_temperature>, <final_temperature>,

<decreasing_temperature_ratio>, <span>)

<init_temperature> ::= {1, 2,..., 10000}

<final_temperature> ::= {1, 2,..., 100}

<decreasing_temperature_ratio> ::= [0, 1]

<span> ::= {1, 2,..., 10000}

Heuristic Optimization 2016 28

Grammar

<algorithm> ::= <initialization> <ils>

<initialization> ::= random | <pbs_initialization>

<ils> ::= ILS(<perturb>, <ls>, <accept>, <stop>)

<perturb> ::= none | <initialization> | <pbs_perturb>

<ls> ::= <ils> | <descent> | <sa> | <rii> | <pii> | <vns> | <ig> | <pbs_ls>

<accept> ::= alwaysAccept | improvingAccept <comparator>

| prob(<value_prob_accept>) | probRandom | <metropolis>

| threshold(<value_threshold_accept>) | <pbs_accept>

<descent> ::= bestDescent(<comparator>, <stop>)

| firstImprDescent(<comparator>, <stop>)

<sa> ::= ILS(<pbs_move>, no_ls, <metropolis>, <stop>)

<rii> ::= ILS(<pbs_move>, no_ls, probRandom, <stop>)

<pii> ::= ILS(<pbs_move>, no_ls, prob(<value_prob_accept>), <stop>)

<vns> ::= ILS(<pbs_variable_move>, firstImprDescent(improvingStrictly),

improvingAccept(improvingStrictly), <stop>)

<ig> ::= ILS(<deconst-construct_perturb>, <ls>, <accept>, <stop>)

<comparator> ::= improvingStrictly | improving

<value_prob_accept> ::= [0, 1]

<value_threshold_accept> ::= [0, 1]

<metropolis> ::= metropolisAccept(<init_temperature>, <final_temperature>,

<decreasing_temperature_ratio>, <span>)

<init_temperature> ::= {1, 2,..., 10000}

<final_temperature> ::= {1, 2,..., 100}

<decreasing_temperature_ratio> ::= [0, 1]

<span> ::= {1, 2,..., 10000}

Heuristic Optimization 2016 29

Grammar

<algorithm> ::= <initialization> <ils>

<initialization> ::= random | <pbs_initialization>

<ils> ::= ILS(<perturb>, <ls>, <accept>, <stop>)

<perturb> ::= none | <initialization> | <pbs_perturb>

<ls> ::= <ils> | <descent> | <sa> | <rii> | <pii> | <vns> | <ig> | <pbs_ls>

<accept> ::= alwaysAccept | improvingAccept <comparator>

| prob(<value_prob_accept>) | probRandom | <metropolis>

| threshold(<value_threshold_accept>) | <pbs_accept>

<descent> ::= bestDescent(<comparator>, <stop>)

| firstImprDescent(<comparator>, <stop>)

<sa> ::= ILS(<pbs_move>, no_ls, <metropolis>, <stop>)

<rii> ::= ILS(<pbs_move>, no_ls, probRandom, <stop>)

<pii> ::= ILS(<pbs_move>, no_ls, prob(<value_prob_accept>), <stop>)

<vns> ::= ILS(<pbs_variable_move>, firstImprDescent(improvingStrictly),

improvingAccept(improvingStrictly), <stop>)

<ig> ::= ILS(<deconst-construct_perturb>, <ls>, <accept>, <stop>)

<comparator> ::= improvingStrictly | improving

<value_prob_accept> ::= [0, 1]

<value_threshold_accept> ::= [0, 1]

<metropolis> ::= metropolisAccept(<init_temperature>, <final_temperature>,

<decreasing_temperature_ratio>, <span>)

<init_temperature> ::= {1, 2,..., 10000}

<final_temperature> ::= {1, 2,..., 100}

<decreasing_temperature_ratio> ::= [0, 1]

<span> ::= {1, 2,..., 10000}

Heuristic Optimization 2016 30

System overview

Heuristic Optimization 2016 31

Flow-shop problem with weighted tardiness

I Automatic configuration:I 1, 2 or 3 levels of recursion (r)I 80, 127, and 174 parameters, respectivelyI budget: r ⇥ 10 000 trials each of 30 seconds

ALS1 ALS2 ALS3 soa−IG2660

027

000

2740

0

Algorithms

Fitn

ess

valu

e

ALS1 ALS2 ALS3 soa−IG24

200

2460

025

000

Algorithms

Fitn

ess

valu

e

ALS1 ALS2 ALS3 soa−IG

3300

033

400

3380

0

Algorithms

Fitn

ess

valu

e

ALS1 ALS2 ALS3 soa−IG

4100

0042

0000

Algorithms

Fitn

ess

valu

e

ALS1 ALS2 ALS3 soa−IG

3250

0033

5000

Algorithms

Fitn

ess

valu

e

ALS1 ALS2 ALS3 soa−IG

4900

0050

0000

5100

00

Algorithms

Fitn

ess

valu

e

results are competitive or superior to state-of-the-art algorithmHeuristic Optimization 2016 32

Why automatic algorithm configuration?

I improvement over manual, ad-hoc methods for tuning

I reduction of development time and human intervention

I increase number of considerable degrees of freedom

I empirical studies, comparisons of algorithms

I support for end users of algorithms

Heuristic Optimization 2016 33

Towards a shift of paradigm in algorithmdesign

������������

�� ������

�������������

��������������

���� �������

Heuristic Optimization 2016 34

Towards a shift of paradigm in algorithmdesign

������������

�� ������

�������������

��������������

���� �������

Heuristic Optimization 2016 35

Towards a shift of paradigm in algorithmdesign

�� ���

������������

�� ������

�������������

��������������

���� �������

��������

� ���� �

Heuristic Optimization 2016 36

Conclusions

Automatic Configuration

I leverages computing power for software design

I is rewarding w.r.t. development time and algorithmperformance

I leads ultimately to a shift in algorithm design

Future work

I more powerful configurators

I pushing the borders

I best practice

Heuristic Optimization 2016 37