
http://creativecommons.org/licenses/by-sa/2.0/

CIS786 Lecture 1

Usman Roshan

"This material is based on slides provided with the book 'Stochastic Local Search: Foundations and Applications' by Holger H. Hoos and Thomas Stützle (Morgan Kaufmann, 2004) - see www.sls-book.net for further information."

Complexity

• NP: class of decision problems whose solution can be verified in polynomial time

• P: class of decision problems whose solution can be determined in polynomial time

• TSP (decision version)
– Problem: is there a Hamiltonian path of length at most k in an edge-weighted directed graph?
– Solution can be verified in polynomial time
– Can it be answered in polynomial time for arbitrary instances?

• Shortest path problem (decision version)
– Problem: is there a path of length at most k between vertices u and v in an edge-weighted directed graph?
– Solution can be verified in polynomial time
– It can also be determined in polynomial time! (see the sketch below)
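For concreteness, Dijkstra's algorithm answers this decision question in polynomial time. A minimal sketch, assuming non-negative edge weights; the function name and adjacency-list input format are illustrative, not from the slides:

```python
import heapq

def path_of_length_at_most_k(graph, u, v, k):
    """Decide: is there a u-v path of length at most k?
    graph: dict mapping each vertex to a list of (neighbor, weight) pairs,
    weights assumed non-negative. Runs in polynomial time (Dijkstra)."""
    dist = {u: 0}
    heap = [(0, u)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist.get(x, float("inf")):
            continue  # stale heap entry
        for y, w in graph.get(x, []):
            if d + w < dist.get(y, float("inf")):
                dist[y] = d + w
                heapq.heappush(heap, (d + w, y))
    return dist.get(v, float("inf")) <= k
```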

Complexity

• In real life we are concerned with optimization and with finding actual solutions (as opposed to answering decision problems)

• Decision problems mainly have to do with hardness results---if the decision version is very hard to solve then clearly so is the optimization version

• The question of whether P = NP is the most important open problem in computer science

Reduction

• Matching: given a bipartite graph, does there exist a perfect matching?

• Can be solved by reducing to the maximum flow problem

[Figure: the bipartite graph augmented with a source s and a sink t for the max-flow reduction]
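A minimal sketch of this reduction using the networkx max-flow routine: attach a source s to every left vertex, every right vertex to a sink t, give all edges capacity 1, and check whether the max flow saturates the left side. The helper name and the "s"/"t" labels are illustrative; the graph's own vertices are assumed not to collide with them.

```python
import networkx as nx

def has_perfect_matching(left, right, edges):
    """left, right: the two vertex classes; edges: (u, v) pairs with u in
    left and v in right. Returns True iff a perfect matching exists
    (assumes len(left) == len(right))."""
    G = nx.DiGraph()
    for u in left:
        G.add_edge("s", u, capacity=1)   # source -> left vertices
    for v in right:
        G.add_edge(v, "t", capacity=1)   # right vertices -> sink
    for u, v in edges:
        G.add_edge(u, v, capacity=1)     # original bipartite edges
    flow_value, _ = nx.maximum_flow(G, "s", "t")
    return flow_value == len(left)
```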

NP-hardness

• Reduction: problem X can be reduced to problem Y if the following hold
– x is a yes-instance of X iff y = R(x) is a yes-instance of Y
– R(x) is a polynomial-time reduction function

• Reduction is about decision problems and not optimization ones

• A problem X is NP-complete if
– X is in NP
– Every problem in NP reduces to X in polynomial time

Running time (poly vs exp)

Running time (effect of constants)

Approximation algorithms

• Vertex cover: find a minimum set of vertices C in G=(V,E) such that each edge of G has at least one endpoint in C

• 2-approx algorithm (sketched in code below):
– Initialize C to the empty set
– While there are edges in G
• Select any edge (u,v)
• Add u and v to C
• Delete u and v (and all incident edges) from G
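A minimal sketch of this 2-approximation (the function name and edge-list input are illustrative). The selected edges form a matching, and any cover must contain at least one endpoint of each matched edge, so the result is at most twice optimal.

```python
def vertex_cover_2approx(edges):
    """edges: list of (u, v) pairs. Returns a vertex cover of size at most
    twice the minimum."""
    cover = set()
    remaining = list(edges)
    while remaining:                     # while there are edges in G
        u, v = remaining[0]              # select any edge (u, v)
        cover.update((u, v))             # add u and v to C
        # deleting u and v from G removes all edges incident to them
        remaining = [(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)]
    return cover
```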

• In practice, however, approximation algorithms are often outperformed by heuristics that carry no quality guarantees

Heuristics

• No guarantees on quality of solution
• Usually very fast and widely used
• Studied experimentally on benchmark datasets
• Popular heuristics:
– Iterative improvement
– Randomised iterative improvement
– Iterated local search
– Tabu search
– Genetic algorithms
– Simulated annealing
– Ant colony optimization

Local search

• Let g be the function to optimize
• Assume we have a neighborhood function M(x) which generates a neighborhood of x (preferably of polynomial size)
• Iterative improvement (sketched in code below)
– Determine initial candidate solution s
– While s is not a local optimum
• Choose a neighbor s’ of s such that g(s’) < g(s) (can do best or first improvement)
• Set s = s’
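A generic sketch of iterative improvement for minimization; the names g and M and the flag are illustrative, and a true first-improvement scan would stop at the first improving neighbor instead of enumerating them all:

```python
def iterative_improvement(s, g, M, best_improvement=True):
    """Move to improving neighbors of s until none exists.
    g: objective to minimize; M: neighborhood function returning an
    iterable of neighbors of s."""
    while True:
        improving = [t for t in M(s) if g(t) < g(s)]
        if not improving:
            return s                     # s is a local optimum
        # best improvement picks the best neighbor; first improvement
        # would take improving[0] as soon as it is found
        s = min(improving, key=g) if best_improvement else improving[0]
```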

Global view of search

Local view of search

Travelling Salesman problem

• Highly studied in computer science

• Problem is to find the shortest Hamiltonian cycle in an edge-weighted directed graph

• NP-hard

• No polynomial time approximation scheme unless P=NP

TSP

Greedy TSP search (nearest neighbor search)

• Start with a randomly selected vertex
• Find the nearest unvisited vertex and add it to the path
• When no unvisited vertices remain, add the edge back to the initial vertex to complete the cycle (if desired)
• Combine with backtracking (the basic construction is sketched below)
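A minimal sketch of the nearest neighbor construction, with backtracking omitted; the distance-matrix input format is an assumption:

```python
import random

def nearest_neighbor_tour(dist, close_cycle=True):
    """dist: n x n matrix of edge weights. Builds a tour greedily from a
    random start vertex."""
    n = len(dist)
    start = random.randrange(n)              # randomly selected vertex
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        # nearest unvisited vertex from the current end of the path
        nxt = min(unvisited, key=lambda j: dist[tour[-1]][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    if close_cycle:
        tour.append(start)                   # complete the cycle
    return tour
```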

TSP---local move

Sort neighboring edges according to increasing weights and select the lowest one first

The Lin-Kernighan algorithm for TSP

Basic LK local move
• Start with a path
• Obtain a delta-path by adding edge (v,w)
• Break the cycle by removing (w,v’)
• The cycle can be completed by adding edge (v’,u)

Full LK heuristic

1. Input: path p
2. Obtain a delta-path p’ by replacing one edge in p
3. If g(p’) < g(p) then set p = p’ and go to step 2
4. Else output p

Can be interpreted as a sequence of 1-exchange steps that alternate between delta-paths and Hamiltonian paths
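The full LK edge-replacement machinery is intricate; as a much-simplified stand-in, here is the classic 2-exchange (2-opt) improvement step that LK generalizes. The sketch assumes a symmetric distance matrix, since a 2-exchange reverses a tour segment:

```python
def two_opt(tour, dist):
    """Repeatedly remove two edges of the cycle and reconnect the two
    resulting paths the other way (reversing one segment), keeping any
    move that shortens the tour. tour: list of vertices; dist: symmetric
    n x n weight matrix."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 2):
            for j in range(i + 2, n - (1 if i == 0 else 0)):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # replace edges (a,b) and (c,d) with (a,c) and (b,d)
                if dist[a][c] + dist[b][d] < dist[a][b] + dist[c][d]:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour
```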

Local optima

• Big problem!
• Simple and commonly employed ways to escape local optima
– Random restart: begin the search from a different starting point
– Non-improving steps: allow selection of candidate solutions with worse evaluation function values
• Neither of these is guaranteed to always escape local optima

Local optima

• Local optima depend on g and neighborhood function M

• Larger neighborhoods induce fewer local optima but can take longer to search

• Now we look at improvements over basic iterative improvement search that can avoid local optima

Simulated annealing

• Create initial candidate solution s
• While termination condition not satisfied
– Probabilistically choose a neighbor s’ of s using a proposal mechanism
– If s’ satisfies a probabilistic acceptance criterion (which depends on a temperature parameter T) then set s = s’
– Update T according to the annealing schedule
• T may be constant for some number of iterations or never change

Simulated annealing

• The proposal function is usually uniformly random

• The acceptance function is normally the Metropolis function: accept s’ with probability 1 if g(s’) ≤ g(s), and with probability exp((g(s) − g(s’))/T) otherwise

Simulated annealing for TSP

• Randomly pick a Hamiltonian cycle s
• Select neighbor s’ uniformly at random from the neighborhood of s
• Accept the new solution s’ with probability min{1, exp((g(s) − g(s’))/T)} (also known as the Metropolis condition)
• Annealing schedule: T = 0.95T after every n(n−1) steps (geometric schedule)
• Terminate when there is no improvement in solution quality for five successive temperature values (a generic sketch follows below)
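A generic SA sketch for minimization; the parameter names and defaults are illustrative, not the tuned values from this slide:

```python
import math
import random

def simulated_annealing(s, g, M, T=1.0, alpha=0.95, inner=100, rounds=200):
    """g: objective to minimize; M: neighborhood function. Proposes a
    uniformly random neighbor, accepts by the Metropolis condition, and
    cools geometrically (T = alpha*T) after every `inner` steps."""
    best = s
    for _ in range(rounds):
        for _ in range(inner):
            s2 = random.choice(M(s))           # uniform random proposal
            delta = g(s2) - g(s)
            # Metropolis: always accept improvements, otherwise accept
            # with probability exp(-delta / T)
            if delta <= 0 or random.random() < math.exp(-delta / T):
                s = s2
                if g(s) < g(best):
                    best = s
        T *= alpha                             # geometric annealing schedule
    return best
```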

Convergence for SA

Tabu search

• Generate initial candidate s

• While termination condition not met
– Determine set N’ of non-tabu neighbors of s
– Choose the best improving solution s’ in N’
– Update tabu table based on s’
– Set s = s’

Tabu search

• Usually, the selected candidate s’ is declared tabu for some fixed number of subsequent steps. This means it cannot be chosen again until some time has elapsed, which allows for wider exploration of the search space (a sketch follows below).
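A minimal sketch that stores whole recently visited solutions in a fixed-length tabu table and moves to the best non-tabu neighbor even when it is worse (a common variant; real implementations usually declare move attributes tabu instead, and the names and defaults here are illustrative):

```python
from collections import deque

def tabu_search(s, g, M, tenure=7, max_iters=1000):
    """g: objective to minimize; M: neighborhood function. Remembers the
    best solution seen while forcing the search away from recent ones."""
    tabu = deque(maxlen=tenure)      # entries expire after `tenure` steps
    best = s
    for _ in range(max_iters):
        candidates = [t for t in M(s) if t not in tabu]
        if not candidates:
            break                    # every neighbor is currently tabu
        s = min(candidates, key=g)   # best non-tabu neighbor
        tabu.append(s)               # declare the chosen solution tabu
        if g(s) < g(best):
            best = s
    return best
```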

• Later we will look at a tabu search algorithm for protein folding under the 2D lattice model.

Iterated local search

• Generate initial candidate solution s
• Perform local search on s (for example, iterative improvement starting from s)
• While termination condition not met
– Set r = s
– Perform perturbation on s
– Perform local search on the perturbed s
– Based on the acceptance criterion, keep s or revert to r (sketched in code below)
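A compact sketch of the loop above; local_search and perturb are supplied by the user, and the "better of the two" acceptance criterion is used, matching the ILK slide later:

```python
def iterated_local_search(s, g, local_search, perturb, iters=100):
    """g: objective to minimize. Walks in the space of local optima:
    perturb, re-optimize, then keep or revert."""
    s = local_search(s)                  # initial local search
    for _ in range(iters):
        r = s                            # remember the incumbent
        s2 = local_search(perturb(s))    # perturbation + local search
        s = s2 if g(s2) < g(r) else r    # acceptance criterion
    return s
```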

Iterated local search

• ILS can be interpreted as walks in the space of local optima

• Perturbation is key
– Needs to be chosen so that it cannot be undone easily by the subsequent local search
– It may consist of many perturbation steps
– Strong perturbation: more effective escape from local optima but similar drawbacks as random restart
– Weak perturbation: short subsequent local search phase but risk of revisiting previous optima

• Acceptance criterion: usually either the more recent or the better of the two candidates

Iterated local search for TSP

• Perturbation: “double-bridge move” = 4-exchange step

• Cannot be directly reversed by 2-exchange moves

ILS for TSP

• Acceptance criterion: return the better of the two candidate solutions

• Known as Iterated Lin-Kernighan (ILK) algorithm

• Although very simple, it has been shown to achieve excellent performance and is among the state of the art

Ant colony optimization

• Ants communicate via chemicals known as pheromones which are deposited on the ground in the form of trails.

• Pheromone trails provide the basis for (stochastic) trail-following behavior underlying, e.g., the collective ability to find the shortest paths between a food source and the nest

ACOs

• Initialise pheromone trails

• While termination condition is not met
– Generate population sp of candidate solutions using randomized constructive search (such as a greedy heuristic)
– Perform local search (e.g. iterative improvement) on sp
– Update pheromone trails based on sp

ACOs

• In each cycle, each ant creates one candidate solution

• All pheromone trails are initialized to the same value

• Pheromone update typically comprises a uniform decrease of all trail levels (evaporation) and an increase on some trail levels based on solutions obtained from construction + local search (a sketch follows below)
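A sketch of just this pheromone-update step; the trail representation (a dict from solution components, e.g. tour edges, to trail levels) and the parameter names rho and Q are illustrative:

```python
def update_pheromones(tau, solutions, g, rho=0.1, Q=1.0):
    """tau: dict component -> trail level; solutions: candidate solutions,
    each an iterable of components; g: cost to minimize."""
    for c in tau:
        tau[c] *= (1.0 - rho)            # uniform evaporation
    for s in solutions:
        deposit = Q / g(s)               # better solutions deposit more
        for c in s:
            tau[c] = tau.get(c, 0.0) + deposit
    return tau
```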

ACO for TSP (1)

ACO for TSP (2)

ACO for TSP (3)

• Termination: after a fixed number of cycles (construction + local search)

• Ants can be imagined as walking along edges of given graph (using memory to ensure their tours correspond to Hamiltonian cycles) and depositing pheromone to reinforce edges of tours

• Original ACO did not include local search (local search improves performance considerably)

• ACO has also been applied to protein folding which we will see later

Evolutionary (genetic) algorithms

• Determine initial population sp
• While termination condition not met
– Generate set spr of new candidate solutions by recombination
– Generate set spm of new candidate solutions from spr and sp by mutation
– Select new population from candidate solutions in sp, spr, and spm

• Pure evolutionary algorithms often lack the capability for sufficient search intensification---they need to be combined with local search (a skeleton of the loop above follows below)
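A skeleton of the loop above with truncation selection; recombine and mutate are problem-specific, and the selection rule is one simple choice among many:

```python
import random

def evolutionary_algorithm(init_pop, g, recombine, mutate, generations=100):
    """g: objective to minimize. Keeps the population size constant by
    selecting the best individuals from sp, spr, and spm."""
    sp = list(init_pop)
    k = len(sp)
    for _ in range(generations):
        spr = [recombine(*random.sample(sp, 2)) for _ in range(k)]
        spm = [mutate(random.choice(spr + sp)) for _ in range(k)]
        sp = sorted(sp + spr + spm, key=g)[:k]   # truncation selection
    return min(sp, key=g)
```

A memetic algorithm (next slides) is obtained by additionally applying a local search to sp, spr, and spm inside this loop.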

Recombination

Memetic algorithms (genetic local search)

• Determine initial population sp
• Perform local search on sp
• While termination condition not met
– Generate set spr of new candidate solutions by recombination
– Perform local search on spr
– Generate set spm of new candidate solutions from spr and sp by mutation
– Perform local search on spm
– Select new population from candidate solutions in sp, spr, and spm

• Memetic algorithms thus provide exactly the local-search intensification that pure evolutionary algorithms lack

MA for TSP

• A randomized variant of the greedy heuristic (we saw earlier) is used to generate a population

• Among the various recombination operators, the GX operator performs best

GX operator for TSP MA

• Copy edges that are common to two parents (fraction of edges to be copied is determined by parameter p1)

• Add new short edges not in the parents (again fraction to be added is determined by parameter p2)

• Copy edges from the parents in order of increasing length---only edges which do not violate TSP constraints are added, and the fraction to be added is determined by parameter p3

• If necessary, complete the tour using the randomized greedy heuristic

Protein folding

• The lattice model assumes amino acids are of two types: hydrophobic (drawn black) and hydrophilic (drawn white)

• They can take on discrete positions only

• The energy of a fold is −1 for every pair of hydrophobic residues that are adjacent on the lattice but not consecutive in the sequence, so more such H–H contacts mean lower (better) energy (see the sketch below)
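A sketch of this energy function on the 2D lattice; the H/P string encoding and the function name are illustrative:

```python
def hp_energy(fold, sequence):
    """fold: list of (x, y) lattice coordinates, one per residue;
    sequence: string over {'H', 'P'}. Counts -1 per H-H contact, i.e. per
    pair of hydrophobic residues adjacent on the lattice but not
    consecutive in the chain."""
    pos = {p: i for i, p in enumerate(fold)}   # coordinate -> residue index
    energy = 0
    for i, (x, y) in enumerate(fold):
        if sequence[i] != 'H':
            continue
        # look right and up so each lattice edge is examined once
        for q in ((x + 1, y), (x, y + 1)):
            j = pos.get(q)
            if j is not None and sequence[j] == 'H' and abs(i - j) > 1:
                energy -= 1
    return energy
```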

Protein folding

• Finding the optimal fold in the 2D lattice is NP-hard

• The number of possible folds is at least exponential (as demonstrated by the staircase folds)

• Iterative improvement requires a local move

Pull move

Pull move

• Theorem 1: The class of pull moves is reversible

• Theorem 2: Any protein can be straightened to form a straight line through a sequence of pull moves

Tabu search

• Initialize using a randomized greedy construction heuristic
– Place amino acid i+1 to the LEFT, RIGHT, or STRAIGHT ahead of i
– Repeat this recursively starting from i=2
– If a dead end is reached, either backtrack or abandon and restart the search

Tabu search parameters

• Each residue is assigned a mobility
• Pull moves are performed on residues with high mobility
• Mobilities are updated to encourage exploration of the search space
– Initially all residues are assigned medium mobility for memSize iterations
– Each residue is randomly reset to medium mobility for one iteration with probability noise
– Residues that have been altered less in the past are encouraged to move using a diversity parameter

Results

Optimal fold on 100 residue long protein

Results of 200 runs with loosely tuned parameters

Comparison of pull moves to long and pivot moves

Problem   Pull move   Long pull move   Pivot move
S64       -42         -38              -40
S85       -53         -51              -49
S100a     -48         -45              -47
S100b     -50         -44              -48