Intelligent and Adaptable Software Systems
Advanced Algorithms: Optimization and Search Methods

Dr. Arno Formella

Computer Science Department
University of Vigo

09/09

Dr. Arno Formella (University of Vigo) SSIA 09/09 1 / 89


Advanced Algorithms: Optimization and Search Methods I

1 Course organization

2 Bibliography

3 Motivation

4 Basic concepts


Course organization: course notes

Homepage: http://www.ei.uvigo.es/~formella/doc/ssia09

- whiteboard illustrations (notations, ideas for proofs, algorithms)
- very short introduction to certain aspects related to optimization and search methods, and some applications


Course organization: class room hours

Optimization and Search Methods, Thursdays, 18:00–20:00

24.09.  01.10.  08.10.  15.10.  22.10.
class   class   lab     class   lab
29.10.  05.11.  12.11.  19.11.  26.11.
class   ??      ??      ??      ??
03.12.  10.12.  17.12.  07.01.  14.01.
??      ??      ??      ??      ??


Course organization: class room hours

Dr. Fernando Diáz Gómez
office hours: [email protected]
Dr. Arno Formella
office hours: Mondays, 11-14 and 17-20


Bibliography: books

to be completed


Your work: homework, lab hours, presentations

to be completed


Bibliography I: links

(working as of September 2009)

- Rui Mendes. Population topologies and their influence in particle swarm performance. PhD Thesis, Universidad de Minho, 2004. http://www.di.uminho.pt/~rcm/
- http://www-neos.mcs.anl.gov (online optimization project)
- http://www.coin-or.org/index.html (operations research)
- http://www.cs.sandia.gov/opt/survey (global optimization)


Bibliography II: links

- http://iridia.ulb.ac.be/~mdorigo/ACO/ (ant colony optimization)
- http://www.mat.univie.ac.at/~neum/glopt.html (global optimization)
- http://plato.asu.edu/gom.html (continuous global optimization software)
- http://www.swarmintelligence.org/index.php (particle swarm optimization)


Motivation: what is it?

Optimizing means

- searching for (at least) one solution
- which differs from other possible solutions
- in the sense of being (sufficiently) extreme
- within an ordering,
- possibly taking into account certain restrictions
- (within a certain limit of computing time).

Example: hiking on a mountain ridge (in fog).


Motivation: examples

Problems which one wants to solve:

- minimizing cost
- maximizing earnings
- maximizing occupation
- minimizing energy
- minimizing resources


Basic concepts: observations

the search space and/or the objective function can be

- discrete or continuous
- total or partial
- simple or complex, especially with respect to evaluation time
- explicit, implicit, experimental
- differentiable or non-differentiable
- static or dynamic

The objective function must be bounded.


Basic concepts: objective functions

- Minimization
- Maximization
- Obviously, any maximization problem can be converted to a minimization problem.
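The conversion can be written in one line. A minimal statement, assuming an objective function $f$ over a search space $X$ (both symbols are introduced here only for illustration):

```latex
\[
  \max_{x \in X} f(x) = -\min_{x \in X} \bigl(-f(x)\bigr),
  \qquad
  \arg\max_{x \in X} f(x) = \arg\min_{x \in X} \bigl(-f(x)\bigr)
\]
```

Negating the objective flips the ordering of the values, so the maximizers of $f$ are exactly the minimizers of $-f$.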


Basic concepts: conditions

- restrictions
- feasible solution (feasibility problem)
- coding of the solutions


Basic concepts: classification

(after NEOS server (almost), Argonne National Laboratory)

[Figure: classification tree of numerical optimization into continuous and discrete problems. Node labels: unconstrained, constrained, bound constrained, nonlinearly constrained, linear programming, nonlinear least squares, nonlinear equations, nondifferentiable optimization, network programming, stochastic programming, integer programming.]


Basic concepts: types

to be distinguished

- local optimization: usually one starts from an initial solution and stops when having found a local (close) minimum
- global optimization: one tries to find the best solution globally (among all possible solutions)


Basic concepts: problems

The main problem of global optimization is getting trapped in a local minimum (premature convergence).


Basic concepts: global optimization (incomplete attempt)

[Figure: taxonomy of global optimization methods. Category labels: exact methods, heuristics, meta heuristics, deterministic, probabilistic, constructive, local search, single solution, population based. Method labels: linear programming, integer programming, dynamic programming, divide & conquer, branch & bound, fixed point methods, gradient methods, evolution strategies, genetic algorithms, genetic programming, memetic algorithms, particle swarms, evolutionary programming, differential evolution, ant colony, simulated annealing, stochastic hill-climbing, iterated local search, tabu search.]


A real application: psm

approximate Point Set Match in 2D and 3D

An application where we need sophisticated search and optimization techniques.


Where's Wally?


Motivation: psm

- searching of patterns (relatively small sets of two- or three-dimensional points) within search spaces (relatively large point sets)
- comparing point sets
- key words: geometric pattern matching, structure comparison, point set matching, structural alignment, object recognition


Joint work with

- Thorsten Pöschel
- some ideas from: Kristian Rother, Stefan Günther
- Humboldt Universität, Charité Berlin
- http://www.charite.de/bioinf/people.html

psm is one of the algorithms available at http://farnsworth.charite.de/superimpose-web


Search of a substructure in a protein: search space


Search of a substructure in a protein: search pattern


Where's Wally?


Informal problem description

given a search space and a search pattern, find the location within the space that best represents the pattern


Extensions

- find the best part of the pattern which can be represented within the search space
- allow certain types of deformation of the pattern
- find similar parts within the same point set


Formal problem description

- search space: $S = \{s_0, s_1, \ldots, s_{n-1}\} \subset \mathbb{R}^d$, $|S| = n$
- search pattern: $P = \{p_0, p_1, \ldots, p_{k-1}\} \subset \mathbb{R}^d$, $|P| = k \le n$
- dimension $d = 2$ or $d = 3$


Search and alignment

- the aligning process can be separated into two parts:
  - find the matching points in the pattern and the search space
  - find the necessary transformation to move the pattern to its location
- an approximate alignment must be rated, i.e., its quality must be measured


Matching

- a matching is a function that assigns to each point of the search pattern a different point of the search space
- $\mu : P \longrightarrow S$ injective, i.e., if $p_i \ne p_j$ then $\mu(p_i) \ne \mu(p_j)$
- let's write: $\mu(p_i) = s'_i$ and $\mu(P) = S'$


Transformations

- transformations which maintain distances: translation, rotation, and reflection
- transformations which maintain angles: translation, rotation, reflection, and scaling
- deforming transformations: shearing, projection, and others (local deformations)


Congruent and similar transformations

- rigid motion transformation (Euclidean transformation or congruent transformation): only translation and rotation
- similar transformation: rigid motion transformation with scaling
- we may allow reflections as well (L-matches)
- let $T$ be a transformation (normally congruent)
- we transform the pattern
- let's write: $T(p_i) = p'_i$ and $T(P) = P'$


Alignment

a matching $\mu$ together with a transformation $T$ is an alignment $(\mu, T)$

- rigid motion transformation: congruent alignment
- with scaling: similar alignment
- with reflection: L-alignment


One–dimensional example


Distance of an alignment

- let $(\mu, T)$ be an alignment of $P$ in $S$
- we can measure the distances between the transformed points of the pattern and their partners in the search space, i.e., the distances

\[ d_i = d(T(p_i), \mu(p_i)) = d(p'_i, s'_i) \]

- obviously, if $d_i = 0$ for all $i$, then the alignment is perfect


Examples of different distances of an alignment

- root mean square distance (RMS)

\[ d = \sqrt{\frac{1}{n} \sum_i (p'_i - s'_i)^2} \]

- average distance (AVG)

\[ d = \frac{1}{n} \sum_i |p'_i - s'_i| \]

- maximum distance (MAX)

\[ d = \max_i |p'_i - s'_i| \]
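The three measures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $(p'_i - s'_i)^2$ and $|p'_i - s'_i|$ are read as the squared and plain Euclidean distance, the sums are divided by the number of matched pairs, and the function names and the sample data are hypothetical.

```python
import math

def point_distances(transformed, partners):
    """Euclidean distance between each transformed pattern point and its partner."""
    return [math.dist(p, s) for p, s in zip(transformed, partners)]

def rms_distance(transformed, partners):
    d = point_distances(transformed, partners)
    return math.sqrt(sum(x * x for x in d) / len(d))

def avg_distance(transformed, partners):
    d = point_distances(transformed, partners)
    return sum(d) / len(d)

def max_distance(transformed, partners):
    return max(point_distances(transformed, partners))

# Hypothetical 2D data: two pairs coincide, the third pair is 5 units apart.
P_prime = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
S_prime = [(0.0, 0.0), (1.0, 0.0), (3.0, 5.0)]

rms = rms_distance(P_prime, S_prime)   # sqrt(25 / 3)
avg = avg_distance(P_prime, S_prime)   # 5 / 3
mx = max_distance(P_prime, S_prime)    # 5
```

A single outlier dominates MAX completely, weighs heavily in RMS, and is averaged out most strongly in AVG, which is why the three measures can prefer different alignments.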


Quality of an alignment

- there are many interesting distance measures
- a distance $d$ has its value in $[0, \infty[$
- we use the quality $Q$ of an alignment: $Q = 1/(1 + d)$ (other possibility: $Q = \exp(-d)$)
- hence: $Q = 1$ means a perfect alignment, and $Q \in \,]0, 1]$
- and: $Q < 1$ means an approximate alignment
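As a small sketch of the two quality mappings named above (the function names are hypothetical), both send a distance of zero to quality 1 and decrease monotonically as the distance grows:

```python
import math

def quality(d):
    """Q = 1 / (1 + d): maps a distance in [0, inf[ to a quality in ]0, 1]."""
    if d < 0:
        raise ValueError("a distance cannot be negative")
    return 1.0 / (1.0 + d)

def quality_exp(d):
    """Alternative mapping: Q = exp(-d)."""
    if d < 0:
        raise ValueError("a distance cannot be negative")
    return math.exp(-d)
```

Maximizing either quality is equivalent to minimizing the distance; the two mappings differ only in how fast the quality decays for large distances.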


Formal problem description

- given a search space $S$,
- given a search pattern $P$, and
- given a distance measure,
- find an alignment $(\mu, T)$ of $P$ in $S$ with minimum distance $d$ (or maximum quality $Q$)


Perfect congruent alignments

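congruent alignments in $\mathbb{R}^3$ (Boxer 1999):

\[ O\Bigl(n^{2.5} \sqrt[4]{\log^* n} + \underbrace{k\,n^{1.8} (\log^* n)^{O(1)}}_{\text{output}} \log n\Bigr) \]

- for small $k$ the first term is dominant
- $\log^* n$ is the smallest $l$ such that

\[ \underbrace{2^{2^{\cdot^{\cdot^{\cdot^{2}}}}}}_{l \text{ times}} \ge n, \qquad \log^* n = 5 \implies n \approx 2^{65000} \]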
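To get a feeling for how slowly $\log^* n$ grows, here is a minimal sketch of the iterated logarithm to base 2. The function name is hypothetical, and the floating-point logarithm makes the result unreliable for values that lie extremely close to a power tower of twos without being equal to it.

```python
import math

def log_star(n):
    """Iterated logarithm (base 2): how often log2 must be applied until the value is <= 1."""
    count = 0
    x = n
    while x > 1:
        x = math.log2(x)
        count += 1
    return count

towers = [2, 4, 16, 65536, 2 ** 65536]           # towers of 1, 2, 3, 4, 5 twos
values = [log_star(t) for t in towers]
```

For every input size that fits into a real computer, the factor $\log^* n$ is therefore at most 5.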


Perfect similar alignments

similar alignments in $\mathbb{R}^3$ (Boxer 1999):

\[ O\Bigl(n^3 + \underbrace{k\,n^{2.2}}_{\text{output}} \log n\Bigr) \]

searching for approximate alignments and/or partial alignments is a much more complex problem


Perfect alignments: ideas

- choose one triangle, e.g. $(p_0, p_1, p_2)$, of $P$
- search for all congruent triangles in $S$ (and their corresponding transformations)
- verify the rest of the points of $P$ (after having applied the transformation)
- the run time is not proportional to $n^3$ (in case of congruence), because we can enumerate the triangles of $S$ in a sophisticated manner and there are not as many possibilities


Approximate alignments

- as stated, we work in two steps:
  - we search for adequate matchings $\mu$ (according to a certain tolerance)
  - we calculate the optimal transformations $T$ (according to a certain distance measure)
- we select the best alignment(s)


Optimal Transformation

- let $S' = \mu(P)$ be a matching
- let $d$ be a distance measure
- we look for the optimal rigid motion transformation $T$ (only translation and rotation) such that $d(T(P), \mu(P)) = d(P', S')$ is minimal


Root mean square distance

\[ d = \sqrt{\frac{1}{n} \sum_i d(p'_i, s'_i)^2} = \sqrt{\frac{1}{n} \sum_i (U \cdot p_i + t - s'_i)^2} \]

- $U$: $3 \times 3$ rotation matrix, i.e., orthonormal
- $t$: translation vector

Objective: find $U$ and $t$ such that $d$ is minimal


A little bit of math

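- we observe: $t$ and $U$ are independent
- minimizing $d$ is equivalent to minimizing $n \cdot d^2$; with its partial derivative with respect to $t$

\[ \frac{\partial (n d^2)}{\partial t} = 2 \cdot \sum_i (U \cdot p_i + t - s'_i) = 2U \sum_i p_i + 2nt - 2 \sum_i s'_i \]

set to zero, we obtain

\[ t = -U \frac{1}{n} \sum_i p_i + \frac{1}{n} \sum_i s'_i = -U \cdot p_c + s'_c \]

where $p_c$ and $s'_c$ are the centroids of both sets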


Optimal Rotation

with the above, $d$ can be written as

\[ d = \sqrt{\frac{1}{n} \sum_i (U \cdot p_i + t - s'_i)^2} = \sqrt{\frac{1}{n} \sum_i (U \cdot (p_i - p_c) - (s'_i - s'_c))^2} \]

where $U$ is a matrix with restrictions (it has to be orthonormal)


Extremal points of functions with restrictions

with the help of LAGRANGE multipliers, one converts the problem with restrictions into a problem without restrictions which exhibits the same extremal points


Let’s skip the details

- basically, we calculate the first and second derivatives with respect to the entries $u_{ij}$ of $U$
- we search for the extremal points
- KABSCH algorithm 1976, 1978
- open source code at my home page


KABSCH algorithm

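- let $S$ be the matrix whose columns are the $s'_i$ (taken relative to the centroid $s'_c$)
- let $P$ be the matrix whose columns are the $p_i$ (taken relative to the centroid $p_c$)
- we compute $R = S \cdot P^\top$
- we set $A = [a_0\ a_1\ a_2]$ with $a_k$ being the eigenvectors of $R^\top R$
- we compute $B = [b_0\ b_1\ b_2]$ with the normalized vectors $b_k = R a_k / \|R a_k\|$
- and finally, we get $U = B \cdot A^\top$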
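In the planar case $d = 2$ the least-squares rotation has a closed form, which gives a compact way to check the centroid formula $t = -U \cdot p_c + s'_c$ numerically. The sketch below is not the KABSCH eigenvector procedure itself but its two-dimensional analogue; the function names and the sample points are hypothetical.

```python
import math

def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def optimal_rigid_motion_2d(pattern, partners):
    """Least-squares rotation angle and translation mapping pattern onto partners (2D only)."""
    pc, sc = centroid(pattern), centroid(partners)
    dot = cross = 0.0
    for (px, py), (sx, sy) in zip(pattern, partners):
        ax, ay = px - pc[0], py - pc[1]      # centered pattern point
        bx, by = sx - sc[0], sy - sc[1]      # centered partner point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    phi = math.atan2(cross, dot)             # maximizes cos(phi)*dot + sin(phi)*cross
    c, s = math.cos(phi), math.sin(phi)
    # t = -U * p_c + s'_c
    t = (sc[0] - (c * pc[0] - s * pc[1]), sc[1] - (s * pc[0] + c * pc[1]))
    return phi, t

def apply_rigid_motion_2d(points, phi, t):
    c, s = math.cos(phi), math.sin(phi)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]

# Hypothetical data: the partners are the pattern rotated by 90 degrees and moved by (3, 4).
pattern = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
partners = [(3.0, 4.0), (3.0, 5.0), (1.0, 4.0)]
phi, t = optimal_rigid_motion_2d(pattern, partners)
aligned = apply_rigid_motion_2d(pattern, phi, t)
```

Because the partners here are an exact rigid image of the pattern, the recovered motion reproduces them up to rounding; with noisy partners the same formulas return the RMS-optimal compromise.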


Introduction of scaling

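let us introduce a scaling value $\sigma \in \mathbb{R}$

\[ d = \sqrt{\frac{1}{n} \sum_i (\sigma U \cdot (p_i - p_c) - (s'_i - s'_c))^2} \]

- let $p''_i = U \cdot (p_i - p_c)$ be the translated and rotated point $p_i$
- let $s''_i = s'_i - s'_c$ be the centralized point $s'_i$
- the solution for the optimal $\sigma$:

\[ \sigma = \frac{\sum_i \langle s''_i, p''_i \rangle}{\sum_i \langle p''_i, p''_i \rangle} \]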


Different distance measures, different alignments

[Figure: alignments obtained with the AVG, MAX, and RMS distance measures]


Optimal transformations for non-differentiable distance measures

- if the function for $d$ is not differentiable, e.g., the average,
- we use a gradient-free optimization method (only with evaluations of the function)
- recently developed iterative method that is guaranteed to converge towards a local minimum
- algorithm of RODRÍGUEZ/GARCÍA-PALOMARES (2002)


Idea behind RODRÍGUEZ/GARCÍA-PALOMARES

- let $f(x)$ be the function to be minimized
- we iterate, contracting and expanding adequately the parameters $h_k > 0$ and $\tau > 0$, such that

\[ f(x_{i+1}) = f(x_i \pm h_k d_k) \le f(x_i) - \tau^2 \]

  where $d_k$ is a direction taken from a finite set of directions (which depends on the point $x_i$)
- with $\tau \longrightarrow 0$, $x_i$ converges to a local optimum (while there are no constraints)
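The sufficient-decrease test can be illustrated with a very simple derivative-free search. This is not the algorithm of RODRÍGUEZ/GARCÍA-PALOMARES: it is a plain coordinate search that uses the coordinate axes as its fixed direction set, only contracts the step (it never expands it), and shrinks $\tau$ together with $h$. All names and the test function are assumptions made for illustration.

```python
def pattern_search(f, x0, h=1.0, tau=1.0, shrink=0.5, tol=1e-6, max_iter=10000):
    """Coordinate search with the sufficient-decrease test f(x + h*d) <= f(x) - tau**2."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if h < tol:
            break
        improved = False
        for k in range(len(x)):
            for sign in (1.0, -1.0):
                y = list(x)
                y[k] += sign * h                  # step along +/- the k-th coordinate axis
                fy = f(y)
                if fy <= fx - tau * tau:          # accept only a sufficient decrease
                    x, fx, improved = y, fy, True
                    break
            if improved:
                break
        if not improved:
            h *= shrink                           # contract the step size
            tau *= shrink                         # and relax the required decrease
    return x, fx

def objective(x):
    """Hypothetical non-differentiable test function with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + abs(x[1] + 2.0)

best, value = pattern_search(objective, [0.0, 0.0])
```

Only function evaluations are used, so the kink of `abs` at the minimum does no harm; a gradient-based method would have no well-defined derivative there.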


Rotation in quaternion space

- a rotation $U \cdot p$ of the point $p$ with the matrix $U$ can be expressed as $q \star \bar{p} \star q^{-1}$ in quaternion space $\mathbb{H}$ (HAMILTON formula, $\mathbb{C} \sim \mathbb{R}^2$, $\mathbb{H} \sim \mathbb{R}^4$)
- where $\bar{p} = (0, p)$ is the canonical quaternion of the point $p$
- and $q = (\cos(\varphi/2), \sin(\varphi/2)\,u)$ is the rotation quaternion (with $u \in \mathbb{R}^3$ being the unit axis and $\varphi$ the angle of rotation)
- instead of $U$ with 9 constrained variables we have $u$ and $\varphi$, i.e., 4 unconstrained variables
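A minimal sketch of the quaternion rotation, assuming the scalar-first convention $(w, x, y, z)$ and the standard rotation quaternion $(\cos(\varphi/2), \sin(\varphi/2)\,u)$ with a unit axis $u$. The function names are hypothetical.

```python
import math

def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def rotation_quaternion(axis, phi):
    """Unit quaternion (cos(phi/2), sin(phi/2) * u) for a rotation by phi about axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    ux, uy, uz = (c / norm for c in axis)         # normalize the axis
    s = math.sin(phi / 2.0)
    return (math.cos(phi / 2.0), s * ux, s * uy, s * uz)

def rotate(point, axis, phi):
    """Rotate a 3D point via q * (0, p) * q^-1."""
    q = rotation_quaternion(axis, phi)
    q_inv = (q[0], -q[1], -q[2], -q[3])           # inverse of a unit quaternion is its conjugate
    p_bar = (0.0, point[0], point[1], point[2])
    _, x, y, z = q_mul(q_mul(q, p_bar), q_inv)
    return (x, y, z)

# Rotating the x unit vector by 90 degrees about the z axis should give the y unit vector.
rotated = rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2.0)
```

Because the axis is normalized inside `rotation_quaternion`, an optimizer may vary the axis components and the angle freely without leaving the set of valid rotations.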


Matching Algorithms

1 maximal clique detection within the graph of compatible distances
2 geometric hashing of the pattern
3 distance geometry


Maximal clique detection

- we generate a graph $G = (V, E)$ (graph of compatible distances)
- vertices: $v_{ij} \in V$ for all pairs $(p_i, s_j)$
- edges: $e = (v_{ij}, v_{kl}) \in E$ if $d(p_i, p_k) \approx d(s_j, s_l)$
- search for maximum cliques in $G$
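The construction of the graph of compatible distances can be sketched as follows. Two details are assumptions of this sketch and not taken from the text above: "approximately equal" is modelled as a relative tolerance `eps`, and vertex pairs that share a pattern point or a search-space point are never connected, since a matching has to be injective. The function name and the sample points are hypothetical, and the clique search itself is not shown.

```python
import math
from itertools import combinations

def compatibility_graph(P, S, eps=0.1):
    """Vertices are index pairs (i, j) for (p_i, s_j); edges join pairs with compatible distances."""
    vertices = [(i, j) for i in range(len(P)) for j in range(len(S))]
    edges = set()
    for (i, j), (k, l) in combinations(vertices, 2):
        if i == k or j == l:
            continue                      # injective matching: distinct points need distinct partners
        dp = math.dist(P[i], P[k])
        ds = math.dist(S[j], S[l])
        if abs(dp - ds) <= eps * dp:      # d(p_i, p_k) is approximately d(s_j, s_l)
            edges.add(((i, j), (k, l)))
    return vertices, edges

# Hypothetical data: only s_0 and s_1 are one unit apart, like p_0 and p_1.
P = [(0.0, 0.0), (1.0, 0.0)]
S = [(5.0, 5.0), (6.0, 5.0), (9.0, 9.0)]
vertices, edges = compatibility_graph(P, S)
```

A clique of size $k$ in this graph selects one partner for every pattern point such that all pairwise distances agree, which is exactly a candidate matching.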


Properties of Clique Detection

- the problem is NP-complete (however, we search only for cliques of size $\le k$)
- fast algorithms need adjacency matrices
- if $n = |S| = 5000$ and $k = |P| = 100$ we need 30 GByte (counting only one bit per edge)
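The memory estimate can be reproduced with a few lines of arithmetic, assuming that the full adjacency matrix (not only its upper triangle) is stored with one bit per entry. The variable names are chosen for this illustration.

```python
n, k = 5000, 100
vertices = n * k                 # one vertex per pair (p_i, s_j)
bits = vertices ** 2             # full adjacency matrix, one bit per entry
gbyte = bits / 8 / 1e9           # decimal gigabytes
gibyte = bits / 8 / 2 ** 30      # binary gigabytes
```

This gives roughly 31 decimal or 29 binary gigabytes, in line with the quoted 30 GByte; storing only the upper triangle would halve the figure.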


Geometric hashing

- preprocessing of the search space
- let's describe the two-dimensional case:
- we align each pair $(s_i, s_j)$ with $s_i$ at the origin and $s_j$ in direction $x$
- we insert some information for each other point $s_k \in S$ in a hashtable defined on a grid over $S$


Example: geometric hashing


Searching with geometric hashing

- we simulate an insertion of the points of $P$ into the hashtable, but we only count the non-empty entries
- many votes reveal candidates for partial alignments
- e.g., if we encounter a pair $(p_i, p_j)$ such that for each other point of the pattern there is a non-empty cell in the hashtable, we have found a perfect candidate
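Preprocessing and voting can be sketched for the two-dimensional case as follows. The sketch makes several simplifying assumptions: the "information" stored in a grid cell is just the basis pair $(i, j)$, the frames are not rescaled (rigid motions only), neighbouring cells are not inspected (so points close to a cell border can be missed), and the basis length is not checked. All names and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def to_frame(origin, x_point, point):
    """Coordinates of point in the frame with origin at `origin` and x-axis towards `x_point`."""
    dx, dy = x_point[0] - origin[0], x_point[1] - origin[1]
    length = math.hypot(dx, dy)
    ex, ey = dx / length, dy / length
    vx, vy = point[0] - origin[0], point[1] - origin[1]
    return (vx * ex + vy * ey, -vx * ey + vy * ex)

def cell(coords, grid):
    return (math.floor(coords[0] / grid), math.floor(coords[1] / grid))

def build_table(S, grid):
    """Preprocessing: for every ordered basis (s_i, s_j), hash all other points of S."""
    table = defaultdict(set)
    for i in range(len(S)):
        for j in range(len(S)):
            if i == j:
                continue
            for m in range(len(S)):
                if m not in (i, j):
                    table[cell(to_frame(S[i], S[j], S[m]), grid)].add((i, j))
    return table

def vote(P, a, b, table, grid):
    """Searching: count, per basis of S, the pattern points that hit a cell holding that basis."""
    votes = defaultdict(int)
    for m in range(len(P)):
        if m not in (a, b):
            for basis in table.get(cell(to_frame(P[a], P[b], P[m]), grid), ()):
                votes[basis] += 1
    return votes

# Hypothetical data: P is (s_0, s_1, s_2, s_4) rotated by 90 degrees and moved by (10, 10).
S = [(0.0, 0.0), (4.0, 0.0), (1.5, 2.5), (7.5, 3.5), (2.5, -1.5)]
P = [(10.0, 10.0), (10.0, 14.0), (7.5, 11.5), (11.5, 12.5)]
table = build_table(S, grid=1.0)
votes = vote(P, 0, 1, table, grid=1.0)
```

The basis $(s_0, s_1)$ collects one vote for each of the remaining pattern points, which marks it as a perfect candidate in the sense described above; a candidate still has to be verified by computing the actual alignment.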


Properties of geometric hashing

- grid size must be selected beforehand
- preprocessing time $O(n^{d+1})$
- searching time $O(k^{d+1})$
- works only for rigid motion transformations


Distance geometry

- we represent both sets $S$ and $P$ as distance graphs
- the vertices of the graphs are the points of the sets
- the edges of the graphs hold the distances between the corresponding vertices
- e.g., $G_P = (P, P \times P)$ complete graph


The ideas behind psm

- we define adequate distance graphs $G_P$ and $G_S$
- we search for subgraphs $G'_S$ of $G_S$ that are congruent to the graph $G_P$ (allowing certain tolerances)
- we optimally align $G_P$ with the subgraphs $G'_S$
- we select the best one among all hits
- we extend the search to work with subgraphs of $G_P$ as well
- we select a best subgraph as final solution


The four main steps of psm

1 construction of the graphs, with: exploitation of locality properties
2 search of subgraphs, with: sophisticated backtracking
3 alignment, with: minimization of cost functions
4 search of partial patterns, with: reactive tabu search


Construction of the graphs

- let us assume that the pattern $P$ is small
- we construct $G_P$ as the complete graph
- we generate a dictionary $D$ (ordered data structure) that contains all distances (intervals) between points in $P$
- we consider an edge between two vertices in $G_S$ if the distance is present in the dictionary $D$


Construction of the dictionary

- let $d_{ij} = d(p_i, p_j)$ be the distance between two points of $P$
- the dictionary will contain the interval

\[ [\,(1 - \varepsilon) \cdot d_{ij},\ (1 + \varepsilon/(1 - \varepsilon)) \cdot d_{ij}\,] \in D \]

  where $0 \le \varepsilon < 1$ is an appropriate tolerance
- the upper limit can be simplified to $(1 + \varepsilon) \cdot d_{ij}$ (but we lose the symmetry)
- we can join intervals in the dictionary $D$ if they intersect
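A minimal sketch of such a dictionary, using the identity $1 + \varepsilon/(1 - \varepsilon) = 1/(1 - \varepsilon)$ for the upper limit and joining intersecting intervals. A sorted list with a linear lookup stands in for the ordered data structure; the function names and the sample pattern are hypothetical.

```python
import math
from itertools import combinations

def distance_intervals(P, eps):
    """Merged tolerance intervals [(1 - eps) * d, d / (1 - eps)] for all pairwise distances in P."""
    raw = sorted(
        ((1.0 - eps) * d, d / (1.0 - eps))
        for d in (math.dist(p, q) for p, q in combinations(P, 2))
    )
    merged = []
    for lo, hi in raw:
        if merged and lo <= merged[-1][1]:
            # the interval intersects the previous one: join them
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def contains(intervals, value):
    """Linear scan for brevity; a sorted structure would allow binary search."""
    return any(lo <= value <= hi for lo, hi in intervals)

# Hypothetical pattern with pairwise distances 3, 4 and 5.
P = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
narrow = distance_intervals(P, 0.1)   # three separate intervals
wide = distance_intervals(P, 0.2)     # all intervals intersect and are joined
```

The symmetric form means that a distance $s$ lies in the interval around $d$ exactly when $d$ lies in the interval around $s$. An edge between two points of $S$ is kept only if `contains` accepts their distance, so a larger tolerance produces a denser graph $G_S$.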


Construction of the graphs that way

[Figure: the graphs $G_P$ and $G_S$]


Fast construction of the graphs

- we construct $G_P$ as a connected (and rigid) graph, maintaining only the short edges
- we sort the points of $S$ beforehand into a grid whose cell size is similar to the largest of the intervals


Construction of the graphs that way

[Figure: the graphs G_P and G_S after the fast construction]


Matching

let us assume (at the beginning) that G_P is a complete graph
we order the points of G_P according to any order, e.g. (p_0, ..., p_{k-1})
we apply a backtracking algorithm that tries to find for each p_i a partner s_i, following the established ordering
hence:


Backtracking for the search

let us assume that we have already found a subgraph G_{s_0,...,s_i} to which the graph G_{p_0,...,p_i} can be matched
we look for candidates s_{i+1} for the next point p_{i+1}
  that must be neighbors of the point s_i within G_S
  that must not be matched already and
  that have similar distances to the s_j (j ≤ i) as p_{i+1} has to the p_j (j ≤ i)
while there is a candidate we advance with i
if there are no more candidates for s_{i+1}, s_i cannot be a partner for p_i either (i.e.: backtracking)
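
The backtracking step can be sketched as a recursive generator. This is a simplified illustration under stated assumptions: G_P is complete, the candidates are taken from all points of S instead of only the neighbors of s_i in G_S, and the names `match` and `similar` as well as the tolerance test are hypothetical.

```python
from math import dist


def similar(d_p, d_s, eps):
    """Assumed tolerance test: d_s lies within [(1-eps) d_p, d_p / (1-eps)]."""
    return (1 - eps) * d_p <= d_s <= d_p / (1 - eps)


def match(pattern, space, eps, i=0, partners=None):
    """Yield every tuple of indices into `space` that matches `pattern` in order."""
    partners = [] if partners is None else partners
    if i == len(pattern):                        # a partner for p_{k-1} was found
        yield tuple(partners)
        return
    for c, s in enumerate(space):                # candidates s_i for p_i
        if c in partners:                        # must not be matched already
            continue
        if all(similar(dist(pattern[i], pattern[j]),
                       dist(s, space[partners[j]]), eps)
               for j in range(i)):               # similar distances to all s_j, j < i
            partners.append(c)                   # advance with i
            yield from match(pattern, space, eps, i + 1, partners)
            partners.pop()                       # candidates exhausted: backtrack
```

Because `match` is a generator, `next(match(P, S, eps), None)` stops at the first solution, while `list(match(P, S, eps))` enumerates all of them.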


Termination of the algorithm

the algorithm reports each time a candidate for p_{k-1} has been found
the algorithm terminates when
  there are no more candidates for p_0 or
  the first solution has been found


Optimizations of the basic algorithm

reduction of the edges in G_P
  implies: reduction of the edges in G_S
good ordering of the p_i
  implies: reduction of the number of candidates
consideration of the type of point (e.g. element type of the atom)
  implies: reduction of the number of candidates
all heuristics imply: the backtracking advances faster


Search for partial alignments

find the subset of the points of the pattern that can be matched best to some points in the search space
NP-complete
there are |P(P)| = 2^k possibilities to choose a subset (P(P) denotes the power set of the pattern P)
we apply:
  genetic algorithm
  reactive tabu search


Genetic Algorithm

maintain graph G_S as complete graph
genome: sequence of bits indicating whether a point belongs to the current pattern or not
crossover: two-point crossover
mutation: flip
selection: roulette wheel
cost function: distance and size of alignment
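
The three operators named above could look as follows on bit-list genomes. This is a generic sketch of the standard operators, not the code used in psm. The function names are hypothetical, and `rng` is assumed to be a `random.Random` instance so that runs are reproducible.

```python
import random


def two_point_crossover(a, b, rng):
    """Swap the segment between two random cut points of two equal-length bit lists."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def flip_mutation(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def roulette_wheel(population, fitness, rng):
    """Pick one genome with probability proportional to its non-negative fitness."""
    return rng.choices(population, weights=fitness, k=1)[0]
```

Roulette-wheel selection needs a fitness that is larger for better genomes, so a cost to be minimized would have to be transformed first.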


Termination of the GA

it is not that easy
once the first solution has been found
once a sufficiently good solution has been found
after a certain number of iterations
once diversity of population is too low


Problems with GA

G_S must be a complete graph
Do you know a crossover operation for non-complete graphs?
more precisely: we need a crossover (and mutation) operation that maintains a specific property of the graphs (e.g., connectivity, rigidity)
or some new idea...


Reactive Tabu Search

we start with an admissible solution
we search for possibilities to improve the current solution
if we can: we choose one randomly
if we cannot:
  we search for possibilities to reduce the current solution
  if we can: we again try improvements
  if we cannot: we jump to another admissible solution


The Tabu criterion

we avoid repetitive movements by taking advantage of a memory that stores intermediate solutions
i.e.: we mark certain movements as tabu for a certain number of iterations
reactive means: we adapt the tabu period dynamically
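
One way to picture the reactive part is a small memory class whose tabu period grows when a solution is revisited and shrinks otherwise. The class `ReactiveTabu` and its update rule (doubling on a repetition, decreasing by one otherwise) are assumptions chosen for illustration; the slides do not state the rule that is actually used.

```python
class ReactiveTabu:
    """Tabu memory whose period (tenure) adapts to repetitions of solutions."""

    def __init__(self, tenure=2, min_tenure=1, max_tenure=50):
        self.tenure = tenure
        self.min_tenure = min_tenure
        self.max_tenure = max_tenure
        self.iteration = 0
        self.tabu_until = {}    # move -> first iteration at which it is allowed again
        self.visited = set()    # memory of intermediate solutions

    def is_tabu(self, move):
        return self.tabu_until.get(move, 0) > self.iteration

    def record(self, move, solution):
        """Register an executed move and the solution it produced."""
        key = frozenset(solution)
        if key in self.visited:                  # repetition detected: react
            self.tenure = min(self.max_tenure, self.tenure * 2)
        else:                                    # new solution: relax slowly
            self.tenure = max(self.min_tenure, self.tenure - 1)
            self.visited.add(key)
        self.iteration += 1
        self.tabu_until[move] = self.iteration + self.tenure
```

A search loop would skip every candidate move for which `is_tabu` returns true and call `record` after each executed move.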


Quality of a partial alignment

evaluation of the cost of a solution: number of aligned points plus quality of the alignment
remember: quality Q ∈ ]0,1], but we will use Q ≥ threshold
hence, maximal quality: |P| + 1
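
As a small worked example of this evaluation: because Q never exceeds 1, a solution with more aligned points is never worse than one with fewer, and Q only decides between solutions of equal size. The function name `solution_value` and the default threshold of 0.5 are assumptions for this sketch.

```python
def solution_value(num_aligned, quality, threshold=0.5):
    """Value of a partial alignment: aligned points plus quality Q in ]0, 1].

    Returns None when the alignment is rejected because Q < threshold.
    """
    if not 0 < quality <= 1:
        raise ValueError("quality must lie in ]0, 1]")
    if quality < threshold:
        return None
    return num_aligned + quality
```

For a pattern with |P| = 5 points, a perfect alignment of all points reaches the maximal value 5 + 1 = 6.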


Reactive Tabu Search for psm

representation of the problem: sets of indices of the matched points
search for candidates to improve (add): (rigidly) connected neighbors within graph G_S
search for candidates to reduce (drop): any point of the current solution whose removal keeps the graph G_S connected (and rigid)
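
The two neighborhoods can be sketched on an adjacency-set representation of G_S. Only connectivity is checked here; the rigidity test mentioned above is omitted. The function names and the `adj` dictionary (vertex index to set of neighbor indices) are assumptions for this example.

```python
def is_connected(nodes, adj):
    """Depth-first check that `nodes` induce a connected subgraph of `adj`."""
    nodes = set(nodes)
    if not nodes:
        return True
    stack, seen = [next(iter(nodes))], set()
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adj[v] & (nodes - seen))   # unvisited neighbors inside `nodes`
    return seen == nodes


def add_candidates(solution, adj):
    """Points outside the solution adjacent to at least one point inside it."""
    return {w for v in solution for w in adj[v]} - set(solution)


def drop_candidates(solution, adj):
    """Points whose removal leaves the remaining solution connected."""
    solution = set(solution)
    return {v for v in solution if is_connected(solution - {v}, adj)}
```

For the path 0-1-2 inside a larger graph, the middle point 1 is not a drop candidate, because removing it would disconnect 0 from 2.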


When do we terminate?

not that simple
once we have found a sufficiently good solution
once we have run a certain number of iterations


Search for the largest common pattern

[Figure: pattern P and search space S]


Search for the largest common pattern

[Figure: panels p/P and p/S]


Search for the largest common pattern

[Figure: panels s/P and s/S]


Search for pattern with deformation

instead of the complete graph use a connected sparse graph
parts of the graph could be rigid
the graph may specify hinges or torsion axes


psm

command line tool with configuration file
GUI
web site to perform searches


Software
newer version

almost 11,000 specific lines of C++
uses the libraries:
  mtl (matrix template library)
  Gtkmm (graphical user interface)
  own libraries


Possible extensions

enumerate more rigorously all locations (up to now we have concentrated on the best solution)
extend the properties of the graphs defining deformations of the pattern (e.g. torsion of parts, restriction of angles)
allow local tolerances (e.g. per edge), especially with prior knowledge of the biochemical properties
improve heuristics with statistical analysis of distributions of distances (look for the unusual first)
improve the user interface
more applications
