Deterministic search for relational graph matching

Pattern Recognition 32 (1999) 1255–1271


Mark L. Williams (a), Richard C. Wilson (b), Edwin R. Hancock (b, *)

(a) Defence Research Agency, St. Andrews Road, Malvern, Worcestershire, WR14 3PS, UK
(b) Department of Computer Science, University of York, York, YO1 5DD, UK

Received 13 April 1998; accepted 16 October 1998

Abstract

This paper describes a comparative study of various deterministic discrete search strategies for graph-matching. The framework for our study is provided by the Bayesian consistency measure recently reported by Wilson and Hancock (IEEE PAMI 19 (1997) 634–648; Pattern Recognition 17 (1996) 263–276) and Wilson et al. (Comput. Vision Image Understanding 72 (1998) 20–38). We investigate two classes of update process. The first of these aims to exploit discrete gradient ascent methods. We investigate the effect of searching in the direction of both the local and global gradient maximum. An experimental study demonstrates that, although more computationally intensive, the global gradient method offers significant performance advantages in terms of accuracy of match. Our second search strategy is based on tabu search. In order to develop this method we introduce memory into the search procedure by defining context-dependent search paths. We illustrate that, although it is more efficient than the global gradient method, tabu search delivers almost comparable performance. © 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Tabu search; Graph-matching; Natural gradient; Consistent labelling; Discrete relaxation; Heuristic search

1. Introduction

Relational graph matching is a process that is central to symbolic interpretation problems in artificial intelligence and pattern recognition [1, 2]. The formal aspects of the problem have been studied for over 30 years [3]. In particular, topics such as subgraph isomorphism [4], maximal clique finding [5] and graph partitioning [6] have attracted considerable interest in the fields of discrete mathematics and the theory of algorithms. Each of these problems is known to be NP-complete, and the quest for efficient algorithms of polynomial complexity still raises many important theoretical issues.

* Corresponding author. Tel.: +44 1904 432767; Fax: +44 1904 433374; E-mail: [email protected]

However, from the perspective of practical problem-solving the issue of theoretical complexity is of less importance than the ability to find useful, though suboptimal, solutions using finite computing resources. In any case, since the graph-structures under study are likely to be inexact due to the presence of noise and segmentation errors, approximate solutions may be the best that is achievable [7–9]. Indeed, the idea of posing graph-theoretic problems as optimisation tasks has recently attracted considerable interest in the literature [10–13]. Mean-field networks have been applied to a variety of graph-theoretic topics including graph-partitioning [14], the travelling salesman problem [15] and graph-matching [10,12]. Although these examples can all be regarded as instances of continuation methods, discrete configurational optimisation methods have also been used to great effect. For instance, Cross, Wilson and Hancock have used genetic search for graph-matching [16]. Heuristic search techniques have also proved to be highly effective. In the classical pattern analysis literature Haralick and Elliott [2] have used forward-checking backtracking to search for consistent labellings. More recently, Messmer and Bunke [17] have addressed the issue of exponential complexity by demonstrating how subgraph isomorphisms can be located in polynomial time if a form of structural hashing is used to prune the search space.

0031-3203/99/$20.00 © 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. PII: S0031-3203(98)00152-6

In fact, the idea of using domain-specific heuristics to improve the efficiency of search was central to much of the pioneering work on artificial intelligence of the 1970s [18]. Perhaps the most popular of these is the A* algorithm, which has been widely exploited in applications such as planning and feature extraction. With the advent of more generally applicable global configurational optimisation strategies such as simulated annealing [19], mean-field theory [14,15,20] and most recently evolutionary optimisation (genetic search), heuristic search has been largely neglected in the literature. However, as recently observed by Glover and his co-workers [6,21–23], the abandonment of the method may be somewhat premature, since the use of domain-specific knowledge can prove highly effective in rapidly locating useful though suboptimal solutions. It is this observation that has led to the development of a new class of heuristic optimisation techniques known as tabu search.

Tabu search [6,21–23] exploits constraints known to apply in a specific domain to take maximum advantage of the available computational resources. Rather than adopting a computationally demanding exploration of the state space using stochastic (e.g. simulated annealing and genetic search) or continuation methods (e.g. mean-field annealing), tabu search deploys the resources to preferentially search potentially profitable areas. In essence, the search procedure possesses memory. This memory can be both short- and long-term. In the long term, regions of unprofitable search are deemed tabu and are not revisited. Short-term memory can be invoked to intensify search in certain regions. If intensification fails to yield a useful solution, then diversification strategies can be invoked over a longer time scale. In the broadest sense, this process can be viewed as planning the deployment of computational resources to gain maximum yield in terms of solution quality.

This paper aims to investigate the use of different deterministic search strategies for graph matching. In a recent series of papers we have developed a Bayesian framework which allows the consistency of graph matching to be gauged using probability distributions [24–26]. These distributions are defined over the Hamming distances between partially consistent subgraphs and a set of model subgraphs residing in a dictionary. We have explored the optimisation of this global consistency measure using a number of stochastic [18,27] and continuation methods [28–30]. In this paper we aim to compare the use of steepest gradient methods with a heuristic search method inspired by tabu search.

2. Relational graphs

We abstract the matching process in terms of purely symbolic relational graphs [9,31–33]. We use the notation $G = (V, E)$ to denote the graphs under match, where $V$ is the set of nodes and $E$ is the set of arcs (or edges). Our aim in matching is to associate nodes in a graph $G_1 = (V_1, E_1)$ representing data to be matched against those in a graph $G_2 = (V_2, E_2)$ representing an available relational model. This matching process is facilitated using symbolic constraints provided by suitable relational subunits of the model graph $G_2$. In order to accommodate the possibility of unmatchable data-graph nodes, we augment the model-graph nodes with a null-label $\phi$. This label acts as an attractor for extraneous nodes in the data graph that may, for instance, be due to the presence of noise or clutter. Formally, the matching is represented by a function $f : V_1 \rightarrow V_2$ from the nodes in the data graph $G_1$ onto those in the augmented model graph $G_2$. The function $f$ consists of a set of Cartesian pairs drawn from the space of possible matches between the two graphs, i.e. $f \subseteq V_1 \times (V_2 \cup \phi)$; it provides a convenient device for indexing the nodes in the data graph $G_1$ against their matched counterparts in the model graph $G_2$. We use the notation $(u, v) \in f$ to denote the match of node $u \in V_1$ against node $v \in V_2$.

In performing the matches of the nodes in the data graph $G_1$ we will be interested in exploiting structural constraints provided by the model graph $G_2$. These constraints are purely symbolic in nature and are represented by configurations of matched labels drawn from the model graph. We use representational units or subgraphs that consist of neighbourhoods of nodes interconnected by arcs. For convenience we refer to these structural subunits or N-ary relations as super-cliques. The super-clique of the node indexed $j$ in the graph $G_1$ with arc-set $E_1$ is denoted by the set of nodes $C_j = j \cup \{ i \mid (i, j) \in E_1 \}$. The matched realisation of this super-clique is denoted by the relation $\Gamma_j = (f(u_1), f(u_2), \ldots, f(u_{|C_j|}))$. In order to facilitate comparison between super-cliques of different size, we pad out the smaller unit with dummy nodes so as to raise it to the same cardinality as the larger unit. Our aim is to modify the match to optimise a measure of global consistency with the constraints provided by the model graph $G_2$. The constraints available to us are provided by the N-ary symbol relations on the super-cliques of the model graph $G_2$.

The critical ingredient in developing our matching scheme is the set of feasible mappings between each super-clique of graph $G_1$ and those of graph $G_2$. The set of feasible mappings, or dictionary, for the super-clique $C_j$ is denoted by $\Theta_j = \{ S_i \}$ where $S_i = i \cup \{ j \mid (i, j) \in E_2 \}$. Each element $S_i$ of $\Theta_j$ is therefore a relation formed on the nodes of the model graph; we denote such consistent relations by $S_i = (v_1, v_2, \ldots)$. The dictionary of feasible mappings for the super-clique $C_j$ consists of all the consistent relations that may be elicited from the graph $G_2$. In practice these relations are formed by performing permutation of the non-centre nodes for each super-clique with the requisite number of dummy nodes. An example of this mapping process is shown in Fig. 1. This process effectively models the disruption of the adjacency structure of the model graph caused by the addition of clutter elements. Since it is intrinsically symbolic in nature, the resulting dictionary is invariant to scene translations, scalings or rotations.

Fig. 1. Example super-clique mapping.
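The definitions of the super-clique $C_j$, its matched realisation $\Gamma_j$ and the dummy-node padding can be illustrated with a minimal Python sketch. The function names, the tuple layout (centre node first, neighbours in sorted order) and the string markers "phi" and "dummy" are assumptions made for illustration only; they are not taken from the paper.

```python
def super_cliques(nodes, edges):
    """Build C_j = j plus its neighbours for every node of an undirected graph."""
    neighbours = {j: set() for j in nodes}
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)
    # Centre node first, then neighbours in a fixed (sorted) order.
    return {j: (j, *sorted(neighbours[j])) for j in nodes}


def matched_realisation(clique, f, null="phi"):
    """Gamma_j: the labels assigned by the match f to the nodes of a super-clique.

    Nodes missing from f are treated as null-labelled (an assumption).
    """
    return tuple(f.get(u, null) for u in clique)


def pad(relation, size, dummy="dummy"):
    """Pad the smaller relation with dummy nodes up to the requested cardinality."""
    return tuple(relation) + (dummy,) * max(0, size - len(relation))


nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (2, 4)]
cliques = super_cliques(nodes, edges)
f = {1: "a", 2: "b", 3: "c"}          # node 4 is left unmatched
gamma_2 = matched_realisation(cliques[2], f)
```

Keeping the centre node in the first position makes it straightforward to permute only the non-centre entries when dictionary items are generated.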

It is the size of the dictionary which poses the main computational bottleneck in the application of our matching scheme. For instance, if we are considering the matching of super-cliques of the same size, i.e. no padding is required, then there are $|C_j|$ cyclic dictionary items of the super-clique $C_j$. If, on the other hand, the cyclicity constraint is lifted then there are $|C_j|!$ items. When padding is introduced, then the complexity is increased. If the model-graph relation $S_i$ is being compared with the match residing on the data-graph clique $C_j$, then there are $(|S_i|-1)! / [(|C_j|-1)!\,(|S_i|-|C_j|)!]$ cyclic dictionary items and $(|S_i|-1)!\,|C_j| / (|S_i|-|C_j|)!$ non-cyclic dictionary items. In Section 4 we consider how the underlying complexity of our method can be restricted by pruning the set of dictionary items in a tabu search strategy.

3. Bayesian consistency measure

In this section we review the development of a refinement of the relational consistency measure originally reported by Wilson and Hancock [24–26]. As we noted in Section 2, the consistent labellings available for gauging the quality of match are represented by the set of relational mappings from $C_j$ onto $G_2$, i.e. $\Theta_j$. As demanded by the Bayes rule, we compute the probability of the required super-clique matching by expanding over the basis configurations belonging to the dictionary $\Theta_j$:

$$P(\Gamma_j) = \sum_{S_i \in \Theta_j} P(\Gamma_j \mid S_i)\, P(S_i). \qquad (1)$$

The development of a useful graph-mapping measure from this expression requires models of the processes at play in matching and of their roles in producing errors. These models are represented in terms of the joint conditional matching probabilities $P(\Gamma_j \mid S_i)$ and of the joint priors $P(S_i)$ for the consistent relations in the dictionary. In developing the required models we will limit our assumptions to the case of matching errors which are memoryless and occur with uniform probability distribution.

To commence our modelling of the conditional probabilities, we assume that the various types of matching error for nodes belonging to the same super-clique are memoryless. In direct consequence of this assumption, we may factorise the required probability distribution over the symbolic constituents of the relational mapping under consideration. As a result the conditional probabilities $P(\Gamma_j \mid S_i)$ may be expressed in terms of a product over label confusion probabilities

$$P(\Gamma_j \mid S_i) = \prod_{k=1}^{|S_i|} P(f(u_k) \mid v_k). \qquad (2)$$

Our next step is to propose a two-component model of the processes which give rise to erroneous matches. The first of these processes is initialisation error, which we aim to rectify by iterative label updates. We assume that initialisation errors occur with a uniform and memoryless probability $P_e$. This probability is distributed over the $|V_2| - 1$ possible matching errors that can occur. The second source of error is structural disturbance of the relational graphs caused by noise, clutter or segmentation error. We assume that structural errors can also be modelled by a uniform distribution which occurs with probability $P_\phi$. This probability is distributed over both the possibility of null-labelling of the data-graph nodes and the possibility of dummy insertions in the dictionary item $S_i$. Under these dual assumptions concerning the nature of matching errors, the confusion probabilities appearing under the product of Eq. (2) may be assigned according to the following distribution rule:

$$P(f(u_k) \mid v_k) = \begin{cases} (1-P_\phi)(1-P_e) & \text{if } f(u_k) = v_k, \\ (1-P_\phi)\dfrac{P_e}{|V_2|-1} & \text{if } f(u_k) \neq v_k \text{ and } v_k \neq \text{dummy}, \\ P_\phi & \text{if } f(u_k) = \phi \text{ or } v_k \neq \text{dummy}, \\ P_\phi \dfrac{1}{|V_2|} & \text{if } f(u_k) \neq \phi \text{ or } v_k \neq \text{dummy}. \end{cases} \qquad (3)$$

The four cases under this distribution rule require further explanation. The first case corresponds to the situation in which there is agreement between the current match and that demanded by the dictionary item $S_i$. The second case corresponds to matching disagreements. In this case, there are $|V_2| - 1$ erroneous labels over which the error probability $P_e$ may be distributed. The third case arises when the super-clique under consideration contains more nodes than the dictionary item $S_i$. In this case the matching configuration $\Gamma_j$ is padded out with dummy nodes. Since the dummy nodes are not mapped onto specific nodes in the model graph, the null-match probability is assigned. In the fourth and final case, the dictionary item contains fewer nodes than the data-graph super-clique. In this case the dictionary item is padded with dummy nodes. Since there are $|V_2|$ model-graph nodes that can be dummied in this way, the null-match probability is distributed uniformly.

As a natural consequence of this distribution rule the joint conditional probability is a function of three physically meaningful variables. The first of these is the Hamming distance $H(\Gamma_j, S_i)$ between the assigned matching and the feasible relational mapping $S_i$. This quantity counts the number of conflicts between the current matching assignment $\Gamma_j$ residing on the super-clique $C_j$ and those assignments demanded by the relational mapping $S_i$. The second variable is the number of dummy nodes padding the data-graph clique $C_j$, which we denote by $t_{i,j}$. This second quantity is equal to the size difference between the structure-preserving mapping $S_i$ and the data-graph clique $C_j$, i.e. $t_{i,j} = \bigl|\, |S_i| - |C_j| \,\bigr|$. The third quantity is the number of null-labels assigned to the non-dummy nodes of the data clique $C_j$, which we denote by $\psi(\Gamma_j)$. With these ingredients, the resulting expression for the joint conditional probability acquires an exponential character

$$P(\Gamma_j \mid S_i) = \left[(1-P_\phi)(1-P_e)\right]^{|C_j| - H(\Gamma_j, S_i) - \psi(\Gamma_j) - t_{i,j}} \times \left[(1-P_\phi)\frac{P_e}{|V_2|-1}\right]^{H(\Gamma_j, S_i)} \times \left[P_\phi\right]^{\psi(\Gamma_j)} \times \left[P_\phi \frac{1}{|V_2|}\right]^{t_{i,j}}. \qquad (4)$$

Finally, in order to compute the super-clique matching probability $P(\Gamma_j)$, we require a model of the joint priors for the dictionary items. Here we assume that the unit probability mass is uniformly distributed over the relevant items, i.e.

$$P(S_i) = \frac{1}{|\Theta_j|}. \qquad (5)$$

Collecting together terms in the expression for $P(\Gamma_j \mid S_i)$ and substituting for the joint priors for the dictionary items, we obtain the following expression for the super-clique matching probability:

$$P(\Gamma_j) = \frac{K_{C_j}}{|\Theta_j|} \sum_{S_i \in \Theta_j} \exp\left[-\left(k_e H(\Gamma_j, S_i) + k_\phi \{ t_{i,j} + \psi(\Gamma_j) \} + \ln |V_2| \, t_{i,j}\right)\right], \qquad (6)$$

where $K_{C_j} = [(1-P_e)(1-P_\phi)]^{|C_j|}$. The two exponential constants appearing in the above expression are related to the matching-error probability and the null-match probability, i.e. $k_e = \ln \frac{(1-P_e)(|V_2|-1)}{P_e}$ and $k_\phi = \ln \frac{(1-P_e)(1-P_\phi)}{P_\phi}$. The probability distribution may be regarded as providing a natural way of softening the hard relational constraints operating in the model graph. The most striking and critical feature of the expression for $P(\Gamma_j)$ is that the consistency of match is gauged by a series of exponentials that are compounded over the dictionary of consistently mapped relations.
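As a numerical check on the step from Eq. (4) to Eq. (6), the following Python sketch evaluates the conditional probability both as the product of Eq. (4) and in the exponential form that appears under the sum of Eq. (6). The function names and argument order are illustrative assumptions; the inputs are simply the counts $H$, $\psi$ and $t_{i,j}$, the super-clique size $|C_j|$, the model-graph size $|V_2|$ and the two error probabilities.

```python
import math


def conditional_prob_product(H, psi, t, clique_size, n_model, p_e, p_phi):
    """P(Gamma_j | S_i) written as the four-factor product of Eq. (4)."""
    agreements = clique_size - H - psi - t
    return (((1 - p_phi) * (1 - p_e)) ** agreements
            * ((1 - p_phi) * p_e / (n_model - 1)) ** H
            * p_phi ** psi
            * (p_phi / n_model) ** t)


def conditional_prob_exponential(H, psi, t, clique_size, n_model, p_e, p_phi):
    """The same quantity in the exponential form used under the sum of Eq. (6)."""
    k_e = math.log((1 - p_e) * (n_model - 1) / p_e)
    k_phi = math.log((1 - p_e) * (1 - p_phi) / p_phi)
    K = ((1 - p_e) * (1 - p_phi)) ** clique_size
    return K * math.exp(-(k_e * H + k_phi * (t + psi) + math.log(n_model) * t))


def super_clique_prob(counts, clique_size, n_model, p_e, p_phi):
    """Eq. (6) with the uniform prior of Eq. (5).

    counts holds one (H, psi, t) triple per dictionary item S_i.
    """
    total = sum(conditional_prob_exponential(H, psi, t, clique_size,
                                             n_model, p_e, p_phi)
                for H, psi, t in counts)
    return total / len(counts)


args = (6, 30, 0.1, 0.05)  # |C_j|, |V_2|, P_e, P_phi (arbitrary test values)
p_product = conditional_prob_product(2, 1, 1, *args)
p_exponential = conditional_prob_exponential(2, 1, 1, *args)
```

The two forms agree to floating-point precision, which reflects the fact that each factor of Eq. (4) divided by $(1-P_\phi)(1-P_e)$ is the exponential of one of the constants $-k_e$, $-k_\phi$ or $-(k_\phi + \ln |V_2|)$.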

We use the super-clique matching probabilities to construct a global consistency measure for the current state of match. For the sake of simplicity, we average the consistency measure over the data-graph nodes using the following quantity:

$$Q(f) = \frac{1}{|V_1|} \sum_{j \in V_1} P(\Gamma_j). \qquad (7)$$

In the next section of this paper we discuss various alternative strategies for searching for matching configurations which maximise this global consistency measure.

4. Search strategies

In the previous section we reviewed the development of a Bayesian consistency measure that can be utilised in the search for graph matches. This measure is based on a purely symbolic representation of the matching process and does not, for instance, draw on attribute information to establish consistency of match. In this section we describe some alternative ways in which the consistency measure can be used in the deterministic search for graph matches.

4.1. Gradient ascent

In the continuous domain, gradient ascent involves selecting parameter updates which are aligned in the direction of maximum slope on the optimisation surface. When the optimisation surface is defined over a set of discrete entities, then the definition of the steepest gradient requires more care. Suppose that $\Delta_{a,\alpha} Q(f)$ is the change in the global consistency measure when the match on the node $a$ in the data graph is switched from its current value $f(a)$ to the new value $\alpha$. With the definition of the global consistency measure given in Eq. (7) the change is evaluated over those super-cliques modified by the label update, i.e.

$$\Delta_{a,\alpha} Q(f) = \sum_{b \in C_a} \left[ P(\alpha, f(c), \forall c \in C_b - a) - P(f(a), f(c), \forall c \in C_b - a) \right]. \qquad (8)$$

With this definition of gradient, there are two ways in which the match can be updated. These are described in the following subsections of the paper.

4.1.1. Local steepest gradient

The simplest way of searching for consistent matches is to apply steepest gradient methods in sequential order. Here the graph nodes are considered one after the other in pre-specified order and the matching updates made in accordance with the following rule:

$$f(a) = \arg\max_{\alpha \in V_2} \Delta_{a,\alpha} Q(f). \qquad (9)$$

The update selected is the one that results in the greatest increase in consistency. Unfortunately, this simple updating scheme does not necessarily follow the direction of steepest positive gradient. The search is limited by the fact that it is only the node currently under consideration that can have its matching assignment changed. The reason for this is that although the state-space of the optimisation surface is $|V_1 \times V_2|$-dimensional, each move to a new solution is constrained to lie within a $|V_2|$-dimensional subspace. The number of computations required at the node $a$ is of order $|C_a| \cdot |V_2| \cdot |\Theta_a|$. The consequence of this is that in general the path to the global maximum of the surface will be longer than the one dictated by true gradient ascent. Moreover, the chance of encountering a local maximum is increased.
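The sequential update rule of Eq. (9) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `consistency` is a hypothetical callable standing in for $Q(f)$, the match is held in a plain dictionary, and the whole measure is re-evaluated for every trial label rather than only the super-cliques modified by the update, which is simpler but more expensive than the scheme described in the text.

```python
def local_gradient_pass(f, data_nodes, labels, consistency):
    """One sequential pass of local steepest gradient (a sketch of Eq. (9)).

    f           -- dict mapping data-graph nodes to model-graph labels
    data_nodes  -- nodes in the pre-specified visiting order
    labels      -- candidate model-graph labels (V_2, optionally with the null label)
    consistency -- hypothetical callable returning Q(f) for a complete match
    """
    f = dict(f)  # work on a copy so the caller's match is left untouched
    for a in data_nodes:
        base = consistency(f)
        best_label, best_gain = f[a], 0.0
        for alpha in labels:
            trial = dict(f)
            trial[a] = alpha
            gain = consistency(trial) - base  # Delta_{a,alpha} Q(f)
            if gain > best_gain:
                best_label, best_gain = alpha, gain
        f[a] = best_label  # only node a may change at this step
    return f


# Toy consistency measure (an assumption): fraction of nodes agreeing with a target.
target = {1: "a", 2: "b", 3: "c"}
toy_q = lambda match: sum(match[u] == target[u] for u in target) / len(target)
start = {1: "a", 2: "a", 3: "a"}
result = local_gradient_pass(start, [1, 2, 3], ["a", "b", "c"], toy_q)
```

Because the current label is always among the candidates with zero gain, a pass of this kind never decreases the consistency measure.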

4.1.2. Global steepest gradient

True gradient ascent requires the computation of the consistency metric under the complete set of possible mappings for all the nodes in the graph. In other words, rather than choosing the data-graph node in some pre-specified order, we visit the node $a \in V_1$ which possesses the largest value of $\Delta_{a,\alpha} Q(f)$ over the set of model-graph nodes. The update process involves identifying the Cartesian pair $(a, \alpha)$ which maximises the gradient and updating the state of match accordingly, i.e.

$$(a, \alpha) = \arg\max_{(a,\alpha) \in V_1 \times V_2} \Delta_{a,\alpha} Q(f) \;\Rightarrow\; f(a) = \alpha. \qquad (10)$$

At first sight this would appear prohibitively expensive since it implies an increase in the number of computations required by a factor $|V_1|$. In fact, the number of computations required for the update decision at node $a$ is $|V_1| \cdot |V_2| \cdot |\Theta_a|$. However, in practice it is only necessary to re-evaluate the consistency measure for those super-cliques modified by a matching re-assignment. In other words, the full set of consistency values need only be calculated in the first iteration. In fact the number of required calculations increases by a factor equal to the average number of nodes in each super-clique. By monitoring the super-cliques modified by label updates, the complexity at subsequent iterations is no greater than that for local-gradient search.
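The global update of Eq. (10) can be sketched in Python in the same style. As before, `consistency` is a hypothetical callable standing in for $Q(f)$ and the function names are illustrative assumptions. For brevity the sketch recomputes the full measure for every candidate pair; it does not implement the bookkeeping of modified super-cliques discussed above.

```python
def global_gradient_step(f, data_nodes, labels, consistency):
    """Apply the single (node, label) update with the largest gain (sketch of Eq. (10)).

    Returns the updated match and a flag saying whether anything changed.
    """
    base = consistency(f)
    best_pair, best_gain = None, 0.0
    for a in data_nodes:
        for alpha in labels:
            if alpha == f[a]:
                continue
            trial = dict(f)
            trial[a] = alpha
            gain = consistency(trial) - base  # Delta_{a,alpha} Q(f)
            if gain > best_gain:
                best_pair, best_gain = (a, alpha), gain
    if best_pair is None:
        return dict(f), False  # no update improves the measure
    updated = dict(f)
    updated[best_pair[0]] = best_pair[1]
    return updated, True


def global_gradient_ascent(f, data_nodes, labels, consistency, max_iters=1000):
    """Repeat the global step until no improving update remains."""
    for _ in range(max_iters):
        f, changed = global_gradient_step(f, data_nodes, labels, consistency)
        if not changed:
            break
    return f


# Toy weighted consistency measure (an assumption) so that gains differ by node.
target = {1: "a", 2: "b", 3: "c"}
weight = {1: 1, 2: 3, 3: 2}
toy_q = lambda match: sum(weight[u] for u in target if match[u] == target[u])
start = {1: "a", 2: "a", 3: "a"}
first, changed = global_gradient_step(start, [1, 2, 3], ["a", "b", "c"], toy_q)
final = global_gradient_ascent(start, [1, 2, 3], ["a", "b", "c"], toy_q)
```

In the toy run the first step changes node 2 rather than the first node in the visiting order, since that update yields the largest gain; this is the behaviour that distinguishes the global rule from the sequential one.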

4.2. Heuristic planning: the jigsaw puzzle algorithm

The idea of grading the entire state-space of possible matches included in the set of Cartesian pairs $V_1 \times V_2$ according to a confidence measure provides an interesting departure from the conventional sequential search by discrete relaxation. Although this is closer in spirit to true gradient ascent on the optimisation surface, this is only achieved at the expense of greater computational overheads. The main computational bottleneck is the requirement to evaluate the super-clique matching probabilities over the complete set of dictionary items. It is interesting to note, however, that in our derivation of the matching probabilities we assumed that the different dictionary items were equiprobable. In practice, however, we can refine our viewpoint and make the probabilities context dependent. In this way, we can impose a distribution on the different dictionary items which effectively favours certain search paths over others. In a nutshell, we aim to enhance the probability of consistently abutting dictionary items. This can be viewed as a form of tabu search which uses memory concerning the structure of the model graph to limit the search space so as to preferentially explore the most potentially profitable regions [6,21–23].

The adopted approach is to make the dictionary $\Theta_j$ both context and iteration dependent. Suppose that we are considering the consistency of the match $\alpha$ on the node $a$ of the data graph. According to our definition of consistency, we gauge confidence in this match by the quantity

$$q_{a,\alpha} = \sum_{j \in C_a} \sum_{S_i \in \Theta_j} P(\Gamma_j \mid S_i)\, P(S_i). \qquad (11)$$

We modify this confidence measure by introducing the concept of a conditional dictionary for the super-cliques modified by the label update. We let $\hat{\Theta}_{j \mid f(a)=\alpha}$ be the set of structure-preserving mappings permitted on the super-clique $C_j$ given that the match $\alpha$ resides on the data-graph node $a$. In this case the confidence of match is gauged by the quantity

$$q_{a,\alpha} = \sum_{S_i \in \Theta_a} P(\Gamma_a \mid S_i)\, P(S_i) + \sum_{j \in C_a,\, j \neq a} \; \sum_{S_i \in \hat{\Theta}_{j \mid f(a)=\alpha}} P(\Gamma_j \mid S_i)\, P(S_i). \qquad (12)$$

The conditional dictionaries for the neighbouring super-cliques are pruned so as to remove those configurations that cannot abut with the match $f(a) = \alpha$.

This strategy for computing the confidence of match has many similarities to the problem of solving a jigsaw puzzle. Here pieces of the puzzle are sorted according to their saliency. In the initial stages of search it is only the salient pieces that are used to construct islands of consistency. At later stages these islands are joined together to construct the completed global solution. Moreover, this concept of making compound moves in the search procedure has much in common with Glover's ejection chains [21]. By contrast, the sequential search procedure described in the previous section would be akin to overlooking the role of saliency and exhaustively checking for consistency of match.

The basic idea underpinning the tabu-search strategy described in this section is to invoke memory to rank the pieces according to confidence of match. This rank is used to determine an order of search so that computational resources can be concentrated in an intensified search of islands of consistency. These islands eventually merge to form a global solution. Areas of low consistency are therefore only visited when the islands of high consistency encroach, and computational resources are not wasted fruitlessly updating matches which are unstable or fluctuate between ambiguous states. Moreover, because it is context-dependent, the confidence measure embodies the concept of using an ejection-chain structure in the search procedure [24]. Based on these observations, we adopt the following search strategy.

• The algorithm commences by assigning random matches to the data-graph nodes. Based on these random matches, the confidence measure $q_{a,\alpha}$ is evaluated over the complete space of potential updates, $\forall (a, \alpha) \in V_1 \times V_2$. The confidence measures are then used to rank the nodes of the data graph according to the value of $\rho_a = \max_{\alpha \in V_2} q_{a,\alpha}$. According to our jigsaw-puzzle analogy, this corresponds to drawing pieces at random from the box and retaining the salient ones as seeds from which to build the solution.

• The best-N ranked matches are then selected as seeds for the search procedure. The number of nodes is selected to reflect the fraction of correct matches anticipated to be present in the initial random configuration. Using the pruned dictionaries, each neighbour of the initial seeds is then updated so as to select the match of maximum confidence. Again, according to our jigsaw analogy, this corresponds to building out from the seeds by drawing from pools of likely matches.

• Once all neighbours of the initial seeds have been visited and updated, the confidence measures are re-evaluated and the nodes re-ranked. An enlarged set of seeds is then selected. The updating step described above is then repeated, incrementally increasing the population of seeds. In this way islands of consistency naturally develop. Computational resources are concentrated on extending the boundaries of the islands. This process is iterated until the islands of consistency merge and the match stabilises. This corresponds to linking seed patches in the jigsaw by extending the scope of the search.
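The three steps above can be sketched in Python as follows. This is a simplified illustration rather than the authors' implementation. The callable `confidence(f, a, alpha)` is a hypothetical stand-in for $q_{a,\alpha}$ of Eq. (12), so the pruning of the conditional dictionaries is assumed to happen inside it. Two further assumptions are made: each seed adopts its own maximum-confidence label when it is selected, and the seed population grows by a fixed hypothetical increment `growth` per round.

```python
def jigsaw_search(f, neighbours, labels, confidence, n_seeds, growth=1, max_rounds=100):
    """Seed-and-grow search over a data graph (a sketch of the jigsaw strategy).

    f          -- initial match: dict from data-graph nodes to labels
    neighbours -- dict from each data-graph node to its adjacent nodes
    labels     -- candidate model-graph labels
    confidence -- hypothetical callable q(f, a, alpha)
    """
    f = dict(f)
    nodes = list(neighbours)

    def best_label(a):
        return max(labels, key=lambda alpha: confidence(f, a, alpha))

    for _ in range(max_rounds):
        # Rank nodes by rho_a = max over alpha of q_{a,alpha}.
        rho = {a: max(confidence(f, a, alpha) for alpha in labels) for a in nodes}
        ranked = sorted(nodes, key=lambda a: rho[a], reverse=True)
        seeds = ranked[:n_seeds]
        seed_set = set(seeds)
        changed = False
        for s in seeds:
            # Fix the seed itself, then build out to its non-seed neighbours.
            for b in [s] + [n for n in neighbours[s] if n not in seed_set]:
                choice = best_label(b)
                if choice != f[b]:
                    f[b] = choice
                    changed = True
        if not changed and n_seeds >= len(nodes):
            break  # every node is a seed and the match has stabilised
        n_seeds = min(len(nodes), n_seeds + growth)  # enlarge the seed set
    return f


# Toy example on a four-node chain with a context-free confidence (an assumption).
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
target = {0: "a", 1: "b", 2: "c", 3: "d"}
toy_conf = lambda match, a, alpha: 1.0 if alpha == target[a] else 0.0
start = {n: "a" for n in chain}
result = jigsaw_search(start, chain, ["a", "b", "c", "d"], toy_conf, n_seeds=1)
```

In this toy run the island of consistency grows outwards along the chain by one node per round, which mirrors the way the seed patches are extended in the description above.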

The philosophy underpinning the tabu-search strategy contrasts with the global gradient method. Rather than concentrating resources on regions of the search space where the gradient is largest, it attempts to consolidate regions of consistency by preferentially making moves that are more likely to result in improvements. The basic difference resides in the ranking of matches. This endows the search procedure with memory. It is this feature that is distinctive of tabu search.

5. Experiments

In this section we provide some experimental evaluation of our deterministic search methods. There are three aspects to this study. Firstly, we investigate some of the algorithm characteristics using a Monte-Carlo study. Secondly, we provide some qualitative examples of the solution-tracking capabilities of our method. Thirdly, and finally, we provide some examples of the graph-matching technique on real-world aerial images.

5.1. Monte-Carlo Study

We pose our evaluation as a Monte-Carlo study using randomly generated synthetic graphs. The graphs are generated in the following way. We commence by randomly distributing points on the image plane. These points are used to seed a Voronoi tessellation of the image plane. The relational structure that we use in our experiments is the Delaunay triangulation, i.e. the region adjacency graph for the Voronoi regions.

To simulate the effects of relational corruption, we both randomly add and delete a controlled fraction of the nodes from the dot patterns used to seed the Delaunay graphs. We compare matching performance as a function of this corruption fraction. Our measure of performance is the fraction of the nodes correctly matched. Each experimental data point is based on the averaging of matching performance over a sample of ten random graphs each containing 30 nodes.

We commence by comparing the results of local and global gradient ascent methods. The solid curve in Fig. 2 shows the best possible fraction of correct matches achievable as a function of the fraction of added clutter. The dashed curve is the result of applying global gradient ascent while the dotted curve is the result of applying local gradient ascent. The main conclusion to be drawn from this plot is that the global method consistently outperforms the local method at all levels of structural corruption. In both cases, the update process has been applied to the exponential consistency measure.

Fig. 2. Comparing the two gradient-based methods; the solid curve is the maximum achievable fraction of correct matches; the dashed curve is the result obtained with global gradient; the dotted curve is the result obtained with local gradient.

Our next set of experiments aims to compare global gradient ascent with the tabu search strategy. In Fig. 3 the solid curve and the dashed curve are again, respectively, the best-achievable result and the result of global gradient ascent. The dotted curve now shows the result of applying the tabu search strategy to the exponential consistency measure. The performance of the tabu search method is consistently marginally lower than that of the global gradient ascent method. However, it is significantly higher than that of the local gradient ascent method. Fig. 4 illustrates how the accuracy of match scales with graph size for tabu search. The log-linear plot exhibits a linear increase in matching accuracy with the number of graph nodes. In other words, the underlying increase in performance is polynomial with graph size.

To give some idea of the computational tradeoff between accuracy of match and convergence rate, timings suggest that the local gradient method and the tabu search strategy require comparable resources. Both are 20 times faster than the global gradient method. This accords with our discussion of complexity in Section 4, since the average second-order neighbourhood connectivity of the Delaunay graph is approximately 20. In other words, the tabu search method compares favourably with the global-gradient method in terms of both accuracy of match and computational overheads.

5.2. Solution tracking

In this subsection we extend our experimental evaluation by providing some qualitative examples which illustrate the iterative growth of islands of consistency. To visualise the process, we have flagged as correct those matches whose probability exceeds a threshold value $P_T$. Obviously the threshold should not be too low or all the nodes will be flagged and the whole graph will be assigned as one island from the outset. The higher the threshold, the more the final solution is fractured into separate regions. In practice, the best way to control this threshold has been found to be to set it at such a level that the number of nodes initially flagged accords with the number of nodes expected to be correctly labelled in the initial solution. In the case of the error model developed here, $P_T = 1 - P_e$.

Fig. 3. Comparing global gradient and tabu search; the solid curve is the maximum achievable fraction of correct matches; the dashed curve is the result of applying the global gradient method; the dotted curve is the result of applying tabu search.

Fig. 4. Performance of tabu search as a function of graph size.

Each flagged node that does not have any flagged node as an immediate neighbour is used to begin a new region. As and when new nodes are flagged as being correctly labelled, there are two possible outcomes. In the first case, if isolated, they are used to seed new regions. In the second case, if the node has a neighbour that is already part of an existing region, then it is added to that region. In this way the final solution is composed of one or more regions.
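The two cases above can be sketched in a few lines. The code is illustrative rather than the implementation used in the experiments: the names are hypothetical, nodes are processed one at a time in the order in which they are flagged, and the text does not say what happens when a node touches two different regions, so the sketch assumes it joins the region with the lowest identifier and that regions are never merged.

```python
def add_flagged_node(node, adjacency, region_of):
    """Assign a newly flagged node to a region and return the region id."""
    neighbour_regions = sorted(
        {region_of[nb] for nb in adjacency[node] if nb in region_of}
    )
    if neighbour_regions:
        # Second case: a neighbour already belongs to a region, so join it.
        # Assumption: on contact with several regions, take the lowest id.
        region_of[node] = neighbour_regions[0]
    else:
        # First case: the node is isolated, so it seeds a new region.
        region_of[node] = max(region_of.values(), default=-1) + 1
    return region_of[node]


def track_regions(flag_sequence, adjacency):
    """Process nodes in the order they are flagged; return node -> region id."""
    region_of = {}
    for node in flag_sequence:
        if node not in region_of:
            add_flagged_node(node, adjacency, region_of)
    return region_of


# Hypothetical path graph 0 - 1 - 2 - 3 - 4 - 5 and flagging order.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
regions = track_regions([0, 3, 1, 4, 5], adjacency)
```

Here nodes 0 and 3 seed two regions, and the later nodes 1, 4 and 5 attach to whichever region they touch.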

In Fig. 5 we show the ground-truth labelling at the outset of the matching algorithm. The black nodes are correct, the red nodes are incorrect while the green nodes represent added clutter nodes. The graphs contain 50 nodes. There is 10 per cent corruption between both graphs and the initial fraction of the match set correct is 10 per cent. Figs. 6 and 7 show two subsequent stages in the growth of the solution.

Fig. 5. Correctly matched nodes at start of solution. Black = Correct, Red = Incorrect, Green = Corrupt.

Fig. 6. Correctly matched nodes after 20 iterations.

Figs. 8-10 show the matching probabilities. Here the colour coding is given in the caption of Fig. 8. Figs. 11-13 show the corresponding regions, coded by colour. It can be seen that the largest region (size 19 nodes) contains 17 nodes that are correctly labelled. The next largest region (size 17 nodes) contains just three correctly matched nodes. In total the final solution contained 27 correctly matched nodes. Thus the tracking algorithm has done a good job of identifying a region of true consistency and a region of false consistency. These two regions were used to seed second and third attempts at solving the match. The "good" island led to a final result with 43 out of the maximum possible 45 nodes correctly matched. The "bad" island led to a final solution with only five correctly matched nodes.

Fig. 7. Correctly matched nodes after 43 iterations.

Fig. 8. Matching probabilities at start of solution. (High) Pink - Red - Green (Low).

It can be seen that although the probabilities give information concerning the regions and their boundaries as the solution develops, in the final solution this information is absent.

5.3. Real-world imagery

The graph matching methods described in this paper are based on growing islands of consistency. Hitherto, our evaluation has been based on simulation data. To conclude our experimental work, in this subsection we provide some real-world examples of the application of the tabu search and global gradient matching methods.

Fig. 9. Matching probabilities after 20 iterations.

Fig. 10. Matching probabilities after 43 iterations.

The application chosen for this study is furnished by matching graphs extracted from two aerial images of a rural site in Bedfordshire, England. The two images were collected using different sensors flying on different data-collecting missions. The first image was collected using an optical camera operating in the visible part of the spectrum and is shown in Fig. 14. The second image was collected using an infrared line-scan device and is shown in Fig. 15.

The dominant structures in this imagery are hedgerow patterns. These present themselves as intensity ridges. In order to extract features for matching, we have segmented the ridge structures using a line finder and have fitted straight-line segments to the detected feature points. We have established graphs by constructing the Delaunay triangulation of the line centres. This is the methodology adopted in our previous work on graph matching. More details of the processing chain can be found in [26]. The graphs extracted from the optical and infrared images are, respectively, shown in Figs. 16 and 17.

Fig. 11. Regions of consistency at start of solution. Groups are coded by colour and/or symbol.

Fig. 12. Regions of consistency after 20 iterations.

In order to explore the effect of structural corruption on the resulting graphs, we have conducted the following experiment. We commence by adding a controlled fraction of additional points at random locations to the set of line centres. Once the point-set has been corrupted, we re-compute the Delaunay triangulation. Finally, we re-match the resulting graphs and record the fraction of nodes correctly matched. Fig. 18 shows the result of this experiment. The solid line shows the maximum fraction of matchable nodes. It is important to stress that the two images are not perfectly overlapped. For this reason, even before we add random corruption there are certain nodes which cannot find a correct correspondence. As a result, the smallest level of structural corruption recorded in the plot is 0.1. The dashed curve is the result of applying the global steepest gradient method while the dotted curve is the result of applying tabu search. As in the case of the artificial data, the main feature to note from the plot is that tabu search compares very favourably with global gradient ascent. The model of "bush-fire" spread of consistency that underlies the tabu-search method has allowed comparable matching results to be obtained an order of magnitude more quickly.

Fig. 13. Regions of consistency after 43 iterations.

Fig. 14. Optical aerial image.

Fig. 15. Infrared line-scan image.

Fig. 16. Graph for the optical image.

Fig. 17. Graph for the infrared line-scan image.

Fig. 18. Performance curve showing the fraction of correct matches as a function of the level of graph corruption. The plot compares the results obtained with global gradient ascent (dashed curve) and with tabu search (dotted curve). The solid curve is the maximum fraction of correct matches achievable.

6. Conclusion

Our first main contribution in this paper has been to compare three deterministic search methods for graph matching. Two of these revolve around computing a discrete approximation to the maximum gradient direction.


About the Author: DR MARK WILLIAMS received the B.Sc. degree in Physics from the University of Bristol in 1987 and completed his Ph.D. in Mathematics at Imperial College in 1992. Currently he is a Senior Scientist in the Optimisation For Decision Support group

Here we demonstrate the advantages to be gained from searching for updates in the direction of the global maximum gradient. In other words, we show that nodes should be visited in an order determined by the value of the gradient, rather than by some predetermined sequence.
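The distinction between the two visiting orders can be made concrete with a small sketch. It is not the Bayesian consistency measure or the update rule of this paper: `toy_score`, `WEIGHTS`, `TARGET` and all function names are hypothetical stand-ins, chosen so that the gain of each relabelling is easy to check by hand.

```python
def best_move(node, labels, candidates, score):
    """Return (gain, label) for the best single-node relabelling of `node`."""
    current = score(labels)
    best_gain, best_label = 0, labels[node]
    for label in candidates:
        trial = dict(labels)
        trial[node] = label
        gain = score(trial) - current
        if gain > best_gain:
            best_gain, best_label = gain, label
    return best_gain, best_label


def local_sweep(labels, candidates, score):
    """Predetermined order: visit each node in turn and apply its best move."""
    labels = dict(labels)
    for node in sorted(labels):
        gain, label = best_move(node, labels, candidates, score)
        if gain > 0:
            labels[node] = label
    return labels


def global_ascent(labels, candidates, score):
    """Gradient order: repeatedly apply the single move with the largest gain."""
    labels = dict(labels)
    visit_order = []
    while True:
        moves = [(best_move(n, labels, candidates, score), n) for n in sorted(labels)]
        (gain, label), node = max(moves, key=lambda move: move[0][0])
        if gain <= 0:
            return labels, visit_order
        labels[node] = label
        visit_order.append(node)


# Hypothetical toy objective: weighted count of nodes that carry the target label.
TARGET = {0: "b", 1: "b", 2: "c"}
WEIGHTS = {0: 1, 1: 3, 2: 2}


def toy_score(labels):
    return sum(WEIGHTS[n] for n in labels if labels[n] == TARGET[n])


start = {0: "a", 1: "a", 2: "a"}
swept = local_sweep(start, "abc", toy_score)
final, visit_order = global_ascent(start, "abc", toy_score)
```

In this toy problem both schemes reach the same labelling, but the global scheme visits the nodes in order of decreasing gain and re-evaluates every node before each update, which is the source of its extra cost.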

A secondary contribution has been to develop a tabu search strategy. This algorithm has been inspired by the strategies adopted in solving jigsaw puzzles. The basic idea is to rank potential matches according to saliency and to grow islands of consistency from a population of high-ranking seeds. The resulting algorithm offers matching accuracy that is comparable to global gradient search. However, it requires computational resources of only the same order as local sequential search.
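The basic idea of growing outwards from high-ranking seeds can be pictured with the following sketch. It is a generic best-first growth with a set of already-visited nodes standing in for the tabu memory, and should not be read as the tabu search algorithm of this paper: the saliency values, the number of seeds and the function name are all hypothetical.

```python
import heapq


def grow_from_seeds(adjacency, saliency, n_seeds):
    """Return the order in which nodes are visited when growing from top seeds."""
    seeds = sorted(adjacency, key=lambda n: -saliency[n])[:n_seeds]
    tabu = set()  # nodes already visited are never revisited
    order = []
    # heapq is a min-heap, so saliency is negated to pop the most salient first.
    frontier = [(-saliency[s], s) for s in seeds]
    heapq.heapify(frontier)
    while frontier:
        _, node = heapq.heappop(frontier)
        if node in tabu:
            continue
        tabu.add(node)
        order.append(node)
        for neighbour in adjacency[node]:
            if neighbour not in tabu:
                heapq.heappush(frontier, (-saliency[neighbour], neighbour))
    return order


# Hypothetical path graph 0 - 1 - 2 - 3 - 4 with made-up saliency values.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
saliency = {0: 0.9, 1: 0.1, 2: 0.5, 3: 0.2, 4: 0.8}
order = grow_from_seeds(adjacency, saliency, n_seeds=2)
```

Each node is visited once and only the frontier of the growing islands is examined, which is in keeping with the observation above that the cost is of the same order as a local sequential search.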

References

[1] R.M. Haralick, J. Kartus, Arrangements, homomorphisms and discrete relaxation, IEEE SMC 8 (1978) 600-612.

[2] R.M. Haralick, G. Elliott, Increasing tree search efficiency for constraint satisfaction problems, Artificial Intell. 14 (1980) 263-313.

[3] J.R. Ullman, Associating parts of patterns, Inform. and Control 9 (1966) 583-601.

[4] J.R. Ullman, An algorithm for subgraph isomorphism, J. ACM 23 (1976) 31-42.

[5] H.G. Barrow, R.M. Burstall, Subgraph isomorphism, matching relational structures and maximal cliques, Inform. Process. Lett. 4 (1976) 83-84.

[6] E. Rolland, H. Pirkul, F. Glover, Tabu search for graph partitioning, Ann. Oper. Res. 63 (1996) 209-232.

[7] A. Sanfeliu, K.S. Fu, A distance measure between attributed relational graphs for pattern recognition, IEEE SMC 13 (1983) 353-362.

[8] L.G. Shapiro, R.M. Haralick, Structural description and inexact matching, IEEE PAMI 3 (1981) 504-519.

[9] L.G. Shapiro, R.M. Haralick, A metric for comparing relational descriptions, IEEE PAMI 7 (1985) 90-94.

[10] S. Gold, A. Rangarajan, A graduated assignment algorithm for graph matching, IEEE PAMI 18 (1996) 377-388.

[11] S. Gold, A. Rangarajan, E. Mjolsness, Learning with pre-knowledge: clustering with point and graph-matching distance measures, Neural Comput. 8 (1996) 787-804.

[12] E. Mjolsness, G. Gindi, P. Anandan, Optimisation in model matching and perceptual organisation, Neural Comput. 1 (1989) 218-219.

[13] P.N. Suganathan, E.K. Teoh, D.P. Mital, Pattern recognition by graph matching using Potts MFT networks, Pattern Recognition 28 (1995) 997-1009.

[14] J.J. Kosowsky, A.L. Yuille, The invisible hand algorithm: solving the assignment problem with statistical physics, Neural Networks 7 (1994) 477-490.

[15] A. Yuille, Generalised deformable models, statistical physics and matching problems, Neural Comput. 2 (1990) 1-24.

[16] A.D.J. Cross, R.C. Wilson, E.R. Hancock, Inexact graph matching using genetic search, Pattern Recognition 30 (1997) 953-970.

[17] B.T. Messmer, H. Bunke, Efficient error-tolerant subgraph isomorphism detection, in: D. Dori, A. Bruckstein (Eds.), Shape, Structure and Pattern Recognition, World Scientific Press, Singapore, 1995, pp. 231-240.

[18] N.J. Nilsson, Problem Solving Methods in Artificial Intelligence, McGraw-Hill, New York, 1971.

[19] D. Geman, S. Geman, Stochastic relaxation, Gibbs distributions and Bayesian restoration of images, IEEE PAMI 6 (1984) 721-741.

[20] A.L. Yuille, J.J. Kosowsky, Statistical physics algorithms that converge, Neural Comput. 6 (1994) 341-356.

[21] F. Glover, Ejection chains, reference structures and alternating path methods for traveling salesman problems, Discrete Appl. Math. 65 (1996) 223-253.

[22] F. Glover, Genetic algorithms and tabu search - hybrids for optimisation, Discrete Appl. Math. 49 (1995) 111-134.

[23] F. Glover, Tabu search for nonlinear and parametric optimisation (with links to genetic algorithms), Discrete Appl. Math. 49 (1995) 231-255.

[24] R.C. Wilson, E.R. Hancock, Structural matching by discrete relaxation, IEEE PAMI 19 (1997) 634-648.

[25] R.C. Wilson, A.D.J. Cross, E.R. Hancock, Structural matching with active triangulations, Comput. Vision Image Understanding 72 (1998) 20-38.

[26] R.C. Wilson, E.R. Hancock, Structural matching by discrete relaxation, IEEE PAMI 19 (1997) 634-648.

[27] A.D.J. Cross, E.R. Hancock, Relational matching with stochastic optimisation, IEEE Int. Symp. on Computer Vision, 1995, pp. 365-370.

[28] A.M. Finch, R.C. Wilson, E.R. Hancock, Relational matching with mean-field annealing, Proc. 13th Int. Conf. on Pattern Recognition, vol. II, 1996, pp. 359-363.

[29] A.M. Finch, R.C. Wilson, E.R. Hancock, Softening discrete relaxation, Advances in Neural Information Processing Systems 9, MIT Press, Cambridge, MA, 1997, pp. 438-444.

[30] A.M. Finch, R.C. Wilson, E.R. Hancock, An energy function and continuous edit process for graph matching, Neural Comput. 10 (1998) 1873-1894.

[31] K. Boyer, A. Kak, Structural stereopsis for 3D vision, IEEE PAMI 10 (1988) 144-166.

[32] P.J. Flynn, A.K. Jain, CAD-based vision - from CAD models to relational graphs, IEEE PAMI 13 (1991) 114-132.

[33] J. Kittler, W.J. Christmas, M. Petrou, Probabilistic relaxation for matching problems in machine vision, Proc. 4th Int. Conf. on Computer Vision, 1993, pp. 666-674.


at the Defence Evaluation and Research Agency in the United Kingdom. He is interested in combinatorial optimisation, heuristics, uncertainty and real world applications such as the Radio Link Frequency Assignment problem.

About the Author: DR RICHARD WILSON is currently a research associate in the Department of Computer Science at the University of York. Dr Wilson was awarded an open scholarship to read physics at St John's College, University of Oxford, graduating with first class honours in 1992. Between 1992 and 1995 he undertook research at the University of York on the topic of relational graph matching for which he was awarded the DPhil degree. He has published some 50 papers in journals, edited books and refereed conferences. In 1998 he received an honourable mention in the Pattern Recognition Society best paper award. His research interests are in high-level vision, scene understanding, volumetric image analysis and structural pattern recognition.

About the Author: DR EDWIN HANCOCK gained his B.Sc. in physics in 1977 and Ph.D. in high energy nuclear physics in 1981, both from the University of Durham, UK. After a period of postdoctoral research working on charm-photo-production experiments at the Stanford Linear Accelerator Centre, he moved into the fields of computer vision and pattern recognition in 1985. Between 1981 and 1991, he held posts at the Rutherford-Appleton Laboratory, the Open University and the University of Surrey. Dr Hancock is currently Reader in the Department of Computer Science at the University of York. He leads a group of some 15 researchers in the areas of computer vision and pattern recognition. He has published about 180 refereed papers in the fields of high energy nuclear physics, computer vision, image processing and pattern recognition. He was awarded the 1990 Pattern Recognition Society Medal and received an honourable mention in 1997. Dr Hancock serves as an Associate Editor of the journal Pattern Recognition and has been a guest editor for the Image and Vision Computing Journal. He is currently guest-editing a special edition of the Pattern Recognition journal devoted to energy minimisation methods in computer vision and pattern recognition. He chaired the 1994 British Machine Vision Conference and has been a programme committee member for several national and international conferences.

