
1400 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 12, NO. 6, NOVEMBER 2001

Hopfield Neural Networks for Affine Invariant Matching

Wen-Jing Li and Tong Lee

Abstract—The affine transformation, which consists of rotation, translation, scaling, and shearing transformations, can be considered as an approximation to the perspective transformation. Therefore, in many applications it is very important to find an effective means for establishing point correspondences under affine transformation. In this paper, we treat the point correspondence problem as a subgraph matching problem and develop an energy formulation for affine invariant matching by a Hopfield-type neural network. A fourth-order network is investigated first; order reduction is then achieved by incorporating the neighborhood information in the data. Thus a second-order Hopfield network can be used to perform subgraph isomorphism invariant to affine transformation, which can be applied to the affine invariant shape recognition problem. Experimental results show the effectiveness and the efficiency of the proposed method.

Index Terms—Affine transformation, Hopfield neural network, shape recognition, subgraph isomorphism.

I. INTRODUCTION

SINCE Hopfield and Tank proposed the Hopfield network for the traveling salesman problem [1], many engineering problems have been formulated as optimization problems in which an energy function is minimized. The customary approach is to formulate the original problem as one of energy minimization and then to use a proper relaxation network to find minimizers of this function. Such solutions are attractive because they offer the advantage of parallel analog very large scale integration (VLSI) implementations. One typical example is the graph/subgraph isomorphism problem.

Graph matching approaches based on neural networks have been widely investigated in the literature [2]–[12]. Nasrabadi and W. Li [2] first used a two-dimensional (2-D) Hopfield network to perform subgraph isomorphism, obtaining the optimal compatible matches between two graphs with application in object recognition. Li [4] used a relaxation labeling method to perform invariant matching between patterns. Suganthan et al. [8] addressed the programming of Potts mean field theory networks to implement homomorphic mappings between attributed relational graphs (ARGs). Rangarajan and Mjolsness [7] proposed a Lagrangian relaxation network for graph matching. However, most of these works aimed at finding corresponding matching points under the Euclidean transformation (rotation and translation) or the similarity transformation, which includes rotation, translation, and scale only. By comparison, the problem of subgraph matching invariant to a general affine transformation has never been addressed with a neural-network method, although some works have used trained neural-network models to learn the affine parameters or to accomplish the recognition task [13], [14]. In [14], point correspondences were established by hand, and the training views were obtained by sampling the space of affine transformed views of the object. An architecture of competitive neural network for affine invariant pattern recognition was proposed in [13] and was simulated with simple character patterns only, without any occlusion. However, a neural system that is capable of determining the correspondence as well as the corresponding affine transformation in general situations is still not available.

Manuscript received August 14, 2000; revised May 30, 2001. The authors are with the Computer Vision and Image Processing Laboratory, Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong (e-mail: [email protected]; [email protected]). Publisher Item Identifier S 1045-9227(01)10193-1.

Affine transformations have been widely used in computer vision and particularly in the area of model-based object recognition. They have been used to represent the mapping from a 2-D object to a 2-D image or to approximate the 2-D image of a planar object in a three-dimensional (3-D) environment. Many methods have been developed for the recognition of objects under affine transformation, including Fourier descriptors [15], moment invariant methods [16], B-spline invariant matching [17], geometric hashing [18], and others. Among these methods, Fourier descriptor and moment invariant based methods are unable to handle occlusion, because they are global invariants. Local invariants such as vertices, line segments, and fiducial marks are more immune to occlusion and thus have been used frequently. Therefore, in many applications, it is necessary to find an effective means for establishing point correspondences between the model object and the input image. This kind of problem can be easily formulated as a subgraph matching problem, which can be solved using either a neural-network approach that minimizes an objective function or a nonneural-network approach such as sequential tree search with heuristic pruning [19]. Because the former has the advantage of parallel analog VLSI implementation, it has attracted the interest of many researchers.

Following the works applying Hopfield neural networks to the subgraph matching problem under similarity transformation [2], [6], [9], [10], [20], in this paper we study the subgraph matching problem under affine transformation. We first derive a new energy formulation for general affine invariant matching. Then we show that the matching problem can be solved by a fourth-order generalized Hopfield network. The convergence of the network has also been proved. When the proposed network is used to solve the affine invariant shape recognition problem, we show that the fourth-order energy function can be approximated by a standard second-order form by incorporating a priori neighborhood information in the data. Experimental results for real images demonstrate the efficiency and the effectiveness of the proposed method.

The rest of this paper is organized as follows. In Section II, we first define the 2-D affine invariant graph matching problem and then introduce some properties of the affine transformation, which form the compatibility constraint embedded in the energy function of the proposed network. In Section III, a new energy formulation of the fourth-order Hopfield network for affine invariant matching is developed. Section IV is devoted to the formulation of the second-order Hopfield network for affine invariant shape recognition. Experimental results on both simulated data and real data are shown in Section V. Conclusions are given finally. The proof of the convergence of the fourth-order Hopfield network for affine invariant matching can be found in the Appendix.

II. 2-D AFFINE INVARIANT GRAPH MATCHING

A. Problem Definition

We assume that a model object can be represented by a set of feature points P = {p_1, p_2, ..., p_M}, where p_i = (x_i, y_i), for i = 1, ..., M. An input scene, which may consist of one or several overlapping objects, can be represented as another set of feature points Q = {q_1, q_2, ..., q_N}, where q_k = (u_k, v_k), for k = 1, ..., N, and the two point sets are related by an affine transformation. So P can be considered as a model graph with M nodes. Similarly, Q can be considered as a scene graph with N nodes. The objective of the matching exercise is to establish a mapping list {(p_i, q_k)} for all i. If we can find a correct one-to-one mapping between the two graphs, the graph/subgraph isomorphism is obtained.

Therefore, the point correspondence problem under affine transformation can be cast as an inexact graph matching problem and then formulated in terms of constraint satisfaction, which can be mapped to a network where the nodes are the hypotheses and the links are the constraints. The network is then employed to select the optimal subset of hypotheses which satisfies the given constraints.

It should be noted that, in this paper, the emphasis is on the matching process rather than on the process of extracting feature points, so optimization of the feature extraction method has not been considered. In our experiments, the gray level images are thresholded and segmented to get the contour images. The feature points are the high curvature points along the contour, which are automatically extracted using the method introduced by Ansari et al. [21].

B. Properties of Affine Transformation

Before formulating the energy function for our Hopfield network, we note several properties of 2-D affine transformations that are used in the discussion below. These properties hold if and only if the transformation is affine. Derivations of the properties can be found in a number of standard texts, such as [22], and thus are not presented here.

Definition 1: An affine transformation of the plane, T, can be represented by a nonsingular 2 × 2 matrix A and a translation vector t, such that

T(p) = A p + t    (1)

for any point p of the plane. An affine transformation is a combination of several simple mappings, such as rotation, translation, scaling, and shearing. It is a linear transformation if t = 0. The similarity transformation is a special case of the affine transformation; it preserves length ratios and angles, while the affine mapping in general does not. Nevertheless, here are some properties that hold for the affine transformation.

Property 1: An affine transformation of the plane is defined uniquely by three pairs of points. If p_1, p_2, and p_3 are noncollinear points, and q_1, q_2, and q_3 are the corresponding points, then there exists a unique affine transformation T, with T(p_i) = q_i, mapping each of the three points to its corresponding point.

According to this property, given three pairs of corresponding points, we can compute the related affine parameters by solving linear equations.
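To make this concrete, the six affine parameters can be recovered from three corresponding point pairs by solving a 6 × 6 linear system. The sketch below (our own illustration in Python with NumPy; the function name and array layout are not taken from the paper) does exactly that:

import numpy as np

def affine_from_three_pairs(p, q):
    """Recover A (2x2) and t (2,) with q_i = A p_i + t from three noncollinear
    point pairs (Property 1). p and q are arrays of shape (3, 2)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Each pair contributes two equations in the unknowns (a11, a12, a21, a22, tx, ty).
    M = np.zeros((6, 6))
    b = np.zeros(6)
    for i in range(3):
        x, y = p[i]
        M[2 * i]     = [x, y, 0, 0, 1, 0]
        M[2 * i + 1] = [0, 0, x, y, 0, 1]
        b[2 * i], b[2 * i + 1] = q[i]
    a11, a12, a21, a22, tx, ty = np.linalg.solve(M, b)
    return np.array([[a11, a12], [a21, a22]]), np.array([tx, ty])

If the three model points are collinear, the system is singular, which mirrors the noncollinearity requirement in Property 1.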

Property 2: Ratios of triangle areas are preserved. Given two sets of noncollinear points, p_1, p_2, p_3 and q_1, q_2, q_3 (not necessarily distinct from p_1, p_2, and p_3), if T is an affine transformation, then

S(T(p_1) T(p_2) T(p_3)) / S(T(q_1) T(q_2) T(q_3)) = S(p_1 p_2 p_3) / S(q_1 q_2 q_3)    (2)

where S(abc) is the area of triangle abc. Therefore, the following corollary comes from the above properties of the affine transformation immediately.

Corollary 1: Given four points in the plane, p_1, p_2, p_3, and p_4, the area ratios of two of the triangles to a third triangle, such as

S(p_1 p_2 p_3) / S(p_1 p_2 p_4)  and  S(p_1 p_3 p_4) / S(p_1 p_2 p_4)

uniquely define the set of four points up to an affine transformation. Moreover, given two sets of four points, p_1, ..., p_4 and q_1, ..., q_4, that have the same area ratios, there is a unique affine transformation mapping one set to the other.

Therefore, given two sets of points p_1, p_2, p_3, p_4 and q_1, q_2, q_3, q_4, if they have the same area ratios, there exists a unique affine transformation mapping one set to the other; conversely, if they do not have the same area ratios, they cannot be related by any affine transformation. Hence the ratios of triangle areas of four corresponding point pairs can uniquely determine an affine transformation and vice versa, which is the theoretical foundation of the compatibility constraint embedded in our energy function formulation under affine transformation.
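As a concrete illustration of Corollary 1 (our own sketch, not the paper's code; the choice of reference triangle is an assumption), the pair of area ratios that characterizes a quadruple of points can be computed as follows:

import numpy as np

def tri_area(a, b, c):
    """Absolute area of the triangle abc."""
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

def quad_ratios(p1, p2, p3, p4):
    """The two triangle-area ratios that define a 4-point set up to an affine
    transformation (Corollary 1), using triangle p1 p2 p4 as the reference."""
    base = tri_area(p1, p2, p4)
    return np.array([tri_area(p1, p2, p3) / base, tri_area(p1, p3, p4) / base])

Two quadruples are then consistent with a single affine transformation exactly when quad_ratios returns (numerically) the same pair for both.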

III. FOURTH-ORDER NETWORK FOR AFFINE INVARIANT MATCHING

For graph matching, the Hopfield network can be considered as a 2-D array. If the model graph has M nodes and the scene graph has N nodes, the number of neurons in the network will be M × N. The final state of each neuron represents whether the corresponding node in the model graph matches the node in the scene graph or not. The network configuration can be seen in Fig. 1.


Fig. 1. The Hopfield network used to generate graph isomorphism.

From the energy function perspective, the original Hopfield network in effect minimizes a quadratic function. However, in Section II we have shown that we need at least four corresponding point pairs to compute the invariants—the ratios of triangle areas. To tackle this kind of high-order minimization problem, it is natural to generalize the network model by introducing high-order connections. The high-order neuron is the basic block of high-order neural networks and contains high-order synaptic weights from its inputs. The total input of such a high-order neuron consists of a linear combination of the network inputs and the input products. The details can be found in [23], [24], while a hardware VLSI implementation for a third-order network can be found in [25]. To define the network order quantitatively, a Hopfield network is said to be of nth order if the corresponding energy function is described by an nth-order polynomial expression. The generalized Hopfield network model has been described by Tsui et al. and the convergence property was also discussed in [26] and [27].

Based on the generalized Hopfield network model, we define our energy function of the fourth-order network under affine transformation in (3), where the term coefficients are constants and V_{ik} is the output state of neuron (i, k). If the ith node of the model graph matches the kth node of the scene graph, V_{ik} will be one; otherwise, it will be zero. The parameter λ, where 0 ≤ λ ≤ 1, is the control parameter. The constants can be chosen from experience or assigned adaptively [6]; in our experiments, satisfactory results were obtained with a fixed choice of these constants.

The first and second terms of (3) are uniqueness constraints which force at most one neuron to be active in each column and row of the network. The third term has to be included to avoid the system being trapped in the degenerate state in which all neurons are inactive. The last two terms are the compatibility constraints, which are used to measure the strength of the compatibility between the nodes of the model graph and the scene graph. The fourth term only uses the information of unary features and the last term uses the information of relational properties between the two graphs.

In general, the relational constraint is a function of the relational properties of the graphs, W(Err), where Err is the mapping error owing to the constraint and K is the error threshold that makes the system tolerant to additive noise. Usually Err is a function of the node indexes in the graphs. For example, in the graph matching problem invariant to the Euclidean transformation, Err can be defined as |a_i − b_k| for the unary constraint, where a_i and b_k are the unary features, such as the subtended angles at the vertices in the two graphs; or it can be |d_{ij} − d_{kl}| for the binary constraint, where d_{ij} and d_{kl} are the Euclidean distances in the model and in the scene, respectively. Therefore, the compatibility function W(Err) used in our fourth-order network for graph matching in (3) should have the following characteristics (a sketch of such a function is given after this list):

• W(Err) should decrease monotonically as Err increases.
• W(Err) approaches 1 as Err becomes smaller than K; W(Err) approaches −1 or 0 as Err becomes larger than K.
• W should be symmetric, i.e., its value should not change when the neuron indexes that define Err are exchanged.

The last condition is important for the network to converge, as shown in the Appendix.
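A minimal sketch of a function with these properties (our own illustration, not the paper's exact formula; the parameter names are assumptions) is a shifted sigmoid, which tends to a step function as the temperature goes to zero:

import numpy as np

def compatibility(err, threshold, temperature, low=-1.0):
    """Monotonically decreasing compatibility: close to 1 for err well below
    threshold, close to `low` (-1 here, or 0 if preferred) for err well above
    it; the steepness is controlled by temperature."""
    s = 1.0 / (1.0 + np.exp((err - threshold) / temperature))
    return low + (1.0 - low) * s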

According to Corollary 1 in Section II, we know that for affine invariant graph matching, at least four pairs of nodes are needed to compute the relational constraint. Therefore, based on the above characteristics of the compatibility function, we choose the compatibility constraint of our network to be

(4)

where

(5)


Fig. 2. The compatibility function.

Here we use the information of relational properties between a quadruple of nodes (i, j, m, o) of the model graph and a quadruple (k, l, n, p) of the scene graph as the compatibility constraint. Err in (5) is the average difference of the triangle area ratios between the model nodes (i, j, m, o) and the scene nodes (k, l, n, p). Twelve terms are included in the average to make sure that the compatibility constraint is strictly symmetric with respect to the ordering of the node pairs. T is the temperature constant, determining the steepness of the function, so W(Err) will approach a step function for very small values of T. The compatibility function used is depicted in Fig. 2: if the value of Err is smaller than the threshold K, there exists a unique affine transformation mapping the set of model nodes to the scene nodes and the compatibility constraint approaches 1; otherwise, no such unique affine transformation exists, and W(Err) approaches −1.
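One plausible reading of the averaged error Err in (5) (our own sketch; we interpret the twelve items as the twelve ordered pairs of the four triangles a quadruple defines, and the paper's exact enumeration may differ) is:

import numpy as np
from itertools import combinations, permutations

def quad_ratio_error(model_quad, scene_quad, eps=1e-9):
    """Average absolute difference of corresponding triangle-area ratios over
    all 12 ordered pairs of the 4 triangles defined by a quadruple, so the
    result does not depend on how the node pairs are ordered. Reuses tri_area
    from the sketch in Section II."""
    tris = list(combinations(range(4), 3))        # the four triangles
    diffs = []
    for t1, t2 in permutations(tris, 2):          # 4 * 3 = 12 ordered pairs
        rm = tri_area(*[model_quad[i] for i in t1]) / (tri_area(*[model_quad[i] for i in t2]) + eps)
        rs = tri_area(*[scene_quad[i] for i in t1]) / (tri_area(*[scene_quad[i] for i in t2]) + eps)
        diffs.append(abs(rm - rs))
    return float(np.mean(diffs))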

The unary constraint is defined using the same type of compatibility function as (4), while Err is changed to

Err = |a_i − b_k|    (6)

where a_i and b_k are the unary features of the model graph and scene graph, respectively. For affine invariant matching, they can be selected as the convexity and concavity of the feature points extracted from the shape, as described in Section IV, or as the radiometric similarity of the points.

From (3), it can be seen that when λ = 1, the last term of the energy function is zero and the network only uses the information of unary features. When λ is gradually reduced, the weight of the last term becomes larger and larger. When λ is reduced to zero, the network only uses the information of relational properties. Peng et al. [5] adopted this approach to integrate the local and relational properties in the Hopfield network.

By rearranging the energy function in (3) into the standard form [27] of the fourth-order Hopfield network

(7)

the connection weights and bias input are found to be

(8)

(9)

(10)

where T_{ik,jl,mn,op} is the fourth-order connection weight involving the four neurons (i, k), (j, l), (m, n), and (o, p). Similarly, T_{ik,jl} is the second-order connection weight. δ is defined as the Kronecker delta function

δ(i, j) = 1 if i = j, and δ(i, j) = 0 otherwise.    (11)

So the updating equation of the network is

(12)


And the activation function is set to be

(13)

where V_{ik}(0) is the initial state of the network.

It can be seen from (8) and (9) that the fourth-order and second-order connection weights are both symmetric. Meanwhile, they are zero when any two of their indexes are identical. According to the proof in the Appendix, the proposed fourth-order network for affine invariant matching will hence converge to a stable state where the energy function is minimized.
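In generic form, the asynchronous update described by (12) and (13) can be sketched as follows (our own illustration; the array layout and the tie-breaking rule are assumptions, not taken from the paper):

import numpy as np

def update_neuron(V, i, k, T4, T2, bias):
    """Asynchronous update of neuron (i, k) in a fourth-order Hopfield network.
    V is the M x N state matrix; T4 (8-D), T2 (4-D), and bias (2-D) are NumPy
    arrays holding the weights of (8)-(10). Illustrative sketch only."""
    M, N = V.shape
    net = bias[i, k]
    for (j, l) in np.ndindex(M, N):
        net += T2[i, k, j, l] * V[j, l]
        for (m, n) in np.ndindex(M, N):
            for (o, p) in np.ndindex(M, N):
                net += T4[i, k, j, l, m, n, o, p] * V[j, l] * V[m, n] * V[o, p]
    if net > 0:
        V[i, k] = 1.0
    elif net < 0:
        V[i, k] = 0.0
    # net == 0: keep the current state (cf. (13), which falls back on the initial state)
    return V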

Thus the algorithm of the fourth-order network for matching any discrete set of data invariant to affine transformation can be described as follows (a sketch of this loop is given after the steps):

Step 1) Set the initial state of the network, and set the control parameter λ to one. (If no unary features are available, we can set λ to zero directly.)
Step 2) Update the state of the network according to (12) and (13), until a stable output state is achieved.
Step 3) Reduce the control parameter λ by a small step; check if λ > 0: if yes, go to Step 2); otherwise, go to Step 4).
Step 4) Output the matching results.
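The outer loop of Steps 1)–4) can be sketched as follows (a Python illustration of ours; build_weights and update_sweep are hypothetical callables standing in for (8)–(10) and (12)–(13), and the step size d_lambda is an assumed value, not one reported in the paper):

def match(V, build_weights, update_sweep, d_lambda=0.1, max_sweeps=100):
    """Outer annealing loop of Steps 1)-4). build_weights(lam) returns the
    connection weights and bias for a given control parameter; update_sweep
    performs one pass of asynchronous updates and reports whether any neuron
    changed."""
    lam = 1.0                                  # Step 1: start from the unary-weighted energy
    while lam > 0.0:
        T4, T2, bias = build_weights(lam)
        for _ in range(max_sweeps):            # Step 2: update until a stable state
            if not update_sweep(V, T4, T2, bias):
                break
        lam -= d_lambda                        # Step 3: reduce the control parameter
    return V                                   # Step 4: the final states encode the matching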

IV. AFFINE INVARIANT SHAPE RECOGNITION BY SECOND-ORDER NETWORK

The above fourth-order network model is suitable for matching any discrete set of feature points. However, due to the complex high-order connections, the simulation time will be too long. Meanwhile, with the increase of the number of nodes, occlusion, and noise, the local minimum problem becomes more serious. Therefore, it is desirable to derive a lower order network for real applications.

In many practical problems, data are organized in some ordered manner, such as contours, trees [28], the convex hull [29] of a set of points, and so on. We have found that the order of the network can be reduced if the model and input data have been arranged in a certain order. Thus the second-order Hopfield network can be used to perform affine invariant matching by taking advantage of the neighborhood information in the data. Here we will take affine invariant shape recognition as an example and reexamine the energy formulation in (3) accordingly.

The first three terms in (3) are unchanged because they are related to the uniqueness constraints. The fourth term is the unary constraint and does not utilize the order information; therefore, it remains unchanged, too. For affine invariant shape recognition, we have observed that the concavity and convexity of the high curvature points along the contour are not changed under affine transformation. This property can be used as a unary feature of the corresponding points, which can be defined as

(14)

where c_i is the curvature value of the ith feature point on the Gaussian smoothed contour. Because the feature points are extracted along the original boundary at the extreme positive and negative curvature values [21], if the curvature value is an extreme positive, it corresponds to a sharp convex point; if it is an extreme negative, it corresponds to a sharp concave point. Although a concave point cannot match to a convex point, and vice versa, this kind of feature has high matching ambiguity. So the relational constraint will play a key role in finding the exact correspondences.
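One plausible realization of this convexity/concavity feature (our own sketch; (14) itself may differ in form) is to label each dominant point by the sign of its smoothed curvature:

import numpy as np

def convexity_labels(curvatures):
    """Label each dominant point as convex (+1) or concave (-1) from the sign
    of its curvature c_i on the Gaussian-smoothed contour; the unary error of
    (6) can then be taken as the absolute difference of two labels."""
    c = np.asarray(curvatures, dtype=float)
    return np.where(c >= 0.0, 1, -1)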

For the relational constraint, because the data are now arranged in order, we can always choose node j as the adjacent node of i and node o as the adjacent node of m in the model graph. Similarly, nodes l and p can be chosen as the adjacent nodes of k and n in the scene graph, respectively, such that when nodes i and m match nodes k and n, respectively, nodes j and o would most probably match nodes l and p. Therefore, the constraint W(Err) can be determined by the two neurons (i, k) and (m, n) only. Then we can use a second-order Hopfield network instead of the fourth-order one to perform the affine invariant matching. The energy function of the second-order network can be defined as

(15)

where the first four terms are the same as those in (3), and in the last term, W(Err) is the same as in (4).

It can be proved that the convergence of this simplified model is still well guaranteed. The updating equation of this second-order network can be derived by following steps similar to (7)–(12) and hence is not repeated here.
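As an illustration of how the neighborhood information collapses the fourth-order constraint to a second-order one, the relational weight between two neurons can be computed by completing them into quadruples with their contour successors (a sketch of ours, reusing quad_ratio_error from Section III; succ_model, succ_scene, and compat are hypothetical helpers):

def second_order_weight(i, k, m, n, model_pts, scene_pts,
                        succ_model, succ_scene, compat):
    """Relational weight between neurons (i, k) and (m, n): the adjacent nodes
    j = succ_model[i], o = succ_model[m], l = succ_scene[k], p = succ_scene[n]
    complete the two pairs into quadruples, so the area-ratio error of
    Section II can still be evaluated. compat is a one-argument compatibility
    function such as the sigmoid sketched earlier (threshold and temperature
    fixed)."""
    j, o = succ_model[i], succ_model[m]
    l, p = succ_scene[k], succ_scene[n]
    model_quad = [model_pts[i], model_pts[j], model_pts[m], model_pts[o]]
    scene_quad = [scene_pts[k], scene_pts[l], scene_pts[n], scene_pts[p]]
    return compat(quad_ratio_error(model_quad, scene_quad))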

V. EXPERIMENTS AND APPLICATIONS

A. Simulated Data

1) Affine Invariant Matching by Fourth-Order Network: The high-order Hopfield model derived for general affine invariant matching in Section III is suitable for matching any discrete set of feature points, whether they are ordered or not. Based on this formulation, a number of experiments with synthetic data have been done to verify the proposed method. An asynchronous matching algorithm similar to that of [3] was used. In addition, no unary constraint has been used in these experiments, so we set the value of λ to zero.

The simulated data consisted of several discrete points which were generated randomly. The scene data was obtained by applying some affine transformation to the original data and then permuting the transformed data. For example, Fig. 3(a) is the original data, which consists of six discrete feature points, and Fig. 3(b) is the permuted data after applying the following affine transformation:

(16)
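The synthetic scene generation just described can be sketched as follows (an illustration of ours; the particular matrix of (16) is not reproduced, so A and t are caller-supplied):

import numpy as np

def make_scene(model_pts, A, t, rng=None):
    """Apply the affine transformation (A, t) to the model points and randomly
    permute the result, as in the simulated-data experiments."""
    rng = np.random.default_rng() if rng is None else rng
    pts = np.asarray(model_pts, dtype=float)
    scene = pts @ np.asarray(A, dtype=float).T + np.asarray(t, dtype=float)
    return scene[rng.permutation(len(scene))]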

In the simulated data set, some extra points may be added to evaluate the subgraph matching capability required in occlusion situations. For example, Experiment 2 and Experiment 4 shown in Table I each have one point added.


Fig. 3. An example of the simulated data. (a) Model data. (b) Scene data.

TABLE I: MATCHING RESULTS OF THE FOURTH-ORDER NETWORK ON THE SIMULATED DATA

The matching results of these four experiments are given in Table I. In the table, M and N denote the number of nodes in the model and the scene, respectively. The fourth column is the final energy value when the network converges. The fifth column is the number of correspondences found by the matching process, among which the last column denotes the number of false matches. In the following, experimental results are reported in a similar form. From Table I, it can be seen that all four experiments find the correct correspondence points between the model and the scene without any wrong match by using the relational constraint only, even when the scene graphs are occluded. However, when the number of nodes increases, the method needs more computation time. For example, the network generally takes 5–6 min to converge for 5 × 5 nodes and 1–1.5 h for 6 × 6 nodes on a Sun Sparc 10. If the scene graph is an occluded one, it will take a longer time.

TABLE II: PERFORMANCE COMPARISON BETWEEN THE FOURTH-ORDER AND SECOND-ORDER NETWORKS ON THE SIMULATED DATA

2) Performance Comparison Between the Fourth-Order and Second-Order Networks: In Section IV, we have also proposed a simplified model based on a second-order network for affine invariant matching, provided that the neighborhood information in the data is available. A number of experiments with synthetic data have been done to compare the performance of the proposed fourth-order network and the second-order one.

In these experiments, the order of the points is made available in the synthetic data set, such that we can use them to test the proposed second-order network method as well. We also assume that the ordered feature points are dominant points on a closed contour, such that the concavity and convexity of the points can be used as unary features in both methods. Three sets of data have been augmented with more feature points to observe how the converging time varies with the number of feature points extracted. In both networks, the same asynchronous matching algorithm was employed. The results are tabulated in Table II. The sixth and the last columns denote the simulation time when the two methods run on a Sun Sparc 10 workstation. From Table II, both methods can find the correct correspondence points between the model and the scene without any wrong match. Compared with the results in Table I, the fourth-order network converges faster by including the unary constraints. Moreover, by incorporating the neighborhood information in the data, the second-order network converges much faster than the fourth-order one. We have also observed that 1) the fourth-order network may get trapped in local minima in the first run and need another run to find the solution and 2) with the increase of just one feature point in both the model and the scene, the converging time of the fourth-order network becomes much longer than that of the second-order one.

In summary, the matching algorithm based on the second-order Hopfield network is more efficient if the neighborhood information in the data is available. However, since the proposed second-order network reduces the network complexity by utilizing the neighborhood information, its performance will deteriorate with inaccurate neighborhood information. In Section V-B, we concentrate on affine invariant shape matching experiments that handle noisy and even missing feature points in real images, so that the practical value of the second-order network can be better evaluated.

B. Affine Invariant Shape Matching by Second-Order Network

In these experiments, all the images were taken by a digital camera from different and unknown viewing positions.


Fig. 4. 2-D symbol images to test the second-order network. (a) Model image.

The images were segmented by intensity thresholding. The feature points were chosen as extreme curvature points along the outside contour of the objects. They had been automatically extracted and labeled in a clockwise manner. After finding the matching correspondence by the proposed method, a postclustering algorithm [8] was also employed to find the parameters of the affine transformation and eliminate any spurious hypotheses generated. The affine parameters were computed by solving the linear equations of every three pairs of points.
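The postclustering step itself is the algorithm of [8]. Purely as an illustration of the idea (estimate the dominant affine transformation from the matched pairs and discard hypotheses that disagree with it), a much simpler stand-in could look like the sketch below, reusing affine_from_three_pairs from Section II; the median-based clustering and the tolerance are our own assumptions, not the method of [8]:

import numpy as np
from itertools import combinations

def postcluster(matches, model_pts, scene_pts, tol=3.0):
    """Estimate affine parameters from every triple of matched pairs, take the
    median parameter vector as the dominant transformation, and keep only the
    matches that agree with it to within tol pixels."""
    params = []
    for triple in combinations(matches, 3):
        p = np.array([model_pts[i] for i, _ in triple], dtype=float)
        q = np.array([scene_pts[k] for _, k in triple], dtype=float)
        try:
            A, t = affine_from_three_pairs(p, q)
        except np.linalg.LinAlgError:        # skip (nearly) collinear triples
            continue
        params.append(np.concatenate([A.ravel(), t]))
    theta = np.median(np.array(params), axis=0)
    A, t = theta[:4].reshape(2, 2), theta[4:]
    kept = [(i, k) for i, k in matches
            if np.linalg.norm(A @ np.asarray(model_pts[i], dtype=float) + t
                              - np.asarray(scene_pts[k], dtype=float)) < tol]
    return A, t, kept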

1) 2-D Symbol Matching: First, we test the method using 2-D symbol images. Fig. 4 is a set of arrow images taken from arbitrary viewpoints in our laboratory, including some occluded images. This kind of experiment is very useful in robot navigation [30]. When a robot with a camera is navigating in a building, it is natural to look for some predefined signs extracted from the environment to indicate the moving direction of the robot.

By applying our matching method based on the second-order Hopfield network, the matching correspondence between the scene images and the model image can be found effectively, even when the scene images are occluded. The matching details are summarized in Table III. From the table, it can be seen that there are some wrong matches in the occluded images, which are denoted in the eighth column of the table. However, they can be eliminated efficiently with the subsequent postclustering algorithm. The last column of the table denotes the affine parameters detected for the corresponding scene image by the postclustering algorithm. In the table, the affine parameters are denoted as the four matrix entries and the two translation components, where

(17)

The transformed models are overlaid onto the scenes in Fig. 5 according to the detected affine transformations. The dashed lines in Fig. 5 denote the scene contours, and the solid lines denote the transformed contours of the model object. It can be seen that the transformed models almost perfectly match the scene contours. For Fig. 4(e)–(h), even when some corner features are missing or the input image has been heavily distorted as in Fig. 4(f), the affine transformation can still be successfully recovered from the remaining information.

For the unoccluded scene images, such as Fig. 4(b)–(d), the matching algorithm generally takes no more than 1 s to converge on a Sun Sparc 10. For the occluded ones, it generally takes no more than half a minute to converge, depending on the number of occluded points.

2) Industrial Tool Recognition: We have also done several experiments on recognizing industrial objects in composite scenes. The models used are shown in Fig. 6(a)–(d). These objects are actually 3-D, but they have a good 2-D approximation. Some model objects have one degree of freedom, such as the angle between the handles of the pliers. However, this free parameter has been kept constant for each model object throughout the experiments.


TABLE III: MATCHING RESULTS ON THE ARROW IMAGES SHOWN IN FIG. 4

Fig. 5. Matching results of 2-D symbol images against the model image in (a), with dashed lines denoting the scene contours and the solid lines denoting the transformed contours of the model object.

Fig. 6. Contour images for the models and their feature points extracted.

(We have proposed a scene graph partition method [20], [31] to recognize such articulated objects under similarity transformations.) We intend to recognize these models from several composite scenes using the proposed method.

The model images were taken from a frontal view. However, since the recognition method is 2-D affine invariant, any other feasible camera viewpoint is possible. The three scene images shown in Fig. 7(a)–(c) were taken from unknown viewing positions. By applying the proposed matching method based on the second-order Hopfield network, the matching results are summarized in Table IV. From the table, the matching correspondences can be established successfully in the composite scenes under affine transformations. The matching results with other models for these three scene images are not listed here, because they do not lead to any true matching pairs. The parameters of the corresponding affine transformations have been recovered by applying the postclustering algorithm mentioned above. The transformed models are overlaid onto the scenes, as shown in Fig. 7(d)–(f).


Fig. 7. (a)–(c) Contour images for the scenes and their feature points extracted. (d)–(f) Matching results of the scene images, with dashed lines denoting the scene contours and the solid lines denoting the transformed contours of the model objects.

TABLE IV: MATCHING RESULTS ON THE INDUSTRIAL TOOL IMAGES SHOWN IN FIG. 7

From the figure, it can be seen that the transformed models have some deviation from the scenes. This is because the 2-D affine transformation is assumed to approximate the weak perspective transformation in a 3-D environment; thus there exist some perspective effects, as well as some noise from the segmentation and feature extraction processes.

VI. CONCLUSION

In this paper, we have presented two neural-network solutions for the affine invariant matching problem between two graphs based on Hopfield-type networks. The relational compatibility constraint based on affine invariants is developed for the first time and is recognized to be the key factor for generating the subgraph isomorphic mapping by the Hopfield network when affine transformations are permitted. A fourth-order Hopfield network based on this constraint has been derived first, which is suitable for matching any discrete set of feature points. Order reduction is then studied by taking advantage of the neighborhood information in the data, and a second-order Hopfield network has been subsequently proposed for affine invariant shape recognition. Since the second-order model is derived with a neighborhood assumption, incorrect neighborhood information, such as missing feature points, will inevitably introduce errors to the system. However, the experimental results on 2-D symbol matching and industrial tool recognition show that the introduced errors are not significant enough to distort the final results.

The proposed framework, with appropriate extensions, is expected to be capable of solving problems involving articulated object recognition and many-to-one correspondences invariant to affine transformation. Our work shows that high-order networks can be transformed to lower order ones by utilizing some a priori information. Neighborhood information has been utilized in this paper for this purpose in silhouette matching because the corresponding neighborhood property can be easily defined. For a discrete point data set, we could derive the neighborhood information from its convex hull [32]. Also, if only a partial order of the data is available, as in a tree structure, the parent-child relation may be used as the necessary a priori information. If no reliable a priori information is available, the affine invariant matching problem can only be solved with the fourth-order network suggested in Section III; in such a case we have to rely on hardware implementation for efficiency. Alternatively, different strategies of order reduction should be derived.

APPENDIX

PROOF FOR THE CONVERGENCE OF THE FOURTH-ORDER HOPFIELD NETWORK

In this Appendix, we follow steps similar to those in [26] and [33] to show that our energy function of the fourth-order Hopfield network for affine invariant matching is a Lyapunov-style function which continuously decreases until a minimum has been reached.

The energy function formed for the standard fourth-order continuous Hopfield network can be rewritten as follows [in (7), the integration term is ignored]:

(18)

In order to show that the energy function is a Lyapunov function which always converges to a stable state, we take the derivative of (18) with respect to time

(19)

where the partial derivative of E with respect to each neuron output, ∂E/∂V_{ik}, is calculated first

(20)

Here we have assumed that the fourth-order and second-order connection weights are zero when any two of their indexes are identical. According to the updating equation (12)

(21)

Substituting (21) into (20), we have

(22)

Substituting (22) into (19), the dynamics of the energy function is obtained

(23)

If the two connection weights are both symmetric

(24)

(25)

then

(26)

Obviously, dE/dt ≤ 0, because dV_{ik}/du_{ik} ≥ 0, as the activation is a nondecreasing sigmoid function. Also, dE/dt = 0 if and only if du_{ik}/dt = 0 for all i, k.

Therefore, we have shown that the energy function of the standard fourth-order network in (18) will continuously decrease until a minimum is reached, provided that (24) and (25) are both satisfied and that the connection weights vanish when any two of their indexes are identical. Next, we are going to show that the fourth-order network we proposed for affine invariant matching in Section III satisfies the above conditions, such that it must converge to a stable state.
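In outline, and in the notation used above (with V_{ik} = g(u_{ik}) for a nondecreasing sigmoid g), the argument carried by (18)–(26) takes the following standard form; the constant factors shown are conventional and the subscript style is ours:

E = -\frac{1}{4}\sum_{ik,jl,mn,op} T_{ik,jl,mn,op}\, V_{ik} V_{jl} V_{mn} V_{op}
    - \frac{1}{2}\sum_{ik,jl} T_{ik,jl}\, V_{ik} V_{jl}
    - \sum_{ik} I_{ik} V_{ik}

\frac{dE}{dt} = \sum_{ik} \frac{\partial E}{\partial V_{ik}} \frac{dV_{ik}}{dt},
\qquad
\frac{\partial E}{\partial V_{ik}}
  = -\Big( \sum_{jl,mn,op} T_{ik,jl,mn,op} V_{jl} V_{mn} V_{op}
           + \sum_{jl} T_{ik,jl} V_{jl} + I_{ik} \Big)
  = -\frac{du_{ik}}{dt}

using the symmetry conditions (24)–(25), the vanishing of the weights on repeated indexes, and the update rule (12); hence

\frac{dE}{dt} = -\sum_{ik} \frac{du_{ik}}{dt}\,\frac{dV_{ik}}{dt}
             = -\sum_{ik} g'(u_{ik}) \Big( \frac{du_{ik}}{dt} \Big)^{2} \le 0,

with equality exactly when du_{ik}/dt = 0 for all i, k.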

In our fourth-order network proposed for affine invariant matching, from (4) and (5) we can see that the compatibility constraint is independent of the ordering of the node-pair indexes, so the symmetry conditions (24) and (25) hold; and from (8) and (9) we can see that the fourth-order and second-order connection weights are zero when any two of their indexes are identical.

From the above derivation, we have shown that the dynamics of our energy function for affine invariant matching is continuously decreasing until the minimum energy is reached, where the states of the neurons remain constant and the network is stabilized. Hence we can conclude that the energy function used in our fourth-order network is of a Lyapunov type, which implies that the network will converge to a stable state as time passes.

ACKNOWLEDGMENT

The authors wish to thank the anonymous reviewers for their valuable suggestions.

REFERENCES

[1] J. J. Hopfield and D. W. Tank, “Neural computations of decisions in optimization problems,” Biol. Cybern., vol. 52, pp. 141–152, 1985.

[2] N. M. Nasrabadi and W. Li, “Object recognition by a Hopfield neural network,” IEEE Trans. Syst., Man, Cybern., vol. 21, pp. 1523–1535, Nov. 1991.

[3] W. C. Lin, C. K. Tsao, and T. Lingutla, “A hierarchical multiple-view approach to 3-D object recognition,” IEEE Trans. Neural Networks, vol. 2, pp. 84–92, Jan. 1991.

[4] S. Z. Li, “Matching: Invariant to translations, rotation, and scale changes,” Pattern Recognition, vol. 25, no. 6, pp. 583–594, 1992.

[5] M. K. Peng and N. K. Gupta, “Occluded object recognition by Hopfield networks,” in Proc. IEEE Int. Conf. Neural Networks, vol. 7, 1994, pp. 4309–4315.


[6] P. N. Suganthan, E. K. Teoh, and D. P. Mital, “Hopfield network with constraint parameter adaption for overlapped shape recognition,” IEEE Trans. Neural Networks, vol. 10, pp. 444–449, Mar. 1999.

[7] A. Rangarajan and E. Mjolsness, “A Lagrangian relaxation network for graph matching,” IEEE Trans. Neural Networks, vol. 7, pp. 1365–1380, Nov. 1996.

[8] P. N. Suganthan, E. K. Teoh, and D. P. Mital, “Pattern recognition by graph matching using the Potts MFT neural networks,” Pattern Recognition, vol. 28, no. 7, pp. 997–1009, 1995.

[9] P. M. Wong and T. Lee, “Graph matching of 3-D point image invariant to scale, rotation, and translation using Hopfield network,” in Proc. ISSPR’98, vol. 1, 1998, pp. 201–206.

[10] S. S. Young, P. D. Scott, and N. M. Nasrabadi, “Object recognition using multilayer Hopfield neural network,” IEEE Trans. Image Processing, vol. 6, pp. 357–372, Mar. 1997.

[11] J. H. Kim, S. H. Yoon, and K. H. Sohn, “A robust boundary-based object recognition in occlusion environment by hybrid Hopfield neural networks,” Pattern Recognition, vol. 29, no. 12, pp. 2047–2060, 1996.

[12] T.-W. Chen and W.-C. Lin, “A neural-network approach to CSG-based 3-D object recognition,” IEEE Trans. Pattern Anal. Machine Intell., vol. 16, pp. 719–726, Jul. 1994.

[13] S. Kurogi, “Competitive neural network for affine invariant pattern recognition,” in Proc. Int. Joint Conf. Neural Networks, 1993, pp. 181–184.

[14] G. Bebis, M. Georgiopoulos, N. da Vitoria Lobo, and M. Shah, “Learning affine transformations,” Pattern Recognition, vol. 32, pp. 1783–1799, 1999.

[15] K. Arbter, W. E. Snyder, H. Burkhardt, and G. Hirzinger, “Application of affine-invariant Fourier descriptors to recognition of 3-D objects,” IEEE Trans. Pattern Anal. Machine Intell., vol. 12, pp. 640–647, Jul. 1990.

[16] Z. Huang and F. S. Cohen, “Affine-invariant B-spline moments for curve matching,” IEEE Trans. Image Processing, vol. 5, pp. 1473–1480, Oct. 1996.

[17] F. S. Cohen, Z. Huang, and Z. Yang, “Invariant matching and identification of curves using B-splines curve representation,” IEEE Trans. Image Processing, vol. 4, pp. 1–10, Jan. 1995.

[18] Y. Lamdan, J. T. Schwartz, and H. J. Wolfson, “Affine invariant model-based object recognition,” IEEE Trans. Robot. Automat., vol. 6, pp. 578–589, Oct. 1990.

[19] L. G. Shapiro and R. M. Haralick, “Structural descriptions and inexact matching,” IEEE Trans. Pattern Anal. Machine Intell., vol. 3, pp. 504–519, Sep. 1981.

[20] W. J. Li and T. Lee, “Object recognition by subscene graph matching,” in Proc. IEEE Int. Conf. Robot. Automat., Apr. 2000, pp. 1459–1464.

[21] N. Ansari and E. J. Delp, “On detecting dominant points,” Pattern Recognition, vol. 24, no. 5, pp. 441–451, 1991.

[22] J. L. Mundy and A. Zisserman, Geometric Invariance in Computer Vision. Cambridge, MA: MIT Press, 1992.

[23] C. L. Giles and T. Maxwell, “Learning, invariance, and generalization in high-order neural networks,” Appl. Opt., vol. 26, no. 23, pp. 4972–4978, Dec. 1987.

[24] T. Samad and P. Harper, “High-order Hopfield and Tank optimization networks,” Parallel Comput., vol. 16, pp. 287–292, 1990.

[25] J. Su, A. Hu, and Z. He, “Solving a kind of nonlinear programming problems via analog neural networks,” Neurocomput., vol. 18, pp. 1–9, 1998.

[26] W. T. Tsui, X. Xu, and N. K. Huang, “A generalized neural-network model,” in Abstracts, 1st Annu. INNS Meet., vol. 1 (Supplement), 1988.

[27] O. M. Omidvar, Progress in Neural Networks-Shape Recognition. Exeter, U.K.: Intellect Books, 1999, vol. 6.

[28] K. Siddiqi, A. Shokoufandeh, S. J. Dickinson, and S. W. Zucker, “Shock graphs and shape matching,” Int. J. Comput. Vision, vol. 35, pp. 13–32, 1999.

[29] Z. Yang and F. S. Cohen, “Image registration and object recognition using affine invariants and convex hulls,” IEEE Trans. Image Processing, vol. 8, pp. 934–946, Jul. 1999.

[30] A. J. Briggs, D. Scharstein, D. Braziunas, C. Dima, and P. Wall, “Mobile robot navigation using self-similar landmarks,” in Proc. IEEE Int. Conf. Robot. Automat., Apr. 2000, pp. 1428–1434.

[31] W.-J. Li, T. Lee, and H.-T. Tsui, “Image analysis by accumulative Hopfield matching,” in Proc. 15th Int. Conf. Pattern Recognition, Sep. 2000, pp. 442–445.

[32] W.-J. Li and T. Lee, “Image registration and object recognition by affine invariant matching,” in Proc. Int. Symp. Intell. Multimedia, Video, Speech Processing, May 2001, pp. 56–59.

[33] J. J. Hopfield, “Neurons with graded response have collective computational properties like those of two-state neurons,” in Proc. Nat. Academy Sci. USA, May 1984, pp. 3088–3092.

Wen-Jing Li received the B.E. and M.E. degrees from the Department of Computer Science and Engineering, Hebei University of Technology, Tianjin, China, in 1995 and 1998, respectively. She is currently pursuing the Ph.D. degree in neural computing and computer vision at the Computer Vision and Image Processing Laboratory, Department of Electronic Engineering, The Chinese University of Hong Kong.

Tong Lee received the B.E. degree with first class honors from the School of Electrical Engineering and Computer Science, University of New South Wales, Australia, in 1983, being awarded the University Medal on graduation. He also received the Ph.D. degree in electrical engineering from the same university for his research in image processing and pattern recognition in 1987.

In 1988, he joined the Chinese University of Hong Kong, and is currently an Associate Professor in the Department of Electronic Engineering. His research interests are neural computing, pattern recognition, image processing, and evolutionary computation.