
Pattern Recognition 33 (2000) 53-68

Relaxation labeling in stereo image matching

Gonzalo Pajares*, Jesús Manuel de la Cruz, José Antonio López-Orozco

Dpto. Arquitectura de Computadores y Automática, Facultad de CC Físicas, Universidad Complutense de Madrid, 28040 Madrid, Spain

Received 8 July 1998; accepted 14 December 1998

Abstract

This paper outlines a method for solving the global stereovision matching problem using edge segments as the primitives. A relaxation scheme is the technique commonly used by existing methods to solve this problem. These techniques generally impose the following competing constraints: similarity, smoothness, ordering and uniqueness, and assume a bound on the disparity range. The smoothness constraint is basic in the relaxation process. We have verified that the smoothness and ordering constraints can be violated by objects close to the cameras and that the setting of the disparity limit is a serious problem. This problem also arises when repetitive structures appear in the scene (i.e. complex images), where the existing methods produce a high number of failures. We develop our approach from a relaxation labeling method ([1] W.J. Christmas, J. Kittler, M. Petrou, Structural matching in computer vision using probabilistic relaxation, IEEE Trans. Pattern Anal. Mach. Intell. 17 (8) (1995) 749-764), which allows us to map the above constraints. The main contribution is made (1) by applying a learning strategy in the similarity constraint and (2) by introducing specific conditions to overcome the violation of the smoothness constraint and to avoid the serious problem produced by the required fixation of a disparity limit. Consequently, we improve the stereovision matching process. The better performance of the proposed method is illustrated by a comparative analysis against some recent global matching methods. © 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Stereo correspondence; Probabilistic relaxation; Similarity; Smoothness; Uniqueness

1. Introduction

A major portion of the research efforts of the computer vision community has been directed towards the study of the three-dimensional (3-D) structure of objects using machine analysis of images [2-4]. Analysis of video images in stereo has emerged as an important passive method for extracting the 3-D structure of a scene. Following the Barnard and Fischler [5] terminology, we can view the problem of stereo analysis as consisting of the following steps: image acquisition, camera modeling, feature acquisition, image matching, depth determination and interpolation. The key step is that of image matching, that is, the process of identifying the corresponding points in two images that are cast by the same physical point in 3-D space. This paper is devoted solely to this problem.

*Corresponding author. Tel.: +34-91-3944477; fax: +34-91-3944687.

E-mail address: pajares@eucmax.sim.ucm.es (G. Pajares)

The basic principle involved in the recovery of depth using passive imaging is triangulation. In stereopsis the triangulation needs to be achieved with the help of only the existing environmental illumination. Hence, a correspondence needs to be established between features from two images that correspond to some physical feature in space. Then, provided that the position of the centers of projection, the effective focal length, the orientation of the optical axis, and the sampling interval of each camera are known, the depth can be established using triangulation [6].

1.1. Techniques in stereovision matching

A review of the state of the art in stereovision matching allows us to distinguish two sorts of techniques broadly used in this discipline: area based and feature based [3,4,7,8]. Area-based stereo techniques use correlation between brightness (intensity) patterns in the local neighborhood of a pixel in one image with brightness patterns


in the local neighborhood of the other image [9-14], where the number of possible matches is intrinsically high, while feature-based methods use sets of pixels with similar attributes, normally either pixels belonging to edges [15-27] or the corresponding edges themselves [4,8,28-38]. These latter methods lead to a sparse depth map only, leaving the rest of the surface to be reconstructed by interpolation; but they are faster than area-based methods as there are many fewer points (features) to be considered [3]. We select a feature-based method, as justified below.

1.2. Factors affecting the physical stereovision system and choice of features

There are extrinsic and intrinsic factors affecting the stereovision matching system: (a) extrinsic: in a practical stereo vision system, the left and right images are obtained with two cameras placed at different positions/angles; although they both capture the same scene, each camera may receive a different illumination and also, incidentally, different reflections; (b) intrinsic: the stereovision system is equipped with two physical cameras, always placed at the same relative position (left and right); although they are the same commercial model, their optical devices and electronic components are different, hence each camera may convert the same illumination level to a different gray level independently.

As a result of the above-mentioned factors, the corresponding features in both images may display different values. This may lead to incorrect matches. Thus, it is very important to find features in both images which are unique or independent of possible variations in the images [39]. This research uses a feature-based method where edge segments are to be matched, since our experiments have been carried out in an artificial environment where edge segments are abundant. This environment is where a mobile robot equipped with the stereovision system navigates.

Such features have been studied in terms of reliability [7] and robustness [39] and, as mentioned before, have also been used in previous stereovision matching works. This justifies our choice of features, although they may be too local. Four average attribute values (module and direction of the gradient, variance and Laplacian) are computed for each edge segment, as we will see later. The ideas described here can be used for other simple geometric primitives with other attributes, even though in the details of the current implementation the line segments play an important role.

1.3. Constraints applied in stereovision matching

Our stereo correspondence problem can be defined in terms of finding pairs of true matches, namely, pairs of edge segments in two images that are generated by the same physical edge segment in space. These true matches generally satisfy some competing constraints [22,23,27]: (1) similarity: matched edge segments have similar local properties or attributes; (2) smoothness: disparity values in a given neighborhood change smoothly, except at a few depth discontinuities; (3) ordering: the relative position between two segments in the left image is preserved in the right one for the corresponding matches; (4) uniqueness: each edge segment in one image should be matched to a unique edge segment in the other image.

We now introduce the very important concept of disparity. Assume a stereo pair of edge segments, where one segment belongs to the left image and the other to the right one. If we superpose the right image over the left one, for example, the two edge segments of the stereo pair appear horizontally displaced. Following horizontal (epipolar) lines we determine the two intersection points of each epipolar line with the pair of edge segments. The relative displacement of the x-coordinates of the two intersection points is the disparity for such points. We compute the disparity for the pair of edge segments as the average displacement between points belonging to both edge segments along the common length.
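For illustration only, the following Python sketch computes this average disparity for a pair of edge segments represented by their endpoint coordinates; the helper names and the segment representation are assumptions of the sketch, not part of the paper.

```python
def x_at_row(segment, y):
    """Linearly interpolate the x-coordinate of an edge segment at scanline y.
    A segment is given by its two endpoints (x0, y0) and (x1, y1)."""
    (x0, y0), (x1, y1) = segment
    if y1 == y0:                      # horizontal segment: take its midpoint
        return 0.5 * (x0 + x1)
    t = (y - y0) / (y1 - y0)
    return x0 + t * (x1 - x0)

def disparity(left_seg, right_seg):
    """Average disparity of a pair of edge segments along their common length.

    For every epipolar (horizontal) line crossing both segments, the disparity
    is the x-displacement between the two intersection points; the pair
    disparity is the mean of these displacements."""
    y_min = max(min(p[1] for p in left_seg), min(p[1] for p in right_seg))
    y_max = min(max(p[1] for p in left_seg), max(p[1] for p in right_seg))
    if y_max < y_min:
        return None                   # the segments do not share any scanline
    rows = range(int(y_min), int(y_max) + 1)
    return sum(x_at_row(left_seg, y) - x_at_row(right_seg, y) for y in rows) / len(rows)
```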

The similarity constraint is associated with a local matching process and the smoothness and ordering constraints with a global matching one. The major difficulty of stereo processing arises from the need to make global correspondences. A local edge segment in one image may match equally well with a number of edge segments in the other image. This problem is compounded by the fact that the local matches are not perfect due to the above-mentioned extrinsic and intrinsic factors. These ambiguities in local matches can only be resolved by considering sets of local matches globally. Hence, to make global correspondences for a given pair of edge segments, we consider a set of neighboring edge segments, where a bound on the disparity range defines the neighborhood. Relaxation is a technique commonly used to find the best matches globally; it refers to any computational mechanism that employs a set of locally interacting parallel processes, one associated with each image unit, that in an iterative fashion update each unit's current labeling in order to achieve a globally consistent interpretation of image data [40,41].

1.4. Relaxation in stereovision

Relaxation labeling is a technique proposed by Rosenfeld et al. [42] and developed to deal with uncertainty in sensory data interpretation systems and to find the best matches. It uses contextual information as an aid in classifying a set of interdependent objects by allowing interactions among the possible classifications of related objects. In the stereo paradigm the problem involves assigning unique labels (or matches) to a set of features in an image from a given list of possible matches. So, the goal is to assign each feature (edge segment) a value corresponding to disparity in a manner consistent with certain predefined constraints.

Two main relaxation processes can be distinguished in stereo correspondence: optimization-based and probabilistic/merit-based. In the optimization-based processes stereo correspondence is carried out by minimizing an energy function which is formulated from the applicable constraints. It represents a mechanism for the propagation of constraints among neighboring match features for the removal of ambiguity of multiple stereo matches in an iterative manner. The optimal solution is the ground state, that is, the state (or states) of the lowest energy. In the probabilistic/merit-based processes, the initial probabilities/merits, established from a local stereo correspondence process and computed from similarity in the feature values, are updated iteratively depending on the matching probabilities/merits of neighboring features and also on the applicable constraints. The following papers use a relaxation technique: (a) probabilistic/merit [1,4,17,18,20-22,25,26,41-54] and (b) optimization through a Hopfield neural network [12,14,24,27,36,55,56]; we use (a).

1.5. Motivational research and contribution

For a given pair of features the smoothness constraint makes global correspondences based on how well its neighbors match. The neighborhood region is defined by a bound on the disparity range. In our experiments we have found complex images with highly repetitive structures, with objects very close to the cameras or with occlusions. In complex images setting a disparity limit is a difficult task because, according to such a limit, the smoothness constraint can be violated. Additionally, in complex images the ordering constraint is also easily violated.

We propose a probabilistic relaxation approach for solving the edge-segment stereovision matching problem and base our method on the work of Christmas et al. [1] (similar to those proposed by Rosenfeld et al. [42] or Peleg [48]) for the following reasons: (a) it incorporates the recent advances of the theory of probabilistic relaxation, (b) the similarity, smoothness and ordering constraints can be mapped directly, and (c) it has been satisfactorily applied to stereovision matching tasks [1].

The main contribution of the paper is the use, in the relaxation process, of a compatibility measure between pairs of matched segments. This enforces consistency by imposing the smoothness and ordering constraints when there is no suspicion that such constraints are violated; otherwise a different compatibility measure is used and the lost global consistency is replaced by local consistency through the matching probabilities. With this approach, our method is valid for common stereo images and is extended to deal robustly with complex stereo images where the smoothness and ordering constraints are commonly violated. The extension of our approach is based upon the idea of Dhond and Aggarwal [57] for dealing with occluding objects.

Additional contributions are made in the following ways: (a) by using the learning strategy of Cruz et al. [31] and Pajares [4] to compute the matching probability for each pair of edge segments; the matching probabilities are used to start the relaxation process and then to obtain a compatibility measure when the smoothness and ordering constraints are violated; and (b) by using the minimum differential disparity criterion of Medioni and Nevatia [8] and the approach of Ruichek and Postaire [27] to map the smoothness and ordering constraints, respectively.

Therefore, our method integrates several criteria following the idea of Pavlidis [58], and we find that it works properly even if the images are complex.

1.6. Paper organization

The paper is organized as follows: in Section 2 the stereovision probabilistic relaxation matching scheme is described. In Section 3 we summarize the proposed matching procedure. The performance of the method is illustrated in Section 4, where a comparative study against existing global relaxation matching methods is carried out, and the necessity of using a relaxation matching strategy rather than a local matching technique is also made clear. Finally, in Section 5 there is a discussion of some related topics.

2. Probabilistic relaxation scheme in stereovision matching

As mentioned earlier, this paper proposes a probabilistic relaxation technique for solving the stereovision matching problem. This technique is extended to work properly with complex images and integrates several criteria. In this section we deal with these issues.

2.1. Feature and attribute extraction

Fig. 1. Broken edge segments v and w may match with u.

We believe, as in Medioni and Nevatia [8], that feature-based stereo systems have strong advantages over area-based correlation systems. However, detection of the boundaries is a complex and time-consuming scene-analysis task. The contour edges in both images are extracted using the Laplacian of Gaussian filter in accordance with the zero-crossing criterion [59,60]. For each zero crossing in a given image, its gradient vector, Laplacian and variance values are computed from the gray levels of a central pixel and its eight immediate neighbors. The gradient vector (magnitude and direction) is computed as in Leu and Yau [61], the Laplacian value as in Lew et al. [62] and the variance value as in Krotkov [63]. To find the gradient magnitude of the central pixel we compare the gray-level differences of the four pairs of opposite pixels in the 8-neighborhood, and the largest difference is taken as the gradient magnitude. The gradient direction of the central pixel is the direction, out of the eight principal directions, whose opposite pixels yield the largest gray-level difference and which also points in the direction in which the pixel gray level is increasing. We use a chain code and assign eight digits to represent the eight principal directions; these digits are integer numbers from 1 to 8. This approach allows the normalization of the gradient direction, so its values fall in the same range as the remainder of the attribute values. In order to avoid noise problems in edge detection that can lead to later mismatches in realistic images, the following two globally consistent methods are used: (1) the edges are obtained by joining adjacent zero crossings following the algorithm of Tanaka and Kak [64]; a margin of deviation of ±20% in gradient magnitude and of ±45° in gradient direction is allowed; (2) each detected contour is approximated by a series of piecewise linear line segments [65]. Finally, for every segment, an average value of the four attributes is obtained from all computed values of its zero crossings. All average attribute values are normalized in the same range. Each segment is identified with its initial and final pixel coordinates, its length and its label.
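For illustration, a minimal Python sketch of this gradient computation on a 3×3 gray-level patch follows; the array layout, the pairing of opposite pixels and the assignment of chain-code digits are assumptions made for the sketch, not the paper's exact convention.

```python
import numpy as np

# The four pairs of opposite pixels around the central pixel of a 3x3 patch,
# given as (row, col) offsets; each pair defines one of the principal directions.
OPPOSITE_PAIRS = [((0, 1), (2, 1)),   # vertical
                  ((1, 0), (1, 2)),   # horizontal
                  ((0, 0), (2, 2)),   # one diagonal
                  ((0, 2), (2, 0))]   # the other diagonal

def gradient_from_patch(patch):
    """Gradient magnitude and chain-coded direction (1..8) of the central pixel
    of a 3x3 gray-level patch, following the opposite-pixel comparison idea."""
    best_mag, best_dir = -1.0, 1
    for k, ((r0, c0), (r1, c1)) in enumerate(OPPOSITE_PAIRS):
        diff = float(patch[r0, c0]) - float(patch[r1, c1])
        if abs(diff) > best_mag:
            best_mag = abs(diff)
            # two chain-code digits per pair, one for each sense of increase
            best_dir = 2 * k + 1 if diff > 0 else 2 * k + 2
    return best_mag, best_dir

patch = np.array([[10, 12, 40],
                  [11, 20, 60],
                  [12, 25, 80]], dtype=float)
print(gradient_from_patch(patch))
```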

Therefore, each pair of features has two associated four-dimensional vectors x_l and x_r, where the components are the attribute values and the sub-indices l and r denote features belonging to the left and right images, respectively. A four-dimensional difference measurement vector x is then also obtained from the above x_l and x_r vectors, x = x_l − x_r = {x_m, x_d, x_l, x_v}. The components of x are the corresponding differences for module and direction of the gradient, Laplacian and variance values, respectively, associated with the same four attributes used in our approach.

Of all the possible combinations of pairs of matches formed by segments of the left and right images, only those pairs with a difference value in the direction of the gradient of less than ±45°, and taking into account that they overlap, will be processed. This is called the initial condition. The remaining pairs that do not meet such conditions are directly classified as False correspondences. The overlap is a concept introduced by Medioni and Nevatia [8]: two segments u and v overlap if, by sliding one of them in a direction parallel to the epipolar line, they would intersect. We note that a segment in one image is allowed to match more than one segment in the other image. Therefore, we make a provision for broken segments resulting in possible multiple correct matches. The following pedagogical example in Fig. 1 clarifies the above.

The edge segment u in the left image matches with the broken segment represented by v and w in the right image, but under the condition that v and w do not overlap and that their orientations, measured by their corresponding gradient-direction attributes, do not differ by more than ±10°.
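As an illustrative sketch (not the paper's implementation), the overlap test and the initial-condition filter could be expressed as follows, assuming each segment carries its endpoint coordinates and an average gradient-direction attribute; the threshold must be expressed in the same units as that stored attribute.

```python
def overlaps(seg_a, seg_b):
    """Two segments overlap if, sliding one of them parallel to the epipolar
    (horizontal) line, they would intersect, i.e. their row ranges intersect."""
    ya = sorted(p[1] for p in seg_a)
    yb = sorted(p[1] for p in seg_b)
    return max(ya[0], yb[0]) <= min(ya[1], yb[1])

def meets_initial_condition(left, right, max_dir_diff=45.0):
    """Initial condition of Section 2.1: a pair is processed only if the
    gradient-direction difference is below the threshold and the segments
    overlap.  'direction' and 'endpoints' are assumed dictionary keys."""
    return (abs(left["direction"] - right["direction"]) < max_dir_diff
            and overlaps(left["endpoints"], right["endpoints"]))
```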

2.2. Computation of the local matching probability

The local probabilities for all pairs of features that meet the above initial conditions are computed following a learning strategy [4,31]. This is briefly described below.

Due to the above-mentioned extrinsic and intrinsic factors, we have verified that the differences between the attributes for the true matches cluster in a cloud around a center, and we have assumed an underlying probability density function (PDFt). The form of the PDFt is known. Our experience leads us to model this PDFt as a Gaussian one with two unknown parameters, the mean difference vector and the covariance matrix. The first is the center of the cluster and the second gives us information about the dispersion of the differences of the attributes in the cluster. Both parameters are estimated through a learning process based on the parametric Bayes classifier by a maximum likelihood method [4,66]. The learning process is carried out with a number of samples through a training process.
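Under the Gaussian assumption, the maximum-likelihood estimates of these two parameters are simply the sample mean and covariance of the attribute-difference vectors collected from known true matches. A minimal sketch, assuming the training differences are available as a NumPy array:

```python
import numpy as np

def learn_pdf_parameters(true_match_differences):
    """Maximum-likelihood estimates of the cluster centre and covariance matrix
    from an (n_samples, 4) array of attribute differences of known true matches."""
    X = np.asarray(true_match_differences, dtype=float)
    mu = X.mean(axis=0)                          # centre of the cluster
    sigma = np.cov(X, rowvar=False, bias=True)   # ML covariance (divides by n)
    return mu, sigma
```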

Once the PDFt is fully described, it allows us to compute the local matching probability associated with any new and current pair of edge segments, using the differences of its attributes, according to the following expression [4,31,66-68]:

p(x) = (1 / |Σ|^(1/2)) exp[ −(1/2) (x − μ)^t Σ^(−1) (x − μ) ],   (1)

where μ and Σ are the corresponding learned cluster center and covariance matrix, respectively, x is the four-dimensional difference measurement vector (see Section 2.1) and t stands for the transpose.
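A compact sketch of Eq. (1), where the learned mean vector and covariance matrix are assumed to come from the training stage described above; the example values at the end are hypothetical and only show the call.

```python
import numpy as np

def local_matching_probability(x_left, x_right, mu, sigma):
    """Eq. (1): Gaussian similarity score for a candidate pair of edge segments.

    x_left, x_right : 4-D attribute vectors (module, direction, Laplacian, variance)
    mu, sigma       : cluster centre and covariance matrix learned from true matches
    """
    x = np.asarray(x_left, float) - np.asarray(x_right, float)   # difference vector
    d = x - np.asarray(mu, float)
    mahalanobis = d @ np.linalg.inv(sigma) @ d
    return np.exp(-0.5 * mahalanobis) / np.sqrt(np.linalg.det(sigma))

# Hypothetical values, only to show the call:
mu = np.zeros(4)
sigma = np.diag([1.0, 1.0, 1.5, 1.5])
print(local_matching_probability([0.5, 0.2, 0.1, 0.4], [0.45, 0.25, 0.15, 0.35], mu, sigma))
```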

During the computation of the local probabilities, we explicitly apply the similarity constraint between features from the left and right images. These local probabilities are the starting point for the next relaxation matching process, i.e. the initial probabilities for the global relaxation process.

2.3. A review of the relaxation labeling processes

Recent advances in the theory of probabilistic relaxation can be found in Refs. [1,69], although it has been studied in previous works [42,44,48,51,54]. The probabilistic relaxation process is made applicable to general matching problems. The matching problem is formulated in the Bayesian framework for contextual label assignment. The formulation leads to an evidence-combining formula which prescribes in a consistent and unified manner how unary attribute measurements relating to single entities, binary relation measurements relating to pairs of objects, and prior knowledge of the environment (initial probabilities) should be jointly brought to bear on the object labeling problem. The theoretical framework also suggests how the probabilistic relaxation process should be initialized based on unary measurements (according to Section 2.2). This contrasts with the approach adopted by Li [70], who started the iterative process from a random assignment of label probabilities.

A set A of N objects of the scene is to be matched, A = {a_1, a_2, ..., a_N}. The goal is to match the scene to a model. Each object a_i has an assigned label θ_i, which may take as its value any of the M model labels that form the set Ω: Ω = {ω_1, ω_2, ..., ω_M}. The notation ω_θi indicates that a model label is associated with a particular scene label θ_i. At the end of the labeling process, it is expected that each object will have one unambiguous label value. For convenience two sets of indices are defined: N_0 ≡ {1, 2, ..., N} and N_i ≡ {1, 2, ..., i−1, i+1, ..., N}.

For each object a_i a set of m_1 measurements x_i is available, corresponding to the unary attributes of the object: x_i = {x_i^(1), x_i^(2), ..., x_i^(m_1)}. The abbreviation x_i, i ∈ N_0, denotes the set of all unary measurement vectors x_i made on the set A of objects, i.e. x_i, i ∈ N_0 = {x_1, ..., x_N}.

For each pair of objects a_i and a_j a set of m_2 binary measurements is available, A_ij: A_ij = {A_ij^(1), A_ij^(2), ..., A_ij^(m_2)}. The same classes of unary and binary measurements are also made on the model. After introduction of the above notations, and following Christmas et al. [1], the theoretical framework of Bayesian probability for object labeling using probabilistic relaxation is as follows: the label θ_i of an object a_i will be given the value ω_θi provided that it is the most probable label given all the information we have for the system, i.e. all unary measurements and the values of all binary relations between the various objects. By mathematical induction it can be shown that there is no need for the inclusion of all binary relations of all objects in order to identify an object, and the most appropriate label of object a_i is ω_θi, given by

P(θ_i = ω_θi | x_j, j ∈ N_0, A_ij, j ∈ N_i) = max_{ω_j ∈ Ω} P(θ_i = ω_j | x_j, j ∈ N_0, A_ij, j ∈ N_i),   (2)

where the upper-case P represents the probability of an event; thus P(θ_i = ω_j) and P(θ_i = ω_θi) denote the probabilities that scene label θ_i is matched to model labels ω_j and ω_θi, respectively. Under certain often-adopted assumptions, the conditional probabilities in this equation can be expressed in a form that indicates that a relaxation updating rule would be an appropriate method of finding the required maximum. Using Bayes's rule, and after an exhaustive treatment (see Christmas et al. [1]), the desired solution to the problem of labeling, as defined by Eq. (2), can be obtained using an iterative scheme as follows:

P^(n+1)(θ_i = ω_θi) = P^(n)(θ_i = ω_θi) Q^(n)(θ_i = ω_θi) / Σ_{ω_j ∈ Ω} P^(n)(θ_i = ω_j) Q^(n)(θ_i = ω_j),   (3)

where

Q^(n)(θ_i = ω_α) = Π_{j ∈ N_i} Σ_{ω_β ∈ Ω} P^(n)(θ_j = ω_β) p(A_ij | θ_i = ω_α, θ_j = ω_β),   (4)

where P^(n)(θ_i = ω_θi) is the matching probability at level (iteration) n that scene label θ_i is matched to model label ω_θi and P^(n+1)(θ_i = ω_θi) is the updated probability of the corresponding match at level n+1. The quantity Q^(n)(θ_i = ω_α) expresses the support that the match θ_i = ω_α receives at the nth iteration step from the other objects in the scene, taking into consideration the binary relations that exist between them and object a_i. The density function p(A_ij | θ_i = ω_α, θ_j = ω_β) corresponds to the compatibility coefficients of other methods [42,44]; i.e. it quantifies the compatibility between the match θ_i = ω_α and a neighboring match θ_j = ω_β. The density function ranges from 0 to 1 and will be expressed as a compatibility coefficient in Eq. (6).
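A minimal sketch of the updating rule of Eqs. (3) and (4): here P is a dictionary of matching probabilities indexed by (object, label) pairs and compat() plays the role of the density/compatibility term; both data structures are assumptions used only for illustration.

```python
def support(P, compat, i, labels, neighbours):
    """Eq. (4): support Q(i -> j) that each candidate label j of object i receives
    from the current labelings of the neighbouring objects h."""
    Q = {}
    for j in labels[i]:
        q = 1.0
        for h in neighbours[i]:
            q *= sum(P[(h, k)] * compat(i, j, h, k) for k in labels[h])
        Q[j] = q
    return Q

def relaxation_step(P, compat, labels, neighbours):
    """Eq. (3): one synchronous update of all matching probabilities."""
    P_new = {}
    for i in labels:
        Q = support(P, compat, i, labels, neighbours)
        z = sum(P[(i, j)] * Q[j] for j in labels[i]) or 1.0   # avoid division by zero
        for j in labels[i]:
            P_new[(i, j)] = P[(i, j)] * Q[j] / z
    return P_new
```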

2.4. Stereo correspondence using relaxation labeling

At this stage there are two goals in mind: (a) to show how the relaxation labeling theory may be applied to stereovision matching problems and (b) to map the similarity, smoothness and ordering stereovision matching constraints into Eqs. (3) and (4). The uniqueness constraint will be considered later, during the decision process in Section 4.3.

2.4.1. Stereovision matching by relaxation labeling

As explained in the introduction, stereovision matching is the process of identifying the corresponding features in two images (left and right) that are cast by the same physical feature in 3-D space. In order to apply the relaxation labeling theory to stereovision matching, and without loss of generality, we associate the left stereo image with the scene and the right one with the model. With this approach we have to match the scene to a model, and the stereovision matching problem is now a relaxation matching one.

In our approach, the objects in the scene and the model are edge segments. Following the notation introduced in Section 2.3, the set A of N objects is identified with the set of N edge segments in the left image of the stereo pair, and each left edge segment (object) is identified by its corresponding label θ_i. The edge segments from the right image are associated with the model labels that form the set Ω. The notation ω_θi in relaxation labeling now indicates that we wish to match a right edge segment with a particular left edge segment.

According to Section 2.3, the unary attributes m_1 used to map the similarity constraint are the module and direction of the gradient, and the Laplacian and variance values, as explained in Section 2.1. Therefore m_1 takes the value 4. The binary relations are: (a) the differential disparity of Ref. [8], through which we apply the smoothness constraint, and (b) the ordering relation, used to apply the ordering constraint. Hence m_2 takes the value 2. According to Section 1.3, we can infer that unary and binary measurements are associated with local and global correspondences, respectively.

2.4.2. Mapping the similarity, smoothness and ordering stereovision matching constraints

The next step consists in mapping the similarity, smoothness and ordering stereovision matching constraints [20-23,27] into the relaxation labeling Eqs. (3) and (4). To achieve this we have made two important contributions to solve the stereovision matching problem: (a) by applying a learning strategy in the similarity constraint, and (b) by introducing specific conditions where we take into account the possible violation of the smoothness and ordering constraints (objects in the 3-D scene very close to the cameras and occlusions) and the problem of fixing a disparity limit (repetitive structures). We can also point out that, unlike Christmas et al. [1], and due to the possible violation of the smoothness and ordering constraints, we have not considered structural or geometrical binary relations for mapping the smoothness constraint. Some of these binary relations can be found in Ref. [1] (angle between line segments, minimum distance between the endpoints of line segments, distance between the midpoints of line segments).

(a) Mapping the similarity constraint: The similarity constraint is mapped as follows. Given a pair of edge segments, from the left and right images, to be matched and labeled as θ_i and ω_j, respectively, we have the four-dimensional difference measurement vector x_i, which is obviously a measurement of similarity between the two edge segments involved. This vector is obtained from the corresponding unary measurement vectors associated with the left edge segment, x_l, and the right edge segment, x_r, as explained in Section 2.1, so x_i = {x_im, x_id, x_il, x_iv}. The matching probability at level n, on the left-hand side of Eq. (1), that the left edge segment θ_i is matched to the right edge segment ω_j is now P^(n)(θ_i = ω_j), and x is now x_i.

(b) Mapping the smoothness constraint: Assuming no violation of the smoothness constraint, and unlike Christmas et al. [1], the density function p(A_ij | θ_i = ω_α, θ_j = ω_β) in the support of the match, given in Eq. (4), is replaced by a compatibility coefficient in which the smoothness and ordering constraints are mapped. This is the idea proposed by Hummel and Zucker [44] and Rosenfeld et al. [42]. Traditional interpretations of compatibility coefficients have been in terms of statistical measures such as correlation [42] or structural information [1]. In this paper we define a new compatibility coefficient.

However, beforehand we change the notation for simplicity: given two pairs of matched edge segments (θ_i, ω_j) and (θ_h, ω_k), throughout the remainder of the paper these pairs are denoted as (i, j) and (h, k), respectively, and the matching probabilities P^(n)(θ_i = ω_j) and P^(n)(θ_h = ω_k) as P^(n)_ij and P^(n)_hk, respectively. We measure the consistency between (i, j) and (h, k) by means of the compatibility coefficient c(i, j; h, k), which ranges in the interval [0,1].

According to Medioni and Nevatia [8], the smoothness constraint assumes that neighboring edge segments have similar disparities, except at a few depth discontinuities [12]. Generally, to define the neighboring region, a bound is assumed on the disparity range allowed for any given segment. This limit is denoted as maxd, fixed at eight pixels in this paper. For each edge segment "i" in the left image we define a window w(i) in the right image in which corresponding segments from the right image must lie and, similarly, for each edge segment "j" in the right image we define a window w(j) in the left image in which corresponding segments from the left image must lie. An edge segment "h" is said to lie in a window if at least 30% of the length of the segment "h" is contained in the corresponding window. The shape of these windows is a parallelogram: two sides are fixed by "i" or "j" and the other two are lines of length 2·maxd (see Fig. 3). The smoothness constraint implies that "i" in w(j) implies "j" in w(i).

Fig. 2. Common overlapping length and intersection points produced by a scanline.

Now, given "i" and "h" in w(j) and "j" and "k" in w(i), where "i" matches with "j" and "h" with "k", the differential disparity |d_ij − d_hk|, to be included later in c(i, j; h, k), measures how close the disparity d_ij between edge segments "i" and "j" is to the disparity d_hk between edge segments "h" and "k". The disparity between edge segments is the average of the disparity between the two edge segments along the length they overlap. This differential disparity criterion is used in Refs. [8,27,36,55,56], among others.
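As a sketch, the differential-disparity term and a simplified neighbourhood test might look as follows; segments are represented here as {scanline: x-coordinate} dictionaries, and the window is approximated by a horizontal band of half-width maxd rather than the paper's exact parallelogram.

```python
MAXD = 8   # bound on the disparity range (pixels), the value used in this paper

def differential_disparity(d_ij, d_hk):
    """|d_ij - d_hk|: how close the average disparities of two pairs are."""
    return abs(d_ij - d_hk)

def lies_in_window(candidate, reference, maxd=MAXD, min_fraction=0.30):
    """Simplified neighbourhood test.  Each segment is a dict {scanline y: x}.
    The candidate lies in the window of the reference if at least 30% of its
    scanlines fall within a horizontal band of half-width maxd around the
    reference segment."""
    if not candidate:
        return False
    common = set(candidate) & set(reference)
    inside = sum(1 for y in common if abs(candidate[y] - reference[y]) <= maxd)
    return inside >= min_fraction * len(candidate)
```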

(c) Mapping the ordering constraint: Assuming, as for the smoothness constraint, no violation of the ordering constraint, we extend the concept defined by Ruichek and Postaire [27] for edge points to edge segments. As described in Ref. [27], if an edge point "i" in the left image is matched with an edge point "j" in the right image, then it is not possible for an edge point "h" in the left image, such that x_h < x_i, to be matched with an edge point "k" in the right image for which x_j > x_k, where x denotes the x-coordinate in a Cartesian system with its origin in the bottom left corner of each image.

We define the ordering constraint coefficient O_ijhk for the edge segments as follows:

O_ijhk = (1/N) Σ o_ijhk,  where  o_ijhk = |S(x_i − x_h) − S(x_j − x_k)|  and  S(r) = 1 if r > 0, 0 otherwise,   (5)

and the sum is taken over the N scanlines. This upper-case O_ijhk coefficient measures the relative position of edge segments "i" and "h" in the left image with respect to "j" and "k" in the right image. The lower-case o_ijhk coefficient is defined in Ref. [27] for edge points, taking the discrete values 0 and 1. O_ijhk is the ordering average measure for edge points along the common overlapping length of the four edge segments; it ranges from 0 to 1 and is included later in the compatibility coefficient c(i, j; h, k). We trace N scanlines along the common overlapping length; each scanline produces a set of four intersection points with the four edge segments. With these intersection points the lower-case o_ijhk is computed.
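A sketch of Eq. (5) as reconstructed above, with the same {scanline: x-coordinate} segment representation used in the previous sketch (an illustrative data structure, not the paper's):

```python
def step(r):
    """S(r) = 1 if r > 0, 0 otherwise."""
    return 1 if r > 0 else 0

def ordering_coefficient(seg_i, seg_j, seg_h, seg_k):
    """Eq. (5): average ordering measure O_ijhk over the scanlines shared by the
    four edge segments; (i, h) belong to the left image and (j, k) to the right."""
    common = set(seg_i) & set(seg_j) & set(seg_h) & set(seg_k)
    if not common:
        return 0.0
    o = [abs(step(seg_i[y] - seg_h[y]) - step(seg_j[y] - seg_k[y])) for y in common]
    return sum(o) / len(o)
```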

Fig. 2 clarifies the above.

(d) Problems in applying the smoothness and ordering constraints: During our experiments we have found the following problems related to these constraints, namely:

(1) Violation of such constraints in systems with parallel geometry due to the presence of objects close to the cameras and occlusions. This type of object is particularly dangerous and must be considered immediately in order to avoid, for example, undesired collisions in navigation systems.

(2) The fixation of a bound on the disparity is a hard task, because one can never be sure which is the best limit. In images with repetitive structures this problem is more evident, because an edge segment being in a window does not imply that its couple is in the other window.

As mentioned in Section 1.5, we call complex images the images in which these problems appear, and we try to solve them by introducing a specific condition in the compatibility coefficient, as we will see later. Thus, our system works properly with both complex and non-complex images.

(e) Definition of the compatibility coefficient: In our stereovision relaxation labeling model, the probabilities computed according to Eq. (3) are used to determine the various possible correspondences between edge segments. The goal of the relaxation process is to increase the consistency of a given pair of edge segments with the constraints, so that the probability of a correct match can be increased and the probability of any incorrect match can be decreased during each iteration. Therefore, we look for a compatibility coefficient, which must be able to represent the consistency between the current pair of edge segments under correspondence and the pairs of edge segments in a given neighborhood.

When no violation of the smoothness or ordering constraints is considered, the compatibility coefficient enforces global consistency between neighboring pairs of edge segments based on such constraints. This is usual in stereovision matching systems and is done by condition C_1.

When violation of the smoothness and ordering constraints is assumed, the global consistency is lost and such constraints cannot be applied; we must fall back on local consistency, as this is the only resource available. The local consistency is represented by the similarity constraint, that is, by the matching probability, but now considering the corresponding neighboring edge segments. This is done through condition C_2.

The compatibility coefficient c(i, j; h, k) is finally defined as follows:

c(i, j; h, k) = O_ijhk · d_ijhk / (1 + |d_ij − d_hk|)      if C_1,
c(i, j; h, k) = d_ijhk · (P^(n)_ij + P^(n)_hk) / 2          if C_2,
c(i, j; h, k) = 0                                           otherwise,   (6)

where d_ijhk = (overlap(i, j) + overlap(h, k))/2 is an overlap function in the interval [0,1] that measures the common average overlap rate of the edge segments under matching (see Section 2.1); the term d_ij is the average disparity, which is the average of the disparity between the two segments "i" and "j" along the common overlap length; P^(n)_ij is the matching probability of edge segments "i" and "j" at level n; and O_ijhk is the ordering coefficient.

Condition C_1: Assumes the mapping of the ordering and smoothness constraints. The assumption that the ordering constraint is not violated is simple, and only requires that O_ijhk is greater than a fixed threshold, set to 0.85 in this paper. The assumption that the smoothness constraint is not violated is based on the following criterion: if the edge segment "h" is a neighbor of the edge segment "i" (h in w(j)), its preferred match "k" is a neighbor of the edge segment "j" (k in w(i)). This is equivalent to saying that both pairs of edge segments have similar disparities; this is the minimum differential disparity criterion of Ref. [8]. Given a segment "h" in the left image (LI) and a segment "k" in the right image (RI), we define "k" as a preferred match of "h", denoted pm(h, k), if P^(n)_hk ≥ A · P^(n)_max hk' for all k' ∈ RI, where P^(n)_hk > T (with the threshold T set to 0.5 in this experiment) and P^(n)_max hk' = max_{k' ∈ RI} P^(n)_hk' is the maximum matching probability of the edge segment "h" with any edge segment "k'" in RI. The constant A is a coefficient that fixes the degree of exigency used to define the preferred match (set to 0.85 in this paper). Hence, the compatibility coefficient is maximum if the differential disparity is minimum and O_ijhk and the overlap function are maximum, and vice versa.

Condition C_2: Unlike C_1, this condition assumes the violation of the smoothness or ordering constraints (due to objects close to the cameras or occlusions) or the possibility that an edge segment "h" in w(j) has a preferred match "k" out of w(i) (repetitive structures). C_2 is only considered when C_1 is false. The loss of global consistency is replaced by local consistency in the compatibility coefficient, computed as the average of the matching probabilities, where the overlap function is also taken into account. We must point out that this condition C_2 overrides the mapping of the ordering and smoothness constraints, taking up again the similarity constraint through the matching probabilities, so that the pair of edge segments under correspondence is not penalized during the relaxation process due to violation of the smoothness and ordering constraints. The compatibility coefficient is maximum if the matching probabilities and the overlap function are maximum, and vice versa.
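Putting Eq. (6) and the preferred-match test together, a hedged sketch could read as follows; the probability table P, the boolean arguments encoding conditions C_1 and C_2, and the "1 +" in the denominator (our reading of the printed formula) are assumptions of the sketch.

```python
A, T = 0.85, 0.5          # exigency coefficient and probability threshold (paper values)
ORDER_THRESHOLD = 0.85    # threshold on O_ijhk for the ordering constraint

def preferred_match(P, h, right_labels, k):
    """pm(h, k): k is a preferred match of h if P[h, k] >= A * max_k' P[h, k']
    and P[h, k] > T, with k' ranging over the right-image segments."""
    p_max = max(P[(h, kp)] for kp in right_labels)
    return P[(h, k)] > T and P[(h, k)] >= A * p_max

def compatibility(P, i, j, h, k, O_ijhk, overlap_ij, overlap_hk, d_ij, d_hk, c1, c2):
    """Eq. (6).  c1 and c2 are the truth values of conditions C1 and C2
    (C2 is only evaluated when C1 is false)."""
    d_bar = 0.5 * (overlap_ij + overlap_hk)          # common average overlap rate
    if c1:                                           # smoothness and ordering hold
        return O_ijhk * d_bar / (1.0 + abs(d_ij - d_hk))
    if c2:                                           # constraints violated: local consistency
        return d_bar * 0.5 * (P[(i, j)] + P[(h, k)])
    return 0.0
```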

(f) A pedagogical example: The following is a simple example to clarify the above concepts related to conditions C_1 and C_2, where we consider the evaluation of a pair of edge segments (i, j) (Fig. 3):

(1) The preferred match of "h" is "k", where h ∈ w(j) and k ∈ w(i); ordering and smoothness are not violated.

(2) The preferred match of "h" is "l", hence the ordering constraint is preserved but the smoothness constraint is violated because l ∉ w(i).

(3) The preferred match of "h" is "m", hence the ordering constraint is violated but the smoothness constraint is preserved because m ∈ w(i).

(4) The preferred match of "h" is "n", in which case both the ordering and smoothness constraints are violated.

In the first case condition C_1 reinforces the match (i, j), as expected, but in the remaining cases the reinforcement is assumed by condition C_2. This is the appropriate process. Without condition C_2 the match (i, j) would be penalized, and this is unwanted as the edge segments in the right image are all preferred matches of "h". The dotted lines in the right image represent the positions of "i" and "h" when the left image is superimposed on the right one.

Fig. 3. Evaluation of the pair (i, j) considering its neighborhood.

3. Summary of the matching procedure

After mapping the similarity, ordering and smoothness stereovision matching constraints, the correspondence process is achieved by allowing the system to evolve so that it reaches a stable state, i.e. when no changes occur in the matching probabilities during the updating procedure.

The whole matching procedure can be summarized as follows:

(1) The system is organized with a set of pairs of edge segments (from the left and right images) with a difference in the direction of the gradient of less than ±45° and considering that they overlap (initial conditions in Section 2.1). Each pair of edge segments is characterized by its initial matching probability.

(2) The initial matching probabilities are computed according to Eq. (1).

(3) The variable npair, which represents the number of pairs of edge segments for which the matching probabilities are modified by the updating procedure, is set to 0.

(4) The pairs of edge segments are sequentially selected and the relaxation procedure is applied to each one.

(5) At the nth iteration, the matching probability is computed according to Eq. (3).

(6) If |P^(n+1)(θ_i = ω_θi) − P^(n)(θ_i = ω_θi)| > ε then npair is increased. This is the criterion used in Ref. [1], so that for each object a_i there is one label ω_θi for which the matching probability at a given level is within some small distance ε from unity. The constant value ε is introduced to accelerate the convergence and is set to 0.01 in this experiment.

(7) If npair is not equal to 0 then go to step (3).

(8) Matching decisions: A left edge segment can be assigned to a unique right edge segment (an unambiguous pair) or to several right edge segments (ambiguous pairs). The decision about whether a match is correct or not is made by choosing the greatest probability value (in the unambiguous case there is only one) whenever it surpasses a previously fixed probability threshold T, set to 0.5. Ambiguities under the criterion given in Section 2.1 are allowed, as they can be produced by broken edge segments. This step represents the mapping of the uniqueness constraint, which completes the set of matching constraints in stereovision matching.
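A high-level sketch of steps (3)-(8), reusing the relaxation_step() function sketched in Section 2.3 (eps = 0.01 and T = 0.5 as in the text; the data structures are the same illustrative ones as before):

```python
EPS, T = 0.01, 0.5

def match(pairs, P, compat, labels, neighbours, max_iterations=32):
    """Iterate the relaxation update until no matching probability changes by
    more than EPS, then keep the pairs that pass the threshold T."""
    for _ in range(max_iterations):
        P_new = relaxation_step(P, compat, labels, neighbours)
        npair = sum(1 for pair in pairs if abs(P_new[pair] - P[pair]) > EPS)
        P = P_new
        if npair == 0:
            break
    # Step (8): for each left segment keep the most probable right segment whose
    # probability exceeds T (ambiguities from broken segments would also be allowed).
    decisions = {}
    for i in labels:
        best = max(labels[i], key=lambda j: P[(i, j)], default=None)
        if best is not None and P[(i, best)] > T:
            decisions[i] = best
    return decisions, P
```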

4. Validation, comparative analysis and performance evaluation

4.1. Design of a test strategy

In order to assess the validity and performance of the proposed method (i.e. how well our approach works in images with and without additional complexity), we have selected 48 stereo pairs of realistic stereo images from an indoor environment. Each pair consists of two left and right original images and two left and right images of labeled edge segments. All tested images are 512×512 pixels in size, with 256 gray levels. The two cameras have equal focal lengths and are aligned so that their viewing direction is parallel. The plane formed by the viewed point and the two focal points intersects the image plane along a horizontal line known as the epipolar line. A grid pattern of parallel horizontal lines is used for camera alignment. The cameras are adjusted so that the images of these lines coincide [71]. Thus, each point in an edge segment lies along the same horizontal or epipolar line crossing both images of the stereo pair. The 48 stereo pairs are classified into three groups: SP1, SP2 and SP3, with 18, 18 and 12 pairs of stereo images, respectively. For space reasons a unique representative stereo pair is shown for each SP set.

Fig. 4. Group SP1: (a) original left stereo image, (b) original right stereo image, (c) labeled segments left image, (d) labeled segments right image.

Fig. 5. Group SP2: (a) original left stereo image, (b) original right stereo image, (c) labeled segments left image, (d) labeled segments right image.

Fig. 6. Group SP3: (a) original left stereo image, (b) original right stereo image, (c) labeled segments left image, (d) labeled segments right image.

Group SP1 consists of stereo images without apparent complexity. Figs. 4a-d (original images and labeled edge segments, as stated) show a stereo pair representative of SP1. Group SP2 corresponds to scenes where a repetitive structure has been captured; Figs. 5a-d show a stereo pair representative of SP2, where in this case the repetitive structure is provided by the vertical books in the foreground of the scene. Finally, group SP3 contains objects close to the cameras that produce a high range of disparity, violating the smoothness and ordering constraints. Figs. 6a-d show a stereo pair representative of SP3, where we can see the object labeled as 9, 10 in the left image and 11, 12 in the right image as a characteristic example of an object close to the cameras occluding the edge segment 19 in the right image. Although this last type of image is unusual, its treatment is very important, as such images could produce fatal errors, for example in navigation systems, where the nearest objects must be processed immediately.

As mentioned in the introduction, we base our method on the relaxation work of Christmas et al. [1], in which a stereovision matching problem is solved. This algorithm uses straight-line segments as features for the matching, a unique unary measurement for each line segment (orientation of the scene relative to the model) and four binary relations. These binary relations are: (a) the angle between line segments a_i and a_j; (b) the angle between segment a_i and the line joining the center-points of segments a_i and a_j; (c) the minimum distance between the endpoints of line segments a_i and a_j; and (d) the distance between the midpoints of line segments a_i and a_j.

It is a structural matching method based on geometrical relations. These are the fundamental and main differences with regard to our approach. Indeed, we use four unary measurements (module and direction of the gradient, Laplacian and variance) to map the similarity constraint and two binary relations to map the smoothness and ordering constraints. Only the direction of the gradient in our approach could be identified with the unary measurement in Ref. [1]. These reasons prevent us from performing an appropriate comparative analysis between our approach and the structural method in Ref. [1].

Moreover, the geometrical relationships would only be applicable with great difficulty to our complex images. According to Section 1.4, there are different relaxation approaches: probabilistic, merit and optimization. This paper develops a probabilistic relaxation (PR) method. To compare our PR technique, we choose a merit relaxation method and two optimization relaxation methods.

4.1.1. The merit relaxation approach (MR)

The minimum differential disparity algorithm of Medioni and Nevatia [8] is the merit relaxation method chosen for comparison purposes, for the following reasons: (1) it applies the commonly used constraints: similarity, smoothness and uniqueness; (2) it uses edge segments as features and the contrast and orientation of the features as attributes; (3) it is implied in the derivation of PR; and (4) Medioni and Nevatia conclude that several improvements to the matching method may be possible. A brief description of the MR algorithm is given below.

MR uses straight-line segments as features because its authors believe that feature-based stereo systems have strong advantages over area-based correlation systems. The method defines two windows around every segment in the left and right images and assumes a bound on the allowed disparity range. We have applied these concepts in our algorithm and they are exhaustively explained in Section 2.4.2.

The MR local matching process defines a boolean function indicating whether two segments are potential matches; such a function is true iff the segments overlap and have similar contrast and orientation (similarity constraint). The overlap concept in PR is also taken from this method, and the contrast and orientation could be identified with our module and direction of the gradient, although we believe that the gradient provides better performance. To each pair of segments, MR associates an average disparity which is the average of the disparity between the two involved segments along the length of their overlap. This is also applied in our method.

Thus, a global relaxation matching process is triggered, where a merit function is updated iteratively and changes if the segments in the neighborhood are assigned new preferred matches. The preferred matches are selected according to the boolean function and the neighborhood concept, the latter defined on the basis of the disparity bound. MR applies the smoothness constraint through the minimum differential disparity criterion. We have also applied this criterion in PR.

The fundamental differences between MR and PR can be summarized as follows: (1) PR uses quantitative probability values (matching probabilities) and MR uses a qualitative boolean function; (2) PR introduces specific conditions to overcome the possible violation of the smoothness constraint and to avoid the problem of the fixation of the disparity limit, which is missing in MR.

4.1.2. Two optimization relaxation approaches (OR1 and OR2)

Two appropriate techniques against which to compare our PR are the approaches based on the Hopfield neural network of Ref. [36] (OR1) and Ref. [27] (OR2). These techniques take into account other previous works in this area [12,14,24,55,56]. They are chosen for the following reasons: (1) in OR1 the similarity and smoothness constraints are mapped as in PR, and moreover both use the same set of attributes and features; (2) the compatibility coefficient in OR1 only takes into account the violation of the smoothness constraint, whereas in PR the violation of both the smoothness and ordering constraints is considered; and (3) although OR2 uses edge pixels as features, it is considered as a guideline to map the similarity, smoothness and ordering constraints for edge segments; this is done as in PR but removing the specific conditions C_1 and C_2, that is, with respect to C_1 the preferred match "k" of "h" fulfills P^(n)_hk = P^(n)_max hk' for all k' ∈ RI, and with respect to C_2 the violation of the smoothness and ordering constraints is not considered.

The comparison between PR and OR1 allows us to check the effectiveness of the ordering constraint, because they use a similar compatibility coefficient, and the comparison between PR and OR2 allows us to prove the effectiveness of the compatibility coefficient itself, more specifically when the smoothness and ordering constraints are violated.

4.2. Experimental results

In this study, the proposed PR method for solving the stereovision matching problem was executed on a VAX 8810 workstation. During the test, 48 relaxation processes were built and executed, one for each stereo image pair. The number of pairs of edge segments in the different relaxation processes is variable and ranges from 36 to 104. This variability depends on the number of edge segments extracted and the initial conditions established in Section 2.1.

Eq. (3) describes the evolution of the matching probabilities for our stereo global matching approach. Fig. 7 shows the number of pairs (npair) of edge segments for which the matching probabilities are modified by the updating procedure as a function of the iteration number. This number is averaged over the 48 executed relaxation processes. It is computed according to point (6) in Section 3.

Fig. 7. Number of pairs (npair) of edge segments for which the matching probabilities are modified by the updating procedure as a function of the iteration number, averaged for all relaxation processes.

Table 1
Percentage of successes for the group of stereo-pairs SP1 as a function of the iteration number

SP1    0     8     16    24    32
PR     79.1  86.1  91.2  94.3  97.3
OR1    79.1  84.3  88.7  92.9  96.2
OR2    79.1  80.3  83.5  87.6  89.9
MR     79.1  79.8  82.6  85.2  86.7

Table 2
Percentage of successes for the group of stereo-pairs SP2 as a function of the iteration number

SP2    0     8     16    24    32
PR     61.3  75.6  84.8  90.1  96.7
OR1    61.3  72.7  81.9  88.9  95.3
OR2    61.3  67.8  70.4  76.9  80.3
MR     61.3  66.5  69.8  73.7  77.2

The number of pairs that change their matching probabilities drops in a relatively smooth fashion, with the maximum number of pairs that change between one iteration and the next being 9. This means that the number of correct matches grows slowly during the global matching process, and this is due to the following two reasons: (a) initially (iteration 0) there is a relatively high percentage of true matches (as we will see later in Table 1); this is a direct consequence of the local stereovision matching process used, and is the same idea as that used in Christmas et al. [1], where the initial probabilities are computed based on the unary measurements; and (b) the coefficient ε is introduced in Section 3, point (6), so that the criterion to determine whether npair changes is relaxed. Finally, we can see that from iteration number 15 npair varies slowly, and it can be assumed that all analyzed stereo images reach an acceptably high degree of performance at iteration 15.

We can point out that, due to the variability in the number of pairs of edge segments to be matched for each stereo image and to the different complexity of the stereo images, the time for convergence is also variable. We have verified in all tested stereo images that the average time for convergence is about 7 min. This time is measured from iterations 0 to 32, the start and end of the global relaxation process, which is the objective of this paper.

4.3. Comparative analysis

As we already know, the stereo-matching process can be split into local and global processes. This paper develops a global, probabilistic relaxation process, which is what interests us now, since we have already studied the local process exhaustively in previous works [4,31,32]. Therefore, the comparative analysis we perform is concerned only with the relaxation process.

The system processes the SP1, SP2 and SP3 groups of stereo images. Of all the possible combinations of pairs of matches formed by segments of the left and right images, only 468, 359 and 294 pairs are considered for SP1, SP2 and SP3, respectively, because the remaining pairs do not meet the initial conditions.

For each pair of features we compute the local matching probability following the learning procedure described in Section 2.2. This is the starting point for all relaxation processes. For PR, OR1 and OR2 the local matching probabilities are the initial matching probabilities. MR requires the definition of a boolean function indicating whether two segments are potential matches. We have chosen the criterion that if the local matching probability is greater than 0.5, the midpoint of the interval [0,1], the pair of segments is a potential match and the boolean function is true.
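A minimal sketch of the boolean function required by MR follows (illustrative only; local_prob is a hypothetical table holding the local matching probabilities of Section 2.2, indexed by (left segment, right segment) pairs).

def is_potential_match(local_prob, l, r, threshold=0.5):
    # MR boolean function: a pair (l, r) is a potential match when its
    # local matching probability exceeds the midpoint of [0, 1].
    return local_prob[(l, r)] > threshold

local_prob = {("l1", "r1"): 0.63, ("l1", "r2"): 0.18}
print(is_potential_match(local_prob, "l1", "r1"))   # True
print(is_potential_match(local_prob, "l1", "r2"))   # False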

The computed results are summarized in Tables 1-3. They show the percentage of successes for each group of stereo-pairs (SP1, SP2 and SP3) and for each method (PR, OR1, OR2 and MR) as a function of the number of iterations. Iteration 0 corresponds to the results for the local matching probabilities, the starting point for the relaxation process. As shown in Fig. 7, 32 iterations suffice to halt the PR process; this is the number of iterations used for the remaining methods. We have performed experiments with window-width values (maxd, see Section 2.4b) in the range 3-15 pixels; the best results were obtained when maxd was set to 8 pixels.


Table 3. Percentage of successes for the group of stereo-pairs SP3

Method / Iteration      0       8      16      24      32
PR                   62.8    78.3    87.9    91.4    95.7
OR1                  62.8    76.5    86.8    90.7    95.1
OR2                  62.8    64.6    67.1    69.3    70.5
MR                   62.8    62.1    61.6    60.7    59.8

4.3.1. Decision process

When the relaxation processes have finished, there are still both unambiguous and ambiguous pairs of segments, depending on whether one and only one, or several, right image segments correspond to a given left image segment. In either case, the decision about whether a match is correct is made by choosing the result with the greatest probability for PR, OR1 and OR2 and the one with the best merit value for MR; in the unambiguous case there is only one candidate. For PR, OR1 and OR2 it is also necessary that this probability surpasses the previously fixed threshold probability (T = 0.5).

It is allowed that an edge segment l in the left image matches both r1 and r2 in the right image if r1 and r2 do not overlap, their directions differ by less than a fixed threshold (set to 10 in this paper) and the matching probabilities between l and r1 and between l and r2 are both greater than T.
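The decision stage just described can be summarized by the following sketch (illustrative only, not the authors' code). The helpers overlaps and direction, as well as the segment labels, are hypothetical; T = 0.5 and the direction threshold of 10 are taken from the text.

def decide_matches(P, l, overlaps, direction, T=0.5, max_dir_diff=10.0):
    # Rank the right-image candidates of left segment l by their final
    # matching probability and accept the best one if it exceeds T.
    ranked = sorted(P[l].items(), key=lambda kv: kv[1], reverse=True)
    accepted = []
    if ranked and ranked[0][1] > T:
        r1 = ranked[0][0]
        accepted.append(r1)
        # Relaxed uniqueness: a second right segment is also accepted when
        # it does not overlap r1, their directions differ by less than the
        # fixed threshold and its probability also exceeds T.
        for r2, p2 in ranked[1:]:
            if (p2 > T and not overlaps(r1, r2)
                    and abs(direction[r1] - direction[r2]) < max_dir_diff):
                accepted.append(r2)
                break
    return accepted

# Hypothetical data: l1 has two non-overlapping candidates above T
P = {"l1": {"r1": 0.82, "r2": 0.61, "r3": 0.20}}
direction = {"r1": 45.0, "r2": 48.0, "r3": 90.0}
overlaps = lambda a, b: False          # assume no candidates overlap here
print(decide_matches(P, "l1", overlaps, direction))   # ['r1', 'r2']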

According to the values in Tables 1-3, the following conclusions may be inferred:

(1) Group SP1: All methods progress in a similar fashion in the number of successes. The best results are obtained with PR and OR1. In both cases this is due to the definition of C1 in the compatibility coefficient.

(2) Group SP2: The best results are once again obtained with PR and OR1, where the differences with respect to OR2 and particularly with respect to MR are significant. The explanation can be found in the presence of repetitive structures in the images. These structures produce false matches, which are taken as true by MR and OR2 and as false by PR and OR1; hence PR and OR1 work as expected. The reason is the definition of condition C2 applied to the compatibility coefficient.

(3) Group SP3: The number of successes for PR and OR1 evolves as expected, whereas for OR2 it remains practically stable and for MR it decreases as the number of iterations increases. This last case is a surprising result. It is a direct consequence of images with some objects close to the cameras violating the smoothness and ordering constraints (in Fig. 3a-d, see the object bounded by labels 9 and 10 in the left image and 11 and 12 in the right image). MR increases the merit of false matches, such as (3, 3) or (2, 2), whereas PR and OR1 handle this phenomenon correctly through condition C2.

In general, the best results are always obtained with PR. The small difference with respect to OR1 is due to the convergence criterion used in the Hopfield neural network, which requires a greater number of iterations (about 22) than PR to obtain a similar percentage of successes. We have verified that the ordering constraint is not relevant; this is because the impact of such a constraint is subsumed by the smoothness constraint. We think that this is the reason why the ordering constraint is not considered in classical stereovision matching methods [8,14,18,20,23,55], among others. Nevertheless, it is useful in structural or geometric matching approaches [1,30,37].

(4) The difference in the results between PR and OR2 tells us that the definition of the compatibility coefficient, with its two conditions, is appropriate for dealing with the violation of the smoothness and ordering constraints and also for avoiding the problem caused by the fixing of the disparity limit.

(5) Additionally, we can point out that the global relaxation matching process substantially improves the local matching results (compare the results at iterations 0 and 32). Hence the relaxation process becomes particularly necessary for complex stereo images.

5. Concluding remarks

An approach to stereo correspondence using a probabilistic relaxation method is presented. The stereo correspondence problem is formulated as a relaxation task where a matching probability is updated through the mapping of four constraints (similarity, smoothness, ordering and uniqueness). The similarity constraint is doubly applied: (a) during the computation of the local matching probability (see Section 2.2), and (b) during the mapping of the corresponding constraint (see Section 2.4a). We have also introduced, through the compatibility coefficient of Eq. (6), specific conditions to overcome the violation of the smoothness and ordering constraints and to avoid the serious problem caused by the setting of the disparity limit. The advantage of using a relaxation process is that a global match is automatically achieved, taking into account each stereo pair of edge segments to be matched: (a) in an isolated fashion (unary measurements), and (b) as a member of a neighborhood (binary measurements).

A comparative analysis is performed against the other methods OR1, OR2 and MR. We have shown that the proposed method PR gives a better performance. We first started our experiments with normal images and then used more complex images with repetitive structures, objects violating the smoothness and ordering constraints, and occlusions, always obtaining satisfactory results. We have also shown the necessity of using a global relaxation strategy.


We hope that the above results show the potential of PR. However, our method does not yet provide full performance, as it does not reach the 100% success level, and several improvements to the matching method may be possible. We are currently investigating some of these: (1) hierarchical processing, starting at the region level, and (2) using corners as complementary features [54].

References

[1] W.J. Christmas, J. Kittler, M. Petrou, Structural matching in computer vision using probabilistic relaxation, IEEE Trans. Pattern Anal. Mach. Intell. 17 (8) (1995) 749-764.
[2] A.R. Dhond, J.K. Aggarwal, Structure from stereo - a review, IEEE Trans. Systems Man Cybernet. 19 (1989) 1489-1510.
[3] T. Ozanian, Approaches for stereo matching - a review, Modeling Identification Control 16 (2) (1995) 65-94.
[4] G. Pajares, Estrategia de solución al problema de la correspondencia en visión estereoscópica por la jerarquía metodológica y la integración de criterios, Ph.D. Thesis, Facultad de C.C. Físicas, U.N.E.D., Department of Informática y Automática, Madrid, 1995.
[5] S. Barnard, M. Fischler, Computational stereo, ACM Comput. Surveys 14 (1982) 553-572.
[6] K.S. Fu, R.C. González, C.S.G. Lee, Robótica: Control, Detección, Visión e Inteligencia, McGraw-Hill, Madrid, 1988.
[7] D.H. Kim, R.H. Park, Analysis of quantization error in line-based stereo matching, Pattern Recognition 8 (1994) 913-924.
[8] G. Medioni, R. Nevatia, Segment based stereo matching, Comput. Vision Graphics Image Process. 31 (1985) 2-18.
[9] H.H. Baker, Building and Using Scene Representations in Image Understanding, AGARD-LS-185 Machine Perception, 1982, 3.1-3.11.
[10] I.J. Cox, S.L. Hingorani, S.B. Rao, B.M. Maggs, A maximum likelihood stereo algorithm, Comput. Vision Image Understanding 63 (2) (1996) 542-567.
[11] P. Fua, A parallel algorithm that produces dense depth maps and preserves image features, Mach. Vision Appl. 6 (1993) 35-49.
[12] J.J. Lee, J.C. Shim, Y.H. Ha, Stereo correspondence using the Hopfield neural network of a new energy function, Pattern Recognition 27 (11) (1994) 1513-1522.
[13] Y. Shirai, Three-Dimensional Computer Vision, Springer, Berlin, 1983.
[14] Y. Zhou, R. Chellappa, Artificial Neural Networks for Computer Vision, Springer, New York, 1992.
[15] W.E.L. Grimson, Computational experiments with a feature-based stereo algorithm, IEEE Trans. Pattern Anal. Mach. Intell. 7 (1985) 17-34.
[16] A. Khotanzad, A. Bokil, Y.W. Lee, Stereopsis by constraint learning feed-forward neural networks, IEEE Trans. Neural Networks 4 (1993) 332-342.
[17] Y.C. Kim, J.K. Aggarwal, Positioning three-dimensional objects using stereo images, IEEE J. Robot. Automat. 3 (1987) 361-373.
[18] V.R. Lutsiv, T.A. Novikova, On the use of a neurocomputer for stereoimage processing, Pattern Recognition Image Anal. 2 (1992) 441-444.
[19] D. Maravall, E. Fernandez, Contribution to the matching problem in stereovision, Proceedings of the 11th IAPR International Conference on Pattern Recognition, The Hague, 1992, pp. 411-414.
[20] D. Marr, La Visión, Alianza Editorial, Madrid, 1985.
[21] D. Marr, Vision, Freeman, San Francisco, 1982.
[22] D. Marr, T. Poggio, A computational theory of human stereovision, Proc. Roy. Soc. London B 207 (1979) 301-328.
[23] D. Marr, T. Poggio, Cooperative computation of stereo disparity, Science 194 (1976) 283-287.
[24] M.S. Mousavi, R.J. Schalkoff, ANN implementation of stereo vision using a multi-layer feedback architecture, IEEE Trans. Systems Man Cybernet. 24 (8) (1994) 1220-1238.
[25] P. Rubio, Análisis comparativo de métodos de correspondencia estereoscópica, Ph.D. Thesis, Facultad de Psicología, Universidad Complutense, Madrid, 1991.
[26] P. Rubio, RP: un algoritmo eficiente para la búsqueda de correspondencias en visión estereoscópica, Informática y Automática 26 (1993) 5-15.
[27] Y. Ruycheck, J.G. Postaire, A neural network algorithm for 3-D reconstruction from stereo pairs of linear images, Pattern Recognition Lett. 17 (1996) 387-398.
[28] N. Ayache, B. Faverjon, Efficient registration of stereo images by matching graph descriptions of edge segments, Int. J. Comput. Vision 1 (1987) 107-131.
[29] H.H. Baker, T.O. Binford, Depth from edge and intensity based stereo, in: Proceedings of the 7th International Joint Conference on Artificial Intelligence, Vancouver, Canada, 1981, pp. 631-636.
[30] K.L. Boyer, A.C. Kak, Structural stereopsis for 3-D vision, IEEE Trans. Pattern Anal. Mach. Intell. 10 (2) (1988) 144-166.
[31] J.M. Cruz, G. Pajares, J. Aranda, A neural network approach to the stereovision correspondence problem by unsupervised learning, Neural Networks 8 (5) (1995) 805-813.
[32] J.M. Cruz, G. Pajares, J. Aranda, J.L.F. Vindel, Stereo matching technique based on the perceptron criterion function, Pattern Recognition Lett. 16 (1995) 933-944.
[33] W. Hoff, N. Ahuja, Surface from stereo: integrating feature matching, disparity estimation, and contour detection, IEEE Trans. Pattern Anal. Mach. Intell. 11 (1989) 121-136.
[34] D.H. Kim, W.Y. Choi, R.H. Park, Stereo matching technique based on the theory of possibility, Pattern Recognition Lett. 13 (1992) 735-744.
[35] Y. Ohta, T. Kanade, Stereo by intra- and inter-scanline search using dynamic programming, IEEE Trans. Pattern Anal. Mach. Intell. 7 (2) (1985) 139-154.
[36] G. Pajares, J.M. Cruz, J. Aranda, Relaxation by Hopfield network in stereo image matching, Pattern Recognition 31 (5) (1998) 561-574.
[37] L.G. Shapiro, R.M. Haralick, Structural descriptions and inexact matching, IEEE Trans. Pattern Anal. Mach. Intell. 3 (5) (1981) 504-519.


[38] M.S. Wu, J.J. Leou, A bipartite matching approach to feature correspondence in stereo vision, Pattern Recognition Lett. 16 (1995) 23-31.
[39] D.M. Wuescher, K.L. Boyer, Robust contour decomposition using a constraint curvature criterion, IEEE Trans. Pattern Anal. Mach. Intell. 13 (1) (1991) 41-51.
[40] R.M. Haralick, L.G. Shapiro, Computer and Robot Vision, Vols. I and II, Addison-Wesley, Reading, MA, 1992, 1993.
[41] M. Sonka, V. Hlavac, R. Boyle, Image Processing, Analysis and Machine Vision, Chapman & Hall, London, 1995.
[42] A. Rosenfeld, R. Hummel, S. Zucker, Scene labelling by relaxation operations, IEEE Trans. Systems Man Cybernet. 6 (1976) 420-453.
[43] S.T. Barnard, W.B. Thompson, Disparity analysis of images, IEEE Trans. Pattern Anal. Mach. Intell. 13 (1982) 333-340.
[44] R. Hummel, S. Zucker, On the foundations of relaxation labeling processes, IEEE Trans. Pattern Anal. Mach. Intell. 5 (1983) 267-287.
[45] A. Laine, G. Roman, A parallel algorithm for incremental stereo matching on SIMD machines, IEEE Trans. Robot. Automat. 7 (1991) 123-134.
[46] S. Lloyd, E. Haddow, J. Boyce, A parallel binocular stereo algorithm utilizing dynamic programming and relaxation labelling, Comput. Vision Graphics Image Process. 39 (1987) 202-225.
[47] N.M. Nasrabadi, A stereo vision technique using curve-segments and relaxation matching, IEEE Trans. Pattern Anal. Mach. Intell. 14 (5) (1992) 566-572.
[48] S. Peleg, A new probabilistic relaxation scheme, IEEE Trans. Pattern Anal. Mach. Intell. 2 (1980) 362-369.
[49] K. Prazdny, Detection of binocular disparities, Biol. Cybernet. 52 (1985) 93-99.
[50] K.E. Price, Relaxation matching techniques - a comparison, IEEE Trans. Pattern Anal. Mach. Intell. 7 (5) (1985) 617-623.
[51] S. Ranade, A. Rosenfeld, Point pattern matching by relaxation, Pattern Recognition 12 (1980) 269-275.
[52] D. Sherman, S. Peleg, Stereo by incremental matching of contours, IEEE Trans. Pattern Anal. Mach. Intell. 12 (1990) 1102-1106.
[53] C. Stewart, C. Dyer, Simulation of a connectionist stereo algorithm on a memory-shared multiprocessor, in: V. Kumar (Ed.), Parallel Algorithms for Machine Intelligence and Vision, Springer, Berlin, 1990, pp. 341-359.
[54] C.Y. Wang, H. Sun, S. Yada, A. Rosenfeld, Some experiments in relaxation image matching using corner features, Pattern Recognition 16 (2) (1983) 167-182.
[55] N.M. Nasrabadi, C.Y. Choo, Hopfield network for stereovision correspondence, IEEE Trans. Neural Networks 3 (1992) 123-135.
[56] Y.-H. Tseng, J.-J. Tzen, K.-P. Tang, S.-H. Lin, Image-to-image registration by matching area features using Fourier descriptors and neural networks, Photogrammetric Eng. Remote Sensing 63 (8) (1997) 975-983.
[57] U.R. Dhond, J.K. Aggarwal, Stereo matching in the presence of narrow occluding objects using dynamic disparity search, IEEE Trans. Pattern Anal. Mach. Intell. 17 (7) (1995) 719-724.
[58] T. Pavlidis, Why progress in machine vision is so slow, Pattern Recognition Lett. 13 (1992) 221-225.
[59] A. Huertas, G. Medioni, Detection of intensity changes with subpixel accuracy using Laplacian-Gaussian masks, IEEE Trans. Pattern Anal. Mach. Intell. 8 (5) (1986) 651-664.
[60] D. Marr, E. Hildreth, Theory of edge detection, Proc. Roy. Soc. London B 207 (1980) 187-217.
[61] J.G. Leu, H.L. Yau, Detecting the dislocations in metal crystals from microscopic images, Pattern Recognition 24 (1991) 41-56.
[62] M.S. Lew, T.S. Huang, K. Wong, Learning and feature selection in stereo matching, IEEE Trans. Pattern Anal. Mach. Intell. 16 (9) (1994) 869-881.
[63] E.P. Krotkov, Active Computer Vision by Cooperative Focus and Stereo, Springer, New York, 1989.
[64] S. Tanaka, A.C. Kak, A rule-based approach to binocular stereopsis, in: R.C. Jain, A.K. Jain (Eds.), Analysis and Interpretation of Range Images, Springer, Berlin, 1990, pp. 33-139.
[65] R. Nevatia, K.R. Babu, Linear feature extraction and description, Comput. Vision Graphics Image Process. 13 (1980) 257-269.
[66] R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[67] S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Company, New York, 1994.
[68] B. Kosko, Neural Networks and Fuzzy Systems, Prentice-Hall, Englewood Cliffs, NJ, 1992.
[69] A.M.N. Fu, H. Yan, A new probabilistic relaxation method based on probability space partition, Pattern Recognition 30 (11) (1997) 1905-1917.
[70] S.Z. Li, Matching: invariant to translations, rotations and scale changes, Pattern Recognition 25 (1992) 583-594.
[71] J. Majumdar, Seethalakshmy, Efficient parallel processing for depth calculation using stereo, Robot. Autonomous Systems 20 (1997) 1-13.

About the Author: G. PAJARES received the B.S. and Ph.D. degrees in Physics from U.N.E.D. (Distance University of Spain) in 1987 and 1995, respectively, with a thesis on the application of pattern recognition techniques to stereovision. Since 1990 he has worked at ENOSA in critical software development. He joined the Complutense University in 1995 as an Associate Professor in Robotics. His current research interests include robotics vision systems and applications of automatic control to robotics.

About the Author: J.M. CRUZ received an M.Sc. degree in Physics and a Ph.D. from the Complutense University in 1979 and 1984, respectively. From 1985 to 1990 he was with the Department of Automatic Control, UNED (Distance University of Spain), and from October 1990 to 1992 with the Department of Electronics, University of Santander. In October 1992 he joined the Department of Computer Science and Automatic Control of the Complutense University, where he is a professor. His current research interests include robotics vision systems, sensor fusion and applications of automatic control to robotics and flight control.


About the Author: J.A. LOPEZ-OROZCO received an M.Sc. degree in Physics from the Complutense University in 1991. From 1992 to 1994 he held a Fellowship to work on Inertial Navigation Systems. Since 1995 he has been an assistant researcher in robotics at the Department of Computer Architecture and Automatic Control, where he is working towards his Ph.D. His current interests are robotics vision systems and multisensor data fusion.
