
IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, VOL. 9, NO. 6, DECEMBER 1993

Significant Line Segments for an Indoor Mobile Robot

Xavier Lebègue, Student Member, IEEE, and J. K. Aggarwal, Fellow, IEEE

Abstract-New algorithms for detecting and interpreting linear features of a real scene as imaged by a single camera on a mobile robot are described. The low-level processing stages are specifically designed to increase the usefulness and the quality of the extracted features for indoor scene understanding. In order to derive 3-D information from a 2-D image, we consider only lines with particular orientations in 3-D. The detection and interpretation processes provide a 3-D orientation hypothesis for each 2-D segment. This in turn is used to estimate the robot's orientation and relative position in the environment. Next, the orientation data is used by a motion stereo algorithm to fully estimate the 3-D structure when a sequence of images becomes available. From detection to 3-D estimation, a strong emphasis is placed on real-world applications and very fast processing with conventional hardware. Results of experimentation with a mobile robot under realistic conditions are given and discussed.

I. INTRODUCTION

THIS paper presents a new approach for extracting important line segments from monocular images, for estimating a mobile robot's orientation and relative position, and for estimating the 3-D position of important segments in the robot's environment. The common idea is to design all stages of image interpretation, including the lowest image processing level, such that they provide the higher stages with the most semantically useful features.

The tasks to be accomplished by the robot are usually specified in high-level semantic terms, such as "go down the hallway and go through the last door on the left." In order to execute this task, the robot must be able to identify the objects of interest (here, hallways and doors) in its perception of the environment. One approach is to reconstruct a 3-D line segment description of the environment of the robot from several intensity images (see for example [1]-[4]), and then to match groups of those segments to selected object models. For this, several 3-D segments are selected according to orientation and position criteria [5]. A 3-D hypothesis of an object is generated and matched to the selected segments [6], [7]. The process is repeated for other segments and other hypotheses.

Manuscript received June 22, 1992; revised March 23, 1993. This work was supported in part by the Department of Defense Joint Services Electronics Program through the Air Force Office of Scientific Research (AFSC) under Contract F49620-89-C-0044, and by the Army Research Office under Contract DAAL03-91-G-0050.

The authors are with the Computer and Vision Research Center, Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712.

IEEE Log Number 9212599.

The high-level semantic interpretation is then used for determining the free space or finding objects of interest [8].

Since high-level interpreters are looking for line segments of particular orientations in 3-D, we designed our lower level image-processing stages to take advantage of this information. We will show practical examples of how this top-down information can benefit the feature extraction stage by reducing the amount of unwanted features, increasing the sensitivity to good features, and drastically speeding the computation. Preliminary results of this research were presented in [9]-[13].

Section II of this paper discusses the a priori knowledge of a few prominent 3-D orientations for lines in the scene. In the rest of the paper, we will only consider line segments that have one of these desired orientations in 3-D. No other undefined 3-D orientation will be detected or processed. Section III describes a very fast algorithm to extract useful line segments from a single image. The algorithm simultaneously hypothesizes the 3-D orientation of segments. Section IV shows how the robot's orientation and relative position in the scene is determined. Finally, Section V extends these results to a sequence of monocular images. A Kalman filter is used for a recursive estimation of the position of 3-D segments. Unlike purely metrical approaches for the recursive estimation, the qualitative information derived independently for each image is used to enhance the quality and usefulness of the results while decreasing the computation time. Experimental results with an indoor mobile robot are provided and analyzed.

II. MOTIVATION AND ASSUMPTIONS

A. Knowledge of Prominent 3-D Orientations

Estimating the 3-D structure of a scene from a single visual image is impossible without certain assumptions about the scene structure and the projection geometry. We chose to concentrate on objects that have parallel lines with known 3-D orientations in a world coordinate system. For example, in indoor scenes, rooms and hallways usually have a rectangular structure, and there are three prominent orientations for 3-D line segments: one vertical and two horizontal orientations perpendicular to each other. Outdoor urban scenes also display similar characteristics. In this paper, any 3-D orientation is permitted, as long as it is given to the algorithms. Therefore, more complex environments, such as polygonal buildings with angles other than 90°, are handled as well if these angles are known to the algorithms.


The assumption of known prominent 3-D orientations is consistent with the goal of identifying objects of interest for a mobile robot, such as walls, doorways, or entire buildings.
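As an illustration, the predefined orientations used in the indoor experiments reported later (one vertical and two horizontal directions aligned with the building) can be listed simply as unit direction vectors in the world coordinate system. The following minimal C sketch assumes this axis-aligned case; any other known orientation could be added to the list in the same way.

```c
/* Predefined prominent 3-D orientations, expressed as unit direction
 * vectors in the world coordinate system W (z axis vertical).  This is a
 * sketch of the axis-aligned indoor case; a polygonal building would
 * simply contribute additional known directions. */
static const double prominent_orientations[][3] = {
    { 0.0, 0.0, 1.0 },   /* vertical                          */
    { 1.0, 0.0, 0.0 },   /* horizontal 1 (assumed: along the hallway)  */
    { 0.0, 1.0, 0.0 },   /* horizontal 2 (assumed: across the hallway) */
};
```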

B. Orientation of the Camera Relative to the Scene

Vertical lines constitute an interesting special case for two reasons: they are especially common in man-made scenes, and their 3-D orientation can easily be known in the 3-D camera coordinate system. If a two-axis inclinometer is mounted on the mobile robot and properly calibrated, a 3-D vertical vector can be expressed in the 3-D coordinate system aligned with the 2-D image coordinate system. It is interesting to note that humans and many animals also have "inclinometers": the human vestibular system, located in the inner ear, provides orientation to gravity. The saccule and utricle, two sack-like structures, contain small granules embedded in a gelatinous material. Nerves detect the motion of these granules and transmit it to the brain. Research has shown strong connections between the vestibular vertical and the visually perceived vertical [14], [15], although the exact interaction is still not clear.

We have experimented with two single-axis inclinometers (Midori Precision PMP-S10T) mounted on a CCD camera, as shown in Fig. 1. There is one inclinometer for pitch and one for roll. The pendulums of these inclinometers are oil-damped to attenuate the effects of short accelerations. The inclinometers are connected to the on-board computer through a 12-bit analog-to-digital converter. Rotations corresponding to local one-pixel shifts in the image were detected in a stationary experiment (no robot motion). Our algorithms can tolerate much larger errors (a few degrees). However, if the robot moves, it is conceivable that sustained accelerations would corrupt inclinometer readings beyond an acceptable level. In that case, gyroscopes would be necessary. We have not tried to read inclinometers while the robot was moving.

There is an easy alternative to inclinometers for indoor mobile robots. If we assume that the robot is driving on a horizontal plane, we can measure the vertical in the camera coordinate system at calibration time. This is what we have done for all the experiments presented in the rest of this paper. In practice, uneven floors are acceptable, but tilted ramps are not.

Through the inclinometer or the planar floor assumption, we have a good estimate of two of the three angular degrees of freedom of the robot. We estimate the third degree of freedom from odometer readings of the robot. Provided that the odometer is constantly corrected by vision (which it will be in Section IV), the odometric heading does not drift without bounds. An interval of possible robot headings can therefore be computed before visual processing. This interval is typically much less than 20° wide. The orientation of the camera with respect to the scene follows from the orientation of the robot and the eye/wheel calibration data (see Fig. 2).

Fig. 1. Roll and pitch inclinometers mounted on the robot's camera.

C. Projecting Segments of Known 3-D Orientation

In this section, we draw inferences about the likely 3-D orientation of the line segments from their 2-D projections in the image plane. With a pinhole perspective projection model, lines parallel to each other in the 3-D scene will converge to a vanishing point in the 2-D projection. Fig. 3 illustrates this property in one particular case, when the roll and pitch of the camera are zero. The knowledge of vanishing points provides useful information about the structure of the scene [16]-[20], about the camera's parameters [21]-[27] and the camera's orientation with respect to the scene [28]-[30]. Traditionally, the location of the vanishing points is extracted from the image [31]-[36]. In our approach, we try to estimate the position of the vanishing points before we even process the image.

If the orientation of the camera relative to the scene is known, a vanishing point can be computed for each given 3-D orientation before the image is processed. All the lines that have a given orientation in 3-D must pass through the associated vanishing point when projected. Conversely, if a line does not pass through a vanishing point, it cannot have the 3-D orientation associated with that vanishing point. In practice, if a line does pass through a vanishing point when projected, it is likely to have the associated 3-D orientation. The results of Section III will show how likely this is in practical situations.

We know the roll and pitch of the camera,¹ and we have an interval of possible values for the third orientation angle from the odometer. We can deduce that (refer to Fig. 3):

1) The exact orientation of the 2-D projections of vertical lines can be computed as a function of their location in the image plane.


¹We assume for now that we have accurate roll and pitch. Section IV will compute a posteriori roll and pitch from the image and correct imprecisions. The algorithm can be run again with the corrected values if necessary.


Fig. 2. The coordinate systems.

Fig. 3. The effects of the heading on segment orientations.

2) The same is true of other lines if the exact heading of the robot is given.

3) Even if the heading of the robot is only an estimate, the projection of the lines that are approximately parallel to the image plane of the camera in 3-D will not be sensitive to the heading. Their orientation can be estimated with little error, as a function of their location in the image plane.

4) For other lines, the 2-D line orientation belongs to an interval. This interval is a function of the line location in the image plane and the interval of possible robot headings.

The last two cases do not imply that one set of horizontal lines must be parallel to the image plane, with the other being parallel to the robot's heading. Instead, we arbitrarily assign to case 3) the lines that form a 3-D angle smaller than 45° with the image plane, and to case 4) the other nonvertical lines. The distinction is based on the fact that in case 3), vanishing points are far from the center of the image. Even if their exact location is uncertain, the 2-D orientation of the associated segments in the image is known with little error. In case 4), however, vanishing points are close to the image rectangle (possibly in the image itself). The 2-D orientation of the associated segments depends on the exact location of the vanishing points.


This is why an explicit interval of possible robot headings is used in case 4). Finally, if an image does not contain lines in case 4), the visual heading cannot be determined, and therefore the odometric heading is not corrected for drift.

In Section III, we present an algorithm to extract 2-D segments from an image and estimate their 3-D orientation based on these observations.

To summarize, the line detection algorithm of Section III knows at each point of the image plane the orientation that a projected line segment would have if it had one of the predefined 3-D orientations. Therefore, the basic idea is to detect the 2-D segments with one of the possible orientations, and mark them with the associated 3-D orientation hypothesis.

D. Coordinate Systems and Transformations

The coordinate systems, as shown in Fig. 2, are W (the World coordinate system, with a vertical z-axis), R (the Robot coordinate system), C (the Camera coordinate system), and P (the coordinate system used for the perspective projection on the retina). Let h denote the heading of the robot.

The homogeneous coordinate transformation matrix from W to R is

$$T_{WR} = T_{\mathrm{roll}}\, T_{\mathrm{pitch}} \begin{bmatrix} \cos h & \sin h & 0 & 0 \\ -\sin h & \cos h & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \qquad (1)$$

T_roll and T_pitch are known with good precision through the inclinometer. h belongs to a known interval centered on the odometric heading. x_robot, y_robot, and z_robot are approximated by the odometer but are not used in this section.

T_RC, the coordinate transformation matrix from R to C, needs to be completely determined through eye/wheel calibration [12].

The perspective projection is given by

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \alpha_u f & 0 & u_0 & 0 \\ 0 & \alpha_v f & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_P \\ y_P \\ z_P \\ 1 \end{bmatrix} \qquad (2)$$


where u and v represent the coordinates in pixels of a point on the image plane. The parameters α_u, α_v, f, u_0, and v_0 need to be determined through calibration of the camera [12]. α_u and α_v are the conversion factors in pixels per unit of length, u_0 and v_0 are the coordinates (in pixels) of the intersection between the optical axis and the image plane, and f is the focal distance in units of length. s represents the depth, and therefore cannot be determined from a single 2-D image.
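The following C sketch illustrates this projection under the standard pinhole model with the parameters defined above. The point is assumed to be already expressed in the projection coordinate system P, and the numeric calibration values are illustrative only.

```c
#include <stdio.h>

/* Intrinsic parameters obtained by camera calibration (illustrative values). */
typedef struct {
    double alpha_u, alpha_v;   /* conversion factors, pixels per unit length */
    double f;                  /* focal distance, units of length            */
    double u0, v0;             /* principal point, pixels                    */
} Intrinsics;

/* Project a 3-D point expressed in P onto the image plane.  The depth s
 * cancels out of the pixel coordinates and is returned separately; it
 * cannot be recovered from a single 2-D image. */
int project(const Intrinsics *c, double x, double y, double z,
            double *u, double *v, double *s)
{
    if (z == 0.0)
        return -1;                        /* point in the camera plane */
    *s = z;
    *u = c->alpha_u * c->f * x / z + c->u0;
    *v = c->alpha_v * c->f * y / z + c->v0;
    return 0;
}

int main(void)
{
    Intrinsics cam = { 110.0, 110.0, 6.0, 256.0, 240.0 };   /* 512 x 480 image */
    double u, v, s;
    if (project(&cam, 0.5, -0.2, 4.0, &u, &v, &s) == 0)
        printf("pixel (%.1f, %.1f), depth %.1f\n", u, v, s);
    return 0;
}
```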

III. DETECTING SEGMENTS AND ESTIMATING THEIR 3-D ORIENTATION

A. Previous Research

Line detectors may be broadly classified in two categories. In the first category, line segments are extracted after a traditional free-form edge detector processes the input image. In the second, straight lines are directly obtained without computing a free-form edge map. The most popular edge detectors that employ the first approach include the Laplacian of Gaussian followed by a zero-crossings detector, Canny's algorithm [37], and Deriche's recursive implementation [38]. At this stage of computation, various thresholds may be applied to eliminate edges with a weak contrast or a short length. The free-form edge map is then chain-coded and fed into a line detector such as [39]. Depending on the implementation, either the entire edge map is approximated by polygons, or only the linear pieces are approximated. An example of this approach is the segment-based stereovision system of Lustman and Faugeras [2], which led to a custom hardware implementation [40] intended for a mobile robot. The second approach for line detectors is best exemplified by Burns' algorithm [41]. This algorithm first computes the intensity gradient orientation for all pixels in the image. Next the neighboring pixels with similar gradient orientation are grouped into "line-support regions" by a process involving coarse orientation "buckets." Finally, a line segment is fit to the large line-support regions by a least-squares procedure. An optimized version of this algorithm for use with mobile robots was presented in [42].

The algorithm described in this section belongs to the second category of line detectors as defined above because it does not use a free-form edge detector. It is designed not only to extract 2-D line segments from an intensity image, but also to indicate the most probable orientations for the corresponding 3-D segments in the scene.

B. Overview of the Algorithm

The processing can be outlined as follows:

1) Line support region extraction: compute the angle between the intensity gradient at each pixel and the expected direction of the projection of each 3-D orientation (see Section III-C for details). Use a loose threshold to allow for noise in the gradient orientation. Reject improper pixels and 3-D orientations.

2) Nonmaxima suppression: keep only the local gradient maxima along the estimated perpendicular to the line.

3) Pixel linking: create chains of pixels using a partial neighborhood search in the direction of the estimated vanishing points. This creates noisy linear chains.

4) Line fitting: perform a least-squares fit of line segments to the pixel chains. Recursively break the pixel chains which cannot be closely approximated with a line segment into smaller chains.

5) Global orientation check: compute the match between each line and each 3-D orientation, as in the line-support extraction step, but with a much tighter threshold.

C. Extracting Line Support Regions

For each pixel in the input intensity image and for each category of possible 3-D orientations, we compute the angle between the intensity gradient and the expected direction of the line in 2-D. The expected line is given by the current pixel and the vanishing point associated with the 3-D orientation. It is not necessary to compute the location of the vanishing point (which may lie at infinity).

The homogeneous transformation matrix changing world coordinates into projective coordinates is

$$T_{WP} = T_{CP}\, T_{RC}\, T_{WR}. \qquad (3)$$

Let a be a nonnull direction vector (a homogeneous 4-vector with a zero fourth component) in the 3-D direction under consideration, expressed in W. If

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = T_{WP} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}_W$$

defines the relation between a 2-D point [u v]^T and its antecedent by the perspective projection, then

$$s' \begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix} = T_{WP} \left( \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}_W + a \right)$$

defines another point of the estimated 2-D line. A 2-D vector d in the image plane pointing to the vanishing point from the current point is then collinear to

$$\begin{bmatrix} u' - u \\ v' - v \end{bmatrix}.$$


Algebraic manipulations lead to

$$d \propto \begin{bmatrix} a_x - a_z u \\ a_y - a_z v \end{bmatrix}$$

where [a_x a_y a_z]^T is made of the first three components of T_WP a.



Note that a_x, a_y, and a_z need to be computed only once for each 3-D orientation.

The current pixel is retained for the 3-D direction under consideration if the angle between d and the local intensity gradient vector g is 90° plus or minus an angular threshold γ. This can be expressed by

$$(d \cdot g)^2 \le \|d\|^2 \|g\|^2 \sin^2 \gamma$$

or equivalently:

$$(d_x g_y - d_y g_x)^2 \ge r \,(d_x^2 + d_y^2)(g_x^2 + g_y^2)$$

with r = (cos γ)² computed once for all. Using this formulation, the entire line support extraction is reduced to 8 additions and 11 multiplications per pixel and per 3-D orientation. If an even greater speedup is desired, (g_x² + g_y²) may be computed first and thresholded. Pixels with a very low gradient magnitude may then be rejected before having to compute d.
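A minimal C sketch of this per-pixel test follows. It assumes d is formed from the precomputed a_x, a_y, a_z and the pixel coordinates as described above, and it uses the identity (d·g)² + (d_x g_y − d_y g_x)² = ||d||²||g||² to express the 90° ± γ condition with r = cos²γ; variable names are illustrative.

```c
/* Line-support test for one pixel and one 3-D orientation (sketch).
 * (ax, ay, az) are precomputed once per 3-D orientation and per image;
 * (gx, gy) is the local intensity gradient; r = cos(gamma)^2.
 * The pixel is kept when the angle between d and g is 90 +/- gamma deg,
 * which needs no square root or trigonometry at run time. */
int line_support(double ax, double ay, double az,
                 double u, double v,
                 double gx, double gy, double r)
{
    double dx = ax - az * u;               /* d points toward the vanishing point */
    double dy = ay - az * v;
    double cross = dx * gy - dy * gx;      /* |d||g| times sin(angle between d, g) */
    double d2 = dx * dx + dy * dy;
    double g2 = gx * gx + gy * gy;
    return cross * cross >= r * d2 * g2;   /* sin^2(angle) >= cos^2(gamma)         */
}
```

Written this way, the test takes 11 multiplications per pixel and per orientation (plus the additions and the gradient computation), consistent with the operation count quoted above.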

D. Nonmaxima Suppression

At this point in processing, line segments appear as dense elongated pixel "clouds." The nonmaxima suppression step is designed to thin those line-support regions by keeping only the local gradient maxima. The processing is local to a 3 × 1 neighborhood. For each pixel selected by the previous step and for each 3-D orientation, the gradient magnitude is compared to that of two of its neighbors. If it is less than either one, the current pixel is eliminated. The two neighbors should be in the direction perpendicular to the orientation of the line (which we derive from the orientation category). Ideally, interpolation between neighbors should be performed. We found experimentally that this is not necessary, and we consider either the top and bottom neighbors or the right and left neighbors. The choice is made based on the expected 2-D orientation. For 3-D orientations that are not sensitive to the heading (case 3) of Section II), we consider the gradient magnitude along the perpendicular to the a priori line orientation instead of the magnitude itself. One of the benefits of keeping only local maxima is that parallel lines that have the same gradient direction and are close to each other can be distinguished. Finally, the nonmaxima suppression step requires very little computation, which is local and therefore could be done in parallel for processing at video rates.
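A sketch of the 3 × 1 comparison in C is given below; the choice between left/right and top/bottom neighbors is driven by the expected 2-D orientation, as described above, and the image layout and names are assumptions.

```c
/* Nonmaxima suppression over a 3x1 neighborhood (sketch).  mag is the
 * gradient-magnitude image (width w, height h).  The two neighbors that
 * are compared lie roughly perpendicular to the expected line direction,
 * so a roughly vertical line compares against its left/right neighbors. */
int is_local_max(const float *mag, int w, int h, int x, int y,
                 int line_is_roughly_vertical)
{
    float m = mag[y * w + x];
    if (line_is_roughly_vertical) {
        if (x <= 0 || x >= w - 1) return 0;
        return m >= mag[y * w + (x - 1)] && m >= mag[y * w + (x + 1)];
    } else {
        if (y <= 0 || y >= h - 1) return 0;
        return m >= mag[(y - 1) * w + x] && m >= mag[(y + 1) * w + x];
    }
}
```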

E. Pixel Linking

The output of the nonmaxima suppression stage is scanned for each 3-D orientation category. When a valid pixel is found, the algorithm tries to find more valid pixels in the direction of the vanishing point. The algorithm accounts for noisy line support regions by searching a small neighborhood, but it gives priority to the pixel that is closest to the line passing through the current point and the vanishing point associated with the current 3-D orientation. The result of this processing step consists of chains of pixels. Short chains are eliminated since they usually result from noise or insignificant details of the scene. In practice, most chains are almost linear.

If the vanishing point is uncertain because the heading is uncertain, pixel linking may be affected near the vanishing point: the linker may attract the chain of pixels toward the estimated vanishing point, away from the real one. In our implementation, this may happen for 3-D lines that are almost parallel to the optical axis. For this reason, scanning for valid pixels begins away from the estimated vanishing point. As soon as the first pixel of a chain is found, the direction to the vanishing point is estimated. This direction is used for all remaining pixels of the chain.

F. Line Fitting

This step converts chains of pixels into line segments by a least-squares method. If the mean square error exceeds a threshold, the pixel chain is broken into two at an appropriate pixel. The line fitting routine is called recursively on the resulting chains.
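The following C sketch shows one way to realize this fit-and-split recursion. It uses a total least-squares (principal axis) fit and splits at the pixel with the largest perpendicular error; the splitting rule and thresholds are assumptions, since the paper does not specify them.

```c
#include <math.h>
#include <stdio.h>

typedef struct { double x, y; } Pt;

/* Total least-squares fit of a line nx*x + ny*y + c = 0 to p[lo..hi-1]:
 * the line passes through the centroid along the principal axis of the
 * scatter.  Returns the mean squared perpendicular error and the index
 * of the worst-fitting pixel. */
static double fit_line(const Pt *p, int lo, int hi,
                       double *nx, double *ny, double *c, int *worst)
{
    int n = hi - lo, i;
    double mx = 0, my = 0, sxx = 0, sxy = 0, syy = 0, mse = 0, emax = -1;
    for (i = lo; i < hi; i++) { mx += p[i].x; my += p[i].y; }
    mx /= n;  my /= n;
    for (i = lo; i < hi; i++) {
        double dx = p[i].x - mx, dy = p[i].y - my;
        sxx += dx * dx;  sxy += dx * dy;  syy += dy * dy;
    }
    double theta = 0.5 * atan2(2.0 * sxy, sxx - syy);   /* principal axis angle */
    *nx = -sin(theta);  *ny = cos(theta);  *c = -(*nx * mx + *ny * my);
    *worst = lo;
    for (i = lo; i < hi; i++) {
        double e = *nx * p[i].x + *ny * p[i].y + *c;     /* signed distance */
        mse += e * e;
        if (e * e > emax) { emax = e * e; *worst = i; }
    }
    return mse / n;
}

/* Recursively break the chain until every piece fits a segment, discarding
 * pieces shorter than min_len (short chains are treated as noise). */
static void fit_chain(const Pt *p, int lo, int hi, double max_mse, int min_len)
{
    double nx, ny, c;
    int worst;
    if (hi - lo < min_len || hi - lo < 3)
        return;
    if (fit_line(p, lo, hi, &nx, &ny, &c, &worst) <= max_mse) {
        printf("segment: pixels %d..%d\n", lo, hi - 1);
        return;
    }
    if (worst <= lo)      worst = lo + 1;     /* keep both halves strictly smaller */
    if (worst >= hi - 1)  worst = hi - 2;
    fit_chain(p, lo, worst + 1, max_mse, min_len);
    fit_chain(p, worst, hi, max_mse, min_len);
}
```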

G. Global Line Verification

The final step is to verify the compatibility between the observed line segments and the vanishing point under consideration. The principle is the same as for the line support extraction, but with much tighter thresholds. Note that a line segment may be compatible with several 3-D orientations. The classification of the segment is then ambiguous. Most ambiguities can be resolved by tighter thresholds, especially after a better estimate of the robot's heading becomes available. Also note that ambiguities can only occur in a few narrow regions of the image plane.

When the a priori orientation is sensitive to uncertainty in the heading (case 4) of Section II), we replace the angular threshold in the image plane by a threshold on the a posteriori heading. The verification is done as follows: a) the 2-D line equation parameters are computed; b) the heading that the robot should have for the current line to be associated with the current vanishing point is computed; and c) the line is validated if this a posteriori value falls within the a priori interval. This process ensures equal tightness of threshold for all locations on the retina.

Fig. 4. (a) The input intensity image. (b) The 2-D segments.

H. Results

The algorithm was implemented in C on an HP 730 workstation, and tested on hundreds of indoor images obtained by our mobile robot (see [10] for a video presentation of the results). Here, the predefined 3-D orientations are the vertical and the two horizontal orientations perpendicular to each other and aligned with the axes of our building. Fig. 4 shows the results of line extraction for one image in a sequence (the intensity image shows lens distortion because the camera is equipped with a 6-mm wide-angle lens, as described in Section V).

Fig. 5 shows the segments associated with each predefined 3-D orientation. The small vertical segment in the top of the first image and the small horizontal segment on the right of the second image have an ambiguous classification: they also appear in the third image. It is interesting to note that such misclassifications can only occur in narrow parts of the image where different 3-D orientations have similar 2-D projections.

The processing time is only 1.2 s for each 512 × 480 image, from the intensity image to the list of categorized segments. The high speed can be explained partly by the absence of multicycle floating-point instructions from the line orientation equations, when properly expressed. The lines are not broken up easily by a noisy gradient orientation, and the output quality does not degrade abruptly with high image noise, provided that the thresholds for local gradient orientations are loosened. A few misclassifications occur in some parts of the images, but are marked as ambiguities.

Table I compares the real and computed 3-D orientation of 1439 detected segments from eight images in three different environments (see Figs. 4 and 6). The presence of people in some scenes, as well as noise in the radio transmission of images, did not seem to generate many misclassifications. The percentages in each column sum to more than 100 percent because of the multiple classification of ambiguous segments.

IV. COMPUTING THE ROBOT'S ORIENTATION AND POSITION

A. Computing the Robot's Orientation

In this section we show how the robot's roll, pitch, and heading may be computed a posteriori from the 2-D segments extracted in Section III. To achieve the best precision, each angle should be evaluated from the vanishing point that varies the most with that angle. We begin with the roll, which is best derived from the 3-D vertical lines if the camera is looking approximately forward. Because of our definition of the heading (a rotation around the vertical axis of W), the vanishing point associated with the vertical lines does not depend on the heading. If all the 2-D segments classified as 3-D vertical are parallel in the image, the vanishing point lies at infinity, the pitch p is zero, and the roll r is the 2-D orientation of the segments in the image. Otherwise, we first estimate the location [u_vp v_vp]^T of the vanishing point by a weighted least-squares method. We then have

$$u_{vp} = \lim_{z \to \infty} \frac{su}{s}, \qquad v_{vp} = \lim_{z \to \infty} \frac{sv}{s}$$


Fig. 5. The line segments associated with each 3-D orientation.

TABLE I
HYPOTHESIZING 3-D ORIENTATIONS: SUCCESS AND ERROR RATES
(How Detected Segments Are Classified)

                              Real 3-D Orientation
Classified As         Vertical   Horizontal 1   Horizontal 2   Other
Vertical                100%          0%             1.1%        21%
Horizontal 1              0%        100%             0.3%         0%
Horizontal 2             0.4%         0%            99.8%        73%
Total                   100.4%      100%           101.2%       100%
Number of segments       211          88            1123          11


Fig. 6. Different environments for testing segment classification.

with

$$\begin{bmatrix} su \\ sv \\ s \end{bmatrix} = T_{RP}\, T_{\mathrm{roll}}(r)\, T_{\mathrm{pitch}}(p) \begin{bmatrix} 0 \\ 0 \\ z \\ 1 \end{bmatrix}. \qquad (12)$$

Let T_RP(i, j) be the element i, j of the matrix T_RP. After algebraic manipulations, r is obtained as the arc-tangent of an expression in these elements.


The pitch of the robot may also be computed from the vanishing point of vertical lines, but for better precision, we use another vanishing point more dependent on pitch and heading. In our implementation, we use the vanishing point that lies closest to the center of the image. We obtain the pitch and the heading with a method identical to the computation of roll described above.
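All of the computations above start from a vanishing point location estimated from the classified 2-D segments. A minimal C sketch of such a weighted least-squares estimate follows; the paper does not give the exact weighting or line representation, so the segment lines are assumed normalized and weighted by, for example, their length.

```c
/* One detected 2-D segment, stored as the normalized line equation
 * nx*u + ny*v + c = 0 plus a weight (e.g. the segment length). */
typedef struct { double nx, ny, c, w; } Seg2D;

/* Weighted least-squares estimate of a vanishing point: the point that
 * minimizes the sum of weighted squared distances to the segment lines.
 * Returns 0 on success, -1 if the segments are (nearly) parallel, i.e.
 * the vanishing point lies at infinity. */
int vanishing_point(const Seg2D *s, int n, double *u_vp, double *v_vp)
{
    double a = 0, b = 0, d = 0, e = 0, f = 0;      /* normal equations */
    for (int i = 0; i < n; i++) {
        a += s[i].w * s[i].nx * s[i].nx;
        b += s[i].w * s[i].nx * s[i].ny;
        d += s[i].w * s[i].ny * s[i].ny;
        e -= s[i].w * s[i].nx * s[i].c;
        f -= s[i].w * s[i].ny * s[i].c;
    }
    double det = a * d - b * b;
    if (det < 1e-12 && det > -1e-12)
        return -1;
    *u_vp = (d * e - b * f) / det;
    *v_vp = (a * f - b * e) / det;
    return 0;
}
```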

B. Computing the Robot's Relative Position

Although the robot's odometers provide a relative position and a heading, these measurements cannot be used alone, since odometers drift without bounds. Odometry may be adequate for estimating the motion between a few images, but it is insufficient for the long sequences necessary in mapping an entire building.

The robot is equipped with an odometer on its right and left driving wheels. In practice, translation measurements are very good because they are derived from the average of the two odometers. Rotations (changes in robot heading), however, drift much faster because they rely on the difference between the two odometers. For this reason, we replace the odometric heading by the heading computed using vanishing points every time an image is processed. This technique prevents the heading from drifting without bounds.

The position of the robot, obtained by integrating small displacements, can only drift because of errors in translations, which build up much more slowly than errors in rotations.
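A sketch of the resulting dead-reckoning rule in C: the translation increment comes from the average of the two wheel odometers, while the heading is taken from the vision estimate instead of being integrated from the odometer difference. The function and variable names, and the simple straight-line motion model, are assumptions.

```c
#include <math.h>

/* Robot pose in the world coordinate system. */
typedef struct { double x, y, heading; } Pose;

/* Dead-reckoning update (sketch).  d_left and d_right are the wheel
 * odometer increments since the last image; vision_heading is the
 * heading computed from vanishing points, which replaces the drifting
 * odometric heading. */
void update_pose(Pose *p, double d_left, double d_right, double vision_heading)
{
    double d = 0.5 * (d_left + d_right);   /* translation along the heading */
    p->heading = vision_heading;
    p->x += d * cos(p->heading);
    p->y += d * sin(p->heading);
}
```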

C. Results

The quality of the results depends on the error in the vanishing point extraction and the precision of calibration. To illustrate the performance of heading computation from individual images, we plotted the path of the robot around a building floor. In Fig. 7, the heading extracted by the vision algorithm was combined with the translation given by odometry to estimate the path of the robot. At the beginning and end of the experiment, the robot was approximately aligned with the center of a hallway. The robot stopped about 5 m before the departure point. The total traveled distance was 125 m, and 97 images were used to compute the headings. The algorithm seemed unaffected by people moving in the field of view and by significant noise due to the transmission of the images by radio. The processing time for a posteriori heading estimation is negligible compared to the time needed for line extraction.


Fig. 7. The path of the robot (a) as estimated by odometry, and (b) after correction by vision.

V. ESTIMATING 3-D SEGMENT POSITIONS

A. Introduction

The recursive estimation of 3-D segments from sequences using Kalman filtering is the next logical step toward scene understanding. It aims at estimating the depth information that is absent from 2-D images. Existing approaches include [1], [43], to name only a few. Viéville and Faugeras [44] concentrate on estimating motion and structure from monocular segment images and present results on the reconstruction of a calibration pattern.

Jézouin and Ayache [45], [46] review different segment-tracking techniques using monocular images and present results on 3-D reconstruction from synthetic aerial images. Kriegman, Triendl, and Binford [47] find walls and doorways in a corridor using recursive estimation. They employ a simple 1-D stereo algorithm on vertical lines, assuming zero camera roll and tilt. Wells [48], [49] requires only one camera on a mobile robot to recursively estimate 3-D segments. Unfortunately, no results are provided and comparison is impossible.

Applications in autonomous mobile robots imply the ability to deal with real indoor scenes while using limited computation resources. As in the other sections of this paper, we aim at providing useful features for an interpretation of the environment of the robot. We only consider the segments that have been extracted by the algorithm of Section III. We will show how this choice benefits the robustness and speed of the recursive estimation process.

B. Preliminary Observations

A 3-D segment should be observed from very different angles to be accurately reconstructed. Two approaches can be used to achieve this with a monocular camera. The first one, active vision, pans and tilts the camera to keep some interesting features in the field of view. The second approach, which we chose, consists of fitting the camera with a wide-angle lens. A 3-D segment is usually first detected when far ahead of the robot, but as the robot drives closer, the 2-D projection moves to one side of the focus of expansion. A short focal length will enable the 3-D segment to remain in the field of view long enough for an accurate reconstruction. The drawback to using a wide-angle lens is geometric distortion. The first processing step is therefore to correct this distortion [50]. This takes under 0.3 s on an HP 730 workstation.

The 3-D segments that are parallel to the direction of motion cannot be properly reconstructed without considering their endpoints. The experience of all researchers using real images indicates that endpoints are unreliable, because 2-D segments may be broken or extended (due to specular reflection, for example). Therefore, we will not use endpoints to estimate the parameters of infinite 3-D lines. There is no justification for wasting computation to estimate features that will not help the interpretation of the scene. In practice, we will not attempt to reconstruct the 3-D segments parallel to the direction of motion. Those segments are sent to the semantic interpreter with an estimate of orientation but not of position, unless the robot changes its course. This contrasts with other approaches that do send an estimate of the position and orientation of those segments, both with a huge uncertainty.

Fig. 8 presents a block diagram of the recursive estimation of 3-D segments. As in Section IV, we combine translations from odometry with rotations from vanishing points to estimate the motion of the robot.


Fig. 8. Recursive estimation of 3-D segments (blocks: fast segment extraction and 3-D orientation estimation; a posteriori estimation of heading, roll, and pitch; prediction; matching; 3-D segment representation update; semantic interpretation).

Because the rotations are obtained from each 2-D image with excellent accuracy, we do not introduce any uncertainty on their parameters. Translations, however, rely partly on odometry. Therefore, we model the location of the robot in the world coordinate system at time k by a 3-D vector t_k and its covariance matrix E(t_k t_k^T) under the Gaussian noise assumption.

C. Representation of Lines

The computational complexity of a Kalman filter increases with the cube of the dimension of the state vector; therefore the state vector that represents 3-D segments should be as simple as possible. Since the algorithm of Section III already provides the most likely 3-D line orientation, we only want to estimate two positional parameters for each 3-D line. Let us define a coordinate system L_n for each predefined 3-D orientation n as follows: L_n is a rotation of W so that the z axis of L_n is parallel to the 3-D orientation n. The T_{L_nW} matrices are all known by definition. The state vector of a tracked 3-D line is then

made of the first two components of the coordinates in L_n of any point of the line.

Because we are not using stereovision, a problem arises when a 2-D segment is acquired for the first time. Our approach is to model it by a state vector with arbitrary depth and a very large uncertainty in depth. After repeated observations from different angles, the uncertainty ellipsoid associated with the state vector will shrink around the best estimate of the position.
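In code, this amounts to setting the two positional parameters from an assumed depth and giving them a very large covariance, so that the first observations from different viewpoints dominate the estimate. The following sketch uses illustrative names.

```c
/* Initialize the state of a newly observed 3-D line (sketch).  The two
 * positional parameters in the plane of L_n are set from an assumed
 * depth along the current viewing direction; the covariance is made very
 * large so that later observations quickly take over. */
void init_line_state(double x[2], double P[2][2],
                     double x_guess, double y_guess, double big_variance)
{
    x[0] = x_guess;
    x[1] = y_guess;
    P[0][0] = big_variance;  P[0][1] = 0.0;
    P[1][0] = 0.0;           P[1][1] = big_variance;
}
```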

The measurement Y_{k,i} of a 2-D line i in frame k given its associated 3-D orientation is defined as follows: if the line is "rather horizontal," or strictly speaking, nonvertical, it can be described in the image plane by v = αu + Y_{k,i} with some parameter α. If the line is "rather vertical," Y_{k,i} is such that u = αv + Y_{k,i}. Deciding whether a line is rather horizontal or vertical is based on the estimated 3-D orientation, so that singularities in the representation are avoided yet consistency across frames is maintained. We will derive the equations for the first case and then show how they apply to the second case.

D. State and Measurement Equations

Let [x y z 1]^T be a running point on the 3-D line i associated with the 2-D line given by Y_{k,i}. We have

$$\forall z \in \mathbb{R}, \quad [\alpha \;\; -1 \;\; Y_{k,i}] \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} T_{k,L_nP} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = 0.$$

Because this holds for all z in R, we can derive

$$[\alpha \;\; -1 \;\; Y_{k,i}] \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} T_{k,L_nP} \begin{bmatrix} x \\ y \\ 0 \\ 1 \end{bmatrix} = 0
\quad\text{and}\quad
[\alpha \;\; -1 \;\; Y_{k,i}] \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} T_{k,L_nP} \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} = 0. \qquad (15)$$

The second equation gives α as a function of Y_{k,i} and T, but it is independent of the noisy translation t_k. Let T = T_{k,L_nP}. Appendix A handles the case T_{1,3} = 0. If T_{1,3} ≠ 0, the second equation implies that



$$[\alpha \;\; -1 \;\; Y_{k,i}] = [Y_{k,i} \;\; 1] \begin{bmatrix} -T_{3,3}/T_{1,3} & 0 & 1 \\ T_{2,3}/T_{1,3} & -1 & 0 \end{bmatrix}.$$

The first equation becomes

$$[Y_{k,i} \;\; 1]\, P \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = 0$$

with

$$P = \begin{bmatrix} -T_{3,3}/T_{1,3} & 0 & 1 \\ T_{2,3}/T_{1,3} & -1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} T \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

This implies

$$\left( Y_{k,i}\, [P_{1,1} \;\; P_{1,2}] + [P_{2,1} \;\; P_{2,2}] \right) \begin{bmatrix} x \\ y \end{bmatrix} + \left( Y_{k,i}\, P_{1,3} + P_{2,3} \right) = 0$$

which can be expressed

$$z_{k,i} = H_{k,i} X_{k,i}$$

with

$$H_{k,i} = Y_{k,i}\, [P_{1,1} \;\; P_{1,2}] + [P_{2,1} \;\; P_{2,2}]$$

$$z_{k,i} = -Y_{k,i}\, P_{1,3} - P_{2,3}.$$

The noisy measurement equation is therefore

$$z_{k,i} = H_{k,i} X_{k,i} + v_{k,i}$$

and the state equation in the fixed coordinate system L_n is simply

$$X_{k+1,i} = X_{k,i} + w_{k,i}. \qquad (24)$$

v_{k,i} is a centered Gaussian noise that depends on the actual measurement noise of Y_{k,i} and the noise on the position of the robot. Appendix B derives the explicit relation. The noise w_{k,i} represents other unmodeled effects.

Note that these results also apply to the second case of 2-D segment orientation by exchanging the first two rows of the matrix T.
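With a two-dimensional state and a scalar measurement, one recursive update can be written out explicitly. The C sketch below implements the standard Kalman equations for the model above; the process and measurement noise variances, and the function name, are illustrative.

```c
/* One Kalman-filter step for a tracked 3-D line (sketch).
 * State x = (x1, x2): position of the line in the plane of L_n.
 * P: 2x2 state covariance.  Measurement z = H x + v with scalar z,
 * H = (h1, h2), measurement noise variance rv, process noise q.
 * The static state equation x_{k+1} = x_k + w only inflates P before
 * the update. */
void kalman_line_update(double x[2], double P[2][2],
                        double h1, double h2, double z, double rv, double q)
{
    /* Prediction: static state, add process noise. */
    P[0][0] += q;  P[1][1] += q;

    /* Innovation and its variance  S = H P H^T + rv. */
    double y = z - (h1 * x[0] + h2 * x[1]);
    double ph0 = P[0][0] * h1 + P[0][1] * h2;
    double ph1 = P[1][0] * h1 + P[1][1] * h2;
    double S = h1 * ph0 + h2 * ph1 + rv;

    /* Kalman gain K = P H^T / S and state update. */
    double k0 = ph0 / S, k1 = ph1 / S;
    x[0] += k0 * y;
    x[1] += k1 * y;

    /* Covariance update  P = (I - K H) P. */
    double P00 = P[0][0] - k0 * (h1 * P[0][0] + h2 * P[1][0]);
    double P01 = P[0][1] - k0 * (h1 * P[0][1] + h2 * P[1][1]);
    double P10 = P[1][0] - k1 * (h1 * P[0][0] + h2 * P[1][0]);
    double P11 = P[1][1] - k1 * (h1 * P[0][1] + h2 * P[1][1]);
    P[0][0] = P00;  P[0][1] = P01;  P[1][0] = P10;  P[1][1] = P11;
}
```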

E. Implementation Issues

The implementation benefits from hypothesizing the 3-D orientation of lines before estimating their 3-D position in the following ways:

1) Matching is attempted only between segments of the same orientation in 3-D. In practice, other implementations of 3-D segment recursive estimators use arbitrary 2-D orientation "buckets" to achieve the same purpose. By matching segments of given 3-D orientations, we have a more formal approach to the problem of limiting possible matches.

2) No computation is wasted on segments that cannot be reconstructed because they are parallel in 3-D to the direction of motion.

3) The problem of singularities in the representation (see [45] for example) is avoided: although there is still a need for different line parameterizations, the appropriate one needs to be determined only once for each segment being tracked.

4) Traditional approaches have difficulties dealing with moving objects and a moving camera. Large moving objects will corrupt the 3-D map of the environment and yield incorrect motion estimates if the egomotion is estimated from the image. By tuning the line extractor to certain orientations of interest, it is unlikely that an unexpected moving object could durably display misleading features, unless the moving object happens to be "box-like." In particular, the identification of architectural features of a scene will not be significantly hampered by humans walking in the field of view. This is extremely important in real-world situations.

Several approaches are possible for tracking segments across images. If a conventional token-tracker is used [51], matching is necessary only between segments with the same 3-D orientation. The results we are presenting here have been obtained with manual tracking. Preliminary results with a new type of tracker, to be published later, show that very similar results can be obtained automatically.

F. Results

Fig. 9 shows the vertical segments in a top view of a linear corridor, as estimated after 1, 4, 7, and 10 images in a sequence. The corridor is represented horizontally in the figure. The position and position uncertainty of segments are given separately at each step. Each small "+" sign indicates the likeliest position of a vertical edge, overlaid on a grid with 1-m spacing. Uncertainty is represented by an elongated ellipse around the likeliest position.


TABLE II
COMPARING ESTIMATED DOOR FRAME EDGES TO THEIR MODEL
(Vertical Segments in the Top View)

Door Number   Δx in mm   Δy in mm   Error in Δx   Angle of the Door Frame
     1           917        -54        -9.7%              -3.4°
     2          1083          4         6.6%               0.2°
     3          1013        -18        -0.3%              -1.0°
     4           996         25        -2.0%               1.4°
     5           769        -62       -24.3%              -4.6°
     6          1114         16         9.6%               0.8°
     7           421        -72       -58.6%               9.7°
     8           292         99       -71.3%              18.7°

Fig. 9. Top view of a hallway after 1, 4, 7, and 10 images (estimate and uncertainty of vertical segments).


The robot moved 9 m during the sequence, or about a quarter of the length of the corridor (the motion is from left to right in the figure). The segments visible for the first time are given an arbitrary depth and a very large uncertainty in depth, as shown after processing the first image.

Fig. 10. Reconstructed vertical edges overlaid on the floor plan.

As the robot moves forward, the uncertainty ellipses shrink around the best estimates of position. At the end of the sequence, edges on the right and left side of the corridor are clearly visible. The position of the robot is indicated by a small dot in the middle of the corridor. It is most easily seen in the last uncertainty view. In this sequence, there are few interesting horizontal edges that can be tracked. They are not shown here in order to keep the figures readable. The width of the corridor at the location of the robot was estimated with an error varying from 0.4 to 4.7%.

As an illustration, Table II analyzes the vertical edges of the door frames shown in the last top view of Fig. 9. Ideally, the two vertical segments should be 1016 mm apart and the angle of the door frame in the scene should be zero. The last two doors are about 9 m ahead of the robot and the 3-D position of their segments is still very uncertain. The number of images in which the third and fourth door frames were visible could explain the high precision of their reconstruction. But in fact, the number of images is not as important as the range of observation angles. In Fig. 9, these doorways are shown as having the smallest uncertainty ellipses.

After a very careful calibration, we processed another short sequence of images acquired by the robot in the environment depicted in Fig. 4. The result of the reconstruction of vertical edges using four images is presented in Fig. 10, and compared to the architectural floor plan of the building. Vertical edges are indicated by small crosses. The robot moved 3 m from left to right. The last position of the robot is represented by a small square in the left of the figure. The grid has a 1-m spacing. The door in the top right corner was closed, and some segments on it are reconstructed. The high precision of the reconstruction of these segments is surprising, since they are close to the focus of expansion and still more than 4 m ahead of the robot. A few segments with no architectural interest are picked up. For example, two of them lie on the wall at the bottom of the figure. The segments in the top-left quadrant of the figure appear to have been shifted to the right. After measuring the actual dimensions in the scene, we found that the corner of the corridor was built approximately 5 cm to the right of the position shown on the architectural drawing. The robot therefore correctly indicated this discrepancy, although it overestimated it slightly.


VI. CONCLUSION

We have presented a new approach to the interpretation of monocular line images for the navigation of a mobile robot. We described algorithms for interpreting single images and for recursively updating interpretations with monocular sequences of images. The quality and usefulness of the results of each processing stage justified the practical constraint placed on the problem: the a priori knowledge of a few prominent 3-D segment orientations. This constraint is reasonable since even humans get disoriented when placed in a tilted environment, despite having our own "inclinometers" in our inner ears [52], [53]. By reasoning in terms of what features the high-level interpretation stage will actually use, we were able to design lower-level algorithms for better and faster processing. The ultimate benefits of this approach were demonstrated on real images in real situations, using conventional computing hardware, and with image acquisitions every few seconds. We feel that the kind of top-down information employed in this paper will increasingly be used by vision algorithms to assist bottom-up image processing in robotic applications.

APPENDIX I
CASE T_{1,3} = 0

As stated in Section V, this derivation applies to segments that are nonvertical in 2-D. Therefore, the infinite 2-D line associated with the segment intersects the axis u = 0. If T_{1,3} = 0, we have d_u = 0 at u = 0. Either d_v ≠ 0, and the segment is vertical (contradiction); or d_v = 0, which means that the vanishing point lies on the u = 0 axis. The latter case is very unlikely since we do not track 3-D segments that are parallel to the direction of motion. Should it occur nevertheless, the robot should be rotated and a new image acquired.

APPENDIX II
DERIVATION OF E(v_{k,i} v_{k,i}^T)

The measurement equation is

$$z_{k,i} = H_{k,i} X_{k,i} + v_{k,i} \qquad (28)$$

therefore

$$v_{k,i} = -\delta z_{k,i} + \delta H_{k,i}\, \hat{X}_{k,i}. \qquad (29)$$

Since

$$z_{k,i} = -Y_{k,i} P_{1,3} - P_{2,3} \qquad (30)$$

we have an expression of the noise on z_{k,i} in terms of the measurement noise on Y_{k,i} and of the variations of P_{1,3} and P_{2,3} with the robot position t_k.

Let [u v]^T be a point on the 2-D segment under consideration. It follows from Section III that the vector

$$d = \begin{bmatrix} T_{1,3} - T_{3,3}\, u \\ T_{2,3} - T_{3,3}\, v \end{bmatrix} \qquad (27)$$

points to the vanishing point n, and is therefore parallel to the segment under consideration.

From the definition of P, we have

$$P_{1,1} = -\frac{T_{3,3} T_{1,1}}{T_{1,3}} + T_{3,1}, \quad P_{1,2} = -\frac{T_{3,3} T_{1,2}}{T_{1,3}} + T_{3,2}, \quad P_{1,3} = -\frac{T_{3,3} T_{1,4}}{T_{1,3}} + T_{3,4} \qquad (34)$$

$$P_{2,1} = \frac{T_{2,3} T_{1,1}}{T_{1,3}} - T_{2,1}, \quad P_{2,2} = \frac{T_{2,3} T_{1,2}}{T_{1,3}} - T_{2,2}, \quad P_{2,3} = \frac{T_{2,3} T_{1,4}}{T_{1,3}} - T_{2,4} \qquad (35)$$

with T_{k,L_nP} = T = T_CP T_RC T_{k,WR} T_{L_nW}. If we decompose T_{k,WR} into a rotation matrix and a translation matrix, this equation can be rewritten as

$$T = \begin{bmatrix} M_{3\times 3} & -A_{3\times 3}\, t_k + B_{3\times 1} \\ 0 \;\; 0 \;\; 0 & 1 \end{bmatrix}.$$

Note that the derivative of P_{i,m} with respect to t_k is zero if m ≠ 3, so only the derivatives of P_{1,3} and P_{2,3} are needed; for example

$$\frac{\partial P_{2,3}}{\partial t_{k,h}} = \frac{T_{2,3}}{T_{1,3}} \frac{\partial T_{1,4}}{\partial t_{k,h}} - \frac{\partial T_{2,4}}{\partial t_{k,h}} = -\frac{T_{2,3}}{T_{1,3}} A_{1,h} + A_{2,h}. \qquad (43)$$

REFERENCES

[1] N. Ayache, Artificial Vision for Mobile Robots. Cambridge, MA: MIT Press, 1991.

[2] F. Lustman, "Vision stéréoscopique et perception du mouvement en vision artificielle," Ph.D. dissertation, Université de Paris-Sud, Dec. 1987.

[3] J. Crowley and P. Stelmaszyk, “Measurement and integration of 3-D structures by tracking edge lines,” in Proc. 1st Euro. Conf Comput. Vision, Antibes, France. Berlin: Springer-Verlag, Apr. 1990, pp. 269-280.

[4] J. L. Crowley, P. Bobet, and K. Sarachik, "Mobile robot perception using vertical line stereo," in Vision-Based Vehicle Guidance, I. Masaki, Ed. Berlin: Springer-Verlag, 1992, ch. 15, pp. 300-324.

[5] M. Thonnat, “Semantic interpretation of 3-D stereo data: Finding the main structures,” in Proc. 8th Int. Conf. Pattern Recogn., Paris, France, Oct. 1986, pp. 1051-1054.

[6] C. Fennema, A. Hanson, E. Riseman, J . R. Beveridge, and R. Ku- mar, “Model-directed mobile robot navigation,” IEEE Trans. Sys- tems, Man, Cybern., vol. 20, pp. 1352-1369, Nov. 1990.

[7] C. Fennema and A. R. Hanson, “Experiments in autonomous navi- gation,” in Proc. 10th Int. Con5 Pattern Recogn., Atlantic City, NJ, June 1990, pp. 24-31.

[8] E. Triendl and D. J. Kriegman, "Stereo vision and navigation within buildings," in Proc. IEEE Int. Conf. Robotics Automat., Raleigh, NC, Mar. 1987, pp. 1725-1730.

[9] X. Lebègue and J. K. Aggarwal, "Extraction and interpretation of semantically significant line segments for a mobile robot," in Proc. IEEE Int. Conf. Robotics Automat., Nice, France, May 1992, pp. 1778-1785.

[10] -, "Semantically significant line segments for a mobile robot . . . ROBOTEX," in Video Proc. IEEE Int. Conf. Robotics Automat., Nice, France, May 1992.

[11] -, "Detecting 3-D parallel lines for perceptual organization," in Proc. 2nd Europ. Conf. Comput. Vision, Santa Margherita Ligure, Italy. Berlin: Springer-Verlag, May 1992, pp. 720-724.

[12] -, “A mobile robot for visual measurements in architectural ap- plications,” in Proc. IAPR Workshop Machine Vision Applicat., To- kyo, Japan, Dec. 1992, pp. 195-198.

[13] -, “Robotex: An autonomous mobile robot for precise survey- ing,” in Proc. Int. Con6 Intell. Autonom. Syst . , Pittsburgh, PA, Feb. 1993, pp. 460-469.

[14] I. P. Howard, "The effects of tilting the visual frame of reference on judgments of the visual vertical," in Human Visual Orientation. New York: Wiley, 1982, ch. 10.3, pp. 419-427.

[15] N. Bischof, “Optic-vestibular orientation to the vertical,” in Hand- book of Sensory Physiology V 1 / 2 , H. H. Komhuber, Ed. Berlin: Springer-Verlag, 1974, ch. 2, pp. 155-190.

[16] T. Shakunaga, “3-D corridor scene modeling from a single view un- der natural lighting conditions,” IEEE Trans. on Pattern Anal. Mu- chine Intell., vol. 14, pp. 293-298, Feb. 1992.

[17] M. Straforini, C. Coelho, M. Campani, and V. Torre, "The recovery and understanding of a line drawing from indoor scenes," IEEE Trans. Pattern Anal. Machine Intell., vol. 14, pp. 298-303, Feb. 1992.

[18] P. Olivieri, M. Gatti, M. Straforini, and V. Torre, “A method for the 3D reconstruction of indoor scenes from monocular images,” in Proc. 2nd Euro. Conf Comput. Vision, Santa Margherita Ligure, It- aly. Berlin: Springer-Verlag, May 1992, pp, 696-700.

[19] P. Belluta, G. Collini, A. Vem, and V. Torre, “3D visual informa- tion from vanishing points,” in Proc. IEEE Workshop Interpretation 3 0 Scenes, Austin, TX, Nov. 1989, pp. 41-49.

[20] T. Tsubouchi and S. Yuta, “Map assisted vision system of mobile robots for reckoning in a building environment,” in Proc. IEEE Int. Con5 Robotics Automat., Raleigh, NC, Mar. 1987, pp. 1978-1984.

[21] L.-L. Wang and W.-H. Tsai, “Camera calibration by vanishing lines for 3-D computer vision,’’ IEEE Trans. Pattern Anal. Machine I n - tell. , vol. 13, pp. 370-376, Apr. 1991.

[22] M. A. Penna, "Determining camera parameters from the perspective projection of a quadrilateral," Pattern Recogn., vol. 24, no. 6, pp. 533-541, 1991.

[23] W. Chen and B. C. Jiang, “3-D camera calibration using vanishing point concept,” Pattern Recogn., vol. 24, no. 1, pp. 57-67, 1991.

[24] B. Caprile and V. Torre, “Using vanishing points for camera calibration,” Int. J. Comput. Vision, vol. 4, pp. 127-139, Mar. 1990.

[25] L.-L. Wang and W.-H. Tsai, “Computing camera parameters using vanishing-line information from a rectangular parallelepiped,” Machine Vision Applicat., vol. 3, pp. 129-141, Summer 1990.

[26] T. Echigo, “A camera calibration technique using three sets of parallel lines,” Machine Vision Applicat., vol. 3, pp. 159-167, Summer 1990.


$$
\frac{\partial p_{2,3}}{\partial t_{k,h}}
= T_{2,3}\,\frac{\partial T_{1,4}}{\partial t_{k,h}}
- T_{1,3}\,\frac{\partial T_{2,4}}{\partial t_{k,h}}
= T_{2,3}\,A_{1,h} - T_{1,3}\,A_{2,h} \qquad (43)
$$

[1] N. Ayache, Artificial Vision for Mobile Robots. Cambridge, MA: MIT Press, 1991.


[27] G.-Q. Wei and Z.-Y. He, “Determining vanishing point and camera parameter: New approaches,” in Proc. 9th Int. Conf. Pattern Recogn., Rome, Italy, Nov. 1988, pp. 450-452.

[28] P. Belluta, G. Collini, A. Verri, and V. Torre, “Navigation by tracking vanishing points,” in Working Notes AAAI Robot Navigat. Symp., Stanford Univ., Mar. 1989, pp. 6-10.

[29] R. S. Weiss, H. Nakatani, and E. M. Riseman, “An error analysis for surface orientation from vanishing points,” IEEE Trans. Pattern Anal. Machine Intell., vol. 12, pp. 1179-1185, Dec. 1990.

[30] A. B. Martinez, J. Climent, M. J. Asensio, and J. Battle, “Absolute positioning for indoor mobile robots guidance,” in Proc. 23rd Int. Symp. Industr. Robots, Barcelona, Spain, Oct. 1992, pp. 529-532.

[31] S. T. Barnard, “Interpreting perspective images,” Artificial Intell., vol. 21, pp. 435-462, Nov. 1983.

[32] M. J. Magee and J. K. Aggarwal, “Determining vanishing points from perspective images,” Comput. Vision, Graphics, Image Process., vol. 26, pp. 256-267, May 1984.

[33] B. Brillault-O'Mahony, “New method for vanishing point detection,” Comput. Vision, Graphics, Image Process., Image Understanding, vol. 54, pp. 289-300, Sept. 1991.

[34] L. Quan and R. Mohr, “Determining perspective structures using hierarchical Hough transform,” Pattern Recogn. Lett., vol. 9, pp. 279-286, May 1989.

[35] R. T. Collins and R. S. Weiss, “Vanishing point calculation as a statistical inference on the unit sphere,” in Proc. 3rd Int. Conf. Comput. Vision, Osaka, Japan, Dec. 1990, pp. 400-403.

[36] S.-P. Liou and R. C. Jain, “Road following using vanishing points,” Comput. Vision, Graphics, Image Process., vol. 39, pp. 116-130, July 1987.

[37] J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. Machine Intell., vol. 8, pp. 679-698, Nov. 1986.

[38] R. Deriche, “Using Canny's criteria to derive an optimal edge detector recursively implemented,” Int. J. Comput. Vision, vol. 1, Apr. 1987.

[39] J. G. Dunham, “Optimum uniform piecewise linear approximation of planar curves,” IEEE Trans. Pattern Anal. Machine Intell., vol. 8, pp. 66-75, Jan. 1986.

[40] O. D. Faugeras, R. Deriche, N. Ayache, F. Lustman, and E. Giuliano, “Depth and Motion Analysis: The machine being developed within Esprit Project 940,” in Proc. IAPR Workshop Comput. Vision, Tokyo, Japan, Oct. 1988, pp. 35-44.

[41] J. B. Burns, A. R. Hanson, and E. M. Riseman, “Extracting straight lines,” IEEE Trans. Pattern Anal. Machine Intell., vol. 8, pp. 425-455, July 1986.

[42] P. Kahn, L. Kitchen, and E. M. Riseman, “A fast line finder for vision-guided robot navigation,” IEEE Trans. Pattern Anal. Machine Intell., vol. 12, pp. 1098-1102, Nov. 1990.

[43] N. Navab, R. Deriche, and O. Faugeras, “Recovering 3D motion and structure from stereo and 2D token tracking cooperation,” in Proc. 3rd Int. Conf. Comput. Vision, Osaka, Japan, Dec. 1990, pp. 513-516.

[44] T. Viéville and O. Faugeras, “Feed-forward recovery of motion and structure from a sequence of 2D-line matches,” in Proc. 3rd Int. Conf. Comput. Vision, Osaka, Japan, Dec. 1990, pp. 517-520.

[45] J. L. Jézouin and N. Ayache, “Computing 3D structure from a monocular sequence of images with known motion,” Tech. Rep., Matra-LTIS, BP 235, 78052 Guyancourt, France, 1991.

[46] -, “3D structure from a monocular sequence of images,” in Proc. 3rd Int. Conf. Comput. Vision, Osaka, Japan, Dec. 1990, pp. 441-445.

[47] D. J. Kriegman, E. Triendl, and T. O. Binford, “Stereo vision and navigation in buildings for mobile robots,” IEEE Trans. Robotics Automat., vol. 5, pp. 792-803, Dec. 1989.

[48] W. M. Wells, “Visual estimation of 3-D line segments from motion - A mobile robot vision system,” IEEE Trans. Robotics Automat., vol. 5, pp. 820-825, Dec. 1989.

[49] -, “Visual estimation of 3-D line segments from motion - A mobile robot vision system,” in Proc. 6th Nat. Conf. Artificial Intell., Seattle, WA, July 1987, pp. 172-176.

[50] D. A. Butler and P. K. Pierson, “A distortion-correction scheme for industrial machine-vision applications,” IEEE Trans. Robotics Automat., vol. 7, pp. 546-551, Aug. 1991.

[51] R. Deriche and O. Faugeras, “Tracking line segments,” in Proc. 1st Euro. Conf. Comput. Vision, Antibes, France. Berlin: Springer-Verlag, Apr. 1990, pp. 259-268.

[52] S. E. Asch and H. A. Witkin, “Studies in space orientation: I. Perception of the upright with displaced visual fields,” J. Experiment. Psychol., vol. 38, pp. 325-337, June 1948.

[53] H. A. Witkin and S. E. Asch, “Studies in space orientation: IV. Further experiments on perception of the upright with displaced visual fields,” J. Experiment. Psychol., vol. 38, pp. 762-782, Dec. 1948.

Xavier Lebègue (S'88) received the Diplôme d'Ingénieur degree from the Ecole Centrale de Lyon, France, and the Master of Engineering degree from Cornell University, Ithaca, NY, in 1988.

He spent a year at INRIA, France, under Dr. Olivier Faugeras. He is presently working toward the Ph.D. degree at the Computer and Vision Research Center, University of Texas at Austin. His research interests include robotics, computer vision, image processing, and video compression.

Mr. Lebègue was the recipient of the Philips Award for Best Student Paper at the 1992 IEEE International Conference on Robotics and Automation, Nice, France.

J. K. Aggarwal (F'76) has served on the faculty of the University of Texas at Austin, College of Engineering, since 1964 and is now the Cullen Professor of Electrical and Computer Engineering and Director of the Computer and Vision Research Center. His research fields include computer vision, parallel processing of images, and pattern recognition.

Dr. Aggarwal is the author or editor of 6 books and 20 book chapters, and the author of over 150 journal papers as well as numerous proceedings papers and technical reports. He served as Chairman of the IEEE Computer Society Technical Committee on Pattern Analysis and Machine Intelligence (1987-1989), Director of the NATO Advanced Research Workshop on Multisensor Fusion for Computer Vision, Grenoble, France (1989), Chairman of the Computer Vision Program Committee of the 10th International Conference on Pattern Recognition, Atlantic City (1990), and General Chairman of the 1993 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. He received the 1992 Senior Research Award of the American Society for Engineering Education. He is the IEEE Computer Society representative to the International Association for Pattern Recognition, President of the International Association for Pattern Recognition, and an Editor of the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS.