Robotics and Autonomous Systems 32 (2000) 207–218

Centering behavior with a mobile robot using monocular foveated vision

Marc Ebner∗, Andreas Zell

Eberhard-Karls-Universität Tübingen, Wilhelm-Schickard-Institut für Informatik, Arbeitsbereich Rechnerarchitektur, Köstlinstraße 6, 72074 Tübingen, Germany

Received 3 December 1997; received in revised form 25 October 1999
Communicated by F.C.A. Groen

Abstract

A system for corridor following based on properties of the human visual system is presented. The robot extracts image features using an interest operator to compute sparse optical flow induced by the translatory motion of the robot. The available status information from the robot is used to compensate for the known rotatory movement of the image. Control of the robot is done by transforming the optical flow to ego-motion complex log space. The difference between the median flow extracted from the left and right peripheral visual areas is used to control the heading of the robot. Forward velocity is controlled by trying to keep the perceived optical flow constant. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Visual robot control; Centering behavior; Complex log mapping

1. Motivation

Control of a mobile robot may be achieved by constructing a complete three-dimensional model of the world, planning a path and then executing the required steps to move the robot along the path. Of course it is desirable to know as much as possible about the environment. However, this strategy requires much computing power which may be used for other, higher-level tasks if a simple control strategy can be employed at the same time. Horswill [17] has pointed out that it is unnecessary to construct a three-dimensional model of the environment simply to reduce it to a single number such as heading inside a corridor.

∗ Corresponding author. Present address: Universität Würzburg, Lehrstuhl für Informatik II, Programmiersprachen und Programmiermethodik, Am Hubland, 97074 Würzburg, Germany. E-mail address: [email protected] (M. Ebner).

Aloimonos [1] introduced purposive vision which only uses the data required to solve the particular task, thereby saving valuable computer resources. This makes it possible to close the loop and achieve vision during action [31]. Bekey [2] offered constructive proofs that biologically inspired robot control can succeed where traditional approaches are difficult or impossible. Like Sandini et al. [31] we are working in the field of visual robot control with an emphasis on vision during action. In this context much may be learnt by looking at how insect vision works [15,16]. Some of this knowledge might be transferable to the field of robot vision [12].

In particular, researchers have been inspired by insects like bees for robot control [6,31–34]. Bees are unable to measure distances larger than a few centimeters using stereo vision because of the small distance between their eyes. However, bees are able to infer the range of an object from the perceived angular velocity



provided that the ego-motion of the eye and the bearing of the object are known [36]. Bees perform centering behavior in a corridor by balancing the speeds of the retinal images in the two eyes [36,37]. The higher the speed of the retinal image, the smaller the distance to the wall. Thus the bee is located exactly in the center of a corridor if the image speeds of the left and right eye are equal. Bees seem to be able to perform this centering behavior independent of the contrast or spatial-frequency content of gratings along the walls.

However, humans have a more advanced visual system which is different from the visual system of the bee. An introduction to the human visual system is given by Tovée [41]. In the human visual system the axis of fixation is facing forward with a rather small field of view [4]. The spatial resolution is much higher in the center (fovea) than in the periphery of the retina. Humans perceive the world as stationary even though the eye may be moving. Information about the desired eye movements is used to compensate for the motion in the image induced by the ego-motion of the eye [42]. If the eye is moved passively with a finger, or if one tries to change gaze with a mechanically fixated eye, the world seems to move. Ego-motion of the head is determined by the vestibular apparatus [4]. Information from the vestibular apparatus could be used to improve the perceived image. In this respect our work has been more inspired by the human visual system than by the visual system of the bee.

The paper is organized as follows. Section 2 reviews relevant research in the field of visual robot control in general and corridor following in particular. Section 3 reviews the complex logarithmic mapping whose properties are important for our work. Section 4 describes our control algorithm in detail. The computation of the ego-motion of the camera from the available status information supplied by our mobile robot is also described. Section 5 describes some of the experiments performed with our mobile robot. The paper ends with a conclusion in Section 6.

2. Background

Much research has been done in the field of visual obstacle avoidance and navigation. In this section we briefly review some of these approaches. Research related to the task of corridor following using optical flow is discussed in more detail below.

2.1. Visual robot navigation

Moravec [25] used a mobile robot with a sliding camera inspired by the motion of lizards to determine a three-dimensional map of interesting points in an environment. A path is planned through the environment and subsequently updated. The robot moves in discrete steps of 1 m.

Brady and Wang [3] determine scene structure in the environment of a mobile robot by extracting corner points. For each point depth is computed using a stereo camera or a structure from motion approach. Brady and Wang suggest that it might not be necessary to explicitly compute depth. Useful information for robot control may be derived directly from optical flow, disparity or rate of change of disparity. Indeed, Tomasi and Kanade [40] have shown that it is possible to extract shape and motion from image sequences without calculating camera-centered depth.

Fossa et al. [11] developed a vision based navigation system using three cameras. Two cameras are used for obstacle detection and avoidance and the third camera is used for self-localization.

Franceschini et al. [12] developed an artificial compound eye with facets which was inspired by the fly's eye. Although control is done continuously the movement of the robot is separated into a purely translational step followed by a rotational step. Franceschini et al. use this type of motion to determine the translatory component of the optical flow.

Horswill [17] developed a simple, cheap and robust visual navigation system. Both the rotational and the translational velocity are under computer control. Corridor following is done by estimating the heading of the robot and calculating a measure for the distance to the left and right walls from visual data. Control of the robot is done in a closed loop except for turning at corners which is performed open-loop. Horswill emphasizes that only the data required for the task at hand has to be extracted from the visual data, e.g. it is not necessary to construct a complete 3D model of the environment just to achieve a corridor following behavior.

Crespi et al. [8] followed a memory-based approach to navigation inside a corridor. First, images labeled


with known lateral displacement and heading of the robot inside the corridor are used during a learning phase. Lateral displacement is discretized into three values and heading into three or four values. During robot control a probability distribution is calculated for the classes which specifies the new translational and rotational velocity.

Sobey [38] developed a zig-zag motion strategy for a monocular robot to avoid collisions. First an orientation movement is performed to determine the distance to the objects 60° on both sides of the goal. All subsequent movements are made in a direction for which a safe distance to travel is known, using a potential field method. A change of direction between 20° and 120° is made for every move which results in a zig-zag motion. Range values are computed only for a horizontal strip of the image. Sobey has used an action-perception-planning cycle which acquires 16 images for every translatory motion.

Vogelgesang et al. [43] developed an algorithm for depth perception from radial optical flow fields. They propose to use their algorithm with a mobile robot. However, if the heading of the robot changes, the algorithm has to be reset.

Jochem and Pomerleau [21] developed a very successful vision-based system for driving on outdoor roads which is also able to perform tactical maneuvers such as lane changing. The system is based on an artificial neural net trained by watching a human teacher.

Košecká [22] represented the environment as a place graph and used visual servoing to move from one place to the next. It is assumed that world coordinates of the landmarks used for navigation are known.

2.2. Centering behavior

Coombs and Roberts [6] used two cameras to achieve a centering behavior inspired by the honeybee. The cameras are located at an angle of 60° from the front of the robot. Coombs and Roberts determine gradient-parallel optical flow at full spatial resolution for each of the cameras. The maximum optical flow is determined by a histogram method. Both maxima are compared to calculate the desired heading of the robot. Coombs and Roberts note that this approach works reliably as long as the focus of expansion does not enter the image. The desired heading is saturated to prevent this from happening. The gaze of the

camera is stabilized as the robot rotates due to the architecture of the robot. An alternate strategy proposed by Coombs and Roberts would be to subtract the optical flow, which is observed around the focus of expansion, from the peripheral optical flow.

Coombs et al. [5] use peripheral normal flow obtained from a camera with a wide lens to achieve centering behavior. Another camera with a narrow lens is used to estimate time-to-contact from flow divergence to detect imminent collisions. This information is used to turn the robot around in a "dead end". The cameras are rotationally stabilized by rotating them in the direction opposite to the rotation of the robot's body. A saccade is made if the camera orientation differs too much from the heading of the robot to align them again.

Santos-Victor et al. [33,34] equipped a mobile robot with two laterally pointing cameras to achieve a centering reflex also motivated by the behavior of honeybees. This basic centering reflex was extended with a sustaining mechanism which allows the robot to navigate whenever optical flow is only available from one of the cameras. Due to their camera setup Santos-Victor et al. assume that only horizontal optical flow can occur which simplifies calculation of optical flow. Average optical flow from the left and right camera is used to control the robot. Santos-Victor et al. derived constraints that have to be met to control the robot during rotations. By choosing the minimum translational velocity, the maximum rotational velocity and the setup of the cameras appropriately these constraints may be met. Santos-Victor et al. also note the possibility of suppressing robot control during times when the constraints are not met. Both possibilities worked successfully with their robot. They control the rotational velocity of the robot with a PID controller. Translational velocity is controlled by keeping the desired optical flow at a rate of 2 pixels/frame.

Neven and Schöner [27] developed a behavior-based vision-guided approach to homing and obstacle avoidance. Time to contact is estimated from optical flow which is used to control heading direction and velocity. In the absence of reliable visual information the robot uses information from dead reckoning. Neven and Schöner compute optical flow and time to contact from coarsely sampled images with two cameras (left and right). Neven and Schöner separate translatory and rotatory motions of the robot. Both forward


and angular velocity are controlled by the estimated time to contact information. Time to contact estimates close to the center of the image are excluded because the peripheral information is more reliable. Small rotatory motions that may occur even if the robot is supposed to perform only a translatory motion along the optical axis are removed by subtracting the average flow along the horizontal and vertical meridian.

The approach presented here works with a single camera with the focus of expansion in view. Data is gathered in a purposive way. Control is done continuously in a closed loop. Our approach differs from the previous ones by Coombs and Roberts, Coombs et al., Santos-Victor et al., and Neven and Schöner. Information about the ego-motion of the camera obtained from the robot's status information is used to compensate for the rotatory motion of the camera. By using sensed velocities instead of the desired velocities a separate algorithm may be used to control the gaze of the camera. Obviously a centering behavior cannot be achieved with one camera pointing sideways. Nevertheless it should be possible to achieve centering behavior as long as both walls are visible and sufficient visual information is present in the visual areas which are used to control the robot.

3. Ego-motion complex logarithmic mapping

Since we are working with a camera with the focus of expansion in view we use the complex logarithmic mapping to transform the optical flow calculated in image coordinates to complex log space. This transformation simplifies the comparison of radial optical flow. The complex logarithmic mapping has already been extensively studied by several researchers and

Fig. 1. Image, image in complex log space and its inverse.

has been used in a number of different areas such as character recognition [28], template matching [44], extraction of moving objects [13,18], motion stereo [19] and object detection and centering [29]. Complex logarithmic mapping is pseudo-invariant to size, rotation, and projection scaling [35]. For a fixated object, changes of size or rotation lead to a linear shift of an invariant image. The complex logarithmic mapping performs a conformal mapping of the log polar plane to a Cartesian plane [39]. The ego-motion complex logarithmic mapping is defined as [20]:

r(x, y) = log √((x − x_FOE)² + (y − y_FOE)²),  (1)

θ(x, y) = tan⁻¹((y − y_FOE) / (x − x_FOE)).  (2)

An image transformed with this mapping is shown in Fig. 1. The ego-motion complex logarithmic mapping differs from the complex logarithmic mapping in that it is not taken about the origin. Instead it is taken about the focus of expansion (x_FOE, y_FOE). For a forward moving observer, stationary objects in the field of view only produce horizontal flow in ego-motion complex log space [18]. Jain et al. [19] have shown that (assuming a focal length f = 1):

dr/dZ = −1/Z  and  dθ/dZ = 0.  (3)

Thus the optical flow in ego-motion complex log space is a direct measure for the distance Z to the objects in view for a known translatory motion.
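As an illustration of how this mapping can be applied to individual image points, the following Python sketch (not the original implementation; image size and FOE position are arbitrary) maps a pixel position into ego-motion complex log coordinates according to Eqs. (1) and (2); a two-argument arctangent is used so that the full angular range is recovered.

```python
import numpy as np

def ego_motion_complex_log(x, y, x_foe, y_foe):
    """Map an image point (x, y) into ego-motion complex log coordinates
    (r, theta) taken about the focus of expansion, following Eqs. (1)
    and (2); arctan2 resolves the quadrant of the angle."""
    dx, dy = x - x_foe, y - y_foe
    r = np.log(np.sqrt(dx ** 2 + dy ** 2))   # Eq. (1)
    theta = np.arctan2(dy, dx)               # Eq. (2)
    return r, theta

# Example for an assumed 128 x 128 image with the FOE at the image center.
print(ego_motion_complex_log(100.0, 70.0, 64.0, 64.0))
```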

Tistarelli and Sandini [39] also analyzed the polar and log polar mapping and determined several advantages that may be gained by using a polar or log polar mapping. In addition to the already mentioned invariance property to scaling and rotations the


complex log mapping also performs a data reduction due to the non-uniform sampling of the image. Motion equations which relate ego-motion to the optical flow are simplified and time to impact may be easily calculated using a complex log mapping.

Schwartz [35] has shown that the retino-striate mappings of the rhesus, squirrel and owl monkey may be described by a complex logarithmic mapping. With the direction of the optical axis facing forward humans could be using the complex logarithmic mapping to achieve obstacle avoidance or centering behavior. We are not suggesting that human centering behavior is actually achieved the way we constructed our algorithm. We are investigating if it is indeed possible to achieve centering behavior with monocular vision and the focus of expansion directly in view for robot control using the complex logarithmic mapping.

4. Control of a mobile robot using foveated vision

The control algorithm is divided into three parts. First, relevant information is extracted from the images by locating interesting points with an interest operator. These points are then used to compute a sparse optical flow field which is transformed into complex log space. An image with a sparse optical flow field in image coordinates is shown in Fig. 2. Finally, the difference between the left and right optical flows is used to control the robot. The input to the algorithm consists of an image I(t2) taken at time t2. The algorithm works on two images at any given time. The data from the previous iteration of the control algorithm, image I(t1), is stored internally.

Fig. 2. Image, image with sparse optical flow, previous image compensated for rotatory camera movement, difference picture between image and previous image compensated for rotatory camera movement.

4.1. Feature extraction

Interesting points are located in the input images, which simplifies the computation of optical flow. Let the interesting points of image I(t2) be F(t2). Because of the aperture problem [20] dense optical flow describing the actual motion is difficult to compute. Most methods compute optical flow in an iterative way (e.g. Horn and Schunck [14]). Due to temporal Gaussian smoothing, several images are often needed [34] which introduces an additional delay because optical flow is only computed for the center image. Santos-Victor et al. [34] therefore separate image acquisition and evaluation by grabbing a set of five images at video rate and then analyzing the data. In contrast to the work of Santos-Victor et al. we only grab one image for each iteration of the control algorithm.

To detect interesting points in the image, we are using the Moravec interest operator [24,25]. It is a simple operator which computes the local maxima of the minimum variance of local image intensities in the horizontal, vertical and both diagonal directions. Computation of optical flow for a set of interesting points is easier than the computation of full optical flow due to the aperture problem [20]. The type of features relevant to control the robot obviously depends on the environment of the robot. Using genetic programming [23] (a class of evolutionary algorithms which can be used to evolve variable-sized, hierarchical individuals) we are evolving interest operators which are optimal according to some measure. This would make the first stage of the algorithm, extraction of interesting points for the calculation of optical flow, adaptive. We previously evolved edge and interest operators [9,10] by directly incorporating the output of the desired


operator into the fitness calculation and are now working towards adaptive feature extraction where only the desired qualities of the operators are specified.
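A compact version of the interest operator as described above might look as follows in Python. This is a simplified sketch rather than the implementation used on the robot; the window size and threshold are arbitrary choices.

```python
import numpy as np

def moravec_interest(image, window=4, threshold=100.0):
    """Simplified Moravec interest operator: for every pixel compute the
    intensity variance (sum of squared successive differences) along the
    horizontal, vertical and the two diagonal directions inside a small
    window, keep the minimum of the four values, and return the positions
    where this minimum is a local maximum above a threshold."""
    img = image.astype(float)
    h, w = img.shape
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]   # horizontal, vertical, diagonals
    interest = np.zeros_like(img)
    for y in range(window, h - window - 1):
        for x in range(window, w - window - 1):
            variances = []
            for dy, dx in directions:
                v = sum((img[y + i * dy, x + i * dx] -
                         img[y + (i + 1) * dy, x + (i + 1) * dx]) ** 2
                        for i in range(-window, window))
                variances.append(v)
            interest[y, x] = min(variances)
    # Keep only local maxima of the interest measure (3 x 3 neighbourhood).
    points = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            v = interest[y, x]
            if v > threshold and v == interest[y - 1:y + 2, x - 1:x + 2].max():
                points.append((x, y))
    return points

# Example on a random 128 x 128 test image.
print(len(moravec_interest(np.random.randint(0, 255, (128, 128)))))
```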

4.2. Sparse optical flow in ego-motion complex log space

The ego-motion complex log mapping is only defined for a translatory movement of the observer. However, to control an arbitrarily moving mobile robot we compensate for any rotatory movements of the camera. A difference image between the current image and the previous image, compensated for rotatory movements of the camera, is shown in Fig. 2. This image shows how accurately the compensation of rotatory ego-motion is performed. We compensate for the rotatory movement of the camera by transforming the image to subtract this component from the optical flow. This leaves only the translatory component of the optical flow which is relevant to the centering behavior. We are using the robot's status information (translational and rotational velocity, pan and tilt angles) to predict the image I(t2) at time t2 from image I(t1) due to the rotatory motion of the camera. It is assumed that the robot moves exactly as is derived from the status information. Indeed, the human visual system makes a similar assumption as has already been described above.

In the following text we are using the notation of Craig [7] to describe coordinate transformations. Let ^{C(t1)}P = [X(t1), Y(t1), Z(t1), 1] be a point in the camera frame {C(t1)} at time t1. Then the coordinates of this point are ^{C(t2)}P = [X(t2), Y(t2), Z(t2), 1] in the camera frame {C(t2)} at time t2. Let ^{C(t2)}_{C(t1)}T be the homogeneous transformation from the camera frame {C(t1)} to the camera frame {C(t2)} (Fig. 3). Thus we have

^{C(t2)}P = ^{C(t2)}_{C(t1)}T · ^{C(t1)}P,  (4)

[X(t2)]   [r11 r12 r13 tx] [X(t1)]
[Y(t2)] = [r21 r22 r23 ty] [Y(t1)]
[Z(t2)]   [r31 r32 r33 tz] [Z(t1)]
[  1  ]   [ 0   0   0   1] [  1  ].  (5)

Fig. 3. Movement of the camera.

Using perspective projection x(t) = f(X(t)/Z(t)) and y(t) = f(Y(t)/Z(t)) with focal length f gives the following transformation for every image point for a rotatory camera motion [26]:

x(t2) = f (r11 x(t1) + r12 y(t1) + f r13) / (r31 x(t1) + r32 y(t1) + f r33),  (6)

y(t2) = f (r21 x(t1) + r22 y(t1) + f r23) / (r31 x(t1) + r32 y(t1) + f r33).  (7)

Sparse optical flow is computed by matching image intensities in a local area around the interesting points. We match the interesting points F(t2) from the new image at time t2 with the interesting points from the predicted image, i.e. the previous image compensated for camera rotations. The search is constrained to those points that radiate outward. The quality of the match (squared differences in a local area around the points) is scaled with the distance of the match. Thus preference is given to a close match. After the interesting points in the predicted image and the actual image I(t2) have been matched, sparse optical flow has been determined in the original image.
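The matching step can be sketched as follows. The quadratic search, the exact form of the distance scaling and the patch helper are illustrative assumptions rather than the original procedure, and interest points are assumed to lie far enough from the image border for a full window to be extracted.

```python
import numpy as np

def patch(img, x, y, w):
    """Extract a (2w+1) x (2w+1) intensity window around (x, y)."""
    xi, yi = int(round(x)), int(round(y))
    return img[yi - w:yi + w + 1, xi - w:xi + w + 1].astype(float)

def match_radially(points_pred, points_new, img_pred, img_new, foe, window=3):
    """Associate each predicted interest point with an interest point in
    the new image. Candidates must lie at least as far from the focus of
    expansion as the predicted point (the flow radiates outward), and the
    squared intensity difference in a local window is scaled with the
    match distance so that close matches are preferred. Returns a list of
    ((x1, y1), (x2, y2)) flow vectors in image coordinates."""
    flows = []
    for (xp, yp) in points_pred:
        rp = np.hypot(xp - foe[0], yp - foe[1])
        best, best_cost = None, np.inf
        for (xn, yn) in points_new:
            rn = np.hypot(xn - foe[0], yn - foe[1])
            if rn < rp:                      # keep only outward candidates
                continue
            p1 = patch(img_pred, xp, yp, window)
            p2 = patch(img_new, xn, yn, window)
            if p1.shape != p2.shape:         # skip points too close to the border
                continue
            d = np.hypot(xn - xp, yn - yp)
            cost = np.sum((p1 - p2) ** 2) * (1.0 + d)   # penalize distant matches
            if cost < best_cost:
                best, best_cost = (xn, yn), cost
        if best is not None:
            flows.append(((xp, yp), best))
    return flows
```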

The optical flow radiates outward from the focus of expansion for a forward motion of the camera. Provided that peripheral optical flow is available from the left and right sides, these flow vectors may be used to achieve a centering behavior with the focus of expansion in view. The focus of expansion is calculated from the status information of the robot:

[x_FOE, y_FOE] = f [tx/tz, ty/tz].  (8)

The sparse optical flow is then transformed to complex log space according to

r = (r_max / log r_src) log √((x − x_FOE)² + (y − y_FOE)²),  (9)

θ = (θ_max / 2π) (π + tan⁻¹(−(x − x_FOE) / (y − y_FOE))),  (10)


where r and θ are the coordinates in ego-motion complex log space and r_src is the maximum radius in coordinates of the source image. The number of pixels in complex log space used for the radial direction is specified by r_max and θ_max is the number of pixels used for the angular direction. Thus we have r_min ≤ r ≤ r_max and 0 ≤ θ ≤ θ_max with

r_min = (r_max / log r_src) log(θ_max / 2π).

Optical flow is computed first for the acquired image and then transformed to complex log space because the untransformed images have a much larger resolution than the transformed ones. Of course optical flow could be computed directly for the already transformed images. However, due to the reduced image size, the number of interesting points that can be extracted from the transformed image is reduced. We experimented with several different methods, e.g. extraction of features from the transformed image, calculation of full optical flow for the original images and the transformed images. The approach described here was found to be the one that worked best.

We only use the optical flow located in the areas θ_lmin ≤ θ ≤ θ_lmax, θ_rmin ≤ θ ≤ θ_rmax and r′_min ≤ r ≤ r_max as shown in Fig. 4. For our experiments described below we chose the values θ_lmin = 45°, θ_lmax = 135°, θ_rmin = 225° and θ_rmax = 315°. This effectively excludes the optical flow from the ceiling and the floor which cannot be used to control the robot. We also excluded the area directly around the focus of expansion since the optical flow is usually small in this area and might be further disturbed due to incorrect compensation for the rotatory motion of the camera. Since it is impossible to determine exact data from sensor information we exclude this area because its information is too unreliable. For our experiments we have chosen r′_min = 0.75(r_max − r_min) + r_min. Let f_l and f_r be the median of the optical flow extracted from the left and right peripheral areas, respectively.

Fig. 4. Areas of complex logarithmic space used to control the robot.
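A sketch of this transformation and of the selection of the peripheral flow values is given below, using the parameter values reported in Section 5 (r_max = 32, θ_max = 32, r_src = 64) as defaults. The assignment of the two angular windows to the left and right image halves and the use of the radial component r2 − r1 as the flow value are assumptions of this sketch.

```python
import numpy as np

def to_complex_log(x, y, foe, r_max=32, theta_max=32, r_src=64):
    """Transform an image point into complex log coordinates according to
    Eqs. (9) and (10); r_src is the maximum radius in source-image
    coordinates, r_max and theta_max the resolution of the log-polar image."""
    dx, dy = x - foe[0], y - foe[1]
    r = r_max / np.log(r_src) * np.log(np.hypot(dx, dy))
    theta = theta_max / (2 * np.pi) * (np.pi + np.arctan2(-dx, dy))
    return r, theta

def peripheral_medians(flow, foe, r_max=32, theta_max=32, r_src=64):
    """Median radial flow in the two peripheral areas of Fig. 4. flow is
    a list of ((x1, y1), (x2, y2)) matches in image coordinates; angular
    windows of 45-135 and 225-315 degrees and the outer quarter of the
    radial range are used, as described in the text."""
    r_min = r_max / np.log(r_src) * np.log(theta_max / (2 * np.pi))
    r_lo = 0.75 * (r_max - r_min) + r_min            # r'_min
    to_deg = lambda t: 360.0 * t / theta_max
    left, right = [], []
    for (x1, y1), (x2, y2) in flow:
        r1, t1 = to_complex_log(x1, y1, foe, r_max, theta_max, r_src)
        r2, _ = to_complex_log(x2, y2, foe, r_max, theta_max, r_src)
        if not (r_lo <= r1 <= r_max):
            continue
        if 45.0 <= to_deg(t1) <= 135.0:
            left.append(r2 - r1)                     # radial flow component
        elif 225.0 <= to_deg(t1) <= 315.0:
            right.append(r2 - r1)
    f_l = np.median(left) if left else None
    f_r = np.median(right) if right else None
    return f_l, f_r
```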

4.3. Computation of camera motion

The homogeneous transformation of the camera movement may be calculated directly from the status information of the robot. Let {C} be the camera coordinate system and {R} the coordinate system of the robot base. The coordinate frames for our camera setup are shown in Fig. 5. We split the camera transformation into three parts. The first describes the location of the camera relative to the robot's base at time t2, the next describes the movement of the robot and the last describes the location of the camera relative to the robot at time t1. Then the homogeneous transform ^{C(t2)}_{C(t1)}T, which transforms coordinates in the camera frame {C(t1)} to coordinates in the camera frame {C(t2)}, may be calculated as

^{C(t2)}_{C(t1)}T = ^{C(t2)}_{R(t2)}T · ^{R(t2)}_{R(t1)}T · ^{R(t1)}_{C(t1)}T,  (11)

where ^{C(t2)}_{R(t2)}T is the transformation from the coordinate system of the robot {R(t2)} to the coordinate system of the camera {C(t2)} at time t2, ^{R(t2)}_{R(t1)}T describes the robot movement and ^{R(t1)}_{C(t1)}T transforms the coordinate system of the camera {C(t1)} to the coordinate system of the robot {R(t1)} at time t1. ^{R(t2)}_{R(t1)}T is calculated from the status information of the robot as

^{R(t2)}_{R(t1)}T = R_Z(−ω(t2 − t1)) · D_X(−v(t2 − t1)),  (12)

where D_X describes the translatory motion of the robot along the X-axis with translational velocity v and R_Z the rotatory motion of the robot with rotational velocity ω. The matrices relating the camera coordinate system to the base coordinate system are computed using standard manipulator kinematics.
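Eqs. (11) and (12) amount to composing three homogeneous transforms from the sensed velocities and the kinematics of the camera mount. Below is a small Python sketch with a purely illustrative camera mounting; the real matrices come from the manipulator kinematics mentioned above.

```python
import numpy as np

def rot_z(angle):
    """Homogeneous rotation about the Z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def trans_x(d):
    """Homogeneous translation along the X-axis."""
    T = np.eye(4)
    T[0, 3] = d
    return T

def camera_motion(T_cam_robot_t2, T_robot_cam_t1, v, omega, dt):
    """Compose the camera motion between two frames, Eqs. (11) and (12):
    camera pose relative to the base at t2, robot motion from the sensed
    translational velocity v and rotational velocity omega over dt, and
    camera pose relative to the base at t1."""
    T_robot_motion = rot_z(-omega * dt) @ trans_x(-v * dt)   # Eq. (12)
    return T_cam_robot_t2 @ T_robot_motion @ T_robot_cam_t1  # Eq. (11)

# Example with an assumed fixed camera mounted 1.2 m above the base
# (purely illustrative numbers).
T_robot_cam = np.eye(4); T_robot_cam[2, 3] = 1.2    # {R} <- {C}
T_cam_robot = np.linalg.inv(T_robot_cam)            # {C} <- {R}
print(camera_motion(T_cam_robot, T_robot_cam, v=0.4, omega=0.1, dt=0.5))
```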

4.4. Robot control

A PID controller [7] was implemented to control the rotational velocity of the robot. The gains have been determined experimentally and were set to


k_p = 5.0, k_i = 0.1 and k_d = 0.005 during the experiments described here. The control law is given by

ω_d(t) = k_p e(t) + k_i ∫ e(τ) dτ + k_d ė(t),  (13)

where e(t) = f_l − f_r is the difference between the extracted flow on the left and right peripheral visual areas in complex log space and ω_d specifies the desired rotational velocity of the robot. In case no interesting points can be determined on one side, the robot moves towards the other side with a constant velocity. This behavior is inspired by the work of Sobey [38]: it is safer to move in a direction for which the distance to the nearest obstacle is known. If the robot turns towards the other wall it will be repelled by it as soon as some of the interesting points of that wall enter the peripheral area.
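A minimal sketch of this heading controller, using the gains given above; the discrete approximations of the integral and derivative terms and the sign convention of the output are assumptions of this sketch.

```python
class HeadingPID:
    """PID controller for the desired rotational velocity, Eq. (13),
    with the gains used in the experiments as defaults."""
    def __init__(self, kp=5.0, ki=0.1, kd=0.005):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, f_left, f_right, dt):
        """e(t) = f_l - f_r; returns the desired rotational velocity."""
        e = f_left - f_right
        self.integral += e * dt
        derivative = 0.0 if self.prev_error is None else (e - self.prev_error) / dt
        self.prev_error = e
        return self.kp * e + self.ki * self.integral + self.kd * derivative

# Example step: right peripheral flow larger than left, so the robot should
# turn away from the right wall (sign convention assumed here).
pid = HeadingPID()
print(pid.update(f_left=1.2, f_right=2.0, dt=0.5))
```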

The translational velocity v_d of the robot is set according to the control law

v_d = v_max              if (f_d/f_a) v_max > v_max,
v_d = (f_d/f_a) v_max    if v_min ≤ (f_d/f_a) v_max ≤ v_max,    (14)
v_d = v_min              if (f_d/f_a) v_max < v_min,

where v_min and v_max are the minimum and maximum translational velocities, f_d is the desired radial optical flow in the original image and f_a is computed from the maximum of f_l and f_r, which is transformed back into the original image. The control law could also have been formulated by transforming the desired optical flow to complex log space. This control law tries

Fig. 5. Colin, a Real World Interface B21 robot and the camera model with attached frames.

to keep the optical flow constant at about f_d by slowing the robot down as it approaches a wall. The speed of the robot is increased again as the actual optical flow decreases. Santos-Victor et al. [34] also used velocity control to keep the measured optical flow constant, using a sigmoid function for the velocity-control loop.
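A sketch of the velocity control law of Eq. (14), with the limits and desired flow taken from the experiments in Section 5. The handling of a vanishing measured flow is an assumption here, and the back-transformation of the peripheral flow into image coordinates is omitted.

```python
def translational_velocity(f_l, f_r, f_d=3.0, v_min=0.20, v_max=0.55):
    """Translational velocity control, Eq. (14): keep the measured radial
    optical flow near the desired value f_d by scaling the maximum
    velocity with f_d / f_a, clamped to [v_min, v_max]. f_a is taken as
    the larger of the two peripheral flows; velocities are in m/s."""
    f_a = max(f_l, f_r)
    if f_a <= 0.0:
        return v_max          # no flow measured: drive at v_max (assumption)
    v = f_d / f_a * v_max
    return min(max(v, v_min), v_max)

print(translational_velocity(2.0, 4.0))   # flow above f_d -> slow down
```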

A distributed client–server architecture is used to control the robot. Therefore, to obtain accurate status information from the robot, a delay of 150 ms is used after a change of velocity and the old image is discarded. This allows us to achieve accurate compensation of the rotatory motion of the camera. The PID controller produces a smooth trajectory with only small rotations. Therefore we used a much simpler control strategy for most of our experiments. This strategy turns the robot to the left with constant velocity if the optical flow in the peripheral area is larger on the right side than on the left, and to the right if the optical flow on the left is larger than the flow on the right. Only in the rare case where both flows are exactly equal do we stop the rotatory movement; however, this almost never happens. This control strategy produces an oscillatory behavior where the robot is constantly turning to one side or the other. Because we want to perform robot control during rotatory motions this is a suitable strategy to test our robot.
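The simple strategy can be written down in a few lines; the turning rate and the sign convention are assumptions of this sketch.

```python
def simple_heading_command(f_l, f_r, omega_turn=0.2):
    """Simple control strategy described above: turn left with a constant
    rotational velocity if the right peripheral flow is larger, turn
    right if the left flow is larger, and stop turning only if both are
    exactly equal. Magnitude and sign convention (positive = left turn)
    are assumptions."""
    if f_r > f_l:
        return omega_turn      # right wall appears closer: turn left
    if f_l > f_r:
        return -omega_turn     # left wall appears closer: turn right
    return 0.0

print(simple_heading_command(1.5, 2.5))   # -> turn left
```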

5. Experiments

For our experiments we used Colin, a Real World Interface B21 mobile robot (Fig. 5) with a Directed Perception pan-tilt unit. We only used one camera for


Fig. 6. Path of the robot recorded from odometry data for different starting positions. The two on the left were recorded using the simple controller and the two on the right were recorded with the PID controller. The images along the path are a subset of the images that were recorded during execution of the algorithm. The path is overlaid on a map of the environment.


the experiments described here. The camera setup can be seen in Fig. 5. Images were acquired with a resolution of 128 × 128 pixels and the transformation to complex log space was done with r_max = 32, r_src = 64 and θ_max = 32. The experiments were done in the corridor at our lab. The width of the corridor was approximately 1.41 m. The environment was not modified by attaching structured wallpaper which would have simplified the computation of optical flow. However, the environment did contain several posters mounted on the walls along the corridor.

With this setup we conducted a series of experiments with the simple controller and with the PID controller. During the experiments with the simple controller the robot was moving with a constant forward velocity of 40 cm/s. During the experiments with the PID controller the minimum translational velocity v_min was set to 20 cm/s and the maximum translational velocity v_max was set to 55 cm/s. The desired optical flow in the original image was set to 3 pixels. Fig. 6 shows the results for the simple controller and for the PID controller. We recorded the path travelled by the robot using the robot's odometry data. The images shown are a subset of the same images which were used to control the robot during the runs.

As can be seen in the images, after a distance of approximately 11 m the corridor is 0.35 m wider than the remainder of the corridor. This part of the corridor is approximately 3.21 m long. Just before this part of the corridor there is a staircase located on the right side of the corridor. In these areas the path of the robot is slightly offset to the right side of the corridor. The path recorded with the PID controller is much smoother than the one using the simple controller. During the runs which started in the middle of the corridor 126 images (1.77 frames per second) were recorded with the simple controller and 125 images (1.73 frames per second) with the PID controller. The robot travelled 27.1 m and 26.9 m, respectively, during the runs. Distance was measured using the robot's odometry.

6. Conclusion and ongoing research

We have demonstrated that centering behavior may also be achieved with a single camera with the focus of expansion in view. Comparison of the optical flow for the left and right peripheral areas is done in ego-motion complex log space. Since only the optical flow due to translation of the robot may be used for the centering behavior we compensate for the image movement which is due to the rotatory component of the camera motion. To compensate for the image movement we use the status information which is directly available from the robot. Using the available status information may speed up robot control by eliminating the need to derive the ego-motion of the robot from the images.

Acknowledgements

This work was supported in part by a scholarship according to the Landesgraduiertenförderungsgesetz to Marc Ebner.

For image processing we have used the Vista software environment [30].

References

[1] Y. Aloimonos (Ed.), Active Perception, Lawrence Erlbaum, Hillsdale, NJ, 1993.

[2] G.A. Bekey, Biologically inspired control of autonomous robots, Robotics and Autonomous Systems 18 (1996) 21–31.

[3] M. Brady, H. Wang, Vision for mobile robots, Philosophical Transactions of the Royal Society of London, Series B 337 (1992) 341–350.

[4] R.H.S. Carpenter, Movements of the Eyes, 2nd edn., Pion Limited, London, 1988.

[5] D. Coombs, M. Herman, T.-H. Hong, M. Nashman, Real-time obstacle avoidance using central flow divergence, and peripheral flow, IEEE Transactions on Robotics and Automation 14 (1) (1998) 49–59.

[6] D. Coombs, K. Roberts, “Bee-bot”: Using peripheral optical flow to avoid obstacles, in: D. Casasent (Ed.), Intelligent Robots and Computer Vision XI, Proceedings of the Society of Photo-Optical Instrumentation Engineers, 1992, pp. 714–721.

[7] J.J. Craig, Introduction to Robotics: Mechanics and Control, 2nd edn., Addison-Wesley, Reading, MA, 1989.

[8] B. Crespi, C. Furlanello, L. Stringa, A memory-based approach to navigation, Biological Cybernetics 69 (1993) 385–393.

[9] M. Ebner, On the evolution of edge detectors for robot vision using genetic programming, in: H.-M. Groß (Ed.), Workshop SOAVE’97 Selbstorganisation von Adaptivem Verhalten, VDI, Düsseldorf, 1997, pp. 127–134.


[10] M. Ebner, On the evolution of interest operators using genetic programming, in: R. Poli, W.B. Langdon, M. Schoenauer, T. Fogarty, W. Banzhaf (Eds.), Late Breaking Papers at EuroGP’98: The First European Workshop on Genetic Programming, Paris, France, April 1998, pp. 6–10.

[11] M. Fossa, E. Grosso, F. Ferrari, M. Magrassi, G. Sandini, M. Zapendouski, A visually guided mobile robot acting in indoor environments, in: Proceedings of the IEEE Workshop on Applications of Computer Vision, Palm Springs, CA, IEEE, New York, 1992, pp. 308–316.

[12] N. Franceschini, J.M. Pichon, C. Blanes, From insect vision to robot vision, Philosophical Transactions of the Royal Society of London, Series B 337 (1992) 283–294.

[13] J. Frazier, R. Nevatia, Detecting moving objects from a moving platform, in: Proceedings of the DARPA Image Understanding Workshop, Pittsburgh, PA, 1990, pp. 348–355.

[14] B.K.P. Horn, B.G. Schunck, Determining optical flow, Artificial Intelligence 17 (1981) 185–203.

[15] G.A. Horridge, A theory of insect vision: Velocity parallax, Proceedings of the Royal Society, London, Series B 229 (1981) 13–27.

[16] G.A. Horridge, What can engineers learn from insect vision? Philosophical Transactions of the Royal Society of London, Series B 337 (1992) 271–282.

[17] I. Horswill, A simple, cheap, and robust visual navigation system, in: J.-A. Meyer, H.L. Roitblat, S.W. Wilson (Eds.), From Animals to Animats 2: Proceedings of the Second International Conference on Simulation of Adaptive Behavior, Honolulu, Hawaii, 1992, MIT Press, Cambridge, MA, 1993, pp. 129–136.

[18] R. Jain, Segmentation of frame sequences obtained by a moving observer, IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (5) (1987) 624–629.

[19] R. Jain, S.L. Bartlett, N. O’Brian, Motion stereo using ego-motion complex logarithmic mapping, IEEE Transactions on Pattern Analysis and Machine Intelligence 9 (3) (1987) 356–369.

[20] R. Jain, R. Kasturi, B.G. Schunck, Machine Vision, McGraw-Hill, New York, 1995.

[21] T. Jochem, D. Pomerleau, Life in the fast lane: The evolution of an adaptive vehicle control system, Artificial Intelligence Magazine 17 (2) (1996) 11–50.

[22] J. Košecká, Visually guided navigation, Robotics and Autonomous Systems 21 (1997) 37–50.

[23] J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press, Cambridge, MA, 1992.

[24] H.P. Moravec, Towards automatic visual obstacle avoidance, in: Proceedings of the Fifth International Joint Conference on Artificial Intelligence, Cambridge, MA, 1977, p. 584.

[25] H.P. Moravec, Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover, Ph.D. Thesis, Computer Science Department, Stanford University, No. STAN-CS-80-813 and AIM-340, September 1980.

[26] D. Murray, A. Basu, Motion tracking with an active camera, IEEE Transactions on Pattern Analysis and Machine Intelligence 16 (5) (1994) 449–459.

[27] H. Neven, G. Schöner, Dynamics parametrically controlled by image correlations organize robot navigation, Biological Cybernetics 75 (1996) 293–307.

[28] P.-W. Ong, R.S. Wallace, E.L. Schwartz, Space-variant optical character recognition, in: Proceedings of the 11th International Conference on Pattern Recognition, IEEE, New York, 1992, pp. 504–507.

[29] R.A. Peters II, M. Bishay, Centering peripheral features in an indoor environment using a binocular log-polar 4 dof camera head, Robotics and Autonomous Systems 18 (1996) 271–281.

[30] A.R. Pope, D.G. Lowe, Vista: A software environment for computer vision research, in: Proceedings of the 1994 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, New York, 1994, pp. 768–772.

[31] G. Sandini, F. Gandolfo, E. Grosso, M. Tistarelli, Vision during action, in: Y. Aloimonos (Ed.), Active Perception, Lawrence Erlbaum, Hillsdale, NJ, 1993, pp. 151–190.

[32] J. Santos-Victor, G. Sandini, Embedded visual behaviors for navigation, Robotics and Autonomous Systems 19 (1997) 299–313.

[33] J. Santos-Victor, G. Sandini, F. Curotto, S. Garibaldi, Divergent stereo for robot navigation: learning from bees, in: Proceedings of Computer Vision and Pattern Recognition, New York, 1993, pp. 434–439.

[34] J. Santos-Victor, G. Sandini, F. Curotto, S. Garibaldi, Divergent stereo in autonomous navigation: From bees to robots, International Journal of Computer Vision 14 (1995) 159–177.

[35] E.L. Schwartz, Computational anatomy and functional architecture of striate cortex: A spatial mapping approach to perceptual coding, Vision Research 20 (1980) 645–669.

[36] M.V. Srinivasan, Distance perception in insects, Current Directions in Psychological Science 1 (1) (1992) 22–26.

[37] M.V. Srinivasan, How bees exploit optic flow: Behavioural experiments and neural models, Philosophical Transactions of the Royal Society of London, Series B 337 (1992) 253–259.

[38] P.J. Sobey, Active navigation with a monocular robot, Biological Cybernetics 71 (1994) 433–440.

[39] M. Tistarelli, G. Sandini, On the advantages of polar and log-polar mapping for direct estimation of time-to-impact from optical flow, IEEE Transactions on Pattern Analysis and Machine Intelligence 15 (4) (1993) 401–410.

[40] C. Tomasi, T. Kanade, Factoring image sequences into shape and motion, in: Proceedings of the IEEE Workshop on Visual Motion, Nassau Inn, Princeton, NJ, 7–9 October 1991, IEEE Computer Society Press, Silver Spring, MD, 1991, pp. 21–28.

[41] M.J. Tovée, An Introduction to the Visual System, Cambridge University Press, Cambridge, 1996.

[42] E. von Holst, H. Mittelstaedt, Das Reafferenzprinzip (Wechselwirkung zwischen Zentralnervensystem und Peripherie), Die Naturwissenschaften 37 (20) (1950) 464–476.


[43] J. Vogelgesang, A. Cozzi, F. Wörgötter, A parallel algorithm for depth perception from radial optical flow fields, in: C. von der Malsburg, J.C. Vorbrüggen, W. von Seelen, B. Sendhoff (Eds.), Artificial Neural Networks: Sixth International Conference, Proceedings of ICANN 1996, Springer, Berlin, 1996, pp. 721–725.

[44] R.S. Wallace, P.-W. Ong, B.B. Bederson, E.L. Schwartz, Space variant image processing, International Journal of Computer Vision 13 (1) (1994) 71–90.

Marc Ebner was born in Stuttgart, Germany, in 1969. He received the M.S. degree in Computer Science from New York University, NY, in 1994, the Dipl.-Inform. from the Universität Stuttgart, Germany, in 1996 and the Dr. rer. nat. from the Universität Tübingen, Germany, in 1999. At present he is a research assistant at the Universität Würzburg, Germany. His research interests include biologically inspired systems, evolutionary algorithms in computer vision, evolutionary robotics, evolutionary algorithms, computer vision and robotics.

Andreas Zell received the Diploma in Computer Science from the University of Kaiserslautern, Germany, in 1986, the M.S. degree from Stanford University in 1987, the Ph.D. in 1989 and the Habilitation (venia legendi) in 1994 from the University of Stuttgart, Germany, all in computer science. From 1990 to 1995 he was an Assistant Professor at the University of Stuttgart, Institute for Parallel and Distributed High-Performance Systems. In 1991 he won the German University Software Prize for the development of the Stuttgart Neural Network Simulator (SNNS) and received an International MasPar Challenge Prize in 1992. Since 1995 he has been Professor at the Chair of Computer Architecture at the University of Tübingen, Wilhelm-Schickard-Institute for Computer Science. His research interests include artificial neural networks, evolutionary algorithms, artificial life, mobile robotics and bioinformatics. He directs the Mobile Robotics Lab at the University of Tübingen whose team was a finalist in the 1998 RoboCup middle size competition in Paris. He participates in a number of joint research projects with industry. He also coordinates the new bioinformatics curriculum at the University of Tübingen.