
Biol. Cybern. 71, 433-440 (1994)

© Springer-Verlag 1994

Active navigation with a monocular robot

P. J. Sobey

Centre for Visual Science, Research School of Biological Sciences, Australian National University, GPO Box 475, Canberra 2601, Australia

Received: 17 August 1993/Accepted: 2 May 1994

Abstract. A scheme is presented that uses the self-motion of a robot, equipped with a single visual sensor, to navigate in a safe manner. The motion strategy used is modelled on the motion of insects that effectively have a single eye and must move in order to determine range. The essence of the strategy is to employ a zigzag motion in order to (a) estimate the range to objects and (b) know the safe distance of travel in the present direction. An example is presented of a laboratory robot moving in a cluttered environment. The results show that this motion strategy can be successfully employed in an autonomous robot to avoid collisions.

1 Introduction

This paper explores the unique problems associated with navigating with a single visual sensor. This is the situation that is faced by many walking and flying animals, particularly insects. The compound eyes of an insect are too close together for stereo ranging except at extremely short distances (Lehrer et al. 1988; Srinivasan et al. 1989; Lehrer 1990; Srinivasan et al. 1991; Srinivasan 1992). In order to estimate range, the insect must have a knowledge of the speed of its travel and then use some measure of the optic flow on the retina (Gibson 1950; Nakayama and Loomis 1974; Horridge 1986).

Previous studies of robot autonomy that used animal behaviour as a paradigm (Anderson and Donarth 1990) have applied ideas using a robot with ultrasonic sensors arranged all around the robot (Arkin 1990). A robot equipped with ultrasonic ranging receives data from all around with equal confidence. If an echo does not return, the assumption is that the space is clear for a large distance in that direction. The truth, however, may be that the robot is close to an oblique wall. Similarly, laser-ranging robots (Anderson et al. 1992; Prescott and Mayhew 1992) are able to obtain a high resolution depth map by using phase, time-of-flight or light stripe methods. All these methods are active sensing in that they must emit radiation in order to interpret the reflected radiation. (A review of range-finding techniques can be found in Jarvis 1983.) However, in many applications the robot must only use passive sensors, relying on ambient reflected light.

The visually guided robot with a single sensor must move a known distance to be able to make an estimate of the range to objects. The confidence of the measurement is greatest when looking to the side, perpendicular to the direction of motion, and least in the forward (or rearward) directions. This means that the robot has no knowledge of what is directly in front of it. Further, if the robot views a featureless wall, it cannot calculate a range measurement but also cannot assume that this is an indication of free space. On the contrary, it must assume that there is an obstacle at some unknown distance, perhaps at the distance of neighbouring, known obstacles.

This situation of a monocular robot is similar to that of a flying or walking insect. The aim of this paper is to explore the motion strategies adopted by animals to maximise the safety of navigating with a single visual sensor. Central to this paper is the idea that motion must always be in the direction of an obstacle at a known range, because this is a safe way to proceed. In Sect. 2 the situation of a monocular autonomous agent is presented, with reference to the biological situation. Section 3 describes the mobile robot used in the experiments, and Sect. 4 details the potential field paradigm used. Section 5 presents the application of these ideas with the mobile robot navigating amongst obstacles in the laboratory. A discussion of the advantages of this scheme and possible applications follows in Sect. 6.

2 Autonomous agent behaviour

A flying insect is faced with a unique situation: it is totally dependent upon visual feedback for course control and flight stabilisation. It must have knowledge of its flight control parameters in order to counteract the effects of cross-winds and wing damage. It does this by performing turning saccades that confirm or adjust its flight control parameters (Mayer et al. 1988; Wolf and Heisenberg 1990; Wolf et al. 1992; Heisenberg and Wolf 1993). An insect can judge its airspeed by the force it expends on its wing beats (Bastian and Esch 1970; Rowell 1989). The role of visual feedback in the insect's estimation of speed is currently being investigated by other researchers.

If the insect is a predator, the hunt for prey requires a specific motion strategy in order to be successful. Many dragonflies lie perched in wait for objects of a particular size range to move in their field of view. When an object is detected, the dragonfly makes a translational move to fix the range of the prey. It then performs a homing manoeuvre on the insect (O'Carroll 1992, pers. comm.). This is a complex combination of manoeuvres that incorporates a number of real-time control systems. A simpler situation is that of an insect flying to a location through an environment cluttered with trees, poles, buildings and bushes. Here, the main requirement is to avoid collisions with the many obstacles.

Consider a camera moving along its optical axis (Fig. 1) with a velocity v. As the camera approaches an object, the object appears to have an angular velocity, θ̇, that corresponds to the velocity of the image of the object on the image plane of the camera (not shown). The angular velocity of objects across the image plane of a moving CCD camera can be expressed (see Nakayama and Loomis 1974) as:

\[
\dot{\theta} = \frac{v \sin\theta}{d} \tag{1}
\]

where θ is the angular position of the object on the image plane with respect to the direction of motion, θ̇ is the angular velocity of the object image, v is the velocity of the camera, and d is the distance to the object. It can be seen that when θ is zero, the image velocity is also zero, no matter what the range of the object. When θ is small, the image velocity is also small, and any noise in the measurement will make the distance measurement unreliable.
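To make the relation in (1) concrete, here is a minimal Python sketch (the numbers and function names are mine, not the paper's) that inverts it to recover range from a measured image angular velocity, and shows why viewing directions close to the motion axis yield unreliable ranges:

```python
import math

def image_angular_velocity(v, theta_deg, d):
    """Eq. (1): theta_dot = v * sin(theta) / d."""
    return v * math.sin(math.radians(theta_deg)) / d

def range_from_flow(v, theta_deg, theta_dot):
    """Invert eq. (1): d = v * sin(theta) / theta_dot."""
    if theta_dot <= 0.0:
        return float("inf")  # no measurable flow: range unknown, not "clear space"
    return v * math.sin(math.radians(theta_deg)) / theta_dot

# Hypothetical numbers: robot speed 100 mm/s, an object 1000 mm away, and a
# fixed error of 0.002 rad/s on the measured image velocity.
v, d_true, noise = 100.0, 1000.0, 0.002
for theta in (90.0, 30.0, 5.0):
    flow = image_angular_velocity(v, theta, d_true)
    d_est = range_from_flow(v, theta, flow + noise)
    print(f"theta = {theta:4.1f} deg: flow = {flow:.4f} rad/s, estimated range = {d_est:6.1f} mm")
```

The same small flow error barely disturbs the lateral (90 deg) estimate but corrupts the near-frontal (5 deg) one, which is exactly why a robot should not try to range along its direction of travel.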

Fig. 1. The angular velocity of the image of an object across the image plane of a moving CCD camera can be expressed as θ̇ = v · sin θ / d, where θ is the angular position of an object on the image plane with respect to the direction of motion, v is the velocity of the camera, θ̇ is the angular velocity of the object image, and d is the distance to the object. It can be seen that when θ is zero, the image velocity is also zero, no matter what the range of the object.

Psychophysical studies of humans have indicated that central vision is more important than peripheral vision in determining the direction of heading from a diverging vector field (Warren and Kurtz 1992). In a paper by Lee and Reddish (1981), the diving behaviour of plummeting gannets was studied in order to elucidate the mechanism by which they judge the position of the surface of the water. In this example, the birds must assume that the sea is flat in order to be able to interpret the diverging flow field. In order to use the expanding optical flow field as a measure of time-to-impact, it must be assumed that the flow is produced by extended planar surfaces (Tistarelli and Sandini 1993; Uras et al. 1988).

A flying insect cannot make this assumption when navigating in a complex environment such as through foliage. Imagine the situation where one is flying directly toward a thin rod, such as the branch of a tree, that is lying coaxially with the direction of travel. Only the small end of the rod will be visible, and it will be diverging so slowly that it will not be possible to detect its range from the optical flow. Without binocular vision or some other source of additional depth information, a collision will be imminent.

With a single visual sensor, one must adopt a motion strategy of translating in order to determine the distance to objects to either side. Then one can turn and travel towards one of those known points with the surety that one will not have a collision for a known distance, while at the same time gathering the information that will be used for the next turn and translation. So, it makes sense to not look where one is going, but instead to travel in the direction of a known obstacle and to look where one intends to go next. This results in a zigzag behaviour.

The zigzag course of flies has previously been noted (Wagner 1986; Heisenberg and Wolf 1993), and I contend that such a motion strategy is essential if collisions are to be avoided. Because a fly can obtain no range information from the part of the visual field corresponding to the direction of motion, it continually changes direction, as a series of straight line segments, and only travels in the direction of a known obstacle. An examination of the landing trajectories of flies onto black stationary spheres (Wagner 1982) reveals that at no time do they actually fly directly toward the target (except, of course, at the very last instant before contact).

Summarising the term 'situatedness', Brooks (1991) makes the point that 'the world is its own best model'. A situated robot does not construct a complete model of the world with which to make planning decisions but uses the world as its model, thereby avoiding the assumptions and errors associated with model construction. In order to test the collision avoidance motion strategy developed here, it has been embodied in a small robot and tested (situated) in a cluttered laboratory environment.

3 Experiments with a mobile robot

The monocular agent is implemented using a small mobile robot platform (originally built by Fujitsu for a neural networks project and modified by us for our purposes, see Fig. 2). The robot has four wheels that drive and turn together, enabling it to turn on its axis. One motor is used for forward motion and an optical chopper coupled to it provides a distance counter. The robot always moves in the direction it is pointing, and a stepper motor is used to change this direction. Another stepper motor rotates the camera about the robot axis to allow it to view at any angle to the direction of motion.


Fig. 2. The robot used to investigate motion strategies of a monocular agent

The robot is connected to a PC by an umbilical cord which contains: an RS-232 link to the on-board microprocessor that controls the motors, the camera cable to a framegrabber in the PC, and power. Tethering, while not as aesthetic as a cordless robot, is not considered detrimental to the project aims and is reliable. Ideally, one would have all the computing and power resources on the robot to make it truly independent, but this is impractical on a robot of this size.

Of more concern to the aims of exploring autonomous behaviour than tethering is the lack of computational power to perform real-time calculations. A 'dumb' framegrabber in a 486 PC means that movements must be restricted to a 'move and think' tactic in a stationary, unchanging environment. Further, the memory capacity of the framegrabber means that only 16 frames of 256 × 256 images can be stored. This forces the adoption of an action-perception-planning cycle, with the action and perception cycles inextricably linked. The robot is commanded to travel a certain distance with the camera loading the framegrabber in a cyclic manner. At the end of the motion, the framegrabber is halted and contains a sequence of the last 16 frames of the motion. This sequence is then analysed using the generalised gradient model (Srinivasan 1990; Sobey and Srinivasan 1991) to measure the image motion between the frames of the sequence. With knowledge of the camera position and geometry and the robot's speed, the image velocities can be related to object ranges.

The robot motion is restricted to the horizontal plane, so only a horizontal strip of the image is processed. The robot executes a linear translation, and the velocity in the horizontal strip of the image is calculated using the generalised gradient model. The 64-deg field-of-view of the camera is divided into 16 equiangled sectors of 4 deg, and the calculated velocity is averaged within each sector. This average sector velocity is then averaged over 14 frame times and converted to a range value, knowing the robot speed and the direction of view in relation to the direction of travel. This time average proved to be necessary for reliability because the robot motion is not smooth but quite bumpy, the wheels being small and stiff and the linoleum floor relatively uneven. The video frame rate is used as the clock, and the robot has a counter on the wheel position to measure egomotion.
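A compact sketch of this sector-averaging step is given below (Python). The sector geometry and the averaging follow the description above, but the function names, the data layout and the assumed 25 Hz frame rate are mine, not the paper's implementation:

```python
import math

FOV_DEG = 64.0        # camera field of view
N_SECTORS = 16        # 16 sectors of 4 degrees each
FRAME_RATE = 25.0     # frames per second used as the clock (assumed PAL rate)

def sector_ranges(flow_frames, robot_speed, camera_angle_deg):
    """Convert per-sector image velocities to object ranges.

    flow_frames      -- list of frames, each a list of N_SECTORS angular
                        velocities [rad/frame] from the gradient scheme
    robot_speed      -- robot translation speed [mm/s] from the wheel counter
    camera_angle_deg -- viewing direction of the camera centre relative to
                        the direction of travel [deg]
    """
    ranges = []
    for s in range(N_SECTORS):
        # average the sector velocity over the stored frames for reliability
        mean_flow = sum(frame[s] for frame in flow_frames) / len(flow_frames)
        theta_dot = mean_flow * FRAME_RATE  # rad/frame -> rad/s
        # viewing angle of this sector relative to the direction of travel
        offset = (s + 0.5) * (FOV_DEG / N_SECTORS) - FOV_DEG / 2.0
        theta = math.radians(camera_angle_deg + offset)
        if abs(theta_dot) < 1e-6:
            ranges.append(None)  # no texture, no flow: range unknown
        else:
            ranges.append(robot_speed * abs(math.sin(theta)) / abs(theta_dot))
    return ranges
```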

The resulting distance measurements are, at best, accurate to within about 10%, but the aim here is not to achieve absolute accuracy but to achieve accuracy sufficient for the task at hand. This task is to navigate through a cluttered environment with only a passive visual sensor, and to avoid collisions. There is no requirement to identify objects or to classify features. There is a requirement for a goal, otherwise the act of navigation has no meaning. Initially, the goal is defined simply to be a location relative to the robot.

A typical navigation task is run as follows, beginning with an orientation sequence. The goal location (as x, y coordinates) is given to the host, which commands the robot to turn toward the goal (+15 deg) and make a short (100 mm) translation while looking to one side. The last 16 frames of the move are used to calculate the distance to the objects in the field-of-view. The robot then turns toward the goal (−15 deg) and repeats the translation and calculation while looking to the other side. The result is a knowledge of the scene 60 deg on either side of the goal. This is the orientation movement sequence. The robot must then choose one of the range readings as the next heading direction.
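Under the same caveats, the orientation sequence described above might be sketched as follows; the robot interface (turn_to, set_camera, translate) and the lateral camera angle are hypothetical, and sector_ranges() is the conversion sketched earlier:

```python
def orientation_sequence(robot, goal_bearing_deg, look_angle_deg=60.0):
    """Initial two-leg move that maps the scene on either side of the goal.

    'robot' is assumed to expose turn_to(), set_camera(), translate() and a
    speed_mm_s attribute; translate() is assumed to return the per-frame,
    per-sector image velocities from the gradient scheme. All of these names
    and the default look angle are illustrative, not taken from the paper.
    """
    scene = {}
    for sign in (+1.0, -1.0):
        robot.turn_to(goal_bearing_deg + sign * 15.0)      # zig 15 deg off the goal
        robot.set_camera(-sign * look_angle_deg)            # look off to one side
        flow_frames = robot.translate(distance_mm=100.0)    # short calibration move
        scene[sign] = sector_ranges(flow_frames,
                                    robot.speed_mm_s,
                                    -sign * look_angle_deg)
    return scene
```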

4 Potential field navigation

After the orientation movement, every subsequent move is made toward a known object, with a decision based upon a number of potential fields (Fig. 3). Potential fields have been considered by a number of researchers as a means of guiding the motion of robots (Krogh 1984; Hogan 1985; Khatib 1986; Newman and Hogan 1987) and are a useful means of making decisions based on sensory data in a reactive system. No high-level knowledge of the system is required, and the rules for decision-making can be 'hard-wired' into the system yet still deliver satisfactory performance. The fields described here are ad hoc in the sense that they are not specifically modelled on a biological system but were derived from an understanding of the attributes of the planning process and from experience gained during the development of the mobile robot.

A range reading is calculated for each 4-deg sector in the field-of-view of the camera (assuming that there is some visual texture). Each range reading represents a possible new heading direction, and a decision is made based on the likelihood that the new heading will advance the robot toward the goal. Each range reading can be considered to be an object in the scene and is given a score, or attractiveness rating, based upon its relationship to the position of the goal, the present heading direction, and the distance of the robot to the object. The attractiveness rating is calculated in the manner of a potential field, with the values of the rating corresponding to the gradient of the potential.


Fig. 3A-D. The four fields that determine the choice of heading. The goal field (A) is the primary field in the direction of the goal. The view field (B) is the secondary field that emphasises the areas that were viewed in the last motion, and so is dependent on the heading direction. The third field (C, plotted against distance/clear distance) acts to repel the robot when it approaches too close to objects, and is dependent only on the range of the object. The fourth field (D) ensures that the robot is not attracted to objects that block the path to the goal. Objects that lie between the robot and the goal can exert a repulsive field whose width depends on the proximity of the object to the robot: an object close to the robot exerts a wide repulsive field (dotted line), whilst an object distant from the robot exerts a narrow repulsive field (solid line). The four fields are calculated and added for each object, and the direction with the highest attraction is chosen for the new heading.

The complete potential field that the robot experiences is not based upon a field calculated at all points in the environment but only at those points that have been measured. Points that have not been measured exert no influence upon the robot; the robot cannot choose to go anywhere other than a measured point. The new heading direction is chosen as the direction of the object that has the highest attractiveness, which is calculated in the manner described below.

The parameters of the robot-camera system for each distance measurement, i, are defined as: r_i = distance to the object; C = minimum allowable distance between the robot centre and an object; G = range to the goal; θ_io = object angle; θ_H = heading angle; θ_ig = angle between an object and the goal; θ_ih = angle between an object and the heading.

The primary potential field for each object, G_i(θ_ig), is defined by the position of the goal, exerting an attraction as a cosine function (Fig. 3A).

\[
G_i(\theta_{ig}) =
\begin{cases}
2.5\,\cos(3\,\theta_{ig}) & \text{if } |\theta_{ig}| < 30^\circ\\
0 & \text{otherwise}
\end{cases}
\tag{2}
\]

The secondary attraction field, V_i(θ_ih), is a bi-lobed cosine function that emphasises attraction to either side of the heading direction (Fig. 3B). This is to give greater credence to those distance measurements calculated in the last move.

\[
V_i(\theta_{ih}) =
\begin{cases}
\cos\bigl(2\,(|\theta_{ih}| - 45^\circ)\bigr) & \text{if } \cos\bigl(2\,(|\theta_{ih}| - 45^\circ)\bigr) > 0\\
0 & \text{otherwise}
\end{cases}
\tag{3}
\]


A repulsion field, D_i(r), that is a function of the distance to objects is overlaid, operating only at close distances; it acts to keep the robot a safe distance from walls and obstructions (Fig. 3C). A dimensionless quantity of range, C/r_i, is defined as the ratio of the clear distance that the robot is required to keep from an object to the range of the object.

\[
D_i(r) =
\begin{cases}
-\dfrac{2C}{r_i} + 1 & \text{if } r_i < 2C\\
0 & \text{otherwise}
\end{cases}
\tag{4}
\]

A final field, B_i(r, θ_g, G), acts negatively to ensure that the robot is not attracted to objects that block the path to the goal. In Fig. 3D three curves are shown, all of the same magnitude but varying in angular spread for differing ranges to the obstacle with respect to the distance and angle to the goal. The field that corresponds to a distant object has a narrow angular influence (solid line), whilst the field that corresponds to a near object has a large angular influence (dotted line).

\[
B_i(r, \theta_g, G) =
\begin{cases}
-\cos\bigl(\theta_{ig}\,(r_i + C)/C\bigr) & \text{if } |\theta_{ig}| < 90^\circ \text{ and } (r_i + C) < G\\
0 & \text{otherwise}
\end{cases}
\tag{5}
\]

The value of the potential field, P_i, is calculated for each object as the sum of the four fields, G, V, D and B. The object with the largest attractive potential, P_j, is chosen, and the new heading direction is the direction of this object, θ_jo, as long as it involves a change in direction greater than 20 deg and less than 120 deg.

\[
P_i = G_i(\theta_{ig}) + V_i(\theta_{ih}) + D_i(r_i) + B_i(r_i, \theta_{ig}, G) \tag{6}
\]

\[
P_j = \max_{i = 1 \ldots n} P_i, \qquad \text{subject to } 20^\circ < |\theta_H - \theta_{jo}| < 120^\circ \tag{7}
\]

\[
\theta_H = \theta_{jo} \tag{8}
\]
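A direct transcription of (2)-(8) into code could look like the following sketch (Python; angles are handled in degrees as in the text, and the object data structure and function names are mine, not the paper's):

```python
import math

def attractiveness(r, theta_ig, theta_ih, C, G):
    """Sum of the four fields, eqs. (2)-(6), for one measured object.

    r        -- range to the object [mm]
    theta_ig -- angle between the object and the goal [deg]
    theta_ih -- angle between the object and the current heading [deg]
    C        -- minimum allowable clearance from robot centre to object [mm]
    G        -- range to the goal [mm]
    """
    # (2) goal field: attraction within +/-30 deg of the goal direction
    goal = 2.5 * math.cos(math.radians(3.0 * theta_ig)) if abs(theta_ig) < 30.0 else 0.0
    # (3) view field: bi-lobed attraction peaking +/-45 deg from the heading
    lobe = math.cos(math.radians(2.0 * (abs(theta_ih) - 45.0)))
    view = lobe if lobe > 0.0 else 0.0
    # (4) near-distance repulsion, active only when the object is closer than 2C
    near = (-2.0 * C / r + 1.0) if r < 2.0 * C else 0.0
    # (5) goal blocking: objects between robot and goal repel, more widely when near
    if abs(theta_ig) < 90.0 and (r + C) < G:
        block = -math.cos(math.radians(theta_ig * (r + C) / C))
    else:
        block = 0.0
    return goal + view + near + block        # (6)

def choose_heading(objects, theta_H, C, G):
    """Eqs. (7)-(8): pick the most attractive object whose direction differs
    from the current heading by more than 20 and less than 120 degrees.

    objects -- list of dicts with keys 'r', 'theta_io', 'theta_ig', 'theta_ih'
               (this structure is assumed for the sketch)
    """
    best_p, best_dir = None, None
    for obj in objects:
        turn = abs(theta_H - obj["theta_io"])
        if not 20.0 < turn < 120.0:
            continue
        p = attractiveness(obj["r"], obj["theta_ig"], obj["theta_ih"], C, G)
        if best_p is None or p > best_p:
            best_p, best_dir = p, obj["theta_io"]
    return best_dir    # new heading theta_H, or None if no candidate qualifies
```

Because every candidate direction is an object whose range has actually been measured, this selection automatically enforces the central rule of the paper: the robot only ever moves toward a point of known range.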

With each move, the ranges to objects in the field-of-view are calculated, and objects in the field-of-view from previous moves are deleted. To each object is attributed a reliability that is in proportion to the number of valid measurements made in that sector of view, multiplied by a sine function of the direction of view (with zero degrees being straight ahead). With each move, the reliability of old objects is decreased, acting as a means of 'forgetting' objects that have not been looked at for a while. This is equivalent to a short-term memory and prevents the robot from acting upon information that has accumulated the errors of more than about 10 moves.
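The short-term memory could be sketched as follows; the decay factor, the age limit and the dictionary layout are assumptions, since the text only states that reliability is proportional to the number of valid measurements, weighted by a sine of the view direction, and that objects are forgotten after roughly ten moves:

```python
import math

MAX_AGE = 10   # moves before a reading is treated as too stale to act on

def reliability(n_valid, view_angle_deg):
    """Reliability of one range reading: the count of valid measurements in the
    sector, weighted by how far to the side it was viewed (0 deg = straight
    ahead, which is the least reliable direction)."""
    return n_valid * abs(math.sin(math.radians(view_angle_deg)))

def age_memory(objects, decay=0.9):
    """Decay and prune stored objects after each move.

    objects -- dict mapping an object id to {'age': int, 'reliability': float};
               this layout and the decay factor are illustrative, not from the paper.
    """
    for obj in objects.values():
        obj["age"] += 1
        obj["reliability"] *= decay
    return {k: v for k, v in objects.items() if v["age"] <= MAX_AGE}
```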

These simple rules are sufficient to act as a collision avoidance mechanism which can be considered an autonomous function of a more complex navigation system. The aim is to refine this model and embed it into an agent that has a number of goals and tasks to perform.

5 Moving to a goal

The simplest goal is simply to travel to a known location, that is, to travel a given distance in a given direction. The collision avoidance mechanism is required to change the local heading at each zig and zag to prevent collisions, but there must be a tendency exerted in the choice of heading to cause the robot to arrive at the goal. After each zigzag a heading is chosen that is closest to the desired heading, as defined by the potential fields, and that keeps the robot at least 125 mm from any obstacle (the robot is 250 mm in diameter).

At the point of a zigzag manoeuvre, it is desirable to update the range information by discarding the old ranges in the field-of-view now under observation. This is reasonable because the latest ranges are of the highest reliability, and accumulated errors have reduced the certainty of the old measurements.

The direction chosen to move past an obstacle must be in the direction of a known obstacle. This ensures that there is a clear path on the intended route. The absence of a range measurement cannot be assumed to be an indication of a clear route but, on the contrary, must be considered unknown and hazardous. This is contrary to the situation with a sonar system.

In order to simulate the situation of search, the robot was first given a location to travel to whilst searching for a light source, which it detected using a simple threshold technique. The example in Fig. 4 illustrates (a) obstacle avoidance, (b) searching for a goal, and (c) travelling to a goal.

Figure 4A illustrates the situation after the robot's initial orientation movement. The robot is the outlined circle in the centre of the 500-mm grid, with four lines representing the two sectors viewed by the camera on the next move. Note that the 30-deg sector in the direction of motion is not viewed by the robot. The path travelled by the robot is displayed as a connected series of dots. Motion proceeds toward the goal (the heavily outlined circle) in two steps, one looking to the left and one looking to the right. Range measurements (obstacles) are represented by open circles, which are darker if the ranging measurement has a high confidence. The circle size is dependent upon the distance of the object from the robot. The potential fields are applied, and the safe distance in the direction of each obstacle is calculated, represented by the filled circle. The darker the filled circle, the greater the attraction for the robot. The choice of a new direction is made after each two-step movement based upon the potential fields, the most attractive direction being chosen with the restriction that the change in direction be larger than 20 deg and less than 120 deg. This is to ensure that the next move will obtain information in the forward direction that was blind in the previous move. The safe point that is chosen to supply the new heading direction is circled.

The layout of the room is shown in Fig. 4A and F: the edges of the room, the three obstacles and the light source. The robot looks for the light source in its field-of-view with every move. During the orientation movement the light is obscured from the robot by one of the obstacles, so the robot maintains the initial goal. In Fig. 4B the robot sights the light and makes a small traverse in order to determine its range by triangulation. The goal is reassigned, and at this point the robot is being strongly repelled by the very near obstacle.

Fig. 4A-F. A sequence showing the robot finding and navigating to a light, at first obscured, whilst avoiding various obstacles. The outlined circle in the centre of the grid is the robot, with four lines defining the two sectors viewed by the camera on the next move. The black line with small black squares is the history of the robot movement. The heavily outlined circle is the goal, defined by the user initially, and subsequently by the position of the light. The grid has a spacing of approximately 500 mm. The open circles are the locations of obstacles determined in previous moves. The darker the open circle, the more confident is the measurement. The size of the circle indicates its distance from the robot. The filled circles are the computed safe paths in the direction of each obstacle. The darker the filled circle, the stronger the attraction of the robot to that path.

Every three moves the robot does another measurement of the goal range in order to update its position, thereby overcoming the inaccuracies of the ranging technique.

Figure 4C shows that the robot has moved to a position behind an obstacle that prevents it from sighting the goal, so the goal position is left unchanged. At this point the robot does a 180-deg turn and travels back the way it came. This is a safe manoeuvre because the path has already been traversed. The result is that, in Fig. 4D, the robot has chosen a path that takes it between the obstacles. It makes another ranging of the light source, which reassigns the goal slightly further away.

With each move the ranges in the fields-of-view are updated with new information, sometimes eliminating ghost obstacles that are a result of the ranging algorithm. When the robot views past the edge of an obstacle, it receives range readings from both the obstacle and the background. This can lead to a ranging that lies partway between the two. Continuous updating resolves these ambiguous readings.

Three more moves and then a final ranging of the goal in Fig. 4E confirms its position, and the robot halts at the closest safe distance (Fig. 4F).

This example illustrates the basic elements of navigation with a single visual sensor, that is, motion in short, straight segments punctuated by sudden changes in direction and always moving toward known obstacles.

6 Discussion

The zigzag motion strategy is an action-oriented perception schema. No a priori knowledge of the scene is required in order to achieve the goal. Except for the initialisation movement, every motion is in a safe direction toward a known obstacle. Unlike motor schemas that comprise a collection of individual motor behaviours (Arkin 1990), the zigzag motion with a potential field paradigm is a single motion strategy and is capable of complex trajectories.

Moving to a specific location is the elemental building block for an autonomous agent. However, it is insufficient to fulfil more complex navigation tasks. In order to enhance the capabilities of the present scheme, it would need to be embedded in a more complex robotic system that included a way of recognising unreachable goals and choosing new ones. This is to avoid the 'fly at the window' behaviour. For example, if the robot attempts to move to a location that is on the other side of a long wall, it will move along the wall for a certain distance to one side until it has 'forgotten' the objects closest to the goal and will then turn around and go back along the wall. It will repeat this ad infinitum unless the current goal is suppressed and replaced by another.

In the present scheme this is achieved by choosing a new, temporary goal if the robot fails to progress at more than a certain rate over a given period. This indicates a deadlock situation at a location. Figure 5 shows the path of the robot attempting to navigate to a goal that is concealed behind obstacles. (In this instance the robot is not navigating to a light but simply to a spatial location.)


Fig. 5. Dealing with a deadlock situation. The robot is navigating toward the goal, but the positioning of the obstacles causes it to track back and forth without getting close to the goal. After seven moves, if the ratio of the total path length to the distance traveled (i.e. the distance between the endpoints of the seven moves) is greater than a threshold, then the goal is repositioned. The robot moves toward the new goal, and after another seven moves the goal is returned to its original position. The robot approaches the goal from a new angle and is able to pass to the other side of the obstacles, eliminating the deadlock situation. Range readings are omitted for clarity

After seven pairs of moves, the ratio of the total path length to the actual distance covered (between the end points of the seven moves) is calculated, and if the value is high, it indicates that the robot is trapped in a deadlock situation. At this point the goal is temporarily repositioned, and after another seven moves it is returned to its original location. This provides the robot with a chance to approach the goal from a different angle and perhaps kick it out of the deadlock, as indeed happens in the example.
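A minimal sketch of the deadlock test follows (Python); the threshold value and the exact bookkeeping of move endpoints are assumptions, since the text only says the ratio must be high:

```python
import math

DEADLOCK_RATIO = 3.0   # threshold is an assumption; the paper only says "high"

def is_deadlocked(endpoints, n_moves=7):
    """Deadlock test over the endpoints of the last n_moves moves.

    endpoints -- list of (x, y) robot positions recorded at the end of each
                 move, most recent last.
    """
    if len(endpoints) < n_moves + 1:
        return False
    recent = endpoints[-(n_moves + 1):]
    path_length = sum(math.dist(a, b) for a, b in zip(recent, recent[1:]))
    net_displacement = math.dist(recent[0], recent[-1])
    if net_displacement < 1e-6:
        return True                     # wandered but went nowhere at all
    return path_length / net_displacement > DEADLOCK_RATIO
```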

A further enhancement would be to utilise the whole of the optical flow field. In the examples shown, only a horizontal strip of the image is used. This is mainly due to the cost of computation, but there is no reason why other areas of the image cannot be used to expand the perceived world from two dimensions to three. In the case of the robot traversing level ground, the information in the lower part of the field can be used to detect small projections from the floor plane. In the case of a flying robot, the full flow-field could be used to navigate in a cluttered environment such as a helicopter flying nap-of-the-earth (Suorsa and Sridhar 1990), where the vehicle must remain below the level of the tree tops as far as possible.

The main obstacle to further development is the computational cost of the image motion calculations. It is envisioned that this processing can be conducted at video rates with a suitable computational engine, such as ISHTAR (Sobey et al. 1992). This would then allow the exploration of dynamic situations. Kalman filtering could be used to provide an expectation of the changes in the optical flow in a cluttered environment as the robot moves. Deviations from this expectation could be marked as other objects moving in the environment, and motion strategies developed to determine the position of these objects.

Acknowledgements. This work was performed during an Australian Research Council Postdoctoral Fellowship. I would like to thank M.V. Srinivasan for useful discussions, T. Maddess for help with the figures, and the anonymous reviewer whose comments helped improve the manuscript.

References

Anderson CS, Madsen CB, Sorensen JJ, Kirkeby NOS, Jones JP, Christensen HI (1992) Navigation using range images on a mobile robot. Robotics Auton Syst 10:147-160

Anderson TL, Donarth M (1990) Animal behaviour as a paradigm for developing robot autonomy. Robotics Auton Syst 6:145-168

Arkin RC (1990) Integrating behavioral, perceptual and world knowledge in reactive navigation. Robotics Auton Syst 6:105-122

Bastian J, Esch H (1970) The nervous control of the indirect flight muscles of the honey bee. Z Vgl Physiol 67:307-324

Brooks RA (1991) Intelligence without reason. In: 12th Int Joint Conf on Artificial Intelligence, August, pp 569-595

Gibson JJ (1950) The perception of the visual world. Houghton Mifflin, Boston

Heisenberg M, Wolf R (1993) The sensory-motor link in motion-dependent flight control of flies. In: Miles FA, Wallman J (eds) Visual motion and its role in the stabilisation of gaze. Elsevier Science Publishers, pp 265-283

Hogan N (1985) Impedance control: an approach to manipulation. J Dyn Syst Meas Control 107:1-24

Horridge GA (1986) A theory of insect vision: velocity parallax. Proc R Soc Lond [Biol] 229:13-27

Jarvis RA (1983) A perspective on range finding techniques for com- puter vision. IEEE Trans Pattern Anal Mach Intell 5-2:122-139

Khatib O (1986) Real-time obstacle avoidance for manipulators and mobile robots. Int J Robotics Res 5-1:90-98

Krogh BH (1984) A generalised potential field approach to obstacle avoidance control. In: Int Robotics Res Conf, August

Lee DN, Reddish PE (1981) Plummeting gannets: a paradigm of eco- logical optics. Nature 293:293-294

Lehrer M (1990) How bees use peripheral eye regions to localise a frontally positioned target. J Comp Physiol [A] 167:173-185

Lehrer M, Srinivasan MV, Zhang SW, Horridge GA (1988) Motion cues provide the bee's visual world with a third dimension. Nature 332:356-357

Mayer M, Vogtmann K, Bausenwein B, Wolf R, Heisenberg M (1988) Flight control during 'free yaw turns' in Drosophila melanogaster. J Comp Physiol [A] 163:389-399

Nakayama K, Loomis JM (1974) Optical velocity patterns, velocity- sensitive neurons, and space perception: a hypothesis. Perception 3: 63-80

Newman WS, Hogan N (1987) High speed robot control and obstacle avoidance using dynamic potential functions. In: Proceedings of the IEEE International Conference on Robotics and Automation, Piscataway, NJ, pp 14-22

Prescott T, Mayhew J (1992) Adaptive local navigation. In: Blake A, Yuille A (eds) Active vision. MIT Press, Cambridge, Massa- chusetts, pp 203-215

Rowell CHF (1989) Descending interneurones of the locust reporting deviation from flight course: what is their role in steering? J Exp Biol 146:177-194

Sobey P, Srinivasan MV (1991) Measurement of optical flow using a generalized gradient scheme. J Opt Soc Am A 8:1488-1498

Sobey PJ, Sasaki S, Nagle M, Toriu T, Srinivasan MV (1992) A hardware system for computing image velocity in real-time. In: Batchelor BG, Solomon SS, Waltz FM (eds) Machine vision applications, architectures and systems integration. SPIE 1823: 334-341

Srinivasan MV (1990) Generalized gradient schemes for the measurement of two-dimensional image motion. Biol Cybern 63:421-431

Srinivasan MV (1992) Distance perception in insects. Curr Dir Psychol Sci 1-1:22-26

Srinivasan MV, Lehrer M, Kirchner WH, Zhang SW (1991) Range perception through apparent image speed in freely flying honeybees. Vis Neurosci 6:519-535

Srinivasan MV, Lehrer M, Zhang SW, Horridge GA (1989) How honeybees measure their distance from objects of unknown size. J Comp Physiol [A] 165:605-613

Suorsa R, Sridhar B (1990) Validation of vision-based obstacle detection algorithms for low-altitude helicopter flight. In: Chun WH, Wolfe WJ (eds) Mobile robots V. SPIE 1388: 90-103

Tistarelli M, Sandini G (1993) On the advantages of polar and log-polar mapping for direct estimation of time-to-impact from optical flow. IEEE Trans Patt Anal Mach Intell 15-4:401-410

Uras S, Girosi F, Verri A, Torre V (1988) A computational approach to motion perception. Biol Cybern 60:79-87

Wagner H (1982) Flow-field variables trigger landing in flies. Nature 297:147-148

Wagner H (1986) Flight performance and visual control of flight of the free-flying housefly (Musca domestica L.) III. Interactions between angular movement induced by wide- and small-field stimuli. Philos Trans R Soc Lond [Biol] 312:581-595

Warren WH, Kurtz KJ (1992) The role of central and peripheral vision in perceiving the direction of self-motion. Percept Psychophys 51:443-454

Wolf R, Heisenberg M (1990) Visual control of straight flight in Drosophila melanogaster. J Comp Physiol [A] 167:269-283

Wolf R, Voss A, Hein S, Heisenberg M (1992) Can a fly ride a bicycle? Philos Trans R Soc Lond [Biol] 337:261-269