

Fusion of Discrete and Continuous Epipolar Geometry for Visual Odometry and Localization

David Tick
Computer Science Dept.
University of Texas at Dallas
Dallas, Texas 75252
Email: [email protected]

Jinglin Shen
Electrical Engineering Dept.
University of Texas at Dallas
Dallas, Texas 75252
Email: [email protected]

Dr. Nicholas Gans
Electrical Engineering Dept.
University of Texas at Dallas
Dallas, Texas 75252
Email: [email protected]

Abstract—Localization is a critical problem for building mobile robotic systems capable of autonomous navigation. This paper describes a novel visual odometry method to improve the accuracy of localization when a camera is viewing a piecewise planar scene. Discrete and continuous Homography Matrices are used to recover position, heading, and velocity from images of coplanar feature points. A Kalman filter is used to fuse pose and velocity estimates and increase the accuracy of the estimates. Simulation results are presented to demonstrate the performance of the proposed method.

I. INTRODUCTION

Building mobile robotic systems that are capable of real-time, autonomous navigation is a complex, multi-faceted problem. One of the primary aspects of this problem is the task of localization. Localization, or pose estimation, refers to the task of estimating the velocity (both angular and linear), position, and orientation of the robot at any given instant. There are many established ways to approach the task of localization, including wheel odometry [1], inertial sensors [1], GPS [1], sonar [2], and IR/laser-based range finding sensors [3].

There has also been significant development of localization techniques which are solely vision-based. Visual odometry is a method of localization that uses one or more cameras to continuously capture images or video frames of a scene [4]. The frames are analyzed sequentially using various computer vision techniques. This analysis estimates the angular and linear velocities of the camera between each time step. These velocities can then be integrated over time to estimate how the camera has moved.

Using this technique, an estimate of a mobile robot's speed, position, and heading (orientation) can be calculated and maintained over time. A pose estimate produced in this manner can be further enhanced by combining visual odometry with traditional wheel odometry and other sensor devices. This can be achieved through the use of various signal processing techniques collectively called sensor fusion. The Kalman filter is particularly useful for fusing signals from multiple sensors and removing errors in localization that occur due to factors such as sensor noise, quantization, flawed process modeling, and sensor bias or drift [1], [5].

Some vision-based localization techniques are designed to calculate the pose of the camera relative to some well-known reference object that appears in each frame [6], [7]. Usually these techniques require the system to have accurate geometric information about the scene and/or some reference object(s) in the scene prior to execution. In many situations, there exists little accurate prior knowledge regarding a scene and the objects therein. In such cases, two frames can be compared with one another based on a set of feature points which exists in both frames. Once a set of feature points is identified in both frames, a homogeneous mapping of the feature points from one frame onto the next must exist. This mapping can be modeled as a linear transformation and encapsulates the rotation and translation of the camera that occurs between the taking of each picture. In practice, the Essential Matrix or Euclidean Homography Matrix is used to estimate a camera's pose in terms of a set of rotational and translational transformations [8], [9].

Over the years, many control systems have been designed that utilize Essential or Homography Matrices in vision-based robotic tasks [10]–[13]. The Euclidean Homography Matrix can be used in both a continuous as well as a discrete form [9]. The Homography Matrix, in its discrete form, has been widely used for many years by the community [14]–[18]. However, there has been comparatively little work regarding applications of the Homography Matrix in its continuous form.

Of particular interest in this paper is the application of the Euclidean Homography Matrix as a means of estimating velocity [19]. If one performs a discrete homography-based estimation of the camera position and orientation (i.e., pose), then one can also integrate the continuous estimate of the velocity at each time step to extract the pose as well. The pose estimate obtained by integrating the continuous homographic velocity estimate must agree with the discretely estimated pose of the camera.

In this paper, we propose a method which uses a single six degrees of freedom (6DOF), or "eye in hand", camera to do homography-based visual odometry. This system will capture video frames and identify sets of coplanar feature points in a scene. It will estimate the change in pose of the camera by using the discrete form of the Euclidean Homography Matrix. The system will also utilize the continuous form of the Euclidean Homography Matrix to estimate the velocity of the camera. Our system utilizes the well-known Kalman filter to fuse these two estimates and remove error which accrues over time due to integration, noise, and quantization [5].


Section II clarifies the terminology used in this paper and formally defines and models the problem. Section III explains the proposed approach to homographic visual odometry. We present simulation results in Section IV that illustrate the effectiveness of the proposed system. Finally, in Section V we present the conclusions that we have reached.

II. BACKGROUND

While there are certain formal conventions that exist with respect to the terminology used in vision-based localization, there is no official standard. Thus, it is necessary that we define the terminology that we use in this paper.

A. Formal Definition and Terminology

Fig. 1 illustrates a camera, with attached reference frame, moving over time. The lowercase subscript attached to each axis label indicates the frame of reference to which that axis belongs. Navigation sensors, including cameras, report measurements in terms of the moving body frame, formally called Fb. These measurements are then rotated to obtain localization with respect to the world frame, formally called Fw. The z-axis of Fb is oriented along the optical axis, the x-axis is oriented along the horizontal direction of the image plane, and the y-axis is oriented parallel to the vertical direction of the image.

The orientation and position (collectively referred to as pose) of a body frame in terms of the world frame is expressed as a function of time by including the index number for a given time step in parentheses. For example, in Fig. 1, Fb(t0) describes the pose of the body frame Fb at time index t0 as measured from the world frame Fw, while Fb(tk) describes the pose of Fb at time tk with respect to Fb(t0).

The change in the pose of Fb that occurs over the time interval [t0, t1, ..., tk−1, tk] is described in terms of a translation vector Tk ∈ R3 and a rotation matrix Rk ∈ SO(3), formally written as (Tk, Rk), where SO(3) denotes the special orthogonal group of order three. At any given time tk, the instantaneous linear and angular velocities of the body frame Fb are described as a pair of vectors (vk, ωk), where vk ∈ R3 and ωk ∈ R3.

B. Pinhole Camera Model

Fig. 1. Translation and Rotation (Tk, Rk) of Body/Camera Frame Fb

Fig. 2. Planar Scenes and Feature Points

Fig. 2 illustrates a camera taking two images from two different poses F∗b and Fb(t). F∗b is considered a static reference frame, such as the pose at time t = 0. Without loss of generality, we take Fw = F∗b. Fb(t) is considered a moving frame or current frame. The changes which exist between the two poses, as stated above, are encapsulated by (T, R). Fig. 2 also shows N ≥ 4 feature points that all lie in the plane πs. The 3D coordinates of each feature point as measured from the reference frame of the pose Fb(t) are defined in terms of a vector \bar{m}_j(t) ∈ R3 (the overbar distinguishes 3D coordinates from the normalized image coordinates m_j(t) defined below). Similarly, the 3D coordinates of each feature point as measured from the reference frame of the pose F∗b are defined in terms of a vector \bar{m}^*_j ∈ R3. Formally, these vectors are given as

\bar{m}^*_j = [x^*_j, y^*_j, z^*_j]^T, \quad \forall j \in \{1, \ldots, N\}

\bar{m}_j(t) = [x_j(t), y_j(t), z_j(t)]^T, \quad \forall j \in \{1, \ldots, N\}.

Two different images are captured by the camera at the two poses F∗b and Fb(t). This is modeled by projecting the feature points onto the 2D image planes π∗b and πb(t). The coordinates of the feature points in these 2D planes are expressed as a normalized set of 3D coordinates, where depth along the z-axis is set equal to one. The normalized image plane coordinates that result from this projection, as measured from the reference frame of the pose Fb(t), are defined in terms of a vector m_j(t) ∈ R3. Similarly, the normalized image plane coordinates of each feature point as measured from the reference frame of the pose F∗b are defined in terms of a vector m^*_j ∈ R3. These vectors are expressed as

m^*_j = \left[ \frac{x^*_j}{z^*_j}, \frac{y^*_j}{z^*_j}, 1 \right]^T, \quad \forall j \in \{1, \ldots, N\}

m_j(t) = \left[ \frac{x_j(t)}{z_j(t)}, \frac{y_j(t)}{z_j(t)}, 1 \right]^T, \quad \forall j \in \{1, \ldots, N\}.
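To make the projection model concrete, the following minimal NumPy sketch (the function name and the example point values are ours, not from the paper) converts 3D camera-frame coordinates of coplanar feature points into the normalized image-plane coordinates defined above.

```python
import numpy as np

def normalize(points_3d):
    """Project 3D points (N x 3, camera-frame coordinates) onto the
    normalized image plane by dividing by depth z, giving [x/z, y/z, 1]."""
    points_3d = np.asarray(points_3d, dtype=float)
    return points_3d / points_3d[:, 2:3]

# Example: four coplanar points in the reference frame F*_b
# (hypothetical values; they lie on a plane with normal [0, 0, 1]
#  at distance d* = 4 from the camera).
m_bar_star = np.array([[ 0.5,  0.5, 4.0],
                       [-0.5,  0.5, 4.0],
                       [-0.5, -0.5, 4.0],
                       [ 0.5, -0.5, 4.0]])
m_star = normalize(m_bar_star)   # normalized coordinates m*_j
print(m_star)
```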


C. Euclidean Homography

In this work, we focus on the planar homography case; that is, all 3D feature points lie on the same plane. Consider the two sets of coordinates of the points on the plane πs defined above. The transformation between the two sets is given by [9]

\bar{m}_j = R \bar{m}^*_j + T. \quad (1)

The relationship in (1) can be rewritten in terms of image points as [8]

m_j = \left( R + \frac{1}{d^*} T n^{*T} \right) m^*_j

where n^* = [n^*_x, n^*_y, n^*_z]^T is the constant unit normal vector of the plane πs measured in F∗b, and d^* is the constant distance from the optical center of the camera (i.e., the origin of F∗b) to the plane πs. We assume that d^* is known in this initial investigation. The matrix

H_d = R + \frac{1}{d^*} T n^{*T}

defines what is known as the discrete Homography Matrix. By using the four-point algorithm, the Homography Matrix H_d(t) can be solved to recover the translation vector T(t) and the rotation matrix R(t) [8].
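The paper relies on the standard four-point algorithm rather than giving an implementation, but a minimal NumPy sketch of the usual direct linear transform (DLT) estimate of Hd is shown below; the function name and the normalization choice are ours, not the authors'.

```python
import numpy as np

def estimate_discrete_homography(m_star, m):
    """DLT estimate of Hd from N >= 4 coplanar correspondences.

    m_star, m: (N, 3) arrays of normalized coordinates [x, y, 1]
    in F*_b and Fb(t), related (up to scale) by m_j ~ Hd m*_j.
    """
    A = []
    for (xs, ys, _), (x, y, _) in zip(m_star, m):
        A.append([xs, ys, 1, 0, 0, 0, -x * xs, -x * ys, -x])
        A.append([0, 0, 0, xs, ys, 1, -y * xs, -y * ys, -y])
    # The stacked homography vector spans the null space of A
    # (right singular vector of the smallest singular value).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    # A calibrated Euclidean homography has middle singular value 1,
    # which fixes the overall scale lost in the DLT step.
    H /= np.linalg.svd(H, compute_uv=False)[1]
    return H
```

Decomposing the recovered Hd into R, T/d*, and n* then follows the method of Faugeras and Lustman [8]; in practice a library routine such as OpenCV's decomposeHomographyMat can also be used, with the physically valid solution selected by the positive-depth constraint.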

In the continuous case, the image point m_j(t) and its optical flow \dot{m}_j(t) are measured instead of the image pair m^*_j and m_j(t). The time derivative of \bar{m}_j(t) satisfies

\dot{\bar{m}}_j = \hat{\omega} \bar{m}_j + v \quad (2)

where \hat{\omega}(t) \in R^{3 \times 3} is the skew-symmetric matrix form of ω(t). The relationship in (2) can be rewritten in terms of image points as [9]

\dot{m}_j(t) = \left( \hat{\omega} + \frac{1}{d^*} v n^{*T} \right) m_j(t).

The matrix

H_c = \hat{\omega} + \frac{1}{d^*} v n^{*T}

is defined as the continuous Homography Matrix. Similar to the discrete form, a four-point algorithm gives the solution for the linear velocity v(t) and angular velocity ω(t) of the camera.
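As a sanity check on the two definitions, the short sketch below (our own construction, with made-up motion and plane values) builds Hd and Hc from a known motion and verifies that they reproduce the rigid-body and velocity mappings of equations (1) and (2) exactly for 3D points lying on πs.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix w_hat such that w_hat @ x equals the cross product w x x."""
    wx, wy, wz = w
    return np.array([[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]])

# Plane parameters in the reference frame (hypothetical values).
n_star = np.array([0.0, 0.0, 1.0])    # unit normal of pi_s
d_star = 4.0                           # camera-to-plane distance
# Four coplanar feature points, stacked as columns (3 x 4).
m_bar_star = np.array([[0.5, 0.5, 4.0], [-0.5, 0.5, 4.0],
                       [-0.5, -0.5, 4.0], [0.5, -0.5, 4.0]]).T

# Discrete case: a rotation about the optical axis plus a translation.
th = 0.1
R = np.array([[np.cos(th), -np.sin(th), 0.0],
              [np.sin(th),  np.cos(th), 0.0],
              [0.0, 0.0, 1.0]])
T = np.array([0.1, -0.2, 0.3])
Hd = R + np.outer(T, n_star) / d_star
assert np.allclose(Hd @ m_bar_star, R @ m_bar_star + T[:, None])   # eq. (1)

# Continuous case: instantaneous linear and angular velocity.
v = np.array([0.3, 0.1, -0.2])
w = np.array([0.0, 0.05, 0.02])
Hc = skew(w) + np.outer(v, n_star) / d_star
assert np.allclose(Hc @ m_bar_star, skew(w) @ m_bar_star + v[:, None])  # eq. (2)
print("Hd and Hc reproduce the planar mappings exactly.")
```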

III. APPROACH

As described in Section II, the translation and rotation of a camera can be recovered by applying the Homography Matrix method in its discrete form. Similarly, the linear and angular velocities of a camera can be obtained using the continuous form. However, the estimation can be noisy due to sensor noise from the camera and numerical error produced during computation of the Homography Matrix. Integration of the continuous velocity estimate over time should agree with the discrete pose estimate; thus, fusing velocity and pose estimates will reduce the effect of noise. A Kalman filter is built to perform the sensor fusion, including integration, and generate the final pose and velocity estimates.

The reference feature points m^*_j are taken as the points at the initial time t0. At each time step tk, the feature points m^*_j and m_j(tk) are used to calculate the Homography Matrices H_d(tk) and H_c(tk). The Homography Matrices are decomposed to solve for Rk, Tk, vk, and ωk. The system states are the pose and velocity of the camera. Roll, pitch, and yaw angles (denoted r(tk), p(tk), w(tk)) represent the camera's orientation instead of the recovered rotation matrix. The relation between the angle rates \dot{r}(tk), \dot{p}(tk), \dot{w}(tk) and the angular velocity ωk = [ωx(tk), ωy(tk), ωz(tk)]^T is given by [20]

\dot{p} = \omega_x \cos r - \omega_y \sin r \quad (3)

\dot{w} = (\omega_x \sin r + \omega_y \cos r) \sec p \quad (4)

\dot{r} = (\omega_x \sin r + \omega_y \cos r) \tan p + \omega_z. \quad (5)
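Equations (3)–(5) translate directly into code; a minimal sketch (the function name is ours) is given below. Note that the sec p term makes the mapping singular at p = ±π/2, the usual Euler-angle limitation.

```python
import numpy as np

def euler_angle_rates(omega, r, p):
    """Convert body angular velocity omega = [wx, wy, wz] (rad/s) into
    roll, pitch, and yaw rates using eqs. (3)-(5)."""
    wx, wy, wz = omega
    p_dot = wx * np.cos(r) - wy * np.sin(r)                       # eq. (3)
    w_dot = (wx * np.sin(r) + wy * np.cos(r)) / np.cos(p)         # eq. (4)
    r_dot = (wx * np.sin(r) + wy * np.cos(r)) * np.tan(p) + wz    # eq. (5)
    return r_dot, p_dot, w_dot
```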

The system equations for the Kalman Filter are given by

x_k = F_k x_{k-1} + w_k \quad (6)

y_k = H_k x_k + v_k. \quad (7)

In (6), the state vector is given by

xk = [x, y, z, p, w, r, vx, vy, vz, ωx, ωy, ωz]T ∈ R12

and wk ∈ R12 is a normally distributed random process with zero mean and covariance matrix Qk ∈ R12×12. The state transition matrix Fk ∈ R12×12 and the process covariance matrix Qk can be found in the Appendix. A random walk process is used to determine Qk. In Fk and Qk, the term ∆t is the time between frames (the reciprocal of the camera frame rate), and the various σ terms are scaling terms for the states, indicated by subscripts. In (7), the measurement matrix Hk ∈ R12×12 is an identity matrix, since all components of the state vector are measured directly. The sensor noise vector vk ∈ R12 is a normally distributed random process with zero mean and covariance matrix Rk ∈ R12×12, a diagonal matrix whose diagonal elements are chosen according to the estimated measurement noise for each state vector component.

Given the terms in (6) and (7), the Kalman filter is designed and updated in the typical manner [5]. Estimates from the update step provide the pose and velocity estimates that are in turn used for localization of the camera.
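The filter itself is the standard linear Kalman recursion. The sketch below is a minimal illustration, not the authors' implementation: variable names are ours, Fk is assembled with the block structure given in the Appendix, Hk = I, and the measurement y stacks the pose recovered from Hd with the velocities recovered from Hc.

```python
import numpy as np

def build_F(r, p, dt):
    """State transition matrix Fk (12 x 12), as in the Appendix:
    pose states are advanced by the velocity states over one frame."""
    F = np.eye(12)
    F[0:3, 6:9] = dt * np.eye(3)                        # x, y, z <- vx, vy, vz
    J = np.array([[np.cos(r), -np.sin(r), 0.0],
                  [np.sin(r) / np.cos(p), np.cos(r) / np.cos(p), 0.0],
                  [np.sin(r) * np.tan(p), np.cos(r) * np.tan(p), 1.0]])
    F[3:6, 9:12] = dt * J                               # p, w, r <- wx, wy, wz
    return F

def kalman_step(x, P, y, Q, R_meas, dt):
    """One predict/update cycle of eqs. (6)-(7) with Hk = I.
    State x = [x, y, z, p, w, r, vx, vy, vz, wx, wy, wz]."""
    F = build_F(r=x[5], p=x[3], dt=dt)
    x_pred = F @ x                                      # predict state
    P_pred = F @ P @ F.T + Q                            # predict covariance
    S = P_pred + R_meas                                 # innovation covariance
    K = P_pred @ np.linalg.inv(S)                       # Kalman gain (H = I)
    x_new = x_pred + K @ (y - x_pred)                   # measurement update
    P_new = (np.eye(12) - K) @ P_pred
    return x_new, P_new
```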

IV. RESULTS

In this section, simulation results of the proposed homographic visual odometry method are presented. A single camera observes four static, coplanar feature points. The camera motion is generated so that the feature points always stay in the field of view. The linear velocity is sinusoidal along all degrees of freedom. The angular velocity is zero along the x-axis and sinusoidal along the other two axes. The frame rate of the camera is 30 frames/second. The discrete and continuous Homography Matrices are used to compute the estimated camera position, orientation, and velocities. Next, these homographic estimates are input into the Kalman filter, which fuses them to produce the final state estimate. The results are then plotted against the true camera motion for comparison.

In the first simulation, the fused results from the Kalman filter are compared with results which come directly from using only one of the Homography Matrices. Gaussian noise with zero mean and variance equal to 0.01 is added.


The diagonal values of Rk used in our simulations are [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]. Fig. 3 through Fig. 6 show the estimated camera position, orientation, linear velocity, and angular velocity, respectively. The dashed line (blue) represents the true value of the simulated trajectory, and the two solid lines (green and red) represent the estimate obtained directly from the Homography (H) and the fused estimate produced by the Kalman filter (KF).

Fig. 3 and Fig. 4 show that given a random initial state, the estimated values of position and orientation converge toward the true trajectory quickly, and thereafter the estimation error remains small for both the Homography and the Kalman-filtered estimates. Fig. 5 and Fig. 6 show that the velocity estimates from the Kalman filter track the true values quite well, while the continuous Homography estimate is much noisier. Also, note that a lag exists between the true signals and the filtered estimates. Large values of the diagonal elements in the measurement covariance matrix Rk cause this lag. Larger values in Rk make the estimated trajectories more accurate, but increase the lag of the Kalman filter. There is a trade-off between the system's robustness against noise and the overall speed of the system.

In the second simulation, we investigate the robustness of our proposed method by looking at its performance when different amounts of noise are added. Fig. 7 through Fig. 10 once again show the estimated camera position, orientation, linear velocity, and angular velocity, respectively. The dashed line (blue) represents the true value of the simulated trajectory, and the two solid lines (green and red) represent the fused estimate produced by the Kalman filter with no noise added (n = 0) and with Gaussian noise having zero mean and variance equal to 0.05 added (n = 0.05). Comparison with the results from the first simulation reveals that the accuracy of the position and orientation estimates is not severely impacted by the introduction of more noise. However, the velocity estimates do become significantly noisier as the variance of the sensor noise increases.

These results indicate that our proposed system is sensitive to noise in its velocity estimate. There are several approaches which could mitigate this noise sensitivity. One possible solution is to utilize a fixed-interval Kalman smoother [21]. As an example, the effect of this smoother on velocity estimation is shown in Fig. 11 and Fig. 12. Note that the velocity estimates are drastically improved when this method is employed. However, this solution is not usable in real time. Another solution, which is capable of being run in real time, is to add more feature points to the set. While this would likely provide some additional robustness against noise, it should be noted that in this initial investigation we are evaluating performance of the proposed system in the worst-case situation of having only the minimum set of four feature points.
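One common fixed-interval smoother is the Rauch–Tung–Striebel backward pass [21]; a rough sketch (assuming the per-step filtered estimates, one-step predictions, and transition matrices have been stored during the forward Kalman pass, with names of our choosing) is given below.

```python
import numpy as np

def rts_smoother(x_filt, P_filt, x_pred, P_pred, F_list):
    """Rauch-Tung-Striebel fixed-interval smoother [21].

    x_filt[k], P_filt[k]: filtered estimate/covariance at step k.
    x_pred[k], P_pred[k]: one-step prediction made from step k,
                          i.e. x_pred[k] = F_list[k] @ x_filt[k].
    Runs backward over the whole batch, so it is an offline method.
    """
    n = len(x_filt)
    x_s, P_s = [None] * n, [None] * n
    x_s[-1], P_s[-1] = x_filt[-1], P_filt[-1]
    for k in range(n - 2, -1, -1):
        # Smoother gain C_k = P_filt[k] F_k^T P_pred[k]^{-1}
        C = P_filt[k] @ F_list[k].T @ np.linalg.inv(P_pred[k])
        x_s[k] = x_filt[k] + C @ (x_s[k + 1] - x_pred[k])
        P_s[k] = P_filt[k] + C @ (P_s[k + 1] - P_pred[k]) @ C.T
    return x_s, P_s
```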

Fig. 3. Filter vs DH: Position Estimate (noise variance = 0.01)
Fig. 4. Filter vs DH: Orientation Estimate (noise variance = 0.01)
Fig. 5. Filter vs CH: Linear Velocity Estimate (noise variance = 0.01)
Fig. 6. Filter vs CH: Angular Velocity Estimate (noise variance = 0.01)
Fig. 7. Filtered Position Estimate with Camera Noise Added
Fig. 8. Filtered Orientation Estimate with Camera Noise Added
Fig. 9. Filtered Linear Velocity Estimate with Camera Noise Added
Fig. 10. Filtered Angular Velocity Estimate with Camera Noise Added
Fig. 11. Smoothed Estimate of Filtered Linear Velocity w/Noise Added
Fig. 12. Smoothed Estimate of Filtered Angular Velocity w/Noise Added

V. CONCLUSION

In this paper, we present a novel visual odometry method for localization of a mobile robotic system. Continuous and discrete forms of the Euclidean Homography Matrix are used to recover position and velocity information from camera images. A Kalman filter is used to fuse the velocity and pose estimates, which reduces the effects of noise and improves the estimate. Simulations were performed to explore the performance of the proposed method. When sensor noise is not considered, the estimated positions and velocities show little error from the true signals. Even when noise is introduced to the camera model, the velocity estimates produced by our proposed method are dramatically better than the velocity estimates obtained directly from the continuous Homography Matrix. Carefully choosing the parameters of the Kalman filter allows the user to balance the accuracy and responsiveness of the system. Furthermore, there exist both offline and real-time solutions that mitigate the effects of noise sensitivity.

There are several avenues open for future work. In the short term, experiments will be performed with the presented method. This will require accurate and fast detection and tracking of features. Low levels of tracking error will be important to mitigate sensor noise. In the longer term, a method must be determined to handle large-scale movements of the robot and camera, such that feature points can be allowed to enter and leave the camera field of view without disrupting the estimation. Finally, this method will be incorporated with other localization and odometry methods, including inertial measurement units and wheel encoders.

REFERENCES

[1] G. Dudek and M. Jenkin, "Inertial sensors, GPS, and odometry," in Springer Handbook of Robotics, 2008, pp. 477–490.
[2] L. Kleeman and R. Kuc, "Sonar sensing," in Springer Handbook of Robotics, 2008, pp. 491–519.
[3] R. B. Fisher and K. Konolige, "Range sensors," in Springer Handbook of Robotics, 2008, pp. 521–542.
[4] D. Nister, O. Naroditsky, and J. Bergen, "Visual odometry," in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition (CVPR), 2004, vol. 1, pp. 652–659.
[5] R. G. Brown, Introduction to Random Signal Analysis and Kalman Filtering. John Wiley & Sons, 1983.
[6] D. DeMenthon and L. S. Davis, "Model-based object pose in 25 lines of code," in European Conf. on Computer Vision, 1992, pp. 335–343.
[7] L. Quan and Z.-D. Lan, "Linear n-point camera pose determination," IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 8, pp. 774–780, 1999.
[8] O. D. Faugeras and F. Lustman, "Motion and structure from motion in a piecewise planar environment," Int. J. Pattern Recog. and Artificial Intell., vol. 2, no. 3, pp. 485–508, 1988.
[9] Y. Ma, S. Soatto, J. Košecká, and S. Sastry, An Invitation to 3-D Vision. Springer, 2004.
[10] E. Malis, F. Chaumette, and S. Boudet, "2-1/2D visual servoing," IEEE Trans. Robot. Autom., vol. 15, no. 2, pp. 238–250, 1999.
[11] C. Taylor and J. Ostrowski, "Robust vision-based pose control," in Proc. IEEE Int. Conf. Robotics and Automation, 2000, pp. 2734–2740.
[12] Y. Fang, D. Dawson, W. Dixon, and M. de Queiroz, "Homography-based visual servoing of wheeled mobile robots," in Proc. IEEE Conf. on Decision and Control, 2002, pp. 2866–2871.
[13] Y. Fang, W. E. Dixon, D. M. Dawson, and P. Chawda, "Homography-based visual servo regulation of mobile robots," IEEE Trans. Syst., Man, Cybern., vol. 35, no. 5, pp. 1041–1050, 2005.
[14] Z. Zhang and A. Hanson, "3D reconstruction based on homography mapping," in Proc. ARPA Image Understanding Workshop, Palm Springs, CA, 1996.
[15] N. Daucher, M. Dhome, J. Lapreste, and G. Rives, "Speed command of a robotic system by monocular pose estimate," in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, 1997, pp. 55–43.
[16] C. Baillard and A. Zisserman, "Automatic reconstruction of planar models from multiple views," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1999, pp. 559–565.
[17] E. Malis and F. Chaumette, "2 1/2D visual servoing with respect to unknown objects through a new estimation scheme of camera displacement," Int. J. Computer Vision, vol. 37, no. 1, pp. 79–97, 2000.
[18] K. Okada, S. Kagami, M. Inaba, and H. Inoue, "Plane segment finder: algorithm, implementation and applications," in Proc. IEEE Int. Conf. Robotics and Automation, 2001, pp. 2120–2125.
[19] V. Chitrakaran, D. M. Dawson, W. E. Dixon, and J. Chen, "Identification of a moving object's velocity with a fixed camera," Automatica, vol. 41, no. 3, pp. 553–562, 2005.
[20] D. H. Titterton and J. L. Weston, Strapdown Inertial Navigation Technology, 2nd ed. Institution of Engineering and Technology, 2004.
[21] H. Rauch, F. Tung, and C. Striebel, "Maximum likelihood estimates of linear dynamic systems," AIAA Journal, vol. 3, no. 8, pp. 1445–1450, 1965.

APPENDIX
KALMAN FILTER MATRICES

The state transition matrix F_k advances the pose states by the velocity states over one frame interval ∆t. In block form,

F_k = \begin{bmatrix} I_6 & B(r, p)\,\Delta t \\ 0_{6 \times 6} & I_6 \end{bmatrix}, \qquad
B(r, p) = \begin{bmatrix} I_3 & 0_{3 \times 3} \\ 0_{3 \times 3} & J(r, p) \end{bmatrix}, \qquad
J(r, p) = \begin{bmatrix} \cos r & -\sin r & 0 \\ \sin r \sec p & \cos r \sec p & 0 \\ \sin r \tan p & \cos r \tan p & 1 \end{bmatrix}.

The process covariance matrix Q_k follows a random walk model that couples each pose state with its corresponding velocity state. For i ∈ {x, y, z, p, w, r} (the first six states), the nonzero entries are

Q_k(i, i) = \frac{(\sigma_i \Delta t)^3}{3}, \qquad Q_k(i, i+6) = Q_k(i+6, i) = \frac{(\sigma_i \Delta t)^2}{2}, \qquad Q_k(i+6, i+6) = \sigma_i \Delta t,

and all other entries are zero.
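As a worked illustration of this block structure, the following sketch (our own construction; the σ values are illustrative placeholders, not taken from the paper) assembles Qk for the random-walk model above.

```python
import numpy as np

def build_Q(sigmas, dt):
    """Process covariance Qk (12 x 12) with the random-walk structure
    above: sigmas = [sx, sy, sz, sp, sw, sr], dt = time between frames."""
    Q = np.zeros((12, 12))
    for i, s in enumerate(sigmas):
        a = s * dt
        Q[i, i] = a**3 / 3.0                      # pose-state variance
        Q[i, i + 6] = Q[i + 6, i] = a**2 / 2.0    # pose/velocity coupling
        Q[i + 6, i + 6] = a                       # velocity-state variance
    return Q

# Example for the paper's 30 frames/second camera.
Q = build_Q([0.1, 0.1, 0.1, 0.05, 0.05, 0.05], dt=1.0 / 30.0)
```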