


Monocular Vision based Autonomous Navigation for a Cost-Effective MAV in GPS-denied Environments

Inkyu Sa, Hu He, Van Huynh, Peter Corke

Abstract—In this paper, we present a monocular vision based autonomous navigation system for Micro Aerial Vehicles (MAVs) in GPS-denied environments. The major drawback of monocular systems is that the depth scale of the scene cannot be determined without prior knowledge or other sensors. To address this problem, we minimize a cost function consisting of a drift-free altitude measurement and an up-to-scale position estimate obtained using the visual sensor. We evaluate the scale estimator, state estimator and controller performance by comparing with ground truth data acquired using a motion capture system. All resources including source code, tutorial documentation and system models are available online4.

I. INTRODUCTION

In recent years, interest in aerial robotics has grown considering their broad range of applications in civil and military domains, such as aerial transportation, manipulation and surveillance. In particular, quadrotor micro-aerial vehicles (MAVs) offer a flexible and adaptable platform amenable to aerial research due to their small size and high maneuverability. Thus, a remarkably large amount of work has been presented for aerial robotic research using quadrotor platforms [1] [2] [3] [4] [5]. However, most of this work either utilizes expensive platforms or platforms whose operating software is closed, which makes it difficult for other researchers to replicate or extend the work. Therefore, we are motivated [6] to provide open source software for low-cost, out-of-the-box quadrotor platforms (e.g., the AR.Drone) to achieve the fundamental abilities of hovering and way point following, on which higher-level applications (e.g., manipulation, navigation and exploration) can be built.

In this paper, we choose a Parrot AR.Drone as the experimental platform. Using the sensors on this platform, we apply a key-frame based visual SLAM system (i.e., PTAM [7]) to provide 6 DOF pose estimates. Since PTAM only provides up-to-scale position in Euclidean space, which cannot be used directly for autonomous control, we estimate the metric scale with the assistance of sonar measurements. A Kalman Filter (KF) is then implemented for state estimation of the AR.Drone, allowing direct generation of control commands for autonomous manoeuvres through a set of PID controllers. Indoor and outdoor experiments are undertaken to demonstrate the performance of our system. A motion capture system (Vicon) is used for ground truth in the indoor case to quantitatively evaluate the system performance.

The authors are with the CyPhy Laboratory, School of Electrical Engineering and Computer Science, Queensland University of Technology, Australia. [email protected], [email protected], thvan [email protected], [email protected]

Fig. 1. AR.Drone platform. It has a front and a downward camera. The ultrasonic sensor estimates altitude. Red, green and blue denote the x, y and z axes of the body frame {B} used in this paper. φ, θ and ψ are rotations about the x, y and z axes respectively, following the right-hand rule. Accurate lateral velocity estimation based on aerodynamics and optical flow is a key feature of this product, allowing stable flight.

The paper is organized as follows. Section I-A addresses related work. Section II details the proposed approach, defining the cost function and both the state and scale estimators. In Section III, we present experimental results and evaluate performance with respect to Vicon-based ground truth. Lastly, conclusions and further discussion are provided in Section IV.

A. Related work

Unmanned aerial vehicles (UAVs) are getting smaller and lighter with an increase in capabilities, allowing them to fly for longer periods of time with larger payloads. This can be attributed, in part, to considerable growth in sensor and integrated circuit technology. As a result, micro aerial vehicles (MAVs) have emerged and prompted further research and development activities. Ascending Technologies1 has been developing a wide range of UAVs for commercial and research purposes. Although these platforms can provide scientifically rich information, such as inertial measurements in SI units and high update rates, their high cost presents a barrier for both amateur hobbyists and researchers. MikroKopter2 has also been developing commercial grade UAVs that are suitable for broadcasting or novel filming systems. Even though they are moderately priced, open source and have GPS based autonomous position holding, the lack of academically oriented documentation has made it difficult for researchers to access resources [8] [9]. Recently, Parrot3 has released high-tech recreational UAVs that are capable of indoor and outdoor navigation by utilising low-cost inertial measurement units (IMUs), vision systems and aerodynamic models [10]. The company has also disclosed software development toolkits and academically oriented documentation for both developers and researchers. We are now able to access a variety of sensor data from this platform, including 3-axis accelerations, 3-axis gyros, 3-axis magnetometers, 3-axis angles, barometer, sonar height measurements and images from the front and down-looking cameras. Additionally, this low-cost platform (approximately USD 300) is economical and easy to equip with additional tools available from local retail stores. Some institutes have begun to use this platform for research purposes in the past couple of years [11] [12] [13].

1 Ascending Technologies: http://www.asctec.de
2 MikroKopter: http://www.mikrokopter.de
3 Parrot SA.: http://ardrone.parrot.com

In order to control indoor and outdoor flight, MAVs must estimate their current state. GPS based systems have shown stable performance for outdoor navigation and control [14]. For indoor navigation, without GPS, MAVs must constantly observe the surrounding environment to estimate their current location. S. Shen et al. [15] and I. Dryanovski et al. [16] developed platforms for surveillance and exploration of indoor environments. Their UAVs have the capability of autonomous navigation and mapping of multi-floor buildings. However, they require an energy-hungry laser range finder for state estimation, which in turn compromises flying time and requires a larger payload. Mellinger et al. [17] [3] have suggested designing dynamically feasible trajectories to guide quadrotors. A controller-parametrised fragment of trajectories is generated and then incrementally refined through successive experimental trials to compensate for noise in the actuators and IMU sensors. These experimental results showed potential for flying aerial robots, but required a motion capture system which can provide 6 degrees of freedom estimates with sub-millimeter accuracy at 100 Hz in a limited space.

Of greater value would be onboard systems using only lightweight, low-cost sensors, such that smaller platforms can be controlled indoors and outdoors without auxiliary equipment. Cameras are a natural choice but pose a number of challenges. Importantly, the scale cannot be determined without the aid of other sensors or prior knowledge such as known metric information about the scene. Various researchers have used RGB-D cameras such as the Kinect [18] and stereo vision [19] to simultaneously resolve the scale estimation problem and provide on-board state estimation for MAVs. Processing RGB-D data and stereo images requires intensive computational power, and the flying robot must carry a heavier computer; as a result, flying time is dramatically reduced. A monocular camera based approach provides an alternative that addresses this drawback. G. Nutzi et al. [20] presented scale estimation by employing spline fitting with least squares optimization and an Extended Kalman Filter (EKF) framework to fuse IMU and vision data. In their work, only simulated results for the least squares optimization using artificial noise are presented. Unfortunately, acceleration sensors cannot be modelled by simply adding white Gaussian noise since they typically contain bias. This can be attributed to highly nonlinear factors, temperature, airframe vibration, misalignment and scale factors. Additionally, MATLAB was used, which has limitations when running in real time.

In contrast to this, we propose scale and state estimation by merging an unbiased, drift-free metric altitude measurement with forward camera measurements in a real time least squares solver. Similar work was presented by J. Engel et al. [12], which showed computationally efficient closed form scale calculation and state estimation using an EKF. Although their work is similar to our proposed approach, it is difficult to argue that the scale and state estimates are accurate when comparing only against manually measured ground truth; moreover, it is even more challenging to measure while flying. As such, we evaluate the performance of our implementation on the AR.Drone platform with respect to Vicon ground truth, which is missing in [12]. We achieve competitive results and return knowledge to the community by releasing publicly available source code that will allow serious amateurs and researchers to access and reproduce the same results for their own purposes. A video demonstration is available online at www.youtube.com/watch?v=zvNhFcbvXSI

II. METHODOLOGY

In this section, we state the key parts of the proposed system (Fig. 5). Specifically, we describe the monocular simultaneous localisation and mapping (SLAM) system, including camera pose tracking, scale estimation and state estimation. For camera pose tracking we use PTAM and introduce a metric scale estimation scheme to transform the up-to-scale position into metric form. The metric horizontal positions x and y from the vision system are then fused with velocities vx and vy from the AR.Drone's internal implementation in a KF framework for state estimation. Lastly, system identification is used in conjunction with the SLAM system to derive PID controllers for position in x, y, z and orientation in yaw ψ.

A. Monocular SLAM System

As described in [12], we also use the PTAM system for camera pose tracking. After take-off, an automatic initialisation is performed, allowing PTAM to generate an initial map. In order to achieve a successful initialisation, several salient features are required in the field of view. Following initialisation, the map and camera poses are updated continually. The camera yaw orientation estimated by PTAM is more robust than that from the IMU sensor. As shown in Fig. 2, the IMU based yaw estimation exhibits significant drift after approximately 40 s, unlike the PTAM estimates. Quantitative yaw estimation performance between PTAM and ground truth can be found in Fig. 8. However, using PTAM means the camera position is only up-to-scale with respect to the PTAM frame and therefore requires a scale estimate to recover the actual scene scale.

B. Scale Estimation

Fig. 2. Yaw angle comparison between PTAM and IMU. Yaw angle estimation from the IMU clearly shows rapid drift, unlike PTAM. The quadrotor lands at approximately 110 s.

Fig. 3. Scale estimation for PTAM using z displacement. The dashed line denotes the measured z from PTAM. The thin solid line is the measured z from the sonar sensor. The bold solid line is the metric form of z from PTAM, corrected using the estimated scale λ. The scale is accurately recovered since the actual z and the scale-corrected estimate are close.

In order to control the quadrotor using pose estimates from PTAM, the metric scale is required. This scale is used to convert PTAM-based poses into metric form. The perceived vertical displacement between times t_k and t_{k+1} for the forward facing camera and the sonar should be the same, as both are rigidly attached to the quadrotor. We project the sonar reading onto the z axis using the IMU pitch and roll angle estimates, so that sonar data from a tilted MAV can be used for the optimisation. We assume the ground is even and flat during flight.
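The projection itself is not written out here; under the rigid-mounting and flat-ground assumptions above, a standard form would be △z_k^{sonar} ≈ d_{k+1} cos(φ_{k+1}) cos(θ_{k+1}) − d_k cos(φ_k) cos(θ_k), where d_k is the raw sonar range and φ_k, θ_k are the IMU roll and pitch estimates at time t_k.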

Let △z_k^{sonar} and △z_k^{camera} denote the change in vertical distance measured by the sonar and by the camera between times t_k and t_{k+1}, respectively. It should follow that △z_k^{sonar} ≈ λ △z_k^{camera}, where λ ∈ R denotes the unknown scale factor. To estimate the scale factor and compensate for the error introduced by sensor noise, we minimise the residual-error cost function

λ* := argmin_λ ∑_{k=1}^{T} ‖△z_k^{sonar} − λ △z_k^{camera}‖²   (1)

where T denotes the number of samples used for the optimization. T grows as more samples are obtained, so we apply a simple rule to retrieve more distinctive samples: if the vehicle moves faster than ±7.5 cm/s along the z axis, the sample is added to a vector list for optimization. This speed threshold was chosen empirically within a reasonable bound.

As the cost function has quadratic form, it can easily be solved by the Levenberg-Marquardt algorithm. Fig. 3 illustrates that the proposed scale estimation method can accurately convert the up-to-scale pose from PTAM into metric form.
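As a concrete illustration, a minimal off-line sketch of this fit in Python is given below. The function and variable names are our own, scipy.optimize.least_squares stands in for the Levenberg-Marquardt solver of the released implementation, and, because (1) is quadratic in λ, the closed-form least-squares solution is computed alongside for comparison.

    import numpy as np
    from scipy.optimize import least_squares

    def estimate_scale(dz_sonar, dz_camera, vz, v_thresh=0.075):
        """Fit the metric scale lambda from paired vertical displacements, Eq. (1).

        dz_sonar  : per-interval vertical displacement from the tilt-corrected sonar [m]
        dz_camera : per-interval vertical displacement from PTAM [up-to-scale units]
        vz        : vertical speed for each interval [m/s], used for sample selection
        v_thresh  : keep only intervals with |vz| > 7.5 cm/s, following the paper's rule
        """
        mask = np.abs(np.asarray(vz)) > v_thresh
        s = np.asarray(dz_sonar)[mask]
        c = np.asarray(dz_camera)[mask]

        # Residuals of Eq. (1), minimised with a Levenberg-Marquardt-style solver.
        lam_lm = least_squares(lambda lam: s - lam[0] * c, x0=[1.0], method='lm').x[0]

        # The cost is quadratic in lambda, so the closed-form fit agrees with the solver.
        lam_closed = float(np.dot(s, c) / np.dot(c, c))
        return lam_lm, lam_closed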

C. State Estimation

A Kalman Filter is employed to track four states: lateral position and velocity. All states are observable and a linear (constant-velocity) process model is used. All states are expressed w.r.t. the world coordinate frame.

1) States:

X_k = [ x_k  y_k  ẋ_k  ẏ_k ]^T   (2)

2) Process model:

x_{k+1} = x_k + δ (cos(ψ_k) ẋ_k − sin(ψ_k) ẏ_k)
y_{k+1} = y_k + δ (sin(ψ_k) ẋ_k + cos(ψ_k) ẏ_k)
ẋ_{k+1} = ẋ_k
ẏ_{k+1} = ẏ_k

where δ is the sample period of the Kalman Filter, which runs at 160 Hz, and ψ_k is the drift-free yaw angle estimate from SLAM.

3) Measurement model:

H = I_4   (3)
z_k = H X_k   (4)

where z_k is a 4×1 position and velocity measurement vector. Position is obtained from PTAM with scale estimation at 18 Hz, and velocity comes from the AR.Drone onboard estimator at 160 Hz. We choose the process noise Q and measurement noise R such that

w_pos = 0.1 m, w_vel = 0.3 m/s
v_pos = 0.1 m, v_vel = 0.05 m/s

The remaining steps, prediction and update, follow the normal discrete Kalman Filter procedure. More details can be found in the technical documentation4. Although it is also possible to estimate states using more sophisticated filters such as the Extended Kalman Filter [12] or a Particle Filter, the AR.Drone is already capable of navigation using an onboard control system that relies on an aerodynamics model, IMU and computer vision. To the best of our knowledge, it is sufficient to use a computationally efficient Kalman Filter for state estimation in this case.
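For illustration, a minimal sketch of this filter is given below. The matrix shapes follow Eqs. (2)-(4); the conversion of the quoted noise magnitudes into covariances, and all names, are our own assumptions rather than details of the released code.

    import numpy as np

    DT = 1.0 / 160.0                          # filter period (delta), 160 Hz

    # Quoted noise magnitudes, squared into diagonal covariances for this sketch.
    Q = np.diag([0.1, 0.1, 0.3, 0.3]) ** 2    # process noise (pos, pos, vel, vel)
    R = np.diag([0.1, 0.1, 0.05, 0.05]) ** 2  # measurement noise
    H = np.eye(4)                             # Eq. (3): every state is measured directly

    def predict(x, P, yaw):
        """Propagate [x, y, vx, vy]; body velocities are rotated by yaw (process model)."""
        c, s = np.cos(yaw), np.sin(yaw)
        F = np.array([[1, 0,  DT * c, -DT * s],
                      [0, 1,  DT * s,  DT * c],
                      [0, 0,  1,       0     ],
                      [0, 0,  0,       1     ]])
        return F @ x, F @ P @ F.T + Q

    def update(x, P, z):
        """Standard discrete KF update; z = [x_PTAM, y_PTAM, vx_drone, vy_drone]."""
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        return x + K @ (z - H @ x), (np.eye(4) - K @ H) @ P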

D. Control

First, system identification was performed to determine the system dynamic models. This allows researchers to purchase a factory-calibrated, uniform platform from local retail stores and apply these dynamic models directly. Fig. 4 shows an example of height system identification, resulting in the 3rd order transfer function

H(s) = 5.74 / (s^3 + 2.444 s^2 + 10.06 s + 3.009)   (5)

More details and system identification models for the x, y and ψ degrees of freedom can be found in the technical documentation4.
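As a small sketch of how the identified model (5) can be exercised, the following Python snippet simulates its step response with scipy.signal; the original identification used MATLAB, so this is only an equivalent continuous-time model, not the authors' tooling.

    import numpy as np
    from scipy import signal

    # Identified height dynamics, Eq. (5): H(s) = 5.74 / (s^3 + 2.444 s^2 + 10.06 s + 3.009)
    height_model = signal.TransferFunction([5.74], [1.0, 2.444, 10.06, 3.009])

    # Step response over 12 s, roughly the time span shown in Fig. 4.
    t = np.linspace(0.0, 12.0, 500)
    t, y = signal.step(height_model, T=t)
    print("response approaches the DC gain 5.74/3.009 = %.2f, y(12 s) = %.2f" % (5.74 / 3.009, y[-1]))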

Second, a set of traditional PID controllers for the x, y, z position and yaw orientation ψ are derived. Inputs to the controllers are position or angle errors with reference to a goal state. Outputs are the desired roll φ, pitch θ and yaw ψ angles and the desired z velocity. The controller gains are tuned using the dynamic models derived during system identification. An onboard attitude controller then stabilizes the roll φ and pitch θ angles accordingly.
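The sketch below shows one plausible shape for such a position loop; the gains and limits are placeholders for illustration only, not the tuned values used on the platform.

    class PID:
        """Simple PID loop; the output is a desired attitude or velocity command."""
        def __init__(self, kp, ki, kd, limit):
            self.kp, self.ki, self.kd, self.limit = kp, ki, kd, limit
            self.integral, self.prev_error = 0.0, 0.0

        def step(self, error, dt):
            self.integral += error * dt
            derivative = (error - self.prev_error) / dt
            self.prev_error = error
            u = self.kp * error + self.ki * self.integral + self.kd * derivative
            return max(-self.limit, min(self.limit, u))    # saturate the command

    # One controller per controlled degree of freedom, evaluated at 160 Hz.
    pid_x, pid_y = PID(0.5, 0.0, 0.3, 0.3), PID(0.5, 0.0, 0.3, 0.3)    # -> pitch, roll set-points
    pid_z, pid_yaw = PID(1.0, 0.1, 0.0, 0.5), PID(1.0, 0.0, 0.0, 0.5)  # -> z velocity, yaw command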


Fig. 4. The dotted line shows the measured altitude. The solid line shows the estimated altitude using the dynamic model. We use a MATLAB continuous-time transfer function model to estimate the model, which achieves an 87.52% fit.

[Fig. 5 diagram: the AR.Drone platform (IMU @160Hz, Image @18Hz, Sonar @20Hz) communicates over a wireless LAN with an offboard computer running the Autonomy driver, vSLAM (Parallel Tracking and Mapping @18Hz), Scale estimator (LM nonlinear solver), Pose estimator (Kalman Filter @160Hz) and PID controllers (x, y, z, yaw @160Hz).]

Fig. 5. System overview. The AR.Drone and an offboard computer communicate through a WiFi network. Each box denotes a ROS node running on the remote computer.

III. EXPERIMENTS AND RESULTS

In this section, we present experimental results for a hovering quadrotor. The quadrotor is maintained in a hover attitude using the proposed controller and state estimator. Performance is evaluated with respect to ground truth.

A. Implementation

All software is implemented on the ROS platform and runs in real time. Fig. 5 shows the overall system architecture. The Autonomy driver node retrieves data from, and sends commands to, the MAV. vSLAM is a key-frame based front-end SLAM solution that provides up-to-scale position and orientation. The Scale estimator runs the Levenberg-Marquardt optimization on the observed data to estimate the scale at 18 Hz. The Pose estimator is an implementation of the Kalman Filter algorithm. Four PID controllers, one for each of x, y, z and ψ, close the loop in the PID controllers node. All software is available online4.
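For orientation, a skeletal rospy node along these lines could tie the pieces together; the topic names follow common ardrone_autonomy conventions and are assumptions on our part, not details taken from the paper.

    #!/usr/bin/env python
    import rospy
    from geometry_msgs.msg import Twist
    from sensor_msgs.msg import Image

    class HoverNode(object):
        def __init__(self):
            # Topic names assumed from typical ardrone_autonomy setups.
            rospy.Subscriber('/ardrone/front/image_raw', Image, self.on_image)
            self.cmd_pub = rospy.Publisher('/cmd_vel', Twist, queue_size=1)
            rospy.Timer(rospy.Duration(1.0 / 160.0), self.on_control)   # 160 Hz control loop

        def on_image(self, msg):
            pass    # frames would be forwarded to the vSLAM front end (PTAM) here

        def on_control(self, event):
            cmd = Twist()   # roll/pitch/yaw/z-velocity set-points filled by the PID loops
            self.cmd_pub.publish(cmd)

    if __name__ == '__main__':
        rospy.init_node('hover_controller')
        HoverNode()
        rospy.spin()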

B. Controller Performance Evaluation

To evaluate the controllers' performance, the MAV is flown within an indoor motion capture system. Since this system can provide sub-millimetre accuracy position estimation at 100 Hz, we regard it as ground truth. Fig. 6 shows the reference or goal position (x=0, y=0, z=0.8) and the quadrotor states measured using the Vicon. The robot lost its wireless connection at 30 s, introducing an unstable manoeuvre. After the network recovered, the platform was able to recover and stabilise its state. Fig. 9 depicts the quadrotor position, and thus controller performance, in different environments. The MAV is able to hold its position more accurately in indoor environments. Outdoor flying is more challenging due to added wind disturbances and changing illumination conditions, which cause increased feature tracking error in PTAM and in the scale estimation. Table I shows the root mean squared error of the Euclidean distance in 3D space between the goal and the MAV.

4 https://code.google.com/p/ardrone-qut-cyphy

Fig. 6. Controller performance evaluation with ground truth data while hovering. Note that the top two plots and the bottom plot have different scales on the vertical axis. The x and y states are initially non-zero because 0 seconds corresponds to the epoch when vSLAM is initialized after taking off.

TABLE I. Controller performance: time period = 0 s ∼ 90 s.
RMSE x,y,z (m): 0.0386
RMSE z (m): 0.0057

C. Estimator Accuracy Performance Evaluation

The accuracy of the state estimator is evaluated against recorded Vicon data taken as ground truth. Fig. 8 and Fig. 7 display the state estimator output and the corresponding error with respect to ground truth. Table II summarizes the performance of the state estimator. Note that state estimation along the x and y axes comes from the Kalman Filter, whilst z, φ and θ are from AR.Drone onboard measurements. Key-frame based vSLAM is able to provide drift-free yaw angle ψ estimation. As shown in Fig. 8, vSLAM observes more features in order to create a detailed 3D map, which leads to a gradual reduction in the estimated yaw angle error. The AR.Drone uses a complementary filter to fuse gyro measurements for attitude estimation, φ, θ and ψ. Even though φ and θ can be compensated by using unbiased acceleration measurements as a reference, ψ is infeasible to correct without an exteroceptive sensor such as a visual sensor or magnetometer.

Fig. 7. The top two plots show the translational state estimation and the third plot shows the raw ultrasonic height measurement, each with ground truth data. The bottom three plots show the corresponding errors while hovering. Note that the wireless connection is lost between 30 s and 35 s, introducing divergence in the state estimation. Although this affects control performance and state estimation, the quadrotor recovers and the state tracking improves after re-establishing the connection.

TABLE II. State estimation accuracy results: time period = 0 s ∼ 90 s.
RMSE x (m) = 0.0208, RMSE y (m) = 0.0748, RMSE z (m) = 0.0308
RMSE φ (°) = 0.8875, RMSE θ (°) = 0.6054, RMSE ψ (°) = 0.7529

IV. CONCLUSIONS

In this paper, we have presented an autonomous monocular vision-based navigation system that can be used for an

MAV in GPS-denied environments. A major challenge for systems with monocular vision is estimating the scale of the scene. We propose a scale estimator that optimizes a cost function based on altitude measurements and forward facing visual measurements. By comparing the state estimation performance with ground truth, we are able to confirm that the scale estimator produces accurate results. We show flight demonstrations in a variety of environments to highlight the robustness and stability of our system. Open-source, self-explanatory technical documentation along with the derived system dynamic models distill our knowledge about the AR.Drone and can help other researchers to reproduce the same results. This in turn can be built upon to improve performance and extend the platform capability.

The feature tracking of the PTAM system used in this paper was fragile and dependent on the camera field of view (FOV). With a fairly narrow FOV camera, PTAM tracking can be disrupted when the yaw angle changes rapidly by more than 90°/s. Additionally, we found ambiguity in the forward looking camera when insufficient translation caused inaccurate 3D maps and position estimates.

Fig. 8. The top two plots show the orientation obtained from the onboard state estimators of the AR.Drone and the third plot shows the drift-free yaw estimation from PTAM. The bottom three plots show the corresponding errors while hovering. Roll and pitch angle estimation is accurate and yaw angle estimation improves as more features are observed.

To improve this, an IMU-aided vSLAM system could distinguish between rotational and translational motion.

Currently we have two ongoing plans: autonomous exploration in large GPS-denied environments, and investigation into a long distance transport service using MAVs.

ACKNOWLEDGMENT

The authors would like to thank A. Mcfadyen and L. Mejias for assistance while using the Vicon system at the Australian Research Centre for Aerospace Automation (ARCAA). We also gratefully acknowledge Mcfadyen's valuable comments and suggestions.

Fig. 9. Top and bottom rows show indoor and outdoor experiments respectively. The first column shows images of the environment where the MAV is flown. The second column shows the 3D point reconstruction and position estimation from vSLAM. The third column shows the corresponding images from the MAV's forward camera with feature extraction. The last column shows the top-view x, y position estimation while hovering. These clearly show that manoeuvring outdoors is more challenging than in indoor environments due to wind gusts and dynamic features.

Fig. 10. Way point following results. Top left shows a side view of the trajectory for evaluating the z controller while following way points. Top right is the top view, which clearly shows that the scale estimation is accurate. The bottom perspective view presents a qualitative evaluation of way point following.

REFERENCES

[1] S. M. Weiss, Vision based navigation for micro helicopters. PhD thesis, ETHZ, 2012.
[2] R. Mahony, V. Kumar, and P. Corke, "Multirotor Aerial Vehicles: Modeling, Estimation, and Control of Quadrotor," IEEE Robotics & Automation Magazine, vol. 19, pp. 20-32, Sept. 2012.
[3] D. Mellinger and V. Kumar, "Minimum Snap Trajectory Generation and Control for Quadrotors," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 2520-2525, May 2011.
[4] P. Pounds and R. Mahony, "Design principles of large quadrotors for practical applications," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 3265-3270, 2009.
[5] H. Lim, H. Lee, and H. Kim, "Onboard flight control of a micro quadrotor using single strapdown optical flow sensor," in Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, pp. 495-500, Oct. 2012.

[6] P. I. Corke, Robotics, Vision & Control: Fundamental Algorithms in MATLAB. Springer, 2011. ISBN 978-3-642-20143-1.
[7] G. Klein, Visual Tracking for Augmented Reality. PhD thesis, University of Cambridge, 2006.
[8] I. Sa and P. Corke, "Vertical Infrastructure Inspection using a Quadcopter and Shared Autonomy Control," in the International Conference on Field and Service Robotics (FSR), July 2012.
[9] I. Sa and P. Corke, "System Identification, Estimation and Control for a Cost Effective Open-Source Quadcopter," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 2202-2209, May 2012.
[10] P.-J. Bristeau, F. Callou, D. Vissiere, and N. Petit, "The Navigation and Control technology inside the AR.Drone micro UAV," in 18th IFAC World Congress, pp. 1477-1484, 2011.
[11] C. Bills, J. Chen, and A. Saxena, "Autonomous MAV flight in indoor environments using single image perspective cues," in Robotics and Automation (ICRA), 2011 IEEE International Conference on, pp. 5776-5783, May 2011.
[12] J. Engel, J. Sturm, and D. Cremers, "Camera-Based Navigation of a Low-Cost Quadrocopter," in Proceedings of IEEE/RSJ Conference on Intelligent Robots and Systems (IROS), pp. 2815-2821, Oct. 2012.
[13] N. Dijkshoorn and A. Visser, "Integrating Sensor and Motion Models to Localize an Autonomous AR.Drone," International Journal of Micro Air Vehicles, vol. 3, pp. 183-200, Dec. 2011.
[14] G. M. Hoffmann, Autonomy for Sensor-Rich Vehicles: Interaction between Sensing and Control Actions. PhD thesis, Stanford Univ., 2008.
[15] S. Shen, N. Michael, and V. Kumar, "Autonomous Multi-Floor Indoor Navigation with a Computationally Constrained MAV," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 20-25, May 2011.
[16] I. Dryanovski, W. Morris, and J. Xiao, "An Open-Source Pose Estimation System for Micro-Air Vehicles," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 4449-4454, May 2011.
[17] D. Mellinger, A. Kushleyev, and V. Kumar, "Mixed-Integer Quadratic Program Trajectory Generation for Heterogeneous Quadrotor Teams," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 477-483, May 2012.
[18] Huang, Bachrach, Henry, Krainin, Maturana, Fox, and Roy, "Visual Odometry and Mapping for Autonomous Flight using an RGB-D Camera," in Int. Symposium on Robotics Research (ISRR), Aug. 2011.
[19] G. H. Lee, F. Fraundorfer, and M. Pollefeys, "MAV Visual SLAM with Plane Constraint," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 3139-3144, May 2011.
[20] G. Nuetzi, S. Weiss, D. Scaramuzza, and R. Siegwart, "Fusion of IMU and Vision for Absolute Scale Estimation in Monocular SLAM," Journal of Intelligent and Robotic Systems, vol. 61, pp. 287-299, 2011.