
1002 IEEE TRANSACTIONS ON ROBOTICS, VOL. 24, NO. 5, OCTOBER 2008

Efficient View-Based SLAM Using Visual Loop Closures

Ian Mahon, Stefan B. Williams, Member, IEEE, Oscar Pizarro, Member, IEEE, and Matthew Johnson-Roberson, Student Member, IEEE

Abstract—This paper presents a simultaneous localization and mapping algorithm suitable for large-scale visual navigation. The estimation process is based on the viewpoint augmented navigation (VAN) framework using an extended information filter. Cholesky factorization modifications are used to maintain a factor of the VAN information matrix, enabling efficient recovery of state estimates and covariances. The algorithm is demonstrated using data acquired by an autonomous underwater vehicle performing a visual survey of sponge beds. Loop-closure observations produced by a stereo vision system are used to correct the estimated vehicle trajectory produced by dead reckoning sensors.

Index Terms—Autonomous underwater vehicle (AUV) navigation, Cholesky factorization, extended information filter (EIF), simultaneous localization and mapping (SLAM).

I. INTRODUCTION

SIMULTANEOUS localization and mapping (SLAM) has been widely used to estimate the position of a robot in an initially unknown environment. In the original formulation [1]–[3], the state of the robot and the position of a set of features extracted from observations of the environment are jointly estimated using an extended Kalman filter (EKF). The complexity of updating the filter after acquiring an observation is quadratic in the number of estimated features, resulting in a large research effort to produce more scalable SLAM methods. Examples include partitioned updates [4] and submapping techniques [5]–[7].

Recently, there has been increasing interest in SLAM algorithms using an extended information filter (EIF), in which an observation update can be performed in constant time. Sparsification approximations can be used to ignore many near-zero elements of the information matrix in feature-based SLAM approaches [8], [9], while the information matrix is exactly sparse when past vehicle poses are maintained by the filter, such as in the viewpoint augmented navigation (VAN) framework [9]–[11]. Exploiting the sparsity of the information matrix can reduce both the computational complexity and memory requirements of the filter.

Manuscript received December 15, 2007; revised July 4, 2008. First published October 14, 2008; current version published October 31, 2008. This paper was recommended for publication by Associate Editor A. Davison and Editor L. Parker upon evaluation of the reviewers' comments. This work was supported in part by the Australian Research Council (ARC) Centre of Excellence Programme and in part by the New South Wales State Government.

The authors are with the ARC Centre of Excellence for Autonomous Systems, Australian Centre for Field Robotics, The University of Sydney, Sydney, NSW 2006, Australia (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).

This paper has supplementary downloadable material available at http://ieeexplore.ieee.org provided by the authors. This material includes two movies showing the evolution of the estimated vehicle trajectory and a 3-D seafloor reconstruction. The size of the material is 12.5 MB.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TRO.2008.2004888

A related approach is the smoothing and mapping (SAM) framework [12], [13], which estimates the states of a set of features and a history of robot poses. Unlike EKF or EIF approaches in which linearization errors are permanently incorporated into the filter, the SAM algorithm can perform an iterative least-squares optimization process to converge to an optimal state estimate. The information matrix in the normal equations solved during each iteration possesses a similar sparsity structure to that of the VAN framework.

The main difficulty with information-form SLAM algorithms is the recovery of state estimates and covariances. State estimates are required in the EIF prediction, observation, and update operations, while state covariances are required for data association or loop-closure hypothesis generation. Efficient state estimate and covariance recovery is the main focus of this paper.

In a previous VAN implementation [9]–[11], state estimates and covariances were recovered using a Cholesky factor of the information matrix that was recalculated each time an image was acquired. Using Cholesky factorization modifications to keep a factor up-to-date in a SAM application was previously proposed, but not implemented due to the complexity of the algorithms when applied to sparse matrices [12], [13]. In this paper, the use of Cholesky factorization modifications in the VAN framework is investigated, utilizing a recently developed implementation [14].

In parallel to the paper presented here, an incremental SAM approach has been developed [15], [16], in which a QR factorization of the SAM measurement Jacobian is updated using Givens rotations. The two approaches are closely related, since the upper triangular matrix R in a QR factorization of the SAM measurement Jacobian is a Cholesky factor of the information matrix [13].

This paper is organized as follows. Section II provides a justification for using the VAN framework for visual navigation applications. Section III summarizes the information-form VAN filtering process. Section IV describes the Cholesky factorization process and the modifications used to maintain a factor of the VAN information matrix. Section V describes state estimate recovery methods. Section VI describes state covariance recovery methods. Section VII outlines the process to generate loop-closure hypotheses. Section VIII presents the results of the efficient VAN algorithm applied to data acquired by an autonomous underwater vehicle (AUV). Finally, Section IX provides concluding remarks.



II. SLAM FRAMEWORKS AND VISUAL NAVIGATION

Two main SLAM frameworks have been proposed: feature-based and view-based algorithms. In feature-based SLAM [1]–[8], the positions of features are estimated, and a loop closure is performed by observing a previously initialized feature. In view-based SLAM [9]–[11], [17], a set of vehicle poses at locations where sensor data was acquired is estimated. A loop closure is performed by registering two sets of sensor data to produce an observation of the relative pose between the vehicle locations where the data was acquired.

A disadvantage of the view-based method is the need to find pairs of previously unused sensor data to construct independent loop-closure observations. Two relative pose measurements created using common feature observations will be correlated, and ignoring these correlations will cause the filter to become inconsistent. Applying multiple relative pose observations to the filter simultaneously while considering the correlations is possible; however, it is impractical since loop-closure events involving a single pose may occur at multiple different times. In comparison, the feature-based approach has no such problem, since the filter maintains all correlations and observations can be applied individually.

The feature-based approach has the disadvantages of requiring the filter to estimate the feature states, and of needing to select which features will be used at the time they are first observed. In comparison, the view-based approach has the advantage that the selection of a subset of features used in a loop-closure observation can be delayed until the feature association process is performed.

As a result of these properties, feature-based approaches are more suitable for applications in which a small set of features can reliably be extracted and associated, while pose-based methods are more appropriate for applications in which large numbers of features can be extracted, particularly when it is uncertain which features can be associated in the future.

When evaluating the suitability of each framework for large-scale visual navigation, the properties of visual feature extraction and association algorithms need be considered. A range of wide-baseline approaches suitable for loop-closure situations have been developed [18]–[21]. Association of such features can typically be performed at high precision, but at low recall rates (incorrect feature associations are uncommon; however, the number of associations produced is small) [22], [23].

When used within a feature-based SLAM algorithm, the properties of wide-baseline visual feature extraction and association algorithms result in a difficult feature selection problem. Thousands of features can be extracted from an image; however, few will be matched in a loop-closure situation. Estimating the positions of all features becomes infeasible; however, if only a few are selected, a loop-closure observation becomes unlikely. The ability to use all the sensor data, rather than a sparse set of previously selected features, at a loop-closure event is a critical advantage for view-based SLAM algorithms in vision applications.

An additional benefit of the view-based approach for visual navigation applications is its ability to handle delayed observations. Visual feature extraction and association are time-consuming processes, so a delay is likely to occur between the time an image is acquired and a loop-closure observation is produced. In the view-based framework, a relative pose constraint can be applied between two previously augmented poses whenever the image analysis operations are complete.

Due to avoidance of the feature selection problem, the inherent ability to handle delayed observations, and the efficiency when using the information form, the view-based VAN framework will be utilized in this paper.

III. VAN

A. Estimated State Vector

In the VAN framework, the current vehicle state is estimated along with a selection of past vehicle poses, leading to a state estimate vector of the form

x^+(t_k) = \begin{bmatrix} x^+_{p_1}(t_k) \\ \vdots \\ x^+_{p_n}(t_k) \\ x^+_v(t_k) \end{bmatrix} = \begin{bmatrix} x^+_t(t_k) \\ x^+_v(t_k) \end{bmatrix}    (1)

where x^+_v(t_k) contains the current vehicle states, and x^+_t(t_k) = [x^{+T}_{p_1}(t_k), ..., x^{+T}_{p_n}(t_k)]^T is a vector of trajectory states consisting of n past vehicle vectors.

The covariance matrix has the form

P^+(t_k) = \begin{bmatrix} P^+_{tt}(t_k) & P^+_{tv}(t_k) \\ P^{+T}_{tv}(t_k) & P^+_{vv}(t_k) \end{bmatrix}.    (2)

In the information form, the filter maintains the information matrix Y^+(t_k), which is the inverse of the covariance matrix

Y^+(t_k) = [P^+(t_k)]^{-1}    (3)

and the information vector y^+(t_k), which is related to the state estimate by

y^+(t_k) = Y^+(t_k)\, x^+(t_k).    (4)

The VAN information vector has the form

y^+(t_k) = \begin{bmatrix} y^+_t(t_k) \\ y^+_v(t_k) \end{bmatrix}    (5)

and the information matrix is

Y^+(t_k) = \begin{bmatrix} Y^+_{tt}(t_k) & Y^+_{tv}(t_k) \\ Y^{+T}_{tv}(t_k) & Y^+_{vv}(t_k) \end{bmatrix}.    (6)

B. Estimation Process

The VAN estimation process uses the standard EIF three-step prediction, observation, and update cycle. The vehicle states are assumed to evolve according to a process model of the form

x_v(t_k) = f_v[x_v(t_{k-1}), u(t_k)] + w(t_k)    (7)

in which u(t_k) is a vector of control inputs, and w(t_k) is an error vector from a zero-mean Gaussian distribution with covariance Q(t_k).
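The paper does not spell out the Sirius process model, so the sketch below uses a hypothetical planar constant-velocity model purely to make f_v and its Jacobian concrete for the equations that follow; the state layout and function names are illustrative assumptions, not the vehicle model used in the experiments.

```python
import numpy as np

def f_v(x_v, u, dt=0.1):
    """Hypothetical planar constant-velocity process model standing in
    for f_v in (7). State x_v = [x, y, heading, vx, vy]; the control u is
    unused in this toy model."""
    x, y, psi, vx, vy = x_v
    return np.array([x + vx * dt, y + vy * dt, psi, vx, vy])

def jacobian_f_v(x_v, u, dt=0.1):
    """Jacobian of the toy model above with respect to the vehicle states,
    i.e. the term written as \\nabla_x f_v(t_k) in (8)-(14)."""
    F = np.eye(5)
    F[0, 3] = dt
    F[1, 4] = dt
    return F
```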

When propagating the vehicle states to a new timestep with a prediction operation, a decision on whether or not the current vehicle pose should be kept in the state vector is required. The current vehicle pose should be kept if it marks the location where data that may be used in future loop-closure observations were acquired.

For example, in the AUV application detailed in Section VIII, in which loop-closure observations are produced from stereovision, prediction with augmentation (keeping the current pose) is performed after each stereo image pair is acquired. When propagating the filter forward from the time of a vehicle depth, attitude, or velocity observation, prediction without augmentation is performed.

Prediction with augmentation is performed as in [9], using (8) and (9), given at the end of this section, in which ∇_x f_v(t_k) is the Jacobian of the vehicle model with respect to the vehicle states.

The equations for prediction without augmentation can be obtained by marginalizing the previous pose from the augmented system of (8) and (9). The result is (10) and (11), also given at the end of this section, in which three subterms are defined in (12)–(14):

\delta(t_k) = f_v[x^+_v(t_{k-1}), u(t_k)] - \nabla_x f_v(t_k)\, x^+_v(t_{k-1})    (12)

\Omega(t_k) = Y^+_{vv}(t_{k-1}) + \nabla^T_x f_v(t_k)\, Q^{-1}(t_k)\, \nabla_x f_v(t_k)    (13)

\Psi(t_k) = \left( Q(t_k) + \nabla_x f_v(t_k)\, [Y^+_{vv}(t_{k-1})]^{-1}\, \nabla^T_x f_v(t_k) \right)^{-1}.    (14)

Observations are assumed to be made according to a model of the form

z(t_k) = h[x(t_k)] + v(t_k)    (15)

in which z(t_k) is an observation vector, and v(t_k) is a vector of observation errors with covariance R(t_k). The difference between the actual and predicted observations is the innovation

\nu(t_k) = z(t_k) - h[x^-(t_k)].    (16)

The innovation is used to update the information vector and matrix

y^+(t_k) = y^-(t_k) + i(t_k)    (17)

Y^+(t_k) = Y^-(t_k) + I(t_k)    (18)

in which

i(t_k) = \nabla^T_x h(t_k)\, R^{-1}(t_k) \left( \nu(t_k) + \nabla_x h(t_k)\, x^-(t_k) \right)    (19)

I(t_k) = \nabla^T_x h(t_k)\, R^{-1}(t_k)\, \nabla_x h(t_k)    (20)

where ∇_x h(t_k) is the Jacobian of the observation function with respect to the vehicle states.
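As a concrete illustration of (16)–(20), the following sketch applies a single observation to a dense information vector and matrix; the observation function h, its Jacobian H, and the prior estimate of the observed states are assumed to be supplied by the caller, and sparsity is ignored for clarity.

```python
import numpy as np

def eif_observation_update(y, Y, z, h, H, R_inv, x_prior):
    """One EIF observation update following (16)-(20).

    y, Y    : prior information vector and matrix (dense here for clarity)
    z       : observation vector
    h, H    : observation function and its Jacobian evaluated at x_prior
    R_inv   : inverse observation noise covariance R^{-1}(t_k)
    x_prior : prior estimate of the observed states, recovered beforehand
    """
    nu = z - h(x_prior)                       # innovation (16)
    i = H.T @ R_inv @ (nu + H @ x_prior)      # information vector increment (19)
    I = H.T @ R_inv @ H                       # information matrix increment (20)
    return y + i, Y + I                       # (17) and (18)
```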

The observation and update process is efficient in the information form, since only the elements of the information vector and matrix corresponding to the observed states are modified.

The prediction operation equations (8) and (12) require the prior vehicle pose state estimate, while in the update operation, (16) and (19) require the prior estimates of the observed states. In addition, the prediction operations require the vehicle process model to be linearized at the prior vehicle state estimate, and the update operation requires the observation model to be linearized at the estimate of the observed states. Once the necessary state estimates have been recovered, prediction (with or without augmentation) and observations are constant-time operations independent of the number of estimated poses.

C. Structure of the Information Matrix

Elements of the VAN information matrix off the block diagonal are nonzero only if an observation relating the two corresponding poses has been applied to the filter. Fig. 1(a) shows an example of an information matrix sparsity pattern and Markov graph that results from dead reckoning (DR). Since each pose is related to the previous and next pose through odometry constraints, DR results in a block tridiagonal matrix. The Markov graph provides a visual representation of the relationship between the estimated variables, with an edge in the graph corresponding to a nonzero block in the information matrix.

When loop-closure observations are applied to the filter, additional nonzero elements in the information matrix are created at the locations corresponding to the two observed poses. Fig. 1(c) displays the information matrix resulting from adding loop-closure observations between the first and last two poses.

The sparsity of the information matrix is important for the computational efficiency and storage requirements of the filter. In large-scale applications with many augmented poses, EKF-based approaches are infeasible due to the memory requirements of dense covariance matrices.
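The sparsity pattern described above is easy to reproduce. The sketch below builds a toy information matrix from identity-weighted relative-pose constraints (an arbitrary choice for illustration only) so that the block-tridiagonal odometry structure and the extra off-diagonal block created by a loop closure can be inspected directly.

```python
import numpy as np

def add_relative_pose_constraint(Y, i, j, block=3):
    """Add an identity-weighted relative-pose constraint between poses i
    and j to the dense information matrix Y (toy example)."""
    W = np.eye(block)
    si = slice(i * block, (i + 1) * block)
    sj = slice(j * block, (j + 1) * block)
    Y[si, si] += W
    Y[sj, sj] += W
    Y[si, sj] -= W
    Y[sj, si] -= W

n_poses, block = 6, 3
Y = np.zeros((n_poses * block, n_poses * block))
for k in range(n_poses - 1):                      # odometry: block tridiagonal
    add_relative_pose_constraint(Y, k, k + 1)
add_relative_pose_constraint(Y, 0, n_poses - 1)   # loop closure: extra off-diagonal block
print((np.abs(Y) > 0).astype(int))                # inspect the sparsity pattern
```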

y^-(t_k) = \begin{bmatrix} y^+_t(t_{k-1}) \\ y^+_v(t_{k-1}) - \nabla^T_x f_v(t_k)\, Q^{-1}(t_k) \left( f_v[x^+_v(t_{k-1}), u(t_k)] - \nabla_x f_v(t_k)\, x^+_v(t_{k-1}) \right) \\ Q^{-1}(t_k) \left( f_v[x^+_v(t_{k-1}), u(t_k)] - \nabla_x f_v(t_k)\, x^+_v(t_{k-1}) \right) \end{bmatrix}    (8)

Y^-(t_k) = \begin{bmatrix} Y^+_{tt}(t_{k-1}) & Y^+_{tv}(t_{k-1}) & 0 \\ Y^{+T}_{tv}(t_{k-1}) & Y^+_{vv}(t_{k-1}) + \nabla^T_x f_v(t_k)\, Q^{-1}(t_k)\, \nabla_x f_v(t_k) & -\nabla^T_x f_v(t_k)\, Q^{-1}(t_k) \\ 0 & -Q^{-1}(t_k)\, \nabla_x f_v(t_k) & Q^{-1}(t_k) \end{bmatrix}    (9)

y^-(t_k) = \begin{bmatrix} y^+_t(t_{k-1}) - Y^+_{tv}(t_{k-1})\, \Omega^{-1}(t_k) \left( y^+_v(t_{k-1}) - \nabla^T_x f_v(t_k)\, Q^{-1}(t_k)\, \delta(t_k) \right) \\ Q^{-1}(t_k)\, \nabla_x f_v(t_k)\, \Omega^{-1}(t_k)\, y^+_v(t_{k-1}) + \Psi(t_k)\, \delta(t_k) \end{bmatrix}    (10)

Y^-(t_k) = \begin{bmatrix} Y^+_{tt}(t_{k-1}) - Y^+_{tv}(t_{k-1})\, \Omega^{-1}(t_k)\, Y^{+T}_{tv}(t_{k-1}) & Y^+_{tv}(t_{k-1})\, \Omega^{-1}(t_k)\, \nabla^T_x f_v(t_k)\, Q^{-1}(t_k) \\ Q^{-1}(t_k)\, \nabla_x f_v(t_k)\, \Omega^{-1}(t_k)\, Y^{+T}_{tv}(t_{k-1}) & \Psi(t_k) \end{bmatrix}    (11)
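As a concrete reading of the augmented prediction step, this sketch assembles the blocks of (8) and (9) from the partitioned prior quantities. Everything is dense and the helper names are assumptions; the actual filter applies the same change through sparse row additions and an update, as described in Section IV-D.

```python
import numpy as np

def predict_with_augmentation(Y_tt, Y_tv, Y_vv, y_t, y_v, x_v, u, f_v, F, Q_inv):
    """Assemble the augmented information matrix and vector of (8)-(9).

    F is the process-model Jacobian evaluated at the prior vehicle estimate,
    and Q_inv is Q^{-1}(t_k). Dense blocks, for illustration only.
    """
    m = F.shape[0]
    delta = f_v(x_v, u) - F @ x_v                    # the bracketed term in (8)
    Y_pred = np.block([
        [Y_tt,                         Y_tv,                    np.zeros((Y_tt.shape[0], m))],
        [Y_tv.T,                       Y_vv + F.T @ Q_inv @ F,  -F.T @ Q_inv],
        [np.zeros((m, Y_tt.shape[0])), -Q_inv @ F,              Q_inv],
    ])
    y_pred = np.concatenate([y_t, y_v - F.T @ Q_inv @ delta, Q_inv @ delta])
    return Y_pred, y_pred
```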


Fig. 1. Sparsity pattern of the VAN information matrix. Nonzero blocks in the information matrix correspond to edges in the Markov graph. (a) DR information matrix, in which odometry constraints produce a block tridiagonal structure. (b) DR Markov graph. (c) SLAM information matrix, in which two loop-closure observations constrain the final two poses relative to the initial pose. (d) SLAM Markov graph.

IV. CHOLESKY FACTORIZATION AND MODIFICATIONS

The Cholesky factorization is commonly used to solve linear systems of the form

AX = B (21)

where A is a positive definite symmetric matrix and X is a matrix of unknowns.

In this SLAM application, a Cholesky factor of the information matrix will be used to recover state estimates and covariances. Relationships for the state estimate vector and covariance matrix in the form of (21) can be produced by rearranging (3) and (4) to obtain

Y+ (tk )x+ (tk ) = y+ (tk ) (22)

Y+ (tk )P+ (tk ) = I. (23)

The LDL^T form of the Cholesky decomposition of the matrix A is defined by

A = LDL^T    (24)

where L is a lower triangular matrix with all elements on the diagonal equal to one, and D is a diagonal matrix.

The solution to a system of equations in the form of (21) is calculated from the Cholesky factorization using a two-step forward and backward solve process. First, a forward solve step is performed on the lower triangular system

LZ = B (25)

to recover the rows of the forward-solve result Z in order from first to last. The solution X can then be recovered using a backward solve operation on the upper triangular system

DL^T X = Z    (26)

in which the rows of X are recovered in order from last to first.

The structure of the Cholesky factor L of a sparse matrix is related to the sparsity pattern of the original matrix A. Nonzero elements in the Cholesky factor are present at the locations of all nonzero elements in the original matrix; however, additional nonzeros known as "fill-in" are introduced. Fill-in is undesirable, since additional nonzero elements increase the computational complexity of the factorization and equation-solving processes.
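A dense sketch of the LDL^T factorization (24) and the two-step solve of (25) and (26) follows; the experiments in the paper use the sparse uplooking implementation of [14], which this small example does not attempt to reproduce.

```python
import numpy as np
from scipy.linalg import solve_triangular

def ldlt(A):
    """Dense LDL^T factorization (24) of a symmetric positive definite matrix."""
    n = A.shape[0]
    L, D = np.eye(n), np.zeros(n)
    for j in range(n):
        D[j] = A[j, j] - (L[j, :j] ** 2) @ D[:j]
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - (L[i, :j] * L[j, :j]) @ D[:j]) / D[j]
    return L, D

def solve_ldlt(L, D, B):
    """Solve A X = B given A = L D L^T."""
    Z = solve_triangular(L, B, lower=True)                      # forward solve (25)
    return solve_triangular(D[:, None] * L.T, Z, lower=False)   # backward solve (26)

# usage: recover a state estimate from the information form, x = Y^{-1} y
Y = np.array([[4.0, 1.0], [1.0, 3.0]])
y = np.array([1.0, 2.0])
L, D = ldlt(Y)
x = solve_ldlt(L, D, y)
```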

Many algorithms to calculate the Cholesky decomposition exist [24]. The experiments presented in Section VIII use an efficient sparse uplooking algorithm [14]. In Fig. 2, the right-looking factorization algorithm is used to demonstrate the process of fill-in. During each iteration of the algorithm, the jth column of the factor is produced by dividing the jth column of the active submatrix by its element on the diagonal. The jth variable is then eliminated by marginalizing it from the remaining active submatrix. The fill-in produced in the Cholesky factor is equivalent to the additional edges produced by marginalizing a variable from the Markov graph, in which the neighbors of an eliminated node form a clique.

A. Reducing Fill-in With Variable Reordering

Fill-in can be reduced by reordering the variables to change the sequence in which they are eliminated during the factorization process. Since finding the optimal permutation that produces minimal fill-in is NP-hard, heuristic-based approaches, such as the approximate minimum degree (AMD) algorithm, are typically used [24], [25].

In each iteration of a right-looking Cholesky factorization, the AMD algorithm employs the greedy strategy of selecting for elimination the variable corresponding to the graph node with the smallest degree (the number of neighbors), or equivalently, the sparsest row of the remaining active submatrix to be factorized.

Fig. 2(b) illustrates the factorization process for the matrix previously decomposed in Fig. 2(a), using the variable ordering produced by AMD. The selected order in which the poses are eliminated is 5, 4, 1, 2, 3. The benefit of variable ordering can be observed by comparing the three blocks of fill-in produced when using the natural ordering to the one block of fill-in with the AMD ordering.
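The greedy idea behind minimum-degree ordering can be sketched directly on the Markov-graph adjacency structure: repeatedly eliminate the node with the fewest neighbors and connect its remaining neighbors into a clique. The code below is a simplified stand-in for the AMD implementation of [24], [25], shown on an arbitrary five-pose loop rather than the matrix of Fig. 2.

```python
def minimum_degree_ordering(adjacency):
    """Greedy minimum-degree elimination ordering.

    adjacency: dict mapping node -> set of neighboring nodes (Markov graph).
    Returns the order in which variables would be eliminated.
    """
    adj = {v: set(nbrs) for v, nbrs in adjacency.items()}
    order = []
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))   # node of smallest degree
        nbrs = adj.pop(v)
        for u in nbrs:                            # marginalizing v: neighbors form a clique
            adj[u] |= nbrs - {u}
            adj[u].discard(v)
        order.append(v)
    return order

# a five-pose chain with a single loop closure between poses 1 and 5
graph = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}
print(minimum_degree_ordering(graph))
```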

B. Scalability of the Factorization Process

The computational complexity of the Cholesky factorization for a dense n × n matrix is O(n³). For sparse matrices, however, the complexity is dependent on the number of nonzeros in the Cholesky factor, which is influenced by the structure of the matrix being factorized and the variable ordering.

If the number of nonzeros in the Cholesky factor grows linearly with the number of estimated poses, as is the case for a DR VAN information matrix with a tridiagonal structure or a VAN system with a constant number of loop closures, the complexity of the Cholesky decomposition process is O(n). However, in general, where the Cholesky factor contains O(n²) nonzeros, as can be expected in SLAM applications where the number of loop-closure observations grows linearly with the number of poses, the complexity of the factorization process is O(n³).

Fig. 2. Right-looking Cholesky factorization algorithm demonstrating fill-in. During each iteration of the algorithm, the jth column of the factor is produced by dividing the jth column of the active submatrix by its element on the diagonal. The jth variable is then eliminated (marginalized) from the active submatrix. Fill-in elements (nonzeros at the locations of zeros in the original matrix) are shown as dark grey matrix blocks and thick graph edges. (a) Factorization using the natural variable ordering, producing three blocks of fill-in. (b) Factorization using the AMD variable ordering, resulting in only one block of fill-in.

C. Modifying a Factor

If a previously factorized system of equations is changed, it is often possible to efficiently modify an existing factor instead of repeating the computationally expensive factorization process.

The complex equations and algorithms used to compute modified components of a sparse Cholesky factorization will not be presented here. Instead, the focus will be on illustrating which components of the factorization change, and the resulting complexity of the operation. Further details on the Cholesky modification algorithms can be found in [26] and [27], and the implementation used in the experiments presented in the paper is described in [14].

Four Cholesky factor modification operations are used: row additions, row deletions, updates, and downdates. The row addition and deletion operations allow the introduction of a new variable or removal of an existing variable from the system of linear equations. A two-step process of row deletion and addition can be used to perform an arbitrary change to a row of the factorized matrix.

Update and downdate operations allow a special-case modification to the factorized system of equations. A modification of the form

\bar{A} = A + WW^T, \quad \bar{B} = B + \Delta B    (27)

where W is an n × k matrix, is known as a rank-k update, while a modification of the form

\bar{A} = A - WW^T, \quad \bar{B} = B + \Delta B    (28)

is a rank-k downdate.

Equations (27) and (28) include a change to the right-hand-side matrix B of the system of linear equations to enable the forward solve result Z to be modified in addition to the Cholesky factor.

Fig. 3. Cholesky factor update modification example. A system of equations of the form AX = B, with an existing factorization L and forward solve result Z, is modified using an update matrix W and right-hand-side change matrix \Delta B. In the resulting modified factor and forward solve result, altered blocks are shown in black. (a) Original A matrix. (b) Original B matrix. (c) Original factor. (d) Original forward solve result. (e) Update matrix. (f) Right-hand-side change matrix. (g) Modified factor. (h) Modified forward solve result.

Fig. 3 illustrates an update modification performed on the system of equations previously factorized in Fig. 2(b). For each of the modification operations, if an element in row j of the factorized matrix is modified (added, removed, updated, or downdated), the elements of the factor that are changed are limited to columns j to n. Considering the Cholesky factorization process in Fig. 2, this is a logical result, since these columns of the Cholesky factor were previously produced after the modified variable was marginalized from the active submatrix.
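To make the update operation concrete, the dense sketch below applies (27) for a single column of W (a rank-1 update) to an existing LDL^T factor using the classical modification recurrence. It is only a stand-in for the sparse implementation of [14], which additionally restricts work to the affected columns; a downdate (28) uses an analogous recurrence with the sign of the WW^T term reversed.

```python
import numpy as np

def ldlt_rank1_update(L, D, w):
    """Rank-1 update of a dense LDL^T factorization in place, so that
    afterwards L D L^T equals the previous L D L^T plus w w^T (cf. (27))."""
    n = len(D)
    w = w.astype(float).copy()
    alpha = 1.0
    for j in range(n):
        p = w[j]
        d_old = D[j]
        D[j] = d_old + alpha * p * p
        gamma = alpha * p / D[j]
        alpha = alpha * d_old / D[j]
        for i in range(j + 1, n):
            w[i] -= p * L[i, j]
            L[i, j] += gamma * w[i]
    return L, D
```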

D. Maintaining a Factor of the VAN Information Matrix

The information-form VAN filter operations of Section III-B can all be described using the row addition, row deletion, update, and downdate modifications.

The prediction with augmentation equations (8) and (9) can be implemented with row additions for the new pose variables, and an update on the previous pose states. Similarly, the prediction without augmentation equations (10) and (11) can be implemented with row removal and row addition operations to perform the changes to the current vehicle pose states, and a downdate on the previous pose states. The observation update equations (17) and (18) can simply be implemented with a single update modification.

Modifications are used to maintain an up-to-date factor after prediction and vehicle state observation operations. However, when a loop-closure observation is applied between past poses, the structure of the information matrix is significantly changed, causing the previous variable ordering to be ineffective in minimizing fill-in. Therefore, when a loop-closure observation is applied to the filter, a new variable ordering is found, and a new factor of the information matrix is calculated.

E. Variable Ordering for Efficient VAN Operations

Prediction operations and observations of the current vehicle states are the most frequent procedures in a SLAM algorithm, with the number of loop-closure observations being relatively small. After considering the pattern of modified factor elements in Fig. 3, it is clear that ordering the vehicle states last will minimize the complexity of maintaining a factor of the VAN information matrix.

If the current vehicle states are ordered last, the number of elements in the factor that need to be recalculated is independent of the number of augmented poses, allowing the Cholesky factorization modifications for the prediction and vehicle state observation operations to be performed in constant time.

While ordering the vehicle states last may not result in the minimal amount of fill-in, the benefit of constant-time prediction and observation operations outweighs the additional computational cost of the extra fill-in caused by this constraint.

V. STATE ESTIMATE RECOVERY

A. Complete State Recovery

The complete state estimate vector can be recovered by solving the relationship

Y+ (tk )x+ (tk ) = y+ (tk ) (29)

using the Cholesky factor of the information matrix and the process described in Section IV.

The efficiency of the forward and backward solve process used to solve (29) is dependent on the sparsity of the Cholesky factor. If the factor contains O(n) nonzero elements, as is the case for VAN systems with only odometry constraints or a constant number of loop-closure observations, the complete state estimate vector can be recovered in O(n) time. However, in general, where the Cholesky factor contains O(n²) nonzeros, as can be expected in SLAM applications where the number of loop-closure observations grows linearly with the number of poses, the computational complexity of recovering the complete vector is O(n²).


B. Approximate Vehicle State Recovery

In a previous VAN implementation [9]–[11], approximate estimates of the current vehicle states were produced by partitioning the state vector into a "local" portion consisting of the states to be recovered, and the remaining "benign" states for which an approximate estimate is available. Using the subscript l for the local subvector and b for the benign states, the partitioned version of (29) is

\begin{bmatrix} Y^+_{bb}(t_k) & Y^+_{bl}(t_k) \\ Y^{+T}_{bl}(t_k) & Y^+_{ll}(t_k) \end{bmatrix} \begin{bmatrix} x^+_b(t_k) \\ x^+_l(t_k) \end{bmatrix} = \begin{bmatrix} y^+_b(t_k) \\ y^+_l(t_k) \end{bmatrix}.    (30)

If the benign states have not changed significantly since they were last recovered, providing a good approximation \tilde{x}_b(t_k) (a tilde is used to denote approximate estimates), an approximate estimate of the local states can be calculated with

\tilde{x}_l(t_k) = [Y^+_{ll}(t_k)]^{-1} \left( y^+_l(t_k) - Y^{+T}_{bl}(t_k)\, \tilde{x}_b(t_k) \right).    (31)

Only one block of Y^{+T}_{bl}(t_k), corresponding to the previous-to-current pose cross-information submatrix, contains nonzero elements, allowing the approximate vehicle state estimate to be calculated in constant time.

The assumption underlying this approximation is that the past vehicle poses have not been significantly updated by observations applied to the filter since the estimates of the benign states were last recovered.

If an observation such as a loop closure or global positioning system (GPS) fix that provides a large correction to states with drifting estimates is applied to the filter, a significant correction will be propagated to the previous pose states. As a result, the accuracy of the approximation will be poor, and the complete state vector, including new estimates of the benign states, would need to be recovered using the method of Section V-A.
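A minimal sketch of the approximation in (31) follows; the partition indices and a previously recovered benign estimate are assumed to be supplied by the caller.

```python
import numpy as np

def approximate_local_states(Y, y, local_idx, benign_idx, x_benign_approx):
    """Approximate recovery of the local states via (31).

    Y, y            : information matrix and vector
    local_idx       : indices of the states to recover (e.g. the current pose)
    benign_idx      : indices of the remaining states
    x_benign_approx : previously recovered (approximate) benign estimates
    """
    Y_ll = Y[np.ix_(local_idx, local_idx)]
    Y_bl = Y[np.ix_(benign_idx, local_idx)]
    rhs = y[local_idx] - Y_bl.T @ x_benign_approx
    return np.linalg.solve(Y_ll, rhs)
```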

C. Exact Vehicle State Recovery

In Section IV-C, it was shown that a Cholesky factor and the forward solve result can be efficiently modified to reflect changes to the original system of linear equations. The only remaining operation required to solve the modified system of linear equations is the backward solve of the upper-triangular system of (26), which has the form

\begin{bmatrix} D_1 & & \\ & \ddots & \\ & & D_n \end{bmatrix} \begin{bmatrix} L^T_{11} & \cdots & L^T_{n1} \\ & \ddots & \vdots \\ & & L^T_{nn} \end{bmatrix} \begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix} = \begin{bmatrix} Z_1 \\ \vdots \\ Z_n \end{bmatrix}.    (32)

The backward solve operation recovers the variables in reverse order from the last to the first row. The last block of the solution X can, therefore, be calculated by solving

D_n L^T_{nn} X_n = Z_n.    (33)

If the current vehicle pose variables are ordered last, and the forward substitution result is updated along with the Cholesky factor each time it is modified during a prediction or observation operation, this approach allows the current vehicle state estimates to be recovered in constant time. This is an important improvement over the method of Section V-B, since it allows prediction and observation operations to be performed without corrupting the filter with approximate estimates. As a result, the EIF will have the same optimality properties as an EKF solution.
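With the vehicle states ordered last and the forward-solve result Z kept up to date alongside the factor, the exact recovery of (33) reduces to a single small triangular solve, as in this sketch (dense trailing blocks, hypothetical argument names).

```python
import numpy as np
from scipy.linalg import solve_triangular

def recover_vehicle_states(L_nn, D_n, Z_n):
    """Exact constant-time recovery of the last (vehicle) block via (33):
    solve D_n L_nn^T X_n = Z_n, where L_nn and D_n are the trailing
    diagonal blocks of the maintained factor and Z_n is the trailing block
    of the up-to-date forward-solve result."""
    return solve_triangular(np.diag(D_n) @ L_nn.T, Z_n, lower=False)
```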

VI. COVARIANCE RECOVERY

A. Complete Inverse Recovery

Using the Cholesky decomposition of the information matrix, the complete covariance matrix can be recovered by solving the equation

Y+ (tk )P+ (tk ) = I. (34)

While an information matrix may be sparse, the corresponding covariance matrix is dense. Recovering the complete covariance matrix is only feasible for problems with small state vectors.

B. Recovery of Columns of the Inverse

The jth column of the covariance matrix can be recovered by solving the equation

Y^+(t_k)\, P^+_{*j}(t_k) = I_{*j}    (35)

where P^+_{*j}(t_k) is the jth column of the covariance matrix, and I_{*j} is the jth column of an identity matrix with the same dimensions as the information matrix.

If the Cholesky factor contains O(n) nonzero elements, which occur in VAN systems containing only odometry constraints or a constant number of loop closures, the computational complexity of recovering a column of the covariance matrix is O(n). However, in general, where the Cholesky factor contains O(n²) nonzeros, the complexity of recovering a column is O(n²).

C. Recovery of the Sparse Inverse

Recovering the joint pose distributions used for loop-closure hypothesis generation requires the covariances of the augmented poses, which are located on the block diagonal of the covariance matrix. The covariance recovery method of Section VI-B is inefficient for this task, since many irrelevant elements of the inverse are calculated.

An alternative recovery method [13], [28]–[30] can be derived from the Takahashi relationship

A^{-1} = (L^T)^{-1} D^{-1} - A^{-1}(L - I).    (36)

If (36) is used to calculate the lower triangle of the inverse, the upper triangular component (L^T)^{-1}, which contains ones on its diagonal, can be ignored. Individual elements of the lower triangle of the inverse can, therefore, be calculated using the recursive relationship

[A^{-1}]_{ij} = [D^{-1}]_{ij} - \sum_{k=j+1}^{n} [A^{-1}]_{ik}\, L_{kj}, \quad \text{for } i \geq j.    (37)

In (37), an element of the inverse in column j is described in terms of other elements of the inverse in columns j to n, along with the Cholesky factorization components L and D. If the matrices A and L are sparse, not all elements of the inverse need to be recovered.
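The recursion in (37) can be written down directly. The dense sketch below computes the full lower triangle of the inverse for clarity, processing columns from last to first and rows from bottom to top so that every term on the right-hand side is already available; the sparse-inverse method evaluates the same recursion only at the nonzero positions of L.

```python
import numpy as np

def takahashi_lower_inverse(L, D):
    """Dense illustration of the Takahashi recursion (37): compute the
    lower triangle of A^{-1} from A = L D L^T."""
    n = len(D)
    Ainv = np.zeros((n, n))
    for j in range(n - 1, -1, -1):
        for i in range(n - 1, j - 1, -1):
            s = 1.0 / D[j] if i == j else 0.0
            for k in range(j + 1, n):
                aik = Ainv[i, k] if i >= k else Ainv[k, i]   # use symmetry
                s -= aik * L[k, j]
            Ainv[i, j] = s
    return Ainv
```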


Fig. 4. Structure of the sparse inverse matrix. In general, the inverse of a sparse matrix is dense. The sparse inverse matrix contains the elements of the inverse at the locations of nonzeros in the Cholesky factor. The sparse inverse may contain more nonzero elements than the original matrix due to fill-in in the Cholesky factor. (a) Original matrix. (b) Factor. (c) Inverse. (d) Sparse inverse.

The set of elements of the inverse at the locations of nonzeros in the Cholesky factor is known as the "sparse inverse," which is illustrated in Fig. 4. All elements of the sparse inverse can be calculated using only other members of the sparse inverse and the factorization components [29]. When applied to the factorization of a VAN information matrix, the sparse inverse includes the block diagonal, providing a method to recover the augmented pose covariances.

If the Cholesky factor contains O(n) nonzero elements, which occur in VAN systems containing only odometry constraints or a constant number of loop closures, the sparse inverse can be recovered in O(n) time. However, in general, where the factor contains O(n²) nonzeros, the complexity of recovering the sparse inverse is O(n³).

VII. GENERATING LOOP-CLOSURE HYPOTHESES

Since visual feature extraction and association is computationally expensive, generating a small set of loop-closure hypotheses on which image analysis will be performed is critical for the efficiency of the VAN algorithm. Deciding if a pair of poses is accepted as a loop-closure hypothesis is performed by evaluating their joint distributions to estimate the likelihood that images acquired at each pose overlap.

Due to the computational complexity of recovering covariances from an information filter, a previous VAN implementation [9]–[11] used covariances recovered at previous timesteps to generate loop-closure hypotheses. Since the uncertainty of augmented past poses can only decrease, the use of old covariances is a conservative strategy. The filter is not corrupted, since no approximate values are used in any prediction or observation operation. However, the use of conservative covariances may increase the number of loop-closure hypotheses generated.

The conservative pose covariances can be used to create an approximation of the predicted joint distribution covariance of the form

P_{(i,v)}(t_k) = \begin{bmatrix} P_{ii}(t_k) & P^-_{iv}(t_k) \\ P^{-T}_{iv}(t_k) & P^-_{vv}(t_k) \end{bmatrix}    (38)

where P_{ii}(t_k) is the conservative covariance of pose i, and P^-_{iv}(t_k) and P^-_{vv}(t_k) are the optimal past-to-current cross covariance and current pose covariance, which can be recovered from the vehicle columns of the covariance matrix using the method of Section VI-B.

To maintain the set of conservative past pose covariances, the current vehicle pose covariance is appended to the set each time a new pose is augmented to the state vector. When a loop-closure observation that significantly changes the past pose distributions is applied to the filter, the approximate covariances are updated.

In previous VAN applications [9]–[11], each time a loop-closure observation is applied to the filter, an EKF update is performed on the approximate joint distribution covariance to yield an updated covariance for the past pose. Since all of the estimated poses are correlated, a loop-closure observation will reduce the uncertainty of all trajectory states. This approach, however, only reduces the uncertainty in one of the maintained pose covariances, leaving the others highly conservative.

The sparse inverse recovery method of Section VI-C provides an alternative method to efficiently update all the augmented pose covariances. While this operation is more computationally complex than the single-pose EKF update, the reduction in the conservative pose uncertainties will cause fewer loop-closure hypotheses to be analyzed, and is likely to result in an overall improvement in efficiency.

VIII. RESULTS

The SLAM algorithm described in this paper has been applied to data acquired by the AUV Sirius, a modified version of the SeaBED AUV [31] developed at the Woods Hole Oceanographic Institution. DR is performed using a Doppler velocity log (DVL) that provides the velocity of the vehicle in three axes relative to the seafloor, a compass and a tilt sensor that observe the vehicle's orientation, and a pressure sensor to measure depth. A stereovision rig is used to provide loop-closure observations. Due to the accuracy of the DVL over short distances, the vision system is not used to provide odometry information.

In this application, loop-closure hypotheses are created using the simplified visibility model illustrated in Fig. 5, which is designed to be conservative and computationally efficient. In this model, the terrain is assumed to be planar, a conservative circular bound is used to approximate the stereo rig's field of view, and the vehicle is assumed to have zero roll and pitch (a reasonable approximation for the stable Sirius vehicle). Under these assumptions, image overlap occurs if the magnitude of the 2-D stereo rig displacement [x_ij, y_ij] between poses i and j is less than the sum of the circular image footprint radii r_i and r_j. A distribution for the 2-D displacement is created from the conservative joint pose covariance in (38). The likelihood of image overlap is calculated by integrating the 2-D displacement distribution over the circular region defined by \sqrt{x_{ij}^2 + y_{ij}^2} < (r_i + r_j). In this experiment, an approximate integration is performed by sampling the 2-D displacement distribution on a 20 × 20 cell grid as demonstrated in Fig. 6, and pose pairs with an overlap likelihood greater than 0.005 are accepted as loop-closure hypotheses.

Fig. 5. Simplified image overlap model for loop-closure hypotheses. A planar terrain structure is assumed, along with zero vehicle roll and pitch, and a constant radial field of view of α. The altitudes of the stereo rig are a_i and a_j, resulting in circular image footprints with radii r_i and r_j. Under these assumptions, overlapping images occur if \sqrt{x_{ij}^2 + y_{ij}^2} < (r_i + r_j).
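The grid-based integration described above can be sketched as follows, assuming a Gaussian 2-D displacement distribution sampled on a 20 × 20 grid. The grid extents are not stated in the paper, so the span of a few standard deviations used here is an assumption; the example values at the bottom are those of Fig. 6.

```python
import numpy as np

def overlap_likelihood(mu, Sigma, r_sum, n_cells=20, n_sigma=3.0):
    """Approximate the probability that the 2-D displacement [x_ij, y_ij],
    distributed as N(mu, Sigma), falls inside the overlap circle of radius
    r_sum = r_i + r_j, by sampling the density on an n_cells x n_cells grid
    and summing the cells inside the circle. The grid extent (n_sigma
    standard deviations about the mean) is an assumption of this sketch."""
    sd = np.sqrt(np.diag(Sigma))
    xs = np.linspace(mu[0] - n_sigma * sd[0], mu[0] + n_sigma * sd[0], n_cells)
    ys = np.linspace(mu[1] - n_sigma * sd[1], mu[1] + n_sigma * sd[1], n_cells)
    dx, dy = xs[1] - xs[0], ys[1] - ys[0]
    X, Y = np.meshgrid(xs, ys)
    diff = np.stack([X - mu[0], Y - mu[1]], axis=-1)
    S_inv = np.linalg.inv(Sigma)
    mahal = np.einsum('...i,ij,...j->...', diff, S_inv, diff)
    density = np.exp(-0.5 * mahal) / (2.0 * np.pi * np.sqrt(np.linalg.det(Sigma)))
    inside = X**2 + Y**2 < r_sum**2
    return float(np.sum(density[inside]) * dx * dy)

# example values from Fig. 6
mu = np.array([2.0, 2.0])
Sigma = np.array([[4.0, -0.5], [-0.5, 2.0]])
print(overlap_likelihood(mu, Sigma, r_sum=1.5))
```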

Loop-closure observations are created using a six-degree-of-freedom stereovision relative pose estimation algorithm [32]. The SURF algorithm [20] is used to extract and associate visual features, and epipolar geometry [33] is used to reject inconsistent feature observations within each stereo image pair. Triangulation [33] is performed to calculate initial estimates of the feature positions relative to the stereo rig, and a redescending M-estimator [34], [35] is used to calculate a relative pose hypothesis that minimizes a robustified registration error cost function. Any remaining outliers with observations inconsistent with the motion hypothesis are then rejected. Finally, the maximum likelihood relative vehicle pose estimate and covariance are calculated from the remaining inlier features. An example set of stereo image pairs and the visual features used to produce a loop-closure observation are presented in Fig. 7.

In a deployment to survey sea sponges in the Ningaloo Marine Park near Exmouth in Western Australia, the AUV traversed a grid pattern within a square region of 150 m × 150 m, collecting 2156 pairs of stereo images. The ocean depth at the survey site is approximately 40 m, and the AUV maintained an altitude of 2 m above the seafloor. The vehicle trajectory is approximately 2.2 km in length, and required approximately 75 min to complete.

A comparison of the estimated trajectories produced by DR and SLAM is shown in Fig. 8. A total of 111 loop-closure observations were applied to the SLAM filter, shown by the red lines joining observed poses. Applying the loop-closure observations results in a trajectory estimate that suggests the vehicle drifted approximately 30 m southwest of the desired survey area.

Fig. 6. Calculating the likelihood of overlapping images for loop-closure hypotheses. In this example, the mean and covariance of the 2-D stereo pose displacement distribution are \mu = [2, 2]^T and \Sigma = [4, -0.5; -0.5, 2], and the maximum distance for image overlap (r_i + r_j) is 1.5 m. The mean of the relative pose distribution is marked by a cross, and the one and two standard deviation ellipses are drawn in black. The gray circle shows the bounds of 2-D displacement vectors that support overlapping images. The displacement distribution is sampled on a grid, and cells within the overlap bounds are integrated to estimate the likelihood of overlapping images. A 20 × 20 grid has been used, requiring the calculation of 400 samples. The grayscale intensity of each grid cell displays the evaluated relative pose likelihood. In this example, the likelihood of overlapping images was calculated to be approximately 0.08.

Fig. 7. Stereovision loop-closure observation example. The left and right stereo images acquired at a first pose are shown on top of the stereo images acquired at a second pose. Features associated between all images are marked by lines joining their locations in both left and right frames.

While no ground truth for the survey is available, arguments for the superiority of the SLAM solution can be made by considering the consistency of the final vehicle position estimates with GPS observations acquired after the vehicle surfaced at the end of the mission, and the self-consistency of each estimated trajectory.

Fig. 8. Comparison of DR and SLAM vehicle trajectory estimates. The SLAM estimates suggest the vehicle has drifted approximately 30 m southwest of the desired survey area. Mosaics of images acquired at the trajectory crossover points marked "A" and "B" are shown in Fig. 9. A video showing the evolution of the DR and SLAM trajectories is available at http://ieeexplore.ieee.org.

TABLE I
FINAL VEHICLE POSITION ESTIMATES

Estimates of the final vehicle position at the end of the mission produced by DR, SLAM, and GPS are listed in Table I. The difference between the SLAM estimate and GPS is approximately half that of the DR solution. It is likely that a large portion of the error in the SLAM solution was accumulated in the descent to the seafloor and ascent to the surface, since during these times, no visual observations are available to correct drifting estimates.

The superior self-consistency of the SLAM solution can be observed in mosaics of images acquired at trajectory crossover points. Fig. 9 presents mosaics for the crossover points marked "A" and "B" within the DR and SLAM trajectory estimates in Fig. 8. The mosaic of the DR crossover point in Fig. 9(a) is inconsistent, since images hypothesized to overlap contain no common features. In contrast, the mosaic of Fig. 9(b), produced using vehicle pose estimates from SLAM, displays accurately registered overlapping images, demonstrating the correction of DR drift.

Table II lists the processing times for the SLAM algorithm with and without the use of Cholesky modifications, and using the approximate and exact vehicle state recovery methods. The exact vehicle state recovery process is slightly more efficient due to the complexity of the matrix inverse operation required by the approximate method. Using Cholesky factor modifications provides a significant advantage, since many computationally expensive factorization operations (worst case O(n³)) are replaced by constant-time modifications. When applied to larger datasets, the difference between modifying and recalculating the factor will be greater.

Fig. 9. Mosaic reconstructions of crossover points in the estimated vehicle trajectories. (a) Mosaic of the region marked "A" in Fig. 8, demonstrating the inconsistency of the DR trajectory. Images predicted to contain overlap do not match due to significant localization errors. (b) Mosaic of the region marked "B" in Fig. 8, demonstrating the superior self-consistency of the SLAM trajectory.

TABLE II
ESTIMATOR PROCESSING TIMES

The processing times listed in Table II do not include the time required to produce the loop-closure observations. In the current implementation, all vision processing is performed on the same CPU as the SLAM filter. For the Ningaloo dataset, the vision processing required an additional 4 min and 1 s of processing time. In the future, the computationally expensive image analysis operations, such as feature extraction, may be performed on a separate device such as a graphics processing unit.

Fig. 10. Evaluation of conservative covariance updating strategies. The trace of the conservative pose covariance submatrices has been used as a measure of their uncertainty. In an exploration-style mission with few loop closures, updating a single pose covariance after each loop closure with the EKF update method produces little benefit. Updating all pose covariances using the sparse inverse recovery method each time a loop-closure observation is applied to the filter maintains conservative covariances that are close to optimal.

TABLE III
LOOP-CLOSURE STATISTICS FOR CONSERVATIVE POSE UPDATING STRATEGIES

In this application, exact vehicle state recovery provides little benefit in accuracy over the approximate method. The DVL, orientation, and depth sensors provide high-frequency and accurate observations, resulting in only small corrections to the past vehicle pose states. If DR is performed using each vehicle state recovery method, the maximum difference in the vehicle position estimates for the Ningaloo survey is 9 cm. The benefits of the exact vehicle state recovery method may be greater in other applications, where observations provide larger corrections to the past pose estimates.

The superiority of the sparse inverse method for updating the past pose conservative covariances is demonstrated in Fig. 10, where the trace of the covariances is used as a measure of their uncertainty. For comparison, optimal (nonconservative) values were produced by recovering the true pose covariances at each timestep. For a survey pattern with few crossover points, applying a single-pose EKF update after each loop closure provides little benefit. The strategy of updating the conservative poses using the sparse inverse method after each loop closure produces near-optimal results. The numbers of loop-closure hypotheses and observations produced when using each conservative pose update strategy are listed in Table III. As expected, the sparse inverse method results in a significant reduction in the number of generated loop-closure hypotheses.

Fig. 11. Growth of the number of nonzero elements in the Cholesky factor. The number of nonzeros grows linearly with the number of augmented poses between loop-closure observations, the first of which occurs when there are 1039 augmented past poses. The irregularities in the growth pattern result from the greedy nature of the AMD algorithm. Some loop closures cause a decrease in the number of nonzeros when a better variable ordering is found. The worst-case number of nonzeros is O(n²) in the number of poses; however, the sparse set of loop closures in the Ningaloo experiment results in growth that is not much worse than linear.

The final state vector for the Ningaloo Marine Park experiment contains 25 884 variables from 2157 poses. Each vehicle pose contains 12 states: three for position, three for orientation, three for velocity, and three for angular velocity.

The final information matrix is 99.86% sparse, and its lower triangle contains 482 706 nonzero elements. Most of the nonzero elements result from odometry constraints; however, each of the 111 loop-closure observations results in a block of nonzeros below the block tridiagonal.

If the natural variable ordering is used, the Cholesky factor of the final information matrix contains 5 165 838 nonzero elements. The AMD variable ordering produces a factor with 804 222 nonzeros (approximately one-sixth of the number produced by the natural ordering), resulting in significant computational efficiency advantages when performing state estimate and covariance recovery.

The growth in the number of nonzeros in the Cholesky factor for the Ningaloo experiment is shown in Fig. 11. In general, the number of nonzeros for SLAM is O(n²) in the number of poses; however, due to the sparse set of crossover points in the Ningaloo experiment, the number of nonzeros caused by new poses and odometry constraints (which grow linearly) outnumbers those from loop-closure observations. As a result, in this case, the growth in the number of nonzeros is not much worse than linear.


Fig. 12. Processing times for loop-closure observations. The displayed processing times were acquired on a 2.0 GHz Intel Pentium M processor. Applying a loop-closure observation update to the information matrix is an O(n) operation in this implementation due to the use of a compressed row storage format. In general, the computational complexity of the Cholesky factorization and sparse inverse recovery operations is O(n³), and the forward solve operation is O(n²). In this experiment, the growth of the processing times is not much worse than linear due to the near-linear growth in the number of nonzeros in the Cholesky factor.

Fig. 13. 3-D reconstruction of the Ningaloo survey site. (a) Overview of the reconstruction. A gap is present in the data near the northeast corner, where the stereo rig failed to log images. (b) Detail view of a trajectory crossover point, where loop-closure observations have been applied to the filter. (c) 3-D detail view, showing the structure of the terrain, including a few of the sponges that were the target of the survey. A video showing the seafloor reconstruction in detail is available at http://ieeexplore.ieee.org.

The most computationally expensive operations in the SLAM algorithm are loop-closure observations. The processing times for each component of the loop-closure observations in the Ningaloo experiment are shown in Fig. 12. Updating the information matrix is an O(n) operation in this implementation due to the use of a compressed row storage format, which requires O(n) values to be shifted when a new nonzero element is inserted. Recalculating the Cholesky factor and recovering the sparse inverse to update the conservative pose covariances are the most time-consuming components. While the computational complexity of these operations is, in general, O(n^3), the growth of their processing times in this experiment is not much worse than linear due to the near-linear growth in the number of nonzeros in the Cholesky factor.
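The shifting cost can be seen in a generic sketch of inserting a single nonzero into a compressed row storage (CSR) triplet: every stored value after the insertion point must be copied one place down. This is an illustration of the data structure, not the paper's implementation; the function name and triplet layout are assumptions.

```python
import numpy as np

def csr_insert(data, indices, indptr, row, col, value):
    """Insert a single new nonzero at (row, col) into a CSR triplet.

    Assumes (row, col) is not already stored. The np.insert calls copy and
    shift the tails of the data and column-index arrays, which is what makes
    adding a new off-band element an O(nnz) operation.
    """
    start, end = indptr[row], indptr[row + 1]
    pos = start + np.searchsorted(indices[start:end], col)
    data = np.insert(data, pos, value)
    indices = np.insert(indices, pos, col)
    indptr = indptr.copy()
    indptr[row + 1:] += 1
    return data, indices, indptr
```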

A 3-D reconstruction of the survey site has been produced by triangulating features in the stereo images and registering the point clouds in a common reference frame using the SLAM-estimated vehicle trajectory. The source images have then been projected onto the resulting mesh, which can be observed in Fig. 13.
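A minimal sketch of the registration step: each per-image point cloud is mapped into a common world frame using its estimated vehicle pose. The (R, t) pose representation and function name are assumptions for illustration; the meshing and texture-projection steps are omitted.

```python
import numpy as np

def register_point_clouds(clouds, poses):
    """Stack per-image point clouds into a single world-frame cloud.

    clouds: list of (N_i, 3) arrays of triangulated points in the local frame.
    poses:  list of (R, t) pairs, where R is a 3x3 world-from-local rotation
            and t is the local frame origin in the world frame (assumed layout).
    """
    world = [points @ R.T + t for points, (R, t) in zip(clouds, poses)]
    return np.vstack(world)
```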

IX. CONCLUSION

A SLAM algorithm using the VAN framework was presented and demonstrated using data acquired by the Sirius AUV at Ningaloo Marine Park in Western Australia.

The use of Cholesky factorization modifications to update a decomposition of the information matrix avoids the need to repeatedly perform the computationally expensive factorization process each time state estimates and covariances are recovered.

Through the selection of an appropriate variable ordering, recovery of the vehicle state estimates can be performed in constant time, allowing prediction and vehicle state observation operations to be performed without corrupting the filter with approximate vehicle state estimates.

Updating the conservative covariances of all past poses using the sparse inverse recovery method results in the generation of significantly fewer loop-closure hypotheses than the previously used single-pose update method.

Currently, all processing is performed offline on logged data. While the worst-case computational complexity of some filter operations is O(n^3) in the number of augmented poses, the results for a typical underwater survey with a sparse set of loop-closure events suggest that an online implementation is feasible for our application.

ACKNOWLEDGMENT

The authors thank the Australian Institute of Marine Science (AIMS) for providing ship time aboard the R/V Cape Ferguson. In particular, they wish to thank A. Heyward, M. Rees, J. Colquhoun, and the crew of the Cape Ferguson for providing this opportunity and lending a hand whenever necessary. They also acknowledge the help of all those working behind the scenes to keep the AUV operational, including P. Rigby, J. Randle, the late A. Trinder, and B. Crundwell.

REFERENCES

[1] R. Smith, M. Self, and P. Cheeseman, “Estimating uncertain spatial relationships in robotics,” Auton. Robot Veh., vol. 8, pp. 167–193, 1990.

[2] J. J. Leonard, H. F. Durrant-Whyte, and I. J. Cox, “Dynamic map building for an autonomous mobile robot,” Int. J. Robot. Res., vol. 11, no. 4, pp. 286–298, 1992.

[3] M. W. M. G. Dissanayake, P. Newman, S. Clark, H. F. Durrant-Whyte, and M. Csorba, “A solution to the simultaneous localization and map building (SLAM) problem,” IEEE Trans. Robot. Autom., vol. 17, no. 3, pp. 229–241, Jun. 2001.


[4] J. Guivant and E. M. Nebot, “Optimization of the simultaneous localization and map-building algorithm for real-time implementation,” IEEE Trans. Robot. Autom., vol. 17, no. 3, pp. 242–257, Jun. 2001.

[5] S. B. Williams, “Efficient solutions to autonomous mapping and navigation problem,” Ph.D. dissertation, Aust. Centre Field Robot., Univ. Sydney, Sydney, Australia, 2001.

[6] J. D. Tardos, J. Neira, P. M. Newman, and J. J. Leonard, “Robust mapping and localization in indoor environments using sonar data,” Int. J. Robot. Res., vol. 21, no. 4, pp. 311–330, 2002.

[7] M. Bosse, P. Newman, J. Leonard, M. Soika, W. Feiten, and S. Teller, “An atlas framework for scalable mapping,” in Proc. IEEE Int. Conf. Robot. Autom., 2003, vol. 2, pp. 1899–1906.

[8] S. Thrun, Y. Liu, D. Koller, A. Y. Ng, and H. Durrant-Whyte, “Simultaneous localization and mapping with sparse extended information filters,” Int. J. Robot. Res., vol. 23, no. 7–8, pp. 693–716, 2004.

[9] R. M. Eustice, “Large-area visually augmented navigation for autonomous underwater vehicles,” Ph.D. dissertation, Massachusetts Inst. Technol./Woods Hole Oceanogr. Inst., Woods Hole, MA, 2005.

[10] R. M. Eustice, H. Singh, and J. J. Leonard, “Exactly sparse delayed-state filters for view-based SLAM,” IEEE Trans. Robot., vol. 22, no. 6, pp. 1100–1114, Dec. 2006.

[11] R. M. Eustice, H. Singh, J. J. Leonard, and M. R. Walter, “Visually mapping the RMS Titanic: Conservative covariance estimates for SLAM information filters,” Int. J. Robot. Res., vol. 25, no. 12, pp. 1223–1242, 2006.

[12] F. Dellaert, “Square root SAM,” in Proc. Robot.: Sci. Syst., Cambridge, MA, Jun. 2005, pp. 177–184.

[13] F. Dellaert and M. Kaess, “Square root SAM: Simultaneous localization and mapping via square root information smoothing,” Int. J. Robot. Res., vol. 25, no. 12, pp. 1181–1203, 2006.

[14] Y. Chen, T. A. Davis, W. W. Hager, and S. Rajamanickam, “Algorithm 8xx: CHOLMOD, supernodal sparse Cholesky factorization and update/downdate,” Dept. Comput. Inf. Sci. Eng., Univ. Florida, Gainesville, FL, Tech. Rep. TR-2006-005, 2006.

[15] M. Kaess, A. Ranganathan, and F. Dellaert, “iSAM: Fast incremental smoothing and mapping with efficient data association,” in Proc. IEEE Int. Conf. Robot. Autom., 2007, pp. 1670–1677.

[16] M. Kaess, A. Ranganathan, and F. Dellaert, “Fast incremental square root information smoothing,” in Proc. Int. Joint Conf. Artif. Intell., 2007, pp. 2129–2134.

[17] R. Eustice, O. Pizarro, and H. Singh, “Visually augmented navigation in an unstructured environment using a delayed state history,” in Proc. IEEE Int. Conf. Robot. Autom., 2004, vol. 1, pp. 25–32.

[18] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004.

[19] J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust wide baseline stereo from maximally stable extremal regions,” in Proc. Br. Mach. Vis. Conf., vol. 1, Cardiff, U.K.: British Machine Vision Assoc., 2002, pp. 384–392.

[20] H. Bay, T. Tuytelaars, and L. V. Gool, “SURF: Speeded up robust features,” in Proc. 9th Eur. Conf. Comput. Vis., vol. 13, Graz, Austria: Springer, May 2006, pp. 404–417.

[21] T. Kadir, A. Zisserman, and M. Brady, “An affine invariant salient region detector,” in Proc. Eur. Conf. Comput. Vis., 2004, pp. 404–416.

[22] K. Mikolajczyk and C. Schmid, “A performance evaluation of local descriptors,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2003, vol. 2, pp. 257–263.

[23] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. V. Gool, “A comparison of affine region detectors,” Int. J. Comput. Vis., vol. 65, no. 1–2, pp. 43–72, 2005.

[24] T. A. Davis, Direct Methods for Sparse Linear Systems. Philadelphia, PA: SIAM, 2006.

[25] S. Ingram. (2006). Minimum degree reordering algorithms: A tutorial. [Online]. Available: http://www.cs.ubc.ca/∼sfingram/cs517_final.pdf

[26] T. A. Davis and W. W. Hager, “Modifying a sparse Cholesky factorization,” SIAM J. Matrix Anal. Appl., vol. 20, no. 3, pp. 606–627, 1999.

[27] T. A. Davis and W. W. Hager, “Row modifications of a sparse Cholesky factorization,” SIAM J. Matrix Anal. Appl., vol. 26, no. 3, pp. 621–639, 2005.

[28] H. Niessner and K. Reichert, “On computing the inverse of a sparse matrix,” Int. J. Numer. Methods Eng., vol. 19, no. 10, pp. 1513–1526, 1983.

[29] A. M. Erisman and W. F. Tinney, “On computing certain elements of the inverse of a sparse matrix,” Commun. ACM, vol. 18, no. 3, pp. 177–179, 1975.

[30] B. Triggs, P. McLauchlan, R. Hartley, and A. Fitzgibbon, “Bundle adjustment—A modern synthesis,” in Vision Algorithms: Theory and Practice (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 2000, pp. 298–375.

[31] H. Singh, A. Can, R. Eustice, S. Lerner, N. McPhee, O. Pizarro, and C. Roman, “Seabed AUV offers new platform for high-resolution imaging,” EOS Trans. Amer. Geophys. Union, vol. 85, no. 31, pp. 289, 294–295, Nov. 2004.

[32] I. Mahon, “Vision-based navigation for autonomous underwater vehicles,” Ph.D. dissertation, Aust. Centre Field Robot., Univ. Sydney, Sydney, Australia, 2008.

[33] R. I. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge, U.K.: Cambridge Univ. Press, 2000.

[34] P. J. Huber, Robust Statistics. New York: Wiley, 1981.

[35] R. A. Maronna, R. D. Martin, and V. J. Yohai, Robust Statistics. Berlin, Germany: Springer-Verlag, 2006.

Ian Mahon received the B.E./B.Sc. degree in mechatronic engineering and computer science in 2002 and the Ph.D. degree in 2008, both from the University of Sydney, Sydney, Australia.

He is currently a Research Fellow with the Australian Centre for Field Robotics within the University of Sydney, where he is a Member of the Marine Robotics Group. His current research interests include robotic mapping, computer vision, and autonomous underwater vehicles.

Stefan B. Williams (S’99–A’01–M’02) received the B.A.Sc. degree (with first-class honors) in systems engineering design from the University of Waterloo, Waterloo, ON, Canada, in 1997 and the Ph.D. degree in field robotics from the University of Sydney, Sydney, Australia, in 2002.

He is a Senior Lecturer with the University of Sydney’s School of Aerospace, Mechanical, and Mechatronic Engineering. He is a Member of the Australian Centre for Field Robotics, where he leads the Marine Robotics Group. He is also the head of Australia’s Integrated Marine Observing System AUV Facility. His current research interests include simultaneous localization and mapping in unstructured underwater environments, as well as algorithms for autonomous navigation and control.

Oscar Pizarro (S’93–M’04) received the B.S. degree in electronic engineering from the Universidad de Concepcion, Concepcion, Chile, in 1997, the dual M.Sc. degree in ocean engineering and electrical engineering and computer sciences and the Ph.D. degree in oceanographic engineering, both from the Massachusetts Institute of Technology/Woods Hole Oceanographic Institution Joint Program, Cambridge, in 2003 and 2004, respectively.

Since 2005, he has been at the University of Sydney’s Australian Centre for Field Robotics, Sydney, Australia, as an Australian Research Council (ARC) Postdoctoral Fellow. His current research interests include 3-D seafloor reconstructions from optical imagery, classification and automated interpretation of survey data, and the transition of advanced instrumentation and algorithms to low-cost platforms.

Matthew Johnson-Roberson (S’06) received the B.S. degree in computer science from Carnegie Mellon University, Pittsburgh, PA, in 2005. He is currently working toward the Ph.D. degree with the University of Sydney, Sydney, Australia.

He has previously worked on autonomous vehicle software for the Defense Advanced Research Projects Agency Off-Road Grand Challenge. His current research interests include the classification and reconstruction of outdoor environments.