
Automatic urban building boundary extraction from high resolution aerial images using an innovative model of active contours



Salman Ahmadi a,*, M.J. Valadan Zoej a, Hamid Ebadi a, Hamid Abrishami Moghaddam b, Ali Mohammadzadeh a

a K.N. Toosi University of Technology, Faculty of Geomatics Engineering, Tehran, Iran
b K.N. Toosi University of Technology, Faculty of Electrical Engineering, Tehran, Iran

International Journal of Applied Earth Observation and Geoinformation 12 (2010) 150–157

ARTICLE INFO

Article history:

Received 16 January 2009

Accepted 2 February 2010

Keywords:

Building boundary extraction

Active contours

Level set

Aerial images

ABSTRACT

The main objective of this research is to present a new method for building boundary detection and extraction based on the active contour model. Classical models of this type suffer from several shortcomings: they require extensive initialization, they are sensitive to noise, and parameter adjustment often becomes problematic with complex images. In this research, a new active contour model is proposed that is optimized for automatic building extraction. Compared with the classical models, this new active contour model detects and extracts building boundaries more accurately, and it avoids detecting the boundaries of features in the neighborhood of buildings, such as streets and trees. Finally, the detected building boundaries are generalized to obtain a regular shape. Tests with the proposed model demonstrate excellent accuracy in terms of building boundary extraction. However, due to the radiometric similarity between building roofs and the image background, the system fails to recognize a few buildings.

© 2010 Elsevier B.V. All rights reserved.


1. Introduction

Nowadays, automatic extraction of man-made objects such as buildings and roads in urban areas has become a topic of growing interest for the photogrammetric and computer vision communities. Research in this domain started in the late 1980s and has used quite different types of source images, ranging from single intensity images, color images, and laser range images to stereo and multiple images (Peng et al., 2005). Useful applications include automated information extraction from images and the updating of geographic information system (GIS) databases. Databases for urban areas are frequently established by the analysis of aerial imagery, since photogrammetric data are three-dimensional, accurate, largely complete, and up-to-date. Because manual interpretation is very time consuming, much effort has been spent on speeding up this process with automatic or semi-automatic procedures. A wide range of techniques and algorithms have been proposed for automatically constructing 2D or 3D building models from satellite and aerial imagery.

* Corresponding author.
E-mail addresses: [email protected], [email protected] (S. Ahmadi), [email protected] (M.J. Valadan Zoej), [email protected] (H. Ebadi), [email protected] (H.A. Moghaddam), [email protected] (A. Mohammadzadeh).

0303-2434/$ – see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.jag.2010.02.001

In this field, Dash et al. (2004) used height variation in the context of object periphery data to develop a method based on standard deviation to distinguish between trees and buildings. Sohn and Dowman (2007) employed Lidar (Light Detection and Ranging) data to generate height data for features in an urban region. They carried out the following steps for building extraction: first, they identified all features that were a certain height above ground level. Next, using the NDVI index and other information, they distinguished the buildings from other features. Finally, they detected the sharp edges of buildings and matched polygons to the close edges in order to robustly identify building boundaries (Sohn and Dowman, 2007). In 1999, Halla and co-workers extracted building locations from images using classification algorithms and height data (Halla and Brenner, 1999). Zimmermann (2000) produced Digital Surface Model (DSM) data from stereo images and then used the model to detect building roofs by applying slope and aspect operators. Finally, in another study, height data and morphological operators were utilized to extract buildings (Zhao and Trinder, 2000).

As reported by Hongjiana and Shiqiang (2006), another approach involves extracting the data and connecting edge pixels. This allows for the derivation of building heights from sparse laser samples and can be used to reconstruct 3D information for each building. Miliaresis and Kokkas (2007) proposed a new method for extracting a class of buildings using digital elevation models


(DEMs) generated from Lidar data on the basis of geomorphometric segmentation principles. Lafarge et al. (2008) presented an automatic building extraction method that involved digital elevation models based on an object approach. Using this method, a rough approximation of all relevant building footprints was first calculated from marked point processes. The resulting rectangular footprints were then normalized by improving the connections between neighboring rectangles and detecting any roof height discontinuities (Lafarge et al., 2008). Samadzadegana et al. (2005) proposed a novel approach for object recognition, based on neuro-fuzzy modeling, in which height data were integrated with textural and spectral information by means of a fuzzy reasoning process.

One method frequently used in building extraction is the snake model, an approach originally introduced by Kass et al. (1998). In 2004, Peng introduced a variation on the snake model by incorporating a new energy function to extract building boundaries from aerial images (Peng et al., 2004). Another study used a semi-automatic algorithm to extract buildings from Quickbird images (Mayunga et al., 2005). Under that algorithm, a point is first selected within the boundary of each building. Thereafter, the curves of the model are reproduced and accurate building boundaries are detected using an iterative procedure (Mayunga et al., 2005). Guo and Yasuoka (2002) estimated building boundaries from Lidar data and then applied the snake model to determine their exact positions. Also, Cao and Yang (2007) extracted man-made features from aerial images using a fractal error metric and multi-stage active contours. Finally, Karantzalos and Paragios (2009) applied prior shape knowledge of buildings in active contours to detect buildings with special shapes from aerial images.

Because the most important characteristic of buildings in urban areas is their height discrepancy in relation to other features, a large number of the above investigations have focused on integrating height data with aerial or satellite images to automatically extract buildings. Other research has also been carried out in this field (Rottensteiner et al., 2005; Schenk and Csatho, 2002; Weidner and Forstner, 1995; Baillard and Maitre, 1999; Vestri, 2006). These types of algorithms require high computational effort and significant technological resources for the production and analysis of the data. It should be noted, however, that DEM information can be used to increase their performance for isolated building detection.

In this work, a new model based on a level set formulation is introduced to detect buildings in aerial images using active contour models. In our model, all building boundaries are detected by introducing certain points in the buildings' vicinity. In contrast to the classical snake model, we avoid the need for initial curves close to each object. Moreover, our proposed model detects most relevant building boundaries, and it does not need height data or additional information to distinguish between buildings and other features.

This paper is organized into five sections. The new active contour model for building boundary detection is elucidated in Section 2. Experimental results are presented in Section 3, and an accuracy assessment of the model is provided in Section 4. Finally, Section 5 concludes.

2. Development of a new active contour model to automatically extract buildings

This study utilizes an active contour model for automatic building boundary detection and extraction. The active contour, or snake, model was first introduced by Kass et al. in 1987 (Kass et al., 1998). This model involves dynamic curves or surfaces that move within an image domain to capture desired image features. The curve's motion is driven by a combination of internal and external forces, which achieve a minimal energy state when the curve/surface reaches the targeted image boundaries. Active contour models have been used to handle a variety of image problems, including image segmentation, shape recovery, and visual tracking (Li et al., 2005).

Active contour models can be classified into two categories: parametric snakes and geometric snakes. Parametric snakes are represented explicitly as parameterized contours, and the snake evolution is performed only on the predetermined spline control points. Parametric active contours have two main drawbacks: (1) the initial contour must, in general, be close to the true boundary, otherwise it will likely converge to the wrong points (Hou and Han, 2005), and (2) these models can never change topology during evolution. This means that when there is more than one object to capture in an image, multiple snakes must be manually and separately initialized within the neighborhood of each object (Li et al., 2005; Yan and Kassim, 2006).

Geometric snakes, on the other hand, are represented implicitly as the zero-level sets of higher dimensional surfaces, and the updating is performed on the surface function over the entire image domain (Li et al., 2005). Geometric active contours come in two major types: edge-based and region-based. Edge-based methods primarily use gradient information to locate object boundaries in the images. Caselles et al. (1995) and Malladi et al. (1995) independently presented the original geometric active contours. In addition, other studies have contributed improvements to this model (Caselles et al., 1997; Yezzi et al., 1997; Siddiqi et al., 1998; Vasilerskiy and Siddiqi, 2002; Pi et al., 2007; Ying et al., 2009a,b).

Conversely, region-based geometric active contours rely on the homogeneity of spatially localized features such as gray level intensity, texture, and other pixel statistics. The active contour model based on the Mumford–Shah functional for image segmentation was first proposed by Chan and Vese (2001). The most important advantage of this active contour model is its implicit handling of topological changes and its ability to extract objects without noticeable edges from images. Another advantage of this model is its low sensitivity to noise (Chan and Vese, 2001). Several other studies in this field have worked with this approach (Chan et al., 2000; Lie et al., 2006; Brox and Weickert, 2004; Chen et al., 2006).

The above-mentioned advantages of the region-based geometric active contour model motivated our decision to use it in this study.

In the Chan and Vese active contour model, the image is partitioned into regions that exhibit maximum homogeneity and similarity. The object boundaries are extracted from the images based on the following contour energy function (Chan and Vese, 2001):

$$E_1(C) + E_2(C) = \int_{inside(C)} |u - c_{in}|^2 \, dx\,dy + \int_{outside(C)} |u - c_{out}|^2 \, dx\,dy \qquad (1)$$

Here, $C$ stands for the curve of the active contour, $u$ represents the pixel value of the image, and $c_{in}$ and $c_{out}$ denote the averages of the pixel values inside and outside of $C$, respectively (Chan and Vese, 2001).

In Eq. (1), the term $\int_{inside(C)} |u - c_{in}|^2 \, dx\,dy$ represents the sum of the differences between the gray values of pixels inside the contour and their mean value. Similarly, the term $\int_{outside(C)} |u - c_{out}|^2 \, dx\,dy$ represents the sum of the differences between the gray values of pixels outside the contour and their mean value. The contour shape then changes so as to minimize the sum of the above-mentioned terms. In this way, the contour fits to the edges of objects.
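To make the role of Eq. (1) concrete, the following sketch (our own NumPy illustration, not the authors' code) evaluates the two fitting terms for a discrete image and a binary inside(C) mask; a contour that coincides with the object boundary yields the lower energy:

```python
import numpy as np

def chan_vese_fitting_energy(u, inside):
    """Discrete sketch of Eq. (1): squared deviations of the gray values
    from their mean, inside and outside the contour region."""
    c_in = u[inside].mean()
    c_out = u[~inside].mean()
    e1 = np.sum((u[inside] - c_in) ** 2)
    e2 = np.sum((u[~inside] - c_out) ** 2)
    return e1 + e2

# Synthetic image: a bright 'roof' block on a dark background.
u = np.zeros((8, 8))
u[2:6, 2:6] = 200.0
true_mask = u > 100                # contour placed exactly on the object
bad_mask = np.zeros_like(true_mask)
bad_mask[0:4, 0:4] = True         # misplaced contour, mixes object and background
```

With the correct mask both regions are perfectly homogeneous, so the energy is zero; the misplaced mask mixes gray values and the energy grows.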


If the curve $C$ is inside an object, then $E_1(C) \approx 0$ and $E_2(C) > 0$. Conversely, if the curve $C$ is outside the target object, then $E_1(C) > 0$ and $E_2(C) \approx 0$. Finally, if part of the curve $C$ is inside and part is outside the target object, then $E_1(C) > 0$ and $E_2(C) > 0$ (Chan and Vese, 2001). By minimizing the function in Eq. (1), the curve $C$ is fitted to the boundary of the target object ($C_0$), and the following relation is obtained:

$$\inf_C \{E_1(C) + E_2(C)\} \approx 0 \approx E_1(C_0) + E_2(C_0) \qquad (2)$$

To obtain a regularized version of the function, Chan and Vese added two terms, the length of $C$ and the area inside $C$, to Eq. (1):

$$E(C) = \mu \cdot length(C) + \nu \cdot area(inside(C)) + \lambda_1 \int_{inside(C)} |u - c_1|^2 \, dx\,dy + \lambda_2 \int_{outside(C)} |u - c_2|^2 \, dx\,dy \qquad (3)$$

Here, $\mu \ge 0$ and $\lambda_1, \lambda_2 \ge 0$ are constant parameters (Chan and Vese, 2001).

The above-mentioned active contour model detects all regions that exhibit a given similarity and homogeneity. Therefore, by changing the parameters in Eq. (3), different regions with various degrees of similarity and homogeneity can be detected in the image. The Chan and Vese model detects and extracts the boundaries of all objects in the image, but it is not appropriate for detecting specific features such as buildings. For this reason, it is necessary to modify the model for building boundary detection purposes.

To this end, we add two terms to the Chan and Vese energy function to prevent the divergence of contours toward all image features. This enables us to detect only the desired features. To calculate these additional terms, the operator should input the pixel values from certain points on the desired features, according to the radiometric variation of the desired features and the complexity of the background. The resulting terms minimize the difference between pixel values inside and outside of the contours, using pixel values relevant to the desired features and the image background, respectively. Thus, in the proposed model, knowledge about the buildings is given to the system by introducing the pixel values of some points inside building boundaries as training data. The system can then differentiate between buildings and other types of objects in the image. The number of training points equals the number of building classes in the image, and each class of buildings is represented by one training point in the proposed model.

The new energy function of the model is written as:

$$E_{total} = \mu \cdot E_1 + \nu \cdot E_2 + \lambda_1 \cdot E_3 + \lambda_2 \cdot E_4 + \alpha \cdot E_5 + \beta \cdot E_6 \qquad (4)$$

where $\mu$, $\nu$, $\lambda_1$, $\lambda_2$, $\alpha$, and $\beta$ are constant coefficients and:

$$E_1 = length(C)$$
$$E_2 = area(inside(C))$$
$$E_3 = \int_{inside(C)} |u - c_{in}|^2 \, dx\,dy$$
$$E_4 = \int_{outside(C)} |u - c_{out}|^2 \, dx\,dy$$
$$E_5 = \int_{inside(C)} |u - d|^2 \, dx\,dy$$
$$E_6 = \int_{outside(C)} |u - e|^2 \, dx\,dy \qquad (5)$$

In this function, $d$ and $e$ are the pixel values for the buildings and the image background, respectively, and are provided by the user as training data.

In Eq. (5), the term $E_5$ gives the sum of the differences between the gray values of pixels inside the contour and the user-defined value for the object of interest, such as buildings. Similarly, the term $E_6$ gives the sum of the differences between the gray values of pixels outside the contour and the user-defined value for the image background.

The radiometric characteristics of the desired features, such as buildings, and of the image background are introduced into the model through the parameters $d$ and $e$ in the $E_5$ and $E_6$ energy terms. Consequently, the $E_5$ and $E_6$ terms attract the active contour to the desired features in an image and prevent the detection of other, undesired objects. Minimizing the sum of $E_4$, $E_5$, and $E_6$ then results in the detection of the edges of objects that are homogeneous and have radiometric values similar to the training pixels.

The model proposed in Eq. (4) is applicable to images in which the gray values of building roofs are sufficiently similar that their radiometric information does not vary excessively over the image domain. The model energy function should therefore be extended for building roofs with different gray values. To do this, some training samples from the target objects (e.g. n pixels) and the image background (e.g. m pixels) are chosen by the user to train the contour. For each training sample, the value of $E_5$ is calculated, and the minimum among them is taken as the final value of $E_5$. The value of $E_6$ is obtained in the same manner. In this way, the trained contour evolves to the edges of objects that are radiometrically similar to one of the training samples.
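The minimum-over-training-samples rule described above can be sketched as follows; this is our own illustration, and the gray values 120 and 40 are hypothetical training data for two building classes:

```python
import numpy as np

def min_training_energy(u, region_mask, train_values):
    """Sketch of the extended E5 term: for each training gray value,
    integrate the squared deviation over the given region, then keep the
    minimum, so the region only needs to match ONE training class."""
    return min(np.sum((u[region_mask] - d) ** 2) for d in train_values)

# A region of uniform gray value 120 and two hypothetical building
# classes with training values 120 and 40:
u = np.full((4, 4), 120.0)
region = np.ones((4, 4), dtype=bool)
e5 = min_training_energy(u, region, [120.0, 40.0])  # class 120 matches
```

Because the region matches the first training class exactly, the minimum energy is zero even though the second class fits poorly.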

So, if our test scenario involves different types of building roofs and complex image backgrounds, then Eq. (4) becomes:

$$E_{total} = \mu \cdot E_1 + \nu \cdot E_2 + \lambda_1 \cdot E_3 + \lambda_2 \cdot E_4 + \alpha \cdot E_5 + \beta \cdot E_6 \qquad (6)$$

In this function, $E_1$, $E_2$, $E_3$, and $E_4$ are as defined in Eq. (5), and $E_5$ and $E_6$ are obtained from the following equations:

$$E_5 = \min_{1 \le i \le n} A_i, \quad A_i = \int_{inside(C)} |u - d_i|^2 \, dx\,dy, \quad i = 1, 2, \ldots, n$$
$$E_6 = \min_{1 \le j \le m} B_j, \quad B_j = \int_{outside(C)} |u - e_j|^2 \, dx\,dy, \quad j = 1, 2, \ldots, m \qquad (7)$$

where $d_i$ is the gray value of the $i$th building class, $e_j$ is the gray value of the $j$th background class, and $n$ and $m$ are the numbers of building and background classes, respectively.

We also note that the model so far uses only grayscale or single-band images as input data and cannot utilize multiband information to detect features such as buildings. The model energy function should therefore be extended for multicolored buildings and multiband images. The extended $E_3$, $E_4$, $E_5$, and $E_6$ are defined in RGB color space by summing the corresponding function values over each band, as follows:

$$E_3 = \sum_{b=1}^{3} \int_{inside(C)} |u_b - c_{1b}|^2 \, dx\,dy$$
$$E_4 = \sum_{b=1}^{3} \int_{outside(C)} |u_b - c_{2b}|^2 \, dx\,dy$$
$$E_5 = \min_{1 \le i \le n} A_i, \quad A_i = \sum_{b=1}^{3} \int_{inside(C)} |u_b - d_{ib}|^2 \, dx\,dy, \quad i = 1, 2, \ldots, n$$
$$E_6 = \min_{1 \le j \le m} B_j, \quad B_j = \sum_{b=1}^{3} \int_{outside(C)} |u_b - e_{jb}|^2 \, dx\,dy, \quad j = 1, 2, \ldots, m \qquad (8)$$

where $b$ indexes the RGB bands. To minimize the above energy function, we use the level set method.
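As an illustration of the band-wise summation in Eq. (8), the following sketch (our own, using NumPy; image values and inside means are synthetic) accumulates the inside-region term over three RGB bands:

```python
import numpy as np

def multiband_inside_energy(u_rgb, inside, c_in_rgb):
    """Band-wise sum sketched after Eq. (8): squared deviations from the
    per-band inside means, summed over the three RGB bands."""
    total = 0.0
    for b in range(3):
        total += np.sum((u_rgb[..., b][inside] - c_in_rgb[b]) ** 2)
    return total

# Synthetic 4x4 RGB image with uniform bands (50, 100, 150):
u = np.zeros((4, 4, 3))
u[..., 0], u[..., 1], u[..., 2] = 50.0, 100.0, 150.0
inside = np.ones((4, 4), dtype=bool)
e3 = multiband_inside_energy(u, inside, [50.0, 100.0, 150.0])
```

When the per-band means match the region exactly, every band contributes zero and the total energy vanishes.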

2.1. The level set formulation of the model

The level set method is used to calculate the energy functional of the model over the entire image domain $\Omega$. In the level set method


(Osher and Sethian, 1988), the curve $C$ is represented by the zero-level set of a Lipschitz function $\phi: \mathbb{R}^2 \to \mathbb{R}$, such that (Chan and Vese, 2001):

$$C = \{(x, y) \in \mathbb{R}^2 : \phi(x, y) = 0\}$$
$$inside(C) = \{(x, y) \in \mathbb{R}^2 : \phi(x, y) > 0\}$$
$$outside(C) = \{(x, y) \in \mathbb{R}^2 : \phi(x, y) < 0\} \qquad (9)$$

To define the energy functional based on the level set method, Chan and Vese introduced two additional functions, the Heaviside function $H$ and the one-dimensional Dirac measure $\delta$ (Chan and Vese, 2001):

$$H(\phi) = \begin{cases} 1, & \phi \ge 0 \\ 0, & \phi < 0 \end{cases}, \qquad \delta(\phi) = \frac{d}{d\phi} H(\phi) \qquad (10)$$

Based on the level set method and using Eqs. (9) and (10), each element of Eq. (6) becomes:

$$length\{\phi = 0\} = \int_{\Omega} |\nabla H(\phi)| \, dx\,dy = \int_{\Omega} \delta(\phi)\,|\nabla\phi| \, dx\,dy$$
$$area\{\phi \ge 0\} = \int_{\Omega} H(\phi) \, dx\,dy$$
$$E_1 = \int_{\Omega} \delta(\phi)\,|\nabla\phi| \, dx\,dy$$
$$E_2 = \int_{\Omega} H(\phi) \, dx\,dy$$
$$E_3 = \int_{\Omega} |u - c_{in}|^2 \, H(\phi) \, dx\,dy$$
$$E_4 = \int_{\Omega} |u - c_{out}|^2 \, (1 - H(\phi)) \, dx\,dy \qquad (11)$$

where $c_{in}$ and $c_{out}$ are obtained from the following equations:

$$c_{in}(\phi) = \frac{\int_{\Omega} u \, H(\phi) \, dx\,dy}{\int_{\Omega} H(\phi) \, dx\,dy}, \qquad c_{out}(\phi) = \frac{\int_{\Omega} u \, (1 - H(\phi)) \, dx\,dy}{\int_{\Omega} (1 - H(\phi)) \, dx\,dy} \qquad (12)$$

More details can be found in Chan and Vese (2001). Based on the new formulation, $E_5$ and $E_6$ in Eq. (6) are obtained from the following equations:

$$E_5 = \min_{1 \le i \le n} A_i, \quad A_i = \int_{\Omega} |u - d_i|^2 \, H(\phi) \, dx\,dy, \quad i = 1, 2, \ldots, n$$
$$E_6 = \min_{1 \le j \le m} B_j, \quad B_j = \int_{\Omega} |u - e_j|^2 \, (1 - H(\phi)) \, dx\,dy, \quad j = 1, 2, \ldots, m \qquad (13)$$

where $\Omega$ is the entire image domain in the above equations.

2.2. Minimizing the energy functional of the model

Now consider the minimization of the functional $E_{total}$. It can be solved by the following evolution equation (Evans, 1998):

$$\frac{\partial\phi}{\partial t} = -\frac{\partial E_{total}}{\partial\phi} \qquad (14)$$

Here, $t$ represents time. The function $\phi$ that minimizes this functional therefore satisfies the Euler–Lagrange equation. The steepest descent process for the minimization of $E_{total}$ is the following equation:

$$\frac{\partial\phi}{\partial t} = -\frac{\partial E_{total}}{\partial\phi} = \delta(\phi)\left[\mu \cdot F_1 - \nu - \lambda_1 \cdot F_2 + \lambda_2 \cdot F_3 - \alpha \cdot F_4 + \beta \cdot F_5\right] = 0 \qquad (15)$$

Here, $F_1$ to $F_5$ are obtained from the following equations:

$$F_1 = \mathrm{div}\left(\frac{\nabla\phi}{|\nabla\phi|}\right)$$
$$F_2 = (u - c_{in})^2$$
$$F_3 = (u - c_{out})^2$$
$$F_4 = \min_{1 \le i \le n} M_i, \quad M_i = (u - d_i)^2, \quad i = 1, 2, \ldots, n$$
$$F_5 = \min_{1 \le j \le m} N_j, \quad N_j = (u - e_j)^2, \quad j = 1, 2, \ldots, m \qquad (16)$$
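The sign pattern of Eq. (15) can be checked numerically. The sketch below (our illustration; the parameter values are hypothetical and the curvature force $F_1$ is omitted) evaluates the data-driven forces $F_2$ to $F_5$ per pixel: a pixel matching a building training value gets a positive speed, pulling it into the inside region, while a background pixel gets a negative one:

```python
import numpy as np

def evolution_speed(u, c_in, c_out, d_vals, e_vals, nu, lam1, lam2, alpha, beta):
    """Data terms of Eqs. (15)-(16), without the curvature force F1.
    Positive speed pushes phi up (the pixel joins the inside region)."""
    f2 = (u - c_in) ** 2
    f3 = (u - c_out) ** 2
    f4 = np.min([(u - d) ** 2 for d in d_vals], axis=0)  # nearest building class
    f5 = np.min([(u - e) ** 2 for e in e_vals], axis=0)  # nearest background class
    return -nu - lam1 * f2 + lam2 * f3 - alpha * f4 + beta * f5

# One building-like pixel (200) and one background-like pixel (10):
u = np.array([[200.0, 10.0]])
speed = evolution_speed(u, 200.0, 10.0, [200.0], [10.0],
                        nu=0.0, lam1=1.5, lam2=2.0, alpha=4.0, beta=0.8)
```

The building pixel is far from the background mean and training value, so its speed is positive; the background pixel gets the opposite sign.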

To solve the above set of equations, $H(\phi)$ and $\delta(\phi)$ should be regularized, because the building boundaries in the images are not smooth. Chan and Vese (2001) proposed the following formulas:

$$H_\varepsilon(\phi) = \frac{1}{2} + \frac{1}{\pi}\arctan\left(\frac{\phi}{\varepsilon}\right), \qquad \delta_\varepsilon(\phi) = \frac{1}{\pi} \cdot \frac{\varepsilon}{\varepsilon^2 + \phi^2} \qquad (17)$$

In the above equations, $\varepsilon$ is a very small number greater than zero. As $\varepsilon \to 0$, both $H_\varepsilon(\phi)$ and $\delta_\varepsilon(\phi)$ converge to $H(\phi)$ and $\delta(\phi)$, respectively. More details can be found in Chan and Vese (2001).
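The regularized functions of Eq. (17) are straightforward to implement; a minimal sketch:

```python
import math

def heaviside_eps(phi, eps=1.0):
    """Regularized Heaviside of Eq. (17)."""
    return 0.5 + math.atan(phi / eps) / math.pi

def dirac_eps(phi, eps=1.0):
    """Regularized Dirac of Eq. (17); the derivative of heaviside_eps."""
    return eps / (math.pi * (eps ** 2 + phi ** 2))
```

At the zero level $H_\varepsilon(0) = 1/2$, a smaller $\varepsilon$ sharpens the transition toward the step function, and $\delta_\varepsilon$ is symmetric about zero.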

2.3. The numerical approximation of the model

To discretize the equation in $\phi$, we use an implicit finite differences scheme (Chan and Vese, 2001). The discretization and linearization of the curve evolution function are therefore written as:

$$\frac{\varphi^{n+1}_{i,j} - \varphi^{n}_{i,j}}{\Delta t} = \delta_\varepsilon(\varphi^{n}_{i,j}) \Bigg[ \frac{\mu}{h^2}\,\Delta^x_-\!\left( \frac{\Delta^x_+\varphi^{n+1}_{i,j}}{\sqrt{(\Delta^x_+\varphi^{n}_{i,j})^2/h^2 + (\varphi^{n}_{i,j+1} - \varphi^{n}_{i,j-1})^2/(2h)^2}} \right)$$
$$\quad + \frac{\mu}{h^2}\,\Delta^y_-\!\left( \frac{\Delta^y_+\varphi^{n+1}_{i,j}}{\sqrt{(\Delta^y_+\varphi^{n}_{i,j})^2/h^2 + (\varphi^{n}_{i+1,j} - \varphi^{n}_{i-1,j})^2/(2h)^2}} \right) - \nu - \lambda_1 (u_{i,j} - c_{in}(\varphi^n))^2 + \lambda_2 (u_{i,j} - c_{out}(\varphi^n))^2$$
$$\quad - \alpha \min_{1 \le k \le n} (u_{i,j} - d_k)^2 + \beta \min_{1 \le p \le m} (u_{i,j} - e_p)^2 \Bigg] \qquad (18)$$

where $\Delta t$ and $h$ are the time iteration step and space iteration step, respectively, and the forward and backward differences of $\varphi^{n}_{i,j}$ are calculated as:

$$\Delta^x_- \varphi_{i,j} = \varphi_{i,j} - \varphi_{i-1,j}, \qquad \Delta^x_+ \varphi_{i,j} = \varphi_{i+1,j} - \varphi_{i,j}$$
$$\Delta^y_- \varphi_{i,j} = \varphi_{i,j} - \varphi_{i,j-1}, \qquad \Delta^y_+ \varphi_{i,j} = \varphi_{i,j+1} - \varphi_{i,j} \qquad (19)$$
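The difference operators of Eq. (19) can be sketched with array shifts; for brevity this illustration wraps around at the borders, whereas a real solver would replicate border values:

```python
import numpy as np

def differences(phi):
    """Forward/backward differences of Eq. (19) along i (axis 0) and
    j (axis 1). Borders wrap around (np.roll) for brevity."""
    dx_m = phi - np.roll(phi, 1, axis=0)    # backward difference in x
    dx_p = np.roll(phi, -1, axis=0) - phi   # forward difference in x
    dy_m = phi - np.roll(phi, 1, axis=1)    # backward difference in y
    dy_p = np.roll(phi, -1, axis=1) - phi   # forward difference in y
    return dx_m, dx_p, dy_m, dy_p

phi = np.arange(9, dtype=float).reshape(3, 3)
dx_m, dx_p, dy_m, dy_p = differences(phi)
```

At the interior pixel (1, 1) the forward differences are simply the neighbor-minus-center values along each axis.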

More details about this discretization can be found in Chan and Vese (2001). Based on the level set formulation, the newly proposed building extraction model incorporates five major stages. In the first stage, geometric and radiometric corrections are applied to the input image. In the next step, depending on the number of building and background classes, appropriate pixel values for points inside and outside the building boundaries are introduced as training data. Then, the proposed active contour model is run, and the building boundaries are extracted. The extracted building boundaries are subsequently generalized to smooth out jagged lines by using perpendicular angles and straight lines (Dutter et al., 2007). Finally, the accuracy of the detected buildings is evaluated. Fig. 1 illustrates the process.

Fig. 1. Flowchart of building extraction algorithm.

Table 1
Optimum model parameters.

Constant parameter | Optimum value
μ  | 1
ν  | 1
λ1 | 1.5
λ2 | 2
α  | 4
β  | 0.8

S. Ahmadi et al. / International Journal of Applied Earth Observation and Geoinformation 12 (2010) 150–157154

3. Experimental results

The proposed model was implemented and tested on an aerial image of Lavasan (central Iran). Fig. 2 shows the original image of the test region. The spatial resolution of the image is 0.5 m, and it was acquired in August 2005.

Fig. 2. Lavasan photograph used for test purpose.

Fig. 3. Initial model curves.

The model was initialized by introducing sample data: two points from buildings (for two classes of buildings) and two points from the background (for two classes of image background). The initial curves were generated automatically as a series of regular circles over the whole image (Fig. 3).

Based on our experimental results, we conclude that if there are more than 36 initial curves (6 × 6 circles), the number of curves only affects the model's execution speed; increasing the number of initial curves does not change the model's output. However, having too few initial curves can impact the model's accuracy, leaving some buildings undetected.
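The automatic initialization described above can be sketched as a level set function whose zero level is a regular grid of circles; the function name and radius choice below are our own assumptions, not the authors' implementation:

```python
import numpy as np

def initial_phi(shape, n_circles=6):
    """Level set whose zero level is an n_circles x n_circles grid of
    circles, with phi > 0 inside each circle (the 'inside' convention)."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy = (np.arange(n_circles) + 0.5) * h / n_circles  # circle centres, rows
    cx = (np.arange(n_circles) + 0.5) * w / n_circles  # circle centres, cols
    r = min(h, w) / (4 * n_circles)                    # assumed radius
    # Distance from each pixel to the nearest circle centre:
    d = np.full(shape, np.inf)
    for y0 in cy:
        for x0 in cx:
            d = np.minimum(d, np.hypot(yy - y0, xx - x0))
    return r - d  # positive inside a circle, negative outside

phi0 = initial_phi((60, 60))
```

The resulting surface is positive inside each of the 36 seed circles and negative elsewhere, giving the evolution a topology-flexible start anywhere in the image.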

In the next stage, the constant parameters of the model are set, and the active contour model evolves the curves from their initial positions. After ten iterations, the curves had locked onto the building boundaries. The optimum values of the model parameters were determined experimentally by trial and error. It should be mentioned that once the parameters are determined for one image, they change only slightly for other test images. For instance, if the parameter α is found to be 1 for one image, it might vary to around α = 1.15 for another. The parameters thus do not change extensively and can easily be adopted for other images. Table 1 lists the optimum values of the constant parameters.

After ten iterations, the contours remained static and the process could be terminated. Fig. 4 shows the curve positions after the 10th iteration.

The extracted building boundaries had many vertices and were highly irregular, making them unsuitable for importing into Geographic Information Systems. Therefore, in the final stage, we generalized all of the primary building boundaries to eliminate the redundant points. Our chosen generalization method follows that recommended by Dutter et al. (2007). The output of this step was a series of regular building polygons with semi-perpendicular angles and straight lines (Fig. 5). An accuracy assessment of the presented model is given in the next section.

Fig. 4. Primary extracted boundaries of buildings in the test region.

Table 3
The completeness factor for the proposed model.

All buildings       | 347
Extracted buildings | 281
Completeness factor | 80%

Fig. 5. Generalized boundaries of buildings in the test areas.

Table 2
Shape accuracy obtained by the model for the image of the tested region.

Maximum shape accuracy | 96
Minimum shape accuracy | 77
Mean shape accuracy    | 85


4. Accuracy assessment of the model

Many parameters affect the accuracy of the proposed model and can impact its outputs. The most important parameters are listed below:

- Number of initial curves
- Number of building and background classes
- Values of constant parameters
- Iteration number

(I) Number of initial curves: Based on our experimental results, we conclude that if there are more than 16 initial curves (circles) for the image (4 x 4 circles), the curve number only impacts the model's speed of execution; increasing the number of initial curves does not change the model's output. However, having too few initial curves can reduce the model's accuracy, leaving some buildings undetected.

(II) Number of building and background classes: The principal factor in achieving the best model result is the number of building and background classes selected. Using fewer building classes than the actual number results in some buildings not being extracted or being detected incorrectly. Also, if there are more than 8 classes of buildings in the image, the proposed model cannot detect all buildings precisely and some buildings will probably not be detected.

(III) Values of the constant parameters: The model incorporates parameters that determine the weight of each term in the energy function of the model. These can be assigned experimentally for each image.

(IV) Iteration number: This parameter controls the curve evolution process and terminates the model after a certain number of iterations. In our model, the curve positions change for up to ten iterations, after which the contours become essentially immobile.

Fig. 6. Some of the buildings that have not been extracted.
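The seeding and stopping behaviour described in items (I) and (IV) can be sketched as follows; the grid size, circle radius, convergence threshold, and function names are illustrative assumptions, not the authors' exact settings:

```python
import numpy as np

def init_circle_grid(shape, n=4):
    """Signed distance to a grid of n x n circles (phi > 0 inside a circle),
    mimicking the seeding of the image with 16 (4 x 4) initial curves."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy = (np.arange(n) + 0.5) * h / n        # circle centre rows
    cx = (np.arange(n) + 0.5) * w / n        # circle centre columns
    r = 0.8 * min(h, w) / (2 * n)            # circle radius (assumed)
    # Distance from every pixel to its nearest circle centre.
    d = np.min([np.hypot(ys - y0, xs - x0)
                for y0 in cy for x0 in cx], axis=0)
    return r - d                             # level-set function

def evolve(phi, step, max_iter=10, tol=1e-3):
    """Apply `step` (one curve-evolution update) until the zero level set
    stops moving or `max_iter` iterations are reached."""
    for i in range(max_iter):
        new_phi = step(phi)
        # Fraction of pixels whose inside/outside label changed.
        moved = np.mean((new_phi > 0) != (phi > 0))
        phi = new_phi
        if moved < tol:                      # contours essentially static
            return phi, i + 1
    return phi, max_iter

phi0 = init_circle_grid((64, 64))
# A dummy update that shrinks every curve slightly each iteration:
phi, n_it = evolve(phi0, lambda p: p - 0.5)
print(n_it)  # -> 10
```

With a real evolution step, the loop would typically stop before the iteration cap once the fraction of relabelled pixels falls below the threshold, matching the observation that the contours freeze after about ten iterations.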

To evaluate the accuracy of our model, we calculated McKeown's shape accuracy metric (McKeown et al., 2000), as well as the completeness and correctness factors of the extracted building boundaries.

- Shape accuracy: To calculate the shape accuracy, all true building areas were compared to the values obtained by the model (McKeown et al., 2000):

Shape accuracy = (1 - |A1 - A2| / A1) x 100   (20)

In this equation, A1 is the true building area and A2 is the area of its corresponding detected building. It should be mentioned that in this research the true positions of the buildings were digitized manually in ArcGIS software. Table 2 demonstrates the results of applying our model to the tested images.
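As a small worked example of Eq. (20), taking the true area A1 as the denominator (the function name is our own):

```python
def shape_accuracy(a_true, a_detected):
    """Eq. (20): 100 * (1 - |A1 - A2| / A1), where A1 is the true
    building area and A2 the detected area, in the same units."""
    return (1 - abs(a_true - a_detected) / a_true) * 100

# A 200 m^2 building detected as 230 m^2 scores 85%:
print(shape_accuracy(200, 230))  # -> 85.0
# A perfectly recovered area scores 100%:
print(shape_accuracy(100, 100))  # -> 100.0
```

Note that the metric penalizes over- and under-estimation of area symmetrically, and can go negative when the detected area differs from the true area by more than the true area itself.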

- Completeness: The completeness of the presented model is obtained as the ratio of the number of extracted buildings to the total number of existing buildings in the image domain. Table 3 illustrates the completeness factor for our model.

Because of the radiometric similarity between some buildings and the image background, those buildings are not extracted. Also, some buildings appear very small in the image and are therefore not detected. In the above-mentioned equations, by changing m and n with respect to the other parameters, the operator can adjust the minimum size of the buildings that are detected. In Fig. 6 some of the buildings that have not been extracted are marked with green ellipses.

- Correctness: This factor demonstrates the accuracy of the model in terms of boundary extraction performance. It is the ratio of the number of accurately extracted buildings to the total number of extracted buildings. Table 4 shows the correctness factor for our model.

Some buildings in the image are not extracted correctly because their radiometric characteristics are similar to those of the image background. Fig. 7 shows some of the inaccurately extracted buildings.

Table 4. The correctness factor of our model.

Extracted buildings: 281
Accurately extracted buildings: 270
Correctness factor: 96%
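The completeness and correctness factors are simple ratios; a minimal sketch using the counts from Tables 3 and 4 (function names are our own):

```python
def completeness(n_extracted, n_total):
    """Completeness: extracted buildings over all buildings, in percent."""
    return 100 * n_extracted / n_total

def correctness(n_accurate, n_extracted):
    """Correctness: accurately extracted buildings over all extracted
    buildings, in percent."""
    return 100 * n_accurate / n_extracted

# 347 buildings in the scene, 281 extracted, 270 of them accurate:
print(completeness(281, 347))  # ~81%, reported as 80% in Table 3
print(correctness(270, 281))   # ~96%, matching Table 4
```

Note that completeness and correctness trade off against each other: a model tuned to extract more candidate buildings tends to gain completeness while admitting more false positives.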


5. Conclusion

In this paper, an improved active contour model was developed and exploited for automatically extracting buildings from aerial images. Unlike the classical snake model, our approach does not require initial curves to be placed near the building edges. Moreover, the proposed model not only detects the relevant building boundaries, but also avoids extracting the edges of other objects in the image. One advantage of the new model is its independence from additional data, such as height information, that is usually required for building extraction. Furthermore, our model produces GIS-ready data. We conclude that our approach generates acceptable results.

References

Baillard, C., Maitre, H., 1999. 3D reconstruction of urban scenes from aerial stereo imagery: a focusing strategy. Computer Vision and Image Understanding 76 (3), 244–258.

Brox, T., Weickert, J., 2004. Level set based image segmentation with multiple regions. Pattern Recognition 3175 (2), 415–423.

Cao, G., Yang, X., 2007. Man-made object detection in aerial images using multi-stage level set evolution. International Journal of Remote Sensing 28 (8), 1747–1757.

Caselles, V., Kimmel, R., Sapiro, G., 1995. Geodesic active contours. IEEE International Conference on Computer Vision, 694–699.

Caselles, V., Kimmel, R., Sapiro, G., 1997. Geodesic active contours. International Journal of Computer Vision 22 (1), 61–79.

Chan, T.F., Vese, L.A., 2001. Active contours without edges. IEEE Transactions on Image Processing 10, 266–277.

Fig. 7. (a) Some of the inaccurately extracted buildings and (b) a zoomed view of one of these buildings.

Chan, T.F., Sandberg, B.Y., Vese, L.A., 2000. Active contours without edges for vector-valued images. Journal of Visual Communication and Image Representation 11, 130–141.

Chen, L., Zhou, Y., Wang, Y.G., Yang, J., 2006. GACV: geodesic-aided C-V method. Pattern Recognition 39 (7), 1391–1395.

Dash, J., Steinle, E., Singh, R.P., Bahr, H.P., 2004. Automatic building extraction from laser scanning data: an input tool for disaster management. Advances in Space Research 33, 317–322.

Dutter, M., Hollaus, M., Pfeifer, N., 2007. Generalization of building footprints derived from high resolution remote sensing data.

Evans, L., 1998. Partial Differential Equations. American Mathematical Society.

Guo, T., Yasuoka, Y., 2002. Snake-based approach for building extraction from high-resolution satellite images and height data in urban areas. In: Proceedings of the 23rd Asian Conference on Remote Sensing, Kathmandu, November 25–29, p. 7.

Halla, N., Brenner, C., 1999. Extraction of buildings and trees in urban environment. ISPRS Journal of Photogrammetry & Remote Sensing 54, 130–137.

Hongjiana, Y., Shiqiang, Z., 2006. 3D building reconstruction from aerial CCD image and sparse laser sample data. Optics and Lasers in Engineering 44, 555–566.

Hou, Z., Han, C., 2005. Force field analysis snake: an improved parametric active contour model. Pattern Recognition Letters 26, 513–526.

Karantzalos, K., Paragios, N., 2009. Recognition-driven two-dimensional competing priors toward automatic and accurate building detection. IEEE Transactions on Geoscience and Remote Sensing 47 (1).

Kass, M., Witkin, A., Terzopoulos, D., 1988. Snakes: active contour models. International Journal of Computer Vision 1, 321–331.

Lafarge, F., Descombes, X., Zerubia, J., Pierrot-Deseilligny, M., 2008. Automatic building extraction from DEMs using an object approach and application to the 3D-city modeling. ISPRS Journal of Photogrammetry & Remote Sensing 63, 365–381.

Li, C., Liu, J., Fox, M.D., 2005. Segmentation of external force field for automatic initialization and splitting of snakes. Pattern Recognition 38, 1947–1960.

Lie, J., Lysaker, M., Tai, X.C., 2006. Binary level set model and some applications to Mumford–Shah image segmentation. IEEE Transactions on Image Processing 15 (5).

Malladi, R., Sethian, J.A., Vemuri, B.C., 1995. Shape modeling with front propagation: a level set approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 17, 158–175.

Mayunga, S.D., Zhang, Y., Coleman, D.J., 2005. Semi-automatic building extraction utilizing Quickbird imagery. IAPRS XXXVI (Part 3/W24), 29–30.

McKeown, D.M., Bulwinkle, T.M., Cochran, S., Harvey, W., McGlone, C., Shufelt, J.A., 2000. Performance evaluation for automatic feature extraction. International Archives of Photogrammetry and Remote Sensing 33 (Part B2), 379–394.

Miliaresis, G., Kokkas, N., 2007. Segmentation & object based classification for the extraction of the building class from LIDAR DEMs. Computers & Geosciences 33, 1076–1087.

Osher, S., Sethian, J.A., 1988. Fronts propagating with curvature-dependent speed: algorithms based on Hamilton–Jacobi formulations. Journal of Computational Physics 79, 12–49.

Peng, J., Zhang, D., Liu, Y., 2004. An improved snake model for building detection from urban aerial images. Pattern Recognition Letters, 587–595.

Peng, J., Zhang, D., Liu, Y., 2005. An improved snake model for building detection from urban aerial images. Pattern Recognition Letters 26, 587–595.

Pi, L., Fan, J.S., Shen, C.M., 2007. Color image segmentation for objects of interest with modified geodesic active contour method. Journal of Mathematical Imaging and Vision 27 (1), 51–57.

Rottensteiner, F., Trinder, J., Clode, S., Kubik, K., 2005. Using the Dempster–Shafer method for the fusion of LIDAR data and multispectral images for building detection. Information Fusion 6 (4), 283–300.

Samadzadegan, F., Azizi, A., Hahn, M., Lucas, T.C., 2005. Automatic 3D object recognition and reconstruction based on neuro-fuzzy modeling. ISPRS Journal of Photogrammetry & Remote Sensing 59, 255–277.

Schenk, T., Csatho, B., 2002. Fusion of LiDAR data and aerial imagery for a more complete surface description. International Archives of Photogrammetry & Remote Sensing and Spatial Information Sciences 34 (Part 3), 310–317.

Siddiqi, K., Lauziere, Y.B., Tannenbaum, A., Zucker, S.W., 1998. Area and length minimizing flows for shape segmentation. IEEE Transactions on Image Processing 7 (3), 433–443.

Sohn, G., Dowman, I., 2007. Data fusion of high-resolution satellite imagery and LiDAR data for automatic building extraction. ISPRS Journal of Photogrammetry & Remote Sensing 62, 43–63.

Vasilevskiy, A., Siddiqi, K., 2002. Flux maximizing geometric flows. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (12), 1565–1578.

Vestri, C., 2006. Using range data in automatic modeling of buildings. Image and Vision Computing 24, 709–719.

Weidner, U., Forstner, W., 1995. Towards automatic building reconstruction from high resolution digital elevation models. ISPRS Journal of Photogrammetry and Remote Sensing 50 (4), 38–49.

Yan, P., Kassim, A.A., 2006. Segmentation of volumetric MRA images by using capillary active contour. Medical Image Analysis 10, 317–329.

Yezzi Jr., A., Kichenassamy, S., Kumar, A., Olver, P., Tannenbaum, A., 1997. A geometric snake model for segmentation of medical imagery. IEEE Transactions on Medical Imaging 16 (2), 199–209.


Ying, Z., Guangyao, L., Xiehua, S., Xinmin, Z., 2009a. A geometric active contour model without re-initialization for color images. Image and Vision Computing.

Ying, Z., Guangyao, L., Xiehua, S., Xinmin, Z., 2009b. Geometric active contours without re-initialization for image segmentation. Pattern Recognition.

Zhao, B., Trinder, J.C., 2000. Integrated-approach-based automatic building extraction. International Archives of Photogrammetry and Remote Sensing XXXIII (Part B3).

Zimmermann, P., 2000. A new framework for automatic building detection analyzing multiple cue data. International Archives of Photogrammetry and Remote Sensing XXXIII (Part B3).