
Improving Relief Classification with Contextual Merging

Bård Romstad

Department of Physical Geography, University of Oslo, P.O. Box 1042 Blindern, N-0316 OSLO, Norway

[email protected]

Abstract. Automatic classification of relief attributes into meaningful morphological units has great potential within the field of geomorphology. When common classification algorithms such as iterative cluster analysis are applied to relief data, the result is often a set of classes with a marked lack of coherence in geographical space. The scattering of classes occurs because there is an authentic overlap between different classes in both attribute and geographical space. Therefore other procedures that take the class overlap into account should be used for relief structuring. One such procedure is the application of contextual merging, or generalisation, prior to classification. As a case study, an area close to Ny-Ålesund, Spitsbergen, is classified using this procedure, and it is shown that in this specific relief the coherence and interpretability of the result are increased compared to a simple cluster analysis alone.

1 Introduction

Generalisation of data into a smaller number of classes, relevant to a certain application, is a task we perform every day without giving it much consideration. In geomorphology such generalisations include the classification of a continuous surface into units, or landforms, such as a valley, hill or cliff. These are terms that make sense to most people as spatial units with a certain shape, extent and topography, but can usually also be defined as a domain within which a specific physical surface process is dominant. Hence the quantification of such units is useful not only as an objective description of a landscape, but also as a component in spatial modelling of geomorphological processes.

Topographic parameters, derived from digital elevation models, can give us valuable information on the characteristics of a certain surface element. However, the spatial aspect, which is essential in a geomorphological context, is often neglected in the statistical algorithms most commonly used to classify large sets of multivariate data. The aim of this paper is to illustrate how contextual merging, an algorithm for generalisation of continuous data into spatial units, can be used to improve classification based on topographic parameters. Since the procedure is at an experimental stage, an attempt is also made to point out some problematic issues associated with the procedure, and some modifications are suggested that might improve the classification results.

2 Theory

Geomorphological processes are primarily controlled by topography. With the increasing availability of digital elevation models, and of technology able to process them, it has become necessary to define quantitative relationships between topography and processes [1, 2, 9, 10, 12]. Pike [13] introduces the concept of describing a landform by a geometric signature. He shows that disparate landscapes can be distinguished with a set of measures that describe the topographic form. This concept has been taken further within the field of geomorphometry [4, 6, 14, 15]. The basic assumption in geomorphometry is that there is a close relationship between surface processes and surface characteristics that can be expressed as topographic parameters.

Combinations of topographic parameters define topographic regions (relief units) after a classification process (relief classification). The units are assumed to represent areas with a predominance of certain surface processes and therefore also landforms. If an empirical or physical relationship between a set of topographic parameters and a surface process can be established, geomorphometry can be used as a tool for spatial modelling. If we further assume that these relationships are scale independent, at least within a certain range, it becomes possible to up-scale and down-scale spatially distributed information [5].

The classification of a landscape into functional morphological units, describing physical or conceptual domains, is a task commonly performed within the geosciences. This task is conventionally carried out during field surveys or by interpreting aerial photographs. This method is both time-consuming and dependent on the surveyor’s interpretation of a more or less qualitative set of rules. However, the numerical description of a surface’s geometrical or topographical characteristics, coupled with the relationship between topography, process and landform, gives us the possibility to set up a more consistent set of rules for landform delineation. Thus the problem can be approached more objectively. Still, as the number of variables needed to delineate a certain terrain type increases, the classification process becomes more complex, and soon it becomes too complicated to be performed without the help of statistical analysis and computer technology.

Generalisation of multivariate data by means of statistical analysis, such as iterative cluster analysis, has become a common practice within the fields of GIS and remote sensing. Iterative cluster analysis defines classes based on the natural clustering of the data in attribute space and is used in many off-the-shelf software packages such as ESRI’s ArcInfo and ERDAS’ Imagine. The use of such algorithms for spatial structuring has proved successful when applied to data that represent innate crisp classes with little overlap in attribute or geographical space. This is often the case for spectral data from air- or space-borne sensors; the classes that emerge from such data reflect different types of land cover that are more or less distinct, and any overlap between classes is mainly an issue of scale rather than ambiguity in class definitions. Relief data, however, most commonly represent a multivariate continuum that has few or no distinct class boundaries because there is an authentic overlap between different classes, in both space and characteristics. This fact suggests that other procedures that take the class overlap into account should be used for relief structuring. Irvin [8] and Burrough [3] describe the use of continuous classification (fuzzy set) methods. In these methods individual cells or data points are assigned an affinity to each cluster rather than an absolute membership. They can easily be integrated in a cluster analysis, but the result of a fuzzy classification is hard to visualise, and assessment requires a comprehensive understanding of how the algorithms work and of the nature of the data. Furthermore, these methods take overlap into account only in attribute space, and the issue of spatial overlap is not addressed.

Friedrich [7] suggests a different approach; he couples the cluster analysis with a preliminary spatial-neighbourhood analysis. The approach is somewhat similar to smoothing the data by a window operation, except that instead of a rigidly defined window an iterative procedure is applied. The generalisation algorithm is based on the distance vector in multivariate space between neighbouring cells in the terrain. Initially each cell is considered a unique class. For each step of the procedure the two neighbouring classes with the shortest distance vector between them are merged, and new attribute values are calculated. When a satisfactory degree of generalisation is reached, the algorithm is halted. The idea is that the resulting classes represent individual, unique landform units present in the terrain. The total number of classes and their suitability for further analysis will therefore depend on the size and relief of the study area and the variables used for similarity assessment.

3 Method

A case study was carried out on an area close to Ny-Ålesund, Spitsbergen (Fig. 1). The study area was 1300 × 1200 metres and covered the eastern slope of the Schetelig Mountain. Total relief in the area was 640 metres, and at the base of the slope several large talus cones indicated extensive slope activity. A relatively good digital elevation model (DEM) with a cell size of 10 × 10 metres was available for the area. A digital map of the talus cones was produced from field data and aerial photos.

Building on the generalisation algorithm suggested by Friedrich [7], the approach taken here was to use the algorithm to break up the study area into homogeneous landscape units. Because each cell’s relation to its neighbourhood, or context, determines whether or not it should be generalised, this process will be referred to as “contextual merging.” For each of the relief units, new attribute values were calculated and used as input to a cluster analysis that classified the units into more general landform types (see Fig. 2). The resulting map was then compared to the landform map and to a classification based on iterative cluster analysis alone.

[Figure 1: location map with latitude/longitude graticule, scale bars in kilometres, north arrow, and the study area marked.]

Fig. 1. Location of the study area at the Brøgger peninsula close to the community of Ny-Ålesund.

[Figure 2: flow diagram with the steps labelled "Contextual merging" and "Iterative cluster analysis".]

Fig. 2. The classification process. To the left are the original continuous datasets; after the contextual merging the study area is divided into unique areas with internal homogeneity. These areas are then classified into more general landform types by iterative cluster analysis.

3.1 Contextual Merging

In order to pinpoint some of the issues that need to be addressed before contextual merging is applied, it is useful to look at the algorithm in more detail. The basic unit in the procedure is the distance vector between two classes. This vector is calculated for each pair of neighbouring classes in the dataset and can be described as in (1).

$$\vec{v} = \sum_{i}^{n} (a_i - b_i)^2 \qquad (1)$$

where $\vec{v}$ is the vector between classes $a$ and $b$, $n$ is the total number of attributes, and $a_i$ and $b_i$ are the values of attribute $i$ in classes $a$ and $b$ respectively. When all the vectors have been calculated they are put in an array and sorted. Then the two neighbouring classes with the shortest vector between them can be merged. New attribute values are calculated for the class by averaging the values in the two original classes, and the distance vectors between the new class and its neighbours are updated before the procedure is iterated.

The merging was performed using IVHG, a program described by Friedrich [7] and modified by the author. The modifications were made mainly to allow the distance vectors to be monitored during the cell merging process. This provided, as explained below, valuable information that was used to decide on a halting criterion.
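The merging loop described above can be sketched in a few lines of Python. This is an illustration only, not the IVHG program: the function names, the 4-connected neighbourhood, the size-weighted averaging of merged classes and the brute-force recomputation of all distances at every step are assumptions made for brevity. The distance in (1) is taken here to be the sum of squared attribute differences; applying a square root would change the threshold value but not the order in which classes are merged.

```python
import numpy as np

def neighbour_pairs(labels):
    """Return the unique label pairs (a, b), a < b, that share a cell edge."""
    horiz = np.stack([labels[:, :-1].ravel(), labels[:, 1:].ravel()], axis=1)
    vert = np.stack([labels[:-1, :].ravel(), labels[1:, :].ravel()], axis=1)
    pairs = np.vstack([horiz, vert])
    pairs = np.sort(pairs[pairs[:, 0] != pairs[:, 1]], axis=1)
    return np.unique(pairs, axis=0)

def contextual_merge(attrs, max_distance):
    """Merge neighbouring classes until the shortest distance exceeds max_distance.

    attrs: array (rows, cols, n_attributes) of normalised, weighted attributes.
    Returns the label grid and the list of distances at which merges occurred.
    """
    rows, cols, n_attr = attrs.shape
    labels = np.arange(rows * cols).reshape(rows, cols)  # every cell starts as its own class
    flat = attrs.reshape(-1, n_attr).astype(float)
    mean = {k: flat[k].copy() for k in range(rows * cols)}
    size = {k: 1 for k in range(rows * cols)}
    merge_distances = []
    while len(mean) > 1:
        pairs = neighbour_pairs(labels).tolist()
        dists = np.array([np.sum((mean[a] - mean[b]) ** 2) for a, b in pairs])
        best = int(np.argmin(dists))
        if dists[best] > max_distance:  # halting criterion (assumed form)
            break
        a, b = pairs[best]
        total = size[a] + size[b]
        # Assumption: the merged class takes the mean over all of its cells.
        mean[a] = (size[a] * mean[a] + size[b] * mean[b]) / total
        size[a] = total
        labels[labels == b] = a
        del mean[b], size[b]
        merge_distances.append(float(dists[best]))
    return labels, merge_distances

# Tiny example: one attribute on a 1 x 4 strip with two obvious units.
strip = np.array([[[0.0], [0.1], [5.0], [5.2]]])
units, monitored = contextual_merge(strip, max_distance=1.0)
```

Returning the list of merge distances mirrors the monitoring of distance vectors mentioned in the text; a production version would keep the pair distances in a priority queue instead of recomputing them.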

Selecting and Processing Geomorphometric Attributes. The first problem that arises is determining suitable morphometric attributes for the contextual merging. To be suitable, the attributes should have relatively smaller variation within landforms than between them. If this is the case the algorithm will merge neighbouring cells into meaningful units, even though we may not be able to say anything qualitative about them. The attributes should be normalised prior to generalisation, but attributes that are expected to have greater salience in the given context may be weighted accordingly. If certain ranges of attribute values are more relevant than others, one could apply a non-linear stretch function to the data. Defining weights and especially stretch functions does, however, require a deeper understanding of the relationship between the selected attributes and the landforms in the study area.

The following four variables were used as input to the algorithm (for a detailed explanation of these variables see [6, 10, 11]):

• Slope; magnitude of the maximum slope angle between each cell and its eight neighbours

• Profile curvature; the curvature of the land surface in the direction of the maximum slope

• Planform curvature; the curvature of the land surface perpendicular to the maximum slope

• Wetness index; defined as ln(As / tan B), where As is the upslope contributing area and B is the slope
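Two of the attributes listed above are simple enough to sketch directly. The Python below is a minimal illustration under stated assumptions: the function names are hypothetical, grid edges are handled by repeating the border cells, the upslope contributing area As is assumed to be supplied by a separate flow-routing step, and a small minimum slope is imposed so that flat cells do not cause a division by zero. The curvature measures are not covered here.

```python
import numpy as np

def max_neighbour_slope(dem, cell_size):
    """Largest absolute slope angle (radians) between each cell and its 8 neighbours."""
    dem = np.asarray(dem, dtype=float)
    rows, cols = dem.shape
    padded = np.pad(dem, 1, mode="edge")  # assumption: repeat border cells
    steepest = np.zeros_like(dem)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbour = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            run = cell_size * np.hypot(dr, dc)  # diagonal neighbours are further away
            steepest = np.maximum(steepest, np.abs(dem - neighbour) / run)
    return np.arctan(steepest)

def wetness_index(upslope_area, slope, min_slope=1e-3):
    """ln(As / tan B); min_slope (radians) is an assumed guard for flat cells."""
    slope = np.maximum(np.asarray(slope, dtype=float), min_slope)
    return np.log(np.asarray(upslope_area, dtype=float) / np.tan(slope))

# A plane rising 10 m per 10 m cell has a slope of 45 degrees everywhere.
plane = np.array([[0, 10, 20], [0, 10, 20], [0, 10, 20]])
slope = max_neighbour_slope(plane, cell_size=10.0)
```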

To best focus on active slope processes, the attribute of slope gradient was weighted by two prior to generalisation. While curvature measures provide important information about the characteristics of a landscape unit, they are very sensitive to even small changes in their local neighbourhood and therefore often appear in a rather chaotic pattern. This was especially evident in the strong relief in the study area, and the curvatures were therefore only weighted by one. The topography of talus cones will always lead to a relatively high wetness index in the concave areas along their edges; thus the wetness index is also suitable for their delineation. It is also a measure that gives information not only on a cell’s immediate neighbourhood, but also on its total drainage area. Given its relevance for this application, the wetness index was weighted by four.
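The normalisation and weighting step can be illustrated as follows. The weights (2, 1, 1, 4) are those stated in the text; everything else is an assumption made for the sketch: the function and key names are hypothetical, normalisation is taken to be a min-max rescaling to the range 0 to 1, and a weight is applied by multiplying the normalised attribute.

```python
import numpy as np

# Weights as reported in the text; the dictionary keys are illustrative names.
WEIGHTS = {
    "slope": 2.0,
    "profile_curvature": 1.0,
    "planform_curvature": 1.0,
    "wetness_index": 4.0,
}

def normalise(grid):
    """Min-max rescale a grid to [0, 1] (assumed normalisation method)."""
    grid = np.asarray(grid, dtype=float)
    lo, hi = np.nanmin(grid), np.nanmax(grid)
    if hi == lo:
        return np.zeros_like(grid)
    return (grid - lo) / (hi - lo)

def weighted_stack(grids, weights=WEIGHTS):
    """Stack normalised, weighted attribute grids into (rows, cols, n_attributes)."""
    layers = [weights[name] * normalise(grids[name]) for name in weights]
    return np.stack(layers, axis=-1)

example = {
    "slope": [[0.0, 10.0], [20.0, 40.0]],
    "profile_curvature": [[-1.0, 0.0], [0.5, 1.0]],
    "planform_curvature": [[0.0, 0.0], [0.0, 0.0]],
    "wetness_index": [[2.0, 4.0], [6.0, 10.0]],
}
stack = weighted_stack(example)
```

With multiplicative weights, an attribute weighted by four contributes sixteen times as much to a squared distance as one weighted by one, which is worth keeping in mind when choosing weights.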

Choosing Degree of Generalisation. The degree of generalisation chosen will act as a halting criterion for the contextual merging. A suitable degree of generalisation will primarily be dependent on the scale of the data relative to the scale of the features being classified: the higher the resolution of the data, the greater the degree of generalisation. The texture, or roughness, also affects when the algorithm should be halted, i.e., the smoother the terrain, the greater the degree of generalisation that is possible. This is because morphological units in smooth areas by definition are more homogeneous than they are in rugged terrain. Thus problems arise when landscapes with different textures occur within the study area. The algorithm will then have a tendency to start merging dissimilar landforms in the smooth areas before merging cells within single landforms in rougher areas.

To make the selection of generalisation degree less arbitrary, the distance vectors were monitored during the cell merging process. By defining the halting criterion as the maximum displacement allowed rather than the degree of generalisation, the same criterion can be used in areas with different relief given that the same attributes are used as input. When a satisfactory degree of generalisation was reached, the distance vector between merged classes represented a displacement of about 15% of the total variation in all attribute values. The original 15,600 cells in the study area had then been reduced to about 1,500 relief units, which is a generalisation of about 91% (see Fig. 3b).
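The figures in this paragraph can be checked with a short calculation. The function names are illustrative, and the degree of generalisation is assumed to mean the share of original cells removed by merging.

```python
def cell_count(width_m, height_m, cell_size_m):
    """Number of square cells covering a rectangular study area."""
    return (width_m // cell_size_m) * (height_m // cell_size_m)

def degree_of_generalisation(n_cells, n_units):
    """Share of the original cells removed by merging (0 = none, 1 = all)."""
    return 1.0 - n_units / n_cells

n_cells = cell_count(1300, 1200, 10)  # 130 * 120 cells
for n_units in (1500, 1400):
    print(n_units, f"{degree_of_generalisation(n_cells, n_units):.1%}")
```

The area of 1300 × 1200 metres at a 10-metre cell size gives the 15,600 cells quoted. Exactly 1,500 units would correspond to 90.4%, and 1,400 units to 91.0%, so the two rounded figures in the text ("about 1,500" and "about 91%") are compatible.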

3.2 Cluster Analysis

To classify the relief units into more general landscape types, an iterative cluster analysis was used. Before applying this analysis, however, input attributes needed to be determined. In this study, the attributes used were the same four as used in the generalisation, supplemented with the following:

• Overall curvature; the overall curvature of the land surface

• Wetness index × Planform curvature

All of the variables were calculated from the original DEM, and then averaged over the relief units generated by the contextual merging. The six averaged datasets were then classified into 10 classes using the ArcInfo procedures ISOCLUSTER and MLCLASSIFY (cluster analysis with maximum likelihood). For comparison the same cluster analysis was performed with the same data, but without averaging them over the relief units.
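The two steps in this paragraph, averaging attributes over relief units and then clustering the unit averages, can be sketched with NumPy. The clustering below is a plain k-means written out for illustration; it is a stand-in and not the ArcInfo ISOCLUSTER and MLCLASSIFY procedures used in the study. All function names and the random seeding are assumptions.

```python
import numpy as np

def unit_means(labels, attrs):
    """Average every attribute over the cells of each relief unit.

    Returns the unit ids, an (n_units, n_attributes) array of means, and for
    each cell the index of its unit in that array.
    """
    ids, inverse = np.unique(labels.ravel(), return_inverse=True)
    inverse = inverse.ravel()
    flat = attrs.reshape(-1, attrs.shape[-1]).astype(float)
    counts = np.bincount(inverse, minlength=ids.size)
    sums = np.stack(
        [np.bincount(inverse, weights=flat[:, j], minlength=ids.size)
         for j in range(flat.shape[1])],
        axis=1,
    )
    return ids, sums / counts[:, None], inverse

def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means (stand-in clustering); returns a class index per point."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for c in range(k):
            if np.any(assign == c):  # leave empty clusters where they are
                centres[c] = points[assign == c].mean(axis=0)
    return assign

# Two relief units (ids 0 and 5) on a 2 x 3 grid with one attribute.
labels = np.array([[0, 0, 5], [0, 5, 5]])
attrs = np.array([[[1.0], [3.0], [10.0]], [[2.0], [20.0], [30.0]]])
ids, means, cell_to_unit = unit_means(labels, attrs)
unit_class = kmeans(means, k=2)
class_map = unit_class[cell_to_unit].reshape(labels.shape)  # classes back on the grid
```

The comparison run described in the text corresponds to calling the clustering on the per-cell attribute table directly, skipping `unit_means`.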

Fig. 3. A photo of the study area (a) and the terrain model with the mapped talus cones and the contextually merged relief units (b).

4 Results

The results can be seen in Fig. 4. From the figure it is evident that even though the main patterns are the same between the two, the classes resulting from classification without contextual merging are more scattered and spatially inconsistent.

The classification based on generalised data also proved to classify the mapped landforms more accurately. Of the 976 cells mapped as taluses, 884 (91%) matched one single class. The total number of cells in this class was 1,029, which means that only 14% of its cells occurred elsewhere. For the other classification the most significant class covered only 59% of the mapped talus cells, and about 55% of its cells occurred elsewhere. Most of the remaining talus cells were included in a cluster with a very distinct convex planform curvature, and within this cluster the cells matching the mapped taluses represented only 54% of the class.
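The two percentages reported for the classification with contextual merging follow directly from the three cell counts given. A small check, with an illustrative function name:

```python
def class_agreement(n_reference, n_matched, n_class_total):
    """Return (share of reference cells captured, share of class cells outside the reference)."""
    coverage = n_matched / n_reference
    outside = (n_class_total - n_matched) / n_class_total
    return coverage, outside

# 976 mapped talus cells, 884 of them in the best class, 1,029 cells in that class.
coverage, outside = class_agreement(976, 884, 1029)
print(f"coverage {coverage:.1%}, outside mapped taluses {outside:.1%}")
```

This gives a coverage of 90.6% (reported as 91%) and 14.1% of the class lying outside the mapped taluses (reported as 14%). The counts behind the figures for the classification without merging are not given in the text, so they cannot be recomputed in the same way.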

Fig. 4. The resulting classifications from the cluster analysis with (a) and without (b) contextual merging.

The contextual merging also exposed a pattern overlooked by the other procedure, that is, the horizontal strata seen in the photo in Fig. 3a. This is probably because, for the merging algorithm, these steep and curved features acted as distinctive barriers between the smoother areas above and below, and were consequently merged into separate units. In the generalised dataset they were distinctive enough to be identified as a separate class by the cluster analysis, but when looked at cell by cell, these features were not distinguished from steep and curved areas elsewhere.

Classes representing ravines leading down to each of the talus cones can be seen in both classifications, even though the generalisation produced a less ambiguous pattern. There seems to be a clear relationship between the size of these ravine classes and the size of the talus cones below them. This supports the suggestion that the classes represent actual physical domains: in this case, one of weathering and erosion (the ravines) and another of accumulation (the talus cones). By looking at the concurrence of ravine classes and classes of talus cones, it seems clear that it is possible to determine areas correctly classified as talus cones without the use of auxiliary field data.

Not unexpectedly, the attributes’ variation within clusters was generally lower when only cluster analysis had been applied. Standard deviations here were between 8% and 12% of the total variation in the dataset for all attributes, whereas the classification with contextual merging had standard deviations between 8% and 18%.

5 Discussion and Conclusion

This paper showed how the algorithm suggested by Friedrich [7] could be used for classification of a specific landform type in an alpine relief. In this specific setting it delivered a more spatially correlated and apparently also more accurate classification than was achieved with an iterative cluster analysis alone. However, little effort was made to find the optimal number of classes for the cluster analysis. Both classifications might have performed better if the number of classes had been increased and the most similar ones merged. Furthermore, the employment of a continuous (fuzzy) classification, rather than a crisp iterative cluster analysis, might have allowed a better assessment of the spatial variations within classes and the transition zones between them.

It was found that using degree of displacement, rather than degree of generalisation, as a halting criterion added transferability to the procedure. With little effort, changes could be made to the algorithm so that maximum displacements could be defined for each attribute rather than, or in addition to, maximum total displacement. This would make it easier to determine halting criteria for different types of terrain, scale and applications by the use of field data or general knowledge of the landforms and geomorphometric attributes under consideration.
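One way the suggested per-attribute criterion might look is sketched below. This is a hypothetical illustration of the modification proposed in the text, not an existing feature of the merging program; the function name, the argument names and the use of a sum of squared differences for the total displacement are assumptions.

```python
import numpy as np

def merge_allowed(mean_a, mean_b, max_per_attribute=None, max_total=None):
    """Decide whether two neighbouring classes may be merged.

    max_per_attribute: optional limits on the absolute difference per attribute.
    max_total: optional limit on the sum of squared differences over all attributes.
    """
    diff = np.abs(np.asarray(mean_a, dtype=float) - np.asarray(mean_b, dtype=float))
    if max_per_attribute is not None:
        if np.any(diff > np.asarray(max_per_attribute, dtype=float)):
            return False
    if max_total is not None and np.sum(diff ** 2) > max_total:
        return False
    return True

# The second attribute differs by 0.5, above its limit of 0.4, so the merge is refused.
refused = merge_allowed([0.0, 0.0], [0.1, 0.5], max_per_attribute=[0.2, 0.4])
accepted = merge_allowed([0.0, 0.0], [0.1, 0.3], max_per_attribute=[0.2, 0.4])
```

Both limits can be supplied together, in which case a merge must satisfy each of them.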

Because the attribute values for the relief units were calculated simply by averaging the value of the individual cells, the full potential of contextual merging was not realised in this study. A more comprehensive approach would be to calculate attribute values for each relief unit on the basis of its general shape, internal topography and context in the terrain, for example, by using parameters such as the relief unit’s length/width ratio, length in slope direction, area/elevation skew, and so on. Using these kinds of higher order parameters could result in a more nuanced classification result, one that not only reflects the physical processes at the cell level, but also at the level of the entire relief unit.

This study shows that contextual merging can augment current methods used in spatial modelling of geomorphological processes. It not only reaffirmed geomorphological assumptions about the relationships between topographical parameters and landforms, but also indicated a clear, and perhaps quantifiable, relationship between some of the classes created. This not only simplifies the interpretation of the classification results, but may also improve the accuracy.

Acknowledgments

This study was carried out as part of my MSc dissertation at the Department of Physical Geography, University of Oslo, under the supervision of Dr. Bernd Etzelmüller. The field data was collected during the course "Arctic geomorphology", arranged by Dept. of Phys. Geography and funded by the Faculty of Mathematics and Natural Sciences.

I would like to thank PhD student Eva Heggem for help with the collection and analysis of field data, and for productive comments and ideas during the initial stages of the study. I would also like to thank Bernd Etzelmüller and Lynn Nygård for valuable comments on this paper.

References

1. Beven, K.J. and M.J. Kirkby. A physically based, variable contributing area model of basin hydrology. Hydrological Sciences, 1979. 24: p. 43-69.

2. Beven, K.J., M.J. Kirkby, N. Schofield, and A.F. Tagg. Testing a physically-based flood forecasting model (TOPMODEL) for three U.K. catchments. Journal of Hydrology, 1984. 69(1-4): p. 119-143.

3. Burrough, P.A., P.F.M. van Gaans, and R.A. MacMillan. High-resolution landform classification using fuzzy k-means. Fuzzy Sets and Systems, 2000. 113(1): p. 37-52.

4. Dikau, R., The application of a digital relief model to landform analysis in geomorphology, in Three dimensional applications in geographical information systems, J. Raper, Editor. 1989, Taylor and Francis: London, United Kingdom. p. 51-77.

5. Etzelmüller, B., R. Ødegård, I. Berthling, and J.L. Sollid. Terrain parameters and remote sensing data in periglacial research. Permafrost and Periglacial Processes, 2001. 12(1): p. 79-92.

6. Evans, I.S., General geomorphometry, derivatives of altitude, and descriptive statistics, in Spatial analysis in geomorphology, R.J. Chorley, Editor. 1972, Methuen & Co Ltd. p. 17-92.

7. Friedrich, K. Multivariate distance methods for geomorphographic relief classification. in Proceedings EU Workshop on Land Information Systems: Developments for planning the sustainable use of land resources. 1996. Hannover: European Soil Bureau.

8. Irvin, B.J., S.J. Ventura, and B.K. Slater. Fuzzy and isodata classification of landform elements from digital terrain data in Pleasant Valley, Wisconsin. Geoderma, 1997. 77(2-4): p. 137-154.

9. Moore, I.D., P.E. Gessler, G.A. Nielsen, and G.A. Peterson. Soil attribute prediction using terrain analysis. Soil Science Society of America Journal, 1993. 57(2): p. 443-452.

10. Moore, I.D., R.B. Grayson, and A.R. Ladson. Digital terrain modelling: a review of hydrological, geomorphological, and biological applications. Hydrological Processes, 1991. 5(1): p. 3-30.

11. Moore, I.D., A. Lewis, and J.C. Gallant, Terrain attributes; estimation methods and scale effects, in Modelling change in environmental systems, A.J. Jakeman, M.B. Beck, and M.J. McAleer, Editors. 1993, John Wiley and Sons: Chichester, United Kingdom. p. 189-214.

12. Moore, I.D., A.K. Turner, J.P. Wilson, S.K. Jenson, and L.E. Band, GIS and land-surface-subsurface process modeling, in Environmental modeling with GIS, M.F. Goodchild, B.O. Parks, and L.T. Steyaert, Editors. 1993, Oxford University Press: New York, NY. p. 196-230.

13. Pike, R.J. The geometric signature; quantifying landslide-terrain types from digital elevation models. Mathematical Geology, 1988. 20(5): p. 491-511.

14. Pike, R.J. Geomorphometry - progress, practice, and prospect. Zeitschrift für Geomorphologie, Supplementband, 1995. 101: p. 221-238.

15. Pike, R.J. Geomorphometry - diversity in quantitative surface analysis. Progress in Physical Geography, 2000. 24(1): p. 1-20.
