

Pattern Recognition Letters 15 (1994) 533-541

Image segmentation using texture boundary detection

Graham Jones

21e Balnakeil, Durness, Lairg, Sutherland IV27 4PT, United Kingdom

Received 16 November 1991

Abstract

The problem of segmenting an image with an unknown number of unknown textures is addressed. An operator is defined which gives a low output in the middle of homogeneous regions and a high output near boundaries. A hierarchical segmentation is then obtained using a novel algorithm which finds significant hollows in the output of the operator.

Key words: Texture segmentation, Watershed algorithm, Cluster analysis

1. Introduction

This paper describes a new algorithm for image segmentation when there is no prior information about the number or type of regions involved. It is not aimed at very difficult texture segmentations, but rather at the sort of images that people can solve "without scrutiny", as described by Marr (1982, pp. 93-97). There is evidence that in human vision, explicit boundaries are formed between different areas in such images. Furthermore it seems that these boundaries have little to do with how the areas are distinguished. We adopt a similar strategy here, that is, we try to construct an operator which detects boundaries in the image in a very general way. For example, changes in grey-level at various scales and changes in orientation are detected. The output of this operator can be seen as a measure of how rare or unusual a pixel in the image is, and is therefore called the "rarity map". Nothing precise is intended by this name.

Boundary detection is unusual in texture segmentation schemes: this may be because all the problems associated with simple grey-level edge detection (broad output at edges, false edges, discontinuities, etc.) are multiplied enormously when it comes to textures. The second stage of our method, the "catchment algorithm", is designed to deal with this sort of situation. It is best understood by analogy. We imagine the rarity map as a landscape and model the flow of water over it. Catchments are areas that hold water, that is, areas into which water drains but cannot easily escape. The size of the catchments depends on the amount of water in the system: the higher the "rainfall", the larger the regions that are obtained.

The next two sections describe the rarity operator and the catchment algorithm. Results are presented in Section 4, followed by a comparison with other work and a conclusion.

2. The rarity map

2.1. Definition

We consider the image to be a real-valued function P(x, y) where x and y are integers in some range. Thus each pixel corresponds to a particular (x, y) and adjacent pixels are a distance 1 apart. The construction of the rarity map from P is a matter of applying various sequences of operators to P and then adding up the results. We begin by describing the operators.

We use $F \circ P$ to denote the convolution of $P$ with a function $F = F(x, y)$, it being understood that $F$ is normalized so that the integral of $|F|$ over the plane is unity, and then digitized for computation. Let $g_\sigma(x) = \exp(-x^2/2\sigma^2)$ be the one-dimensional Gaussian and let $G_\sigma(x, y) = g_\sigma(x)g_\sigma(y)$ be the two-dimensional version. We convolve $P$ with $G_1$ and $G_2$ to obtain smoothed images $G_1 \circ P$ and $G_2 \circ P$. Together with $P$ itself, this gives us three feature maps.

We now describe some features that detect directional information, starting with the vertical version $V_\sigma$, which responds to vertical edges and lines. Let $d_\sigma(x) = (1 - x^2/\sigma^2)g_\sigma(x)$, the one-dimensional Laplacian. First we convolve with $g_{2\sigma}(y)d_\sigma(x)$, then we take the absolute value of the result at each pixel, and finally we convolve with $G_{3\sigma/2}$. We actually only use the scale $\sigma = 1$ in the results below, in four directions obtained by rotating $V_1$ clockwise through $0$, $\pi/4$, $\pi/2$, and $3\pi/4$. We will denote them by $V$, $U$, $H$, $D$ respectively.
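To make the construction concrete, here is a minimal sketch of the smoothing and vertical directional features in Python with NumPy and SciPy. The function names (gaussian_1d, laplacian_1d, directional_feature) and the kernel truncation at three standard deviations are my own choices, not taken from the paper; the rotated versions $U$, $H$, $D$ would be obtained by rotating the kernel (for $H$, a transpose suffices).

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

def gaussian_1d(sigma):
    """Sampled g_sigma(x) = exp(-x^2 / 2 sigma^2), truncated at 3 sigma."""
    r = int(np.ceil(3 * sigma))
    x = np.arange(-r, r + 1, dtype=float)
    return np.exp(-x**2 / (2 * sigma**2))

def laplacian_1d(sigma):
    """Sampled d_sigma(x) = (1 - x^2 / sigma^2) g_sigma(x)."""
    r = int(np.ceil(3 * sigma))
    x = np.arange(-r, r + 1, dtype=float)
    return (1 - x**2 / sigma**2) * np.exp(-x**2 / (2 * sigma**2))

def normalise(k):
    """Scale a kernel so the integral of |k| is unity, as the text requires."""
    return k / np.abs(k).sum()

def directional_feature(P, sigma=1.0):
    """Vertical feature V_sigma: convolve with g_{2s}(y) d_s(x), take
    absolute values, then smooth with G_{3s/2}."""
    kernel = np.outer(normalise(gaussian_1d(2 * sigma)),   # g_{2s} along y
                      normalise(laplacian_1d(sigma)))      # d_s along x
    response = np.abs(convolve(P, kernel, mode='reflect'))
    return gaussian_filter(response, 1.5 * sigma, mode='reflect')
```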

Now we define the "rarity operator", which is the central part of the construction of the rarity map. We denote it by $R_\sigma$, where $\sigma$ is a scale constant. Suppose that $F = F(x, y)$ is a feature map (for example $F = G_1 \circ P$). Putting $p = (x, y)$, we define

$$R_\sigma F(p) = \sum_q G_\sigma(p - q)\,\bigl|G_\sigma \circ F(p) - F(q)\bigr|,$$

where the sum is over pixels $q$ in a neighbourhood of $p$, taken to be $\{q : |q - p| < 3\sigma\}$ here. The term $G_\sigma \circ F(p)$ can be seen as a local sample mean, in which a pixel $r$ is sampled with weight $G_\sigma(p - r)$. Thus the sum is a (crude) estimate of the local sample variance of $F$. It can also be seen as a (very crude) estimate of the inverse of the probability density at $(F(p), p)$ in (feature space) × (image space). The basic point is that $R$ has a high output at places where $F$ is changing fast.
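Continuing the sketch above, a direct implementation of $R_\sigma$ might look as follows. This is an illustration only: the function name is mine, it wraps at image borders via np.roll rather than reflecting, and it takes none of the shortcuts described in Section 4.

```python
def rarity(F, sigma):
    """R_sigma F(p): sum over |q - p| < 3 sigma of
    G_sigma(p - q) * |G_sigma * F(p) - F(q)|."""
    mean = gaussian_filter(F, sigma, mode='reflect')   # local sample mean
    r = int(np.ceil(3 * sigma))
    out = np.zeros_like(F, dtype=float)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dx * dx + dy * dy >= (3 * sigma) ** 2:
                continue                               # outside the neighbourhood
            w = np.exp(-(dx * dx + dy * dy) / (2 * sigma ** 2))
            Fq = np.roll(np.roll(F, dy, axis=0), dx, axis=1)
            out += w * np.abs(mean - Fq)
    return out
```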

We use the output of $R_\sigma$ as a feature in its own right for the smaller-scale features, extending our list of features to ten by adding $R_1 P$, $R_2 R_1 P$, and $R_2 G_1 \circ P$.

Our final operator is denoted by $N_\sigma$. Let $E = E(x, y)$ be an edge map (such as the result of applying $R$ to a feature map). Then

$$N_\sigma E(p) = \max\bigl\{0,\; G_{\sigma/2} \circ E(p) - G_{3\sigma/2} \circ E(p)\bigr\}.$$

The expression inside the maximum is a difference of Gaussians, so $N_\sigma$ is a "Mexican hat" shaped operator like the Laplacian. It improves the output of $R$ in ways to be described later.
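In the same vein, a sketch of $N_\sigma$ as a half-wave-rectified difference of Gaussians, reusing the imports above (the function name clipped_dog is my own):

```python
def clipped_dog(E, sigma):
    """N_sigma E(p) = max(0, G_{s/2} * E(p) - G_{3s/2} * E(p))."""
    return np.maximum(0.0,
                      gaussian_filter(E, sigma / 2, mode='reflect')
                      - gaussian_filter(E, 3 * sigma / 2, mode='reflect'))
```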

Now that we have described the various operators, we can give the complete definition of the rarity map as

$$
\begin{aligned}
\mathrm{Rarity}(p) ={}& \tfrac{1}{2} N_1 R_1 P(p) + N_2 R_2 R_1 P(p) \\
&+ N_4 R_4 R_2 R_1 P(p) + N_2 R_2 G_1 \circ P(p) \\
&+ N_4 R_4 R_2 G_1 \circ P(p) + N_4 R_4 G_2 \circ P(p) \\
&+ 2\bigl( N_4 R_4 V P(p) + N_4 R_4 U P(p) \\
&\qquad + N_4 R_4 H P(p) + N_4 R_4 D P(p) \bigr).
\end{aligned}
$$
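Assembling the map is then a weighted sum of the ten channels. The sketch below reuses the helpers above and is my own reading of the definition; the four directional maps at $\sigma = 1$ are assumed to be precomputed (the vertical one by directional_feature, the others by rotating its kernel).

```python
def rarity_map(P, V, U, H, D):
    """Weighted sum of feature channels from the definition above.
    V, U, H, D are the four directional feature maps of P at sigma = 1."""
    G1 = gaussian_filter(P, 1.0, mode='reflect')
    G2 = gaussian_filter(P, 2.0, mode='reflect')
    R1P = rarity(P, 1)
    R2R1P = rarity(R1P, 2)
    R2G1 = rarity(G1, 2)
    out = 0.5 * clipped_dog(R1P, 1)
    out += clipped_dog(R2R1P, 2)
    out += clipped_dog(rarity(R2R1P, 4), 4)
    out += clipped_dog(R2G1, 2)
    out += clipped_dog(rarity(R2G1, 4), 4)
    out += clipped_dog(rarity(G2, 4), 4)
    for channel in (V, U, H, D):
        out += 2 * clipped_dog(rarity(channel, 4), 4)
    return out
```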

2.2. Discussion

A boundary in an image can result in various behaviours in a feature map. There may be a step, or a ridge, or a valley, all of which $R$ deals with effectively, producing a ridge at the boundary. Another common situation is a change in the variance of the feature values across the boundary. Here the output of $R$ tends to be low on one side, and high and hilly on the other: in other words a step, albeit with large-scale noise added. It thus makes sense to use the output of $R$ as a new feature and apply $R$ again with a larger scale constant.

Although the catchment algorithm can effectively thin an edge regardless of its breadth, there is a critical sort of situation in which it fails. This is where a bold (e.g. high-contrast) edge occurs close to, and roughly parallel to, a faint edge in the image. Instead of obtaining a hollow between them, we just get some sort of bulge on the slope of a broad ridge. Because the operator $N$ responds to convexities (of the right scale) it can restore the hollow between the edges. It also has the desirable effect of turning any steps that may be left in the output of $R$ into ridges. It is not entirely successful in these aims, as we shall see, but it certainly produces an improvement.


3. Catchment algorithm

The input to this algorithm is an edge map such as the rarity map. The output is a hierarchical segmentation which consists of an initial labelling of the pixels plus a list of merges. Each merge consists of a pair of labels together with a number (the "rainfall") representing the difficulty of merging the two regions.

Note that "neighbour" means "4-connected neighbour" in the following description, and that all regions are therefore 4-connected. We denote the rarity map by $R = R(p)$ where $p = (x, y)$ is a pixel.

1. Give each local minimum in R a unique label.

2. Follow a path of steepest descent from each pixel to a local minimum, and label the pixel the same way as the minimum. This is the initial segmentation.

3. FOR (each region r)
   a. Start at the minimum of r and add pixels in ascending order until a way down into a different region s is found.
   b. Record s and the volume of water required to do this, and also the area of r.
   c. For all pixels p in r which are "underwater", update R(p) to the height of the water surface.
   NEXT r

4. WHILE (TRUE)
   a. Find a region r with smallest volume/area (= rainfall).
   b. Output r, its "spill-neighbour" s, and the rainfall.
   c. If the number of regions is 2 then END.
   d. Merge r and s by relabelling, and adding their volumes and areas.
   e. If the "spill-neighbour" of s before merging was r, then top up the merged region as in 3a, b, c.
   ENDWHILE

The updating of $R(p)$ in 3c means that each pixel in each region can be reached from the minimum without going downhill, which simplifies subsequent filling of merged regions.
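To make steps 1 and 2 concrete, here is a minimal sketch in Python. It is my own illustration, not the author's implementation: it assumes strict steepest descent, and plateaus are handled crudely, with each pixel of a flat minimum receiving its own label, which the merge stage would then have to absorb. The flooding and merging of steps 3 and 4 would build on the labelling it returns.

```python
import numpy as np

def initial_segmentation(R):
    """Steps 1-2: label each pixel by the local minimum that a path of
    steepest 4-connected descent from it reaches."""
    h, w = R.shape
    offsets = ((-1, 0), (1, 0), (0, -1), (0, 1))   # 4-connected neighbours
    label_of = {}                                   # memoised pixel -> label
    next_label = 0
    labels = np.empty((h, w), dtype=int)

    def downhill(y, x):
        """Strictly lower steepest-descent neighbour, or None at a minimum."""
        best, best_v = None, R[y, x]
        for dy, dx in offsets:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and R[ny, nx] < best_v:
                best, best_v = (ny, nx), R[ny, nx]
        return best

    for y0 in range(h):
        for x0 in range(w):
            path, p = [], (y0, x0)
            while p not in label_of:       # descend, remembering the path
                path.append(p)
                q = downhill(*p)
                if q is None:              # reached a new local minimum
                    label_of[p] = next_label
                    next_label += 1
                    break
                p = q
            for visited in path:           # label the whole descent path
                label_of[visited] = label_of[p]
            labels[y0, x0] = label_of[(y0, x0)]
    return labels
```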

4. Implementation and results

All the images are 128 × 128 by 8 bits. Intermediate results were calculated to a higher accuracy. The rarity operator $R_\sigma(p)$ is slow since all the pixels in a neighbourhood of each pixel have to be accessed. To speed things up the neighbourhood was restricted to $\{q : |q - p| < 3\sigma\}$, and for $R_4$ only every 4th pixel was used. This means that 109 accesses per pixel per feature were needed (fewer for $R_1$). In order to minimise unwanted effects at the edges of the images, the images were extended by reflection about the image edges.
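For example, with NumPy (the 16-pixel margin is an assumed value, chosen only to cover the largest kernels used here):

```python
import numpy as np

image = np.random.rand(128, 128)            # stand-in for an input image
padded = np.pad(image, 16, mode='reflect')  # reflect about the image edges
print(padded.shape)                         # (160, 160)
```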

Fig. 1 shows an input image "Line-angle", and the corresponding rarity map is shown in Fig. 2. In Fig. 3, the final result is shown, with line thickness used to represent edge strength, and the triangle is clearly recovered. The edge strength between any pair of adjacent pixels is the lowest rainfall at which they both have the same label. For most such pairs, this is zero. The other figures follow similar conventions.
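This edge-strength query can be answered directly from the hierarchical output. The sketch below is my own illustration, not the paper's code: it assumes merges is a list of (label_a, label_b, rainfall) triples as produced by step 4, and replays them in ascending order of rainfall with a small union-find until the two pixels' initial labels coalesce.

```python
class DSU:
    """Minimal union-find over hashable labels."""
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def edge_strength(labels, merges, p, q):
    """Lowest rainfall at which pixels p and q carry the same label."""
    a, b = labels[p], labels[q]
    if a == b:
        return 0.0                      # same initial region
    dsu = DSU()
    for la, lb, rain in sorted(merges, key=lambda m: m[2]):
        dsu.union(la, lb)
        if dsu.find(a) == dsu.find(b):
            return rain
    return float('inf')                 # the two regions never merged
```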

In Figs. 4 and 5 the input and output for another image, "Dot-size", are shown. We now look at two more complex images. Fig. 6 shows image "G-noise", which is divided into 4 squares, each containing a circle or ellipse. The 8 areas each consist of a uniform grey-level with Gaussian noise superimposed. The means and standard deviations are shown in the caption. The values were chosen to correspond with those used in Jolion, Meer and Rosenfeld (1990). We see in Fig. 7 that the program finds boundaries due to changes in both mean and variance in Fig. 6, and that it has no particular problem with elongated objects. The background in the bottom right square has some quite strong "false" edges (about 2/3 the strength of the circle) and the very noisy circle (top left) is not found.

In Figs. 8 and 9 we see that a dislocation of a texture, or a dotted line, is sufficient to cause a segmentation. The dislocation boundary tends to run along lines in the image near to the actual dislocation rather than along the dislocation itself. Also note that the dotted lines are not seen properly if they get too close to the edge of the grid, and that the nearest outline circle has become joined to the grid area. The human visual system seems to know that the dislocation should be seen on a fairly large scale, but that the outline and dotted circles should be seen on a smaller scale. The present system makes such automatic decisions as long as the objects are fairly isolated, but fails when objects get too close. This kind of interference means that the large-scale operators needed for more difficult texture segmentations cannot be added without degrading performance on small objects.

Fig. 1. "Line-angle" image.

Fig. 2. Rarity map of Fig. 1.

Fig. 3. Final output from Fig. 1. Line thickness is proportional to edge strength, except for strengths below 5% of the maximum, which are ignored, and those in the range 5% to 10%, which are shown as dotted lines.


5. Comparisons with other methods

A few algorithms for image segmentation (Raafat and Wong, 1988; Haralick and Shapiro, 1985, Section 5) make use of information about the density in feature space near $F(p)$ (where $p$ is a pixel and $F$ a feature) to guide a region growing method. The density estimate in these methods is based on the entire image rather than a local estimate like that provided by $R$. Also, the region growing starts at the most typical (i.e., least rare) pixels or blocks and works outward, while we use the catchment algorithm.

The closest relative to the catchment algorithm seems to be the watershed algorithm developed by Lantuéjoul, Beucher and Maisonneuve (see Serra, 1982, 1988). This is very similar to the initial very fine segmentation in the catchment algorithm, with one region for each minimum. What the catchment algorithm adds is the ability to ignore detail, or noise, and capture the more global variations in the input. The catchment algorithm deals well with thick edges of variable strength, and does not require the setting of any thresholds until after the entire segmentation is obtained.

6. Conclusion

Many segmentation schemes give good results on a restricted class of images but fail to cope with anything like the variety of images that the human visual system deals with quickly and effortlessly. The present method makes very few assumptions about the image and is therefore able to deal with a very wide range of segmentations. The regions may be large or small, more or less any shape (as long as they are not too thin), and they may be distinguished by grey-level or texture or by a boundary between areas of identical texture. Furthermore, to a large extent, all these possibilities may occur in the same image: the catchment algorithm allows us to simply add various edge maps, and still recover the most important discontinuities. However, there remains the problem of interference between the output of small- and large-scale operators.


Fig. 4. "Dot-size" image.

" - . . ~':

n rq

4 • 1 :" i / /

Fig. 5. Output from Fig. 4. Conventions as Fig. 3.


Fig. 6. "G-noise" image. This shows 3 circles (diameter 32 pixels) and an ellipse (32 by 8) on four square backgrounds (64 by 64). The means m and standard deviations s of the grey-levels in each area are as follows. Top left: s = 60, background m = 110, circle m = 140. Top right: s = 30, background m = 110, ellipse m = 170. Bottom left: s = 15, background m = 110, circle m = 140. Bottom right: s = 100, background m = 75, circle m = 175.

Fig. 7. Output from Fig. 6. Conventions as Fig. 3.


Fig. 8. "Dislocation" image.


Fig. 9. Output from Fig. 8. Conventions as Fig. 3.



References

Haralick, R.M. and L.G. Shapiro (1985). A survey of image segmentation techniques. Computer Vision, Graphics, and Image Processing 29, 100-132.

Jolion, J., P. Meer and A. Rosenfeld (1990). Border delineation in image pyramids by concurrent tree growing. Pattern Recognition Lett. 11, 107-115.

Marr, D. (1982). Vision. Freeman, New York.

Raafat, H.M. and A.K.C. Wong (1988). A texture information-directed region growing algorithm for image segmentation and region classification. Computer Vision, Graphics, and Image Processing 43, 1-21.

Serra, J. (1982). Image Analysis and Mathematical Morphology. Academic Press, New York.

Serra, J. (1988). Image Analysis and Mathematical Morphology, Vol. 2. Academic Press, New York.