
Published on 14-Jul-2016


Pergamon
Pattern Recognition, Vol. 28, No. 11, pp. 1783-1792, 1995
Elsevier Science Ltd
Copyright 1995 Pattern Recognition Society
Printed in Great Britain. All rights reserved
0031-3203/95 $9.50+.00
0031-3203(95)00036-4

SHAPE DECOMPOSITION AND REPRESENTATION USING A RECURSIVE MORPHOLOGICAL OPERATION

DEMIN WANG,† VERONIQUE HAESE-COAT and JOSEPH RONSIN‡

† IRISA/INRIA, Campus Universitaire de Beaulieu, 35042 Rennes Cedex, France
‡ Laboratoire d'Automatique, Equipe Image, Institut National des Sciences Appliquées, 35043 Rennes Cedex, France

(Received 10 May 1994; in revised form 22 February 1995; received for publication 17 March 1995)

Abstract--This paper presents a recursive morphological operation developed in order to perform efficient shape representation. This operation uses a structuring element as a geometrical primitive to evaluate the shape of an object. It results in a set of loci of the translated structuring elements that are included in the object but which do not overlap. The analysis of its computational complexity shows that it is usually less time-consuming than morphological erosion. By using this operation, an object decomposition algorithm is then developed for shape representation. It decomposes an object into a union of simple and non-overlapping object components. The object is represented by the sizes and loci of its object components. This representation is information preserving, shift and scale invariant, and non-redundant. It has been compared with skeletons, morphological decomposition, chain codes and quadtrees in terms of compression ability and image processing facility. Experimental results show that it is very compact, especially if information loss is allowed. Because of the non-overlap between object components, many image processing tasks can be easily performed by directly using this shape representation.
Shape representation   Object description   Image decomposition   Mathematical morphology   Recursive algorithm   Data compression   Image coding

1. INTRODUCTION

Shape representation is a very important issue in digital image processing and computer vision. It provides descriptions of binary image objects in a compressed form which can be used for the design of automatic image analysis, pattern recognition and computer vision systems, as well as for image coding techniques. A good shape representation scheme should have the following properties:

(i) Information preserving: it should contain all the information concerning the image under consideration, and exact reconstruction of the original object should be possible.
(ii) Mathematically tractable: it should be efficient and easy to use for various image analysis and computer vision applications. This usually requires the representation to be invariant under translation and scaling. Moreover, a hierarchical representation is often desired.
(iii) Compact: it should be non-redundant and provide high data compression. A non-redundant representation is one that allows reconstruction of the original object, but from which no element can be removed without violating this reconstruction.

In recent years, a multitude of shape representation schemes based on different theories have been developed. The schemes related to mathematical morphology (1,2,3) include granulometric size distribution, skeletonization and morphological decomposition. Granulometric size distribution was conceived by Matheron (1) as a descriptor of granularity or texture within an image. Its derivative is often referred to as pattern spectrum.(4) Granulometric size distribution and pattern spectrum give good quantitative descriptions of shapes. They have been employed to study shape-size complexity (4) and to classify texture images.(5) However, original objects cannot be reconstructed from these representations.
Skeletonization was initially called medial axis transform, where the skeletons were defined as the locus of maximal disks that can be inscribed inside objects.(6) It has been proved that skeletons can be obtained by morphological operations.(2) Skeletons give accurate representations of shapes.

Pitas and Venetsanopoulos (7) have proposed a morphological decomposition algorithm for shape representation. This algorithm decomposes a binary object into a union of simple subsets by using a family of structuring elements of different sizes. The extension of this decomposition to multilevel images has been presented in references (8 and 9) for texture classification and segmentation, and in reference (10) for range image analysis. Another shape decomposition algorithm based on mathematical morphology has been proposed by Ronse and Macq.(11) Its principle is similar to that in reference (7). These shape decompositions are information preserving and invariant under translation and scaling.

In morphological skeletonization and shape decompositions, erosions are used to extract representative points and dilations are used to reconstruct original images from these representative points. The points extracted by erosion are usually connected to each other, instead of being isolated. In the reconstruction procedure using dilation, these connected points are then replaced by structuring elements located at these points. Due to the resulting overlap between the structuring elements, these representations have two drawbacks: firstly, it is difficult to perform various image analysis tasks by directly using these representations, and secondly, there is redundancy in these representations. The globally minimal skeletonization proposed by Maragos and Schafer (12) is a non-redundant representation. However, it eliminates only a part of the overlap.
To remove all of the overlap, a new morphological operation must be developed to replace the erosions and dilations previously used.

This paper presents a recursive morphological operation. A non-redundant shape representation, based on this operation, is then proposed. The rest of the paper is organized as follows. In Section 2, we first describe the recursive morphological operation. Then we study its properties and its computational complexity. In Section 3, the non-redundant shape representation based on the new operation is presented. Its compactness, i.e. data compression capability, is studied and compared with other schemes. The ease with which this representation can be directly used in image processing is demonstrated. Finally, Section 4 summarizes the results of our research.

2. RECURSIVE MORPHOLOGICAL OPERATION

To obtain non-redundant shape representation, we have developed a recursive morphological operation. In this section, we first present the definition of the operation. Then, we analyse its properties and discuss its computational complexity.

2.1. Definition of the operation

In mathematical morphology, a discrete binary image is considered as a point set A defined on a discrete two-dimensional (2-D) rectangular region consisting of an M x N regularly spaced lattice. Let B denote the structuring element, which is usually a set of simple shape, and $A \setminus B$ denote the set difference between A and B. We use $A + x$, instead of $A_x$, to denote the translation of A by point x, because the notation $A + x$ avoids confusion with the subscripted variables used later in the paper. This notation has been used by Maragos (4) and Dougherty (5,13) to denote translated sets. The erosion and dilation of A by B are respectively denoted by $A \ominus B$ and $A \oplus B$, and defined as:

$$A \ominus B = \bigcap_{x \in B} (A - x) = \{x \mid B + x \subseteq A\} \qquad (1)$$

$$A \oplus B = \bigcup_{x \in B} (A + x) = \bigcup_{x \in A} (B + x), \qquad (2)$$

where x is a point in the 2-D space $Z^2$.
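Equations (1) and (2) can be illustrated on small point sets. The following is a minimal sketch, assuming images are stored as Python sets of (row, column) pairs; the function names `translate`, `erode` and `dilate` and the example shapes are illustrative choices, not taken from the paper.

```python
def translate(S, x):
    """Return S + x for a set S of (row, col) points."""
    return {(p[0] + x[0], p[1] + x[1]) for p in S}


def erode(A, B):
    """Erosion as in equation (1): all x such that B + x is included in A."""
    # Any valid x satisfies x = a - b for some a in A, b in B.
    candidates = {(a[0] - b[0], a[1] - b[1]) for a in A for b in B}
    return {x for x in candidates if translate(B, x) <= A}


def dilate(A, B):
    """Dilation as in equation (2): union of B + x over all x in A."""
    return {(a[0] + b[0], a[1] + b[1]) for a in A for b in B}


# Hypothetical example: a 3 x 6 rectangle and a 3 x 3 square centred on the origin.
A = {(i, j) for i in range(3) for j in range(6)}
B = {(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)}

eroded = erode(A, B)
opened = dilate(eroded, B)
print(sorted(eroded))  # [(1, 1), (1, 2), (1, 3), (1, 4)]
print(opened == A)     # True
```

In this example the four eroded points are adjacent, so the structuring elements placed on them by the dilation overlap one another.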
As indicated by equations (1) and (2), erosion of A by B can be implemented by moving B over all points in $Z^2$. At each point x, we examine whether or not B + x is included in A. The set of points corresponding to positive answers forms the erosion result. Dilation of A by B can be implemented by replacing each point $x \in A$ by the translated structuring element B + x and considering their union.

Fig. 1. Erosion and dilation (panels: A, B, $A \ominus B$, $(A \ominus B) \oplus B$).

If representative points in a shape representation are extracted by erosion, and original images are reconstructed by dilation of the representative points with the same structuring element [as in the morphological skeletons and morphological decomposition presented in references (12) and (7)], then the representative points may be adjacent to one another. In the reconstructed image, adjacent representative points are replaced by overlapping structuring elements, as shown in Fig. 1. The overlap between the structuring elements leads to representation redundancy, that is, the removal of some representative points may not alter the reconstruction results. For example, if we keep only the leftmost and rightmost points of $A \ominus B$ in Fig. 1 as the representative points, the dilation of these points is a union of two non-overlapping structuring elements and remains equal to $(A \ominus B) \oplus B$.

We expect to represent an object by a few isolated representative points so that the reconstruction of the object is made by a union of non-overlapping structuring elements. Such representative points can be obtained by the following recursive procedure. Let structuring element B be translated point by point in $Z^2$. At the first point (i, j), we examine whether or not B + (i, j) is included in object A. If not, B is translated to the next point and we repeat the examination. Otherwise, point (i, j) is taken as one of the representative points.
Then B + (i, j) is subtracted from A, B is translated to the next point and we examine whether the translated structuring element is included in $A \setminus [B + (i, j)]$. When the second representative point is found, the corresponding translated structuring element is subtracted from $A \setminus [B + (i, j)]$. This procedure is repeated until B has been translated over all the points in $Z^2$.

With the above-mentioned procedure, the result obtained at each point depends on both object A and the results obtained at previous points. Thus, this procedure leads to a recursive operation. To implement this operation for a 2-D image, we encounter the problem of scanning 2-D space with 1-D translations of structuring elements. The simplest solution is to translate the structuring element one line at a time, sequentially, left to right, top to bottom, as shown in Fig. 2. Consequently, at any point within an image, some points are the "past", one point is the "present", and the remaining points are the "future". These words have their conventional meaning with respect to the order in which the points are processed.

Fig. 2. Order of structuring element translation ("past" points, "present" point, "future" points).

According to this order, point (i, j) is equivalent to point (iN + j) and B + (i, j) can be expressed by B + (iN + j):

$$B + (i, j) \Leftrightarrow B + (iN + j), \quad \text{for } i = 0, 1, \ldots, M - 1, \; j = 0, 1, \ldots, N - 1, \qquad (3)$$

where M and N are, respectively, the height and width of images.

To formulate this recursive operation, let $Y_k$ denote the intermediate result of the operation when B has been translated from the beginning (i.e.
point 0) to point k, where k < MN and $Y_k = \emptyset$ for k < 0. Suppose that $Y_{k-1}$ is known and $Y_k$ is to be expressed in terms of $Y_x$ for x < k. According to the above-mentioned procedure, we have: [...]

Fig. 3. Examples of recursive morphological operation (panels: A, B, NE(A, B); C, D, NE(C, D)).

[...] NO[NO(A, B), B] = NO(A, B), i.e. non-overlapping opening is an idempotent operation. Property 5 shows that non-overlapping erosion is not idempotent. In addition, it verifies that NE(A, B) generally consists of isolated points or disjoint subsets. Property 6 will be useful for quantitative image analysis. Card[NE(A, B)] gives the maximal number of B that can be included in A. If A = NO(A, B), the product of Card[NE(A, B)] and Card(B) is equal to Card(A). Otherwise, it is less than Card(A).

2.3. Computational complexity of the operation

Non-overlapping erosion is closely related to erosion. In order to discuss its computational complexity and compare it with that of erosion, we will first present the computational complexity of erosion.

To implement erosion on a conventional computer, one could simply represent an image by binary logical functions, whose values are equal to logical "1" at points of the object and equal to logical "0" in its background. Structuring element B is considered as a shifting window, and one of its points is taken as a testing point. The condition $B + x \subseteq A$ in (1) is equivalent to the logical AND of all the points within the corresponding shifting window being equal to logical "1". If the testing point falls in the image background, the erosion result is obviously logical "0". Therefore, it is necessary to take the logical AND of all points within the shifting window only if the testing point corresponds to an object point, and this requires [Card(B) - 1] AND operations. Consequently, the total number of logical AND operations required for erosion of A by B is:

$$N_e = \mathrm{Card}(A)\,[\mathrm{Card}(B) - 1].$$
(10)

A similar procedure could be used for the implementation of non-overlapping erosion. Here, the testing point is chosen so that all other points in B are its "future" points according to the scanning order mentioned in subsection 2.1, as shown in Fig. 4. Set subtraction in (4) and (5) is equivalent to changing the corresponding points from "1" to "0". Once the logical AND of all points within the shifting window is equal to logical "1", all these points are changed from "1" to "0". As a result, [Card(B) - 1] "future" points of logical "1", with respect to the testing point, become logical "0". At the end of the operation, Card[NE(A, B)][Card(B) - 1] "future" points that were logical "1" have been removed from the image. The number of points that are still equal to logical "1" when the testing point passes over them is Card(A) - Card[NE(A, B)](Card(B) - 1). Therefore, the number of AND operations needed for implementation of NE(A, B) is:

$$N_{ne} = \bigl[\mathrm{Card}(A) - \mathrm{Card}[\mathrm{NE}(A, B)]\,(\mathrm{Card}(B) - 1)\bigr] \times (\mathrm{Card}(B) - 1). \qquad (11)$$

Fig. 4. Choice of testing point in B.

Fig. 5. Family of structuring elements generated by the square B of size 2 x 2.

By comparing equations (10) and (11), it can be found that non-overlapping erosion requires $\mathrm{Card}[\mathrm{NE}(A, B)]\,[\mathrm{Card}(B) - 1]^2$ fewer logical ANDs than erosion, although the expressions (4) and (5) for non-overlapping erosion are a little more complicated than that for classical erosion (1).

The parallel and serial composition properties of erosion (2,3) can be exploited to speed up the implementation of erosion. For example, considering the two sets A and B shown in Fig. 1, classical erosion of A by B requires 184 AND operations according to the above-mentioned implementation procedure.
If B is decomposed into the dilation of one horizontal-line segment of three points by one vertical-line segment of the same length, the erosion can be performed by eroding A successively by the two line segments. This composite implementation requires 80 AND operations. However, the number of AND operations needed for implementation of NE(A, B) is only 56.

3. NON-REDUNDANT SHAPE REPRESENTATION

In this section, a new decomposition algorithm based on non-overlapping erosion is presented for shape representation. The compactness of the representation is analysed and compared with that of morphological skeletons, Pitas-Venetsanopoulos decomposition, chain codes and quadtrees. Then the facility of performing image processing tasks directly using the representation is demonstrated.

3.1. Object decomposition algorithm

In architecture, buildings of various forms can be constructed from blocks. These blocks are non-overlapping, but usually connected to one another. To speed up the construction, blocks of several different sizes must be available and we must use the biggest ones as often as possible. Similarly, objects within a binary image can be considered as being constructed with simple object components of different sizes that may be connected but not overlapping. These object components are here represented by a family of structuring elements.

Let B be a finite connected subset of $Z^2$. A family of structuring elements with different sizes is generated by dilation of B:

$$rB = \underbrace{B \oplus B \oplus \cdots \oplus B}_{r \text{ times}}, \quad \text{with } r > 0.$$

For r = 0, rB = {(0, 0)} by convention. Primary structuring element B controls the form of rB, whereas integer r controls the size of rB. The family of structuring elements generated by square B of size 2 x 2 is shown in Fig. 5.

As seen above, non-overlapping opening of a set by a structuring element is a union of the non-overlapping structuring elements.
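The family rB and the scanning procedure of subsection 2.1 can be sketched together as follows. This is only an illustration under stated assumptions: images are Python sets of (row, column) pairs, the origin of each structuring element is its first point in raster order (so `min(B) == (0, 0)`), and non-overlapping opening is taken to be the union of the structuring elements placed at the points found by non-overlapping erosion. The function names are hypothetical.

```python
def family_element(B, r):
    """rB: B dilated with itself r times; 0B = {(0, 0)} by convention."""
    rB = {(0, 0)}
    for _ in range(r):
        rB = {(p[0] + b[0], p[1] + b[1]) for p in rB for b in B}
    return rB


def non_overlapping_erosion(A, B):
    """Greedy raster-scan sketch of NE(A, B).

    Assumes min(B) == (0, 0): the origin of B is its first point in raster
    order, so every other point of B is a "future" point.
    """
    assert min(B) == (0, 0)
    remaining = set(A)
    points = []
    for p in sorted(A):  # raster order: top to bottom, left to right
        placed = {(p[0] + b[0], p[1] + b[1]) for b in B}
        if placed <= remaining:   # B + p fits in what is left of A
            points.append(p)
            remaining -= placed   # subtract B + p before moving on
    return points


def non_overlapping_opening(A, B):
    """Union of the non-overlapping copies of B found by the scan (assumed form of NO)."""
    opened = set()
    for p in non_overlapping_erosion(A, B):
        opened |= {(p[0] + b[0], p[1] + b[1]) for b in B}
    return opened


# Hypothetical example: 2 x 2 primary element, 2B is a 3 x 3 square.
B = {(0, 0), (0, 1), (1, 0), (1, 1)}
B2 = family_element(B, 2)
A = {(i, j) for i in range(3) for j in range(7)}  # 3 x 7 rectangle

points = non_overlapping_erosion(A, B2)
opened = non_overlapping_opening(A, B2)
print(points)                                      # [(0, 0), (0, 3)]
print(len(opened), len(points) * len(B2), len(A))  # 18 18 21
```

Here the last column of the rectangle is not covered, so Card[NE(A, B)] times Card(B) is less than Card(A), and applying the opening a second time leaves the result unchanged.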
In order to decompose an object A into the largest possible object components, the non-overlapping opening of A by the biggest structuring element nB is taken as the first group of object components of A. The second group of object components is obtained as follows: the first group of object components is subtracted from A. The non-overlapping opening of the remainder of A by the second biggest structuring element (n - 1)B is computed as the second group of object components. Then, the second group of object components is further subtracted from A and this procedure is repeated until the structuring element becomes 0B or the remainder of set A is empty.

If $S_r$ denotes the group of object components corresponding to structuring element (n - r)B, the whole procedure can be described by the following relations:

$$S_r = \mathrm{NO}[X_r, (n - r)B], \quad \text{for } r = 0, 1, \ldots, n,$$

where $n = \max\{i \mid A \ominus iB \neq \emptyset\}$. Since $X_0 = A$ and $X_r = X_{r-1} \setminus S_{r-1} = A \setminus \bigcup_{j=0}^{r-1} S_j$, these relations can be combined [...]

where $T_r$ are sets consisting of isolated points, and (n - r)B + (k, l) are the object components (translated structuring elements which do not overlap). So the new algorithm decomposes an object into a union of simple and non-overlapping object components.

From equation (13), it can be seen that $S_0$ consists of the biggest object components (structuring elements) nB + (k, l). It gives the coarsest object approximation. The object components contained in $S_r$ become smaller and smaller as index r increases, and consequently provide additional finer details of the object. So this decomposition provides a hierarchical shape description starting from the coarsest approximation and adding progressively finer details.

Using all the object components guarantees exact reconstruction of the original object. If some loss is acceptable, the object can be approximately reconstructed without small components. To decrease the loss, we can reconstruct the object using the following formula: [...] where [...] denotes closing.
Evidently, $A = A_0$ and $A \circ rB \subseteq A_r \subseteq \bigcup_j S_j$. [...]

In order to precisely evaluate its compression capability, the new shape representation scheme has been applied to two images of 512 x 512 pixels, as shown in Figs 8(a) and (b). These images are taken from the standard facsimile images of the International Telegraph and Telephone Consultative Committee (CCITT) and scanned at about 200 d.p.i.(15) They are, respectively, parts of CCITT image numbers 1 and 2. In the following, they are referred to as text and graph, respectively.

Fig. 8. Original images and their approximated representations: (a) original image text, (b) original image graph, (c) approximated representation $A_1$ of text, and (d) $A_1$ of graph.

These images are decomposed by the family of structuring elements presented in Fig. 5. The decomposition results are reported in Table 1, where the data volumes are calculated according to equation (16) as follows: two fields for the coordinates of each point in $T_r$, one for n - r and an additional one to indicate the value change of r. The normalized error of the approximated representation described by equation (15) is also reported. This error is the percentage of the pixels in the reconstructed image which differ from the original.

Table 1. Results of accurate and approximated representations by the new decomposition

                                   Text in Fig. 8(a)        Graph in Fig. 8(b)
Image                              A       A1      A2       A       A1      A2
Number of representative points    3356    961     630      3187    921     512
Data volume (field)                6726    1934    1270     6404    1870    1050
Normalized error (%)               0.0     0.84    1.35     0.0     0.75    1.33

It can be seen in Table 1 that these images can be exactly represented by only 3356 and 3187 points, respectively. The corresponding data volumes are, respectively, 6726 and 6404 fields. In most image processing applications, if a binary image is represented by an array, each pixel must be stored as a field although it requires only one bit. In this case, the image data are compressed about 40 times. For approximated representation ($A_1$ and $A_2$), the data volumes and the numbers of representative points are greatly reduced, and the reconstruction errors are reasonable. For example, the data volume for $A_1$ of image text decreases to 28.75% of that of the exact representation, i.e. the image data are compressed over 130 times, while the normalized error is 0.84%. Figs 8(c) and (d) illustrate the approximation $A_1$ for the two images.

Fig. 9. Numbers of representative points resulting from different representation schemes (new decomposition, skeleton, P-V decomposition) plotted against normalized error (%): (a) for image text and (b) for image graph.

Table 2. Comparative compression ratios for lossless coding

Method                New decomposition    Chain codes    Quadtrees
Text in Fig. 8(a)     10.15                12.99          9.41
Graph in Fig. 8(b)    9.08                 15.82          9.41

For the purpose of comparison, the original images have also been represented by skeletons of a 3 x 3 square structuring element and the Pitas-Venetsanopoulos decomposition with the family of structuring elements shown in Fig. 5. They are also hierarchical representation schemes allowing approximated representation (partial reconstruction) of the original object [see references (12) and (7)]. From the results illustrated in Fig. 9, it is apparent that the new decomposition results in the most compact representation if some loss is acceptable. For lossless representation, it is more compact than skeletons and comparable to the Pitas-Venetsanopoulos decomposition.
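As a rough illustration of how the counts reported in Table 1 arise, the sketch below decomposes a small object and tallies the data volume. It rests on several assumptions that are not confirmed by the text: images are Python sets of (row, column) pairs, B has at least two points with `min(B) == (0, 0)`, non-overlapping erosion is implemented as a greedy raster scan, and the data volume is read as two fields per representative point plus two fields per non-empty group. All function names are hypothetical.

```python
def translate(S, p):
    """Return S + p for a set S of (row, col) points."""
    return {(s[0] + p[0], s[1] + p[1]) for s in S}


def dilate(S, B):
    """Dilation of S by B on point sets."""
    return {(s[0] + b[0], s[1] + b[1]) for s in S for b in B}


def ne(A, B):
    """Greedy raster-scan sketch of non-overlapping erosion (min(B) == (0, 0))."""
    remaining, points = set(A), []
    for p in sorted(A):
        placed = translate(B, p)
        if placed <= remaining:
            points.append(p)
            remaining -= placed
    return points


def decompose(A, B):
    """Return [(size, points), ...] from the largest fitting rB down to 0B."""
    family = [{(0, 0)}]                 # family[r] is rB
    while True:
        bigger = dilate(family[-1], B)
        if not ne(A, bigger):           # (r + 1)B no longer fits inside A
            break
        family.append(bigger)
    groups, X = [], set(A)
    for size in range(len(family) - 1, -1, -1):
        points = ne(X, family[size])
        if points:
            groups.append((size, points))
            for p in points:
                X -= translate(family[size], p)  # remove the placed components
    return groups, family


def data_volume(groups):
    """Assumed reading: 2 fields per point, plus 2 fields per non-empty group."""
    return sum(2 * len(points) + 2 for _, points in groups)


# Hypothetical object: a 3 x 3 square with two extra pixels on its right side.
B = {(0, 0), (0, 1), (1, 0), (1, 1)}
A = {(i, j) for i in range(3) for j in range(3)} | {(0, 3), (1, 3)}

groups, family = decompose(A, B)
rebuilt = set()
for size, points in groups:
    for p in points:
        rebuilt |= translate(family[size], p)

print(groups)               # [(2, [(0, 0)]), (0, [(0, 3), (1, 3)])]
print(data_volume(groups))  # 10
print(rebuilt == A)         # True
```

Under the same reading, the 6726 fields reported for the 3356 points of image text would correspond to seven non-empty groups (2 x 3356 + 2 x 7 = 6726), and a 512 x 512 array of 262144 fields divided by 6726 gives roughly 39, in line with the stated compression of about 40 times. The group count is an inference from the table, not a figure given in the paper.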
For comparison with chain codes (16) and quadtrees,(6) the representative points resulting from the new decomposition have been encoded by Runlength-Elias Coding, which was used in reference (12) for skeleton subset compression. The experimental results are reported in Table 2 in terms of compression ratios: the ratio of the number of pixels in the original image to the number of bits in the encoded form. From these results, it can be seen that the new representation performs comparably to quadtrees, but is inferior to chain codes in the case of lossless representation. However, it is difficult to obtain a hierarchical (approximated) representation from chain codes. The compression ratio of the new representation increases considerably in the case of approximated representation. For example, the compression ratios for the approximation $A_1$ of the two images reach, respectively, 23.93 and 21.54.

3.3. Image processing using the representation

As mentioned in Section 1, a good shape representation should be efficient for performing various image processing and analysis tasks. Here, the word "efficient" implies simple to implement and fast in computation. Many image processing and analysis tasks are performed through linear convolution, Fourier transform and computation of geometric features such as area, centroids, and so on.

With the new decomposition, a binary image can be represented as the superposition of its object components in binary array form. Let $\Psi(\cdot)$ denote the transformation from set to binary array, a(i, j) denote the binary array corresponding to object A, and $b_r(i, j)$ the binary array corresponding to structuring element rB, that is to say:

$$a(i, j) = \Psi(A) = \begin{cases} 1, & (i, j) \in A \\ 0, & (i, j) \notin A \end{cases}$$

$$b_r(i, j) = \Psi(rB)$$

$$b_r(i - k, j - l) = \Psi(rB + (k, l)).$$
Due to the non-overlap between object components, relation (14) can be described as:

$$a(i, j) = \sum_{0 \le r \le n} \; \sum_{(k, l) \in T_r} b_{n-r}(i - k, j - l). \qquad (17)$$

[...] of Fourier transform and linear convolution can be employed to simplify the implementation of Fourier transform and linear convolution in this representation form. For example, let $g_r(i, j)$ denote the convolution of $b_r(i, j)$ by h(i, j), where h(i, j) is a binary or multilevel array. According to equation (17), the convolution of a(i, j) by h(i, j) can be expressed as:

$$(a * h)(i, j) = \sum_{0 \le r \le n} \; \sum_{(k, l) \in T_r} g_{n-r}(i - k, j - l). \qquad (18)$$

[...]

12. P. Maragos and R. W. Schafer, Morphological skeleton representation and coding of binary images, IEEE Trans. Acoust., Speech, Signal Process. 34, 1228-1244 (1986).
13. E. R. Dougherty, Application of the Hausdorff metric in gray-scale mathematical morphology via truncated umbrae, J. Vis. Commun. Image Represent. 2, 177-187 (1991).
14. W. D. Stanley, Digital Signal Processing. Reston Publishing Company, Inc., Virginia (1984).
15. J. W. Brandt, A. K. Jain and V. R. Algazi, Medial axis representation and encoding of scanned documents, J. Vis. Commun. Image Represent. 2, 151-165 (1991).
16. R. C. Gonzalez and R. E. Woods, Digital Image Processing. Addison-Wesley Publishing Company, Inc. (1992).

About the Author--DEMIN WANG was born in China in 1959. He received B.Sc. and M.Sc. degrees in electrical engineering from Shandong Polytechnic University, China, in 1982 and 1985, respectively, and a Ph.D. degree from the Institut National des Sciences Appliquées (INSA) de Rennes, France, in 1991. From 1985 to 1989 he worked as an assistant, then a lecturer, at Shandong Polytechnic University. He was a research and teaching assistant at the INSA de Rennes from 1989 to 1991, and a professor of electronic and computer engineering at Shandong Polytechnic University from 1991 to 1993.
During 1993 and 1994, he worked as a post-doctoral research fellow at the University of Sherbrooke, Canada. Currently, he is an invited professor at the Institut National de Recherche en Informatique et en Automatique (INRIA) at the IRISA Rennes. His research interests include image coding, shape representation, texture analysis, non-linear filtering and mathematical morphology.

About the Author--VERONIQUE HAESE-COAT was born in France in 1960. She graduated in electronic engineering in 1983 from the Institut National des Sciences Appliquées (INSA) de Rennes, where she also received, in 1987, a Ph.D. degree in image processing. Since October 1989, she has been a lecturer in the Department of Electrical Engineering at the INSA de Rennes. She is also an external researcher for IRISA/INRIA Rennes. Her principal research interests are texture analysis for segmentation, classification and pattern recognition, and image coding.

About the Author--JOSEPH RONSIN was born in 1948. He is of French nationality. He obtained an M.Sc. in electronics from the University of Rennes, France, in 1972. He became a lecturer at the Institut National des Sciences Appliquées of Rennes, France, in 1972. He became Directeur de Recherches in 1989 and professor 1 year later. He has been responsible for several industrial grants between the INSA and the state or private laboratories, in the fields of image analysis, image synthesis and image compression. Joseph Ronsin is a professor in the Department of Electronic Engineering of the INSA and Director of the Laboratoire d'Automatique at this institute. He is also an external researcher for IRISA/INRIA Rennes. His principal research interests are texture analysis and image coding.