
*Corresponding author. Fax: +91-33-577-6680. E-mail address: [email protected] (A. Datta).

Pattern Recognition 34 (2001) 617-629

Skeletonization by a topology-adaptive self-organizing neural network

Amitava Datta (a,*), S.K. Parui (b), B.B. Chaudhuri (b)

(a) Computer and Statistical Service Centre, Indian Statistical Institute, 203 B.T. Road, Calcutta 700035, India
(b) Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B.T. Road, Calcutta 700035, India

Received 5 May 1999; received in revised form 3 December 1999; accepted 20 December 1999

Abstract

A self-organizing neural network model is proposed to generate the skeleton of a pattern. The proposed neural net is topology-adaptive and has a few advantages over other self-organizing models. The model is dynamic in the sense that it grows in size over time. The model is especially designed to produce a vector skeleton of a pattern. It works on binary patterns, dot patterns and also on gray-level patterns. Thus it provides a unified approach to skeletonization. The proposed model is highly robust to noise (boundary and interior noise) as compared to existing conventional skeletonization algorithms and is invariant under arbitrary rotation. It is also efficient in medial axis representation and in data reduction. © 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Skeleton; Medial axis; Neural networks; Self-organization; Adaptive topology; Robustness

1. Introduction

Skeletonization is the process of transforming a several-pixel-wide object into a single-pixel-wide object so that the topological properties of the object are more or less preserved. The resulting object is called a skeleton. Such skeletons are useful in the recognition of elongated objects, for example, character patterns, chromosome patterns, etc. The skeleton provides an abstraction of geometrical and topological features of the object. Since it stores the essential structural information of the pattern, skeletonization can be viewed as data compression as well.

The concept of a skeleton (in the continuous case, it is called the medial axis transformation or MAT) was introduced by Blum [1]. Formally, the medial axis can be defined as follows. Consider the circles that lie totally within the object and touch the object boundary at two or more points. The centres of all such circles form the medial axis of the object. Extensive research has been carried out in this area (in the digital domain) and, as a result, a large number of skeletonization or thinning techniques have been developed in the last few decades. These techniques differ from each other in performance and in implementation, but the main objective is to achieve a close approximation of the MAT of the object. A large number of skeletonization algorithms remove outer-layer pixels iteratively until a one-pixel-thick skeleton is achieved. Another popular approach uses the distance transform (DT). Several other techniques based on polygon approximation, run-length encoding, contour following, etc., are also available. These techniques are surveyed in a classified manner by Smith [2] and Lam et al. [3]. The output skeleton produced by some of these algorithms may not be a raster representation. Rather, it may be in the form of a planar graph providing a line-segment approximation of the input pattern. In this paper, we call such a representation a vector skeleton, as opposed to the raster skeleton produced by other techniques.

This paper deals with generation of the vector skeleton of an object. For this, we propose a self-organizing neural network model which is topology-adaptive. The neural net grows in size with iterations according to the local topology of the input pattern. When the net converges, the topology of the processors defines the vector skeleton.

Our study shows that the proposed neural algorithm has quite a few advantages over the conventional ones. The most important advantage is high robustness with respect to boundary and interior noise. Second, the output here is invariant under rotation by arbitrary angles, which is not so for many conventional algorithms. The medial axis representation efficiency is found to be satisfactory. The algorithm is also capable of higher data reduction. Finally, the proposed algorithm can be seen as a unified approach to skeletonization because it is applicable to all three types of input patterns, namely, binary images, dot patterns and gray-level images.

We describe our neural network model for skeletonization in Sections 2 and 3. Section 4 presents comparative results between the proposed neural algorithm and some conventional thinning algorithms. Conclusions are given in Section 5.

2. The topology-adaptive self-organizing neural network (TASONN) model

Kohonen's self-organizing neural network (SONN) model [4] uses a network of fixed (either linear or planar) topology. The neighbourhood topology in the network is fixed. Such a net of fixed neighbourhood topology does not work well in some situations [5,6]. This is because, during the weight-updating process, the weight vectors lying in zero-density areas may be affected by input vectors from the surrounding parts of the nonzero distribution. As the neighbourhoods are shrunk, the fluctuation vanishes, which makes some processors remain outliers due to the residual effect. Moreover, because of the rigid topology of the net, the topology of the input pattern cannot be completely adapted. These pose problems, as discussed in the next section, because the resulting network does not give the required vector skeleton.

From the above issues it is felt that a dynamic network with an adaptive local neighbourhood is required for skeletonization. The model proposed here meets this requirement. In the present model, the initial list of processors is empty and the resulting topology is completely data-driven. That is, initially there is no processor and no neighbourhood is defined. The net grows in size by means of a certain processor evolution mechanism. During the learning process the processors create/adapt their neighbourhoods dynamically, by means of connection building, on the basis of the input. The neighbourhood of a processor in the net is totally input-driven and gets dynamically defined. The degrees of different processors (number of neighbours) may vary and may increase or decrease over time, and hence the neighbourhood topology of the net is not fixed. The topology is adapted on the basis of the local input vectors. The model enables the network to learn the weight vectors as well as its topology from the input vectors in an unsupervised manner. Thus the present model is a topology-adaptive self-organizing neural network (TASONN).

A different topology-adaptive network was proposed by Martinetz and Schulten [7] but, as we shall see later on, this network is not suitable for generating the vector skeleton of a pattern.

We shall "rst describe our model in a general case andthen demonstrate its applicability to skeletonization. Letthe input vectors X come from a manifold de"ned in them-dimensional space. Denote the array of processors byMn

1,n

2,2, n

nN and the neighbourhood N

iof a processor

niby Mn

pDn

pis connected to n

iN which excludes n

i. The

processor ni

stores an m-dimensional weight vector=

iwhich can be considered to be the location of the

processor in Rm.

Definition 1. By the sensitive region of a processor we mean an m-dimensional ball of a given radius centred at the weight vector of the processor. The radius is called the sensitive level.

Definition 2. A processor is said to respond to an input signal if the input vector falls inside the sensitive region of the processor.

Definition 3. A processor is said to win if it is the nearest to the input vector among all the processors. The processor is called the winner processor. The processor that is second nearest to the input vector is called the second winner.

In the present model, the entire network is evolved. Initially, there is no processor, that is, the set of processors is null. New processors can be added to the network in two ways (discussed shortly), and we use the two terms create and insert to distinguish between them. Both creation and insertion are performed according to the demand of the input. The former takes care of new input regions while the latter takes care of highly dense regions. If some input region is unattended, that is, if no processor is found around it, a new processor is created there. Again, if any input region is highly dense, new processors are inserted there.

An input vector is received through the input lines. The first processor is created at the location of the first input vector. A sensitive level is set and a sensitive region is assigned to the processor. From the second input onwards the process continues as follows. If the presented input falls outside the sensitive regions of all the existing processors (that is, no processor responds) then a processor is created at the location of the input. If at least one processor responds, then (1) adjust the weights of the winner and the second winner; and (2) strengthen the strength variable of the edge connecting the winner and the second winner and weaken that of all other edges.

Fig. 1. For an 'L'-shaped pattern, the output net achieved by TASONN after convergence: (a) d = 25, (b) d = 50. The nonzero b values are shown against each link.

Several iterations (presentations of input) make one phase. One phase is completed when the weight vectors of the current set of processors converge, that is, when the network becomes stable. After every phase, a new processor is inserted at the middle of the link with the maximum strength value. The links for the new processor are rebuilt after removing the old link and the strengths are readjusted (the strength of the old link is distributed equally to the two new links). The next input vector is presented and the whole process is continued until a certain convergence criterion is met. Formally, the above process can be described as follows.

Processor creation: Suppose the sensitive level is q. A new processor is created when the input vector X(t), at time t, comes from a new region. A new input region is detected if no processor is found within a distance of q from X(t). In other words, on presentation of an input, if no processor responds (or if the processor set is empty) with the given sensitive level q, then create a new processor at the location of X(t).

Link construction: Establish a link between two nodes n_u and n_v (u ≠ v) if X is an input from p(X) such that

max(d(X, W_u), d(X, W_v)) ≤ d(X, W_k)  ∀k (k ≠ u, k ≠ v),

where d(·) stands for distance. The above criterion means that, in the iterative process, if some input vector arises which has W_u and W_v as the two nearest weight vectors (one is the nearest and the other is the second nearest), then the nodes n_u and n_v are joined by a link, say L_uv.

A strength b_ij (0 ≤ b_ij ≤ 1) is associated with every such link L_ij. These strengths are updated during learning. Initially, all b_ij's are set to 0. Suppose the input vector at iteration t is X(t). Let n_u and n_v (u ≠ v) be, respectively, the first and second nearest processors of X(t), that is,

max(d(X(t), W_u), d(X(t), W_v)) ≤ d(X(t), W_k)  ∀k (k ≠ u, k ≠ v).  (1)

Then the strengths are updated as follows:

b_uv(t+1) = b_uv(t) + (1/(t+1)) (1 − b_uv(t)).  (2)

For all other links,

b_ij(t+1) = b_ij(t) + (1/(t+1)) (0 − b_ij(t)) = (t/(t+1)) b_ij(t).  (3)

This is as if X(t) pulls only b_uv towards 1 and pulls all other b's towards 0.
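As an aside (ours, not part of the original formulation), the updates (2) and (3) amount to one multiplicative decay plus one additive correction, and can be sketched in a few lines of Python; the strength matrix b and the winner indices u, v are hypothetical names:

    import numpy as np

    def update_strengths(b, u, v, t):
        # b: symmetric n x n matrix of link strengths; u, v: winner and second winner.
        rate = 1.0 / (t + 1)
        b *= 1.0 - rate        # Eq. (3): every strength decays towards 0
        b[u, v] += rate        # Eq. (2): b_uv is pulled towards 1
        b[v, u] = b[u, v]      # keep the matrix symmetric
        return b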

Weight updation: After presentation of input X(t) at the t-th iteration, the links are established by condition (1) and the weights are updated by

W_u(t+1) = W_u(t) + a_1(t) [X(t) − W_u(t)],  (4)

W_v(t+1) = W_v(t) + a_2(t) [X(t) − W_v(t)],  (5)

where a_1(t) and a_2(t) (0 < a_2(t) ≤ a_1(t) < 1) are chosen as in the SONN model.

The asymptotic relation between the b_ij's and the W_i's is as follows. Suppose the asymptotic values of the weight vectors are W*_i. Consider the Voronoi tiles [7] of the Voronoi diagram of order 2 of the set of weight vectors W*_i:

V_ij = {X ∈ R^m : max(d(X, W*_i), d(X, W*_j)) ≤ d(X, W*_k)  ∀k (k ≠ i, k ≠ j)}.

The present model considers a pattern distribution p(X) that has support only on a submanifold, not on the entire embedding space R^m. It can be noted that the asymptotic value of the strength b_ij is ∫_{V_ij} p(X) dX. For some V_ij ≠ ∅, the integral ∫_{V_ij} p(X) dX may vanish, as a result of which the respective b_ij values will tend to zero. Again, it is possible that some b_ij can have a positive value at a certain stage of learning although V_ij is in fact empty. In both cases, a link introduced during learning will vanish in the limit. By arguments similar to those given by Martinetz and Schulten [7], the links with asymptotically non-zero b_ij's will form a subgraph of the Delaunay triangulation of the W*_i (i = 1, 2, ..., n).

It is to be noted that the asymptotic links, along with their strengths, contain information regarding p(X). The b value against each link indicates the degree of existence of the link. For example, for an 'L'-shaped pattern (Fig. 1(a)), there are 13 processors (after convergence) and a link between two processors is shown whenever the corresponding b value is greater than zero. Here p(X) is the uniform distribution over the 'L'-shaped area. The value of b for the diagonal link is 0.015 while the b values for the other links are between 0.070 and 0.080 (for the end processors they are 0.110). Removal of a link with a significantly low (in comparison to others) b value can thus result in a network that represents the skeletal shape of the input pattern more accurately. It is to be noted that such undesirable spurious triangles may occur at the junctions of a pattern. They can be avoided by increasing the gaps between two processors (Fig. 1(b)). Alternatively, such spurious triangles can be removed by deletion of links with significantly low b.

Processor insertion: The processors are inserted only after a phase is completed. Suppose, at the end of the s-th phase, the weight vectors of the processors are W_1(t_s), ..., W_{n(s)}(t_s), where n(s) is the number of processors during the s-th phase and t_s is the total number of iterations needed to reach the end of the s-th phase. Now, the stopping criterion can be set depending upon the application in hand (see the next section). If the stopping criterion is not met then a new processor n_l is inserted between n_k and n_k', where

b_{kk'} = max_{i=1,...,n(s)} max_{n_i' ∈ N_i} b_{ii'}.  (6)

Set b_{lk} = b_{lk'} = (1/2) b_{kk'}, the new b_{kk'} = 0, and the weight of n_l = (1/2)[W_k(t_s) + W_{k'}(t_s)].

By this process a processor is inserted at a location where the demand is maximum. The process continues until, at the end of a phase, the stopping criterion is met.

The TASONN Algorithm

Step 1: Initialize t = 0.
Step 2: Present the input pattern X(t).
Step 3: If some processor responds to X(t) then go to Step 5.
Step 4: Create a processor at the location of X(t).
Step 5: If the second winner exists then update the strengths b_ij according to Eqs. (2) and (3).
Step 6: Update the weights of the winner and the second winner according to Eqs. (4) and (5).
Step 7: If the net is not stable, set t = t + 1 and go to Step 2.
Step 8: If the stopping criterion is met, then go to Step 10.
Step 9: Insert a processor according to condition (6), set t = t + 1. Go to Step 2.
Step 10: Stop.
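The following Python sketch is one possible reading of the above algorithm, not the authors' implementation. It uses the distance-based insertion and stopping criteria of Section 3 (rather than the maximum-strength rule of Eq. (6)), omits activation regions, and all data-structure choices and parameter values are our own assumptions:

    import numpy as np

    def tasonn(points, q=15.0, d=10.0, eps=0.5, a1=0.01, a2=0.005, max_phases=100):
        # points: (N, m) array of input vectors; q: sensitive level;
        # d: neighbour-distance bound; eps: phase-convergence threshold.
        W = np.empty((0, points.shape[1]))      # no processors initially
        b = np.empty((0, 0))                    # link strengths
        t = 0
        for _ in range(max_phases):
            while True:                         # one phase: sweep until stable
                W_old = W.copy()
                for X in points:
                    t += 1
                    if len(W) == 0 or np.linalg.norm(W - X, axis=1).min() > q:
                        W = np.vstack([W, X])   # Step 4: create a processor at X
                        b = np.pad(b, ((0, 1), (0, 1)))
                        continue
                    order = np.argsort(np.linalg.norm(W - X, axis=1))
                    u = order[0]                # winner
                    if len(W) > 1:              # Step 5: Eqs. (2)-(3)
                        v = order[1]            # second winner
                        rate = 1.0 / t
                        b *= 1.0 - rate
                        b[u, v] += rate
                        b[v, u] = b[u, v]
                        W[v] += a2 * (X - W[v]) # Step 6: Eq. (5)
                    W[u] += a1 * (X - W[u])     # Step 6: Eq. (4)
                if (len(W_old) == len(W)
                        and np.all(np.linalg.norm(W - W_old, axis=1) < eps)):
                    break                       # net stable: phase complete
            linked = np.argwhere(b > 0)         # Step 8: stopping criterion
            if len(linked) == 0:
                break
            gaps = np.linalg.norm(W[linked[:, 0]] - W[linked[:, 1]], axis=1)
            k = int(np.argmax(gaps))
            if gaps[k] <= d:
                break
            i, j = linked[k]                    # Step 9: insert at the widest link
            W = np.vstack([W, 0.5 * (W[i] + W[j])])
            b = np.pad(b, ((0, 1), (0, 1)))
            l = len(W) - 1
            b[l, i] = b[i, l] = b[l, j] = b[j, l] = 0.5 * b[i, j]
            b[i, j] = b[j, i] = 0.0
        return W, b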

We shall now see how the above model can be used in skeletal shape extraction. The main features of the TASONN model useful in skeletonization are its dynamic topology adaptivity and its dynamic growth in size.

3. Skeletonization by TASONN model

To make the proposed TASONN model applicable to the skeletonization problem, a suitable stopping criterion is used. A parameter d is introduced as an upper bound on the distances between two neighbouring processors. The problems of skeletal shape extraction of two-dimensional visual patterns, images and dot patterns, are considered together here. Here, the set of input vectors is S = {P_1, P_2, ..., P_N}, where P_j represents the positional co-ordinates of an object point. One presentation of all the points in S makes one sweep consisting of N iterations. After one sweep is completed, the iterative process for the next sweep starts again from P_1 through P_N. With the present stopping criterion, the processor insertion step (discussed in the previous section) is set as follows.

Processor insertion: Processors are inserted after each phase. A phase is completed when the network with the current set of processors converges, that is, when

|W_i(t) − W_i(t')| < e  ∀i,

where t and t' are the iteration numbers at the end of two consecutive sweeps and e is a predetermined small positive quantity. If

|W_k(t_s) − W_{k'}(t_s)| = max_{i=1,...,n(s)} max_{n_i' ∈ N_i} |W_i(t_s) − W_{i'}(t_s)| > d

then one processor is inserted between n_k and n_k'. The weight of the new processor and the new b values are set as before. After the insertion of a processor, the next phase starts with the new set of processors. The process continues until, at the end of a phase,

|W_i(t_s) − W_{i'}(t_s)| ≤ d  for all i, ∀n_i' ∈ N_i.

The output network of the algorithm gives an approximate global shape of the input pattern. The final network obtained by the above algorithm gives a vector skeleton for the given input pattern. The raster skeleton can easily be derived from the network as follows. It should be mentioned that the output net depends on the parameter d, and with a proper choice of d one can get a satisfactory vector skeleton. This issue is discussed in Section 3.1.

Procedure raster-skel: For each link, the line segment connecting the weight vectors of the two corresponding processors is considered. The set of all object pixels intersecting such a line segment gives the raster skeleton.
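A minimal sketch of procedure raster-skel, under our own assumptions (the pattern is a binary numpy array, the links are index pairs into the weight array; all names are ours): sample each link's segment densely and keep the object pixels it passes through.

    import numpy as np

    def raster_skel(obj, W, links, step=0.25):
        # obj: 2-D binary array; W: (n, 2) weight vectors as (row, col);
        # links: pairs (i, j) of processors joined with nonzero strength.
        skel = set()
        for i, j in links:
            p, r = W[i], W[j]
            n = max(1, int(np.linalg.norm(r - p) / step))
            for s in range(n + 1):          # sample the line segment densely
                y, x = np.round(p + (r - p) * (s / n)).astype(int)
                if 0 <= y < obj.shape[0] and 0 <= x < obj.shape[1] and obj[y, x]:
                    skel.add((int(y), int(x)))   # keep only object pixels
        return skel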

Experimental results on various input patterns have shown that the above algorithm converges and the resulting network gives an approximation of the skeleton of the input pattern. The following discussion helps to show how the resulting net gives an approximation of the skeletal shape of the pattern.

For arc patterns, it can easily be seen that the above node-joining criterion joins the processors by single links (the other link strengths are zero) so that the topology of the input is preserved (for example, see Fig. 2(a)). Here two processors corresponding to the regions S_2 and S_3 are joined since they are the two nearest processors from the input vectors lying in the shaded region. Similarly, other links are established. The strengths of all other possible links are zero since no input vector contributes to them. For a node representing a fork junction region S_1 (Fig. 2(b)), the node pairs corresponding to the region pairs (S_1, S_2), (S_1, S_3) and (S_1, S_4) are joined for the same reason.

Fig. 2. Link establishments in (a) arc pattern, (b) fork pattern.

Fig. 3. Output nets for an 'R'-shaped pattern when (a) d = 30, (b) d = 25, (c) d = 20, (d) d = 15, (e) d = 10, (f) d = 5, without activation level.

Fig. 4. Activation regions (a) at time t_1, (b) at time t_2, (c) at time t_3, where t_1 < t_2 < t_3.

3.1. Role of parameter d

The parameter d controls how densely the processors are located in the network. It controls the gap between two neighbouring processors. A desirable skeleton may not be achieved if d is too high or too low. If d is very high then the skeleton, although it may preserve the essential topology of the pattern, does not properly represent the medial axis (Figs. 3(a) and (b)). A portion of the output skeleton may lie outside the object. On the other hand, very low values of d might produce spurious triangles in the network (Figs. 3(e) and (f)) which do not represent the true skeletal shape of the pattern. Hence we require some adaptive mechanism to avoid this situation so that a satisfactory medial axis representation can be obtained. One such mechanism is to introduce an activation level into the weight-updating process, which is described below.

We specify an activation region of a processor such that an input vector activates the processor only if it falls within this region. The activation region decreases over time. In the present problem it is defined as follows.

Definition 4. The activation level a_i(s) of the i-th processor, for i = 1, ..., n(s), at the end of the s-th phase, is defined as

a_i(s) = (1/c_i) Σ_{n_i' ∈ N_i} |W_i(t_s) − W_{i'}(t_s)|,

where c_i is the number of neighbours of n_i excluding n_i.

Definition 5. The activation region of a processor is a circle with the corresponding weight vector as the centre and the activation level as its radius (see Fig. 4).
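Under the same hypothetical data structures as before, Definition 4 reduces to the mean distance from each processor to its current neighbours; a sketch:

    import numpy as np

    def activation_levels(W, b):
        # a_i(s): mean distance from processor i to its neighbours, i.e. the
        # processors currently linked to it with nonzero strength.
        a = np.zeros(len(W))
        for i in range(len(W)):
            nbrs = np.nonzero(b[i])[0]      # neighbourhood N_i
            if len(nbrs) > 0:
                a[i] = np.linalg.norm(W[nbrs] - W[i], axis=1).mean()
        return a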

It should be mentioned at this stage that the sensitive level and activation level, although similar, are used in different contexts and hence are given different names. The sensitive level remains fixed over time and is used to create the initial set of processors. It is not used any more unless a new input region occurs. The activation level varies over time and may not be the same for all the processors at a point of time. It is used in competition and weight updating.

A processor is called active if the presented input vector lies within its activation region. In other words, if an input vector lies outside all the activation regions, it is ignored in the competition and updating process. An input must activate a processor first before entering into the competition. Only active processors are qualified for competition, after which the winner processor is selected and the weight vectors are updated accordingly. Thus, the weak signals (here the object pixels near the boundary of the object) gradually become ineffective in the weight-updating process. This is so because, as the processors come closer and closer to each other, the region of input vectors that activates (influences) a processor also shrinks, and thereby the influence of the outer layers is symmetrically decreased. The object pixels near the medial axis acquire more control over the process (Fig. 4(c)).

Fig. 5. Output nets for an 'R'-shaped pattern when (a) d = 10, (b) d = 5, (c) d = 3, with activation level.

It should be noted that the sensitive level is also taken as a parameter in the TASONN model. But, since it is used to detect new regions and to select the initial few processors, its choice is not crucial. Within a wide range of the sensitive level, the model produces identical results.

Although the activation level overcomes the formation of spurious triangles in linear or arc segments of the pattern, it cannot overcome the problem completely at junctions of the pattern. If two such segments meet at a small angle then spurious triangles may occur even after incorporating activation levels. But this situation can be avoided by keeping d high, as seen in Fig. 1.

With these observations, we implement the TASONN model for skeletonization as follows. The whole process is divided into two stages: (a) finding an initial skeleton to get the overall topology of the pattern and (b) arriving at a more accurate final skeleton after incorporating the activation level. In the first stage, set d high and get an initial skeleton that does not contain any spurious triangle but preserves the topology of the input pattern (Figs. 3(a)-(d)). Note that the higher the value of d, the lower is the chance of forming a spurious triangle. If there is a spurious triangle then one of the following actions can be taken: (i) remove a link by choosing a threshold on the b values as described in Fig. 1, or (ii) take a higher value of d and repeat the process to get an initial skeleton.

By the above method, we get an initial skeleton free of spurious triangles that reflects the overall topology of the pattern but may not be close to the medial axis. Now, in the second stage, to get a close medial axis approximation, a lower value is assigned to d and the weight-updating process is continued. As the topology of the input pattern is already learned, the strengths (b) are ignored at this stage. That is, we do not develop any more connections or links in the second stage. We only update the weight vectors in an effort to position them more accurately along the medial axis. New processor insertion is continued as before. For example, the output networks for the 'R'-shaped pattern, using the activation level, are shown for different d values in Fig. 5. It can be observed that these results give a satisfactory skeletal shape of the pattern even for small d's (compare with Fig. 3 where the activation level is not used).

An important question is: what value of d should we choose in stage (a)? Experiments have shown that this choice can be made from a considerably wide range and hence one need not guess an exact value. For example, in Fig. 3, for an 'R'-shaped pattern, the essential topology could be achieved for 15 ≤ d ≤ 30. Within this range, the initial skeletons have the same topology. In many situations, the initial skeleton itself (for example, Figs. 3(a)-(d)) serves the purpose. For example, skeletonization is applied as a preprocessing step in character recognition problems. In character recognition, a vector skeleton of the character pattern makes the recognition task easier. This vector skeleton need not be very close to the medial axis. Crude vector skeletons similar to those shown in Figs. 3(a)-(d) are good enough for structural analysis and recognition of the input pattern.

Moreover, for printed and hand-printed character recognition (for a given font, fixed pen thickness and scanner resolution) we can learn, by a trial-and-error method, an estimate of d from a number of training characters so that the initial skeleton itself is satisfactory (in other words, the initial skeleton provides the essential topology of the input pattern). This estimate can be subsequently used in stage (a). However, if one wants a more accurate skeleton, one can go for the second stage mentioned above.

As mentioned earlier, a topology-adaptive model, the 'neural gas network' (NGN) model, was proposed by Martinetz and Schulten [7]. Fritzke [8] modified the NGN model and suggested a scheme to insert processors in the network. Thus in the 'growing neural gas network' (GNGN) model of Fritzke, the number of weight vectors to start with need not be known a priori. The NGN and GNGN models are found to be effective methods for topology learning. They are advantageous over the SONN model because the SONN model assumes a predefined topology of the network that remains fixed throughout learning. However, the NGN and GNGN models are not readily usable in the extraction of a vector skeleton. Experiments show that in extraction of the skeleton of a pattern, the TASONN model produces much better results than the SONN, NGN and GNGN models. This is demonstrated in Fig. 6, where it can be seen that the output nets may form mesh-like structures rather than a vector skeleton in the SONN and NGN models. However, in the GNGN model, this problem can be avoided by keeping the number of processors small (for which one needs to know the optimum number a priori). For example, the GNGN model may produce a skeleton as good as that of the TASONN model if the number of processors is optimally chosen (here 10). The proposed TASONN model gives an adaptive way (using the activation region) to avoid this type of situation.

Fig. 6. Responses of different models for a Y-shaped pattern: (a) SONN model using linear topology, (b) SONN model using rectangular topology, (c) NGN/GNGN models, (d) TASONN model (only the links with nonzero b are shown).

Second, in the NGN model, weights are updated for the r nearest processors. Initially, the value of r is set high so that the chance of getting stuck at a local minimum is reduced. Ordering of the r distances, for each presentation of the input, makes the model computationally expensive.

Third, in the NGN and GNGN models, the edge-destruction mechanism is based on an aging scheme. Connections made at an early stage of the adaptation may not be valid any more at an advanced stage. The authors maintain an age counter for every link after the link is constructed. If the age of a link exceeds a predefined 'lifetime' then the link is destroyed. The lifetime is chosen subjectively here. This technique may pose problems in certain situations. In the proposed TASONN model, the aging of the links is done differently and the link adaptations are done more efficiently. In the NGN and GNGN models, a link is constructed/destroyed in a sudden manner. A link is constructed between two processors as soon as an input arises for which these two processors are the closest and the second closest. Similarly, a link is destroyed as soon as its age exceeds the lifetime. This may cause sudden loss of previous learning. Note that if the lifetime is small then the previous learning may be lost soon when input keeps on coming from a new region [5]. If the lifetime is big then some undesirable links that should not survive may do so. So the choice of the lifetime, which is made manually, is crucial. In our model, the links are not constructed or destroyed suddenly. Instead, a strength variable is assigned to each edge and these strengths are adapted (lying between 0 and 1) gradually in the learning process. Thus the model does not forget the previous learning in a sudden manner. Moreover, in Kohonen's SONN model and in the NGN/GNGN models, a link between two processors either exists or does not exist (deterministic). But in the TASONN model a link is allowed to have a degree of existence in the form of b, and this can be of use in shape analysis (as explained in Fig. 1).

Finally, in the NGN model, all the processors are created initially at random positions. This may lead to a number of processors being dead and finally coming to no use. The GNGN model overcomes this problem by taking only two processors initially and then growing the size of the network. The present model starts with an empty set of processors, and processors are created where the input demands them. Thus a new input region can be adapted easily and more efficiently. Moreover, in the GNGN model, new processors are inserted at constant (manually decided) time intervals, while in our model insertions are done when the current network stabilizes. After the insertion the learning continues.

4. Performance evaluation of the proposed algorithm

Performance of the proposed neural algorithm is analysed in comparison with some existing conventional thinning algorithms. The comparisons are carried out with respect to the following desirable properties of a skeletonization algorithm: (a) robustness or noise immunity; (b) medial axis representation; (c) rotation invariance; (d) data reduction efficiency; (e) extendibility to other input types (e.g., dot patterns and gray-level patterns). These properties are discussed in the subsections below, and the response of our algorithm with respect to each is also presented.

Fig. 7. Robustness to boundary noise, an illustration. '0' represents an object pixel and '*' a skeletal pixel: (a) a line segment with four boundary noise pixels, (b) result of a conventional iterative thinning algorithm, (c)-(d) results of our neural algorithm with the same noise and higher noise.

4.1. Robustness

Two types of noise, namely boundary noise and object noise, are considered here. An important aspect of our skeletonization algorithm is noise immunity, which makes it qualitatively different from the conventional ones. If the original object contains noise, the skeleton should not deviate much from the skeleton in the ideal situation. A serious problem with many of the existing algorithms is that they sometimes produce noisy skeletons if the input patterns contain noisy boundaries (see [9,10]). Moreover, these algorithms cannot handle object noise. On the contrary, our algorithm can take care of both types of noise.

Justi"cations of the above claim are as follows. Theresulting skeleton here is given by the weight vectors afterconvergence, and their links. Its "nal position is highlyinsensitive to noise pixels because of two factors. First,the weight vector converges to the centre of gravity of therespective Voronoi region (S

i) and this centre is not

greatly a!ected by a few noise pixels. Second, the activa-tion region of a processor decreases over time and as aresult, the boundary noise pixels are kept outside it to agreat extent. Thus, the noise insensitivity of the presentalgorithm is clear from its learning mechanism and con-vergence property. Most of the existing conventionalalgorithms use a rigid de"nition of connectedness of theobject * which in e!ect causes noise sensitivity. Theproposed neural approach relaxes the concept of con-nectivity and it is found that such a relaxation is usefulfor robustness even when SNR (signal-to-noise ratio) isclose to 1.

4.1.1. Boundary noise

The proposed skeletonization algorithm is found to be highly robust to boundary noise. The boundary noise is distributed on the boundary of the object (white noise) and on its immediate neighbourhood in the background (black noise). Here we add black and white boundary noise pixels and study their effect on the skeleton. We have experimented on several example patterns with different SNR values, where the SNR is defined as follows:

SNR_B = (number of boundary black pixels) / (B + W),

where B is the number of black noise pixels and W the number of white noise pixels. The robustness of our algorithm to boundary noise is demonstrated in Fig. 7. A conventional iterative thinning algorithm (for example, the algorithm by Jang and Chin [11]) is found to be noise sensitive (Fig. 7(b)) while our proposed algorithm is insensitive to it (Figs. 7(c) and (d)).

Fig. 8. Boundary noise immunity.

Fig. 9. The output skeleton generates big holes in the presence of single-noise pixels. Output of conventional thinning algorithms (a) without any noise and (b) with two single-noise pixels; (c) vector skeleton by the proposed algorithm with the same noise as in (b); (d) raster skeleton by the proposed algorithm with the same noise.

An error measure, as given below, has been suggested by Jang and Chin [10]:

m_e = min{1, (Area[S_K − S'_K] + Area[S'_K − S_K]) / Area[S_K]},

where S_K and S'_K are the skeletons obtained from S and its noisy version, respectively. Area[·] is an operator that counts the number of pixels.
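When both skeletons are available as sets of pixel coordinates, m_e reduces to a normalized symmetric set difference; a sketch with hypothetical names:

    def skeleton_error(S_K, S_K_noisy):
        # m_e of Jang and Chin: symmetric difference between the skeleton S_K
        # of the clean pattern and S'_K of its noisy version, both given as
        # sets of (row, col) pixels, normalized by the size of S_K.
        sym_diff = len(S_K - S_K_noisy) + len(S_K_noisy - S_K)
        return min(1.0, sym_diff / len(S_K))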

The average values of m_e against different SNR values are computed for comparing the algorithms. Using the above measure, Jang and Chin's algorithm [10] has been found to be superior to several other conventional algorithms [12-16] in terms of boundary noise immunity. The average m_e values of our proposed algorithm are found to be even more encouraging (see Fig. 8) and reflect higher robustness to boundary noise.

4.1.2. Object noise

Many conventional algorithms are not able to handle noise that is interior to the object. By object noise we mean the white noise distributed over the entire object including its boundary. Such noise may occur in practice for several reasons. The conventional algorithms (particularly the iterative ones) use the property of local connectivity within a small window (3×3 or 5×5) and try to preserve such local connectivities throughout. They treat a single white noise pixel as a hole consisting of a single pixel. As a result, the output skeleton contains a big hole (Fig. 9(b)). Fig. 9 shows how only two noise pixels misclassify a '1'-like pattern as an '8'-like pattern. On the contrary, the proposed algorithm uses the connectivity concept in a more general sense (note that two processors are joined by a link if the two respective regions are adjacent). The algorithm treats small holes as white noise at the cost of a possibility of missing a true small hole. Thus, very small holes have hardly any effect on the resulting skeleton (Figs. 9(c) and (d)). But if the hole is large enough and is a part of the pattern (for example, consider an 'A'-shaped or 'R'-shaped pattern), it is output as a hole in the resulting skeleton. We have experimented and found that the algorithm is robust and performs satisfactorily with moderately low SNR (a moderate amount of noise), where the SNR is defined by

SNR = (number of object pixels) / (number of noise pixels on the object).

For a very low SNR, that is, for a very high amount of noise, the binary object becomes merely a set of scattered pixels or a dot pattern. The proposed algorithm can still produce the global skeletal shape of the pattern, assuming the noise to be uniformly distributed.

An illustration is given in Fig. 10 for the pattern 'A' with different SNR values. We have taken a very high amount of object noise and tested the algorithm on several character patterns. It has been found that even in the presence of very high noise (SNR = 2.0, 1.5 and 1.1), the proposed algorithm is able to extract the skeletal shape of the original object, as can be seen in the example figure (Fig. 10). The existing conventional algorithms fail in such situations.

4.2. Medial-axis representation

The primary objective of skeletonization is to approximate the medial axis of the object pattern. So it is important that the output skeleton should approximate the medial axis as closely as possible. After getting the raster skeleton as mentioned earlier, the following measure [11] is computed for comparing the goodness of fit of the medial axis representation with some other thinning algorithms:

g = Area[S''] / Area[S],

where S is the set of all object pixels in the input pattern and S'' is the union of the maximal digital disks (included in S) centred at all the skeletal pixels.

Fig. 10. The final skeletons obtained for the pattern 'A' with object noise: (a) SNR = 2, (b) SNR = 1.5, (c) SNR = 1.1.

Table 1
Index of medial-axis representation

Algorithm                            g
Holt et al. [12]                     0.891
Hall [13]                            0.902
Chin et al. [14]                     0.819
Zhang and Suen [15]                  0.886
Lu and Wang [16]                     0.898
Arcelli and Sanniti di Baja [17]     0.867
Jang and Chin [11]                   0.881
Fan et al. [18]                      0.811
Proposed neural algorithm            0.890

Fig. 11. Effect of rotation by angles: (a) 45°, (b) 22°, and (c) 11°.

Clearly, g (lying between 0 and 1) measures the closeness of the output skeleton to the ideal medial axis. The derived skeleton is identical to the ideal medial axis if g is 1.
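One way to compute g, sketched here under the assumption that the maximal-disk radius at a skeletal pixel can be read off the Euclidean distance transform of the object (scipy.ndimage.distance_transform_edt); the continuous disks below only approximate the maximal digital disks of the measure:

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def medial_axis_index(obj, skel):
        # obj: 2-D binary array of object pixels S; skel: 2-D binary array of
        # skeletal pixels. Returns g = Area[union of maximal disks] / Area[S].
        radii = distance_transform_edt(obj)     # distance to nearest background pixel
        yy, xx = np.indices(obj.shape)
        covered = np.zeros(obj.shape, dtype=bool)
        for y, x in zip(*np.nonzero(skel)):
            covered |= (yy - y) ** 2 + (xx - x) ** 2 < radii[y, x] ** 2
        return (covered & (obj > 0)).sum() / (obj > 0).sum()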

For the proposed algorithm, the average g value is found to be 0.890 (with d = 3). Average values of g are computed for a number of conventional algorithms on the same set of test patterns. The results are summarized in Table 1. It is found that for the proposed algorithm, the medial axis representation index is as satisfactory as for the conventional algorithms.

4.3. Rotation invariance

It is easy to see that, in the proposed algorithm, the output skeleton is invariant under rotation of the input pattern by arbitrary angles. This is due to the facts that the proposed method does not assume any underlying grid and that the measure of closeness is in terms of Euclidean distance. The iterative methods for skeletonization that use a square grid are usually invariant under rotation by multiples of 90° only. The work due to Jang and Chin [10], based on a derived grid, reports invariance under 45° rotation also. As an illustrative example, we have rotated the pattern 'X' by different angles (45°, 22° and 11°) and shown the output skeletons in Fig. 11. It shows hardly any effect of rotation of the input pattern on the output skeleton.

4.4. Data reduction efficiency

After convergence, the proposed neural network model creates an adaptive vector quantization [4] of the input set. Each weight vector tends to the centroid of the respective Voronoi region. Within each region all pixels have the same weight vector as their nearest one. In general, the output weight vectors give the prototypes or exemplar vectors of the corresponding regions and these can be an encoded version of the input in less storage space. In the present algorithm, the set of weight vectors along with their interconnections, or the graph (planar straight-line graph) with the weight vectors as its nodes and the interconnections as its edges, represents a vector skeleton of the input pattern. This skeleton requires much less space than the original input set and hence a considerable data reduction is achieved.

Fig. 12. Output raster skeletons of gray-level patterns.

One of the basic purposes of skeletonization is to reduce the storage space required to store the image data without losing the essential structural information. The proposed method can achieve more data reduction than most of the existing skeletonization algorithms. It can be seen that the fewer the processors in the network, the greater the data reduction. By choosing a larger value of d, we can achieve higher data reduction. But this might worsen the accuracy of the medial axis representation (see Fig. 3). A proper choice of d, as mentioned earlier, can balance this trade-off.

4.5. Extendibility to dot patterns and gray-level images

Unlike for binary images, a skeleton cannot be properly defined for a dot pattern. But the human visual system can still perceive its skeletal shape and extract the perceptual skeleton from a dot pattern. For example, a dot pattern having a definite shape can be recognised by the human brain almost as easily as a binary image having the same shape. The conventional thinning algorithms that extract skeletons from binary images do not work for dot patterns. On the contrary, as already seen, the proposed neural algorithm can be used to extract the perceptual skeleton of a dot pattern (see Fig. 10).

Another advantage of the proposed algorithm is that it can take care of gray-level patterns also. The conventional binary image thinning algorithms work only on binary images. They do not work on gray-level images. On the other hand, the neural technique discussed here works on gray-level images as well. In the case of gray-level images, the area of interest (i.e., the object portion) can be interpreted as a multi-valued foreground emerging from a single-valued background (see [19]). Suppose, for a gray-level pattern, g_ab is the gray value of the pixel at the a-th row and b-th column. If the weight update rules (4) and (5) are rewritten as

W_u(t+1) = W_u(t) + a_1(t) [X(t) − W_u(t)] g_ab,

W_v(t+1) = W_v(t) + a_2(t) [X(t) − W_v(t)] g_ab,

then the TASONN model takes care of gray-level patterns as well. It is to be noted that if g_ab = 0 or 1 (for all a, b) then the above update rules are equivalent to rules (4) and (5). The algorithm has been tested on several gray-level images. Fig. 12 shows the results of such examples. For all the test patterns in this paper, the initial value of a is taken as 0.01 and the initial weight vectors are chosen at random. The values of a are changed over time as a_1(s) = 0.01/(1 + s/50) and a_2(s) = 0.01/(1 + s/20), where s is the sweep number.
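The gray-level modification only scales the two weight updates by the gray value of the presented pixel; a sketch (gray values assumed normalised to [0, 1]; names are ours):

    def update_weights_gray(W, u, v, X, g_ab, a1, a2):
        # Rules (4) and (5) scaled by the gray value g_ab: pixels with higher
        # gray value pull the winner and second winner more strongly.
        W[u] += a1 * (X - W[u]) * g_ab
        W[v] += a2 * (X - W[v]) * g_ab
        return W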

4.6. Computational complexity

In general, "nding skeletons is a computationally ex-pensive task. That is why parallel algorithms have beenencouraged in this "eld. The proposed neural network-based method is parallel in nature (in general, neurocom-puting has been considered to be a new form of parallelcomputing). In the proposed skeletonization algorithm,the input vectors are presented to the network sequen-tially. All the processors present in the network computethe distances from their respective weight vectors inparallel in a constant time and the winner is selected by a

A. Datta et al. / Pattern Recognition 34 (2001) 617}629 627

Page 12: Skeletonization by a topology-adaptive self-organizing neural network

Fig. 13. (a), (c) The vector skeleton misses very small gaps. (b),(d) However, the raster skeleton is alright.

&maxnet' [20]-like network. The winner and the secondwinner processors update their weights simultaneously.Thus for a single input, the time taken by the aboveprocess is not dependent on the network size. This pro-cess is repeated for all the input vectors. Hence onecomplete pass of the input, that is, one sweep requiresO(N) time where N is the total number of object pixels. Itcan be seen that the total number of sweeps does notdepend on the input size N.

It should be mentioned here that in many existing parallel thinning algorithms multiple processors are used, where all the processors work in parallel. But in these algorithms, all the input values are fed together in an iteration and all the processors (or a large subset of them) compute their output in parallel, which becomes the input to the next iteration. The total number of passes required by this class of algorithms is O(width_max), where width_max is the maximum width of the input pattern. These algorithms use cellular networks where each pixel in the image is assigned a processor. Thus the number of processors required is O(N_I), where N_I is the total number of pixels in the whole (including the background) image. In our algorithm, the size of the network is O(N_S), where N_S is the number of skeletal pixels. In most applications, N_I >> N_S.

5. Discussion and conclusion

A topology-adaptive self-organizing neural network model is proposed which is applicable to skeletal shape extraction of an object. In Kohonen's SONN model, the network topology is initially set and is maintained throughout. In the SONN model, the network is self-organizing in that it tends to approximate the input pattern space in an orderly fashion without any supervision. In the proposed model the same is achieved and, in addition, the net topology is adapted automatically, unlike in the SONN model. The topology adaptivity is a major feature of the proposed model which makes it applicable to skeletonization tasks. Due to the fixed network topology set initially, Kohonen's model cannot always properly represent the shape of a pattern, while the proposed model can do so more accurately. The proposed model evolves the topology of the network and it is self-organizing as well. Moreover, the model is especially tuned to generate a vector skeleton (from which a raster skeleton can also be obtained) of the underlying pattern.

As a skeletonization algorithm, the present model possesses quite a few advantages over the existing conventional thinning algorithms. The most important one is the high robustness of the present algorithm with respect to boundary and interior noise. Next, it is invariant under rotation by arbitrary angles while most of the conventional algorithms are not. The medial-axis-representation efficiency is found to be satisfactory. The algorithm is also capable of higher data reduction. Finally, the proposed algorithm can be seen as a unified approach to skeletonization because it is applicable to all three types of input patterns, namely, binary images, dot patterns and gray-level images.

Since the proposed technique is very insensitive to noise, it may treat very small holes or gaps as noise. Thus the proposed model cannot strictly guarantee the homotopy property of the object (see Fig. 9). Consider a broken line pattern. By our algorithm, the vector skeleton may miss the narrow gaps and recognize it as a single object (Fig. 13(a)). A similar situation will occur in a ring-like pattern with a narrow gap (Fig. 13(c)). This, however, does not cause any serious problem since the raster skeleton will be broken, as found in Figs. 13(b) and (d).

Lastly, an important advantage of the TASONN model is fault tolerance. It is easy to see that damage to a few nodes or links can be automatically repaired in the TASONN model, while this is not so in the SONN model. During learning, if any node or link is damaged, the input vectors in the surroundings will generate new nodes and new links. The link strengths (b values) are adapted from the subsequent input presentations.

References

[1] H. Blum, A transformation for extracting new descriptions of shape, IEEE Proceedings of the Symposium on Models for the Speech and Vision Form, Boston, 1964, pp. 362-380.

[2] R.W. Smith, Computer processing of line images: a survey, Pattern Recognition 20 (1987) 7-15.

[3] L. Lam, S.W. Lee, C.Y. Suen, Thinning methodologies - a comprehensive survey, IEEE Trans. Pattern Anal. Mach. Intell. 14 (1992) 869-885.

[4] T. Kohonen, Self-Organization and Associative Memory, Springer, Berlin, 1989.

[5] D. Choi, S. Park, Self-creating and organizing neural networks, IEEE Trans. Neural Networks 5 (1994) 561-575.

[6] J.A. Kangas, T. Kohonen, J. Laaksonen, Variants of self-organizing maps, IEEE Trans. Neural Networks 1 (1990) 93-99.

[7] T.M. Martinetz, K.J. Schulten, Topology representing networks, Neural Networks 7 (1994) 507-522.

[8] B. Fritzke, A growing neural gas network learns topologies, in: G. Tesauro, D.S. Touretzky, T.K. Leen (Eds.), Advances in Neural Information Processing Systems, Vol. 7, MIT Press, Cambridge, MA, 1995, pp. 625-632.

[9] A. Datta, S.K. Parui, A robust parallel thinning algorithm for binary images, Pattern Recognition 27 (1994) 1181-1192.

[10] B.K. Jang, R.T. Chin, One-pass parallel thinning: analysis, properties and quantitative evaluation, IEEE Trans. Pattern Anal. Mach. Intell. 14 (1992) 1129-1140.

[11] B.K. Jang, R.T. Chin, Analysis of thinning algorithms using mathematical morphology, IEEE Trans. Pattern Anal. Mach. Intell. 12 (1990) 541-551.

[12] C.M. Holt, A. Stewart, M. Clint, R.H. Perrott, An improved parallel thinning algorithm, Commun. ACM 30 (1987) 156-160.

[13] R.W. Hall, Fast parallel thinning algorithms: parallel speed and connectivity preservation, Commun. ACM 32 (1989) 124-131.

[14] R.T. Chin, H.-K. Wan, D.L. Stover, R.D. Iverson, A one-pass thinning algorithm and its parallel implementation, Comput. Vision Graphics Image Process. 40 (1987) 30-40.

[15] T.Y. Zhang, C.Y. Suen, A fast parallel algorithm for thinning digital patterns, Commun. ACM 27 (1984) 236-239.

[16] H.E. Lu, P.S. Wang, A comment on a fast parallel algorithm for thinning digital patterns, Commun. ACM 29 (1986) 239-242.

[17] C. Arcelli, G. Sanniti di Baja, A width-independent fast thinning algorithm, IEEE Trans. Pattern Anal. Mach. Intell. 7 (1985) 463-474.

[18] K.C. Fan, D.F. Chen, M.G. Wen, Skeletonization of binary images with nonuniform width via block decomposition and contour vector matching, Pattern Recognition 31 (1998) 823-838.

[19] C. Arcelli, G. Ramella, Finding grey-skeletons by iterated pixel removal, Image Vision Comput. 13 (1995) 159-167.

[20] K. Mehrotra, C.K. Mohan, S. Ranka, Elements of Artificial Neural Networks, MIT Press, Cambridge, MA, 1997.

About the Author - AMITAVA DATTA received his Master's degree in Statistics from the Indian Statistical Institute in 1977 and did a post-graduate course in Computer Science at the same institute in 1978. After working for a few years in reputed computer industries, he joined the Indian Statistical Institute as a System Analyst in 1988. Since then he has been working in image processing and pattern recognition. From 1991 to 1992 he visited GSF, Munich as a Guest Scientist and worked on a query-based decision support system. His current interests are in neural-net-based image processing and computer vision.

About the Author - S.K. PARUI received his Master's degree in Statistics and his Ph.D. degree from the Indian Statistical Institute in 1975 and 1986, respectively. From 1985 to 1987 he held a post-doc position at Leicester Polytechnic, England, and during 1988-89 he was a visiting scientist at GSF, Munich. He has been with the Indian Statistical Institute since 1987 and is now an Associate Professor. His current research interests include shape analysis, remote sensing, statistical pattern recognition and neural networks.

About the Author - B.B. CHAUDHURI received his B.Sc. (Hons), B.Tech. and M.Tech. degrees from Calcutta University, India in 1969, 1972 and 1974, respectively, and his Ph.D. degree from the Indian Institute of Technology, Kanpur in 1980. He joined the Indian Statistical Institute in 1978, where he served as the Project Co-ordinator and Head of the National Nodal Center for Knowledge Based Computing. Currently, Professor Chaudhuri is the head of the Computer Vision and Pattern Recognition Unit of the institute. His research interests include Pattern Recognition, Image Processing, Computer Vision, Natural Language Processing and Digital Document Processing including OCR. He has published 150 research papers in reputed international journals and has authored two books entitled Two Tone Image Processing and Recognition (Wiley Eastern, 1993) and Object Oriented Programming: Fundamentals and Applications (Prentice-Hall, 1998). He was awarded the Sir J.C. Bose Memorial Award for the best engineering science oriented paper in 1986, the M.N. Saha Memorial Award (twice) for best application-oriented papers in 1989 and 1991, the Homi Bhabha Fellowship award in 1992 for OCR of Indian languages and computer communication for the blind, the Dr. Vikram Sarabhai Research Award in 1995 for his outstanding achievements in the fields of Electronics, Informatics and Telematics, and the C. Achuta Menon Prize in 1996 for computer-based Indian language processing. He worked as a Leverhulme visiting fellow at Queen's University, UK in 1981-82, a visiting scientist at GSF, Munich and guest faculty at the Technical University of Hannover during 1986-88. He is a Senior Member of the IEEE, member secretary (Indian Section) of the International Academy of Sciences, Fellow of the International Association for Pattern Recognition, Fellow of the National Academy of Sciences (India), and Fellow of the Indian National Academy of Engineering. He is serving as associate editor of the journals Pattern Recognition, Pattern Recognition Letters, and Vivek. He also acted as guest editor of a special issue of the Journal of the IETE on Fuzzy Systems.
