
ANSIG — An Analytic Signature for Permutation Invariant 2D Shape Representation

Jose J. Rodrigues

Advisor: Pedro M. Q. Aguiar Co-advisor: Joao M. F. Xavier

Abstract

Many applications require a computer representation of 2D shape, usually described by a set of points. The challenge of this representation is that, besides capturing the shape characteristics, it must be invariant to relevant transformations.

Invariance to geometric transformations was addressed in the past under the assumption that the points are labeled, i.e., that the shape is characterized by an ordered set of landmarks. However, in many practical scenarios the points are obtained from an automatic process, e.g., edge detection, and thus have no natural ordering. The combinatorial problem of computing the correspondences between the points of two shapes becomes intractable when the number of points is large. We avoid this problem by representing shapes in a way that is invariant to the permutation of the landmarks.

We map a shape to an analytic function on the complex plane, leading to what we call its analytic signature (ANSIG). We show that different shapes lead to different ANSIGs but that shapes that differ by a permutation of the landmarks lead to the same ANSIG. Thus, ANSIG is a maximal invariant with respect to the permutation group. To store an ANSIG, it suffices to sample it along a closed contour. We further show how easy it is to factor out geometric transformations when comparing shapes.

We developed a shape-based image classification system, based on the ANSIG. Our experiments illustrate the capabilities of this system.

1. Introduction

Many objects are primarily recognized by their shape, rather than their color or texture. However, automatic shape-based classification has proved to be a very hard task, remaining an open problem, underlying which is the fundamental question of how to represent shape. This work deals with the representation of two-dimensional (2D) shape. In our context, a shape is described by the coordinates of a set of points, or landmarks. We seek efficient ways to represent such sets; in particular, we seek representations that are suitable for shape-based recognition. When the 2D shape is described by a set of labeled landmarks, there is an established theory that copes with geometric transformations and shape variations, the statistical theory of shape [1]. Although this theory has led to significant results when the images to compare are characterized by feature points whose correspondences from image to image can be obtained, in many practical scenarios such correspondences are not available. We thus focus on unlabeled data.

The majority of the methods that cope with unlabeled points focus on representing a “blob”, i.e., a shape that consists of a connected set of points. A number of techniques, usually called region-based, describe these shapes by using moment descriptors, e.g., geometrical [2], Legendre [3], Zernike [3, 4], or Tchebichef [5]. Other approaches, called contour-based, represent the boundary of the shape using, e.g., curvature scale space [6], wavelets [7], contour displacements [8], splines [9], or Fourier descriptors [10, 11]. Some of these representations exhibit the desired invariance to geometric transformations, but they are restricted to shapes well described by closed contours.

In image analysis, shape cues come primarily from image edges. Since, in general, it is hard to extract complete contours when dealing with real images, researchers developed local shape descriptors that, at each point of the shape, capture the relative distribution of the remaining points, e.g., shape contexts [12] and distance multisets [13]. Although these local representations can deal with contour discontinuities, they do not deal with general shapes and geometric transformations. In this work, we seek representations for shapes characterized by arbitrary sets of points.


A number of approaches to deal with shapes described by general sets of unlabeled points are motivated by the need to register the corresponding images, i.e., to compute the rigid transformation that best “aligns” them. The majority of these registration methods are inspired by the fact that the rigid transformation is easily computed when the labels, i.e., the point correspondences, are known (the Procrustes Matching problem). To cope with unlabeled points, they develop iterative algorithms that compute, in alternate steps, the rigid registration parameters and the point correspondences. One of the most widely known examples is the Iterative Closest Point (ICP) algorithm [14]. More recently, other researchers have proposed statistical methods that use “soft” correspondences [15, 16, 17], leading to Expectation-Maximization (EM)-like two-step iterative algorithms. Although these methods have succeeded, even in challenging scenarios, e.g., shape part decomposition [16] or nonrigid registration [15], they have the limitations of iterative algorithms, including uncertain convergence and sensitivity to initialization.

The relevance of being able to deal with unlabeled data in learning tasks, i.e., the relevance of permutation-invariant representations, has been recently pointed out [18, 19, 20]. In these works, the invariance to permutation is explicitly constructed, i.e., the permutation is factored out, after being computed as the solution of a convex optimization problem. However, this formulation does not deal with geometric transformations such as rotation. In this work, rather than attempting to compute the permutation between two sets of points describing the shapes to compare, we propose a new permutation-invariant representation for sets of 2D points. We represent a 2D shape by what we call its analytic signature (ANSIG), an analytic function defined over the complex plane. We show that shapes that differ by a re-ordering of the set of landmarks have the same ANSIG. Thus, the ANSIG representation is permutation-invariant. Furthermore, we show that this representation enables discriminating different shapes, i.e., different shapes are represented by different analytic functions.

Under our approach, shape-based classification boils down to comparing the ANSIG of a candidate shape with the ones in a database. As any analytic function, the ANSIG of a shape is completely described by the values it takes on a closed contour on the complex plane. Thus, we store an ANSIG by sampling it on the unit-circle. To compare ANSIGs, it suffices to measure the difference between the vectors collecting the corresponding samples.

The ANSIG representation is not invariant with respect to geometric transformations such as translation, rotation, and scale. However, an adequate pre-processing step factors out translation and scale. Although rotation cannot be factored out, we show that the rotation that best aligns two shapes is easily obtained from their ANSIGs. In fact, the rotation that minimizes the above-mentioned error measure is obtained in a computationally simple way by using FFTs.

In many situations, the points describing a shape are obtained automatically, e.g., by detecting edges in an image. It is then common that two sets of points to compare, besides being noisy, have distinct cardinality (particularly when dealing with images of different sizes/resolutions).

In this work, we illustrate the invariance properties of the ANSIG representation using synthetic shapes. To demonstrate the usefulness of this representation in shape-based image classification, we developed a real-life image recognition system composed of a PC running a Matlab software package and a webcam acquiring (low quality) images, described in [21]. We describe the performance of this system in object recognition and automatic trademark retrieval applications.

Contributions

ANSIG. A new shape representation scheme that has the distinguishing feature of being invariant to the permutation of the landmarks, i.e., to the labels of the points representing the shape. Additionally, the ANSIG representation copes with shape-preserving geometric transformations and with point sets of different cardinality.

Analytic functions for shape representation. We propose the use of analytic functions as a new paradigm for 2D shape representation. Besides ANSIG, we also present the polynomial signature, in order to stimulate the search for new representations with similar structure.

Theoretical derivation of the invariance properties of the ANSIG. We provide the proof for the invariance properties of the ANSIG. Namely, we prove the maximal invariance of the ANSIG with respect to permutation, translation, and scale, and its equivariance with respect to rotation. We also prove the invariance of the ANSIG with respect to the replication of the landmarks.

System for automatic shape classification. A system to perform automatic shape classification, described in [21], composed of a Matlab software package and a webcam. The shapes extracted from the acquired images are represented by their ANSIGs, making the system able to deal with images containing arbitrary shapes. The system can be used for several applications, e.g., automatic trademark retrieval, as illustrated.

Partial versions of this work appear in [22, 23, 24]. The complete description of the work is in [21]. This document is essentially based on [22]. The system for automatic shape classification was patented [25] and demonstrated in public several times [26, 27, 28].

2. Permutation invariance: the analytic signature of a shape

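We consider that a 2D shape is a set of $n$ unlabeled points in the complex plane $\mathbb{C}$, thus described by a complex vector

$$
z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}
  = \begin{bmatrix} x_1 + j y_1 \\ x_2 + j y_2 \\ \vdots \\ x_n + j y_n \end{bmatrix}
  \in \mathbb{C}^n. \tag{1}
$$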

However, as the ordering of the landmarks is irrelevant, the choice in (1) is not unique: the same shape is equivalently represented by any vector in the set

$$
\{\Pi z : \Pi \in \Pi(n)\}, \tag{2}
$$

where $\Pi(n)$ denotes the set of $n \times n$ permutation matrices.

Using the language of group theory, a shape is seen as a point in the quotient space $\mathbb{C}^n/\Pi(n)$. Here, $\Pi(n)$ is a group under matrix multiplication, and it acts on $\mathbb{C}^n$ through the map

$$
\Pi(n) \times \mathbb{C}^n \ni (\Pi, z) \mapsto \Pi z \in \mathbb{C}^n. \tag{3}
$$

This group action induces a partition of $\mathbb{C}^n$ into disjoint orbits, where each orbit collects all the possible ways of representing the same shape vector, i.e., it is the set in (2). Each shape corresponds to an orbit and the quotient space $\mathbb{C}^n/\Pi(n)$ is the set of orbits, i.e., the space of shapes. The canonical map, also referred to as the quotient map,

$$
\pi : \mathbb{C}^n \to \mathbb{C}^n/\Pi(n), \qquad z \mapsto \pi(z), \tag{4}
$$

maps each point to its orbit. In our case, $\pi$ maps each vector $z \in \mathbb{C}^n$ to the shape it represents, $\pi(z) \in \mathbb{C}^n/\Pi(n)$.

From the definition above, two vectors $z, w \in \mathbb{C}^n$ that are not related by a permutation will be mapped by $\pi$ to distinct points in the quotient space, i.e., $\pi(z) \neq \pi(w)$. This is usually referred to as a maximal invariance property, meaning that $\pi(z) = \pi(w)$ if and only if $z$ and $w$ represent the same shape. Although this is a property we look for (the map $\pi$ detects whether or not $z$ and $w$ represent the same shape), this mechanism is hardly implementable, due to the rather abstract nature of the quotient space and map.

Definition of analytic signature (ANSIG). We now develop a version of the objects introduced above, which is suitable for use in practice. We propose to replace the quotient map $\pi : \mathbb{C}^n \to \mathbb{C}^n/\Pi(n)$ by a map $a : \mathbb{C}^n \to \mathcal{A}$, where

$$
\mathcal{A} := \{ f : \mathbb{C} \to \mathbb{C} \,:\, f \text{ is analytic} \}. \tag{5}
$$

Like the quotient map $\pi$, our surrogate map $a$, which maps shape vectors in $\mathbb{C}^n$ to analytic functions on the complex plane, will exhibit maximal invariance with respect to the permutation group $\Pi(n)$. This means that $a$ maps $z$ and $w$ to the same analytic function if and only if $z = \Pi w$, for some permutation matrix $\Pi$, i.e., if and only if $z$ and $w$ describe the same shape. We refer to the analytic function $a(z, \cdot) : \mathbb{C} \to \mathbb{C}$ as the analytic signature (ANSIG) of the shape described by $z$.

In particular, we propose the ANSIG map $a : \mathbb{C}^n \to \mathcal{A}$, $z \mapsto a(z, \cdot)$, given by

$$
a(z, \xi) := \frac{1}{n} \sum_{m=1}^{n} e^{z_m \xi}, \tag{6}
$$

where $\xi$ is a dummy complex variable. We will see that the ANSIG just defined exhibits properties that make it adequate for shape representation, in particular, the maximal invariance with respect to the permutation group.
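The definition in (6) can be checked numerically. The following is a minimal sketch, assuming a hypothetical four-landmark shape and an arbitrary evaluation point; the function name `ansig` and the toy data are illustrative choices, not part of the original system.

```python
import cmath
import random

def ansig(z, xi):
    """Evaluate a(z, xi) = (1/n) * sum_m exp(z_m * xi), as in (6)."""
    return sum(cmath.exp(zm * xi) for zm in z) / len(z)

# Hypothetical toy shape: four landmarks in the complex plane.
shape = [0.3 + 0.1j, -0.2 + 0.4j, 0.1 - 0.5j, -0.4 - 0.2j]

# Same landmarks with a different labeling (fixed seed for repeatability).
shuffled = shape[:]
random.Random(0).shuffle(shuffled)

xi = 0.7 - 0.3j  # arbitrary evaluation point
# The two values agree up to floating-point rounding.
print(abs(ansig(shape, xi) - ansig(shuffled, xi)))
```

Because the sum in (6) does not depend on the order of its terms, any relabeling of the landmarks gives the same value at every evaluation point.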

Maximal invariance of the ANSIG. Invariance of $a$ with respect to the permutation group is clear: the sum in (6) does not depend on the order of its terms, so $a(z, \cdot) = a(w, \cdot)$ whenever $z, w \in \mathbb{C}^n$ are related by a permutation, i.e., whenever $z = \Pi w$ for some $\Pi \in \Pi(n)$.

To establish the maximal invariance, consider two vectors $z = [\, z_1 \; z_2 \; \cdots \; z_n \,]^T$ and $w = [\, w_1 \; w_2 \; \cdots \; w_n \,]^T$ that have equal ANSIGs, $a(z, \cdot) = a(w, \cdot)$. We will show that $z$ and $w$ represent the same shape, i.e., that they differ only by a permutation of their entries. Since the analytic functions $a(z, \cdot)$ and $a(w, \cdot)$ are equal, their derivatives of orders 1 through $n$ at the origin also coincide:

$$
\left. \frac{d^k}{d\xi^k}\, a(z, \xi) \right|_{\xi=0}
= \left. \frac{d^k}{d\xi^k}\, a(w, \xi) \right|_{\xi=0},
\qquad k = 1, 2, \ldots, n.
$$

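Using the definition of the ANSIG $a$ in (6), the $k$-th derivative at the origin equals $\frac{1}{n}\sum_{m=1}^{n} z_m^k$, so this system of equations is written in terms of the entries of $z$ and $w$ as

$$
\begin{aligned}
z_1 + z_2 + \cdots + z_n &= w_1 + w_2 + \cdots + w_n \\
z_1^2 + z_2^2 + \cdots + z_n^2 &= w_1^2 + w_2^2 + \cdots + w_n^2 \\
&\;\;\vdots \\
z_1^n + z_2^n + \cdots + z_n^n &= w_1^n + w_2^n + \cdots + w_n^n .
\end{aligned}
$$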

This set of equalities implies that the polynomials $p(t) = (t - z_1)(t - z_2) \cdots (t - z_n)$ and $q(t) = (t - w_1)(t - w_2) \cdots (t - w_n)$ are identical [29]. In particular, $p(t)$ and $q(t)$ share the same system of roots (including their multiplicities), thus the (multi-)sets $\{z_1, z_2, \ldots, z_n\}$ and $\{w_1, w_2, \ldots, w_n\}$ are equal. This way we conclude that the vectors $z$ and $w$ are equal, up to a permutation, thus proving the maximal invariance of our ANSIG map $a$ in (6).
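The step from equal power sums to identical polynomials can be made concrete with Newton's identities, which recover the coefficients of the monic polynomial from the power sums alone. The sketch below is an illustration under that assumption (the source cites [29] for this step without spelling out the construction); the function names are hypothetical.

```python
def power_sums(z):
    """Return [p_1, ..., p_n] with p_k = sum_m z_m**k."""
    n = len(z)
    return [sum(zm ** k for zm in z) for k in range(1, n + 1)]

def monic_coeffs_from_power_sums(p):
    """Coefficients of prod_m (t - z_m), highest degree first.

    Uses Newton's identities: k*e_k = sum_{i=1..k} (-1)**(i-1) * e_{k-i} * p_i,
    where e_k are the elementary symmetric polynomials and e_0 = 1.
    """
    n = len(p)
    e = [1 + 0j]
    for k in range(1, n + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(s / k)
    return [(-1) ** k * e[k] for k in range(n + 1)]

# Two orderings of the same (hypothetical) landmarks give the same polynomial.
z = [1 + 1j, 2 + 0j, -1j]
w = [z[2], z[0], z[1]]
print(monic_coeffs_from_power_sums(power_sums(z)))
print(monic_coeffs_from_power_sums(power_sums(w)))
```

Since the coefficients depend on the landmarks only through the power sums, two vectors with equal power sums of orders 1 through n necessarily define the same polynomial, and hence the same multiset of roots.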

Storing the ANSIG. The ANSIG map $a : \mathbb{C}^n \to \mathcal{A}$ in (6) maps shape vectors to analytic functions on the complex plane. As the space of analytic functions $\mathcal{A}$ is infinite-dimensional, our ANSIG may seem inadequate for a computer implementation. However, a major consequence of Cauchy’s integral formula is that any analytic function $f$ is unambiguously determined by the values it takes on a simple closed contour, see, e.g., [30]. Choosing this contour to be the unit-circle $S^1$ (the circle of radius 1 centered at the origin), the analytic function $f$ is uniquely determined by $\{ f(e^{j\varphi}) : \varphi \in [0, 2\pi] \}$. In practice, this is approximated by sampling, i.e., by considering the values of $f$ on a finite set of $K$ points in the unit-circle, say $\{ 1, W_K, W_K^2, \ldots, W_K^{K-1} \}$, where

$$
W_K := e^{j \frac{2\pi}{K}}. \tag{7}
$$

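In summary, we approximate the ANSIG map $a : \mathbb{C}^n \to \mathcal{A}$ in (6) by its discrete counterpart $a_K : \mathbb{C}^n \to \mathbb{C}^K$, $z \mapsto a_K(z)$, given by

$$
a_K(z) := \begin{bmatrix} a(z, 1) \\ a(z, W_K) \\ a(z, W_K^2) \\ \vdots \\ a(z, W_K^{K-1}) \end{bmatrix} \in \mathbb{C}^K. \tag{8}
$$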
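As a rough illustration of (7) and (8), the sampled ANSIG can be computed directly from the definition. This is a minimal sketch with a hypothetical function name and a small K chosen for readability; it makes no attempt at efficiency.

```python
import cmath

def ansig_samples(z, K):
    """Return [a(z, W^0), ..., a(z, W^(K-1))] with W = exp(2*pi*j/K), as in (8)."""
    n = len(z)
    samples = []
    for i in range(K):
        xi = cmath.exp(2j * cmath.pi * i / K)  # i-th sample point on the unit circle
        samples.append(sum(cmath.exp(zm * xi) for zm in z) / n)
    return samples

# Hypothetical toy shape sampled at K = 8 points.
shape = [0.3 + 0.1j, -0.2 + 0.4j, 0.1 - 0.5j, -0.4 - 0.2j]
for value in ansig_samples(shape, 8):
    print(value)
```

Each sample costs n exponentials, so the direct evaluation takes on the order of nK operations for a shape with n landmarks.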

3. Quotienting out shape-preserving geometric transformations

We now address how the ANSIG representation handles shape-preserving geometric transformations, such as translation, rotation and scale. As introduced in the previous section, a shape is an orbit generated by a group $G$ of pertinent transformations. We started by considering the permutation group $G = \Pi(n)$ and provided a maximal invariant with respect to this group action, the ANSIG $a$ in (6). In fact, the ANSIG map $a$ is such that $a(z, \cdot) = a(w, \cdot)$ if and only if $z$ and $w$ differ by a permutation. Building on this result, we now enlarge the group $G$ to also accommodate shape-preserving geometric transformations.

Two vectors $z, w \in \mathbb{C}^n$ represent the same shape if they are equal, up to, not only a permutation of their entries, but also a translation, rotation, and scale factor, affecting the set of $n$ points they describe in the plane. To take this into account, we consider the group of transformations $G = \Pi(n) \times \mathbb{R}_+ \times \mathbb{C} \times S^1$, defining the action of $G$ on $\mathbb{C}^n$ as the map $G \times \mathbb{C}^n \to \mathbb{C}^n$,

$$
(\Pi, \lambda, v, e^{j\theta}) \cdot z := \Pi \lambda e^{j\theta} z + v 1_n, \tag{9}
$$

where $1_n := [\, 1 \; 1 \; \cdots \; 1 \,]^T$ is the $n$-dimensional vector with all entries equal to 1. It is clear from (9) that the action of one element of the group $G$ on a shape vector $z$ corresponds to a permutation ($\Pi$), translation ($v$), rotation ($\theta$), and scaling ($\lambda$), applied to the shape described by $z$.

Deciding if two given vectors $z$, $w$ represent the same shape, i.e., if they are in the same orbit, corresponds to checking if the value of the optimization problem

$$
\min_{(\Pi, \lambda, v, e^{j\theta}) \in G} \left\| z - (\Pi, \lambda, v, e^{j\theta}) \cdot w \right\| \tag{10}
$$

is zero. However, to the best of our knowledge, solving (10) requires an exhaustive search over the group $\Pi(n)$, which has cardinality $n!$. Clearly, this is not feasible, even for moderate values of the number of landmarks, say $n = 100$ ($100! \simeq 10^{158}$). In contrast, our approach is to use the ANSIG representation to devise a scheme that circumvents the combinatorial search. This way, in our experiments, we were able to process shapes described by very large sets of points, e.g., with $n$ up to 40000.
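To give a feel for the combinatorial cost, the sketch below runs the exhaustive search for a tiny hypothetical example and then counts the decimal digits of 100!. It handles the permutation part of (10) only (no translation, rotation, or scale), and the function name is an illustrative assumption.

```python
import itertools
import math

def same_shape_bruteforce(z, w, tol=1e-12):
    """Exhaustive search over all n! orderings of w (permutation only)."""
    return any(
        all(abs(a - b) <= tol for a, b in zip(z, perm))
        for perm in itertools.permutations(w)
    )

# Hypothetical four-landmark example: w is a re-ordering of z.
z = [1 + 2j, -1j, 3 + 0j, 0.5 + 0.5j]
w = [3 + 0j, 0.5 + 0.5j, 1 + 2j, -1j]
print(same_shape_bruteforce(z, w))      # True: at most 4! = 24 orderings are tried

# Number of decimal digits of 100!, the search-space size for n = 100.
print(len(str(math.factorial(100))))    # 158
```

With four landmarks the search is instantaneous, but the number of candidate orderings grows factorially, which is what rules out this approach for realistic point sets.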

Translation and scale. We start by quotienting out all the transformations, except the rotation, through a map $\phi$. Then, we show that $\phi$ is equivariant with respect to rotations, which will enable a computationally simple scheme to detect equality of orbits. To factor out translation and scale, we make the simple pre-processing step of centering and normalizing the shape. This corresponds to considering the map $\phi : \mathbb{C}^n \to \mathcal{A}$, defined by

$$
\phi(z, \cdot) := a\!\left( \sqrt{n}\, \frac{z - \bar{z}}{\| z - \bar{z} \|},\; \cdot \right), \tag{11}
$$

where $a$ is the ANSIG (6), $\bar{z} := \frac{1}{n} (1_n^T z)\, 1_n$ is a vector with all entries equal to the mean value of $z$, and $\|\cdot\|$ denotes the 2-norm, $\|v\| := \sqrt{v^H v}$, where $v^H$ is the Hermitian (i.e., conjugate) transpose of $v$. The reader may wonder why the factor $\sqrt{n}$ appears in (11). In fact, this factor has nothing to do with factoring out translation or scale: it is constant for shapes described by a fixed number of landmarks, $n$. The motivation for the factor $\sqrt{n}$ in (11) is precisely to enable dealing well with shapes described by different numbers of landmarks, providing robustness to over- or under-sampling of the shapes.

Using the maximal invariance of the ANSIG $a$ with respect to permutation, it is easy to show that the map $\phi$ in (11) is a maximal invariant with respect to permutation, translation, and scale, i.e., that $\phi(z, \cdot) = \phi(w, \cdot)$ if and only if $z = (\Pi, \lambda, v, 1) \cdot w$, for some $(\Pi, \lambda, v, 1) \in G$. Thus, the map $\phi$ in (11) quotients out all transformations in $G$, except the rotation.
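The invariance of the map in (11) to permutation, translation, and scale can be checked on a small example. The sketch below assumes a hypothetical toy shape with at least two distinct landmarks (so that the norm in the denominator is nonzero); the names `normalize` and `phi` are illustrative.

```python
import cmath
import math

def normalize(z):
    """Center z and scale it so that the sum of squared magnitudes equals n."""
    n = len(z)
    mean = sum(z) / n
    centered = [zm - mean for zm in z]
    norm = math.sqrt(sum(abs(c) ** 2 for c in centered))  # 2-norm of the centered vector
    return [math.sqrt(n) * c / norm for c in centered]

def phi(z, xi):
    """Evaluate the map in (11): the ANSIG of the centered, normalized shape."""
    u = normalize(z)
    return sum(cmath.exp(um * xi) for um in u) / len(u)

# Hypothetical toy shape and a re-ordered, scaled, translated copy of it.
z = [0.3 + 0.1j, -0.2 + 0.4j, 0.1 - 0.5j, -0.4 - 0.2j]
w = [2.5 * zm + (3 - 4j) for zm in reversed(z)]

xi = 0.7 - 0.3j  # arbitrary evaluation point
print(abs(phi(z, xi) - phi(w, xi)))  # close to zero, up to rounding
```

Centering cancels the translation and the division by the norm cancels the scale factor, so both inputs reduce to the same normalized point set before the ANSIG is evaluated.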

Rotation. We now show that, although the map $\phi$ is not invariant to rotations, it is equivariant. Let us begin by looking at how the rotation of a shape affects its ANSIG. From the definition of the ANSIG $a$ in (6), it follows that

$$
a(e^{j\theta} z, \xi)
= \frac{1}{n} \sum_{k=1}^{n} e^{e^{j\theta} z_k \xi}
= \frac{1}{n} \sum_{k=1}^{n} e^{z_k (e^{j\theta} \xi)}
= a(z, e^{j\theta} \xi). \tag{12}
$$

Thus, the ANSIG of the rotated shape is a rotated version (in the complex plane) of the ANSIG of the original shape.
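Property (12) is easy to verify numerically. The following minimal sketch uses a hypothetical shape, rotation angle, and evaluation point.

```python
import cmath

def ansig(z, xi):
    """Evaluate a(z, xi) = (1/n) * sum_m exp(z_m * xi), as in (6)."""
    return sum(cmath.exp(zm * xi) for zm in z) / len(z)

z = [0.3 + 0.1j, -0.2 + 0.4j, 0.1 - 0.5j, -0.4 - 0.2j]  # hypothetical landmarks
theta = 0.9                                              # arbitrary rotation angle
rot = cmath.exp(1j * theta)
xi = 0.7 - 0.3j                                          # arbitrary evaluation point

lhs = ansig([rot * zm for zm in z], xi)  # ANSIG of the rotated shape at xi
rhs = ansig(z, rot * xi)                 # ANSIG of the original shape at the rotated point
print(abs(lhs - rhs))                    # close to zero, up to rounding
```

Rotating the landmarks or rotating the evaluation point gives the same exponent in every term of the sum, which is why the two values coincide.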

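This property enables the derivation of the desired equivariance of $\phi$, through the following chain of equalities:

\begin{align}
\phi\big( (\Pi, \lambda, v, e^{j\theta}) \cdot z,\; \xi \big)
&= \phi\big( \Pi \lambda e^{j\theta} z + v 1_n,\; \xi \big) \tag{13} \\
&= a\!\left( \sqrt{n}\, \frac{\Pi e^{j\theta} (z - \bar{z})}{\| z - \bar{z} \|},\; \xi \right) \tag{14} \\
&= a\!\left( \sqrt{n}\, \frac{e^{j\theta} (z - \bar{z})}{\| z - \bar{z} \|},\; \xi \right) \tag{15} \\
&= a\!\left( \sqrt{n}\, \frac{z - \bar{z}}{\| z - \bar{z} \|},\; e^{j\theta} \xi \right) \tag{16} \\
&= \phi(z, e^{j\theta} \xi), \tag{17}
\end{align}

where: (13) comes from the definition of the group action (9); (14) uses the definition of the map $\phi$ in (11), in which centering removes the translation $v$ and normalization removes the scale $\lambda$; (15) results from the permutation invariance of the ANSIG; (16) comes from property (12); and (17) uses again the definition of $\phi$.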

Since the analytic function $\phi(z, \cdot)$ is uniquely determined by the values it takes on a closed contour, i.e., by its restriction to the unit-circle, $\phi_{S^1} : \mathbb{C}^n \times S^1 \to \mathbb{C}$, $\phi_{S^1}(z, \cdot) := \phi(z, \cdot)$, it will be useful to state the equivariance (17) in terms of this restriction:

$$
\phi_{S^1}\big( (\Pi, \lambda, v, e^{j\theta}) \cdot z,\; \xi \big) = \phi_{S^1}\big( z,\; e^{j\theta} \xi \big). \tag{18}
$$

This leads to the following short summary of our results: vectors $z$ and $w$ describe the same shape, i.e., they differ by permutation, translation, rotation, and scale, if and only if

$$
\phi_{S^1}(z, \cdot) = \phi_{S^1}\big( w,\; e^{j\theta}\, \cdot \big), \quad \text{for some } \theta \in [0, 2\pi]. \tag{19}
$$

4. Implementation

As mentioned in Section 2, a practical way to store the ANSIG $a : \mathbb{C}^n \to \mathcal{A}$ is through its unit-circle sampled version, $a_K : \mathbb{C}^n \to \mathbb{C}^K$, introduced in (8). Thus, in practice, the map $\phi_{S^1}$ is approximated by $\phi_K : \mathbb{C}^n \to \mathbb{C}^K$, defined as

$$
\phi_K(z) := a_K\!\left( \sqrt{n}\, \frac{z - \bar{z}}{\| z - \bar{z} \|} \right), \tag{20}
$$

where $K$, the number of samples in the unit-circle, is chosen by the user. In all our experiments, we used $K = 512$.

When the rotation angle matches one of those of the samples in the unit-circle, i.e., when $e^{j\theta} = W_K^k$, for some $k$, the equivariance of $\phi_{S^1}$ in (18) transfers directly to $\phi_K$:

$$
\phi_K\big( (\Pi, \lambda, v, e^{j\theta}) \cdot z \big) = \phi_K(z) \;\mathrm{mod}\; k, \tag{21}
$$

where $\mathrm{mod}\; k$ denotes a $k$-step cyclic shift. Thus, our test for deciding if two shape vectors $z$ and $w$ correspond to the same shape, expressed in (19), boils down to checking if $\phi_K(w)$ is a cyclic-shifted version of $\phi_K(z)$. Naturally, with $K$ sufficiently large, $e^{j\theta} \simeq W_K^k$, for some $k$, and (21) holds for practical purposes. The shape similarity test can then be carried out by checking if the following error is below a small threshold:

$$
\min_{k = 0, 1, \ldots, K-1} \big\| \phi_K(z) - \phi_K(w) \;\mathrm{mod}\; k \big\|^2. \tag{22}
$$

To evaluate (22), we must find the cyclic shift $k^*$ that best “aligns” $\phi_K(z)$ and $\phi_K(w)$. Solving this by exhaustive search leads to an algorithm of complexity $O(K^2)$. We reduce the complexity by using FFTs. Denote the Discrete Fourier Transform (DFT) of $v \in \mathbb{C}^K$ by $\hat{v} \in \mathbb{C}^K$,

$$
\hat{v} := D_K^H v, \tag{23}
$$

where $D_K$ is the $K \times K$ DFT matrix, see, e.g., [31]. Using the facts that the DFT is a unitary operator ($D_K^H D_K = I_K$) and that the DFT of a $k$-cyclic-shifted vector equals the DFT of the original vector, multiplied elementwise by $\sqrt{K} d_k$, where $d_k$ is the $k$-th column of $D_K$ [31], the minimizer of (22) is written in the frequency domain as

$$
k^* = \arg\min_k \left\| \hat{\phi}_K(z) - \hat{\phi}_K(w) \odot \sqrt{K}\, d_k \right\|^2, \tag{24}
$$

where $\odot$ is the Schur-Hadamard (i.e., elementwise) product. Expanding the squared norm in (24) and removing the terms that do not depend on $k$, we get ($\mathrm{Re}$ is the real part and the overline denotes complex conjugation):

$$
k^* = \arg\max_k \; \mathrm{Re}\left\{ d_k^H \left( \hat{\phi}_K(z) \odot \overline{\hat{\phi}_K(w)} \right) \right\}. \tag{25}
$$

Expression (25) encodes a computationally simple procedure to “align” $\phi_K(z)$ and $\phi_K(w)$: i) compute their DFTs; ii) compute the DFT of the elementwise product of these DFTs; iii) locate the entry with largest real part. Since the complexity of computing a DFT using the Fast Fourier Transform (FFT) algorithm is $O(K \log K)$, the overall complexity of our procedure is $O(K + 3K \log K)$.
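The three steps above can be sketched with a small self-contained FFT. This is an illustration under stated assumptions rather than the original Matlab implementation: it uses the common unnormalized DFT convention and an inverse FFT for the last transform (which differs from the unitary DFT matrix of (23) only by scaling and indexing conventions), the radix-2 recursion requires the length to be a power of two, and a shift by k is taken to mean y[(i + k) mod K]. All function names are hypothetical.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for f in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * f / n) * odd[f]
        out[f] = even[f] + t
        out[f + n // 2] = even[f] - t
    return out

def ifft(X):
    """Inverse FFT via conjugation: ifft(X) = conj(fft(conj(X))) / n."""
    n = len(X)
    return [v.conjugate() / n for v in fft([u.conjugate() for u in X])]

def best_shift(x, y):
    """Return the k that maximizes Re sum_i conj(x[i]) * y[(i + k) % K].

    This is the same k that minimizes ||x - shift_k(y)||^2, because the two
    squared norms in the expansion do not depend on k.
    """
    X, Y = fft(x), fft(y)
    corr = ifft([a.conjugate() * b for a, b in zip(X, Y)])  # circular cross-correlation
    return max(range(len(x)), key=lambda k: corr[k].real)

# Hypothetical sample vector and a copy of it delayed by 3 positions.
x = [1 + 0j, 2j, -3 + 0j, 4 + 1j, 0.5 + 0j, -2 - 2j, 1j, 3 + 0j]
y = [x[(i - 3) % 8] for i in range(8)]
print(best_shift(x, y))  # shifting y by 3 positions recovers x
```

Three transforms of length K plus an elementwise product replace the K inner products of the exhaustive search, which is where the O(K log K) cost comes from.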

Naturally, the appropriate measure of ANSIG similarity need not be the one in (22). In fact, our shape-based image classification experiments have shown that a better choice is the (cosine of the) angle between the ANSIG vectors,

$$
\mathrm{similarity}(z, w) := \frac{\left| \phi_K^H(z)\, \big( \phi_K(w) \;\mathrm{mod}\; k^* \big) \right|}{\| \phi_K(z) \| \, \| \phi_K(w) \|}. \tag{26}
$$

5. Experiments

We start by describing experiments that illustrate the behavior of the ANSIG representation, emphasizing its invariance and its capability to deal with shapes described by sets of points of distinct cardinality. Then, we report the use of the ANSIG representation in shape-based image classification, emphasizing its robustness to noise.

Invariance of the ANSIG. We use the shapes represented in Fig. 1, where the shape on the right plot is obtained by translating, rotating, and resizing the one on the left. The vectors containing the coordinates of the points of these shapes also differ by a random permutation of their entries (obviously, the plots in Fig. 1 do not capture this difference).

[Figure 1: two panels, titled "Shape 1" (left) and "Shape 2" (right).]

Figure 1. Two shapes that differ by shape-preserving geometric transformations. The ordering of the points describing the shapes is also distinct; thus, although the left and right images represent the same shape, this is not trivially inferred from the stored data.

In Fig. 2, we represent the magnitude (left plot) and the phase (right plot) of the discretized ANSIGs ($\phi_K$ in (20)) of the two shapes in Fig. 1. Note that both the magnitude and the phase plots of the two shapes only differ by a (circular) translation. This is in agreement with our derivations, see expressions (18) and (21): the rotation of a shape induces the same rotation on its ANSIG and this rotation is seen as a (circular) shift of the vector collecting the ANSIG samples on the unit-circle. Since aligning ANSIGs is very simple, as described in Section 4, see (25), our representation fully captures the similarity of the shapes in Fig. 1, factoring out geometric transformations and point labeling.

[Figure 2: two panels plotting abs(ANSIG) (left) and angle(ANSIG) (right) against θ (rad), each with one curve for Shape 1 and one for Shape 2.]

Figure 2. Magnitude and phase of the ANSIGs of the two shapes in Fig. 1. Note that, although these shapes differ by position, rotation, scale, and ordering of the points, their ANSIGs only differ by a (circular) translation that is easily computed.

The experiment just described used shapes described by sets of points of the same cardinality. We now show that ANSIG also deals well with shapes described by sets of (very) distinct cardinality. This characteristic is important in practice, to enable comparing shapes obtained from images by an automatic process (due to the pixelization, the same shape often leads to point sets of different cardinality, when seen at different scales). We use the shapes represented in Fig. 3: both shapes represent the same object but the shape on the right is described by just 30% of the points of the one on the left. In Fig. 4, we represent the ANSIGs of both shapes. Clearly, the ANSIGs of the differently sampled shapes almost coincide, as desired. This robustness to distinct sampling density was anticipated in Section 3: it motivated the factor $\sqrt{n}$ in the normalization (11).
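The role of the factor $\sqrt{n}$ can be seen in an idealized case: replicating every landmark of a shape, which the contributions list as an exact invariance. The sketch below uses a hypothetical toy shape and illustrative function names. Subsampling, as in Fig. 3, is not an exact replication, so in that case the samples agree only approximately.

```python
import cmath
import math

def phi_samples(z, K):
    """Center and normalize z as in (11), then sample its ANSIG at K unit-circle points."""
    n = len(z)
    mean = sum(z) / n
    centered = [zm - mean for zm in z]
    norm = math.sqrt(sum(abs(c) ** 2 for c in centered))
    u = [math.sqrt(n) * c / norm for c in centered]  # the sqrt(n) factor of (11)
    return [
        sum(cmath.exp(um * cmath.exp(2j * cmath.pi * i / K)) for um in u) / n
        for i in range(K)
    ]

z = [0.3 + 0.1j, -0.2 + 0.4j, 0.1 - 0.5j, -0.4 - 0.2j]  # hypothetical landmarks
z3 = [zm for zm in z for _ in range(3)]                  # each landmark replicated 3 times

a, b = phi_samples(z, 16), phi_samples(z3, 16)
print(max(abs(p - q) for p, q in zip(a, b)))  # close to zero, up to rounding
```

Replicating each point three times multiplies the norm of the centered vector by the square root of 3, and the factor $\sqrt{n}$ grows by the same amount, so the normalized landmarks, and hence the samples, are unchanged.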

[Figure 3: two panels, titled "Shape 1" (left) and "Shape 2" (right).]

Figure 3. Two shapes that differ in sampling density.

[Figure 4: two panels plotting abs(ANSIG) (left) and angle(ANSIG) (right) against θ (rad), each with one curve for Shape 1 and one for Shape 2.]

Figure 4. Magnitude and phase of the ANSIGs of the two shapes in Fig. 3. Although the numbers of points describing these shapes differ significantly, their ANSIGs are similar.

Shape-based classification. We now use the ANSIG representation in automatic classification. The shape database is a collection of prototype shapes (one ANSIG per prototype). The classification is simply made by comparing the ANSIG of a test shape with the ANSIGs in the database and selecting the most similar one, according to (26).
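The classification rule can be sketched end to end on a toy database. Everything below is an assumption made for illustration: the prototypes are the vertices of regular polygons, K is set to 64 instead of the 512 used in the experiments, and the best cyclic shift is found by exhaustive search rather than with FFTs.

```python
import cmath
import math

K = 64  # number of unit-circle samples in this sketch

def phi_samples(z):
    """Center, normalize, and sample the ANSIG on the unit circle, as in (20)."""
    n = len(z)
    mean = sum(z) / n
    centered = [zm - mean for zm in z]
    norm = math.sqrt(sum(abs(c) ** 2 for c in centered))
    u = [math.sqrt(n) * c / norm for c in centered]
    return [
        sum(cmath.exp(um * cmath.exp(2j * cmath.pi * i / K)) for um in u) / n
        for i in range(K)
    ]

def similarity(x, y):
    """Cosine-type similarity of (26) after the best cyclic shift (exhaustive search)."""
    def corr(k):
        return sum(a.conjugate() * y[(i + k) % K] for i, a in enumerate(x))
    k_star = max(range(K), key=lambda k: corr(k).real)
    nx = math.sqrt(sum(abs(a) ** 2 for a in x))
    ny = math.sqrt(sum(abs(b) ** 2 for b in y))
    return abs(corr(k_star)) / (nx * ny)

def classify(test_shape, prototypes):
    """Return the label of the prototype whose ANSIG is most similar to the test shape's."""
    x = phi_samples(test_shape)
    return max(prototypes, key=lambda label: similarity(x, phi_samples(prototypes[label])))

def regular_polygon(m):
    """Hypothetical prototype: the m vertices of a regular polygon."""
    return [cmath.exp(2j * cmath.pi * v / m) for v in range(m)]

prototypes = {"triangle": regular_polygon(3), "square": regular_polygon(4)}

# Test shape: a rotated, scaled, translated, and re-ordered triangle.
test_shape = [3 * cmath.exp(0.4j) * p + (5 - 2j) for p in reversed(regular_polygon(3))]
print(classify(test_shape, prototypes))
```

In a larger database the prototype ANSIGs would be computed once and stored, so that classifying a test shape requires only one ANSIG evaluation plus one alignment per prototype.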

To test the robustness to noise, we used a particularly challenging database of four geometric shapes (a circle, a hexagon, a square, and a triangle) that are difficult to distinguish in the presence of noise. In Fig. 5, we show some of the test shapes of our experiments. They are noisy (besides translated, rotated, scaled, and randomly point re-ordered) versions of the four geometric shapes. Note in particular how the circle and the hexagon become very similar when the noise level is high.

We tested the classifier with 2000 tests for each of the four geometric shapes, at each noise level. The results are summarized in Table 1. We obtained 100% correct classifications in all tests with SNR of approximately 28 dB or above (27.96 dB in Table 1). Shapes with this level of noise are displayed in the plots of the middle column of Fig. 5; they are far from being visually “clean”. From Table 1, we see that we only get a few classification errors for higher levels of noise and when dealing with circles or hexagons, which, as noted before, for such levels of noise, are in fact difficult to distinguish, even for humans, see the rightmost plots of the first two rows of Fig. 5.

[Figure 5: a grid of plots with four rows (one per shape) and three columns, titled "SNR=Inf dB", "SNR=27.96 dB", and "SNR=23.1 dB".]

Figure 5. Noisy test shapes for classification. From top to bottom: circle, hexagon, square, and triangle. Noise level increases from left to right. The various shapes exhibit distinct translations, rotations, and scales (besides point ordering). Note how the shapes become difficult to distinguish when the noise increases.

Table 1. Percentages of correct classifications of the shapes illustrated in Fig. 5 (2000 tests per shape and per noise level).

SNR [dB]    ≥ 27.96    26.02    24.44    23.1     21.94
Circle      100        99.95    99.9     99.45    97.3
Hexagon     100        100      99.9     99.85    99.25
Square      100        100      100      100      100
Triangle    100        100      100      100      100

MPEG-7 shape database. Although not the focus of this work, we now illustrate that the ANSIG representation can also be used when dealing with non-rigid distortions, if more than one prototype per class is included in the database. We use a subset of the MPEG-7 shape database [32, 33]. It has 18 classes, each containing 12 shapes that, although perceptually similar, are not geometrically equal. We increase shape variability by creating test shapes that are noisy versions of the MPEG-7 database subset (the noise levels are illustrated with an example in Fig. 6).

[Figure 6: three panels, titled "σ=0" (left), "σ=0.5" (middle), and "σ=1.5" (right).]

Figure 6. A shape from the MPEG-7 database (left image) and noisy versions for classification (middle and right images).

We stored the ANSIGs of the noiseless shapes and then classified a set of 43200 test shapes (200 for each sample shape of the MPEG-7 database subset), for each of the two noise levels illustrated in Fig. 6, using a simple 1-NN classifier, i.e., selecting the class with the most similar ANSIG. We got 100% correct classifications for both noise levels. See [21] for more details on this experiment and on another one that demonstrates robustness to noise with a clip-art shape database.

Automatic trademark retrieval. We now describe an experiment using real images of trademarks. We got, from the internet, logos that are well described by their shape. They are shown in Fig. 7, but not at their real scale: they range from 80×80 to 500×500 pixels. We processed each of these images with the Canny edge detector [34] and stored the ANSIGs of the corresponding edge maps in our database.

Figure 7. Trademark images obtained from the internet.

We printed the trademarks in Fig. 7 and photographed their paper version with a low quality digital camera, with several paper-camera positions, orientations, and distances. This way we got a set of challenging test images, where the candidate logos appear at different sizes and positions, see some examples in Fig. 8. Finally, we computed the edges of these test images, without any particular care to tune detection parameters, obtaining a set of test shapes.

Figure 8. Examples of trademark images to be classified.

Being edge maps of real photos, our test shapes exhibit distortions (relative to their versions in the database) that are not modeled explicitly by the ANSIG, such as missing points (when edges are not detected, e.g., due to inaccurate focusing), spurious points (when too many edges are detected, e.g., due to paper texture noticed at large zoom), or perspective distortion (when the camera image plane was not parallel to the paper). We classified these test shapes by proceeding as in the experiments above. Except in cases where the distortions just mentioned are dramatic (e.g., out-of-focus images), the classification was correct.
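To get a feel for how missing and spurious points affect a signature, one can degrade a point set and measure how far its sampled signature moves. The sketch below is only a way to probe this effect, not the sensitivity study reported in [21]. It assumes the ANSIG is a(ξ) = (1/n) Σ_m exp(z_m ξ) sampled on the unit circle after centering and scaling to unit RMS radius, models missing points as a random subset being dropped, and models spurious points as uniform samples inside the bounding box of the shape. All names (`degrade`, `drift`, the toy `ellipse` shape) are hypothetical.

```python
import cmath
import math
import random


def normalize(points):
    """Center the points and scale them to unit RMS radius (assumed normalization)."""
    mean = sum(points) / len(points)
    centered = [z - mean for z in points]
    rms = math.sqrt(sum(abs(z) ** 2 for z in centered) / len(centered))
    return [z / rms for z in centered]


def ansig_samples(points, K=64):
    """Sample the assumed ANSIG a(xi) = (1/n) sum_m exp(z_m xi) on the unit circle."""
    n = len(points)
    xis = [cmath.exp(2j * math.pi * k / K) for k in range(K)]
    return [sum(cmath.exp(z * xi) for z in points) / n for xi in xis]


def distance(a, b):
    """Euclidean distance between two sampled signatures."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))


def degrade(points, missing_fraction, n_spurious, rng):
    """Drop a random fraction of the points and add spurious ones in the bounding box."""
    n_drop = int(round(missing_fraction * len(points)))
    dropped = set(rng.sample(range(len(points)), n_drop))
    kept = [z for i, z in enumerate(points) if i not in dropped]
    xs = [z.real for z in points]
    ys = [z.imag for z in points]
    extra = [complex(rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
             for _ in range(n_spurious)]
    return kept + extra


def drift(points, missing_fraction, n_spurious, seed=0):
    """Distance between the signature of a shape and that of its degraded copy."""
    rng = random.Random(seed)
    reference = ansig_samples(normalize(points))
    degraded = degrade(points, missing_fraction, n_spurious, rng)
    return distance(reference, ansig_samples(normalize(degraded)))


def ellipse(n=120, a=40.0, b=20.0, center=60 + 50j):
    """Toy shape: points on an ellipse outline."""
    return [center + complex(a * math.cos(2 * math.pi * i / n),
                             b * math.sin(2 * math.pi * i / n)) for i in range(n)]


shape = ellipse()
for fraction, spurious in [(0.0, 0), (0.05, 2), (0.2, 5), (0.5, 20)]:
    print(fraction, spurious, drift(shape, fraction, spurious))
```

Comparing these drift values with the distances between different prototypes in a database gives a rough indication of how much degradation a nearest-neighbor classifier could tolerate.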

See [21] for other experiments, including shape-based object recognition and a study of the sensitivity to edge-detection failures and perspective effects.

6. Conclusion

We proposed a new method to represent 2D shapes, described by a set of points, or landmarks, in the plane. Our method is based on what we call the analytic signature (ANSIG) of the shape, whose most distinctive characteristic is its invariance to the way the landmarks are labeled. This makes ANSIG particularly suited to cope with shapes described by large sets of edge points in images. We illustrated its performance with a system for shape-based image classification.

We envisage paths for future research based on the ANSIG representation. For example, while in this work we store ANSIGs by sampling them on the unit circle, a topic that deserves further study is the adoption of different sampling schemes, e.g., the use of two or more circles, for robustness. Another topic that deserves attention is the extension of the ANSIG to the representation of 3D shapes. Also, the derivations in this work are targeted to the representation and comparison of complete shapes. However, in many practical scenarios, it is also necessary to recognize a set of points as being a part, i.e., a subset, of a given shape. A good challenge is then to adapt the ANSIG representation to deal with incomplete shapes.

References

[1] D. Kendall, D. Barden, T. Carne, and H. Le. Shape and Shape Theory. John Wiley and Sons, 1999.

[2] M. Hu. Visual pattern recognition by moment invariants. IRE Trans. on Information Theory, 8:179–187, 1962.

[3] M. Teague. Image analysis via the general theory of moments. Journal of the Optical Society of America, 70, 1980.

[4] A. Khotanzad and Y. Hong. Invariant image recognition by Zernike moments. IEEE T-PAMI, 12:489–497, 1990.

[5] R. Mukundan, S. Ong, and P. Lee. Image analysis by Tchebichef moments. IEEE T-IP, 10:1357–1364, 2001.

[6] F. Mokhtarian and A. Mackworth. Scale-based description and recognition of planar curves and two-dimensional shapes. IEEE T-PAMI, 8:34–43, 1986.

[7] G. Chuang and C. Kuo. Wavelet descriptor of planar curves: Theory and applications. IEEE T-IP, 5:56–70, 1996.

[8] T. Adamek and N. O’Connor. A multiscale representation method for nonrigid shapes with a single closed contour. IEEE T-PAMI, 14(5):742–753, 2004.

[9] P. Dierckx. Curve and Surface Fitting with Splines. Oxford Science Publications, 1995.

[10] D. Zhang and G. Lu. Study and evaluation of different Fourier methods for image retrieval. IJCV, 1234:33–49, 2005.

[11] I. Bartolini, P. Ciaccia, and M. Patella. Warp: Accurate retrieval of shapes using phase of Fourier descriptors and time warping distance. IEEE T-PAMI, 27(1):142–147, 2005.

[12] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. IEEE T-PAMI, 24(4):509–522, 2002.

[13] C. Grigorescu and N. Petkov. Distance sets for shape filters and shape recognition. IEEE T-IP, 12(10), 2003.

[14] P. Besl and N. McKay. A method for registration of 3-D shapes. IEEE T-PAMI, 14(2), 1992.

[15] H. Chui and A. Rangarajan. A new algorithm for non-rigid point matching. In IEEE CVPR, Hilton Head Island, South Carolina, USA, 2000.

[16] G. McNeill and S. Vijayakumar. Part-based probabilistic point matching using equivalence constraints. In Proc. of NIPS, Vancouver BC, Canada, 2006.

[17] G. McNeill and S. Vijayakumar. Hierarchical Procrustes matching for shape retrieval. In IEEE CVPR, NY, 2006.

[18] T. Jebara. Images as bags of pixels. In Proc. of the IEEE Int. Conf. on Computer Vision, Nice, France, 2003.

[19] T. Jebara. Kernelizing sorting, permutation and alignment for minimum volume PCA. In Proc. of Annual Conf. on Learning Theory, Banff, Alberta, Canada, 2004.

[20] P. Shivaswamy and T. Jebara. Permutation invariant SVMs. In Int. Conf. on Machine Learning, Pittsburgh PA, 2006.

[21] J. Rodrigues. ANSIG – An Analytic Signature for Permutation Invariant 2D Shape Representation. Master Thesis, Instituto Superior Tecnico, Lisbon, Portugal, September 2008.

[22] J. Rodrigues, P. Aguiar, and J. Xavier. ANSIG – An Analytical Signature for Permutation-Invariant Two-Dimensional Shape Representation. IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR’08), Anchorage AK, June 2008.

[23] J. Rodrigues, J. Xavier, and P. Aguiar. Classification of unlabeled point sets using ANSIG. IEEE Int. Conf. on Image Processing (ICIP’08), San Diego CA, October 2008.

[24] J. Rodrigues, P. Aguiar, and J. Xavier. ANSIG – An Analytical Signature for Arbitrary 2D Shapes (or Bags of Unlabeled Points). Submitted to IEEE Trans. on Pattern Analysis and Machine Intelligence, 2008.

[25] J. Rodrigues, P. Aguiar, and J. Xavier. System and Method for Shape Recognition. Patent filed with Portuguese Patent Office, PT 104003, March 2008.

[26] J. Rodrigues. ANSIG, uma representacao para formas 2D. Demonstracao ao vivo, nas 3as Jornadas de Inovacao, FIL, Lisbon, Portugal, November 2007.

[27] J. Rodrigues. ANSIG – An Analytic Signature for Permutation-Invariant Two-Dimensional Shape Representation. Real-life demonstration at the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR’08), Anchorage AK, June 2008.

[28] J. Rodrigues. Identificacao Automatica de Objectos: uma assinatura unica para formas bidimensionais. Demonstracao ao vivo, Ciencia 2008 – Encontro para a ciencia em Portugal, Fundacao Calouste Gulbenkian, Lisbon, Portugal, July 2008.

[29] R. Horn and C. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, UK, 1985.

[30] L. Ahlfors. Complex Analysis. McGraw-Hill, USA, 1978.

[31] A. Oppenheim, R. Schafer, and J. Buck. Discrete-Time Signal Processing. Prentice Hall, 1999.

[32] L. J. Latecki, R. Lakamper, and U. Eckhardt. Shape descriptors for nonrigid shapes with a single closed contour. Int. Conf. on Computer Vision and Pattern Recognition (CVPR’00), 424–429, Hilton Head Island SC, June 2000.

[33] T. Sebastian, P. Klein, and B. Kimia. Recognition of shapes by editing their shock graphs. IEEE T-PAMI, 26(5), 2004.

[34] J. Canny. A computational approach to edge detection. IEEE T-PAMI, 8:679–714, 1986.
