
Fuzzy Encoding For Image Classification Using Gustafson-Kessel Algorithm


Description

This paper presents a novel adaptation of fuzzy clustering and feature encoding for image classification. Visual word ambiguity has recently been successfully modeled by kernel codebooks, which provide an improvement in classification performance over the standard ‘Bag-of-Features’ (BoF) approach, which uses hard partitioning and crisp logic for the assignment of features to visual words. Motivated by this progress, we utilize fuzzy logic to model the ambiguity and combine it with clustering to discover fuzzy visual words. The feature descriptors of an image are encoded using the learned fuzzy membership function associated with each word. The codebook built using this fuzzy encoding technique is demonstrated to provide superior performance over BoF. We use the Gustafson-Kessel algorithm, which is an improvement over Fuzzy C-Means clustering and can adapt to local distributions. We evaluate our approach on several popular datasets and demonstrate that it consistently provides superior performance to the BoF approach.



Ashish Gupta, Richard Bowden

Centre for Vision, Speech, and Signal Processing, University of Surrey, Guildford, United Kingdom

Abstract

Feature vectors in visual descriptor space do not form naturally occurring clusters, but have been shown to exhibit visual ambiguity. ‘Bag-of-Features’ (BoF), the most popular approach, utilizes hard partitioning and so ignores the semantic continuum in this space. Kernel codebooks have been demonstrated to improve upon BoF by soft-partitioning descriptor space. Building on these results, this paper models visual ambiguity by formulating feature encoding as clustering with fuzzy logic. The Gustafson-Kessel algorithm computes hyper-ellipsoidal clusters, each with a learnt mean and covariance. The approach is demonstrated to provide better classification performance than BoF on several popular datasets.

Visual Word Ambiguity

Figure: Hard partitioning leads to: suitable (triangle); uncertain (square); and plausible (diamond) types of assignments. Image from [1]

Figure: Kernel codebook ameliorates issues with uncertainty and plausibility by weighted assignment to words. Image from [1]

The feature descriptor space is a continuum, which dictates that exclusive assignment of a descriptor to a single cluster prototype (crisp logic) is better modelled by soft assignment of the descriptor to multiple cluster prototypes [1].
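As a toy illustration of this soft-assignment idea (our sketch, not the authors' code; the Gaussian weighting and `sigma` are assumptions), each descriptor can spread unit mass over nearby words instead of voting for exactly one:

```python
import numpy as np

def encode_soft(image_descriptors, centres, sigma=1.0):
    """Kernel-codebook-style soft encoding: weighted assignment to all words."""
    # squared distance of every descriptor to every word centre
    d2 = ((image_descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma ** 2))    # Gaussian weighting (assumed kernel)
    w /= w.sum(axis=1, keepdims=True)     # each descriptor spreads unit mass
    hist = w.sum(axis=0)                  # accumulate mass into a word histogram
    return hist / hist.sum()
```

Unlike hard assignment, every word receives some mass, so the uncertain and plausible cases in the figure contribute to several bins rather than being forced into one.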

Image Classification Pipeline

The training dataset, for visual category ‘car’, consists of positively and negatively labelled images. A sample of descriptors computed on these images is clustered using either K-Means (BoF), Fuzzy C-Means (FCM) [2], or Gustafson-Kessel (GK) [3]. The descriptors from each image are encoded according to weighted word assignment(s) to compute a histogram feature vector for each image. A classifier is trained using these feature vectors.
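A minimal sketch of the BoF end of this pipeline (ours, with assumed names; it takes already-extracted descriptors as input):

```python
import numpy as np

def build_codebook(descriptors, k, iters=25, seed=0):
    """Plain k-means codebook (the BoF baseline): returns k word centres."""
    rng = np.random.default_rng(seed)
    centres = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(descriptors[:, None] - centres[None], axis=2)
        labels = d.argmin(axis=1)            # hard assignment to nearest word
        for i in range(k):
            if np.any(labels == i):
                centres[i] = descriptors[labels == i].mean(axis=0)
    return centres

def encode_hard(image_descriptors, centres):
    """Hard-assignment histogram feature vector for one image."""
    d = np.linalg.norm(image_descriptors[:, None] - centres[None], axis=2)
    hist = np.bincount(d.argmin(axis=1), minlength=len(centres))
    return hist / hist.sum()
```

The per-image histograms are what the classifier is trained on; swapping `encode_hard` for a fuzzy encoder is exactly the change this poster studies.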

Fuzzy Encoding: Methods

K-means computes words $V = \{v_1, v_2, \ldots, v_K\}$ and an assignment of descriptors $X = \{x_1, x_2, \ldots, x_N\}$ to words by minimizing:

$$J(X; V) = \sum_{i=1}^{K} \sum_{k=1}^{N} \mathbb{1}_{ik} \, \| x_k - v_i \|^2, \qquad \mathbb{1}_{ik} = \begin{cases} 1 & \text{if } x_k \in v_i \\ 0 & \text{otherwise} \end{cases}$$

The encoded feature for an image, $\nu$, is a discrete-valued histogram. The FCM algorithm attempts to minimize:

$$J(X; U, V) = \sum_{i=1}^{K} \sum_{k=1}^{N} \mu_{ik}^{m} \, \| x_k - v_i \|^2, \qquad 1 \leq m < \infty$$

where $U = [\mu_{ik}]$, and $\mu_{ik}$ is the degree of membership of $x_k$ to cluster $i$; $m$ is a measure of ‘fuzzification’. $\nu$ now reflects degrees of membership of descriptors to words. However, FCM is unable to adapt to the local distribution of descriptors. The GK algorithm extends FCM using a metric induced by a positive definite matrix $A$, learning hyper-ellipsoidal clusters instead of the hyper-spherical clusters of FCM. The objective function minimized in GK is:

$$J(X; U, V, A) = \sum_{i=1}^{K} \sum_{k=1}^{N} \mu_{ik}^{m} \, D^2_{ikA_i}$$

where

$$D^2_{ikA_i} = (x_k - v_i)^T A_i (x_k - v_i), \qquad 1 \leq i \leq K, \; 1 \leq k \leq N$$
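In code, the norm-inducing matrix $A_i = \rho_i \det(F_i)^{1/n} F_i^{-1}$ and the squared distance it induces might look like this (our sketch; function and variable names are assumptions):

```python
import numpy as np

def gk_sq_distance(X, v, F, rho=1.0):
    """Squared GK distance of descriptors X (N x n) to centre v under covariance F."""
    n = X.shape[1]
    A = rho * np.linalg.det(F) ** (1.0 / n) * np.linalg.inv(F)  # norm-inducing matrix
    diff = X - v
    return np.einsum('kd,de,ke->k', diff, A, diff)  # (x_k - v)^T A (x_k - v) per row
```

With $F$ the identity and $\rho = 1$ this reduces to the squared Euclidean distance used by FCM, i.e. the hyper-spherical special case.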

Gustafson-Kessel Algorithm

Algorithm 1 Gustafson-Kessel

$\tau \leftarrow 1$
repeat
    $v_i^{(\tau)} \leftarrow \frac{\sum_{k=1}^{N} (\mu_{ik}^{(\tau-1)})^m \, x_k}{\sum_{k=1}^{N} (\mu_{ik}^{(\tau-1)})^m}$
    $F_i \leftarrow \frac{\sum_{k=1}^{N} (\mu_{ik}^{(\tau-1)})^m (x_k - v_i^{(\tau)})(x_k - v_i^{(\tau)})^T}{\sum_{k=1}^{N} (\mu_{ik}^{(\tau-1)})^m}$
    $D^2_{ikA_i} = (x_k - v_i^{(\tau)})^T \left[\rho_i \det(F_i)^{1/n} F_i^{-1}\right] (x_k - v_i^{(\tau)})$
    $\varphi_k \leftarrow \{\, i \mid D_{ik} = 0 \,\}$
    for $k \leftarrow 1, N$ do
        if $\varphi_k = \emptyset$ then
            $\mu_{ik}^{(\tau)} \leftarrow \left( \sum_{j=1}^{K} \left( \frac{D_{ikA_i}}{D_{jkA_j}} \right)^{\frac{2}{m-1}} \right)^{-1}$
        else
            $\mu_{ik}^{(\tau)} \leftarrow \begin{cases} 0 & \text{if } D_{ikA_i} > 0 \\ \frac{1}{|\varphi_k|} & \text{if } D_{ikA_i} = 0 \end{cases}$
    $\tau \leftarrow \tau + 1$
until $\| U^{(\tau)} - U^{(\tau-1)} \| < \epsilon$
for $j \leftarrow 1, M$ do
    $\nu_j \leftarrow \sum \mu_{1jk}$
    $\gamma_j \leftarrow \begin{cases} 1 & \text{if } I_j \in \mathcal{C} \\ -1 & \text{if } I_j \notin \mathcal{C} \end{cases}$

Notation: $v_i$: centre of the $i$th cluster; $F_i$: covariance of the $i$th cluster; $x_k$: $k$th descriptor; $\mu_{ik}$: membership of $x_k$ to the $i$th cluster; $D_{ikA_i}$: inner-product norm; $\rho_i$: $\det(A_i)$; $m$: measure of fuzzification; $M$: no. of images; $\mathcal{C}$: category; $I_j$: $j$th image.
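One sweep of Algorithm 1 can be vectorized roughly as follows (a sketch under our naming assumptions; it omits the zero-distance special case handled by $\varphi_k$ above):

```python
import numpy as np

def gk_step(X, U, m=2.0, rho=None):
    """One Gustafson-Kessel update of centres, fuzzy covariances, and memberships.
    X: N x d descriptors; U: K x N membership matrix (columns sum to 1)."""
    K, N = U.shape
    d = X.shape[1]
    rho = np.ones(K) if rho is None else rho
    W = U ** m                                   # (mu_ik)^m
    V = (W @ X) / W.sum(axis=1, keepdims=True)   # new centres v_i
    D2 = np.empty((K, N))
    for i in range(K):
        diff = X - V[i]
        F = (W[i, :, None] * diff).T @ diff / W[i].sum()   # fuzzy covariance F_i
        A = rho[i] * np.linalg.det(F) ** (1.0 / d) * np.linalg.inv(F)
        D2[i] = np.einsum('kd,de,ke->k', diff, A, diff)    # squared GK distances
    # mu_ik = ( sum_j (D_ik / D_jk)^(2/(m-1)) )^-1, written with squared distances
    ratio = (D2[:, None, :] / D2[None, :, :]) ** (1.0 / (m - 1))
    return V, 1.0 / ratio.sum(axis=1)
```

Iterating `gk_step` until $\| U^{(\tau)} - U^{(\tau-1)} \| < \epsilon$ reproduces the repeat/until loop; the learnt $F_i$ is what lets each cluster stretch into a hyper-ellipsoid.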

Experiments

The comparative classification performance of the BoF, FCM, and GK algorithms is analysed across visual categories within a dataset; across a set of datasets; and for different codebook sizes. The feature descriptor used in all experiments is the popular local affine co-variant descriptor, SIFT. The classifier is an SVM with an RBF kernel. The datasets used are Caltech-101, Caltech-256, Pascal VOC 2006, Pascal VOC 2010, and Scene-15. These datasets vary in the number of categories, the number of images within each category, the visual domain of the categories, and the inherent difficulty of modelling a category.

Performance across categories

The graphs in the figures show the comparative mean classification accuracy of the BoF and GK approaches for each visual category in the datasets VOC2006, VOC2010, and Scene-15. The absolute and relative performance of both BoF and GK varies across the categories due to the variation in content and complexity of each category.

Performance across datasets

Comparison of the BoF, FCM, and GK approaches in terms of their classification performance on different datasets. The results in the figure show the mean accuracy averaged across all categories of each dataset.

Performance across codebook sizes

Analysis of the comparative mean classification accuracy of the BoF and GK approaches for different codebook sizes. The graph shows performance on the Caltech-101 dataset.

Summary

We have introduced a fuzzy encoding technique for image classification using Fuzzy C-Means (FCM) to compute a fuzzy membership function. We extended this work to the Gustafson-Kessel (GK) fuzzy clustering algorithm, which was shown to adapt to local distributions. We demonstrated empirically, on several popular datasets, that our fuzzy encoding approach is consistently better than the BoF model. The GK algorithm was shown to provide a marginal improvement over FCM, which we expect to improve with optimization of the covariance matrices in future work.

References

[1] J.C. van Gemert, C.J. Veenman, A.W.M. Smeulders, and J.-M. Geusebroek, “Visual word ambiguity,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 7, pp. 1271–1283, Jul. 2010.

[2] A. Baraldi and P. Blonda, “A survey of fuzzy clustering algorithms for pattern recognition. I,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 29, no. 6, pp. 778–785, Dec. 1999.

[3] D.E. Gustafson and W.C. Kessel, “Fuzzy clustering with a fuzzy covariance matrix,” in Proceedings of the 1978 IEEE Conference on Decision and Control including the 17th Symposium on Adaptive Processes, Jan. 1978, vol. 17, pp. 761–766.

Acknowledgement

This work is supported by the EPSRC project Making Sense (EP/H023135/1).

Centre for Vision, Speech, and Signal Processing - University of Surrey - Guildford, United Kingdom Mail: [email protected] WWW: http://www.ee.surrey.ac.uk/cvssp