Gaussian Mixture Model Based Scheme Using
Expectation-Maximization for Image Segmentation
Ritu Garg (2011EEZ8469)
Adersh Miglani (2011EEZ8471)
Project - EEL709 Course
IIT Delhi
27-April-2012
Abstract: In this paper, we define a probabilistic model in which each class is represented by a nineteen-dimensional multivariate distribution over 3x3 regions of an image. Each class-conditional distribution is a Gaussian mixture model learned from the training data using the Expectation-Maximization (EM) algorithm. For the experiments we used the UCI image segmentation database.
I. INTRODUCTION
Image segmentation is useful in many applications. The goal of image segmentation is to partition an image into regions, each of which has a reasonably homogeneous visual appearance or corresponds to objects or parts of objects. Image segmentation is in general extremely difficult and remains the subject of active research. Here we implement a variant of the GMM using an EM-based technique to classify pixels and thereby segment objects in the image. Researchers have employed the theory of Gaussian mixture models for image segmentation [1]-[3].
The images can be considered as a mixture of multivariate densities, and the mixture parameters are estimated using the EM algorithm. The segmentation is completed by classifying each pixel into its corresponding class according to the maximum log-likelihood estimate. A major drawback of this method is that the number of Gaussian mixture components is assumed to be known a priori, so it cannot be considered a completely unsupervised learning methodology. Another issue is the problem of mixture parameter initialization in the EM algorithm, which can greatly affect performance.
A commonly used solution is initialization by randomly sampling the mixture data [4], [5]. But this method may result in inconsistent mathematical models, such as non-linear convergence or an inconsistent covariance matrix. In our experiments, we used the K-means algorithm to initialize the Gaussian mixture parameters, which successfully solved the initialization problem. Hence the method performs both parameter estimation and model selection in a single algorithm, and is thus fully unsupervised.
The paper is organized as follows: in Section II we describe the various pre-processing steps applied to the given dataset to rule out inconsistencies in the data. In Section III we discuss the Gaussian mixture model and the EM algorithm. In Section IV we present and discuss our experimental results, and we conclude in Section V.
II. DATASET PREPROCESSING
We have used the UCI image segmentation dataset¹ to evaluate EM-based GMM learning. We summarize the relevant information related to the dataset below:
- Each instance is a 3x3 region.
- Total number of instances: 2310, of which 210 form the training data and the remaining 2100 the test data.
- Number of attributes: 19 continuous attributes.
- Missing attribute values: none.
- Class distribution: 1 = brickface, 2 = sky, 3 = foliage, 4 = cement, 5 = window, 6 = path, 7 = grass.
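For reference, a minimal loading sketch follows. It assumes the standard UCI file layout (class label in the first column, followed by the 19 continuous attributes) and a handful of header rows; the exact number of rows to skip may need adjusting for the downloaded files, and the function name is ours.

```python
import pandas as pd

def load_segmentation(path, skip=5):
    # Assumed layout: comma-separated, label first, then 19 attributes.
    # The header-row count (skip) is a guess and may need adjusting.
    df = pd.read_csv(path, skiprows=skip, header=None)
    labels = df.iloc[:, 0].to_numpy()            # class names, e.g. "SKY"
    features = df.iloc[:, 1:].to_numpy(float)    # 19 continuous attributes
    return features, labels

X_train, y_train = load_segmentation("segmentation.data")  # 210 x 19
X_test, y_test = load_segmentation("segmentation.test")    # 2100 x 19
```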
At first we considered all the training samples (i.e., the 210 x 19 data matrix). We found that the covariance matrix computed from this data had one row and one column equal to zero, which implies that the corresponding dimension took the same value for all samples. Such dimensions were detected by computing the variance of every dimension across the samples, and the redundant dimensions were removed from the feature space.
Further, we applied SVD to the reduced feature set and noticed that the eigenvalues for a few of the dimensions were extremely small, ranging from $10^{-15}$ to $10^{-25}$, and that some of the computed eigenvalues were negative. In order to make the covariance matrix consistent and positive definite, we applied Principal Component Analysis (PCA): the original mean-adjusted features were transformed into a new eigenspace keeping only the dominant dimensions, resulting in a consistent positive definite covariance matrix at each iteration.
¹http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/segment/
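A minimal sketch of this preprocessing step, assuming the variance threshold of 0.001 and the 12 retained dimensions reported in Section IV (the function and variable names are ours, not the project's):

```python
import numpy as np

def preprocess(X_train, X_test, var_thresh=1e-3, n_components=12):
    # Drop near-constant dimensions: their rows/columns of the covariance
    # matrix are (numerically) zero and make it singular.
    keep = X_train.var(axis=0) > var_thresh
    Xtr, Xte = X_train[:, keep], X_test[:, keep]

    # Mean-adjust using training statistics.
    mean = Xtr.mean(axis=0)
    Xtr, Xte = Xtr - mean, Xte - mean

    # PCA: eigendecompose the symmetric covariance matrix and keep the
    # dominant eigenvectors, so the projected covariance stays positive
    # definite.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xtr, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order]
    return Xtr @ W, Xte @ W
```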
III. GAUSSIAN MIXTURE MODELS AND EM ALGORITHM
A. Gaussian Mixture Models
A Gaussian mixture model is a simple linear superposition of Gaussian components, aimed at providing a richer class of density models than the single Gaussian. The Gaussian mixture model with $K$ components for samples $x_n$, $n = 1, \ldots, N$, can be written as:

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k) \qquad (1)$$

where $\pi_k$ is the mixing coefficient, and $\mu_k$ and $\Sigma_k$ are the parameters defining the $k$th component, to be learned using the EM algorithm. $\mathcal{N}(x \mid \mu_k, \Sigma_k)$ is the Gaussian distribution with mean $\mu_k$ and covariance $\Sigma_k$.
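As an illustration, equation (1) can be evaluated directly with SciPy; this is a sketch with placeholder parameter lists, not the project code:

```python
from scipy.stats import multivariate_normal

def gmm_density(x, pis, mus, sigmas):
    # p(x) = sum_k pi_k * N(x | mu_k, Sigma_k), as in equation (1).
    return sum(pi * multivariate_normal.pdf(x, mean=mu, cov=sigma)
               for pi, mu, sigma in zip(pis, mus, sigmas))
```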
B. EM Algorithm
The EM algorithm is a general iterative technique for computing maximum-likelihood estimates. The usual EM algorithm consists of an E-step and an M-step. Given a Gaussian mixture, the goal is to maximize the log-likelihood function with respect to the parameter vector comprising the means, covariances, and mixing coefficients of the components. The EM algorithm is summarized below:
- Initialize the means $\mu_k$, covariances $\Sigma_k$, and mixing coefficients $\pi_k$ based on the standard K-means clustering algorithm.
- E-step: Evaluate the responsibilities using the current parameters:

$$\gamma(z_{nk}) = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)} \qquad (2)$$

- M-step: Re-estimate the parameters using the current responsibilities:

$$\mu_k^{\mathrm{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, x_n \qquad (3)$$

$$\Sigma_k^{\mathrm{new}} = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, (x_n - \mu_k^{\mathrm{new}})(x_n - \mu_k^{\mathrm{new}})^{T} \qquad (4)$$

$$\pi_k^{\mathrm{new}} = \frac{N_k}{N} \qquad (5)$$

where

$$N_k = \sum_{n=1}^{N} \gamma(z_{nk}) \qquad (6)$$

- Evaluate the log-likelihood:

$$\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln \left\{ \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right\} \qquad (7)$$
Since the EM algorithm is highly dependent on initialization, instead of performing random sampling we used K-means to initialize the mixture parameters.
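A minimal sketch of this loop, mirroring equations (2)-(7) with K-means initialization; it is our illustration under the assumption of a preprocessed data matrix X of shape (n_samples, n_features), not the exact project code:

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.cluster import KMeans

def fit_gmm_em(X, K, tol=1e-2, max_iter=200):
    N, d = X.shape
    # K-means initialization of means, covariances, mixing coefficients.
    km = KMeans(n_clusters=K, n_init=10).fit(X)
    mus = km.cluster_centers_
    sigmas = np.array([np.cov(X[km.labels_ == k], rowvar=False)
                       + 1e-6 * np.eye(d) for k in range(K)])
    pis = np.bincount(km.labels_, minlength=K) / N

    prev_ll = -np.inf
    for _ in range(max_iter):
        # E-step: responsibilities, equation (2).
        dens = np.column_stack([pis[k] * multivariate_normal.pdf(
            X, mean=mus[k], cov=sigmas[k]) for k in range(K)])
        gamma = dens / dens.sum(axis=1, keepdims=True)

        # M-step: parameter updates, equations (3)-(6).
        Nk = gamma.sum(axis=0)
        mus = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mus[k]
            sigmas[k] = ((gamma[:, k, None] * diff).T @ diff / Nk[k]
                         + 1e-6 * np.eye(d))  # small ridge for stability
        pis = Nk / N

        # Log-likelihood, equation (7); stop when the change drops below tol.
        ll = np.log(dens.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pis, mus, sigmas
```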
IV. EXPERIMENTAL RESULTS
As part of pre-processing, i.e., after removing the redundant dimensions and applying PCA, we obtained reduced data of size 210 x 12, which was used for subsequent processing and learning. We used the K-means algorithm on the new sample set, which yielded acceptable means and covariances. Unsupervised learning using an EM-based GMM reaches only a local optimum of the likelihood, and does so reliably only if the Gaussian parameters are initialized acceptably; the algorithm can get trapped in one of the many local maxima of the likelihood function. We therefore adopted a multi-modal Gaussian model for each class: in each of the seven classes we fit K = 2, 3, 4 Gaussians. The convergence criterion for learning the Gaussian parameters in the EM step was a change in likelihood below $10^{-2}$.

For every sample we compute p(x) as a sum of Gaussians over all K classes, using the maximum log-likelihood as the objective function. If any sample coincides with a component mean, the log-likelihood tends to infinity, so maximization of the log-likelihood of a regular Gaussian mixture is not a well-posed problem due to this singularity. An elegant and powerful method for finding maximum-likelihood solutions for models with latent variables is the EM algorithm.
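A sketch of the resulting classification rule: fit one mixture per class with fit_gmm_em above and assign each test sample to the class whose mixture gives the highest log-likelihood. It reuses the gmm_density sketch from Section III, and the names are ours:

```python
import numpy as np

def classify(X_test, class_models):
    # class_models: one (pis, mus, sigmas) tuple per class, as returned
    # by fit_gmm_em. Score each sample under each class mixture.
    scores = np.column_stack([
        np.log([gmm_density(x, *m) for x in X_test]) for m in class_models
    ])
    return scores.argmax(axis=1)  # index of the most likely class
```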
Fig. 1. Threshold vs. Classification over Seven Classes
The 7-class problem is run with a varying number of components, K = 2, 3, 4, 5, in each class:
- Figure 1 illustrates the allocation of test samples to the K = 7 classes with varying KInClass = 2, 3, 4, and 5.
Fig. 2. Threshold vs. Expected Number of Samples over Total 2100 Samples
- We verify our results against the UCI test image database. As shown in the second row of Figure 1, the expected numbers of samples, out of 2100 in total, are correctly classified according to the UCI test set.
- Figure 2 illustrates the change in the expected number of classified test samples over a varying negative log-likelihood threshold, i.e., the change in likelihood at each iteration with respect to the previous one.
- Figure 3 shows the plot of negative log-likelihood vs. the number of iterations needed to converge while estimating the mean, covariance, and mixing coefficient for each Gaussian.
It is clear that our implementation shows consistent results for multi-modal K-Gaussian fitting in each class, with a monotonically decreasing negative log-likelihood. The deviation of the expected sample assignments from the classes specified by UCI for each test sample is very low. We performed the following steps to choose dominant dimensions from the ill-conditioned UCI database, which may introduce some approximation into our results:
- We compared the variance of each dimension across all samples and found three dimensions to be redundant under an assumed minimum variance of 0.001. The variance of the other dimensions ranged up to multiples of 100.
- We performed PCA and compared all 19 eigenvalues. In the sorted list, the highest eigenvalue was on the order of 1000 and the last four were below $10^{-13}$; such low eigenvalues should be considered zero. All training and test mean-adjusted samples were transformed into the new eigenspace with 12 dimensions.
V. CONCLUSION
In this paper, we present an unsupervised image segmen-tation method based on finite Gaussian mixture model. The
observed pixels of the 3 3 region was considered as amixture of multi-variate densities. Each 3 3 entry is rep-resented using 19- dimensional features. In our experiments,
it was essential to perform pre-processing on the data to avoid
inconsistent mathematical modelling. Further, we used the K-
means algorithm that successfully circumvent the initialization
problem of EM algorithm. Finally, we are able to show
consistent convergence for association of test samples with
Fig. 3. Negative log-likelihood with experiments
their respective clusters based on the three variate parameters
learned using EM-GMM. Same is indicated in figure 3.
REFERENCES
[1] T. Yamazaki, "Introduction of EM algorithm into color image segmentation," in Proc. ICIPS'98, 1998, pp. 368-371.
[2] H. Caillol, W. Pieczynski, and A. Hillion, "Estimation of fuzzy Gaussian mixture and unsupervised statistical image segmentation," IEEE Transactions on Image Processing, vol. 6, no. 3, pp. 425-440, March 1997.
[3] J. J. Verbeek, N. Vlassis, and B. Kröse.
[4] G. McLachlan and T. Krishnan, The EM Algorithm and Extensions. Wiley, 1996.
[5] G. McLachlan and D. Peel, Finite Mixture Models. Wiley, 2000.