E.V. Myasnikov. Digital image collection navigation based on automatic classification methods. Samara State Aerospace University. RCDL 2007, Internet Mathematics 2007. [email protected]


Page 1: E . V. Myasnikov

Digital image collection navigation based on automatic classification methods

E.V. Myasnikov

Samara State Aerospace University

[email protected]

RCDL 2007, Internet Mathematics 2007

Page 2: E . V. Myasnikov

Navigation in a collection of digital images

• alternative to an image retrieval system
• complement to an image retrieval system
• convenient browsing system

Approaches to navigation system construction

• construct a projection of the whole image collection into a 2-D navigation space
• cluster the image collection into a set of clusters (hierarchy) and then construct a 2-D projection of each cluster
• construct a tree-like structure using an optimization rule

Page 3: E . V. Myasnikov

Clustering methods

Hierarchical clustering (agglomerative)
• Single link
• Complete link
• Average link

Nonhierarchical clustering
• K-means
• Kohonen neural networks (SOM)

Fuzzy clustering

Page 4: E . V. Myasnikov

Projection methods

Linear
• Principal component analysis (PCA)

Nonlinear
• Classical Kruskal MDS (multidimensional scaling)
• Sammon projection
• Force-Directed Replacement

Discrete lattice solution

Continuous solution

Page 5: E . V. Myasnikov

Requirements for the navigation system

• The collection is represented by 2D vectors (shown as icons or points on the monitor)
• Zooming in on a region displays a set of images with a higher level of similarity
• Zooming out from the current region displays a set of images with a lower level of similarity
• Property of reversibility

Operations with the navigation “map”

• Scrolling (up, down, left, right)
• Scaling (up, down)

Page 6: E . V. Myasnikov

Main phases of proposed approach

Input: digital images

1. Feature extraction
2. Cluster hierarchy construction
3. Mapping into 2-D navigation space using restrictions imposed by the cluster hierarchy

Output: navigation space

Page 7: E . V. Myasnikov

Clustering Phase: Analyzed Methods

Hierarchical clustering scheme

1. Calculate the adjacency matrix (pairwise distances between objects)
2. Assign each object to a cluster of its own
3. Merge the two elements with the minimal distance between them
4. Eliminate the row and column of the absorbed cluster and recalculate the matrix
5. Test the stopping criterion; if it is not met, return to step 3

Inter-cluster distance

• single link: minimal distance between objects belonging to the two clusters
• complete link: maximal distance between objects belonging to the two clusters
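The five steps of the hierarchical scheme can be sketched in a few lines of Python. This is only an illustration, not the author's implementation: the function name `agglomerate`, the use of Euclidean distance, and the stopping criterion (a target number of clusters) are assumptions, and the inter-cluster distance is recomputed from the object distances instead of updating matrix rows and columns in place.

```python
import math
from itertools import combinations


def agglomerate(points, n_clusters, linkage="single"):
    """Merge clusters bottom-up; returns clusters as lists of point indices."""
    # Step 1: pairwise distance matrix between objects (Euclidean is assumed).
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Step 2: every object starts in a cluster of its own.
    clusters = [[i] for i in range(len(points))]
    # Single link takes the minimum object distance, complete link the maximum.
    pick = min if linkage == "single" else max

    def cluster_distance(a, b):
        return pick(dist[i][j] for i in a for j in b)

    # Step 5: assumed stopping criterion, a target number of clusters.
    while len(clusters) > n_clusters:
        # Step 3: find the pair of clusters with minimal inter-cluster distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: cluster_distance(clusters[ab[0]], clusters[ab[1]]),
        )
        # Step 4: cluster b is absorbed by cluster a (a < b, so deleting b is safe).
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Calling `agglomerate(points, 3, linkage="complete")` switches to the complete-link distance; recording the merges instead of stopping early would give the full hierarchy.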

Kohonen neural network

WTA (winner takes all) correction rule:

$$w_c(t+1) = w_c(t) + \eta(t)\,[x(t) - w_c(t)]$$

The following equation holds for the winning neuron $c$:

$$d(x(t), w_c(t)) = \min_{1 \le i \le K} d(x(t), w_i(t))$$

Here $\eta(t)$ denotes the learning-rate coefficient.

To construct the hierarchy of clusters, the Kohonen neural network is applied recursively.

[Figure: Kohonen network with inputs x1 … xn, weights w11 … wkn, a competition mechanism, and outputs y1 … yk]
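A minimal sketch of the WTA rule follows, to make the "only the winner moves" idea concrete. The names `wta_step` and `train_wta`, the Euclidean distance, the initialisation from random data points, and the linearly decaying learning rate are all assumptions made for illustration; the slides do not specify them.

```python
import math
import random


def wta_step(weights, x, rate):
    """One winner-takes-all step: move only the closest weight vector toward x."""
    # Winner: the neuron whose weight vector is nearest to the input.
    winner = min(range(len(weights)), key=lambda i: math.dist(x, weights[i]))
    # Correction rule, applied to the winner only.
    weights[winner] = [w + rate * (xi - w) for w, xi in zip(weights[winner], x)]
    return winner


def train_wta(data, k, epochs=20, rate0=0.5, seed=0):
    """Train k weight vectors on `data` (a list of equal-length sequences)."""
    rng = random.Random(seed)
    weights = [list(p) for p in rng.sample(data, k)]  # assumed initialisation
    steps = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            rate = rate0 * (1.0 - t / steps)   # assumed linear decay of the rate
            wta_step(weights, x, rate)
            t += 1
    return weights
```

A cluster hierarchy could then be obtained by calling `train_wta` again on the objects assigned to each resulting cluster, which is one possible reading of "applied recursively".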

Page 8: E . V. Myasnikov

Clustering: Experimental results

Quantization error:

$$\varepsilon_q = \frac{1}{U}\sum_{j=1}^{U}\left(x_j - w_j\right)^2$$

Average quantization error*

Number of clusters   Single link   Complete link   WTA
25                   0.387         0.193           0.169
50                   0.359         0.150           0.139
100                  0.293         0.116           0.112

*The experiment was conducted on samples of size 1000

[Figure: examples of clusters]
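The quantization error can be computed as below. This sketch assumes one particular reading of the formula: the mean, over all objects, of the squared Euclidean distance between an object and the center (weight vector) of the cluster it is assigned to. The function name and the `labels` argument are hypothetical.

```python
import math


def quantization_error(samples, labels, centers):
    """Mean squared distance from each sample to the center of its own cluster."""
    total = 0.0
    for x, label in zip(samples, labels):
        total += math.dist(x, centers[label]) ** 2
    return total / len(samples)
```

Passing the assignment explicitly lets the same function score single-link, complete-link, and WTA clusterings, since only the last of these is guaranteed to assign every object to its nearest center.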

Page 9: E . V. Myasnikov

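Mapping Phase: Sammon projection

Error

$$E = \frac{1}{\sum_{i<j}^{N} d_{ij}} \sum_{i<j}^{N} \frac{\left(d_{ij} - d^*_{ij}\right)^2}{d_{ij}}$$

Notation

• $d_{ij}$: distance between objects $i$ and $j$ in the multidimensional space
• $d^*_{ij}$: distance between objects $i$ and $j$ in the two-dimensional space
• $y_{ik}$, $y_{jk}$: coordinates of objects $i$ and $j$ in 2D space
• $\alpha$: step size of the iteration

Iterative formula

$$y_{ik}(t+1) = y_{ik}(t) + \frac{2\alpha}{\sum_{i<j}^{N} d_{ij}} \sum_{\substack{j=1 \\ j \ne i}}^{N} \frac{d_{ij} - d^*_{ij}}{d_{ij}\, d^*_{ij}} \left(y_{ik}(t) - y_{jk}(t)\right)$$

Operational time ~ $O[N^3]$ (under the assumption that the number of iterations is of the same order as the number of objects)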
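The Sammon error and one gradient-descent update can be sketched as follows. The code assumes the error is normalised by the sum of the original distances and that each update uses a step size `alpha` (a hypothetical parameter name); the default value of `alpha` and the guard `eps` are illustrative choices, not values from the slides.

```python
import math


def sammon_error(d_high, y):
    """Sammon error for original distances d_high and 2-D positions y."""
    n = len(y)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    c = sum(d_high[i][j] for i, j in pairs)
    e = sum((d_high[i][j] - math.dist(y[i], y[j])) ** 2 / d_high[i][j]
            for i, j in pairs)
    return e / c


def sammon_step(d_high, y, alpha=0.3, eps=1e-12):
    """One update of all points, computed from the old positions."""
    n = len(y)
    c = sum(d_high[i][j] for i in range(n) for j in range(i + 1, n))
    new_y = []
    for i in range(n):
        shift = [0.0, 0.0]
        for j in range(n):
            if j == i:
                continue
            d_low = max(math.dist(y[i], y[j]), eps)  # guard against coincident points
            coef = (d_high[i][j] - d_low) / (d_high[i][j] * d_low)
            shift[0] += coef * (y[i][0] - y[j][0])
            shift[1] += coef * (y[i][1] - y[j][1])
        new_y.append([y[i][0] + 2.0 * alpha / c * shift[0],
                      y[i][1] + 2.0 * alpha / c * shift[1]])
    return new_y


def sammon(d_high, y0, iters=100, alpha=0.3):
    """Run a fixed number of iterations from the initial configuration y0."""
    y = [list(p) for p in y0]
    for _ in range(iters):
        y = sammon_step(d_high, y, alpha)
    return y
```

Each call to `sammon_step` visits every pair of objects, so it costs on the order of N² operations; with a number of iterations proportional to N this gives the cubic operational time quoted above.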

Page 10: E . V. Myasnikov

Construction of initial configuration for Sammon mapping

Average error value* by number of iterations

Method                                                               100     200     300
Sammon mapping with random initialization                            0.126   0.078   0.056
Best Sammon mapping over 10 runs with random initialization          0.093   0.051   0.035
Two-phase method (PCA as initial configuration for Sammon mapping)   0.044   0.041   0.039

PCA (single value, no iterations): 0.139

*Samples of 100 images from a dataset of 10 000 images were used to conduct the experiment

[Figure: two-phase method example]
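The first phase of the two-phase method, a PCA projection used as the starting configuration, might look like the sketch below. It is a dependency-free illustration that finds the two leading principal axes by power iteration with deflation; the function name `top2_pca`, the fixed start vector, and the iteration count are assumptions. A library eigen-solver would normally be preferred.

```python
import math


def top2_pca(data, iters=200):
    """Project the rows of `data` onto their two leading principal components."""
    n, dim = len(data), len(data[0])
    mean = [sum(row[k] for row in data) / n for k in range(dim)]
    centered = [[row[k] - mean[k] for k in range(dim)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(dim)]
           for a in range(dim)]
    axes = []
    for _ in range(2):
        # Fixed start vector keeps the sketch deterministic; it must not be
        # orthogonal to the axis being sought.
        v = [1.0 / math.sqrt(dim)] * dim
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(dim))
                  for a in range(dim))
        axes.append(v)
        # Deflation: remove the component just found from the covariance matrix.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(dim)]
               for a in range(dim)]
    return [[sum(r[k] * ax[k] for k in range(dim)) for ax in axes]
            for r in centered]
```

The returned 2-D coordinates would then be handed to the Sammon iteration in place of a random start, which is the second phase of the method.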

Page 11: E . V. Myasnikov

Methods of speeding up Sammon projection

1. Triangulation

2. Neural network

3. Approximation using random sets

Chalmers’96 adaptation for Sammon projection (CS)

Two sets are constructed for each object on each iteration:

• a set of k1 close objects
• a set of k2 random objects

Operational time ~ $O[N^2]$ (under the assumption that the number of iterations is of the same order as the number of objects and k1+k2 << N)
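The bookkeeping for the two sets can be sketched as follows. The replacement policy is an assumption based on the usual description of Chalmers-style sampling: a randomly drawn object that is closer than the farthest member of the "close" set takes its place. The function name `refresh_sets` and the callable `dist(i, j)` are hypothetical.

```python
import random


def refresh_sets(i, close, dist, n, k1, k2, rng):
    """Update the close set of object i and return (close, random_set)."""
    # Draw up to k2 distinct random objects that are not i and not already close.
    picked = set()
    limit = min(k2, n - 1 - len(close))
    while len(picked) < limit:
        j = rng.randrange(n)
        if j != i and j not in close:
            picked.add(j)
    rest = []
    for j in picked:
        if len(close) < k1:
            close.append(j)                # close set not full yet
        elif dist(i, j) < dist(i, close[-1]):
            rest.append(close[-1])         # demoted object joins the random set
            close[-1] = j
        else:
            rest.append(j)
        close.sort(key=lambda m: dist(i, m))  # farthest close object stays last
    return close, rest
```

During an iteration the update for object `i` would then sum only over `close + rest`, that is over at most k1 + k2 objects instead of all N.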

Page 12: E . V. Myasnikov

Proposed Methods: Combined Method (CM)

Idea: use hierarchical clustering to build the projection

Method description

1. Build the Sammon projection for the top level of the cluster tree
2. Build the Sammon projection for each subcluster at level 2, using the 2D coordinates of the superclusters as fixed points
3. Repeat the process for each subcluster (or object) of level 3, and so on

Operational time: $O\left[N^{1+\frac{2}{L}}\right]$ for a balanced tree with depth $L$

Special case: $O[N^2]$ for a balanced tree with depth 2

Modification of the method (MCM): use the 2D coordinates of the top-level clusters for each subcluster (or object) at any level
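One way to arrive at an estimate of the form $N^{1+2/L}$, which reduces to the $N^2$ special case for $L = 2$, is the rough count below. It rests on assumptions that the slide does not spell out: every node of the balanced tree has $b = N^{1/L}$ children, a Sammon projection of $m$ points costs on the order of $m^3$ operations, and the extra cost of the fixed points is ignored.

```latex
\begin{align*}
b &= N^{1/L} && \text{children per node},\\
\text{projections at level } \ell &= b^{\ell-1}, && \ell = 1,\dots,L,\\
T(N) &\sim \sum_{\ell=1}^{L} b^{\ell-1}\, b^{3}
      = b^{3}\,\frac{b^{L}-1}{b-1}
      \approx b^{L+2} = N^{1+\frac{2}{L}},\\
L = 2 &\;\Rightarrow\; T(N) \sim N^{2}.
\end{align*}
```

The sum is dominated by the deepest level, where about $N/b$ small projections of $b$ points each are built.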

Page 13: E . V. Myasnikov

Proposed Methods: Restrictive Combined Method (CMR)

1. Map the centers $x_u^1$ of the top-level clusters $C_u^1 \in C^0$ to the 2D vectors $y_u^1$ using a dimensionality reduction method (Sammon or the two-phase method). Set the boundaries $\Omega^0$ of the whole displayed area.

2. For each cluster $C_u^k \in C_v^{k-1}$ of the current level $k$, carry out points 3-6.

3. Construct the boundaries $\Omega_u^k$ of the cluster $C_u^k$ in 2D space using the center coordinates $y_m^k$ in 2D space of the clusters $C_m^k$, $m = 1..|C_v^{k-1}|$, of the current level $k$.

4. Complete the cluster boundaries $\Omega_u^k$ using the boundaries $\Omega_v^{k-1}$ of the parent cluster $C_v^{k-1}$ at the previous level: $\Omega_u^k = \Omega_u^k \cap \Omega_v^{k-1}$.

5. Map the centers $x_i^{k+1}$ of all subclusters $C_i^{k+1} \in C_u^k$ (or the images $O_i$ directly) at level $k+1$ to 2D vectors $y_i^{k+1}$, using the boundaries $\Omega_u^k$ of the cluster and applying the following recurrence relation:

$$y_i^{k+1}(t+1) = y_i^{k+1}(t) + \frac{2\alpha}{\sum_{i<j}^{N} d_{ij}} \sum_{\substack{j=1 \\ j \ne i}}^{N} \frac{d_{ij} - d^*_{ij}}{d_{ij}\, d^*_{ij}} \left(y_i^{k+1}(t) - y_j^{k+1}(t)\right) + \varphi\left(\Omega_u^k, y_i^{k+1}(t)\right)$$

6. Apply the procedure described in points 3-5 recursively to map the child clusters $C_i^{k+1}$.

Page 14: E . V. Myasnikov

Proposed Methods: Modifications for CMR

The function $\varphi(\Omega, y)$, the boundary correction term of the CMR recurrence relation, can be selected by minimizing a functional consisting of the Sammon error and a boundary function.

Two models were considered:

1. Full correction rule (CMR-1): if $y_i$ exceeds the bounds of the cluster, the correction value ensures that $y_i$ lies on the boundary at the next step

2. Piecewise linear rule (CMR-2): the correction value provides “attraction” toward the center of the cluster or toward the boundary when $y_i$ comes near or exceeds the boundary

[Figure: example of CMR-1]
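The full correction rule can be illustrated with a deliberately simplified geometry. The sketch assumes that cluster boundaries are axis-aligned rectangles `(xmin, ymin, xmax, ymax)`; the slides do not state the actual shape of the boundaries, and the function names are hypothetical. `intersect` mirrors the step in which a cluster's boundaries are completed with those of its parent, and `full_correction` returns the vector that moves a tentative position back onto the boundary.

```python
def intersect(box_a, box_b):
    """Intersection of two boxes given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xmax, ymax = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    if xmin > xmax or ymin > ymax:
        raise ValueError("boxes do not overlap")
    return (xmin, ymin, xmax, ymax)


def full_correction(box, y):
    """Correction vector: zero inside the box, otherwise back to the boundary."""
    cx = min(max(y[0], box[0]), box[2])  # clamp x into [xmin, xmax]
    cy = min(max(y[1], box[1]), box[3])  # clamp y into [ymin, ymax]
    return (cx - y[0], cy - y[1])
```

Adding the returned vector to the position obtained after the Sammon update keeps every point inside the region of its parent cluster, which is the behaviour the CMR-1 rule describes.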

Page 15: E . V. Myasnikov

Experimental Research

Err = average error value, Dev = mean square deviation of error, Time = average operation time

         1000 images                      5000 images
Method   Err       Dev        Time       Err       Dev        Time
PCA      0.1171    0.01449    2          0.1148    0.009690   11
CS       0.02880   0.005668   62         0.02592   0.001657   1880
CM       0.03407   0.002638   17         0.06220   0.028756   31
MCM      0.02767   0.002047   19         0.02840   0.001962   67
CMR-1    0.02972   0.002218   13         0.03143   0.001890   31
CMR-2    0.03494   0.002590   60         0.03723   0.002076   276

Page 16: E . V. Myasnikov


Example of MCM

sample size: 10 000 images

Page 17: E . V. Myasnikov


Example of CMR

sample size: 10 000 images

Page 18: E . V. Myasnikov

Example of navigation (CMR)

[Figure: region “a” and region “b”]

Page 19: E . V. Myasnikov

Example of navigation (CMR)

[Figure: region “c” and region “d”]

Page 20: E . V. Myasnikov

Selection of features

Main requirement for the feature system:
• The configuration of images in the navigation space must be understandable to the user

Note: “Measures that are more effective for retrieval tend to be more complex, and thus lose their advantage over the simpler measures when forced into two dimensions” (K. Rodden, W. Basalaj, D. Sinclair, K. Wood. A comparison of measures for visualising image similarity. In The Challenge of Image Retrieval, British Computer Society Electronic Workshops in Computing, 2000)

Features: color histograms in the CIE L*a*b* color space

Metric: Euclidean
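A color-histogram feature with a Euclidean metric could be computed along these lines. The sketch assumes that pixels have already been converted to CIE L*a*b* (the conversion from RGB is not shown), that L* lies in [0, 100] and a*, b* roughly in [-128, 128), and that 4 bins per channel are used; none of these values comes from the slides.

```python
import math


def lab_histogram(pixels, bins=4):
    """Normalised 3-D histogram of (L, a, b) pixels, flattened to a list."""
    hist = [0.0] * (bins ** 3)

    def bin_index(value, low, span):
        # Map the value to a bin and clamp out-of-range values to the edge bins.
        return max(0, min(int((value - low) / span * bins), bins - 1))

    for L, a, b in pixels:
        li = bin_index(L, 0.0, 100.0)
        ai = bin_index(a, -128.0, 256.0)
        bi = bin_index(b, -128.0, 256.0)
        hist[(li * bins + ai) * bins + bi] += 1.0
    return [h / len(pixels) for h in hist]


def histogram_distance(h1, h2):
    """Euclidean distance between two histograms of equal length."""
    return math.dist(h1, h2)
```

Normalising by the pixel count makes images of different sizes comparable, so the distance depends only on the color distribution.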

Page 21: E . V. Myasnikov

Conclusions

• The requirements for the navigation method are considered
• A novel navigation method is proposed
• A novel combined method and its modifications for dimensionality reduction are proposed
• The proposed methods are compared with known methods
• The results of an experimental analysis of the methods used are presented

Future plans

• Exploring new feature systems
• Method improvement
• Estimation of the effectiveness of the navigation method, including expert estimation

Page 22: E . V. Myasnikov

Acknowledgements

This work was financially supported by Yandex (www.yandex.ru).

The dataset “Image database” was provided by Yandex.

THANK YOU FOR YOUR ATTENTION