HANOLISTIC: A HIERARCHICAL AUTOMATIC IMAGE ANNOTATION SYSTEM USING HOLISTIC APPROACH

Özge Öztimur Karadağ & Fatoş T. Yarman Vural

Department of Computer Engineering, Middle East Technical University, Ankara, Turkey

AUTOMATIC IMAGE ANNOTATION

Image annotation: assigning keywords to digital images. Manual annotation is labor intensive and time consuming.

Need a system that automatically annotates images.

IMAGE ANNOTATION LITERATURE

The annotation problem has been popular since the 1990s.

Related to content-based image retrieval (CBIR):
CBIR processes visual information.
Annotation processes visual and semantic information.
The goal is to relate visual content information to semantic context information.

PROBLEMS IN AUTOMATIC IMAGE ANNOTATION

Human subjectivity
Semantic gap
Availability of datasets

IMAGE ANNOTATION APPROACHES IN THE LITERATURE…

Segmental approaches:
1. Segment or partition the image into regions.
2. Extract features from the regions.
3. Quantize features into blobs.
4. Model the relation between the image regions and annotation words.

Holistic approaches: features are extracted from the whole image.

SEGMENTAL ANNOTATION: SKY, BOAT, SEA, TREE

HOLISTIC ANNOTATION: TIGER, TREE, SNOW

THE PROPOSED SYSTEM: HANOLISTIC

Semantic information is introduced as supervision: each word is treated as a class label, and an image belongs to one or more classes.

Holistic approach: multiple visual features are extracted from the whole image, giving multiple feature spaces.
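The "each word is a class label" idea can be illustrated with a multi-hot encoding. This is a minimal sketch, not the authors' code: the function name `encode_annotations` and the six-word vocabulary are hypothetical and chosen only for the example.

```python
def encode_annotations(words, vocabulary):
    """Return a multi-hot vector: 1 where the image carries the vocabulary word."""
    present = set(words)
    return [1 if word in present else 0 for word in vocabulary]


# Hypothetical vocabulary; the real system uses the words of the data set.
vocabulary = ["boat", "sea", "sky", "snow", "tiger", "tree"]

# An image annotated with several words belongs to several classes at once.
target = encode_annotations(["tiger", "tree", "snow"], vocabulary)
```

Each position in `target` corresponds to one class, so an image with three annotation words has three positions set.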

DESCRIPTION OF AN IMAGE

Content description by MPEG-7 visual features:
Color Layout
Color Structure
Scalable Color
Homogeneous Texture
Edge Histogram

Context description by semantic words:
Annotation words

SYSTEM ARCHITECTURE OF HANOLISTIC

Level-0: consists of level-0 annotators, one annotator for each visual description space.

Meta-level: consists of a meta-annotator.

LEVEL-0 ANNOTATOR

Input: the features of the i-th image in the j-th description space.

Output: the membership value of the l-th word for the i-th image in the j-th description space.

META-LEVEL

The results of the level-0 annotators are aggregated.

The result is a vector of final word membership values for the i-th image.

EXPERIMENTAL STUDIES

Realization of HANOLISTIC:
Instance-based realization of Level-0
Eager realization of Level-0
Realization of Meta-level

Performance criteria

Results

EXPERIMENTAL SETUP

Data set: a subset of the Corel Stock Photo Collection, consisting of 5000 images.
Training set: 4500 images (500 images for validation)
Testing set: 500 images

Each image is annotated with 1 to 5 words.

INSTANCE-BASED REALIZATION OF LEVEL-0 ANNOTATOR BY FUZZY-KNN

Level-0 annotators are realized by fuzzy-kNN. For each description space, the k nearest neighbors of the image are determined. Word membership values are estimated from the neighbors' words and their distances from the image.

High membership values are assigned to words that appear in the close neighborhood.
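A level-0 annotator of this kind might be sketched as follows. The slides do not give the weighting formula, so this sketch assumes the common inverse-distance weighting of fuzzy k-nearest-neighbor classifiers with fuzzifier m = 2, Euclidean distance, and crisp (0/1) neighbor labels. The function name, parameters, and toy data are all hypothetical.

```python
import math


def fuzzy_knn_memberships(query, train_features, train_words, vocabulary,
                          k=3, m=2.0, eps=1e-9):
    """Return {word: membership in [0, 1]} for `query` in one description space."""
    # Euclidean distance from the query to every training image.
    distances = [(math.dist(query, feats), idx)
                 for idx, feats in enumerate(train_features)]
    neighbors = sorted(distances)[:k]
    # Assumed weighting: inverse distance raised to 2 / (m - 1);
    # eps avoids division by zero for an exact match.
    weights = [(1.0 / (d ** (2.0 / (m - 1.0)) + eps), idx)
               for d, idx in neighbors]
    total = sum(w for w, _ in weights)
    # A word's membership is the weight share of the neighbors carrying it.
    return {
        word: sum(w for w, idx in weights if word in train_words[idx]) / total
        for word in vocabulary
    }


# Toy data in a 2-D description space (real descriptors are much longer).
train_features = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
train_words = [{"sky", "sea"}, {"sky", "boat"}, {"tiger"}]
vocabulary = ["sky", "sea", "boat", "tiger"]

memberships = fuzzy_knn_memberships((0.1, 0.0), train_features, train_words,
                                    vocabulary, k=2)
```

With k = 2 the distant "tiger" image is not among the neighbors, "sky" is shared by both neighbors, and "sea" outranks "boat" because its image is closer to the query.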

EAGER REALIZATION OF LEVEL-0 BY ANN

For a given image I_i, the ANN receives the visual description of the image as input and the semantic annotation words of the image as target.

Each ANN is trained with backpropagation, and a randomly selected set of images is used for validation to determine when to stop training. K-fold cross validation is applied.
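Validation-based stopping can be sketched independently of the network itself. The slides do not state the stopping criterion, so the patience rule below is an assumption, and `early_stopping_epoch` and the error values are hypothetical.

```python
def early_stopping_epoch(validation_errors, patience=3):
    """Return the index of the epoch whose weights would be kept.

    Assumed rule: stop once the validation error has not improved
    for `patience` consecutive epochs, and keep the best epoch so far.
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break
    return best_epoch


# Made-up validation errors, one per training epoch.
errors = [0.9, 0.7, 0.6, 0.65, 0.62, 0.7, 0.5]
kept = early_stopping_epoch(errors, patience=3)
```

In this run training stops at epoch 5, three epochs after the best one, so the later improvement at epoch 6 is never seen. That is the usual trade-off of a patience rule.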

REALIZATION OF META-LEVEL BY MAJORITY VOTING

The meta-level adds the membership values returned by the level-0 annotators: the final membership vector for the i-th image is the sum over j of P_{i,j}, where P_{i,j} is a vector containing the word membership values returned from the j-th level-0 annotator.

For each word, select the maximum of the five word membership values estimated by the level-0 annotators.
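The slide mentions both adding the level-0 membership vectors and taking the per-word maximum. The sketch below shows each as a separate aggregation rule, since the slides do not say how the two are combined. The function names, the three-word vocabulary, and the membership values are hypothetical.

```python
def aggregate_sum(level0_outputs):
    """Element-wise sum of the membership vectors P_{i,j} over annotators j."""
    return [sum(values) for values in zip(*level0_outputs)]


def aggregate_max(level0_outputs):
    """Element-wise maximum of the membership vectors over annotators j."""
    return [max(values) for values in zip(*level0_outputs)]


def top_words(memberships, vocabulary, n):
    """Return the n words with the highest final membership values."""
    ranked = sorted(zip(memberships, vocabulary), reverse=True)
    return [word for _, word in ranked[:n]]


vocabulary = ["tiger", "tree", "snow"]

# One membership vector per level-0 annotator (five description spaces).
level0_outputs = [
    [0.5, 0.25, 0.0],
    [0.75, 0.0, 0.25],
    [0.5, 0.5, 0.0],
    [1.0, 0.25, 0.0],
    [0.25, 0.0, 0.5],
]

summed = aggregate_sum(level0_outputs)
maxed = aggregate_max(level0_outputs)
annotation = top_words(summed, vocabulary, n=2)
```

Summing rewards words that several description spaces agree on, while the maximum lets a single confident annotator decide.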

PERFORMANCE CRITERIA

Precision

Recall

F-score
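The slide lists the criteria without their formulas. The sketch below assumes the per-word definitions commonly used for annotation on the Corel data set. Precision is the fraction of images the system labels with a word that truly carry it. Recall is the fraction of images truly carrying the word that the system labels with it. F-score is their harmonic mean. The function name and toy data are hypothetical.

```python
def word_scores(word, predicted, truth):
    """Per-word precision, recall and F-score.

    predicted, truth: lists of word sets, one set per test image.
    """
    retrieved = sum(1 for p in predicted if word in p)
    relevant = sum(1 for t in truth if word in t)
    correct = sum(1 for p, t in zip(predicted, truth)
                  if word in p and word in t)
    precision = correct / retrieved if retrieved else 0.0
    recall = correct / relevant if relevant else 0.0
    denominator = precision + recall
    f_score = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f_score


predicted = [{"sky", "sea"}, {"sky"}, {"tiger"}, {"sky", "tree"}]
truth = [{"sky", "boat"}, {"sea"}, {"tiger", "tree"}, {"sky"}]

precision, recall, f_score = word_scores("sky", predicted, truth)
```

Here "sky" is predicted for three images and is correct for two of them, and both images that truly contain "sky" are found.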

PERFORMANCE OF LEVEL-0 ANNOTATORS

Performance of Level-0 annotators with fuzzy-knn

PERFORMANCE OF HANOLISTIC

Comparison of HANOLISTIC with other systems in the literature:

ANNOTATION EXAMPLES

CONCLUSION

We proposed a hierarchical automatic image annotation system using a holistic approach.

We tested the system with both an instance-based and an eager method.

We found that instance-based methods are more promising in the considered problem domain.

CONCLUSION…

The power of the proposed system comes from the following main principles:
Simplicity
Fuzziness
Simultaneous processing of content and context information
Holistic view of the image through different perspectives

FUTURE WORK

Conduct experiments on other descriptors.
Test other algorithms at level-0, conforming to the principle of least commitment.
Apply the holistic approach followed by a segmentation process, for annotation or intelligent segmentation.

Thank you

Questions and Comments

REFERENCES

Duygulu, Barnard, de Freitas and Forsyth, 'Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary', in ECCV'02: Proceedings of the 7th European Conference on Computer Vision, 2002.

Jeon, Lavrenko and Manmatha, 'Automatic image annotation and retrieval using cross-media relevance models', in SIGIR'03.

Monay and Perez, 'PLSA-based image auto-annotation: constraining the latent space', in MULTIMEDIA'04.

Akbaş and Vural, 'Automatic image annotation by ensemble of visual descriptors', in CVPR'07.

Feng, Manmatha and Lavrenko, 'Multiple Bernoulli relevance models for image and video annotation', in CVPR'04.

Tang and Lewis, 'Image auto-annotation using 'easy' and 'more challenging' training sets', 7th International Workshop on Image Analysis for Multimedia Interactive Services, 2006.
