
Content Based Image Retrieval : An Introduction

9/22/2014 Prof. M. K. Kundu, MIU,ISI

Malay K. Kundu
Machine Intelligence Unit
Indian Statistical Institute

Kolkata http://www.isical.ac.in/~malay

Introduction

• Due to the revolutionary growth of digital imaging and internet technology, the development of efficient and intelligent schemes for retrieving images from large collections has become an important research issue. Two important approaches in this regard are:

• Content Based Image Retrieval (CBIR).
• Content Based Image Retrieval with relevance feedback.


The Problem and Motivation

• There are now billions of images on the web and in collections such as Flickr and Google Images.

• Suppose I want to find pictures of monkeys with an existing search model.


Google Image Search -- monkey


Metadata based retrieval systems

• Metadata based retrieval systems
  – text, click-rates, etc.
  – Google Images
  – clearly not sufficient

• What if computers understood images?
  – Content based image retrieval (early 90’s)
  – search based on the image content


Top 12 retrieval results for the query ‘Mountain’

Content Based Image Retrieval(CBIR)

• The process of retrieving relevant images from an image database (or distributed databases) on the basis of primitive (e.g., color, texture, shape) or semantic image features extracted automatically is known as Content Based Image Retrieval.


Problems of Information extraction in CBIR

• It differs generically from conventional information retrieval/data mining (DM) for the following reasons:
  – Unstructured nature of image databases
  – Contains pixel intensities with no inherent meaning
  – Any kind of reasoning about image content is possible only after extraction of some useful image information (e.g., presence of primitive or semantic features)


What is Unstructured Data?

• Any data without a well‐defined model for information access

• Examples:
  – Image, video, sound
  – Word documents
  – E-mails

• Examples of what is structured:
  – Database tables
  – Objects
  – XML tags


Unstructured data

• Unstructured data consists of text, audio, images, etc.

• Unstructured data contains significant scientific and commercial information

• Technologies and tools are being developed for efficient extraction of information in unstructured data

• Problems of information retrieval from unstructured data are not yet fully understood. A paradigm shift is needed.


UDM Increases Informational Content


[Figure: Structured data (10–40%) + Unstructured data (60–90%)]

Representation of Image


Two important questions for content-based image retrieval:
• How are images represented? => features
• How are image representations compared? => distance/similarity measures

CBIR: Visual Signature

• Image is indexed by its visual content. Visual content is described by:
• Low-level features (color, texture, shape, layout, etc.).
• High-level descriptors (face, iconic pattern, context, spatial reasoning, semantic information, etc.).


Animal Image Database


Example 1:

Lion              Rabbit


Lion  Color? Yes Rabbit


Horse  Color? No Rabbit


Horse Shape? Yes Rabbit


Horse  Shape? No Zebra


Horse Texture? Yes Zebra


Lion Color? No Rabbit


Lion Shape? Yes Rabbit


Lion  Color? No Horse


Lion   Shape? Yes Horse


Horse  Color? Different Horse


Deer  Texture? Different Deer


Content Based Image Retrieval (CBIR)


[Diagram: the User submits a Query Image; the system searches the Image Database and returns the Result.]

Content based image retrieval - 1

• Query by Visual Example (QBVE)

– user provides query image

– system extracts image features (texture, color, shape)

– returns nearest neighbors using suitable similarity measure
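The QBVE loop described above can be sketched as a simple nearest-neighbor search. A minimal sketch; the toy feature vectors and image names below are illustrative assumptions, not values from the slides:

```python
import math

def euclidean(a, b):
    # distance between two feature vectors (e.g., concatenated
    # color/texture/shape descriptors)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, database, k=2):
    # database: list of (image_id, feature_vector) pairs;
    # return the k nearest neighbors under the chosen measure
    ranked = sorted(database, key=lambda item: euclidean(query, item[1]))
    return [image_id for image_id, _ in ranked[:k]]

db = [("lion",  [0.90, 0.10, 0.20]),
      ("horse", [0.40, 0.40, 0.30]),
      ("zebra", [0.50, 0.50, 0.10])]
print(retrieve([0.85, 0.15, 0.20], db))  # -> ['lion', 'zebra']
```

Any of the distance measures listed later in the deck can be swapped in for `euclidean` without changing the retrieval loop.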


[Figure: examples of texture similarity, color similarity, and shape similarity]

The Semantic Gap

• First-generation CBIR systems were based on color and texture; however, these do not capture what users really care about: conceptual or semantic categories.

• Perception studies suggest that the most important cue to visual categorization is shape. This was ignored in earlier work (because it was hard!)


Content based image retrieval • Semantic Retrieval (SR)

– User provides a query text (keywords); the system finds images that contain the associated semantic concept.
– Emerged around the year 2000: model semantic classes, learn to annotate images.
– Provides a higher level of abstraction and supports natural-language queries.

query: “people, beach”

Text based Semantic Retrieval (SR)

• Problem of lexical ambiguity
  – multiple meanings of the same word

• Anchor - TV anchor or for Ship?

• Bank - Financial Institution or River bank?

• Multiple semantic interpretations of an image
  – Boating or fishing or people?

• Limited by vocabulary size
  – What if the system was not trained for ‘Fishing’?
  – In other words, it is outside the space of trained semantic concepts


Lake? Fishing? Boating? People?

Fishing! what if not in the vocabulary?

Summary

• SR: higher level of abstraction

– Better generalization inside the space of trained semantic concepts

– But problem of 

• Lexical ambiguity 

• Multiple semantic interpretations

• Vocabulary size 

• QBVE is unrestricted by language.
  – Better generalization outside the space of trained semantic concepts

• a query image of ‘Fishing’ would retrieve visually similar images.

– But weakly correlated with human notion of similarity


Both have visually dissimilar sky

Fishing! what if not in the vocabulary?

Lake? Fishing? Boating? People?

The two systems in many respects are complementary!

What should be the Research Focus ?

• Automatically generate annotations corresponding to object labels or activities in an image

• If possible, combine these with other metadata such as text.


CBIR

• Content‐Based Image Retrieval

– Query‐by‐Example(QBE)

– Query‐by‐Feature (QBF)
– Feature Vector


CBIR Architecture


[Architecture diagram: a query image enters through the User Interface; Feature Extraction produces the Image Representation; Query Comparison applies the Similarity Metric against the Database (built by Database Creation from image data); results are returned for Image Browsing.]

Example: CBIR of Butterflies

• To allow non‐expert users to find the possible species of butterflies they saw, based on the butterflies’ appearance.

• The appearance:– Color, Texture, Shape


Problems

• How can you describe a butterfly?

• How can you communicate with a machine?


Problems

• Different users have different perceptions.
• Users may not remember the appearance of the butterfly clearly.
• Users usually do not have enough knowledge to describe butterflies like experts.
• Users usually do not have the patience to browse too many query results.


Possible Solutions

• A user‐driven interactive query process: QBF/QBE (Query By Features / Query By Example)

• Fuzzy feature description for each butterfly

• A “What You See Is What You Get” query interface

• A representative set for a collection of butterflies


QBF/QBE query process

• QBF query:
  – A QBF query chooses some features of butterflies and expects the system to return all butterflies with those features.

– Features of butterflies:• Dominant color, texture pattern, shape.

• QBE query:
  – A QBE query points to an image and expects the system to return all butterflies similar to it.


Feature Description (1)

• Feature Description for a butterfly:
  – Like metadata describing the appearance of this butterfly.
  – This makes QBF queries possible.
  – A Feature Description consists of some feature descriptors.
• Feature descriptor:
  – A (“feature value”, “match level”) pair.


Feature Description (coarse)


Feature Type | Feature Value               | Degree of Match
Color        | mixed_with_black_and_orange | 52/57
             | orange_yellow               | 12/42
             | orange_red                  | 3/38
Texture      | many_spots                  | 58/62
             | fore_half_different_color   | 27/33
             | horizontal_bands            | 41/60
             | edge_with_different_color   | 10/74
Shape        | wave                        | 98/110

Common Color Features

• Most commonly used color features are:
• Pixel values in RGB or an alternative space like HSV, YCbCr, etc., with uniform quantization.
• Color histogram: joint probability density of the intensities of the three channels, with histogram intersection as the similarity measure.
• Color moments: low-order moments (mean, variance, skewness) characterizing the color distribution, with a Euclidean distance measure.
• Invariant descriptors: invariance to illumination changes.
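The color histogram with histogram-intersection matching can be sketched as below. The 4-bins-per-channel quantization and the toy single-color images are illustrative assumptions:

```python
def color_histogram(pixels, bins=4):
    # quantize each RGB channel (values 0-255) into `bins` bins,
    # count pixels per combined bin, and normalize to sum to 1
    hist = [0.0] * bins ** 3
    for r, g, b in pixels:
        idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
        hist[idx] += 1
    return [h / len(pixels) for h in hist]

def histogram_intersection(h1, h2):
    # sum of bin-wise minima: 1.0 for identical normalized
    # histograms, 0.0 for histograms with no bins in common
    return sum(min(a, b) for a, b in zip(h1, h2))

red_image   = [(250, 10, 10)] * 4
green_image = [(10, 250, 10)] * 4
print(histogram_intersection(color_histogram(red_image),
                             color_histogram(red_image)))    # 1.0
print(histogram_intersection(color_histogram(red_image),
                             color_histogram(green_image)))  # 0.0
```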


What color is the apple ? We are so visual !!!!


“I’d say it is bright red.” “I think it is ‘crimson’.” “It is red!” “I really couldn’t tell you (I am color blind).”

RGB Color Space

• Hardware-oriented model: 3 values to represent a color.

Red Green Blue


HSV Color Space

• HSV – Hue, Saturation, Value
  – close to human perception

– 3 values to represent a color.


[Figure: HSV cone. Hue wheel: Red (0°), Yellow (60°), Green (120°), Cyan (180°), Blue (240°), Magenta (300°); Saturation increases outward; Value runs from Black to White.]
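The hue angles on the wheel can be sanity-checked with Python’s standard `colorsys` module (a small sketch; `colorsys` works on RGB floats in [0, 1] and returns hue in [0, 1), so multiplying by 360 gives the angle):

```python
import colorsys

# pure primaries/secondaries land exactly on the hue-wheel angles
for name, rgb in [("red",   (1.0, 0.0, 0.0)),
                  ("green", (0.0, 1.0, 0.0)),
                  ("cyan",  (0.0, 1.0, 1.0)),
                  ("blue",  (0.0, 0.0, 1.0))]:
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    print(name, round(h * 360))  # red 0, green 120, cyan 180, blue 240
```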

YCbCr Color Space

• Y is the luminance; Cb and Cr are the chrominance values of this color space.

• Decouples intensity and color information

• A monochrome color representation has only the Y value.

• Very close to Perceptual Model
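The decoupling of intensity (Y) from chrominance (Cb, Cr) can be sketched with the common ITU-R BT.601 full-range conversion; the slide does not specify which YCbCr variant is meant, so BT.601 is an assumption here:

```python
def rgb_to_ycbcr(r, g, b):
    # ITU-R BT.601 full-range conversion, RGB inputs in [0, 255];
    # Cb and Cr are centered at the neutral value 128
    y  =         0.299    * r + 0.587    * g + 0.114    * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128.0 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# a gray pixel carries only luminance: Cb and Cr sit at neutral 128
print(rgb_to_ycbcr(128, 128, 128))  # approximately (128.0, 128.0, 128.0)
```

A monochrome representation keeps only the first returned value (Y).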


CIE-LAB Uniform Color Space

• Uniform color spaces: CIE Lab
• The CIE addressed the perceptual non-uniformity of earlier color spaces in 1976 with the development of the Lab color space.


CBIR: Major Challenges

• Major developmental Issues• Efficient feature extraction and similarity measure.

• Multi‐dimensional indexing for multi‐level image queries

• Relevance feedback network for automatic extraction of high-level knowledge.

• Efficient and adaptive retrieval system design. 


Distance Measures

• Heuristic
  – Minkowski‐form
  – Weighted‐Mean‐Variance (WMV)
• Nonparametric test statistics
  – χ² (Chi-square)
  – Kolmogorov‐Smirnov (KS)
  – Cramer/von Mises (CvM)
• Information-theoretic divergences
  – Kullback‐Leibler (KL)
  – Jeffrey divergence (JD)
• Ground distance measures
  – Histogram intersection
  – Quadratic form (QF)
  – Earth Mover's Distance (EMD)
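Two of the listed measures, the Minkowski form and the χ² statistic, compute directly on histogram vectors. A minimal sketch; the `eps` guard against empty bins and the toy histograms are my additions:

```python
def minkowski(h1, h2, p=2):
    # Minkowski-form distance: p=1 is city-block (L1), p=2 Euclidean
    return sum(abs(a - b) ** p for a, b in zip(h1, h2)) ** (1.0 / p)

def chi_square(h1, h2, eps=1e-10):
    # symmetric chi-square statistic between two histograms;
    # eps avoids division by zero when both bins are empty
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))

h1 = [0.5, 0.5, 0.0]
h2 = [0.0, 0.5, 0.5]
print(minkowski(h1, h2, p=1))        # 1.0
print(round(chi_square(h1, h2), 6))  # 1.0
```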


Texture

Texture refers to visual patterns that have properties of human perception: homogeneity, contrast, roughness, coarseness, directionality (sense of orientation), regularity (periodic or quasi-periodic distribution), and line-likeness.


Texture

• An innate property of virtually all natural objects (like clouds, trees, hair, wood, stones, etc.) and scenes.


Common Texture features:

Co‐occurrence Matrix
• Explores spatial dependence of gray level or color.
• Frequency counts in a 2‐D array for pairs of pixels at a given direction and distance.
• Statistical features extracted from the matrix are used as descriptors.

Wavelet-based multi‐scale features
• Statistical features extracted from sub‐bands.

Gabor Filter based technique.
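The co-occurrence matrix for one (direction, distance) offset, with contrast as one of the statistical features extracted from it, can be sketched as below; the tiny two-level striped image is an illustrative assumption:

```python
def cooccurrence(image, dx=1, dy=0, levels=2):
    # count pairs of gray levels at pixel (i, j) and its neighbor at
    # offset (dy, dx); image is a 2-D list of quantized gray levels
    glcm = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for i in range(rows):
        for j in range(cols):
            i2, j2 = i + dy, j + dx
            if 0 <= i2 < rows and 0 <= j2 < cols:
                glcm[image[i][j]][image[i2][j2]] += 1
    return glcm

def contrast(glcm):
    # one statistical descriptor: weights each entry by the
    # squared gray-level difference of the pixel pair
    n = len(glcm)
    return sum((i - j) ** 2 * glcm[i][j] for i in range(n) for j in range(n))

stripes = [[0, 1, 0, 1],
           [0, 1, 0, 1]]
g = cooccurrence(stripes)  # horizontal neighbors always differ
print(contrast(g))         # 6
```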


MPEG-7 CBIR Texture Feature

• Performance of different CBIR systems relies greatly on the notion of texture, which may differ between systems. For standardization, MPEG‐7 has recommended several texture feature models:
  – Edge Histogram Descriptor (EHD)
  – Homogeneous Texture Descriptor (HTD)
  – Texture Browsing Descriptor (TBD)


Shape


• Shape of a visual object is one of the most powerful signatures for the human perception mechanism.
• It should remain invariant under different geometric transformations (rotation, scaling, and translation).

Two categories of methods:
• Boundary-based (using only the outer boundary): Fourier descriptor, curvature function, and curvature scale space
• Region-based (using the shape of the entire region)
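A boundary-based Fourier descriptor can be sketched as follows: the contour points are read as complex numbers, and the magnitudes of the low-order DFT coefficients (which are insensitive to rotation and starting point) form the signature. This is one standard construction, not necessarily the variant used in the cited work; the sampled circle is an illustrative assumption:

```python
import cmath
import math

def fourier_descriptor(boundary, n_coeffs=4):
    # boundary: ordered (x, y) points along the outer contour,
    # read as complex numbers z_t = x_t + i*y_t
    z = [complex(x, y) for x, y in boundary]
    n = len(z)
    mags = []
    for k in range(1, n_coeffs + 1):
        c = sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) / n
        mags.append(abs(c))  # magnitude drops rotation/start-point phase
    # divide by the first magnitude for scale invariance
    return [m / mags[0] for m in mags]

# sanity check: for a sampled circle all energy sits in the first coefficient
circle = [(math.cos(2 * math.pi * t / 16), math.sin(2 * math.pi * t / 16))
          for t in range(16)]
d = fourier_descriptor(circle)
print([round(m, 6) for m in d])  # [1.0, 0.0, 0.0, 0.0]
```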

Region Based shape Signatures

• Binary and gray level moments

• Zernike Moments

• Grid based descriptor

• Object bounding box.

• Segmentation of different regions (which may not be meaningful objects).


Results: based on shape with a rotation-invariant feature.


M. Banerjee and M. K. Kundu, “Edge based features for content based image retrieval”, Pattern Recognition, Vol. 36, No. 11, pp. 2649-2661, 2003.

Relevance Feedback


The key idea of the interactive relevance feedback technique is that human perceptual subjectivity is incorporated into the retrieval process, providing users with the opportunity to evaluate the retrieval results. Queries or similarity measures are automatically refined on the basis of these evaluations.

Relevance Feedback


Relevance feedback is a supervised active learning technique used to improve the effectiveness of information systems.

It uses positive and negative examples from the user to improve system performance. For a given query, the system first retrieves a list of images ranked according to a predefined similarity metric.

Then the user marks the retrieved images as relevant to the query (positive examples) or irrelevant (negative examples).

The system will refine the query based on the feedback, retrieve a new list of images, and present them to the user. The key issue in relevance feedback is how to use positive and negative examples to refine the query and/or to adjust the similarity measure.
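One classical way to use the positive and negative examples to refine the query is Rocchio-style query-point movement, sketched below. The slides do not name this specific rule, so the rule itself and the alpha/beta/gamma weights are illustrative assumptions:

```python
def mean(vectors, dim):
    # centroid of a set of feature vectors (zeros if the set is empty)
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def refine_query(query, relevant, irrelevant,
                 alpha=1.0, beta=0.75, gamma=0.25):
    # move the query feature vector toward the centroid of the
    # positive examples and away from the centroid of the negatives
    dim = len(query)
    pos = mean(relevant, dim)
    neg = mean(irrelevant, dim)
    return [alpha * q + beta * p - gamma * m
            for q, p, m in zip(query, pos, neg)]

old = [0.2, 0.8]
new = refine_query(old, relevant=[[0.6, 0.4], [0.8, 0.2]],
                   irrelevant=[[0.0, 1.0]])
print(new)  # pulled toward the relevant centroid, pushed off the negative
```

The refined vector then replaces the query in the next retrieval round; adjusting the similarity measure's feature weights (as in the FEI scheme later in the deck) is the complementary option.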

Performance Measure

Precision is defined as:
  (Number of relevant images retrieved) / (Total number of images retrieved)

Recall is defined as:
  (Number of relevant images retrieved) / (Number of relevant images in the database)

* The recall-precision graph is another important measure of performance.
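The two definitions compute directly; the toy image IDs below are illustrative:

```python
def precision(retrieved, relevant):
    # fraction of the retrieved images that are relevant
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved)

def recall(retrieved, relevant):
    # fraction of all relevant images in the database that were retrieved
    hits = len(set(retrieved) & set(relevant))
    return hits / len(relevant)

retrieved = ["img01", "img02", "img03", "img04"]
relevant  = ["img01", "img03", "img07", "img09", "img11"]
print(precision(retrieved, relevant))  # 0.5  (2 of 4 retrieved are relevant)
print(recall(retrieved, relevant))     # 0.4  (2 of 5 relevant were found)
```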


Interactive Image Retrieval with M- band Wavelet Features


Aims & Motivation
• Using an effective representation (features) of the image that supports the human visual system.
• Improving the retrieval results using a fuzzy relevance feedback mechanism (RFM) with automatic feature-weight updating.
• Incorporating the Earth Mover's Distance (EMD) in the RFM.

PROPOSED CBIR SYSTEM

[System diagram: M-band wavelet feature extraction feeds the search engine (retrieval block) over the image database; the user interacts with the query image and retrieval results; the error in relevance drives the fuzzy relevance feedback block, which performs fuzzy feature evaluation and updates EMD weights/distances based on relevance.]

Proposed Algorithm
1. M-band wavelet features are computed for an input image.
2. The EMD similarity measure is applied to the M-band wavelet feature distances, and a retrieval is made.
3. From the first stage of retrieval, the user marks the relevant and irrelevant sets.
4. The intraset and interset ambiguity is computed.
5. The feature evaluation index (FEI) for each component of each plane (Y-Cb-Cr) is computed from the ambiguity measures.
6. The weight of each component is updated automatically according to the FEI as follows: w_q = 2 · FEI_q
7. The next iteration is started with the updated features.

Implementation
The experiments were performed on a Dell Precision T7400 with 4 GB RAM. The performance of the image retrieval system is tested on two databases:

A) SIMPLIcity:1000 images in 10 categories (People, Beach, Buildings, Bus, Dinosaur, Elephant, Flower, Horses, Mountains and Food ).

B) Corel 10000 miscellaneous database:9908 images belonging to 79 semantic categories.

Results are compared with the Color Structure Descriptor (CSD) and Edge Histogram Descriptor (EHD) included in MPEG-7 standard. MPEG-7 Reference Software provided by ISO is used for the computation of CSD and EHD.

The Euclidean similarity measure is used with the CSD and EHD features, and the EMD similarity measure with the M‐band wavelet features for enhanced accuracy.

EXPERIMENTAL RESULTS
M-BAND + ED ON SIMPLICITY DATABASE (FIRST PASS) (5/20)

EHD +ED (FIRST PASS) (1/20)

CSD+ED (FIRST PASS) (5/20)

CONTINUED…
1st ITERATION ON M-BAND + ED (6/20)

1st ITERATION ON EHD+ED (2/20)

1st ITERATION ON CSD+ED (6/20)

CONTINUED…

CSD+ED (FIRST PASS) (5/20)

EHD +ED (FIRST PASS) (1/20)

M-BAND + EMD ON SIMPLICITY DATABASE (FIRST PASS) (6/20)

CONTINUED…

1st ITERATION EHD+ED (2/20)

1st ITERATION CSD+ED (6/20)

1st ITERATION ON M-BAND + EMD (8/20): RANKING IMPROVED

CONTINUED…
M-BAND + EMD ON COREL DATABASE (FIRST PASS) (6/20)

EHD + ED(FIRST PASS) (1/20)

CSD +ED(FIRST PASS)(2/20)

CONTINUED…
1st ITERATION ON M-BAND + EMD (7/20)

1st ITERATION ON CSD +ED(2/20)

1st ITERATION ON EHD +ED (1/20)

CONTINUED…
2nd ITERATION ON M-BAND + EMD (8/20)

2nd ITERATION ON EHD +ED (1/20)

2nd ITERATION ON CSD +ED(2/20)

GRAPHICAL INTERPRETATION

AVERAGE PRECISION OF CSD, EHD & M-BAND vs. NUMBER OF IMAGES DISPLAYED ON SIMPLICITY DATABASE

AVERAGE PRECISION vs. NUMBER OF ITERATIONS

Average precision is computed on 20 displayed images.



Conclusions

• CBIR in general is still very much a research topic at present. The technology is exciting but has yet to achieve a good degree of maturity. The available commercial systems have limited capability in restricted domains.

• Medical image retrieval research is a very important application of CBIR, considering the huge amount of visual data generated for diagnostic purposes. But it still remains an academic exercise due to the lack of integration between visual and clinical information.



Thank You