Page 1: search engine for images

Search Engine for Images

We are given a large database comprising several categories of images like: people, landscape, flowers, buses, trains, food items, animals etc.

There are numerous pictures in each category.

Page 2: search engine for images

Our Goal

A user can pick any image (not necessarily inside the database) and our program (the search engine) should be able to pull out pictures from the database that are similar to the query image.

Page 3: search engine for images

Examples of similar images

Page 4: search engine for images

Our intention is to produce results as close to human perception as possible.

In other words, our search engine automatically categorizes the pictures.

Page 5: search engine for images

Mathematical Tools Needed

• The Wavelet Transform. A considerable amount of linear algebra (particularly inner product spaces) has to be covered before we can describe the wavelet transform.

• The K-means clustering algorithm. This is a statistical tool that groups data points into K clusters, each represented by the mean of its members.

Page 6: search engine for images

I.T. Tools Needed

We chose MATLAB 7.0 for the following reasons:

• It handles matrices in a convenient manner

• It supports a powerful programming language

• It can handle images in almost every format

Page 7: search engine for images

The Method

The core of this search engine is the Integrated Region Matching (IRM) scheme described by Wang, Li, and Wiederhold in the paper: SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture Libraries, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001.

We now briefly outline the method.

Page 8: search engine for images

A brief outline of the method

1. Partition a picture into 4x4 blocks and extract feature vectors for each block.

2. Group the feature vectors into a number of regions (using K-means).

3. Compute the signature of the image by combining the features of all its regions.

4. Save this signature to a database.

5. Repeat the process for every picture in the database.

Page 9: search engine for images

A brief outline of the method

6. The feature vectors contain information regarding colour, texture and shape.

7. The query image given by the user also undergoes the same signature extracting process.

8. Finally distances between the query signature and each image signature are calculated and sorted to give the closest matches.
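Step 8 can be sketched in Python (rather than the MATLAB used for the project). This is only a minimal illustration: the file names, the flat three-number "signatures" and the Euclidean distance are hypothetical stand-ins, since a real signature here is a set of regions compared with the Integrated Region Matching distance.

```python
import math

def euclidean(a, b):
    """Toy distance between two flat signatures (a stand-in assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_images(query_signature, database, distance=euclidean, top=3):
    """Return the names of the `top` database images closest to the query."""
    scored = sorted((distance(query_signature, signature), name)
                    for name, signature in database.items())
    return [name for _, name in scored[:top]]

# Hypothetical database: image name -> stored signature.
database = {
    "bus_01.jpg": [0.9, 0.1, 0.1],
    "flower_07.jpg": [0.8, 0.2, 0.6],
    "beach_03.jpg": [0.2, 0.6, 0.9],
}
print(rank_images([0.85, 0.15, 0.2], database, top=2))
# prints ['bus_01.jpg', 'flower_07.jpg']
```

Passing the distance as a parameter keeps the ranking step independent of how signatures are compared, so a region-based distance can be swapped in without changing the sort.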

Page 10: search engine for images

In the slides that follow we will make the above ideas precise.

Page 11: search engine for images

What is a Feature vector?

It is simply a list of 6 numbers calculated for each 4x4 block.

• The first three numbers are averages of the Red, Green and Blue components.

• The next three numbers are the root mean squares of the wavelet coefficients in the three detail (high-frequency) sub-bands obtained by applying the wavelet transform once to the rows and columns of the block.
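A feature vector of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's MATLAB code: the Haar wavelet is assumed (the slides do not name the wavelet), the texture part is computed on the plain average of the three colour channels, and all function names are hypothetical.

```python
import math

def haar_1d(values):
    """One level of the orthonormal Haar transform: sums first, then differences."""
    half = len(values) // 2
    s = math.sqrt(2.0)
    low = [(values[2 * i] + values[2 * i + 1]) / s for i in range(half)]
    high = [(values[2 * i] - values[2 * i + 1]) / s for i in range(half)]
    return low + high

def haar_2d(block):
    """Apply haar_1d to every row, then to every column, of a square block."""
    rows = [haar_1d(row) for row in block]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major

def rms(values):
    """Root mean square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def block_features(red, green, blue):
    """Return the 6-number feature vector of one 4x4 block.

    red, green, blue: 4x4 lists holding the channel intensities of the block.
    """
    def mean(channel):
        return sum(sum(row) for row in channel) / 16.0

    # Assumption: texture is measured on the average of the three channels.
    gray = [[(red[r][c] + green[r][c] + blue[r][c]) / 3.0 for c in range(4)]
            for r in range(4)]
    w = haar_2d(gray)
    # After one level, the 4x4 result splits into four 2x2 sub-bands;
    # the three below are the detail sub-bands.
    top_right = [w[r][c] for r in range(2) for c in range(2, 4)]
    bottom_left = [w[r][c] for r in range(2, 4) for c in range(2)]
    bottom_right = [w[r][c] for r in range(2, 4) for c in range(2, 4)]
    return [mean(red), mean(green), mean(blue),
            rms(top_right), rms(bottom_left), rms(bottom_right)]
```

For a block of uniform colour the three texture numbers are zero, while a block of alternating dark and light columns gives a large value in one detail sub-band only.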

Page 12: search engine for images

Meaning of these components

• The first three numbers hold information about the colour.

• The next three numbers hold information about the texture.

Page 13: search engine for images

Image Segmentation

This is what you get when K-means is applied to the set of feature vectors for various values of K.

All these pictures (except the original!) were created in MATLAB.
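For readers unfamiliar with K-means, the grouping step can be sketched in plain Python. This is a generic version of Lloyd's algorithm with random initial centroids, given only as an assumption about the general technique; it is not MATLAB's built-in routine nor the students' own implementation, and the function names are hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=50, seed=0):
    """Group `vectors` into k clusters. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k vectors picked at random from the data.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: squared_distance(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are final
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels
```

When the vectors are the per-block feature vectors of an image, painting every block with the colour of its cluster produces segmentations like the ones on the next slide.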

Page 14: search engine for images

[Figure: the original image and its segmentations into 2, 3, 4 and 7 regions.]

Page 15: search engine for images

Remark

Figuring out an appropriate value of K for any image still remains an open problem in computer vision.

But Wang, Li and Wiederhold propose a scheme that calculates a K that works quite well for this experiment.

Page 16: search engine for images

Integrated Region Matching

This method combines the properties of all the regions in a picture to measure the overall similarity between images.

The payoff is that the scheme provides robustness against poor segmentation.

Page 17: search engine for images

Integrated Region Matching

Defining a similarity measure is nothing but defining a distance between sets of points (feature vectors) in a higher dimensional space.

The idea of distance must be carefully chosen so that it is consistent with a person’s idea of “closeness” of two images.

Page 19: search engine for images

Integrated Region Matching

Suppose two images A and B are represented by the region sets

$R_1 = \{r_1, r_2, \ldots, r_m\}$ for A and $R_2 = \{r'_1, r'_2, \ldots, r'_n\}$ for B.

Page 20: search engine for images

Integrated Region Matching

Denote by $d_{i,j}$ the distance between region $r_i$ of A and region $r'_j$ of B. Wang et al. give a simple prescription for this that they found experimentally.

The distance between the images is defined as the weighted sum of region-to-region matches:

$d(A, B) = \sum_{i,j} s_{i,j} \, d_{i,j}$

Page 21: search engine for images

Integrated Region Matching

The weights $s_{i,j}$ are the elements of the significance matrix $S = (s_{i,j})$.

Wang et al. have devised an algorithm to calculate the significance matrix.
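The sketch below shows one way such a significance matrix and the resulting weighted sum might be computed in Python. It follows a greedy "most similar pair first" rule as we understand it from the SIMPLIcity paper, and it assumes that each region carries a significance (for example its share of the image area) and that the significances of each image sum to 1. The function names are hypothetical and the details should be checked against the paper.

```python
def significance_matrix(dist, p, q, eps=1e-12):
    """Greedy significance assignment.

    dist[i][j]: distance between region i of image A and region j of image B.
    p[i], q[j]: significances of the regions (assumed to sum to 1 per image).
    """
    m, n = len(p), len(q)
    p, q = list(p), list(q)  # work on copies: remaining significance
    s = [[0.0] * n for _ in range(m)]
    # Visit region pairs from the most similar to the least similar.
    pairs = sorted((dist[i][j], i, j) for i in range(m) for j in range(n))
    for _, i, j in pairs:
        amount = min(p[i], q[j])  # as much as both regions can still give
        if amount > eps:
            s[i][j] = amount
            p[i] -= amount
            q[j] -= amount
    return s

def irm_distance(dist, p, q):
    """Weighted sum of region-to-region distances."""
    s = significance_matrix(dist, p, q)
    return sum(s[i][j] * dist[i][j]
               for i in range(len(p)) for j in range(len(q)))

# Two regions per image, equal significance; the diagonal pairs match best.
dist = [[0.0, 5.0],
        [5.0, 1.0]]
print(irm_distance(dist, [0.5, 0.5], [0.5, 0.5]))  # prints 0.5
```

Because every region spreads its significance over one or more regions of the other image, a region that was split badly by the segmentation can still be matched piece by piece.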

Page 22: search engine for images

Meaning of the weights

The weights capture the importance of a match. For instance, if a region consisting of the body of an animal is matched to various regions in the other picture, then a body-to-body match will be given more weight than a body-to-background or a body-to-tree match.

Page 23: search engine for images

Distance between Regions

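Two regions $r_i$ and $r'_j$ each comprise several feature vectors. We represent these two regions by single feature vectors $\bar{f}_i$ and $\bar{f}'_j$, constructed as follows: the $k$-th component is the average of the $k$-th components of all feature vectors (FVs) in that region. Now the distance $d_{i,j}$ between $\bar{f}_i$ and $\bar{f}'_j$ is given by [equation not preserved in the slide text].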
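The averaging step can be sketched in Python as follows. The slide's distance formula is not reproduced in this text, so the plain Euclidean distance used here is only a stand-in assumption, and the function names are hypothetical.

```python
import math

def region_mean(feature_vectors):
    """Component-wise average of all feature vectors in one region."""
    count = len(feature_vectors)
    return [sum(component) / count for component in zip(*feature_vectors)]

def region_distance(region_a, region_b):
    """Distance between two regions via their mean feature vectors.

    Assumption: Euclidean distance stands in for the formula on the slide.
    """
    mean_a = region_mean(region_a)
    mean_b = region_mean(region_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(mean_a, mean_b)))

print(region_mean([[1, 2], [3, 4]]))               # prints [2.0, 3.0]
print(region_distance([[0, 0], [0, 0]], [[3, 4]]))  # prints 5.0
```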

Page 24: search engine for images

Experimental Results

We coded up this strategy in MATLAB 7.0 and ran the program on a database of 1000 images.

Some outputs are shown in the next few slides.

The top-left image in each output is the query image.

Page 25: search engine for images
Page 26: search engine for images
Page 27: search engine for images
Page 28: search engine for images
Page 29: search engine for images

Final Remarks

One group (Eagle 2) made some original contributions:

• They came up with their own criteria for calculating the appropriate number of regions for an image.

• They implemented their own K-means routine instead of relying on MATLAB’s built-in function.

Page 30: search engine for images

Final Remarks

One of our IAYM students, Shubhankar Biswas (NIT Durgapur), presented this project at the Seminar on Applications of Computer and Embedded Technology, organized by the Variable Energy Cyclotron Centre (VECC), Kolkata, in Oct. 2009 (2 months after IAYM 2009). His paper can be viewed at

http://www.vecc.gov.in/~sacet09/index_files/Page479.htm

He has acknowledged the MSF for all the guidance.