Content Based Image Retrieval : An Introduction
9/22/2014 Prof. M. K. Kundu, MIU,ISI
Malay K. Kundu
Machine Intelligence Unit, Indian Statistical Institute, Kolkata
http://www.isical.ac.in/~malay
Introduction
• Due to the revolutionary growth of digital imaging and internet technology, the development of efficient and intelligent schemes for retrieving images from large collections has become an important research issue. Two important approaches in this regard are:
• Content Based Image Retrieval (CBIR).
• Content Based Image Retrieval with relevance feedback.
The Problem and Motivation
• There are now billions of images on the web and in collections such as Flickr, Google, etc.
• Suppose I want to find pictures of monkeys with existing search models.
Metadata based retrieval systems
• Metadata based retrieval systems
– text, click-rates, etc.
– Google Images
– Clearly not sufficient
• What if computers understood images?
– Content based image retrieval (early 90's)
– search based on the image content
Top 12 retrieval results for the query ‘Mountain’
Content Based Image Retrieval(CBIR)
• The process of retrieving relevant images from an image database (or distributed databases) on the basis of primitive (e.g. color, texture, shape) or semantic image features, extracted automatically, is known as Content Based Image Retrieval.
Problems of Information extraction in CBIR
• It differs generically from conventional information retrieval/Data Mining (DM) for the following reasons:
– Unstructured nature of image databases
– Contains pixel intensities with no inherent meaning
– Any kind of reasoning about image content is possible only after extraction of some useful image information (e.g. presence of primitive or semantic features)
What is Unstructured Data?
• Any data without a well‐defined model for information access
• Examples:
– Image, Video, Sound
– Word documents
– E-mails
• Examples of what is structured:
– Database tables
– Objects
– XML tags
Unstructured data
• Unstructured data consists of text, audio, images, etc.
• Unstructured data contains significant scientific and commercial information
• Technologies and tools are being developed for efficient extraction of information from unstructured data
• Problems of information retrieval from unstructured data are not yet fully understood. A paradigm shift is needed
UDM Increases Informational Content
[Chart: Structured Data (10-40%) + Unstructured Data (60-90%)]
Representation of Image
Two important questions for content-based image retrieval:
• How are images represented? => features
• How are image representations compared? => distance/similarity measures
CBIR: Visual Signature
• Image is indexed by its visual content. Visual content is described by:
• Low level features (color, texture, shape, layout, etc.)
• High level descriptors (face, iconic pattern, context, spatial reasoning, semantic information, etc.)
Content Based Image Retrieval (CBIR)
[Block diagram: the User submits a Query Image, which is compared against the Image Database to produce the Result]
Content based image retrieval - 1
• Query by Visual Example (QBVE)
– user provides query image
– system extracts image features (texture, color, shape)
– returns nearest neighbors using suitable similarity measure
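The QBVE steps above can be sketched as a simple nearest-neighbor loop; a minimal sketch, assuming the database is a list of (image id, feature vector) pairs and that a distance function is supplied (all names are illustrative, not from the slides):

```python
# Rank database images by distance of their features to the query's
# features and return the closest k (a sketch of the QBVE loop above).

def retrieve(query_feat, database_feats, distance, k=5):
    """Return ids of the k nearest database images to the query features."""
    ranked = sorted(database_feats, key=lambda item: distance(query_feat, item[1]))
    return [img_id for img_id, _ in ranked[:k]]
```

Any of the similarity measures discussed later (histogram intersection, Minkowski, EMD) can be plugged in as `distance`.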
[Figure: a query image matched by texture similarity, color similarity, and shape similarity]
The Semantic Gap
• First generation CBIR systems were based on color and texture; however, these do not capture what users really care about: conceptual or semantic categories.
• Perception studies suggest that the most important cue for visual categorization is shape. This was ignored in earlier work (because it was hard!)
Content based image retrieval
• Semantic Retrieval (SR)
– User provides a query text (keywords)
– Find images that contain the associated semantic concept
– Emerged around the year 2000
– Model semantic classes, learn to annotate images
– Provides a higher level of abstraction, and supports natural language queries
query: “people, beach”
Text based Semantic Retrieval (SR)
• Problem of lexical ambiguity
– multiple meanings of the same word
• Anchor - TV anchor or a ship's anchor?
• Bank - financial institution or river bank?
• Multiple semantic interpretations of an image
• Boating or Fishing or People?
• Limited by vocabulary size
– What if the system was not trained for 'Fishing'?
– In other words, it is outside the space of trained semantic concepts
Lake? Fishing? Boating? People?
Fishing! what if not in the vocabulary?
Summary
• SR: higher level of abstraction
– Better generalization inside the space of trained semantic concepts
– But problems of:
• Lexical ambiguity
• Multiple semantic interpretations
• Vocabulary size
• QBVE is unrestricted by language
– Better generalization outside the space of trained semantic concepts
• A query image of 'Fishing' would retrieve visually similar images
– But weakly correlated with the human notion of similarity
Both have visually dissimilar sky
Fishing! what if not in the vocabulary?
Lake? Fishing? Boating? People?
The two systems in many respects are complementary!
What should be the Research Focus ?
• Automatically generate annotations corresponding to object labels or activities in an image.
• If possible, combine these with other metadata such as text.
CBIR
• Content-Based Image Retrieval
– Query-by-Example (QBE)
– Query-by-Feature (QBF)
– Feature Vector
CBIR Architecture
[Architecture diagram: image data passes through Feature Extraction to an Image Representation stored in the database (Database Creation); a query from the User Interface goes through Query Comparison against the database using a Similarity Metric, and the system also supports Image Browsing]
Example: CBIR of Butterflies
• To allow non-expert users to find possible species of the butterflies they saw, based on the butterflies' appearance
• The appearance:
– Color, Texture, Shape
Problems
• How can you describe a butterfly?
• How can you communicate with a machine?
Problems
• Different users have different perceptions.
• Users may not remember the appearance of the butterfly clearly.
• Users usually do not have enough knowledge to describe the butterflies like experts.
• Users usually do not have the patience to browse too many query results.
Possible Solutions
• A user-driven interactive query process: the QBF/QBE query process (Query By Features and Query By Example)
• Fuzzy feature description for each butterfly
• A "What You See Is What You Get" query interface
• A representative set for a collection of butterflies
QBF/QBE query process
• QBF query:
– A QBF query selects some features of butterflies and expects the system to return all butterflies with those features.
– Features of butterflies: dominant color, texture pattern, shape.
• QBE query:
– A QBE query points to an image and expects the system to return all butterflies similar to it.
Feature Description (1)
• Feature Description for a butterfly:
– Like metadata which describes the appearance of this butterfly.
– This makes QBF queries possible.
– A Feature Description consists of some feature descriptors.
• Feature descriptor:
– A ("feature value", "match level") pair.
Feature Description (coarse)
Feature Type | Feature Value               | Degree of Match
Color        | mixed_with_black_and_orange | 52/57
Color        | orange_yellow               | 12/42
Color        | orange_red                  | 3/38
Texture      | many_spots                  | 58/62
Texture      | fore_half_different_color   | 27/33
Texture      | horizontal_bands            | 41/60
Texture      | edge_with_different_color   | 10/74
Shape        | wave                        | 98/110
Common Color Features
• Most commonly used color features:
• Pixel values in RGB or an alternative space like HSV, YCbCr, etc., with uniform quantization.
• Color Histogram
– Joint probability density of the intensities of the three channels
– Histogram intersection as similarity measure
• Color Moments
– Low order moments (mean, variance, skewness) characterize the color distribution, with a Euclidean distance measure
• Invariant descriptors
– Invariance to illumination changes
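The color histogram with histogram intersection described above can be sketched as follows; a minimal sketch, assuming images are given as lists of (r, g, b) tuples with 8-bit channels (the function names and bin count are illustrative):

```python
# Uniformly quantize each channel, build a normalized joint histogram,
# and compare two histograms by their bin-wise minima (intersection).

def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram with uniform quantization."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
    n = float(len(pixels))
    return [h / n for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Identical images score 1.0; images with no shared colors score 0.0, which makes the measure directly usable as a ranking score.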
What color is the apple ? We are so visual !!!!
[Cartoon: four observers describe the same apple as "Bright Red", "Crimson", "Red!", and "I really couldn't tell you (I am color blind)"]
RGB Color Space
• Hardware oriented model:
– RGB color space: 3 values (Red, Green, Blue) to represent a color.
HSV Color Space
• HSV - Hue, Saturation, Value
– close to human perception
– 3 values to represent a color.
[HSV cone: Hue angle - Red (0°), Yellow (60°), Green (120°), Cyan (180°), Blue (240°), Magenta (300°); Saturation along the radius; Value axis from Black to White]
YCbCr Color Space
• Y is the luminance; Cb and Cr are the chrominance values of this color space.
• Decouples intensity and color information
• A monochrome representation has only the Y value.
• Very close to a perceptual model
CIE-LAB Uniform Color Space
• Uniform color spaces: CIE Lab
• The CIE addressed the perceptual non-uniformity of earlier color spaces in 1976 with the development of the Lab color space.
CBIR: Major Challenges
• Major developmental issues:
• Efficient feature extraction and similarity measures
• Multi-dimensional indexing for multi-level image queries
• Relevance feedback network for automatic extraction of high-level knowledge
• Efficient and adaptive retrieval system design
Distance Measures
• Heuristic
– Minkowski-form
– Weighted-Mean-Variance (WMV)
• Nonparametric test statistics
– χ² (Chi-square)
– Kolmogorov-Smirnov (KS)
– Cramer/von Mises (CvM)
• Information-theoretic divergences
– Kullback-Leibler (KL)
– Jeffrey divergence (JD)
• Ground distance measures
– Histogram intersection
– Quadratic form (QF)
– Earth Mover's Distance (EMD)
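Two of the listed measures can be sketched directly; a minimal sketch, assuming feature vectors are plain lists of floats (function names and the eps guard are illustrative):

```python
# Minkowski-form distance and the chi-square histogram distance
# from the list above.

def minkowski(x, y, p=2):
    """Minkowski-form distance; p=1 gives Manhattan, p=2 Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def chi_square(x, y, eps=1e-12):
    """Chi-square histogram distance; eps guards against empty bins."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(x, y))
```

Measures such as EMD require solving a transportation problem and are usually taken from an optimization library rather than written by hand.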
Texture
Texture refers to visual patterns that have properties perceived by humans:
– homogeneity
– contrast
– roughness, coarseness
– directionality or sense of orientation
– regularity (periodic or quasi-periodic distribution)
– line-likeness
Texture
• An innate property of virtually all natural objects (like clouds, trees, hair, wood, stones, etc.) and scenes.
Common Texture features:
Co-occurrence Matrix
• Explores the spatial dependence of gray level or color.
• Frequency counts in a 2-D array for pairs of pixels at a given direction and distance.
• Statistical features extracted from the matrix are used as descriptors.
Wavelet based multi-scale features
• Statistical features extracted from sub-bands.
Gabor filter based techniques.
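The co-occurrence counting step above can be sketched for a single offset; a minimal sketch, assuming the image is a 2-D list of small integer gray levels (the function and parameter names are illustrative):

```python
# Count how often gray levels i and j co-occur at offset (dx, dy),
# then derive one common statistic (contrast) from the matrix.

def cooccurrence(img, levels, dx=1, dy=0):
    """levels x levels matrix of pair counts (img[y][x], img[y+dy][x+dx])."""
    M = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h - dy):
        for x in range(w - dx):
            M[img[y][x]][img[y + dy][x + dx]] += 1
    return M

def contrast(M):
    """Sum of (i - j)^2 weighted by pair counts: high for strong edges."""
    return sum((i - j) ** 2 * M[i][j]
               for i in range(len(M)) for j in range(len(M)))
```

Other statistics (energy, entropy, homogeneity) are computed from the same matrix in the same way.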
MPEG-7 CBIR Texture Feature
• CBIR system performance relies greatly on the notion of texture, which may differ across systems. For standardization, MPEG-7 has recommended several texture feature models:
– Edge Histogram Descriptor (EHD)
– Homogeneous Texture Descriptor (HTD)
– Texture Browsing Descriptor (TBD)
Shape
• The shape of a visual object is one of the most powerful signatures for the human perception mechanism.
• It remains invariant under different geometric transformations (rotation, scaling & translation).
Two categories of methods:
• Boundary based (using only the outer boundary): Fourier descriptor, curvature function, and curvature scale space
• Region based (using the shape of the entire region)
Region Based shape Signatures
• Binary and gray level moments
• Zernike moments
• Grid based descriptor
• Object bounding box
• Segmentation of different regions (which may not be meaningful objects)
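The first listed signature, binary image moments, can be sketched directly; a minimal sketch, assuming the shape is a 2-D list of 0/1 values (function names are illustrative):

```python
# Raw moments m_pq of a binary image and the centroid derived from
# the zeroth- and first-order moments.

def raw_moment(img, p, q):
    """m_pq = sum over pixels of x^p * y^q * I(x, y)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))

def centroid(img):
    """Shape centroid (x_bar, y_bar) from first-order moments."""
    m00 = raw_moment(img, 0, 0)
    return raw_moment(img, 1, 0) / m00, raw_moment(img, 0, 1) / m00
```

Central moments taken about this centroid are translation invariant, which is the starting point for the rotation-invariant descriptors mentioned below.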
Results: based on shape with rotation-invariant features.
M. Banerjee and M. K. Kundu, "Edge based features for content based image retrieval", Pattern Recognition, Vol. 36, No. 11, pp. 2649-2661, 2003.
Relevance Feedback
The key idea of the interactive relevance feedback technique is that human perceptual subjectivity is incorporated into the retrieval process, providing users with the opportunity to evaluate the retrieval results. Queries or similarity measures are automatically refined on the basis of these evaluations.
Relevance Feedback
Relevance feedback is a supervised active learning technique used to improve the effectiveness of information systems. It uses positive and negative examples from the user to improve system performance. For a given query, the system first retrieves a list of ranked images according to a predefined similarity metric. Then, the user marks the retrieved images as relevant to the query (positive examples) or irrelevant (negative examples). The system then refines the query based on this feedback, retrieves a new list of images, and presents them to the user. The key issue in relevance feedback is how to use positive and negative examples to refine the query and/or adjust the similarity measure.
Performance Measure
Precision is defined as:
Precision = (Number of relevant images retrieved) / (Total number of images retrieved)
Recall is defined as:
Recall = (Number of relevant images retrieved) / (Number of relevant images in the database)
* The recall-precision graph is another important performance measure.
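The two definitions above translate directly to code; a minimal sketch, assuming the retrieved list and the ground-truth relevant images are given as sets of image ids (the function name is illustrative):

```python
# Precision: fraction of retrieved images that are relevant.
# Recall: fraction of all relevant images that were retrieved.

def precision_recall(retrieved, relevant):
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)
```

Evaluating at several result-list sizes and plotting the pairs gives the recall-precision graph mentioned above.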
Aims & Motivation
• Using an effective representation (features) of the image, which supports the human visual system.
• Improving the retrieval results, utilizing a fuzzy relevance feedback mechanism (RFM) with automatic feature-weight updating.
• Incorporating Earth Mover's Distance (EMD) in the RFM.
PROPOSED CBIR SYSTEM
[Block diagram: a Query Image enters the search engine (retrieval block), which applies M-band wavelet feature extraction over the Image Database to produce Retrieval Results; User Interaction marks errors in relevance, feeding a fuzzy relevance feedback block that performs fuzzy feature evaluation and an EMD weighted-distance update based on relevance]
Proposed Algorithm
• M-band wavelet features are computed for an input image.
• The EMD similarity measure is applied to the M-band wavelet feature distances, and retrieval is made.
• From the first stage of retrieval, the user marks the relevant and irrelevant sets.
• The intraset and interset ambiguities are computed.
• The feature evaluation index (FEI) for each component of each plane (Y, Cb, Cr) is computed from the ambiguity measures.
• The weight of each component is updated automatically according to the FEI as follows:
w_q = (FEI_q)^2
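Once per-component weights are available, they enter the distance computation; a minimal sketch of a weighted Euclidean distance with per-component weights (the function name is illustrative; the actual system uses EMD on the wavelet features):

```python
# Weighted L2 distance: each feature component q contributes
# proportionally to its weight w_q.

def weighted_l2(x, y, w):
    return sum(wq * (a - b) ** 2 for wq, a, b in zip(w, x, y)) ** 0.5
```

Starting the next iteration with the updated weights biases retrieval toward the components the feedback identified as discriminative.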
Implementation
The experiments were performed on a Dell Precision T7400 with 4 GB RAM. The performance of the image retrieval system was tested on two databases:
A) SIMPLIcity: 1000 images in 10 categories (People, Beach, Buildings, Bus, Dinosaur, Elephant, Flower, Horses, Mountains and Food).
B) Corel 10000 miscellaneous database: 9908 images belonging to 79 semantic categories.
Results are compared with the Color Structure Descriptor (CSD) and Edge Histogram Descriptor (EHD) included in the MPEG-7 standard. The MPEG-7 Reference Software provided by ISO is used for the computation of CSD and EHD.
The Euclidean similarity measure is used with CSD and EHD features; the EMD similarity measure is used with M-band wavelet features for enhanced accuracy.
Experimental Results (relevant images among the top 20 displayed):
• SIMPLIcity database, Euclidean distance: M-band+ED 5/20 (first pass) → 6/20 (1st iteration); EHD+ED 1/20 → 2/20; CSD+ED 5/20 → 6/20.
• SIMPLIcity database, EMD: M-band+EMD 6/20 (first pass) → 8/20 (1st iteration, ranking improved); EHD+ED 1/20 → 2/20; CSD+ED 5/20 → 6/20.
• Corel database: M-band+EMD 6/20 (first pass) → 7/20 (1st iteration) → 8/20 (2nd iteration); EHD+ED 1/20 → 1/20 → 1/20; CSD+ED 2/20 → 2/20 → 2/20.
Graphical Interpretation
[Graphs: average precision of CSD, EHD & M-band vs. number of images displayed, on the SIMPLIcity database; average precision vs. number of iterations. Average precision is computed over 20 displayed images.]
Conclusions
• CBIR, in general, is still very much a research topic at present. The technology is exciting but has yet to achieve a good degree of maturity. The available commercial systems have limited capability in restricted domains.
• Medical image retrieval is a very important application of CBIR, considering the huge amount of visual data generated for diagnostic purposes. But it still remains largely an academic exercise due to the lack of integration between visual and clinical information.