A Presentation On "Vertical Image Search Engine"
Guided By: Mrs. S. D. Chaudhari
Group Members: 1. Shivam Dave, 2. Shivam Kedia, 3. Vidya Bhushan Singh, 4. Yashwardhan Sisodia

Vertical Image Search Engine


System Overview
1. Introduction
2. Objective
3. Timeline Chart
4. Architecture
5. Algorithm
  5.1 Representing Keywords
  5.2 Weighing Visual Features
  5.3 Visual Thesaurus
  5.4 Weight Vector Optimization
  5.5 Feature Quality and Correlation
  5.6 Query Expansion and Search
6. Sequence Diagram

System Overview (Ctd.)
7. Class Diagram
8. Proposed Modules
  8.1 User Interface
  8.2 Parser
  8.3 Image Processor
  8.4 Crawler
9. Conclusion
10. References

1. Introduction
With the development of the Internet and Web 2.0, large volumes of multimedia content have become available online.

With the advances of text-based indexing, computer systems demonstrate superior efficiency in handling images over the Internet.

However, the search performance (precision) is not always reliable.

To remedy this problem, we present a vertical search engine that integrates both text and visual features to improve image retrieval performance.

2. Objective
To bridge the semantic gap by integrating textual and visual features, and hence improve the precision of content-based image retrieval (CBIR).

To improve recall by yielding items that would otherwise be missed when searching with either type of feature alone.

To bridge the user intention gap between users' cognitive intentions and the textual queries received by IR systems.

To discover the semantic relationships of terms and automatically generate a thesaurus based on the visual semantics of words.

3. Timeline Chart

4. Architecture
The system comprises three major components: a) the Crawler, b) the (Pre-)Processor, and c) the Search and UI component.

The Crawler fetches product pages from retailer websites.

A customized parser extracts item descriptions and generates the term dictionary and inverted index.

Simultaneously, the image processor extracts visual features from item images.

4. Architecture (Ctd.)

Next, we integrate textual and visual features in a reweighting scheme, and further construct a visual thesaurus for each text term.

Finally, the UI component provides the query interface and browsing views of the search results.
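The offline part of this pipeline can be sketched as follows. Note that `parse` and `extract_features` are hypothetical injected callables standing in for the Parser and Image Processor components, not the actual implementation:

```python
def build_search_index(pages, parse, extract_features):
    """Offline indexing sketch: parse crawled product pages into an
    inverted index and extract visual features from item images."""
    inverted, features = {}, {}
    for item_id, page in pages.items():
        tokens, image = parse(page)          # item description -> terms, image
        for token in tokens:
            inverted.setdefault(token, set()).add(item_id)
        features[item_id] = extract_features(image)
    return inverted, features
```

The two resulting structures (term inverted index and per-item visual features) are exactly the inputs the later reweighting and search stages operate on.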

The architecture is shown in the figure on the next slide.

4. Architecture (Ctd.)

Figure: System Architecture

5. Algorithm

In multimedia information retrieval, the roles of textual feature space and visual feature space are complementary.

Textual information better represents the semantic meaning, while visual features play a dominant role at the physical level.

They are separated by the semantic gap, which is the major obstacle in content-based image retrieval.

5.1 Representing Keywords
There are difficulties in using only text features to retrieve mixed image/text content.

Moreover, calculating text similarity is difficult, as distance measurements do not perfectly represent distances in human perception.

To make up for the deficiency of pure text search or pure CBIR approaches, we explore the connections between textual and visual feature subspaces.

The text description represents the narrator's perception of the visual features.

5.2 Weighing Visual Features
Features coherent with the human perception of the keyword tend to have consistent values, while other features are more likely to be diverse.

To put it another way, suppose that we have two groups of samples: 1) positive: N1 items that have the keyword in their descriptions, and 2) negative: N2 items that do not contain the keyword.

Moreover, the feature values in the positive group tend to demonstrate a small variance, while values in the negative group are usually diversified.
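One plausible way to turn this observation into a per-feature weight, sketched below, is the ratio of negative-group to positive-group variance; a feature that is tight among positive items and spread among negative items scores high. This exact formula is an illustrative assumption, not necessarily the weighting used by the system:

```python
import statistics

def feature_weight(pos_values, neg_values, eps=1e-9):
    # Coherent features have small variance among the N1 positive items
    # (keyword present) and large variance among the N2 negative items.
    var_pos = statistics.pvariance(pos_values)
    var_neg = statistics.pvariance(neg_values)
    return var_neg / (var_pos + eps)
```

A texture feature that clusters tightly for "dotted" items would thus receive a much larger weight than one whose values are equally scattered in both groups.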

5.2 Weighing Visual Features (Ctd.)
Figure B demonstrates the value distributions of eight different features for the keyword "dotted".

In the figure, blue (solid) lines represent distributions of the positive samples, while red (dashed) lines represent the distributions of negative samples.

For the first four texture features, distributions of the positive samples are significantly different from negative samples.

In contrast, the two distributions are indistinguishable for the other four features.

5.2 Weighing Visual Features (Ctd.)

Figure B

5.3 Visual Thesaurus
Thesauri are widely used in information retrieval, especially in linguistic pre-processing and query expansion.

One can automatically generate thesauri using statistical analysis of textual corpora, based on co-occurrence or grammatical relations of terms.

In this search engine, a different type of thesaurus is generated: a visual thesaurus, based on the term distributions in the visual space.

Term-wise similarity across the dictionary is calculated to generate a domain-specific visual thesaurus, or "visual WordNet".
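A minimal sketch of such term-wise similarity, under the assumption that each term is summarized by its visual-feature weight vector and compared with cosine similarity (an illustrative choice, not necessarily the measure used here):

```python
import math

def cosine(u, v):
    # Standard cosine similarity between two weight vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def visual_thesaurus(weight_vectors, threshold=0.8):
    # Terms whose visual weight vectors are similar enough are treated
    # as visual synonyms of each other.
    terms = list(weight_vectors)
    synonyms = {t: [] for t in terms}
    for i, t in enumerate(terms):
        for u in terms[i + 1:]:
            if cosine(weight_vectors[t], weight_vectors[u]) >= threshold:
                synonyms[t].append(u)
                synonyms[u].append(t)
    return synonyms
```

With this construction, terms like "dotted" and "spotted" end up linked if they emphasize the same visual features, even though no textual co-occurrence analysis was performed.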

5.4 Weight Vector Optimization
Due to the existence of synonyms, false negatives are observed in the negative sets.

A false negative is an item that: 1) is actually relevant to the term, 2) demonstrates visual features similar to those of the positive items, and 3) is described by a synonym of the term, not the term itself.

The domain-specific visual thesaurus can help us find both synonyms and antonyms.

5.4 Weight Vector Optimization (Ctd.)
A high threshold is enforced in determining the top synonyms, so that we do not introduce false positives into the positive set.

An example of the (normalized) value distributions of a colour feature for the positive and negative sets identified by the terms "pale" and "cream" is shown in Figure C (dashed lines).

The distributions of the positive and negative sets from the combined set are also shown in the figure.

By iteratively combining similar keywords in the visual thesaurus, we can improve the quality of the weight vectors.
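One simple way this merging could work, sketched under the assumption that each term keeps a set of positive item ids: union each term's positive set with those of its top synonyms from the thesaurus, so likely false negatives leave the negative pool before the weights are recomputed:

```python
def expand_positives(positives, thesaurus):
    """Merge each term's positive set with those of its visual synonyms.

    positives: term -> set of item ids containing the term.
    thesaurus: term -> list of top (high-threshold) visual synonyms.
    """
    expanded = {}
    for term, items in positives.items():
        merged = set(items)
        for syn in thesaurus.get(term, []):
            merged |= set(positives.get(syn, ()))
        expanded[term] = merged
    return expanded
```

For the "pale"/"cream" example, items tagged only "cream" would join the positive set for "pale", tightening its feature distributions on the next weighting pass.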

5.4 Weight Vector Optimization (Ctd.)

Figure C

5.5 Feature Quality and Correlation
In CBIR, the entropy of low-level visual features is widely used for feature selection and image annotation.

In this search engine, we revisit this problem by utilizing the entropy of feature weights across all keywords.

A good feature will produce high weights for some terms, and low weights for the others.

The feature-quality curve is shown in Figure D(a), while Figures D(b) and D(c) demonstrate the weight histograms for two different features.
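A hedged sketch of this entropy computation over one feature's weights across all keywords: a discriminative feature (a few large weights, many small ones) yields a peaked distribution and hence lower entropy than a flat, uninformative one. The normalization below is an illustrative assumption:

```python
import math

def weight_entropy(weights):
    # Normalize one feature's per-keyword weights into a probability
    # distribution and compute its Shannon entropy (in bits).
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)
```

Ranking features by this entropy (ascending) would then realize the feature-quality curve of Figure D(a).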

5.5 Feature Quality and Correlation (Ctd.)

Figure D

5.6 Query Expansion and Search

As introduced earlier, this search engine first employs classic text-based search to obtain an initial set.

For each keyword in the user query, the system loads its corresponding weight vector, which is generated offline.

For each item in the initial set, its visual features are used to construct a base query q̃_i.
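Putting these steps together, a minimal sketch of the weighted visual re-ranking of the text-retrieved initial set; the weighted Euclidean distance and the `"features"` field name are illustrative assumptions:

```python
def weighted_distance(query, item, weights):
    # Weighted Euclidean distance in visual-feature space; the weights
    # come from the query keywords' offline-learned weight vectors.
    return sum(w * (q - x) ** 2 for w, q, x in zip(weights, query, item)) ** 0.5

def rerank(initial_set, base_query, weights):
    # Re-rank the text-retrieved initial set by weighted visual
    # distance to the base query constructed from an item's features.
    return sorted(initial_set,
                  key=lambda it: weighted_distance(base_query, it["features"], weights))
```

Features that the query terms mark as significant dominate the distance, so visually relevant items rise even when their text match is weak.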

6. Sequence Diagram

7. Class Diagram

8. Proposed Modules

8.1 Module-I: User Interface

8.2 Module-II: Parser

8.3 Module-III: Image Processor

8.4 Module-IV: Crawler

8.1 Module-I: User Interface
The user interface is the space where interaction between humans and machines occurs.

The goal of this interaction is effective operation and control of the machine on the user's end, and feedback from the machine, which aids the operator in making operational decisions.

The user interface of this search engine will contain a search field and filters for visual features.

The user interface will also display product categories for a more efficient and refined search.

8.2 Module-II: Parser
A parser is a software component that takes input data and builds a data structure giving a structural representation of the input, checking for correct syntax in the process.

The parser analyses the text and visual feature scopes provided by the user, and generates the term dictionary and inverted index.

The parser is often preceded by a separate lexical analyser, which creates tokens from the sequence of input characters; alternatively, these can be combined in scannerless parsing.
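A minimal sketch of what such a parser might produce, assuming simple whitespace tokenization (a simplification; a real lexical analyser would normalize and tokenize more carefully):

```python
from collections import defaultdict

def parse_descriptions(descriptions):
    """Build the term dictionary and inverted index from item descriptions.

    descriptions: item id -> free-text item description.
    Returns (dictionary mapping term -> term id, inverted index mapping
    term -> set of item ids).
    """
    dictionary, inverted = {}, defaultdict(set)
    for item_id, text in descriptions.items():
        for token in text.lower().split():
            dictionary.setdefault(token, len(dictionary))
            inverted[token].add(item_id)
    return dictionary, inverted
```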

8.3 Module-III: Image Processor
An image processor (image processing engine, also called a media processor) is a specialized digital signal processor used for image processing in digital cameras or other devices.

Image processors may use industry-standard products, application-specific standard products (ASSPs), or even application-specific integrated circuits (ASICs).

The image processor extracts visual features from the item images available in the database.
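As a toy illustration of visual feature extraction (not the actual features used by this module), a global mean-colour descriptor over raw RGB pixels:

```python
def mean_color(pixels):
    """Per-channel mean of a list of (r, g, b) pixel tuples: a minimal
    example of a global colour feature an image processor might emit."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))
```

Real systems would extract richer descriptors (colour histograms, texture, shape), but each still reduces an image to a fixed-length feature vector like this one.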

8.4 Module-IV: Crawler

Crawlers can copy all the pages they visit for later processing by a search engine that indexes the downloaded pages so that users can search them much more quickly.

Crawlers can validate hyperlinks and HTML code.

Users can search for images by entering text and also providing visual feature scopes.

The crawler then fetches product pages from the related retailer websites.
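A minimal breadth-first crawler sketch; the `fetch` callable is an injected assumption (returning page text and outgoing links) so the example stays network-free:

```python
from collections import deque

def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl starting from seed_urls.

    fetch(url) -> (page_text, outgoing_links); injected so the sketch
    avoids real HTTP. Returns url -> page_text for visited pages.
    """
    seen, pages = set(), {}
    frontier = deque(seed_urls)
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        if url in seen:
            continue
        seen.add(url)
        text, links = fetch(url)
        pages[url] = text
        frontier.extend(l for l in links if l not in seen)
    return pages
```

A production crawler would add politeness delays, robots.txt handling, and URL normalization on top of this loop.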

9. Conclusion
This search engine integrates textual and visual features for better search performance.

Text terms are represented in the visual feature space, and a text-guided weighting scheme for visual features is developed.

This weighting scheme infers user intention from query terms and enhances the visual features that are significant toward that intention.

To sum up, by combining textual and visual features, the search engine manages to pick good features that reflect users' perception, and is therefore effective for vertical search.

10. References
Y. Chen, N. Yu, B. Luo, and X.-W. Chen, "iLike: Integrating Visual and Textual Features for Vertical Search," Proc. ACM Int'l Conf. Multimedia, 2010.

B. Luo, X. Wang, and X. Tang, "A World Wide Web Based Image Search Engine Using Text and Image Content Features," Proc. IS&T/SPIE, vol. 5018, pp. 123-130, 2003.

J. Cui, F. Wen, and X. Tang, "Real Time Google and Live Image Search Re-Ranking," Proc. 16th ACM Int'l Conf. Multimedia, 2008.

THANK YOU