7/29/2019 Content Based Image Search
Content Based Image Search
Mark Desnoyer
Text Search - Bag of Words
Example words: Gnome, grass, him, caught, moonwalked, walks, house, internet, shuttle, PC, Nord
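A minimal sketch of the bag-of-words idea, assuming lowercasing and whitespace tokenization. The two sample sentences are hypothetical and only reuse words from the slide; the function name `bag_of_words` is illustrative.

```python
from collections import Counter

def bag_of_words(text):
    """Count word occurrences and ignore word order.

    Assumption: tokens are lowercased and split on whitespace.
    """
    return Counter(text.lower().split())

# Hypothetical sentences assembled from the words on the slide.
doc_a = "Gnome walks him grass"
doc_b = "Gnome moonwalked grass him"

bag_a = bag_of_words(doc_a)
bag_b = bag_of_words(doc_b)

# Words the two documents have in common, regardless of position.
shared = sorted(set(bag_a) & set(bag_b))
print(shared)
```

Because order is discarded, two documents can be compared simply by comparing their word counts.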
Text Search - TF-IDF
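One common way to compute TF-IDF weights is sketched below. The exact weighting is an assumption (tf = count / document length, idf = log(N / document frequency)); the slide does not say which variant it uses, and the sample documents are made up.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {word: weight} dict per document.

    Assumed weighting: tf = count / document length,
    idf = log(number of documents / documents containing the word).
    """
    n_docs = len(docs)
    bags = [Counter(d.lower().split()) for d in docs]
    # Iterating a Counter yields each distinct word once per document.
    doc_freq = Counter(word for bag in bags for word in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({w: (c / total) * math.log(n_docs / doc_freq[w])
                        for w, c in bag.items()})
    return weights

docs = ["gnome walks house", "gnome caught shuttle", "gnome walks internet"]
weights = tf_idf(docs)
```

In this toy corpus "gnome" appears in every document, so its weight is zero, while a word that appears in only one document gets the largest weight.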
Text Search - Query
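A query can be answered with an inverted index that maps each word to the documents containing it. The sketch below is one possible arrangement, assuming raw-count tf, log idf, and cosine-style scoring; `build_index` and `search` are illustrative names, not taken from the slides.

```python
import math
from collections import Counter, defaultdict

def build_index(docs):
    """Build {word: {doc_id: weight}} plus idf values and per-document norms."""
    bags = [Counter(d.lower().split()) for d in docs]
    doc_freq = Counter(w for bag in bags for w in bag)
    idf = {w: math.log(len(docs) / df) for w, df in doc_freq.items()}
    index = defaultdict(dict)
    norms = []
    for doc_id, bag in enumerate(bags):
        vec = {w: c * idf[w] for w, c in bag.items()}
        # Guard against a zero norm when every word has idf 0.
        norms.append(math.sqrt(sum(v * v for v in vec.values())) or 1.0)
        for w, v in vec.items():
            index[w][doc_id] = v
    return index, idf, norms

def search(query, index, idf, norms):
    """Rank documents by dot product with the query, divided by document norm.

    The query norm is omitted because it does not change the ordering.
    """
    scores = defaultdict(float)
    for w, c in Counter(query.lower().split()).items():
        for doc_id, v in index.get(w, {}).items():
            scores[doc_id] += c * idf[w] * v
    return sorted(((s / norms[d], d) for d, s in scores.items()), reverse=True)

docs = ["gnome walks house", "gnome caught shuttle", "shuttle shuttle internet"]
index, idf, norms = build_index(docs)
results = search("shuttle", index, idf, norms)
```

Only documents that share at least one word with the query are touched, which is what keeps this kind of lookup fast on large collections.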
Current Image Search
On the web uses context around image
Words around it
Words in the alt tag
Those words are treated as a document
Same as normal text search
But we want pictures, not text!
Query: Horse
Searching With Pictures
How about searching with pictures instead?
Using Visual Words For Search
Use visual words paradigm we've seen before
Can use all the text search machinery we already have
But, we're searching with pictures now
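To reuse text search machinery, each local descriptor is mapped to the nearest entry of a visual vocabulary, and the image becomes a bag of visual word ids. This is a minimal sketch, assuming a brute-force nearest-centre search and toy 2-D descriptors (real SIFT descriptors have 128 dimensions); the vocabulary values are made up.

```python
from collections import Counter

def nearest_word(descriptor, vocabulary):
    """Return the index of the closest cluster centre (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)), key=lambda i: dist2(descriptor, vocabulary[i]))

def visual_bag(descriptors, vocabulary):
    """Quantize every descriptor to a visual word id and count the ids."""
    return Counter(nearest_word(d, vocabulary) for d in descriptors)

# Hypothetical vocabulary of three visual words and one image's descriptors.
vocabulary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
image_descriptors = [(0.5, 0.2), (9.0, 11.0), (0.1, 0.3), (1.0, 9.5)]
bag = visual_bag(image_descriptors, vocabulary)
```

The resulting counts play the same role as word counts in a text document, so TF-IDF weighting and inverted indexes apply unchanged.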
The Players
O. Chum et al., Total recall: Automatic query expansion with a generative feature model for object retrieval, in Proc. ICCV, vol. 2, 2007.
N. Snavely, S. M. Seitz, and R. Szeliski, Photo tourism: exploring photo collections in 3D, in International Conference on Computer Graphics and Interactive Techniques (ACM New York, NY, USA, 2006), 835-846.
D. Nister and H. Stewenius, Scalable recognition with a vocabulary tree, in Proc. CVPR, vol. 5, 2006.
SIFT Features
Succinct descriptors
Scale invariant
Robust to changes in lighting, viewpoint, blur etc.
Therefore, normally used in this search context
Bag of Words First Step - Build a Dictionary
Must be big to be expressive enough to differentiate objects
So, cluster SIFT features
Each cluster is a word in the dictionary
But K-means clustering 10M+ descriptors is O(NK)
Hierarchical K-means (Nister)
Approximate K-means (Chum)
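The dictionary-building step can be sketched with plain (Lloyd's) K-means, where each final cluster centre is one visual word. This is a toy version under stated assumptions: 2-D points stand in for SIFT descriptors, initial centres are sampled at random with a fixed seed, and the iteration count is arbitrary. Each iteration compares all N points against all K centres, which is the O(NK) cost noted above.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=10, seed=0):
    """Plain Lloyd's K-means; every iteration does N * K distance computations."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centres[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster went empty
                centres[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centres

# Toy descriptors forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
dictionary = kmeans(points, k=2)
```

With millions of descriptors and a large K, the inner nearest-centre search dominates the run time, which motivates the hierarchical and approximate variants.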
Vocabulary Tree
Copyright David Nistér, Henrik Stewénius
Vocabulary Tree (Nister)
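A vocabulary tree can be sketched as hierarchical K-means: cluster the descriptors into a few groups, then recursively cluster each group. The code below is a simplified illustration, not the authors' implementation; the dict-based tree layout, the toy 1-D descriptors, and the parameter names `branch` and `depth` are assumptions.

```python
import random

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, iterations=10):
    """Small Lloyd's K-means returning the centres and the last assignment."""
    centres = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist2(p, centres[i]))].append(p)
        centres = [tuple(sum(c) / len(m) for c in zip(*m)) if m else centres[i]
                   for i, m in enumerate(clusters)]
    return centres, clusters

def build_tree(points, branch, depth, rng):
    """Recursively split the descriptors into `branch` clusters, `depth` levels deep."""
    if depth == 0 or len(points) < branch:
        return {"children": []}
    centres, clusters = kmeans(points, branch, rng)
    return {"children": [(c, build_tree(m, branch, depth - 1, rng))
                         for c, m in zip(centres, clusters) if m]}

def lookup(tree, descriptor):
    """Descend the tree; the sequence of branch choices identifies the visual word."""
    path = []
    while tree["children"]:
        children = tree["children"]
        i = min(range(len(children)), key=lambda j: dist2(descriptor, children[j][0]))
        path.append(i)
        tree = children[i][1]
    return tuple(path)

rng = random.Random(0)
descriptors = [(0.0,), (1.0,), (5.0,), (6.0,), (20.0,), (21.0,), (25.0,), (26.0,)]
tree = build_tree(descriptors, branch=2, depth=2, rng=rng)
word_a = lookup(tree, (0.4,))
word_b = lookup(tree, (5.5,))
```

A lookup compares the descriptor against only `branch` centres per level, so a tree with `branch ** depth` leaves needs about `branch * depth` distance computations per descriptor.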
Approximate Nearest Neighbour (Chum)
Most of the time in K-Means is spent doing Nearest Neighbour
Nearest Neighbour can be approximated using kd-trees
O(N log K) vs. O(NK)
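The saving comes from replacing the linear scan over K centres with a tree descent. The sketch below uses a single kd-tree and a descent with no backtracking, which is a deliberate simplification chosen for brevity and is not claimed to be the scheme used by Chum et al.; the sample centres are made up.

```python
def build_kdtree(points, depth=0):
    """Build a kd-tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build_kdtree(points[:mid], depth + 1),
            "right": build_kdtree(points[mid + 1:], depth + 1)}

def approx_nearest(node, query):
    """Descend once from root to leaf without backtracking.

    Visits O(log K) nodes on a balanced tree, but may miss the true
    nearest neighbour when it lies on the other side of a split.
    """
    best, best_d = None, float("inf")
    while node is not None:
        d = sum((a - b) ** 2 for a, b in zip(node["point"], query))
        if d < best_d:
            best, best_d = node["point"], d
        axis = node["axis"]
        node = node["left"] if query[axis] < node["point"][axis] else node["right"]
    return best

# Hypothetical cluster centres.
centres = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
tree = build_kdtree(centres)
hit = approx_nearest(tree, (9.0, 2.0))
miss = approx_nearest(tree, (6.9, 7.0))
```

For the query (9.0, 2.0) the descent finds the true nearest centre (8.0, 1.0). For (6.9, 7.0) it returns (4.0, 7.0) although (9.0, 6.0) is closer, which shows the accuracy that is traded for speed.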
Another Problem - Synonyms
Visual Polysemy. Single visual word occurring on different (but locally similar) parts on different object categories.
Visual Synonyms. Two different visual words representing a similar part of an object (wheel of a motorbike).
Video Google
Search for recurring objects in a movie
Synonyms suppressed by enforcing consistency in time
Stop list used to throw out words that are too common
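A stop list can be sketched as "drop the most frequent visual words". The cut-off below (`top_fraction`) is an assumed parameter chosen for illustration, and the frame bags are made-up data.

```python
from collections import Counter

def build_stop_list(bags, top_fraction=0.05):
    """Return the most frequent words across all frames.

    `top_fraction` is an assumed cut-off; at least one word is always stopped.
    """
    totals = Counter()
    for bag in bags:
        totals.update(bag)
    n_stop = max(1, int(len(totals) * top_fraction))
    return {w for w, _ in totals.most_common(n_stop)}

def apply_stop_list(bag, stop_list):
    """Remove stopped words from one bag of visual words."""
    return {w: c for w, c in bag.items() if w not in stop_list}

# Hypothetical bags of visual word ids for three frames.
frames = [{1: 5, 2: 1}, {1: 4, 3: 2}, {1: 6, 4: 1}]
stop_list = build_stop_list(frames)
filtered = apply_stop_list(frames[1], stop_list)
```

Word 1 occurs in every frame and carries little information, so it is the one removed here.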
Photo Tourism Overview
Scene reconstruction
Photo Explorer
Input photographs
Relative camera positions and orientations
Point cloud
Sparse correspondence
Copyright Noah Snavely
Photo Tourism Scene Reconstruction
Automatically estimate
position, orientation, and focal length of cameras
3D positions of feature points
Feature detection
Pairwise feature matching
Incremental structure from motion
Correspondence estimation
Copyright Noah Snavely
Photo Tourism
Demo
http://phototour.cs.washington.edu/applet/index.html
Photo Tourism Limitations
Matching is only performed between pairs of images
Does not scale to large datasets
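The scaling problem can be made concrete by counting image pairs, assuming every image is matched against every other one. The collection sizes below are illustrative.

```python
def pair_count(n_images):
    """Number of unordered image pairs: n * (n - 1) / 2."""
    return n_images * (n_images - 1) // 2

small = pair_count(5_000)      # 12,497,500 pairs
large = pair_count(1_000_000)  # 499,999,500,000 pairs
```

The number of pairs grows quadratically, so a collection 200 times larger needs roughly 40,000 times as many matching operations.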
Chum et al. Experiment
5K labeled images of sights at Oxford
1M Flickr images from popular tags (distractors)
Dictionary built from Oxford Images
16M descriptors -> 1M word dictionary
Query for landmarks and calculate PR curve using different forms of query expansion
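A precision-recall curve for one query can be computed from the ranked result list and the set of labeled relevant images. This is a minimal sketch with made-up image ids; the function name is illustrative.

```python
def precision_recall(ranked_ids, relevant_ids):
    """Return a (precision, recall) pair after each position of the ranked list."""
    relevant_ids = set(relevant_ids)
    hits, curve = 0, []
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant_ids:
            hits += 1
        curve.append((hits / rank, hits / len(relevant_ids)))
    return curve

# Hypothetical ranking: "a" and "b" are the labeled landmark images.
curve = precision_recall(["a", "x", "b", "y"], {"a", "b"})
```

Distractor images that rank above true matches lower precision at a given recall, which is what the curves make visible.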
Copyright James Philbin
Text Query Expansion
In text search, some words are similar, but they are different in the dictionary
e.g. gray and grey
Improve results by expanding the query to include similar words
e.g. "grey goose" -> "grey goose gray"
Similar words are found by clustering on document data
At query time, relevant clusters are found and pulled in
False positives add a lot of noise to the results
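The expansion step can be sketched as follows, assuming the clusters of similar words have already been computed offline. The single cluster below is hypothetical and mirrors the "grey goose" example.

```python
def expand_query(query, clusters):
    """Append every word that shares a cluster with a query word.

    `clusters` is assumed to be a precomputed list of sets of similar words.
    """
    words = query.lower().split()
    expanded = list(words)
    for word in words:
        for cluster in clusters:
            if word in cluster:
                # Sort for a stable order; skip words already present.
                expanded.extend(w for w in sorted(cluster) if w not in expanded)
    return " ".join(expanded)

clusters = [{"grey", "gray"}]
expanded = expand_query("grey goose", clusters)
print(expanded)  # grey goose gray
```

A wrongly clustered word would be pulled into every query that touches its cluster, which is how false positives end up adding noise.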
Image Query Expansion - Baseline
Transitive Closure
Average Query Expansion
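A simplified sketch of average query expansion as described by Chum et al.: the query's TF-IDF vector is averaged with the vectors of the top results that passed verification, and the average is issued as a new query. Verification itself is not shown, and the sparse `{word_id: weight}` representation and the sample vectors are assumptions.

```python
def average_query(query_vec, verified_vecs):
    """Average the query's sparse vector with those of the verified results.

    Computes (d_0 + d_1 + ... + d_m) / (m + 1), where d_0 is the query
    and d_1..d_m are the verified results.
    """
    total = dict(query_vec)
    for vec in verified_vecs:
        for word, weight in vec.items():
            total[word] = total.get(word, 0.0) + weight
    n = len(verified_vecs) + 1
    return {word: weight / n for word, weight in total.items()}

# Hypothetical sparse vectors keyed by visual word id.
query = {1: 1.0, 2: 1.0}
verified = [{1: 1.0, 3: 1.0}, {1: 1.0, 2: 1.0, 3: 1.0}]
new_query = average_query(query, verified)
```

Word 3 is absent from the original query but present in both verified results, so it enters the expanded query and can retrieve images the first pass missed.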
Recursive Average Query Expansion
Multiple Image Resolution Expansion
Copyright James Philbin
Demo
http://arthur.robots.ox.ac.uk:8080/search/?id=oxc1_hertford_000011
Results - PR Curves Before & After Expansion
Results - Effect Of Distractors
My distractors: 9K images from searches like "building", "cathedral", "library", "historic", "spire" etc.