5/30/2006 EE 148, Spring 2006 1
Visual Categorization with Bags of Keypoints
Gabriella Csurka Christopher R. Dance
Lixin Fan Jutta Willamowski
Cedric Bray
Presented by Yun-hsueh Liu
What is Generic Visual Categorization?
Categorization: distinguish different classes
Generic Visual Categorization:
Generic: copes with many object types simultaneously and is readily extended to new object types.
Handles variation in view, imaging, lighting, and occlusion, as well as typical object and scene variations.
Previous Work in Computational Vision
Single Category Detection
Decide if a member of one visual category is present in a given image. (faces, cars, targets)
Content-Based Image Retrieval
Retrieve images on the basis of low-level image features, such as colors or textures.
Recognition
Distinguish between images of structurally distinct objects within one class (say, different cell phones).
Bag-of-Keypoints Approach
Pipeline: Interest Point Detection → Key Patch Extraction → Feature Descriptors → Bag of Keypoints → Multi-class Classifier
SIFT Descriptors
Bag of Keypoints (1)
Construction of a vocabulary
K-means clustering finds “centroids” over all the descriptors extracted from all the training images.
Define a “vocabulary” as the set of centroids, where every centroid represents a “word”.
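The vocabulary construction can be sketched as plain k-means over the pooled training descriptors. This is only an illustration under stated assumptions: the function names, the toy 2-D descriptors, and the deterministic evenly spaced initialisation are all hypothetical and not taken from the paper (real descriptors are high-dimensional, and practical k-means usually starts from random or k-means++ seeds).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(descriptor, centroids):
    """Index of the centroid closest to the descriptor."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(descriptor, centroids[k]))


def build_vocabulary(descriptors, k, iterations=20):
    """Run k-means; the returned centroids are the visual 'words'."""
    # Assumption: evenly spaced samples as initial centroids (deterministic).
    step = len(descriptors) // k
    centroids = [list(descriptors[i * step]) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: group descriptors by nearest centroid.
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_centroid(d, centroids)].append(d)
        # Update step: move each centroid to the mean of its members.
        for idx, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[idx] = [sum(m[j] for m in members) / len(members)
                                  for j in range(dim)]
    return centroids


# Toy 2-D "descriptors" pooled from all training images (hypothetical data).
toy_descriptors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
                   (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
vocabulary = build_vocabulary(toy_descriptors, k=2)
```

With these toy points the two centroids settle on the means of the two visible groups, so the vocabulary has two words.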
Bag of Keypoints (2)
Histogram
Counts the number of occurrences of each visual word in each image.
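A minimal sketch of the histogram step, assuming each descriptor is assigned to its nearest centroid. The three-word vocabulary and the descriptors below are made-up 2-D values for illustration, and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bag_of_keypoints(descriptors, vocabulary):
    """Count how many descriptors of one image fall on each visual word."""
    histogram = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda k: squared_distance(d, vocabulary[k]))
        histogram[nearest] += 1
    return histogram


# Hypothetical vocabulary of three words and the descriptors of one image.
vocabulary = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
image_descriptors = [(0.1, 0.2), (4.8, 5.1), (5.3, 4.9), (9.7, 0.4), (0.3, -0.1)]
histogram = bag_of_keypoints(image_descriptors, vocabulary)
print(histogram)  # [2, 2, 1]
```

The resulting vector has one entry per word, so every image is described by a fixed-length vector regardless of how many keypoints it contains.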
Multi-class Classifier
In this paper, classification is based on conventional machine learning approaches:
Naïve Bayes
Support Vector Machine (SVM)
Multi-class Classifier – Naïve Bayes (1)
Let V = {vt}, t = 1,…,N, be a visual vocabulary, in which each vt represents a visual word (a cluster center) in the feature space.
Let I = {Ii} be a set of labeled images.
Let Cj, j = 1,…,M, denote the classes.
N(t,i) = number of times vt occurs in image Ii (the keypoint histogram)
Scoring approach: we want to determine P(Cj|Ii), where
$$P(C_j \mid I_i) \propto P(C_j)\,P(I_i \mid C_j) = P(C_j) \prod_{t=1}^{N} P(v_t \mid C_j)^{N(t,i)} \qquad (*)$$
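Multi-class Classifier – Naïve Bayes (2)
Goal: find the class Cj for which
$$P(C_j \mid I_i) \propto P(C_j) \prod_{t=1}^{N} P(v_t \mid C_j)^{N(t,i)}$$
has maximum value.
To avoid zero probabilities, use Laplace smoothing:
$$P(v_t \mid C_j) = \frac{1 + \sum_{I_i \in C_j} N(t,i)}{N + \sum_{s=1}^{N} \sum_{I_i \in C_j} N(s,i)}$$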
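The Naïve Bayes scoring with Laplace smoothing can be sketched as follows. It works in log space to avoid underflow from multiplying many small probabilities, which is a common implementation choice and not something the slides state. The function names and the toy histograms are hypothetical, and the sketch assumes every class has at least one training image.

```python
import math


def train_naive_bayes(histograms, labels, num_classes):
    """Estimate log P(Cj) and Laplace-smoothed log P(vt | Cj)."""
    vocab_size = len(histograms[0])
    class_counts = [0] * num_classes
    word_counts = [[0] * vocab_size for _ in range(num_classes)]
    for hist, label in zip(histograms, labels):
        class_counts[label] += 1
        for t, n in enumerate(hist):
            word_counts[label][t] += n
    # Assumes every class occurs at least once, otherwise log(0) fails.
    log_prior = [math.log(c / len(labels)) for c in class_counts]
    log_likelihood = []
    for j in range(num_classes):
        total = sum(word_counts[j])
        # Laplace smoothing: add 1 to each count, vocab_size to the total.
        log_likelihood.append(
            [math.log((1 + word_counts[j][t]) / (vocab_size + total))
             for t in range(vocab_size)])
    return log_prior, log_likelihood


def classify(hist, log_prior, log_likelihood):
    """Return (best class, per-class log scores) for one histogram."""
    scores = [lp + sum(n * ll for n, ll in zip(hist, lls))
              for lp, lls in zip(log_prior, log_likelihood)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy keypoint histograms over a 3-word vocabulary (hypothetical data).
train_histograms = [[5, 1, 0], [4, 2, 0], [0, 1, 5], [1, 0, 4]]
train_labels = [0, 0, 1, 1]
log_prior, log_likelihood = train_naive_bayes(train_histograms, train_labels, 2)
predicted, scores = classify([3, 1, 0], log_prior, log_likelihood)
```

Word 2 never occurs in class 0 here, yet smoothing gives it probability 1/15 rather than zero, so a single unseen word cannot veto a class.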
Multi-class Classifier – Support Vector Machine (SVM)
Input: the keypoint histogram for each image
Multi-class one-against-all approach
Linear SVM gives better performance than quadratic or cubic SVM
Goal: find hyperplanes that separate the multi-class data with maximum margin
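The one-against-all scheme can be illustrated with a small sketch: one binary linear classifier per class, trained on "this class versus the rest", with the final label taken from the highest-scoring classifier. The trainer below is plain hinge-loss subgradient descent, used here only as a stand-in for a real SVM solver. The function names, learning rate, regularisation constant, and toy histograms are all assumptions for illustration and do not come from the paper.

```python
def train_binary_linear_svm(samples, targets, epochs=200,
                            learning_rate=0.01, reg=0.001):
    """Hinge-loss subgradient descent; targets are +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regulariser shrinks the weights a little at every step.
            w = [wi * (1 - learning_rate * reg) for wi in w]
            if margin < 1:  # sample is inside the margin or misclassified
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b


def train_one_against_all(histograms, labels, num_classes):
    """Train one 'class j versus the rest' classifier per class."""
    models = []
    for j in range(num_classes):
        targets = [1 if label == j else -1 for label in labels]
        models.append(train_binary_linear_svm(histograms, targets))
    return models


def predict(hist, models):
    """Pick the class whose classifier gives the largest score."""
    scores = [sum(wi * xi for wi, xi in zip(w, hist)) + b for w, b in models]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy keypoint histograms for three classes (hypothetical data).
train_histograms = [[5, 0, 0], [4, 1, 0], [0, 5, 0],
                    [1, 4, 0], [0, 0, 5], [0, 1, 4]]
train_labels = [0, 0, 1, 1, 2, 2]
models = train_one_against_all(train_histograms, train_labels, 3)
predictions = [predict(h, models) for h in train_histograms]
```

On this linearly separable toy set the three classifiers recover the training labels. The per-class scores returned inside `predict` are also what a rank-based measure would sort.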
Multi-class classifier –SVM (2)
Evaluation of Multi-class Classifiers
Three performance measures:
The confusion matrix
Each column of the matrix represents the instances of a true class; each row represents the instances assigned to a predicted class (this is the layout of the result tables on the later slides, where the error rate is listed per column).
The overall error rate = Pr(output class ≠ true class)
The mean rank
The mean position of the correct label when the labels output by the multi-class classifier are sorted by classifier score.
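The three measures can be computed together from the true labels and the per-class scores. In this sketch the confusion counts are stored as `counts[true][predicted]`; which index is drawn as rows or columns is a display choice. The function name and the toy scores are hypothetical, and the mean rank here is averaged over all images rather than per class.

```python
def evaluate(true_labels, score_lists, num_classes):
    """Return (confusion counts, overall error rate, mean rank)."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    errors = 0
    rank_total = 0
    for true, scores in zip(true_labels, score_lists):
        # Classes sorted from highest to lowest classifier score.
        order = sorted(range(num_classes), key=lambda c: scores[c], reverse=True)
        predicted = order[0]
        counts[true][predicted] += 1
        if predicted != true:
            errors += 1
        # Rank 1 means the correct label had the top score.
        rank_total += order.index(true) + 1
    n = len(true_labels)
    return counts, errors / n, rank_total / n


# Hypothetical per-class scores for four test images and three classes.
true_labels = [0, 1, 2, 1]
score_lists = [[0.7, 0.2, 0.1],
               [0.1, 0.6, 0.3],
               [0.5, 0.1, 0.4],
               [0.2, 0.3, 0.5]]
counts, error_rate, mean_rank = evaluate(true_labels, score_lists, 3)
print(error_rate, mean_rank)  # 0.5 1.5
```

Two of the four toy images are misclassified, and in both cases the correct label is second in the sorted list, which gives the mean rank of 1.5.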
n-Fold Cross Validation
What is a “fold”?
Randomly break the dataset into n partitions (folds)
Example: suppose n = 10
Training on 2, 3,…,10; testing on 1 = result 1
Training on 1, 3,…,10; testing on 2 = result 2
…
Answer = average of result 1, result 2, …
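The procedure above can be sketched in a few lines. The helper names and the `evaluate_fold` callback are hypothetical: the callback stands for whatever trains on the training indices and returns a result (for example an error rate) on the held-out fold.

```python
import random


def n_fold_splits(num_items, n, seed=0):
    """Yield (train_indices, test_indices) for each of the n folds."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)  # random partition of the dataset
    folds = [indices[i::n] for i in range(n)]
    for held_out in range(n):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out
                 for idx in fold]
        yield train, test


def cross_validate(num_items, n, evaluate_fold):
    """Average the per-fold results returned by evaluate_fold(train, test)."""
    results = [evaluate_fold(train, test)
               for train, test in n_fold_splits(num_items, n)]
    return sum(results) / len(results)


# With 20 items and n = 10, every fold holds out 2 items and trains on 18.
splits = list(n_fold_splits(20, 10))
# Dummy callback (an assumption): reports the held-out fraction of the data.
average = cross_validate(20, 10, lambda train, test: len(test) / 20)
```

Every item is used for testing exactly once and for training in the other n − 1 rounds, which is why the averaged result uses the whole dataset.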
Experiment on Naïve Bayes – k’s effect
Present the overall error rate as a function of the number of clusters k
Result
Error rate decreases as k increases
Selected operating point: k = 1000
Beyond this point, the error rate decreases only slowly
Experiment on Naïve Bayes – Confusion Matrix
            faces  buildings  trees  cars  phones  bikes  books
faces          76          4      2     3       4      4     13
buildings       2         44      5     0       5      1      3
trees           3          2     80     0       0      5      0
cars            4          1      0    75       3      1      4
phones          9         15      1    16      70     14     11
bikes           2         15     12     0       8     73      0
books           4         19      0     6       7      2     69
error rate     24         56     20    25      27     27     31
mean rank    1.49       1.88   1.33  1.33    1.63   1.57   1.57
Experiment on SVM – Confusion Matrix
            faces  buildings  trees  cars  phones  bikes  books
faces          98         14     10    10      34      0     13
buildings       1         63      3     0       3      1      6
trees           1         10     81     1       0      6      0
cars            0          1      1    85       5      0      5
phones          0          5      4     3      55      2      3
bikes           0          4      1     0       1     91      0
books           0          3      0     1       2      0     73
error rate      2         27     19    15      45      9     27
mean rank    1.04       1.77   1.28  1.30    1.83   1.09   1.39
Interpretation of Results
The confusion matrix: in general, SVM makes more correct predictions than Naïve Bayes
The overall error rate: in general, Naïve Bayes > SVM (SVM makes fewer errors)
The mean rank: in general, SVM < Naïve Bayes (SVM places the correct label closer to first)
Why do we have errors?
There are objects from more than 2 classes in one image
The data set is not totally clean (noise)
Each image is given only one training label
Conclusion
Bag-of-Keypoints is a new and efficient generic visual categorizer.
Evaluated on a seven-category database, the method was shown to be robust to:
Choice of clusters
Background clutter
Multiple objects
Any Questions?
Thank you for listening to my presentation!! :)