

EE368/CS232 Digital Image Processing Winter 2017-2018

Lecture Review and Quizzes (Due: Wednesday, February 28, 1:30pm)

Please review what you have learned in class and then complete the online quiz questions for the following section on OpenEdX [1]:

• Feature-based Methods for Image Matching

Homework #7
Released: Monday, February 19
Due: Wednesday, February 28, 1:30pm

1. Robustness of the SIFT Descriptor (Total of 9 points)

In this problem, we investigate how well SIFT descriptors still match if an image is modified in a variety of ways, e.g., by adding a brightness offset, by adding noise, or by blurring the image. Ideally, feature matches should not be affected by any of the following image modifications. Please download the image hw7_building.jpg from the homework webpage.

(a) Extract a few hundred SIFT features using the vl_sift function in the VLFeat library [2]. Show the feature keypoints overlaid on top of the image using the vl_plotframe function. (1 point)
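A minimal MATLAB sketch of part (a), assuming hw7_building.jpg is in the working directory and VLFeat has been initialized with vl_setup; the PeakThresh value is an assumption chosen to limit the detector to a few hundred features:

```matlab
% Load the image and convert to single-precision grayscale for vl_sift.
img = im2single(rgb2gray(imread('hw7_building.jpg')));

% Extract SIFT keypoints and descriptors; a higher PeakThresh keeps
% only the stronger detections (tune until a few hundred remain).
[frames, descrs] = vl_sift(img, 'PeakThresh', 0.01);

% Overlay the keypoints (position, scale, orientation) on the image.
figure; imshow(img); hold on;
vl_plotframe(frames);
title(sprintf('%d SIFT keypoints', size(frames, 2)));
```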

(b) Add a brightness offset Δ to the grayscale intensity (assumed to be in the range [0, 255]), for the following offset values: Δ = −100, −80, …, 80, 100. Compute SIFT descriptors at the same keypoints (locations/scales/orientations) as in the original image. You can use vl_sift with the ‘Frames’ option to input custom keypoints. Match the original image’s SIFT descriptors and the modified image’s SIFT descriptors in the 128-dimensional feature space using nearest-neighbor search with a distance ratio test as implemented in the vl_ubcmatch function (with default threshold 1.5). Measure “repeatability”, which is defined as the number of matching features divided by the number of features in the original image. Display and submit a plot of repeatability versus Δ, and comment on the SIFT descriptor’s robustness against brightness changes.

[1] https://suclass.stanford.edu/courses/course-v1:Engineering+EE368+Winter2018
[2] http://www.vlfeat.org/
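This measurement might be sketched as follows, reusing img, frames, and descrs from part (a); since im2single scales intensities to [0, 1], the offset is divided by 255 before clipping:

```matlab
% Sketch for part (b): repeatability vs. brightness offset.
offsets = -100:20:100;
repeatability = zeros(size(offsets));
for k = 1:numel(offsets)
    % Apply the offset (in [0,255] units) and clip to the valid range.
    imgMod = img + single(offsets(k)) / 255;
    imgMod = min(max(imgMod, 0), 1);

    % Recompute descriptors at the original keypoints.
    [~, descrsMod] = vl_sift(imgMod, 'Frames', frames);

    % Nearest-neighbor matching with the default ratio threshold 1.5.
    matches = vl_ubcmatch(descrs, descrsMod, 1.5);

    % Repeatability = matched features / features in the original image.
    repeatability(k) = size(matches, 2) / size(frames, 2);
end
figure; plot(offsets, repeatability, '-o');
xlabel('\Delta (brightness offset)'); ylabel('Repeatability');
```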

(2 points)

(c) Repeat part (b), except instead of changing the brightness, adjust the contrast of the image by raising the grayscale intensity to a power γ, for the following γ values: γ = 0.50, 0.75, …, 1.75, 2.00. Display and submit a plot of repeatability versus γ, and comment on the SIFT descriptor’s robustness against contrast changes. (2 points)

(d) Repeat part (b), except instead of changing the brightness, add zero-mean Gaussian noise with standard deviation σn, for the following standard deviation values: σn = 0, 5, …, 25, 30. Use the function randn to generate the noise. Display and submit a plot of repeatability versus σn, and comment on the SIFT descriptor’s robustness against additive noise.

(2 points)

(e) Repeat part (b), except instead of changing the brightness, convolve the image with a Gaussian kernel of standard deviation σb, for the following standard deviation values: σb = 1, 2, …, 9, 10. Use the function fspecial to construct a Gaussian kernel of standard deviation σb with finite extent (10σb + 1) × (10σb + 1). Display and submit a plot of repeatability versus σb, and comment on the SIFT descriptor’s robustness against blurring. (2 points)
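The blurring variant might be sketched as follows, again reusing img, frames, and descrs from part (a):

```matlab
% Sketch for part (e): repeatability vs. Gaussian blur.
sigmas = 1:10;
repeatability = zeros(size(sigmas));
for k = 1:numel(sigmas)
    sb = sigmas(k);
    % Gaussian kernel of std. dev. sb with the finite extent
    % (10*sb + 1) x (10*sb + 1) specified above.
    h = fspecial('gaussian', [10*sb + 1, 10*sb + 1], sb);
    imgMod = imfilter(img, h, 'replicate');

    [~, descrsMod] = vl_sift(imgMod, 'Frames', frames);
    matches = vl_ubcmatch(descrs, descrsMod, 1.5);
    repeatability(k) = size(matches, 2) / size(frames, 2);
end
figure; plot(sigmas, repeatability, '-o');
xlabel('\sigma_b (blur std. dev.)'); ylabel('Repeatability');
```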

Note: Please attach relevant MATLAB code.


2. Recognition of Posters with Local Image Features (Total of 8 points) When you visit a poster at a conference/meeting, it would be useful to be able to snap a picture of the poster and automatically retrieve the authors’ contact information and the corresponding publication for later review. Please download the images hw7_poster_1.jpg, hw7_poster_2.jpg, and hw7_poster_3.jpg from the handouts webpage. These query images show 3 different posters during a previous EE368/CS232 poster session. Also download hw7_poster_database.zip, which contains clean database images of all posters shown during that poster session.

For each query image, use the following algorithm to match to the best database image:

(1) Extract SIFT features from the query image using vl_sift in the VLFeat library.
(2) Match the query image’s SIFT features to every database image’s SIFT features using nearest-neighbor search with a distance ratio test as implemented in vl_ubcmatch.
(3) From the feature correspondences that pass the distance ratio test, find the inliers using RANSAC with a homography as the geometric mapping.
(4) Report the database image with the largest number of inliers after RANSAC as the best matching database image.

Please submit the following results for each query image:

(a) A side-by-side view of the query image and the best matching database image. (2 points)
(b) A side-by-side view of the query image and the best matching database image, with SIFT keypoints overlaid on each image. (2 points)
(c) A side-by-side view of the query image and the best matching database image with feature correspondences after the distance ratio test overlaid and connected by lines. (2 points)
(d) A side-by-side view of the query image and the best matching database image with feature correspondences after RANSAC overlaid and connected by lines. (2 points)
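Steps (1)–(4) might be outlined as below for one query image; ransacHomography is a hypothetical placeholder for the RANSAC step implemented in sift_match.m, and the database folder name is assumed from the zip file:

```matlab
% Outline of the pairwise matching pipeline for one query image.
queryImg = im2single(rgb2gray(imread('hw7_poster_1.jpg')));
[fq, dq] = vl_sift(queryImg);                 % (1) query features

dbFiles = dir(fullfile('hw7_poster_database', '*.jpg'));
numInliers = zeros(1, numel(dbFiles));
for k = 1:numel(dbFiles)
    dbImg = im2single(rgb2gray(imread( ...
        fullfile('hw7_poster_database', dbFiles(k).name))));
    [fd, dd] = vl_sift(dbImg);

    % (2) Nearest-neighbor matching with the distance ratio test.
    matches = vl_ubcmatch(dq, dd, 1.5);

    % (3) Inliers under a homography; ransacHomography stands in for
    % the RANSAC routine from sift_match.m (hypothetical helper name).
    inliers = ransacHomography(fq(1:2, matches(1, :)), ...
                               fd(1:2, matches(2, :)));
    numInliers(k) = numel(inliers);
end

% (4) The database image with the most RANSAC inliers is the best match.
[~, best] = max(numInliers);
fprintf('Best match: %s\n', dbFiles(best).name);
```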

An example of how to implement the distance ratio test and RANSAC is available in the script sift_match.m available on the handouts webpage. This script requires the VLFeat library. Note: Please attach relevant MATLAB code.


3. Recognition of Paintings with a Vocabulary Tree (Total of 7 points) Although the pairwise image matching algorithm described in Problem 2 can accurately find the best matching database image for each query image, the computational cost is high. Depending on the speed of your machine, the matching procedure can take tens of seconds to several minutes for each query image. In this problem, we will use a vocabulary tree to substantially speed up the image retrieval process and quickly find the best matching database candidates.

Training

Please download the file hw7_training_descriptors.mat from the homework webpage. This file contains 200,000 training SIFT descriptors extracted from 9,000 DVD cover images. Train a vocabulary tree with branch factor 10 and depth 4, so that the tree has 10,000 leaf nodes. You can use the function vl_hikmeans in the VLFeat library to perform hierarchical k-means clustering.

Testing

Please download the following files from the handouts webpage:

• hw7_paintings_database.zip: contains 91 reference images of paintings
• hw7_paintings_query.zip: contains 5 query images of paintings
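The training step described above might be sketched as follows; the variable name stored in the .mat file is an assumption (check it with whos('-file', 'hw7_training_descriptors.mat')):

```matlab
% Sketch of vocabulary tree training with hierarchical k-means.
load('hw7_training_descriptors.mat');    % assumed to provide a
                                         % 128 x 200000 matrix
                                         % training_descriptors

branchFactor = 10;
numLeaves = branchFactor^4;              % depth 4 -> 10,000 leaf nodes

% vl_hikmeans expects uint8 descriptor data.
tree = vl_hikmeans(uint8(training_descriptors), branchFactor, numLeaves);
```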

[Figure: example query image and corresponding database image]

Quantize each database image’s SIFT descriptors and each query image’s SIFT descriptors through the vocabulary tree. You can use the function vl_hikmeanspush. Compute a histogram of visit counts over the leaf nodes. You can use the function vl_hikmeanshist. Normalize each histogram to have unit L1 norm; each histogram can then be considered as an empirical probability mass function. For two normalized histograms u and v, use the L1 distance

‖u − v‖₁ = Σ_{i=1}^{n} |u_i − v_i|

as a measurement of their distance or dissimilarity from each other.
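The quantization and scoring might be sketched as below for one query; queryDescrs and the cell array dbDescrs (each image's uint8 SIFT descriptors) are hypothetical names, and tree is the vocabulary tree from the training step:

```matlab
% Quantize the query descriptors through the tree and build its
% visit-count histogram, normalized to unit L1 norm.
pathsQ = vl_hikmeanspush(tree, uint8(queryDescrs));
hq = double(vl_hikmeanshist(tree, pathsQ));
hq = hq / sum(hq);

% Score every database image by L1 histogram distance.
dist = zeros(1, numel(dbDescrs));
for k = 1:numel(dbDescrs)
    paths = vl_hikmeanspush(tree, uint8(dbDescrs{k}));
    h = double(vl_hikmeanshist(tree, paths));
    h = h / sum(h);
    dist(k) = sum(abs(hq - h));          % L1 distance between histograms
end

% Lowest L1 distance (highest histogram intersection) ranks first.
[~, ranking] = sort(dist, 'ascend');
```

In practice the database histograms would be computed once, before any query, so that only the query-side quantization and the distance computation count toward retrieval time in part (b).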

Vocabulary tree-based retrieval finds the database images whose histograms have the lowest L1 distances (equivalently, the highest histogram intersection scores) to the query image’s histogram. Please submit the following results for each query image:

(a) Show the 10 database images with the lowest L1 histogram distance scores, next to the query image. Also write the L1 distance next to each database image. Does the correct database image attain the lowest L1 distance? Note that due to randomness in k-means, you may not get exactly the same results if you run your script multiple times. Also, adjusting the number of SIFT features (by changing the ‘PeakThresh’ and ‘EdgeThresh’ arguments in vl_sift) can have a significant effect on the quality of the ranking with tree histograms. (4 points)

(b) Report the amount of time required to (i) extract features from the query image, and (ii) match the query image to the database. Exclude the time spent on feature extraction and descriptor quantization for the database images, because these processing steps can be performed before a query occurs. (MATLAB functions: tic, toc) (3 points)

Note: Please attach relevant MATLAB code.
