45
Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Embed Size (px)

Citation preview

Page 1: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Instructor: Mircea Nicolescu

Lecture 15

CS 485 / 685

Computer Vision

Page 2: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Matching SIFT Features

• Given a feature in I1, how to find the best match in I2?1. Define distance function that compares two descriptors

2. Test all the features in I2, find the one with min. distance

I1I2

2

Page 3: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Matching SIFT Features

I1 I2

f1 f2

• What distance function should we use?− Use (sum of square differences)− Can give good scores to very ambiguous (bad) matches

1

0

22121 )(),(

N

iii ffffSSD

3

Page 4: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Accept a match if SSD(f1,f2) < t • How do we choose t?

Matching SIFT Features

4

Page 5: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Matching SIFT Features

• A better distance measure is the following:− SSD(f1, f2) / SSD(f1, f2’)

− f2 is best SSD match to f1 in I2

− f2’ is 2nd best SSD match to f1 in I2

I1 I2

f1 f2f2'

5

Page 6: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Matching SIFT Features

• Accept a match if SSD(f1, f2) / SSD(f1, f2’) < t− t = 0.8 has given good results in object recognition− 90% of false matches were eliminated− Less than 5% of correct matches were discarded

6

Page 7: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Matching SIFT Features

• How to evaluate the performance of a feature matcher?

5075

200

7

Page 8: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Matching SIFT Features

• True positives (TP) = # of detected matches that are correct

• False positives (FP) = # of detected matches that are incorrect

5075

200false match

true match

• Threshold t affects # of correct/false matches

8

Page 9: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Matching SIFT Features

1

0.7

0 1FP rate

TPrate

0.1

• ROC Curve

- Generated by computing (FP rate, TP rate) for different thresholds

- Maximize area under the curve (AUC)

http://en.wikipedia.org/wiki/Receiver_operating_characteristic

9

Page 10: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Applications of SIFT

• Object recognition• Object categorization• Location recognition• Robot localization• Image retrieval• Image panoramas

10

Page 11: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Object Recognition

Object Models

11

Page 12: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Object Categorization

12

Page 13: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Location Recognition

13

Page 14: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Robot Localization

14

Page 15: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Map Continuously Built over Time

15

Page 16: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Image Retrieval – Example 1

…> 5000images

change in viewing angle

16

Page 17: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Matches

22 correct matches

17

Page 18: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Image Retrieval – Example 2

…> 5000images

change in viewing angle

+ scale

18

Page 19: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

33 correct matches

Matches

19

Page 20: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Image Panoramas from Unordered Image Set

20

Page 21: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Variations of SIFT features

• PCA-SIFT

• SURF

• GLOH

21

Page 22: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

SIFT Steps – Review

(1) Scale-space extrema detection− Extract scale and rotation invariant interest points

(keypoints)

(2) Keypoint localization− Determine location and scale for each interest point− Eliminate “weak” keypoints

(3) Orientation assignment− Assign one or more orientations to each keypoint

(4) Keypoint descriptor− Use local image gradients at the selected scale

D. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, 60(2):91-110, 2004.

Cited 9589 times (as of 3/7/2011)22

Page 23: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Steps 1-3 are the same; Step 4 is modified.• Take a 41 x 41 patch at the given scale, centered

at the keypoint, and normalized to a canonical direction

PCA-SIFT

Yan Ke and Rahul Sukthankar, “PCA-SIFT: A More Distinctive Representation for Local Image Descriptors”, Computer Vision and Pattern Recognition, 2004.

23

Page 24: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Instead of using weighted histograms, concatenate the horizontal and vertical gradients (39 x 39) into a long vector

• Normalize vector to unit length

PCA-SIFT

2 x 39 x 39 = 3042 elements vector

24

Page 25: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Reduce the dimensionality of the vector using Principal Component Analysis (PCA)− e.g., from 3042 to 36

• Sometimes, less discriminatory than SIFT

PCA-SIFT

PCA

N x 1K x 1

'11 KxNxKxN IIA

25

Page 26: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

SURF: Speeded Up Robust Features

• Speed-up computations by fast approximation of (i) Hessian matrix and (ii) descriptor using “integral images”

• What is an “integral image”?

Herbert Bay, Tinne Tuytelaars, and Luc Van Gool, “SURF: Speeded Up Robust Features”, European Conference on Computer Vision (ECCV), 2006.

26

Page 27: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Integral Image

• The integral image IΣ(x,y) of an image I(x, y) represents the sum of all pixels in I(x,y) of a rectangular region formed by (0,0) and (x,y).

Using integral images, it takes only four array references to calculate the sum of pixels over a rectangular region of any size.

0 0

( , ) ( , )j yi x

i j

I x y I i j

27

Page 28: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Approximate Lxx, Lyy, and Lxy using box filters

• Can be computed very fast using integral images!

(box filters shown are 9 x 9 – good approximations for a Gaussian with σ=1.2)

derivative approximation approximationderivative

Integral Image

28

Page 29: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• In SIFT, images are repeatedly smoothed with a Gaussian and subsequently sub-sampled in order to achieve a higher level of the pyramid

SURF: Speeded Up Robust Features

29

Page 30: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Alternatively, we can use filters of larger size on the original image

• Due to using integral images, filters of any size can be applied at exactly the same speed!

(see Tuytelaars’ paper for details)

SURF: Speeded Up Robust Features

30

Page 31: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Approximation of H:

:

ˆ ˆ:

ˆ ˆ

xx xySIFTapprox

yx yy

xx xySURFapprox

yx yy

D DSIFT H

D D

L LSURF H

L L

Using DoG

Using box filters

SURF: Speeded Up Robust Features

31

Page 32: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Instead of using a different measure for selecting the location and scale of interest points (e.g., Hessian and DOG in SIFT), SURF uses the determinant of to find both.

• Determinant elements must be weighted to obtain a good approximation:

SURFapproxH

2ˆ ˆ ˆdet( ) (0.9 )SURFapprox xx yy xyH L L L

SURF: Speeded Up Robust Features

32

Page 33: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Once interest points have been localized both in space and scale, the next steps are:• Orientation assignment• Keypoint descriptor

SURF: Speeded Up Robust Features

33

Page 34: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Orientation assignment

Circular neighborhood of radius 6σ around the interest point(σ = the scale at which the point was detected)

x response y response

Can be computed very fast using integral images!

60°

angle

( , )dx dy

SURF: Speeded Up Robust Features

34

Page 35: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Sum the response over each sub-region for dx and dy separately

• To bring in information about the polarity of the intensity changes, extract the sum of absolute value of the responses too

Feature vector size:

4 x 16 = 64

• Keypoint descriptor (square region of size 20σ)

4 x 4grid

SURF: Speeded Up Robust Features

35

Page 36: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• SURF-128− The sum of dx and |

dx| are computed separately for points where dy < 0 and dy >0

− Similarly for the sum of dy and |dy|

− More discriminatory!

SURF: Speeded Up Robust Features

36

Page 37: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

• Has been reported to be 3 times faster than SIFT.

• Less robust to illumination and viewpoint changes compared to SIFT.

K. Mikolajczyk and C. Schmid,"A Performance Evaluation of Local Descriptors", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, 2005.

SURF: Speeded Up Robust Features

37

Page 38: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Gradient Location-Orientation Histogram (GLOH)

• Compute SIFT using a log-polar location grid:− 3 cells in radial direction (i.e., radius 6, 11, and 15)− 8 cells in angular direction (except for central one)

• Gradient orientation quantized in 16 bins • Total: (2x8+1)*16 = 272 bins PCA

K. Mikolajczyk and C. Schmid, "A Performance Evaluation of Local Descriptors", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, 2005.

38

Page 39: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Shape Context

• A 3D histogram of edge point locations and orientations− Edges are extracted by the Canny edge detector.− Location is quantized into 9 bins (using a log-polar

coordinate system).− Orientation is quantized in 4 bins (horizontal, vertical,

and two diagonals).

• Total number of features: 4 x 9 = 36

K. Mikolajczyk and C. Schmid, "A Performance Evaluation of Local Descriptors", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, 2005.

39

Page 40: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Spin Image

• A histogram of quantized pixel locations and intensity values− A normalized histogram is computed for each of five

rings centered on the region. − The intensity of a normalized patch is quantized into 10

bins.

• Total number of features: 5 x 10 = 50

K. Mikolajczyk and C. Schmid, "A Performance Evaluation of Local Descriptors", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, 2005.

40

Page 41: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Differential Invariants

Example: some Gaussian derivatives up to fourth order

• “Local jets” of derivatives obtained by convolving the image with Gaussian derivates• Derivates are computed at different orientations by rotating the image patches

( , ) ( )

( , ) ( )

( , )* ( )

( , )* ( )

( , )* ( )

( , )* ( )

x

y

xx

xy

yy

I x y G

I x y G

I x y G

I x y G

I x y G

I x y G

computeinvariants

41

Page 42: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Bank of Filters

(e.g., Gabor filters)

42

Page 43: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Moment Invariants

• Moments are computed for derivatives of an image patch using:

where p and q is the order, a is the degree, and Id is the image gradient in direction d

• Derivatives are computed in x and y directions

43

Page 44: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Bank of Filters: Steerable Filters

44

Page 45: Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision

Object Recognition

• Model-based object recognition

• Generic object recognition

45