A Tutorial on using SIFT Presented by Jimmy Huff (Slightly modified by Josiah Yoder for Winter 2014-2015)


A Tutorial on using SIFT

Presented by Jimmy Huff

(Slightly modified by Josiah Yoder for Winter 2014-2015)

Introduction

• The Scale-Invariant Feature Transform by David Lowe is useful in many applications of object recognition.

• Our objective in this presentation is to understand how to extract SIFT descriptors from an image.

Introduction

• To extract SIFT keypoints, we use a cascaded filtering algorithm with the following four steps of filtering:
  – Scale-Space Extrema Detection
  – Keypoint Localization
  – Orientation Assignment
  – Keypoint Descriptor

• This algorithm is efficient because its more expensive operations are applied only to the small subset of image locations that pass the earlier steps.

Scale-Space Extrema Detection – Get the Points!

• In order to have scale-invariant features, we must have a way to extract features from an image across all scales.

• This can be done using a continuous function known as scale-space (Witkin, 1983)

• Under reasonable assumptions, the only possible scale-space kernel is the Gaussian function.

• Lowe proposed to use Difference of Gaussians (DOG) in order to collect extrema as interest points.

Scale-Space Extrema Detection – Get the Points!

• Scale-space is divided into octaves, and each octave is divided into S levels.

• The smoothing is done incrementally, such that σ of image number S + 1 in the octave is twice that of the first image.
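As a worked example of this schedule, the sketch below assumes a constant multiplicative step k = 2^(1/S) between successive images, as in Lowe's paper. The function name `octave_sigmas`, the `extra` parameter, and the starting value of 1.6 are illustrative choices, not values taken from these slides.

```python
def octave_sigmas(sigma0, S, extra=0):
    """Return the blur levels for one octave.

    Assumes each image is blurred by a constant factor k = 2 ** (1 / S)
    relative to the previous one, so image number S + 1 has sigma = 2 * sigma0.
    `extra` (hypothetical) appends further images beyond image S + 1.
    """
    k = 2.0 ** (1.0 / S)
    return [sigma0 * k ** i for i in range(S + 1 + extra)]


# Example: an octave with S = 3 levels, starting at an assumed sigma of 1.6.
sigmas = octave_sigmas(sigma0=1.6, S=3)
print([round(s, 3) for s in sigmas])
```

The last value in the list is twice the first, which is where the next octave would begin.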

Scale-Space Extrema Detection – Get the Points!

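• DOG is used for its efficiency: it requires only the subtraction of adjacent Gaussian-smoothed images, and it closely approximates the scale-normalized Laplacian of Gaussian.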

• Using the images to the right, we may now find the extrema for this octave.
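The DOG images for one octave can be sketched as a subtraction of neighboring smoothed images. The example below assumes the images are already blurred and stored as plain lists of rows; the function name `difference_of_gaussians` and the tiny made-up pixel values are for illustration only.

```python
def difference_of_gaussians(blurred):
    """Subtract each Gaussian-blurred image from the next one in the octave.

    `blurred` is a list of equally sized 2-D images (lists of rows), ordered
    by increasing sigma. N blurred images yield N - 1 DOG images.
    """
    dogs = []
    for lower, upper in zip(blurred, blurred[1:]):
        dogs.append([[b - a for a, b in zip(row_lower, row_upper)]
                     for row_lower, row_upper in zip(lower, upper)])
    return dogs


# Three tiny stand-in "blurred" images (made-up values, not real data).
stack = [
    [[10, 20], [30, 40]],
    [[12, 19], [28, 41]],
    [[13, 18], [27, 41]],
]
dogs = difference_of_gaussians(stack)
print(len(dogs))
```

Because each DOG image costs one subtraction per pixel, the expensive part of this step is the Gaussian smoothing itself.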

Scale-Space Extrema Detection – Get the Points!

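• If a point is greater than or less than all 26 of its neighbors (8 in its own DOG image and 9 in each of the DOG images directly above and below), it is regarded as an extreme point.

• This is a relatively inexpensive step, as most points are eliminated after the first few comparisons and are never compared to every neighbor.

• Note that this comparison cannot be done on the boundaries of an image or on the top and bottom DOG images of an octave.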
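The 26-neighbor comparison can be sketched as follows. This is a minimal version that assumes the DOG images of one octave are stored as a list of 2-D lists; the name `is_extremum` and the strict inequality (ties are rejected) are choices made for this example.

```python
def is_extremum(dogs, s, y, x):
    """True if dogs[s][y][x] is strictly above or strictly below all 26 neighbors.

    `dogs` is a list of DOG images (lists of rows) for one octave. The caller
    must keep s, y and x away from the borders, as noted above.
    """
    value = dogs[s][y][x]
    is_max = is_min = True
    for ds in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if ds == 0 and dy == 0 and dx == 0:
                    continue  # skip the point itself
                neighbor = dogs[s + ds][y + dy][x + dx]
                if neighbor >= value:
                    is_max = False
                if neighbor <= value:
                    is_min = False
                if not (is_max or is_min):
                    return False  # early exit: most points fail quickly
    return True


flat = [[0.0] * 3 for _ in range(3)]
peak = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
print(is_extremum([flat, peak, flat], 1, 1, 1))
```

The early return is what keeps this step cheap: a point stops being tested as soon as it can be neither a maximum nor a minimum.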

Scale-Space Extrema Detection – Get the Points!

Each octave is processed separately. Each octave starts with σ twice the value of σ of the previous octave and continues to increase.

As sample points are collected, they are stored as a three-vector

p = (x, y, σ) [σ being scale in this case]

Refine the Points!

If we were to stop after the first step, we would have too many interest points to be effective. In this second step, we eliminate points of low contrast. [Ignoring localization of “real” SIFT here…]

Can you see the truck??

Refine the Points!

Only keep points where |DOG| > some threshold (e.g. 3% of maximum intensity in original image). The absolute value is used so that strong minima are kept along with strong maxima.
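A minimal sketch of this contrast filter follows. It compares the absolute DOG value against the threshold so that minima are treated like maxima, which is an assumption added on top of the slide. The names `filter_low_contrast` and `dog_value` are hypothetical, and the responses in the example are made up.

```python
def filter_low_contrast(points, dog_value, max_intensity, fraction=0.03):
    """Keep points whose absolute DOG response exceeds fraction * max_intensity.

    `points` is a list of (x, y, sigma) tuples. `dog_value` is a hypothetical
    lookup function that returns the DOG response at a given point.
    """
    threshold = fraction * max_intensity
    return [p for p in points if abs(dog_value(p)) > threshold]


# Made-up responses for three candidate points in an 8-bit image.
responses = {(5, 7, 1.6): 2.0, (9, 3, 1.6): -9.5, (4, 4, 3.2): 12.0}
kept = filter_low_contrast(list(responses), responses.get, max_intensity=255)
print(kept)
```

With a maximum intensity of 255 the threshold is 7.65, so only the first candidate is dropped.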

Refine the Points!

• By applying this to our previous image, with 8714 sample points…

• We reduce the number of sample points to 362

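Further Refine the Points!

We may further refine the sample points by removing those that lie on edges. First, we take the 2 × 2 Hessian matrix of the DOG function D, computed at the location and scale of the keypoint:

$$H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}$$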

Further Refine the Points!

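The eigenvalues of the matrix H are proportional to the principal curvatures of D. If a point is on an edge, its ratio of eigenvalues will be very high (recall Harris Corner Detector). Since we are only concerned with the ratio, let α be the larger eigenvalue and β the smaller one, and set a threshold r, where α = rβ and

$$\frac{\mathrm{Tr}(H)^2}{\mathrm{Det}(H)} = \frac{(\alpha + \beta)^2}{\alpha\beta} = \frac{(r\beta + \beta)^2}{r\beta^2} = \frac{(r + 1)^2}{r}$$

Therefore, if

$$\frac{\mathrm{Tr}(H)^2}{\mathrm{Det}(H)} \geq \frac{(r + 1)^2}{r}$$

the point is ignored.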
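The edge test can be sketched without computing eigenvalues, using only the trace and determinant of H. The example below estimates the second derivatives with finite differences, which is one common choice and an assumption here. The default r = 10 is the value used in Lowe's paper, since the slides do not state one, and the function names are illustrative.

```python
def hessian_at(D, y, x):
    """Estimate (dxx, dyy, dxy) at D[y][x] with finite differences."""
    dxx = D[y][x + 1] - 2 * D[y][x] + D[y][x - 1]
    dyy = D[y + 1][x] - 2 * D[y][x] + D[y - 1][x]
    dxy = (D[y + 1][x + 1] - D[y + 1][x - 1]
           - D[y - 1][x + 1] + D[y - 1][x - 1]) / 4.0
    return dxx, dyy, dxy


def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Return True if the keypoint is NOT edge-like and should be kept."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False  # curvatures have opposite signs (or one is zero): reject
    return trace * trace / det < (r + 1) ** 2 / r


# A small isolated blob in a DOG image (made-up values) is kept.
blob = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
print(passes_edge_test(*hessian_at(blob, 1, 1)))
```

A point on a straight edge has one large and one small curvature, so its trace-squared over determinant ratio grows well beyond the bound and the test returns False.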

Further Refine the Points!

• By applying this to our previous image, with 362 sample points…

• We reduce the number of sample points to 240

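Orientation Assignment

• In order to be rotation invariant, each point must have a reference angle based on its neighboring points.

• We find the gradient magnitude m and angle θ of every pixel in the scale space by the following equations, where L is the Gaussian-smoothed image at the keypoint's scale:

$$m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2}$$

$$\theta(x, y) = \tan^{-1}\left(\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}\right)$$

• We are concerned with the points in the region of the keypoint.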
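The per-pixel magnitude and angle can be sketched with central differences on a smoothed image. The example assumes the image is a list of rows and uses `math.atan2` so that the angle covers the full 0 to 360 degree range; the name `gradient` and the sample values are illustrative.

```python
import math


def gradient(L, y, x):
    """Return (magnitude, angle in degrees in [0, 360)) at pixel L[y][x].

    L is a Gaussian-smoothed image stored as a list of rows. Central
    differences are used, so (y, x) must not lie on the image border.
    """
    dx = L[y][x + 1] - L[y][x - 1]
    dy = L[y + 1][x] - L[y - 1][x]
    magnitude = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return magnitude, angle


# Made-up 3 x 3 patch: horizontal difference 3, vertical difference 4.
patch = [[0, 0, 0], [0, 0, 3], [0, 4, 0]]
print(gradient(patch, 1, 1))
```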

Orientation Assignment

• The magnitudes are weighted according to a Gaussian function centered at the keypoint.

Orientation Assignment

We then use the magnitudes to populate a histogram of 36 bins, each covering 10° of orientation.

[Figure: orientation histogram; x-axis: bins 1 to 36, y-axis: 0 to 80]
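Filling the 36-bin histogram can be sketched as below. The example assumes each sample is a (magnitude, angle in degrees) pair and that any Gaussian weighting has already been applied to the magnitudes; the function name and sample values are made up for illustration.

```python
def orientation_histogram(samples, bins=36):
    """Accumulate gradient magnitudes into orientation bins.

    `samples` is an iterable of (magnitude, angle_degrees) pairs taken from
    the region around a keypoint.
    """
    width = 360.0 / bins
    hist = [0.0] * bins
    for magnitude, angle in samples:
        # The final "% bins" guards against an angle that wraps to exactly 360.
        index = int((angle % 360.0) // width) % bins
        hist[index] += magnitude
    return hist


samples = [(2.0, 5.0), (3.0, 9.9), (1.0, 355.0), (4.0, 360.0)]
hist = orientation_histogram(samples)
print(hist[0], hist[35])
```

Each sample votes with its magnitude rather than with a count of one, so strong gradients dominate the histogram.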

Orientation Assignment

[Figure: orientation histogram; x-axis: bins 1 to 36, y-axis: 0 to 80]

A parabola is fit to the highest histogram bin and its two neighboring bins. The maximum of this parabola gives us the angle θ. The point now has four components

p = (x, y, σ, θ)
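The parabola fit has a closed form, sketched below. This version assumes that bin i is centered at (i + 0.5) times the bin width and that the histogram wraps around, so the first and last bins are neighbors; the name `peak_orientation` is illustrative.

```python
def peak_orientation(hist):
    """Return the interpolated angle (degrees) of the histogram's highest peak.

    A parabola is fitted through the highest bin and its two circular
    neighbors, and the position of its vertex is converted to an angle.
    """
    bins = len(hist)
    width = 360.0 / bins
    i = max(range(bins), key=lambda b: hist[b])
    left, center, right = hist[i - 1], hist[i], hist[(i + 1) % bins]
    denom = left - 2.0 * center + right
    # Vertex offset from the center bin, in units of bins (between -0.5 and 0.5).
    offset = 0.0 if denom == 0 else 0.5 * (left - right) / denom
    return ((i + 0.5 + offset) * width) % 360.0


hist = [0.0] * 36
hist[9], hist[10], hist[11] = 4.0, 8.0, 6.0
theta = peak_orientation(hist)
print(theta)
```

When the two neighbors are equal the peak stays at the bin center; otherwise it shifts toward the larger neighbor.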

Keypoint Descriptor

We now assign a descriptor to the sample point. The two points shown represent sample points, with the red arrow being each point's orientation assignment. By assigning a keypoint descriptor, we will know whether these two are alike or not.

Keypoint Descriptor

We again use gradients of neighboring pixels to determine the descriptor. The gradients in the region are weighted by a Gaussian window whose size is proportional to the scale of the keypoint.

Keypoint Descriptor

We must first rotate the neighboring pixels' gradient vectors relative to the keypoint’s angle θ.
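This rotation can be sketched for a single neighboring sample. The example assumes angles in degrees and offsets measured from the keypoint in pixels; both the sample's position and its gradient angle are expressed relative to θ. The function name `rotate_sample` is hypothetical.

```python
import math


def rotate_sample(dx, dy, grad_angle, theta):
    """Express one neighbor's offset and gradient angle relative to theta.

    (dx, dy) is the neighbor's offset from the keypoint. The offset is
    rotated by -theta and theta is subtracted from the gradient angle, so
    the keypoint's own orientation becomes 0 degrees.
    """
    t = math.radians(theta)
    rx = dx * math.cos(t) + dy * math.sin(t)
    ry = -dx * math.sin(t) + dy * math.cos(t)
    return rx, ry, (grad_angle - theta) % 360.0


# A neighbor lying along the keypoint's 90 degree orientation ends up on the x-axis.
print(rotate_sample(0.0, 1.0, 120.0, 90.0))
```

After this step, two views of the same patch that differ only by an in-plane rotation produce the same relative positions and angles.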

Keypoint Descriptor

Notice that these two are (most likely) a match after this step is done to ensure rotation invariance!

Keypoint Descriptor

We then group the rotated gradient vectors into a grid of cells, each with an 8-bin orientation histogram. A 2 x 2 grid would give 2 x 2 x 8 = 32 values. However, experimentation has shown it is best to use a 4 x 4 grid with 8 bins each for maximum effectiveness and efficiency. The result is a feature vector with 4 x 4 x 8 = 128 elements.
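The 4 x 4 grid of 8-bin histograms can be sketched as follows. This is a simplified version that assumes a 16 x 16 sample region and hard binning (each sample votes for exactly one cell and one orientation bin), whereas full SIFT interpolates between neighboring bins. The function name, parameters, and sample values are illustrative.

```python
def build_descriptor(samples, region=16, cells=4, bins=8):
    """Build a cells x cells x bins descriptor (4 x 4 x 8 = 128 by default).

    `samples` is an iterable of (rx, ry, magnitude, angle_degrees), where
    (rx, ry) is the sample position already rotated into the keypoint frame,
    measured from the keypoint and covering [-region / 2, region / 2).
    """
    cell_size = region / cells
    bin_width = 360.0 / bins
    desc = [0.0] * (cells * cells * bins)
    for rx, ry, magnitude, angle in samples:
        col = int((rx + region / 2) // cell_size)
        row = int((ry + region / 2) // cell_size)
        if not (0 <= col < cells and 0 <= row < cells):
            continue  # sample falls outside the descriptor window
        b = int((angle % 360.0) // bin_width) % bins
        desc[(row * cells + col) * bins + b] += magnitude
    return desc


samples = [(-8.0, -8.0, 1.0, 0.0), (7.5, 7.5, 2.0, 350.0)]
desc = build_descriptor(samples)
print(len(desc))
```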

Keypoint Descriptor

By summarizing the gradient vectors of the neighboring pixels into 8 orientation bins per cell, the descriptor tolerates small shifts in gradient position and direction, which makes it resilient to moderate changes in 3D perspective.

Keypoint Descriptor

In order to be resilient to differences in illumination, we normalize the feature vector to unit length. This makes the descriptor invariant to changes in contrast or brightness: a contrast change multiplies every gradient by the same constant, which normalization cancels, and a brightness change adds a constant to every pixel, which does not affect the gradients.

In order to be resilient to non-linear changes in illumination, such as camera saturation, we reduce the effect of large gradient vectors by thresholding the normalized feature vector so that no value is larger than 0.2. We then re-normalize.
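The normalize, threshold, re-normalize sequence can be sketched in a few lines. The example assumes the descriptor is a plain list of non-negative floats; the name `normalize_descriptor` and the made-up input vector are for illustration.

```python
import math


def normalize_descriptor(desc, clamp=0.2):
    """Normalize to unit length, cap entries at `clamp`, then re-normalize."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return list(v) if norm == 0 else [x / norm for x in v]

    capped = [min(x, clamp) for x in unit(desc)]
    return unit(capped)


# One dominant entry (made-up values) is capped before the final normalization.
raw = [10.0] + [1.0] * 127
final = normalize_descriptor(raw)
print(round(final[0], 4))
```

Scaling the input by any positive constant leaves the result unchanged, which is the contrast invariance described above.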

Rotation Invariance

Scale Invariance

3D Perspective Resilience

Occlusion – with outliers

Occlusion

Tracking

Tracking

Tracking

Tracking

Tracking