34
Computer Vision Colorado School of Mines Colorado School of Mines Computer Vision Professor William Hoff Dept of Electrical Engineering &Computer Science http://inside.mines.edu/~whoff/ 1

Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Colorado School of Mines

Computer Vision

Professor William HoffDept of Electrical Engineering &Computer Science

http://inside.mines.edu/~whoff/ 1

Page 2: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Tracking

2

Page 3: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Detection vs Tracking

• Assume we want to detect an object in a sequence of images and compute its pose

• Two possible approaches:1. Detect the object independently in each image2. Track the object, using knowledge of its position (and 

motion) in the previous image

• Approach #2 may be faster and more robust• Approach #1 is necessary when starting out, or to re‐acquire the object if tracking is lost 

3

Page 4: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Example Detection Method

• Assume that we have a “reference image” of a planar object, and we want to find the object in each new image

• Approach:– Find feature points (e.g., SIFT or SURF) and their descriptors in the reference 

image– Calculate the 3D locations of the feature points– Then, for each new image:

• Find feature points and their descriptors in the new image• Match the descriptors to get a set of tentative matched points• Fit a homography (projective transform) to the point locations in the new 

image and reference image, using RANSAC• Using the inlier points, find the pose of the object

4

Page 5: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Example Tracking Method

• Assume that we have detected the planar object in the initial image

• Approach:– Find simple feature points (e.g., template image patches) in the initial 

image– Calculate the 3D locations of the feature points– Then, for each new image:

• Track each template to its location in the new image• Fit a homography (projective transform) to the point locations in the new image and previous image, using RANSAC

• Using the inlier points, find the pose of the object• Update templates with their appearance in the new image

5

Page 6: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Tracking Patch Features

• We want to select good features to track– Features should be “distinctive” in some sense (such as a 

corner)– This is often called “corner detection”, although  the features 

don’t have to be corners!

• Once we have found some patch features, we need to track them to the new image– We can always use normalized cross correlation to search for 

the best matching location (see lecture on template matching)– However, if the movement is small in the image, we can use an 

iterative search instead (can be very fast)

6

Page 7: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Good Features to Track

7

Page 8: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Good Features to Track

• Say we want to track a patch from image I0 to image I1– We’ll assume that intensities don’t 

change, and only translational motion– So we expect that 

– In practice we will have noise, so they won’t be exactly equal

• But we can search for the displacement u=(x,y) that minimizes the sum of squared differences (SSD):

8

i

iii IIwE 201 )()()( xuxxu

)()( 01 ii II xux

Page 9: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Stability of SSD• We would like the SSD score to have a distinct minimum at the 

correct value of u– If not, there will be uncertainty in the value of u

• The image texture in the patch will determine the stability of the SSD score– Bland, featureless patches will not be able to be matched uniquely

• We want to choose patches to track that will be stable

• We can predict how stable a patch will be by comparing its SSD score with itself, at various displacements u

• If the surface E(u) has a distinct minimum, it is a good feature to track

9

yx

IIwEi

iii uxuxxu where,)()()( 200

w(x) is optional weighting function

Page 10: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines10

From Computer Vision:Algorithms and Applications, Richard Szeliski, 2010, Springer

Page 11: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Stability of SSD Score

• We can estimate the stability of the SSD score• First, take the Taylor series expansion of the image

– Recall Taylor series expansion of a function f(x) near x0

– For vectors x, u

• Then

11

)()()( 00

0

xxdxdfxfxf

x

uxxux )()()( 000 iii III

iii

iiiii

iiii

Iw

IIIw

IIwE

20

2000

200

)()(

)()()()(

)()()()(

uxx

xuxxx

xuxxu

where• xi = (xi,yi)T is some image point 

• u = (x,y)T is the displacement from that point

• I0(xi) is the gradient of the image at xi

Page 12: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Stability of SSD Score (continued)

• Expanding, we get

• where

• A can tell us how stable the SSD score is, at a given point

12

yyxy

xyxx

IIII

wA

Note – the “*” indicates correlation of w with each element of A

uAu

u

T

yyxy

xyxx

yx

IwIwIwIw

yxE )(2

2

, ,

yII

yI

xII

xII

yy

xyxx

• w is a small NxN averaging filter such as a box filter • it essentially averages the values over a small local 

neighborhood 

Page 13: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Example Patches

13

2

2

yxy

xyx

IIII

wA

-5

0

5

-5

0

50

50

100

150

200

250

300

-5

0

5

-5

0

50

100

200

300

400

-5

0

5

-5

0

50

500

1000

1500

2000

2500

21   ‐21‐21    21

12     11    12

0     00    22

22     00     0

-5

0

5

-5

0

50

50

100

150

200

250

300

Page 14: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Stability of SSD Score (continued)

• E=uTA u is a second order polynomial – Has the form E(x,y)=ax2+bxy+cy2

– The matrix A gives the curvature of the surface

14

-5 -4 -3 -2 -1 0 1 2 3 4 50

1

2

3

cbba

A

ax2

• A large value of ameans that E curves up sharply as xvaries– Similarly, large cmeans that E curves up sharply as y varies– Both a,c large mean that we have a good minimum in E

• Problem – if b is not zero, the surface may be flat along the diagonal

-5 -4 -3 -2 -1 0 1 2 3 4 50

1

2

3

4

5

cy2

-5

0

5

-5

0

50

500

1000

1500

2000

2500

Page 15: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Estimating stability (continued)

• Let R be the matrix containing the eigenvectors of A– We can rotate the coordinate system to u=R v– Then E = uTA u= (vTRT) A (R v ) = vT v– Which has the form E(v1,v2) = 1 v12 + 2 v22

• So if both eigenvalues are large, we have a good minimum (there is no diagonal component)

• If one eigenvalue is low, surface is flat along that direction

15

-5

0

5

-5

0

50

100

200

300

400

-5

0

5

-5

0

50

50

100

150

200

250

300

Page 16: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Possible interest point measures

• Look at the smallest eigenvalue – if it is sufficiently large, then we have a candidate point (Shi‐Tomasi)

• Alternative measures that don’t require square roots:

16

210102)(tr)det( AA

10

10

)(tr)det(

AA

From Computer Vision:Algorithms and Applications, Richard Szeliski, 2010, Springer

(Harris)

(Harmonic mean)

Page 17: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Matlab Corner Interest Operator

17

% Find corner pointsclear allclose all

I = double(imread('test000.jpg'));

% Apply Gaussian blursd = 1.0;I = imfilter(I, fspecial('gaussian', round(6*sd), sd));

imshow(I, []);

% Compute the gradient componentsGx = imfilter(I, [-1 1]);Gy = imfilter(I, [-1; 1]);

% Compute the outer product g'g at each pixelGxx = Gx .* Gx;Gxy = Gx .* Gy;Gyy = Gy .* Gy;

Page 18: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Matlab Corner Interest Operator (continued)

18

% Size of neighborhood over which to compute corner features.N = 13;

w = ones(N); % The neighborhood

% Sum the G's over the window size.A11 = imfilter(Gxx, w);A12 = imfilter(Gxy, w);A22 = imfilter(Gyy, w);

% At each pixel (x,y), we have the 2x2 matrix% [A11(x,y) A12(x,y);% A21(x,y) A22(x,y)]% Of course, A21 = A12.

% Compute the interest score: det(A)/trace(A)detA = A11.*A22 - A12.^2; % Computes det(A) at each pixeltraceA = A11 + A22; % Computer trace(A) at each pixels = detA./traceA; % The interest score

% If trace(A)=0 anywhere, we get "not a number".s(isnan(s)) = 0; % If any NaN's, just set them to zero

Page 19: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Finding Local Maxima

• Simply finding all points greater than a threshold will yield too many points

• We should find points that are– greater than the threshold– a local maximum

interest scores interest scores, thresholded

19

Page 20: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Finding Local Maxima

• If peaks are smooth, you can test each point to see if it is higher than its neighbors

• Matlab codeLmax = false(size(S));Lmax(S(2:end-1, 2:end-1) = ...

( S(2:end-1, 2:end-1) > S(3:end, 2:end-1) ... & S(2:end-1, 2:end-1) > S(1:end-2, 2:end-1) ... & S(2:end-1, 2:end-1) > S(2:end-1, 3:end) ... & S(2:end-1, 2:end-1) > S(2:end-1, 1:end-2) );

A 1D cross sectionscore

x

thresh

0

1

2

3

4

5

6

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

In 2D, check 4 nearest neighbors

20

Page 21: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Finding Local Maxima (continued)

• Problem – what if you don’t have smooth peaks?– Too many detected features

• Instead, find points that are local regionalmaxima – they are the largest points within a region (e.g., square of size W)

• Can do by calculating the largest value within distance N (this is a dilation)

• Then find those points that equal the local maxima

Smax = (S==imdilate(S, ones(N)));

0

1

2

3

4

5

6

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

score

x0

1

2

3

4

5

6

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Local max within ±2

21

Page 22: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Finding Local Maxima (continued)

Score array

S

Score array dilated with square of side 15

Sd

Points where S=Sd

Smax

Points where S=Sd and are > threshold

22

Page 23: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Matlab

23

% Choose the suppression radius, for non-maxima suppressionr = N;

% Find local maxima within each neighborhood of radius 2rLmax = (s==imdilate(s, strel('disk',2*r)));

% Note - we don't want to detect points too close to the border, so just % zero out everything near the border.Lmax(1:N,:) = false;Lmax(:,1:N) = false;Lmax(end-N:end,:) = false;Lmax(:,end-N:end) = false;

% Get a list of the indices of all the potential interest points[rows cols] = find(Lmax);

% Get the values of those interest pointsvals = s(Lmax);

Look at range of scores

Page 24: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Identifying interest points

• Once you have the scores, you can decide which points to identify as interest points – Could choose the points with the top M scores– Or, choose all points with scores greater than a threshold

24

% Identify interest points above a threshold, and draw% a box around them, of size NxNfor i=1:length(vals)

if vals(i) > thresholdx = cols(i);y = rows(i);

rectangle('Position', [x-N/2 y-N/2 N N], ...'EdgeColor', 'r', ...'Linewidth', 2.0); % default is 0.5

text(x,y, sprintf('%d', i), ...'Color', 'r', ... % label with id number'FontSize', 16); % default is 10

endend

Page 25: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Corner interest points:

• Window size = 15

• Threshold = 100

• Local maxima within 15 pixels25

Page 26: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Tracking Features

26

Page 27: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Tracking Features

• Once we have found a good feature to track, how do we find it in the next image?– Assume the feature is represented by a small patch

• We can always use normalized cross correlation to search for the best matching location (see lecture on template matching)

• However, if the movement is small in the image, we can use an iterative search instead– This can be very fast, as long as you have a good initial guess

27

,1/2

22

, ,

[ ( , ) ][ ( , ) ]( , )

[ ( , ) ] [ ( , ) ]

s t

s t s t

w s t w f x s y t fc x y

w s t w f x s y t f

Page 28: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Lucas‐Kanade Incremental Refinement

• Does gradient descent on the sum‐of‐squared differences energy function

28

Method• We have the current guess 

of the displacement u• Compute the intensity 

difference between the translated image patch and the current image at that point

• Compute the local image gradient

• Update the displacement From: Richard Szeliski, “Computer VisionAlgorithms and Applications”

Page 29: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Taylor Series Expansion

• In 2D

29

Page 30: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Solve for the Displacement Correction

• Solve for u in the normal equations

• A, b can be computed using

30

Page 31: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Hierarchical Search

• If the search range is large, a hierarchical search strategy can be used

• An image pyramid is used, where matches are first found in lower‐resolution images and then used to guide search in higher resolution images

31

https://en.wikipedia.org/wiki/Pyramid_(image_processing)

Page 32: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Problems Tracking over Long Sequences

• If features are tracked over a long sequence, their appearance can undergo changes. 

• If you continue to match against the originally detected patch (feature), tracking may eventually fail. 

• Or, you can re‐sample the template at subsequent frame at the matching location. – However, the feature may drift from its original location to some other location in the image

32

Page 33: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

Affine Tracker

• Using an affine motion model is more robust when tracking over long sequences

• where

– d is the translation, A is the deformation matrix

• This can account for rotation and scale changes• Also viewpoint changes, if planar patch is small compared 

to distance from camera

2 1 I Ax d I (x)

yyyx

xyxx

aaaa

A

33

Page 34: Computer Visioncs-courses.mines.edu/csci507/schedule/35/Tracking.pdf · Colorado School of Mines Computer Vision Tracking Patch Features •We want to select good features to track

Computer VisionColorado School of Mines

KLT Tracker

• Find interest points in the first frame

• Track (assuming translation only) points from each frame to subsequent frame

• Test consistency with original frame by computing affine transformation

from:  Shi and Tomasi, “Good Features to Track”, IEEE CVPR 1994

34