Motion ECE 847: Digital Image Processing Stan Birchfield Clemson University


What if you only had one eye?

Depth perception is possible by moving eye

Parallax

Motion parallax – the apparent displacement of an object viewed along two different lines of sight

http://www.infovis.net/imagenes/T1_N144_A6_DifMotion.gif

Head movements for depth perception

Some animals move their heads to generate parallax

http://www.animalwebguide.com/Praying-Mantis.htm
http://pinknpurplelizard.files.wordpress.com/2008/06/pigeon1.jpg

pigeon praying mantis

Motion field and optical flow

• Motion field – The actual 3D motion projected onto image plane

• Optical flow – The “apparent” motion of the brightness pattern in an image (sometimes called optic flow)

When are the two different?

• Barber pole illusion (optical flow differs from motion field)
• Rotating ping-pong ball (motion field, but no optical flow)
• Television / movies (optical flow, but no motion field)

http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT12/node4.html

* From Marc Pollefeys COMP 256 2003

Optical flow breakdown

Perhaps an aperture problem, discussed later.

Another illusion

Motion field

point in world: P = (X, Y, Z)^T
projection onto image: p = f P / Z

motion of point: V = dP/dt = -T - ω × P

where T is the translational velocity and ω is the angular velocity of the camera.

Motion field (cont.)

velocity of projection: v = dp/dt = f (Z V - V_Z P) / Z²

Expanding yields a translational component (which depends on the depth Z) and a rotational component. Note that rotation gives no depth information (the rotational component is independent of Z).

Motion field (cont.)

Two special cases of pure translation:
• T_Z ≠ 0: a radial field emanating from the focus of expansion p0 = f T / T_Z
• T_Z = 0: a parallel field

Optical Flow Assumptions:Brightness Constancy

* Slide from Michael Black, CS143 2003


Optical flow

Image at time t          Image at time t + Δt

Brightness constancy assumption:

I(x + Δx, y + Δy, t + Δt) = I(x, y, t)

Taylor series expansion:

I(x + Δx, y + Δy, t + Δt) ≈ I(x, y, t) + I_x Δx + I_y Δy + I_t Δt

Putting together yields:

I_x Δx + I_y Δy + I_t Δt = 0

Optical flow (cont.)

From previous slide: I_x Δx + I_y Δy + I_t Δt = 0

Divide both sides by Δt and take the limit:

I_x u + I_y v + I_t = 0      (standard optical flow equation)

where u = dx/dt and v = dy/dt. More compactly,

∇I · v + I_t = 0
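A quick numerical check of the optical flow constraint, as a sketch: build a smooth synthetic pattern, translate it by a known sub-pixel (u, v), and verify that I_x u + I_y v + I_t is close to zero. The pattern and the `gradients` helper are illustrative choices, not from the lecture.

```python
import numpy as np

def gradients(I0, I1):
    """Central spatial differences on I0 and a forward temporal difference."""
    Ix = np.gradient(I0, axis=1)   # d/dx (columns)
    Iy = np.gradient(I0, axis=0)   # d/dy (rows)
    It = I1 - I0                   # d/dt with dt = 1
    return Ix, Iy, It

# Smooth pattern so the first-order Taylor approximation is accurate
y, x = np.mgrid[0:64, 0:64].astype(float)
I0 = np.sin(0.2 * x) + np.cos(0.15 * y)
u, v = 0.3, -0.2                   # known sub-pixel translation
I1 = np.sin(0.2 * (x - u)) + np.cos(0.15 * (y - v))

Ix, Iy, It = gradients(I0, I1)
residual = Ix * u + Iy * v + It    # should be near zero everywhere
print(np.abs(residual[8:-8, 8:-8]).max())
```

The residual is not exactly zero because of the second-order Taylor terms and the finite-difference approximation of the derivatives.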

Aperture problem

From previous slide: I_x u + I_y v + I_t = 0

This is one equation, two unknowns! (Underconstrained problem)

We can only compute the component of motion in the direction of the gradient:

v_n = -I_t / |∇I|

(Figure: isophotes I(x,y,t) and I(x,y,t+Δt); the true motion, another possible answer, and the gradient are all consistent with the single constraint.)

Key idea: Any function looks linear through a small ‘aperture’

Aperture Problem Exposed

Motion along just an edge is ambiguous

from G. Bradski, CS223B

Two approaches to overcome aperture problem

• Horn-Schunck (1981)
  – Assume neighboring pixels are similar
  – Add regularization term to enforce smoothness
  – Compute (u,v) for every pixel in image → Dense optical flow

• Lucas-Kanade (1981)
  – Assume neighboring pixels are the same
  – Use additional equations to solve for motion of pixel
  – Compute (u,v) for a small number of pixels (features); each feature treated independently of other features → Sparse optical flow

Lucas-Kanade

Recall scalar equation with two unknowns:

I_x u + I_y v + I_t = 0

Assume neighboring pixels have the same motion:

A v = b,   A = [ I_x(p1)  I_y(p1)
                 ...
                 I_x(pN)  I_y(pN) ],   b = - [ I_t(p1)
                                               ...
                                               I_t(pN) ]

where N is the number of pixels in the window.

Can solve this directly using least squares, or …

Lucas-Kanade (cont.)

Multiply by A^T:

A^T A v = A^T b

or

[ Σ I_x²     Σ I_x I_y ] [ u ]     [ Σ I_x I_t ]
[ Σ I_x I_y  Σ I_y²    ] [ v ] = - [ Σ I_y I_t ]

2x2 matrix, 2x1 vector

Note: A^T A = Σ ∇I ∇I^T
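A minimal sketch of the least-squares step above for a single window, assuming the gradients have already been computed; the function name and toy data are illustrative.

```python
import numpy as np

def lucas_kanade_window(Ix, Iy, It):
    """Ix, Iy, It: flattened gradient values over one window (length N).
    Stacks A = [Ix Iy], b = -It, and solves (A^T A) v = A^T b."""
    A = np.stack([Ix, Iy], axis=1)        # N x 2
    b = -It
    AtA = A.T @ A                         # 2x2 matrix
    Atb = A.T @ b                         # 2x1 vector
    return np.linalg.solve(AtA, Atb)      # (u, v)

# Toy data consistent with motion (u, v) = (1.0, -0.5):
rng = np.random.default_rng(0)
Ix = rng.standard_normal(49)              # e.g. a 7x7 window
Iy = rng.standard_normal(49)
It = -(Ix * 1.0 + Iy * (-0.5))            # Ix*u + Iy*v + It = 0 exactly
u, v = lucas_kanade_window(Ix, Iy, It)
print(round(u, 3), round(v, 3))           # recovers 1.0, -0.5
```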

RGB version

• For color images, we can get more equations by using all color channels
• E.g., for a 7x7 window we have 49*3 = 147 equations!

• Can solve using Newton’s method
  – Also known as Newton-Raphson method
  – Lucas-Kanade method does one iteration of Newton’s method
• Better results are obtained via more iterations

Improving accuracy

• Recall our small motion assumption
• This is not exact
  – To do better, we need to add higher order terms back in
• This is a polynomial root finding problem


* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Iterative Lucas-Kanade Algorithm

1. Solve 2x2 equation to get motion for each pixel (solving the 2x2 equation is easy)
2. Shift second image using estimated motion (use interpolation to improve accuracy)
3. Repeat until convergence

Linear interpolation (1D function)

Samples at x0-1, x0, x0+1, x0+2 with values f(x0-1), f(x0), f(x0+1), f(x0+2); x lies between x0 and x0+1. Let α = x - x0, with 0 ≤ α < 1. Then

f(x) ≈ f(x0) + (x - x0) ( f(x0+1) – f(x0) )
     = f(x0) + α ( f(x0+1) – f(x0) )
     = (1 – α) f(x0) + α f(x0+1)

Bilinear interpolation (2D function)

The sample point (x, y) lies in the cell with corners f00 = f(x0,y0), f10 = f(x0+1,y0), f01 = f(x0,y0+1), f11 = f(x0+1,y0+1). Let α = x - x0 and β = y - y0.

Interpolate vertically first:

f(x0, y)   ≈ (1-β) f00 + β f01
f(x0+1, y) ≈ (1-β) f10 + β f11

then horizontally:

f(x,y) ≈ (1-α)[(1-β) f00 + β f01] + α[(1-β) f10 + β f11]
       = (1-α)(1-β) f00 + (1-α)β f01 + α(1-β) f10 + αβ f11

Bilinear interpolation (2D function, cont.)

Interpolating horizontally first:

f(x, y0)   ≈ (1-α) f00 + α f10
f(x, y0+1) ≈ (1-α) f01 + α f11

f(x,y) ≈ (1-β)[(1-α) f00 + α f10] + β[(1-α) f01 + α f11]
       = (1-α)(1-β) f00 + α(1-β) f10 + (1-α)β f01 + αβ f11

(same result)

Bilinear interpolation

• Simply compute a double weighted average of the four nearest pixels
• Be careful to handle boundaries correctly
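The two bullets above can be sketched in a few lines; the function name is illustrative, and boundaries are handled here by clamping indices (one of several reasonable choices).

```python
import numpy as np

def bilinear(img, x, y):
    """Double weighted average of the four nearest pixels."""
    h, w = img.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    a, b = x - x0, y - y0                     # fractional parts (alpha, beta)
    # Clamp indices so samples at the boundary stay inside the image
    x0, x1 = np.clip([x0, x0 + 1], 0, w - 1)
    y0, y1 = np.clip([y0, y0 + 1], 0, h - 1)
    f00, f10 = img[y0, x0], img[y0, x1]
    f01, f11 = img[y1, x0], img[y1, x1]
    return ((1 - a) * (1 - b) * f00 + a * (1 - b) * f10 +
            (1 - a) * b * f01 + a * b * f11)

img = np.array([[0.0, 1.0],
                [2.0, 3.0]])
print(bilinear(img, 0.5, 0.5))   # center of the 2x2 patch -> 1.5
```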

How does Lucas-Kanade work?

• Recall Newton’s method (or Newton-Raphson)
• To find the root of a function, use a first-order approximation
• Start with an initial guess
• Iterate:
  – Use derivative to find root, under assumption that function is linear
  – Result becomes new estimate for next iteration

http://en.wikipedia.org/wiki/Newton-Raphson_method
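The iteration above, as a minimal sketch (function names and the example root are mine): each step linearizes f at the current guess and jumps to the root of that line.

```python
def newton(f, df, x0, iters=10):
    """Newton-Raphson: repeatedly solve the local linear approximation."""
    x = x0
    for _ in range(iters):
        x = x - f(x) / df(x)   # root of the tangent line at x
    return x

# Example: root of f(x) = x^2 - 2 (i.e. sqrt(2)), starting from x0 = 1
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
print(root)   # ~1.41421356...
```

Lucas-Kanade performs exactly one such linearize-and-solve step; iterating (as on the previous slides) repeats it.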

Optical Flow: 1D Case

Brightness Constancy Assumption:

f(t) ≡ I(x(t), t) = I(x(t+dt), t+dt)

Because there is no change in brightness with time,

df/dt = (∂I/∂x)(dx/dt) + ∂I/∂t = 0

⇒  I_x v + I_t = 0,   i.e.,  v = -I_t / I_x

from G. Bradski, CS223B

Tracking in the 1D case:

Given I(x, t) and I(x, t+1), estimate the displacement v of the point p.

from G. Bradski, CS223B

Tracking in the 1D case:

Spatial derivative:  I_x = ∂I/∂x evaluated at p at time t
Temporal derivative: I_t = ∂I/∂t evaluated at p

v ≈ -I_t / I_x

Assumptions:
• Brightness constancy
• Small motion

from G. Bradski, CS223B

Tracking in the 1D case:

Temporal derivative at 2nd iteration

• Iterating helps refine the velocity vector
• Can keep the same estimate for the spatial derivative

v ← v_previous - I_t / I_x

Converges in about 5 iterations

from G. Bradski, CS223B
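The 1D iteration described above can be sketched as follows, assuming a smooth signal; the function name, the Gaussian test bump, and the choice of `np.interp` for sub-pixel shifting are all mine.

```python
import numpy as np

def track_1d(I0, I1, x0, iters=5):
    """Iterative 1D tracking: keep I_x fixed, re-measure I_t after each shift."""
    xs = np.arange(len(I0), dtype=float)
    v = 0.0
    Ix = np.gradient(I0)[x0]                  # spatial derivative (kept fixed)
    for _ in range(iters):
        shifted = np.interp(xs + v, xs, I1)   # shift I1 back by current v
        It = shifted[x0] - I0[x0]             # temporal derivative at this iterate
        v = v - It / Ix                       # v <- v_previous - It/Ix
    return v

xs = np.arange(200, dtype=float)
I0 = np.exp(-((xs - 100) / 10.0) ** 2)        # smooth bump centered at 100
I1 = np.exp(-((xs - 102.5) / 10.0) ** 2)      # same bump moved by +2.5
print(track_1d(I0, I1, 95))                   # estimate near 2.5
```

Consistent with the slide, the estimate settles to roughly the true displacement within about 5 iterations.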

From 1D to 2D tracking

1D:  I_x (dx/dt) + I_t = 0

2D:  I_x (dx/dt) + I_y (dy/dt) + I_t = 0,   i.e.,  I_x u + I_y v + I_t = 0

One equation, two velocity (u,v) unknowns

from G. Bradski, CS223B

From 1D to 2D tracking

* Slide from Michael Black, CS143 2003

We get at most “normal flow” – from one point we can only detect the component of motion along the brightness gradient (perpendicular to the edge). The solution is to take a patch of pixels around the pixel of interest.

From 1D to 2D tracking

The math is very similar:

v = G^(-1) b

where

G = Σ_(window around p) [ I_x²     I_x I_y ]
                        [ I_x I_y  I_y²    ]

b = - Σ_(window around p) [ I_x I_t ]
                          [ I_y I_t ]

Summing over a window avoids the aperture problem. Window size here ~ 11x11.

from G. Bradski, CS223B

When does Lucas-Kanade work?

• A^T A must be invertible
• In the real world, all matrices are invertible
• Instead, we must ensure that A^T A is well-conditioned (not close to singular):
  – Both eigenvalues are large
  – Ratio of eigenvalues λmax / λmin is not too large
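A sketch of this conditioning test on one window's gradients; the function name and the thresholds `min_eig` and `max_ratio` are illustrative placeholders, not values from the lecture.

```python
import numpy as np

def trackable(Ix, Iy, min_eig=1.0, max_ratio=10.0):
    """Check that Z = A^T A has two large eigenvalues with a modest ratio."""
    Z = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    lo, hi = np.linalg.eigvalsh(Z)        # ascending order for symmetric Z
    return lo > min_eig and hi / lo < max_ratio

rng = np.random.default_rng(1)
corner = rng.standard_normal((2, 49))     # strong gradients in all directions
edge_Ix = rng.standard_normal(49)
print(trackable(corner[0], corner[1]))    # True: both eigenvalues large
print(trackable(edge_Ix, 0.0 * edge_Ix))  # False: aperture problem (edge)
```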

Eigenvalues of Hessian

• Z is the gradient covariance matrix
  – Related to autocorrelation of I (Moravec interest operator)
  – Sometimes called the Hessian
• Recall from PCA:
  – (Square roots of) eigenvalues give lengths of the best fitting ellipse
  – Large eigenvalues mean large gradient vectors
  – Small ratio means information in all directions

Finding good features

Three cases:

1. λ1 and λ2 small: Not enough texture for tracking
2. λ1 large, λ2 small: On an intensity edge. Aperture problem: Only motion perpendicular to the edge can be found
3. λ1 and λ2 large: Good feature to track

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Note: Even though tracking is a two-frame problem, we can determine good features from only one frame

Finding good features

• In practice, eigenvalues cannot be too large, because image values range over [0,255]
• Solution: Threshold the minimum eigenvalue (Shi and Tomasi 1994)
• An alternative approach (Harris and Stephens 1988):
  – Note that det(Z) = λ1 λ2 and trace(Z) = λ1 + λ2
  – Use det(Z) – k trace(Z)², where k = 0.04. The second term reduces the effect of having a small eigenvalue alongside a large dominant eigenvalue (known as the Harris corner detector or Plessey operator)
• Another alternative: det(Z) / trace(Z) = 1 / (1/λ1 + 1/λ2)

Good features code

% Harris corner detector - by Kashif Shahzad
% bw: input grayscale image
sigma = 2; thresh = 0.1; sze = 11; disp = 0;

% Derivative masks
dy = [-1 0 1; -1 0 1; -1 0 1];
dx = dy';                        % dx is the transpose of dy

% Ix and Iy are the horizontal and vertical derivatives of the image
Ix = conv2(bw, dx, 'same');
Iy = conv2(bw, dy, 'same');

% Smooth the squared image derivatives with a Gaussian
g = fspecial('gaussian', max(1, fix(6*sigma)), sigma);
Ix2 = conv2(Ix.^2, g, 'same');
Iy2 = conv2(Iy.^2, g, 'same');
Ixy = conv2(Ix.*Iy, g, 'same');

% Cornerness measure: det(Z) / trace(Z)
cornerness = (Ix2.*Iy2 - Ixy.^2) ./ (Ix2 + Iy2 + eps);

% Nonmaximal suppression and thresholding
mx = ordfilt2(cornerness, sze^2, ones(sze));              % grey-scale dilate
cornerness = (cornerness == mx) & (cornerness > thresh);  % find maxima
[rws, cols] = find(cornerness);                           % row, col coords

clf; imshow(bw);
hold on;
p = [cols rws];
plot(p(:,1), p(:,2), 'or');
title('\bf Harris Corners')

from Sebastian Thrun, CS223B Computer Vision, Winter 2005

Example (threshold = 0.1)

from Sebastian Thrun, CS223B Computer Vision, Winter 2005

Example (threshold = 0.01)

from Sebastian Thrun, CS223B Computer Vision, Winter 2005

Example (threshold = 0.001)

from Sebastian Thrun, CS223B Computer Vision, Winter 2005

Feature tracking

• Identify features and track them over video
  – Usually use a few hundred features
  – When many features have been lost, renew feature detection
  – Assume small difference between frames
  – Potentially large difference overall
• Two problems:
  – Motion between frames may be large
  – Translation assumption is fine between consecutive frames, but long periods of time introduce deformations

Revisiting the small motion assumption

• Is this motion small enough?
  – Probably not; it's much larger than one pixel (2nd order terms dominate)
  – How might we solve this problem?

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Reduce the resolution!

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Coarse-to-fine optical flow estimation

Gaussian pyramids of image It-1 and image I: a displacement of u = 10 pixels at the finest level becomes u = 5 pixels, u = 2.5 pixels, and u = 1.25 pixels at successively coarser levels.

slides from Bradsky and Thrun

Coarse-to-fine optical flow estimation

Gaussian pyramids of image It-1 and image I:

1. Run iterative L-K at the coarsest level
2. Warp & upsample
3. Run iterative L-K at the next finer level
4. Repeat down to the finest level

slides from Bradsky and Thrun
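The coarse-to-fine scheme can be sketched as below. This is only the control flow: `estimate_flow` is a hypothetical placeholder standing in for one run of iterative L-K at a level, and the block-average `downsample` omits the Gaussian blur a real pyramid would use.

```python
import numpy as np

def downsample(img):
    """Crude 2x downsample by averaging 2x2 blocks (Gaussian blur omitted)."""
    h, w = img.shape
    return img[:h//2*2, :w//2*2].reshape(h//2, 2, w//2, 2).mean(axis=(1, 3))

def coarse_to_fine(I0, I1, estimate_flow, levels=3):
    """Estimate at the coarsest level, then double and refine back up."""
    pyr0, pyr1 = [I0], [I1]
    for _ in range(levels - 1):
        pyr0.append(downsample(pyr0[-1]))
        pyr1.append(downsample(pyr1[-1]))
    u = v = 0.0
    for J0, J1 in zip(reversed(pyr0), reversed(pyr1)):
        u, v = 2 * u, 2 * v                   # upsample the current estimate
        du, dv = estimate_flow(J0, J1, u, v)  # refine at this level
        u, v = u + du, v + dv
    return u, v

# Toy check of the accumulation arithmetic with a stub estimator:
stub = lambda J0, J1, u, v: (1.0, 0.0)
img = np.zeros((16, 16))
print(coarse_to_fine(img, img, stub))  # (7.0, 0.0) over 3 levels
```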

Alternative derivation

• Compute translation assuming it is small; differentiate
• Affine motion is also possible, but a bit harder (a 6x6 system instead of 2x2)

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Example

Simple displacement is sufficient between consecutive frames, but not to compare to a reference template

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Example

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Synthetic example

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Good features to keep tracking

Perform affine alignment between the first and last frame. Stop tracking features with too large errors.

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Errors in Lucas-Kanade

What are the potential causes of errors in this procedure?
– Suppose A^T A is easily invertible
– Suppose there is not much noise in the image

• Even then, errors arise when our assumptions are violated:
  – Brightness constancy is not satisfied
  – The motion is not small
  – A point does not move like its neighbors
    • window size is too large
    • what is the ideal window size?

* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003

Feature matching vs. tracking

• Matching: Extract features independently and then match by comparing descriptors
• Tracking: Extract features in the first image and then try to find the same features in the next views

What is a good feature?

Image-to-image correspondences are key to passive triangulation-based 3D reconstruction

M. Pollefeys, http://www.cs.unc.edu/Research/vision/comp256fall03/

Horn-Schunck

• Dense optical flow
• Minimize

E(u,v) = ∫∫ [ (I_x u + I_y v + I_t)² + λ (|∇u|² + |∇v|²) ] dx dy

where (I_x u + I_y v + I_t)² is the data term and |∇u|² + |∇v|² is the smoothness term.

Note: We want to find u and v to minimize the integral, but u and v are themselves functions of x and y!

Calculus of variations

• A function maps values to real numbers
  – To find the value that minimizes a function, use calculus
• A functional maps functions to real numbers
  – To find the function that minimizes a functional, use calculus of variations

Calculus of variations

The integral

I = ∫ F(x, y(x), y'(x)) dx

(x: independent variable, y: dependent variable, F: explicit function) is stationary where the Euler-Lagrange equation holds:

∂F/∂y - d/dx (∂F/∂y') = 0

C.o.v. applied to o.f.

Independent variables: x, y. Dependent variables: u, v. Explicit function:

F = (I_x u + I_y v + I_t)² + λ (u_x² + u_y² + v_x² + v_y²)

Euler-Lagrange equations:

∂F/∂u - d/dx (∂F/∂u_x) - d/dy (∂F/∂u_y) = 0
∂F/∂v - d/dx (∂F/∂v_x) - d/dy (∂F/∂v_y) = 0

C.o.v. applied to o.f. (cont.)

Take derivatives, expand, and plug back in:

I_x (I_x u + I_y v + I_t) = λ ∇²u
I_y (I_x u + I_y v + I_t) = λ ∇²v

where ∇² = ∂²/∂x² + ∂²/∂y² is the Laplacian.

Approximate the Laplacian by the difference between the local mean and the central value (a difference of Gaussians):

∇²u ≈ κ (ū - u),  where ū is the local mean and κ is a constant.

Rearranging gives, for each pixel,

(λκ + I_x²) u + I_x I_y v = λκ ū - I_x I_t
I_x I_y u + (λκ + I_y²) v = λκ v̄ - I_y I_t

What if λ = 0? Then the smoothness term disappears, and the equations reduce to the (underconstrained) optical flow constraint at each pixel.

C.o.v. applied to o.f. (cont.)

Solve the equations: this is a pair of equations for each pixel. The combined system of equations is 2N x 2N, where N is the number of pixels in the image.

Because it is a sparse matrix, iterative methods (Jacobi iterations, Gauss-Seidel, successive over-relaxation) are more appropriate than direct solvers.

Iterate:

u^(k+1) = ū^(k) - I_x (I_x ū^(k) + I_y v̄^(k) + I_t) / (λκ + I_x² + I_y²)
v^(k+1) = v̄^(k) - I_y (I_x ū^(k) + I_y v̄^(k) + I_t) / (λκ + I_x² + I_y²)

where ū^(k), v̄^(k) are the local means of the current estimates.

Stationary iterative methods

Problem is to solve a sparse linear system A x = b. Split the matrix as

A = D - L - U    (diagonal) - (strictly lower) - (strictly upper)

Jacobi method (solve for each element using existing estimates):

x_i^(k+1) = ( b_i - Σ_{j≠i} a_ij x_j^(k) ) / a_ii

Gauss-Seidel is “sloppy Jacobi” (use updates as soon as they are available):

x_i^(k+1) = ( b_i - Σ_{j<i} a_ij x_j^(k+1) - Σ_{j>i} a_ij x_j^(k) ) / a_ii

Successive over-relaxation (SOR) speeds convergence:

x_i^(k+1) = (1 - ω) x_i^(k) + ω x_i^(GS),   where x_i^(GS) is the Gauss-Seidel iterate; ω = 1 gives Gauss-Seidel
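A compact sketch of these iterations on a small diagonally dominant system (so all of them converge); the function names and the toy 3x3 system are illustrative.

```python
import numpy as np

def jacobi(A, b, iters=100):
    """Jacobi: every update uses only the previous sweep's estimates."""
    x = np.zeros(len(b))
    for _ in range(iters):
        x = (b - (A @ x - np.diag(A) * x)) / np.diag(A)
    return x

def sor(A, b, omega=1.0, iters=100):
    """SOR; omega = 1 reduces to Gauss-Seidel, omega > 1 over-relaxes."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            # Gauss-Seidel step: use the freshest values of x as we sweep
            s = A[i] @ x - A[i, i] * x[i]
            x_gs = (b[i] - s) / A[i, i]
            x[i] = (1 - omega) * x[i] + omega * x_gs
    return x

A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(np.allclose(jacobi(A, b), np.linalg.solve(A, b)))          # True
print(np.allclose(sor(A, b, omega=1.2), np.linalg.solve(A, b)))  # True
```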

Horn-Schunck Algorithm


www.cs.unc.edu/~sud/courses/256/optical_flow.ppt

note that λ is defined differently there

Time-to-Collision

Beyond two-frame tracking

Background subtraction and frame differencing
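As a minimal sketch of the two ideas named above (all names and thresholds are illustrative): frame differencing thresholds the change between consecutive frames, and background subtraction thresholds the difference against a running-average background model.

```python
import numpy as np

def frame_difference(prev, curr, tau=0.1):
    """Pixels whose brightness changed more than tau between frames."""
    return np.abs(curr - prev) > tau

def update_background(bg, frame, alpha=0.05):
    """Running-average background model; alpha controls adaptation speed."""
    return (1 - alpha) * bg + alpha * frame

bg = np.zeros((8, 8))
frame = np.zeros((8, 8))
frame[3:5, 3:5] = 1.0                      # a small bright moving object
mask = np.abs(frame - bg) > 0.5            # foreground mask
print(int(mask.sum()))                     # 4 foreground pixels
bg = update_background(bg, frame)          # slowly absorb the new frame
```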

Spatiotemporal volumes

http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/COLLINS/TobyCollins.pdf

Layers

• Wang and Adelson

Recommended