Feature Reconstruction Using Lucas-Kanade Feature Tracking and Tomasi-Kanade Factorization
EE7740 Project I
Dr. Gunturk
ABSTRACT
• Recovering 3-D structure from motion in noisy 2-D images is a problem addressed by many vision system researchers. Feature points of interest are tracked consistently across multiple images using the method first described by Lucas and Kanade, and the 3-D shape of the scene is then reconstructed from those feature points using the factorization method developed by Tomasi and Kanade.
The image flow, or velocity field, in the image plane due to object/camera motion can be computed using feature matching.
Velocity Flow
[Figure: a feature at position x in image I appears at position x + d in image J, where d is the displacement.]
Total error E is the weighted sum-squared difference
$$E = \sum_{x \in W} w(x)\,[I(x - d) - J(x)]^2$$
Approximate I(x-d) using the Taylor series expansion
• A good match occurs when E is small, so we need to find a displacement d that minimizes E.
• This can be achieved by differentiating E with respect to d, setting it equal to zero, and solving for d. We can approximate the value of I(x-d) using the Taylor series expansion:
$$I(x - d_x,\ y - d_y) = I(x, y) - d_x \frac{\partial I}{\partial x}(x, y) - d_y \frac{\partial I}{\partial y}(x, y) + \dots$$
Approximate (cont.)
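• The first-order terms of the expansion are sufficient for the calculations. The gradient of the intensity I is
$$g(x) = \left[\frac{\partial I}{\partial x}(x, y),\ \frac{\partial I}{\partial y}(x, y)\right]^T$$
• so we can represent the shifted intensity as
$$I(x - d) \approx I(x) - g(x)^T d$$
• and the sum-squared difference can now be represented as
$$E = \sum_{x \in W} w(x)\,[I(x) - J(x) - g(x)^T d]^2$$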
Approximate (cont.)
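Taking the partial derivatives of E with respect to $d_x$ and $d_y$, where $I_x$ and $I_y$ are the x and y components of the gradient $g$:
$$\frac{\partial E}{\partial d_x} = -2 \sum_{x \in W} w(x)\,[I(x) - J(x) - g(x)^T d]\, I_x(x, y)$$
$$\frac{\partial E}{\partial d_y} = -2 \sum_{x \in W} w(x)\,[I(x) - J(x) - g(x)^T d]\, I_y(x, y)$$
equivalently
$$\frac{\partial E}{\partial d} = -2 \sum_{x \in W} w(x)\,[I(x) - J(x) - g(x)^T d]\, g(x)$$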
Approximate (cont.)
• Setting the derivative to 0 gives
$$\sum_{x \in W} w(x)\,[I(x) - J(x)]\, g(x) = \sum_{x \in W} w(x)\, g(x)\, g(x)^T d = \Big(\sum_{x \in W} w(x)\, g(x)\, g(x)^T\Big) d$$
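• This can be represented in matrix form as Zd = e, where
$$Z = \begin{bmatrix} \sum_{x \in W} w(x)\, I_x^2 & \sum_{x \in W} w(x)\, I_x I_y \\ \sum_{x \in W} w(x)\, I_x I_y & \sum_{x \in W} w(x)\, I_y^2 \end{bmatrix}, \qquad e = \begin{bmatrix} \sum_{x \in W} w(x)\,(I - J)\, I_x \\ \sum_{x \in W} w(x)\,(I - J)\, I_y \end{bmatrix}$$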
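As an illustration of the system Zd = e, here is a minimal NumPy sketch of a single Lucas-Kanade step over one window. The function name `lk_step`, the use of `np.gradient` for the image derivatives, and the synthetic Gaussian-blob images are assumptions made for this example, not part of the original project code. The sign convention follows the error term above: J(x) = I(x - d).

```python
import numpy as np

def lk_step(I, J, w=None):
    """One Lucas-Kanade step on a window: build Z and e, then solve Z d = e."""
    I = np.asarray(I, dtype=float)
    J = np.asarray(J, dtype=float)
    if w is None:
        w = np.ones_like(I)              # uniform weighting w(x) = 1
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(I)
    Z = np.array([[np.sum(w * Ix * Ix), np.sum(w * Ix * Iy)],
                  [np.sum(w * Ix * Iy), np.sum(w * Iy * Iy)]])
    diff = I - J
    e = np.array([np.sum(w * diff * Ix), np.sum(w * diff * Iy)])
    return np.linalg.solve(Z, e), Z

# Synthetic example (assumed data): a smooth blob shifted by d = (0.3, -0.2).
ys, xs = np.mgrid[0:32, 0:32].astype(float)

def blob(cx, cy, sigma=5.0):
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

I_win = blob(16.0, 16.0)
J_win = blob(16.3, 15.8)                 # J(x) = I(x - d)
d, Z = lk_step(I_win, J_win)
print("estimated displacement (dx, dy):", d)
```

Because the derivation keeps only first-order terms, a single step is only accurate for small displacements; larger motions need iteration or the pyramidal scheme.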
corner detecting Harris filter “cornerness” function
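• The Harris corner detector uses the two eigenvalues $\lambda_1, \lambda_2$ of Z to give a quantitative measure of the corner and edge qualities:
$$R = \lambda_1 \lambda_2 - 0.04\,(\lambda_1 + \lambda_2)^2$$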
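A short sketch of the cornerness measure, written with the identities det(Z) = λ1·λ2 and trace(Z) = λ1 + λ2 so that no explicit eigenvalue decomposition is needed. The function name `harris_response` and the three example matrices are illustrative assumptions.

```python
import numpy as np

def harris_response(Z, k=0.04):
    """Cornerness R = det(Z) - k * trace(Z)^2 = l1*l2 - k*(l1 + l2)^2."""
    Z = np.asarray(Z, dtype=float)
    return np.linalg.det(Z) - k * np.trace(Z) ** 2

# Hypothetical 2x2 gradient matrices for three kinds of image patch.
corner = np.array([[10.0, 0.0], [0.0, 8.0]])    # two large eigenvalues
edge = np.array([[10.0, 0.0], [0.0, 0.0]])      # one large, one zero
flat = np.array([[0.01, 0.0], [0.0, 0.01]])     # both small

R_corner = harris_response(corner)
R_edge = harris_response(edge)
R_flat = harris_response(flat)
print(R_corner, R_edge, R_flat)
```

R is large and positive for a corner, negative for an edge, and close to zero for a flat patch.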
Lucas-Kanade assumptions
• Z is invertible,
• that the two eigenvalues are large enough to be discernable from noise,
• and that the ratio of the two eigenvalues is well-behaved (larger/smaller is not too large).
• This is normally not the case.
desirable parameters for a tracker
• Accuracy can be related to the local sub-pixel resolution, in which a smaller integration window is desirable in order not to “smooth out” the details in the image.
• Robustness pertains to the sensitivity of the tracker to changes in lighting, size of image motion, etc. To handle larger motions, it is intuitive that a larger integration window would work better.
• One solution to this problem is a pyramidal Lucas-Kanade algorithm.
pyramidal Lucas-Kanade algorithm
• At each level of a Gaussian pyramid, the velocity is estimated by solving the Lucas-Kanade equations; bilinear interpolation is used to warp the image so that all computation stays at subpixel accuracy, and the estimate is then upsampled to the next level.
• This process is repeated for each layer of the pyramid, down to the highest resolution (the original image).
[Figure: Coarse-to-fine optical flow estimation. Gaussian pyramids of image I_{t-1} and image I; a motion of u = 10 pixels at full resolution becomes u = 5, 2.5, and 1.25 pixels at successively coarser levels.]
[Figure: Coarse-to-fine optical flow estimation. Starting from the coarsest level of the Gaussian pyramids of image I_{t-1} and image I: run iterative L-K, warp & upsample, run iterative L-K, and so on down to the original images.]
pseudo-code
Goal: Let u be a point on image I. Find its corresponding location v on image J.
Build pyramid representations of I and J: $\{I^L\}_{L=0,\dots,L_m}$ and $\{J^L\}_{L=0,\dots,L_m}$
Initialization of pyramidal guess: $g^{L_m} = [g_x^{L_m}\ g_y^{L_m}]^T = [0\ 0]^T$
for L = L_m down to 0 with step of -1
    Location of point u on image $I^L$: $u^L = [p_x\ p_y]^T = u / 2^L$
    Derivative of $I^L$ with respect to x: $I_x(x, y) = \frac{I^L(x + 1, y) - I^L(x - 1, y)}{2}$
    Derivative of $I^L$ with respect to y: $I_y(x, y) = \frac{I^L(x, y + 1) - I^L(x, y - 1)}{2}$
    Spatial gradient matrix: $G = \sum_{x = p_x - \omega_x}^{p_x + \omega_x} \sum_{y = p_y - \omega_y}^{p_y + \omega_y} \begin{bmatrix} I_x^2(x, y) & I_x(x, y)\, I_y(x, y) \\ I_x(x, y)\, I_y(x, y) & I_y^2(x, y) \end{bmatrix}$
    Initialization of iterative L-K: $\nu^0 = [0\ 0]^T$
    for k = 1 to K with step of 1 (or until $\|\eta^k\|$ < accuracy threshold)
        Image difference: $\delta I_k(x, y) = I^L(x, y) - J^L(x + g_x^L + \nu_x^{k-1},\ y + g_y^L + \nu_y^{k-1})$
        Image mismatch vector: $b_k = \sum_{x = p_x - \omega_x}^{p_x + \omega_x} \sum_{y = p_y - \omega_y}^{p_y + \omega_y} \begin{bmatrix} \delta I_k(x, y)\, I_x(x, y) \\ \delta I_k(x, y)\, I_y(x, y) \end{bmatrix}$
        Optical flow (Lucas-Kanade): $\eta^k = G^{-1} b_k$
        Guess for next iteration: $\nu^k = \nu^{k-1} + \eta^k$
    end of for-loop on k
    Final optical flow at level L: $d^L = \nu^K$
    Guess for next level L - 1: $g^{L-1} = [g_x^{L-1}\ g_y^{L-1}]^T = 2\,(g^L + d^L)$
end of for-loop on L
Final optical flow vector: $d = g^0 + d^0$
Location of point on J: v = u + d
Solution: The corresponding point is at location v on image J
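The pseudo-code above can be sketched in NumPy roughly as follows. This is an illustrative sketch under stated assumptions, not the implementation used for the results: the helper names (`downsample`, `bilinear`, `track_point`), the 3-tap binomial filter standing in for the Gaussian pyramid filter, the default window size, and the synthetic blob images are all choices made for this example.

```python
import numpy as np

def downsample(img):
    """Halve resolution: 3x3 binomial blur, then keep every other pixel."""
    p = np.pad(img, 1, mode="edge")
    rows = 0.25 * p[:-2, :] + 0.5 * p[1:-1, :] + 0.25 * p[2:, :]
    blur = 0.25 * rows[:, :-2] + 0.5 * rows[:, 1:-1] + 0.25 * rows[:, 2:]
    return blur[::2, ::2]

def bilinear(img, x, y):
    """Sample img at real-valued coordinate arrays (x, y), clamped to the border."""
    h, w = img.shape
    x = np.clip(x, 0, w - 1.001)
    y = np.clip(y, 0, h - 1.001)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    ax, ay = x - x0, y - y0
    return ((1 - ax) * (1 - ay) * img[y0, x0] + ax * (1 - ay) * img[y0, x0 + 1]
            + (1 - ax) * ay * img[y0 + 1, x0] + ax * ay * img[y0 + 1, x0 + 1])

def track_point(I, J, u, levels=3, win=7, iters=20, eps=1e-3):
    """Return the location v on J that matches point u = (x, y) on I."""
    pyr_I, pyr_J = [np.asarray(I, float)], [np.asarray(J, float)]
    for _ in range(levels):
        pyr_I.append(downsample(pyr_I[-1]))
        pyr_J.append(downsample(pyr_J[-1]))
    g = np.zeros(2)                                # pyramidal guess at level Lm
    for L in range(levels, -1, -1):
        IL, JL = pyr_I[L], pyr_J[L]
        px, py = np.asarray(u, float) / 2 ** L     # u^L = u / 2^L
        dy, dx = np.mgrid[-win:win + 1, -win:win + 1].astype(float)
        X, Y = px + dx, py + dy                    # integration window
        # Central-difference derivatives of I^L over the window
        Ix = (bilinear(IL, X + 1, Y) - bilinear(IL, X - 1, Y)) / 2
        Iy = (bilinear(IL, X, Y + 1) - bilinear(IL, X, Y - 1)) / 2
        G = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                      [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
        nu = np.zeros(2)                           # iterative L-K guess
        for _ in range(iters):
            dI = bilinear(IL, X, Y) - bilinear(JL, X + g[0] + nu[0], Y + g[1] + nu[1])
            b = np.array([np.sum(dI * Ix), np.sum(dI * Iy)])
            eta = np.linalg.solve(G, b)            # optical flow step
            nu = nu + eta
            if np.hypot(eta[0], eta[1]) < eps:
                break
        d = nu                                     # final flow at level L
        if L > 0:
            g = 2 * (g + d)                        # guess for level L - 1
    return np.asarray(u, float) + g + d            # v = u + g^0 + d^0

# Assumed test data: a smooth blob that moves by (6, -4) pixels between I and J.
yy, xx = np.mgrid[0:128, 0:128].astype(float)
I_img = np.exp(-((xx - 64) ** 2 + (yy - 64) ** 2) / (2 * 10.0 ** 2))
J_img = np.exp(-((xx - 70) ** 2 + (yy - 60) ** 2) / (2 * 10.0 ** 2))
v = track_point(I_img, J_img, (64.0, 64.0), levels=2)
print("tracked location on J:", v)
```

The sketch tracks a single point and does no bounds or conditioning checks on G; a practical tracker would reject points whose G is close to singular.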
Initial Feature Points
• The methodology used to select the initial feature points on image I is as follows:
• Compute the G matrix and its minimum eigenvalue $\lambda_m$ at every pixel in image I.
• Determine the maximum $\lambda_{max}$ of all the minimum eigenvalues over the whole image.
• Retain the image pixels whose $\lambda_m$ value is at least a given fraction (5%-10%) of $\lambda_{max}$.
• From those pixels keep the local maxima (i.e. a pixel is kept if its $\lambda_m$ value is larger than that of any other pixel in its 3x3 neighborhood).
• Keep the subset of those pixels so that the minimum distance between any pair of pixels is larger than a given threshold distance (typically 5 or 10 pixels).
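The selection steps above can be sketched as follows. The function names (`box_sum`, `select_features`), the window radius, and the test image are assumptions for illustration. One deliberate deviation is labeled in the code: ties are allowed in the local-maximum test (>= rather than a strict >), and the minimum-distance step then thins any tied neighbours.

```python
import numpy as np

def box_sum(a, r):
    """Sum of a over a (2r+1)x(2r+1) window around every pixel (zero padded)."""
    p = np.pad(a, r)
    out = np.zeros_like(a)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def select_features(img, r=2, quality=0.05, min_dist=5):
    """Return (x, y) feature points chosen by the minimum eigenvalue of G."""
    img = np.asarray(img, dtype=float)
    Iy, Ix = np.gradient(img)                      # rows (y) first, then columns (x)
    gxx, gxy, gyy = box_sum(Ix * Ix, r), box_sum(Ix * Iy, r), box_sum(Iy * Iy, r)
    # Minimum eigenvalue of the symmetric matrix [[gxx, gxy], [gxy, gyy]]
    lam = 0.5 * (gxx + gyy - np.sqrt((gxx - gyy) ** 2 + 4 * gxy ** 2))
    keep = (lam >= quality * lam.max()) & (lam > 0)
    # Local maxima over the 3x3 neighbourhood (assumption: ties are kept here)
    h, w = lam.shape
    p = np.pad(lam, 1, constant_values=-np.inf)
    for dy in range(3):
        for dx in range(3):
            if (dy, dx) != (1, 1):
                keep &= lam >= p[dy:dy + h, dx:dx + w]
    # Visit candidates strongest first and enforce the minimum distance
    ys, xs = np.nonzero(keep)
    chosen = []
    for i in np.argsort(-lam[ys, xs]):
        pt = (int(xs[i]), int(ys[i]))
        if all((pt[0] - q[0]) ** 2 + (pt[1] - q[1]) ** 2 > min_dist ** 2 for q in chosen):
            chosen.append(pt)
    return chosen

# Assumed test image: a bright square, whose four corners should be selected.
square = np.zeros((40, 40))
square[10:30, 10:30] = 1.0
feats = select_features(square)
print(feats)
```

Points along the straight edges of the square are rejected because their minimum eigenvalue is zero; only the neighbourhoods of the four corners have gradients in two directions.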
Orthographic Case
• Trajectories of image coordinates $\{(u_{fp}, v_{fp}) \mid f = 1, \dots, F,\ p = 1, \dots, P\}$
• Input: registered measurement matrix Ŵ
The rank theorem
• Place the origin of the world coordinate system at the centroid of the P points.
• The unit vectors $i_f$ and $j_f$ point along the X and Y directions of the image in frame f, respectively.
The rank theorem
• The projection $(u_{fp}, v_{fp})$, i.e. the image feature point, of point $s_p = (x_p, y_p, z_p)^T$ onto frame f is
$$u_{fp} = i_f^T (s_p - t_f), \qquad v_{fp} = j_f^T (s_p - t_f)$$
where $t_f$ is the vector from the world origin to the origin of image frame f.
• Note: since the origin of the world coordinates is placed at the centroid of the object points,
$$\frac{1}{P} \sum_{p=1}^{P} s_p = 0$$
The rank theorem
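• For the registered horizontal image projection we have
$$\tilde{u}_{fp} = u_{fp} - \frac{1}{P} \sum_{q=1}^{P} u_{fq} = i_f^T (s_p - t_f) - \frac{1}{P} \sum_{q=1}^{P} i_f^T (s_q - t_f) = i_f^T s_p$$
• To summarize
$$\tilde{u}_{fp} = i_f^T s_p, \qquad \tilde{v}_{fp} = j_f^T s_p$$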
The rank theorem
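• The registered measurement matrix can be expressed in matrix form:
$$\hat{W} = RS$$
• $R = [\,i_1 \ \cdots \ i_F \ \ j_1 \ \cdots \ j_F\,]^T$ represents the camera rotation
• $S = [\,s_1 \ \cdots \ s_P\,]$ is the shape matrix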
The rank theorem
• Since R is 2F×3 and S is 3×P, their product Ŵ = RS has rank at most 3.
• Rank theorem: without noise, the registered measurement matrix Ŵ is at most of rank three.
• When noise corrupts the images, however, Ŵ will not be exactly of rank 3.
• The rank theorem can nevertheless be extended to the case of noisy measurements in a well-defined manner, using approximate rank.
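A small numerical check of the rank theorem on synthetic data. Everything here (random shape, random camera frames, the noise level, and the variable names) is an assumption made for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
F, P = 6, 20                                   # frames and feature points

# Random 3-D shape with the world origin at the centroid of the points
S = rng.normal(size=(3, P))
S -= S.mean(axis=1, keepdims=True)

U = np.zeros((F, P))
V = np.zeros((F, P))
for f in range(F):
    # Orthonormal camera axes from the QR factorization of a random matrix
    frame, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    i_f, j_f = frame[0], frame[1]
    t_f = rng.normal(size=3)                   # camera translation
    U[f] = i_f @ (S - t_f[:, None])            # u_fp = i_f^T (s_p - t_f)
    V[f] = j_f @ (S - t_f[:, None])            # v_fp = j_f^T (s_p - t_f)

W = np.vstack([U, V])                          # 2F x P measurement matrix
W_reg = W - W.mean(axis=1, keepdims=True)      # register: subtract row means

rank_clean = np.linalg.matrix_rank(W_reg)
W_noisy = W_reg + 0.01 * rng.normal(size=W_reg.shape)
rank_noisy = np.linalg.matrix_rank(W_noisy)
print("rank without noise:", rank_clean, "| rank with noise:", rank_noisy)
```

Without noise the numerical rank is 3 even though the matrix is 12×20; with even a small amount of noise the matrix becomes full rank, which is why the approximate rank is needed.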
Approximate rank
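• Ŵ can be decomposed into three matrices by the singular value decomposition: $\hat{W} = O_1 \Sigma O_2$, where $O_1$ and $O_2$ are unitary matrices.
• Separating the three largest singular values from the rest, we have
$$O_1 \Sigma O_2 = O_1' \Sigma' O_2' + O_1'' \Sigma'' O_2''$$
• Ideally, $\Sigma'$ contains all the significant singular values of Ŵ, so that $O_1'' \Sigma'' O_2''$ is due entirely to noise.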
Rank theorem for Noisy Measurement
• All the shape and rotation information in Ŵ is contained in its three greatest singular values, together with the corresponding left and right singular vectors.
• Ř and Š have the same size as the desired rotation and shape matrices R and S.
• The decomposition is not unique: for any invertible 3×3 matrix Q, $(\check{R}Q)(Q^{-1}\check{S}) = \check{R}(QQ^{-1})\check{S} = \check{R}\check{S} = \hat{W}$.
• Because the column space is 3-D by the rank theorem, R and Ř are different bases for the same space, so there is a linear transformation between them.
• Ř is a linear transformation of the true rotation matrix R.
• Š is a linear transformation of the true shape matrix S.
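A sketch of the rank-3 factorization and of its non-uniqueness. It assumes the split used by Tomasi and Kanade [4], with Ř built from the first three left singular vectors scaled by the square roots of the singular values and Š built likewise from the right singular vectors; the function name `factor_rank3`, the random test matrix, and the particular Q are illustrative assumptions.

```python
import numpy as np

def factor_rank3(W_reg):
    """Split the best rank-3 approximation of W_reg into R_hat (2F x 3) and S_hat (3 x P)."""
    O1, sigma, O2 = np.linalg.svd(W_reg, full_matrices=False)
    root = np.sqrt(sigma[:3])                  # square roots of the 3 largest singular values
    R_hat = O1[:, :3] * root                   # scale the first three columns of O1
    S_hat = root[:, None] * O2[:3, :]          # scale the first three rows of O2
    return R_hat, S_hat

# Assumed test data: an exactly rank-3 matrix standing in for a noise-free W
rng = np.random.default_rng(1)
W_reg = rng.normal(size=(8, 3)) @ rng.normal(size=(3, 15))

R_hat, S_hat = factor_rank3(W_reg)

# Any invertible 3x3 matrix Q gives another valid factorization
Q = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
alt = (R_hat @ Q) @ (np.linalg.inv(Q) @ S_hat)
print(np.allclose(R_hat @ S_hat, W_reg), np.allclose(alt, W_reg))
```

Both products reproduce the same matrix, so the SVD alone cannot tell Ř apart from the true rotation matrix; that ambiguity is what the metric constraints resolve.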
The metric constraints
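• There exists a 3×3 matrix Q such that R = ŘQ and S = Q⁻¹Š.
• To find Q: the rows of the true rotation matrix R are unit vectors, and the rows $i_f^T$ and $j_f^T$ belonging to the same frame are orthogonal. These metric constraints yield the over-constrained quadratic system
$$\hat{i}_f^T Q Q^T \hat{i}_f = 1, \qquad \hat{j}_f^T Q Q^T \hat{j}_f = 1, \qquad \hat{i}_f^T Q Q^T \hat{j}_f = 0$$
where $\hat{i}_f^T$ and $\hat{j}_f^T$ are the rows of Ř.
• This is a simple nonlinear data fitting problem.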
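One way to solve the metric constraints, sketched here under stated assumptions, is to swap the nonlinear fit for a linear one: the constraints are linear in the symmetric matrix L = QQ^T, so L can be found by linear least squares and Q recovered from its eigendecomposition. This is a commonly used alternative and not necessarily the fitting procedure used in the project. The names `row6` and `solve_metric`, the synthetic frames, and the mixing matrix `A_mix` are hypothetical.

```python
import numpy as np

def row6(a, b):
    """Coefficients of a^T L b in the 6 unknowns (L00, L01, L02, L11, L12, L22)."""
    return np.array([a[0] * b[0], a[0] * b[1] + a[1] * b[0], a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1], a[1] * b[2] + a[2] * b[1], a[2] * b[2]])

def solve_metric(R_hat):
    """Find Q such that the rows of R_hat @ Q satisfy the metric constraints."""
    F = R_hat.shape[0] // 2
    I_hat, J_hat = R_hat[:F], R_hat[F:]        # i rows first, then j rows
    A, c = [], []
    for i, j in zip(I_hat, J_hat):
        A += [row6(i, i), row6(j, j), row6(i, j)]
        c += [1.0, 1.0, 0.0]
    l = np.linalg.lstsq(np.array(A), np.array(c), rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    vals, vecs = np.linalg.eigh(L)
    # Q = V diag(sqrt(vals)) so that Q Q^T = L; negative values are clipped to 0
    return vecs * np.sqrt(np.clip(vals, 0.0, None))

# Assumed test data: true orthonormal camera axes, distorted by a known matrix
rng = np.random.default_rng(2)
frames = [np.linalg.qr(rng.normal(size=(3, 3)))[0] for _ in range(6)]
R_true = np.vstack([fr[0] for fr in frames] + [fr[1] for fr in frames])
A_mix = np.array([[2.0, 0.5, 0.0],
                  [0.0, 1.0, 0.3],
                  [0.1, 0.0, 1.5]])
R_hat = R_true @ np.linalg.inv(A_mix)          # plays the role of the SVD output

Q = solve_metric(R_hat)
R_rec = R_hat @ Q
print(np.linalg.norm(R_rec, axis=1))
```

Q is only determined up to a rotation, so `R_rec` matches the true rotation matrix up to a global change of world orientation; the row norms and the orthogonality of each i, j pair are what the constraints fix.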
Experimental Results
The 430 features selected by the automatic detection method Tomasi-Kanade.
Experimental Results
388 features selected by the automatic detection method Bishop
288 features tracked across 10 images by the automatic detection method Bishop
Reconstructed image Bishop
Conclusions
• The pyramidal Lucas-Kanade tracker worked quite well on the images I submitted to it. For larger motions I would like to implement the improvements to the Shi-Tomasi tracker described in [7], an automatic scheme for rejecting spurious features, but time constraints have not yet allowed me to do so.
• The Tomasi-Kanade factorization method proved to be a robust solution for generating 3-D coordinates of feature points of rigid objects using the points tracked by the pyramidal Lucas-Kanade tracker.
References
• [1] "Pyramidal Implementation of the Lucas Kanade Feature Tracker: Description of the algorithm", Jean-Yves Bouguet, Intel Corporation, Microprocessor Research Labs.
• [2] "A combined corner and edge detector", Chris Harris and Mike Stephens, Proceedings Fourth Alvey Vision Conference, Manchester, pp. 147-151, 1988.
• [3] "Good Features to Track", Jianbo Shi and Carlo Tomasi, IEEE Conference on Computer Vision and Pattern Recognition (CVPR94), Seattle, June 1994.
• [4] "Shape and motion from image streams under orthography: a factorization method", Carlo Tomasi and Takeo Kanade, International Journal of Computer Vision, 9(2):137-154, November 1992.
• [5] http://mathworld.wolfram.com/UnitaryMatrix.html
• [6] "Linear and Incremental Acquisition of Invariant Shape Models from Image Sequences", Daphna Weinshall and Carlo Tomasi, Proceedings: IEEE Fourth International Conference on Computer Vision, pp. 675-682, Berlin, May 1993.
• [7] "Improving Feature Tracking with Robust Statistics", A. Fusiello, E. Trucco, T. Tommasini, V. Roberto, Pattern Analysis & Applications (1999) 2:312-320, © 1999 Springer-Verlag London Limited.