SCI30_Lecture1

UE SCI30

Method and Modeling of 3D Motion Capture

Vincent Fremont

Master 2

Fall Semester


Introduction

Course objectives

Acquire knowledge and skills in computer vision and image processing to understand and master methods for motion capture, with and without markers, using static and moving camera networks.



Content and schedule

CM1: Introduction and course presentation (+ demonstrations), 2D/3D projective geometry for machine vision.

CM2: Robust Estimation Methods for Computer Vision.

CM3: Digital Image Enhancement in the Frequency Domain.

CM3: Image formation, camera models and calibration.

CM4: Spatial and Temporal Segmentation.

CM5: 3D Reconstruction from multiple images: stereovision and Structure-From-Motion.



TD1: Image processing and computer vision under Matlab.

TD2: Projective geometry, camera models and stereovision.

TD3: Camera Calibration.

Books

Multiple View Geometry in Computer Vision, R. Hartley and A. Zisserman

Digital Image Processing, R. C. Gonzalez and R. E. Woods

An Invitation to 3-D Vision, Y. Ma et al.


3D Motion Capture from Images

Principle: use retro-reflective markers viewed from multiple cameras, or other vision-based techniques, to capture actors for computer animation. Solution = Computer Vision



What is computer vision?

The answer to the question: how to describe the world that we see in one or more images and to reconstruct its properties, such as shape, illumination, and color distributions.

An inverse problem: recover some unknowns given insufficient information to fully specify the solution.

Computer Vision = Physics (radiometry, optics, sensor design) + Computer Graphics (3D modeling, rendering, animation)



Relationship between images, geometry, and photometry

Figure 1.11 (from Computer Vision: Algorithms and Applications, September 3, 2010 draft) Relationship between images, geometry, and photometry, as well as a taxonomy of the topics covered in this book. Topics are roughly positioned along the left–right axis depending on whether they are more closely related to image-based (left), geometry-based (middle) or appearance-based (right) representations, and on the vertical axis by increasing level of abstraction. The whole figure should be taken with a large grain of salt, as there are many additional subtle connections between topics not illustrated here.

Images: Sampling and aliasing, Image processing, Feature detection, Segmentation, Motion estimation, Stitching.

Geometry: Geometric image formation, Feature-based alignment, Structure-From-Motion, Stereo correspondence, 3D shape recovery.

Photometry: Photometric image formation, Texture recovery, Image-based rendering.


Introduction

Sensors examples



Ladybug Pointgrey Video example


Introduction

Applications

Optical character recognition (OCR).

Machine inspection.

3D model building (photogrammetry).

Medical imaging.

Automotive safety.

Motion capture (mocap).

Surveillance.

Fingerprint recognition and biometrics.


Heudiasyc experimental platforms

Carmen (Heudiasyc) Drone (Heudiasyc)


Introduction

Application examples

Real-time MonoSlam (Oxford)


Lane detection (Berkeley)

Augmented reality (Oxford)

Object tracking (Compiegne)

Distance tracking (Compiegne)


Course beginning

From Light to Image


Image Formation Process

Some components of the image formation process


Figure 2.1 A few components of the image formation process: (a) perspective projection; (b) light scattering when hitting a surface; (c) lens optics; (d) Bayer color filter array.



1. Geometric primitives and transformations

Figure 2.2 (a) 2D line equation and (b) 3D plane equation, expressed in terms of the normal n and distance to the origin d.

algorithm, which is discussed in Section 4.3.2. The combination (θ, d) is also known as polar coordinates.

When using homogeneous coordinates, we can compute the intersection of two lines as

$\tilde{\mathbf{x}} = \tilde{\mathbf{l}}_1 \times \tilde{\mathbf{l}}_2$, (2.4)

where × is the cross product operator. Similarly, the line joining two points can be written as

$\tilde{\mathbf{l}} = \tilde{\mathbf{x}}_1 \times \tilde{\mathbf{x}}_2$. (2.5)

When trying to fit an intersection point to multiple lines or, conversely, a line to multiple points, least squares techniques (Section 6.1.1 and Appendix A.2) can be used, as discussed in Exercise 2.1.

2D conics. There are other algebraic curves that can be expressed with simple polynomial homogeneous equations. For example, the conic sections (so called because they arise as the intersection of a plane and a 3D cone) can be written using a quadric equation

$\tilde{\mathbf{x}}^T Q \tilde{\mathbf{x}} = 0$. (2.6)

Quadric equations play useful roles in the study of multi-view geometry and camera calibration (Hartley and Zisserman 2004; Faugeras and Luong 2001) but are not used extensively in this book.

3D points. Point coordinates in three dimensions can be written using inhomogeneous coordinates $\mathbf{x} = (x, y, z) \in \mathbb{R}^3$ or homogeneous coordinates $\tilde{\mathbf{x}} = (\tilde{x}, \tilde{y}, \tilde{z}, \tilde{w}) \in \mathbb{P}^3$. As before, it is sometimes useful to denote a 3D point using the augmented vector $\bar{\mathbf{x}} = (x, y, z, 1)$ with $\tilde{\mathbf{x}} = \tilde{w}\bar{\mathbf{x}}$.

Geometric primitives

Basic building blocks used to describe 3D shapes.

Geometric primitives are typically points, lines, planes,curves, surfaces and volumes.



1. Geometric primitives and transformations

Homogeneous representation of points

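2D points (pixel coordinates in an image) are denoted as:

$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}$, where $(x, y) \in \mathbb{R}^2$

2D homogeneous points are denoted as:

$\tilde{\mathbf{x}} = \begin{pmatrix} \tilde{x} \\ \tilde{y} \\ \tilde{w} \end{pmatrix}$, where $(\tilde{x}, \tilde{y}, \tilde{w}) \in \mathbb{P}^2 = \mathbb{R}^3 - (0, 0, 0)^T$

Relation to the inhomogeneous vector: $\tilde{\mathbf{x}} = (\tilde{x}, \tilde{y}, \tilde{w})^T = \tilde{w}(x, y, 1)^T = \tilde{w}\bar{\mathbf{x}}$. The vector $\bar{\mathbf{x}} = (x, y, 1)^T$ is the augmented vector.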
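As a small illustration of the conversion between the two representations, here is a minimal Python sketch. The function names `to_homogeneous` and `to_inhomogeneous` are hypothetical helpers chosen for this example, not part of the course material.

```python
def to_homogeneous(point, w=1.0):
    """Lift an inhomogeneous 2D point (x, y) to the homogeneous vector (w*x, w*y, w)."""
    if w == 0:
        raise ValueError("w must be non-zero for a finite point")
    x, y = point
    return (w * x, w * y, w)


def to_inhomogeneous(point_h):
    """Divide by the last coordinate; ideal points (w = 0) have no finite counterpart."""
    x, y, w = point_h
    if w == 0:
        raise ValueError("ideal point: w = 0")
    return (x / w, y / w)


pixel = (3.0, 4.0)
print(to_inhomogeneous(to_homogeneous(pixel)))         # the original pixel
print(to_inhomogeneous(to_homogeneous(pixel, w=2.0)))  # same pixel, different scale
```

Both calls return the same pixel, which is the point of the representation: all non-zero multiples of a homogeneous vector denote the same 2D point.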




Homogeneous representation of lines

Homogeneous coordinates: $\mathbf{l} = (a, b, c)^T$

Line equation: $\mathbf{x}^T \mathbf{l} = ax + by + c = 0$

Scale invariance: $(ka)x + (kb)y + (kc) = 0,\ \forall k \neq 0 \ \Rightarrow\ (a, b, c)^T \sim k(a, b, c)^T$

Example: represent the line y = 1 as a homogeneous vector. Write the line as −y + 1 = 0; then a = 0, b = −1, c = 1 and $\mathbf{l} = (0, -1, 1)^T$.

Normal vector notation: $\mathbf{l} = (n_x, n_y, d)^T = (\mathbf{n}, d)^T$ with $\|\mathbf{n}\| = 1$. We can also set $\mathbf{n} = (\cos\theta, \sin\theta)^T$. The combination (θ, d) is also known as polar coordinates.

Line at infinity: $\mathbf{l}_\infty = (0, 0, 1)^T$ contains all the (ideal) points at infinity and cannot be normalized.
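The incidence test and the normal-vector form can be sketched in a few lines of Python. The helper names `incidence` and `normalize_line` are assumptions made for this example.

```python
import math


def incidence(x, l):
    """Return x^T l; the point x lies on the line l exactly when this is zero."""
    return sum(xi * li for xi, li in zip(x, l))


def normalize_line(l):
    """Rescale (a, b, c) so that the normal (a, b) has unit length."""
    a, b, c = l
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("the line at infinity cannot be normalized")
    return (a / norm, b / norm, c / norm)


line = (0.0, -1.0, 1.0)                # the line y = 1
scaled = tuple(5.0 * v for v in line)  # same line, scaled by k = 5
point = (7.0, 1.0, 1.0)                # augmented vector of the point (7, 1)

print(incidence(point, line), incidence(point, scaled))  # 0.0 0.0
print(normalize_line(scaled))                            # (0.0, -1.0, 1.0)
```

After normalization, `incidence` applied to an augmented point gives its signed distance to the line.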




Intersection and join operators

The point x lies on the line l if and only if xT l = lTx = 0

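The intersection of two lines l and l′ is $\mathbf{x} = \mathbf{l} \times \mathbf{l}'$

Example: compute the point of intersection of the two lines l (y = 1) and m (x = 2):

$\mathbf{l} = \begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix}, \quad \mathbf{m} = \begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix}, \quad \mathbf{x} = \mathbf{l} \times \mathbf{m} = \begin{pmatrix} -2 \\ -1 \\ -1 \end{pmatrix}$

which is the point (2, 1).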
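The worked example can be checked numerically. The following sketch uses a hand-written `cross` helper (an assumption of this example) and also computes a join of two points, the dual operation.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


l = (0, -1, 1)   # the line y = 1
m = (-1, 0, 2)   # the line x = 2

x = cross(l, m)
print(x)                          # (-2, -1, -1)
print(x[0] / x[2], x[1] / x[2])   # 2.0 1.0

# Dual operation: the line joining the points (2, 1) and (0, 1).
print(cross((2, 1, 1), (0, 1, 1)))  # (0, -2, 2), i.e. the line y = 1 again
```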




Line joining two points

The line joining two points x and x′ is l = x× x′

Intersection of parallel lines

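If $\mathbf{l} = (a, b, c)^T$ and $\mathbf{l}' = (a, b, c')^T$, then $\mathbf{l} \times \mathbf{l}' = (c' - c)(b, -a, 0)^T \sim (b, -a, 0)^T$, which is an ideal point.

$(b, -a)^T$ is the tangent vector of the lines and $(a, b)^T$ is their normal vector.

Ideal points: $(x_1, x_2, 0)^T$. Line at infinity: $\mathbf{l}_\infty = (0, 0, 1)^T$.

$\mathbb{P}^2 = \mathbb{R}^2 \cup \mathbf{l}_\infty$. Note that in $\mathbb{P}^2$ there is no distinction between ideal points and other points.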
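A short numerical check of the parallel-line case, using two horizontal lines chosen for illustration (the `cross` helper is an assumption of this sketch):

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


l1 = (0, -1, 1)   # the line y = 1
l2 = (0, -1, 3)   # the line y = 3, parallel to l1

x = cross(l1, l2)
print(x)                     # (-2, 0, 0): third coordinate is 0, an ideal point

# The ideal point lies on the line at infinity (0, 0, 1).
l_inf = (0, 0, 1)
print(sum(a * b for a, b in zip(x, l_inf)))  # 0
```

The first two coordinates (−2, 0) are proportional to the direction (b, −a) = (−1, 0) shared by both lines.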




A model for the projective plane

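[Figure: model of the projective plane, showing the plane π, a line l, a point x, the origin O, the axes x1, x2, x3, and an ideal point.]

Exactly one line passes through two points.

Exactly one point lies at the intersection of two lines.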




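Duality principle

point ⟺ line, $\mathbf{x}$ ⟺ $\mathbf{l}$

$\mathbf{x}^T \mathbf{l} = 0$ ⟺ $\mathbf{l}^T \mathbf{x} = 0$

$\mathbf{x} = \mathbf{l} \times \mathbf{l}'$ ⟺ $\mathbf{l} = \mathbf{x} \times \mathbf{x}'$

To any theorem of 2-dimensional projective geometry there corresponds a dual theorem, which may be derived by interchanging the roles of points and lines in the original theorem.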




2D Conics

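Curve described by a second-degree equation in the plane:

$a x^2 + b x y + c y^2 + d x + e y + f = 0$

or, homogenized with $x \mapsto x_1/x_3$, $y \mapsto x_2/x_3$:

$a x_1^2 + b x_1 x_2 + c x_2^2 + d x_1 x_3 + e x_2 x_3 + f x_3^2 = 0$

or in matrix form:

$\mathbf{x}^T C \mathbf{x} = 0$ with $C = \begin{pmatrix} a & b/2 & d/2 \\ b/2 & c & e/2 \\ d/2 & e/2 & f \end{pmatrix}$

5 DOF: {a : b : c : d : e : f}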




2D Conics

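For each point $(x_i, y_i)$ the conic passes through:

$a x_i^2 + b x_i y_i + c y_i^2 + d x_i + e y_i + f = 0$

or $(x_i^2,\ x_i y_i,\ y_i^2,\ x_i,\ y_i,\ 1)\,\mathbf{c} = 0$ with $\mathbf{c} = (a, b, c, d, e, f)^T$

Stacking the constraints from five points yields

$\begin{pmatrix} x_1^2 & x_1 y_1 & y_1^2 & x_1 & y_1 & 1 \\ x_2^2 & x_2 y_2 & y_2^2 & x_2 & y_2 & 1 \\ x_3^2 & x_3 y_3 & y_3^2 & x_3 & y_3 & 1 \\ x_4^2 & x_4 y_4 & y_4^2 & x_4 & y_4 & 1 \\ x_5^2 & x_5 y_5 & y_5^2 & x_5 & y_5 & 1 \end{pmatrix} \mathbf{c} = \mathbf{0}$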
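The stacked system has a one-dimensional null space when the five points are in general position. The sketch below finds it with exact Gauss-Jordan elimination over rationals; the function name `conic_through` and the choice of elimination (rather than, say, an SVD) are assumptions of this example.

```python
from fractions import Fraction


def conic_through(points):
    """Null vector (a, b, c, d, e, f) of the 5x6 system built from five points."""
    rows = [[Fraction(v) for v in (x * x, x * y, y * y, x, y, 1)] for x, y in points]
    pivots, r = [], 0
    for col in range(6):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [v / rows[r][col] for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(6) if c not in pivots]
    if len(free) != 1:
        raise ValueError("points are not in general position")
    sol = [Fraction(0)] * 6
    sol[free[0]] = Fraction(1)          # fix the overall scale
    for i, col in enumerate(pivots):
        sol[col] = -rows[i][free[0]]
    return sol


# Five points on the unit circle x^2 + y^2 - 1 = 0.
pts = [(1, 0), (0, 1), (-1, 0), (0, -1), (Fraction(3, 5), Fraction(4, 5))]
print([int(v) for v in conic_through(pts)])  # [-1, 0, -1, 0, 0, 1]
```

The result is the unit circle up to scale, consistent with the conic having 5 degrees of freedom.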




Tangent lines to conics

The line l tangent to C at point x on C is given by l = Cx
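For instance, with the unit circle as conic, the tangent at a point follows from one matrix-vector product. This is a minimal sketch; the circle and the point are chosen for illustration.

```python
def mat_vec(M, v):
    """Matrix-vector product for small dense matrices."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)


# Unit circle x^2 + y^2 - 1 = 0 written as x^T C x = 0.
C = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, -1]]

x = (3, 4, 5)          # homogeneous form of the point (3/5, 4/5) on the circle
l = mat_vec(C, x)
print(l)               # (3, 4, -5): the line 3x + 4y - 5 = 0

# The point lies on its own tangent line: x^T l = x^T C x = 0.
print(sum(a * b for a, b in zip(x, l)))  # 0
```

The line 3x + 4y − 5 = 0 is at distance 1 from the origin, as a tangent to the unit circle should be.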




Dual conics

A line l tangent to the conic C satisfies $\mathbf{l}^T C^* \mathbf{l} = 0$.

In general (C full rank): $C^* = C^{-1}$

Dual conics = line conics = conic envelopes




Degenerate conics

A conic is degenerate if the matrix C is not of full rank.

e.g. two lines (rank 2):

$C = \mathbf{l}\mathbf{m}^T + \mathbf{m}\mathbf{l}^T$

e.g. repeated line (rank 1):

$C = \mathbf{l}\mathbf{l}^T$

Degenerate line conics: 2 points (rank 2), double point (rank 1).

Note that for degenerate conics $(C^*)^* \neq C$.
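The two-line case can be verified directly: the matrix built from two lines is singular, and it vanishes on every point of either line. The helper names below are assumptions of this sketch.

```python
def outer(u, v):
    """Outer product u v^T."""
    return [[ui * vj for vj in v] for ui in u]


def det3(M):
    """Determinant of a 3x3 matrix."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def quad_form(x, C):
    """Evaluate x^T C x."""
    return sum(x[i] * C[i][j] * x[j] for i in range(3) for j in range(3))


l = (0, -1, 1)   # the line y = 1
m = (-1, 0, 2)   # the line x = 2
A, B = outer(l, m), outer(m, l)
C = [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

print(det3(C))                    # 0: C is not of full rank
print(quad_form((5, 1, 1), C))    # 0: (5, 1) lies on l
print(quad_form((2, 7, 1), C))    # 0: (2, 7) lies on m
print(quad_form((0, 0, 1), C))    # 4: the origin is on neither line
```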




3D points

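3D inhomogeneous points are denoted as:

$\mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$, where $(x, y, z) \in \mathbb{R}^3$

3D homogeneous points are denoted as:

$\tilde{\mathbf{x}} = (\tilde{x}, \tilde{y}, \tilde{z}, \tilde{w})^T \in \mathbb{P}^3$

Relation to the inhomogeneous vector: $\tilde{\mathbf{x}} = (\tilde{x}, \tilde{y}, \tilde{z}, \tilde{w})^T = \tilde{w}(x, y, z, 1)^T = \tilde{w}\bar{\mathbf{x}}$.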




3D planes

Homogeneous coordinates: $\mathbf{m} = (a, b, c, d)^T$

Plane equation: $\mathbf{x}^T \mathbf{m} = ax + by + cz + d = 0$

Normal vector notation: $\mathbf{m} = (n_x, n_y, n_z, d)^T = (\mathbf{n}, d)^T$ with $\|\mathbf{n}\| = 1$. We can also set $\mathbf{n} = (\cos\theta\cos\phi, \sin\theta\cos\phi, \sin\phi)^T$. The combination (θ, φ) is also known as spherical coordinates.

Plane at infinity: $\mathbf{m} = (0, 0, 0, 1)^T$ contains all the points at infinity and cannot be normalized.




3D lines


Figure 2.3 3D line equation, r = (1 − λ)p + λq.

3D planes. 3D planes can also be represented as homogeneous coordinates $\tilde{\mathbf{m}} = (a, b, c, d)$ with a corresponding plane equation

$\bar{\mathbf{x}} \cdot \tilde{\mathbf{m}} = ax + by + cz + d = 0$. (2.7)

We can also normalize the plane equation as $\mathbf{m} = (\hat{n}_x, \hat{n}_y, \hat{n}_z, d) = (\hat{\mathbf{n}}, d)$ with $\|\hat{\mathbf{n}}\| = 1$. In this case, $\hat{\mathbf{n}}$ is the normal vector perpendicular to the plane and d is its distance to the origin (Figure 2.2b). As with the case of 2D lines, the plane at infinity $\tilde{\mathbf{m}} = (0, 0, 0, 1)$, which contains all the points at infinity, cannot be normalized (i.e., it does not have a unique normal or a finite distance).

We can express $\hat{\mathbf{n}}$ as a function of two angles (θ, φ),

$\hat{\mathbf{n}} = (\cos\theta\cos\phi, \sin\theta\cos\phi, \sin\phi)$, (2.8)

i.e., using spherical coordinates, but these are less commonly used than polar coordinates since they do not uniformly sample the space of possible normal vectors.

3D lines. Lines in 3D are less elegant than either lines in 2D or planes in 3D. One possible representation is to use two points on the line, (p, q). Any other point on the line can be expressed as a linear combination of these two points

$\mathbf{r} = (1 - \lambda)\mathbf{p} + \lambda\mathbf{q}$, (2.9)

as shown in Figure 2.3. If we restrict 0 ≤ λ ≤ 1, we get the line segment joining p and q. If we use homogeneous coordinates, we can write the line as

$\tilde{\mathbf{r}} = \mu\tilde{\mathbf{p}} + \lambda\tilde{\mathbf{q}}$. (2.10)

A special case of this is when the second point is at infinity, i.e., $\tilde{\mathbf{q}} = (\hat{d}_x, \hat{d}_y, \hat{d}_z, 0) = (\hat{\mathbf{d}}, 0)$. Here, we see that $\hat{\mathbf{d}}$ is the direction of the line. We can then re-write the inhomogeneous 3D line equation as

$\mathbf{r} = \mathbf{p} + \lambda\hat{\mathbf{d}}$. (2.11)

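Inhomogeneous: $\mathbf{r} = (1 - \lambda)\mathbf{p} + \lambda\mathbf{q}$

Homogeneous: $\tilde{\mathbf{r}} = \mu\tilde{\mathbf{p}} + \lambda\tilde{\mathbf{q}}$

Special case: if $\tilde{\mathbf{q}} = (d_x, d_y, d_z, 0)^T = (\mathbf{d}, 0)^T$, then d is the direction of the line and $\mathbf{r} = \mathbf{p} + \lambda\mathbf{d}$.

Problem: too many degrees of freedom.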




3D lines

Plücker matrices: the line through two points p and q is represented by a 4 × 4 skew-symmetric homogeneous matrix:

$L = \mathbf{p}\mathbf{q}^T - \mathbf{q}\mathbf{p}^T$

Note that L has rank 2 (det L = 0). Any plane containing the line is in the nullspace of L.
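The nullspace property follows from Lπ = p(qᵀπ) − q(pᵀπ), which vanishes when both p and q lie on the plane π. A small sketch with the x-axis as the line (the points and planes are chosen for illustration):

```python
def plucker_matrix(p, q):
    """L = p q^T - q p^T for two homogeneous 3D points p and q."""
    return [[p[i] * q[j] - q[i] * p[j] for j in range(4)] for i in range(4)]


def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


p = (0, 0, 0, 1)   # the origin
q = (1, 0, 0, 1)   # the point (1, 0, 0): the line is the x-axis
L = plucker_matrix(p, q)

print(mat_vec(L, (0, 0, 1, 0)))  # [0, 0, 0, 0]: the plane z = 0 contains the line
print(mat_vec(L, (0, 1, 0, 0)))  # [0, 0, 0, 0]: so does the plane y = 0
print(mat_vec(L, (1, 0, 0, 0)))  # [0, 0, 0, 1]: the plane x = 0 meets the line at the origin
```

For a plane that does not contain the line, Lπ is the homogeneous point where the line pierces the plane.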




3D quadrics

Quadric surface: xTQx = 0

Serve as modeling primitives (spheres, ellipsoids, cylinders)




Basic set of 2D planar transformations

Figure 2.4 Basic set of 2D planar transformations: translation, Euclidean, similarity, affine, projective.

Translation. 2D translations can be written as $\mathbf{x}' = \mathbf{x} + \mathbf{t}$ or

$\mathbf{x}' = \begin{bmatrix} I & \mathbf{t} \end{bmatrix} \bar{\mathbf{x}}$ (2.14)

where I is the (2 × 2) identity matrix or

$\bar{\mathbf{x}}' = \begin{bmatrix} I & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix} \bar{\mathbf{x}}$ (2.15)

where 0 is the zero vector. Using a 2 × 3 matrix results in a more compact notation, whereas using a full-rank 3 × 3 matrix (which can be obtained from the 2 × 3 matrix by appending a $[\mathbf{0}^T\ 1]$ row) makes it possible to chain transformations using matrix multiplication. Note that in any equation where an augmented vector such as $\bar{\mathbf{x}}$ appears on both sides, it can always be replaced with a full homogeneous vector $\tilde{\mathbf{x}}$.

Rotation + translation. This transformation is also known as 2D rigid body motion or the 2D Euclidean transformation (since Euclidean distances are preserved). It can be written as $\mathbf{x}' = R\mathbf{x} + \mathbf{t}$ or

$\mathbf{x}' = \begin{bmatrix} R & \mathbf{t} \end{bmatrix} \bar{\mathbf{x}}$ (2.16)

where

$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ (2.17)

is an orthonormal rotation matrix with $RR^T = I$ and $|R| = 1$.

Scaled rotation. Also known as the similarity transform, this transformation can be expressed as $\mathbf{x}' = sR\mathbf{x} + \mathbf{t}$ where s is an arbitrary scale factor. It can also be written as

$\mathbf{x}' = \begin{bmatrix} sR & \mathbf{t} \end{bmatrix} \bar{\mathbf{x}} = \begin{bmatrix} a & -b & t_x \\ b & a & t_y \end{bmatrix} \bar{\mathbf{x}}$, (2.18)

where we no longer require that $a^2 + b^2 = 1$. The similarity transform preserves angles between lines.




2D transformations

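Translation:

$\bar{\mathbf{x}}' = \begin{bmatrix} I & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix} \bar{\mathbf{x}}$

Rotation + Translation:

$\mathbf{x}' = \begin{bmatrix} R & \mathbf{t} \end{bmatrix} \bar{\mathbf{x}}$

Scaled Rotation:

$\mathbf{x}' = \begin{bmatrix} sR & \mathbf{t} \end{bmatrix} \bar{\mathbf{x}}$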




2D transformations

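Affine:

$\mathbf{x}' = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \end{bmatrix} \bar{\mathbf{x}}$

Projective:

$\tilde{\mathbf{x}}' = H\tilde{\mathbf{x}}$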
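Writing every transformation as a full 3 × 3 matrix lets them be chained by matrix multiplication. The sketch below composes a rotation with a translation and also applies a simple projective matrix; all helper names and numeric values are assumptions of this example.

```python
import math


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def similarity(s, theta, tx=0.0, ty=0.0):
    """Scaled rotation plus translation; s = 1 gives a rigid (Euclidean) motion."""
    a, b = s * math.cos(theta), s * math.sin(theta)
    return [[a, -b, tx], [b, a, ty], [0.0, 0.0, 1.0]]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(H, point):
    """Map an inhomogeneous 2D point through H, dividing by the third coordinate."""
    x, y = point
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in H)
    return (u / w, v / w)


# Rotate by 90 degrees first, then translate by (1, 2).
H = mat_mul(translation(1.0, 2.0), similarity(1.0, math.pi / 2))
print(apply(H, (1.0, 0.0)))       # approximately (1.0, 3.0)

# A projective transformation: the last row is no longer (0, 0, 1).
H_proj = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
print(apply(H_proj, (1.0, 1.0)))  # (0.5, 0.5)
```

The order of multiplication matters: the rightmost matrix is applied to the point first.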




Projective transformations

A projectivity is an invertible mapping h from $\mathbb{P}^2$ to itself such that three points $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$ lie on the same line if and only if $h(\mathbf{x}_1), h(\mathbf{x}_2), h(\mathbf{x}_3)$ do.

Theorem

A mapping $h : \mathbb{P}^2 \to \mathbb{P}^2$ is a projectivity if and only if there exists a non-singular 3 × 3 matrix H such that for any point in $\mathbb{P}^2$ represented by a vector x it is true that $h(\mathbf{x}) = H\mathbf{x}$.




Projective transformations

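$\begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$

or $\mathbf{x}' = H\mathbf{x}$ (8 DOF)

projectivity = collineation = projective transformation = homography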




Mapping between planes

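[Figure: central projection through the centre O mapping a point x on the plane π to a point x′ on the plane π′.]

Central projection may be expressed by $\mathbf{x}' = H\mathbf{x}$.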




Removing projective distortion

Select four points in a plane with known coordinates:

$x' = \dfrac{x_1'}{x_3'} = \dfrac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + h_{33}}, \qquad y' = \dfrac{x_2'}{x_3'} = \dfrac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + h_{33}}$

$x'(h_{31}x + h_{32}y + h_{33}) = h_{11}x + h_{12}y + h_{13}$

$y'(h_{31}x + h_{32}y + h_{33}) = h_{21}x + h_{22}y + h_{23}$

(linear in $h_{ij}$, 2 constraints per point, 8 DOF ⇒ 4 points needed)

Remark: no calibration at all is necessary; there are better ways to compute H (see later).
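The two linear constraints per point can be stacked into an 8 × 8 system once the scale is fixed. The sketch below fixes h33 = 1 (an assumption that fails for homographies with h33 = 0) and solves the system exactly with rationals; the function names are hypothetical.

```python
from fractions import Fraction


def solve(A, b):
    """Gauss-Jordan elimination over rationals for a square system A h = b."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular system (degenerate points or h33 = 0)")
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]


def homography_from_4_points(src, dst):
    """Stack x'(h31 x + h32 y + 1) = h11 x + h12 y + h13 and the y' analogue."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        b.append(xp)
        A.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
        b.append(yp)
    h = solve(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], Fraction(1)]]


def apply(H, point):
    x, y = point
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in H)
    return (u / w, v / w)


half = Fraction(1, 2)
src = [(0, 0), (1, 0), (1, 1), (0, 1)]              # unit square
dst = [(0, 0), (half, 0), (half, half), (0, 1)]     # its distorted image
H = homography_from_4_points(src, dst)
print([[int(v) for v in row] for row in H])  # [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
print(apply(H, (2, 2)))                      # (Fraction(2, 3), Fraction(2, 3))
```

Removing projective distortion then amounts to mapping the image through the inverse of the recovered H.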




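More examples

[Figures: a point X on a planar surface imaged as x in image 1 and x′ in image 2, with the two views related by (R, t); and a second two-view configuration in which a point X is imaged as x in image 1 and x′ in image 2.]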




Transformation of lines and conics

For a point transformation: $\mathbf{x}' = H\mathbf{x}$

Transformation for lines: $\mathbf{l}' = H^{-T}\mathbf{l}$

Transformation for conics: $C' = H^{-T} C H^{-1}$

Transformation for dual conics: $C^{*\prime} = H C^* H^T$
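The line rule can be checked by verifying that incidence survives the transformation. Since lines are homogeneous, the sketch below uses the cofactor matrix of H, which equals det(H)·H⁻ᵀ, instead of an explicit inverse; this shortcut and the helper names are choices made for this example.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(M, v):
    return tuple(dot(row, v) for row in M)


def cofactor_matrix(H):
    """Rows are cross products of the rows of H; equals det(H) * H^{-T}."""
    r0, r1, r2 = H
    return [cross(r1, r2), cross(r2, r0), cross(r0, r1)]


H = [(1, 0, 0), (0, 1, 0), (1, 0, 1)]
x = (2, 1, 1)     # the point (2, 1)
l = (0, -1, 1)    # the line y = 1, which passes through x

x_new = mat_vec(H, x)                   # points map with H
l_new = mat_vec(cofactor_matrix(H), l)  # lines map with H^{-T}, up to scale

print(x_new, l_new)        # (2, 1, 3) (-1, -1, 1)
print(dot(l_new, x_new))   # 0: the mapped point still lies on the mapped line
```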




A hierarchy of transformations

Projective linear group PL(n) (for projective transformations of the plane, n = 3)

↪ Affine group: subgroup of PL(3) (last row (0, 0, 1))

↪ Euclidean group: subgroup of the affine group (upper-left 2 × 2 block orthogonal)

↪ Oriented Euclidean group (upper-left 2 × 2 block with determinant 1)

Alternatively, characterize a transformation in terms of the elements or quantities that are preserved (invariant),

e.g. Euclidean transformations leave distances unchanged.




A hierarchy of transformations (2D)

Transformation       Matrix          # DoF   Preserves
translation          [I t]   (2×3)   2       orientation
rigid (Euclidean)    [R t]   (2×3)   3       lengths
similarity           [sR t]  (2×3)   4       angles
affine               [A]     (2×3)   6       parallelism
projective           [H]     (3×3)   8       straight lines

Table 2.1 Hierarchy of 2D coordinate transformations. Each transformation also preserves the properties listed in the rows below it, i.e., similarity preserves not only angles but also parallelism and straight lines. The 2 × 3 matrices are extended with a third $[\mathbf{0}^T\ 1]$ row to form a full 3 × 3 matrix for homogeneous coordinate transformations.

Blinn (1998) describes (in Chapters 9 and 10) the ins and outs of notating and manipulating co-vectors.

While the above transformations are the ones we use most extensively, a number of additional transformations are sometimes used.

Stretch/squash. This transformation changes the aspect ratio of an image,

$x' = s_x x + t_x$

$y' = s_y y + t_y$,

and is a restricted form of an affine transformation. Unfortunately, it does not nest cleanly with the groups listed in Table 2.1.

Planar surface flow. This eight-parameter transformation (Horn 1986; Bergen, Anandan, Hanna et al. 1992; Girod, Greiner, and Niemann 2000),

$x' = a_0 + a_1 x + a_2 y + a_6 x^2 + a_7 xy$

$y' = a_3 + a_4 x + a_5 y + a_7 x^2 + a_6 xy$,

arises when a planar surface undergoes a small 3D motion. It can thus be thought of as a small motion approximation to a full homography. Its main attraction is that it is linear in the motion parameters, $a_k$, which are often the quantities being estimated.




A hierarchy of transformations (3D)

Transformation       Matrix          # DoF   Preserves
translation          [I t]   (3×4)   3       orientation
rigid (Euclidean)    [R t]   (3×4)   6       lengths
similarity           [sR t]  (3×4)   7       angles
affine               [A]     (3×4)   12      parallelism
projective           [H]     (4×4)   15      straight lines

Table 2.2 Hierarchy of 3D coordinate transformations. Each transformation also preserves the properties listed in the rows below it, i.e., similarity preserves not only angles but also parallelism and straight lines. The 3 × 4 matrices are extended with a fourth $[\mathbf{0}^T\ 1]$ row to form a full 4 × 4 matrix for homogeneous coordinate transformations. The mnemonic icons are drawn in 2D but are meant to suggest transformations occurring in a full 3D cube.

Bilinear interpolant. This eight-parameter transform (Wolberg 1990),

$x' = a_0 + a_1 x + a_2 y + a_6 xy$

$y' = a_3 + a_4 x + a_5 y + a_7 xy$,

can be used to interpolate the deformation due to the motion of the four corner points of a square. (In fact, it can interpolate the motion of any four non-collinear points.) While the deformation is linear in the motion parameters, it does not generally preserve straight lines (only lines parallel to the square axes). However, it is often quite useful, e.g., in the interpolation of sparse grids using splines (Section 8.3).

2.1.3 3D transformations

The set of three-dimensional coordinate transformations is very similar to that available for 2D transformations and is summarized in Table 2.2. As in 2D, these transformations form a nested set of groups. Hartley and Zisserman (2004, Section 2.4) give a more detailed description of this hierarchy.

Translation. 3D translations can be written as $\mathbf{x}' = \mathbf{x} + \mathbf{t}$ or

$\mathbf{x}' = \begin{bmatrix} I & \mathbf{t} \end{bmatrix} \bar{\mathbf{x}}$ (2.23)




The circular points

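$\mathbf{I} = \begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix}, \quad \mathbf{J} = \begin{pmatrix} 1 \\ -i \\ 0 \end{pmatrix}$ (eigenvectors of $H_S$)

$\mathbf{I}' = H_S \mathbf{I} = \begin{pmatrix} s\cos\theta & -s\sin\theta & t_x \\ s\sin\theta & s\cos\theta & t_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix} = s e^{-i\theta} \begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix} = \mathbf{I}$ (up to scale)

Result

The circular points I, J are fixed points under the projective transformation H iff H is a similarity.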
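The fixed-point property can be checked with complex arithmetic. In the sketch below the similarity parameters and the non-similarity (an anisotropic stretch) are arbitrary values chosen for illustration, and the helper names are assumptions.

```python
import cmath
import math


def similarity(s, theta, tx, ty):
    a, b = s * math.cos(theta), s * math.sin(theta)
    return [[a, -b, tx], [b, a, ty], [0.0, 0.0, 1.0]]


def mat_vec(M, v):
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def is_proportional(u, v, tol=1e-9):
    """Two homogeneous vectors are the same point when their cross product vanishes."""
    return all(abs(c) < tol for c in cross(u, v))


I = (1, 1j, 0)
H_S = similarity(2.0, 0.3, 5.0, -1.0)
I_mapped = mat_vec(H_S, I)

print(is_proportional(I_mapped, I))                         # True
print(abs(I_mapped[0] - 2.0 * cmath.exp(-0.3j)) < 1e-12)    # True: the scale is s*exp(-i*theta)

H_A = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # affine, not a similarity
print(is_proportional(mat_vec(H_A, I), I))                  # False
```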




The circular points

"Circular points" because every circle intersects $\mathbf{l}_\infty$ at the circular points.

Conic:

$a x_1^2 + b x_1 x_2 + c x_2^2 + d x_1 x_3 + e x_2 x_3 + f x_3^2 = 0$

Circle (a = c = 1, b = 0):

$x_1^2 + x_2^2 + d x_1 x_3 + e x_2 x_3 + f x_3^2 = 0$

Intersection with $\mathbf{l}_\infty$, i.e. at the points with $x_3 = 0$:

$x_1^2 + x_2^2 = 0$




The circular points

Solution:

$\mathbf{I} = (1, i, 0)^T, \quad \mathbf{J} = (1, -i, 0)^T$

Algebraically, the circular points are the orthogonal directions of Euclidean geometry, $(1, 0, 0)^T$ and $(0, 1, 0)^T$, packaged into a single complex conjugate entity:

$\mathbf{I} = (1, 0, 0)^T + i\,(0, 1, 0)^T$




Conic dual to the circular points

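$C^*_\infty = \mathbf{I}\mathbf{J}^T + \mathbf{J}\mathbf{I}^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ (up to scale)

Result

The dual conic $C^*_\infty$ is fixed under the projective transformation H iff H is a similarity.

Notes:

$C^*_\infty$ has 4 DOF: a symmetric 3 × 3 matrix (5 DOF) minus the constraint $\det(C^*_\infty) = 0$.

$\mathbf{l}_\infty$ is the null vector of $C^*_\infty$.

If $\mathbf{l}^T C^*_\infty \mathbf{m} = 0$ then the lines l and m are orthogonal.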
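These notes can be checked numerically by building the dual conic from the circular points. The lines used below are illustrative choices, and the helper names are assumptions of this sketch.

```python
def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]


def bilinear(l, C, m):
    """Evaluate l^T C m."""
    return sum(l[i] * C[i][j] * m[j] for i in range(3) for j in range(3))


def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


I = (1, 1j, 0)
J = (1, -1j, 0)
A, B = outer(I, J), outer(J, I)
C_dual = [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

print(C_dual == [[2, 0, 0], [0, 2, 0], [0, 0, 0]])        # True: 2 * diag(1, 1, 0)
print(mat_vec(C_dual, (0, 0, 1)) == [0, 0, 0])            # True: l_inf is the null vector
print(bilinear((0, -1, 1), C_dual, (-1, 0, 2)) == 0)      # True: y = 1 and x = 2 are orthogonal
print(bilinear((1, 1, 0), C_dual, (1, 0, 0)) == 0)        # False: x + y = 0 and x = 0 are not
```

In the Euclidean frame the test reduces to l1·m1 + l2·m2 = 0, the usual dot product of the line normals.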




Recovery of metric properties from images

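Metric rectification using $C^*_\infty$

If $\mathbf{x}' = H\mathbf{x}$ then

$C^{*\prime}_\infty = (H_P H_A H_S)\, C^*_\infty\, (H_P H_A H_S)^T = \begin{bmatrix} KK^T & KK^T\mathbf{v} \\ \mathbf{v}^T KK^T & \mathbf{v}^T KK^T \mathbf{v} \end{bmatrix}$

Result

Once the conic $C^*_\infty$ is identified on the projective plane, projective distortion may be rectified up to a similarity.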




Metric from projective

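Suppose lines l and m are images of orthogonal lines on the world plane; then $\mathbf{l}^T C^{*\prime}_\infty \mathbf{m} = 0$:

$\begin{pmatrix} l_1 & l_2 & l_3 \end{pmatrix} \begin{bmatrix} KK^T & KK^T\mathbf{v} \\ \mathbf{v}^T KK^T & \mathbf{v}^T KK^T \mathbf{v} \end{bmatrix} \begin{pmatrix} m_1 \\ m_2 \\ m_3 \end{pmatrix} = 0$

$\left(l_1 m_1,\ (l_1 m_2 + l_2 m_1)/2,\ l_2 m_2,\ (l_1 m_3 + l_3 m_1)/2,\ (l_2 m_3 + l_3 m_2)/2,\ l_3 m_3\right)\mathbf{c} = 0$

where $\mathbf{c} = (a, b, c, d, e, f)^T$ is the conic parameter vector of $C^{*\prime}_\infty$.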

