Upload
kathy
View
55
Download
3
Embed Size (px)
DESCRIPTION
Kentaro Toyama and Andrew Blake Microsoft Research Presentation prepared by: Linus Luotsinen. Probabilistic Tracking in a Metric Space. Outline. Introduction Modelling of images and observations Pattern theoretic tracking Learning Learn mixture centers (exemplars) - PowerPoint PPT Presentation
Citation preview
Probabilistic Tracking in a Metric Space
Kentaro Toyama and Andrew Blake Microsoft Research
Presentation prepared by:Linus Luotsinen
Outline
Introduction Modelling of images and observations Pattern theoretic tracking Learning
– Learn mixture centers (exemplars)– Learn kernel parameters (observational likelihood)– Learn dynamic model (transition probabilities)
Practical tracking Results
– Human motion using curve based exemplars– Mouth using exemplars from raw image
Conclusions
Introduction
Metric Mixture, M2
– Combine exemplars in metric space with probabilistic treatments
– Models easily created directly from training set– Dynamic model to deal with occlusion
Problems with other probabilistic approaches– Complex models– Training required for each object to be tracked– Difficult to fully automate
Pattern Theoretic Tracking - Notation
1 2
* * * *1 2
Test images: { , ,..., }
Train images: { , ,..., }Class is defined by a set of exemplars: { , 1, 2,..., }Geometrical transformation: , (known in advance)Pattern theoretic tracking
T
T
k
Z z z z
Z z z zx k K
T A
:
State vector defined by: ( , )z T xk
Metric Functions
True metric function– All constraints
Distance function– Without 3 and 4
1) ( , ) 0, ,2) ( , ) 0,3) ( , ) ( , )4) ( , ) ( , ) ( , )
a b a ba b iff a ba b b aa b b c a c
a
b
c
Modelling of Images and Observations
Patches– Image sub-region– Shuffle distance function
Distance with the most similar pixel in its neighborhood
Curves– Edge maps– Chamfer distance function
Distance to the nearest pixel in the binary images See next slide!
Probabilistic Modelling of Images and Observations
Distance image – dI
Exemplar - TOriginal image
Feature image - I
Tt
I tdT
TI )(1),(chamfer Not a true metric!3) ( , ) ( , )4) ( , ) ( , ) ( , )
I T T IA B B C A C
Curves with Chamfer distance
Pattern Theoretic Tracking
xTz
Geometrical Transform:
- Translation- Affine- Projective…
Observation
Exemplar(from a training set):
- Intensity images- Feature images (edges, corners…)
“patches”
“curves”
Pattern Theoretic Tracking
x1:
x2:
x3:
x4:
Exemplars xk (k=1…K)
z:
Given Image
Geometric Transform Tα (α A)
Tα1Tα2Tα3
kx~1) - A set of K Exemplars.
2) - Distribution of observations around – Likelihood:
k)(αXXzp,
| kxT
~
3) - Prior about dependency between states – Dynamics:
1| tt XXp
t-1 t
Pattern Theoretic Tracking - Learning
1) Find “central” exemplar -
)',(maxminarg'0 zzzzz
MAX
0z
2) “Align” other images to z0 -
mm
mm
zTx
zzT1
01 ),(minarg
Goal - given M images (zm), find K exemplars:
zm xmm=1…M
Learning Mixture Centers
2c
ckk xx )~,~( 1
3) Find K “distinct” exemplars -
4) Cluster the rest by minimal distance -
)~,(minarg kmkm xxk
5) Find new representatives -
)',(maxminarg~'
xxxxxk kx~
Learning Mixture Centers
kx~
z
)~,( kxTz
1) Using a “validation set” find distances between images and their exemplars:
22~(..) d
2) Approx. distances as chi-square:
(to find σ and d)
(..)exp1)|( Z
Xzp
3) Then the observation likelihood is:
Learning Kernel Parameters
221;
dZ
Learning Dynamics
Learn a Markov matrix for by histogramming transitions
Run a first order auto-regressive process (ARP) for , with coefficients calculated using the Yule-
Walker algorithm
M 1( | )t tp k k
1( | )t tp
Practical Tracking
Forward algorithm–
Results are chosen by–
1 1
1 2
1 1 1
( ) ( | , ,..., )
( ) ( | ) ( | ) ( )t t
t t t t t
t t t t t t tk
p X p X Z Z Z
p X p z X p X X p X
ˆ arg max ( )t t tX p Xz1 z2 z3
X1
X2
X3
X4
zT…
t= 1 2 3 T
0
1
0
0
0
0.09
0.01
0.2
…
Results
Tracking human motion– Based on contour edges– Dynamics learned on 5 sequences of 100 frames each
Exemplars Same person, motion not seen in training sequence
Results
Tracking human motion– Based on contour edges– Dynamics learned on 5 sequences of 100 frames each
Different person Different person with occlusion (power of dynamic model)
Results
Tracking person’s mouth motion– Based on raw pixel values– Training sequence was 210 frames captured at 30Hz– Exemplar set was 30 (K=30)
Left image show test sequence Right image show maximum a posteriori
Using L2 distance Using shuffle distance
Results
Tracking ballerina– Larger exemplar sets (K=300)
Conclusions
Metric Mixture (M2) Model– Easier to fully automate learning– Avoid explicit parametric models to describe target objects
Generality– Metrics can be chosen without significant restrictions
Temporal fusion of information for occlusion recovery– Bayesian networks
References
[1] Kentaro Toyama, Andrew Blake, Probabilistic Tracking with Exemplars in a Metric Space, International Journal of Computer Vision, Volume 48, Issue 1, Marr Prize Special Issue, Pages: 9–19, 2002, ISSN:0920-5691.
[2] Jongwoo Lim, CSE 252C: Selected Topics in Vision & Learning. http://www-cse.ucsd.edu/classes/fa02/cse252c/
[3] Eli Schechtman and Neer Saad, Advanced topics in computer and human vision. http://www.wisdom.weizmann.ac.il/~armin/AdvVision02/course.html