7/31/2019 Video Clustering
1/44
From videos to verbs: Mining videos for activities using a cascade of dynamical systems
Pavan Turaga, Ashok Veeraraghavan, Rama Chellappa
Outline
What is video mining?
Challenges
Prior Work
Overview of proposed algorithms
Experiments
Videos galore ..
Video Mining: What is it?
Isolate activities of interest from long videos
Identify repetitive activities
Aid analyst by presenting clusters
Challenges
Unsupervised: we don't know what we are looking for
Do not know temporal boundaries of an activity
Do not know how many clusters to find
Need to be invariant to affine transforms, view changes, and execution rate
Related Work
HMMs
Stochastic grammars
Time series clustering
Shot boundary detection
Newscast, sports videos
Switched linear dynamic systems
Subspace angles
Where do we fit in?
Mining videos for events using a cascade of dynamical systems
Clustering
Recognition
View invariance
Rate invariance
Learning the model
Distance metrics
Tiers of processing
Find repetitive sequences of action elements
Extract action elements (temporal segmentation)
Low Level Features
Tier I: Low-level features
Any of a wide choice of features depending on domain
Silhouettes
Point trajectories
Optical flow
Kendall's shape
Tier II: Segmentation
Break video into segments such that each segment can be modelled by a linear dynamic system
How to segment?
Curvature in space-time
Affine motion model
Shape deformation
Texture
Tier III: Sequence of LTI
Simpler case of SLDS
Activity composed of segments of consistent motion
Each segment modeled as an LTI system
$$z(t+1) = A z(t) + v(t), \quad v(t) \sim N(0, Q)$$
$$f(t) = C z(t) + w(t), \quad w(t) \sim N(0, R)$$
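To make the per-segment model concrete, the sketch below draws a feature sequence from an LTI system of the form z(t+1) = A z(t) + v(t), f(t) = C z(t) + w(t). The function name `simulate_lti`, the dimensions, and every numeric value are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def simulate_lti(A, C, Q, R, z0, n_steps, rng):
    """Sample f(0), ..., f(n_steps-1) from the state-space model.

    State update:  z(t+1) = A z(t) + v(t),  v(t) ~ N(0, Q)
    Observation:   f(t)   = C z(t) + w(t),  w(t) ~ N(0, R)
    """
    d, p = A.shape[0], C.shape[0]
    z = np.asarray(z0, dtype=float)
    feats = np.empty((n_steps, p))
    for t in range(n_steps):
        w = rng.multivariate_normal(np.zeros(p), R)  # observation noise
        feats[t] = C @ z + w
        v = rng.multivariate_normal(np.zeros(d), Q)  # process noise
        z = A @ z + v
    return feats

# Hypothetical 2-state system observed through 5 features.
rng = np.random.default_rng(0)
theta = 0.1
A = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])  # slowly decaying rotation
C = rng.standard_normal((5, 2))
Q = 1e-4 * np.eye(2)
R = 1e-4 * np.eye(5)
features = simulate_lti(A, C, Q, R, z0=[1.0, 0.0], n_steps=100, rng=rng)
```

Each row of `features` plays the role of one frame's low-level feature vector.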
Learning the Model: Prediction Error Methods
Maximum likelihood solution difficult to compute.
Instead, use Minimum prediction error criterion.
Solution can be obtained in closed form.
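The slide does not spell out the closed-form solution. One commonly used closed-form estimate for models of this kind takes C from an SVD of the feature matrix and A from a least-squares one-step predictor on the estimated states. The sketch below assumes that estimator; `fit_lti` and the synthetic data are hypothetical names and values, and mean subtraction and noise covariance estimation are left out.

```python
import numpy as np

def fit_lti(features, order):
    """Closed-form estimate of (A, C) from a (n_frames, p) feature array."""
    F = features.T                                    # p x n_frames, one column per frame
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    C_hat = U[:, :order]                              # orthonormal observation matrix
    Z_hat = np.diag(s[:order]) @ Vt[:order]           # estimated states, order x n_frames
    # Least-squares one-step predictor: minimizes ||Z[:, 1:] - A Z[:, :-1]||_F
    A_hat = Z_hat[:, 1:] @ np.linalg.pinv(Z_hat[:, :-1])
    return A_hat, C_hat, Z_hat

# Noise-free synthetic segment from a made-up 2-state system.
rng = np.random.default_rng(0)
theta = 0.1
A_true = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])
C_true = rng.standard_normal((5, 2))
z = np.array([1.0, 0.0])
frames = []
for _ in range(60):
    frames.append(C_true @ z)
    z = A_true @ z
A_hat, C_hat, Z_hat = fit_lti(np.array(frames), order=2)
```

The estimate is only defined up to a change of state basis, so `A_hat` should be compared with `A_true` through basis-independent quantities such as eigenvalues.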
Is the learnt model any good?
"A very useful test for a class of generative models is to synthesize from it"
Ulf Grenander, father of pattern theory
Is the learnt model any good ?
Distance Metric for ARMA models
Principal angles $\{\theta_i\}$ between column spaces of observability matrices of the two models
Three types of distances based on $\{\theta_i\}$
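A minimal sketch of the principal-angle computation, assuming a finite truncation of the observability matrix (m stacked blocks C, CA, ..., CA^(m-1)). The Frobenius-type distance shown is one common function of the principal angles; whether it is one of the three distances on the slide is an assumption. All function names and matrices are illustrative.

```python
import numpy as np

def observability(A, C, m):
    """Stack C, CA, ..., CA^(m-1): a finite truncation of the observability matrix."""
    blocks, M = [], C.copy()
    for _ in range(m):
        blocks.append(M)
        M = M @ A
    return np.vstack(blocks)

def principal_angles(A1, C1, A2, C2, m=10):
    """Principal angles between the column spaces of the two observability matrices."""
    Q1, _ = np.linalg.qr(observability(A1, C1, m))   # orthonormal basis, model 1
    Q2, _ = np.linalg.qr(observability(A2, C2, m))   # orthonormal basis, model 2
    cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

def frobenius_distance(angles):
    """sqrt(2 * sum(sin^2 theta_i)); assumed here as one possible choice."""
    return np.sqrt(2.0 * np.sum(np.sin(angles) ** 2))

rng = np.random.default_rng(1)
A1 = np.array([[0.9, 0.2], [-0.2, 0.9]])
C1 = rng.standard_normal((6, 2))
T = np.array([[2.0, 1.0], [0.5, 1.5]])               # arbitrary change of state basis
A1b, C1b = T @ A1 @ np.linalg.inv(T), C1 @ np.linalg.inv(T)
A2 = np.array([[0.5, 0.0], [0.0, 0.3]])
C2 = rng.standard_normal((6, 2))

d_same = frobenius_distance(principal_angles(A1, C1, A1b, C1b))  # same system, new basis
d_diff = frobenius_distance(principal_angles(A1, C1, A2, C2))    # unrelated system
```

Because the distance depends only on column spaces, re-expressing a model in a different state basis leaves it (numerically) at zero, which is why it suits models learnt independently per segment.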
Clustering
Do not know number of clusters
Multibody factorization approach
Perform row-column permutations
Guessing the number of clusters
Spectral Graph Theory
Construct normalized Laplacian
Multiplicity of zero eigenvalue = number of connected components in graph (idealized case)
Practical scenario: look for the elbow in the eigenvalue plot
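The eigenvalue argument above can be sketched as follows, assuming a symmetric similarity matrix W with strictly positive row sums. The "elbow" is approximated here by the largest gap between consecutive sorted eigenvalues, which is one simple heuristic and not necessarily the rule used in the talk. `estimate_num_clusters` is a hypothetical name.

```python
import numpy as np

def estimate_num_clusters(W, max_k=None):
    """Guess the number of clusters from the spectrum of the normalized Laplacian."""
    W = np.asarray(W, dtype=float)
    deg = W.sum(axis=1)                      # assumed strictly positive
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(L))
    max_k = max_k or len(W) - 1
    gaps = np.diff(eigvals[: max_k + 1])     # gaps[i] = lambda_{i+1} - lambda_i
    return int(np.argmax(gaps)) + 1, eigvals

# Idealized case: three disconnected groups of 4, 3 and 5 segments.
sizes = [4, 3, 5]
W = np.zeros((12, 12))
start = 0
for n in sizes:
    W[start:start + n, start:start + n] = 1.0
    start += n
k, eigvals = estimate_num_clusters(W)
```

In this idealized block-diagonal case the zero eigenvalue has multiplicity three; with noisy similarities the small eigenvalues move away from zero and only the gap remains as a cue.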
So far
Model activities as a sequence of LTI systems
Segment video stream into subsequences
Learn model parameters for each segment
Cluster the segments
Identify repetitive sequences of labels
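For the last step, once every segment carries a cluster label, repeated activities show up as recurring label subsequences. The sketch below counts fixed-length n-grams of labels; this is a deliberately simple stand-in, and both the helper `repeated_ngrams` and the label string are assumptions made for illustration.

```python
from collections import Counter

def repeated_ngrams(labels, n, min_count=2):
    """Return label n-grams that occur at least min_count times (overlaps allowed)."""
    grams = Counter(tuple(labels[i:i + n]) for i in range(len(labels) - n + 1))
    return {g: c for g, c in grams.items() if c >= min_count}

# Hypothetical cluster labels for ten consecutive segments.
labels = list("ABCABCDABC")
repeats = repeated_ngrams(labels, n=3)
```

Here the subsequence A, B, C is the only 3-gram that recurs, so it would be reported as a candidate repeated activity.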
View and rate variations
Building Invariances: Motivation
Feature transforms
Reflected in observation matrix C
Estimate transform parameters from segments C1, C2
Change in execution rate
Reflected in state transition matrix A
Estimate relative sampling rate between segments A1, A2
Spatial transforms and the Observation matrix
$$f_1(t) = C_1 z(t) + w(t), \quad w(t) \sim N(0, R)$$
$$f_2(t) = C_2 z(t) + w(t), \quad w(t) \sim N(0, R)$$
Transform T?
Invariances
Implication
If two sequences are related by an affine transform, then the corresponding principal components are also related by the same affine transform
Thus, affine transforms can be estimated from the C matrices
Affine Transforms
View Invariance
Result may be extended to view changes in a limited way
Valid when perspective distortion can be approximated by an affinity
Compensating
Let S1 = (A1, C1) and S2 = (A2, C2)
$$d_c(S_1, S_2) = \min_T d(T(S_1), S_2)$$
where T(S1) = (A1, T(C1)), and T ranges over an appropriate transformation group.
Performing the minimization
Optimization Procedures
Gradient based
Direct search
Stochastic approaches
Direct Methods: Used when gradients cannot be computed
Nelder-Mead (Simplex) procedure is extremely popular
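As a rough illustration of a direct-search minimization over transform parameters, the sketch below uses SciPy's `minimize` with `method="Nelder-Mead"` to recover a rotation angle and scale relating two observation matrices. Several simplifications are assumed: the rows of C are taken to be stacked (x, y) landmark coordinates, the transform family is restricted to 2-D similarities, and a least-squares residual stands in for the model distance, which also ignores the state-basis ambiguity of learnt models. All names and values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def similarity(params):
    """2x2 rotation-plus-scale matrix from (angle, scale)."""
    angle, scale = params
    c, s = np.cos(angle), np.sin(angle)
    return scale * np.array([[c, -s], [s, c]])

def transform_C(M, C):
    """Apply the 2x2 map M to every (x, y) landmark pair stacked in the rows of C."""
    k = C.shape[0] // 2
    return np.einsum("ij,kjd->kid", M, C.reshape(k, 2, -1)).reshape(C.shape)

def cost(params, C1, C2):
    """Stand-in objective: squared residual between T(C1) and C2."""
    return np.linalg.norm(transform_C(similarity(params), C1) - C2) ** 2

rng = np.random.default_rng(2)
C1 = rng.standard_normal((12, 2))              # 6 landmarks, 2 state dimensions
true_params = np.array([0.3, 1.2])             # made-up angle (rad) and scale
C2 = transform_C(similarity(true_params), C1)

result = minimize(cost, x0=[0.0, 1.0], args=(C1, C2), method="Nelder-Mead",
                  options={"xatol": 1e-8, "fatol": 1e-12, "maxiter": 2000})
estimated_params = result.x
```

Nelder-Mead needs only objective evaluations, which is convenient when the objective is a subspace-angle distance whose gradient with respect to the transform parameters is awkward to derive.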
Time warp and the transition matrix
$$z_1(t+1) = A_1 z_1(t) + v(t), \quad v(t) \sim N(0, Q)$$
$$z_2(t+1) = A_2 z_2(t) + v(t), \quad v(t) \sim N(0, Q)$$
Warp factor q?
Invariance to Execution rate
Some experiments
Visualizing the clusters
Clusters shown: Bend, Throw, Phone, Bat, Squat
A recognition experiment
Far-field experiment
          Left   Right
Top       5/6    6/6
Bottom    4/6    5/6
Model Order Selection: USF gait experiment
Change of View
Thank You !
Questions welcome.
It can be shown that
If two sequences are related by an affine transform, then the corresponding principal components are also related by the same affine transform
Thus, affine transforms can be estimated from the C matrices
Compensating
Let S1 = (A1, C1) and S2 = (A2, C2)
where T(S1) = (A1, T(C1))
The Nelder-Mead (simplex) procedure is used to perform the minimization