Video Clustering

  • 7/31/2019 Video Clustering

    1/44

    From videos to verbs: Mining videos for activities using a cascade of dynamical systems

    Pavan Turaga, Ashok Veeraraghavan, Rama Chellappa

  • Outline

    What is video mining?

    Challenges

    Prior Work

    Overview of proposed algorithms

    Experiments

  • Videos galore ...

  • Video Mining: What is it?

    Isolate activities of interest from long videos

    Identify repetitive activities

    Aid analyst by presenting clusters

  • Challenges

    Unsupervised: Don't know what we are looking for

    Do not know temporal boundaries of an activity

    Do not know how many clusters to find

    Need to be invariant to affine changes, view, execution rate

  • Related Work

    HMMs, stochastic grammars, time-series clustering

    Shot boundary detection: newscasts, sports videos

    Switched linear dynamic systems, subspace angles

  • Where do we fit in?

    Mining Videos for Events using a cascade of dynamical systems

    Clustering, Recognition

    View Invariance, Rate Invariance

    Learning the model, Distance metrics

  • Tiers of processing

    Find repetitive sequences of action elements

    Extract Action-Elements (Temporal Segmentation)

    Low Level Features

  • Tier I: Low-level features

    Any of a wide choice of features depending on domain

    Silhouettes

    Point Trajectories

    Optical Flow

    Kendall's Shape

  • Tier II: Segmentation

    Break video into segments such that each segment can be modelled by a linear dynamic system

    How to segment?

    Curvature in space-time

    Affine Motion-model

    Shape deformation

    Texture

  • Tier III: Sequence of LTI

    Simpler case of SLDS

    Activity composed of segments of consistent motion

    Each segment modeled as LTI system

    $$z(t+1) = A\,z(t) + v(t), \quad v(t) \sim \mathcal{N}(0, Q)$$

    $$f(t) = C\,z(t) + w(t), \quad w(t) \sim \mathcal{N}(0, R)$$
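
To make the LTI segment model concrete, here is a minimal simulation sketch in Python with NumPy. The function name `simulate_lti`, the state and feature dimensions, and the noise levels are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def simulate_lti(A, C, Q, R, z0, T, rng):
    """Draw T observations from z(t+1) = A z(t) + v(t), f(t) = C z(t) + w(t)."""
    d, p = A.shape[0], C.shape[0]
    z = np.asarray(z0, dtype=float)
    F = np.empty((p, T))
    for t in range(T):
        w = rng.multivariate_normal(np.zeros(p), R)   # observation noise w(t) ~ N(0, R)
        F[:, t] = C @ z + w
        v = rng.multivariate_normal(np.zeros(d), Q)   # process noise v(t) ~ N(0, Q)
        z = A @ z + v
    return F

rng = np.random.default_rng(0)
theta = 0.2
# A stable, rotation-like transition matrix (hypothetical choice).
A = 0.98 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
C = rng.standard_normal((5, 2))                       # 5-D features from a 2-D state
F = simulate_lti(A, C, 1e-4 * np.eye(2), 1e-4 * np.eye(5), [1.0, 0.0], 100, rng)
print(F.shape)
```

Each column of `F` plays the role of one frame's feature vector f(t) within a single segment.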

  • Learning the Model: Prediction Error Methods

    Maximum likelihood solution difficult to compute.

    Instead, use Minimum prediction error criterion.

    Solution can be obtained in closed form.
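
The slides do not spell out the closed-form estimator. One well-known closed-form (sub-optimal) estimate from the dynamic-texture literature takes C from an SVD of the feature matrix and A from a least-squares fit on the estimated states. The sketch below assumes that estimator; `fit_lti` and the toy system are hypothetical.

```python
import numpy as np

def fit_lti(F, d):
    """Closed-form estimate of (A, C) from a p x T feature matrix F, state dimension d."""
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    C = U[:, :d]                              # observation matrix: top-d principal directions
    Z = np.diag(s[:d]) @ Vt[:d, :]            # estimated state sequence, d x T
    A = Z[:, 1:] @ np.linalg.pinv(Z[:, :-1])  # least-squares one-step predictor
    return A, C, Z

# Noise-free toy data generated by a known 2-state system (illustrative values).
rng = np.random.default_rng(1)
theta = 0.3
A_true = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])
C_true = np.linalg.qr(rng.standard_normal((6, 2)))[0]
z = np.array([1.0, 0.5])
cols = []
for _ in range(60):
    cols.append(C_true @ z)
    z = A_true @ z
F = np.stack(cols, axis=1)

A_hat, C_hat, _ = fit_lti(F, d=2)
# A is recovered only up to a change of state basis, so compare eigenvalues.
print(np.linalg.eigvals(A_hat))
print(np.linalg.eigvals(A_true))
```

Because the state basis is arbitrary, the estimated matrices themselves need not match the generating ones; basis-independent quantities such as the eigenvalues of A are the fair comparison.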

  • Is the learnt model any good?

    A very useful test for a class of generative models is to synthesize from it

    Ulf Grenander, Father of Pattern theory

  • Is the learnt model any good?

  • Distance Metric for ARMA models

    Principal angles between column spaces of observability matrices of the two models

    Three types of distances, each a function of the principal angles $\{\theta_i\}$
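
A hedged sketch of the computation: the observability matrix is truncated to `m` blocks (an assumption made for practicality), and principal angles come from the singular values of the product of orthonormal bases. The slide text does not name its three distances; the Martin, gap, and Frobenius distances used below are the ones commonly defined on principal angles in the system-identification literature and are assumed here.

```python
import numpy as np

def observability(A, C, m):
    """Stack C, CA, ..., CA^(m-1): a finite truncation of the observability matrix."""
    blocks, M = [], C.copy()
    for _ in range(m):
        blocks.append(M)
        M = M @ A
    return np.vstack(blocks)

def principal_angles(A1, C1, A2, C2, m=10):
    """Principal angles between the column spaces of two truncated observability matrices."""
    Q1, _ = np.linalg.qr(observability(A1, C1, m))
    Q2, _ = np.linalg.qr(observability(A2, C2, m))
    cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    return np.arccos(np.clip(cosines, 0.0, 1.0))

def distances(theta):
    """Martin, gap, and Frobenius distances computed from principal angles theta."""
    martin = np.sqrt(-2.0 * np.sum(np.log(np.cos(theta))))   # d_M^2 = -ln prod cos^2(theta_i)
    gap = np.sin(np.max(theta))                              # sine of the largest angle
    frobenius = np.sqrt(2.0 * np.sum(np.sin(theta) ** 2))
    return martin, gap, frobenius

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.2], [-0.2, 0.9]])
C = rng.standard_normal((4, 2))
P = np.array([[2.0, 1.0], [0.0, 1.0]])                       # change of state basis
theta_same = principal_angles(A, C, np.linalg.inv(P) @ A @ P, C @ P)
theta_diff = principal_angles(A, C, np.array([[0.5, 0.0], [0.0, 0.3]]), C)
print(distances(theta_same))   # same model in another basis: all close to zero
print(distances(theta_diff))   # genuinely different dynamics: clearly non-zero
```

Working with column spaces rather than the matrices themselves is what makes the distance insensitive to the arbitrary choice of state basis.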

  • Clustering

    Do not know number of clusters

    Multibody Factorization Approach

    Perform row-column permutations

  • Guessing the number of clusters

    Spectral Graph Theory

    Construct normalized Laplacian

    Multiplicity of zero eigenvalue = number of connected components in graph (idealized case)

    Practical scenario: look for the elbow in the sorted eigenvalues
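
The eigenvalue heuristic can be sketched as follows. The Gaussian affinity, the bandwidth `sigma`, the cap `max_k`, and the rule "largest gap between consecutive eigenvalues" are assumptions chosen for illustration; the slides only state the Laplacian and the elbow idea.

```python
import numpy as np

def estimate_num_clusters(D, sigma=1.0, max_k=10):
    """Guess the cluster count from the largest eigengap of the normalized Laplacian."""
    W = np.exp(-(D ** 2) / (2.0 * sigma ** 2))     # Gaussian affinity from pairwise distances
    np.fill_diagonal(W, 0.0)                       # no self-loops
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Normalized Laplacian L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(D)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(L))[:max_k + 1]
    gaps = np.diff(eigvals)
    # k near-zero eigenvalues are followed by a jump, so the elbow sits after index k-1.
    return int(np.argmax(gaps)) + 1, eigvals

# Toy data: three well-separated groups of 2-D points (hypothetical).
rng = np.random.default_rng(0)
points = np.concatenate([c + 0.2 * rng.standard_normal((8, 2)) for c in (0.0, 10.0, 20.0)])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
k, eigvals = estimate_num_clusters(D)
print(k)
print(np.round(eigvals, 3))
```

In the application described here, `D` would hold pairwise model distances between segments rather than Euclidean distances between points.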

  • So far

    Model activities as a sequence of LTI systems

    Segment video stream into subsequences

    Learn model parameters for each segment

    Cluster the segments

    Identify repetitive sequences of labels
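
The last step can be illustrated with simple n-gram counting over the per-segment cluster labels. The slides do not say how repeated label sequences are found, so this counting scheme, the function name, and the toy labels are assumptions.

```python
from collections import Counter

def repeated_label_sequences(labels, length, min_count=2):
    """Return label n-grams of the given length that occur at least min_count times."""
    grams = Counter(tuple(labels[i:i + length]) for i in range(len(labels) - length + 1))
    return {g: c for g, c in grams.items() if c >= min_count}

# Hypothetical cluster labels for consecutive segments of one video.
labels = ["a", "b", "c", "a", "b", "c", "d", "a", "b", "c"]
print(repeated_label_sequences(labels, length=3))   # {('a', 'b', 'c'): 3}
```

Here the label string "a b c" recurs three times, which would flag it as a candidate repeated activity.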

  • View and rate variations

  • Building Invariances: Motivation

    Feature transforms

    Reflected in observation matrix C

    Estimate transform parameters from segments C1, C2

    Change in execution rate

    Reflected in state transition matrix A

    Estimate relative sampling rate between segments A1, A2

  • Spatial transforms and the Observation matrix

    $$f_1(t) = C_1\,z(t) + w(t), \quad w(t) \sim \mathcal{N}(0, R)$$

    $$f_2(t) = C_2\,z(t) + w(t), \quad w(t) \sim \mathcal{N}(0, R)$$

    Transform T?

  • Invariances

  • Implication

    If two sequences are related by an affine transform, then the corresponding principal components are also related by the same affine transform

    Thus, affine transforms can be estimated from the C matrices

  • Affine Transforms

  • View Invariance

    Result may be extended to view changes in a limited way

    Valid when perspective distortion can be approximated by an affinity

  • Compensating

    Let S1 = (A1,C1) and S2 = (A2,C2)

    Distance after compensation: $$\min_{T} \; d(T(S_1), S_2)$$

    where T(S1) = (A1, T(C1)), and T belongs to an appropriate transformation group.

  • Performing the minimization

    Optimization Procedures

    Gradient based

    Direct search

    Stochastic approaches

    Direct Methods: Used when gradients cannot be computed

    Nelder-Mead (Simplex) procedure is extremely popular
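
A small sketch of a direct-search minimization with SciPy's Nelder-Mead implementation. Several things are assumed for illustration: features are stacked as (x, y) landmark pairs, the transform is a 2x2 linear map without translation, and the cost is the sum of squared sines of the principal angles between the transformed C1 and C2. The helpers `apply_affine`, `subspace_distance`, and `cost` are hypothetical names.

```python
import numpy as np
from scipy.optimize import minimize

def apply_affine(M, C):
    """Apply a 2x2 linear map M to every (x, y) landmark row-pair of C (shape 2N x d)."""
    n = C.shape[0] // 2
    blocks = C.reshape(n, 2, -1)
    return np.einsum("ij,njk->nik", M, blocks).reshape(C.shape)

def subspace_distance(X, Y):
    """Sum of squared sines of the principal angles between span(X) and span(Y)."""
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    cosines = np.clip(np.linalg.svd(Qx.T @ Qy, compute_uv=False), 0.0, 1.0)
    return float(np.sum(1.0 - cosines ** 2))

def cost(params, C1, C2):
    """Distance between the transformed C1 and C2 for a candidate transform."""
    return subspace_distance(apply_affine(params.reshape(2, 2), C1), C2)

rng = np.random.default_rng(0)
C1 = rng.standard_normal((10, 2))                 # 5 landmarks, 2-D state (toy sizes)
M_true = np.array([[1.1, -0.4], [0.45, 1.15]])    # hypothetical ground-truth transform
# C2 is the transformed C1 expressed in a different state basis.
C2 = apply_affine(M_true, C1) @ np.array([[2.0, 1.0], [0.0, 1.0]])

x0 = np.eye(2).ravel()                            # start from the identity transform
f0 = cost(x0, C1, C2)
res = minimize(cost, x0, args=(C1, C2), method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-12, "maxiter": 4000})
print(f0, res.fun)
```

The cost is unchanged when the candidate transform is rescaled, so the minimizer is not unique; what matters is how small the residual distance gets, not the exact recovered matrix.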

  • Time warp and the transition matrix

    $$z_1(t+1) = A_1\,z_1(t) + v(t), \quad v(t) \sim \mathcal{N}(0, Q)$$

    $$z_2(t+1) = A_2\,z_2(t) + v(t), \quad v(t) \sim \mathcal{N}(0, Q)$$

    Warp factor q?
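
The derivation behind the warp factor is not present in the extracted text. A common modelling assumption is that if the second segment is the first one resampled in time by a factor q, then A2 is approximately A1 raised to the power q. Under that assumption, and with both matrices expressed in the same state basis, q can be estimated by a simple one-dimensional search, sketched below with hypothetical helper names.

```python
import numpy as np

def matrix_power_real(A, q):
    """A**q for real, possibly fractional q, via eigendecomposition (A assumed diagonalizable)."""
    w, V = np.linalg.eig(A)
    # Scale each eigenvector column by its eigenvalue raised to q (principal branch).
    Aq = (V * np.power(w.astype(complex), q)) @ np.linalg.inv(V)
    return Aq.real

def estimate_warp(A1, A2, q_grid):
    """Grid search for the q that minimizes ||A1**q - A2|| in the Frobenius norm."""
    errors = [np.linalg.norm(matrix_power_real(A1, q) - A2, "fro") for q in q_grid]
    return float(q_grid[int(np.argmin(errors))])

theta = 0.3
A1 = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
A2 = A1 @ A1                                   # same activity executed twice as fast
q_hat = estimate_warp(A1, A2, np.linspace(0.5, 3.0, 251))
print(q_hat)
```

The grid search keeps the sketch dependency-free; a direct-search optimizer over q would serve equally well.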

  • Invariance to Execution rate

  • Some experiments

  • Visualizing the clusters

    Bend

    Throw

    Phone

    Bat

    Squat

  • A recognition experiment

  • Far-field experiment

              Left    Right
    Top       5/6     6/6
    Bottom    4/6     5/6

  • Model Order Selection: USF gait experiment

  • Change of View

  • Thank you!

    Questions welcome.
