ijunejo
Introduction
• Computer Vision:
Analysis of visual data in an intelligent manner
• Low-level vision (1)
• High-level vision (2)
• The transition from:
Static Images → Video Content
6/6/2012 2
Dynamic Scene Analysis
• Interaction of multiple agents in a specific context and particular environment
• Activities reoccur over time and co-occur in time
• Scene analysis gives an understanding of:
– where objects are located,
– what is happening,
– how they interact over a period of time
Related Work
• All the works start with some feature
extraction.
• Existing works in the literature are:
1. Trajectory-based
o Many require object detection
o Difficulties in handling occlusions
2. Optical Flow based
o Tracks motion between frames
o Preferable for complex videos as
it is fast and robust
Video Scene Understanding Using Multi-scale Analysis [Yang et al.]
─ Uses optical flow and
Bag-of-words representation
─ Each pixel is assigned
a codeword
─ Uses diffusion maps; clustering, done with a spectral
analysis technique, reveals the motion patterns
Trajectory Series Analysis based Event Rule Induction for Visual
Surveillance [Zhang et al.]
• Trajectories are used to find a set of behavior rules, followed by clustering
• Hidden Markov Models are used to detect primitive events
• Event rule representation is based on Stochastic Context-Free Grammar and extended with temporal logic
• Event rule induction is performed to discover the hidden temporal structures between primitive events using the Minimum Description Length algorithm
Random Field Topic Model for Semantic Region Analysis in Crowded Scenes from
Tracklets [Zhou et al.]
• Tracklets are observed within a short period
• A Random Field Topic Model is integrated with
Markov Random Field to enforce spatial and
temporal coherence during the learning process
• Tracklets are grouped into one topic
• Pairwise MRF: connects neighboring tracklets
• Tracklets that are spatially and temporally close
have similar distributions over semantic regions
General Steps
(1) Feature Extraction
(2) Event Modeling
(3) Event Recognition
Atomic Event
• Involves a single object
• Represented by motion patterns
• Indicates the spatial properties
Composite Event
• Multiple atomic events taking place in space & time: complex activities
• Behavioral interaction: results in spatio-temporal patterns
Problem Statement
Given a video of a scene acquired by a static
camera:
– Identify regions of different dynamics
– Learn spatio-temporal patterns in the scene and
interpret the semantics within
– Detect abnormal events based on a normalcy
model
Feature Extraction: Mean-shift Tracking
1. We need to detect and track objects of interest, i.e. vehicles.
2. Target characterization
a) By a circular region in the image, i.e. color PDF of target pixels
3. Target localization
a) Update the model in frame
Spectral Clustering: Data Representation
o Nodes (1,2,..,n): Trajectories
o Edge weights (w): Similarity measure (Dynamic Time Warping Distance)
Graph: nodes 1, 2, .., n joined by edges of weight w
Adjacency matrix:
     t1    t2    …   tn
t1   0     0.5   …   0.75
t2   0.5   0     …   0.66
…    …     …     …   …
tn   0.75  0.66  …   0
We aim to cluster trajectories into distinct events in the
scene.
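The Dynamic Time Warping distance used for the edge weights can be sketched as follows. This is a minimal textbook version, assuming trajectories are lists of (x, y) points and a Euclidean point-to-point cost; the function names are illustrative and not taken from the slides.

```python
import math

def dtw_distance(traj_a, traj_b):
    """Dynamic Time Warping distance between two sequences of (x, y) points."""
    n, m = len(traj_a), len(traj_b)
    inf = float("inf")
    # cost[i][j]: best cumulative cost of aligning the first i points of
    # traj_a with the first j points of traj_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(traj_a[i - 1], traj_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a point of traj_b
                                 cost[i][j - 1],      # repeat a point of traj_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def distance_matrix(trajectories):
    """Symmetric n x n matrix of pairwise DTW distances with a zero diagonal."""
    n = len(trajectories)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = dtw_distance(trajectories[i], trajectories[j])
    return w
```

Because DTW allows one point to align with several, trajectories of different lengths (for example, vehicles moving at different speeds along the same path) can still be compared.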
Spectral Clustering: Steps
(n x n) Affinity Matrix
• Form the Laplacian matrix and compute the eigenvectors of its K largest eigenvalues
• K estimated from the distortion score
Eigenvector Matrix
• Cluster eigenvectors
• Assign trajectory points to corresponding clusters
K-means Clustering
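One common way to write these steps down is the normalized formulation below. It is a sketch under assumptions: the slides do not say how a DTW distance d_ij is turned into an affinity, so the Gaussian kernel and its scale σ are illustrative choices, and the normalization shown is the widely used symmetric one rather than a form confirmed by the source.

```latex
A_{ij} = \exp\!\left(-\frac{d_{ij}^{2}}{2\sigma^{2}}\right), \qquad
D_{ii} = \sum_{j=1}^{n} A_{ij}, \qquad
L = D^{-1/2} A \, D^{-1/2}
```

The eigenvectors of L belonging to its K largest eigenvalues are stacked as columns of an n x K matrix; each row, normalized to unit length, represents one trajectory and is passed to K-means.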
Video Association Mining
• We want to uncover unknown patterns in the
scene
• We want to focus on relationships occurring
within time-intervals rather than just points in
time
• Temporal Pattern Mining: Used to discover
interesting patterns in the scene
• Association Rule Mining: Helps predict future
scene dynamics
What is a Frequent Pattern?
• Frequent Temporal Pattern (FTP): Occurs many times in the data; indicates co-occurring and recurring activities in the scene
• A temporal pattern composed of k events is called a k-pattern
• Relationships amongst events are encoded using Allen’s temporal logic
• Each temporal pattern is appended with its time duration
[Figure: a 3-pattern over events A, B and C, labeled with event, relationship and duration]
Allen’s First-Order Interval Logic
X overlaps Y: startX < startY < endX < endY
duration = endX − startY
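The interval relations used throughout the deck can be tested directly from start and end times. The sketch below assumes intervals are (start, end) tuples with start < end, covers the seven forward relations named on the slides, and takes the overlap duration to be the length of the shared span; the function names are illustrative.

```python
def allen_relation(x, y):
    """Return the Allen relation of interval x to interval y, or None when
    only an inverse relation (y to x) applies."""
    xs, xe = x
    ys, ye = y
    if xe < ys:
        return "before"
    if xe == ys:
        return "meets"
    if xs == ys and xe == ye:
        return "equals"
    if xs == ys and xe < ye:
        return "starts"
    if xe == ye and xs > ys:
        return "finishes"
    if xs > ys and xe < ye:
        return "during"
    if xs < ys < xe < ye:
        return "overlaps"
    return None

def overlap_duration(x, y):
    """Length of the shared span when x overlaps y (assumed definition)."""
    return x[1] - y[0]
```

For example, with X = (0, 7) and Y = (3, 11) the relation is "overlaps" and the shared span runs from 3 to 7.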
Interval-Based Event Miner: Algorithm
Level-by-Level Discovery Process
• IEMiner: based on the Apriori principle of item-set
mining
• Apriori principle: Every subset of a frequent k-pattern
also has to be frequent
[Figure: level-by-level loop: frequent k-patterns → (1) Candidate Generation → candidate (k+1)-patterns → (2) Support Counting]
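The Apriori principle gives a cheap pruning test before any support counting. The sketch below is a simplification: it treats a pattern as a plain set of event labels and ignores the temporal relations that IEMiner also tracks, so it illustrates the principle rather than the miner itself. The function name is hypothetical.

```python
from itertools import combinations

def prune_candidates(candidates, frequent_k):
    """Keep a (k+1)-candidate only if every one of its k-subsets is frequent."""
    frequent = {frozenset(p) for p in frequent_k}
    kept = []
    for cand in candidates:
        cand = frozenset(cand)
        k = len(cand) - 1
        # A single infrequent k-subset rules the candidate out.
        if all(frozenset(sub) in frequent for sub in combinations(cand, k)):
            kept.append(cand)
    return kept
```

With frequent 2-sets {A,B}, {A,C}, {B,C} and {A,D}, the candidate {A,B,C} survives while {A,B,D} is dropped because {B,D} is not frequent.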
Input: List of Event Sequences
• Each event sequence consists of a
sequence of triplets:
{event_label,start_time,end_time}
No  Event Sequence
1   {A,0,5} {B,0,9} {C,9,11}
2   {C,0,7} {A,3,11} {B,9,11}
3   {A,0,11} {C,1,6} {D,1,5}
4   {A,0,4} {C,0,3} {E,6,7} {G,7,11}
Obtain single frequent events:
Event  Count
A      4
B      2
C      4
D      1
E      1
G      1
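The single-event counts can be reproduced from the four example sequences. In this sketch each sequence is a list of (event_label, start_time, end_time) triplets copied from the slide; the `min_count` threshold is a hypothetical parameter, since the slide does not state the cut-off it used.

```python
from collections import Counter

sequences = [
    [("A", 0, 5), ("B", 0, 9), ("C", 9, 11)],
    [("C", 0, 7), ("A", 3, 11), ("B", 9, 11)],
    [("A", 0, 11), ("C", 1, 6), ("D", 1, 5)],
    [("A", 0, 4), ("C", 0, 3), ("E", 6, 7), ("G", 7, 11)],
]

def event_support(seqs):
    """Count, for each event label, the number of sequences containing it."""
    counts = Counter()
    for seq in seqs:
        # A set is used so a label counts once per sequence.
        counts.update({label for label, _, _ in seq})
    return counts

def frequent_events(seqs, min_count):
    """Labels whose sequence count reaches min_count (assumed threshold)."""
    return sorted(l for l, c in event_support(seqs).items() if c >= min_count)
```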
(1) Candidate Generation: Bottom-up approach
FIRST STEP:
GENERATE SET OF 2-PATTERNS
No  Event Sequence
1   {A,0,5} {B,0,9} {C,9,11}
2   {C,0,7} {A,3,11} {B,9,11}
3   {A,0,11} {C,1,6} {D,1,5}
4   {A,0,4} {C,0,3} {E,6,7} {G,7,11}
Form composite events, e.g.:
A starts B
C overlaps A
...
(1) Candidate Generation: Bottom-up approach
SECOND STEP: GENERATE (K+1)-PATTERNS FROM FREQUENT
K-PATTERNS AND 2-PATTERNS
LEVEL 2: K = 2
2-patterns: A starts B, C overlaps A, ...
2-patterns: A equals C, A overlaps B, ...
Candidate 3-patterns: overlaps(C overlaps A) B, ...
(2) Support Counting: Single-pass Procedure
• The support of a temporal pattern (TP) indicates the number of
event sequences in which the pattern occurs
• For a pattern to be classified as frequent, it should have a support value higher than
the user-specified min. support threshold
Determine frequency of
candidate patterns by
counting occurrences
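Support counting for 2-patterns can be sketched on the four example sequences. The assumptions here are labeled: the `relation` helper is deliberately reduced to four relations, a 2-pattern is written as a (label, relation, label) tuple, and the frequency test compares the fraction of sequences against the threshold, which is one plausible reading of a threshold such as 0.02.

```python
from itertools import permutations

sequences = [
    [("A", 0, 5), ("B", 0, 9), ("C", 9, 11)],
    [("C", 0, 7), ("A", 3, 11), ("B", 9, 11)],
    [("A", 0, 11), ("C", 1, 6), ("D", 1, 5)],
    [("A", 0, 4), ("C", 0, 3), ("E", 6, 7), ("G", 7, 11)],
]

def relation(x, y):
    """Reduced relation test (starts, overlaps, meets, before only)."""
    (xs, xe), (ys, ye) = x, y
    if xs == ys and xe < ye:
        return "starts"
    if xs < ys < xe < ye:
        return "overlaps"
    if xe == ys:
        return "meets"
    if xe < ys:
        return "before"
    return None

def two_patterns(seq):
    """All 2-patterns (label, relation, label) present in one sequence."""
    found = set()
    for (la, sa, ea), (lb, sb, eb) in permutations(seq, 2):
        rel = relation((sa, ea), (sb, eb))
        if rel is not None:
            found.add((la, rel, lb))
    return found

def support_count(pattern, seqs):
    """Number of sequences in which the pattern occurs."""
    return sum(pattern in two_patterns(s) for s in seqs)

def is_frequent(pattern, seqs, min_support):
    """Assumed test: relative support must exceed the threshold."""
    return support_count(pattern, seqs) / len(seqs) > min_support
```

Each sequence is scanned once per call, which mirrors the single-pass idea on a small scale; a real miner would count all candidates of a level in the same pass.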
(1) Candidate Generation: Bottom-up approach
SECOND STEP: GENERATE (3+1)-PATTERNS FROM FREQUENT
3-PATTERNS AND 2-PATTERNS
LEVEL 3: K = 3
3-patterns: meets(B overlaps A) C, ...
2-patterns: C equals D, A overlaps B, ...
Candidate 4-patterns: equals(meets(B overlaps A) C) D, ...
(2) Support Counting: Single-pass Procedure
Determine frequency of
candidate patterns by
counting occurrences
• At each iteration: Increment the level
• Terminates when the Candidate Set is EMPTY
Minimum Support Threshold vs. Number of Frequent Patterns
Dataset      Min. support  Frequent patterns
Junction     0.02          92
Roundabout   0.02          29
Pruning Redundant Patterns
• Our pruning criteria:
CASE 1
Relation_1  Relation_2
overlaps    overlaps
during      during
equals      equals
CASE 2
Relation_1  Relation_2
overlaps    starts
during
equals      finishes
Pruning Redundant Patterns
CASE 3 overlaps(C overlaps A) A
JUNCTION
k-patterns  before  after
2-patterns  55      40
3-patterns  33      26
4-patterns  4       3
ROUNDABOUT
k-patterns  before  after
2-patterns  23      17
3-patterns  5       4
4-patterns  1       1
Learning Association Rules
• Temporal association rules (TAR) describe time-
dependent correlations
• TARs are constructed from pairs of FTPs: The
left-hand side is a sub-pattern of the right-hand
pattern
k-pattern(X) → (k+1)-pattern(Y)
• A rule’s strength is measured
by its confidence, and rules are retained if the confidence value is above
a threshold
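The slide's formula for rule strength did not survive extraction. A minimal sketch, assuming the standard association-rule definition confidence(X → Y) = support(Y) / support(X), where X is the sub-pattern and Y the larger pattern, is shown below; the function names and the rule tuple layout are hypothetical.

```python
def confidence(support_x, support_y):
    """Confidence of the rule X -> Y under the assumed standard definition."""
    if support_x <= 0:
        raise ValueError("support of the left-hand pattern must be positive")
    return support_y / support_x

def retain_rules(rules, min_confidence):
    """Keep rules whose confidence reaches the threshold.

    Each rule is (lhs_name, rhs_name, support_lhs, support_rhs).
    """
    kept = []
    for lhs, rhs, s_lhs, s_rhs in rules:
        c = confidence(s_lhs, s_rhs)
        if c >= min_confidence:
            kept.append((lhs, rhs, c))
    return kept
```

Since Y contains X, every sequence that supports Y also supports X, so this ratio always lies between 0 and 1.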
Traffic Scene Model: Junction
starts(A,B) [4] → meets(starts(A,B),C) [9] {50%}
before(starts(D,C),G) [5.5] → overlaps(before(starts(D,C),G),E) [4] {50%}
before(G,F) [4] → during(before(G,F),H) [5] {100%}
during(F,H) [3.5] → before(during(F,H),A) [4] {50%}
Traffic Scene Model: Roundabout
before(starts(B,A),D) [7] → finishes(before(starts(B,A),D),C) [2] {100%}
before(F,B) [7] → finishes(before(F,B),A) [3] {100%}
(0) Trajectory Classification
• The classification problem entails classifying
trajectories from test sequences to event categories:
{A,B,C,…}
• Classification is based on the nearest-neighbor scheme
[Figure: test trajectories (1?, 2??, 3???) plotted among labeled trajectories of event clusters A, B, C and D]
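Nearest-neighbor classification of a test trajectory can be sketched with a pluggable distance function. The assumptions: labeled trajectories are (label, points) pairs, and `mean_point_distance` is only a simple stand-in so the example is self-contained; any trajectory distance, such as DTW, could be passed instead.

```python
import math

def mean_point_distance(a, b):
    """Stand-in distance: mean Euclidean gap between corresponding points
    (zip stops at the shorter trajectory)."""
    pairs = list(zip(a, b))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def classify_nearest(test_traj, labeled_trajs, distance=mean_point_distance):
    """Return (label, distance) of the closest labeled trajectory."""
    best_label, best_dist = None, float("inf")
    for label, traj in labeled_trajs:
        d = distance(test_traj, traj)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist
```

Returning the distance along with the label is a design choice that lets a later stage compare it against a per-cluster threshold.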
(1) Spatial Outliers
• In the physical scene layout, these events deviate from the normal direction-of-flow
• The trajectory direction is computed as:
• The direction of the test trajectory is compared to the directions of the cluster prototypes using the DTW distance measure
• Abnormal trajectories exceed the threshold defined per event cluster
(2) Spatio-temporal Anomaly Detection
• Abnormal activities at this stage violate both spatial and temporal constraints
• Hierarchical pattern matching (level 1 to level k): Patterns from test sequence are matched against the trained sets of FTPs
– Level 1: Single Frequent Events
– Level 2: 2-patterns
– Level 3: 3-patterns
– Level 4: 4-patterns
• Next…
(2) Spatio-temporal Anomaly Detection
• The law of transitivity has to be incorporated in the
pattern-matching process in order to reduce false
positives
• If the duration of a test pattern exceeds a threshold with
respect to the duration of the trained frequent patterns,
this indicates the presence of a rare event
Example of transitivity: A before B and C equals A imply C before B
Anomaly Detection: Accuracy
• Based on the ground truth:
– True Positives (TP): normal test sequence is classified as normal
– True Negatives (TN): abnormal test sequence is classified as abnormal
– False Positives (FP): abnormal behavior classified as normal
– False Negatives (FN): normal behavior classified as abnormal
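From these four counts, accuracy is the share of test sequences classified correctly. The sketch below uses the usual definition (TP + TN) / (TP + TN + FP + FN); the counts in the usage note are hypothetical and are not taken from the reported experiments.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of test sequences whose classification matches ground truth."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one test sequence is required")
    return (tp + tn) / total
```

For example, 8 normal sequences classified as normal, 1 abnormal sequence classified as abnormal, and 1 normal sequence flagged as abnormal give an accuracy of 0.9.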
Junction: starts(F,A)
Fire-truck interrupting traffic flow
Approach     Accuracy
Ours         97.37%
Loy et al.   90%
Zen et al.   92.36%
Roundabout: overlaps(D,A)
Incorrect traffic flow
Approach     Accuracy
Ours         97.62%
Zen et al.   86.4%
Contributions
• Clustering of motion trajectories using a spectral technique and the DTW measure
• Utilizing interval-based temporal mining techniques for event recognition in dynamic scenes
• Hierarchical spatio-temporal anomaly detection based on quantitative measures
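[Figure: point-based representation (events A, B, C and D as time points) vs. interval-based representation (the same events as intervals with a duration)]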
Future Directions
• Using a fully unsupervised robust visual
surveillance tracking system
• Performing motion segmentation and anomaly
detection in real-time
• Applying this approach to more complex
scenarios as well as other domains
Conclusion
• The goal is to organize the video into different
event groups and find their temporal
dependencies
• Single-agent events are modeled by trajectories
• Multi-agent interactions are represented by
temporal patterns
• Association rules are useful in predicting future
activities
• The ability to model the individual behavior of vehicles
in the scene helps in localizing anomalies
Motion Segmentation: Spectral Clustering
• We aim to cluster trajectories into distinct
events in the scene.
• Spectral clustering
– represents data points in a low-dimensional space
– can deal with non-convex shaped clusters
Mean-shift Tracking
• Mean-shift theory: find the center of mass of the region of interest (ROI), move the circle to the center of mass and continue until convergence
1) Obtain target model and location
2) Minimize the distance between the target and candidate model
3) Kernel is moved from previous location to current location until convergence
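The core mean-shift iteration can be sketched on a set of 2-D points. This is a simplification of the tracker described above: it uses a flat circular window over point coordinates, whereas the tracker compares color distributions of the target and candidate regions. The parameter names (`radius`, `tol`, `max_iter`) are illustrative.

```python
import math

def mean_shift(points, start, radius, tol=1e-6, max_iter=100):
    """Move a circular window to the center of mass of the points inside it,
    repeating until the shift falls below tol."""
    center = start
    for _ in range(max_iter):
        inside = [p for p in points if math.dist(p, center) <= radius]
        if not inside:
            break  # empty window: nothing to move toward
        new_center = (
            sum(p[0] for p in inside) / len(inside),
            sum(p[1] for p in inside) / len(inside),
        )
        if math.dist(new_center, center) < tol:
            return new_center
        center = new_center
    return center
```

In a tracking loop, the converged center from one frame would serve as the starting location for the next frame.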