CAMEO: Meeting Understanding
Prof. Manuela M. Veloso, Prof. Takeo Kanade
Dr. Paul E. Rybski, Dr. Fernando de la Torre,
Dr. Brett Browning, Raju Patil, Carlos Vallespi,
Scott Lenser, Betsy Ricker, Francesco Tamburrino,
Colin McMillen, Sonia Chernova
CALO: Physical Awareness
Computer Science Department / The Robotics Institute
School of Computer Science
Carnegie Mellon University
CAMEO: Camera Assisted Meeting Event Observer
• Robust multi-person physical awareness (PA) capture device
• Contributions
  • Mosaic generation
  • Person tracking
  • Face recognition
  • Activity recognition
  • Logging/modeling
Must effectively operate in unstructured environments.
Each camera is hand-calibrated only once to compensate for radial distortion
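To illustrate what a one-time radial calibration makes possible, the sketch below corrects a single pixel with a one-coefficient polynomial radial model. The model itself, the coefficient `k1`, the image centre `(cx, cy)`, and the function name are assumptions for illustration; the slides do not state which distortion model CAMEO uses.

```python
def undistort_point(x, y, cx, cy, k1):
    """Map a distorted pixel to a corrected position.

    Assumed model: the offset from the image centre (cx, cy) is scaled
    by (1 + k1 * r^2), where r is the distance from the centre.
    k1 would come from the one-time hand calibration.
    """
    dx, dy = x - cx, y - cy
    r2 = dx * dx + dy * dy          # squared radius from the centre
    scale = 1.0 + k1 * r2           # radial correction factor
    return cx + dx * scale, cy + dy * scale


# Hypothetical values: a 640x480 image with a small positive coefficient.
corrected = undistort_point(420.0, 240.0, 320.0, 240.0, 1e-6)
```

Because the coefficient is fixed per camera, the correction can be precomputed once as a lookup table and reused for every frame.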
Video Mosaic
Person Tracking : Mean Shift Based Color Tracking
Register New Person: Person ID, Face histogram, Face Center (x,y), Face width, Face height
• Additional filtering based on shape and color templates
“Omega” head and shoulder template
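The mean-shift step behind the color tracker named on this slide can be sketched as follows. This is a minimal version under stated assumptions: `weights[y][x]` stands for a per-pixel score of how well the pixel color matches the registered face histogram (a back-projection), and the window size, iteration limit, and all names are hypothetical rather than taken from CAMEO.

```python
def mean_shift(weights, cx, cy, half_w, half_h, max_iter=20, eps=0.5):
    """Move a window centre toward the weighted centroid under the window.

    weights: 2D list of non-negative scores (assumed colour back-projection).
    (cx, cy): starting centre; half_w, half_h: half window size in pixels.
    """
    rows, cols = len(weights), len(weights[0])
    for _ in range(max_iter):
        ix, iy = int(round(cx)), int(round(cy))
        x0, x1 = max(0, ix - half_w), min(cols - 1, ix + half_w)
        y0, y1 = max(0, iy - half_h), min(rows - 1, iy + half_h)
        total = sx = sy = 0.0
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                w = weights[y][x]
                total += w
                sx += w * x
                sy += w * y
        if total == 0.0:
            break                       # nothing matches; keep last position
        new_cx, new_cy = sx / total, sy / total
        shift = abs(new_cx - cx) + abs(new_cy - cy)
        cx, cy = new_cx, new_cy
        if shift < eps:
            break                       # converged
    return cx, cy


# Toy frame: a 3x3 block of matching pixels centred at (8, 6).
frame = [[0.0] * 12 for _ in range(12)]
for row in range(5, 8):
    for col in range(7, 10):
        frame[row][col] = 1.0
centre = mean_shift(frame, 6, 5, 3, 3)  # window starts off-centre
```

Running one such update per frame, starting from the previous frame's face centre, is what lets the window follow a moving face.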
Henry Schneiderman. “Feature-Centric Evaluation for Efficient Cascaded Object Detection.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004.
Henry Schneiderman. “Learning a Restricted Bayesian Network for Object Detection.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004.
Face Recognition: Training
1) Capture visual data
2) Normalize for geometry and illumination
3) Cluster the most discriminating face examples
Multiple face discrimination
Real-time performance
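Kullback-Leibler divergence between two Gaussian clusters with means $\mu_1, \mu_2$ and covariances $\Sigma_1, \Sigma_2$:

$$KL_{12} = \mathrm{tr}\left(\Sigma_1^{-1}\Sigma_2 + \Sigma_2^{-1}\Sigma_1 - 2I\right) + (\mu_1 - \mu_2)^T \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right) (\mu_1 - \mu_2)$$

Summed over the clusters $r_i \in C_i$ and $r_j \in C_j$ of every pair of classes $i \neq j$, with each mean replaced by $B^T\mu$ and each covariance by $B^T\Sigma B$:

$$\sum_{i=1}^{\text{classes}} \sum_{j \neq i} \sum_{r_i \in C_i} \sum_{r_j \in C_j} KL_{r_i r_j}(B)$$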
Face Recognition: Non-Linear Oriented Discriminant Analysis
Find transformation matrix B that MAXIMIZES the Kullback-Leibler divergence between clusters among classes
$$J(B) = \sum_{i=1}^{\text{classes}} \mathrm{tr}\left( (B^T \Sigma_i B)^{-1} (B^T A_i B) \right)$$

Use Iterative Majorization to approximate B
Project clustered images into a lower-dimensional subspace to speed recognition
Research challenge: Face subspace is multi-modal
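The quantity this slide maximizes is a Kullback-Leibler divergence between clusters. As a small worked illustration, the sketch below evaluates the symmetric divergence between two Gaussian clusters under a simplifying assumption of diagonal covariances (per-dimension variances), so no matrix inverse is needed. The function name and the toy clusters are hypothetical. The returned value is twice KL(p1||p2) + KL(p2||p1); the constant factor does not change which projection is best.

```python
def symmetric_kl_diag(mu1, var1, mu2, var2):
    """Symmetric KL-style divergence for diagonal-covariance Gaussians.

    Computes tr(S1^-1 S2 + S2^-1 S1 - 2I) + (m1 - m2)^T (S1^-1 + S2^-1) (m1 - m2)
    where S1, S2 are diagonal with entries var1, var2 (an assumption made
    here to keep the sketch dependency-free).
    """
    trace_term = sum(v2 / v1 + v1 / v2 - 2.0 for v1, v2 in zip(var1, var2))
    mean_term = sum((m1 - m2) ** 2 * (1.0 / v1 + 1.0 / v2)
                    for m1, m2, v1, v2 in zip(mu1, mu2, var1, var2))
    return trace_term + mean_term


# Two toy clusters that differ only along the first axis.
mu_a, var_a = [0.0, 0.0], [1.0, 1.0]
mu_b, var_b = [1.0, 0.0], [1.0, 1.0]

# Compare two candidate 1-D projections: keep axis 0 or keep axis 1.
keep_axis0 = symmetric_kl_diag(mu_a[:1], var_a[:1], mu_b[:1], var_b[:1])
keep_axis1 = symmetric_kl_diag(mu_a[1:], var_a[1:], mu_b[1:], var_b[1:])
best_axis = 0 if keep_axis0 > keep_axis1 else 1
```

In this toy case, projecting onto the first axis preserves all of the divergence while the second axis preserves none, which is the kind of choice the transformation B is meant to make.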
Face Recognition: Results
• Each new face is projected into subspace and compared against the trained examples
• Closest match, via Mahalanobis distance, determines class membership
• 95% recognition rate with training database of 11 subjects
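The recognition step above (project, then pick the closest class by Mahalanobis distance) can be sketched as follows. This is a simplified illustration, not CAMEO's implementation: `B` is given as a list of column vectors, each class is summarized by a mean and per-dimension variances in the subspace (a diagonal-covariance assumption), and the class labels are placeholders.

```python
import math


def project(B, x):
    """Return B^T x, with B supplied as a list of column vectors."""
    return [sum(b * v for b, v in zip(column, x)) for column in B]


def mahalanobis_diag(z, mean, var):
    """Mahalanobis distance assuming a diagonal covariance (variances in var)."""
    return math.sqrt(sum((a - m) ** 2 / v for a, m, v in zip(z, mean, var)))


def classify(B, x, classes):
    """Project x and return the label of the closest class.

    classes maps a label to (mean, var), both expressed in the subspace.
    """
    z = project(B, x)
    return min(classes, key=lambda label: mahalanobis_diag(z, *classes[label]))


# Hypothetical 3-D input reduced to 2-D by keeping the first two coordinates.
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
classes = {
    "subject_a": ([0.0, 0.0], [0.25, 1.0]),   # tight along the first axis
    "subject_b": ([4.0, 0.0], [4.0, 1.0]),    # spread along the first axis
}
label = classify(B, [2.0, 0.0, 7.0], classes)
```

The example point is equally far from both class means in Euclidean terms, but the Mahalanobis distance accounts for each class's spread and assigns it to `subject_b`.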
Inferring Activity from Observation
Person action sequences can be represented as a simple finite state machine.
Face tracker captures the (x,y) positions of faces in the image over time.
Global meeting state is inferred from aggregate of person activity.
Inferred state from classifier
• State transitions are encoded as a dynamic Bayesian network with an HMM structure.
• Current person state is a function of observed human activity and previous state.
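The dependence of the current state on the previous state and the current observation can be sketched as one step of HMM forward filtering. The state names, observation symbols, and all probabilities below are invented for illustration; the slides do not give CAMEO's actual model parameters.

```python
# Hypothetical two-state model for one tracked person.
TRANSITION = {
    "sitting":  {"sitting": 0.9, "standing": 0.1},
    "standing": {"sitting": 0.1, "standing": 0.9},
}
# Observation: whether the tracked face is low or high in the image.
EMISSION = {
    "sitting":  {"low": 0.8, "high": 0.2},
    "standing": {"low": 0.2, "high": 0.8},
}


def forward_step(belief, observation, transition, emission):
    """Update the state distribution with one observation.

    Assumes the observation has non-zero probability under some state.
    """
    # Predict: push the previous belief through the transition model.
    predicted = {s: sum(belief[p] * transition[p][s] for p in belief)
                 for s in belief}
    # Correct: weight by how well each state explains the observation.
    unnormalized = {s: predicted[s] * emission[s][observation] for s in belief}
    total = sum(unnormalized.values())
    return {s: value / total for s, value in unnormalized.items()}


belief = {"sitting": 0.5, "standing": 0.5}
for obs in ["high", "high"]:
    belief = forward_step(belief, obs, TRANSITION, EMISSION)
most_likely = max(belief, key=belief.get)
```

Each face position from the tracker would first be mapped to a discrete observation such as `low` or `high`; the filter then smooths out isolated noisy readings because a single contrary observation rarely outweighs the transition prior.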
Logging/Replay/Towards Learning
Tracked person data is recorded for off-line activity analysis and learning of dynamics.
The recorded logs can be replayed back through CAMEO.
Model-based simulation generates high-level state descriptions of group activities.
Data-based simulation generates low-level “frame-by-frame” individual person activity state descriptions.
bring [carlos, computer]
bring [carlos, cameo]
set_up [carlos, computer]
use [carlos, computer]
set_up [carlos, cameo]
use [carlos, cameo]
give_demo [carlos, face_recognition]
ask_question [fernando, face_recognition]
answer_question [carlos, fernando, face_recognition]
give_demo [raju, tracking]
ask_question [carlos, tracking]
answer_question [raju, carlos, tracking]
give_demo [carlos, face_detection]
ask_question [raju, face_detection]
answer_question [carlos, raju, face_detection]
ask_question [fernando, face_detection]
answer_question [raju, fernando, face_detection]
remove [carlos, computer]
remove [carlos, cameo]
leave [jon]
leave [raju]
leave [fernando]
leave [carlos]
leave [daniel]
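High-level event entries of the form `action [arg, arg, ...]` shown above are simple to load for off-line analysis. The sketch below parses them with a regular expression and counts events per person. The function names are hypothetical, and treating the first argument as the acting person is an assumption inferred from the entries, not a documented format rule.

```python
import re
from collections import Counter

# One event: an action name followed by a bracketed, comma-separated list.
EVENT = re.compile(r"(\w+)\s*\[([^\]]*)\]")


def parse_events(log_text):
    """Return a list of (action, [arguments]) tuples in log order."""
    return [(action, [arg.strip() for arg in args.split(",")])
            for action, args in EVENT.findall(log_text)]


def actions_per_person(events):
    """Count events by their first argument (assumed to be the actor)."""
    return Counter(args[0] for _, args in events if args)


sample = """give_demo [carlos, face_recognition]
ask_question [fernando, face_recognition]
answer_question [carlos, fernando, face_recognition]
leave [jon]"""
events = parse_events(sample)
counts = actions_per_person(events)
```

Because the pattern does not depend on line breaks, the same parser handles entries that are separated by newlines or run together on one line.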