Looking at people and Image-based Localisation
Roberto Cipolla
Department of Engineering
Research team http://www.eng.cam.ac.uk/~cipolla/people.html
1. Real-time hand detection and tracking
Why is it hard?
• Highly articulated object, 27 model parameters
• Shape variation and self-occlusions
• Unreliable point features
• Ambiguities in a single view lead to multi-modal distributions (local minima)
Why is it hard?
• Background clutter
• Potentially fast motion
• Lighting changes
• Partial / full occlusion
A Solved Problem?
3D tracking, 6/7 DOF
• Model: 3D quadrics
• Cost function: edges or colour-edges
• Tracking: unscented Kalman filtering
• Single or dual view
• Single-hypothesis filter, no recovery strategy
A Robust Tracker
• Should work in scenes with complex background and varying illumination
  – Important: cost function design
  – Optimization strategy
• Should handle multi-modality
  – Examples: particle filters, multi-hypothesis filters
• Should have a recovery strategy when track is lost
  – Trigger search algorithm
3D Pose Recovery
• 3D hand model constructed from cones and ellipsoids
• Contour projection, handling self-occlusions
• 27 motion parameters
Hierarchy of classifiers
Likelihood : Edges
Edge Detection Projected Contours
Robust Edge Matching
Input Image 3D Model
Chamfer Matching
Input image Canny edges
Distance transform Projected Contours
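The chamfer matching pipeline above (edge map, distance transform, projected contour) can be sketched in a few lines. This is a minimal pure-Python illustration, not the tracker's implementation: the function names, the city-block distance metric, and the truncation threshold `max_dist` (one common way to make the match robust to missing edges) are all assumptions made for the example.

```python
# Minimal chamfer-matching sketch (pure Python; all names are illustrative).
INF = float("inf")

def distance_transform(edges):
    """Two-pass city-block distance transform of a binary edge map.

    edges: list of rows, where 1 marks an edge pixel.
    Returns a grid with the distance from each pixel to the nearest edge.
    """
    h, w = len(edges), len(edges[0])
    dist = [[0 if edges[y][x] else INF for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top-left corner.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom-right corner.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist

def chamfer_cost(dist, contour, max_dist=5):
    """Mean truncated distance from projected contour points (x, y) to edges."""
    total = sum(min(dist[y][x], max_dist) for (x, y) in contour)
    return total / len(contour)

# Toy edge map standing in for the Canny output of an input image.
edges = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
dist = distance_transform(edges)
aligned = [(1, 1), (2, 1), (3, 1)]  # hypothetical contour lying on the edges
shifted = [(1, 3), (2, 3), (3, 3)]  # same contour moved two rows down
print(chamfer_cost(dist, aligned), chamfer_cost(dist, shifted))
```

Because the distance transform is computed once per image, each additional template costs only a lookup per contour point, which is what makes matching many templates affordable.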
Likelihood : Colour
Skin Colour Model Projected Silhouette
Input Image 3D Model
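One way to turn a skin colour model and a projected silhouette into a likelihood is sketched below. The slides do not give the exact formula, so this is an assumption: pixels are treated as independent, pixels inside the silhouette are scored by their skin probability, and pixels outside by the complementary background probability. The names `skin_prob`, `silhouette` and `eps` are hypothetical.

```python
import math

def colour_log_likelihood(skin_prob, silhouette, eps=1e-6):
    """Log-likelihood of a silhouette hypothesis under a per-pixel skin model.

    skin_prob:  grid of per-pixel skin probabilities in [0, 1].
    silhouette: grid of 0/1 flags, 1 where the projected hand model covers
                the pixel.
    Assumes independent pixels (an illustrative simplification).
    """
    total = 0.0
    for prob_row, mask_row in zip(skin_prob, silhouette):
        for p, inside in zip(prob_row, mask_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += math.log(p) if inside else math.log(1.0 - p)
    return total

# Toy probability map: the left two columns look like skin.
skin_prob = [
    [0.9, 0.8, 0.1],
    [0.9, 0.7, 0.2],
]
good_mask = [[1, 1, 0], [1, 1, 0]]  # silhouette covering the skin region
bad_mask = [[0, 1, 1], [0, 1, 1]]   # silhouette shifted one column right
print(colour_log_likelihood(skin_prob, good_mask))
print(colour_log_likelihood(skin_prob, bad_mask))
```

The well-aligned silhouette receives the higher score, so the term complements the edge cost when edges are weak or cluttered.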
Template Matching
Tree-based Bayesian filtering
Matching Multiple Templates
• Use tree structure to efficiently match many templates (>50,000)
• Arrange templates in tree based on their similarity
• Traverse tree using breadth-first search, several ‘active’ leaves possible
Search Tree
Grid-based partitioning of parameter space
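The breadth-first traversal over a grid-partitioned parameter space can be illustrated with a toy one-dimensional example. This is a sketch under stated assumptions, not the system's code: a single rotation angle stands in for the full pose space, `match_cost` is a placeholder for a real template cost such as chamfer distance, and the pruning rule (discard a node whose cost exceeds its half-width plus a `slack`) is chosen only for illustration.

```python
from collections import deque

class Node:
    """A cell [lo, hi) of a 1D parameter space with a prototype at its centre."""
    def __init__(self, lo, hi, depth, max_depth, branching=4):
        self.lo, self.hi = lo, hi
        self.centre = 0.5 * (lo + hi)
        self.children = []
        if depth < max_depth:
            step = (hi - lo) / branching
            self.children = [
                Node(lo + i * step, lo + (i + 1) * step,
                     depth + 1, max_depth, branching)
                for i in range(branching)
            ]

def match_cost(node, observation):
    # Placeholder for matching the node's template against the image.
    return abs(observation - node.centre)

def tree_search(root, observation, slack=1.0):
    """Breadth-first search that prunes subtrees whose prototype matches badly.

    Returns the surviving ('active') leaves sorted by cost, and the number
    of templates that were actually evaluated.
    """
    evaluated = 0
    active_leaves = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        evaluated += 1
        cost = match_cost(node, observation)
        half_width = 0.5 * (node.hi - node.lo)
        if cost > half_width + slack:
            continue  # prune: no descendant of this cell can match well
        if node.children:
            queue.extend(node.children)
        else:
            active_leaves.append((cost, node.centre))
    return sorted(active_leaves), evaluated

root = Node(0.0, 360.0, depth=0, max_depth=3)  # 64 leaf templates
leaves, evaluated = tree_search(root, observation=100.0)
print(leaves, evaluated)
```

In this toy run far fewer templates are evaluated than the 64 stored at the leaf level, and more than one leaf can remain active when the observation falls near a cell boundary.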
Bayesian-Tree
• The search-tree is brought into a Bayesian framework by adding the prior knowledge from previous frame.
• The Bayesian-Tree can be thought of as approximating the posterior probability at different resolutions.
State space partitioning Estimation of posterior pdf
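The idea of combining the prior from the previous frame with the current likelihood over a partitioned state space can be sketched as a single-resolution grid filter. The example below is a simplification: it uses one level of a 1D grid rather than a multi-resolution tree, and the Gaussian random-walk dynamics, the Gaussian observation model, and the parameters `sigma_dyn` and `sigma_obs` are assumptions made for illustration.

```python
import math

def gaussian(x, mean, sigma):
    """Unnormalised Gaussian weight."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

def predict(prior, centres, sigma_dyn):
    """Propagate the previous posterior through random-walk dynamics."""
    predicted = [
        sum(p * gaussian(xi, xj, sigma_dyn) for p, xj in zip(prior, centres))
        for xi in centres
    ]
    return normalise(predicted)

def update(predicted, centres, observation, sigma_obs):
    """Weight each cell by the likelihood of the current observation."""
    posterior = [
        p * gaussian(observation, x, sigma_obs)
        for p, x in zip(predicted, centres)
    ]
    return normalise(posterior)

# Ten cells partitioning a 1D state space, represented by their centres.
centres = [10.0 * i for i in range(10)]
prior = normalise([1.0] * len(centres))  # uniform prior at the first frame

predicted = predict(prior, centres, sigma_dyn=10.0)
posterior = update(predicted, centres, observation=42.0, sigma_obs=10.0)
best_cell = max(range(len(centres)), key=lambda i: posterior[i])
print(centres[best_cell], posterior[best_cell])
```

Repeating `predict` and `update` once per frame carries the posterior forward in time; a tree-based version would apply the same two steps at each level, refining only the cells that hold significant probability mass.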
Experiments
Global Motion
3D motions limited to hemisphere
Dynamics: First-order Gaussian process
3 level tree with 16,000 templates at leaf level
5 scales, divisions of 15 degrees in 3D rotation and
divisions of 10 degrees in image plane rotation
Translation search at 20, 5, 2-pixel resolution
Tracking Results
Experiments
Finger Articulation
• Opening and closing of thumb and fingers approximated by 2 parameters
• Global motion restricted to smaller range, but still with 6 DOF
• 35,000 templates at the leaf level
Opening and closing
Hand detection system
Ongoing work
• Large number of templates required
  – The examples shown here cover only constrained motion
  – Number of templates required for fully articulated motion?
• Tracking rates from 5 fps down to 0.2 fps for 400 to 35,000 templates (on a 2.4 GHz Pentium IV)
• Error introduced by geometric model
  – No palm deformation, no skin deformation, no arm model
Detecting people
2. Building 3D models of cities
Trumpington Street Data
Camera pose determination
3D reconstruction
Reconstruction texture mapped
3. Where am I?
Image-based localisation
Summary and deliverables
• Real-time hand detection in clutter
• 3D models from uncalibrated images
• Image-based localisation for augmented reality