Dandan Shan¹, Jiaqi Geng¹*, Michelle Shu²*, David F. Fouhey¹
¹University of Michigan, ²Johns Hopkins University
Sponsored by Procter & Gamble and Nokia Networks
100 Days Of Hands Video Dataset
furniture gardening housework packing
diy drink food boardgame
puzzle repair study vlog
131 Days
12 Categories
19.2K Uploaders
27.3K Videos
100K Frame-level Annotations
• Box around hand
• Side (left / right)
• Contact (no / self / other / portable / furniture)
• Box around object in contact
• Association
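The annotation fields listed above can be pictured as one record per hand. The following is only an illustrative sketch: the class and field names (`HandAnnotation`, `hand_box`, `object_box`) and the `(x1, y1, x2, y2)` box convention are assumptions for this example, not the dataset's actual file format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

# Assumed convention: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


class Side(Enum):
    LEFT = "left"
    RIGHT = "right"


class Contact(Enum):
    NO = "no"
    SELF = "self"
    OTHER = "other"
    PORTABLE = "portable"
    FURNITURE = "furniture"


@dataclass
class HandAnnotation:
    """Hypothetical per-hand record mirroring the annotation list."""
    hand_box: Box
    side: Side
    contact: Contact
    # Box of the object in contact; its presence encodes the association.
    object_box: Optional[Box] = None

    def has_object(self) -> bool:
        """True when an object box is associated with this hand."""
        return self.object_box is not None


example = HandAnnotation(
    hand_box=(120.0, 80.0, 200.0, 160.0),
    side=Side.RIGHT,
    contact=Contact.PORTABLE,
    object_box=(180.0, 100.0, 260.0, 220.0),
)
```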
Method: Faster-RCNN
Hand
• Box: standard detection
• Side: classification (if hand)
• State: classification (if hand)
• Offset: regression (if in contact)
Object
• Box: standard detection (another class)
[1] Ren et al. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS 2015.
[2] Yang et al. A Faster Pytorch Implementation of Faster R-CNN (https://github.com/jwyang/faster-rcnn.pytorch).
Simple greedy matching on hands and objects
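One way to picture greedy matching between hands and objects is sketched below. It assumes each in-contact hand carries a predicted offset from its box center toward the object it touches, and that each hand is assigned the detected object whose center lies nearest to that predicted point. The function name `greedy_match`, the input layout, and the nearest-center criterion are assumptions made for illustration; the poster does not specify the exact rule.

```python
import math
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2)
Offset = Tuple[float, float]             # assumed (dx, dy) in pixels


def center(box: Box) -> Tuple[float, float]:
    """Center point of a box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def greedy_match(hands: List[Tuple[Box, Offset]],
                 objects: List[Box]) -> Dict[int, int]:
    """Map each in-contact hand index to the index of the nearest object.

    The target point is the hand center shifted by the predicted offset.
    Two hands may map to the same object (e.g. both hands on one item).
    """
    matches: Dict[int, int] = {}
    if not objects:
        return matches
    for h_idx, (hand_box, offset) in enumerate(hands):
        cx, cy = center(hand_box)
        target = (cx + offset[0], cy + offset[1])
        best = min(
            range(len(objects)),
            key=lambda j: math.dist(target, center(objects[j])),
        )
        matches[h_idx] = best
    return matches


hands = [((0.0, 0.0, 10.0, 10.0), (20.0, 0.0)),
         ((100.0, 100.0, 110.0, 110.0), (0.0, -30.0))]
objects = [(20.0, 0.0, 30.0, 10.0), (100.0, 60.0, 110.0, 80.0)]
print(greedy_match(hands, objects))
```

In this toy input the first hand points at the first object and the second hand at the second, so each hand is paired with the object its offset indicates.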
Cross-Dataset Hand Detection
Faster RCNN with ResNet-101 backbone, AP (TP = IoU > 0.5)
Rows: training dataset ("Train on"). Columns: test dataset.
Datasets: 100DOH, VLOG, VIVA, Ego, VGG, TV+Co
Color legend: >80, >70, >60, >40, <40
AP values as extracted (cell positions not recoverable):
73.9, 90.1, 86.4, 86.5, 65.4, 90.8, 21.5, 17.4, 23.6, 40.7, 27.7, 32.6,
90.8, 44.9, 10.1, 8.0, 90.7, 56.8, 64.6, 78.6, 77.5, 76.6, 59.2, 83.2,
79.6, 77.4, 61.4, 56.2, 61.7, 61.5, 74.9, 78.8, 69.9, 66.6, 62.4, 63.0
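The true-positive criterion quoted for the AP numbers (IoU > 0.5) can be sketched as follows. This is a minimal illustration, assuming boxes in `(x1, y1, x2, y2)` form; the helper names `iou` and `is_true_positive` are hypothetical, and a full AP computation additionally ranks detections by confidence and allows each ground-truth box to be matched only once.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """A detection counts as correct when its IoU with the ground truth exceeds the threshold."""
    return iou(pred, gt) > threshold
```

For instance, a prediction that covers only half of a ground-truth box of the same size, shifted sideways, has IoU 1/3 and would not count as a true positive.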
Qualitative Results
Enabling MANO at Scale
Method outputs used: hand box and side (L / R).
Given the hand location and side, the model of Hasson et al. [1] can predict a low-dimensional parameterization of the hand (MANO [2]).
[1] Hasson et al. Learning joint reconstruction of hands and manipulated objects. CVPR 2019.
[2] Romero et al. Embodied Hands: Modeling and Capturing Hands and Bodies Together. SIGGRAPH Asia 2017.
Hand Predictions
(Figure: input images shown alongside predictions)
Code and Data Available!
• Hand Detection Model (detectron2)
• Full Hand State Detection Model
• Full Hand State Detection Model (egocentric)
• Mesh Quality Assessment
• 100K labeled Frames
• 100DOH Video Dataset
100 Days Of Hands