Towards 3D Human Pose Estimation in the Wild: a Weakly ... · Towards 3D Human Pose Estimation in...

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach

Xingyi Zhou, Qixing Huang, Xiao Sun, Xiangyang Xue, Yichen Wei UT Austin & MSRA & Fudan

Human Pose EstimationPose representation

y = {p1, · · · , pN}Joint locations

Current Research on 2D Human Pose

Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh, Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, CVPR 2017

•2D human pose estimation is a well studied problem

Is 2D human pose all we need?

Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh, Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, CVPR 2017

•Ambiguous 3D structure

Why we have such a success on 2D?

•2D human pose data is easy to annotate and largely available

Mykhaylo Andriluka, Leonid Pishchulin, Peter Gehler, Schiele Bernt, 2D Human Pose Estimation: New Benchmark and State of the Art Analysis, CVPR 2014

3D data not easy to annotate

Current 3D human pose data.

•Captured in control-environment with accurate sensors.

Catalin Ionescu, Dragos Papava, Vlad Olaru and Cristian Sminchisescu, Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments, PAMI 2014

Supervised Pose Regression on Human3.6M

Xingyi Zhou, Xiao Sun, Wei Zhang, Shuang Liang, Yichen Wei. Deep Kinematic Pose Regression, In ECCV Workshop on Geometry Meets Deep Learning, 2016

Kinematic Pose Regression-Problems

•Training data is biased to indoor environment

Fail on in-the-wild images!

Xingyi Zhou, Xiao Sun, Wei Zhang, Shuang Liang, Yichen Wei. Deep Kinematic Pose Regression, In ECCV Workshop on Geometry Meets Deep Learning, 2016

Problem setting

Given:

Previous approaches: 2 Stages

Wei et al. Convolutional Pose Machines

Newell et al. Hourglass Network

Bulat et al. Part Heatmap Regression

2D pose estimation

Zhou et al. Shape Convex

Akhter et al. Pose Conditioned Angle Limits

Chen et al. KNN Matching

3D geometry recovery

2D pose

Previous approaches: 2 Stages

Wei et al. Convolutional Pose Machines

Newell et al. Hourglass Network

Bulat et al. Part Heatmap Regression

2D pose estimation

Zhou et al. Shape Convex

Akhter et al. Pose Conditioned Angle Limits

Chen et al. KNN Matching

3D geometry recovery

2D pose

The original in-the-wild 2D image, which contains rich cues for 3D pose recovery, is discarded in the second step.

Our solution:Weakly-supervised Transfer for 3D Human pose estimation in the wild

• Train a unified neural network using both 2D and 3D annotation.

• 2D and 3D pose are inherently entangled

• 2D-to-3D transfer: provide rich image features

• 3D-to-2D transfer: provide 3D annotation

Weakly-supervised Transfer

A batch of images from both datasets

Depth regression

moduleConvlayers

supervised 2D heap-map regression

3D data:regression

2D data: constraint

summation skip-connection

2D pose estimation module++++

(Ydep)

•Images from both dataset are fed into the same mini-batch •First estimate 2D pose and then regress depth from 2D results and lower

layer image features •Geometry constraint is applied for weakly-labeled 2D data

S3D = {I3D,Y2D,Ydep}S2D = {I2D,Y2D}

Depth regression

moduleConvlayers

3D data:regression

2D data: constraint

(Ydep)

2D Human Pose estimation: HourglassNetwork

Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation, ECCV 2016

Depth regression

moduleConvlayers

3D data:regression

2D data: constraint

(Ydep)

Geometry Constraint

Key idea: Ratios between bone lengths remain relative fixed

: length of bone e

: a set of involved bones in a skeleton group

: length of bone e in canonical skeleton

{ lele}e∈Ri should remain fixed,

i.e. has zero variance

Depth regression

moduleConvlayers

3D data:regression

2D data: constraint

(Ydep)

L(YHM , Ydep|I) = L2D(YHM , Y2D) + Ldep(Ydep|I, Y2D)

Ldep(Ydep|I, Y2D) =

{λreg||Ydep − Ydep||2, if I ∈ I3DλgeoLgeo(Ydep|Y2D), if I ∈ I2D

Evaluation-Datasets

•MPII • 2D annotation, in-the-wild images •Used for weakly-supervised training

•Human 3.6M • MoCap 3D annotation, indoor • Used for supervised training

• MPI-INF-3DHP • MoCap 3D annotation, indoor & outdoor • Used for evaluation

• MPII-Validation • Used for evaluation

Evaluation-Baseline setup

Depth regression

moduleConvlayers

3D data:regression

2D data: constraint

(Ydep)

Supervised 3D pose estimation on Human3.6M dataset

• 3D/wo geo (82.44mm) shows the effectiveness of our architecture. • 3D/w geo shows the geo-constraint is consistent with supervision.• Training with 3D&2D data (3D+2D/wo geo) provides great performance gain.• Weakly supervised constraint 3D+2D/w geo brings further improvements. • Only 2-steps methods Chen & Ramanan(114.18mm) and Zhou et al,(79.9mm)

can be applied in-the-wild.

Results Analysis

• Is the improvement from more accurate 2D position or better depth estimation? • All baselines have very high 2D pose estimation. • This indicates that depth estimation are greatly benefit from more 2D data. • 2-stage approaches can not have such benefit.

2D PCK 90.01% 90.57% 90.93% 91.62%

In-the-wild 3D pose estimation on MPII-INF-3DHP Dataset

•3D data-only methods fail on in-the-wild images. •3D+2D/wo geo wins its counterpart of Metha et al. •Geo-constraint provides further improvements, whose results are close to

training on the corresponding training set.

In-the-wild 3D pose estimation on MPII-Validation-3D Set

•3D+2D/w geo performs better and correct the symmetry invalidity.

•Our framework keeps 2D accuracy.

More qualitative results

Failure Cases

inaccurate 2D prediction/ ambiguous depth/ false torso length.

Extension

•An improved weak-supervision for rigid objects. •The predicted pose of the same object from different viewpoint should be

consistent with each other.

Xingyi Zhou, Arjun Karpur, Chuang Gan, Linjie Luo, Qixing Huang, Unsupervised Domain Adaptation for 3D Keypoint Prediction from a Single Depth Scan, arXiv 1712.05765, 2017

Extension

•Add temporal refinement. •Add angle constraint.

Rishabh Dabral, Anurag Mundhada, Uday Kusupati, Safeer Afaque, Arjun Jain, Structure-Aware and Temporally Coherent 3D Human Pose Estimation, arXiv:1711.09250

Code & Model Available!

Torch PyTorch

Towards 3D Human Pose Estimation in the Wild: a Weakly ... · Towards 3D Human Pose Estimation in...

Documents

Bringing Portraits to Life with a Weakly Supervised Neural ... · identity/pose for face frontalization. GATH is similar in spirit to [32], in which the encoder subnetwork of the

Exploring Weakly Labeled Images for Video Object Segmentation …cvteam.net/papers/2018_TIP_Zhang-Exploring Weakly Labeled... · 2020. 7. 16. · Exploring Weakly Labeled Images for

Pose-Aware Face Recognition in the Wild - CVF Open Accessopenaccess.thecvf.com/content_cvpr_2016/papers/Masi_Pose-Aware_Face... · Pose-Aware Face Recognition in the Wild Iacopo Masi1

Weakly-supervised 3D Hand Pose Estimation from Monocular … · 2019-04-30 · Convolutional Pose Machines [Wei. et al. CVPR 2016] Stacked Hourglass Networks [Newell et al. ECCV 2016]

Recovering Accurate 3D Human Pose in The Wild Using IMUs ...files.is.tue.mpg.de/black/papers/VIP.pdf · Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera

3D Human Pose Estimation from a Single Image via Distance ... · alization capability on images ‘in the wild’. 2. Related Work Approaches to estimate 3D human pose from single

Weakly-supervised 3D Hand Pose Estimation from Monocular RGB …openaccess.thecvf.com/content_ECCV_2018/papers/Yujun_Cai... · 2018-08-28 · Weakly-supervised 3D HandPose Estimation

Weakly-supervised 3D Hand Pose Estimation from Monocular ...imi.ntu.edu.sg/NewsEvents/Events/PastSeminars/Documents/31_Jan… · Convolutional Pose Machines [Wei. et al. CVPR 2016]

Lifting 2d Human Pose to 3d : A Weakly Supervised …Lifting 2d Human Pose to 3d : A Weakly Supervised Approach Sandika Biswas, Sanjana Sinha, Kavya Gupta and Brojeshwar Bhowmick Embedded

Towards Accurate Multi-person Pose Estimation in …Towards Accurate Multi-person Pose Estimation in the Wild George Papandreou, Tyler Zhu, Nori Kanazawa, Alexander Toshev, Jonathan

Face detection, pose estimation and landmark localization ... · Face detection, pose estimation and landmark localization in the wild Presenter: Shuai Zheng (Kyle) Paper: X. Zhu

Face Detection, Pose Estimation, and Landmark Localization in the Wild

Weakly Coupled Oscillators

3D Human Pose Estimation in the Wild by Adversarial Learningopenaccess.thecvf.com/content_cvpr_2018/papers/Yang_3D_Human_Pose... · shift [44] between the constrained lab environment

Pose-Aware Face Recognition in the Wild - cv · PDF filePose-Aware Face Recognition in the Wild Iacopo Masi1 Stephen Rawls2 G´erard Medioni 1 Prem Natarajan2 1 USC Institute for Robotics

Deep Graph Pose: a semi-supervised deep graphical model ...€¦ · Semi-supervised learning aims to fully utilize unlabeled or weakly-labeled data to gain additional insights into

MoCap-guided Data Augmentation for 3D Pose Estimation in ... · MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild Grégory Rogez Cordelia Schmid Inria Grenoble Rhône-Alpes,

Weakly-Supervised 3D Pose Estimation from a …epubs.surrey.ac.uk/852639/1/Weakly-Supervised 3D Pose...More recently, Convolutional Pose Machines have become a popular approach [34]

5. Weakly primary - full

DOPE: Distillation Of Part Experts for whole-body 3D pose … · 2020. 8. 4. · DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild Philippe Weinzaepfel