Reconstructing PASCAL VOC Sara Vicente* Anthropics Technology Lourdes Agapito University College London Jorge Batista ISR - University of Coimbra João Carreira* UC Berkeley / ISR * First two authors contributed equally

Reconstructing PASCAL VOC - Semantic Scholar · 2017-03-18 · Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013 Model Evolution:


Page 1:

Reconstructing PASCAL VOC

Sara Vicente*, Anthropics Technology

Lourdes Agapito, University College London

Jorge Batista, ISR - University of Coimbra

João Carreira*, UC Berkeley / ISR

* First two authors contributed equally

Page 2:

Data Matters

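Timeline: 1960, 1990, 2010

1960. Goal: everything. Test data: toy images. Training data: 3D models

1990. Goal: image classification. Test data: cropped images. Training data: hundreds of images, class labels

2010. Goal: object localization. Test data: simple images. Training data: 10K-1M images, class labels, segmentations and keypoints

Example image labels: Person, Motorbike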

Page 3:

Present

Renewed interest in joint object reconstruction and recognition

Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset of CAD models, M. Aubry, D. Maturana, A. Efros, B. Russell and J. Sivic

Estimating Image Depth Using Shape Collections, H. Su, Q. Huang, N. Mitra, Y. Li and L. Guibas

Beyond PASCAL: A Benchmark for 3D Object Detection in the Wild, Y. Xiang, R. Mottaghi and S. Savarese

Detailed 3D Representations for Object Recognition and Modeling, Z. Zia, M. Stark, B. Schiele and K. Schindler

Image-based Synthesis and Re-Synthesis of Viewpoints Guided by 3D Models. K. Rematas, T. Ritschel, M. Fritz, and T. Tuytelaars

Parsing IKEA objects: Fine Pose Estimation. J. Lim, H. Pirsiavash and A. Torralba

Page 4:

Present

Renewed interest in joint object reconstruction and recognition

But awesome recognition datasets (PASCAL VOC, ImageNet), which took years to collect and which everyone uses, have only 2D annotations

PASCAL VOC annotations

Available: class labels (e.g. Person, Motorbike), segmentations, keypoints (not shown)

Unavailable: aligned 3D shapes

Page 5:

Proposed Solution

Bootstrap reconstructions for all objects in detection datasets from existing 2D annotations

Facilitate a new attack on joint recognition and reconstruction

Available Reconstructed

Page 6:

Class-based Reconstruction – Prior Work

Morphable models built from (top to bottom: less information):

Multiple 3D scans: A Morphable Model for the Synthesis of 3D Faces, Volker Blanz and Thomas Vetter, Siggraph 1999

Single 3D mesh + 2D data: What shape are dolphins? Building 3D morphable models from 2D images, Thomas J. Cashman and Andrew W. Fitzgibbon, PAMI 2013

2D data (non-rigid SFM): Model Evolution: An Incremental Approach to Non-Rigid Structure from Motion, Shengqi Zhu, Li Zhang, Brandon M. Smith, CVPR 2010

Page 7:

But… how? PASCAL VOC - Birds

Page 8:

But… how? PASCAL VOC - Chairs

Page 9:

But… how? PASCAL VOC - Aeroplanes

Page 10:

But… how? PASCAL VOC - Boats

Page 11:

Key Idea

Assume for each object in a class there are a small number of similar ones seen from different viewpoints (shape surrogates)

Target Object Other objects in same category

Page 12:

Key Idea

Assume for each object in a class there are a small number of similar ones seen from different viewpoints (shape surrogates)

Target Object Other objects in same category

Reconstruct an object using standard rigid multiview techniques with the images of surrogates as additional views

Hard to identify surrogates: perform viewpoint-biased sampling

Page 13:

Proposed Approach

Jointly over all objects in a class:

1. Viewpoint Estimation (Rigid Structure from Motion)

For each object:

2. 3D Reconstruction (Visual Hull Sampling)

3. Reconstruction Ranking

Output: candidate reconstructions ordered along a "Better" axis

Page 14:

Step 1 of 3: Class-based Viewpoint Estimation

Page 15:

Step 1 of 3: Class-based Viewpoint Estimation

Factorization-based rigid SFM:

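\[
\underbrace{\begin{bmatrix}
x_1^1 & \cdots & x_1^k \\
y_1^1 & \cdots & y_1^k \\
x_2^1 & \cdots & x_2^k \\
y_2^1 & \cdots & y_2^k \\
\vdots &        & \vdots \\
x_N^1 & \cdots & x_N^k \\
y_N^1 & \cdots & y_N^k
\end{bmatrix}}_{\text{Measurement matrix (known)}}
=
\underbrace{\begin{bmatrix}
M_1 \\ M_2 \\ \vdots \\ M_N
\end{bmatrix}}_{\text{Motion matrices (unknown)}}
\times
\underbrace{\begin{bmatrix}
x_1 & x_2 & \cdots & x_k \\
y_1 & y_2 & \cdots & y_k \\
z_1 & z_2 & \cdots & z_k
\end{bmatrix}}_{\text{Shape (unknown)}}
\]

Estimating 3D shape from degenerate sequences with missing data, Manuel Marques, João Paulo Costeira, CVIU 2009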
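The factorization on this slide (measurement matrix = motion matrices × shape) can be illustrated with a minimal sketch. It assumes a complete measurement matrix and an affine camera, so it is a plain rank-3 SVD factorization and not the missing-data method of Marques and Costeira cited on the slide. The function name `factorize_rigid_sfm` and the synthetic data are hypothetical, and the result is only defined up to an invertible 3x3 ambiguity.

```python
import numpy as np


def factorize_rigid_sfm(W):
    """Rank-3 factorization of a complete 2N x k measurement matrix.

    Returns motion (2N x 3), shape (3 x k) and the per-row translation.
    Motion and shape are recovered only up to an invertible 3x3 transform;
    no metric upgrade is attempted in this sketch.
    """
    # Subtracting each row's mean cancels the 2D translation of each image.
    t = W.mean(axis=1, keepdims=True)
    W0 = W - t
    # The best rank-3 approximation comes from the top three singular triplets.
    U, s, Vt = np.linalg.svd(W0, full_matrices=False)
    root = np.sqrt(s[:3])
    M = U[:, :3] * root          # stacked 2x3 motion matrices M_1..M_N
    S = root[:, None] * Vt[:3]   # 3 x k shape
    return M, S, t


# Synthetic check: k random 3D keypoints seen by N random orthographic cameras.
rng = np.random.default_rng(0)
N, k = 6, 10
S_true = rng.normal(size=(3, k))
cams = []
for i in range(N):
    Q, R = np.linalg.qr(rng.normal(size=(3, 3)))  # Q has orthonormal columns and rows
    cams.append(Q[:2])                            # two rows act as a 2x3 projection
M_true = np.vstack(cams)
W = M_true @ S_true + rng.normal(size=(2 * N, 1))  # add a 2D translation per row

M, S, t = factorize_rigid_sfm(W)
residual = np.abs(M @ S + t - W).max()
```

With noise-free, complete data the residual is at machine precision. Real keypoint annotations have occluded entries, which is why the slide cites a missing-data method.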

Page 16:

Step 1 of 3: Class-based Viewpoint Estimation

Idea: exploit segmentation information: occluded keypoints should project inside silhouette

Figure panels: original view, side view

Legend: estimated keypoints (occluded); estimated keypoints (visible); ground truth keypoints (only visible ones are available)
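The silhouette idea on this slide can be illustrated by measuring how many projected occluded keypoints land on foreground pixels. This is only a sketch: the slides do not say how the constraint enters the SFM optimization, and the function name `fraction_inside_silhouette`, the boolean mask and the rounding to the nearest pixel are assumptions made here.

```python
import numpy as np


def fraction_inside_silhouette(mask, points_xy):
    """Fraction of projected keypoints (x, y) that land on foreground pixels.

    mask:      (H, W) boolean segmentation mask of the object.
    points_xy: sequence of (x, y) image coordinates, e.g. projections of
               keypoints that are occluded in this view.
    """
    pts = np.rint(np.asarray(points_xy, dtype=float)).astype(int)
    h, w = mask.shape
    x, y = pts[:, 0], pts[:, 1]
    # Points that fall outside the image count as outside the silhouette.
    in_image = (x >= 0) & (x < w) & (y >= 0) & (y < h)
    inside = np.zeros(len(pts), dtype=bool)
    inside[in_image] = mask[y[in_image], x[in_image]]
    return inside.mean()


mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True                       # a square "object"
occluded = [(3.0, 3.0), (5.2, 4.9), (7.0, 1.0), (-1.0, 3.0)]
score = fraction_inside_silhouette(mask, occluded)  # two of the four are inside
```

A score below 1 flags a viewpoint estimate that pushes occluded keypoints outside the object, which is the situation the slide's constraint is meant to rule out.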

Page 17:

Step 1 of 3: Class-based Viewpoint Estimation

Estimated elevation for airplanes:

Page 18:

Step 2 of 3: 3D Reconstruction (Visual Hull)

Well-known multiview reconstruction algorithm

Efficient

Easy to implement

Multiple views of same aeroplane model
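As a rough illustration of why the visual hull is efficient and easy to implement, here is a minimal voxel-carving sketch. It assumes affine cameras given as 2x4 matrices and boolean silhouette masks. The function name `visual_hull` and the toy scene are hypothetical and are not taken from the released code.

```python
import numpy as np


def visual_hull(silhouettes, projections, grid):
    """Keep the voxels whose projection falls inside every silhouette.

    silhouettes: list of (H, W) boolean masks.
    projections: list of (2, 4) affine cameras mapping [X, Y, Z, 1] to (x, y).
    grid:        (V, 3) array of voxel centres.
    """
    homog = np.hstack([grid, np.ones((len(grid), 1))])
    keep = np.ones(len(grid), dtype=bool)
    for mask, P in zip(silhouettes, projections):
        xy = np.rint(homog @ P.T).astype(int)
        x, y = xy[:, 0], xy[:, 1]
        h, w = mask.shape
        valid = (x >= 0) & (x < w) & (y >= 0) & (y < h)
        hit = np.zeros(len(grid), dtype=bool)
        hit[valid] = mask[y[valid], x[valid]]
        keep &= hit   # a voxel survives only if every view sees it as foreground
    return keep


# Toy scene: a 16^3 grid carved by two axis-aligned views.
r = np.arange(16)
grid = np.stack(np.meshgrid(r, r, r, indexing="ij"), axis=-1).reshape(-1, 3).astype(float)

P_front = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0]])  # pixel (x, y) = (X, Y)
P_side = np.array([[0, 0, 1.0, 0], [0, 1.0, 0, 0]])   # pixel (x, y) = (Z, Y)
mask_front = np.zeros((16, 16), dtype=bool)
mask_front[4:12, 4:12] = True                          # Y in 4..11, X in 4..11
mask_side = np.zeros((16, 16), dtype=bool)
mask_side[4:12, 6:10] = True                           # Y in 4..11, Z in 6..9

keep = visual_hull([mask_front, mask_side], [P_front, P_side], grid)
```

The two silhouettes carve out an 8 x 8 x 4 box. Adding more views can only remove voxels, never add them.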

Page 19:

Step 2 of 3: 3D Reconstruction (Visual Hull)

Making the multiview reconstruction assumptions hold

Sampling approach

• Randomly select multiple pairs of silhouettes, hoping that one pair arises from shape surrogates

• Bias sampling to most informative viewpoints

Cars Aeroplanes

Page 20:

Step 2 of 3: 3D Reconstruction (Visual Hull)

Making the multiview reconstruction assumptions hold

Sampling approach

• Randomly select multiple pairs of silhouettes, hoping that one pair arises from shape surrogates

• Bias sampling to most informative viewpoints

Typically:

• Left/Right

• Top/Bottom

• Front/Back

Page 21:

Step 2 of 3: 3D Reconstruction (Visual Hull)

Principal Component Analysis on 3D points from SFM returns an intuitive set of 3 informative viewpoints

Cluster together objects up to 15° away from these viewpoints

Cars Aeroplanes

Informative viewpoints = PCA(3D points from SFM)
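The two statements on this slide (PCA on the SFM points, then grouping objects within 15° of each viewpoint) can be sketched as follows. The assumptions are flagged here and in the comments: each object's viewpoint is represented by a 3D viewing direction in the class frame, and a direction and its opposite are merged into one cluster, matching the Left/Right, Top/Bottom, Front/Back pairing mentioned earlier. The function names and toy data are hypothetical.

```python
import numpy as np


def principal_directions(points):
    """Unit principal axes (rows) of an (n, 3) point set, largest variance first."""
    centred = points - points.mean(axis=0)
    _, _, Vt = np.linalg.svd(centred, full_matrices=False)
    return Vt


def assign_to_viewpoints(view_dirs, axes, max_angle_deg=15.0):
    """Index of the axis within max_angle_deg of each viewing direction, else -1.

    Assumption: opposite directions (e.g. left and right) share a cluster,
    so the absolute cosine is used.
    """
    v = view_dirs / np.linalg.norm(view_dirs, axis=1, keepdims=True)
    cos = np.abs(v @ axes.T)
    best = cos.argmax(axis=1)
    angle = np.degrees(np.arccos(np.clip(cos.max(axis=1), 0.0, 1.0)))
    return np.where(angle <= max_angle_deg, best, -1)


# Toy "class shape": long along X, medium along Y, thin along Z.
points = np.array([[5, 0, 0], [-5, 0, 0],
                   [0, 2, 0], [0, -2, 0],
                   [0, 0, 0.5], [0, 0, -0.5]], dtype=float)
axes = principal_directions(points)

view_dirs = np.array([[1.0, 0.1, 0.0],    # about 6 degrees from the X axis
                      [0.0, -1.0, 0.2],   # about 11 degrees from the Y axis
                      [1.0, 1.0, 0.0],    # 45 degrees from both: unassigned
                      [0.0, 0.0, -1.0]])  # on the Z axis
labels = assign_to_viewpoints(view_dirs, axes)
```

Objects labelled -1 are simply not used as candidate surrogate views in this sketch.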

Page 22:

Step 2 of 3: Visual Hull Reconstruction

Randomly sample silhouettes from 2 out of the 3 clusters multiple times and reconstruct from each combination with target image (in gray)

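The sampling step described on this slide can be sketched as repeated draws of two distinct viewpoint clusters and one silhouette from each. Each draw would then be combined with the target image and passed to a visual hull routine. The cluster dictionary, the silhouette ids and the function name `sample_surrogate_pairs` are hypothetical.

```python
import random


def sample_surrogate_pairs(clusters, num_samples, seed=0):
    """Draw pairs of candidate surrogate silhouettes.

    clusters: dict mapping a viewpoint-cluster id to a list of silhouette ids.
    Each draw picks 2 distinct non-empty clusters, then one silhouette from each.
    Raises ValueError if fewer than 2 clusters are non-empty.
    """
    rng = random.Random(seed)
    usable = [c for c, items in clusters.items() if items]
    draws = []
    for _ in range(num_samples):
        a, b = rng.sample(usable, 2)
        draws.append(((a, rng.choice(clusters[a])), (b, rng.choice(clusters[b]))))
    return draws


clusters = {
    "left/right": ["img_003", "img_017", "img_042"],
    "top/bottom": ["img_008"],
    "front/back": ["img_011", "img_029"],
}
draws = sample_surrogate_pairs(clusters, num_samples=5)
```

Fixing the seed keeps the set of candidate reconstructions reproducible across runs.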

Page 23:

Step 2 of 3: Imprinted Visual Hull Reconstruction

Optimize each reconstruction to conform exactly to the reference silhouette

Figure panels: non-imprinted, imprinted, reference silhouette

Page 24:

Step 3 of 3: Reconstruction Ranking

Select mesh whose projected boundaries best match average masks

Figure panels: car average masks and SFM model; target object; reconstruction ranking (arrow labelled "Better"); selected reconstruction
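One way to picture the ranking step is to score every candidate by how well its projections agree with the class average masks and sort by that score. The slide states that projected boundaries are matched against the average masks but does not give the scoring function. The sketch below swaps in a region-overlap score (a soft intersection-over-union) as a stand-in, and all names are hypothetical.

```python
import numpy as np


def soft_iou(projected, average):
    """Overlap between a binary projected mask and an average mask in [0, 1]."""
    projected = projected.astype(float)
    inter = np.minimum(projected, average).sum()
    union = np.maximum(projected, average).sum()
    return inter / union if union > 0 else 0.0


def rank_reconstructions(candidate_masks, average_masks):
    """Order candidates from best to worst mean overlap.

    candidate_masks: one list per candidate, holding one projected mask per viewpoint.
    average_masks:   one average mask per viewpoint, in the same order.
    """
    scores = [np.mean([soft_iou(m, a) for m, a in zip(masks, average_masks)])
              for masks in candidate_masks]
    order = np.argsort(scores)[::-1]
    return order.tolist(), scores


average = np.zeros((8, 8))
average[2:6, 2:6] = 1.0                       # toy average mask for one viewpoint

too_big = np.ones((8, 8), dtype=bool)         # covers the whole image
exact = average > 0.5                         # matches the average mask
too_small = np.zeros((8, 8), dtype=bool)
too_small[3:6, 3:6] = True

order, scores = rank_reconstructions([[too_big], [exact], [too_small]], [average])
```

Here the exact match ranks first, the slightly small candidate second, and the over-inflated one last.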

Page 25:

Experiments

Reconstructed 9,087 annotated and unoccluded objects from the 20 PASCAL VOC categories

Also reconstructed 1000 renderings of a synthetic extension of PASCAL VOC for obtaining quantitative results

Page 26:

Synthetic Dataset: Reconstruction Error

Smaller is better

Figure panels: shape inflation, our results, SFM convex hull

Page 27:

Smaller is better

Synthetic Dataset: Reconstruction Error

Playing with puffball: simple scale-invariant inflation for use in vision and graphics, N. Twarog, M. Tappen, and E. Adelson, In ACM Symp. on Applied Perception, 2012

Figure panels: shape inflation, our results, SFM convex hull

Page 28:

Synthetic Dataset: Reconstruction Error

Smaller is better

Figure panels: shape inflation, this method, SFM convex hull

Page 29:

Synthetic Dataset: Reconstruction Error

Inflation: shape inflation baseline

SFMCvxHull: convex hull of SFM points

class          Ours    Inflation   SFMCvxHull
aeroplane      3.58    9.64        5.79
bicycle        4.30    10.51       6.56
bird           9.98    8.76        12.01
boat           5.91    8.81        6.52
bottle         8.09    6.25        12.13
bus            6.45    11.02       7.34
car            3.04    11.07       3.22
cat            6.98    11.39       9.61
chair          5.36    8.13        7.37
cow            5.44    9.17        7.50
dining table   8.97    8.67        9.52
dog            7.08    11.61       9.91
horse          6.05    6.90        7.41
motorbike      4.12    9.24        5.32
person         7.35    9.14        19.46
potted plant   7.72    7.58        17.86
sheep          7.18    8.77        7.16
sofa           6.11    8.06        5.75
train          15.73   17.01       17.47
tv/monitor     9.73    9.67        10.08
mean           6.96    9.57        9.40
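The table can be checked with a few lines of arithmetic. The sketch below recomputes the column means from the per-class rows and counts how often each column has the lowest error. It assumes the three numeric columns are, in order, the proposed method, Inflation and SFMCvxHull. That order is an inference from the legend and is not stated explicitly in the slide text.

```python
# Per-class reconstruction errors copied from the slide (smaller is better).
# Assumed column order: proposed method, Inflation, SFMCvxHull.
rows = {
    "aeroplane": (3.58, 9.64, 5.79),
    "bicycle": (4.3, 10.51, 6.56),
    "bird": (9.98, 8.76, 12.01),
    "boat": (5.91, 8.81, 6.52),
    "bottle": (8.09, 6.25, 12.13),
    "bus": (6.45, 11.02, 7.34),
    "car": (3.04, 11.07, 3.22),
    "cat": (6.98, 11.39, 9.61),
    "chair": (5.36, 8.13, 7.37),
    "cow": (5.44, 9.17, 7.5),
    "dining table": (8.97, 8.67, 9.52),
    "dog": (7.08, 11.61, 9.91),
    "horse": (6.05, 6.9, 7.41),
    "motorbike": (4.12, 9.24, 5.32),
    "person": (7.35, 9.14, 19.46),
    "potted plant": (7.72, 7.58, 17.86),
    "sheep": (7.18, 8.77, 7.16),
    "sofa": (6.11, 8.06, 5.75),
    "train": (15.73, 17.01, 17.47),
    "tv/monitor": (9.73, 9.67, 10.08),
}

# Column means, rounded to two decimals like the "mean" row of the table.
means = tuple(round(sum(r[c] for r in rows.values()) / len(rows), 2) for c in range(3))

# Number of classes on which each column has the lowest error.
wins = [0, 0, 0]
for errors in rows.values():
    wins[errors.index(min(errors))] += 1

print(means)  # (6.96, 9.57, 9.4)
print(wins)   # [13, 5, 2]
```

The recomputed means agree with the "mean" row. The first column is lowest on 13 of the 20 classes, with the exceptions being bird, bottle, dining table, potted plant, sheep, sofa and tv/monitor.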

Page 30:

Code available online: http://www2.isr.uc.pt/~joaoluis/carvi/index.html

Conclusions

Rigid SFM can be made robust to challenging intra-category variation

Class-based reconstruction by sampling visual hulls with different putative surrogate shapes

Bootstrapped coarse 3D viewpoint and shape information from existing 2D annotations on PASCAL VOC

Future work:

• Learn more powerful recognition models from the new 3D data

• Relax need for annotations

Page 31:

Thanks!