Learning Dense Correspondence via 3D-guided Cycle Consistency
Tinghui Zhou¹, Philipp Krähenbühl¹, Mathieu Aubry², Qixing Huang³, Alexei A. Efros¹
¹UC Berkeley, ²ENPC ParisTech, ³TTI-Chicago
The Unreasonable Effectiveness of Deep Learning?
[Figure: bar chart of performance gain over traditional methods (y-axis, 0 to 60%) for object detection, semantic seg., human pose, intrinsic image, video seg., and dense matching; annotated "Lots of direct labels" and "Very few direct labels"]
Dense Semantic Correspondence
Traditional Pairwise Methods
• SIFT flow: Liu et al., ECCV 2008
• Generalized PatchMatch: Barnes et al., ECCV 2010
• Deformable Spatial Pyramid: Kim et al., CVPR 2013
[Figure: hand-crafted features from each image → feature matching]
Collection Correspondence
• Congealing: Learned-Miller, PAMI 2006
• Collection Flow: Kemelmacher-Shlizerman et al., CVPR 2012
• Object discovery and segmentation: Rubinstein et al., CVPR 2013
• Compositional Image Model: Mobahi et al., CVPR 2014
• Object discovery and localization: Cho et al., CVPR 2015
• FlowWeb: T. Zhou et al., CVPR 2015
• Multi-image Matching: X. Zhou et al., ICCV 2015
Labels for CNN Training?
• Dense correspondence labels for CNN training are infeasible to collect at large scale
Cycle-consistency as Supervision
• Composite flows along a cycle should be zero
• 2-cycle consistency: $F_{i,j} \circ F_{j,i} = 0$
• 3-cycle consistency: $F_{i,k} \circ F_{k,j} \circ F_{j,i} = 0$
[Figure: the amount of inconsistency is fed back to the CNN as supervision]
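The cycle constraints above can be illustrated with a minimal sketch. It assumes flows are stored as integer pixel offsets on a small grid and that landing points are clamped to the image border; both are simplifications chosen for this example, and the names `compose` and `inconsistency` are hypothetical, not taken from the talk.

```python
from typing import List, Tuple

# flow[y][x] = (dx, dy): integer offset from pixel (x, y) to its match (assumption).
Flow = List[List[Tuple[int, int]]]


def compose(f_ab: Flow, f_bc: Flow) -> Flow:
    """Follow f_ab from image a to image b, then f_bc from b to c."""
    h, w = len(f_ab), len(f_ab[0])
    out: Flow = []
    for y in range(h):
        row = []
        for x in range(w):
            dx1, dy1 = f_ab[y][x]
            # Landing point in image b, clamped to the border (assumption).
            xb = min(max(x + dx1, 0), w - 1)
            yb = min(max(y + dy1, 0), h - 1)
            dx2, dy2 = f_bc[yb][xb]
            row.append((dx1 + dx2, dy1 + dy2))
        out.append(row)
    return out


def inconsistency(cycle_flow: Flow) -> float:
    """Mean Euclidean length of a composite flow; 0 means a consistent cycle."""
    total = 0.0
    count = 0
    for row in cycle_flow:
        for dx, dy in row:
            total += (dx * dx + dy * dy) ** 0.5
            count += 1
    return total / count


# 2-cycle: shift right by one pixel, then back left by one pixel.
f_ij = [[(1, 0)] * 3 for _ in range(2)]
f_ji = [[(-1, 0)] * 3 for _ in range(2)]
print(inconsistency(compose(f_ij, f_ji)))
```

A 3-cycle is checked the same way with `compose(compose(f_ik, f_kj), f_ji)`. Real flow fields are sub-pixel, so a practical version would interpolate rather than index integer positions.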
Cycle Consistency in Vision
• Shape matching: Huang et al., SGP'13
• Co-segmentation: Wang et al., ICCV'13
• SfM: Zach et al., CVPR'10
• Collection correspondence: Zhou et al., CVPR'15; Zhou et al., ICCV'15
Could be consistent but wrong…
[Figure: three flow fields whose entries are all zero. Zero flows compose to zero along any cycle, so they are cycle-consistent even though they are not correct correspondences.]
Need an anchor edge!
Synthetic Correspondence as the Anchor
3D CAD Model
Viewpoint Renderer
Correspondence from renderer
3D-guided Cycle Consistency
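[Figure: cycle over synthetic images s1, s2 and real images r1, r2, with edges $F_{s_1,r_1}$, $F_{r_1,r_2}$, $F_{r_2,s_2}$ and $\tilde{F}_{s_1,s_2}$ (ground truth)]
$\tilde{F}_{s_1,s_2} = F_{s_1,r_1} \circ F_{r_1,r_2} \circ F_{r_2,s_2}$
• Accumulate flow vectors along the cycle
• Applies at training time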
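Training objective over quadruplets $\langle s_1, s_2, r_1, r_2 \rangle$:
$\min \sum_{\langle s_1, s_2, r_1, r_2 \rangle} \mathcal{L}\left(\tilde{F}_{s_1,s_2},\; F_{s_1,r_1} \circ F_{r_1,r_2} \circ F_{r_2,s_2}\right)$
where $\tilde{F}_{s_1,s_2}$ is the ground truth from the renderer.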
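The objective above compares the rendered ground-truth flow with the flow accumulated along the path s1 → r1 → r2 → s2. The following is a rough sketch of that comparison for a single quadruplet. The slides do not specify the loss $\mathcal{L}$, so mean Euclidean endpoint error is used here as a stand-in; flows are dictionaries of integer offsets, pixels whose path leaves the grid are skipped, and all names (`accumulate`, `cycle_loss`, `constant_flow`) are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Pixel = Tuple[int, int]
Flow = Dict[Pixel, Tuple[int, int]]  # pixel (x, y) -> offset (dx, dy)


def accumulate(flows: List[Flow], p: Pixel) -> Optional[Tuple[int, int]]:
    """Accumulate flow vectors along a chain of flows, starting at pixel p."""
    x, y = p
    for f in flows:
        if (x, y) not in f:
            return None  # the path left the image; no prediction for p
        dx, dy = f[(x, y)]
        x, y = x + dx, y + dy
    return (x - p[0], y - p[1])


def cycle_loss(gt: Flow, f_s1_r1: Flow, f_r1_r2: Flow, f_r2_s2: Flow) -> float:
    """Mean endpoint error between gt and the composed flow (stand-in for L)."""
    errors = []
    for p, (gx, gy) in gt.items():
        pred = accumulate([f_s1_r1, f_r1_r2, f_r2_s2], p)
        if pred is None:
            continue
        errors.append(math.hypot(pred[0] - gx, pred[1] - gy))
    return sum(errors) / len(errors) if errors else 0.0


def constant_flow(w: int, h: int, dx: int, dy: int) -> Flow:
    """Helper for the demo: the same offset at every pixel of a w x h grid."""
    return {(x, y): (dx, dy) for x in range(w) for y in range(h)}


gt = constant_flow(4, 4, 2, 1)
loss = cycle_loss(gt, constant_flow(4, 4, 1, 0),
                  constant_flow(4, 4, 0, 1), constant_flow(4, 4, 1, 0))
print(loss)
```

In training, the three predicted flows would come from the network and the loss would need to be differentiable, for example through interpolated sampling; this sketch only shows the bookkeeping.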
Network Architecture
[Figure: network architecture. Source and target images pass through convolutional encoders with weight sharing, and a decoder outputs the flow field.]
Matchability Prediction
[Figure: source and target images → CNN → flow field and matchability]
• Background: ✗
• Occlusion: ✗
Training Set Construction
• PASCAL 3D (Bbox + Viewpoint): Xiang et al., WACV'14
• ShapeNet (Synthetic Rendering): Chang et al., arXiv'15
• Image-to-shape retrieval: Single view reconstruction via joint analysis of image and shape collections, Huang et al., SIGGRAPH 2015
[Figure: one training example]
• ~80,000 examples per category
• A single network for all 12 PASCAL3D categories (aero, boat, bus, car, chair, etc.)
RESULTS
Image Warping Visualization
[Figure: source and target image pairs with warping results from SIFT flow and ours]
Keypoint Transfer
[Figure: source and target images with transferred keypoints, SIFT flow vs. ours]
Accuracy (PCK):
Category   SIFT flow   Ours
Mean       19.6        24.0
…
Car        22.4        33.3
Bus        28.6        40.3
Bottle     28.3        40.3
TV         42.9        51.1
…
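For reference, PCK (percentage of correct keypoints) counts a transferred keypoint as correct when it lands within a distance threshold of its ground-truth location. The sketch below assumes the common convention of a threshold of `alpha * max(height, width)` with `alpha = 0.1`; the slides do not state which threshold was used, and the function name `pck` is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def pck(pred: List[Point], gt: List[Point],
        height: int, width: int, alpha: float = 0.1) -> float:
    """Percentage of predicted keypoints within alpha * max(h, w) of ground truth."""
    threshold = alpha * max(height, width)  # assumed convention, not from the slides
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return 100.0 * correct / len(gt)


# Four keypoints on a 100 x 100 target image; one lands 30 pixels off.
pred = [(10.0, 10.0), (50.0, 50.0), (90.0, 20.0), (30.0, 70.0)]
gt = [(12.0, 11.0), (50.0, 80.0), (90.0, 25.0), (30.0, 70.0)]
print(pck(pred, gt, height=100, width=100))  # 75.0
```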
Matchability Prediction
[Figure: source, target, our matchability prediction, ground truth]
Accuracy: SIFT flow 64.5, Ours 72.0
t-SNE Feature Visualization
[Figure: network architecture with weight-sharing source and target encoders and two decoder outputs, flow field and matchability; the encoder output is marked as global image features]
[Figure: t-SNE embedding of the global image features, with regions labeled side views, 45° views, and frontal views]
Application: Cross-domain Dense Label Transfer
[Figure columns: Source, Target, Dense CRF, SIFT flow, Ours]
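One plausible way to transfer dense labels with a predicted flow field and matchability map is sketched below. This is an illustration under stated assumptions, not the procedure from the talk: the flow is assumed to map each target pixel to a source pixel with integer offsets, unmatchable or out-of-bounds pixels fall back to a background label, and `transfer_labels` is a hypothetical name.

```python
from typing import List, Tuple

Labels = List[List[int]]
Flow = List[List[Tuple[int, int]]]  # flow[y][x] = (dx, dy), target -> source (assumption)
Mask = List[List[bool]]


def transfer_labels(source_labels: Labels, flow_t2s: Flow,
                    matchable: Mask, background: int = 0) -> Labels:
    """Copy each matchable target pixel's label from its matched source pixel."""
    h, w = len(flow_t2s), len(flow_t2s[0])
    hs, ws = len(source_labels), len(source_labels[0])
    out = [[background] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not matchable[y][x]:
                continue  # e.g. background or occluded pixels keep the default
            dx, dy = flow_t2s[y][x]
            xs, ys = x + dx, y + dy
            if 0 <= ys < hs and 0 <= xs < ws:
                out[y][x] = source_labels[ys][xs]
    return out


source_labels = [[1, 2, 3], [4, 5, 6]]
flow = [[(1, 0)] * 3 for _ in range(2)]
mask = [[False, True, True], [True, True, True]]
print(transfer_labels(source_labels, flow, mask))  # [[0, 3, 0], [5, 6, 0]]
```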
Conclusion
• Cycle consistency is effective when direct labels are not available
• 'Meta'-supervision: supervising the behavior of the data
Thank you!