Occlusion Aware Unsupervised Learning of Optical Flow

Yang Wang, Yi Yang, Zhenheng Yang, Liang Zhao, Peng Wang, Wei Xu

Baidu Research, Institute of Deep Learning (IDL)

Background – Self-supervised Learning

[1] Jaderberg et al, Reinforcement learning with unsupervised auxiliary tasks, ICLR 2017

[2] Pathak et al, Curiosity-driven exploration by self-supervised prediction, ICML 2017

[3] Eslami et al, Neural scene representation and rendering, Science 2018

• Learning without human annotation

Cherry On the Cake

Reinforcement Learning

Supervised Learning

Self-supervised Learning

But We Focus on Self-supervised Optical Flow

• Optical flow: [Δx_i, Δy_i] encodes the 2D motion of every pixel i

Brox and Malik, Large displacement optical flow: descriptor matching in variational motion estimation, PAMI 2011

Why Optical Flow?

• Optical flow is very useful, but ground truth is hard to obtain.
• Optical flow techniques also apply to stereo depth estimation.
• Results can be evaluated on standard benchmark datasets.

• Depth and flow extend the "state" representation from 2D to 4D in RL.
• RGB, depth and flow are complementary to each other.
• Stereo Video -> Optical flow + Depth -> Attention / Recognition.

Why Optical Flow?

• A 4-month-old baby can:
  • Follow an object (Tracking, Flow)
  • Reach and grasp an object (Depth, Shape)
  • Pay attention to small objects (Attention)

Previous Work

• Idea: Learning to predict optical flow to minimize photometric loss

[1] Ahmadi et al, Unsupervised convolutional neural networks for motion estimation, ICIP 2016

[2] Jason et al, Back to Basics: Unsupervised learning of optical flow via brightness constancy and motion, ECCV 2016 Workshops

[3] Zhe et al, Unsupervised Deep Learning for Optical Flow Estimation, AAAI 2017

Warping Using Optical Flow

• Use image 2 and the optical flow to reconstruct image 1 with bilinear interpolation.
• Can be implemented with Spatial Transformer Networks.
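The warping step can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the authors' implementation: the function names `bilinear_sample` and `warp_with_flow`, the grayscale list-of-rows image layout, and the clamp-to-border policy are all assumptions made for the example.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a grayscale image (list of rows) at a real-valued position (x, y)."""
    h, w = len(image), len(image[0])
    # Assumed border policy: clamp the sampling position to the image bounds.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    # Blend the four nearest neighbours: first along x, then along y.
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom


def warp_with_flow(image2, flow):
    """Reconstruct image 1 by sampling image 2 at (x + dx, y + dy).

    flow[y][x] is a (dx, dy) pair for the pixel at column x, row y.
    """
    h, w = len(image2), len(image2[0])
    return [
        [bilinear_sample(image2, x + flow[y][x][0], y + flow[y][x][1])
         for x in range(w)]
        for y in range(h)
    ]


image2 = [[0.0, 1.0, 2.0],
          [3.0, 4.0, 5.0]]
half_pixel_right = [[(0.5, 0.0)] * 3 for _ in range(2)]
reconstruction = warp_with_flow(image2, half_pixel_right)
```

Each output pixel is a weighted sum of only four input pixels, which is the property the enlarged-search slide later refers to.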

Occlusion Problem in Warping: correct flow but wrong reconstruction

Problem – Occlusion affects Performance

Our Solution – Model Occlusion Explicitly

Occlusion Map

Occlusion-Aware Photometric Loss
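The formula on this slide did not survive extraction. As a rough illustration of the idea only, the sketch below assumes the occlusion map is given as a visibility mask (1 for non-occluded pixels, 0 for occluded ones) and that the photometric term is a mean absolute difference restricted to visible pixels. The function name and the choice of absolute difference are assumptions, not the slide's definition.

```python
def occlusion_aware_photometric_loss(image1, warped, visible):
    """Mean absolute difference between image1 and the warped image,
    counted only where visible[y][x] == 1 (assumed mask convention)."""
    total = 0.0
    count = 0.0
    for row1, row_w, row_m in zip(image1, warped, visible):
        for p1, pw, m in zip(row1, row_w, row_m):
            total += m * abs(p1 - pw)  # occluded pixels contribute nothing
            count += m
    # Guard against a fully occluded image.
    return total / count if count > 0 else 0.0


image1 = [[1.0, 2.0], [3.0, 4.0]]
warped = [[1.0, 0.0], [3.0, 8.0]]
visible = [[1, 1], [1, 0]]  # the bottom-right pixel is treated as occluded
loss = occlusion_aware_photometric_loss(image1, warped, visible)
```

In this example the large error at the occluded pixel is ignored, so a correct flow is not penalized for a reconstruction it cannot make.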

Model Structure

Improving Backpropagation - Enlarged Search

• The warped pixel depends only on its four nearest neighbors, so if the target position is far from the proposed position, the network gets no meaningful gradient signal during backpropagation.
• We search for the best bilinear interpolation at different scales.

Experiments: Benchmark Datasets

• Dataset Statistics

Dataset          #Unsupervised Train Pairs   #Validation Pairs   Train Has GT Flow
Flying Chairs    22232                       640                 ✔
MPI Sintel       908                         133                 ✔
KITTI 2012       13372                       194                 ✘

Quantitative Results – The Lower the Better

[Bar chart: Average Endpoint Error in pixels (EPE) on Flying Chairs, MPI Sintel Clean, and KITTI 2012 for DSTFlow [1], BackToBasic [2], Ours, and Supervised [3]. Surviving bar labels: 3.6 and 2.7.]

[1] Zhe et al, Unsupervised Deep Learning for Optical Flow Estimation, AAAI 2017

[2] Jason et al, Back to Basics: Unsupervised learning of optical flow via brightness constancy and motion, ECCV 2016 Workshops

[3] Dosovitskiy et al, FlowNet: Learning optical flow with convolutional networks, ICCV 2015
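For reference, average endpoint error is the mean Euclidean distance between predicted and ground-truth flow vectors. The sketch below assumes flow fields flattened into lists of (dx, dy) pairs; the function name is chosen for illustration.

```python
import math


def average_endpoint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    pred and gt are equal-length, non-empty lists of (dx, dy) pairs.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    total = sum(math.hypot(px - gx, py - gy)
                for (px, py), (gx, gy) in zip(pred, gt))
    return total / len(pred)


epe = average_endpoint_error([(3.0, 4.0), (0.0, 0.0)],
                             [(0.0, 0.0), (0.0, 0.0)])
```

Here the first vector is off by 5 pixels and the second is exact, giving an EPE of 2.5 pixels.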

Cherry-Picked Examples – KITTI

Cherry-Picked Examples – MPI Sintel

Ablation Study

Extension: Occlusion Aware Unsupervised Learning of Stereo Depth (Disparity)

• I_L and I_R are the left and right images of a stereo pair.
• For calibrated cameras, the output disparity has only 1 channel, with ReLU.

[Bar chart: Abs Rel on KITTI 2015 Stereo (the lower the better). Surviving legend entries: Godard [1], Supervised* [1]. Surviving bar labels: 0.06 and 0.068.]

Results on KITTI Stereo Disparity Estimation

[1] Godard et al, Unsupervised monocular depth estimation with left-right consistency, CVPR 2017
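The slide does not define Abs Rel. The sketch below assumes the usual definition, the mean of |prediction - ground truth| / ground truth over pixels with valid (positive) ground truth; the function name and the flattened-list input format are chosen for illustration.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error over entries whose ground truth is positive.

    pred and gt are equal-length lists of depth (or disparity) values;
    entries with gt <= 0 are treated as missing and skipped.
    """
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not valid:
        raise ValueError("no valid ground-truth entries")
    return sum(abs(p - g) / g for p, g in valid) / len(valid)


# The third entry has no ground truth (0.0) and is ignored.
score = abs_rel([11.0, 18.0, 7.0], [10.0, 20.0, 0.0])
```

Both valid entries are off by 10% of their ground-truth value, so the score is approximately 0.1.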

Extension 2: Unsupervised Learning of Scene Flow (Flow + Disparity)

[Architecture diagram. Surviving labels: I_L1, FlowNet, Optical Flow, MotionNet, Camera Motion, DispNet, Disparity / Depth, Depth Flow, Moving Object Mask, Fused Flow.]

Cherry-Picked Examples of Moving Object Masks
