OBJECT PERCEPTION FOR ROBOT MANIPULATIONyuxng.github.io/Xiang_TRI_07122019.pdf · · 2020-04-076D...

Yu Xiang, 7/12/2019

OBJECT PERCEPTION FOR ROBOT MANIPULATION

MANIPULATION• The way of making physical changes to the world around us

Vs. question answering or autonomous driving

MANIPULATION REQUIRES INTELLIGENCE• Understanding the 3D environment from sensing

• E.g., Vision, Tactile

• Grasp and motion planning / decision making

• E.g., Obstacle avoidance

• Dynamics / Control

• Learning from experience

ROBOT MANIPULATION

IntelligentSystem

3D EnvironmentLearning

Sensing PlanningPerception Control

6D OBJECT POSE ESTIMATION

USING 3D MODELS OF OBJECTS• The YCB Object and Model Set

B. Calli, A. Singh, A. Walsman, S. Srinivasa, P. Abbeel and A. M. Dollar, "The YCB object and Model set: Towards common benchmarks for manipulation research," International Conference on Advanced Robotics (ICAR), 2015.

6D POSE ESTIMATION FOR GRASP PLANNING

From Graspit!

POSECNN

Yu Xiang, Tanner Schmidt, Venkatraman Narayanan and Dieter Fox. PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes. In RSS, 2018.

PoseCNN Texture-less objects Symmetric objects Occlusions

camera coordinate

POSECNN: DECOUPLE 3D TRANSLATION AND 3D ROTATION

object coordinate2D center

Distance

• 3D Translation• 3D Rotation

2D Center Localization 3D Rotation Regression

POSECNN: SEMANTIC LABELING

LabelsSkip link

Input image

Fully convolutional networkEncoder Decoder

POSECNN: 2D CENTER VOTING FOR HANDLING OCCLUSIONS

center

POSECNN: 3D TRANSLATION ESTIMATION

Labels#classes

Center direction X

Center direction Y

Center distance

Hough voting layer

3 × #classes

POSECNN: 3D ROTATION REGRESSION

Labels#classes

3 × #classes

Center direction X

Center direction Y

Center distance

Hough voting layer

RoI pooling layers 6D Poses4 × #class

For each RoI

POSECNN: HANDLE SYMMETRIC OBJECTS

POSECNN: 3D ROTATION REGRESSION LOSS FUNCTIONS

Pose Loss (non-symmetric)

Shape-Match Loss for symmetric objects (symmetric)

3D model points

Ground truth rotation

Predicted rotation

IMPLICIT ROTATION LEARNING

Encoder Decoder

Embedding

Input ReconstructionSundermeyer et al. Implicit 3D orientation learning for 6D object detection from RGB images. In ECCV, 2018.

ROTATION ESTIMATION WITH CODEBOOK MATCHING

Encoder

Codebook

191,808 discrete rotations

Similarity scores

Detection

TRAINING DATA: DOMAIN RANDOMIZATION

DEEP ITERATIVE MATCHING FOR 6D OBJECT POSE ESTIMATION

Yi Li*, Gu Wang, Xiangyang Ji, Yu Xiang and Dieter Fox. DeepIM: Deep Iterative Matching for 6D Pose Estimation. In ECCV, 2018 (Oral) (*PhD student at UW).

Initial pose estimation

pose changeDeep Neural Network

pose(0)

Δpose(0)

Network

Observed image

3D model

Renderer

Rendered image

pose(1)

Network

3D model

Renderer

Δpose(1)

DEEPIM PIPELINE

Rendered image

FlowNetConvs [1]

Rotation

Translation

640x480

Observedimage

Renderedimage

Zoomed input

FC256FC256

Feature map

NETWORK STRUCTURE

[1] Dosovitskiy, Alexey and Fischer, Philipp and Ilg, Eddy and Hausser, Philip and Hazirbas, Caner and Golkov, Vladimir and Van Der Smagt, Patrick and Cremers, Daniel and Brox, Thomas. Flownet: Learning optical flow with convolutional networks. In ICCV, 2015.

TRAINING DATA: YCB OBJECTS

6D OBJECT POSE TRACKING

PoseRBPF

Input images Translation Orientationdistribution

Xinke Deng*, Arsalan Mousavian, Yu Xiang, Fei Xia*, Timothy Bretl and Dieter Fox. PoseRBPF: A Rao-Blackwellized Particle Filter for 6D Object Pose Tracking. In RSS, 2019 (*intern at NVIDIA).

Uncertainty in Pose Estimation

PoseRBPF: Particle Representation

3D Translation𝑇

Orientation Distribution𝑷 𝑹𝒊 𝑻𝒊, 𝒁𝟏:𝒌

Encoder

Discretized Rotations

Codebook

ParticleCode

Rotation Likelihood

191,808 bins

Results: YCB ObjectsExample: YCB mug (50 particles, ~20fps)

YCB-Video RGB PoseRBPF:

ADD: 62.1, ADD-S: 78.4 PoseCNN:

ADD: 53.7, ADD-S: 75.9

Results: TLess ObjectsExample: TLess 01 (100 particles, ~11fps)

TLess RGBObject recall for Err_vsd < 0.3: PoseRBPF: 41.47% Sundermeyer et al: 18.35%

Yu Xiang and Dieter Fox. DA-RNN Semantic Mapping with Data Associated Recurrent Neural Network, RSS, 2017.

SEMANTIC MAPPING

UNSEEN OBJECT INSTANCE SEGMENTATION

Christopher Xie*, Yu Xiang, Arsalan Mousavian and Dieter Fox. The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation. Under Review, 2019 (*PhD student at UW).

POSECNN FOR 20 YCB OBJECTS

FUTURE WORK: SELF-SUPERVISED LEARNING

Simulation

Synthetic data model

Leaning

Real environment

Interacting

Updating

Training data in the real world• Reason about uncertainty/failure in

the real world to obtain annotations• Interact with the real world to

collect more data

New environment? Learning in that environment to

adapt the model!

Questions?

OBJECT PERCEPTION FOR ROBOT MANIPULATIONyuxng.github.io/Xiang_TRI_07122019.pdf · · 2020-04-076D...

Documents

3D OBJECT RECOGNITION AND POSE ESTIMATION USING AN …

Real-Time Object Pose Estimation with Pose Interpreter ... · interpreter network is able to generalize to real data. Our end-to-end system for object pose estimation runs in real-time

Fast Object Localization and Pose Estimation in …av21/Documents/pre2011/Fast Object...Fast Object Localization and Pose Estimation in Heavy Clutter for Robotic Bin Picking Ming-Yu

Global Hypothesis Generation for 6D Object Pose …openaccess.thecvf.com/content_cvpr_2017/papers/Michel_Global... · Global Hypothesis Generation for 6D Object Pose Estimation

Recognition Scene understanding / visual object categorization Pose clustering

A Benchmark Dataset for 6DoF Object Pose Tracking

Lecture 23: 3-D Pose Object Recognition

Recognition Scene understanding / visual object categorization Pose clustering

Viewpoint-Aware Object Detection and Continuous Pose

ROBOTIC ASSEMBLY, USING RGBD- BASED OBJECT POSE …

Deep Object Pose Estimation for Semantic Robotic Grasping

Learning 6D Object Pose Estimation using 3D Object Coordinates

Efﬁcient Object Manipulation to an Arbitrary Goal Pose

Chapter 8 Multi-view Object Categorization and Pose …

BOP: Benchmark for 6D Object Pose Estimation

Robust 6D Object Pose Estimation in Cluttered Scenes Using

Total3DUnderstanding: Joint Layout, Object Pose and Mesh ...openaccess.thecvf.com/content_CVPR_2020/papers/Nie_Total...Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction

Single Image 3D Object Detection and Pose Estimation for Graspingkosta/Links/icra2014_object_grasping.pdf · 2014. 12. 8. · Single Image 3D Object Detection and Pose Estimation

Deep Quaternion Pose Proposals for 6D Object Pose Tracking

Fast and automatic object pose estimation for range images