Deep Robotic Learning

Sergey Levine
UC Berkeley, Google Brain

robotic control pipeline

observations -> state estimation (e.g. vision) -> modeling & prediction -> planning -> low-level control -> controls

standard computer vision

features (e.g. HOG) -> mid-level features (e.g. DPM) -> classifier (e.g. SVM)

deep learning

Felzenszwalb ‘08

robotic control pipeline

observations, state

controls

deep robotic learning

observations, state

controls

end-to-end training

no direct supervision

actions have consequences

1. Does end-to-end learning produce better sensorimotor skills?

2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?

3. Can we scale up deep robotic learning and produce skills that generalize?

4. How can we learn safely and efficiently in safety-critical domains?

5. Can we transfer skills from simulation to the real world, and from one robot to another?






Chelsea Finn

end-to-end training

0% success rate

96.3% success rate

pose prediction

(trained on pose only)

L.*, Finn*, Darrell, Abbeel, ‘16






Deep Robotic Learning Applications

manipulation

locomotion

with N. Wagener, P. Abbeel

with V. Kumar, A. Gupta, E. Todorov

with V. Koltun

aerial vehicles

with G. Kahn, T. Zhang, P. Abbeel

tensegrity robot

with X. Geng, M. Zhang, J. Bruce, K. Caluwaerts, M. Vespignani, V. SunSpiral, P. Abbeel

dexterous hands

with C. Eppner, A. Gupta, P. Abbeel

soft hands






ingredients for success in learning:

supervised learning: computation, algorithms, data

learning robotic skills: computation, algorithms, ~data?

monocular RGB camera

7 DoF arm

2-finger gripper

object bin

Grasping with Learned Hand-Eye Coordination

• monocular camera (no depth)
• no camera calibration either
• 2-5 Hz update
• continuous arm control
• servo the gripper to target
• fix mistakes
• no prior knowledge

L., Pastor, Krizhevsky, Quillen ‘16

Peter Pastor, Alex Krizhevsky, Deirdre Quillen
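
The bullets above describe closed-loop servoing: the controller re-plans a few times per second and corrects its own mistakes as it goes. A minimal sketch of such a loop follows, assuming a learned scorer that rates candidate gripper motions. Everything named here (`score_motion`, `servo_step`, the toy 2-D state, the random-sampling search) is a hypothetical stand-in for illustration and is not taken from the cited work.

```python
import random

def score_motion(gripper, target, motion):
    """Hypothetical stand-in for a learned grasp-success predictor.

    A real system would score (camera image, candidate motion) with a
    neural network; here the score is simply the negative distance to a
    known target after applying the motion.
    """
    nx, ny = gripper[0] + motion[0], gripper[1] + motion[1]
    return -((nx - target[0]) ** 2 + (ny - target[1]) ** 2) ** 0.5

def servo_step(gripper, target, rng, n_candidates=64, max_step=0.1):
    """Sample candidate motions, keep the best-scoring one, and apply it."""
    # "Stay put" is always a candidate, so a step never increases the distance.
    candidates = [(0.0, 0.0)] + [
        (rng.uniform(-max_step, max_step), rng.uniform(-max_step, max_step))
        for _ in range(n_candidates)
    ]
    best = max(candidates, key=lambda m: score_motion(gripper, target, m))
    return (gripper[0] + best[0], gripper[1] + best[1])

def servo(gripper, target, steps=50, seed=0):
    """Run the re-planning loop for a fixed number of control updates."""
    rng = random.Random(seed)
    for _ in range(steps):
        gripper = servo_step(gripper, target, rng)
    return gripper

final = servo(gripper=(0.0, 0.0), target=(0.5, -0.3))
distance = ((final[0] - 0.5) ** 2 + (final[1] + 0.3) ** 2) ** 0.5
print(f"final distance to target: {distance:.3f}")
```

Because every update re-scores motions from the current state, an earlier poor step is corrected by later ones, which is the sense of "fix mistakes" in this sketch.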

Grasping Experiments

Policy Learning with Multiple Robots

Local policy optimization

Global policy optimization

Rollout execution

Mrinal Kalakrishnan, Yevgen Chebotar, Adrian Li, Ali Yahya

Yahya, Li, Kalakrishnan, Chebotar, L., ‘16

Policy Learning with Multiple Robots: Deep RL with NAF

Gu*, Holly*, Lillicrap, L., ‘16

Shane Gu, Ethan Holly, Tim Lillicrap

Learning a Predictive Model of Natural Images

original video

predictions

Chelsea Finn






Safe Uncertainty-Aware Learning

unknown environment

1. Learn a collision prediction model: raw image + command velocities -> neural network ensemble

2. Speed-dependent, uncertainty-aware collision cost

3. Iteratively train with on-policy samples

Key idea: To learn about collisions, must experience collisions (but safely!)

Kahn, Pong, Abbeel, L. ‘16

Greg Kahn
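
Step 2 above can be illustrated with a short sketch. It assumes each ensemble member returns a collision probability for a given observation and commanded speed, and that the cost multiplies speed by a pessimistic estimate (ensemble mean plus a multiple of the ensemble standard deviation). The slide does not give the exact cost, so this functional form, the toy predictors, and the weights `risk_weight` and `progress_weight` are assumptions for illustration only.

```python
import statistics

def make_toy_ensemble():
    """Hypothetical ensemble: each member maps (obstacle_distance, speed)
    to a collision probability. A real model would take a raw image."""
    def member(bias):
        def predict(obstacle_distance, speed):
            raw = bias + speed / (1.0 + obstacle_distance)
            return min(1.0, max(0.0, raw))  # clip to a valid probability
        return predict
    return [member(b) for b in (-0.05, 0.0, 0.05, 0.15)]

def collision_cost(ensemble, obstacle_distance, speed, risk_weight=2.0):
    """Speed-dependent, uncertainty-aware cost (assumed form):
    speed * (mean prediction + risk_weight * std across members)."""
    preds = [m(obstacle_distance, speed) for m in ensemble]
    mean = statistics.mean(preds)
    std = statistics.pstdev(preds)
    return speed * (mean + risk_weight * std)

def pick_speed(ensemble, obstacle_distance, speeds, progress_weight=1.0):
    """Choose the speed that best trades forward progress against cost."""
    return max(
        speeds,
        key=lambda v: progress_weight * v
        - collision_cost(ensemble, obstacle_distance, v),
    )

ensemble = make_toy_ensemble()
speeds = [0.0, 0.5, 1.0, 1.5, 2.0]
near = pick_speed(ensemble, obstacle_distance=0.2, speeds=speeds)
far = pick_speed(ensemble, obstacle_distance=20.0, speeds=speeds)
print("speed near obstacle:", near, "| speed in open space:", far)
```

In this toy setting the controller slows down near an obstacle and speeds up in open space; disagreement among ensemble members raises the cost further, so uncertain situations are also approached slowly.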






Training in Simulation: CAD2RL

Sadeghi, L. ‘16

Fereshteh Sadeghi


Learning with Transfer in Mind: Ensemble Policy Optimization (EPOpt)

train

test

adapt

training on single torso mass

training on model ensemble

unmodeled effects

ensemble adaptation

Aravind Rajeswaran
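
The contrast above (training on a single torso mass versus a model ensemble) can be sketched in a few lines. The sketch assumes the ensemble is a distribution over one simulator parameter, torso mass, and that each policy update uses only the worst-performing fraction of sampled rollouts, as in the published EPOpt algorithm; the slide itself does not spell this out. The toy return function, the one-parameter "policy", and the grid search standing in for a policy-gradient step are all hypothetical.

```python
import random

def rollout_return(gain, torso_mass):
    """Toy stand-in for a simulated rollout: the return is highest when
    the policy gain matches the torso mass."""
    return -(gain - torso_mass) ** 2

def worst_fraction_mean(returns, epsilon):
    """Mean of the worst epsilon-fraction of returns (at least one)."""
    k = max(1, int(epsilon * len(returns)))
    return sum(sorted(returns)[:k]) / k

def train(mass_sampler, epsilon, rng, n_rollouts=200):
    """Pick the gain with the best worst-case return over sampled masses.
    A grid search stands in for the policy optimization step."""
    masses = [mass_sampler(rng) for _ in range(n_rollouts)]
    gains = [g / 10 for g in range(0, 101)]  # candidate gains 0.0 .. 10.0
    return max(
        gains,
        key=lambda g: worst_fraction_mean(
            [rollout_return(g, m) for m in masses], epsilon
        ),
    )

rng = random.Random(0)
# Single model: every rollout uses torso mass 3.0.
single = train(lambda r: 3.0, epsilon=1.0, rng=rng)
# Model ensemble: torso mass drawn from a range; optimize the worst 10%.
ensemble = train(lambda r: r.uniform(3.0, 9.0), epsilon=0.1, rng=rng)

test_mass = 8.0  # a mass the single-model policy never saw in training
print("single-model gain:", single, "return:", rollout_return(single, test_mass))
print("ensemble gain:", ensemble, "return:", rollout_return(ensemble, test_mass))
```

The single-model policy fits its one training mass exactly and degrades sharply at the unseen test mass, while the ensemble-trained policy gives up some peak performance for robustness across the range. The "adapt" step on the slide (updating the ensemble from target-domain data) is not covered by this sketch.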






6. How can we get sufficient supervision to learn in unstructured real-world environments?

Learning what Success Means

Can we learn the goal with visual features?

Finn, Abbeel, L. ‘16

Learning what Success Means

Sermanet, Xu, L. ‘16

ingredients for success in learning:

supervised learning: computation, algorithms, data

learning robotic skills: computation, algorithms, ~data?

Announcement: New Conference
Conference on Robotic Learning (CoRL)
www.robot-learning.org

Goal: bring together robotics & machine learning in a focused conference format

Conference: November 2017
Papers deadline: late June 2017
Steering committee: Ken Goldberg (UC Berkeley), Sergey Levine (UC Berkeley), Vincent Vanhoucke (Google), Abhinav Gupta (CMU), Stefan Schaal (USC, MPI), Michael I. Jordan (UC Berkeley), Raia Hadsell (DeepMind), Dieter Fox (UW), Joelle Pineau (McGill), J. Andrew Bagnell (CMU), Aude Billard (EPFL), Stefanie Tellex (Brown), Minoru Asada (Osaka), Wolfram Burgard (Freiburg), Pieter Abbeel (UC Berkeley)

Chelsea Finn, Peter Pastor, Alex Krizhevsky, Deirdre Quillen, Mrinal Kalakrishnan, Yevgen Chebotar, Adrian Li, Ali Yahya, Shane Gu, Ethan Holly, Tim Lillicrap, Greg Kahn, Fereshteh Sadeghi, Aravind Rajeswaran, Pieter Abbeel, Trevor Darrell
