Deep Robotic Learning

Sergey Levine (UC Berkeley, Google Brain)

robotic control pipeline: observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls

standard computer vision: features (e.g. HOG) → mid-level features (e.g. DPM) → classifier (e.g. SVM)

deep learning: end-to-end training

Felzenszwalb ‘08

robotic control pipeline: observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls

deep robotic learning: observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls, with end-to-end training

no direct supervision

actions have consequences

1. Does end-to-end learning produce better sensorimotor skills?

2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?

3. Can we scale up deep robotic learning and produce skills that generalize?

4. How can we learn safely and efficiently in safety-critical domains?

5. Can we transfer skills from simulation to the real world, and from one robot to another?

1. Does end-to-end learning produce better sensorimotor skills?

Chelsea Finn

end-to-end training: 96.3% success rate

pose prediction (trained on pose only): 0% success rate

L.*, Finn*, Darrell, Abbeel, ‘16

2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?

Deep Robotic Learning Applications

manipulation (with N. Wagener, P. Abbeel)
dexterous hands (with V. Kumar, A. Gupta, E. Todorov)
locomotion (with V. Koltun)
aerial vehicles (with G. Kahn, T. Zhang, P. Abbeel)
tensegrity robot (with X. Geng, M. Zhang, J. Bruce, K. Caluwaerts, M. Vespignani, V. SunSpiral, P. Abbeel)
soft hands (with C. Eppner, A. Gupta, P. Abbeel)

3. Can we scale up deep robotic learning and produce skills that generalize?

ingredients for success in learning:

supervised learning: computation, algorithms, data

learning robotic skills: computation, algorithms ~, data ?

Grasping with Learned Hand-Eye Coordination

[figure labels: monocular RGB camera, 7 DoF arm, 2-finger gripper, object bin]

• monocular camera (no depth)
• no camera calibration either
• 2-5 Hz update
• continuous arm control
• servo the gripper to target
• fix mistakes
• no prior knowledge

L., Pastor, Krizhevsky, Quillen ‘16

Peter Pastor, Alex Krizhevsky, Deirdre Quillen
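
The closed-loop idea in the bullets above (servo the gripper to the target, update continuously, fix mistakes) can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the method from the cited work: `predict_grasp_success` is a hypothetical stand-in for a learned predictor (here it simply scores distance to a made-up target position instead of looking at an image), and choosing motions by random sampling is an assumption made for illustration.

```python
import math
import random


def predict_grasp_success(gripper, target, motion):
    """Hypothetical stand-in for a learned grasp-success predictor.

    A learned model would score a camera image and a candidate motion.
    This toy version peeks at the target position so the loop can run.
    """
    moved = [g + m for g, m in zip(gripper, motion)]
    return math.exp(-math.dist(moved, target))


def servo_step(gripper, target, rng, num_candidates=64, max_step=0.05):
    """Sample candidate motions and return the highest-scoring one."""
    candidates = [[0.0, 0.0, 0.0]]  # staying put is always an option
    for _ in range(num_candidates - 1):
        candidates.append([rng.uniform(-max_step, max_step) for _ in range(3)])
    return max(candidates, key=lambda m: predict_grasp_success(gripper, target, m))


def servo(gripper, target, steps=40, seed=0):
    """Re-plan at every step, so an imperfect earlier motion gets corrected."""
    rng = random.Random(seed)
    gripper = list(gripper)
    for _ in range(steps):
        motion = servo_step(gripper, target, rng)
        gripper = [g + m for g, m in zip(gripper, motion)]
    return gripper


start = [0.0, 0.0, 0.0]
target = [0.3, -0.2, 0.1]
final = servo(start, target)
print(math.dist(start, target), math.dist(final, target))
```

Because each step re-scores fresh candidates from the current position, a poor motion at one step is compensated at the next, which is the sense in which a servoing loop can "fix mistakes".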

Grasping Experiments

Policy Learning with Multiple Robots

Local policy optimization

Global policy optimization

Rollout execution

Mrinal Kalakrishnan, Yevgen Chebotar, Adrian Li, Ali Yahya

Yahya, Li, Kalakrishnan, Chebotar, L., ‘16

Policy Learning with Multiple Robots: Deep RL with NAF

Gu*, Holly*, Lillicrap, L., ‘16

Shane Gu, Ethan Holly, Tim Lillicrap
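
For readers unfamiliar with the acronym, NAF stands for normalized advantage functions, where the Q-function is written as a state value plus an advantage term that is quadratic in the action. The sketch below only evaluates that form for given numbers. In practice the quantities `value`, `mu`, and `L` would be outputs of a neural network for the current state; here they are plain hypothetical values chosen for illustration.

```python
def naf_q_value(action, mu, L, value):
    """Q(s, a) = V(s) - 0.5 * (a - mu)^T P (a - mu), with P = L L^T.

    L is lower triangular, so P is positive semidefinite and the
    action that maximizes Q is simply mu.
    """
    diff = [a - m for a, m in zip(action, mu)]
    n = len(diff)
    # (a - mu)^T L L^T (a - mu) equals the squared norm of L^T (a - mu).
    lt_diff = [sum(L[i][j] * diff[i] for i in range(n)) for j in range(n)]
    return value - 0.5 * sum(x * x for x in lt_diff)


# Hypothetical network outputs for one state.
mu = [0.2, -0.1]
L = [[1.0, 0.0], [0.5, 2.0]]
value = 3.0

print(naf_q_value(mu, mu, L, value))            # the greedy action attains V(s)
print(naf_q_value([0.3, -0.1], mu, L, value))   # any other action scores lower
```

The appeal of this parameterization is that the greedy action is available in closed form, which makes Q-learning usable with continuous arm commands.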

Learning a Predictive Model of Natural Images

original video

predictions

Chelsea Finn

4. How can we learn safely and efficiently in safety-critical domains?

Safe Uncertainty-Aware Learning

Key idea: To learn about collisions, must experience collisions (but safely!)

unknown environment

1. Learn a collision prediction model (inputs: raw image, command velocities; model: neural network ensemble)

2. Speed-dependent, uncertainty-aware collision cost

3. Iteratively train with on-policy samples

Kahn, Pong, Abbeel, L. ‘16

Greg Kahn
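
One way to read "speed-dependent, uncertainty-aware collision cost" is sketched below. The exact form is an assumption made for illustration, not taken from the slides: the cost scales with speed and with the ensemble's mean collision probability plus a multiple of the ensemble's disagreement. The function name, the `risk_weight` parameter, and the example probabilities are all hypothetical.

```python
import statistics


def collision_cost(ensemble_probs, speed, risk_weight=1.0):
    """Assumed cost form: speed * (mean probability + risk_weight * spread).

    ensemble_probs holds one collision probability per ensemble member.
    Disagreement between members is treated as extra risk.
    """
    mean = statistics.fmean(ensemble_probs)
    spread = statistics.pstdev(ensemble_probs)
    return speed * (mean + risk_weight * spread)


# Hypothetical predictions: same mean, different amounts of disagreement.
familiar = [0.10, 0.10, 0.10]    # ensemble members agree
unfamiliar = [0.00, 0.10, 0.20]  # ensemble members disagree

print(collision_cost(familiar, speed=2.0))
print(collision_cost(unfamiliar, speed=2.0))  # higher cost at the same speed
print(collision_cost(unfamiliar, speed=1.0))  # slowing down lowers the cost
```

Under a cost of this shape, a planner that minimizes it would slow down where the ensemble is unsure, so any collisions experienced in unfamiliar parts of the environment happen at low speed.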

5. Can we transfer skills from simulation to the real world, and from one robot to another?

Training in Simulation: CAD2RL

Sadeghi, L. ‘16

Fereshteh Sadeghi

Learning with Transfer in Mind: Ensemble Policy Optimization (EPOpt)

train, test, adapt

training on single torso mass

training on model ensemble

unmodeled effects

ensemble adaptation

Aravind Rajeswaran
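
The contrast between training on a single torso mass and training on a model ensemble can be sketched as follows. This is a toy illustration under stated assumptions: the return function, the mass distribution (mean 6.0, standard deviation 1.5), and the choice to focus on the worst-performing fraction of sampled models are all hypothetical choices made here, not details given in the slides.

```python
import math
import random


def toy_return(policy_gain, torso_mass):
    """Hypothetical return: highest when the policy gain matches the mass."""
    return -(policy_gain - torso_mass) ** 2


def sample_torso_masses(rng, num_models, mean=6.0, std=1.5):
    """Draw an ensemble of simulated models that differ in torso mass."""
    return [rng.gauss(mean, std) for _ in range(num_models)]


def worst_case_indices(returns, fraction):
    """Indices of the lowest-return models, to focus the next update on."""
    k = max(1, math.ceil(fraction * len(returns)))
    order = sorted(range(len(returns)), key=lambda i: returns[i])
    return order[:k]


rng = random.Random(0)
masses = sample_torso_masses(rng, num_models=100)
returns = [toy_return(6.0, m) for m in masses]
hard_cases = worst_case_indices(returns, fraction=0.25)
print([round(masses[i], 2) for i in hard_cases[:5]])
```

A policy tuned to one mass only ever sees that one model, whereas evaluating across the sampled ensemble exposes the models on which it does poorly, and those are the cases a robust update would emphasize.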

6. How can we get sufficient supervision to learn in unstructured real-world environments?

Learning what Success Means

can we learn the goal with visual features?

Finn, Abbeel, L. ‘16

Learning what Success Means

Sermanet, Xu, L. ‘16

ingredients for success in learning:

supervised learning: computation, algorithms, data

learning robotic skills: computation, algorithms ~, data ?

Announcement: New Conference
Conference on Robotic Learning (CoRL)
www.robot-learning.org

Goal: bring together robotics & machine learning in a focused conference format

Conference: November 2017
Papers deadline: late June 2017
Steering committee: Ken Goldberg (UC Berkeley), Sergey Levine (UC Berkeley), Vincent Vanhoucke (Google), Abhinav Gupta (CMU), Stefan Schaal (USC, MPI), Michael I. Jordan (UC Berkeley), Raia Hadsell (DeepMind), Dieter Fox (UW), Joelle Pineau (McGill), J. Andrew Bagnell (CMU), Aude Billard (EPFL), Stefanie Tellex (Brown), Minoru Asada (Osaka), Wolfram Burgard (Freiburg), Pieter Abbeel (UC Berkeley)

Chelsea Finn, Peter Pastor, Alex Krizhevsky, Deirdre Quillen, Mrinal Kalakrishnan, Yevgen Chebotar, Adrian Li, Ali Yahya, Shane Gu, Ethan Holly, Tim Lillicrap, Greg Kahn, Fereshteh Sadeghi, Aravind Rajeswaran, Pieter Abbeel, Trevor Darrell