Robot Motor Skill Coordination with EM-based Reinforcement Learning

DESCRIPTION

A Barrett WAM robot learns to flip pancakes by reinforcement learning. The motion is encoded in a mixture of basis force fields through an extension of Dynamic Movement Primitives (DMP) that represents the synergies across the different variables through stiffness matrices. An inverse dynamics controller with variable stiffness is used for reproduction. The skill is first demonstrated via kinesthetic teaching and then refined by the Policy learning by Weighting Exploration with the Returns (PoWER) algorithm. After 50 trials, the robot learns that the first part of the task requires a stiff behavior to throw the pancake in the air, while the second part requires the hand to be compliant in order to catch the pancake without having it bounce off the pan.

Robot Motor Skill Coordination with EM-based Reinforcement Learning

Italian Institute of Technology, Advanced Robotics Dept.

http://www.iit.it

Petar Kormushev, Sylvain Calinon, Darwin G. Caldwell

IROS 2010, October 20, 2010

Motivation

• How to learn complex motor skills which also require variable stiffness?

• How to demonstrate the required stiffness/compliance?

• How to teach highly-dynamic tasks?

Background

• Learning adaptive stiffness by extracting variability and correlation information from multiple demonstrations (Sylvain Calinon et al., IROS 2010)

Robot Motor Skill Learning

[Diagram] Learning pipeline: demonstration by human (motion capture or kinesthetic teaching) → encoding the skill (imitation learning, shared representation) → refining the skill (reinforcement learning) → reproduction.

Skill representation (encoding)

[Diagram] Skill encodings placed along a time-dependent ↔ time-independent axis: trajectory-based, via-points, DMP, GMM/GMR, DS-based.

Dynamic Movement Primitives

[Figure] A DMP encodes the demonstrated trajectory as a sequence of attractors.

$$\hat{\ddot{x}} = \sum_{i=1}^{K} h_i(t)\left[\kappa^{P}\,(\mu_i^{X} - x) - \kappa^{V}\,\dot{x}\right]$$

Ijspeert, Nakanishi, Schaal, IROS 2001
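A minimal numerical sketch of this weighted-attractor formulation (the Gaussian time basis, gains, and attractor points below are illustrative placeholders, not the values used on the robot):

```python
import numpy as np

def basis_activations(t, centers_t, width=0.05):
    """Normalized Gaussian activations h_i(t) over time (illustrative basis)."""
    h = np.exp(-((t - centers_t) ** 2) / (2.0 * width ** 2))
    return h / (h.sum() + 1e-12)

def dmp_acceleration(x, xd, t, mu_x, centers_t, kP=100.0, kV=20.0):
    """Desired acceleration: weighted sum of K attractors with scalar gains."""
    h = basis_activations(t, centers_t)
    return sum(h[i] * (kP * (mu_x[i] - x) - kV * xd) for i in range(len(mu_x)))

# Illustrative rollout by Euler integration of the desired acceleration
centers_t = np.linspace(0.0, 1.0, 5)                 # basis centers in time
mu_x = [np.array([c, 1.0 - c]) for c in centers_t]   # attractor points (2-D)
x, xd, dt = np.zeros(2), np.zeros(2), 0.01
for step in range(100):
    xdd = dmp_acceleration(x, xd, step * dt, mu_x, centers_t)
    xd = xd + xdd * dt
    x = x + xd * dt
```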

Extended DMP to include coordination

Extended formulation, where each primitive has a full coordination matrix (full stiffness matrix) $K_i^{P}$:

$$\hat{\ddot{x}} = \sum_{i=1}^{K} h_i(t)\left[K_i^{P}\,(\mu_i^{X} - x) - \kappa^{V}\,\dot{x}\right]$$

Original formulation, with a scalar stiffness gain $\kappa^{P}$:

$$\hat{\ddot{x}} = \sum_{i=1}^{K} h_i(t)\left[\kappa^{P}\,(\mu_i^{X} - x) - \kappa^{V}\,\dot{x}\right]$$

Advantages:
• capture correlations between the different motion variables
• reduce the number of primitives

Proposal: use Reinforcement learning to learn the coordination matrices
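A sketch of the extension: the scalar gain $\kappa^{P}$ is replaced by a full coordination matrix per primitive (names and shapes are assumptions for illustration):

```python
import numpy as np

def extended_dmp_acceleration(x, xd, h, mu_x, K_p, kV=20.0):
    """Weighted-attractor acceleration where each primitive i carries a full
    stiffness/coordination matrix K_p[i] instead of a scalar gain.
    h    : basis activations h_i(t) at the current time, shape (K,)
    mu_x : list of attractor centers mu_i^X
    K_p  : list of full DxD coordination matrices K_i^P
    """
    acc = np.zeros_like(x)
    for i in range(len(mu_x)):
        # Off-diagonal entries of K_p[i] couple the motion variables,
        # encoding coordination (synergies) across dimensions.
        acc += h[i] * (K_p[i] @ (mu_x[i] - x) - kV * xd)
    return acc
```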

Example: Reaching task with obstacle

[Figure] Reproduction using full coordination matrices vs. using diagonal matrices.

Reward function:

$$r(t) = \begin{cases} \frac{w_1}{T}\, e^{-\|x_t^{R} - x_t^{D}\|}, & t \neq t_e \\ w_2\, e^{-\|x_t^{R} - x^{G}\|}, & t = t_e \end{cases}$$

Expected returns of the two reproductions: 0.61 and 0.73.
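A sketch of this piecewise return computation (variable names such as x_robot, x_demo, x_goal and the weights are assumptions for illustration):

```python
import numpy as np

def reaching_return(x_robot, x_demo, x_goal, w1=0.5, w2=0.5):
    """Sum of rewards: track the demonstration at every step (t != t_e),
    reach the goal at the final step (t = t_e)."""
    T = len(x_robot)
    r = np.empty(T)
    for t in range(T - 1):
        r[t] = (w1 / T) * np.exp(-np.linalg.norm(x_robot[t] - x_demo[t]))
    r[T - 1] = w2 * np.exp(-np.linalg.norm(x_robot[-1] - x_goal))
    return r.sum()
```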

EM-based Reinforcement learning (RL)

• PoWER algorithm - Policy learning by Weighting Exploration with the Returns

• Advantages over policy-gradient-based RL:
  – no need for a learning rate
  – can use importance sampling
  – a single rollout is enough to update the policy

Jens Kober and Jan Peters, NIPS 2009

RL implementation

• Policy parameters $\theta$:
  – full coordination matrices $K_i^{P}$
  – attractor vectors $\mu_i^{X}$

• Policy update rule:

$$\theta_{n+1} = \theta_n + \frac{\left\langle (\theta_k - \theta_n)\, R(\tau_k) \right\rangle_{w(\tau_k)}}{\left\langle R(\tau_k) \right\rangle_{w(\tau_k)}}$$

• Importance sampling: uses the best $\sigma$ rollouts so far

$$\left\langle f(\theta_k, \tau_k) \right\rangle_{w(\tau_k)} = \sum_{k=1}^{\sigma} f\left(\theta_{\mathrm{ind}(k)}, \tau_{\mathrm{ind}(k)}\right)$$
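A compact sketch of the update rule and importance sampling above, where the policy parameters θ stack the flattened coordination matrices and attractor vectors (array shapes and the exploration step are illustrative, not the exact robot implementation):

```python
import numpy as np

def power_update(theta_n, rollout_thetas, rollout_returns, sigma=5):
    """PoWER-style update: reward-weighted average of parameter offsets,
    computed over the sigma best rollouts seen so far (importance sampling)."""
    idx = np.argsort(rollout_returns)[-sigma:]        # indices of best rollouts
    thetas = np.asarray(rollout_thetas)[idx]          # theta_k
    returns = np.asarray(rollout_returns)[idx]        # R(tau_k)
    num = ((thetas - theta_n) * returns[:, None]).sum(axis=0)
    return theta_n + num / (returns.sum() + 1e-12)

# One learning iteration (illustrative):
#   theta_k = theta_n + exploration_noise()   # perturb policy parameters
#   R_k     = execute_rollout(theta_k)        # return of the rollout
#   theta_n = power_update(theta_n, all_thetas, all_returns)
```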

Pancake flipping: Experimental setup

Frying pan mounted on the end-effector

Artificial pancake with 4 passive markers (more robust to occlusions)

Barrett WAM 7-DOF robot

Evaluation: Tracking of the pancake

NaturalPoint OptiTrack motion capture system: 12 cameras, 100 Hz camera frame rate, 40 Hz real-time capture

Reward function

• Reward function (terms correspond to orientation, position, and height):

$$r(t_f) = w_1\left[\frac{\arccos(v_0 \cdot v_{t_f})}{\pi}\right] + w_2\, e^{-\|x^{p} - x^{F}\|} + w_3\, x^{M}_{3}$$

• Cumulative return of a rollout:

$$R(\tau) = \sum_{t=1}^{T} r(t)$$
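A sketch of this reward, assuming v0 and v_tf are the pancake's initial and final orientation vectors, x_p its position, x_F the frying-pan center, and max_height the maximum height reached (all names and weights are illustrative):

```python
import numpy as np

def pancake_reward(v0, v_tf, x_p, x_F, max_height, w=(0.5, 0.3, 0.2)):
    """Reward combining flip orientation, landing position, and height terms."""
    w1, w2, w3 = w
    orientation = np.arccos(np.clip(np.dot(v0, v_tf), -1.0, 1.0)) / np.pi
    position = np.exp(-np.linalg.norm(x_p - x_F))
    return w1 * orientation + w2 * position + w3 * max_height

# Cumulative return of a rollout: R(tau) = sum over t of r(t).
```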

Kinesthetic demonstration of the task

Learning by trial and error

Finally learned skill

Motion capture to evaluate rollouts

Captured pancake trajectory

90° flip 180° flip

Performance

Reproduction control strategy

$$M(q)\,\ddot{q} + C(\dot{q}, q)\,\dot{q} + g(q) = \tau_G + \tau_T$$

Gravity compensation:

$$\tau_G = \sum_{i=1}^{L} J_{G,i}^{T}\, F_{G,i}$$

Task execution:

$$\tau_T = J_{T}^{T}\, F_T$$
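A sketch of this torque composition (jacobians and forces are assumed inputs; F_T would be produced by the variable-stiffness tracking law learned above):

```python
import numpy as np

def control_torques(link_jacobians, link_gravity_forces, task_jacobian, task_force):
    """Commanded joint torques = gravity compensation + task execution."""
    # Gravity compensation: tau_G = sum_i J_{G,i}^T F_{G,i}
    tau_G = sum(J.T @ F for J, F in zip(link_jacobians, link_gravity_forces))
    # Task execution: tau_T = J_T^T F_T
    tau_T = task_jacobian.T @ task_force
    return tau_G + tau_T
```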

Conclusion

• Combining imitation learning + RL to learn motor skills with variable stiffness
  – imitation used to initialize the policy
  – RL used to learn the coordination matrices
  – variable stiffness learned during reproduction

• Future work
  – other representations
  – other RL algorithms

Thanks for your attention!
