108
Autonomous robotic systems by combining control and learning Jia Pan Department of Computer Science University of Hong Kong

Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Autonomous robotic systems bycombining control and learning

Jia Pan

Department of Computer ScienceUniversity of Hong Kong

Page 2: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

What ideal robotic systems should look like

Knowledge about• human’s preference• task’s manufacturing

technique and process• physics and mechanics• …

Data about• Human’s motion• Environment’s status• Signals of failure/success• …

Finish tasks that requires• high accuracy but straightforward• deep thought but low accuracy• fast response but middle level accuracy and thought• coordination/collaboration/cooperation• …

Page 3: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Control theory• The study of how to use past knowledge and

some data to enhance the future operation of adynamical system

Knowledge

Controller= ( )

Model= ,

Data

Modeling

Design

Update

Tune

InteractCollect

Page 4: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Reinforcement learning• The study of how to use past data and some

knowledge to enhance the future operation of adynamical system

Knowledge

Controller= ( )

Model= ,

Data

Modeling

Design

Learn

Optimize

InteractCollect

Page 5: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

But almost the same thing

Are learning scientists and control scientistsknow each other well?

How are they thinking about each other?

Page 6: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Complains from control engineers

• 2o19-Dec-18 on Zhihu (Chinese version of quora):1.8K likes and 294 comments

Control engineer, get control Ph.D. in US(possibly Umich);Worked in car hardware companybefore, developed advanced controlalgorithm; greatly improvedperformance;But CTO said: you know someoptimization and data scienceL

Joined autonomous driving startupControl theory is despised thereOnly POMDP, vision, and learning areresearch;Modeling and controller design are notconsidered as researchL

Page 7: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Planning engineers also despise controlNobody cares about control theory

Computer vision community despisescontrol theory even more: CV can solveeverything; arrogant; no respect tosafety and experience in control

Learning guy are parameter tuningmachine: some ideaà tune parameterà paper

CV and planning community use ROS alotà stupidàMATLAB/Simulink is amust for industry qualityFriend interviewed in Waymo: alsorequire C++ and Python, not Matlab

Page 8: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Complains from control engineers

• https://www.zhihu.com/answer/939129746

Control is the king!Control is rigorous, withmathematics guarantee.Much better than learning

Page 9: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Reality

Learning scientists and control scientists don’tknow each other.

They look down upon each other

Page 10: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Disciplinary biases

Control theory

RL

Reinforcementlearning

Control

CE/EE/ME CS/ML/AI

Mostly discreteData → action

Continuous/discrete/hybridModel → action

Robotics tries to unify and merge theirperspectives.

Page 11: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Task biasesControl Reinforcement learning

Robotics covers both regions.

Page 12: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Criterion biasesControl

• System guarantee• Stability• Robustness to

perturbation• Dynamic performance

• Overshoot• Oscillation• Convergence rate

• Low-level tasks

Reinforcement learning

• Intelligence• Mimic brain decision making• Learn skills from scratch by

self teaching• Trade-off complex factors• Analyze rich sensory signals

• High-level tasks

Modern robotics want both.

Page 13: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Pure control• Modeling and design:

time-consuming manualprocedure; still an art

• Difficult to leverage richsensory data (e.g. imageor video)

• Huge gap between theoryand applications, due tounrealistic assumptionsnecessary for mathsimplicity

Pure learning• Low sample efficiency →

large training set• Convergence and hyper-

parameter tuning: still anart

• Low dynamicperformance

• Mostly limited in non-industrial applications

Why bother to combine?

Page 14: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Reinforcement learning is limited

• Difficult to transfer from simulation to real world• Requires huge number of training data / experience• Discard most structural knowledge about tasks• Limited to toy tasks:

• e.g., flexible grasping is useless without the subsequentaccurate and fast manipulation

Control policiesRaw sensor

measurementsabout tasks

A giganticneural

network

Page 15: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

How to combine?• Common practice: add patches to fix problems

• Divide-and-conquer principle: control & learningas blackbox

• Simple combination; NOT always a good solution

Original controlalgorithms

Deeplearning

Reinforcementlearning

Robust control

Page 16: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

How to combine – need deeper thinking

• Understand the core problem to be solved foryour task

• Understand reinforcement learning and control’scapability, e.g.,

• RL: good at trade-off in complex situations• Control: good at fast response in simple situations

• Let reinforcement learning and control work onparts most suitable for them

• Accomplish your task requirement; no more no loss

Page 17: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Let examples speak out• Autonomous navigation in dense crowds

Page 18: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Let examples speak out• Deformable object manipulation

• Accurately change the shape or configuration of adeformable object

by Navarro-Alarcon and YH Liu,Chinese University of Hong Kong

by . D. Langsfeld, A. M. Kabir, K.N. Kaipa, and S. K. Gupta,Maryland Robotics Center

Page 19: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Autonomous navigationin dense crowds

Page 20: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Many important applications• Low-speed

autonomous driving• Valent parking• Package delivery

system

• Service robot• Hotel service• Elderly assistance• Crowd surveillance

Page 21: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

State of the arts

Page 22: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Common solutions: divide-and-conquer

• Based on the well-known solution to autonomousderiving

• Simultaneous Localization and Mapping (SLAM)

Motion planning:avoid collisions withstatic obstacles

Build a map andlocalize the robotusing SLAM

Feedback controlfollows the collision-free trajectory

Page 23: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Common solutions: divide-and-conquer

• Then add patches for handling moving pedestrians

Motion planning:avoid collisions withstatic obstacles

Build a map andlocalize the robotusing SLAM

Feedback controlfollows the collision-free trajectory

Use vision and learning toanalyze crowds:1. character recognition2. trajectory estimation3. pedestrian tracking

Avoid collisions withpedestrians by usingtechniques such as MotionPredictive Control (MPC)

Many manualrules to handle

corner cases

Page 24: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Performance: robot gets stuck

• Each component is not perfect and you need toleave some margin for accumulated uncertainty

• But then uncertainty explosion blocks all moves

Page 25: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Divide-and-conquer issues (I)

Connections of perfectcomponents

≠High-quality navigationsystem in dense crowds

Page 26: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Divide-and-conquer issues (II)• Each component may be solving a much more

difficult problem than the task

Pipeline Bottleneck requirement

Divide-and-conquer

1. Construct map2. Plan trajectory in the map3. Avoid collision with movingobjects

Accurate robot localization & map:global knowledge about scenes

Bettersolution?

Directly solve navigation indense crowds

Collision avoidance: localinformation is sufficient, no need forlocalization or mapMoving to a goal: a rough globallocalization and a rough map; noneed for map if the scene topologyis simple

Page 27: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Divide-and-conquer issues (III)• Difficulty of an individual component may be due

to ignorance of interaction among components

Tracking in densecrowds

Challenges Occlusion; Re-id

If havingperfectdense-crowdcollisionavoidance

Always track thepeople with fewerocclusion or re-idtroubles

Localization in densecrowdsLost of localization orclose-loop features

Always recover lostfeatures by activeexploration using collisionavoidance

Page 28: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

End-to-end RL solutions?

• Difficult to transfer from simulation to real world• Requires huge number of training data / experience• Discard most structural knowledge in divide-and-

conquer

Navigationpolicies in

dense crowds

Raw sensormeasurementsabout scenarios

A gigantic neuralnetwork

(e.g., GraphNeural Network)

Page 29: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Re-investigate our task:

What is the core capabilityrequired in

“navigation in dense crowd”?

Let’s get some idea from human

Page 30: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

What is human’s core skill fornavigation in dense crowds

The cooperated collision avoidancewithout central guidance or communication

Page 31: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

What is human’s core skill fornavigation in dense crowds

The cooperated collision avoidancewithout central guidance or communication

Page 32: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Lets first solve this core task

Flexible collisionavoidance in crowds

Localization in crowdsMapping in crowds

Crowds tracking

Trajectoryplanning

Behavior cooperation

Semantic recognition

Page 33: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Human-like decentralized collisionavoidance

Relative velocityof other agents

Relative positionof other agents

Robot’s steeringcommand

• Prefer to avoid traditional difficulties like robustrecognition and position/velocity estimation

Page 34: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Use raw sensor data

Network output

Page 35: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

How to find this controller• One way is to build a collision avoidance model

based on the first principle – velocity obstacle

• Intuitively, any velocity that can result in futurecollision in a window 0, should be discarded.

Page 36: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

How to find this controller?• But learning is more appropriate

• Rich sensor + complex trade-off between safety andefficiency – first principle is limited and has manyparameters to be tuned

• Rough localization is sufficient for learning-based policy

Page 37: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Learn controller using RL• Neural network controller

• Optimize the entire crowd for:• Minimizing collision cost• Minimizing the time to reach goal• Maximizing trajectory smoothness

encourage smoothcooperation

Page 38: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Training step 1• “Baby” controller trained on simple scenarios• Random configuration for start and goal (yellow)

During training Training finished

Page 39: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Training step 2• Controller gets “mature” after being trained on

multiple challenging scenarios

Page 40: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Training step 2• Controller gets “mature” after being trained on

multiple challenging scenarios

Page 41: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Each robot movesfrom a position on thecircle to its dualposition on the circle.

The robots behavecooperatively toresolve the congestioncaused by partialobservation to theenvironment.

Test in a challenging benchmark

Page 42: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Unknown map

Traditional methods based oncollision avoidance rules

Our reinforcement learned policy

Effective navigation is possible without map,if the topology is simple

Page 43: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Pure learning is NOT enough• Learning based policy

• Make complex trade-offbetween navigation safetyand efficiency

• Difficult to respond in timeand optimally for extremecases

• Low-level control based policy• Aggressive optimal control when

the moving obstacles are of small-number, distant, and slow

• Conservative safe policy whenthe moving obstacles are of largenumber, close, and fast

Page 44: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Hybrid control

Page 45: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Benefit of hybrid control

RL Hybrid-RLPID control RL control Emergent control

Shorter trajectory, smaller curvature, and lower collision probability

Page 46: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Multi-level swarm behavior

• Close to center• mainly safe policy

(in red)

• Close to periphery• Mainly PID policy

(in green)

• Between centerand periphery

• Mainly RL policy(in blue)

Page 47: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing
Page 48: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Performanceunder different density

Page 49: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Real robot demo:UWB for rough global localization

4x

Page 50: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Extension in 3D:multiple drone collision avoidance

Page 51: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Robot has learned:

Avoid collision with moving obstacles ina flexible way

Cooperate with moving obstacles in aclever way

Page 52: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Track a target in human crowd• Let’s first see how the robot’s performance when

only considering human as dumb moving objects

• The guider has a Ultra-Wideband (UWB) label inhis hand

• The robot tries to follow the location of the label

Page 53: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Indoor test

Page 54: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Limitation?• Does not consider human’s preference• The robot is overly responsible for collision

avoidance

Page 55: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Outdoor test

Page 56: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Escape from the surround

Page 57: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Test on a legged robot

Page 58: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Collision avoidance helpschallenges in other components

Flexible collisionavoidance in crowds

Localization in crowdsMapping in crowds

Crowds tracking

Trajectoryplanning

Behavior cooperation

Semantic recognition

Page 59: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Traditional localization in dense crowds

• Need to overcome the static feature lost difficulty

Page 60: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Theoretical solutions

start

goal

“No feature” “withfeature”

• Gather information by moving to regions with richlocalization features, and then continue

Page 61: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Chicken-and-egg difficulty

• But how to execute such “information gain”movement in a dense crowd?

Lost localization and at most has a roughlocalization from inertial odometry

Cannot navigation through crowds to theregion with “rich feature for localization”

Cannot navigation through crowds to theregion with “rich feature for localization”

Cannot recover localization

Page 62: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Our solution

environment

agile navigation policy

• Our collision avoidance does not rely onlocalization

• Our “moving to the goal” can safely use arough localization to indicate where thefeature-rich region is

lost recovery policy

Chicken-and-egg problem solved

Page 63: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Localization recovery policy• Choose a region in the map as a temporary goal

for observing features and recover localization• Trade-off different regions according to:

• The distance to the robot• The richness of features• How difficult for the robot to pass through the crowd

flow and reach the region

robot

Closer but difficult to reach

Farther but easier to reach

Page 64: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Actor-Critic recoveryThe crowd-relatedcriterion can becomputed using thevalue function, a by-product of collisionavoidance policytraining

Adaptively updatepolicy according tothe crowd flow status

Page 65: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

agile navigation policy

Page 66: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

localization recoverypolicy

Page 67: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Simulation benchmarksfor quantitative comparison

Page 68: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Simulation benchmarksfor quantitative comparison

Page 69: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Performanceunder different density

Page 70: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Real-worldlocalization recovery demo

Page 71: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Collision avoidance helpschallenges in other components

Flexible collisionavoidance in crowds

Localization in crowdsMapping in crowds

Crowds tracking

Trajectoryplanning

Behavior cooperation

Semantic recognition

Page 72: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

• Traditional mapping algorithm relies onthe matching features of the samestatic object at different time points

Curse of dynamic obstacles

• In crowd scenario, mappingalgorithms will be confused bydynamic obstacles:

• Mistaking dynamics obstacles forstatic obstacles

• Mistaking features of one dynamicobstacle for features of one staticobstacle observed before

Page 73: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Mapping in crowds?

• Common solutions: detect moving obstacles andremove them from sensor measurements

• But such detection is difficult to be robust

Page 74: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Our solution

Rather than considering movingpedestrians as troubles for mapping

We would consider them as usefulinformation source for mapping

How is this possible?

Page 75: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Intuition• Crowd vector field indicates the position of static

obstacles

• Even more: information about social preferenceabout moving directions

Page 76: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Mapping based on crowd flow• The local crowd flow provides information about the

moving resistance in all directions• As the only sensor measurements in mapping

• I.e., the robot has no knowledge about static obstaclesand only use the crowd for mapping

Page 77: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Mapping based on crowd flow• Then the sensor measurements are integrated

using the occupancy-based mapping

Only possible with the navigation in crowd skill

Page 78: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Navigation using crowd-flow map

• Search algorithms(like ∗ ) can usethe map to plan asocial-compliabletrajectory

• Then use RL-basedcollision avoidanceto local adjust

Page 79: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Navigation using crowd-flow map

Prefer to follow human’s flow to minimize the flow disturbOvercome the overly responsible problem

• Blue trajectory: min-path trajectory

• Purple trajectory: D*trajectory respectingflow

Page 80: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Can we do mapping and localizationusing crowd data only?

(i.e. replacing the SLAM using static objects?)

YES!Using the crowd vector field as features

to obtain rough localization

Page 81: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

• Green: ground truth• Yellow: particle

localization usingcrowd feature

• Red: localization usingfeatures of staticobstacle

Put everything together

Page 82: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Summary of part I• Start with collision avoidance as the core task, we

obtain much better navigation policy in dense crowdsthan state-of-the-arts

• Let learning focus on complex trade-off in collisionavoidance, and use traditional control and robotics tohandle other parts.

• Further solve a set of well-known robotics challenges• Localization recovery in crowd• Mapping in crowd• Localization in crowd• Human-friendly navigation

Page 83: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Deformable object manipulation

Page 84: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Motivation• Observed in a shoe factory in Anta• Insole fabricated in 2 steps: assembly & sewing

Cloth piece assembly by human worker Industrial sewing of assembled result

Page 85: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

More challenging task: bra suturing• 2.5D shape• Geometric interface between soft/hard materials

Page 86: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Simplified problem• Autonomous robotic cloth assembly

hole pin

Page 87: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Still a non-trivial task• The controller must generalize to

• Cloth pieces with different types of materials• Cloth pieces with different numbers of holes• Holes with different tolerance

hard fabric, 4 holes hard fabric, 2 holes soft fabric, 2 holes, largetolerance where stretch

is necessary

Page 88: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Learning based controller seems tobe appropriate?

• Require complicated high-level policy

• Difficult to build mathematical model for thecomplicated physical interaction between clothand environments

• Possible solutions:• Reinforcement learning• Learning from demonstration• Or combinations

Page 89: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Learning from demonstration• Operators provide demonstrations/examples

about how to achieve a task (intuitively)• Robots learn and generalize from examples• Examples can be provided by teleoperation,

motion capture, VR/AR, or kinesthetic teaching

demonstrations by kinestheticteaching

automated knot tying

Page 90: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

LfD knot tying

Page 91: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Combine LfD, RL and DL• Demonstration can include image, video, and

tactile sensing measurement

Page 92: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

But fails on assembly taskKnot tying

Requires flexiblemulti-step policy

Knot physics issimpler, and thuspossible to generatesimulation consistentwith real world

Cloth assembly

Requires both highaccuracy and highflexibility

Complex interactionbetween cloth, fixture,and gripper, high-qualitysimulation is difficult

Page 93: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Simulator: Achilles’ heel for RL• Realistic simulation of deformable objects is

extremely difficult (and also expensive)• Even after careful optimization of physics parameters

Page 94: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Recap traditional control:visual-servo control

Page 95: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Recap traditional control:visual-servo control

Image space velocity

Joint space velocity

Kinematics Jacobian:analytically computed

Page 96: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Visual-servo control fordeformable objects

• We can do similar thing for deformable object

Image space velocity

Joint space velocity

How the joint influencesthe cloth status:

unknown,time-varying,material-varying

Page 97: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Visual-servo control fordeformable objects

• Traditional methods use linear functions to modelthe matrix

• We are using nonlinear approximators, including• Gaussian process• Neural network

• Dynamically update the function during themanipulation to best describe the relationbetween Δ and Δ in the near future

Page 98: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Nonlinear visual-servo controllerusing Gaussian processes

• ~ ( , , )• Initialize GP using random

perturbation of the clothpiece around the initialconfiguration

• Use the data collected bythe robot during themanipulation process togradually update GP

Δ= + ,

, + ( − )

= , − ,, + ( , )

Page 99: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

More general controller using deepneural network

Besides fixture holes, we can also use raw image as controller input

Page 100: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Peg-in hole assembly for soft fabric:2 holes/2 pins

Page 101: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Peg-in hole assembly for hard fabric:4 holes/4 pins

Page 102: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Rolled towel bending:point feedback features

Page 103: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Plastic sheet bending:curve feedback features

Page 104: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Better performance than linearcontroller

• The linear controller is also updated online• GP controller converges faster and more robust• Success rate >95%; >80% with human perturbation

on rolled towel task on peg-in-hole task

Page 105: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Neural network controllerfurther improves

Page 106: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Summary of part II• Learning is appropriate for modeling the complex

interaction physics between deformable objectand the environment

• But the overall pipeline is still more suitable fortraditional visual-servo feedback controller due tothe required high accuracy

• Pure reinforcement learning does not fit in here

Page 107: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Conclusion• Combine control and learning – always an art

1. Inspect your problem/task carefully

2. Let control or learning handle the subtasksmost appropriate for themselves

3. It usually is important to figure out which is thecore subtask and start from there

Page 108: Autonomous robotic systems by combining control and learning Jia.pdf · What ideal robotic systems should look like Knowledge about • human’s preference • task’s manufacturing

Thank you