

Combining Programming by Demonstration with Path Optimization and Local Replanning to Facilitate the Execution of Assembly Tasks

1st Gal Gorjup
Dept. of Mechanical Eng., The University of Auckland, Auckland, New Zealand
[email protected]

2nd George P. Kontoudis
Bradley Dept. of Electrical and Computer Eng., Virginia Tech, Blacksburg, USA
[email protected]

3rd Anany Dwivedi
Dept. of Mechanical Eng., The University of Auckland, Auckland, New Zealand
[email protected]

4th Geng Gao
Dept. of Mechanical Eng., The University of Auckland, Auckland, New Zealand
[email protected]

5th Saori Matsunaga
Information Technology R&D Center, Mitsubishi Electric Corporation, Tokyo, Japan
[email protected]

6th Toshisada Mariyama
Information Technology R&D Center, Mitsubishi Electric Corporation, Tokyo, Japan
[email protected]

7th Bruce MacDonald
Centre for Automation and Robotic Eng. Science, The University of Auckland, Auckland, New Zealand
[email protected]

8th Minas Liarokapis
Dept. of Mechanical Eng., The University of Auckland, Auckland, New Zealand
[email protected]

Abstract—With the emergence of agile manufacturing in highly automated industrial environments, the demand for efficient robot adaptation to dynamic task requirements is increasing. For assembly tasks in particular, classic robot programming methods tend to be rather time intensive. Thus, effectively responding to rapid production changes requires faster and more intuitive robot teaching approaches. This work focuses on combining programming by demonstration with path optimization and local replanning methods to allow for fast and intuitive programming of assembly tasks that requires minimal user expertise. Two demonstration approaches have been developed and integrated in the framework: one that relies on human to robot motion mapping (a teleoperation based approach) and a kinesthetic teaching method. The two approaches have been compared with classic, pendant based teaching. The framework optimizes the demonstrated robot trajectories with respect to the detected obstacle space and the provided task specifications and goals. The framework has also been designed to employ a local replanning scheme that adjusts the optimized robot path based on online feedback from the camera-based perception system, ensuring collision-free navigation and the execution of critical assembly motions. The efficiency of the methods has been validated through a series of experiments involving the execution of assembly tasks. Extensive comparisons of the different demonstration methods have been performed and the approaches have been evaluated in terms of teaching time, ease of use, and path length.

Index Terms—flexible manufacturing, assembly, programming by demonstration, system integration

The funding for this research was provided by Mitsubishi Electric Corporation under UniServices grant agreement #5000736.

Fig. 1. The robot arm and gripper system performing a peg-in-hole task on a polishing machine replica. The insertion is executed in a heavily constrained space, which corresponds to real-life assembly scenarios.

I. INTRODUCTION

Industrial robots have allowed production and assembly lines to reach extraordinary levels of automation, increasing process efficiency and minimizing faults. Adapting to the high rate of technological advancement, the market has started to move away from mass production, demanding progressively more customized products. In order to effectively respond to the rapidly changing task requirements, robot systems must allow for fast and efficient re-programming that does not require expert knowledge in the field. Traditional robot programming methods (such as pendant teaching) have proven to be less efficient in cases where the system must be adapted to novel, complex tasks on a regular basis. Online, pendant-based teaching methods are often tedious, time-consuming, and introduce production delays because the robot must be halted for the duration of re-programming. Offline, simulation-based methods can be utilized without using the physical robot, but require professional knowledge and a considerable amount of effort to be properly configured. These issues have motivated several industrial and research groups to develop alternative methods of transferring human skill to robot systems.

Most skill transfer approaches are based on Programming by Demonstration (PbD), where the human operator demonstrates a task or skill, which is then transferred to the target robot system. The skill transfer can be arbitrarily complex, ranging from simple trajectory reproduction to sophisticated skill acquisition that interprets the task context and adapts to external disturbances. Teaching by showing comes naturally to human operators, making most developed PbD approaches accessible to untrained operators. From the user perspective, the time and effort required to program tasks of varying complexity is fairly low when using an effective PbD framework. The flexibility offered by such approaches facilitates the development of reconfigurable manufacturing environments suited for the future. In addition, it introduces new automation frontiers for smaller businesses that may require the robots to perform short term, daily changing tasks. The concept of PbD can also play a major role in human-robot collaboration, where the robot can learn and adapt its behavior based on human observation.

This paper presents a human to robot skill transfer methodology that combines programming by demonstration with path optimization and local replanning to allow for the efficient execution of assembly tasks. The methodology performs collision-aware optimization of non-functional demonstrated trajectory segments in terms of path length, ensuring fast execution of the desired task. This reduces the effects of sub-optimal user demonstrations that may influence the quality of programmed motions in other, learning based approaches [1]. The system can adapt to dynamic task goal changes by employing a local replanning scheme that updates the tail of the global path during its execution, based on real-time feedback from the perception system. The efficiency of the proposed methods has been validated with a series of experiments focusing on the execution of peg-in-hole assembly tasks with a robot arm and gripper system. As a secondary contribution, three different task demonstration approaches have been compared and evaluated in terms of teaching time and ease-of-use.

The rest of the paper is organized as follows: Section II introduces the related work, Section III describes the utilized apparatus, Section IV presents the developed framework, Section V presents and discusses the results, while Section VI concludes the paper.

II. RELATED WORK

This section provides a brief literature review of the fields relevant to this work. This includes related work in PbD, trajectory optimization, and path planning.

A. Programming by Demonstration

Over the last decades, significant research effort has been invested in the field of PbD, with developments ranging from simple trajectory mapping to high level representations of complete task procedures [1]–[3]. Such approaches have also been employed in robotic assembly studies [4], where much of the work has focused on a set of representative tasks, such as peg-in-hole, pick-and-place, bolt screwing, etc.

Several methods of capturing user demonstrations have been considered in the literature. Kinesthetic teaching and teleoperation produce data that is recorded directly on the target robotic platform, with the human operator compensating for kinematic and dynamic embodiment differences. This makes the demonstrations easier to interpret, but can sometimes be inconvenient for the operator and prevent them from effectively demonstrating the target skill [5]. Alternatively, a PbD scheme can be based on motions that the operators execute with their own bodies, mapping them to the target platforms [6], [7]. Although this is more intuitive for the operator, it requires additional human to robot motion mapping methods [8].

Once the demonstration trajectories have been obtained, the simplest approach is to map them directly to the robot platform. While this might be sufficient for simple motions with loose accuracy requirements, such an approach is typically inadequate for assembly oriented applications. Such a basic motion replay cannot account for external disturbances, and any operator errors are propagated to the robot. These errors can be amended either through independent trajectory optimization or by employing appropriate local replanning methods.

B. Trajectory Optimization

In assembly tasks, the operator is typically trying to program short, efficient robot motions that complete a given task as quickly as possible. Human demonstrations are in most cases sub-optimal in terms of speed of execution. To effectively shorten the path length, the human demonstrated trajectories can be efficiently optimized by employing appropriate path planning algorithms [9]. Moreover, given an existing robot trajectory, methods such as the Shortcut or the Partial Shortcut [10] are able to refine and shorten the path with respect to the identified obstacles in the space. Alternatively, reinforcement learning methods that produce a set of feasible trajectories in constrained environments can also be employed [11]. As such methods discard any functional motion that may be present in the segment, they can only be applied to the non-functional parts of the demonstrated skill. In this context, functional motions refer to trajectory segments and force profiles that are critical for the success of a particular assembly task, such as part alignment and insertion. Non-functional motions refer to trajectory segments that can be significantly altered without affecting task success, such as the reach to grasp path.
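The Shortcut heuristic mentioned above admits a compact sketch: repeatedly pick two non-adjacent waypoints and splice the path whenever the straight segment between them is collision free. The following Python sketch is illustrative only, not the implementation evaluated in [10]; the is_collision_free predicate is a hypothetical placeholder for an environment-specific collision checker.

```python
import random

def shortcut(path, is_collision_free, iterations=100):
    """Shortcut heuristic: repeatedly try to replace a random
    sub-segment of the path with a straight connection."""
    path = list(path)
    for _ in range(iterations):
        if len(path) < 3:
            break
        i, j = sorted(random.sample(range(len(path)), 2))
        if j - i < 2:
            continue
        # keep the shortcut only if the direct segment is collision free
        if is_collision_free(path[i], path[j]):
            path = path[:i + 1] + path[j:]
    return path
```

The Partial Shortcut variant applies the same idea per degree of freedom rather than to the full configuration at once.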

Fig. 2. The employed parallel-jaw gripper with the pre-shaped adaptive robot fingers. A single actuator is used to control both fingers, which move in a coupled manner on linear rails. Upon contact with the object, the pre-shaped fingers conform to the object shape. An off-centered mount houses the Intel RealSense RGBD camera.

C. Path Planning and Local Replanning

The path planning problem for continuous obstacle spaces and arbitrary robot shapes has been addressed by sampling-based algorithms, such as the rapidly-exploring random tree (RRT) [12] and the probabilistic roadmaps (PRM) [13]. A variant of the RRT with rewiring of the search-tree, the RRT*, was proposed in [14]. RRT* is a probabilistically complete and asymptotically optimal, single-query algorithm that can be applied only to static environments. Replanning can be employed to efficiently identify a new path for dynamic or even uncertain environments. An asymptotically optimal, sampling-based, single-query framework for dynamic environments, namely RRTX, was presented in [15]. The methodology performs local replanning to repair and refine the precomputed global search-graph. The authors in [16] introduced the informed RRT*, which significantly improves the convergence rate of RRT*. Instead of employing the sampling-based algorithm in the whole space, the latter reduces the problem to a relatively smaller space (a prolate hyperspheroid) and dynamically compresses the local space based on the gathered information. In [17], local replanning was utilized to recompute a path in an online fashion. In this approach, the RRT* is executed on a relatively small space, where motion planning completeness is guaranteed with topological connectedness tools.

III. APPARATUS

In this section, the employed apparatus is presented. The robot platform is based on a 6 Degrees of Freedom (DoF) serial manipulator with a maximum payload of 5 kg. The gripper used is a parallel-jaw design with adaptive, pre-shaped fingers [18] (see Fig. 2). The gripper was chosen as it maximizes the contact area with the objects while grasping and compensates for object positioning uncertainties. The perception module relies on an Intel RealSense depth camera (model D435) mounted on the robot end-effector. The hand-eye calibration is performed through fiducial marker detection, as described in [19].

Fig. 3. The proposed dataglove system that comprises IMU and magnetic motion capture sensors.

The system was experimentally validated with a peg-in-hole task executed on a replica of the Rana-3 sample polishing machine (Fig. 1). The task involves inserting cylindrical specimens into the carrier disk of the polishing machine, which stops at an arbitrary angle after every polishing cycle. The cylindrical samples have varying height and a diameter of Ø29.5 ± 0.1 mm, while the hole diameter is Ø30.2 ± 0.1 mm. Since the samples are circular, this insertion task is comparably easier than one involving prismatic shapes. Even though this is not evaluated in this work, the in-hand object pose tracking component of the framework is designed to account for such object pose considerations.

A. Dataglove

To record human demonstrations, a dataglove combining an inertial and a magnetic motion capture system was developed (Fig. 3). The Synertial Cobra sensors were selected as the inertial motion capture system. More precisely, 16 Inertial Measurement Units (IMUs) were used per hand and two additional IMUs were used for the forearm and the upper-arm. The Cobra system does not suffer from occlusion-related issues and is reasonably robust to magnetic disturbances. However, since it is IMU-based, the system accuracy is limited and the measured angles tend to drift over time. These errors propagate through the skeleton to the fingertips. Furthermore, the inertial glove is not able to measure the hand pose with respect to the global coordinate frame, and a system with absolute pose estimation capability is required. To overcome the drift and absolute reference issues, a magnetic motion capture system was included to track the fingertips and correct the inertial system data. For this purpose, the Polhemus Liberty magnetic motion capture system was chosen. The Liberty system offers accurate pose tracking through standard sensors that can be attached to the hand or arm, and micro sensors that can be mounted on the fingertips (Fig. 3). The measurements are expressed in the form of transformation matrices with respect to the global reference frame and are therefore not prone to drifting, making the system appropriate for correcting the IMU-based system. The proposed dataglove thus combines in a synergistic manner IMU and magnetic motion capture sensors to provide accurate and robust measurements.

Fig. 4. Framework architecture for the proposed Programming by Demonstration methodology. Pendant and kinesthetic teaching are executed manually. The glove based teleoperation involves a human to robot motion mapping procedure that projects the demonstrated motions onto the robot joint space.

B. Trajectory Optimization

The correction of the joint angles from the inertial motion capture system is performed by solving the inverse kinematics as a constrained numerical optimization problem, where the goal pose is the Liberty pose estimate for each finger. The decision variables are the joint angles, and the inequality constraints are determined by the joint limits. The same formulation is used for all five fingers of the hand, and the combined results provide the calibrated joint values of the full human arm-hand system. Let R represent the current rotation matrix and R_goal the goal rotation matrix. Then the angle difference ∆θ can be obtained by

∆θ = arccos( (trace(R⁻¹ R_goal) − 1) / 2 ),   (1)

Furthermore, let x = f(q) be the forward kinematics mapping from joint to task space for each finger, where q is the vector of joint angles for each finger. Let also x_goal ∈ R³ be the goal translation vector. Then, the objective function for the correction of the joint angles can be defined as

C(q) := w_x ‖x − x_goal‖ + w_r ∆θ + w_s ‖q − q_imu‖,   (2)

where w_x and w_r are the position and rotation goal weights, w_s is the weight of the similarity measure, and q_imu is the joint angle vector from the inertial motion capture system. The weights are nonnegative and they sum up to one. Finally, the optimal pose is obtained by

q* = argmin_{q ∈ Q} C(q),   (3)

where Q = [q⁻, q⁺] is the feasible joint set bounded by the lower q⁻ and upper q⁺ joint limits.
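Equations (1)–(3) can be prototyped with an off-the-shelf bounded solver. The sketch below is a minimal, hypothetical illustration rather than the authors' implementation: fk stands in for the finger forward kinematics, the weight values are placeholders, and R⁻¹ is computed as Rᵀ since R is a rotation matrix.

```python
import numpy as np
from scipy.optimize import minimize

def angle_diff(R, R_goal):
    """Eq. (1): rotation angle between current and goal orientation.
    For a rotation matrix, R^-1 = R^T; clip guards arccos against
    floating point values slightly outside [-1, 1]."""
    c = (np.trace(R.T @ R_goal) - 1.0) / 2.0
    return np.arccos(np.clip(c, -1.0, 1.0))

def correct_joints(fk, q_imu, x_goal, R_goal, q_lo, q_hi,
                   wx=0.5, wr=0.3, ws=0.2):
    """Minimize the cost of Eq. (2) over the feasible joint set of
    Eq. (3), starting from the IMU-provided joint angles."""
    def cost(q):
        x, R = fk(q)  # forward kinematics: position and rotation
        return (wx * np.linalg.norm(x - x_goal)
                + wr * angle_diff(R, R_goal)
                + ws * np.linalg.norm(q - q_imu))
    bounds = list(zip(q_lo, q_hi))  # joint limits q^-, q^+
    res = minimize(cost, q_imu, bounds=bounds)
    return res.x
```

With bounds supplied, scipy defaults to L-BFGS-B; since the cost is non-smooth at the goal, a solver tolerance appropriate for the application should be chosen.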

IV. FRAMEWORK

The PbD framework was implemented with the Robot Operating System (ROS) [20], which provided the necessary communication, testing, and visualization utilities. Its architecture is presented in Fig. 4, where the blocks conceptually correspond to the implemented ROS node structure. The Demonstration module handles capturing and translating human demonstrations to the robot platform, relying on either kinesthetic teaching or dataglove teleoperation. Trajectories generated from human demonstrations are executed on the physical robot system, which forms the Robot Platform module, along with their respective interfaces built with ROS Control [21]. The Perception module contains an interface for the Intel RealSense camera that streams color and depth data to the network. The data is filtered by the point cloud processing module, which also handles object tracking and target pose estimation. The module relies on the PCL library [22]. The Path Planning module relies on the MoveIt [23] development platform, which maintains a simulated environment based on the robot state and collision space. The global planning and local replanning strategies are based on the RRT* algorithm [14]. The PbD Master module represents the central framework component, handling state control and task execution.

A. Demonstration Approaches

Trajectory samples were captured through two demonstration methods: i) kinesthetic teaching and ii) teleoperation based teaching. Kinesthetic teaching was based on the built-in robot gravity compensation mode, with an external trigger for gripper opening and closing. In this setting, the user physically guides the robot, operating the gripper when necessary. In the second approach, hand and finger motion data obtained from the dataglove are mapped to the robot arm and gripper in real time. This mapping allows the user to execute the task without physically interacting with the platform. In this case, translation mapping is relative to the initial position, while rotation mapping is absolute with respect to a common reference frame. Gripper aperture control is based on the distance between the human thumb and index fingertips.

Algorithm 1: HoleEstimate(P_base, T^target_base, l, α, P_obj, ρ)

 1  T^hole_base ← T^target_base
 2  P_target ← TransformCloud(P_base, T^target_base)
 3  P_local ← CropCloud(P_target, l)
 4  if P_local ≠ ∅ then
 5      (P_inliers, C_model) ← SACPlaneSegmentation(P_local)
 6      P_plane ← ProjectInliers(P_inliers, C_model)
 7      P_polygons ← ConcaveHull(P_plane, α)
 8      d_min ← l
 9      for each P ∈ P_polygons do
10          if AreaRatio(P, P_obj) > ρ then
11              d ← ‖t_hole − t_P‖
12              if d < d_min then
13                  T^hole_base ← T^P_base
14                  d_min ← d
15  return T^hole_base
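As an illustration of fingertip-distance based gripper control, a simple linear mapping between the thumb-index distance and the commanded aperture can be used. The distance thresholds below are hypothetical values, not the calibrated parameters of the actual system.

```python
def aperture_from_pinch(d, d_closed=0.02, d_open=0.10,
                        a_min=0.0, a_max=1.0):
    """Map thumb-index fingertip distance d (meters) to a normalized
    gripper aperture command, clamped to the actuator range."""
    t = (d - d_closed) / (d_open - d_closed)
    t = min(max(t, 0.0), 1.0)  # clamp to [0, 1]
    return a_min + t * (a_max - a_min)
```

Clamping ensures that tracking noise outside the calibrated pinch range cannot drive the gripper beyond its limits.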

B. Perception

The perception module handles filtering and interpretation of streamed point cloud data for the purposes of obstacle space identification, pose tracking of the grasped object, and target (hole) estimation. A distance limit filter is applied to remove the gripper points from the point cloud set before passing the depth data to the path planning module. Object pose tracking during grasping and manipulation is crucial for the success of most assembly tasks, especially when utilizing non-rigid, adaptive grippers, which are susceptible to grasping inaccuracy and post-contact reconfiguration. The grasped object pose with respect to the end-effector is in this setting tracked through a two-stage, parallel point cloud processing procedure. The first component performs global pose estimation by aligning Fast Point Feature Histogram (FPFH) features [24] of the object mesh to the observed scene. To ensure adequate alignment speed, a pre-rejective variant of the Random Sample Consensus (RANSAC) procedure [25] is used. Once an initial pose estimate is available, continuous, real-time tracking is performed by the Iterative Closest Point (ICP) algorithm [26].
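To make the ICP stage concrete, the following is a minimal point-to-point ICP iteration, with brute-force nearest-neighbor correspondences and an SVD-based rigid fit. It is a didactic sketch, not the PCL-based tracker used in the framework.

```python
import numpy as np

def icp_step(src, dst):
    """One point-to-point ICP iteration: match each source point to its
    nearest destination point, then solve the best-fit rigid transform
    (Kabsch / SVD) for the matched pairs."""
    # nearest-neighbor correspondences (brute force)
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    matched = dst[d2.argmin(axis=1)]
    # best-fit rotation and translation via SVD
    mu_s, mu_m = src.mean(0), matched.mean(0)
    H = (src - mu_s).T @ (matched - mu_m)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_m - R @ mu_s
    return src @ R.T + t, R, t
```

In practice the step is iterated until convergence, and a k-d tree replaces the brute-force correspondence search.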

In several applications, including the one examined in this work, the assembly target pose may change between executions. For the peg-in-hole task, an online, vision-based goal pose estimation algorithm was developed, as presented in Algorithm 1. The procedure takes as input the raw point cloud data in the base reference frame P_base = {p_i}_{i=1}^{N}, where p_i ∈ R³, the goal object pose obtained from human demonstration T^target_base ∈ R^{4×4}, the local hole search volume defined by a cube with side l ∈ R⁺, the α ∈ R⁺ parameter determining concave hull computation, the point cloud of the grasped object P_obj = {p_j}_{j=1}^{M}, where p_j ∈ R³, and the area ratio threshold ρ ∈ R⁺. First, the demonstrated target T^target_base is assigned as the initial hole pose estimate T^hole_base. The TransformCloud function takes as input the point cloud data defined in the base reference frame P_base and projects it onto the target frame T^target_base, which yields P_target. The transformed cloud P_target is cropped using the CropCloud function to extract a volume of interest around the goal point. If the extracted volume P_local is not empty, a plane model is fitted to the cloud through the SACPlaneSegmentation function, which employs a sample consensus approach to return a collection of resulting plane inliers P_inliers and the plane parameters C_model. The plane parameters C_model are used to construct the plane S ∈ R². The ProjectInliers function projects the input point cloud of inliers P_inliers onto the plane S, generating a flattened set of points P_plane. The ConcaveHull function applies an alpha-shape based concave hull extraction algorithm to its input cloud P_plane, producing a set of polygons P_polygons = {P_k}_{k=1}^{K} corresponding to closed contours in the plane. The AreaRatio function compares the area of polygon P with the projection of the object model P_obj onto the polygon plane, returning a scalar ratio. If the areas match within the provided threshold ρ, the centroid transform T^P_base of the current polygon P is stored as a hole pose candidate T^hole_base. The polygon with its centroid closest to the human demonstration target T^target_base is selected as the final hole pose estimate T^hole_base. To achieve this, the distance d between the translation vector of the current hole estimate t_hole ∈ R³ and each translation vector of the examined polygon centroids t_P ∈ R³ is calculated.

C. Global Planning

Human demonstrations are often sub-optimal in terms of path length, containing unnecessary motion segments. To compensate for this redundancy, the trajectory segments that are not critical for the demonstrated assembly task are refined using the RRT* path planning algorithm [14].

Let the known obstacle closed space be denoted X_obs ⊂ X. For multiple obstacles, the obstacle space is defined as X_obs := ⋃_{q=1}^{Q} X_obs,q, where Q ∈ N is the total number of obstacles. The free open space results in X_free := X \ X_obs. Let the known initial start state be represented as x^init_start and the known initial goal state as x^init_goal. The RRT* algorithm rewires the tree by considering the potential cost-to-go of every node v ∈ V (we refer to nodes v and states x interchangeably) that lies in a circular set near the node of interest. The computational complexity of the RRT* is Θ(|V| log |V|), where |V| denotes the cardinality of the set of nodes V and Θ(·) denotes a tight bound. Thus, RRT* is a computationally demanding algorithm for large spaces. However, if we consider a small subspace, this methodology allows for real-time implementation.

The RRT* is used to compute offline the global graph G(V, E), which contains the global path π(x^init_start, x^init_goal). The RRT* is presented in Algorithm 2 and is described as follows.

Algorithm 2: RRT*(X, X_obs, x^init_start, x^init_goal, N)

 1  V ← {x^init_start, x^init_goal}; E ← ∅
 2  for n = 1 to N do
 3      x_rand ← Sample(X_free, N)
 4      x_nearest ← Nearest(V, x_rand)
 5      x_new ← Steer(x_nearest, x_rand)
 6      if NoCollision(x_nearest, x_new, X_obs) then
 7          X_near ← Near(V, x_new)
 8          η ← Line(x_nearest, x_new)
 9          (x_min, c_min) ← Parent(x_new, X_near, η)
10          V ← V ∪ {x_new}
11          E ← E ∪ {(x_min, x_new)}
12          G ← Rewire(G, X_near, x_new)
13  return G(V, E)

The Sample function provides a random state x_rand by sampling the free state space X_free with a uniform distribution. Then, the randomly sampled state x_rand and the rest of the states in the set of nodes V are compared to yield the state x_nearest nearest to the random state. The Steer function generates a new state x_new which is closer to the nearest state, by connecting x_rand and x_nearest with a steering function. In our case, the steering function follows a straight path. Next, the NoCollision function examines potential collisions from x_nearest to x_new with the obstacle space X_obs. If the path is collision free, the algorithm proceeds to the rewiring process. The Near function collects the set of states that lie in the ball ‖x − x_new‖ ≤ γ (log N_s / N_s)^{1/(n+1)}, where γ ∈ R⁺ is the connection radius for the rewiring process, N_s is the number of samples, and n is the state space dimension [27]. The Line function connects the nearest state x_nearest to the new state x_new with a straight line. The Parent subroutine identifies a parent state x_min and a corresponding cost-to-go c_min from any state in the near set X_near to the new state x_new. Next, the new state x_new is added to the set of nodes V, and the new edge connecting the parent state x_min with the new state x_new is added to the set of edges E. The Rewire function indicates the path with minimum cost, by adding or discarding edges between states in the near set X_near accordingly. After evaluating N_s samples, the process terminates and returns the graph G(V, E).

D. Local Replanning

The final phase of the task consists of placing the peg exactly above the hole. This is a challenging task, as the exact location of the hole is not known a priori. To this end, we exploit the perception system throughout the task execution to estimate the exact location of the hole. Then, we set the location of the hole as the final goal. Next, the local replanning framework is employed to modify the tail of the path according to the final goal.

The overall framework is illustrated in Figure 5. The robotstarts at the initial start state xinit

start and computes the global pathπ(xinit

start,xinitgoal). The initial goal state xinit

goal is selected from oneof the demonstration goal states. Through the task execution,the perception system seeks to observe the final goal state of

xx

xxgoal

xstartloc

XobsXobs

xgoalinit

xstart

xobservX

X freeloc

O

r+

init

final

x

x

xgoalfinal

xstartloc

Fig. 5. The space of the global path is compressed to a sufficiently smallsubspace that allows the implementation of the RRT? in real time. Thecomputed subspace on the right side of the figure is the relative complement ofthe obstacle space in the circular space. The geometric center O of the circularspace is the center of the line, connecting xloc

start and xfinalgoal . Additionally, xobserv

denotes the first state at which the final goal is observed. The dotted linedenotes the part of the global trajectory that the system keeps following untilthe replanned trajectory is executed.

the hole x_goal^final in real time. We name the first state at which the final goal state is observed as x_observ. The corresponding traversed path from the initial start state to the observation state is shown with a blue dashed line in Fig. 5. Since the RRT* is computationally demanding, we provide sufficient time for the replanning computation by assigning as the local replanning start state x_start^loc a future state after the observation of the hole. The traversed path from x_observ up to x_start^loc follows the global path and is indicated with a blue dotted line in Fig. 5. We condense the original space by defining a circular subspace X_circle^loc := {x ∈ X_free | ‖x − O‖₂ ≤ r}, based on the local replanning start state and the final goal state, with center O = (x_start^loc + x_goal^final)/2, i.e., the geometric center of x_start^loc and x_goal^final, and radius r = ‖x_start^loc − x_goal^final‖₂. Next, the local free subspace X_free^loc is produced as the relative complement of the obstacle space in the circular subspace, X_free^loc = X_circle^loc \ X_obs, as shown in the right part of Fig. 5. The completeness of the RRT* in the local free subspace is assessed with topological connectedness tools [17, Lemma 3]. The resulting local free subspace is sufficiently small to allow for the computation of the local path in real time. Lastly, provided the local start state x_start^loc and the final goal state x_goal^final, we execute the RRT* algorithm in X_free^loc to obtain a new path π(x_start^loc, x_goal^final).
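The construction of the circular subspace and the local free subspace described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: states are assumed to be planar NumPy vectors, obstacles are assumed to be discs given as (center, radius) pairs, and all function names are hypothetical.

```python
import numpy as np

def local_subspace(x_start_loc, x_goal_final):
    """Circular subspace for local replanning (cf. Fig. 5).

    Center O is the geometric center of the local start state and the
    final goal state; the radius is the distance between them.
    """
    O = (x_start_loc + x_goal_final) / 2.0
    r = float(np.linalg.norm(x_start_loc - x_goal_final))
    return O, r

def in_local_free(x, O, r, obstacles):
    """Membership test for X_free^loc = X_circle^loc \\ X_obs.

    `obstacles` is an illustrative list of (center, radius) discs.
    """
    if np.linalg.norm(x - O) > r:
        return False  # outside the circular subspace
    return all(np.linalg.norm(x - c) > rho for c, rho in obstacles)

def sample_local_free(O, r, obstacles, rng=None):
    """Rejection-sample a state, restricting an RRT*-style sampler
    to the condensed local free subspace."""
    rng = rng or np.random.default_rng()
    while True:
        x = O + rng.uniform(-r, r, size=O.shape)
        if in_local_free(x, O, r, obstacles):
            return x
```

Because the sampler only ever draws states inside X_free^loc, the tree growth is confined to the condensed subspace, which is what makes the real-time replanning budget feasible.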

To verify that the peg-in-hole task is completed, the system exploits the adaptive nature of the utilized gripper. After arriving at the final goal state, the end-effector executes a "Z" motion in the estimated hole plane, while tracking the object pose with respect to the camera frame. This final alignment motion can also compensate for minor hole estimation inaccuracies. When the object is detected within the hole edges, the task execution is considered successful.
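The success check described above can be approximated as a simple geometric containment test on the tracked object pose. The sketch below is an assumption-laden stand-in for the camera-based check, not the actual system code: the hole is modeled as a cylinder with a known center, unit axis, and radius, and all thresholds are hypothetical.

```python
import numpy as np

def insertion_succeeded(object_pos, hole_center, hole_axis, hole_radius, min_depth):
    """Illustrative peg-in-hole success test (hypothetical model).

    Declares success when the tracked object position lies within the
    hole edges (radially) and has advanced past a minimum insertion
    depth along the hole axis. `hole_axis` must be a unit vector
    pointing into the hole; all frames and thresholds are assumptions.
    """
    offset = object_pos - hole_center
    depth = float(np.dot(offset, hole_axis))   # progress along the hole axis
    radial = offset - depth * hole_axis        # deviation from the axis
    return np.linalg.norm(radial) <= hole_radius and depth >= min_depth
```

A check of this form tolerates minor hole-pose estimation errors in the same spirit as the "Z" alignment motion: the object only needs to end up within the hole edges, not exactly on the estimated axis.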

V. RESULTS AND DISCUSSION

To validate the implemented system experimentally, a group of five able-bodied subjects (male, ages: 26 ± 1) was instructed to demonstrate the peg-in-hole task using three different approaches: pendant-based teaching, kinesthetic teaching, and dataglove-based teleoperation. The pendant-based method

Fig. 6. Comparison of object trajectories for (a) kinesthetic and teleoperated user demonstrations, and (b) optimized trajectory executions of the corresponding demonstrations. Subfigures (c) and (d) isolate a demonstrated and optimized trajectory obtained through kinesthetic teaching. More precisely, subfigure (c) presents the full motion from grasp to sample placement, and subfigure (d) focuses on the local replanning. In subfigure (c), point A corresponds to the object grasping pose, point B to the initial pose for local replanning, and point C represents the sample insertion / placement point. The sharp rerouting of the optimized trajectory is due to the geometry and structure of the obstacle space.

served as a comparison baseline, utilizing the translation and rotation jog interface on the robot control box to guide the end-effector to the desired position. The demonstrations obtained through teleoperation and kinesthetic teaching were examined with and without path optimization and local replanning enabled. For each experiment, the sample was placed in a predefined initial position on a tray adjacent to the experimental platform. Only demonstrations that resulted in the successful execution of the assembly task have been considered for comparison and optimization. After completing the teaching experiments, subjects were instructed to complete a survey asking them to evaluate the three demonstration approaches.

The first stage of the system evaluation consisted of a comparison between the different demonstration approaches in terms of teaching time and ease of use. For each subject and demonstration approach, the teaching time was measured as the time between grasping the object on the tray and releasing it after successful insertion. The ease of use was obtained through the survey, where the subjects were asked to evaluate teaching difficulty on a scale of 0 to 10 (0 being very easy and 10 being very difficult). The obtained average teaching time and ease-of-use values for each demonstration approach are collected in Table I. The kinesthetic teaching approach

has proven to be the fastest and most user-friendly among the three, with subjects completing the task nearly four times faster than with the pendant. Even though teleoperation was reported to be very engaging, it was rather difficult to use due to the necessary safety distance from the operated robot. It must be noted, however, that the robot platform was relatively small compared to the human user, which contributed to the better kinesthetic teaching score. For bigger robots, the teleoperation approach would likely be more appropriate from a safety and accessibility perspective.

The demonstrated and optimized object trajectories obtained through kinesthetic teaching and teleoperation are presented in Fig. 6, where a single pair of kinesthetic teaching trajectories is isolated in subfigures (c) and (d) to allow for closer inspection. Examining the figures, it is clear that the raw user paths are longer and contain jittery motion that compromises execution efficiency. Such perturbations are particularly evident closer to the goal, where the user needs to guide the robot end-effector (adaptive gripper) with greater precision. Directly replaying a demonstration causes speed fluctuations and rarely results in successful sample insertion due to grasping and positioning inaccuracies. The teleoperated demonstration path lengths were on average 20% longer than the kinesthetic

TABLE I
AVERAGE TEACHING TIME, EASE-OF-USE SCORE, AND TRAJECTORY LENGTH COMPARISON

Teaching        Teaching    Ease of Use   User            Optimized
Method          Time [s]    [0-10]        Trajectory [m]  Trajectory [m]
Pendant         85.16       5.80          /               /
Kinesthetic     22.15       1.00          1.02            0.81
Teleoperation   49.75       5.00          1.21            0.86

demonstrations, as the users need significantly more time and alignment effort in teleoperation due to the safety distance and decreased visibility. The optimized trajectories, on the other hand, are much smoother. The local replanning (from point B to point C, depicted in Fig. 6) took on average less than 0.1 s due to the constrained search space and introduced no delay in the overall task execution. The optimized kinesthetic teaching and teleoperation trajectories were on average 20.3% and 28.9% shorter than the demonstrations, respectively. Overall, the optimized trajectories were on average 26.4% shorter than the demonstrations across all teleoperation and kinesthetic teaching trials. The experiments involving both the demonstrated and the optimized trajectories are presented in the accompanying video, which is available in HD quality at the following URL: www.newdexterity.org/skilltransfer
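The path-length comparisons above reduce to a piecewise-linear length computation over recorded waypoints. The following sketch shows how such figures can be derived; the example waypoints in the usage note are illustrative, not the experimental data.

```python
import numpy as np

def path_length(waypoints):
    """Total Euclidean length of a piecewise-linear trajectory,
    given as an (N, d) sequence of waypoints."""
    pts = np.asarray(waypoints, dtype=float)
    return float(np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1)))

def reduction_percent(demonstrated, optimized):
    """Percentage shortening of the optimized path relative to the
    demonstrated path, as reported in the trajectory comparisons."""
    d, o = path_length(demonstrated), path_length(optimized)
    return 100.0 * (d - o) / d
```

For instance, an L-shaped demonstration [[0, 0], [1, 0], [1, 1]] has length 2.0, and replacing it with the straight segment [[0, 0], [1, 1]] yields a reduction of about 29.3%.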

VI. CONCLUSION

This work focused on combining programming by demonstration with path optimization and local replanning methods to facilitate a fast and intuitive execution of assembly tasks requiring minimal user expertise and involvement. The framework can also be extended to enhance task programming in service robotics. Two demonstration approaches were developed, integrated into the framework, and compared: i) kinesthetic teaching and ii) teleoperation-based teaching. The two approaches have also been compared with the classic, pendant-based teaching. Both developed methods require a trajectory optimization step that transforms the demonstrated robot trajectories considering the obstacle space and the provided task specifications. The local replanning adjusts the optimized robot path based on online feedback from the perception system, guaranteeing collision-free navigation and efficient task execution. The efficiency of the methods has been validated with a series of experiments involving the execution of a complex assembly task with a robot arm and gripper system. The different approaches have been evaluated in terms of teaching time, ease of use, and path length.

REFERENCES

[1] B. D. Argall, S. Chernova, M. Veloso, and B. Browning, “A survey of robot learning from demonstration,” Robotics and Autonomous Systems, vol. 57, no. 5, pp. 469–483, May 2009.

[2] A. Billard, S. Calinon, R. Dillmann, and S. Schaal, Robot Programming by Demonstration. Springer Berlin Heidelberg, 2008, pp. 1371–1394.

[3] S. Chernova and A. L. Thomaz, “Robot learning from human teachers,” Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 8, no. 3, pp. 1–121, 2014.

[4] Z. Zhu and H. Hu, “Robot learning from demonstration in robotic assembly: A survey,” Robotics, vol. 7, no. 2, 2018.

[5] K. Fischer, F. Kirstein, L. C. Jensen, N. Krüger, K. Kuklinski, M. V. aus der Wieschen, and T. R. Savarimuthu, “A comparison of types of robot control for programming by demonstration,” in ACM/IEEE International Conference on Human-Robot Interaction, 2016, pp. 213–220.

[6] A. J. Ijspeert, J. Nakanishi, and S. Schaal, “Movement imitation with nonlinear dynamical systems in humanoid robots,” in IEEE International Conference on Robotics and Automation, vol. 2, 2002, pp. 1398–1403.

[7] M. V. Liarokapis, C. P. Bechlioulis, G. I. Boutselis, and K. J. Kyriakopoulos, A Learn by Demonstration Approach for Closed-Loop, Robust, Anthropomorphic Grasp Planning. Springer International Publishing, 2016, pp. 127–149.

[8] M. Liarokapis, C. P. Bechlioulis, P. K. Artemiadis, and K. J. Kyriakopoulos, “Deriving Humanlike Arm Hand System Poses,” Journal of Mechanisms and Robotics, vol. 9, no. 1, 2017.

[9] S. Karaman and E. Frazzoli, “Sampling-based algorithms for optimal motion planning,” The International Journal of Robotics Research, vol. 30, no. 7, pp. 846–894, 2011.

[10] R. Geraerts and M. H. Overmars, “Creating high-quality paths for motion planning,” The International Journal of Robotics Research, vol. 26, no. 8, pp. 845–863, 2007.

[11] K. Ota, D. K. Jha, T. Oiki, M. Miura, T. Nammoto, D. Nikovski, and T. Mariyama, “Trajectory optimization for unknown constrained systems using reinforcement learning,” arXiv:1903.05751, 2019.

[12] J. J. Kuffner and S. M. LaValle, “RRT-Connect: An efficient approach to single-query path planning,” in IEEE International Conference on Robotics and Automation, vol. 2, 2000, pp. 995–1001.

[13] L. E. Kavraki, P. Svestka, J.-C. Latombe, and M. H. Overmars, “Probabilistic roadmaps for path planning in high-dimensional configuration spaces,” IEEE Transactions on Robotics and Automation, vol. 12, no. 4, pp. 566–580, 1996.

[14] S. Karaman and E. Frazzoli, “Sampling-based algorithms for optimal motion planning,” The International Journal of Robotics Research, vol. 30, no. 7, pp. 846–894, 2011.

[15] M. Otte and E. Frazzoli, “RRTX: Asymptotically optimal single-query sampling-based motion planning with quick replanning,” The International Journal of Robotics Research, vol. 35, no. 7, pp. 797–822, 2016.

[16] J. D. Gammell, S. S. Srinivasa, and T. D. Barfoot, “Informed RRT*: Optimal sampling-based path planning focused via direct sampling of an admissible ellipsoidal heuristic,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014, pp. 2997–3004.

[17] G. P. Kontoudis and K. G. Vamvoudakis, “Kinodynamic motion planning with continuous-time Q-learning: An online, model-free, and safe navigation framework,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 12, pp. 3803–3817, 2019.

[18] C.-M. Chang, L. Gerez, N. Elangovan, A. Zisimatos, and M. Liarokapis, “On alternative uses of structural compliance for the development of adaptive robot grippers and hands,” Frontiers in Neurorobotics, vol. 13, p. 91, 2019.

[19] R. Y. Tsai and R. K. Lenz, “A new technique for fully autonomous and efficient 3D robotics hand/eye calibration,” IEEE Transactions on Robotics and Automation, vol. 5, no. 3, pp. 345–358, June 1989.

[20] M. Quigley, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng, “ROS: an open-source Robot Operating System,” in ICRA Workshop on Open Source Software, 2009.

[21] S. Chitta et al., “ros_control: A generic and simple control framework for ROS,” The Journal of Open Source Software, 2017.

[22] R. B. Rusu and S. Cousins, “3D is here: Point Cloud Library (PCL),” in IEEE International Conference on Robotics and Automation, Shanghai, China, May 9–13, 2011.

[23] I. A. Sucan and S. Chitta. MoveIt. [Online]. Available: http://moveit.ros.org

[24] R. B. Rusu, N. Blodow, and M. Beetz, “Fast Point Feature Histograms (FPFH) for 3D registration,” in IEEE International Conference on Robotics and Automation, 2009, pp. 3212–3217.

[25] A. G. Buch, D. Kraft, J. Kamarainen, H. G. Petersen, and N. Krüger, “Pose estimation using local structure-specific shape and appearance context,” in IEEE International Conference on Robotics and Automation, 2013, pp. 2080–2087.

[26] P. J. Besl and N. D. McKay, “Method for registration of 3-D shapes,” in Sensor Fusion IV: Control Paradigms and Data Structures, vol. 1611. International Society for Optics and Photonics, 1992, pp. 586–606.

[27] K. Solovey, L. Janson, E. Schmerling, E. Frazzoli, and M. Pavone, “Revisiting the asymptotic optimality of RRT*,” arXiv:1909.09688, 2019.