
Trajectories and Keyframes for Kinesthetic Teaching: A Human-Robot Interaction Perspective

    Baris Akgun1, Maya Cakmak1, Jae Wook Yoo2, Andrea L. Thomaz1

1School of Interactive Computing, Georgia Inst. of Technology, 801 Atlantic Dr., Atlanta, GA

    {bakgun3,maya,athomaz}@cc.gatech.edu

2Computer Science Dept., Texas A&M University, College Station, TX

    jwookyoo@gmail.com

ABSTRACT
Kinesthetic teaching is an approach to providing demonstrations to a robot in Learning from Demonstration whereby a human physically guides a robot to perform a skill. In the common usage of kinesthetic teaching, the robot's trajectory during a demonstration is recorded from start to end. In this paper we consider an alternative, keyframe demonstrations, in which the human provides a sparse set of consecutive keyframes that can be connected to perform the skill. We present a user study (n = 34) comparing the two approaches and highlighting their complementary nature. The study also tests and shows the potential benefits of iterative and adaptive versions of keyframe demonstrations. Finally, we introduce a hybrid method that combines trajectories and keyframes in a single demonstration.

Categories and Subject Descriptors
I.2.9 [Artificial Intelligence]: Robotics; H.1.2 [Models and Principles]: User/Machine Systems

General Terms
Experimentation, Design

Keywords
Learning from Interaction

1. INTRODUCTION

The goal of Learning from Demonstration (LfD) is to enable humans to program robot skills by showing successful examples [4]. There are various ways that this demonstration can take place. In this work we focus on kinesthetic teaching, whereby a human teacher physically guides the robot in performing the skill, as in Fig. 1.

Kinesthetic teaching has several advantages for LfD. Since the teacher is directly manipulating the robot, there is no correspondence problem, and demonstrations are restricted to the kinematic limits (e.g., workspace, joint limits) of the robot. Moreover, extra hardware/instrumentation, such as motion capture or teleoperation devices, is not necessary.

Figure 1: Simon interacting with a teacher.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright 2012 ACM X-XXXXX-XX-X/XX/XX ...$10.00.

While there has been much work on representations and learning algorithms, the usability of kinesthetic teaching has not been explored in depth. Kinesthetic teaching may be challenging for everyday users who do not have experience manipulating a robot arm with many degrees of freedom.

In many practical LfD applications (e.g., homes, schools, hospitals), the teacher will not be an expert in machine learning or robotics. Our research identifies challenges that everyday users face with common robot learning methods and investigates improvements to these methods to increase their effectiveness.

In the typical kinesthetic teaching interaction, and in most LfD interactions, each demonstration is an entire state trajectory, which involves providing a continuous, uninterrupted demonstration of the skill. In this paper, we explore the alternative of providing a sparse set of consecutive keyframes that achieve the skill when connected together. We present an experiment that compares these through quantitative measures, survey results, and expert evaluations. We find that both are suitable for kinesthetic teaching from the user's perspective and that each communicates different information. We also present two modified keyframe interactions and evaluate their utility. Based on our findings, we introduce a way to merge trajectory and keyframe demonstrations to take advantage of their complementary nature.

2. RELATED WORK

There are several methods for learning skills, which mostly fit into two groups: direct policy learning and cost/reward learning. Dynamical-system approaches such as the Stable Estimator of Dynamical Systems (SEDS) [13] and Dynamic Movement Primitives (DMPs) [17], as well as mixture models (e.g., Gaussian Mixture Models as in [10]), fall into the former; inverse reinforcement learning (IRL) [1] and apprenticeship learning fall into the latter. These methods are designed for different skill types and all have their pros and cons. Apart from GMMs and DMPs, most methods require many training samples, which is not suitable for a short-duration HRI study. However, DMPs and GMMs have either implicit or explicit time dependency. Most of the methods either cannot handle cyclic skills or need to be reformulated to do so. We chose GMMs as our skill-learning algorithm for the following reasons: GMMs can be learned from a low number of demonstrations, can be trained in interaction time, can handle cyclic skills as well as point-to-point skills, and can be modified slightly to handle keyframe demonstrations (see Section 4.2.2); furthermore, we do not need temporal robustness (e.g., the robot will not be stopped in the middle of the motion) in our experiments.

In kinesthetic teaching, demonstrations are often represented as arm joint trajectories and/or end-effector paths [9, 12]. Some also consider the position of the end-effector with respect to the target object of the skill [5, 11]. Typically, the start and end points of a demonstration are explicitly demarcated by the teacher. Most studies subsample the recorded data at a fixed rate [3, 5]. Demonstrations are often time-warped so that a frame-by-frame correspondence can be established between multiple demonstrations [12].
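The fixed-rate subsampling mentioned above can be sketched as resampling each recorded joint trajectory to a constant number of frames via linear interpolation. This is an illustrative sketch, not the paper's implementation; the function name, the target length of 100, and the random stand-in data are our assumptions.

```python
# Resample a recorded joint trajectory to a fixed number of frames.
# Assumption: rows are frames, columns are joint angles.
import numpy as np

def resample(traj, n=100):
    """traj: (T, n_joints) array recorded at the robot's control rate."""
    old_t = np.linspace(0.0, 1.0, len(traj))   # normalized original timeline
    new_t = np.linspace(0.0, 1.0, n)           # normalized fixed-length timeline
    # Interpolate each joint dimension independently.
    return np.column_stack(
        [np.interp(new_t, old_t, traj[:, j]) for j in range(traj.shape[1])])

# Stand-in for a demonstration of arbitrary length with 7 joints.
raw = np.random.default_rng(0).standard_normal((437, 7))
fixed = resample(raw)  # shape (100, 7)
```

Because the endpoints of the normalized timelines coincide, the first and last recorded poses are preserved exactly, which matters when the start and end of a demonstration are explicitly demarcated by the teacher.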

Keyframes have been used extensively in the computer animation literature [16]. The animator creates the important frames in a scene and the software fills in the frames in between. In the LfD setting, earlier work [15] utilizes via-points, which are similar to keyframes. These are extracted from continuous teacher demonstrations and updated to achieve the demonstrated skill. A recent approach is to record only keyframes and use them to learn a constraint manifold for the state space in a reinforcement learning setting [6]. In this paper we consider both trajectory and keyframe representations.
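In the spirit of the animation analogy above, in-betweening can be illustrated by connecting consecutive keyframes with a smooth blend. This is a generic sketch, not any of the cited methods; the minimum-jerk blend profile, the 2-DoF keyframe poses, and the step count are our choices for illustration.

```python
# Fill in the frames between sparse keyframes with a smooth blend.
import numpy as np

def inbetween(kf_a, kf_b, steps):
    """Interpolate from pose kf_a to pose kf_b over `steps` frames."""
    tau = np.linspace(0.0, 1.0, steps)
    # Minimum-jerk profile: blends 0 -> 1 with zero velocity at both ends.
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return kf_a + np.outer(s, kf_b - kf_a)

# Three hypothetical 2-DoF keyframe poses.
keyframes = np.array([[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]])
path = np.vstack([inbetween(keyframes[i], keyframes[i + 1], 20)
                  for i in range(len(keyframes) - 1)])
```

The zero-velocity endpoints of the blend mean the motion pauses briefly at each keyframe, which is one plausible way a sparse set of keyframes could be turned into a continuous motion.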

Human-robot interaction (HRI) has not been a focus of prior work on kinesthetic teaching, but there are a few examples. In [18], kinesthetic teaching is embedded within a dialog system that lets the user start/end demonstrations and trigger reproductions of the learned skill with verbal commands. A modification to the kinesthetic teaching interface is kinesthetic correction [7, 8], where the teacher corrects aspects of a learned skill in an incremental learning interaction by using a subset of joints in subsequent demonstrations. In another study [14], four types of force controllers that affect the response to users are evaluated for kinesthetic teaching. The study addressed human preferences regarding which controller felt the most natural.

While various learning methods for kinesthetic teaching have been explored, there is a lack of studies with end-users testing the effectiveness of these techniques in terms of HRI. This is the focus of our work.

3. PLATFORM

Our robotic platform in this work is Simon, an upper-torso humanoid social robot with two 7-DoF arms, two 4-DoF hands, and a socially expressive head and neck, including two 2-DoF ears with full RGB-spectrum LEDs (Fig. 1). Simon is designed for face-to-face human-robot interaction. Simon's arms have compliant series-elastic actuators with stiffness that can be dynamically changed. We use the Microsoft Windows 7 Speech API for speech recognition.

4. DEMONSTRATION METHODS

We explore three different ways for teachers to demonstrate skills: trajectory demonstrations, keyframe demonstrations, and keyframe iterations. In all three, users physically manipulate the robot's right arm, which is gravity-compensated, to teach it skills (Fig. 1). In this section we detail the implementation of each kinesthetic teaching mode.

    4.1 Trajectory Demonstrations

4.1.1 Interaction

The teacher is informed that the robot will record all the movement they make with its right arm. The teacher initiates the demonstration by saying "New demonstration", moves the arm to make the robot perform the skill, and finishes by saying "End of demonstration". This process is repeated to give as many demonstrations as the person desires. After a demonstration, the teacher can use the speech command "Can you perform the skill?" to have the robot perform the current state of the learned skill, and adjust his/her demonstrations to address any errors.
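The spoken-command protocol described above can be sketched as a small state machine that toggles recording and collects finished demonstrations. The command strings mirror the ones in the text; the class, its method names, and the "execute" return value are hypothetical, since the paper does not describe its software architecture.

```python
# Minimal sketch of a demonstration-recording session driven by
# speech commands. All names here are illustrative assumptions.

class DemoSession:
    def __init__(self):
        self.recording = False
        self.demos = []       # finished demonstrations
        self._current = []    # frames of the demonstration in progress

    def on_command(self, phrase):
        if phrase == "New demonstration":
            self.recording, self._current = True, []
        elif phrase == "End of demonstration":
            self.recording = False
            self.demos.append(self._current)
        elif phrase == "Can you perform the skill?":
            return "execute"  # caller triggers playback of the learned skill
        return None

    def on_frame(self, joint_angles):
        # Called at the control rate while the teacher moves the arm.
        if self.recording:
            self._current.append(joint_angles)

session = DemoSession()
session.on_command("New demonstration")
session.on_frame([0.0] * 7)  # one 7-DoF joint-angle frame
session.on_command("End of demonstration")
```

A session structured this way naturally supports the repeat-as-desired loop in the text: each "New demonstration"/"End of demonstration" pair appends one more demonstration for the learner.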

4.1.2 Learning

Joint-angle trajectories are recorded as the teacher moves the robot's arm to perform the skill. As mentioned previously, we use GMMs as our skill-learning algorithm [10].

The data is subsampled in the time dimension to a constant length before being input to the learning algorithm. In our case, learning is done in an eight-dimensional space (7 joint angles over time). First, we use the K-means algorithm with a constant k. The resulting clusters are used to calculate initial mean vectors and covariance matrices for the expectation-maximization (EM) algorithm. The EM algorithm is run to fit a Gaussian Mixture Model (GMM) to the data. The resulting GMM has k components, which is kept constant during our experiments. Gaussian Mixture Regression (GMR) is used to generate a trajectory to perform the learned skill. The desired time-dimension vector is given to GMR, which in turn generates the joint positions. Note that the algorithm can learn from either a single demonstration or multiple demonstrations.
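The GMM/GMR pipeline above can be sketched as follows. This is a minimal stand-in, not the authors' code: scikit-learn's `GaussianMixture` with `init_params="kmeans"` plays the role of K-means-initialized EM, k = 5 and the 100-point length are invented, and a single synthetic "joint" stands in for the 7 joint angles.

```python
# Sketch: fit a time-indexed GMM to demonstrations, then use GMR
# (conditioning the joint dimensions on time) to reproduce the skill.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmm(demos, k=5):
    # Stack demonstrations into one (N, 1 + n_joints) dataset: [time, joints].
    data = np.vstack(demos)
    # init_params="kmeans" mirrors K-means initialization before EM.
    return GaussianMixture(n_components=k, covariance_type="full",
                           init_params="kmeans", random_state=0).fit(data)

def gmr(gmm, times):
    """Gaussian Mixture Regression: E[joints | time] for each query time."""
    out = np.zeros((len(times), gmm.means_.shape[1] - 1))
    mu_t = gmm.means_[:, 0]
    var_t = gmm.covariances_[:, 0, 0]
    for i, t in enumerate(times):
        # Responsibility of each component for this time value.
        h = gmm.weights_ * np.exp(-0.5 * (t - mu_t) ** 2 / var_t) / np.sqrt(var_t)
        h /= h.sum()
        for j in range(gmm.n_components):
            cross = gmm.covariances_[j, 1:, 0]  # cov(joints, time)
            cond = gmm.means_[j, 1:] + cross / var_t[j] * (t - mu_t[j])
            out[i] += h[j] * cond
    return out

# Synthetic demonstrations: one "joint" tracing a sine over normalized time.
t = np.linspace(0.0, 1.0, 100)
demo = np.column_stack([t, np.sin(2 * np.pi * t)])
noisy = demo + np.random.default_rng(0).normal(0.0, 0.01, demo.shape)
gmm = fit_gmm([demo, noisy])
traj = gmr(gmm, t)  # reproduced joint trajectory, shape (100, 1)
```

Giving GMR the full time vector yields the full reproduction; because fitting works on whatever rows are stacked, the same code accepts one demonstration or several, as noted in the text.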

    4.2 Keyframe Demonstrations

4.2.1 Interaction

The teacher is informed that the r