

HAL Id: hal-01330339
https://hal.archives-ouvertes.fr/hal-01330339

Submitted on 10 Jun 2016

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


An Implemented Theory of Mind to Improve Human-Robot Shared Plans Execution

Sandra Devin, Rachid Alami

To cite this version: Sandra Devin, Rachid Alami. An Implemented Theory of Mind to Improve Human-Robot Shared Plans Execution. The Eleventh ACM/IEEE International Conference on Human-Robot Interaction, Mar 2016, Christchurch, New Zealand. pp. 319-326, 10.1109/HRI.2016.7451768. hal-01330339

An Implemented Theory of Mind to Improve Human-Robot Shared Plans Execution

Sandra Devin
CNRS, LAAS, Univ de Toulouse, INP, LAAS,
7 avenue du colonel Roche, F-31400 Toulouse, France

Rachid Alami
CNRS, LAAS, Univ de Toulouse, LAAS,
7 avenue du colonel Roche, F-31400 Toulouse, France

Index Terms—Theory of Mind; Joint Action; Shared Plan; HRI.

Abstract—When a robot has to execute a shared plan with a human, a number of unexpected situations and contingencies can happen due, essentially, to human initiative. For instance, a temporary absence or inattention of the human can entail a partial, and potentially insufficient, knowledge about the current situation. To ensure a successful and fluent execution of the shared plan, the robot might need to detect such situations and be able to provide its human partner with the information he missed, without being annoying or intrusive. To do so, we have developed a framework which allows a robot to estimate the other agents' mental states not only about the environment but also about the state of goals, plans and actions, and to take them into account when executing human-robot shared plans.

I. INTRODUCTION

In robotics, one of the current research interests is to create robots able to work jointly with humans or to help them in everyday life. To do so, robots need to perform joint actions with humans. They are already capable of computing shared plans [1], not only for themselves, but also for other agents (humans or robots) involved in a joint task ([2], [3]). They can execute their part of the plan while monitoring the activity of the other agents. However, when working with humans, the global execution of these shared plans is not always simple. Indeed, unexpected situations and contingencies can happen, essentially due to human initiative, and robots need to interpret them correctly and act accordingly.

One of these unexpected situations is the temporary absence or inattention of a human. If it happens, when the human comes back, he can lack information about what happened during his absence. Some of the facts are observable while others are not, leaving the human with a partial, and potentially insufficient, knowledge about the current situation. To ensure a successful and fluent execution of the shared plan, the robot needs to inform the human about what he missed. However, systematically informing him about all missing information can quickly lead to an intrusive and annoying behaviour of the robot. Consequently, the robot needs to be able to distinguish which information the human really needs and which he does not. To do so, the robot needs to correctly estimate and take into account the mental state of its human partner.

The ability to reason about other people's perception, beliefs and goals and to take them into account is called Theory of Mind (ToM) ([4], [5]). In robotics, previous work on ToM focused mainly on perspective taking and belief management: robots are able to reason about what humans can perceive or not, and then construct representations of the world from their point of view ([6], [7], [8]). They can use this knowledge to learn tasks, solve ambiguous situations or understand human behaviour. However, there is still a gap between such representations and those necessary to permit an effective shared plan execution.

In order to fill this gap, we present, as a first step, a framework that allows the robot to estimate its human partner's mental state related to collaborative task achievement. Such mental states contain not only the state of the world but also the state of the goals, plans and actions. They are based on the capability of the robot to permanently compute the spatial perspective of its partners and to track their activity. As a second step, we present how the robot uses these mental states to perform joint actions with humans and, more particularly, to manage the execution of shared plans in a context of humans and robots performing collaborative object manipulation. As a result, the robot is able to adapt to human decisions and actions and to inform humans when needed, without being intrusive by giving (unnecessary) information that they can observe or infer by themselves.

We first present the global architecture in §III and then briefly situate, in §IV, our work relative to the joint action literature. The formal definitions of what we call a goal, a plan and an action are given in §V, and we define more precisely the representation of the mental states used by our robot in §VI. These definitions are then used in §VII to explain how the mental states are estimated and updated. §VIII shows how the robot uses this information to manage shared plan execution. Finally, we evaluate our system in §IX and conclude.

II. BACKGROUND

Previous work on ToM in robotics mainly concerns perspective taking and belief management. One pioneering work is [9], where two models from social sciences are analysed, leading to a first architecture integrating ToM for robots. Trafton et al., based on the Polyscheme architecture, compute spatial perspective taking to help the understanding of instructions in dialogue between astronauts [6]. Based on the ACT-R architecture, [10] models the mechanisms used to take decisions during the Sally and Anne test ([11]). ACT-R is also used in [12] to explain, with the help of perspective taking, unexpected behaviours from a human partner. Breazeal et al. use perspective taking to learn tasks from ambiguous demonstrations [13]. Gray et al. simulate the outcomes of actions from other agents' point of view to predict the actions' effects in others' mental states [14]. Milliez et al. propose a belief management system that enables a robot to pass the Sally and Anne test [8]. This reasoning is used in [15] to solve ambiguous situations in dialogue and in [3] to compute human-aware plans taking into account potential divergence of beliefs between the human and the robot or lack of information in the human's knowledge about the current world state.

Concerning shared plan execution, to our knowledge, only a few contributions take the human partner's mental state into account. Human activities can be fairly well monitored ([16], [17]), robot actions can be executed taking into account the humans in the environment ([18], [19]) and human-aware plans can be computed ([2], [3]). However, very few architectures track and take into account the human partner's state during the execution of a joint task. Clodic et al. present SHARY, a supervision system which enables the execution of human-aware plans [20]. In this system, the execution can be stopped and a new plan computed in case of unexpected situations. Fiore et al. extended this work in a more robust way that includes new aspects of joint action like reactive action execution [21]. Fong et al. present HRI/OS, a system able to produce and schedule tasks for different agents based on their capacities [22]. However, the agents act mostly in a parallel and independent way. Shah et al. present Chaski, a task-level executive that chooses when to execute the robot's actions by adapting to the human partner [23]. However, none of these architectures explicitly takes into account the human's mental state when executing a shared plan with him.

Previous work has also been conducted in order to model shared plans [24] or agent intentions [25]. Researchers in Artificial Intelligence (AI) developed Beliefs, Desires and Intentions (BDI) theories to model rational agents for multi-agent activities ([26], [27]). Frameworks have been devised to model multi-agent joint activities and shared plan execution in a way that is robust with regard to joint intention and beliefs about the joint intention of agents, but without taking into account other agents' knowledge about the state of the plans and actions, nor spatial perspective taking ([28], [29]).

III. OVERALL ARCHITECTURE

In this paper, we place ourselves in a context where an assistant or team-mate robot has to work jointly with other agents (humans or robots). They share an environment and mutually observe each other. Fig. 1 illustrates the architecture implemented on the robot. It is composed of:

Fig. 1: The implemented architecture. The work presented here concerns the ToM Manager and the Supervisor.

• A Situation Assessment module [8]: which takes as input the sensor data and maintains the current world state from the point of view of all agents based on spatial perspective-taking. It also computes a set of symbolic facts that represent the observable world state from the different agents' point of view. These facts concern relations between objects (e.g. (mug isOn table)) or agents' affordances (e.g. (mug isVisibleBy human), (book isReachableBy robot)); see the sketch after this list.

• A ToM Manager: which takes the symbolic world models computed by the Situation Assessment module and information from the supervisor about the execution of goals, plans and actions in order to estimate and maintain the mental state of each agent involved in the cooperation. These mental states contain not only world state information but also an estimation of the agents' beliefs about the tasks and the other agents' capacities. More details concerning this module are given in §VII.

• A high-level task planner [3], [30]: which allows the robot to synthesize shared plans containing the actions of all agents involved in a given task. It is an HTN task planner which bases its reasoning on the symbolic representations of the world from the Situation Assessment module and on the agents' abilities in the current context (for example which agent can reach which object). It searches for the best possible plan to achieve the shared goal by taking into account human-aware costs in order to come up with highly acceptable plans. An example of such a plan is given in Fig. 3(b).

• A geometric action and motion planner [31]: which computes trajectories as well as object placements and grasps in order to perform actions like Pick or Place while taking into account the human's safety and comfort.

• A dialogue manager [32], [33]: which verbalizes information to the human and recognizes basic vocal commands.

• A Supervisor [21]: in charge of the collaborative activity. To do so, it takes the human partner's mental state into account to decide when to perform actions or to communicate with him. It also interprets the information coming from the Situation Assessment module in order to recognize human actions like Pick or Place. More details concerning this module can be found in §VIII.
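The observable facts exchanged between these modules are simple symbolic triples such as (mug isOn table). Below is a purely illustrative sketch of how such facts and an agent's observable world state could be represented; it is an assumption used by the later examples, not the authors' API.

```python
from typing import Set, Tuple

# A fact is a (subject, predicate, object) triple, e.g. (mug isOn table).
Fact = Tuple[str, str, str]

# Example of an observable world state as a situation-assessment-like module
# could produce it for one agent (names are illustrative).
world_state: Set[Fact] = {
    ("mug", "isOn", "table"),
    ("mug", "isVisibleBy", "human"),
    ("book", "isReachableBy", "robot"),
}

# A precondition or affordance check is then a simple set membership test.
assert ("mug", "isOn", "table") in world_state
```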

IV. OUR WORK IN THE JOINT ACTION CONTEXT

The first step during a joint action is to share a goal. This implies a number of issues related to agent commitment. In order to focus on the paper contribution, we consider here that the joint goal has already been established: we consider that the robot and its human partners have a commitment following the definition of weak achievement goal [34]. Consequently, we assume that none of the humans will abort the goal unless he knows that the goal is not achievable any more.

Once the joint goal has been established, the involved agents need to share a plan. This plan can come from multiple sources: it can be imposed by a human, negotiated through dialogue, or the robot can compute it. To focus on the execution of the shared plan, we choose, for this paper, to let the robot compute the plan. However, the processes presented in this paper hold even if the plan comes from a different source. Moreover, as the symbolic planner used in the architecture takes into account human-aware costs, we assume that the computed plan will be close enough to human expectations to be accepted. This plan is automatically shared by the robot at the beginning of the interaction (by displaying it on a screen near the robot or by verbalization). In this paper, we discuss the processes used during the execution of shared plans and do not focus on issues linked to the computation or the communication of these plans.

The execution of a shared plan is based on agent capacities. We distinguish between human and robot capacities. We consider that a human has the capacities to:

• Perform high-level actions: He is able to perform a set of high-level actions like Pick or Place. The shared plan is then computed based on this assumption. In order to know what a human can do in a given situation, the Situation Assessment module computes, based on geometry, an estimation of his ability to reach the objects of the environment.

• Perceive: The Situation Assessment module computes an estimation of what the human can see in the environment. This information, combined with a belief management algorithm, allows the robot to estimate, at any moment, what the human knows about the environment. Concerning actions, we make the assumption that a human will see and understand an action of another agent (mainly robot actions) when he is present and looking at that agent.

• Communicate: The dialogue manager allows the human, at any moment, to ask the robot to perform an action. We also assume that, when he is present, the human is able to hear and understand the information verbalized by the robot.

We also consider that the robot has the following capacities:

• Perform high-level actions: like Pick, Place, Drop, Handover, etc. Similarly to the human, we consider a set of high-level actions that the robot is able to perform and that are taken into account to build the shared plan. Likewise, reachabilities are computed for the robot in order to decide which actions it is able to perform at each time.

• Perceive: The Situation Assessment module maintains an estimation of the current state of the environment: the robot is able to detect and localize objects and agents. The robot is also able to recognize simple high-level actions performed by a human, like Pick, Place or Drop.

• Communicate: Thanks to the Dialogue module, the robot is able to ask a human to perform an action and to inform him about the state of the environment, the goal, a plan or an action. The robot is also able to share a plan through speech synthesis (plan verbalization) or by displaying it on a screen.

V. REPRESENTATIONS

Let us define the goal which the robot needs to achieve with other agents, the plans computed to achieve this goal and the actions which compose these plans.

Let G be the set of all goals. A goal g ∈ G is defined as:

g = ⟨name_g, AGG_g, O_g⟩

where name_g is used to identify the goal, O_g is a set of facts representing the desired world state and AGG_g is a set of agents¹ involved in the goal.

Let P be the set of all plans. A plan p ∈ P is defined as:

p = ⟨id_p, g_p, ACP_p, L_p⟩

where id_p is used to identify the plan and g_p is the goal that the plan achieves. ACP_p is a set of actions (see below) that compose the plan and L_p is a set of links defining action ordering and causal links.

A link l is defined as l = ⟨previous_c, after_c⟩, where previous_c is the id of the action which needs to be achieved before the action with the id after_c is performed.

Let ACT be the set of all actions. An action ac ∈ ACT is defined as:

ac = ⟨id_ac, name_ac, AGC_ac, PR_ac, PRE_ac, EFF_ac⟩

where id_ac is the action identifier and name_ac represents its name (e.g. pick, place, etc.). AGC_ac is a set of agent names representing the actors of the action and PR_ac is a set of entities (objects or agents) which define the action precisely (e.g. the name of the object to pick). PRE_ac and EFF_ac are sets of facts representing respectively the action preconditions and effects.
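To make these definitions concrete, the following is a minimal, illustrative Python rendering of the tuples defined in this section (class and field names simply mirror the notation above; this is a sketch, not the authors' implementation).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple

Fact = Tuple[str, str, str]          # e.g. ("mug", "isOn", "table")

@dataclass
class Goal:                           # g = <name_g, AGG_g, O_g>
    name: str                         # name_g: identifies the goal
    agents: FrozenSet[str]            # AGG_g: agents involved in the goal
    objective: Set[Fact]              # O_g: facts describing the desired world state

@dataclass
class Action:                         # ac = <id_ac, name_ac, AGC_ac, PR_ac, PRE_ac, EFF_ac>
    id: int                           # id_ac: action identifier
    name: str                         # name_ac: e.g. "pick", "place"
    actors: FrozenSet[str]            # AGC_ac: agents performing the action
    parameters: Tuple[str, ...]       # PR_ac: entities defining the action precisely
    preconditions: Set[Fact]          # PRE_ac
    effects: Set[Fact]                # EFF_ac

@dataclass
class Link:                           # l = <previous_c, after_c>
    previous: int                     # id of the action that must be achieved first
    after: int                        # id of the action that may only be performed afterwards

@dataclass
class Plan:                           # p = <id_p, g_p, ACP_p, L_p>
    id: int                           # id_p: identifies the plan
    goal: str                         # g_p: name of the goal the plan achieves
    actions: List[Action]             # ACP_p: actions composing the plan
    links: List[Link]                 # L_p: ordering and causal links
```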

VI. MENTAL STATES

Let us also define how the robot represents other agents. Let A be the set of all agents. An agent ag ∈ A is defined as:

ag = ⟨name_ag, type_ag, CAP_ag, MS_ag, AG_ag⟩

where name_ag is used to identify the agent and type_ag represents whether the agent is a human or a robot. CAP_ag, the set of high-level action names representing the actions that the agent is able to perform, is used by the symbolic planner to produce plans. MS_ag is the mental state of the agent (see below) and AG_ag is a set of agents containing all agents except ag: AG_ag ⊂ A, AG_ag = A \ {ag}. These agents are defined in the same way as ag and model how the given agent represents them. Accordingly, these agents also contain other agents, and so on, in order to model higher-order ToM [35].

¹Often only the robot and a human partner, but other robots and humans can be involved.

Fig. 2: Goal, plan and action evolutions. (a) Evolution of the state of a goal g in the mental state of an agent ag. (b) Evolution of the state of a plan p in the mental state of an agent ag. (c) Evolution of the state of an action ac coming from a plan p in the mental state of an agent ag.

The mental state MS_ag of ag is defined as:

MS_ag = ⟨WS_ag, GS_ag, PS_ag, ACS_ag⟩

where WS_ag is a set of facts representing the current world state from the agent's point of view.

GS_ag represents the state of the goals from the agent's point of view. The state of a goal g is represented as ⟨name_g, state_g⟩ where state_g can be either PROGRESS if the agent thinks the goal is in progress, DONE if the agent thinks the goal has been achieved, or ABORTED if the agent thinks the goal has been aborted.

PS_ag represents the state of the plans from the agent's point of view. The state of a plan p is represented as ⟨id_p, state_p⟩ where state_p can be either PROGRESS, DONE, ABORTED or UNKNOWN if the agent is not aware of the plan².

Finally, ACS_ag represents the state of the actions from the agent's point of view. The state of an action ac is represented as ⟨id_ac, state_ac⟩ where state_ac can be either PROGRESS, DONE, FAILED, ASKED (an agent asked for the action to be done), PLANNED (needs to be done later according to the current plan), NEEDED (needs to be done now according to the current plan but is not possible), READY (needs to be done now according to the current plan and is possible) or UNKNOWN.

²An agent is not aware of a shared plan if it has not contributed to its synthesis or has not been informed by the agent(s) who elaborated it.

At each moment, a goal, plan or action can only have one state in each agent's mental state.
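As an illustration of these definitions, the agent model and its mental state could be encoded as follows. This is a hypothetical Python rendering of the notation above (one state per goal, plan and action, as just stated), not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Fact = Tuple[str, str, str]

# Possible state values, as enumerated above.
GOAL_STATES   = {"PROGRESS", "DONE", "ABORTED"}
PLAN_STATES   = {"PROGRESS", "DONE", "ABORTED", "UNKNOWN"}
ACTION_STATES = {"PROGRESS", "DONE", "FAILED", "ASKED",
                 "PLANNED", "NEEDED", "READY", "UNKNOWN"}

@dataclass
class MentalState:                                    # MS_ag = <WS_ag, GS_ag, PS_ag, ACS_ag>
    world_state: Set[Fact] = field(default_factory=set)          # WS_ag
    goal_states: Dict[str, str] = field(default_factory=dict)    # GS_ag: name_g -> state
    plan_states: Dict[int, str] = field(default_factory=dict)    # PS_ag: id_p  -> state
    action_states: Dict[int, str] = field(default_factory=dict)  # ACS_ag: id_ac -> state

@dataclass
class Agent:                                          # ag = <name_ag, type_ag, CAP_ag, MS_ag, AG_ag>
    name: str                                         # name_ag
    type: str                                         # type_ag: "HUMAN" or "ROBOT"
    capacities: Set[str] = field(default_factory=set)                # CAP_ag: performable action names
    mental_state: MentalState = field(default_factory=MentalState)   # MS_ag
    others: List["Agent"] = field(default_factory=list)              # AG_ag: models of the other agents
```

Using one dictionary per category naturally enforces the constraint that each goal, plan and action has exactly one state in a given mental state.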

VII. THE TOM MANAGER

The ToM Manager allows the robot to estimate and maintain the mental states of each agent interacting with the robot. It is limited here to a first-order ToM: the robot has its own knowledge and a representation of the other agents and their knowledge, but these other agents do not themselves have representations of other agents. In order to do that, we use an agent r ∈ A which represents our robot. As defined previously, r contains a representation of the other agents AG_r. However, the agents in AG_r do not contain further agent representations (∀ag ∈ AG_r, AG_ag = ∅). We will present here how the robot estimates and maintains the mental state of an agent ag ∈ AG_r, but these processes are the same for all agents in AG_r and for the mental state of the robot r itself.

A. World state

The representation of the world state by an agent is composed of two types of facts:

• Observable facts: these facts are computed and maintained by the Situation Assessment module. They concern what the agent can observe about the world state. These facts represent the affordances of all agents (e.g. isVisibleBy, isReachableBy) and the relations between objects (e.g. isOn, isIn) visible to them.

• Non-observable facts: these cannot be computed by the Situation Assessment module. They concern information that the agent cannot observe (e.g. the fact that an object is inside a closed box). There are two ways for an agent to become aware of a non-observable fact: it can be informed by another agent, or it can perform or see an action that has this fact among its side effects (when the robot estimates that an agent considers an action DONE, it considers that the agent is aware of all the effects of the action).
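Below is a minimal sketch of the second mechanism, under the data-structure assumptions used in the earlier sketches: when the robot estimates that an agent considers an action DONE, the action's effects, including non-observable ones, are added to that agent's estimated world state.

```python
from typing import Dict, Set, Tuple

Fact = Tuple[str, str, str]

def mark_action_done(action_id: int,
                     effects: Set[Fact],                  # EFF_ac of the action
                     agent_ws: Set[Fact],                 # WS_ag: estimated world state
                     agent_acs: Dict[int, str]) -> None:  # ACS_ag: action states
    """The agent considers the action DONE, so it is assumed to be aware of
    all of its effects, including the non-observable ones."""
    agent_acs[action_id] = "DONE"
    agent_ws |= effects

# Example: the human saw the robot finish sweeping; the non-observable fact
# that the table has been swept now belongs to his estimated world state.
human_ws: Set[Fact] = set()
human_acs: Dict[int, str] = {4: "PROGRESS"}
mark_action_done(4, {("table", "isSwept", "true")}, human_ws, human_acs)
```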

B. Goals

The evolution of the state of a goal g ∈ G in the mental state of an agent ag is described in Fig. 2(a). As said before, we consider that the agent has already committed to the goal. Consequently, when the robot starts to execute a goal (g.start()), the goal is considered in progress. The agent considers a goal achieved when all the facts belonging to the objective of the goal are known to it (according to its knowledge, the desired world state has been reached). As we do not focus on issues relative to shared intention, we consider that the agent will not abort the goal on his own. The robot will abort the goal when no more plans can be found to achieve it (g.no_plan()). Finally, the agent can be informed about the result of a goal (ag.isInformed(g)).
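For instance, the "goal achieved" transition of Fig. 2(a) amounts to checking that the objective O_g is included in the agent's estimated world state WS_ag. A hypothetical sketch, reusing the simple dict and set representations assumed earlier:

```python
from typing import Dict, Set, Tuple

Fact = Tuple[str, str, str]

def update_goal_state(goal_name: str,
                      objective: Set[Fact],               # O_g
                      agent_ws: Set[Fact],                # WS_ag
                      agent_gs: Dict[str, str]) -> None:  # GS_ag
    """The agent considers the goal DONE once, according to its own
    knowledge, the desired world state has been reached."""
    if agent_gs.get(goal_name) == "PROGRESS" and objective <= agent_ws:
        agent_gs[goal_name] = "DONE"
```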

C. Plans

The evolution of the state of a plan p ∈ P in the mental state of an agent ag is described in Fig. 2(b). Each time the robot computes a plan (p.isComputed()) to achieve a goal, it shares it with the other agents (r.share(p, ag)). Accordingly, if the agent is present, it considers the plan in progress. The agent considers a plan achieved when it considers every action included in it performed. If, at any time, the agent considers that there are still actions from the plan to be performed but no action is in progress and no pending action is possible, the agent considers that the plan is aborted (ag.no_action(p), Alg. 1). Finally, the agent can be informed about the result of a plan (ag.isInformed(p)).

Algorithm 1 ag.no_action(p)

if (∄ ⟨id, state⟩ ∈ ACS_ag | state = PROGRESS ∨ state = READY)
 ∧ (∃ ⟨id2, state2⟩ ∈ ACS_ag | state2 = PLANNED ∨ state2 = NEEDED) then
    ⟨id_p, ABORTED⟩ ∈ PS_ag

end if
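Read as code, Algorithm 1 can be sketched as below, with ACS_ag and PS_ag rendered as dictionaries (an illustrative assumption consistent with the earlier sketches):

```python
from typing import Dict

def no_action(plan_id: int,
              agent_acs: Dict[int, str],          # ACS_ag: id_ac -> state
              agent_ps: Dict[int, str]) -> None:  # PS_ag:  id_p  -> state
    """Abort the plan in the agent's mental state when no action is running or
    executable while actions of the plan still remain to be done (Algorithm 1)."""
    states = list(agent_acs.values())
    nothing_running = not any(s in ("PROGRESS", "READY") for s in states)
    something_left = any(s in ("PLANNED", "NEEDED") for s in states)
    if nothing_running and something_left:
        agent_ps[plan_id] = "ABORTED"
```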

D. Actions

The evolution of the state of an action ac ∈ ACT belonging to a plan p ∈ P in the mental state of an agent ag is described in Fig. 2(c). When a new plan is shared by the robot (r.share(p, ag)), all the actions in this plan are considered planned by the agent. The agent considers an action READY if all previous actions in the plan (based on the plan links) are considered done by the agent and if the agent considers all the preconditions of the action true. If the agent does not consider all the preconditions of the action true, it considers the action NEEDED. When an action is executed, from an agent's perspective, if it performs it (ag.perform(ac)) or observes another agent performing it (ag.see(ac)), it knows that the action is in progress. In the same way, we consider that, when an action is over, if the agent performed it (ag.perform(ac)) or observed the end of its execution (ag.see_end(ac)), it knows the result of the action. When the agent is informed that an action has been done, it also infers the effects of the action. An agent can also infer that an action has been done if it knows that the action was in progress or about to be done and it can see the effects of the action, or if it knows that the action was in progress, can see the actors of the action (ag.see_actor(ac)) and they are not performing the action any more (!AGC_ac.perform(ac)). An agent can also be asked (or ask somebody else) to perform an action (ask(ac)) and be informed about the result of an action (ag.isInformed(ac)).
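The PLANNED / NEEDED / READY part of this evolution can be sketched as follows, under the same illustrative assumptions (only the plan links and the preconditions are checked):

```python
from typing import Dict, List, Set, Tuple

Fact = Tuple[str, str, str]

def update_pending_action(action_id: int,
                          previous_ids: List[int],             # actions linked before it in L_p
                          preconditions: Set[Fact],            # PRE_ac
                          agent_ws: Set[Fact],                 # WS_ag
                          agent_acs: Dict[int, str]) -> None:  # ACS_ag
    """A pending action is READY when all previous actions are DONE for the
    agent and its preconditions hold in the agent's world state; it is only
    NEEDED when the ordering is satisfied but some precondition is not."""
    if agent_acs.get(action_id) not in ("PLANNED", "NEEDED", "READY"):
        return  # already executed, failed or unknown: nothing to update
    if not all(agent_acs.get(pid) == "DONE" for pid in previous_ids):
        agent_acs[action_id] = "PLANNED"
    elif preconditions <= agent_ws:
        agent_acs[action_id] = "READY"
    else:
        agent_acs[action_id] = "NEEDED"
```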

VIII. THE SUPERVISOR

The supervisor manages shared plan execution. To do so, it makes the robot execute its actions when it considers they are needed and possible (action READY in the robot's knowledge) and, in parallel, the robot monitors the humans' activities in order to detect their actions. If the supervisor estimates that the current plan is not feasible any more (plan ABORTED in its knowledge), it tries to compute a new plan. If it succeeds, it shares the plan and starts to execute it. If it fails, it aborts the goal. Thus, when a human performs an unexpected action or when an action fails, the supervisor is able to quickly produce a new plan and adapt to the new situation.
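A high-level sketch of this execution loop is given below. All helper methods (monitor_human_actions, compute_plan, share, execute, manage_divergent_beliefs, abort) are hypothetical placeholders standing for the corresponding architecture modules; the real supervisor [21] is considerably richer.

```python
def supervise(robot, goal, plan):
    """Illustrative shared-plan execution loop: act when an action is READY
    for the robot, monitor the humans, and replan when the plan is ABORTED."""
    while robot.mental_state.goal_states.get(goal.name) == "PROGRESS":
        robot.monitor_human_actions()                  # detect human Pick/Place from perception
        if robot.mental_state.plan_states.get(plan.id) == "ABORTED":
            new_plan = robot.compute_plan(goal)        # call to the high-level task planner
            if new_plan is None:
                robot.abort(goal)                      # no alternative plan: abort and inform
                break
            plan = new_plan
            robot.share(plan)                          # verbalize or display the new plan
            continue
        for action in plan.actions:
            if (robot.name in action.actors and
                    robot.mental_state.action_states.get(action.id) == "READY"):
                robot.execute(action)
        robot.manage_divergent_beliefs(plan)           # see the subsections below
```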

In this paper, we focus on the activity of the supervisor which manages divergent beliefs during the execution of a shared plan. Indeed, when two humans share a plan, they usually do not communicate all along the plan execution: only the meshing subplans of the plan need to be shared [36]. Consequently, the robot should inform humans about elements of the shared plan only when it considers that a divergent belief might have an impact on the joint activity, in order not to be intrusive by giving them information which they do not need or which they can observe or infer by themselves.

A. Weak achievement goal

If we follow the definition of weak achievement goal in [34], if the robot knows that the current goal has been achieved or is not possible any more, it has to inform its partners. Accordingly, we consider that, when, in the robot's knowledge, the state of a goal is DONE (resp. ABORTED) and the robot estimates that a human does not consider it DONE (resp. ABORTED), the robot informs him about the achievement (resp. abandonment) of the goal (if the agent is not there or is busy with something else, the robot will do it as soon as the agent is available). We extend this reasoning to plans: the robot informs in the same way about the achievement (resp. abandonment) of a plan.
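A sketch of this communication rule under the same assumptions (inform, is_available and defer_information are hypothetical placeholders for the dialogue manager and the supervisor's bookkeeping):

```python
def inform_goal_result(robot, human, goal_name: str) -> None:
    """If the robot holds the goal DONE or ABORTED and estimates that the
    human does not share that belief, it informs him as soon as possible."""
    robot_view = robot.mental_state.goal_states.get(goal_name)
    human_view = human.mental_state.goal_states.get(goal_name)
    if robot_view in ("DONE", "ABORTED") and human_view != robot_view:
        if robot.is_available(human):                            # present and not busy
            robot.inform(human, ("goal", goal_name, robot_view))
            human.mental_state.goal_states[goal_name] = robot_view  # update the estimation
        else:
            robot.defer_information(human, ("goal", goal_name, robot_view))
```

The same rule is applied to plans, with PS_ag in place of GS_ag.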

B. Before human action

Fig. 3: Initial set-up of the Clean the table scenario. (a) Initial world state. (b) Computed shared plan to solve the goal. The human and the robot have to clean the table together. The computed shared plan is to remove the three books lying on the table, sweep it and place back the books.

A divergent belief of a human partner can be an issue when it is related to an action that he has to do. To avoid the human missing information needed to execute his part of the plan, each time the robot estimates that a human has to do an action (an action with the state READY in the robot's knowledge and with a human among its actors), it checks whether the human is aware that he has to and can perform the action (the state of the same action should also be READY in the estimation of the human's knowledge). If it is not the case, there are three possibilities (a sketch of this check follows the list):

• The state of the current plan is not PROGRESS in the estimation of the agent's knowledge: the human misses information about the current plan, so the robot shares it with him.

• The state of the action is PLANNED in the estimation of the human's knowledge: the human misses information about previously achieved actions and so does not know that his action has to be performed now according to the plan. The robot checks the state of all actions linked to this one by the plan links and informs the human about the achievement of all those whose state is not DONE in the estimation of the human's knowledge.

• The state of the action is NEEDED in the estimation of the human's knowledge: the human misses information about the world state and so does not know that his action is possible. In such a case, the robot looks into the preconditions of the action and informs the human about all those the human is not aware of.
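A sketch of this decision, using the structures assumed in §V and §VI (share_plan, inform and plan.predecessors are illustrative helpers, standing respectively for the dialogue manager and a lookup of the plan links):

```python
def check_before_human_action(robot, human, plan, action) -> None:
    """The action is READY in the robot's knowledge and assigned to the human:
    make sure the human knows that he has to, and can, perform it."""
    human_view = human.mental_state.action_states.get(action.id)
    if human_view == "READY":
        return                                        # no divergent belief to correct
    if human.mental_state.plan_states.get(plan.id) != "PROGRESS":
        robot.share_plan(human, plan)                 # the human misses the current plan
    elif human_view == "PLANNED":
        # The human missed previously achieved actions: inform him about those
        # he does not already consider DONE among the actions linked to his own.
        for prev_id in plan.predecessors(action.id):
            if human.mental_state.action_states.get(prev_id) != "DONE":
                robot.inform(human, ("action", prev_id, "DONE"))
    elif human_view == "NEEDED":
        # The human misses world-state facts: inform him about the
        # preconditions he is not aware of.
        for fact in action.preconditions:
            if fact not in human.mental_state.world_state:
                robot.inform(human, ("fact", fact))
```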

C. Preventing mistakes

A divergent belief of a human partner can also be an issue if it leads him to perform an action that is not planned or that should not be done now according to the plan. To prevent this, for each action that the robot estimates the human thinks READY, the robot checks whether the action really needs to be done (action READY in its own knowledge too). If it is not the case, the robot corrects the human's divergent belief in two different ways (sketched after the list):

• If the action is PLANNED in the robot's knowledge: the human thinks that a previous action has been achieved successfully while it is not the case, leading him to think he has to perform his action. The robot looks at all actions linked to this one by the plan links and informs about their state whenever it differs between the estimation of the human's knowledge and the robot's own.

• If the action is NEEDED in the robot's knowledge: the human has a divergent belief concerning the world state that leads him to think that his action is possible while it is not the case. The robot looks into the preconditions of the action and informs about divergent beliefs.
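Symmetrically, the mistake-prevention check can be sketched as follows (same illustrative helpers as above):

```python
def prevent_mistake(robot, human, plan, action) -> None:
    """The human believes the action is READY; correct him if the robot does not."""
    robot_view = robot.mental_state.action_states.get(action.id)
    if robot_view == "READY":
        return                                        # the human's belief is correct
    if robot_view == "PLANNED":
        # The human wrongly believes a previous action succeeded: report every
        # linked action whose state differs between the two knowledge bases.
        for prev_id in plan.predecessors(action.id):
            robot_state = robot.mental_state.action_states.get(prev_id)
            if human.mental_state.action_states.get(prev_id) != robot_state:
                robot.inform(human, ("action", prev_id, robot_state))
    elif robot_view == "NEEDED":
        # The human wrongly believes the preconditions hold: correct the
        # divergent world-state beliefs.
        for fact in action.preconditions:
            holds = fact in robot.mental_state.world_state
            if (fact in human.mental_state.world_state) != holds:
                robot.inform(human, ("fact", fact, holds))
```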

D. Signal robot actions

When the robot has to perform an action, it checks whether it estimates that the humans are aware that it will act (the action should be READY in the humans' knowledge). If it is not the case, the robot signals its action before performing it.

E. Inaction and uncertainty

Even if the robot estimates that the human is aware that he has to act (the state of the action which he must perform is READY in the estimation of his knowledge), it is possible that the human still does not perform this action. If the human is already busy with something else (there is an action in the robot's knowledge with the state PROGRESS and with the human among its actors), the robot waits for the human to be available. If the human is not considered busy by the robot, the robot first considers that its estimation of the human's mental state can be wrong, and that, in reality, the human is not aware that he should act. Consequently, the robot explicitly asks the human to do the action. If the human still does not act once the action has been asked, the robot considers the action failed, aborts the current plan and tries to find an alternative plan excluding that action.

In order to avoid considering that the human is available, and thus disturbing him while he is busy doing something that the robot cannot recognize, we have added an action named busy, used when the robot estimates that the human is doing something without knowing what. The action busy, when executed by a human h, can be defined as ⟨id, busy, {h}, ∅, ∅, ∅⟩.
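A sketch of this escalation, under the same assumptions (estimates_busy stands for the check that a busy or any other action involving the human is PROGRESS in the robot's knowledge; ask is a hypothetical dialogue call):

```python
def handle_human_inaction(robot, human, plan, action) -> None:
    """The human is believed to know his action is READY, yet he does not act."""
    if robot.estimates_busy(human):
        return                                        # wait until the human is available
    if robot.mental_state.action_states.get(action.id) != "ASKED":
        # The estimation of the human's mental state may be wrong:
        # ask him explicitly to perform the action.
        robot.ask(human, action)
        robot.mental_state.action_states[action.id] = "ASKED"
    else:
        # Already asked and still no action: consider the action FAILED and
        # abort the plan so that the supervisor replans without it.
        robot.mental_state.action_states[action.id] = "FAILED"
        robot.mental_state.plan_states[plan.id] = "ABORTED"
```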

IX. EVALUATION

A. Scenarios

1) "Clean the table" scenario: In this example, a PR2 robot and a human have to clean a table together. To do so, they need to remove all items from this table, sweep it, and put back all previous items. The initial world state is the one in Fig. 3(a). We consider that the grey book is reachable only by the robot, the blue book only by the human and the white book by both agents. The actions considered during this task are pick-and-place and sweep. To pick and place an object on a support, the object and the support need to be reachable by the agent, and, to sweep a surface, it should not have any objects on it and it should be reachable by the agent. The initial plan produced to achieve the goal is shown in Fig. 3(b)³.

Fig. 4: Initial set-up for the Inventory scenario. The coloured objects need to be scanned by the robot and then put into a box of the same colour.

2) "Inventory" scenario: In this example, a human and a PR2 robot have to make an inventory together. At the beginning of the task, both agents have coloured objects near them as well as a coloured box (initial world state in Fig. 4). These coloured objects need to be scanned and then stored in the box of the same colour. The actions considered during this task are placeReachable, pickanddrop and scan. We call placeReachable the action of picking an object and placing it such that it is reachable by the other agent. For an agent to perform this action, the object needs to be reachable by it. We call pickanddrop the action of picking an object and dropping it into a box. To perform such an action, the object and the box need to be reachable by the agent. The scan action can only be performed by the robot and consists of scanning an object by orienting the head of the robot (assumed to be equipped with a scanner) in the direction of the object, placed such that it is reachable.

B. Criteria and results

One objective of our contribution is to reduce unnecessary communication acts from the robot during the execution of a shared plan, aiming at a friendlier and less intrusive behaviour of the robot. Consequently, in order to evaluate our system, we have chosen to measure the amount of information shared by the robot during a shared plan execution where the human misses some information. As it is not trivial to create and control a lack of knowledge in a human subject, we decided to evaluate our system in simulation. We ran experiments in the two scenarios described previously, where a human and a robot have to perform a joint task together⁴. When the interaction starts, we consider that the joint goal is already established and that both human and robot have already agreed on a shared plan. The robot executes the plan as described in the presented work and the simulated human executes the actions planned for him. We randomly sample a time when the human leaves the scene and another time when the human comes back. While absent, the human does not execute actions and cannot see anything nor communicate.

³This example has been fully implemented on a real robot. A detailed execution can be found in the attached video.

⁴More details about these two scenarios can be found in the attached video.

During the interaction, we logged the number of facts (information chunks) given by the robot to the human. A piece of information concerns either a change in the environment, the state of a previous action, the abortion of a previous plan or the sharing of a plan. We compared our system (called ToM system) to:

• a system which verbalizes any plan abortion, shares any new plan and informs about each action missed by the human (called Missed system).

• a system which verbalizes any plan abortion, shares any new plan and informs about each action performed by the robot even if the human sees it (called Performed system).

The obtained results are given in Table I.

Scenario      Clean the table         Inventory
System        Average    Std Dev      Average    Std Dev
ToM           1.32       0.98         0.41       0.48
Missed        2.86       1.33         2.61       1.36
Performed     4.44       1.85         10.0       0.0

TABLE I: Amount of information given by the robot during the two presented scenarios for the three systems (ToM, Missed and Performed).

We can see that our system significantly reduces the amount of information given by the robot. In the "Clean the table" scenario, depending on when the human leaves, the robot might change the initial plan and take care of the book reachable by both agents instead of the human. This explains why the average number for the Performed system is higher than the number of actions initially planned for the robot: the robot performs more actions in the new plan and can communicate about the new plan and the abortion of the previous one. In this scenario, our system communicates about the plan abortion only if the human cannot infer it by himself and does not communicate about missed pick-and-place actions, as the human can infer them by looking at the objects' placements. However, the robot will inform the human if he missed the fact that the robot has swept the table, as it is not observable and it is information the human needs before he can put objects back on the table.

In the Inventory scenario, as all objects and boxes are reachable by only one agent, the robot does not change the plan when the human leaves. This explains why the standard deviation is null for the Performed system: the number of actions performed by the robot never changes and there is no change in the plan. In this scenario, the pickanddrop and scan actions have non-observable effects (the human cannot see an object inside a box). However, we can see that our system still verbalizes less information than the Missed system: the robot communicates only the information which the human really needs (such as the fact that an object the human should drop in a box has been scanned) and does not give information which is not linked to the human's part of the plan (such as the fact that the robot scanned an object it has to drop into its own box).

X. CONCLUSION

In this paper, we have presented a system that estimates and maintains mental states of other agents concerning not only the environment but also the state of goals, plans and actions in the context of human-robot shared plan execution. This system takes these mental states into account when executing human-robot shared plans, allowing the robot to manage human-robot joint activities while taking into account the human's perspective and his knowledge about the task. We have shown that this system reduces the amount of unnecessary information given to the human while still giving the information needed to ensure proper task achievement.

The novelty of this work is twofold: the estimated mental states concern not only observable information about the environment but also the state of current and previous goals, plans and actions, and these mental states are taken into account during the execution of a shared plan in order to reduce unnecessary communication and produce a less intrusive behaviour of the robot. Moreover, this work has been implemented and run on a complete human-aware architecture, enabling the robot to fluently perform joint tasks with a human.

We have presented here one relevant use of the computed mental states. However, we strongly believe that they can also be used to tackle other challenges during human-robot joint actions. For example, another use could be to estimate a possible lack of information on the robot's side in order to allow it to ask for help or for information when needed. Another aspect, which we plan to explore in the future, is to reason about such mental states to better understand "unexpected" human behaviours.

Acknowledgments: This work has been funded by the French Agence Nationale de la Recherche, ROBOERGOSUM project ANR-12-CORD-0030.

REFERENCES

[1] B. J. Grosz and C. L. Sidner, "Plans for discourse," tech. rep., DTIC Document, 1988.
[2] M. Cirillo, L. Karlsson, and A. Saffiotti, "Human-aware task planning: an application to mobile robots," Intelligent Systems and Technology, vol. 1, no. 2, p. 15, 2010.
[3] J. Guitton, M. Warnier, and R. Alami, "Belief management for HRI planning," European Conf. on Artificial Intelligence, p. 27, 2012.
[4] S. Baron-Cohen, A. M. Leslie, and U. Frith, "Does the autistic child have a theory of mind?," Cognition, vol. 21, no. 1, pp. 37–46, 1985.
[5] D. Premack and G. Woodruff, "Does the chimpanzee have a theory of mind?," Behavioral and Brain Sciences, vol. 1, no. 04, pp. 515–526, 1978.
[6] J. G. Trafton, N. L. Cassimatis, M. D. Bugajska, D. P. Brock, F. E. Mintz, and A. C. Schultz, "Enabling effective human-robot interaction using perspective-taking in robots," Systems, Man and Cybernetics, vol. 35, no. 4, pp. 460–470, 2005.
[7] M. Berlin, J. Gray, A. L. Thomaz, and C. Breazeal, "Perspective taking: An organizing principle for learning in human-robot interaction," in Nat. Conf. on Artificial Intelligence, vol. 21, AAAI Press, MIT Press, 2006.
[8] G. Milliez, M. Warnier, A. Clodic, and R. Alami, "A framework for endowing an interactive robot with reasoning capabilities about perspective-taking and belief management," in Int. Symp. on Robot and Human Interactive Communication, pp. 1103–1109, IEEE, 2014.
[9] B. Scassellati, "Theory of mind for a humanoid robot," Autonomous Robots, vol. 12, no. 1, pp. 13–24, 2002.
[10] L. M. Hiatt and J. G. Trafton, "A cognitive model of theory of mind," in 10th Int. Conf. on Cognitive Modeling, pp. 91–96, Citeseer, 2010.
[11] H. Wimmer and J. Perner, "Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception," Cognition, vol. 13, no. 1, pp. 103–128, 1983.
[12] L. M. Hiatt, A. M. Harrison, and J. G. Trafton, "Accommodating human variability in human-robot teams through theory of mind," in Int. Joint Conf. on Artificial Intelligence, vol. 22, p. 2066, 2011.
[13] C. Breazeal, M. Berlin, A. Brooks, J. Gray, and A. L. Thomaz, "Using perspective taking to learn from ambiguous demonstrations," Robotics and Autonomous Systems, vol. 54, no. 5, 2006.
[14] J. Gray and C. Breazeal, "Manipulating mental states through physical action," Int. Journal of Social Robotics, vol. 6, no. 3, 2014.
[15] R. Ros, S. Lemaignan, E. A. Sisbot, R. Alami, J. Steinwender, K. Hamann, and F. Warneken, "Which one? Grounding the referent based on efficient human-robot interaction," in Int. Symp. on Robot and Human Interactive Communication, pp. 570–575, IEEE, 2010.
[16] J. Gray, C. Breazeal, M. Berlin, A. Brooks, and J. Lieberman, "Action parsing and goal inference using self as simulator," in Int. Workshop on Robot and Human Interactive Communication, IEEE, 2005.
[17] Y. Demiris and B. Khadhouri, "Hierarchical attentive multiple models for execution and recognition of actions," Robotics and Autonomous Systems, vol. 54, no. 5, pp. 361–369, 2006.
[18] J. Mainprice, E. A. Sisbot, L. Jaillet, J. Cortes, R. Alami, and T. Simeon, "Planning human-aware motions using a sampling-based costmap planner," in Int. Conf. on Robotics and Automation, IEEE, 2011.
[19] A. D. Dragan, S. Bauman, J. Forlizzi, and S. S. Srinivasa, "Effects of robot motion on human-robot collaboration," in Int. Conf. on Human-Robot Interaction, pp. 51–58, ACM/IEEE, 2015.
[20] A. Clodic, H. Cao, S. Alili, V. Montreuil, R. Alami, and R. Chatila, "SHARY: a supervision system adapted to human-robot interaction," in Experimental Robotics, pp. 229–238, Springer, 2009.
[21] M. Fiore, A. Clodic, and R. Alami, "On planning and task achievement modalities for human-robot collaboration," in Int. Symposium on Experimental Robotics, 2014.
[22] T. Fong, C. Kunz, L. M. Hiatt, and M. Bugajska, "The human-robot interaction operating system," in Int. Conf. on Human-Robot Interaction, pp. 41–48, ACM/IEEE, 2006.
[23] J. Shah, J. Wiken, B. Williams, and C. Breazeal, "Improved human-robot team performance using Chaski, a human-inspired plan execution system," in Int. Conf. on Human-Robot Interaction, HRI'11, ACM, 2011.
[24] B. J. Grosz and S. Kraus, "Collaborative plans for complex group action," Artificial Intelligence, vol. 86, no. 2, pp. 269–357, 1996.
[25] P. R. Cohen and H. J. Levesque, "Intention is choice with commitment," Artificial Intelligence, vol. 42, no. 2, pp. 213–261, 1990.
[26] A. S. Rao and M. P. Georgeff, "Modeling rational agents within a BDI-architecture," Readings in Agents, vol. 91, pp. 473–484, 1991.
[27] M. J. Wooldridge, Reasoning about Rational Agents. MIT Press, 2000.
[28] N. R. Jennings, "Controlling cooperative problem solving in industrial multi-agent systems using joint intentions," Artificial Intelligence, vol. 75, no. 2, pp. 195–240, 1995.
[29] M. Tambe, "Towards flexible teamwork," Journal of Artificial Intelligence Research, pp. 83–124, 1997.
[30] R. Lallement, L. de Silva, and R. Alami, "HATP: an HTN planner for robotics," CoRR, vol. abs/1405.5345, 2014.
[31] E. A. Sisbot and R. Alami, "A human-aware manipulation planner," IEEE Transactions on Robotics, vol. 28, no. 5, pp. 1045–1057, 2012.
[32] E. Ferreira, G. Milliez, F. Lefevre, and R. Alami, "Users belief awareness in reinforcement learning-based situated human-robot dialogue management," in IWSDS, 2015.
[33] G. Milliez, R. Lallement, M. Fiore, and R. Alami, "Using human knowledge awareness to adapt collaborative plan generation, explanation and monitoring," in ACM/IEEE International Conference on Human-Robot Interaction, HRI'16, New Zealand, March 7-10, 2016.
[34] P. R. Cohen and H. J. Levesque, "Teamwork," Noûs, pp. 487–512, 1991.
[35] R. Verbrugge and L. Mol, "Learning to apply theory of mind," Journal of Logic, Language and Information, vol. 17, no. 4, pp. 489–511, 2008.
[36] M. E. Bratman, "Shared intention," Ethics, pp. 97–113, 1993.