PLANNING-BASED CONTROLLERS FOR INCREASED LEVELS …robotics.estec.esa.int/ASTRA/Astra2013/Papers/Fratini_2811286.pdfSimone Fratini, Sebastian Martin, Nicola Policella, and Alessandro

PLANNING-BASED CONTROLLERS FOR INCREASED LEVELS OF AUTONOMOUSOPERATIONS

Simone Fratini, Sebastian Martin, Nicola Policella, and Alessandro Donati

European Space Agency, ESA-ESOCRobert-Bosch-Strasse 5, 64293, Darmstadt, Germany

[email protected]

ABSTRACT

Upcoming missions will require a higher degree of re-mote operations to increase quality and quantity of sci-ence return. Remote operations are certainly a challeng-ing scenario, mainly because communication delays anderrors lead to objective difficulties for the operator topromptly enable opportunistic science, activate contin-gency recovery procedures or to anticipate and react totracked events. AI planning-based control layers havedemonstrated to be able to entail autonomy, but modelingstill constitute a bottleneck for the use of these technolo-gies. In this paper we discuss how to design a controllerbased on the ESA GOAC platform to robustify the line op-erator - communication layer - remote system in the con-text of remote operations, with particular attention on dis-cussing a general methodology to design efficient mod-els for the AI planning engine. The approach presentedhas been tested on a real system made of a communica-tion network, to simulate the communication issues, anda rover, as controlled device. The communication systemand rover used are part of the ESA METERON infras-tructure (Multi-Purpose End-To-End Robotic OperationsNetwork).

Key words: Control, Autonomy, AI, Teleoperations.

1. INTRODUCTION

Upcoming missions either targeting earth observation,planets or deep space, will require a higher degree of on-board autonomy operations to increase quality and quan-tity science return, to minimize close-loop space-grounddecision making, and to enable new operational scenar-ios. Autonomy operations will induce a new distributionof processes and functions between ground and space;on-ground routine operations workload will be reducedand refocused on supervisory role while operator inter-vention will be required only for non-nominal situations.

In this context remote operations are certainly an interest-ing scenario where autonomy can provide added value.In fact, communications, in remote operations, are af-fected by non-continuous communication and by rele-vant radio communication propagation delay and trans-mission errors, with subsequent difficulties for the op-

erator to promptly enable opportunistic science, activatecontingency recovery procedures or to anticipate and re-act to tracked events.

In this paper we propose the use of a domain indepen-dent, model-based control layer between the operator andthe system to be operated, in order to robustify the lineoperator - communication layer - remote system in twoways: on one side the controller can be located at the op-erator side (see Figure 1, left), providing a unified viewof the communication layer plus remote system that al-lows the operator to reduce the impact of the delay anderrors introduced by the communication channel. On theother side the controller can be located at the remote sys-tem side, rising the abstraction level of the platform com-manding and providing to the operator the view of a morerobust autonomous, goal oriented system, that reducesthe amount of controls to be transmitted and reduce theneed of fine grained synchronization of commands for theremote system (Figure 1, right).

Figure 1. Conceptual Architecture

The AI controller has been implemented by means ofan ESA platform deployed for the “Goal Oriented Au-tonomous Controller” (GOAC [2]) initiative. The GOACplatform entails rapid prototyping of model-based con-trollers for complex physical systems exploiting AI plan-ning technologies. The platform takes in input a modelof the system to be controlled and a set of user goal tobe (hopefully) achieved by the controlled system. Thecontroller plans for user goals and supervises plan execu-tion by ingesting the telemetry of the controlled system.The controller is based on a domain independent planner,entailing the reusability of the same technology in differ-

ent domains (by changing only the model). As describedin the following sections, we have designed two GOAC-based controllers, one to be used upstream of the com-munication channel and one to be used downstream. Thetechnology is the same, but the models and the roles ofthe model change. Besides all the obvious advantages ofusing model-based technologies (the high level of codereusabilty above all), there is usually a common prob-lem: the difficulty of modeling. This paper aims also atdiscussing this problem from a methodological point ofview. In this sense the objective of this paper with respectto modeling, is not to present complex models executablewith the GOAC platform, but to discusshow to build amodel to cope with some common problems in remoteoperations.

The proposed approach has been tested on a real sys-tem made of a communication network, to simulate thecommunication issues, and a rover, as controlled device.The communication system and rover used are part of theMETERON infrastructure (Multi-Purpose End-To-EndRobotic Operations Network), one of the projects estab-lished within ESA to test networking protocols and op-erations for future human exploration scenarios to moon,mars and other celestial bodies[1]. It contains a sophisti-cated simulator and operational network for in-orbit test-ing and validation of novel communication techniquesunder consideration for future human exploration, suchas Disruption Tolerant Network (DTN)[3] concepts andtechnologies (Communications).

To generate more meaningful telemetry and telecom-mand streams, a rover called MOCUP is part of this net-work. Via its user interface software named MOPS, itallows simple operations such as waypoint navigation,execution of command stacks, capturing, transmissionand display of pictures as well as transmission of sta-tus telemetry. This system was used successfully forthe METERON OPSCOM-1 experiment of October 23,2012, where an astronaut onboard the ISS controlled theMOCUP rover on ground in ESOC, Germany over theDTN network. With re-using this infrastructure and hard-ware, the AI controller layer proposed in this paper hasalready been validated in a realistic environment.

2. THE AI CONTROL LAYER

The AI control layer is an instantiation of the GOACplatform. The GOAC platform is the result of the ESAGoal Oriented Autonomous Controller (GOAC) initia-tive, aimed at defining a new generation of softwareautonomous controllers to support increasing levels ofautonomy for robotic task achievement. In particular,goal of the GOAC architecture is to generate on-boardplans, dispatch activities for execution, and recover fromoff-nominal conditions. GOAC has been designed as aprincipled integration of different software solutions (see[2, 6]) to be agoal-oriented system embedding auto-mated model-based planning and execution. GOAC en-capsulates the long-standing notion of asense-deliberate-act cycle, where sensing, planning and execution are in-terleaved. The architecture consists of a set ofreactors,each of them encapsulating an independent control loop.

There is a well-defined messaging protocol to synchro-nize the independent control loops: exchangingfacts andgoals between reactors. Each reactor ingestobservationsof the current state either from the platform or other reac-tors, and sendgoals to be accomplished (to the platformor to the other reactors). The whole control loop is thensplit into subproblems, following a “divide-and-conquer”approach to complexity that simplifies the scalability andincreases the flexibility of the architecture (see [8] for acomprehensive discussion of the principles behind the ex-ecutive engine of GOAC).

The number and hierarchy of reactors to be instantiatedis a problem-specific design decision. Goals follow atop-down flow, from an abstract mission level reactor(close to the user), to more specific reactors, down toa “command-dispatcher” reactor (close to the platform),which translates goals into requests to the functional layerof the controlled system. Observations flow in a bottom-up direction. At the bottom of the hierarchy, the com-mand dispatcher reactor synthesizes observations fromreplies sent by the functional layer, and forwards theseto reactors placed at higher levels of abstraction, whichin turn generate more abstract observations to state thelevel of accomplishment of the received goals. Hence,the goals in input to the highest level reactor are the usergoals for the controller, the goals sent out by the low-est level reactor (the command dispatcher) are commandsfor the platform. Vice-versa, the observations in inputto the command dispatcher constitute the telemetry fromthe platform, while observations sent out by the highestlevel reactor constitute the status of accomplishment ofthe user’s goals (see Figure 2).

Figure 2. Reactors

The flow of observations and goals among the reactorsimplement, respectively, the sense and the act steps of theloop. The planning step is a model-based process basedon timeline planning. Each reactor embeds a timeline-based planner with its own model. The model comprisesa set of state variables, and describes the allowed transi-tions as well as a set of synchronization rules. Synchro-nization rules describe causal and temporal relationshipsbetween state variables (see for instance [7, 4, 5] abouttimeline-based modeling and planning). The advantage

of using a domain-independent planning technology is athand: different reactors differ only on the model, whilethe software is reused across the different reactors. Theonly exception is the command dispatcher reactor: thisreactor does not do planning but conversion of timelinesinto commands for the functional layer and the other wayaround (telemetry from the platform into timelines). Thereactors that actually implement a planning loop are de-fineddeliberative.

Hence the tailoring of the GOAC platform consists in: (1)choosing the number and hierarchy of the reactors for thespecific problem; (2) design the model for each delibera-tive reactor and (3) implement a command dispatcher forthe connection between the platform and the functionallayer of the system to be controlled. The number andhierarchy of the reactors depends on the level of auton-omy that the controller has to provide to the platform andthe complexity of the model. A generic controller basedon the GOAC platform aimed at providing autonomy re-quires at least one deliberative reactor (for planning andre-planning) in cascade with a command dispatcher (toconnect the planner to the platform).

The need of managing goals at different levels of abstrac-tion and different temporal scopes requires to increasethe depth of the reactors graph. Two reactors in cascadefor instance allows an HTN approach dividing the modelof abstract, flexible high level goals (to be refined intolower-levels goals and planned/optimized over a widetemporal horizon) from the modeling of more concrete,better specified (in time and value) lower-levels goals (tobe planned and monitored in a smaller temporal horizon).The model for high level goals is more complex fromthe planning point of view but leads to more stable plan(which does not need to be re-planned frequently) whilethe model for low-level goals is simpler and allows fastre-planning for more frequent failures. Hence to increasethe depth of the reactors graph entails the possibility ofdesigning models with different levels of abstraction.

The need for modeling sub-problems with different func-tional scopes requires to increase the width of the reactorsgraph. Two reactors in parallel allows modeling for mul-tiple planning and execution monitoring processes to becarried on in parallel. Let’s take for instance the exampleof complex platform payloads. Each payload needs to beoperated in parallel with occasional logical and tempo-ral synchronizations. A model for managing all of themwould become quickly very complex, with no real needof modeling all the logic in the same model. Multiplereactors in parallel simplify drastically the model and al-lows synchronizations based on the post goal/receive ob-servation interaction provided natively by the platform.Hence to increase the width of the reactors graph entailsthe possibility of designing models with different func-tional scopes.

For our purposes we have designed 4 reactors and wehave used them (1) in a cascade configuration of 3 re-actors for the downstream controller (see Figure 3, top)and (2) in a cascade of 2 reactors for the upstream con-figuration (Figure 3, bottom). The 4 reactors were: theMission Manager, the Platform Manager, the Commu-nication Repair and theMOPS reactor. The first threereactors are deliberative, the last one is the command dis-

patcher.

Figure 3. Reactors Architecture

The platform manager reactor aims at modeling a “virtualplatform” able to perform simple navigation tasks in a ro-bust manner. The model is designed to be able to re-planfor simple obstacle avoidance and plan autonomously foropportunistic science. The mission manager reactor is de-signed to plan for complex user tasks, like monitoring anarea and taking pictures around interesting objects. Thecommunication repair reactor is designed to robustify thecommunication between the operator and the remote sys-tem. The MOPS reactor encapsulates the connection withthe rover trough the network and constitute the logical in-terface with the platform.

3. DELIBERATIVE REACTORS

The role of deliberative reactors is to plan timelines forachieving goals, monitor the execution status of the time-line and re-plan timelines when problem occurs in the ex-ecution. The behavior of a GOAC deliberative reactor de-pends on themodel and on various policies. The most in-teresting ones are thegoal planning policy (GPP) and thegoal re-planning policy (GRP) (see [6] for details on howhow a GOAC deliberative reactor is implemented and theavailable policies). The GPP specifies how the planneris feed with goals asynchronously received by the reac-tor, the GRP specifies what to do with the goals whoseplan failed execution. The design of a specific reactorimplies (1) the definition of a model for the planner, (2)the choice of a goal planning policy and (3) the choice ofthe re-planning policy. Following sections describe theobjectives, the models and the policies used by the delib-erative reactors designed for the experiments described inthis paper.

3.1. The Platform Manager Reactor

The platform manager reactor aims at modeling a “vir-tual platform” able to perform simple tasks in a robustmanner. The model is designed to allow the rover to (1)recover from a set of predefined not nominal conditionsand (2) plan on-board activities to achieve unexpectedsub-goals triggered by the status of the platform. Obsta-cle avoidance and plan autonomously for opportunisticscience are two examples of capabilities entailed by thismodel.

Regarding the problem of recovering from failures, weconsider a generic platform modeled with a timelineTLP

(platform timeline) with two types of states:Stable statesS andActing states A, with the goal expressed by spec-ifying the stable status we want to get. The nominal be-havior would be a sequence of stable and acting states S→ A → S. . . . To add the capability of reacting to fail-ure conditions, we add a set of error states Ei (one forfor each possible failure) plus a path Ei →Pi → A ofstates to go trough to recover the failure (see Figure 4,left). A failure forces an error state on the timelineTLP

that models the platform. The use of a re-planning policythat re-plans for the last goal in execution (a stable sta-tus in this case) with this type of model forces, in case offailure, the execution of the recovering procedure for thefailure occurred from the error state to the acting states inexecution when the failure occurred. Section 5 shows anexample of implementation of this pattern to entail sim-ple obstacle avoidance.

Figure 4. Platform Manager Modeling Patterns

To achieve unexpected sub-goals triggered by the statusof the platform (for example to react when an interestingobject is found), we need to model at least two differentscenarios: a “nominal” one, i.e. what is the configurationof the telemetry expected in default conditions (that doesnot need a reaction) and (2) one (or more) not nominalscenarios, i.e. what are the configurations of the teleme-try that need a reaction and how to react. There is a num-ber of requirements for the plans that can be generatedfrom the model to guarantee the correctness of the behav-ior that can be obtained executing these plans. First of alla nominal plan has to fail if the telemetry configurationchanges during the execution into a status that require areaction (this is to enforce the reaction, if the envelopefor the nominal plan accidentally contains configurationsof the telemetry that needs a reaction, the platform mightnot react when needed). Secondly, a plan to react to agiven configuration of the telemetry must fail with anyother configuration (included the nominal one), to ensurethe right reaction of the platform and to avoid an “over-reaction” (i.e. a reaction in nominal conditions). Finally,any plan for the not nominal conditions must re-generatethe initial conditions for the nominal scenario (to handlemultiple reactions).

Considering again a generic platform modeled with thetimeline TLP with acting and stable states. We wantto be able to react to a set of telemetry values modeledas a set of valuesobsi of a timeline TLT (telemetrytimeline). We assumeobs0 as the nominal value for thetelemetry (no action is required), while the other values

require a reaction (obsi is supposed to require a sub-planPi). To react to a generic triggerobsi we conceptuallyadd to the state machine modeling the platform the pathsA→I→Pi →R→A, where the states I (forInterrupt) andR (for Restore) aim a modeling the procedures to inter-rupt the current activities and to restore the default statusof the platform1 (see Figure 4, right). Besides that, wesynchronize the values of A and Pi with the values of thetimelineTLT that triggers the reactions. In order to guar-antee at the modeling level that nominal plan must fail ifthe telemetry configuration changes during the executioninto a status that require a reaction, we force the nominalaction A to occur during the valueobs0 of TL. In orderto ensure that a plan for a reaction Pi will fail with anyother configuration of telemetry thanobsi, Pi will be syn-chronized to occur duringobsi. Finally, to guarantee thatthe platform will be able to perform A after Pi, R will besynchronized to restore the valueobs0 at its end (see Fig-ure 5). Section 5 shows an example of implementation ofthis pattern to entail opportunistic science.

Figure 5. Trigger Synchronization

3.2. The Mission Manager Reactor

The mission manager reactor is designed to manage com-plex user tasks, like monitoring an area, take stereo-graphic picture of interesting objects or zig-zagging be-tween two locations. What is interesting to discuss besidethe model in itself is the interaction between the modelof this reactor and the model of the platform reactor de-scribed above. Let us supposeRT the reactor on the top(the mission manager in our case) andRB the reactoron the bottom (the platform manager in our case). Themodel ofRT is designed to achieve a set of high leveltasksTi, whileRB has to provide a set of low level func-tionalities such thatTi can be modeled as a set of them.In order to simplify the model ofRT and wrap the com-plexity of RB , instead of thinkingRB as a set of opera-tions it is useful to model it as a set of states that it canachieve (autonomously) and defineTi as a sequence ofstates to be achieved instead of a sequence of operationsto perform.

Hence in general, supposing a modelMB for RB and amodel forMT for RT , letTi the tasks we want to achievewith MT andSi the interesting states that can be achievedwith MB , we (1) define a timelineTLG (G stands for

1Depending on the actual behavior of the platform and the reactiv-ity requested, I may not be necessary if the platform autonomously in-terrupts its activities when launching a trigger or R might beincludeddirectly in Pi.

Figure 6. Mission Manager Modeling Pattern

“goal timeline”) with n possible statesSi plus the stateRUNNING() (the running status wrap the complexity ofhow actuallyMB changes its status); (2) we addTLG

both toMT andMB ; (3) we add toMB the proper syn-chronizations with the corresponding states in theMB ’stimelines; a statusSi onTLG is synchronized to start atthe corresponding status of the timeline inMB (we callthis type of goalstatus achievement goal) or to be metby the action that can achieve that status (action achieve-ment goal); (4) we add toMT the proper definitions ofTi in terms ofSi

2 (see Figure 6). Regarding the mod-eling of Ti in terms ofSi, Ti has to contain all theSi

to be achieved and has to be synchronized to end at theachievement of the last status that has to be met by thesub-plan. Section 5 shows an example of instantiationof this pattern to implement high-level tasks in terms oflow-level platform functionalities.

3.3. The Communication Repair Reactor

The GOAC architecture has been designed to support de-ployment of software autonomous controllers to generateon-board plans and to dispatch activities for execution.As briefly described in Section 2, GOAC is based on awell-defined messaging protocol to synchronize indepen-dent control loops: exchangingfacts andgoals betweenreactors. Each reactor ingestobservations of the currentstate either from the platform or other reactors, and sendgoals to be accomplished to the platform or to the otherreactors. In theory an underlying assumption is that thereis no communication errors among the reactors, henceany difference between planned and observed timelines istheoretically due only to unexpected outcomes of plannedactions and/or unforeseen events.

In practice there might more reasons for discrepancyamong planned and executed timelines (interpreted asfailures during execution): errors in the models of thedeliberative reactors, errors in the implementation of thecommand dispatcher and errors in communication amongthe reactors (lost goals or observations). The GOACmodel-based approach is flexible enough to cope with

2This pattern works under the assumption that all theSi used to de-fine a taskTi belongs to the same modelMB . In case of tasks definedin terms of states belonging to different models, more complex synchro-nization patterns have to be used. The description of these patterns isout of the scope of this paper.

these errors. In fact in principle there is no difference be-tween an error during execution and any other type of er-ror mentioned above, since they all lead to a discrepancyon timelines that entails re-planning. In other word, aswe design a model for reacting to unexpected outcomesof planned actions, it is possible to design a model to copewith errors in sending goals and receiving observations.

Let us consider again a generic platform modeled withtwo types of states: a set of stable states Si and a set ofacting states Ai. We assume for the communication reac-tor a model conceptually similar to the one described forthe reactor on the bottom in Section 3.2 and on the rightin Figure 6. Hence we have 2 timelines for the model ofthe communication repair reactor: a goal timelineTLG

and a platform timelineTLP . TLP is a timeline sharedwith the command dispatcher, and takes values Si andAi, whileTLG takes valuesSi and RUNNING().

Errors can occur in the flow of goals and observationsfrom and toTLP . If we consider forTLP a status Siand as goal to achieve a status Sj , the transition of theplanned timeline Si → A→ Sj would generate a flow ofinformation among the communication repair reactor andthe platform reactor (that in this configuration wraps thecommunication channel) of 1 goal (A) and 2 observations(A to acknowledge the start of the action and Sj to com-municate the end of the action and the new stable statusof the platform). Hence we can have 3 types of failures(see Figure 7 (a)): (MG) missed goal, when A is sent butno received, (MA) missed ack for the goal, when A isreceived, the action starts but the ack that the action hasstarted is missed, and (MS) missed status change, whenthe action has finished, the new status Sj is sent to thedeliberative reactor but is not received. (MG) and (MS)are less critical, because in both cases the real status ofthe platform is stable and during re-planning no danger-ous actions are performed by the platform. (MA) is morecritical because the model of the platform is of a statusstable while the platform is actually acting.

Figure 7. Communication Repair Reactor

We can deal with the problems mentioned above bymeans of timeouts, a user defined re-planning policy anda slightly modified model forTLP . Regarding the modelfor TLP , we add a status CS (for check status), whicheffect is to stop the platform (if it is acting) and to force itto provide its current status (see Figure 7 (b)). Regardingtimeouts, let us suppose that we can calculate a “reason-able”3 timeoutτout for the transition Si →Sj . If we im-poseτout as upper bound for the transition of the platforminto Sj , we can have 3 possible failures (see Figure 8): (1)

3Reasonable here means that we know the platform able to performthe task in nominal conditions in a given time, and we set the timeout abit higher than that.

at τout the timeline is still in Si; (2) a straight transitionSi →Sj appears on the timeline; (3) atτout the timelineis in the status A. The case (2) indicates a communica-tion error MA (missed ack). Obviously the platform haschanged status but we missed the ack that the platformwas acting. The case (1) can be the result of either a MGor MA error: being apparently the platform still in Sieither it did not move because the goal has not been re-ceived or it was actually moving but the ack of the movehas been lost and the timeout has been reached. To sortout which is the case, we use a re-planning policy thatforces first CS followed by the goal that was originallyplanned. This has the effect of restoring the status of thetimeline allowing a consistent re-planning for the originalgoal. Regarding the third case, i.e. a value A at the time-out, it can be the result of either a MS error (the platformgot the stable status but we haven’t received the confir-mation) or a generic error (the platform is still acting butnot able to reach the goal by the timeout). Also in thiscase, the user-defined re-planning policy forces a checkof the status before starting the re-planning process.

Figure 8. Communication Repair Reactor Failures

The models of deliberative reactors implemented for theexperimental evaluation are described in Section 5, whilethe next Section describes the implementation of thecommand dispatcher.

4. THE MOPS REACTOR

The connection to the MOCUP Rover platform washighly facilitated due to the MOPS control software andits external interfaces. The MOPS (METERON Op-erations Software) user interface itself connects to theMOCUP rover via simple String commands. The com-mands send are headed by an identification number, andreplied by the platform with acknowledgment of recep-tion messages, and later by an execution status. In ad-dition, the platform periodically sends status informationsuch as current rover position, ultrasonic distance sensorreadings and battery status to the MOPS software.

MOPS, programmed in the python language, has an APIallowing for sending all available commands via exter-nal applications, as well as to register to the rover com-mand responses and telemetry. This API was used to in-terface the MOPS reactor with the MOPS software. Asthe MOPS software can be deployed and connected toMOCUP from remote computers or be run heedlessly di-rectly on the platform itself, the rover can be interfaced

by a controller located remotely over the network or by acontroller directly deployed on the rover.

Figure 9. The MOPS user interface for MOCUP rovercontrol

As MOPS/MOCUP allow only sequential execution ofcommands, the MOPS Reactor uses a FIFO queue to pro-cess goals and their responses. Goals received by the con-troller are translated in the reactor into the correspondingString commands understandable for MOPS/MOCUPand queued for execution. Separate processes forwardthe commands to MOPS which forwards the commandsover the network to MOCUP. They also monitor and pro-cess responses forwarded back to MOPS from MOCUP.At any time, the controller application can then subse-quently poll and react on new status information of eachgoal depending whether execution has started, executionwas successful or execution has failed.

The MOPS reactor in the architecture plays the role ofcommand dispatcher. The rover is able to move betweentwo points in space given their coordinates〈?x, ?y〉 (tak-ing into account that the rover may get stuck in betweentwo points because of an obstacle) and the orientation?h it has to reach. Besides that, the rover can take andstore pictures. Hence MOPS reactor has been imple-mented to be able to execute the goalsGOTO(?x, ?y, ?h),to move the rover in〈x, y〉 with final orientationh,TAKEPICTURE(), to take a picture in the current posi-tion andCHECKSTATUS(). The reactor provides for theobservationsAT(?x, ?y, ?h) when the rover is standingin 〈x, y〉 with an orientationh (observation provided af-ter having executed a commandGOTO(?x, ?y, ?h) or af-ter a commandCHECKSTATUS()), STUCKAT(?x, ?y, ?h)when the rover is stuck in〈x, y〉 with an orientationh (an error status),PICTURETAKENAT(?x, ?y, ?h), af-ter having took a picture or after a commandCHECK-STATUS() (if the last goal successfully executed wasto take a picture) andTARGETAT(?x, ?y, ?h) when aninteresting object is detected in〈x, y, h〉 (while mov-ing). We assume that a transitionGOTO(?x, ?y, ?h) →AT(?x, ?y, ?h) denotes a successful move to〈x, y, h〉,while a transitionGOTO(?x, ?y, ?h) → AT(?x′, ?y′, ?h′)or GOTO(?x, ?y, ?h) → STUCKAT(?x′, ?y′, ?h′), with?x 6=?x′ or ?y 6=?y′ denotes an unsuccessful move to〈x, y, h〉4 with the rover standing in〈x′, y′〉 with orienta-

4For this simple model we consider a failure not reaching the posi-tion, with no care of the actual orientation of the rover. Once that therover has reached the proper position no failure is considered possible

tion h′ while moving (status AT) or stuck because of anobstacle during the path (statusSTUCKAT). A transitionAT(?x, ?y, ?h) → GOTO(?x′, ?y′, ?h′) denotes the roverstarting to move from a point〈x, y, h〉 to a point〈x, y, h〉.Finally, we do not consider possible a failure in takinga picture, hence the only possible transition involvingthe commandTAKEPICTURE() will be AT(?x, ?y, ?h)→TAKEPICTURE() →PICTURETAKENAT(?x, ?y, ?h).

5. EXPERIMENTAL SETTING

The MOCUP rover is built onLEGO R© bricks and theLEGO NXT 2.0 Mindstorms kit. It is extended by anARM Linux Beagleboard with 1Ghz CPU and 500Mb ofRAM with a Debian Linux operating system. The USBPorts of the board are used to interface with the NXTbrick to control theLEGO R© motors for movement andthe ultrasonic sensors for obstacle detection, a webcamfor taking pictures and a wireless dongle for connectionto the network. The rover control software is started fromthe Linux system and communicating via the METERONsimulator to the MOPS user interface.

Figure 10. The MOCUP Rover

The ultrasonic sensors can detect obstacles in the frontand rear, and can triangulate positions of objects in thefront. If an obstacle is detected too close during move-ments, the rover is automatically stopped to prevent colli-sions, and an “interrupt” response is send back to the userinterface. The network delays of the simulator have beenset to a realistic ISS scenario with around 6-7 seconds ofround-trip time for each transmitted packet. As a DelayTolerant Network protocol is used in the communicationschain, delays can be adjusted for various scenarios with-out the danger of data packets timing out during transmis-sion. In addition, the simulator allows for pausing - hencedelaying any data packet, as well as dropping packets tosimulate data loss on the communications chain.

We have designed for the experimental evaluation 3 mod-els, one for each of the 3 deliberative reactors describedin the previous sections.

Regarding the Platform Manager reactor, we have de-signed a model to entail simple obstacle avoidance andthe capability of performing opportunistic science whilenavigating the environment. The model is made of 3timelines: thegoal timeline TLG, theplatform timelineTLP and thetrigger timeline TLT . The synchronization

during rotation to get the final orientation.

between the goal timeline and the platform timeline fol-lows the pattern described in Figure 6. OnTLG is pos-sible to post the goalsAT(?x, ?y, ?h), to move the plat-form andPICTURETAKENAT(?x, ?y, ?h), to take a pic-ture in the current position of the platform. The goals exe-cutable by the command dispatcher are posted on the plat-form timelineTLG , while onTLT the platform raisesthe triggers when an interesting object is found. Figure12(a) shows the transition graph forTLP to entail ob-stacle avoidance. This is an instantiation of the failurepattern (Figure 4, left), whereSTUCKAT(?x, ?y, ?h) isthe error status. The long path on the right is the recoverprocedure for the error status (when the rover is stuck, ittakes a picture then moves around the obstacle).

Figure 11. Mission Manager Tasks

To show an example of a model for opportunistic science,we suppose the rover equipped with a system to detectscientific targets while navigating the environment. Thetrigger timelineTLT has 2 values:NOTARGET() (nom-inal status),TARGETAT(?x, ?y, ?h) when an interestingobject is detected in〈x, y, h〉. The reaction consists instopping the platform while moving, coming back to〈x, y, h〉, take a picture of the target, restore the status ofthe trigger timeline and keep moving towards the destina-tion (see Figure 12(b)). This is an instantiation of the trig-gering pattern (Figure 4, right), where the plan (Pi in thepattern) for a triggerTi TARGETAT(?x, ?y, ?h) consistsin GOTO(?x, ?y, h)→TAKEOPPPICTURE(?x, ?y, h), andthe restore plan forTi (R in the pattern) consists in restor-ing the valueNOTARGET() onTLT . In this instantiationof the pattern we do not need a plan for interrupting thecurrent action, since the platform is able to change direc-tion without stopping if a new command is received.

In the mission manager reactor we have implemented3 high level tasks (see Figure 11, (a)-(c)): MONITOR-ING(?depth, ?width) (to monitor an area and take pic-tures at the corners, coming back to the starting point),ZIGZAG(?x, ?y, ?depth, ?width) (provides for a zigzagmove to 〈x, y〉 with a fix amplitude and depth) andTAKEPICTUREAROUND(?x, ?y) (to take 4 pictures allaround a target in〈x, y〉 and coming back to the start-ing point). The model is made of 2 timelines: themissiongoal timeline TLM and theplatform goal timeline TLG

(the model implements the pattern in Figure 6).TLG isthe goal timeline of the platform reactor described above,TLM is the goal timeline of the mission manager.TLM

Figure 12. The Platform Reactor

can take the value IDLE() plus all the high level tasks de-scribed above.

The models have been tested (a) upstream with a commu-nication manager reactor, conceptually similar to a plat-form manager reactor but implementing the behavior inFigure 8 (without obstacle avoidance and opportunisticscience capabilities), and a command dispatcher (ingest-ing manually failures in the command dispatcher) and (b)downstream with a mission manager reactor on top of aplatform manager reactor and a command dispatcher.

6. CONCLUSIONS

Space has been often a fertile field for the introductionof AI based advanced planning and scheduling technolo-gies. In fact, the AI model-based approach allows reusingof software modules across different scenarios because ofthe great flexibility introduced by the symbolic represen-tation of goals, constraints, logic, parameters to be opti-mized and so on.

However, a great effort and amount of time in general isnecessary to understand domains and problems, captur-ing all the specificity, and create the model. This model-ing difficulties constitute a not trivial barrier for the prac-tical adoption of AI model-based P&S technologies, se-riously harming the great advantage that, in theory, theapproach could bring.

In this paper we discuss how to build models for an AIplanning-based, domain independent control layer in thecontext of remote operations. The control layer has beenused in two ways: on one side the controller is locatedat the operator side, providing a unified view of the com-munication layer plus remote system that allows the op-erator to reduce the impact of the delay and errors intro-duced by the communication channel. On the other sidethe controller is located at the remote system side, ris-ing the abstraction level of the platform commanding andproviding to the operator the view of a more robust au-tonomous, goal oriented system, that reduces the amountof controls to be transmitted and reduce the need of fine

grained synchronization of commands for the remote sys-tem.

The experimental evaluation shows that is possible to pro-vide, with a little effort in writing additional code, robust-ness and levels of autonomy for an existing architectureoriginally designed for remote operations. We have im-plemented the AI controllers with the GOAC platform. Ifthe effort in writing code was minimal, with such an ap-proach a not trivial efforts have been required for model-ing. This paper is an initial step to address the problemof modeling in the context of AI planning for autonomy.

REFERENCES

1. W. Carey, P. Schoonejans, B. Hufenbach, K. Nergaard, F. Bosquil-lon de Frescheville, J. Grenouilleau, and A. Schiele. Meteron: Amission concept proposal for preparation of human-robotic explo-ration. InProceedings of the Global Space Exploration Conference,Washington D.C., 2012. GLEX-2012,01,2,6,x12697.

2. A. Ceballos, S. Bensalem, A. Cesta, L. de Silva, S. Fratini,F. In-grand, J. Ocon, A. Orlandini, F. Py, K. Rajan, R. Rasconi, and M. vanWinnendael. A Goal-Oriented Autonomous Controller for space ex-ploration. InProceedings of the ASTRA 2011, 11th Symposium onAdvanced Space Technologies in Robotics and Automation, 2011.

3. Vinton G. Cerf, Scott C. Burleigh, Robert C. Durst, Kevin Fall,Adrian J. Hooke, Keith L. Scott, Leigh Torgerson, and HowardS.Weiss. Delay-tolerant networking architecture, April 2007. InternetResearch Task Force RFC4838.

4. J. Frank and A. Jonsson. Constraint based attribute and intervalplanning.Journal of Constraints, 8(4):339–364, 2003.

5. S. Fratini and A. Cesta. The APSI Framework: A Platform for Time-line Synthesis. InProceedings of the 1st Workshops on Planning andScheduling with Timelines at ICAPS-12, Atibaia, Brazil, 2012.

6. S. Fratini, A. Cesta, R. De Benedictis, A. Orlandini, and R. Rasconi.Apsi-based deliberation in goal oriented autonomous controllers. InASTRA 2011. 11th Symposium on Advanced Space Technologies inRobotics and Automation, 2011.

7. N. Muscettola. HSTS: Integrating Planning and Scheduling. InZweben, M. and Fox, M.S., editor,Intelligent Scheduling. MorganKauffmann, 1994.

8. K. Rajan and F. Py. T-REX: Partitioned Inference for AUV MissionControl. In G. N. Roberts and R. Sutton, editors,Further Advancesin Unmanned Marine Vehicles. The Institution of Engineering andTechnology (IET), 2012.

Documents

PLANNING-BASED CONTROLLERS FOR INCREASED LEVELS …robotics.estec.esa.int/ASTRA/Astra2013/Papers/Fratini_2811286.pdfSimone Fratini, Sebastian Martin, Nicola Policella, and Alessandro