1
CSE 574
Automated Planning
Daniel S. Weld
2
Agent Control: The Problem
Environment
Percepts Actions
What action next?
3
What’s Wrong with this Picture?
4
Motivating Domain #1
5
Motivating Domain #2
6
Applications (Current & Potential)
Scheduling problems with action choices as well as resource handling requirements
  Problems in supply chain management
  HSTS (Hubble Space Telescope scheduler)
  Workflow management
Autonomous agents
  RAX/PS (The NASA Deep Space planning agent)
Software module integrators
  VICAR (JPL image enhancing system); CELWARE (CELCorp)
  Test case generation (Pittsburgh)
Interactive decision support
  Monitoring subgoal interactions
  Optimum AIV system
Plan-based interfaces
  E.g. NLP to database interfaces
  Plan recognition
Web-service composition
7
Lots of activity...
Significant scale-up in the last 4-5 years
  Before, we could synthesize about 5-6 action plans in minutes
  Now, we can synthesize optimal 100-action durative plans in minutes
  Further scale-up with domain-specific control
Significant strides in our understanding
  Rich connections between planning and CSP (SAT), OR (ILP)
  Vanishing separation between planning & scheduling
  New ideas for heuristic control of planners
  Wide array of approaches for customizing planners with domain-specific knowledge
8
Today’s Agenda
Administrivia
Approaches to agent control
Simplifying assumptions
STRIPS – the easiest case
Overview of methods
9
Administrivia
Class Goals
  Learn about planning
  Practice critical reading
  Gain experience as reviewers
  Gain experience leading discussion (PC Mtg)
  Extensible projects
Experience [ Quals / Generals ] [ AAAI / ICAPS Papers ]
10
Grading
20% paper summaries
25% organization of your class
20% class participation
35% project
11
Paper Summary Process
One [ two ] papers / class
Post reviews to class b-board
  One-line summary
  The most important ideas in the paper, and why
  The one or two largest flaws in the paper
  Identify two important, open research questions on the topic, and why they matter
Stimulate class discussion
Encourage detailed reading
12
Process for Student Class-Leads
You pick an area
Read papers ahead of time (I provide list)
Decide on best paper for class to read
We meet to discuss strategy
  T - (1 week), for an hour
Class
We meet to discuss +/-
13
Project
Not too big
  But potential for quals or publication
Individual or group
Lots of software available for quick start
More soon…
14
Administrivia
Add yourself to mailing list
Anything else?
15
Agent-Control Approaches
Reactive Control
  Set of situation-action rules
  E.g.
    1) if dog-is-behind-me then run-forward
    2) if food-is-near then eat
Planning
  Reason about effect of combinations of actions
  “Planning ahead”
  Avoiding “painting oneself into a corner”
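As a rough illustration of reactive control, the two example rules can be written as an ordered rule table. This is only a sketch: the first-match policy and the "wait" default are assumptions, not something the slide specifies.

```python
# Situation-action rules from the slide, checked in order.
# Assumption: the first matching rule wins; "wait" is a hypothetical default.
RULES = [
    (lambda percepts: "dog-is-behind-me" in percepts, "run-forward"),
    (lambda percepts: "food-is-near" in percepts, "eat"),
]

def reactive_agent(percepts, default="wait"):
    """Return the action of the first rule whose condition holds."""
    for condition, action in RULES:
        if condition(percepts):
            return action
    return default

print(reactive_agent({"food-is-near"}))
```

No lookahead happens here: the agent never considers what a combination of actions would lead to, which is the gap planning is meant to fill.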
16
Ways to make “plans”
Generative Planning
  Reason from first principles (knowledge of actions) to generate plan
  Requires formal model of actions
Case-Based Planning
  Retrieve old plan which worked for similar problem
  Revise retrieved plan for this problem
Reinforcement Learning
  Act randomly, noticing effects
  Learn reward, action models, policy
17
Generative Planning
Input
  Description of (initial state of) world (in some KR)
  Description of goal (in some KR)
  Description of available actions (in some KR)
Output
  Controller
    E.g. Sequence of actions
    E.g. Plan with loops and conditionals
    E.g. Policy = f: states -> actions
18
Input Representation
Description of initial state of world
  E.g., Set of propositions:
  ((block a) (block b) (block c) (on-table a) (on-table b) (clear a) (clear b) (clear c) (arm-empty))
Description of goal (i.e. set of desired worlds)
  E.g., Logical conjunction
  Any world that satisfies the conjunction is a goal
  (and (on a b) (on b c))
Description of available actions
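One possible encoding of this input, as a sketch: a world state is a set of ground propositions (tuples here, which is an assumed representation), and a conjunctive goal is satisfied by any state that contains all of its propositions.

```python
# Initial state copied from the slide; each proposition is a tuple.
initial_state = frozenset({
    ("block", "a"), ("block", "b"), ("block", "c"),
    ("on-table", "a"), ("on-table", "b"),
    ("clear", "a"), ("clear", "b"), ("clear", "c"),
    ("arm-empty",),
})

# Goal (and (on a b) (on b c)) as a set of propositions that must all hold.
goal = frozenset({("on", "a", "b"), ("on", "b", "c")})

def satisfies(state, goal):
    """A state is a goal state if every goal proposition is true in it."""
    return goal <= state

print(satisfies(initial_state, goal))
```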
19
Simplifying Assumptions
[Figure: agent and Environment linked by Percepts and Actions; the agent asks “What action next?”]
Static vs. Dynamic
Fully Observable vs. Partially Observable
Deterministic vs. Stochastic
Instantaneous vs. Durative
Full vs. Partial satisfaction
Perfect vs. Noisy
20
Classical Planning
Environment
  Static
  Fully Observable
  Deterministic
  Instantaneous
  Full
  Perfect
I = initial state   G = goal state   Oi: (prec) (effects)
[ I ] Oi Oj Ok Om [ G ]
21
Real Class Focus
Durative Actions
  Simultaneous actions, events, deadline goals
Planning Under Uncertainty
  Modeling a robot’s (or softbot’s) sensors
[Figure: branching plan from [ I ] over actions Oi, Oj, Ok, Oa, Ob, Oc, with an uncertain outcome marked “?”]
22
Static, Deterministic, Observable, Instantaneous, Propositional: “Classical Planning”
Dynamic: Replanning / Situated Plans
Durative: Temporal Reasoning
Continuous: Numeric Constraint reasoning (LP/ILP)
Stochastic: Contingent/Conformant Plans, Interleaved execution; MDP Policies; POMDP Policies
Partially Observable: Contingent/Conformant Plans, Interleaved execution; Semi-MDP Policies
23
Broad Aims & Biases
AIM: We will concentrate on planning in deterministic, quasi-static and fully observable worlds
  Will start with “classical” domains, but discuss handling durative actions and numeric constraints, as well as replanning
  Neo-Classical Planning
BIAS: To the extent possible, we shall shun brand-names and concentrate on unifying themes
  Better understanding of existing planners
    Normalized comparisons between planners
    Evaluation of trade-offs provided by various design choices
  Better understanding of inter-connections
    Hybrid planners using multiple refinements
    Explication of the connections between planning, CSP, SAT and ILP
24
Today’s Agenda
Administrivia
Approaches to agent control
Simplifying assumptions
STRIPS – the easiest case
Overview of methods
25
Why care about “classical” Planning?
Most advances seen first in classical planning
Many stabilized environments approximately satisfy classical assumptions
  It is possible to handle minor assumption violations through replanning and execution monitoring
  “This form of solution has the advantage of relying on widely-used (and often very efficient) classical planning technology” (Boutilier, 2000)
Techniques developed for classical planning often shed light on effective ways of handling non-classical planning worlds
  Most of the efficient techniques for handling non-classical scenarios are based on classical ideas/advances
26
How Represent Actions?
Simplifying assumptions
  Atomic time
  Agent is omniscient (no sensing necessary)
  Agent is sole cause of change
  Actions have deterministic effects
STRIPS representation
  World = set of true propositions
  Actions:
    Precondition: (conjunction of literals)
    Effects: (conjunction of literals)
[Figure: world states W0, W1, W2 connected by actions north11 and north12]
27
STRIPS Actions
Action = function from world-state to world-state
Precondition says where function defined
Effects say how to change set of propositions
[Figure: action north11 maps world state W0 to W1]
north11
  precond: (and (agent-at 1 1)
                (agent-facing north))
  effect: (and (agent-at 1 2)
               (not (agent-at 1 1)))
Note: STRIPS doesn’t allow derived effects; you must be complete!
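A minimal sketch of a STRIPS action as a partial function on world states. It assumes the common encoding that splits the effect conjunction into an add list (positive literals) and a delete list (negated literals); the names Action, applicable and apply_action are illustrative.

```python
from typing import FrozenSet, NamedTuple

class Action(NamedTuple):
    name: str
    precond: FrozenSet[tuple]
    add: FrozenSet[tuple]      # positive effect literals
    delete: FrozenSet[tuple]   # negated effect literals

def applicable(state, action):
    """The precondition says where the function is defined."""
    return action.precond <= state

def apply_action(state, action):
    """The effects say how to change the set of true propositions."""
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.delete) | action.add

north11 = Action(
    name="north11",
    precond=frozenset({("agent-at", 1, 1), ("agent-facing", "north")}),
    add=frozenset({("agent-at", 1, 2)}),
    delete=frozenset({("agent-at", 1, 1)}),
)

w0 = frozenset({("agent-at", 1, 1), ("agent-facing", "north")})
w1 = apply_action(w0, north11)
print(sorted(w1))
```

Propositions not mentioned in the effects carry over unchanged, which is why every effect must be listed explicitly.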
28
Action Schemata
Instead of defining: pickup-A and pickup-B and …
Define a schema:
(:operator pick-up
  :parameters ((block ?ob1))
  :precondition (and (clear ?ob1)
                     (on-table ?ob1)
                     (arm-empty))
  :effect (and (not (clear ?ob1))
               (not (on-table ?ob1))
               (not (arm-empty))
               (holding ?ob1)))
Note: STRIPS doesn’t allow derived effects; you must be complete!
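To see what the schema buys, here is a sketch that grounds pick-up for each block by substituting a constant for ?ob1. The dictionary layout and the ground_pick_up helper are assumptions made for illustration.

```python
def ground_pick_up(block):
    """Instantiate the pick-up schema with ?ob1 bound to one block."""
    return {
        "name": f"pick-up-{block}",
        "precond": frozenset({("clear", block), ("on-table", block), ("arm-empty",)}),
        "add": frozenset({("holding", block)}),
        "delete": frozenset({("clear", block), ("on-table", block), ("arm-empty",)}),
    }

# One ground action per block, generated from a single schema.
ground_actions = [ground_pick_up(b) for b in ("a", "b", "c")]
print([a["name"] for a in ground_actions])
```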
29
Planning as Search
Nodes: World states
Arcs: Actions
Initial State: The state satisfying the complete description of the initial conds
Goal State: Any state satisfying the goal propositions
30
Forward-Chaining World-Space Search
[Figure: blocks-world search from the Initial State to the Goal State over stacks of blocks A, B, C]
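A small sketch of forward-chaining search through world states, using breadth-first order (one choice among many). The grid actions are illustrative: north11 follows the earlier STRIPS slide, while north12 is an assumed analogue one cell further north.

```python
from collections import deque

# Each action: (name, preconditions, add effects, delete effects).
ACTIONS = [
    ("north11", frozenset({("agent-at", 1, 1), ("agent-facing", "north")}),
     frozenset({("agent-at", 1, 2)}), frozenset({("agent-at", 1, 1)})),
    ("north12", frozenset({("agent-at", 1, 2), ("agent-facing", "north")}),
     frozenset({("agent-at", 1, 3)}), frozenset({("agent-at", 1, 2)})),
]

def forward_search(initial, goal, actions=ACTIONS):
    """Breadth-first search: nodes are world states, arcs are actions."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                      # goal propositions all hold
            return plan
        for name, pre, add, delete in actions:
            if pre <= state:                   # action is applicable
                successor = (state - delete) | add
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, plan + [name]))
    return None                                # no reachable goal state

start = frozenset({("agent-at", 1, 1), ("agent-facing", "north")})
print(forward_search(start, frozenset({("agent-at", 1, 3)})))
```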
31
Backward-Chaining Search Thru Space of Partial World-States
[Figure: backward search from partial goal states (stacks of blocks A to E) toward the initial state]
Problem: Many possible goal states are equally acceptable. From which one does one search?
Initial State is completely defined
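Backward chaining sidesteps the choice of a single goal state by searching over partial world-states, that is, sets of propositions that must hold. A sketch of the standard regression step, assuming actions are split into add and delete lists (the regress function name is illustrative):

```python
def regress(goal, precond, add, delete):
    """Return the partial state that must hold before the action so that
    `goal` holds after it, or None if the action is irrelevant or harmful."""
    if not (add & goal):      # action achieves none of the goal propositions
        return None
    if delete & goal:         # action would destroy part of the goal
        return None
    return (goal - add) | precond

# north11 from the STRIPS slide, used here as a small example.
pre = frozenset({("agent-at", 1, 1), ("agent-facing", "north")})
add = frozenset({("agent-at", 1, 2)})
delete = frozenset({("agent-at", 1, 1)})

subgoal = regress(frozenset({("agent-at", 1, 2)}), pre, add, delete)
print(sorted(subgoal))
```

The search stops when a regressed partial state is contained in the completely defined initial state.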
32
Plan Space
Forward chaining thru world-states
Backward chaining thru world-states
33
“Causal Link” Planning
Nodes: Partially specified plans
Arcs: Adding + deleting actions or constraints (e.g. <) to plan
Initial State: The empty plan (actually two dummy actions…)
Goal State: A plan which when simulated achieves the goal
Need efficient way to evaluate quality (percentage of preconditions satisfied) of partial plan …
Hence causal link data structures
34
Plan-Space Search
[Figure: plan-space search tree with partial plans: pick-from-table(C); pick-from-table(B); pick-from-table(C) followed by put-on(C,B)]
How represent plans?
How test if plan is a solution?
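One way to answer the two questions is sketched below, under assumptions. A partial plan holds steps, ordering constraints and causal links, and preconditions without a supporting causal link remain open. The class and field names are hypothetical, and threat detection is left out to keep the sketch short.

```python
from dataclasses import dataclass, field

@dataclass
class PartialPlan:
    steps: dict                                   # step name -> (preconds, effects)
    orderings: set = field(default_factory=set)   # (before, after) pairs
    links: set = field(default_factory=set)       # (producer, proposition, consumer)

    def open_conditions(self):
        """Preconditions not yet supported by a causal link."""
        supported = {(consumer, prop) for _, prop, consumer in self.links}
        return [(step, p) for step, (pre, _) in self.steps.items()
                for p in sorted(pre) if (step, p) not in supported]

    def fraction_supported(self):
        """Share of preconditions already supported by a causal link."""
        total = sum(len(pre) for pre, _ in self.steps.values())
        return 1.0 if total == 0 else 1 - len(self.open_conditions()) / total

def empty_plan(initial, goal):
    """The 'empty' plan: two dummy steps, start (effects = initial state)
    and finish (preconditions = goal)."""
    return PartialPlan(
        steps={"start": (frozenset(), frozenset(initial)),
               "finish": (frozenset(goal), frozenset())},
        orderings={("start", "finish")},
    )

plan = empty_plan({"clear-b"}, {"on-c-b"})
plan.steps["put-on(C,B)"] = (frozenset({"holding-c", "clear-b"}), frozenset({"on-c-b"}))
plan.links.add(("put-on(C,B)", "on-c-b", "finish"))
plan.links.add(("start", "clear-b", "put-on(C,B)"))
print(plan.open_conditions())
```

In this simplified view a plan is a candidate solution once open_conditions() is empty; a full causal-link planner would also have to resolve threats to the links.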
35
Planning as Search 3: Graphplan
Phase 1 - Graph Expansion
  Necessary (insufficient) conditions for plan existence
  Local consistency of plan-as-CSP
Phase 2 - Solution Extraction
  Variables
    action execution at a time point
  Constraints
    goals, subgoals achieved
    no side-effects between actions
36
Planning Graph
[Figure: planning graph with alternating layers: Proposition (Init State), Action (Time 1), Proposition (Time 1), Action (Time 2)]
37
Constructing the planning graph…
Initial proposition layer
  Just the initial conditions
Action layer i
  If all of an action’s preconds are in i-1
  Then add action to layer i
Proposition layer i+1
  For each action at layer i
  Add all its effects at layer i+1
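The construction rule can be sketched as a single expansion step. The operators are the Dinner Date ones that appear later in these slides. Two simplifications are assumed here: only positive propositions are tracked, and existing propositions persist to the next layer (the usual no-op convention). Mutexes are ignored.

```python
# Operators as (preconditions, add effects, delete effects).
ACTIONS = {
    "carry": (set(), {"noGarbage"}, {"cleanHands"}),
    "dolly": (set(), {"noGarbage"}, {"quiet"}),
    "cook": ({"cleanHands"}, {"dinner"}, set()),
    "wrap": ({"quiet"}, {"present"}, set()),
}

def expand(props):
    """Build action layer i from proposition layer i-1, then proposition layer i+1."""
    # Action layer: every action whose preconditions all appear in the layer below.
    action_layer = {name for name, (pre, _, _) in ACTIONS.items() if pre <= props}
    # Proposition layer: persisted propositions plus all effects of those actions.
    next_props = set(props)
    for name in action_layer:
        next_props |= ACTIONS[name][1]
    return action_layer, next_props

actions_1, props_2 = expand({"cleanHands", "quiet"})
print(sorted(actions_1), sorted(props_2))
```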
38
Mutual Exclusion
Actions A,B exclusive (at a level) if
  A deletes B’s precond, or
  B deletes A’s precond, or
  A & B have inconsistent preconds
Propositions P,Q inconsistent (at a level) if
  all ways to achieve P exclude all ways to achieve Q
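The three action-exclusion conditions can be checked directly. This sketch covers only the conditions listed on the slide; the operator tuples reuse the Dinner Date domain from later in these slides, and prop_mutexes is an assumed set of proposition pairs already known to be inconsistent at the previous level.

```python
def actions_mutex(a, b, prop_mutexes=frozenset()):
    """a and b are (preconditions, add effects, delete effects) triples."""
    pre_a, _, del_a = a
    pre_b, _, del_b = b
    # A deletes B's precond, or B deletes A's precond.
    if del_a & pre_b or del_b & pre_a:
        return True
    # A and B have inconsistent preconds.
    return any(frozenset({p, q}) in prop_mutexes for p in pre_a for q in pre_b)

carry = (frozenset(), frozenset({"noGarbage"}), frozenset({"cleanHands"}))
dolly = (frozenset(), frozenset({"noGarbage"}), frozenset({"quiet"}))
cook = (frozenset({"cleanHands"}), frozenset({"dinner"}), frozenset())
wrap = (frozenset({"quiet"}), frozenset({"present"}), frozenset())

print(actions_mutex(carry, cook), actions_mutex(cook, wrap))
```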
39
Graphplan
Create level 0 in planning graph
Loop
  If goal ⊆ contents of highest level (nonmutex)
  Then search graph for solution
    If find a solution then return and terminate
  Otherwise (goal not yet present, or no solution found) extend graph one more level
A kind of double search: forward direction checks necessary (but insufficient) conditions for a solution, ...
Backward search verifies...
40
Searching for a Solution
For each goal G at time t
  For each action A making G true @t
    If A isn’t mutex with a previously chosen action, select it
  If no actions work, backup to last G (breadth first search)
Recurse on preconditions of actions selected, t-1
[Figure: planning graph layers: Proposition (Init State), Action (Time 1), Proposition (Time 1), Action (Time 2)]
41
Dinner Date
Initial Conditions: (:and (cleanHands) (quiet))
Goal: (:and (noGarbage) (dinner) (present))
Actions:
(:operator carry :precondition
  :effect (:and (noGarbage) (:not (cleanHands))))
(:operator dolly :precondition
  :effect (:and (noGarbage) (:not (quiet))))
(:operator cook :precondition (cleanHands)
  :effect (dinner))
(:operator wrap :precondition (quiet)
  :effect (present))
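The Dinner Date problem is small enough to run through a compact Graphplan-style sketch. It is not a full implementation. It assumes no-op persistence actions, tracks only positive propositions, and adds one condition beyond the slides' mutex list (an action deleting another action's effect also counts as interference, as is common in Graphplan descriptions). All function names are illustrative.

```python
from itertools import combinations, product

# Dinner Date operators as (preconditions, add effects, delete effects).
DOMAIN = {
    "carry": (frozenset(), frozenset({"noGarbage"}), frozenset({"cleanHands"})),
    "dolly": (frozenset(), frozenset({"noGarbage"}), frozenset({"quiet"})),
    "cook": (frozenset({"cleanHands"}), frozenset({"dinner"}), frozenset()),
    "wrap": (frozenset({"quiet"}), frozenset({"present"}), frozenset()),
}

def pair_in(xs, ys, mutexes):
    """True if some pair (x from xs, y from ys) is marked mutex."""
    return any(frozenset({x, y}) in mutexes for x in xs for y in ys)

def expand(props, pmutex):
    """Add one action layer and one proposition layer, with their mutexes."""
    acts = {n: a for n, a in DOMAIN.items()
            if a[0] <= props and not pair_in(a[0], a[0], pmutex)}
    # Assumption: no-op actions carry each proposition forward unchanged.
    acts.update({"noop-" + p: (frozenset({p}), frozenset({p}), frozenset())
                 for p in props})
    amutex = set()
    for (n1, a1), (n2, a2) in combinations(acts.items(), 2):
        interferes = a1[2] & (a2[0] | a2[1]) or a2[2] & (a1[0] | a1[1])
        if interferes or pair_in(a1[0], a2[0], pmutex):
            amutex.add(frozenset({n1, n2}))
    next_props = frozenset().union(*(a[1] for a in acts.values()))
    next_pmutex = set()
    for p, q in combinations(next_props, 2):
        ways = product([n for n, a in acts.items() if p in a[1]],
                       [n for n, a in acts.items() if q in a[1]])
        # P, Q inconsistent if every way to achieve P excludes every way to achieve Q.
        if all(frozenset({n1, n2}) in amutex for n1, n2 in ways):
            next_pmutex.add(frozenset({p, q}))
    return acts, amutex, next_props, next_pmutex

def extract(goals, level, layers):
    """Backward search with backtracking over non-mutex action choices."""
    if level == 0:
        return []
    acts, amutex = layers[level - 1]

    def choose(remaining, chosen):
        if not remaining:
            subgoals = frozenset().union(*(acts[n][0] for n in chosen))
            earlier = extract(subgoals, level - 1, layers)
            if earlier is None:
                return None
            return earlier + [sorted(n for n in chosen if not n.startswith("noop-"))]
        goal, rest = remaining[0], remaining[1:]
        if any(goal in acts[n][1] for n in chosen):
            return choose(rest, chosen)
        for name, act in acts.items():
            if goal in act[1] and not pair_in([name], chosen, amutex):
                plan = choose(rest, chosen + [name])
                if plan is not None:
                    return plan
        return None

    return choose(sorted(goals), [])

def graphplan(initial, goals, max_levels=10):
    props, pmutex, layers = frozenset(initial), set(), []
    for _ in range(max_levels):
        if goals <= props and not pair_in(goals, goals, pmutex):
            plan = extract(goals, len(layers), layers)
            if plan is not None:
                return plan
        acts, amutex, props, pmutex = expand(props, pmutex)
        layers.append((acts, amutex))
    return None

plan = graphplan({"cleanHands", "quiet"},
                 frozenset({"noGarbage", "dinner", "present"}))
print(plan)
```

The result is a list of steps, each a list of actions that may run in parallel. After one expansion all goals appear but no non-mutex set of actions achieves them, so the graph is extended once more before a plan is found.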
42
Planning Graph
[Figure: planning graph with columns 0 Prop, 1 Action, 2 Prop, 3 Action, 4 Prop. 0 Prop: cleanH, quiet. 1 Action: carry, dolly, cook, wrap. 2 Prop: noGarb, cleanH, quiet, dinner, present.]
43
Are there any exclusions?
[Figure: planning graph with columns 0 Prop, 1 Action, 2 Prop, 3 Action, 4 Prop. 0 Prop: cleanH, quiet. 1 Action: carry, dolly, cook, wrap. 2 Prop: noGarb, cleanH, quiet, dinner, present.]
44
Do we have a solution?
[Figure: planning graph with columns 0 Prop, 1 Action, 2 Prop, 3 Action, 4 Prop. 0 Prop: cleanH, quiet. 1 Action: carry, dolly, cook, wrap. 2 Prop: noGarb, cleanH, quiet, dinner, present.]
45
Extend the Planning Graph
[Figure: extended planning graph. 0 Prop: cleanH, quiet. 1 Action: carry, dolly, cook, wrap. 2 Prop: noGarb, cleanH, quiet, dinner, present. 3 Action: carry, dolly, cook, wrap. 4 Prop: noGarb, cleanH, quiet, dinner, present.]
46
One (of 4) possibilities
[Figure: extended planning graph with one solution highlighted. 0 Prop: cleanH, quiet. 1 Action: carry, dolly, cook, wrap. 2 Prop: noGarb, cleanH, quiet, dinner, present. 3 Action: carry, dolly, cook, wrap. 4 Prop: noGarb, cleanH, quiet, dinner, present.]