Agent-Oriented Techniques for Programming Robots
Hans-Dieter Burkhard, Humboldt University Berlin
H.D. Burkhard, HU Berlin: AOT for Programming Robots, Durres, Sept. 10, 2008
What is an Agent?
Someone who acts autonomously on behalf of others
• Sales agent
• Insurance agent
• Undercover agent
• …
Software Agents
• Assistance systems
• Search engines
• Chatterbots
• …
Open Systems
Definition (Hewitt)
• Continuous availability
• Extensibility
• Decentralized control
• Asynchronous work
• Inconsistent information
• Arm's-length relationships
Consider: P2P
Agents arrived with open systems
What is an Agent?
A program that acts autonomously on behalf of its user
Further attributes: intelligent, social, reactive, proactive, adaptive, …
An agent is a long running program, where the work can be meaningfully described as autonomous completion of orders or goals while interacting with the environment.
AI as research on intelligent agents.
(cf. Textbook Russell/Norvig: Artificial Intelligence)
Agents (Autonomous Systems) in Real World
• Natural language understanding
• Image interpretation
• Driver assistance systems
• Traffic control
• Space exploration
• Autonomous robots:
  – Service robots
  – Rescue robots
  – Entertainment robots
  – Industrial robots
  – Agricultural robots
  – …
Autonomous Systems in Real World
Robot soccer as testbed
(How to build and program soccer robots?)
Annual world championships and conference
Long term goal: Play like FIFA champion in 2050
Robot “Vision” from Team Osaka
Chess vs. Soccer
Chess:
• Static
• 3 minutes per move
• Single action
• Single player
• Information: reliable, complete
1997: Deep Blue wins against human champion Kasparov
Soccer:
• Dynamic
• Milliseconds
• Sequences of actions
• Team
• Information: unreliable, incomplete
Robot “Nao” from Aldebaran
RoboCup
Melbourne 2000 Bremen 2006
Service Robots
“Willie, bring me a beer!”
Alternatives:
- from the refrigerator
- from the cellar
- from the neighbor
- from the shop
- from the internet
- …
Which alternative to choose?
What else is needed (glass, …)?
Robot Needs a World Model
Facts about the world
– maps, positions of objects, descriptions, …
Methods for processing sensory inputs
– language processing, image processing
Methods for integrating sensory data
– new world model from old model and new sensory data
Memory of environment: part of the state in the program
“There was a beer in the refrigerator.”
World Model
Problems:
Environment is only partially observable
Observations are uncertain and noisy
Scene interpretation with Bayesian methods, e.g. the probability of being at location s given an observation z:
P(s|z) = P(z|s)·P(s) / P(z)
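The Bayes rule above can be tried out on a small discrete example. The following is only a sketch: the three locations, the prior, and the sensor model are invented for illustration and do not come from the talk.

```python
def bayes_update(prior, likelihood):
    """Return the posterior P(s|z) for every location s.

    prior:      dict mapping location s -> P(s)
    likelihood: dict mapping location s -> P(z|s) for the observed z
    """
    # P(z) = sum over s of P(z|s) * P(s), the normalising constant
    evidence = sum(likelihood[s] * prior[s] for s in prior)
    if evidence == 0:
        raise ValueError("observation has zero probability under the prior")
    return {s: likelihood[s] * prior[s] / evidence for s in prior}


# Hypothetical belief about where the robot is before the observation.
prior = {"kitchen": 0.5, "cellar": 0.3, "hall": 0.2}
# Assumed sensor model: probability of "seeing a refrigerator" at each place.
sees_fridge = {"kitchen": 0.8, "cellar": 0.1, "hall": 0.1}

posterior = bayes_update(prior, sees_fridge)
most_likely = max(posterior, key=posterior.get)
```

After the observation, the belief concentrates on the location where the observation is most plausible, while the other locations keep a small nonzero probability.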
World Model
The world model need not be true knowledge, only the belief of the agent.
“Someone took the beer from the refrigerator!”
Plans may fail. Need methods for revision.
Memory of Commitments
Tasks/Goals: Desired world states
Plans (Sequence of actions)
Rationality: Agents should only pursue goals/plans that can be achieved
“Why did I go to the refrigerator?”
Commitments: part of the state in the program
Goal Oriented Agents
Deliberation: Select goal to achieve
e.g. by calculating utilities
Means-ends reasoning: Planning method
e.g. by search in the action space
Rationality needs measures of success/quality/benefits.
“Bounded rationality”: success w.r.t. available resources (information, time, …)
Utility Estimations
Different options o
Achievable by different plans p
With different results r

Value of result r: $v(r)$
Probability of achieving r using plan p: $P(r \mid p)$
Utility of plan p (expectation): $u(p) = \sum_{r \text{ result of } p} P(r \mid p) \cdot v(r)$
Utility of option o: $u(o) = \max\{\, u(p) \mid p \text{ plan for } o \,\}$

Decision process (used for simulated soccer player ATH98):
1. Estimate utilities for options o
2. Select best option o as goal g
3. Build plan p for g
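The utility estimation can be written down in a few lines. This is a minimal sketch, not the ATH98 code: the option names, plan names, probabilities, and values are all made up, and each plan is represented simply as a list of (probability, value) pairs for its possible results.

```python
def plan_utility(outcomes):
    """Expected utility u(p) of one plan.

    outcomes: list of (probability, value) pairs, one per possible result r.
    """
    return sum(prob * value for prob, value in outcomes)


def option_utility(plans):
    """u(o): the utility of the best plan available for this option."""
    return max(plan_utility(outcomes) for outcomes in plans.values())


def select_goal(options):
    """Pick the option with the highest utility as the goal g."""
    return max(options, key=lambda o: option_utility(options[o]))


# Hypothetical options of a soccer player; all numbers are illustrative.
options = {
    "shoot": {
        "direct_shot": [(0.2, 10.0), (0.8, -1.0)],   # goal scored / ball lost
    },
    "pass": {
        "short_pass": [(0.9, 2.0), (0.1, -1.0)],
        "long_pass": [(0.5, 4.0), (0.5, -1.0)],
    },
}

goal = select_goal(options)
```

With these numbers the risky shot has a lower expected utility than the safe short pass, so "pass" is selected as the goal; a planner would then build the concrete plan for it.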
Rationality (Realism)
Goals must be feasible
Selection process:
1. Rough estimation (utilities)
2. If means-ends reasoning (planning) fails: revision of the goal selection
Refinement of Goals
Refinement as iterated decision process:
Long-term goal → intermediate goals → ... → intermediate goals → actions
Analogy: Stack of procedure calls
Least commitment: Specification only as far as necessary.
Maintaining Multiple Goals: BDI-Approach
Belief (world model)
Desire (desirable future world states)
Intentions (world states to be achieved)
Desires may be in conflict
Intentions must not be in conflict (rationality)
Mental states based on models of human acting (especially w.r.t. bounded rationality)
M.E. Bratman: Intentions, Plans, and Practical Reason, Harvard University Press, Massachusetts, 1987.
Adaptation vs. Stability
Conflicts between old intentions
and potential new intentions (desires)
Adaptation: always select the best intentions
Stability: continue old intentions
Advantages of stability:
Reliability (important for cooperation)
Reduce overhead for changes
Avoid oscillations
Disadvantages of stability:
Sticking too long to unsatisfactory behavior (fanaticism)
“There is a beer on the table!”
BDI: Screen of Admissibility
Bratman’s solution for conflicts between old and potential new intentions:
Old intentions restrict the admissibility of new intentions, i.e. they set a filter
- for additional intentions
- for refinement of intentions
Efficiency:
Reduce repeated evaluation of adopted intentions.
Bounded Rationality
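The filtering idea can be illustrated with a small sketch. It is a simplification under stated assumptions: conflicts are given as an explicit set of incompatible pairs, and the names `admissible`, `adopt`, and all intention names are hypothetical, not taken from Bratman or from any BDI framework.

```python
def admissible(option, intentions, conflicts):
    """True if `option` conflicts with none of the adopted intentions."""
    return all(frozenset((option, i)) not in conflicts for i in intentions)


def adopt(intentions, desires, conflicts, utility):
    """Add admissible desires to the intentions, best first.

    Adopted intentions are not re-evaluated; they only act as a filter.
    """
    result = list(intentions)
    for desire in sorted(desires, key=utility, reverse=True):
        if desire not in result and admissible(desire, result, conflicts):
            result.append(desire)
    return result


# The robot already intends to fetch the beer from the refrigerator.
intentions = ["get_beer_from_fridge"]
desires = ["get_beer_from_table", "get_glass"]
conflicts = {frozenset(("get_beer_from_fridge", "get_beer_from_table"))}
utility = {"get_beer_from_table": 5, "get_glass": 3}.get

result = adopt(intentions, desires, conflicts, utility)
```

Here the beer on the table has the higher utility but is screened out by the old intention, which shows both the efficiency gain (no re-evaluation) and the risk of sticking to a worse plan.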
BDI Agents
BDI architectures widely used
Implementation in different variations
Often only in simplified manner
desire = goal
intention = plan
without parallel intentions
Putting Together: Sense-think-act Cycle
Logical ordering of the agent's internal processing
1. Sense (“input”) + perception (interpretation, world model)
2. Think (“decision”: evaluation, planning)
3. Act (“output”)
[Figure: cycle sense → think → act]
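A sequential sense-think-act cycle can be sketched as a plain loop. The toy agent below lives on a line and walks toward a target position; the class, the dictionary-based environment, and the stopping rule are assumptions made for this example only.

```python
class Agent:
    """Toy agent on a line that walks toward a target position."""

    def __init__(self, target):
        self.target = target
        self.belief = None          # world model: believed own position

    def sense(self, env):
        # Perception: turn raw input into an updated world model.
        self.belief = env["position"]

    def think(self):
        # Decision: choose an action based on the world model only.
        if self.belief < self.target:
            return +1
        if self.belief > self.target:
            return -1
        return 0

    def act(self, env, action):
        # Output: the action changes the environment.
        env["position"] += action


def run(agent, env, max_cycles):
    """Run sequential cycles; return the cycle in which the agent stopped."""
    for cycle in range(max_cycles):
        agent.sense(env)
        action = agent.think()
        if action == 0:
            return cycle
        agent.act(env, action)
    return max_cycles


env = {"position": 0}
agent = Agent(target=3)
cycles = run(agent, env, max_cycles=10)
```

Note that `think` reads only the agent's belief, never the environment directly, which mirrors the separation between perception and decision in the cycle.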
Sense-think-act Cycle
Synchronisation (sequential)
[Figure: timing diagram of sense, think, and act over time, between input and output]
Sense-think-act Cycle
Synchronisation (concurrent)
[Figure: timing diagram of sense, think, and act over time, between input and output]
Sense-think-act Cycle
Synchronisation problems
[Figure: timing diagram of sense, think, and act over time, between input and output, with a question mark]
Problematic for complicated deliberation processes
Different Deliberation Times
Layered architectures with different deliberation cycles, e.g.
- Immediate reactions (avoid obstacles)
- Short-term planning
- Long-term planning
AIBO: 30 images per second, 125 motor commands per second
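Different deliberation cycles can be illustrated with a simple tick-based scheduler. This is only a sketch: the layer names follow the list above, but the periods (1, 5, and 25 ticks) are invented and are not the AIBO rates.

```python
def run_layers(layers, ticks):
    """Call each layer whenever its period divides the current tick.

    layers: list of (name, period_in_ticks) pairs, fastest layer first.
    Returns how often each layer was run.
    """
    runs = {name: 0 for name, _ in layers}
    for tick in range(ticks):
        for name, period in layers:
            if tick % period == 0:
                runs[name] += 1   # a real layer would deliberate here
    return runs


# Hypothetical periods: the reactive layer runs every tick,
# the planning layers run progressively less often.
layers = [("reactive", 1), ("short_term", 5), ("long_term", 25)]
runs = run_layers(layers, ticks=100)
```

The fast layer gets a chance to react in every tick, while the slower layers only deliberate occasionally and may therefore use more expensive methods.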
Structures: Layered Architectures
Synchronization
Conflicts
Concurrency
[Figure: agent with layers 1 to n between sense and act, interacting with the environment]
Layered Architectures with Mediator
[Figure: agent with layers 1 to n and a mediator between sense and act, interacting with the environment]
1-Pass-Architecture
[Figure: agent with layers 1 to n between sense and act, interacting with the environment]
2-Pass-Architecture
[Figure: agent with layers 1 to n between sense and act, interacting with the environment]
How to Deal with Dynamic World
Changing situations
Changing expectations
Unexpected situations (e.g. obstacles)
Changing plans
Conflict handling by BDI-approach
Least commitment: deliberate only as far as necessary
Double pass architecture (DPA)
Plans may fail. Need methods for revision.
Option Hierarchies
Serve beer (and-branch)
- Get bottle (or-branch)
  - from Refr. (and-branch): Go to Refr., Open Refr., Take Bottle
  - from Shop (and-branch): Get Money, Go to Shop, Buy Bottle, Go home
- Get glass
- Open bottle
- Fill glass
- Bring Glass
“And-branches”: all suboptions have to be achieved
“Or-branches” (alternatives): one suboption has to be achieved
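An option hierarchy with and-branches and or-branches maps naturally onto a small tree structure. The sketch below encodes only the "Get bottle" subtree; the `Option` class and the `achievable` check are assumptions for illustration, not part of the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    name: str
    kind: str = "leaf"            # "and", "or", or "leaf"
    children: list = field(default_factory=list)


def achievable(option, feasible):
    """Check whether an option can be achieved.

    feasible: set of leaf names that are currently possible.
    """
    if option.kind == "leaf":
        return option.name in feasible
    results = [achievable(child, feasible) for child in option.children]
    # and-branch: all suboptions needed; or-branch: one suboption suffices
    return all(results) if option.kind == "and" else any(results)


from_refr = Option("from Refr.", "and", [
    Option("Go to Refr."), Option("Open Refr."), Option("Take Bottle"),
])
from_shop = Option("from Shop", "and", [
    Option("Get Money"), Option("Go to Shop"),
    Option("Buy Bottle"), Option("Go home"),
])
get_bottle = Option("Get bottle", "or", [from_refr, from_shop])

# Assumed situation: the refrigerator is empty, the shop is open.
feasible = {"Go to Refr.", "Open Refr.",
            "Get Money", "Go to Shop", "Buy Bottle", "Go home"}
```

With this `feasible` set, `achievable(from_refr, feasible)` is false because "Take Bottle" is missing, but `achievable(get_bottle, feasible)` is still true through the shop alternative.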
Intention Tree
[Figure: option hierarchy for “Serve beer” with both alternatives for “Get bottle” (from Refr., from Shop) and their suboptions]
Options may be in different states, e.g.
- intended
- active
- done
Intention Tree
[Figure: intention tree for “Serve beer”: Get bottle via from Refr. (Go to Refr., Open Refr., Take Bottle), Get glass, Open bottle, Fill glass, Bring Glass]
Options may be in different states, e.g.
- intended
- active
- done
Activation Path
[Figure: intention tree for “Serve beer”: Get bottle via from Refr. (Go to Refr., Open Refr., Take Bottle), Get glass, Open bottle, Fill glass, Bring Glass]
The activation path is part of the intention tree
Options may be in different states, e.g.
- intended
- active
- done
Plan Fails
[Figure: intention tree for “Serve beer”: Get bottle via from Refr. (Go to Refr., Open Refr., Take Bottle), Get glass, Open bottle, Fill glass, Bring Glass]
No beer inside
Need for re-deliberation: look for alternatives
Repair: Intention Tree
[Figure: intention tree for “Serve beer” with both alternatives for “Get bottle” (from Refr., from Shop) and their suboptions]
Re-deliberation, not by chronological backtracking
Double Pass Architecture (DPA)
2 passes:
- Deliberation determines the intention tree
  - modification if necessary (re-deliberation)
- Executor works over the intention tree
  - maintains the activation path (top-down processing)
  - controls actuators
Advantages over stack-oriented approaches:
A procedure stack has access only to the most recent call
Implementations: XABSL, DPA
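The two passes can be illustrated with a toy version of the "Serve beer" example. This is a rough sketch under stated assumptions and not the actual DPA or XABSL code: the `Node` class, the state names, the rule "take the first alternative that has not failed", and the `try_action` callback are all invented for this example.

```python
class Node:
    def __init__(self, name, kind="leaf", children=()):
        self.name = name
        self.kind = kind            # "and", "or", or "leaf"
        self.children = list(children)
        self.state = "idle"         # "idle", "intended", "done", "failed"
        self.choice = None          # selected alternative of an or-node


def deliberate(node):
    """Pass 1: (re)build the intention tree below `node`."""
    if node.state in ("done", "failed"):
        return
    node.state = "intended"
    if node.kind == "and":
        for child in node.children:
            deliberate(child)
        if any(c.state == "failed" for c in node.children):
            node.state = "failed"
    elif node.kind == "or":
        node.choice = None
        for child in node.children:
            deliberate(child)
            if child.state != "failed":
                node.choice = child
                return
        node.state = "failed"       # no alternative left


def activation_path(root):
    """Pass 2: follow the intention tree top-down to the next leaf."""
    path = [root]
    while path[-1].kind != "leaf":
        node = path[-1]
        if node.kind == "or":
            path.append(node.choice)
        else:
            path.append(next(c for c in node.children if c.state != "done"))
    return path


def run(root, try_action):
    """Alternate executor steps with re-deliberation after failures."""
    executed = []
    deliberate(root)
    while root.state == "intended":
        path = activation_path(root)
        leaf = path[-1]
        if try_action(leaf.name):
            executed.append(leaf.name)
            leaf.state = "done"
            for node in reversed(path[:-1]):     # propagate completion upward
                parts = [node.choice] if node.kind == "or" else node.children
                if all(c.state == "done" for c in parts):
                    node.state = "done"
                else:
                    break
        else:
            # Fail the alternative chosen at the nearest or-node, keep the rest.
            or_index = max((i for i, n in enumerate(path[:-1]) if n.kind == "or"),
                           default=None)
            failed = root if or_index is None else path[or_index + 1]
            failed.state = "failed"
            deliberate(root)
    return executed


def leaves(*names):
    return [Node(n) for n in names]


serve_beer = Node("Serve beer", "and", [
    Node("Get bottle", "or", [
        Node("from Refr.", "and", leaves("Go to Refr.", "Open Refr.", "Take Bottle")),
        Node("from Shop", "and", leaves("Get Money", "Go to Shop", "Buy Bottle", "Go home")),
    ]),
    *leaves("Get glass", "Open bottle", "Fill glass", "Bring Glass"),
])

# Assumed world: the refrigerator is empty, so "Take Bottle" fails.
executed = run(serve_beer, lambda action: action != "Take Bottle")
```

Because the whole tree stays accessible, the failure of "Take Bottle" is repaired at the "Get bottle" or-branch by switching to the shop alternative, while the remaining suboptions of "Serve beer" are kept as they were.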
Still: Classical Approach (“Dualism”)
Robot = Agent (Brain) augmented by Sensors + Actuators
[Figure: robot in its environment; sensors deliver input to the agent (program), whose output drives the actuators]
Limitations for Complex Actuators
Vehicles have simpler actuation than legged robots
Vehicles:
• Accelerate
• Drive
• Turn
• Stop
Legged robots:
• Coordination of limbs
• Complex kinematics
• Stability maintenance (even in stop state)
Machine Learning
Use “trial and error”.
• Evolutionary algorithms
• Reinforcement learning
• Case-based reasoning
• Neural networks
http://www.robocup.de/AT-Humboldt/simloid-evo.shtml?de
Proprioception: Feeling One's Own Body
Biologically Inspired Robotics
Emergent behavior using situatedness in physical world
Intelligence emerges by “clever connections”
New insights for Artificial Intelligence: intelligence needs a body for experiencing the real world.
Many sensors
Local processing
Coupling with actuators
Neural Networks
Acceleration Sensors on Our Robots
Accelboards:
• real time (10 ms cycle)
• C/Assembler program
• local processing
[Figure: robot with Accelboards labeled ABHL, ABML, ABAL, ABSR, ABFL, ABAR, ABHR, ABFR]
Recent Experiments
Local control by recurrent neural networks
Networks developed by evolution
See you at RoboCup 2009 in Graz!
Thank you!