~sa-circa/talks/cog-arch-03/cog-arch-03.ppt
Honeywell Laboratories
Goals and Threats:
Motivations in CIRCA
David J. Musliner
Honeywell Laboratories
[email protected]
(612) 951-7599
Still working on my (Michigan) thesis topic. 13 years since first CIRCA paper.
Presentation Outline
• CIRCA overview:
– Motivating domains.
– Architectural modules.
– Representations, including motivation mechanisms.
– Automatically planning real-time reactive plans.
• Probabilistic CIRCA:
– Motivating domains.
– Representation changes.
– Motivation changes.
– Planning methods changes.
• Learning in CIRCA
Motivating Domain Characteristics
• Time-critical, hazardous, open-world domains: CIRCA guarantees that it will respond in a timely way to threats in its environment, avoiding failures and pursuing goals.
– Requires robustness beyond human performance.
• Bounded reactivity: CIRCA reasons explicitly about the time needed for sensing and actions (“perceptual-motor limits”).
• Bounded rationality: CIRCA dynamically builds reactive plans for only the immediately relevant parts of the situation. CIRCA is self-aware, using meta-level deliberation scheduling to optimize its online planning process.
Multi-Agent Self-Adaptive CIRCA
Approach:
• Automatic synthesis and adaptation of guaranteed real-time controllers.
Performance:
• Reactive control responses to threats and contingencies in milliseconds.
• Coordinated multi-agent behaviors in tens of milliseconds.
• Dynamic reconfiguration of team mission plan in less than 10 seconds.
• Demonstrations in simulated UAV team domains: coordinated defense, dynamic replanning for contingencies.
Impact:
• Robust UAVs that rebuild their own control systems in response to contingencies (e.g., damage, target of opportunity).
• Smart UAV teams that actively coordinate distributed capabilities/resources to maximize mission effectiveness.
• Sponsor: DARPA ANTS.
• Teammate: Univ. of Michigan
Goal: Adaptive real-time coordination and control of multi-UAV teams.
Intelligent Real-Time Cyber Security
Approach:
• Use CIRCA to plan and execute reactive security controllers.
• Tailor responses automatically according to available resources, varying threat levels & security policies.
Performance:
• Fully autonomous operations defeating attacks in microseconds.
• Rapid reconfiguration for dynamic network assets, security state, threat profile.
• Demonstrations in real computer networks.
Impact:
• Real-time responses defeat manual and automated attack scripts.
• Automatic tradeoffs of security vs. service level and accessibility.
• System derives responses for novel attacks built from known components.
Sponsor: DARPA CyberPanel.
Teammate: Secure Computing
Goal: Automatic real-time response to computer security intrusions.
[Figure: architecture diagram with labels: Computing services; Active Security Controller; Executive; Controller Synthesis Module; Networks, Computers; Attacks, intrusions; Intrusion Assessment; Security Tradeoff Planner.]
CIRCA Architecture
Adaptive Mission Planner: Divides an overall mission into multiple phases, with limited performance goals designed to make the planning problem solvable with available time and available execution resources. Deliberation scheduling.
Controller Synthesis Module: For each mission phase, plans a set of real-time reactions according to the constraints sent from AMP. Planning.
Real Time Subsystem: Continuously executes planned control reactions in hard real-time environment; does not “pause” waiting for new plans. Execution.
[Figure: module stack: Adaptive Mission Planner, Controller Synthesis Module, Real Time System.]
How CIRCA Works
[Figure: Adaptive Mission Planner, Controller Synthesis Module, and Real Time System along a mission timeline from Start to Goal.]
• Adaptive Mission Planner: Break down mission.
• Controller Synthesis Module: Generate controller (repeated for each phase).
• Real Time System: Execute controller:
  if (state-1) then action-1
  if (state-2) then action-2
  ...
Extending Performance Guarantees to Multi-Agent Teams
Adaptive Mission Planner: Negotiates roles and responsibilities between agents in collaborative team.
Controller Synthesis Module: Builds controllers that include coordinated actions by multiple agents.
Real Time Subsystem: Executes coordinated controllers predictably, including distributed sensing and acting.
Only system to guarantee timing of end-to-end multi-agent coordinated behaviors
[Figure: two CIRCA agents, each with an Adaptive Mission Planner, Controller Synthesis Module, and Real Time System; links between the agents are labeled Roles, Goals; Planned Actions, Planned Negotiations; Real-Time Reactions.]
Real Time Subsystem (RTS)
• The RTS executes loops of Test-Action Pairs (TAPs).
• The RTS executes in parallel with the other CIRCA modules.
• Parallel execution permits re-planning using computationally-expensive algorithms while preserving platform safety.
• Special-purpose TAPs used to download and switch to next controller.
• RTS includes multiple TAP schedule caches to hold controllers before they are activated.
• Example TAP:
– If (radar-missile-tracking T) then begin-evasives with max-delay: 300 msec.
[Figure: cyclic TAP schedule: test1/action1, test2/action2, test1/action1, test3/action3, test1/action1, test4/action4.]
Planning Real-Time Reactions
[Figure: CSM inputs (available actions, “non-volitional” transitions, goal state description, initial state description) and outputs (timed automata world model & executable reactive controller).]
Transition-based input model similar to classical planners, but with temporal characteristics and non-volitional transitions.
[Figure: CSM components: State Space Planner, Verifier, TAP Compiler, Scheduler.]
CSM Functional Components
• State Space Planner predicts future threats and opportunities, plans actions with timing constraints for future states.
• Verifier reasons about complex temporal model to ensure that all failures are preempted.
• TAP compiler reduces timed automata controller model to time-constrained reactions (Test-Action Pairs).
• Scheduler builds executable cycle of TAPs to meet time constraints.
[Figure: Controller Synthesis Module containing State Space Planner, Verifier, TAP Compiler, Scheduler.]
CSM Algorithm
• CSM essentially determines a strategy in a timed game against a worst-case adversary.
• Search loop iteratively selects a state and chooses action for that state.
– Heuristics guide choice for safety and goal achievement.
– Approximations indicate that timing will work.
– Formal reachability analysis called after each action choice, to confirm that all planned preemptions will occur.
– If failure reachable, path to failure can be used to backjump to most recent decision related to any state on the path.
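The reachability check in the loop above can be illustrated with a deliberately simplified, untimed Python sketch. The real verifier reasons over a timed model; here the graph is assumed to contain only the transitions that can still occur under the current plan (preempted transitions already removed), and the state names are hypothetical. The function returns a path to failure, the kind of trace a planner could use to pick a backjump point.

```python
from collections import deque
from typing import Dict, List, Optional


def failure_trace(graph: Dict[str, List[str]], start: str,
                  failure: str = "FAILURE") -> Optional[List[str]]:
    """Breadth-first search; returns a shortest state path to failure, or None."""
    parents: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == failure:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in graph.get(state, []):
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
    return None


# No action planned in the threatened state: the threat can run to failure.
unsafe_plan = {
    "safe_normal": ["threat_normal"],
    "threat_normal": ["FAILURE"],
}
# Evasive action planned: the transition to failure is preempted and removed.
safe_plan = {
    "safe_normal": ["threat_normal"],
    "threat_normal": ["threat_evasive"],
    "threat_evasive": ["safe_evasive"],
    "safe_evasive": ["safe_normal"],
}

print(failure_trace(unsafe_plan, "safe_normal"))
print(failure_trace(safe_plan, "safe_normal"))
```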
CIRCA Motivations: Threats
• Threats represented by temporal transitions to failure (TTFs).
• CSM only returns plans that make failure unreachable, using:
– Prevention: planned actions never allow TTF preconditions to become true.
– Preemption: planned actions will definitely happen before TTFs.
[Figure: state diagram with states OK, Threatened, Safe, and Failure.]
Radar Threat Domain - 1
;; Radar-guided missile threats can occur at any time.
(make-instance 'event
  :name "radar_threat"
  :preconds '((radar_missile_tracking F))
  :postconds '((radar_missile_tracking T)))

;; You die if you don't defeat a threat by 1200 time units.
(make-instance 'temporal
  :name "radar_threat_kills_you"
  :preconds '((radar_missile_tracking T))
  :postconds '((failure T))
  :min-delay 1200)
Radar Threat Domain - 2
;; It takes no more than 10 time units to start evasives.
(make-instance 'action
  :name "begin_evasive"
  :preconds '((path normal))
  :postconds '((path evasive))
  :max-delay 10)

;; We defeat the missile in between 250 and 400 time units.
(make-instance 'reliable-temporal
  :name "evade_radar_missile"
  :preconds '((radar_missile_tracking T) (path evasive))
  :postconds '((radar_missile_tracking F))
  :delay (make-range 250 400))
Preemption as Key Planning Structure
Key Concept: Prevent Failure
[Figure: state diagram. States: (Radar-missile-tracking F, Path normal); (Radar-missile-tracking T, Path normal); (Radar-missile-tracking T, Path evasive); FAILURE. Transitions: RadarThreat, Radar-threat-kills-you, Begin-evasive (marked as preemption).]
Non-Markov Temporal Model
[Figure: state diagram. States: (Radar-missile-tracking F, Path normal); (Radar-missile-tracking T, Path normal); (Radar-missile-tracking T, Path evasive); (Radar-missile-tracking F, Path evasive); FAILURE. Transitions: RadarThreat, Begin-evasive, Evade-radar-missile, and Radar-threat-kills-you (shown twice).]
Why non-Markov? Efficient reactive plan construction.
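A small worked example of the timing argument, using the delays from the radar threat domain. It assumes (this is a reading of the model, not something the slides state outright) that the threat's delay clock starts when tracking begins and keeps running while the path changes from normal to evasive, so the whole response chain has to finish before the threat's minimum delay. The strict inequality is a conservative choice made for this sketch.

```python
def preempted(response_max_delays, threat_min_delay):
    """True if the worst-case response chain finishes strictly before the threat can fire."""
    return sum(response_max_delays) < threat_min_delay


BEGIN_EVASIVE_MAX = 10   # :max-delay of begin_evasive
EVADE_MISSILE_MAX = 400  # upper end of evade_radar_missile's delay range
THREAT_MIN = 1200        # :min-delay of radar_threat_kills_you

worst_case = BEGIN_EVASIVE_MAX + EVADE_MISSILE_MAX
print(worst_case, preempted([BEGIN_EVASIVE_MAX, EVADE_MISSILE_MAX], THREAT_MIN))
```

The worst-case response is 10 + 400 = 410 time units, well inside the 1200-unit minimum delay of the threat.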
CIRCA Motivations: Goals
• Represented by designation of specific desirable feature/value pairs.
• CSM heuristic guides system to choose actions that try to achieve (and re-achieve) maximum number of goal features.
• All goals are:
– Conjunctive.
– Optional.
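A hedged Python sketch of a goal-counting heuristic in the spirit of the bullets above. The dictionary-based state encoding and the function names are assumptions made for illustration; the actual CSM heuristic is not specified on the slide. Safety is deliberately left out here, since in CIRCA it is established separately by the verifier.

```python
def goals_satisfied(state, goals):
    """Number of goal feature/value pairs that hold in the state."""
    return sum(1 for feature, value in goals if state.get(feature) == value)


def apply_postconds(state, postconds):
    """State that results from applying an action's postconditions."""
    return {**state, **dict(postconds)}


def best_action(state, actions, goals):
    """Pick the action whose resulting state satisfies the most goal features."""
    return max(
        actions,
        key=lambda name: goals_satisfied(apply_postconds(state, actions[name]), goals),
    )


goals = [("path", "normal")]
state = {"radar_missile_tracking": "F", "path": "evasive"}
actions = {
    "begin_evasive": [("path", "evasive")],
    "end_evasive": [("path", "normal")],
}
print(best_action(state, actions, goals))
```

Because the goals are conjunctive but optional, the score is a count rather than an all-or-nothing test: a plan that re-achieves some goal features still beats one that achieves none.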
Radar Threat Domain - 3
;; Your goal is to continue flying normal path.
(make-instance 'goal
  :condition '((path normal)))

Optional elements for different planners:
• :reward
• :priority
Goals Drive Stabilization
[Figure: state diagram. States: (Radar-missile-tracking F, Path normal); (Radar-missile-tracking T, Path normal); (Radar-missile-tracking T, Path evasive); (Radar-missile-tracking F, Path evasive); FAILURE. Transitions: RadarThreat, Begin-evasive, Evade-radar-missile, End-evasive, Radar-threat-kills-you.]
Dynamic Abstraction Planning
• Start with abstract states omitting all non-goal features.
• Incrementally and non-uniformly add features to states when required:
– When no safe actions are applicable.
– When goal achievement heuristic indicates.
• Result: planner decides what it needs to think about, when.
• Future direction: use this to guide what you attend to for learning.
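[Figure: abstract state space before and after refinement. Before: a non-failure state and a failure state, linked by Radar-threat-kills-you. After: the non-failure state is split into Path normal and Path evasive; the failure state and Radar-threat-kills-you remain.]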
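A minimal sketch of non-uniform refinement, assuming abstract states are represented as partial feature assignments (Python dictionaries). The representation and the `refine` helper are hypothetical, not the DAP implementation; the point is only that one abstract state gains a feature while the others stay coarse.

```python
from typing import Dict, List

AbstractState = Dict[str, str]


def refine(state: AbstractState, feature: str, values: List[str]) -> List[AbstractState]:
    """Split one abstract state into one state per value of a newly added feature."""
    if feature in state:
        raise ValueError(f"{feature} is already part of this abstract state")
    return [{**state, feature: value} for value in values]


# Start with states that only distinguish failure from non-failure.
states = [{"failure": "F"}, {"failure": "T"}]

# Non-uniform refinement: only the non-failure state is split on `path`.
states = refine(states[0], "path", ["normal", "evasive"]) + states[1:]
print(states)
```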
AMP Responsibilities
• Divide mission into phases, subdividing them as necessary to handle resource restrictions.
• Negotiate with other AMPs to allocate goals and threats in each phase.
• Build problem configurations for each phase, to drive CSM.
• Modify problem configurations, both internally and via negotiation with other AMPs, to handle resource limitations.
• Tasks represent:
– Contracts to handle threats and goals.
- Need to announce, bid, award, and plan for them.
– Need to generate plan for a problem configuration.
– Need to download plan for a configuration to the RTS.
• On each AMP decision cycle, select and execute highest-priority task.
• New capability: deliberation scheduling.
– Estimate costs/benefits of different tasks: tie priority to utility.
Negotiated Allocation of Mission Goals
• Adaptive Mission Planners negotiate to distribute:
– Long-term mission goals.
– Roles: predefine responsibilities/concerns as context for negotiation.
– Performance evaluation responsibilities.
• Enhanced Contract-Net style negotiation.
– Adaptivity/dynamics.
AMP Deliberation Scheduling
• Mission phases characterized by:
– Probability of survival/failure.
– Expected reward.
– Expected start time and duration.
• Agent keeps reward from all executed phases.
• Different CSM problem configuration operators yield different types of plan improvements.
– Improve probability of survival.
– Improve expected reward (number or likelihood of goals).
• Configuration operators can be applied to same phase in different ways (via parameters).
• Configuration operators have different expected resource requirements (computation time/space).
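One way to picture the tradeoff above is a greedy choice of the configuration operator with the best expected gain per unit of planning time. This Python sketch is an illustration under stated assumptions, not the AMP's algorithm: it assumes a phase's reward is collected only if the agent survives that phase and all earlier ones, and the phase names, operator names, and numbers are invented.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Phase:
    name: str
    survival: float  # probability of surviving the phase
    reward: float    # reward collected if the phase is survived (assumption)


@dataclass(frozen=True)
class Operator:
    name: str
    phase_index: int
    cost_s: float                     # expected planning time in seconds
    survival: Optional[float] = None  # predicted survival after replanning
    reward: Optional[float] = None    # predicted reward after replanning


def expected_reward(phases: List[Phase]) -> float:
    total, alive = 0.0, 1.0
    for phase in phases:
        alive *= phase.survival
        total += alive * phase.reward
    return total


def apply_operator(phases: List[Phase], op: Operator) -> List[Phase]:
    old = phases[op.phase_index]
    new = replace(
        old,
        survival=old.survival if op.survival is None else op.survival,
        reward=old.reward if op.reward is None else op.reward,
    )
    return phases[:op.phase_index] + [new] + phases[op.phase_index + 1:]


def best_operator(phases: List[Phase], operators: List[Operator]) -> Operator:
    """Greedy rule: largest gain in expected reward per second of planning."""
    base = expected_reward(phases)
    return max(
        operators,
        key=lambda op: (expected_reward(apply_operator(phases, op)) - base) / op.cost_s,
    )


phases = [Phase("ingress", 0.9, 0.0), Phase("attack", 0.8, 10.0), Phase("egress", 0.9, 5.0)]
operators = [
    Operator("harden_ingress", phase_index=0, cost_s=20.0, survival=0.99),
    Operator("raise_attack_reward", phase_index=1, cost_s=10.0, reward=12.0),
]
print(round(expected_reward(phases), 2), best_operator(phases, operators).name)
```

With these numbers the baseline expected reward is 10.44. Hardening the first phase gains 1.044 for 20 seconds of planning, while improving the second phase's reward gains 1.44 for 10 seconds, so the greedy rule picks the latter.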
Extensible CIRCA Architecture
• Well-defined API for each CIRCA module.
• E.g., CSM: problem specification and algorithm controls in; reactive plans and planning process monitoring out.
• Well-defined API for components within modules.
• E.g., API for state-space planner interaction with verifier:
– State space model in to verifier.
– Safety assessment plus optional culprit state trace out from verifier.
• Has allowed us to plug in different planners and verifiers for different domain representations and verifier approaches:
– Timed automata: Kronos, RTA, CSV, RTA-incremental, CSV-incremental; DAP, pDAP, regular planners.
– Safety-oriented Generalized semi-Markov: Monte Carlo sampler.
– Maximizing expected utility: MC sampler; evolutionary reaction-space search engine.
• Different executives: RTS, CLIPS.
Memory and Models in CIRCA
Adaptive Mission Planner:
• Mission model: phases with threats & goals.
• Mappings from threats/goals to partial CSM input models (sets of transitions).
• CSM performance profiles.
Controller Synthesis Module:
• Transition models from AMP.
• Timing model of RTS.
Real Time System:
• Cached TAP controllers.
• Currently sensed state features.
Probabilistic Reactive Planning
• Add transition probabilities to state model.
– World transitions and controlled actions.
• Sample simulated executions of the current plan to estimate probability of reaching different states.
• Build plans that handle most-probable states.
• Allows CIRCA to trade off planning time and plan complexity against system safety.
• Allows CIRCA to optimize expected utility of plans, trading off safety against mission objectives (goals, reward model).
Probabilistic World Model Dynamics
• The world model is a generalized semi-Markov process (GSMP).
• The world occupies a single state at any point in time.
• Enabled transitions in the current state compete to trigger.
• One transition triggers in each state, determining the next state.
• Non-Markovian because trigger distributions depend on holding times.
• There are no analytic solutions for unrestricted GSMPs.
• Must use a sampling-based approach to estimate state probabilities.
– Simpler: estimate whether failure is too likely.
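The competing-transition dynamics and the sampling-based estimate can be sketched in Python for the radar domain. All delay distributions below are hypothetical (uniform ranges chosen so that failure is possible but rare); the transition names come from the domain slides. The threat's clock is sampled once and kept while the path changes, which is what makes the process non-Markovian.

```python
import random


def sample_run(rng: random.Random) -> bool:
    """One sampled execution, starting when the radar begins tracking. True = failure."""
    # Trigger times of the transitions enabled in (tracking T, path normal).
    clocks = {
        "radar_threat_kills_you": rng.uniform(300, 1500),  # hypothetical distribution
        "begin_evasive": rng.uniform(0, 10),               # hypothetical distribution
    }
    # Enabled transitions compete: the earliest trigger time wins.
    if min(clocks, key=clocks.get) == "radar_threat_kills_you":
        return True
    # Now in (tracking T, path evasive): the threat clock keeps running,
    # and the evasion process becomes enabled with a freshly sampled delay.
    now = clocks.pop("begin_evasive")
    clocks["evade_radar_missile"] = now + rng.uniform(250, 400)
    return min(clocks, key=clocks.get) == "radar_threat_kills_you"


def estimate_failure_probability(n: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the plan's failure probability from n sampled runs."""
    rng = random.Random(seed)
    return sum(sample_run(rng) for _ in range(n)) / n


print(estimate_failure_probability(10_000, seed=1))
```

For these made-up distributions the true failure probability is about 0.03, so an estimate from 10,000 runs should land close to that value.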
Acceptance Sampling
• Let p_F be the failure probability of a plan.
• Want to specify failure threshold θ such that:
– Plan is accepted if p_F ≤ θ.
– Plan is rejected if p_F > θ.
• Use acceptance sampling to decide whether to accept a plan.
• Exhaustive sampling is impossible, so we must expect errors:
– Type I error: reject acceptable plan.
– Type II error: accept rejectable plan.
• Want to bound probability of error.
Sequential Sampling
• Single sampling plan always requires fixed number of samples.
• Sequential sampling plan decides whether to generate more samples based on samples seen so far.
– Define acceptance number a_n and rejection number r_n at stage n.
– Accept plan if observed failures are at most a_n.
– Reject plan if observed failures are at least r_n.
• Intuitive sequential sampling:
– If you’ve already seen c failures at any iteration, then reject (r_n = c).
– If you cannot possibly see c failures in remaining iterations, then accept (a_n = c + i - n - 1).
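The intuitive rule can be written as a short Python sketch. It reads the accept formula with i as the current stage and n as the total number of samples in the underlying fixed-size plan; that reading, and the function itself, are assumptions for illustration. Stopping early never changes the verdict the full n-sample plan would give.

```python
from typing import Iterable, Tuple


def curtailed_test(samples: Iterable[bool], n: int, c: int) -> Tuple[str, int]:
    """Fixed plan: reject if at least c of n samples fail. Stop as soon as the verdict is certain.

    samples yields True for a failed run. Returns (verdict, samples used).
    """
    failures = 0
    for i, failed in enumerate(samples, start=1):
        failures += int(failed)
        if failures >= c:
            return "reject", i  # already seen c failures
        if failures + (n - i) < c:
            return "accept", i  # c failures can no longer be reached
    raise ValueError("fewer than n samples were supplied")


print(curtailed_test([False] * 10, n=10, c=3))
print(curtailed_test([True] * 10, n=10, c=3))
```

With n = 10 and c = 3, a run with no failures is accepted at stage 8, where c + i - n - 1 = 0, and a run of straight failures is rejected at stage 3.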
Number of Samples Required
Wald acceptance sampling requires significantly fewer samples.
[Figure: number of samples required vs. actual failure probability, comparing the static sampling plan with the Wald sequential sampling plan; the threshold is marked.]
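For reference, a textbook form of Wald's sequential probability ratio test for a Bernoulli failure probability, sketched in Python. The indifference region (p0, p1) around the threshold and the error bounds alpha (Type I) and beta (Type II) are parameters assumed for this sketch; the slides do not give the values used in CIRCA.

```python
import math
from typing import Iterable, Tuple


def sprt(samples: Iterable[bool], p0: float, p1: float,
         alpha: float, beta: float) -> Tuple[str, int]:
    """Test H0: p <= p0 (accept the plan) against H1: p >= p1 (reject the plan).

    samples yields True for a failed run. Returns (verdict, samples used).
    """
    reject_bound = math.log((1 - beta) / alpha)
    accept_bound = math.log(beta / (1 - alpha))
    llr = 0.0  # log-likelihood ratio of H1 versus H0
    i = 0
    for i, failed in enumerate(samples, start=1):
        llr += math.log(p1 / p0) if failed else math.log((1 - p1) / (1 - p0))
        if llr >= reject_bound:
            return "reject", i
        if llr <= accept_bound:
            return "accept", i
    return "undecided", i


print(sprt([False] * 200, p0=0.01, p1=0.05, alpha=0.05, beta=0.05))
print(sprt([True] * 200, p0=0.01, p1=0.05, alpha=0.05, beta=0.05))
```

The number of samples used depends on how quickly the evidence accumulates: with these parameters, a plan that never fails is accepted after 72 runs, while a plan that always fails is rejected after 2.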
Performance
• Expected number of required samples only depends on failure probability and threshold, not state space size!
• Domain-dependent factors affecting time to generate each sample:
– Time period considered (t_max).
– Mean values of the distribution functions F.
• In practice, this allows us to generate probabilistically-verified plans for very large domains that cannot be handled by complete (non-probabilistic) model-checking approaches.
Optimizing Plans in GSMPs
• Adding probabilistic delay distributions to timed automata yields Generalized Semi-Markov Process model:
– Efficient for representing real world.
– No analytic solutions available.
• Adding reward model gives opportunity for decision-theoretic solution criterion: maximize expected utility.
• Approach: generate plans and assess EU dominance using Monte Carlo sampling of GSMP executions.
– Backjump based on sample traces.
• New ideas: local search; evolutionary search in reaction space.
Learning in CIRCA: Not About Speedup
• Unique requirements on learning for mission-critical systems.
• Executive can learn primitive operator response times.
– Currently an offline, manual process transfers that knowledge to planner.
• Learning from failure: if a planned preemption does not occur, model can be revised because either:
– Action or reliable temporal process was slower than modeled bounds.
– Threat temporal process was faster than modeled bounds.
– Planner tells us what features to watch for learning.
• Learning from unexpected reachable states:
– Self-aware system knows what should happen, can explicitly build context-specific responses to surprises, including planning “harder”, invoking default safe modes, and revising models.
• Learning refined planner performance profiles.
– Performance monitoring during planning for a selected problem can indicate reduced probability of solution, prompt switch to different problem.
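The "learning from failure" case can be pictured as a bounds revision followed by a re-check of the preemption condition. This Python sketch is purely illustrative: the model fields and the update rule are assumptions, and the slides describe the current transfer of timing knowledge to the planner as an offline, manual process.

```python
def revise_bounds(model: dict, observation: dict) -> dict:
    """Widen modeled bounds to cover delays observed when a planned preemption was missed."""
    revised = dict(model)
    response = observation.get("response_delay")
    threat = observation.get("threat_delay")
    if response is not None and response > model["response_max"]:
        revised["response_max"] = response  # action or reliable temporal was slower than modeled
    if threat is not None and threat < model["threat_min"]:
        revised["threat_min"] = threat      # threat process was faster than modeled
    return revised


def still_preempts(model: dict) -> bool:
    return model["response_max"] < model["threat_min"]


# Bounds from the radar threat domain: 10 + 400 for the response, 1200 for the threat.
model = {"response_max": 410, "threat_min": 1200}

slower_response = revise_bounds(model, {"response_delay": 450})
faster_threat = revise_bounds(model, {"threat_delay": 400})
print(slower_response, still_preempts(slower_response))
print(faster_threat, still_preempts(faster_threat))
```

In the first case the revised model still guarantees preemption; in the second it does not, which would be a cue to replan.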
Summary: Cool Things About CIRCA
• Builds and executes plans that provide real-time performance guarantees.
• Automatically trades off mission goals with mission safety.
• Multiple CIRCA agents can coordinate and cooperate on teamed real-time behaviors.
• Self aware: adapts planning process to time available.
• The End.
Recent Advances
• Multi-agent CIRCA negotiates responsibilities (threats, goals).
• Coordinated real-time reactions.
• Planning with Generalized Semi-Markov Process models (GSMPs).
– Plans that maximize expected utility.
• Automatic dynamic abstraction.
• Deliberation scheduling.
[Figure: two CIRCA agents, each with an Adaptive Mission Planner, Controller Synthesis Module, and Real Time System; links between the agents are labeled Roles, Goals; Planned Actions, Planned Negotiations; Real-Time Reactions.]
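[Figure: mission phase graph with Phase1 through Phase5 and FAILURE; edges labeled with survival probabilities s1 to s4 (and 1-s1 toward failure) and rewards R3, R5.]
[Figure: histogram of planner time. X axis: Planner Time (secs), 5 to 50. Y axis: Frequency, 0 to 70. Series: 1 Threat, 2 Threats, 3 Threats, 4 Threats. 100 samples per threat group; 0.5-second-wide sample bins.]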