~sa-circa/talks/cog-arch-03/cog-arch-03.ppt Honeywell Laboratories Goals and Threats: Motivations in CIRCA David J. Musliner Honeywell Laboratories [email protected]

~sa-circa/talks/cog-arch-03/cog-arch-03.ppt

Honeywell Laboratories

Goals and Threats:

Motivations in CIRCA

David J. Musliner

Honeywell Laboratories

[email protected] (612) 951-7599

Still working on my (Michigan) thesis topic. 13 years since first CIRCA paper.


Presentation Outline

• CIRCA overview:

– Motivating domains.

– Architectural modules.

– Representations, including motivation mechanisms.

– Automatically planning real-time reactive plans.

• Probabilistic CIRCA:

– Motivating domains.

– Representation changes.

– Motivation changes.

– Planning methods changes.

• Learning in CIRCA


Motivating Domain Characteristics

• Time-critical, hazardous, open-world domains: CIRCA guarantees that it will respond in a timely way to threats in its environment, avoiding failures and pursuing goals.

– Requires robustness beyond human performance.

• Bounded reactivity: CIRCA reasons explicitly about the time needed for sensing and actions (“perceptual-motor limits”).

• Bounded rationality: CIRCA dynamically builds reactive plans for only the immediately relevant parts of the situation. CIRCA is self-aware, using meta-level deliberation scheduling to optimize its online planning process.


Multi-Agent Self-Adaptive CIRCA

Approach:

• Automatic synthesis and adaptation of guaranteed real-time controllers.

Performance:

• Reactive control responses to threats and contingencies in milliseconds.

• Coordinated multi-agent behaviors in tens of milliseconds.

• Dynamic reconfiguration of team mission plan in less than 10 seconds.

• Demonstrations in simulated UAV team domains: coordinated defense, dynamic replanning for contingencies.

Impact:

• Robust UAVs that rebuild their own control systems in response to contingencies (e.g., damage, target of opportunity).

• Smart UAV teams that actively coordinate distributed capabilities/resources to maximize mission effectiveness.

• Sponsor: DARPA ANTS.

• Teammate: Univ. of Michigan

Goal: Adaptive real-time coordination and control of multi-UAV teams.


Intelligent Real-Time Cyber Security

Approach:

• Use CIRCA to plan and execute reactive security controllers.

• Tailor responses automatically according to available resources, varying threat levels & security policies.

Performance:

• Fully autonomous operations defeating attacks in microseconds.

• Rapid reconfiguration for dynamic network assets, security state, threat profile.

• Demonstrations in real computer networks.

Impact:

• Real-time responses defeat manual and automated attack scripts.

• Automatic tradeoffs of security vs. service level and accessibility.

• System derives responses for novel attacks built from known components.

Sponsor: DARPA CyberPanel.

Teammate: Secure Computing

Goal: Automatic real-time response to computer security intrusions.

[Figure: security architecture. Labels: Intrusion Assessment; Security Tradeoff Planner; Controller Synthesis Module; Active Security Controller; Executive; Networks, Computers; Computing services; Attacks, intrusions.]


CIRCA Architecture

Adaptive Mission Planner: Divides an overall mission into multiple phases, with limited performance goals designed to make the planning problem solvable with available time and available execution resources. Deliberation scheduling.

Controller Synthesis Module: For each mission phase, plans a set of real-time reactions according to the constraints sent from AMP. Planning.

Real Time Subsystem: Continuously executes planned control reactions in hard real-time environment; does not “pause” waiting for new plans. Execution.

[Figure: CIRCA architecture. Labels: Adaptive Mission Planner; Controller Synthesis Module; Real Time System.]


How CIRCA Works

[Figure: a mission running from Start to Goal. Labeled steps: break down mission; generate controller (repeated for successive controllers); execute controller on the Real Time System.]

Execute controller:

if (state-1) then action-1
if (state-2) then action-2
...


Extending Performance Guarantees to Multi-Agent Teams

Adaptive Mission Planner: Negotiates roles and responsibilities between agents in collaborative team.

Controller Synthesis Module: Builds controllers that include coordinated actions by multiple agents.

Real Time Subsystem: Executes coordinated controllers predictably, including distributed sensing and acting.

Only system to guarantee timing of end-to-end multi-agent coordinated behaviors



[Figure: two CIRCA agents, each with its own Real Time System, linked at each level. Link labels: Roles, Goals; Planned Actions, Planned Negotiations; Real-Time Reactions.]


Real Time Subsystem (RTS)

• The RTS executes loops of Test-Action Pairs (TAPs).

• The RTS executes in parallel with the other CIRCA modules.

• Parallel execution permits re-planning using computationally-expensive algorithms while preserving platform safety.

• Special-purpose TAPs used to download and switch to next controller.

• RTS includes multiple TAP schedule caches to hold controllers before they are activated.

• Example TAP:

- If (radar-missile-tracking T) then begin-evasives with max-delay: 300 msec.

[Figure: example TAP schedule loop: test1/action1, test2/action2, test1/action1, test3/action3, test1/action1, test4/action4.]
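The TAP loop described on this slide can be pictured with a small sketch. Everything here is illustrative: the `TAP` class, `run_cycle`, and the dictionary-based state are hypothetical names, not CIRCA's actual RTS interface. The sketch only shows the test-then-act structure; it does not enforce deadlines, which in CIRCA are the scheduler's responsibility.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, str]  # hypothetical encoding: feature name -> value


@dataclass
class TAP:
    """A Test-Action Pair: run `action` whenever `test` holds."""
    name: str
    test: Callable[[State], bool]
    action: Callable[[State], None]
    max_delay_ms: int  # deadline the schedule must respect (not enforced here)


def run_cycle(schedule: List[TAP], state: State) -> List[str]:
    """Execute one pass over the TAP schedule; return names of TAPs that fired."""
    fired = []
    for tap in schedule:
        if tap.test(state):
            tap.action(state)
            fired.append(tap.name)
    return fired


def begin_evasives(state: State) -> None:
    # Assumed effect of the action, for illustration only.
    state["path"] = "evasive"


# The example TAP from the slide: if (radar-missile-tracking T) then begin-evasives.
evade = TAP(
    name="begin-evasives",
    test=lambda s: s["radar-missile-tracking"] == "T",
    action=begin_evasives,
    max_delay_ms=300,
)

state = {"radar-missile-tracking": "T", "path": "normal"}
fired = run_cycle([evade], state)
```

A real schedule would repeat frequently needed TAPs within one cycle so that each TAP's maximum delay is respected.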


Planning Real-Time Reactions

Transition-based input model similar to classical planners, but with temporal characteristics and non-volitional transitions.

[Figure: CSM inputs: available actions, “non-volitional” transitions, goal state description, initial state description. CSM components: State Space Planner, Verifier, TAP Compiler, Scheduler. CSM outputs: timed automata world model and executable reactive controller.]


CSM Functional Components

• State Space Planner predicts future threats and opportunities, plans actions with timing constraints for future states.

• Verifier reasons about complex temporal model to ensure that all failures are preempted.

• TAP compiler reduces timed automata controller model to time-constrained reactions (Test-Action Pairs).

• Scheduler builds executable cycle of TAPs to meet time constraints.



CSM Algorithm

• CSM essentially determines a strategy in a timed game against a worst-case adversary.

• Search loop iteratively selects a state and chooses action for that state.

– Heuristics guide choice for safety and goal achievement.

– Approximations indicate that timing will work.

– Formal reachability analysis called after each action choice, to confirm that all planned preemptions will occur.

– If failure reachable, path to failure can be used to backjump to most recent decision related to any state on the path.
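The reachability check in the last two bullets can be sketched in a much simplified, untimed form. This is an illustration only: the graph encoding, the state names, and `failure_path` are hypothetical, and CIRCA's verifier reasons about timed automata, not a plain graph. The sketch assumes the plan's preemptions are given as a set of edges to ignore, and returns the path to failure that a backjumping planner could inspect.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical encoding: state -> list of (transition name, successor state).
Graph = Dict[str, List[Tuple[str, str]]]


def failure_path(graph: Graph, start: str,
                 preempted: Set[Tuple[str, str]]) -> Optional[List[str]]:
    """Breadth-first search for a path from `start` to "FAILURE".

    `preempted` holds (state, transition) pairs the plan is assumed to
    preempt, so those edges are skipped. Returns the state path, or None
    if failure is unreachable.
    """
    parent: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == "FAILURE":
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for name, nxt in graph.get(state, []):
            if (state, name) in preempted or nxt in parent:
                continue
            parent[nxt] = state
            queue.append(nxt)
    return None


graph = {
    "tracking-F/normal": [("radar_threat", "tracking-T/normal")],
    "tracking-T/normal": [("radar_threat_kills_you", "FAILURE"),
                          ("begin_evasive", "tracking-T/evasive")],
    "tracking-T/evasive": [("radar_threat_kills_you", "FAILURE"),
                           ("evade_radar_missile", "tracking-F/evasive")],
}
unsafe = failure_path(graph, "tracking-F/normal", preempted=set())
safe = failure_path(graph, "tracking-F/normal", preempted={
    ("tracking-T/normal", "radar_threat_kills_you"),
    ("tracking-T/evasive", "radar_threat_kills_you"),
})
```

When the search returns a path, every state on it is a candidate culprit for revising an earlier action choice.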


CIRCA Motivations: Threats

• Threats represented by temporal transitions to failure (TTFs).

• CSM only returns plans that make failure unreachable, using:

– Prevention: planned actions never allow TTF preconditions to become true.

– Preemption: planned actions will definitely happen before TTFs.

[Figure: state diagram with states OK, Threatened, Safe, and Failure.]


Radar Threat Domain - 1

;; Radar-guided missile threats can occur at any time.
(make-instance 'event
               :name "radar_threat"
               :preconds '((radar_missile_tracking F))
               :postconds '((radar_missile_tracking T)))

;; You die if you don't defeat a threat by 1200 time units.
(make-instance 'temporal
               :name "radar_threat_kills_you"
               :preconds '((radar_missile_tracking T))
               :postconds '((failure T))
               :min-delay 1200)

;; It takes no more than 10 time units to start evasives.
(make-instance 'action
               :name "begin_evasive"
               :preconds '((path normal))
               :postconds '((path evasive))
               :max-delay 10)

;; We defeat missile in between 250 and 400 time units.
(make-instance 'reliable-temporal
               :name "evade_radar_missile"
               :preconds '((radar_missile_tracking T) (path evasive))
               :postconds '((radar_missile_tracking F))
               :delay (make-range 250 400))
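The delay bounds in this domain already suggest why a safe plan exists. The following is a back-of-the-envelope sketch, not the verifier's actual analysis: it assumes the threat's clock starts when tracking begins, that the responses run back to back, and that preemption requires the worst-case response time to be strictly below the threat's minimum delay. The function name `preempts` is hypothetical.

```python
# Delay bounds copied from the radar domain (time units).
BEGIN_EVASIVE_MAX = 10    # :max-delay of begin_evasive
EVADE_MAX = 400           # upper end of evade_radar_missile's delay range
THREAT_MIN = 1200         # :min-delay of radar_threat_kills_you


def preempts(response_max_delays, threat_min_delay):
    """True if the worst-case chain of responses completes strictly before
    the earliest time the threat can fire (assumed criterion)."""
    return sum(response_max_delays) < threat_min_delay


worst_case = BEGIN_EVASIVE_MAX + EVADE_MAX
ok = preempts([BEGIN_EVASIVE_MAX, EVADE_MAX], THREAT_MIN)
```

Under these assumptions the worst-case response takes 410 time units, well inside the 1200 time units the threat needs.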


Preemption as Key Planning Structure

Key Concept: Prevent Failure

[Figure: state diagram. RadarThreat leads from (radar-missile-tracking F, path normal) to (radar-missile-tracking T, path normal). From there, Radar-threat-kills-you leads to FAILURE, while Begin-evasive leads to (radar-missile-tracking T, path evasive) and is marked as a preemption.]


Non-Markov Temporal Model

[Figure: state diagram with FAILURE and the states (radar-missile-tracking F, path normal) and (radar-missile-tracking F, path evasive). Transition labels: RadarThreat, Begin-evasive, Evade-radar-missile.]

Why non-Markov? Efficient reactive plan construction.


CIRCA Motivations: Goals

• Represented by designation of specific desirable feature/value pairs.

• CSM heuristic guides system to choose actions that try to achieve (and re-achieve) maximum number of goal features.

• All goals are:

– Conjunctive.

– Optional.



;; Your goal is to continue flying normal path.
(make-instance 'goal :condition '((path normal)))

Optional elements for different planners:

• :reward

• :priority


Goals Drive Stabilization

[Figure: state diagram with FAILURE and the states (radar-missile-tracking F, path evasive) and (radar-missile-tracking F, path normal). Transition labels: RadarThreat, Begin-evasive, Evade-radar-missile, End-evasive.]


Dynamic Abstraction Planning

• Start with abstract states omitting all non-goal features.

• Incrementally and non-uniformly add features to states when required:

– When no safe actions are applicable.

– When goal achievement heuristic indicates.

• Result: planner decides what it needs to think about, when.

• Future direction: use this to guide what you attend to for learning.
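The refinement step in the bullets above can be sketched as splitting one abstract state on a newly added feature. This is a toy illustration under stated assumptions: the set-of-pairs state encoding and the `refine` function are hypothetical, and the sketch ignores how CIRCA decides which feature to add and when.

```python
from typing import FrozenSet, List, Tuple

# An abstract state is a set of (feature, value) pairs; features it omits
# are left unconstrained.
AbstractState = FrozenSet[Tuple[str, str]]


def refine(state: AbstractState, feature: str,
           values: List[str]) -> List[AbstractState]:
    """Split one abstract state on `feature`, yielding one child per value."""
    if any(name == feature for name, _ in state):
        return [state]  # already distinguishes this feature; nothing to split
    return [state | {(feature, value)} for value in values]


# A single abstract non-failure state that mentions no features yet.
non_failure: AbstractState = frozenset()
children = refine(non_failure, "path", ["normal", "evasive"])
```

Because only the selected state is split, other abstract states stay coarse, which is what makes the refinement non-uniform.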

[Figure: abstract state space before and after refinement. Before: non-failure, failure. After: non-failure split into path normal and path evasive, plus failure.]



AMP Responsibilities

• Divide mission into phases, subdividing them as necessary to handle resource restrictions.

• Negotiate with other AMPs to allocate goals and threats in each phase.

• Build problem configurations for each phase, to drive CSM.

• Modify problem configurations, both internally and via negotiation with other AMPs, to handle resource limitations.

• Tasks represent:

– Contracts to handle threats and goals.

- Need to announce, bid, award, and plan for them.

– Need to generate plan for a problem configuration.

– Need to download plan for a configuration to the RTS.

• On each AMP decision cycle, select and execute highest-priority task.

• New capability: deliberation scheduling.

– Estimate costs/benefits of different tasks: tie priority to utility.
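The decision cycle in the last bullets can be sketched as a priority queue of tasks. This is an illustration, not the AMP's implementation: the `Task` class, the task names, and especially the utility estimate (benefit minus cost) are assumptions made for the example.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(order=True)
class Task:
    neg_utility: float                           # heapq is a min-heap, so store -utility
    name: str = field(compare=False)
    run: Callable[[], str] = field(compare=False)


def add_task(agenda: List[Task], name: str, benefit: float, cost: float,
             run: Callable[[], str]) -> None:
    """Queue a task; its priority is an assumed utility estimate (benefit - cost)."""
    heapq.heappush(agenda, Task(-(benefit - cost), name, run))


def decision_cycle(agenda: List[Task]) -> Optional[str]:
    """Select and execute the highest-priority task, if any."""
    if not agenda:
        return None
    task = heapq.heappop(agenda)
    return task.run()


agenda: List[Task] = []
add_task(agenda, "plan-phase-2", 5.0, 2.0, lambda: "planned phase 2")
add_task(agenda, "download-phase-1", 9.0, 1.0, lambda: "downloaded phase 1")
add_task(agenda, "bid-on-contract", 4.0, 3.0, lambda: "bid sent")
first = decision_cycle(agenda)
```

Tying priority to an explicit utility estimate is what lets the same loop serve as a simple deliberation scheduler.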


Negotiated Allocation of Mission Goals

• Adaptive Mission Planners negotiate to distribute:

– Long-term mission goals.

– Roles: predefine responsibilities/concerns as context for negotiation.

– Performance evaluation responsibilities.

• Enhanced Contract-Net style negotiation.

– Adaptivity/dynamics.


AMP Deliberation Scheduling

• Mission phases characterized by:

– Probability of survival/failure.

– Expected reward.

– Expected start time and duration.

• Agent keeps reward from all executed phases.

• Different CSM problem configuration operators yield different types of plan improvements.

– Improve probability of survival.

– Improve expected reward (number or likelihood of goals).

• Configuration operators can be applied to same phase in different ways (via parameters).

• Configuration operators have different expected resource requirements (computation time/space).


Extensible CIRCA Architecture

• Well-defined API for each CIRCA module.

– E.g., CSM: problem specification and algorithm controls in; reactive plans and planning process monitoring out.

• Well-defined API for components within modules.

– E.g., API for state-space planner interaction with verifier:

- State space model in to verifier.

- Safety assessment plus optional culprit state trace out from verifier.

• Has allowed us to plug in different planners and verifiers for different domain representations and verifier approaches:

– Timed automata: Kronos, RTA, CSV, RTA-incremental, CSV-incremental; DAP, pDAP, regular planners.

– Safety-oriented Generalized semi-Markov: Monte Carlo sampler.

– Maximizing expected utility: MC sampler; evolutionary reaction-space search engine.

• Different executives: RTS, CLIPS.


Memory and Models in CIRCA

Adaptive Mission Planner:

• Mission model: phases with threats & goals.

• Mappings from threats/goals to partial CSM input models (sets of transitions).

• CSM performance profiles.

Controller Synthesis Module:

• Transition models from AMP.

• Timing model of RTS.

Real Time System:

• Cached TAP controllers.

• Currently sensed state features.


Probabilistic Reactive Planning

• Add transition probabilities to state model.

– World transitions and controlled actions.

• Sample simulated executions of the current plan to estimate probability of reaching different states.

• Build plans that handle most-probable states.

• Allows CIRCA to trade off planning time and plan complexity against system safety.

• Allows CIRCA to optimize expected utility of plans, trading off safety against mission objectives (goals, reward model).


Probabilistic World Model Dynamics

• The world model is a generalized semi-Markov process (GSMP).

• The world occupies a single state at any point in time.

• Enabled transitions in the current state compete to trigger.

• One transition triggers in each state, determining the next state.

• Non-Markovian because trigger distributions depend on holding times.

• There are no analytic solutions for unrestricted GSMPs.

• Must use a sampling-based approach to estimate state probabilities.

– Simpler: estimate whether failure is too likely.
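The race between enabled transitions can be sketched as a small simulator. This is a toy model under loud assumptions: the `MODEL` and `DELAY` tables, the uniform delay distributions, and the upper bound of 2000 on the threat delay are all invented for illustration. The one point it tries to show is the non-Markov behavior: a transition that stays enabled across a state change keeps its already sampled trigger time.

```python
import random

# Hypothetical model: state -> list of (transition name, next state).
MODEL = {
    "tracking-T/normal": [("radar_threat_kills_you", "FAILURE"),
                          ("begin_evasive", "tracking-T/evasive")],
    "tracking-T/evasive": [("radar_threat_kills_you", "FAILURE"),
                           ("evade_radar_missile", "tracking-F/evasive")],
}
# Assumed delay samplers; the distributions are chosen only for illustration.
DELAY = {
    "radar_threat_kills_you": lambda rng: rng.uniform(1200, 2000),
    "begin_evasive": lambda rng: rng.uniform(0, 10),
    "evade_radar_missile": lambda rng: rng.uniform(250, 400),
}


def simulate(start, rng, t_max=5000.0):
    """Sample one execution; return the state reached when no transition
    is enabled or time runs out."""
    state, now, clocks = start, 0.0, {}
    while state in MODEL and now < t_max:
        enabled = MODEL[state]
        names = [name for name, _ in enabled]
        # Transitions that remain enabled keep their trigger times.
        clocks = {n: t for n, t in clocks.items() if n in names}
        for n in names:
            if n not in clocks:
                clocks[n] = now + DELAY[n](rng)  # newly enabled: sample a delay
        # The enabled transition with the earliest trigger time wins the race.
        winner, nxt = min(enabled, key=lambda e: clocks[e[0]])
        now = clocks.pop(winner)
        state = nxt
    return state


outcome = simulate("tracking-T/normal", random.Random(0))
```

Counting how often sampled executions end in `FAILURE` gives the kind of estimate that the sampling-based approach relies on.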


Acceptance Sampling

• Let $p_F$ be the failure probability of a plan.

• Want to specify failure threshold $\theta$ such that:

– Plan is accepted if $p_F \le \theta$.

– Plan is rejected if $p_F > \theta$.

• Use acceptance sampling to decide whether to accept a plan.

• Exhaustive sampling is impossible, so we must expect errors:

– Type I error: reject acceptable plan.

– Type II error: accept rejectable plan.

• Want to bound probability of error.


Sequential Sampling

• Single sampling plan always requires fixed number of samples.

• Sequential sampling plan decides whether to generate more samples based on samples seen so far.

– Define acceptance number $a_n$ and rejection number $r_n$ at stage $n$.

– Accept plan if observed failures are at most $a_n$.

– Reject plan if observed failures are at least $r_n$.

• Intuitive sequential sampling:

– If you’ve already seen $c$ failures at any iteration, then reject ($r_i = c$).

– If you cannot possibly see $c$ failures in remaining iterations, then accept ($a_i = c + i - n - 1$, where $i$ is the current iteration and $n$ here is the total number of samples).
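The intuitive sequential plan can be sketched as an early-stopping loop around a fixed-size sampling plan. The function name `curtailed_test` and its interface are hypothetical. The sketch assumes a plan with at most `n` samples and a rejection count `c >= 1`: it rejects as soon as `c` failures are seen, and accepts at iteration `i` once the failures seen so far plus the `n - i` remaining samples can no longer reach `c`.

```python
def curtailed_test(sample_fails, n, c):
    """Sequential version of a fixed-size sampling plan with n samples.

    `sample_fails` is a zero-argument callable that returns True when one
    simulated execution fails. Returns (decision, samples_used).
    """
    failures = 0
    for i in range(1, n + 1):
        if sample_fails():
            failures += 1
        if failures >= c:                  # rejection number: c
            return "reject", i
        if failures <= c + i - n - 1:      # c failures are no longer reachable
            return "accept", i
    raise AssertionError("unreachable when c >= 1: a decision is made by i = n")


always_fails = curtailed_test(lambda: True, n=10, c=3)
never_fails = curtailed_test(lambda: False, n=10, c=3)
```

The acceptance condition follows from requiring `failures + (n - i) < c`, which rearranges to `failures <= c + i - n - 1`.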


Number of Samples Required

Wald acceptance sampling requires significantly fewer samples.

[Figure: number of samples required as a function of actual failure probability, comparing the static sampling plan with the Wald sequential sampling plan; the threshold is marked.]
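For reference, Wald's sequential probability ratio test for a failure probability can be sketched as follows. The slides do not give the parameters used in CIRCA, so everything named here is an assumption: `p0` and `p1` bracket the threshold with an indifference region (`p0 < p1`), `alpha` bounds the Type I error (rejecting an acceptable plan), and `beta` bounds the Type II error (accepting a rejectable plan).

```python
import math


def wald_sprt(samples, p0, p1, alpha, beta):
    """Wald's sequential probability ratio test on Bernoulli samples.

    `samples` is an iterable of booleans (True = the sampled execution failed).
    Tests H0: p <= p0 (accept plan) against H1: p >= p1 (reject plan).
    Returns (decision, samples_used).
    """
    accept_bound = math.log(beta / (1 - alpha))
    reject_bound = math.log((1 - beta) / alpha)
    fail_step = math.log(p1 / p0)                # log-likelihood ratio of a failure
    ok_step = math.log((1 - p1) / (1 - p0))      # log-likelihood ratio of a success
    llr, n = 0.0, 0
    for failed in samples:
        n += 1
        llr += fail_step if failed else ok_step
        if llr <= accept_bound:
            return "accept", n
        if llr >= reject_bound:
            return "reject", n
    return "undecided", n


decision, used = wald_sprt([True] * 10, p0=0.05, p1=0.15, alpha=0.05, beta=0.05)
```

The number of samples consumed depends on how far the observed failure rate lies from the indifference region, not on a fixed plan size.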


Performance

• Expected number of required samples only depends on failure probability and threshold, not state space size!

• Domain-dependent factors affecting time to generate each sample:

– Time period considered ($t_{max}$).

– Mean values of the distribution functions F.

• In practice, this allows us to generate probabilistically-verified plans for very large domains that cannot be handled by complete (non-probabilistic) model-checking approaches.


Optimizing Plans in GSMPs

• Adding probabilistic delay distributions to timed automata yields Generalized Semi-Markov Process model:

– Efficient for representing real world.

– No analytic solutions available.

• Adding reward model gives opportunity for decision-theoretic solution criterion: maximize expected utility.

• Approach: generate plans and assess EU dominance using Monte Carlo sampling of GSMP executions.

– Backjump based on sample traces.

• New ideas: local search; evolutionary search in reaction space.


Learning in CIRCA: Not About Speedup

• Unique requirements on learning for mission-critical systems.

• Executive can learn primitive operator response times.

– Currently an offline, manual process transfers that knowledge to planner.

• Learning from failure: if a planned preemption does not occur, model can be revised because either:

– Action or reliable temporal process was slower than modeled bounds.

– Threat temporal process was faster than modeled bounds.

– Planner tells us what features to watch for learning.

• Learning from unexpected reachable states:

– Self-aware system knows what should happen, can explicitly build context-specific responses to surprises, including planning “harder”, invoking default safe modes, and revising models.

• Learning refined planner performance profiles.

– Performance monitoring during planning for a selected problem can indicate reduced probability of solution, prompt switch to different problem.


Summary: Cool Things About CIRCA

• Builds and executes plans that provide real-time performance guarantees.

• Automatically trades off mission goals with mission safety.

• Multiple CIRCA agents can coordinate and cooperate on teamed real-time behaviors.

• Self aware: adapts planning process to time available.


• The End.


Recent Advances

• Multi-agent CIRCA negotiates responsibilities (threats, goals).

• Coordinated real-time reactions.

• Planning with Generalized Semi-Markov Process models (GSMPs).

– Plans that maximize expected utility.

• Automatic dynamic abstraction.

• Deliberation scheduling.



[Figure: two CIRCA agents, each with its own Real Time System, linked at each level. Link labels: Roles, Goals; Planned Actions, Planned Negotiations; Real-Time Reactions.]

[Figure: mission phase graph with nodes Phase1 through Phase5 and FAILURE. Edge and node labels: s1, 1-s1, s2, s3, s4, R3, R5.]

[Figure: histogram of frequency versus planner time (secs) for 1, 2, 3, and 4 threats; 100 samples per threat group, 0.5-second-wide sample bins.]
