
Cognitive Support for Intelligent Survivability Management

CSISM TEAM

June 21, 2007

Outline

• Introduction

• Status, results and plans for technical thrusts
  – Multi-layer reasoning for cyber-defense administration
    • Knowledge representation and rules for system-wide reasoning (OLC)
    • Fast containment response and policies (ILC)
  – Improving defense parameters and strategies by learning augmentation
  – Implementation and integration

• Conclusions

CSISM Introduction and Background

Problem Domain: Self-Regenerative Systems

Our focus: automated interpretation of observations and response selection.

[Figure: level of service over time after the start of a focused attack, shown against the level of service without attack. Curves: undefended, survivable (3rd gen.), regenerative.]

Graceful degradation: adaptive response limited to static use of diversity and policy; event interpretation and response selection by human experts.

Retain level of service and improve defense: static and dynamic use of artificial diversity; use of wide-area distribution; automated interpretation of observations and response selection, augmented by learning from past experience.

• Cyber-defense
• Survivable systems
• Automated …
• Self-improving …

Cyber-Defense Decision-Making Landscape

[Figure: decision-making landscape. Axes: level of automation; scale & complexity of context; generality of scope. Labels: considerable human involvement; automated expert behavior; mostly autonomic; single application; DoD-relevant information system; virtualization of DoD-relevant information system. Plotted items: 3rd generation (DPASA); SRS Phase 1; CSISM.]

Challenges

• Goal: automate the reasoning performed by expert cyber-defense administrators
  – Effective, reusable, easy to port and retarget

• Challenges:
  – Making sense of low-level information (alerts, observations) to drive low-level defense mechanisms (block, isolate, etc.) such that higher-level objectives (survive, continue to operate) are achieved
  – Doing it as well as human experts
  – Additional difficulties
    • Rapid, real-time decision-making and response
    • Uncertainty due to incomplete and imperfect information
    • Widely varying operating conditions (from no alerts to hundreds of alerts per second)
    • New symptoms and changes in the adversary’s strategy

Approach

– Multi-perspective, multi-hypothesis deliberation
  • Keep all options open: delay the bindings
  • Divide and conquer
– Response selection based on current utility as well as potential adversarial counter-response
  • A simple “match” is insufficient against an intelligent adversary
  • Unpredictability to counter gaming
– Contain while deliberating
  • Buy time
– Learning-based dynamic modification of defense parameters and strategies
  • “Immunity” against repeats and variants

[Figure labels: Interpret; Select Response; ILC; OLC; Learning; Multi-layer reasoning]

Knowledge Representation and Rules for System-wide Reasoning

Objectives

• Represent knowledge of cyber-defense
• Allow reasoning about attack and defense, including look-ahead
• Automate most reasoning
• Encode enough detail to estimate relative goodness of alternatives in most situations
• Extract knowledge from Red Team encounters; attempt to generalize
• Separate generic, reusable knowledge from system-specific knowledge

Achievements

• Classification of knowledge
• Classification of reasoning
• Breadth-first:
  • Relationship between alerts, accusations, corruption, flooding, failures
  • Instantiate for DPASA
• Depth-first:
  • DPASA registration protocol
  • Run 6, Nov 2005 Red Team exercise
• Encode knowledge and reasoning
  • 1st-order logic prototype
  • Soar rules and data
  • Representing concepts, instances and relations: use of a common ontology (Adventium’s Netbase)

Kinds of Knowledge

1. Symptomatic: possible explanations for a given anomalous event
   – Both generic and system-specific
2. Relational: constraints that reinforce or eliminate possible explanations
   – Mostly system-specific
3. Teleological: possible attacker goals and actions that may be used to accomplish the goals
   – Mostly generic
4. Reactive: possible defensive countermeasures for a given attack
   – Both generic and system-specific

Focus so far has been on 1, 2, and 4.

Kinds of Reasoning

• Restrictive
  – From observations of past events and knowledge of system properties, deduce good explanations and good defensive responses
  – (the reasoning restricts what is possible)
• Predictive
  – Look ahead, comparing alternatives

Focus so far has been on restrictive reasoning.

Example from Run 11, Nov 2005

[Figure: Server 1 (Linux), Server 2 (Windows), Server 3 (Solaris) and Server 4 (Linux), with three accusation arrows, one labeled “accusation: violated protocol”.]

Reasoning: Under the most likely assumption (no common-mode failure and exploit of at most one OS), Servers 2 and 3 can’t both be lying, so Server 1 must be corrupt. It’s not restartable, so quarantine it. Note that no information source is completely trusted.
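The restrictive reasoning in this example can be sketched as a brute-force search over hypotheses. What follows is a minimal Python sketch, not the project's Soar or Prover9 encoding. The host names, the `ACCUSATIONS` list (Servers 2 and 3 each accusing Server 1, which the reasoning text implies but the figure does not confirm), and the rule that an accusation is explained when either party is corrupt are all assumptions made for illustration.

```python
from itertools import combinations

# Operating system of each server, as given on the slide.
HOSTS = {"server1": "linux", "server2": "windows",
         "server3": "solaris", "server4": "linux"}

# Assumed accusations (accuser, accused); the arrow directions are inferred
# from the reasoning text, not recovered from the figure.
ACCUSATIONS = [("server2", "server1"), ("server3", "server1")]


def single_os(corrupt):
    """Slide assumption: the attacker exploits at most one OS."""
    return len({HOSTS[h] for h in corrupt}) <= 1


def explains(corrupt, accusations):
    """An accusation is explained if the accused is corrupt (it is true)
    or the accuser is corrupt (it may be a lie)."""
    return all(accuser in corrupt or accused in corrupt
               for accuser, accused in accusations)


def consistent_hypotheses(accusations):
    """Enumerate every set of corrupt hosts that satisfies both constraints."""
    names = sorted(HOSTS)
    result = []
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            corrupt = set(subset)
            if single_os(corrupt) and explains(corrupt, accusations):
                result.append(corrupt)
    return result


def certainly_corrupt(accusations):
    """Hosts that are corrupt in every consistent hypothesis."""
    hypotheses = consistent_hypotheses(accusations)
    return set.intersection(*hypotheses) if hypotheses else set()


print(certainly_corrupt(ACCUSATIONS))
```

Under these assumptions Server 1 is the only host present in every surviving hypothesis, which mirrors the slide's conclusion. Server 4 remains possible but unproven, since it shares an OS with Server 1.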

(Simplified) Example from Run 6, Nov 2005

[Figure: Monitors 1 to 4, Client 1, Client 2 and the Client 2 LAN, with “comm” links and “accusation: no heartbeats” arrows.]

Reasoning: All 4 monitors claim to have received communication from one client but accuse another client of not delivering heartbeats. They can’t all be lying. The communication path for some must be OK, so either Client 2 or its LAN is bad. Ping Client 2 to determine which.
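One way to picture this information-flow reasoning is path-based exoneration: any component that lies on a working path is cleared, and whatever remains on the failing paths is suspect. The Python sketch below is illustrative only. The component names and the `path` layout (monitor, a shared `core_network`, client LAN, client) are hypothetical, and the sketch takes the monitors' reports at face value, which stands in for the "they can't all be lying" step.

```python
MONITORS = ["monitor1", "monitor2", "monitor3", "monitor4"]


def path(monitor, client):
    """Components a message crosses between a client and a monitor.
    This four-hop layout is an assumption, not the DPASA topology."""
    return {monitor, "core_network", f"{client}_lan", client}


def suspects(working_paths, failing_paths):
    """Components on some failing path that no working path exonerates."""
    exonerated = set().union(*working_paths)
    implicated = set().union(*failing_paths)
    return implicated - exonerated


def refine_with_ping(candidates, ping_ok):
    """A successful ping shows the LAN delivers packets, so clear it.
    A failed ping does not narrow the candidates in this sketch."""
    if ping_ok:
        return {c for c in candidates if not c.endswith("_lan")}
    return set(candidates)


# All four monitors hear from client1 but get no heartbeats from client2.
working = [path(m, "client1") for m in MONITORS]
failing = [path(m, "client2") for m in MONITORS]

candidates = suspects(working, failing)
print(sorted(candidates))
print(sorted(refine_with_ping(candidates, ping_ok=True)))
```

The first result leaves Client 2 and its LAN as the only candidates, matching the slide. The ping step then separates the two when the ping succeeds.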

OLC Reasoning Flow

Input: event reports and observations

Event Interpretation
• Reason about bad behavior: create an initial baseline interpretation of the reported event and observation (one entity in the system accuses another)
• Reason about info. flow: refine the interpretation by considering the potential sources of omission or corruption implied in the accusation
• Reason about attacker goal: further refinement; reduce the potential set of failures and corruptions by considering attacker objectives and assumptions
• Reason about the context: additional refinement; eliminate candidate failures and corruptions by considering the current scenario or workflow state

Intermediate candidate hypotheses flow between these stages, with a conditional jump to response selection. Output: hypotheses (potential conditions explaining the observed state).

Response Selection
• Match responses for the candidate hypotheses
• Select responses providing most utility
• Look ahead a fixed number of steps for possible adversary counter-response

Intermediate candidate responses flow between these stages, with a conditional jump to response engagement. Output: response selected for execution.
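The response-selection half of this flow can be sketched in a few lines of Python. This is a simplified illustration, not the OLC's Soar rules. The `Response` class, the playbook contents, the utility numbers, and the choice of a single look-ahead step (the slide only says a fixed number of steps) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    name: str
    utility: float  # immediate benefit minus cost, on an assumed scale
    counters: dict = field(default_factory=dict)  # adversary counter-move -> loss


def match_responses(hypotheses, playbook):
    """Collect the responses listed for each candidate hypothesis."""
    matched = []
    for hypothesis in hypotheses:
        for response in playbook.get(hypothesis, []):
            if response not in matched:
                matched.append(response)
    return matched


def lookahead_value(response):
    """One-step look-ahead: assume the adversary picks the most damaging counter."""
    worst_loss = max(response.counters.values(), default=0.0)
    return response.utility - worst_loss


def select_response(hypotheses, playbook):
    """Pick the matched response with the best value after look-ahead."""
    candidates = match_responses(hypotheses, playbook)
    if not candidates:
        return None
    return max(candidates, key=lookahead_value)


# Hypothetical playbook for one hypothesis.
quarantine = Response("quarantine host", 5.0, {"attack a replica": 1.0})
restart = Response("restart process", 6.0, {"re-exploit same flaw": 4.0})
playbook = {"server1 corrupt": [quarantine, restart]}

print(select_response(["server1 corrupt"], playbook).name)
```

With these made-up numbers, a pure utility match would prefer the restart, while the look-ahead prefers quarantine because the restart invites a costly counter-move.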

Rapid Prototyping

Use automatic theorem prover
– “prover9”, McCune, UNM
– 1st order
– Encode restrictive reasoning

– Advantages over Soar:
  – Existing algorithm for deep reasoning
  – Easier to get started
– Disadvantages compared to Soar:
  – Goals are not selected automatically
  – Reasoning algorithm can’t be controlled
  – Non-1st-order reasoning not available

Encoding in Soar

Soar is based on more than 20 years of research into human cognition. It uses pattern-directed inference and hierarchical control to reason in a manner similar to human thinking.

The OLC inference engine will use coherence theory to search for a set of hypotheses that is maximally consistent with the observations and with its experience. We anticipated the need, but our implementation has not yet faced a situation that requires it.

Use of standard ontology and Protégé

Managing the complexity of knowledge acquisition

Use of Herbal to generate Soar rules from a higher-level representation

Conclusion and Next Steps

• A good start:
  • Knowledge and reasoning sufficient for defense of DPASA in some Red Team exercises, e.g., Run 6
  • Rough estimate of coverage:
    • Existing rules would reason about all alerts and defend successfully in roughly half of the Nov 2005 runs in which human operators also defended successfully
    • The second half will be harder
• Needed now:
  • Immediately: rules for flooding; redundant groups; phases of mission
  • Soon: attacker objectives in larger-scale attacks

Fast Containment Response and Policies

Inner Loop Controller (ILC) Objectives

Attempt to contain and correct the problem at the earliest stage possible

• Policy driven: implement policies and tactics from the OLC on a single host
• Autonomous: high-speed response; can work when disconnected from the OLC by an attack or failure
• Flexible: policies can be updated at any time
• Adaptive: use learned characteristics of the host and monitored services to tune the policy
• Low impact on mission: able to back out of defensive decisions when warranted

[Figure: ILC architecture. Components: Policy DB; Chk Pt DB; HW/OS Watchdog; AppController; AppFactory; ILC; App1; App2; Outer Loop Control; Remote App. Labels: policy layer; sensors; actuators; control; data; instantiate.]

Survey of ILC Work

• Requirements
  – The threat model, performance, range of sensing and response, OLC communications
• Design
  – Study typical applications and recovery needs
• Policies
• First Prototype
  – Dynamically configurable rule-based policies
• Plans for Integration and Testing
  – With the testbed emulating the DPASA survivable JBI
  – As a stand-alone program on a real host

ILC Prototype-1 Architecture

• Java Driver Program
  – Instantiate reasoning components, start load
• System API
  – OLC communications
  – Sensing and response
• Jess Inference Engine
• Policy Modules
  – For each application and service monitored

[Figure: Java Driver; Jess Rule Engine; System API (Java+Jess); policy modules A, B, C, D as Jess facts and rules; saved state files.]

Components of ILC Response

[Figure: internal objects used in implementing ILC responses. Monitored Service S (status, settings); Detection Rules for S Problem Types; Problem Instance P; Evidence E; Problem Types and Response Policies; Detection API; Response API; internal timers.]
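The relationship between these objects can be sketched in Python. The prototype itself uses Java and Jess, so this is only an illustration of the data flow, and every class, field, rule and policy name below is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoredService:
    name: str
    status: dict    # latest sensor readings
    settings: dict  # thresholds used by the detection rules


@dataclass
class ProblemInstance:
    service: str
    problem_type: str
    evidence: list = field(default_factory=list)


def detect(service, detection_rules):
    """Run each detection rule; a rule returns evidence text or None."""
    problems = []
    for problem_type, rule in detection_rules.items():
        evidence = rule(service)
        if evidence is not None:
            problems.append(ProblemInstance(service.name, problem_type, [evidence]))
    return problems


def respond(problem, response_policies):
    """Look up the ordered response actions for a problem type.
    Unknown problem types fall back to reporting to the OLC (assumed default)."""
    return response_policies.get(problem.problem_type, ["report_to_olc"])


def heartbeat_rule(service):
    """Hypothetical rule: flag a service whose heartbeat is overdue."""
    age = service.status["seconds_since_heartbeat"]
    if age > service.settings["heartbeat_timeout"]:
        return f"no heartbeat for {age}s"
    return None


detection_rules = {"unresponsive": heartbeat_rule}
response_policies = {"unresponsive": ["restart_service", "report_to_olc"]}

svc = MonitoredService("psq_server",
                       status={"seconds_since_heartbeat": 42},
                       settings={"heartbeat_timeout": 30})
for problem in detect(svc, detection_rules):
    print(problem.problem_type, problem.evidence, respond(problem, response_policies))
```

Keeping detection rules and response policies in separate tables is one way to let a policy be swapped at run time without touching the detection side.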

ILC Status – June 2007

• Requirements and design for ILC
• Working Java Driver
  – Initializes Jess inference engine
  – Remote access to ILC for policy manipulation or remote debugging
• Preliminary System API modules for
  – ILC embedded in emulated test environment
  – Standalone ILC for Linux host
  – Initial ties with learning/adaptation module
• Sample policy modules
  – For SELinux, EFWAgent (typical defense mechanisms)

Next Steps

• Integration with emulated test environment
  – Flesh out API, make compatible with ontology
  – Explore interactions with OLC, e.g. strategies involving dynamic ILC policy changes
  – Complete ties to the learning module
• More sample application policies
  – Explore broader range of behaviors, e.g. nondeterminism
• Standalone testing
  – Install ILC on workstation and/or server and monitor live applications/services
  – Probe ILC response under failures and attacks

Improving Defense Parameters and Strategies

Learning Augmentation: Motivation

• Why learning?
  – Extremely difficult to capture all the complexities of the system, particularly interactions among activities
  – The system is dynamic (static configuration gets out of date)
• CSISM will learn to
  – Improve the defensive posture
    • Better knowledge (about the attacks or attacker), better policies
  – Improve how the system responds to symptoms
    • Better connection between response actions and their triggers

Adaptation is the key to survival

Development Plan for Learning in CSISM

1. Responses under normal conditions (Calibration)

2. Situation-dependent responses under attack conditions

3. Multi-stage attacks

Analysis: RegTime by Quad

Quads 0 & 1 are slower than Quads 2 & 3.

Complex domain: human calibration (incorrectly) claimed that Quad 1 was slowest, missing Quad 0.

Analysis: Registration Times by Client Type

caf_plan, chem_haz and maf_plan are slower than other clients.

Complex domain: human calibration (incorrectly) claimed that caf_plan & maf_plan were slowest because of hand-typed password, missing chem_haz.

Step 1: Calibration

• Calibrate the parameters of rules for normal operating conditions
  – Important first step because it learns how to respond to normal conditions
  – Initially, timing parameters from ILC, e.g.
    • Client registration, PSQ server local probes, SELinux enforcement, SELinux flapping, file integrity checks
• Core challenge: (+ handles well, - handles poorly)
  – Human: + good data; - complex environment; - dynamic system
  – Offline training: + good data; + complex environment; - dynamic system
  – Online training: - unknown data; + complex environment; + dynamic system
  – CSISM’s experimental sandbox: + good data (self-labeled); + complex environment; + dynamic system
    • Very hard for an adversary to “train” the learner
    • Sandbox approach successfully tried in SRS Phase 1

Step 1: Calibration

• Using algorithm of Last & Kandel
  – Calculates a membership score for each sample, based on how similar it is to nearby samples (the distance-to-density ratio)
    • If score < threshold, it is an outlier
  – It can make estimates even for multi-modal data

[Figure: scatter of samples plotted against Score, with a horizontal Threshold line.]
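The idea of scoring each sample by a distance-to-density ratio and flagging low scores can be illustrated with a short Python sketch. This is a simplified stand-in, not the published Last & Kandel formula. The neighbour count `k`, the `radius` used for the density estimate, the logistic squashing, and the value of `beta` are all assumptions chosen for the toy data, and the `beta` here is not on the same scale as the Beta values in the result slides.

```python
import math


def membership_scores(samples, k=3, radius=1.0, beta=0.5):
    """Score each sample in (0, 1]; higher means more typical.

    distance = mean distance to the k nearest other samples
    density  = fraction of samples (including itself) within `radius`
    score    = 2 / (1 + exp(beta * distance / density))
    Requires at least two samples.
    """
    n = len(samples)
    scores = []
    for i, x in enumerate(samples):
        dists = sorted(abs(x - y) for j, y in enumerate(samples) if j != i)
        nearest = dists[:k]
        mean_dist = sum(nearest) / len(nearest)
        density = (1 + sum(d <= radius for d in dists)) / n
        ratio = mean_dist / density
        # Clamp the exponent so math.exp cannot overflow on extreme outliers.
        scores.append(2.0 / (1.0 + math.exp(min(beta * ratio, 700.0))))
    return scores


def outliers(samples, threshold=0.5, **kwargs):
    """Samples whose membership score falls below the threshold."""
    scores = membership_scores(samples, **kwargs)
    return [x for x, s in zip(samples, scores) if s < threshold]


# Toy multi-modal data: two clusters of timings plus one stray value.
timings = [1.0, 1.1, 0.9, 1.2, 5.0, 5.1, 4.9, 5.2, 20.0]
print(outliers(timings))
```

Because the score depends only on a sample's neighbourhood, both clusters score well even though they are far apart, which is the multi-modal property the slide highlights.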

Results for CombOps Registration

[Figure: score curves for Beta=0.001, Beta=0.0025 and Beta=0.005, with a green box marking the accepted region.]

If the threshold were 0.90, then x-values inside the green box would be OK.

Results for all Registration times

[Figure: score curves for Beta=0.0001 and Beta=0.0005.]

Results for all Registration times

[Figure: score curve for Beta=0.0005.]

In the demo, you’ll see these two “shoulder” points, indicating upper and lower limits.

As more observations are collected, the estimates become more confident of the range of expected values (i.e., tighter estimates to observations).

Status, Development Plan & Future Steps

1. Responses under normal conditions (Calibration)
   a. Analyze DPASA data (done)
   b. Integrate with ILC (single node) (done)
   c. Add experimentation sandbox (single node)
   d. Calibrate across nodes

2. Situation-dependent responses under attack conditions

3. Multi-stage attacks

Implementation and Integration

Objectives and Assumptions

• Objectives
  – CSISM components should be reusable and portable
    • Maximize genericity, with clear demarcation between specific and generic parts
    • Standardized representation, generating CSISM internal representations from a higher-level specification
  – Evaluation framework should be “system scale”, easy to construct, easy to inject attack effects into, easy to interface with
    • Emulation
• Assumptions
  – Soar can process alerts as fast as they are generated (not to say that the OLC input will not be flooded)
  – The survivable system ensures that alerts make it to the OLC and Learner
  – The survivable system ensures the ILC process runs with higher privilege
  – If the target is not corrupt, the OLC’s command will be executed by the survivable system
  – Source IP addresses are not spoofed (can be satisfied by the ADF cards)
• Challenges addressed
  – Standardized representation of concepts, instances and relationships involved in a survivable system
  – Time handling in reasoning and evaluation
  – Thread handling in the reasoning engine

Integration Framework

Achievement Summary

• OLC
  – Reasoning about accusations, information flow, and some context- and protocol-specific situations, covering all alerts in half of the DPASA attack runs
    • A subset of these is exercisable by the emulated testbed; the rest are tested from Soar (apart from rapid prototyping in Prover9)
• ILC
  – Confirmation that reactive response policies for typical defended applications or defense mechanisms can be built from small, reusable rule-based components
• Learning augmentation
  – Calibration: setup and initial example (e.g., registration time)
• Validation framework for CSISM capabilities
  – Emulation of a subset of ODV survivable JBI implemented, ongoing
• Integration
  – OLC and system under test
  – Learner and ILC

Next Steps

• Challenges/obstacles?
  – Consistent set of hypotheses
    • Coherence theory
• Plan for next steps in individual tasks
  – Outlined in earlier sections
• Plan for next steps in integration
  – KR work fully integrated with the OLC and system under test
  – Fuller emulation
  – ILC and system under test integration
  – ILC-OLC and Learning-OLC integration
  – More attack variations and support for Red Team access
  – Improved viewport into reasoning and metrics

Conclusion

• Good start, gathered momentum
• Preliminary results are promising
  – OLC coverage
  – ILC feasibility
  – Learning insights
• Cross-project integration potential
  – Looked into SPDR in more detail
    • Reasoning about attack plan recognition and OLC bin 3
    • ILC and DRED
    • Same ontological representation
  – Would like to look into
    • Other projects, for example:
      – VICI defense against rootkit to protect the ILC
    • Other issues (e.g. timeliness)
      – Of defense
      – Interference with the timeliness requirements of the system under test
• Evaluation vehicle

Backup notes

Enforcement Off
no-enforcement.soar

• Current:
  – Interpretation: a node reports process-protection off; we note that self-accusation
  – Response selection: an enforcement-off self-accusation causes blocking of all ADF NICs on that host
• Next step:
  – Treat the self-accusation generically: many alerts will be “self-accusations”, and they will be handled by a single set of rules
  – Response selection will consider other actions such as restarting a process, rebooting a host, blocking the NICs or isolating the LAN

Registration
callback.soar, prepare-registartion.soar, reboot.soar, gui-up.soar

• Observation that a client is invited sets up an expectation (that the GUI should appear in the future)
• If the GUI does not appear, that triggers some interpretation (see below)
• Current:
  – An intermediate condition with an ordered prescription for remedies
    • Reboot the client: it’s a client issue that rebooting may fix
    • Re-register from another SM: if there is an SM/DC/AP issue this may solve the problem
    • If all quads are exhausted, try refreshing the AP refs and reinvite
      – If there is a reason to suspect a quad, try isolating that SM before the refresh
• Future:
  – Hypotheses that the client or the inviting SM may be bad, or the path may be bad
  – Restrictive reasoning considering info flow and other incoming events to narrow down or eliminate hypotheses
    • Maximally consistent set of hypotheses
  – Select response based on utilities (and predictive reasoning)
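The current "ordered prescription for remedies" amounts to an escalation ladder: try each remedy in turn and stop as soon as the expected GUI appears. The Python sketch below illustrates that control flow only. It is not derived from the `.soar` files named above, and the remedy names, the `state` dictionary, and the callables are hypothetical.

```python
def recover_registration(remedies, gui_is_up):
    """Apply remedies in order until the client's GUI appears.

    remedies:  ordered list of (name, action) pairs; action is a callable.
    gui_is_up: callable returning True once the expected GUI is observed.
    Returns the names of the remedies that were actually tried.
    """
    tried = []
    for name, action in remedies:
        if gui_is_up():
            break
        action()
        tried.append(name)
    return tried


# Simulated scenario: rebooting does not help, re-registering does.
state = {"gui": False}


def reboot_client():
    pass  # no effect in this simulation


def reregister_from_other_sm():
    state["gui"] = True


def refresh_ap_refs_and_reinvite():
    state["gui"] = True


remedies = [
    ("reboot client", reboot_client),
    ("re-register from another SM", reregister_from_other_sm),
    ("refresh AP refs and reinvite", refresh_ap_refs_and_reinvite),
]

tried = recover_registration(remedies, lambda: state["gui"])
print(tried, "GUI up:", state["gui"])
```

The planned hypothesis-driven approach would replace this fixed ordering with a choice based on which hypotheses survive restrictive reasoning and on response utilities.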