July 2001 OASIS --- Santa Fe 1 Dependence Graphs for Information Assurance Paul Anderson [email protected] GrammaTech, Inc. Ithaca, NY http://www.grammatech.com Tim Teitelbaum [email protected] (Cornell)


July 2001 OASIS --- Santa Fe 1

Dependence Graphs for Information Assurance

Paul Anderson

paul@grammatech.com

GrammaTech, Inc.

Ithaca, NY

http://www.grammatech.com

Tim Teitelbaum

[email protected]

(Cornell)


Situation

• Front door open
– Eligible Receiver: 65% of attacks successful; 63% undetected [Tenet; Minihan]

• Back door (installed and) open
– Back Orifice, etc. demonstrate the potential of automated intrusion maintenance

• Blinds are up
– Open sources expose critical software and all its flaws

• Foundation is rotten
– Buggy software is the norm [Weise; Engler]
– A dozen new buffer-overrun attacks every week [Epstein]

• The domestic is foreign
– 195,000 H1B visas issued per year


DISCEX II Keynotes

• DoD Perspective
– “Nation states are our greatest concern”
  LTG Edward G. Anderson III, Deputy Commander in Chief, United States Space Command

• Characteristics
– Planning
– Long term view
– Strategic investment
– Stealth

• Questions for OASIS
– What is your model of a nation-state attack?
– How do you address it?


DISCEX II Keynotes

• Industry Perspective
– “DARPA should focus on tools for building safer code”
– “More emphasis needed on correct implementation, especially for security products!”
  Jeremy Epstein, Director, Product Security, webMethods, Inc.

• DoD Perspective
– “Nation states are our greatest concern”
  LTG Edward G. Anderson III, Deputy Commander in Chief, United States Space Command


Questions

• Front door open
– What have nation states been doing to us while we have been so exposed?

• Back door (installed and) open
– What percentage of observed attacks are nation-state attacks?

• Blinds are up
– Are nation states investing heavily in vulnerability analysis of open source code?

• Foundation is rotten
– Are nation-state insider code-level attacks distinguishable from bugs?

• The domestic is foreign
– How many foreign agents program routers for Cisco?
– How does Cisco defend its products from its own malicious employees?
– Do you consider firmware part of the TCB? On what basis?


President’s Critical Infrastructure Protection Plan

Recommendations [page 61]

6.1 Critical Infrastructure Protection Research and Development

Intrusion Detection and Monitoring

. . .

This program is designed to develop advanced software tools and techniques that can detect and eliminate trap doors and other malicious code in software. Although detecting subtle but intentional alterations to computer code is problematic, these tools will increase the integrity of software products, and thereby reduce the probability of future penetrations and compromises of computers and networks.

Development of Advanced [...] Software Tools for Trap Door Analysis and Malicious Code Detection

DARPA

HARD


Role of Static Code Analysis in IA

• Assumptions
– Implementations are critical
– Tools for understanding code are strategic

• Attacks and Vulnerabilities
– Trap doors and exploitable bugs

• Approach
– Statically detect and eliminate

• Other applications of core-technology
– Policy enforcement by code rewriting
– Model extraction from code

• Scope
– Mission-critical and mass-market
– Open-source and closed-source
– Source and binary


Accomplishments [past 6 months]

• Core Static-Analysis Research
– Context-Sensitive GMOD / GREF Analysis
– Fine-grained discrimination by structure-fields
– Variable-based queries and function-based queries
– Non-structured control constructs, e.g., switch, break, continue, goto
  • better precision
– Pointer Analysis
  • better performance [about 6x faster]
– Interprocedurally-precise model checker (mu-calculus)

• Information Assurance Workbench
– Buffer overrun vulnerability detection and analysis
– Pattern matching on AST fragments

[Callout markers on slide: 2, 3, 1, 1]


Analysis of Buffer Overrun Vulnerabilities

• Code Red attack
– Begins by exploiting a buffer-overrun vulnerability

• Static analysis
– Can detect potential buffer-overrun vulnerabilities

1


Analysis of Buffer Overrun Vulnerabilities, cont.

• Code Red attack
– Exploits an unchecked byte-string to wide-character-string conversion
– Assume the operation used was
    mbstowcs(char *dst, char *src, int length)
– Can 2*length be bigger than size of dst?

• Dependence queries
– Reveal potential information-flows, e.g.,
  • from data sources under user control (external strings)
  • to dangerous operations (unchecked length arguments)
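
The "can 2*length be bigger than size of dst?" question can be made concrete with a minimal sketch. It assumes the standard C `mbstowcs`, whose real signature takes a `wchar_t *` destination and a count in wide characters (the slide's signature is a simplification). The helper names `length_overflows` and `convert_checked` are hypothetical and not taken from the Code Red code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: would writing `length` wide characters overflow a
 * destination of `dst_bytes` bytes? This is the slide's "2*length" question
 * with sizeof(wchar_t) in place of the constant 2. */
static int length_overflows(size_t length, size_t dst_bytes)
{
    return length > dst_bytes / sizeof(wchar_t);
}

/* Checked conversion: the length is derived from the (possibly external)
 * string, so it is bounds-checked before it reaches mbstowcs.
 * Returns 0 on success, -1 if the string does not fit or cannot be converted. */
static int convert_checked(wchar_t *dst, size_t dst_bytes, const char *src)
{
    size_t length;

    assert(dst != NULL && src != NULL);
    length = strlen(src) + 1;               /* byte count, +1 for the terminator */
    if (length_overflows(length, dst_bytes))
        return -1;                          /* the bounds-check */
    return mbstowcs(dst, src, length) == (size_t)-1 ? -1 : 0;
}
```

The check is conservative: the byte length of `src` is an upper bound on the number of wide characters produced. Dropping the `length_overflows` test gives the unchecked variable-length case that the dependence queries are meant to find.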


Analysis of Buffer Overrun Vulnerabilities, cont.

Can users influence the length argument of a string-to-wide-character conversion?

[Figure: dependence diagram. Sources of external strings; sources of internal strings; bounds-check; call sites mbstowcs(wide_buf, buf, strlength(buf)) (unchecked variable-length argument), mbstowcs(wide_buf, buf, SIZE) (unchecked fixed-length argument), mbstowcs(wide_buf, buf, r) (checked variable-length argument)]


Analysis of Buffer Overrun Vulnerabilities, cont.

Dependences from data sources to mbstowcs arguments

[Figure: dependence diagram. Sources of external strings; sources of internal strings; bounds-check; call sites mbstowcs(wide_buf, buf, strlength(buf)) (unchecked variable-length argument), mbstowcs(wide_buf, buf, SIZE) (unchecked fixed-length argument), mbstowcs(wide_buf, buf, r) (checked variable-length argument)]


Analysis of Buffer Overrun Vulnerabilities, cont.

Chop from sources to targets shows all possible information flows

[Figure: dependence diagram with chop sources and targets marked. Sources of external strings; sources of internal strings; bounds-check; call sites mbstowcs(wide_buf, buf, strlength(buf)) (unchecked variable-length argument), mbstowcs(wide_buf, buf, SIZE) (unchecked fixed-length argument), mbstowcs(wide_buf, buf, r) (checked variable-length argument)]
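
A chop can be sketched as the intersection of a forward traversal from the sources and a backward traversal from the targets. The sketch below is a simplified, intraprocedural version on a small adjacency matrix. The node names and the edges are hypothetical stand-ins for the diagram, and this is not CodeSurfer's interprocedural algorithm.

```c
#include <assert.h>
#include <string.h>

enum { N = 6 };
enum { EXT_SRC, INT_SRC, BOUNDS_CHECK, CALL_STRLEN, CALL_SIZE, CALL_R };

/* edge[u][v] != 0 means v depends on u (assumed edges, for illustration). */
static const unsigned char edge[N][N] = {
    [EXT_SRC][CALL_STRLEN]  = 1,   /* external string reaches an unchecked length */
    [EXT_SRC][BOUNDS_CHECK] = 1,   /* external string is bounds-checked ...       */
    [BOUNDS_CHECK][CALL_R]  = 1,   /* ... before reaching the checked length r    */
    [INT_SRC][CALL_SIZE]    = 1,   /* internal constant feeds the fixed length    */
};

/* Mark every node reachable from `seed`, following edges forward or backward. */
static void reach(const unsigned char seed[N], int forward, unsigned char out[N])
{
    int stack[N], top = 0;

    memset(out, 0, N);
    for (int v = 0; v < N; v++)
        if (seed[v]) { out[v] = 1; stack[top++] = v; }
    while (top > 0) {
        int u = stack[--top];
        for (int v = 0; v < N; v++) {
            int connected = forward ? edge[u][v] : edge[v][u];
            if (connected && !out[v]) { out[v] = 1; stack[top++] = v; }
        }
    }
}

/* Chop: nodes that lie on some path from a source to a target. */
static void chop(const unsigned char sources[N], const unsigned char targets[N],
                 unsigned char out[N])
{
    unsigned char fwd[N], bwd[N];

    assert(sources != NULL && targets != NULL && out != NULL);
    reach(sources, 1, fwd);
    reach(targets, 0, bwd);
    for (int v = 0; v < N; v++)
        out[v] = fwd[v] && bwd[v];
}
```

With the external source as the only chop source and all three call sites as targets, the result contains the bounds-check node as well as the unchecked call, which is how a flow through the bounds-check ends up in the chop.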


Analysis of Buffer Overrun Vulnerabilities, cont.

Good news: find all flows; Bad news: false positive (flow through bounds-check)

[Figure: dependence diagram. Sources of external strings; sources of internal strings; bounds-check; call sites mbstowcs(wide_buf, buf, strlength(buf)) (unchecked variable-length argument), mbstowcs(wide_buf, buf, SIZE) (unchecked fixed-length argument), mbstowcs(wide_buf, buf, r) (checked variable-length argument)]


Analysis of Buffer Overrun Vulnerabilities, cont.

[Figure: Code Red source-code mock-up, showing chop-sources, chop-targets, and query-results; the bounds-check is labeled]


Analysis of Buffer Overrun Vulnerabilities, cont.

• Model checking queries
– Can assert and check properties about flow paths
– Counter-examples: reveal possible vulnerabilities

• Sample (false) assertion
– Every path from a user data source to the length argument of mbstowcs goes through a bounds-check

• Sample counter-example
– Path from data source to unchecked length argument of mbstowcs
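
The sample assertion can be checked, in a much simplified form, by searching for a path from the source to the target that avoids the bounds-check node; any path found is a counter-example. The sketch below uses plain depth-first search on a hypothetical five-node flow graph. It is a stand-in for illustration, not the mu-calculus model checker itself.

```c
#include <assert.h>

enum { NODES = 5 };
enum { USER_INPUT, BOUNDS_CHECK, LEN_CHECKED, LEN_UNCHECKED, FIXED_SIZE };

/* flows[u][v] != 0 means data flows from u to v (assumed edges). */
static const unsigned char flows[NODES][NODES] = {
    [USER_INPUT][BOUNDS_CHECK]  = 1,
    [BOUNDS_CHECK][LEN_CHECKED] = 1,
    [USER_INPUT][LEN_UNCHECKED] = 1,
};

/* Depth-first search that never enters `avoid`; records the current path. */
static int dfs(int u, int dst, int avoid, unsigned char seen[NODES],
               int path[NODES], int depth)
{
    if (u == avoid || seen[u])
        return 0;
    seen[u] = 1;
    path[depth] = u;
    if (u == dst)
        return depth + 1;
    for (int v = 0; v < NODES; v++) {
        if (flows[u][v]) {
            int len = dfs(v, dst, avoid, seen, path, depth + 1);
            if (len)
                return len;
        }
    }
    return 0;
}

/* Returns the length of a path from src to dst that bypasses `avoid`
 * (a counter-example to "every path goes through avoid"), or 0 if the
 * assertion holds. The path itself is written to `path`. */
static int counter_example(int src, int dst, int avoid, int path[NODES])
{
    unsigned char seen[NODES] = {0};

    assert(src >= 0 && src < NODES && dst >= 0 && dst < NODES);
    return dfs(src, dst, avoid, seen, path, 0);
}
```

For the checked length argument no bypassing path exists, so the assertion holds there; for the unchecked one the search returns the two-node path from the user input straight to the length argument.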


Analysis of Buffer Overrun Vulnerabilities, cont.

Good news: counter-example avoids bounds-check

[Figure: dependence diagram. Sources of external strings; sources of internal strings; bounds-check; call sites mbstowcs(wide_buf, buf, strlength(buf)) (unchecked variable-length argument), mbstowcs(wide_buf, buf, SIZE) (unchecked fixed-length argument), mbstowcs(wide_buf, buf, r) (checked variable-length argument)]


Analysis of Buffer Overrun Vulnerabilities, cont.

No bounds-check

Counter-example in query-results; Chop result in _______ background


Analysis of Buffer Overrun Vulnerabilities, cont.

• Constraint satisfaction [Wagner, et al.]
– Assert required constraints between destination buffer sizes and corresponding copy length arguments
– Report all cases where constraints are not satisfied

• Use of CodeSurfer [future work]
– Implement in industrial-strength framework
– Reduce false positives reported
  • Context-sensitive constraint satisfaction
  • Better pointer analysis
– Interactive tool for analysis of false positives
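
The constraint-satisfaction idea can be illustrated with a minimal sketch: each copy site carries an integer range for its length argument and the size of its destination, and the required constraint is that the largest possible length does not exceed the destination size. The types, the `UNBOUNDED` marker, and the function names are assumptions made for this example. How the ranges are inferred, which is the hard part of the cited approach, is not shown.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define UNBOUNDED LONG_MAX               /* models "no known upper bound" */

typedef struct { long lo, hi; } range_t; /* inclusive range of possible values */

typedef struct {
    const char *length_expr;  /* text of the length argument, for reporting */
    long        dst_size;     /* size of the destination buffer, in elements */
    range_t     length;       /* possible values of the copy length */
} copy_site_t;

/* Required constraint: every possible copy length fits the destination. */
static int constraint_holds(const copy_site_t *site)
{
    assert(site != NULL && site->length.lo <= site->length.hi);
    return site->length.hi <= site->dst_size;
}

/* Collect all sites whose constraint is not satisfied; returns their count. */
static size_t report_violations(const copy_site_t *sites, size_t n,
                                const copy_site_t **violations)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (!constraint_holds(&sites[i]))
            violations[count++] = &sites[i];
    return count;
}
```

A length with an unknown upper bound, such as one computed from an external string, is reported; a length that is provably within the destination size is not.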


Context-Sensitive GMOD / GREF Analysis

• Accurate dependence analysis for reference parameters
– Previously, a major source of imprecision
– Now, context-sensitive analysis of non-local variable usage
– Substantial improvement
– Relevant for buffer-overrun analysis

• Example
– Instead of
    mbstowcs(char *dst, char *src, int length)
  consider
    void assign(char *dst, char *src)
    {
        *dst = *src;
    }

2


Context-Sensitive GMOD / GREF Analysis

• When procedure P has formal parameter F of type *T, the flow-insensitive, context-insensitive points-to set of F is the union of the points-to sets of all corresponding actual parameters in calls to P (plus any other pointers assigned to F in P)

assign(&a1, &a2);    assign(&b1, &b2);

void assign(char *dst, char *src)
{
    *dst = *src;
}
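
The union rule can be sketched with variable sets encoded as bit masks. The encoding and the names `call_site_t` and `formal_points_to` are assumptions for this example; pointers assigned to the formals inside the procedure are ignored because `assign` has none.

```c
#include <assert.h>
#include <stddef.h>

/* One bit per variable of the running example. */
enum { A1 = 1 << 0, A2 = 1 << 1, B1 = 1 << 2, B2 = 1 << 3 };

/* Points-to sets of the actual parameters at one call to assign(dst, src). */
typedef struct {
    unsigned dst_actual;
    unsigned src_actual;
} call_site_t;

/* Flow-insensitive, context-insensitive points-to sets of the formals:
 * the union over all call sites of the corresponding actuals' sets. */
static void formal_points_to(const call_site_t *calls, size_t n,
                             unsigned *dst_formal, unsigned *src_formal)
{
    assert(calls != NULL && dst_formal != NULL && src_formal != NULL);
    *dst_formal = 0;
    *src_formal = 0;
    for (size_t i = 0; i < n; i++) {
        *dst_formal |= calls[i].dst_actual;
        *src_formal |= calls[i].src_actual;
    }
}
```

For the two calls above this yields {a1, b1} for `dst` and {a2, b2} for `src`: the information about which actual came from which call is no longer present in the sets.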


Context-Sensitive GMOD / GREF Analysis, cont.

• Additional (hidden) actual and formal parameters are generated to represent the variables modified or referenced via formal F (as well as variables modified or referenced via global pointer variables)

assign(&a1, &a2);    assign(&b1, &b2);

void assign(char *dst, char *src)    [hidden parameters: a2,b2 and a1,b1]
{
    *dst = *src;
}


Context-Sensitive GMOD / GREF Analysis, cont.

• The generated actual parameters are wired to the accessible defs and uses of the variables accessible via F.

• In general, there is more than one def for each actual-in, and more than one use for each actual-out.

assign(&a1, &a2);    assign(&b1, &b2);

void assign(char *dst, char *src)    [hidden parameters: a2,b2 and a1,b1]
{
    *dst = *src;
}

a2 = …    b2 = …    … = a1    … = b1


Context-Sensitive GMOD / GREF Analysis, cont.

• Previously, the wiring was based on the points-to sets of the corresponding formal parameters. Thus, the additional edges (in blue) were also wired.

assign(&a1, &a2);    assign(&b1, &b2);

void assign(char *dst, char *src)    [hidden parameters: a2,b2 and a1,b1]
{
    *dst = *src;
}

a2 = …    b2 = …    … = a1    … = b1


Context-Sensitive GMOD / GREF Analysis, cont.

• A backward slice from a use of variable b1 shows what influences its value.

• It is computed by following dependence edges backward.

assign(&a1, &a2);    assign(&b1, &b2);

void assign(char *dst, char *src)    [hidden parameters: a2,b2 and a1,b1]
{
    *dst = *src;
}

a2 = …    b2 = …    … = a1    … = b1


Context-Sensitive GMOD / GREF Analysis, cont.

• A backward slice from a use of variable b1 shows what influences its value.

• It is computed by following dependence edges backward.

• Only feasible paths are followed, i.e., edges shown in gold are not followed. Good!

assign(&a1, &a2);    assign(&b1, &b2);

void assign(char *dst, char *src)    [hidden parameters: a2,b2 and a1,b1]
{
    *dst = *src;
}

a2 = …    b2 = …    … = a1    … = b1


Context-Sensitive GMOD / GREF Analysis, cont.

• But the path to variable a2 would also be followed. Bad! Variable a2 has no influence on variable b1.

assign(&a1, &a2);    assign(&b1, &b2);

void assign(char *dst, char *src)    [hidden parameters: a2,b2 and a1,b1]
{
    *dst = *src;
}

a2 = …    b2 = …    … = a1    … = b1


Context-Sensitive GMOD / GREF Analysis, cont.

• … and other spurious paths would also be followed. Very bad!

assign(&a1, &a2);    assign(&b1, &b2);

void assign(char *dst, char *src)    [hidden parameters: a2,b2 and a1,b1]
{
    *dst = *src;
}

a2 = …    b2 = …    … = a1    … = b1


Context-Sensitive GMOD / GREF Analysis, cont.

• This bad behavior has now been eliminated.

• We now distinguish between variables accessible because of actual parameters and variables accessible because of globals.

assign(&a1, &a2);    assign(&b1, &b2);

void assign(char *dst, char *src)    [hidden parameters: a2,b2 and a1,b1]
{
    *dst = *src;
}

a2 = …    b2 = …    … = a1    … = b1
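
The difference in precision can be illustrated with a simplified sketch that asks which variables influence b1 through `*dst = *src`. One function wires the dependence from the merged points-to sets of the formals; the other wires it per call site. The bit-mask encoding and function names are assumptions for this example, and the per-call-site wiring only approximates the effect of the new analysis; it is not the GMOD / GREF algorithm itself.

```c
#include <assert.h>
#include <stddef.h>

/* One bit per variable of the running example. */
enum { A1 = 1 << 0, A2 = 1 << 1, B1 = 1 << 2, B2 = 1 << 3 };

/* Variables written (via dst) and read (via src) at one call to assign. */
typedef struct {
    unsigned dst_actual;
    unsigned src_actual;
} call_site_t;

/* Wiring based on the merged points-to sets of the formals: if `var` may be
 * written through dst in any call, every variable read through src in any
 * call is treated as influencing it. */
static unsigned influencers_merged(const call_site_t *calls, size_t n, unsigned var)
{
    unsigned all_dst = 0, all_src = 0;

    assert(calls != NULL);
    for (size_t i = 0; i < n; i++) {
        all_dst |= calls[i].dst_actual;
        all_src |= calls[i].src_actual;
    }
    return (all_dst & var) ? all_src : 0;
}

/* Wiring kept separate per call site: only the src variables of calls that
 * can actually write `var` influence it. */
static unsigned influencers_per_call(const call_site_t *calls, size_t n, unsigned var)
{
    unsigned result = 0;

    assert(calls != NULL);
    for (size_t i = 0; i < n; i++)
        if (calls[i].dst_actual & var)
            result |= calls[i].src_actual;
    return result;
}
```

With the two calls from the slides, the merged wiring reports both a2 and b2 as influencing b1, while the per-call wiring reports only b2, which matches the statement that a2 has no influence on b1.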


Context-Sensitive GMOD / GREF Analysis, cont.

• Big win in precision

• Big win in time and space

• But a new time and space problem [example not shown]
  Conjecture: Greater precision makes previously identical sets (with shared representations) different (and therefore unshared)

Program    LOC      Size of SDG   Summary edge time   Build time   Forward slice time   Backward slice time
compress    1,937   -48%            0%                -62%         -50%                   0%
cpp         4,079   -16%          -33%                -19%         -47%                 -29%
byacc       6,626     3%          -23%                -27%         -34%                 -25%
cadp       12,787   -21%           14%                -39%         -32%                 -22%
flex       12,400    -6%            4%                -28%         -33%                 -29%
ijpeg      28,177   -24%          -81%                  6%         -71%                 -71%
go         29,246   -21%          -31%                -35%         -46%                 -53%
ntpd       61,068   -12%          -16%                -27%         -12%                  -7%


Discrimination by Structure Field

• Previously, all fields participated in every operation on any field
– e.g., predecessors of p->f were defs of every field of struct pointed to by p

• Now, there is an option to discriminate on structure fields
– e.g., predecessors of p->f are only defs of field f of structs pointed to by p

• But casts and unions must be taken into account
– For portable analysis, cannot use explicit offsets
– Two fields f1 and f2 in different struct types T1 and T2 have the same offsets if the field sequences leading up to f1 and f2 have pair-wise compatible types
  • Explicit offsets could be used in the future for precise platform-dependent analysis

• New problem to be solved
– Unless calls to malloc are immediately cast to their intended types, field discrimination is lost
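
The effect of the field-discrimination option on the predecessors of a use of p->f can be sketched as a filter over recorded definitions. The record layout, the statement numbers, and the `discriminate` flag are hypothetical; casts, unions, and the malloc problem mentioned above are not modeled.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { FIELD_F, FIELD_G } field_t;

/* A definition of one field of the struct pointed to by p. */
typedef struct {
    int     stmt;    /* hypothetical statement id of the definition */
    field_t field;   /* which field the statement writes */
} field_def_t;

/* Collect the statements that are data-dependence predecessors of a use of
 * p->used. Without discrimination every field def qualifies; with it, only
 * defs of the same field do. Returns the number of predecessors written. */
static size_t predecessors(const field_def_t *defs, size_t n, field_t used,
                           int discriminate, int *out_stmts)
{
    size_t count = 0;

    assert(defs != NULL && out_stmts != NULL);
    for (size_t i = 0; i < n; i++)
        if (!discriminate || defs[i].field == used)
            out_stmts[count++] = defs[i].stmt;
    return count;
}
```

Given definitions of f, g, and f again, a use of p->f has three predecessors without discrimination and two with it.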

3


Transitions (Spin-off SBIR Research)

• Current SBIR Phase I projects to transition research to products

– Malicious Code Detection in Firmware (Air Force)

• CodeSurfer for x86; use to detect malicious code

– Model Checking of Hierarchical Graph Structures (DARPA / ITO)

• CodeSurfer model checking plug-in for QA

– Inlined Reference Monitors for Java Bytecode (NIST)

• Use of dependence-graph technology for insertion of efficient IRMs

– Model Checking of UML designs (Navy / Aegis)

• Model-checking to assure properties in UML Rose/RealTime designs

– Dependence Graphs for Dynamic Internet Technologies (NSF)

• CodeSurfer for Java; decision support for test coverage

– Static Analysis for AOP (DARPA / PCES)

• Aspect C (separate take from Gregor’s)

– New Technique for Efficient Compression of Information (BMDO) *

• BDD variant, potential for double-exponential decision tree compression

* [unrelated to DARPA research]


Transitions, cont.

• Recent Product Release
– CodeSurfer 1.5
  • Open APIs to C program representation and analysis operations

• Paper
– Software Inspection using CodeSurfer, WISE’01 Workshop on Inspection in Software Engineering, July 23rd, 2001, Paris.


Workshop Topics

• Integration Opportunities
– Projects exploring code rewriting or reorganization, or developing vulnerability scanners
  • Client of our open APIs to program representation and analyses
– Projects relying on a system model
  • Potential to extract the model automatically from the code

• Validation
– Scalability
  • Performance on benchmarks
– Vulnerability detection
  • False positive rate w.r.t. “truth”, e.g., known buffer overrun attacks


Future Work

• Core Technology
– Pointer analysis
– Dependence analysis
  • concurrency, asynchronous control, reused storage, types, array strides
– Model checker (model reduction and abstraction relaxation)
– Constraint satisfaction: sets and numeric ranges
– Summary information for libraries
– Rewriting support
– Performance

• Information Assurance Workbench
– Scanners for buffer overruns, race conditions, covert channels