Language-Based Safety Mechanisms

Stanford University CS 444A, Autumn 99
Software Development for Critical Applications

Armando Fox & David Dill
{fox,dill}@cs.stanford.edu

Concepts Overview & Outline

Static approaches
  “Safe by design” (limiting the language)
  Static analysis/type-safe languages

Dynamic approaches
  Interpreters and sandboxes
  Dynamic dataflow analysis

A few examples (and problems)
  Java, the Exokernel, VMware, SFI, Janus, Interface Compilation

As usual, each bullet is the subject of volumes of papers; this is just an introduction to the landscape

Contrast With David’s “Req Spec”

RS is about verifying a program (or FSM) in the abstract

SFI is about securing them in practice

The two are complementary

Ex: “Transitions in FSM cover all possibilities”
  What is “all”, really?

Recall: dreaming up desired emergent properties

Compare: Intel P6 bus protocol verification vs. implementation validation

What Is “Safety” in this context?

Primary emphasis: prevent buggy/malicious app from doing harm to others

Don’t interfere with other apps directly (read/write their data or files)

Don’t interfere with other apps indirectly (hog OS resources so other apps are denied service)

Don’t crash or corrupt the OS
  Particularly important, since OS usually is the “trusted arbiter” of limited resources

Non-goal: stability of the isolated app.

Techniques

Two basic families of techniques:
  1. Limit things at runtime
  2. Limit things at compile time

Many schemes use a combination of both

Runtime schemes typically rely on some OS and/or hardware support

Background: “The Thin Red Line”

Separates untrusted user space(s) from trusted kernel space
  Kernel manages hardware, shared resources, …
  If you can bend the kernel to your will, you can do serious damage

Typical implementation: hardware VM (virtual memory) support
  Each user process has its own page tables (managed by the kernel)
  Certain addresses mapped to kernel pages

[Figure: the programming model (user code, kernel code) shown beside an address space made up of user pages and kernel pages]
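A minimal sketch of how a per-process page table can enforce the line between user pages and kernel pages. Everything here is illustrative: `pte_t`, `translate`, the 4 KiB page size, and the 8-page address space are assumptions made for the example, not any real MMU's format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u          /* assumed 4 KiB pages */
#define NUM_PAGES  8u           /* toy address space: 8 virtual pages */

/* Hypothetical page-table entry; real hardware formats differ. */
typedef struct {
    bool     present;           /* is the page mapped at all? */
    bool     user_ok;           /* may user-mode code touch it? */
    uint32_t frame;             /* physical frame number */
} pte_t;

typedef enum { ACCESS_OK, FAULT_NOT_PRESENT, FAULT_PROTECTION } access_t;

/* Translate a virtual address roughly the way an MMU would, on behalf of
 * code running in user mode (user_mode = true) or in kernel mode. */
access_t translate(const pte_t table[NUM_PAGES], uint32_t vaddr,
                   bool user_mode, uint32_t *paddr_out)
{
    assert(table != NULL && paddr_out != NULL);
    uint32_t vpn = vaddr >> PAGE_SHIFT;          /* virtual page number */
    if (vpn >= NUM_PAGES || !table[vpn].present)
        return FAULT_NOT_PRESENT;
    if (user_mode && !table[vpn].user_ok)
        return FAULT_PROTECTION;                 /* user code hit a kernel page */
    *paddr_out = (table[vpn].frame << PAGE_SHIFT)
               | (vaddr & ((1u << PAGE_SHIFT) - 1u));
    return ACCESS_OK;
}
```

The same virtual address translates successfully in kernel mode and faults in user mode, which is the whole point: the kernel pages are mapped, but only reachable from the trusted side.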

Call Gates

Call gates (or call descriptors, or traps, or…)
  Controlled breach in the thin red line
  Typically involve an address space change, which relies on VM; so they are slow and expensive
  Implementation often uses exception-handling capability of processor

[Figure: user code and kernel code]
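One way to picture a call gate as a "controlled breach": user code may only name an entry in a kernel-owned table, never an arbitrary kernel address. The sketch below models that in plain C. The service functions, `gate_table`, and `call_gate` are hypothetical names, and a real gate would also switch privilege level and stack, which is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel services; names and behavior are illustrative only. */
static long sys_get_ticks(long unused) { (void)unused; return 42; }
static long sys_double(long x)         { return 2 * x; }

typedef long (*gate_fn)(long);

/* The gate table lives on the kernel side; user code can only pick an index. */
static const gate_fn gate_table[] = { sys_get_ticks, sys_double };
#define NUM_GATES (sizeof gate_table / sizeof gate_table[0])

#define ERR_BAD_GATE (-1L)

/* Stand-in for the trap handler: validate the gate number, then dispatch. */
long call_gate(size_t gate, long arg)
{
    if (gate >= NUM_GATES)
        return ERR_BAD_GATE;            /* no entry point: refuse the call */
    assert(gate_table[gate] != NULL);
    return gate_table[gate](arg);
}
```

Because the index is bounds-checked before dispatch, an out-of-range request comes back as `ERR_BAD_GATE` instead of transferring control somewhere unintended.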

Background: Virtual Machines

In practice, a VM provides a combination of a language execution environment and a “pseudo-OS” runtime system
  “Guest” VM may virtualize hardware resources differently from “host” OS
  Safety is often not a primary goal of a VM

The “guest” and “host” OS’s may be the same or different with respect to…
  Machine language/programmer-visible architecture
  Virtualization of resources

Common flavor to various approaches: control access to “unsafe” language/VM features

VM Examples

Java: artificial-machine-in-a-real-machine
  Provides a language, a runtime, and OS-like abstractions (network, filesystems, etc.)
  Centralized Java Security Manager enforces security policies
  For the most part, runs in user mode

VMware: virtualize any x86 OS inside any other (well, almost)
  Every VM “sees” x86 protected-mode environment
  Within a VM, policies enforced by guest OS
  Across VM’s, virtualized hardware is isolated
  User must grant a certain level of trust to VMware host program

What Can You Do With This?

Limit what the language can express
  “Unsafe” operations are defined out of existence
  “Never put off till runtime what you can do at compile time”

Limit what can be done at runtime
  Perhaps in combination with language limiting

Each approach has pros and cons

Static Analysis, Type-Safe Languages

Goal: To limit the damage a program can do, limit what can be expressed in the source language
  Assumes binaries are tamper-evident
  Assumes only trusted tools used to build binaries
  Assumes trusted tools are working correctly!

Language features/limitations may allow you to prove some invariants
  Example: Backward branching disallowed => finite-length programs finish in finite time
  Example: Pointers disallowed => dangling pointer dereferences vanish

Contrast: SFI or inserting guard code
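The "no backward branches" invariant can be checked mechanically before a program ever runs. The sketch below assumes a made-up three-instruction filter language (`OP_ADD`, `OP_JMP`, `OP_RET`); the instruction set and function names are hypothetical and are not taken from any real packet-filter implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical three-instruction filter language. */
typedef enum { OP_ADD, OP_JMP, OP_RET } opcode_t;

typedef struct {
    opcode_t op;
    int      arg;   /* OP_ADD: value to add; OP_JMP: absolute target index */
} insn_t;

/* Accept a program only if it ends in RET and every jump goes strictly
 * forward and stays in bounds. */
bool verify_forward_only(const insn_t *prog, size_t len)
{
    if (prog == NULL || len == 0 || prog[len - 1].op != OP_RET)
        return false;
    for (size_t pc = 0; pc < len; pc++) {
        if (prog[pc].op == OP_JMP) {
            if (prog[pc].arg < 0)
                return false;
            size_t target = (size_t)prog[pc].arg;
            if (target <= pc || target >= len)
                return false;           /* backward, self, or out of range */
        }
    }
    return true;
}

/* Run a verified program. Returns the accumulator and stores the number of
 * instructions executed in *steps_out. The program counter only increases,
 * so at most len instructions execute. */
int run(const insn_t *prog, size_t len, size_t *steps_out)
{
    assert(verify_forward_only(prog, len));
    int acc = 0;
    size_t pc = 0, steps = 0;
    for (;;) {
        steps++;
        if (prog[pc].op == OP_RET)
            break;
        if (prog[pc].op == OP_JMP) {
            pc = (size_t)prog[pc].arg;
            continue;
        }
        acc += prog[pc].arg;            /* OP_ADD */
        pc++;
    }
    *steps_out = steps;
    return acc;
}
```

The termination argument is the one on the slide: with every jump strictly forward, the program counter is monotonically increasing, so a program of `len` instructions executes at most `len` of them.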

Example: SPIN and Modula-3

SPIN (Bershad et al., early 90’s): a user-extensible microkernel

Extension language: Modula-3, a type-safe, object-oriented language

Why type safety?

Why object oriented?

The extension checker and compiler

Limiting the Language

Goal: To limit the damage a program can do, limit what can be expressed in the source language
  Assumes binaries are tamper-evident
  Assumes only trusted tools used to build binaries
  Assumes trusted tools are working correctly!

Language features/limitations may allow you to prove some invariants
  Example: Backward branching disallowed => finite-length programs finish in finite time
  Example: Pointers disallowed => dangling pointer dereferences vanish

“Never put off till runtime what you can do at compile time”

Pros & Cons of Static Analysis

- Requires that code be written in that specific language
  Sometimes it’s actually desirable to have a simpler language! (e.g. Exokernel generalized packet filter)
  Other times languages may be too limited or awkward
  May also rely on integrity of tool chain

- Languages with rich type systems and class hierarchies confound this approach
  Checking virtual function calls
  Casting between “safe” types (e.g. int to enum)

Static analysis, cont’d.

- Relies on integrity of interpreter or binaries
  What if the Java guys forgot some of the security checks?
  VM interpreter may need semi-privileged access to get at the “real” resources controlled by the host OS
  Or at least, OS must verify signed code segments (ActiveX does this)

+ May allow strong formal proofs of program safety
  Usually done by showing that a particular high-level construct can never produce “unsafe” low-level code
  Can prove from the source code, if transformations are “correctness-preserving” (or “semantics preserving”)

At Runtime: Classic SFI and Janus

SFI: “If program stays in its sandbox, it can’t damage other programs.”
  Dangerous operations/references surrounded by interpolated “guard code”
  Dangerous references can also be “pinned” to sandbox by overwriting upper address bits
  Note, this breaks program correctness! But focus of SFI is preventing harm to others, not to oneself

Janus: “If program can’t make system calls, it can’t damage the OS [and therefore other programs].”
  Some programs break because they don’t check system call results
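The two SFI options, checking an address with guard code versus pinning it by overwriting its upper bits, can be sketched in a few lines. The segment size, the `SEG_BASE` value, and the function names below are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sandbox layout: a 64 KiB segment whose addresses all share the
 * same upper 16 bits. Both constants are made up for this example. */
#define SEG_BITS    16u
#define OFFSET_MASK ((UINT32_C(1) << SEG_BITS) - 1u)   /* 0x0000FFFF */
#define SEG_BASE    UINT32_C(0x00A50000)               /* hypothetical segment id */

/* "Pinning": overwrite the upper address bits with the sandbox's segment id.
 * Any address, legal or not, ends up inside the sandbox. */
uint32_t sfi_pin(uint32_t addr)
{
    return SEG_BASE | (addr & OFFSET_MASK);
}

/* "Guard code" alternative: test the address and let the caller refuse. */
int sfi_check(uint32_t addr)
{
    return (addr & ~OFFSET_MASK) == SEG_BASE;
}

/* A store as rewritten code might perform it. `segment` stands in for the
 * sandbox's memory, indexed by offset within the segment. */
void sandboxed_store(uint8_t segment[1u << SEG_BITS], uint32_t addr, uint8_t value)
{
    uint32_t pinned = sfi_pin(addr);
    assert(sfi_check(pinned));          /* holds by construction */
    segment[pinned & OFFSET_MASK] = value;
}
```

Note what pinning does to a wild pointer such as `0xDEAD0010`: the store silently lands at offset `0x10` of the program's own segment. The program is now wrong, but only it suffers, which matches the slide's point that SFI protects others rather than the sandboxed code itself.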

Pros & cons of runtime approaches

+ Use high-confidence machine-level mechanisms
  Based on hardware-level mechanisms, e.g. VM, traps
  In practice, hardware implementation errors for these are extremely rare (why?)

+ Can be used with arbitrary “legacy” code

- No onus on programmer to make potential error conditions explicit (e.g. assertions)
  So runtime has no idea what to do to “recover”

- Doesn’t guarantee correct behavior, only safety to others

Dynamic Dataflow Analysis

Potentially unsafe operations must always be denied, to be conservative
  If done statically, renders code impotent

Idea: quarantine the data that may be “contaminated” by user (taintperl works this way)

    print STDERR "Enter file name:";
    $x = <STDIN>;               # $x is tainted (user input)
    # …more code…
    $z = "/tmp/safe_file.txt";  # $z is clean
    $y = "$sysdir/$x";          # $y is tainted
    system("cat $y");           # disallowed!
    system("cat $z");           # OK
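The same propagate-then-refuse idea can be sketched in C by carrying a taint bit next to each string. This is a hand-rolled illustration, not how taintperl is implemented; `tstr_t`, `tstr_make`, `tstr_join_path`, and `guarded_cat` are hypothetical names, and `guarded_cat` only reports what it would do rather than running a command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TSTR_MAX 128

/* A string that carries a taint bit alongside its bytes (hypothetical type). */
typedef struct {
    char text[TSTR_MAX];
    bool tainted;
} tstr_t;

/* Build a value from a trusted literal (clean) or from user input (tainted). */
tstr_t tstr_make(const char *s, bool tainted)
{
    tstr_t t;
    assert(s != NULL);
    snprintf(t.text, sizeof t.text, "%s", s);
    t.tainted = tainted;
    return t;
}

/* Join as "dir/name". The result is tainted if either input is tainted.
 * A truncated result is conservatively marked tainted as well. */
tstr_t tstr_join_path(const tstr_t *dir, const tstr_t *name)
{
    tstr_t out;
    int n = snprintf(out.text, sizeof out.text, "%s/%s", dir->text, name->text);
    bool truncated = (n < 0) || ((size_t)n >= sizeof out.text);
    out.tainted = dir->tainted || name->tainted || truncated;
    return out;
}

/* Stand-in for system("cat ..."): refuses tainted arguments and otherwise
 * just prints the command it would run. Returns true if allowed. */
bool guarded_cat(const tstr_t *path)
{
    if (path->tainted) {
        fprintf(stderr, "refused: tainted argument\n");
        return false;
    }
    printf("would run: cat %s\n", path->text);
    return true;
}
```

The check happens at the sensitive operation, not at the point of input, so clean data flows through untouched and only values derived from user input are quarantined.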

Interface Compilation

Problem: interfaces are a syntactic abstraction that usually carries no semantics

Semantics might be useful for…
  Special-case optimizations (e.g. file I/O, specialization by call site)
  Safety of called proc, or error handling in case of failure

Is the interface too narrow?
  Semantic type info may be lost (Unix)
  Semantic properties such as “liveness” are not preserved across the interface (hidden state) - example to follow

Exploiting Semantics

Example 1: File I/O

    fd = open(filename, O_RDONLY);
    /* …do some file operations… */
    close(fd);
    /* …more code… */
    read(fd, buf, 4096);   /* certain to fail: fd is already closed */

Example 2: type impoverishment

    ssize_t read(int fd, void *buf, size_t n);

What if buf is unaligned or not big enough?

No way to tell from call syntax
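To make the missing semantics concrete, the sketch below attaches the hidden facts (is the descriptor still open? how big is the buffer?) to wrapper types and checks them before a `read` would be issued. The wrapper types and functions are hypothetical, and this is a runtime stand-in: an interface compiler would aim to establish the same facts at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper types; not part of POSIX. */
typedef struct {
    int  fd;
    bool open;          /* the "liveness" a bare int fd does not carry */
} chk_file_t;

typedef struct {
    void  *data;
    size_t capacity;    /* the size a bare void* does not carry */
} chk_buf_t;

typedef enum { CHK_OK, CHK_CLOSED, CHK_TOO_SMALL } chk_result_t;

/* Decide whether read(f->fd, b->data, n) could be legal, before calling it. */
chk_result_t check_read(const chk_file_t *f, const chk_buf_t *b, size_t n)
{
    assert(f != NULL && b != NULL);
    if (!f->open)
        return CHK_CLOSED;              /* Example 1: read after close */
    if (b->data == NULL || n > b->capacity)
        return CHK_TOO_SMALL;           /* Example 2: buffer not big enough */
    return CHK_OK;
}

/* Record that the descriptor is no longer live (the real close() call is
 * omitted to keep the sketch free of I/O). */
void chk_close(chk_file_t *f)
{
    f->open = false;
}
```

Alignment is left out of the sketch; it could be added as another field or check on `chk_buf_t` in the same style.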

Interface Compilation With MAGIK

Provides abstractions for dealing with interfaces
  Iterators over the function calls
  Accessors for the data structures manipulated by each call: what type? Compile-time constant? Access to internal fields of structure? Etc.

Allows programmer to write C-like code “extensions” using these functions and accessors
  Original source and extensions are compiled together into common intermediate form
  Intermediate form can be optimized using traditional methods before machine targeting

IC as an Orthogonal Mechanism

Can retrofit existing “legacy” code (provided source is available)
  Admits of incremental improvements
  Safety concerns/development can be kept separate from mainline logic for maintainability

Some cool implemented examples
  Type-aware I/O for C
  Safe signal handling (prevents calling non-reentrant library functions inside a signal handler)

Common thread: uses semantic information that cannot be extracted from source alone

Compare with “emergent properties” in req. spec.

Lessons? Anyone?

Limits of virtual machines and static analysis
  Assumes tools are trustworthy, from a security standpoint
  But…buggy == untrustworthy
  End-to-end argument suggests falling back on runtime SFI?