Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
https://trustworthy.systems
Security Needs a Better Hardware-Software ContractGernot Heiser | [email protected] | @GernotHeiser• DAC’19, Las Vegas, 5 June 2019
Threats
DAC, Las Vegas, 5 June 20192 |
Speculation
Microarchitectural Timing Channel
An “unknown unknown” until
recently
A “known unknown”
for decades
What Are Timing Channels?
DAC, Las Vegas, 5 June 20193 |
Timing Channels
DAC, Las Vegas, 5 June 20194 |
Information leakage through timing of events• Typically by observing response latencies or own execution speed
Covert channel: Information flow that bypasses the security policy
Side channel: Covert channel exploitable without insider help
High LowTrojanencodes
info
SpyobservesAttacker observes
Victim executes normally
Cause: Competition for Shared HW Resources
DAC, Las Vegas, 5 June 20195 |
Affect execution speed
Shared hardware
• Inter-process interference• Competing access to micro-
architectural features • Hidden by the HW-SW contract!
High Low
Security: A HW-SW Codesign Issue
DAC, Las Vegas, 5 June 20196 |
Enforcing Security
DAC, Las Vegas, 5 June 20197 |
Operating System
Hardware (CPU etc)
High Low
Provide mechanisms
Enforce policies
HW-SW Contract
Why Hardware Cannot Do Security Alone• Security policies are high-level
• Course-grain: “applications” are sets of cooperating processes
• Hardware mechanisms are fine-grain: instructions, pages, address spaces• Much semantics lost in mapping to hardware level
• Security policies are complex: “Can A talk to B?” is too simple• maybe one-way communication is allowed• maybe communication is allowed under certain conditions• maybe low-bandwidth leakage doesn’t matter• maybe secrets only matter for a short time• maybe only subset of {confidentiality, integrity, availability} is important
DAC, Las Vegas, 5 June 20198 |
Why the ISA is an Insufficient Contract• The ISA is a purely operational contract
• Sufficient for ensuring functional correctness• Insufficient for ensuring confidentiality or availability
DAC, Las Vegas, 5 June 20199 |
High Low
Observe execution speed:Confidentiality violation
Affect execution speed:Availability violation
The ISA intentionally abstracts time away
What Is Needed?
DAC, Las Vegas, 5 June 201910 |
Confidentiality Needs Time Protection
Time protection: A collection of OS mechanisms which collectively prevent interference between security domains that make execution speed in one domain dependent on the activities of another.
[Ge et al. EuroSys’19]
DAC, Las Vegas, 5 June 201911 |
High Low
Traditionally OSes enforce security by memory protection, i.e. enforcing spatial isolation
Time Protection: Partition HardwareLowHigh
CacheFlush
Temporally partition
Cannot spatially partition on-core caches (L1, TLB, branch predictor, pre-fetchers)• virtually-indexed• OS cannot control
Low
Cache
High
LowHigh
Cache
Spatially partition
Flushing useless for concurrent access• HW threads• cores
Needboth!Needboth!
DAC, Las Vegas, 5 June 201912 |
Requirements for Time Protection
DAC, Las Vegas, 5 June 201913 |
Timing channels can be closed iff the OS can• (spatially) partition or• resetall shared hardware
On-corestate
Off-corestate &
stateless HW
Sharing 1: Stateless InterconnectH/W is bandwidth-limited• Interference during concurrent
access• Generally reveals no data or
addresses• Must encode info into access
patterns• Only usable as covert channel, not
side channel
Sharedinterconnect
Memory No effective defencewith present hardware!
DAC, Las Vegas, 5 June 201914 |
High Low
Sharing 2: Stateful HardwareHW is capacity-limited• Interference during• concurrent access• time-shared access
• Collisions reveal addresses• Usable as side channel
Cache
Any state-holding microarchitectural feature:• cache, branch predictor, pre-fetcher state machine
Solvable problem –focus of this work
DAC, Las Vegas, 5 June 201915 |
High Low
ImplementingTime Protectionon StatefulHardware
DAC, Las Vegas, 5 June 201916 |
Spatial Partitioning: Cache Colouring
DAC, Las Vegas, 5 June 201917 |
Cache
RAM
• Partitions get frames of disjoint colours• seL4: userland supplies kernel memory⇒ colouring userland colours dynamic kernel memory
• Per-partition kernel image to colour kernel[Ge et al. EuroSys’19]
High Low
TCB PT PTTCB
Temporal Partitioning: Flush on Switch
DAC, Las Vegas, 5 June 201918 |
1. T0 = current_time()2. Switch user context3. Flush on-core state4. Touch all shared data needed for return5. while (T0+WCET < current_time()) ;6. Reprogram timer7. return
Latency dependson prior execution!
Time padding to Remove
dependency
Ensure deterministic
execution
Must remove any history dependence!
Reality Check:FlushingOn-Core State
DAC, Las Vegas, 5 June 201919 |
Evaluating Intra-Core Channels
DAC, Las Vegas, 5 June 201920 |
Flush
Mitigation on Intel and Arm processors:• Disable data prefetcher (just to be sure)• On context switch, perform all architected flush operations:• Intel: wbinvd + invpcid (no targeted L1-cache flush supported!)•Arm: DCCISW + ICIALLU + TLBIALL + BPIALL
LowHigh
CacheFlush
Low
Cache
High
Methodology: Prime and Probe
DAC, Las Vegas, 5 June 201921 |
OutputSignal
2. Touch n cache lines
1. Fill cache with own data
3. Traverse cache, measure execution timeInput
Signal
High LowTrojan
encodesSpy
observes
Methodology: Channel Matrix
DAC, Las Vegas, 5 June 201922 |
7000 8000 9000
10000 11000 12000
0 10 20 30 40 50 60Pro
bin
g tim
e (
cycl
es)
Cache sets accessed
datafile using 1:2:($3>pmax ? pmax : $3)
0 0.005 0.01 0.015 0.02 0.025 0.03 0.035 0.04
Horizontal variation indicates
channel
Raw I-cache channelIntel Sandy Bridge
Channel Matrix:• Conditional probability of
observing time, t, given input, n.• Represented as heat map:
• bright = high probability
I-Cache Channel With Full State Flush
DAC, Las Vegas, 5 June 201923 |
60000
61000
62000
63000
64000
0 10 20 30 40 50 60
Tim
e (
cycl
es) datafile using 1:2:3
0.001
0.01
Intel Sandy Bridge
12500
13000
13500
14000
0 2 4 6 8 10
Tim
e (
cycl
es) datafile using 1:2:3
0.001
0.01
Intel Haswell
7000
8000
9000
10000
11000
0 10 20 30 40 50 60
Outp
ut (c
ycle
s)
Input (sets)
datafile using 1:2:3
0.00010
0.00100 Intel Skylake
90000
92000
94000
0 5 10 15 20 25 30 35 40
Tim
e (
cycl
es)
Cache sets
datafile using 1:2:3
0.00010
0.00100 HiSilicon A53
CHANNEL!
CHANNEL!
No evidence of channel
SMALL CHANNEL!
HiSilicon A53 Branch History Buffer
DAC, Las Vegas, 5 June 201924 |
0 1
10-1
10-3
10-2
10-4
10-5400
600
800
1000
Trojan signalSpy
exec
utio
n tim
e
Branch history buffer (BHB)• One-bit channel• All reset operations applied
Channel!
Intel Haswell Branch Target Buffer
DAC, Las Vegas, 5 June 201925 |
31000
32000
33000
34000
3500 4000 4500 5000
Tim
e (
cycl
es) datafile using 1:2:3
0.001
0.01
Spy
exec
utio
n tim
e
Trojan cache footprintChannel!
Found residual channels in all recent Intel and ARM processors examined!
Branch target buffer• All reset operations
applied
Intel Spectre Defences
DAC, Las Vegas, 5 June 201926 |
Intel added indirect branch control (IBC) feature, which closes most channels, but…
Intel SkylakeBranch history buffer
Small channel!
https://ts.data61.csiro.au/projects/TS/timingchannels/arch-mitigation.pml
Requirements on Hardware
DAC, Las Vegas, 5 June 201927 |
New HW/SW Contract: aISA
For all shared microarchitectural resources:1. Resource must be spatially partitionable or flushable2. Concurrently shared resources must be spatially partitioned3. Resource accessed solely by virtual address must be flushed and not
concurrently accessed• Implies cannot share HW threads across security domains!
4. Mechanisms must be sufficiently specified for OS to partition or reset5. Mechanisms must be constant time, or of specified, bounded latency6. Desirable: OS should know if resettable state is derived from data,
instructions, data addresses or instruction addresses
Augmented ISA supporting time protection
DAC, Las Vegas, 5 June 201928 |