Upload
jensen
View
77
Download
2
Tags:
Embed Size (px)
DESCRIPTION
Practical Timing Side Channel Attacks Against Kernel Space ASLR. Fall 2014 Instructor: Kun Sun, Ph.D. Outline. Background Side Channel Attack Kernel Space ASLR Use timing-based side channel to defeat kernel space ASLR (A heavy engineering work) - PowerPoint PPT Presentation
Citation preview
Practical Timing Side Channel Attacks Against Kernel Space ASLR
Fall 2014Instructor: Kun Sun, Ph.D.
2
Outline Background
Side Channel Attack Kernel Space ASLR
Use timing-based side channel to defeat kernel space ASLR (A heavy engineering work)
How to prevent/mitigate this side channel attack?
3
Outline Background
Side Channel Attack Kernel Space ASLR
Use timing-based side channel to defeat kernel space ASLR
How to prevent/mitigate this side channel attack?
Side Channel
Side channel vs. Covert channel Side channel: incidentally,
unintentionally Covert channel: colluding,
intentionally A side channel is a communication
channel based on the use of system resources not normally intended for communication between the subjects (processes) in the system
Side Channel Types Computer systems abound with side
channels Storage Channels
use system variables and attributes (other than time) to signal information
classic example is resource exhaustion channel
Timing Channels vary the amount of time required to
complete a task to signal information classic example is load sensing channel
Side channels are typically noisy but information theory techniques can be used to achieve error-free communication
Resource Exhaustion Channel
Given 5MB pool of dynamically allocated Memory
High Process bit = 1 request 5MB of memory bit = 0 request 0MB of memory
Low Process request 5MB of memory if allocated then bit = 0 otherwise bit =
1
Load Sensing Channel
High Process bit = 1 enter computation intensive
loop bit = 0 go to sleep
Low Process perform a task with known
computational requirements if completed quickly then bit = 0
otherwise bit =1
Mitigating Side Channel
Audit for storage-based Time normalization for timing-
based Slow it down, reduce the
transmission date rate
9
Outline Background
Side Channel Attack Kernel Space ASLR
Use timing-based side channel to defeat kernel space ASLR
How to prevent/mitigate this side channel attack?
ASLR Deployment
Windows user space and kernel space ASLR since vista.
Pax patches in Linux (not in kernel) MacOS since 10.5 Mobile OSes including Android and
iOS
User Space ASLR vs. Kernel Space ASLR
Shacham et al. showed that user mode ASLR on 32-bit architectures only leaves 16 bit of randomness, which is not enough to defeat brute-force attacks.
However, such brute-force attacks are not applicable for kernel space ASLR. An attacker wants to exploit a
vulnerability in kernel code, a wrong offset will cause a complete crash of the system and thus an attacker has only one attempt to perform an exploit.
Windows Kernel Space ASLR
For both 32-bit and 64 bit Vista, Windows 7, and Windows 8. Obtain the info by reverse-engineering During boot process, Windows loader allocates kernel_region
for the kernel image and the hardware abstraction layer (HAL)
The entropy is 6 bit, 32*2=64=26
13
Outline Background
Side Channel Attack Kernel Space ASLR
Use timing-based side channel to defeat kernel space ASLR
How to prevent/mitigate this side channel attack?
Timing-based Side Channel Attack
Target at how a local attacker with restricted rights can mount a timing-based side channel attack against the memory management system to deduce information about the privileged kernel space ASLR.
Attack Model
System is protected by ASLR and Local adversary can run user-space
applications, but cannot execute privileged kernel mode code.
Challenges on Exploiting the Attack
Timing attacks against a modern CPU are complicated. Performance optimizations including
hardware prefetching, speculative execution, multi-core architectures, or branch prediction.
Target on modern CPU and new OS Previous work focused on older
processors without such optimization.
TLB and Cache Translation lookaside buffer: convert virtual address to physical
address Virtually indexed, physically tagged
http://www.cs.princeton.edu/courses/archive/fall04/cos471/lectures/19-VirtualMemory.pdf
Intel i7 Memory Architecture plus Clock Latency
General Approach Two steps:
1. Set the system in a specific state from user mode.
2. Measure the duration of a certain memory access operation. The time span of this operation then (possibly) reveals certain information about the kernel space layout.
One example: Cache a system call handler by executing sys
call. Access a designated set of user space
addresses and execute the system call again. If the system call takes longer than expected,
the access of user space addresses evicted the system call handler code and reveals address info of system code.
Handling Noise
Circumvent performance optimizations E.g., test memory in non-
consecutive order to circumvent memory pre-fetching.
OS Scheduler, shared cache Two steps requires 5000 clock
cycles OS time slides several ms
Guarantee to find cases with No Interrupts between two steps.
Three Attacks
Attacks on L1/L2/L3 caches and TLB/PS caches Cache probing Double page fault Cache preloading
Cache Probing Based on multiple memory addresses
mapped into the same cache set and compete for available slots
n-way set associative mode. all available slots are grouped into sets of the
size n and each memory chunk can be stored in all slots of one particular set.
E.g., in 32-bit address, a typical L3 cache of 8 MB is 16-way set associative. It consists of (8, 192∗1, 024)/64 = 131, 072 single slots that are grouped into 131, 072/16 = 8, 192 different sets.
Reverse-engineering Sandybridge Hash Function for L3 parallel access among multiple cores
Double Page Fault
Based on the observation on Intel CPUs When the page translation was
successful, but the access permission check fails (e.g., when kernel space is accessed from user mode), a TLB entry is indeed created.
Do not work on AMD CPUs Can be easily fixed by hardware
redesign.
Double Page Fault First page fault: for each kernel space page p,
access p from user mode and a page fault will be handled by the operating system and forwarded to the exception handler of the process. When p refers to an allocated page, the MMU
creates a TLB entry for p although the succeeding permission check fails.
When p refers to an unallocated page, since the translation fails, the MMU does not create a TLB entry for p.
Second page fault: accessing p again, if p refers to an allocated kernel page, the page fault will be delivered faster due to the inherent TLB hit.
Cache Preloading Double page fault estimates the allocation map of
the kernel space, but do not determine at which concrete base addresses the kernel and drivers are loaded to.
3 steps in Cache Preloading: flush all caches (i.e., address translation and
instruction/data caches) to start with a clean state.
preload the address translation caches by indirectly calling into kernel code, for example by issuing a sysenter operation.
Intentionally generate a page fault by jumping to some kernel space address and measure the time. If the faulting address lies in the same memory range as the preloaded kernel memory a shorter time
Testbed
Tested on 32-/64-bit systems running Windows 7/Linux
Tested on different CPUs + VM: Intel i7-870 (Nehalem/Bloomfield, Quad-
Core) Intel i7-950 (Nehalem/Lynnfield, Quad-
Core) Intel i7-2600 (Sandybridge, Quad-Core) AMD Athlon II X3 455 (Triple-Core) VMWare Player 4.0.2 on Intel i7-870
(with VT-x)
Results
28
Outline Background
Side Channel Attack Kernel Space ASLR
Use timing-based side channel to defeat kernel space ASLR (A heavy engineering work)
How to prevent/mitigate this side channel attack?
Mitigation Approaches
Root cause: memory hierarchy is a shared resource
Also, no enough randomization: 64 different slots for windows kernel
Option one: split caches for user and kernel. Shortcoming: checking overhead,
maximum cache size reduced Option two: suppress the creation of TLB
entries on successful address translation if an access violation happens
Option three: Timing normalization; modify the execution time of the page fault handler
Future works
Test MacOS Test Android Obtain the physical address of a
certain memory location from user mode.
Questions?