XenTT: Deterministic Systems Analysis in Xen

Anton Burtsev, David Johnson, Chung Hwan Kim, Mike Hibler, Eric Eide, John Regehr
University of Utah

DESCRIPTION

This talk details XenTT, an open-source framework for deterministic replay and systems analysis in development at the University of Utah. The framework consists of two main parts: a set of Xen extensions that implement efficient, deterministic replay, and a powerful analysis engine that extracts information from systems during replay executions.

Deterministic replay promises to change how people analyze and debug software systems. As software stacks grow in complexity, traditional ways of understanding failures, explaining anomalous executions, and analyzing performance are reaching their limits in the face of emergent behavior, unrepeatability, cross-component execution, software aging, and adversarial changes to code. Replay-based, whole-system analyses offer precise solutions to these problems.

XenTT extends Xen with the ability to replay and analyze the execution of VM guests. A number of careful design choices ensure that our implementation, which supports single-CPU, paravirtual, Linux guests, is efficient, maintainable, and extensible. XenTT's run-time checks and offline log-comparison tools enabled us to efficiently scale the recording layer by detecting and debugging errors in the determinism of replay.

Our analysis engine seeks to overcome the semantic gap between an analysis algorithm and the low-level state of a guest. Using debug information to reconstruct functions and data structures within the guest, the engine provides a convenient API for implementing systems analyses. The engine implements a powerful debug-symbol and VM introspection library, which enables an analysis to access the state of the guest through familiar terms. To further simplify the development of new analyses, the engine provides primitives that support common exploration patterns, e.g., breakpoints, watchpoints, and control-flow integrity checking. To enable performance analyses of recorded executions, XenTT provides a performance modeling interface, which faithfully replays performance parameters of the original run.

Beyond describing the design and implementation of XenTT, this talk will present examples of how we have used deterministic replay to implement security and performance analyses.

Page 1: XenTT: Deterministic Systems Analysis in Xen

XenTT: Deterministic Systems Analysis in Xen

Anton Burtsev, David Johnson, Chung Hwan Kim, Mike Hibler, Eric Eide, John Regehr
University of Utah

Page 2: XenTT: Deterministic Systems Analysis in Xen

• Record execution history of a guest VM
• Recreate it in an instruction-accurate way

Page 3: XenTT: Deterministic Systems Analysis in Xen

• Execution replay is the right way to analyze systems
• We need a practical tool!

Page 4: XenTT: Deterministic Systems Analysis in Xen

Deterministic replay

Page 5: XenTT: Deterministic Systems Analysis in Xen

Determinism

CPU is deterministic

Execution history

Interrupt, device I/O

Page 6: XenTT: Deterministic Systems Analysis in Xen

Recording

Event log

• Determinism of the execution environment
• Instruction-accurate position of events

Page 7: XenTT: Deterministic Systems Analysis in Xen

Instruction-accurate position of events

Position = {EIP, branch #, ECX}

Example instruction stream:
label: ... mov, shr, mov, rep movsl, test, ... jne label

• Number of instructions since boot
• Intel has a hardware counter
• It’s not accurate
• Hardware instruction counter
• Preempt execution of a system at the same instruction
• Hardware instruction counter + single-stepping
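The position triple above can be sketched as a small C structure. This is only an illustration: the field widths, the function names, and the `slack` parameter are assumptions, not XenTT's actual definitions. The branch count narrows down the loop iteration, EIP picks the instruction, and ECX distinguishes iterations of a rep-prefixed instruction such as `rep movsl`, which repeats at one EIP without taking a branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the slides only name the three components. */
struct position {
    uint32_t eip;       /* address of the next instruction */
    uint64_t branches;  /* taken branches since the start of recording */
    uint32_t ecx;       /* remaining iterations of a rep-prefixed instruction */
};

/* Returns nonzero when the replayed guest is at exactly the recorded spot. */
int position_equal(const struct position *a, const struct position *b)
{
    assert(a != NULL && b != NULL);
    return a->branches == b->branches && a->eip == b->eip && a->ecx == b->ecx;
}

/*
 * Returns nonzero while replay is still far from the target and may run at
 * full speed. `slack` (an assumed tuning knob) is how many branches before
 * the target the caller wants to fall back to single-stepping.
 */
int position_can_run_freely(const struct position *now,
                            const struct position *target, uint64_t slack)
{
    assert(now != NULL && target != NULL);
    return now->branches + slack < target->branches;
}
```

A replay loop built on this would let the guest run while `position_can_run_freely` holds, then single-step until `position_equal` reports a match, which mirrors the "hardware counter + single-stepping" bullet.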

Page 8: XenTT: Deterministic Systems Analysis in Xen

Determinism in Xen

Page 9: XenTT: Deterministic Systems Analysis in Xen

Nondeterministic events

Simple Model

System = memory pages + registers

Events = memory updates (time, device I/O, system calls)

Control flow updates = registers + stack (interrupts, events)

Page 10: XenTT: Deterministic Systems Analysis in Xen

Some examples
• Instruction emulation (e.g. cpuid, rdtsc, in/out)
  • Return the values of the original run
• Hypercalls
  • Re-execute to ensure determinism of the hypervisor
• Time
  • Shared info page + rdtsc
• Exceptions
  • Deterministic, re-execute
• Interrupts
  • Force re-execution of the interrupt frame (bounce frame) code in entry.S
• Shared info updates
  • Replay original values
• Memory
  • Shadow page tables
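The "return the values of the original run" rule for emulated instructions can be illustrated with a minimal sketch. The log structure, its fixed capacity, and the function name below are hypothetical and not taken from XenTT: during recording the hardware value is logged and passed through, and during replay the hardware value is ignored in favor of the logged one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum mode { MODE_RECORD, MODE_REPLAY };

/* Hypothetical fixed-size log of values returned by emulated instructions. */
struct value_log {
    uint64_t values[64];
    size_t   count;   /* entries written during recording */
    size_t   next;    /* next entry to hand out during replay */
};

/*
 * Emulate one nondeterministic instruction (rdtsc, cpuid, in/out ...).
 * `hw_value` is what the real hardware returned just now. When recording,
 * it is logged and passed through; when replaying, it is ignored and the
 * value of the original run is returned instead.
 * Returns 0 on success, -1 if the log is full or exhausted.
 */
int emulate_nondet(struct value_log *log, enum mode mode,
                   uint64_t hw_value, uint64_t *out)
{
    assert(log != NULL && out != NULL);
    size_t cap = sizeof log->values / sizeof log->values[0];

    if (mode == MODE_RECORD) {
        if (log->count == cap)
            return -1;
        log->values[log->count++] = hw_value;
        *out = hw_value;
        return 0;
    }
    if (log->next == log->count)
        return -1;
    *out = log->values[log->next++];
    return 0;
}
```

Running out of log entries during replay is treated as an error here, since it would mean the replayed guest executed more emulated instructions than the original run did.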

Page 11: XenTT: Deterministic Systems Analysis in Xen

Page 12: XenTT: Deterministic Systems Analysis in Xen

Xen devices

Page 13: XenTT: Deterministic Systems Analysis in Xen

Device interposition

• Devd ensures determinism of updates to the guest’s shared ring buffers

Page 14: XenTT: Deterministic Systems Analysis in Xen

Replay touches many parts of Xen
• Device discovery
  • Devd implements a concept of a device bus
  • Discovers new devices in Xenstore
  • Binds new devices with drivers
• Xenstore transactions
  • During replay, transactions from replayed guest can’t fail
  • They will not be re-executed
• Out-of-order device responses
  • Disk and network responses can arrive out-of-order
• Disk logging
  • Disk payload is deterministic
  • LVM snapshots
• Network logging
  • In-kernel logging of the network payload

Page 15: XenTT: Deterministic Systems Analysis in Xen

Low-overhead logging

Page 16: XenTT: Deterministic Systems Analysis in Xen

Are we sure that the executions are identical?
• Intel Branch Trace Store (BTS) facility
• Record all taken branches in a memory buffer
• Compare original and replay runs
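The comparison step can be sketched as a search for the first branch at which the two traces disagree. The record layout (a source and a destination address per taken branch) and the function name are assumptions for illustration; the real BTS record format and XenTT's comparison tools may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One taken branch: source and destination address (assumed record shape). */
struct branch {
    uint32_t from;
    uint32_t to;
};

/*
 * Returns the index of the first branch at which the two traces differ.
 * If one trace is a prefix of the other, the length of the shorter one is
 * returned. If both are identical, the common length is returned, so the
 * caller can test `result == n_orig && result == n_replay`.
 */
size_t first_divergence(const struct branch *orig, size_t n_orig,
                        const struct branch *replay, size_t n_replay)
{
    assert(orig != NULL && replay != NULL);
    size_t n = n_orig < n_replay ? n_orig : n_replay;
    for (size_t i = 0; i < n; i++) {
        if (orig[i].from != replay[i].from || orig[i].to != replay[i].to)
            return i;
    }
    return n;
}
```

The index of the first mismatch, together with symbol resolution for the `from` address, points at the code where replay stopped being deterministic.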

Page 17: XenTT: Deterministic Systems Analysis in Xen

Analysis Engine and Virtual Machine Introspection

Page 18: XenTT: Deterministic Systems Analysis in Xen

Analysis framework

Page 19: XenTT: Deterministic Systems Analysis in Xen

Virtual Machine Introspection (VMI)

Page 20: XenTT: Deterministic Systems Analysis in Xen

Performance model

• Re-execution approach to performance

• Account for effects of replay

• Translate performance between original and replay runs

$\text{Virt cntr} = \text{Virt cntr}_{\text{start}} + \Delta(\text{Real cntr})$
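One way to read the formula is as a virtual counter that advances only while the guest itself runs. The sketch below follows that reading; the structure, the function names, and the pause/resume split are assumptions for illustration and are not described on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state of one virtualized performance counter. */
struct virt_counter {
    uint64_t virt_start;  /* virtual value when the guest was last resumed */
    uint64_t real_start;  /* real hardware value at that same moment */
};

/* Virt cntr = Virt cntr_start + delta(Real cntr) */
uint64_t virt_counter_read(const struct virt_counter *c, uint64_t real_now)
{
    assert(c != NULL);
    return c->virt_start + (real_now - c->real_start);
}

/*
 * Call when the guest is paused (for example while replay machinery or an
 * analysis runs): fold the guest-only delta into the virtual value.
 */
void virt_counter_pause(struct virt_counter *c, uint64_t real_now)
{
    assert(c != NULL);
    c->virt_start = virt_counter_read(c, real_now);
}

/* Call when the guest resumes: real ticks spent while paused are skipped. */
void virt_counter_resume(struct virt_counter *c, uint64_t real_now)
{
    assert(c != NULL);
    c->real_start = real_now;
}
```

For example, with a virtual start of 1000 and a real start of 5000, a read at real value 5300 yields 1300; if the guest is then paused until real value 9000, a later read at 9100 yields 1400, because the 3700 ticks spent outside the guest are not charged to it.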

Page 21: XenTT: Deterministic Systems Analysis in Xen

Analysis Examples

Page 22: XenTT: Deterministic Systems Analysis in Xen

NFS request processing path

• How much time do requests spend in each subsystem?
• Request tracking
  • Address of the kernel data structure is a unique identifier
  • Join identifiers when requests move between subsystems

Page 23: XenTT: Deterministic Systems Analysis in Xen

Execution context tracking
• Execution context
  • Context switches
    • schedule.switch_tasks
  • User/kernel
    • System call transitions
    • system_call
  • Interrupts and exceptions
    • do_IRQ
    • do_page_fault
    • do_* (divide_error, debug, nmi, int3...)
• Make analysis context aware
  • Filter probes by context, e.g.
    • All pagefaults from the process “foo”
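Filtering probes by context can be pictured as a predicate evaluated each time a probe fires. The sketch below is illustrative only: the structure and function names are hypothetical and do not reflect the XenTT analysis API. The 16-byte name buffer mirrors the size of `comm` in the Linux `task_struct`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot of the guest's current execution context. */
struct exec_context {
    int  pid;
    char comm[16];   /* process name, as Linux keeps in task_struct.comm */
    int  in_kernel;  /* nonzero between system-call entry and exit */
    int  in_irq;     /* nonzero while an interrupt handler runs */
};

/* A filter: fields left at their "any" value match every context. */
struct context_filter {
    const char *comm;   /* NULL matches any process name */
    int in_irq;         /* -1 matches any, otherwise 0 or 1 */
};

/* Returns nonzero if a probe hit in `ctx` should be delivered. */
int context_matches(const struct exec_context *ctx,
                    const struct context_filter *f)
{
    assert(ctx != NULL && f != NULL);
    if (f->comm != NULL && strncmp(ctx->comm, f->comm, sizeof ctx->comm) != 0)
        return 0;
    if (f->in_irq != -1 && (ctx->in_irq != 0) != (f->in_irq != 0))
        return 0;
    return 1;
}
```

A probe on the page-fault handler combined with a filter whose `comm` is "foo" would correspond to the slide's example of all page faults from the process "foo".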

Page 24: XenTT: Deterministic Systems Analysis in Xen

Control Flow Integrity (CFI)

• Simple CFI model:
  • Returns should match calls
  • Dynamic calls are “sane”, e.g. within kernel address space
• Detect ROP attacks, stack smashing, etc.
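The simple CFI model above can be sketched with a shadow stack of return addresses. This is a minimal illustration under stated assumptions: the names, the fixed stack depth, and the kernel base of 0xC0000000 (the usual split for 32-bit Linux) are not taken from XenTT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 256

/* Assumed 32-bit address split: kernel addresses start at 0xC0000000. */
#define KERNEL_BASE 0xC0000000u

struct shadow_stack {
    uint32_t ret[SHADOW_DEPTH];
    size_t   top;
};

enum cfi_result { CFI_OK, CFI_BAD_TARGET, CFI_BAD_RETURN, CFI_OVERFLOW };

/* On a call: check the target is "sane" and remember the return address. */
enum cfi_result cfi_on_call(struct shadow_stack *s, uint32_t target,
                            uint32_t return_addr)
{
    assert(s != NULL);
    if (target < KERNEL_BASE)
        return CFI_BAD_TARGET;
    if (s->top == SHADOW_DEPTH)
        return CFI_OVERFLOW;
    s->ret[s->top++] = return_addr;
    return CFI_OK;
}

/* On a return: the destination must match the most recent call. */
enum cfi_result cfi_on_return(struct shadow_stack *s, uint32_t dest)
{
    assert(s != NULL);
    if (s->top == 0 || s->ret[s->top - 1] != dest)
        return CFI_BAD_RETURN;
    s->top--;
    return CFI_OK;
}
```

Under this sketch, a kernel call whose target is address 0x0 is reported as `CFI_BAD_TARGET`, and a return to an address that no pending call recorded is reported as `CFI_BAD_RETURN`.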

Page 25: XenTT: Deterministic Systems Analysis in Xen

Execution trace

sys_sendfile
do_sendfile
fget_light
rw_verify_area
fget_light
rw_verify_area
shmem_file_sendfile
do_shmem_file_read
shmem_getpage
find_lock_page
radix_tree_lookup
shmem_recalc_inode
shmem_swp_alloc
shmem_swp_entry
kmap_atomic
__kmap_atomic
page_address
kunmap_atomic
find_get_page
radix_tree_lookup
file_send_actor
sock_sendpage
UNKNOWN FUNCTION (addr:0x00000000)

• CFI records a trace of function calls
• sys_sendfile is the last system call before control flow jumps to 0x0

Page 26: XenTT: Deterministic Systems Analysis in Xen

Intrusion backtracking

• Track accesses to “/etc/passwd”
• Probe sys_open
• Filter by file name
• Find process ID, branch counter

Page 27: XenTT: Deterministic Systems Analysis in Xen

Intrusion backtracking: pass 1

• Process or its parent escalated privileges
• Watch write accesses to &task->uid
• Filter by parents of the offending process

Page 28: XenTT: Deterministic Systems Analysis in Xen

Intrusion backtracking: pass 1

• Find the syscall inside which privileges are escalated
• Probe sys_* - all system call entry and exit points
• Filter by the offending process ID

Page 29: XenTT: Deterministic Systems Analysis in Xen

Intrusion backtracking: pass 1

• Privilege escalation is a CFI violation
• Start CFI analysis from the last system call
• Find %EIP at which CFI is violated, and location of the shell code (0x0)

Page 30: XenTT: Deterministic Systems Analysis in Xen

Intrusion backtracking: pass 1

• Find at which point address 0x00000000 gets mapped
• Probe do_page_fault

Page 31: XenTT: Deterministic Systems Analysis in Xen

More mechanisms
• Execution traces
  • BTS trace of all taken branches
  • Instruction traces
• Memory (variable) access traces
  • Intersect with the execution trace
  • See where variables get accessed

Page 32: XenTT: Deterministic Systems Analysis in Xen

How much overhead?

Page 33: XenTT: Deterministic Systems Analysis in Xen

• 32-bit x86 PV guests
  • xen-unstable near v3.0.4
  • We rely on working shadow page tables
• 1-CPU time-traveling guests
  • No SMP replay
  • Dom0 and Xen are SMP of course
• Test machine
  • 4 cores
  • 1Gbps network
  • 130 MB/s disks

Page 34: XenTT: Deterministic Systems Analysis in Xen

CPU

[Bar chart: benchmark score (axis 0 to 140), Xen vs. XenTT, for gnugo.tactics, ogg.tibetan-chant, bzip2.decompress-9, scimark2.small, dcraw.d300, povray.reduced, openssl.aes, and bzip2.compress-9]

Page 35: XenTT: Deterministic Systems Analysis in Xen

Network throughput

[Bar chart: throughput in MB/s (axis 0 to 140) for TCP Send and TCP Receive; series: Xen, XenTT, XenTT (32MB buffers)]

Page 36: XenTT: Deterministic Systems Analysis in Xen

Network delay

[Bar chart: ping time in ms (axis 0 to 0.16) for domU to dom0 and domU to external host; series: Xen, XenTT]

Page 37: XenTT: Deterministic Systems Analysis in Xen

Disk I/O

[Bar chart: throughput in MB/s (axis 0 to 140) for Serial write and Serial read; series: Xen, XenTT]

Page 38: XenTT: Deterministic Systems Analysis in Xen

                                     Raw            Compressed (gzip)

Linux boot
  Event log                          4.3 GB         0.8 GB

Idle overnight (14 hours)
  Event log                          6.2 GB         1.6 GB
  Growth rate                        440 MB/hour    114 MB/hour (2.7 GB/day)

TCP receive (1.63 GB data stream)
  Event log                          1.76 GB        342 MB
  Payload log                        1.79 GB        Payload dependent

Disk write (4 GB file)
  Event log                          1.8 GB         350 MB

Disk read (4 GB file)
  Event log                          0.59 GB
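The growth rates reported for the idle overnight run follow from the log sizes and the 14-hour duration. The arithmetic below assumes decimal units (1 GB = 1000 MB), which is what reproduces the slide's rounded figures:

```latex
\begin{align*}
\text{raw growth} &= \frac{6.2\ \text{GB}}{14\ \text{h}} \approx 0.44\ \text{GB/h} = 440\ \text{MB/h} \\
\text{compressed growth} &= \frac{1.6\ \text{GB}}{14\ \text{h}} \approx 0.114\ \text{GB/h} = 114\ \text{MB/h} \\
\text{compressed per day} &= 114\ \text{MB/h} \times 24\ \text{h} \approx 2.7\ \text{GB}
\end{align*}
```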

Page 39: XenTT: Deterministic Systems Analysis in Xen

Lessons learnt

Page 40: XenTT: Deterministic Systems Analysis in Xen

Scaling development
• Extending Xen with BTS support
  • Debug crashes in Xen, and dom0
• Execution comparison tools
  • BTS traces to understand what went wrong
  • Support for resolving symbols
• Run-time comparison tools
  • Compare guest’s state between original and replay runs
• Trace from all parts of your system
  • Xen, dom0, domU
• Support performance tracing
  • Xentrace messages

Page 41: XenTT: Deterministic Systems Analysis in Xen

What we didn’t predict
• I/O delay goes up
  • Not sure if Linux has adequate low-latency user-level processing support
  • Maybe need an in-kernel interposition component
• Branch counters are fragile
  • Our code works on several server CPUs
  • Fails on a laptop with the CPU from the same model/family line

Page 42: XenTT: Deterministic Systems Analysis in Xen

Conclusions

Page 43: XenTT: Deterministic Systems Analysis in Xen

Practical replay analysis is feasible
• Performance overheads are reasonable
  • Realistic systems
  • Realistic workloads
  • Minor setup costs (just install Xen)
• Analysis engine is an amazing tool
  • And we’re growing it
• We need your help to port it upstream
  • Porting effort
  • Shadow memory for PV guests
  • Support for HVM guests

Page 44: XenTT: Deterministic Systems Analysis in Xen

Thank you.

All code is GPLv2 and will be available soon (available on request now).

Questions: [email protected]

Page 45: XenTT: Deterministic Systems Analysis in Xen

Backup slides

Page 46: XenTT: Deterministic Systems Analysis in Xen

Execution environment

Deterministic Machine

System
• CPU
• Virtual hardware
• Memory
• Disk
• Software

External Events
• External I/O

Page 47: XenTT: Deterministic Systems Analysis in Xen

How do we find nondeterministic events?

Page 48: XenTT: Deterministic Systems Analysis in Xen

Device interface

Page 49: XenTT: Deterministic Systems Analysis in Xen

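Reading a local variable

/* Look up the debug symbol for the local variable skb in netif_poll */
bsym_skb = target_lookup_sym(probe->target, "netif_poll.skb", ".", NULL, flags);

/* Load the current value of that variable from the guest */
lval_skb = bsymbol_load(bsym_skb, flags);

/* The loaded buffer holds the pointer value: the address of the skb */
skb_addr = *(unsigned long*) lval_skb->buf;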

Page 50: XenTT: Deterministic Systems Analysis in Xen

Controlled re-execution or replay
• Types of non-deterministic events
  • Synchronous
    • Hypercalls
  • Asynchronous
    • Interrupts
  • Best effort
    • Time updates
• Branch counters
  • The biggest problem of this solution