An Integrated Framework for Dependable and Revivable
Architecture Using Multicore Processors
Weidong Shi, Motorola Labs
Hsien-Hsin “Sean” Lee, Georgia Tech
Laura Falk, University of Michigan
Mrinmoy Ghosh, Georgia Tech
Problem Statement
• Highly Available, Reliable, and Revivable networked services.
• Explore new programming and usage models for Multi-core processors
• Provide “architectural support” for network services to be:
– Autonomic
– Remote-exploit revivable
– Self-recoverable
• Achieve high performance
Toward Self-recovery Network Services
Causes of Network Service Loss
• Accidental: Transient, Heisenbugs, Damage, Aging
• Intentional: DoS, Buffer Overflow
Solutions
• Replication
• Rejuvenation
• Checkpoint
• Remote Exploit Self-recovery
Multicore: An ideal platform
• Exploit insulation: each core of a multicore processor can be programmed to run at a different privilege level with a different OS.
[Figure: dual-core (Merom) processor with a Server Core and a Monitor Core sharing the L2 cache]
• Tight coupling of cores compared with SMP: fine-grained processor state monitoring
• Concurrent monitoring, efficient state backup and recovery
• Massively multicore processors will have many idle cores
INDRA: A Dependable and Revivable Architecture
[Figure: INDRA block diagram]
Components shown:
• Server Core (network apps): IL1 cache, DL1 cache, Trace Filter, Trace FIFO
• Monitor Core: IL1 cache, DL1 cache, monitor functions (insulation, issue recovery, control)
• L2 cache
• Memory interface watchdog
• Physical memory space (used by service OS and applications)
• Protected memory space (monitor BIOS, OS, and SW)
• Monitor checks: code origin check, CFG check
• Control signals
Monitor Core: Insulated Parallel Inspection [Kiriansky et al., USENIX 2002]
[Figure: example attack across a data page and code pages]
    FunctionA() { Vuln_func(); A = 3; }
    Vuln_func() { /* Attack!! Return address changed */ }
    Malicious_func() { }
Checks and actions shown:
• Code Origin Check
• Control Flow Graph Check
• Exception Handling
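
The two checks named on this slide can be sketched in software as follows. This is only an illustration under stated assumptions, not the INDRA hardware or the method of Kiriansky et al.: the page layout, the addresses, and the names `CODE_PAGES`, `CFG_EDGES`, `code_origin_ok`, `cfg_ok`, and `inspect` are all hypothetical. It models the slide's example, where Vuln_func should return into FunctionA but its overwritten return address points at Malicious_func.

```python
PAGE_SIZE = 4096  # assumed page size

# Hypothetical layout: pages 1 and 2 hold code, page 8 holds data.
CODE_PAGES = {1, 2}

# Hypothetical addresses for the functions named on the slide.
FUNCTION_A_RETURN_SITE = 1 * PAGE_SIZE + 0x40  # instruction after the call in FunctionA
VULN_FUNC_RETURN = 2 * PAGE_SIZE + 0x80        # the return instruction in Vuln_func
MALICIOUS_FUNC = 8 * PAGE_SIZE + 0x10          # injected code placed in a data page

# Allowed control transfers: source address -> set of legal targets.
CFG_EDGES = {VULN_FUNC_RETURN: {FUNCTION_A_RETURN_SITE}}


def code_origin_ok(target: int) -> bool:
    """Code origin check: the target instruction must live in a code page."""
    return target // PAGE_SIZE in CODE_PAGES


def cfg_ok(source: int, target: int) -> bool:
    """Control flow graph check: the transfer must be a known CFG edge."""
    return target in CFG_EDGES.get(source, set())


def inspect(source: int, target: int) -> str:
    """Return 'ok' or the name of the first check that fails."""
    if not code_origin_ok(target):
        return "code origin violation"
    if not cfg_ok(source, target):
        return "control flow violation"
    return "ok"


if __name__ == "__main__":
    print(inspect(VULN_FUNC_RETURN, FUNCTION_A_RETURN_SITE))  # legitimate return
    print(inspect(VULN_FUNC_RETURN, MALICIOUS_FUNC))          # hijacked return
```

A transfer into existing code that is not a legal edge (for example, a return to the start of page 1) passes the origin check but fails the CFG check, which is why the slide lists both.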
Server Core: Request Based Recovery
• Issue state backup request
• Read network request (e.g., request for page arch.ece.gatech.edu)
• Process network request
• Monitor signalled error?
– No: continue with the next request
– Yes: restore checkpointed state
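
The flow above can be sketched as a simple service loop. This is a software analogy under assumptions, not the INDRA mechanism: the deep copy stands in for the hardware state backup, and `MonitorError`, `handle`, and `serve` are hypothetical names introduced for illustration.

```python
import copy


class MonitorError(Exception):
    """Raised when the (hypothetical) monitor flags a request as an exploit."""


def handle(state: dict, request: str) -> None:
    """Toy request handler: mutates server state and fails on a bad request."""
    state["served"].append(request)
    if "exploit" in request:
        state["corrupted"] = True  # damage done before the monitor reacts
        raise MonitorError(request)


def serve(requests: list) -> dict:
    """Process requests one by one, rolling back any request the monitor rejects."""
    state = {"served": [], "corrupted": False}
    for request in requests:
        checkpoint = copy.deepcopy(state)  # issue state backup request
        try:
            handle(state, request)         # read and process network request
        except MonitorError:
            state = checkpoint             # restore checkpointed state
    return state


if __name__ == "__main__":
    print(serve(["GET /", "exploit payload", "GET /index.html"]))
```

Because the checkpoint is taken per request, a rejected request leaves no trace in the state seen by later requests.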
Comparison of Backup and Recovery
Approach                 Backup                            Recovery
Software checkpointing   Slow                              Fast, modify page translation
Memory Update Log        Fast                              Log based undo, slow
Virtual Checkpointing    Copy dirty page on demand, slow   Fast, modify TLB entry
INDRA                    Fast, no page copy                Fast, no page copy
INDRA Backup Page Record
[Figure: TLB extension for backup and rollback. The processor side shows the Global Timestamp Register (GT=4) and the modified TLB; the memory side shows the active page, the backup page, and the backup page record.]
Modified TLB entry fields:
• Tag
• Active Page (Physical Address)
• Backup Page (Physical Address)
• Dirty Block Bitvector
• Rollback Bitvector
• Rollback Valid
• Local Timestamp
Backup page record fields:
• Dirty Block Bitvector
• Backup Page (Physical Address)
• Local Timestamp
• Rollback Bitvector
• Rollback Valid
INDRA Recovery Example
[Figure sequence: the modified TLB (TLB extension for backup and rollback) with its backup record, the active page, the backup page, and the Global Timestamp Register (GT)]
• Request n (GT=5): write memory line 7
• Request n (GT=5): write memory line 2
• Request n (GT=5): failure signal; restore system resource allocation and process context; Rollback Valid = 1 from this point on
• Request n+1 (GT=5): read memory line 7
• Request n+1 (GT=5): write memory line 1
• Handle next request: GT=6; record system resource allocation and process context
• Request n+2 (GT=6): write memory line 4
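
The sequence above can be replayed in a small software model. The slides name the fields but do not spell out the exact update rules, so the rules below are one plausible reading and should be treated as assumptions: a line is copied to the backup page on its first write within a request, a failure marks the dirty lines in the rollback bitvector without copying anything, and a marked line is restored lazily when it is next read. `PageRecord`, `replay`, the 8-line page, and whole-line writes are hypothetical simplifications.

```python
from dataclasses import dataclass, field

LINES_PER_PAGE = 8  # assumed; small so the example is easy to trace


@dataclass
class PageRecord:
    """One page's backup state, using the field names from the slides."""
    active: list                                   # active page, one value per line
    backup: list = field(default_factory=lambda: [None] * LINES_PER_PAGE)
    dirty: int = 0                                 # dirty block bitvector
    rollback: int = 0                              # rollback bitvector
    rollback_valid: bool = False
    local_ts: int = 0                              # local timestamp

    def write(self, gt, line, value):
        bit = 1 << line
        if self.local_ts != gt:
            # First write in a new request: dirty bits of the old request expire.
            self.dirty = 0
            self.local_ts = gt
        if self.rollback & bit:
            # The backup line already holds the good value: drop the pending restore.
            self._clear_rollback(bit)
        elif not (self.dirty & bit):
            self.backup[line] = self.active[line]  # copy one line, never the page
        self.dirty |= bit
        self.active[line] = value

    def read(self, line):
        bit = 1 << line
        if self.rollback & bit:
            self.active[line] = self.backup[line]  # lazy restore on first touch
            self._clear_rollback(bit)
        return self.active[line]

    def fail(self, gt):
        """Failure signal: mark this request's dirty lines for rollback."""
        if self.local_ts == gt:
            self.rollback |= self.dirty
        self.rollback_valid = self.rollback != 0

    def _clear_rollback(self, bit):
        self.rollback &= ~bit
        self.rollback_valid = self.rollback != 0


def replay():
    """Replay the operations from the slides on a page with a stale timestamp."""
    page = PageRecord(active=[f"old{i}" for i in range(LINES_PER_PAGE)], local_ts=3)
    gt = 5
    page.write(gt, 7, "n7")   # request n
    page.write(gt, 2, "n2")
    page.fail(gt)             # failure signal
    seen = page.read(7)       # request n+1: line 7 comes back from the backup page
    page.write(gt, 1, "m1")
    gt = 6                    # handle next request
    page.write(gt, 4, "p4")   # request n+2
    return page, seen


if __name__ == "__main__":
    page, seen = replay()
    print(seen, page.active, bin(page.rollback))
```

In this model neither backup nor recovery copies a whole page, which matches the "no page copy" entries in the comparison table. Line 2 stays marked in the rollback bitvector until something touches it.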
Test Bed (Bochs + TAXI [Vlaovic & Davidson, ICCD’02])
[Figure: Bochs + TAXI runs on the host OS and simulates the Linux network server together with the monitor (stripped-down OS, security SW, 10MB); network requests go in and server responses come out]
• Run production OS with real service applications, httpd, ftpd, bind, sendmail, etc.
• Recoverability evaluated by applying real x86 remote exploits from security websites.
• Experiment with documented exploits
Inter-Request Interval (# of Instructions)
[Figure: Average Network Request Interval (instructions per request); y-axis from 0 to 2,500,000]
I-Cache Miss Rate
[Figure: L1 Miss Rate for ftp, http, bind, sendmail, imap, nfs, and average; y-axis from 0.0% to 4.0%]
• Code Origin Check reads traces of code read from L2 Cache
• Number of Instructions in the Trace is Proportional to L1 I Cache Miss Rate
• Overhead of monitoring code origin depends on L1 I Cache Miss Rate
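
The proportionality stated above can be made concrete with a short calculation. This is a back-of-the-envelope sketch, not a result from the talk: it assumes one instruction-cache access per instruction, and the function name `trace_entries` and the input values are hypothetical.

```python
def trace_entries(instructions: int, icache_miss_rate: float) -> float:
    """Estimate how many L2-to-IL1 fills the monitor sees for one request.

    Assumes one I-cache access per instruction, so every miss produces one
    trace entry for the code origin check.
    """
    return instructions * icache_miss_rate


if __name__ == "__main__":
    # Hypothetical request of one million instructions at two miss rates.
    for rate in (0.01, 0.02):
        print(rate, trace_entries(1_000_000, rate))
```

Doubling the miss rate doubles the trace volume, so services with a larger instruction footprint put proportionally more load on the monitor core.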
Monitoring Overhead
[Figure: Request Response Time Slowdown; y-axis from 0 to 1.2]
Sensitivity of Monitoring Queue Size
[Figure: Queue Size vs. Performance; x-axis: Queue Size (8, 16, 32, 64, 128); y-axis: Slowdown, from 1 to 1.6]
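
Why queue size affects slowdown can be illustrated with a toy model of a bounded trace queue between the two cores. This is not the simulator used in the talk and does not reproduce its measurements: the bursty producer, the fixed drain rate, and all parameter values are assumptions chosen for illustration, and `slowdown` is a hypothetical name.

```python
def slowdown(queue_size, work=10_000, period=100, burst=40, service=2):
    """Ratio of wall-clock cycles to useful server cycles for a bounded queue.

    The server emits one trace entry per cycle during the first `burst`
    cycles of every `period` cycles of its own progress. The monitor drains
    one entry every `service` wall-clock cycles. A full queue stalls the server.
    """
    occupancy = 0   # entries currently waiting in the queue
    done = 0        # server progress, in its own cycles
    wall = 0        # elapsed wall-clock cycles
    while done < work:
        if wall % service == 0 and occupancy > 0:
            occupancy -= 1                      # monitor consumes one entry
        produces = (done % period) < burst
        if produces and occupancy == queue_size:
            pass                                # queue full: server stalls this cycle
        else:
            if produces:
                occupancy += 1
            done += 1
        wall += 1
    return wall / work


if __name__ == "__main__":
    for size in (8, 16, 32, 64, 128):
        print(size, slowdown(size))
```

In this model the slowdown falls as the queue grows and reaches 1.0 once the queue can absorb a full burst, which is the qualitative shape a queue size sensitivity study looks for.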
Backup Overhead of Modified Lines
[Figure: Percentage of Modified Lines Requiring Backup; y-axis from 0% to 14%]
Performance of Recovery + Monitoring
[Figure: Slowdown of Service Response Time for ftpd, httpd, bind, sendmail, imap, nfs, and average; series: Monitor+Backup and Monitor+Backup+Rollback; y-axis from 1 to 2.8]
Conclusions
• Real-time exploit monitoring with autonomic recovery increases revivability and availability.
• Multicore architectures are an ideal candidate for a new type of revivable system.
• An INDRA-based multicore system can provide improved reliability and availability.
• More research is required to explore the trade-off between availability, performance, architecture design, and cost.
Questions and Answers
http://arch.ece.gatech.edu
Thank you!