Upload
hadan
View
214
Download
1
Embed Size (px)
Citation preview
Tilman WolfDepartment of Electrical and Computer Engineering
University of Massachusetts Amherst
October 7, 2016
Attacks and Hardware Defenses for Network Infrastructure
Tilman Wolf 2
Computer Networks§ Networks provide connectivity between end-systems
• Use for remote access, data transfers, control, etc.
§ Networking requires common protocols for communication
client
server
network
Tilman Wolf 3
Computer Networks§ Success of the Internet: hourglass architecture
• Very basic services (connectivity, bit pipes, etc.)• Highly diverse set of applications “on top”
§ Success is also a problem• Diverse applications, diverse systems
§ Changing requirements for network layer• New network functionality
− Security, quality-of-service, multicast, reliability, etc.
• New communication paradigms− Content distribution, content addressable networks, data aggregation, etc.
• Application-layer processing in network− Payload transcoding, content-based load balancing, etc.
Layered protocol stack
Physical layer
Link layer
Network layer
Transport layer
Application layer
IP
UDP TCP
HTTP
TLS/SSL
DNS BGP
SIP
Ethernet
DSL FDDI
1000BASE-T
SONET/SDH802.11a/b/g/n
RS-232
...
...
...
...
Example protocols
Tilman Wolf 4
Extended Network Functionality§ Extensions to current Internet
• New functionality in routers− Firewalls, network address translation− Intrusion detection systems− Traffic shaping− Etc.
§ Customization of data plane• Complex per-packet protocol
processing operation • Deployment of new features at runtime• Vendors may compete on features
§ Requires routers with ability to adapt• Programmability is necessary
`
End system:- IP security- TCP termination
Server:- Content-based switching- Firewall- SSL termination- IP security
Access router:- Access concentration (cable, DSL, wireless)- Network address translation- Policy-based QoS- Monitoring and billing- Firewall
Edge router:- Packet classification- QoS (DiffServ)- monitoring and billing
Core router:- Multiprotocol label switching- QoS aware routing- Monitoring
Tilman Wolf 5
Packet Processing on Router§ Protocol processing operations implemented on input port
• Application-Specific Integrated Circuit (ASIC) for simple protocols
• ASICs is fast, butfixed functionality
input ports switch fabric output ports
network interface
packet processing
system switc
h in
terfa
ce
network interface
packet processing
system switc
h in
terfa
ce
switc
h in
terfa
cesw
itch
inte
rface
packet processing
system
packet processing
system
network interface
network interface
...
...
input port
output port
input port
output port
input port
output port
input port
output port
switch fabric
Tilman Wolf 6
Programmable Router§ Flexibility through programmable network processor
• General-purpose processing capability in data path• Packet processing in software
§ High-performance processing hardware• Scalability through high
levels of parallelism§ Example: 40-core
network processors
§ Key challenge:• Security is critical for
network infrastructure • Vulnerabilities due
to software processing
Router
SwitchingFabric
PortPort
Port
Port
Port
Network Processor
Net
wor
k In
terfa
ce Processing Engine
Processing Engine
Processing Engine
Processing Engine
I/O
packets
Interconnect
Protocol processing on network processor
Tilman Wolf 7
Middleboxes and Network Appliances§ Processing also on middleboxes and network appliances
• Standalone nodes (e.g., server-type processor)§ Used with Software-Defined Networking (SDN)
• Network function virtualization (NFV)§ Similar processing environment
• Throughput performance important
• Limited resources to detect / protect from attacks
§ This talk focuseson networkprocessors
`` NF
SDN controller
switch
Tilman Wolf 8
Outline§ Introduction§ Vulnerabilities
• Example attack on network processor§ Defense mechanism
• Hardware monitor§ Extensions
• Multicore hardware monitor and dynamic workloads• Secure loading and avoiding homogeneity• Operating system support
§ Conclusions
Tilman Wolf 9
Network Attack Classification§ Programmable data plane introduces new type of attack
• Hacking of packet processing engine on router • Attack targets network infrastructure
``
End-system
Goal of attack
Data access and modification
Data access and modificaiton
Denial-of-service
Denial-of-service
Control plane
Data access and modificaiton
Data plane
Attack examples
Hacking, phishing, espionage, etc.
Malicious route announcement, DNS cache poisoning, etc.
DNS recursion attack, etc.
Denial-of-service attack via botnets, etc.
Eavesdropping, man-in-the-middle attack, etc.
Defenses
Secure routing protocols (with cryptographic
authentication), secure DNS (DNSSEC), etc.
Virus scanner, firewall, network intrusion
detection system, etc.
Secure network protocols (IPSec, TLS),
etc.
Denial-of-service Exploit of vulnerable packet processing code
Processing monitor, etc.
Focus of this work
Attack target
Tilman Wolf 10
Exploit of Vulnerable Network Processor§ Vulnerability can be exploited to launch attack§ “In-network” denial of service attack
• Router has access to many links with high data rates
• Potentially devastating impact§ Key questions
• Can such vulnerabilities occurin packet processing code?(Yes, we show one example.)
• Can vulnerabilities be exploitedto launch DoS attack?(Yes for one processor type;no for another (crashed instead))
Unprotected software-based
router
Vulnerable packet
processor
Packet forwarding software
Malicious packet
In-network denial of service attack
Tilman Wolf 11
Attack Type§ Overflow attacks
• Malicious data exploits vulnerable code• Often leads to attacker executing arbitrary code• Can be exploited via network
§ National Vulnerability Database (late 2014)• 66,399 vulnerabilities total• 6,518 vulnerabilities that exploit “overflows” via network (approx. 10%)
§ End-system vulnerabilities can be detected• Virus scanner on end-system• Content-inspection firewalls in network
§ Packet processors need custom protection• No processing power for virus scanner• No protection from firewall inside network core
Tilman Wolf 12
Example of Data Plane Vulnerability§ Many different potential vulnerabilities
• We focus on one example to show that it is possible• Specific attack depends on system, software, etc.
§ Requirements• Vulnerability must be in packet processing code• Vulnerability must be triggered by data packet
§ Protocol processing: header insertion• Congestion management (CM) protocol
ETHhdr
IPhdr
UDPhdr
Original packet
payload
CM header
ETHhdr
IPhdr
UDPhdr payloadCM
hdrNew
packet
Tilman Wolf 13
Header Insertion Vulnerability§ Vulnerability based on stack smashing attack
• Exploit of integer overflow− Sum is 4 (not 65540)
§ Vulnerable code in networking context• Carefully crafted shifting of packet content:
− Check if enoughroom for shift
− Perform memcpy• Problem:
− Attack can send short, malformed UDP packet (length field of 65532)− Integer overflow can occur on check
Thus, − Memory copy of 65532 bytes will overwrite outside boundaries of packet
ETHhdr
IPhdr
UDPhdr payloadCM
hdr
len2
len1
Tilman Wolf 14
Header Insertion Exploit§ Memory copy causes stack overwrite
• Return address from function call can be changed
• Control flow can be redirected to attack code
• If separate instruction and data memory (Harvard architecture) use return-to-libc attack
§ Example Attack code: infinite transmission loop• Self-propagating
denial-of-service attack
.
.
.
Packet payload(attack code)
New pkt buffer
Local var
.
.
.
Return address
Stack growth
Newstack
pointer
Low address
High address
CurrentframePrevious frame ptr
Current frame ptr
(generate CM header)
Original packet hdr
Tilman Wolf 15
Header Insertion Exploit§ Prototype implementation
• Custom network processor on NetFPGA• Packet forwarding with vulnerable code• Malicious packet injected into benign background traffic
§ Single malicious packet triggers attack at full link rate!
§ Attack has not yet been shown on commercial system
Tilman Wolf 16
Header Insertion Exploit§ Ability to exploit vulnerability depends on processor system
• Previous result: custom ARM-based packet processor• Other system: Click modular router on Linux system
− Stack smashing crashes router, but could not create DoS attack
§ Main observation: software on NP can be attacked• Exploits can happen through data plane only
§ Need to develop defense mechanisms for router systems
Tilman Wolf 17
Outline§ Introduction§ Vulnerabilities
• Example attack on network processor§ Defense mechanism
• Hardware monitor§ Extensions
• Multicore hardware monitor and dynamic workloads• Secure loading and avoiding homogeneity• Operating system support
§ Conclusions
Tilman Wolf 18
Defense Mechanism for Processors§ Software-based defenses (e.g., virus scanner)
• High processing overhead• Processing requirement is proportional to input/output operations• Scanning for known attacks is reactive, not proactive
§ Hardware-based defenses more suitable• Defense mechanism can be separated from data plane
− Makes it more difficult to circumvent• Performance impact on packet processor is small• Challenge is to make it dynamically adaptable
− Needs to work for new packet processing functions
§ We have designed and prototyped hardware defense for NP• Hardware monitor tracks processor• Deviation from “normal behavior” due to attack can be detected• Reset operation recovers system
Tilman Wolf 19
Related Work§ Monitor-based defense mechanism for embedded systems
• Aurora et al., DATE 2005• Ragel et al. DAC 2006• Zambreno et al., TECS 2005• Our monitor uses finer-grained monitoring for faster detection
− More details in Mao and Wolf, TC 2010
§ Processor-based defense mechanisms• No eXecute (NX) bit (creates virtual Harvard architecture)• Depends on processor architecture
§ Network-based defense mechanisms• Attack signature in intrusion-detection systems (e.g., snort, bro)• Problem with system homogeneity and IDS only at network edge
Tilman Wolf 20
System Architecture§ Hardware monitor
co-located witheach processor core• Core reports hash of
each executed instruction§ Monitoring graph repre-
sents correct behavior• Obtained from offline
analysis of binary• Deviations trigger reset
§ Change of software easy• Just need matching
monitoring graph
networkprocessor
core
instruction memory
data memorypacket buffer
processing code
network interface
comparison logic
mon. memorymon. graph
netw
ork
proc
esso
r
hard
war
e m
onito
r
hash of processinginstruction
reset/recovery
processing codebinary
NFA monitoringgraph
DFA monitoringgraph
NFA-to-DFA transformation
offli
ne a
naly
sis
runt
ime
oper
atio
n
Tilman Wolf 21
Offline Analysis of Processing Binary§ Executed instruction reported by core as 4-bit hash
• Hash combines address, opcode, registers• Hash allows for compact representation of information
§ Monitoring graph• Each instruction represented as a state• Edges correspond to execution of instruction• Control-flow operations lead to multiple possible next states
[…] 49c: 97c20010 lhu v0,16(s8) 4a0: 00000000 nop 4a4: 2c420033 sltiu v0,v0,51 4a8: 1440000a bnez v0,4d4 4ac: 00000000 nop 4b0: 3c026666 lui v0,0x6666 4b4: 34430191 ori v1,v0,0x191 4b8: 97c20010 lhu v0,16(s8) […]
49c
4a0
4a4
4a8
4ac
4b0
4b4
4b8
07
1110
107
3
6
4d4
Tilman Wolf 22
Implementation Cost of Monitor§ Monitor requires additional logic
and memory resources§ Comparison logic tracks hash value
• Simple logic to follow control flow in processor§ Graph memory stores hash for each
instruction• Approximately 4 bits for each 32-bit instruction• Fraction of size of application binary
§ Examples from NpBench• Hundreds to thousands of instructions only
Tilman Wolf 23
NFA-to-DFA conversion§ Problem: non-deterministic finite automaton (NFA)
• State may have two next states with same edge value due to hash
• Implementation would need to keep track of multiple states§ Solution: NFA-to-DFA conversion (powerset construction)
• Well-known algorithm [Hopcroft and Ullman, 1976]
• Deterministic finite automaton (DFA) requires only one state
1 2 3 4 5b c d e f
c
6a
61a
2b
{3,5}c
4d
5e f
f
Tilman Wolf 24
Implementation of DFA Monitor§ Need to design effective DFA traversal mechanisms§ Requirements
• Compact representation• Fast processing of each hash value (single memory access)
§ Each state may have up to 16 next states• Most states only have one or two next states
§ Idea: grouping of states by number of outgoing edges of previous state
d g
h
2
7
311
0142
79
a
c e
grouping
group 1
group 2
group 3
f
b d g
h
2
7
311
052
7a
c e
f
b
9
Tilman Wolf 25
Implementation of DFA Monitor§ DFA monitor system
• Memory keeps all valid hash values in one bit vector• Next state based on offset of group and position of hash value
2
number of next states
0
offset in state group
0000 0000 1000 0100
valid hash values on outgoing edges
2 1 0000 1000 0000 10001 1 0000 0000 1000 00001 0 0000 0010 0000 00003 0 0000 0000 0010 0101... ... ...1 0 0000 0010 0000 0000... ... ...... ... ...
acbfedfhg
group 1
group 2
group 3
0x00000x00020x0006
...
group 1group 2group 3
group 16
...
group base address
-1
mult
add
...
k(position of
matching hash among valid hash values)
one-hot encoding
...
hash compari-
son
...
4-bit hash function
processor instruction
reset/recovery
32 32
4
4
16
16
4
416
1
state machine memory
Tilman Wolf 26
Evaluation§ Monitoring lookup speed
• Single memory access plus lookup into fixed-size register file§ Memory size of monitor
• More states due to NFA-to-DFA conversion• More states due
to multiple entries in memory for certain states
• In practice, overhead is below 10%
Tilman Wolf 27
Timing Diagram§ Attack without monitor
• Attack packet is forwarded on all ports
53
6.2. Experimental results
6.2.1. Attack Detection
This section explains the experiments performed to test the ability of our proposed
security monitoring system to detect and recover from an attack. We observed the security
monitor operation in simulation using the ModelSim-Altera simulator [41], and in hardware
using an Altera Signal-tap logic generator [56].
6.2.1.1. Network processor without security monitor
We initially tested the single-core network processor operation without the security
monitor system when the attack described in section 5.1 is implemented. Figure 34 shows the
simulation results for the behavior of the processor system. The attack packet was received
through MAC port Rx0, and then forwarded to the network processor. The processor then
forwards the attack packet to all the outgoing ports of the router and then crashes the router.
This behavior was also verified in hardware.
Figure 34: Simulation waveform showing attack packet propagation in the network processor system.
6.2.1.2. Network processor with security monitor
We then repeated the previous experiment after including the security monitor as
illustrated in Figure 26. Figure 35 shows the simulation results for the behavior of the
network processor system when an attack packet and normal packet are sent simultaneously.
Tilman Wolf 28
Timing Diagram§ Monitor works as expected
• Attack packet is detected and dropped• Later normal packet is forwarded
Tilman Wolf 29
Attack with Defense in Place§ Attack packet dropped, router continues to operate
Tilman Wolf 30
Outline§ Introduction§ Vulnerabilities
• Example attack on network processor§ Defense mechanism
• Hardware monitor§ Extensions
• Multicore hardware monitor and dynamic workloads• Secure loading and avoiding homogeneity• Operating system support
§ Conclusions
Tilman Wolf 31
Multicore Monitor§ Dynamic workloads pose problem for
hardware monitor• Processing may differ between packets• Monitors need to match processing
§ Mapping between processors and monitors• 1-to-1 mapping requires frequent reload of monitor• Any-to-any mapping costly to implement• Clusters with n-to-m mapping provide balance
§ Interconnect is configured dyna-mically depending on workload• Mapping between core and
monitor
core
monitor
core
monitor
core
monitor
core
monitor...
...
core core core
monitor monitor monitor monitor
...
...
core core
monitor monitor monitor
...
core core
monitor monitor monitor
...
...
...
...
Tilman Wolf 32
System Architecture of Clustered System§ Multiple cores can access multiple monitors
• Dynamic configuration of crossbar§ Secure loading of monitors through external interface
Proc Proc Proc... Proc Proc Proc
...
Proc Proc Proc...
n Processors
Inter-coreInterconnect
External Memory
Crossbar Crossbar Crossbar
Mon Mon Mon...
m Monitors
Mon Mon Mon... Mon Mon Mon...
... ...
AESCentralized
MonitorMemory
Control Processor
...
External Interface
ControlSignals
Network Interface
Tilman Wolf 33
Cluster Design§ Simple implementation of clustered monitor
• Dynamic configuration through programming of demultiplexersNP Core
32
Hash
MonitorSelect
3
4Hash_1
R/R fromMonitor_1 to 6
Reset/RecoverNP Core
Hash
Hash_2
NP Core
Hash
Hash_4
Monitor_1
Hash 4
FromHash_1 to 4
Reset/Recover
1
Monitor_2 Monitor_3 Monitor_4 Monitor_5 Monitor_6
Proc Select
2
NP Core
Hash
Hash_3
1
4
Tilman Wolf 34
Dual-Ported Monitor Implementation§ Memory of monitor can be shared between two monitors
• Effective use of dual-ported memory• Two monitoring graphs can be used in parallel
One-hotEncoding
HashComparison
4-bit HashFunction
16
4
32
1
ProcessorInstruction
Reset/Recover
NextStateSelect
4
GraphSelect
K-1
K
1
Next State
Valid Hashes
One-hotEncoding
HashComparison
4-bit HashFunction
4
16
NextStateSelect
GraphSelect
1
K
4
14
Next StateValid Hashes 14
1
Reset/Recover
ProcessorInstruction
32
Monitor 1
Monitor 2
Monitoring Graph 2
Monitoring Graph 1
Tilman Wolf 35
Runtime Monitor Allocation§ How many monitors per cluster?
• Number of monitors m, number of processor cores n§ Analytical model
• Blocking occurswhen no monitoris available forgiven packetprocessing
• Two programswith equaltraffic and workloadassumed
§ Overprovisioningof 1.5 is sufficient
0.5
0.6
0.7
0.8
0.9
1
1 1.2 1.4 1.6 1.8 2
thro
ughp
ut
monitor overprovisioning (m/n)
n=2n=4n=8
n=16n=32
0.5
0.6
0.7
0.8
0.9
1
1 1.2 1.4 1.6 1.8 2
thro
ughp
ut
monitor overprovisioning (m/n)
0.5
0.6
0.7
0.8
0.9
1
1 1.2 1.4 1.6 1.8 2
thro
ughp
ut
monitor overprovisioning (m/n)
0.5
0.6
0.7
0.8
0.9
1
1 1.2 1.4 1.6 1.8 2
thro
ughp
ut
monitor overprovisioning (m/n)
0.5
0.6
0.7
0.8
0.9
1
1 1.2 1.4 1.6 1.8 2
thro
ughp
ut
monitor overprovisioning (m/n)
Tilman Wolf 36
Prototype Implementation on FPGA§ Multi-core system (4 cores, 6 monitors)
• Monitor logic very simple• Interconnect uses very little resources• Monitors require about 1/3 of memory of processors• Monitors require about 1/8 of power of processors
Tilman Wolf 37
Runtime Operation§ Adaptation based on threshold in queue for application§ Simulation results
• Monitor allocation adapts to dynamics in traffic
0"
0.1"
0.2"
0.3"
0.4"
0.5"
0.6"
0.7"
0.8"
0.9"
1"
0" 5" 10" 15" 20" 25" 30" 35" 40" 45" 50" 55" 60" 65" 70"
Incoming(pa
cket(ra
/o(
Time(t((kilocycles)(
Packet"type"1"
Packet"type"2"
Packet"type"3"
Total"
0"
1"
2"
3"
4"
5"
6"
7"
8"
9"
10"
11"
12"
13"
14"
15"
16"
0" 5" 10" 15" 20" 25" 30" 35" 40" 45" 50" 55" 60" 65" 70"Num
ber'o
f'processors'a
ssigne
d'
Time't'(kilocycles)'
Packet"type"1"
Packet"type"2"
Packet"type"3"
Total"
Tilman Wolf 38
Runtime Operation§ Simulation results
• Throughput variation due to adaptation• Small inefficiencies during workload change
0"
1"
2"
3"
4"
5"
6"
7"
8"
9"
10"
11"
12"
13"
14"
15"
16"
0" 5" 10" 15" 20" 25" 30" 35" 40" 45" 50" 55" 60" 65" 70"
Num
ber'o
f'processors'a
ssigne
d'
Time't'(kilocycles)'
Packet"type"1"
Packet"type"2"
Packet"type"3"
Total"
0"
2"
4"
6"
8"
10"
12"
14"
16"
18"
0" 5" 10" 15" 20" 25" 30" 35" 40" 45" 50" 55" 60" 65" 70"
Throughp
ut)(p
ackets/kilo
cycles))
Time)t)(kilocycles))
Tilman Wolf 39
Graph Loading Times§ Time to load graph depends on application size§ Results from NpBench
Tilman Wolf 40
Outline§ Introduction§ Vulnerabilities
• Example attack on network processor§ Defense mechanism
• Hardware monitor§ Extensions
• Multicore hardware monitor and dynamic workloads• Secure loading and avoiding homogeneity• Operating system support
§ Conclusions
Tilman Wolf 41
Extension: Infrastructure Diversity§ System-level challenges
• Dynamics: runtime verification of monitoring graphs− Network traffic and functionality change at runtime− Multiple processor cores and
their monitors need to be reprogrammed based on the traffic
• Homogeneity: parameterizablehashing for heterogeneity− Practical networks use large
numbers of identical router devices
− A successful attack on one device can lead to Internet-scale failures
Tilman Wolf 42
Security Loading of Monitoring Graph§ Three entities:
• Router manufacturer
• Network operator• Router/network
processor§ Signatures on graph
establish chain oftrust• Network processor
verifies authenticity• Network operator
can install new graph
Tilman Wolf 43
Prototype Implementation on FPGA§ Prototype system
• Altera Stratix IV FPGA on a DE4 board
• Nios II connects to a FTP server through OpenSSL
• Parameterizablehash function in hardware monitor
Tilman Wolf 44
Security Operations Evaluation on Nios II§ Secure download, decryption, and verification times
• IPv4 with congestion management application • Verification takes several sections
Tilman Wolf 45
Parameterizable Hash Function§ Merkle tree for hash
function• Can be parameterized• High performance
implementation in hardware
• Low resource overhead§ Each network processor
can use a different parameter value• Resulting monitoring
graph has differenthash values
Tilman Wolf 46
Hash Function Evaluation§ Resource cost for hash function
• Compared to non-parameterizable hash function
§ Distribution of hash values in Merkle tree• Random distribution of Hamming distance for almost all inputs• Hash function requires zero Hamming distance for same inputs
Tilman Wolf 47
Outline§ Introduction§ Vulnerabilities
• Example attack on network processor§ Defense mechanism
• Hardware monitor§ Extensions
• Multicore hardware monitor and dynamic workloads• Secure loading and avoiding homogeneity• Operating system support
§ Conclusions
Tilman Wolf 48
Extension: Operating System in ES§ Coordination between embedded OS and monitor
• Multiple activeprocesses in OS,multiple activemonitoring graphs
• Monitor switchesmonitoring graphsin sync with OSprocesses
• Requires minorextension to OS
§ Prototype:• NIOS II processor• μC/OS-II operating
system
processorcore
instruction memory
data memory
I/O interface
comparison logic
mon. memory
embe
dded
pro
cess
or
hard
war
e m
onito
r
hash of processinginstruction
task reset
offli
ne
anal
ysis
runt
ime
oper
atio
n
processing codebinary
monitoringgraph
... ...
processing code mon. graph
task context
OS or active task
context info
active graph
Tilman Wolf 49
Processor-to-Monitor Interface§ OS on processor needs to coordinate with monitors
• Process creation (ensure monitoring graph is ready)• Context switch between processes (switch monitoring graph)• Process deletion (remove monitoring state)• Reset signal from monitor
§ A set of five registers to communicate with the processor
Tilman Wolf 50
Operating System Support§ Hardware monitoring logic tracks OS operations
32
31 10x0000h
14 20x1200h
x 0xxxx
x 0xxxx
GID# of Active Processes
Base Addr
4 14
21 14
11 31
x xxxx
GIDPID Valid
1
1
1
0
4 0x0002h
21 0x0004h
11 0x0000h
x xxxx
PID Address Pointer Valid
1
1
1
0
Address Pointer +
override
0
1
Frame Address
Hash Comparison
CPU Instruction
Valid HashNext State
...
...
...
Group1 Addr
Group3 Addr
Group1 Addr
Group3 Addr
0x1200h + 0x0000h:
0x1200h + 0x0008h:
0x0008h:
0x0000h: Group1 Addr
Group3 Addr
Group1 Addr
Group3 Addr
Valid HashNext State
...
...
...
Slot 4 Region
Slots 2 and 3 Regions
Slot 1 Region
Group 1
Group 2
Group 3
Group 4
0x0008h
0xffffh
0x000eh
0x000ah
...
read data
Base Addresses
Register File
Graph Memory
write data
14
16
16 4
Sequencing logic
4
Position of matching
hash in the hash vector
14Read
Address
14
Control FSM
One‐hot encoding
Hash calculation
Recovery signal
PID
GID
Operation
Done
From the CPU
Pipeline
Processor interface
CPU Interrupt controller
load
Controller
DMA
14 32
Write dataWrite
addressWrite enable
Graph pool
PID addresses
PID to GID binding GID to frame binding
Enable/Disable
Tilman Wolf 51
Operating System Support§ Context switch interactions:
§ Attack detection:
Monitor readyPID change Context Switch CPU ready
GID changePID change Task Create Monitor ready
Instruction hash matches graph memory Instruction hash doesn’t match graph memory
Attack detected
Monitor readyPID change Context Switch CPU ready
GID changePID change Task Create Monitor ready
Instruction hash matches graph memory Instruction hash doesn’t match graph memory
Attack detected
void process_input(char *stringpassed) {char name[90];strcpy(name,stringpassed);printf("Processing string .. !\n");return;
}
Tilman Wolf 52
Operating System Support§ Implementation cost on Stratix IV FPGA
§ Hardware monitoring can be used for embedded systems • Embedded systems are similarly performance constrained
Tilman Wolf 53
Outline§ Introduction§ Vulnerabilities
• Example attack on network processor§ Defense mechanism
• Hardware monitor§ Extensions
• Multicore hardware monitor and dynamic workloads• Secure loading and avoiding homogeneity• Operating system support
§ Conclusions
Tilman Wolf 54
Conclusions§ Current and future Internet needs to meet new demands
• Flexibility is key to avoid ossification• Deployment of new edge services requires programmable data plane
§ Programmable routers provide packet processing platform• Systems problem: security vulnerabilities• Attacks can be launched within data plane (i.e., not control access)• Monitor-based hardware defense mechanism is effective
§ Our work has addressed many practical concerns• Workload dynamics and secure installation of monitoring graphs• System heterogeneity• Extension to general embedded systems with operating systems
§ Exciting research area that spans computer networking, embedded systems, and system security
Tilman Wolf 55
Acknowledgements§ Graduate students
• Kekai Hu (now: Intel)• Arman Pouraghily• Harikrishnan Chandrikakutty
(now: Juniper Networks)• Danai Chasaki (now: Villanova Univ.)• Shufu Mao (now: WorldQuant)• Thiago Teixeira
§ Collaborators• Russell Tessier
§ Sponsors• National Science Foundation• Altera Corporation
Tilman Wolf 56
Selected Publications§ Data plane attack:
• Danai Chasaki and Tilman Wolf. Attacks and defenses in the data plane of networks. IEEE Transactions on Dependable and Secure Computing, 9(6)798–810, November 2012.
• Danai Chasaki and Tilman Wolf. Design of a secure packet processor. In Proc. of ACM/IEEE Symposium on Architectures for Networking and Communication Systems (ANCS), San Diego, CA, October 2010.
§ Hardware monitors for network processors:• Shufu Mao and Tilman Wolf. Hardware support for secure processing in embedded
systems. IEEE Transactions on Computers, 59(6):847–854, June 2010.• Harikrishnan Kumarapillai Chandrikakutty, Deepak Unnikrishnan, Russell Tessier, and
Tilman Wolf. High-performance hardware monitors to protect network processors from data plane attacks. In Proc. of 50th Design Automation Conference (DAC), Austin, TX, June 2013.
• Kekai Hu, Harikrishnan Chandrikakutty, Russell Tessier, and Tilman Wolf. Scalable Hardware Monitors to Protect Network Processors from Data Plane Attacks. In Proc. of First IEEE Conference on Communications and Network Security (CNS), Washington, DC, October 2013. (Best Paper Award)
• Kekai Hu, Tilman Wolf , Thiago Teixeira, and Russell Tessier. System-level security for network processors with hardware monitors. In Proc. of 51st Design Automation Conference (DAC), San Francisco, CA, June 2014.
§ Hardware monitors for embedded systems:• Tedy Thomas, Arman Pouraghily, Kekai Hu, Russell Tessier, and Tilman Wolf. Multi-task
support for security-enabled embedded processors. In Proc. of 26th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), pages 136–143, Toronto, ON, July 2015.