
Fast Firewall Implementation for Software and Hardware-based Routers

Lili Qiu, Microsoft Research
George Varghese, UCSD

Subhash Suri, UCSB

9th International Conference on Network Protocols
Riverside, CA, November 2001


Outline
- Motivation for packet classification
- Performance metrics
- Related work
- Our approaches
- Performance results
- Summary


Motivation
- Traditionally, routers forward packets based on the destination field only
- Firewall and diff-serv require packet classification: forward packets based on multiple fields in the packet header
  - e.g. source IP address, destination IP address, source port, destination port, protocol, type of service (ToS) …


Problem Specification
- Given a set of filters (or rules), find the least-cost matching filter for each incoming packet
- Each filter specifies
  - Some criterion on K fields
  - Associated directive
  - Cost
- Example (a reference sketch follows):
  Rule 1: 24.128.0.0/16    4.0.0.0/8       …  udp  deny
  Rule 2: 64.248.128.0/20  8.16.192.0/24   …  tcp  permit
  …
  Rule N: 24.128.0.0/16    4.16.128.0/20   …  any  permit
  Incoming packet: [24.128.34.8, 4.16.128.3, udp]  →  Answer: rule 1
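For reference, a minimal sketch of this least-cost matching as a linear scan over the rules (an illustration only, not the scheme proposed in this talk; the Rule class, the field layout, and the costs, taken here to be the rule numbers, are assumptions of the sketch):

    from ipaddress import ip_network, ip_address

    class Rule:
        def __init__(self, cost, src, dst, proto, action):
            self.cost = cost              # lower cost wins (here: the rule number)
            self.src = ip_network(src)    # source IP prefix
            self.dst = ip_network(dst)    # destination IP prefix
            self.proto = proto            # 'udp', 'tcp', or 'any'
            self.action = action          # directive: 'permit' or 'deny'

        def matches(self, pkt_src, pkt_dst, pkt_proto):
            return (ip_address(pkt_src) in self.src and
                    ip_address(pkt_dst) in self.dst and
                    self.proto in ('any', pkt_proto))

    def classify(rules, pkt_src, pkt_dst, pkt_proto):
        """Return the least-cost rule matching the packet, or None."""
        best = None
        for r in rules:
            if r.matches(pkt_src, pkt_dst, pkt_proto) and (best is None or r.cost < best.cost):
                best = r
        return best

    rules = [
        Rule(1, '24.128.0.0/16',   '4.0.0.0/8',     'udp', 'deny'),    # Rule 1
        Rule(2, '64.248.128.0/20', '8.16.192.0/24', 'tcp', 'permit'),  # Rule 2
        Rule(3, '24.128.0.0/16',   '4.16.128.0/20', 'any', 'permit'),  # Rule N
    ]
    print(classify(rules, '24.128.34.8', '4.16.128.3', 'udp').action)  # -> deny (rule 1)

The linear scan costs O(N) memory accesses per packet, which is what the trie-based schemes in the rest of the talk try to beat.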


Performance Metrics
- Classification speed
  - Wire-rate lookup for minimum-size (40 byte) packets at OC192 (10 Gbps) speeds (worked out below)
- Memory usage
  - Should use memory linear in the number of rules
- Update time
  - Slow updates are acceptable
  - Impact on search speed should be minimal
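To make the classification-speed target concrete, here is the wire-rate arithmetic implied by the first bullet (using the slide's round figure of 10 Gbps for OC192):

    link_rate_bps = 10e9              # OC192, taken as 10 Gbps as on the slide
    min_packet_bits = 40 * 8          # minimum-size 40-byte packet
    packets_per_sec = link_rate_bps / min_packet_bits
    print(packets_per_sec / 1e6)      # 31.25 million packets per second
    print(1e9 / packets_per_sec)      # -> a budget of about 32 ns per lookup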


Related Work
- Given N rules in K dimensions, the worst-case bounds are
  - O(log N) search time with O(N^(K-1)) memory, or
  - O(N) memory with O((log N)^(K-1)) search time
- Tree based
  - Grid-of-tries (Srinivasan et al., Sigcomm'98)
  - Fat Inverted Segment Tree (Feldman et al., Infocom'00)
- Cross-producting (Srinivasan et al., Sigcomm'98)
- Bit vector schemes
  - Lucent bit vector (Lakshman et al., Sigcomm'98)
  - Aggregated bit vector (Baboescu et al., Sigcomm'01)
- RFC (Pankaj et al., Sigcomm'99)
- Tuple Space Search (Srinivasan et al., Sigcomm'99)


Backtracking Search
- A trie is a binary branching tree, with each branch labeled 0 or 1
- The prefix associated with a node is the concatenation of all the bits from the root to the node

[Figure: a one-dimensional binary trie with nodes A-E storing the prefixes F1 = 00* and F2 = 10*]


Backtracking Search (Cont.)
- Extend to multiple dimensions
- Standard backtracking: depth-first traversal of the tree, visiting all the nodes satisfying the given constraints (a sketch follows the figure below)
- Example: search for [00*, 0*, 0*]; result: F8
- Reason for backtracking: 00* matches *, 0*, and 00*

[Figure: a three-dimensional backtracking trie containing the filters F1-F8, with nodes labeled A-K]
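To show how the depth-first backtracking search visits every node satisfying the constraints, here is a minimal hierarchical-trie sketch. It illustrates the general technique only, not the paper's implementation; the Node layout, the insert/search helpers, and the two example filters at the end are assumptions of this sketch.

    class Node:
        def __init__(self):
            self.child = {}        # '0' / '1' -> child node within this field's trie
            self.next_dim = None   # trie for the next field, if some prefix ends here
            self.filter = None     # filter stored here (last field only)

    def insert(root, prefixes, filt):
        """Insert a filter given as one prefix bit-string per field ('' means *)."""
        node = root
        for d, prefix in enumerate(prefixes):
            for bit in prefix:
                node = node.child.setdefault(bit, Node())
            if d + 1 < len(prefixes):
                if node.next_dim is None:
                    node.next_dim = Node()
                node = node.next_dim
            else:
                node.filter = filt

    def search(node, key, dim, matches):
        """Depth-first backtracking: walk key[dim] bit by bit and, at every node
        where some filter's prefix for this field ends (from * downwards),
        branch into the trie for the next field."""
        i = 0
        while node is not None:
            if node.filter is not None:      # last field: a matching filter
                matches.append(node.filter)
            if node.next_dim is not None:    # a prefix ends here: backtracking point
                search(node.next_dim, key, dim + 1, matches)
            if i == len(key[dim]):
                break
            node = node.child.get(key[dim][i])
            i += 1

    # Two hypothetical filters; the slide's full F1-F8 set is not reproduced here.
    root = Node()
    insert(root, ['00', '0', '0'], 'F8')     # [00*, 0*, 0*]
    insert(root, ['0', '', ''], 'F1')        # [0*, *, *]
    found = []
    search(root, ['00', '0', '0'], 0, found)
    print(found)                             # every filter matching [00*, 0*, 0*]

The backtracking cost comes from the recursive calls: in the first field, the search key 00 matches the prefixes *, 0*, and 00*, and each match may start a separate sub-search over the remaining fields.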


Set Pruning Tries
- Multiplane trie
- Fully specify all search paths so that no backtracking is necessary
- Performance
  - O(log N) search time
  - O(N^(K-1)) storage


Set Pruning Tries: Conversion
- Converting a backtracking trie to a set pruning trie is essentially replacing a general filter with more specific filters


Set Pruning Tries: Example
- Replace [*,*,*] with [0*,0*,*], [0*,0*,0*], [0*,1*,*], [1*,0*,*], [1*,1*,*], and [1*,1*,1*] (a one-dimensional sketch of this push-down follows the figure)

[Figure: the backtracking trie and the corresponding set pruning trie for filters F1, F2, F3; pushed-down leaves hold copies of F2 and the lower-cost filter Min(F1,F2) or Min(F2,F3)]
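A one-dimensional sketch of this push-down step (illustrative only; the dict-of-prefixes representation and the example costs are assumptions, and the real structure is a multi-field trie): each general filter is copied onto every more specific prefix it covers, and where two filters land on the same prefix the lower-cost one is kept, which is what the Min(F1,F2) and Min(F2,F3) labels in the figure denote.

    def push_down(filters):
        """filters: dict mapping a prefix bit-string ('' means *) to its cost.
        Return a copy in which every filter has also been copied onto each
        longer prefix it covers, keeping the minimum cost on collisions."""
        pushed = dict(filters)
        for general, gcost in filters.items():
            for specific in filters:
                if specific != general and specific.startswith(general):
                    pushed[specific] = min(pushed[specific], gcost)
        return pushed

    # Example: the wildcard '' (i.e. *) is copied onto 0* and 00*, and 0* onto 00*.
    print(push_down({'': 3, '0': 2, '00': 5}))   # -> {'': 3, '0': 2, '00': 2}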


Performance Evaluation
- 5 real databases from various sites
- Five dimensions: src IP, dest IP, src port, dest port, protocol
- Performance metrics
  - Total storage: total number of nodes in the multiplane trie
  - Worst-case lookup time: total number of memory accesses in the worst case, assuming 1-bit-at-a-time trie traversal


Performance Results

Database  # Rules  Backtracking             Set Pruning Tries
                   Lookup time  Storage     Lookup time  Storage
1         67       146          1848        86           5541
2         158      153          4914        102          51785
3         183      169          3949        102          59180
4         279      202          6785        102          123951
5         266      208          6555        102          165920

Backtracking has small storage and affordable lookup time.


Major Optimizations
- Trie compression algorithm
- Pipelining the search
- Selective pushing
- Using minimal hardware


Trie Compression Algorithm
- If a path A→B satisfies the Compressible Property:
  - All nodes on its left point to the same place L
  - All nodes on its right point to the same place R
- then we compress the entire branch into 3 edges:
  - A center edge with value = (A→B) pointing to B
  - A left edge with value < (A→B) pointing to L
  - A right edge with value > (A→B) pointing to R
- Advantages of compression: saves time & storage (a lookup sketch follows the figure below)

[Figure: the path with bits 01010 and filters F1, F2, F3 is compressed into a single node with three outgoing edges: < 01010, = 01010, and > 01010]
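A minimal sketch of the lookup step through one compressed node, under the Compressible Property above (the node layout and bit-string encoding are assumptions of this sketch). Because every left branch off the path leads to the same place L and every right branch to the same place R, one numeric comparison of the key's next bits against the path value replaces walking the whole path:

    class CompressedNode:
        def __init__(self, path_bits, lt, eq, gt):
            self.path_bits = path_bits   # bit string labeling the compressed path A->B
            self.lt = lt                 # L: where every left branch off the path points
            self.eq = eq                 # B: reached when the key follows the path exactly
            self.gt = gt                 # R: where every right branch off the path points

    def step(node, key_bits):
        """One compressed lookup step: compare the next len(path_bits) key bits
        against the path value; the first differing bit decides whether the key
        falls left of (<), on (=), or right of (>) the path."""
        k = int(key_bits[:len(node.path_bits)], 2)
        v = int(node.path_bits, 2)
        if k < v:
            return node.lt
        if k > v:
            return node.gt
        return node.eq

    # Using the figure's path value 01010:
    node = CompressedNode('01010', 'L', 'B', 'R')
    print(step(node, '0100111'))   # 01001 < 01010 -> 'L'
    print(step(node, '0101011'))   # 01010 = 01010 -> 'B'
    print(step(node, '0110000'))   # 01100 > 01010 -> 'R'

This is why compression saves both time (one comparison instead of one memory access per bit) and storage (the interior path nodes disappear).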


Performance Evaluation of Compression

Database  Lookup Time (Uncompressed)  Lookup Time (Compressed)
1         146                         30
2         153                         51
3         169                         49
4         202                         98
5         208                         59

Compression reduces the lookup time by a factor of 2 - 5.


Pipelining Backtracking
- Use a pipeline to speed up backtracking
- Issues
  - The amount of register memory passed between pipelining stages needs to be small
  - The amount of main memory needs to be small

[Figure: pipeline stages 1, 2, …, m connected in sequence]


Pipelining Backtracking: Limiting the Amount of Register Memory
- Standard backtracking requires O(KW) state for K-dimensional filters, with each dimension W bits long
- Our approach: visit more general filters first, and more specific filters later
- Example: search for [00*, 0*, 0*]; visit order A-B-H-J-K-C-D-E-F-G; result: F8
- Performance: K+1 32-bit registers

[Figure: the same three-dimensional trie as before (filters F1-F8, nodes A-K), traversed in the order A-B-H-J-K-C-D-E-F-G]


Pipelining Backtracking: Limiting the Amount of Memory
- Simple approach
  - Store an entire backtracking search trie at every pipelining stage
  - Storage increases proportionally with the number of pipelining stages
- Our approach
  - Have pipeline stage i store only the trie nodes that will be visited in stage i


Storage Requirement for Pipeline

[Figure: two plots, for the uncompressed and compressed tries, of storage (# nodes) vs. # pipeline stages for Databases 1-5]

Storage increases moderately with the number of pipelining stages (i.e. slope < 1).


Trading Storage for Time
- Smoothly trade off storage for time
- Observations
  - Set pruning tries eliminate all backtracking by pushing down all filters → intensive storage
  - It suffices to eliminate backtracking only for the filters with large backtracking time
- Selective push
  - Push down the filters with large backtracking time
  - Iterate until the worst-case backtracking time satisfies our requirement

[Figure: a spectrum from O((log N)^(K-1)) time (e.g. backtracking) to O(N^(K-1)) space (e.g. set pruning)]


Example of Selective Pushing
- Goal: worst-case lookup of at most 11 memory accesses
- The filter [0*, 0*, 000*] has 12 memory accesses
- Push the filter down → reduces the lookup time
- Now the search cost of the filter [0*, 0*, 001*] becomes 12 memory accesses, so we push it down as well. Done!

[Figure: the trie before and after the two push-down steps, with leaves holding F1, F2, F3]


Performance of Selective Push

[Figure: storage (# nodes) vs. lookup time (# memory accesses) for Databases 1 and 2, for both the uncompressed and compressed tries]

Lookup time is reduced with a moderate increase in storage until we reach the knee of the curve.


Summary
- Experimentally show that simple trie-based schemes perform much better than the worst-case figures suggest
- Propose optimizations
  - Trie compression
  - Pipelining the search
  - Selective push


Summary (Cont.)

Approach: Trie compression algorithm
  Description: effectively exploit redundancy in trie nodes by using range match
  Performance gain: reduce lookup time by a factor of 2 - 5, save storage by a factor of 2.8 - 8.7

Approach: Pipelining the search
  Description: split the search into multiple pipelining stages, each responsible for a portion of the search
  Performance gain: increase throughput with a marginal increase in memory cost

Approach: Selective push
  Description: "push down" the filters with large backtracking time
  Performance gain: reduce lookup time by 10 - 25% with only a marginal increase in storage