Fast Lookup for Dynamic Packet Filtering in FPGA


REPORTER: HSUAN-JU LI, 2014/09/18

Design and Diagnostics of Electronic Circuits & Systems (DDECS), 17th International Symposium on, April 2014

Lukáš Kekely, Martin Žádník, Jiří Matoušek, Jan Kořenek

Outline

Introduction
Related Work
Design And Architecture
Evaluation And Results
Conclusion

Introduction

Software applications of safety- and security-critical embedded systems are often divided into several self-contained functions.

Segregation between individual system partitions and functions is used to confine error propagation.

Soft processors are one order of magnitude slower in terms of operating frequency than hard-wired devices.

Introduction (cont.)

Current FPGA families provide wide and fast memory attachments, mostly implemented as hard macros that are faster than configurable logic.

There is a performance gap between soft processors and the memory attachment.

The proposed architecture combines:
the specific needs of partitioned software
the flexibility of reconfigurable hardware

Introduction (cont.)

Multiple self-contained systems are placed on a single FPGA platform.

The available memory bandwidth is shared among the systems in a predictable and scalable way.

The main building blocks of the proposed architecture are secure bus bridges, which are used to form a segregated hierarchy of memory buses.

Introduction (cont.)

With secure bus bridges, it is possible to use soft processors for safety- and security-critical functions and to reach high assurance levels with far less effort.


Related Work

Cuckoo hash function

[Figure: two hash functions h(x) and h'(x); each key of x = {a, b, c} has one candidate slot h(a), h(b), h(c) and one candidate slot h'(b), h'(c), h'(a)]

Example: h(6) = 6 mod 11 = 6, h'(6) = floor(6/11) mod 11 = 0

x = {20, 50, 53, 75}
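The example on this slide can be traced with a minimal sketch of textbook cuckoo insertion. The general forms h(x) = x mod 11 and h'(x) = floor(x/11) mod 11 are assumptions generalised from the single value h(6) shown above, and the names `SLOTS`, `insert` and `max_kicks` are hypothetical. The policy of always trying the first table and pushing the evicted key to the other table is the common software scheme, not necessarily what the hardware in the paper does.

```python
from typing import List, Optional

SLOTS = 11  # assumed table size, taken from the "mod 11" in the example


def h(x: int) -> int:
    """First hash function, generalised from h(6) = 6 mod 11."""
    return x % SLOTS


def h_prime(x: int) -> int:
    """Second hash function, generalised from h'(6) = floor(6/11) mod 11."""
    return (x // SLOTS) % SLOTS


def insert(t1: List[Optional[int]], t2: List[Optional[int]], key: int,
           max_kicks: int = 16) -> Optional[int]:
    """Insert key; return None on success or the key left without a slot."""
    tables = (t1, t2)
    hashes = (h, h_prime)
    side = 0  # start with the first table
    for _ in range(max_kicks):
        slot = hashes[side](key)
        evicted = tables[side][slot]
        tables[side][slot] = key
        if evicted is None:
            return None
        # The evicted key moves to its slot in the other table.
        key, side = evicted, 1 - side
    return key


t1: List[Optional[int]] = [None] * SLOTS
t2: List[Optional[int]] = [None] * SLOTS
for k in (20, 50, 53, 75):
    insert(t1, t2, k)

print({i: k for i, k in enumerate(t1) if k is not None})  # {6: 50, 9: 75}
print({i: k for i, k in enumerate(t2) if k is not None})  # {1: 20, 4: 53}
```

Keys 20, 53 and 75 all hash to slot 9 of the first table, so each new arrival evicts the previous occupant into the second table.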


Design And Architecture

A. Lookup engine interface and functionality
B. Cuckoo hash lookup engine
C. Binary search tree lookup engine


Design And Architecture (cont.): Lookup engine interface and functionality

[Diagram: LookupEngine block with its interface. Parameters: Key Width, Data Width, Maximum Capacity. Label: representation in bits.]

Lookup procedure

3 basic groups:
Input
Output
Configuration

[Diagram: LookupEngine with its 3 groups of signals. Labels: Input keys; Lookup results (Outputs), split into Arbitrary Data (routing decision, key identification) and 1 bit information (Found, Invalid); Configuration; (Every Clock Cycle).]
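The three groups of the interface can be mirrored in software as a rough sketch. Everything here is an assumption made for illustration: the class name `ToyLookupEngine`, the method names, and the use of a Python dict in place of the hardware memories. The "Invalid" flag from the diagram is left out because the slides do not define it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LookupResult:
    found: bool          # 1-bit "found" information
    data: Optional[int]  # arbitrary data stored with the key, None on a miss


class ToyLookupEngine:
    """Software stand-in: a dict plays the role of the hardware memories."""

    def __init__(self, key_width: int, data_width: int, max_capacity: int):
        self.key_width = key_width        # in bits
        self.data_width = data_width      # in bits
        self.max_capacity = max_capacity  # number of records
        self._records: Dict[int, int] = {}

    # Configuration group: change the set of stored keys.
    def insert(self, key: int, data: int) -> bool:
        if key >> self.key_width or data >> self.data_width:
            raise ValueError("key or data does not fit the configured width")
        if key not in self._records and len(self._records) >= self.max_capacity:
            return False  # engine is full
        self._records[key] = data
        return True

    def remove(self, key: int) -> bool:
        return self._records.pop(key, None) is not None

    # Input and output groups: one key in, one result out.
    def lookup(self, key: int) -> LookupResult:
        data = self._records.get(key)
        return LookupResult(found=data is not None, data=data)


engine = ToyLookupEngine(key_width=8, data_width=4, max_capacity=2)
engine.insert(0x10, 3)
print(engine.lookup(0x10))  # LookupResult(found=True, data=3)
print(engine.lookup(0x11))  # LookupResult(found=False, data=None)
```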


Design And Architecture (cont.): Cuckoo hash lookup engine

The hash functions are computed in parallel, using a CRC implementation.

Reading records

Record: key value and data.

Records are read from the hash tables in memory or from an outside register.

The keys of the records read are compared for equality with the searched key.

At most one comparison can be successful.

The data associated with the matching key are output and the flag is set.
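The lookup steps described so far (hash, read, compare) can be put together in a short software sketch. It is only an approximation under stated assumptions: `crc_hash` is a hypothetical per-table hash built from `zlib.crc32` with the table index as a prefix, which is not taken from the paper, and the loop over tables stands in for what the hardware does in parallel.

```python
import zlib
from typing import List, Optional, Tuple

Record = Tuple[int, int]  # (key, data)


def crc_hash(key: int, table_index: int, table_size: int) -> int:
    """Hypothetical per-table hash: CRC-32 over the table index plus the key."""
    payload = bytes([table_index]) + key.to_bytes(8, "big")
    return zlib.crc32(payload) % table_size


def lookup(tables: List[List[Optional[Record]]], register: Optional[Record],
           key: int) -> Tuple[bool, Optional[int]]:
    # Step 1: one hash per table (computed in parallel in hardware).
    slots = [crc_hash(key, i, len(t)) for i, t in enumerate(tables)]
    # Step 2: read one candidate record per table, plus the outside register.
    candidates = [t[s] for t, s in zip(tables, slots)] + [register]
    # Step 3: compare every candidate key with the searched key.
    matches = [rec for rec in candidates if rec is not None and rec[0] == key]
    assert len(matches) <= 1, "a key must be stored in at most one place"
    return (True, matches[0][1]) if matches else (False, None)


tables: List[List[Optional[Record]]] = [[None] * 8 for _ in range(2)]
tables[1][crc_hash(42, 1, 8)] = (42, 7)  # store key 42 with data 7 in table 1

print(lookup(tables, None, 42))     # (True, 7)
print(lookup(tables, None, 43))     # (False, None)
print(lookup(tables, (43, 9), 43))  # (True, 9)
```

The last call shows a key that lives only in the register and is still found by the same comparison step.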

The key set is updated based on the requests received.

Reconfiguration cycle

The controller can evict records from the hash tables on the fly while preserving the set of active keys.

C_cuckoo = d × t + 1

d: the number of hash tables (hash functions) used
t: the size of an individual table
1: the additional reconfiguration register


Design And Architecture (cont.): Binary search tree lookup engine

Each tree level is one pipeline stage.

A stage consists of a piece of memory, read at the address of a node, and a comparator for the searched key.

The data associated with each key are stored as well.

Atomic operations

The result is corrected according to a register.

The capacity of the BST based engine can be configured by the number of BST levels l.

C_bst = 2^l - 1
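A level-by-level BST lookup can be sketched in software as follows. This is an illustration under assumptions, not the engine from the paper: `build_levels` and `lookup` are hypothetical names, each level gets its own list as its "piece of memory", and a child address is derived as 2 × address (+ 1 for the right child). With l levels the memories hold 2^l - 1 nodes in total.

```python
from typing import List, Optional, Tuple

Node = Optional[Tuple[int, int]]  # (key, data), or None for an empty node


def build_levels(records: List[Tuple[int, int]], levels: int) -> List[List[Node]]:
    """Spread records over per-level memories; level i holds 2**i nodes."""
    if len(records) > 2 ** levels - 1:
        raise ValueError("more records than C_bst = 2**l - 1")
    records = sorted(records)
    memories: List[List[Node]] = [[None] * (2 ** i) for i in range(levels)]

    def place(lo: int, hi: int, level: int, addr: int) -> None:
        if lo >= hi:
            return
        mid = (lo + hi) // 2  # the median becomes the node of this subtree
        memories[level][addr] = records[mid]
        place(lo, mid, level + 1, 2 * addr)
        place(mid + 1, hi, level + 1, 2 * addr + 1)

    place(0, len(records), 0, 0)
    return memories


def lookup(memories: List[List[Node]], key: int) -> Tuple[bool, Optional[int]]:
    addr = 0
    for memory in memories:  # one iteration per pipeline stage
        node = memory[addr]
        if node is None:
            break  # software shortcut; nothing is stored below an empty node
        if key == node[0]:
            return True, node[1]
        addr = 2 * addr + (1 if key > node[0] else 0)
    return False, None


records = [(k, 10 * k) for k in (3, 8, 15, 23, 42, 57, 91)]
memories = build_levels(records, levels=3)  # capacity 2**3 - 1 = 7
print(lookup(memories, 42))  # (True, 420)
print(lookup(memories, 5))   # (False, None)
```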

Design And Architecture (cont.)

A. Lookup engine interface and functionality
B. Cuckoo hash lookup engine
C. Binary search tree lookup engine
D. Top-level lookup engine

Design And Architecture (cont.): Top-level lookup engine

The cuckoo engine and the BST engine operate in parallel.

Both results are stored in FIFOs.

[Diagram labels: Cuckoo engine, BST engine, Stash, FIFO]

The maximum capacity of the cuckoo hash with stash lookup engine can be defined as:

C_total = d × t + 1 + s

where d and t are the parameters of the cuckoo hash and s is the stash size.
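As a quick arithmetic check of the capacity formulas, the sketch below evaluates them for one configuration. The values of d, t and s are hypothetical and chosen only for illustration; they are not parameters reported in the paper.

```python
def cuckoo_capacity(d: int, t: int) -> int:
    """d tables of t records each, plus one reconfiguration register."""
    return d * t + 1


def total_capacity(d: int, t: int, s: int) -> int:
    """Cuckoo capacity plus a stash of s records."""
    return cuckoo_capacity(d, t) + s


# Hypothetical configuration, chosen only for illustration.
d, t, s = 2, 1024, 31
print(cuckoo_capacity(d, t))    # 2049
print(total_capacity(d, t, s))  # 2080
```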


Evaluation And Results

Memory utilization can be computed in two basic ways:

U_cuckoo = (n - m) / C_cuckoo

U_total = n / C_total

n: total number of successfully inserted keys before the memory became full
m: number of keys that reside in the stash

The stash can always be filled up to 100% of its capacity, so m = s can always be used.

The values of n must be acquired from the test runs.
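The two utilization measures can be computed with a few lines of code. The sketch assumes the capacity definitions C_cuckoo = d × t + 1 and C_total = d × t + 1 + s and a full stash (m = s). The value of n below is a made-up placeholder, not a measured result; real values have to come from test runs.

```python
from typing import Tuple


def utilization(n: int, d: int, t: int, s: int) -> Tuple[float, float]:
    """Return (U_cuckoo, U_total) for n inserted keys, assuming m = s."""
    c_cuckoo = d * t + 1
    c_total = c_cuckoo + s
    m = s  # the stash is assumed to be completely filled
    return (n - m) / c_cuckoo, n / c_total


# Placeholder numbers for illustration only; n is not a measured value.
u_cuckoo, u_total = utilization(n=1900, d=2, t=1024, s=16)
print(f"U_cuckoo = {u_cuckoo:.3f}, U_total = {u_total:.3f}")
# U_cuckoo = 0.919, U_total = 0.920
```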

Evaluation And Results (cont.)

The evaluation examines the relation between the achievable memory utilization of the cuckoo hash and the stash size used, for different parameters.

The memory utilization plotted in the graphs is U_cuckoo, and the size of the stash (s) is plotted as a portion of t.

[Graphs on four slides not reproduced]


Conclusion

The proposed architecture combines the cuckoo hash engine with the BST engine, with a focus on parallel implementation in FPGA.


THANK YOU