PARALLEL TABLE LOOKUP FOR NEXT GENERATION INTERNET

Author: Parallel Table Lookup for Next Generation Internet

Publisher/Conf.: Computer Software and Applications, 2008. COMPSAC '08. 32nd Annual IEEE International

Speaker: Han-Jhen Guo

Date: 2009.03.11


OUTLINE

Introduction
The Proposed Scheme
Implement
Performance

INTRODUCTION- BINARY SEARCH AMONG PREFIX LENGTHS (1/2)

e.g. (address length = 8)

Prefix    Nexthop
00*       A
1001*     B
000*      C
001*      D
111*      E
10010*    F
1*        G

Fig. 1 Binary search tree among prefix lengths (root: length 4; second level: lengths 2 and 6; leaves: lengths 1, 3, 5, 7)

Hash table contents by prefix length: 1: G (1*); 2: A (00*); 3: C (000*), D (001*), E (111*); 4: B (1001*); 5: F (10010*)

INTRODUCTION- BINARY SEARCH AMONG PREFIX LENGTHS (2/2)

e.g. (search 11101000)

Length 4: key = 1110, not match
Length 2: key = 11, not match
Length 1: key = 1, match (G)

Error! E (111*) should be matched.

INTRODUCTION- BINARY SEARCH TREE WITH MARKERS (1/3)

Solution: markers

e.g. markers of prefix 10010* = 1*, 10*, 100*, 1001*

Meaning: a matching prefix longer than this marker may exist

Insert markers into the hash tables on the search path of the binary search tree (only pick those markers whose lengths appear in the lookup order)

To avoid backtracking, each marker is recorded with its BMP (best matching prefix)
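The marker idea above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the bit-string encoding, the names `search_path`, `best_match`, `build_tables` and `lookup`, and the use of Python dicts in place of real hash tables are all assumptions made for the example. It uses the 8-bit prefix table from the slides and a search tree over lengths 1 to 7.

```python
# Prefix table from the slides, written as bit strings (an illustrative encoding).
PREFIXES = {"00": "A", "1001": "B", "000": "C", "001": "D",
            "111": "E", "10010": "F", "1": "G"}
W = 7  # the slides' search tree covers prefix lengths 1..7


def search_path(target):
    """Lengths probed by the binary search before it reaches `target`."""
    path, lo, hi = [], 1, W
    while lo <= hi:
        mid = (lo + hi) // 2
        if mid == target:
            break
        path.append(mid)
        lo, hi = (mid + 1, hi) if mid < target else (lo, mid - 1)
    return path


def best_match(bits):
    """Nexthop of the longest real prefix of `bits`, or None."""
    for length in range(len(bits), 0, -1):
        if bits[:length] in PREFIXES:
            return PREFIXES[bits[:length]]
    return None


def build_tables(with_markers=True):
    """One hash table per length; every entry maps to its precomputed BMP."""
    tables = {length: {} for length in range(1, W + 1)}
    for prefix, nexthop in PREFIXES.items():
        tables[len(prefix)][prefix] = nexthop
        if with_markers:
            for length in search_path(len(prefix)):
                if length < len(prefix):  # markers only at shorter lengths
                    marker = prefix[:length]
                    tables[length].setdefault(marker, best_match(marker))
    return tables


def lookup(tables, addr):
    bmp, lo, hi = None, 1, W
    while lo <= hi:
        mid = (lo + hi) // 2
        key = addr[:mid]
        if key in tables[mid]:          # prefix or marker: try longer lengths
            if tables[mid][key] is not None:
                bmp = tables[mid][key]
            lo = mid + 1
        else:                           # miss: try shorter lengths
            hi = mid - 1
    return bmp


print(lookup(build_tables(with_markers=False), "11101000"))  # G (E is missed)
print(lookup(build_tables(), "11101000"))                    # E
```

Without markers the search for 11101000 ends at G, as in the error case on the earlier slide. With the marker 11* (recorded with BMP G) in the length-2 table, the search continues to length 3 and finds E.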

INTRODUCTION- BINARY SEARCH TREE WITH MARKERS (2/3)

Fig. 2 Binary search tree among prefix lengths with markers (root: length 4; second level: lengths 2 and 6; leaves: lengths 1, 3, 5, 7)

Hash table contents by prefix length: 1: G (1*); 2: A (00*), marker 11* with BMP G; 3: C (000*), D (001*), E (111*); 4: B (1001*); 5: F (10010*)

INTRODUCTION- BINARY SEARCH TREE WITH MARKERS (3/3)

e.g. (search 11101000)

Length 4: key = 1110, not match
Length 2: key = 11, match (marker 11*, BMP = G)
Length 3: key = 111, match (E)

INTRODUCTION- CONCLUSION

The lookup scheme above is scalable, with complexity O(log2 W), where W is the length of the IP address.

Assuming a perfect hash function, each hash table needs to be looked up only once.

In IPv4, only 5 different hash tables need to be looked up in the worst case.
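As a quick check of the number above, the slide's bound can be evaluated directly. This takes O(log2 W) at face value as the count of hash tables probed; the IPv6 value is not stated on this slide and simply follows from the same formula with W = 128.

```latex
\log_2 W = \log_2 32 = 5 \quad \text{(IPv4, } W = 32\text{)}
\qquad
\log_2 W = \log_2 128 = 7 \quad \text{(IPv6, } W = 128\text{)}
```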

THE PROPOSED SCHEME- MERGING HASH TABLES

The concept of merging hash tables (n = 1, 2, 3, 4, etc.)

If either prefix P.0 or prefix P.1 is in Table 2n+1 ("." means the prefix is followed by that bit), there should be a marker P in Table 2n

Associate the marker P in Table 2n with P.0 and P.1

After merging, only 4 (instead of 5) different hash tables need to be looked up in the worst case
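One way to picture the merge is to fold every odd-length prefix P.0 or P.1 into the node for P in the even-length table. The sketch below does this for the 8-bit example. The `MergedNode` layout and its field names are hypothetical (the slides' actual node format is not reproduced here), and treating the length-0/length-1 pair the same way as the other pairs is an assumption. Merging halves the number of distinct lengths, which is consistent with the worst case dropping from log2 32 = 5 to log2 16 = 4 tables.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Prefix table from the slides, written as bit strings (an illustrative encoding).
PREFIXES = {"00": "A", "1001": "B", "000": "C", "001": "D",
            "111": "E", "10010": "F", "1": "G"}


@dataclass
class MergedNode:
    # Hypothetical layout: nexthop is set only if P itself is a real prefix.
    nexthop: Optional[str] = None
    # Maps the extra bit "0"/"1" to the nexthop of P.0 / P.1.
    children: Dict[str, str] = field(default_factory=dict)


def merge(prefixes):
    """Group prefixes of length 2n and 2n+1 under the 2n-bit key P."""
    tables = {}
    for prefix, nexthop in prefixes.items():
        even = len(prefix) - len(prefix) % 2
        node = tables.setdefault(even, {}).setdefault(prefix[:even], MergedNode())
        if len(prefix) == even:
            node.nexthop = nexthop                  # P is itself a prefix
        else:
            node.children[prefix[-1]] = nexthop     # P.0 or P.1
    return tables


for length, table in sorted(merge(PREFIXES).items()):
    for key, node in sorted(table.items()):
        print(length, key + "*", node.nexthop, node.children)
```

In this sketch the node for 11* has no nexthop of its own, so it plays the role of a marker that exists only because 111* does.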

THE PROPOSED SCHEME- DATA STRUCTURE OF MODIFIED HASH NODE (1/2)

Data structure


THE PROPOSED SCHEME- DATA STRUCTURE OF MODIFIED HASH NODE (2/2)

e.g. after merging

Fig. 3 Merging hash tables (the tree of Fig. 2 after merging: root: Table 4, 5; second level: Table 2, 3 and Table 6, 7)

Merged node entries shown in the figure:
1001* (length 4): BMP B, P.0 = F, P.1 = -, flag bits as printed 1001-0
00* (length 2): BMP A, P.0 = C, P.1 = D, flag bits as printed 1011-0
11* (length 2, marker): BMP G (length 1), P.0 = -, P.1 = E, flag bits as printed 1010-0
* (length 0, default port): P.0 = -, P.1 = G, flag bits as printed 1010-0

THE PROPOSED SCHEME- LOOKUP ALGORITHM

e.g. (search 11101000)

Table 4, 5: key = 1110, not match
Table 2, 3: key = 11, match, BMP = G → E
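The trace above can be reproduced with a small Python sketch of lookup over merged tables. It is a sketch under stated assumptions rather than the authors' algorithm: entries are stored as `(BMP of P, {next bit: nexthop of P.bit})`, Python dicts stand in for hash tables, the default entry `*` is consulted before the search, and markers are inserted at every shorter even length instead of only on the search path (a simplification that keeps the code short).

```python
# Prefix table from the slides, written as bit strings (an illustrative encoding).
PREFIXES = {"00": "A", "1001": "B", "000": "C", "001": "D",
            "111": "E", "10010": "F", "1": "G"}
LEVELS = [2, 4, 6]  # even lengths of the merged tables (2,3), (4,5), (6,7)


def best_match(bits):
    """Nexthop of the longest real prefix of `bits`, or None."""
    for length in range(len(bits), 0, -1):
        if bits[:length] in PREFIXES:
            return PREFIXES[bits[:length]]
    return None


def build():
    """tables[m][P] = (BMP of P, {next bit: nexthop of P.bit})."""
    tables = {m: {} for m in [0] + LEVELS}
    tables[0][""] = (None, {})  # default entry "*"
    for prefix, nexthop in PREFIXES.items():
        even = len(prefix) - len(prefix) % 2
        # Simplification: a marker (or the prefix itself) at every even
        # length up to `even`, not only at lengths on the search path.
        for m in range(2, even + 1, 2):
            key = prefix[:m]
            tables[m].setdefault(key, (best_match(key), {}))
        if len(prefix) > even:  # odd length: store under P as P.0 or P.1
            tables[even][prefix[:even]][1][prefix[-1]] = nexthop
    return tables


def lookup(tables, addr):
    """Return (BMP, list of keys probed) for an 8-bit address string."""
    default_bmp, default_children = tables[0][""]
    bmp = default_children.get(addr[0], default_bmp)
    trace, lo, hi = [], 0, len(LEVELS) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        m = LEVELS[mid]
        trace.append(addr[:m])
        node = tables[m].get(addr[:m])
        if node is None:
            hi = mid - 1                      # miss: try shorter lengths
        else:
            node_bmp, children = node
            best = children.get(addr[m], node_bmp)  # P.bit beats the BMP of P
            if best is not None:
                bmp = best
            lo = mid + 1                      # match: try longer lengths
    return bmp, trace


print(lookup(build(), "11101000"))  # ('E', ['1110', '11'])
```

Only two tables are probed for 11101000: the miss on 1110, then the match on 11, where the stored BMP G is replaced by E because the next address bit selects P.1.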

THE PROPOSED SCHEME- MAKING LOOKUP ALGORITHM PIPELINED (1/2)

Binary search tree for IPv6 without merging hash tables

Modified binary search tree for IPv6

THE PROPOSED SCHEME- MAKING LOOKUP ALGORITHM PIPELINED (2/2)

Assign each processing unit to look up the hash tables in one level of the tree → 6 stages in total

Each stage:
1. use the destination IP address as the key to compute the hash
2. look up the hash table using the hash value as the index
3. do computation according to the lookup result:
   BMP so far
   the hash table to be searched next
   the skip flag for the next processing unit regarding the BMP that has been found
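The per-stage work listed above can be sketched as a function that takes the state handed over by the previous stage and returns the state for the next one. Everything here is an assumption made for illustration: the `Packet` fields, the hand-built tables for the 8-bit example, the `NEXT` map describing the tree, and the reading of the skip flag as "no further lookup is needed". The sketch pushes one packet through the stages sequentially, whereas in a real pipeline each stage would be working on a different packet at the same time, and the 8-bit example needs only 2 stages rather than the 6 used for IPv6.

```python
from dataclasses import dataclass
from typing import Optional

# Hand-built merged tables for the slides' 8-bit example (assumed layout):
# TABLES[m][P] = (BMP of P, {next bit: nexthop of P.bit}).
TABLES = {
    2: {"00": ("A", {"0": "C", "1": "D"}), "11": ("G", {"1": "E"})},
    4: {"1001": ("B", {"0": "F"})},
    6: {},
}
DEFAULT = {"1": "G"}  # default entry "*": nexthop for 1*, none for 0*
# Tree over the merged tables: length -> (next table on miss, next table on match).
NEXT = {4: (2, 6), 2: (None, None), 6: (None, None)}


@dataclass
class Packet:
    addr: str
    bmp: Optional[str] = None       # BMP so far
    next_table: Optional[int] = 4   # hash table to be searched next (root first)
    skip: bool = False              # True once no further lookup is needed


def stage(pkt):
    """One pipeline stage: hash, look up, then compute the state to pass on."""
    if pkt.skip:
        return pkt
    m = pkt.next_table
    node = TABLES[m].get(pkt.addr[:m])  # dict lookup stands in for hash + table access
    if node is None:
        pkt.next_table = NEXT[m][0]
    else:
        node_bmp, children = node
        best = children.get(pkt.addr[m], node_bmp)
        if best is not None:
            pkt.bmp = best
        pkt.next_table = NEXT[m][1]
    pkt.skip = pkt.next_table is None
    return pkt


def run_pipeline(addr, stages=2):
    """Push one packet through all stages in order and return its BMP."""
    pkt = Packet(addr, bmp=DEFAULT.get(addr[0]))
    for _ in range(stages):
        pkt = stage(pkt)
    return pkt.bmp


print(run_pipeline("11101000"))  # E
```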

THE PROPOSED SCHEME- USING MULTI-THREADING IN THE PIPELINE STAGE


IMPLEMENT- IMPLEMENTATION PLATFORM

IXP2400
   8 micro engines, each supporting 8 threads
   use 6 micro engines to implement the pipeline design, and run 8 threads on each micro engine to realize the multi-threading design

IXA SDK 4.1 is used to simulate the IXP2400 environment

IMPLEMENT- IMPLEMENTATION BRIEFS (1/4)

Maximum size of three separate memories

The average latencies of reading eight 4-byte words from SRAM and DRAM in the circumstance of only one micro engine trying to access the memories


IMPLEMENT- IMPLEMENTATION BRIEFS (2/4)

The average latencies of reading 8 words from a given SRAM or DRAM channel when different numbers of micro engines contend for that channel

The memories allow 8 simultaneous SRAM accesses (4 per channel) and 3 simultaneous DRAM accesses without increasing the average memory latency

IMPLEMENT- IMPLEMENTATION BRIEFS (3/4)

Distribute the hash tables across the 3 separate memories of the IXP2400

IMPLEMENT- IMPLEMENTATION BRIEFS (4/4)

Hashing
   hash function: CRC32
   collision resolution: chaining
   alleviate the penalty of hash collisions by accessing 2 contiguous nodes at a time, at the cost of a little extra memory latency

Fig. Chaining in a hash

Fig. Chaining with 2 contiguous nodes
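The effect of fetching 2 contiguous nodes per access can be illustrated with a small chained hash table in Python. This is a rough model only: `zlib.crc32` from the standard library stands in for the CRC32 hash, the class name `ChainedTable`, the bucket count and the "memory read" counter are assumptions for the example, and a Python list plays the role of a chain whose nodes are laid out contiguously in pairs.

```python
import zlib

NODES_PER_READ = 2  # nodes fetched by one memory access, as on the slide


class ChainedTable:
    def __init__(self, num_buckets=8):
        # num_buckets is an illustrative size, not a value from the slides.
        self.num_buckets = num_buckets
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # CRC32 of the key bytes, reduced to a bucket index.
        return zlib.crc32(key.encode("ascii")) % self.num_buckets

    def insert(self, key, value):
        # Colliding keys are appended to the same chain.
        self.buckets[self._index(key)].append((key, value))

    def find(self, key):
        """Return (value, memory reads); each read covers 2 contiguous nodes."""
        chain = self.buckets[self._index(key)]
        reads = 0
        for start in range(0, len(chain), NODES_PER_READ):
            reads += 1
            for k, v in chain[start:start + NODES_PER_READ]:
                if k == key:
                    return v, reads
        return None, reads


# Force every key into one chain to show the worst case for collisions.
table = ChainedTable(num_buckets=1)
for prefix, nexthop in [("00", "A"), ("1001", "B"), ("000", "C"), ("001", "D")]:
    table.insert(prefix, nexthop)
print(table.find("001"))  # ('D', 2)
```

With one node per access, reaching the fourth node of a chain would take 4 reads. With 2 contiguous nodes per access it takes 2, which is the saving the slide describes.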

PERFORMANCE

Comparisons of maximum forwarding rates

Use 10,000 random IP addresses and calculate the total number of cycles required to perform the lookups