
Updating Designed for Fast IP Lookup

Author: Natasa Maksic, Zoran Chicha and Aleksandra Smiljanić
Conference: IEEE High Performance Switching and Routing (HPSR), 2012
Presenter: Kuan-Chieh Feng
Date: 2015/9/23

Department of Computer Science and Information Engineering National Cheng Kung University, Taiwan R.O.C.

Outline

Introduction BPFL (Balanced parallelized frugal lookup) POLP (Parallel optimized linear pipeline) Memory requirement Execution time Conclusion

National Cheng Kung University CSIE Computer & Internet Architecture Lab


Introduction

In this paper, we propose, implement, and analyze lookup table updating for the POLP and BPFL lookup algorithms:
• Compare the POLP and BPFL update algorithms on real-world routing tables.
• Measure the memory requirements of both lookup algorithms.
• Observe the number of memory accesses when the lookup tables are updated.



BPFL Structure (1/12)


BPFL Structure (2/12)

Number of levels: L = W / D, where W is the address length and D is the subtree depth.


BPFL Structure (3/12)

IP lookup example:
• W = 32, D = 8, L = 4
• IP = 10000000 10000000 11110000 11110000
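The example above can be sketched in code. This is a minimal illustration (not the paper's implementation) of how an address splits into one D-bit chunk per level when W = 32 and D = 8; the function name `level_chunks` is an assumption for the example.

```python
# Assumed parameters from the slide's example: W = 32, D = 8, L = W // D = 4.
W, D = 32, 8
L = W // D  # number of levels

def level_chunks(ip_bits):
    """Split the binary IP string into one D-bit chunk per level."""
    assert len(ip_bits) == W
    return [ip_bits[i * D:(i + 1) * D] for i in range(L)]

ip = "10000000100000001111000011110000"
print(L)                 # 4
print(level_chunks(ip))  # ['10000000', '10000000', '11110000', '11110000']
```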


(Figure: the address is searched at levels 1 to 4; the result from the highest level that matches is selected.)
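A small sketch of the "highest level match" selection: each level may or may not produce a result, and the deepest matching level wins (this corresponds to the longest prefix). The function and next-hop names here are illustrative assumptions, not from the paper.

```python
def final_match(level_results):
    """Pick the result from the highest (deepest) level that matched.

    level_results: list indexed by level, with None where a level had no match.
    """
    for result in reversed(level_results):
        if result is not None:
            return result
    return None  # no level matched

# Levels 1 and 3 matched; the level-3 next hop wins.
print(final_match(["hopA", None, "hopC", None]))  # hopC
```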

BPFL Structure(4/12)

Each module at level i contains two parts:
• Subtree search engine
• Prefix search engine


BPFL Structure (5/12)


(Figure: a level module, split into the subtree search engine and the prefix search engine.)

BPFL Structure (6/12)

Subtree search engine:
• Balanced Tree Selector
• Balanced Trees


BPFL Structure (7/12)

Subtree search engine:
• The Balanced Tree Selector gives the address of the root of the subtree.



BPFL Structure (8/12)

Subtree search engine:
• The selected balanced tree is traversed based on comparisons of its node entries to the given IP address.
• To use the on-chip memory frugally, balanced tree nodes do not store pointers to their children.
• The left child address is obtained by adding '0' before the parent's address, and the right child address by adding '1'.
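The pointer-free addressing above can be sketched as follows. Addresses are modeled as bit strings, child addresses are computed rather than stored, and `traverse` is an assumed name for a simple comparison-based walk; the tree contents are invented for the example, and the paper's actual node layout may differ.

```python
def left_child(addr):
    # Per the slides: the left child address is the parent's address
    # with a '0' added in front.
    return "0" + addr

def right_child(addr):
    # The right child address adds a '1' in front instead.
    return "1" + addr

def traverse(tree, key):
    """tree maps node addresses (bit strings) to stored keys; walk from
    the root (empty address, an assumption here) until the computed
    address is absent, returning the last node visited."""
    addr, last = "", None
    while addr in tree:
        last = addr
        addr = left_child(addr) if key <= tree[addr] else right_child(addr)
    return last

tree = {"": 10, "0": 5, "1": 20}
print(traverse(tree, 3))   # '0'
print(traverse(tree, 30))  # '1'
```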


BPFL Structure (9/12)

Subtree search engine


BPFL Structure (10/12)

Prefix search engine:
• Bitmap processor
• Internal memory block (subtree bitmap memory)


BPFL Structure (11/12)

Prefix search engine:
• If the number of non-empty nodes is below the threshold, their indices are kept in the internal memory.
• Otherwise, the complete bitmap vector describing the subtree structure is stored in the internal memory.
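The threshold rule above can be sketched like this. The encoding function, the tuple tags, and the threshold value are assumptions for illustration; only the rule itself (index list below the threshold, full bitmap otherwise) comes from the slide.

```python
def encode_subtree(nonempty, total_nodes, threshold):
    """Store indices of non-empty nodes when there are few of them;
    otherwise store the full bitmap over all subtree nodes."""
    if len(nonempty) < threshold:
        return ("indices", sorted(nonempty))
    bitmap = [0] * total_nodes
    for i in nonempty:
        bitmap[i] = 1
    return ("bitmap", bitmap)

print(encode_subtree([2, 5], 8, 3))        # ('indices', [2, 5])
print(encode_subtree([0, 2, 5, 7], 8, 3))  # ('bitmap', [1, 0, 1, 0, 0, 1, 0, 1])
```

Either representation lets the bitmap processor recover which subtree nodes exist; the index list simply costs less memory when the subtree is sparse.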


BPFL Structure (12/12)

Prefix search engine


BPFL advantage

BPFL uses the memory resources frugally, so large lookup tables can fit in the on-chip memory.

Since pipelining is used, one IP lookup can be performed per clock cycle.

Frugal memory use also makes it possible for BPFL to support IPv6.



BPFL Update



POLP Structure (1/7)


POLP Structure (2/7)


The POLP search engine contains:
• Pipeline Selector
• Pipelines 1 ~ P (stages)
• Final Selector

POLP Structure (3/7)

Pipeline Selector:
• Selects a pipeline using the first I bits of the input IP address.
• Passes the remaining 32 − I bits to that pipeline.
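The selector's bit split can be sketched with plain integer arithmetic. The value of I and the function name are assumptions for the example; the slide only states that the first I bits choose the pipeline and the remaining 32 − I bits are passed on.

```python
I = 4  # number of initial bits used to select the pipeline (assumed for the example)

def select_pipeline(ip):
    """Split a 32-bit address into (pipeline id, remaining bits)."""
    pipeline_id = ip >> (32 - I)             # first I bits
    remainder = ip & ((1 << (32 - I)) - 1)   # remaining 32 - I bits
    return pipeline_id, remainder

pid, rest = select_pipeline(0b10000000100000001111000011110000)
print(pid)  # 8, i.e. the first four bits 1000
```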


POLP Structure (4/7)

Pipeline:
• Each pipeline consists of F stages.
• The children of a given node do not have to be in the next stage.


POLP Structure (5/7)

Pipeline


POLP Structure (6/7)

Stage:
• Each stage contains a memory block that holds the nodes of the subtrees.


POLP Structure (7/7)

Stage memory contains:
• Next-hop bit
• Left child bit
• Right child bit
• Children pointer
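The four fields above can be modeled as a node record. The field names, widths, and the packing of children behind a single pointer are assumptions for illustration (one plausible reading of a shared children pointer); the slide only lists the fields.

```python
from dataclasses import dataclass

@dataclass
class StageNode:
    has_next_hop: bool  # next-hop bit: this node stores a next hop
    has_left: bool      # left child bit: a left child exists
    has_right: bool     # right child bit: a right child exists
    children_ptr: int   # base address of this node's children in a later stage

def child_address(node, bit):
    """Left child at children_ptr, right child just after it when both
    exist (an assumed packing); None when that child does not exist."""
    if bit == 0:
        return node.children_ptr if node.has_left else None
    if not node.has_right:
        return None
    return node.children_ptr + (1 if node.has_left else 0)

n = StageNode(True, True, True, 100)
print(child_address(n, 0))  # 100
print(child_address(n, 1))  # 101
```

Storing existence bits plus one shared pointer, rather than two full pointers, is what keeps each stage entry compact.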


POLP advantage

The lookup process is parallelized and pipelined.

Multiple IP addresses can be searched in parallel through distinct pipelines.



POLP Update



Memory requirement



Execution time


Memory access


Memory access


Conclusion

The POLP update algorithm is faster for large routing tables.

The BPFL update algorithm performs better for smaller routing tables in terms of the number of memory accesses.

The BPFL algorithm has smaller memory requirements, and its memory savings increase with the routing table size.


Supplement

Three realistic lookup tables:
• 71K
• 143K
• 309K

FPGA chip:
• Altera Stratix II RP2S180F1020C5


Supplement (BPFL)


Supplement (POLP)
