On Designing Fast Nonuniformly Distributed IP Address Lookup Hashing Algorithms

Authors: Christopher J. Martinez, Devang K. Pandya, and Wei-Ming Lin

Publisher: IEEE/ACM Transactions on Networking, 2009

Presenter: Yuen-Shuo Li

Date: 2013/01/09

Outline

Introduction
Proposed Hashing Algorithm
Simulation Results
Implementation

Introduction(1/4)

Hashing has been widely used for fast IP address lookup, but the performance of known hashing schemes is far from optimal because actual IP addresses are nonuniformly distributed.

Introduction(2/4)

There exists a set of well-established hash algorithms, such as MD4, MD5, SHA-1, and SHA-2, which are used in the cryptography field.

These algorithms rely on a series of addition, bit rotation, and logic operations through many cycles.

Too slow!

Introduction(3/4)

CRC-based hash functions have proven to be an excellent means of hashing, but they have some potential shortcomings.

Compared to a simple XOR folding hash algorithm that can be implemented in a fast parallel circuit, the CRC-based hash function requires a sequential circuit and a much longer time to determine the hash value.

Can't be implemented in parallel!

Introduction(4/4)

The goal of this paper is to develop a universal hashing methodology applicable to nonuniformly distributed data sets.

Our proposed designs allow the application of a standard XOR folding hashing to produce a significantly improved performance.

A New Hash Function (improved XOR folding hashing)

[Figure: keys k1 to k19 spread across eight hash bins: {k1, k2, k3}, {k4, k5}, {k9, k10}, {k6, k11, k12}, {k13, k19}, {k7, k17}, {k8, k18}, {k14, k15, k16}]

Balanced!

Proposed Hashing Algorithm(1/13)

The hashing process is to hash each of the n-bit entries into an m-bit hash value.

[Figure: an n-bit entry is hashed into an m-bit hash value]

Proposed Hashing Algorithm(2/13)

Intuitively, using the bits with smaller d values for hashing would lead to a probabilistically better hash distribution.

d: for a given bit position, the absolute difference between the number of 0's and the number of 1's across all entries

[Figure: example bit columns with the d value computed for each bit position]
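
The definition of d can be sketched in a few lines of Python. This is only an illustration: the function name `d_values`, the MSB-first indexing, and the tiny four-entry data set are assumptions made for the example, not taken from the paper.

```python
from typing import List


def d_values(keys: List[int], n: int) -> List[int]:
    """Return d for every bit position of n-bit keys (index 0 = most significant bit).

    d is taken here as |#zeros - #ones| in that bit column over all keys.
    """
    result = []
    for pos in range(n):
        shift = n - 1 - pos                      # MSB-first indexing (assumption)
        ones = sum((k >> shift) & 1 for k in keys)
        zeros = len(keys) - ones
        result.append(abs(zeros - ones))
    return result


# Hypothetical 4-bit data set: the first bit is always 1, the others are balanced.
keys = [0b1100, 0b1010, 0b1001, 0b1111]
print(d_values(keys, 4))  # [4, 0, 0, 0]
```

A bit column with d = 0 splits the entries evenly, while the first column (d = 4) carries no information for distinguishing these entries.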

Proposed Hashing Algorithm(3/13)

Employ a simple preprocessing step that rearranges the n bit vectors (one per bit position) in increasing order of their d values.
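
A minimal sketch of this preprocessing step, assuming the d values have already been computed and are indexed MSB-first. The helper names `sorted_order` and `permute` are hypothetical, and ties are broken by original position, which the slides do not specify.

```python
from typing import List


def sorted_order(d: List[int]) -> List[int]:
    """Bit positions ordered by increasing d (stable, so ties keep original order)."""
    return sorted(range(len(d)), key=lambda pos: d[pos])


def permute(key: int, n: int, order: List[int]) -> int:
    """Rebuild an n-bit key with its bits rearranged into the given order."""
    out = 0
    for pos in order:
        bit = (key >> (n - 1 - pos)) & 1         # position 0 is the MSB
        out = (out << 1) | bit
    return out


d = [4, 0, 0, 0]                 # example d values for a 4-bit key set
order = sorted_order(d)          # [1, 2, 3, 0]
print(order, bin(permute(0b1100, 4, order)))  # [1, 2, 3, 0] 0b1001
```

The order is computed once for the whole database, so at lookup time the rearrangement is a fixed bit permutation.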

Proposed Hashing Algorithm(4/13)

Bit-extraction hashing simply extracts m bits from the n-bit entry as its hash value.

[Figure: EXT extracts m bits directly from the n-bit entry; d-EXT first sorts the bit positions by d, then extracts m bits]
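
The difference between EXT and d-EXT can be illustrated on a toy data set. The sketch below assumes that EXT takes the first m bit positions and that d-EXT takes the m positions with the smallest d; both choices, the function name `extract`, and the data are assumptions made for illustration.

```python
from collections import Counter
from typing import List


def extract(key: int, n: int, positions: List[int]) -> int:
    """Concatenate the bits of an n-bit key at the given positions (0 = MSB)."""
    h = 0
    for pos in positions:
        h = (h << 1) | ((key >> (n - 1 - pos)) & 1)
    return h


keys = [0b1100, 0b1010, 0b1001, 0b1111]   # hypothetical entries, first bit always 1
n, m = 4, 2
d = [4, 0, 0, 0]                          # d values of the four bit columns

ext_positions = list(range(m))                               # EXT: first m bits (assumption)
d_ext_positions = sorted(range(n), key=lambda p: d[p])[:m]   # d-EXT: m smallest-d bits

ext_bins = Counter(extract(k, n, ext_positions) for k in keys)
d_ext_bins = Counter(extract(k, n, d_ext_positions) for k in keys)

# Largest bin under each scheme: 2 for EXT, 1 for d-EXT on this data.
print(max(ext_bins.values()), max(d_ext_bins.values()))
```

On this data EXT wastes one hash bit on a constant column, so the four entries land in only two bins, while d-EXT spreads them over four.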

Proposed Hashing Algorithm(5/13)

MSL: the largest number of entries that are mapped into any hash bin.

ASL: the average number of matching steps needed for any given record to match.

n = 32, m varied
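
Both metrics can be computed from the list of hash values produced for a data set. The sketch assumes that each bin is searched linearly, so the i-th entry in a bin needs i matching steps; that search model and the name `msl_asl` are assumptions, not details given on the slides.

```python
from collections import Counter
from typing import List, Tuple


def msl_asl(hashes: List[int]) -> Tuple[int, float]:
    """Return (MSL, ASL) for a list of hash values, one per stored entry."""
    bins = Counter(hashes)
    msl = max(bins.values())
    # A bin holding c entries costs 1 + 2 + ... + c steps in total (linear search assumed).
    total_steps = sum(c * (c + 1) // 2 for c in bins.values())
    return msl, total_steps / len(hashes)


print(msl_asl([3, 2, 2, 3, 0]))  # (2, 1.4)
```

A perfectly balanced table with one entry per bin gives MSL = 1 and ASL = 1.0.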

Proposed Hashing Algorithm(6/13)

Group-XOR is a commonly used hashing technique that reduces the n-bit key to an m-bit hash result by XORing every n/m key bits into one final hash bit.

[Figure: the n-bit key is split into m-bit segments, which are XORed together into an m-bit hash]
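
Group-XOR (XOR folding) is short enough to sketch directly. The function name `group_xor` is hypothetical, and the handling of a key length that is not a multiple of m (the top segment is simply shorter) is an assumption.

```python
def group_xor(key: int, m: int) -> int:
    """Fold a key into m bits by XORing its consecutive m-bit segments."""
    mask = (1 << m) - 1
    h = 0
    while key:
        h ^= key & mask     # XOR the lowest m-bit segment into the hash
        key >>= m           # move on to the next segment
    return h


# 32-bit key folded to 8 bits: 0x12 ^ 0x34 ^ 0x56 ^ 0x78
print(hex(group_xor(0x12345678, 8)))  # 0x8
```

Each hash bit is the XOR of n/m key bits, and all hash bits are independent of one another, which is why the scheme maps to a shallow parallel circuit.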

Proposed Hashing Algorithm(7/13)

The goal of this paper is to use the extracted information from the preprocessing (d values) to facilitate a better hash design with the XOR operator.

Proposed Hashing Algorithm(8/13)

In order not to degrade the hash performance, every intended XOR operation between two bits should lead to a d value such that [condition lost in extraction].
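
The d value of an XORed bit is not determined by the two input d values alone; it depends on how the two bit columns correlate, so it has to be measured on the data. A small sketch, with hypothetical helper names and made-up columns:

```python
from typing import List


def d_of(column: List[int]) -> int:
    """|#zeros - #ones| of one bit column (one bit per entry)."""
    ones = sum(column)
    return abs(len(column) - 2 * ones)


def xor_columns(a: List[int], b: List[int]) -> List[int]:
    """Bitwise XOR of two columns, entry by entry."""
    return [x ^ y for x, y in zip(a, b)]


a = [1, 1, 1, 1, 1, 0]   # heavily skewed column
b = [1, 1, 1, 0, 0, 0]   # perfectly balanced column
c = xor_columns(a, b)    # [0, 0, 0, 1, 1, 0]

print(d_of(a), d_of(b), d_of(c))  # 4 0 2
```

Here the XOR result is better balanced than the skewed column but worse than the balanced one, which is the kind of outcome a pairing rule has to check for.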

Proposed Hashing Algorithm(9/13)

Bit vectors with smaller d values are XORed with bit vectors with larger d values in order to have a better chance of further reduction.

Bit vectors in the middle range are XORed together to provide the most reductions available.

Proposed Hashing Algorithm(10/13)

Two straightforward ways to exploit the d-value-based sorted sequence, d-IOX and d-SOX, perform XOR hashing on the preprocessed database.

Proposed Hashing Algorithm(11/13)

The traditional group-XOR process may easily have a detrimental effect, while both d-IOX and d-SOX avoid XORing two bits that both have small d values (the worst possible XORing) or both have large d values (the XORing leading to minimal gain).

Proposed Hashing Algorithm(12/13)

Natural-Fold XOR (d-NFX) folds the sorted bit sequence so that bits from the two ends are matched up in pairs and XORed.

Natural-Fold with Duplication XOR (d-NFD) duplicates the middle subsegments to patch up the missing portion for uniformity.
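
The folding idea can be illustrated for the simplest case only. The sketch below assumes the key has already been rearranged into d-sorted order and that n = 2m, so a single fold pairs the bit at sorted position i with the bit at position n-1-i. It is not the general d-NFX or d-NFD construction, whose handling of other n/m ratios is not spelled out on the slides; the function name is hypothetical.

```python
def natural_fold_xor(sorted_key: int, n: int) -> int:
    """Single end-to-end fold of an n-bit, d-sorted key into n // 2 hash bits.

    Hash bit i = bit at sorted position i XOR bit at sorted position n-1-i
    (position 0 is the MSB). Assumes n is even.
    """
    h = 0
    for i in range(n // 2):
        front = (sorted_key >> (n - 1 - i)) & 1   # smaller-d end
        back = (sorted_key >> i) & 1              # larger-d end
        h = (h << 1) | (front ^ back)
    return h


print(bin(natural_fold_xor(0b1100, 4)))  # 0b11
```

With this pairing the smallest-d bit meets the largest-d bit and the middle-range bits meet each other, matching the pairing preference described on the earlier slide.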

Proposed Hashing Algorithm(13/13)

d-NFD may lead to overduplication or underduplication on the center subsegments.

A simple remedy is adopted: truncate the bits that overshoot, or duplicate more than once.

Simulation Results(1/12)

The data set used for our simulation is randomly generated such that the value for each bit position is uniformly distributed.

16384 (2^14) entries

Simulation Results(2/12)

The simulation results for n = 32 and are given in Fig. 12 in terms of MSL and ASL by taking an average of results from 1000 runs.

MSL: the largest number of entries that are mapped into any hash bin.

ASL: the average number of matching steps needed for any given record to match.

RS hash

Simulation Results(3/12)

RS Hash (additional)

Simulation Results(4/12)

A summary of the performance gain in MSL of each of the three proposed techniques and the two reference techniques over group-XOR.

Simulation Results(5/12)

RS Hash: RS is a multiplicative hash algorithm that requires two multiplications and one addition for every 8 bits of the hash key to generate a hash value.

CRC-32 Hash: CRC-32 requires 32 iterations to generate the final hash value for a given hash key, and needs additional control logic to properly maintain the sequential process.
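
To make the per-byte and per-bit costs concrete, here is a sketch of both reference hashes. The RS function follows the widely circulated form attributed to Robert Sedgewick (constants 63689 and 378551), and the CRC is the common reflected CRC-32 with polynomial 0xEDB88320; it is an assumption that these are the exact variants evaluated in the paper.

```python
def rs_hash(data: bytes) -> int:
    """RS hash, 32-bit: two multiplications and one addition per input byte."""
    a, b = 63689, 378551
    h = 0
    for byte in data:
        h = (h * a + byte) & 0xFFFFFFFF   # multiply and add
        a = (a * b) & 0xFFFFFFFF          # second multiply updates the multiplier
    return h


def crc32_bitwise(data: bytes) -> int:
    """Bit-serial CRC-32: one shift/XOR iteration per input bit."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):                # 8 sequential steps per byte
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


ip = bytes([192, 168, 1, 1])              # a 32-bit key: 4 bytes, 32 CRC iterations
print(hex(rs_hash(ip)), hex(crc32_bitwise(ip)))
```

In both loops each step depends on the result of the previous one, which is the sequential dependency that a pure XOR-folding circuit avoids.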

Simulation Results(6/12)

The average d value of each final hash bit for m = 14.

Simulation Results(7/12)

A collection of real IP addresses was gathered from three different sources: general IP traffic addresses, ad/spam IP addresses, and P2P IP addresses.

Simulation Results(8/12)

Performance comparison in terms of MSL and ASL on general IP traffic addresses.

Simulation Results(9/12)

Performance comparison in terms of MSL and ASL on ad/spam IP traffic addresses.

Simulation Results(10/12)

Performance comparison in terms of MSL and ASL on P2P IP traffic addresses.

Simulation Results(11/12)

To further analyze the potential performance difference between the d-value XOR folding algorithms and the well-established CRC and RS hashing algorithms, a χ² (chi-square) analysis is conducted.

Simulation Results(12/12)

The χ² analysis
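
A χ² statistic for a hash distribution can be sketched as follows, assuming the standard Pearson form measured against a perfectly uniform spread of entries over all bins. The function name `chi_square` and the choice of a uniform expectation are assumptions; the slides do not give the exact formulation used in the paper.

```python
from collections import Counter
from typing import List


def chi_square(hashes: List[int], num_bins: int) -> float:
    """Pearson chi-square of observed bin counts against a uniform expectation."""
    expected = len(hashes) / num_bins
    counts = Counter(hashes)
    return sum(
        (counts.get(b, 0) - expected) ** 2 / expected
        for b in range(num_bins)          # include empty bins
    )


print(chi_square([0, 1, 2, 3], 4))  # 0.0  (perfectly uniform)
print(chi_square([0, 0, 0, 0], 4))  # 12.0 (everything in one bin)
```

A smaller value means the hash spreads the entries closer to uniformly over the bins.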

Implementation(1/2)

The mapping from the original bit position to the sorted position and then through the d-SOX hashing.

Implementation(2/2)

d-NFD