ADAPTING TREE STRUCTURES FOR PROCESSING WITH SIMD INSTRUCTIONS
Authors: Steffen Zeuch, Frank Huber, Johann-Christoph Freytag
Humboldt-Universität zu Berlin
{zeuchste,huber,freytag}@informatik.hu-berlin.de
Motivation
B+-Tree: common index structure
Common node-internal search algorithm:
Binary search in O(log_2 n)
Can we do better?
Yes, with SIMD!
Outline
1. Background
2. Binary Search and SIMD
3. Segmented Tree
4. Segmented Trie
5. Evaluation
6. Conclusion
SIMD
Single Instruction Multiple Data:
Available on CPU and GPU
Arithmetical, comparison, conversion, and logical instructions
[Figure: three example operations on SIMD vectors: add a constant to a vector, add two vectors, and compare two vectors (≥), where the comparison yields 0 or -1 per element]
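The three operations named on this slide can be emulated in a few lines. This is only a sketch: Python lists stand in for SIMD registers (one list item per lane), the function names are hypothetical, and the input values are chosen for illustration. The one detail worth noticing is the comparison result, where a true lane becomes -1 (all bits set) and a false lane becomes 0.

```python
# Emulation only: a Python list stands in for a SIMD register, one item per lane.

def simd_add_const(vec, const):
    """Add the same constant to every lane."""
    return [x + const for x in vec]

def simd_add(a, b):
    """Add two vectors lane by lane."""
    return [x + y for x, y in zip(a, b)]

def simd_cmp_ge(a, b):
    """Compare lane by lane: a true lane becomes -1 (all bits set), a false lane 0."""
    return [-1 if x >= y else 0 for x, y in zip(a, b)]

print(simd_add_const([3, 2], 2))     # [5, 4]
print(simd_add([3, 2], [10, 20]))    # [13, 22]
print(simd_cmp_ge([8, 17], [9, 9]))  # [0, -1]
```

On real hardware each of these functions corresponds to a single instruction that processes all lanes at once.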
Binary Search
Search Key = 9
[Figure: binary search over a sorted key array, iterations 1 to 5; legend: Separator, Search Key, Excluded, Search Space]
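As a point of reference, a plain binary search that also counts its iterations might look like the sketch below. The data set (keys 0 to 25) is an assumption made for illustration; with it, the search for key 9 happens to need five iterations, one separator comparison per iteration.

```python
def binary_search(keys, search_key):
    """Return (index or None, number of iterations) for a sorted list."""
    lo, hi = 0, len(keys) - 1
    iterations = 0
    while lo <= hi:
        iterations += 1
        mid = (lo + hi) // 2      # the single separator of this iteration
        if keys[mid] == search_key:
            return mid, iterations
        if keys[mid] < search_key:
            lo = mid + 1          # exclude the left part of the search space
        else:
            hi = mid - 1          # exclude the right part of the search space
    return None, iterations

keys = list(range(26))            # assumed data set: 0, 1, ..., 25
print(binary_search(keys, 9))     # (9, 5)
```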
Binary Search with Two Separators
Search Key = 9
[Figure: search with two separators per iteration, iterations 1 to 3; legend: Separator, Search Key, Excluded, Search Space]
Binary Search + SIMD
SIMD Register A (separators): 8 17
SIMD Register B (search key): 9 9
SIMD Register C = (A >= B): 0 -1
[Figure legend: Separator, Search Key, Excluded, Search Space]
Problem: SIMD on CPU
SIMD on CPUs does not support scatter and gather functionality.
SIMD load(start position) fills a 4 x 32-bit SIMD register with consecutive keys: 8 9 10 11
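The limitation can be illustrated with a small emulation. The sketch assumes a sorted array holding the keys 0 to 25 and a hypothetical `simd_load` helper that, like a contiguous load, can only fetch adjacent elements. The two separators a search step needs sit far apart in the sorted array, so one load cannot fetch both.

```python
def simd_load(memory, start, lanes=4):
    """Emulates a contiguous SIMD load: `lanes` adjacent elements beginning at `start`."""
    return memory[start:start + lanes]

keys = list(range(26))            # assumed sorted key array: 0, 1, ..., 25

loaded = simd_load(keys, 8)       # what one load delivers
wanted = [keys[8], keys[17]]      # the separators a search step would need

print(loaded)                     # [8, 9, 10, 11]
print(wanted)                     # [8, 17]
```

Collecting `wanted` in one instruction would require a gather, which is why the keys are rearranged so that the separators of each step become neighbours in memory.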
Solution: K-ary Search by Schlegel et al.
Search Key = 9
[Figure: 3-ary search tree (k = 3) and its linearized order; legend: Separator, Search Key, Excluded, Search Space]
Applied K-ary Search
Search Key = 9
[Figure: 3-ary search tree and its linearized order, search iterations 1 to 3; legend: Separator, Search Key, Excluded, Search Space]
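The idea of linearizing a k-ary search tree and searching it can be sketched as follows. This is not the authors' implementation. It assumes a breadth-first layout of a perfect k-ary tree (exactly k^h - 1 keys), emulates the SIMD comparison with a list comprehension, and uses the hypothetical names `linearize` and `kary_search`. The data set 0 to 25 is also an assumption, picked because its first two separators are 8 and 17.

```python
def linearize(sorted_keys, k):
    """Breadth-first layout of a perfect k-ary search tree over sorted_keys."""
    n = len(sorted_keys)
    if n == 0:
        return []
    size = 1
    while size - 1 < n:
        size *= k
    if size - 1 != n:
        raise ValueError("this sketch needs exactly k**h - 1 keys")
    layout = []
    level = [(0, n)]                      # half-open index ranges, one per node
    while level:
        next_level = []
        for lo, hi in level:
            step = (hi - lo + 1) // k     # distance between neighbouring separators
            seps = [lo + step * (j + 1) - 1 for j in range(k - 1)]
            layout.extend(sorted_keys[s] for s in seps)
            if step > 1:                  # the node still has children
                starts = [lo] + [s + 1 for s in seps]
                ends = seps + [hi]
                next_level.extend(zip(starts, ends))
        level = next_level
    return layout

def kary_search(layout, k, search_key):
    """Return (rank of the first key >= search_key, number of iterations)."""
    node, base, span, iterations = 0, 0, len(layout) + 1, 0
    while span > 1:
        iterations += 1
        span //= k
        seps = layout[node * (k - 1):(node + 1) * (k - 1)]   # one contiguous load
        mask = [-1 if s >= search_key else 0 for s in seps]  # one emulated SIMD compare
        child = mask.count(0)             # separators smaller than the search key
        base += child * span
        node = node * k + 1 + child       # breadth-first number of the chosen child
    return base, iterations

keys = list(range(26))                    # assumed data set
layout = linearize(keys, 3)
print(layout[:2])                         # [8, 17]
print(kary_search(layout, 3, 9))          # (9, 3)
```

The returned rank refers to the sorted order, so the caller can check `keys[rank] == search_key` to distinguish a hit from a miss.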
Degree of Parallelism
SIMD Bandwidth   Search Method   Data Type   Parallel Comparisons
128-bit          K-ary Search    8-bit       16
                 K-ary Search    16-bit      8
                 K-ary Search    32-bit      4
                 K-ary Search    64-bit      2
                 Binary Search   All         1
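The k-ary rows of the table follow from dividing the register width by the key width. The sketch below reproduces them. The helper name is hypothetical, and the derived value k assumes that one search step compares k-1 separators, so k is the number of lanes plus one.

```python
def parallel_comparisons(register_bits, key_bits):
    """Number of keys that fit into one SIMD register."""
    return register_bits // key_bits

for key_bits in (8, 16, 32, 64):
    lanes = parallel_comparisons(128, key_bits)
    # k - 1 separators are compared per step, hence k = lanes + 1 (assumption)
    print(f"{key_bits}-bit keys: {lanes} parallel comparisons, k = {lanes + 1}")
```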
Segmented Tree
Change the node-internal search algorithm from the commonly used binary search to k-ary search.
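Problem: Unfilled Nodes
K-ary requirement: the number of keys must be a multiple of k-1
Missing keys are filled with Smax+1 (one more than the largest key in the node)
[Figure: 3-ary search tree and its linearized order]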
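One way to read the padding rule is sketched below. The function name `pad_node` is hypothetical, and the exact fill strategy of the original system is not given on the slide. The sketch simply appends Smax+1 until the key count is a multiple of k-1, which keeps the node sorted.

```python
def pad_node(sorted_keys, k):
    """Append Smax + 1 until the key count is a multiple of k - 1."""
    lanes = k - 1
    missing = (-len(sorted_keys)) % lanes     # keys needed to reach the next multiple
    if missing == 0:
        return list(sorted_keys)              # also covers the empty node
    filler = sorted_keys[-1] + 1              # Smax + 1 preserves the sort order
    return list(sorted_keys) + [filler] * missing

print(pad_node([1, 4, 7], 3))        # [1, 4, 7, 8]
print(pad_node([1, 4, 7, 9, 12], 5)) # [1, 4, 7, 9, 12, 13, 13, 13]
```

Because a filler could be mistaken for a real key equal to Smax+1, a real node would presumably also need to record how many of its keys are genuine.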
Reordering
New keys require reordering:
Sorting → Inserting → Linearizing
Exceptions:
Empty node
Key is greater than the largest existing key
Segmented Tree
Advantages:
High resource utilization
Fewer iterations required
Binary search: log_2 n vs. k-ary search: log_k n
Disadvantages:
Reordering overhead
Large data types decrease performance
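The gap between log_2 n and log_k n can be made concrete with a small calculation. The sketch counts the levels of a k-ary search tree that holds n keys. The value n = 26 and the choices of k are assumptions for illustration: k = 3 is the deck's running example, and k = 17 corresponds to 16 parallel comparisons on 8-bit keys.

```python
def iterations(n, k):
    """Smallest h with k**h - 1 >= n: the height of a k-ary search tree for n keys."""
    h, capacity = 0, 0
    while capacity < n:
        h += 1
        capacity = k ** h - 1
    return h

n = 26
print(iterations(n, 2))    # 5  (binary search)
print(iterations(n, 3))    # 3  (k-ary search, k = 3)
print(iterations(n, 17))   # 2  (k-ary search, k = 17)
```

The same function also hints at the disadvantage listed above: wider keys mean fewer lanes, a smaller k, and therefore more iterations.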
Segmented Trie
[Figure: example table with columns Key (Dec), Key (Hex), Partial Key (Hex), and the resulting trie with Level 1 and Level 2]
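The splitting of a key into partial keys, one per trie level, can be sketched as follows. Several things here are assumptions: the 16-bit key width, the 8-bit segment width, and all function names. A nested dictionary stands in for the trie nodes, where the actual structure would search each node with SIMD.

```python
def partial_keys(key, key_bits=16, segment_bits=8):
    """Split key into fixed-width partial keys, most significant segment first."""
    levels = key_bits // segment_bits
    mask = (1 << segment_bits) - 1
    return [(key >> (segment_bits * (levels - 1 - i))) & mask for i in range(levels)]

def trie_insert(root, key, value):
    *prefix, last = partial_keys(key)
    node = root
    for part in prefix:
        node = node.setdefault(part, {})   # a shared prefix is stored only once
    node[last] = value

def trie_lookup(root, key):
    node = root
    for part in partial_keys(key):
        if part not in node:
            return None                    # early termination: this prefix is absent
        node = node[part]
    return node

root = {}
trie_insert(root, 0x1234, "a")
trie_insert(root, 0x12FF, "b")
print([hex(p) for p in partial_keys(0x1234)])  # ['0x12', '0x34']
print(trie_lookup(root, 0x12FF))               # b
print(trie_lookup(root, 0x3400))               # None
```

Both inserted keys share the Level 1 partial key 0x12, which illustrates prefix compression, and the failed lookup stops at Level 1 without visiting Level 2.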
Segmented Trie
Advantages:
High SIMD search performance
Prefix compression
Early termination
Disadvantages:
Fixed level count
Reordering overhead
SegTree vs. SegTrie
                       SegTree                SegTrie
Derived From           B+-Tree                Prefix B-Tree
Number of Iterations   Tree Height            Max. #Level (Early termination)
Number of Levels       Dynamic                Static (Pre-defined)
DOP                    Depends on Data Type   16 (8-bit)
Test Setup
HW/SW Configuration:
CPU: Intel Xeon 5520, 4 x 2.26 GHz
L1: 32 KB, L2: 256 KB, L3: 8 MB, MM: 8 GB
Cacheline: 128 Byte, SIMD bandwidth: 128 bit
Windows 7 64-bit Professional
Test Dataset:
Synthetically generated, ascending, starting at 0
Evaluation: Bitmask
Three Algorithms:
1. Bit Shifting
2. Case-Switch
3. PopCnt
SIMD Register A: 8 17
SIMD Register B: 9 9
SIMD Register C = (A >= B): 0 -1
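The slide names the three algorithms without showing them, so the following is only a guess at what each one does. It assumes the comparison result is first condensed into a bitmask with one bit per lane (loosely modeled on x86 movemask instructions) and that the goal is the number of separators smaller than the search key. All function names and the four-lane lookup table are hypothetical.

```python
def movemask(lanes):
    """One bit per lane, lane 0 in the least significant bit."""
    bits = 0
    for i, lane in enumerate(lanes):
        if lane != 0:
            bits |= 1 << i
    return bits

def position_by_shifting(mask, n_lanes):
    """Bit shifting: shift right until the first set bit appears."""
    pos = 0
    while pos < n_lanes and (mask & 1) == 0:
        mask >>= 1
        pos += 1
    return pos

# Case-switch: sorted separators allow only n_lanes + 1 distinct masks (4 lanes here).
SWITCH_TABLE_4 = {0b1111: 0, 0b1110: 1, 0b1100: 2, 0b1000: 3, 0b0000: 4}

def position_by_switch(mask):
    return SWITCH_TABLE_4[mask]

def position_by_popcnt(mask, n_lanes):
    """PopCnt: every set bit is a separator >= search key, the rest are smaller."""
    return n_lanes - bin(mask).count("1")

separators, search_key = [4, 8, 12, 16], 9    # assumed four-lane example
lanes = [-1 if s >= search_key else 0 for s in separators]
mask = movemask(lanes)
print(bin(mask))                              # 0b1100
print(position_by_shifting(mask, 4))          # 2
print(position_by_switch(mask))               # 2
print(position_by_popcnt(mask, 4))            # 2
```

All three variants return the same position, so a comparison between them is purely about cost: a loop, a branch table, or a single population-count instruction.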
Evaluation: SegTree
Evaluation: SegTrie
Our Contributions
B+-Tree and prefix B-Tree using SIMD
Transformation and search algorithm for breadth-first and depth-first data layouts
Three algorithms for interpreting a SIMD comparison result
Solution for an arbitrary key count
Thanks
Backup