23
© 2019 Cray Inc. Towards Radix Sorting in the Chapel Standard Library Michael Ferguson CHIUW 2019

Towards Radix Sorting in the Chapel Standard Library

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

To w a r d s R a d i x S o r t i n g i n t h e

C h a p e l S t a n d a r d L i b r a r y

Michael FergusonCHIUW 2019

Page 2: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

• Interface Design for Radix Sort• Exploring Parallel Radix Sort Algorithms• Comparing Single-Locale Performance• Distributed Sorting

Outline

2

Page 3: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

Interface Design for Radix Sort

Page 4: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

• Programming languages usually offer a sort library• These normally include the ability to specify a comparison function

• but that limits implementations to comparison sorting algorithms

• Some sorting libraries also allow specifying a key to sort by• easy to sort by a specific field in a structure• but no help for variable length keys

• Would like to improve on this situation• to enable radix sorting in most cases• including the common case of sorting by variable-length strings

Interface Design for Radix Sort

4

Page 5: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

use Sort;

record MyRecord { var key: int; var value: int; }

record MyKeyComparator {

proc key(element: MyRecord) {

return element.key; // now uses radix sorting for integral keys}

}

config const n = 10000;

var A: [1..n] MyRecord = [i in 1..n] new MyRecord(i, i*i);

sort(A, new MyKeyComparator());

The .key method

5

Page 6: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

use Sort;

record MyRecord { var key: c_string; var value: int; }

record MyKeyPartComparator { }

proc keyPart(element: MyRecord, i: int) {

var byte = element.key[i-1]; // compute the current key byte// has the end been reached? Note, c_strings have a 0 terminator

var done = if byte != 0 then 0 else -1;

return (done, byte);

} }

var A:[1..n] MyRecord = ...;

sort(A, new MyKeyPartComparator());

The .keyPart method

6

Page 7: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

• Say we are sorting strings• Which comes first?

• "badminton"• "bad"

• keyPart returns a tuple to indicate the ordering here• Tuple consists of (section, part)

• (-1, part) ⇢ sort this key before those with more data• ( 0, part) ⇢ sort based on key data in part• ( 1, part) ⇢ sort this key after those with more data

How .keyPart supports variable length keys

7

Page 8: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

Exp lor ing Para l le l Radix Sor t A lgor i thms

Page 9: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

• Two most-significant digit first counting radix sorts• Recursive algorithm with serial bucketize inspired by [1]• Two-array algorithm with parallel bucketize

[1] Peter M McIlroy, Keith Bostic, and M Douglas McIlroy. 1993. Engineering radix sort. Computing systems 6, 1 (1993), 5–27.

Algorithms Explored

9

Page 10: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc. 10

21 11 53 52 26 11 22 15 Count 1st digit:3x '1' 3x '2' 2x '5'

Scan to find bin starts1s 2s 5s

Bucketize: move into bins11 11 15 21 26 22 53 52

Continue with next digit within each bin

11 11 15 21 26 22 53 52

Page 11: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

Recursive Algorithm

11

proc recursiveSort(start, end, A, digit) {

// local arrays for byte counts and offsetsvar counts, offsets : [0..#256] int;

parallelCountAndScan(...)

sequentialInPlaceBucketize(...); // repeated swapping of current itemforall bins do

// recursively calls algorithmrecursiveSort(binStart, binEnd,

A, digit+1);

}

Drawbacks:Limited parallel speedupLots of array allocationsNot a stable sort

Page 12: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

Iterative Algorithm

12

var counts, offsets : [0..#256] int; // just one of each per sort callproc twoArraySort(start, end, A, Scratch, digit) {

bigTasks.push( ... );

while !bigTasks.isEmpty() {

task = bigTasks.pop();

parallelCountAndScan(...);

parallelBucketizeToScratch(...)

for bins do append task to bigTasks or smallTasks

}

forall tasks in smallTasks do baseCaseSort(...)

}

Drawbacks:Uses 2n space

Page 13: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc. 13

Count 1st digit:3x '1' 3x '2' 2x '5'

Scan to find bin starts

Bucketize: move into bins11 11 15 21 26 22 53 52

Continue with next digit within each bin

11 11 15 21 26 22 53 52

21 11 53 52 26 11 22 15

t1 t2 t4t3

1s 2s 5st1 t3 t4 t1 t3 t4 t2

Page 14: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

Compar ing Sing le-Locale Per formance

Page 15: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

66

2,813

1,766

175

1,485

90

161

94

8

0 500 1000 1500 2000 2500 3000 3500

Chapel two-array radixChapel recursive radix

Chapel quickSortips4o

Boost Block IndirectBoost Spreadsort

GCC parallel_stl sortC++ std::sort

C qsortnumpy sort

python3 builtin sort

Speed (MiB/s)

Sorting 1GiB of random uint(64): Broadwell

15

Page 16: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

3,034

670

66

2,813

1,766

175

1,485

90

161

94

8

0 500 1000 1500 2000 2500 3000 3500

Chapel two-array radixChapel recursive radix

Chapel quickSortips4o

Boost Block IndirectBoost Spreadsort

GCC parallel_stl sortC++ std::sort

C qsortnumpy sort

python3 builtin sort

Speed (MiB/s)

Sorting 1GiB of random uint(64): Broadwell

16

Page 17: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

D istr ibuted Sort ing

Page 18: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

Distributed Two-Array Algorithm

18

proc distSort(start, end, A, Scratch, digit) {

while !distTasks.isEmpty() {

countAndBucketizeLocalDataToScratch(...) on each locale

for bins do append task to distTasks or localTasks

}

forall tasks in localTasks do twoArraySort(...)

}

Page 19: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

0

500

1000

1500

2000

2500

3000

0 4 8 12 16 20 24 28

Sor

t Spe

ed (M

iB/s

)

Number of Locales

Strong Scaling on Broadwell, sorting 100 M uint

19

Page 20: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

0

5

10

15

20

25

30

0 4 8 12 16 20 24 28

Sor

t Tim

e (s

)

Number of Locales

Weak Scaling on Broadwellsorting numLocales*100 M uint

20

Page 21: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

• Put two-array sorting and distributed sorting on master• Explore in-place one-pass parallel bucketizer as with ips4o [2]• Support sample sort

• when only comparison function is provided• for very data with skewed data distribution

Thanks to: Rupal Jain and Avneet Kaur (Rails Girls Summer of Code 2018)References:

[1] Peter M McIlroy, Keith Bostic, and M Douglas McIlroy. 1993. Engineering radix sort. Computing systems 6, 1 (1993), 5–27.

[2] Michael Axtmann, Sascha Witt, Daniel Ferizovic, Peter Sanders. In-Place Parallel Super Scalar Sample Sort. arXiv:1705.02257v2 [cs.DC] 29 Jun 2017

Future Work

21

Page 22: Towards Radix Sorting in the Chapel Standard Library

© 2019 Cray Inc.

S A F E H A R B O R S TAT E M E N T

This presentation may contain forward-looking statements that are based on our current expectations. Forward looking statements may include statements about our financial guidance and expected operating results, our opportunities and future potential, our product development and new product introduction plans, our ability to expand and penetrate our addressable markets and other statements that are not historical facts.

These statements are only predictions and actual results may materially vary from those projected. Please refer to Cray's documents filed with the SEC from time to time concerning factors that could affect the Company and these forward-looking statements.

22

Page 23: Towards Radix Sorting in the Chapel Standard Library

THANK YOUQ U E S T I O N S ?

@ChapelLanguage

chapel-lang.org

[email protected]

@cray_inc

linkedin.com/company/cray-inc-

cray.com