Minimal MapReduce Algorithms Yufei Tao Chinese University of Hong Kong, Hong Kong

Page 1

Minimal MapReduce Algorithms

Yufei Tao

Chinese University of Hong Kong, Hong Kong

Page 2

outline

• INTRODUCTION
• PRELIMINARY AND RELATED WORK
• SORTING
• BASIC MINIMAL ALGORITHMS IN DATABASES
• SLIDING AGGREGATION
• EXPERIMENTS
• CONCLUSIONS

Page 3

introduction

• Motivation

Although these principles have guided the design of MapReduce algorithms, the previous practices have mostly been on a best-effort basis, paying relatively less attention to enforcing serious constraints on different performance metrics.

Page 4

introduction

• Minimal MapReduce Algorithms

Minimum footprint
Bounded net-traffic
Constant round
Optimal computation
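The slide names the four criteria without stating them. A compact statement follows, under assumed notation that the slide does not define: n = |S| is the number of input objects, t the number of machines, m = n/t, and T_seq the running time of a sequential algorithm for the same problem. It is a summary of how these criteria are usually stated, not a quotation from the slides.

```latex
% Assumed notation: n = |S| objects, t machines, m = n/t,
% T_seq = running time of a sequential algorithm for the same problem.
\begin{align*}
\text{Minimum footprint:}   &\quad \text{each machine uses } O(m) \text{ storage at all times} \\
\text{Bounded net-traffic:} &\quad \text{each machine sends and receives } O(m) \text{ words per round} \\
\text{Constant round:}      &\quad \text{the algorithm terminates after } O(1) \text{ rounds} \\
\text{Optimal computation:} &\quad \text{each machine performs } O(T_{\mathrm{seq}} / t) \text{ computation in total}
\end{align*}
```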

Page 5

introduction

• Contributions

The core of this work comprises neat minimal algorithms for two problems:

Sorting
Sliding Aggregation

Page 6

introduction

Sorting
Sliding Aggregation

Page 7

related work

MapReduce
TeraSort
Algorithms on MapReduce
Relevance to Minimal Algorithms

Page 8

related work-MR

Statelessness for Fault Tolerance

Some MapReduce implementations (e.g., Hadoop) place the requirement that, at the end of a round, each machine should send all the data in its storage to a distributed file system.

Page 9

related work-TS

What's TeraSort?

Page 10

sorting-TS

Page 11

sorting

Define S_i = S ∩ (b_{i−1}, b_i], for 1 ≤ i ≤ t. In Round 2, all the objects in S_i are gathered by M_i, which sorts them in the reduce phase. For TeraSort to be minimal, it must hold:

P1. s = O(m).
P2. |S_i| = O(m) for all 1 ≤ i ≤ t.
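The two rounds described here can be simulated on a single machine to see what P1 and P2 measure. The sketch below is illustrative only and rests on assumptions the slide does not state: Round 1 samples every object independently with rate `rho = ln(nt)/m`, a single machine picks every ⌈s/t⌉-th sample as a boundary, and b_0 = −∞, b_t = +∞. The function name `terasort` and its parameters are hypothetical, and the input is assumed to be non-empty.

```python
import bisect
import math
import random


def terasort(machines, seed=0):
    """Simulate two TeraSort rounds on a list of per-machine object lists."""
    rng = random.Random(seed)
    t = len(machines)
    n = sum(len(part) for part in machines)  # assumed > 0
    m = n / t  # objects per machine under an even split

    # Assumed sampling rate; the slide does not give one.
    rho = min(1.0, math.log(n * t) / m)

    # Round 1: every machine samples its objects independently and sends
    # the samples to one machine, which picks at most t - 1 boundaries.
    samples = sorted(o for part in machines for o in part if rng.random() < rho)
    s = len(samples)
    step = math.ceil(s / t) if s else 1
    boundaries = samples[step - 1::step][:t - 1]  # b_1 <= ... <= b_{t-1}

    # Round 2: object o goes to M_i where b_{i-1} < o <= b_i,
    # and M_i sorts its share S_i locally.
    buckets = [[] for _ in range(t)]
    for part in machines:
        for o in part:
            buckets[bisect.bisect_left(boundaries, o)].append(o)
    return s, [sorted(b) for b in buckets]
```

Comparing the returned `s` and the largest bucket size against m = n/t gives an empirical reading of P1 and P2 for a given input; concatenating the buckets in machine order yields the sorted sequence.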

Page 12

sorting

Discussion

Minimality

Page 13

sorting

Removing the Broadcast Assumption

(by changing round 1)

Page 14

in databases

Ranking & Skyline

Group by

Semi-Join

Page 15

in databases

Group by

example

Page 16

sliding aggregation

The window sum of o equals:

$\text{win-sum}(o) = \sum_{o' \in \text{window}(o)} w(o')$
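As a point of reference for the distributed version, the window sum can be computed sequentially with prefix sums. The sketch below assumes a definition the slide does not spell out: window(o) is taken to be the l largest objects not exceeding o, with o itself included, and keys are distinct. The function name `window_sums` and the `(key, weight)` representation are hypothetical.

```python
from itertools import accumulate


def window_sums(objects, l):
    """Return {key: win-sum} for (key, weight) pairs with distinct keys."""
    ordered = sorted(objects)  # ascending by key
    # prefix[i] = total weight of the i smallest objects
    prefix = [0] + list(accumulate(w for _, w in ordered))
    result = {}
    for rank, (key, _) in enumerate(ordered, start=1):
        lo = max(0, rank - l)  # assumed window: ranks lo+1 .. rank
        result[key] = prefix[rank] - prefix[lo]
    return result
```

For example, with objects `[(3, 10), (1, 5), (2, 7), (4, 1)]` and `l = 2`, the object with key 3 gets 7 + 10 = 17, and the object with key 1, which has no predecessor, gets 5.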

Page 17

sliding aggregation

Sorting with Perfect Balance
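The slide gives only the title of this step. One way to read it, stated here as an assumption rather than as the authors' exact procedure, is: after a range-partitioned sort, each machine learns how many objects sit on earlier machines, derives the global rank r of each of its objects, and ships the object to machine ⌈r/m⌉, where m is the per-machine target size. The function name `rebalance` is hypothetical.

```python
import math


def rebalance(sorted_parts):
    """Redistribute range-partitioned sorted lists into chunks of size ceil(n/t)."""
    t = len(sorted_parts)
    n = sum(len(p) for p in sorted_parts)
    m = math.ceil(n / t)  # assumed target size per machine
    balanced = [[] for _ in range(t)]
    offset = 0  # number of objects held by earlier machines
    for part in sorted_parts:
        for local_rank, o in enumerate(part, start=1):
            r = offset + local_rank  # global rank, 1-based
            balanced[(r - 1) // m].append(o)  # machine ceil(r / m), 0-based index
        offset += len(part)
    return balanced
```

With input `[[1, 2, 3, 4, 5], [6], [7, 8]]`, three machines and m = 3, the result is `[[1, 2, 3], [4, 5, 6], [7, 8]]`: every machine except possibly the last holds exactly m objects.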

Page 18

sliding aggregation

Sliding Aggregate Computation

Page 19

experiments-sorting

Page 20

experiments-sorting

Page 21

experiments-skyline

Page 22

The main contribution of this paper is to fill a gap with the concept of minimal MapReduce algorithms.

thx @hh's