StratoSphere – Above the Clouds
Stratosphere
Massively Parallel Analytics
Alexander Alexandrov, Stephan Ewen, Joseph Harjung, Fabian Hüske, Moritz Kaufmann, Aljoscha Krettek, Volker Markl, Kostas Tzoumas, Sebastian Schelter
Stratosphere – Parallel Analytics Beyond MapReduce
The Big Data Context
Large Quantities of Data
Diverse Data Structures
Complex Analysis Tasks
NoMapReduce
SQL NoSQL
SQL--
?
?
Question 1:
Is it faster to add a HiveQL parser and an HDFS adapter to your favorite parallel database, or develop a parallel engine from scratch?
Question 2:
Have we closed the circle (“we want SQL!”) or is there more in analytics?
scripting
SQL--
columnstore--
scalable parallel sort
a query plan
XQuery+/-
not a sorting problem!
Question 3:
How do we architect systems for the next wave of rich data analysis?
≠
10 commandments for Big Data Analytics
case class Vertex(id: Int, component: Int)
case class Edge(from: Int, to: Int)

val vertices = hdfsFile(…)
val edges = hdfsFile(…)

val result = step iterate (vertices distinctBy {_.id}, vertices)

def step = (s: Data[Vertex], ws: Data[Vertex]) => {
  // propagate each workset vertex's component id to its neighbors
  val allNeighbors = ws join edges on {_.id} isEqualTo {_.from} using {(v,e) => Vertex(e.to, v.component)}
  // keep the smallest candidate component id per vertex
  val minNeighbors = allNeighbors reduceBy {_.id} (minBy _.component)
  // retain only candidates that improve on the current solution
  val s1 = minNeighbors join s on {_.id} isEqualTo {_.id} using {(c,o) => if (c.component < o.component) Some(c) else None}
  (s1, s1)
}
(I) Thou shalt…
… use declarative languages!
Executive Summary
Connected components of a graph.
- Joins and aggregations on custom data types
- Incremental / Delta Iterations
- Mixture of operators and UDFs
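To make the delta-iteration idea concrete, here is a minimal single-machine sketch in plain Scala collections. It does not use the Stratosphere API: `ConnectedComponentsSketch`, `step`, and `run` are hypothetical names chosen for illustration, and the split into a solution set and a workset only mirrors the `s` and `ws` arguments of the slide code.

```scala
object ConnectedComponentsSketch {
  case class Vertex(id: Int, component: Int)
  case class Edge(from: Int, to: Int)

  // One superstep: send component ids from the workset along the edges,
  // keep the minimum candidate per vertex, and retain real improvements only.
  def step(solution: Map[Int, Vertex], workset: Seq[Vertex], edges: Seq[Edge]): Seq[Vertex] = {
    val edgesByFrom = edges.groupBy(_.from)
    val candidates = for {
      v <- workset
      e <- edgesByFrom.getOrElse(v.id, Seq.empty[Edge])
    } yield Vertex(e.to, v.component)
    val minPerVertex = candidates.groupBy(_.id).values.map(_.minBy(_.component)).toSeq
    minPerVertex.filter(c => solution.get(c.id).exists(old => c.component < old.component))
  }

  // Iterate until the workset (the delta of the previous superstep) is empty.
  def run(vertices: Seq[Vertex], edges: Seq[Edge]): Map[Int, Int] = {
    var solution = vertices.map(v => v.id -> v).toMap
    var workset = vertices
    while (workset.nonEmpty) {
      val delta = step(solution, workset, edges)
      solution = solution ++ delta.map(v => v.id -> v)
      workset = delta
    }
    solution.map { case (id, v) => id -> v.component }
  }

  def main(args: Array[String]): Unit = {
    val vertices = Seq(1, 2, 3, 4, 5).map(i => Vertex(i, i))
    // Undirected graph stored as edges in both directions: {1,2,3} and {4,5}
    val undirected = Seq(Edge(1, 2), Edge(2, 3), Edge(4, 5))
    val edges = undirected ++ undirected.map(e => Edge(e.to, e.from))
    println(run(vertices, edges).toSeq.sorted)
  }
}
```

Only vertices whose component changed in the previous superstep are fed into the next one, which is the point of the incremental variant.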
(II) Thou shalt…
… accept external (dynamic) sources!
“In situ” data, no load
(III) Thou shalt…
… use rich primitives! (beyond MapReduce)
Map
Reduce
Cross
Match
CoGroup
(IV) Thou shalt…
… define queries and UDFs in the same language!
UDF
Query definition
(V) Thou shalt…
… use an algebraic but rich data model!
Custom Object Oriented and Functional Data Types
Use functions as references to fields/attributes
(VI) Thou shalt…
… optimize! Auto-parallelization and optimization à la relational databases.
(VII) Thou shalt…
… not treat UDFs as black boxes!
Static code analysis of UDFs to determine field accesses and modifications
Vastly increases optimization potential
(VIII) Thou shalt…
… iterate/recurse!
Step function
Needed for most interesting analysis cases
(IX) Thou shalt…
… exploit dynamic computation!
[Chart: # vertices (thousands) per superstep, Naïve (Bulk) vs. Incremental]
Pregel as a Stratosphere plan with comparable performance.
(X) Thou shalt…
… use a scalable and efficient execution engine!
Pipeline and data parallelism, flexible checkpointing, optimized network data transfers
Conclusion
Write like a programming language
Execute like a database
Add a bit of "languages and compilers" sauce to the database stack…
Stratosphere Programming Stack
Layered approach – several entry points to the system
Nephele Dataflow Engine
Runtime Operators
SOPREMO Compiler
Meteor Script
Scala
Scala-Compiler Plugin
Stratosphere Optimizer
Nephele Parallel Dataflow
PACT Program
Pact program / Scala program
Scala compiler plug-in
Stratosphere optimizer: Picks data shipping and local strategies, operator order
Execution plan
Runtime: Hash- and sort-based out-of-core operator implementations, memory management
Nephele Execution Engine: Task scheduling, network data transfers, resource allocation, checkpointing
Job graph / Execution graph
PARALLEL PROGRAMMING MODEL
Part 1
Background: PACTs
D. Battré, S. Ewen, F. Hueske, O. Kao, V. Markl, D. Warneke: Nephele/PACTs: a programming model and execution framework for web-scale analytical processing
[Figure: Data → second-order function wrapping a first-order function (UDF) → Data]
Second-order functions: Map, Reduce, Cross, Match, CoGroup
Background: PACTs
■ Data flow operators (UDFs) are first-order functions
■ Application of UDFs to the data through second-order functions that define parallel semantics
■ Declarative, as execution strategies are not fixed
Example data flow:
Source 1: Extract (A,B)
Source 2: Extract (D,E)
Map: C := max(A,B)
Map: if (D>4) emit
Match (A = D): if (A>3) emit
Reduce (on A): sum(B), avg(C)
Sink 1
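The split between second-order functions and first-order UDFs can be sketched on local Scala collections. This is an illustration only, not the PACT API: `PactSketch`, `matchOp`, the record classes, and the wiring of `exampleFlow` (Source 1 → Map, Source 2 → Map, both into Match, then Reduce) are assumptions made for the example.

```scala
object PactSketch {
  // Second-order functions: each takes a first-order UDF and fixes how the
  // UDF is applied to the data (per record, per key group, per pair, ...).
  def map[A, B](in: Seq[A])(udf: A => Seq[B]): Seq[B] =
    in.flatMap(udf)

  def reduce[A, K, B](in: Seq[A])(key: A => K)(udf: (K, Seq[A]) => Seq[B]): Seq[B] =
    in.groupBy(key).toSeq.flatMap { case (k, group) => udf(k, group) }

  def cross[A, B, C](l: Seq[A], r: Seq[B])(udf: (A, B) => Seq[C]): Seq[C] =
    for { a <- l; b <- r; c <- udf(a, b) } yield c

  def matchOp[A, B, K, C](l: Seq[A], r: Seq[B])(lk: A => K, rk: B => K)(udf: (A, B) => Seq[C]): Seq[C] = {
    val rightByKey = r.groupBy(rk)
    for { a <- l; b <- rightByKey.getOrElse(lk(a), Seq.empty[B]); c <- udf(a, b) } yield c
  }

  def coGroup[A, B, K, C](l: Seq[A], r: Seq[B])(lk: A => K, rk: B => K)(udf: (K, Seq[A], Seq[B]) => Seq[C]): Seq[C] = {
    val lg = l.groupBy(lk)
    val rg = r.groupBy(rk)
    (lg.keySet ++ rg.keySet).toSeq.flatMap { k =>
      udf(k, lg.getOrElse(k, Seq.empty[A]), rg.getOrElse(k, Seq.empty[B]))
    }
  }

  case class R1(a: Int, b: Int)                   // Source 1: Extract (A,B)
  case class R2(d: Int, e: Int)                   // Source 2: Extract (D,E)
  case class R1c(a: Int, b: Int, c: Int)          // after Map: C := max(A,B)
  case class Agg(a: Int, sumB: Int, avgC: Double) // after Reduce (on A)

  // The example data flow from the slide, under the assumed wiring.
  def exampleFlow(src1: Seq[R1], src2: Seq[R2]): Seq[Agg] = {
    val withC    = map(src1)(r => Seq(R1c(r.a, r.b, math.max(r.a, r.b))))
    val filtered = map(src2)(r => if (r.d > 4) Seq(r) else Seq.empty[R2])
    val joined   = matchOp(withC, filtered)(_.a, _.d)((l, r) => if (l.a > 3) Seq(l) else Seq.empty[R1c])
    reduce(joined)(_.a) { (a, g) => Seq(Agg(a, g.map(_.b).sum, g.map(_.c).sum.toDouble / g.size)) }
  }
}
```

The UDFs only say what happens to a record, a pair, or a group; how the groups and pairs are formed (hash table here, possibly sorting or repartitioning in a parallel engine) stays inside the second-order function.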
Iterative Programs
S. Ewen, K. Tzoumas, M. Kaufmann, V. Markl: Spinning Fast Iterative Data Flows. PVLDB 5(11), 2012
[Figure: two iterative data flows]
Bulk Iteration (PageRank): "Join P and A" is a Match (on pid) over p (pid, r) and A (pid, tid, p) that emits (tid, k=r*p); "Sum up partial ranks" is a Reduce (on tid) that emits (pid=tid, r=∑ k)
Incremental Iteration (Connected Components): a Match over Edges (v1,v2) and Wi (vid,cid) emits (v2, cid); a CoGroup over [(vid,cid)] and Si (vid, cid) emits (vid, cid); the step produces Wi+1 and Di+1
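The bulk PageRank iteration in the figure can be sketched on local collections as follows. This is a minimal illustration, assuming the Match/Reduce reading given above; `PageRankBulkSketch`, `Rank`, and `Transition` are hypothetical names, and no damping factor is modelled because the slide does not show one.

```scala
object PageRankBulkSketch {
  case class Rank(pid: Int, r: Double)                 // p: (pid, r)
  case class Transition(pid: Int, tid: Int, p: Double) // A: (pid, tid, p)

  // One bulk step: the full rank vector is recomputed from scratch.
  def step(ranks: Seq[Rank], a: Seq[Transition]): Seq[Rank] = {
    val rankOf = ranks.map(x => x.pid -> x.r).toMap
    // Match (on pid): emit (tid, k = r * p)
    val partial = for {
      t <- a
      r <- rankOf.get(t.pid).toSeq
    } yield (t.tid, r * t.p)
    // Reduce (on tid): emit (pid = tid, r = sum of k)
    partial.groupBy(_._1).toSeq.map { case (tid, ks) => Rank(tid, ks.map(_._2).sum) }
  }

  // Bulk iteration: feed the complete result of one step into the next.
  def iterate(initial: Seq[Rank], a: Seq[Transition], n: Int): Seq[Rank] =
    (1 to n).foldLeft(initial)((p, _) => step(p, a))
}
```

In contrast to the incremental connected-components flow, every superstep here touches every rank, even if it no longer changes.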
How does it look in code?
val result = step iterate (vertices distinctBy {_.id}, messages)
def step = (s: Data[Vertex], ws: Data[Message]) => {
  val sNext = ws join s on {…} isEqualTo {…} using {…}
  val wNext = sNext join edges on …
  (sNext, wNext)
}
Java
Scala
Incremental Iterations matter…
[Chart: changes to the iteration's result for Connected Components in each superstep; x-axis: superstep (0 to 33), y-axis: # vertices (thousands); series: Naïve (Bulk), Incremental]
[Chart: … and runtime, Naïve (Bulk) vs. Incremental, for Twitter and Webbase (20)]
Pregel as a Pact program
THE PROGRAM COMPILER AND OPTIMIZER
Part 2
Why an Optimizer for such Programs?
Do you want to hand-optimize that?
Pact Optimizer Overview
■ Cost-based optimizer produces physical execution plan given PACT program
□ Annotates data channels with distribution patterns, e.g., broadcast, partition
□ Chooses physical execution strategies (e.g., hash/sort)
□ Reorders PACT functions
Deeply embeds MapReduce style UDFs in the optimization
■ Optimization of iterative programs
□ Passing data between super-steps
□ Loop-invariant data
□ Efficient state maintenance in partitioned indexes
■ Challenge: Semantics of user-defined functions unknown
Current architecture
1) Analyze
2) Reorder
3) Parallelize
1) Opening the Black Boxes …
Analyze user code to discover:
■ Read set Rf: Attributes of the input record(s) that might influence output
■ Write set Wf: Attributes of the output record(s) that might have different values from respective input attributes
■ Emit cardinality Ef: Bounds on records emitted per call (1, >1, …)
PACTf
(Rf,Wf,Ef)
… via Static Code Analysis
void match (Record left, Record right, Collector col) {
  Record out = copy(left);
  if (left.get(0) > 3) {
    double a = right.get(2);
    out.set(2, 1.0/a);
  }
  out.set(1, 42);
  out.set(3, right.get(0));
  out.set(4, right.get(1));
  out.set(5, right.get(2));
  col.emit(out);
}
Feasible:
1. No control flow between operators
2. Record data model, fixed API
Correct:
■ Difficulty comes from different code paths
■ Correctness guaranteed through conservatism
■ Add to R,W when in doubt
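A toy version of such a conservative analysis, written against a hand-built statement list rather than real byte code, might look as follows. Everything here is an assumption for illustration: the `Stmt` encoding, the names `UdfAnalysisSketch` and `analyze`, and the simplification that every `get` counts as a read and every `set` as a write (the sketch does not try to recognize plain field copies, so it may over-approximate what a real analysis would report).

```scala
object UdfAnalysisSketch {
  sealed trait Input
  case object LeftIn extends Input
  case object RightIn extends Input

  // A read such as left.get(0) or right.get(2)
  final case class Field(input: Input, pos: Int)

  sealed trait Stmt
  // out.set(pos, expr), where expr reads the given input fields (none for constants)
  final case class SetField(pos: Int, reads: Set[Field]) extends Stmt
  // if (cond) { body }, where cond reads the given input fields
  final case class If(condReads: Set[Field], body: List[Stmt]) extends Stmt

  final case class Sets(read: Set[Field], write: Set[Int])

  // Union over all code paths: when in doubt, add to R and W.
  def analyze(stmts: List[Stmt]): Sets =
    stmts.foldLeft(Sets(Set.empty, Set.empty)) {
      case (acc, SetField(pos, reads)) =>
        Sets(acc.read ++ reads, acc.write + pos)
      case (acc, If(condReads, body)) =>
        val inner = analyze(body) // assume the branch may be taken
        Sets(acc.read ++ condReads ++ inner.read, acc.write ++ inner.write)
    }

  // Hand-written encoding of the match UDF shown on the slide.
  val matchUdf: List[Stmt] = List(
    If(Set(Field(LeftIn, 0)), List(SetField(2, Set(Field(RightIn, 2))))),
    SetField(1, Set.empty),
    SetField(3, Set(Field(RightIn, 0))),
    SetField(4, Set(Field(RightIn, 1))),
    SetField(5, Set(Field(RightIn, 2)))
  )
}
```

Because the conditional write to position 2 is included regardless of whether the branch is taken, the result can only be too large, never too small, which is what keeps later reordering decisions safe.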
Conditions for reordering UDFs
Enabled optimizations:
■ Selection push-down
■ (Bushy) join reordering
■ Aggregation push-down
□ Equivalent to invariant grouping transformation [Chaudhuri & Shim 1994]
■ Reordering of non-relational Reduce functions
Theorem 1: Two Map operators can be reordered if their UDFs have only read-read conflicts
Theorem 2: For a Map and a Reduce, we need in addition the Reduce key groups to be preserved
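The two theorems can be read as a simple check on read and write sets. The sketch below is an illustration under assumptions: fields are plain integer positions, `ReorderSketch` and its method names are hypothetical, and key-group preservation is passed in as a flag because the slide does not say how it is established.

```scala
object ReorderSketch {
  final case class UdfProps(read: Set[Int], write: Set[Int])

  // "Only read-read conflicts": no field written by one UDF is read or
  // written by the other.
  def onlyReadReadConflicts(f: UdfProps, g: UdfProps): Boolean =
    (f.read intersect g.write).isEmpty &&
      (f.write intersect g.read).isEmpty &&
      (f.write intersect g.write).isEmpty

  // Theorem 1: two Map operators
  def canReorderMaps(f: UdfProps, g: UdfProps): Boolean =
    onlyReadReadConflicts(f, g)

  // Theorem 2: a Map and a Reduce; the caller states whether the Reduce
  // key groups are preserved (assumed to be known from elsewhere).
  def canReorderMapReduce(map: UdfProps, reduce: UdfProps, keyGroupsPreserved: Boolean): Boolean =
    onlyReadReadConflicts(map, reduce) && keyGroupsPreserved
}
```

With read and write sets obtained conservatively, a `true` answer is safe, while a `false` answer may simply mean the analysis could not rule out a conflict.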
Optimizer Architecture (I)
■ Simple enumeration algorithm that checks pairwise reordering for all neighboring operators
■ Current problem: Walking all points in the search space
■ Next: Deduce join-graph-like information from reordering degrees-of-freedom
Optimizer Architecture (II)
■ Operators are defined in terms of possible global data properties (partitioning/replication/...) and local data properties (order/grouping/uniqueness/...)
■ Nodes propagate requested properties top-down
□ Filtered by UDF's field modification
□ Filtered by incompatibility
□ Every data flow edge has a set of possible requested properties
■ Requested properties are instantiated at each point
□ Global properties by exchange strategies
□ Local properties by local operators
■ Requested properties used for pruning candidates (as with interesting properties)
Optimizer Architecture (III)
■ Determine static and dynamic data flow paths for iterations
□ Static path contains data that is loop-invariant
■ Use heuristics to place caches such that loop-invariant computations are not repeated
□ Cache loop-invariant data also in ordered form, or as hash tables
■ Weigh costs for static and dynamic path differently
□ Optimizer favors plans that "push" work into static path
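The effect of caching the static path can be sketched on a single machine. This is an illustration only, assuming a PageRank-like step in which the matrix is loop-invariant; `LoopInvariantCacheSketch`, `buildIndex`, and the `indexBuilds` counter are hypothetical and exist just to make the "built once" property visible.

```scala
object LoopInvariantCacheSketch {
  final case class Transition(pid: Int, tid: Int, p: Double)

  // Counts how often the hash table is built (for illustration only).
  var indexBuilds = 0

  // Static path: loop-invariant data, materialized as a hash table.
  def buildIndex(a: Seq[Transition]): Map[Int, Seq[Transition]] = {
    indexBuilds += 1
    a.groupBy(_.pid)
  }

  // Dynamic path: probes the cached hash table with the current ranks.
  def step(ranks: Map[Int, Double], index: Map[Int, Seq[Transition]]): Map[Int, Double] = {
    val partial = for {
      (pid, r) <- ranks.toSeq
      t <- index.getOrElse(pid, Seq.empty[Transition])
    } yield (t.tid, r * t.p)
    partial.groupBy(_._1).map { case (tid, ks) => tid -> ks.map(_._2).sum }
  }

  def run(initial: Map[Int, Double], a: Seq[Transition], n: Int): Map[Int, Double] = {
    val index = buildIndex(a) // executed once, before the loop
    (1 to n).foldLeft(initial)((ranks, _) => step(ranks, index))
  }
}
```

Work moved into `buildIndex` is paid once, while work left in `step` is paid in every superstep, which is why weighting the two paths differently makes sense.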
PageRank: Two Optimizer Plans
[Figure: two alternative execution plans for the PageRank iteration over p (pid, r) and A (pid, tid, p). Both plans contain "Join P and A", a Match (on pid) emitting (tid, k=r*p), and "Sum up partial ranks", a Reduce (on tid) emitting (pid=tid, r=∑ k). One plan uses a broadcast, the other uses partition (pid) on the join inputs. Both use buildHashTable (pid), probeHashTable (pid), a CACHE, and part./sort (tid). Further edge labels: fifo.]
THE FUNCTIONAL LANGUAGE COMPILATION
Part 3
The Compiler Mismatch
The Database Approach (Query Compiler): Parser/Checker → Optimizer → Code Generation → Runtime
Code Generation AFTER context of operation is fixed.
UDF Systems: MapReduce & Stratosphere (original) (Language Compiler): Parser/Checker → Code Generation → Optimizer → Runtime
Code Generation BEFORE context of operation is fixed.
The Program Compilation Pipeline
Program Code
Parser/Checker
ByteCode Generator
Analyzer and Code Generator
Global Schema Generator
Pact Optimizer
Program Instantiation
Schema and Code Finalization
Parallel Data Flow Generator
Parallel Data Flow
Language Compiler
1) Analyzing Data Types
■ Supported Types
□ Primitive (Integers, Floating-Point, Strings, …), Lists, Tuples, Product Types (classes), Summation Types (class hierarchies), Recursive Types
■ Data types are logically flattened
□ Some fields are transparent members of the flat model, some are black box members
■ Transparent members may be referenced in selector functions
■ Selector Functions are likewise analyzed and translated into logical positions
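Logical flattening and the translation of a selector into a logical position can be sketched with a small type description. This is a hypothetical model, not the compiler plug-in's data structures: `Ty` and its cases, `flatten`, and `logicalPosition` are invented names, and a dotted path string stands in for an analyzed selector function such as `{_.id}`.

```scala
object FlattenSketch {
  sealed trait Ty
  case object IntTy extends Ty
  case object StringTy extends Ty
  final case class Opaque(name: String) extends Ty                  // black box member
  final case class ProductTy(fields: List[(String, Ty)]) extends Ty // class or tuple

  // Flatten nested product types into (path, type) leaves, in declaration order.
  def flatten(ty: Ty, prefix: String = ""): List[(String, Ty)] = ty match {
    case ProductTy(fields) =>
      fields.flatMap { case (name, t) =>
        flatten(t, if (prefix.isEmpty) name else s"$prefix.$name")
      }
    case leaf => List(prefix -> leaf)
  }

  // Translate a selector path into a logical position of the flat model.
  // Only transparent members can be selected; black boxes yield None.
  def logicalPosition(ty: Ty, path: String): Option[Int] = {
    val leaves = flatten(ty)
    leaves.indexWhere(_._1 == path) match {
      case -1 => None
      case i =>
        leaves(i)._2 match {
          case Opaque(_) => None
          case _         => Some(i)
        }
    }
  }
}
```

Keeping positions logical at this stage leaves the exact physical layout open until the optimizer has run.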
2) Generating Glue Code
■ User Code is pure Scala, no Stratosphere specific types, interfaces
■ Wrapper code necessary to run it as a UDF in Stratosphere
■ Serializer/Comparator Code is generated as a template (omitting exact field positions, storing logical positions)
■ Code is inserted by modifying the program's Abstract-Syntax-Tree
3) Schema Generation
■ Schema generated from logical flattened model
■ Each field in every operator’s result gets a unique name
□ Unless exact copy of an input field (info from code analysis)
■ Run Stratosphere optimizer
□ Potentially reorders functions
■ Prune unused fields early
□ Information whether fields are accessed by UDF from code analysis
■ Create physical data layout
■ Finalize serializer / comparator code
Some preliminary results...
Related Work
■ MapReduce
■ Pig, JAQL, Hive
■ AQL
■ Scope
■ Datalog for Machine Learning
■ BOOM
■ Twister / HaLoop
■ Spark
■ Naiad
■ Flume Java / Plume Java
■ Scalops
■ Jet
■ LINQ