Deep Dive: Memory Management in Apache Spark

Andrew Or, June 8th, 2016, @andrewor14

Page 1: Deep Dive: Memory Management in Apache Spark

Deep Dive: Memory Management in Apache Spark

Andrew Or
June 8th, 2016
@andrewor14

Page 2: Deep Dive: Memory Management in Apache Spark

students.select("name").orderBy("age").cache().show()

Caching

Tungsten

Off-heap

Memory contention

Page 3: Deep Dive: Memory Management in Apache Spark

Efficient memory use is critical to good performance

Page 4: Deep Dive: Memory Management in Apache Spark
Page 5: Deep Dive: Memory Management in Apache Spark

Memory contention poses three challenges for Apache Spark

How to arbitrate memory between execution and storage?

How to arbitrate memory across tasks running in parallel?

How to arbitrate memory across operators running within the same task?

Page 6: Deep Dive: Memory Management in Apache Spark

Two usages of memory in Apache Spark

Execution: memory used for shuffles, joins, sorts and aggregations

Storage: memory used to cache data that will be reused later

Page 7: Deep Dive: Memory Management in Apache Spark

[Diagram: Iterator (4, 3, 5, 1, 6, 2) → Sort → Iterator (1, 2, 3, 4, 5, 6) → Take(3). The values buffered by Sort (4 3 5 1 6 2) occupy execution memory.]

What if I want the sorted values again?

Page 8: Deep Dive: Memory Management in Apache Spark

[Diagram: the Iterator (4, 3, 5, 1, 6, 2) → Sort → Iterator (1, 2, 3, 4, 5, 6) pipeline is run again from scratch for each of Take(3), Take(4), ...]

Page 9: Deep Dive: Memory Management in Apache Spark

[Diagram: Iterator (4, 3, 5, 1, 6, 2) → Sort → Iterator (1, 2, 3, 4, 5, 6) → Cache. The Sort buffer (4 3 5 1 6 2) occupies execution memory; the cached output (1 2 3 4 5 6) occupies storage memory and serves Take(3), Take(4), Take(5), ...]
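
The difference between re-sorting and caching can be sketched in plain Python. This is not Spark code: `SortPipeline` and `CachedSortPipeline` are hypothetical names, the sort buffer merely stands in for execution memory, and the kept result stands in for storage memory.

```python
from itertools import islice


class SortPipeline:
    """Sorts its input on every run; the temporary buffer models execution memory."""

    def __init__(self, source):
        self.source = source
        self.sort_runs = 0  # how many times the full sort was executed

    def run(self):
        self.sort_runs += 1
        buffer = sorted(self.source)  # needed only while the sort is running
        return iter(buffer)


class CachedSortPipeline(SortPipeline):
    """Keeps the sorted output after the first run; the kept list models storage memory."""

    def __init__(self, source):
        super().__init__(source)
        self._cache = None

    def take(self, n):
        if self._cache is None:
            self._cache = list(self.run())
        return self._cache[:n]


data = [4, 3, 5, 1, 6, 2]

# Without caching, every Take(n) re-runs the sort.
uncached = SortPipeline(data)
uncached_results = [list(islice(uncached.run(), n)) for n in (3, 4, 5)]

# With caching, the sort runs once and later Take(n) calls reuse the result.
cached = CachedSortPipeline(data)
cached_results = [cached.take(n) for n in (3, 4, 5)]
```

Here `uncached.sort_runs` ends at 3 while `cached.sort_runs` stays at 1, at the price of holding the sorted list in memory between calls.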

Page 10: Deep Dive: Memory Management in Apache Spark

Challenge #1: How to arbitrate memory between execution and storage?

Page 11: Deep Dive: Memory Management in Apache Spark

Easy, static assignment!

Total available memory

Execution Storage

Spark 1.0

May 2014

Page 12: Deep Dive: Memory Management in Apache Spark

Easy, static assignment!

Execution Storage

Spill to disk

Spark 1.0

May 2014

Page 13: Deep Dive: Memory Management in Apache Spark

Easy, static assignment!

Execution Storage

Spark 1.0

May 2014

Page 14: Deep Dive: Memory Management in Apache Spark

Easy, static assignment!

Execution Storage

Evict LRU block to disk

Spark 1.0

May 2014

Page 15: Deep Dive: Memory Management in Apache Spark

Inefficient memory use leads to bad performance

Page 16: Deep Dive: Memory Management in Apache Spark

Easy, static assignment!

Execution can only use a fraction of the memory, even when there is no storage!

Execution Storage

Spark 1.0

May 2014

Page 17: Deep Dive: Memory Management in Apache Spark

Storage

Easy, static assignment!

Efficient use of memory required user tuning

Execution

Spark 1.0

May 2014
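
The drawback of the static split can be illustrated with a minimal sketch. `StaticMemoryManager`, the 100-unit total and the 50/50 fraction are all hypothetical values chosen for illustration, not Spark's classes or defaults.

```python
class StaticMemoryManager:
    """Fixed split: each region has its own cap and cannot borrow from the other."""

    def __init__(self, total, execution_fraction):
        self.execution_cap = int(total * execution_fraction)
        self.storage_cap = total - self.execution_cap
        self.execution_used = 0
        self.storage_used = 0

    def acquire_execution(self, amount):
        """Return (granted, spilled); anything above the execution cap spills to disk."""
        free = self.execution_cap - self.execution_used
        granted = min(amount, free)
        self.execution_used += granted
        return granted, amount - granted


manager = StaticMemoryManager(total=100, execution_fraction=0.5)

# Nothing is cached, yet execution still cannot touch the storage half.
granted, spilled = manager.acquire_execution(80)
```

With these numbers the request for 80 units is granted 50 and spills 30, even though the 50 units reserved for storage sit idle. Avoiding that waste means picking the fraction per workload, which is the user tuning the slide refers to.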

Page 18: Deep Dive: Memory Management in Apache Spark

Fast forward to 2016…

How could we have done better?

Page 19: Deep Dive: Memory Management in Apache Spark

Execution Storage

Page 20: Deep Dive: Memory Management in Apache Spark

Unified memory management

Spark 1.6+

Jan 2016

What happens if there is already storage?

Execution Storage

Page 21: Deep Dive: Memory Management in Apache Spark

Unified memory management

Spark 1.6+

Jan 2016

Evict LRU block to disk

Execution Storage

Page 22: Deep Dive: Memory Management in Apache Spark

Unified memory management

Spark 1.6+

Jan 2016

What about the other way round?

Execution Storage

Page 23: Deep Dive: Memory Management in Apache Spark

Unified memory management

Spark 1.6+

Jan 2016

Evict LRU block to disk

Execution Storage

Page 24: Deep Dive: Memory Management in Apache Spark

Design considerations

Why evict storage, not execution?
Spilled execution data will always be read back from disk, whereas cached data may not.

What if the application relies on caching?
Allow the user to specify a minimum unevictable amount of cached data (not a reservation!).

Spark 1.6+

Jan 2016
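
The rules above can be sketched as a small simulation. This is a simplified model under stated assumptions, not Spark's implementation: `UnifiedMemoryManager`, `min_storage` and the block sizes are hypothetical, and insertion order stands in for LRU order because the sketch never re-reads a block.

```python
from collections import OrderedDict


class UnifiedMemoryManager:
    """One shared pool; execution may evict cached blocks, storage never evicts execution."""

    def __init__(self, total, min_storage):
        self.total = total
        self.min_storage = min_storage  # cached data execution may not evict below
        self.execution_used = 0
        self.blocks = OrderedDict()     # block id -> size, least recently used first
        self.evicted = []               # block ids written out to disk, in order

    @property
    def storage_used(self):
        return sum(self.blocks.values())

    @property
    def free(self):
        return self.total - self.execution_used - self.storage_used

    def _evict_lru(self):
        block_id, _ = self.blocks.popitem(last=False)
        self.evicted.append(block_id)

    def acquire_execution(self, amount):
        """Grant memory, evicting LRU blocks while storage stays at or above min_storage."""
        while self.free < amount and self.blocks:
            lru_size = next(iter(self.blocks.values()))
            if self.storage_used - lru_size < self.min_storage:
                break  # the remaining cached data is protected
            self._evict_lru()
        granted = min(amount, self.free)
        self.execution_used += granted
        return granted  # the caller spills whatever was not granted

    def cache_block(self, block_id, size):
        """Cache a block, evicting only other cached blocks to make room."""
        if size > self.total - self.execution_used:
            return False  # would require evicting execution memory
        while self.free < size:
            self._evict_lru()
        self.blocks[block_id] = size
        return True


pool = UnifiedMemoryManager(total=100, min_storage=30)
pool.cache_block("a", 40)
pool.cache_block("b", 40)

first_grant = pool.acquire_execution(50)   # evicts "a" to make room
second_grant = pool.acquire_execution(30)  # "b" is protected, so only 10 is granted
cached_c = pool.cache_block("c", 40)       # storage evicts its own block "b"
```

Because `min_storage` is a floor rather than a reservation, execution can use the whole pool when nothing is cached, which is the case the static split handled badly.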

Page 25: Deep Dive: Memory Management in Apache Spark

Challenge #2: How to arbitrate memory across tasks running in parallel?

Page 26: Deep Dive: Memory Management in Apache Spark

Easy, static assignment!

Worker machine has 4 cores

Each task gets 1/4 of the total memory

Slot 1 Slot 2 Slot 3 Slot 4

Page 27: Deep Dive: Memory Management in Apache Spark

Alternative: Dynamic assignment

The share of each task depends on number of actively running tasks (N)

Task 1

Page 28: Deep Dive: Memory Management in Apache Spark

Alternative: Dynamic assignment

Now, another task comes along, so the first task will have to spill

Task 1

Page 29: Deep Dive: Memory Management in Apache Spark

Alternative: Dynamic assignment

Each task is now assigned 1/N of the memory, where N = 2

Task 1 Task 2

Page 30: Deep Dive: Memory Management in Apache Spark

Alternative: Dynamic assignment

Each task is now assigned 1/N of the memory, where N = 4

Task 1 Task 2 Task 3 Task 4

Page 31: Deep Dive: Memory Management in Apache Spark

Alternative: Dynamic assignment

Last remaining task gets all the memory because N = 1

Task 3

Spark 1.0+

May 2014

Page 32: Deep Dive: Memory Management in Apache Spark

Static vs dynamic assignment

Both are fair and starvation free

Static assignment is simpler

Dynamic assignment handles stragglers better
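
Dynamic assignment as described on these slides can be sketched as follows. `TaskMemoryPool` is a hypothetical name, and forcing the spill at the moment a new task registers is a simplification made for this sketch.

```python
class TaskMemoryPool:
    """Each of the N actively running tasks is capped at total / N."""

    def __init__(self, total):
        self.total = total
        self.held = {}  # task id -> amount currently held

    def share(self):
        return self.total // max(len(self.held), 1)

    def start_task(self, task_id):
        """Register a task; tasks holding more than the new 1/N share spill the excess."""
        self.held[task_id] = 0
        cap = self.share()
        spilled = {}
        for other, used in self.held.items():
            if used > cap:
                spilled[other] = used - cap
                self.held[other] = cap
        return spilled

    def acquire(self, task_id, amount):
        """Grant up to the task's 1/N share, limited by what is actually free."""
        cap = self.share()
        free = self.total - sum(self.held.values())
        granted = max(0, min(amount, cap - self.held[task_id], free))
        self.held[task_id] += granted
        return granted

    def finish_task(self, task_id):
        del self.held[task_id]


pool = TaskMemoryPool(total=120)
pool.start_task(1)
alone = pool.acquire(1, 120)           # N = 1: the only task may take everything
spilled = pool.start_task(2)           # N = 2: task 1 spills down to its half
second = pool.acquire(2, 100)          # task 2 is capped at 1/2 of the memory
pool.finish_task(1)
after_finish = pool.acquire(2, 100)    # N = 1 again: task 2 grows into the freed memory
```

The last step is why dynamic assignment handles stragglers better: a remaining task is not stuck with the slot size it started with.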

Page 33: Deep Dive: Memory Management in Apache Spark

Challenge #3: How to arbitrate memory across operators running within the same task?

Page 34: Deep Dive: Memory Management in Apache Spark

SELECT age, avg(height)
FROM students
GROUP BY age
ORDER BY avg(height)

students
  .groupBy("age")
  .avg("height")
  .orderBy("avg(height)")
  .collect()

Scan

Project

Aggregate

Sort

Page 35: Deep Dive: Memory Management in Apache Spark

Worker has 6 pages of memory

Scan

Project

Aggregate

Sort

Page 36: Deep Dive: Memory Management in Apache Spark

Scan

Project

Aggregate

Sort

Map { // age → heights
  20 → [154, 174, 175]
  21 → [167, 168, 181]
  22 → [155, 166, 188]
  23 → [160, 168, 178, 183]
}

Page 37: Deep Dive: Memory Management in Apache Spark

Scan

Project

Aggregate

Sort

All 6 pages were used by Aggregate, leaving no memory for Sort!

Page 38: Deep Dive: Memory Management in Apache Spark

Solution #1: Reserve a page for each operator

Scan

Project

Aggregate

Sort

Page 39: Deep Dive: Memory Management in Apache Spark

Solution #1: Reserve a page for each operator

Scan

Project

Aggregate

Sort

Starvation free, but still not fair… What if there were more operators?

Page 40: Deep Dive: Memory Management in Apache Spark

Solution #2: Cooperative spilling

Scan

Project

Aggregate

Sort

Page 41: Deep Dive: Memory Management in Apache Spark

Scan

Project

Aggregate

Sort

Solution #2: Cooperative spilling

Page 42: Deep Dive: Memory Management in Apache Spark

Scan

Project

Aggregate

Sort

Solution #2: Cooperative spilling

Sort forces Aggregate to spill a page to free memory

Page 43: Deep Dive: Memory Management in Apache Spark

Scan

Project

Aggregate

Sort

Solution #2: Cooperative spilling

Sort needs more memory so it forces Aggregate to spill another page (and so on)

Page 44: Deep Dive: Memory Management in Apache Spark

Scan

Project

Aggregate

Sort

Solution #2: Cooperative spilling

Sort finishes with 3 pages

Aggregate does not have to spill its remaining pages

Spark 1.6+

Jan 2016
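
The cooperative spilling walk-through (6 pages, Aggregate fills them, Sort then needs 3) can be replayed with a short sketch. `Operator` and `TaskMemory` are hypothetical names, and the sketch only lets a requester force other operators to spill, which is a simplification.

```python
class Operator:
    def __init__(self, name):
        self.name = name
        self.pages = 0    # pages currently held in memory
        self.spilled = 0  # pages written to disk so far

    def spill_one_page(self):
        """Write one page to disk and release it; returns the number of pages freed."""
        if self.pages == 0:
            return 0
        self.pages -= 1
        self.spilled += 1
        return 1


class TaskMemory:
    def __init__(self, total_pages):
        self.free = total_pages
        self.operators = []

    def register(self, operator):
        self.operators.append(operator)
        return operator

    def acquire_page(self, requester):
        """Grant one page, forcing another operator to spill when none is free."""
        if self.free == 0:
            for other in self.operators:
                if other is not requester and other.spill_one_page():
                    self.free += 1
                    break
        if self.free == 0:
            return False
        self.free -= 1
        requester.pages += 1
        return True


memory = TaskMemory(total_pages=6)
aggregate = memory.register(Operator("Aggregate"))
sort = memory.register(Operator("Sort"))

for _ in range(6):  # Aggregate uses all 6 pages
    memory.acquire_page(aggregate)
for _ in range(3):  # each Sort request forces Aggregate to spill one page
    memory.acquire_page(sort)
```

The run ends with Sort holding 3 pages and Aggregate keeping its other 3 in memory, matching the final slide: pages are spilled only as they are actually needed.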

Page 45: Deep Dive: Memory Management in Apache Spark

Recap: Three sources of contention

How to arbitrate memory …

● between execution and storage?
● across tasks running in parallel?
● across operators running within the same task?

Instead of statically reserving memory in advance, deal with memory contention when it arises by forcing members to spill

Page 46: Deep Dive: Memory Management in Apache Spark

Project Tungsten

Binary in-memory data representation

Cache-aware computation

Code generation (next time)

Spark 1.4+

Jun 2015

Page 47: Deep Dive: Memory Management in Apache Spark

Java objects have large overheads

“abcd”

• Native: 4 bytes with UTF-8 encoding
• Java: 48 bytes
  – 12 byte header
  – 2 bytes per character (UTF-16 internal representation)
  – 20 bytes of additional overhead
  – 8 byte hash code
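
The arithmetic behind the 48-byte figure can be checked quickly. The per-component sizes below are the ones quoted on the slide, not measurements of any particular JVM.

```python
text = "abcd"

# Native representation: just the UTF-8 bytes.
native_bytes = len(text.encode("utf-8"))

# Java representation, using the breakdown quoted on the slide.
java_breakdown = {
    "object header": 12,
    "characters (UTF-16, 2 bytes each)": 2 * len(text),
    "additional overhead": 20,
    "hash code": 8,
}
java_bytes = sum(java_breakdown.values())

overhead_factor = java_bytes / native_bytes
```

Under these figures the Java object is 12 times the size of the raw bytes, and only 8 of its 48 bytes are character data.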

Page 48: Deep Dive: Memory Management in Apache Spark

Java objects based row format

Schema: (Int, String, String)

[Diagram: Row → Array → BoxedInteger(123), String(“data”), String(“bricks”)]

5+ objects, high space overhead, expensive hashCode()

Page 49: Deep Dive: Memory Management in Apache Spark

Tungsten row format

(123, “data”, “bricks”)

[Diagram: 0x0 | 123 | 32L | 48L | 4 “data” | 6 “bricks”]

0x0: null tracking bitmap
32L: offset to var. length data (“data”)
48L: offset to var. length data (“bricks”)
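
A binary row in the spirit of this slide can be sketched with Python's `struct` module. This follows the slide's simplified picture (an 8-byte null bitmap, one 8-byte slot per field, then length-prefixed variable-length data padded to 8 bytes); the helper names are hypothetical and Spark's actual encoding differs in details.

```python
import struct

WORD = 8  # every fixed-width slot is 8 bytes


def _pad_to_word(data):
    """Pad variable-length bytes with zeros up to a multiple of 8."""
    return data + b"\x00" * (-len(data) % WORD)


def encode_row(values):
    """Encode ints, strings and None into one contiguous byte string (up to 64 fields)."""
    fixed_size = WORD + WORD * len(values)  # null bitmap + one slot per field
    null_bits = 0
    slots = []
    variable = b""
    for i, value in enumerate(values):
        if value is None:
            null_bits |= 1 << i
            slots.append(0)
        elif isinstance(value, int):
            slots.append(value)                       # fixed-width value stored inline
        else:
            data = value.encode("utf-8")
            slots.append(fixed_size + len(variable))  # offset to var. length data
            variable += struct.pack("<q", len(data)) + _pad_to_word(data)
    header = struct.pack("<Q", null_bits) + struct.pack("<%dq" % len(slots), *slots)
    return header + variable


def read_slot(row, index):
    """Read the 8-byte slot of a field: an inline value or an offset."""
    return struct.unpack_from("<q", row, WORD + WORD * index)[0]


def read_string(row, index):
    """Follow a field's offset and decode the length-prefixed string stored there."""
    offset = read_slot(row, index)
    (length,) = struct.unpack_from("<q", row, offset)
    start = offset + WORD
    return row[start:start + length].decode("utf-8")


row = encode_row((123, "data", "bricks"))
```

For `(123, "data", "bricks")` this yields a single 64-byte buffer whose two string slots hold the offsets 32 and 48, the same values shown in the diagram. One buffer replaces the 5+ objects of the Java row, and rows can be compared or hashed as raw bytes.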

Page 50: Deep Dive: Memory Management in Apache Spark

Cache-aware computation

E.g. sorting a list of records

Naive layout (poor cache locality): ptr → key, rec

Cache-aware layout (good cache locality): ptr + key prefix → rec
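
The idea of the cache-aware layout can be sketched as a sort over (key prefix, pointer) entries that dereferences the full record only when two prefixes tie. Python cannot show the cache effect itself, so the sketch counts dereferences instead; `PREFIX_LEN` and the function name are hypothetical.

```python
from functools import cmp_to_key

PREFIX_LEN = 4  # bytes of the key stored next to the pointer


def sort_with_key_prefix(records):
    """Sort (key, payload) records by key; returns (sorted records, dereference count)."""
    # Each sort entry holds a key prefix and a "pointer" (here: an index into records).
    entries = [(key.encode("utf-8")[:PREFIX_LEN], i) for i, (key, _) in enumerate(records)]
    dereferences = 0

    def compare(a, b):
        nonlocal dereferences
        if a[0] != b[0]:
            # Decided from the entries alone, without touching the records.
            return -1 if a[0] < b[0] else 1
        # Prefixes tie: follow both pointers and compare the full keys.
        dereferences += 1
        key_a = records[a[1]][0].encode("utf-8")
        key_b = records[b[1]][0].encode("utf-8")
        return (key_a > key_b) - (key_a < key_b)

    entries.sort(key=cmp_to_key(compare))
    return [records[i] for _, i in entries], dereferences


records = [("delta", 1), ("alpha", 2), ("alphabet", 3), ("beta", 4)]
sorted_records, dereferences = sort_with_key_prefix(records)
```

Only the `alpha` / `alphabet` pair shares a 4-byte prefix here, so every other comparison is settled from the compact entry array, which is the access pattern that gives the cache-aware layout its locality.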

Page 51: Deep Dive: Memory Management in Apache Spark

Off-heap memory

Available for execution since Apache Spark 1.6

Available for storage since Apache Spark 2.0

Very important for large heaps

Many potential advantages: memory sharing, zero copy I/O, dynamic allocation

Page 52: Deep Dive: Memory Management in Apache Spark

For more info...

Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal
https://www.youtube.com/watch?v=5ajs8EIPWGI

Spark Performance: What’s Next
https://www.youtube.com/watch?v=JX0CdOTWYX4

Unified Memory Management
https://issues.apache.org/jira/browse/SPARK-10000

Page 53: Deep Dive: Memory Management in Apache Spark

Databricks Community Edition

Free version of cloud based platform in beta

More than 8,000 users registered

Users created over 61,000 notebooks in different languages

http://www.databricks.com/try

Page 54: Deep Dive: Memory Management in Apache Spark

Thank you!

@andrewor14