40
MULTICORE IN DATA APPLIANCES Gustavo Alonso Systems Group Dept. of Computer Science ETH Zürich, Switzerland SwissBox – CREST Workshop– March 2012

MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

MULTICORE IN DATA APPLIANCES

Gustavo Alonso Systems Group

Dept. of Computer Science ETH Zürich, Switzerland

SwissBox – CREST Workshop– March 2012

Page 2: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Systems Group = www.systems.ethz.ch Enterprise Computing Center = www.ecc.ethz.ch

2 Gustavo Alonso - Systems Group - ETH Zürich

Page 3: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

The SwissBox project

Build an open source data appliance

• Hardware

• Software

3 Gustavo Alonso - Systems Group - ETH Zürich

Page 4: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Goals

Robust by design

Scalable by design

Fully predictable

Behavior immune to peak loads (read or write)

Efficient use of modern hardware

• Cost efficiency

• Power/space/management complexity

Gustavo Alonso - Systems Group - ETH Zürich 4

Page 5: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

DATA APPLIANCE

Database in a box • Funny database • Funny box

Examples: • Exascale (Oracle) • Twin-Fin (Netezza – IBM) • NewDB (SAP) • Teradata

5 Gustavo Alonso - Systems Group - ETH Zürich

Page 6: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

• Intelligent storage manager • Massive caching • RAC based architecture • Fast network interconnect

ORACLE EXADATA

6 Gustavo Alonso - Systems Group - ETH Zürich

Page 7: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

The Multicore Challenge

7 Gustavo Alonso - Systems Group - ETH Zürich

Page 8: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Data parallelism

Relational model is highly parallel

• Independent tables

• Orthogonal operators

• Intra- and Inter-query parallelism

• Most successful commercial parallel systems

And yet …

Gustavo Alonso - Systems Group - ETH Zürich 8

Page 9: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Database engines and multicore

Gustavo Alonso - Systems Group - ETH Zürich 9

0

200

400

600

800

1000

1200

1400

1600

1800

0 50 100 150 200 250

Th

rou

gh

pu

t (T

PS

)

TPCW-S Clients

Postgres TPC-WB 20GB DB

PG 48 PG-24 PG 8

8 cores

48 cores

24 cores

Salomie, Subasu, Giceva, Alonso, EuroSys 2011

Page 10: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Database engines and multicore

Gustavo Alonso - Systems Group - ETH Zürich 10

0

100

200

300

400

500

600

700

800

0 50 100 150 200 250 300 350

Th

oru

gh

pu

t(T

PS

)

Clients

MySQL TPC-WB 20 GB DB

MYSQL-48 MYSQL-24 MYSQL-12

8 cores

48 cores

24 cores

Salomie, Subasu, Giceva, Alonso, EuroSys 2011

Page 11: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

CLAIM #1

Adding resources to a troubled application does not necessarily lead to improvements

Gustavo Alonso - Systems Group - ETH Zürich 11

Page 12: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Size matters

The challenge of appliances is the unprecedented power available

• 64 cores AMD,256 GB memory, 10 Gb network + 3 TB NAS: ~14k CHF

• Imagine a rack full of those

Is your job large enough?

Gustavo Alonso - Systems Group - ETH Zürich 12

Page 13: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

CLAIM #2

Large scale parallelism sensible only when looking at aggregated

loads

Gustavo Alonso - Systems Group - ETH Zürich 13

Page 14: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Load interaction

In a highly parallel system, multiple jobs will get on each other’s way:

• Synchronization

• Data movement

• Resource arbitrage

• Resource capping

• Heavy vs. light jobs

• Management and coordination

Gustavo Alonso - Systems Group - ETH Zürich 14

Page 15: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Load interaction in practice

Gustavo Alonso - Systems Group - ETH Zürich 15

System X

Gia

nn

ikis

, Alo

no

, Ko

ssm

ann

, PV

LD

B 2

012

Page 16: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

CLAIM #3

Robustness and performance can only be obtained by minimizing interaction

Gustavo Alonso - Systems Group - ETH Zürich 16

Page 17: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Locality in the XXIst century

Gustavo Alonso - Systems Group - ETH Zürich 17

P1

P0

P2

P3

P4

P6

P5

P7

Each die has: • 6 cores • 4HT ports • 2 memory channels

Each package has: • 12 cores • 4HT ports • 4 memory channels

Page 18: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

CLAIM #4

Robustness and definitely performance can only be achieved on fixed data paths

Gustavo Alonso - Systems Group - ETH Zürich 18

Page 19: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Architecture

Gustavo Alonso - Systems Group - ETH Zürich 19

RACK RACK

RACK RACK

RACK

multicore

Parallel hardware accelerator

other data centers

Page 20: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

CLAIM #5

Traditional layered architectures (HW/OS/VM/App) do not work in these

environments

Gustavo Alonso - Systems Group - ETH Zürich 20

Page 21: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

CLAIM #6

Strong notions of consistency (serializability) and atomicity of complex programs no

longer feasible

Gustavo Alonso - Systems Group - ETH Zürich 21

Page 22: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

SWISSBOX

22 Gustavo Alonso - Systems Group - ETH Zürich

Alonso, Kossmann, Roscoe, CIDR 2011

Page 23: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

SwissBox: the project

Great opportunity for research

• Rethink the entire system software stack

• Redesign the operating system, database, and storage system architecture

• Software – hardware co-design

23 Gustavo Alonso - Systems Group - ETH Zürich

Page 24: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

SwissBox: the product

Direct collaboration and input from industry

Great demand for tailored systems

• Big data

• Highly demanding applications

• Low power / high efficiency

24 Gustavo Alonso - Systems Group - ETH Zürich

Page 25: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Claim # 1 => Deterministic behavior

Adding resources to an application does not necessarily lead to a performance

improvement

System performance completely determined at design time through simple parameters

Gustavo Alonso - Systems Group - ETH Zürich 25

Page 26: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Clock Scan

READ CURSOR

WRITE CURSOR DATA IN

CIRCULAR BUFFER

(WIDE TABLE)

BUILD QUERY INDEX FOR NEXT SCAN QUERIES

UPDATES

26 Gustavo Alonso - Systems Group - ETH Zürich

Unterbrunner, Giannikis, Alonso, Kossmann, PVLDB 2009

Page 27: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Claim # 2 => Batch processing

Large scale parallelism makes sense only when considering aggregated loads

Execution proceeds in batches (1000’s of queries per batch)

Gustavo Alonso - Systems Group - ETH Zürich 27

Page 28: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Shared join

Crescando runs selection and projections in one set of cores

SharedDB runs joins on the streams from Crescando, thousands of queries at a time

28 Gustavo Alonso - Systems Group - ETH Zürich

Gia

nn

ikis

, Alo

no

, Ko

ssm

ann

, PV

LD

B 2

012

Page 29: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Claim # 3 => No load interaction

Robustness and performance can only be obtained by minimizing interaction

Operators are orthogonal and work on clearly delimited resources

Gustavo Alonso - Systems Group - ETH Zürich 29

Page 30: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Predictability, robustness

30 Gustavo Alonso - Systems Group - ETH Zürich

Gia

nn

ikis

, Alo

no

, Ko

ssm

ann

, PV

LD

B 2

012

Page 31: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Claim # 4 => No dynamic scheduling

Robustness (and definitely performance) can only be achieved on fixed data paths

No dynamic scheduling, operators are always on and at fixed locations

Gustavo Alonso - Systems Group - ETH Zürich 31

Page 32: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Single plan: operator per core

32 Gustavo Alonso - Systems Group - ETH Zürich

Page 33: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Raw performance

33 Gustavo Alonso - Systems Group - ETH Zürich

Gia

nn

ikis

, Alo

no

, Ko

ssm

ann

, PV

LD

B 2

012

Page 34: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Claim # 5 => Open stack

Traditional layered architectures (HW/OS/VM/App) do not work in these

environments

Operating System / Database co-design

Gustavo Alonso - Systems Group - ETH Zürich 34

Page 35: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Claim # 6 => Consistency

Strong notions of consistency (serializability) and atomicity of complex programs no

longer feasible

Snapshot isolation, multiversions, eventual consistency

Gustavo Alonso - Systems Group - ETH Zürich 35

Page 36: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

SW

ISS

BO

X

36 Gustavo Alonso - Systems Group - ETH Zürich

Page 37: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Where are we?

Fully predictable performance

• Accurate analytical model

• Easily tunable / scalable

Tolerates high peaks of reads and updates without compromising SLA

Intelligent storage engine

37 Gustavo Alonso - Systems Group - ETH Zürich

Page 38: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

Next steps

Hardware acceleration

Query optimizer

Parallel operators

OS / DB interfaces

Gustavo Alonso - Systems Group - ETH Zürich 38

Page 39: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

In the future

Virtualized operators

Flexible, elastic deployment through OS interaction

Scalability through operator/plan replication across cores and machines

Hardware acceleration by operator offloading and in-network data processing

Operator parallelism

39 Gustavo Alonso - Systems Group - ETH Zürich

Page 40: MULTICORE IN DATA APPLIANCES - crest.cs.ucl.ac.ukcrest.cs.ucl.ac.uk/cow/18/slides/COW18_Alonso.pdf · Behavior immune to peak loads (read or write) ... DATA APPLIANCE Database in

SwissBox in a nutshell

A new way to process data

• Parallel, predictable by design

• Not optimal but good enough

• Co-design at all levels

Great opportunity for research

• Redo everything from scratch

40 Gustavo Alonso - Systems Group - ETH Zürich