© Stavros Harizopoulos 2006

Performance Tradeoffs in Read-Optimized Databases

Stavros Harizopoulos, MIT CSAIL

Joint work with: Velen Liang, Daniel Abadi, and Sam Madden

Massachusetts Institute of Technology


Read-optimized databases

• Column stores (Sybase IQ, MonetDB, CStore): each attribute stored separately, e.g. (1, 2, …), (Joe, Sue, …), (45, 37, …)
• Row stores (SQL Server, DB2, Oracle): whole tuples stored together, e.g. (1, Joe, 45), (2, Sue, 37), …
• Read optimizations: materialized views, multiple indices, compression

How does column-orientation affect performance?

Rows vs. columns

• Row data: tuples (1, Joe, 45), (2, Sue, 37), … in a single file; a scan projects the requested attributes, e.g. (Joe, 45)
• Column data: attributes in 3 files, (1, 2, …), (Joe, Sue, …), (45, 37, …); a scan seeks between files and reconstructs the tuple, e.g. (Joe, 45)

Study performance tradeoffs solely in data storage
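The two layouts on this slide can be sketched in a few lines. This is only an illustration, not the storage manager from the talk: in-memory lists stand in for the on-disk files, and the names `project_rows` and `reconstruct_columns` are hypothetical.

```python
# Illustrative only: in-memory lists stand in for the on-disk files.
rows = [(1, "Joe", 45), (2, "Sue", 37)]  # row data: one "file" of whole tuples

# Column data: one "file" per attribute, values kept in the same order.
columns = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "age": [r[2] for r in rows],
}


def project_rows(rows, indexes):
    """Row store: read every full tuple, then keep only the requested fields."""
    return [tuple(r[i] for i in indexes) for r in rows]


def reconstruct_columns(columns, names):
    """Column store: read only the requested files and stitch values by position."""
    return list(zip(*(columns[n] for n in names)))


row_result = project_rows(rows, [1, 2])
col_result = reconstruct_columns(columns, ["name", "age"])
```

Both calls return the same (name, age) tuples; the difference is that the row version touches every byte of every tuple, while the column version touches only two of the three files.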

Performance study

• Methodology
– Built storage manager from scratch
– Sequential scans
– Analyze CPU, disk, memory

• Findings
– Columns are generally more I/O efficient
– Competing traffic favors columns
– Conditions where columns are CPU-constrained
– Conditions where rows are MemBW-constrained

Talk outline

• System architecture

• Workload and Experiments

• Analysis

• Conclusions

System architecture

• Block-iterator operators
– Single-threaded, C++, Linux AIO

• No buffer pool
– Use filesystem, bypass OS cache

• Compression

• Dense-pack
– [Figure: pages 60% full vs. 100% full]
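As a rough picture of what a block-iterator operator looks like, the sketch below passes blocks of tuples (rather than single tuples) between operators. It is written in Python for brevity, whereas the system in the talk is C++; the operator names `scan` and `select`, the block size, and the third row are assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Row = Tuple
Block = List[Row]


def scan(table: Iterable[Row], block_size: int) -> Iterator[Block]:
    """Leaf operator: hand tuples upward one block at a time."""
    block: Block = []
    for row in table:
        block.append(row)
        if len(block) == block_size:
            yield block
            block = []
    if block:  # final, partially filled block
        yield block


def select(blocks: Iterable[Block], predicate: Callable[[Row], bool]) -> Iterator[Block]:
    """Filter operator: consumes blocks, emits blocks of qualifying tuples."""
    for block in blocks:
        out = [row for row in block if predicate(row)]
        if out:
            yield out


# Hypothetical data; only the first two rows appear on the slides.
table = [(1, "Joe", 45), (2, "Sue", 37), (3, "Ann", 52)]
plan = select(scan(table, block_size=2), lambda r: r[2] > 40)
result = [row for block in plan for row in block]
```

Working on blocks amortizes the per-call overhead of the iterator interface over many tuples, which is the usual motivation for this design.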

Compression methods

• Dictionary
– Example: … ‘low’ …, … ‘high’ …, … ‘low’ …, … ‘normal’ … is stored as … 00 …, … 10 …, … 00 …, … 01 …

• Bit-pack
– Pack several attributes inside a 4-byte word
– Use as many bits as max-value

• Delta
– Base value per page
– Arithmetic differences
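The three methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the talk's implementation: all function names are hypothetical, the delta variant stores differences from the page's first value (the slide does not say whether differences are taken from the base or from the previous value), and the page values are made up.

```python
def dictionary_encode(values, codes):
    """Replace each value with its small integer code."""
    return [codes[v] for v in values]


def bits_needed(max_value):
    """Number of bits required to represent values up to max_value."""
    return max(1, max_value.bit_length())


def bit_pack(values, widths):
    """Pack several small attribute values into one 4-byte word."""
    word, shift = 0, 0
    for v, w in zip(values, widths):
        if not 0 <= v < (1 << w):
            raise ValueError(f"{v} does not fit in {w} bits")
        word |= v << shift
        shift += w
    if shift > 32:
        raise ValueError("attributes do not fit in a 4-byte word")
    return word


def bit_unpack(word, widths):
    """Inverse of bit_pack: extract the fields in the same order."""
    out = []
    for w in widths:
        out.append(word & ((1 << w) - 1))
        word >>= w
    return out


def delta_encode(page):
    """Assumed variant: keep one base value per page plus differences from it."""
    base = page[0]
    return base, [v - base for v in page]


# Codes chosen to match the slide's example: low=00, normal=01, high=10.
codes = {"low": 0, "normal": 1, "high": 2}
encoded = dictionary_encode(["low", "high", "low", "normal"], codes)
width = bits_needed(max(codes.values()))
packed = bit_pack(encoded, [width] * len(encoded))
unpacked = bit_unpack(packed, [width] * len(encoded))
base, deltas = delta_encode([100, 103, 101, 110])  # hypothetical page values
```

Here four dictionary codes of two bits each fit in a single byte of the 4-byte word, and the delta-encoded page needs only small integers that can themselves be bit-packed.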

Storage engine

Query: SELECT name, age WHERE age > 40

[Figure: row scanner vs. column scanner. Row scanner: a single scan node (S) applies the predicate(s) and outputs (Joe, 45), …. Column scanner: a scan node (S) on age applies predicate #1 and outputs (#POS, 45), …; a second scan node (S) on name outputs (Joe, 45), ….]
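The figure's two scanners can be sketched for the example query. This is a simplified reading of the diagram, not the actual engine: the function names are hypothetical, positions are plain list indexes, and lists stand in for files.

```python
rows = [(1, "Joe", 45), (2, "Sue", 37)]  # row file
name_col = ["Joe", "Sue"]                # column file for name
age_col = [45, 37]                       # column file for age


def row_scanner(rows, predicate):
    """One scan node: test each full tuple, then project (name, age)."""
    return [(name, age) for (_id, name, age) in rows if predicate(age)]


def age_scan(age_col, predicate):
    """First column scan node: apply predicate #1, emit (position, age)."""
    return [(pos, age) for pos, age in enumerate(age_col) if predicate(age)]


def name_scan(name_col, positions_and_ages):
    """Second column scan node: fetch name at each position, emit (name, age)."""
    return [(name_col[pos], age) for pos, age in positions_and_ages]


def over_40(age):
    return age > 40


row_result = row_scanner(rows, over_40)
qualifying = age_scan(age_col, over_40)
col_result = name_scan(name_col, qualifying)
```

The column pipeline reads the name file only at positions that survived the predicate on age, which is why it does less work as selectivity drops.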

Platform

• CPU: 3.2GHz
• L2: 1MB, L2 cache prefetching (read 128 bytes)
• RAM: 1GB, 3.2 GB/sec
• DISKS (striped): 180 MB/sec, direct IO
• Prefetching: 100ms read, 10ms seek

Workload

• LINEITEM (wide)
– 60m rows → 9.5 GB
– Tuple sizes shown: 150 bytes, 50 bytes

• ORDERS (narrow)
– 60m rows → 1.9 GB
– Tuple sizes shown: 32 bytes, 12 bytes

• Query
– SELECT a1, a2, a3, … WHERE a1 yields variable selectivity
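A back-of-the-envelope I/O estimate helps set expectations for the scan times in the experiments. The sketch below assumes a purely disk-bound sequential scan at the platform's 180 MB/sec, decimal units (1 GB = 10^9 bytes), that 150 bytes is the tuple width of the 9.5 GB LINEITEM file, and that a column scan reads only the selected fraction of each tuple with no seek cost. The 20-byte selection and the function names are hypothetical.

```python
DISK_BANDWIDTH = 180e6  # bytes/sec, from the platform slide (decimal MB assumed)


def scan_seconds(bytes_read, bandwidth=DISK_BANDWIDTH):
    """Time for a disk-bound sequential read, ignoring seeks and CPU."""
    return bytes_read / bandwidth


def column_bytes(file_bytes, tuple_width, selected_bytes):
    """Bytes a column scan reads when it needs selected_bytes of each tuple."""
    return file_bytes * selected_bytes / tuple_width


lineitem_bytes = 9.5e9  # LINEITEM size, decimal GB assumed

# Row store: reads the whole file regardless of what is selected.
row_time = scan_seconds(lineitem_bytes)

# Column store: hypothetical query selecting 20 of the 150 bytes per tuple.
col_time = scan_seconds(column_bytes(lineitem_bytes, 150, 20))
```

Under these assumptions the row scan takes roughly 53 seconds and the column scan roughly 7 seconds; the measured curves additionally reflect CPU work and seeks.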

Wide tuple: 10% selectivity

[Figure: time (sec, 0 to 60) vs. selected bytes per tuple (4 to 148); series: Row, Row (CPU only), Column, Column (CPU only). Attribute annotations: int (4B), text (25B, 10B, 69B), char (1B).]

• Large prefetch hides disk seeks in columns

Wide tuple: 10% sel. (CPU)

[Figure: CPU time (sec, 0 to 12) vs. # attributes selected (1 to 16), one panel for row store and one for column store; stacked components: System, Busy (user), Memory stalls (user), Other stalls (user).]

• Row-CPU suffers from memory stalls

Wide tuple: 10% sel. (CPU)

[Figure: CPU time (sec, 0 to 12) vs. # attributes selected (1 to 16), row store and column store panels, with an additional panel labeled 0.1%; stacked components: System, Busy (user), Memory stalls (user), Other stalls (user).]

• Column-CPU efficiency with lower selectivity

Narrow tuple: 10% selectivity

[Figure: time (sec, 0 to 12) vs. selected bytes per tuple (4 to 32) and # attributes selected (1 to 7); series: Row, Column.]

[Figure: CPU time breakdown (0 to 12) vs. # attributes selected (1 to 7), row store and column store panels; components: CPU system, CPU user, Memory, Other.]

• Memory stalls disappear in narrow tuples

• Compression: similar to narrow (not shown)

Varying prefetch size

[Figure: no competing disk traffic; time (sec, 0 to 40) vs. selected bytes per tuple (4 to 32); series: Row (any prefetch size), Column 48 (x 128KB), Column 16, Column 8, Column 2.]

• No prefetching hurts columns in single scans

Varying prefetch size

[Figure: no competing disk traffic; time (sec, 0 to 40) vs. selected bytes per tuple (4 to 32).]

[Figure: with competing disk traffic; two panels of time (sec, 0 to 40) vs. selected bytes per tuple (4 to 28); series: Column, 48 and Row, 48 in one panel, Column, 8 and Row, 8 in the other.]

• No prefetching hurts columns in single scans

• Under competing traffic, columns outperform rows for any prefetch size
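Why prefetch size matters for columns can be seen with a simple ratio. The sketch below assumes that a column scan alternates between one prefetch-sized read and one seek to another column file, and it reads the platform slide's "100ms read, 10ms seek" as one such pair; that interpretation, the 10 ms small-read figure, and the function name are assumptions.

```python
def transfer_fraction(read_ms, seek_ms):
    """Fraction of disk time spent transferring data rather than seeking."""
    return read_ms / (read_ms + seek_ms)


# Large prefetch: figures taken from the platform slide.
large_prefetch = transfer_fraction(read_ms=100, seek_ms=10)

# Hypothetical small prefetch: each read is only as long as the seek that follows.
small_prefetch = transfer_fraction(read_ms=10, seek_ms=10)
```

With the large prefetch about 91% of disk time is useful transfer; with the hypothetical small prefetch only half is, which is one way to read the gap between the Column 48 and Column 2 curves.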

Analysis

• Central parameter in analysis: cycles per disk byte (cpdb)

• What can it model:
– More / fewer disks
– More / fewer CPUs
– CPU / disk competing traffic

• Trends in cpdb:
– 10 → 30 from 1995 to 2006
– Further increase with multicore chips
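To make the parameter concrete, the sketch below computes cpdb as aggregate CPU cycles per second divided by sequential disk bytes per second, using the platform slide's 3.2GHz CPU and 180 MB/sec disks. Treating cpdb as this simple ratio is an assumption based on its name; the function and its `cpus` and `disk_share` parameters are hypothetical.

```python
def cpdb(cpu_hz, disk_bytes_per_sec, cpus=1, disk_share=1.0):
    """Cycles per disk byte: CPU cycles available per byte the disks deliver.

    cpus scales the available cycles; disk_share is the fraction of disk
    bandwidth this scan receives (below 1.0 under competing disk traffic).
    """
    return (cpu_hz * cpus) / (disk_bytes_per_sec * disk_share)


platform = cpdb(3.2e9, 180e6)                   # the measured platform
more_disks = cpdb(3.2e9, 2 * 180e6)             # twice the disk bandwidth
more_cpus = cpdb(3.2e9, 180e6, cpus=2)          # two CPUs
competing = cpdb(3.2e9, 180e6, disk_share=0.5)  # half the disk bandwidth
```

Under this reading the platform sits near 18 cpdb; adding disks lowers the value, while adding CPUs or losing disk bandwidth to competing traffic raises it.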

Analysis

• Rows favored by narrow tuples and low cpdb
– Disk-bound workloads have higher cpdb

[Figure: speedup of cols over rows at 10% selectivity, 50% projection; x-axis: tuple width (8 to 36); y-axis: cycles per disk byte (cpdb): 9, 18, 36, 72, 144; legend bands: 0.4 – 0.8, 0.8 – 1.2, 1.2 – 1.6, 1.6 – 2, 2.]

See our paper for the rest

• CPU time breakdowns, L2 prefetcher

• Disk prefetching implementation

• Compression results

• Non-pipelined column scanner

• Analysis

Conclusions

• Given enough space for prefetching, columns outperform rows in most workloads

• Competing traffic favors columns

• Memory-bandwidth bottleneck in rows

• Future work
– Column scanners, random I/O, write performance


Thank you

db.csail.mit.edu/projects/cstore

Analysis

parameter                      what it can model
SizeFile, TupleWidth           various DB schemas
MemBytesCycle                  memory bus speed
f                              # of selected attributes
I                              CPU work
cpdb (cycles per disk byte)    more / fewer disks; more / fewer CPUs; CPU / disk competing traffic