Adaptive Indexing in Modern Databases - planet-data.eu · Adaptive Indexing Remove all tuning,...

Preview:

Citation preview

Adaptive Indexingin Modern Databases

Stratos Idreos, Stefan Manegold and Goetz Graefe

CWI, Amsterdam HP Labs, Palo Alto

* *

* +

+

Tutorial PlanPart IIntroduction and db cracking basics (Stratos)Adaptive merging (Goetz)Benchmarking and hybrids (Stefan)

Part IIStochastic cracking and sideways cracking (Stratos)Concurrency control (Goetz)Partial cracking and updates (Stefan)

Introduction and db cracking basics

CIDR2007, Database CrackingStratos Idreos, Martin Kersten and Stefan Manegold

Physical design problem

which indexes to build?on which data parts?and when to build them?

Database systems perform efficiently only after proper tuning...

DBA without adaptive indexing

Physical Design

Sample Workload

Timeline

Physical Design

Sample Workload

Analyze Performance

Timeline

Physical Design

Sample Workload

Analyze Performance

Prepare Estimated physical design

Timeline

Physical Design

Sample Workload

Analyze Performance

Prepare Estimated physical design

Timeline

Queries

Physical Design

Sample Workload

Analyze Performance

Prepare Estimated physical design

Timeline

Queries

Complex and time consuming process

Physical Design

Sample Workload

Analyze Performance

Prepare Estimated physical design

Timeline

Queries

Complex and time consuming process

? Dynamic Workloads

Very Large Databases?

Dynamic environments

idle time workload knowledge

Dynamic environments

idle time workload knowledge

some problem cases

Dynamic environments

idle time workload knowledge

• Not enough idle time to finish proper tuningsome problem cases

Dynamic environments

• By the time we finish tuning, the workload changes

idle time workload knowledge

• Not enough idle time to finish proper tuningsome problem cases

Dynamic environments

• By the time we finish tuning, the workload changes• No index support during tuning

idle time workload knowledge

• Not enough idle time to finish proper tuningsome problem cases

Dynamic environments

• By the time we finish tuning, the workload changes• No index support during tuning

• Not all data parts are equally useful

idle time workload knowledge

• Not enough idle time to finish proper tuningsome problem cases

Adaptive Indexing

Remove all tuning, physical design steps but still get similar performance as a fully tuned system

How? Design new auto-tuning kernels

(operators, plans, structures, etc.)

For dynamic environments:

DBA with adaptive indexing

Adaptive Indexingno monitoringno preparation

no human involvement

no external tools

no full indexes

Adaptive Indexingno monitoringno preparation

no human involvement

no external tools

Continuous on-the-fly physical reorganization

no full indexes

Adaptive Indexingno monitoringno preparation

no human involvement

no external tools

Continuous on-the-fly physical reorganization

no full indexes

partial, incremental, adaptive indexing

Cracking ExampleDatabase Cracking CIDR 2007

Each query is treated as an advice on how data should be stored

Cracking ExampleDatabase Cracking CIDR 2007

Each query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Cracking ExampleDatabase Cracking CIDR 2007

Each query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Resu

lt tu

ples

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42Gain knowledge

on how data is organized

Physically reorganize based on the selection predicate

Resu

lt tu

ples

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42Gain knowledge

on how data is organized

Dynamically/on-the-fly within the select-operator

Physically reorganize based on the selection predicate

Resu

lt tu

ples

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

Physically reorganize based on the selection predicate

Resu

lt tu

ples

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking ExampleEach query is treated as an advice on how data should be stored

from Rwhere R.A > 10

and R.A < 14

select *

Q2:select *from Rwhere R.A > 7

and R.A <= 16

Q1:136798131211141619 Piece 5: 16 < A

Piece 3: 10 < A < 14

Piece 1: A <= 7

Piece 2: 7 < A <= 10

Piece 4: 14 <= A <= 16

Cracker column of A Cracker column of A

10 < A < 14

14 <= A

A <= 10Piece 1:

Piece 3:

Piece 2:

(in!place)(copy)

Q1 Q2

Column A

13164921271193141186

49271386131211161914

42

The more we crack, the more

we learn

Physically reorganize based on the selection predicate

Resu

lt tu

ples

Dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

Cracking a column

5 20 40 60 90 120

cracker array A

existing cracks

All values after this position are bigger than 5

new select: 15<A<105

5 20 40 60 90 12015 105

after cracking

Database Cracking CIDR 2007

Cracking a column

5 20 40 60 90 120

cracker array A

existing cracks

All values after this position are bigger than 5

new select: 15<A<105

5 20 40 60 90 12015 105

after cracking

Database Cracking CIDR 2007

Cracking a column

5 20 40 60 90 120

cracker array A

existing cracks

All values after this position are bigger than 5

new select: 15<A<105

5 20 40 60 90 12015 105

after cracking

Database Cracking CIDR 2007

Cracking a column

5 20 40 60 90 120

cracker array A

existing cracks

All values after this position are bigger than 5

new select: 15<A<105

5 20 40 60 90 12015 105

after cracking

Database Cracking CIDR 2007

Cracking a column

5 20 40 60 90 120

cracker array A

existing cracks

All values after this position are bigger than 5

new select: 15<A<105

5 20 40 60 90 12015 105

after cracking

Database Cracking CIDR 2007

Cracking a column

5 20 40 60 90 120

cracker array A

existing cracks

All values after this position are bigger than 5

new select: 15<A<105

5 20 40 60 90 12015 105

after cracking

Database Cracking CIDR 2007

Cracking a column

5 20 40 60 90 120

cracker array A

existing cracks

All values after this position are bigger than 5

new select: 15<A<105

5 20 40 60 90 12015 105

after cracking

Database Cracking CIDR 2007

Cracking a column

5 20 40 60 90 120

cracker array A

existing cracks

All values after this position are bigger than 5

new select: 15<A<105

5 20 40 60 90 12015 105

after cracking

Database Cracking CIDR 2007

EngineeringDatabase Cracking CIDR 2007

Challenge: Efficiently refine index

Goal: Lightweight adaptation

index is refined as part of query processing

the select operator answers the query and refines the index in one go

EngineeringDatabase Cracking CIDR 2007

Exploit column-store bulk processing

EngineeringDatabase Cracking CIDR 2007

Exploit column-store bulk processing

select(A,v1-v2)A

EngineeringDatabase Cracking CIDR 2007

Exploit column-store bulk processing

select(A,v1-v2)A

scan and materialize result

EngineeringDatabase Cracking CIDR 2007

Exploit column-store bulk processing

select(A,v1-v2)A

scan and materialize result

more operators

EngineeringDatabase Cracking CIDR 2007

Exploit column-store bulk processing

select(A,v1-v2)A

scan and materialize result

more operators

crackselect(A,v1-v2)

A

EngineeringDatabase Cracking CIDR 2007

Exploit column-store bulk processing

select(A,v1-v2)A

scan and materialize result

more operators

crackselect(A,v1-v2)

A no need to materialize

EngineeringDatabase Cracking CIDR 2007

Exploit column-store bulk processing

select(A,v1-v2)A

scan and materialize result

more operators

crackselect(A,v1-v2)

A no need to materialize

more operators

Cracking ExampleEach query is treated as an advice on how data should be stored

100K random selections random selectivityrandom value rangesin a 10 million integer column

Database Cracking CIDR 2007

set-upScan

Full Index

Crack

0.001

0.01

0.1

1

10

100

1000

1 10 100 1000 10000 100000

Resp

onse

tim

e (

secs

)

Query sequence (x1000)

Cracking ExampleEach query is treated as an advice on how data should be stored

100K random selections random selectivityrandom value rangesin a 10 million integer column

almost no initialization overhead

Database Cracking CIDR 2007

set-upScan

Full Index

Crack

0.001

0.01

0.1

1

10

100

1000

1 10 100 1000 10000 100000

Resp

onse

tim

e (

secs

)

Query sequence (x1000)

Cracking ExampleEach query is treated as an advice on how data should be stored

100K random selections random selectivityrandom value rangesin a 10 million integer column

almost no initialization overhead

continuous improvement

Database Cracking CIDR 2007

set-upScan

Full Index

Crack

0.001

0.01

0.1

1

10

100

1000

1 10 100 1000 10000 100000

Resp

onse

tim

e (

secs

)

Query sequence (x1000)

Cracking ExampleEach query is treated as an advice on how data should be stored

100K random selections random selectivityrandom value rangesin a 10 million integer column

almost no initialization overhead

continuous improvement

Database Cracking CIDR 2007

set-upScan

Full Index

Crack

0.001

0.01

0.1

1

10

100

1000

1 10 100 1000 10000 100000

Resp

onse

tim

e (

secs

)

Query sequence (x1000)

Cracking ExampleEach query is treated as an advice on how data should be stored

10K random selections selectivity 10%random value rangesin a 30 million integer column

Database Cracking CIDR 2007

set-up

0.004

200

0.001

0.01

0.1

1

10

100

1 10 100 1000 10000

Cum

ula

tive a

vera

ge r

esp

onse

tim

e (

secs

)

Query sequence

Scan

Full Index

Crack

Cracking ExampleEach query is treated as an advice on how data should be stored

10K random selections selectivity 10%random value rangesin a 30 million integer column

10K queries later, Full Index still has not

amortized the initialization costs

Database Cracking CIDR 2007

set-up

0.004

200

0.001

0.01

0.1

1

10

100

1 10 100 1000 10000

Cum

ula

tive a

vera

ge r

esp

onse

tim

e (

secs

)

Query sequence

Scan

Full Index

Crack

Skyserver experiments

By the time stochastic cracking answered 160.000 queriesfull indexing (sorting) is still half way creating the index

Stochastic Cracking, PVLDB 12

4T Skyserver instancereal workload queries on PhotoObjAllPhotoObjAll = 500 million tuples

Literature

LiteratureCIDR’05, first sketch

CIDR’07, initial architecture

SIGMOD’07, updates

SIGMOD’09, tuple reconstruction

PhDThesis’10

TPCTC’10, benchmarking

EDBT’10, adaptive merging

PVLDB’11, hybrids

PVLDB‘12a, robustness

PVLDB‘12b, concurrency control

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexingA B C D

Table 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexingA B C D

Table 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

A B C DTable 1

A B C DTable 2

Base Data

x

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

A B C DTable 1

A B C DTable 2

Base Data

xx

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

A B C DTable 1

A B C DTable 2

Base Data

x

xx

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

Adaptive Alignment

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

Adaptive Alignment

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

Adaptive Alignment

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

Adaptive Alignment

A B C DTable 1

A B C DTable 2

Base Data

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

Adaptive Alignment

A B C DTable 1

A B C DTable 2

Base Data

Sort, cluster in caches

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

Adaptive Alignment

A B C DTable 1

A B C DTable 2

Base Data

Sort, cluster in caches

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

Adaptive Alignment

A B C DTable 1

A B C DTable 2

Base Data

Sort, cluster in caches

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

Adaptive Alignment

A B C DTable 1

A B C DTable 2

Base Data

Sort, cluster in caches

Crack joins

Tangram

A B C DTable 1

A B C DTable 2

As queries arrive...Partial materialization

Partial indexing

Continuous adaptation

Storage adaptation

No tuple reconstruction

Adaptive Alignment

A B C DTable 1

A B C DTable 2

Base Data

Sort, cluster in caches

Crack joins

Related Work Auto-tuning tools

On-line tuning

Offline analysis and physical design.

Online analysis. Combine scans with index build.

Surajit Chaudhuri, Vivek Narassaya, Nico Bruno, Guy Lohman, Daniel Zilio, Anastassia Ailamaki, Alkis Polyzotis, Sam Lightstone,

Kai-Uwe Sattler, Karl Schnaitter ... and many more

Related Work Auto-tuning tools

On-line tuning

Offline analysis and physical design.

Online analysis. Combine scans with index build.

Surajit Chaudhuri, Vivek Narassaya, Nico Bruno, Guy Lohman, Daniel Zilio, Anastassia Ailamaki, Alkis Polyzotis, Sam Lightstone,

Kai-Uwe Sattler, Karl Schnaitter ... and many more

Adaptive Indexing pushes partial, incremental, adaptive, online indexing inside

the kernel

Indexing Overviewworkload analysis

index buildingquery processing

offline indexing

Indexing Overviewworkload analysis

index buildingquery processing

offline indexing

online indexingworkload analysis

index buildingquery processing

Indexing Overviewworkload analysis

index buildingquery processing

offline indexing

online indexing

adaptive indexing

workload analysisindex building

query processing

adaptive indexing

Indexing Overviewworkload analysis

index buildingquery processing

offline indexing

online indexing

adaptive indexing

workload analysisindex building

query processing

adaptive indexing

wor

kloa

d kn

owle

dge

idle time

adaptive

online

offline

So what’s next...Is adaptive indexing the ultimate solution?Should we throw auto-tuning tools away? NO

So what’s next...Is adaptive indexing the ultimate solution?Should we throw auto-tuning tools away? NO

SIGMOD2012, PHDw, Holistic IndexingEleni Petraki and Stratos Idreos

So what’s next...Is adaptive indexing the ultimate solution?Should we throw auto-tuning tools away?

But AI may be the missing link

NO

adaptive indexing offline analysis online analysis

A system that continuously adapts and exploits both online and offline statistics as well as any a priori or online idle time

and its tuning/alert actions do not slow down queries

SIGMOD2012, PHDw, Holistic IndexingEleni Petraki and Stratos Idreos

Tutorial PlanPart IIntroduction and db cracking basics (Stratos)Adaptive merging (Goetz)Benchmarking and hybrids (Stefan)

Part IIStochastic cracking and sideways cracking (Stratos)Concurrency control (Goetz)Partial cracking and updates (Stefan)

Recommended