
XJoin: Faster Query Results Over Slow And Bursty Networks


Page 1: XJoin: Faster Query Results Over Slow And Bursty Networks

1

XJoin: Faster Query Results Over Slow And Bursty Networks

IEEE Bulletin, 2000, by T. Urhan and M. Franklin

Based on a talk prepared by Asima Silva & Leena Razzaq

Page 2: XJoin: Faster Query Results Over Slow And Bursty Networks

2

Motivation

Data delivery issues in terms of:
– unpredictable delay from some remote data sources
– wide-area networks with possibly slow communication links, congestion, failures, and overload

Goal:
– Not just overall query processing time matters
– Also when the initial data is delivered
– Overall throughput and delivery rate throughout query processing

Page 3: XJoin: Faster Query Results Over Slow And Bursty Networks

3

Overview

• Hash Join History
• 3 Classes of Delays
• Motivation of XJoin
• Challenges of Developing XJoin
• Three Stages of XJoin
• Handling Duplicates
• Experimental Results

Page 4: XJoin: Faster Query Results Over Slow And Bursty Networks

4

Hash Join

Only one table is hashed

[Diagram: R tuples are hashed into buckets by join key (1. BUILD); S tuples then probe the hash table one at a time (2. PROBE), and matches are output.]
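To make the BUILD/PROBE picture concrete, here is a minimal sketch of a classic in-memory hash join in Python; the relation names and key-extractor functions are illustrative, not from the paper.

```python
from collections import defaultdict

def hash_join(R, S, key_r, key_s):
    # 1. BUILD: hash only one table (R) on the join key.
    table = defaultdict(list)
    for r in R:
        table[key_r(r)].append(r)

    # 2. PROBE: stream the other table (S) past the hash table, one tuple at a time.
    return [(r, s) for s in S for r in table.get(key_s(s), [])]

# Example: join departments to employees on department id.
emps = [("alice", 1), ("bob", 2)]
depts = [(1, "eng"), (2, "ops")]
print(hash_join(depts, emps, key_r=lambda d: d[0], key_s=lambda e: e[1]))
# [((1, 'eng'), ('alice', 1)), ((2, 'ops'), ('bob', 2))]
```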

Page 5: XJoin: Faster Query Results Over Slow And Bursty Networks

5

Hybrid Hash Join

One table is hashed into partitions, kept partly in memory and partly on disk.
G. Graefe, "Query Evaluation Techniques for Large Databases", ACM Computing Surveys, 1993.

[Diagram: R tuples are hashed into buckets; some buckets are kept in memory while the rest are written to disk. Arriving S tuples probe the memory-resident buckets.]
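A compact sketch of the hybrid hash join idea, under the simplifying assumption that partition 0 of R stays in memory while the remaining partitions (and the S tuples that hash to them) are spilled and joined in a second pass; plain Python lists stand in for disk files, and all names are illustrative.

```python
from collections import defaultdict

def hybrid_hash_join(R, S, key_r, key_s, num_partitions=4):
    # BUILD: partition R by hash; partition 0 stays in memory, the rest are "spilled".
    mem_table = defaultdict(list)                                  # in-memory partition 0 of R
    disk_R = [defaultdict(list) for _ in range(num_partitions)]    # spilled R partitions
    for r in R:
        k = key_r(r)
        p = hash(k) % num_partitions
        (mem_table if p == 0 else disk_R[p])[k].append(r)

    # PROBE: S tuples that hash to partition 0 are joined immediately; others are spilled.
    results = []
    disk_S = [[] for _ in range(num_partitions)]
    for s in S:
        k = key_s(s)
        p = hash(k) % num_partitions
        if p == 0:
            results.extend((r, s) for r in mem_table.get(k, []))
        else:
            disk_S[p].append(s)

    # Second pass: join each pair of spilled partitions.
    for p in range(1, num_partitions):
        for s in disk_S[p]:
            results.extend((r, s) for r in disk_R[p].get(key_s(s), []))
    return results
```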

Page 6: XJoin: Faster Query Results Over Slow And Bursty Networks

6

Symmetric Hash Join (Pipeline)

Both tables are hashed (both kept in main memory only)
Z. Ives, A. Levy, "An Adaptive Query Execution", VLDB 99

[Diagram: Source R and Source S each feed their own in-memory hash table. An arriving R tuple is inserted (BUILD) into R's table and probes (PROBE) S's table; an arriving S tuple is handled symmetrically. Matches go straight to the OUTPUT.]
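Below is a minimal sketch of a symmetric (pipelined) hash join over a single interleaved stream of tagged tuples; the stream format and function names are assumptions for illustration.

```python
from collections import defaultdict

def symmetric_hash_join(stream, key_r, key_s):
    """stream yields ('R', tuple) or ('S', tuple); matches are emitted as soon as possible."""
    table_r, table_s = defaultdict(list), defaultdict(list)
    for source, t in stream:
        if source == 'R':
            k = key_r(t)
            table_r[k].append(t)                 # BUILD into R's table
            for s in table_s.get(k, []):         # PROBE S's table
                yield (t, s)
        else:
            k = key_s(t)
            table_s[k].append(t)                 # BUILD into S's table
            for r in table_r.get(k, []):         # PROBE R's table
                yield (r, t)

# Example: results appear as soon as a matching pair has arrived.
stream = [('R', ('alice', 1)), ('S', (1, 'eng')), ('R', ('bob', 1))]
print(list(symmetric_hash_join(stream, key_r=lambda r: r[1], key_s=lambda s: s[0])))
# [(('alice', 1), (1, 'eng')), (('bob', 1), (1, 'eng'))]
```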

Page 7: XJoin: Faster Query Results Over Slow And Bursty Networks

7

Problem of SHJ:

Memory intensive:
– Won't work for large input streams
– Won't allow many joins to be processed in a pipeline (or even in parallel)

Page 8: XJoin: Faster Query Results Over Slow And Bursty Networks

8

New Problems: Three Delays

– Initial Delay: the first tuple arrives from a remote source more slowly than usual (we still want the initial answer out quickly)

– Slow Delivery: data arrives at a constant, but slower than expected, rate (at the end, we still want good overall throughput)

– Bursty Arrival: data arrives in a fluctuating manner (how do we avoid sitting idle during periods of low input rates?)

Page 9: XJoin: Faster Query Results Over Slow And Bursty Networks

9

Question:

Why are delays undesirable?
– They prolong the time to first output
– Processing is slowed if we wait for all data to arrive before acting
– If data arrives too fast, we want to avoid losing any of it
– Sitting idle while no data is incoming wastes time
– Delays are unpredictable, so one single strategy won't work

Page 10: XJoin: Faster Query Results Over Slow And Bursty Networks

10

Challenges for XJoin

Manage flow of tuples between memory and secondary storage (when and how to do it)

Control background processing when inputs are delayed (reactive scheduling idea)

Ensure the full answer is produced
Ensure duplicate tuples are not produced
Achieve both quick initial output and good overall throughput

Page 11: XJoin: Faster Query Results Over Slow And Bursty Networks

11

Motivation of XJoin

Produces results incrementally as they become available
– Tuples are returned as soon as they are produced
– Good for online processing

Allows progress to be made when one or more sources experience delays:
– Background processing is performed on previously received tuples, so results are produced even when both inputs are stalled

Page 12: XJoin: Faster Query Results Over Slow And Bursty Networks

12

Stages (in different threads)

M:M

M:D

D:D

Page 13: XJoin: Faster Query Results Over Slow And Bursty Networks

13

XJoin

[Diagram: tuples arriving from SOURCE-A and SOURCE-B are hashed into the memory-resident partitions 1..n of their own source (e.g. hash(Tuple A) = 1, hash(Tuple B) = n). When memory fills up, a partition is flushed and appended to the corresponding disk-resident partition, so each source has both memory-resident and disk-resident partitions.]

Page 14: XJoin: Faster Query Results Over Slow And Bursty Networks

14

1st Stage of XJoin

Memory-to-Memory Join
Tuples are stored in partitions:
– A memory-resident (m-r) portion
– A disk-resident (d-r) portion

Join processing continues as usual:
– If space permits, join memory to memory
– If memory is full, pick one partition as victim, flush it to disk, and append it to the end of its disk partition

The 1st Stage runs as long as one of the inputs is producing tuples. If there is no new input, Stage 1 blocks and Stage 2 starts.
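A simplified, single-threaded sketch of this first stage: insert the arriving tuple into its own memory-resident partition, probe the other source's matching partition, and flush a victim partition to disk when memory is full. The memory limit, victim policy, and data structures are illustrative assumptions, not the paper's exact design.

```python
from collections import defaultdict

NUM_PARTITIONS = 4
MEMORY_LIMIT = 10_000      # max tuples held in memory across both sources (assumed constant)

# Per-source memory-resident (m-r) and disk-resident (d-r) partitions; lists model disk files.
mem = {'R': [defaultdict(list) for _ in range(NUM_PARTITIONS)],
       'S': [defaultdict(list) for _ in range(NUM_PARTITIONS)]}
disk = {'R': [[] for _ in range(NUM_PARTITIONS)],
        'S': [[] for _ in range(NUM_PARTITIONS)]}
in_memory = 0

def pick_victim():
    # One plausible policy: flush the largest memory-resident partition.
    return max(((src, p) for src in ('R', 'S') for p in range(NUM_PARTITIONS)),
               key=lambda sp: sum(len(ts) for ts in mem[sp[0]][sp[1]].values()))

def stage1_insert(source, key, tup, emit):
    """Handle one arriving tuple: build into its own m-r partition, probe the other source's
    m-r partition, and flush a victim partition to disk when memory is full."""
    global in_memory
    other = 'S' if source == 'R' else 'R'
    p = hash(key) % NUM_PARTITIONS

    mem[source][p][key].append(tup)                     # BUILD
    in_memory += 1
    for match in mem[other][p].get(key, []):            # PROBE (memory to memory)
        emit((tup, match) if source == 'R' else (match, tup))

    if in_memory > MEMORY_LIMIT:                        # memory full: flush one partition
        v_src, v_p = pick_victim()
        victim = mem[v_src][v_p]
        disk[v_src][v_p].extend((k, t) for k, ts in victim.items() for t in ts)
        in_memory -= sum(len(ts) for ts in victim.values())
        mem[v_src][v_p] = defaultdict(list)
```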

Page 15: XJoin: Faster Query Results Over Slow And Bursty Networks

15

1st Stage Memory-to-Memory Join

[Diagram: an arriving Tuple A is hashed (hash(record A) = i), inserted into memory-resident partition i of source A, and used to probe partition i of source B; an arriving Tuple B is handled symmetrically (hash(record B) = j). Matching pairs go directly to the output.]

Page 16: XJoin: Faster Query Results Over Slow And Bursty Networks

16

Why Stage 1?

• Use memory whenever possible, as it is fastest

• Use newly arriving data while it is already in memory

• Don't stop to fetch data from disk to join with new arrivals

Page 17: XJoin: Faster Query Results Over Slow And Bursty Networks

17

Question:

– What does the Second Stage do?
– When does the Second Stage start?
– Hints:
  XJoin proposes a memory management technique.
  What happens when the input data (tuples) is too large for memory?

– Answer: the Second Stage joins memory to disk. It occurs when both inputs are blocked.

Page 18: XJoin: Faster Query Results Over Slow And Bursty Networks

18

2nd Stage of XJoin

Activated when the 1st Stage is blocked. Performs 3 steps:

1. Chooses a partition from one source according to its throughput and size

2. Uses tuples from the d-r portion to probe the m-r portion of the other source and outputs matches, until the d-r portion is completely processed

3. Checks if either input resumed producing tuples. If yes, resume the 1st Stage. If no, choose another d-r portion and continue the 2nd Stage.
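Continuing the Stage 1 sketch above (it reuses the same mem, disk, and NUM_PARTITIONS structures, with disk entries stored as (key, tuple) pairs), here is a rough sketch of the second stage's loop; emit and inputs_resumed are assumed callbacks, and the timestamp-based duplicate suppression described later is omitted.

```python
def stage2(emit, inputs_resumed):
    """Run while both inputs are blocked: probe m-r partitions with d-r ones.
    Timestamp-based duplicate suppression (see the later slides) is omitted from this sketch."""
    processed = set()
    while not inputs_resumed():
        # 1. Choose an unprocessed, non-empty disk-resident partition (here simply the largest).
        candidates = [(s, i) for s in ('R', 'S') for i in range(NUM_PARTITIONS)
                      if (s, i) not in processed and disk[s][i]]
        if not candidates:
            return
        src, p = max(candidates, key=lambda sp: len(disk[sp[0]][sp[1]]))
        other = 'S' if src == 'R' else 'R'

        # 2. Probe the other source's m-r partition with every d-r tuple, emitting matches,
        #    until the d-r portion is completely processed.
        for key, tup in disk[src][p]:
            for match in mem[other][p].get(key, []):
                emit((tup, match) if src == 'R' else (match, tup))
        processed.add((src, p))

        # 3. The loop condition re-checks whether an input resumed producing tuples; if so,
        #    control returns to Stage 1, otherwise another d-r portion is chosen.
```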

Page 19: XJoin: Faster Query Results Over Slow And Bursty Networks

19

Stage 2: Disk-to-Memory Joins

[Diagram: the disk-resident portion of partition i of source A (DPiA) is read from disk and used to probe the memory-resident portion of partition i of source B (MPiB); matches go to the output.]

Page 20: XJoin: Faster Query Results Over Slow And Bursty Networks

20

Controlling 2nd Stage

The cost of the 2nd Stage is hidden when both inputs experience delays.

Tradeoff: what are the benefits of using the second stage?
– Produces results while the input sources are stalled
– Allows variable input rates

What is the disadvantage?
– The second stage must complete a d-r portion before checking for new input (overhead)

To address the tradeoff, use an activation threshold:
– Pick a partition likely to produce many tuples right now

Page 21: XJoin: Faster Query Results Over Slow And Bursty Networks

21

3rd Stage of XJoin

Disk-to-Disk Join (clean-up stage)
– Assumes that all data for both inputs has arrived
– Assumes that the first and second stages have completed
– Makes sure that all tuples belonging in the result are produced

Why is this step necessary?
– Completeness of the answer

Page 22: XJoin: Faster Query Results Over Slow And Bursty Networks

22

Handling Duplicates

When could duplicates be produced?
– Duplicates could be produced in all 3 stages, as multiple stages may perform overlapping work.

How is it addressed?
– XJoin prevents duplicates with timestamps.

When is it addressed?
– During processing, as output is produced continuously.

Page 23: XJoin: Faster Query Results Over Slow And Bursty Networks

23

Time Stamping: part 1

2 fields are added to each tuple:
– Arrival TimeStamp (ATS): indicates when the tuple first arrived in memory
– Departure TimeStamp (DTS): indicates when the tuple was flushed to disk

[ATS, DTS] indicates when the tuple was resident in memory.

When did two tuples get joined (in the 1st stage)?
– If Tuple A's DTS is within Tuple B's [ATS, DTS]
Tuples that meet this overlap condition are not considered for joining by the 2nd or 3rd stages.
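As a sketch, this first-stage duplicate check can be written as a symmetric overlap test on the [ATS, DTS] windows (the symmetric form is needed because either tuple may be the one that departed first); the TsTuple representation is an assumption for illustration.

```python
from collections import namedtuple

# Illustrative tuple representation: join key plus the two timestamps (assumption).
TsTuple = namedtuple('TsTuple', 'key ats dts')

def joined_in_stage1(a, b):
    """True if the two tuples overlapped in memory, i.e. the pair was already produced by Stage 1."""
    return (b.ats <= a.dts <= b.dts) or (a.ats <= b.dts <= a.dts)

# Numbers from the next slide: A = [102, 234]; B1 = [178, 198] overlaps, B2 = [348, 601] does not.
print(joined_in_stage1(TsTuple('k', 102, 234), TsTuple('k', 178, 198)))   # True  (already joined)
print(joined_in_stage1(TsTuple('k', 102, 234), TsTuple('k', 348, 601)))   # False (later stages must join)
```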

Page 24: XJoin: Faster Query Results Over Slow And Bursty Networks

24

Detecting tuples joined in 1st stage

[Example: Tuple A has [ATS, DTS] = [102, 234].
Overlapping: Tuple B1 with [ATS, DTS] = [178, 198] arrived after A and before A was flushed to disk, so A and B1 were joined in the first stage.
Non-overlapping: Tuple B2 with [ATS, DTS] = [348, 601] arrived after A was flushed to disk, so A and B2 were not joined in the first stage.]

Page 25: XJoin: Faster Query Results Over Slow And Bursty Networks

25

Time Stamping: part 2

• For each partition, keep track of:
– ProbeTS: the time when a 2nd stage probe was done
– DTSlast: the latest DTS of all the tuples that were available on disk at that time

• Several such probes may occur:
– Thus keep an ordered history of such probe descriptors

• Usage:
– All disk tuples with DTS up to and including DTSlast were joined in Stage 2 with all tuples that were in main memory (their [ATS, DTS] window contains ProbeTS) at time ProbeTS
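A sketch of how a partition's probe history could be consulted to detect pairs that the second stage already produced; the ProbeDesc representation and field names are assumptions based on the slide.

```python
from collections import namedtuple

TsTuple = namedtuple('TsTuple', 'key ats dts')
ProbeDesc = namedtuple('ProbeDesc', 'dts_last probe_ts')     # one entry per 2nd-stage probe

def joined_in_stage2(disk_tuple, mem_tuple, history):
    """True if some earlier 2nd-stage probe of disk_tuple's partition already joined this pair:
    disk_tuple had been flushed by then (DTS <= DTSlast) and mem_tuple was memory-resident
    at the probe time (ATS <= ProbeTS <= DTS)."""
    return any(disk_tuple.dts <= pd.dts_last and
               mem_tuple.ats <= pd.probe_ts <= mem_tuple.dts
               for pd in history)

# Numbers from the next slide: A = [100, 200] on disk in Partition 2, B = [500, 600] in memory,
# and Partition 2's history contains (DTSlast = 250, ProbeTS = 550): the pair was already joined.
history_p2 = [ProbeDesc(dts_last=250, probe_ts=550)]
print(joined_in_stage2(TsTuple('k', 100, 200), TsTuple('k', 500, 600), history_p2))   # True
```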

Page 26: XJoin: Faster Query Results Over Slow And Bursty Networks

26

Detecting tuples joined in 2nd stage

[Example: Tuple A has [ATS, DTS] = [100, 200] and resides on disk in Partition 2; Tuple B has [ATS, DTS] = [500, 600] and is memory-resident. The history list for Partition 2 contains the probe descriptor (DTSlast = 250, ProbeTS = 550). All A tuples in Partition 2 with DTS up to DTSlast = 250 (including Tuple A) were joined in Stage 2, at time ProbeTS, with the m-r tuples whose [ATS, DTS] window contains ProbeTS = 550 (including Tuple B), so this pair overlaps and is not produced again. A partition's history may contain several such probe descriptors.]

Page 27: XJoin: Faster Query Results Over Slow And Bursty Networks

27

Experiments

• HHJ (Hybrid Hash Join)

• XJoin (with 2nd stage and with caching)

• XJoin (without 2nd stage)

• XJoin (with aggressive usage of 2nd stage)

Page 28: XJoin: Faster Query Results Over Slow And Bursty Networks

28

Case 1: Slow Network
Both sources are slow (bursty)

XJoin improves the delivery time of initial answers -> interactive performance

Reactive background processing is an effective way to exploit intermittent delays and keep up a continued output rate.

Shows that the 2nd stage is very useful if there is time for it

Page 29: XJoin: Faster Query Results Over Slow And Bursty Networks

29

Slow Network: both sources are slow

Page 30: XJoin: Faster Query Results Over Slow And Bursty Networks

30

Case 2: Fast Network
Both sources are fast

All XJoin variants deliver initial results earlier.
XJoin can also deliver the overall result in roughly the same time as HHJ.
HHJ delivers the 2nd half of the result faster than XJoin.
The 2nd stage cannot be used too aggressively if new data is coming in continuously.

Page 31: XJoin: Faster Query Results Over Slow And Bursty Networks

31

Case 2: Fast Network
Both sources are fast

Page 32: XJoin: Faster Query Results Over Slow And Bursty Networks

32

Conclusion

Can be conservative on space (small footprint)

Can be used in conjunction with online query processing to manage the streams

Resuming Stage 1 as soon as data arrives
Dynamically choosing techniques for producing results

Page 33: XJoin: Faster Query Results Over Slow And Bursty Networks

33

References

Urhan, Tolga and Franklin, Michael J. “XJoin: Getting Fast Answers From Slow and Bursty Networks.”

Urhan, Tolga, Franklin, Michael J. “XJoin: A Reactively-Scheduled Pipelined Join Operator.”

Hellerstein, Franklin, Chandrasekaran, Deshpande, Hildrum, Madden, Raman, and Shah. “Adaptive Query Processing: Technology in Evolution”. IEEE Data Engineering Bulletin, 2000.

Avnur, Ron and Hellerstein, Joseph M. "Eddies: Continuously Adaptive Query Processing."

Babu, Shivnath and Widom, Jennifer. "Continuous Queries over Data Streams."