Page 1: 4th PFI System reading

How Soccer Players Would do Stream Joins

3/4/2015 @nobu_k

Who?

Nobuyuki Kubota (久保田展行, @nobu_k)

CTO at Preferred Networks America, Inc.

Speciality

DBMS, Search engine

Distributed Systems (consensus)

beatmania IIDX SP/DP Kaiden rank (mainly DP)

How Soccer Players Would do Stream Joins

Jens Teubner, Rene Mueller, SIGMOD 2011

Handshake Join

Window-based stream joins supporting any join predicate

Very high degrees of parallelism

multi-core CPUs

FPGA

Massively Parallel Processor Arrays (MPPAs)

Joins

Joins (⋈) combine two or more relations (tables) in an RDBMS

A join is a cross product of relations followed by a selection (σ)

Many methods

Nested-loops joins

Sort-merge joins

(Recursive, hybrid) hash joins
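The "cross product followed by a selection" view maps most directly onto a nested-loops join. The following is a minimal sketch; the function name `nested_loops_join` and the sample relations `orders` and `users` are made up for illustration and do not come from the slides.

```python
def nested_loops_join(r_rel, s_rel, predicate):
    """Yield each pair of the cross product r_rel x s_rel that passes the predicate."""
    for r in r_rel:              # outer relation
        for s in s_rel:          # inner relation, rescanned for every outer tuple
            if predicate(r, s):  # the selection step
                yield (r, s)


# Hypothetical sample data: (order_id, user_name) and (user_name, country).
orders = [(1, "alice"), (2, "bob"), (3, "alice")]
users = [("alice", "JP"), ("bob", "US")]

result = list(nested_loops_join(orders, users, lambda o, u: o[1] == u[0]))
```

Because the predicate is an arbitrary function, this form supports any join condition, at the cost of comparing every pair.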

Stream Joins

Problems

Unbounded "infinite" input data

Solution: (sliding) window-based joins

tuple-based/time-based

Latency of the output

Solution: online, symmetric evaluation

How can it be scalable?
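A tuple-based sliding window combined with online, symmetric evaluation can be sketched as follows. This is a single-threaded illustration only, so it does not address the scalability question; the class name `SlidingWindowJoin` and its methods are assumptions made for this sketch.

```python
from collections import deque


class SlidingWindowJoin:
    """Tuple-based sliding-window join with symmetric, per-arrival evaluation."""

    def __init__(self, window_size, predicate):
        # deque(maxlen=...) silently drops the oldest tuple once the window is full.
        self.r_window = deque(maxlen=window_size)
        self.s_window = deque(maxlen=window_size)
        self.predicate = predicate

    def on_r(self, r):
        """A new R tuple is matched against the current S window, then stored."""
        matches = [(r, s) for s in self.s_window if self.predicate(r, s)]
        self.r_window.append(r)
        return matches

    def on_s(self, s):
        """A new S tuple is matched against the current R window, then stored."""
        matches = [(r, s) for r in self.r_window if self.predicate(r, s)]
        self.s_window.append(s)
        return matches


join = SlidingWindowJoin(window_size=2, predicate=lambda r, s: r == s)
out = []
out += join.on_r(1)
out += join.on_s(1)   # matches R tuple 1
out += join.on_r(2)
out += join.on_r(3)   # R tuple 1 falls out of the window here
out += join.on_s(1)   # no match: R tuple 1 has expired
out += join.on_s(3)   # matches R tuple 3
```

Each arrival produces its matches immediately, which keeps output latency low, and the window bounds the state despite unbounded input.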

Handshake Joins

Streams flow by each other in opposite directions.

Each core locally evaluates tuples.

[Figure: the two streams pass through Core 1, Core 2, and Core 3 in opposite directions.]

Evaluation Strategy

A newly arrived tuple is compared against all tuples of the other stream that currently reside in the same core.

Any comparison algorithm (predicate) can be used.
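The flow of the two streams and the per-core evaluation can be sketched with a sequential simulation. This is an illustration under simplifying assumptions, not the implementation from the paper: every arrival is propagated to completion before the next one is handled (so no synchronization is needed), each core holds at most `capacity` tuples per stream, and the names `Core` and `handshake_join` are made up for this sketch.

```python
from collections import deque


class Core:
    """One join core holding a local segment of each stream's window."""

    def __init__(self):
        self.r = deque()  # R tuples, flowing from the first core to the last
        self.s = deque()  # S tuples, flowing from the last core to the first


def handshake_join(events, n_cores, capacity, predicate):
    """events: sequence of ("R", tuple) or ("S", tuple). Returns the matches in order."""
    cores = [Core() for _ in range(n_cores)]
    out = []

    def insert_r(i, r):
        core = cores[i]
        # Local evaluation: compare the arriving R tuple with the S tuples in this core.
        out.extend((r, s) for s in core.s if predicate(r, s))
        core.r.append(r)
        if len(core.r) > capacity:
            oldest = core.r.popleft()
            if i + 1 < n_cores:
                insert_r(i + 1, oldest)  # push the oldest tuple to the next core
            # otherwise the tuple leaves the window

    def insert_s(i, s):
        core = cores[i]
        out.extend((r, s) for r in core.r if predicate(r, s))
        core.s.append(s)
        if len(core.s) > capacity:
            oldest = core.s.popleft()
            if i - 1 >= 0:
                insert_s(i - 1, oldest)

    for stream, t in events:
        if stream == "R":
            insert_r(0, t)            # R enters at the first core
        else:
            insert_s(n_cores - 1, t)  # S enters at the last core
    return out


events = [("R", 1), ("R", 2), ("S", 2), ("S", 1),
          ("R", 3), ("S", 3), ("R", 4), ("S", 4)]
matches = handshake_join(events, n_cores=3, capacity=1,
                         predicate=lambda r, s: r == s)
```

In this run the pairs (2, 2), (1, 1), and (3, 3) are produced once each, while (4, 4) has not been produced yet because the two tuples are still at opposite ends of the chain. A pair is only emitted when one tuple moves into the core that holds the other.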

Strategies

Lock Step Forwarding

Two-Phase Forwarding using Async-MQ

Asymmetric protocol

Synchronization

[Figure: tuples a and b, traveling in opposite directions, miss each other.]

Two-Phase Forwarding Using Asynchronous Message Queue

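[Figure, steps 1 and 2: tuple b is forwarded from the right core to the left core while tuple a travels the other way.]

The sending core leaves a copy of the forwarded tuple behind with a special mark.

Cores exchange messages through FIFO queues.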

Two-Phase Forwarding: ACK

[Figure, steps 2 and 3: the left core receives b and acknowledges it.]

When the left core receives tuple b, it sends an ack to the right core before sending any other tuples.

The right core deletes b when it receives the ack.

Two-Phase Forwarding: when a and b miss each other

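[Figure, steps 4 and 5: a reaches the right core while the marked copy of b is still there.]

a is compared against the tuples in the right core, including the marked copy of b.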
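The protocol walked through in the last three slides can be sketched as a tiny two-core simulation. It is a hedged illustration, not the paper's code: the function `simulate`, its flags, and the message format are assumptions, and only the tuples a (moving left to right) and b (moving right to left) exist. With `two_phase=False` the sender forgets b as soon as it is sent, which reproduces the miss.

```python
from collections import deque


def simulate(two_phase, crossing, predicate=lambda r, s: True):
    """Return the matches produced while a and b pass each other between two cores."""
    left_r, left_s = ["a"], []             # tuples stored in the left core
    right_r, right_s = [], ["b"]           # tuples stored in the right core
    forwarded = set()                      # tuples in the right core marked as forwarded
    to_right, to_left = deque(), deque()   # FIFO queues between the two cores
    matches = []

    def left_send_a():
        left_r.remove("a")
        to_right.append(("tuple", "a"))

    def right_send_b():
        if two_phase:
            forwarded.add("b")             # phase 1: keep a marked copy of b
        else:
            right_s.remove("b")            # naive forwarding: drop b immediately
        to_left.append(("tuple", "b"))

    def left_receive():
        _kind, s = to_left.popleft()
        matches.extend((r, s) for r in left_r if predicate(r, s))
        left_s.append(s)
        if two_phase:
            to_right.append(("ack", s))    # the ack is queued before any later tuple

    def right_receive():
        kind, t = to_right.popleft()
        if kind == "ack":                  # phase 2: delete the marked copy
            forwarded.discard(t)
            right_s.remove(t)
        else:                              # marked copies still take part in comparisons
            matches.extend((t, s) for s in right_s if predicate(t, s))
            right_r.append(t)

    if crossing:
        left_send_a()                      # a and b are in flight at the same time
        right_send_b()
        left_receive()                     # b arrives; a has already left
        right_receive()                    # a arrives in the right core
    else:
        right_send_b()
        left_receive()                     # b meets a in the left core
        left_send_a()                      # a is queued behind the ack
    while to_right:
        right_receive()
    return matches


missed = simulate(two_phase=False, crossing=True)
caught = simulate(two_phase=True, crossing=True)
no_duplicate = simulate(two_phase=True, crossing=False)
```

Because the queue is FIFO and the ack is sent before any other tuple, the marked copy is deleted before a can reach the right core in the non-crossing case, so the pair is reported exactly once either way.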

Load Balancing

Automatic load balancing without centralized control.

Each core can handle an arbitrary number of tuples.

[Figure: Core 1, Core 2, Core 3]

Software Implementation

AMD Opteron 6174 2.2GHz

libnuma

[Figure quoted from page 8 of the paper]

Scalability

[Figures quoted from pages 8 and 9 of the paper]

FPGA Implementation

"Assume the system has to provide a throughput of 500 ktuples/sec with a window size of 100 tuples. Configurations with 1, 2, 5, and 10 join cores can guarantee this throughput if operated at clock frequencies of 50, 25, 10, or 5 MHz, respectively." (quoted from page 10 of the paper)
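The quoted frequencies are consistent with a simple back-of-the-envelope model, sketched below. The model is an assumption inferred from the numbers rather than something the slides state: each incoming tuple needs one comparison per window tuple, each core performs one comparison per clock cycle, and the work divides evenly across cores. The function name `required_clock_mhz` is made up for this sketch.

```python
def required_clock_mhz(throughput_tuples_per_sec, window_size, n_cores):
    """Clock frequency per core in MHz, assuming one comparison per cycle per core."""
    comparisons_per_sec = throughput_tuples_per_sec * window_size
    return comparisons_per_sec / n_cores / 1e6


clocks = [required_clock_mhz(500_000, 100, n) for n in (1, 2, 5, 10)]
# clocks == [50.0, 25.0, 10.0, 5.0]
```

Under this reading, 500,000 tuples/sec times 100 comparisons gives 50 million comparisons per second, and adding cores lowers the required clock rate proportionally.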

Summary

Handshake join

Window-based stream join

Flexible and scalable

Working well with FPGA
