How Soccer Players Would do Stream Joins
3/4/2015 @nobu_k
1
Who?
Nobuyuki Kubota (久保田展行, @nobu_k)
CTO@Preferred Networks America, Inc.
Speciality
DBMS, Search engine
Distributed Systems (consensus)
beatmania IIDX: Kaiden (皆伝) rank in both SP and DP (mainly DP)
2
How Soccer Players Would do Stream Joins
Jens Teubner, Rene Mueller, SIGMOD 2011
Handshake Join
Window-based stream joins supporting any join predicate
Very high degrees of parallelism
multi-core CPUs
FPGA
Massively Parallel Processor Arrays (MPPAs)
3
Joins
Joins (⋈) combine two or more relations (tables) in an RDBMS
A join is a cross product of relations followed by a selection (σ)
Many methods
Nested-loops joins
Sort-merge joins
(Recursive, hybrid) hash joins
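To make the definition above concrete, here is a minimal sketch in Python. The relation contents and the names `join_by_definition`, `nested_loops_join`, and `same_dept` are made up for illustration and do not come from the paper. It shows that a nested-loops join yields the same result as a cross product followed by a selection.

```python
from itertools import product


def join_by_definition(r, s, predicate):
    """Cross product of r and s, followed by a selection on the predicate."""
    return [(x, y) for x, y in product(r, s) if predicate(x, y)]


def nested_loops_join(r, s, predicate):
    """The same join written as two explicit loops."""
    out = []
    for x in r:          # outer relation
        for y in s:      # inner relation
            if predicate(x, y):
                out.append((x, y))
    return out


# Hypothetical example data: (name, dept_id) and (dept_id, dept_name).
employees = [("alice", 10), ("bob", 20), ("carol", 10)]
departments = [(10, "db"), (20, "search")]


def same_dept(e, d):
    return e[1] == d[0]


result = nested_loops_join(employees, departments, same_dept)
for row in result:
    print(row)
```

Because the predicate is an arbitrary function, this form supports any join condition, at the cost of comparing every pair.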
4
Stream Joins
Problems
Unbounded "infinite" input data
Solution: (sliding) window-based joins
tuple-based/time-based
Latency of the output
Solution: online, symmetric evaluation
How can it be scalable?
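The two solutions listed above can be combined in a few lines. The following is a rough single-threaded sketch, assuming a tuple-based window; the class name `SlidingWindowJoin` and its interface are invented for this example. Each arriving tuple is matched against the other stream's window immediately (online, symmetric), and the oldest tuple drops out once the window is full.

```python
from collections import deque


class SlidingWindowJoin:
    """Symmetric, tuple-based sliding-window join over streams R and S."""

    def __init__(self, window_size, predicate):
        # deque(maxlen=...) silently evicts the oldest tuple when full.
        self.windows = {
            "R": deque(maxlen=window_size),
            "S": deque(maxlen=window_size),
        }
        self.predicate = predicate

    def insert(self, side, item):
        """Add item to stream `side` ("R" or "S"); return the new matches."""
        if side == "R":
            matches = [(item, s) for s in self.windows["S"]
                       if self.predicate(item, s)]
        else:
            matches = [(r, item) for r in self.windows["R"]
                       if self.predicate(r, item)]
        self.windows[side].append(item)
        return matches


join = SlidingWindowJoin(window_size=2, predicate=lambda r, s: r == s)
print(join.insert("R", 1))
print(join.insert("S", 1))
```

Every arrival scans the whole opposite window in one thread, which is exactly the part that needs to be parallelized to scale.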
5
Handshake Joins
Streams flow by each other in opposite directions.
Each core locally evaluates tuples.
[Figure: the two streams passing through Core 1, Core 2, and Core 3]
6
Evaluation Strategy
A newly arrived tuple is compared to all tuples of the other stream that reside in the same core.
Any comparison algorithm (predicate) can be used.
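The evaluation strategy can be sketched as a sequential simulation. This is not the paper's implementation: the function name `handshake_join`, the fixed per-core capacity, and the rule "forward the oldest tuple when a core overflows" are assumptions made for this example. Processing one arrival at a time also sidesteps the synchronization issues that a real parallel version has to solve.

```python
class Core:
    def __init__(self):
        self.r = []  # R tuples in this core, oldest first
        self.s = []  # S tuples in this core, oldest first


def handshake_join(events, n_cores, per_core_capacity, predicate):
    """events: iterable of ("R" | "S", tuple). Returns matched (r, s) pairs."""
    cores = [Core() for _ in range(n_cores)]
    results = []

    def enter_r(i, t):
        # R flows left to right: compare with the S tuples held by core i.
        core = cores[i]
        results.extend((t, s) for s in core.s if predicate(t, s))
        core.r.append(t)
        if len(core.r) > per_core_capacity:
            oldest = core.r.pop(0)
            if i + 1 < n_cores:
                enter_r(i + 1, oldest)  # hand over to the right neighbor
            # else: the tuple leaves the window at the right end

    def enter_s(i, t):
        # S flows right to left: compare with the R tuples held by core i.
        core = cores[i]
        results.extend((r, t) for r in core.r if predicate(r, t))
        core.s.append(t)
        if len(core.s) > per_core_capacity:
            oldest = core.s.pop(0)
            if i - 1 >= 0:
                enter_s(i - 1, oldest)  # hand over to the left neighbor
            # else: the tuple leaves the window at the left end

    for side, t in events:
        if side == "R":
            enter_r(0, t)
        else:
            enter_s(n_cores - 1, t)
    return results


events = [("R", 1), ("S", 1), ("R", 2), ("S", 2)]
print(handshake_join(events, n_cores=2, per_core_capacity=1,
                     predicate=lambda r, s: r == s))
```

In this sketch a pair is compared at most once, at the moment one of the two tuples enters the core that holds the other. Pairs that have not met yet when the input ends are simply not reported.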
7
Strategies
Lock Step Forwarding
Two-Phase Forwarding using Async-MQ
Asymmetric protocol
Synchronization
[Figure: tuples a and b crossing between neighboring cores]
a and b can miss each other.
8
Two-Phase Forwarding Using Asynchronous Message Queue
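[Figure: tuples a and b exchanged between two neighboring cores through FIFO queues]
After forwarding tuple b, the sending core keeps a copy of b with a special mark.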
9
Two-phase forwarding: ACK
When the left core receives tuple b, it sends an ack to the right core before sending any other tuples.
The right core deletes b when it receives the ack.
10
Two-Phase Forwarding: when a and b miss each other
Tuple a will be compared to the tuples in the right core, including b.
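The two cases on these slides can be replayed with a small simulation. This is a sketch under simplifying assumptions: two cores only, one tuple per stream, and hypothetical names (`simulate`, `right_marked`, and so on). It checks that the pair (a, b) is compared exactly once, whether a leaves the left core before b arrives or after.

```python
from collections import deque


def simulate(a_leaves_before_b_arrives):
    left_r = ["a"]      # R tuples held by the left core
    right_s = ["b"]     # S tuples held by the right core
    right_marked = []   # forwarded S tuples still waiting for an ack
    to_left = deque()   # FIFO queue: right core -> left core
    to_right = deque()  # FIFO queue: left core -> right core
    compared = []

    # The right core forwards b but keeps a marked copy of it.
    right_s.remove("b")
    right_marked.append("b")
    to_left.append("b")

    def left_forwards_a():
        left_r.remove("a")
        to_right.append(("tuple", "a"))

    def left_receives_b():
        b = to_left.popleft()
        compared.extend((a, b) for a in left_r)
        # The ack is queued before any tuple the left core sends later.
        # (Storing b in the left core's own S window is omitted here.)
        to_right.append(("ack", b))

    if a_leaves_before_b_arrives:
        left_forwards_a()   # a and b cross in the queues
        left_receives_b()
    else:
        left_receives_b()
        left_forwards_a()

    # The right core drains its queue in FIFO order.
    while to_right:
        kind, t = to_right.popleft()
        if kind == "ack":
            right_marked.remove(t)  # delete the marked copy
        else:
            compared.extend((t, s) for s in right_s + right_marked)
    return compared


print(simulate(True))
print(simulate(False))
```

The FIFO order is what makes this work: when a is sent after the ack, the marked copy of b is already gone by the time a arrives, so the pair is not compared a second time.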
11
Load Balancing
Automatic load balancing without centralized control.
Each core can handle an arbitrary number of tuples.
[Figure: Core 1, Core 2, Core 3]
12
Software Implementation
AMD Opteron 6174 2.2GHz
libnuma
[Figure quoted from page 8 of the paper]
13
Scalability
[Figures quoted from pages 8 and 9 of the paper]
14
FPGA Implementation
"Assume the system has to provide a throughput of 500 ktuples/sec with a window size of 100 tuples. Configurations with 1, 2, 5, and 10 join cores can guarantee this throughput if operated at clock frequencies of 50, 25, 10, or 5 MHz, respectively." (from page 10)
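The quoted frequencies are consistent with a simple back-of-the-envelope model. The model is an assumption made here, not something stated on the slide: each arriving tuple needs one comparison per tuple in the window, each join core performs one comparison per clock cycle, and the work divides evenly across cores. The function name `required_clock_hz` is likewise invented.

```python
def required_clock_hz(tuples_per_sec, window_size, n_cores):
    """Clock rate per core under the assumed one-comparison-per-cycle model."""
    comparisons_per_sec = tuples_per_sec * window_size
    return comparisons_per_sec / n_cores


for n in (1, 2, 5, 10):
    mhz = required_clock_hz(500_000, 100, n) / 1e6
    print(f"{n:2d} core(s): {mhz:g} MHz")
```

Under this model the required clock rate falls in inverse proportion to the number of cores, which matches the 50, 25, 10, and 5 MHz figures in the quote.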
15
Summary
Handshake join
Window-based stream join
Flexible and scalable
Works well with FPGAs
16