A Row-Permutated Data Reorganization Algorithm for Growing Server-less VoD Systems
Presented by Ho Tsz Kin
Agenda
Background
Existing solutions
Row-Permutated (RP) Algorithm
Multi-RP Algorithm
Performance Evaluation
Conclusion
Background
Each node stores a balanced share of the video data blocks
New nodes join the system over time
Data must be reorganized to utilize the storage and streaming capacity of the new nodes
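[Figure: 20 blocks placed on four nodes, then reorganized when node n4 joins.]
Before (4 nodes): n0: 0 4 8 12 16; n1: 1 5 9 13 17; n2: 2 6 10 14 18; n3: 3 7 11 15 19
After (5 nodes): n0: 0 5 10 15; n1: 1 6 11 16; n2: 2 7 12 17; n3: 3 8 13 18; n4: 4 9 14 19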
Background: Data reorganization
Requires moving data blocks between nodes
Consumes network bandwidth
Should not disrupt ongoing services
Should achieve storage and streaming load balance
Existing Solutions: Round-robin Reorganization
Round-robin placement policy
Advantage: perfect storage and streaming load balance
Drawback: nearly all the data blocks must be moved when a node joins
[Figure: round-robin placement of blocks 0 to 19 on nodes n0 to n3, and the round-robin placement on n0 to n4 after node n4 joins.]
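The cost of round-robin reorganization can be checked with a few lines of Python. This is a minimal sketch, assuming block b is stored on node b mod N, which matches the 20-block figure; the function names are illustrative and not taken from the presentation.

```python
def round_robin_node(block: int, num_nodes: int) -> int:
    """Node that holds a block under round-robin placement (assumed: block mod N)."""
    return block % num_nodes


def round_robin_moves(num_blocks: int, old_nodes: int, new_nodes: int) -> list:
    """Blocks whose node changes when the system grows from old_nodes to new_nodes."""
    return [
        b
        for b in range(num_blocks)
        if round_robin_node(b, old_nodes) != round_robin_node(b, new_nodes)
    ]


# The example from the figure: 20 blocks, growing from 4 to 5 nodes.
moved = round_robin_moves(20, 4, 5)
print(len(moved), "of 20 blocks move")
```

In this example only blocks 0 to 3 keep their node, which illustrates why the overhead of round-robin reorganization is close to the maximum.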
Existing Solutions: Randomized Reorganization
Randomized placement policy
Blocks are distributed to nodes randomly
[Figure: each block is assigned to a node with equal probability. Example placement: n0: 3 8 9 15 16; n1: 1 2 4 13 17; n2: 0 6 11 12 19; n3: 5 7 10 14 18]
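Existing Solutions: Reorganization Algorithm
Number of nodes after the join: N
Probability that a block stays on its current node: $P = 1 - \frac{1}{N} = \frac{N-1}{N}$
Probability that a block moves to the new node: $P = \frac{1}{N}$
[Figure: the randomized placement on n0 to n3 (n0: 3 8 9 15 16; n1: 1 2 4 13 17; n2: 0 6 11 12 19; n3: 5 7 10 14 18) with the new node n4; each block is labeled with the two probabilities.]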
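The randomized reorganization step can be sketched as follows. The sketch assumes that N counts the nodes after the join, so each block moves to the new node with probability 1/N and otherwise stays where it is. The function names, the seed, and the choice of 4 initial nodes are illustrative assumptions; the 4000-block count is borrowed from the experiment setup.

```python
import random


def randomized_placement(num_blocks, num_nodes, rng):
    """Assign every block to a node chosen uniformly at random."""
    return [rng.randrange(num_nodes) for _ in range(num_blocks)]


def randomized_join(placement, new_node, rng):
    """Move each block to new_node with probability 1/N, N = node count after the join."""
    n_after = new_node + 1  # assumes nodes are numbered 0 .. new_node
    new_placement = list(placement)
    moved = []
    for block in range(len(placement)):
        if rng.random() < 1.0 / n_after:
            new_placement[block] = new_node
            moved.append(block)
    return new_placement, moved


rng = random.Random(1)  # fixed seed so the run is repeatable
before = randomized_placement(4000, 4, rng)
after, moved = randomized_join(before, new_node=4, rng=rng)
print(len(moved), "of 4000 blocks moved; the expected number is", 4000 // 5)
```

Only blocks bound for the new node are touched, which is why the overhead is minimal. Nothing in the procedure looks at rows, so a row can end up with several blocks on one node.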
Existing Solutions: Randomized Reorganization
Advantages: block movement is minimized; reasonable storage balance is achieved
Drawback: streaming load is imbalanced
[Figure: the randomized placement on n0 to n3 before node n4 joins (n0: 3 8 9 15 16; n1: 1 2 4 13 17; n2: 0 6 11 12 19; n3: 5 7 10 14 18) and the placement after some blocks move to n4, with one imbalanced row marked.]
Goal
Two extreme cases
  Round-robin reorganization: maximum overhead, balanced streaming load
  Randomized reorganization: minimum overhead, imbalanced streaming load
Two goals
  Maintain a balanced streaming load at a lower overhead than round-robin reorganization
  Allow a controllable tradeoff between overhead and streaming load balance
Row-Permutated (RP) Algorithm
Idea: the order of blocks within each row does not affect the streaming load
Row-permutated placement policy
Streaming load is still balanced
[Figure: one row on nodes n0 n1 n2 n3. Round-robin placement: 0 1 2 3. Row-permutated placement: 1 0 3 2. Both maintain a balanced streaming load.]
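A small sketch of the row-permutated placement idea: a placement is acceptable as long as every row puts at most one block on each node, whatever the order inside the row. The helper names and the use of a random shuffle to pick the permutation are assumptions made for illustration.

```python
import random


def row_permutated_placement(num_blocks, num_nodes, rng):
    """Place each row of num_nodes consecutive blocks on a random permutation of the nodes."""
    placement = []
    for start in range(0, num_blocks, num_nodes):
        nodes = list(range(num_nodes))
        rng.shuffle(nodes)
        # The last row may be shorter than num_nodes.
        placement.extend(nodes[: num_blocks - start])
    return placement


def is_row_balanced(placement, num_nodes):
    """True if no row stores two of its blocks on the same node."""
    for start in range(0, len(placement), num_nodes):
        row = placement[start:start + num_nodes]
        if len(set(row)) != len(row):
            return False
    return True


round_robin = [b % 4 for b in range(20)]
permutated = row_permutated_placement(20, 4, random.Random(0))
print(is_row_balanced(round_robin, 4), is_row_balanced(permutated, 4))
```

Both placements pass the check, while a row such as 0 0 1 2 would fail it because node n0 would have to send two blocks in the same round.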
Row-Permutated (RP) Algorithm: Reorganization Algorithm
Reorganize one row per iteration
Identify overflow and underflow nodes for the row
  Overflow: the node holds more than 1 block of the row
  Underflow: the node holds no block of the row
Move excess blocks from overflow nodes to underflow nodes
[Figure: placement n0: 0 7 8 13 16; n1: 1 4 10 12 17; n2: 3 5 9 14 19; n3: 2 6 11 15 18; n4: empty. The target row of the current iteration is highlighted, with its overflow node, underflow node and excess block marked.]
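The reorganization steps above can be sketched in Python. This is an illustrative sketch, not the presenter's implementation: it assumes that after the join a row consists of N consecutive blocks (N being the new node count), and it picks the excess block and the target underflow node in a fixed order, which the slides do not specify.

```python
from collections import defaultdict


def rp_reorganize(placement, num_nodes):
    """Rebalance one row at a time; placement[b] is the node that stores block b."""
    new_placement = list(placement)
    moves = []  # (block, source node, destination node)
    for start in range(0, len(placement), num_nodes):
        row = range(start, min(start + num_nodes, len(placement)))
        by_node = defaultdict(list)
        for block in row:
            by_node[new_placement[block]].append(block)
        # Underflow nodes hold no block of this row.
        underflow = [n for n in range(num_nodes) if n not in by_node]
        for node in sorted(by_node):
            blocks = by_node[node]
            # Overflow nodes hold more than one block of this row.
            while len(blocks) > 1 and underflow:
                block = blocks.pop()        # assumed choice: the last block is the excess
                target = underflow.pop(0)   # assumed choice: lowest-numbered underflow node
                new_placement[block] = target
                moves.append((block, node, target))
    return new_placement, moves


# Placement from the figure: four nodes hold 20 blocks, then node n4 (index 4) joins.
node_blocks = {0: [0, 7, 8, 13, 16], 1: [1, 4, 10, 12, 17],
               2: [3, 5, 9, 14, 19], 3: [2, 6, 11, 15, 18]}
placement = [None] * 20
for node, blocks in node_blocks.items():
    for b in blocks:
        placement[b] = node

new_placement, moves = rp_reorganize(placement, num_nodes=5)
print(len(moves), "blocks moved")
```

On this example the sketch moves 5 blocks and leaves every row with exactly one block per node, whereas re-striping 20 blocks round-robin from 4 to 5 nodes changes the node of 16 blocks.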
Row-Permutated (RP) Algorithm
Perfect streaming and storage balance
Significantly fewer block movements during reorganization than round-robin reorganization
[Figure: placement before node n4 joins (n0: 0 7 8 13 16; n1: 1 4 10 12 17; n2: 3 5 9 14 19; n3: 2 6 11 15 18) and the placement on n0 to n4 after RP reorganization.]
Multi-RP Algorithm
Tradeoff between overhead and streaming load balance
The window size, w, controls the streaming load balance
[Figure: example placement of 24 blocks on n0 to n3, and the blocks of one window on n0 to n4 after node n4 joins. With w = 2, two rows are considered together.]
Multi-RP Algorithm: Reorganization Algorithm
Reorganize w rows per iteration
Identify overflow and underflow nodes for the window
  Overflow: the node holds more than w blocks of the window
  Underflow: the node holds fewer than w blocks of the window
[Figure: one window with w = 2 on nodes n0 to n4, with the overflow nodes and underflow nodes marked.]
Multi-RP Algorithm
In each overflow node
  Choose the row with the largest number of blocks
  Take blocks in this row as excess blocks
Move each excess block to the underflow node that contains the smallest number of blocks in this row
[Figure: the window on nodes n0 to n4 before and after an excess block is moved to an underflow node; one step is labeled "randomly".]
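The multi-RP steps can be sketched in the same style. This is a sketch under stated assumptions rather than the presenter's code: a window is taken to be w consecutive rows of N blocks each, one excess block is moved at a time, and ties between equally good underflow nodes are broken at random. All names are hypothetical.

```python
import random


def multi_rp_reorganize(placement, num_nodes, w, rng):
    """Rebalance w rows at a time; placement[b] is the node that stores block b."""
    new_placement = list(placement)
    moves = []  # (block, source node, destination node)
    window = w * num_nodes
    for start in range(0, len(placement), window):
        end = min(start + window, len(placement))
        # held[node][row] lists the window's blocks of that row stored on that node.
        held = [dict() for _ in range(num_nodes)]
        for block in range(start, end):
            held[new_placement[block]].setdefault(block // num_nodes, []).append(block)

        def load(node):
            return sum(len(blocks) for blocks in held[node].values())

        def in_row(node, row):
            return len(held[node].get(row, []))

        while True:
            overflow = [n for n in range(num_nodes) if load(n) > w]
            underflow = [n for n in range(num_nodes) if load(n) < w]
            if not overflow or not underflow:
                break
            src = overflow[0]
            # Take the excess block from the row with the most blocks on this node.
            row = max(held[src], key=lambda r: len(held[src][r]))
            block = held[src][row].pop()
            # Send it to an underflow node with the fewest blocks of that row.
            fewest = min(in_row(n, row) for n in underflow)
            dst = rng.choice([n for n in underflow if in_row(n, row) == fewest])
            held[dst].setdefault(row, []).append(block)
            new_placement[block] = dst
            moves.append((block, src, dst))
    return new_placement, moves


# Same 20-block starting placement as in the RP example figure; node n4 (index 4) joins.
node_blocks = {0: [0, 7, 8, 13, 16], 1: [1, 4, 10, 12, 17],
               2: [3, 5, 9, 14, 19], 3: [2, 6, 11, 15, 18]}
placement = [None] * 20
for node, blocks in node_blocks.items():
    for b in blocks:
        placement[b] = node

new_placement, moves = multi_rp_reorganize(placement, num_nodes=5, w=2, rng=random.Random(0))
print(len(moves), "blocks moved")
```

With w = 2 the sketch moves 4 blocks on this example and leaves every node with exactly two blocks per window. Individual rows inside a window may still place two blocks on one node, which is the balance given up in exchange for the lower overhead.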
Multi-RP Algorithm
Idea: spread out blocks within each row
[Figure: three views of the window on nodes n0 to n4. The row with the largest number of blocks is marked, and its blocks are spread out over the underflow nodes.]
Performance Evaluation
Experiment details
  Number of data blocks = 4000
  System grows from 1 node to 200 nodes
Metrics
  Data reorganization overhead: number of block movements
  Streaming load balance: proportion of missing data blocks within one row, given that each node can send out only one block per round
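The two metrics can be computed from a placement as sketched here. The sketch assumes a row is N consecutive blocks, that a block is "missing" when its node already has to send another block of the same row in that round, and that the proportion is averaged over all rows. Those details, and the function names, are assumptions and are not spelled out on the slide.

```python
def reorganization_overhead(before, after):
    """Number of blocks whose node differs between two placements."""
    return sum(1 for old, new in zip(before, after) if old != new)


def missing_proportion(placement, num_nodes):
    """Fraction of blocks that cannot be sent if each node sends one block per row."""
    missing = 0
    for start in range(0, len(placement), num_nodes):
        row = placement[start:start + num_nodes]
        # Each distinct node delivers one block; the remaining blocks are missing.
        missing += len(row) - len(set(row))
    return missing / len(placement)


# Two rows on 4 nodes: the first row puts two blocks on node 0, the second is balanced.
example = [0, 0, 1, 2, 0, 1, 2, 3]
print(missing_proportion(example, 4))
print(reorganization_overhead([0, 1, 2, 3], [0, 1, 2, 4]))
```

A round-robin or row-permutated placement scores 0 on the second metric, because every row holds exactly one block per node.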
Data Reorganization Overhead
[Figure: log(reorganization overhead) (blocks), from 1 to 4, versus system size after reorganization (nodes), from 0 to 200. Curves: experimental randomized, theoretical randomized, round-robin, w=1, w=2, w=5, w=10.]
Streaming Load Balance
[Figure: proportion of missing stripe units, from 0 to 0.4, versus system size after reorganization (nodes), from 0 to 200. Curves: round-robin, w=1, experimental randomized, w=2, w=5, w=10.]
Conclusion
Identified the shortcomings of round-robin and randomized reorganization
Proposed RP and multi-RP reorganization
  Perfect streaming load balance with lower overhead
  Controllable tradeoff between overhead and streaming load balance