Datacast: A Scalable and Efficient Reliable Group Data Delivery Service for Data Centers Jiaxin Cao, Chuanxiong Guo, Guohan Lu, Yongqiang Xiong, Yixin Zheng, Yongguang Zhang, Yibo Zhu, Chen Chen University of Science and Technology of China Microsoft Research Asia Tsinghua University University of California, Santa Barbara University of Pennsylvania

Datacast: A Scalable and Efficient Reliable Group Data

Delivery Service for Data Centers

Jiaxin Cao, Chuanxiong Guo, Guohan Lu, Yongqiang Xiong, Yixin Zheng, Yongguang Zhang, Yibo Zhu, Chen Chen

University of Science and Technology of China

Microsoft Research Asia

Tsinghua University

University of California, Santa Barbara

University of Pennsylvania

Reliable Group Data Delivery

The problem of RGDD is:

In a data center network, given a data source, Src, and a set of receivers, R1, R2, …, Rn, how to reliably transmit bulk data from Src to all the receivers?

[Figure: data center topology with switches <0,0> to <0,3> and <1,0> to <1,3>, servers 00 to 33, and Data packets in flight.]

Reliable Group Data Delivery

RGDD is important in DCNs:

• Bootstrapping or OS upgrading.

• Distributed file systems, e.g., GFS.

• VM setup.

• And more...

Reliable Group Data Delivery

A good RGDD design should have the following properties:

• Scalable (large numbers of groups and large group sizes)

• High bandwidth efficiency

Existing solutions to RGDD

Existing solutions can be classified into two categories:

• Reliable IP multicast. Not scalable, e.g., ACK implosion.

• End-host based overlays. Low bandwidth efficiency.

None of the existing systems can perfectly achieve RGDD.

New opportunities in DCN

There are two clear recent trends in DCNs:

• Multiple edge-disjoint Steiner trees for RGDD.

• Practical packet caching abilities in network devices.

We can cache packets!

[Figure: topology with switches <0,0> to <0,3> and <1,0> to <1,3> connecting servers 00 to 33.]

The architecture of Datacast

[Figure: Datacast architecture. Labels: Fabric Manager, Network Topology, Master i, Master j, IMD, Src, receivers R1 to R4, and RGDD Groups i1, i2, …, in.]

How to calculate multiple Steiner trees?

How to efficiently transmit data in each Steiner tree?

Multiple edge-disjoint Steiner trees in DCN

Our multiple Steiner trees algorithm takes three steps:

1. Use specific algorithms to construct spanning trees.

2. Prune the spanning trees.

3. Use Breadth-First Search (BFS) to repair the trees broken by network failures.

This algorithm is fast, with running time O(k|V|) + O(|E|) + O(k|E|), and efficient.
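Steps 2 and 3 can be illustrated for a single tree with a minimal Python sketch. It assumes the spanning tree from step 1 is already given as a child-to-parent map, and the function names, the tiny example graph, and the failed link are all hypothetical. It does not enforce edge-disjointness across several trees, so it should be read as an illustration of pruning and BFS repair rather than as the Datacast algorithm itself.

```python
from collections import deque

def prune_to_steiner(parent, source, receivers):
    """Step 2 (sketch): keep only nodes on a path from a receiver to the source.

    `parent` maps each non-root node of a spanning tree to its parent.
    """
    keep = {source}
    for r in receivers:
        node = r
        while node not in keep:
            keep.add(node)
            node = parent[node]
    return {n: parent[n] for n in keep if n != source}

def bfs_repair(adj, tree, source, receivers, failed):
    """Step 3 (sketch): reattach receivers cut off by failed links using BFS."""
    def ok(u, v):
        return frozenset((u, v)) not in failed

    # Find the nodes still attached to the source through working tree edges.
    children = {}
    for child, par in tree.items():
        if ok(child, par):
            children.setdefault(par, []).append(child)
    attached, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for c in children.get(u, []):
            attached.add(c)
            stack.append(c)
    repaired = {c: p for c, p in tree.items() if c in attached}

    # BFS outward from the attached part over working links only.
    bfs_parent, seen = {}, set(attached)
    queue = deque(sorted(attached))
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in seen and ok(u, v):
                seen.add(v)
                bfs_parent[v] = u
                queue.append(v)

    # Splice the BFS path of every detached receiver back into the tree.
    for r in receivers:
        if r not in seen:
            raise ValueError(f"receiver {r} is unreachable")
        node = r
        while node != source and node not in repaired:
            repaired[node] = bfs_parent[node]
            node = bfs_parent[node]
    return repaired

# Hypothetical example: source s, switches a and b, receivers r1 and r2.
adj = {"s": {"a", "b"}, "a": {"s", "r1", "r2"}, "b": {"s", "r2", "x"},
       "r1": {"a"}, "r2": {"a", "b"}, "x": {"b"}}
spanning = {"a": "s", "b": "s", "r1": "a", "r2": "a", "x": "b"}
steiner = prune_to_steiner(spanning, "s", ["r1", "r2"])
repaired = bfs_repair(adj, steiner, "s", ["r1", "r2"],
                      failed={frozenset(("a", "r2"))})
```

In this example, pruning drops the unused branch through b and x, and the repair step routes r2 through b once the link between a and r2 fails.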

Datacast transport protocol

• Datacast is built on top of Content-Centric Networking (CCN):

[Figure: servers 00 to 33 exchanging Inst and Data packets.]
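To make the role of in-network caching concrete, here is a generic sketch of how a CCN-style node can serve repeated requests from a cache and merge concurrent requests for the same content. The class name, method names, return values, and the LRU eviction policy are assumptions made for illustration. It is not Datacast's packet format or its switch implementation.

```python
from collections import OrderedDict

class CachingNode:
    """Hypothetical CCN-style node: a small LRU cache plus pending requests."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of cached chunks
        self.cache = OrderedDict()    # chunk name -> payload, in LRU order
        self.pending = {}             # chunk name -> set of requesting faces

    def on_interest(self, name, face):
        """Handle a request for chunk `name` arriving on `face`."""
        if name in self.cache:
            self.cache.move_to_end(name)          # mark as recently used
            return ("reply", face, self.cache[name])
        if name in self.pending:
            self.pending[name].add(face)          # merge with earlier request
            return ("aggregated", face, None)
        self.pending[name] = {face}
        return ("forward", face, None)            # only the first goes upstream

    def on_data(self, name, payload):
        """Cache an arriving chunk and return the faces waiting for it."""
        self.cache[name] = payload
        self.cache.move_to_end(name)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)        # evict least recently used
        return sorted(self.pending.pop(name, set()))

node = CachingNode(capacity=2)
first = node.on_interest("chunk/0", "R1")    # forwarded upstream
second = node.on_interest("chunk/0", "R2")   # merged, not forwarded again
fanout = node.on_data("chunk/0", b"payload") # one copy fans out to both
late = node.on_interest("chunk/0", "R3")     # served from the cache
```

The point of the sketch is that only one copy of each chunk crosses the upstream link, however many receivers ask for it, and that a slower receiver can still be served from the cache as long as the chunk has not been evicted.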

Datacast transport protocol

• Datacast uses a rate-based congestion control algorithm at the source side.
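The algorithm listing from this slide is not reproduced in the text. As a generic illustration of rate-based additive-increase, multiplicative-decrease (AIMD) control at a sender, the sketch below adjusts a sending rate once per control interval. The class name, the step size, the backoff factor, the rate limits, and the boolean congestion signal are all assumptions, and Datacast's actual trigger and constants may differ.

```python
class RateAIMD:
    """Hypothetical rate-based AIMD controller, updated once per interval."""

    def __init__(self, rate_mbps, step_mbps=10.0, backoff=0.5,
                 min_rate_mbps=1.0, max_rate_mbps=1000.0):
        self.rate = rate_mbps
        self.step = step_mbps          # additive increase per interval (assumed)
        self.backoff = backoff         # multiplicative decrease factor (assumed)
        self.min_rate = min_rate_mbps
        self.max_rate = max_rate_mbps

    def on_interval(self, congested):
        """Update the sending rate at the end of one control interval."""
        if congested:
            self.rate = max(self.min_rate, self.rate * self.backoff)
        else:
            self.rate = min(self.max_rate, self.rate + self.step)
        return self.rate

ctrl = RateAIMD(rate_mbps=100.0)
rates = [ctrl.on_interval(False), ctrl.on_interval(False), ctrl.on_interval(True)]
```

With these illustrative constants, the rate climbs by 10 Mbps in each uncongested interval and halves when congestion is signalled.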

Datacast transport protocol

• We built a fluid model for the Datacast congestion control algorithm.

• The model tracks the current data positions of the data source and of the slowest receiver.

• R is the “desired” rate of the slowest receiver.

Datacast transport protocol

• Theorem 1 (Cache Usage). Datacast works at the full rate, i.e., the rate of the slowest receiver, R, if the cache size, C, satisfies [bound missing].

When R is 100 Mbps, [symbol missing] is 5 Mbps, and T is 1 ms, C only needs to be greater than 125 KB.

Datacast transport protocol

• Theorem 2 (Duplicate Data Ratio). When Datacast works at the full rate, the duplicate data ratio of Datacast is [expression missing].

When R is 100 Mbps, [symbol missing] is 5 Mbps, T is 1 ms, and MTU is 1.5 KB, the duplicate data ratio is 1.19%.

Simulation: multiple Steiner trees algorithm

• We tested our algorithm in Fattree(24, 3), BCube(8, 3), and Torus(16, 3) under link failure rates (LFR) of 1%, 3%, and 5%.

[Figures: running times; Steiner tree numbers.]

Simulation: Datacast congestion control

Simulation setup:

• BCube(4, 1) with 1 Gbps links.

• The link from <0,0> to 02 is slowed down to 100 Mbps.

• [parameter missing], T = 1 ms

• MTU = 1.5 KB

[Figure: BCube(4, 1) topology with switches <0,0> to <0,3> and <1,0> to <1,3> and servers 00 to 33, showing Steiner Tree 1 and Steiner Tree 2.]

Simulation: Datacast congestion control

• Based on Theorem 1, Datacast needs 125 KB caches to work at full rate.

• Based on Theorem 2, the duplicate data ratio is 1.19%.

Cache Size (KB)   Throughput (Mbps)   Duplicate Data Ratio (%)
8                 91.380              1.15
32                95.076              1.14
128               98.799              1.11
512               98.799              1.10
2048              98.799              1.12

Simulation: Datacast congestion control

• Comparison with BitTorrent.

[Figures: Fattree; BCube; Torus.]

Experiment: Datacast congestion control

• We built Datacast on the ServerSwitch platform and evaluated it in a BCube(4, 1).

• [symbol missing] = 5 Mbps, T = 1 ms, MTU = 8 KB.

• Based on Theorem 1, Datacast works at the full rate when C > 256 KB.

  • When C = 64 KB, the average throughput is 91.998 Mbps.

  • When C = 256 KB, the average throughput is 99.595 Mbps.

• Based on Theorem 2, the duplicate data ratio should be 3.48%.

  • When Datacast works at the full rate, the measured duplicate data ratio is 3.10%.

Experiment: Datacast congestion control

• We compare Datacast with BitTorrent, using both to transmit 4 GB of data.

             Finish time (s)   Link stress
Datacast     16.9              1.01
BitTorrent   52                1.39
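The two rows can be turned into relative figures with a few lines of arithmetic. The variable names are illustrative, and reading link stress as the average number of copies of each packet carried per link is an assumption about the metric, not something the slide states.

```python
# Values copied from the comparison table above.
datacast_finish_s, bittorrent_finish_s = 16.9, 52.0
datacast_stress, bittorrent_stress = 1.01, 1.39

# How many times faster Datacast finished the same 4 GB transfer.
speedup = bittorrent_finish_s / datacast_finish_s

# Relative extra link load of BitTorrent compared with Datacast.
extra_stress = bittorrent_stress / datacast_stress - 1

print(f"speedup: {speedup:.2f}x, extra link stress: {extra_stress:.1%}")
```

Under that reading, Datacast finishes roughly three times sooner while BitTorrent places a bit more than a third more load on the links.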

Related work

• Reliable IP multicast

  • PGM congestion control (pgmcc)

  • Active Reliable Multicast (ARM)

• End-host based overlays

  • SplitStream

  • End System Multicast

  • Cornet

Conclusion

In this paper, we propose Datacast, which:

• Calculates multiple edge-disjoint Steiner trees in DCNs

• Uses CCN to turn hard group state into soft packet caching

• Uses a simple rate-based AIMD congestion control algorithm to achieve high efficiency

Datacast is scalable and achieves high bandwidth efficiency.

Thank you!