Multicast Data Dissemination
Wang Lam
Special University Oral Examination
7 July 2004
2
Contents
Current multicast networks
Contributions
– Data scheduling
– Network issues
Related and future work
Conclusion
3
Current multicast networks
Traditional data service: one-to-one
Multicast networks: one-to-many
IP: multicast group addresses (IPv4: 224.0.0.0 - 239.255.255.255; IPv6: FF00::/8)
Network bottlenecks
Client joins
Unreliable delivery
Datagrams (UDP)
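As a rough illustration of what a client join looks like at the socket level, the sketch below subscribes a UDP socket to an IPv4 multicast group using the standard socket options. The group address 239.1.2.3 and port 5007 are arbitrary placeholders, not values from the talk, and the helper names are hypothetical.

```python
import ipaddress
import socket
import struct

GROUP = "239.1.2.3"  # hypothetical group address inside 224.0.0.0 - 239.255.255.255
PORT = 5007          # hypothetical port


def membership_request(group: str) -> bytes:
    """Build the ip_mreq structure: group address followed by 'any' local interface."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))


def join(group: str = GROUP, port: int = PORT) -> socket.socket:
    """Open a UDP socket and ask the kernel to join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    # sock.recvfrom(...) now yields datagrams sent to the group; delivery is
    # best effort, so lost datagrams are simply never seen.
    return sock
```

Calling join() requires a network interface that supports multicast; membership_request() can be exercised on its own.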
4
Multicast data dissemination
Client joins
Unreliable delivery
Datagrams
Supports varying bandwidth clients
All requested data must arrive
Data arranged to optimize performance
5
Principal contributions
Data scheduling
– Minimize delay for clients requesting many items
– Scheduling for subscribers and downloaders
Networking issues
– Reliable delivery
– Splitting bandwidth into channels
6
Principal contributions
Data scheduling
– Scheduling for subscribers and downloaders
7
Subscribers and downloaders
Data scheduling
– Scheduling for subscribers and downloaders
  • Distributing data for a Web repository
  • Metrics and techniques
  • Sample results
8
The multicast source
Stanford WebBase
100+ million Web pages
Additional benefits of multicast
[Diagram: WWW, crawler, repository, indexing and analysis, multicast server, clients]
9
A multicast facility
Clients issue requests to server
Clients listen to shared multicast
Server schedules data onto multicast
Downloaders and subscribers
[Diagram: multicast server, clients]
10
Clients request multiple items
Broadcast disks: one-item “response time”
Multicast: client delay is different
Subscribers: freshness and age
[Figure: requests of clients A to F for data items w, x, y, z, shown under two transmission orders (w x y z and y w x z)]
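To make the contrast with one-item response time concrete, here is a minimal sketch of client delay for multi-item requests: a client is done only when the last of its requested items has arrived. It assumes unit-length transmission slots and a schedule that sends each item once; the request sets are invented for illustration and are not the ones in the slide's figure.

```python
def client_delays(schedule, requests):
    """Return {client: 1-based slot in which its last requested item arrives}."""
    arrival = {item: slot for slot, item in enumerate(schedule, start=1)}
    return {client: max(arrival[item] for item in items)
            for client, items in requests.items()}


def average_delay(schedule, requests):
    delays = client_delays(schedule, requests)
    return sum(delays.values()) / len(delays)


# Hypothetical requests (client -> set of items), for illustration only.
requests = {"A": {"w", "y"}, "B": {"y"}, "C": {"x", "z"}}

for order in (["w", "x", "y", "z"], ["y", "w", "x", "z"]):
    print(order, client_delays(order, requests), average_delay(order, requests))
```

The same items sent in a different order give different client delays, which is why the order of transmission matters once clients request more than one item.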
11
Example scheduler: Circ
Arbitrarily order data items
Send requested data
      w   x   y   z
G     •   •   •
H         •   •
I         •
J                 •
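A minimal sketch of a Circ-style scheduler, under the assumption that it cycles through a fixed item order and skips items nobody is waiting for. The request sets are my reading of the slide's figure (clients G to J, items w to z); the function name and the choice of order are hypothetical.

```python
from itertools import cycle


def circ_schedule(item_order, requests):
    """Cycle through item_order, sending an item only if some client still needs it."""
    pending = {client: set(items) for client, items in requests.items()}
    unknown = set().union(*pending.values()) - set(item_order)
    if unknown:
        raise ValueError(f"requested items missing from the order: {sorted(unknown)}")
    sent = []
    for item in cycle(item_order):
        if not any(pending.values()):   # every request satisfied
            break
        if any(item in need for need in pending.values()):
            sent.append(item)
            for need in pending.values():
                need.discard(item)      # all listeners receive the multicast
    return sent


REQUESTS = {"G": {"w", "x", "y"}, "H": {"x", "y"}, "I": {"x"}, "J": {"z"}}
print(circ_schedule(["w", "x", "y", "z"], REQUESTS))
```

This sketch assumes lossless delivery and no new requests arriving while the schedule runs.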
12
Example scheduler: Pop
Send most requested data
      w   x   y   z
G     •   •   •
H         •   •
I         •
J                 •
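A Pop-style scheduler can be sketched as repeatedly sending whichever item currently has the most outstanding requesters. The request sets are my reading of the slide's figure; breaking ties alphabetically is an assumption, since the slide does not say how ties are resolved.

```python
def pop_schedule(requests):
    """Repeatedly send the item with the most outstanding requesters."""
    pending = {client: set(items) for client, items in requests.items()}
    sent = []
    while any(pending.values()):
        counts = {}
        for need in pending.values():
            for item in need:
                counts[item] = counts.get(item, 0) + 1
        # Highest count first; ties broken alphabetically (an assumption).
        item = min(counts, key=lambda i: (-counts[i], i))
        sent.append(item)
        for need in pending.values():
            need.discard(item)
    return sent


REQUESTS = {"G": {"w", "x", "y"}, "H": {"x", "y"}, "I": {"x"}, "J": {"z"}}
print(pop_schedule(REQUESTS))
```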
13
Example scheduler: R/Q
Number of requesting clients
Smallest request size
          w   x   y   z
G         •   •   •
H             •   •
I             •
J                     •
clients   1   3   2   1
minreq    3   1   2   1
14
Example scheduler: R/Q
Number of requesting clients
Smallest request size
          w   x   y   z
G         •       •
H                 •
I
J                     •
clients   1   0   2   1
minreq    2   0   1   1
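The two R/Q slides can be replayed with a short sketch, reading the scheduler's name as the ratio R/Q, where R is the number of clients still requesting an item (the "clients" row) and Q is the smallest remaining request size among those clients (the "minreq" row). The request sets are my reading of the figure, and alphabetical tie-breaking is an assumption.

```python
from fractions import Fraction


def rq_scores(pending):
    """Score each outstanding item as R/Q (exact fractions avoid rounding ties)."""
    scores = {}
    for item in set().union(*pending.values()):
        wanting = [need for need in pending.values() if item in need]
        r = len(wanting)                         # number of requesting clients
        q = min(len(need) for need in wanting)   # smallest remaining request size
        scores[item] = Fraction(r, q)
    return scores


def rq_schedule(requests):
    """Repeatedly send the item with the highest R/Q score."""
    pending = {client: set(items) for client, items in requests.items()}
    sent = []
    while any(pending.values()):
        scores = rq_scores(pending)
        # Highest score first; ties broken alphabetically (an assumption).
        item = min(scores, key=lambda i: (-scores[i], i))
        sent.append(item)
        for need in pending.values():
            need.discard(item)
    return sent


REQUESTS = {"G": {"w", "x", "y"}, "H": {"x", "y"}, "I": {"x"}, "J": {"z"}}
print(rq_scores(REQUESTS))
print(rq_schedule(REQUESTS))
```

With these requests the first-round scores are 1/3, 3, 1, and 1 for w, x, y, and z, so x is sent first, which agrees with the second slide, where x has no remaining requesters.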
15
Some results for subscribers
Choice of scheduler depends on performance metric
Update frequency has little effect
16
Downloaders and subscribers
[Figure: average download client delay (hours, 0 to 800) versus number of downloaders (25 to 200) for the Circ, R/Q, RxC, and Pop schedulers]
17
Downloaders and subscribers
[Figure: average freshness over clients (0.83 to 0.94) versus number of downloaders (25 to 200) for the Circ, R/Q, RxC, and Pop schedulers]
18
Downloaders and subscribers
[Figure: average download client delay (hours) and average freshness over clients, each versus number of downloaders (25 to 200), for the Circ, R/Q, RxC, and Pop schedulers]
19
Summary
Differences from broadcast disks
Downloaders and subscribers
Studied design tradeoffs for various metrics and techniques
20
Principal contributions
Data scheduling
– Minimize delay for clients requesting many items
– Scheduling for subscribers and downloaders
Networking issues
– Reliable delivery
– Splitting bandwidth into channels
21
Principal contributions
Networking issues
– Reliable delivery
22
Principal contributions
Networking issues
– Reliable delivery
  • Multicast server model
  • Reliability techniques
  • Sample results
  • Other challenges
23
The multicast source
Stanford WebBase
100+ million Web pages
Network loss <5% to >20%
[Diagram: WWW, crawler, repository, indexing and analysis, multicast server, clients]
24
A multicast facility
Clients issue requests to server
Clients listen to shared multicast
Server schedules data onto multicast
Data channel unreliable
[Diagram: multicast server, clients]
25
Forward Error Correction
Compute fixed fraction of redundant data
Reconstruct from subset of bits
Vary padding by item
[Diagram: clients send requests; server sends data followed by FEC]
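The idea of reconstructing from a subset of what was sent can be illustrated with the simplest erasure code: one XOR parity packet per group of k equal-length data packets, giving a fixed redundancy of 1/k. This is a deliberately simplified stand-in, not the code used in the work presented; practical systems typically use stronger codes that survive more than one loss per group.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets):
    """Append one parity packet: the XOR of all (equal-length) data packets."""
    return list(packets) + [reduce(xor_bytes, packets)]


def recover(received):
    """Repair an encoded group in which exactly one packet was lost (None)."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        raise ValueError("a single-parity code repairs exactly one loss")
    present = [packet for packet in received if packet is not None]
    repaired = list(received)
    # XOR of all surviving packets equals the missing one.
    repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]  # drop the parity packet, keep the data


data = [b"abcd", b"efgh", b"ijkl"]
coded = encode(data)
coded[1] = None            # simulate one lost datagram
print(recover(coded) == data)
```

Any k of the k + 1 packets are enough to rebuild the group, so a client that loses one packet needs no retransmission.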
29
Forward Error Correction
Compute fixed fraction of redundant data
Reconstruct from subset of bits
Vary padding by item
[Diagram label: FEC(0.2R)]
30
Retransmission
Wait for NAK
Queue retransmission of enough bits
Queue only on selected NAKs
[Diagram: clients send requests; server sends data]
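A minimal sketch of NAK-driven retransmission on a shared channel: the server gathers negative acknowledgements, keeps only those it chooses to honor, and queues each missing packet once, since a single multicast retransmission reaches every client that lost it. The NAK format (client name mapped to missing sequence numbers) and the accept predicate are hypothetical; the slides do not say how NAKs are selected.

```python
def packets_to_retransmit(naks, accept=lambda client: True):
    """Return the sorted sequence numbers to queue for retransmission.

    naks:   {client: set of missing sequence numbers}  (hypothetical format)
    accept: predicate choosing which clients' NAKs to honor (hypothetical)
    """
    missing = set()
    for client, lost in naks.items():
        if accept(client):
            missing |= lost   # duplicates collapse: one resend serves all clients
    return sorted(missing)


naks = {"a": {2, 5}, "b": {5, 7}, "c": {9}}
print(packets_to_retransmit(naks))
print(packets_to_retransmit(naks, accept=lambda client: client != "c"))
```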
37
Retransmission
Wait for NAK
Queue retransmission of enough bits
Queue only on selected NAKs
[Diagram labels: NAK, R(1)]
38
Rescheduling
Do nothing
Rerequest data item
Combine with prior reliability schemes
[Diagram: clients send requests; server sends data]
40
Clients of Uniform Loss Rates
[Figure: average download client delay (seconds, 400 to 1000) versus client loss (0 to 10% of packets) for FEC(0)+R(0), FEC(0)+R(1), FEC(0)+R(2), FEC(0)+R(10), and FEC(0)+R(inf)]
41
Clients of Tiered Loss Rates
[Figure: average download client delay (seconds, 0 to 9000) versus fraction of clients having high loss (0 to 1) for FEC(0.03R)+R(inf), FEC(0.03R)+R(0), and FEC(0.1)+R(inf)]
42
Clients of Tiered Loss Rates
[Figure: average download client delay (seconds, 0 to 9000) versus fraction of clients having high loss (0 to 1) for FEC(0.01R)+R(inf), FEC(0.03R)+R(inf), FEC(0.05R)+R(inf), FEC(0.01R)+R(0), FEC(0.03R)+R(0), FEC(0.05R)+R(0), and FEC(0.1)+R(inf)]
43
Additional results
Error-correcting packets help retransmissions
Variable FEC can outperform matched-rate FEC
Data-in-progress announcement can slightly help new clients
44
Summary
Multicast server scenario allows a variety of reliability techniques
Techniques form many combinations
Studied design tradeoffs
45
Principal contributions
Data scheduling
– Minimize delay for clients requesting many items
– Scheduling for subscribers and downloaders
Networking issues
– Reliable delivery
– Splitting bandwidth into channels
46
Publications
W. Lam and H. Garcia-Molina, “Multicasting a Data Repository,” WebDB 2001
W. Lam and H. Garcia-Molina, “Multicasting a Changing Repository,” ICDE 2003
W. Lam and H. Garcia-Molina, “Reliably Networking a Multicast Repository,” SRDS 2003
W. Lam and H. Garcia-Molina, “Slicing Broadcast Disks,” submitted for publication
W. Lam and H. Garcia-Molina, “Implementing Multicast Data Dissemination,” technical report
47
Publications (Stanford WebBase)
J. Cho, T. Haveliwala, W. Lam, S. Raghavan, A. Paepcke, and H. Garcia-Molina, “Stanford WebBase Components and Applications”
http://www-diglib.stanford.edu/~testbed/doc2/WebBase/
Web crawler and client code: ftp://db.stanford.edu/pub/digital_library/
48
Related work
Broadcast disks
Web caching
Publish/subscribe systems
Video on demand
Reliable multicast protocols
Layered multicast protocols
49
Next steps
Other kinds of clients
– On-the-fly processing
– Partially ordered clients
– Opportunistic clients
Distributed servers
Request mining
50
http://www.cs.stanford.edu/~wlam/compsci/
51
52
More challenges
Different network loss rates
Different download speeds
[Diagram: multicast server, clients]
53
More challenges
Different download speeds
– How many, how fast, how distributed?
[Diagram: multicast server, clients]
54
Related work
Data scheduling: Acharya, Franklin, Aksoy, Zdonik, et al.
Web caching: Bestavros, Rodriguez, et al.
Multicast networking: Floyd, Van Jacobson, McCanne, Miller, Almeroth, et al.
55
Related work
Multicast dissemination
– Acharya, Franklin, Zdonik, Vaidya, Hameed (broadcast disks)
– Almeroth, Ammar, Fei (Web service)
Reliable multicast networking
– Floyd, Van Jacobson, McCanne (SRM)
– Miller, Robertson, et al. (MFTP)
– Yajnik, Kurose, Towsley, et al. (Mbone)