Fault Tolerant Video-On-Demand Services (Tal Anker, Danny Dolev, Idit Keidar, The Transis Project)


Page 1: ppt

Transis

1

Fault Tolerant Video-On-Demand Services

Tal Anker, Danny Dolev, Idit Keidar,

The Transis Project

Page 2: ppt

VoD Service

• VoD: Full VCR control
• 1 video stream per client

[Figure: Client C1 sends requests to the VoD service provider, which holds the movies on disk(s) and returns a video stream.]

Page 3: ppt

High Availability

• Multiple servers – at different sites

• Fault tolerance:
  – servers can crash

• Managing the load:
  – new servers can be brought up / down
  – load should be re-distributed “on the fly”: migration of clients

Page 4: ppt

The challenges

• Low overhead

• Transparency
  – How do clients know whom to connect to? An “abstract” service
  – Clients should be unaware of migration

[Figure: Clients C1 and C2 connected to a VoD service made up of several servers, shown a second time with one of the servers marked as failed.]

Page 5: ppt

Buffer Management and Flow Control

• Overcome jitter, message re-ordering and migration periods

• Re-fill buffers quickly after migration
  – avoid buffer overflow

• Minimize buffers
  – minimize pre-fetch bandwidth

• Dynamically adjust transmission rate to client capabilities
  – Re-negotiation of QoS

Page 6: ppt

Features of our solution

• Use group communication in the control plane
  – connection establishment
  – fault tolerance and migration

• Flow control explicitly handles migration

• Low overhead
  – ~1/1000 of the bandwidth
  – Negligible memory and CPU overhead

• Commodity hardware and publicly available network technologies

Page 7: ppt

Environment

• Implementation
  – UDP/IP over 10 Mbit/s switched Ethernet
  – Transis
  – Sun Sparc and BSDI PCs as video servers
  – Win NT machines as video clients
  – MPEG1 & 2 hardware decoders

• Machine and Network Failures

Page 8: ppt

Implementing the abstract service

• Use group communication
  – clients communicate with a well-known group name (logical entity)
  – unaware of the number and identity of the servers in the group

• Servers periodically share information about clients (every 1/2 second)

• If a server crashes (or is overloaded), another server transparently takes over
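
The takeover step can be illustrated with a minimal Python sketch. It assumes every surviving server holds the same replicated table of client records and sees the same membership view, so each can compute the same reassignment locally. The record fields, the function name `reassign_orphans`, and the least-loaded rule with a name tie-break are hypothetical choices for illustration; the slides do not state how the new server is chosen.

```python
from dataclasses import dataclass


@dataclass
class ClientState:
    """Per-client record the servers are assumed to share periodically."""
    client_id: str
    movie: str
    frame_offset: int   # last frame position known for this client
    server: str         # server currently streaming to this client


def reassign_orphans(table, live_servers):
    """Return a new table in which clients of departed servers get a live one.

    Assumption: the policy is "least-loaded live server, ties broken by name".
    Iteration is in sorted order, so identical inputs give identical results
    on every server without extra coordination.
    """
    if not live_servers:
        raise ValueError("no live servers left in the group")

    # Count how many clients each live server already serves.
    load = {s: 0 for s in sorted(live_servers)}
    for state in table.values():
        if state.server in load:
            load[state.server] += 1

    new_table = {}
    for cid in sorted(table):
        state = table[cid]
        if state.server in load:
            new_table[cid] = state          # server still alive: unchanged
            continue
        target = min(load, key=lambda s: (load[s], s))
        load[target] += 1
        # The frame offset is carried over so streaming can resume in place.
        new_table[cid] = ClientState(cid, state.movie, state.frame_offset, target)
    return new_table
```

In this sketch the table is only as fresh as the last periodic exchange, which is one reason the client-side buffers have to cover the migration period.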

Page 9: ppt

Group Communication

• Reliable Group Multicast (Group Abstraction)

• Message Ordering

• Dynamic Reconfiguration

• Membership with Strong Semantics (Virtual Synchrony)

Systems: Transis, Horus, Ensemble, Totem, Newtop, RMP, ISIS, Psync, Relacs

Page 10: ppt

The group layout of the VoD service

[Figure: VISIO-vod-grp.vsd (image not available)]

Page 11: ppt

Transis Allows Simple Design

Group abstraction for connection establishment and transparent migration

Reliable group multicast allows servers to consistently share information

The membership service detects conditions for migration

Reliable messages for control

– Server takes ~2500 lines of C++ code
– Client takes ~4000 lines of C code (excluding GUI and display)

Page 12: ppt

Flow Control

• Feedback-based flow control (sparse):
  – FC messages are sent to the logical server (session group)
  – Clients determine the changes in the flow:

Range of buffer occupancy     | vs. prev | Freq   | Request
0 to critical low - 1         |          | urgent | emergency
critical low to low mark - 1  |          | urgent | up
low mark to high mark - 1     | < prev   | normal | up
low mark to high mark - 1     | > prev   | normal | down
high mark to full             |          | urgent | down
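
The occupancy ranges listed above can be read as a small decision function on the client. The following Python sketch is one possible reading: the function name `flow_request` and the parameter names are hypothetical, and returning `None` when occupancy is inside the normal band and equal to the previous value is an assumption, since the slide does not cover that case.

```python
def flow_request(occupancy, prev, critical_low, low_mark, high_mark):
    """Map buffer occupancy to a (frequency, request) pair.

    occupancy and prev are the current and previous buffer fill levels,
    in the same units as the three thresholds.
    """
    if occupancy < critical_low:
        return ("urgent", "emergency")   # 0 to critical low - 1
    if occupancy < low_mark:
        return ("urgent", "up")          # critical low to low mark - 1
    if occupancy < high_mark:            # low mark to high mark - 1
        if occupancy < prev:
            return ("normal", "up")      # buffer is draining
        if occupancy > prev:
            return ("normal", "down")    # buffer is filling
        return None                      # assumption: unchanged, send nothing
    return ("urgent", "down")            # high mark to full
```

For example, with thresholds 10, 30 and 70, an occupancy of 50 that has dropped from 60 yields a normal-frequency request to speed up, while an occupancy of 5 yields an emergency request.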

Page 13: ppt

Emergency Flow Control

• When the server receives an emergency message:
  – The server changes the fps rate:
      fps = latest-known-fps + emergency quantity

• The emergency quantity decays every second (by a factor)
  – While the quantity is above zero, the server ignores FC messages from the client
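
The server-side rule can be sketched in Python as follows. The class name `EmergencyRate`, the decay factor of 0.5, and the `floor` below which the quantity is snapped to zero are all assumptions made for illustration: the slide says only that the quantity decays by a factor every second, and a purely multiplicative decay would never reach zero without some cutoff or integer rounding.

```python
class EmergencyRate:
    """Tracks a stream's frame rate during and after an emergency."""

    def __init__(self, base_fps, decay=0.5, floor=1.0):
        self.base_fps = base_fps   # latest known fps
        self.quantity = 0.0        # current emergency quantity
        self.decay = decay         # assumed per-second decay factor
        self.floor = floor         # assumed cutoff below which quantity is 0

    @property
    def fps(self):
        # fps = latest-known-fps + emergency quantity
        return self.base_fps + self.quantity

    def on_emergency(self, quantity):
        """Handle an emergency message by boosting the rate."""
        self.quantity = quantity

    def on_flow_control(self, delta_fps):
        """Apply an ordinary FC request; return False if it was ignored."""
        if self.quantity > 0:
            return False           # emergency still active: ignore FC
        self.base_fps += delta_fps
        return True

    def tick(self):
        """Call once per second to decay the emergency quantity."""
        self.quantity *= self.decay
        if self.quantity < self.floor:
            self.quantity = 0.0
```

With a base rate of 30 fps and an emergency quantity of 8, this sketch sends at 38 fps, then 34, 32 and 31 over the following seconds, and accepts ordinary flow-control requests again once the quantity has dropped to zero.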

Page 14: ppt

Performance Measurements

• On HUJI Network (LAN)

• Servers at TAU and clients at HUJI (WAN)

• The measurements show the system is robust and support our transparency claims

Page 15: ppt

Software Buffers

[Figure: gnuplot graph 'vod100.stats.softb.ps' (image not available)]

Page 16: ppt

Hardware Buffers

[Figure: gnuplot graph 'vod100.stats.hardb.ps' (image not available)]

Page 17: ppt

Skipped Frames on LAN

[Figure: gnuplot graph 'vod100.stats.pl_skipped.ps' (image not available)]

Page 18: ppt

Skipped Frames on WAN

[Figure: gnuplot graph 'vod100.stats.pl_skipped.ps' (image not available)]

Page 19: ppt

Summary

• Scalable VoD service

• Load balancing

• Tolerating machine and network failure

• All the above are achieved practically for free:
  – ~1/1000 of the total bandwidth
  – Negligible memory and CPU overhead

Page 20: ppt

Thanks to ...

• Gregory Chockler

• The other members of the Transis project