Transis

1

Fault Tolerant Video-On-Demand Services

Tal Anker, Danny Dolev, Idit Keidar,

The Transis Project

VoD Service

• VoD: Full VCR control
• 1 video stream per client

[Diagram: Client C1 sends requests to the VoD service provider, which reads movies from disk(s) and returns a video stream.]

High Availability

• Multiple servers – at different sites

• Fault tolerance:
– servers can crash

• Managing the load:
– servers can be brought up or taken down
– load should be re-distributed “on the fly”
– migration of clients

The challenges

• Low overhead

• Transparency
– How do clients know whom to connect to? Through an “abstract” service
– Clients should be unaware of migration

[Diagram: Clients C1 and C2 connect to an abstract VoD Service backed by several servers; when one server fails, the remaining servers keep serving the clients.]

Buffer Management and Flow Control

• Overcome jitter, message re-ordering and migration periods

• Re-fill buffers quickly after migration
– avoid buffer overflow

• Minimize buffers
– minimize pre-fetch bandwidth

• Dynamically adjust transmission rate to client capabilities
– Re-negotiation of QoS

Features of our solution

• Use group communication in the control plane
– connection establishment
– fault tolerance and migration

• Flow control explicitly handles migration

• Low overhead
– ~1/1000 of the bandwidth
– Negligible memory and CPU overhead

• Commodity hardware and publicly available network technologies

Environment

• Implementation
– UDP/IP over 10 Mbit/s switched Ethernet
– Transis
– Sun SPARC and BSDI PCs as video servers
– Windows NT machines as video clients
– MPEG-1 and MPEG-2 hardware decoders

• Machine and Network Failures

Implementing the abstract service

• Use group communication
– clients communicate with a well-known group name (logical entity)
– clients are unaware of the number and identity of the servers in the group

• Servers periodically share information about clients (every 1/2 second)

• If a server crashes (or is overloaded), another server transparently takes over
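The takeover step can be illustrated with a minimal sketch. It assumes every surviving server holds the same replicated table of clients (built from the periodic updates) and sees the same membership change, so each can compute the same reassignment locally. The function name `reassign`, the table layout, and the least-loaded tie-breaking rule are illustrative assumptions, not the policy used by the Transis VoD service.

```python
from typing import Dict, List


def reassign(clients: Dict[str, dict], live: List[str]) -> Dict[str, str]:
    """Move clients of departed servers onto live ones (hypothetical policy).

    clients maps client id -> {"server": name, "offset": last known position}.
    The rule is deterministic, so every server that holds the same table and
    the same membership view reaches the same answer without extra messages.
    """
    if not live:
        return {}
    live = sorted(live)
    # Count how many clients each live server already serves.
    load = {s: 0 for s in live}
    for info in clients.values():
        if info["server"] in load:
            load[info["server"]] += 1
    moved = {}
    for cid in sorted(clients):
        if clients[cid]["server"] not in load:
            # Assumption: pick the least-loaded server, ties broken by name.
            target = min(live, key=lambda s: (load[s], s))
            clients[cid]["server"] = target
            load[target] += 1
            moved[cid] = target
    return moved


table = {
    "c1": {"server": "s1", "offset": 100},
    "c2": {"server": "s2", "offset": 50},
    "c3": {"server": "s1", "offset": 7},
}
moved = reassign(table, ["s2", "s3"])  # s1 has left the group
```

Because the last shared offset travels with the client record, the new server can resume the stream near the point where the old one stopped.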

Group Communication

• Reliable Group Multicast (Group Abstraction)

• Message Ordering

• Dynamic Reconfiguration

• Membership with Strong Semantics (Virtual Synchrony)

Systems: Transis, Horus, Ensemble, Totem, Newtop, RMP, ISIS, Psync, Relacs

The group layout of the VoD service

[Figure: Visio diagram 'VISIO-vod-grp.vsd' showing the group layout; image not included]

Transis Allows Simple Design

Group abstraction for connection establishment and transparent migration

Reliable group multicast allows servers to consistently share information

Membership service detects conditions for migration

Reliable messages for control

– Server takes ~2500 lines of C++ code
– Client takes ~4000 lines of C code (excluding GUI and display)

Flow Control

• Feedback-based flow control (sparse):
– FC messages are sent to the logical server (session group)
– The client determines the changes in the flow:

Value of buffer occupancy (range)     Freq      Request
0 – critical low-1                    urgent    emergency
critical low – low mark-1             urgent    up
low mark – high mark-1, < prev        normal    up
low mark – high mark-1, > prev        normal    down
high mark – full                      urgent    down
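The table above can be read as a small decision function on the client. The sketch below is one possible reading: the names `Thresholds` and `classify` are hypothetical, "prev" is taken to mean the previously observed occupancy, and returning `None` when occupancy equals the previous value inside the normal band is an assumption, since the slide does not cover that case.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Thresholds:
    critical_low: int
    low_mark: int
    high_mark: int
    full: int


def classify(occupancy: int, prev: int, t: Thresholds) -> Optional[Tuple[str, str]]:
    """Map buffer occupancy to a (frequency, request) pair as in the table."""
    if occupancy < t.critical_low:
        return ("urgent", "emergency")
    if occupancy < t.low_mark:
        return ("urgent", "up")
    if occupancy < t.high_mark:
        if occupancy < prev:
            return ("normal", "up")    # buffer is draining: ask for more
        if occupancy > prev:
            return ("normal", "down")  # buffer is filling: ask for less
        return None                    # assumption: unchanged, nothing to send
    return ("urgent", "down")


t = Thresholds(critical_low=10, low_mark=30, high_mark=70, full=100)
request = classify(50, 60, t)
```

With these example thresholds (made up for illustration), an occupancy of 50 that dropped from 60 yields a normal "up" request.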

Emergency Flow Control

• When the server receives an emergency message:
– The server changes the fps rate:
  fps = latest-known-fps + emergency quantity

• The emergency quantity decays every second (by a factor)
– While the quantity is above zero, the server ignores FC messages from the client
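A minimal sketch of this rule on the server side follows. The class `EmergencyRate`, its method names, and all numeric values are hypothetical. Since a purely multiplicative decay never reaches zero, the sketch also assumes a cutoff below which the quantity is snapped to zero; the slides do not say how the original handles this.

```python
class EmergencyRate:
    """Tracks the sending rate for one client (illustrative only)."""

    def __init__(self, base_fps: float, quantity: float, decay: float,
                 cutoff: float = 1.0) -> None:
        self.base_fps = base_fps  # latest known fps
        self.quantity = quantity  # boost applied on an emergency message
        self.decay = decay        # per-second multiplicative factor, 0 < decay < 1
        self.cutoff = cutoff      # assumption: below this the boost becomes zero
        self.boost = 0.0

    def on_emergency(self) -> None:
        self.boost = self.quantity

    def tick(self) -> None:
        """Call once per second to decay the emergency quantity."""
        self.boost *= self.decay
        if self.boost < self.cutoff:
            self.boost = 0.0

    def on_fc(self, delta: float) -> bool:
        """Apply a flow-control request; ignored while the boost is active."""
        if self.boost > 0:
            return False
        self.base_fps += delta
        return True

    def fps(self) -> float:
        return self.base_fps + self.boost


rate = EmergencyRate(base_fps=30.0, quantity=8.0, decay=0.5)
rate.on_emergency()
```

Ignoring FC messages during the boost keeps a stale "down" request from cancelling the refill before the client buffer has recovered.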

Performance Measurements

• On HUJI Network (LAN)

• Servers at TAU and clients at HUJI (WAN)

• The measurements show that the system is robust and support our transparency claims

Software Buffers

[Figure: gnuplot graph 'vod100.stats.softb.ps'; image not included]

Hardware Buffers

[Figure: gnuplot graph 'vod100.stats.hardb.ps'; image not included]

Skipped Frames on LAN

[Figure: gnuplot graph 'vod100.stats.pl_skipped.ps'; image not included]

Skipped Frames on WAN

[Figure: gnuplot graph 'vod100.stats.pl_skipped.ps'; image not included]

Summary

• Scalable VoD service

• Load balancing

• Tolerating machine and network failures

• All the above are achieved practically for free:
– ~1/1000 of the total bandwidth
– Negligible memory and CPU overhead

Thanks to ...

• Gregory Chockler

• The other members of the Transis project