Salman Abdul Baset [email protected] Thesis defense October 29, 2010

Protocol and System Design, Reliability, and Energy Efficiency in Peer-to-Peer

Communication Systems


2

Motivation and Contributions

3

Client-server communication system

[Figure: nodes A, B, C, and D, behind NATs / firewalls, each send register(ip addr) to the proxy server / registrar, which stores the IP addresses of A, B, C, and D. Signaling goes through the proxy server; media traffic (voice, video, IM) goes through a media relay server.]


Up to 40% of Internet VoIP calls need media relays

Scaling for millions of users

economic costs

- servers

- bandwidth

- management

4

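What is a p2p communication system?

[Figure: nodes A, B, C, and D, behind NATs / firewalls, form a P2P overlay. Each node registers its IP address in the overlay (register(ip addr A), register(ip addr B), ...), and overlay nodes store those addresses. A node asks the overlay "What is the IP address of B?" and receives the IP address of B. Signaling then flows between the nodes, followed by media traffic over TCP (voice, video, IM).]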

5

Challenges: designing, building, and analyzing p2p communication systems

• Protocol and system design
– can we design a p2p communication protocol for diverse deployments such as ad hoc, enterprise, and Internet?
• Reliability
– are p2p communication systems reliable (dropped calls)?
• Session quality
– what is the quality of real-time traffic over TCP or through other nodes?
• Energy efficiency
– are p2p communication systems more energy efficient than client-server?
• Measurement
– how can we analyze the performance of p2p systems such as Skype?

6

Contributions

• Protocol and system design
– first standardized and interoperable protocol for building p2p communication systems
• Peer-to-peer protocol (P2PP) and RELOAD [IETF I-Ds 2007 and 2010]
• OpenVoIP, a P2PP proof-of-concept system [SIGCOMM’08 demo]
• Reliability
– a simple model to investigate the reliability of p2p communication systems [IPTCOMM’10]
• Session quality
– a comprehensive study of TCP’s feasibility for real-time traffic [SIGMETRICS’08]
• Energy efficiency
– first energy-efficiency study of VoIP systems [GreenNetworking’10]
• Measurement
– novel techniques for analyzing the workings of p2p communication systems [GI’08, Infocom’06]

7






[SIGMETRICS’08]


• Measurement– Novel techniques for analyzing the workings of p2p communication


In this talk

8

Related work

• Protocol and system design
– Skype: proprietary commercial system
– Distributed Hash Table (DHT) design [Rhea05Sigcomm]
• nodes on PlanetLab run OpenDHT and store data
• no relaying service, only storage
• P2PP allows nodes with ‘good’ connectivity to provide storage and relaying service, not just PlanetLab nodes
– feasibility of making Session Initiation Protocol (SIP) peer-to-peer [Bryan06AAA, Singh06NOSSDAV]
• explored the feasibility of distributing the SIP registrar
• protocol tied to SIP, could not be used by non-SIP protocols
• P2PP can be used by non-SIP protocols

9

Related work

• Reliability analysis
– Node isolation [Leonard07ToN]
• probability that a node outlives its neighbors
• I build a framework for understanding the reliability of calls
– Minimizing churn [Godfrey06Sigcomm]
• not sufficient
• I devise techniques to improve reliability
• Energy efficiency study
– p2p is more energy efficient than client-server for file-sharing [Valancius09CoNEXT, Nedevschi08Hotpower]
• I analyze whether this is also the case for VoIP systems
• I also study where the energy inefficiencies in VoIP systems are

10

Outline

• Introduction
• Related work
• Protocol and system design
• Reliability analysis
• Energy efficiency study of VoIP systems
• Conclusion

11

Protocol and system design

Goal
• design an open, standardized, and interoperable p2p communication protocol for building systems
– that distribute the functionality of proxy server, registrar, and media relay server to end-points
– that work in diverse deployments, e.g., ad hoc, enterprise, Internet
– that run on heterogeneous devices
– that are extensible

12

Requirements


• Multiple overlay algorithms
• Node heterogeneity
• NAT and firewall traversal
• Bootstrap and join

[Figure: P2P overlay of nodes C and D behind a NAT / firewall, with portable and desktop devices, an emergency p2p network and an Internet p2p network, and a new node joining]

13

Requirements

node A node B

node Cnode D

• Multiple overlay algorithms

• Node heterogeneity

• NAT and firewall traversal

• Bootstrap and join

• Resilience

• Message reliability

• Request routing

• Security

• Data model

• Monitoring and diagnostics

P2P

message

ip addrA

14

Peer-to-peer protocol (P2PP)

• Protocol stack of a P2PP node

• A request / response binary protocol
– protocol methods

• Reuses existing protocols

• Not a new DHT or bit-encapsulation

• How does it meet the requirements?

[Figure: protocol stack of a P2PP node, with layers SIP, P2PP, NAT, and TLS / SSL, and an API]

Published in IETF P2PSIP working group

15

Meeting the requirements (1)

Multiple overlay algorithms
• LookupPeer method
– find a peer in the overlay to fill node’s routing table
– method customized for each overlay algorithm
– Chord ($X+2^i$, $X+2^{i+1}$), Kademlia (XOR), etc.
• ExchangeTable method
– exchange routing tables
• KeepAlive method
– check liveness
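
The overlay-specific part of a lookup such as LookupPeer can be illustrated with a minimal sketch. It is not taken from the P2PP implementation: the function names and the 8-bit identifier space are assumptions chosen for readability. It shows how a Chord node derives the interval covered by its i-th routing table entry, and how a Kademlia node uses the XOR metric to pick a bucket.

```python
# Illustrative sketch only: an 8-bit identifier space (real overlays use far larger ones).
BITS = 8
SPACE = 1 << BITS


def chord_finger_interval(x: int, i: int) -> tuple[int, int]:
    """Chord: the i-th routing entry of node x covers [x + 2^i, x + 2^(i+1)) on the ring."""
    start = (x + (1 << i)) % SPACE
    end = (x + (1 << (i + 1))) % SPACE
    return start, end


def kademlia_distance(a: int, b: int) -> int:
    """Kademlia: the distance between two identifiers is their bitwise XOR."""
    return a ^ b


def kademlia_bucket(x: int, peer: int) -> int:
    """Bucket index for a peer: position of the highest bit in which the two IDs differ."""
    d = kademlia_distance(x, peer)
    if d == 0:
        raise ValueError("a node does not store itself in a bucket")
    return d.bit_length() - 1
```

In both cases the surrounding machinery (sending the request, filling the table, checking liveness) is the same; only the interval or bucket computation changes with the overlay algorithm.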

node A node B

node Cnode D

P2P

Routing table

node B

node D

Routing table

node C

node D

In one overlay algorithm

In another overlay algorithm

Multiple overlay algorithms

16


• Node heterogeneity
– peers (super nodes) and clients (ordinary nodes)
– peer vs. client decision left to the system designer
• NAT traversal
– a node encodes its host, NAT, and a relay IP address in every message
– then performs connectivity checks
• Bootstrap and join
– Bootstrap, Join, Leave methods
– Bootstrap server

P2P

portable

node A node B

node Cnode D

NAT / firewall

NAT / firewall

node A

bootstrap server

new node

Multiple overlay algorithms Node heterogeneity NAT traversal Bootstrap and join

17


P2P

• Resilience
– KeepAlive method
• Message reliability
– hop-by-hop, e2e
– ACK-based mechanism for unreliable transports
• Request routing
– recursive vs. iterative
– specified per message or per overlay

messagenode B

node Cnode D

node A

Multiple overlay algorithms Node heterogeneity NAT traversal Bootstrap and join Resilience Message reliability Request routing

18


• Data model
– Publish, Lookup, Replicate methods
– flexible data model
• key / type-value pairs
• data integrity (hash)
• Security
– identity (user, nodes)
• enrollment server, X.509 certificates
• Enroll method
– message confidentiality
• TLS, DTLS
• Monitoring and diagnostics gathering (e.g., CPU, uptime)
– GetDiagnostics method
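
A minimal sketch of the key / type-value data model with a hash-based integrity check follows. It only illustrates the idea on this slide: the class name, the field names, and the use of SHA-256 are assumptions, and the actual P2PP wire format is binary and differs from this.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceObject:
    """Hypothetical stored object: a resource-ID (key) with a typed value."""
    resource_id: str
    type_: str
    sub_type: str
    value: bytes

    def digest(self) -> str:
        """Hash over all fields, so that any modification of the object is detectable."""
        h = hashlib.sha256()
        for field in (self.resource_id, self.type_, self.sub_type):
            h.update(field.encode("utf-8") + b"\x00")  # separator avoids ambiguous concatenation
        h.update(self.value)
        return h.hexdigest()


def verify(obj: ResourceObject, expected_digest: str) -> bool:
    """Return True if the object still matches the digest published with it."""
    return hmac.compare_digest(obj.digest(), expected_digest)
```

A plain hash detects accidental or in-transit modification only; binding the record to an identity requires a signature made with the publisher's certificate, which this sketch omits.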

node B

node Cnode D

node A

P2P

enrollment server

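[Figure: layout of a stored record: Resource-ID, Type 1, Sub-type 1 and Sub-type 2, each with a Value, followed by a Signature. Example: resource-ID [email protected], type "Phone record", sub-type "Desktop phone", value "IP addr:port", signature 386af6194c4d]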

Multiple overlay algorithms Node heterogeneity NAT traversal Bootstrap and join Resilience Message reliability Request routing Data model Security Monitoring and diagnostics

19

P2PP – summary

• Design summary
– methods for implementing the common aspects of overlay algorithms
– overlay algorithm defines components of specific methods
– separation of mechanism vs. policy
• P2PP now part of RELOAD protocol being standardized in the IETF
• Limitations
– not a replacement for network file systems
• no permissions, store ephemeral data
– does not replace delay-tolerant network protocols
• Does it work in practice?

20

OpenVoIP – a P2PP proof of concept

P2POverlay 1

P2POverlay 2

bootstrap server monitoring server / Google maps

NAT / firewall

NAT / firewall

node A node B

node Cnode D

node E node F

SIGCOMM (demo) 2008

21

OpenVoIP – key facts and lessons learned

• 1000 node network on ~500 PlanetLab machines
• DHTs: Kademlia, Bamboo, Chord
• App: Windows XP / Vista, Linux
• Code used and modified by Ericsson Labs, Nokia Labs, Telecom Italia, and many universities around the world
• Lessons learned
– DHT-specific part is only 10-15% of the total code
– want to test a new p2p protocol?
• use the library provided

22

Outline

• Introduction
• Related work
• Protocol and system design
• Reliability analysis
• Energy efficiency study of VoIP systems
• Conclusion

23

Reliability of P2P comm. systems

• Reliability = proportion of completed calls (e.g., 99.9%)
• Goals
– understand reasons for call failure
– devise techniques to address them
• Reasons for call failure
– (1) distributed search fails to find online callee
– (2) distributed search fails to find a suitable relay
– (3) relay fails during voice/video session

IPTCOMM’2010

Recall: up to 40% of VoIP calls in the Internet need relaying

24

Understanding reliability of relayed calls (1)

For a desired reliability, what is the minimum number of relays k per call?

• Model
– when the i-th relay fails, the call is switched to the (i+1)-st relay, which is instantly selected from the global pool of all relays
– $R_i$: residual lifetime of a relay candidate (i.i.d.)
– $D$: call duration

$$\text{Desired reliability} \le P\left(\sum_{i=1}^{k} R_i > D\right)$$

Qualitatively: if node lifetime >>> call duration, small k and vice versa
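
To make the qualitative statement concrete, here is a minimal sketch under assumptions that are stronger than the general model: relay residual lifetimes are exponential with rate λ = 1 / mean lifetime, and the call duration is exponential with rate v = 1 / mean call duration. By memorylessness, the call outlasts each successive relay with probability λ/(λ+v), so k relays in sequence suffice with probability 1 − (λ/(λ+v))^k. The function name and parameters are hypothetical.

```python
def min_relays(mean_lifetime: float, mean_call: float, target: float) -> int:
    """Smallest k with 1 - (lam / (lam + v))**k >= target.

    Assumes exponential relay lifetimes and exponential call durations;
    both means must use the same time unit.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target reliability must be strictly between 0 and 1")
    lam = 1.0 / mean_lifetime      # relay failure rate
    v = 1.0 / mean_call            # call completion rate
    q = lam / (lam + v)            # probability that the call outlives one relay
    k = 1
    while 1.0 - q ** k < target:
        k += 1
    return k
```

For example, `min_relays(12, 1, 0.999)` considers relays that live 12 hours on average and calls that last one hour on average. Heavy-tailed lifetimes such as the Pareto approximation used for Skype need simulation or a different analysis.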

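[Figure: timeline of a call of duration D covered by the consecutive relay lifetimes R_1, ..., R_{k-1}, R_k of relays 1, 2, ..., k-1, k; target reliability 99.9%]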

25

Understanding reliability of relayed calls (2)

[Plots: minimum number of relays k, as a function of mean node lifetime and mean call duration, for exponential node lifetimes and for Skype node lifetimes (lifetimes approximated as Pareto)]

Skype node lifetimes: 12 hours (mean), 4 hours (median); mean call holding time = one hour

$$99.9\% \le 1 - \left(\frac{\lambda}{\lambda + v}\right)^{k}$$

95% of Skype relay calls last less than 1 hour

For one hour Skype calls, a minimum of 3 relays is needed to maintain a 99.9% success rate

What if the system does not have enough relays?

26

Approaches for addressing the reliability of relayed calls

• Approaches
– No-replacement (NR)
• select k relays in the beginning of a call
• do not replace failed relays
• $P(R_{NR} > D) = 1 - P(\max(R_1, \ldots, R_k) \le D)$
– With-replacement (WR)
• select k relays in the beginning of a call
• replace failed relays after μ
• $P(R_{WR} > D)$
– Skype uses a 2-relay with-replacement scheme
• Model
– complicated for arbitrary distributions
– for exponential lifetimes, I used Markov analysis
– pure death process

(1) Why not make k arbitrarily large?

(2) Isn’t WR always better than NR?

27

Comparing the approaches for reliability of relayed calls

(1) Why not make k arbitrarily large?
– i.e., add more relays?
– diminishing returns
– liveness checks overhead

(2) Isn’t WR always better than NR?
– yes, but the percentage improvement gains vary
– depends on mean lifetime, call duration, repair time

Skype: mean lifetime = 12 hours, median = 4 hours; 2-relay with-replacement; search time = 60 s

Num relays    MTTF improvement
2             50%
3             22%
4             13%
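
The diminishing returns can be reproduced with a short calculation, given here as a sketch under an assumption the slide does not state: relays are not replaced and have i.i.d. exponential lifetimes. The call then survives until the last of k relays fails, and the expected maximum of k exponential lifetimes is the mean lifetime times the harmonic number H_k = 1 + 1/2 + ... + 1/k. Function names are hypothetical.

```python
from fractions import Fraction


def nr_mttf(k: int, mean_lifetime=1):
    """Mean time until all k non-replaced relays have failed (exponential lifetimes)."""
    harmonic = sum(Fraction(1, j) for j in range(1, k + 1))
    return mean_lifetime * harmonic


def mttf_improvement(k: int) -> float:
    """Relative MTTF gain from using k relays instead of k - 1."""
    if k < 2:
        raise ValueError("improvement is defined for k >= 2")
    return float(nr_mttf(k) / nr_mttf(k - 1) - 1)
```

Under this assumption the gains for k = 2, 3, 4 are 1/2, 2/9, and 3/22, which is close to the figures in the table, although the slide does not say how those figures were computed.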

28

Outline

• Introduction
• Related work
• Protocol and system design
– P2PP
– OpenVoIP
• Reliability analysis
• Energy efficiency study of VoIP systems
• Conclusion

29

Energy efficiency study

• Goals
(1) Where is energy consumed in IP-telephony systems?
(2) How do different design choices (p2p vs. client-server) affect the energy consumption?
(3) How can we make IP-telephony more energy-efficient?
• IP-communication system classification
– p2p vs. client-server
– PSTN replacement (always on, providing emergency calling) vs. communication addendum

GreenNetworking’2010

30

Sources of energy consumption in IP-communication systems

• End-point
– handsets
– VoIP conversion boxes
– PCs
– NATs and firewalls
• Core
– signaling / directory
– media relaying
– PSTN / mobile gateways
– cooling
• power usage effectiveness (PUE): ratio of total data center power draw to IT power draw
• Network
– joules per bit

31

Approach

• Data (from client-server VoIP provider)
– 100K users (mostly business)
– 15 calls per second (CPS) peak
– ~5K calls in system
– NAT keep-alive traffic
– all calls relayed
• Modeling
– P2P
– client-server
• Measurements
– End points: desktop clients, laptop clients, hardware SIP phones, Skype peers
– Core: SIP server, relay server

32

(1) Where is energy consumed? PSTN replacement

• VoIP servers consume less than 0.06% of total!
– 1 server for 500k users: ~200 W
– 1 server for 50k simultaneous calls: ~200 W
– 500k phones, each phone ~5-7 W
– even after a redundancy factor of 2, and a conservative PUE of 2!
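
The arithmetic behind this comparison can be sketched as follows. The function and parameter names are hypothetical, and the calculation counts only the servers and phones listed on the slide, ignoring network equipment.

```python
def server_power_watts(servers: int, watts_per_server: float,
                       redundancy: float, pue: float) -> float:
    """Power drawn by the server side, including redundancy and cooling overhead (PUE)."""
    return servers * watts_per_server * redundancy * pue


def server_share(servers: int, watts_per_server: float, redundancy: float,
                 pue: float, phones: int, watts_per_phone: float) -> float:
    """Fraction of total system power consumed by servers rather than phones."""
    server_w = server_power_watts(servers, watts_per_server, redundancy, pue)
    phone_w = phones * watts_per_phone
    return server_w / (server_w + phone_w)
```

With one signaling server and one relay server at 200 W each, a redundancy factor of 2, and a PUE of 2, the server side draws 1600 W, while 500k phones at 5 to 7 W draw 2.5 to 3.5 MW. The server share is therefore well under 0.1% of the total, which is why the end devices dominate.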

Make PSTN replacement green? Reduce end-device energy consumption

33

(1) Where is energy consumed? Non-PSTN replacement

• Typically run on desktops, laptops as soft phones
• If soft phone draws little additional power
– still likely that the end-device is the biggest component
– but may not dominate consumption
• If users leave PCs on just as phones
– possibly even worse than PSTN!

34

(2) Client-server vs. peer-to-peer?

• Client-server model
– C/S power consumption: $p_{c/s}$ = #servers × Watts/server × redundancy factor × PUE
• P2P model
– $S$ super nodes active
– $p_s$ Watts/super node
• P2P more energy efficient when: $S \cdot p_s < p_{c/s}$
– one active super node per relayed call (Skype)
– 30% calls relayed
– super nodes 1.5% of total nodes
– $p_s$ < 162 mW

P2P may consume more than client-server!
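
The comparison above can be expressed as a small calculation. This is a sketch with hypothetical function names; the numbers in the usage note are made up for illustration and are not the measurements behind the 162 mW threshold.

```python
def client_server_power(servers: int, watts_per_server: float,
                        redundancy: float, pue: float) -> float:
    """p_c/s = #servers * Watts/server * redundancy factor * PUE."""
    return servers * watts_per_server * redundancy * pue


def breakeven_super_node_watts(p_cs: float, active_super_nodes: int) -> float:
    """Largest additional per-super-node draw p_s for which S * p_s stays below p_c/s."""
    return p_cs / active_super_nodes


def p2p_is_more_efficient(active_super_nodes: int, watts_per_super_node: float,
                          p_cs: float) -> bool:
    """True when S * p_s < p_c/s."""
    return active_super_nodes * watts_per_super_node < p_cs
```

For instance, with a hypothetical 10 servers at 200 W, redundancy 2, and PUE 2, the client-server side draws 8 kW; if 50,000 super nodes are active, each may add at most 0.16 W before the p2p system consumes more. Since even a lightly loaded PC draws more than that when it is kept awake for relaying, the budget is tight.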

35

(3) How can we make IP-telephony greener?

• Phones
– make phones energy efficient
• LCD, processor, Wake-on-LAN for phones?
• PCs
– wake up on receiving calls
• NATs and firewalls
– eliminate NATs (IPv6, at least in theory)

36






[SIGMETRICS’08]


• Measurement– novel techniques for analyzing the workings of p2p communication


Conclusion

37

Publications

Journal and magazine
• Eli Brosh, Salman A. Baset, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne, The Delay-Friendliness of TCP for Real-time Traffic, IEEE/ACM Transactions on Networking, Accepted.
• Salman A. Baset and Henning Schulzrinne, Reliability and Relay Selection in Peer-to-Peer Communication Systems, in submission.
• Salman A. Baset and Henning Schulzrinne, Making Peer-to-Peer Video Conferencing Work, in submission.

Conference and workshop
• Salman A. Baset, Joshua Reich, Jan Janak, Pavel Kasparek, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne, How Green is IP-telephony?, in Proc. of SIGCOMM Green Networking workshop, New Delhi, India, August 2010.
• Salman A. Baset and Henning Schulzrinne, Reliability and Relay Selection in Peer-to-Peer Communication Systems, in Proc. of IPTCOMM, Munich, Germany, August 2010 (Best paper).
• Omer Boyaci, Andrea Forte, Salman A. Baset, and Henning Schulzrinne, vDelay: A Tool to Measure Capture-to-Display Latency and Frame Rate, in Proc. of International Symposium on Multimedia (ISM), San Diego, CA, USA, December 2009.
• Katerina Argyraki, Salman A. Baset, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Eddie Kohler, Maziar Manesh, Sergiu Nedevschi, and Sylvia Ratnasamy, Can Software Routers Scale, in Proc. of second PRESTO workshop, Seattle, WA, USA, August 2008.
• Eli Brosh, Salman A. Baset, Dan Rubenstein, and Henning Schulzrinne, The Delay-Friendliness of TCP, in Proc. of ACM SIGMETRICS, Annapolis, MD, USA, June 2008.
• Wookyun Kho, Salman A. Baset, and Henning Schulzrinne, Skype Relay Calls: Measurements and Experiments, in Proc. of IEEE Global Internet Symposium, Phoenix, AZ, USA, April 2008.
• Salman A. Baset and Henning Schulzrinne, An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol, in Proc. of IEEE INFOCOM, Barcelona, Spain, April 2006.
• Kishore Dhara, Salman A. Baset, Venkatesh Krishnaswamy, Dynamic Peer-To-Peer Overlays for Voice Systems, in Proc. of 3rd IEEE Workshop on Mobile Peer-to-Peer Computing, Pisa, Italy, March 2006.

Demo
• Omer Boyaci, Andrea Forte, Salman A. Baset, and Henning Schulzrinne, vDelay: A Tool to Measure Capture-to-Display Latency and Frame Rate, in Proc. of International Symposium on Multimedia (ISM), San Diego, CA, USA, December 2009.
• Salman A. Baset, Gaurav Gupta, and Henning Schulzrinne, OpenVoIP: An Open Peer-to-Peer VoIP and IM System, in Proc. of SIGCOMM (demo), Seattle, WA, August 2008.

Poster
• Salman A. Baset, Eli Brosh, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne, Understanding the Behavior of TCP for Real-time Workloads, in Proc. of CoNEXT, Lisbon, Portugal, December 2006.

Internet drafts
• Cullen Jennings, Bruce Lowekamp, Eric Rescorla, Salman A. Baset, and Henning Schulzrinne, Resource Location and Discovery (RELOAD), Internet Draft, draft-ietf-p2psip-base-11 (work-in-progress), October 2010.
• Cullen Jennings, Bruce Lowekamp, Eric Rescorla, Salman A. Baset, and Henning Schulzrinne, A SIP Usage for RELOAD, Internet Draft, draft-ietf-p2psip-sip-05 (work-in-progress), July 2010.
• Salman A. Baset and Henning Schulzrinne, TCP-over-UDP, Internet Draft, draft-baset-tsvwg-tcp-over-udp-01 (work-in-progress), June 2009.
• Salman A. Baset, Henning Schulzrinne, and Marcin Matuszewski, Peer-to-Peer Protocol (P2PP), Internet Draft, draft-baset-p2psip-p2pp-01, November 2007.

38

References

[Bryan06AAA] David A. Bryan, Bruce Lowekamp, and Cullen Jennings, SoSIMPLE: A SIP/SIMPLE based P2P VoIP and IM system, in Proc. of AAA workshop, Orlando, FL, USA, July 2005

[Godfrey06Sigcomm] P. Brighten Godfrey, Scott Shenker, and Ion Stoica, Minimizing churn in distributed systems, in Proc. of SIGCOMM, Pisa, Italy, August 2006.

[Leonard07ToN] Derek Leonard, Zhongmei Yao, Vivek Rai, and Dmitri Loguinov, On lifetime-based node failure and stochastic resilience of decentralized peer-to-peer networks, in IEEE/ACM Transactions on Networking, June 2007.

[Nedevschi08Hotpower] Sergiu Nedevschi, Jitendra Padhye, and Sylvia Ratnasamy, Hot data centers vs. cool peers, in Proc. of HotPower, San Diego, CA, USA, December 2008.

[Rhea05Sigcomm] Sean Rhea, OpenDHT: A publicly accessible DHT service, PhD thesis, University of California at Berkeley, Berkeley, CA, USA, 2005.

[Singh06NOSSDAV] Kundan Singh and Henning Schulzrinne, Peer-to-peer Internet telephony using SIP, in Proc. of NOSSDAV, Stevenson, WA, USA, June 2005.

[Valancius09CoNEXT] Vytautas Valancius, Nikolaos Laoutaris, Laurent Massoulie, Christophe Diot, and Pablo Rodriguez, Greening the Internet with nano data centers, in Proc. of CoNEXT, Rome, Italy, December 2009.

39

Backup

40

IP-based communication systems

• Basic services
– establish voice, video, IM sessions
– voicemail
• Advanced services
– conferencing, telepresence
– voicemail to text

Client-server Peer-to-Peer

41

Client-server IP communication system

SIP registrar / proxy server

REGISTER(ip addr)

REGISTER(ip addr)

User agent User agent

(1) signaling (1) signaling

(2) media(voice, video, IM)

SIP registrar / proxy / presenceserver

Utopian Internet

No NATs or firewalls

PSTN / Mobile

IP-PSTN gateway

42

Client-server IP communication system

SIP registrar / proxy / presence / server


media server

NAT / firewall

NAT / firewall

NetworkNetwork

NAT

Src-IP Dst-IPPub-IP

Src-IP Dst-IP Pub-IP Dst-IP

Pr-IP

packet packet

aka server-reflexive address

43

P2P: communication vs. file sharing

• P2P file-sharing systems
– tit-for-tat
– open NAT ports
– reduce download rate of files for nodes behind NATs
• P2P communication systems
– no tit-for-tat
– opening NAT ports is a hassle
– cannot reduce rate, will impact quality

44

Percentage of VoIP calls in the Internet that need relaying?

• The provider knows
• Some client-server VoIP providers relay all calls
• 15-20% of calls for a commercial client-server IM / VoIP application
• Microsoft messenger ~40%
• 341 relayed calls in 20 days for Skype [Suh05Infocom]: ~17 per day for a super node (~50K super nodes)

• NAT studies

45

Protocol and system design

46

Shared and different aspects

Shared aspects
• Connectivity
– NAT traversal
– bootstrap
• Resilience
– recovery from node churn
• Request routing
– recursive vs. iterative
– parallel vs. sequential
• Heterogeneity of nodes
– mobile, desktop
– super node vs. ordinary node
• Security
– identity (user, nodes)
– message confidentiality
• Data model
– addressing, storage, integrity
• Message reliability
– hop-by-hop, e2e

Different aspects
• Next-hop determination
– depends on the overlay algorithm
– Chord, Kademlia, Gia
– proximity aware, etc.

[Figure: two request / response message sequences among nodes A, B, and C]

• Methods for implementing the common aspects

• Overlay algorithm defines components of specific methods

Now part of RELOAD protocol being standardized in the IETF

47

Peer-to-peer protocol (P2PP)

• Now part of RELOAD protocol being standardized in the IETF

• Not a new DHT or bit-encapsulation
• Geared towards IP telephony but applicable to streaming, VoD, etc.
• A request / response binary protocol
– Shared methods
• Join, Leave, Publish, Lookup, KeepAlive, etc.
– Overlay-specific methods
• FindPeer, ExchangeTable
• Supports different overlay algorithms (Chord, Kademlia, etc.)
• Application-level API
• Security
– enrollment server, X.509 certificates
– TLS, DTLS for message confidentiality

IETF P2PSIP working group

SIP

P2PP ICE

TLS / SSL

protocol stack of a node

API

48

Peer-to-peer protocol (P2PP)

• Node heterogeneity
– peers (super nodes) and clients (ordinary nodes)
– decision left to the system designer
– use of peers as relays
• NAT traversal built-in
– a node exchanges its host, NAT, and a relay IP address in requests and responses
– then uses ICE (interactive connectivity establishment) for NAT traversal
• Message reliability
– hop-by-hop, e2e
• Data model
– key / value pairs
– data integrity
• Monitoring and diagnostics gathering

Resource-ID

Value 1

Value 2

Type 1

Value 1

Value 2

Type 1

Signature

49

Implementation design

[Figure: implementation architecture annotated with lines of code per module. Modules: Transport / timers (UDP, TCP, DTLS, TLS), Transactions, Parser / encoder, Node, BigInt, Sys, Routing table, Neighbor table, Distance, Client, Bootstrap, KadPeer, BambooPeer, Diagnostic, Data storage, NAT. Application API: publish (key, value, callback), lookup (key, callback), callback (resp). Annotations: multiplatform, app. pluggability. Line counts that can still be attributed: non-DHT LoC 15783, NAT 3400, Data storage 1946, Other 2771.]

50

Why not the ‘best’ DHT?

• Is there any such thing as the ‘best’ DHT?
• Chord widely cited but not widely deployed
• DHTs are parameterized
– base, hash algorithm
– symmetric vs. asymmetric distance
• Chord (modulo) vs. Kademlia
– next-hop determination may be purely based on DHTs or a combination of DHT + proximity aware routing
– debugging and deployment

51

P2PP and RELOAD

• Commonalities
– pluggable overlay algorithm
• feasibility demonstrated in OpenVoIP
– security
• self-signed and CA-signed certificates
• DTLS, TLS
• data integrity
– routing
• recursive, iterative, direct response
– NAT traversal
• core part of the protocol

52

P2PP and RELOAD

• Differences
– message model
• P2PP: all messages can be routed in recursive or iterative manner
• RELOAD: only one message permitted for iterative
– data model
• P2PP: opaque blob, only app can interpret the data
• RELOAD: single value, array, dictionary
– message fragmentation over unreliable transports
• P2PP: use TCP
• RELOAD: handle UDP
– NAT traversal
• RELOAD: explicit ‘connection’ establishment
• P2PP: encode host, server-reflexive, and relay addresses in every message

53

Skype using P2PP?

• Why not open the Skype protocol?
– sure, but
– Skype protocol tied with VoIP
• To use P2PP, Skype will have to
– abandon its own protocol
– use SIP for call establishment
– TLS, DTLS for security
– STUN, TURN, ICE protocols for NAT traversal

54

OpenVoIP: geo+logical interface

55

OpenVoIP: lessons learned

• Bootstrap
– maintain bootstrap nodes and ensure their availability
• Randomization is our best friend!
– send the maintenance messages within a bounded random time
• Churn recovery
– is on demand and periodic
• insert a new entry in routing table after checking liveness
• periodically republish SIP records
– not feasible for large records
• Avoid overly complex mechanisms
– can backfire!

56

P2P video conferencing

• Send video to every participant (NxN)
• But
– uplink capacity
– NAT and firewalls
– downlink capacity
• Solution
– centralized
• costly, hardware-based
– peer-to-peer
• use helpers (idle Skype users)
• construct an application layer multicast tree rooted at every participant

57

P2P video conferencing

• Challenges
– optimize latency or number of helpers or both (within a threshold)?
– select helpers close to source or final recipients?
• recipients behind NAT and firewalls?
– helper churn
• backup for every helper?
– participant join and churn?
– who searches for the helper?
• root or new recipient?
– share helpers across trees?
– participants as helpers?

[Figure: example multicast tree with link latencies of 10 ms, 20 ms, 50 ms, and 150 ms]

58

Number of helpers

participant outdegree=1, helper outdegree=2

Number of helpers per tree (stream)=7, total helpers=63
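
The helper count can be reproduced with a simple counting argument, sketched below. The argument is an assumption and is not stated on the slides: in a tree rooted at one participant, every helper and every recipient has exactly one incoming stream, the source feeds `participant_outdegree` nodes, and each helper feeds `helper_outdegree` nodes. Function names are hypothetical, and the slides' tables may rest on a slightly different model.

```python
import math


def helpers_per_tree(participants: int, participant_outdegree: int,
                     helper_outdegree: int) -> int:
    """Minimum helpers needed for one source to reach all other participants."""
    if helper_outdegree < 2:
        raise ValueError("a helper must forward to at least two nodes to add capacity")
    recipients = participants - 1
    # Streams the source cannot deliver itself; each helper adds (outdegree - 1) net streams.
    remaining = recipients - participant_outdegree
    if remaining <= 0:
        return 0
    return math.ceil(remaining / (helper_outdegree - 1))


def total_helpers(participants: int, participant_outdegree: int,
                  helper_outdegree: int) -> int:
    """One tree per participant, with no helpers shared across trees."""
    return participants * helpers_per_tree(
        participants, participant_outdegree, helper_outdegree)
```

For a nine-party conference with participant outdegree 1 and helper outdegree 2, this gives 7 helpers per tree and 63 in total.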

Nine party conference

Number of helpers per tree=4, total helpers=36 (down from 63!)

[Figure: tree diagrams with branches labeled 1/4]

59

Number of helpers

Total helpers by number of participants (rows) and helper outdegree (columns)

               Participant outdegree=3      Participant outdegree=1
Participants    2    3    4    5    6        2    3    4    5    6
3               0    0    0    0    0        3    3    3    3    3
4               4    4    4    4    4        8    4    4    4    4
6              12    6    6    6    6       24   12   12    6    6
10             60   30   20   20   20       80   40   30   20   20

6 helpers per tree

back up for every helper?

participant outdegree=1, helper outdegree=2

Number of helpers per tree (stream)=7, total helpers=63

60

Number of helpers

Total helpers by number of participants (rows) and helper outdegree (columns)

               Participant outdegree=3      Participant outdegree=1
Participants    2    3    4    5    6        2    3    4    5    6
3               0    0    0    0    0        3    3    3    3    3
4               4    4    4    4    4        8    4    4    4    4
5               5    5    5    5    5       15   10    5    5    5
6              12    6    6    6    6       24   12   12    6    6
7              21   14    7    7    7       35   21   14   14    7
8              32   16   16    8    8       48   24   16   16   16
9              45   27   18   18    9       63   36   27   18   18
10             60   30   20   20   20       80   40   30   20   20

6 helpers per tree

back up for every helper?

61

Split the stream

participant outdegree=1, helper outdegree=2: total helpers=4x6=24

participant outdegree=1, helper outdegree=2, split the stream=3: total helpers=3x6=18

62

Tree construction for video conferencing

• Split the stream
– decreases helpers
– gain increases as the helpers increase (>4)
• Source selects helpers close to itself
• Helper pool

Related work
– ALM: 1->many; video conferencing: many->many
– participant churn, bandwidth, managed servers as helpers
– 1-hop tree construction without split and incorporate participants as helpers

63

Reliability analysis

64

Understanding reliability of relayed calls

• Model and simulations
– event driven, $10^7$ calls
– [synthetic] exponential, Pareto; [real] Skype lifetime data set
• Skype node lifetime data set (1,740 Skype nodes)
– Skype (uptime mean ~12 hours, median ~4 hours)
• approximated using shifted Pareto and exponential
• 95% of relayed Skype calls are less than 60 minutes [Guha’06]
– desired reliability = 99.9%

[Plot annotations: 1-relay failure; error (15%)]

95% of Skype call durations: minimum of 3 relays to maintain 99.9% success rate

65

Improving reliability of relayed calls

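• Approach 1 -- no-replacement
– select k relays in the beginning of a call
– do not replace failed relays
– $P(R_{NR} > D) = 1 - P(\max(R_1, \ldots, R_k) \le D)$
• Approach 2 -- with-replacement
– select k relays in the beginning of a call
– replace failed relays after μ
– no failure during switch over
– Skype uses 2-relay with-replacement scheme
– pure death process

[Figure: Markov chain with states 2, 1, 0; transition 2 → 1 at rate 2λ, 1 → 0 at rate λ, 1 → 2 at rate μ; self-loops 1-2λ and 1-(λ+μ)]

$$\lambda_{MTTF} = \frac{1}{\mathrm{MTTF}} = \frac{2\lambda^2}{3\lambda + \mu}$$

$$F_{R_{WR}}(t) \approx e^{-t/\mathrm{MTTF}} \quad \text{for } \lambda \ll \mu$$

$$P(R_{WR} > D) = \frac{v}{v + \lambda_{MTTF}}$$

[Bir04]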
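
A minimal numerical sketch of the 2-relay with-replacement analysis follows. It rests on assumptions that the slide only partly states: relays fail independently at rate λ, one failed relay at a time is replaced at rate μ, the time until both relays are down is approximated as exponential with mean MTTF, and the call duration is exponential with rate v. The MTTF expression is the standard result for a two-unit redundant system with repair; function names are hypothetical.

```python
def mttf_two_relays(lam: float, mu: float) -> float:
    """Mean time until both relays are down, starting with two working relays.

    States 2 -> 1 at rate 2*lam, 1 -> 0 at rate lam, 1 -> 2 at rate mu.
    """
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


def wr_call_success(lam: float, mu: float, v: float) -> float:
    """P(call ends before both relays are down), with exponential approximations."""
    failure_rate = 1.0 / mttf_two_relays(lam, mu)
    return v / (v + failure_rate)
```

With μ = 0 the expression reduces to 1.5 / λ, the no-replacement mean for two relays. As an illustration, rates per hour of λ = 1/12 (12-hour mean lifetime), μ = 60 (60 s search time), and v = 1 (one-hour calls) give an MTTF in the thousands of hours.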

66

Distributed relay selection

NAT

NAT

IP address RTT Bandwidth

IP address RTT Bandwidth

• Goal: O(1) hop
• 2-level hierarchical network

1-relay

close-by

Give me a relay

Here is a randomly selectedrelay

local-random scheme

search performance dropped calls

67


• Delay
• User annoyance
– interference with user applications
– file sharing (draft idle peers)
– spare capacity
• random
• mindelay
– select relay with minimum delay
• netmax
– select relay with maximum spare bw
• threshold
– select relays with delay < 150 ms and maximum spare capacity
• Results
– strategies perform similarly near the system collapse point
– minimizing latency increases annoyance and the number of jobs per relay, and vice versa
– threshold approach performs reasonably well
• Comparison with existing approaches
– OneHop DHT: Efficient routing for peer-to-peer overlays [Gupta04NSDI]
– direct comparison not possible
• we do not create a one hop DHT
• leverage the connectivity information of a peer

68


69

Understanding TCP behavior for real-time traffic

70

Real-time traffic over TCP

• Why over TCP?
– restrictive NAT and firewalls
• Why not?
– TCP is likely to exhibit poor performance for VoIP and live video (first tried in 1970s, but TCP has evolved)
• Our result:
– acceptable performance for VoIP and video (streaming, conferencing) under certain conditions
• Why is VoIP and video over TCP feasible?
– (1) Factors impacting delay
– (2) Working region

App limited

Networklimited

Fast retransmit / TO

no loss

packet loss

backlog cleared

TCP (New Reno) carrying CBR real-time traffic

backlog

71

Factors impacting delay

(1) Packet size (small is better)
– during backlog, VoIP packets (~200 bytes) can be combined in one MSS (~1500 B)

(2) Congestion window regulation (implicitly favors small packet size)
– TCP regulates cwnd based on number of ACKs received (ACK-counting)
– for two flows with the same packet rate but different bit rates, works in favor of the smaller bit rate

Delay friendliness for VoIP
• No Nagle or delayed ACKs
• ACK-counting

[Figure: congestion window growing from w to w+1; MSS = 1500 bytes, VoIP packet = 200 bytes]

72

Factors impacting delay

• Byte-counting increases VoIP delays by 10-20%
• VoIP delays are significantly lower than video for the same packet rate
• TCP induced delay: AIMD, HOL (head-of-line)
• VoIP: HOL dominates; video: AIMD dominates
– (CBR) VoIP 64 kb/s (173 bytes/packet), video 573 kb/s (MSS bytes/packet)

Delay improvement
• VoIP: modify tcp recv()
• Video: use parallel connections or inflate the window

73

Working region

Delay limits: VoIP 200 ms; streaming 5 s; video conferencing 500 ms

Rates: VoIP 64 kb/s; video 573 kb/s

Application       RTT      Loss rate
VoIP              100ms    2%
Video (stream)    100ms    3%
Video (conf)      100ms    1%

• Video conferencing has the most constrained region
• acceptable performance for VoIP

74

Playout buffer setting

• Time to recover a lost packet
– Fast retransmit: 1.5 RTT + 3/f
– Timeout: 4*RTT
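
These recovery times translate directly into a playout buffer estimate. The sketch below takes f to be the packet rate of the CBR flow in packets per second, which is an assumption about the slide's notation: three duplicate ACKs require three further packets, hence the 3/f term. Function names are hypothetical.

```python
def fast_retransmit_recovery(rtt: float, packet_rate: float) -> float:
    """Seconds to recover a lost packet via fast retransmit: 1.5 * RTT + 3 / f."""
    return 1.5 * rtt + 3.0 / packet_rate


def timeout_recovery(rtt: float) -> float:
    """Seconds to recover a lost packet via retransmission timeout: 4 * RTT."""
    return 4.0 * rtt


def playout_buffer(rtt: float, packet_rate: float, cover_timeouts: bool) -> float:
    """Buffer needed to mask one loss, optionally sized for the timeout case."""
    if cover_timeouts:
        return timeout_recovery(rtt)
    return fast_retransmit_recovery(rtt, packet_rate)
```

For a flow sending 50 packets per second over a path with a 100 ms RTT, fast retransmit recovers a loss in about 210 ms, while a timeout takes about 400 ms.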

i

i

RTT

3/f

0.5RTT

VoIP: RTT 100ms, lr: 2%
Video (conf): RTT 100ms, lr: 1%
Video (stream): RTT 100ms, lr: 2%

75

Playout buffer setting

• Time to recover a lost packet
– Fast retransmit: 1.5 RTT + 3/f (CBR flows)
– Timeout: 4*RTT

i

i

RTT

3/f

0.5RTT

RTT=100ms lr=1%

RTT=100ms lr=3%

76

Related work

• Supporting low-latency TCP based media streams [IWQoS’02]
– TCP stack modification at sender
• TCP-RC: a receiver-centered TCP protocol for delay-sensitive applications [MMCN’05]
– TCP stack modification at receiver

77

Energy Efficiency

78

Does VoIP consume more energy than PSTN?

• Insufficient information
• Columbia phone system (presently)
– system: 40K watts
– cooling: 50K watts
– phone lines: 13,848
– per user: 6.4 W
• VoIP
– Cisco phone: 5-7 W

79

Servers needed

Signaling servers needed, by number of users

Transport   NAT keep-alive     100k   1M   10M   100M   Watts / server
UDP         Yes, NOTIFY/s      1      2    20    200    210
UDP         No                 1      1    10    100    190
TLS         No                 3      25   250   2500   209

Relay servers needed, by number of users

% calls relayed   100k   1M   10M   100M
0%                0      0    0     0
30%               1      2    10    96
100%              1      4    32    320

Server as % of total (client-server), by number of users

Transport    100k   1M     10M     100M
UDP (NAT)    0.4%   0.1%   0.05%   0.04%
UDP
TLS          0.2%

80

Skype

81

Measurement: Skype

• Super node, ordinary node, login server
• Actively prevents reverse engineering
– LD_PRELOAD
– forcing Skype to use a modified shared library
• Voice and video calls
– relaying
– over TCP
• Ports: no default listening port
– opens port 80 (HTTP) and 443 (TLS)
• Contact list
– stored centrally, initially distributed
• Video conferencing
– using central servers

Skype login server

Message exchange with the login server during login

ordinary host (SC)

super node (SN)

neighbor relationships in the Skype network

INFOCOM’06

82

Is Skype free-riding on universities’ bandwidth?

• Two Skype clients at Columbia University forced to use a relay
• 6,000 relay calls
• Median latency: ~95ms
• 46% of calls through relays with a .edu suffix
• 8% of calls through Columbia Skype users
• Is it deliberate?
– probably not
– relay selection biased towards high-capacity nodes, which happen to be in universities

GI’08

NAT NAT our lab

83

Future work

84

Directions for future research

• A holistic framework for reliability, performance, and energy tradeoffs in data centers
– virtualization, consolidation
• Comparing VoIP and PSTN energy consumption
• Preventing data lock-in for social networks and cloud-based services
– enabling seamless data migration across different cloud providers
– holy grail: ‘one click’ data migration

85

Client-server IP communication system

PSTN / Mobile

SIP registrar / proxy server


(1) signaling

(1) signaling(2) media

(voice, video, IM)(UDP or TCP)

media server

NAT / firewall

NAT / firewall

IP-PSTN gateway

What is centralized?
• directory service
• call signaling
• media session and conferencing
• PSTN connectivity

Peer-to-peer: distribute to user agents

Why is this a problem?
• Scaling for millions of users
– servers
– bandwidth costs
– management overhead
• How many calls need media relaying?
– 15-40%
– some ISPs relay all calls
