Protocol and System Design, Reliability, and Energy Efficiency in Peer-to-Peer
Communication Systems
Salman Abdul Baset
Thesis defense
October 29, 2010
2
Motivation and Contributions
3
Client-server communication system
[Figure: nodes A, B, C, and D, some behind NATs / firewalls, each register their IP address with a proxy server / registrar, which then holds the IP addresses of A, B, C, and D. Signaling passes through the server; media traffic (voice, video, IM) flows between nodes, through a media relay server where NATs / firewalls are in the way.]
Up to 40% of Internet VoIP calls need media relays
Scaling for millions of users
economic costs
- servers
- bandwidth
- management
4
What is a p2p communication system?
[Figure: nodes A, B, C, and D, some behind NATs / firewalls, form a P2P overlay and register their IP addresses with other nodes (labels: ip addr B, C; ip addr A; ip addr D). A query "What is the IP address of B?" is answered with the IP address of B. Signaling and media traffic (voice, video, IM) then flow between nodes, with one path labeled media traffic over TCP.]
5
• Protocol and system design
  – can we design a p2p communication protocol for diverse deployments such as ad hoc, enterprise, and Internet?
• Reliability
  – are p2p communication systems reliable (dropped calls)?
• Session quality
  – what is the quality of real-time traffic over TCP or through other nodes?
• Energy efficiency
  – are p2p communication systems more energy efficient than client-server?
• Measurement
  – how can we analyze the performance of p2p systems such as Skype?
Challenges: designing, building, and analyzing p2p communication systems
6
• Protocol and system design
  – first standardized and interoperable protocol for building p2p communication systems
    • Peer-to-peer protocol (P2PP) and RELOAD [IETF I-Ds 2007 and 2010]
    • OpenVoIP – a P2PP proof-of-concept system [SIGCOMM’08 demo]
• Reliability
  – a simple model to investigate the reliability of p2p communication systems [IPTCOMM’10]
• Session quality
  – a comprehensive study of TCP’s feasibility for real-time traffic [SIGMETRICS’08]
• Energy efficiency
  – first energy-efficiency study of VoIP systems [GreenNetworking’10]
• Measurement
  – novel techniques for analyzing the workings of p2p communication systems [GI’08, Infocom’06]
Contributions
7
• Protocol and system design
  – first standardized and interoperable protocol for building p2p communication systems
    • Peer-to-peer protocol (P2PP) and RELOAD [IETF I-Ds 2007 and 2010]
    • OpenVoIP – a P2PP proof-of-concept system [SIGCOMM’08 demo]
• Reliability
  – a simple model to investigate the reliability of p2p communication systems [IPTCOMM’10]
• Session quality
  – a comprehensive study of TCP’s feasibility for real-time traffic [SIGMETRICS’08]
• Energy efficiency
  – first energy-efficiency study of VoIP systems [GreenNetworking’10]
• Measurement
  – novel techniques for analyzing the workings of p2p communication systems [GI’08, Infocom’06]
In this talk
8
Related work
• Protocol and system design
  – Skype: proprietary commercial system
  – Distributed Hash Table (DHT) design [Rhea05Sigcomm]
    • nodes on PlanetLab run OpenDHT and store data
    • no relaying service, only storage
    • P2PP allows nodes with ‘good’ connectivity to provide storage and relaying service, not just PlanetLab nodes
  – feasibility of making Session Initiation Protocol (SIP) peer-to-peer [Bryan06AAA, Singh06NOSSDAV]
    • explored the feasibility of distributing the SIP registrar
    • protocol tied to SIP, could not be used by non-SIP protocols
    • P2PP can be used by non-SIP protocols
9
• Reliability analysis
  – Node isolation [Leonard07ToN]
    • probability that a node outlives its neighbors
    • I build a framework for understanding reliability of calls
  – Minimizing churn [Godfrey06Sigcomm]
    • not sufficient
    • I devise techniques to improve reliability
• Energy efficiency study
  – p2p is more energy efficient than client-server for file-sharing [Valancius09CoNEXT, Nedevschi08Hotpower]
    • I analyze whether this is also the case for VoIP systems
    • I also study where the energy inefficiencies in VoIP systems are
Related work
10
• Introduction
• Related work
• Protocol and system design
• Reliability analysis
• Energy efficiency study of VoIP systems
• Conclusion
Outline
11
Protocol and system design
Goal
• design an open, standardized, and interoperable p2p communication protocol for building systems
  – that distribute the functionality of proxy server, registrar, and media relay server to end-points
  – that work in diverse deployments, e.g., ad hoc, enterprise, Internet
  – that run on heterogeneous devices
  – that are extensible
12
Requirements
• Multiple overlay algorithms
• Node heterogeneity
• NAT and firewall traversal
• Bootstrap and join
[Figure: P2P overlay of nodes A, B, C, and D behind NATs / firewalls, with portable and desktop devices and a new node joining; deployment examples labeled emergency p2p and Internet p2p.]
13
Requirements
• Multiple overlay algorithms
• Node heterogeneity
• NAT and firewall traversal
• Bootstrap and join
• Resilience
• Message reliability
• Request routing
• Security
• Data model
• Monitoring and diagnostics
[Figure: P2P overlay of nodes A, B, C, and D; labels: message, ip addr A.]
14
Peer-to-peer protocol (P2PP)
• Protocol stack of a P2PP node
• A request / response binary protocol
  – protocol methods
• Reuses existing protocols
• Not a new DHT or bit-encapsulation
• How does it meet the requirements?
[Figure: protocol stack of a P2PP node; layers labeled SIP, P2PP, NAT, TLS / SSL, and API.]
Published in IETF P2PSIP working group
15
Meeting the requirements (1)
Multiple overlay algorithms
• LookupPeer method
  – find a peer in the overlay to fill a node’s routing table
  – method customized for each overlay algorithm
  – Chord ($X+2^i$, $X+2^{i+1}$), Kademlia (XOR), etc.
• ExchangeTable method
  – exchange routing tables
• KeepAlive method
  – check liveness
[Figure: P2P overlay of nodes A, B, C, and D. In one overlay algorithm the routing table holds node B and node D; in another overlay algorithm it holds node C and node D.]
Covered so far: Multiple overlay algorithms
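The difference between the two overlay algorithms named on this slide comes down to how "closeness" between identifiers is defined. The sketch below is only an illustration of that idea, not P2PP code: the 8-bit identifier space `M` and the function names are assumptions made for the example.

```python
# Illustrative sketch (not P2PP code): how Chord and Kademlia measure
# closeness between node identifiers. M is a hypothetical ID length.
M = 8  # bits in an identifier; deployed overlays use far longer IDs


def chord_finger_interval(x: int, i: int, m: int = M) -> tuple:
    """Ring positions x + 2^i and x + 2^(i+1), taken modulo 2^m."""
    size = 1 << m
    return (x + (1 << i)) % size, (x + (1 << (i + 1))) % size


def chord_distance(a: int, b: int, m: int = M) -> int:
    """Clockwise distance from a to b on the ring (asymmetric)."""
    return (b - a) % (1 << m)


def kademlia_distance(a: int, b: int) -> int:
    """XOR distance between two identifiers (symmetric)."""
    return a ^ b


if __name__ == "__main__":
    print([chord_finger_interval(10, i) for i in range(M)])
    print(chord_distance(10, 200), chord_distance(200, 10))
    print(kademlia_distance(10, 200), kademlia_distance(200, 10))
```

A LookupPeer-style routine would use one of these distance functions to decide which peers belong in each routing-table slot, which is why the method has to be customized per overlay.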
16
Meeting the requirements (2)
• Node heterogeneity
  – peers (super nodes) and clients (ordinary nodes)
  – peer vs. client decision left to the system designer
• NAT traversal
  – a node encodes its host, NAT, and a relay IP address in every message
  – then performs connectivity checks
• Bootstrap and join
  – Bootstrap, Join, Leave methods
  – Bootstrap server
[Figure: P2P overlay of nodes A, B, C, and D, some behind NATs / firewalls and one portable; a new node contacts the bootstrap server and node A.]
Covered so far: Multiple overlay algorithms, Node heterogeneity, NAT traversal, Bootstrap and join
17
Meeting the requirements (3)
• Resilience
  – KeepAlive method
• Message reliability
  – hop-by-hop, e2e
  – ACK-based mechanism for unreliable transports
• Request routing
  – recursive vs. iterative
  – specified per message or per overlay
[Figure: a message routed across the P2P overlay of nodes A, B, C, and D.]
Covered so far: Multiple overlay algorithms, Node heterogeneity, NAT traversal, Bootstrap and join, Resilience, Message reliability, Request routing
18
Meeting the requirements (4)
• Data model
  – Publish, Lookup, Replicate methods
  – flexible data model
    • key / type-value pairs
    • data integrity (hash)
• Security
  – identity (user, nodes)
    • enrollment server, X.509 certificates
    • Enroll method
  – message confidentiality
    • TLS, DTLS
• Monitoring and diagnostics gathering (e.g., CPU, uptime)
  – GetDiagnostics method
[Figure: P2P overlay of nodes A, B, C, and D with an enrollment server, and the record layout: Resource-ID; Type 1; Sub-type 1 and Sub-type 2, each with a Value; Signature. Example labels: Phone record, Desktop phone, IP addr:port, 386af6194c4d.]
Covered so far: Multiple overlay algorithms, Node heterogeneity, NAT traversal, Bootstrap and join, Resilience, Message reliability, Request routing, Data model, Security, Monitoring and diagnostics
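The key / type-value record with a hash for integrity can be pictured with a small data structure. This is a hypothetical sketch, not the P2PP wire format: the `Record` class, its field names, and the choice of SHA-256 over a JSON encoding are all assumptions, and a plain hash stands in for the signature shown on the slide.

```python
import hashlib
import json
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Record:
    """Hypothetical record: one typed value stored under a resource ID."""
    resource_id: str  # key under which the record is published
    rtype: str        # e.g. "phone record"
    subtype: str      # e.g. "desktop phone"
    value: str        # e.g. "IP addr:port"
    digest: str       # integrity hash over the four fields above


def _digest(resource_id: str, rtype: str, subtype: str, value: str) -> str:
    payload = json.dumps([resource_id, rtype, subtype, value]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_record(resource_id: str, rtype: str, subtype: str, value: str) -> Record:
    return Record(resource_id, rtype, subtype, value,
                  _digest(resource_id, rtype, subtype, value))


def verify(record: Record) -> bool:
    """Recompute the hash and compare it with the stored digest."""
    expected = _digest(record.resource_id, record.rtype,
                       record.subtype, record.value)
    return record.digest == expected


if __name__ == "__main__":
    rec = make_record("alice", "phone record", "desktop phone", "192.0.2.10:5060")
    print(verify(rec))                                   # intact record
    print(verify(replace(rec, value="192.0.2.99:5060")))  # tampered record
```

A hash alone only detects accidental or unauthenticated modification; binding the record to an identity would need a signature made with the X.509 credentials the slide mentions.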
19
• Design summary
  – methods for implementing the common aspects of overlay algorithms
  – overlay algorithm defines components of specific methods
  – separation of mechanism vs. policy
• P2PP now part of RELOAD protocol being standardized in the IETF
• Limitations
  – not a replacement for network file systems
    • no permissions, stores ephemeral data
  – does not replace delay-tolerant network protocols
• Does it work in practice?
P2PP – summary
20
OpenVoIP – a P2PP proof of concept
[Figure: nodes A through F, some behind NATs / firewalls, form two P2P overlays (Overlay 1 and Overlay 2), with a bootstrap server and a monitoring server / Google Maps view.]
SIGCOMM (demo) 2008
21
OpenVoIP – key facts and lessons learned
• 1000-node network on ~500 PlanetLab machines
• DHTs: Kademlia, Bamboo, Chord
• App: Windows XP / Vista, Linux
• Code used and modified by Ericsson Labs, Nokia Labs, Telecom Italia, and many universities around the world
• Lessons learned
  – DHT-specific part is only 10-15% of the total code
  – want to test a new p2p protocol?
    • use the library provided
22
• Introduction
• Related work
• Protocol and system design
• Reliability analysis
• Energy efficiency study of VoIP systems
• Conclusion
Outline
23
Reliability of P2P comm. systems
• Reliability = proportion of completed calls (e.g., 99.9%)
• Goals
  – understand reasons for call failure
  – devise techniques to address them
• Reasons for call failure
  – (1) distributed search fails to find online callee
  – (2) distributed search fails to find a suitable relay
  – (3) relay fails during voice/video session
IPTCOMM’2010
Recall: up to 40% of VoIP calls in the Internet need relaying
24
Understanding reliability of relayed calls (1)
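For a desired reliability, what is the minimum number of relays k per call?
• Model
  – when the ith relay fails, the call is switched to the (i+1)st relay, which is instantly selected from the global pool of all relays
  – $R_i$: residual lifetime of a relay candidate (i.i.d.)
  – $D$: call duration
$$\text{Desired rel} \le P\left(\sum_{i=0}^{k} R_i > D\right)$$
Qualitatively: if node lifetime >> call duration, k is small, and vice versa
[Figure: timeline of a call of duration D covered by consecutive relay lifetimes R_1, …, R_{k-1}, R_k (relays 1, 2, …, k-1, k); desired reliability 99.9%.]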
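The model can be explored numerically for any lifetime distribution by simulation. The sketch below is a minimal Monte Carlo estimate of the probability that k relays, used one after another, outlast a call. It assumes exponential relay lifetimes and exponential call durations purely for illustration; the function names, trial count, and seed are hypothetical.

```python
import random


def call_survives(k: int, mean_lifetime: float, mean_call: float,
                  rng: random.Random) -> bool:
    """One simulated call: do k consecutive relay lifetimes cover it?"""
    duration = rng.expovariate(1.0 / mean_call)
    covered = 0.0
    for _ in range(k):
        covered += rng.expovariate(1.0 / mean_lifetime)
        if covered > duration:
            return True
    return False


def estimate_reliability(k: int, mean_lifetime: float, mean_call: float,
                         trials: int = 100_000, seed: int = 1) -> float:
    """Fraction of simulated calls that complete with at most k relays."""
    rng = random.Random(seed)
    completed = sum(call_survives(k, mean_lifetime, mean_call, rng)
                    for _ in range(trials))
    return completed / trials


if __name__ == "__main__":
    for k in (1, 2, 3, 5):
        print(k, estimate_reliability(k, mean_lifetime=3.0, mean_call=1.0))
```

Swapping `expovariate` for another sampler (for example a Pareto draw) is the only change needed to try a different lifetime distribution.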
25
Understanding reliability of relayed calls (2)
Exponential node lifetimes: $99.9\% \le 1 - \left(\frac{\lambda}{\lambda+\nu}\right)^{k}$, with $\lambda$ = 1 / mean node lifetime and $\nu$ = 1 / mean call duration

Mean node lifetime / mean call duration | Min # of relays k
6 | 4
3 | 5
1 | 10

Skype node lifetimes (lifetimes approximated as Pareto): 12 hours (mean), 4 hours (median); mean call holding time = one hour

Min # of relays k
3

• For one-hour Skype calls, a minimum of 3 relays is needed to maintain a 99.9% success rate
• 95% of Skype relay calls last less than 1 hour
What if the system does not have enough relays?
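For exponential lifetimes the minimum k can be computed directly. The sketch below assumes relay lifetimes with rate λ and call durations with rate ν, both exponential, so each relay fails before the call ends with probability λ/(λ+ν) and the call completes with probability 1 − (λ/(λ+ν))^k. The function names are illustrative; under these assumptions the three rows of the exponential table come out as 4, 5, and 10.

```python
def completion_probability(k: int, lifetime_to_call_ratio: float) -> float:
    """P(call completes with at most k consecutive relays), exponential case.

    lifetime_to_call_ratio = mean node lifetime / mean call duration.
    """
    nu = 1.0                              # call-end rate (time unit: mean call)
    lam = 1.0 / lifetime_to_call_ratio    # relay failure rate in the same unit
    return 1.0 - (lam / (lam + nu)) ** k


def min_relays(lifetime_to_call_ratio: float, target: float = 0.999) -> int:
    """Smallest k whose completion probability reaches the target."""
    k = 1
    while completion_probability(k, lifetime_to_call_ratio) < target:
        k += 1
    return k


if __name__ == "__main__":
    for ratio in (6, 3, 1):
        print(ratio, min_relays(ratio))
```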
26
Approaches for addressing the reliability of relayed calls
• Approaches
  – No-replacement (NR)
    • select k relays at the beginning of a call
    • do not replace failed relays
    • $P(R_{NR} > D) = 1 - P(\max(R_1, \ldots, R_k) < D)$
  – With-replacement (WR)
    • select k relays at the beginning of a call
    • replace failed relays after μ
    • $P(R_{WR} > D)$
  – Skype uses a 2-relay with-replacement scheme
• Model
  – complicated for arbitrary distributions
  – for exponential lifetimes, I used Markov analysis (pure death process)
(1) Why not make k arbitrarily large?
(2) Isn’t WR always better than NR?
27
Comparing the approaches for reliability of relayed calls
(1) Why not make k arbitrarily large, i.e., add more relays?
  – diminishing returns
  – liveness checks overhead
(2) Isn’t WR always better than NR?
  – yes, but the percentage improvement gains vary
  – depends on mean lifetime, call duration, repair time
Skype: mean lifetime = 12 hours, median = 4 hours; 2-relay with-replacement; search time = 60 s

Num relays | MTTF improvement
2 | 50%
3 | 22%
4 | 13%
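One way to see the diminishing returns is the no-replacement case with exponential lifetimes, where the expected time until all k relays have failed is the mean lifetime times 1 + 1/2 + … + 1/k. The sketch below computes the marginal gain of the k-th relay under that assumption. It yields 50% and about 22% for k = 2 and 3, and about 13.6% for k = 4; whether the table above was derived this way is an assumption, and the function names are hypothetical.

```python
from fractions import Fraction


def mttf_no_replacement(k: int, mean_lifetime: Fraction = Fraction(1)) -> Fraction:
    """Expected time until all k relays fail (i.i.d. exponential lifetimes)."""
    return mean_lifetime * sum(Fraction(1, j) for j in range(1, k + 1))


def marginal_improvement(k: int) -> Fraction:
    """Relative MTTF gain from adding the k-th relay (k >= 2)."""
    return mttf_no_replacement(k) / mttf_no_replacement(k - 1) - 1


if __name__ == "__main__":
    for k in (2, 3, 4, 5):
        print(k, float(marginal_improvement(k)))
```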
28
Outline
• Introduction
• Related work
• Protocol and system design
  – P2PP
  – OpenVoIP
• Reliability analysis
• Energy efficiency study of VoIP systems
• Conclusion
29
Energy efficiency study
• Goals
  (1) Where is energy consumed in IP-telephony systems?
  (2) How do different design choices (p2p vs. client-server) affect the energy consumption?
  (3) How can we make IP-telephony more energy-efficient?
• IP-communication system classification
  – p2p vs. client-server
  – PSTN replacement (always on, providing emergency calling) vs. communication addendum
GreenNetworking’2010
30
Sources of energy consumption in IP-communication systems
• End-point
  – handsets
  – VoIP conversion boxes
  – PCs
  – NATs and firewalls
• Core
  – signaling / directory
  – media relaying
  – PSTN / mobile gateways
  – cooling
    • power usage effectiveness (PUE): ratio of data center power draw to IT power draw
• Network
  – joules per bit
31
Approach
• Data (from a client-server VoIP provider)
  – 100K users (mostly business)
  – 15 calls per second (CPS) peak
  – ~5K calls in system
  – NAT keep-alive traffic
  – all calls relayed
• Modeling
  – P2P
  – client-server
• Measurements
  – End points
    • desktop clients
    • laptop clients
    • hardware SIP phones
    • Skype peers
  – Core
    • SIP server
    • relay server
32
(1) Where is energy consumed? PSTN replacement
• VoIP servers consume less than 0.06% of total!
  – 1 server for 500k users: ~200 W
  – 1 server for 50k simultaneous calls: ~200 W
  – 500k phones, each phone ~5-7 W
  – even after a redundancy factor of 2 and a conservative PUE of 2!
Make PSTN replacement green? Reduce end-device energy consumption
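The arithmetic behind the server share can be sketched as follows. The function name and the choice of 6 W per phone (the midpoint of the 5-7 W range) are assumptions for illustration; with those inputs the servers account for roughly 0.05% of total power.

```python
def server_share(users: int, watts_per_phone: float, servers: int,
                 watts_per_server: float, redundancy: float, pue: float) -> float:
    """Fraction of total system power drawn by the servers."""
    server_watts = servers * watts_per_server * redundancy * pue
    phone_watts = users * watts_per_phone
    return server_watts / (server_watts + phone_watts)


if __name__ == "__main__":
    # Two ~200 W servers (signaling and relay), redundancy 2, PUE 2,
    # against 500k hardware phones at an assumed 6 W each.
    share = server_share(users=500_000, watts_per_phone=6.0, servers=2,
                         watts_per_server=200.0, redundancy=2.0, pue=2.0)
    print(f"{share:.4%}")
```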
33
(1) Where is energy consumed? Non-PSTN replacement
• Typically run on desktops and laptops as soft phones
• If the soft phone draws little additional power
  – still likely that the end-device is the biggest component
  – but may not dominate consumption
• If users leave PCs on just as phones
  – possibly even worse than PSTN!
34
(2) Client-server vs. peer-to-peer?
• Client-server model
  – C/S power consumption: $p_{c/s}$ = #servers × Watts/server × redundancy factor × PUE
• P2P model
  – S super nodes active
  – $p_s$ Watts per super node
• P2P is more energy efficient when: S × $p_s$ < $p_{c/s}$
  – one active super node per relayed call (Skype)
  – 30% of calls relayed
  – super nodes are 1.5% of total nodes
  – $p_s$ < 162 mW
P2P may consume more than client-server!
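The comparison reduces to a break-even power budget per super node. The sketch below encodes the two formulas from the slide; the input values in the example (2 servers at 200 W, 1500 super nodes) are illustrative assumptions and are not the inputs behind the 162 mW figure.

```python
def client_server_power(servers: int, watts_per_server: float,
                        redundancy: float, pue: float) -> float:
    """p_c/s = #servers * Watts/server * redundancy factor * PUE."""
    return servers * watts_per_server * redundancy * pue


def break_even_super_node_power(p_cs: float, active_super_nodes: int) -> float:
    """Largest extra power per super node at which P2P still breaks even."""
    return p_cs / active_super_nodes


def p2p_is_more_efficient(active_super_nodes: int, watts_per_super_node: float,
                          p_cs: float) -> bool:
    """Condition from the slide: S * p_s < p_c/s."""
    return active_super_nodes * watts_per_super_node < p_cs


if __name__ == "__main__":
    p_cs = client_server_power(servers=2, watts_per_server=200.0,
                               redundancy=2.0, pue=2.0)
    s = 1500  # e.g. 1.5% of 100k nodes acting as super nodes
    print(p_cs, break_even_super_node_power(p_cs, s))
    print(p2p_is_more_efficient(s, 0.162, p_cs))
```

Because the budget per super node is so small, even a modest increase in a PC's draw while relaying can tip the balance toward client-server, which is the point of the slide's closing line.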
35
(3) How can we make IP-telephony greener?
• Phones
  – make phones energy efficient
    • LCD, processor, Wake-on-LAN for phones?
• PCs
  – wake up on receiving calls
• NATs and firewalls
  – eliminate NATs (IPv6, at least in theory)
36
• Protocol and system design
  – first standardized and interoperable protocol for building p2p communication systems
    • Peer-to-peer protocol (P2PP) and RELOAD [IETF I-Ds 2007 and 2010]
    • OpenVoIP – a P2PP proof-of-concept system [SIGCOMM’08 demo]
• Reliability
  – a simple model to investigate the reliability of p2p communication systems [IPTCOMM’10]
• Session quality
  – a comprehensive study of TCP’s feasibility for real-time traffic [SIGMETRICS’08]
• Energy efficiency
  – first energy-efficiency study of VoIP systems [GreenNetworking’10]
• Measurement
  – novel techniques for analyzing the workings of p2p communication systems [GI’08, Infocom’06]
Conclusion
37
Publications
Journal and magazine
• Eli Brosh, Salman A. Baset, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne, The Delay-Friendliness of TCP for Real-time Traffic, IEEE/ACM Transactions on Networking, Accepted.
• Salman A. Baset and Henning Schulzrinne, Reliability and Relay Selection in Peer-to-Peer Communication Systems, in submission.
• Salman A. Baset and Henning Schulzrinne, Making Peer-to-Peer Video Conferencing Work, in submission.
Conference and workshop
• Salman A. Baset, Joshua Reich, Jan Janak, Pavel Kasparek, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne, How Green is IP-telephony?, in Proc. of SIGCOMM Green Networking workshop, New Delhi, India, August 2010.
• Salman A. Baset and Henning Schulzrinne, Reliability and Relay Selection in Peer-to-Peer Communication Systems, in Proc. of IPTCOMM, Munich, Germany, August 2010 (Best paper).
• Omer Boyaci, Andrea Forte, Salman A. Baset, and Henning Schulzrinne, vDelay: A Tool to Measure Capture-to-Display Latency and Frame Rate, in Proc. of International Symposium on Multimedia (ISM), San Diego, CA, USA, December 2009.
• Katerina Argyraki, Salman A. Baset, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Eddie Kohler, Maziar Manesh, Sergiu Nedevschi, and Sylvia Ratnasamy, Can Software Routers Scale, in Proc. of second PRESTO workshop, Seattle, WA, USA, August 2008.
• Eli Brosh, Salman A. Baset, Dan Rubenstein, and Henning Schulzrinne, The Delay-Friendliness of TCP, in Proc. of ACM SIGMETRICS, Annapolis, MD, USA, June 2008.
• Wookyun Kho, Salman A. Baset, and Henning Schulzrinne, Skype Relay Calls: Measurements and Experiments, in Proc. of IEEE Global Internet Symposium, Phoenix, AZ, USA, April 2008.
• Salman A. Baset and Henning Schulzrinne, An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol, in Proc. of IEEE INFOCOM, Barcelona, Spain, April 2006.
• Kishore Dhara, Salman A. Baset, Venkatesh Krishnaswamy, Dynamic Peer-To-Peer Overlays for Voice Systems, in Proc. of 3rd IEEE Workshop on Mobile Peer-to-Peer Computing, Pisa, Italy, March 2006.
Demo
• Omer Boyaci, Andrea Forte, Salman A. Baset, and Henning Schulzrinne, vDelay: A Tool to Measure Capture-to-Display Latency and Frame Rate, in Proc. of International Symposium on Multimedia (ISM), San Diego, CA, USA, December 2009.
• Salman A. Baset, Gaurav Gupta, and Henning Schulzrinne, OpenVoIP: An Open Peer-to-Peer VoIP and IM System, in Proc. of SIGCOMM (demo), Seattle, WA, August 2008.
Poster
• Salman A. Baset, Eli Brosh, Vishal Misra, Dan Rubenstein, and Henning Schulzrinne, Understanding the Behavior of TCP for Real-time Workloads, in Proc. of CoNEXT, Lisbon, Portugal, December 2006.
Internet drafts
• Cullen Jennings, Bruce Lowekamp, Eric Rescorla, Salman A. Baset, and Henning Schulzrinne, Resource Location and Discovery (RELOAD), Internet Draft, draft-ietf-p2psip-base-11 (work-in-progress), October 2010.
• Cullen Jennings, Bruce Lowekamp, Eric Rescorla, Salman A. Baset, and Henning Schulzrinne, A SIP Usage for RELOAD, Internet Draft, draft-ietf-p2psip-sip-05 (work-in-progress), July 2010.
• Salman A. Baset and Henning Schulzrinne, TCP-over-UDP, Internet Draft, draft-baset-tsvwg-tcp-over-udp-01 (work-in-progress), June 2009.
• Salman A. Baset, Henning Schulzrinne, and Marcin Matuszewski, Peer-to-Peer Protocol (P2PP), Internet Draft, draft-baset-p2psip-p2pp-01, November 2007.
38
References
[Bryan06AAA] David A. Bryan, Bruce Lowekamp, and Cullen Jennings, SoSIMPLE: A SIP/SIMPLE based P2P VoIP and IM system, in Proc. of AAA workshop, Orlando, FL, USA, July 2005
[Godfrey06Sigcomm] P. Brighten Godfrey, Scott Shenker, and Ion Stoica, Minimizing churn in distributed systems, in Proc. of SIGCOMM, Pisa, Italy, August 2006.
[Leonard07ToN] Derek Leonard, Zhongmei Yao, Vivek Rai, and Dmitri Loguinov, On lifetime-based node failure and stochastic resilience of decentralized peer-to-peer networks, in IEEE/ACM Transactions on Networking, June 2007.
[Nedevschi08Hotpower] Sergiu Nedevschi, Jitendra Padhye, and Sylvia Ratnasamy, Hot data centers vs. cool peers, in Proc. of HotPower, San Diego, CA, USA, December 2008.
[Rhea05Sigcomm] Sean Rhea, OpenDHT: A publicly accessible DHT service, PhD thesis, University of California at Berkeley, Berkeley, CA, USA, 2005.
[Singh06NOSSDAV] Kundan Singh and Henning Schulzrinne, Peer-to-peer Internet telephony using SIP, in Proc. of NOSSDAV, Stevenson, WA, USA, June 2005.
[Valancius09CoNEXT] Vytautas Valancius, Nikolaos Laoutaris, Laurent Massoulie, Christophe Diot, and Pablo Rodriguez, Greening the Internet with nano data centers, in Proc. of CoNEXT, Rome, Italy, December 2009.
39
Backup
40
IP-based communication systems
• Basic services
  – establish voice, video, IM sessions
  – voicemail
• Advanced services
  – conferencing, telepresence
  – voicemail to text
Client-server Peer-to-Peer
41
Client-server IP communication system
[Figure: two user agents REGISTER their IP addresses with a SIP registrar / proxy / presence server; (1) signaling passes through the server and (2) media (voice, video, IM) flows between the user agents; an IP-PSTN gateway connects to the PSTN / mobile network.]
Utopian Internet: no NATs or firewalls
42
Client-server IP communication system
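[Figure: two user agents, each in a network behind a NAT / firewall, connect through a SIP registrar / proxy / presence server and a media server. Inset: a NAT rewrites a packet's private source address (Pr-IP) to a public address (Pub-IP), also known as the server-reflexive address, while the destination address (Dst-IP) is unchanged.]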
43
• P2P file-sharing systems
  – tit-for-tat
  – open NAT ports
  – reduce download rate of files for nodes behind NATs
• P2P communication systems
  – no tit-for-tat
  – opening NAT ports is a hassle
  – cannot reduce rate, will impact quality
P2P: communication vs. file sharing
44
Percentage of VoIP calls in the Internet that need relaying?
• the provider knows
• some client-server VoIP providers relay all calls
• 15-20% of calls for a commercial client-server IM / VoIP application
• Microsoft Messenger: ~40%
• 341 relayed calls in 20 days for Skype [Suh05Infocom]: ~17 per day for a super node (~50K super nodes)
• NAT studies
45
Protocol and system design
46
Shared and different aspects
Shared aspects
• Connectivity
  – NAT traversal
  – bootstrap
• Resilience
  – recovery from node churn
• Request routing
  – recursive vs. iterative
  – parallel vs. sequential
• Heterogeneity of nodes
  – mobile, desktop
  – super node vs. ordinary node
• Security
  – identity (user, nodes)
  – message confidentiality
• Data model
  – addressing, storage, integrity
• Message reliability
  – hop-by-hop, e2e
Different aspects
• Next-hop determination
  – depends on the overlay algorithm
  – Chord, Kademlia, Gia
  – proximity aware, etc.
[Figure: two message sequence charts showing Request / Response exchanges among nodes A, B, and C.]
• Methods for implementing the common aspects
• Overlay algorithm defines components of specific methods
Now part of RELOAD protocol being standardized in the IETF
47
Peer-to-peer protocol (P2PP)
• Now part of RELOAD protocol being standardized in the IETF
• Not a new DHT or bit-encapsulation
• Geared towards IP telephony but applicable to streaming, VoD, etc.
• A request / response binary protocol
  – Shared methods
    • Join, Leave, Publish, Lookup, KeepAlive, etc.
  – Overlay-specific methods
    • FindPeer, ExchangeTable
• Supports different overlay algorithms (Chord, Kademlia, etc.)
• Application-level API
• Security
  – enrollment server, X.509 certificates
  – TLS, DTLS for message confidentiality
IETF P2PSIP working group
[Figure: protocol stack of a node; layers labeled SIP, P2PP, ICE, TLS / SSL, and API.]
48
Peer-to-peer protocol (P2PP)
• Node heterogeneity
  – peers (super nodes) and clients (ordinary nodes)
  – decision left to the system designer
  – use of peers as relays
• NAT traversal built-in
  – a node exchanges its host, NAT, and a relay IP address in requests and responses
  – then uses ICE (Interactive Connectivity Establishment) for NAT traversal
• Message reliability
  – hop-by-hop, e2e
• Data model
  – key / value pairs
  – data integrity
• Monitoring and diagnostics gathering
[Figure: record layout. A Resource-ID holds typed entries (Type 1, each with Value 1 and Value 2) and a Signature.]
49
Implementation design
[Figure: implementation architecture with lines of code (LoC) per module. Modules: Sys, Transport / timers (UDP, TCP, DTLS, TLS), Parser / encoder, Transactions, Node, Client, Bootstrap, KadPeer, BambooPeer, Diagnostic, Routing table, Neighbor table, Distance, BigInt, Data storage (1946), NAT (3400), Other (2771). Application API: publish (key, value, callback), lookup (key, callback), callback (resp). Annotations: multiplatform, app. pluggability. Non-DHT LoC: 15783.]
50
• Is there any such thing as the ‘best’ DHT?
• Chord is widely cited but not widely deployed
• DHTs are parameterized
  – base, hash algorithm
  – symmetric vs. asymmetric distance
    • Chord (modulo) vs. Kademlia
  – next-hop determination may be based purely on the DHT or on a combination of DHT and proximity-aware routing
  – debugging and deployment
Why not the ‘best’ DHT?
51
P2PP and RELOAD
• Commonalities
  – pluggable overlay algorithm
    • feasibility demonstrated in OpenVoIP
  – security
    • self-signed and CA-signed certificates
    • DTLS, TLS
    • data integrity
  – routing
    • recursive, iterative, direct response
  – NAT traversal
    • core part of the protocol
52
P2PP and RELOAD
• Differences
  – message model
    • P2PP: all messages can be routed in a recursive or iterative manner
    • RELOAD: only one message permitted for iterative
  – data model
    • P2PP: opaque blob, only the app can interpret the data
    • RELOAD: single value, array, dictionary
  – message fragmentation over unreliable transports
    • P2PP: use TCP
    • RELOAD: handle UDP
  – NAT traversal
    • RELOAD: explicit ‘connection’ establishment
    • P2PP: encode host, server-reflexive, and relay addresses in every message
53
Skype using P2PP?
• Why not open the Skype protocol?
  – sure, but
  – Skype protocol tied with VoIP
• To use P2PP, Skype will have to
  – abandon its own protocol
  – use SIP for call establishment
  – use TLS, DTLS for security
  – use STUN, TURN, ICE protocols for NAT traversal
54
OpenVoIP: geo+logical interface
55
OpenVoIP: lessons learned
• Bootstrap
  – maintain bootstrap nodes and ensure their availability
• Randomization is our best friend!
  – send the maintenance messages within a bounded random time
• Churn recovery
  – is on demand and periodic
    • insert a new entry in the routing table after checking liveness
    • periodically republish SIP records
  – not feasible for large records
• Avoid overly complex mechanisms
  – can backfire!
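Sending maintenance messages "within a bounded random time" is commonly done by adding jitter to a fixed period so that nodes do not synchronize. The sketch below shows that idea only; the function name and the 30 s period with up to 5 s of jitter are hypothetical values, not OpenVoIP settings.

```python
import random


def next_maintenance_delay(period_s: float, max_jitter_s: float,
                           rng: random.Random) -> float:
    """Delay until the next maintenance message: period plus bounded jitter."""
    return period_s + rng.uniform(0.0, max_jitter_s)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical schedule: every 30 s, spread over an extra 0-5 s.
    print([round(next_maintenance_delay(30.0, 5.0, rng), 2) for _ in range(5)])
```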
56
• Send video to every participant (NxN)
• But
  – uplink capacity
  – NAT and firewalls
  – downlink capacity
• Solution
  – centralized
    • costly, hardware-based
  – peer-to-peer
    • use helpers (idle Skype users)
    • construct an application layer multicast tree rooted at every participant
P2P video conferencing
57
P2P video conferencing
• Challenges
  – optimize latency or number of helpers or both (within a threshold)?
  – select helpers close to source or final recipients?
    • recipients behind NAT and firewalls?
  – helper churn
    • backup for every helper?
  – participant join and churn?
  – who searches for the helper?
    • root or new recipient?
  – share helpers across trees?
  – participants as helpers?
[Figure: example multicast trees with per-link latencies of 10 ms, 20 ms, 50 ms, and 150 ms.]
58
Number of helpers
Nine-party conference
• participant outdegree = 1, helper outdegree = 2: number of helpers per tree (stream) = 7, total helpers = 63
• number of helpers per tree = 4: total helpers = 36, down from 63!
[Figure: multicast trees; four labels reading 1/4.]
59
Number of helpers
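Total number of helpers, by number of participants and helper outdegree (2 / 3 / 4 / 5 / 6)

Participants | Participant outdegree = 3 | Participant outdegree = 1
3 | 0 / 0 / 0 / 0 / 0 | 3 / 3 / 3 / 3 / 3
4 | 4 / 4 / 4 / 4 / 4 | 8 / 4 / 4 / 4 / 4
6 | 12 / 6 / 6 / 6 / 6 | 24 / 12 / 12 / 6 / 6
10 | 60 / 30 / 20 / 20 / 20 | 80 / 40 / 30 / 20 / 20

• 6 helpers per tree; back up for every helper?
• participant outdegree = 1, helper outdegree = 2: number of helpers per tree (stream) = 7, total helpers = 63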
60
Number of helpers
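Total number of helpers, by number of participants and helper outdegree (2 / 3 / 4 / 5 / 6)

Participants | Participant outdegree = 3 | Participant outdegree = 1
3 | 0 / 0 / 0 / 0 / 0 | 3 / 3 / 3 / 3 / 3
4 | 4 / 4 / 4 / 4 / 4 | 8 / 4 / 4 / 4 / 4
5 | 5 / 5 / 5 / 5 / 5 | 15 / 10 / 5 / 5 / 5
6 | 12 / 6 / 6 / 6 / 6 | 24 / 12 / 12 / 6 / 6
7 | 21 / 14 / 7 / 7 / 7 | 35 / 21 / 14 / 14 / 7
8 | 32 / 16 / 16 / 8 / 8 | 48 / 24 / 16 / 16 / 16
9 | 45 / 27 / 18 / 18 / 9 | 63 / 36 / 27 / 18 / 18
10 | 60 / 30 / 20 / 20 / 20 | 80 / 40 / 30 / 20 / 20

• 6 helpers per tree
• back up for every helper?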
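The helper counts in the table are consistent with a simple counting rule, sketched below. The rule is inferred from the numbers and is not stated on the slides: each participant roots one tree, one of the source's outgoing slots is assumed to be reserved for a helper whenever helpers are needed, and a helper subtree serving L recipients with outdegree h needs ceil((L − 1) / (h − 1)) helpers, with a minimum of one. Function names are hypothetical.

```python
def helpers_per_tree(participants: int, participant_outdegree: int,
                     helper_outdegree: int) -> int:
    """Helpers needed in one multicast tree (inferred counting rule)."""
    recipients = participants - 1
    direct = participant_outdegree - 1  # assumed: one source slot feeds a helper
    if recipients <= direct:
        return 0
    via_helpers = recipients - direct
    # Internal nodes of a tree with `via_helpers` leaves and bounded outdegree:
    # ceil((leaves - 1) / (outdegree - 1)), but never fewer than one helper.
    needed = -(-(via_helpers - 1) // (helper_outdegree - 1))
    return max(1, needed)


def total_helpers(participants: int, participant_outdegree: int,
                  helper_outdegree: int) -> int:
    """One tree per participant, no helper sharing across trees."""
    return participants * helpers_per_tree(participants, participant_outdegree,
                                           helper_outdegree)


if __name__ == "__main__":
    for n in range(3, 11):
        po3 = [total_helpers(n, 3, h) for h in range(2, 7)]
        po1 = [total_helpers(n, 1, h) for h in range(2, 7)]
        print(n, po3, po1)
```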
61
Split the stream
• participant outdegree = 1, helper outdegree = 2: total helpers = 4 × 6 = 24
• participant outdegree = 1, helper outdegree = 2, split the stream = 3: total helpers = 3 × 6 = 18
62
Tree construction for video conferencing
• Split the stream
  – decreases helpers
  – gain increases as the helpers increase (>4)
• Source selects helpers close to itself
• Helper pool
Related work
  – ALM is 1->many; video conferencing is many->many
  – participant churn, bandwidth, managed servers as helpers
  – 1-hop tree construction without split, incorporating participants as helpers
63
Reliability analysis
64
Understanding reliability of relayed calls
• Model and simulations
  – event driven, $10^7$ calls
  – [synthetic] exponential, Pareto; [real] Skype lifetime data set
• Skype node lifetime data set (1,740 Skype nodes)
  – Skype (uptime mean ≈ 12 hours, median ≈ 4 hours)
    • approximated using shifted Pareto and exponential
    • 95% of relayed Skype calls are less than 60 minutes [Guha’06]
  – desired reliability = 99.9%
[Figure: plot annotations: 1-relay failure; error (15%).]
For 95% of Skype call durations, a minimum of 3 relays is needed to maintain a 99.9% success rate
65
Improving reliability of relayed calls
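• Approach 1: no-replacement
  – select k relays at the beginning of a call
  – do not replace failed relays
  – $P(R_{NR} > D) = 1 - P(\max(R_1, \ldots, R_k) < D)$
• Approach 2: with-replacement
  – select k relays at the beginning of a call
  – replace failed relays after μ
  – no failure during switch over
  – Skype uses a 2-relay with-replacement scheme
  – $\mathrm{MTTF} = 1/\lambda_{MTTF} = (3\lambda + \mu) / (2\lambda^2)$ [Bir04]
  – $\bar{F}_{R_{WR}}(t) \approx e^{-t/\mathrm{MTTF}}$ for $\mu \gg \lambda$
  – $P(R_{WR} > D) = \nu / (\nu + \lambda_{MTTF})$
[Figure: Markov chain over states 2, 1, 0 with transition labels 2λ, λ, μ, 1−2λ, and 1−(λ+μ); label: pure death process.]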
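The 2-relay with-replacement case can be evaluated numerically. The sketch below assumes the standard Markov model for two relays with repair: failure rate λ per relay, replacement rate μ, and mean time to failure (3λ + μ) / (2λ²). It also assumes an exponential call duration with rate ν and approximates the time to failure as exponential with mean MTTF. The example inputs (12-hour mean lifetime, 60 s search time, 1-hour mean call) are taken loosely from the Skype figures in this talk, and the function names are hypothetical.

```python
def mttf_two_relays_with_replacement(lam: float, mu: float) -> float:
    """Mean time until both relays are down, starting with two live relays."""
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


def mttf_by_first_step(lam: float, mu: float) -> float:
    """Same quantity from first-step equations, as a cross-check.

    T2 = 1/(2*lam) + T1
    T1 = 1/(lam + mu) + (mu/(lam + mu)) * T2
    """
    t2 = (1.0 / (2.0 * lam) + 1.0 / (lam + mu)) * (lam + mu) / lam
    return t2


def completion_probability(lam: float, mu: float, nu: float) -> float:
    """P(R_WR > D), approximating R_WR as exponential with mean MTTF."""
    lam_mttf = 1.0 / mttf_two_relays_with_replacement(lam, mu)
    return nu / (nu + lam_mttf)


if __name__ == "__main__":
    lam = 1.0 / 12.0  # relay failures per hour (12-hour mean lifetime)
    mu = 60.0         # replacements per hour (60 s search time)
    nu = 1.0          # call completions per hour (1-hour mean call)
    print(mttf_two_relays_with_replacement(lam, mu))
    print(completion_probability(lam, mu, nu))
```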
66
Distributed relay selection
• Goal: O(1) hop
• 2-level hierarchical network
[Figure: local-random scheme. A node behind a NAT asks a close-by node "Give me a relay" and receives "Here is a randomly selected relay"; nodes keep tables of IP address, RTT, and bandwidth. Other labels: 1-relay, search performance, dropped calls.]
67
Distributed relay selection
• Delay
• User annoyance
  – interference with user applications
  – file sharing (draft idle peers)
  – spare capacity
• random
• mindelay
  – select relay with minimum delay
• netmax
  – select relay with maximum spare bandwidth
• threshold
  – select relays with delay < 150 ms and maximum spare capacity
• Results
  – strategies perform similarly near the system collapse point
  – minimizing latency increases annoyance and the number of jobs per relay, and vice versa
  – threshold approach performs reasonably well
• Comparison with existing approaches
  – OneHop DHT
    • Efficient routing for peer-to-peer overlays [Gupta04NSDI]
  – direct comparison not possible
    • we do not create a one-hop DHT
    • we leverage the connectivity information of a peer
68
Distributed relay selection
69
Understanding TCP behavior for real-time traffic
70
Real-time traffic over TCP
• Why over TCP?
  – restrictive NATs and firewalls
• Why not?
  – TCP is likely to exhibit poor performance for VoIP and live video (first tried in the 1970s, but TCP has evolved)
• Our result:
  – acceptable performance for VoIP and video (streaming, conferencing) under certain conditions
• Why is VoIP and video over TCP feasible?
  – (1) Factors impacting delay
  – (2) Working region
[Figure: TCP (New Reno) carrying CBR real-time traffic. Labels: app limited, network limited, no loss, packet loss, fast retransmit / TO, backlog, backlog cleared.]
71
Factors impacting delay
(1) Packet size (small is better)
  – during backlog, VoIP packets (~200 bytes) can be combined in one MSS (~1500 bytes)
(2) Congestion window regulation (implicitly favors small packet size)
  – TCP regulates cwnd based on the number of ACKs received (ACK-counting)
  – for two flows with the same packet rate but different bit rates, this works in favor of the smaller bit rate
Delay friendliness for VoIP
• No Nagle or delayed ACKs
• ACK-counting
[Figure: congestion window growing from w to w+1; MSS = 1500 bytes vs. VoIP packets of 200 bytes.]
72
Factors impacting delay
• Byte-counting increases VoIP delays by 10-20%
• VoIP delays are significantly lower than video for the same packet rate
• TCP-induced delay: AIMD, HOL (head-of-line)
• VoIP: HOL dominates; video: AIMD dominates
  – (CBR) VoIP 64 kb/s (173 bytes/packet), video 573 kb/s (MSS bytes/packet)
Delay improvement
• VoIP: modify TCP recv()
• Video: use parallel connections or inflate the window
73
Working region
Application | Bit rate | Delay limit | RTT | Loss rate
VoIP | 64 kb/s | 200 ms | 100 ms | 2%
Video (streaming) | 573 kb/s | 5 s | 100 ms | 3%
Video (conferencing) | 573 kb/s | 500 ms | 100 ms | 1%

• Video conferencing has the most constrained region
• acceptable performance for VoIP
74
Playout buffer setting
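• Time to recover a lost packet
  – Fast retransmit: 1.5 RTT + 3/f
  – Timeout: 4 × RTT
[Figure: recovery timeline for packet i, with intervals labeled RTT, 3/f, and 0.5 RTT.]
• VoIP: RTT 100 ms, loss rate 2%
• Video (conf): RTT 100 ms, loss rate 1%
• Video (stream): RTT 100 ms, loss rate 2%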
75
Playout buffer setting
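• Time to recover a lost packet
  – Fast retransmit: 1.5 RTT + 3/f (CBR flows)
  – Timeout: 4 × RTT
[Figure: recovery timeline for packet i, with intervals labeled RTT, 3/f, and 0.5 RTT; curves for RTT = 100 ms with loss rate 1% and RTT = 100 ms with loss rate 3%.]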
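The two recovery times translate directly into a lower bound on the playout buffer. The sketch below evaluates both expressions; it assumes f is the packet rate of the CBR flow in packets per second, so 3/f is the time for three further packets to trigger duplicate ACKs. The function names and the 50 packets/s example rate are illustrative assumptions.

```python
def fast_retransmit_recovery(rtt_s: float, packet_rate_hz: float) -> float:
    """Time to recover a lost packet via fast retransmit: 1.5 RTT + 3/f."""
    return 1.5 * rtt_s + 3.0 / packet_rate_hz


def timeout_recovery(rtt_s: float) -> float:
    """Time to recover a lost packet via retransmission timeout: 4 * RTT."""
    return 4.0 * rtt_s


if __name__ == "__main__":
    rtt = 0.100  # 100 ms, as in the slide's examples
    f = 50.0     # assumed CBR packet rate: one packet every 20 ms
    print(fast_retransmit_recovery(rtt, f), timeout_recovery(rtt))
```

With these inputs fast retransmit recovers in about 210 ms and a timeout in 400 ms, which illustrates why the tightest delay limits leave little room once a loss occurs.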
76
Related work
• Supporting low-latency TCP-based media streams [IWQoS’02]
  – TCP stack modification at sender
• TCP-RC: a receiver-centered TCP protocol for delay-sensitive applications [MMCN’05]
  – TCP stack modification at receiver
77
Energy Efficiency
78
Does VoIP consume more energy than PSTN?
• Insufficient information
• Columbia phone system (presently)
  – system: 40K watts
  – cooling: 50K watts
  – phone lines: 13,848
  – per user: 6.4 W
• VoIP
  – Cisco phone: 5-7 W
79
Servers needed
Transport | NAT keep-alive | 100k | 1M | 10M | 100M | Watts / server
UDP | Yes, NOTIFY/s | 1 | 2 | 20 | 200 | 210
UDP | No | 1 | 1 | 10 | 100 | 190
TLS | No | 3 | 25 | 250 | 2500 | 209

% calls relayed | 100k | 1M | 10M | 100M
0% | 0 | 0 | 0 | 0
30% | 1 | 2 | 10 | 96
100% | 1 | 4 | 32 | 320

Server as % of total c/s
Transport | 100k | 1M | 10M | 100M
UDP (NAT) | 0.4% | 0.1% | 0.05% | 0.04%
UDP |
TLS | 0.2%
80
Skype
81
Measurement: Skype
• Super node, ordinary node, login server
• Actively protects against reverse engineering
  – LD_PRELOAD
  – forcing Skype to use a modified shared library
• Voice and video calls
  – relaying
  – over TCP
• Ports: no default listening port
  – opens port 80 (HTTP) and 443 (TLS)
• Contact list
  – stored centrally, initially distributed
• Video conferencing
  – using central servers
[Figure: neighbor relationships in the Skype network among ordinary hosts (SC), super nodes (SN), and the Skype login server; message exchange with the login server during login.]
INFOCOM’06
82
Is Skype free-riding on universities’ bandwidth?
• Two Skype clients at Columbia University forced to use a relay
• 6,000 relay calls
• Median latency: ~95 ms
• 46% of calls through relays with a .edu suffix
• 8% of calls through Columbia Skype users
• Is it deliberate?
  – probably not
  – relay selection is biased towards high-capacity nodes, which happen to be in universities
GI’08
[Figure: two clients in our lab, each behind a NAT.]
83
Future work
84
Directions for future research
• A holistic framework for reliability, performance, and energy tradeoffs in data centers
  – virtualization, consolidation
• Comparing VoIP and PSTN energy consumption
• Preventing data lock-in for social networks and cloud-based services
  – enabling seamless data migration across different cloud providers
  – holy grail: ‘one click’ data migration
85
Client-server IP communication system
[Figure: two user agents behind NATs / firewalls exchange (1) signaling through a SIP registrar / proxy server and (2) media (voice, video, IM) over UDP or TCP, through a media server; an IP-PSTN gateway connects to the PSTN / mobile network.]
What is centralized?
• directory service
• call signaling
• media session and conferencing
• PSTN connectivity
Peer-to-peer: distribute to user agents
Why is this a problem?
• Scaling for millions of users
  – servers
  – bandwidth costs
  – management overhead
• How many calls need media relaying?
  – 15-40%
  – some ISPs relay all calls