CDN and Traffic-structure
2
Outline
§ Basics CDN § Traffic Analysis
3
Outline
§ Basics CDN § Building Blocks § Services § Evolution
§ Traffic Analysis
4
A Centralized Web!
Slow § content must traverse multiple
backbones and long distances
Unreliable § delivery may be prevented by
congestion or backbone peering problems
Not scalable § usage limited by bandwidth
available at master site
Inferior streaming quality § packet loss, congestion, and
narrow pipes degrade stream quality
Source: Bruce Maggs, CCGrid 2001 Keynote
The old centralized server oriented web did not scale.
5
Content Delivery Network (CDN)
Content Delivery Networks (CDNs) emerged as a solution to Internet service degradation
§ Moving content to the “edge” of the Internet, close to end-users
Alternatives § Increased bandwidth, Web
caching, Web pre-fetching
CDN advantages § Reduced server loads § Distributed network traffic § Reduced latency
Source: Content Delivery Networks: Overlay Networks for Scaling and Enhancing the Web www.gridbus.org
First CDN emerged in 2001
6
Content Delivery Network (CDN) Basic Building Blocks of a CDN.
§ Infrastructure: servers that replicate the content of the server that publishes content (infrastructure, replication)
§ Redirection mechanism to
forward requests to servers, typically done through DNS
§ Content synchronization and
consistency: mechanism that guarantees that the content is fresh
§ Operation support
Basic Building Blocks
Simple building blocks, Elementary interaction with DNS needed
7
Content Delivery Network (CDN) Details of building blocks. Content delivery component
§ Origin server and a set of edge servers (surrogates) to replicate content
Request-routing component § Direct user requests to edge servers § Interact with the distribution
component to keep an up-to-date view of content
Content distribution component § Moves content from the origin to
edge servers and ensures consistency
Accounting component § Maintains logs of client accesses
and records usage of the servers § Assists in traffic reporting and
usage-based billing
CDN
Origin Server
Request Routing System
Distribution System
Replica Server 1
Replica Server N
`Web User1
Accounting System
`Web UserM
Billing Organization
CDNs can be realized on top of carriers IP networks: Easy traffic management by third parties
CDN functional components
Source: Content Delivery Networks: Overlay Networks for Scaling and Enhancing the Web www.gridbus.org
8
Content Delivery Network (CDN) CDN Supported Content and Services.
Sources of content § Large enterprises, Web service providers,
media companies, and news broadcasters Customers
§ Media and Internet advertisement companies, data centers, ISPs, online music retailers, mobile operators, consumer electronics manufacturers, and other carrier companies
User interaction § Cell phone, smart phone/PDA, laptop, and
desktop
Static content § Static HTML pages, images,
documents, software patches Streaming media
§ Audio, real-time video User Generated Video (UGV) Content services
§ Directory, e-commerce, file transfer services
CDN
Contents / services
Cell Phone
Smart phone / PDA
Laptop
Desktop
Music ( MP 3 ) / Audio
E - docs
Web Pages
Streaming media
Clients
9
Content Delivery Network (CDN). Content replication and synchronization.
Content outsourcing § Push: mostly for real-time, dynamic § Pull: static content
§ Non-cooperative pull-based § Upon cache miss, surrogate servers pull content from the origin server § Used by most CDN providers
§ Cooperative pull-based § Surrogate servers cooperate with each other to get the requested content in case of a cache miss § Only used by Coral CDN, making use of DHTs
CDN is highly decoupled from standard IP transport business
Content outsourcing mode
10
Location of servers: § close to customers (akamai model): put hardware inside the ISP (leased or not), advanced virtualization, acts as a cache/proxy for the ISP so reduces ISP backbone network utilization
§ well-connected data-centers (Google, Limelight): directly peer with large ISPs and at IXPs at strategic locations, not so ISP backbone friendly
Relation to peerings: § close to customers does not require
large peerings, only to populate the caches
§ large peerings that require timely re-negociations and SLAs
Sharing of infrastructure: § CDN dedicated to a single service/
customer base § general-purpose
Content Delivery Network (CDN). Deployment and redirection.
Highly decoupled from standard IP transport business
CDN Infrastructure Deployment Schemes
Static:
§ pre-allocation of user to a (set of) server (IPTV)
Dynamic: § load-based: load-balancing
across the CDN servers (irrespective of the network) (CDN optimization)
§ network-based: aware of network properties such as latency and proximity-based (to the user) (service-oriented optimization)
CDN Redirection Schemes
11
§ Usually CDN provider do not have detailed information about the entry point of the customer
§ Most of the incumbent are not active in the CDN market
§ Traffic of ISP strongly depend on activities of CDN provides and OTT
Easy change of traffic structure by third parties. CDN Evolutions.
Pre - evolutionary period Late 90 ' s 2002 2005 2007 2010 2010 onwards
Improved Web server
Static and Dynamic Content
Caching proxy deployment
Hierarchical caching
Server farms
Video on Demand , media
streaming , mobile CDNs
C h a n
g e d f o c
u s , i n c
r e a s e d
f u n c t i o
n a l a b
i l i t y ,
i m p r o v
e d p e r
f o r m a
n c e , u s
e r - c e n t
r i c
Community - based CDNs
Pre CDN Evolution First Gen.CDN
Second Gen.CDN
Third Gen.CDN
CDN: Is there an opportunity for incumbent operator?
12
Outline
§ Basics CDN § Traffic Analysis
§ Application MIX § Diversity of Traffic Sources Application – ISP collaboration
13
Application Mix. Which applications – DT network.
0
50
100
150
200
250
300
350
4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 1 2 3
Volume (GB)
UnclassifiedClassifiedNTTPHTTPEdonkeyBitTorrent
TOP Applications Based on Aug’09 Data 1. HTTP 63% 2. BitTorrent 9% 3. RTMP 3% 4. eDonkey 3% 5. NNTP 2% 6. SSL 2% 7. Shoutcast 1%
Traffic dominated by HTTP ~ 63%
14
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
§ HTTP dominates (57% of Bytes)
§ Content types used: § 25% of HTTP is Flash-
Video types, e.g., YouTube
§ 15% of HTTP is RAR-Archive, e.g. RapidShare
§ Top Domains include: § One-Click Hoster § Video streaming § Software Downloads/
Updates § No significant hiding/
tunneling via HTTP § HTTP dominance due to
popular high-volume content
Measurement results and analysis: Application mix. Application mix & HTTP by content types and Top domains.
HTTP Content-Types
Flash-video 25%
RAR-archive 15%
image 11%
video 8%
other 23%
unclassified 17%
TOP Domains (Aug’09) 1. 12.6%
2. 10.8%
3. 2.8%
4. 2.2%
5. 1.8%
6. 1.5%
7. 1.4%
8. 1.4%
9. 1.3%
10. 1.1%
11. 1.0%
15
Application mix: Which applications – comparison to other networks?
16
Application mix: Residential Traces Details (Maier 2009).
Source: Maier et al, IMC’09
17
Outline
§ Basics CDN § Traffic Analysis
§ Application MIX § Diversity of Traffic Sources Application – ISP collaboration
18
Internet Traffic Characteristics. Location diversity . .
0%
20%
40%
60%
80%
100%
ISP Arbor (global) Sandvine (global)
Textbox Headline Internet and P2P Traffic Characteristics
HTTP
Other
P2P
Proportion of P2P traffic in Internet decreasing. HTTP is the dominant traffic. High location diversity of important content.
Internet Traffic Distribution 2009
Textbox Headline Location Diversity of HTTP Content
0
20
40
60
80
100
120
75% 50% 25% 10%
Availability at # locatio
ns
19
Internet Traffic Characteristics.
§ Top-10 Applications or Content Providers are responsible for around 50% the HTTP traffic.
§ More than 60% of the HTTP traffic can be download from at least 3 different locations
Consolidation of Content Diversity of Paths
20
Diversity of content Public peering points: Google.
General trend for huge OTT provider: Peering close to the customer, bypassing of tier 1 level.
21
Diversity of content Public peering points: Akamai
22
Traffic Engineering Opportunities. Opportunities for redirection.
5
80% of HTTP traffic can be re-directed
DNS queries of the top 10,000 hosts by volume as monitored in residential network of 20,000 DSL users.
Opportunity from for DNS usage: 80% of HTTP traffic can be re-directed
23
Client
External DNS
Provider DNS
Internet Service Provider
(ISP)
Servers
1
2
3
4
5
Traffic redirection via DNS
24
Download times – Example: CDN
Zum Teil suboptimale Downloadzeiten!
25 Client
Host A
Host B
Host C
Traffic engineering via server selection
26
Traffic engineering via server selection
Client
Host A
Host B
Host C
27
Traffic engineering via server selection
Client
Host A
Host B
Host C
28
Client
External DNS
Provider DNS
Internet Service Provider
(ISP)
Host
1
2
3
4
6 PaDIS
5
Provider-aided Distance Information System
Server selection: How?