Towards a Transparent and Proactively-Managed Internet
Ehab Al-Shaer
School of Computer Science
DePaul University
Yan Chen
EECS Department
Northwestern University
Motivations
• The Internet has evolved into an uncooperative, ossified network of networks
  – Each network has to be treated as a black box
    » Performance of even neighboring networks is opaque
    » Inter-domain routing is based on policies, not performance
    » Applications have to resort to overlay networks, which are suboptimal
  – Diagnosis and fault location are extremely hard
• Network configuration management is reactive and expensive
  – Reactive configurations: tune after deployment
  – Vulnerable: manually handled and subject to conflicts
  – Imperative and fragmented: several specific devices must be accessed in order to implement a service goal
Proposed Solution I: Transparent Internet
• Every network shares its measurement and management information with other networks when necessary (glass box)
  – Link-level performance: delay, loss rate, available bandwidth, etc.
  – Management info
    » Configuration: QoS settings, traffic policing
    » Middlebox settings: firewalls, etc.
• The information sharing
  – As part of the inter-domain protocols: Transparent Gateway Protocols (TGP)
  – Other applications: leverage DHT
Analogy to the Airline Alliance
• When airlines compose multi-leg flights, they need more than just route info
  – Type of aircraft, # of vacancies, probability of punctuality, etc.
• Such an open model is mutually beneficial
  – Provides the best flight composition for clients
  – Similarly, an open network model can provide the best communications for applications
Proposed Solution II: Proactive Configuration Management
• Proactive verification: configurations are verified and translated for different vendor-specific devices
• Proactive validation: configuration changes are tested on real archived network traffic without interrupting the main operational network
• Autonomic configuration: configurations are auto-tuned dynamically to achieve the “objectives”
[Figure: configuration management cycle with stages Defining, Verifying, Deploying, Evaluating, Optimizing. Labels: Validation; Dynamic Validation: auto-tuning]
Objectives
Provides a completely transparent view of the Internet to networks and applications
• Diagnosis and troubleshooting become extremely easy
  – No more Internet tomography needed
• Flexible inter-domain routing
  – Not just based on policy or # of ASes/hops
  – Flexible metrics based on bandwidth, latency, etc.
• Global traffic engineering
  – Each AS performs its own local traffic engineering
  – Provide AS path-level routing guide
• Unified framework that applications query (push/pull) for info as needed
  – Streaming media, content distribution
  – Anomaly/security applications
Flexible Inter-domain Routing
• Multiple routing paths with TGP
  – Incorporate measurement info into AS paths
  – Bandwidth-intensive and latency-sensitive applications can take different AS paths
• Challenge: inter-domain routing based on bandwidth without making reservations
• Solution: discretize the bandwidth for better stability
  – Though stability is a classical problem, not unique to TGP
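The idea of metric-aware AS paths with discretized bandwidth can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration, not part of TGP: the graph layout, the `bw` (Mbps) and `latency` (ms) attribute names, the 100 Mbps step, and the function names. Bandwidth is rounded down to a coarse step so that small fluctuations in a measurement do not change the selected path.

```python
import heapq

def discretize(bw_mbps, step_mbps=100):
    """Round available bandwidth down to a multiple of step_mbps (hypothetical step)."""
    return int(bw_mbps // step_mbps) * step_mbps

def widest_path(graph, src, dst, step_mbps=100):
    """Return (bottleneck, path) maximizing the discretized bottleneck bandwidth."""
    best = {src: float("inf")}
    heap = [(-float("inf"), src, [src])]
    while heap:
        neg_bw, node, path = heapq.heappop(heap)
        bw = -neg_bw
        if node == dst:
            return bw, path
        if bw < best.get(node, 0):
            continue  # stale heap entry
        for nbr, attrs in graph.get(node, {}).items():
            cand = min(bw, discretize(attrs["bw"], step_mbps))
            if cand > best.get(nbr, 0):
                best[nbr] = cand
                heapq.heappush(heap, (-cand, nbr, path + [nbr]))
    return 0, []

def lowest_latency_path(graph, src, dst):
    """Return (total_latency, path) using Dijkstra on the latency attribute."""
    dist = {src: 0}
    heap = [(0, src, [src])]
    while heap:
        d, node, path = heapq.heappop(heap)
        if node == dst:
            return d, path
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, attrs in graph.get(node, {}).items():
            nd = d + attrs["latency"]
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr, path + [nbr]))
    return float("inf"), []

# Hypothetical AS-level graph: one high-bandwidth route and one low-latency route.
AS_GRAPH = {
    "AS5": {"AS3": {"bw": 950, "latency": 40}, "AS4": {"bw": 480, "latency": 10}},
    "AS3": {"AS1": {"bw": 910, "latency": 40}},
    "AS4": {"AS1": {"bw": 430, "latency": 10}},
    "AS1": {},
}

bulk_choice = widest_path(AS_GRAPH, "AS5", "AS1")
interactive_choice = lowest_latency_path(AS_GRAPH, "AS5", "AS1")
```

In this toy graph a bandwidth-intensive application would be steered through AS3 and a latency-sensitive one through AS4. Measurements of 950 and 910 Mbps fall in the same 900 Mbps bucket, so the bandwidth-based choice does not change when a link drifts between them.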
Global Traffic Engineering
• In the current Internet, only a local optimum is achieved in each AS
  – Allowing the network to handle all possible traffic patterns within the network's ingress-egress capacity constraints (e.g., two-phase routing)
• With global information, we can potentially achieve the global optimum (or Nash equilibrium)
  – Each AS is a selfish individual
  – A center (or each AS) infers the Nash equilibrium
  – Each AS can try the Nash equilibrium, or attempt to benefit itself based on the inferred Nash equilibrium
Example of Benefit of Global TE
[Figure: topology of five ASes (AS 1 to AS 5) with link capacities labeled 1G, 2G, 2G, 1G, 2G; two demands of 1G traffic to AS 1]
• Without Global TE
Example of Benefit of Global TE
[Figure: same five-AS topology (link capacities 1G, 2G, 2G, 1G, 2G; two demands of 1G traffic to AS 1) with flow labels 1G, 0.5G, 0.5G, 1.5G, 0.5G]
• With Global TE
Example of Benefit of Global TE
[Figure: same five-AS topology (link capacities 1G, 2G, 2G, 1G, 2G; two demands of 1G traffic to AS 1) with flow labels 1G, 1G, 1G, 1G]
Unified Transparency Framework for Various Functionality
• Sharing of anomaly/security-related measurement
  – Various characteristics of traffic: heavy hitters, heavy changes, histograms, etc.
  – Self-diagnosis to survivability
• Adaptations
  – Routing adaptations at router level or application level
Practical Issues and Solutions
• Incentives for information sharing
  – Mandatory for next-generation Internet?
  – Alliance model for incremental growth
• Security/cheating: trust but verify
  – Trust most of the info shared but periodically verify
    » Much easier than the current Internet tomography unless many ASes collude
  – Verification is part of the protocol
    » Some fields in the packet headers are designed for that purpose
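A "trust but verify" check could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the function name, the `measure` callback, the sample size, and the 20% relative tolerance are all hypothetical, and the slides do not specify how TGP performs verification. The idea is to accept shared values by default while spot-checking a random sample against independent measurements.

```python
import random

def verify_shared_metrics(shared, measure, sample_size=2, tolerance=0.2, rng=None):
    """Spot-check a random sample of shared metrics against own measurements.

    shared:   dict mapping link id -> value reported by the neighboring AS
    measure:  callable mapping link id -> independently measured value
    Returns the sorted list of sampled link ids whose reported value deviates
    from the measurement by more than `tolerance` (relative error).
    """
    rng = rng or random.Random()
    links = sorted(shared)
    sample = rng.sample(links, min(sample_size, len(links)))
    suspicious = []
    for link in sample:
        observed = measure(link)
        reported = shared[link]
        denom = max(abs(observed), 1e-9)  # avoid division by zero
        if abs(reported - observed) / denom > tolerance:
            suspicious.append(link)
    return sorted(suspicious)

# Hypothetical reported delays (ms) and independently measured delays (ms).
reported_delay = {"AS2-AS3": 10.0, "AS3-AS4": 50.0, "AS4-AS5": 10.0}
measured_delay = {"AS2-AS3": 10.5, "AS3-AS4": 20.0, "AS4-AS5": 10.0}

flagged = verify_shared_metrics(
    reported_delay, measured_delay.get, sample_size=3
)
```

Sampling only a few links per round keeps the verification cost low, which matches the slide's point that verification is much cheaper than full tomography.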
Backup Materials
Measurement Info to Share
• Basic metrics
  – Delay, loss rate, capacity, available bandwidth
  – Demand (or traffic volume) and application types
• Intra-AS measurement info
  – Link-level info
    » Queried only when necessary
  – Aggregated info
    » OD flow level info
    » Path segment between entry and exit points in each AS
• Inter-AS measurement info
  – General AS relationship
  – AS-level topology
  – Inter-AS link metrics
Combined w/ routing info and exported to neighboring ASes through the TGP protocol
Provide a globally retrievable Management Information Base (MIB) with DHT
Network link-level monitoring
Transparent Internet Architecture
Methodology
[Figure: methodology flow. Inputs: network topology, Web workload, network end-to-end latency measurement. Stages: analytical evaluation, algorithm design, realistic simulation (iterate), PlanetLab tests]
TGP MIB Dissemination Architecture
• Leverage a Distributed Hash Table (Tapestry) for
  – Distributed, scalable location with guaranteed success
  – Search with locality
[Figure: TGP MIB dissemination over a DHT mesh. Data plane: data source, Web server, SCAN server, client, replica (always update), cache (adaptive coherence). Network plane: DHT mesh. Labels: Replica Location; Dynamic Replication/Update and Replica Management]
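To make the "publish and locate MIB records through a DHT" idea concrete, here is a minimal Python sketch. It deliberately swaps in a plain consistent-hashing ring, which is much simpler than Tapestry's prefix-based routing and offers no locality; the class name `MibDirectory`, the key format, and the node names are all hypothetical.

```python
import bisect
import hashlib

def _hash(key):
    """Map a string to a point on the hash ring (SHA-1 as an integer)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)

class MibDirectory:
    """Toy in-process DHT: each key is stored on the first node clockwise from its hash."""

    def __init__(self, nodes):
        self._ring = sorted((_hash(n), n) for n in nodes)
        self._store = {n: {} for n in nodes}

    def owner(self, key):
        """Return the node responsible for `key`."""
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]

    def publish(self, key, record):
        """Store a MIB record on the node that owns the key."""
        self._store[self.owner(key)][key] = record

    def lookup(self, key):
        """Fetch a MIB record, or None if nothing was published under the key."""
        return self._store[self.owner(key)].get(key)

# Hypothetical directory nodes and a hypothetical link-level record.
directory = MibDirectory(["node-a", "node-b", "node-c"])
directory.publish("AS3/link-7/loss_rate", {"value": 0.02, "unit": "fraction"})
record = directory.lookup("AS3/link-7/loss_rate")
```

Any node can compute the owner of a key locally, so a lookup needs no central index; a real deployment would route the request across the overlay instead of reading a local dictionary.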
Overlay Network Monitoring
[Figure: overlay streaming recovery among SERVER, OVERLAY RELAY NODE, OVERLAY NETWORK OPERATION CENTER, and CLIENT, with nodes at UC Berkeley, UC San Diego, Stanford, and HP Labs]
1. Set up connection
2. Register trigger
3. Network congestion/failure (marked X)
4. Detect congestion/failure
5. Alert + new overlay path
6. Set up new path
7. Skip-free streaming media recovery
Adaptive Overlay Streaming Media
• Implemented with Winamp client and SHOUTcast server
• Congestion introduced with a Packet Shaper
• Skip-free playback: server buffering and rewinding
• Total adaptation time < 4 seconds
Summary
• A tomography-based overlay network monitoring system
  – Selectively monitors a basis set of O(n log n) paths to infer the loss rates of O(n^2) paths
  – Works in real time, adapts to topology changes, has good load balancing, and tolerates topology errors
• Both simulation and real Internet experiments are promising
• Built an adaptive overlay streaming media system on top of TOM
  – Bypasses congestion/failures for smooth playback within seconds
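The basis-set idea can be sketched in Python as follows. This is a generic linear-algebra illustration, not TOM's actual algorithm: it assumes link losses are independent, so that log(1 - loss) adds up along a path, and the function names and the four-link example are hypothetical. Each path is a 0/1 row over links; paths whose rows are linear combinations of already selected rows need not be measured.

```python
from fractions import Fraction
import math

def select_basis(paths):
    """Greedily pick linearly independent path rows.

    Returns (basis, dependent) where `basis` lists the indices of paths to
    monitor and `dependent[i]` maps basis index -> coefficient such that
    row i equals the weighted sum of those basis rows.
    """
    reduced = []    # (pivot column, reduced row, combination of original rows)
    basis = []
    dependent = {}
    for i, row in enumerate(paths):
        v = [Fraction(x) for x in row]
        combo = {}  # invariant: v == row - sum(combo[b] * paths[b])
        for pivot, vec, vec_combo in reduced:
            if v[pivot] != 0:
                f = v[pivot] / vec[pivot]
                v = [a - f * b for a, b in zip(v, vec)]
                for b_idx, coef in vec_combo.items():
                    combo[b_idx] = combo.get(b_idx, Fraction(0)) + f * coef
        pivot = next((j for j, x in enumerate(v) if x != 0), None)
        if pivot is None:
            dependent[i] = combo          # fully explained by the basis
        else:
            basis.append(i)
            vec_combo = {b: -c for b, c in combo.items()}
            vec_combo[i] = Fraction(1)
            reduced.append((pivot, v, vec_combo))
    return basis, dependent

def infer_loss(combo, measured_loss):
    """Infer a path's loss rate from measured loss rates of basis paths."""
    log_success = sum(
        float(coef) * math.log(1.0 - measured_loss[b]) for b, coef in combo.items()
    )
    return 1.0 - math.exp(log_success)

# Hypothetical example with links (a, b, c, d); path 3 = path 0 - path 1 + path 2.
PATHS = [
    [1, 1, 0, 0],  # path 0 uses links a, b
    [0, 1, 1, 0],  # path 1 uses links b, c
    [0, 0, 1, 1],  # path 2 uses links c, d
    [1, 0, 0, 1],  # path 3 uses links a, d
]
basis, dependent = select_basis(PATHS)
measured = {0: 0.1, 1: 0.2, 2: 0.3}   # loss rates measured on the basis paths
loss_path3 = infer_loss(dependent[3], measured)
```

Here only three of the four paths are probed, and the loss rate of the fourth follows from the success rates of the others (0.9 × 0.7 / 0.8). The savings grow with overlay size because many paths share links.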
Tie Back to SCAN
[Figure: SCAN components]
• Provision: Dynamic Replication + Update Multicast Tree Building
• Replica Management: (Incremental) Content Clustering
• Network End-to-End Distance Monitoring
  – Internet Iso-bar: latency
  – TOM: loss rate
• Network DoS Resilient Replica Location: Tapestry
Contribution of My Thesis
• Replica location
  – Proposed the first simulation-based network DoS resilience benchmark and quantified three types of directory services
• Dynamically place a close-to-optimal # of replicas
  – Self-organize replicas into a scalable app-level multicast tree for disseminating updates
• Cluster objects to significantly reduce the management overhead with little performance sacrifice
  – Online incremental clustering and replication to adapt to users’ access pattern changes
• Scalable overlay network monitoring
Existing CDNs Fail to Address these Challenges
Non-cooperative replication inefficient
No coherence for dynamic content
Unscalable network monitoring: O(M × N), where M is the # of client groups and N is the # of server farms
Problem Formulation
• Subject to a certain total replication cost (e.g., # of URL replicas)
• Find a scalable, adaptive replication strategy to reduce the average access cost
SCAN: Scalable Content Access Network
[Figure: SCAN architecture. Components: CDN Applications (e.g., streaming media); Provision: Cooperative Clustering-based Replication; Coherence: Update Multicast Tree Construction; User Behavior/Workload Monitoring; Network Performance Monitoring; Network Distance/Congestion/Failure Estimation. Legend: red: my work, black: out of scope]
Comparison of Content Delivery Systems (cont’d)
Properties | Web caching (client initiated) | Web caching (server initiated) | Pull-based CDNs (Akamai) | Push-based CDNs | SCAN
Distributed load balancing | No | Yes | Yes | No | Yes
Dynamic replica placement | Yes | Yes | Yes | No | Yes
Network-awareness | No | No | Yes, unscalable monitoring system | No | Yes, scalable monitoring system
No global network topology assumption | Yes | Yes | Yes | No | Yes