Towards Resilient Networks using Programmable Networking Technologies
Linlin Xie, Paul Smith, Mark Banfield, Helmut Leopold, James Sterbenz and David Hutchison
Computing Department, Lancaster University
Department of Electrical Engineering and Computer Science, University of Kansas
Telekom Austria AG
Linlin Xie IWAN 2005 2
Presentation Outline
• Introduction to resilient networking
  – Motivation
  – Resilient networks
  – Aims
  – Approaches
• Scenario
  – Flash Crowd Event to Web Servers
  – Ill-Effect Detection
  – Remediation
Motivation
• The Internet is a utility
  – Consumers, businesses, governments
• Failures & attacks are inevitable
  – Hurricane Katrina, 9/11, NE blackout…
  – Link/device failures, DDoS…
• The current Internet and its applications are not resilient
• Need networking effort: network providers should take the responsibility
  – to protect network resources and optimize their utilization
  – to protect cross traffic as well as stricken customers
Resilient Networks
• Ability of network to maintain or recover an acceptable level of service in the face of challenges to normal operation in an acceptable period of time
• Example challenges to normal operation:
  – Unusual traffic load (e.g. flash crowds)
  – High-mobility of nodes and sub-networks
  – Weak and episodic connectivity of wireless channels
  – Long delay paths
  – Large-scale natural disasters
  – Attacks against the network hardware, software, protocol infrastructure
  – Natural faults of network components
General Resilience Aims
• Provide acceptable services to applications
  – Ensure information is accessible
  – Maintain end-to-end communication when possible
  – Operation of distributed processing and networked storage
• Resilient services must remain accessible
  – Can degrade gracefully when necessary, but ensure correctness
  – Recover rapidly and automatically when challenges dissipate
Role of Programmable Networks
• Challenges to normal operation will rapidly change over time and space
• Prescribed solutions cannot be deployed
• Therefore, resilient networks must:
  – Operate in real-time
  – Be autonomic
  – Be context-aware and “intelligent”
  – Be dynamically extensible
• Programmable networking technologies are key to enabling these facilities
Programmable Networking Facilities
• Dynamic extensibility and self-organisation
  – Programmability allows the network to respond dynamically to challenges by altering its behaviour
  – But programmability needs to be controlled in order to avoid misuse and potential harm (e.g., stealthy interfaces)
  – A service to determine suitable locations to deploy services is required
• Traffic and network environment awareness
  – Packet inspection at line speed
  – Network information collection
• Cross layer awareness and interaction
  – Avoid waste of resources and enhance coordination
  – How to achieve this, and the possible consequences, need further study
Related Work
• Knowledge Plane (“KP”, David Clark et al. MIT)
  – Part of the KP's purpose is to detect faults and intrusions and mitigate the ill-effects
  – It proposes to add a new plane to the Internet architecture
  – The supporting technology is cognitive AI
  – The purpose of the KP covers a very broad range
  – Cognitive AI is still in its initial stage of development
  – No concrete mechanisms for resilience maintenance yet
• Autonomic Communications
  – Efforts largely focused on self-configuring, self-managing, and self-healing networked server systems
  – Initiatives now on making communications systems autonomic
    • Learn network context and automatically adapt
Related Work (Cont’d)
• COPS (Checking, Observing and Protecting Services) (Randy Katz, UCB)
  – Proposes to protect the network using iBoxes on the network edge
  – Proposes an annotation layer between the IP and transport layers to carry information along with the traffic
• Other similar/related efforts
  – Disruption Tolerant Network (DTN)
    • Meant to provide stable end-to-end paths for applications when network connectivity faces challenges
  – Survivability
    • Enable the system to fulfil its mission even in the presence of attacks or failures (CMU)
  – Resilience covers a broader range, including protection against unusual traffic load (e.g., flash crowds)
Resilience Networking Scenario
• Demonstrate the applicability of programmable networks
• Flash Crowd Event
  – Although flash crowd requests are legitimate, the damage caused is as bad as that of malicious attacks
• Two activities investigated:
  – Detecting ill-effects of a flash crowd on Web servers
  – Remediation of a flash crowd event
Network Model
We take the role of the network provider (i.e. the ISP): we detect and mitigate the ill-effects on the web server network (which subscribes to this service), and protect resources and cross traffic in the ISP's own network.
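[Figure: ISP network model. The ISP core network contains core routers R1–R4 (legend: Ri = core router) and is bounded by edge routers E1–E4 (legend: Ei = edge router: ingress, egress, border). Attached networks: the Web Servers Network, another ISP, two LAN networks and a dial network.]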
Ill-Effects Detection
• Detection basis: – An increase of request rate in an association with a
decrease or level-off of response rate
• Detection location: – The edge router that connects the web server
network to the ISP network
• Algorithm overview: – compare actual observed response rate with the
expected one
Ill-Effect Detection (Cont’d)
Mechanism based on the formulae:
Where the sizes of response objects are estimated according to the size distribution calculated from sampling the “Content-Length” field in the HTTP header of the response traffic
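As a rough illustration of the comparison described here, the sketch below estimates the expected response volume for one detection interval as the number of observed requests times the mean object size taken from sampled Content-Length values, then divides the observed response volume by that estimate. The formulae on the original slide are not reproduced; the function names, the use of a plain mean to summarise the size distribution, and the example numbers are all assumptions made for illustration.

```python
from statistics import mean


def expected_response_bytes(request_count, sampled_content_lengths):
    """Estimate the bytes the servers should return for `request_count` requests.

    Assumption: every request yields one response whose size follows the
    distribution of the sampled Content-Length values, summarised by its mean.
    """
    if not sampled_content_lengths:
        raise ValueError("need at least one Content-Length sample")
    return request_count * mean(sampled_content_lengths)


def response_ratio(observed_response_bytes, request_count, sampled_content_lengths):
    """Ratio of observed to expected response volume for one interval."""
    expected = expected_response_bytes(request_count, sampled_content_lengths)
    if expected == 0:
        raise ValueError("expected response volume is zero")
    return observed_response_bytes / expected


# Hypothetical interval: 10 requests seen, 10 000 response bytes observed.
samples = [1000, 2000, 3000]  # made-up Content-Length samples, in bytes
print(response_ratio(10_000, 10, samples))
```

A ratio well below 1 in this sketch would mean the servers returned much less than the request volume suggests, which matches the detection basis of rising requests with falling or flat responses.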
Simulation Setup
• Based on ns-2
• Topology
• α set to 0.2
• Detection interval t set to 30 s
[Figure: simulation topology with 20 clients, an ingress edge router, an egress edge router and a web server. Link labels: LAN 10 Mb/s, WAN 50 Mb/s, LAN 15 Mb/s.]
Simulation Setup (Cont’d)
Parameters set up as follows
Simulation Results
Flash crowd traffic simulation
Flash crowd starts at 500 s
We use access link congestion to simulate the server-side behavior
Simulation Results (Cont’d)
Detection results
Ratio of the actual response volume over the expected one
Simulation Results
Statistical distribution of ratio samples of background traffic:
N(1.10817, 0.227477²)
The 95% confidence range of this distribution is [0.662315, 1.554025]
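The quoted range can be checked with a short calculation. The sketch below assumes a two-sided normal range with z = 1.96 and reads the slide's distribution as having mean 1.10817 and standard deviation 0.227477. The helper `is_outside_range`, and the idea of flagging an interval whose ratio falls outside the range, are illustrative assumptions rather than the authors' stated decision rule.

```python
MEAN_RATIO = 1.10817   # mean of the background-traffic ratio samples (from the slide)
STD_RATIO = 0.227477   # standard deviation (assumed reading of the slide)
Z_95 = 1.96            # two-sided 95% quantile of the standard normal


def normal_range(mean, std, z=Z_95):
    """Return (low, high) = mean -/+ z * std."""
    half_width = z * std
    return mean - half_width, mean + half_width


def is_outside_range(ratio, low, high):
    """Hypothetical check: True if a ratio sample lies outside [low, high]."""
    return ratio < low or ratio > high


low, high = normal_range(MEAN_RATIO, STD_RATIO)
print(round(low, 6), round(high, 6))  # 0.662315 1.554025
```

Under these assumptions the computed bounds agree with the range quoted on the slide to six decimal places.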
Remediation
• Drop excessive requests at the ingress edges of the network
  – Pushback-like mechanism
• Opportunistic multi-path routing of large response traffic that tolerates packet reordering, to protect cross traffic from excessive QoS degradation
  – Multiple routes database
  – Path bandwidth information collection
  – Split the response traffic in proportion to the available bandwidth of each path
• Must consider the possibility of having zero or just a few programmable routers in the core network
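The proportional split named in the remediation list can be sketched as follows. This is a minimal illustration, not the authors' mechanism: the function name, the dictionary of path names, and the bandwidth figures are hypothetical, and the sketch simply skips paths that report no available bandwidth.

```python
def split_by_available_bandwidth(total_bytes, available_bandwidth):
    """Return the bytes to send on each path, proportional to available bandwidth.

    `available_bandwidth` maps a (hypothetical) path name to its available
    bandwidth in any consistent unit. Paths with no available bandwidth
    receive no traffic and are left out of the result.
    """
    usable = {path: bw for path, bw in available_bandwidth.items() if bw > 0}
    total_bw = sum(usable.values())
    if total_bw == 0:
        raise ValueError("no path has available bandwidth")
    return {path: total_bytes * bw / total_bw for path, bw in usable.items()}


# Hypothetical example: three candidate paths, one of them saturated.
shares = split_by_available_bandwidth(900, {"path_a": 10, "path_b": 20, "path_c": 0})
print(shares)  # {'path_a': 300.0, 'path_b': 600.0}
```

With a single usable path the function returns all traffic on that path, so the same sketch covers the case where few or no alternative routes are available.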
Scenario Conclusions
• Contributions
  – Cross-layer coordination in detection
  – Cross traffic protection in the network
• Future work
  – Mitigation mechanism and experiments
  – Design and improve a resilient network infrastructure and architecture
Conclusions
• Resilient networks are crucial for the future information society
• Programmable networking technology is appropriate for building resilient networks
• Example flash crowd scenario demonstrates the need for programmability, namely:
  – cross-layer interaction
  – dynamic extensibility
Thanks!
Questions?