Controlling High Bandwidth Aggregates in the Network
Ratul Mahajan, Steven M. Bellovin, Sally Floyd, John Ioannidis, Vern Paxson, and Scott Shenker
AT&T Center for Internet Research at ICSI (ACIRI) and AT&T Labs Research
Presented by Scott McLauren
Overview
Introduction
Overview of ACC
Local ACC
Pushback
Simulations
Discussion
Related Work
Conclusions
Introduction
Overloads can result from a single flow not using congestion control. These flows continue to transmit, despite packet drops
DoS – when a large amount of traffic is directed at a network link or server
Flash crowd – A large number of users try to access a server. They can overload the server and network link, which interferes with unrelated traffic
Introduction
ACC – Aggregate-based Congestion Control
Aggregate – a collection of packets from one or more flows that have some property in common
Source or destination addresses, application type, TCP traffic, HTTP traffic to a specific server
Local ACC and Pushback
Expected to be invoked rarely
Overview of ACC
1. Am I seriously congested?
2. If so, can I identify an aggregate responsible for an appreciable portion of the congestion?
3. If so, to what degree do I limit the aggregate?
4. Do I also use pushback?
5. When do I stop? When do I ask upstream routers to stop?
Policies
Very large number of possible policies
Protect high bandwidth aggregates
Punish some aggregate when congestion starts
Fairness
Restrict max throughput of an aggregate
Policies are left as future work
Detecting congestion
Apply ACC only when output queue has sustained severe congestion
Monitor the loss rate at the queue, looking for an extended period of high loss
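A minimal sketch of this check, assuming the router samples arrivals and drops once per measurement interval. The class name, the window length, and the loss threshold are illustrative assumptions; the slides give no concrete values.

```python
from collections import deque


class CongestionMonitor:
    """Flags sustained congestion when every recent interval had a high loss rate."""

    def __init__(self, window_intervals=5, loss_threshold=0.10):
        # Both defaults are hypothetical, chosen only for illustration.
        self.loss_threshold = loss_threshold
        self.history = deque(maxlen=window_intervals)

    def record_interval(self, arrived, dropped):
        """Store the loss rate observed at the output queue in one interval."""
        loss_rate = dropped / arrived if arrived else 0.0
        self.history.append(loss_rate)

    def seriously_congested(self):
        """True only if the window is full and no interval fell below the threshold."""
        window_full = len(self.history) == self.history.maxlen
        return window_full and all(r >= self.loss_threshold for r in self.history)
```

Requiring every interval in the window to exceed the threshold keeps a short burst of drops from triggering ACC.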
Types of Congestion
Undifferentiated congestion
Under-engineered network
Fiber cut
Traffic clustering to form aggregates
Flash crowds, flooding attacks, application types (email worms)
DDoS attacks – the attacker can vary the traffic to escape detection
Identifying Responsible Aggregates
Congestion signature
The router does not need to make any assumptions about the malicious or benign nature of the aggregate
Collateral damage
Signature is too broad – traffic beyond the aggregate is included in the signature
Determining the Rate Limit for Aggregates
Rate limit is determined such that a minimum level of service is guaranteed for the remaining traffic
Completely shutting off traffic is not used because of:
Flash crowds
An aggregate for a DDoS attack will also contain innocent traffic
Pushback
Used to control an aggregate upstream
Congested router asks (recursively) its neighbors to rate-limit the aggregate
Can be invoked by a router, or a server connected to a router
Reviewing Rate-limiting
Rate limits are reviewed periodically, to adjust each limit to current conditions and to release aggregates that start to behave
Decisions are easy for local ACC, difficult with pushback
An attacker could predict these decisions to evade ACC
Local ACC
Triggered when the output queue experiences sustained high congestion
Using the packet drop history of the last K seconds, the ACC agent tries to identify the high bandwidth aggregates, and the limit to which they should be restricted
Identification of High Bandwidth Aggregates
Expectation is that most aggregates will be based on either a source or destination address prefix
Detection based on destination address is presented; other algorithms require further research
Identification of High Bandwidth Aggregates
From the drop history, extract a list of high-bandwidth addresses (32-bit)
Cluster these into 24-bit prefixes
For each of these, try obtaining a longer prefix that still contains most of the drops
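The three steps above can be sketched as follows, assuming the drop history is available as a list of destination addresses, one entry per dropped packet. The function name and the two thresholds (`address_min_drops`, `min_share`) are hypothetical; the slides do not say how "high-bandwidth" or "most of the drops" are quantified.

```python
from collections import Counter
import ipaddress


def cluster_drops(dropped_dsts, address_min_drops=2, min_share=0.8):
    """Return candidate aggregates (IPv4 networks) found in a drop history."""
    counts = Counter(int(ipaddress.IPv4Address(a)) for a in dropped_dsts)

    # Step 1: keep only the 32-bit addresses with enough drops.
    heavy = {addr: n for addr, n in counts.items() if n >= address_min_drops}

    # Step 2: cluster those addresses into 24-bit prefixes.
    clusters = {}
    for addr, n in heavy.items():
        clusters.setdefault(addr >> 8, Counter())[addr] = n

    # Step 3: lengthen each prefix while one sub-prefix still holds most drops.
    aggregates = []
    for prefix24, addrs in clusters.items():
        total = sum(addrs.values())
        best = (prefix24 << 8, 24)
        for plen in range(25, 33):
            shift = 32 - plen
            by_prefix = Counter()
            for addr, n in addrs.items():
                by_prefix[addr >> shift] += n
            top, top_drops = by_prefix.most_common(1)[0]
            if top_drops < min_share * total:
                break
            best = (top << shift, plen)
        aggregates.append(ipaddress.IPv4Network(best))
    return aggregates
```

For example, drops spread over 10.1.2.3 and 10.1.2.4 narrow from 10.1.2.0/24 to 10.1.2.0/29, the longest prefix that still covers both addresses.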
Determining the Rate Limit for Aggregates
ACC agent sorts the list of aggregates based on the number of drops
Uses the total arrival rate at the output queue and the drop history to estimate the arrival rate of each aggregate
ACC agent calculates the excess arrival rate at the output queue
Traffic that would be dropped at the rate limiter to bring the drop rate down to the target drop rate
Compute rate-limit L for each aggregate, such that:
Aggregate[k].arr is the arrival rate of the kth aggregate
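The condition on L is not legible in this copy of the slide. One reading that is consistent with the surrounding text, offered here as an assumption rather than as the authors' exact rule, is to limit the fewest top aggregates to a common rate L so that the traffic removed equals the excess arrival rate: the sum over the limited aggregates of (Aggregate[k].arr - L) equals the excess. A sketch under that assumption, with hypothetical names and units:

```python
def common_rate_limit(arrival_rates, excess_rate):
    """Return (L, n): a common limit L applied to the n largest aggregates.

    Assumed rule: sum(arr[k] - L for the n largest aggregates) == excess_rate,
    using the smallest n for which L is not below the next aggregate's rate.
    """
    rates = sorted(arrival_rates, reverse=True)
    for n in range(1, len(rates) + 1):
        limit = (sum(rates[:n]) - excess_rate) / n
        next_rate = rates[n] if n < len(rates) else 0.0
        if limit >= next_rate:
            return limit, n
    raise ValueError("excess rate exceeds the total arrival rate of the aggregates")
```

With arrival rates of 10, 6, and 3 (any consistent unit) and an excess of 5, limiting only the largest aggregate would push it below the second one, so the two largest are limited to 5.5 each: (10 - 5.5) + (6 - 5.5) = 5.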
Rate-limiter
Controls the throughput of the aggregates, and estimates arrival rate using exponential averaging
It is in the forwarding fast path, so it must be light-weight
Once a packet is past the rate-limiter, it loses its identity as part of an aggregate
Implemented as a virtual queue
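A minimal sketch of such a rate-limiter, assuming a virtual queue that drains at the rate limit and a simple exponentially weighted average of per-packet rate samples. The class name, the burst size, and the averaging gain are illustrative assumptions, not values from the slides.

```python
class AggregateRateLimiter:
    """Virtual-queue rate limiter with an exponentially averaged arrival rate."""

    def __init__(self, limit_bps, burst_bytes, gain=0.1):
        self.limit_bps = limit_bps      # rate limit L for the aggregate
        self.burst_bytes = burst_bytes  # virtual queue size (hypothetical knob)
        self.gain = gain                # weight of the newest rate sample
        self.backlog = 0.0              # virtual queue occupancy in bytes
        self.est_bps = 0.0              # estimated arrival rate of the aggregate
        self.last = None                # arrival time of the previous packet

    def on_packet(self, now, size_bytes):
        """Return True to forward the packet, False to drop it."""
        if self.last is not None and now > self.last:
            dt = now - self.last
            # Drain the virtual queue at the rate limit; no packets are stored.
            self.backlog = max(0.0, self.backlog - self.limit_bps / 8 * dt)
            # Exponential averaging over all arrivals, dropped or not.
            sample = size_bytes * 8 / dt
            self.est_bps += self.gain * (sample - self.est_bps)
        self.last = now
        if self.backlog + size_bytes > self.burst_bytes:
            return False
        self.backlog += size_bytes
        return True
```

Only a counter and a timestamp are updated per packet, which fits the requirement that the limiter stay light-weight in the forwarding fast path.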
Narrowing the Congestion Signature
Goal is to drop more of the attack traffic
Based on dominant signature within an aggregate
Drop more heavily from this subset
Flow-aware rate-limiting during flash crowds
Drop more heavily from SYN packets, so connections that are established get better service
Dangerous in DDoS attacks: the attacker could just send the packets that are being favored (the non-SYN TCP packets in the example above)
Simulations
Aggregates 1-4 are composed of multiple CBR flows. Aggregate 5 is a VBR source whose sending rate increases at t=13, decreases at t=25
Invoking Pushback
Invoked if the drop rate for an aggregate remains high for several seconds
The high drop rate indicates the router hasn’t been able to control the aggregate by preferential dropping (RED)
Sending Pushback Requests Upstream
Each upstream link is classified as
Non-contributing – sends a small fraction of the aggregate’s traffic
Contributing – sends a large fraction of the aggregate’s traffic
Non-contributing links do not receive pushback requests; only the links sending most of the aggregate’s traffic are limited
Algorithm used: max-min
Arrival rates of 2, 5, and 12 Mbps
Desired arrival rate of 10 Mbps
Limited to 2, 4, and 4 Mbps
Non-contributing neighbors could start sending more traffic, but it doesn’t matter because they are using rate-limiting
Protocol defined in IETF draft, since deleted
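The max-min example on this slide (arrival rates of 2, 5, and 12 Mbps limited to 2, 4, and 4 Mbps for a desired total of 10 Mbps) can be reproduced with a standard max-min fair allocation. This is a generic sketch of max-min sharing, not the pushback protocol's own code; the function name is hypothetical.

```python
def max_min_limits(arrival_rates, desired_total):
    """Split desired_total across links so no link gets more than it sends
    and the remaining capacity is shared equally among the larger senders."""
    limits = [0.0] * len(arrival_rates)
    remaining = float(desired_total)
    # Visit links from the smallest sender to the largest.
    pending = sorted(range(len(arrival_rates)), key=lambda i: arrival_rates[i])
    while pending:
        share = remaining / len(pending)
        i = pending[0]
        if arrival_rates[i] <= share:
            # This link sends less than its fair share: grant it in full.
            limits[i] = float(arrival_rates[i])
            remaining -= arrival_rates[i]
            pending.pop(0)
        else:
            # Every remaining link sends more than the share: cap them equally.
            for j in pending:
                limits[j] = share
            break
    return limits


print(max_min_limits([2, 5, 12], 10))  # [2.0, 4.0, 4.0]
```

The 2 Mbps link keeps its full rate, and the remaining 8 Mbps is split evenly between the two heavier links.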
Feedback to Downstream Routers
Upstream routers send status messages to downstream routers
Report total arrival rate for that aggregate
Messages enable congested router to decide if it wants to continue pushback
Ending pushback may result in larger arrival rate
Because dropping is no longer contributing to congestion control
Solid lines indicate arrival rate estimate in the status message
Dashed lines did not receive pushback requests
Labels indicate arrival rate estimate
Simulations
Simple
Intended to illustrate some of the basic functionality of the ACC mechanisms
Bad sources – send attack traffic to victim D
Poor sources – innocent sources sending traffic to D
Good sources – send traffic to destinations other than D
Local ACC
Good and Poor aggregates contain 7 infinite demand TCP connections
Bad sources use a UDP flow with equal on-off sending times, randomly chosen between 0 and 4 seconds
1 MBps during on period
DDoS Attacks
10 good sources & 4 poor sources spawn web-like traffic
Sparse-attack – 4 random 2 MBps on-off bad sources
Diffuse-attack – 32 UDP 0.25 MBps on-off sources
Flash Crowds
Flash traffic from 32 sources sending web traffic to the same destination
Good traffic from ten other sources sending web traffic to other destinations
Accounts for 50% link utilization without flash
Pushback Discussion
Advantages
Prevents scarce upstream bandwidth from being wasted on packets that will eventually be dropped
When traffic can be localized spatially, pushback can effectively concentrate rate-limiting on attack traffic within aggregate
Disadvantages
For DDoS attacks uniformly distributed across inbound links, pushback is not effective at rate-limiting
May overcompensate, especially during flash crowds, dropping extra traffic resulting in link being underutilized
Can sometimes increase damage done to legitimate traffic – when legitimate and attack sources are within the same aggregate and the sources are in an edge network without pushback
Pushback Implementation
Identification of aggregates can be done as a background task, or on a separate machine, so processing power is not an issue
Router needs to determine if a packet is part of an aggregate. If number of aggregates is large, router has a large lookup table. The lookup-time increases with the number of aggregates
These should not be an issue, since pushback will only be used occasionally, on a handful of aggregates
Pushback Deployment
Estimating Upstream Contribution
Difficult for routers joined by LANs, VLANs, or frame relay circuits – multiple routers attached to one interface
Downstream router may not be able to distinguish between upstream routers
Workaround – send a dummy pushback request that doesn’t rate-limit; status messages with estimated arrival rates are returned, then actual pushback requests can be sent to the necessary routers
Deployment
Incrementally at the edges of an island of routers
Related Work
Ingress Filtering – attempts to stop the attacks, ACC doesn’t
Traceback – attempts to find the sources of the attacks, ACC doesn’t
IDS – protocol for interaction between routers; does not deal with identification or rate-limiting
CDNs and Multicast – prevent flash crowds by mirroring data. What about traffic not yet cached? Traffic not suitable for multicast?
Flow-based congestion control – doesn’t handle aggregates of many flows that are low-bandwidth
CBQ – used for fixed definitions of aggregates, not dynamic aggregates
Conclusions
Local and cooperative mechanisms for aggregate-based congestion control have potential to control DDoS attacks and flash crowds
More research needs to be done
Need to understand pitfalls and limitations of ACC
How frequently is sustained congestion caused by aggregates, and not by failures?
What do attack traffic and topologies look like?
Policy decisions will play a role in shaping ACC mechanisms
Questions