
The War Between Mice and Elephants

Liang Guo, Ibrahim Matta
Computer Science Department
Boston University, Boston, MA 02215 USA

Presented by:
Daniel Courcy - [email protected]
Nathan Salemme - [email protected]


    Outline

• Introduction
• Analyzing Short Flow Performance
• Scheme Architecture and Mechanisms
• Simulation
• Discussion
• Conclusion


    References

• Most figures, tables, and graphs in this presentation were taken from the journal paper “The War Between Mice and Elephants” by Guo and Matta.


Mice and Elephants?

• Most connections are short (mice)
• However, long connections (elephants) dominate Internet bandwidth
• Problem?
  – Elephants hinder the performance of short mice connections
  – Long delays ensue

[Pie chart: Internet bandwidth utilization, with large connections (elephants) taking roughly 80% and small connections (mice) roughly 20%]


TCP Factors

• The TCP transport layer has properties that can cause this problem (see the slow-start sketch below)
  – The sending window is slowly ramped up, starting at the minimum
    • Regardless of available bandwidth, it always starts at the minimum value
  – Packet loss is detected by timeout
    • There are not enough packets in flight to generate duplicate ACKs
  – A conservative ITO can have devastating effects if the initial control packets are lost (SYN, SYN-ACK)
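A minimal sketch of the first point (mine, not from the paper or the slides), assuming an idealized loss-free slow start whose window doubles every RTT: a mouse spends most of its lifetime ramping up, no matter how much bandwidth is available. The packet counts are illustrative.

```python
def slow_start_rounds(num_packets: int, init_window: int = 1) -> int:
    """RTTs needed to send num_packets if the congestion window starts at
    init_window segments and doubles every round (idealized slow start,
    no losses, receiver window ignored)."""
    rounds, window, sent = 0, init_window, 0
    while sent < num_packets:
        sent += window      # one window's worth of packets per RTT
        window *= 2         # exponential ramp-up
        rounds += 1
    return rounds

# A 10-packet "mouse" needs 4 RTTs just to ramp up, while a
# 10,000-packet "elephant" amortizes its ~14 start-up RTTs over a
# far larger transfer.
print(slow_start_rounds(10))      # 4
print(slow_start_rounds(10000))   # 14
```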


Proposed Solution

• These factors cause short flows to receive less than their fair share of bandwidth
• Proposed plan: give ‘preferential treatment’ to short flows
• Differentiated Services architecture (Diffserv)
  – Core and edge routers (similar to CSFQ)
• Active Queue Management (AQM)
  – Core routers use a RIO-PS queue
  – Edge routers use threshold-based classification
  – No packet reordering!
• Simulations show
  – Better fairness and response time from giving preferential treatment to short flows
  – Goodput remains the same, if not better

*flow = connection = session*


Related Work

• Guo & Matta wrote “Differentiated Predictive Fair Service for TCP Flows” (2000)
  – A Study of the Interaction Between Short and Long Flows
  – Proposed to Isolate Short / Long Flows
    • This Would Improve Response Time and Fairness for Short Flows
  – Results Show
    • Class-based Flow Isolation with Threshold-based Classification at the Edge may Cause Packet Reordering
    • (Thus) Degradation of TCP Performance
  – Difference: in “Mice & Elephants”, Bandwidth (Load) Control is Performed at the Edge
• Cardwell et al. Show in “Modeling TCP Latency” that Short TCP Flows are Vulnerable
• Seddigh et al. Show the Negative Impact of the Initial Timeout Value (ITO) on Short TCP Flow Latency
  – Propose to Reduce the Value (G & M Test This Later On…)
• Many Proposed Solutions Attempt to Revise the TCP Protocol
• This Study Places Control Inside the Network (Fair to All)
• Crovella et al. & Bansal et al. Claim Size-aware Job Scheduling Helps to Enhance the Response Time of Short Jobs (While Not Really Hurting the Performance of Long Jobs)


Related Work

• G & M Propose an Alternative Use of the Diffserv Architecture
  – Diffserv = Framework; Allows for Classification & Differentiated Treatment
• G & M Want to Provide a New “better-than-best-effort” TCP Service that Attempts to Enhance the Performance of Short TCP Flows
  – This Creates a Fairer Environment for Web Traffic
• They Classify Packets as In / Out to Distinguish Flow Length as Short / Long


Analyzing Short Flow Performance

• The packet loss rate needs to be low for short connections
• Simulation of connections of different sizes (a rough illustration follows below)
  – RTT 0.1 seconds
  – RTO 0.4 seconds
  – Default ITO 3 seconds
  – Fixed packet size (for each connection)
  – Vary the loss rate
• The goal is to show how sensitive short flows become as the packet loss rate increases
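A rough back-of-the-envelope illustration (my own, not the paper's analysis), assuming independent per-packet losses: even modest loss rates make it likely that a short flow loses at least one packet, and for a mouse that loss is usually recovered by an expensive timeout rather than by duplicate ACKs.

```python
# Parameter values quoted on this slide (seconds).
RTT, RTO, ITO = 0.1, 0.4, 3.0

def p_at_least_one_loss(n: int, p: float) -> float:
    """Probability that an n-packet flow loses at least one packet,
    assuming independent losses with per-packet probability p."""
    return 1.0 - (1.0 - p) ** n

for p in (0.01, 0.05, 0.10):
    for n in (10, 100):
        print(f"loss rate {p:.2f}, {n:4d}-packet flow: "
              f"P[>=1 loss] = {p_at_least_one_loss(n, p):.2f}")

# Each such loss typically stalls a mouse for about RTO (0.4 s), or even
# ITO (3 s) if a SYN is lost, versus an RTT of only 0.1 s, which is why
# short-flow latency degrades so sharply as the loss rate grows.
```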


Sensitivity of Short Flows

• Short flows are not sensitive at low loss rates
• Short flows are very sensitive at high loss rates
• For long flows, transfer time simply grows exponentially as the loss rate increases

[Graph: transfer time vs. packet loss rate, with one curve annotated “exponential growth as loss rate increases” and the other “linear growth as loss rate increases”]


Sensitivity of Short Flows

• Coefficient of Variation (C.o.V.)
  – The ratio between the standard deviation and the mean of a random variable (see the sketch below)
• The C.o.V. of transfer time for short flows increases as the loss rate increases
• The C.o.V. of transfer time for long flows decreases as the loss rate increases

[Graph: C.o.V. vs. loss rate – short connections increase in variability, long connections decrease]
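A minimal sketch of the metric itself; the transfer-time samples below are made up purely for illustration.

```python
from statistics import mean, pstdev

def cov(samples: list[float]) -> float:
    """Coefficient of variation: standard deviation divided by the mean."""
    return pstdev(samples) / mean(samples)

# Hypothetical transfer times in seconds, for illustration only:
short_flow_times = [0.4, 0.5, 3.4, 0.6, 3.5]   # a timeout dominates some runs
long_flow_times = [98.0, 102.0, 100.5, 99.5]   # many packets average things out
print(f"C.o.V. short: {cov(short_flow_times):.2f}")   # high (~0.86)
print(f"C.o.V. long:  {cov(long_flow_times):.2f}")    # low  (~0.01)
```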


Sensitivity of Short Flows

• Possible reasons for the high C.o.V. of short flows
  – High congestion puts TCP into exponential backoff
  – Recovering lost packets in slow start vs. congestion avoidance yields high variance
    • Slow start = aggressive
    • Congestion avoidance = conservative
  – Law of Large Numbers: a short connection has few packets to average over, so one unlucky timeout dominates its transfer time, while long connections smooth such events out; hence more variability for short connections


Preferential Treatment of Short TCP Flows

• Simple Simulations are Used to Show:
  – Preferential Treatment of Short TCP Flows can Significantly Reduce Short Flow Transmission Time (w/ No Major Effect on Long Flows)
• G & M Use ‘ns’ to Set Up:
  – 10 Long (10,000-packet) TCP Flows
  – 10 Short (100-packet) TCP Flows
  – These Compete for Bandwidth over a 1.25 Mbps Link
• G & M Use Queue Management @ the Bottleneck
• They Measure Instantaneous Bandwidth
  – This Shows the Effects of Preferential Treatment


Impact of Preferential Treatment

• Consider Drop Tail vs. RED vs. RIO-PS
  – (RIO with Preferential treatment for Short flows)
• Drop Tail Fails to Give Fair Treatment
  – Favors Aggressive Flows w/ Larger Windows
• RED Gives Almost Fair Treatment
• The RIO-PS Queue Gives Short Flows More Than Their Fair Share
• The Graph Shows Short Flows Under RIO-PS can Temporarily Steal Bandwidth
• In the Long Run, a Short Flow’s Early Completion Returns Resources
  – i.e. Long Flows can Better Adapt to the Network State
  – Giving Short Flows Preferential Treatment Does Not Impact Long-Term Goodput


Additional Notes on Preferential Treatment

• Preferential Treatment Might Even Enhance Long Flows (As Seen on the Previous Slide)
• The Threshold Helps Get a Long Flow’s Initial (Control) Packets Through
• Fewer Drops for Short Flows Enhances Fairness Among Them
• Table 1 (Above):
  – When Load is Low: RIO-PS & RED Have Slightly Lower Goodput
  – When Load is Higher: RIO-PS Has Slightly Higher Goodput
• Therefore G & M Propose a Diffserv-like Architecture


Proposed Architecture

• Diffserv Architecture
  – Edge Routers
  – Core Routers
• Edge Routers
  – Classify flows into short and long
  – Mark packets accordingly
• Core Routers
  – Actively manage flows based on their class
  – AQM policy implemented at the core


Edge Router

• The edge router sits at the edge of the network and is responsible for distinguishing short and long flows
• A simple threshold counter, Lt, is used
  – Connection shorter than Lt packets: short flow
  – Connection longer than Lt packets: long flow
  – All connections start off as short
  – Lt is dynamic
• A connection is considered active until Tu expires
• An SLR parameter is used to balance short connections vs. long connections
• This ratio is updated every Tc time period through additive increase/decrease
• Chosen values: Tu = 1 second and Tc = 10 seconds
(A simplified sketch of this classifier follows.)
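A simplified sketch of my reading of this classifier (not the authors' code): a per-flow packet counter compared against the threshold Lt, flow expiry after Tu seconds of inactivity, and a periodic additive increase/decrease of Lt every Tc seconds driven by an assumed short-to-long target ratio (SLR). The exact update rule and the default parameter values here are assumptions.

```python
import time

class EdgeClassifier:
    def __init__(self, lt=20, tu=1.0, tc=10.0, slr_target=3.0, step=1):
        self.lt, self.tu, self.tc = lt, tu, tc
        self.slr_target, self.step = slr_target, step
        self.flows = {}              # flow_id -> [packets_seen, last_seen_time]
        self.short_pkts = 0          # packets marked "short" in this Tc window
        self.long_pkts = 0           # packets marked "long" in this Tc window
        self.last_update = time.monotonic()

    def classify(self, flow_id) -> str:
        """Return the class ("short"/"long") to mark this packet with."""
        now = time.monotonic()
        # Drop flows that have been idle for longer than Tu.
        self.flows = {f: s for f, s in self.flows.items() if now - s[1] <= self.tu}
        count = self.flows.get(flow_id, [0, now])[0] + 1
        self.flows[flow_id] = [count, now]
        label = "short" if count <= self.lt else "long"
        if label == "short":
            self.short_pkts += 1
        else:
            self.long_pkts += 1
        # Every Tc seconds, nudge Lt (additive increase/decrease) so the
        # observed short-to-long packet ratio tracks the SLR target.
        if now - self.last_update >= self.tc:
            ratio = self.short_pkts / max(self.long_pkts, 1)
            self.lt += self.step if ratio < self.slr_target else -self.step
            self.lt = max(1, self.lt)
            self.short_pkts = self.long_pkts = 0
            self.last_update = now
        return label

# Example: the first packets of any flow are marked short; once a flow
# exceeds Lt packets, its later packets are marked long.
edge = EdgeClassifier(lt=3)
print([edge.classify("flow-A") for _ in range(5)])  # short, short, short, long, long
```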


Core Router: Preferential Treatment to Short Flows

• G & M Require Core Routers to Give Preferential Treatment to Short Flows
• Many Queue Policies Available
• Picked RIO (RED w/ In and Out)
  – Conforms to Diffserv
  – Other Advantages (Lets Bursty Traffic Through)


    RIO

• Operation of the Queue
  – In (Short) Packets are not Affected by Out (Long) Packets
  – The Dropping / Marking Probability for Short Packets is Based Only on the Average Backlog of Short Packets (Q_short)
  – The Dropping / Marking Probability for Long Packets is Based on the Total Average Queue Size (Q_total)
(A small sketch of this dropping rule follows.)
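A sketch of this two-threshold dropping rule; the RED thresholds and maximum drop probabilities below are illustrative assumptions, not values from the paper.

```python
def red_prob(avg_q: float, min_th: float, max_th: float, max_p: float) -> float:
    """Classic RED drop/mark probability as a function of average queue size."""
    if avg_q < min_th:
        return 0.0
    if avg_q >= max_th:
        return 1.0
    return max_p * (avg_q - min_th) / (max_th - min_th)

def rio_drop_prob(packet_class: str, avg_q_in: float, avg_q_total: float) -> float:
    """RIO: In (short) packets look only at Q_short; Out (long) packets
    look at Q_total, so long flows back off first as congestion builds."""
    if packet_class == "in":
        return red_prob(avg_q_in, min_th=20, max_th=60, max_p=0.02)
    return red_prob(avg_q_total, min_th=10, max_th=40, max_p=0.10)

# With the same backlog, an Out packet sees a higher drop probability:
print(rio_drop_prob("in", avg_q_in=15, avg_q_total=35))   # 0.0
print(rio_drop_prob("out", avg_q_in=15, avg_q_total=35))  # ~0.083
```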


Design Features of RIO

• G & M Discuss RIO’s Features
  – Only a Single FIFO Queue is Used for all Packets
    • Reordering Will not Happen
    • (Reordering Can Lead to Over-Estimation of Congestion)
  – RIO Inherits the Features of RED
    • Protection of Bursty Flows
    • Fairness within Traffic Classes
    • Detection of Incipient Congestion
• RIO-PS (RIO w/ Preferential treatment for Short flows)
  – The Bandwidth Share for Short Flows is Determined by Parameters
  – G & M Choose 75% of Total Link Bandwidth for Short Flows in Times of Congestion


    Simulation

• G & M use the ‘ns’ Simulator to Study:
  – Performance vs. Drop Tail & RED


Simulation Setup

• Assume Network Traffic is Dominated by Web Traffic
• Model it as Such (an illustrative generator follows below):
  – Randomly Selected Clients Start Sessions that Surf to Random Web Pages (of Different Sizes)
  – Each Page may Have Several Objects
    • Thus Each Object Requires its Own TCP Connection [HTTP 1.0 is Assumed]
  – Clients Request, and Servers Respond w/ an Acknowledgement and the Remainder of the Data (the Web Page)
• Load is Carefully Tuned to be Close to the Bottleneck Link Capacity
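An illustrative workload generator in this spirit; the distributions, object counts, and size parameters below are my assumptions, not the paper's exact traffic model.

```python
import random

def generate_session(num_pages: int = 5):
    """Yield (page_index, object_size_in_packets) request pairs for one
    browsing session; each object would be fetched over its own TCP
    connection under HTTP/1.0."""
    for page in range(num_pages):
        num_objects = random.randint(1, 6)              # objects per page
        for _ in range(num_objects):
            # Heavy-tailed object sizes: mostly mice, occasionally an elephant.
            size_pkts = max(1, int(random.paretovariate(1.2) * 3))
            yield page, size_pkts
        # A think time between pages would be drawn here in a fuller model.

random.seed(0)
print(list(generate_session(2)))
```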


    Simulation Topology

• Traffic Largely Flows Right to Left
• Node 0 is Thus the Entry Edge Router
• Nodes 1, 2, 3 are Core Routers
• The Bottlenecks are on the Links Toward the Clients (1-3 and 2-3)
• Bottleneck Buffer Size and Queue Management are Set to Maximize Power
  – Power is the Ratio Between Throughput and Latency (a small sketch follows)
  – High Power Means Low Delay and High Throughput
• Again, for the RIO-PS Queue, Short Flows are Set to Get About 75% of Total Bandwidth
• ECN is Turned ON (This Aids RED w/ Short Flows)
• ECN was Also Tried OFF, and Even Larger Performance Gains Were Witnessed
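A minimal sketch of the power metric mentioned above; the throughput and latency numbers are made up for illustration.

```python
def power(throughput_bps: float, latency_s: float) -> float:
    """Network "power": throughput divided by latency; maximized by
    configurations with high throughput and low delay."""
    return throughput_bps / latency_s

# Pushing utilization higher usually inflates queueing delay, so power
# peaks below full utilization of the bottleneck link.
print(power(1.0e6, 0.05))   # 1.0 Mbps at  50 ms -> 2.0e7
print(power(1.2e6, 0.30))   # 1.2 Mbps at 300 ms -> 4.0e6 (lower power)
```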


Experiment 1: Single Client

• Only one client set used in the previous topology
• Parameters:
  – Experiment run for 4000 seconds (first 2000 discarded)
  – SLR set to 3
  – Drop-tail, RED, and RIO-PS are the subjects
  – Response time recorded after each successful object download


Experiment 1: Single Client – Response Time (ITO=3)

• ITO set to 3 seconds
• Average response time is determined to be 25-30% less than the competition
• It is argued that a 3-second ITO is too safe, so a 1-second timer is also tried

[Graph annotation: 25-30% gap between RIO-PS and the competition. Pretty good…]


Experiment 1: Single Client – Response Time (ITO=1)

• ITO set to 1 second
• Potentially an unsafe setting if used in an environment with long, slow links and long propagation times
• Results show a smaller gap between RIO-PS and the competition
• Still good performance, but it might not be worth the risk

[Graph annotation: 15-25% gap between RIO-PS and the competition. Not as good, and more risky…]


Experiment 1: Single Client – Queue Size (ITO=3)

• Instantaneous queue size for the last 20 seconds, using the 3-second ITO
• High variation due to the file size distribution
• RIO-PS stays relatively low compared to Drop-tail and RED


Experiment 1: Single Client – Drop/Mark Rate (ITO=3)

• Overall drop/mark rate of the entire network is reduced
• Short connections rarely drop packets


    Study of Foreground Traffic

• To Better Show Each Queue Management Policy’s Effect on the Fairness of TCP Connections, G & M Injected 10 Short and 10 Long Foreground TCP Connections and Recorded the Response Times
• The Fairness Index of the Response Times is Computed (see the sketch below)
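The slides do not spell out which fairness index is used; a common choice (and my assumption here) is Jain's fairness index over the measured response times.

```python
def jain_fairness(values: list[float]) -> float:
    """Jain's fairness index: (sum x_i)^2 / (n * sum x_i^2).
    Equals 1.0 when all values are identical and approaches 1/n
    as one value dominates."""
    n = len(values)
    return sum(values) ** 2 / (n * sum(v * v for v in values))

print(jain_fairness([1.0, 1.0, 1.0, 1.0]))   # 1.0   (perfectly fair)
print(jain_fairness([0.2, 0.2, 0.2, 5.0]))   # ~0.31 (one flow sees a much longer response time)
```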


    Fairness – Short Flows

    • RIO-PS Most Fair For Short Flows


    Fairness – Long Flows

    • Long Flows Do Well Across The Board


    Transmission Time – Short Connections


    Transmission Time – Long Connections


    Table IV

• Summary of Overall Goodput
• The Proposed Scheme Does Not Hurt
• It Slightly Improves Overall Goodput Over Drop Tail


    Experiment 2: Unbalanced Requests

• Simulation Where File Requests on Different Routes are Unbalanced
• Small Requests on One Route / Large Requests on Another
• The Proposed Scheme Reduces to Traditional Unclassified Traffic + RED
  – i.e. No Different than RED
• Still, Getting the First Few Packets Across w/ Preferential Treatment Does Help Reduce the Chance of Retransmission and Allows Short Connections to Finish Quickly
• Remainder of Results Omitted


Discussion

• Simulation Model:
  – “Dumbbell and Dancehall” topologies
  – Only one-way traffic
  – Different propagation delays not considered
• Queue Management
  – RIO does not necessarily guarantee class-based isolation of flows
  – PI-controlled RED may be a better solution

http://www.cse.iitk.ac.in/users/braman


    Discussion

• Deployment Issues:
  – The Scheme Requires that Edge Devices be Able to Perform Per-flow State Maintenance and Per-packet Processing
  – G & M Quote Previous Work [31] Stating this Does not Really Impact End-to-end Functionality
  – Incrementally Deployable?
    • Only Edge Routers Need to Be Configured


Flow Classification

• Threshold-based Classification
  – Used Because the Edge Node Cannot Predict a Flow’s Final Size
  – This Allows the Beginning of a Long Flow to Look like a Short Flow
  – This “mistake” Actually Helps Performance
    • It Allows the First Few Packets of a Long TCP Flow to be Treated Like the First Few of a Short Flow
    • This Makes the System Fair to all TCP Connections


Discussion

• Controller Design:
  – Edge load control may not be very dependent on the exact value of SLR
  – More important are the values of Tc and Tu (small values may be more accurate but increase overhead)
• Malicious Users
  – Users might try to break long TCP connections into smaller segments
  – The dynamic nature of the edge routers should prevent such behavior


Conclusions

• TCP Carries the Majority of Bytes Flowing Over the Internet
• A Diffserv-like Architecture Where Edge Routers Classify Flows by Size
• Core Routers Implement Simple RIO to Give Preference to Short Flows
• Mice Get Better Response Time and Fairness
• Elephants are Enhanced (Slightly) or at Least Only Minimally Affected
• Goodput is Enhanced (or at Least not Degraded)
• Flexible Architecture (Needs only Edge Tuning)
• Size-aware Traffic Management has Considerable Promise
