Upload
univ-savoie
View
1
Download
0
Embed Size (px)
Citation preview
1
OntheFeasibilityof IdentifyingElephantsin
InternetBackboneTraffic
K. Papagiannaki,N. Taft, S.Bhattacharya,P. Thiran,K. Salamatian,C. Diot
Abstract—The elephantsand mice phenomenonis known asone of the few invariants of Inter net traffic. No systematicmethod has been proposedso far for defining a boundaryto separateflows into two distinct classes.We consider foursingle-feature classificationmethodsand examinenetwork-level prefix traces collectedfr om a Tier-1 backbone. We il-lustrate that onecannot simultaneouslyachieve consistencyin the fraction of total load due to elephantsand the num-ber of elephants. Thesemethods also result in short aver-ageholding times in the elephant state. To avoid reclassifi-cation due to short term fluctuations we proposea two fea-tur e classificationmethod that incorporatesa metric, calledlatent heat, that is a function of the differ encein elephantbandwidth and the classseparation thr esholdover a num-ber of time intervals. Our first contribution is to illustratethe volatility of elephantsand hencethe difficulty in isolat-ing them for traffic engineeringpurposes,especiallyif onewants to achieve consistencywith respectto load, numberand duration of elephant flows. Our secondcontribution isto proposea classificationmethodthat achievesa reasonabletradeoff betweenfluctuations in elephant load and fluctua-tions in the number of elephantswhile simultaneouslyyield-ing longer averageholding times.
I . INTRODUCTION
Traffic engineeringof IP networkshasattracteda lot of at-tentionin the recentpast. Network operatorsareseekingways in which they could make optimal useof their net-works’ capacity, withoutintroducingcomplexity andmain-taining the simplicity of the packet switching paradigm.With that in mind, researchershave proposeddifferental-gorithmsfor traffic engineeringtasksthatcouldleadto bet-terutilizednetworksby exploiting certainpropertiesof theunderlyingtraffic [2], [6], [10], [12].
Studiesof the Internettraffic at the level of network pre-fixes,fixed lengthprefixes,TCP flows, AS’s, andWWWtraffic, have all shown thata very small percentageof theflows carriesthe largestpart of the information [2], [6],
[10], [12]. This behavior is known as“the elephantsandmice phenomenon”,and is consideredto be one of thefew invariantsof Internettraffic [7]. In statistics,this phe-nomenonis also called the “mass-countdisparity”. The“mass-countdisparity” statesthat for heavy-tailed distri-butionsthemajority of themassin a setof observationsisconcentratedin averysmallsubsetof theobservations[3].
In orderto beableto exploit suchapropertyfor traffic en-gineeringpurposes,onehasto be able to identify whichflows carry the majority of the bytes. Examplesof trafficengineeringapplicationsincludere-routing[10] andloadbalancingof elephantflows. The attractionof this ap-proachto traffic engineeringis thatby treatinga relativelysmall numberof flows differently one can affect a largeportion of the overall traffic. For this to be practical,ele-phantflows would needto remainelephantflows for rea-sonablylong periodsof time, so thata traffic engineeringapplicationneednot changeits policy or statefrequently.
To thebestof our knowledgeno systematicway hasbeenproposedsofar, for choosingthecriterionthat isolatesthehigh-volumeflows. Thecriterionshouldallow oneto con-siderany particularflow andto decideif it shouldbecate-gorizedasanelephantor a mouse.Otherstudieshave se-lecteda particularcriterionandthenexaminedotherprob-lemsgiventhis fixedcriterion. For example,in [5] anele-phantis any flow whoserateis larger than1% of the linkutilization; andin [9] an elephantis any flow whosepeakrateexceedsthemeanplusthreestandarddeviationsof theaggregatelink flow. Thesestudiesdonotaddresstheques-tion of how to choosethecriterionfor defininganelephant.Insteadthey focuson, given a fixedcriterion,how shouldonesamplehigh-volumeflows [5], andon how to mathe-maticallymodelthetwo typesof flows [9].
Our aim is to separateout a smallnumberof high-volumeflows thataccountfor mostof thetraffic, from avery large
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 2
numberof flows that contribute negligibly to the overallload. In therestof thepaperwe refer to theformerastheelephant� class,andthelatterasthemouseclass. Thetermelephantsand mice hasbeenuseddifferently throughouttheliterature.In thiswork, elephantsandmicearedefinedaccordingto their averagebandwidth(as opposedto filesizesor flow durations).In [9] they usetheterms � and
�flowswhenthisphenomenonis appliedto bandwidthratesof 5-tupleflows.
We investigatea numberof schemesfor defininga thresh-old valuethatallowsoneto separateout theelephantsfromthemice.Wecollectpacketmeasurementsfrom Sprint’s IPbackbone,andstudyflow bandwidthsat thenetwork prefixlevel. Initially weconsidertwosingle-featureclassificationmethods. The first methodexploits the propertythat thenetwork-level prefix flow bandwidthdistribution is heavytailed. It identifiesthethresholdseparatingelephantsfrommice,asa cut-off point in theComplementaryCumulativeDistribution Function(CCDF). This techniquemakes noassumptionsabouthow high or low the thresholdshouldbe. Instead,it definesthe thresholdas the first point intheflow bandwidthdistribution afterwhich power-law be-havior canbewitnessed.Thesecondmethodclassifiesaselephantsthe high-volumeflows whosecumulative band-width is equalto a certainfraction of the overall traffic.This is motivatedby practicalityasit is simplefor carriersto implement.
We use four performance metrics to evaluate theseschemes.Our findingsillustratethevolatility of elephantsin anumberof ways.Thenumberof elephantsat any timeis highly variable,elephantflows needfrequentreclassifi-cation,andelephantsdo not remainelephantssufficientlylong for traffic engineeringapplications.We quickly con-sidertwo othersinglefeatureclassificationschemes(with-out examiningall four metricsin detail), to illustratethattheseother methodsalso yield highly volatile elephants.A comparisonof all theseschemesrevealsan interestingtradeoff, namely that it is difficult to keepboth the ele-phant load and the numberof elephantsconsistent(i.e.,smallfluctuations)at thesametime.
Becausepersistencein time is an importantproperty, andsimplesinglefeatureclassificationschemescannotachievethis, we next proposea two featureclassificationmethod.We suggesta metric, called latent heat, that incorporatestemporalbehavior of aflow. By addingthis into theclassi-ficationwe avoid unnecessaryreclassificationof flowsdueto short-termfluctuationsthat do not changea flow’s es-sentialnature. We show that this schemeachievesa bet-ter tradeoff in termsof fluctuationsin the elephantloadandthe numberof elephants,while simultaneouslyyield-
ing elephantflowswith muchlargeraverageholdingtimesin theelephantstate.
In SectionII, wedescribethedataanalyzedthroughoutthepaper. In SectionIII we presentour methodology, andinSectionIV, we presentinitial results.We improve on theproposedapproachesin SectionV, and characterizetheidentifiedelephantnetwork prefixesin SectionVI. Wecon-cludein SectionVII.
I I . MEASUREMENT ENVIRONMENT
Thedatausedin this papercomesfrom packet tracescol-lectedin the coreof Sprint’s Tier-1 IP backbonenetwork[8]. Optical splitters are usedin conjunctionwith pas-sivemonitoringequipmentto collect44-byteheadersfromevery IP packet traversingthe monitoredlink. Monitor-ing equipmenthasbeeninstalledin threemajor POPsintheUSA. Theanalysisdescribedthroughoutthepaperhasbeenperformedon6 tracescollectedonOC-12andOC-48links, yielding similar results.In this paperwe presentre-sultsonly from two differentOC-12 links. The specificsof thosetwo tracesare presentedin Table I. The linksusedare two hops away from the peripheryof the net-work, and traffic is capturedon its way towardsthe coreof thenetwork. Therefore,traffic towardsspecificdestina-tionsshouldexhibit a sufficient level of aggregation. Theutilization levels of thesetwo links aregiven in Figure1(The times displayedin the figuresarealways in EDT.).Weselectedthosetwo particularlinks becausethey exhibitdifferentbehavior, which hasimplicationson theclassifi-cationprocessaswill beseenin latersections.Thefirst oftheselinks providesadatasetthatvariesin asmoothfash-ion throughoutthe day. The secondlink exhibits a morepronouncedburstyor busybehavior.
Tracelabel
Starttime Endtime LinkSpeed
westcoast
Jul 24 08:00:352001EDT
Jul 29 02:42:552001EDT
OC-12
eastcoast
Jul 24 08:00:342001EDT
Jul 25 13:21:552001EDT
OC-12
TABLE IDESCRIPTION OF COLLECTED TRACES.
The first issueaddressedis the granularity level of theflowsto beclassified.For thepurposeof intra-domaintraf-fic engineeringthenaturalgranularitylevel is that of net-work prefixesappearingin routingtables.Othercandidates
2
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 3
05:00 09:00 13:00 17:00 21:00 01:00 05:00 09:000
10
20
30
40
50
60
70
Jul 24, 2001 Jul 25, 2001
Thr
ough
put o
f tot
al a
ggre
gate
(M
bits
/sec
)
west−coasteast−coast
Fig. 1. Link utilization for thewestandeastcoasttrace.
05:00 09:00 13:00 17:00 21:00 01:00 05:00 09:000
2000
4000
6000
8000
10000
12000
Jul 24, 2001 Jul 25, 2001
Num
ber
of a
ctiv
e ne
twor
k pr
efix
es
west−coasteast−coast
Fig. 2. Numberof active network prefixesfor thewestandeastcoasttrace.
could include flows definedby their destinationaddress,a fixed prefix length,the usualfive-tuple(sourceaddress,destinationaddress,sourceport, destinationport, protocolid), or theAS level.
Network prefixes is a natural candidatefor a few rea-sons.First,suchflowscaneasilybemanipulatedby simplychangingthenext hopaddressin theroutingtablefor thatparticularflow. Second,routingpoliciestendto beappliedto a network prefix asa whole,sincenetwork prefixesarethesmallestroutableentitiesin theInternet.Thereforeanychangein policy would affect theentireflow aswe defineit. This would not thecase,for example,for flows definedby destinationaddressor prefixesof a fixedlength.Third,it hasbeenobservedthatelephantsandmicedoexist at thislevel of granularity[6].
In parallel to the packet tracecollection, we collect theBGProutingtablesatthecorrespondingPOPs.ThoseBGPtablesaredefault-freeandcontainapproximately120Ken-tries. We calculatethe volume of traffic headedtowardseachBGPdestinationandcomputetheaveragebandwidthof eachflow over5 minutetimeintervals.Weselect5 min-
0 200 400 600 800 10000
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Number of top flows
Fra
ctio
n of
tota
l ban
dwid
th
Fig. 3. Cumulative Distribution Functionof flow bandwidthsfor thefirst 125-minuteintervalsof theeastcoasttrace.
100
102
104
106
10−4
10−3
10−2
10−1
100
Bandwidth (Bytes/sec)
Com
plem
enta
ry C
umul
ativ
e D
istr
ibut
ion
Fun
ctio
n
west coasteast coast
Fig. 4. ComplementaryCumulative Distribution Functionofflow bandwidthsfor theeastandwestcoasttrace.
utesasthe measurementinterval, sincethis is the defaulttime interval at which SNMP information is usuallycol-lected. We find that in any given measurementinterval,approximately90%of thenetwork prefixeshave no traffictraveling towardsthem. We definea flow to beactiveif itreceivesat leastonepacket duringthemeasurementinter-val. Figure2 presentsthenumberof activeprefixesateachmeasurementinterval over a two day period. The obser-vation that thereareonly a few flows active at any giventime slot is furtherconfirmationthatthis aggregationlevelis appropriatefor the identificationof elephantsin that itproducesalimited numberof flowsonwhichcomputationsneedto becarriedout.
The cumulative distribution function (cdf) for the band-width of the derived flows is presentedin Figure3. Eachcurve correspondsto a different5 minute interval withinthefirst hourof the trace. We seethatat any time duringthishour, thetop100flowsaccountfor 50%to 70%of thetotal traffic, while thetop 400flows accountfor 75%-85%
3
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 4
of the total traffic. This confirmsthat elephantsandmicedo exist in ourdatasetat thechosengranularitylevel.
Wefurtherexamineourdatasetfor heavy-tail properties.InFigure4 we presentthecomplementarycumulative distri-bution functionplot of thebandwidthvaluesof our flows.Thestraightline at theendof thedistribution indicatesthatit may be heavy-tailed. We usethe aesttestdesignedbyCrovella andTaqqu(describedin [4]) to estimatethescal-ing parameterof thedistribution of theflows’ bandwidths.We find a scalingparameterwhosevaluesvary between1.03 and1.2, verifying that the bandwidthdistribution isindeedheavy-tailed. In SectionIII, wedescribetheproper-tiesof theheavy-tail distributions,andwedeviseamethodin which we makesuseof thesepropertiesin theclassifi-cationprocess.
I I I . SINGLE FEATURE CLASSIFICATION
Using our packet-level measurements,we aggregatetraf-fic into “flows”, basedon its network prefix asannouncedthroughBGP[11]. This givesusa setof flows,andanas-sociatedbandwidthfor eachflow. We now describetech-niquesthatcouldbeusedto classifytheseflows into “ele-phants”,and“mice”.
Beforeproceedingwe introducesomenotation. Let � de-notetheindex of anetwork prefixflow, i.e.,aBGProutingtable entry. Let � denotethe length of the time intervalover which measurementsare taken. Time is discretizedinto theseintervals, and � is the index of time intervals.Wedefine����� � to betheaveragebandwidthof thetrafficdestinedto aparticularnetwork prefix � duringthe ����� timeslotof length � . Weuse ��� to representthesetof all flowsconsideredelephants,and ��� to representthesetof flowsconsideredmice.
Our methodologyconsistsof two phases:1) thresholdde-tectionphase,and2) thresholdupdatephase.In the firstphase,we seekfor a bandwidthvalue ���� � , suchthat if aflow’s averagebandwidthin the chosentime interval ex-ceedsthe threshold,it is classifiedasan elephant.Other-wise, it is considereda mouse. In the secondphase,weupdatethethreshold,i.e., at time ����� , basedon thecur-rentthresholddetectedpluspastthresholdvalues.
A. ThresholdDetectionPhase
Previous attemptsat classifyingflows into elephantsandmice usea singleclassificationfeature,which is either1)bandwidth[6], [12], or 2) duration[10]. Given that weare interestedin high-volume flows, for the first stepof
ourmethodologywe focuson techniquesthatidentify ele-phantsbasedon their bandwidthalone. Issuesregardingthedurationof theelephantflows areinvestigatedin laterstages.
In thedetectionphaseof ourclassificationprocess,we an-alyzethepropertiesof thedatasetcollectedandcalculateathresholdvalue ���� � accordingly. Thisvalue ���� � is likelyto changewith time alwaysisolatingthoseflows thatcon-tribute thehighestamountof informationin eachtime in-terval.
We proposetwo differenttechniquesto identify the initialthresholdvalue.
� The first method takes into accountthe heavy tailpropertyof thecollecteddataset.Thethresholdvalueis setin sucha way sothata flow is characterizedasanelephant,only if it is locatedin thetail of theflowbandwidthdistribution.� Thesecondtechniqueis moreintuitive, andcouldbepreferredin operationalenvironments.It requiresthesetting of an input parametercorrespondingto thefraction of total traffic we would like to placein theelephantclass.Thethresholdis setin suchaway thatall the flows exceedingit accountfor the requestedfractionof thetotal traffic.
1) AEST approach: A random variable X follows aheavy-taileddistribution (with tail index � ) if
�! �#"%$'&�(*)+$-,'.�/1032�$54768/19;:<�=:<> (1)
where ) is a positive constant,and where ( meansthatthe ratio of the two sidestendsto 1 as $?4 6 . AESTis a methodfor the estimationof the scalingparameter�proposedby Crovella andTaqqu[4].
The particularvalue of � is important in many practicalsituations,andanumberof methodsarecommonlyusedtoestimateits value. For instance,valuesof � lower than1indicatethat the randomvariableX hasan infinite mean.Valuesof � between1 and2 indicateinfinite variance.
TheAEST methodidentifiestheportionof a dataset’s tailthatexhibitspower-law behavior. It’sbasedonthefactthattheshapeof thetail determinesthescalingpropertiesof thedatasetwhenit is aggregated. By observingthe distribu-tionalpropertiesof theaggregatesonecanmakeinferencesaboutwherein thetail power-law behavior begins.Weusethismethodto identify thefirst point in thedistribution af-ter which power law propertiescanbe witnessed.We setthis particularbandwidthvalueasthethresholdvaluesep-aratingelephantsandmice in that given measurementin-
4
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 5
terval.
In otherwords, if we define @A �$�� as the complementarycumulative distribution function of the flow bandwidths,i.e. @A �$�� = 1 - F(x) = P[X " x], then ���� � = B$ , sothat
B$DCFE;G�HI JLKNMPO @A �$��JLKNMPO $ (RQ��1�
TheAEST methodologyhastheadvantagesof beingnon-parametricandeasyto apply. Moreover, it hasbeenshownto berelatively accurateonsyntheticdatasets.Moreimpor-tantly, therehasbeenevidencethatthescalingestimator, ascalculatedby AEST, appearsto increasein accuracy asthesizeof the datasetgrows. Given that our datasetsfeatureat leasttwo thousandmeasurements,sucha techniquewillprovide us with reasonablyaccuratevaluesfor the cutoffpoints in the heavy tail flow bandwidthdistribution mea-suredin eachtime interval.
2) Constantload approach: This secondapproachis amorepragmaticapproachtowardsidentifying the thresh-old bandwidthvaluesseparatingelephantsfrom mice. Foreachtime interval � , we definelocal threshold���� � , suchthat theflows exceedingthis thresholdaccountfor � % ofthe total traffic on the link. We sort the flows andstart-ing from the largest, we proceeddown the list addingflows into the elephantclass. We stop when the volumeof the elephantclassreaches� % of the link load. If werank bandwidthsin ascendingorder, let S denotethe in-dex of a flow, and T denotethe highestbandwidthflowin timeslot n, then ���� �UCV��W , � �� � where X is cho-sensuchthat: Y WZ\[�] � Z �� �^:_�a`bY �Z\[�] � Z �� � , and
Y W , �Z+[�] � Z �� �c"dC*�e`cY �Z+[�] � Z �� �\f
B. ThresholdUpdatePhase
To make useof thethresholdderived for engineeringpur-poses,weclassifythedatain timeslot ��d�g�h� accordingto���� � . This maynot beadequatebecausewe maynot havea sufficiently large numberof measurementsin eachtimeinterval to yield representative estimates.In addition,weexpectthatcorrelationsexist acrosstimeslots(shown laterin SectionIV-B) andarethusmotivatedto allow pastdatato influenceour choiceof threshold. We thuscalculateamoregeneralclassificationthreshold B���� � usinga simplefirst orderauto-regressive filter on ���� � asfollows.
B����i�*�h�jCkl�mQ � �n`oB���� ��� � `j���� � (2)
Wecall this theupdaterule. We foundthatavalueof� C
93fp> leadsto a sufficiently smooth B���� � . We now usethethreshold B���� � to classifythe traffic during the ��q�R�h�lrs�
time interval. In otherwords,thefinal decisionrule is thefollowing.
if � � �� �t"kB���� � then �juq��� else��uq� � (3)
C. ThresholdEvaluation
In mostapplicationsof classificationeachdataobject in-herentlybelongsto someparticularclass.Thisclassis un-known to theobserver whosejob is to guesscorrectlythetrue stateof nature. Our situationis different in that theflows do not naturallybelongto a particularclass.Insteadweareimposingacategorizationontoourflowsbecauseitis usefulfor traffic engineeringpurposes.Becauseof this,thetraditionalperformancemeasuresbasedonmisclassifi-cationerrorscannotbeused.Theevaluationof thetypeofclassificationtechniquesproposedhereshouldbebasedonthe needsof the particulartraffic engineeringapplicationusingthe classification.The first two metricswe usefortheidentificationof theelephantsarethefollowing.
1) The fraction of total traffic apportionedto the ele-phantclassshouldbesufficiently large.
2) Thenumberof elephantflows shouldbereasonablysmall, say lessthan a number
A, where
Ais con-
strainedby theroutercapabilities,anddenotesarea-sonablenumberof flows for which the router cankeepstate.
D. Persistenceof Classificationin Time
We explainedearlier that for traffic engineeringpurposeswe would like flows to retaintheir classaslong aspossi-ble. Ideally the majority of the elephantsshouldremainelephantsfor long periodsof time, i.e. in theorderof fewhours. Intuitively this meansthat we would like to clas-sify aselephantsthoseflows whosevolume is above theidentifiedthresholdconsistently.
Thelengthof timethatanelephantcanremainanelephantis both a function of the flow itself andof the classifica-tion. It is a functionof theclassificationin thesensethata particularhigh-volumeflow will remainin the elephantclassas long as the continuallyadjustingthresholdstayslower that its averagebandwidth. Becausetraffic on theInternetis highly variable,we don’t expectflows to retaintheir classificationindefinitely. Weknow thattherewill bemany flows thatwill burst for a smallamountof time,andbecomequietafterwards.
To definea performancemetric to capturethepersistenceof theclassificationin time, we proceedasfollows. Notethat the classificationschemewe have proposed,induces
5
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 6
the following underlyingtwo-stateprocesson eachflow.Let vw��� �xCy� if �tug� � , and vw�l�� ��Cz9 if �tu{��� . At eachclassification| time interval, theprocesseithertransitionstotheotherstateor staysin thesamestate.Wedefinethefol-lowing two performancemetricsin termsof thisprocess.
3) For eachflow � , we calculatetheautocovariancefora lag of 1, namely � M �}�l�h�tCR~ vw��� �n`tvw�l��D�F�h�s& .Thismetricreflectshow predictabletheclassof flow� is at time �e�R� basedon its classat time � . Theclassificationschemewith the largerautocovarianceis theonein whichtheflowsaremorelikely to retaintheir classification.
4) For eachflow � , we calculatethe averageholdingtime the flow spendsin the elephantstate,and itsaverageholding time in the mousestate. The clas-sificationschemethat yields larger holding timesismoreattractive for ourpurposes.
IV. RESULTS
In this sectionwe presentthe resultsderived applyingthetwo methodsdescribedin the previous section. We com-parethetwo methodswith respectto thefour performancemetricsidentified. We discusstemporalissuesrelatedtoclassificationsuch as the duration of the validity of theclassification,and the timescaleof the measurementandclassificationintervals.
A. ClassificationResults
Figures 5, 6, and 7 presentthe thresholdvalues com-putedusing the two proposedtechnques,the numberofelephants,and the fraction of total traffic apportionedtoelephantsfor bothtraces.
For the west coasttrace,the thresholdobtainedwith the“constantload” approachfluctuatesthroughoutthedaybe-tween1 KBytes/secand2.5KBytes/sec.As alreadyshownin SectionII, thewestcoasttraceexhibits high utilizationlevels during the peakperiod (Figure 1). A burst whichis notaccompaniedby similarburstybehavior in thenum-berof active network prefixes(Figure2). This impliesthattheburst in link utilization is not dueto new network pre-fixesbecomingactive, but ratherdueto existing networkprefixes receiving traffic at higher rates. Given this risein thebandwidthsachieved by thealreadyactive networkprefixes,it is expectedthatnow thethresholdvalueisolat-ing theflows contributing 80%of the total traffic is muchhigher.
The “aest” approachprovides a lower estimatefor thethreshold. The reasonwhy the thresholdis lower‘ in the
09:00 13:00 17:00 21:00 01:00 05:00 09:00 13:000
500
1000
1500
2000
2500
3000
3500
4000
Jul 24, 2001 Jul 25, 2001
Thr
esho
ld (
Byt
es/s
ec)
constant load (west coast)aest (west coast)constant load (east coast)aest (east coast)
Fig. 5. Derivedthresholds�������� for 0.8-constantload, andaest.
10:00 13:00 17:00 21:00 01:00 05:00 09:00 13:000
200
400
600
800
1000
1200
1400
Jul 24, 2001 Jul 25, 2001
Num
ber
of e
leph
ants
constant load (west coast)aest (west coast)constant load (east coast)aest (east coast)
Fig. 6. Numberof elephantflows for 0.8-constantload, andaest.
“aest” approachis becausemany of the flows in thepeakperiodhave now beenpositionedin the tail of the band-width distribution.
The effect of the burstinessin the utilization of the westcoasttraceis that thenumberof elephantsidentifiedwithboth techniquesis much larger. Given that the “aest”thresholdis muchlower thanthe “constantload” one,thenumberof elephantsidentifiedwith “aest” is muchhigher,thusaccountingfor asmuchas90%of the total traffic onthelink.
For the smoothereastcoasttrace,wherethe trend in thelink utilizationis accompaniedby similar trendin thenum-berof active network prefixes,the thresholdvaluescalcu-latedwith bothtechniquesbehave similarly. Thethresholdvaluesareslightly higherfor the“aest”approach,account-ing for lesselephants,andsmallerfractionof thetotal traf-fic. It is importantto notice,though,thatfor theeastcoasttrace,the total amountof traffic apportionedto elephants
6
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 7
10:00 13:00 17:00 21:00 01:00 05:00 09:00 13:000
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Fra
ctio
n of
tota
l tra
ffic
appo
rtio
ned
to e
leph
ants
constant load (west coast)aest (west coast)constant load (east coast)aest (east load)
Fig. 7. Fractionof total traffic accountedto elephantsfor 0.8-constantload, andaest.
is aproximatelyconstantwith time, andequalto 80% forthe “constantload” (imposedby thetechniqueitself), and70%for the“aest” approachrespectively.
From the previous discussionit can be easily seenthatin casesof bursty link utilization, maintaininga constantamountof traffic in the elephantclassleadsto a volatilenumberof elephantflows,andequalyburstythresholdval-ues. Moreover, in suchcases,the performanceachievedusingthe two suggestedtechniquesis muchdifferent. Inthecaseof smoothtraffic thetwo approachesachievecom-parableresults.
B. TimeBehaviorof Classification
As discussedin SectionIII-C, we theflows to retaintheirclassificationaslong aspossible.We say“as long aspos-sible” becauseon theonehandwe wantto avoid updatingstatetoofrequently, but on theotherhandwedon’t wanttokeepflows in a particularstateif their bandwidthchangessignificantly. To assessthis issueof thepersistency of theclassificationwith time, we examineour last two metrics,namely, theautocorrelationvalueat lag 1, andtheaverageholdingtimesin theelephantstate.Wecomputetheseval-uesfor the peakhours,which is the periodduring whichpersistency is perhapsmostlyneeded.Thepeakhoursarebetween12pmand5pmEDT.
TableII shows thatthe vw��� � processesachievehighvaluesof autocorrelationfor bothapproachesandfor bothtraces.Whathasto benotedis that thevaluespresentedin TableII arehighly biasedtowardsthe performanceof the miceflows. The total numberof destinationprefixes that be-comeactive at somepoint throughoutthe durationof thewestcoasttraceis 30,241out of which theelephantsvarybetween454 and994 (dependingon the time interval � ).In otherwords,only asmallnumberof theautocorrelation
EastCoast WestCoast
AEST constantload AEST constantload
min. 0.8611 0.8687 0.8497 0.85835% 0.9789 0.9786 0.9737 0.9769avg. 0.9827 0.9823 0.9814 0.981995% 0.9833 0.9833 0.9833 0.9833max. 0.9913 0.9913 0.9906 0.9912
TABLE IIAUTOCORRELATION VALUES AT LAG 1 FOR PEAK HOURS.
EastCoast WestCoast
AEST constantload AEST constantload
min. 1 1 1 120% 1 1 1 150% 1.5 1.5 2 2avg. 3.72 4.1 8.63 8.0980% 3.5 3.5 12 10.5max. 60 60 60 60
TABLE IIIAVERAGE HOLDING TIMES IN THE ELEPHANT STATE FOR PEAK
HOURS (IN 5-MINUTES SLOTS).
valuescorrespondto elephantflows. This is oneof therea-sonswe weremotivatedto look at theholdingtimesof theclassificationprocess.
Wecomputetheaverageholdingtimesfor eachflow in theelephantstatefor thefive hourbusy period. Thedistribu-tion of theaverageholdingtimesis shown in Figure8. Thex-axisvaluesrepresenttime slotsof 5 minuteseach(so avalueof 12 correspondsto 1 hour). TableIII presentsthebasicstatisticsfor theaverageholdingtimesin theelephantstate.We seefrom this tablethat theaverageholdingtimeis approximately20 minutesfor the eastcoasttrace,and40 minutesfor thewestcoasttrace. Moreover, morethan80%of theflowshaveaverageholdingtimeslessthanhalfthemaximum.
It is clear from Table III andFigure8 that few flows re-main elephants“forever” (i.e., the durationof the wholepeakperiod). It is naturalto askhow well pastbehaviorcanpredictfuturebehavior. In particularwe would like toknow what is thelikelihoodthatanelephantthathasbeenanelephantfor acertainamountof timewill remainanele-phant“forever”. To measurethiswecalculatetheprobabil-ity thatanelephantwill remainanelephantfor thewholedurationof thepeakperiod,giventhat it hasstayedin the
7
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 8
0 10 20 30 40 50 6010
0
101
102
103
104
Average holding time in the elephant state
Num
ber
of o
ccur
ence
s (lo
g)constant load (west coast)aest (west coast)constant load (east coast)aest (east coast)
Fig. 8. Averageholding time in the elephantstatefor peakhours(in 5-minutesslots).
elephantstatefor a givennumberof intervals. In Figure9,we presentthoseprobabilitiesfor bothtracesandfor bothapproaches.
Figure9 shows thatfor theburstywestcoasttracethepre-dictability of theelephantsis very limited. Flows that re-main elephantsfor 36 intervals (i.e. 3 hours)during thepeakperiodhave a probabilityof lessthan0.2 of remain-ing in that statefor the whole 5 hours. For theeastcoasttraceelephantsshow ahigherlevel of predictability, andifthey remainelephantsfor 3 hours,they have a probabilitygreaterthan0.5of remainingelephantsfor thewholedura-tion. Therefore,Figure9 providesanotherindicationthatnetwork prefixflowsareveryvolatile,andoncepersistencyis themostdesirablepropertyof theclassification,simplyclassifyingflows basedon theirbandwidthat eachintervalis not sufficient. In practicalsettingssuchsmallprobabil-ities of successfullyidentifying elephantsin thefuturehaslimited applicability.
Theseresultshave implications for other methodspro-posedin the literature. For example, in [5] the authorscreatestatefor eachflow thatexceedstheir threshold,andthey keepthatstatefor thelifetime of theapplication.Thatwouldn’t makesensefor ourdatasincelessthan5%of theflows classifiedaselephantsin any instantremainedele-phantsfor theentire5-hourperiod.Weconcludethatflowsneedto bereclassifiedfairly frequently.
We saw in the previous subsectionthat in order to main-tain a consistentamountof load in theelephantclass,thenumberof elephantsis quitevolatile. Thefactthataverageholding timesareshortis anotherindicatorof thevolatil-ity of elephants.If elephantsareindeedvolatile, thentheycouldneedto bereclassifiedoftenif thegoalis to maintainaconsistentloadin theelephantclass.
0 10 20 30 40 50 600
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Number of sequential intervals in the elephant state
Pro
babi
lity
of "
alw
ays"
bei
ng a
n el
epha
nt
constant load (west coast)aest (west coast)constant load (east coast)aest (east coast)
Fig. 9. Probability of “always” remainan elephantfor 0.8-constantload, andaest.
108 110 112 114 116 118 120 1220
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Time interval (5 mins)
Fra
ctio
n of
tota
l tra
ffic
appo
rtio
ned
to e
leph
ants
constant load (update = 1hr)aest (update = 1hr)constant load (update = 30 mins)aest (update = 30 mins)
Fig. 10. Effect of infrequentclassificationon the load of theelephantclass.
Until now we have consideredreclassifyingflows every 5minutes.In orderto evaluateothertimescalesfor reclassi-fication,weconsidertwo othercases,namelyreclassifyingevery 30 minutesand1 hour. Even thoughwe reclassify-ing at theseintervals,we continueto calculatethesepara-tion thresholdvaluesevery5 minutes.Figure10shows theamountof traffic apportionedto elephantsusingbothtech-niqueswith theselongerreclassificationtimes. This helpsusto understandhow fastthelist of flows belongingto theelephantclassbecomesstale. Regardlessof the methodusedfor classification,we seethatwhenflowsareupdatedat longerintervals, the fractionof total traffic apportionedto the elephantclassmay drop dramatically, in particularit canfall aslow as40% within 25 minutesfrom the lastclassificationinterval. Flowsclassifiedaselephantsat timeslot 109 (Figure 10) experiencelower bandwidthsin thenext intervals,while flows thatmayexperiencehigh loadscannotbeplacedin theelephantclass,until thenew updateperiod. Moreover if we updatetoo slowly, we canexperi-encea significantjump in theelephantloadat the time ofreclassification(e.g.,at time interval 115). Note that not
8
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 9
updatingthelist of elephantssufficiently oftenmayendan-ger the efficacy of the traffic engineeringapplicationbe-causeit may no longerbe giving differential treatmenttothelargestflows.
C. OtherSingle-Feature ClassificationTechniques
Thetwo metricsfor thenumberof elephantsidentified,andthefractionof total traffic apportionedto theelephantclassareclearly related.The targetof our secondclassificationapproachis to maintaina constantelephantload. In thissection,we report resultsaboutsimple classificationap-proachesthat couldbeselectedandwould targetat main-taininga constantnumberof elephantsor a thresholdcon-sistentlyequalto a specificfraction of the link utilizationor capacity. Thissectionshouldnotbeconceivedasacom-pleteevaluationof the techniquesdescribed,but a simpleillustrationof how alternative techniqueswouldperform.
A simpleway to identify the high-volumeflows in a net-work is to collect thebandwidthdistribution for thedesti-nationprefixesin eachinterval, andto classifythe top �flowsateachinterval aselephants.Forourwestcoasttrace,weset � equalto 100(approximatelythenumberof flowsconsistentlyin theelephantclassfor the5-peakhours),andmeasurethefractionof total traffic apportionedto theele-phantclass1. For comparisonreasons,we also calculatethe thresholdvalueas the bandwidthof the smallestele-phant. This is the bandwidthvaluesflows would have toexceedto beclassifiedaselephantsin thethreshold-basedapproaches.
Ourresultsindicatethatwith suchatechniquetheelephantloadmayfluctuatebetween35and94%of thetotal traffic.This pointsto an importanttradeoff. We saw in Figures6and7 thatmethodsthatsucceedin keepingthefractionoftotaltraffic in theelephantclassreasonablyconsistentyieldlargefluctuationsin thenumberof elephants.In this alter-nateapproachthat focuseson keepingthenumberof ele-phantsconstant,we observe large fluctuationsin the loaddueto elephantflows. It thusappearsdifficult to keepboththeelephantloadandthenumberof elephantfrom experi-encingfluctuationssimultaneously.
The correspondingthresholdrangefor this methodwas1to 12 KBytes/secdependingon the time of day, exhibit-ing similarbusyperiods,astheutilizationof therespectivelink. An importantresultof this classificationapproachisthatonly 11flowsremainedelephantsfor thewhole5-hourpeakperiod, while 97% of the other elephantflows fea-turedholdingtimesof lessthan24 intervals (i.e. 2 hours).
�Thenatureof our resultsdid notchangefor differentvaluesof � .
During this 5-hourperiod,approximately1800flows be-cameelephantsat somepoint (in both traces).Sinceonly11 of theseremainedelephantsfor the entireperiod, theprobability that a given flow in the setof top 100 at anygiveninterval remainsamongthetop 100flows for a verylongdurationis minimal.
Wepresentresultswhenthethresholdvalueis setequalto1% of theutilization of themonitoredlink, asproposedin[5] (1% of thecapacityof our OC-12links wasnever ex-ceededby a singleflow in our traces). Oncewe set thethresholdvalue in such a way, then the numberof ele-phantsidentified is limited between3 and 23 flows, andtheamountof traffic they correspondto fluctuatesbetween10 and60%of the total traffic. Themaximumfractionoftraffic is apportionedto the elephantclassin the off-peakhours,when the utilization is much lower, and thereforeeasierto exceed.Suchaclassificationprocessoffersmuchlower autocorrelationvaluesthanany otherclassificationtechnique,while themaximumholdingtimeattheelephantclassis 60 5-minuteintervals, achieved by only 2 flows.All theotherelephantflows achieve holding timesof lessthan6 intervals (i.e. 36 minutes).Thestatisticsfor theav-erageholding timesachieved with the four single-featureclassificationmethods,described,aresummarizedin TableIV.
Averageholdingtimestatistics
min. 20% 50% avg. 80% max.AEST 1 1 2 8.63 12 60constantload 1 1 2 8.09 10.5 60constantnum-ber
1 1 1.2 3.47 2.75 60
1%utilization 1 1 1 4.12 3 60
TABLE IVSTATISTICS FOR THE HOLDING TIMES AT THE ELEPHANT STATE
FOR ALL FOUR SINGLE-FEATURE CLASSIFICATION METHODS
(WEST COAST IN 5-MINUTE SLOTS).
D. OtherTimescalesfor MeasurementandClassification
Throughoutthe paperwe have consideredmeasurementand classificationintervals of ��C�� minuteseach. Asstatedin Section III, we chose5 minutes for practicalreasonsbecausethis correspondsto the interval at whichSNMP informationis normally collected. In order to as-sesstheutility of this time scalewe alsoconsidered�iC��and �!CF�}9 minutes.
9
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 10
Table V presentsthe statisticsfor autocorrelationvaluesachieved at lag 1, calculatedover the five peak hours.Thesevaluesaremuchsmallerthan the onesin TableII.At �=C��}9 we seemoreof a distinctionbetweenthe twoschemesthanwe did when �=C�� . We alsocalculatethemainstatisticsfor theaverageholdingtime at theelephantstate,andweobserve thatthe80thpercentileof thederiveddistribution is equalto 3 intervals(i.e. 3 hours).
Therefore, it is evident that even at higher timescales,whenflows’ bandwidthwould beexpectedto behave in asmootherfashion,volatility is themainpropertyof theele-phantflows. We noticethat theautocorrelationat ��Cy�}9is approximately10% lessthanwhat it is at �FC�� , andthenumberof intervalsanelephantmayremainasanele-phantis muchsmallerthantheoneachievedwhen �eCk� .Thoseremarksgive us confidencein that the resultspre-sentedso far arenot an artifact of the timescaleusedbutratherapropertyof thetraffic itself.
We alsoconsiderthe interval ��C�� minute. We find thattheautocorrelationfor ��Cy� and �5Cz� arealmostidenti-cal, indicatingthatthereis notany needto go lower than5minutes.This is only aninitial examinationinto theappro-priatetime scalefor sucha classification;however it givesusconfidencethatwe areusingreasonablevalues.
East-Coast� = 60 mins
AEST 0.8-constantloadmin. 0.5833 0.83335% 0.8333 0.8333avg. 0.8318 0.837395% 0.8333 0.8571max. 0.8889 0.8571
TABLE VAUTOCOVARIANCE VALUES AT LAG 1 FOR THE PEAK 5 HOURS.
V. TWO FEATURE CLASSIFICATION
It is evident from the previous sectionthat regardlessofthe criterion we useto definea separationthreshold,westill seealot of volatility in theelephants.Whatwewanttoavoid is reclassificationof aflow if it transitionsto theothersideof thethresholdfor ashorttime. Largeflowscanhaveshorttransientperiodsin which their volumedrops. Sim-ilarly, small flows canhave short-lived bursts. Althoughmany flows may spendmost their time on the samesideof the separationthreshold,every so often they crossthethresholdfor justoneor two timeintervals.Wedonotwantto reclassifyflowsdueto shorttransientburstsor dips.
This motivatedus to includea secondfeaturein theclas-sificationthatwouldbebasedona temporalcomponentofan elephant’s behavior. For traffic engineeringpurposes,it is important to be able to differentiatebetweenflowsthatcontribute largeamountsof traffic yet mayexperienceshort-lived periodsof reducedbandwidth,and flows thatfall below the thresholdand remaintherefor the rest oftheir lifetime. As a consequence,our reactionto transientdropsbelow the identified thresholdshouldexhibit suffi-cientlatency.
In thermodynamics,“latent heat” is the heat energy re-quiredto changea substancefrom onestateto another. Inorder to be able to identify the points in time, whenele-phantflowsbelow thethresholdchangestateinto mice,wedefinea new metric,which we call “latent heat”. An ele-phantis assumedto accumulate“latent heat”aslong asitexceedsthe identified threshold. Onceit falls below thethreshold,it startslosing“heat”, andchangesstate(i.e. be-comesamouse)whenits “latentheat”becomeslower thanapresetvalue.
In order to achieve this, at eachtime interval, apartfromtheflow’sbandwidthwecalculatethedistancebetweenthebandwidthachievedandthecorrespondingthresholdvalue,derivedusingtheproposedapproachessuchasthe“aest”,andthe “constantload”. We definethe “latent heat” of aflow asthesumof thosedistancesin thepast12 timeslots,i.e. theprevioushour.
�x� ��� ��C � [���� [�� , � �
��������1Q�B������
In eachclassificationinterval �q�?� , we classifyflows asfollows.
1) if ������i�*�h�c"�B���� � and�j� �l�� ��"a9 , �xuq� �
2) if ������i�*�h�c"�B���� � and�j� �l�� ��:a9 , �xuq���
3) if ������i�*�h�c:�B���� � and�j� �l�� ��"a9 , �xuq� �
4) if � � ��i�*�h�c:�B���� � and�j� � �� ��:a9 , �xuq� �
With thefirst condition,we cantrackflows thatsystemati-cally transmitovertheidentifiedthreshold.If theflow hap-pensto transmitunderthethresholdfor a smallnumberofintervals,thenwith condition(3) wemakesurethatit doesnot changeits state.With thefourth condition,we classifyasmicethoseflows thatfall consistentlybelow thethresh-old. If thoseflowshappento transmitoverthethresholdforasmallamountof intervals,thenthey remainmice(condi-tion 2). The “latent heat” metric takes into accounthowmuchbeyond or below the thresholda flow may transmitandthusaccountsfor majorchangesin theflow’sbehavior.In otherwords,if a flow hasbeenidle andstartstransmit-
10
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 11
ting at threetimesthe identifiedthreshold,thenit will beclassifiedas an elephantwithin at least4 time intervals,assuming� thethresholdremainsconstant.
Weclassifyflows basedon thethresholdvaluescalculatedusingthe “aest” approach,andtheir “latent heat” in time.Theresultingnumberof elephantsandfractionof totaltraf-fic apportionedto theelephantclassis presentedin Figures11and12for bothtraces.If wecomparethesefigureswiththoseof Figures6 and7, we seethat with this classifica-tion approachwe cankeepthe load reasonablyconsistent(Figure12) while simultaneouslyexperiencinglessfluctu-ationsin the elephantload (Figure 11). Henceour two-featureclassificationschemeusing latentheatachieves abettertradeoff betweenreducingfluctuationsin eitherele-phantloador thenumberof elephants,thanany of thesin-gle featureclassificationschemes.
Theempiricalprobabilitydensityfunctionfor theaverageholdingtimein theelephantstateis presentedin Figure13.Specificstatisticsaregivenin TableVI. Thereis no longera peakat smallholdingtime values.Recallthat for singlefeatureclassificationwe have averageholdingtimesof 20and40 minutes(dependinguponthetrace),while hereweseeaverageholding timesfor both tracesare1 1/2 hoursor longer. We believe that classificationschemessuchasthis one,thatavoid reclassificationfor short-termfluctua-tions, identifiesthe type of long-lived elephantflows thataregoodcandidatesfor traffic engineeringapplications.
Averageholdingtime statistics
min. 20% 50% avg. 80% max.westcoast
1 7 16.5 23.56 44 60
eastcoast 1 6 11.5 17.42 24 60
TABLE VISTATISTICS FOR THE HOLDING TIMES AT THE ELEPHANT STATE
FOR THE “ LATENT HEAT” APPROACH ON THE AEST THRESHOLDS
(IN 5-MINUTE SLOTS).
VI. PROFILE OF ELEPHANT FLOWS
In thissectionweexaminetheprofileof theelephants.Weconsideredthoseflows thatwereclassifiedaselephantsbyour two featureclassificationmethodthroughouttheentire5 hour peakperiod. As a preliminaryassessmenton theprofile of the elephants,we examinetheir network prefixlengthsandthetypeof organizationthey belongto.
09:00 13:00 17:00 21:00 01:00 05:00 09:00 13:000
200
400
600
800
1000
1200
1400
Jul 24, 2001 Jul 25, 2001
Num
ber
of e
leph
ants
west coasteast coast
Fig. 11. Numberof elephantsderived using the two-featureclassificationmethod.
09:00 13:00 17:00 21:00 01:00 05:00 09:00 13:000
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Jul 24, 2001 Jul 25, 2001
Fra
ctio
n of
tota
l tra
ffic
appo
rtio
ned
to e
leph
ants west coast
east coast
Fig. 12. Fractionof total traffic apportionedto elephantsusingthetwo-featureclassificationmethod.
0 10 20 30 40 50 6010
0
101
102
103
104
Average holding time in the elephant state
Num
ber
of o
ccur
ence
s (lo
g)
west coasteast coast
Fig. 13. Averageholding times in the elephantstate,whenclassifyingbasedon theaestthreshold,andthelatentheatof theflow bandwidthin anhour.
11
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 12
0 5 10 15 20 25 3010
0
101
102
103
104
105
106
Network Prefix Length
Em
piric
al D
ensi
ty D
istr
ibut
ion
Fun
ctio
n
west coast active prefixeswest coast elephant prefixeseast coast active prefixeseast coast elephant prefixes
Fig. 14. Network prefix lengthdistribution for theelephants.
One would expect that the network prefixes of the ele-phants would correspondto large domains advertisedthroughBGP. As alreadymentioned,the monitoredlinksare OC-12 links interconnectingbackbonerouters in aPOP, andtraffic is capturedas it travels towardsthe coreof thenetwork. In Figure14 we presentthedensityfunc-tion for the lengthof network prefixesthatbecomeactiveatsomepoint in our trace.Weobserve thatmostof ourac-tivenetwork prefixescomefrom /24networks,followedby/16 and/18. Active networks with prefixesof /8, and/10arelimited to lessthan100. In thesamefigurewe presentsimilarstatisticsfor thenetwork prefixesthatgetclassifiedaselephants.Thelengthsfor theelephantprefixesarecon-tainedin a smallerregion, namelybetween/12 and /26.For the west-coastwe even witnessa /32 network alwaysin the elephantclass. This is a customernetwork usingNAT, advertisinga single IP addresswithin the Sprint IPbackbone.
Finding only two or three(dependingupon the trace)/8flows amongourelephantsis anunexpectedresult.Sucharesultcannotbegeneralizedfor thewholeInternetbecauseit is influencedby theway our network is engineered,andthe way BGP policies impact traffic flow throughoutournetwork. For instance,we comparedthe elephantsiden-tified on the westandthe eastcoast,andwe found that afew network prefixeswereelephantsononly oneof thetwocoasts.This resultwasexpectedsinceBGPpoliciesguidethe entry andexit pointsof our network for traffic that isnotdestinedfor ourcustomers.
In order to get a betterunderstandingabout the type ofbusinessescontributing the most to the overall load, weclassify elephantsinto the following categories: Tier-1,Tier-2, Tier-3 ISPs,corporations,universities,andgovern-mentinstitutions.Thedefinitionof whatconstitutesaTier-1 network is rathercontroversial.Themostpopulardefini-tion is thatTier-1 ISPsarethosenetworksthathave access
to the global InternetRoutingTableanddo not purchasenetwork capacityfrom other providers [1]. The secondtier generallyhasa smallernationalpresencethana Tier1 ISPandmay leasepartor all of its network from a Tier1. Lastly, aTier 3 ISPis typically a regionalprovider withnonationalbackbone.
Accordingto theabove categorization,we profile theele-phantsin ournetwork asshown in TableVII. Thevastma-jority of the elephantnetwork prefixesbelongsto Tier-1,Tier-2, andTier-3 ISP providers. We believe that profil-ing elephantflows is essentialonceone intendsto applyspecifictraffic engineeringpolicieson the traffic destinedtowardsthem.
Tier-1 Tier-2 Tier-3 corp. .edu gov.
westcoast
40 42 28 32 19 6
eastcoast
22 72 27 40 7 7
TABLE VIIPROFILE OF THE ELEPHANT NETWORK PREFIX FLOWS.
Noticethatcertainbusinessesmayadvertisemorethanonenetwork prefixesfor their network. Therefore,the resultspresentedin TableVII do not correspondto uniquebusi-nesses,but ratherto the numberof elephantnetwork pre-fixesin eachcategory.
VII . CONCLUSIONS
In thispaperwe consideredclassificationschemesfor net-work prefix flows that identify elephantsby separatingtheflows into two classescalledelephantsandmice. We con-sideredtwo singlefeatureschemesin depth,andtwo oth-ers summarily. We comparedthesemethodsin termsofthe numberof elephantsthey yield, and the total load intheelephantclass.Weshowedthatsinglefeatureschemesthat lead to persistency for one of thesemetrics, lead tofluctuationsfor theothermetric. Hencethereis a tradeoffbetweenthesetwo metrics.
We evaluatedmetricsbasedon the temporalbehavior ofelephants.We illustratedthe volatility of elephantsthreeways: 1) we found thatflows remainin theelephantstatefor surprisinglyshort periodsof time; 2) the numberofelephantsvariesconsiderablyoverdifferenttime intervals;and3) if elephantsarenot reclassifiedfrequently, thetotalloadin theelephantclasscanchangesubstantially.
12
SPRINTATL TECHNICAL REPORT TR01-ATL-110918 13
Elephantflowsthatarenotpersistentfor sufficientamountsof time, suchasperiodsof time larger thanroutingproto-col con� vergencetimes,maynot beusefulfor traffic engi-neeringapplications.Therefore,classificationtechniquesshould make sure that flows are reclassifiedonly whennecessaryandnot whenthey exhibit short-termtransitionsabove or below thethreshold.
For that reason,we proposeda new classificationmech-anismbasedon two features,a flow’s bandwidthand its“latent heat”. We calculatethe “latent heat” of a flowas follows. We measurethe differencebetweena flow’sbandwidthandtheseparationthresholdat eachtime inter-val, andsuma sequenceof thesedifferencesover multipletime intervals.Thismeasuresmoothesoutshort-termtran-sitionsbecausean elephant’s latentheatwill remainpos-itive despitea brief transitionbelow the threshold.Simi-larly, a mouse’s latentheatwill remainnegative despiteabrief transitionabove the separationthreshold. We showthatwith this two featureclassificationschemeweachievea bettertradeoff betweenour first two metrics. The ele-phantload remainsreasonablyconsistentwhile the num-berof elephantsfluctuateslessthanin single-featureclas-sificationschemes.More importantly, the propsoedtwo-featureschemeyields elephantsthat remainelephantsformuchlongerperiods.With singlefeatureclassificationele-phantsremainedelephantsonaveragebetween20-40min-utes,while using the two featureclassificationapproach,elephantsremainedelephantsfor 1 1/2 to 2 hours.
In profiling the elephantsfrom both our traces,we foundtheelephants’prefixesgenerallylie within a limited rangeof network prefix lengthsbetween/16 and /26. Surpris-ingly, we found few large prefixessuchas/8 and/10. Al-thoughtherewerea few corporations,governmentagen-ciesanduniversitiesamongour elephants,we learnedthatthevastmajority of themareTier-2 andTier-3 ISPs.
Overall, we concludethat while single featureclassifica-tion schemesare attractive in their simplicity, they areprobablyinsufficient for mosttraffic engineeringapplica-tions. While we believe thatour two featureclassificationschemeis more attractive to traffic engineeringapplica-tions, we also concludethat identifying elephantsis notstraightforward despitetheir heavy-hitter nature. This isimportantbecausetheideaof isolatingelephantsfor trafficengineeringpurposeshasbeenwidely proposed,but therehasbeenno prior effort on assessingthefeasibility andis-suesinvolvedin doingso.
REFERENCES
[1] D. Allen. Theimpactof peeringon isp performance:What’s bestfor you? NetworkMagazine, 2001.
[2] S. Bhattacharyya,C. Diot, J. Jetcheva, and N Taft. POP-levelandAccess-Link-Level Traffic Dynamicsin a Tier-1 POP. ACMSIGCOMMInternetMeasurementWorkshop, November2001.
[3] M. Crovella. Performanceevaluationwith heavy taileddistribu-tions. In Lecture Notesin ComputerScience1786, March2000.
[4] M. CrovellaandM. Taqqu.EstimatingtheHeavy Tail Index fromScalingProperties.MethodologyandComputingin AppliedProb-ability, 1999.
[5] C. EstanandG. Varghese. New Directionsin Traffic Measure-ment and Accounting. ACM SIGCOMMInternet MeasurementWorkshop, August2001.
[6] W. FangandL. Peterson.Inter-AS Traffic PatternsandTheir Im-plications.Proc.IEEE Globecom, December1999.Brazil.
[7] S.Floyd. Simulationis Crucial. IEEESpectrum, January2001.[8] C. Fraleigh,C. Diot, B. Lyles, S. Moon, P. Owezarski,D. Pa-
pagiannaki,andF. Tobagi. Designanddeploymentof a passivemonitoring infrastructure. In Proceedingsof Passiveand ActiveMeasurementWorkshop, Amsterdam,April 2001.
[9] S.Sarvotham,R.Riedi,andR.Baraniuk.Connection-Level Anal-ysisandModelingof Network Traffic. ACM SIGCOMMInternetMeasurementWorkshop, August2001.
[10] A. Shaikh,J. Rexford, andK. Shin. Load-Sensitive RoutingofLong-LivedIP Flows. Proc.ACM SIGCOMM, September1999.
[11] J.W. Steward. BGP4: Inter-DomainRoutingin theInternet. Ad-disonWesley, 1999.
[12] S.Uhlig andO. Bonaventure.OntheCostof UsingMPLSfor In-terdomainTraffic. Qualityof Future InternetServices, September2000.Berlin.
13