[IEEE 2005 2nd International Symposium on Wireless Communication Systems - Siena, Italy (05-09 Sept. 2005)] 2005 2nd International Symposium on Wireless Communication Systems - Further

Further Improving TCP Performance on Satellite Channel

C. Caini and R. Firrincieli
DEIS/ARCES, University of Bologna, Viale Risorgimento 2, 40136, Bologna, Italy

Email: [email protected], [email protected]. it, tel: +390512093062, fax: +390512095410

Abstract-Satellite systems offer several advantages to wireless Internet communications. Nevertheless, satellite radio links are affected by two severe problems, namely long propagation delays and relatively high error rates, which pose a difficult challenge to the Internet protocols and in particular to TCP. Among the potential solutions, we are specifically interested in the TCP enhancements that preserve the end-to-end semantics of TCP. In this paper, the authors evaluate the performance improvements provided by some additional features, when applied to TCP Westwood (TCPW), a promising transport protocol specifically designed to cope with errors on wireless links. This enhanced version (E-TCPW) is compared with both the original TCPW and other known TCP variants (NewReno, SACK, Hybla), considering goodput, fairness and friendliness as performance figures. Results, obtained through ns-2 simulations, seem very encouraging and suggest the inclusion of the proposed additional features in the future official versions of TCPW.

Index Terms- Transport Protocols, Satellite Communications, SACK, TCP Westwood.

I. INTRODUCTION

Satellite systems offer several advantages to wireless Internet communications. They generally provide wider coverage, higher bandwidth, and relatively faster setup than other wireless systems (e.g. radio-terrestrial). Nevertheless, satellite links pose some challenges to the Internet protocols and in particular to TCP (Transmission Control Protocol). First, the Round Trip Time (RTT) is greatly increased by the long propagation time on the satellite radio channel (for instance, the RTT is about 300 ms when the forward link is via a GEO satellite and the return link via a fast terrestrial network, or 600 ms if both the forward and the return links are via a GEO satellite). Second, the presence of random losses not originated by congestion cannot be considered negligible. Both these aspects pose severe threats to the transport layer performance and require the adoption of adequate countermeasures [1].

A number of interesting contributions have been presented in the literature to solve these problems. Among the potential solutions, we are specifically interested in the enhancement proposals that preserve the end-to-end semantics of TCP. In particular, in the development of the authors' proposal (TCP Hybla [2]), some additional components proved very effective, namely the SACK option, packet spacing, the initial Slow Start threshold estimation and the congestion avoidance adjustment at the end of the recovery phase (recovery fix). In this paper, the authors evaluate the performance improvements provided by the same additional features, when applied to TCP Westwood (TCPW) [3], a promising transport protocol specifically designed to cope with errors on wireless links. TCPW basically aims at limiting the negative effects of channel losses, through a bandwidth estimation that replaces the standard cwnd halving. As TCPW appears compatible (at least in principle) with the aforementioned additional features, we suppose that its performance can be further improved by applying them to it, especially when dealing with a satellite channel. On these premises, we developed an ns-2 module, based on the SACK class, to implement an enhanced TCPW (E-TCPW) which includes the proposed additional features. Performance evaluation is then carried out by comparing the new E-TCPW with TCPW (making use in this case of the official ns-2 module) and TCP Hybla.

Results are presented considering both micro- and macro-analysis. In the former case, a detailed segment analysis allows a direct insight into the advantages provided by the enhancements introduced, as well as a graphical representation of their mechanisms. In the latter case, more complex layouts can be studied (e.g. congestion with background terrestrial traffic on a shared bottleneck), considering goodput, fairness and friendliness as performance indicators.

This work was supported in part by the IST-507052 SatNEx Network of Excellence.

II. KNOWN TCP VARIANTS

A. TCP NewReno and TCP SACK

TCP NewReno [4] is the most widely adopted TCP version and it will be considered as a standard reference when describing the following TCP variants. When a new TCP connection is established, the sender probes for bandwidth availability by gradually increasing the congestion window (cwnd). In the Slow Start (SS) phase, starting from an initial value typically equal to one or two maximum segment sizes (MSS), the cwnd is increased by MSS bytes for every non-duplicate ACK received. When the congestion window reaches the slow start threshold (ssthresh), the source switches to the Congestion Avoidance (CA) phase, during which the cwnd increment is reduced to MSS/W bytes for every new ACK received. In short, by expressing the value W of the congestion window in MSS units, we have

$$
W_{i+1} =
\begin{cases}
W_i + 1 & \text{SS} \\
W_i + 1/W_i & \text{CA}
\end{cases}
\qquad (1)
$$

where the index i denotes the reception of the i-th ACK.

0-7803-9206-X/05/$20.00 ©2005 IEEE

In a real channel the cwnd rise continues until either the size of the receiver buffer (advertised window) is reached, or a segment is lost. In the latter case, a duplicate acknowledgement (dupACK) is generated by the receiver every time a new segment arrives. The reception of the third dupACK triggers the Fast Retransmit and Fast Recovery procedures in the sender. First, the lost segment is retransmitted, then the ssthresh is updated to max(FlightSize/2, 2 MSS), where FlightSize represents the number of unacknowledged packets when the loss was detected.
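The window rules of (1) and the ssthresh update on the third dupACK can be sketched in a few lines of Python. This is only an illustrative model in MSS units, not the ns-2 agent code; the function names and the default ssthresh of 8 segments are assumptions made for the example.

```python
def on_new_ack(cwnd: float, ssthresh: float) -> float:
    """Return the updated cwnd (in MSS units) after one non-duplicate ACK."""
    if cwnd < ssthresh:
        return cwnd + 1.0        # Slow Start: +1 MSS per ACK
    return cwnd + 1.0 / cwnd     # Congestion Avoidance: +1/W MSS per ACK


def ssthresh_after_loss(flight_size: float) -> float:
    """ssthresh set on the third dupACK: max(FlightSize/2, 2), in MSS units."""
    return max(flight_size / 2.0, 2.0)


def simulate_acks(n_acks: int, cwnd: float = 1.0, ssthresh: float = 8.0) -> float:
    """Apply n_acks consecutive new ACKs, starting from the given cwnd."""
    for _ in range(n_acks):
        cwnd = on_new_ack(cwnd, ssthresh)
    return cwnd
```

Starting from one segment with the hypothetical ssthresh of 8, seven ACKs bring the window to the threshold; from then on each ACK adds only 1/W, which is why long RTT connections recover their window slowly after a loss.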

It is worth noting that if multiple losses occur in the same window, NewReno remains in the recovery phase until either all the lost packets have been recovered or a Retransmission Time Out (RTO) expires. As NewReno is able to recover only one packet per RTT, the time spent in the recovery phase may be huge for long RTT connections. To overcome this problem, the selective acknowledgment (SACK) option was proposed [5]. By means of it, the receiver informs the sender about all the segments that have been successfully delivered, thus allowing the sender to recover more than one packet per RTT. This ability is extremely useful when dealing with large cwnds (where multiple losses are frequent) and long RTTs, i.e. on satellite channels, as will be shown later.

B. TCP Westwood

TCPW [3] aims at limiting the consequences of the losses introduced by a wireless channel, which are always erroneously ascribed to congestion by the standard TCP protocol. To this end, TCPW introduces a modification of the Fast Recovery algorithm called Faster Recovery. Instead of halving the congestion window after three duplicate ACKs, and fixing the slow start threshold to this value, TCPW sets the ssthresh as a function of the actual available bandwidth, avoiding the dramatic slow-down of the transmission rate of the standard TCP versions. The bandwidth is estimated by averaging the rate of returning ACKs. In particular, the reception of an ACK at the time $t_k$ implies that an amount of data $d_k$ has been received. Therefore, the k-th sample of bandwidth used by a given connection is measured as:

$$
b_k = \frac{d_k}{t_k - t_{k-1}} \qquad (2)
$$

where $t_{k-1}$ is the arrival time of the previous ACK. These values are then filtered to obtain the available bandwidth estimate (BWE) at time $t_k$, given by

$$
\hat{b}_k = \alpha_k \hat{b}_{k-1} + (1 - \alpha_k)\,\frac{b_k + b_{k-1}}{2} \qquad (3)
$$

where $\alpha_k = (2\tau - \Delta t_k)/(2\tau + \Delta t_k)$ depends on the inter-arrival time ($\Delta t_k = t_k - t_{k-1}$), which is variable, and $1/\tau$ is the cut-off frequency of the filter. After a loss detection, TCP Westwood sets the ssthresh and cwnd as follows:

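$$
\text{ssthresh} = \hat{b} \cdot RTT_{\min}, \qquad
\text{cwnd} =
\begin{cases}
\text{ssthresh} & \text{if cwnd} > \text{ssthresh} \\
1 & \text{after a time out}
\end{cases}
\qquad (4)
$$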
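The estimator of (2)-(3) and the window setting of (4) can be illustrated with the following Python sketch. It is a simplified model and not the official TCPW ns-2 module: the class and function names, the choice of bytes per second as the unit, the handling of the first ACK, and the conversion of ssthresh to segments by dividing by the MSS are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class WestwoodBWE:
    tau: float                         # filter time constant (s); 1/tau is the cut-off frequency
    bwe: float = 0.0                   # filtered estimate, b_hat (bytes/s)
    prev_sample: float = 0.0           # previous raw sample b_{k-1} (bytes/s)
    prev_time: Optional[float] = None  # arrival time of the previous ACK (s)

    def on_ack(self, t: float, acked_bytes: float) -> float:
        """Update the estimate with an ACK received at time t for acked_bytes."""
        if self.prev_time is None:     # first ACK: no inter-arrival time yet
            self.prev_time = t
            return self.bwe
        dt = t - self.prev_time
        if dt <= 0:                    # ignore simultaneous or out-of-order timestamps
            return self.bwe
        sample = acked_bytes / dt                              # eq. (2)
        alpha = (2 * self.tau - dt) / (2 * self.tau + dt)
        self.bwe = alpha * self.bwe + (1 - alpha) * (sample + self.prev_sample) / 2  # eq. (3)
        self.prev_sample = sample
        self.prev_time = t
        return self.bwe


def after_loss(bwe: float, rtt_min: float, cwnd: float, mss: int,
               timeout: bool = False) -> Tuple[float, float]:
    """Return (ssthresh, cwnd) in segments after a loss, following eq. (4)."""
    ssthresh = bwe * rtt_min / mss
    if timeout:
        cwnd = 1.0
    elif cwnd > ssthresh:
        cwnd = ssthresh
    return ssthresh, cwnd
```

With a steady stream of 1500-byte ACKs every 10 ms the estimate settles at 150 kB/s, so a loss on a path with a 300 ms minimum RTT leaves the window at about 30 segments instead of halving it blindly.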

Several variants of the basic algorithm have been presented in the literature [6-7] to improve performance. In the numerical results we will make use of the public ns-2 release that can be downloaded from the official TCPW web site [8].

C. TCP Hybla

TCP Hybla [2] has been conceived with the primary aim of counteracting the performance deterioration originated by the long RTTs typical of satellite connections. It consists of a set of procedures, which includes an enhancement of the standard congestion control laws, the mandatory adoption of the SACK policy, the adoption of Hoe's channel bandwidth estimation, the use of timestamps and the implementation of packet spacing techniques.

The modification of the standard congestion control rules is dictated by the TCP Hybla ideal aim of obtaining for long RTT connections the same instantaneous segment transmission rate of a comparatively fast reference TCP connection (e.g. a wired one). To this end, it is necessary to speed up the cwnd increase of the long RTT connection because the longer the RTT, the larger the cwnd required (the segment transmission rate being given by cwnd/RTT). Let us therefore introduce the normalized round trip time, $\rho$, as the ratio between the actual RTT and the round trip time of the reference connection to which TCP Hybla aims to equalize the performance, denoted by $RTT_0$:

$$
\rho = RTT / RTT_0 \qquad (5)
$$

In the references it is shown that the same segment transmission rate of the reference connection can be achieved by longer RTT connections by replacing the standard congestion control rules with the following,

$$
W_{i+1} =
\begin{cases}
W_i + 2^{\rho} - 1 & \text{SS} \\
W_i + \rho^2 / W_i & \text{CA}
\end{cases}
\qquad (6)
$$

As already mentioned, apart from the modification of the congestion control rules, TCP Hybla includes several other important enhancements. The rationale for their introduction is the following.

SACK policy is mandatory because, thanks to the improved congestion control algorithm of TCP Hybla, a larger average cwnd has to be expected for long RTT connections. As a result, multiple losses in the same window will occur more frequently. Large cwnds also frequently cause severe inefficiencies of the "exponential back-off" RTO policy. These can be avoided by resorting to timestamps [9]. Bandwidth estimation [10] is used in order to appropriately set the initial ssthresh. If the initial ssthresh is too low compared to the bandwidth-delay product, TCP prematurely switches to the CA phase, leading to inefficient bandwidth utilization. Vice versa, if the initial ssthresh is too high, the exponential cwnd increase of the SS phase may cause huge overflows when the capacity of the channel is reached. Packet spacing [11] has been adopted to counteract the burstiness due to large cwnds. Finally, the use of the recovery fix option [12] implies that at the end of the recovery phase the cwnd is set to the minimum of ssthresh and the actual number of packets in flight. This prevents the transmission of bursts of data caused by the sliding of the cwnd at the end of the Fast Recovery.
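As a rough illustration of (5), (6) and of the recovery fix described above, the following Python sketch models the Hybla window update in MSS units. The function names are hypothetical, the default reference RTT of 25 ms is taken from the wired connections used later in the simulations, and the code is not the authors' ns-2 implementation.

```python
def normalized_rtt(rtt: float, rtt0: float = 0.025) -> float:
    """Normalized round trip time rho = RTT / RTT0, eq. (5)."""
    return rtt / rtt0


def hybla_on_ack(cwnd: float, ssthresh: float, rho: float) -> float:
    """Updated cwnd (in MSS units) after one new ACK, following eq. (6)."""
    if cwnd < ssthresh:
        return cwnd + 2.0 ** rho - 1.0   # Slow Start
    return cwnd + rho ** 2 / cwnd        # Congestion Avoidance


def recovery_fix_cwnd(ssthresh: float, in_flight: float) -> float:
    """cwnd at the end of recovery: min(ssthresh, packets in flight)."""
    return min(ssthresh, in_flight)
```

For rho = 1 the rules reduce to the standard ones in (1), which is the sense in which Hybla leaves the reference connection untouched and only speeds up longer RTT connections.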


III. TCP ENHANCED WESTWOOD

With the exception of the congestion window rules, the other features included in TCP Hybla are potentially compatible with TCP Westwood. In practice, however, it is likely that packet spacing hinders the TCPW bandwidth estimation mechanism. Therefore, we identified in the remaining three features, namely the SACK option, the initial ssthresh estimation and the recovery fix, the possible enhancements that could be straightforwardly applied to TCPW. To this end we developed a new ns-2 module by merging the TCP Sack1 and the TCP WestwoodNR agents, both already present in the simulator. Moreover, we slightly modified the TCPW BWE algorithm which, in the Fast Recovery, considers the reception of a dupACK as an indication that two segments have been received. This assumption may hold if the delayed ACK policy is applied, but it leads to a harmful overestimation of the available bandwidth in the opposite case, where the reception of a single segment should be considered instead. As the delayed ACK policy is disabled in our simulations, we modified the code accordingly.
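The change to the dupACK accounting can be pictured with a small Python sketch. It only illustrates the idea stated above (one segment credited per dupACK when delayed ACKs are off, two when they are on); the function name and parameters are hypothetical and do not reproduce the ns-2 source.

```python
def bytes_credited(is_dupack: bool, newly_acked_bytes: int, mss: int,
                   delayed_ack: bool) -> int:
    """Bytes fed to the bandwidth estimator for one received ACK."""
    if not is_dupack:
        return newly_acked_bytes          # cumulative ACK: count what it newly covers
    segments = 2 if delayed_ack else 1    # a dupACK signals 2 segments only with delayed ACKs
    return segments * mss
```

With delayed ACKs disabled, crediting two segments per dupACK would double every bandwidth sample taken during Fast Recovery, which is the overestimation the modification removes.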

IV. TOPOLOGY AND SIMULATION SETUP

The network topology considered is shown in Fig. 1. The satellite TCP connections are composed of both wired legs and a satellite link, while the background traffic is represented by entirely wired paths. All the connections share the R1-R2 bottleneck link, whose bandwidth has been deliberately limited to 10 Mbps in order to study the congestion effects. The router R1, where all the congestion events are confined, follows either a DT (Drop Tail) policy (with a buffer length equal to the Bandwidth Delay Product of the bottleneck) or a RED (Random Early Detection) policy with default parameters (qlen=50 seg., maxth=15 seg., and minth=5 seg.); all the other hosts follow a DT policy. The two-way propagation delay of the satellite links varies in such a way that the RTT of the satellite connections ranges from 25 ms (considered only for comparison purposes with the wired connections) to 600 ms (corresponding to the case of a forward and return GEO satellite link). The wired links are supposed to be error free, while a uniformly distributed error model has been considered with a variable PER (0-1%) for the satellite link. The MSS is always 1500 bytes and the senders access the 10 Mbps bottleneck through either a 10 Mbps or a 100 Mbps access link. For every TCP connection, a persistent FTP file transfer process is considered. The performance is evaluated in terms of goodput, i.e. the amount of packets correctly received divided by the transfer process time. In order to prevent the transmission bit rate from being limited by the advertised window instead of the cwnd, the advertised window of the satellite receivers has been appropriately increased. This is necessary to grant fair conditions to the different TCP flavors. Finally, the RTT0 in TCP Hybla is chosen equal to the RTT of the wired connections for comparison purposes.
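To give a feel for the numbers involved in this setup, the short Python sketch below computes the bandwidth-delay product of the 10 Mbps bottleneck in 1500-byte segments and the goodput as defined above. The helper names are hypothetical, and which RTT is used to size the DT buffer is not stated in the text, so the RTT values in the usage note are only examples.

```python
def bdp_segments(bandwidth_bps: float, rtt_s: float, mss_bytes: int = 1500) -> float:
    """Bandwidth-delay product expressed in MSS-sized segments."""
    return bandwidth_bps * rtt_s / (8 * mss_bytes)


def goodput_mbps(packets_received: int, mss_bytes: int, transfer_time_s: float) -> float:
    """Goodput: correctly received data divided by the transfer time, in Mbit/s."""
    return packets_received * mss_bytes * 8 / transfer_time_s / 1e6
```

For example, `bdp_segments(10e6, 0.3)` gives about 250 segments and `bdp_segments(10e6, 0.6)` about 500, which is the order of magnitude the cwnd of a single satellite connection must reach to fill the bottleneck.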

V. NUMERICAL RESULTS

A. Micro analysis

A detailed packet level analysis, called micro-analysis, may be useful to present the effects of the enhancements proposed for TCPW. Only one satellite connection is considered active (no background traffic) with PER=1%. In this situation the ability of TCPW to counteract losses caused by the wireless leg is not sufficient to exploit the channel capacity because the R1-R2 bottleneck induces huge buffer overflows. TCPW, indeed, being based on TCP NewReno, is able to recover only one segment per RTT (here equal to 300 ms), resulting in very long recovery phases, as shown in Fig. 2, where the behavior of the cwnd, ssthresh, and BWE is depicted as well as the sequence number of the transmitted packets. By comparing Fig. 2 with Fig. 3, which refers to E-TCPW, it is apparent that, thanks to the SACK adoption, E-TCPW is able to recover multiple losses in a much more efficient way than TCPW. Moreover, the presence of the recovery fix prevents successive losses (due to the transmission of bursts of data at the end of each recovery phase) and the initial ssthresh estimation avoids the initial buffer overflow. Other performance indicators are collected in Tab. I, where it can be noted that, considering the bottleneck congestion, E-TCPW experiences fewer losses, in relation to the amount of transmitted packets, than TCPW. Even though E-TCPW spends about 40% more time in the recovery phases than TCPW, its final goodput after 60 seconds is more than ten times that of TCPW. These results lead to the conclusion that the proposed enhancements dramatically improve the TCPW capability of exploiting the satellite channel, at least in the presence of a bottleneck. To remove this effect, which may or may not be present on a real network, we also considered the case of an access link of 10 Mbps (i.e. no bandwidth restriction on the source-sink path). Space restrictions prevent us from presenting time behaviors in this case, but the corresponding results, reported in Tab. I, show that, even if congestion events on R1 are obviously no longer registered, the E-TCPW goodput improvement is still greater than 100%. Finally, the last row of Tab. I reports the measured packet error rate, as a result of a packet error probability of 1% on the satellite channel.

B. Macro analysis

Macro analysis regards the quantitative evaluation of the overall performance of different TCP variants. The presented results are obtained as average values of several runs and are related to the achieved goodput at the end of a long (600 seconds) and continuous file transfer.

In order to have exhaustive results, we considered first a single satellite connection over a lossy link, then we added background traffic on the bottleneck and, finally, we considered the case in which a number of satellite connections compete with entirely wired connections for the bottleneck. The access link is set to 10 Mbps in order to have the most favorable conditions for TCPW.


1) Performance in presence of link losses

As the standard TCP does not distinguish the origin of packet losses, link errors cause spurious interference on the congestion control mechanism, causing many unnecessary cwnd halvings. The longer the RTT, the larger the consequences of this erroneous interpretation, as highlighted in Fig. 4, where only a single satellite connection is active and all the losses are due to the wireless channel, with a PER of 1%. Considering TCP NewReno and TCP SACK, only a limited fraction of the available 10 Mbit/s bottleneck bandwidth is actually exploited, with a fast degradation as the RTT increases. TCPW performance depends on the combined effects of PER and large RTT. For PER=1%, its countermeasures against link losses are really effective for RTTs less than 200 ms, while for longer RTTs the performance decay is evident, although at 600 ms the goodput is increased by a factor of five with respect to NewReno. As far as TCP Hybla is concerned, it must be remarked that, although this variant is not specifically designed against link losses, it provides better performance than TCPW when the RTT is greater than 200 ms (a factor of ten at 600 ms). Finally, E-TCPW always offers the best performance, even when long RTTs are considered. At 600 ms, thanks to the included enhancements, the improvement with respect to TCPW is a factor of 3.5.

2) Performance in presence of congestion and link losses

In order to analyze the effects of congestion, we added 5 background wired connections (with TCP SACK) to the previous scenario. The resulting satellite connection goodput is shown in Fig. 5, where the goodput of the background connections is not reported for the sake of clarity (in brief, they equally share most of the bandwidth not used by the satellite connection). Congestion does not dramatically alter the qualitative performance behavior seen in the previous case, although a general performance reduction is obviously present. The main result is that TCP Hybla and E-TCPW basically provide the same excellent performance, showing a substantial independence of the RTT with a goodput very close to the "maximum fair share" (i.e. the capacity of the bottleneck link divided by the number of sharing connections). While this limited dependence on the RTT is coherent with the primary aim of TCP Hybla, it can be explained for E-TCPW by supposing that the more aggressive ssthresh setting method leads the TCP to often operate in slow start, compensating in this way for the RTT penalization. On the other hand, it could be observed that E-TCPW could be too aggressive when considering RTTs less than 100 ms, as its goodput exceeds the maximum fair share.

3) Fairness and friendliness

Fairness and friendliness are two important features for any version of the TCP protocol. Fairness refers to the capacity to assure a fair bandwidth subdivision among competing connections that use the same version of the protocol, while friendliness indicates the same ability with reference to different protocol variants. To study fairness and friendliness in a heterogeneous environment, Fig. 6 reports the R1-R2 bandwidth share of three satellite connections (RTT=300 ms, PER=1%) and three wired connections (RTT=25 ms, PER=0%), all simultaneously active. TCP SACK confirms a good fairness among connections with the same RTT but also a large unfairness between wired and satellite connections, essentially because of the different RTT (it would have been present even in the absence of PER). TCPW reduces the penalization suffered by satellite connections, which however is still apparent. By contrast, both TCP Hybla and E-TCPW basically remove the satellite penalization, at the expense of only a slight increase in the amount of unused bandwidth.
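The "maximum fair share" yardstick and the bottleneck share percentages discussed above can be computed as in the following Python sketch. The helper names are hypothetical, and the per-connection goodputs in the usage note are made-up values chosen only to show the arithmetic; they are not the measured results of Fig. 6.

```python
from typing import Dict


def max_fair_share(capacity_mbps: float, n_connections: int) -> float:
    """Bottleneck capacity divided by the number of sharing connections."""
    if n_connections <= 0:
        raise ValueError("n_connections must be positive")
    return capacity_mbps / n_connections


def bandwidth_shares(goodputs_mbps: Dict[str, float], capacity_mbps: float) -> Dict[str, float]:
    """Percentage of the bottleneck used by each connection, plus the unused part."""
    shares = {name: 100.0 * g / capacity_mbps for name, g in goodputs_mbps.items()}
    shares["unused"] = 100.0 - sum(shares.values())
    return shares
```

With six connections on the 10 Mbps bottleneck, `max_fair_share(10.0, 6)` is about 1.67 Mbps, i.e. about 16.7% of the link each. Calling `bandwidth_shares` on hypothetical goodputs of 1.6 Mbps for each wired flow and 1.5 Mbps for each satellite flow would report 16% and 15% shares and 7% unused bandwidth.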

VI. CONCLUSIONS

The results presented in the paper indicate that E-TCPW always outperforms TCPW in connections over satellite links, without any noticeable drawback. Consequently, the authors suggest that the proposed additional features should be considered for inclusion in the next release of TCP Westwood.

REFERENCES

[1] M. Allman et al., "Ongoing TCP Research Related to Satellites", RFC 2760, February 2000.

[2] C. Caini and R. Firrincieli, "TCP Hybla: a TCP Enhancement for Heterogeneous Networks", International Journal of Satellite Communications and Networking, vol. 22, n. 5, Sep. 2004, pp. 547-566.

[3] C. Casetti, M. Gerla, S. Mascolo, M. Y. Sanadidi, and R. Wang, "TCP Westwood: end-to-end congestion control for wired/wireless networks", Wireless Networks, Kluwer Academic Publisher, 8, 467-479, 2002.

[4] M. Allman, W. Stevens, "TCP congestion control", IETF RFC 2581, April 1999.

[5] M. Mathis and J. Mahdavi, "TCP Selective Acknowledgment Options", IETF RFC 2018, October 1996.

[6] M. Gerla, B. K. F. Ng, M. Y. Sanadidi, M. Valla, R. Wang, "TCP Westwood with adaptive bandwidth estimation to improve efficiency/friendliness tradeoffs", Journal of Computer Communications, Vol. 27, Number 1, January 2004.

[7] R. Wang, G. Pau, K. Yamada, M. Y. Sanadidi, M. Gerla, "TCP Startup Performance in Large Bandwidth Delay Networks", INFOCOM 2004, Hong Kong, March 2004, pp. 796-805.

[8] TCP Westwood web site, http://www.cs.ucla.edu/NRL/hpi/tcpw/.

[9] V. Paxson and M. Allman, "Computing TCP's Retransmission Timer", IETF RFC 2988, Nov 2000.

[10] J. C. Hoe, "Improving the Start-up Behavior of a Congestion Control Scheme for TCP", in Proceedings of ACM SIGCOMM '96, pp. 270-280, 1996.

[11] C. Caini and R. Firrincieli, "Packet spreading techniques to avoid bursty traffic in satellite TCP connections", Proc. of IEEE 59th Vehicular Technology Conference, VTC2004-Spring, Milan, Italy, May 2004, pp. 2906-2910.

[12] S. Floyd, T. Henderson, and A. Gurtov, "The NewReno modification to TCP's fast recovery algorithm", RFC 3782, http://www.faqs.org/rfcs/rfc3782.html, April 2004.

Fig. 1. Simulation topology.
[Figure not reproduced. Labels: satellite sender, wired sender, routers R1 and R2, 10 Mbit/s bottleneck, satellite link, satellite receiver, wired receiver.]

Fig. 2. TCPW micro analysis: transmitted segments, BWE, ssthresh and cwnd as a function of the elapsed time.
[Plot not reproduced. Horizontal axis: elapsed time (s), 0 to 60.]

Fig. 3. E-TCPW micro analysis: transmitted segments, BWE, ssthresh and cwnd as a function of the elapsed time.
[Plot not reproduced. Curves: BWE, ssthresh, cwnd, Tx. Horizontal axis: elapsed time (s), 0 to 60.]

TABLE I
PERFORMANCE FIGURES CARRIED OUT BY MICRO ANALYSIS (FIRST 60 s)

                   TCPW          E-TCPW        TCPW          E-TCPW
                   access link   access link   access link   access link
                   @ 100 Mbps    @ 100 Mbps    @ 10 Mbps     @ 10 Mbps
Goodput (Mbps)     0.24          3.25          2.27          4.89
Recovery time      52.46%        71.94%        56.33%        64.67%
R1 tx losses       11.16%        7.43%         0.00%         0.00%
Sat tx losses      1.21%         1.02%         0.97%         1.11%

Fig. 4. Performance of various TCP techniques in presence of a residual packet error rate on the satellite channel (PER=1%); no congestion.
[Plot not reproduced. Horizontal axis: RTT (s), 0 to 0.6.]

Fig. 5. Performance of various TCP techniques in presence of both a residual packet error rate on the satellite channel (PER=1%) and R1-R2 congestion (5 terrestrial connections with SACK, RTT=25 ms and PER=0%).
[Plot not reproduced. Horizontal axis: RTT (s), 0 to 0.6.]

Fig. 6. TCP fairness and friendliness: R1-R2 bottleneck share in a heterogeneous environment. SACK on the terrestrial connections (PER=0%, RTT=25 ms) and: a) SACK, b) E-TCPW, c) TCPW, d) Hybla, on the satellite ones (PER=1%, RTT=300 ms).
[Charts not reproduced. Each panel shows the percentage share of each connection and the unused bandwidth.]
