
The Silence of the LANs: Efficient Leakage Resilience for IPsec VPNs

Steffen Schulz, Vijay Varadharajan, and Ahmad-Reza Sadeghi

Abstract—Virtual Private Networks (VPNs) are increasingly used to build logically isolated networks. However, existing VPN designs and deployments neglected the problem of traffic analysis and covert channels. Hence, there are many ways to infer information from VPN traffic without decrypting it. Many proposals have been made to mitigate network covert channels, but previous works remained largely theoretical or resulted in prohibitively high padding overhead and performance penalties.

In this work, we (1) analyse the impact of covert channels in IPsec, (2) present several improved and novel approaches for covert channel mitigation in IPsec, (3) propose and implement a system for dynamic performance trade-offs, and (4) implement our design in the Linux IPsec stack and evaluate its performance for different types of traffic and mitigation policies. At only 24% overhead, our prototype enforces tight information-theoretic bounds on information leakage. To encourage further research, we put our prototype code and data in the public domain.

Index Terms—IPsec, VPNs, covert channels, performance, trade-off

EDICS: STG-COVC, STG-APPS, NET-ATTP

I. INTRODUCTION

Virtual Private Networks (VPNs) are popular means for enterprises and organizations to securely connect their network sites over the Internet. Their security is implemented and enforced by VPN gateways that tunnel the transferred data in secure channels, thus logically connecting the remote sites in an isolated network. Abstracted this way, VPNs are increasingly used in scenarios that secure channels were not designed for: to logically isolate networks, providing “networks as a service” in virtualized environments like Clouds, Trusted Virtual Domains, or the Future Internet [1], [2], [3]. However, what is not considered in these scenarios is the long known problem of covert channels.

Covert channels violate the system security policy by using channels “not intended for information transfer at all” [4], [5]. While there is a large body of research on covert channels, few works have considered the practical implementation and performance impact of comprehensive covert channel mitigation in modern networks. We believe this topic is important for a number of reasons, especially in virtual networks and VPNs:

(1) Insider Threat: In contrast to end-to-end secure channels, where the endpoints are implicitly trusted, VPNs are also used for logical network isolation and perimeter security enforcement. In this context, the members of a VPN are often not fully trusted, but instead the trust is reduced to central policy enforcement points, the VPN gateways, which should prevent undesired information flows. However, malicious insiders in the LAN may leak information through the VPN gateways using covert channels, thus circumventing the security policy. Examples of such insiders can be actual humans or stealth malware, engaging in industrial espionage, leaking realtime financial transaction data, or disclosing large amounts of data from physically secured institutions (e.g., to Wikileaks).

Ahmad Sadeghi is director of the System Security Labs and the Intel Collaborative Research Institute for Secure Computing (ICRI-SC) at Technische Universität Darmstadt, Germany.

Vijay Varadharajan leads the Information and Networked System Security (INSS) group at Macquarie University, Australia.

Steffen Schulz is a PhD student at the System Security Labs, Ruhr-University Bochum, Germany, the INSS at Macquarie University, and associated with the ICRI-SC at Technische Universität Darmstadt.

(2) Traffic Analysis: By analysing traffic patterns and metadata, it is also possible to infer information about transferred data without assuming a malicious insider [6], [7]. Such “passive” Man-in-the-Middle (MITM) scenarios are becoming more prevalent with network virtualization, allowing co-located, supposedly isolated systems to analyse each other [8]. To mitigate such attacks, a common approach is to consider the maximum possible information leakage by assuming colluding malicious insiders. By limiting this information leakage, covert channel mitigation thus also affects traffic analysis [9].

(3) Combination with Detection: Although application-layer firewalls and intrusion detection systems are widely deployed, carefully designed covert channels remain hard to detect [10], [11]. In these systems, the adversary chooses a weaker signal and mimics the patterns of regular channel usage. Covert channel mitigation can be useful here to induce noise, forcing the adversary to use a stronger signal and thus facilitate detection. We expect the combination of covert channel mitigation and detection to allow for less intrusive pattern enforcement and thus significantly reduce the performance penalty.

a) Contributions: This paper provides for the first time an explicit analysis of covert channels in IPsec-based VPNs and a comprehensive set of techniques and mechanisms to mitigate them. We identify and categorize the different types of covert channels and determine their capacity. We develop a framework for mitigation of these covert channels and describe mechanisms and techniques for high-performance covert channel mitigation. In particular, we propose an algorithm for on-demand adjustment of traffic pattern enforcement that increases peak network performance while also reducing overhead during reduced usage. We present a practical instantiation of this framework for the Linux IPsec stack and analyse its performance for different kinds of traffic. In contrast to previous works, which achieve throughput rates in the range of modem speed [12], [9] and criticize the performance impact of proposed mitigation mechanisms [13], our prototype achieves 169 Mbit/s in a 200 Mbit/s VPN link at only 24% overhead.


[Figure 1: diagram with the labels MITM, LAN, Insider, VPN-Gateway, WAN/VPN, Unprotected Domain, Protected Domain, outbound covert channel, and inbound covert channel.]

Fig. 1. Problem scenario: A VPN with three LAN sites. The adversary aims to exchange information between the MITM and malicious insiders using covert channels.

b) Outline: After defining the problem of VPN covert channels in Section II, we discuss efficient covert channel mitigation and performance trade-offs in Section III. An implementation for the Linux IPsec stack is presented and evaluated in Section IV. We discuss related work in Section V and conclude in Section VI. A detailed discussion of the identified covert channels in IPsec VPNs is provided in Appendix A.

II. PROBLEM SETTING AND ADVERSARY MODEL

In the following we define the problem of covert channels in VPNs. Note that our definition differs from previous, less explicit considerations, which consider communication between legitimate VPN participants and are better described as steganographic channels [14], [15], [16]. Although we limit ourselves to VPNs in state-of-the-art IPsec configuration [17], most of our results can be generalized.

A. System Model and Terminology

As illustrated in Figure 1, we consider a VPN comprising two or more Local Area Networks (LANs) that are interconnected over an insecure Wide Area Network (WAN). In our scenario, the security goal of the VPN is not only to provide a secure channel (confidentiality, authenticity, integrity) but also to confine communication of LAN hosts to the VPN, i.e., to isolate the protected from the unprotected domain. VPNs are increasingly used for such logical isolation, to create secure virtualized or overlay networks, or simply enforce perimeter security in large companies [1], [2], [3]. This de-facto security goal of isolating the protected from the unprotected domain, and its efficient implementation, is the main focus of this work.

For this purpose, we distinguish legitimate channels that transfer and protect user data according to the VPN security policy from covert channels that can be used to circumvent this policy. Covert channels exist because the legitimate channel acts as a shared resource between the protected and unprotected domain, exhibiting certain characteristics that can be manipulated and measured by different parties. We denote channels from the protected to unprotected domain and vice versa as outbound and inbound covert channels, respectively.

We measure the security of our system using the Shannon capacity of the covert channels, i.e., the information theoretic limit on the amount of information that can be transferred through them [6]. The covert channel capacity is given in bits per legitimate channel packet (bpp) or, where applicable, in bits per second (bps). The capacity of each covert channel type is denoted as $C^{type}$. The capacities are classified as maximum ($m$) vs. remaining ($r$) covert channel rate for inbound ($in$) vs. outbound ($out$) covert channels. For example, the maximum capacity of the outbound covert channel based on packet size is denoted as $C^{PktSize}_{m,out}$, or as $C^{PktSize}_{r,out}$ after countermeasures have been applied. The remaining aggregated inbound and outbound covert channel rates are denoted as $C_{r,in}$ and $C_{r,out}$, respectively.

B. Adversary Model

The adversary controls one or more compromised hosts in the LAN sites as well as an active MITM in the WAN. We refer to the LAN hosts controlled by the adversary as (malicious) insiders, regardless of whether they are controlled by actual humans or malware. The adversary’s goal is to establish a communication channel between the MITM and one or more possibly colluding malicious insiders, as illustrated in Figure 1. This would allow the adversary to send instructions to the insiders or to leak information from the protected to the unprotected domain, breaching the perimeter security of the VPN. For this purpose, we assume a state-of-the-art IPsec configuration with authenticated encryption using Encapsulating Security Payload (ESP) in tunnel mode [17], and that the cryptographic primitives of the VPN are securely enforced by the VPN gateways. However, the legitimate VPN traffic can be manipulated by malicious parties in the protected and unprotected domains to exchange information that “survives” the packet transformations enforced by the VPN gateways.

Unfortunately, no systematic approach is known for identifying network covert channels apart from exhaustive search, and the categorization as storage or timing channels can be ambiguous [5]. We used a comprehensive analysis of the IPsec specification and related work on covert channels in network protocols (cf. Section V), as well as source code analysis and testing¹ to identify potential covert channels in IPsec VPNs. IP tunneling and authenticated encryption by the IPsec gateways greatly simplified this problem, as none of the protocol headers that the MITM can read or modify (i.e., the outer IP and ESP header) are directly available to the LAN hosts.

In total, we have identified only eight covert channels. As shown in Table I, the available covert channels comprise three storage-based channels based on fields in the outer IP header (ECN, DS, Flags) and five timing-based covert channels that manipulate Inter-Packet Delay (IPD), packet order (PktOrd), WAN capacity (PktDrop), and Path MTU Discovery (PMTUD). The remaining characteristic of the respective destination LAN of a packet (DestIP) does not constitute a covert channel in its own right but can act as an amplification of other covert channels. For a detailed description and analysis of these channels, please refer to Appendix A.

¹ Specifically, we examined the IPsec implementations of Linux 2.6.32 to 2.6.38 and OpenBSD 4.7 to 4.8 releases.


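| Class | Type | Outbound $C_m$ (bpp) | Inbound $C_m$ (bpp) |
|---|---|---|---|
| storage | ECN | 2 | 1 |
| storage | DS | 6 | 6 |
| storage | Flags | 1 | - |
| storage | PktSize | 8.4 | - |
| timing/channel-logic | IPD | ≥ 1 | ≥ 1 |
| timing/channel-logic | PktOrd | - | > 6.58 |
| timing/channel-logic | PktDrop | - | 1 |
| timing/channel-logic | PMTUD | - | 5.18 |
| amplify | DestIP | $\log_2(N)$ | - |

TABLE I: INBOUND AND OUTBOUND COVERT CHANNEL CAPACITIES FOR AN IPSEC VPN WITH N + 1 ENDPOINTS.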

We emphasize that some of these channels are implementation dependent, e.g., the treatment of ECN header flags or PMTUD at the VPN gateway, while others (IPD, PktSize, PktOrd) are generic problems faced by all packet-oriented channels. While we are confident to have identified all covert channels, we cannot account for all possible implementations and interpretations of IPsec. Hence in this paper we limit our considerations to the identified attack vectors.

III. COVERT CHANNEL-RESILIENT IPSEC

In this section we present the design of a high-performance covert channel-resilient IPsec, i.e., a system with low, known covert channel capacity and high throughput. We present novel or improved techniques for efficient covert channel mitigation in Section III-A. Section III-B considers the performance of different mitigation strategies, introducing on-demand performance trade-offs. Finally, we derive the remaining aggregated inbound and outbound covert channel capacities of the system in Section III-C.

A. Covert Channel Mitigation

In the following we present and improve efficient mitigations for each of the covert channels identified in Section II-B.

1) Packet Size (PktSize): The packet size characteristic is usually addressed by padding packets to maximum size or assuming them to be of constant size [6]. However, as the product throughput = pkt size · pkt rate is constant for a given link, enforcement of small packet sizes can reduce the load per packet significantly, allowing higher packet rates and more simultaneous connections.

It was previously proposed to allow multiple alternate packet sizes [18], but then the ratio between packets of different sizes creates another covert channel. Mode Security [19] was proposed to manage the switching between different enforcement modes and audit such a remaining covert channel. However, real network traffic is often mixed, i.e., packet streams using different packet sizes are often transmitted at the same time. Moreover, the enforcement of small packet sizes is problematic for IP protocols: With Path MTU Discovery (PMTUD), the connection endpoints quickly detect and adapt to the maximum allowed packet size of an IP route, but only slowly recover to a larger MTU using a conservative trial-and-error approach. This active adaption also makes it harder for the VPN gateways to estimate the actual demand for packets of larger size.

We address these problems by combining packet padding with transparent fragmentation and multiplexing, mechanisms that were previously only considered for traffic pattern obfuscation [20], [21]. Packet fragmentation within IPsec allows us to efficiently and transparently enforce various packet sizes at the gateway without influencing the channel’s Path MTU (PMTU). This is different from regular IP fragmentation before or after IPsec processing, which results in visible fragments either on the LAN or WAN sides that could again be used as covert channels. The fragmentation mechanism is complemented by packet multiplexing, which can be used to reduce packet padding overhead by concatenating multiple smaller packets up to the desired packet size. This also reduces the IPsec encapsulation overhead (ESP, IP).

When working with mixed traffic, the sender gateway first fragments large packets and then attempts to multiplex small packets or fragments into the padding area of previously processed packets that are still in the packet buffer. At the receiving gateway, packets are first de-multiplexed and then defragmented. As this mechanism works transparently for the LAN sender and receiver, the LAN gateways can precisely monitor the current demands of the adjacent LAN site to optimally adjust the enforced packet size.
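The fragment-then-multiplex step described above can be sketched as follows. This is a minimal illustration and not the kernel implementation: the 2-byte length prefix (`HEADER_LEN`), the first-fit placement, and the function names `fragment`, `pack`, and `unpack` are assumptions made for this example. Reassembly of fragments into the original LAN packets would additionally need fragment identifiers, which are omitted here, and packets are assumed to be non-empty.

```python
from typing import List

HEADER_LEN = 2  # hypothetical per-chunk length prefix in bytes (0 marks padding)


def fragment(packet: bytes, max_chunk: int) -> List[bytes]:
    """Split one LAN packet into chunks of at most max_chunk bytes."""
    return [packet[i:i + max_chunk] for i in range(0, len(packet), max_chunk)]


def pack(lan_packets: List[bytes], wan_size: int) -> List[bytes]:
    """Fragment large packets, then multiplex chunks into fixed-size payloads."""
    max_chunk = wan_size - HEADER_LEN
    payloads: List[bytearray] = []
    for pkt in lan_packets:
        for chunk in fragment(pkt, max_chunk):
            need = HEADER_LEN + len(chunk)
            # First fit: reuse the padding area of a payload still in the buffer.
            for candidate in payloads:
                if wan_size - len(candidate) >= need:
                    target = candidate
                    break
            else:
                target = bytearray()
                payloads.append(target)
            target += len(chunk).to_bytes(HEADER_LEN, "big") + chunk
    # Pad every payload with zero bytes up to the enforced size.
    return [bytes(p) + b"\x00" * (wan_size - len(p)) for p in payloads]


def unpack(payload: bytes) -> List[bytes]:
    """De-multiplex one fixed-size payload back into its chunks."""
    chunks, pos = [], 0
    while pos + HEADER_LEN <= len(payload):
        length = int.from_bytes(payload[pos:pos + HEADER_LEN], "big")
        if length == 0:  # reached the padding area
            break
        chunks.append(payload[pos + HEADER_LEN:pos + HEADER_LEN + length])
        pos += HEADER_LEN + length
    return chunks
```

In this sketch, three LAN packets of 10, 300, and 20 bytes fit into three 128-byte payloads, whereas padding each packet or fragment separately would need five.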

2) Inter-Packet Delay (IPD): The covert channel based on IPDs and its mitigation were subject of several previous works (e.g., [22], [23], [6], [24], [10]). In theory, it is easily eliminated by enforcing a fixed IPD at the VPN gateway, inserting dummy packets when no real packets are available [24]. However, due to the very high packet rates in modern networks, even short periods of non-optimal IPD (and thus packet rate) enforcement at the VPN gateway quickly result in packet loss due to buffer overflows or network congestion. This is often aggravated by the deployed congestion avoidance algorithms, which will quickly adapt to a lower enforced packet rate while only slowly making use of increased rates, thus making it difficult for the VPN gateway to estimate the optimal enforcement rate. The effect can be partly mitigated with larger packet buffers; however, this can also create high packet delays, degrading network responsiveness [25]. Further, the optimal enforced packet rate can be very large in modern networks, creating a high computational overhead for the time-synchronous packet processing. For example, to saturate a 100 Mbit/s link with 200 byte packets, an average IPD of $\frac{200\,\text{byte}}{100 \cdot 10^6\,\text{byte/s}} = 2\,\mu\text{s}$ should be enforced. Finally, one must consider inaccuracies in the timing enforcement that appear at high system loads [23], [26]: Since high activity on the LAN interface can influence the system load of the gateway, a LAN host may induce inaccuracies in the IPD enforcement of the gateway that can again be measured by the MITM, yielding $C^{IPD'}_{r} = 0.16$ bps [9].

We have implemented traffic reshaping inside the Linux kernel, using the kernel’s High-Precision Event Timer (HPET) infrastructure for packet scheduling with nanosecond resolution. This substantially reduces the overhead of context switching and buffering, allowing IPDs in the range of microseconds rather than several milliseconds (e.g., [12], [9]) and noticeably improves throughput and responsiveness. To maintain good system performance at even higher packet rates we use packet bursts, i.e., we translate very low IPDs into bursts of multiple packets at correspondingly larger delays. For optimal packet buffering we adjust the buffer size depending on the currently enforced IPD. This prevents long delays at low rates while allowing generous buffering at high rates.
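The burst translation and the IPD-dependent buffer size can be illustrated with a short sketch. The function names and the parameters `min_timer_us` (the shortest timer interval the gateway is willing to schedule) and `max_delay_us` (the acceptable queueing delay) are hypothetical and chosen for this example; the paper does not state the concrete values or rules used by the prototype.

```python
import math
from typing import Tuple


def burst_schedule(target_ipd_us: float, min_timer_us: float) -> Tuple[int, float]:
    """Return (packets per burst, delay between bursts in microseconds).

    The average packet rate is preserved: burst / delay == 1 / target_ipd_us.
    """
    if target_ipd_us >= min_timer_us:
        return 1, target_ipd_us  # the timer can enforce the IPD directly
    burst = math.ceil(min_timer_us / target_ipd_us)
    return burst, burst * target_ipd_us


def queue_length(max_delay_us: float, ipd_us: float) -> int:
    """Number of packets that may wait without exceeding max_delay_us."""
    return max(1, int(max_delay_us // ipd_us))
```

For instance, under these assumptions an enforced IPD of 2 µs with a 50 µs timer floor becomes a burst of 25 packets every 50 µs, and a 50 ms delay budget allows a queue of 25000 packets at that rate but only a single packet at an IPD of 100 ms.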

Additionally, to address the problem of timing inaccuracies, we use the high resolution of the HPET timers to monitor and actively compensate for timing inaccuracies in randomized IPD enforcement. Specifically, we exploit the fact that determining timing enforcement inaccuracies is harder for the remote MITM than on the local system. The adversary always requires significantly more measurements to first detect the variance of the random IPD enforcement and then the inaccuracy in the enforced variance [23], while the VPN gateway itself can directly compare the intended versus actual packet sending time. Hence, the gateway can approximate the current inaccuracy much faster. Given this knowledge of unintended change in IPD variance, we let the VPN gateway compensate for the enforcement inaccuracy by dynamically adapting the variance of the randomized IPD enforcement. This prevents the adversary from ever collecting sufficient measurement samples for a particular IPD enforcement variance, preventing it from estimating the enforcement inaccuracy and eliminating the timing channel ($C^{IPD}_{r} = 0$). However, further evaluation with specialized network diagnosis hardware is needed to confirm (the non-existence of) this effect.

3) Packet Order (PktOrd): Sequence numbers in protocol headers have been used before to create a covert or steganographic channel based on packet reordering [16], [27]. However, in contrast to previous works we can eliminate this channel in the VPN scenario using the IPsec anti-replay window and secure sequence numbers in ESP.

IPsec implementations already maintain a bitmap of the last $r$ seen and unseen sequence numbers so that replay attacks within the window size can be detected. To eliminate communication through packet re-ordering, we propose to implement this window as a packet buffer, where new packets are inserted sorted by their ESP sequence number. When an authenticated packet with a higher sequence number than the currently considered set of $r$ is observed, the window advances to that highest seen number and any packets falling out of the buffer are forwarded. As a result, all packets forwarded from the VPN gateway into the LAN are ordered, regardless of packet drops, and the covert channel is eliminated: $C^{PktOrd}_{r,in} = 0$.

Unfortunately, the approach is problematic for low packet rates, since the window may advance slowly and individual packets are not forwarded fast enough. We solve this by establishing a certain maximum IPD (e.g., 50 ms, which also assures network responsiveness at very low throughput modes), and by having a gateway send at least $r$ dummy packets before a connection can be stopped.
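The sorted anti-replay window can be sketched with a small priority queue. This is an illustration under assumptions, not the prototype's data structure: the class name `ReorderWindow`, the use of a binary heap, and the duplicate check via a set are choices made for this example, and packet authentication is assumed to have happened before `push` is called.

```python
import heapq
from typing import List, Set, Tuple


class ReorderWindow:
    """Buffers up to r sequence numbers and releases packets in ESP order."""

    def __init__(self, r: int) -> None:
        self.r = r
        self.highest = 0  # highest sequence number seen so far
        self.heap: List[Tuple[int, bytes]] = []
        self.buffered: Set[int] = set()

    def push(self, seq: int, packet: bytes) -> List[bytes]:
        """Insert an authenticated packet; return the packets to forward."""
        if seq <= self.highest - self.r or seq in self.buffered:
            return []  # left of the window or a replay: drop silently
        self.buffered.add(seq)
        heapq.heappush(self.heap, (seq, packet))
        self.highest = max(self.highest, seq)
        released = []
        # Forward everything that has fallen out of the advanced window.
        while self.heap and self.heap[0][0] <= self.highest - self.r:
            old_seq, old_packet = heapq.heappop(self.heap)
            self.buffered.discard(old_seq)
            released.append(old_packet)
        return released
```

Because a packet is only accepted if its sequence number lies above everything already released, the forwarded stream is ordered by construction, at the cost of holding each packet until the window has advanced past it.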

4) Packet Drops (PktDrop): In general, it appears impossible to eliminate covert channels based on packet dropping in the WAN. Mitigation with error correction codes is expensive and easily defeated by dropping even more packets. Instead, we propose to mitigate the channel by injecting noise, by increasing packet loss proportionally to the actual packet loss.

Specifically, let the gateways maintain a buffer $D$ of size $l$. At the sender gateway, packets are buffered in $D$ and their order is randomized before encapsulation. At the receiver gateway, the packets are again collected in $D$ and the number of dropped packets $i$ in a sequence of $l$ packets is determined based on their ESP sequence number. If $i > 0$, the gateway drops another $j$ packets from the current buffer, such that $i + j = 2^x$, for $1 < x \le \log_2(l)$, and forwards the remaining packets after randomizing their order once again. As a result, the MITM can choose the overall number of packets to be dropped for the receiving LAN client but cannot select which packets to drop, resulting in a symbol space of size $\log_2(l) + 1$ per $l$ packets. The remaining covert channel capacity is then $C^{PktDrop}_{r,in} = \frac{1}{l} \cdot \log_2(\log_2(l) + 1)$ bpp.
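The noise injection and the resulting capacity bound can be made concrete with a short sketch. The function names are hypothetical, and the lower bound on the exponent is exposed as the parameter `x_min`; its default of 1 is an assumption chosen for this example so that the number of distinguishable outcomes equals $\log_2(l) + 1$, and $l$ is assumed to be a power of two with $i \le l$.

```python
import math


def extra_drops(i: int, l: int, x_min: int = 1) -> int:
    """Number j of additional drops so that i + j is a power of two.

    i is the number of packets lost in the WAN within a buffer of l packets.
    """
    if i == 0:
        return 0  # no loss observed: forward the whole buffer
    x = max(x_min, (i - 1).bit_length())  # smallest x with 2**x >= i
    return min(2 ** x, l) - i


def remaining_capacity_bpp(l: int) -> float:
    """C_r,in for PktDrop: log2(log2(l) + 1) bits per l packets."""
    return math.log2(math.log2(l) + 1) / l
```

With a buffer of $l = 16$ packets, the receiver only ever observes 0, 2, 4, 8, or 16 missing packets, i.e., five symbols per 16 packets.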

As in the packet re-ordering mitigation in Section III-A3, the inbound packet buffer $D$ at the receiving gateway is problematic for very low packet rates and requires similar restrictions to assure a steady stream of packets.

Note that implementations could combine the inbound packet sorting and IPD enforcement with the packet dropping facility to mitigate the delays of using multiple queues.

5) Path MTU Discovery (PMTUD): To our knowledge, no previous work considered the possibility of covert channels based on PMTUD, in particular with respect to VPNs. Since PMTUD is critical for good network performance, we do not disable it but instead mitigate the channel by enforcing limits on the rate and values that are propagated by the VPN gateways into the LAN.

In particular, we limit the possible PMTU values by maintaining a list of common PMTU values and only propagate the respective next lower PMTU to the LAN. Such common PMTU values can be established on site or can be derived from previously proposed performance optimizations for PMTUD [28]. The rate limitation of PMTU propagation is problematic in general, as a lack of MTU adaption will lead to packet loss. However, in our case the current PMTU is always known to the trusted VPN gateways, which can then use the transparent fragmentation feature from PktSize enforcement to translate between LAN and WAN packet sizes. Considering the 10 most common PMTU values and an average interval of, e.g., 2 minutes [28] between propagation of PMTU changes, our measures reduce the covert channel rate to less than $C^{PMTUD}_{r,in} = 0.03$ bps.

6) Storage-based Channels (ECN, DS, Flags): The storage-based covert channels exploiting the Explicit Congestion Notification (ECN), Differentiated Services (DS) and IPv4 Flags handling of IP/IPsec are easily eliminated by resetting the respective fields of the outer IP header at encapsulation and ignoring them during decapsulation. Normalizing the IPv4 Flags field is unproblematic as en-route fragmentation is deprecated in IP. However, eliminating the ECN and DS covert channels disables these performance optimizations in the WAN.
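Returning to the PMTUD mitigation of Section III-A5, the value limiting and the quoted bound of 0.03 bps can be checked with a few lines. The list `COMMON_PMTUS` contains illustrative values only and is not taken from the paper or from [28]; the function names are likewise assumptions for this example.

```python
import math
from bisect import bisect_right
from typing import Sequence

# Hypothetical list of 10 allowed PMTU values (bytes), sorted ascending.
COMMON_PMTUS = [296, 508, 576, 1006, 1280, 1400, 1492, 1500, 4352, 9000]


def propagated_pmtu(wan_pmtu: int, allowed: Sequence[int] = COMMON_PMTUS) -> int:
    """Largest allowed value not exceeding the PMTU observed on the WAN."""
    idx = bisect_right(allowed, wan_pmtu)
    # Below the smallest entry the gateway would have to rely on its
    # transparent fragmentation to bridge the gap (assumption of this sketch).
    return allowed[max(idx - 1, 0)]


def pmtud_capacity_bps(num_values: int, interval_s: float) -> float:
    """Upper bound: one of num_values symbols per propagation interval."""
    return math.log2(num_values) / interval_s
```

With 10 values and one propagated change per 120 s, the bound evaluates to $\log_2(10) / 120 \approx 0.028$ bps, consistent with the stated limit of less than 0.03 bps.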

B. Mitigation Policies and Performance

In this section, we discuss different covert channel mitigation policies that can be enforced using the techniques described in Section III-A. We start by discussing the problems of previously proposed Fully Padded Channel and Mode Security approaches, and then propose a new system for on-demand, dynamic adaption of the enforced channel characteristics. We focus on the IPD and PktSize enforcement policies, as they have by far the highest performance impact.

1) Fully Padded Channel: When applied without any performance trade-offs, the mitigation mechanisms described in Section III-A result in a fully padded channel: The WAN packet stream is constantly padded to the maximum desired throughput rate and packet size. However, this mitigation policy has several disadvantages: (1) The system must compromise between high throughput and responsiveness, likely opting to enforce maximum packet sizes to reduce fragmentation overhead; (2) the maximum (desired) network load is constantly enforced in both directions, reducing overall performance due to network congestion; (3) TCP/IP congestion avoidance algorithms do not work, since any rate throttling is compensated by additional channel padding. In case of temporary reductions in WAN capacity, this leads to repeated packet loss and throttling, until the network is not usable anymore. Hence, the fully padded channel seems unfit for practical use, except in private/dedicated networks.

2) Mode Security: Mode Security is a generic scheme for trading covert channel-resilience against system performance. This is done by organizing system operation in a set of alternative operation modes that can be switched at a certain rate [19], [6]. The current operation mode should be selected such that performance penalty and/or overhead produced by the covert channel mitigation is minimized. Since the enforced operation mode will usually depend on the actually required usage, this adaption itself may be exploited as a covert channel. In this case, the covert channel capacity can be given as $C^{ModeSec}_{out} = R \cdot \log_2(M)$, where $M$ is the number of operation modes and $R$ is the maximum rate at which the operation mode can be changed (transition rate).

Unfortunately, existing evaluations are based on simulations and assume that the optimal operation mode can be determined at any time [6]. However, network traffic is often interactive and not easily predictable, and the congestion avoidance algorithms in Internet protocols react quickly to any experienced throughput limit. Hence, the enforcement of “optimal” operation modes that strictly enforce the currently experienced network load and adapt in fixed intervals $R$ results in poor performance in most practical scenarios.

This was confirmed by our own implementation. New channels quickly turned into congestion avoidance mode as their additional packets quickly surpass the enforced “optimal” packet rate. Once in congestion avoidance mode, the experienced throughput limit is tested very conservatively, making it difficult for our enforcement gateways to detect or even quantify a higher demand for the next interval $R$.

Our On-Demand Mode Security scheme described in the following section extends and generalizes Mode Security to address these issues, and our evaluation shows that the approach is feasible.

3) On-Demand Mode Security Management: An algorithm for on-demand adaptation of traffic enforcement must accommodate multiple conflicting constraints. It must quickly react to changes in channel usage to elude congestion avoidance algorithms, yet the average number of mode changes should be small to reduce information leakage. The mode enforcement should accommodate short-time packet bursts, yet it must react quickly when the average packet rate is exceeded, since large packet queues degrade network responsiveness. We address these conflicts using the following regulation mechanisms:

a) Token Bucket Filter: We generalize the transition rate $R$ of the Mode Security paradigm to a token bucket filter [29]. Tokens are generated at a fixed rate $R$ and each mode transition consumes a token from the token bucket. This allows us to “save up” unused mode transitions in the form of tokens and consume them on demand, at temporarily higher rates than $R$. The amount of cached tokens is limited by the token bucket size, and the average transition rate is bounded by the rate $R$ at which new tokens are generated. Thus, the token bucket filter allows us to immediately react to changes in network usage, before connection throttling kicks in or network delays become noticeable. Further, the token bucket status may influence and optimize decisions on the operation mode to be enforced.
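A token bucket for mode transitions can be sketched in a few lines. The class and method names are hypothetical, time is passed in explicitly to keep the example deterministic, and starting with an empty bucket is an assumption of this sketch.

```python
class TransitionBucket:
    """Token bucket limiting mode transitions to an average rate R."""

    def __init__(self, rate: float, size: int, now: float = 0.0) -> None:
        self.rate = rate      # tokens generated per second (R)
        self.size = size      # maximum number of cached tokens
        self.tokens = 0.0     # assumption: the bucket starts empty
        self.last = now

    def _refill(self, now: float) -> None:
        elapsed = now - self.last
        self.tokens = min(self.size, self.tokens + elapsed * self.rate)
        self.last = now

    def available(self, now: float) -> int:
        """Whole tokens currently available (t_num in Algorithm 1)."""
        self._refill(now)
        return int(self.tokens)

    def try_transition(self, now: float) -> bool:
        """Consume one token if possible; the mode may change only on True."""
        self._refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With `rate=0.5` and `size=3`, a gateway that has been idle for ten seconds can perform three transitions back to back and then has to wait two seconds for the next one, which illustrates the "save up and spend on demand" behavior while the long-term average stays at $R$.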

b) Aggressive Increase: Network throughput is scaled mainly based on its packet rate $r$, with typically exponential rate increase until the first network bottleneck is detected. But while the current optimal WAN packet rate $r_{opt}$ is easily derived from the currently observed LAN rate $r_{LAN}$, fragmented and multiplexed packets ($r_{frag}$, $r_{mplex}$), an accurate prediction of the next required packet rate $r_{new}$ is more involved. Algorithm 1 shows a simplified version of our implementation, with constants derived from empirical evaluations.

Algorithm 1: Pseudo-code for packet rate adjustment.

 1  while true do
 2      (r_LAN, r_frag, r_mplex) ← get-stats()
 3      r_opt ← r_LAN + r_frag − r_mplex
 4      r_avg ← 0.3 · r_opt + 0.7 · r_avg
 5      case r_opt > 0.9 · r_now
 6          r_amp ← (r_max − r_opt) / t_num
 7          r_new ← r_opt + r_amp
 8      end
 9      case r_now > 1.25 · r_avg ∧ t_num > t_dec
10          r_new ← r_avg
11      end
12      r_now ← quantize(r_new, r_q)
13      sleep(ival)
14  end

To adequately consider exponential rate increases without incurring high average transition rates $R$, we constantly overestimate the current optimal packet rate $r_{opt}$ and increase $r_{new}$ as soon as $r_{opt} > 0.9 \cdot r_{now}$ (Line 5 in Algorithm 1). Additionally, we amplify any rate increase based on the distance to the maximum available WAN rate and the remaining available rate transition tokens as $r_{amp} \leftarrow (r_{max} - r_{opt}) / t_{num}$ (Line 6f).

Combined with moderate buffering and short monitoring intervals $ival \approx 200$ ms, this approach successfully eludes congestion avoidance algorithms and prevents undesired throughput throttling even if multiple new channels are started at once. At the same time, the amplified rate increase ensures that the system does not exhaust its transition tokens during iterative rate increase phases without reaching $r_{max}$.

c) Conservative Slowdown: When putting the WAN channel into a state of decreased performance, it is important to reserve sufficient transition tokens such that a subsequent rate increase can be adequately handled. In contrast to the aggressive rate increase policy, any reduction in the enforced traffic rate is therefore delayed until a certain amount of tokens $t_{min}$ has been collected in the token bucket. Moreover, to reduce the impact of short-term fluctuations on rate adaption and to avoid an immediate subsequent rate increase phase, rate reduction is only performed based on the longer-time average traffic rate $r_{avg}$, overestimated by a factor of 1.25 (Line 9f).
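One iteration of Algorithm 1 can be written as a small runnable sketch. Several details are assumptions made for this example and are not confirmed by the paper: the quantization rounds up to the next multiple of $r_q$ and is capped at $r_{max}$, the division by $t_{num}$ is guarded against an empty token bucket, the increase case takes precedence over the slowdown case, and token consumption and the `sleep(ival)` loop are left to the caller.

```python
from dataclasses import dataclass


@dataclass
class RateController:
    r_max: float         # maximum WAN packet rate
    m_r: int             # number of enforceable WAN packet rates (M_r)
    t_dec: int           # tokens required before a slowdown is allowed
    r_now: float = 0.0   # currently enforced rate
    r_new: float = 0.0   # most recent target rate
    r_avg: float = 0.0   # longer-time average of the optimal rate

    def quantize(self, rate: float) -> float:
        """Round up to the next multiple of r_q, capped at r_max (assumption)."""
        r_q = self.r_max / (self.m_r - 1)
        return min(self.r_max, (rate // r_q + 1) * r_q)

    def step(self, r_lan: float, r_frag: float, r_mplex: float, t_num: int) -> float:
        """Run Lines 2-12 of Algorithm 1 once and return the enforced rate."""
        r_opt = r_lan + r_frag - r_mplex
        self.r_avg = 0.3 * r_opt + 0.7 * self.r_avg
        if r_opt > 0.9 * self.r_now:
            # Aggressive increase, amplified by the distance to r_max.
            r_amp = (self.r_max - r_opt) / max(t_num, 1)
            self.r_new = r_opt + r_amp
        elif self.r_now > 1.25 * self.r_avg and t_num > self.t_dec:
            # Conservative slowdown towards the longer-time average.
            self.r_new = self.r_avg
        self.r_now = self.quantize(self.r_new)
        return self.r_now
```

In this sketch, a controller with `r_max=1000`, `m_r=11`, and `t_dec=5` jumps from idle to 400 packets/s when 300 packets/s of demand appear with 10 tokens available, and only falls back once demand has dropped and more than `t_dec` tokens are in the bucket.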

Overall, the described algorithm saves up tokens in the slowdown phase while aggressively spending them in the increase phase, creating an equilibrium around $t_{min}$ and $r_{avg}$. The overestimation of the required throughput gives room to conservative rate increase algorithms. Finally, $r_{new}$ is quantized in Line 12 to realize the limited number $M_r$ of enforced WAN packet rates as $r_{now} \leftarrow \left\lfloor \frac{r_{new}}{r_q} + 1 \right\rfloor \cdot r_q$, with $r_q = \frac{r_{max}}{M_r - 1}$.

d) Dynamic Queue Size with RED: When dynamically adjusting the overall throughput of the WAN channel, we must also adjust the size of the packet queue accordingly. A large queue will incur large delays and timeouts at very low rates, while a small queue is not effective at supporting a channel with very high packet rates. Hence, we dynamically adapt the queue size based on the desired maximum buffering delay and the currently enforced packet rate. Eventually, the WAN channel or its enforcement policy may also reach a point where further rate increases are not possible. In this case, the endpoints should be notified of the increasing congestion quickly, to avoid dropping several packets at once due to full buffers. We realize this by employing Random Early Detection (RED) [30] as the packet queue’s dropping policy, so that endpoints are notified of increasing congestion early on.
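The rate-dependent queue size and the RED dropping policy can be illustrated as follows. The function names, the minimum queue length, and the default `max_p` are assumptions for this sketch; the drop probability follows the basic linear ramp of RED between a lower and an upper threshold on the averaged queue length and omits refinements such as the count-based correction.

```python
def queue_limit(rate_pps: float, max_delay_s: float, min_len: int = 2) -> int:
    """Queue length (packets) that bounds buffering delay at the enforced rate."""
    return max(min_len, int(rate_pps * max_delay_s))


def red_drop_probability(avg_len: float, min_th: float, max_th: float,
                         max_p: float = 0.1) -> float:
    """Basic RED: no drops below min_th, linear ramp up to max_p, then drop all."""
    if avg_len < min_th:
        return 0.0
    if avg_len >= max_th:
        return 1.0
    return max_p * (avg_len - min_th) / (max_th - min_th)
```

For example, at 10000 packets/s and a 50 ms delay budget the queue holds 500 packets; deriving `min_th` and `max_th` as fractions of that limit each time the enforced rate changes would keep the early-drop behavior proportional to the current mode.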

We implemented several variations of this approach and evaluated the effect of different parameters on the short-term and long-term usage adaption. The achieved performance and adaptation behavior are presented in Section IV-C.

C. Remaining Covert Channel Capacity

In the following we summarize the identified covert channels and derive the aggregated remaining covert channel capacity of our covert channel-resilient VPN.

Unfortunately, it is not possible to give all the covert channel rates in a closed form and with comparable units. Several covert channels also depend on additional parameters like network PMTU or minimum WAN packet rate. To provide a reasonable overview of the overall effectiveness of the covert channel mitigation, we have used the capacity estimations derived in the examples of Section III-A, assuming a state-of-the-art IPsec VPN configuration (cf. Section II).

Table II lists the individual covert channel capacities for the unmitigated ($C_m$) and mitigated ($C_r$) case. Considering that today's networks easily transmit several thousand packets per second, i.e., 1 bpp ≫ 1 bps, our system results in significant improvements over standard IPsec. In fact, all outbound covert channels are completely eliminated, except for the DestIP channel. However, as explained in Section II-B, the DestIP characteristic does not by itself constitute a covert channel but can only be used to amplify other channels. Hence, the overall remaining covert channel capacity is given by $C_{r,out} = C^{ModeSec}_{out} \cdot C^{DestIP}_{out}$.

Type       Max. Capacity C_m in bpp     Rem. Capacity C_r in bps
           Outbound     Inbound         Outbound     Inbound
ECN        2            1               0            0
DS         6            6               0            0
Flags      1            -               0            -
PktSize    8.4          -               0            -
IPD        ≥ 1          ≥ 1             0            0
PktOrd     -            > 6.58          -            0
PktDrop    -            1               -            ≤ 5
PMTUD      -            5.18            -            0.03
DestIP     log2(N)      -               log2(N)      -
Overall    > 18.4       > 20.76         0            ≤ 5.03

TABLE II
MAXIMUM AND REMAINING COVERT CHANNEL CAPACITIES FOR N + 1 ENDPOINTS. NOTE THAT C_r IS IN BPS, AND TYPICALLY 1 BPP ≫ 1 BPS.

For the less critical inbound covert channels (e.g., control channels for stealth malware), only the channels based on PMTUD and PktDrop remain. The PktDrop covert channel has the highest impact with $C^{PktDrop}_{r,in} \le 5$ bps and is easy to exploit. Since the PMTUD channel could be exploited at the same time, their capacities must be added up: $C_{r,in} = C^{PktDrop}_{r,in} + C^{PMTUD}_{r,in} \le 5 + 0.03 = 5.03$ bps.

IV. PRACTICAL COVERT CHANNEL MITIGATION

In this section we describe the instantiation of our system based on the Linux IPsec stack and analyse the achieved network performance and behavior.

In our prototype implementation and evaluation we only consider the mitigation of outbound covert channels, since information leakage from the protected to the unprotected domain is usually considered more critical (e.g., consider Bell-LaPadula [31]). Moreover, from our discussions in Section III it is clear that outbound covert channel mitigation is more efficient, as it requires less buffering and processing but is more effective in reducing the covert channel capacity. We have validated the mitigation of the respective channels by looking at the resulting traffic dumps and using the tools we have developed previously in the context of [10].

A. Architecture and Implementation Details

We have implemented our design as an extension to the IPsec stack of the Linux kernel, called High-Performance Covert Channel Mitigation (HPCM). The implementation is based on the Traffic Flow Confidentiality (TFC) project, a system for probabilistic traffic flow obfuscation and re-routing in IPsec [20]. We revised and extended TFC to support High-Precision Event Timers (HPETs), fragmentation, multiplexing, dummy packet generation that is indistinguishable from real traffic payloads, elimination of storage-based covert channels in the encapsulation headers and, most importantly, an interface for monitoring packet processing statistics and flexible configuration of the traffic pattern enforcement via userspace. The resulting architecture is illustrated in Figure 2. In kernelspace, the HPCM Engine processes packets as part of the IPsec subsystem, rewriting problematic header fields and enforcing the currently desired size and IPD constraints as described in Section III-A. In userspace, the HPCM Manager collects processing statistics from the enforcement engine and combines them with the observed inbound LAN traffic to determine the optimal enforcement parameters, as presented in Section III-B3. As such, all performance-critical traffic enforcement is performed in kernelspace, while the more volatile security policy management and configuration is flexibly performed in userspace.

[Figure 2: block diagram. Kernelspace: HPCM Engine inside Linux IPsec (ESP Tunnel) with LAN Input, Header Reset, Fragmentation, Multiplexing, Padding, packet queue, dummy queue, HPET Timer (send-event) and WAN Output. Userspace: HPCM Manager with get_statistics(), compute_settings() and update_settings().]

Fig. 2. Architecture of our Linux prototype.

e) Packet processing: Packet resizing is done on a best-effort basis to minimize processing delays. Packets that are too large are iteratively fragmented until the last remaining fragment is smaller than or equal to the remaining packet size.

Packet multiplexing is performed based on two configurable thresholds, the flagging and the multiplexing threshold. Packets smaller than the flagging threshold are deemed to contain relatively large amounts of padding and are flagged as candidates for multiplexing before they are put into the outbound packet queue. Packets smaller than the multiplexing threshold are first considered for multiplexing, by searching the current packet queue for previously flagged packets with sufficient padding space and merging the current packet into such a previously processed packet. Only if no suitable candidate can be found in the outbound queue do packets smaller than the multiplexing threshold follow the regular size padding path.

This approach ensures that packet multiplexing has no adverse effect on the packet processing delays and avoids keeping extra state and timeouts for multiplexing candidates. In particular, the approach avoids cases where candidate packets for multiplexing are held up in a separate queue while the packet queue is empty, which would result in sending of dummy packets and negate any performance gained through packet multiplexing.

Since the resulting multiplexed packets typically contain a recent as well as an older packet that was temporarily exempted from the (concurrently running) packet sending process, they are inserted at the start of the packet queue to facilitate their fast sending.
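The two-threshold multiplexing logic can be sketched in a few lines of Python. This is a simplified userspace model, not the kernel implementation: the class and function names, the per-payload header length of 8 bytes and the omission of fragmentation are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)   # identity comparison, so deque.remove() hits this exact object
class QueuedPacket:
    used: int                                  # bytes taken by payloads plus headers
    flagged: bool = False                      # marked as multiplexing candidate
    parts: list = field(default_factory=list)  # sizes of the merged inner packets


def enqueue(queue, pkt_len, target_size, flag_thr, mux_thr, hdr_len=8):
    """Queue one packet; merge it into a flagged queued packet if possible."""
    needed = pkt_len + hdr_len
    if pkt_len < mux_thr:
        for cand in queue:
            if cand.flagged and target_size - cand.used >= needed:
                queue.remove(cand)
                cand.used += needed
                cand.parts.append(pkt_len)
                queue.appendleft(cand)         # multiplexed packets go to the front
                return cand
    pkt = QueuedPacket(used=needed, flagged=pkt_len < flag_thr, parts=[pkt_len])
    queue.append(pkt)                          # regular path: padded when sent
    return pkt


q = deque()
first = enqueue(q, 100, target_size=1000, flag_thr=500, mux_thr=300)
enqueue(q, 900, target_size=1000, flag_thr=500, mux_thr=300)
merged = enqueue(q, 200, target_size=1000, flag_thr=500, mux_thr=300)
print(merged is first, merged.parts, len(q))
```

Because candidates stay in the regular outbound queue, no separate candidate queue or timeout is needed, which mirrors the design rationale given above.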

We use the Linux sysfs filesystem to configure the kernelspace HPCM Engine and report realtime processing statistics back to userspace. In particular, the HPCM Engine exports counters for fragmented, padded and multiplexed packets as well as the amount of sent real and dummy packets. The counters can be periodically reset from userspace to yield average processing rates.

[Figure 3: packet layout IP | ESP | TFC | Payload | ESP-Trailer. TFC header over bits 0 to 31: Next Hdr, Reserved, F (Fragmented), M (Multiplexed), Length; Security Parameter Index (SPI); fragmentation extension: Fragment ID, M (More Fragments), Offset.]

Fig. 3. TFC encapsulation protocol.
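How periodically reset counters yield average rates can be sketched as follows. The counter names and the dictionary representation are hypothetical; the actual sysfs attribute names of the prototype are not given in the text and are deliberately not modelled.

```python
def average_rates(counters, interval_s):
    """Turn counters accumulated since the last reset into per-second rates."""
    return {name: count / interval_s for name, count in counters.items()}


def dummy_fraction(counters):
    """Share of dummy packets among all sent packets (0.0 if nothing was sent)."""
    sent = counters["real"] + counters["dummy"]
    return counters["dummy"] / sent if sent else 0.0


# Hypothetical snapshot taken 2 seconds after the last counter reset.
snapshot = {"real": 900, "dummy": 100}
print(average_rates(snapshot, 2.0))
print(dummy_fraction(snapshot))
```

A manager process could feed such rates into its mode selection; the dummy packet share is one plausible indicator of padding overhead, chosen here only as an example.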

f) Protocol Format: For flexible packet padding and rerouting, we deploy our own encapsulation protocol based on TFC [20] as shown in Figure 3. While the length field is sufficient to recognize and remove padding, we require two additional flags to mark packets as using fragmentation or multiplexing. TFC payloads are flagged as multiplexed if they are followed by another payload, and TFC payloads containing fragments are flagged as fragmented. Fragment payloads are accompanied by a 4 byte fragmentation extension header compatible with IPv4, which allows us to reuse the existing IP defragmentation support in the Linux kernel.
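To make the header format concrete, the following Python sketch packs and parses an 8 byte TFC header. The exact field widths and flag bit positions are assumptions read off the bit ruler of Figure 3 (8 bit next header, 8 bits holding the reserved bits and the two flags, 16 bit length, 32 bit SPI); the function names are hypothetical and the fragmentation extension header is omitted.

```python
import struct

# Assumed layout: next header (8 bit), reserved + flags (8 bit),
# length (16 bit), SPI (32 bit), all in network byte order.
TFC_HDR = struct.Struct("!BBHI")
FLAG_F = 0x02   # fragmented (assumed bit position)
FLAG_M = 0x01   # multiplexed (assumed bit position)


def pack_tfc(next_hdr, length, spi, fragmented=False, multiplexed=False):
    """Build the fixed TFC header for one payload."""
    flags = (FLAG_F if fragmented else 0) | (FLAG_M if multiplexed else 0)
    return TFC_HDR.pack(next_hdr, flags, length, spi)


def unpack_tfc(data):
    """Parse the fixed TFC header at the start of data."""
    next_hdr, flags, length, spi = TFC_HDR.unpack_from(data)
    return {
        "next_hdr": next_hdr,
        "fragmented": bool(flags & FLAG_F),
        "multiplexed": bool(flags & FLAG_M),
        "length": length,
        "spi": spi,
    }


hdr = pack_tfc(4, 1400, 0x1234, fragmented=True)
print(len(hdr), unpack_tfc(hdr))
```

Under these assumptions the header occupies 8 bytes, and the optional 4 byte fragmentation extension would bring it to 12 bytes.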

The resulting protocol overhead is relatively large, especially due to the additional Security Parameter Index (SPI) field which is used in IPsec to associate the stateless packet stream with some previously negotiated state at the VPN gateways, such as the encryption and authentication algorithms to be employed. However, in case of outbound traffic normalization such stateful processing at the receiver is not actually required. A more optimized implementation could integrate the encapsulation protocol into Encapsulating Security Payload (ESP), requiring only the two flags for marking fragmented and/or multiplexed packets and the optional 4 byte fragmentation extension header.²

B. Testbed and Raw Performance

In this section we describe the performance achieved by our prototype in terms of network throughput, transaction rate (i.e., roundtrip time) and protocol overhead. Our testbed corresponds to the VPN scenario in Figure 1, except that we use only two LAN sites with one physical host per LAN. The Man-in-the-Middle (MITM) is implemented as an Ethernet bridge between the two VPN gateways, allowing reliable observation of all transmitted packets. For our evaluation, the MITM is completely passive and only used to provide independent performance measurements of the WAN. All hosts are 3.2 GHz Intel Core i5-650 machines, equipped with two Intel PCIe GBit network cards and 4 GB system memory. All network links are established at full-duplex GBit/s speed.

We have used the Netperf³ benchmarks TCP_STREAM and TCP_RR to measure the maximum TCP throughput and transaction rate between the LAN sites. By comparing LAN and WAN throughput, we can determine the protocol overhead of the covert channel mitigation, including dummy packets and packet padding.

² Note that the ESP specification already discusses facilities for traffic normalization but only supports basic packet padding and considers fully padded channels as too costly.

³ http://www.netperf.org

Benchmarks                   No Padding            Fully Padded     On-Demand
                             IP     ESP    TFC     1422    800      R = 10^-1 s
LAN Throughput (Mbit/s)      570    201    175     58      75       169
TCP Transaction Rate (Hz)    1756   1462   1364    611     740      532
LAN/WAN Overhead (%)         0      10     13      (73)    (61)     ≈ 24
Relative Throughput (%)      283    100    87      28      37       84

TABLE III
THROUGHPUT AND TRANSACTION RATE FOR REGULAR AND MODIFIED IPSEC VPN.

We list the overall performance results in Table III. The first two columns show the testbed performance for raw IP (plaintext) transmission and IPsec ESP tunneling. With 570 Mbit/s, the raw transmission does not reach the expected GBit throughput, likely due to deficient hardware or drivers. As the LAN hosts and the MITM measure the same IP payloads, there is no LAN/WAN overhead. With 201 Mbit/s, the throughput of a standard IPsec ESP tunnel is already notably lower, due to 10% protocol overhead but mainly due to computational constraints of the VPN gateways. As our covert channel mitigation is an extension of this ESP tunnel configuration, we normalize its relative throughput to 100%.

For reference and confirmation of the expected implementation overhead of our prototype, we next evaluated the raw performance of our HPCM Engine compared to the standard IPsec ESP tunnel. The third column “TFC” of Table III lists the achieved network performance when tunneling TFC inside ESP with all covert channel mitigation techniques disabled. The overall LAN/WAN overhead of 13% (or 3% when compared with the ESP tunnel) is the result of the 8 to 12 byte TFC protocol encapsulation plus some computational overhead.
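The relative throughput row of Table III follows directly from the measured LAN throughput, normalized to the ESP tunnel. The short Python sketch below recomputes it; the dictionary keys are shorthand labels for the table columns, and truncating to whole percent is an assumption that happens to match the reported values.

```python
# Measured LAN throughput in Mbit/s, taken from Table III.
THROUGHPUT_MBIT = {
    "IP": 570, "ESP": 201, "TFC": 175,
    "1422": 58, "800": 75, "On-Demand": 169,
}


def relative_throughput(measured, baseline="ESP"):
    """Throughput of each configuration in percent of the baseline (truncated)."""
    base = measured[baseline]
    return {name: 100 * value // base for name, value in measured.items()}


print(relative_throughput(THROUGHPUT_MBIT))
```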

C. Covert Channel Mitigation Performance

We now describe the behavior and performance of different mitigation policies. The fourth and fifth column of Table III show the performance of a “fully padded channel”, enforcing packet sizes of 1422 and 800 bytes at the maximum possible packet rate. For this purpose, we first measured the maximum bi-directional throughput of the VPN channel (201 Mbit/s per direction) and then selected the desired packet rate (inverse IPD) such that the bidirectional channel capacity is almost⁴ saturated. We then again measured the maximum (uni-directional) throughput and roundtrip time. As shown in Table III, the fully padded channel configuration achieves rather poor performance in both configurations, reaching only 28% and 37% of the ESP tunnel throughput, respectively. Observe that the enforcement of 800 byte packet size achieves higher transaction rate as well as higher throughput. We believe this is due to the overhead of padding TCP acknowledgements to maximum packet size.

⁴ As explained in Section III-B1, it is critical that the link is not fully saturated since congestion leads to packet loss and congestion avoidance does not work.

[Figure 4: line plot of throughput (Mbit/s, 0 to 250) over elapsed time (seconds, 0 to 120); series: WAN at 5^-1 s, 10^-1 s, 15^-1 s and 20^-1 s.]

Fig. 4. WAN adaption to repeated TCP load (three consecutive loads shown).

[Figure 5: line plot of request time (seconds, 0 to 1.4) over elapsed time (seconds, 0 to 600); series: LAN at 5^-1 s, 10^-1 s, 15^-1 s and 20^-1 s.]

Fig. 5. HTTP request delay in mixed traffic.

We have also implemented and tested an instantiation of our on-demand mode security management scheme presented in Section III-B3. As shown in the last column of Table III, the employed mode adaption heuristics reach almost the same throughput as the raw TFC encapsulation without time/size padding (169 Mbit/s vs. 175 Mbit/s), which is the theoretical maximum for our testbed. The LAN/WAN overhead is slightly higher (24% vs. 13%) and the transaction rate rather low. However, we achieve good maximum throughput despite the higher average overhead (84% vs. 87%). This is because, as shown in Figure 4, the WAN channel experiences overhead mainly in the rate increase and especially rate decrease phases, i.e., when the connection is not actually used much. This overhead is the direct result of our design decision in Section III-B3 to reduce the frequency of mode adaptations R in areas that do not primarily impact user experience. As confirmed by Figure 4, further reductions of the token regeneration rate R ≤ 15^-1 s mainly increase overhead in between TCP loads, without affecting the performance at higher loads or depleting available transition tokens.

[Figure 6: line plot of throughput (Mbit/s, 0 to 250) over elapsed time (seconds, 0 to 600); series: WAN at 5^-1 s, 10^-1 s, 15^-1 s and 20^-1 s.]

Fig. 6. WAN adaption to pseudo-random web traffic and downloads.

Finally, we have investigated the ability of our on-demand mode security management to adapt to random, highly heterogeneous traffic patterns one would expect from a VPN with many users. We used Tsung, a traffic load testing tool⁵, to record several HTTP sessions in our network, partly also including larger (≈ 60 MB) HTTP downloads. We then configured one of our testbed LANs to act as Internet gateway for the other LAN and used Tsung to replay the recorded HTTP sessions in a pseudo-random fashion with 60 to 80 simultaneous users. Figure 6 shows how the WAN traffic enforcement for four different token regeneration rates R dynamically adapts to the LAN usage (grey filled). For R ≤ 15^-1 s, only the larger peaks in LAN usage influence the WAN traffic enforcement, reducing information leakage at the cost of padding overhead. As shown in Figure 5, the mean duration of responding to individual HTTP requests is kept within reasonable limits. However, in contrast to unpadded traffic (grey filled) the accumulated request delays become noticeable to the user.

In the presented configuration, our mode adaption algorithm switches packet sizes in steps of 100 bytes and packet rates in steps of 1000 packets per second. Considering the maximum WAN packet rate of about 250,000 packets/s, we can derive $C^{ModeSec}_{r,out} = R \cdot \log_2\left(\frac{1500}{100} \cdot \frac{250000}{1000}\right) = R \cdot 11.87$ and an overall outbound covert channel capacity of, e.g., $C_{r,out} = 0.6$ bps for $R = 20^{-1}$ s and $N = 1$.
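The arithmetic behind this bound can be checked with a short Python sketch. The function and parameter names are illustrative; the default values are the step sizes and maxima stated in the text, and R is expressed as mode changes per second.

```python
import math


def mode_sec_capacity(r_per_s, max_size=1500, size_step=100,
                      max_rate=250_000, rate_step=1000):
    """Bits per second leaked by choosing one of the available modes R times per second."""
    modes = (max_size / size_step) * (max_rate / rate_step)  # 15 sizes x 250 rates
    return r_per_s * math.log2(modes)


print(round(mode_sec_capacity(1.0), 2))     # bits per mode change
print(round(mode_sec_capacity(1 / 20), 1))  # bps for one change every 20 seconds
```

Coarser steps or a lower R shrink the bound further, at the cost of additional padding overhead.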

V. RELATED WORK

Several works consider the problem of covert channels and covert channel mitigation in Internet protocols [32], [33], yet we know of no works that specifically discuss the problem of covert channels in IPsec. The covert channels we identify in IPsec are generally known, but we found no previous discussion of the PMTUD channel. Additionally, the PktSize [18], PktSort [11], [27] and DestIP [18] characteristics have different impact in IPsec, and the discussion of storage-based covert channels in the IPsec specification [14] proved to be inaccurate.

Although the IPD-based covert channel is generally well-known [22], [10], [18], [34], [6], the problem of inaccuracies in timing enforcement during increased system load remained unsolved [9], [23]. We consider this complication in our design in Section III-A2 and present a compensation mechanism that detects and compensates unintended timing inaccuracies. Also, while most works simply assume that packets are of constant size [24] or padded to the maximum desired size [33], [18], our adoption of multiplexing and fragmentation enables flexible packet size enforcement. The combination of different mitigation techniques makes our implementation the first prototype for comprehensive covert channel mitigation.

⁵ http://tsung.erlang-projects.org

Regarding performance trade-offs, Mode Security was proposed as a general approach to adapt to resource usage by switching between different operation modes [19]. A similar approach called Traffic Stereotyping was proposed for networks [18]. To our knowledge, there is only one system that uses Mode Security to optimize covert channel mitigation, which aims to provide sender anonymity based on dynamic re-routing and IPD enforcement [24], [6]. It assumes a trusted network stack on each network endpoint and a periodic global negotiation to achieve an equalized traffic matrix [24]. A performance analysis was done based on statistics collected from a medium-sized network [12]; however, no actual performance measurements of that system have been provided and the problem of determining the optimal enforcement mode was left unsolved. Alternatively, NetCamo [34] requires its endpoints to explicitly request their delay and throughput demands beforehand. We extend these works by proposing a practical algorithm to determine the optimal operation mode on-demand. As we do not require sender-anonymity, problems of mix-networks do not apply to our approach (e.g., [35], [36]).

In contrast to probabilistic traffic obfuscation schemes such as HTTPOS [37] or Traffic Morphing [21], our framework enforces an information-theoretic boundary for the maximum information leakage. As argued in Section I, covert channel detection schemes such as [38], [39] are complementary to our work and should be used where mitigation is costly, e.g., for the PktSort and PktDrop characteristics.

While we know of no practical performance measurements for comprehensive covert channel elimination, an overhead of 45%-56% was reported solely for obfuscating the packet size in website traffic [7], [21].

VI. CONCLUSION AND FUTURE WORK

We have motivated the problem of covert channels in Virtual Private Networks (VPNs) and presented the design, implementation, and performance of a covert channel-resilient VPN. We identified several covert channels and presented new countermeasures. We have investigated the problem of on-demand adaption of operation modes and presented an implementation for comprehensive, efficient covert channel mitigation in the Linux IPsec stack. Our evaluation shows that on-demand rate adaption is feasible and practical even for highly unpredictable traffic. In more predictable throughput benchmarks, our system achieves a remarkable 169 Mbit/s in a 201 Mbit/s VPN connection (84%). Our prototype and tools are available online: https://code.google.com/p/ipsec-tfc/.

An interesting topic for future work is further improvement of our trade-off algorithms. Also, further investigation of our proposed mitigation of IPD enforcement inaccuracies would be worth pursuing.

VII. ACKNOWLEDGEMENTS

We thank Amir Herzberg, Haya Shulmann and Thomas Schneider for their insightful comments and review of earlier versions of this work.

REFERENCES

[1] Cohesive Flexible Technologies, “VPN-Cubed,” http://cohesiveft.com, Apr. 2012.

[2] L. Catuogno, A. Dmitrienko, K. Eriksson, D. Kuhlmann, G. Ramunno, A.-R. Sadeghi, S. Schulz, M. Schunter, M. Winandy, and J. Zhan, “Trusted Virtual Domains - design, implementation and lessons learned,” in International Conference on Trusted Systems (INTRUST), Springer 2009.

[3] J. Carapinha, P. Feil, P. Weissmann, S. Thorsteinsson, C. Etemoglu, O. Ingthorsson, S. Ciftci, and M. Melo, “Network Virtualization - Opportunities and Challenges for Operators,” in Future Internet Symposium (FIS’10), Springer 2010.

[4] B. W. Lampson, “A note on the confinement problem,” Communications of the ACM, 1973.

[5] A Guide to Understanding Covert Channel Analysis of Trusted Systems, National Computer Security Center, Nov. 1993.

[6] B. R. Venkatraman and R. E. Newman-Wolfe, “Capacity estimation and auditability of network covert channels,” in Research in Security and Privacy (S&P), IEEE 1995.

[7] M. Liberatore and B. N. Levine, “Inferring the source of encrypted HTTP connections,” in Conference on Computer and Communications Security (CCS), ACM 2006.

[8] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage, “Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds,” in Conference on Computer and Communications Security (CCS), ACM 2009.

[9] B. Graham, Y. Zhu, X. Fu, and R. Bettati, “Using covert channels to evaluate the effectiveness of flow confidentiality measures,” in Parallel and Distributed Systems (ICPADS), IEEE 2005.

[10] Y. Liu, D. Ghosal, F. Armknecht, A.-R. Sadeghi, S. Schulz, and S. Katzenbeisser, “Hide and seek in time - robust covert timing channels,” in European Symposium on Research in Computer Security (ESORICS), Springer 2009.

[11] S. Murdoch and S. Lewis, “Embedding covert channels into TCP/IP,” in Information Hiding (IH), Springer 2005.

[12] B. R. Venkatraman and R. E. Newman-Wolfe, “Performance analysis of a method for high level prevention of traffic analysis using measurements from a campus network,” in Computer Security Applications Conference (ACSAC), IEEE 1994.

[13] J. Millen, “20 years of covert channel modeling and analysis,” in Research in Security and Privacy (S&P), IEEE 1999.

[14] S. Kent and K. Seo, “Security Architecture for the Internet Protocol,” RFC 4301, Dec. 2005.

[15] K. Ahsan, “Covert channel analysis and data hiding in TCP/IP,” Master’s thesis, Department of Electrical and Computer Engineering, University of Toronto, 2002.

[16] D. Kundur and K. Ahsan, “Practical internet steganography: Data hiding in IP,” in Texas Workshop on Security of Information Systems, 2003.

[17] J. P. Degabriele and K. G. Paterson, “On the (in)security of IPsec in MAC-then-encrypt configurations,” in Conference on Computer and Communications Security (CCS), ACM 2010.

[18] C. G. Girling, “Covert channels in LAN’s,” IEEE Transactions on Software Engineering, Feb. 1987.

[19] R. Browne, “Mode security: An infrastructure for covert channel suppression,” in Research in Security and Privacy (S&P), IEEE 1994.

[20] C. Kiraly, S. Teofili, R. Lo Cigno, M. Nardelli, and E. Delzeri, “Traffic flow confidentiality in IPsec: Protocol and implementation,” in The Future of Identity in the Information Society, Springer 2008.

[21] C. V. Wright, S. E. Coull, and F. Monrose, “Traffic morphing: An efficient defense against statistical traffic analysis,” in Network and Distributed Systems Security (NDSS), Internet Society 2009.

[22] I. S. Moskowitz and A. R. Miller, “Simple timing channels,” in Research in Security and Privacy (S&P), IEEE 1994.

[23] X. Fu, “On traffic analysis attacks and countermeasures,” Ph.D. dissertation, Texas A&M University, Dec. 2005.

[24] B. R. Venkatraman and R. E. Newman-Wolfe, “Transmission schedules to prevent traffic analysis,” in Computer Security Applications Conference (ACSAC), IEEE 1994.

[25] J. Gettys, “Bufferbloat: Dark buffers in the Internet,” IEEE Internet Computing, May 2011.

[26] X. Fu, B. Graham, R. Bettati, and W. Zhao, “On effectiveness of link padding for statistical traffic analysis attacks,” in International Conference on Distributed Computing Systems (ICDCS), IEEE 2003.

[27] A. El-Atawy and E. Al-Shaer, “Building covert channels over the packet reordering phenomenon,” in International Conference on Computer Communications (INFOCOM), IEEE 2009.

[28] J. Mogul and S. Deering, “Path MTU discovery,” RFC 1191, Nov. 1990.

[29] W. Zhao, D. Olshefski, and H. Schulzrinne, “Internet quality of service: An overview,” Columbia University, Tech. Rep., 2000.

[30] B. Braden, D. Clark, J. Crowcroft, B. Davie, S. Deering, D. Estrin, S. Floyd, V. Jacobson, G. Minshall, C. Partridge, L. Peterson, K. Ramakrishnan, S. Shenker, J. Wroclawski, and L. Zhang, “Recommendations on Queue Management and Congestion Avoidance in the Internet,” RFC 2309, Apr. 1998.

[31] D. E. Bell, “Looking back on the Bell-LaPadula model,” in Computer Security Applications Conference (ACSAC), IEEE 2005.

[32] D. Llamas, C. Allison, and A. Miller, “Covert channels in internet protocols: A survey,” Mar. 2006.

[33] S. Zander, G. Armitage, and P. Branch, “A survey of covert channels and countermeasures in computer network protocols,” Comm. Surveys & Tutorials, 2007.

[34] Y. Guan, X. Fu, D. Xuan, P. U. Shenoy, R. Bettati, and W. Zhao, “NetCamo: Camouflaging network traffic for QoS-guaranteed mission critical applications,” Trans. on Systems, Man, and Cybernetics - Systems and Humans, Jul. 2001.

[35] V. Shmatikov and M.-H. Wang, “Timing Analysis in Low-Latency Mix Networks: Attacks and Defenses,” in European Symposium on Research in Computer Security (ESORICS), Springer 2006.

[36] T. Abraham and M. Wright, “Selective cross correlation in passive timing analysis attacks against Low-Latency mixes,” in Global Communications Conference (GLOBECOM), IEEE 2010.

[37] X. Luo, P. Zhou, E. W. W. Chan, W. Lee, R. K. C. Chang, and R. Perdisci, “HTTPOS: Sealing information leaks with browser-side obfuscation of encrypted flows,” in Network and Distributed Systems Security (NDSS), Internet Society 2011.

[38] V. Berk, A. Giani, and G. Cybenko, “Detection of covert channel encoding in network packet delays,” Dartmouth College, Tech. Rep. TR536, Aug. 2005.

[39] P. A. Gilbert and P. Bhattacharya, “An approach towards anomaly based detection and profiling covert TCP/IP channels,” in Information, Communications and Signal Processing (ICICS), IEEE 2009.

[40] N. Lucena, G. Lewandowski, and S. Chapin, “Covert channels in IPv6,” in Privacy Enhancing Technologies, Springer 2006.

[41] K. Ramakrishnan, S. Floyd, and D. Black, “The Addition of Explicit Congestion Notification (ECN) to IP,” RFC 3168, Sep. 2001.

[42] B. Briscoe, “Tunnelling of Explicit Congestion Notification,” RFC 6040, Nov. 2010.

[43] C. Perkins, “IP Encapsulation within IP,” RFC 2003, Oct. 1996.

[44] S. Kent, “IP Encapsulating Security Payload (ESP),” RFC 4303, Dec. 2005.

[45] S. D. Servetto and M. Vetterli, “Communication using phantoms: Covert channels in the internet,” in International Symposium on Information Theory, IEEE 2001.

[46] X. Fu, B. Graham, R. Bettati, and W. Zhao, “Active traffic analysis attacks and countermeasures,” International Conference on Computer Networks and Mobile Computing, 2003.


APPENDIX

In general, for a channel characteristic to be exploited as a covert channel, it must be measurable after transformation by the VPN gateway. An outbound covert channel requires that certain characteristics of packets remain measurable when they are encapsulated and encrypted when traversing from the LAN to the WAN side. Similarly, inbound covert channels require packet characteristics to survive the decapsulation from the WAN to the LAN side.

In this section we review the impact of covert channels on VPNs that apply to the setup described in Section II. Many of the channels are known; however, their impact is different in VPNs and other mitigations are possible.

Unfortunately, no systematic approach exists for identifying covert channels besides systematic exhaustive search [5]. The following results are based on such examination of the IPsec specification [14], which standardizes the process of encapsulation and decapsulation at the VPN boundary, and related work. We then examined recent implementations in Linux 2.6.36 to 2.6.38 and OpenBSD 4.7 to 4.8, identifying notable deviations from the specification that also impact covert channel capacities.

A. Storage Channels

Several storage-based covert channels have been identified in TCP/IP networks [32], [40], [33]. Storage-based covert channels are usually easy to eliminate, but create reliable and efficient channels when left undetected.

1) Differentiated Services (DS): The 6 bit DS field in the IP header is used to signal service requirements, such as realtime VoIP traffic. Hence it may leak information about the type of the transmitted data even if not actively exploited by an insider. The IPsec specification recognizes this problem; however, countermeasures are only described for inbound covert channels [14]. The proposed countermeasure is to discard the DS field of the outer IP header during decapsulation, unless explicitly configured to do otherwise. Indeed, the examined Linux implementation honors this recommendation, while OpenBSD always discards the outer header's DS field during decapsulation. Hence we get $C^{DS}_{m,out} = 6$ bpp in both cases (and likely also for most other IPsec implementations) while the optional decapsulation of the DS field opens another inbound channel of the same capacity $C^{DS}_{m,in} = 6$ bpp.

2) Explicit Congestion Notification (ECN): Explicit Congestion Notification (ECN) uses two bits of the IP header to let routers signal network congestion to the endpoints. As such, ECN between WAN routers and LAN endpoints may directly violate the VPN policy. Originally, a “limited functionality” mode was supplied to eliminate such covert channels in IPsec, which effectively precludes the WAN from ECN signaling [41]. However, the current revision of IPsec specifies that the ECN field should be copied during encapsulation, and one bit is copied back to the inner header on decapsulation, if ECN is enabled on the inner IP header [14]. As such, the specification enables covert channels with a capacity of $C^{ECN}_{m,in} = 1$ bpp and $C^{ECN}_{m,out} = 2$ bpp. The ECN treatment was refined and unified in [42]. After discussing the trade-off between covert channels and ECN, an updated scheme is defined with $C^{ECN}_{m,out} = 2$ bpp and $C^{ECN}_{m,in} = 1.5$ bpp. Furthermore, a legacy-compatible mode with $C^{ECN}_{m,out} = 0$ but $C^{ECN}_{m,in} = 1.5$ bpp is specified. The inbound covert channel is deemed insignificant [42].

The examined Linux and OpenBSD systems both implement the original ECN specification [41]. They can be configured to use “limited functionality” mode, eliminating ECN-based covert channels. However, Linux does not drop packets where the outer ECN field was invalidly set to Congestion Experienced (see also Section B4).

3) IPv4 Flags (Flags): The IPv4 Flags field consists of one unused bit (Reserved) and two bits for fragmentation, Don't Fragment (DF) and More Fragments (MF). The IPsec specification discusses the problem of fragmentation at great length, but refers to standard IP-IP encapsulation [43] for the exact treatment of inner and outer header fields. In general, the IPsec gateway must copy the DF bit if set by the sender to enable Path MTU Discovery (PMTUD) along the route. Otherwise, if the sender allows fragmentation, the IPsec gateway is advised to still use PMTUD but fragment the original packet before encapsulation. If the advice is followed, no outbound covert channel is created by the Flags field. During decapsulation, any WAN fragments must first be reassembled, eliminating any option for the adversary to encode information for the final recipient of the packet.

The examined OpenBSD implementation correctly resets the DF and MF bits on encapsulation and decapsulation, but copies the Reserved bit as a side-effect of this normalization. The Linux implementation copies the DF bit by default unless PMTUD has been disabled by configuration. However, in this case a compatibility issue leads to repeated incrementing of the IP ID field for fragmentable packets. Hence, both implementations leak information about the Flags field settings on the inner header, creating an outbound covert channel with $C^{\mathrm{Flags}}_{m,\mathrm{out}} = 1$ bpp.

4) Destination IP (DestIP): The destination IP address can be used as a covert channel [18]. Although the insider in a VPN cannot directly modify the destination address visible in the WAN, the destination IP can be modified indirectly if more than two LAN sites are connected in the VPN, by addressing the packet to one or the other LAN site. However, assuming that other covert channels are eliminated, the MITM cannot distinguish the (time and size padded) stream of packets for each VPN channel to determine which channel is being preferred by the insider. Hence, the characteristic only acts as an amplifier for an existing outbound covert channel, multiplying the symbol space of the existing covert channel by the number of alternate destination LANs $N$.
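The amplification effect can be made concrete with a small calculation. The following is a minimal sketch, not part of the evaluated prototype: the function name and the example values are hypothetical, and it only restates that multiplying a symbol space by $N$ adds $\log_2 N$ bits per symbol, provided some outbound channel exists to be amplified.

```python
import math

def amplified_bits_per_symbol(base_bits: float, n_destinations: int) -> float:
    """Bits per symbol after multiplying the symbol space by n_destinations.

    A channel carrying base_bits per symbol has 2**base_bits symbols;
    multiplying that space by N adds log2(N) bits per symbol.
    Without an existing channel (base_bits <= 0) there is nothing to amplify.
    """
    if n_destinations < 1:
        raise ValueError("need at least one destination LAN")
    if base_bits <= 0:
        return 0.0
    return base_bits + math.log2(n_destinations)

# Hypothetical example: a residual 1 bpp channel and N = 4 alternate LAN sites.
print(amplified_bits_per_symbol(1.0, 4))  # 3.0
```

If all other outbound channels are eliminated, the function returns zero, mirroring the argument that the destination address alone conveys nothing to the MITM.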

B. Timing Channels

1) Packet Size (PktSize): Information can also be encoded in the size of packets [18]. In IPsec VPNs, the created covert channel is a limited outbound channel since (1) the MITM cannot covertly modify the packet size and (2) the symbol space is reduced by ESP payload alignment and block cipher padding [44]. To estimate the available symbol space, consider that the maximum and minimum possible packet lengths are constrained by protocol headers. The remaining possible packet lengths are then partitioned due to ESP padding and alignment. For example, the maximum packet size in a standard Ethernet LAN is limited by the 1500 byte Maximum Transmission Unit (MTU), minus 40 bytes for the inner and outer IP headers and two times ≈ 12 bytes for the ESP header with cipher block padding and ESP MAC, respectively. Hence $l_{max} \approx 1500 - 2 \cdot 20 - 2 \cdot 12 = 1436$ bytes and $l_{min} \approx 2 \cdot 20 + 2 \cdot 12 = 64$ bytes.⁶ Considering the 4 byte alignment of ESP we get $C^{\mathrm{PktSize}}_{m,\mathrm{out}} = \log_2\left(\frac{1436 - 64}{4}\right) \approx 8.4$ bpp.
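The packet size estimate can be reproduced in a few lines. This is a sketch using the approximate header sizes quoted in the text; the function and parameter names are illustrative choices and not taken from the prototype.

```python
import math

def packet_size_capacity(mtu=1500, ip_header=20, esp_part=12, alignment=4):
    """Return (l_max, l_min, capacity in bits per packet).

    esp_part approximates both the ESP header with cipher block padding
    and the ESP MAC, as in the text (the IV is not counted).
    """
    overhead = 2 * ip_header + 2 * esp_part   # inner+outer IP, ESP header+MAC
    l_max = mtu - overhead                    # largest possible length
    l_min = overhead                          # smallest possible length
    symbols = (l_max - l_min) // alignment    # distinct 4-byte-aligned lengths
    return l_max, l_min, math.log2(symbols)

l_max, l_min, capacity = packet_size_capacity()
print(l_max, l_min, round(capacity, 1))  # 1436 64 8.4
```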

2) Inter-Packet Delay (IPD): Several previous works discuss exploitation and mitigation techniques for covert channels based on the relative delay between packets (IPD) [18], [22], [34], [38], [10]. The characteristic can be used for both outbound and inbound covert channels and is not significantly affected by IPsec processing. The channel capacity generally depends on the accuracy of the timing measurements on the receiver [22]. However, when approximating it as a simple binary channel that either delays a packet or not, it achieves a capacity of $C^{\mathrm{IPD}}_{m} = 1$ bpp. Previously, a throughput of 0.98 bpp was reported on an intercontinental Internet connection [10].
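The binary approximation can be illustrated with a toy encoder and decoder. The sketch below is purely illustrative: the delay values and function names are assumptions, and real channels must cope with network jitter, which is why the achievable rate depends on the receiver's timing accuracy.

```python
def encode_ipd(bits, base_delay=0.010, extra_delay=0.020):
    """Map each bit to an inter-packet delay in seconds (1 = delayed packet)."""
    return [base_delay + (extra_delay if b else 0.0) for b in bits]

def decode_ipd(delays, base_delay=0.010, extra_delay=0.020):
    """Recover bits by comparing each delay with the midpoint threshold."""
    threshold = base_delay + extra_delay / 2
    return [1 if d > threshold else 0 for d in delays]

message = [1, 0, 1, 1, 0]
# Hypothetical jitter, smaller than half the gap between the two delays.
jitter = [0.004, -0.004, 0.002, -0.003, 0.001]
observed = [d + j for d, j in zip(encode_ipd(message), jitter)]
print(decode_ipd(observed) == message)  # True
```

Each packet carries exactly one bit in this scheme, which corresponds to the 1 bpp bound; jitter exceeding half of `extra_delay` would introduce decoding errors and lower the effective rate.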

3) Packet Order (PktOrd): Information can also be encoded by changing the order of packets [16], [27]. In IPsec VPNs, the MITM cannot distinguish the order of IPsec payloads. However, the MITM may reorder WAN packets to send information to an insider, creating an inbound covert channel. The receiving IPsec gateway can optionally maintain a partial anti-replay window, a bit mask that marks the last $r$ received packets based on the Encapsulating Security Payload (ESP) or AH sequence number field. If used, a typical replay window of, e.g., $r = 32$ packets leaves an inbound covert channel capacity of $C^{\mathrm{PktOrd}}_{m,\mathrm{in}} = \frac{1}{r} \cdot \log_2(r!) = 3.67$ bpp.

4) Packet Drops (PktDrop): Packet dropping was proposed in [45] to create covert channels. However, since the MITM in a VPN cannot distinguish secure channel packets, the outbound version of this channel is equivalent to the IPD-based covert channel. When used as an inbound covert channel, one LAN client may generate a stream of enumerated packets from which the WAN MITM adversary may drop some at will. Hence, a reliable covert channel with at most $C^{\mathrm{PktDrop}}_{m,\mathrm{in}} = 1$ bpp (drop/no drop) can be created.

5) Path MTU Discovery (PMTUD): PMTUD is a mechanism to dynamically detect the Path MTU (PMTU), the maximum allowed packet size along an IP path [28]. By setting the Don’t Fragment (DF) flag in the IP header, the sender asks intermediate routers not to fragment packets on demand but to drop them and instead report the MTU of their network segment as an ICMP error (PMTU error). The sender notes the smallest reported MTU towards a particular destination as the PMTU and adjusts the size of emitted packets accordingly. The sender will also periodically test by trial and error if a larger PMTU is possible, e.g., in case the routing has been changed (PMTU aging).

⁶ Excluding the block cipher’s initialization vector (IV).

While the size of VPN packets cannot be manipulated directly, the MITM may artificially limit the MTU of his network segment to inject information via PMTUD. Specifically, when using PMTUD in the WAN, the MITM may inject ICMP errors to reduce the PMTU perceived by the IPsec gateways. Upon receiving a packet that is larger than the known PMTU for the respective WAN destination, and if that packet has the DF bit set, IPsec gateways will propagate the smaller PMTU to the respective LAN client by synthesizing the appropriate PMTU error [14].
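The gateway behavior that turns a WAN-side PMTU into a LAN-visible signal can be summarized as a simple decision. The following sketch is a simplification under stated assumptions: the function name and return values are hypothetical, IPsec encapsulation overhead is ignored, and no actual IPsec implementation is claimed to be structured this way.

```python
def gateway_pmtu_action(packet_len, df_set, wan_pmtu):
    """Decide how a (hypothetical) gateway treats an outbound LAN packet.

    Returns a tuple (action, reported_mtu). Encapsulation overhead is
    ignored here; a real gateway would account for it.
    """
    if packet_len <= wan_pmtu:
        return ("forward", None)           # fits, no signal to the LAN client
    if df_set:
        return ("pmtu_error", wan_pmtu)    # synthesize ICMP error for the client
    return ("fragment", None)              # fragment before encapsulation

# A PMTU lowered by injected ICMP errors becomes visible to the LAN client:
print(gateway_pmtu_action(1400, True, 1200))   # ('pmtu_error', 1200)
print(gateway_pmtu_action(1400, False, 1200))  # ('fragment', None)
```

Only the branch with the DF bit set leaks the MITM-chosen value to the insider, which is why the channel depends on LAN clients using PMTUD.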

Since the PMTU is unlikely to change frequently, the IPsec specification recommends that gateways follow the PMTU aging process described in the PMTUD specification [28]. There, a periodic timeout of $t = 2$ minutes is recommended before attempting to increase the PMTU, and immediately increasing further if the attempt was successful, until the new PMTU is discovered. Since only smaller PMTU values are propagated to LAN clients, this limits the rate of PMTU errors that the MITM can usefully inject.

For example, if we assume a possible MTU range from the minimum IP packet size (768 bytes) to the maximum Ethernet MTU minus IPsec overhead (≈ 1500 − 20 − 35 = 1445 bytes), with changes in 4 byte steps due to ESP padding and alignment, there are $M \approx \frac{1445 - 768}{4} \approx 169$ different packet sizes that could be reported by the IPsec gateway to the LAN client. Furthermore, the adversary can mitigate the delay imposed by the PMTU aging timeout $t$ by partitioning the symbol space such that at most $M/2 = 169/2 = 84$ subsequent 1 bit symbols can be sent, achieving up to $C^{\mathrm{PMTUD}}_{m,\mathrm{in}} = \frac{M/2}{t} = \frac{84}{2 \cdot 60} = 5.18$ bps.
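Two of the inbound estimates in this subsection reduce to short calculations: the packet-order capacity for a replay window of $r$ packets and the number of PMTU values available to the adversary. The sketch below recomputes these two quantities only, using the figures quoted in the text; the function names are illustrative, and the final PMTUD rate is not recomputed here.

```python
import math

def packet_order_capacity(r):
    """Bits per packet when any ordering of r packets is a distinct symbol."""
    return math.log2(math.factorial(r)) / r

def pmtud_symbols(mtu_min=768, mtu_max=1445, step=4):
    """Return (M, M // 2): reportable PMTU values and usable 1 bit symbols."""
    m = (mtu_max - mtu_min) // step
    return m, m // 2

print(round(packet_order_capacity(32), 2))  # roughly 3.7 bpp for r = 32
print(pmtud_symbols())                      # (169, 84)
```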

C. Active Probing

The MITM may infer information on the LAN status by actively probing the IPsec gateways and evaluating their response behavior, an approach that was introduced as active traffic analysis [23], [46]. In particular, a LAN client could cause high load on the LAN interface of the IPsec gateway. The resulting change in the gateway’s system load can then be measured by how it responds to legitimate service requests by the MITM, such as ICMP pings [23].

Note that, contrary to the previous channel characteristics, this attack actually exploits a side channel at the gateway: Its capacity does not depend on the usage of the VPN channel but on the frequency at which the insider can induce high and low system loads at the gateway as well as on the rate at which the MITM is able to probe the gateway to measure its system load with sufficient accuracy.

It was previously proposed to normalize the (seemingly uncritical) responses by the gateway. We also believe that the attack can be prevented by combining the IPsec gateway with a second physical firewall on the WAN side, to filter invalid requests and act as key negotiation server on behalf of the VPN gateway. However, such side-channels are outside the scope of this work.

