
The Anatomy of LDNS Clusters: Findings and Implications for DNS-Based Network Control

Hussein A. Alzoubi, Case Western Reserve University
Michael Rabinovich, Case Western Reserve University
Oliver Spatscheck, AT&T Research Labs

ABSTRACT

We present a large-scale measurement of clusters of hosts sharing the same local DNS servers. We analyze properties of these "LDNS clusters" from the perspective of DNS-based network control, a widely used technique to build scalable Internet platforms such as content delivery networks. We found that not only do LDNS clusters differ widely in terms of their size and geographical compactness, but the largest clusters are actually extremely compact. This suggests potential benefits of a network control strategy with nuanced treatment of different LDNS clusters based on the combination of their size and compactness. We observed interesting variations in LDNS setups, including a wide use of "LDNS pools" (which, as we explain in the paper, are different from setups where end-hosts simply utilize multiple resolvers). Finally, we measured and found no evidence of a performance penalty for unilateral IPv6 adoption by a Web site - a finding we hope will be useful to sites considering their IPv6 migration strategy.

1. INTRODUCTION

The Domain Name System (DNS) is a key component of today's Internet apparatus. Its primary goal is to resolve human-readable host names, such as "cnn.com", to hosts' IP addresses. In particular, HTTP clients send queries for these resolutions to their local DNS servers (LDNS), which route these queries through the DNS infrastructure and ultimately send them to authoritative DNS servers (ADNS) that maintain the needed mapping information. The ADNS servers then return the corresponding IP addresses back to the LDNS, which forwards them to the clients, who can then proceed with their HTTP interactions. By returning different IP addresses to different queries, ADNS can direct different HTTP requests to different servers. This commonly forms the basis for transparent client request routing in replicated web sites, content delivery systems, and - more recently - cloud computing platforms.

When selecting an IP address to reply to a DNS query, the ADNS only knows the identity of the requesting LDNS and not the client that originated the query. Thus, the LDNS acts as the proxy for all its clients. We call the group of clients "hiding" behind a common LDNS server an "LDNS cluster". DNS-based network control can only distribute client demand among data centers or servers at the granularity of entire LDNS clusters, leading to two fundamental problems: the hidden load problem [6], in which a single load balancing decision may lead to an unforeseen amount of load shift, and the originator problem [22], in which a request routing apparatus that attempts to route clients to the nearest data center considers only the location of the LDNS and not of the clients behind it.

This paper studies properties of LDNS clusters from the perspective of their effect on client request routing. Various proposals were put forward to include the identity of the client in LDNS queries, but they have not been adopted, presumably because these proposals are incompatible with shared LDNS caching, where the same response can be reused by multiple clients. Another such proposal, spearheaded by Google, is currently underway [9]. Our work will be useful in informing these efforts.

The key findings in our study are the following:

• An overwhelming majority of the clusters, even as seen from a very high-volume web site, are very small, posing no issue with respect to hidden load. However, there are a few "elephant" clusters. Thus, a DNS-based request routing system may benefit by tracking the elephant LDNS clusters and treating them differently.

• LDNS clusters differ widely in terms of their geographical and AS span. Furthermore, the extent of this span does not correlate with cluster size: the busiest clusters are very compact geographically but not in terms of AS-sharing. Thus, a DNS-based request routing system can benefit by treating LDNS clusters differently depending on a combination of their size and compactness: when there is a need to rebalance server load, the system may re-route requests from non-compact clusters first because they benefit less from proximity-sensitive routing anyway.


• A large number of IPs act as both Web clients and their own LDNSs. We find evidence that much of this phenomenon is explained by the presence of middleboxes (NATs, firewalls, and web proxies). However, although they aggregate traffic from multiple hosts, these clusters exhibit, if anything, lower activity. Hence this aspect by itself does not appear to warrant special treatment from the request routing system.

• We find strong evidence of LDNS pools with shared cache, where a set of "worker servers" shares the work of resolving clients' queries. While the implications of this behavior for network control remain unclear, the prevalence of this LDNS behavior warrants a careful future study.

• We examined the performance penalty for unilateral IPv6 adoption by a Web site and found scant evidence of it. Some sites, including Google [10], pursue conservative approaches to IPv6 adoption based on explicit client opt-in. Our measurement should help Web sites develop their IPv6 migration strategies.

2. SYSTEM INSTRUMENTATION

We used an enhanced approach from [17] to gather our measurements. Our setup is illustrated in Figure 1. We registered the domain dns-research.com and built a specialized DNS server to act as its authoritative DNS server, as well as a specialized Web server to serve content from this domain. We deployed both servers on the same host. The Web server hosts a special image URL, dns-research.com/special.jpg, which we embedded into the home page of a high-volume consumer-oriented Web site. When a user accesses this Web site and hence our special image, their browser first sends a DNS query for dns-research.com (we refer to this as the "base domain" and "base query") to the user's LDNS (step 1 in the figure). The LDNS obtains the IP address from our DNS server (steps 2 and 3) and forwards the response to the client (step 4), which then sends the HTTP request for this image to our server (step 5). Our Web server returns an HTTP 302 ("Moved") response (step 6) redirecting the client to another URL in the dns-research.com domain, but with a host name that embeds the client's IP address. For example, when our Web server receives a request for special.jpg from client 206.196.164.138, the Web server replies to the client with the following redirection link: 206_196_164_138.sub1.dns-research.com/special.jpg. Since the redirection URL contains a new hostname, the client needs to resolve this name through its LDNS. Thus the client issues another DNS request to its LDNS (step 7), which in turn sends the request to our DNS server (step 8), allowing the latter to record both the IP address of the LDNS that sent the query and the IP address of its associated client that had been embedded in the hostname.

Figure 1: System Setup

In the original approach of [17], the DNS server would now complete the interaction by resolving the query to the IP of the special Web server, which would respond to the client's HTTP request with a 1-pixel image. We augmented this approach in two ways.

First, our specialized DNS server responds to any query for *.sub1.dns-research.com with the corresponding CNAME record *.sub2.dns-research.com, where * denotes the same string representing the client's IP address (step 9), forcing the client to perform another DNS resolution for the latter name. Moreover, our DNS server includes its own IP address in the authority section of its reply to the "sub1" query, which ensures that the LDNS sends the second request (the "sub2" request) directly to our DNS server. We added this functionality in order to measure some timing properties of client networks, but we also used it to discover LDNS pools (Section 9.2). Upon receiving the "sub2" query (step 10), our DNS server returns its own IP address (steps 11-12), which is also the IP address of our Web server, and the client performs the final HTTP download of our special image (steps 13-14).

Second, we added functionality to measure the performance implications of deploying a dual-stack IPv6-enabled Web site and serving IPv6-enabled clients from an IPv6 server. Specifically, we configured our specialized DNS server to respond to IPv6 queries (type-AAAA requests) for *.sub1.dns-research.com in the same way as to type-A queries (i.e., with the corresponding CNAME record *.sub2.dns-research.com), but when the LDNS then issues the AAAA request for *.sub2.dns-research.com, our DNS server replies with an unreachable IPv6 address.[1] This imitates the situation when there is no IPv6 path between the IPv6-enabled client and server.

[1] Since the base DNS queries cannot be reliably associated with the clients, our DNS responds with NXDOMAIN to AAAA requests for the base domain.


Table 1: High-level dataset characterization

Unique LDNS IPs:              278,559
Unique Client IPs:            11,378,020
Unique Client/LDNS IP Pairs:  21,432,181

We then measure the time between when we send the unreachable IPv6 address and when the client falls back to IPv4 (i.e., when the corresponding HTTP request arrives over IPv4). By measuring these delays, our instrumentation allows us to weigh in on the following question: could an IPv6-enabled web site blindly send IPv6-enabled clients to its IPv6 server, or should it do so only for clients who explicitly verify the presence of an end-to-end IPv6 tunnel to the site and then opt in (as is the practice Google currently follows [10])?

Finally, to ensure we see all DNS queries that would be seen by content delivery networks, we used a low 10-second TTL for our DNS records - lower than any CDN we are aware of. Further, our Web server adds a "Cache-Control: no-cache" header field to its HTTP responses to make sure we receive every request for our special image.
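To make the interaction concrete, the following minimal sketch illustrates the redirect-and-CNAME logic described above. It is not the authors' implementation; the function names, the placeholder addresses, and the underscore encoding of the client IP in the hostname are our own assumptions for illustration.

# Sketch of the instrumentation logic described above (assumed details:
# the IP-encoding scheme and all names are illustrative, not the actual code).

BASE = "dns-research.com"
SERVER_IP = "192.0.2.1"         # placeholder: IP shared by our DNS and Web server
UNREACHABLE_V6 = "2001:db8::1"  # placeholder for the unreachable IPv6 address

def redirect_url(client_ip):
    """Build the 302 redirect target that embeds the client's IP (step 6)."""
    label = client_ip.replace(".", "_")            # e.g. 206_196_164_138
    return "http://%s.sub1.%s/special.jpg" % (label, BASE)

def handle_dns_query(qname, qtype):
    """Answer sub1/sub2 queries arriving from LDNSs (steps 8-12)."""
    if qname.endswith(".sub1." + BASE):
        # CNAME to the corresponding sub2 name; the authority section points
        # back at this server so the LDNS sends the sub2 query directly to us.
        return ("CNAME", qname.replace(".sub1.", ".sub2."))
    if qname.endswith(".sub2." + BASE):
        if qtype == "AAAA":
            return ("AAAA", UNREACHABLE_V6)        # IPv6 fallback experiment
        return ("A", SERVER_IP)                    # final resolution (steps 11-12)
    if qname == BASE:
        return ("A", SERVER_IP) if qtype == "A" else ("NXDOMAIN", None)
    return ("NXDOMAIN", None)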

3. THE DATASET

We have collaborated with a major consumer-oriented website to embed the image linked to our setup into their home page. Whenever a web browser visits the home page, the browser downloads the linked image and the interactions discussed in Figure 1 take place. We have collected the DNS logs (including the timestamp of the query, the LDNS IP, the query type, and the query string) and HTTP logs (request time, User-Agent and Host headers) resulting from these interactions. Our experiment lasted 28 days, from Jan 5th, 2011 to Feb 1st, during which we collected a total of over 67.7 million sub1 and sub2 DNS requests and around 56 million HTTP downloads of the final image (steps 13/14 in Figure 1).[2]

Table 1 shows a high-level characterization of our dataset. We have collected over 21M client/LDNS associations between 11.3M unique client IP addresses from 17,778 autonomous systems (ASs) and almost 280K LDNS resolvers from 14,627 ASs. We refer to all clients that used a given LDNS as the LDNS cluster.

[2] While one could have expected the number of final HTTP downloads to be roughly half the number of sub* DNS requests, since each client access would normally generate a sub1 and a sub2 DNS request and one final HTTP download, the number of these HTTP downloads in our dataset is much greater than that. We verified that this is due to clients and LDNSs caching our replies to "sub2" queries for much longer than our specified TTL value. These wide-spread TTL violations were first reported in [19].
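The client/LDNS associations above can be recovered from the DNS logs roughly as sketched below (our own illustration; the record layout and helper names are assumptions): the client IP is embedded in the queried sub1/sub2 hostname, while the source IP of the query identifies the LDNS.

from collections import defaultdict

def client_ip_from_qname(qname):
    """Extract the embedded client IP from, e.g., 206_196_164_138.sub1.dns-research.com."""
    parts = qname.split(".")[0].split("_")
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        return ".".join(parts)
    return None

def build_clusters(dns_log):
    """dns_log: iterable of (timestamp, ldns_ip, qtype, qname) tuples.
    Returns a dict mapping each LDNS IP to the set of client IPs seen behind it."""
    clusters = defaultdict(set)
    for _ts, ldns_ip, _qtype, qname in dns_log:
        client_ip = client_ip_from_qname(qname)
        if client_ip is not None:
            clusters[ldns_ip].add(client_ip)
    return clusters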

Figure 2: Distribution of LDNS cluster sizes (CDF of the number of clients associated with an LDNS; inset: number of associated clients for the top 1000 LDNSs).

Thus, an LDNS cluster contains one LDNS IP and all clients that used that LDNS in our experiment. Note that the same client can belong to multiple LDNS clusters if it used more than one LDNS during our experiment.

4. CLUSTER SIZE

We begin by characterizing LDNS clusters in terms of their size. We should stress that our characterization is done purely based on IP addresses, and our use of the term "client" is simply a shorthand for "client IP address". It has been shown that IP addresses may not be a good representation of individual hosts due to the presence of network address translation boxes and dynamic IP addresses [16, 4].

Figure 2 shows the CDF of LDNS cluster sizes, while the inset subfigure shows the sizes of the 1000 largest LDNS clusters (in increasing order of size). We found that a vast majority of LDNS clusters are small - over 90% of LDNS clusters contain fewer than 10 clients. This means that most clusters do not provide much benefit of a shared DNS cache to their clients. To see the potential impact on clients, Figure 3 shows the cumulative percentage of sub1 requests issued by LDNSs representing clusters of different sizes, as well as the cumulative percentage of their client/LDNS associations. More precisely, for a given cluster size X, the corresponding points on the curves show the percentage of sub1 requests issued by LDNS clusters of size up to X, and the percentage of all client/LDNS associations belonging to these clusters. As seen in the figure, small clusters, with fewer than 10 clients, contribute less than 10% of all sub1 requests and comprise less than 1% of all client/LDNS associations. Thus, even though these small clusters represent over 90% of all LDNS clusters, an overwhelming majority of clients belong to larger clusters, which are also responsible for most of the activity.

Moreover, we observed a few "elephant" clusters.


Figure 3: Distribution of sub1 requests and client/LDNS pairs attributed to LDNS clusters of different sizes (CDF over cluster size).

Figure 4: Size distribution of clients' LDNS sets (CDF of the number of LDNSs associated with a client; inset: number of associated LDNSs for the top 100 clients).

The largest such cluster (with LDNS IP 167.206.254.14) comprised 129,720 clients, and it alone was responsible for almost 1% of all sub1 requests. Elephant clusters may dramatically affect load distribution, and their small number suggests that it might be warranted and feasible to identify and handle them separately from the rest of the LDNS population. Overall, the size of LDNS clusters ranged from 1 to 129,720 clients, with the average size being 76.94 clients. We further consider the top-10 elephant LDNS clusters in Section 8.

We also consider the sets of LDNS servers individual clients associate themselves with. We refer to these sets, for lack of a better term, as a client's LDNS set, or LDNS set for short. Figure 4 shows the CDF of the sizes of LDNS sets, with an inset showing the sizes of the largest 100 sets. While almost 61% of clients showed up with only one LDNS, 10% of clients used more than 4 LDNSs, and almost 1% of clients (more than 100K) associated with more than 10 LDNSs during our experiment. The average number of servers used by one client was 1.88, which agrees with an intuition for a typical LDNS setup. Yet again, there were some outliers, such as a client (151.190.254.108) utilizing 982 LDNS servers.

5. CLUSTER ACTIVITY

We now turn to characterizing the activity of LDNS clusters. This is important to DNS-based server selection because of the hidden load problem [6]: a single DNS response to an LDNS will direct HTTP load to the selected server from all clients behind this LDNS for the TTL duration. Uneven hidden loads may lead to unexpected results from the load balancing perspective. On the other hand, knowing the activity characteristics of different clusters would allow one to take hidden loads into account during the server selection process. For example, dynamic adjustments of the TTL in DNS responses to different LDNSs can be used to compensate for different hidden loads [5, 6].

We characterize the activity of LDNS clusters by the number of their sub1 requests as well as by the number of final HTTP requests for the special image generated by their clients (steps 13-14 of our setup interaction). We refer to these final HTTP requests simply as HTTP requests in the rest of the paper, but stress that we do not include the initial redirected HTTP requests (steps 5-6 of the setup) in any of the results. Since a client may belong to multiple LDNS clusters (e.g., when it used a different LDNS at different times), we associate an HTTP request with the last LDNS that was used by the client prior to the HTTP request in question.
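This attribution step can be illustrated with a short sketch (ours, not the paper's code; the record layouts are assumptions): for each client we keep its DNS queries sorted by time and charge each HTTP request to the LDNS of the most recent preceding query.

import bisect
from collections import defaultdict

def attribute_http_to_ldns(dns_log, http_log):
    """dns_log: (timestamp, ldns_ip, client_ip) tuples from sub1 queries;
    http_log: (timestamp, client_ip) tuples for final image downloads.
    Returns the number of HTTP requests attributed to each LDNS."""
    timeline = defaultdict(list)              # client IP -> [(ts, ldns_ip), ...]
    for ts, ldns_ip, client_ip in sorted(dns_log):
        timeline[client_ip].append((ts, ldns_ip))
    times = {c: [t for t, _ in q] for c, q in timeline.items()}

    hidden_load = defaultdict(int)
    for ts, client_ip in http_log:
        if client_ip not in times:
            continue
        # Index of the last DNS query issued at or before this HTTP request.
        i = bisect.bisect_right(times[client_ip], ts) - 1
        if i >= 0:
            hidden_load[timeline[client_ip][i][1]] += 1
    return hidden_load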

Figure 5 shows the CDF of the number of sub1 queries issued by LDNSs, as well as the CDF of the number of HTTP requests issued by clients behind each LDNS during our experiment. Again, both curves in the figure indicate that there are only a small number of high-activity clusters. Indeed, 35% of LDNSs issued only one sub1 request, and 96% of all LDNSs issued fewer than 100 sub1 requests over the entire experiment duration. Yet the most active LDNS sent 303,042 sub1 requests. The HTTP activity presents similar trends, although we do observe some hidden load effects even among low-activity clusters: whereas 35% of LDNSs issued a single DNS query, only fewer than 20% of the corresponding clusters issued just a single HTTP request.

Overall, our observations on both the size and activity of LDNS clusters confirm that DNS-based server selection may benefit from treating different LDNSs differently. At the same time, one may only need to concentrate on a relatively small number of "elephant" LDNSs for such special treatment.

6. CLIENT-TO-LDNS PROXIMITY

We consider the proximity of clients to their LDNS servers, which determines the severity of the originator problem and can have other implications for proximity-based request routing.


Figure 5: LDNS activity in terms of DNS and HTTP requests (CDF of the number of sub1 and HTTP requests per LDNS).

Figure 6: Air miles for all client/LDNS pairs (CDF).

Prior studies looked at different proximity metrics, including TCP traceroute divergence, network delay difference as seen from a given external vantage point, and autonomous system sharing - how many clients reside in the same AS as their LDNS servers. We revisit the AS-sharing metric, but instead of the other metrics, which are vantage-point dependent, we consider the air-mile distance between clients and their LDNSs.

6.1 Air-Miles Between Client and LDNS

We utilized the GeoIP City database from MaxMind [18], which provides geographical location information for IP addresses, to study geographical properties of LDNS clusters. Using the database dated February 1, 2011 (so that our analysis would reflect the GeoIP map at the time of the experiment), we mapped the IP addresses of the clients and their associated LDNSs and calculated the geographical distance ("air-miles") between them.
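The air-miles metric itself is just the great-circle distance between the two geolocated points; a minimal sketch (ours), assuming latitude/longitude pairs have already been looked up in the GeoIP database, is the standard haversine formula:

import math

EARTH_RADIUS_MILES = 3959.0  # mean Earth radius in miles

def air_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two lat/lon points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

# Example: a client in New York and an LDNS in Chicago are roughly 710 miles apart.
# air_miles(40.71, -74.01, 41.88, -87.63)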

Figure 6 shows the cumulative distribution function (CDF) of air-miles for all client/LDNS pairs. The figure shows that clients are sometimes situated surprisingly far from their LDNS servers. Only around 25% of all client/LDNS pairs were less than 100 miles apart, while 30% were over 1000 miles apart. This suggests an inherent limitation to how accurate, in terms of proximity, DNS-based server selection can be.

Figure 7: Average air-miles in an LDNS cluster (average client/LDNS distance and number of associated clients, per LDNS, for all LDNSs).

Figure 8: Average client/LDNS distance in top LDNS clusters (average air-miles and number of associated clients, per LDNS).

We note that our measurements showed greater distances than previously measured in [13].

6.2 Geographical Span

We are also interested in the geographical span of LDNS clusters. Concentrated clusters are more amenable to proximity routing than spread-out ones. If a content platform can distinguish between these kinds of clusters, it could treat them differently: requests from an LDNS representing a concentrated cluster could be preferentially resolved to a highly proximal content server, while requests from LDNSs representing spread-out clusters could be used to even out load with less regard for proximity. Unlike today's equal treatment, this would result in more requests resolving to proximal content servers when it actually counts.

Figure 7 plots, for each LDNS server, the average air-miles from this server to all its clients and the number of clients for the same LDNS. The X-axis shows LDNSs sorted by the size of their client cluster and, within LDNSs of equal cluster size, by the average air-miles distance.


Figure 9: CDF of LDNS clusters with a given % of clients (LDNSs) outside their LDNS's (client's) autonomous system.

Most of the LDNSs with only one client have an average air-miles distance of 0 miles - as we will see later (Section 9), most of these actually have the same IP representing both the client and the LDNS. However, this general trend notwithstanding, we found some single-client clusters with an LDNS/client distance of over 10,000 miles. Obviously, any proximity-based server selection for these clusters is futile.

We also focused on the LDNSs with more than 10 clients, which represent almost 10% of all LDNSs in the data set. Figure 8 shows the average air-miles and the number of clients for these top LDNS clusters. As the number of clients increases, the variation of geographical span among clusters narrows but still remains significant, with order-of-magnitude differences between clusters. This provides evidence in support of differential treatment of LDNSs not just with respect to their differences in size and activity, as we saw in Sections 4 and 5, but also with respect to proximity-based server selection.

6.3 AS Sharing

Another measure of proximity is the degree of AS sharing between clients and their LDNSs. Figure 9 shows this information from, respectively, the LDNS and the client perspective. The LDNS perspective reflects, for a given LDNS, the percentage of its associated clients that are in the same AS as the LDNS itself. The client perspective considers, for a given client, the percentage of its associated LDNSs that are in the same AS as the client itself.

While almost 77% of LDNSs have all their clients in the same AS as they are, 15% of LDNSs have half of their clients outside their AS and 10% have all their clients in a different AS. From the clients' perspective, we have found that more than 9 million clients have all their LDNSs in their AS, while nearly 2 million (almost 17%) have all their LDNSs in a different AS. Only a small number of clients - just over 180K - had a mix of some LDNSs within and some LDNSs outside the client's AS.

Table 2: Client activity attributed to client-LDNS associations sharing the same AS

DNS requests    HTTP requests
73.79%          81.97%

Such a strong dichotomy (i.e., that clients either had all or none of their LDNSs in their own AS) is explained by the fact that most clients associate with only a small number of LDNSs.
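The per-LDNS AS-sharing percentage behind Figure 9 can be computed as sketched below (our own illustration; it reuses the cluster map from the earlier sketch and assumes some IP-to-origin-AS lookup, here a hypothetical ip_to_asn callable backed by a BGP-derived database).

def as_sharing_per_ldns(clusters, ip_to_asn):
    """clusters: dict mapping LDNS IP -> set of client IPs (as in Section 3).
    ip_to_asn: callable mapping an IP address to its origin AS number (or None).
    Returns a dict mapping LDNS IP -> % of its clients in the same AS as the LDNS."""
    sharing = {}
    for ldns_ip, clients in clusters.items():
        ldns_asn = ip_to_asn(ldns_ip)
        if not clients or ldns_asn is None:
            continue
        same = sum(1 for c in clients if ip_to_asn(c) == ldns_asn)
        sharing[ldns_ip] = 100.0 * same / len(clients)
    return sharing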

Our discussion so far concerned the prevalence of AS sharing in terms of the client population. In other words, each client-LDNS association is counted once in those statistics. However, different clients may have different activity levels, and we now consider the prevalence of AS sharing from the perspective of clients' accesses to the Web site. Table 2 shows the fraction of client activity stemming from client-LDNS associations that share the same AS. The first column reports the fraction of all sub1 and sub2 request pairs with both the client and LDNS belonging to the same AS. This metric reflects definitive information, but it only approximates the level of client activity because of the 10s TTL we use for sub1 and sub2 responses: we do not expect the same client to issue another DNS query for 10 seconds, no matter how many HTTP requests it issues within this time. The second column shows the fraction of all HTTP requests that we attributed - using the technique described in Section 5 - to a preceding DNS query with the client using an LDNS within the same AS. This metric reflects definitive activity levels but is not iron-clad in attributing the activity to a given client/LDNS association.

First, we note that the prevalence of AS sharing based on activity is somewhat lower than that based on client populations. Second, these levels of AS sharing are significantly higher than those reported in the 10-year-old study [17] (see Table 5 there). This is an encouraging development for DNS-based network control.

Overall, while the prevalence of AS sharing increased from 10 years ago, we find a sizable fraction (15-17%) of client/LDNS associations where clients and LDNSs reside in different ASs. One of the goals in server selection by CDNs, especially those with a large number of locations such as Akamai, is to find an edge server sharing the AS with the originator of the request [21]. Our data shows fundamental limits to the benefits of this approach.

To gain more insight into the reasons behind the relatively high number of clients with LDNSs in external ASs, Figure 10 shows the distribution of clients and LDNSs across the autonomous systems represented in our data set. The X-axis shows the ASs sorted in decreasing order of the number of LDNSs they contain, and the Y-axis shows the number of LDNSs and clients in the corresponding AS.


Figure 10: Distribution of clients and LDNSs in autonomous systems (number of unique client and LDNS IPs per AS, sorted by number of LDNSs).

While in general we can see a correlation between the numbers of clients and LDNSs, the numbers of clients in ASs with a similar number of LDNSs differ by 3-4 orders of magnitude. In fact, there is a number of ASs with sizable client populations that showed no LDNS in our measurements; on the flip side, there are examples of ASs with almost 100 LDNSs and clients in the single digits, as well as ASs with around 10 LDNSs and no clients at all. Clearly, there are vast and wide-spread differences in LDNS setups, indicating that the question of the proper LDNS setup on the Internet is still unsettled.

7. TTL EFFECTS

The above analysis considered LDNS cluster activity over the duration of the experiment. However, platforms that use DNS-based server selection, such as CDNs, usually assign relatively small TTLs to their DNS responses to retain an opportunity for further network control. In this section, we investigate the hidden loads of LDNS clusters observed within typical TTL windows utilized by CDNs, specifically 20s (used by Akamai), 120s (AT&T's ICDS content delivery network) and 350s (Limelight).

To obtain these numbers, we use our DNS and HTTP traces to emulate the clients' activity under a given TTL. The starting idea behind this simulation is simple: the initial sub1 query from an LDNS starts a TTL window, and then all subsequent HTTP activity associated with this LDNS (using the procedure described in Section 5) is "charged" to this window; the next sub1 request beyond the current window starts a new window. However, two subtle points complicate this procedure.

Figure 11: LDNS cluster sizes within TTL windows (all windows); CDF of client IPs per LDNS per TTL window, for 20s, 120s, and 350s TTLs.

First, if after the initial sub1 query to one LDNS, the same client sends another DNS query through another LDNS within the emulated TTL window (which can happen since the actual TTL in our experiments was only 10s), we "charge" these subsequent queries and their associated HTTP activity to the TTL window of the first LDNS. This is because, with the longer TTL, these subsequent queries would not have occurred, since the client would have reused the initial DNS response from its cache.

Second, confirming the phenomenon previously measured in [19], we have encountered a considerable number of requests that violated TTL values, with violations sometimes exceeding the largest TTL value we simulated (350s). Consequently, in reporting the hidden loads per TTL, we use two lines for each TTL value. The lines labeled "strict" reflect only the HTTP requests that actually fell into the TTL window;[3] these results thus ignore requests that violate the TTL value. The "non-strict" lines include these violating HTTP requests and count them towards the hidden load of the TTL window to which the associated DNS query was assigned.
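A rough sketch of the window emulation, incorporating both subtleties, is shown below. It is our own simplification of the procedure described above (charging of HTTP requests to windows and the strict/non-strict distinction are omitted, and the record layout is assumed), not the authors' simulator.

def emulate_ttl_windows(dns_log, ttl):
    """dns_log: time-sorted (timestamp, ldns_ip, client_ip) sub1 records.
    Returns the number of emulated TTL windows opened per LDNS."""
    ldns_window_start = {}   # LDNS IP -> start time of its current window
    client_binding = {}      # client IP -> (window start, LDNS the client is charged to)
    window_count = {}
    for ts, ldns_ip, client_ip in dns_log:
        bound = client_binding.get(client_ip)
        if bound is not None and ts < bound[0] + ttl:
            # Subtlety 1: within the emulated TTL the client would have reused its
            # cached answer, so this query stays charged to the earlier window.
            continue
        start = ldns_window_start.get(ldns_ip)
        if start is None or ts >= start + ttl:
            start = ts                                   # open a new window for this LDNS
            ldns_window_start[ldns_ip] = start
            window_count[ldns_ip] = window_count.get(ldns_ip, 0) + 1
        client_binding[client_ip] = (start, ldns_ip)
    return window_count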

Figure 11 shows the CDF of the LDNS cluster sizes observed in each TTL window, i.e., each window contributed a data point to the CDF. The majority of windows, across all LDNSs, contained only one client, and, as expected, as the TTL window increases, the number of clients an LDNS serves increases. Still, only around 10% of TTL intervals had more than 10 clients under a TTL of 350s, and less than 2% of the intervals had more than 10 clients with a TTL of 120s. Figure 12 shows the CDF of the average in-TTL cluster sizes for LDNSs across all their TTL intervals. That is, each LDNS contributes only one data point to the CDF, reflecting its average cluster size over all its TTL intervals. The average in-TTL cluster sizes are even smaller, with virtually all LDNSs exhibiting an average in-TTL cluster size below 10 clients under all TTL values.

[3] For simplicity of implementation, we also counted HTTP requests whose corresponding sub* queries were within the window but whose HTTP requests themselves fell within our real TTL of 10s past the window. There was a very small number of such requests (a few thousand out of 56M total), so this does not materially affect our results.


Figure 12: Average LDNS cluster sizes within a TTL window (averaged over all windows for a given LDNS); CDF for 20s, 120s, and 350s TTLs.

Figure 13: HTTP requests within TTL windows (all windows); CDF for strict and non-strict counting under 20s, 120s, and 350s TTLs.

The difference between the two figures is explained by the fact that busier LDNSs (i.e., those showing up with more clients within a TTL) tend to appear more frequently in the trace, thus contributing more TTL windows to Figure 11.

To assess how the "hidden" load of LDNSs depends on the TTL, Figures 13 and 14 show, respectively, the CDFs of the number of HTTP requests in all TTL windows and of the average in-TTL number of HTTP requests for all LDNSs across all their TTL intervals. A few observations are noteworthy. First, the difference between the strict and non-strict lines indicates sizable violations of even the largest TTL we considered. As Figure 13 shows, these violations decrease for larger TTLs (as expected) and, importantly, all but disappear for a TTL of 350 sec. This indicates that at these TTL levels, CDNs might not need to be concerned about the unforeseen effect of these violations on hidden load.

Figure 14: Average number of HTTP requests per LDNS within a TTL window (averaged over all windows for a given LDNS); CDF for strict and non-strict counting under 20s, 120s, and 350s TTLs.

Second, while the hidden loads diminish for lower TTLs, and the differences in hidden loads among different LDNSs and windows are significant, their absolute values are small overall - virtually all windows contain fewer than 100 requests even for the largest TTL of 350s. Thus, low TTL values are important not for proper load-balancing granularity in routine operations but mostly to react quickly to unforeseen flash crowds. It is obviously undesirable to have to pay overhead on routine operation while using it only for extraordinary scenarios. A better knob would be desirable and should be given consideration in future Internet architectures.

8. TOP-10 LDNS CLUSTERS

We have investigated the top 10 LDNSs manually through reverse DNS lookups, whois records, and MaxMind ISP records for their IP addresses. The top-10 LDNSs in fact all belong to just two ISPs, which we refer to as ISP1 (LDNSs ranked 10-4) and ISP2 (ranked 3-1). ISP2's three top clusters by themselves contributed 1.6% of all unique client-LDNS associations in our traces and 2.33% of all sub1 requests. The extent of AS sharing for these clusters is shown in Figure 15. In the figure, the bars for each cluster represent (from left to right) the number of clients sharing the AS with the cluster's LDNS, the number of clients in other ASs, and, for the latter clients, the number of different ASs they represent. The figure shows a very low degree of AS sharing in the top clusters. Virtually all the clients belong to a non-LDNS AS, and each cluster spans dozens of different ASs. We further verified that these ASs belong to different organizations from those owning the corresponding LDNSs. Interestingly, the AS sharing is very similar between ISP1 and ISP2.

We also consider the geographical span of the top 10 clusters using the MaxMind GeoIP City database. Figure 16 shows the average and median air-miles distance between the LDNS and its clients, as well as the 95th and 99th percentiles of these distances, for each LDNS cluster.


!"

!#"

!##"

!###"

!####"

!#####"

!######"

$%&'!"

$%&'("

$%&')"

$%&'*"

$%&'+"

$%&',"

$%&'-"

$%&'."

$%&'/"

$%&'!#"

0&12'" 34512'" 67485"79"%:;"2'<=":8"34512'"

Figure 15: AS sharing of top-10 LDNSs andtheir clients

!"

!#"

!##"

!###"

!####"

$%&'!"

$%&'("

$%&')"

$%&'*"

$%&'+"

$%&',"

$%&'-"

$%&'."

$%&'/"

$%&'!#"

0123"04564789" 68:4;<"04564789" /+=>"?85@8<A78"04564789"

//=>"?85@8<A78"04564789" B"C748<=9"D"!##"64789"

Figure 16: Air miles between top-10 LDNSs andtheir clients.

!"#$

#$

#!$

#!!$

#!!!$

%&'()*#$

%&'()*+$

%&'()*,$

%&'()*-$

%&'()*.$

%&'()*/$

%&'()*0$

%&'()*1$

%&'()*2$

%&'()*#!$

34567$ 89*567$ %:9)*$:;$<'=$67>?$')$89*567$

Figure 17: AS sharing of top-10 clients and theirLDNSs

The last bar shows the percentage of clients in that cluster that are less than 100 miles apart from the LDNS.

While Figure 15 shows that the top-10 LDNS clusters span dozens of ASs, which suggests topologically distant pairs, Figure 16 shows that these clusters are very compact, with most clients less than 100 miles away from their LDNSs. Further, although both ISPs exhibit similar trends, ISP2 clearly has a more ubiquitous LDNS infrastructure, and more of its customers can expect better service from CDN-accelerated Web sites. Indeed, ISP2's LDNSs have more than 99.9% of their clients within a 100 air-mile radius, with the average ranging between 20 and 43 air-miles.

Turning to the top 10 clients, they represent eight different firms, and the top four appear to be firewalls or proxies (as they include "firewall", "WebDefense", and "proxy" in their reverse DNS names). Figure 17 shows the extent to which these clients share their AS with their LDNSs. The bars for each cluster represent (from left to right) the number of LDNSs sharing the AS with the client, the number of LDNSs in other ASs, and, for the latter LDNSs, the number of different ASs they represent.

Reflecting the more diverse ownership of these clients, their AS sharing behavior is also more diverse than that of the top LDNSs. Still, 9 out of 10 of these clients had the majority of their LDNSs outside their AS, including one client with all its LDNSs outside its AS. This was true even for clients representing major ISP proxies (clients ranked 6 and 4) or otherwise belonging to major ISPs (clients 10 and 7). At the same time, for 7 out of 10 clients, most of these external LDNSs were concentrated in just one (or a small number of) other ASs. Again, we verified that these external ASs do in fact belong to different organizations. From this, we speculate that at least the clients with external LDNSs concentrated in 1-2 external ASs subscribe to an external DNS service, which employs a large farm of LDNS servers that are not NAT'ed.


Figure 18: CDF of air-miles between top-10 LDNSs and their clients and between top-10 clients and their LDNSs (per unique pair and per request).

Finally, Figure 18 examines the geographical distance of the top-10 clients from their LDNS servers. Two of the curves in the figure give a CDF of these distances for every unique association of these clients and their LDNSs, as well as for each request coming from these clients. For comparison, the figure provides similar CDFs for the top-10 LDNSs considered earlier. We see that the top clients are often situated much farther away from their LDNSs than the clients of the top LDNSs; in particular, only around 12% of the top-10 clients are within 100 miles of their LDNSs, as compared to roughly 65% of the top-10 LDNSs being within 100 miles of their clients. Another observation is that top clients are either co-located with their LDNSs or use LDNSs that are hundreds of miles away. Furthermore, the requests arriving from the top-10 clients that employed co-located LDNSs contributed around 54% of all the activity, despite these clients constituting only 12% of all top-client/LDNS associations.

9. CLIENT SITE CONFIGURATIONS

We now discuss some noteworthy client and LDNS behaviors we observed in our experiments.

9.1 Clients OR LDNSs?!

Our first observation is a wide-spread sharing of DNS and HTTP behavior among clients. Out of 278,559 LDNS servers in our trace, 170,137 (61.08%) also show up among the HTTP clients. We refer to these LDNSs as the "Act-Like-Clients" group. A vast majority of these LDNSs - 98%, or 166,859 - have themselves among their own associated clients. We call these LDNSs the Self-Served group. The other 3,278 LDNS IPs always used different LDNSs when acting as HTTP clients.

!"#$%!"&'"()

*+",*+")

-./012)

034)

!"#$%!"&'"()

-5#6"+7,89:;!<)

=01=1)

-14)

!"#$%!"&'"()

,85#6"+7<)

-23=1)

14)

>"<7)?$)9:;!<)

-.3=,,)

0@4)

AB7%96C"%

5#6"+7<%;?7%

!"#$%!"&'"()

0,23)

-4)

Figure 19: Distribution of LDNSs

Within the Self-Served group, we found that 149,013 of these LDNSs, or 53% of all the LDNSs in our dataset, had themselves as their only client during our experiment ("Self-Served-One-Client"), while the remaining 17,846 LDNSs had other clients as well ("Self-Served-2+Clients"). Moreover, 105,367 of the Self-Served-One-Client client/LDNS IPs never used any other LDNS; we call them the "Self-Served-One2One" group. This leaves 43,646 LDNSs that had themselves as their only client but, in their client role, also utilized other LDNSs; we call this group "Self-Served-1Client2+LDNSs". Figure 19 summarizes the distribution of these types of LDNSs.

While a likely explanation for the Act-Like-Clients Not-Self-Served group is the reuse of dynamic IP addresses (so that the same IP address is assigned to an HTTP client at some point and to an LDNS host at another time), the Self-Served behavior could be caused by the sharing of a common middle-box between the LDNS and its clients. In particular, the following two cases are plausible.

• Both clients and their LDNS are behind a NAT or firewall, which exposes a common IP address to the public Internet. A particular case of this configuration is when a home network configures its wireless router to behave as an LDNS. Such a configuration is easily enabled on popular wireless routers (e.g., Linksys), although these routers often resolve their DNS queries through ISP LDNS servers [3].

• Clients are behind a proxy that acts as both an HTTP proxy/cache and its own LDNS resolver.

We find support for the above explanation using an approach similar to [16]. We utilized the User-Agent headers to identify hosts sharing the same middle-box, based on operating system and browser footprints. We consider an IP a possible middle-box if it shows two or more operating systems or operating system versions, or three or more different browsers or browser versions.
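The flagging rule can be sketched as follows (our own illustration; it assumes User-Agent strings have already been parsed into OS and browser name/version tuples by some parser, which is not shown here).

def flag_possible_middleboxes(ua_records):
    """ua_records: iterable of (client_ip, os_name, os_version, browser, browser_version)
    tuples derived from HTTP User-Agent headers.
    Returns the set of client IPs flagged as possible middle-boxes."""
    os_seen = {}        # client IP -> set of (OS, OS version) observed
    browsers_seen = {}  # client IP -> set of (browser, browser version) observed
    for ip, os_name, os_ver, browser, browser_ver in ua_records:
        os_seen.setdefault(ip, set()).add((os_name, os_ver))
        browsers_seen.setdefault(ip, set()).add((browser, browser_ver))
    return {
        ip
        for ip in os_seen
        if len(os_seen[ip]) >= 2 or len(browsers_seen[ip]) >= 3
    }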


Figure 20: Cluster size distribution of LDNS groups (Self-Served, Act-Like-Clients Not-Self-Served, Not-Act-Like-Clients, and all LDNSs).

Figure 21: The number of sub1 requests issued by LDNSs of different types (Self-Served, Act-Like-Clients Not-Self-Served, Not-Act-Like-Clients, and all LDNSs).

Out of the total of 11.7M clients, we flagged only 686,651 (5.87%) clients who fall into the above category.[4] However, 51,864 of them were from the Self-Served LDNS group, out of the total of 166K such LDNSs. Thus, multi-host behavior is much more prevalent among self-serving LDNSs than in the general client population, even though our technique misses single-host NAT'ed networks (which constitute a majority of NAT networks according to [4], although not according to [16]) and NATs whose hosts all have the same platform.

We would like to understand whether these configurations deviate from "regular" LDNS cluster behavior, in which case they might need special treatment in DNS-based network control. For example, a proxy acting as its own LDNS might show up as a small single-client cluster yet impose an incommensurately high load on a CDN edge server as a result of a single act of server selection by the CDN.

Figure 20 compares the cluster sizes of self-served and other LDNSs.

[4] Compared to previous studies of NAT usage, notably [16] and [4], this finding is more in line with the latter. Note that our vantage point - from the perspective of a Web site - is also closer to [4] than to [16].

Figure 22: Number of sub1 requests issued by One2One LDNSs (all One2One, One2One Self-Served, One2One Not-Self-Served, and all LDNSs).

It shows that self-served LDNS clusters are much smaller than other clusters; in fact, they overwhelmingly contain only one client IP. This finding is in conformance with the behavior expected from a middle-box fronted network. A more revealing finding is displayed in Figure 21, which compares the activity (in terms of sub1 requests) of the self-served LDNSs with the other groups. The figure shows that the self-served LDNS clusters exhibit lower activity levels than the not-act-like-clients clusters. Thus, while middleboxes aggregate demand from several hosts behind a single IP address, these middleboxes seem to predominantly front small networks - smaller than other LDNS clusters.

To confirm the presence of demand aggregation in self-served LDNS clusters, Figure 22 factors out the difference in cluster sizes and compares the activity of self-served and non-self-served LDNSs only for One2One clusters in both cases. There were 105,367 LDNSs/clients in the One2One Self-Served group and 27,640 in the One2One Not-Self-Served group. Figure 22 shows that the One2One self-served LDNSs are in general indeed more active than the Not-Self-Served LDNSs. For instance, 66% of non-Self-Served LDNSs issued a single request vs. only 46% of the self-served ones. This increased activity of self-served LDNSs is consistent with moderate aggregation of hosts behind a middle-box.

In summary, we found a large number of LDNSs operating from within middle-box fronted networks - they are either behind the middleboxes or operated by the middleboxes themselves. However, while these LDNSs exhibit distinct demand aggregation, their clusters are, if anything, less active than other clusters. Thus, a middle-box fronted LDNS in itself does not seem to be an indication for separate treatment in network control environments.

9.2 LDNS Pools

We now consider another interesting behavior.


Figure 23: LDNS Pool (an end user queries a client-facing LDNS, which load-balances the sub1/CNAME and sub2 queries across ADNS-facing LDNSs in the pool; the ADNS-facing LDNSs query the ADNS).

As a reminder, our sub* DNS interactions start with a sub1 request issued by the LDNS to our setup, to which we reply with a sub2 CNAME, forcing the LDNS to send another query, this time for sub2. However, we observed occurrences in our traces where these two consecutive queries (which we can attribute to the same interaction because both embed the same client IP address) came from different LDNS servers. In other words, even though we sent our CNAME response to one LDNS, we got the subsequent sub2 query from a different LDNS. Note that this phenomenon is distinct from the resolver clusters mentioned in [14] and measured in our companion paper [3] and, from a different vantage point, in Section 4. Indeed, those other sets of resolvers occur when clients (or ISPs on their behalf) load-balance their original DNS queries among multiple resolvers - those measurements do not consider which resolvers might handle CNAME redirections. In contrast, in the behavior discussed here, CNAME redirections arrive from different IP addresses.

Such behavior could be caused by an LDNS server with multiple Ethernet ports (in which case the server might select different ports for different queries), or by a load-balancing LDNS server farm with shared state. An example of such a configuration, hinted at by Google in [11], is shown in Figure 23, where two distinct layers of LDNS clusters face, respectively, clients and ADNSs, and the ADNS-facing LDNSs are not recursive. Here, client-facing servers load-balance their queries among ADNS-facing servers based on a hash of the queried hostname; ADNS-facing servers send CNAME responses back to the client-facing server, which forwards the subsequent query to a different ADNS-facing server due to the different hostname. In this paper we will call such behavior - for lack of a better term - the multiport behavior, and LDNS IP addresses showing up together within the same interactions LDNS pools, to indicate that they belong to the same multiport host or server farm.

In an attempt to remove fringe scenarios involving rare timeout combinations, we only considered LDNSs L1 and L2 to be part of a pool if (1) the sub1 request for a given client came from L1 while the sub2 request for the same client came from L2; (2) the sub2 request from L2 came within one second of the sub1 request from L1; and (3) the sub1 request type matched the sub2 request type, i.e., both are A or both are AAAA requests.
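These three conditions can be checked with a pass over the trace, roughly as in the sketch below (ours; the record layout is an assumption).

def find_multiport_pairs(dns_log, max_gap=1.0):
    """dns_log: time-sorted (timestamp, ldns_ip, client_ip, stage, qtype) records,
    where stage is "sub1" or "sub2". Returns the set of (L1, L2) LDNS IP pairs
    satisfying conditions (1)-(3) above."""
    pairs = set()
    last_sub1 = {}   # (client IP, query type) -> (timestamp, LDNS IP) of latest sub1
    for ts, ldns_ip, client_ip, stage, qtype in dns_log:
        key = (client_ip, qtype)
        if stage == "sub1":
            last_sub1[key] = (ts, ldns_ip)
        elif stage == "sub2" and key in last_sub1:
            t1, l1 = last_sub1[key]
            # Different LDNS, within one second, same query type.
            if l1 != ldns_ip and ts - t1 <= max_gap:
                pairs.add((l1, ldns_ip))
    return pairs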

Multiport behavior was not rare in our dataset. We found 5,105,467 cases of such behavior, representing 407,303 unique LDNS multiport IP pairs and involving 36,485 unique LDNS IP addresses, or 13% of all LDNSs in our trace. Furthermore, 1,924,359 clients (17% of all clients) were found to be involved in LDNS multiport behavior (i.e., observed to have sub1 and sub2 requests within the same interaction coming from different LDNS IPs), and over 10M clients - 90% of all the clients in our trace - were associated at some point with an LDNS belonging to a pool. Overall, the 13% of LDNSs with multiport behavior were the busiest - they were responsible for over 90% of both sub* queries and subsequent HTTP requests.

Such a significant occurrence of multiport behavior warrants a closer look at this phenomenon, as it may have important implications for DNS-based network control. Indeed, if LDNS pools always pick a random LDNS server to forward a given query, the entire pool and all clients associated with any of its member LDNSs should be treated as a single LDNS cluster. If, however, LDNS pools attempt to preserve client affinity when selecting LDNS servers (i.e., if the same client tends to be assigned the same LDNS for a long time), then individual LDNSs in the pool and the clients associated with these individual LDNSs could be treated as separate LDNS clusters.

In the remainder of this subsection, we group LDNSs into pools as follows. Any pair of LDNSs observed in multiport behavior is obviously considered an LDNS pool. Then, recursively, if the same LDNS is found in two pools, these two pools are grouped into one pool, and so on until no further grouping is possible. This process produced 5,547 pools. Figure 24 shows the CDF of pool sizes in terms of the LDNS count. Most LDNS pools are small - 90% of the pools have at most 3 LDNSs. This is understandable because multiport servers typically have a small number - 2 to 4 - of Ethernet ports, and a small farm of LDNS servers should be sufficient for all but the largest client sites. However, we did find a number of pools with up to 69 LDNSs and a standout pool with 22,514 LDNSs.
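The transitive grouping just described is, in effect, a union-find over LDNS IPs; a short sketch (ours) follows, building on the multiport pairs from the previous sketch.

def group_into_pools(multiport_pairs):
    """multiport_pairs: iterable of (L1, L2) LDNS IP pairs (e.g., from
    find_multiport_pairs above). Returns a list of pools, each a set of LDNS IPs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for l1, l2 in multiport_pairs:
        union(l1, l2)

    pools = {}
    for ldns in parent:
        pools.setdefault(find(ldns), set()).add(ldns)
    return list(pools.values())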

We considered the outlier pool more closely and found that the top 100 IPs (in terms of the number of other IPs they pair up with in sub1/sub2 resolutions) all belonged to the same tier-1 ISP, with the single top IP pairing up with 745 other IPs.


Figure 24: The sizes of LDNS pools. (CDF of the number of LDNSs per pool; x-axis on a log scale from 1 to 100,000.)

We considered the outlier pool more closely and found that the top 100 IPs (in terms of the number of other IPs they pair up with in sub1/sub2 resolutions) all belonged to the same tier-1 ISP, with just the single top IP pairing up with 745 other IPs. We also verified this ISP's (Level 3) multiport behavior directly by sending an original query to the anycast IP address of its public DNS resolver (4.2.2.2) and observing the original request and the follow-up request for the CNAME arrive at our authoritative DNS from different unicast addresses. We did find some IP addresses in this pool that did not belong to that ISP, suggesting the possibility that our heuristics mentioned above may not have filtered out all aberrational scenarios, which could cause spurious merging of several pools. Still, this does not undermine our general finding that these pools are a common occurrence in today's Internet. A focused future study with longer chains of CNAME redirections is needed to better understand this phenomenon.

As an initial step towards analyzing multiport behavior, Figure 25 shows the prevalence of multiport behavior in LDNSs belonging to pools. Specifically, this figure depicts CDFs of the fraction of sub1/sub2 request pairs that show multiport behavior within each LDNS (i.e., one but not both requests in the pair is from the given LDNS) over all sub1/sub2 pairs with at least one of the requests coming from this LDNS. To simplify our reasoning about multiport behavior, we report our results separately for pools of different sizes. Theoretically, a pool of size n with perfectly random load balancing would have a (1 - 1/n) fraction of request pairs showing multiport behavior; e.g., for a pool with two LDNSs, 50% of all pairs would come from the same LDNS simply due to the random choice. Figure 25 confirms this intuition: larger pools exhibit greater multiport behavior prevalence. For instance, around 15% of LDNSs in pools of size up to 10 had a multiport behavior prevalence of over 60%, compared to over 35% for pools of size 11-100 and almost 60% for the top pool. In fact, 35% of all LDNSs in the top pool always exhibited multiport behavior, for all their requests.
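For reference, the (1 - 1/n) figure follows from a one-line calculation, under the assumption that each request in a pair is forwarded by an LDNS chosen independently and uniformly at random from the n pool members:

\Pr[\text{same LDNS}] = \sum_{i=1}^{n} \left(\tfrac{1}{n}\right)^2 = \tfrac{1}{n},
\qquad
\Pr[\text{multiport pair}] = 1 - \tfrac{1}{n}.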

Figure 25: Affinity in LDNSs. (CDFs of the percentage of multiport request pairs among all sub* request pairs per LDNS, plotted separately for pools of size 2-5, 6-10, 11-100, and the top pool of roughly 22K LDNSs.)

These results indicate that LDNSs do not exhibit strong client affinity within pools, and thus keying load-balancing decisions on individual LDNS IP addresses may not always be appropriate. Further research is needed to better understand the implications of this phenomenon for DNS-based network control.

10. IPV6 BEHAVIOR

As we mentioned earlier, we have configured our DNS server to respond to IPv6 (AAAA) sub1 requests the same way it responds to IPv4 (A) requests. For AAAA sub2 requests, however, our server responds with a special unreachable IPv6 address. This setup allows us to detect and estimate any excessive delays caused by deploying IPv6 Web sites in the current Internet due to the lack of an end-to-end IPv6 path or tunnel. Indeed, attempting to communicate with our non-existent address has the same effect as an attempt to communicate with an IPv6 destination over a path that is not end-to-end IPv6-enabled.

Plausible scenarios for IPv6-enabled clients can be grouped into two categories. In the first scenario, the client follows the recent recommendation from [24] to avoid any delay in attempting to use an unreachable IPv6 web server: the client establishes both the IPv4 and IPv6 HTTP connections at the same time using both addresses, and if the IPv6 connection advances through the TCP handshake, the IPv4 connection is abandoned with an RST segment. In the other scenario, the client attempts to use the IPv6 address first and, only after a failure to connect, issues the IPv4 request, which leads to a delay penalty. The macro-effects of dual IPv4/IPv6 web site deployment are the result of complex interactions between browsers, operating systems, and LDNS behaviors, which differ widely, leading to drastically different delay penalties (see [8] for an excellent survey of different browser and OS behaviors).

Out of the total of 278,559 LDNSs we observed during our experiment, almost 22% were IPv6-enabled (i.e., sent some AAAA queries). However, the fraction of IPv6 sub* requests was much lower.


Table 3: The basic IPv6 statistics

               Sub1        Sub2
# Requests     1,752,425   2,398,367
LDNSs          22,334      32,291
Clients        872,981     1,134,617

Figure 26: Time difference between A and AAAA sub2 requests. (CDF; x-axis: seconds between the AAAA and A sub2 requests, log scale from 0.001 to 10.)

We attribute this to the fact that many LDNS servers seem to cache the NXDOMAIN response which our DNS server returns to the IPv6 queries for the base domain (see footnote 5 below) and not issue queries for subdomains of the base domain. Table 3 summarizes the general statistics about IPv6 requests, as well as the clients and LDNSs behind them.

We should stress that we can only observe AAAA queries from LDNSs and subsequent IPv4 HTTP requests from their clients. In theory, an LDNS server could issue an AAAA query even if the client is not IPv6-enabled. We cannot detect such behavior, and thus our results relate to LDNS behavior and the macro-effects of deploying an IPv6-enabled web server. With this understanding, unless it may cause confusion, we will refer to both an IPv6-enabled LDNS and all its clients as dual-stack.

Our first experiment investigates any potential delays in obtaining the IPv4 DNS resolution given that our IPv6 web server is unreachable. To this end, we calculated the time difference between the A and AAAA sub2 requests from the same client. Our immediate observation is that almost 88% of the 2.3 million AAAA sub2 requests were received after their corresponding A request. This says that these clients/LDNSs not only perform both resolutions in parallel but, of the two back-to-back queries, most likely send the A query first. For the remaining 12% of requests, Figure 26 shows the CDF of the time difference between the A and AAAA sub2 requests.
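The comparison just described can be sketched as follows. The per-client data layout is hypothetical, chosen only to illustrate the pairing of A and AAAA sub2 timestamps from the DNS logs.

def aaaa_minus_a_gaps(sub2_by_client):
    """For each client, pair every AAAA sub2 request with the nearest
    A sub2 request and return the time differences (positive when the
    AAAA request arrived after the A request).
    sub2_by_client: dict mapping client IP -> list of (timestamp, qtype)."""
    gaps = []
    for client, requests in sub2_by_client.items():
        a_times = sorted(ts for ts, qtype in requests if qtype == "A")
        if not a_times:
            continue
        for ts, qtype in requests:
            if qtype == "AAAA":
                nearest_a = min(a_times, key=lambda t: abs(ts - t))
                gaps.append(ts - nearest_a)
    return gaps

# Positive gaps correspond to the AAAA sub2 arriving after the A sub2 (the
# common case); the distribution of the remaining cases, where the A request
# arrived after the AAAA request, is what Figure 26 plots.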

Footnote 5: Recall that, to avoid imposing delays on clients before we could associate clients with their LDNS, we configured our DNS server to respond to AAAA requests for the base domain with NXDOMAIN.

Figure 27: Comparison of all IPv6 and IPv4 delays. (CDFs of the seconds between the DNS sub2 request and the HTTP request, for all IPv4 and all IPv6 delay instances; x-axis on a log scale.)

The figure indicates that even among these requests, most clients did not wait for a failed IPv6 connection before obtaining the IPv4 address. Indeed, even assuming the accelerated default connection timeout used in this case by Safari and Chrome (270ms and 300ms respectively [8], as opposed to hundreds of seconds for the regular TCP timeout [2]), roughly 70% of the type-A queries in these requests came within this timeout value. We conclude that a vast majority (roughly 95%) of requests do not incur an extra DNS resolution penalty due to IPv6 deployment.

To assess the upper bound on the overall delay for IPv6-enabled clients, we measure the time between the AAAA sub2 DNS request (a conservative estimate of when the client receives the unreachable IPv6 address) and the actual HTTP request by the client. To eliminate HTTP requests that utilized previously cached DNS resolutions (as their time since the preceding DNS interaction would obviously not indicate the delay penalty), we count an IPv6 delay instance for an HTTP request only if it was immediately preceded (i.e., without another interposed HTTP request) by a full DNS interaction, including both A and AAAA sub2 requests and at least one sub1 request (either A or AAAA) for that client. (We did not require both A and AAAA sub1 requests because either returns the same CNAME and allows the whole interaction to proceed even if the other request is lost.)
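A minimal sketch of this filter is shown below. The event encoding (a per-client, time-ordered list of DNS and HTTP events) is an assumption made for illustration; the logic mirrors the rule above in that a delay instance is recorded only when the HTTP request is immediately preceded by a full DNS interaction.

def ipv6_delay_instances(events_by_client):
    """Emit the delay between the AAAA sub2 request and the next HTTP
    request, but only when that HTTP request is immediately preceded by a
    full DNS interaction: both A and AAAA sub2 plus at least one sub1.
    events_by_client: client IP -> list of (timestamp, kind), where kind is
    'sub1_A', 'sub1_AAAA', 'sub2_A', 'sub2_AAAA', or 'http'."""
    delays = []
    for client, events in events_by_client.items():
        state = {"sub1": False, "sub2_A": False, "sub2_AAAA": None}
        for ts, kind in sorted(events):
            if kind in ("sub1_A", "sub1_AAAA"):
                state["sub1"] = True
            elif kind == "sub2_A":
                state["sub2_A"] = True
            elif kind == "sub2_AAAA":
                state["sub2_AAAA"] = ts
            elif kind == "http":
                if state["sub1"] and state["sub2_A"] and state["sub2_AAAA"] is not None:
                    delays.append(ts - state["sub2_AAAA"])
                # Any HTTP request resets the state: the next HTTP request must
                # again be preceded by a full DNS interaction to be counted.
                state = {"sub1": False, "sub2_A": False, "sub2_AAAA": None}
    return delays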

Applying the above filter resulted in 1,949,231 instances of IPv6 delays from 1,086,323 unique client IPs. Our HTTP logs provide timestamps with a granularity of one second, thus we can only report our delays at this granularity. Figures 27 and 28 compare the delays incurred by IPv6-enabled and IPv4-only clients. Figure 27 shows CDFs of all delays across all clients in the respective categories (i.e., multiple delay instances from the same client are counted multiple times), and Figure 28 shows the CDFs of the average and maximum delays observed per client.


Figure 28: IPv4 and IPv6 delays per client. (CDFs of the average and maximum per-client seconds between the DNS sub2 request and the HTTP request, for IPv4 and IPv6; x-axis on a log scale.)

Neither figure shows significant differences in delay between the two categories of clients. In fact, the only metric showing any penalty is the maximum per-client delay. It appears that, in practice, the delay penalty for unilaterally deploying a dual-stack IPv6-enabled Web site is slight.

11. RELATED WORK

This paper explores the client-side DNS infrastructure. Among previous client-side DNS studies, Liston et al. [15] and Ager et al. [1] measured LDNSs by resolving a large number of hostnames from a limited set of client vantage points (60 in one case and 75 in the other), Pang et al. [20] used access logs from Akamai as well as active probes, and [7, 3] based their studies on large-scale scanning for open resolvers. Our goal was a broad characterization of clients' LDNS clusters from the perspective of a busy Web site. In particular, our companion paper [3] found that most DNS resolvers on consumer devices it discovered utilize an ISP LDNS for their DNS queries. The two studies are complementary in that [3] considers actively scanned devices while this paper considers LDNSs actually performing queries for a busy web site.

Both Ager et al. [1] and Huang et al. [13] compared the performance implications of using public DNS resolvers such as Google DNS versus ISP-deployed resolvers and found the former to be at significantly greater distances from clients. Further, Huang et al. considered the geographical distance distribution between clients and their LDNSs (Fig. 5 in [13]). Our study found these distances to be significantly greater: while they observed 80% of clients to be within 428 km of their LDNS resolvers, over 50% of our clients were over 500 miles (806 km) away from their resolvers (cf. Fig. 6). Network proximity of clients to their LDNS was considered in [23] and [17]. Our measurement technique is an extension of [17], which we augmented to allow measurement of LDNS pools and IPv6 behavior.

The extent of IPv6 penetration among DNS resolvers was previously considered in the Netalyzr study [14], which also observed that some resolvers operate as clusters. Netalyzr obtained its results by performing active DNS measurements from self-selected clients, while our study passively observed actual accesses to a busy Web site.

As an alternative to the Faster Internet initiative mentioned earlier [9], Huang et al. [12] recently proposed a different method to inform Web sites about the clients behind the LDNS. This proposal does not require changes to DNS and instead modifies client applications, which are presumably more amenable to change.

12. CONCLUSION

This paper investigates clusters of hosts sharing the same local DNS server ("LDNS clusters"). Our study is done from the vantage point of a busy consumer-oriented web site and is based on a large-scale measurement over a 28-day period, during which our web page was accessed around 56 million times by 11 million client IPs.

We found that, of the two fundamental issues in DNS-based network control, hidden load and client-LDNS distance, hidden load plays an appreciable role only for a small number of "elephant" LDNS servers, while the client-LDNS distance is significant in many cases. Further, LDNS clusters vary widely in both characteristics, and the largest clusters are actually more compact than others. Thus, a request routing system such as a content delivery network can attempt to balance load by reassigning non-compact LDNSs first, as their clients benefit less from proximity-sensitive routing anyway. We also report on several other important aspects of LDNS setups and in particular observed a wide use of what we called "LDNS pools" which, unbeknown to end-hosts, appear to load-balance DNS resolution tasks. Finally, we found scant evidence of any performance penalty for unilateral IPv6 adoption by a Web site; we hope this finding will help sites as they consider their IPv6 migration strategy.

13. REFERENCES

[1] Bernhard Ager, Wolfgang Muhlbauer, Georgios Smaragdakis, and Steve Uhlig. Comparing DNS resolvers in the wild. In Proceedings of the 10th Annual Conference on Internet Measurement, IMC '10, pages 15–21, 2010.
[2] Z. Al-Qudah, M. Rabinovich, and M. Allman. Web timeouts and their implications. In Passive and Active Measurement, pages 211–221. Springer, 2010.
[3] T. Callahan, M. Rabinovich, and M. Allman. Submitted for publication, 2012.
[4] M. Casado and M.J. Freedman. Peering through the shroud: The effect of edge opacity on IP-based client identification. In Proceedings of the 4th USENIX Conference on Networked Systems Design & Implementation, pages 13–13. USENIX Association, 2007.
[5] Michele Colajanni and Philip S. Yu. A performance study of robust load sharing strategies for distributed heterogeneous web server systems. IEEE Trans. Knowl. Data Eng., 14(2):398–414, 2002.
[6] Michele Colajanni, Philip S. Yu, and Valeria Cardellini. Dynamic load balancing in geographically distributed heterogeneous web servers. In ICDCS, pages 295–302, 1998.
[7] D. Dagon, N. Provos, C.P. Lee, and W. Lee. Corrupted DNS resolution paths: The rise of a malicious resolution authority. In Proceedings of the Network and Distributed System Security Symposium (NDSS), 2008.
[8] Dual stack esotropia. http://labs.apnic.net/blabs/?p=47.
[9] The global internet speedup. http://www.afasterinternet.com/.
[10] Google over IPv6. http://www.google.com/intl/en/ipv6/.
[11] Google Public DNS: Performance benefits. https://developers.google.com/speed/public-dns/docs/performance.
[12] Cheng Huang, Ivan Batanov, and Jin Li. A practical solution to the client-LDNS mismatch problem. SIGCOMM Comput. Commun. Rev., 42(2):35–41, April 2012.
[13] Cheng Huang, D.A. Maltz, Jin Li, and A. Greenberg. Public DNS system and global traffic management. In INFOCOM, 2011 Proceedings IEEE, pages 2615–2623, 2011.
[14] Christian Kreibich, Nicholas Weaver, Boris Nechaev, and Vern Paxson. Netalyzr: illuminating the edge network. In Proceedings of the 10th Annual Conference on Internet Measurement, IMC '10, pages 246–259, 2010.
[15] R. Liston, S. Srinivasan, and E. Zegura. Diversity in DNS performance measures. In Proceedings of the 2nd ACM SIGCOMM Workshop on Internet Measurement, pages 19–31, 2002.
[16] G. Maier, F. Schneider, and A. Feldmann. NAT usage in residential broadband networks. In 12th Passive and Active Measurement Conf., pages 32–41, 2011.
[17] Zhuoqing Morley Mao, Charles D. Cranor, Fred Douglis, Michael Rabinovich, Oliver Spatscheck, and Jia Wang. A precise and efficient evaluation of the proximity between web clients and their local DNS servers. In USENIX Annual Technical Conference, General Track, pages 229–242, 2002.
[18] Maxmind GeoIP city database. http://www.maxmind.com/app/city.
[19] J. Pang, A. Akella, A. Shaikh, B. Krishnamurthy, and S. Seshan. On the responsiveness of DNS-based network control. In Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement, pages 21–26, 2004.
[20] Jeffrey Pang, James Hendricks, Aditya Akella, Roberto De Prisco, Bruce Maggs, and Srinivasan Seshan. Availability, usage, and deployment characteristics of the domain name system. In Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement, IMC '04, pages 1–14, 2004.
[21] I. Poese, B. Frank, B. Ager, G. Smaragdakis, and A. Feldmann. Improving content delivery using provider-aided distance information. In The 10th ACM Internet Measurement Conf., pages 22–34, 2010.
[22] M. Rabinovich and O. Spatscheck. Web Caching and Replication. Addison-Wesley, 2001.
[23] Anees Shaikh, Renu Tewari, and Mukesh Agrawal. On the effectiveness of DNS-based server selection. In INFOCOM, pages 1801–1810, 2001.
[24] D. Wing and A. Yourtchenko. Happy eyeballs: Success with dual-stack hosts. IETF draft. http://tools.ietf.org/html/draft-ietf-v6ops-happy-eyeballs-05, October 2011.
