Operating a DNS-based Active Internet Observatory
Jens Hiller, Oliver Hohlfeld, Jan Rüth, Torsten Zimmermann
www.netray.io
The NetRay Internet Observatory
Motivation
● Study Internet evolution
  ● The Internet is an entirely man-made system, yet not fully understood
  ● Optimizing the Internet requires understanding its properties
● Longitudinal and multi-protocol studies are rare
  ● Often only a single protocol is measured for a short duration
● Domain name probes provide a new perspective
  ● IPv4 space is probed regularly → IP-only scans do not account for virtualization (SNI)
Goal
● Regular, multi-protocol probes of the IPv4 space & >50% of the domain name space
  ● Regular probes: daily or weekly
  ● Probe more than one protocol
  ● Probe a large portion of the domain name space
Architecture
● Target lists
  ● ZMap-based IPv4 address space scan
  ● DNS zone files for multiple TLDs (e.g., .com, .net, .org)
  ● Complete zone files for a few TLDs
  ● Passive DNS feed + CT logs to reconstruct other TLDs
● DNS resolution
  ● Perform DNS resolution for every domain for multiple RRs
  ● DNS resolution by a cluster of machines
  ● Output annotated: e.g., CDN, ASN, Cloud, …
  ● Output written to RabbitMQ message bus
● Protocol probing
  ● Protocol worker for every protocol
  ● Can run on multiple servers
  ● Subscribe to workload via RabbitMQ message bus
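The hand-off between DNS resolution and the protocol workers can be illustrated with a minimal sketch. It uses an in-memory stand-in for the message bus, not a RabbitMQ client, and every name in it (`InMemoryBus`, `annotate`, `drain`, the queue names, the record-type labels as routing keys) is a hypothetical choice made for illustration rather than the observatory's actual implementation.

```python
from collections import defaultdict, deque


class InMemoryBus:
    """Hypothetical stand-in for the message bus (not a RabbitMQ client)."""

    def __init__(self):
        self.queues = defaultdict(deque)   # queue name -> pending messages
        self.bindings = defaultdict(list)  # record type -> subscribed queues

    def subscribe(self, queue_name, record_type):
        # A protocol worker declares which resolution results it wants.
        self.bindings[record_type].append(queue_name)

    def publish(self, record_type, message):
        # Every queue bound to the record type receives its own copy.
        for queue_name in self.bindings[record_type]:
            self.queues[queue_name].append(message)


def annotate(domain, address):
    # Placeholder annotation; a real resolver would fill in ASN/CDN/cloud labels.
    return {"domain": domain, "address": address, "asn": None, "cdn": None}


def drain(bus, queue_name, probe):
    # Consume all pending messages of one worker queue and probe each target.
    results = []
    while bus.queues[queue_name]:
        results.append(probe(bus.queues[queue_name].popleft()))
    return results


bus = InMemoryBus()
bus.subscribe("http2", "A-www")  # HTTP2 worker consumes www. A records
bus.subscribe("quic", "A-www")   # QUIC worker consumes the same records
bus.subscribe("mx", "MX")        # a worker for mail exchanger records

bus.publish("A-www", annotate("www.example.org", "192.0.2.10"))
bus.publish("MX", annotate("mail.example.org", "192.0.2.25"))

http2_results = drain(bus, "http2", lambda m: ("http2", m["domain"]))
quic_results = drain(bus, "quic", lambda m: ("quic", m["domain"]))
mx_results = drain(bus, "mx", lambda m: ("mx", m["domain"]))
```

Because each worker reads from its own queue, a new protocol can be added by subscribing another queue without touching the resolution stage.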
[Figure: architecture over time. Input target lists (zone files, DNS crawler, passive DNS, CT logs, zmap, …) yield domain names and the entire IPv4 space (1); DNS resolution (A/AAAA, NS/MX, …) writes results to the Rabbit MQ message bus (2); protocol probers (HTTP2 on A-www, QUIC on A-www + IPs, MX, Protocol X) consume the bus and write results (3). A blacklist and classifiers are also shown.]
Example Studies
Data Sets
Domain Name System
● Complete zone files (daily) for
  ● .com, .net, .org, .fi, .se, .nu, .gov, fed.us, .name, + >1000 new gTLDs (e.g., .london)
● Incomplete zone files for
  ● 80 ccTLDs (e.g., .de)
  ● Source: passive DNS & CT logs
● Probed RRs: ANY, SOA, CAA, LOC
  ● A & AAAA: (www.) domain.tld
  ● A & AAAA for every NS/MX
HTTP/2
● Probe selected TLDs for HTTP2
  ● Full connection establishment
  ● Regular scans: daily/weekly
● HTTP2 Server Push adoption
  ● Monitor which sites push content on their landing page
QUIC
● Probe all TLDs / IPv4 for QUIC support
  ● Perform connection establishment
  ● Google & IETF QUIC
  ● Regular scans: daily/weekly
● Server fingerprinting etc.
TCP Initial Window
● Assessment of global TCP Initial Window distribution
  ● Probe 1% random subsample of the IPv4 space & Alexa Top 1M domains
  ● Few full scans available
  ● 1% random sample sufficiently approximates overall distribution → reduced scan footprint
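The poster does not state how the 1% subsample is drawn. One possible approach, sketched below as an assumption, is a keyed hash over each address: an address belongs to the sample if its hash falls into the lowest 1% of values, so membership is reproducible across scans without storing the sample. The function name `in_sample`, the seed, and the test network are hypothetical.

```python
import hashlib
import ipaddress


def in_sample(address, seed=b"iw-scan", percent=1):
    """Return True if `address` falls into the pseudo-random subsample.

    Assumption: sampling by keyed hash; the actual scans may sample differently.
    """
    packed = ipaddress.IPv4Address(address).packed
    digest = hashlib.blake2b(packed, key=seed, digest_size=8).digest()
    # Map the 64-bit digest onto 0..99 and keep the lowest `percent` buckets.
    return int.from_bytes(digest, "big") % 100 < percent


# Check the sampling rate on a benchmarking-reserved /16 (65,536 addresses).
network = ipaddress.ip_network("198.18.0.0/16")
selected = [addr for addr in network if in_sample(addr)]
share = len(selected) / network.num_addresses
print(f"sampled {len(selected)} of {network.num_addresses} addresses ({share:.2%})")
```

Changing the seed yields a different, independent 1% sample, which is one way to compare a subsample against the few available full scans.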
Web Security
● TLS connections with all TLDs
  ● Establish connection
  ● Retrieve certificates & 256 kB payload
  ● Regular scans: daily/weekly
● Certification Authority Authorization
  ● CAA goal: limit certificate mis-issuance
  ● Probe all TLDs for CAA RRs
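To make the CAA probing step concrete, here is a minimal sketch of decoding the RDATA of a single CAA resource record. The wire layout (one flags octet, one tag-length octet, the tag, then the value) follows the CAA specification; the function name `parse_caa_rdata` and the example record are hypothetical, and the sketch assumes the raw RDATA bytes have already been obtained by a DNS query.

```python
def parse_caa_rdata(rdata: bytes) -> dict:
    """Decode one CAA RDATA blob: flags (1 octet), tag length (1 octet), tag, value."""
    if len(rdata) < 2:
        raise ValueError("CAA RDATA too short")
    flags, tag_len = rdata[0], rdata[1]
    if tag_len == 0 or len(rdata) < 2 + tag_len:
        raise ValueError("invalid CAA tag length")
    tag = rdata[2:2 + tag_len].decode("ascii").lower()
    value = rdata[2 + tag_len:].decode("ascii")
    # The most significant flag bit marks the property as issuer-critical.
    return {"critical": bool(flags & 0x80), "tag": tag, "value": value}


# Hypothetical record: only ca.example.net may issue for this domain.
record = parse_caa_rdata(b"\x00\x05issue" + b"ca.example.net")
print(record)
```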
Interested in data? Contact us: [email protected]
http2.netray.io quic.netray.io iw.netray.io
Acknowledgements We would like to thank Jens Hektor and Bernd Kohler (RWTH Aachen IT Center) for enabling and supporting our work. Funded by the Excellence Initiative of the German federal and state governments and by the DFG as part of the CRC 1053 MAKI.
[Figure: HTTP2 growth [%] (0 to 400) by month, 2017-01 to 2018-06; one series per TLD: alexa, com, net, nu, org, se]
The Rise of HTTP2
The Rise of QUIC
The Rise of Crypto Miners in Web Pages
[Figure: NoCoin detection share (0.00 to 1.00) by scan date for Alexa, .com, .net, and .org; stacked by coinhive, authedmine, wp-monero, cryptoloot, cpmstar, other. Scan dates in plot order: 11.01.18, 11.03.18, 02.03.18, 11.05.18, 27.02.18, 08.05.18, 28.02.18, 09.05.18. # potential mining domains in plot order: 710, 621, 6676, 5744, 618, 553, 473, 399]
● Methodology
  ● Pattern matching on HTML payload of TLS scans
  ● Match JavaScript object names against NoCoin list
● Result
  ● Crypto mining in web pages exists
  ● Coinhive is the most prevalent framework / operator
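The pattern-matching step can be sketched as follows. The two signatures are illustrative assumptions only: they are not taken from the NoCoin list, and the helper name `classify` is hypothetical. A real scan would load its patterns from the list and run them over the retrieved HTML payloads.

```python
import re

# Hypothetical signatures keyed by operator label; NOT the actual NoCoin entries.
SIGNATURES = {
    "coinhive": re.compile(r"\bCoinHive\.(?:Anonymous|User|Token)\b"),
    "cryptoloot": re.compile(r"\bCryptoLoot\.(?:Anonymous|User)\b"),
}


def classify(html: str) -> list:
    """Return the sorted labels of all signatures found in an HTML payload."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern.search(html))


page = '<script>var miner = new CoinHive.Anonymous("SITE_KEY"); miner.start();</script>'
print(classify(page))
print(classify("<p>No miner here.</p>"))
```

Counting the labels over all scanned pages of a TLD then gives the per-operator shares.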
Summary
● New active measurement infrastructure to study Internet evolution with large-scale, DNS-based multi-protocol measurements
● Ambitious goal to cover a large domain name space with longitudinal measurements
● Web page showing current statistics and further information about our studies: netray.io
● Methodology
  ● Establish QUIC connections with all IPv4 hosts
  ● Retrieve QUIC version supported by server
● Result
  ● QUIC is on the rise, driven by Google & Akamai
  ● New protocol versions (color shades) come and go quickly
● Methodology
  ● Establish H2 connections with domains in selected TLDs
  ● Analyze Server Push usage
● Result
  ● HTTP2 adoption is on the rise
  ● Server Push adoption orders of magnitude lower (not shown)
Influence of Internet Top Lists
Cloud & CDN Adoption
[Figure: CDN share [%] (0 to 8) by month, 2017-04 to 2018-04; series: alexa, com, net, nu, org, se]
[Figure: Cloud-hosted domains [%] (0 to 40) by month, 2017-04 to 2018-04; one series per TLD: alexa, com, net, nu, org, se]
● Methodology
  ● Cloud usage: match www. A records to cloud prefixes
  ● (Full-site) CDN usage: match www. CNAME to CDN pattern
● Result
  ● CDN adoption is higher on the Alexa Top 1M than in complete TLDs
  ● Cloud usage is higher than full-site CDN hosting
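Both matching rules lend themselves to a short sketch. The prefix table and the CNAME suffixes below are placeholders built from documentation address ranges and example domains; the provider labels and the function names `cloud_provider` and `cdn_provider` are hypothetical. A real classifier would load the published prefix lists of cloud providers and a curated set of CDN name patterns.

```python
import ipaddress

# Hypothetical prefix table (documentation ranges, not real cloud prefixes).
CLOUD_PREFIXES = {
    "example-cloud": [
        ipaddress.ip_network("192.0.2.0/24"),
        ipaddress.ip_network("2001:db8::/32"),
    ],
}

# Hypothetical CNAME suffixes identifying full-site CDN hosting.
CDN_SUFFIXES = {
    "example-cdn": (".cdn.example.net",),
}


def cloud_provider(address):
    """Return the provider whose prefix contains the A/AAAA address, else None."""
    ip = ipaddress.ip_address(address)
    for provider, networks in CLOUD_PREFIXES.items():
        if any(ip.version == net.version and ip in net for net in networks):
            return provider
    return None


def cdn_provider(cname):
    """Return the CDN whose suffix matches the www. CNAME target, else None."""
    name = cname.rstrip(".").lower()
    for provider, suffixes in CDN_SUFFIXES.items():
        if name.endswith(suffixes):
            return provider
    return None


print(cloud_provider("192.0.2.55"))
print(cdn_provider("site-1.cdn.example.net."))
```

For large prefix tables a linear scan becomes slow; a radix tree or sorted interval lookup would be the usual replacement.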
[Figure: HTTP2 share [%] (0 to 60) by scan date, 2018-04-11 to 2018-05-08; series: Alexa 1M, Alexa 1k, Umbrella 1M, Umbrella 1k, Majestic 1M, Majestic 1k, c/n/o]
● Methodology
  ● Partition HTTP2 adoption measurement data by top lists
  ● Alexa, Umbrella & Majestic Top 1M
● Result
  ● Results differ by list and rank (e.g., top 1k vs top 1M)
  ● Umbrella has high number of NXDOMAIN entries
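The partitioning by list and rank amounts to computing the adoption share over the first N entries of each ranked list. The sketch below uses made-up domains and a hypothetical helper `adoption_share`; it only illustrates why a top 1k and a top 1M cut of the same list can report different shares.

```python
def adoption_share(ranked_domains, h2_domains, top_n):
    """Share [%] of the top_n ranked domains that were found to support HTTP2."""
    head = ranked_domains[:top_n]
    if not head:
        return 0.0
    return 100.0 * sum(domain in h2_domains for domain in head) / len(head)


# Toy data: a ranked list and the set of domains with a successful H2 probe.
ranked = ["a.example", "b.example", "c.example", "d.example"]
h2_enabled = {"a.example", "b.example", "d.example"}

for n in (2, 4):
    print(f"top {n}: {adoption_share(ranked, h2_enabled, n):.1f}%")
```

Entries that do not resolve (NXDOMAIN) count as non-adopters in this sketch; whether to exclude them from the denominator is a design choice that changes the reported share.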
tls.netray.io
Certification Authority Authorization (CAA)
caastudy.github.io toplists.github.io
dns.netray.io
[Figure: # hosts (0.0 to 4M) supporting QUIC by scan date, 20 Aug 2016 to 27 Jul 2018; color shades distinguish sets of supported QUIC versions (versions 35 to 44 appear in the legend) plus Other]