Towards Street-Level Client-Independent IP Geolocation
Yong Wang, UESTC/Northwestern
Daniel Burgener, Northwestern
Marcel Flores, Northwestern
Aleksandar Kuzmanovic, Northwestern
Cheng Huang, Microsoft Research
http://networks.cs.northwestern.edu
Aleksandar Kuzmanovic Towards Street-Level Client-Independent IP Geolocation
Problem and Motivation
How to accurately locate IP addresses on the Internet?
Host-dependent solutions:
– GPS
– WiFi (e.g., Google My Location, Skyhook)
Host-independent solutions:
– Servers cannot always expect clients’ cooperation
• Security / access restrictions
• Online service access analytics
• Location-based online advertising
A Scenario of Street-Level Online Advertising
User’s location
Local Businesses
Prior Work
Constraint-Based Geolocation (CBG) [ToN 06]
Median error distance = 228 km
– Measure delays from active vantage points
Topology-Based Geolocation (TBG) [IMC 06]
Median error distance = 67 km
– CBG + network topology information
Octant [NSDI 07]
Median error distance = 35.2 km
– CBG + routers’ locations plus geographic and demographic information
Methodology Highlights
Our methodology is based on two insights
– Websites often provide the actual geographical location of associated entities
• E.g., universities, businesses, government offices, etc.
• Develop methods to determine if web or e-mail servers reside at the corresponding locations
– Relative network delays highly correlate with geographical distances
• Absolute network delay measurements are fundamentally limited in their ability to achieve fine-grained geolocation results
Institutional Network Example
[Figure: a router links an IP subnet (mail server, web server) to the external network; "Web cloud-sourcing" label; postal address 550 South Hill Street Suite 890, Los Angeles, CA 90013]
The Role of Relative Network Delays
[Figure: comparison of measured delays]
A Case Study
Target IP address: 38.100.25.196
Target postal address: 1850 K Street NW, Washington, DC 20006
Three-Tier Geolocation System
Tier 1
Goal: Find the coarse-grained region for the targeted IP
Measured delays
Geographical distances
Create intersection
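The Tier 1 steps (measured delays, geographical distances, intersection) might be sketched as below. This is a minimal illustration, not the authors' implementation: the delay-to-distance factor of 200 km per millisecond (about 2/3 of the speed of light) is an assumed upper bound, and the function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def max_distance_km(rtt_ms, km_per_ms=200.0):
    """Upper bound on distance implied by a round-trip time.

    km_per_ms is an assumed propagation speed (roughly 2/3 c), not a value
    taken from the slides. The one-way delay is taken as half the RTT.
    """
    return rtt_ms / 2.0 * km_per_ms


def in_intersection(lat, lon, constraints):
    """True if (lat, lon) lies inside every vantage point's distance circle.

    constraints is a list of (vantage_lat, vantage_lon, rtt_ms) tuples.
    """
    return all(
        haversine_km(lat, lon, v_lat, v_lon) <= max_distance_km(rtt_ms)
        for v_lat, v_lon, rtt_ms in constraints
    )


# Hypothetical vantage points near Chicago and New York with made-up RTTs.
example_constraints = [(41.88, -87.63, 20.0), (40.71, -74.01, 6.0)]
print(in_intersection(38.90, -77.04, example_constraints))
```

Each vantage point contributes one circle, and a candidate location survives only if it falls inside all of them, so adding vantage points can only shrink the region.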
Three-Tier Geolocation System
Tier 2
Goal: Use passive landmarks to determine a finer-grained region for the targeted IP
Estimate the delay between landmarks and the target
D1 + D2 < D3 + D4
Create a new intersection
Populate the intersection with landmarks
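One way to read the D1 + D2 < D3 + D4 comparison is that, among the routers shared by the paths to a landmark and to the target, the one giving the smallest summed delay is used for the estimate. The sketch below follows that reading, which is an assumption: the traceroute-style input format, the hop names, and the function name are all hypothetical.

```python
def estimate_delay_ms(path_to_landmark, path_to_target):
    """Estimate the landmark-to-target delay through a shared router.

    Each path is a list of (hop_id, rtt_ms) pairs as seen from one vantage
    point, with the destination as the last entry. For every router that
    appears on both paths, the delay from that router to each destination is
    approximated by subtracting RTTs, and the smallest sum is returned.
    Returns None when the two paths share no router.
    """
    landmark_rtt = path_to_landmark[-1][1]
    target_rtt = path_to_target[-1][1]
    rtt_on_landmark_path = dict(path_to_landmark[:-1])

    best = None
    for hop, rtt_on_target_path in path_to_target[:-1]:
        if hop not in rtt_on_landmark_path:
            continue
        # Clamp at zero because noisy RTTs can make the difference negative.
        d_landmark = max(landmark_rtt - rtt_on_landmark_path[hop], 0.0)
        d_target = max(target_rtt - rtt_on_target_path, 0.0)
        total = d_landmark + d_target
        if best is None or total < best:
            best = total
    return best


# Made-up measurements: router r2 is closer to both hosts than r1.
to_landmark = [("r1", 5.0), ("r2", 9.0), ("landmark", 10.0)]
to_target = [("r1", 5.0), ("r2", 9.2), ("target", 10.5)]
print(estimate_delay_ms(to_landmark, to_target))
```

The estimate is an upper bound on the direct delay, since the true path between the landmark and the target need not pass through the chosen router.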
Three-Tier Geolocation System
Tier 3
Goal: Geolocate the target IP using passive landmarks
Select the landmark with the minimum delay to the target, and associate the target’s location with it.
10.6 km vs. 0.103 km
Measured distance ∝ Geographical distance
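The Tier 3 selection rule can be illustrated with a few lines of code. This is only a sketch: the `Landmark` record, its field names, and the sample values are hypothetical and do not come from the slides.

```python
from typing import NamedTuple


class Landmark(NamedTuple):
    name: str
    lat: float
    lon: float
    delay_ms: float  # estimated relative delay between landmark and target


def geolocate(landmarks):
    """Return the (lat, lon) of the landmark with the minimum delay."""
    if not landmarks:
        raise ValueError("at least one landmark is required")
    closest = min(landmarks, key=lambda lm: lm.delay_ms)
    return closest.lat, closest.lon


# Made-up landmarks inside the final intersection.
candidates = [
    Landmark("office-a", 38.9020, -77.0420, 1.8),
    Landmark("office-b", 38.9031, -77.0435, 0.4),
    Landmark("office-c", 38.8990, -77.0390, 2.6),
]
print(geolocate(candidates))
```

Only the ordering of the delays matters here, which is why relative delays suffice even when their absolute values are noisy.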
Remaining Issues
Verifying landmarks
– Sweep out most of the erroneous landmarks
– Errors are still possible!
Resilience to errors
– The larger the error, the more resilient our method is
– We prove that the likelihood that an erroneous landmark will affect the accuracy is small
Evaluation
Three datasets
– PlanetLab dataset (Academic)
– Collected dataset (Residential)
– Online Maps dataset (In the wild)
Factors that impact the accuracy
– Landmark density
– Population density
– Access networks
Dataset Characteristics
The three datasets cover both urban areas and rural areas.
Urban areas
Rural areas
Baseline Results
Error distance (km)   PlanetLab   Residential   Online Maps   The best previous result
Median                0.69        2.25          2.11          35.2
Maximum               5.24        8.1           13.2          276.8
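The median and maximum error distances reported here are statistics over per-target errors, where each error is the distance between the estimated and the true location. The following sketch shows one way such a summary could be computed. The function names are hypothetical, the great-circle (haversine) distance is an assumed choice of metric, and the sample coordinates are invented rather than taken from any of the datasets.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0


def error_distance_km(estimated, actual):
    """Great-circle distance in km between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, estimated)
    lat2, lon2 = map(math.radians, actual)
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def summarize_errors(pairs):
    """Return (median, maximum) error distance over (estimated, actual) pairs."""
    errors = [error_distance_km(est, act) for est, act in pairs]
    return statistics.median(errors), max(errors)


# Invented (estimated, actual) coordinates, for illustration only.
sample = [
    ((38.9020, -77.0420), (38.9031, -77.0435)),
    ((41.8800, -87.6300), (41.8900, -87.6200)),
    ((34.0500, -118.2400), (34.0500, -118.2400)),
]
median_km, max_km = summarize_errors(sample)
print(f"median = {median_km:.2f} km, maximum = {max_km:.2f} km")
```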
Landmark Density
The more landmarks we can discover in the vicinity of a target, the more likely we are to geolocate the targeted IP accurately.
Density sequence: PlanetLab > Residential > Online Maps
The Role of Population Density
The error distance is smallest in densely populated areas
The error grows as the population density decreases
Middle of “nowhere”
The Role of Access Networks
Error distance (km)   AT&T   Comcast   Verizon
Median                1.68   2.38      1.48
2 km
700 meters
Cable access networks (Comcast) have a much larger latency variance than DSL networks (AT&T and Verizon)
Conclusions
A geolocation system able to geolocate IP addresses with more than an order of magnitude better precision than the best previous method
Our methodology consists of two components
– Mining landmarks from the Web and using Web or e-mail servers as landmarks
– Using relative network distances as opposed to absolute network distances