Spread and Page Rank for Interactive Maps

Talk presented at GIS in Action conference, Portland, OR, April 2013

Wm Leler - FlightStats, Inc. wm@flightstats.com

openstreetmap.org

zoom level 6

openstreetmap.org

Missing: San Francisco, San Jose, Los Angeles, Las Vegas, Phoenix, Seattle, Vancouver, Detroit, Dallas, NYC, Miami

Stamen Terrain

Beaverton? Hillsboro? Forest Grove? Tigard?

Google Maps

Seattle? Denver? Salt Lake City? Las Vegas?

MapQuest

56

where did they go?

The Big Problem

• A map is a spatial display of a bunch of objects: cities, highways, parks, airports, etc.

• At most zoom levels, there are far too many objects to display.

• What is the best way to pick which objects to display (per zoom level) on a map?

Our Immediate Problem

• At FlightStats we need to decide which airports to draw on a map

Every airport that supplies us with flight data (4180)

FAA Categories

Based on % of passenger-enplanements

1. Primary large hub (>1% of p-e)

2. Primary medium hub (.25 - 1%)

3. Primary small hub (.05 - .25%)

4. Primary non-hub (<0.05%)

5. Secondary (< 10,000 p-e / year)
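To make the thresholds above concrete, here is a minimal sketch of the classification in Python. The function name, the `national_total` parameter, and the handling of values that fall exactly on a boundary are assumptions for illustration; they are not taken from the FAA or from the talk.

```python
def faa_category(enplanements: int, national_total: int) -> str:
    """Classify an airport by its share of passenger-enplanements (p-e).

    enplanements: annual p-e at this airport.
    national_total: annual p-e across all airports (hypothetical input).
    """
    if enplanements < 10_000:
        return "Secondary"
    share = 100.0 * enplanements / national_total  # percent of all p-e
    if share > 1.0:
        return "Primary large hub"
    if share >= 0.25:
        return "Primary medium hub"
    if share >= 0.05:
        return "Primary small hub"
    return "Primary non-hub"
```

Note that the category depends only on passenger counts, so it carries no information about where an airport is or what it connects to.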

Bad Solution

• Not so good for maps

• Airports bunch together and leave big empty spaces

Bunching

Empty Spaces

Bunching

Delay Map

Even Worse Internationally

• South America has only 3 large primary airports

• Two serve the same city (Rio)

• Africa has one (JNB)

More Problems

• Should be dependent on zoom level

  • as you zoom in, you want to see more

• Number of passengers is a bad measure

  • short commuter flights to little airports should count less

  • flights to other major airports should count more

Goals

• Show “important” airports

  • major airports

  • plus less prominent airports if they are the primary airport for an area

• Avoid airports “bunching up”

Solution

Importance of an airport is based on:

1. How connected it is to other airports, weighted by the importance of each of those airports (recursive)

2. The area for which it is the primary airport

3. The current map view

Connectivity

• Calculate connectivity using the PageRank algorithm (by Larry Page at Google, adapted by Steve Wilson at FlightStats)

SEA

PDX

EUG

PDT

RDM

PR(PDX) = foreach airport x: flights(PDX, x) * PR(x)

Rank values annotated on the airport graph: .0025, .0036, .00014, .0002, .0003
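The recursive definition above can be computed by iteration. The following is a minimal sketch of a standard flight-weighted PageRank, not the FlightStats adaptation, whose details the slides do not give. The damping factor, the normalization by each airport's outgoing flights, the handling of airports with no departures, and the `flights[a][b]` input format are all assumptions.

```python
def flight_rank(flights, damping=0.85, iterations=100):
    """Iteratively compute a PageRank-style score for each airport.

    flights[a][b] is the number of flights from airport a to airport b
    (hypothetical input format). Returns a dict of scores that sum to 1.
    """
    airports = sorted(set(flights) | {b for dests in flights.values() for b in dests})
    n = len(airports)
    rank = {a: 1.0 / n for a in airports}
    out_total = {a: sum(flights.get(a, {}).values()) for a in airports}

    for _ in range(iterations):
        new_rank = {a: (1.0 - damping) / n for a in airports}
        for a in airports:
            if out_total[a] == 0:
                # No departures recorded: share this airport's rank evenly.
                for b in airports:
                    new_rank[b] += damping * rank[a] / n
                continue
            for b, count in flights[a].items():
                # An airport passes rank to its destinations in proportion
                # to the number of flights it sends to each of them.
                new_rank[b] += damping * rank[a] * count / out_total[a]
        rank = new_rank
    return rank
```

Because each airport's score depends on the scores of the airports it connects to, a flight to a well-connected hub contributes more than a flight to a small airport, which is the behavior the slides ask for.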

Spread

• (Page) Rank still suffers from bunching and empty spaces

• Add Spread – the distance to the closest airport of higher rank

• a reasonable proxy for the area

• The Spread for PDX is 111.98 nautical miles (the distance to SEA)
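Spread as defined above can be sketched in a few lines of Python. This is an illustrative sketch, not the FlightStats code: it assumes great-circle (haversine) distance on a spherical Earth, a hypothetical `airports` mapping of code to `(rank, latitude, longitude)`, and an infinite Spread for the top-ranked airport, which has no higher-ranked neighbor.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def compute_spread(airports):
    """Return {code: distance to the closest airport of higher rank}.

    airports maps code -> (rank, lat, lon). The highest-ranked airport
    gets math.inf because no airport outranks it (an assumption).
    """
    spread = {}
    for code, (rank, lat, lon) in airports.items():
        distances = [
            distance_nm(lat, lon, lat2, lon2)
            for other, (rank2, lat2, lon2) in airports.items()
            if rank2 > rank
        ]
        spread[code] = min(distances, default=math.inf)
    return spread
```

This compares every pair of airports, which is acceptable for a few thousand airports computed ahead of time; a spatial index would be the usual choice for much larger datasets.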

Combining Rank and Spread

The four biggest airports in Northern CA:

Airport   Rank     Spread (nm)   Rank × Spread
SFO       0.0040   293.5         1.19
OAK       0.0015   10.26         0.0158
SJC       0.0012   24.8          0.0287
SMF       0.0011   65.6          0.0702

Algorithm

1. Pre-calculate Rank and Spread for all airports (they don’t change very often)

2. For each map view, display the N airports with the largest product of Rank and Spread

3. Optionally, set a minimum Rank and a minimum Spread
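The per-view selection in steps 2 and 3 might look like the sketch below. The `Airport` class, the `bounds` tuple, and the parameter names are hypothetical. The sketch assumes that "map view" means a latitude/longitude bounding box that does not cross the antimeridian, and that Rank and Spread have already been computed.

```python
from dataclasses import dataclass


@dataclass
class Airport:
    code: str
    lat: float
    lon: float
    rank: float    # precomputed Rank
    spread: float  # precomputed Spread, in nautical miles


def airports_to_display(airports, bounds, n, min_rank=0.0, min_spread=0.0):
    """Pick the n airports in view with the largest Rank * Spread.

    bounds is (south, west, north, east) in degrees. min_rank and
    min_spread are the optional limits from step 3.
    """
    south, west, north, east = bounds
    visible = [
        a for a in airports
        if south <= a.lat <= north and west <= a.lon <= east
        and a.rank >= min_rank and a.spread >= min_spread
    ]
    visible.sort(key=lambda a: a.rank * a.spread, reverse=True)
    return visible[:n]


# Rank and Spread from the Northern CA table; coordinates are approximate.
northern_ca = [
    Airport("SFO", 37.62, -122.38, 0.0040, 293.5),
    Airport("OAK", 37.72, -122.22, 0.0015, 10.26),
    Airport("SJC", 37.36, -121.93, 0.0012, 24.8),
    Airport("SMF", 38.70, -121.59, 0.0011, 65.6),
]
top_two = airports_to_display(northern_ca, (36.0, -124.0, 40.0, -120.0), n=2)
```

With these values OAK has the second-highest Rank but the lowest product, so it is the last of the four to appear as N grows.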

Flexibility

• Applications can pick their weighting of Rank and Spread, and value of N

• N can depend on zoom level

• Can also use them as limits

• minimum Spread to debunch airports

• minimum Rank to hide small airports
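The slides do not say how an application would weight Rank against Spread or tie N to the zoom level, so the following is only one possible sketch. Using exponents as weights and growing N linearly with zoom are assumptions, as are all function names, parameters, and default values.

```python
def weighted_score(rank, spread, rank_weight=1.0, spread_weight=1.0):
    """Combine Rank and Spread with adjustable emphasis.

    With both weights at 1.0 this is the plain product of Rank and Spread.
    A larger spread_weight favors isolated airports; a larger rank_weight
    favors well-connected ones. Exponent weighting is an assumption.
    """
    return (rank ** rank_weight) * (spread ** spread_weight)


def n_for_zoom(zoom, base=10, per_level=5, maximum=200):
    """Hypothetical rule: show more airports as the user zooms in."""
    return min(maximum, base + per_level * max(0, zoom))
```

Setting `spread_weight=0.0` reduces the score to Rank alone, which reproduces the bunching that Spread is meant to fix.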

Beyond Airports

• Can use Spread to space out almost anything on a map

• cities, neighborhoods, roads, parks, mountains, rivers, lakes, ...

• your data!

• Rank emphasizes connectivity over raw size or category

Connectivity

gaps

noise

DEMO

http://demo.flightstats-ops.com/spread

http://www.slideshare.net/wmleler/spread-20277955