Finding Wormholes with Flickr Geotags
Maarten Clements, Marcel Reinders, Arjen de Vries, Pavel Serdyukov
December 3rd, 2009, GIS


03/12/2009 Maarten Clements

Maarten Clements

• PhD: personalized retrieval in Social Media
• Faculty of EEMCS – ICT group
• Supervisors
  º Marcel Reinders – Prof. Bioinformatics (and more)
  º Arjen de Vries – CWI, Prof. MM Dataspaces


Location prediction

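• Predict relevant locations
  º Location → Location
  º User → Location
• Why?

[Figure: two example photos labelled 1 and 2 (credits: Flickr: MarsW, Flickr: msokal), followed by a "?" for the location to predict]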


Flickr

• Photo sharing website
  º Billions of photos
  º Active community: Tags, Geotags, Favorites, Comments…

[Figure: Geotags in Flickr, 2008 and 2009; values shown: 32.3M and 91.4M]


Flickr

• Using Flickr API to collect data:
  º http://www.flickr.com/services/api/
• Strategy to find people who geotag:
• First collected top cities in 2008
  1. 'New York, NY, United States'
  2. 'London, England, United Kingdom'
  3. 'San Francisco, California, United States'
  4. 'Paris, Ile-de-France, France'
  5. …
  8643. Lo Verdes, Canary Islands, Spain

[Figure: Total nr. of photos in 2008; log-log plot, x-axis: Place (10^0 to 10^4), y-axis: Photos (10^1 to 10^6)]

Flickr

• Repeat:
  º Select a city based on the full distribution
  º Get a (geotagged) photo at this location
  º Select the user who made the photo
  º Get all of this user's photos
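The sampling loop above could be sketched roughly as follows. This is a minimal illustration, assuming that "based on full distribution" means drawing a city with probability proportional to its photo count. The names `sample_users`, `city_photo_counts` and `photos_in_city` are hypothetical, and the photo lists stand in for whatever the Flickr API returns.

```python
import random

def sample_users(city_photo_counts, photos_in_city, n_draws, seed=0):
    """Collect user ids by repeatedly sampling a city, then a photo in it.

    city_photo_counts: dict city -> number of photos (assumed sampling weight)
    photos_in_city:    dict city -> list of (photo_id, owner_id) pairs
                       (hypothetical stand-in for an API lookup)
    """
    rng = random.Random(seed)
    cities = list(city_photo_counts)
    weights = [city_photo_counts[c] for c in cities]
    users = set()
    for _ in range(n_draws):
        # Draw a city proportionally to its photo count.
        city = rng.choices(cities, weights=weights, k=1)[0]
        # Pick one geotagged photo at that location and keep its owner.
        _photo_id, owner_id = rng.choice(photos_in_city[city])
        users.add(owner_id)
    return users
```

In a real crawl, the final step would fetch all photos of each collected user; that part depends on the API and is left out here.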


Flickr

[Figure: map; x-axis: Longitude (-180 to 180), y-axis: Latitude (-90 to 90)]

• Users: 36,264
• Photos: 52,425,279
• Geo Tags: 22,710,496

Flickr

[Figure: map detail; x-axis: Longitude (-17 to 27), y-axis: Latitude (33 to 61)]

• Tags
• Titles
• Time stamps
• Social network
• Descriptions
• Groups

Flickr

[Figure: log-log plots (10^0 to 10^5 on both axes); axis labels: User, Geo Tags, Photos; legend: All Geo-tags, Unique Geo-tags; annotations: Round to 1000th degree, Clustered 100km]
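As a rough illustration of the "Round to 1000th degree" step, unique geotags per user might be counted as below. This sketch assumes the rounding is applied to latitude and longitude independently, to three decimals; the function name `unique_geotags` is hypothetical.

```python
def unique_geotags(points, decimals=3):
    """Return the set of distinct (lat, lon) pairs after rounding.

    With decimals=3, coordinates are rounded to a thousandth of a degree
    (an assumed reading of "Round to 1000th degree").
    """
    return {(round(lat, decimals), round(lon, decimals)) for lat, lon in points}
```

Two photos taken a few metres apart then collapse into one geotag, while photos in different cities stay separate.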


Wormholes

• Places that are similar but not necessarily spatially close
• Use user travel patterns to detect these places
• Assumptions
  º Users have a certain travel preference
  º Users make photos at places they like

Wormholes

• Given a target location, find relevant users
• Weigh Euclidean distance with normal distribution
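One way to read the weighting step is sketched below. It assumes each user is represented by the geotag closest to the target, and that the weight is an unnormalised Gaussian of that Euclidean distance; the slides do not spell out either detail. The name `user_weight` is hypothetical.

```python
import math

def user_weight(target, geotags, sigma):
    """Gaussian relevance weight of one user for a target location.

    target:  (lat, lon) of the target location
    geotags: iterable of (lat, lon) pairs for the user
    sigma:   scale of the normal distribution, in the same units
    Assumption: only the user's geotag nearest to the target counts.
    """
    d2 = min((lat - target[0]) ** 2 + (lon - target[1]) ** 2
             for lat, lon in geotags)
    return math.exp(-d2 / (2.0 * sigma ** 2))
```

A user with a photo exactly at the target gets weight 1, and the weight decays towards 0 as the nearest photo moves away.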

Wormholes

• Aggregate data over all users, using computed weights
  º 2000x4000 histogram, example 4x8:

[Figure: example 4x8 histograms for User 1, User 2, and User 1+2]
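The aggregation can be sketched with NumPy as a weighted sum of per-user 2D histograms. This is an illustration only: it assumes raw photo counts per cell (no per-user normalisation) and a latitude/longitude grid covering the whole globe. The name `aggregate_histogram` is hypothetical.

```python
import numpy as np

def aggregate_histogram(users_geotags, weights, shape=(4, 8)):
    """Sum per-user geotag histograms, each scaled by that user's weight.

    users_geotags: list of arrays/lists of (lat, lon) pairs, one per user
    weights:       one relevance weight per user
    shape:         (rows, cols) of the grid; the slides use 2000x4000
    """
    total = np.zeros(shape)
    for geotags, w in zip(users_geotags, weights):
        pts = np.asarray(geotags, dtype=float)
        hist, _, _ = np.histogram2d(
            pts[:, 0], pts[:, 1],
            bins=list(shape),
            range=[[-90.0, 90.0], [-180.0, 180.0]],
        )
        total += w * hist  # assumed: raw counts, scaled by the user weight
    return total
```

With the 4x8 example grid each cell spans 45 by 45 degrees, so two users with photos in the same region add up in the same cell.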

Wormholes

• Compute convolution with Gaussian kernel
• Compute difference with expected geotag distribution

[Figure: Convolution]
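The last two steps might look like the sketch below. It assumes both histograms are smoothed with the same separable Gaussian kernel, normalised to sum to 1, and then subtracted; the slides do not say how the difference is defined, so this is one plausible reading. All function names are hypothetical, and both histograms are assumed to be non-empty.

```python
import numpy as np

def gaussian_kernel(sigma_cells):
    """1D Gaussian kernel, truncated at about 3 sigma and normalised."""
    radius = max(int(3 * sigma_cells), 1)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma_cells ** 2))
    return k / k.sum()

def smooth(hist, sigma_cells):
    """Separable 2D convolution with zero padding, output same shape as input."""
    k = gaussian_kernel(sigma_cells)
    r = len(k) // 2

    def conv(row):
        # Full convolution, cropped back to the original length.
        return np.convolve(row, k, mode="full")[r:r + len(row)]

    out = np.apply_along_axis(conv, 0, hist)
    return np.apply_along_axis(conv, 1, out)

def wormhole_scores(weighted_hist, expected_hist, sigma_cells=1.0):
    """Smoothed, normalised weighted histogram minus the expected one."""
    w = smooth(np.asarray(weighted_hist, dtype=float), sigma_cells)
    e = smooth(np.asarray(expected_hist, dtype=float), sigma_cells)
    return w / w.sum() - e / e.sum()
```

Cells with a positive score hold more geotags from relevant users than the overall geotag distribution would predict; peaks of this map are the candidate similar locations.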


Wormholes

• Result


Wormholes

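• Sigma determines how many users we call relevant
  º Large σ: many relevant users
  º Small σ: few relevant users

[Figure: two normal distributions of width σ; panels: Many relevant users, Few relevant users]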
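The effect of σ can be made concrete with a soft count of relevant users: the sum of their Gaussian weights. The distances below are made up for illustration, and `effective_users` is a hypothetical helper.

```python
import math

def effective_users(distances_km, sigma_km):
    """Sum of Gaussian weights: a soft count of users treated as relevant."""
    return sum(math.exp(-d ** 2 / (2.0 * sigma_km ** 2)) for d in distances_km)

# Made-up distances (km) from each user's nearest photo to the target.
distances_km = [0.1, 1.0, 5.0, 20.0, 80.0]
for sigma_km in (0.5, 10.0, 100.0):
    print(sigma_km, round(effective_users(distances_km, sigma_km), 2))
```

The count grows with σ, since more distant users receive a non-negligible weight.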


Evaluation

• Find ground truth data: Wikipedia, GeoNames


Evaluation

• Rank predicted peaks and compute precision
• Is there a mountain in a range of 3 cells around the predicted peak?
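The evaluation described above could be computed along these lines. The sketch assumes that "a range of 3 cells" means a Chebyshev distance of at most 3 grid cells, and that average precision is averaged over the hits found in the ranked list; both are assumptions, and `average_precision` is a hypothetical name.

```python
def average_precision(ranked_peaks, truth_cells, tolerance=3):
    """Average precision of ranked peak cells against ground-truth cells.

    ranked_peaks: list of (row, col) grid cells, best prediction first
    truth_cells:  list of (row, col) cells containing a ground-truth item
    tolerance:    a peak is a hit if a truth cell lies within this many
                  cells in both directions (assumed Chebyshev distance)
    """
    def is_hit(peak):
        return any(max(abs(peak[0] - r), abs(peak[1] - c)) <= tolerance
                   for r, c in truth_cells)

    hits = 0
    precision_sum = 0.0
    for rank, peak in enumerate(ranked_peaks, start=1):
        if is_hit(peak):
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0
```

For example, a ranking with hits at positions 1 and 3 gives (1/1 + 2/3) / 2.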

[Figure: Average Precision (0 to 0.2) plotted against σ (km) (0 to 100)]

So… Does it work?


Evaluation (manual)

• σ = 100km
• Target: Tour Eiffel, σ = 20m
• Target: Tour Eiffel, σ = 80m
• Target: Tour Eiffel, σ = 300m
• Target: Pere Lachaise, σ = 60m

What next?

• User → Location
• Query consists of multiple points (instead of 1)
• Get rid of grid-based prediction
  º Compute kernel convolution peaks directly from continuous geotag data.
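One possible way to find kernel density peaks without a grid is mean shift, sketched below. The slides do not name a technique, so this is a swapped-in illustration rather than the authors' plan; `mean_shift_peak` and its parameters are hypothetical.

```python
import numpy as np

def mean_shift_peak(points, weights, start, sigma, n_iter=100, tol=1e-9):
    """Climb to a local maximum of a weighted Gaussian kernel density.

    points:  (n, 2) array of continuous geotag coordinates
    weights: (n,) array, e.g. the relevance weight of each photo's user
    start:   initial (x, y) position; should lie near some data points
    sigma:   kernel bandwidth, in the same units as the coordinates
    """
    pts = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    x = np.asarray(start, dtype=float)
    for _ in range(n_iter):
        d2 = ((pts - x) ** 2).sum(axis=1)
        k = w * np.exp(-d2 / (2.0 * sigma ** 2))
        # Move to the kernel-weighted mean of the data points.
        new_x = (k[:, None] * pts).sum(axis=0) / k.sum()
        if np.linalg.norm(new_x - x) < tol:
            x = new_x
            break
        x = new_x
    return x
```

Running this from several start points and keeping the distinct end points would give a set of peaks, with no histogram resolution to choose.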

[Figure: 3D plot; axis ticks from -200 to 200, -100 to 100, and 0 to 100]

Conclusions

• We have proposed a new method to predict similar locations based on geotags.

• Scale parameter can be used to predict relevant locations at different scales.

• ECIR’10: Comparing different user aggregation methods

http://ict.ewi.tudelft.nl/~maarten/wormholes/