Mor Naaman, Yee Jiun Song, Andreas Paepcke, Hector Garcia-Molina

Digital Library Project, Database Group

Stanford University

Automatic Organization for Digital Photographs with Geographic Coordinates

JCDL 2004 2

Geo-Referenced Photos

April 8th, 2004 1:20:02pm

Latitude: N34.3121

Longitude: W122.234
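A minimal sketch of how one geo-referenced photo record like the one above might be represented. The `GeoPhoto` class and `parse_coord` helper are hypothetical names used for illustration only; the slides do not describe the system's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class GeoPhoto:
    """One photo with its capture time and position (hypothetical record type)."""
    taken_at: datetime
    lat: float  # signed decimal degrees, north positive
    lon: float  # signed decimal degrees, east positive


def parse_coord(text: str) -> float:
    """Convert a hemisphere-prefixed value such as 'W122.234' to signed degrees."""
    hemisphere, value = text[0].upper(), float(text[1:])
    if hemisphere not in "NSEW":
        raise ValueError(f"unknown hemisphere in {text!r}")
    # South and west are negative by the usual convention.
    return -value if hemisphere in "SW" else value


# The example values shown on the slide.
photo = GeoPhoto(
    taken_at=datetime(2004, 4, 8, 13, 20, 2),
    lat=parse_coord("N34.3121"),
    lon=parse_coord("W122.234"),
)
```

Storing signed decimal degrees keeps later distance computations simple.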

Geo-Photography Technology

Personal Photo Libraries

• Searching/browsing very difficult

• Little discernible structure to photo collections

Managing Personal Photos

• Content-based retrieval
  – Basic, primitive (far from semantic)

• Manual labeling
  – Improved, yet cumbersome

• Visual methods for fast scanning (Zoom)
  – Don’t scale well

Our Approach

• Absolutely no human effort required

• Utilize time and location
  – Automatically captured
  – Easy to get

Automatic Organization

Outline

• Requirements and challenges

• The algorithms

• Sample output

• Experiment results

Browsing by Location/Time

• Use a map/calendar
  – wwmx.org from MSR

• Map issues
  – Lots of screen space
  – Sparse
  – Limited interaction?
  – Not intuitive for some

Using Hierarchies

Location:

United States
  Around: San Francisco, Berkeley, Sonoma CA
  San Francisco, Golden Gate Park, CA
  Berkeley, Oakland CA
  Yosemite N.P, Yosemite Valley, CA
  Seattle, WA
  ……

Time:

2003-01-01: Yosemite N.P. (2 Days)
2003-01-18: San Francisco (1 hour)

Challenges

• Locations should be intuitive

• Events are tricky
  – 3-day trip to NYC
  – The kid’s soccer game, followed by a birthday party

• Good names are important.

Outline

• Requirements and challenges

• The algorithms

• Sample output

• Experiment results

Process Diagram

Discovering Structure

[Diagram: Automatic Organization. Steps: Initial Event Segmentation, Location Clustering, Final Event Segmentation. Outputs: Location Hierarchy, Event Hierarchy. Current step: Initial Event Segmentation]

Initial Event Segmentation

• Photos occur in bursts

• Identify bursts: semantically “connected”
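The burst idea above can be sketched as a split on time gaps. This is only an illustration under a simplifying assumption: a fixed one-hour gap (`max_gap`, a made-up default) separates bursts. The segmentation actually used is described in Graham et al., JCDL 2002 and the proceedings, and may differ.

```python
from datetime import datetime, timedelta


def split_into_bursts(timestamps, max_gap=timedelta(hours=1)):
    """Group timestamps into bursts separated by gaps longer than max_gap."""
    bursts = []
    for ts in sorted(timestamps):
        # Extend the current burst if this photo follows closely enough.
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)
        else:
            bursts.append([ts])
    return bursts


stream = [
    datetime(2003, 1, 18, 10, 0),
    datetime(2003, 1, 18, 10, 5),
    datetime(2003, 1, 18, 10, 20),
    datetime(2003, 1, 18, 15, 0),
    datetime(2003, 1, 18, 15, 2),
]
bursts = split_into_bursts(stream)  # two bursts: three photos, then two
```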

Initial Event Segmentation

Stream of photos

More details:
• Graham et al., JCDL 2002
• Tomorrow
• Proceedings

Location Clusters

• Cluster the bursts into locations

• A. Gionis and H. Mannila. Finding recurrent sources in sequences. In Proceedings, Computational Molecular Biology, 2003.
  – Minimize: number of clusters
  – Minimize: error (distance to cluster centers)
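To make the two-term objective concrete, here is a rough sketch. It is not the Gionis and Mannila algorithm: it swaps in a simple greedy radius-based pass, and the `radius_miles` and `penalty_per_cluster` values are arbitrary assumptions. The `cost` function only illustrates the trade-off between the number of clusters and the distance to cluster centers.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def centroid(points):
    """Arithmetic mean of (lat, lon) pairs; adequate for small regions."""
    return (sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points))


def cluster_bursts(burst_centers, radius_miles=20.0):
    """Greedy pass: join the nearest cluster within radius_miles, else start a new one."""
    clusters = []  # each cluster is a list of (lat, lon) burst centers
    for point in burst_centers:
        best = min(clusters,
                   key=lambda c: haversine_miles(centroid(c), point),
                   default=None)
        if best is not None and haversine_miles(centroid(best), point) <= radius_miles:
            best.append(point)
        else:
            clusters.append([point])
    return clusters


def cost(clusters, penalty_per_cluster=10.0):
    """Two-term objective: cluster-count penalty plus total distance to centers."""
    error = sum(haversine_miles(centroid(c), p) for c in clusters for p in c)
    return penalty_per_cluster * len(clusters) + error


# Approximate burst centers: San Francisco, Oakland, Stanford, Seattle.
centers = [(37.77, -122.42), (37.80, -122.27), (37.43, -122.17), (47.61, -122.33)]
clusters = cluster_bursts(centers)
```

A larger radius yields fewer clusters but a larger error term, which is the tension the two "Minimize" bullets describe.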

Photo location

Location Clusters: 2-D View

2-D View: with Bursts

Location Clusters

[Figure: two plots, each with axis labels Location1 through Location4]

Location Clusters (breakdown)

• Some clusters may be overloaded:
  – Many bursts / picture-taking days in one location

San Francisco

Final Event Segmentation

• Scan the sequence again; a new event is detected:
  – Whenever the location context changes
  – Within the same location, using an adaptive time threshold
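The two rules above might be sketched as follows. The slides do not say how the time threshold adapts, so this example assumes a per-location lookup table (`thresholds`, hypothetical) with a fallback `default_gap`; both are illustrative choices, not the authors' method.

```python
from datetime import datetime, timedelta


def segment_events(photos, thresholds, default_gap=timedelta(hours=6)):
    """Split a time-ordered list of (timestamp, location_id) pairs into events.

    A new event starts when the location changes, or when the gap since the
    previous photo exceeds the threshold configured for that location.
    """
    events = []
    for ts, loc in photos:
        if events:
            prev_ts, prev_loc = events[-1][-1]
            gap_limit = thresholds.get(loc, default_gap)
            if loc == prev_loc and ts - prev_ts <= gap_limit:
                events[-1].append((ts, loc))
                continue
        events.append([(ts, loc)])
    return events


# Assumed thresholds: generous for a remote destination, tight for the home area.
thresholds = {
    "Yosemite": timedelta(hours=24),
    "San Francisco": timedelta(hours=2),
}
photos = [
    (datetime(2003, 1, 1, 15, 0), "Yosemite"),
    (datetime(2003, 1, 2, 9, 0), "Yosemite"),        # overnight gap, same event
    (datetime(2003, 1, 18, 12, 0), "San Francisco"),  # location change, new event
    (datetime(2003, 1, 18, 13, 0), "San Francisco"),
    (datetime(2003, 1, 18, 19, 0), "San Francisco"),  # six-hour gap, new event
]
events = segment_events(photos, thresholds)
```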

Final Event Segmentation

Overnight trip to Yosemite

Soccer game and dinner

Next - names

• Detected location and event structure

• Need to choose names for each node

Assigning Names

Photo location

Stanford

Palo Alto City Park

Palo Alto

Butano State Park

Stanford 42

Palo Alto 30

Butano 10

P.A. park 8

Assigning Names – Nearby?

San Jose, 20 miles

San Francisco, 30 miles

What if photos occur sparsely within cities or parks?

Assigning Names - Nearby

Which city has stronger “gravity”?

• San Jose is closer
• San Jose is bigger (larger population)
• But San Fran is more important! (greater Google count)

Final name for location cluster:

“Stanford, 30 miles South of SF”
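A sketch of how a "nearby" name could be produced from the three factors named above (distance, population, web count). The slides do not give the scoring formula, so the `gravity` function here is a hypothetical stand-in borrowed from the physical analogy, and the cities and their numbers are made-up placeholders, not real data.

```python
import math

EARTH_RADIUS_MILES = 3958.8
COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def distance_miles(a, b):
    """Great-circle (haversine) distance in miles between (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def bearing_degrees(origin, target):
    """Initial compass bearing from origin to target, in [0, 360)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*origin, *target))
    dlon = lon2 - lon1
    x = math.sin(dlon) * math.cos(lat2)
    y = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360


def compass_point(bearing):
    """Map a bearing in degrees to one of eight compass points."""
    return COMPASS[int((bearing + 22.5) // 45) % 8]


def gravity(city, cluster_center):
    """Hypothetical score: grows with population and web count, falls with distance."""
    d = max(distance_miles(city["loc"], cluster_center), 1.0)
    return city["population"] * city["web_count"] / d ** 2


def nearby_name(cluster_label, cluster_center, cities):
    """Name the cluster relative to the city with the strongest 'gravity'."""
    best = max(cities, key=lambda c: gravity(c, cluster_center))
    miles = round(distance_miles(best["loc"], cluster_center))
    direction = compass_point(bearing_degrees(best["loc"], cluster_center))
    return f"{cluster_label}, {miles} miles {direction} of {best['name']}"


# Placeholder cities: B is closer and more populous, A has the larger web count.
cities = [
    {"name": "City A", "loc": (0.5, 0.0), "population": 800_000, "web_count": 50_000_000},
    {"name": "City B", "loc": (0.0, 0.3), "population": 1_000_000, "web_count": 5_000_000},
]
name = nearby_name("Cluster X", (0.0, 0.0), cities)
```

With these placeholder numbers the farther, smaller City A wins because of its larger web count, mirroring the San Jose versus San Francisco example.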

Assigning Names - Alexandria

• Using polygon-based dataset of administrative areas

• Alexandria gazetteer can be used for other prominent geographic features

Outline

• The requirements and challenges of automatic organization

• The algorithms

• Sample output

• Experiment results

Location Hierarchy

Photoshop Album (at least 4 man-hours)

Our system (about 0 man-seconds)

Location Hierarchy (US)

+ San Francisco, Berkeley, Sonoma, CA
- Stanford, Mountain View, Monterey, CA
  • Monterey (58 miles S of San Jose)
  • Mountain View (4 miles NW of San Jose)
  • Stanford
- Colorado (219 miles W of Denver)
- Long Beach (35 miles S of Los Angeles, CA)
- Philadelphia, PA
- Seattle, WA
- Sequoia N.P. (153 miles E of Fresno, CA)
- South Lake Tahoe; Bear Valley, CA
- Yosemite N.P.; Yosemite Valley, CA

Events

about 0 man-seconds:

...

2003-06-28: Long Beach, CA (3 days)
2003-07-04: San Francisco, CA (3 hours)
2003-07-10: Colorado (3 days)
2003-07-15: San Francisco, CA (1 hours)
2003-07-18: Mountain View, CA (5 hours)
2003-07-27: San Francisco, CA (1 hours)
2003-09-28: Philadelphia, PA (1 hours)
2003-10-03: Sequoia NP (3 days)

...

Photoshop Album (at least 4 man-hours)

Event Names

• LOCALE: share automatically

• Check personal calendar

• Event Gazetteer

• Easy interface

Experiment

• Tested on 3 real-world geo-referenced photo collections

• Our system automatically generated the structure and names

• Tested with the owners

Experiment - Locations

• Accepted the automatic hierarchy

• Only minor edits requested
  – Merge/split a few of the locations

Experiment - Events

• Compared to events as annotated by users

• 80-85% in both recall and precision

• Other metrics proposed (see paper)
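For readers unfamiliar with the two measures, here is a small sketch of precision and recall over event boundaries. It assumes boundaries are compared as exact matches on photo indices, which is a simplification; the paper defines the evaluation actually used, along with the other proposed metrics. The index lists are invented for illustration and are not experimental data.

```python
def boundary_precision_recall(detected, annotated):
    """Precision and recall of detected event boundaries against user annotations.

    Boundaries are photo indices at which a new event starts; a detected
    boundary counts as correct only if the user marked the same index.
    """
    detected, annotated = set(detected), set(annotated)
    hits = len(detected & annotated)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(annotated) if annotated else 0.0
    return precision, recall


# Invented example: the system finds 5 boundaries, the user marked 6, 4 agree.
precision, recall = boundary_precision_recall(
    detected=[3, 7, 12, 20, 31],
    annotated=[3, 7, 12, 25, 31, 40],
)
```

Precision penalizes spurious event splits, while recall penalizes events that were merged when the user saw them as separate.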

Experiment - Naming

• Naming location clusters
  – For 76% of clusters, system and users pick at least one name in common
  – For the rest, “automatic” name was useful

Not yet published:

• Paid 13 participants to “geo-reference” their photos
• Loaded to WWMX and our browser
  – Most liked the map better, but…
  – Performed the same for search/browse tasks
  – Event notion helps overcome location handicap
  – Organization “made sense”

P.S. Some didn’t touch the map, yet used our location hierarchy.

P.S.2 This was on a BIG screen!

Thank You!

More details:

Proceedings

Google: Mor Naaman

mor@cs.stanford.edu

http://www-db.stanford.edu/~mor/

Future Work

• User interface

• PDA

• Integrate with map

• Global photo libraries

Remember The Bursts?