Alexandria Digital Library Project Goals and Challenges in Georeferenced Digital Libraries Greg Janée


Alexandria Digital Library Project

Goals and Challenges in Georeferenced Digital Libraries

Greg Janée


Goals

Digital library: “an integrated set of services for capturing, cataloging, storing, searching, protecting, and retrieving information”

ADL: a lightweight, distributed digital library for heterogeneous, georeferenced information
  – a system and an infrastructure
    – supports personal collections ... institutions
    – provides interoperability across spatial data providers


Adjectives

Heterogeneous
  – remotely-sensed imagery; textual documents
  – multimedia instructional materials; executable models
  – gazetteer placenames

Georeferenced
  – generalizes to “scientific data”: any highly-structured, metadata-rich information

Distributed
  – for scalability

Lightweight
  – accommodate small, cheap (i.e., free) implementations
  – include non-traditional spatial data sources


Where we are today

Downloadable server software, two clients

In operational use by MIL

Other (potential) users:
  – Bren/ESSW
  – Scripps
  – DLESE
  – Norwegian National Library
  – Auckland University of Technology


Challenges

  – Discovery
  – Gazetteers
  – Ranking
  – Scalability
  – Context
  – Client integration

More at http://www.dlib.org/


Challenge 1: discovery

Can’t beat word search when it works
  – I want a map of Boulder
  – “Downtown street map of Boulder, Colorado”

But there are so many names for a place...
  – Boulder, Arapahoe County, Colorado
  – Chautauqua, Mapleton Hill, Pearl Street Mall
  – Area code 303, ZIP code 80305, UTM grid 13S
  – Flatirons, Rocky Mountains, Front Range
  – Landers earthquake, hurricane Hugo


If you’re still not convinced...

Remote-sensing imagery is nameless
  – “AVHRR NOAA-13 2002-06-03 14:33 UTC”

Challenge: exactly which two words will find a USGS map of the Flatirons in the Rocky Mountains behind Boulder, Arapahoe County, Colorado?

Answer: Eldorado Springs


ADL approach

Coordinate-based representation and discovery
  – generic lat/lon coordinates
  – rich geometry
    – polygons, polylines
  – spatial operators
    – overlaps, contains

Gazetteer
  – defines representation of places
  – maps placenames to coordinates

[Figure: client, gazetteer, and library, connected by arrows labeled “placenames” and “coordinates”]
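
The approach on this slide can be illustrated with a minimal Python sketch. It is not ADL code: the `Box` class, the `GAZETTEER` and `LIBRARY` dictionaries, and all coordinates are hypothetical, and footprints are simplified to lat/lon bounding boxes rather than the polygons and polylines mentioned above. A placename is resolved to a footprint by the gazetteer, and the library is then searched with the "overlaps" operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned lat/lon bounding box in degrees (ignores date-line wrap)."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other: "Box") -> bool:
        # True when the two boxes share any area or boundary point.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)

    def contains(self, other: "Box") -> bool:
        # True when `other` lies entirely inside this box.
        return (self.west <= other.west and other.east <= self.east
                and self.south <= other.south and other.north <= self.north)


# Hypothetical gazetteer: placename -> footprint (approximate coordinates).
GAZETTEER = {
    "boulder": Box(-105.30, 39.96, -105.18, 40.09),
}

# Hypothetical library holdings: title -> footprint.
LIBRARY = {
    "World Map": Box(-180.0, -90.0, 180.0, 90.0),
    "Boulder street map": Box(-105.29, 39.98, -105.20, 40.06),
    "Denver street map": Box(-105.11, 39.61, -104.60, 39.91),
}


def search_by_placename(name: str) -> list[str]:
    """Resolve a placename via the gazetteer, then query the library by footprint."""
    region = GAZETTEER[name.lower()]
    return [title for title, fp in LIBRARY.items() if fp.overlaps(region)]


print(search_by_placename("Boulder"))
```

In this toy data the query returns the Boulder map but also the World Map, since any global footprint overlaps every query region.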


Challenge 2: gazetteers: necessary evil

Few (public) sources of gazetteer data

Lousy quality
  – digitized from maps

Difficult problems
  – conflation
  – classification
  – boundary determination
  – change over time

Conclusion
  – gazetteer-based spatial reasoning seems unlikely
  – interaction will likely remain client-centric


Final thoughts on discovery

Coordinate-based approach is costly
  – burden on users and catalogers
  – limits potential collections
  – relies on gazetteer’s weakest aspect: footprints
  – continuous coordinate space adds complexity

Gazetteer improvements
  – federated gazetteers
  – new gazetteer models: topological as opposed to metric

Other coordinate spaces, grids, etc.


Challenge 3: ranking

Observed phenomenon: World Map is first result of every query

Idea: rank by spatial similarity to query region

[Figure: a query region shown with four candidate footprints labeled 1 to 4]
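
One way to make "spatial similarity" concrete is sketched below. The slide does not say which measure ADL uses; this sketch assumes intersection area divided by union area over bounding boxes, chosen because a footprint much larger than the query (such as a world map) scores close to zero. The function names and sample footprints are hypothetical.

```python
def area(box):
    """Area of a (west, south, east, north) box in square degrees; 0 if empty."""
    w, s, e, n = box
    return max(0.0, e - w) * max(0.0, n - s)


def similarity(query, footprint):
    """Intersection area over union area: 1.0 for identical boxes, 0.0 if disjoint."""
    qw, qs, qe, qn = query
    fw, fs, fe, fn = footprint
    inter = area((max(qw, fw), max(qs, fs), min(qe, fe), min(qn, fn)))
    union = area(query) + area(footprint) - inter
    return inter / union if union > 0 else 0.0


def rank(query, items):
    """Return titles ordered from most to least similar to the query region."""
    return sorted(items, key=lambda t: similarity(query, items[t]), reverse=True)


# Hypothetical query region and holdings (approximate coordinates).
query = (-105.30, 39.96, -105.18, 40.09)
items = {
    "World Map": (-180.0, -90.0, 180.0, 90.0),
    "Colorado state map": (-109.05, 37.0, -102.05, 41.0),
    "Boulder street map": (-105.29, 39.98, -105.20, 40.06),
}
print(rank(query, items))
```

With this measure the World Map drops to the bottom of the list instead of appearing first.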


Challenge 4: scalability

Easy to accumulate lots of data
  – satellites image continuously
  – 1 m resolution, Earth’s surface area = 5 × 10¹⁴ m²

Support for scalability
  – text: amazingly good
  – spatial: not so good
    – indexing becomes unwieldy at 10⁶ items
    – combining spatial with other constraint types is difficult
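
The data-volume figures above can be turned into a rough estimate. The sketch below assumes one byte per pixel (a single 8-bit band, uncompressed), which is not stated on the slide; real sensors record several bands and revisit the same area repeatedly, so actual volumes would be larger.

```python
EARTH_SURFACE_M2 = 5e14   # surface area quoted on the slide
RESOLUTION_M = 1.0        # ground size of one pixel edge
BYTES_PER_PIXEL = 1       # assumption: one 8-bit band, no compression

# One pixel covers RESOLUTION_M squared square metres.
pixels = EARTH_SURFACE_M2 / RESOLUTION_M ** 2
terabytes = pixels * BYTES_PER_PIXEL / 1e12

print(f"{pixels:.0e} pixels, about {terabytes:.0f} TB per global coverage")
```

Under these assumptions a single global coverage is about 500 TB.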


ADL approach

Partition and distribute the problem

Multiple levels of discovery
  – find relevant collections
  – search just those collections

Support multiple implementation strategies
  – spatial engine
  – relational database
  – home-grown
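
The two levels of discovery can be sketched as follows. This is an illustration only: the `COLLECTIONS` structure, the idea of summarizing each collection by a single bounding box, and all names and coordinates are assumptions, not a description of how ADL describes its collections.

```python
# Hypothetical collections: name -> (overall extent, {item title: footprint}).
# Boxes are (west, south, east, north) in degrees.
COLLECTIONS = {
    "Colorado maps": (
        (-109.05, 37.0, -102.05, 41.0),
        {"Boulder street map": (-105.29, 39.98, -105.20, 40.06),
         "Denver street map": (-105.11, 39.61, -104.60, 39.91)},
    ),
    "California imagery": (
        (-124.4, 32.5, -114.1, 42.0),
        {"Santa Barbara air photo": (-119.9, 34.3, -119.6, 34.5)},
    ),
}


def overlaps(a, b):
    """True when two boxes share any area or boundary point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def two_level_search(query):
    """Return (collections actually searched, matching item titles)."""
    searched, hits = [], []
    for name, (extent, items) in COLLECTIONS.items():
        if not overlaps(extent, query):
            continue  # level 1: skip collections that cannot contain a match
        searched.append(name)
        # level 2: search only inside the relevant collection
        hits += [title for title, fp in items.items() if overlaps(fp, query)]
    return searched, hits


print(two_level_search((-105.30, 39.96, -105.18, 40.09)))
```

Because each collection is searched independently, each one could sit on a different server and use a different implementation strategy behind the same interface.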


Challenge 5: context

Context is critical for evaluation

Textual context:

[Figure: two examples of textual context, labeled “poem” and “software”]


Geospatial context

Does this answer your question?

[Figure: labeled “Flatirons 1-5”, “Flagstaff Rd.”, and “Green Mountain”]


Challenge 6: client integration

“Click here” approach places large burden on users
  – navigate
  – interpret
  – evaluate
  – download

Service-based access will become predominant
  – just as the WWW replaced FTP

Needed:
  – description/access standards, protocols
  – integration with search constraints