Exploring the Use of Linked Data to Bridge State and Federal Archives Jon Voss, LookBackMaps MARA Guest Lecture San Jose State University June 15, 2010

Page 1: Exploring the Use of Linked Data to Bridge State and Federal Archives

Exploring the Use of Linked Data to Bridge State and Federal Archives

Jon Voss, LookBackMaps
MARA Guest Lecture
San Jose State University
June 15, 2010

Page 2: Exploring the Use of Linked Data to Bridge State and Federal Archives

Overview

1. Quick intro, logistics
2. Evolution and context of the Civil War Data 150 Project: A Very Exciting Time
3. Overview of Civil War Data 150 Project

Halftime Q&A

– Some Technical Details on the Methodology, Tools
– Placing CWD150 in the Big Picture
– Dig Deeper Links

Final Q&A

http://www.loc.gov/pictures/item/cwp2003000505/PP

Page 3: Exploring the Use of Linked Data to Bridge State and Federal Archives

Give me feedback...

email:[email protected]

www.twitter.com/LookBackMaps

Comments welcome, just use @LookBackMaps on Twitter or email me.

Page 4: Exploring the Use of Linked Data to Bridge State and Federal Archives

• www.lookbackmaps.net
• From the perspective of presenting data, not organizing it; coming from the Web community
• Started in 2008 as a Google MyMaps mashup
• Based on the simple idea of creating community around local history.
• Created to solve the problem of disparate archives with no geotags through community and crowdsourcing
• Finding ways to access, display, and improve upon data

Page 5: Exploring the Use of Linked Data to Bridge State and Federal Archives

screenshots from LookBackMaps iPhone app, overlay photos from The Bancroft Library

Page 6: Exploring the Use of Linked Data to Bridge State and Federal Archives

screenshots from LookBackMaps iPhone app, overlay photos from California Historical Society

Page 7: Exploring the Use of Linked Data to Bridge State and Federal Archives

2008 Marks a major shift for public archives

• The Library of Congress and Flickr collaboration spurs the Flickr Commons, and blows open the Web 2.0 door at archives and institutions worldwide.

• Multiple open source collections management and web publishing platforms begin to take hold and lower the barrier to entry for Web 2.0 presentation, collaboration, plugins and extensions

Page 9: Exploring the Use of Linked Data to Bridge State and Federal Archives

LOC/Flickr Commons

Some stats from the LOC summary report a year after launch speak to the success. As of 10/23/08:
• 10.4 million views of LOC photos on Flickr
• 79% of the 4,615 photos have been made a "favorite"
• 67,176 tags were added by 2,518 unique Flickr accounts
• Fewer than 25 instances of user-generated comments were removed as inappropriate.
• More than 500 records have been enhanced with new information provided by the Flickr community.

Page 10: Exploring the Use of Linked Data to Bridge State and Federal Archives

Public Archives in the Web 2.0 Environment

While the majority of archives and libraries remain in a Web 1.0 environment, users have Web 2.0 expectations. Institutions and users are meeting in the middle to build community around holdings.
• Search/Share: Archives want to get their holdings out to a wide-reaching public; users want to search across institutions to discover based on interest, locality, etc.
• Comment/Community: the ability to discuss and engage, create community
• Contribute/Improve: Tag, geotag, crowdsource
• Compare: Then and now. Community identity is often tied to history

Page 11: Exploring the Use of Linked Data to Bridge State and Federal Archives

Stage is set for collaboration and innovation

Mashups, collaborations, shared datasets, open source, open data, and open tools

Bing Maps Streetside Photos (tech preview) http://www.bing.com/maps/explore/#/9gk357c6yqx3jost

Page 12: Exploring the Use of Linked Data to Bridge State and Federal Archives

The more shared data we have, the more we can do with it!   

By the end of 2009, a group of archivists and technologists started exploring collaborative efforts utilizing Linked Data to connect isolated archives and datasets in order to:
• join data in a robust, scalable, community-maintained database
• increase discovery of and traffic to the archives while adding value to the data through crowdsourcing
• make the data searchable and available to other web applications via API and semantic web queries
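
As a rough illustration of the "join data" and "make the data searchable" goals above, here is a minimal sketch that joins records from two isolated archives on a shared identifier and then answers a simple query. The archive variables, field names, identifiers, URLs, and sample records are all invented for illustration; nothing here comes from a real catalog or API.

```python
from collections import defaultdict

# Hypothetical records from two isolated archives. "regiment_id" is an
# invented field standing in for a strong identifier both sources share.
state_archive = [
    {"regiment_id": "regiment/example-1", "name": "Example 1st Infantry", "muster_city": "Detroit"},
    {"regiment_id": "regiment/example-2", "name": "Example 2nd Cavalry", "muster_city": "Lansing"},
]
federal_archive = [
    {"regiment_id": "regiment/example-1", "photo_url": "http://example.org/photo/1"},
    {"regiment_id": "regiment/example-1", "photo_url": "http://example.org/photo/2"},
]

def join_on_identifier(primary, secondary, key):
    """Attach every secondary record to the primary record sharing the same key."""
    by_key = defaultdict(list)
    for record in secondary:
        by_key[record[key]].append(record)
    joined = []
    for record in primary:
        merged = dict(record)  # copy so the source record is left untouched
        merged["related"] = by_key.get(record[key], [])
        joined.append(merged)
    return joined

def regiments_with_photos(joined):
    """A simple query over the joined data: which regiments have linked photos?"""
    return [record["name"] for record in joined if record["related"]]

joined = join_on_identifier(state_archive, federal_archive, "regiment_id")
print(regiments_with_photos(joined))
```

The join only works because both sources use the same identifier for the same regiment, which is the point of agreeing on shared identifiers before anything else.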

Page 13: Exploring the Use of Linked Data to Bridge State and Federal Archives

Archives Metadata Mapping Project  

Two important outcomes:

1. The potential of using Linked Data now by using Freebase as a Linked Data publishing platform.

2. The importance of use cases.

Page 14: Exploring the Use of Linked Data to Bridge State and Federal Archives

There, I said it.  Linked Data.

Providing ways to start linking to DATA, no longer just DOCUMENTS. It entails using tools and standards to make information (like metadata, MARC records, etc) searchable and machine readable.

image: Harry Halpin. http://www.ibiblio.org/hhalpin/homepage/presentations/socialnet/
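
To make the DOCUMENTS versus DATA distinction concrete, the sketch below expresses one catalog record first as an opaque string and then as subject/predicate/object statements that a program can follow. Every URI and property name is a hypothetical placeholder, and the tiny `match` helper is a stand-in for a real semantic web query, not an implementation of any standard.

```python
# A catalog record as a document: readable by people, opaque to machines.
document = "Photograph of an unidentified soldier, 1862, held by Example State Archive"

# The same information as data: (subject, predicate, object) statements.
# All URIs and property names are hypothetical placeholders.
triples = [
    ("http://example.org/photo/42", "type", "Photograph"),
    ("http://example.org/photo/42", "date", "1862"),
    ("http://example.org/photo/42", "heldBy", "http://example.org/org/example-state-archive"),
    ("http://example.org/org/example-state-archive", "type", "Archive"),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# Follow a link from the photo to the institution that holds it,
# then ask what is known about that institution.
holder = match(triples, subject="http://example.org/photo/42", predicate="heldBy")[0][2]
print(match(triples, subject=holder))
```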

Page 15: Exploring the Use of Linked Data to Bridge State and Federal Archives

The Civil War Data 150 Project

Born out of conversations with AMMP participant, Archives of Michigan.

Key ingredients for a strong use case:
• Specific subject matter
• Diverse data in a wide array of institutions
• A passionate user group
• A significant anniversary

Page 16: Exploring the Use of Linked Data to Bridge State and Federal Archives

http://www.flickr.com/photos/usnationalarchives/4166330219/

Three Primary Goals of CWD150:
1. Identify sources and map metadata into Freebase.
2. Create web apps to enable users to add to or modify shared metadata with strong identifiers.
3. Engage the public in the process of interacting with and adding value to the data.
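
A minimal sketch of what the first goal could look like in practice: reading a source export and mapping its columns onto shared property names, with a stable identifier per record. The CSV content, column names, property names, and identifier scheme are all assumptions made up for this example; this does not use the Freebase API or its actual schema.

```python
import csv
import io

# Hypothetical CSV export from one source archive (invented sample rows).
SOURCE_CSV = """Unit Name,State,Yr Organized
Example 1st Infantry,Michigan,1861
Example 2nd Cavalry,Michigan,1862
"""

# Hypothetical mapping from this source's column names to shared property names.
FIELD_MAP = {"Unit Name": "name", "State": "state", "Yr Organized": "year_organized"}

def make_identifier(name):
    """Build a stable, URL-friendly identifier from a unit name (illustrative scheme)."""
    return "/regiment/" + "-".join(name.lower().split())

def map_rows(csv_text, field_map):
    """Rename each source column to its shared property and attach an identifier."""
    mapped = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {target: row[source].strip() for source, target in field_map.items()}
        record["id"] = make_identifier(record["name"])
        mapped.append(record)
    return mapped

records = map_rows(SOURCE_CSV, FIELD_MAP)
for record in records:
    print(record)
```

Each additional source would need only its own `FIELD_MAP`; the shared property names and identifiers are what let later edits from the public land on the same record.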

Page 17: Exploring the Use of Linked Data to Bridge State and Federal Archives

http://www.flickr.com/photos/usnationalarchives/3996142724/

Pause for Q&A 

Page 18: Exploring the Use of Linked Data to Bridge State and Federal Archives

Some Technical Details on the Methodology, Tools

You can follow along and contribute to the project on the Freebase Wiki: http://wiki.freebase.com/wiki/CWD150 

Page 19: Exploring the Use of Linked Data to Bridge State and Federal Archives

Some Technical Details on the Methodology, Tools

1. Identifying primary data sets and ways of getting at the data
   Link to Google Spreadsheet on sources. Web crawling, screen scraping, XML dumps, CSV files, etc.

2. Creating Web Apps
• Once we have metadata mapped in Freebase, we can create RABJ queues. See a simple example: Genderizer.
• Then apply this to data that needs work, like regiments, or a photo queue.
• Work with Civil War historians and others to add to specific schema.
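
The review-queue idea can be sketched as follows, assuming a queue of this kind collects several independent judgments per item and accepts an answer only once enough volunteers agree. This is a generic majority-vote sketch, not the RABJ API; the thresholds, item identifiers, and votes are invented for illustration.

```python
from collections import Counter

def resolve(judgments, min_votes=3, min_agreement=0.66):
    """Return the consensus answer, or None if the item needs more review."""
    if len(judgments) < min_votes:
        return None
    answer, count = Counter(judgments).most_common(1)[0]
    return answer if count / len(judgments) >= min_agreement else None

# Hypothetical queue: item identifier -> answers submitted by different volunteers.
queue = {
    "/person/example-a": ["male", "male", "male"],
    "/person/example-b": ["female", "male", "female"],
    "/person/example-c": ["female"],
    "/person/example-d": ["male", "female", "unknown"],
}

resolved = {item: resolve(votes) for item, votes in queue.items()}
print(resolved)
```

Items that come back as `None` simply stay in the queue for more volunteers, which is how the same pattern could carry over to regiments or a photo queue.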

Page 20: Exploring the Use of Linked Data to Bridge State and Federal Archives

Some Technical Details on the Methodology, Tools

3. Engaging the Public, User Interface Development
• Messaging and powerful images
• An easy interface with game elements and rewards
• A plea for assistance and opportunity to genuinely make records more useful.
• Holy Grail: Civil War Soldier Survival App based on city of enlistment
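
For the "Holy Grail" idea, the core calculation might be as simple as the sketch below, assuming each soldier record carries a city of enlistment and a survival flag. The records are invented sample data, not historical figures, and the field names are hypothetical.

```python
from collections import defaultdict

# Invented sample records; these are not real historical data.
soldiers = [
    {"city": "Example City", "survived": True},
    {"city": "Example City", "survived": False},
    {"city": "Example City", "survived": True},
    {"city": "Sample Town", "survived": False},
]

def survival_rate_by_city(records):
    """Return the share of enlisted soldiers who survived, per city of enlistment."""
    totals = defaultdict(lambda: [0, 0])  # city -> [survivors, enlisted]
    for record in records:
        totals[record["city"]][0] += int(record["survived"])
        totals[record["city"]][1] += 1
    return {city: survivors / enlisted for city, (survivors, enlisted) in totals.items()}

rates = survival_rate_by_city(soldiers)
print(rates)
```

The calculation is trivial; the hard part is the linked, cleaned-up records that would feed it.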

Page 21: Exploring the Use of Linked Data to Bridge State and Federal Archives

The Big Picture

http://www.flickr.com/photos/37377809@N00/4701512132/

Page 22: Exploring the Use of Linked Data to Bridge State and Federal Archives

The Big Picture

• CWD150 is a strong use case and an example of what can become possible in the wider web and developer community if libraries, archives and museums publish their metadata utilizing Linked Data standards and open licenses.

• Our experience is showing us that the technological barriers are not as significant as the institutional barriers around adoption and openness.  But the Flickr Commons Shift has changed that.

• With CWD150, we are side-stepping the Big Next Step of enabling institutions to publish their own metadata as Linked Data, and make meaningful connections. This is on the near horizon.

Page 23: Exploring the Use of Linked Data to Bridge State and Federal Archives

The Big Picture

Libraries, Archives and Museums will be critical to the adoption of Linked Data
• The vast information stored in disparate, isolated databases held by the world's public institutions.
• The expertise held by these institutions in the organization of systems and vocabularies to make sense of this information.
• You can be on the front lines of this movement.

http://www.loc.gov/pictures/item/cwp2003000216/PP

Page 24: Exploring the Use of Linked Data to Bridge State and Federal Archives

Dig Deeper!

Libraries
• OCLC Research Linked Data parts 1 and 2 webinar
• EMTACL10 April 2010. Gillian Byrne & Lisa Goddard: video | slides
• JISC Linked Data Horizon Scan
• Ed Summers is doing Linked Data work with LOC: Twitter | Blog

Archives
• Mark Matienzo Linking as Repurposing Metadata
• Tim Wragge's Flickr Machine tag Challenge

Tools
• Build your own NYT Linked Data Application
• Build apps on Freebase
• Clean vast amounts of data with Gridworks

Tim Berners-Lee
• TED Feb 2009
• TED Feb 2010
• Gov 2.0 Expo May 2010

Page 25: Exploring the Use of Linked Data to Bridge State and Federal Archives

http://www.flickr.com/photos/library_of_congress/3252917783/

What will you do with that data?

Q&A 

Page 26: Exploring the Use of Linked Data to Bridge State and Federal Archives

Give me feedback...

email:[email protected]

www.twitter.com/LookBackMaps

Comments welcome, just use @ or #LookBackMaps on Twitter or email me.