#lod4h Publishing and Using Linked Open Data Richard J. Urban, Ph.D. School of Library and Information Studies Florida State University [email protected] @musebrarian

Publishing and Using Linked Open Data - Day 1



A Gentle Introduction to Linked Open Data. Linked Data Use Cases


Page 1: Publishing and Using Linked Open Data - Day 1

#lod4h

Publishing and Using Linked Open Data

Richard J. Urban, Ph.D.

School of Library and Information Studies
Florida State University
[email protected]
@musebrarian

Page 2: Publishing and Using Linked Open Data - Day 1

#lod4h

January 7, 2013
Monday’s Schedule

• 9:30 - 10:00 Class Session: Participant Introductions

• 10:00- 10:45 Class Session: A Gentle Introduction to Linked Data

• 10:45-11:00 am Break

• 11:00 am- Noon Class Session: Exploring Linked Data Use Cases

• Noon- 1 pm Lunch (on your own)

• 1:00-2:30 pm Class Session: A Gentle Introduction to Linked Data (cont'd)

• 2:30-2:45 pm Break

• 2:45-3:45pm Class Session: Participant Project Kick-off

• 4:00-5:00 pm Lecture: Seb Chan - Location: Ulrich Recital Hall, Tawes Fine Arts Building

• 5:30 pm-7:00 pm Graduate Student Networking Event Hosted by CUNY and MITH

Location: MITH

0301 Hornbake Library (inside Non-Print Media)

Refreshments Provided

Page 3: Publishing and Using Linked Open Data - Day 1

#lod4h

PARTICIPANT INTRODUCTIONS

Page 4: Publishing and Using Linked Open Data - Day 1

#lod4h

A GENTLE INTRODUCTION TO LINKED DATA: PART I

Page 5: Publishing and Using Linked Open Data - Day 1

#lod4h 5

A Web of Documents

Berners-Lee, T. (1989) Information Management: A Proposal http://goo.gl/xh36K

Page 6: Publishing and Using Linked Open Data - Day 1

#lod4h 6

World Wide Web

WWW - documents with simple relationships

Page 7: Publishing and Using Linked Open Data - Day 1

#lod4h 7

An HTML tree

Page 8: Publishing and Using Linked Open Data - Day 1

#lod4h

Document semantics

• XML (and HTML) provides a descriptive markup for documents (including metadata records)

• Even for more complex XML, like TEI, the meaning of many elements depends on their context within a document instance.

• Interpreting this context requires human intervention.

Page 9: Publishing and Using Linked Open Data - Day 1

#lod4h

Organizing the Web

• Human organization
• Crawl and Index
– Uses many of the methods used by digital humanities scholars to extract information from web documents.
• PageRank
– Inferring importance from links

Page 10: Publishing and Using Linked Open Data - Day 1

#lod4h 10

Database-driven Web: Silos of Data

Page 11: Publishing and Using Linked Open Data - Day 1

#lod4h 11

Data-driven documents

WWW - documents with simple relationships

[Diagram labels: “Data”, repeated four times]

Page 12: Publishing and Using Linked Open Data - Day 1

#lod4h 12

• Federated (Z39.50)
• Aggregated (Open Archives Initiative – Protocol for Metadata Harvesting)
• Application Programming Interface (API) (service specific)

Page 13: Publishing and Using Linked Open Data - Day 1

#lod4h

Data Semantics

• Often dependent on human interpretation of documents/standards.

• Local data-provider interpretations not always documented or available to data consumers.

Page 14: Publishing and Using Linked Open Data - Day 1

#lod4h

LINKED OPEN DATA
A New Vision in Two Parts

Page 15: Publishing and Using Linked Open Data - Day 1

#lod4h 15

Linked Data Principles

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)

4. Include links to other URIs, so that they can discover more things.

Page 16: Publishing and Using Linked Open Data - Day 1

#lod4h

“Things” = Resources

A resource can be anything that has identity.

Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources.

The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time. Thus, a resource can remain constant even when its content---the entities to which it currently corresponds---changes over time, provided that the conceptual mapping is not changed in the process.

http://www.ietf.org/rfc/rfc2396.txt

Page 17: Publishing and Using Linked Open Data - Day 1

#lod4h

Uniform Resource Identifiers

• More than a Uniform Resource Locator (URL)

• Provides a mechanism to name resources in a way that works at Internet scale.

http://en.wikipedia.org/wiki/Uniform_resource_identifier

Page 18: Publishing and Using Linked Open Data - Day 1

#lod4h

De-referencing URIs

• When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)

• URIs can be used to name non-networked resources (concepts, people, physical objects, etc.)

• Useful if information about these objects can be returned when the name is used.

• CoolURIs for the Semantic Web http://www.w3.org/TR/cooluris/
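Looking up a URI "for data" usually means an ordinary HTTP GET with an Accept header that asks for an RDF format. The sketch below only prepares such a request. It assumes the server supports content negotiation for Turtle, which is not guaranteed for any particular site, and the function names are hypothetical.

```python
from urllib.request import Request, urlopen


def build_rdf_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare an HTTP GET that asks the server for RDF instead of HTML."""
    # The fragment (the part after '#') is never sent to the server,
    # so the request goes to the document that describes the thing.
    document_uri = uri.split("#", 1)[0]
    return Request(document_uri, headers={"Accept": media_type})


def dereference(uri: str) -> str:
    """Look up a URI and return whatever description the server sends back."""
    with urlopen(build_rdf_request(uri), timeout=10) as response:
        return response.read().decode("utf-8")


# Build (but do not send) a request for the example URI used in these slides.
request = build_rdf_request("http://ex.org/monaLisa#")
print(request.full_url, request.get_header("Accept"))
```

Calling `dereference(...)` on a real Linked Data URI would perform the network lookup; ex.org is only a placeholder domain.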

Page 19: Publishing and Using Linked Open Data - Day 1

#lod4h

Resource Description Framework

• A model for representing data
– An artificial language with a formal semantic model
– Can be expressed using multiple syntaxes
– Simple grammar
• RDF “Triple”
– <subject> <predicate> <object>
– NAME verb Object
– Mona Lisa painted by Leonardo da Vinci
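As a rough illustration of the triple shape, here is a minimal Python sketch that stores the Mona Lisa statement as a three-part record. The `Triple` class is a hypothetical helper for this example, not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str  # "object" is a Python built-in name, hence the shorter field


statement = Triple("Mona Lisa", "painted by", "Leonardo da Vinci")
print(f"{statement.subject} --{statement.predicate}--> {statement.obj}")
```

Every RDF statement, however complex the data set, reduces to records of this three-part form.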

Page 20: Publishing and Using Linked Open Data - Day 1

#lod4h 20

It’s a graph!

• That uses URIs

<http://ex.org/monaLisa#> <http://purl.org/dc/terms/creator> <http://ex.org/daVinci#>
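One way to picture the graph is as a set of triples whose nodes and edges are URIs. The sketch below holds the slide's triple in such a set and prints it in an N-Triples-like form. The title triple is added purely for illustration, and the `("uri" | "literal", value)` encoding of objects is an assumption of this example.

```python
MONA_LISA = "http://ex.org/monaLisa#"
DA_VINCI = "http://ex.org/daVinci#"
CREATOR = "http://purl.org/dc/terms/creator"
TITLE = "http://purl.org/dc/terms/title"

# Each object is tagged so we know whether to write <...> or "...".
graph = {
    (MONA_LISA, CREATOR, ("uri", DA_VINCI)),
    (MONA_LISA, TITLE, ("literal", "Mona Lisa")),  # illustrative addition
}


def to_ntriples(triples) -> str:
    """Write one '<s> <p> o .' line per triple, in sorted order."""
    lines = []
    for subject, predicate, (kind, value) in sorted(triples):
        if kind == "uri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(graph))
```

Because the graph is just a set, adding another statement about the same URI extends the graph without any schema change.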

Page 21: Publishing and Using Linked Open Data - Day 1

#lod4h 21

From a simple language, we can say complex things.

Page 22: Publishing and Using Linked Open Data - Day 1

#lod4h

RDF Data Modeling

• RDF can be used with multiple tools for modeling data

• Simple: RDF Schema (RDFS)
• Robust: Web Ontology Language (OWL)
– OWL-Lite
– OWL-Full

Page 23: Publishing and Using Linked Open Data - Day 1

#lod4h

Limitations

• Best used for simple declarative statements
– Difficult to express meta-assertions, e.g. “john believes that sally is 5’ tall”
– Data provenance/trust
– Negation: “sally is not 5’ tall”
– Tenseless (need to explicitly model time)
– Modeling a “record” (named graphs)
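To see why meta-assertions are awkward, consider RDF's standard reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The sketch below is one possible modeling, not the only one: it turns "sally is 5’ tall" into a resource that John's belief can point at. The ex.org URIs and the `believes` and `height` properties are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://ex.org/"  # hypothetical namespace for this example


def reify(statement_id, subject, predicate, obj):
    """Describe a triple as a resource (an rdf:Statement) using four triples."""
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]


claim = EX + "claim1"
triples = reify(claim, EX + "sally", EX + "height", "5 ft")
# The meta-assertion: John believes the claim (without asserting it is true).
triples.append((EX + "john", EX + "believes", claim))

for triple in triples:
    print(triple)
```

One simple sentence needs five triples, and the original statement about Sally is never asserted directly, which is part of what makes this pattern cumbersome.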

Page 24: Publishing and Using Linked Open Data - Day 1

#lod4h

SPARQL

• SPARQL Protocol and RDF Query Language
– A query language for RDF
– Similar to SQL
– Implemented by RDF publication software (Triplestore)
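At its core, a SPARQL query matches triple patterns containing variables against the graph. The toy matcher below sketches that idea in plain Python. It is not a SPARQL engine, and the prefixed names and sample data are illustrative assumptions.

```python
def match(graph, pattern):
    """Return one dict of variable bindings per triple that fits the pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    results = []
    for triple in sorted(graph):
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(bindings)
    return results


graph = {
    ("ex:monaLisa", "dcterms:creator", "ex:daVinci"),
    ("ex:lastSupper", "dcterms:creator", "ex:daVinci"),
    ("ex:david", "dcterms:creator", "ex:michelangelo"),
}

# Roughly: SELECT ?work WHERE { ?work dcterms:creator ex:daVinci }
pattern = ("?work", "dcterms:creator", "ex:daVinci")
works = [binding["?work"] for binding in match(graph, pattern)]
print(works)
```

A triplestore does the same kind of matching, but with indexes, joins across several patterns, and the full SPARQL syntax.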

Page 25: Publishing and Using Linked Open Data - Day 1

#lod4h

Link to Other Resources

• Include links to other URIs, so that they can discover more things.
– Link to controlled vocabularies/ontologies
– Use existing RDFS/OWL schemas
– Link different representations of the same resources together
• Associate annotations with resources

Page 26: Publishing and Using Linked Open Data - Day 1

#lod4h 26

The Linked Data so far

Page 27: Publishing and Using Linked Open Data - Day 1

#lod4h

Linked Open Data Criteria

★ Available on the web (whatever format), but with an open license

★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)

★★★ As (2), plus a non-proprietary format (e.g. CSV instead of Excel)

★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff

★★★★★ All the above, plus: Link your data to other people’s data to provide context
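Moving from three stars (CSV) to four (RDF with URIs) mostly means minting a URI for each row and mapping each column to a property. The sketch below shows one way this might look. The column names, the ex.org base URI, and the choice of Dublin Core terms are all assumptions made for illustration.

```python
import csv
import io

CSV_TEXT = "id,title,creator\nmonaLisa,Mona Lisa,daVinci\n"  # made-up sample data
BASE = "http://ex.org/"                # hypothetical base for minted URIs
DCTERMS = "http://purl.org/dc/terms/"


def rows_to_triples(csv_text: str):
    """Turn each CSV row into triples about a URI minted from its id column."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['id']}#>"
        # The title stays a literal; the creator becomes a link to another URI.
        triples.append((subject, f"<{DCTERMS}title>", f'"{row["title"]}"'))
        triples.append((subject, f"<{DCTERMS}creator>", f"<{BASE}{row['creator']}#>"))
    return triples


for triple in rows_to_triples(CSV_TEXT):
    print(" ".join(triple), ".")
```

Reaching the fifth star would then mean replacing the locally minted creator URI with a link into someone else's data set.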

Page 28: Publishing and Using Linked Open Data - Day 1

#lod4h

LINKED DATA USE CASES

Page 30: Publishing and Using Linked Open Data - Day 1

#lod4h

A GENTLE INTRODUCTION TO LINKED DATA: PART II

Page 31: Publishing and Using Linked Open Data - Day 1

#lod4h

A Simple Start

• Friend of a Friend (FOAF) http://www.foaf-project.org/

• A simple RDF vocabulary for describing people and their relationships.

Page 32: Publishing and Using Linked Open Data - Day 1

#lod4h

FOAF (Turtle) Syntax

@prefix : <http://xmlns.com/foaf/0.1/> .

# One subject, followed by its FOAF properties
<http://chi.cci.fsu.edu/person/rurban#>
    :name "Richard J. Urban" ;
    :givenName "Richard" ;
    :familyName "Urban" ;
    :homepage <http://chi.cci.fsu.edu/> ;
    :workplaceHomepage <http://slis.fsu.edu> ;
    :workInfoHomepage <http://directory.cci.fsu.edu/richard-urban/> ;
    :publications <http://chi.cci.fsu.edu/person/rurban/publications> ;
    # mbox_sha1sum is a hash string, so it is a quoted literal rather than a URI
    :mbox_sha1sum "e122ce3b5475f25d5824e02574806b5e116b2662" ;
    :weblog <http://www.inherentvice.net> .

Page 33: Publishing and Using Linked Open Data - Day 1

#lod4h

• http://goo.gl/PgdqN

Page 34: Publishing and Using Linked Open Data - Day 1

#lod4h

FOAF <XML> Syntax

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <!-- rdf:about names the subject; each child element is one property -->
  <rdf:Description rdf:about="http://chi.cci.fsu.edu/person/rurban#">
    <foaf:name>Richard J. Urban</foaf:name>
    <foaf:givenName>Richard</foaf:givenName>
    <foaf:familyName>Urban</foaf:familyName>
    <foaf:homepage rdf:resource="http://chi.cci.fsu.edu/" />
    <foaf:workplaceHomepage rdf:resource="http://slis.fsu.edu" />
    <foaf:workInfoHomepage rdf:resource="http://directory.cci.fsu.edu/richard-urban/" />
    <foaf:publications rdf:resource="http://chi.cci.fsu.edu/person/rurban/publications" />
    <!-- the hash is a literal value, not a resource -->
    <foaf:mbox_sha1sum>e122ce3b5475f25d5824e02574806b5e116b2662</foaf:mbox_sha1sum>
    <foaf:weblog rdf:resource="http://www.inherentvice.net" />
  </rdf:Description>
</rdf:RDF>
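The XML syntax encodes the same triples as the Turtle version. As a rough check, the sketch below reads a shortened copy of the description with Python's standard XML parser and lists the triples it finds. It assumes the flat rdf:Description layout shown here and handles nothing else; a real RDF/XML parser covers many more forms.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://chi.cci.fsu.edu/person/rurban#">
    <foaf:name>Richard J. Urban</foaf:name>
    <foaf:weblog rdf:resource="http://www.inherentvice.net" />
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdf_xml: str):
    """Extract (subject, predicate, object) from flat rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like '{namespace}local'; join them into a URI.
            predicate = prop.tag[1:].replace("}", "")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "")
            triples.append((subject, predicate, obj))
    return triples


for triple in simple_triples(RDF_XML):
    print(triple)
```

Note how the namespace plus the element name yields the same property URI that the Turtle prefix produces.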

Page 35: Publishing and Using Linked Open Data - Day 1

#lod4h

Page 36: Publishing and Using Linked Open Data - Day 1

#lod4h

Basic Turtle

• Terse RDF Triple Language http://www.w3.org/TeamSubmission/turtle/

• Always start with a @prefix to declare a namespace for each schema you will use in your graph
– Can mix/match any published RDF schema

@prefix : <http://xmlns.com/foaf/0.1/> .
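A prefix is only shorthand: a Turtle parser replaces it with the declared namespace to recover the full property URI. A minimal sketch of that expansion follows. The `dcterms` entry is added for illustration, and `expand` is a hypothetical helper.

```python
PREFIXES = {
    "": "http://xmlns.com/foaf/0.1/",        # the empty prefix, as declared above
    "dcterms": "http://purl.org/dc/terms/",  # illustrative second schema
}


def expand(prefixed_name: str, prefixes=PREFIXES) -> str:
    """Expand 'prefix:local' into a full URI using the declared namespaces."""
    prefix, _, local = prefixed_name.partition(":")
    return prefixes[prefix] + local


print(expand(":name"))
print(expand("dcterms:creator"))
```

This is why terms from several vocabularies can sit side by side in one graph: each expands to a distinct, globally unique URI.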

Page 37: Publishing and Using Linked Open Data - Day 1

#lod4h

FOAF Properties
http://xmlns.com/foaf/spec/

• FOAF Core: Agent, Person, name, title, img, depiction (depicts), familyName, givenName, knows, based_near, age, made (maker), primaryTopic (primaryTopicOf), Project, Organization, Group, member, Document, Image

• Social Web: nick, mbox, homepage, weblog, openid, jabberID, mbox_sha1sum, interest, topic_interest, topic (page), workplaceHomepage, workInfoHomepage, schoolHomepage, publications, currentProject, pastProject, account, OnlineAccount, accountName, accountServiceHomepage, PersonalProfileDocument, tipjar, sha1, thumbnail, logo

Page 38: Publishing and Using Linked Open Data - Day 1

#lod4h

Get Yourself a URI

• Can use a CoolURI based on your homepage
• A mailto:[email protected]
• A “blank node” _:me (although these are discouraged for Linked Data)

Page 39: Publishing and Using Linked Open Data - Day 1

#lod4h

@prefix : <http://xmlns.com/foaf/0.1/> .

<http://chi.cci.fsu.edu/person/rurban#>
    :name "Richard Urban" ;
    :homepage <http://chi.cci.fsu.edu> .

URIs are always enclosed in angle brackets.

Properties start with a colon here because FOAF is declared as the empty prefix. Strings are in straight double quotes.

Each property/value pair for the same subject ends with a semicolon...

...except the last one, which ends with a period.
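These punctuation rules are mechanical enough to generate. The sketch below writes a small FOAF description following them. `to_turtle` is a hypothetical helper that covers only the cases on this slide (one subject, the empty FOAF prefix, plain strings and URIs).

```python
def to_turtle(subject_uri, properties, foaf_ns="http://xmlns.com/foaf/0.1/"):
    """Serialize (name, value, is_uri) tuples about one subject as Turtle."""
    if not properties:
        raise ValueError("a subject needs at least one property")
    lines = [f"@prefix : <{foaf_ns}> .", "", f"<{subject_uri}>"]
    last = len(properties) - 1
    for index, (name, value, is_uri) in enumerate(properties):
        if is_uri:
            obj = f"<{value}>"          # URIs go in angle brackets
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'        # strings go in straight quotes
        end = "." if index == last else ";"  # period only after the last pair
        lines.append(f"    :{name} {obj} {end}")
    return "\n".join(lines)


print(to_turtle(
    "http://chi.cci.fsu.edu/person/rurban#",
    [("name", "Richard Urban", False),
     ("homepage", "http://chi.cci.fsu.edu", True)],
))
```

The printed text reproduces the example on this slide, which makes the helper a quick way to check hand-written Turtle against the rules.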

Page 40: Publishing and Using Linked Open Data - Day 1

#lod4h

Hands-on

• Open a text editor.
• Write a FOAF description for yourself using the Turtle syntax.
– http://xmlns.com/foaf/spec/
– http://www.w3.org/TeamSubmission/turtle/
• Save the file with a .ttl extension
– yourName.ttl

Page 41: Publishing and Using Linked Open Data - Day 1

#lod4h

Publishing Your FOAF

• Put the file online, link it from your website.
• Publish using an RDF Triplestore
• Use FOAF-based plugins for Wordpress/Drupal, etc.

Page 42: Publishing and Using Linked Open Data - Day 1

#lod4h

Sesame Triple Store

• Let’s use my sandbox for this week:
– http://goo.gl/PgdqN
• Select the DHWI repository
• Select ADD
• Context baseURL: http://chi.cci.fsu/dhwi
• Paste your Turtle into the RDF box.

• All of us together:

Page 43: Publishing and Using Linked Open Data - Day 1

#lod4h

Linking our FOAF together.

• I know we just met, and this is crazy, but…

:knows <http://chi.cci.fsu.edu/person/rurban#>

• Add the URI of anyone else in the class you know.
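Because everyone refers to people by the same URIs, separately written FOAF files merge into one graph with no extra work. The sketch below illustrates this with two made-up participants; the example.org URIs are hypothetical.

```python
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
RURBAN = "http://chi.cci.fsu.edu/person/rurban#"
ALICE = "http://example.org/people/alice#"  # hypothetical participant
BOB = "http://example.org/people/bob#"      # hypothetical participant

# Two separately authored FOAF files, reduced to their foaf:knows triples.
alice_file = {(ALICE, FOAF_KNOWS, RURBAN)}
bob_file = {(BOB, FOAF_KNOWS, RURBAN), (BOB, FOAF_KNOWS, ALICE)}

# Merging RDF graphs is set union; shared URIs connect the data.
merged = alice_file | bob_file


def known_by(graph, person):
    """List everyone who states that they know the given person."""
    return sorted(s for s, p, o in graph if p == FOAF_KNOWS and o == person)


print(known_by(merged, RURBAN))
```

Loading everyone's Turtle into the same repository has the same effect: the class's individual descriptions become one connected social graph.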

Page 44: Publishing and Using Linked Open Data - Day 1

#lod4h

Some FOAF Humanities Use Cases

• Virtual International Authority File http://www.viaf.org

• Social Networks and Archival Context http://socialarchive.iath.virginia.edu/

• Linking Lives http://data.archiveshub.ac.uk/page/person/ncarules/skinnerbeverley1938-1999artist

• DBpedia http://dbpedia.org/data/Abraham_Lincoln.n3

Page 45: Publishing and Using Linked Open Data - Day 1

#lod4h

Beyond FOAF

• Organization Ontology http://www.w3.org/TR/vocab-org/

• Encoded Archival Context - Corporate Bodies, Persons, and Families Ontology http://goo.gl/oFIkW

• Other domain ontologies with representations of people.

Page 46: Publishing and Using Linked Open Data - Day 1

#lod4h

BREAK

Page 47: Publishing and Using Linked Open Data - Day 1

#lod4h

Participant Projects

• What’s a small linked data project you can complete in the next few days?
– Explore modeling questions
   • Identify existing models
– Create/transform some data
   • What data is already out there?
– Publish some examples
– Explore potential applications

Page 48: Publishing and Using Linked Open Data - Day 1

#lod4h

Tonight’s Events

• 4:00-5:00pm Lecture: Seb Chan
– Location: Ulrich Recital Hall in Tawes Fine Arts Building
• 5:30pm-7:00pm Graduate Student Networking Event
– Hosted by CUNY and MITH; Location: MITH, 0301 Hornbake Library inside Non-Print Media
– Refreshments Provided