AIB CILW 2016 Conference, Rome, October 21, 2016

Because the web of data doesn’t organize itself
OCLC Research’s contributions to linked data in the library community

Titia van der Werf
Senior Program Officer

Because the web of data itself - COnnecting REpositories · 2017. 10. 13. · An OCLC Research View of the Linked Data Landscape Author: smithyok@oclc.org;godby@oclc.org Keywords:

The two models of the Web

Web of Documents
• Web pages or other documents
• Human-readable text
• Independent
• Static

Web of Data
• Statements about entities, or ‘Things’
• Machine-processable data
• Integrated
• Actionable

An example: a Knowledge Card

Entities and relationships

• Person: Albert Einstein
• Work: Relativity: The Special and General Theory
• Subject: Physics

Relationships: the Work has author Albert Einstein; the Work is about Physics.

…linked for machine understanding

• Wikidata and VIAF: https://www.wikidata.org/wiki/Q937 and http://viaf.org/viaf/75121530
• WorldCat Works: http://experiment.worldcat.org/entity/work/data/369081611
• Library of Congress Subject Headings: http://id.loc.gov/authorities/subjects/sh85101653.html

Relationships: author, about
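As a rough illustration of what "linked for machine understanding" can mean in practice, the sketch below writes the slide's entities and relationships as subject-predicate-object triples. The identifiers are the URIs shown on the slide. The schema.org property URIs and the helper names (`objects`, `to_ntriples`) are assumptions made for this sketch; the slides do not say which vocabulary is used.

```python
# Minimal sketch: the slide's entities and relationships as triples.
# Identifier URIs are copied from the slide. The schema.org properties
# below are an ASSUMPTION for illustration, not taken from the slides.

WORK = "http://experiment.worldcat.org/entity/work/data/369081611"
PERSON_VIAF = "http://viaf.org/viaf/75121530"
PERSON_WIKIDATA = "https://www.wikidata.org/wiki/Q937"
SUBJECT = "http://id.loc.gov/authorities/subjects/sh85101653.html"

AUTHOR = "http://schema.org/author"    # assumed predicate for "author"
ABOUT = "http://schema.org/about"      # assumed predicate for "about"
SAME_AS = "http://schema.org/sameAs"   # assumed link between the two person URIs

triples = [
    (WORK, AUTHOR, PERSON_VIAF),
    (WORK, ABOUT, SUBJECT),
    (PERSON_VIAF, SAME_AS, PERSON_WIKIDATA),
]


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def to_ntriples(rows):
    """Serialize URI-only triples in N-Triples style, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in rows)


if __name__ == "__main__":
    print(to_ntriples(triples))
    print(objects(WORK, AUTHOR))
```

Because every node is a URI rather than a text string, a program can follow the author link from the work to the person and on to the Wikidata entry without parsing any human-readable text.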

THE OCLC RESEARCH INTERNATIONAL LINKED DATA SURVEYS FOR IMPLEMENTERS

KAREN SMITH-YOSHIMURA

Linked Data Survey Respondents

Geographic breakdown of 90 responding institutions; 20 countries represented

[Bar chart: responding institutions per country, axis 0 to 45. Countries: USA, Spain, UK, The Netherlands, Norway, Canada, Australia, France, Germany, Italy, Switzerland, Austria, Czech Republic, Hungary, Ireland, Japan, Malaysia, Portugal, Singapore, Sweden.]

2015 responding institutions by type

[Pie chart. Institution types: Academic library, National library, Network, Government, Scholarly, Public Library, Museum, Other. Slice labels: 31%, 20%, 14%, 10%, 8%, 7%, 4%, 6%.]

What is published as linked data

[Bar chart, axis 0 to 60. Categories: Authority files, Bibliographic data, Data about museum objects, Datasets, Descriptive metadata, Digital collections, Encoded archival descriptions, Geographic data, Ontologies/vocabularies, Other.]

Barriers to publishing linked data

• Steep learning curve for staff
• Inconsistent legacy data
• Difficulties in
– selecting appropriate ontologies to model data
– establishing links
• Little documentation or advice on how to build the systems

2015 linked data resources most consumed

• VIAF
• DBpedia
• GeoNames
• id.loc.gov
• “Resources we convert to linked data ourselves”
• Getty's Art and Architecture Thesaurus
• FAST (Faceted Application of Subject Terminology)
• WorldCat.org
• data.bnf.fr
• Deutsche National Bib Linked Data Service

DBpedia

Libraries, publishing

Life sciences

Social networking

Government

Barriers to consuming linked data

• Unreliable quality of published linked data
– not always reusable
– lack of authority control or URIs
– stale or obsolete datasets
• Difficulty understanding its structure and meaning
• Matching, disambiguating, and aligning locally produced data with third-party resources
• Mapping vocabulary
• Size of RDF datasets (too large or too small)

Side labels on the slide: Maturity, Analysis, Implementation

Reasons for publishing and consuming linked data

• Expose our data to larger Web audience
• Demonstrate what can be done
• Heard about it and wanted to try it
• Improve SEO
• Create a richer user experience
• Enhance our own data
• Improve internal metadata management
• Achieve greater accuracy and scope in search results
• Experiment with data integration

Groupings shown on the slide: Publishing, Consuming, Both

OCLC RESEARCH’S CONTRIBUTIONS

WorldCat growth since 1998

Year    Millions of records
1998    39
1999    41
2000    44
2001    47
2002    50
2003    52
2004    55
2005    61
2006    67
2007    86
2008    108
2009    139
2010    197
2011    236
2012    264

As of 27 April 2012
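The yearly totals on the growth slide lend themselves to a quick calculation. The sketch below is only an illustration built from the slide's figures: it computes the overall growth factor and finds the year with the largest absolute increase. The variable names are made up for this example, and the 2012 value covers only part of that year (as of 27 April).

```python
# Figures copied from the "WorldCat growth since 1998" slide (millions of records).
# Note: the 2012 value is a partial-year figure, as of 27 April 2012.
records_millions = {
    1998: 39, 1999: 41, 2000: 44, 2001: 47, 2002: 50,
    2003: 52, 2004: 55, 2005: 61, 2006: 67, 2007: 86,
    2008: 108, 2009: 139, 2010: 197, 2011: 236, 2012: 264,
}

years = sorted(records_millions)

# Overall growth factor between the first and last year on the chart.
growth_factor = records_millions[years[-1]] / records_millions[years[0]]

# Absolute increase over the previous year, for every year after the first.
increases = {y: records_millions[y] - records_millions[y - 1] for y in years[1:]}

# Year with the largest absolute increase.
peak_year = max(increases, key=increases.get)

if __name__ == "__main__":
    print(f"Growth factor {years[0]}-{years[-1]}: {growth_factor:.2f}x")
    print(f"Largest yearly increase: {increases[peak_year]} million in {peak_year}")
```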

Aggregating data

In aggregations:
• data lose their local context
• data get lost in the bigger context

Making sense of data at the aggregate level:
• FRBR
• GLIMIR
• VIAF
• FAST
• Mining for entities/names

FRBRisation of WorldCat: 2006 - now

Diagram labels: Manifestations, Reproductions, Translations, Works

GLIMIR: Clustering records which differ in language and cataloguing rules

2014: 197 million bibliographic work descriptions available as Linked Data
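To make the idea of clustering records into works more concrete, here is a deliberately naive sketch that groups records by a normalized author and title key. It is not OCLC's FRBR or GLIMIR algorithm, which the slides do not describe; the record fields, identifiers, and function names are hypothetical and chosen only for illustration.

```python
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents, and replace punctuation with spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c if c.isalnum() else " " for c in no_marks.lower())
    return " ".join(kept.split())


def cluster_key(record):
    """Build a naive work key from the (hypothetical) author and title fields."""
    return (normalize(record["author"]), normalize(record["title"]))


def cluster(records):
    """Group record ids that share the same normalized author/title key."""
    groups = defaultdict(list)
    for record in records:
        groups[cluster_key(record)].append(record["id"])
    return dict(groups)


# Hypothetical sample records: two catalogued with different punctuation
# and capitalization, one carrying a German-language title.
records = [
    {"id": "rec-1", "author": "Einstein, Albert",
     "title": "Relativity: the special and general theory"},
    {"id": "rec-2", "author": "EINSTEIN, ALBERT.",
     "title": "Relativity : the special and general theory."},
    {"id": "rec-3", "author": "Einstein, Albert",
     "title": "Über die spezielle und die allgemeine Relativitätstheorie"},
]

clusters = cluster(records)

if __name__ == "__main__":
    for key, ids in clusters.items():
        print(key, ids)
```

In this sample the first two records fall into one cluster, while the record with the German title stays separate. String normalization alone cannot bring together records that differ in language, which is the kind of gap the slide attributes to GLIMIR.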

VIAF

Virtual International Authority File

• Merge of 24+ national level authority files
• Cooperative program run by OCLC
• Initiated by LoC, DNB, BnF and OCLC
• 29 million authority records
• 112 million bibliographic records
• Migrated from an OCLC Research project to an OCLC service in 2012
• VIAF is available as linked data

OCLC’s linked data resources

WorldCat Catalog

WorldCat Works

FAST

VIAF

ISNI

The EntityJS explorer

Show related entities

WHAT WE’VE LEARNED

Linked data in the library community: Where the effort is focused

Data publishing

Data consumption

Application development

?

Why linked data?

• Replicate existing library functions more cheaply and efficiently
• Improve data integration
• A better user experience
• Greater Web visibility
• Develop better models of resources not well served by current standards
• Improve internal data management

Library linked data is not…

• A silver bullet
• A killer app
• A panacea

But it is...

• The result of cumulative and joint effort

Together we make breakthroughs possible.

Acknowledgements

Jean Godby

Karen Smith-Yoshimura

Comments?

Titia van der Werf
