@LorcanD Lorcan Dempsey, OCLC 11 October 2013 ARL Fall Forum: Mobilizing the research enterprise #ARLforum13 SHARE Discovery:Focus on papers

SHARE: Discovery: A Focus on Papers

DESCRIPTION

Presented by Lorcan Dempsey 11 October 2013 at ARL Fall Forum 2013: Mobilizing the Research Enterprise, Arlington, Virginia (USA). http://www.arl.org/events/upcoming-events/arl-fall-forum-2013 http://www.oclc.org/research/presentations.html

Page 1: SHARE: Discovery: A Focus on Papers

@LorcanD

Lorcan Dempsey, OCLC 11 October 2013

ARL Fall Forum: Mobilizing the research enterprise

#ARLforum13

SHARE Discovery: Focus on papers

Page 2: SHARE: Discovery: A Focus on Papers

Aggregation is a pain

Page 3: SHARE: Discovery: A Focus on Papers

Shenghui Wang (OCLC), Antoine Isaac (Europeana), Valentine Charles (Europeana), Rob Koopman (OCLC), Anthi Agoropoulou (Europeana), and Titia van der Werf (OCLC). Hunting for Semantic Clusters: Hierarchical Structuring of Cultural Heritage Objects within Large Aggregations. 17th International Conference on Theory and Practice of Digital Libraries (TPDL), 22-26 September 2013, Valletta (Malta).

Page 4: SHARE: Discovery: A Focus on Papers

Duplicates

Page 5: SHARE: Discovery: A Focus on Papers

Duplicates? Same object: different providers

Page 6: SHARE: Discovery: A Focus on Papers

Duplicates? Same page: different digital copies

Page 7: SHARE: Discovery: A Focus on Papers

Cataloging error

Harvested – points to repository splash page

Analytic – essay in book

Catalan translation

Loaded from Crossref

Loaded from Elsevier

Three ‘expressions’.

Cataloging now fixed

Page 8: SHARE: Discovery: A Focus on Papers

Cross repository record matching issues – confused identities

• Different data models
  – Mapping is lossy.
  – Relationship issues, e.g.
    • Preprint
    • Published article
    • Publisher splash page
    • Repository splash page
    • …
• Replication of content across repositories
• Different content and ‘fullness’ standards
• Granularity issues
  – What is being described?
• ‘Business’ issues
  – Publisher wants separate display?
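
The matching issues above can be illustrated with a minimal sketch. It assumes each harvested record is a plain dictionary with hypothetical fields (`source`, `kind`, `title`, `year`, `doi`); none of these names come from the slides, and the matching rule (DOI first, normalized title plus year as a fallback) is only one possible heuristic, not a description of any production system. The `kind` field is kept on every record so that a preprint and a published article can be clustered without being collapsed into one description.

```python
from __future__ import annotations

import re
from collections import defaultdict


def normalize_title(title: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", title.lower()).split())


def match_key(record: dict) -> str:
    """Prefer a DOI when present; otherwise fall back to title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    return "title:" + normalize_title(record["title"]) + "|" + str(record.get("year", ""))


def cluster(records: list[dict]) -> dict[str, list[dict]]:
    """Group records that share a match key; each record keeps its own 'kind'."""
    clusters = defaultdict(list)
    for record in records:
        clusters[match_key(record)].append(record)
    return dict(clusters)


# Hypothetical records from three providers (illustrative data only).
records = [
    {"source": "repository-a", "kind": "preprint",
     "title": "Hunting for Semantic Clusters", "year": 2013, "doi": None},
    {"source": "publisher", "kind": "published article",
     "title": "Hunting for semantic clusters.", "year": 2013, "doi": None},
    {"source": "repository-b", "kind": "repository splash page",
     "title": "A Different Paper", "year": 2013, "doi": "10.1234/EXAMPLE"},
]

clusters = cluster(records)
for key, members in clusters.items():
    print(key, [m["kind"] for m in members])
```

Even this small sketch shows the lossy side of matching: a record that carries a DOI and a record for the same paper that lacks one would land in different clusters, which is the kind of confused identity the slide describes.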

Page 9: SHARE: Discovery: A Focus on Papers

From strings to things … an emerging pattern?

Search engines

Linked data

Page 10: SHARE: Discovery: A Focus on Papers

The social graph

Page 11: SHARE: Discovery: A Focus on Papers

Three benefits according to Google:

1. Find the right thing
2. Get the best summary
3. Go deeper and broader

Within a discovery service …

1. Aspire to a singular identity for entities/things (people, works, places, organizations, …)
2. Gather data associated with those identities (e.g. ‘cards’)
3. Create relationships between identities.

Page 19: SHARE: Discovery: A Focus on Papers

Make data work harder so that the user doesn’t

• Create singular identity (‘entification’)
• Gather information about entities (e.g. cards)
• Create relationships between entities (navigation – citation, co-creation, derivative, affiliation, recommendations, …)
• Strongly leverage four types of metadata about things …
  – ‘Professional’
  – Crowdsourced (claiming profiles, …)
  – Programmatically promoted (entity extraction, categorization, clusters, …)
  – Usage (relationships based on usage)
• Now: shredding records
• Future: manage entities in linked data world
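
As a rough illustration of ‘shredding records’ into entities, the sketch below assumes that each bibliographic record already carries identifiers for the work and its authors. The record layout, the `Entity` class, the `shred` function, and the `creator_of` relationship name are all hypothetical and chosen only for this example; in practice, assigning those identifiers is the hard part.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    entity_id: str
    kind: str  # e.g. "person" or "work"
    labels: set = field(default_factory=set)  # name or title variants seen so far


def shred(records):
    """Split flat records into one entity per identifier plus explicit relationships."""
    entities = {}
    relationships = set()

    def entity(entity_id, kind, label):
        # Reuse the existing entity for a known identifier; gather the new label on it.
        found = entities.setdefault(entity_id, Entity(entity_id, kind))
        found.labels.add(label)
        return found

    for record in records:
        work = entity(record["work_id"], "work", record["title"])
        for author in record["authors"]:
            person = entity(author["id"], "person", author["name"])
            relationships.add((person.entity_id, "creator_of", work.entity_id))
    return entities, relationships


# Hypothetical records: the same person appears under two name forms.
records = [
    {"work_id": "work:1", "title": "First Paper",
     "authors": [{"id": "person:1", "name": "A. Author"}]},
    {"work_id": "work:2", "title": "Second Paper",
     "authors": [{"id": "person:1", "name": "Author, A."}]},
]

entities, relationships = shred(records)
print(sorted(entities["person:1"].labels))
print(sorted(relationships))
```

The person ends up as a single entity carrying both name variants, and the link to each work becomes an explicit relationship that a discovery service could navigate.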

• Plural - Work with what you have.

• Wikipedia – an addressable knowledgebase

• Wikidata/Freebase – source of structured data

Page 20: SHARE: Discovery: A Focus on Papers

[Diagram: A small example of links/entities. Nodes: National Libraries, VIAF Matching Algorithm, English Wikipedia, German Wikipedia, Other Wikipedias, Wikidata Wikibase, 3rd Party Users. Flow labels: “VIAF matches Articles / Wikipedia shows matched IDs”; “Submit VIAF IDs / Show centralized data” (shown twice); “Read data”.]

Page 22: SHARE: Discovery: A Focus on Papers

The scholarly graph?

• Architecture components
  – Author IDs
  – Paper/work IDs
  – Institutions?
• Signals of interest
  – Research analytics
  – Research workflow
• Questions
  – What is the role of libraries/SHARE/…?
  – Vivo?
  – Who will manage entity backbones in linked data world?

Page 23: SHARE: Discovery: A Focus on Papers

Questions and issues …

Page 24: SHARE: Discovery: A Focus on Papers

Repository scope

Campus bibliography
*-prints
Digital materials

Tactical ‘structure up’/SEO

More links to entities in records - Identifiers

Orcid, ISNI, VIAF, …
DOI, Pubmed ID, …

Schema.org markup
Site maps; ResourceSync
What do hubs want to see? (e.g. Scholar)
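
One way a repository might combine identifiers with Schema.org markup is sketched below. It builds a JSON-LD description using the Schema.org `ScholarlyArticle` and `Person` types and the `sameAs` property to point at DOI and ORCID URLs. The function name `scholarly_article_jsonld` and the title, DOI, and ORCID values are placeholders, and what any particular hub actually reads from such markup is not something the slides state.

```python
import json


def scholarly_article_jsonld(title, doi, authors):
    """Build a Schema.org JSON-LD description linking a paper and its authors to identifiers."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        "sameAs": f"https://doi.org/{doi}",
        "author": [
            {
                "@type": "Person",
                "name": author["name"],
                "sameAs": f"https://orcid.org/{author['orcid']}",
            }
            for author in authors
        ],
    }


# Placeholder values for illustration only; not real identifiers.
doc = scholarly_article_jsonld(
    title="An Example Paper",
    doi="10.1234/example",
    authors=[{"name": "A. Author", "orcid": "0000-0000-0000-0000"}],
)

# A repository page could embed this inside <script type="application/ld+json">.
print(json.dumps(doc, indent=2))
```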

Page 25: SHARE: Discovery: A Focus on Papers

Purposeful syndication

Share data with network/disciplinary hubs

A discovery service?

A discovery destination? The bar is getting higher …

A source of data for others?

Page 26: SHARE: Discovery: A Focus on Papers

Sourcing and scaling … Workflow, Repository, Disclosure, Discovery, …

Scaling
Rightscaling
Different things done at different scales
Institution, Consortium, ARL, world?

Sourcing
Collaboratively sourced?
Third party?
Existing agency?
Multiple approaches?

Page 27: SHARE: Discovery: A Focus on Papers

Discovery and SHARE

What is SHARE’s role in creating and/or maintaining the scholarly graph?

Page 28: SHARE: Discovery: A Focus on Papers

Credits

Page 29: SHARE: Discovery: A Focus on Papers

Acknowledging kind advice from …

• Max Klein, Merrilee Proffitt, Karen Smith-Yoshimura, Thom Hickey (Wikidata/Wikipedia/VIAF)

• Shenghui Wang, Rob Koopman, Titia van der Werf (clustering and Europeana data)

• Jeff Young, Eric Childress

Page 30: SHARE: Discovery: A Focus on Papers

©2013 OCLC. This work is licensed under a Creative Commons Attribution 3.0 Unported License. Suggested attribution: “This work uses content from [presentation title] © OCLC, used under a Creative Commons Attribution license: http://creativecommons.org/licenses/by/3.0/”

@LorcanD