Linked Data Love: research representation, discovery, and assessment
Kristi Holmes, PhD @kristiholmes
Linked Library Data Interest Group #alaac15 - June 27, 2015


The Semantic Web: a value proposition

At its heart, the Semantic Web is really about extending standard Web technologies to better deal with data on the Web.

If the WWW is for people, the Semantic Web is for machines

George Thomas and Jim Hendler, http://www.data.gov/communities/node/116/blogs/142

Data modeled as bidirectional relationships

Web-based infrastructure of standards and technologies which allows for a distributable, machine readable description of data that allows for stronger data and smart web application linkages
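To make "data modeled as bidirectional relationships" concrete, here is a minimal sketch in plain Python. It is an illustration only: the `ex:` identifiers, the predicate names, and the `INVERSE` table are hypothetical and are not taken from the talk or from any particular ontology. Each statement is a (subject, predicate, object) triple, and every asserted relationship also becomes navigable from the other end.

```python
# Hypothetical inverse-predicate table; names are illustrative only.
INVERSE = {
    "authorOf": "hasAuthor",
    "hasAuthor": "authorOf",
    "memberOf": "hasMember",
    "hasMember": "memberOf",
}


def with_inverses(triples):
    """Return the triples plus the inverse of each triple whose predicate has a known inverse."""
    result = set(triples)
    for subject, predicate, obj in triples:
        inverse = INVERSE.get(predicate)
        if inverse is not None:
            result.add((obj, inverse, subject))
    return result


def related(triples, node):
    """List (predicate, other_node) pairs for every triple that has `node` as its subject."""
    return sorted((p, o) for s, p, o in triples if s == node)


# Two asserted statements about a hypothetical person.
asserted = {
    ("ex:person/1", "authorOf", "ex:article/42"),
    ("ex:person/1", "memberOf", "ex:dept/library"),
}
graph = with_inverses(asserted)

# Starting from the article, the author is now reachable as well.
print(related(graph, "ex:article/42"))
```

Because both directions are stored, an application that starts from the article (or the department) can find the person without knowing in which direction the statement was originally written.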

Let’s talk about the data…

The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other related data.

http://www.w3.org/DesignIssues/LinkedData.html

Let’s think about why this is important on the institutional level…

•  Research is increasingly interdisciplinary
•  How can you find collaborators, track competitors, and stay abreast of current research inside large institutions, at other institutions, and globally?
•  How can you find others with shared interests or expertise?
•  How can you build diverse teams? Find mentors? Be identified as a partner by community groups?

Faculty

•  Library administration or directors of core facilities want to align their strategic plan with the evolving research needs of their clientele.

•  Identifying growth areas of research through increasing publications, focused areas of research and grant dollars enables this task to become more evidence-based.

Support: facilities and personnel

•  Research institutions can be extremely large and diverse
•  How can administrators showcase and monitor research activity, track competitors, and stay abreast of current research inside large institutions, at other institutions, and globally?
•  How can you enhance visibility and present a unified picture of an institution?

Administrators

We face a number of challenges on our campuses!

Research networking can help. Information about scholars is optimized using a Web-based infrastructure of standards and technologies which allows for a distributable, machine readable description of data that allows for stronger data and smart web application linkages across many universities, agencies, societies both within the US and abroad.

Why is this important? Linked data infrastructure allows for
•  Visualizations, research and clinical data integration, and deep semantic searching across multiple types and sources of data
•  By breaking data out of traditional database silos, research networking platforms promote a network effect within a single site and across multiple sites
   –  The value of the network increases with the amount of linked data and applications that are available to consume the linked data.

Let’s talk about research networking in the context of VIVO – what is it?

1.  An open source semantic web application
2.  An information model
3.  An open community

What is VIVO?

1.  An open source semantic web application
2.  An information model
3.  An open community

VIVO is one research networking platform, although there are others. Organizations make decisions about adopting these tools based on many different features. The most important aspect isn’t the software; it is the data! More on that later…

VIVO: An open-source semantic web application that enables the discovery of research and scholarship across disciplines in an institution.

VIVO harvests data from verified sources and offers detailed profiles of faculty and researchers.

Public, structured linked data about investigators' interests, activities and accomplishments, and tools to use that data to advance science.

VIVO enjoys a robust open community space to support implementation, adoption, and development efforts around the world. See http://wiki.duraspace.org/display/VIVO

A VIVO profile allows you to:

Showcase credentials, expertise, skills, and professional achievements for individuals and campus groups.

Connect within focus areas and geographic expertise.

Simplify reporting tasks and link data to external applications – e.g., to generate biosketches or CV or for reporting purposes.

Publish the URL or link the profile to other applications.

Discover potential colleagues or campus resources by work area, authorship, & collaborations.

Display visualizations of expertise areas or complex collaboration networks and relationships.

What is VIVO?

1.  An open source semantic web application
2.  An information model
3.  An open community

CTSA: Recommendations and Best Practices for Research Networking

The Research Networking Recommendations were approved by the CTSA Consortium Executive and Steering Committee on October 25, 2011.

Recommendations for Research Networking:

•  Recommendation: All CTSAs should encourage their institution(s) to implement research networking tool(s) institution-wide that utilize RDF triples and an ontology compatible with the VIVO ontology.

•  Recommendation: Information in people profiles at institutions should be publicly available as data as a general principle, specifically as Linked Open Data. To ensure quality of information, authoritative electronic data sources versus manual entry should be emphasized. Institutions will vary in the amount of information that they will include and make publicly available but the value is enhanced by the quality and quantity of information.

•  Recommendation: Monitoring of the research networking landscape, technology, and tools should continue to be overseen by experts from the CTSA consortium (e.g., the Research Networking group of the Informatics KFC).

https://www.ctsacentral.org/recommendations-and-best-practices-research-networking
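As a rough illustration of what "RDF triples and an ontology compatible with the VIVO ontology" can look like in practice, the sketch below writes a tiny person profile as N-Triples using only the Python standard library. The person URI (`vivo.example.edu`) and name are made up. The class choices (`foaf:Person`, `vivo:FacultyMember`) are assumptions for illustration and should be checked against the ontology version a given site actually runs; this is not output from any real VIVO installation.

```python
# Namespaces; the class names used below are assumptions to verify
# against the ontology version in use.
VIVO = "http://vivoweb.org/ontology/core#"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def uri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes, quotes, and line breaks."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def profile_triples(person_uri, name):
    """Build a minimal (hypothetical) profile: two types and a display label."""
    person = uri(person_uri)
    return [
        (person, uri(RDF_TYPE), uri(FOAF + "Person")),
        (person, uri(RDF_TYPE), uri(VIVO + "FacultyMember")),
        (person, uri(RDFS_LABEL), literal(name)),
    ]


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples) + "\n"


# Hypothetical person URI and name, for illustration only.
document = to_ntriples(
    profile_triples("http://vivo.example.edu/individual/n1234", "Example, Researcher")
)
print(document)
```

Publishing profiles in a shared vocabulary like this is what lets other sites and aggregators consume the same statements without custom mapping for each institution.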

Building a large web of data, greater than any one effort, greater than any one platform.

Data Creators, Data Aggregators, & Data Consumers

Repositories. Tools. Applications. Workflows

A couple of local examples…

Brown University

Weill Cornell Medical College

http://libraryconnect.elsevier.com/articles/technology-content/2013-03/authoritative-researcher-metadata-one-place-vivo

WCMC CTSC’s VIVO data sources

http://libraryconnect.elsevier.com/articles/technology-content/2013-03/authoritative-researcher-metadata-one-place-vivo

Duke University

Data, Tools and Scientists

http://vivosearch.org/ http://vivosearchlight.org/ http://nrn.cns.iu.edu/

VIVO search scenarios

•  Multiple campuses of one university
•  Regional connections
   -  e.g., Illinois ties with regional federal labs
•  Consortia – 62+ CTSAs, USDA plus land grant universities
•  International
   -  13 Netherlands universities and the National Library
   -  German Universities
   -  AgriVIVO – UN FAO

Searchlight, AgriVIVO, etc.

Concept Coverage
•  Research networking systems queried: 57
   -  SPARQL endpoints queried: 9
   -  Sites crawled: 48
•  Institutions indexed: 64
   -  CTSA institutions: 27
•  Total person URIs: 4,933,757
   -  Unique individuals profiled: 140,949 - 300,239
•  Total publications by those persons indexed as part of their profile: 8,396,744
•  Total co-author pairs (two people on the same paper): 48,012,993
•  The harvesting times listed below are the times required to interrogate the respective SPARQL endpoints or crawl the respective servers and cache the results locally at Iowa.

CTSAsearch http://research.icts.uiowa.edu/polyglot/
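For readers who have not interrogated a SPARQL endpoint before, here is a small sketch of the two generic steps involved: building a SPARQL Protocol GET request and flattening the standard SPARQL JSON results format into rows. It is not CTSAsearch's harvester. The endpoint URL, the query, and the sample response are all hypothetical, and no network call is made.

```python
import json
from urllib.parse import urlencode


def build_query_url(endpoint, query):
    """Build a SPARQL Protocol GET URL (assumes `endpoint` has no query string yet)."""
    return endpoint + "?" + urlencode({"query": query})


def rows_from_results(results_json):
    """Flatten a SPARQL JSON results document into a list of {variable: value} dicts."""
    data = json.loads(results_json)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        # A variable may be unbound in a given row, so skip missing keys.
        rows.append({var: binding[var]["value"] for var in variables if var in binding})
    return rows


# Hypothetical query: list a few people and their labels.
PEOPLE_QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?name WHERE {
  ?person a foaf:Person ;
          rdfs:label ?name .
} LIMIT 10
"""

# Made-up response in the SPARQL JSON results format, standing in for a live endpoint.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["person", "name"]},
    "results": {"bindings": [{
        "person": {"type": "uri", "value": "http://vivo.example.edu/individual/n1234"},
        "name": {"type": "literal", "value": "Example, Researcher"},
    }]},
})

url = build_query_url("http://vivo.example.edu/sparql", PEOPLE_QUERY)
rows = rows_from_results(SAMPLE_RESPONSE)
print(url)
print(rows)
```

A real harvester would send the request with an `Accept: application/sparql-results+json` header, page through results, and cache them locally. Those parts are left out here to keep the sketch self-contained.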

What is VIVO?

1.  An open source semantic web application
2.  An information model
3.  An open community

VIVO Community

•  DuraSpace wiki
•  Calls and listservs
   -  Ontology
   -  Development
   -  Implementation
   -  Outreach
   -  Tools and Apps
•  Social Media
   -  Facebook
   -  LinkedIn
   -  Twitter
•  Events
   -  Annual conference
   -  Implementation Fest
   -  Workshops
   -  Hackathons

VIVO Community

VIVO projects around the world

https://wiki.duraspace.org/display/VIVO/Sites+implementing+VIVO

Current and future VIVO efforts

VIVOs
•  150+ implementation and pilot projects
•  35+ countries
•  20+ CTSAs

Standards
•  CTSAconnect Integrated Semantic Framework ontology
•  ORCID
•  CASRAI
•  others

Partners
•  Symplectic
•  euroCRIS
•  W3C, DERI, ConceptWeb Alliance, OpenPHACTS
•  Institutions/organizations

Events
•  VIVO conference Aug. 2015
•  Spring Implementation Fest @ OHSU
•  DuraSpace VIVO webinars
•  Hackathon

Community
•  VIVO wiki
•  Listservs
•  Weekly calls
•  GitHub
•  vivoweb.org
•  @VIVOcollab

vivoweb.org

VIVO Updates

The changing role of libraries

Libraries:
•  Are a trusted, neutral entity
•  Have a tradition of service and support
•  Strive to serve all missions of the institution
•  Are technology centers and have IT and data expertise

Library Staff:
•  Have skills: information organization, instruction, usability, subject expertise
•  Have close relationships with their clients (buy in)
•  Understand user needs
•  Understand the importance of collaboration and know how to bring people together
•  Have knowledge of institution, research, education, clinical landscape

What roles can the library play?


Librarians are successfully stepping up to the semantic web plate in a variety of roles related to institutional research networking platforms.

•  Outreach and adoption activities
•  Education and training on the use of the platform
•  Ontology and controlled vocabulary expertise, extending the model
•  Negotiations with data providers
•  Programming, technical support
•  Workgroup representation
•  …and more!

Research networking also provides an opportunity for libraries to become familiar with many concepts around linked open data and the semantic web.

Building an ecosystem for evaluation and continuous improvement

Northwestern University Clinical and Translational Sciences (NUCATS) Institute

Mission: Speeding transformative research discoveries to patients and the community

http://nucats.northwestern.edu/

Library as Partner

Opportunity!

Metrics and Impact Core Digital projects

Digital Projects led by Digital Systems and Collection Services

Among other projects…

Symplectic Elements
-  Back-end bibliometric aggregator
-  Supports OA with repository integration
-  Facilitates reports and reuse of clean aggregated data from a number of diverse sources

Digital repository
-  We’ll gain the ability to create, share, and preserve attractive, functional, and citable digital collections and exhibits
-  Promotes discovery and access of FSM scholarship, both traditional and alternative outputs
-  Better metrics


Symplectic Elements

Tracking, evaluation, and reporting

Digital Asset Management System

(IR)

Tasks (CVs and biosketches, etc.)

Research Information Systems

The Symplectic Elements platform & data will help facilitate new avenues of support


Our shop is committed to open source principles and we leverage semantic web languages and architecture whenever possible to support open science.

We want to optimize discoverability and dissemination of content and enhance the impact of FSM, NUCATS, and our Northwestern Medicine community.

Going beyond the counts to find evidence of meaningful impact

•  Measurement instruments
•  Continuing education materials
•  Cost-effective intervention
•  Consensus development conferences
•  American Medical Association Current Procedural Terminology (CPT) codes
•  Change in delivery of healthcare services
•  Gray literature
•  New experimental methods, databases or software tools
•  New diagnostic criteria or standards of care
•  Biologics
•  Curriculum guidelines
•  Clinical/practice guidelines
•  Quality measure guidelines

https://becker.wustl.edu/impact-assessment http://nucats.northwestern.edu/

Pathways
•  Advancement of Knowledge
•  Clinical Implementation
•  Legislation and Policy Enactment
•  Economic Benefit
•  Community Benefit


Bringing scholarship out into the open

Enhancing discovery. Enhancing impact.

Hope to see you at the conference in August!

http://vivoconference.org/

Acknowledgements

Teams:
•  The amazing team at Galter Library
•  VIVO Colleagues worldwide

Support:
•  Northwestern University Clinical and Translational Sciences Institute, NIH award UL1TR000150
•  VIVO, NIH award U24 RR029822
•  VIVO/DuraSpace

Questions/Follow-up:
•  [email protected]
•  Twitter: @kristiholmes

Thank you! Kristi Holmes @kristiholmes [email protected] orcid.org/0000-0001-8420-5254

Images and site credits

Images
•  http://i.telegraph.co.uk/multimedia/archive/02032/road-rail-us_2032312i.jpg
•  http://www.data.gov/communities/node/116/blogs/142
•  http://www.w3.org/DesignIssues/LinkedData.html

Websites, resources
•  http://vivoweb.org/
•  https://wiki.duraspace.org/display/VIVO/VIVO
•  http://vivosearch.org/
•  http://vivosearchlight.org/
•  http://nrn.cns.iu.edu/
•  http://research.icts.uiowa.edu/polyglot/
•  http://www.data.gov/communities/node/116/blogs/142
•  http://www.w3.org/DesignIssues/LinkedData.html