DESCRIPTION
A major challenge facing VIVO is the retrieval of published works associated with specific authors from participating institutions, and the automated disambiguation and identification of authors and scholarly works. VIVO thus shares many of the same goals as the Open Researcher and Contributor ID not-for-profit organization (ORCID: http://www.orcid.org). ORCID is working to solve the long-standing name ambiguity problem in scholarly communication globally, not only for researchers affiliated with academic institutions, but for contributors to scholarly works of all kinds. The aim of this mini-grant collaborative project is to explore how VIVO and ORCID could interact in the scholarly identity ecosystem, by way of small-scale implementation work and technology evaluation and review. The presentation will provide a brief introduction to ORCID and a background to the project, summarize the technical development undertaken thus far and outline the work remaining, and discuss some possibilities for future work beyond this specific short-term project.
2nd Annual VIVO Conference, Washington, 24-26 August 2011
http://www.orcid.org G. A. Thorisson, University of Leicester
The VIVO platform and ORCID in the scholarly identity ecosystem
Gudmundur A. Thorisson <[email protected]>, University of Leicester, United Kingdom
On behalf of the Open Researcher and Contributor ID initiative (http://www.orcid.org)
-- Outline --
• Background (or, how did we get sucked into this identity business?)
• ORCID - Unique IDs for authors/contributors
• Introduction to the initiative
• Update on technical development
• Mini-grant collaborative project
• Thoughts on ORCID & VIVO interop going forward
This work can be freely copied, redistributed and adapted, as long as proper attribution is given
Friday, 26 August 2011
Prologue
Prof Anthony J Brookes, GEN2PHEN coordinator
Chair, Bioinformatics and Genomics, Department of Genetics, University of Leicester, UK
Cataloguing genome-wide association studies: http://www.gwascentral.org
The data sharing problem in the biosciences
Lack of incentives for sharing research data
• Effort required to prepare, package and submit datasets to public repositories
• Time better spent writing papers & grants
• All sticks (funders, journals) - no carrots
• Need incentives - treat data as publications and credit creators
“[...] Many of the issues regarding data availability can be addressed if the principles of “publication” rather than “sharing” are applied. However, online data publication systems also need to develop mechanisms for data citation and indices of data access comparable to those for citation systems in print journals”
Costello, M. Motivating Online Publication of Data. BioScience (2009) vol. 59 (5) pp. 418-427
The identification requirement
• Identifying published scholarly works
– Why? So we can..
• ..cite the work unambiguously (‘..we used the method described in Thorisson et al (2009)’)
• ..locate the work (retrieve Nature article as PDF from journal website)
• ..give credit to those persons who contributed to the work (G. Thorisson authored paper X)
– What kind of scholarly works?
• journal articles, books, datasets, blog entries, Wikipedia articles, scientific software [..]
• Identifying scholarly contributors
– Why? So we can..
• ..link content creators with their works - attribute credit
• ..figure out: who contributed to publication X? which publications has Y contributed to?
– What kind of contributions? Characterizing ‘contributorship’
• author, creator, analyst, reviewer, ‘conceived of study & designed experiment’ etc.
Tackling the author name ambiguity problem (aka ‘Who’s Who?’)
Are these authors all the same person?
• G. Thorisson, University of Leicester
• G. A. Thorisson, University of Leicester
• G. A. Thorisson, Cold Spring Harbor Laboratory
Or these?
• J. Smith, J. Smith, J. Smith, J. Smith, J. Smith [etc.]
∼2/3 of the ∼6 million authors in MEDLINE share a last name and first initial with at least one other author, and an ambiguous name refers to ∼8 persons on average.
Torvik and Smalheiser. Author name disambiguation in MEDLINE. ACM Transactions on Knowledge Discovery from Data (2009) vol. 3 (3)
How about these?
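A minimal Python sketch of why name strings alone are not enough: grouping byline records by the "last name + first initial" key used in the MEDLINE statistic above. The function names are illustrative, and the J. Smith affiliations are borrowed from the example IDs later in this deck. The key tells us which records collide, not whether they belong to the same person.

```python
from collections import defaultdict

def name_key(author):
    """Reduce 'G. A. Thorisson' to ('thorisson', 'g'): last name plus first initial."""
    parts = author.replace(".", " ").split()
    return parts[-1].lower(), parts[0][0].lower()

def group_by_key(records):
    """Group (author, affiliation) records that collide on the same name key."""
    groups = defaultdict(list)
    for author, affiliation in records:
        groups[name_key(author)].append((author, affiliation))
    return dict(groups)

records = [
    ("G. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "University of Leicester"),
    ("G. A. Thorisson", "Cold Spring Harbor Laboratory"),
    ("J. Smith", "Univ. North Pole"),
    ("J. Smith", "Luthor Corporation"),
]
groups = group_by_key(records)

for key, members in groups.items():
    # Every group with more than one record is ambiguous: one person or several?
    print(key, len(members), "records")
```

Both groups look equally ambiguous to the code, although one is a single person with several bylines and the other is two different people. Resolving that takes an identifier attached to the person rather than to the string.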
Introducing the ORCID initiative
The ORCID mission
ORCID will work to support the creation of a permanent, clear and unambiguous record of scholarly communication by enabling reliable attribution of authors and contributors through unique identifiers
ORCID will add value for scholars and the organizations that they are interacting with, including universities, scholarly societies, funding organizations and publishers
• Joins faculty or student body
• Joins scholarly society
• Applies for grant
• Submits manuscript
Example use cases for contributor IDs
• Scholar
– I need a list of my publications (and other scholarly output) for this job application.
– What other papers were published by the authors of this interesting paper?
• University
– What was the research output of our institution (departments, individual researchers) last year?
– What open access papers did we publish?
• Scholarly Society
– We want to offer better user profiles in our membership database.
– What was the research output of our members last year?
• Funding Agency
– What papers were written by this grant applicant?
– What papers were published as a result of our funding?
– What datasets were made available by this research project?
• Publisher
– We want to better track authors and reviewers in our journal submission system.
– Tell us more about this author, including other papers that he published.
– What potential reviewers have collaborated with this author?
The organization
ORCID transcends disciplinary, geographic, national and institutional boundaries
ORCID is open to any organization with an interest in scholarly communication
The Board of Directors represents a broad cross-section of stakeholders
The infrastructure
A centralized contributor identification service will:
i) enable researchers to manage & use their ORCID ID
ii) facilitate linking content creators with their published works
iii) interact with other systems (publishers, digital libraries, universities etc.)
[Diagram: illustrative ORCID IDs assigned to author name variants]
• ORCID ID: 1242-2010-24: G. Thorisson, Univ. Leicester; G. A. Thorisson, Univ. Leicester; G. A. Thorisson, Cold Spring Harbor Lab.
• ORCID ID: 1442-2009-42: J. Smith, Univ. North Pole
• ORCID ID: 2400-2010-91: J. Smith, Luthor Corporation
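Once works are linked to IDs rather than to name strings, the two questions from the identification slide ("who contributed to publication X?" and "which publications has Y contributed to?") become simple lookups. The Python sketch below uses the illustrative IDs from this slide. The work labels (`paper-X`, `paper-Y`, `dataset-Z`) and function names are hypothetical placeholders, not ORCID data or an ORCID API.

```python
# Illustrative IDs and name variants from the slide; not real ORCID records.
CONTRIBUTORS = {
    "1242-2010-24": ["G. Thorisson, Univ. Leicester",
                     "G. A. Thorisson, Univ. Leicester",
                     "G. A. Thorisson, Cold Spring Harbor Lab."],
    "1442-2009-42": ["J. Smith, Univ. North Pole"],
    "2400-2010-91": ["J. Smith, Luthor Corporation"],
}

# Hypothetical (contributor ID, work) claims, made up for this sketch.
CLAIMS = [
    ("1242-2010-24", "paper-X"),
    ("1442-2009-42", "paper-X"),
    ("2400-2010-91", "paper-Y"),
    ("1242-2010-24", "dataset-Z"),
]

def contributors_to(work):
    """Who contributed to this work? Returns contributor IDs."""
    return sorted(cid for cid, w in CLAIMS if w == work)

def works_by(contributor_id):
    """Which works has this contributor claimed?"""
    return sorted(w for cid, w in CLAIMS if cid == contributor_id)

print(contributors_to("paper-X"))
print(works_by("1242-2010-24"))
```

The two J. Smiths never get mixed up because the join is on the ID, while the three Thorisson name variants collapse onto one record.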
ORCID will interact with other scholarly author identification systems
Researchers will be able to create and maintain an ORCID ID and profile free of charge, and will control their privacy settings
ORCID is serious about being open
Data - all profile data contributed to ORCID by researchers or claimed by them will be released under the CC0 waiver
Source code - all software developed by ORCID will be publicly released under an Open Source Software license approved by the Open Source Initiative
Services - mix of free and non-free, but any fees will be used to ensure the long-term sustainability of the ORCID system
ORCID technical development
Development Approach
Geoff Bilder, CrossRef
ORCID’s Interim Technical Director
Phase I development - planned beta features
Core ResearcherID.com functionality plus:
• Institutional seeding of profiles (i.e. batch upload, alerting)
• Delegated management of profiles
• Profile exchange into grant/manuscript submission systems
• Fine-grained control of privacy settings at the claim level
– public = "share with anybody"
– protected = "share with parties authorized via OAuth"
– private = "do not share"
• ORCID identifier resolution (both via GUI and REST API)
• Metadata search (both GUI and REST API)
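The three claim-level privacy settings in the list above can be sketched as a simple filter. This is a Python illustration of the planned semantics as described on the slide, not ORCID code: the `Claim` class, the `visible_claims` function and the party names are all assumptions made for the example.

```python
from dataclasses import dataclass

PUBLIC, PROTECTED, PRIVATE = "public", "protected", "private"

@dataclass
class Claim:
    work: str         # hypothetical work label
    visibility: str   # one of PUBLIC, PROTECTED, PRIVATE

def visible_claims(claims, requester=None, authorized=frozenset()):
    """Return the claims a requester may see.

    `authorized` is the set of parties the researcher has approved
    (via OAuth, in the planned design); `requester` is None for anonymous access.
    """
    visible = []
    for claim in claims:
        if claim.visibility == PUBLIC:
            visible.append(claim)                      # share with anybody
        elif claim.visibility == PROTECTED and requester in authorized:
            visible.append(claim)                      # share with authorized parties
        # PRIVATE claims are never shared
    return visible

claims = [
    Claim("paper-X", PUBLIC),
    Claim("dataset-Z", PROTECTED),
    Claim("draft-report", PRIVATE),
]

anonymous_view = visible_claims(claims)
journal_view = visible_claims(claims, requester="journal-system",
                              authorized={"journal-system"})
```

Setting visibility per claim rather than per profile lets a researcher expose a published paper publicly while sharing an unreleased dataset only with, say, a funder's system.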
Phase I development - status update
• ResearcherID.com sourcecode donated by Thomson Reuters has been delivered to ORCID
• Oxford development team hired & Phase I work commenced
– Initial code ‘cleansing’ & de-branding - decouple core s/w from TR’s proprietary stuff
– Current codebase passes core regression tests for core functionality
– Will be released as Open Source (as per Principles!!)
• Timeline for beta ORCID service
– Prospect good for a spring 2012 go-live date for first public release
– Developer API access to a test sandbox probably well before this
• API documentation being drafted & discussed by the TWG
Phase I prototype screenshots
Collaborative development with VIVO
ORCID vs. VIVO: similarities/differences
Mini-grant project outline
• Broad aim = technology evaluation for ORCID
– Get hands-on experience w/ VIVO platform
– analogy: take a radio / sewing machine apart, see how it works (and maybe not be able to put it back together!)
– Can/should ORCID reuse VIVO technology? Lessons learned? Etc.
• Implementation objectives: extend VIVO system to enable
– #1 Searching for and adding publications from external bibliographic system
– #2 Identification & exchange of profile information with external system
• Other objectives
– investigate ontological approach (e.g. what can ORCID reuse/repurpose/learn?)
– engage with VIVO community on technical level and build relationships
Hacking VIVO #1
• Objective: enabling VIVO user to interactively search for & retrieve records from external bibliographic system and add to their profile
• Implementation stuff:
– JRuby apps running alongside VIVO servlets inside Tomcat
• *very* simple Sinatra app <-- 2x RESTful web service endpoints
• Wrapper API around CrossRef’s SIGG metadata search
– <200 lines of client-side jQuery JavaScript
• Interactive UI bits
• Process biblio-RDF retrieved from DOI metadata service
– JSP hacking
• Modify VIVO UI forms + deal with biblio-RDF
– Freemarker hacking
• More useful reference listing in profile + other UI tweaks
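The "biblio-RDF retrieved from DOI metadata service" step can be approximated with HTTP content negotiation against a DOI resolver. The Python sketch below only builds the request and does not send it. The hack itself used JRuby and jQuery, and the slide does not say exactly which service it called, so the resolver URL, the `application/rdf+xml` media type and the placeholder DOI are assumptions for illustration.

```python
from urllib.parse import quote
from urllib.request import Request

def doi_rdf_request(doi, resolver="https://doi.org/"):
    """Build (but do not send) a request asking for a DOI's metadata as RDF/XML.

    The Accept header asks the resolver for bibliographic RDF instead of
    the usual redirect to the publisher's landing page.
    """
    url = resolver + quote(doi, safe="/")
    return Request(url, headers={"Accept": "application/rdf+xml"})

# Placeholder DOI, used only to show the shape of the request.
req = doi_rdf_request("10.1000/example")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

To fetch the record, pass `req` to `urllib.request.urlopen` and hand the response body to an RDF parser. In the hack, the equivalent retrieval and processing happened in the Sinatra wrapper and the client-side JavaScript.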
Screenshots from hack #1
Hacking VIVO #2
• Objective: enabling VIVO user to identify him/herself to & exchange profile data with external service
• Implementation stuff - some more JRuby:
– #1: Embedded Rails app - runs alongside VIVO servlets
• Turns VIVO into OAuth provider <-- works with existing VIVO authentication
• Hooks into VIVO via JRuby-Rack servlet integration (e.g. is user logged in?)
• Reuses existing Rails components, incl. oauth-plugin for most of the OAuth API
– #2: Standalone Rails app
• Demo OAuth consumer app <-- mockup manuscript tracking system
• Reuses existing Rails components, incl. devise for user registration/login/etc.
• Standard API - can in principle point any app to an OAuth-enabled VIVO instance
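The division of roles in hack #2 (VIVO as OAuth provider, a mock manuscript tracking system as consumer) can be sketched in a few lines of Python. This is a deliberately simplified in-memory illustration, not the Rails/oauth-plugin implementation: it leaves out redirects, consumer keys and request signing, and the class, method and account names are hypothetical.

```python
import secrets

class ProfileProvider:
    """Stand-in for an OAuth-enabled profile system (the role VIVO plays in hack #2)."""

    def __init__(self, profiles):
        self._profiles = profiles   # user -> profile dict
        self._tokens = {}           # access token -> (user, consumer)

    def authorize(self, user, consumer, approved):
        """A logged-in user approves or denies a consumer; approval yields a token."""
        if not approved:
            return None
        token = secrets.token_hex(16)
        self._tokens[token] = (user, consumer)
        return token

    def profile(self, token):
        """Return the profile tied to a valid token, else raise PermissionError."""
        grant = self._tokens.get(token)
        if grant is None:
            raise PermissionError("invalid or missing access token")
        user, _consumer = grant
        return self._profiles[user]

# Hypothetical account name; profile values taken from the speaker's byline.
provider = ProfileProvider({
    "gthorisson": {"name": "G. A. Thorisson",
                   "affiliation": "University of Leicester"},
})

# The consumer (mock manuscript tracking system) never sees the user's password,
# only the token the user agreed to hand out.
token = provider.authorize("gthorisson", "manuscript-tracker", approved=True)
print(provider.profile(token)["name"])
```

Because the consumer holds only a token, the user can later revoke that one grant without changing credentials. That is the property that makes the same pattern attractive for the ORCID API.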
Screenshots from hack #2
Hacking VIVO #2 cont.
• Utility of this proof-of-principle work?
i) User registering his ORCID profile with a journal when submitting manuscript [and vice versa]
ii) ORCID user linking his ORCID profile with his institutional VIVO profile
• OAuth-based authn/authz will be at the core of ORCID API
• Key next steps to complete mini-grant project
– In OAuth routine, fetch profile as RDF & fish out E-mail, name, affiliation etc.
• use new rich export feature?
• dig back into the ontology and have proper look
– Get hacks working with new VIVO v1.3 release [bonus: will be useful for testing how easy/hard extensions are to install]
– write report and, erm, write documentation
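The first next step above (fetch the profile as RDF and fish out name, e-mail etc.) might look roughly like this Python sketch. It assumes the profile arrives as RDF/XML with a top-level `foaf:Person` carrying `foaf:name` and `foaf:mbox`. The sample document, its URI and the e-mail address are made up, and the actual VIVO export and the ontology terms for affiliation are not shown on the slides, so affiliation is left out.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Made-up sample; a real export would contain many more statements.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://vivo.example.org/individual/n1">
    <foaf:name>G. A. Thorisson</foaf:name>
    <foaf:mbox rdf:resource="mailto:someone@example.org"/>
  </foaf:Person>
</rdf:RDF>"""

def extract_profile(rdf_xml):
    """Pull name and e-mail out of a foaf:Person element, if one is present."""
    root = ET.fromstring(rdf_xml)
    person = root.find("foaf:Person", NS)
    if person is None:
        return {}
    name = person.findtext("foaf:name", default=None, namespaces=NS)
    email = None
    mbox = person.find("foaf:mbox", NS)
    if mbox is not None:
        resource = mbox.get("{%s}resource" % NS["rdf"], "")
        # foaf:mbox values are mailto: URIs; strip the scheme if present.
        email = resource[len("mailto:"):] if resource.startswith("mailto:") else resource
    return {"name": name, "email": email}

print(extract_profile(SAMPLE))
```

Plain XML parsing only works for this one serialization shape. The same graph can be written as RDF/XML in several ways, so a real implementation would more safely load the data into an RDF library and query it.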
Sourcecode, docs & community
– Disclaimer!! not complete, don’t use yet
– WIMH status ( as in ‘works in my hands’)
Code: https://github.com/gthorisson/vivo-orcidextensions
Sandbox: http://vivo.crossref.org
Challenges
• Java!
– Documentation - Java codebase largely undocumented
• Ruby!
– learning language+Rails framework, debugging JRuby-rack etc.
• Getting data out of VIVO triplestore - crash course in SPARQL
• Getting data into VIVO triplestore
• Freemarker templates - non-trivial, but fairly logical
• Hacking JSP templates - non-trivial, PITA
• Community participation has been tricky
– Other projects / travel / summer vacation / family move between countries
– Timing of weekly VIVO dev calls (I am 4-5 hours ahead of Eastern time)
VIVO + ORCID interop
• How can VIVO take advantage of ORCID (2012 beta & beyond)
– Leverage ORCID IDs for integrating with global services (journals, data repositories)
• NB not nearly all institutions will adopt VIVO
– Retrieve publication claims for researchers from central ORCID registry
• How can ORCID take advantage of VIVO
– VIVO-ified institution could bulk-deposit profile information into ORCID
• pre-registration of research staff
• VIVO could provide set of tools to help inst. implementers with this?
– Explore semantic approach
• how should ORCID interoperate on the semantic level? Ontologies / URIs
• Example utility: nanopub a la Mons et al
Acknowledgements
• ORCID Technical Working Group
• GEN2PHEN Consortium: http://www.gen2phen.org/about-gen2phen/partners
• Prof Anthony J. Brookes Bioinformatics Group, Leicester
This work has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project.
Contact me!
<[email protected]> | <[email protected]>
http://www.linkedin.com/in/mummi
http://www.twitter.com/gthorisson
http://www.gthorisson.name