Linking Data from
ScienceDirect Articles
Presented by: IJsbrand Jan Aalbersberg
Hannover, DataCite Meeting
Date: June 8, 2010
Linking to & from Data
from & to ScienceDirect Articles
Linking Data in ScienceDirect
The Past
  Supplementary data
  Entity links to databases
The Present
  Some considerations
  PANGAEA-type linking
A Future
  Getting even closer connected
The Past (supplementary data)
Raw research data delivered as supplementary data
Available for a limited number of data set types / formats
Data distributed over multiple articles and publishers
Format frozen in time, not maintained for preservation
Only available for smaller data sets (at most a few tens of MB)
Limited access due to use of existing publishing platforms
Data and article remain nicely coupled / packaged
Supplementary data always being peer-reviewed
The Past (entity linking - manual)
Authors manually identify (and tag) entities that are mentioned in articles and for which associated data is present (or registered) in databases such as GenBank, MINT, UniProt, PDB, CCDC, ...
Very accurate and unambiguous
However, requires author effort
Publisher takes care of the actual linking
Reciprocal linking usually taken care of
The Past/Present (entity linking – automatic)
Entities are sometimes identified automatically (e.g., NextBio and Reflect)
Easily extendable to new / other entities
Works retrospectively on older content
Does create recall / precision errors
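The recall / precision trade-off of automatic tagging can be illustrated with a minimal sketch. It does not reproduce how NextBio or Reflect work; the regular expression is a simplified, assumed pattern for four-character PDB-style identifiers, and the sample sentence, identifiers and function names are hypothetical.

```python
import re

# Assumed, simplified pattern: a digit followed by three alphanumerics.
# Real taggers use dictionaries and context, not just a pattern like this.
PDB_LIKE_ID = re.compile(r"\b[0-9][A-Za-z0-9]{3}\b")


def tag_entities(text):
    """Return the set of candidate identifiers found in the text."""
    return {match.group(0).upper() for match in PDB_LIKE_ID.finditer(text)}


def precision_recall(predicted, gold):
    """Compare automatically tagged entities with author-verified ones."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical sentence; the year "1994" also fits the naive pattern.
text = "The structure 1TUP was solved in 1994; see also entry 2xyz."
gold = {"1TUP", "2XYZ"}  # what an author would have tagged manually

predicted = tag_entities(text)
precision, recall = precision_recall(predicted, gold)
print(sorted(predicted), precision, recall)
```

Here the tagger finds both intended identifiers but also picks up the year, so precision falls below 1 while recall stays at 1. Manual tagging by authors avoids this kind of error at the cost of extra effort.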
The Present (some considerations)
STM, “Brussels Declaration”, June 2006:
“... believe that, as a general principle, data sets, raw data outputs of research, and sets or subsets of that data should wherever possible be made freely accessible ...”
Data sets should be freely accessible, but at the publisher?
  Scientists prefer independent data repositories
  Need for single domain-specific coordination
  Huge costs for maintenance and preservation
Proper deposit mechanism needed
  Through publisher? Extra overhead vs. ease of use
Enforcing deposit prior to publication
  If community-supported, surely a possibility
Data set standardization is needed for optimal use
The Present (more considerations)
Scientists need the combination of the formal publication record and the raw data sets
Optimal interoperability requires close collaboration between publisher and data set repositories
Publisher should “enable and support” raw data sets
  Submission: enforce if supported by community
  Discoverability: interconnect article with data sets
Reciprocal linking at deepest level possible
  PANGAEA-type linking
Data feeds from publisher to repositories?
Managing a large number of data set repositories?
  DataCite as single discussion partner
The Present (PANGAEA linking)
1. Author submits article to publisher
2. Author submits data set to repository
3. At article publication, repository links article DOI to associated data set DOI, creating actual connection
4. User sees link to ScienceDirect from PANGAEA
5. User sees link to PANGAEA from ScienceDirect:
[Diagram: USER viewing an SD Article; SD Server (articles); PANGAEA Server (data + associations); link]
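The five-step PANGAEA-type workflow can be sketched as a small in-memory model. This is an illustration only, not the actual ScienceDirect or PANGAEA implementation: the `LinkRegistry` class, its method names and the DOIs (with the made-up prefix `10.9999`) are all hypothetical. The only real-world element is the `https://doi.org/` resolver prefix.

```python
from collections import defaultdict


class LinkRegistry:
    """Hypothetical store of article DOI <-> data set DOI associations."""

    def __init__(self):
        self._data_by_article = defaultdict(set)
        self._articles_by_data = defaultdict(set)

    def link(self, article_doi, dataset_doi):
        # Step 3: at article publication the association is recorded once,
        # which makes the link resolvable in both directions.
        self._data_by_article[article_doi].add(dataset_doi)
        self._articles_by_data[dataset_doi].add(article_doi)

    def articles_for(self, dataset_doi):
        # Step 4: the data set page shows links to the related articles.
        return sorted(self._articles_by_data.get(dataset_doi, ()))

    def datasets_for(self, article_doi):
        # Step 5: the article page shows links to the related data sets.
        return sorted(self._data_by_article.get(article_doi, ()))


def doi_url(doi):
    """Turn a DOI into a resolvable URL via the doi.org resolver."""
    return "https://doi.org/" + doi


# Hypothetical DOIs for one article and its associated data set.
article = "10.9999/example.article.2010.001"
dataset = "10.9999/example.dataset.123456"

registry = LinkRegistry()
registry.link(article, dataset)

print([doi_url(d) for d in registry.datasets_for(article)])
print([doi_url(a) for a in registry.articles_for(dataset)])
```

Keeping the association in one place and deriving both directions from it is what makes the linking reciprocal: neither platform has to maintain its own copy of the other side's links.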
PANGAEA links to ScienceDirect
ScienceDirect links to PANGAEA
A Future (tighter interoperability)
Not just a link to / from data and journal article
But provide integrated experience for scientist
Single page (environment) with data and article
[Diagram: USER viewing an SD Article; SD Server (articles); Supplementary Data Server (data sets)]
Some users prefer it the other way around, so also offer:
[Diagram: USER viewing a Data Set; Data Set Server (data sets); Article Server (articles)]
A Future (inline supplementary data)
Structures submitted as supplementary data files (MOL files)
Displayed inline through Reaxys application / service
Creating the best User Experience
by integrating Data with Articles
requires close collaboration between data set repositories and publishers