PREPARING RESEARCH DATA
FOR SHARING
This work is licensed under a
Creative Commons Attribution 2.0
UK: England & Wales License
Gareth Knight
Open Access Week 2014
Data Sharing in the News
“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with
as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.”
RCUK Common Principles on Data Policy
http://www.rcuk.ac.uk/research/datapolicy/
LSHTM Policy
“Research data… should be offered to an appropriate data repository or enclave… except in circumstances that would
breach IPR, ethical, confidentiality, or other obligations”
Principle 6
“Research data that substantiate research findings should be made available for access and use in a timely manner, within the
boundaries of conditions established by contractual, legislative, ethical, or other requirements”
Principle 7
LSHTM Research Data Management Policy http://www.lshtm.ac.uk/research/researchdataman/rdm_policy_summary.html
Journal Expectations
PLOS state that journal papers must be accompanied by a ‘minimal dataset’ containing:
1. Data “used to reach the conclusions drawn in the manuscript” that is necessary to understand, validate, or replicate the work.
2. Documentation necessary to understand the data content and the methods applied to produce it.
Papers must also include a Data Availability Statement, which outlines how the data can be located
PLOS Data Policy Overview http://blogs.lshtm.ac.uk/rdmss/plos-data-policy-overview/
An increasing number of journal publishers expect data
to be made available at the same time as the paper
Researcher Incentives
• Leads to new collaborations between data users and data creators
• Establishes a reputation for scholarly rigour by allowing outputs to be scrutinised
• Increased visibility of research presenting new opportunities for analysis and use
• Saves time and money, by ensuring researcher does not duplicate existing data collection or analysis.
Data sharing benefits the research community, but let’s focus upon personal benefits
Are you able to share data?
Sharing is encouraged, but not suitable for all data. Must be balanced with other obligations
• Ethical duty to maintain confidentiality and protect participants from harm
• Allow participants to make own decisions on how their information can be used, shared and made public (through informed consent)
• Enable researcher to gain maximum benefit from findings
For sharing:
• Meet funder obligations
• Encourage research uptake
• Higher citation rate
• Encourage validation of results
Against sharing:
• Desire to patent research findings
• Data can’t be anonymised
• 3rd party IPR exists
• Concern that sharing will attract a lower response rate or that people will be less honest
Data Sharing Code of Practice
1. Why do you wish to share?
• What does the project gain by sharing data?
• What are the benefits to the wider world?
2. Are you able to share?
• Data Protection Act
• Participant Consent
• Intellectual Property Rights (IPR)
• Other risks or barriers that prevent or limit sharing
3. Are there conditions associated with sharing?
• Academic researchers only?
• Consent permits specific use only?
The Information Commissioner’s Office offers a 3-step decision model
http://www.ico.org.uk/for_organisations/data_protection/topic_guides/data_sharing
A Sample Consent Form
UK Data Service: Model consent forms and information sheets http://ukdataservice.ac.uk/manage-data/legal-ethical/consent-data-sharing/overview.aspx
Managing Personal Information
Personal Data
Information that can be used to directly identify a person, in isolation or in combination
• Name
• Address
• Date of Birth
• National Identification Number (NIN)
Sensitive Personal Data
Information that can be used to discriminate and requires extra protection
• racial or ethnic origin
• political opinions
• religious beliefs
• physical or mental health
• sexual life
• criminal offences
• trade union membership
Research projects typically apply the same (high) level of protection to both information types
The DP Act outlines two classes of information to be protected
Risk Management
Assess likelihood that data can be used to:
• Identify a person directly
• Infer information about a person
• Link records relating to person to other info
Determine action to address issue:
• Randomisation - noise addition, permutation
• Generalisation - aggregating results, limiting geographic details
• Pseudonymisation - hash functions
How likely is it that you will share
personal or sensitive information?
Information Commissioner’s Office: Anonymisation Code of Practice http://www.ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation
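The three kinds of action listed above (randomisation, generalisation, pseudonymisation) can be illustrated with a minimal Python sketch. It is an illustration only, not a method prescribed by the ICO code of practice or by LSHTM. The function names, the 10-year age band, the noise scale and the secret key are all hypothetical choices made for the example. A keyed hash (HMAC) is used here rather than a plain hash, on the assumption that identifiers such as names or ID numbers could otherwise be guessed and re-hashed.

```python
import hashlib
import hmac
import random

# Hypothetical secret key: assumed to be stored separately from the shared data.
SECRET_KEY = b"replace-with-a-project-secret"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Pseudonymisation: replace an identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def generalise_age(age: int, band_width: int = 10) -> str:
    """Generalisation: report an age band instead of an exact age."""
    lower = (age // band_width) * band_width
    return f"{lower}-{lower + band_width - 1}"


def add_noise(value: float, scale: float, rng: random.Random) -> float:
    """Randomisation (noise addition): perturb a value with Gaussian noise."""
    return value + rng.gauss(0.0, scale)


def permute(values: list, rng: random.Random) -> list:
    """Randomisation (permutation): shuffle a column so values no longer match their rows."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    rng = random.Random(2014)  # fixed seed so the demonstration is repeatable
    # Entirely made-up records, for illustration only.
    records = [
        {"id": "P001", "age": 37, "weight_kg": 71.2},
        {"id": "P002", "age": 52, "weight_kg": 64.8},
    ]
    for record in records:
        print(
            pseudonymise(record["id"])[:12],
            generalise_age(record["age"]),
            round(add_noise(record["weight_kg"], 1.0, rng), 1),
        )
    print(permute([r["weight_kg"] for r in records], rng))
```

None of these steps guarantees anonymity on its own; the risk assessment above still applies to the transformed data.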
Data Sharing Methods
Data may be shared using several formal and informal approaches:
1. Emailing a set of data files to a colleague or other researcher
2. Making data files available through a project or institutional website
3. Submitting data files to a journal in order to support a publication (e.g. PLOS)
4. Deposit to a data repository, data archive, or data bank for preservation and sharing
Data Repositories
• Several repositories provide infrastructure necessary to store, preserve and deliver data
• Carefully read deposit conditions – some insist that data is made available immediately under a CC0 licence
• LSHTM will launch its own data repository in early 2015
A small but growing number of data repositories support population health data
A full list of data repositories can be found at: http://service.re3data.org and http://databib.org
LSHTM Data Repository
Use scenarios:
1. Ensure non-sensitive data is curated & preserved in the short & long term
2. Fulfil funder obligations for data management & sharing
3. Meet journal expectations for data publication
4. Showcase research outputs that you wish to:
– Make available in controlled conditions
– Cite in research papers and reports
LSHTM Data Repository
Object Model
[Diagram: Web URL; Study 1; Data Collection 1, Data Collection 2; Dataset 1, Dataset 2; multiplicities 0…n and 1…n]
Access levels:
1. Public: available to all
2. Registered: LSHTM & registered users; others must request access
3. Controlled: Specific groups only.
4. Restricted: Depositor & administrator only.
5. Embargo: Not available until a set date
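The five access levels above can be read as a simple access check. The sketch below is hypothetical and does not describe how the LSHTM repository is implemented. The class and field names are invented for the example, and it assumes that the depositor and administrators can always reach their own deposits and that an embargoed dataset opens to everyone once its date has passed.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class AccessLevel(Enum):
    PUBLIC = "public"          # available to all
    REGISTERED = "registered"  # LSHTM and registered users
    CONTROLLED = "controlled"  # specific groups only
    RESTRICTED = "restricted"  # depositor and administrator only
    EMBARGO = "embargo"        # not available until a set date


@dataclass
class User:
    name: str
    registered: bool = False
    groups: set = field(default_factory=set)
    is_admin: bool = False


@dataclass
class Dataset:
    title: str
    depositor: str
    level: AccessLevel
    allowed_groups: set = field(default_factory=set)
    embargo_until: Optional[date] = None


def can_access(user: User, dataset: Dataset, today: date) -> bool:
    """Return True if the user may access the dataset on the given day."""
    # Assumption: depositor and administrators always have access.
    if user.is_admin or user.name == dataset.depositor:
        return True
    if dataset.level is AccessLevel.PUBLIC:
        return True
    if dataset.level is AccessLevel.REGISTERED:
        return user.registered
    if dataset.level is AccessLevel.CONTROLLED:
        return bool(user.groups & dataset.allowed_groups)
    if dataset.level is AccessLevel.EMBARGO:
        # Assumption: the dataset becomes open once the embargo date is reached.
        return dataset.embargo_until is not None and today >= dataset.embargo_until
    return False  # RESTRICTED: nobody else


if __name__ == "__main__":
    survey = Dataset("Example survey", "depositor1", AccessLevel.EMBARGO,
                     embargo_until=date(2015, 6, 1))
    visitor = User("visitor")
    print(can_access(visitor, survey, date(2015, 1, 1)))
    print(can_access(visitor, survey, date(2015, 7, 1)))
```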
Summary
• Consider sharing at the earliest point in the project and develop practices and procedures to address it
• Inform study participants of intent to share
• Perform a risk analysis of the likelihood of re-identification and take action to address it
• Consider sharing method and data licences
Further Information
• LSHTM Research Data Management http://www.lshtm.ac.uk/research/researchdataman/
• Association for Data Management in the Tropics (ADMIT) https://admit.tghn.org/
• ICO: Data Sharing Code of Practice http://ico.org.uk/for_organisations/data_protection/topic_guides/data_sharing
• ICO: Anonymisation Code of Practice http://ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation
• MANTRA – Research Data Management Training http://datalib.edina.ac.uk/mantra/
• UK Data Service http://ukdataservice.ac.uk/
• UK Digital Curation Centre http://www.dcc.ac.uk/