PREPARING RESEARCH DATA FOR SHARING

This work is licensed under a Creative Commons Attribution 2.0 UK: England & Wales License

Gareth Knight

[email protected]

Open Access Week 2014



Data Sharing in the News

“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.”

RCUK Common Principles on Data Policy

http://www.rcuk.ac.uk/research/datapolicy/

LSHTM Policy

“Research data… should be offered to an appropriate data repository or enclave… except in circumstances that would breach IPR, ethical, confidentiality, or other obligations”

Principle 6

“Research data that substantiate research findings should be made available for access and use in a timely manner, within the boundaries of conditions established by contractual, legislative, ethical, or other requirements”

Principle 7

LSHTM Research Data Management Policy

http://www.lshtm.ac.uk/research/researchdataman/rdm_policy_summary.html

Funder Expectations

http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies

Journal Expectations

PLOS state that journal papers must be accompanied by a ‘minimal dataset’ containing:

1. Data “used to reach the conclusions drawn in the manuscript” that is necessary to understand, validate, or replicate the work.

2. Documentation necessary to understand the data content and the methods applied to produce it.

Plus a Data Availability Statement, which outlines how the data can be located

PLOS Data Policy Overview

http://blogs.lshtm.ac.uk/rdmss/plos-data-policy-overview/

An increasing number of journal publishers expect data to be made available at the same time as the paper

Researcher Incentives

• Leads to new collaborations between data users and data creators

• Establishes a reputation for scholarly rigour by allowing outputs to be scrutinised

• Increases visibility of research, presenting new opportunities for analysis and use

• Saves time and money, by ensuring researchers do not duplicate existing data collection or analysis.

Data sharing benefits the research community, but let’s focus upon personal benefits

Are you able to share data?

Sharing is encouraged, but is not suitable for all data. It must be balanced with other obligations:

• Ethical duty of confidentiality and to protect participants from harm

• Allow participants to make their own decisions on how their information can be used, shared and made public (through informed consent)

• Enable the researcher to gain maximum benefit from findings

Against sharing:

• Desire to patent research findings

• Data can’t be anonymised

• 3rd party IPR exists

• Concern that sharing will attract a lower rate of response or that people will be less honest

For sharing:

• Meet funder obligations

• Encourage research uptake

• Higher citation rate

• Encourage validation of results

Data Sharing Code of Practice

1. Why do you wish to share?

• What does the project gain by sharing data?

• What are the benefits to the wider world?

2. Are you able to share?

• Data Protection Act

• Participant Consent

• Intellectual Property Rights (IPR)

• Other risks or barriers that prevent or limit sharing

3. Are there conditions associated with sharing?

• Academic researchers only?

• Does consent permit specific use only?

The Information Commissioner’s Office offers a 3-step decision model

http://www.ico.org.uk/for_organisations/data_protection/topic_guides/data_sharing

A Sample Consent Form

UK Data Service: Model consent forms and information sheets

http://ukdataservice.ac.uk/manage-data/legal-ethical/consent-data-sharing/overview.aspx

Managing Personal Information

Personal Data: information that can be used to identify a person directly, in isolation or in combination

• Name

• Address

• Date of Birth

• National Identification Number (NIN)

Sensitive Personal Data: information that can be used to discriminate and so requires extra protection

• racial or ethnic origin

• political opinions

• religious beliefs

• physical or mental health

• sexual life

• criminal offences

• trade union membership

Research projects typically apply the same (high) level of protection to both information types

The Data Protection Act outlines two classes of information to be protected
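As an illustration of the two classes above, a codebook could be screened for variables that fall into either one. This is only a minimal sketch: the variable names are hypothetical stand-ins for the categories on the slide, and a real study would need to review its own variables by hand rather than rely on name matching.

```python
from enum import Enum


class Protection(Enum):
    PERSONAL = "personal data"
    SENSITIVE = "sensitive personal data"
    OTHER = "not flagged"


# Hypothetical variable names standing in for the categories listed above;
# a real codebook would use its own names.
PERSONAL = {"name", "address", "date_of_birth", "national_id_number"}
SENSITIVE = {
    "ethnic_origin", "political_opinions", "religious_beliefs",
    "health_status", "sexual_life", "criminal_offences",
    "trade_union_membership",
}


def classify(variable: str) -> Protection:
    """Return the protection class for one variable name."""
    key = variable.strip().lower()
    if key in SENSITIVE:
        return Protection.SENSITIVE
    if key in PERSONAL:
        return Protection.PERSONAL
    return Protection.OTHER


def needs_protection(variables):
    """List the variables that fall into either protected class."""
    return [v for v in variables if classify(v) is not Protection.OTHER]
```

For example, `needs_protection(["name", "bmi", "health_status"])` flags `name` and `health_status` and leaves `bmi` unflagged.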

Risk Management

Assess likelihood that data can be used to:

• Identify a person directly

• Infer information about a person

• Link records relating to a person to other information
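One possible way to put a number on these risks is to count how many records share each combination of indirectly identifying values: a combination held by a single record could single that person out or be linked to other sources. The sketch below assumes this group-size check, which is in the spirit of k-anonymity and is not named on the slide; the field names and records are hypothetical.

```python
from collections import Counter


def combination_counts(records, quasi_identifiers):
    """Count records for each combination of quasi-identifier values."""
    return Counter(
        tuple(rec[q] for q in quasi_identifiers) for rec in records
    )


def smallest_group(records, quasi_identifiers):
    """Size of the rarest combination (0 if there are no records)."""
    counts = combination_counts(records, quasi_identifiers)
    return min(counts.values(), default=0)


def singled_out(records, quasi_identifiers):
    """Records whose combination of values is unique in the dataset."""
    counts = combination_counts(records, quasi_identifiers)
    return [
        rec for rec in records
        if counts[tuple(rec[q] for q in quasi_identifiers)] == 1
    ]


# Hypothetical example records
records = [
    {"age_band": "30-39", "district": "A", "sex": "F"},
    {"age_band": "30-39", "district": "A", "sex": "F"},
    {"age_band": "70-79", "district": "B", "sex": "M"},
]
fields = ["age_band", "district", "sex"]
```

Here `smallest_group(records, fields)` is 1, because the third record is the only one with its combination of values, so it is the one returned by `singled_out`.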

Determine action to address issue:

• Randomisation - noise addition, permutation

• Generalisation - aggregating results, limiting geographic details

• Pseudonymisation - hash functions

How likely is it that you will share personal or sensitive information?

Information Commissioner’s Office: Anonymisation Code of Practice

http://www.ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation
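The three actions listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended procedure: the function names, the 10-year band width and the noise scale are all hypothetical, and the pseudonymisation step uses a keyed hash (HMAC with SHA-256) as one example of a hash function, since a plain hash of a small identifier space can be reversed by trying every possible value.

```python
import hashlib
import hmac
import random


def add_noise(values, scale, rng):
    """Randomisation: perturb each numeric value with Gaussian noise."""
    return [v + rng.gauss(0, scale) for v in values]


def permute(values, rng):
    """Randomisation: shuffle a column so values no longer align with rows."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def age_band(age, width=10):
    """Generalisation: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def pseudonymise(identifier, secret_key):
    """Pseudonymisation: keyed SHA-256 hash (HMAC) of an identifier."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# A seeded generator keeps the sketch repeatable; the seed is arbitrary.
rng = random.Random(2014)
```

The same identifier and key always give the same pseudonym, so records can still be linked within the study, while the key itself would need to be stored separately from the shared data.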

Data Sharing Methods

Data may be shared using several formal and informal approaches:

1. Emailing a set of data files to a colleague or other researcher

2. Making data files available through a project or institutional website

3. Submitting data files to a journal in order to support a publication (e.g. PLOS)

4. Depositing in a data repository, data archive, or data bank for preservation and sharing

Data Repositories

• Several repositories provide infrastructure necessary to store, preserve and deliver data

• Carefully read deposit conditions – some insist that data is made available immediately under a CC0 licence

• LSHTM will launch its own data repository in early 2015

A small but growing number of data repositories support population health data

A full list of data repositories can be found at: http://service.re3data.org and http://databib.org

LSHTM Data Repository

Use scenarios:

1. Ensure non-sensitive data is curated & preserved in the short & long term

2. Fulfil funder obligations for data management & sharing

3. Meet journal expectations for data publication

4. Showcase research outputs that you wish to:

– Make available under controlled conditions

– Cite in research papers and reports

LSHTM Data Repository

Object Model

[Diagram. Elements shown: Web URL, Study 1, Data Collection 1, Data Collection 2, Dataset 1, Dataset 2; relationship cardinalities 0…n and 1…n]

1. Public: available to all

2. Registered: LSHTM & registered users. Others must request access

3. Controlled: Specific groups only.

4. Restricted: Depositor & administrator only.

5. Embargo: Not available until a set date
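The access levels above could be expressed as a simple rule check. The sketch below is hypothetical and is not the repository's actual implementation: the class and function names are invented, the depositor and administrators are assumed to have access at every level, and the embargo is modelled as an optional release date that combines with one of the other four levels, since the slide does not say what applies once the date has passed.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Access(Enum):
    PUBLIC = 1      # available to all
    REGISTERED = 2  # LSHTM and registered users
    CONTROLLED = 3  # specific groups only
    RESTRICTED = 4  # depositor and administrator only


@dataclass
class User:
    name: str
    registered: bool = False
    groups: frozenset = frozenset()
    is_admin: bool = False


@dataclass
class Dataset:
    depositor: str
    access: Access
    allowed_groups: frozenset = frozenset()
    embargo_until: Optional[date] = None  # assumption: embargo is a date, not a level


def can_download(user, dataset, today):
    """Apply the access levels; depositor and admins always pass (assumption)."""
    if user.is_admin or user.name == dataset.depositor:
        return True
    if dataset.embargo_until is not None and today < dataset.embargo_until:
        return False
    if dataset.access is Access.PUBLIC:
        return True
    if dataset.access is Access.REGISTERED:
        return user.registered
    if dataset.access is Access.CONTROLLED:
        return bool(user.groups & dataset.allowed_groups)
    return False  # RESTRICTED
```

Under these assumptions, a public dataset with `embargo_until=date(2015, 6, 1)` is refused to an ordinary visitor before that date and allowed from that date onward.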

Summary

• Consider sharing at earliest point in project and develop practices and procedures to address it

• Inform study participants of intent to share

• Perform risk analysis of likelihood of re-identification and take action to address

• Consider sharing method and data licences

Further Information

• LSHTM Research Data Management http://www.lshtm.ac.uk/research/researchdataman/

• Association for Data Management in the Tropics (ADMIT) https://admit.tghn.org/

• ICO: Data Sharing Code of Practice http://ico.org.uk/for_organisations/data_protection/topic_guides/data_sharing

• ICO: Anonymisation Code of Practice http://ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation

• MANTRA – Research Data Management Training http://datalib.edina.ac.uk/mantra/

• UK Data Service http://ukdataservice.ac.uk/

• UK Digital Curation Centre http://www.dcc.ac.uk/

Image Credits

Slide 7: “Sharing” (CC BY-NC 2.0)

https://www.flickr.com/photos/tobanblack/3773116901/

“Disguise” (CC BY-NC-SA 2.0)

https://www.flickr.com/photos/estherase/2190068148

“Sharing” (CC BY-NC 2.0)

https://www.flickr.com/photos/ryanr/142455033/