Research Data Management System project: Best Practices in Research Data Management* *Adaptation of the NECDMC

Research Data Management System project:

Best Practices in Research Data Management*

*Adaptation of the NECDMC

Today’s Objectives

• Why manage data?
• Identify common data management issues
• Best practices for managing data
• Support: how the library and TTS can help you and your lab

What is Data?

• “Research data, unlike other types of information, is collected, observed, or created, for purposes of analysis to produce original research results” (University of Edinburgh).
• Observational
• Experimental
• Simulation data
• Derived or compiled data

Why Should I Manage it?

• Transparency & Integrity
• Compliance

Science & Personal Benefits

• Who uses your data now?
• Who COULD use your data?

• Shared/Open Data
• Scientific progress
• Impact on your career
• Citation counts

What if I Don’t Consider RDM?

Data Sharing and Management Snafu in 3 Short Acts: A data management horror story by Karen Hanson, Alisa Surkis and Karen Yacobucci.

http://www.youtube.com/watch?v=N2zK3sAtr-4

Seven “Issues” in Research Data Management

• Responsibility
• Data Management Plans
• Records Management
• File Management
• File Naming
• Metadata
• Backup and Security
• Ownership and Retention
• Long Term Planning

Issue: Responsibility

• Best Practices
  • Define roles and assign responsibilities for data management
  • Identify skills needed to perform tasks outlined in the DMP and match to available staff
  • Develop training plans for continuity
  • Assign responsible parties and monitor results

Issue: Data Management Plans

[Figure: Data Life Cycle. Creating data → Processing data → Analysing data → Preserving data → Giving access to data → Re-using data]

Creating a Data Management Plan

• “the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project;

• the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies);

• policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements;

• policies and provisions for re-use, re-distribution, and the production of derivatives; and

• plans for archiving data, samples, and other research products, and for preservation of access to them”

Issue: Data Management Plans

• Best Practices
  • What types of data will be created?
  • Who will own, have access to, and be responsible for managing these data?
  • What equipment and methods will be used to capture and process data?
  • Where will data be stored during and after the project?

Issue: File Management

• Does this sound familiar?
  • Inconsistently labeled files
  • in multiple versions…
  • inside poorly structured folders…
  • stored on multiple media…
  • in multiple locations…
  • and in various formats…

Issue: File Naming

• Best Practices
  • Avoid special characters in a file name.
  • Use capitals or underscores instead of periods or spaces.
  • Use 25 or fewer characters.
  • Use documented & standardized descriptive information about the project/experiment.
  • Use date format ISO 8601: YYYYMMDD.
  • Include a version number.
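The naming rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_file_name`, the `Project + Description_YYYYMMDD_vNN` layout, and the choice to leave the extension out of the 25-character count are all hypothetical, not part of the original guidance.

```python
import re
from datetime import date

# Assumption: the 25-character limit applies to the name without its extension.
MAX_LENGTH = 25


def build_file_name(project: str, description: str, day: date,
                    version: int, extension: str) -> str:
    """Build a name such as MouseWtScan_20140305_v02.csv (hypothetical layout)."""
    # Split on anything that is not a letter or digit, which removes special
    # characters, periods, and spaces, then capitalise the first letter of each word.
    cleaned = "".join(
        word[:1].upper() + word[1:]
        for part in (project, description)
        for word in re.split(r"[^A-Za-z0-9]+", part)
        if word
    )
    # ISO 8601 basic date (YYYYMMDD) followed by a two-digit version number.
    stem = f"{cleaned}_{day:%Y%m%d}_v{version:02d}"
    if len(stem) > MAX_LENGTH:
        raise ValueError(f"{stem!r} is longer than {MAX_LENGTH} characters")
    # The only period kept is the one in front of the file extension.
    return f"{stem}.{extension.lstrip('.').lower()}"


print(build_file_name("mouse", "wt scan", date(2014, 3, 5), 2, "csv"))
```

Raising an error for over-long names, rather than truncating them silently, keeps two different descriptions from collapsing into the same file name.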

Need Help? Contact [email protected]

Issue: Metadata

What is Metadata?

• “Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use or manage an information resource.”
  --2004, NISO, Understanding Metadata, pg. 1
• A love note to the future…
  • How will someone make sense of your data, e.g. the cells and values of your spreadsheet?
  • What universal or disciplinary standards could be used to label your data?
  • How can you describe a data set to make it discoverable?

Why Use Metadata?

• Find data from other researchers to support your research
• Use the data that you do find
• Help other professionals find and use data from your research
• Use your own data in the future when you may have forgotten details of the research
• Help ensure consistency and clarity of data through the use of technical standards and controlled vocabularies

Common metadata fields

• Title
• Creator
• Identifier
• Subject
• Funders
• Rights
• Access information
• Language
• Dates
• Location
• Methodology
• Data processing
• Sources
• List of file names
• File Formats
• File structure
• Variable list
• Code lists
• Versions
• Checksums
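As a rough illustration of how a few of these fields could travel with a data file, the sketch below writes a small JSON "sidecar" record next to the file and fills in the checksum automatically. The function names, the key names, and the `.metadata.json` suffix are assumptions made for this example; they do not follow any particular metadata standard.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_metadata(data_file: Path, title: str, creator: str,
                   subject: list, language: str = "en") -> dict:
    """Collect a handful of the common fields for one data file."""
    return {
        "title": title,
        "creator": creator,
        "subject": subject,
        "language": language,
        "file_names": [data_file.name],
        "file_formats": [data_file.suffix.lstrip(".")],
        "checksums": {
            data_file.name: {"algorithm": "SHA-256", "value": sha256_of(data_file)}
        },
    }


def write_sidecar(data_file: Path, metadata: dict) -> Path:
    """Save the record beside the data file as <name>.metadata.json."""
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar
```

A plain-text sidecar stays readable without special software, and the stored checksum lets a later reader confirm that the data file has not changed.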

What else?

• Standard conventions are used to describe content in a way that ensures units such as date, time, location, etc. are entered consistently among the researchers in your group
• Controlled vocabularies are lists of predefined terms that ensure consistency of use, and help disambiguate similar concepts. Use the controlled vocabulary that best matches your research.
  • You might create a short list of terms to choose from when populating a specific piece of data
  • For example, subject terms used in research about biometric sensing might be taken from a controlled vocabulary list such as Medical Subject Headings (MeSH)
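One way a lab might enforce a short list of terms is sketched below. The three subject terms are placeholders chosen for illustration and have not been checked against MeSH; `SUBJECT_TERMS` and `normalise_term` are hypothetical names.

```python
# Placeholder terms for illustration only; not a verified excerpt from MeSH.
SUBJECT_TERMS = {
    "Biometric Identification",
    "Heart Rate",
    "Wearable Electronic Devices",
}


def normalise_term(term: str, vocabulary: set) -> str:
    """Return the list's own spelling of a term, or raise if it is not on the list."""
    # Compare case-insensitively and ignore extra whitespace.
    lookup = {entry.casefold(): entry for entry in vocabulary}
    key = " ".join(term.split()).casefold()
    if key not in lookup:
        raise ValueError(f"{term!r} is not in the controlled vocabulary")
    return lookup[key]


print(normalise_term("  heart   rate ", SUBJECT_TERMS))
```

Rejecting unknown terms at entry time is what keeps "heart rate", "Heart-rate", and "HR" from ending up as three different subjects.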

Issue: Metadata

• Biology and health-specific metadata examples

Issue: Metadata

• Best Practices: Create a Data Dictionary
  • Describe the contents of data files
  • Define the parameters and the units of each parameter
  • Explain the formats for dates, time, geographic coordinates, and other parameters
  • Define any coded values
  • Describe quality flags or qualifying values
  • Define missing values
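A data dictionary can be as simple as a table kept alongside the data. The sketch below shows one possible shape, together with a small check that flags values the dictionary does not define. Every column name, code, and the `-999` missing-value marker is invented for illustration.

```python
# Hypothetical data dictionary: one entry per column in the data file.
DATA_DICTIONARY = {
    "sample_id": {"description": "Lab-assigned sample identifier", "format": "text"},
    "collected_on": {"description": "Collection date", "format": "YYYYMMDD"},
    "temp_c": {
        "description": "Incubator temperature",
        "units": "degrees Celsius",
        "format": "decimal",
        "missing": "-999",  # value recorded when no reading was taken
    },
    "sex": {
        "description": "Sex of the animal",
        "codes": {"F": "female", "M": "male", "U": "unknown"},
    },
    "qc_flag": {
        "description": "Quality flag",
        "codes": {"0": "passed", "1": "suspect"},
    },
}


def check_row(row: dict, dictionary: dict) -> list:
    """List the problems found in one row of data (an empty list means none)."""
    problems = []
    for column, value in row.items():
        entry = dictionary.get(column)
        if entry is None:
            problems.append(f"{column}: not defined in the data dictionary")
            continue
        if value == entry.get("missing"):
            continue  # a documented missing-value marker is acceptable
        codes = entry.get("codes")
        if codes is not None and value not in codes:
            problems.append(f"{column}: {value!r} is not a defined code")
    return problems


print(check_row({"sample_id": "S1", "sex": "X", "weight": "3"}, DATA_DICTIONARY))
```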

Need Help? Contact [email protected]

Metadata and the ELN

• Any searchable field in the Agilent or LabArchives ELN technically contains metadata
• In both ELNs, you can add tags/keywords to experiments, data files, and image files
• In some cases you can create a pre-defined list of tags/keywords to choose from

Agilent

Searchable fields:

Agilent

Funding Source via menu:

Project Focus via menu:

Agilent

Associate metadata with an experiment using keywords:

LabArchives

Associate metadata with an experiment using tags:

LabArchives

Associate keyword metadata with an image file:

Issue: Backup & Security

• How often should data be backed up?
• How many copies of data should you have?
• Where can you store your data?
• How much server space can I get?

Issue: Backup & Security

• Best Practices
  • Make 3 copies (original + external/local + external/remote)
  • Have them geographically distributed (local vs. remote)
  • Use a hard drive (e.g. Vista backup, Mac Time Machine, UNIX rsync) or tape backup system
  • Cloud storage: some examples of private sector storage resources include Amazon S3, Elephant Drive, Jungle Disk, Mozy, Carbonite
  • Unencrypted is ideal for storing your data because it will make it most easily read by you and others in the future… but if you do need to encrypt your data because of human subjects, then:
    • Keep passwords and keys on paper (2 copies), and in a PGP (Pretty Good Privacy) encrypted digital file
  • Uncompressed is also ideal for storage, but if you need to compress to conserve space, limit compression to your 3rd backup copy
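The "3 copies" practice can be partly automated. The sketch below copies a file into two backup folders and confirms each copy by checksum; the function name `back_up` and the folder paths in the usage note are hypothetical, and the remote location is assumed to be mounted as an ordinary folder.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original: Path, destinations: list) -> list:
    """Copy a file into each destination folder and verify every copy."""
    expected = sha256_of(original)
    copies = []
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        # copy2 also preserves timestamps where the file system allows it.
        copy = Path(shutil.copy2(original, folder / original.name))
        if sha256_of(copy) != expected:
            raise OSError(f"checksum mismatch for {copy}")
        copies.append(copy)
    return copies
```

A call might look like `back_up(Path("Scan_20140305_v01.csv"), ["/mnt/external_drive/lab", "/mnt/remote_share/lab"])`, with both paths standing in for whatever local and remote storage the lab actually uses. Verifying the checksum after each copy catches silent corruption at the moment it happens rather than years later.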

Issue: Ownership & Retention

• How long is long enough?

Issue: Ownership & Retention

• Intellectual Property Policy
• IRB data retention policy
• Funders’ data retention policy
• Publishers’ data retention policy
• Federal and State laws

Issue: Long-Term Planning

• What will happen to my data after my project ends?
• How can I appraise the value of my data?
• What are my options for archiving and preserving my data?
• What are my options for publishing and sharing data?

Open vs. Proprietary Formats Used in Research Labs

Issue: Long-Term Planning

• Best Practices
  • When choosing a file format, select a consistent format that can be read well into the future and is independent of changes in applications.
  • Non-proprietary: Open, documented standard, unencrypted, uncompressed, ASCII-formatted files will be readable into the future.

Works Cited

Lamar Soutter Library, University of Massachusetts Medical School. 2014. “New England Collaborative Data Management Curriculum: Module 1.” http://library.umassmed.edu/necdmc.

DataONE. 2013. “Best Practices for Data Management.” http://www.dataone.org/best-practices.

MIT Libraries. 2013. “Data Management and Publishing.” MIT. http://libraries.mit.edu/guides/subjects/data-management/index.html.

Office of Research Integrity. 2013. “Data Management.” United States Department of Health and Human Services. United States Federal Government. http://ori.hhs.gov/education/products/rcradmin/topics/data/open.shtml.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License.

Learn More

• Data Management Principles & Education:
  • Tufts Libraries Data Management Guide
  • Research Data MANTRA
  • DataONE: Best Practices
  • UK Data Archives
  • MIT Data Management and Publishing Guide

• Data Management Plans:
  • Digital Curation Centre
  • DMPTool2
  • DataONE: Data Management Planning

Find Help

• Data Management Plans and Metadata services:
  • Medford/Somerville Campus: [names/contact info]
  • Boston/Grafton Campus: [librarian names/contact info]

• Data storage and security services + ELN support:
  • [TTS contact info]