Research Data Management: Humanities and Social Sciences Edition CC BY-NC Celia Emmelhainz and Suzi Cole August 11, 2015 Modified from presentation by Leslie Barnes, Dylanne Dearborn, Andrew Nicholson at http://guides.library.utoronto.ca/RDM-intro

Research Data Management in the Humanities and Social Sciences


Page 1: Research Data Management in the Humanities and Social Sciences

Research Data Management: Humanities and Social Sciences Edition

CC BY-NC

Celia Emmelhainz and Suzi Cole

August 11, 2015

Modified from presentation by Leslie Barnes, Dylanne Dearborn, Andrew Nicholson

at http://guides.library.utoronto.ca/RDM-intro

Page 2: Research Data Management in the Humanities and Social Sciences

Our assumptions

• All liaison librarians need a basic knowledge of research data management (RDM).

• RDM is part of the librarian’s toolkit for serving faculty research needs.

• We don’t all need to be data experts, just as we aren’t experts in many areas that we cover.

• RDM is one of many topics we discuss with faculty over time, like collections, instruction, course guides, and student research.

• Our faculty may not know RDM terms or may not understand what our institutional repository or other archives can do with data.

• Humanists may react negatively to the term “data.”

• (Optional): we can help faculty by reading drafts of their data management plans: if we don’t understand, reviewers won’t either.

• Knowing data concepts enhances our role & expands our visibility.

• Data collection and the data lifecycle are part of where we help with curation in the library.

• This is a new knowledge area for all academic librarians.

Page 3: Research Data Management in the Humanities and Social Sciences

Why do academic libraries help with data management?

• Library culture is to acquire, organize, and preserve information

• Logical extension of services we’ve traditionally been involved with

• Libraries bring people together across disciplinary differences & campuses

Reading: Coates (2014) Ensuring research integrity: the role of data management in current crises. C&RL News 75(11): 598-601.

Page 4: Research Data Management in the Humanities and Social Sciences

After these sessions, you should…

● Know the concepts in data management

● Feel less anxious when talking about data

● Begin listening to faculty talk about their research process and outputs

● Know where to get more help with research data for faculty in your disciplines

Page 5: Research Data Management in the Humanities and Social Sciences

But why liaisons?

Info: eScience Team presentation on liaison roles, Image: CC0 from pixabay.com

A logical extension of our role as connections between the library and teaching faculty

A great way to show faculty that we care about their research as well as teaching

Liaisons as natural point of “triage”

Page 6: Research Data Management in the Humanities and Social Sciences

Liaisons – Learning Over Time

First Steps: Get comfortable with the idea of research data management.

Next Steps: Start a conversation with faculty about their needs, share resources, and direct them to data librarians for complex questions.

Moving Ahead: Take self-paced courses for librarians on the web. And try it out! Try managing data for one of your own projects.

Source: eScience Team presentation on liaison roles for data management

Page 7: Research Data Management in the Humanities and Social Sciences

Our path…

Today

… introduction to data management

… types of research data you’ll encounter

… data formats and organization

Thursday

… intro to data storage

… intro to data sharing

… advising on data management plans

Page 8: Research Data Management in the Humanities and Social Sciences

Q1: What is DATA?

Prompt: what materials do your faculty use to make sense of their research?

Page 9: Research Data Management in the Humanities and Social Sciences

“Research data is collected, observed, or created for purposes of analysis to produce original research results.”

- U Edinburgh

Page 10: Research Data Management in the Humanities and Social Sciences

Q2: What are DATA in the humanities?

Page 11: Research Data Management in the Humanities and Social Sciences

Textual data in the humanities could include:

- Scholarly editions

- Text corpora

- Text with markup

- Thematic collections

- Annotations

- Accompanying analysis

- Finding aids

Cf: guides.library.ucla.edu/c.php?g=180580&p=1187629, guide.dhcuration.org/intro/, image source: slideshare.net/ULCCEvents/the-humanities-and-data-management

Page 12: Research Data Management in the Humanities and Social Sciences

Data in the qualitative social sciences could include:

• microfilms

• copies of old documents

• oral interviews

• video tapes

• hand-written records

from: www.nsf.gov/sbe/ses/common/archive.jsp

Page 13: Research Data Management in the Humanities and Social Sciences

Humanities and arts data:

● Texts used for research

● Annotations

● Images and illustrations

● Citations

● Bibliographic information

● Contextual information

● Audio or video files

Health and Life Sciences data:

● Health indicators, vital signs

● Protein or genetic sequences

● Spectra and images

● Artifacts and samples

● Slides and specimens

Social Sciences data:

● Survey responses

● Focus groups and interviews

● Administrative records

● Demographic information

● Opinion polling

● Maps and geospatial data

● Websites, primary sources

Physical Sciences data:

● Sensor or lab measurements

● Computer modeling and simulations

● Observations and/or field notes

● Numerical measurements

Cf: Best Practices for Arts/Humanities Data Management Plans, CU-Boulder http://bit.ly/1MkKCIa

Page 14: Research Data Management in the Humanities and Social Sciences

DigitalThoreau.org: On the left, the Princeton edition of Walden; right, original 1847 draft with changes marked up.

Page 15: Research Data Management in the Humanities and Social Sciences

Text Encoding Initiative (TEI) is a markup language that records the structure of text (author, chapters, pages, quotes) for digital humanities/curation purposes.
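To make the idea of structural markup concrete, here is a minimal sketch in Python that reads a small TEI-style fragment and pulls out the author, chapter headings, and quotations. The fragment is a simplified illustration written for this example (its paragraph text is placeholder wording, and a complete TEI file would carry a fuller header); the function name `summarize` is also just an assumption for the sketch.

```python
import xml.etree.ElementTree as ET

# TEI elements live in this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Simplified, illustrative fragment: the paragraph text is placeholder wording,
# and a complete TEI file would include a fuller header.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Walden</title>
        <author>Henry David Thoreau</author>
      </titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="chapter" n="1">
        <head>Economy</head>
        <p>Sample paragraph text with a <quote>sample quotation</quote>.</p>
      </div>
      <div type="chapter" n="2">
        <head>Where I Lived, and What I Lived For</head>
        <p>Another sample paragraph.</p>
      </div>
    </body>
  </text>
</TEI>"""


def summarize(tei_xml):
    """Return the title, author, chapter headings, and quotes found in the markup."""
    root = ET.fromstring(tei_xml)
    title = root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS)
    author = root.findtext(".//tei:titleStmt/tei:author", namespaces=TEI_NS)
    chapters = [
        head.text
        for head in root.findall(".//tei:div[@type='chapter']/tei:head", TEI_NS)
    ]
    quotes = [quote.text for quote in root.findall(".//tei:quote", TEI_NS)]
    return {"title": title, "author": author, "chapters": chapters, "quotes": quotes}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Because the structure is recorded explicitly, a program can ask for "all chapter headings" or "all quotations" without guessing from layout, which is what makes encoded texts reusable as data.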

Page 16: Research Data Management in the Humanities and Social Sciences

Ask Yourself (#1):

Using a project summary, ask yourself:

- what is this research project about?

- what types of data are being collected?

- what types of data are being created?

Page 17: Research Data Management in the Humanities and Social Sciences

Data (the stuff we do research with) are vital at every point in the research lifecycle.

Image: www.lib.uci.edu/dss/images/lifecycle.jpg

Page 18: Research Data Management in the Humanities and Social Sciences

Example: data across the lifecycle

Example: temperature data from a lake

Raw, Processed, Analyzed, Finalized/Published

Page 19: Research Data Management in the Humanities and Social Sciences

WHY manage data?

① for the researchers’ own current/future benefit

② for transparency and integrity

③ for sharing knowledge & how it was constructed

④ to meet grant requirements (NEH, NSF)

⑤ to comply with ethics requirements

⑥ to increase exposure of faculty research

Page 20: Research Data Management in the Humanities and Social Sciences

2: Data Formats and Organization

CC image from pixabay.com/en/filing-cabinet-office-furniture-146160/

Page 21: Research Data Management in the Humanities and Social Sciences

File Naming video

● Use meaningful names

● Avoid special characters

● Use caps or underscores, not spaces

● Choose a standard date format: YYYYMMDD or YYYY-MM-DD

● Label versions (v2, v15)
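The file naming tips can be sketched as a small Python helper. This is only one possible convention: the function name `make_filename`, the order of the parts, and the zero-padded version number (so that v02 sorts before v15) are assumptions made for this example, not rules from the video.

```python
import re
from datetime import date


def make_filename(description, when, version, extension):
    """Build a file name with a YYYYMMDD date, underscores, and a version label."""
    # Replace every run of spaces or special characters with one underscore.
    # Note: this also replaces accented letters, which may not suit every project.
    words = re.sub(r"[^A-Za-z0-9]+", "_", description).strip("_")
    stamp = when.strftime("%Y%m%d")
    return f"{stamp}_{words}_v{version:02d}.{extension.lstrip('.')}"


if __name__ == "__main__":
    print(make_filename("Oral history: site visit #3", date(2015, 8, 11), 2, ".docx"))
    # prints 20150811_Oral_history_site_visit_3_v02.docx
```

Putting the date first means that an ordinary alphabetical sort in a file browser is also a chronological sort.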

Page 22: Research Data Management in the Humanities and Social Sciences

Data Structures video

Could organize by:

● Type of information

● Date and time

● Research project

● Theme or subject

frontispieces/20141211/images

images/frontispieces/20141211
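The two folder paths above hold the same three pieces of information in a different order. As a rough sketch, the Python below builds either layout from one description of a file; the function name `folder_for` and the dictionary keys are made up for this illustration.

```python
from pathlib import PurePosixPath


def folder_for(record, order):
    """Join the chosen facets of a record into a folder path, in the given order."""
    return PurePosixPath(*(record[key] for key in order))


if __name__ == "__main__":
    record = {"type": "images", "subject": "frontispieces", "date": "20141211"}
    print(folder_for(record, ["subject", "date", "type"]))
    # prints frontispieces/20141211/images
    print(folder_for(record, ["type", "subject", "date"]))
    # prints images/frontispieces/20141211
```

Either order can work; what matters is choosing one and applying it consistently across the project.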

Page 23: Research Data Management in the Humanities and Social Sciences

Data Dictionaries and Codebooks

Explain what a dataset contains:

● Contents or organization of a file

● Glossary of key concepts or terms

● Definitions for each variable name

● Relationships between tables/files

● Codes that have been used to sort data

● Sampling or other methods used
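As a rough illustration of where a codebook can start, the Python sketch below lists each variable in a small CSV file along with the codes it contains, leaving the definition blank for the researcher to fill in. The sample data, the column names, and the function name `codebook_skeleton` are all invented for this example.

```python
import csv
import io

# Invented sample data for illustration only.
SAMPLE_CSV = """respondent_id,age_group,q1_trust
001,2,4
002,3,
003,2,5
"""


def codebook_skeleton(csv_text):
    """Return one entry per variable, ready for a human to add definitions."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        # Ignore blank cells so missing answers are not listed as a code.
        values = [row[name] for row in rows if row[name] not in ("", None)]
        entries.append(
            {
                "variable": name,
                "definition": "",  # to be written by the researcher
                "non_empty": len(values),
                "distinct_values": sorted(set(values)),
            }
        )
    return entries


if __name__ == "__main__":
    for entry in codebook_skeleton(SAMPLE_CSV):
        print(entry)
```

A script can only list what is in the file; the meaning of each code (for example, what `age_group` 2 stands for) still has to come from the researcher.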

Page 24: Research Data Management in the Humanities and Social Sciences

Use open formats when possible:

“Open source” formats keep files accessible over time; proprietary formats may be lost if a company goes out of business. Open formats let future researchers access your data!

Video: .mov, .mpeg

Audio: .wav, .mp3

Data: .csv, .sas

Images: .tiff, JPEG 2000

Text: PDF/A, ASCII
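One way to act on this list is to scan a project's file names and flag anything in a format the slide does not mention. The Python sketch below does that by file extension. The mapping from format names to extensions (.jp2 for JPEG 2000, .pdf for PDF/A, .txt for ASCII, plus .tif as a variant of .tiff) is an assumption made here, and an extension alone cannot confirm that a PDF is really PDF/A.

```python
from pathlib import PurePath

# Extensions taken from the slide's list, plus assumed equivalents
# (.tif, .jp2, .pdf, .txt) for formats the slide names without an extension.
LISTED_EXTENSIONS = {
    ".mov", ".mpeg", ".wav", ".mp3", ".csv", ".sas",
    ".tiff", ".tif", ".jp2", ".pdf", ".txt",
}


def needs_review(filenames):
    """Return the file names whose extension is not on the slide's list."""
    return [
        name
        for name in filenames
        if PurePath(name).suffix.lower() not in LISTED_EXTENSIONS
    ]


if __name__ == "__main__":
    files = ["interview01.WAV", "survey.xlsx", "notes.docx", "scan.tiff"]
    print(needs_review(files))
    # prints ['survey.xlsx', 'notes.docx']
```

A flagged file is only a prompt for a conversation about conversion, not proof that the format is a problem.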

Page 25: Research Data Management in the Humanities and Social Sciences

Ask Yourself (#2):

Using the project summary, ask yourself:

- what file formats are the data now in?

- do they need conversion to open formats?

- are they well documented with metadata?

Page 26: Research Data Management in the Humanities and Social Sciences

Intersession exercise:

Read the NEH guidelines for data management.

View any two data management libguides: Who is the audience? What services are offered? How does it connect to users?

Briefly review your chosen project summary, in preparation for the final class.

Page 27: Research Data Management in the Humanities and Social Sciences

Research Data Management: Session Two!

CC BY-NC

Celia Emmelhainz and Suzi Cole

August 13, 2015

Modified from presentation by Leslie Barnes, Dylanne Dearborn, Andrew Nicholson

at http://guides.library.utoronto.ca/RDM-intro

Page 28: Research Data Management in the Humanities and Social Sciences

3: Data Security and Sensitive Data

CC image: pixabay.com/en/computer-security-business-767784/

Page 29: Research Data Management in the Humanities and Social Sciences

Don’t let this be you! (or your faculty, or your students…)

Image www.neatorama.com/2013/04/24/Backup-Your-Data/

Page 30: Research Data Management in the Humanities and Social Sciences

Common options for data storage:

● Local hard drives (weak)
Ex: personal or office desktop, laptop computer

● External storage devices (weak)
Ex: USB drives, external hard drives

● Networked storage (okay)
Ex: university servers, but see Colby**

● Cloud storage services (okay)
Ex: Microsoft, RackSpace, Amazon, Google

Page 31: Research Data Management in the Humanities and Social Sciences

Data Storage: Best Practices

● Back up all data frequently, especially after major changes

● Automate the backup process

● Use ‘versioning software’ (see ITS) or file names to track changes in team projects

The “Rule of 3”: Keep three copies of key data (original file, local backup, remote backup)

… in at least two different locations

… in at least one offline/offsite location
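Automating a backup can be as simple as a short script. The Python sketch below copies one file into two backup folders and checks each copy against the original with a checksum, giving three copies in total. The function names and folder names are placeholders; in practice one of the folders would sit on a different device or an offsite service, which this sketch does not set up.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original, backup_dirs):
    """Copy a file into each backup folder and verify every copy."""
    original = Path(original)
    expected = sha256_of(original)
    copies = []
    for folder in backup_dirs:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / original.name
        shutil.copy2(original, target)  # copy2 also keeps timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Backup at {target} does not match the original")
        copies.append(target)
    return copies


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "20150811_field_notes_v01.txt"
        source.write_text("example field notes")
        # Placeholder folders; a real remote backup would live elsewhere.
        print(back_up(source, [Path(tmp) / "local_backup", Path(tmp) / "remote_backup"]))
```

The checksum step matters because a copy that silently failed is worse than no copy: it gives false confidence.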

Page 32: Research Data Management in the Humanities and Social Sciences

Sensitive Data:

…is any data that, if released, could harm the people who participated in the research:

● Address, birth date, name, location

● Sensitive political opinions

● Sexual practices

● GPS data locating endangered species

● Coordinates for burial sites or sacred places

Sensitive data is treated with caution; there are currently few archiving options.

Page 33: Research Data Management in the Humanities and Social Sciences

Concepts in Sensitive Data

● Research ethics: protect identities of people interviewed; minimize risk of any leaks

● Confidentiality: how participants’ identifiable private information will be managed and disseminated

● Disclosure risk: increased with online accessibility of data or storage of documents

Page 34: Research Data Management in the Humanities and Social Sciences

Sensitive Data: Best Practices

● Collect data without identifying information, if possible

● Strip sensitive or identifying information before archiving or sharing research data

● Encrypt your computer, and use secure connections and secure servers

● Place sensitive data in a restricted archive with an embargo (time delay) or ethics approval required for access
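As one small illustration of stripping identifying information before sharing, the Python sketch below drops named columns from a CSV file. The column names, the sample row, and the function name `strip_identifiers` are invented for this example. Removing direct identifiers is only a first step: combinations of the remaining details can still point to a person, so this does not replace ethics review.

```python
import csv
import io

# Assumed column names for illustration; each project has its own.
IDENTIFYING = {"name", "address", "birth_date", "location"}


def strip_identifiers(csv_text, identifying=IDENTIFYING):
    """Return the CSV text with the identifying columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [col for col in (reader.fieldnames or []) if col.lower() not in identifying]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the columns we chose not to keep.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = (
        "participant_id,name,birth_date,response\n"
        "P01,Example Person,1970-01-01,agree\n"
    )
    print(strip_identifiers(sample))
```

Keeping a neutral `participant_id` lets the shared file stay linked to the restricted original without exposing who the participant is.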

Page 35: Research Data Management in the Humanities and Social Sciences

Ask Yourself (#3):

Using the project summary, ask yourself:

- where will data be stored?

- who is responsible for storage and backup?

- how will you manage access to sensitive data?

Page 36: Research Data Management in the Humanities and Social Sciences

4: Data Retention & Preservation

Page 37: Research Data Management in the Humanities and Social Sciences

image from datasupport.researchdata.nl/

Page 38: Research Data Management in the Humanities and Social Sciences

“What data do I keep?”

It all depends on:

…whether data is irreplaceable

e.g. are there other copies of this book, document, version, image, interview?

…how much data is needed to verify or reanalyze a research project

…policies of funders, IRB, discipline

Page 39: Research Data Management in the Humanities and Social Sciences

Best Practices: Data Preservation

● Use open-source, non-proprietary files

● Include all software needed, if possible

● Note all files and their relationship/structure

● Identify who is responsible for preservation

● Determine how long data should be held

● Budget time and money before starting a project to properly preserve and archive data at the end!

Page 40: Research Data Management in the Humanities and Social Sciences

Ask Yourself (#4):

Using the project summary, ask yourself:

- Which data should be kept? Why?

- How long should data be kept?

- Who is responsible for preserving the data?

Page 41: Research Data Management in the Humanities and Social Sciences

5: Data Sharing and Publication

Page 42: Research Data Management in the Humanities and Social Sciences

Fears in sharing data…

Often, researchers want to hide their data:

● Fear criticism of their methods/results

● Fear exposure of confidential data

● Fear political/legal ramifications

● Fear getting “scooped” on analysis

● Believe benefits are low, and the cost is high

CC image: pixabay.com/en/hands-holding-embracing-loving-718562/

Page 43: Research Data Management in the Humanities and Social Sciences

But, sharing data…

● Is often required by journals and funders

● Reduces the costs of research by reducing project duplication

● Is a valuable check on methods and ethics

● Helps promote faculty discoveries

● Increases the impact of faculty work

● May support faculty tenure or salary increases!

Page 44: Research Data Management in the Humanities and Social Sciences

Relevant data repositories:

and of course…

Page 45: Research Data Management in the Humanities and Social Sciences

Data Papers:

Dataset Description

Reuse Potential

Methods

Overview/Context

Page 46: Research Data Management in the Humanities and Social Sciences

Data as a Publication

● Data which has been shared can be cited:

Data citations involve: author, title, year, publisher / archive, version, URL or DOI for access.

● Data citations are a metric that can support tenure and promotion for our faculty!

● ORCiDs can help people find and cite data by a given researcher.
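The citation elements listed above can be assembled into a single string. The Python sketch below shows one possible ordering; citation styles differ, so the layout, the function name `format_data_citation`, and every value in the example (including the DOI, which is a made-up placeholder) are assumptions for illustration only.

```python
def format_data_citation(author, year, title, publisher, version, locator):
    """Join the citation elements in one possible order; styles vary."""
    return f"{author} ({year}). {title} (Version {version}) [Data set]. {publisher}. {locator}"


if __name__ == "__main__":
    # All values below are hypothetical placeholders, not a real dataset.
    print(
        format_data_citation(
            author="Researcher, A.",
            year=2015,
            title="Example oral history interviews",
            publisher="Example Data Archive",
            version="2",
            locator="https://doi.org/10.0000/example",
        )
    )
```

Whatever the style, the persistent locator (URL or DOI) and the version are the parts that let a reader retrieve exactly the data that was used.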

Page 47: Research Data Management in the Humanities and Social Sciences

Best Practices in Data Sharing

● Find out who owns the data (researcher? university? funding organization?)

● Review legal issues such as copyright or publishers’ embargoes

● Consider ethical issues related to sensitive data or communities

● See publisher/funder requirements for sharing

Page 48: Research Data Management in the Humanities and Social Sciences

Data Management Plans

CC image: pixabay.com/en/whiteboard-man-presentation-write-849812/

Page 49: Research Data Management in the Humanities and Social Sciences

What’s in a Data Management Plan?

All the things we’ve discussed!

Page 50: Research Data Management in the Humanities and Social Sciences

What’s in a Data Management Plan?

● What types of data will be created?

● Who will own, have access to, and be responsible for managing these data?

● What equipment or methods will capture, process and document the data?

● Where will data be stored during and after active research?

● How will the data be shared with current or future researchers?

Page 51: Research Data Management in the Humanities and Social Sciences

Data Management Plans (DMPs) are a great way to…

● plan how you’ll handle research materials

● describe how you’ll document, store, and share data so that others can use it

● remain accountable for how you use and share research materials

● get funded on major research projects!

Page 52: Research Data Management in the Humanities and Social Sciences

All research proposals sent to the National Science Foundation (NSF) must include a 2-page data management plan, showing how the data will be cared for and shared.

The NSF is a common source of research money in: anthropology, geography, psychology, economics, government, STS, and many interdisciplinary projects.

Page 53: Research Data Management in the Humanities and Social Sciences

The NSF expects that all researchers:

“should be prepared to place their data in fully cleaned and documented form in a data archive or library within one year after the expiration of an award.

Before an award is made, investigators will be asked to specify in writing where they plan to deposit their data set”

- National Science Foundation guide for social and economic sciences at nsf.gov/sbe/ses/common/archive.jsp

Page 54: Research Data Management in the Humanities and Social Sciences

For the NEH, data are “materials generated or collected during the course of conducting research.”

Humanities data such as “citations, software code, algorithms, digital tools, documentation... geospatial coordinates… reports, and articles” should be archived. Sensitive information can be excluded.

So, humanities faculty should also have a plan for how they’ll archive and share their research data! Source: neh.gov/files/grants/data_management_plans_2015.pdf

Page 55: Research Data Management in the Humanities and Social Sciences

How do we actually make DMPs?

● Templates are a starting point:

● However, researchers still need to carefully think through data issues with grants officers, peers, or librarians

● http://libguides.colby.edu/data_mgmt

Page 56: Research Data Management in the Humanities and Social Sciences

Sample DMPs

image: asphalttexas.com/wp-content/uploads/2014/06/Screen-Shot-2014-06-18-at-4.33.29-PM.png

Page 57: Research Data Management in the Humanities and Social Sciences

Data management at Colby:

• Liaisons are first point of contact

• Suzi and Celia advise on further issues

• We are an ICPSR member; quantitative researchers can deposit data there.

• Images and data may be archived in Digital Commons/Shared Shelf; check with Marty.

cf. libguides.colby.edu/data_mgmt.

Page 58: Research Data Management in the Humanities and Social Sciences

Question: What 3 things can you do this year with data management?

Image: http://www.dailymail.co.uk/news/article-2728736/Otter-aerobics-Large-group-spotted-going-paces-synchronised-exercise.html

Page 60: Research Data Management in the Humanities and Social Sciences

Thanks to New England Collaborative Data Management Curriculum for sharing their slides.

Many thanks to Leslie Barnes, Dylanne Dearborn, and Andrew Nicholson at University of Toronto for sharing their abbreviated slides (http://guides.library.utoronto.ca/RDM-intro), from which this presentation was adapted for the humanities.