Next generation data services at the Marriott Library Rebekah Cummings J. Willard Marriott Library March 5, 2014

Next generation data services at the Marriott Library

Rebekah Cummings, J. Willard Marriott Library

March 5, 2014

Two questions to tackle today:

How do you see the data needs of the social sciences and humanities changing and evolving over the next five years?

How do you imagine libraries being involved in this changing landscape and how might we partner with campus faculty around their data needs?

Before I came to Utah…

UCLA Social Science Data Archive

Provided 1:1 consultations to research teams

Conducted data management workshops

Worked closely with the UCLA Civil Rights Project to create a preservation strategy for their publications and datasets.

What is data curation?

“The active and ongoing management of data through its lifecycle of interest and usefulness to scholarship, science, and education. Data curation activities enable data discovery and retrieval, maintain its quality, add value, and provide for reuse over time.” – University of Illinois’ Graduate School of Library and Information Science

Data Lifecycle

PLANNING

Courtesy of the UK Data Archive http://www.data-archive.ac.uk/create-manage/life-cycle

Social Science Data

Opinion polls

Surveys

Interviews

Government records

Social/Mass Media

Laboratory experiments

Field experiments

Census records

Voting records

Economic indicators

Humanities Data

Newspapers

Photographs

Letters

Diaries

Books, articles

Birth, death, marriage records

Church records

Court records

Yearbooks

Maps

Why libraries?

Opportunity to expand our services in ways that can benefit faculty.

Opportunity to build stronger relationships between libraries and research communities.

We can continue to play a role in preserving the scholarly record and making it available.

We have the skills!

Organizing information

Describing information (Metadata)

Grant writing

Copyright and licensing

Digital preservation

Open access

Instructional experience

Reference work

Professional ethics

Our faculty and students have spoken

In response to the question “What services should be added at the Marriott Library?” on the Strategic Planning survey:

40% of users responded by saying they need more assistance with research data.

60% of library employees responded by saying we should provide more assistance with research data.

A key recommendation from the external library consultants was to develop more support for research data management.

Data Policies and Funder Requirements

2011 – National Science Foundation Data Management Plan requirement

2013 – White House Public Access to Federally Funded Research memo

2014 – NEH Office of Digital Humanities Data Management Plan requirement

Journal Requirements

“A condition of publication in a Nature journal is that authors are required to make materials, data, code, and associated protocols promptly available to readers without qualification.” – Nature’s open data policy

Challenges

Data is unlike materials we’ve worked with in the past

Needs context to be understandable

Varies greatly in size and complexity

Version control

Ethical considerations

Most data are born-digital objects

Digital Libraries vs. Data Curation

Figure (file-format icons): a first group of PDF, TIFF, and WAV files, followed by a mixed group of TXT, PPT, SPSS, XLSX, TIFF, DOC, GPSS, PY, CSV, and WAV files.

Most researchers:

Were not trained in data management

Don’t know how to write a data management plan

Don’t know how to create proper metadata

Have concerns about sharing their data

Aren’t convinced they really have to share their data

Adapted from http://www.slideshare.net/carlystrasser/iassist20120608

The last five years (2010 – 2015), cont.

New policies and funder requirements

Development of best practices and standards

Technical infrastructure

Growing community of research data managers

Case studies as models for excellence

Tools for helping researchers

Question #1

How do you see the data needs of the social sciences and humanities changing and evolving over the next five years?

Almost everyone is working digitally now

Figure: "Researchers" and "Researchers Using Technology", compared for 1990 and 2020.

Different methods of data collection

Larger datasets

HathiTrust Research Center

10.5 million volumes

3.6 billion pages

1890-present

Discover patterns over time that were previously invisible

Twitter Archive

Acquired by the Library of Congress in April 2010

As of January 2013, 170 billion tweets

Available to researchers 6 months after posting

Individual to Collaborative

Digital Public Library of America

Print to Visual

1993 2013 – 2 million pageviews

Changing metrics

Altmetrics

Easier to count use than citations

“Impact” means something different than it did ten years ago.

Increased awareness of the importance of data sharing

Data management plan requirements

Journal requirements

White House directive

Trends towards transparency and openness.

Johns Hopkins Data Stack Model

With all this in mind, remember…

NONE OF US WERE TRAINED FOR THIS!!!

Question #2

How do you imagine libraries being involved in this changing landscape and how might we partner with campus faculty around their data needs?

Research Data Services

Education and Training

Data Management Plan help

Data Consultation

Metadata Assistance

Analysis and Visualization Tools (Digital Scholarship Lab)

Long-term stewardship/preservation

Repository services (USpace)

Mint DOIs and ARKs

Catalog Datasets

Data reference and acquisition

Research Data Curation pilot projects

University of Minnesota

8 months (May – Dec 2013)

Call for pilot datasets among the faculty

Outputs

Data curation workflow

Five pilot datasets

Summary report

Faculty engagement

Data Repository for U of Minnesota

Embedded librarianship

Librarians written into grant proposals as data managers.

Cost is underwritten in the grant

Option for Sustainability

Allows us to get involved at the beginning and throughout the entire data lifecycle!

The UCLA Civil Rights Project

Started as part of a class project

High profile, mostly quantitative, social science data

Two P.I.s and many graduate students in a distributed research team.

Website to eScholarship

Moved 72 CRP publications to eScholarship

Structured metadata

Open access scholarly publishing for the UC Community

Some preservation strategies

UCLA CRP Dataverse

Secured datasets and codebooks from CRP researchers

Added four datasets to Dataverse

Converted files to non-proprietary format

Added structured metadata

Added data citation and persistent identifier

Linked to related publications

Worked out data governance with researchers

Created workflow with research team for data and publication archiving
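Three of the steps listed above (converting to a non-proprietary format, adding structured metadata, and adding a data citation with a persistent identifier) can be illustrated with a minimal Python sketch. This is not the workflow the CRP team actually used. The field names, the citation pattern, and every value below, including the identifier, are hypothetical placeholders chosen for illustration.

```python
import csv
import io


def to_csv(rows, fieldnames):
    """Serialize a list of dict rows as CSV text, a non-proprietary format."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def format_citation(meta):
    """Assumed citation pattern: Creators (Year). Title. Publisher. Identifier."""
    creators = "; ".join(meta["creators"])
    return (
        f'{creators} ({meta["year"]}). {meta["title"]}. '
        f'{meta["publisher"]}. {meta["identifier"]}'
    )


# Hypothetical structured metadata record; none of these values are real.
metadata = {
    "title": "Example Survey Dataset",
    "creators": ["Researcher, A.", "Researcher, B."],
    "year": 2013,
    "publisher": "Example Dataverse",
    "identifier": "hdl:0000/EXAMPLE",  # placeholder persistent identifier
    "related_publications": ["Example report title"],
}

# Hypothetical data rows standing in for a file exported from a proprietary tool.
rows = [
    {"respondent_id": 1, "answer": "yes"},
    {"respondent_id": 2, "answer": "no"},
]

csv_text = to_csv(rows, ["respondent_id", "answer"])
citation = format_citation(metadata)
print(csv_text)
print(citation)
```

In practice a repository such as Dataverse generates the citation and identifier itself. The sketch only shows what information those steps gather in one place.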

UCLA Civil Rights Project Dataverse

Future concerns

Developing a cost model to support the ongoing expense of curating research data.

Longevity of file formats

Data governance issues

Developing library staff to support data curation activities

Questions?

Image courtesy of the US Archives via Flickr/The Commons