Cynthia Parr @cydparr
US Department of Agriculture
National Agricultural Library
11 January 2017
Public access to research results at USDA
Credit: Phenocam Swan Lake Research Farm, MN
The Story
We are
• committed to public access
• PubAg is well along
• Ag Data Commons in the process of making USDA-funded data accessible
The Story part 2
But public access isn’t good enough
Therefore we are
1. enhancing our platform
2. establishing sound curation policies and processes
3. promoting machine-readability and data stories
4. seeking a sustainable business model
Federal directives: Public access to open, machine-readable data
PubAg https://pubag.nal.usda.gov
• Launched 2014
• Almost 50K full-text peer-reviewed articles
• More than 1 million citations for other papers
• Will collaborate with US Forest Service Treesearch
• Will expand beyond Agricultural Research Service full text
• Will cooperate with the CHORUS publisher consortium
• Will soon launch a redesign
Transform agriculture to deliver a 20% increase in quality production with 20% lower environmental impact by 2025
-- USDA Agricultural Research Service
Public access is not enough
Goals for USDA digital scientific data
Who                              What
Researchers and funders          Compliance with public access
Agencies                         Compliance with open data
Researchers (data submitters)    Safe, citable place for data
Researchers (data users)         Find and use awesome data
Ag Data Commons https://data.nal.usda.gov
DKAN http://nucivic.com/dkan/
PRO
• Open source community
• Drupal modules for basic CMS functions
• Can feed Data.gov
• Basic metadata already supported
CON
• Not designed for scientific data or scientists
• No links to literature
• No Digital Object Identifiers
• Doesn’t handle dataset relationships
• Metadata inadequate for compliance checking & re-use
Use all this for some data intensive research
1. enhancing the platform
Ag Data Commons Pilot FY 2016
• Self-submission accounts (almost 100 now)
• More than 240 datasets (104 harvested)
• Distributed curation
• Links to PubAg, tagged with NAL thesaurus terms
• DataCite Digital Object Identifiers, ORCIDs, FundRef
• Methods metadata, data dictionaries for re-use
• Designed to feed Data.gov
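The pilot features above (DOIs, ORCIDs, funder information, thesaurus terms, links to literature) amount to a structured dataset record. A minimal sketch of what such a record could look like in Python follows. The class name, field names, and example values are all hypothetical placeholders chosen for illustration; they are not the Ag Data Commons schema or the Data.gov format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical, simplified dataset description (illustration only)."""
    title: str
    doi: str                    # persistent identifier, e.g. a DataCite DOI
    creator_orcids: List[str]   # ORCID iDs of the people who created the data
    funder: str                 # name of the funding body
    keywords: List[str] = field(default_factory=list)          # e.g. thesaurus terms
    related_articles: List[str] = field(default_factory=list)  # links to literature


def missing_fields(record: DatasetRecord) -> List[str]:
    """Return the names of fields that are empty (a basic completeness check)."""
    return [name for name, value in asdict(record).items() if not value]


def to_json(record: DatasetRecord) -> str:
    """Serialize the record to JSON so other systems can harvest it."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Placeholder values, not a real dataset.
example = DatasetRecord(
    title="Example soil moisture observations",
    doi="10.0000/example-doi",
    creator_orcids=["0000-0000-0000-0000"],
    funder="Example funder",
    keywords=["soil water content"],
)

print(missing_fields(example))
print(to_json(example))
```

A check like `missing_fields` is one way a curator could flag records that lack links to literature before a DOI is assigned.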
2. Sound curation policies and processes
• Who can submit?
• What do we accept?
• When do we assign DOIs?
• What embargo periods are okay?
• How much review of metadata and data do we do?
• Who reviews metadata and data?
• How should data be organized?
• When do we offer a group a “collection”?
• Must we host all the data?
• What can we automate?
• How do we make things more machine-readable?
• When should datasets be versioned?
• How do we handle preservation?
• How much and what kind of data storage do we need?
• How do we avoid licensing and “ownership” confusion?
Research products
Include in the Ag Data Commons (or provide links)
• Raw data files and/or Processed data files
• Data dictionary or Readme
Do not submit with the data
• Manuscript
• Figures/tables from manuscript
Research products
Include as resources (resource can be URL pointer)
• Web database
• Software
• Source code/Scripts/Workflows
• User manuals
Do not submit with the data
• Presentations associated with the study
• News articles or press releases
• Related or cited data
3. machine-readability
• JSON, RDF
• Data dictionary
• CSV, API, DB, code
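One way a data dictionary helps machine-readability is that a CSV file can be checked against it automatically. The sketch below assumes a very simple dictionary format (column name mapped to a Python type and a description). The column names, the dictionary layout, and the `check_csv` helper are hypothetical, not an Ag Data Commons tool.

```python
import csv
import io

# Hypothetical data dictionary: column name -> (expected type, description).
DATA_DICTIONARY = {
    "site_id": (str, "Site identifier"),
    "date": (str, "Observation date, ISO 8601"),
    "yield_kg_ha": (float, "Yield in kilograms per hectare"),
}


def check_csv(text, dictionary):
    """Return a list of human-readable problems found in the CSV text."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []

    # Every documented column should be present, and vice versa.
    for name in dictionary:
        if name not in header:
            problems.append(f"missing column: {name}")
    for name in header:
        if name not in dictionary:
            problems.append(f"undocumented column: {name}")

    # Every non-empty value should parse as the documented type.
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        for name, (kind, _description) in dictionary.items():
            value = row.get(name)
            if value is None or value == "":
                continue
            try:
                kind(value)
            except ValueError:
                problems.append(
                    f"line {line_number}: {name}={value!r} is not {kind.__name__}"
                )
    return problems


sample = "site_id,date,yield_kg_ha\nA1,2016-07-01,high\n"
for problem in check_csv(sample, DATA_DICTIONARY):
    print(problem)
```

The check only covers column names and basic types; units, controlled vocabularies, and value ranges would need a richer dictionary format.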
It is still early
Effort is needed to create usable, scalable systems & linkable data
3. Data stories
From data.gov
Data-driven stories
To sum up
USDA is committed to public access
But public access isn’t good enough
Therefore
Acknowledgements
Susan McCarthy, Ursula Pieper, Erin Antognoli, Jon Sears, Qing Qu, Jeff Campbell, Jocelyn McNamara, Melissa Lohrey, Don Gourley, GovDelivery, Angry Cactus team
The PubAg team, especially Melanie Gardner
UMD: Kerry Huller, Adam Kriesberg, Meghna Sarin, Candice Ho
Other students: Jaylen Nathwani