Research Data Repositories: Developing and Implementing Infrastructures for Institutional and Consortial Environments. Ray Uzwyshyn, Ph.D. MBA MLIS, Director, Collections and Digital Services, Texas State University Libraries

Research Data Repositories - CNI: Coalition for Networked ...


  • Research Data Repositories: Developing and Implementing Infrastructures for Institutional and Consortial Environments

    Ray Uzwyshyn, Ph.D. MBA MLIS, Director, Collections and Digital Services,

    Texas State University Libraries

  • Online Data Research Repositories: What Are They?

    •  Way to Manage a Researcher’s Data/Metadata

    •  Permalinking Strategy for Data Citation

    •  Way to Manage Federal Grant Compliance

    •  Middle-Term Data Archiving and Sharing Strategy

  • The Research Data Repository Lifecycle

    Becoming part of the Science, Social Science and Humanities Research Process

    Promotes: accuracy, efficiency, sharing

  • Why Are Data Management Repositories Necessary?

    Most major Federal grant agencies require data access as a mandatory part of the grant proposal/oversight process (NIH, NSF, NEH, USDA).

    Wordle of the Final NIH Statement on Sharing Research Data, Mandatory 2003

  • What Makes Data Management Repositories Useful?

    •  Makes available faculty, departmental and institutional research
    •  Allows publication of negative data (lessens research replication)

    Wordle of the National Science Foundation’s Award and Administration Guide, Chapter VI.D.4, Mandatory 2011

  • Types of Research Data Repositories

    1) Project specific: large single faculty/team projects

    2) Discipline specific, i.e. Purdue Nanohub/Nanotechnology

    3) Institutional or Consortial (either institution-wide or consortial repositories)

  • All-Purpose and Specialized Data Repository Platforms

    Fearon, D. & Sallans, A. C. (January 2014). Institutional Research Data Management: Policies, Planning, Services and Surveys. Coalition for Networked Information. https://www.youtube.com/watch?v=rvbrW7S2fes (54 ARL Libraries currently offer data management services)

  • Research Data Repository Software Characteristics

    •  Hosted or on a server
    •  Software contains management and collaborative options

    •  Open source or proprietary software
    •  Wide Variety of Data Types

    (Excel to SPSS to various disciplinary specific formats)

  • Evaluation Criteria
    •  System Performance/Robustness
    •  Usability
    •  An active open source community

    Gather Finalists: Harvard’s Dataverse, Purdue’s Hubzero

    Figshare

    Make Final Choice: Harvard’s Dataverse

    Part I: Planning Your Repository

    Data Repository Working Group Report (August 28, 2015)

    Environmental Scan of Needs for Your Institution or Consortium

  • Dataverse: Harvard’s Open Source Research Data Solution

    Data sharing, data citation, data publishing and versioning management

    Social Sciences Beginnings (IQSS), Data Science (site) http://thedata.org, Dataverse Open Source Download (Github), Software Background

  • Dataverse Architecture (Consortial)

    Research Study Data

    Original Data Set Files, Metadata, Paratextual Materials (Methodology, Field Notes, Multimedia, Graphs, Programs etc.)

    Texas State University Dataverse

    Texas Digital Library Dataverse

    University of Houston, UT Austin Dataverses, etc.

    Centers

  • Data Citation and Metadata

  • Dataverse Metadata Example (From the Simple to Complex)

    Schemas Supported: Geospatial, Life Sciences, Astronomy and Physics, Georeferenced Data
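As a rough illustration of how descriptive metadata and a persistent identifier combine into a data citation, here is a minimal Python sketch. The field names, the citation layout, and the DOI `10.12345/EXAMPLE` are all assumptions made for this example; they are placeholders rather than the output of Dataverse or any particular repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetRecord:
    """Minimal descriptive metadata for one dataset (illustrative fields only)."""
    authors: List[str]
    year: int
    title: str
    doi: str         # bare DOI; "10.12345/EXAMPLE" below is a placeholder, not a real DOI
    repository: str  # name of the hosting repository
    version: str     # dataset version label, e.g. "V1"


def format_citation(record: DatasetRecord) -> str:
    """Build a human-readable citation that ends in a resolvable DOI link.

    The ordering of elements here is an assumed layout for illustration.
    """
    authors = "; ".join(record.authors)
    return (
        f'{authors}, {record.year}, "{record.title}", '
        f"https://doi.org/{record.doi}, {record.repository}, {record.version}"
    )


# Hypothetical dataset used only to exercise the function.
example = DatasetRecord(
    authors=["Doe, Jane", "Roe, Richard"],
    year=2016,
    title="Hypothetical Survey Data",
    doi="10.12345/EXAMPLE",
    repository="Example Dataverse",
    version="V1",
)
print(format_citation(example))
```

Carrying the version label in the citation means a reader can tell which state of the dataset was used, which is the point of pairing versioning with a permanent link.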

  • The Many Planning Aspects of Data Research Repositories

    Planning Principles

    Wide Flexibility on Institutional Levels.

    Guiding Consortial Templates which can be customized on institutional levels

  • Part II: Developing Your Data Repository TDL Dataverse State Working Group

    (August 2015 – December 2016)

    Charge: Develop, Pilot and launch a consortial repository for research data archiving and management.

    Sub-Committees

    Working Group

    members

    Texas Universities

    Main Working Group (14) (4 Subcommittees)
    •  Policy and Governance
    •  Workflows and Outreach
    •  Budget/Business Model
    •  Technology

    State Data Repository Symposium Group (Baylor)

    Final Report October, 2016

  • http://data.tdl.org

    Interface Design & Usability

  • Texas Data Repository

    Member University Libraries (service & outreach)

    Researchers (deposit, search, publish)

    1) Mixed 2) Mediated 3) Unmediated (Direct)

    Service Models

  • Texas State Academic Research

    Research Data

    TS Dataverse (Regular to Medium Size Data Sets)

    Custom Data Storage

    (Big Data, TB+, TR)

    Text

    D-Space Publications Repository

    Texas State Repositories Architecture

  • One Size Does Not Fit All

    Types of Data Projects (Sizes)
    1) Normal Range Projects (files/data fit on server and may be uploaded: Dataverse, Hubzero)

    2) Large Projects (data may require specialized university IT support, i.e. terabyte/petabyte drives, pointers etc.)

    3) Huge Projects (projects require consortial possibilities, national models, Texas Advanced Computing Center (TACC), DPN, Duracloud, AWS, custom solutions)
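The three project sizes above can be read as a simple routing rule. The Python sketch below is one way to express it. The byte thresholds (1 TB and 100 TB) and all names are hypothetical values chosen for illustration; the slides give no exact cut-offs, and a real deployment would set them from local server and budget constraints.

```python
from enum import Enum

TB = 1024 ** 4  # bytes in one terabyte (binary convention)

# Hypothetical cut-offs; not taken from the presentation.
NORMAL_LIMIT_BYTES = 1 * TB
LARGE_LIMIT_BYTES = 100 * TB


class StorageTier(Enum):
    NORMAL = "upload to the repository (e.g. Dataverse, Hubzero)"
    LARGE = "specialized university IT storage, linked from the repository"
    HUGE = "consortial, national, or cloud infrastructure"


def choose_tier(size_bytes: int) -> StorageTier:
    """Map a project's total data size to one of the three tiers."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < NORMAL_LIMIT_BYTES:
        return StorageTier.NORMAL
    if size_bytes < LARGE_LIMIT_BYTES:
        return StorageTier.LARGE
    return StorageTier.HUGE


# Example: a 5 GB survey dataset, a 2 TB imaging project, a 500 TB simulation archive.
for size in (5 * 1024 ** 3, 2 * TB, 500 * TB):
    print(size, "->", choose_tier(size).name)
```

Keeping the limits as named constants makes it easy for each institution in a consortium to tune them without touching the routing logic.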

  • Faculty Data Management Plan Documentation/Policy Tool

    Overview Video

    Customizable Plan Outline Tool, Resource Links, Supports All Major Funders

    https://dmptool.org/ California Digital Library

    Connections with Office of Sponsored Research and Other Relevant University Offices, Library/Dataverse Templates

  • Part III: Human Resource Infrastructures (Working Teams)

    Full or Part Time

    Data Repository Liaison Publication Repository Liaison Metadata Liaison Subject Liaisons (Outreach) Committee for Workflows & Policies

    Current Hires

    Digital Collections Librarian (Texas State Data Repository Dataverse/Publications Repository: D-Space)

    Data Visualization and Analytics Librarian (Tableau, Bayesia)

    Future Hires

    Machine Learning/Neural Networks/AI Librarian (working with the data)

  • Marketing and Other Possibilities

    Future Possibilities: VIREO, DATA REPOSITORY CONNECTIONS

    Electronic Thesis and Dissertations (ETD) Repository (D-Space)

    Working with the Data: Support Mechanisms, Data Literacy (Workshops/Education), Data Visualization, Data Analytics, Machine Learning/Neural Networks/AI

  • Research Data Repository Adoption Lifecycle

    (2018)

  • Further Links/References
    •  ARL NSF Data Sharing Policy and Resource Links. http://www.arl.org/focus-areas/e-research/data-access-management-and-sharing
    •  ARL (White House Directives and Funded Research Data). http://www.arl.org/focus-areas/public-access-policies#.VoaV0I-cFzo
    •  Borgman, C. 2015. Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press
    •  Baker, Monya. 1,500 Scientists Lift the Lid on Reproducibility. www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
    •  Harris, Richard. (April 2017). Rigor Mortis: How Sloppy Science Creates Worthless Cures
    •  California Digital Library DMPTool: https://dmptool.org/
    •  Chronopolis: http://www.digitalpreservation.gov/partners/chronopolis.html
    •  Data Reproducibility Crisis. Nature. http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
    •  Dataverse. http://thedata.org/
    •  Dataverse (Data Science Site). http://datascience.iq.harvard.edu/dataverse
    •  Data Information Literacy Guide. http://www.datainfolit.org/dilguide/
    •  Data Information Literacy Competencies (Purdue). http://blogs.lib.purdue.edu/dil/the-twelve-dil-competencies/
    •  DPN (Digital Preservation Network). http://www.dpn.org/
    •  Duracloud: http://www.duracloud.org/
    •  Force11. Data Citation Principles. https://www.force11.org/group/joint-declaration-data-citation-principles-final
    •  Purr (Purdue Institutional Data Repository). https://purr.purdue.edu/
    •  Hubzero. https://hubzero.org/

  • Further Links/References
    •  Figshare. http://figshare.com/
    •  ICPSR Data Management & Curation. http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/
    •  Research Data Management: Principles, Practices, and Prospects (November 2013). Council on Library and Information Resources. http://www.clir.org/pubs/reports/pub160
    •  Cox, A. and Pinfield, S. Research Data Management and Libraries. Journal of Librarianship and Information Science. June 2013.
    •  Fearon, D. & Sallans, A. C. (January 2014). Institutional Research Data Management: Policies, Planning, Services and Surveys. Coalition for Networked Information. https://www.youtube.com/watch?v=rvbrW7S2fes (video presentation)
    •  Data Management for Libraries (LITA Guide). http://www.alastore.ala.org/detail.aspx?ID=10737
    •  NMC Horizon Report: 2014 Library Edition. http://cdn.nmc.org/media/2014-nmc-horizon-report-library-EN.pdf
    •  “Research Data Management”. pp. 6-7 and pp. 24-45.
    •  Holdren, J. Memorandum for Heads of Executive Departments and Agencies: Increasing Access to the Results of Federally Funded Research (2013). http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
    •  Green, A., Macdonald, S. and Rice, R. Policy-making for Research Data in Repositories: A Guide. DISC-UK. http://www.disc-uk.org/docs/guide.pdf
    •  Research Data Management in the Arts and Humanities (2013). University of Oxford. http://www.dcc.ac.uk/events/research-data-management-forum-rdmf/rdmf10-research-data-management-arts-and-humanities (Conference Presentations)
    •  Texas Data Repository. TDR Final Report (October, 2016), Selection Process, Aug. 2015, Peace Williamson et al. UT Arlington, Data Competencies. TDL Texas Data Repository Presentation. Video., Kristy Park, Santi Thompson et al (October, 2016)
    •  Uzwyshyn, R. 2016. Research Data Repositories: The What, When, Why and How of Data Research Repositories. Computers in Libraries.

  • Comments/Questions

    Contact Information:

    Ray Uzwyshyn, Ph.D. MBA MLIS, Director, Collections and Digital Services, Texas State University Libraries, ruzwyshyn@txstate.edu, (512) 245-5687

  • Academic Research Libraries Environmental Scan

    Online Data Research Repositories (CNI)

    Fearon, D. & Sallans, A. C. (January 2014). Institutional Research Data Management: Policies, Planning, Services and Surveys. Coalition for Networked Information. https://www.youtube.com/watch?v=rvbrW7S2fes (54 ARL Libraries currently offer data management services)

  • Dataverse Network Architecture

    Why the Dataverse Network? (silent video overview)

    Open Journal Systems Dataverse Integration

    Research Study Data: Data Set Files, Metadata (data describing the data), Paratextual Research Material (Methodology, Field Notes etc.), Graph Data Files

  • PURR and Hubzero: Purdue’s Data Management System

    •  Purr: Purdue University Research Repository (video)

    •  Purr Site (Proprietary to University)

    •  Purr Background

    1) Create Data Management Plans
    2) Collaborate with other Researchers
    3) Publish Data Sets (Purdue can publish a DOI, Digital Object Identifier, for Data Sets; useful for citation)
    4) Archive Data Sets

    Boilerplate text for data management proposals available

    Purr is part of the Hubzero platform for scientific collaboration (originally Nanohub)

  • Hubzero: Open Source Platform for Scientific Collaboration

    •  https://hubzero.org/
    •  Getting Started, Downloadable and Hosted Options
    •  Hubzero Video, Hubzero 2

    Research Collaboration and Data Management Solution

    Research Data Types: Spreadsheets, Instrument or Sensor Readings, Software Source Code, Surveys, Interview Transcripts, Images and Audiovisual Files

  • Figshare/Cloud Based/Proprietary

    Repository where users make their research available in a citable, shareable and discoverable manner

    Figures, datasets, media, papers, posters, presentations and filesets can be disseminated in a way that the current scholarly publishing model does not allow

    Open Source Platform for Sharing Research

    Figshare (video)

    Figshare for Institutions (Video)

  • Figshare Features (Cloud Based/Proprietary)

  • https://www.force11.org/group/joint-declaration-data-citation-principles-final

    Data Citation Principles

  • Texas Data Repository: Texas Digital Library Initiative, 2014-2016

    TDL: Consortium of 22 universities across Texas leveraging technological cooperation among academic libraries

  • Institutional Repository (MIT, D-Space)

    https://digital.library.txstate.edu/

    Faculty publications, white papers, preprints, theses, dissertations, working projects, reports, grey literature

    Larger Idea: Grant Compliance, Enabling Faculty Research Online, Raising Research Visibility

  • Pilot Study Responses: Perceived Benefits of Data Repository

    •  Fulfill federal mandates for sharing publications and research data

    •  Make research data more widely available
    •  Statistics on downloads and citations of my data
    •  Make my data citable through the assignment of a DOI (digital object identifier)

    •  Saving various versions of the dataset (data lifecycle)
    •  Collecting all my data in one place

  • Collaboration Across Institutions

    Jones et al. (2008). Science 322: 1259-1262.

  • Data Sharing

    Currently, 80% of researchers do not share their data

    Andreoli-Versbach, P., Mueller-Langer, F. (November 2014). Open access to data: An ideal professed but not practiced. Research Policy. http://dx.doi.org/10.1016/j.respol.2014.04.008

  • http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970 Harris, Richard. (April 2017). Rigor Mortis: How Sloppy Science Creates Worthless Cures

    Research Data Reproducibility Crisis (Nature. 2016)

  • Hubzero/Purr Customization