Towards a world where it is easier to publish, link, search, and reuse data of all kinds

Page 1

• Towards a world where it is easier to publish, link, search, and reuse data of all kinds
• Advancing discovery by enabling open sharing of data
• Increase collaboration within/across fields
• National effort to bring together R&D efforts surrounding data
• Interoperability and extensibility of data cyberinfrastructure efforts
• In collaboration with the RDA implementing standards and protocols

Page 2

EASILY …

• preserve data for longer periods of time
• find data
• share data collections over the web
• set data access permissions
• publish data, cite data, get credit for your data
• efficiently and reliably move large amounts of data and/or large numbers of files
• run custom tools/software on data, near the data
• run compute-intensive analysis on data
• run analysis on large collections of data
• run analysis across any available resources
• utilize an extensible suite of reusable data analysis/manipulation tools
• preserve, share, and find software/tools associated with the data
• publish software, cite software, get credit for your software
• create and share workflows over data
• ask novel questions spanning all available data

Page 3

Enabling Discovery with Open Data

• Scientists have pinned down a molecular process in the brain that helps to trigger schizophrenia
• Analyzed data from about 29,000 schizophrenia cases, 36,000 controls and 700 postmortem brains. The information was drawn from dozens of studies performed in 22 countries.
• The authors stressed that their findings, which combine basic science with large-scale analysis of genetic studies, depended on an unusual level of cooperation among experts in genetics, molecular biology, developmental neurobiology and immunology.
• "This could not have been done five years ago".

https://www.washingtonpost.com/news/speaking-of-science/wp/2016/01/27/scientists-open-the-black-box-of-schizophrenia-with-dramatic-genetic-finding/

Page 4

The National Data Service Consortium

• Collaboration of many kinds of institutions
  • Computational and data service centers
    • ANL, NCSA, PSC, SDSC, TACC
  • Universities and library repositories
    • CU Boulder, Harvard, Indiana, Purdue, UIUC
  • Community efforts
    • LSST, LIGO, ICPSR, ...
  • Cyberinfrastructure efforts
    • DataONE, iPlant, iRODS, Globus, SciServer, SEAD, ...
  • Publishers
    • Nature, Science, APS, IEEE, PLOS, Elsevier, …
• Consortium to guide the building/governance of services
  • Bi-annual workshops
  • NDS Consortium Steering Committee
    • Coordinate separately funded efforts to build NDS components
    • Ensure interoperability, integrate existing tools and resources
  • Technical Advisory Committee
    • Technical guidance to development of integrative services

Page 5

NSF DataNets and DIBBs

• Bill Michener: $21,194,548, 2009-2014
• Golam Choudhury: $10,085,120, 2009-2014
• Reagan Moore: $8,300,992, 2011-2016
• Kenton McHenry: $10,519,716, 2013-2018
• Steven Ruggles: $7,993,266, 2011-2016
• Margaret Hedstrom: $8,000,000, 2011-2016
• Alex Szalay: $7,603,723, 2013-2018. Long Term Access to Large Scientific Data Sets: The SkyServer and Beyond
• Michael Levine: $4,902,601, 2013-2018. The Data Exacell
• Xiaohui Carol Song: $3,409,029, 2013-2018. Integrating Geospatial Capabilities into HUBzero
• Geoffrey Fox: $5,000,000, 2014-2019. Middleware and High Performance Analytics Libraries for Scalable Data Science
• Ken Koedinger: $4,830,819, 2014-2019. Building a Scalable Infrastructure for Data-Driven Discovery and Innovation in Education

Page 6

89 regional resources


Page 7

Break Down Science Drivers

• Data Collections
  – Un-digitized
  – Offline
  – Restricted
  – Unstructured
  – Un-curated
  – Simulation outputs
  – Representations
  – Diverse data
  – Cross-disciplinary collections
  – Large datasets
  – Many small datasets
• Data Handling Needs
  – Data hosting
  – Access control
  – Data collaboration
  – Data curation tools
  – Data analysis tools
  – Data visualizations
  – Long-term archiving
  – Support for community tools
  – High performance computation
  – Cloud computation

Page 8

Breakdown of Technology Components

• Infrastructure & Resources
  – Data
  – Networking
  – Storage
  – Security
  – High performance computing
  – Cloud computing resources
• Data Service Capabilities
  – Authentication
    • Identity management
    • Access control
  – Transfer
    • Large transfers
    • Robust transfer
  – Storage
    • Storage abstraction
    • Sharing/collaboration
    • Archiving
    • Replication
    • Data publishing
    • Code publishing
  – Curation
    • Ingestion
    • Description
    • Provenance
    • Transformation
    • Integration
  – Analysis
    • Data analysis
    • Event notification
  – Exploration
    • Federated search
    • API
    • User interface

Page 9

U.S. NDS Components

Page 12

Tools & Services

• Contained, broadly reusable components addressing some specific data need
  – Data Transfer Service
  – Identity Service
  – Data Storage Service
  – HPC/Cloud Computation Service
  – Data Sharing/Publishing Services
  – Data Archiving Service
  – Data Transformation Service
  – Code/Tool Publishing Service
• Interoperability with other such components

Page 13

https://www.youtube.com/channel/UCWuPo7LDCzsqF3RaKzIQmHw
