• Towards a world where it is easier to publish, link, search, and reuse data of all kinds
• Advancing discovery by enabling open sharing of data
• Increase collaboration within/across fields
• National effort to bring together R&D efforts surrounding data
• Interoperability and extensibility of data cyberinfrastructure efforts
• In collaboration with the RDA implementing standards and protocols
• preserve data for longer periods of time
• find data
• share data collections over the web
• set data access permissions
• publish data, cite data, get credit for your data
• efficiently and reliably move large amounts of data and/or large numbers of files
• run custom tools/software on data, near the data
• run compute intensive analysis on data
• run analysis on large collections of data
• run analysis across any available resources
• utilize an extensible suite of reusable data analysis/manipulation tools
• preserve, share, and find software/tools associated with the data
• publish software, cite software, get credit for your software
• create and share workflows over data
• ask novel questions spanning all available data
EASILY …
Enabling Discovery with Open Data
• Scientists have pinned down a molecular process in the brain that helps to trigger schizophrenia
• Analyzed data from about 29,000 schizophrenia cases, 36,000 controls and 700 postmortem brains. The information was drawn from dozens of studies performed in 22 countries.
• The authors stressed that their findings, which combine basic science with large-scale analysis of genetic studies, depended on an unusual level of cooperation among experts in genetics, molecular biology, developmental neurobiology and immunology.
• "This could not have been done five years ago".
https://www.washingtonpost.com/news/speaking-of-science/wp/2016/01/27/scientists-open-the-black-box-of-schizophrenia-with-dramatic-genetic-finding/
The National Data Service Consortium
• Collaboration of many kinds of institutions
    • Computational and data service centers
        • ANL, NCSA, PSC, SDSC, TACC
    • Universities and library repositories
        • CU Boulder, Harvard, Indiana, Purdue, UIUC
    • Community efforts
        • LSST, LIGO, ICPSR, ...
    • Cyberinfrastructure efforts
        • DataONE, iPlant, iRODS, Globus, SciServer, SEAD, ...
    • Publishers
        • Nature, Science, APS, IEEE, PLOS, Elsevier, …
• Consortium to guide the building/governance of services
    • Bi-annual workshops
    • NDS Consortium Steering Committee
        • Coordinate separately funded efforts to build NDS components
        • Ensure interoperability, integrate existing tools and resources
    • Technical Advisory Committee
        • technical guidance to development of integrative services
NSF DataNets, DIBBs
• Bill Michener, $21,194,548, 2009-2014
• Golam Choudhury, $10,085,120, 2009-2014
• Reagan Moore, $8,300,992, 2011-2016
• Kenton McHenry, $10,519,716, 2013-2018
• Steven Ruggles, $7,993,266, 2011-2016
• Margaret Hedstrom, $8,000,000, 2011-2016
• Alex Szalay, $7,603,723, 2013-2018: Long Term Access to Large Scientific Data Sets: The SkyServer and Beyond
• Michael Levine, $4,902,601, 2013-2018: The Data Exacell
• Xiaohui Carol Song, $3,409,029, 2013-2018: Integrating Geospatial Capabilities into HUBzero
• Geoffrey Fox, $5,000,000, 2014-2019: Middleware and High Performance Analytics Libraries for Scalable Data Science
• Ken Koedinger, $4,830,819, 2014-2019: Building a Scalable Infrastructure for Data-Driven Discovery and Innovation in Education
89 regional resources
Break Down Science Drivers
• Data Collections
    – Un-digitized
    – Offline
    – Restricted
    – Unstructured
    – Un-curated
    – Simulation outputs
    – Representations
    – Diverse data
    – Cross-disciplinary collections
    – Large datasets
    – Many small datasets
• Data Handling Needs
    – Data hosting
    – Access control
    – Data collaboration
    – Data curation tools
    – Data analysis tools
    – Data visualizations
    – Long-term archiving
    – Support for community tools
    – High performance computation
    – Cloud computation
Break Down of Technology Components
• Infrastructure & Resources
    – Data
    – Networking
    – Storage
    – Security
    – High performance computing
    – Cloud computing resources
• Data Service Capabilities
    – Authentication
        • Identity management
        • Access control
    – Transfer
        • Large transfers
        • Robust transfer
    – Storage
        • Storage abstraction
        • Sharing/collaboration
        • Archiving
        • Replication
        • Data publishing
        • Code publishing
    – Curation
        • Ingestion
        • Description
        • Provenance
        • Transformation
        • Integration
    – Analysis
        • Data analysis
        • Event notification
    – Exploration
        • Federated search
        • API
        • User interface
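The Data Service Capabilities breakdown above is a two-level taxonomy, so it could be captured as a simple nested mapping. The following is a minimal sketch only: the names `CAPABILITIES` and `find_category` are hypothetical and chosen for illustration, not part of any NDS software.

```python
from typing import Dict, List, Optional

# Hypothetical encoding of the capability breakdown listed on the slide:
# each capability category maps to the specific capabilities under it.
CAPABILITIES: Dict[str, List[str]] = {
    "Authentication": ["Identity management", "Access control"],
    "Transfer": ["Large transfers", "Robust transfer"],
    "Storage": [
        "Storage abstraction",
        "Sharing/collaboration",
        "Archiving",
        "Replication",
        "Data publishing",
        "Code publishing",
    ],
    "Curation": [
        "Ingestion",
        "Description",
        "Provenance",
        "Transformation",
        "Integration",
    ],
    "Analysis": ["Data analysis", "Event notification"],
    "Exploration": ["Federated search", "API", "User interface"],
}


def find_category(capability: str) -> Optional[str]:
    """Return the category a capability belongs to (case-insensitive), or None."""
    wanted = capability.casefold()
    for category, items in CAPABILITIES.items():
        if any(item.casefold() == wanted for item in items):
            return category
    return None
```

A lookup such as `find_category("provenance")` walks the mapping and reports the category the slide places that capability under.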
U.S. NDS Components
…
Tools & Services
• Contained broadly reusable components addressing some specific data need
    – Data Transfer Service
    – Identity Service
    – Data Storage Service
    – HPC/Cloud Computation Service
    – Data Sharing/Publishing Services
    – Data Archiving Service
    – Data Transformation Service
    – Code/Tool Publishing Service
• Interoperability with other such components
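One way to picture self-contained components that interoperate is a shared request interface plus a registry through which one service calls another. The sketch below is an assumption made for illustration, not the NDS design: `DataService`, `ServiceRegistry`, `InMemoryStorageService`, and `UppercaseTransformService` are all hypothetical names, and the "transformation" is a toy upper-casing step.

```python
from typing import Dict, Protocol


class DataService(Protocol):
    """Hypothetical common interface every component exposes."""

    name: str

    def handle(self, request: dict) -> dict: ...


class ServiceRegistry:
    """Lets components find and call each other by name."""

    def __init__(self) -> None:
        self._services: Dict[str, DataService] = {}

    def register(self, service: DataService) -> None:
        if service.name in self._services:
            raise ValueError(f"service already registered: {service.name!r}")
        self._services[service.name] = service

    def call(self, name: str, request: dict) -> dict:
        return self._services[name].handle(request)


class InMemoryStorageService:
    """Toy stand-in for a data storage service."""

    name = "storage"

    def __init__(self) -> None:
        self._objects: Dict[str, str] = {}

    def handle(self, request: dict) -> dict:
        op = request["op"]
        if op == "put":
            self._objects[request["key"]] = request["data"]
            return {"ok": True}
        if op == "get":
            return {"ok": True, "data": self._objects[request["key"]]}
        raise ValueError(f"unsupported op: {op!r}")


class UppercaseTransformService:
    """Toy stand-in for a data transformation service.

    It depends only on the registry, not on a concrete storage class,
    which is the interoperability point this sketch tries to show.
    """

    name = "transform"

    def __init__(self, registry: ServiceRegistry) -> None:
        self._registry = registry

    def handle(self, request: dict) -> dict:
        fetched = self._registry.call("storage", {"op": "get", "key": request["key"]})
        transformed = fetched["data"].upper()
        self._registry.call(
            "storage", {"op": "put", "key": request["out_key"], "data": transformed}
        )
        return {"ok": True, "key": request["out_key"]}
```

In this sketch, swapping the in-memory storage for another implementation only requires registering a different object under the name `"storage"`; the transformation component is unchanged.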
https://www.youtube.com/channel/UCWuPo7LDCzsqF3RaKzIQmHw