Cyberinfrastructure & Major Facilities
Manish Parashar, Office Director
Office of Advanced Cyberinfrastructure
Directorate for Computer & Information Science & Engineering
National Science Foundation
NSF Large Facilities Workshop
Texas Advanced Computing Center, UT Austin, April 4, 2019
Outline
▪ About OAC and NSF Cyberinfrastructure
▪ Accelerating facility science with cyberinfrastructure
▪ Upcoming events
NSF Office of Advanced Cyberinfrastructure (OAC)
Directorate for Computer & Information Science & Engineering (CISE)
$224M FY 2018 research budget
950 proposals
305 awards
32% success rate
People, organizations, and communities
Data Infrastructure
Gateways, Hubs, and Services
Cloud Resources & Services
CI-Enabled Instrumentation
Computing Resources
R&E Networks, Security Layers
Coordination & User support
Software and Workflow Systems
Pilots, Testbeds
Source: https://dellweb.bfa.nsf.gov/starth.asp
Foster a cyberinfrastructure ecosystem to transform science and engineering research…
… through Research CI and CI research
“… an agile, integrated, robust, trustworthy and sustainable CI ecosystem that drives new thinking and transformative discoveries in all areas of S&E research and education”
Transforming Science Through Cyberinfrastructure
NSF’s Blueprint for a National Cyberinfrastructure Ecosystem for Science and Engineering in the 21st Century
Early user access in May 2019
Frontera will be:
• A leadership-class computational instrument with the broadest utility for all of S&E applications
• The largest CPU system on a US academic campus
• A national asset that complements other leadership-class computing investments in the US research ecosystem
Computation for the Endless Frontier
https://www.tacc.utexas.edu/systems/frontera
NSF Big Ideas
RESEARCH IDEAS
▪ Windows on the Universe: Multi-messenger Astrophysics
▪ Quantum Leap: Leading the Next Quantum Revolution
▪ Navigating the New Arctic
▪ Understanding the Rules of Life: Predicting Phenotype
▪ Harnessing Data for 21st Century Science and Engineering
▪ Work at the Human-Technology Frontier: Shaping the Future
PROCESS IDEAS
▪ Mid-scale Research Infrastructure
▪ Growing Convergence Research at NSF
▪ NSF 2026
▪ NSF INCLUDES: Enhancing STEM through Diversity and Inclusion
“… bold questions that will drive NSF's long-term research agenda -- questions that will ensure future generations continue to reap the benefits of fundamental S&E research.”
Cyberinfrastructure is a key enabler for NSF Big Ideas
Convergence Accelerators
Accelerating Discovery through Convergence Research
BIO, CISE, EHR, ENG, GEO, MPS, SBE
HDR Convergence Accelerator
FW-HTF Convergence Accelerator
Future accelerator(s)
Cyberinfrastructure is central to NSF’s Large Facilities
Research success depends on robust, reliable, and highly connective cyberinfrastructure
Large Facilities present new CI challenges … and opportunities
Challenges
▪ Rapidly changing (evolving) CI requirements, user communities
▪ Technology integration, capability evolution
▪ Integrated data lifecycle management
▪ Data provenance, data identifiers, reproducibility
▪ Workforce training, retention
▪ Cybersecurity
Opportunities
▪ On-demand data processing, analytics, data product generation
▪ Efficiencies, integration, interoperability across data/compute islands
▪ End-to-end science workflows
▪ Enhanced (intelligent) data delivery, open access, shared use, and beyond
2017 NSF Facilities CI Workshop http://www.facilitiesci.org/
Holistic view of CI for Facilities
Large Facility: Instruments; Data pipeline; Internal Compute, Data, Analysis Capabilities; Data Products
Shared National CI ecosystem: Cloud; Science Gateways; Workflows & Middleware; Compute; Storage; R&E Network
Within the facility: Certain CI elements must be specialized and conservatively managed.
At the boundary: Data management and networking, communication systems must evolve faster, in response to evolving usage mode, technology disruptions, etc.
Leverage a shared National CI ecosystem: computing, data, middleware, and networking resources and services which evolve rapidly with technology arc and research needs.
Workforce includes specialized CI personnel available at all times.
Cybersecurity systems and practices are critical to protecting the whole facility.
Enterprise best practices for operations, maintenance
Regard CI from strategic and long-term perspective
1. Facility science is complex and evolving.
➢ Realize full scientific potential through early and continuous consideration of CI.
2. User science workflow needs are evolving.
➢ Address the CI required for data transformations, integration, knowledge extraction.
3. CI design and operations are challenging.
➢ Share practices; leverage external capabilities, technologies and expertise.
Build the Facility CI Community
Develop the “missing middle” CI for facility science
Invest in CI conceptualization, execution, refresh
…to ensure facility science is revolutionized by – not limited by – the CI
PI: Patrick R Brady, UW-Milwaukee
Co-PIs: G Allen (UIUC), W Anderson (UWM), F Bianco (NYU), J Bloom (UC Berkeley), A Brazier (Cornell), P Couvares (Caltech), T DeYoung (MSU), D Fox (PSU), C Hanna (PSU), D Hogg (NYU), K Holley-Bockelman (Vanderbilt U.), A Howell (LCO/UCSB), D Kaplan (UWM), E Katsavounidis (MIT), Z Marka (Columbia U.), I Taboada (Georgia Tech)
Broader Impacts: Through open engagement of multiple research communities, the project will deliver a Community White Paper documenting the needs & opportunities and a Strategic Plan for an institute to address these needs. The project advances the objectives of the National Strategic Computing Initiative (NSCI) and two of the 10 Big Ideas for Future NSF Investments (“Harnessing the Data Revolution” and “Windows on the Universe”).
Goals: Identify key questions and cyberinfrastructure projects required to take full advantage of current facilities and imminent next-generation projects for Multi-messenger Astrophysics (MMA).
Intellectual Merit: Multi-Messenger Astrophysics (MMA) is an exciting new field of science that combines traditional astronomy with the new capabilities to measure gravitational waves, high-energy neutrino particles and other messengers that originate from celestial objects. The promise of MMA can be realized only if sufficient cyberinfrastructure is available to rapidly handle, combine, and analyze the very large-scale distributed data from all the types of astronomical measurements.
OAC-1841625
Community Planning for Scalable Cyberinfrastructure to support Multi-Messenger Astrophysics (SCiMMA)
http://scimma.org
Invest in CI conceptualization, execution, refresh
Investment partners: OAC + MPS/AST + MPS/PHY + MPS/OMA
High-Luminosity Large Hadron Collider (HL-LHC) upgrade
• Order-of-magnitude increase in data analysis complexity
• Order-of-magnitude increase in storage and compute cycles
• Solutions needed by HL-LHC: 2025/2026
Convergence: Physics, computer and data science, applied math, software engineering
Coordination: Multi-agency (NSF, DOE) + International
IRIS-HEP mission:
• Active center for software R&D
• Intellectual hub for community-wide software R&D
• Transform the operational services and computing model
Note: Complements the NSF MREFC for HL-LHC upgrade
CMS
ATLAS
IRIS-HEP: Institute for Research and Innovation in Software for High-Energy Physics
Award 1836650, OAC & PHY
PI: Peter Elmer (Princeton)
21 researchers, 17 institutions, 5 years, $25M
Invest in CI conceptualization, execution, refresh
Develop the “missing middle”
The “missing middle”: Data-Intensive Discovery Pathways
Data Sources: Disciplinary Specific Resources & Workflows; Instrument/Facility Portals & Data streams; Sources & Repositories
Data lifecycle: Discover; Publish, Share; Integrate; Collaborate; Reuse
Transformations & Knowledge Extraction:
▪ End-to-end Workflows (Assimilation, Integration, Modeling, Simulation, Analysis, Visualization)
▪ Realtime, Streaming, On-Demand
▪ Intelligent delivery, composition (AI/ML), …
Science Outcomes & Results Dissemination
New science drivers, users and usage modes
The “missing middle”: Data-Intensive Discovery Pathways
Data Sources: Disciplinary Specific Resources & Workflows; Instrument/Facility Portals & Data streams; Sources & Repositories
Data lifecycle: Discover; Publish, Share; Integrate; Collaborate; Reuse
Transdisciplinary Enabling CI:
▪ Middleware Infrastructure: Authentication, Access, Distributed Workflows
▪ Computing Resources, Data Infrastructure, Networking
Science Outcomes & Results Dissemination
Scalable CI to Accelerate Data-Driven Science & Engineering Research
• Support exemplars to accelerate discovery (≥ $1.5 million/project, up to 2 years)
✓Capitalize on NSF priority investments, Big Ideas, and NSF major facilities.
✓Enable new science pathways & enrich scientific value of community data.
• Projects must rapidly expand/scale and substantially augment science within award.
• Funded collaborative projects including 4 targeting facility CI advances (DKIST, IceCube, LHC, LIGO, NCAR, NEON, and future MMA).
FY 2018 NSF-18-076 OAC-CESER DCL
Virtual Data Set Services Enabling New Science at NSF Facilities
OAC-1841531
PI: Ian Foster, Co-PI: Steve Tuecke, U. Chicago, + science and facility collaborators
Challenge. Deliver data rapidly, efficiently, and reliably from NSF facilities to remote researchers.
Facilities in initial focus: DKIST, NCAR, NEON
Data Infrastructure for Open Science in Support of NSF Facilities
OAC-1841530
PI: F. Wuerthwein, SDSC. Co-PIs: D. Schultz, UW-M.; P. Couvares, CIT; Gardner, U Chicago
Challenge. Seamlessly share, integrate, and analyze data from many instruments for Multi-Messenger Astrophysics.
Facilities in initial focus: LIGO, IceCube
Develop the “missing middle”
Many instantiations…
Images from Report from the 2017 NSF Large Facilities Cyberinfrastructure Workshop, Appendix F. Facility-contributed white papers
…Opportunities to exchange practices, expertise, lessons learned…
Build the Facility CI Community
PI: Ewa Deelman, USC
Co-PIs: A. Mandal, RENCI/UNC-CH; J. Nabrzyski, ND; V. Pascucci and R. Ricci, U Utah
Desired Outcomes: Amplify scientific effectiveness of NSF Large Facilities. Benefit other projects via community and CI resources & services. Cultivate & enhance CI workforce and improve expertise.
Goal: Develop a model and plan for a CI Center of Excellence (CI CoE) for knowledge sharing and community building about CI models & best practices.
Award 1842042, OAC & BIO
http://cicoe-pilot.org
CI CoE Pilot engagement with NSF Large Facilities:
1. Engage with Large Facility
2. Learn
3. Provide expertise
4. Distill best practices
5. Disseminate
6. Foster a CI community
Evaluate approach and adjust engagement process
Approach and Activities: Build a community around CI for NSF Large Facilities. Provide a community forum to discuss technical, social, and project management issues; and foster interactions and sharing of technical experience. Create a community-curated portal to collect and disseminate information on robust CI software, services, and platforms. Define a structure to address workforce development, training, retention, career paths, and diversity of CI personnel. Engage with NEON to pilot the effort.
Pilot Study for a Cyberinfrastructure Center of Excellence
Build the Facility CI Community
Envisioning the Future of Facility Science and Cyberinfrastructure
This one-day workshop will bring together the facility and CI communities to
• Explore CI innovations and use cases that significantly enhance the scientific impact and productivity offered by NSF large facilities, and which point towards the future of transformative cyberinfrastructure-enabled large-scale science.
• Highlight new data delivery and usage modes, adoption of shared resources and analytical tools, and achieve higher levels of integration to plan forward for future science capabilities.
• Provide input to planning for the 2019 NSF Facility CI Workshop in September.
https://www.largefacilitiesworkshop.com/19workshop/
A special event following the Large Facilities Workshop
Friday, April 5, 2019 (Tomorrow), 8:30 AM – 3:00 PM, TACC/UT Austin
Build the Facility CI Community
Mark your calendars!
2019 NSF Facility Cyberinfrastructure Workshop
September 16-17, 2019
Alexandria, Virginia
More details to follow…
Upcoming OAC Webinars
April 18: Science Gateways (Nancy Wilkins-Diehr, SDSC)
May 16: Women in IT networking (Marla Meehl, UCAR)
June 20: Multiscale cardiovascular flow simulations (Alison Marsden, Stanford)
July 18: Astrophysics simulations (Manuela Campanelli, RIT)
August 15: Digital mapping of the Arctic (Paul Morin, U Minnesota)
Join the conversation
• OAC Webinar Series: 3rd Thursday @ 2PM ET
• OAC Newsletter
• Follow us on Twitter @NSF_CISE
Get involved
▪ Review proposals, serve on panels
▪ Visit NSF, get to know your programs and Program Officers
▪ Participate in NSF workshops and visioning activities
▪ Join NSF: serve as Program Officer, Division Director, or Science Advisor
Stay informed
• Join the OAC, CISE Mailing Lists
• Learn about NSF events, programs, webinars, etc.
• Send email to: [email protected]
Take-homes…
▪ Facility science is exciting, wide ranging, and highly impactful.
• Important to take a strategic and long-term perspective to ensure CI enables, not limits, facility science.
▪ CI is essential to facility operations and scientific mission … but is complex and evolves differently as compared to other facility elements.
• To realize the full scientific potential of facilities, early, fundamental, and continuous consideration of CI is important: concept => execution => refresh.
▪ Facilities are challenged to keep up with the state-of-the-art and maintain CI that is efficient, robust, secure, scalable, and sustainable.
• Build the facility CI community to share best practices for design and operations; and draw on existing resources and expertise.
▪ Collectively, facilities can transform science.
• Addressing the “missing middle” CI for science workflows, data transformations, integration, and knowledge extraction.
“Make no little plans; They have no magic to stir men's blood …”
Daniel H. Burnham, Architect and City Planner Extraordinaire, 1907.
“If you want to travel fast, travel alone; if you want to travel far, travel together”
African Proverb.
Manish Parashar
Office Director, Office of Advanced Cyberinfrastructure
Email: [email protected]
To subscribe to the OAC Announce Mailing List, send an email to: [email protected]