NCSA – Evolution of an HPC Center: Infrastructure and Services for Scientific Analysis and Decision Support
Danny Powell, Executive Director, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign


Page 1: Danny Powell Executive Director National Center for Supercomputing Applications

University of Illinois at Urbana-Champaign

NCSA – Evolution of an HPC Center

Infrastructure and Services for Scientific Analysis and Decision Support

Danny Powell, Executive Director

National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

Page 2: Talk Outline

• About NCSA: who we are now
  – Basic numbers
  – Mission
  – Basic methods of operation

• Projects and Customers
  – Cyber-Infrastructure and Science Projects
  – Industry
  – Education
  – Government – Public Health

• Evolving into a successful HPC Center
  – How we changed over the years
  – User-service-centric focus
  – Your staff: it’s almost always about the people
  – Management: effective roles

Page 3: National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

• Applied Research Unit of University of Illinois
  – Origin: 1986 NSF-funded national supercomputing centers
  – Original Mission: Provide state-of-the-art computing and data capabilities to the nation’s scientists and engineers
  – Develop software tools and software systems needed to make full use of advanced computing and data systems (Mosaic, Apache Web Server, Telnet, D2K, MyProxy, numerous others…)

• NCSA by the Numbers
  – Approximately 275 staff (250 technical/professional staff)
  – Two facilities (NCSA Building, NPCF) (>220k sq. ft)

Page 4: Basic Facts about NCSA

• Computing/Data Resources
  – Blue Waters: 11+ petaflop (1+ PF sustained) computer (Cray)
    • Most powerful machine in NSF portfolio; NSF’s only Tier One machine
    • $350 million project ($200 million construction, $150 million operations)
  – Mid-range supercomputing systems: ~200 TF
  – Archival storage system: 500+ PB
  – Advanced visualization systems

• Types of projects
  – Local, national, and global scale
  – Individual tools to large CI frameworks
  – Point solutions to systemic improvements

• IP
  – Majority of work at NCSA is open source.
  – Can effectively deal with secure environments, proprietary codes, confidentiality

Page 5: It is All About Working with Others

• Funding
  – Federal agencies, industry, State of Illinois, foundations, international sources
  – Most projects are partnerships with others (88%)

• Leveraging skills/resources of others

• Goal to be viewed as the “Partner of Choice”

• IACAT (Institute for Advanced Computing Applications and Technologies)
  – Integrates applied research of NCSA with basic research teams of Universities

• International Program
  – 30+ institutions from 22+ countries
  – Faculty and student exchanges, joint projects, workshops, technology sharing

• Industrial Program
  – Nationally/internationally recognized for its level of functional interaction, technology transfer, student engagement
  – 23+ companies (Fortune 50/100/500, smaller technology companies)

Page 6: NCSA Bridges the Gap Between Basic Research & Commercialization

[Figure: Product Life Cycle]
• Phases: Phase 0 Concept/Vision; Phase 1 Feasibility; Phase 2 Design/Development; Phase 3 Prototyping; Phase 4 Production/Deployment
• Activities: Theoretical & Basic Research; Applied Prototyping & Development; Optimization & Robustification; Commercialization & Production (.com or .org)
• Other labels: Universities & Labs; Application; Private Industry; Economic Development

NCSA Bridges Basic Research and Commercialization with Application

Page 7: Mission: Enable Science/Engineering/Education

[Figure labels]
• USERS: High-end computer and data needs
• NCSA: Enables effective/efficient use of high-end computer and data resources in support of science and education
• Individual tools, system software, analytics, visualization, integrated SW systems, workflow, user support, training, effective resource utilization
• Results: scientific, decision support, inquiry

Page 8: Projects and Customers

CyberInfrastructure Development

A Collaboration/Partnership with a Broad Set of Communities

Page 9: Blue Waters

Page 10: Blue Waters Project: Input from Scientific Community

• D. Baker, University of Washington: Protein structure refinement and determination
• M. Campanelli, RIT: Computational relativity and gravitation
• D. Ceperley, UIUC: Quantum Monte Carlo molecular dynamics
• J. P. Draayer, LSU: Ab initio nuclear structure calculations
• P. Fussell, Boeing: Aircraft design optimization
• C. C. Goodrich: Space weather modeling
• M. Gordon, T. Windus, Iowa State University: Electronic structure of molecules
• S. Gottlieb, Indiana University: Lattice quantum chromodynamics
• V. Govindaraju: Image processing and feature extraction
• M. L. Klein, University of Pennsylvania: Biophysical and materials simulations
• J. B. Klemp et al., NCAR: Weather forecasting/hurricane modeling
• R. Luettich, University of North Carolina: Coastal circulation and storm surge modeling
• W. K. Liu, Northwestern University: Multiscale materials simulations
• M. Maxey, Brown University: Multiphase turbulent flow in channels
• S. McKee, University of Michigan: Analysis of ATLAS data
• M. L. Norman, UCSD: Simulations in astrophysics and cosmology
• J. P. Ostriker, Princeton University: Virtual universe
• J. P. Schaefer, LSST Corporation: Analysis of LSST datasets
• P. Spentzouris, Fermilab: Design of new accelerators
• W. M. Tang, Princeton University: Simulation of fine-scale plasma turbulence
• A. W. Thomas, D. Richards, Jefferson Lab: Lattice QCD for hadronic and nuclear physics
• J. Tromp, Caltech/Princeton: Global and regional seismic wave propagation
• P. R. Woodward, University of Minnesota: Astrophysical fluid dynamics

Page 11

[Figure: software and programming environment, by category]

• Languages: C (OpenACC), C++ (OpenACC), Fortran/CAF (OpenACC), Python, UPC
• Compilers: GNU, Cray Compiling Environment (CCE)
• Programming Models
  – Distributed Memory (Cray MPT): MPI, SHMEM
  – Shared Memory: OpenMP 3.0
  – PGAS & Global View: UPC (CCE), CAF (CCE)
• IO Libraries: HDF5, ADIOS, NetCDF
• Resource Manager: Adaptive/Other
• Tools: Modules (environment setup)
• Optimized Scientific Libraries: ScaLAPACK, BLAS (libgoto), LAPACK, Iterative Refinement Toolkit, Cray Adaptive FFTs (CRAFFT), FFTW, Cray PETSc (with CASK), Cray Trilinos (with CASK)
• Debugging Support Tools: Fast Track Debugger (CCE w/ DDT), Abnormal Termination Processing, STAT, Cray Comparative Debugger#
• Debuggers: Allinea DDT, lgdb
• Performance Analysis: Cray Performance Monitoring and Analysis Tool, PerfSuite, Tau
• Visualization: VisIt, Paraview, YT
• Prog. Env.: Eclipse, Traditional
• Data Transfer: GO, HPSS
• Operating system: Cray Linux Environment (CLE)/SUSE Linux
• Other labels: PAPI, RAIT, Charm++
• Legend: Cray developed; Under development; Licensed ISV SW; 3rd party packaging; NCSA supported; Cray added value to 3rd party

MWTCC - May 31, 2013

Page 12: Blue Waters

Designed to meet compute-intensive, memory-intensive, and data-intensive needs across a wide range of disciplines.

Page 13: XSEDE – National Compute and Data CyberInfrastructure

• Collaboration between multiple US CI centers with deep experience: a partnership led by NCSA

• PI: John Towns, NCSA/Univ of Illinois
  – Co-PIs:
    • Jay Boisseau, TACC/Univ of Texas Austin
    • Gregg Peterson, NICS/Univ of Tenn-Knoxville
    • Ralph Roskies, PSC/CMU
    • Nancy Wilkins-Diehr, SDSC/UC-San Diego

• Partners who complement these CI centers with expertise in science, engineering, technology and education
  – Univ of Virginia, Ohio Supercomputer Center, SURA, Cornell, Indiana Univ, Purdue, Univ of Chicago, Rice, Berkeley, NCAR, Shodor, Jülich Supercomputing Centre

Page 14: Advanced Information Systems: National Cyberinfrastructure

Hardware
• Computers
• Data sources
• Data stores
• Networks

Software
• Middleware
• Portals
• Grid-enabled
• Applications
• Visualization
• Data analysis
• Workflows

Page 15: CyberInfrastructure is also about the tools/systems that allow effective use

• Workflow
• Data management
• Software models/simulations
• Compute resources
• Software/Hardware optimization
• Visualization tools and resources
• Analytic tools
• Collaborative environments
• Resource sharing
• Publishing support tools

Page 16: Examples: Community Infrastructure Projects

• Earthquake Engineering
  – Consequence-based risk management for seismic events
• Environmental Observatories
  – Ocean Observatories, Coupled Human/Natural Systems, BioDiversity
• Atmospheric Modeling
  – Severe Weather Predictions, Regional Climate Modeling
• Astronomy
  – Very large data transport, processing, and analysis pipelines
• BioMedical Informatics
  – Multisource infectious disease surveillance and patient safety
• Humanities/Social Science Research
  – Digital libraries, Text/Image analysis, social networks
• Science Educational Support Systems
  – Teaching support and educational enhancement systems

Page 17: Projects and Customers

Industrial Partnerships

Page 18: Private Sector Program Partners – August 2012

Page 19: Industrial Interests in HPC

Imaginations unbound

• PDM (Product Development Management)
• CRM (Customer Relationship Management)
• ERP (Enterprise Resource Planning)
• SCM (Supply Chain Management)
• BENEFITS:
  – Reduced Time-to-Market
  – Improved Product Quality
  – Reduced Prototyping Costs
  – Re-use original data
  – Reduced Waste
  – Framework for Optimization
  – Global Collaboration

Courtesy of TranscenData.com

Page 20: Industrial Activities

• Cycle provision
  – Overflow: when need exceeds their internal capacity
  – Testing: new architectures before purchasing
  – Research: testing new methods prior to large investments
    • Scalability, algorithms, optimization, security, …
• Prototype tool/system development
• Training
• Peer discussions on a non-competitive basis
  – Stated as an important and unique reason for participating
• Industrial park participation
  – Partners: proximity to expertise and students
  – New company spinoffs

Page 21: Projects and Customers

Education

Page 22: Training

• Workshops
  – Train-the-trainer workshops
  – Targeted disciplinary/technology/techniques workshops
  – National conferences and other venues
• Training materials
  – XSEDE: https://www.xsede.org/training1
  – Blue Waters petascale undergraduate education program: http://www.shodor.org/petascale/
• Short courses
  – Virtual School of Computational Science and Engineering: petascale oriented (including big data)
  – http://www.vscse.org/
  – Collaboration among multiple universities

Page 23: Outreach

• Public awareness
  – Visualization of real scientific data in public venues
• Planetariums (digital domes): astronomy
  – Hubble 3-D
  – Cosmic Voyage
• Science and Technology Museums: weather, astronomy
  – Search for Life
  – Computational Tornado Science
  – Dynamic Earth
• TV and Film
  – “Tree of Life” - Academy Award nomination – Cinematography and visual effects
  – “Hunt for the Supertwister” - a public television (NOVA) special
  – “Monster of the Milky Way” - NOVA PBS television special
  – Others …

Page 24: Educational Technology: In support of the learning process

• Often, the technology used to support research is also valuable in supporting education
  – Digital informational resources
    • Books, references, lectures, photos, videos, audio
    • Virtual museums, artifacts
    • Data, experiments
  – Tools
    • Analysis, Inquiry, Applications, Visualization
    • Models and Simulations
  – Collaborative Environments
    • Virtual coordination, workflow spaces
    • Resource sharing: data, computation, visualization

Page 25: Projects and Customers

Government and Public Health Informatics

Page 26: Examples of Uses of HPC / Data Analytics

– Illinois State Police – analysis of historical data to help determine crime (and hence staffing) patterns

– Policy makers – hazard risk assessments and planning (and response)

– Public health officials – early warning on disease outbreaks, with informed options to manage

– National Archives – data tools for long term preservation and for public analysis of the data

– Economic Development – agricultural marketing enhancement and monitoring program

– Policy Decision Support - Urban Planners, Environmental Monitoring, Socio-Economic Modeling, Social Network analysis… many others

Page 27: Evolving into a successful HPC Center

How we have changed over time

User focus

Keeping your staff sharp – not complacent

Management

Page 28: Mission: Enable Science/Engineering/Education

[Figure labels]
• USERS: High-end computer and data needs
• NCSA: Enables effective/efficient use of high-end computer and data resources in support of science and education
• Individual tools, system software, analytics, visualization, integrated SW systems, workflow, user support, training, effective resource utilization
• Results: scientific, decision support, inquiry

Page 29: Traditional Function: System Support

• System Management
  – Resource and job scheduling
• Storage Management
  – On-line and near-line system and data administration
  – Information life cycle management
• Cyber-protection
• Networking provisioning and tuning
• System Monitoring
• System software upgrades and SW management
• Quality Assurance

BW Full Service Overview

Page 30: User Support Function: Basic and Beyond

• Requirement Analysis
• Service Request Management
• Application Services
  – Application analysis
  – Porting and tuning at scale
  – Bottleneck reduction
  – Client consulting
  – Application re-engineering
  – Library and tools creation and support
  – Third-party application support
• Visualization and Data Analysis
• Information provisioning
  – Documentation, notification, training, community
• Account/allocation management
• Quality Assurance

BW Full Service Overview

Page 31: Community Engagement Function: Relationship Building

• Partnership/Team Building
• Structured Requirement Analysis
• Workflow Systems
  – Business / operation rules
  – Collaborative environments
  – Intuitive user interfaces
  – Data storage, data management tools
  – Visualization and data analytics tools
• Community engagement
• Work Plan Management
• Participation in evaluation and planning
• Trust

BW Full Service Overview

Page 32: Staff Changes (estimated numbers)

Technical staff breakdown

Category                                              Current   Very Early Days
Technical system administration                          50            70
Applied R&D                                             100            40
User Support (from basic service to
  customized disciplinary support)                       50            20
Technical management (mid level to senior)               50            25
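
The staff estimates on this slide can be totaled and turned into percentage shares to make the shift in emphasis easier to see. The following is a minimal sketch, not part of the original talk. It assumes the first number in each row is the "Current" figure and the second is "Very Early Days" (the order in which the column headers appear), and the names `staff`, `totals`, and `shares` are hypothetical.

```python
# Staff estimates copied from the slide as (current, very_early_days).
# Assumption: the column order follows the header order in the slide text.
staff = {
    "Technical system administration": (50, 70),
    "Applied R&D": (100, 40),
    "User support": (50, 20),
    "Technical management": (50, 25),
}


def totals(table):
    """Return (current_total, early_total) summed over all categories."""
    current = sum(now for now, _ in table.values())
    early = sum(then for _, then in table.values())
    return current, early


def shares(table, column):
    """Return each category's percentage share of one column.

    column 0 is the current figure, column 1 is the very-early-days figure.
    """
    total = sum(row[column] for row in table.values())
    return {name: round(100 * row[column] / total, 1) for name, row in table.items()}


if __name__ == "__main__":
    current_total, early_total = totals(staff)
    print(f"Current total: {current_total}, very early days total: {early_total}")
    now_pct = shares(staff, 0)
    then_pct = shares(staff, 1)
    for name in staff:
        print(f"{name}: {then_pct[name]}% then, {now_pct[name]}% now")
```

Under that column reading, the current figures sum to 250, which agrees with the 250 technical/professional staff quoted under "NCSA by the Numbers".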

Page 33: And Finally: Organizational Management

• Hire and retain skilled staff
  – Continued professional development
  – Keep staff motivated and sharp
    • Proposals – competitions
    • Peer speaking engagements – personnel exchanges
    • Enable them to grow personally and professionally
  – Don’t micromanage: empower your staff to succeed, and let them

• The MONEY – Always the Money!!!
  – Core funding: work closely with your core funding sources
  – Variety of competitive grant funding
  – Help your funding agencies understand the value of HPC and CyberInfrastructure, and what it takes to be successful.
  – It’s not cheap, and the ROI will take time to show value, but without a long-term commitment from your core funding agency, it will be very, very difficult to accomplish.

Page 34: Questions?

STEM Smart Workshop • 10 April 2012 • Chicago, Illinois • http://iclcs.illinois.edu

Page 35: Imaginations unbound

Page 36: Building Integrated Application/Decision Support Systems – It’s an Iterative Process of Teamwork

[Figure labels, in source order]
• Requirements Analysis & Specification; Development & System Integration; Prototype or Production Cyberenvironments; Situation Analysis, Knowledge and Decision Support
• TeraGrid Working Groups; Advisory Committees; Industrial Partners; International Partners; Partners
• Portals & GUIs; Workflow Mgmt; S&E Applications; Data Mining & Analysis; Visualization; Webservices; Collaboratories; Middleware; Security
• Integrated Project Teams; User Representatives; Team Participation
• Technology Roadmaps; Application Roadmaps; Cyberarchitecture Working Group

Page 37: Science & Engineering Application Support

Science Team (ST) Requirements and Challenges Gathering

SEAS Staff and Points of Contact (PoC)

Page 38: Advanced Information Systems: Major New Data Sources

Instruments

New instruments, e.g., telescopes and detectors, are using advanced digital technologies to support increasingly detailed observations

Sensors, Surveys and Satellites

Sensor arrays, aerial surveys and satellite data will revolutionize our understanding of the environment

Computers

New high-end computers are producing massive amounts of data from ever more detailed computational models

Page 39: NDEMC - OVERVIEW

• $5M, 18-month Public-Private Partnership (PPP)
• 4 OEMs; 4 solution providers
• Phase 1: 8 manufacturing sector SMEs
• Advanced modeling, simulation & analysis (MS&A)
• Rationale:
  – MS&A adoption by OEMs is high and growing
  – SMMs’ use of advanced MS&A is suboptimal
  – ROI is definitely favorable
• Objectives:
  – Boost MS&A adoption at SMMs
  – Simplified access to advanced MS&A
  – Demonstrate a scalable business model

Page 40: Networks are Critical Infrastructure