Transcript
Page 1: Science Gateways Community Institute - Unidata · • Science gateways are Web and desktop interfaces to high performance computing clusters, computing clouds. • Science gateways

Award Number ACI-1547611

Science Gateways Community Institute

Suresh Marru Indiana University

Page 2:

About me

• Deputy Director for the Science Gateways Research Center • Nominated Member of the Apache Software Foundation • Co-PI on several NSF awards • Co-Instructor, Science Gateways Architectures Course • The Apache Software Foundation Vice President for Apache Airavata

Page 3:

Aims for next 30 mins

• Introduce Science Gateways

• How we got here (LEAD)

• What we are doing now (Apache Airavata)

• National infrastructure to assist with the “Paradigm Shift” (SGCI, XSEDE)

• Pondering: Actionable next steps?

Page 4:

What is a science gateway?

science gateway /sī′ əns gāt′ wā′/ n.

1. an online community space for science and engineering research and education.

2. a Web-based resource for accessing data, software, computing services, and equipment specific to the needs of a science or engineering discipline.

Page 5:

Computational Science and Engineering Challenges

• What is a cluster/cloud and how do I use it? • What clusters/clouds are available to me? • How do I use this particular machine? • How do I get my data on and off? • How did I get that result? • Where is that result? • Can you share that result with me?

Page 6:

Problems

• My applications are taking too long to run on my desktop.

• I know I should run my applications on a supercomputer, but it is too confusing, and the person who knows how to do that is too busy.

• You mean there are other supercomputers or computing clouds out there that I could use besides my team’s cluster?

• That supercomputer is really different from what I know how to use, so I’ll not bother.

• [Data input for my application on Supercomputer A is different from Supercomputer B. • Workflow problem: connect A to B]

Page 7:

Science Gateways Solve These Problems

Page 8:

Technology Adoption Choices

Page 9:

What is a Science Gateway?

• Science gateways are Web and desktop interfaces to high performance computing clusters, computing clouds.

• Science gateways encode expertise • Running specific scientific application • Running jobs on diverse, nonlocal machines • Moving data to and from world-wide resources

• Science gateways enable sharing of results • Science gateways make results recoverable and reproducible

Page 10:

Science Gateways Are So Popular that We Started a Center

And then: “Apache Airavata is Software that we developed to build science gateways”

Page 12:

On-Demand Grid Computing

Example: Adapting Weather Prediction to Observational Sources Using Dynamic Adaptivity

Streaming Observations

Storms Forming

Forecast Model

Data Mining

Page 13:

Dynamic Workflow in LEAD

[Workflow diagram with 15 numbered steps; recoverable labels:]

• Inputs: Terrain data files; Surface, Terrestrial data files; NAM, RUC, GFS data; Radar data (level II); Radar data (level III); Satellite data; Surface data, upper air mesonet data, wind profiler

• Components: Terrain Preprocessor; WRF Static Preprocessor; 3D Model Data Interpolator (Initial Boundary Conditions); 3D Model Data Interpolator (Lateral Boundary Conditions); 88D Radar Remapper; NIDS Radar Remapper; Satellite Data Remapper; ADAS; ARPS to WRF Data Interpolator; WRF (four instances); WRF to ARPS Data Interpolator; ARPS Plotting Program; IDV Bundle; ADAM (data mining: looking for storm signature); ARPS Ensemble Generator

• Execution rules: Run once per forecast region; Repeated periodically for new data; Triggered if a storm is detected; Visualization on user's request

• Legend: Static data, Initialization, Forecast, Visualization, Real time data, Analysis, Data Mining
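The four execution rules on this slide (run once per forecast region, repeated periodically for new data, triggered if a storm is detected, visualization on user's request) can be read as trigger classes attached to workflow steps. The following is a minimal sketch of that idea only, not LEAD's actual workflow engine: the `Trigger`, `Step`, and `steps_to_run` names are hypothetical, and the assignment of individual components to triggers is an assumption made for illustration, since the transcript does not preserve which step carried which rule.

```python
from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    """Execution rules named on the slide."""
    ONCE_PER_REGION = "run once per forecast region"
    ON_NEW_DATA = "repeated periodically for new data"
    ON_STORM = "triggered if a storm is detected"
    ON_REQUEST = "visualization on user's request"


@dataclass(frozen=True)
class Step:
    name: str
    trigger: Trigger


# ASSUMPTION: which component belongs to which trigger class is a guess
# for illustration; the slide transcript does not record the mapping.
STEPS = [
    Step("Terrain Preprocessor", Trigger.ONCE_PER_REGION),
    Step("ADAS", Trigger.ON_NEW_DATA),
    Step("ADAM storm-signature mining", Trigger.ON_NEW_DATA),
    Step("WRF forecast", Trigger.ON_STORM),
    Step("ARPS Plotting Program", Trigger.ON_REQUEST),
]


def steps_to_run(region_initialized, new_data, storm_detected, user_requested):
    """Return the names of the steps whose trigger condition currently holds."""
    active = set()
    if not region_initialized:
        active.add(Trigger.ONCE_PER_REGION)
    if new_data:
        active.add(Trigger.ON_NEW_DATA)
    if storm_detected:
        active.add(Trigger.ON_STORM)
    if user_requested:
        active.add(Trigger.ON_REQUEST)
    return [step.name for step in STEPS if step.trigger in active]


if __name__ == "__main__":
    # New observations arrived and the mining step flagged a storm.
    print(steps_to_run(True, True, True, False))
```

The point of the sketch is that the workflow is dynamic: the set of steps that execute depends on events (new data, a detected storm, a user request) rather than on a fixed schedule.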

Page 14:

Analyze & Predict

Research & Reproducibility

Education & Outreach

Discover & Visualize

Page 15:

LEAD Architecture

[Architecture diagram; recoverable labels, grouped where the grouping is clear from the slide:]

• User Interface: Desktop Applications (IDV, WRF Configuration GUI); LEAD Portal; Portlets (Visualization, Workflow, Education, Monitor, Control, Ontology, Query, Browse, Control)

• Crosscutting Services: Authorization, Authentication, Monitoring, Notification

• Service layers: Configuration and Execution Services; Data Services; Workflow Services; Catalog Services

• Service components: Workflow Monitor; MyLEAD; Workflow Engine/Factories; VO Catalog; THREDDS; Application Resource Broker (Scheduler); Host Environment; GPIR; Application Host; Execution Description; Application Description; Application & Configuration Services; Client Interface; RLS; OGSA-DAI; Geo-Reference GUI; Control Service; Query Service; Stream Service; Ontology Service; Decoder/Resolver Service; Transcoder Service/ESML

• Applications: WRF, ADaM, IDV, ADAS

• Observations: Streams, Static, Archived

• Resource Access Services: GRAM, Grid FTP, SSH, Scheduler, LDM, OPeNDAP, Generic Ingest Service

• Distributed Resources: Computation, Specialized Applications, Steerable Instruments, Storage, Data Bases

Page 16:

Excerpt from LEAD Final Report to NSF

• “……The stretch goal for LEAD was to begin ushering in this paradigm change. After 6 years, 415 papers and presentations, a mention of LEAD by Bill Gates at Supercomputing 2007, 9 PhD’s awarded, 14 Master’s degrees, and 1 Bachelor’s degree; ………..”

Page 17:

Generalizing LEAD and embracing “Community over code”

Page 18:

Towards a Generic SGW Hosted Platform

[Timeline figure; recoverable labels:]

• Eras: 2003 to 2010; 2010 to 2013; 2013 to 2018 and Beyond

• Labels: Web Portals; Workflows; Services; Command line Interfaces; Unified Repository; NMI Build & Test; Software a la carte; Generic Middleware; Desktop GUIs; DAG based Workflows; Packaged Software; Downloadable Software; Apache Airavata; Multi-Tenancy; NextGen Web Interfaces; Dynamic & Interactive Workflows; Centralized Operations; Hosted Software; Hosted Gateways-as-a-Service; Scripting Support; Reusable Portlets

Page 19:

Apache Airavata Architecture

Page 20:

A Build vs Buy vs Collaborate Story

Page 21:

SciGaP Key Mission

Scale the number of gateways without having to scale the FTEs needed to support them.

Page 22:

Improve sustainability by converging on a single set of hosted infrastructure services

Page 23:

Single Campus Cyberinfrastructure

Page 24:

What is the chemistry of hydrated calcium carbonate? • Bio-mineralization of skeletons and shells • Geological CO2 sequestration • Cleanup of contaminated environments

CaCO3·1H2O  CaCO3·12H2O

Lopez-Berganza et al., J. Phys. Chem. A (2015)

Page 25:

SEAGrid.org enabled workflow

1. CaCO3·xH2O initial guess

2. TINKER on the Stampede supercomputer: Monte Carlo molecular mechanics (minimize torsional energy in <20,000 steps)

3. DFTB+ on the Stampede supercomputer: approximate DFT-based method

4. Gaussian09 on the Comet supercomputer: ab initio quantum chemistry

5. Results: • 2-3 CaCO3 equilibrium structures • Thermochemistry (E, H, G, etc.) • Vibrational frequencies

6. x = x + 1, then repeat

Lopez-Berganza et al., J. Phys. Chem. A (2015)
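The loop on this slide (start from a CaCO3·xH2O guess, pass it through TINKER, DFTB+, and Gaussian09 in turn, then increment x) can be sketched as a simple staged pipeline. This is an illustration of the control flow only, under stated assumptions: the function names are hypothetical, the stages are stubs that record which code ran on which machine rather than calling the real programs, and nothing here reflects the actual SEAGrid or Apache Airavata API.

```python
from typing import Callable, Dict, List

Structure = Dict[str, object]
Stage = Callable[[Structure], Structure]


def make_stub_stage(code: str, resource: str) -> Stage:
    """Build a placeholder stage that only records where it would have run.

    A real stage would submit a job to the named resource and return the
    refined structure; that part is deliberately left out of this sketch.
    """
    def stage(structure: Structure) -> Structure:
        history = list(structure["history"]) + [f"{code}@{resource}"]
        return {**structure, "history": history}
    return stage


# Stage order and machine names are taken from the slide.
PIPELINE: List[Stage] = [
    make_stub_stage("TINKER", "Stampede"),
    make_stub_stage("DFTB+", "Stampede"),
    make_stub_stage("Gaussian09", "Comet"),
]


def run_hydration_series(max_waters: int, stages: List[Stage], start: int = 1) -> List[Structure]:
    """Run every stage for x = start..max_waters (the slide's x = x + 1 loop)."""
    results = []
    for x in range(start, max_waters + 1):
        structure: Structure = {"formula": f"CaCO3.{x}H2O", "history": []}
        for stage in stages:
            structure = stage(structure)
        results.append(structure)
    return results


if __name__ == "__main__":
    for result in run_hydration_series(3, PIPELINE):
        print(result["formula"], result["history"])
```

Keeping each stage as a plain callable means the same loop works whether a stage runs locally or is dispatched to a remote machine, which is the kind of detail a gateway is meant to hide from the scientist.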

Page 26:

In the beginning, we had no services

We paid science teams to help us develop them

Science Gateway Prototype | Discipline | Science Partner(s) | TeraGrid Liaison

Linked Environments for Atmospheric Discovery (LEAD) | Atmospheric | Droegemeier (OU) | Gannon (IU), Pennington (NCSA)

National Virtual Observatory (NVO) | Astronomy | Szalay (Johns Hopkins) | Williams (Caltech)

Network for Computational Nanotechnology (NCN) and “nanoHUB” | Nanotechnology | Lundstrom (PU) | Goasguen (PU)

Open Life Sciences Gateway | Biomedicine and Biology | Schneewind (UC), Osterman (Burnham/UCSD), DeLong (MIT), Dusko (INRA) | Stevens (UC/Argonne)

Biology and Biomedical Science Gateway | Biomedicine and Biology | Cunningham (Duke), Magnuson (UNC) | Reed (UNC), Blatecky (UNC)

Neutron Science Instrument Gateway | Physics | Cobb (ORNL) | Cobb (ORNL)

Grid Analysis Environment | High-Energy Physics | Newman (Caltech) | Bunn (Caltech)

Transportation System Decision Support | Homeland Security | Stephen Eubanks (LANL) | Beckman (Argonne)

Groundwater/Flood Modeling | Environmental | Wells (UT-Austin), Engel (ORNL) | Boisseau (TACC)

Science Grid [GriPhyN/iVDGL/Grid3] | Multiple | Pordes (FNAL), Huth (Harvard), Avery (UFlorida) | Foster (UC/Argonne), Kesselman (USC-ISI), Livny (UW)

Page 27:

Eventually we had a program

• And customers

• 2013, gateway users surpass command line users in XSEDE

• 2016, gateways now 77% of active XSEDE users

[Chart: XSEDE users; series: All users, Gateways]

Page 28:

Despite many successes, we observed challenges

Gateways are often funded as 3-year research projects

Developers typically: • work in isolation • must bridge to a variety of resources • need building blocks in order to focus on higher-level functionality • struggle to secure sustainable funding

[Cycle diagram: Early adopters, Publicity, Wider adoption, Funding ends, Scientists disillusioned, New project prototype]

Page 29:

We studied the problem. And studied it some more.

• 2009-2012 EAGER: focus groups

• 2012-2015 Concept. phase: more focus groups, survey with 5000 responses

• 2016 Software Institute!

Page 30:

10+ year road to the birth of an institute

• “discipline-specific CI capabilities” = science gateways

• First example of community groups using supercomputers without individual identification

“Despite the technological progress of grid technology and deployment, only a minority of the scientific, engineering, and education community use today’s national computing infrastructure. Our WIDE strategy addresses this situation by working directly with specific community leaders who are building discipline-specific cyberinfrastructure capabilities and resources for their communities.” (TeraGrid proposal, 2003)

Page 31:

5-year S2I2 Implementation phase award begins Aug, 2016


Page 32:

Science Gateways Community Institute

Designed to help the community build gateways more effectively

• Diverse expertise on demand

• Longer-term, hands-on support

• Student opportunities & educator resources

• Sharing experiences & knowledge as a community

• Software & visibility for gateways

Page 33:

Closing Thoughts

• Gateways translate resource-centric views into science-centric views.

• LEAD succeeded in ushering in a “paradigm shift” from a technical perspective, but fell short in preparing the community.

• LEAD is not currently functional as a virtual organization, but many of its constituent elements are. Any interest in resurrecting it in its entirety?

• Gateways are developed by the community for the community. In addition to the longer-term vision setting of this workshop, I would be thrilled to work with many of you to build “Gateways to Clouds” for research and education.

• You can get a “free” (paid by NSF) allocation of our time to help you in these efforts. We will help you build and operate gateways, but you are on your own to support the community.

Page 34:

More Information

• Science Gateways Research Center • https://sgrc.iu.edu/

• Apache Airavata Open Source Science Gateway Software • http://airavata.apache.org/

• Contact • Center email: [email protected] • Marlon Pierce: [email protected] • Suresh Marru: [email protected]

