
e-Infrastructures for Science and Industry


Keynote at the 8th Int. Conference on Parallel Processing and Applied Mathematics, Wroclaw, Poland, Sept 2009


Page 1: e-Infrastructures for Science and Industry

e-Infrastructures

for Science and Industry

-Clusters, Grids, and Clouds –

Wolfgang Gentzsch, The DEISA Project and OGF

8th Int. Conference on Parallel Processing and Applied Mathematics

Wroclaw, Poland, Sep 13 – 16, 2009

Page 2: e-Infrastructures for Science and Industry


HPC Centers

• Service providers for the past 40 years
• For research, education, and industry
• Computing, storage, apps, data, services
• Very professional
• To end-users, they look (almost) like Clouds

Page 3: e-Infrastructures for Science and Industry


Grids

1998: The Grid: Blueprint for a New Computing Infrastructure
Ian Foster, Carl Kesselman

2002: The Anatomy of the Grid
Ian Foster, Carl Kesselman, Steve Tuecke

Page 4: e-Infrastructures for Science and Industry


Grids (Sun in 2001)

Departmental Grids → Enterprise Grids → Global Grids → Clouds

Page 5: e-Infrastructures for Science and Industry


Public Clouds

• Outside the corporate data center
• Access over the Internet
• Virtual (VMware, Xen, ...)
• Abstraction of the hardware
• Service oriented: SaaS, PaaS, IaaS, HaaS
• Variable cost of services (QoS)
• Pay-per-use IT services
• Scaling up/down

Clouds, in short: IaaS, PaaS, SaaS; access; elasticity; abstraction; public, private, hybrid; CapEx => OpEx; pay-per-use; scaling ... and we have all the components available today (a sketch follows below).
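The pay-per-use and scale-up/down bullets are easiest to make concrete through an IaaS API. A minimal sketch, assuming the boto Python library for EC2; the AMI ID is a placeholder and credentials are read from the environment, so this is an illustration of the pattern, not a provider-endorsed workflow:

```python
# Elasticity sketch with boto (EC2 API, circa 2009).
# The AMI ID below is a placeholder; AWS keys come from the environment.
import boto

conn = boto.connect_ec2()

# Scale up: rent four small instances only when the workload needs them.
reservation = conn.run_instances(
    'ami-00000000',            # hypothetical machine image
    min_count=4, max_count=4,
    instance_type='m1.small',
)

# ... run the jobs ...

# Scale down: stop paying as soon as the work is done (CapEx => OpEx).
for instance in reservation.instances:
    instance.terminate()
```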

Page 6: e-Infrastructures for Science and Industry


Benefits of moving HPC to Grids

• Closer collaboration with your colleagues (VCs)

• More resources allow faster/more processing

• Different architectures serve more users

• Failover: move jobs to another system

. . . and Clouds

• No upfront cost for additional resources

• CapEx => OpEx, pay-per-use

• Elasticity, scaling up and down

• Hybrid solution (private and public cloud)

Page 7: e-Infrastructures for Science and Industry


The Cloud of Cloud Companies

• Akamai

• Areti Internet

• Enki

• Fortress ITX

• Joyent

• Layered Technologies

• Rackspace

• Terremark

• Xcalibre

• Manjrasoft / Aneka

• GridwiseTech / Momentum

• NICE/EnginFrame

• Amazon

• Google

• Sun

• Salesforce

• Microsoft

• IBM

• Oracle

• EMC

• Cloudera

• Cloudsoft

Page 8: e-Infrastructures for Science and Industry


NICE EnginFrame

Cluster/Grid/Cloud Portal: remote, interactive, transparent, secure access to apps & data on the corporate Intranet or Internet, or in the Cloud.

[Architecture diagram: Intranet clients (Windows, Linux, UX, Mac) connect over standard protocols to a Cloud Portal / Gateway shared by users and administrators; behind it sit interactive and batch applications on virtualized data-center clusters, virtualized storage, and licenses.]

Page 9: e-Infrastructures for Science and Industry


A Scalable Data Cloud Infrastructure

Example: GridwiseTech Momentum

Page 10: e-Infrastructures for Science and Industry


ANEKA Cloud Platform

[Architecture diagram of the Aneka stack, bottom to top:]

• IaaS: a private cloud on the LAN (data-center virtual machines running Windows, Mac with Mono, or Linux with Mono), optionally extended with public clouds (Amazon, Microsoft, Google, Sun).
• PaaS, Core Cloud Services: SLA management, QoS negotiation, admission control, job scheduling, execution management, VM deployment and VM management, pricing, billing, metering, monitoring, data storage.
• PaaS, Cloud Programming Models & SDK: Task Model, Thread Model, MapReduce Model, Workflow Model, and third-party models.
• SaaS: cloud applications (social computing, enterprise, ISV, scientific, CDNs, ...).

Courtesy: Manjrasoft
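Aneka's Task and Thread models farm independent work units out to whatever workers the platform has provisioned. A stand-in sketch of that pattern using Python's standard thread pool (the Aneka SDK itself is not shown in the deck); render_frame is a hypothetical work unit:

```python
# Task-model sketch: independent work units dispatched to a worker pool,
# as in Aneka's Task/Thread programming models (stand-in, not the Aneka SDK).
from concurrent.futures import ThreadPoolExecutor

def render_frame(frame_id):
    """Hypothetical independent task; no communication between tasks."""
    return f"frame-{frame_id}: done"

with ThreadPoolExecutor(max_workers=8) as pool:
    # Submit many independent tasks; the platform schedules them on
    # whatever workers (local VMs or rented public-cloud nodes) exist.
    results = list(pool.map(render_frame, range(100)))

print(results[:3])
```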

Page 11: e-Infrastructures for Science and Industry


Page 12: e-Infrastructures for Science and Industry


Courtesy: Werner Vogels

Page 13: e-Infrastructures for Science and Industry


Animoto EC2 image usage

[Chart: Animoto's EC2 instance count grows from near 0 on Day 1 to about 4,000 by Day 8.]
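The Animoto curve is the canonical elasticity story: capacity follows demand instead of being sized for the peak. A minimal control-loop sketch; queue_depth and set_instance_count are hypothetical hooks into a job queue and an IaaS provisioning API:

```python
# Elasticity sketch: keep the rented instance count proportional to pending
# work. queue_depth() and set_instance_count() are hypothetical hooks into a
# job queue and an IaaS provisioning API.
import math
import time

JOBS_PER_INSTANCE = 50          # assumed per-instance throughput
MIN_INSTANCES, MAX_INSTANCES = 2, 4000

def autoscale(queue_depth, set_instance_count):
    while True:
        pending = queue_depth()
        target = math.ceil(pending / JOBS_PER_INSTANCE)
        target = max(MIN_INSTANCES, min(MAX_INSTANCES, target))
        set_instance_count(target)  # scale up or down; pay only for 'target'
        time.sleep(60)              # re-evaluate every minute
```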

Page 14: e-Infrastructures for Science and Industry

'My' current project:

DEISA: Grid or Cloud?
Distributed European Infrastructure for Supercomputing Applications

Ecosystem for HPC Grand-Challenge Applications

Page 15: e-Infrastructures for Science and Industry


DEISA1: May 1st, 2004 – April 30th, 2008

DEISA HPC Centers

DEISA2: May 1st, 2008 – April 30th, 2011

Page 16: e-Infrastructures for Science and Industry


DEISA UNICORE Infrastructure

[Architecture diagram: every partner site (BSC, CINECA, CSC, ECMWF, FZJ, HLRS, HPCX, IDRIS, LRZ, RZG, SARA) runs a UNICORE Gateway in front of an NJS with its IDB and UUDB, fronting the local system: IBM P5 at CINECA and ECMWF, IBM P6 at IDRIS, IBM systems at FZJ, RZG, and SARA, IBM PPC at BSC, Cray XT4/5 at CSC, Cray XT4 at HPCX, SGI Altix at LRZ, NEC SX8 at HLRS. Local OS/batch pairs include AIX with LoadLeveler(-MC), Linux with LoadLeveler, PBS Pro, or Maui/Slurm, UNICOS/lc with PBS Pro, and Super-UX with NQS II; GridFTP links the sites. Jobs from, e.g., a CINECA or LRZ user enter through any gateway and run anywhere in the infrastructure.]
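GridFTP is the data-transfer layer in this diagram. A minimal sketch that drives the standard globus-url-copy client from Python; the host names and paths are placeholders:

```python
# Site-to-site transfer sketch over GridFTP, wrapping the standard
# globus-url-copy client. Host names and paths are placeholders.
import subprocess

src = "gsiftp://gridftp.site-a.example.org/scratch/run42/output.dat"
dst = "gsiftp://gridftp.site-b.example.org/scratch/run42/output.dat"

# -p 4: four parallel TCP streams, the usual way to fill a WAN pipe.
subprocess.run(["globus-url-copy", "-p", "4", src, dst], check=True)
```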

Page 17: e-Infrastructures for Science and Industry


Categories of DEISA services

[Diagram: three service categories, Applications, Operations, and Technologies, exchanging requests (support, configuration, development) and offers (service, products, technology).]

Page 18: e-Infrastructures for Science and Industry


DEISA Service Layers

[Diagram: across the DEISA sites, four service layers:]

• Network and AAA layers: unified AAA, network connectivity
• Data management layer: data transfer tools, data staging tools, WAN shared file system
• Job management and monitoring layer: job rerouting, single monitoring system, co-reservation and co-allocation, workflow management
• Presentation layer: multiple ways to access, common production environment

Page 19: e-Infrastructures for Science and Industry


DEISA Global File System

Global transparent file system based on the Multi-Cluster General Parallel File System (MC-GPFS) of IBM.

[Diagram: the shared file system spans the partner platforms (IBM P5, IBM P5+/P6, IBM P6 & BlueGene/P, IBM PPC, Cray XT4 and XT4/5, SGI Altix, NEC SX8) and their local environments (AIX or AIX/Linux with LoadLeveler(-MC), Linux with PBS Pro or Maui/Slurm, UNICOS/lc with PBS Pro, Super-UX with NQS II), with GridFTP alongside.]

Page 20: e-Infrastructures for Science and Industry


User management in DEISA

• A dedicated LDAP-based distributed repository administers DEISA users

• Trusted LDAP servers are authorized to access each other (based on X.509 certificates), and encrypted communication is used to maintain confidentiality

Sites: BSC, CINECA, CSC, ECMWF, EPCC, FZJ, HLRS, IDRIS, LRZ, RZG, SARA
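A minimal sketch of what a lookup against such a repository could look like with the python-ldap library; the server name, bind DN, base DN, and attribute names are hypothetical, not DEISA's actual schema:

```python
# User lookup sketch against an LDAP repository over TLS (python-ldap).
# Server, DNs, and attribute names are hypothetical, not DEISA's schema.
import ldap

conn = ldap.initialize("ldaps://ldap.site-a.example.org")  # encrypted channel
conn.simple_bind_s("cn=replicator,dc=example,dc=org", "secret")

# Find one user's entry in the distributed user base.
results = conn.search_s(
    "ou=users,dc=example,dc=org",
    ldap.SCOPE_SUBTREE,
    "(uid=jdoe)",
    ["cn", "uidNumber", "loginShell"],
)
for dn, attrs in results:
    print(dn, attrs)
```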

Page 21: e-Infrastructures for Science and Industry


DEISA: Grid or Cloud?

• Built on top of the proven, professional infrastructure of HPC centers with expertise in implementation, operation, and services.

• An ecosystem of resources, middleware, and applications that respects the administrative, cultural, and political autonomy of the partners.

• Globalizing existing HPC services, from local to global, according to user requirements: revolution by evolution.

• User support: user-friendly access to resources, porting user apps onto a turnkey architecture.

• After EU funding ends, the DEISA HPC ecosystem will operate in a sustainable way, in the interest of the 'global scientist', as ...

... almost an HPC Cloud!

Page 22: e-Infrastructures for Science and Industry


There are still many Challenges with Clouds

[Diagram: sustainable competitive advantage sits at the center of three challenge areas: CULTURAL, TECHNICAL, and LEGAL & REGULATORY.]

Page 23: e-Infrastructures for Science and Industry


Challenges, Potential Cloud Inhibitors

• Not all applications are cloud-ready or cloud-enabled
• Interoperability of clouds (standards?)
• Sensitive data, sensitive applications (medical patient records)
• Different organizations have different ROI
• Security: end-to-end, from your resources to the cloud!
• Current IT culture is not predisposed to sharing resources
• The "static" licensing model doesn't embrace the cloud
• Protection of intellectual property
• Legal issues (FDA, HIPAA)

Page 24: e-Infrastructures for Science and Industry


A Cloud Checklist for HPC
When is your HPC app ready for the Cloud?

✓ ... no issues with licenses, IP, secrecy, privacy, sensitive data and big data movement, legal or regulatory issues, trust, ...

✓ ... your app is architecture independent, not optimized for a specific architecture (single process, loosely coupled, low-level parallel, I/O-robust)

✓ ... it's just one app and zillions of parameters

✓ ... latency and bandwidth are not an issue

Ideally, your meta-scheduler knows your requirements and schedules automatically ☺ (a sketch of the checklist as a scheduling predicate follows below)
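The checklist reads naturally as a predicate a meta-scheduler could evaluate before dispatching a job to the cloud. A minimal sketch; the JobSpec fields are hypothetical stand-ins for whatever metadata a real scheduler records:

```python
# Cloud-readiness sketch: the slide's checklist as a scheduling predicate.
# The JobSpec fields are hypothetical stand-ins for real scheduler metadata.
from dataclasses import dataclass

@dataclass
class JobSpec:
    license_bound: bool        # node-locked licenses, IP/secrecy constraints
    sensitive_data: bool       # privacy, legal, or regulatory restrictions
    tightly_coupled: bool      # needs a low-latency interconnect
    data_movement_gb: float    # bulk data to ship in/out

def cloud_ready(job: JobSpec, max_transfer_gb: float = 100.0) -> bool:
    return (not job.license_bound
            and not job.sensitive_data
            and not job.tightly_coupled           # latency/bandwidth no issue
            and job.data_movement_gb <= max_transfer_gb)

# A parameter sweep with modest I/O passes; a tightly coupled solver does not.
print(cloud_ready(JobSpec(False, False, False, 20.0)))   # True
print(cloud_ready(JobSpec(False, False, True, 20.0)))    # False
```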

Page 25: e-Infrastructures for Science and Industry


Hybrid Grid/Cloud Resource Management

[Diagram: departments 1-4 expose department-level resource access to projects, teams, contractors, and individual users; campus-wide resource demand is balanced across departments, with external cloud resources added on demand.]

Define policies according to priorities, budget, and time (a sketch follows below).
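A minimal sketch of such a policy: decide per request whether work runs on department resources, the campus grid, or bursts to the external cloud. The pool names, priority scale, and thresholds are hypothetical:

```python
# Policy sketch for hybrid grid/cloud dispatch. Pool names, the priority
# scale, and the thresholds are hypothetical illustrations of the idea.
def dispatch(priority: int, budget_left: float, deadline_hours: float) -> str:
    """Return the resource pool a request should go to."""
    if priority >= 8:                      # top-priority work stays in-house
        return "department-cluster"
    if deadline_hours < 24 and budget_left > 0:
        return "external-cloud"            # burst out when time is short
    return "campus-grid"                   # default: shared campus resources

print(dispatch(priority=9, budget_left=500.0, deadline_hours=48))  # department-cluster
print(dispatch(priority=3, budget_left=500.0, deadline_hours=6))   # external-cloud
```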

Page 26: e-Infrastructures for Science and Industry


Ed Walker, Benchmarking Amazon EC2 for high-performance scientific computing, ;login:, October 2008.

Page 27: e-Infrastructures for Science and Industry


Ed Walker, Benchmarking Amazon EC2 for high-performance scientific computing, ;login:, October 2008.

Page 28: e-Infrastructures for Science and Industry


Ed Walker, Benchmarking Amazon EC2 for high-performance scientific computing, ;login:, October 2008.

Page 29: e-Infrastructures for Science and Industry


Ed Walker, Benchmarking Amazon EC2 for high-performance scientific computing, ;login:, October 2008.

Page 30: e-Infrastructures for Science and Industry


A Closer Look at HPC Load

• Single parallel job, CPU-intensive, tightly coupled, highly scalable (peta-, exascale, ...)
• Single parallel job, CPU-intensive, weakly scalable
• Capacity computing: throughput, parameter jobs
• Managing massive data sets, possibly geographically distributed
• Analysis and visualization of data sets

*) Similar to the analysis of T. Sterling and D. Stark, LSU, in HPCwire

Page 31: e-Infrastructures for Science and Industry


Clouds and supercomputers: Conventional wisdom?

[Diagram: loosely coupled applications on supercomputers: "too expensive"; tightly coupled applications on clouds/clusters: "too slow".]

Courtesy: Ian Foster

Page 32: e-Infrastructures for Science and Industry


Loosely coupled problems

• Ensemble runs to quantify climate model uncertainty
• Identify potential drug targets by screening a database of ligand structures against target proteins
• Study economic model sensitivity to parameters
• Analyze turbulence datasets from many perspectives
• Perform numerical optimization to determine optimal resource assignment in energy problems
• Mine collections of data from advanced light sources
• Construct databases of computed properties of chemical compounds
• Analyze data from the Large Hadron Collider
• Analyze log data from 100,000-node parallel computations

☺ All of these can run in the cloud (see the sketch below) ☺
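All of these workloads share one shape: many independent tasks with no inter-task communication. A minimal sketch of that pattern with a process pool; run_ensemble_member is a hypothetical stand-in for one simulation or analysis task:

```python
# Loosely coupled pattern sketch: many independent tasks, no communication.
# run_ensemble_member() is a hypothetical simulation entry point.
from multiprocessing import Pool

def run_ensemble_member(params):
    seed, perturbation = params
    # ... run one independent ensemble member / docking job / analysis ...
    return seed, perturbation * 2.0   # placeholder result

if __name__ == "__main__":
    sweep = [(seed, 0.01 * seed) for seed in range(1000)]
    with Pool() as pool:              # on a cloud: one task per rented core
        results = pool.map(run_ensemble_member, sweep)
    print(len(results), "members completed")
```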

Page 33: e-Infrastructures for Science and Industry


Thank You

[email protected]