HP CCN Meeting, Seattle, November 12, 2005 CCN Grid Collaboration Lennart Johnsson University of Houston (CS and TLC 2 ) and Kungl. Tekniska Hogskolan (NADA and PDC)


HP CCN Meeting, Seattle, November 12, 2005

CCN Grid Collaboration

Lennart Johnsson
University of Houston (CS and TLC2)
and
Kungl. Tekniska Hogskolan (NADA and PDC)


CCN Grid Collaboration Objectives

• A SIG focused on exchange of experiences?

• A Collaboration focused on (rapidly) maturing and evolving Grid technologies?

• A Collaboration focused on demonstrating the utility of Grids?


CCN Grid

• What can CCN Grid collaboration provide that I do not already have?

• Is CCN Grid the best vehicle to get what I want and do not have?

• How can CCN Grid make “me” more competitive?


“My” Current Grid Activities
• CCN Grid
• GGF
• Globus Alliance
• VGrADS
• THEGrid
• TIGRE
• RENoH
• LCG - Alice
• EGEE
• NextGrid
• SweGrid
• Baltic Grid
• ICEAGE
• DEISA?
• OMII-Europe?


VGrADS is an NSF-funded Information Technology Research project

Keith Cooper, Ken Kennedy
Charles Koelbel, Richard Tapia, Linda Torczon
Rich Wolski, Fran Berman, Andrew Chien, Henri Casanova
Carl Kesselman
Lennart Johnsson
Dan Reed, Jack Dongarra

Plus many graduate students, postdocs, and technical staff!


The VGrADS Vision: Distributed Problem Solving
• Where We Want To Be
  – Transparent Grid computing
    • Submit job
    • Find & schedule resources
    • Execute efficiently
• Where We Are
  – Low-level hand programming
  – Programmer needs to manage
    • Heterogeneous resources
    • Computation and data movement scheduling
    • Fault tolerance and performance adaptation
• What Do We Need?
  – A more abstract view of the Grid
    • Each developer sees a scalable “virtual grid”
  – Simplified programming models built on the abstract view
    • Permit the application developer to focus on the problem

[Figure labels: Database, Supercomputer]


Virtual Grid Execution System (vgES)

[Figure: vgES architecture. Components: Application, vgES APIs, vgDL, vgFAB, vgMON, vgLAUNCH, DVCW, vgAgent, VG instances, Information Services, Resource Managers, Grid Resources. Flow labels: vgDL Description, Grid Resource Universe, Candidates, Successfully Bound, Virtual Grid.]

• A Virtual Grid (VG) takes
  – Shared heterogeneous resources
  – Scalable information service
• and provides
  – A hierarchy of application-defined aggregations (e.g. ClusterOf) with constraints (e.g. processor type) and rankings
• Virtual Grid Execution System (vgES) implements VG
  – VG Definition Language (vgDL)
  – VG Find And Bind (vgFAB)
  – VG Monitor (vgMON)
  – VG Application Launch (vgLAUNCH+DVCW)
  – VG Resource Info (vgAgent)
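
The find-and-bind idea (aggregate resources, apply constraints, rank the candidates) can be illustrated with a minimal Python sketch. This is not vgDL or the vgFAB API: the `Node` record, the `find_and_bind` function, the cluster names and the ranking function are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical resource record; real information services expose far more.
    name: str
    cluster: str
    processor: str
    clock_ghz: float


def find_and_bind(universe, processor, min_nodes, rank):
    """Return the best-ranked cluster that satisfies the constraints, or None."""
    # Aggregation: group nodes that meet the processor constraint by cluster.
    clusters = {}
    for node in universe:
        if node.processor == processor:
            clusters.setdefault(node.cluster, []).append(node)
    # Constraint: a candidate cluster must offer at least min_nodes nodes.
    candidates = [nodes for nodes in clusters.values() if len(nodes) >= min_nodes]
    if not candidates:
        return None
    # Ranking: choose the candidate with the highest application-defined score.
    return max(candidates, key=rank)


universe = [
    Node("a1", "A", "Xeon", 3.0), Node("a2", "A", "Xeon", 3.0),
    Node("b1", "B", "Itanium", 1.5), Node("b2", "B", "Itanium", 1.3),
    Node("c1", "C", "Itanium", 1.6),
    Node("d1", "D", "Itanium", 1.6), Node("d2", "D", "Itanium", 1.6),
]

# Rank by the slowest node, assuming it bounds a tightly coupled computation.
bound = find_and_bind(universe, "Itanium", 2,
                      rank=lambda nodes: min(n.clock_ghz for n in nodes))
bound_names = [n.name for n in bound]
print(bound_names)
```

Cluster C is dropped because it is too small, and D is preferred over B because its slowest node is faster; a different ranking function would give a different binding.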


VGrADS is studying a range of tools for grid programming tasks, including
• Scheduling of workflow computations
  – Off-line look-ahead scheduling dramatically improves makespan (total time)
  – Accurate performance models significantly affect quality of scheduling
  – Queue wait prediction allows scheduling into batch queues
• Fault tolerance
  – Diskless checkpointing for linear algebra computations (application-specific)
  – Temporal reasoning for fault prediction
  – Optimal checkpoint frequency for iterative applications

[Figure: Online vs. Offline - Heterogeneous Platform. Simulated makespan (0 to 45000) by scheduling strategy (Offline, Online) for compute factors CF=1, CF=10, CF=100.]

[Figure: Performance Models and Schedulers - Heterogeneous Platforms. Time (min, 0 to 1200) by performance model (None, Simple, Accurate) and scheduler (Heuristic, Random).]

[Figure: Diskless Checkpointing. Application processors P0 to P3 and parity processor P4, with P4 = P0 ⊕ P1 ⊕ P2 ⊕ P3.]
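
The diskless checkpointing figure on this slide shows four application processors and one parity processor. A minimal Python sketch of that idea follows, assuming the parity is a bytewise XOR of equal-sized processor states; the function names and the sample data are hypothetical and do not come from VGrADS software.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(states):
    """Checkpoint held in memory by the parity processor (P4 in the figure)."""
    return xor_blocks(states)


def recover(states, parity, failed):
    """Rebuild the state of one failed processor from the survivors and the parity."""
    survivors = [s for i, s in enumerate(states) if i != failed]
    return xor_blocks(survivors + [parity])


# Illustrative in-memory states of application processors P0..P3.
states = [b"P0-state", b"P1-state", b"P2-state", b"P3-state"]
parity = make_parity(states)

# Suppose P2 fails: its state is recovered without reading any disk.
recovered = recover(states, parity, failed=2)
print(recovered == states[2])  # True
```

A single parity block of this kind tolerates the loss of one processor per checkpoint group; surviving more simultaneous failures would need a stronger code.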


VGrADS Application Collaborations

• EMAN: Electron Micrograph Analysis
• GridSAT: Boolean Satisfiability
• LEAD: Atmospheric Science
• Montage: Astronomy

[Figure: workflow diagram. Labels: Data arrives, Start, End, Static Workflow, Dynamic Workflow, BPEL Workflow Engine, LDM Service, GridFTP Service, WRF Service, Ensemble Broker, Resource Broker, Visualization Service, Data Mining, Information Service, Rice Scheduler, vgES.]


THEGrid
• Several Universities
  – UT, UH, Rice, TTU, TAMU, UTA, UTB, UTEP, SMU, UTD, etc.
• Many different research facilities used
  – Fermi National Accelerator Laboratory
  – CERN, Switzerland, DESY, Germany, and KEK, Japan
  – Jefferson Lab
  – Brookhaven National Lab
  – SLAC, CA and Cornell
  – Natural sources and underground labs
• Sizable community, variety of experiments and needs
• Very large data sets now! Even larger ones coming!!


High Performance Computing Across Texas (HiPCAT) — http://www.hipcat.net

TIGRE - Texas Internet Grid for Research and Education

Research areas of particular interest

biomedicine, energy and the environment, materials science, agriculture, and information technology


TIGRE Activities
• Assembling a software stack
  – Start small, add by consensus
    • Globus Toolkit 4.0, pre-Web Services and Web Services
    • GSI OpenSSH
    • UberFTP
    • Condor-G
  – Will make available to other HiPCAT institutions
• Amassing resources
  – Allocated by TIGRE institutions
    • Lonestar (UT Austin): 1024 Xeons + Infiniband + GigE
    • Hrothgar (Texas Tech): 256 Xeons + Infiniband + GigE
    • Cosmos (Texas A&M): 128 Itaniums + Numalink
    • Rice Terascale Cluster: 128 Itaniums + GigE
    • Eldorado (Houston): 124 Itaniums + GigE + SCI
    • plus several smaller systems
  – Will incorporate other institutions as appropriate to applications


RENoH

[Map labels: LEARN, NLR, 24 strands, 12 strands; San Antonio, College Station, Dallas, El Paso, Los Angeles, Kansas City, Chicago, Baton Rouge, Jacksonville.]


Wiltel Fiber in the West -- AT&T Fiber in the Southeast

[Map labels: Seattle, Sunnyvale, LA, San Diego, Phoenix, Denver, Albuq., El Paso - Las Cruces, Dallas, San Ant., Houston, Tulsa, KC, Chicago, Clev, Pitts, New York, Wash DC, Raleigh, Atlanta, Jacksonville, Pensacola, Baton Rouge.]


ESnet Summer 2005

[Map: ESnet IP core (Packet over SONET Optical Ring and Hubs) and ESnet Science Data Network (SDN) core; 42 end user sites.]

Hubs: SEA, SNV, SNV SDN, SDSC, ELP, ALB, CHI, CHI-SL, ATL, DC, NYC

Sites by sponsor: Office Of Science Sponsored (22), NNSA Sponsored (12), Joint Sponsored (3), Other Sponsored (NSF LIGO, NOAA), Laboratory Sponsored (6)

Link legend: International (high speed); 10 Gb/s SDN core; 10 Gb/s IP core; 2.5 Gb/s IP core; MAN rings (≥ 10 Gb/s); OC12 ATM (622 Mb/s); OC12 / GigEthernet; OC3 (155 Mb/s); 45 Mb/s and less

Peering and international connections: Abilene high-speed peering points; MAE-E; PAIX-PA; Equinix, etc.; Starlight; Chi NAP; PNW GPoP; MAX GPoP; SoX GPoP; CERN (LHCnet, part DOE funded); GEANT (Germany, France, Italy, UK, etc.); SINet (Japan); Japan – Russia (BINP); CA*net4; France; GLORIAD; Kreonet2; MREN; Netherlands; StarTap; TANet2; Taiwan (ASCC); Australia; Singaren

Site labels: TWC, SNLL, YUCCA MT, BECHTEL-NV, PNNL, LIGO, INEEL, LANL, SNLA, AlliedSignal, PANTEX, ARM, KCP, NOAA, OSTI, ORAU, SRS, JLAB, PPPL, INEEL-DC, ORAU-DC, LLNL/LANL-DC, MIT, ANL, BNL, FNAL, AMES, 4xLAB-DC, NREL, LLNL, GA, DOE-ALB, GTN&NNSA, JGI, LBNL, SLAC, NERSC, ORNL, QWEST ATM

Not Houston, but wait … There is LEARN


LHC Computing Grid (LCG)

Truly heterogeneous system: People, languages, time zones…
Complex collaborative effort

LCG prototype service (2003-05)

[Map labels: Yerevan, Saclay, Lyon, Dubna, Capetown (ZA), Birmingham, Cagliari, NIKHEF, Catania, Bologna, Torino, Padova, IRB, Kolkata (India), OSU/OSC, LBL/NERSC, Merida, Bari, Nantes, Houston, RAL, CERN, Krakow, GSI, Budapest, Karlsruhe.]

ALICE Physics production

Grid in ATLAS DC1 (July 2002 – April 2003)
  US-ATLAS    DC1: Part of simulation; Pile-up; reconstruction
  EDG         DC1: several tests (1st test in August 02)
  NorduGrid   DC1: full production


EGEE

[Diagram labels: Operations Center; Regional Support Center (Support for Applications, Local Resources); Resource Center (Processors, disks); Grid server Nodes.]


Baltic Grid
• KTH
• EENet, Tartu
• NICPB, Tallinn
• IMCS University of Latvia, Riga
• RTU – Riga Technical University
• Vilnius University
• ITPA, Vilnius
• IFJPAN, Cracow
• PSNC, Poznan
• CERN

• Heterogeneous, IA32, IA64 (1537 CPU, 29 clusters)
• EGEE/LCG-2/gLite, ARC
• SGAS – Grid Accounting


Baltic Grid
• Education, training, dissemination and outreach – IFJPAN lead
• Application Identification and Support – VU lead
• Policy and Standards – KTH lead
• Grid Operations – EENet lead
• Network Resource Provisioning – IMCS UL lead
• SLAs and account management joint research – KTH lead


Six 100 node, single CPU clusters with GigE interconnect

WAN – Sunet 10GE

Middleware: EGEE, LCG-2, g-Lite, ARC

SGAS – Grid accounting

10 GE

2.5 GE


DEISA Sites

[Map labels: JSCC RAS, CSC, PDC, ECMWF, U Manchester, FZJ, SARA, EPCC, HLRS, CINECA, CASPUR, ENEA, IDRIS, BSC, RZG, LRZ, CINES, CSCS.]


DEISA Top500 Capacity June 2005

Site        Location             Top500 Linpack (TF)
BSC         Barcelona, Spain     27.91
CINECA      Bologna, Italy        6.62
CSC         Helsinki, Finland     1.17
ECMWF       Reading, UK          18.48
EPCC/HPCx   Daresbury, UK         6.19
FZJ         Julich, Germany      10.28
HLRS        Stuttgart, Germany    8.92
IDRIS       Orsay, France         3.11
LRZ         Munich, Germany       1.65
RZG         Garching, Germany     2.74
SARA        Amsterdam, Holland    4.16
Total                            91.23
Total “Public” Europe           187.14
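
The totals on this slide can be cross-checked with a few lines of Python. Only the figures from the slide are used; the variable names are illustrative, and the "share" is simply the DEISA total divided by the slide's total for “Public” Europe.

```python
# Top500 Linpack figures (TF) per DEISA site, copied from the slide.
linpack_tf = {
    "BSC": 27.91, "CINECA": 6.62, "CSC": 1.17, "ECMWF": 18.48,
    "EPCC/HPCx": 6.19, "FZJ": 10.28, "HLRS": 8.92, "IDRIS": 3.11,
    "LRZ": 1.65, "RZG": 2.74, "SARA": 4.16,
}
public_europe_tf = 187.14  # "Total Public Europe" as given on the slide

# Round to two decimals to absorb floating-point summation noise.
total = round(sum(linpack_tf.values()), 2)
share = total / public_europe_tf

print(f"DEISA total: {total} TF")          # DEISA total: 91.23 TF
print(f"Share of public Europe: {share:.1%}")  # Share of public Europe: 48.7%
```

The per-site figures do sum to the stated 91.23 TF, which is a little under half of the listed public European capacity.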


DEISA
• Dedicated network through GEANT
• Global File System
• Support of Workflow applications
• Global data management
• Co-Scheduling services
• Portals and Web services
• Extreme Computing Initiative (DECI)


CCN Grid Objectives
• Maturing Grid technologies
  – Security
  – Interoperability
  – Accounting
  – (Performance) Monitoring
  – ……..
• Demonstrating utility of Grids
  – Persistency/Availability
  – Scale
  – Diversity/Uniqueness of resources
  – ……..


What to do?
• Each organization has its own policies for access and reporting
  • can we agree on a common application and reporting mechanism and format?
  • what information about users can we share, to keep sponsors happy, or the “FBI” when required, or auditors, or …?
• Monitoring and accounting
• Software environments and tools
• MoUs – roles of engagement
• Need driving applications (data, collaboration, computing, …)
• Need goals and timelines (aligned with funded activities)