
Page 1:

National Research Grid Initiative (NAREGI)

Satoshi Matsuoka
Sub-Project Leader, NAREGI Project
Visiting Professor, National Institute of Informatics
Professor, GSIC, Tokyo Inst. Technology

Page 2:

Inter-university Computer Centers (excl. National Labs), circa 2002

Hokkaido University: HITACHI SR8000, HP Exemplar V2500, HITACHI MP5800/160, Sun Ultra Enterprise 4000
Tohoku University: NEC SX-4/128H4 (soon SX-7), NEC TX7/AzusA
University of Tokyo: HITACHI SR8000, HITACHI SR8000/MPP, others (in institutes)
Nagoya University: FUJITSU VPP5000/64, FUJITSU GP7000F model 900/64, FUJITSU GP7000F model 600/12
Osaka University: NEC SX-5/128M8, HP Exemplar V2500/N
Kyoto University: FUJITSU VPP800, FUJITSU GP7000F model 900/32, FUJITSU GS8000
Kyushu University: FUJITSU VPP5000/64, HP GS320/32, FUJITSU GP7000F 900/64
Tokyo Inst. Technology (Titech): NEC SX-5/16, Origin2K/256, HP GS320/64
University of Tsukuba: FUJITSU VPP5000, CP-PACS 2048 (SR8000 proto)

Page 3:

Q: How can the Grid become a ubiquitous national research computing infrastructure?

• Simply extend the Campus Grid?
  – 100,000 users/machines, 1000 km networking, PetaFlops/Petabytes… Problems!
• Grid software stack deficiencies
  – Large-scale resource management
  – Large-scale Grid programming
  – User support tools: PSE, visualization, portals
  – Packaging, distribution, troubleshooting
  – High-performance networking vs. firewalls
  – Large-scale security management
  – "Grid-enabling" applications
  – Manufacturer experience and support

Page 4:

National Research Grid Initiative (NAREGI) Project: Overview

- A new Japanese MEXT national Grid R&D project: approx. US$17M in FY2003 (similar levels through FY2007), plus US$45M
- One of two major Japanese government Grid projects (c.f. the "Business Grid" project)
- Collaboration of national labs, universities, and major computing and nanotechnology industries
- Acquisition of computer resources underway (FY2003)

MEXT: Ministry of Education, Culture, Sports, Science and Technology

Page 5:

National Research Grid Infrastructure (NAREGI) 2003-2007

• Petascale Grid infrastructure R&D for future deployment
  – US$45M + US$16M x 5 (2003-2007) = US$125M total
  – Hosted by the National Institute of Informatics (NII) and the Institute for Molecular Science (IMS)
  – PL: Ken Miura (Fujitsu → NII)
  – SLs: Sekiguchi (AIST), Matsuoka (Titech), Shimojo (Osaka-U), Hirata (IMS), …
  – Participation by multiple (>= 3) vendors
  – Resource contributions by university centers as well

[Diagram: the National Research Grid Middleware R&D effort (NII with Titech, Fujitsu, NEC, Hitachi, Osaka-U, U-Kyushu, U-Tokyo, AIST, and various partners) connects via SuperSINET to the Grid R&D infrastructure (15 TF-100 TF), Grid and network management, and the focused "Grand Challenge" Grid application areas: "NanoGrid" (IMS, ~10 TF) for nanotech Grid apps, (BioGrid / RIKEN) for biotech Grid apps, and other institutes/apps.]

Page 6:

National Research Grid Initiative (NAREGI) Project: Goals

(1) R&D in Grid middleware: a Grid software stack for "petascale" nation-wide "Research Grid" deployment
(2) Testbed validating a 100+ TFLOPS (2007) Grid computing environment for nanoscience apps on the Grid
  - Initially a ~17 TFLOPS, ~3000-CPU dedicated testbed
  - SuperSINET (> 10 Gbps research AON backbone)
(3) International collaboration with similar projects (U.S., Europe, Asia-Pacific incl. Australia)
(4) Standardization activities, esp. within GGF

Page 7:

NAREGI Research Organization and Collaboration

[Organization chart:
- MEXT funds the project; a Grid R&D Advisory Board and a Grid R&D Program Management Committee support the Project Leader (K. Miura, NII).
- Center for Grid Research & Development (National Institute of Informatics): group leaders for Grid Middleware and Upper Layer R&D (with Titech, Osaka-U, Kyushu-U, etc.) and for Grid Networking R&D; operations; joint research with AIST (GTRC); coordination/deployment with national supercomputing centers, universities, and research labs; network technology refinement and coordination in network research with the national supercomputing centers; utilization of SuperSINET.
- Computational Nano-science Center (Institute for Molecular Science; Director: Dr. Hirata): R&D of grand-challenge Grid applications (ISSP, Tohoku-U, AIST, etc., plus industrial partners) and operations; utilization of computing resources; technical requirements fed back to the middleware R&D side; joint research with the Center for Grid R&D.
- Collaboration with the ITBL Project (JAERI; ITBL Project Director) and with the Consortium for Promotion of Grid Applications in Industry.
- Testbed resources (acquisition in FY2003): NII ~5 TFLOPS, IMS ~11 TFLOPS.]

Page 8:

Participating Organizations

• National Institute of Informatics (NII) (Center for Grid Research & Development)
• Institute for Molecular Science (IMS) (Computational Nano-science Center)
• Universities and national labs (joint R&D): AIST Grid Technology Research Center, Titech GSIC, Osaka-U Cybermedia Center, Kyushu-U, Kyushu Inst. Tech., etc.
• Project collaborations (ITBL Project, SC center Grid deployment projects, etc.)
• Participating vendors (IT and nanotech)
• Consortium for Promotion of Grid Applications in Industry

Page 9:

NAREGI R&D Assumptions & Goals

• Future Research Grid metrics
  – 10s of institutions/centers, various project VOs
  – > 100,000 users, > 100,000 CPUs/machines
  – Machines very heterogeneous: SCs, clusters, desktops
  – 24/7 usage, production deployment
  – Server Grid, Data Grid, metacomputing…
• Do not reinvent the wheel
  – Build on, collaborate with, and contribute to the "Globus, Unicore, Condor" trilogy
  – Scalability and dependability are the key
• Win the support of users
  – Application and experimental deployment essential
  – However, do not let the apps get a "free ride"
  – R&D for production-quality (free) software

Page 10:

NAREGI Work Packages

• WP-1: National-Scale Grid Resource Management: Matsuoka (Titech), Kohno (ECU), Aida (Titech)
• WP-2: Grid Programming: Sekiguchi (AIST), Ishikawa (AIST)
• WP-3: User-Level Grid Tools & PSE: Miura (NII), Sato (Tsukuba-U), Kawata (Utsunomiya-U)
• WP-4: Packaging and Configuration Management: Miura (NII)
• WP-5: Networking, National-Scale Security & User Management: Shimojo (Osaka-U), Oie (Kyushu Tech.)
• WP-6: Grid-Enabling Nanoscience Applications: Aoyagi (Kyushu-U)

Page 11:

NAREGI Software Stack (for a 100 TFLOPS-class science Grid environment)

WP6: Grid-Enabled Apps
WP3: Grid PSE, Grid Workflow, Grid Visualization
WP1: SuperScheduler, Grid Monitoring & Accounting, Grid VM
WP2: Grid Programming (Grid RPC, Grid MPI)
WP5: Grid PKI, High-Performance Grid Networking
WP4: Packaging
(Built on Globus, Condor, UNICORE → OGSA)

Page 12:

WP-1: National-Scale Grid Resource Management

• Build on Unicore, Condor, Globus
  – Bridge their gaps as well
  – OGSA in the future
  – Condor-U and Unicore-C
• SuperScheduler
• Monitoring & Auditing/Accounting
• Grid Virtual Machine
• PKI and Grid Account Management (WP5)

[Diagram: bridges among the three systems: EU GRIP (Unicore-Globus), Condor-G (Condor's Globus universe), and the planned Unicore-C and Condor-U connectors.]

Page 13:

WP1: SuperScheduler (Fujitsu)

• Hierarchical super-scheduling structure, scalable to 100,000s of users, nodes, and jobs across 20+ sites (a toy sketch of the two-level idea follows below)
• Fault tolerance
• Workflow engine
• NAREGI resource schema (joint w/ Hitachi)
• Resource brokering w/ resource policies, advance reservation (NAREGI Broker)
• Initially prototyped on Unicore AJO/NJS/TSI
  – (OGSA in the future)
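Purely as an illustration of the two-level brokering idea described above (not NAREGI code), the sketch below shows a "super-scheduler" that selects a site whose free CPUs and per-job policy satisfy a request, then hands the job to that site's local queue. All names (site_t, job_t, pick_site, submit_local) are hypothetical.

```c
/* Toy two-level scheduling sketch: a super-scheduler brokers a job to one
 * site by resource availability and a simple policy, then the site's local
 * scheduler queues it.  Illustrative only; all names are hypothetical. */
#include <stdio.h>

typedef struct {
    const char *name;
    int free_cpus;          /* CPUs currently available at the site */
    int max_cpus_per_job;   /* simple local policy: per-job CPU cap */
    int queued_jobs;        /* jobs already in the local queue      */
} site_t;

typedef struct {
    const char *id;
    int cpus;               /* CPUs requested by the job            */
} job_t;

/* Super-scheduler step: pick the least-loaded site that satisfies the
 * job's CPU request and the site's own policy; return -1 if none fits. */
static int pick_site(const site_t *sites, int nsites, const job_t *job)
{
    int best = -1;
    for (int i = 0; i < nsites; i++) {
        if (sites[i].free_cpus >= job->cpus &&
            sites[i].max_cpus_per_job >= job->cpus &&
            (best < 0 || sites[i].queued_jobs < sites[best].queued_jobs))
            best = i;
    }
    return best;
}

/* Local-scheduler step: accept the job into the chosen site's queue. */
static void submit_local(site_t *site, const job_t *job)
{
    site->free_cpus -= job->cpus;
    site->queued_jobs++;
    printf("job %s (%d CPUs) -> site %s\n", job->id, job->cpus, site->name);
}

int main(void)
{
    site_t sites[] = { { "site-A", 256, 128, 3 }, { "site-B", 512, 256, 1 } };
    job_t  jobs[]  = { { "wf-step-1", 64 }, { "wf-step-2", 200 } };

    for (int j = 0; j < 2; j++) {
        int s = pick_site(sites, 2, &jobs[j]);
        if (s < 0)
            printf("job %s: no site satisfies the request\n", jobs[j].id);
        else
            submit_local(&sites[s], &jobs[j]);
    }
    return 0;
}
```

The real SuperScheduler adds fault tolerance, workflow handling, and advance reservation on top of such brokering, and speaks UNICORE protocols rather than in-process calls.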

Page 14:

WP1: SuperScheduler (Fujitsu) (Cont'd)

[Architecture diagram:
- A WP3 workflow description from the PSE (converted to a UNICORE DAG) enters through the UNICORE Gateway over UPL (Unicore Protocol Layer) over SSL, crossing the Internet/intranet boundary under the WP5 NAREGI PKI [NEC].
- An NJS (Network Job Supervisor) plus a Broker NJS consult NAREGI BROKER-S [Fujitsu] through a Resource Broker interface (CheckQoS & SubmitJob / CheckQoS), with user authentication via the UUDB; Execution NJSs are reached over UPL.
- Execution NJSs drive TSIs (Target System Interface) over FNTP (Fujitsu European Laboratories NJS-to-TSI Protocol); through the TSI connection interface, NAREGI BROKER-L [Fujitsu] reaches local systems such as Condor (possibly via DRMAA) and Globus (via GRIP).
- Resource requirements in RSL (or JSDL) are mapped onto CIM; a policy engine based on the "Ponder" policy description language (Imperial College London), acting as a management application, uses a policy DB/repository for both the super-scheduler and local schedulers; resource discovery, selection, reservation, and analysis & prediction are being planned, possibly exposed as an OGSI portType.
- A CIMOM (CIM Object Manager; TOG OpenPegasus, derived from the SNIA CIMOM) aggregates CIM providers for batch queues (NQS), Condor (ClassAd), and Globus (MDS/GARA), using CIM in XML over HTTP or CIM-to-LDAP; CIM indications (events, e.g., a queue change) feed a GMA sensor for monitoring [Hitachi].
- Commercial CIM products include MS WMI (Windows Management Instrumentation), IBM Tivoli, and Sun WBEM Services.
- (U): UNICORE (Uniform Interface to Computing Resources); (G): GRIP (Grid Interoperability Project). C.f. the EuroGrid Broker [Manchester U]. Used in the CGS-WG demo at GGF7.]

Page 15:

WP1: Grid Virtual Machine (NEC & Titech)

• "Portable" and thin VM layer for the Grid
• Various VM functions: access control, access transparency, FT support, resource control, etc.
• Also provides co-scheduling across clusters
• Respects Grid standards, e.g., GSI, OGSA (future)
• Various prototypes on Linux

[Diagram, GridVM functions: access control & virtualization, secure resource access control, checkpoint support, job migration, resource usage rate control, co-scheduling & co-allocation, job control, node virtualization & access transparency, resource control, FT support.]

Page 16:

WP1: Grid Monitoring & Auditing/Accounting (Hitachi & Titech)

• Scalable Grid monitoring, accounting, and logging
• Define a CIM-based unified resource schema
• Distinguish end users vs. administrators
• Prototype based on GT3 Index Service, CIMOM, etc.
• Self-configuring monitoring (Titech)

[Diagram: admin-info and end-user-info presentation layers (detailed resource info, searching, fault analysis, etc.; user-dependent presentation) sit on top of a Grid middleware information service (UNICORE, Condor, Globus) with a unified schema covering predictions, user logs, OS logs, event logs, resource info, and performance monitoring; backed by a secure large-scale data management service, an RDB, a directory service, and a real-time monitoring service; a CIMOM (Pegasus) and GMA info providers collect from the batch system, the SuperScheduler, and GridVM; admin operations (e.g., an account mapping service) are governed by admin policy.]

Page 17:

WP-2: Grid Programming

• Grid Remote Procedure Call (RPC): Ninf-G2
• Grid Message Passing Programming: GridMPI

Page 18:

WP-2: Grid Programming – GridRPC / Ninf-G2 (AIST/GTRC)

GridRPC (http://ninf.apgrid.org/)
• Programming model using RPC on the Grid
• High-level, tailored for scientific computing (c.f. SOAP-RPC)
• GridRPC API standardization by the GGF GridRPC WG

Ninf-G Version 2
• A reference implementation of the GridRPC API
• Implemented on top of Globus Toolkit 2.0 (3.0 experimental)
• Provides C and Java APIs

[Diagram, client side / server side: (1) the client sends an interface request; (2) the server replies with interface information retrieved from MDS (an LDIF file generated by the IDL compiler from the IDL file); (3) the client invokes the remote executable via GRAM, which forks it against the numerical library; (4) the remote executable connects back to the client. A minimal client sketch in this style follows.]

DEMO is available at the AIST/Titech booth.
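As a concrete illustration of the call flow in the figure, here is a minimal client sketch in the style of the GGF GridRPC C API that Ninf-G implements (grpc_initialize, grpc_function_handle_init, grpc_call, grpc_finalize). The configuration file, server name, and remote function name are placeholders, and exact signatures and error handling may differ between Ninf-G releases.

```c
/* Minimal GridRPC client sketch (GGF GridRPC C API style, as in Ninf-G).
 * "client.conf", "server.example.org", and "mylib/sum" are placeholders. */
#include <stdio.h>
#include "grpc.h"

int main(void)
{
    grpc_function_handle_t handle;
    double a[] = { 1.0, 2.0, 3.0 }, result = 0.0;
    int n = 3;

    /* Read the client configuration (server addresses, options, ...). */
    if (grpc_initialize("client.conf") != GRPC_NO_ERROR) {
        fprintf(stderr, "grpc_initialize failed\n");
        return 1;
    }

    /* Bind a handle to a remote function published through its IDL
     * (steps 1-2 in the figure: interface request / interface reply). */
    grpc_function_handle_init(&handle, "server.example.org", "mylib/sum");

    /* Synchronous RPC: the server invokes the remote executable (step 3),
     * which connects back to the client and returns the result (step 4). */
    if (grpc_call(&handle, n, a, &result) == GRPC_NO_ERROR)
        printf("sum = %f\n", result);

    grpc_function_handle_destruct(&handle);
    grpc_finalize();
    return 0;
}
```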

Page 19:

WP-2: Grid Programming – GridMPI (AIST and U-Tokyo)

• Provides users an environment to run MPI applications efficiently on the Grid (a plain MPI sketch follows below)
• Flexible and heterogeneous process invocation on each compute node
• Grid ADI and latency-aware communication topology, optimizing communication over non-uniform latencies and hiding the differences between various lower-level communication libraries
• Extremely efficient implementation based on MPI on SCore (not MPICH-PM)

[Diagram, GridMPI stack: the MPI core sits on the Grid ADI and the latency-aware communication topology; process invocation via RSH, GRAM, or SSH (RIM); point-to-point communication over IMPI and TCP/IP, PMv2, vendor MPI, or other communication libraries.]
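Since GridMPI's goal is to run ordinary MPI programs efficiently across sites, the sketch below is deliberately plain, standard MPI with no Grid-specific calls; the Grid ADI and latency-aware layers would sit underneath MPI_Send/MPI_Recv. It is an illustrative program of the kind such an environment targets, not GridMPI code.

```c
/* Plain MPI ring exchange: no Grid-specific calls, so the same source can
 * run inside one cluster or, under GridMPI, across sites where some hops
 * cross slow wide-area links (what latency-aware topology optimizes). */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, token = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size > 1) {
        if (rank == 0) {
            token = 42;
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("token returned to rank 0: %d\n", token);
        } else {
            MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0,
                     MPI_COMM_WORLD);
        }
    }

    MPI_Finalize();
    return 0;
}
```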

Page 20:

WP-3: User-Level Grid Tools & PSE

• Grid Workflow
  - Workflow language definition
  - GUI (task-flow representation)
• Visualization tools
  - Real-time volume visualization on the Grid
• PSE / portals
  - Multiphysics / coupled simulation
  - Application pool
  - Collaboration with the Nanotech Applications Group

[Diagram: a PSE toolkit (PSE portal, application pool, application server) forms the problem-solving environment, connected to the SuperScheduler, the information service, and the workflow system; the visualization pipeline runs simulation → 3D object generation → rendering, with raw data, 3D objects, and images staged through server- and client-side storage, and 3D object generation/rendering optionally performed at the UI.]

Page 21:

WP-4: Packaging and Configuration Management

• Collaboration with WP1 management
• Issues
  – Selection of packagers to use (RPM, GPTK?)
  – Interface with autonomous configuration management (WP1)
  – Test procedure and harness
  – Testing infrastructure, c.f. NSF NMI packaging and testing

Page 22:

WP-5: Grid High-Performance Networking

• Traffic measurement on SuperSINET
• Optimal routing algorithms for Grids
• Robust TCP/IP control for Grids
• Grid CA / user Grid account management and deployment
• Collaboration with WP-1

Page 23:

WP-6: Adaptation of Nano-science Applications to the Grid Environment

• Analysis of typical nanoscience applications
  - Parallel structure
  - Granularity
  - Resource requirements
  - Latency tolerance
• Development of coupled simulation
• Data exchange format and framework
• Collaboration with IMS

Page 24:

WP6 and Grid Nano-Science and Technology Applications: Overview

Participating organizations:
- Institute for Molecular Science
- Institute for Solid State Physics
- AIST
- Tohoku University
- Kyoto University
- Industry (materials, nano-scale devices)
- Consortium for Promotion of Grid Applications in Industry

Research topics and groups:
- Electronic structure
- Magnetic properties
- Functional nano-molecules (CNT, fullerenes, etc.)
- Bio-molecules and molecular electronics
- Simulation software integration platform
- etc.

Page 25:

Example: WP6 and IMS Grid-Enabled Nanotechnology

• IMS RISM-FMO Grid coupled simulation
  – RISM: Reference Interaction Site Model
  – FMO: Fragment Molecular Orbital method
• WP6 will develop the application-level middleware, including the "Mediator" component

[Diagram: RISM runs on an SMP machine and FMO on an SC cluster (Grid); Mediator components couple the two codes over GridMPI etc., exchanging solvent distributions, solute structures, and in-sphere correlations.]

Page 26:

SuperSINET: AON Production Research Network (separate funding)

■ 10 Gbps general backbone
■ GbE bridges for peer connection
■ Very low latency: Titech-Tsukuba 3-4 ms roundtrip (a rough bandwidth-delay estimate follows below)
■ Operation of Photonic Cross Connect (PXC) for fiber/wavelength switching
■ 6,000+ km of dark fiber, 100+ end-to-end lambdas, and 300+ Gb/s
■ Operational from January 2002 until March 2005

[Map: connected sites include KEK, U. of Tokyo, NIG, ISAS, Nagoya U., Kyoto U., Osaka U., NIFS, Kyushu U., Hokkaido U., the Okazaki Research Institutes, Tohoku U., Tsukuba U., Tokyo Institute of Tech., Waseda U., Doshisha U., NAO, and NII (operation and R&D); application overlays include DataGrid for high-energy science, middleware for computational Grid, nanotechnology Grid applications, OC-48+ transmission for radio telescope, and bio-informatics.]
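As a rough, back-of-the-envelope note (not from the slide): at 10 Gbps with the quoted 3-4 ms roundtrip, the bandwidth-delay product of a single flow is

$$10^{10}\ \mathrm{b/s} \times 4 \times 10^{-3}\ \mathrm{s} = 4 \times 10^{7}\ \mathrm{b} \approx 5\ \mathrm{MB},$$

far beyond default TCP window sizes, which is one reason WP-5 studies robust TCP/IP control for Grids.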

Page 27:

SuperSINET: Network Topology (10 Gbps photonic backbone network)

As of October 2002. Source: National Institute of Informatics.

[Topology map: Tokyo, Nagoya, and Osaka hubs on the photonic backbone, connecting U. Tokyo, NII (Hitotsubashi and Chiba), Titech, Waseda U., KEK, NAO, ISAS, Tsukuba U., Tohoku U., Hokkaido U., NIG, NIFS, Nagoya U., IMS (Okazaki), Osaka U., Kyoto U., Kyoto U. Uji, Doshisha U., and Kyushu U.; the NAREGI Grid R&D sites attach to this backbone.]

Page 28:

The NAREGI Phase 1 Testbed ($45M, 1Q2004)

• ~3000 procs, ~17 TFLOPS, linked over SuperSINET (10 Gbps MPLS, ~400 km)
  – NII (Tokyo), Center for Grid R&D: ~5 TFLOPS software testbed, plus small test application clusters (x6)
  – IMS (Okazaki), Computational Nano-science Center: ~11 TFLOPS application testbed
  – Associated resources: Osaka-U BioGrid, U-Tokyo, Titech Campus Grid (~1.8 TFLOPS), AIST SuperCluster (~11 TFLOPS)
• Total ~6500 procs, ~30 TFLOPS
• Note: NOT a production Grid system (c.f. TeraGrid)

Page 29:

NAREGI Software R&D Grid Testbed (Phase 1)

• Under procurement; installation March 2004
  – 3 SMPs, 128 procs total (64 + 32 + 32): SPARC V + IA-64 + POWER4
  – 6 x 128-proc PC clusters
    • 2.8 GHz dual Xeon + GbE (blades)
    • 3.06 GHz dual Xeon + InfiniBand
  – 10 + 37 TB file server
  – Multi-gigabit networking to simulate a Grid environment (WAN simulation)
  – NOT a production system (c.f. TeraGrid)
  – > 5 Teraflops
  – To form a Grid with the IMS NAREGI application testbed infrastructure (> 10 Teraflops, March 2004) and other national centers via SuperSINET

Page 30:

NAREGI R&D Grid Testbed @ NII

[System and network configuration overview of the Grid infrastructure software development system (translated from Japanese):
- External network connection device to SuperSINET; internal L3 GbE switch (64+ GbE ports, 10 GbE x 1 possible; plus 4+ GbE ports with 10 GbE x 2 possible and high-speed packet filtering), shareable with the interconnect switch; GbE x 8 trunk; dedicated L2 GbE switches (75+ ports each) for the compute servers and a separate switch for the office-environment network; file server attached via GbE x 4.
- High-performance distributed-parallel compute servers 1 and 2: each 128+ processors (Linux) plus a management node, >= 0.75 TF, >= 2.3 TB disk; >= 130 GB and >= 65 GB memory respectively; interconnects of >= 8 Gbps and >= 4 Gbps appear in the diagram.
- Distributed-parallel compute servers 1-4: each 128+ processors (Linux) plus a management node, >= 0.65 TF, >= 65 GB memory, >= 1.2 TB disk, interconnect >= 1 Gbps.
- Shared-memory compute server 1: 64 CPUs, 1 node (UNIX, 64-bit processors), >= 0.33 TF, >= 64 GB memory, >= 73 GB disk.
- Shared-memory compute server 2: 32 CPUs, 1 node (UNIX, 64-bit processors), >= 0.17 TF, >= 32 GB memory, >= 73 GB disk.
- Shared-memory compute server 3: 32 CPUs, 1 node (Linux, 64-bit processors), 0.17 TF, 64 GB memory, >= 73 GB disk.
- File server: 8-CPU SMP, >= 16 GB memory, >= 10 TB disk (RAID5), >= 20 TB backup.]

Page 31:

AIST (National Institute of Advanced Industrial Science & Technology) SuperCluster

• Challenge
  – Huge computing power to support various research within AIST, including life science and nanotechnology
• Solution
  – Linux cluster: IBM eServer 325
    • P32: 2116-CPU AMD Opteron
    • M64: 520-CPU Intel Madison
  – Myrinet networking
  – SCore cluster OS
  – Globus Toolkit 3.0 to allow shared resources
• World's most powerful Linux-based supercomputer
  – More than 11 TFLOPS, ranked as the third most powerful supercomputer in the world
  – Operational March 2004

[Diagram: collaborations around the supercluster: Grid technology, the Advanced Computing Center, life science and nanotechnology research, government, academia, corporations, and other research institutes, connected via LAN and the Internet.]

Page 32:

NII Center for Grid R&D (Jinbo-cho, Tokyo)

[Map: near the Imperial Palace, Tokyo Station, and Akihabara; Mitsui Office Bldg., 14th floor.]

700 m2 of office space (100 m2 machine room)

Page 33:

Towards a Petascale Grid – a Proposal

• Resource diversity (松竹梅 "Shou-Chiku-Bai", the pine-bamboo-plum grades)
  – 松 ("shou", pine): ES-like centers, 40-100 TeraFlops x (a few), 100-300 TeraFlops total
  – 竹 ("chiku", bamboo): medium-sized machines at SCs, 5-10 TeraFlops x 5, 25-50 TeraFlops aggregate per center, 250-500 TeraFlops total
  – 梅 ("bai", plum): small clusters and PCs spread throughout campus in a campus Grid, x 5k-10k, 50-100 TeraFlops per center, 500 TeraFlops-1 PetaFlop total (see the rough sum below)
• Division of labor between "big" centers like the ES and university centers: large, medium, and small resources
• Utilize the Grid software stack developed by NAREGI and other Grid projects

(Diagram: ES-class centers and university SCs)

Page 34:

Collaboration Ideas

• Data (Grid)
  – NAREGI deliberately does not handle data
• Unicore components
  – "Unicondore" (Condor-U, Unicore-C)
• NAREGI middleware
  – GridRPC, GridMPI
  – Networking
  – Resource management (e.g., the CIM resource schema)
• International testbed
• Other ideas?
  – Application areas as well