Science Grid Program NAREGI
and
Cyber Science Infrastructure
November 1, 2007
Kenichi Miura, Ph.D.
Information Systems Architecture Research Division
Center for Grid Research and Development
National Institute of Informatics
Tokyo, Japan
Outline
1. National Research Grid Initiative (NAREGI)
2. Cyber Science Infrastructure (CSI)
National Research Grid Initiative (NAREGI) Project: Overview
- Originally started as an R&D project funded by MEXT (FY2003-FY2007), with a budget of 2 billion yen (~US$17M) in FY2003
- Collaboration of national laboratories, universities, and industry in the R&D activities (IT and nano-science applications)
- Project redirected as a part of the Next Generation Supercomputer Development Project (FY2006-…)
MEXT: Ministry of Education, Culture, Sports, Science and Technology

National Research Grid Initiative (NAREGI) Project: Goals
(1) To develop a Grid software system (R&D in Grid middleware and upper layers) as the prototype of a future Grid infrastructure for scientific research in Japan
(2) To provide a testbed to prove that a high-end Grid computing environment (100+ Tflop/s expected by 2007) can be practically utilized by the nano-science research community over Super SINET (now SINET3)
(3) To participate in international collaboration/interoperability efforts (U.S., Europe, Asia-Pacific), e.g., GIN
(4) To contribute to standardization activities, e.g., OGF
Organization of NAREGI
(Organization chart:)
- Funded by the Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- Center for Grid Research and Development (National Institute of Informatics); Project Leader: Dr. K. Miura; Grid middleware and upper-layer R&D, plus the Grid middleware integration and operation group
- Computational Nano-science Center (Institute for Molecular Science); Director: Dr. F. Hirata; R&D on grand-challenge problems for Grid applications (ISSP, Tohoku-U, AIST, Inst. Chem. Research, KEK, etc.)
- Joint R&D: TiTech, Kyushu-U, Osaka-U, Kyushu Tech., Fujitsu, Hitachi, NEC
- Collaboration: Grid Technology Research Center (AIST), JAEA (ITBL), computing and communication centers of 7 national universities, the Industrial Association for Promotion of Supercomputing Technology, and the Cyber Science Infrastructure (CSI) Coordination and Operation Committee
- Operation, utilization, and deployment over SINET3
NAREGI Software Stack
(Layer diagram, built on Globus, Condor, and UNICORE, moving to OGSA:)
- Grid-Enabled Nano-Applications
- Grid PSE; Grid Programming (GridRPC, GridMPI); Grid Visualization
- Grid Workflow; Super Scheduler; Distributed Information Service; Data Grid
- Grid VM; Packaging
- High-Performance & Secure Grid Networking (SINET3)
- Computing Resources (NII, IMS, research organizations, etc.)
VO and Resources in Beta 2
(Deployment diagram, illustrating the decoupling of VOs and resource providers (centers). Compute nodes at three research organizations (RO1: A.RO1 … N.RO1; RO2; RO3) each run GridVM and an Information Service (IS), with a local policy listing the VOs allowed to use them (combinations of VO-RO1, VO-RO2, VO-APL1, and VO-APL2). Each VO has its own VOMS server, IS, and Super Scheduler (SS) through which clients submit work, so VOs & users are managed independently of the resource providers (Grid Center@RO1, Grid Center@RO2).)
WP-2: Grid Programming – GridRPC/Ninf-G2 (AIST/GTRC)

(Diagram: on a GridRPC call, the client sends an interface request (1) and receives the interface reply (2); GRAM forks and invokes the remote executable (3), which connects back to the client (4). The remote executable wraps a numerical library and is generated from an IDL file by the IDL compiler; its interface information is published to MDS as an LDIF file and retrieved by the client.)

- Programming model using RPC on the Grid
- High-level, tailored for scientific computing (cf. SOAP-RPC)
- GridRPC API standardization by the GGF GridRPC WG

Ninf-G Version 2:
- A reference implementation of the GridRPC API
- Implemented on top of Globus Toolkit 2.0 (3.0 experimental)
- Provides C and Java APIs
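To make the programming model concrete, below is a minimal GridRPC client in C using the GGF GridRPC API that Ninf-G implements. The configuration file name, server host, and remote entry name (a hypothetical matrix-multiply routine exported via IDL) are placeholders.

```c
#include <stdio.h>
#include <grpc.h>

#define N 100

int main(int argc, char *argv[])
{
    grpc_function_handle_t handle;
    static double a[N * N], b[N * N], c[N * N];

    /* Read the Ninf-G client configuration (placeholder file name). */
    if (grpc_initialize("client.conf") != GRPC_NO_ERROR) {
        fprintf(stderr, "grpc_initialize failed\n");
        return 1;
    }

    /* Bind a handle to a remote executable; the host and the
       "module/entry" name are hypothetical. */
    grpc_function_handle_init(&handle, "server.example.org", "mmul/mmul");

    /* Synchronous remote call: arguments are marshalled per the IDL. */
    if (grpc_call(&handle, N, a, b, c) != GRPC_NO_ERROR)
        fprintf(stderr, "grpc_call failed\n");

    grpc_function_handle_destruct(&handle);
    grpc_finalize();
    return 0;
}
```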
WP-2: Grid Programming – GridMPI (AIST and U-Tokyo)

GridMPI is a library which enables MPI communication between parallel systems in the Grid environment. This realizes:
- Jobs whose data size is too large to be executed on a single cluster system
- Multi-physics jobs in a heterogeneous CPU-architecture environment

① Interoperability:
- IMPI (Interoperable MPI)-compliant communication protocol
- Strict adherence to the MPI standard in the implementation
② High performance:
- Simple implementation
- Built-in wrapper to vendor-provided MPI libraries

(Diagram: Cluster A and Cluster B each run YAMPII locally; an IMPI server bridges MPI communication between the two clusters.)
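Because GridMPI strictly adheres to the MPI standard, an ordinary MPI program such as the minimal sketch below should run unchanged when its ranks are spread across the coupled clusters; only the launch configuration (YAMPII on each cluster, coordinated by the IMPI server) differs from a single-cluster run.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Under GridMPI, ranks may live on two different clusters
       (e.g., Cluster A and Cluster B), but MPI_COMM_WORLD
       spans all of them transparently. */
    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}
```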
WP-3: User-Level Grid Tools & PSE

Grid PSE:
- Deployment of applications on the Grid
- Support for execution of deployed applications

Grid Workflow:
- Workflow language independent of specific Grid middleware
- GUI in task-flow representation

Grid Visualization:
- Remote visualization of massive data distributed over the Grid
- General Grid services for visualization
Workflow-Based Grid FMO Simulations of Proteins
(Workflow diagram: input data and fragment data feed monomer calculations and then dimer calculations, distributed over NII resources (njs_png20xx nodes) and IMS resources (dpcd05x nodes); a total energy calculation, density exchange, and visualization follow. Data components connect the stages.)
By courtesy of Prof. Aoyagi (Kyushu Univ.)
Scenario for Multi-Site MPI Job Execution
(Diagram: a workflow couples a RISM job (64 CPUs on the Site B SMP machine) with an FMO job (128 CPUs on the Site C PC cluster) via GridMPI and an IMPI server. Steps shown: a: registration, b: deployment, and c: editing of the RISM and FMO sources through the PSE and Workflow Tool (WFT); 1: submission to the Super Scheduler; a resource query to the Distributed Information Service; 3: negotiation and agreement with the GridVMs at each site; 4: reservation; 5: sub-job generation; 6: co-allocation and submission to the local schedulers; 10: monitoring and accounting. Each site runs its own CA; input files are staged in, and output files feed Grid Visualization.)
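The coupled RISM+FMO run is essentially an MPMD job: GridMPI presents one MPI world spanning both sites, and each application computes on its own subset of ranks. The sketch below shows the standard MPI_Comm_split pattern for such a split; it is not NAREGI's bridging code, and the rank boundary and solver entry points (rism_main, fmo_main) are hypothetical.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int world_rank;
    MPI_Comm app_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Ranks 0-63: RISM on the SMP machine (Site B);
       ranks 64-191: FMO on the PC cluster (Site C).
       The 64/128 split mirrors the figure but is otherwise arbitrary. */
    int color = (world_rank < 64) ? 0 : 1;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &app_comm);

    /* Each side runs its own solver on app_comm (hypothetical entry
       points), then the two sides exchange densities/charges through
       MPI_COMM_WORLD. */
    if (color == 0) {
        /* rism_main(app_comm); */
    } else {
        /* fmo_main(app_comm); */
    }

    MPI_Comm_free(&app_comm);
    MPI_Finalize();
    return 0;
}
```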
Adaptation of Nano-Science Applications to the Grid Environment
(Diagram: RISM (Reference Interaction Site Model; solvent distribution analysis, run at IMS under MPICH-G2/Globus) and FMO (Fragment Molecular Orbital method; electronic structure analysis, run at NII under GridMPI) are coupled through the Grid middleware over SINET3, with data transformation between the different meshes, yielding the electronic structure in solution.)
NAREGI Application: Nanoscience – Simulation Scheme

3D-RISM computes the pair correlation functions $g(r)$, $c(r)$, and $h(r)$ on an evenly spaced mesh; FMO performs monomer calculations, solving $H_1 \psi_1 = E_1 \psi_1$ and $H_2 \psi_2 = E_2 \psi_2$, and dimer calculations on adaptive meshes. The Mediator handles data exchange between the meshes by finding correlations between mesh points: the effective charges $q$ on the solute sites are passed from FMO to RISM, and the solvent distribution $h(r)$ is passed back from RISM to FMO.

By courtesy of Prof. Aoyagi (Kyushu Univ.)
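The Mediator's job of exchanging data between the evenly spaced RISM mesh and the adaptive FMO meshes is, at its core, an interpolation problem. The following C sketch is not NAREGI's Mediator code; it merely illustrates one standard building block: trilinear interpolation of a field stored on an evenly spaced mesh at an arbitrary target point (e.g., a node of an adaptive mesh).

```c
#include <math.h>

/* Trilinear interpolation of a scalar field f, stored on an evenly
   spaced nx*ny*nz mesh with spacing h and origin at 0, evaluated at
   an arbitrary point (x, y, z).  Indices are clamped so points near
   the boundary still get a value.  Illustrative only. */
double interp3(const double *f, int nx, int ny, int nz,
               double h, double x, double y, double z)
{
    int i = (int)floor(x / h), j = (int)floor(y / h), k = (int)floor(z / h);
    if (i < 0) i = 0; if (i > nx - 2) i = nx - 2;
    if (j < 0) j = 0; if (j > ny - 2) j = ny - 2;
    if (k < 0) k = 0; if (k > nz - 2) k = nz - 2;

    double u = x / h - i, v = y / h - j, w = z / h - k;
#define F(a, b, c) f[((a) * ny + (b)) * nz + (c)]
    return (1-u)*(1-v)*(1-w)*F(i,j,k)   + u*(1-v)*(1-w)*F(i+1,j,k)
         + (1-u)*v*(1-w)*F(i,j+1,k)     + u*v*(1-w)*F(i+1,j+1,k)
         + (1-u)*(1-v)*w*F(i,j,k+1)     + u*(1-v)*w*F(i+1,j,k+1)
         + (1-u)*v*w*F(i,j+1,k+1)       + u*v*w*F(i+1,j+1,k+1);
#undef F
}
```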
Collaboration in Data Grid Area
• High Energy Physics (GIN)
- KEK
- EGEE
• Astronomy
- National Astronomical Observatory
(Virtual Observatory)
• Bio-informatics
- BioGrid Project
NAREGI Data Grid Environment
(Diagram: workflow jobs (Job 1, Job 2, …, Job n) operate on data (Data 1, Data 2, …, Data n) through a Grid-wide file system and Grid-wide DB querying; metadata construction, data access management, and data resource management connect the jobs in the Grid workflow to their data.)

Data Grid components:
- Import data into the workflow
- Place & register data on the Grid
- Assign metadata to the data
- Store data into distributed file nodes
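The "place & register" and "store into distributed file nodes" steps are served by the Gfarm grid-wide file system (named later in this deck). As a rough sketch only: Gfarm exposes a POSIX-like C API along the lines below. The exact function names and signatures differ between Gfarm versions, so treat them as assumptions rather than a definitive reference.

```c
#include <stdio.h>
#include <string.h>
#include <gfarm/gfarm.h>

/* Sketch of writing a result file into the grid-wide file system.
   API names follow Gfarm v2 as best recalled; verify against your
   installed Gfarm headers before use.  The path is a placeholder. */
int main(int argc, char **argv)
{
    GFS_File gf;
    gfarm_error_t e;
    const char msg[] = "simulation output\n";
    int n;

    e = gfarm_initialize(&argc, &argv);
    if (e != GFARM_ERR_NO_ERROR) {
        fprintf(stderr, "gfarm_initialize: %s\n", gfarm_error_string(e));
        return 1;
    }

    /* Create a file on the grid-wide file system and write to it. */
    e = gfs_pio_create("gfarm:///home/user1/result.dat",
                       GFARM_FILE_WRONLY, 0644, &gf);
    if (e == GFARM_ERR_NO_ERROR) {
        gfs_pio_write(gf, msg, (int)strlen(msg), &n);
        gfs_pio_close(gf);
    }

    gfarm_terminate();
    return 0;
}
```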
Roadmap of NAREGI Grid Middleware (FY2003-FY2007)
(Timeline: the project moved from a UNICORE-based R&D framework to an OGSA/WSRF-based framework.)
- Prototyping of NAREGI middleware components; application of the component technologies to nano applications and evaluation
- Development and integration of the α-version middleware; evaluation of the α version on the NII-IMS testbed (α version internal); midpoint evaluation
- Development and integration of the β-version middleware; evaluation of the β version by IMS and other collaborating institutes; β1 version release; β2 version in limited distribution; deployment of the β version
- Development of OGSA-based middleware; evaluation on the NAREGI wide-area testbed; verification and evaluation of Version 1.0; Version 1.0 release
- Testbeds: utilization of the NAREGI NII-IMS testbed in the early phases, then the NAREGI wide-area testbed
Highlights of NAREGI Release ('05-'06)
1. Resource and Execution Management
- GT4/WSRF-based OGSA-EMS incarnation (the first in the world): job management, brokering, reservation-based co-allocation, monitoring, accounting
- Network traffic measurement and control
2. Security
- Production-quality CA: NAREGI operates a production-level CA in the APGrid PMA
- VOMS/MyProxy-based identity/security/monitoring/accounting
3. Data Grid
- WSRF-based Grid-wide data sharing with Gfarm: Grid-wide seamless data access
4. Grid-Ready Programming Libraries
- Standards-compliant GridMPI (MPI-2) and GridRPC: high-performance communication
- Bridge tools for different types of applications in a concurrent job: support for data-format exchange
5. User Tools
- Web-based portal
- Workflow tool with NAREGI-WFML
- WS-based application contents and deployment service: a reference implementation of OGSA-ACS
- Large-scale interactive Grid visualization
NAREGI Version 1.0: Operability, Robustness, Maintainability
- To be developed in FY2007
- More flexible scheduling methods:
  - Reservation-based scheduling
  - Coexistence with locally scheduled jobs
  - Support of non-reservation-based scheduling
  - Support of "bulk submission" for parameter-sweep-type jobs
- Improvement in maintainability:
  - More systematic logging using the Information Service (IS)
- Easier installation procedure:
  - apt-rpm
  - VM
Science Grid NAREGI – Middleware Version 1.0 Architecture
(Architecture diagram: users sign on through a portal with MyProxy and VOMS (proxy renewal), using the Workflow Tool (WFT), Grid PSE, Grid Visualization Service (GVS), and SS client. The Super Scheduler handles reservation, submission, query, and control of jobs via GRAM and GridVM at each site (e.g., Site A: Linux/PBSPro; Site B: AIX/LoadLeveler), each with a local scheduler, local disk, and GridMPI. The Information Service aggregates resource information (incl. VO), accounting, and the deployment of applications (ACS). The Data Grid layer (DGRMS resource management, Gfarm metadata servers, file servers, DG UTF) provides the Grid file system, file staging/transfer, and file operations; an external file server and per-site AuthZ services complete the picture, with the NAREGI CA issuing certificates.)
Network Topology of SINET3
- 63 edge nodes (edge L1 switches) and 12 core nodes (core L1 switches + IP routers): 75 layer-1 switches and 12 IP routers in total
- Edge links: 1 Gbps to 20 Gbps; core links: 10 Gbps to 40 Gbps
- Deploys Japan's first 40 Gbps (STM-256) lines between Tokyo, Nagoya, and Osaka
- The backbone links form three loops to enable quick service recovery against network failures and the efficient use of the network bandwidth
- International circuits (per the map): 10 Gbps to New York, 2.4 Gbps to Los Angeles, and 622 Mbps each to Hong Kong and Singapore
NAREGI Phase 1 Testbed (~3,000 CPUs, ~17 Tflops, connected via SINET3 (10 Gbps, MPLS)):
- Center for Grid R&D (NII): ~5 Tflops
- Computational Nano-science Center (IMS): ~10 Tflops
- Osaka Univ. BioGrid; TiTech Campus Grid; AIST SuperCluster
- Small test application clusters at ISSP, Kyoto Univ., Tohoku Univ., KEK, Kyushu Univ., and AIST
Computer System for Grid Software Infrastructure R&D, Center for Grid Research and Development (5 Tflop/s, 700 GB)
- File server: PRIMEPOWER 900 + ETERNUS3000 + ETERNUS LT160; 1 node / 8 CPUs (SPARC64 V, 1.3 GHz); memory 16 GB, storage 10 TB, back-up max. 36.4 TB
- SMP-type compute servers:
  - PRIMEPOWER HPC2500: 1 node (UNIX, SPARC64 V 1.3 GHz / 64 CPUs); memory 128 GB, storage 441 GB
  - SGI Altix3700: 1 node (Itanium2 1.3 GHz / 32 CPUs); memory 32 GB, storage 180 GB
  - IBM pSeries690: 1 node (POWER4 1.3 GHz / 32 CPUs); memory 64 GB, storage 480 GB
- High-performance distributed-memory compute servers (PRIMERGY RX200, InfiniBand 4X, 8 Gbps): 128 CPUs (Xeon 3.06 GHz) + control node with memory 130 GB and storage 9.4 TB, and 128 CPUs (Xeon 3.06 GHz) + control node with memory 65 GB and storage 9.4 TB
- Distributed-memory compute servers (GbE, 1 Gbps): HPC LinuxNetworx and Express 5800 clusters, each 128 CPUs (Xeon 2.8 GHz) + control node, with memory 65 GB and storage 4.7 TB per unit
- Networking: L3 switches at 1 Gbps (upgradable to 10 Gbps) linking intra-networks A and B and the external network; connected to SINET3
Computer System for Nano-Application R&D, Computational Nano-science Center (10 Tflop/s, 5 TB)
- SMP-type compute server: 5.0 Tflops, 16 ways × 50 nodes (POWER4+ 1.7 GHz), multi-stage crossbar network; memory 3,072 GB, storage 2.2 TB
- Distributed-memory compute servers (4 units): 5.4 Tflops, 818 CPUs (Xeon 3.06 GHz) + control nodes, Myrinet2000 (2 Gbps); memory 1.6 TB, storage 1.1 TB per unit
- File server: 16 CPUs (SPARC64 GP, 675 MHz); memory 8 GB, storage 30 TB, back-up 25 TB
- Front-end servers, CA/RA server, firewall, and VPN; L3 switch at 1 Gbps (upgradable to 10 Gbps) to SINET3 and the Center for Grid R&D
Future Direction of NAREGI Grid Middleware
(Toward a petascale computing environment for scientific research: the Science Grid environment within the Cyber Science Infrastructure (CSI).)
- Center for Grid Research and Development (National Institute of Informatics). Grid middleware research areas: resource management in the Grid environment; Grid programming environment; Grid application environment; data Grid environment; high-performance & secure Grid networking; Grid-enabled nano applications
- Computational Nano-science Center (Institute for Molecular Science). Computational methods for nanoscience using the latest Grid technology: large-scale computation; high-throughput computation; new methodology for computational science; evaluation of the Grid system with nano applications
- Expected outputs: Grid middleware for large computer centers; productization of general-purpose Grid middleware for scientific computing; personnel training (IT and application engineers); contribution to the international scientific community and standardization
- Industry links (Industrial Committee for Supercomputing Promotion): requirements from industry with regard to the Science Grid for industrial applications; solicited research proposals from industry to evaluate applications; use in industry (new intellectual product development); progress in the latest research and development (nano, biotechnology); vitalization of industry
Outline
1. National Research Grid Initiative (NAREGI)
2. Cyber Science Infrastructure (CSI)
Cyber Science Infrastructure: Background
• A new information infrastructure is needed in order to boost today's advanced scientific research.
  – Integrated information resources and systems:
    • Supercomputers and high-performance computing
    • Software
    • Databases and digital content such as e-journals
    • "Humans" and the research processes themselves
  – Comparable efforts elsewhere: U.S.A.: Cyberinfrastructure (CI); Europe: EU e-Infrastructure (EGEE, DEISA, …)
• A breakthrough in research methodology is required in various fields, such as nano-science/technology and bioinformatics/life sciences.
  – The key to industry/academia cooperation: from "Science" to "Intellectual Production"
→ A new comprehensive framework of information infrastructure in Japan: the Cyber Science Infrastructure

Advanced information infrastructure for research will be the key to international cooperation and competitiveness in future science and engineering areas.
Cyber-Science Infrastructure for R&D
(Diagram: the Cyber-Science Infrastructure (CSI) links Hokkaido-U, Tohoku-U, Tokyo-U, NII, Nagoya-U, Kyoto-U, Osaka-U, and Kyushu-U (plus TiTech, Waseda-U, KEK, etc.) over "SuperSINET and beyond", a lambda-based academic networking backbone. Components: deployment of NAREGI middleware (NAREGI outputs); UPKI, the national research PKI infrastructure; virtual labs and live collaborations; restructuring of university IT research resources; extensive on-line publication of results; GeNii (Global Environment for Networked Intellectual Information); and NII-REO (Repository of Electronic Journals and Online Publications). The infrastructure provides industry/societal feedback and international infrastructural collaboration.)
Structure of CSI and Role of the Grid Operation Center (GOC)
(Diagram: at the National Institute of Informatics, the GOC sits between the Center for Grid Research and Development and the e-Science community, on top of the networking infrastructure (SINET3) and the UPKI system, with planning/operations/support flows among them.)
GOC responsibilities:
- Deployment & operation of the middleware
- Technical support
- Operation of the CA
- VO user administration
- User training
- Feedback to the R&D group
Connected VOs: university/national supercomputing center VOs; domain-specific research organization VOs (IMS, AIST, KEK, NAO, etc.); research project VOs; industrial project VOs; research community VOs; and the peta-scale system VO.
Working groups: WG for Grid middleware (R&D and support to operations); WG for inter-university PKI; WG for networking (planning/operations).
International collaboration: EGEE, TeraGrid, DEISA, OGF, etc. Academic content services link the GOC to the broader Cyber-Science Infrastructure.
Cyber Science Infrastructure
Expansion Plan of the NAREGI Grid
(Diagram: NAREGI Grid middleware federates, from top to bottom: the petascale computing environment; the national supercomputer Grid (Tokyo, Kyoto, Nagoya, …); domain-specific research organizations (IMS, KEK, NAOJ, …) and their research communities; departmental computing resources; and laboratory-level PC clusters, with interoperability (GIN, EGEE, TeraGrid, etc.) to other Grids.)
Cyberinfrastructure (NSF)
(Diagram: a resource pyramid with the slogan "Deep - Wide - Open". Leadership-class machine: the Track 1 petascale system (NCSA), >1 Pflops. National level (>500 Tflops): the NSF supercomputer centers (SDSC, NCSA, PSC) and Track 2 systems (TACC, UTK/ORNL, FY2009). Local level: 50-500 Tflops. Network infrastructure: TeraGrid.)
Four important areas (2006-2010):
- High-performance computing
- Data, data analysis & visualization
- Virtual organizations for distributed communities
- Learning & workforce development
EU's e-Infrastructure (HET)
(Diagram: Tier 1: European HPC center(s), >1 Pflops, with the PACE petascale project (2009?). Tier 2: national/regional centers with Grid collaboration, 10-100 Tflops (DEISA, EGEE, EGI). Tier 3: local centers. Network infrastructure: GEANT2.)
HET: HPC in Europe Task Force
PACE: Partnership for Advanced Computing in Europe
DEISA: Distributed European Infrastructure for Supercomputing Applications
EGEE: Enabling Grids for E-sciencE
EGI: European Grid Initiative
Summary
• NAREGI Grid middleware will enable seamless federation of heterogeneous computational resources.
• Computations in nano-science/technology applications over the Grid are to be promoted, including participation from industry.
• NAREGI Grid middleware is to be adopted as one of the important components of the new Japanese Cyber Science Infrastructure framework.
• NAREGI is planned to provide the access and computational infrastructure for the Next Generation Supercomputer System.
Thank you!
http://www.naregi.org