Introduction to CNGrid GOS 3.0
OMII-Euro & CNGrid Joint Training Material
刘杰 (Liu Jie), liujie406@software.ict.ac.cn
Jan. 11, 2008


Outline

CNGrid snapshot
Motivation
Architecture
Components
  – Core layer
  – HPCG
Summary


CNGrid snapshot

Project Background
  – CNGrid (China National Grid)
  – CNGrid GOS 2.0
    • Sponsored by the China Ministry of Science and Technology (2002~2005) under the tenth five-year plan
  – CNGrid GOS 3.0
    • Sponsored by the China Ministry of Science and Technology (2006~2009) under the eleventh five-year plan
    • ICT CAS, Tsinghua U, Beihang U, etc.


CNGrid snapshot

International cooperation
  – OMII_EU/OMII_UK
    • Provide a software suite
    • Integrated into the OMII software stack
    • Use OMII's leading technology in CNGrid
  – XtreemOS
    • Building and promoting a Linux-based operating system to support virtual organizations for next-generation grids
    • WP2.1: Virtual Organization support in Linux
    • WP3.5: Security in Virtual Organizations


Motivation

Why CNGrid GOS?
  – Need for Internet-based grid system software
    • Manage large-scale distributed resources effectively
    • Provide a uniform approach to accessing the heterogeneous resources in the grid
    • Enable Internet-based resource sharing and collaboration
  – Need for an easy-to-use grid
    • Low cost: hide the internal details of grid application development, deployment, management, and use
    • Multiple access modes:
      – Client/Server, Browser/Server, and other modes
      – Batch mode and interactive mode


Motivation

Goals
  – Develop a virtualized resource sharing mechanism and framework for computing, data, software, and combined resources
  – Provide secure, unified, and friendly interfaces for accessing scientific computing and information services
  – Support multiple domain-specific applications running on top of the above


CNGrid GOS 3.0 Architecture

[Figure: layered architecture diagram of CNGrid GOS 3.0; labels below are grouped by apparent layer]

Layers: Hosting Environment, Core, System, Tool/App
Hosting Environment
  – Java J2SE
  – Tomcat (Apache) + Axis, GT4, gLite, OMII
  – Other 3rd software & tools
Core
  – Agora: User Mgmt, Res Mgmt, Agora Mgmt
  – Grip: Grip Runtime, Grip Instance Mgmt
  – Naming
  – Security
  – Res AC & Sharing
  – ServiceController, Other RController
  – Resource Space
System
  – GOS System Call (Resource mgmt, Agora mgmt, User mgmt, Grip mgmt, etc.)
  – GOS Library (Batch, Message, File, etc.)
  – Message Service, Dynamic Deploy Service, CA Service, System Mgmt Portal
  – HPCG: BatchJob mgmt, MetaSchedule, Account mgmt, File mgmt, metainfo mgmt
  – GridWorkflow, DataGrid
Tool/App
  – Gsh & cmd tools, VegaSSH, GSML Browser
  – HPCG Portal, Batch mgmt portal, Running & Mgmt Center
  – Programming Env.: IDE, Compiler, Debugger, GSML Composer
  – Workflow IDE, Workflow Using Env., DataGrid Using Env.
  – Science Data Grid, Railway Info Process Grid, other applications
Work package labels: WP2, WP6, other WPs

[Figure: software stack on a grid server, top to bottom]
  – Grid Portal, Gsh, GSML Workshop and Grid Apps
  – Core, System and App Level Services
  – Axis Handlers for Message Level Security
  – Tomcat (5.0.28) + Axis (1.2 rc2)
  – J2SE (1.4.2_07, 1.5.0_07)
  – OS (Linux/Unix/Windows)
  – PC Server (Grid Server)


Components overview

Components
  – Core layer
  – HPCG (High Performance Computing Gateway)
    • Deployment
    • Management
    • Usage: Job, File & Accounting Mgmt
    • Application Development


Components: System software

Core layer
  – Agora service (a.k.a. VO)
    • Organize and manage related users and resources locally
    • Serve as a trusted third party through which resource providers and consumers negotiate sharing policies
    • Provide user mgmt, resource mgmt, and agora mgmt functions based on the underlying Naming layer
      – A resilient, decentralized registry for various kinds of global objects
      – Provides low-latency object location by object GUID
      – Provides a high search success rate when matching on multiple attributes
      – Provides a stable object view based on linked naming services to enable the effective-virtual-physical address space
    • Use RController to provide a uniform resource provisioning and management interface
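To make the two lookup styles of the Naming layer concrete (locating an object by GUID and searching by multiple attributes), here is a minimal in-memory sketch in Python. The class `NamingRegistry`, its method names, and the sample attributes are hypothetical illustrations, not the GOS API; a real naming service would be decentralized and persistent.

```python
import uuid


class NamingRegistry:
    """Toy in-memory registry; hypothetical, not the GOS Naming API."""

    def __init__(self):
        self._objects = {}  # GUID string -> attribute dict

    def register(self, attributes):
        """Store an object's attributes and return a freshly generated GUID."""
        guid = str(uuid.uuid4())
        self._objects[guid] = dict(attributes)
        return guid

    def locate(self, guid):
        """Direct lookup by GUID (a single dictionary access)."""
        return self._objects.get(guid)

    def search(self, **criteria):
        """Return GUIDs of objects whose attributes match every criterion."""
        return [
            guid
            for guid, attrs in self._objects.items()
            if all(attrs.get(k) == v for k, v in criteria.items())
        ]


registry = NamingRegistry()
cluster_guid = registry.register({"type": "cluster", "site": "site-a", "os": "linux"})
registry.register({"type": "user", "site": "site-a"})

found = registry.locate(cluster_guid)                     # lookup by GUID
matches = registry.search(type="cluster", site="site-a")  # multi-attribute match
```

The GUID path is a direct key lookup, while the attribute search scans all entries; that difference is why the slide lists them as separate capabilities.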


Components: System software

Core layer
  – Grip
    • Runtime abstraction: a grip is one running instance of an application
    • Create grips to run applications in a managed way, interact with an existing grip, kill a grip, and release the resources it consumes automatically
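The grip lifecycle described above (create, interact, kill, automatic release) can be illustrated with a small Python sketch that wraps a local child process. The `Grip` class here is a hypothetical stand-in and not the GOS API; it only assumes that a grip maps to one run of an application.

```python
import subprocess
import sys


class Grip:
    """Hypothetical handle for one run of an application (not the GOS API)."""

    def __init__(self, argv):
        # Creating the grip starts the application as a child process.
        self._proc = subprocess.Popen(argv)

    @property
    def running(self):
        # poll() returns None while the process is still alive.
        return self._proc.poll() is None

    def kill(self):
        """Stop the application and reap the process so nothing is leaked."""
        if self.running:
            self._proc.kill()
        self._proc.wait()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Resources are released automatically when the block exits.
        self.kill()


# Run a long-lived placeholder application inside a managed block.
with Grip([sys.executable, "-c", "import time; time.sleep(30)"]) as grip:
    was_running = grip.running   # interact with the existing grip
still_running = grip.running     # after the block, the grip has been killed
```

Using a context manager is one way to get the "release in an automatic way" behaviour: leaving the block always kills and reaps the process, even if an exception is raised.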


Components: HPCG

HPCG motivation
  – Aim to provide a high-performance business computing environment for enterprise users
  – Features
    • Easy to install, configure, and use
    • Provides the functions users really need
    • High reliability
    • Professional interface
    • Based on GOS, but easy to port to other grid middleware
    • Standards compliant
      – JSDL (Job Submission Description Language)
      – BES (OGSA Basic Execution Service)
      – SAGA (A Simple API for Grid Applications)
      – SOA and plain Web services (WS-related standards)
      – RUS (Resource Usage Service), based on WS-I Basic Profile 1.0
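As an illustration of what a JSDL job description looks like, the following Python sketch builds a minimal JSDL 1.0 document for a POSIX application using the standard library. The helper `build_jsdl` and the sample job values are assumptions made for this example; how HPCG itself generates or consumes JSDL is not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the JSDL 1.0 specification.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(job_name, executable, arguments, output_file):
    """Build a minimal JSDL JobDefinition for a POSIX application."""
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    job_def = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    job_desc = ET.SubElement(job_def, f"{{{JSDL_NS}}}JobDescription")

    ident = ET.SubElement(job_desc, f"{{{JSDL_NS}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL_NS}}}JobName").text = job_name

    app = ET.SubElement(job_desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg
    ET.SubElement(posix, f"{{{POSIX_NS}}}Output").text = output_file

    return ET.tostring(job_def, encoding="unicode")


# Hypothetical job: run /bin/echo and capture its output in stdout.txt.
jsdl_xml = build_jsdl("demo-job", "/bin/echo", ["hello", "grid"], "stdout.txt")
```

A document like this would typically be passed to a BES-style execution service as the activity description; resource requirements and data staging elements are omitted to keep the sketch short.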


HPCG Components

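[Figure: HPCG component diagram; labels below are grouped by apparent role]
HPCG Client: Portal, Mgmt Portal, CML tools
HPCG Server: Batch job mgmt, File mgmt, Account mgmt, Metainfo Mgmt (Static metainfo mgmt, Dynamic metainfo mgmt), Meta schedule, Message
Supporting parts: Environment abstraction, User, Exception, Security, Database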


Scenarios of HPCG

[Figure: HPCG usage scenario. Labels: Internet, Enterprise user, Enterprise Intranet, HPC gateway server, Cluster, GOS, Grid Site, Grid Site (Grid Operation & Mgmt Center), Message Subscribe/Notification]

Requirements for a high-performance computation gateway
  – Uniform Web UI for HPC users and resource providers
  – Many enterprise users share one HPC account
  – Transparent job submission to different HPC systems
  – Efficient job status retrieval
  – File transfer without relay
  – Computation resource accounting


HPCG - Deploy

Several deployment styles
  – Front-end and back-end
  – All vs. split
  – Relationship with clusters
    • Deploy in clusters
    • Deploy on a machine outside the clusters


HPCG - Deploy

Prerequisites
  – Software
    • JDK 1.5
    • Ant 1.6.5 or above
    • MySQL 1.4.12 or above
    • Standard FTP server
    • OpenPBS (PBSPro or Torque), LSF, etc.
  – Hardware
    • CPU: P4 2.4 GHz
    • Memory: 4 GB (at least 2 GB)
    • Disk space: 160 GB (at least 80 GB)
  – Network
    • Two network cards
    • FTP port: 21
    • SSH port: 22
    • HTTP ports: 8080, 18080
    • Message port: 61616
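Before deploying, it can help to confirm that the listed ports are reachable on the gateway host. The following Python sketch checks TCP connectivity for the ports named above; the function names and the service labels in `REQUIRED_PORTS` are assumptions for illustration and are not part of the HPCG distribution.

```python
import socket

# Ports listed on the slide; the service names are labels for display only.
REQUIRED_PORTS = {
    "ftp": 21,
    "ssh": 22,
    "http": 8080,
    "http-alt": 18080,
    "message": 61616,
}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports=REQUIRED_PORTS):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Calling `check_ports` with the gateway's hostname returns a dictionary such as `{"ftp": True, "ssh": True, ...}`; any `False` entry points at a service that is not running or a firewall rule that blocks it.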


HPCG - Portal

HPCG Management portal
  – Manage all meta-info, such as cluster info, job queue info, user mapping, software type, software instance, etc.
HPCG Application portal
  – End users submit and manage jobs, manage temp files and output files, query historical accounting info, etc.


HPCG Management

Several kinds of static meta-info
  – Mapping of grid users to local cluster users
  – Cluster meta-info
  – Software type info
  – Software instance info
  – Job queue info
Dynamic meta-info
  – The number of pending jobs in each job queue
  – The number of available licenses
Support scheduling
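To show how the dynamic meta-info above could feed a scheduling decision, here is a minimal Python sketch that picks a job queue using the pending job count and the available license count. The selection rule, the `QueueInfo` fields, and the sample data are assumptions for illustration; the actual HPCG meta-scheduling policy is not described in the slides.

```python
from dataclasses import dataclass


@dataclass
class QueueInfo:
    """Snapshot of one job queue; field names are illustrative assumptions."""
    cluster: str
    queue: str
    pending_jobs: int    # dynamic meta-info: pending job length
    free_licenses: int   # dynamic meta-info: available license count


def pick_queue(queues, licenses_needed=1):
    """Choose the queue with enough free licenses and the fewest pending jobs."""
    eligible = [q for q in queues if q.free_licenses >= licenses_needed]
    if not eligible:
        return None  # no site can run the job right now
    return min(eligible, key=lambda q: q.pending_jobs)


queues = [
    QueueInfo("cluster-a", "batch", pending_jobs=12, free_licenses=2),
    QueueInfo("cluster-b", "batch", pending_jobs=3, free_licenses=0),
    QueueInfo("cluster-c", "batch", pending_jobs=5, free_licenses=1),
]
choice = pick_queue(queues)
```

In this sample, `cluster-b` has the shortest queue but no free license, so the sketch falls back to the next shortest eligible queue.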


HPCG - Application portal

Batch job management
  – Submit job
  – Manage job
File management
Accounting management


HPCG - Batch Job mgmt

Submit jobs to the grid and schedule them among multiple HPC sites
Monitor detailed job status
Cancel or rerun jobs
Query historical job information
Subscribe to and receive notifications of job status changes
Support both the JSDL and BES standards


Batch job management: job status transition diagram

[Figure: job status transition diagram]
States: Submitted, Staging In, Staged In, Active:Queuing, Active:Running, Active:Suspended, Executed, Staging Out, Staged Out, Done, Failed, Terminated
Transition labels: suspend, fail, terminate, re-run
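The job states named in the diagram can be modeled as a small state machine with status-change notifications. The Python sketch below uses the state names from the slide, but the exact set of allowed transitions is an assumption, since the arrows of the original figure are not recoverable: it assumes a linear main path, that any non-final state can fail or be terminated, and that "Re-run" returns a finished job to Submitted.

```python
# Assumed main path through the states named on the slide.
MAIN_PATH = [
    "Submitted", "Staging In", "Staged In", "Active:Queuing",
    "Active:Running", "Executed", "Staging Out", "Staged Out", "Done",
]
FINAL = {"Done", "Failed", "Terminated"}


def allowed_next(state):
    """Return the states reachable from `state` under the assumed rules."""
    if state in FINAL:
        return {"Submitted"}  # assumed meaning of the "Re-run" arrows
    nxt = {"Failed", "Terminated"}
    if state == "Active:Suspended":
        nxt.add("Active:Running")
    else:
        nxt.add(MAIN_PATH[MAIN_PATH.index(state) + 1])
        if state == "Active:Running":
            nxt.add("Active:Suspended")
    return nxt


class Job:
    """Job with subscribe/notify on every status change (illustrative only)."""

    def __init__(self):
        self.state = "Submitted"
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def move_to(self, new_state):
        if new_state not in allowed_next(self.state):
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        old, self.state = self.state, new_state
        for callback in self._listeners:
            callback(old, new_state)


events = []
job = Job()
job.subscribe(lambda old, new: events.append((old, new)))
for step in ["Staging In", "Staged In", "Active:Queuing", "Active:Running"]:
    job.move_to(step)
job.move_to("Failed")
job.move_to("Submitted")  # re-run after failure
```

Keeping the transition rules in one function makes it easy to adjust them once the real diagram is at hand, without touching the notification logic.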


HPCG - File mgmt

View, create, and delete working directories on the computation node
Zip and tar support for multiple output files
Reliable transfer of big files (about 2 GB) between the gateway server and the working directory
View text files (<0.5 MB) and pictures in the working directory with a web browser
Support multiple FTP servers (wuftp, vsftp) with IPv6 support
Pause and resume file transfers
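Pausing and resuming a transfer over FTP is commonly done with the protocol's REST command, which restarts a download at a byte offset. The Python sketch below shows that idea with the standard `ftplib` module; the helper names are hypothetical, and this is not necessarily how HPCG implements the feature.

```python
import os
from ftplib import FTP


def resume_offset(local_path):
    """Bytes already downloaded; 0 if the local file does not exist yet."""
    return os.path.getsize(local_path) if os.path.exists(local_path) else 0


def resume_download(ftp: FTP, remote_name, local_path, blocksize=64 * 1024):
    """Continue a download from where the local partial file ends."""
    offset = resume_offset(local_path)
    # Append to the partial file; `rest` makes ftplib send a REST command
    # so the server starts sending at `offset` instead of byte 0.
    with open(local_path, "ab") as out:
        ftp.retrbinary(
            f"RETR {remote_name}",
            out.write,
            blocksize=blocksize,
            rest=offset or None,
        )
    return offset
```

A "pause" then amounts to stopping the transfer and keeping the partial file; calling `resume_download` later with a connected and logged-in `FTP` object picks up at the saved length. The server must support REST for this to work.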


HPCG - Accounting mgmt

Accounting info about jobs comes from both grid users and local users
Standard Usage Record format
Service to query, add, remove, update, and compute statistics on both local and global accounting info, with ACL
Global accounting statistics
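The combination of access control and statistics described above can be sketched in a few lines of Python. The record fields, the ACL rule (administrators see everything, other users see only their own jobs), and the sample data are simplified assumptions; they do not reproduce the Usage Record schema or the HPCG accounting service interface.

```python
from collections import defaultdict

# Hypothetical, simplified usage records; field names are assumptions.
records = [
    {"job_id": "j1", "grid_user": "alice", "site": "site-a", "cpu_seconds": 3600},
    {"job_id": "j2", "grid_user": "bob", "site": "site-a", "cpu_seconds": 1200},
    {"job_id": "j3", "grid_user": "alice", "site": "site-b", "cpu_seconds": 600},
]


def visible_records(records, requester, admins=frozenset()):
    """Simple ACL: admins see everything, other users only their own jobs."""
    if requester in admins:
        return list(records)
    return [r for r in records if r["grid_user"] == requester]


def cpu_seconds_by(records, key):
    """Sum CPU seconds grouped by the given record field."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["cpu_seconds"]
    return dict(totals)


# A user's own usage per site, and an administrator's global view per user.
alice_view = cpu_seconds_by(visible_records(records, "alice"), "site")
global_view = cpu_seconds_by(
    visible_records(records, "root", admins={"root"}), "grid_user"
)
```

Applying the ACL filter before aggregating keeps the statistics function generic: the same grouping code serves both the per-user and the global views.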


HPCG - Development

HPCG Template
  – Function
    • Describes the common logic for submitting jobs
    • Independent of the grid site
    • Every software package should have at least one Template
  – Form
    • XML file


HPCG - Development

Schema of HPCG Template


HPCG - Development

Benefits of the HPCG Template
  – Easy to develop (no need to know the GOS APIs)
  – Easy to share the Template
  – Shields the heterogeneity of resources
  – Global job scheduling
  – Sharing of software licenses


Summary

Summary of CNGrid GOS 3.0
  – A software suite to support multiple domain applications and enable resource sharing among HPC sites
  – Major components: System software, HPCG
  – Other components: Programming & using environment, Grid workflow, and Data Grid
Time schedule
  – 2008.1: release of CNGrid GOS 3.0
  – 2008.2: deployed on CNGrid


Thanks!