Introduction to Grid Technologies

Cameron Kiddle

Grid Systems Architect, WestGrid

Research Fellow, Grid Research Centre (GRC)

WestGrid Seminar Series Feb. 6, 2008

Presentation Outline/Goals

• Provide an overview of grid computing
• Introduce various grid computing technologies
• Identify the grid computing technologies currently supported by WestGrid
• Demonstrate how researchers can benefit from use of these technologies via examples

What is Grid Computing?

• Many different definitions/uses
  - computational grids, data grids, desktop grids, campus grids, sensor grids, access grids
• Coordinated sharing of resources that can span multiple administrative domains
• Related terms
  - utility computing, computing on demand, cloud computing, e-Science, e-Infrastructure, cyberinfrastructure

Grid Computing Goals

• Accessibility
  - Providing users with easier access to more resources
• Collaboration
  - Enabling large scale collaborations
• Utility
  - Providing on demand access to computing resources similar to public utilities such as electricity or water
• Transparency
  - Providing users access to computing resources without the need to know how or where computations are taking place

Virtual Organization (VO)

• A group of people, typically spanning institutional and regional boundaries, that share resources to collaborate on a common project

[Figure: resources in Domain A, Domain B and Domain C, marked as shared by Virtual Organization X or shared by Virtual Organization Y]

Example Grid Projects

• LHC Computing Grid (http://lcg.web.cern.ch/)
  - Data storage and analysis infrastructure for the high energy physics community using the Large Hadron Collider (LHC) at CERN (ATLAS Tier-1 site at TRIUMF in British Columbia)
• Network for Earthquake Engineering Simulation (NEES) (http://www.nees.org/)
  - A US national network of 15 facilities to study the impact of earthquakes on buildings, bridges, etc.
• BioSimGrid (http://www.biosimgrid.org/)
  - Development of a grid-enabled biomolecular simulation database to make results more accessible to the biological community
• International Virtual Observatory Alliance (IVOA) (http://www.ivoa.net/)
  - Development of standards and infrastructure to share and analyze astronomical archives from around the world

Grid Technologies

• Grid Middleware
• Security
• VO Management Services
• Information Services
• Data Management Services
• Execution Management Services
• Web Portals/Scientific Gateways
• Virtualization

Grid Middleware

• The layer between users/applications and grid resources that glues everything together
• Example grid middleware
  - Globus Toolkit (GT)
    - GT2 – pre-standards
    - GT4 – standards based (Web Services)
  - UNICORE
  - gLite
  - ARC
  - NAREGI

Security

• Authentication
  - X.509 certificates (IETF)
    - Used to identify and authenticate users and services
    - Based on public key cryptography
    - Issued and signed by a certificate authority
    - Provides global name space
    - Enables single sign-on
• Authorization
  - grid-mapfile
    - Maps distinguished names (found in certificates) of authorized users to local user names (e.g., unix login)
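
The grid-mapfile lookup can be pictured as a small table from distinguished name to local account. The sketch below is illustrative only: it assumes the common layout of one quoted distinguished name followed by a single local user name per line, and the names in EXAMPLE are invented. It is not the Globus implementation.

```python
import shlex


def parse_grid_mapfile(text):
    """Return a dict mapping certificate distinguished names to local user names."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex keeps the quoted DN together even though it contains spaces
        parts = shlex.split(line)
        if len(parts) != 2:
            raise ValueError(f"unexpected grid-mapfile entry: {line!r}")
        dn, local_user = parts
        mapping[dn] = local_user
    return mapping


def authorize(mapping, dn):
    """Return the local account for an authenticated DN, or None if it is not listed."""
    return mapping.get(dn)


# Hypothetical entries for illustration only.
EXAMPLE = '''
"/C=CA/O=Grid/OU=example.ca/CN=Alice Example" alice
"/C=CA/O=Grid/OU=example.ca/CN=Bob Example" bob
'''
```

Authentication (the certificate) establishes who the user is; the mapping only decides which local account, if any, that identity may use.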

VO Management Services

• Services for managing membership and roles within a Virtual Organization
• Helps simplify user account management
• Examples
  - VOMS, GUMS, PRIMA
  - Shibboleth, GridShib, myVocs
  - Other – CAS, Akenti, PERMIS, SHEBANGS

Information Services

• Provide information about resources, policy, services and applications to tools and users
• Information models
  - DMTF Common Information Model (CIM)
  - GLUE Schema
  - GRC Model Schema
• Example services
  - Monitoring and Discovery System (MDS)
    - MDS 2 – LDAP based
    - MDS 4 (WS MDS) – Web Service based
  - Relational Grid Monitoring Architecture (R-GMA)
  - Berkeley Database Information Index (BDII)
  - Universal Description, Discovery, and Integration (UDDI)

Data Management Services

• Data transfer
  - GridFTP
  - Reliable File Transfer (RFT)
• Data replication
  - Replica Location Service (RLS)
• Metadata management
  - Metadata Catalog Service (MCS)
• Higher level data management services
  - Data Replication Service (DRS)
  - Storage Resource Broker (SRB)
  - i Rule Oriented Data Systems (iRODS)
  - Proactive Data Management System (PDMS)

Execution Management Services

• Handle placement, provisioning and lifetime management of jobs
• Job submission
  - Submission of jobs to different types of resources
  - Grid Resource Allocation and Management (GRAM)
• Meta-schedulers
  - Higher level schedulers that manage and distribute jobs between different local schedulers
  - Examples: Condor-G, CSF, GridWay, Moab Grid Scheduler (Silver)
• Workflow managers
  - Automate the management and submission of a set of jobs that have various ordering dependencies
  - Examples: DAGMan, Kepler, Triana, Taverna, Pegasus

Web Portals/Scientific Gateways

• Provide Web-based access to computing resources for communities of users
• Web portal development software
  - WebSphere
  - GridSphere
• Web 2.0 technologies
  - Social networking (Facebook), wikis (Wikipedia), blogs, …
• Example portals
  - nanoHUB
  - myExperiment

Virtualization

• Can transform a single physical machine into multiple virtual machines (VMs), each with its own OS and software stack
• Virtualization software
  - Xen, VMware
  - Support allocation, deallocation, suspension and migration of VMs
• Benefits
  - custom environments (root access), resource consolidation, system maintenance without disruption

WestGrid and Grid

• Is WestGrid a computational grid?
• Provides grid enabled resources
  - Security services
  - Data transfer tools
  - Job submission services
• WestGrid resources can be part of computational grids

GT4-based Grid Environment

Grid Services Status

Grid Services Supported in WestGrid

• Security Services
  - GSI (Grid Security Infrastructure), X.509 certificates, GSI-OpenSSH, MyProxy
• Information Services
  - WS MDS, WebMDS
• Data Management Services
  - GridFTP, RFT
• Execution Management Services
  - WS GRAM, Condor-G (by request), DAGMan (by request)

Certificates in WestGrid

• A user automatically receives a certificate when applying for an account
• The certificate and password-protected private key are stored in the user's $HOME/.globus/ directory on their home site
• Certificates are issued by Grid Canada
• Certificates must be renewed annually (users will receive 60 days' notice by e-mail)

GSI-OpenSSH

• GSI enabled version of OpenSSH
• Provides a single sign-on remote login and file transfer service
• Command line tools:
  - gsissh – GSI enabled ssh
  - gsiscp – GSI enabled scp
  - gsisftp – GSI enabled sftp

MyProxy

• Developed by NCSA (National Center for Supercomputing Applications)
• Credential repository
  - Allows a proxy credential to be retrieved from any machine
  - Can allow trusted services to renew proxy credentials
• WestGrid MyProxy Server - myproxy.westgrid.ca

WS MDS

• Web Services version of the Monitoring and Discovery System
• Index Service
  - Collects data and provides a query/subscription interface to the data
  - Can create hierarchy of index services
• Trigger Service
  - Collects data and takes actions based on the data

GRC Model Schema

• Models developed to describe systems, applications and scheduler policy

[Figure: System Model Class Diagram]

WebMDS

• Customizable Web based interface for WS MDS information

GridFTP

• Based on FTP (File Transfer Protocol)
• GSI security on control and data channels
• Supports third-party transfers
• Improved efficiency of transfers
  - Modification of TCP buffer sizes
  - Parallel transfers (multiple TCP streams)
  - Striped transfers
• Command line tools: globus-url-copy, gcp
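
The idea behind parallel transfers can be sketched as dividing a file into byte ranges, one per TCP stream. The function below is a simplified illustration under that assumption; the function name is hypothetical and it does not reproduce how GridFTP itself partitions or frames data on the wire.

```python
def split_into_ranges(file_size, streams):
    """Divide file_size bytes into `streams` contiguous (offset, length) ranges."""
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # Spread any remainder over the first `extra` ranges, one byte each
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges
```

For example, a 10-byte file over 4 streams yields ranges of 3, 3, 2 and 2 bytes that together cover the file exactly once.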

GridFTP Performance

Reliable File Transfer (RFT)

• Manages a set of third-party GridFTP transfers
• Uses a database to checkpoint transfer state
• Recovers from
  - Source/destination server failures
  - Network failures
  - Container failures
• Transfers retried with exponential backoff
• Resumes transfers where they left off
• Command line tools: rft, rft_delete
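
Retry with exponential backoff plus checkpointing can be sketched as follows. This is a minimal illustration, not RFT's implementation: a Python set stands in for the database, progress is recorded per whole transfer rather than per byte offset, and all names and parameters are hypothetical.

```python
import time


def transfer_with_retries(transfers, do_transfer, checkpoint,
                          max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run each pending transfer, retrying failures with exponential backoff.

    `checkpoint` is a set of completed transfer ids (a stand-in for the database).
    """
    for transfer_id in transfers:
        if transfer_id in checkpoint:
            continue  # finished in an earlier run, so resume past it
        for attempt in range(max_attempts):
            try:
                do_transfer(transfer_id)
            except OSError:
                if attempt == max_attempts - 1:
                    raise  # give up after the final attempt
                sleep(base_delay * 2 ** attempt)  # wait 1x, 2x, 4x, ... the base delay
            else:
                checkpoint.add(transfer_id)  # record success before moving on
                break
    return checkpoint
```

Doubling the delay after each failure gives a struggling server or network time to recover instead of being retried at a constant rate.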

WS GRAM

• Web Services version of the Grid Resource Allocation and Management protocol
• Provides a single standard interface for remote job submission and resource management
• Requires users and application developers to learn only one method to gain access to a large variety of local management systems
• Command line tools: globusrun-ws

Condor-G

• Developed at the University of Wisconsin-Madison
• An extension of Condor that makes use of Globus services to submit jobs to different sites
• Matchmaking functionality to match jobs with appropriate resources
• Available for use in WestGrid by request

DAGMan

• Part of the Condor software
• Manages workflows that are directed acyclic graphs (DAGs), ensuring that jobs with dependencies are executed in the correct order
• Available for use in WestGrid by request
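
Running jobs "in the correct order" amounts to a topological ordering of the dependency graph. The sketch below shows that idea with Python's standard graphlib module (Python 3.9 or later); it illustrates the ordering principle only and is not how DAGMan is implemented. The job names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(dependencies):
    """Return one valid run order for {job: set of jobs it depends on}.

    Raises CycleError if the dependencies are circular, i.e. not a DAG.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical three-job chain: rendering needs the simulation output,
# and animation needs the rendered frames.
ORDER = execution_order({"render": {"simulate"}, "animate": {"render"}})
```

Jobs with no dependency path between them may appear in any relative order, which is what allows a workflow manager to run them concurrently.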

Use Case: Life3D Simulations

• Example to illustrate the benefits of workflow management
• 3-dimensional version of The Game of Life
• Workflow includes simulation, rendering and animation

Life3D - Workflow - I

[Figure: workflow stages Life3D Simulation, Rendering and Animation, with four Stage Data steps; resources shown: gridstore, lattice, grc15, octarine]

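Life3D - Workflow - II

[Figure: seven numbered workflow steps over WestGrid and Grid Research Centre resources; resources shown: gridstore (SFU), lattice (UofC), grc15, octarine; labelled steps: 2. Life3D Simulation, 4. Rendering, 6. Animation; additional label: Data Storage]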

Life3D - Technologies Used

• Grid Middleware
  - GT2
• Data Management
  - GridFTP
• Execution Management
  - Job Submission - GRAM
  - Meta-scheduler – Condor-G
  - Workflow Manager - DAGMan
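
A strictly sequential workflow like this one could be described to DAGMan with JOB and PARENT ... CHILD lines. The sketch below generates such a description. It assumes the workflow is a simple chain of seven steps (data staging before and after each of simulation, rendering and animation), which is a reading of the workflow slides rather than something they state outright; the job names and .sub file names are hypothetical.

```python
def linear_dag(stages):
    """Build DAGMan input text for stages that must run one after another.

    `stages` is a list of (job_name, submit_file) pairs in execution order.
    """
    lines = [f"JOB {name} {submit_file}" for name, submit_file in stages]
    # Each stage becomes the parent of the stage that follows it
    for (parent, _), (child, _) in zip(stages, stages[1:]):
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


# Hypothetical job and submit-file names for the seven-step chain.
LIFE3D_STAGES = [
    ("stage_input", "stage_input.sub"),
    ("simulate", "simulate.sub"),
    ("stage_sim_output", "stage_sim_output.sub"),
    ("render", "render.sub"),
    ("stage_frames", "stage_frames.sub"),
    ("animate", "animate.sub"),
    ("stage_result", "stage_result.sub"),
]

LIFE3D_DAG = linear_dag(LIFE3D_STAGES)
```

Each submit file would in turn describe a single job, for instance a data transfer or a Condor-G job sent to a remote site.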

Use Case: Confederation Bridge ICE Force Monitoring Project

• Monitoring of forces on the Confederation Bridge
• Data analyzed by civil engineering groups at University of Calgary and Carleton University
• GRC developed solution to automate data management as part of a CANARIE AAP project

(http://www.confederationbridge.com)

ICE Force - Technologies Used

• Grid Middleware
  - GT4
• Data Management
  - PDMS
    - Data Transfer - GridFTP, RFT
    - Replication Management – RLS
    - Metadata Management - MCS

Use Case: Molecular Dynamics Simulations

• GROMACS
  - Parallel molecular dynamics simulation application
  - Can simulate hundreds to millions of particles
  - Simulation runs can take days, weeks or months
• Issues with long running jobs
  - Fault tolerance
  - Scheduler policy constraints

(http://moose.bio.ucalgary.ca/)

GROMACS: Grid Enabled Solution

• Automated grid enabled solution developed by GRC to manage GROMACS simulations as part of a CANARIE AAP project
  - Long jobs split into a series of shorter jobs
  - Automates checkpointing, migration and reconfiguration of jobs
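
Splitting a long run into shorter jobs can be sketched as cutting the total number of simulation steps into segments, where each job resumes from the checkpoint written by the previous one. The function below is a hypothetical illustration of that bookkeeping and is not the GRC system; the step counts are arbitrary.

```python
def split_run(total_steps, steps_per_job):
    """Split a simulation into half-open (first_step, end_step) segments, one per job."""
    if total_steps < 0 or steps_per_job < 1:
        raise ValueError("total_steps must be >= 0 and steps_per_job >= 1")
    segments = []
    start = 0
    while start < total_steps:
        end = min(start + steps_per_job, total_steps)  # the last job may be shorter
        segments.append((start, end))
        start = end  # the next job resumes from this checkpoint
    return segments
```

Choosing steps_per_job so that each segment fits within a site's wall-time limit addresses the scheduler policy constraint, and a failed segment can be rerun from its starting checkpoint without repeating earlier work.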

GROMACS: Portal

GROMACS - Technologies Used

• Grid Middleware
  - GT4
• Information Services
  - WS MDS
• Data Management
  - PDMS (GridFTP, RFT, RLS, MCS)
• Execution Management
  - Custom system (Condor-G, GRAM)
• Portal
  - GridSphere

Use Case: Fire Simulation

• Developed a comprehensive environment for the Fire Dynamics Simulator (FDS) as part of a collaborative project between GRC and HP Labs
• Deployed on HP Labs Data Centre at University of Calgary
• Initial focus of project
  - Leverage Web 2.0 technologies
  - Explore use of virtualization in a utility computing environment

Fire Simulation - Technologies Used

• User level
  - Web 2.0 interface (Facebook)
• Service provider level
  - LAMP environment (Linux, Apache, MySQL, Perl/Python/PHP)
  - Simulation (FDS, Condor)
  - Visualization (Smokeview, VNC)
• Resource (utility) provider level
  - Virtualization (Xen)

Use Case: Ecosystem Modelling

• ecosys
  - An application that models ecosystems (agriculture, forests, savannah, grassland, tundra, desert)
  - Used to study ecosystem behavior under different environmental conditions (Dr. Robert Grant – UofA)
• Individual experiments consist of several hundred simulations with common and run-specific input files
• GRC is developing a portal/experiment engine to automate execution of experiments as part of an Alberta Cyberinfrastructure (Cybera) pilot project
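
One small piece of automating such an experiment is assembling, for every run, the shared input files together with that run's own files. The sketch below shows that bookkeeping only; it is a hypothetical illustration, not the GRC experiment engine, and the file and run names in the usage example are invented.

```python
def build_runs(common_files, run_specific):
    """Return {run_name: [input files]} with the shared files listed first."""
    return {
        run: list(common_files) + list(files)
        for run, files in run_specific.items()
    }


# Hypothetical experiment: two runs sharing one file, each with its own input.
RUNS = build_runs(
    ["site.dat"],
    {"run001": ["weather_dry.dat"], "run002": ["weather_wet.dat"]},
)
```

With several hundred runs per experiment, generating these lists programmatically avoids hand-editing one job description per simulation.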

ecosys - Technologies Used

• Grid Middleware
  - GT4
• Information Services
  - WS MDS
• Data Management
  - GridFTP, Stork, SRB
• Execution Management
  - Custom system (Condor-G, GRAM)
• Portal
  - Web 2.0 based portal (with GSI authentication)
• Virtualization
  - Xen (in HP Labs Data Centre)

Summary

• Grid computing technologies enable sharing of resources across administrative domains
• They are aimed at improving accessibility to resources, enabling large scale collaborations, providing computing on demand and making access to resources transparent
• There is a large variety of technologies supporting security, VO management, information services, data management, execution management, portals and virtualization
• WestGrid supports various technologies based on the GT4 grid middleware
• Developing grid solutions is not easy, but there can be substantial benefits

Contact Information

Cameron Kiddle

kiddlec@cpsc.ucalgary.ca

support@westgrid.ca

(WestGrid) http://www.westgrid.ca/

(GRC) http://grid.ucalgary.ca/
