1
July 30, 2005
Grid Computing Principles
Consortium for Computational Science and High Performance Computing
2005 Summer Workshop, July 29-July 31, 2005
2
Grid Computing Coursework Development Team
UNC-Charlotte
Barry Wilkinson
Kevin Hammond (PhD student)
Western Carolina University
Mark Holliday
James Ruff (Undergraduate student)
Elon University
Joel Hollingsworth
Appalachian State University
Darryl Cook (Systems Administrator)
3
Introduction to Grid Computing
8:30 am - 9:45 am
Barry Wilkinson
Department of Computer Science
UNC-Charlotte
4
Need to harness computers
• Original driving force behind grid computing is the same as behind the early development of the networks that became the Internet:
– Connecting computers at distributed sites for high performance computing.
• However, just as the Internet has changed, grid computing has changed to embrace collaborative computing.
5
History
• Began in the mid-1990s with experiments using computers at geographically dispersed sites.
• Seminal experiment: the “I-way” experiment at the 1995 Supercomputing conference (SC’95), using 17 sites across the US running:
– 60+ applications.
– Existing networks (10 networks).
6
Timeline figure (1985 to 2005):
• Software techniques: distributed computing; remote procedure calls (RPC); concept of service registry; beginnings of service-oriented architecture; object-oriented approaches; Java Remote Method Invocation (RMI); CORBA (Common Object Request Broker Architecture); web services.
• Computing platforms: parallel computers; cluster computing; geographically distributed computers (grid computing in the broadest sense), with the SC’95 experiment marked.
7
Grid Computing
• Using distributed computers and resources collectively.
• Usually associated with geographically distributed computers and resources on a special high speed network, or the Internet.
• Has now become much more than the previous slide suggests.
8
Shared Resources
Can share much more than just computers:
• Storage
• Sensors for experiments at particular sites
• Application software
• Databases
• Network capacity, …
9
Computational Grid Applications
• Biomedical research
• Industrial research
• Engineering research
• Studies in Physics and Chemistry
10
Sample Grid Computing Projects
Physical Sciences:
• Large Hadron Collider project (CERN)
• DOE Particle Physics Data grid
• DOE Science grid
• AstroGrid
• Comb-e-Chem project
Natural and Life Sciences:
• Protein Data grid
• Mcell project
Engineering Design:
• Distributed Aircraft Maintenance Environment
• NASA Information Power grid
11
Science Today is a Team Sport
I. Foster
12
eScience
eScience [n]: Large-scale science carried out through distributed collaborations—often leveraging access to large-scale data & computing
I. Foster
NSF Network for Earthquake Engineering Simulation (NEES)
Transform our ability to carry out research vital to reducing vulnerability to catastrophic earthquakes
I. Foster
Global Knowledge Communities: e.g., High Energy Physics
I. Foster
15
www.earthsystemgrid.org
DOE Earth System Grid
Goal: address technical obstacles to the sharing & analysis of high-volume data from advanced earth system models
I. Foster
16
Earth System Grid I. Foster
17
TeraGrid
Funded by NSF in 2002 to link 5 supercomputer sites with 40 Gb/s links
18
TeraGrid
19
Grid networks for collaborative grid computing projects
Grids have been set up at the local level, national level and international level throughout the world, to promote grid computing
20
Close to home: North Carolina’s Foundation for Grid: NCREN
• 4-7 MCNC-owned clusters distributed throughout the state; locations still under evaluation.
• Existing: Blend of owned and leased fiber and circuits moving toward resilient rings powered by Cisco routers.
• Planned: Strong focus on owned and leased fiber, Lambda, and few circuits, in resilient rings powered by Cisco routers and Wave Division Multiplexers.
Figure labels: Cisco, EPA, Internet, Internet 2, NLR.
From “Grid Computing in the Industry” by Wolfgang Gentzsch, presentation to Fall 2004 grid computing course. Full set of slides on course home page.
21
Grid2003: An Operational National Grid
• 28 sites: universities + national labs
• 2800 CPUs, 400–1300 jobs
• Running since October 2003
• Applications in HEP, LIGO, SDSS, Genomics
Figure: site map (includes a label for Korea).
http://www.ivdgl.org/grid2003
From “A Grid of One to a Grid of Many,” Miron Livny, UW-Madison, Keynote presentation, MIDnet conference, 2005.
22
National Grids
Many countries have embraced grid computing and set up grid computing infrastructure:
• UK e-Science grid
• Grid-Ireland
• NorduGrid
• DutchGrid
• PIONIER grid (Poland)
• ACI grid (France)
• Japanese grid
• etc.
23
UK e-Science Grid
24
Resource sharing and collaborative computing
• Grid computing is about collaborating and resource sharing as much as it is about high performance computing.
25
Virtual Organizations
• Grid computing offers the potential of virtual organizations:
– groups of people, both geographically and organizationally distributed, working together on a problem, sharing computers AND other resources such as databases and experimental equipment.
• Crosses multiple administrative domains.
26
Applications
• Originally e-Science applications
– Computationally intensive: not necessarily one big problem, but a problem that has to be solved repeatedly with different parameters.
– Data intensive.
– Experimental collaborative projects.
• Now also e-Business applications to improve business models and practices.
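The "solved repeatedly with different parameters" case is often called a parameter sweep. A minimal sketch of the idea follows, assuming a hypothetical `simulate` function as a stand-in for one run of a real application; here the runs are spread over local threads, whereas on a grid each run would be submitted as a separate job.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(temperature, pressure):
    """Hypothetical stand-in for one independent run of a scientific code."""
    return temperature * pressure


def parameter_sweep(temperatures, pressures, workers=4):
    """Run simulate() once per parameter combination and collect the results."""
    combos = list(product(temperatures, pressures))
    # Every run is independent, so the runs can execute in any order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda combo: simulate(*combo), combos))
    return dict(zip(combos, results))


if __name__ == "__main__":
    for params, value in parameter_sweep([1, 2], [10, 20]).items():
        print(params, value)
```

Because no run depends on another, this kind of workload tolerates the high latency between geographically distributed sites.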
27
Utility Computing: One of Several Commercial Drivers
(Based on a slide from HP)
Figure (axes: value; shared, traded resources): clusters (Open VMS clusters, TruCluster, MC ServiceGuard; Tru64, HP-UX, Linux); grid-enabled systems; programmable data center; virtual data center (UDC: switch fabric, compute, storage); computing utility or GRID. A “today” marker appears on the figure.
• Utility computing
• On-demand
• Service-orientation
• Virtualization
I. Foster
28
Grid Computing Software Infrastructure
29
Globus Project
• Open source software toolkit developed for grid computing.
• Roots in I-way experiment.
• Work started in 1996.
• Four versions developed to present time.
• Reference implementations of grid computing standards.
• De facto standard for grid computing.
30
Globus Toolkit: Recent History
• GT2 (2.4 released in 2002)
– GRAM, MDS, GridFTP, GSI.
• GT3 (3.2 released mid-2004): redesign
– OGSA (Open Grid Services Architecture)/OGSI (Open Grid Services Infrastructure) based.
– Introduced “Grid services” as an extension of web services.
– OGSI now abandoned.
• GT4 (released April 2005): redesign
– WSRF (Web Services Resource Framework) based.
– Grid standards merged with Web services.
31
Supercomputing 2003 Demonstration
• We* used Globus version 2.4 in a Supercomputing 2003 demo organized by the University of Melbourne.
• 21 countries involved, numerous sites.
* The Grid group at WCU.
32
33
Version 3
• A re-implementation based upon the Open Grid Services Architecture (OGSA) standard.
• We used version 3.2 for the Fall 2004 grid computing course.
• Underlying implementation of version 3.x used OGSI (Open Grid Services Infrastructure), which was not embraced by the community.
34
Version 4
• Released April 2005.
• OGSA kept but OGSI abandoned in favor of new implementation standards based around pure web services. (Version 3 used “extended” web services.)
• To be used in this course, with other software.
35
Interconnections and Protocols
Focus now on:
• using standard Internet protocols and technology, e.g. HTTP, SOAP, and web services.
36
Web Services-Based Grid Computing
• Grid Computing now strongly based upon web services.
• Large number of newly proposed grid computing standards:
– WS-Resource Framework (WSRF)
– WS-Addressing
– etc.
37
Grid Computing Standards
ITCS 4010 Grid Computing, 2005, UNC-Charlotte, B. Wilkinson, slides 4a version 0.1.
38
Standards Bodies
Principal standards and other interested bodies are:
• W3C consortium (http://www.w3.org)
• Global Grid Forum (GGF)
• OASIS (Organization for the Advancement of Structured Information Standards)
• …..
39
In Web Services World
• XML introduced (ratified) in 1998
• SOAP ratified in 2000
• Web services developed
• Subsequently, standards have been and are continuing to be developed:
– WSDL
– WS-*, where * refers to the name of one of many standards
40
Grid computing software has gone through several development cycles:
• 1996-2002: Originally, the toolkit’s own protocols were developed (e.g. GT2).
• 2002-2004: Then the OGSA (Open Grid Services Architecture) standard and a specification called OGSI (Open Grid Services Infrastructure) were developed (GGF). An extended web service, called a grid service, was invented to embody state and transience. Implemented in GT3.
• 2005 onward: Now relies more directly upon developing web service standards (GT4).
41
Grid computing standards
• Figure from “An ‘Ecosystem’ of Grid Components”, 2004, Grid Research Integration Deployment and Support Center, http://www-unix.grids-center.org/r6/ecosystem/ecology.php
42
Open Grid Services Architecture (OGSA)
Although OGSI vanished, OGSA continues …
43
OGSA
• Defines standard mechanisms for creating, naming, and discovering service instances.
• Addresses architectural issues relating to interoperable services for grid computing.
• Originally described in “The Physiology of the Grid” http://www.globus.org/research/papers/ogsa.pdf
44
WS-Resource Framework
• A specification developed by OASIS
• Specifies how to make web services stateful, and other features
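The idea behind a stateful web service in WSRF is that the service itself stays stateless and the state lives in separate resources, each identified by a key that the client passes with every call. The sketch below illustrates only that pattern in plain Python; it is not the Globus or WSRF API, and `CounterService` and its method names are hypothetical.

```python
import itertools


class CounterService:
    """Stateless front end: all state lives in resources looked up by key."""

    def __init__(self):
        self._resources = {}              # resource key -> resource properties
        self._keys = itertools.count(1)

    def create_resource(self):
        """Create a new resource and return the key the client must keep."""
        key = f"counter-{next(self._keys)}"
        self._resources[key] = {"value": 0}
        return key

    def add(self, key, amount):
        """Operate on the resource named by the key, not on the service."""
        resource = self._resources[key]
        resource["value"] += amount
        return resource["value"]

    def get_property(self, key, name):
        """Read one resource property (the idea behind WS-ResourceProperties)."""
        return self._resources[key][name]

    def destroy(self, key):
        """End the resource's lifetime (the idea behind WS-ResourceLifetime)."""
        del self._resources[key]


if __name__ == "__main__":
    service = CounterService()
    mine = service.create_resource()
    service.add(mine, 5)
    print(mine, service.get_property(mine, "value"))
```

Two clients holding different keys see independent state even though they call the same service.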
45
46
From “The Globus Toolkit 4 Programmer’s Tutorial” by Borja Sotomayor.
47
From “The Globus Toolkit 4 Programmer’s Tutorial” by Borja Sotomayor.
48
WS-* Standards
Principal web service standards adopted for grid computing:
• WSRF: a collection of 5 specifications, including:
– WS-ResourceProperties: specifies how resource properties are defined and accessed
– WS-ResourceLifetime: specifies mechanisms to manage resource lifetimes
– WS-ServiceGroup: specifies how to group services or WS-Resources together
– WS-BaseFaults: specifies how to report faults
• WS-Notification
– Collection of specifications that specify how to configure services as notification producers or consumers
• WS-Addressing
– Specifies how to address web services.
– Provides a way to address a web service/resource pair
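A web service/resource pair is addressed with an endpoint reference: the service address plus extra parameters that identify the resource. A minimal sketch follows, built with the Python standard library. The namespace is the one from the W3C WS-Addressing 1.0 recommendation; toolkits of this period may use an earlier draft with a different namespace and element names. The `ResourceKey` element, its namespace, and the service URL are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

WSA = "http://www.w3.org/2005/08/addressing"   # W3C WS-Addressing 1.0 namespace
EX = "http://example.org/counter"              # hypothetical application namespace


def endpoint_reference(service_url, resource_key):
    """Build an endpoint reference naming a service and one of its resources."""
    epr = ET.Element(f"{{{WSA}}}EndpointReference")
    # The Address element says where the web service is.
    ET.SubElement(epr, f"{{{WSA}}}Address").text = service_url
    # Reference parameters carry the key that selects the resource.
    params = ET.SubElement(epr, f"{{{WSA}}}ReferenceParameters")
    ET.SubElement(params, f"{{{EX}}}ResourceKey").text = resource_key
    return epr


if __name__ == "__main__":
    epr = endpoint_reference(
        "https://grid.example.org/wsrf/services/CounterService", "counter-1")
    print(ET.tostring(epr, encoding="unicode"))
```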
49
Grid Computing Software
Components of Globus 4.0
ITCS 4010 Grid Computing, 2005, UNC-Charlotte, B. Wilkinson, slides 4b version 0.1.
50
Globus Version 4
• A “toolkit” of services and packages for creating the basic grid computing infrastructure
• Higher level tools added to this infrastructure
• Version 4 is web-services based
• Some non-web services code exists from earlier versions (legacy) or where web services are not appropriate (for efficiency, etc.).
52
• Each part comprises a set of web services and/or non-web service components.
• Some built upon earlier versions of Globus.
Globus Open Source Grid Software (figure; I. Foster)
Components are grouped into five areas, each divided into web services components and non-WS components:
• Security: Pre-WS Authentication Authorization; WS Authentication Authorization; Community Authorization Service; Credential Management; Delegation Service
• Data Management: GridFTP; Reliable File Transfer; OGSA-DAI [Tech Preview]; Replica Location Service
• Execution Management: Grid Resource Allocation Mgmt (Pre-WS GRAM); Grid Resource Allocation Mgmt (WS GRAM); Community Scheduler Framework [contribution]
• Information Services: Monitoring & Discovery System (MDS2); Monitoring & Discovery System (MDS4)
• Common Runtime: C Common Libraries; XIO; Java WS Core; C WS Core; Python WS Core [contribution]
Version labels in the figure: GT2, GT3, GT4.
54
Another view of GT4 Components (figure; I. Foster)
• Server hosting environments: Java services in Apache Axis plus GT libraries and handlers; Python hosting with GT libraries (pyGlobus WS Core); C WS Core; C services using GT libraries and handlers.
• Services shown on the server side: your Java, Python, and C services; RFT; GRAM; Delegation; Index; Trigger; Archiver; RLS; Pre-WS MDS; CAS; Pre-WS GRAM; SimpleCA; MyProxy; OGSA-DAI; GTCP; GridFTP.
• Clients: your Java, C, and Python clients.
• Client and server communicate through interoperable, WS-I-compliant SOAP messaging.
• X.509 credentials = common authentication.
55
GT Core
• Provides the ability to create services running inside the GT 4 container.
• Assignment 2 requires you to create a service inside GT 4 container and exercise it with a client.
Java WS Core
Used in assignment 2
57
GT4 Web Services Core (figure; I. Foster)
Figure layers and labels: User Applications; Custom Web Services; Custom WSRF Web Services; GT4 WSRF Web Services; WS-Addressing, WSRF, WS-Notification; WSDL, SOAP, WS-Security; Registry; Administration; GT4 Container.
58
Execution Management
Key component
GRAM (Grid Resource Allocation Manager)
• For submitting executable jobs
• Used in Assignment 3 to submit and execute jobs.
• May interface to a local job scheduler
• Local job scheduler used in assignment 4
GRAM (Grid Resource Allocation Manager)
Used in assignment 3
60
GT4 GRAM Structure (figure; I. Foster)
• Service host(s) and compute element(s): GT4 Java container hosting the GRAM services, Delegation, and RFT file transfer; sudo; GRAM adapter; local scheduler; user job; GridFTP.
• Remote storage element(s): GridFTP.
• Labeled interactions: job functions and delegation from the client; transfer request; local job control; FTP control; FTP data.
• Annotations: Sun Grid Engine used in assignment 4 (local scheduler); data management components.
61
Security Components
Addresses the security requirements of grid computing. Three important factors are:
• Authorization
– Process of deciding whether a particular identity can access a particular resource
• Authentication
– Process of deciding whether a particular identity is who it claims to be (applies to humans and systems)
• Delegation (somewhat specific to grid computing)
– Process of giving authority to another identity (usually a computer/process) to act on your behalf.
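The three factors can be told apart with a toy model. The sketch below is an illustration only and makes strong simplifying assumptions: there are no certificates or signatures, a credential is just a named record with an expiry time, and the names `Credential`, `delegate`, `authenticate`, and `authorize` are hypothetical, not part of any grid security library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    subject: str                               # identity the credential names
    expires_at: float                          # time after which it is invalid
    issued_by: Optional["Credential"] = None   # parent credential for a proxy


def delegate(parent, expires_at):
    """Delegation: derive a short-lived proxy that acts on the parent's behalf."""
    return Credential(
        subject=parent.subject + "/proxy",
        expires_at=min(expires_at, parent.expires_at),  # never outlives the parent
        issued_by=parent,
    )


def authenticate(credential, now):
    """Authentication: return the root identity if the whole chain is unexpired."""
    current = credential
    while True:
        if now >= current.expires_at:
            return None
        if current.issued_by is None:
            return current.subject
        current = current.issued_by


def authorize(identity, resource, acl):
    """Authorization: may this authenticated identity use this resource?"""
    return identity is not None and identity in acl.get(resource, set())


if __name__ == "__main__":
    alice = Credential("/O=Grid/CN=Alice", expires_at=100.0)
    proxy = delegate(alice, expires_at=50.0)
    acl = {"cluster": {"/O=Grid/CN=Alice"}}
    print(authorize(authenticate(proxy, now=10.0), "cluster", acl))
```

A process holding only the proxy can act as Alice until the proxy expires, which limits the damage if the proxy is compromised.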
62
Security continued
• Security aspects complicated by the fact that virtual organization members and resources can be in different administrative domains.
Security
65
GT4’s Use of Security Standards
I Foster
66
GT4 Data Management
• Move large data to/from nodes
• Replicate data for performance & reliability
• Locate data of interest
• Provide access to different data sources
– File systems, parallel file systems, hierarchical storage (GridFTP)
– Databases (OGSA DAI)
GridFTP and Reliable File Transfer
68
GridFTP
• Built on FTP using separation of data and control channels
• Provides features for
– Large data transfers
– Secure transfers
– Fast transfers
– Reliable transfers
– Third party transfers
• Not a web service
– RFT (Reliable File Transfer) service provides a WS-level interface
69
Third party transfers
PI = FTP Protocol Interpreter
DTP = FTP Data Transfer Process
Figure: the client’s PIs hold a control channel to the PI of each of the two servers; the data channel runs directly between the two servers’ DTPs.
70
Performing a third-party transfer
1. Client establishes control channel with server.
2. Using control channel, client sets up transfer parameters and requests data channel creation.
3. Data channel established.
4. Client sends transfer command over control channel.
5. Data transfer starts through data channel.
Either client or server can send.
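For the third-party case, the steps above can be sketched as the sequence of standard FTP commands (PASV, PORT, STOR, RETR) that a client sends over its two control channels. This is a simplified model, not GridFTP client code: it only builds the command list, it assumes the control channels are already open and authenticated, and the function names and the server labels are hypothetical.

```python
def port_argument(host, port):
    """Encode an IPv4 address and port in the h1,h2,h3,h4,p1,p2 form used by PORT."""
    return host.replace(".", ",") + f",{port // 256},{port % 256}"


def third_party_commands(source, destination, path, data_host, data_port):
    """Return the (server, command) pairs sent over the two control channels.

    data_host and data_port stand for the address the destination server
    reports in its reply to PASV; here they are passed in as assumptions.
    """
    return [
        (destination, "PASV"),                                   # destination listens for data
        (source, "PORT " + port_argument(data_host, data_port)), # source told where to connect
        (destination, "STOR " + path),                           # destination will store the file
        (source, "RETR " + path),                                # source sends over the data channel
    ]


if __name__ == "__main__":
    for server, command in third_party_commands(
            "serverA", "serverB", "data.bin", "10.0.0.2", 5000):
        print(server, command)
```

The file contents never pass through the client; only the control traffic does.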
71
Parallel transfers and striping
• Using multiple (virtual) connections for transfer
– Same external network
– Speed improvement possible, but limited by network card
• Striping
– A version of parallel transfers that can use separate hardware interfaces
– Implemented in GT 4.
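The core of a parallel transfer is that each block travels with its offset, so blocks may arrive over different streams in any order and still be reassembled. A minimal sketch of that idea follows, using threads in place of network connections; the function names, block size, and stream count are assumptions for illustration and this is not the GridFTP wire protocol.

```python
from concurrent.futures import ThreadPoolExecutor


def split_blocks(data, block_size):
    """Cut the data into (offset, chunk) pairs."""
    return [(offset, data[offset:offset + block_size])
            for offset in range(0, len(data), block_size)]


def parallel_transfer(data, streams=4, block_size=4):
    """Send blocks over several simulated streams and reassemble by offset."""
    blocks = split_blocks(data, block_size)
    # Deal the blocks out round-robin, one list per stream.
    lanes = [blocks[i::streams] for i in range(streams)]
    received = bytearray(len(data))

    def send(lane):
        for offset, chunk in lane:
            # Each block lands at its own offset, so arrival order is irrelevant.
            received[offset:offset + len(chunk)] = chunk

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(send, lanes))
    return bytes(received)


if __name__ == "__main__":
    print(parallel_transfer(b"abcdefghijklmnopqrstuvwxyz", streams=3))
```

Striping applies the same block-and-offset idea but spreads the streams across separate hosts or network interfaces.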
72
GridFTP and RFT
Figure (from Gridwise): a WS client calls the RFT service (Java), which requires a GSI proxy from the client. The RFT service holds a control channel to each of two XIO-based (C) GridFTP servers; the data channel runs between the two GridFTP servers.
73
GT 4 Replica Location Service
• Identify location of files via logical to physical name map
• Distributed indexing of names, fault tolerant update protocols
Figure labels: Index, Index.
I. Foster
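The logical-to-physical name map can be pictured as a catalog from one logical file name to every location that holds a copy. The sketch below is a single in-memory catalog, assuming hypothetical names and example URLs; the real service distributes this mapping across sites and indexes, which is not modeled here.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Maps a logical file name to the physical locations of its replicas."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, logical_name, physical_url):
        """Record that a copy of logical_name exists at physical_url."""
        self._replicas[logical_name].add(physical_url)

    def unregister(self, logical_name, physical_url):
        """Forget one replica; unknown entries are ignored."""
        self._replicas[logical_name].discard(physical_url)

    def lookup(self, logical_name):
        """Return all known locations, sorted for a stable answer."""
        return sorted(self._replicas.get(logical_name, ()))


if __name__ == "__main__":
    catalog = ReplicaCatalog()
    catalog.register("climate/run42.nc", "gsiftp://siteA.example.org/data/run42.nc")
    catalog.register("climate/run42.nc", "gsiftp://siteB.example.org/store/run42.nc")
    print(catalog.lookup("climate/run42.nc"))
```

A client can then pick whichever replica is closest or least loaded.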
Monitoring and Discovery
75
Monitoring and Discovery
• WSRF provides common mechanisms for monitoring and discovering a service.
• GT4 “aggregator” services within MDS:
– MDS-Index: collects state information from registered resources and makes it available as an XML document
– MDS-Trigger: passes this information to an executable
– MDS-Archive: archives state information (awaiting implementation)
• Every GT 4 service is discoverable
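The aggregator idea can be sketched in a few lines: an index polls registered resources and publishes their state as one XML document, and a trigger runs an action on the entries that match a condition. This is a plain-Python illustration, not the MDS4 interface; the class, the element names, and the `freeCpus` property are hypothetical.

```python
import xml.etree.ElementTree as ET


class Index:
    """Collects state from registered resources into one XML document."""

    def __init__(self):
        self._sources = {}   # resource name -> function returning a state dict

    def register(self, name, probe):
        self._sources[name] = probe

    def snapshot(self):
        """Poll every registered resource and build the aggregate document."""
        root = ET.Element("index")
        for name, probe in sorted(self._sources.items()):
            entry = ET.SubElement(root, "resource", name=name)
            for key, value in sorted(probe().items()):
                ET.SubElement(entry, key).text = str(value)
        return root


def run_trigger(index, condition, action):
    """Call action on each entry matching the path expression; return the count."""
    matches = index.snapshot().findall(condition)
    for match in matches:
        action(match)
    return len(matches)


if __name__ == "__main__":
    index = Index()
    index.register("clusterA", lambda: {"freeCpus": 0})
    index.register("clusterB", lambda: {"freeCpus": 12})
    run_trigger(index, "./resource[freeCpus='0']",
                lambda entry: print("no free CPUs on", entry.get("name")))
```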
76
Acknowledgement
Slides marked with “I. Foster” have been selected from presentations made by Ian Foster:
• Enabling eScience: Grid Technologies Today & Tomorrow. American Association for the Advancement of Science Annual Meeting, Washington, DC, February 21, 2005.
• Globus: Bridging the Gap. Keynote Talk, GlobusWORLD, Boston, Mass., February 8, 2005.
• The Grid: Reality, Technologies, Applications. Distinguished Lecture, McGill University, Montreal, Canada, January 21, 2005.
Used for educational purposes only.
77
Acknowledgements
Support for this work was provided by:
National Science Foundation’s Course, Curriculum, and Laboratory Improvement program under grant # 0410667, “Introducing Grid Computing into the Undergraduate Curricula,”
University of North Carolina Office of the President through Award # P342A000189 “A Consortium to Promote Computational Science and High Performance Computing,”
University of North Carolina Office of the President through award # IR 04-04, “Fostering Undergraduate Research Partnerships through a Graphical User Environment for the North Carolina Computing Grid.”
The grid computing coursework development group gratefully acknowledges their support.