1
Nanbor Wang, Balamurali Ananthan Tech-X Corporation Gerald Gieraltowski, Edward May, Alexandre Vaniachine Argonne National Laboratory 2. ARCHITECTURE GSIMF: A Grid Software Installation Management Framework G rid Softw are Catalog Service (G SCS) Softw are Package D escriptor Repository Grid Softw are Package Repository Java G U IClient: - Brow sesthe Catalog ofSoftw are Package’sM etadata - Selectsthe Software - Installsthe softw are on the G rid G rid Softw are Installation M anager(G SIM ) Com pute N ode Com pute N ode Com pute N ode Com pute N ode Com pute N ode Com pute N ode Softw areInstallation Engine (SIE) G rid G atekeeper G rid Softw are Installation Service (G SIS) Com pute Elem ent G rid Softw are Package Q uery Service (G SPQ S) GridFTP Storage Elem ent G RID Installed Grid Software (LocalCopy) G rid Services A B C F H D G I E J Figure 1. G SIM F A rchitecture GSIMF architecture is shown in Figure 1. The description of the core components and the sequence of actions are explained below. How GSIMF Works A. Grid Software Catalog Service (GSCS) collects information about the software packages stored in Software Package Repository and builds metadata for every individual package. B, C. The client browses through the Catalog of Software Metadata, selects the software of interest and sends the command to install the software to the Grid Software Installation Manager (GSIM). D, F. GSIM contacts the Grid Software Package Query Service (GSPQS) and after checking that the software is not already installed on the Grid, GSIM delegates the installation processes to the Grid Software Installation Service (GSIS). G, H. The GSIS starts a Software Installation Engine on one of the machines in the Grid through the Gatekeeper. The files needed to start the Installation Engine are pre-staged through the GridFTP server. Both the Gatekeeper and GridFTP servers are provided by the Globus Toolkit. I. The Software Installation Engine installs the software on the Storage Element on a location that is accessible by all the nodes of the Compute Element after taking care of the dependents of the installed software package. In a similar sequence the database-resident data packaged in files can be installed on the Grid. To manage versioned and interdependent software applications and file-based databases over the Grid infrastructure we developed the Grid Software Installation Management Framework (GSIMF). This set of Grid services provide mechanism to install software packages on distributed Grid computing elements, thus automating the software and database installation management process on behalf of the users. The challenge of software installation management in the Grid environment is not unique to the HEP domain. GSIMF can be modified to bring immediate and similar benefit to other application domains that require large-scale software systems needing software installation management tools. By lowering the cost of deploying and maintaining software on the Grid environment through GSIMF, we can push the envelope of Grid computing technologies further to reach new extents. 1. Introduction Physicists need to install their experiment-specific software into the Grid environment before they can start testing the software, making pilot runs, and processing sets of data they wish to analyze. Unlike local computing resources, systems on a Grid are geographically distributed and belong to different organizations, and hence have divergent system hardware and software combinations, as well as configurations, usage and security policies. Maintaining software installations for systems on a Grid is therefore not a trivial task and is often done manually, which is error-prone and time- consuming. Furthermore, Grids are often shared by multiple groups of researchers. What is needed is a software installation management system so that an experiment is not hindered from using an otherwise available resource simply because the necessary software or data are not accessible. To solve this, we’ve developed Grid Software Installation Management Framework (GSIMF) as a set of Grid Services for managing versioned and interdependent software applications and file-based databases over the Grid infrastructure on behalf of users. On the Grid, in addition to the software release files, physicists require access to the database-resident data in files. (An example of these are the ATLAS Database Release files that are decoupled from the software releases.) This task is similar to the installation of the software files performed by GSIMF. We study the applicability of GSIMF for the installation and management of database release files. For that purpose the GSIMF system was deployed on the Open Science Grid (OSG) fabric at the Argonne National Laboratory (Figure 2). Figure 2: Grid installation sequence 1. User opens Software Catalog and GUI client asks for the end point of the catalog service. 2. The input dialog where the user feeds in the catalog service's endpoint. 3. Software Catalog opened. 4. After browsing through the catalog, the user decides to install the DOM software - at which point the user must feed in the end point of the Manager service. 5. There are currently no user software installed on the Grid. 6. Now the DOM software is installed along with its three dependents that are getting installed one after another. 7. DOM and its dependents are installed. 8. Uninstalling DOM software. 9. After DOM and its dependents are uninstalled. 8 7 9 6 1 2 3 4 5 2. Architecture 3. Case Study Our GSIMF system is secure as it makes use of the security features available on the Grid Toolkit provided by the Grid Security Infrastructure (Figure 3). Figure 3: Secure Grid installation initialization 1. GUI client starts up. 2. Grid User Proxy expired. GUI client asks user to renew the proxy. 3. Proxy successfully getting created. 1 2 3 4. Grid Security 5. Conclusions

Nanbor Wang, Balamurali Ananthan Tech-X Corporation Gerald Gieraltowski, Edward May, Alexandre Vaniachine Argonne National Laboratory 2. ARCHITECTURE GSIMF:

Embed Size (px)

Citation preview

Page 1: Nanbor Wang, Balamurali Ananthan Tech-X Corporation Gerald Gieraltowski, Edward May, Alexandre Vaniachine Argonne National Laboratory 2. ARCHITECTURE GSIMF:

Nanbor Wang, Balamurali AnanthanTech-X Corporation

Gerald Gieraltowski, Edward May, Alexandre VaniachineArgonne National Laboratory

2. ARCHITECTURE

GSIMF: A Grid Software

Installation Management Framework

Grid Software Catalog Service (GSCS)

Software Package Descriptor Repository

Grid Software Package Repository

Java GUI Client: - Browses the Catalog of Software Package’s Metadata - Selects the Software - Installs the software on the Grid

Grid Software Installation Manager (GSIM)

Compute Node

Compute Node

Compute Node

Compute Node

Compute Node

Compute Node

Software Installation Engine (SIE)

Grid Gatekeeper

Grid Software Installation Service (GSIS)

Compute Element

Grid Software Package Query Service (GSPQS)

GridFTP

Storage Element

GRID

Installed Grid Software (Local Copy)

Grid Services

A

B C

F

H

D

G

I

E

J

Figure 1. GSIMF Architecture

GSIMF architecture is shown in Figure 1. The description of the core components and the sequence of actions are explained below.

How GSIMF Works A. Grid Software Catalog Service (GSCS) collects information about the software packages stored in Software

Package Repository and builds metadata for every individual package. B, C. The client browses through the Catalog of Software Metadata, selects the software of interest and sends the

command to install the software to the Grid Software Installation Manager (GSIM). D, F. GSIM contacts the Grid Software Package Query Service (GSPQS) and after checking that the software is

not already installed on the Grid, GSIM delegates the installation processes to the Grid Software Installation Service (GSIS).

G, H. The GSIS starts a Software Installation Engine on one of the machines in the Grid through the Gatekeeper. The files needed to start the Installation Engine are pre-staged through the GridFTP server. Both the Gatekeeper and GridFTP servers are provided by the Globus Toolkit.

I. The Software Installation Engine installs the software on the Storage Element on a location that is accessible by all the nodes of the Compute Element after taking care of the dependents of the installed software package.

In a similar sequence the database-resident data packaged in files can be installed on the Grid.

• To manage versioned and interdependent software applications and file-based databases over the Grid infrastructure we developed the Grid Software Installation Management Framework (GSIMF). This set of Grid services provide mechanism to install software packages on distributed Grid computing elements, thus automating the software and database installation management process on behalf of the users. The challenge of software installation management in the Grid environment is not unique to the HEP domain. GSIMF can be modified to bring immediate and similar benefit to other application domains that require large-scale software systems needing software installation management tools. By lowering the cost of deploying and maintaining software on the Grid environment through GSIMF, we can push the envelope of Grid computing technologies further to reach new extents.

1. Introduction Physicists need to install their experiment-specific software into the Grid environment before they can start testing

the software, making pilot runs, and processing sets of data they wish to analyze. Unlike local computing resources, systems on a Grid are geographically distributed and belong to different organizations, and hence have divergent system hardware and software combinations, as well as configurations, usage and security policies. Maintaining software installations for systems on a Grid is therefore not a trivial task and is often done manually, which is error-prone and time-consuming. Furthermore, Grids are often shared by multiple groups of researchers.

What is needed is a software installation management system so that an experiment is not hindered from using an otherwise available resource simply because the necessary software or data are not accessible. To solve this, we’ve developed Grid Software Installation Management Framework (GSIMF) as a set of Grid Services for managing versioned and interdependent software applications and file-based databases over the Grid infrastructure on behalf of users.

On the Grid, in addition to the software release files, physicists require access to the database-resident data in files. (An example of these are the ATLAS Database Release files that are decoupled from the software releases.) This task is similar to the installation of the software files performed by GSIMF. We study the applicability of GSIMF for the installation and management of database release files. For that purpose the GSIMF system was deployed on the Open Science Grid (OSG) fabric at the Argonne National Laboratory (Figure 2).

Figure 2: Grid installation sequence1. User opens Software Catalog and GUI client asks for the end point of the catalog service.2. The input dialog where the user feeds in the catalog service's endpoint.3. Software Catalog opened.4. After browsing through the catalog, the user decides to install the DOM software - at which point the user must

feed in the end point of the Manager service.5. There are currently no user software installed on the Grid.6. Now the DOM software is installed along with its three dependents that are getting installed one after another.7. DOM and its dependents are installed.8. Uninstalling DOM software.9. After DOM and its dependents are uninstalled.

87 96

1 2 3 4 5

2. Architecture

3. Case Study

Our GSIMF system is secure as it makes use of the security features available on the Grid Toolkit provided by the Grid Security Infrastructure (Figure 3).

Figure 3: Secure Grid installation initialization1. GUI client starts up. 2. Grid User Proxy expired. GUI client asks user to renew the proxy.3. Proxy successfully getting created.

1 2 3

4. Grid Security

5. Conclusions