EGEE is a project funded by the European Union under contract IST-2003-508833 Experiment Software Installation toolkit on LCG-2 www.eu-egee.org


Page 1

EGEE is a project funded by the European Union under contract IST-2003-508833

Experiment Software Installation toolkit

on LCG-2

www.eu-egee.org

Page 2

Experiment Software Installation

The current implementation of the general schema discussed elsewhere (http://grid-deployment.web.cern.ch/grid-deployment/eis/docs/SoftwareInstallation/index.html) foresees software structured in three layers.

Tank & Spark is a component of the lower-level layer. It is mainly used to propagate software to the rest of a farm whenever no shared file system is provided.

Page 3

Experiment Software Installation

Structure of the Experiment Software Installation toolkit

[Diagram labels: lcg-asis, lcg-ManageSoftware, lcg-ManageVOTag, Tank&Spark, gssklog; hosts: UI, WN, CE]

lcg-asis: a user-friendly interface
• It hides the difficulties that invoking lcg-ManageSoftware implies (see previous slides).
• It uploads (if specified) the sources of the software (tarball(s)) to the grid.
• It loops over all sites available to the VO (those complying with the requirements provided by the user in terms of CPU, memory, disk space and CPU time) and for each site:
  - checks through the Information System whether another software management process is running on the site (see later);
  - automatically creates the JDL for that site;
  - submits the jobs and stores the job information.
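The per-site loop described above can be sketched as follows. This is only an illustration under stated assumptions: the `Site` record, the `meets` and `plan_jobs` helpers and the requirement keys are hypothetical names, the `arguments` string is an opaque placeholder (the real lcg-ManageSoftware options are not shown here), and using lcg-ManageSoftware directly as the JDL `Executable` is a guess at how the job might be described.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical view of one CE as read from the Information System."""
    ce_id: str
    cpus: int
    memory_mb: int
    disk_gb: int
    max_cpu_time_min: int
    busy: bool  # True if another software management job is running there


def meets(site, req):
    """Return True if the site satisfies every user requirement given."""
    return (site.cpus >= req.get("cpus", 0)
            and site.memory_mb >= req.get("memory_mb", 0)
            and site.disk_gb >= req.get("disk_gb", 0)
            and site.max_cpu_time_min >= req.get("max_cpu_time_min", 0))


def make_jdl(site, arguments):
    """Build a JDL text that pins the job to a single CE."""
    return "\n".join([
        'Executable = "lcg-ManageSoftware";',  # assumption, see lead-in
        f'Arguments = "{arguments}";',
        'StdOutput = "stdout";',
        'StdError = "stderr";',
        'OutputSandbox = {"stdout", "stderr"};',
        f'Requirements = other.GlueCEUniqueID == "{site.ce_id}";',
    ])


def plan_jobs(sites, req, arguments):
    """Return {ce_id: jdl} for every eligible site that is not busy."""
    return {s.ce_id: make_jdl(s, arguments)
            for s in sites if meets(s, req) and not s.busy}
```

Submitting each JDL and storing the returned job identifier is left out, since it depends on the job submission commands available on the UI.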

lcg-ManageSoftware: represents the “middle layer” of the current implementation

1. It is the steering script to be invoked for installing/removing/validating application software.

2. It checks the local (WN) environment and decides the workflow to be performed:
   1. Is some other process running for the same software?
   2. Is it a shared file system or not?
   3. Is it an AFS file system or not (conversion of GSI credentials to AFS tokens)?
   4. Is Tank&Spark installed (invocation of the propagation)?

3. It allows for a reliable download of the tarball(s) (if specified) to the local WN through the lcg-* commands, bypassing the outbound connectivity requirement.
   1. Possible unpacking of these tarball(s).

4. It invokes the experiment-specific script (provided by the ESM) and checks the result of the script.

5. It creates a temporary directory used to install/validate/remove software and cleans it up afterwards.

6. It publishes TAGs on the Information System with a flavor that depends on the action being performed and on the result of that action (see the man page of the command for more information).
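As a rough illustration of points 2, 4 and 5, the sketch below decides a workflow from the environment checks and runs a command inside a temporary directory that is always cleaned up. It is not the actual steering script: `plan_workflow`, `run_in_scratch` and the step names are hypothetical, and propagating only when there is no shared file system is an assumption drawn from the description of Tank & Spark.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def plan_workflow(busy, shared_fs, afs, tank_spark):
    """Return the ordered list of steps for the detected WN environment."""
    if busy:
        return ["abort: another process is managing this software"]
    steps = []
    if afs:
        steps.append("convert GSI credential to AFS token")
    steps.append("run experiment script")
    if tank_spark and not shared_fs:  # assumption, see lead-in
        steps.append("propagate with Tank&Spark")
    return steps


def run_in_scratch(command):
    """Run `command` in a fresh temporary directory, then remove it.

    Returns (succeeded, scratch_path); the path no longer exists on return.
    """
    scratch = tempfile.mkdtemp(prefix="swmgmt-")
    try:
        result = subprocess.run(command, cwd=scratch)
        return result.returncode == 0, scratch
    finally:
        shutil.rmtree(scratch, ignore_errors=True)
```

The `finally` clause guarantees the clean-up even when the experiment script fails, so a failed attempt does not leave debris on the WN.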

lcg-ManageVOTag: a component of the lower-level layer

It is the command used by lcg-ManageSoftware to add/list/remove tags published on the Information System, using the GRIS running on a given CE.

(It just adds/removes entries to the GlueHostApplicationSoftwareRunTimeEnvironment attribute of the IS.)

It can also be used as a standalone application.
It only requires the following format for the TAG:
VO-<voname>-<whatever_string>
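A small check of the tag format can be sketched as below. The regular expression and the `parse_tag` helper are illustrative only, and they assume that the VO name itself contains no hyphen and that the tag contains no whitespace; neither constraint is stated in the slides.

```python
import re

# VO-<voname>-<whatever_string>, assuming a hyphen-free VO name
TAG_RE = re.compile(r"^VO-(?P<vo>[^-\s]+)-(?P<rest>\S+)$")


def parse_tag(tag):
    """Return (voname, rest) for a well-formed tag, or None otherwise."""
    match = TAG_RE.match(tag)
    if match is None:
        return None
    return match.group("vo"), match.group("rest")
```

For example, `parse_tag("VO-dteam-orca-8.3")` splits the tag into the VO name `dteam` and the free string `orca-8.3`.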

Tank&Spark: a component of the lower-level layer running on both the WN and the CE

It is used here mainly to propagate software to other WNs. BUT:

• It can be used as a standalone, grid-independent mechanism.
• It can allow for installation bypassing the grid job submission (high prioritization of software management).
• It keeps track of the installer, who is strongly authenticated and uniquely identified.
• It complies with external policies set by the site administrators.
• It can manage all possible file system topologies (shared, non-shared, AFS, a mix of them!).
• It allows for asynchronous (currently) and synchronous installation.
• It allows for failure recovery (retry of the installation on the node).
• It can allow for roll-back of a given installation (not in place).
• It allows for an exhaustive notification (with successes and problems node by node) to the ESM and (automatically) to the site admin.
• It allows for storing a lot of information about a given software, internally identified through GUIDs (e.g. date, size, owner, path, status and so on).
• Automatic farm management (Is a node out? Is a new node there?): it adds/removes nodes in its central DB (MySQL).
• It modifies the Information System by changing the “flavour” of the tag.
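The automatic farm management item amounts to reconciling the nodes recorded in the central DB with the nodes currently seen. A minimal sketch of that comparison, with a hypothetical `reconcile` helper and plain host-name lists standing in for the MySQL tables, could look like this:

```python
def reconcile(known_nodes, alive_nodes):
    """Compare the DB view of the farm with the nodes currently seen.

    Returns (to_add, to_remove) as sorted lists of host names.
    """
    known, alive = set(known_nodes), set(alive_nodes)
    to_add = sorted(alive - known)      # new nodes that appeared
    to_remove = sorted(known - alive)   # nodes that went out
    return to_add, to_remove
```

How Tank actually detects that a node is out (for example, a missed contact from its Spark client) is not described in the slides and is left open here.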

gssklog: another component of the lower-level layer; it is part of another, externally developed mechanism: gssklog-gssklogd

It is the client of this mechanism, allowing for the conversion of GSI credentials into valid KRB5 AFS tokens.

Page 4

Experiment Software Installation

Tank & Spark

It consists of three different components:

Tank: a multithreaded (gSOAP-based) service running on the CE and listening for GSI-authenticated (and non-authenticated) connections.

Spark: a client application running on each WN (through a cron job and/or through a normal “grid job” from lcg-ManageSoftware) that contacts Tank to retrieve/insert/delete software information.

An rsync server running on another machine (an SE, for instance) and acting as the central repository of the software.

Page 5

[Diagram: ESM, site firewall, SE, CE, TANK and seven WNs; labels “ab”, “abc” and “c” mark the software present on nodes; step 6 is marked next to “c”.]

1. The JDL installation job from the ESM arrives on the CE.

2. The ESM request ends up on a WN, which becomes SPARK. The software (here labeled “c”) is installed locally through the middle layer, lcg-ManageSoftware. A pre-validation is highly recommended before triggering the propagation. The Information System is updated.

3. The Spark client program is called. The delegated credentials of the ESM are checked in TANK. SPARK asks for a software tag registration in the TANK central DB.

4. TANK registers the new tag and synchronizes, through rsync, the new directory created on SPARK into a central repository.

5. TANK is contacted by all WNs, one at a time. External conditions are checked. Special site policies can be taken into account. Local installation on the WNs is triggered. No authentication is required: each WN trusts TANK.

7. At the end of the whole process TANK e-mails the ESM with the result of the installation; the Information System is updated according to the result of the process.
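The last part of this flow (steps 5 to 7) can be mimicked with a short simulation: each WN is handled in turn, failures are recorded node by node, and a report is assembled for the ESM. The `propagate` and `build_report` functions and the `install` callback are hypothetical stand-ins, not the Tank or Spark interfaces.

```python
def propagate(nodes, install):
    """Trigger the local installation on each WN, one at a time.

    `install` is a callable taking a node name and raising on failure.
    Returns {node: status} with a status string for every node.
    """
    results = {}
    for node in nodes:
        try:
            install(node)
            results[node] = "ok"
        except Exception as exc:  # keep going: one bad node must not stop the rest
            results[node] = f"failed: {exc}"
    return results


def build_report(tag, results):
    """Assemble the node-by-node text that would be mailed to the ESM."""
    lines = [f"Installation report for {tag}"]
    lines += [f"{node}: {status}" for node, status in sorted(results.items())]
    return "\n".join(lines)
```

Recording a status per node, rather than stopping at the first error, matches the node-by-node notification described for Tank&Spark.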

Page 6

Flag flavors:
1. VO-dteam-orca-8.3-processing-install: Installation ongoing
2. VO-dteam-orca-8.3-processing-remove: Removal ongoing
3. VO-dteam-orca-8.3-processing-validate: Validation ongoing
4. VO-dteam-orca-8.3-aborted-install: Installation failure
5. VO-dteam-orca-8.3-aborted-remove: Removal failure
6. VO-dteam-orca-8.3-aborted-validate: Validation failure
7. VO-dteam-orca-8.3-to-be-validated: Installation OK
8. (no tag listed): Removal OK
9. VO-dteam-orca-8.3: Validation OK
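The flavor list can be read as a mapping from (action, state) to the published tag. The sketch below encodes that mapping; `flavored_tag` is a hypothetical helper, and returning `None` for a successful removal assumes that the entry without a tag means that no tag remains published.

```python
from typing import Optional

ACTIONS = ("install", "remove", "validate")


def flavored_tag(base: str, action: str, state: str) -> Optional[str]:
    """Return the tag to publish, or None if no tag should remain.

    `state` is one of "processing", "aborted" or "ok".
    """
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if state in ("processing", "aborted"):
        return f"{base}-{state}-{action}"
    if state == "ok":
        if action == "install":
            return f"{base}-to-be-validated"
        if action == "validate":
            return base
        return None  # successful removal: assumed to leave no tag
    raise ValueError(f"unknown state: {state}")
```

Only a successful validation yields the plain tag, which is why normal users who look for `VO-dteam-orca-8.3` never see half-installed software.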

Advantages:
1. Normal users continue to use the same mechanism to find out about the software on a site.
2. The ESMs know the status of their experiment software management jobs.
3. Concurrent software management jobs for the same software version on the same site are not possible.
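One way to picture the third advantage is a check on the tags published for a site before a new management job is started. The `is_locked` helper below is hypothetical, and treating any `-processing-` flavor of the same base tag as a lock is an interpretation of the flavors listed above, not a documented rule.

```python
def is_locked(published_tags, base):
    """Return True if a management job for `base` appears to be running."""
    prefix = f"{base}-processing-"
    return any(tag.startswith(prefix) for tag in published_tags)
```

A job for `VO-dteam-orca-8.3` would then be refused while `VO-dteam-orca-8.3-processing-install` is published, but not because of a job for a different version.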