Ganga Core: Status Jakub T. Moscicki ARDA/LHCb LHCb Software Week, September, 2005


Page 1:

Ganga Core: Status

Jakub T. Moscicki

ARDA/LHCb

LHCb Software Week, September, 2005

Page 2:

Ganga Overview

[Diagram: Ganga4 manages Job objects, connecting applications (scripts, Gaudi, Athena) to backends (AtlasPROD, DIAL, DIRAC, LCG2, gLite, localhost, LSF). Labelled operations:]

• prepare, configure
• store & retrieve job definition
• submit, kill
• get output, update status
• + split, merge, monitor

Page 3:

Release Schedule

Ganga 4 (beta series)
• Jul 8: 4.0.0-beta1
• Aug 8: 4.0.0-beta2
• Sep 7: 4.0.0-beta3
• Sep 23: 4.0.0-beta4

Ganga 4 (alpha series)
• Apr 16: 4.0.0-alpha1
• May 3: 4.0.0-alpha2
• May 13: 4.0.0-alpha3
• May 30: 4.0.0-alpha4
• Jun 6: 4.0.0-alpha5
• Jun 10: 4.0.0-alpha6
• Jun 24: 4.0.0-alpha7
• Jul 5: 4.0.0-alpha8

Ganga 3
• Mar 11: 3.0.0
• Apr 1: 3.0.1

beta series: fully operational, public pre-release
• bugfixes, testing, missing features
• stability: config files, repository backwards compatibility
audience:
• tested/used by ~10 users in LHCb, Atlas and outside
• everybody is encouraged to try it, no setup needed

alpha series: prototype with frequent and incompatible changes
audience: internal developers

discontinued

Page 4:

Testing/stability

• Core testing:
  • automatic test suite (61 test cases)
    • unit tests / invariant tests / integration tests
    • bugfix tests
      • "a bug report has a test-case" policy
    • subsystem stubs (test submitters, transient repository)
• Extensions testing:
  • use-case tests in preparation (published as LHCb-2005-027 note)
• Release compatibility:
  • automatic repository regression testing
  • GPI/config compatibility policies
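The "a bug report has a test-case" policy combined with subsystem stubs could look roughly like the sketch below. Everything in it is hypothetical and only illustrates the idea: `StubSubmitter` stands in for a test submitter, `submit_all` is an invented piece of code under test, and the bug it guards against is made up.

```python
import unittest


class StubSubmitter(object):
    """Hypothetical test backend: records submissions instead of running jobs."""

    def __init__(self):
        self.submitted = []

    def submit(self, job_name):
        self.submitted.append(job_name)
        return True


def submit_all(backend, job_names):
    """Invented code under test: submit each job once, skipping repeated names."""
    seen = set()
    results = []
    for name in job_names:
        if name in seen:
            continue
        seen.add(name)
        results.append(backend.submit(name))
    return results


class BugfixTest(unittest.TestCase):
    def test_duplicate_names_submitted_once(self):
        # Regression test for an imaginary bug report:
        # a repeated job name used to be submitted twice.
        backend = StubSubmitter()
        submit_all(backend, ["a", "b", "a"])
        self.assertEqual(backend.submitted, ["a", "b"])


if __name__ == "__main__":
    unittest.main()
```

Because the stub never contacts a batch system, such a test can run in the automatic suite on any machine.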

Page 5:

Project Structure

• Framework
  • Submission logic
  • Monitoring
  • JobRepository
  • FileWorkspace
  • Utilities (config, logging)
  • Interfaces:
    • interactive shell
    • command line / scripts
    • embedding / library
• Plugins
  • Applications
  • Backends
  • Datasets

Page 6:

Project Structure

• Release area:
  • /afs/cern.ch/sw/ganga/install/slc3_gcc323/4.0.0-beta4
    • bin/ganga
    • python/Ganga
      • core framework
      • Local, LSF, LCG, gLite backends
      • Executable application
    • python/GangaLHCb
      • Gaudi, DIRAC plugins
    • python/GangaAtlas
      • Athena, ADA plugins

[Configuration]
RUNTIME_PATH = GangaLHCb:/myarea/GangaAtlas

Page 7:

Configuration

• Config file: ~/.ganga4
  • Default template is well documented.
• Configurable features:
  • plugin location
  • hierarchical logger levels
  • polling rate (15 seconds)
  • repository configuration (local/remote)
  • file workspace (job input/output location)
  • VO
  • software versions
  • plugin specific parameters
• Relevant command line options
  • -c cfgfile
  • -o[Repository]type=Remote
  • -o[Logging]GangaLHCb.Lib.Dirac=DEBUG
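How `-o[Section]key=value` options might be layered over an INI-style config file can be sketched with the standard `configparser` module. This is not Ganga's implementation: `apply_overrides` is a hypothetical helper, and the rule that an option without a `[Section]` prefix reuses the most recently named section is an assumption.

```python
import configparser
import re

# [Section] prefix is optional; key and value are separated by the first "=".
_OPTION = re.compile(r"^(?:\[(?P<section>[^\]]+)\])?(?P<key>[^=]+)=(?P<value>.*)$")


def apply_overrides(cfg, exprs, default_section="Configuration"):
    """Apply '-o' style expressions to cfg (hypothetical helper)."""
    section = default_section
    for expr in exprs:
        match = _OPTION.match(expr)
        if match is None:
            raise ValueError("cannot parse option: %r" % expr)
        if match.group("section"):
            # Assumption: later options without a prefix stay in this section.
            section = match.group("section")
        if not cfg.has_section(section):
            cfg.add_section(section)
        cfg.set(section, match.group("key").strip(), match.group("value").strip())
    return cfg


cfg = configparser.ConfigParser()
cfg.optionxform = str  # keep option names such as GangaLHCb.Lib.Dirac case-sensitive
cfg.read_string("[Repository]\ntype = Local\n")  # stands in for ~/.ganga4

apply_overrides(cfg, ["[Repository]type=Remote",
                      "[Logging]GangaLHCb.Lib.Dirac=DEBUG"])
print(cfg.get("Repository", "type"))
print(cfg.get("Logging", "GangaLHCb.Lib.Dirac"))
```

Command line values simply overwrite whatever the file provided, which matches the idea of the config file supplying defaults.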

Page 8:

Command line

ganga -h
*** Welcome to Ganga ***
Version: Ganga-4-0-0-beta4
Documentation and support: http://cern.ch/ganga
Type help() or help('index') for online help.

usage: ganga [options] [script] [args] ...
options:
  --version              show program's version number and exit
  -h, --help             show this help message and exit
  -i                     enter interactive mode after running script
  -cFILE                 read user configuration from FILE (default ~/.ganga4)
  -g, --generate-config  generate a default config file, backup the existing one
  -oEXPR, --option=EXPR  set configuration options, may be repeated multiple times, for example:
                           -o[Logging]Ganga.Lib=DEBUG -oGangaLHCb=INFO -o[Configuration]TextShell = IPython
                         FIXME: PATH-like variables are reset and not appended to (this behaviour is different from config file behaviour)
  --quiet                only ERROR messages are printed
  --very-quiet           only CRITICAL messages are printed
  --debug                all messages including DEBUG are printed
  --no-prompt            never prompt interactively for anything except IPython (FIXME:)
  --no-rexec             rely on existing environment and do not re-exec ganga process to setup runtime plugin modules (affects LD_LIBRARY_PATH)

Page 9:

Interfaces

• Interactive Shell
  • IPython: <TAB>, coloring, history, editing, direct shell access
  • Automatically generated GPI help index
• Scripting
  • ganga script.py
  • interpreter:
      #!/bin/env ganga
      print jobs
• Embedding
      from Ganga.Runtime import GangaProgram
      prog = GangaProgram()
      prog.bootstrap()
      from Ganga.GPI import *

Page 10:

GPI

• Ganga Public Interface: GPI
  – high-level, user-friendly Python API for job manipulation
  – combines
    • consistency and flexibility of a programming language interface
    • clarity and ease of use

[Diagram: GUI, CLIP and SCRIPT layered on the GPI, which sits on top of Ganga.Core]

Page 11:

GPI

• Hello World

>>> job = Job()
>>> job.application.exe = '/bin/echo'
>>> job.application.parameters = ['hello world']
>>> job.submit()
submitting job

>>> outfile = file(job.directory+'/output/std.out')
>>> print outfile.read()
Job started at: Fri Feb 18 14:05:32 2005
Processing input files...
/bin/echo Done
hello world
Application executed with the status code 0
Processing output files...
Exiting...
Job finished at: Fri Feb 18 14:05:32 2005

>>> job2 = job.copy()
>>> job2.backend = "LSF"
>>> job2.submit()

Page 12:

GPI

• Inspecting the jobs

>>> print job.id
5

>>> print jobs
Statistics: 5 jobs
jobs
--------------
   ID  status     name
#   1  completed
#   2  new        Job20041231647334751371267881024
#   3  completed  Job2004123165980541363751081024
#   4  submitted  Job2004123182494941363292601024
#   5  completed  Job20041231842266811363443961024

>>> for j in jobs[1:3]:
...     print j.id
1
2

Page 13:

GPI

• Complex scenarios

>>> j = Job()
>>> j.application = DaVinci()
>>> j.application.options = 'my.opts'
>>> j.backend = Glite()
>>> j.backend.requirements = 'other.GlueCEUniqueID == "grid-ce.desy.de:2119/jobmanager-lcgpbs-short"'

>>> for i in range(100): j = Job()

Page 14:

Ganga Tool vs Framework

• Ganga is a lightweight user tool
  – easy to install (pure-python)
  – "designed and optimized" for users
  – GPI has a syntax (users have to judge):
    • j.application, j.backend, j.id, j.submit(), etc.
• But also: Ganga is a developer framework
  – Plugin model
    • independent and rapid development of handlers (backends, applications)
  – Promote but do not force common GPI abstractions
    • We neither require nor invent abstract base classes which are least common denominators between systems, example:
      – you may implement a very complex application (e.g. ADA) and enable submission to DIAL only, if that's your main case
    • the design of the framework does not attempt to match all possible applications with all possible backends
  – But: make it possible to build common tools on top of GPI: GUI, scripts, …

Page 15:

Some Design Principles

• Avoid shared environment
  – example: in the LCG environment LD_LIBRARY_PATH is incompatible with some application environments
  – solution: the LCG backend handler uses a private, cached environment
• Don't force common abstractions upfront
  – application <-> backend are connected via adapters (runtime handlers)
  – in most cases adapters are shared (thus their number is reduced)

Page 16:

Adapters: 7 vs 11 vs 20

             ADA   Athena   Gaudi (DaVinci, Gauss, Boole,…)   executable (any script)
LCG/glite    X     6        3
DIAL         7     X        X                                 X
DIRAC        X     X        4                                 X
LSF          X
Localhost    X     5        2                                 1
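The three numbers in the slide title can be reproduced with a small lookup table. The sketch below reads the table with two assumptions: X marks an unsupported application/backend pair, and an empty cell is a supported pair that reuses an existing adapter. Which adapter an empty cell reuses is a guess made only for illustration (here the Localhost one); `ADAPTERS` and `adapter_for` are hypothetical names.

```python
APPLICATIONS = ["ADA", "Athena", "Gaudi", "executable"]

# Adapter number per backend, in APPLICATIONS order; None marks an "X" cell.
# Entries for empty cells in the slide (LSF row, LCG/glite executable) are guesses.
ADAPTERS = {
    "LCG/glite": [None, 6, 3, 1],
    "DIAL":      [7, None, None, None],
    "DIRAC":     [None, None, 4, None],
    "LSF":       [None, 5, 2, 1],
    "Localhost": [None, 5, 2, 1],
}


def adapter_for(backend, application):
    """Return the adapter number connecting an application to a backend."""
    adapter = ADAPTERS[backend][APPLICATIONS.index(application)]
    if adapter is None:
        raise LookupError("%s cannot be submitted to %s" % (application, backend))
    return adapter


cells = [adapter for row in ADAPTERS.values() for adapter in row]
supported = [adapter for adapter in cells if adapter is not None]

# distinct adapters vs supported pairs vs all possible pairs
print(len(set(supported)), len(supported), len(cells))
```

Under these assumptions the counts come out as 7 adapters covering 11 supported pairs out of 20 possible ones, whichever adapters the empty cells actually share.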

Page 17:

Summary

• Factsheet (4-0-0-beta4)
  – size:
    • Ganga base: ~400KB, pure-python (no install)
    • Atlas and LHCb extensions: ~100KB
  – existing functionality:
    • basic job manipulation via GPI
    • easy configuration / extension
    • local and remote registry, local workspace
    • Lib:
      – local host, LSF, LCG2, DIRAC, glite
      – Gaudi (DaVinci, Gauss,...), Athena, DIAL, Ada
  – future functionality:
    • GUI
    • splitting/merging
    • asynchronous job submission (remote job manager)

Page 18:

http://cern.ch/ganga

Page 19:

Backup Slides

Page 20:

Ganga Architecture

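[Diagram: GUI and CLIP call the GPI, e.g. j = Job(backend='LSF'); j.submit(). The GPI sits on Ganga.Core, which is connected to the Job Repository, the File Workspace (IN/OUT SANDBOX), Monitoring and the Plugin Modules: applications (Athena, Gaudi) and backends (AtlasPROD, DIAL, DIRAC, LCG2, gLite, localhost, LSF).]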

Page 21:

Ganga Object Model

Page 22:

Gaudi Application Object

class Gaudi(GangaObject):
    _schema = Schema(Version(1,0), {
        'optsfile': FileItem(),
        'version': SimpleItem(None),
        'platform': SimpleItem(None),
        'package': SimpleItem(None),
        'appname': SimpleItem(None),
        'cmt_release_area': SimpleItem(None),
        'cmt_user_path': SimpleItem(None),
        'masterpackage': SimpleItem(None),
        'extraopts': SimpleItem(None)})
    _category = 'applications'
    _name = 'Gaudi'

    def _auto__init__(self):
        ...

    def configure(self):
        ...
        extra_cfg = GaudiExtras()
        extra_cfg.flatopts = FileParser.writeString(gaudiopts, "expand")
        return (modified, extra_cfg)

    def list_choices(self, property):
        ...

Page 23:

Job Submit

Page 24:

class GaudiLFSRunTimeHandler:
    def prepare(self, app, extra):
        (algpack, alg, algver) = app.masterpackage.split('/', 3)
        # job script template; ###NAME### markers are filled in below
        script = """#!/usr/bin/env bash
export CMTPATH=###CMTUSERPATH###
export ###THEAPP###_release_area=###CMTRELEASEAREA###
if [ -f ${LHCBHOME}/scripts/ProjectEnv.sh ]; then
  . ${LHCBHOME}/scripts/ProjectEnv.sh ###THEAPP### ###VERSION###
else
  echo "Could not find the ProjectEnv.sh script. Your job will probably fail"
fi
mkdir -p cmttemp/v1/cmt
cat >cmttemp/v1/cmt/requirements <<EOF
use ###ALG### ###ALGVER### ###ALGPACK###
EOF
cmt setup -sh -quiet -pack=cmttemp -version=v1 -path=$PWD >cmttemp/v1/cmt/setup.sh
. cmttemp/v1/cmt/setup.sh
$###THEAPP###_release_area/###APPUPPER###/###APPUPPER###_###VERSION###/###PACKAGE###/###THEAPP###/###VERSION###/###PLATFORM###/###THEAPP###.exe myopts.opts
"""
        script = script.replace('###CMTUSERPATH###', app.cmt_user_path)
        script = script.replace('###THEAPP###', app.appname)
        script = script.replace('###CMTRELEASEAREA###', app.cmt_release_area)
        script = script.replace('###VERSION###', app.version)
        script = script.replace('###ALG###', alg)
        script = script.replace('###ALGVER###', algver)
        script = script.replace('###ALGPACK###', algpack)
        # upper-case application name, used in the release area path
        script = script.replace('###APPUPPER###', app.appname.upper())
        script = script.replace('###PACKAGE###', app.package)
        script = script.replace('###PLATFORM###', app.platform)

        # the job script plus the flattened options file as input sandbox
        return {'jobscript': ('myscript', script),
                'inputbox': [('myopts.opts', extra.flatopts)]}
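The chain of `replace` calls in the handler fills `###NAME###` markers one by one. A compact alternative is sketched below, not taken from Ganga: `fill_template` is a hypothetical helper that substitutes every marker in a single pass and fails loudly when a value is missing, and the release-area path is a made-up example value.

```python
import re

# Markers look like ###NAME### with an upper-case name.
_PLACEHOLDER = re.compile(r"###([A-Z]+)###")


def fill_template(template, values):
    """Replace every ###NAME### marker with values['NAME'] (hypothetical helper)."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError("no value for placeholder ###%s###" % key)
        return values[key]
    return _PLACEHOLDER.sub(substitute, template)


line = fill_template(
    "export ###THEAPP###_release_area=###CMTRELEASEAREA###\n",
    {"THEAPP": "DaVinci", "CMTRELEASEAREA": "/hypothetical/release/area"},
)
print(line)
```

A forgotten placeholder then raises an error at preparation time instead of leaving a literal `###NAME###` in the submitted script.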

Page 25:

LSF Submit (1)

def submit(self, jobid, jobconfig):
    inw = FileWorkspace.InputWorkspace()
    outw = FileWorkspace.OutputWorkspace()

    logger.info('LSF: submitting job %d', jobid)

    inw.create(jobid)
    outw.create(jobid)
    scriptpath = self.preparejob(jobid, jobconfig, inw, outw)

    # FIXME: grabbing stdout is done by shell magic and probably should be implemented in python directly
    rc, soutfile = shell_cmd('cd %s; bsub %s' % (inw.getPath(), scriptpath))

    if rc == 0:
        sout = file(soutfile).read()
        import re
        m = re.compile(r"^Job <(?P<id>\d*)> is submitted to (\S*) queue <(?P<queue>\S*)>.", re.M).search(sout)

        if m is None:
            logger.warning('could not match the output and extract the LSF job identifier!')
            logger.warning('command output \n %s ', sout)
        else:
            self.id = m.group('id')
            queue = m.group('queue')
            if self.queue != queue:
                # warn before overwriting self.queue so the message shows both queue names
                logger.warning('you requested queue "%s" but the job was submitted to queue "%s"', self.queue, queue)
                logger.warning('command output \n %s ', sout)
                self.queue = queue
            logger.info('job %d submission OK', jobid)

    return rc == 0
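The job identifier is extracted from the text that bsub prints. The sketch below isolates that step so it can be tried on sample strings; `parse_bsub_output` is a hypothetical helper and the sample lines are assumed, not captured from a real LSF installation. Unlike the expression in the submit code, this variant treats the word before "queue" (such as "default") as optional.

```python
import re

# "Job <ID> is submitted to [default ]queue <NAME>."
_SUBMITTED = re.compile(
    r"^Job <(?P<id>\d+)> is submitted to (?:\S+ )?queue <(?P<queue>[^>]+)>",
    re.M,
)


def parse_bsub_output(text):
    """Return (job id, queue) from bsub output, or None if nothing matches."""
    match = _SUBMITTED.search(text)
    if match is None:
        return None
    return match.group("id"), match.group("queue")


print(parse_bsub_output("Job <12345> is submitted to default queue <8nm>.\n"))
print(parse_bsub_output("Job <7> is submitted to queue <1nh>.\n"))
print(parse_bsub_output("bsub: command not found\n"))
```

Returning None for unrecognised output leaves it to the caller to log a warning, as the submit method does.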

Page 26:

LSF Submit (2)

def preparejob(self, jobid, jobconfig, inw, outw):
    appscriptpath = inw.writefile(jobconfig['jobscript'], executable=1)

    # put files into job workdir (also to protect the originals while the job is running)
    sharedinputbox = map(lambda f: inw.writefile(f), jobconfig['inputbox'])
    sharedoutputbox = outw.getPath()
    print sharedoutputbox

    # wrapper script run on the worker node; ###NAME### markers are filled in below
    text = """#!/usr/bin/env python
import os, shutil, sys

sharedinputbox = ###SHAREDINPUTBOX###
sharedoutputbox = ###SHAREDOUTPUTBOX###

for fn in sharedinputbox:
    shutil.copy(fn,'.')

s = os.system('###APPSCRIPTNAME###')

print 'DEBUG: Job finished with exit code: ',s

if s == 0:
    for fn in os.listdir('.'):
        if not os.path.isdir(fn):
            shutil.copy(fn,sharedoutputbox) # FIXME: needs recursive copy
sys.exit(s)
"""

    text = text.replace('###SHAREDINPUTBOX###', repr(sharedinputbox))
    text = text.replace('###APPSCRIPTNAME###', appscriptpath)
    text = text.replace('###SHAREDOUTPUTBOX###', repr(sharedoutputbox))

    return inw.writefile(('__jobscript__', text), executable=1)

Page 27:

Job Submit Sequence

Page 28:

Files/Job Repository

• File Workspace
    ~/__Ganga4__/workspace/input/*
    ~/__Ganga4__/workspace/output/*
• Job Repository
    ~/__Ganga4__/repository/ganga_user

Page 29:

LSF backend object

class LSF(GangaObject):
    _schema = Schema(Version(1,0), {
        'queue'  : SimpleItem(defvalue='8nm'),
        'id'     : SimpleItem(defvalue=None, protected=1, copyable=0),
        'status' : SimpleItem(defvalue=None, protected=1, copyable=0)
    })
    _category = 'backends'
    _name = 'LSF'

    def __init__(self):
        super(LSF, self).__init__()
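What the `protected` and `copyable` flags in such a schema could mean in practice is sketched below. This is a self-contained toy, not the GangaObject implementation: `SchemaObject`, its `_set_protected` method and the simplified `SimpleItem` are all hypothetical stand-ins.

```python
import copy


class SimpleItem(object):
    """Toy schema entry: a default value plus access flags."""

    def __init__(self, defvalue=None, protected=False, copyable=True):
        self.defvalue = defvalue
        self.protected = protected
        self.copyable = copyable


class SchemaObject(object):
    """Toy base class that enforces the flags declared in _schema."""

    _schema = {}

    def __init__(self):
        for name, item in self._schema.items():
            # Bypass __setattr__ so protected attributes still get their defaults.
            object.__setattr__(self, name, copy.deepcopy(item.defvalue))

    def __setattr__(self, name, value):
        item = self._schema.get(name)
        if item is None:
            raise AttributeError("unknown attribute %r" % name)
        if item.protected:
            raise AttributeError("%r is read-only" % name)
        object.__setattr__(self, name, value)

    def _set_protected(self, name, value):
        """Framework-side setter, used instead of plain assignment."""
        object.__setattr__(self, name, value)

    def copy(self):
        clone = type(self)()
        for name, item in self._schema.items():
            if item.copyable:  # non-copyable attributes keep their defaults
                object.__setattr__(clone, name, copy.deepcopy(getattr(self, name)))
        return clone


class LSF(SchemaObject):
    _schema = {
        "queue": SimpleItem(defvalue="8nm"),
        "id": SimpleItem(protected=True, copyable=False),
        "status": SimpleItem(protected=True, copyable=False),
    }


backend = LSF()
backend.queue = "1nh"                  # ordinary attribute: user may set it
backend._set_protected("id", "12345")  # only the framework fills in the id
clone = backend.copy()
print(clone.queue, clone.id)
```

In this reading a copied backend keeps the user's queue choice but starts without a batch identifier or status, so it can be submitted again as a fresh job.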

Page 30:

LSF Monitoring

def updateMonitoringInformation(jobs):
    rc, soutfile = shell_cmd('bjobs -a', allowed_exit=[0,255])
    sout = file(soutfile).read()

    if rc == 0:
        import re
        m1 = re.compile(r"JOBID\s+USER\s+STAT\s+QUEUE").search(sout)
        if not m1:
            logger.warning('problem with understanding the bjobs output:\n%s', sout)
        else:
            # each item is (id, spaces, user, spaces, status)
            items = re.compile(r"^(?P<id>\d+)(\s*)(\S*)(\s*)(\S*)", re.M).findall(sout)
            ids = map(lambda x: x[0], items)

            for j in jobs:
                try:
                    idx = ids.index(j.backend.id)
                    new_status = items[idx][4]

                    if j.backend.status != new_status:
                        logger.info('%d: LSF job status changed to %s', j.id, new_status)
                        j.backend.status = new_status

                        if j.backend.status == 'DONE' or j.backend.status == 'ERROR':
                            j.status = "completed"
                except ValueError:
                    pass

updateMonitoringInformation = staticmethod(updateMonitoringInformation)
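The parsing step of the monitoring loop can be tried in isolation. The sketch below turns a bjobs-style listing into a dictionary from job id to status; `parse_bjobs` is a hypothetical helper, and the sample listing (hosts, users, column layout beyond the four header names checked in the code) is invented for illustration.

```python
import re

_HEADER = re.compile(r"JOBID\s+USER\s+STAT\s+QUEUE")
# A data row starts with a numeric job id, then the user, then the status.
_ROW = re.compile(r"^(?P<id>\d+)\s+(?P<user>\S+)\s+(?P<stat>\S+)", re.M)


def parse_bjobs(text):
    """Map job id -> status for every row of a bjobs-style listing."""
    if _HEADER.search(text) is None:
        raise ValueError("unrecognised bjobs output")
    return {m.group("id"): m.group("stat") for m in _ROW.finditer(text)}


SAMPLE = """\
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
1001    alice   DONE  8nm        host01      host11      hello      Sep 23 10:00
1002    alice   RUN   8nm        host01      host12      analysis   Sep 23 10:05
"""

statuses = parse_bjobs(SAMPLE)
print(statuses)
```

A dictionary lookup per job replaces the `ids.index(...)` search, and a job missing from the listing simply has no entry.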

Page 31:

Hello CLI

• Hello World:
    # execute hello script locally
    from Ganga.CLI import *
    Job(exe='hello').submit()

• Hello DaVinci:
    # execute DaVinci on the LSF, GRID, ...
    # analysis will start at a worker node somewhere far far away ;-)
    j = Job(name='serious analysis', backend='LSF')
    j.application = DaVinciApplication(version='v12r3')
    j.application.optsfile = "DV-demo.opts"
    j.outputfiles = ["DVNtuples.hbook"]
    j.submit()

Page 32:

Jobs

• Jobs
    # registry of persistent jobs
    jobs()

    Statistics: 2 jobs
    registry
    --------------
       ID  status     name
    #   1  new        serious analysis
    #   2  submitted  hello

    # looping and selecting jobs
    j = jobs()[1]
    for j in jobs(): print j
    for j in jobs()[2:9]: j.name = 'important!'
    important = jobs()['important!']

Page 33:

Plugin Components

• Applications & Backends
    # list plugin components
    backends()
    ['TestSubmitter', 'Local', 'Glite']
    applications()
    ['DaVinciApplication', 'TestApplication', 'Executable']

    # creating objects
    app = DaVinciApplication(optionsfile='some.opts')
    bk = Local()
    j.application, j.backend = app, bk

    # creating objects by a string name
    j.application = 'DaVinciApplication'
    j.application.optionsfile = 'some.opts'
    j.backend = 'Local'

Page 34:

Templates and Copying

• Copy jobs
    # reuse existing jobs configuration to create new jobs
    j = other_job.copy()
    j = Job(template = other_job)

• Job templates
    # job templates are just like any other jobs
    # except that their sole purpose is to store job configuration
    t = JobTemplate(backend=LSF(queue='8nm'))
    j = Job(template = t)

    # templates are stored in a separate container
    templates()

    Statistics: 1 jobs
    templates
       ID  status    name
    #   1  TEMPLATE  None

Page 35:

Design Principles

• CLI Design Principles
  – Be predictable and follow the Python way of thinking
  – Increase complexity of interface with complexity of task:
    • Simple tasks – simple!
    • Complicated tasks – also simple ;) !
  – Try to prevent users from silent mistakes:
    • job.id = 5 # FAILS: id is a read-only property
    • finished_job.name = 'newname' # FAILS: job is finished so can't modify
  – Hide implementation:
    • job._impl.attrs['id'] = 5
  – Be convenient and guide users
    • j.application.exe <=> j.exe # ALIASES of properties
    • TAB completion shows properties and hides internals
  – Be flexible: good for writing complex macros/scripts...
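Three of these principles (read-only id, no changes to a finished job, property aliases) can be expressed with ordinary Python properties. The classes below are a toy written for this purpose and are not the Ganga classes of the same names; `mark_finished` is a hypothetical method.

```python
class Executable(object):
    """Toy application object holding the program to run."""

    def __init__(self, exe=None):
        self.exe = exe


class Job(object):
    """Toy job: read-only id, alias j.exe, frozen once finished."""

    def __init__(self, job_id, exe=None):
        # object.__setattr__ bypasses the checks in __setattr__ below.
        object.__setattr__(self, "_id", job_id)
        object.__setattr__(self, "_finished", False)
        object.__setattr__(self, "application", Executable(exe))
        object.__setattr__(self, "name", None)

    @property
    def id(self):
        # No setter is defined, so "job.id = 5" raises AttributeError.
        return self._id

    @property
    def exe(self):
        # Alias: j.exe reads j.application.exe.
        return self.application.exe

    @exe.setter
    def exe(self, value):
        self.application.exe = value

    def __setattr__(self, name, value):
        if self._finished:
            raise AttributeError("job is finished, cannot modify %r" % name)
        object.__setattr__(self, name, value)

    def mark_finished(self):
        object.__setattr__(self, "_finished", True)


job = Job(5, exe="/bin/echo")
job.exe = "/bin/date"          # goes through the alias
print(job.id, job.application.exe)
```

Both failure cases surface as an immediate AttributeError rather than a silently ignored assignment, which is the point of the "prevent silent mistakes" rule.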

Page 36:

Ganga Architecture

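[Diagram: on the Client side, GUI and CLIP call the GPI, e.g. j = Job(backend='LSF'); j.submit(). The GPI sits on Ganga.Core, which is connected to the Job Repository, the File Workspace (IN/OUT SANDBOX), Monitoring and the Plugin Modules: applications (Athena, Gaudi) and backends (AtlasPROD, DIAL, DIRAC, LCG2, gLite, localhost, LSF).]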