CETull@lbl.gov - Python & Grid (19dec02 - Trillium @ Caltech)
Python Scripting and Grid User Environments
Craig E. Tull
Trillium Analysis Environment for the Grid
December 19, 2002
Caltech - Pasadena, CA
Scripting Language Features
• typeless/not strongly typed: simpler connections, simpler syntax, easier/faster to learn (enough)
• error checking at the last possible moment: user interactivity
• interpreted: immediate feedback (e.g. data exploration)
• many instructions per line (higher level): complex tasks in few lines of code
• programmers write the same number of lines of code (LOC) per year
  - many details are handled automatically
Why Python?
• Easy to learn/read high-level scripting language
  - Very little syntax
• A large collection of modules to support common operations, e.g., networking, http, smtp, ldap, XML, Web Services, etc.
• Excellent for "gluing" together existing codes
  - Many automated tools for interfacing with C/C++/Fortran
• Support for platform-independent GUI components
• Runs on all popular OSs, e.g., UNIX, Win32, MacOS, etc.
• Support for Grid programming with pyGlobus, PyNWS, etc.
pyGMA
• LBNL - Data Intensive Distributed Computing Research Group (DIDC) - Dan Gunter
• an implementation of the Grid Monitoring Architecture (GMA) Producer, Consumer, and Directory Service Web-Services SOAP interfaces in Python
• uses the ZSI SOAP library to aid with serialization and deserialization of messages
• a framework that handles the SOAP communications between the monitoring components defined by the GGF
• the target "user" of pyGMA is a developer who wants to connect existing or newly created monitoring components into a GMA-compatible framework
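The three GMA roles named above can be sketched in plain Python. This is only an in-process illustration of the pattern (producers register with a directory, consumers look them up and then query them directly). It does not use pyGMA, ZSI, or SOAP, and every class and method name below is a hypothetical stand-in.

```python
class DirectoryService:
    """Registry mapping an event type to the producers that publish it."""

    def __init__(self):
        self._producers = {}

    def register(self, event_type, producer):
        self._producers.setdefault(event_type, []).append(producer)

    def lookup(self, event_type):
        return list(self._producers.get(event_type, []))


class Producer:
    """Holds the latest monitoring value for one event type."""

    def __init__(self, name, event_type, directory):
        self.name = name
        self.event_type = event_type
        self._latest = None
        directory.register(event_type, self)

    def publish(self, value):
        self._latest = value

    def query(self):
        return {"source": self.name, "type": self.event_type, "value": self._latest}


class Consumer:
    """Finds producers through the directory, then queries them directly."""

    def __init__(self, directory):
        self._directory = directory

    def collect(self, event_type):
        return [p.query() for p in self._directory.lookup(event_type)]


directory = DirectoryService()
cpu = Producer("node01", "cpu.load", directory)
cpu.publish(0.42)
events = Consumer(directory).collect("cpu.load")
print(events)
```

In a framework like the one described, each of these method calls would presumably travel as a SOAP message between separate processes; the sketch keeps everything in one process so the roles stay visible.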
PEG - Python Extensions for the Grid
• UCSD GRAIL (Grid Research and Innovation Lab)
• PyNWS
  - interface to the Network Weather Service (NWS) API library
    • routines for accessing the resource monitoring and forecasting services provided by NWS nameserver, memory, sensor, and forecaster processes
  - contains an extension module (nws) which provides functions that closely correspond to the defined NWS API for C/C++, as well as a Python module (nws_class) that provides higher-level support for performing some common NWS activities
Athena/Gaudi & Python
• A Python version of the Athena/Gaudi JobOptions.txt has existed for some time.
• The Athena Startup Kit (ASK)
• Integrate Athena, Atlas releases, and CMT for an improved end-user experience
• Built on top of existing tools rather than replacing them; it automates tasks otherwise left to the user
• Both GUI (simple) and CLI (powerful)
• Contains workarounds for broken releases (ugly!)
In addition: make Athena more interactive
=> GaudiPython / GANGA
pyRoot Motivation
• Pere Mato - ROOT 2002 Workshop
• Be able to use any ROOT class from Python in a generic way.
  - Without the need of wrapping each class
  - Using the ROOT object dictionary information
• Facilitate access of ROOT files and other facilities from non-ROOT applications
• Proof-of-concept that Python can be viewed as a Software Bus
  - In analogy to a "hardware bus" where you can plug a variety of modules and interface adaptors to other buses.
pyRoot Design
[Figure: pyRoot design. Components shown: Python interpreter; RootModule; Boost.Python; wrapper classes TClassWrap, TMethodWrap, TObjectWrap, TCArrayWrap, TROOTWrap; ROOT classes TClass, TMethod, TObject; CINT; other ROOT libraries.]
Example - Trivial Root
    C:\> python
    ...
    >>> from rootmodule import *
    >>> f1 = TF1('func1','sin(x)/x',0,10)
    >>> f1.Eval(3)
    0.047040002686622402
    >>> f1.Derivative(3)
    -0.34567505667199266
    >>> f1.Integral(0,3)
    1.8486525279994681
    >>> f1.Draw()
    <TCanvas::MakeDefCanvas>: created default TCanvas with name c1
• Not much difference between CINT and Python!
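The three numbers printed in the session above can be cross-checked without ROOT. The sketch below is plain Python, not pyRoot: it evaluates sin(x)/x directly, uses the analytic derivative, and approximates the integral with composite Simpson's rule. The helper names are illustrative only.

```python
import math


def f(x):
    """sin(x)/x, with the removable singularity at x = 0 filled in."""
    return 1.0 if x == 0 else math.sin(x) / x


def dfdx(x):
    """Analytic derivative of sin(x)/x for x != 0."""
    return (x * math.cos(x) - math.sin(x)) / (x * x)


def simpson(func, a, b, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


value = f(3)
slope = dfdx(3)
area = simpson(f, 0, 3)
print(value, slope, area)
```

The value and the integral agree closely with the ROOT output. The analytic slope differs from the printed `Derivative(3)` in about the sixth decimal place, which is what one would expect if ROOT estimates the derivative numerically.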
Example - ROOT + Excel
Filling an Excel spreadsheet from a ROOT ntuple
    # Get the ntuple from the ROOT file
    import rootmodule
    hfile = rootmodule.TFile('hsimple.root')
    ntuple = rootmodule.gROOT.FindObject('ntuple')
    entries = ntuple.GetEntries()
    nvar = ntuple.GetNvar()
    tuple = ntuple.GetArgs()

    # Initialize Excel
    import win32com.client
    excel = win32com.client.Dispatch('Excel.Application')
    wbook = excel.Workbooks.Add()
    wsheet = wbook.WorkSheets.Add()
    wsheet.Name = ntuple.GetTitle()

    # Fill Excel sheet
    for i in xrange(500) :
        ntuple.GetEntry(i)
        for j in range(nvar) :
            wsheet.Cells(i+1,j+1).value = tuple[j]

    # Make Excel sheet visible
    excel.Visible = 1
pyGlobus Overview
• The Python CoG Kit provides a mapping between Python and the Globus Toolkit™. It extends the use of Globus by enabling access to advanced Python features such as events and objects for Grid programming.
• Hides much of the complexity of Grid programming behind simple object-oriented interfaces.
• The Python CoG Kit is implemented as a series of Python extension modules that wrap the Globus C code.
• Provides a complete interface to GT2.0.
• Uses SWIG (http://www.swig.org) to help generate the interfaces.
Scripting/Adaptation Layers
Layer: description
• Native Lang Component: component written in native programming language (C, C++, etc.), e.g. globus_ftp_client, gram_client, …
• Shadow Class: 1-to-1 mapping (e.g. via SWIG)
• Presentation: map onto Python concepts/constructs
• Usability: apply the 80/20 rule for defaults to narrow interface
• Task-based: aggregate components for a common task
• Application: combine tasks in an application
(Diagram label: Python)
• Adaptation layering for pyGlobus shows excellent decomposition.
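A rough sketch of how these layers might stack up in code. Nothing here is the pyGlobus API: `native_transfer` stands in for a wrapped C function, the default port is only an illustrative choice, and all class, function, and parameter names are invented for this example.

```python
# Native component layer (stand-in): a C-style function with many positional
# arguments and an integer status code as its return value.
def native_transfer(src, dst, port, parallel_streams, buffer_size, log):
    log.append((src, dst, port, parallel_streams, buffer_size))
    return 0  # 0 means success, as in many C APIs


# Shadow class layer: 1-to-1 mapping onto the native call, nothing added.
class ShadowFtpClient:
    def __init__(self, log):
        self._log = log

    def transfer(self, src, dst, port, parallel_streams, buffer_size):
        return native_transfer(src, dst, port, parallel_streams, buffer_size, self._log)


# Presentation layer: map onto Python concepts (exceptions, not status codes).
class TransferError(Exception):
    pass


class FtpClient:
    def __init__(self, log):
        self._shadow = ShadowFtpClient(log)

    def transfer(self, src, dst, port, parallel_streams, buffer_size):
        status = self._shadow.transfer(src, dst, port, parallel_streams, buffer_size)
        if status != 0:
            raise TransferError(f"transfer failed with status {status}")


# Usability layer: 80/20 defaults narrow the interface to what most users need.
class SimpleFtpClient(FtpClient):
    def copy(self, src, dst, port=2811, parallel_streams=1, buffer_size=64 * 1024):
        self.transfer(src, dst, port, parallel_streams, buffer_size)


# Task-based layer: aggregate components for a common task.
def mirror_files(client, filenames, src_prefix, dst_prefix):
    for name in filenames:
        client.copy(src_prefix + name, dst_prefix + name)
    return len(filenames)


# Application layer: combine tasks.
call_log = []
copied = mirror_files(
    SimpleFtpClient(call_log), ["a.root", "b.root"], "gsiftp://hostA/data/", "file:///tmp/"
)
print(copied, call_log)
```

Each layer only talks to the one directly below it, so the defaults in the usability layer or the tasks above it can change without touching the wrapped native call.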
OGSI Plans for pyGlobus
• Develop a full OGSI implementation in Python
  - Planned alpha release of an OGSI client by the end of August
  - OGSI hosting environment based on WebWare (http://webware.sourceforge.net/)
• Dynamic web service invocation framework
  - Similar to WSIF (Web Services Invocation Framework) from IBM for Java
    • http://www.alphaworks.ibm.com/tech/wsif
  - Download and parse WSDL document, create request on the fly
  - Support for multiple protocol bindings to WSDL portTypes
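The "parse WSDL, create request on the fly" idea can be sketched with the standard library alone. The WSDL fragment, the `Counter` service, and the `DynamicProxy` class below are made up for illustration; the proxy only builds a description of the request and does not send SOAP messages or handle protocol bindings.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "{http://schemas.xmlsoap.org/wsdl/}"

# Hypothetical, heavily trimmed WSDL 1.1 document (no messages or bindings).
WSDL_TEXT = """<?xml version="1.0"?>
<definitions name="Counter" xmlns="http://schemas.xmlsoap.org/wsdl/">
  <portType name="CounterPortType">
    <operation name="add"/>
    <operation name="getValue"/>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return {portType name: [operation names]} found in a WSDL document."""
    root = ET.fromstring(wsdl_text)
    return {
        port_type.get("name"): [
            op.get("name") for op in port_type.findall(WSDL_NS + "operation")
        ]
        for port_type in root.findall(WSDL_NS + "portType")
    }


class DynamicProxy:
    """Exposes each WSDL operation as a method that builds a request dict."""

    def __init__(self, wsdl_text, port_type):
        self._port_type = port_type
        self._operations = list_operations(wsdl_text)[port_type]

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. operations.
        if name.startswith("_") or name not in self._operations:
            raise AttributeError(name)

        def build_request(**params):
            return {"portType": self._port_type, "operation": name, "params": params}

        return build_request


proxy = DynamicProxy(WSDL_TEXT, "CounterPortType")
request = proxy.add(value=5)
print(request)
```

Because the methods are generated from the document at run time, no stub code has to be written or compiled per service, which is the appeal of the WSIF-style approach.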
PyGlobus Status and Plans
• Users:
  - AccessGrid - Being rewritten using pyGlobus.
  - LIGO - Laser Interferometer Gravitational Wave Observatory
  - CAS - Community Authorization Service
  - NCAR - National Center for Atmospheric Research
• Current Work - Keith Jackson
  - ~5-6 developers actively involved
  - GT3/OGSI port underway
  - pyGlobus will be part of GT3 distribution
  - Looking for feedback on "Usability" and "Task" layers and on Framework.
GANGA Motivation
• ATLAS and LHCb develop applications within a common framework: Gaudi/Athena
• Both collaborations aim to exploit potential of Grid for large-scale, data-intensive distributed computing
Simplify management of analysis and production jobs for end-user physicists by developing tool for accessing Grid services with built-in knowledge of how Gaudi/Athena works:
Gaudi/Athena and Grid Alliance (GANGA)
Athena/GAUDI Architecture
[Figure: Athena/GAUDI architecture. Components shown: Application Manager; Algorithms; Converters; Event Data Service, Detector Data Service, and Histogram Service, each with a Persistency Service and data files; Transient Event Store, Transient Detector Store, and Transient Histogram Store; Message Service; JobOptions Service; Particle Properties Service; other services.]
Interfacing to the GRID
• GANGA: Gaudi/Athena and Grid Alliance
  - First ideas for GANGA were presented by P. Mato and C. Tull in summer 2001
  - Joint ATLAS/LHCb GridPP proposal
  - 2 funded FTEs
    • Karl Harrison
    • Alexander Soroko
  - May 2002 - Cosener's House GridPP Meeting
  - Technology Survey
    • Grappa, Genius, AliEn, Slice
  - Atlas/LHCb design team, including US representatives
[Figure: "Interfacing GAUDI with GRID" (P. Mato). Labels shown: GAUDI/Athena, GANGA GUI, GRID Services, two API boxes, JobOptions, Algorithms, Histograms, Monitoring, Results.]
Rule #1: Protect the User
  - Real Data vs. Virtual Data
  - LFN vs. PFN/TFN/SFN
  - Grid Enabled vs. Standalone
  - LSF/PBS/Condor
• We do not want the user of the Framework to know or care about details like this.
  - Implies: Uniform, abstract access to/specification of data sets (i.e. if Real and Virtual Data are to be used).
  - Non-Grid implementations of Grid-enabled Services?
  - Grid & Non-grid concepts must merge at UI.
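One way to read "Grid and non-Grid concepts must merge at the UI" is a single abstract interface with interchangeable back ends. The sketch below is an assumption about how that could look, not GANGA code: `DataCatalog`, `LocalCatalog`, `ReplicaCatalog`, and `open_dataset` are hypothetical names, the host names are placeholders, and the "Grid" catalogue is just an in-memory table.

```python
from abc import ABC, abstractmethod


class DataCatalog(ABC):
    """Uniform way to turn a logical file name (LFN) into physical ones (PFNs)."""

    @abstractmethod
    def resolve(self, lfn):
        """Return a list of physical file names for the given logical name."""


class LocalCatalog(DataCatalog):
    """Non-Grid implementation: files live under one local directory."""

    def __init__(self, base_dir):
        self._base_dir = base_dir.rstrip("/")

    def resolve(self, lfn):
        return [f"{self._base_dir}/{lfn}"]


class ReplicaCatalog(DataCatalog):
    """Grid-style implementation: one LFN may have several replicas."""

    def __init__(self, replicas):
        self._replicas = replicas

    def resolve(self, lfn):
        if lfn not in self._replicas:
            raise KeyError(f"no replica registered for {lfn}")
        return list(self._replicas[lfn])


def open_dataset(catalog, lfn):
    """User-facing call: identical whichever catalogue sits behind it."""
    return catalog.resolve(lfn)[0]


local = LocalCatalog("/data/run42/")
grid = ReplicaCatalog(
    {
        "higgs.root": [
            "gsiftp://se1.example.org/higgs.root",
            "gsiftp://se2.example.org/higgs.root",
        ]
    }
)
local_pfn = open_dataset(local, "higgs.root")
grid_pfn = open_dataset(grid, "higgs.root")
print(local_pfn, grid_pfn)
```

The user code names only the logical file; whether it resolves to a local path or a Grid replica is decided by which catalogue was configured.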
Interfacing to the Grid
[Figure: boxes for the GANGA Core module, Job class, Job Handler class, XML RPC, and EDG UI. Services and the tools grouped under each:]
• Job submission: dg-job-list-match, dg-job-submit, dg-job-cancel
• Security service: grid-proxy-init, MyProxy ?, GSI ?
• Job monitoring: dg-job-status, dg-job-get-logging-info, GRM/PROVE
• Data management service: edg-replica-manager, dg-job-get-output, globus-url-copy, GDMP?
Ganga Prototyping
[Screenshot: GANGA GUI prototype. Callouts: embedded Python interpreter; tree of user jobs; job options for selected job.]
Ganga Prototyping (current state)
• GUI is created using the wxPython extension module
• Access to the Gaudi Job Configuration DB is implemented with the xmlrpclib module
• User can browse and create Job Options files using this DB
• Serialization of objects (user jobs) is implemented with the Python pickle module
• Python interpreter is embedded into the GUI and allows user to configure interface from the command line
• GRID stuff is under development at the moment and is oriented toward EDG testbed 1.2
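Two of the points above, pickling user jobs and an embedded interpreter, need only the standard library. The sketch below is not GANGA code: the `Job` class and its fields are invented, and `code.InteractiveInterpreter` stands in for whatever embedding the GUI actually uses.

```python
import code
import pickle


class Job:
    """Hypothetical user job: a name plus a dictionary of job options."""

    def __init__(self, name, options=None):
        self.name = name
        self.options = dict(options or {})

    def __eq__(self, other):
        return (
            isinstance(other, Job)
            and (self.name, self.options) == (other.name, other.options)
        )


# Serialization: a job survives a round trip through pickle.
job = Job("reco-test", {"EvtMax": 100})
restored = pickle.loads(pickle.dumps(job))

# Embedded interpreter: run a line of user input against live objects,
# much as a command line inside a GUI would.
namespace = {"job": restored}
interpreter = code.InteractiveInterpreter(namespace)
needs_more_input = interpreter.runsource("job.options['EvtMax'] = 10")

print(restored.name, restored.options, needs_more_input)
```

`runsource` returns a false value once a complete statement has been executed, so a GUI console can use the return value to decide whether to show a continuation prompt.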
Conclusion
• Python Common themes
  - Native Lang for main functionality/performance
  - Scripting as Glue (à la Stallman)
  - Fast Prototyping
  - Easily layered GUI
    • wxWindows, tkinter
  - Adaptation Layering
  - Ease of adapting "Legacy" code
• Python is proving its promise as a fast, effective, object-oriented scripting language.
Conclusion
• Other languages (Perl, Ruby, etc.) and approaches (Grid Portals) exist and can/do work.
• But Python is the 1st choice for many in our field:
  - Middleware: pyGlobus, pyGMA, pyNWS
  - Physics: Athena/Gaudi, LCG, GANGA, LIGO
• Layering is crucial for coherent application
• Wrapping, gluing, and building from components is a natural use of Python and a real boon in code reuse.