View
215
Download
1
Category
Preview:
DESCRIPTION
Jürgen Knobloch API 3 ANAPHE Components Lizard Commander Histograms NTuples Fitting Plotting VectorOfPoints Functions Analyzer Abstract types HTL Tags (HepODBMS Gemini/HepFitting Qplotter VectorOfPoints CLHEP Class Libraries for HEP Python SWIG Objectivity/DB NAG-C Qt Implemetations (HEP-specific) non-HEP components User Interface - using Abstract Types AIDA (Abstract Interfaces for Data Analysis)
Citation preview
Analysis Software Strategy
Jürgen KnoblochHTASC, DESY 9 October 2001
AIDA
ANAPHE
LIZARD
API 2Jürgen Knobloch
Anaphe – Lizard - AIDA
• Anaphe - Libraries for Hep Computing– Full replacement of CERNLIB– Open source, mostly “license-free”
• Lizard is based on Anaphe components and the Python scripting language (through SWIG)
– PAW functionality– young but has very solid base in mature Anaphe– real plug-in structure
• AIDA project (Abstract Interfaces for Data Analysis)
– Interface definitions & classes for C++ and Java
API 3Jürgen Knobloch
ANAPHE Components
LizardCommander
HistogramsNTuplesFittingPlottingVectorOfPointsFunctionsAnalyzer
Abstract types
HTLTags (HepODBMSGemini/HepFittingQplotterVectorOfPoints
CLHEPClass Libraries for
HEP
Python SWIG
Objectivity/DBNAG-CQt
Implemetations (HEP-specific)
non-HEP components
User Interface - using Abstract Types
AIDA(Abstract
Interfaces for Data Analysis)
API 4Jürgen Knobloch
ANAPHE - History
• LHC++ “Libraries for Hep Computing”– 1995 project initiated with the aim to develop a modular
replacement of CERNLIB (lib and tools)– using standards (STL, OpenGL, ...) and commercial
components (e.g., NAG_C, OpenInventor, IrisExplorer)
• First iteration on physics data analysis tool – data driven approach (based on IRIS Explorer) – GUI based, not command line driven
• Request to create new physics analysis tool – September 99:new requirements defined together with
experiments– identified categories/components and Abstract Types
API 5Jürgen Knobloch
AIDA - History
• started in Sept. 1999 (HepVis 99, Orsay)– several (mini-) workshops since then
• main ones Paris 2000 and Boston 2001• release 1.0 summer 2000
– concentrated on “developers view”• Histogram package only
– IAxis, IHistogram, IHistogram1D, IHistogram2D, IHistogramFactory
• release 2.0 May 2001 (“Boston release”)– about 20 Interface classes – aiming at discussion and gathering feedback
API 6Jürgen Knobloch
History and Status - Lizard
• Started after CHEP-2000• Full version out since June 2001
– “PAW like” analysis functionality plus– on-demand loading of compiled code using shared
libraries• gives full access to experiment’s analysis code
and data– based on Abstract Interfaces
• flexible and extensible• License-free version available
API 7Jürgen Knobloch
USDP-like approach
• Start with OO analysis– collection of user requirements
• OO design phase– define categories and classes– find patterns
• Create prototype– considered “throw away”
• get feedback from users• Iterate, iterate, iterate ...
API 8Jürgen Knobloch
User requirements
• Ease of use (“like PAW”)• Foresee customization/integration
– e.g., use persistency/messaging/... from experiment
• Framework used will not be exposed/imposed– needs to be compatible with experiment’s
framework• Plan for extensions
– “code for now, design for the future”• Maximize flexibility/interoperability
API 9Jürgen Knobloch
Architecture Overview
• Maximize flexibility and re-use– Abstract Interfaces allow each component to
develop independently– not bound to a specific implementation of any
component (“plugin” style) – re-use of existing packages to implement
components reduces start-up time significantly• Identify and use patterns - avoid anti-patterns
– learn from other people’s experiences/failures
API 10Jürgen Knobloch
“Ignominy” Tool to Quantify Modularity
0123456789
101112
0% 10% 20% 30% 40% 50% 60% 70%% packages in cyclic dependency
NC
CD
= s
um s
ingl
e pa
ckag
e de
pend
enci
es n
orm
alis
ed to
a
bina
ry tr
ee
ATLAS
IGUANA
COBRA (CMS)
ROOT
ORCA (CMS)
Anaphe / Lizard
GEANT4
idealNCCD (“spaghetti index”)
1.0: good toolkit< 1.0: independent packages> 1.0: strongly-coupled
ideal toolkit
See:See:[8-024][8-024]
IncludesFortran
API 11Jürgen Knobloch
CLHEP
• HEP foundation class library– Random number generators– Physics vectors
• 3- and 4- vectors– Geometry– Linear algebra– System of units– more packages recently added
• will continue to evolve• wwwinfo.cern.ch/asd/lhc++/clhep/
API 12Jürgen Knobloch
2D Graphics libraries
• Qt – multi-platform C++ GUI toolkit
• C++ class library, not wrapper around C libs
• superset of Motif and MFC• available on Unix and MS Windows• no change for developer
– commercial but with public domain version– www.troll.no
• Qplotter– “add-on” functionality for HEP
• “HIGZ/HPLOT”
API 13Jürgen Knobloch
Basic 3D Graphic Libraries
• OpenGL (basic graphics)– De-facto industry standard for basic 3D graphics– Used in CAD/CAE, games, VR, medical imaging
•OpenInventor (scene mgmt.)–OO 3D toolkit for graphics–Cubes, polygons, text, materials–Cameras, lights, picking–3D viewers/editors,animation–Based on OpenGL/MesaGL
API 14Jürgen Knobloch
Mathematical Libraries
• NAG (Numerical Algorithms Group) C Library– Covers a broad range of functionality
• Linear algebra• differential equations• quadrature, etc.
– Special functions of CERNLIB added to Mark-6 release
• mostly for theory and accelerator • Quality assurance• extensive testing done by NAG
– www.nag.com
API 15Jürgen Knobloch
Histograms: the HTL package
• Histograms are the basic tool for physics analysis
– Statistical information of density distributions
• Histogram Template Library (HTL) – design based on C++ templates– Modular : separation between sampling and
display– Extensible : open for user defined binning
systems– Flexible: support transient/persistent at the
same time– Open: large use of abstract interfaces– recent addition: 3D histograms
API 16Jürgen Knobloch
Fitting and Minimization
• Fitting and Minimization Library (FML)
– common OO interface • NAG-C, MINUIT
– based on Abstract Interfaces• IVector, IModelFunction, …
• Gemini– common minimization
interface – very thin layer
API 17Jürgen Knobloch
Tags, Ntuples and Events
• NtupleTag Library– Ntuple navigation and analysis– common OO interface for different storage
• ODBMS • HBook (CERNLIB)
• Exploiting Tag concept – enhanced Ntuples– associated with an underlying persistent store– optional association to the Event may be used to navigate to
any other part of the Event• even from an interactive visualization program
– main use: speedup data selection for analysis…• Tag data is typically better clustered than the original
data
Object Association
API 18Jürgen Knobloch
Interactive Data Analysis
• Aim: “OO replacement for PAW”– analysis of “ntuple-like data” (“Tags”, “Ntuples”, …)– visualisation of data (Histograms, scatter-plot, “Vectors”)– fitting of histograms (and other data)– access to experiment specific data/code
• Maximize flexibility and re-use– plug-in structure– careful design with limited source and binary dependencies
• Foresee customization/integration– allow use from within experiment’s s/w framework!
API 19Jürgen Knobloch
Lizard: Interfaces
API 20Jürgen Knobloch
Anaphe components
API 21Jürgen Knobloch
Scripting (Architectural issue)
• Typical use of scripting is quite different from programming (reconstruction, analysis, ...)
– history “go back to where I was before”– repetition/looping - with “modifiable parameters”
• SWIG to (semi-) automatically create connection to chosen scripting language
– allows flexibility to choose amongst several scripting languages
– Python, Perl, Tcl, Guile, Ruby, (Java) …• Python - OO scripting, no “strange $!%-variables”
– other scripting languages possible (through SWIG)• Can be enhanced and/or replaced by a GUI
– scripting window within GUI application
API 22Jürgen Knobloch
Sample session
API 23Jürgen Knobloch
Future Enhancements
• Access to other implementations of components
– HBOOK histograms and ntuples (RWN) – mostly done
– OpenScientist, ROOT histograms?• Adding other “scripting” languages
– Perl , Tcl, cint ?• Communication with Java tools/packages
– via AIDA• JAS • WIRED
API 24Jürgen Knobloch
Distributed Computing
• Motivation– move code to data– parallel analysis
• Techniques– services via AI– late binding– plug-in architecture
• End-user (Lizard)– look-and-feel of local analysis
• R&D started and first prototype available soon– CORBA
API 25Jürgen Knobloch
The AIDA project
• AIDA project (Abstract Interfaces for Data Analysis)
• Presently active mainly developers from existing packages
– Tony Johnson (JAS)– Andreas Pfeiffer (Lizard/Anaphe)– Guy Barrand (OpenScientist )– Mark Dönszelmann (Wired)– Developers from LHCb/Gaudi
API 26Jürgen Knobloch
AIDA
• Design Interfaces for Data Analysis (in HEP)– “The goals of the AIDA project are to define
abstract interfaces for common physics analysis tools, such as histograms. The adoption of these interfaces should make it easier for developers and users to select to use different tools without having to learn new interfaces or change their code. In addition it should be possible to exchange data (objects) between AIDA compliant applications.” (http://aida.freehep.org)
• Open for contributions of any kind– questions, suggestions, code, implementations …
API 27Jürgen Knobloch
Motivation
• Minimize coupling between components• Provide flexibility to interchange implementations of
these interfaces• Allows and try to re-use existing packages
– even across “language boundaries”• e.g., C++ analysis using Java Histograms
• Allow for faster turn-around time
API 28Jürgen Knobloch
Components + Abstract Interfaces
• User Code uses only Interface classes
– IHistogram1D * hist = histoFactory-> create1D(‘track quality’, 100, 0., 10.)
• Actual implementations are selected at run-time
– loading of shared libraries
• No change at all to user code but keep freedom to choose implementation
Histo-IF Fitter-IF
User Code
Histo-Impl. 2
Fitter-Impl. Y
Histo-Impl. 1
Fitter-Impl. X
API 29Jürgen Knobloch
Architectural issue:Abstract Interfaces• Abstract Interfaces
– Only pure virtual methods, inheritance only from other Abstract Interfaces
– Components use other components only through their Abstract Interface
– Defines a kind of a “protocol” for a component – Allow each component to develop independently
• reduces maintenance effort significantly– Maximize flexibility and re-use of packages
• Re-use of existing packages to implement components reduces start-up time significantly
API 30Jürgen Knobloch
Architectural issue: Components (I)
• Identify components by functionality• Define “protocol” using Abstract Interfaces• Emphasize separation of different aspects for
each component– Example: Histogram
• statistical entity (density distribution of a physics quantity)
• view of a “collection of data points” (which can be a density distribution but also a detector efficiency curve)
• command to manipulate/store/plot/fit/...
API 31Jürgen Knobloch
Architectural issue: Components (II)
• “User’s view” is different from “implementor’s (developer’s) view”
– separate Abstract Interfaces for both aspects – “command-layer” vs. “implementation-layer(s)”
• UserInterface as a separate component– by definition couples to most of the other
components • Facade pattern
– promotes weak coupling between the other components
– interfaces to scripting and/or GUI
API 32Jürgen Knobloch
Initial Categories and dependencies
API 33Jürgen Knobloch
Across the languages
• JAida : C++ access to Java libs– using C++ proxies implementing the C++ Abstract Interfaces
to the Java interfaces
C++UserCode
AIDA-IF C++ AIDA-IF Java
Java Lib
JAida
API 34Jürgen Knobloch
Infrastructure (I)
• Location of repository (Java and C++)– Anonymous CVS access – :pserver:anoncvs@cvs.freehep.org:/cvs/aida
(passwd aida)– module: aida
• “Release area” for C++ code – /afs/cern.ch/sw/contrib/AIDA/<version>/AIDA/
• #include <AIDA/IHistogram.h>• version : x.y.p
– tarballs (C++) available in /afs/cern.ch/sw/contrib/AIDA/tar/AIDA-<version>.tar.gz
API 35Jürgen Knobloch
Infrastructure (III)
• Mailing lists (archived) – <listName>@cern.ch
• project-aida-dev (open)• project-aida (open)• project-aida-announce (posting moderated,
subscription open)• Web page (http://aida.freehep.org/)
– Updated automatically from repository– On web page links to implementations
• from whoever provides one (and informs us)
API 36Jürgen Knobloch
Release schedule / plan
• Release for discussion and feedback – even if not (yet) “complete”
• V 2.0 mid May 2001 (“Boston release”)– C++ and Java version of the Interfaces
• V 2.1 Aug. 2001 (“Genova release”)– updates from discussions at Geant-4 workshop– fixed some problems with C++ version
• Aiming for 2-3 month release frequency
API 37Jürgen Knobloch
More information
• cern.ch/Anaphe• cern.ch/Anaphe/Lizard• aida.freehep.org/• cern.ch/DB• wwwinfo.cern.ch/asd/lhc++/clhep/
API 38Jürgen Knobloch
“Use-cases” of AIDA
• Java reference implementation in FreeHEP repository
• JAS, OpenScientist and Lizard/Anaphe plan for implementations of version 2.x by Dec. 2001
• Used by Gaudi/Athena (LHCb, Atlas, Harp)– Gaudi people involved in design
• Adopted and used in Geant-4 examples/testing– new category created in Geant-4 for analysis
• No need to go for “least common denominator”– use “reasonable” superset and concentrate on design
API 39Jürgen Knobloch
AIDA - Summary
• Design of Abstract Interfaces for Data Analysis
– Maximize flexibility and re-use– Allow for faster turn-around time – Allows for and try to re-use existing packages
• No need to go for “least common denominator”
– use “reasonable” superset – concentrate on proper design
Recommended