Alice Grid Activities

Peter Malzacher (GSI) for the Alice Grid Project, [email protected]
Seminar Datenverarbeitung in der Hochenergiephysik (Data Processing in High Energy Physics), DESY, 27.5.2002
ALICE GRID Project

G.Andronico (CT), T.Anticic (IRB), R.Barbera (CT), I.J.Bloodworth (Birmingham), M.Botje (NIKHEF), P.Buncic (CERN),
F.Carminati (CERN), P.Cerello (TO), G.Chabratova (DUBNA), J.Cleymans (Capetown), J.G.Contreras (MERIDA), D.DiBari (BA),
DeGirolamo (BA), K.Flurchick (OSC), J.Gosset (SACLAY), A.Grigoryan (Yerevan), D.Johnson (OSC), M.L.Luvisetto (BO),
P.Malzacher (GSI), M.Masera (CERN/TO), A.Masoni (CA), K.Mkoyan (Yerevan), D.Mura (CA), L.Nellen (UNAM), B.Nilsen (OSU),
G.Odinyek (LBNL), D.Olson (LBNL), S.K.Pal (Calcutta), F.Pierella (BO), F.Rademakers (CERN), I.Sacreida (NERSC),
P.Saiz (CERN), Y.Schutz (SUBATECH), K.Schwarz (GSI), E.Scomparin (TO), M.Sitta (TO), R.Turrisi (PD), D.Vicinanza (SA)

38 people, 21 institutions
http://www.to.infn.it/activities/experiments/alice-grid
Overview

• LHC computing challenge
  - MONARC, EU DG, LCG
• The Grid metaphor
  - Globus
• Alice Grid Activities
  - The Alice Offline Project
  - Remote job submission and monitoring
  - AliEn
  - Interface Root-Globus, Root-AliEn, Root-...
  - Distributed analysis with PROOF
• Summary
Overview

→ LHC computing challenge
  - MONARC, EU DG, LCG
• The Grid metaphor
  - Globus
• Alice Grid Activities
  - The Alice Offline Project
  - Remote job submission and monitoring
  - AliEn
  - Interface Root-Globus, Root-AliEn, Root-...
  - Distributed analysis with PROOF
• Summary
LEP (end 2000) --> LHC (start 2007)
The LHC Detectors

[Figure: the LHC experiments - ATLAS, CMS, LHCb, ...]
Computing Requirements

Summary of Computing Capacity Required for all LHC Experiments in 2007
(ATLAS with 270 Hz trigger)

                       |--------- CERN ---------|   Regional    Grand
                        Tier 0    Tier 1   Total     Centres     Total
Processing (K SI95)      1,727       832   2,559       4,974     7,533
Disk (PB)                  1.2       1.2     2.4         8.7      11.1
Magnetic tape (PB)        16.3       1.2    17.6        20.3      37.9

1 SPECint95 = 10 CERN units = 40 MIPS; today a dual PIII ~ 100 SPECint95

source: CERN/LHCC/2001-004 - Report of the LHC Computing Review - 20 February 2001
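For scale, a back-of-the-envelope conversion using the SPECint95 figures above (an illustrative estimate, not a number from the slides):

  7,533 K SI95 = 7,533,000 SI95  /  (~100 SI95 per dual PIII)  ≈  75,000 dual-PIII machines of 2002 vintage.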
The MONARC RC Topology

[Diagram: CERN - Tier 0 at the centre; Tier 1 Regional Centres (FNAL RC, IN2P3 RC, ...) connected at 622 Mbps - 2.5 Gbps; Tier 2 sites (Lab a, Uni b, Lab c, ..., Uni n) connected at 155 Mbps; below them department-level resources (α, β, γ) and desktops.]
HEP Data Grid Initiative

• European level coordination of national initiatives & projects
• Principal goals:
  - Middleware for fabric & Grid management
  - Large scale testbed - major fraction of one LHC experiment
  - Production quality HEP demonstrations
    · "mock data", simulation analysis, current experiments
  - Other science demonstrations
• Three year phased developments & demos
• Complementary to other GRID projects
  - EuroGrid: uniform access to parallel supercomputing resources
• Synergy to be developed (GRID Forum, Industry and Research Forum)
Work Packages

WP 1   Grid Workload Management     C.Vistoli/INFN
WP 2   Grid Data Management         B.Segal/CERN
WP 3   Grid Monitoring Services     R.Middleton/PPARC
WP 4   Fabric Management            T.Smith/CERN
WP 5   Mass Storage Management      J.Gordon/PPARC
WP 6   Integration Testbed          F.Etienne/CNRS
WP 7   Network Services             C.Michau/CNRS
WP 8   HEP Applications             F.Carminati/CERN
WP 9   EO Science Applications      C.Michau/CNRS
WP 10  Biology Applications         C.Michau/CNRS
WP 11  Dissemination                G.Mascari/CNR
WP 12  Project Management           F.Gagliardi/CERN
More realistically - a Grid Topology

[Diagram: the same tier structure as the previous slide - CERN Tier 0, Tier 1 centres (FNAL RC, IN2P3, ...) at 622 Mbps - 2.5 Gbps, Tier 2 sites (Lab a, Uni b, Lab c, ..., Uni n) at 155 Mbps, departments (α, β, γ) and desktops - redrawn as a grid topology rather than a strict hierarchy.]
LHC Computing Model

[Diagram: CERN as "The LHC Computing Centre" (Tier 0/Tier 1); national Tier 1 centres in Germany, France, Italy, UK, NL, USA (FermiLab, Brookhaven), ...; Tier 2 sites (Lab a, Uni a, Lab b, Uni b, ..., Uni n); physics department resources (α, β, γ) and desktops.]
LHC Computing Review

• Report issued in February 2001
• Recognition of
  - Size of LHC computing: 240 MCHF initially and every 3 years
  - Understaffing of both IT and Offline core teams
  - Distributed hierarchical computing model (MONARC)
• Recommendations
  - GRID technology development
  - Common projects
  - Establishment of proper management structure
  - Continued support for common libraries and introduction of CERN support for FLUKA & ROOT
  - Data challenges on common Prototype testbed
  - Improvement of the planning of the Off-line Projects
  - Drafting of software MoU
The LHC Computing Grid Project Structure

[Organisation chart: the Project Overview Board and the Project Manager with the Project Execution Board (implementation teams) at the centre; the Software and Computing Committee (SC2) provides requirements and monitoring via RTAGs, reports and reviews; the LHCC reviews the project and the Common Computing RRB handles resource matters; links to the EU DataGrid Project, other HEP Grid projects, other Computing Grid projects and other labs.]
LHC GRID Computing Phase 1

• Technology development and data challenges
• Production prototype at CERN, MS & NMS 2001-2004
  - Technical Design Report in 2003
  - ~50% in complexity of one LHC experiment
• Training and technology opportunities
• Approved in the Council Meeting of 20-Sept-01
  - ~40 FTEs for three years (more pledged)
  - ~30 MCHF (>10 pledged from various sources)
• Critical level for "approval to start" reached
• Still need more funds for the CERN prototyping
  - Contributions in kind, EU-FP6, industry contributions
• Discussion with M & NM States, agencies, industries
  - Work out convention with each funding agency
The Problem - Summary

• Scalability → cost → management
  - Thousands of processors, thousands of disks, PetaBytes of data, Terabits/second of I/O bandwidth, ...
• Wide-area distribution
  - WANs are and will be 1% of LANs
  - Distribute, replicate, cache, synchronize the data
  - Multiple ownership, policies, ...
  - Integration of this amorphous collection of Regional Centers ...
  - ... with some attempt at optimization
• Adaptability
  - We shall only know how analysis is done once the data arrives
Overview

• LHC computing challenge
  - MONARC, EU DG, LCG
→ The Grid metaphor
  - Globus
• Alice Grid Activities
  - The Alice Offline Project
  - Remote job submission and monitoring
  - AliEn
  - Interface Root-Globus, Root-AliEn, Root-...
  - Distributed analysis with PROOF
• Summary
The GRID metaphor

• Unlimited ubiquitous distributed computing
• Transparent access to multi-petabyte distributed data bases
• Easy to plug in
• Hidden complexity of the infrastructure
• Analogy with the electrical power GRID

Ian Foster and Carl Kesselman, editors, "The Grid: Blueprint for a New Computing Infrastructure," Morgan Kaufmann, 1999, http://www.mkp.com/grids
Five Emerging Models of Networked Computing (from "The Grid")

• Distributed Computing
  - || synchronous processing
• High-Throughput Computing
  - || asynchronous processing
• On-Demand Computing
  - || dynamic resources
• Data-Intensive Computing
  - || databases
• Collaborative Computing
  - || scientists
How to build a Grid Application?

• Do It Yourself
• Legion = MetaOS on top of local OS
  "Legion is a reflective object-based wide-area system designed from first principles to address the difficult technical challenges of using distributed resources (cycles, data, people, applications)"
  - www.legion.virginia.edu
• Globus = Protocol/Toolkit approach
  "The Globus Toolkit provides a set of standard services for authentication, resource location, resource allocation, configuration, communication, file access, fault detection, and executable management. These services can be incorporated into applications and/or programming tools in a 'mix-and-match' fashion to provide access to needed capabilities"
  - www.globus.org
Globus Approach

• Focus on architecture issues
  - Propose set of core services as basic infrastructure
  - Use to construct high-level, domain-specific solutions
• Design principles
  - Keep participation cost low
  - Enable local control
  - Support for adaptation

[Layer diagram: Applications on top of diverse global services, built on the core Globus services (GASS, GRAM, GIS, GSI), running on the local OS.]
"Data Grid" Architecture Elements

[Diagram: applications on top; below them higher-level services - attribute-based lookup (metadata cataloging), replica selection (location cataloging), reliable replication, virtual data cataloging and caching, task management (Condor-G), data request management; at the bottom a CPU resource manager (enquiry via LDAP, access via GRAM) and a storage resource manager (enquiry via LDAP, access protocol still open: "???").]
The Grid from a Services View

[Layer diagram:
  Applications - e.g. Chemistry, Biology, Cosmology, High Energy Physics, Environment
  Application Toolkits - e.g. Distributed Computing Toolkit, Data-Intensive Applications Toolkit, Collaborative Applications Toolkit, Remote Visualization Applications Toolkit, Problem Solving Applications Toolkit, Remote Instrumentation Applications Toolkit
  Grid Services (Middleware) - resource-independent and application-independent services: authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection
  Grid Fabric (Resources) - resource-specific implementations of basic services, e.g. transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass]
The Globus Advantage

• Single sign-on for all resources
  - No need for user to keep track of accounts and passwords at multiple sites
  - No plaintext passwords
• Uniform interface to various local scheduling mechanisms
  - PBS, Condor, LSF, NQE, LoadLeveler, fork, etc.
  - No need to learn and remember obscure command sequences at different sites
• Support for staging, etc.
Overview

• LHC computing challenge
  - MONARC, EU DG, LCG
• The Grid metaphor
  - Globus
→ Alice Grid Activities
  - The Alice Offline Project
  - Remote job submission and monitoring
  - AliEn
  - Interface Root-Globus, Root-AliEn, Root-...
  - Distributed analysis with PROOF
• Summary
ALICE Collaboration

[Chart: collaboration statistics 1990-2004 (membership on a 0-1200 scale), with the LoI, TP, MoU and TRD milestones marked.]
[Map: member countries - Armenia, China, Croatia, Czech Rep., Denmark, Finland, France, Germany, Greece, Hungary, India, Italy, JINR, Mexico, Netherlands, Norway, Poland, Portugal, Romania, Russia, S. Korea, Slovakia, Sweden, Switzerland, UK, Ukraine, USA, and CERN.]

937 members (63% from CERN MS)
77 Institutions, 28 Countries
ALICE Computing

• 1800 kSI95, 2.8 PB tapes/y, >1 PB disk
  - Same order of magnitude as ATLAS or CMS
• Major strategic decisions already taken
  - Move to C++ completed beginning 1999
    · TDRs all produced with the new framework
    · PPR production being done with AliEn/AliRoot
  - Adoption of the ROOT framework
  - Tightly knit Off-line team - single development line
  - Physics performance and computing in a single team
  - Ambitious DC program on the LHC prototype
  - ALICE DAQ/Computing integration realised during data challenges in collaboration with ALICE DAQ team and IT division
AliRoot layout

[Diagram: AliRoot layered on top of ROOT, with the STEER core, the EVGEN event generators, and a Virtual MC interface to G3, G4 and FLUKA.]
Alice Offline Framework

• > 50 active users from the different detector groups participate in the development of AliRoot
  - 71% of the code developed outside, 29% by the core Offline team
• AliRoot framework
  - C++: 400 kLOC + 225 kLOC (generated) + macros: 77 kLOC
  - FORTRAN: 13 kLOC (ALICE) + 914 kLOC (external packages)
  - Maintained on Linux (any version!), HP-UX, DEC Unix, Solaris
• Two packages to install (ROOT + AliRoot)
  - Less than 1 second to link (thanks to 37 shared libs)
  - 1-click-away install: download and make (non-recursive makefile)
• AliEn
  - 50 kLOC of PERL5 (ALICE) + 1 MLOC of PERL5 (external packages)
• Installed on more than 30 sites by physicists
Offline Milestones

Physics Challenge
• Determine readiness of the offline framework for data processing
• Verify the computing model

Data Challenge
• Verify integration of Offline and DAQ
• Assess technologies and models
• Determine readiness of DAQ system

Physics Data Challenges (PDC):

PDC Period (milestone)  (Size)  Physics Objective
06/01-06/02              5%     First test of the complete chain from simulation to reconstruction for the PPR. Simple analysis. ROOT digits.
01/04-06/04             10%     Complete chain used for trigger studies. Prototype of the analysis tools. Comparison with parameterised MonteCarlo. Simulated raw data.
01/06-06/06             20%     Test of the final system for reconstruction and analysis.

ALICE Data Challenges (ADC):

ADC Milestone  (Data to MSS)        Offline milestone
10/2002         200 TB |  200 MB/s  Rootification of raw data. Raw data for TPC and ITS.
5/2003          300 TB |  300 MB/s  Integration of single detector HLT code, at least for TPC and ITS. Quasi on-line reconstruction at CERN. Partial data replication to remote centres.
5/2004          450 TB |  450 MB/s  HLT prototype for all detectors that plan to make use of it. Remote reconstruction of partial data streams. Raw digits for barrel and MUON.
5/2005          750 TB |  750 MB/s  Prototype of the final HLT software. Prototype of the final remote data replication. Raw digits for all detectors.
5/2006         1250 TB | 1250 MB/s  Final system.
ALICE GRID resources

[Map of participating sites: Bari, Birmingham, Bologna, Cagliari, Capetown (ZA), Catania, CERN, Dubna, FZK, GSI, IRB, Kolkata (India), LBL/NERSC, Lyon, Merida, NIKHEF, OSU/OSC, Padova, Saclay, Torino, Yerevan.]

38 people, 21 institutions
ALICE/GRID Strategy

• Test / use Globus tools
• Physics Performance Report (Step 1)
  - Involving progressively more sites (Catania, CERN, GSI/FZK, Lyon, OSC, NIKHEF, NERSC, Torino)
• AliKit + AliEn (+ Globus & MRTG & ...)
  - Don't wait for EU-DG1
• Integration with DataGRID (Testbed 1)
  - AliKit into EU-DG1: straightforward
  - Evaluate EU-DG1: services
  - AliEn into EU-DG1: interaction with middleware WPs
• ALICE Data Challenge IV
• Development of distributed parallel analysis tools (PROOF)
ALICE Physics Performance Report

• Evaluation of acceptance, efficiency, resolution for signals
• Step 1: Simulation of ~10,000 central Pb-Pb events
  - Hits: 2 GB/event = 20 TB
  - Summable Digits: 1 GB/event = 10 TB
  - AliEn (Globus Authentication + Data Catalog), 1 user (Production Manager)
• Step 2: Signal superposition + reconstruction, 100,000 events
  - Reconstructed Events = 10 TB x n_signals
  - ROOT + Data Manager for access to distributed input: n_signals users (Production Managers)
• Step 3: Event analysis: Analysis Object Data = 1 TB x n_signals
  - PROOF + Data Manager + Workload Manager for chaotic access to distributed analysis input: 200 users
Remote Job Submission and Monitoring

[Screenshots: web interface to log in on "The Grid" and to submit jobs; monitoring of CPU load, disk space and network availability; bookkeeping system (only at Lyon); LDAP server for ALICE (only in Italy).]
ALICE GRID: Monitoring

[Screenshot: monitoring of CPU load, disk availability and network load.]
ALICE GRID August Production

• Bookkeeping included:
  - Check production & keep track of the event parameters for selection
  - Central server (Catania) + mirrors (Lyon, Bologna): http://alipc1.ct.infn.it/PPR/PPRmain.php?off=0&nrow=20&rev=1
  - MySQL based, ORACLE interface available
ALICE in EU DG

• First HEP event produced on the DataGRID testbed released Dec 10, '01
• ALICE provides HEP coordination
AliEn

• Distributed file catalogue, simulating a file structure.
• Each directory is a table in a SQL database, abstract interface to DBs.
• Ownership and privileges like in a UNIX system.
• Completely transparent access to mass storage systems.
• Automatic synchronization among all the databases.
• Secure authentication
• Replication & caching
• User shell interface, C/C++ API
• 150 kLOC perl5, 95% from the WEB
• SOAP for client/server and API implementation

P.Buncic, P.Saiz, J-E.Revshback
AliEn: ALICE production distributed Environment

• Entirely ALICE developed
• File Catalogue as a global file system on a RDB
• TAG Catalogue, as extension
• Secure authentication
  - Interface to Globus available
• Central Queue Manager ("pull" vs "push" model; see the conceptual sketch below)
  - Interface to EDG Resource Broker available
• Monitoring infrastructure

The CORE GRID functionality
• Automatic software installation with AliKit
• http://alien.cern.ch
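To illustrate the "pull" scheduling model named above: each site runs an agent that asks a central queue for work matching its free resources, instead of a central broker pushing jobs to sites. The following is a conceptual C++ sketch only, not AliEn's actual Perl implementation; all type and job names are hypothetical.

// Conceptual sketch of a "pull" scheduler (hypothetical types, not AliEn code).
#include <deque>
#include <iostream>
#include <optional>
#include <string>

struct Job { std::string name; int cpuNeeded; };   // simplified job description

class CentralQueue {                               // stands in for the central queue manager
   std::deque<Job> jobs_;
public:
   void submit(Job j) { jobs_.push_back(std::move(j)); }
   // A site agent *pulls*: it asks for a job it can actually run.
   std::optional<Job> request(int freeCpu) {
      for (auto it = jobs_.begin(); it != jobs_.end(); ++it)
         if (it->cpuNeeded <= freeCpu) { Job j = *it; jobs_.erase(it); return j; }
      return std::nullopt;                         // nothing suitable: the agent stays idle
   }
};

int main() {
   CentralQueue queue;
   queue.submit({"hijing-sim-001", 1});
   queue.submit({"reco-batch-017", 4});
   // Site agent loop (one iteration shown): advertise free capacity, pull, run.
   if (auto job = queue.request(/*freeCpu=*/2))
      std::cout << "pulled " << job->name << '\n'; // would be dispatched to the local batch system
}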
HI PPR Production Summary

• Entirely AliEn based!
• >5500 jobs validated (~24h each)
• 13 clusters actively used in production, 9 remote sites
• GSI, Karlsruhe, Dubna, Nantes, Budapest, Bari, Zagreb, Birmingham are joining
• Hijing param 8000/4000/2000 dN/dy
• Hijing centrality bins
  - fm: 0-5, 0-2, 5-8.6, 8.6-11.2, 11.2-13.2, 13.2-15, 15-inf.
• HIJING + Jet/Photon Trigger
  - Minimum transverse momentum: 25, 50, 75, 100 GeV
[Chart: production sites - CERN, CCIN2P3, LBL, Torino, Catania, Padova, OSC, NIKHEF, Bari.]
Other productions

[Chart labels: LBNL, Lyon, Ohio, Darmstadt]

• EMCAL production
  - Design and optimisation of the proposed EM calorimeter
  - Entirely AliEn based
  - Decided, implemented and realised in two months
  - 2000 jobs, 4000 h CPU
• pp production August 2001
  - ~10,000 events generated at CERN in CASTOR
  - Transport + TPC digitization
• pp production October 2001
  - Vertex sampling in the diamond
  - >10,000 events - 80 GB in 3 days in Torino, transferred to CERN with AliEn in ~20 hours (bandwidth saturated)
  - Test and tune tracking
The Vision:

You execute a root script. The Grid:
• Finds convenient places for it to be run
• Organises efficient access to your data
  - Caching, migration, replication of data and/or code
• Deals with authentication to the different sites that you will be using
• Interfaces to local site resource allocation mechanisms, policies
• Compiles/runs your scripts
• Monitors progress
• Recovers from problems
If there is scope for parallelism, it can also decompose your work into convenient execution units based on the available resources, data distribution, . . .

This can work only if there is a close cooperation of ROOT and Grid services!
ROOT with the help of Grid Services: "Show me the equation of state of quark gluon plasma"
Use Case: (chaotic) Analysis

[Diagram: a physicist submits a Root script; it runs across CERN, FNAL, RAL, IN2P3, Tier 2 sites (Lab a, Uni b, Lab c, ..., Uni n) and department resources (α, β, γ).]
DataGrid & ROOT

[Diagram: a local ROOT session holds the selection parameters and the procedure (Proc.C); PROOF distributes copies of Proc.C to remote CPUs, which read from a TagDB/RDB and the distributed data bases DB1-DB6.]

Bring the KB to the PB and not the PB to the KB
Parallel Script Execution

[Diagram: a local PC running root with a script ana.C (stdout/objects returned via TFile/TNetFile) talks to a remote PROOF cluster - a proof master server plus proof slave servers on node1..node4, each reading local *.root files.]

Local session:
  $ root
  root [0] .x ana.C
  root [1] gROOT->Proof("remote")
  root [2] gProof->Exec(".x ana.C")

PROOF cluster configuration (#proof.conf):
  slave node1
  slave node2
  slave node3
  slave node4
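The slides do not show the contents of ana.C; a minimal hypothetical macro of the kind that could be pushed through gProof->Exec() might look like the following (file, tree and branch names are illustrative assumptions only):

// ana.C - hypothetical analysis macro (illustrative, not from the slides).
void ana()
{
   // Open an input file; on PROOF each slave would act on its local data.
   TFile *f = TFile::Open("events.root");
   if (!f || f->IsZombie()) return;

   TTree *t = (TTree *) f->Get("AOD");   // tree name assumed for illustration
   if (!t) return;

   // Fill and draw a simple distribution.
   TH1F *h = new TH1F("hpt", "transverse momentum;p_{T} (GeV);entries", 100, 0., 10.);
   t->Draw("pt >> hpt");                 // branch name "pt" is an assumption
   h->Draw();
}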
Running a PROOF Job

// Tree-based data set
gROOT->Proof();
TDSet *treeset = new TDSet("TTree", "AOD");
treeset->Add("lfn:/alien.cern.ch/alice/prod2002/file1");
. . .
gProof->Process(treeset, "myselector.C");

// Object-based data set
gROOT->Proof();
TDSet *objset = new TDSet("MyEvent", "*", "/events");
objset->Add("lfn:/alien.cern.ch/alice/prod2002/file1");
. . .
objset->Add(set2003);
gProof->Process(objset, "myselector.C");
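The selector referenced above is not shown in the slides. A minimal sketch of what a myselector.C based on ROOT's TSelector could look like is given below; the class name, histogram and binning are assumptions, not part of the original material.

// myselector.C - hypothetical TSelector skeleton (illustrative only).
#include <TSelector.h>
#include <TTree.h>
#include <TH1F.h>

class MySelector : public TSelector {
public:
   TH1F *fHist;

   MySelector() : fHist(0) {}

   void Begin(TTree * /*tree*/) {
      // Called once before the event loop: book histograms.
      fHist = new TH1F("hmult", "charged multiplicity", 100, 0., 10000.);
   }
   Bool_t Process(Long64_t entry) {
      // Called for every entry; on PROOF this runs on the slave holding the data.
      // Read the entry from the input tree and fill fHist here.
      return kTRUE;
   }
   void Terminate() {
      // Called once after the loop: draw or save the results.
      if (fHist) fHist->Draw();
   }

   ClassDef(MySelector, 0)
};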
TGrid Class - Abstract Interface to AliEn

class TGrid : public TObject {
public:
   virtual Int_t        AddFile(const char *lfn, const char *pfn) = 0;
   virtual Int_t        DeleteFile(const char *lfn) = 0;
   virtual TGridResult *GetPhysicalFileNames(const char *lfn) = 0;
   virtual Int_t        AddAttribute(const char *lfn,
                                     const char *attrname,
                                     const char *attrval) = 0;
   virtual Int_t        DeleteAttribute(const char *lfn,
                                        const char *attrname) = 0;
   virtual TGridResult *GetAttributes(const char *lfn) = 0;
   virtual void         Close(Option_t *option="") = 0;
   virtual const char  *GetInfo() = 0;
   const char          *GetGrid() const { return fGrid; }
   const char          *GetHost() const { return fHost; }
   Int_t                GetPort() const { return fPort; }
   virtual TGridResult *Query(const char *query) = 0;
   static TGrid        *Connect(const char *grid, const char *uid = 0,
                                const char *pw = 0);

   ClassDef(TGrid,0)   // ABC defining interface to GRID services
};
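As an illustration of how this interface is meant to be used for catalogue operations (a hypothetical snippet: the logical and physical file names and the attribute are invented, and error handling is omitted):

// Hypothetical use of the abstract TGrid interface above (names are invented).
TGrid *grid = TGrid::Connect("alien://alien.cern.ch", "rdm");

// Register a new logical file and attach a searchable attribute to it.
grid->AddFile("/alice/prod2002/run17/galice.root",
              "rfio://castor.cern.ch//castor/cern.ch/alice/run17/galice.root");
grid->AddAttribute("/alice/prod2002/run17/galice.root", "dNdy", "8000");

// Look up where the physical replicas of a logical file live.
TGridResult *pfns = grid->GetPhysicalFileNames("/alice/prod2002/run17/galice.root");

grid->Close();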
Running PROOF Using AliEn

TGrid *alien = TGrid::Connect("alien://alien.cern.ch", rdm);
TGridResult *res;
res = alien->Query("lfn:///alice/simulation/2001-04/V0.6*.root");
TDSet *treeset = new TDSet("TTree", "AOD");
treeset->Add(res);

gROOT->Proof();
gProof->Process(treeset, "myselector.C");

// plot/write objects produced in myselector.C
. . .
DataGrid & ROOT

[Sequence diagram: ROOT sends the selection parameters to the TAG DB and gets back the selected events (LFNs, #hits); the Grid Resource Broker consults the Grid MDS, the performance log and monitor, the replica catalog and a cost evaluator to find the best places; PROOF tasks are spawned there, using Grid network services, the replica manager and Grid authentication; after the PROOF loop the output LFNs are registered in the replica catalog, the Root RDB is updated, results are sent back, and everything is logged and monitored on the Grid.]
First Steps:

✓ Use GSI to provide a single sign-on via "grid-id" (prototype ready for PROOF)
✓ globus-job-run to start remote daemons
  - we need jobmanager-fork or another way to launch interactive processes
✓ TLdap to access MDS (see the query sketch after this list)
  - API: OpenLDAP
  - Information model: Globus, DataGrid
    · Mapping of events to files
    · Mapping of files to replica
    · Resource broker
    · ...
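Since the slide names OpenLDAP as the API for talking to MDS, here is a minimal sketch of such a query through the OpenLDAP C API (usable from C++). The host, port, base DN, filter and object class are illustrative assumptions, not ALICE configuration, and the deprecated synchronous calls are used for brevity.

// Minimal MDS/GRIS query via the OpenLDAP C API (illustrative sketch).
#define LDAP_DEPRECATED 1     // enable the old synchronous API in modern OpenLDAP
#include <ldap.h>
#include <cstdio>

int main()
{
   LDAP *ld = ldap_init("gris.example.org", 2135);          // GRIS conventionally listened on 2135
   if (!ld) return 1;

   if (ldap_simple_bind_s(ld, NULL, NULL) != LDAP_SUCCESS)  // anonymous bind
      return 1;

   LDAPMessage *result = 0;
   const char *base   = "Mds-Vo-name=local, o=Grid";        // Globus-style base DN
   const char *filter = "(objectclass=MdsHost)";            // object class name illustrative

   if (ldap_search_s(ld, base, LDAP_SCOPE_SUBTREE, filter, NULL, 0, &result) == LDAP_SUCCESS) {
      for (LDAPMessage *e = ldap_first_entry(ld, result); e; e = ldap_next_entry(ld, e)) {
         char *dn = ldap_get_dn(ld, e);                     // print the DN of every matching entry
         std::printf("found: %s\n", dn);
         ldap_memfree(dn);
      }
      ldap_msgfree(result);
   }
   ldap_unbind(ld);
   return 0;
}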
MDS Approach

• Based on LDAP
  - Lightweight Directory Access Protocol v3 (LDAPv3)
  - Standard data model
  - Standard query protocol
• Globus specific schema
  - Host-centric representation
• Globus specific tools
  - GRIS, GIIS
  - Data discovery, publication, ...

[Diagram: local information providers (SNMP, NIS, NWS, ...) feed a GRIS; GRISes register with a GIIS; middleware and applications query them through LDAP / the LDAP API.]
Overview

• LHC computing challenge
  - MONARC, EU DG, LCG
• The Grid metaphor
  - Globus
• Alice Grid Activities
  - The Alice Offline Project
  - Remote job submission and monitoring
  - AliEn
  - Interface Root-Globus, Root-AliEn, Root-...
  - Distributed analysis with PROOF
→ Summary
The Grid from a Services View

[Slide repeated from earlier: the Applications / Application Toolkits / Grid Services (Middleware) / Grid Fabric (Resources) layering.]
AliEn as a meta-GRID

[Diagram: the AliEn User Interface sitting on top of the AliEn, iVDGL and EDG software stacks.]
Long Term Use Case: (chaotic) Analysis

[Diagram: a physicist's Root session talking to a Resource Broker, a Replica Location Catalog and an Event Catalog, with data spread over CERN, FNAL, RAL, IN2P3, Tier 2 sites (Lab a, Uni b, Lab c, ..., Uni n) and department resources (α, β, γ).]

The physicist executes a root (C++) script to analyse events. root:
• finds the name of the file(s) with the events
• locates the file (and replica)
• decides, based on network speed and load, CPUs/disks available and file usage pattern, whether
  - to fetch the file,
  - to use only parts (via rootd), or
  - to execute the script at the remote side (via PROOF).
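The three access modes listed above can be illustrated with plain ROOT calls. This is a hypothetical snippet: the host and file names are invented, and the decision between the modes is exactly what the envisaged Grid services would automate.

// Illustrative sketch of the three access modes (invented host/file names).
{
   // 1. Fetch / open a local copy of the file.
   TFile *flocal = TFile::Open("galice.root");

   // 2. Read only the needed parts remotely through a rootd server.
   TFile *fremote = TFile::Open("root://datasrv.example.org//alice/run17/galice.root");

   // 3. Ship the work to the data: run the analysis on a remote PROOF cluster.
   gROOT->Proof("proofmaster.example.org");
   gProof->Exec(".x ana.C");
}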