P. Capiluppi - CSN1 Catania, 18/09/2002
CMS LHC-Computing
Paolo Capiluppi
Dept. of Physics and INFN
Bologna
Outline
- Milestones and CMS-Italy Responsibilities
  - CCS (Core Computing and Software) milestones
  - Responsibilities (CMS Italy)
- Productions (Spring 2002)
  - Goals and main issues
  - Available resources
  - Work done
- Data Challenge 04
  - Goals and plans
  - CMS Italy participation and plans (preliminary)
  - LCG role: Tier1 and Tier2s (and Tier3s)
- LCG and Grid
  - What's LCG
  - Grid real results and strategies
- Conclusion
Milestones (CCS and …externals)

DAQ TDR                                  November 2002
LCG-1 phase 1                            June 2003
LCG-1 phase 2                            November 2003
End EU-DataGrid / EU-DataTAG Projects    December 2003
End US-GriPhyN Project                   December 2003
Data Challenge 04 (5%)                   "February" 2004
End US-PPDG Project                      December 2004
CCS TDR                                  November 2004
LCG-3                                    December 2004
Data Challenge 05                        April 2005
End LCG Phase 1                          December 2005
Physics TDR                              December 2005
Data Challenge 06                        April 2006
CMS-Italy official Responsibilities
- CCS SC (Core Computing and Software Steering Committee)
  - Grid Integration Level 2 manager (Claudio Grandi)
  - INFN contact (Paolo Capiluppi)
- CCS FB (CCS Financial Board)
  - INFN contact (Paolo Capiluppi)
- PRS (Physics Reconstruction and Selection)
  - Being recruited/refocused for the Physics TDR
  - Muons (Ugo Gasparini); Tracker/b-tau (Lucia Silvestris)
- LCG (LHC Computing Grid Project)
  - SC2 (Software and Computing Steering Committee): Paolo Capiluppi, alternate of David Stickland
  - Detector Geometry & Material Description RTAG (Requirements Technical Assessment Group) chairperson (Lucia Silvestris)
  - HEPCAL (HEP Common Application Layer) RTAG (Claudio Grandi)
- CCS Production Team
  - INFN contact (Giovanni Organtini)
“Spring 2002 Production” (and Summer extension)
- Goal of Spring 2002 Production: DAQ TDR simulations and studies
  - ~6 million events simulated, then digitized at different luminosities: NoPU (2.9M), 2x10^33 (4.4M), 10^34 (3.8M)
  - CMSIM started in February with CMS125; digitization with ORCA-6 started in April
  - First analysis completed (just!) in time for the June CMS week
- Extension of activities: “Summer 2002 Production”
  - Ongoing ‘ntuple-only’ productions: high-pt jets for the e- group (10 M), non-recycled pileup for the JetMet group (300 K)
- Over 20 TB of data produced CMS-wide
  - Most available at CERN, lots at FNAL; FNAL, INFN, UK also hosting analysis
  - Some samples analyzed at various T2s (Padova/Legnaro, Bologna, …)
- Production tools obligatory: IMPALA, BOSS, DAR, RefDB
  - BOSS is an official CMS production tool: INFN developed (A. Renzi and C. Grandi) and maintained (C. Grandi)!
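To make the role of BOSS concrete: its core idea is to wrap a batch job, filter the job's output with user-defined patterns, and record the extracted information in a database for monitoring. The sketch below is NOT the real BOSS code; the function name, schema, and filter format are invented for illustration, assuming only the wrap-filter-store idea described above.

```python
# Minimal sketch of the BOSS idea (job wrapping + output filtering).
# Names, schema, and filter format are illustrative, not the real tool.
import re
import sqlite3
import subprocess
import sys

def run_wrapped(job_id, cmd, filters, db_path="boss_sketch.db"):
    """Run `cmd`, scan its stdout with regex `filters` ({key: pattern}),
    and persist the matches for this job in a local database."""
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS jobinfo (job_id, key, value)")
    for key, pattern in filters.items():
        m = re.search(pattern, out)
        if m:
            conn.execute("INSERT INTO jobinfo VALUES (?, ?, ?)",
                         (job_id, key, m.group(1)))
    conn.commit()
    return {row[1]: row[2] for row in
            conn.execute("SELECT * FROM jobinfo WHERE job_id=?", (job_id,))}

# Example: track how many events a (fake) simulation job processed.
info = run_wrapped(
    job_id="job001",
    cmd=[sys.executable, "-c", "print('processed events: 125000')"],
    filters={"events": r"processed events: (\d+)"},
)
print(info)  # {'events': '125000'}
```

The point of the output-filter approach is that the production framework need not understand each application: any job whose log contains a recognizable line can be monitored.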
Spring02: CPU Resources
[Pie chart: CPU share by site — Wisconsin 18%, INFN 18%, CERN 15%, IN2P3 10%, Moscow 10%, FNAL 8%, RAL 6%, IC 6%, UFL 5%, Caltech 4%, UCSD 3%, Bristol 3%, HIP 1%]

11 RCs (~20 sites), about 1000 CPUs and 30 people CMS-wide. Some new sites & people, but lots of experience too.
Production in the RCs

RC name            CMSIM (K)  2x10^33 (K)  10^34 (K)  Objy size (TB)
CERN                  870        1670        1970        10.4
Bristol/RAL           547          60          20         0.4
Caltech               214         146           -         0.5
Fermilab              345         251         332         2.5
INFN (9 sites)       1545         719         709         3.0
IN2P3                 200           -           -           -
Moscow (4 sites)      425           -           -         0.2
UCSD                  338         278         288         1.8
UFL                   540          40          40         0.2
Wisconsin              67          54           -         0.3
Imperial College      878         147         121         1.4
Thanks to: Giovanni Organtini (Rm), Luciano Barone (Rm), Alessandra Fanfani (Bo), Daniele Bonacorsi (Bo), Stefano Lacaprara (Pd), Massimo Biasotto (LNL), Simone Gennai (Pi), Nicola Amapane (To), et al.
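As a quick consistency check, the CMSIM column of the table above (figures in thousands of events) sums to roughly the ~6 million simulated events quoted earlier:

```python
# Per-RC CMSIM production from the table above, in thousands of events.
cmsim_k = {
    "CERN": 870, "Bristol/RAL": 547, "Caltech": 214, "Fermilab": 345,
    "INFN (9 sites)": 1545, "IN2P3": 200, "Moscow (4 sites)": 425,
    "UCSD": 338, "UFL": 540, "Wisconsin": 67, "Imperial College": 878,
}
total_k = sum(cmsim_k.values())
print(f"total simulated: {total_k}K events")  # 5969K, i.e. ~6 million
# INFN's share of the simulated sample (a different metric from the
# CPU-share pie above, so the numbers need not agree):
print(f"INFN share: {cmsim_k['INFN (9 sites)'] / total_k:.0%}")
```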
[Plot: CMSIM production, Feb. 8th to June 6th — 6 million events, ~1.2 seconds per event sustained for 4 months]
DC04, 5% Data Challenge
- Definition: 5% of 10^34 running, or 25% of 2x10^33 (startup)
  - One month of “data taking” at CERN, 50 M events
  - It represents a factor 4 over Spring 2002, consistent with the goal of doubling complexity each year to reach a full-scale (for LHC startup) test by Spring 2006
  - Called DC04 (and the others DC05, DC06) to get over the % confusion
- More importantly: previous challenges have mostly been about doing the digitization
  - This one will concentrate on the reconstruction, data distribution and early analysis phase
  - Move the issue of the “Analysis Model” out of the classroom and into the spotlight
Setting the Goals of DC04
As defined to the LHCC, the milestone consists of: CS-1041, 1 April 2004, 5% data challenge complete (now called DC04).

The purpose of this milestone is to demonstrate the validity of the software baseline to be used for the Physics TDR and in the preparation of the Computing TDR. The challenge comprises the completion of a “5% data challenge”, which successfully copes with a sustained data-taking rate equivalent to 25 Hz at a luminosity of 0.2 x 10^34 cm^-2 s^-1 for a period of 1 month (approximately 5 x 10^7 events). The emphasis of the challenge is on the validation of the deployed grid model on a sufficient number of Tier-0, Tier-1, and Tier-2 sites. We assume that 2-3 of the Tier-1 centers and 5-10 of the Tier-2 centers intending to supply computing to CMS in the 2007 first LHC run would participate in this challenge.
DC04: Two Phases
- Pre-Challenge (must be successful)
  - Large-scale simulation and digitization
  - Will prepare the samples for the challenge
  - Will prepare the samples for the Physics TDR work to get fully underway
  - Progressive shakedown of tools and centers: all centers taking part in the challenge should participate in the pre-challenge
  - The Physics TDR and the Challenge depend on successful completion
  - Ensure a solid baseline is available; worry less about being on the cutting edge
- Challenge (must be able to fail)
  - Reconstruction at “T0” (CERN)
  - Distribution to “T1s”, subsequent distribution to “T2s”
  - Assign “streams” and “analyses” to people at T1 and T2 centers
  - Some will be able to work entirely within one center; others will require analysis of data at multiple centers
  - Grid tools tested for data movement and job migration
DC04: Setting the Scale
- Aim is 1 month of “running” at 25 Hz, 20 hours per day
  - 50 million reconstructed events (passing L1 Trigger and mostly passing HLT, but some background samples also required)
- Simulation (GEANT4!): 100 TB, 300 kSI95.months
  - A 1 GHz P3 is 50 SI95; working assumption that most farms will be at 50 SI95/CPU in late 2003
  - Six months of running for 1000 CPUs (worldwide); actually aim for more CPUs to get production time down
- Digitization: 75 TB, 15 kSI95.months, 175 MB/s pileup bandwidth (if we allow two months for digitization)
- Reconstruction at T0 (CERN): 25 TB, 23 kSI95 for 1 month (460 CPUs @ 50 SI95/CPU)
- Analysis at T1s/T2s: design a set of tasks such that the offsite requirement during the challenge is about twice that of the “T0”
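The scale figures above are easy to cross-check from the stated assumptions (25 Hz for 20 h/day over 30 days, 50 SI95 per CPU):

```python
# Cross-check of the DC04 scale figures quoted above.
RATE_HZ = 25           # sustained trigger rate during "data taking"
HOURS_PER_DAY = 20
DAYS = 30              # one month of running
SI95_PER_CPU = 50      # working assumption: 1 GHz P3 ~ 50 SI95

# Events collected in one month of running (~5 x 10^7, as quoted)
events = RATE_HZ * HOURS_PER_DAY * 3600 * DAYS
print(f"events: {events/1e6:.0f}M")

# Simulation: 300 kSI95.months on a 1000-CPU farm
sim_kSI95_months = 300
farm_kSI95 = 1000 * SI95_PER_CPU / 1000      # 50 kSI95 total
print(f"simulation: {sim_kSI95_months / farm_kSI95:.0f} months")  # 6 months

# Reconstruction at T0: 23 kSI95 sustained for one month
reco_cpus = 23_000 / SI95_PER_CPU
print(f"T0 reconstruction: {reco_cpus:.0f} CPUs")  # 460 CPUs
```

The event count comes out at 54 million, which is indeed "approximately 5 x 10^7" as the LHCC milestone states, and the CPU figures reproduce the 6-month/1000-CPU and 460-CPU numbers on the slide.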
CMS-Italy and DC04
- Participation in the Challenge: ~20% contribution; use of 1 Tier1 (common) and 3-4 Tier2s
- All Italian sites will possibly participate in the pre-challenge phase
- Use all available and validated (CMS-certified) Grid tools for the pre-challenge phase
- Coordinate resources within LCG for both pre-challenge and challenge phases, where possible (Tier1/INFN must be fully functional: ~70 CPU boxes, ~20 TB)
- Use the CMS Grid Integrated environment for the Challenge (February 2004)
- Participate in the preparation:
  - Build the necessary resources and define the Italian commitments
  - Define the Data Flow Model
  - Validation of Grid tools
  - Integration of Grid and Production tools (review and re-design)
CMS-Italy DC04 Preparation
- Use the tail of the “Summer Production” to test and validate resources and tools (grid and non-grid): November/December 2002
- Participate in the Production-Tools Review: now (Claudio Grandi, Massimo Biasotto); hopefully contribute to the new tools’ development (early 2003)
- Make the “new” software available at all the sites (T1, T2s, T3s)
- Use some of the resources to test and validate Grid Integration: already in progress at the Tier1 (CMS resources) and at Padova
- Commit and validate (for CMS) the resources for DC04: see following slide
- Define the participation in the LCG-1 system: see following slide
CMS Italy DC04 preliminary plans

- All the current and coming resources of CMS Italy will be available for DC04, possibly within the LCG Project
- Small amount of resources requested for 2003
  - Smoothly integrate the resources into LCG-1
  - Continue to use dedicated resources for tests of Grid and Production tools’ integration
- Needs for the funding of the other 3-4 Tier2s; request for common CMS Italy, sub judice in 2003: 60 CPUs and 6 TB disk + switches
  - Present a detailed plan and a clear Italian commitment to CMS
  - Will complete already existing farms
  - We are particularly “low” in disk storage availability, which is essential for physics analysis
CMS Italy DC04 LCG preliminary plans
- Tier1 plans common to all experiments: see F. Ruggieri’s presentation
- LNL partially funded in 2002 (24 CPUs, 3 TB) for LCG participation; the remaining resources are CMS directly funded

Name & location of Regional Centre: INFN - Laboratori Nazionali di Legnaro (LNL)
Experiments served by the resources noted below: CMS

Preliminary commitment of possibly available resources, by year:

Year                                    2002  2003  2004  2005
Processor farm: no. of processors         50    80   110   150
Disk storage: total capacity (TB)          5     8    15    25
DC04 Summary
- With the DAQ TDR about to be completed, the focus moves to the next round of preparations:
  - The Data Challenge series, to reach full-scale tests in 2006
  - The baseline for the Physics TDR
  - The prototypes required for CMS to write a CCS TDR in 2004
- Start to address the analysis model
- Start to test the data and task distribution models
- Perform realistic tests of the LCG Grid implementations
- Build the distributed expertise required for LHC Computing
- DC04 will occupy us for most of the next 18 months
LCG
- LCG = LHC Computing Grid project (PM: Les Robertson)
  - CERN-based coordination effort (hardware, personnel, software, middleware) for LHC Computing; worldwide! (Tier0, Tier1s and Tier2s)
  - Funded by participating agencies (INFN too)
- Two phases:
  - 2002-2005: preparation and setting-up (including tests, R&D and support for experiments’ activities)
  - 2006-2008: commissioning of the LHC Computing System
- Five (indeed four!) areas of activity for Phase 1:
  - Applications (common software and tools) (Torre Wenaus)
  - Fabrics (hardware, farms’ tools and architecture) (Bernd Panzer)
  - Grid Technologies (middleware development) (Fabrizio Gagliardi)
  - Grid Deployment (resources management and run) (Ian Bird)
  - Grid Deployment Board (agreements and plans) (Mirco Mazzucato)
- Many boards: POB (funding), PEB (executive), SC2 (advisory), …
Grid projects (CMS-Italy leading roles)
- Integration of Grid tools and Production tools almost done (Italy, UK, France main contributions; thanks to CNAF people and DataTAG personnel)
  - We can submit (production) jobs to the DataGrid testbed via the CMS Production tools (modified IMPALA/BOSS/RefDB)
  - Prototypes working correctly on the DataTAG test layout
  - Will test at large scale on the DataGrid/LCG Production Testbed
  - Will measure performances to compare with “summer production” classic jobs (November 2002)
- Integration of EU/US Grid/production tools
  - Already in progress in the GLUE activity
  - Most of the design (not only for CMS) is ready; implementation in progress. Target for (first) delivery by end of 2002
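To illustrate what a grid submission of this kind involves: EDG jobs are described in a JDL (Job Description Language) file that the production tools generate and hand to the workload management system. The sketch below builds a minimal JDL with the basic attributes; the executable and file names are invented for illustration, and this is not the actual IMPALA code.

```python
# Sketch: generate a minimal EDG-style JDL file for one production job.
# Attribute set is the basic EDG JDL; executable/file names are invented.

def make_jdl(executable, arguments, sandbox_files,
             stdout="job.out", stderr="job.err"):
    """Return JDL text describing one grid job."""
    sandbox = ", ".join(f'"{f}"' for f in sandbox_files)
    return "\n".join([
        f'Executable    = "{executable}";',
        f'Arguments     = "{arguments}";',
        f'StdOutput     = "{stdout}";',
        f'StdError      = "{stderr}";',
        f'InputSandbox  = {{{sandbox}}};',
        f'OutputSandbox = {{"{stdout}", "{stderr}"}};',
    ])

jdl = make_jdl("run_cmsim.sh", "--run 12345", ["run_cmsim.sh", "cards.txt"])
print(jdl)
# The actual submission to the workload management system would then be
# done with the EDG client tools (e.g. dg-job-submit) from a user interface.
```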
Proposal for a DC04 diagram

[Diagram: proposed DC04 production/analysis architecture. Recoverable components and flows:]

- Dataset Catalogue (REPTOR/Giggle + Chimera?): holds dataset definitions; receives new dataset requests (dataset definition, dataset input specification, dataset algorithm specification); dataset metadata updated as jobs complete
- Production on demand: IMPALA/MOP performs job creation and input data location, then job submission (DAG/JDL + scripts) to the EDG Workload Management System / VDT Planner, which assigns jobs to resources
- Information system: MDS (LDAP) publishes resource status; the workload system retrieves it
- Job monitoring: BOSS & R-GMA with the BOSS-DB; job type definition, job monitoring definition, job output filtering, production monitoring
- Software distribution (REPTOR/Giggle? PACMAN?): software release catalogue; SW download & installation of the experiment software at the sites
- Data handling: EDG SE / VDT Server hold the data (read data, write data, copy data, data management operations); jobs run on EDG CE / VDT server; submission from EDG UI / VDT Client; EDG L&B tracks jobs; push data or info / pull info