22 Giugno 2004 P. Capiluppi - CSN1 Pisa

CMS Computing: results and prospects

Outline:
- Schedule
- Pre Data Challenge 04 Production
- Data Challenge 04: design and aims, SW and MW components, results, lessons
- Prospects and upcoming activities
- Conclusions

Note: little on the pre-Challenge production (PCP), mostly an update of what was presented in September at Lecce
CMS Computing schedule

2004:
- Mar/Apr: DC04 to study T0 reconstruction, data distribution, real-time analysis; 25% of startup scale
- May/Jul: data available and useable by the PRS groups
- Sep: PRS analysis feedback
- Sep: draft CMS Computing Model in CHEP papers
- Nov: ARDA prototypes
- Nov: milestone on interoperability
- Dec: Computing TDR in initial draft form [NEW milestone date]

2005:
- July: LCG TDR and CMS Computing TDR [NEW milestone date]
- Post July?: DC05, 50% of startup scale [NEW milestone date]
- Dec: Physics TDR [~based on post-DC04 activities]

2006:
- DC06: final readiness tests
- Fall: computing systems in place for LHC startup
- Continuous testing and preparations for data
CMS 'permanent' production

Strong contribution of INFN and the CNAF Tier-1 to CMS past & future productions: 252 assignments ("assid"s) in PCP-DC04, for all production steps, both local and (when possible) Grid

The system is evolving into a permanent production effort…
[Plot: number of datasets produced per month, 2002 to 2004: 'Spring02prod' and 'Summer02prod' (CMKIN, CMSIM + OSCAR), with the pre-DC04 start, the DC04 start and the digitisation activity marked]

T. Wildish
PCP @ INFN statistics (4/4)

Digitisation (2x10^33) step: ~43 Mevts in CMS; ~7.8 Mevts (~18%) done by INFN

Note: strong contribution to all steps by the CNAF T1, but only outside DC04 (during the DC it was too hard for the CNAF T1 to also be a Regional Centre!!)

[Plots: 2x10^33 digitisation step, INFN only and all CMS; DC04 window Feb 04 - May 04, 24 M events in 6 weeks; digitisation continued through the DC!]

CMS production steps: Generation → Simulation → ooHitformatting → Digitisation

D. Bonacorsi
PCP grid-based prototypes

EU-CMS: submit to the LCG scheduler
- CMS-LCG "virtual" Regional Center
- 0.5 Mevts Generation ["heavy" pythia] (~2000 jobs of ~8 hours* each, ~10 KSI2000 months)
- ~2.1 Mevts Simulation [CMSIM+OSCAR] (~8500 jobs of ~10 hours* each, ~130 KSI2000 months), ~2 TB of data
- CMSIM: ~1.5 Mevts on CMS/LCG-0; OSCAR: ~0.6 Mevts on LCG-1
(* on a PIII 1 GHz)

Constant integration work in CMS between the CMS software and production tools and the evolving EDG-X/LCG-Y middleware, in several phases:
- CMS "Stress Test" stressing EDG < 1.4, then:
- PCP on the CMS/LCG-0 testbed
- PCP on LCG-1
- … towards DC04 with LCG-2

D. Bonacorsi
Aim of Data Challenge 04

Aim of DC04:
- reach a sustained 25 Hz reconstruction rate in the Tier-0 farm (25% of the target conditions for LHC startup)
- register data and metadata to a catalogue
- transfer the reconstructed data to all Tier-1 centers
- analyze the reconstructed data at the Tier-1's as they arrive
- publicize to the community the data produced at the Tier-1's
- monitor and archive performance metrics of the ensemble of activities, for debugging and post-mortem analysis

Not a CPU challenge, but a full-chain demonstration!

Pre-challenge production in 2003/04:
- 70M Monte Carlo events produced (30M with Geant4)
- Classic and grid (CMS/LCG-0, LCG-1, Grid3) productions

It was a "challenge", and every time a scalability limit of a component was found, that was a success!!
Data Challenge 04: layout

[Diagram: the Tier-0 (Castor, IB, fake on-line process, RefDB, POOL RLS catalogue, TMDB, GDB, EB, ORCA RECO jobs, Tier-0 data-distribution agents, LCG-2 services) feeds the Tier-1s (Tier-1 agent, T1 storage, MSS, ORCA analysis and grid jobs), which in turn feed the Tier-2s (T2 storage, ORCA local jobs, physicists)]

Only one Tier-2 in DC04: LNL (INFN)

Full chain (except the Tier-0 reconstruction) done in LCG-2, but only for INFN and PIC
Not without pain…

By C. Grandi

[Plot: 30 Mar 04, rates from GDB to EBs, for RAL, IN2P3, FZK; FNAL; INFN, PIC]
A. Fanfani
Data Challenge 04: numbers

Pre Challenge Production (PCP04) [Jul03-Feb04]:
- Simulated events: 75 M [750k jobs, ~800k files, 5000 KSI2000 months, 100 TB of data] (~30 M with Geant4)
- Digitised events (raw): 35 M [35k jobs, 105k files]
- Where: INFN, USA, CERN, …; in Italy: ~10-15 M events (~20%)
- For whom (Physics and Reconstruction Software groups): "Muons", "B-tau", "e-gamma", "Higgs"

Data Challenge 04 [Mar04-Apr04]:
- Events reconstructed (DST) at the CERN Tier-0: ~25 M [~25k jobs, ~400k files, 150 KSI2000 months, 6 TB of data]
- Events distributed to the Tier1-CNAF and Tier2-LNL: the same ~25 M events and files
- Events analysed at the Tier1-CNAF and Tier2-LNL: >10 M [~15k jobs, each ~30 min of CPU]

Post Data Challenge 04 [May04-…]:
- Events to reprocess (DST): ~25 M
- Events to analyse in Italy: ~50% of 75 M events
- Events to produce and distribute: ~50 M
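A quick sanity check on the numbers above (a throwaway Python sketch, not any CMS tool; figures are the rounded ones from this slide): the 25 Hz target is a peak rate, while the average over the whole reconstruction window is far lower.

```python
# Back-of-envelope checks on the DC04 bookkeeping (rounded slide figures).
DAY = 86_400  # seconds

def average_rate_hz(events: float, days: float) -> float:
    """Average event rate sustained over a period, in Hz."""
    return events / (days * DAY)

# Tier-0 reconstruction: ~25 M events over the ~51-day DC04 window.
avg_hz = average_rate_hz(25e6, 51)
print(f"average T0 rate ~{avg_hz:.1f} Hz, against the 25 Hz peak target")

# Events per job, PCP simulation vs DC04 reconstruction.
pcp_evts_per_job = 75e6 / 750e3   # ~100 events per simulation job
dc04_evts_per_job = 25e6 / 25e3   # ~1000 events per reconstruction job
```

The ~5.7 Hz average versus the 25 Hz peak is just the ratio of quiet to full-rate running during the challenge.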
Data Challenge 04: MW and SW components

CMS specific:
- Transfer Agents, to transfer the DST files (at CERN and the Tier-1s)
- Mass Storage Systems on tape (Castor, Enstore, etc.) (at CERN and the Tier-1s)
- RefDB, database of dataset requests and assignments (at CERN)
- Cobra, the CMS software framework (CMS wide)
- ORCA, OSCAR (Geant4), the CMS reconstruction and simulation (CMS wide)
- McRunJob, job-preparation system (CMS wide)
- BOSS, job-tracking system (CMS wide)
- SRB, file replication and catalogue system (at CERN, RAL, Lyon and FZK)
- MySQL-POOL, POOL backend on a MySQL database (at FNAL)
- ORACLE database (at CERN and the Tier1-INFN)

LCG "common":
- User Interfaces including Replica Manager (at CNAF, Padova, LNL, Bari, PIC)
- Storage Elements (at CNAF, LNL, PIC)
- Computing Elements (at CNAF, LNL and PIC)
- Replica Location Service (at CERN and the Tier1-CNAF)
- Resource Broker (at CERN and at CNAF-Tier1-Grid-it)
- Storage Replica Manager (at CERN and FNAL)
- Berkeley Database Information Index (at CERN)
- Virtual Organization Management System (at CERN)
- GridICE, monitoring system (on the CEs, SEs, WNs, …)
- POOL, persistency catalogue (in the CERN RLS)

US specific:
- Monte Carlo distributed production system (MOP) (at FNAL, Wisconsin, Florida, …)
- MonaLisa, monitoring system (CMS wide)
- Custom McRunJob, job-preparation system (at FNAL and perhaps Florida)
Data Challenge 04 Processing Rate
Processed about 30M events
But DST “errors” make this pass not useful for analysis
Generally kept up at T1’s in CNAF, FNAL, PIC
Got above 25Hz on many short occasions
But only one full day above 25Hz with full system
Working now to document the many different problems
Data Challenge 04: data transfer from CERN to INFN

CNAF - Tier1: exercise with 'big' files

A total of >500k files and ~6 TB of data transferred CERN T0 → CNAF T1
- max number of files per day: ~45000 on March 31st
- max size per day: ~400 GB on March 13th (>700 GB considering the "Zips")

~340 Mbps (>42 MB/s) sustained for ~5 hours (maximum was 383.8 Mbps)

[Plots: global CNAF network and GARR network use, May 1st - May 2nd]
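As a unit check on the figures above (illustrative arithmetic only), the sustained network rate converts as follows:

```python
# Bits-to-bytes check of the sustained CERN -> CNAF transfer rate.
def mbps_to_mb_s(mbps: float) -> float:
    """Convert megabits/s to megabytes/s (1 byte = 8 bits)."""
    return mbps / 8.0

rate_mb_s = mbps_to_mb_s(340)            # ~42.5 MB/s, matching ">42 MB/s"
volume_gb = rate_mb_s * 5 * 3600 / 1000  # ~765 GB moved in the ~5 sustained hours
print(f"{rate_mb_s:.1f} MB/s, ~{volume_gb:.0f} GB in 5 hours")
```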
D. Bonacorsi
DC04 Real-Time (fake) Analysis

CMS software installation:
- CMS Software Manager (M. Corvo) installs software via a grid job provided by LCG
- RPM distribution based on CMSI, or DAR distribution: used with RPMs at CNAF, PIC, Legnaro, Ciemat and Taiwan
- Site manager installs RPMs via LCFGng: used at Imperial College
- Still inadequate for general CMS users

Real-time analysis at the Tier-1:
- Main difficulty is to identify complete file sets (i.e. runs); information today in the TMDB or via findColls
- Job processes single runs at the site close to the data files
- File access via rfio
- Output data registered in the RLS

[Diagram: the UI submits via BOSS/JDL to the RB, which consults the bdII and RLS and dispatches jobs to a CE/WN close to the SE holding the data; file access via rfio; output data registration in the RLS; job metadata flows back (push data or info / pull info)]
A. Fanfani – C. Grandi
DC04 Fake Analysis Architecture

[Diagram: on the data-transfer side, transfer, replication and mass-storage agents, driven by the TMDB (MySQL) and the POOL RLS catalogue, move files from the SE Export Buffer to the PIC CASTOR Storage Element (MSS) and on to the PIC and CIEMAT disk SEs; on the fake-analysis side, drop and fake-analysis agents pick up the drop files and submit jobs through the LCG Resource Broker to LCG Worker Nodes]

Data transfer / fake analysis:
- The Drop agent triggers job preparation/submission when all files are available
- The Fake Analysis agent prepares the XML catalog, orcarc and JDL script and submits the job
- Jobs record start/end timestamps in the MySQL DB
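The drop/fake-analysis logic above can be sketched as a polling loop over a TMDB-like table. This is a minimal illustration only, assuming a hypothetical `file_sets` table with a completeness flag; the real DC04 agents, schema and BOSS/JDL plumbing were different.

```python
# Sketch of a drop-agent polling pass (hypothetical schema, not the DC04 code).
import sqlite3
import time

def poll_and_submit(db, submit):
    """One polling pass: pick up complete, unprocessed file sets and submit them."""
    cur = db.execute(
        "SELECT run FROM file_sets WHERE complete = 1 AND submitted = 0")
    for (run,) in cur.fetchall():
        jdl = f"run_{run}.jdl"  # stand-in for XML catalog + orcarc + JDL preparation
        submit(jdl)             # stand-in for 'submit via BOSS to the RB'
        db.execute(
            "UPDATE file_sets SET submitted = 1, start_ts = ? WHERE run = ?",
            (time.time(), run))
    db.commit()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE file_sets (run INTEGER, complete INTEGER,"
           " submitted INTEGER, start_ts REAL)")
db.execute("INSERT INTO file_sets VALUES (1001, 1, 0, NULL)")
db.execute("INSERT INTO file_sets VALUES (1002, 0, 0, NULL)")  # still incomplete

submitted = []
poll_and_submit(db, submitted.append)
print(submitted)  # only the complete run is submitted: ['run_1001.jdl']
```

Marking the row `submitted` inside the same pass is what keeps a run from being dispatched twice when the loop wakes up again.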
J. Hernandez
Real-time DC04 analysis: turn-around time from the T0

The minimum time from T0 to T1 analysis was 10 minutes. Different problems contributed to the time spread:
- the dataset-oriented analysis made the results dependent on which datasets were sent in real time from CERN
- tuning of the Tier-1 Replica Agent
- Replica Agent operation affected by a CASTOR problem
- Analysis Agents were not always up, due to debugging
- for one dataset the zipped metadata were late with respect to the data
- a few problems with submission
N. De Filippis, A. Fanfani, F. Fanzago
DC04 Real-time Analysis

- Maximum rate of analysis jobs: 194 jobs/hour
- Maximum rate of analysed events: 26 Hz
- Total of ~15000 analysis jobs via Grid tools in ~2 weeks (95-99% efficiency)

Dataset examples:
- Bs → J/ψ φ; Bkg: mu03_tt2mu, mu03_DY2mu
- ttH, H → bb, t → Wb with W → lν, t → Wb with W → hadrons; Bkg: bt03_ttbb_tth, bt03_qcd170_tth, mu03_W1mu
- H → WW → 2μ2ν; Bkg: mu03_tt2mu, mu03_DY2mu
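For scale (arithmetic only, on the rounded figures above), the 194 jobs/hour peak sits well above the two-week average:

```python
# Average vs peak analysis-job rate over the two-week running period.
total_jobs = 15_000
hours = 2 * 7 * 24                      # ~2 weeks of quasi-continuous running
avg_jobs_per_hour = total_jobs / hours  # ~45 jobs/hour on average
print(f"{avg_jobs_per_hour:.1f} jobs/hour average vs 194 jobs/hour peak")
```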
N. De Filippis, A. Fanfani, F. Fanzago
Reconstruction software and DST

- Last CMS week / today: prototype DST in place
- Huge effort by a large number of people, especially S. Wynhoff, N. Neumeister, T. Todorov, V. Innocente for the "base". Also from: Emilio Meschi, David Futyan, George Daskalakis, Pascal Vanlaer, Stefano Lacaprara, Christian Weiser, Arno Heister, Wolfgang Adam, Marcin Konecki, Andre Holzner, Olivier van der Aa, Christophe Delaere, Paolo Meridiani, Nicola Amapane, Susanna Cucciarelli, Haifeng Pi
- The DST constitutes the first "CMS summary"; examples of "doing physics" with it are in place, but it is not complete

Without the PRS activity (b-tau, muon, e-gamma) on the reconstruction software there would be neither analysis nor Data Challenge (04):
INFN is the major contributor: Ba, Bo, Fi, Pi, Pd, Pg, Rm1, To.
P. Sphicas
PRS analysis contributions…

- ttH; H→bb and related backgrounds: S. Cucciarelli, F. Ambroglini, C. Weiser, S. Kappler, A. Bocci, R. Ranieri, A. Heister, …
- Bs→J/ψ φ and related backgrounds: V. Ciulli, N. Magini, Dubna group, …
- A/H(SUSY)→ττ, established channel for the SUSY H; HLT. A/H→2τ→τ-jet + τ-jet: S. Gennai, S. Lehti, L. Wendland
- τ reconstruction; full track reco starting from raw data, several algos already implemented; studies of RecHits, sensor positions, B field, material distribution: W. Adam, M. Konecki, S. Cucciarelli, A. Frey, T. Todorov
- H→γγ: G. Anagnostou, G. Daskalakis, A. Kyriakis, K. Lassila, N. Marinelli, J. Nysten, K. Armour, S. Bhattacharya, J. Branson, J. Letts, T. Lee, V. Litvin, H. Newman, S. Shevchenko
- H→ZZ(*)→4e: David Futyan, Paolo Meridiani, Kate Mackay, Emilio Meschi, Ivica Puljak, Claude Charlot, Nikola Godinovic, Federico Ferri, Stephane Bimbot
- H→WW→2μ2ν: Zanetti, Lacaprara
- Calibrations and alignments; Higgs studies

And many others!!!!
Data Challenge 04: lessons (1/2)

Many components used do not scale (both CMS and non-CMS):
- RLS, Castor, dCache, metadata, SRB, catalogues of various kinds, the job submission system at the Tier-0, etc.

Many functions/components were missing:
- Data Transfer Management
- Global data location for (at least) all the Tier-1s

Nothing wrong with that: it was a challenge, made exactly for this!

But the real lesson was (surprise?) that:
- there was (and is) NO organisation, neither for LCG nor for CMS nor for Grid3
- there was (and is) NO consistent design of either a Data or a Computing Model
- except, partially, in Italy and in the USA!
D. Bonacorsi

Data Challenge 04: lessons (2/2)

Indeed, for example:

DC04 data time window: 51 (+3) days, March 11th - May 3rd

[Plot: DC04 job-rate timeline with annotations: T0 Castor problems, ramp-down @150 then ramp-up @300 jobs; T0 Castor nameserver creaking under load, ramp-down; T0 @>20 Hz but config agent OFF and EB agents ON but useless, then ramp-up @500; all T1's had some homework from the EBs here; T0 issue of the 17k files on the wrong stager; ramp-up @100; Easter production & transfer; T0 @20 Hz while CNAF empties its backlog; ramp-up @50 then 200 jobs; T0 jobs "mass extinction" then ramp-up @300; "Zips" exercise ramp-up & down; final ramp-down]
INFN prospects

Short term:
- Re-reconstruct the DSTs with a version of ORCA (CMS sw) validated by the analyses while production happens, wherever possible (Tier0, Tier1s and Tier2s)
- Distribute the DSTs, the other data formats (Digi, SimHits) and the metadata to the Tier1s and, consequently, to the Tier2s
- Enable "locally distributed" analysis, in a way that is consistent for data access (few tools allow it…)

Medium term:
- Build a "Data Model"
- Build a "Computing Model"
- Build a consistent, distributed architecture
- Build controlled (and "semi-transparent") access to the data
- With the components that exist and that have a prospect of scalability (to be measured again, in an organic way)
Post Data Challenge 04 activities

[June 04 - July 04]
- Re-creation of the DSTs
- Distribution of the files (data and metadata) needed for analysis
- First results for the PRS groups and for the Physics TDR

[July 04 - July 05]
- Production of new (or old) datasets (including DSTs): target 10 M events/month, steady, for the Physics TDR
- Continuous analysis of the produced data

[Sep 04 - Oct 04]
- Data Challenge 04 results for CHEP04
- First definition of the Data & Computing Model
- Definition of the MoUs

[Jul 05 - …]
- CMS Computing TDR (and LCG TDR)
- Data Challenge 05, to verify the Computing Model

Resources needed (2005):
- Storage for analysis and production at the Tier1, Tier2s and Tier3s
- CPUs for production and analysis at the Tier1 and Tier2s
- Continuous activity; dedicated resources?
Possible evolution of CCS tasks (Core Computing and Software)

- CCS will reorganize to match the new requirements and the move from R&D to implementation for physics
- Meet the PRS production requirements (Physics TDR analysis); build the Data Management and Distributed Analysis infrastructures
- Production Operations group [NEW]: outside of CERN; must find ways to reduce manpower requirements; using predominantly (only?) Grid resources
- Data Management Task [NEW]: project to respond to the DM RTAG; physicists/computing to define the CMS blueprint and the relationships with suppliers (LCG/EGEE…); CMS DM task in the Computing group; expect to make major use of manpower and experience from CDF/D0 Run II
- Workload Management Task [NEW]: make the Grid useable by CMS users; make major use of manpower with EDG/LCG/EGEE experience
- Distributed Analysis Cross Project (DAPROM) [NEW]: coordinate and harmonize analysis activities between CCS and PRS; work closely with the Data and Workload Management tasks
- Establish a high-level Physics/Computing panel between T1 countries to ensure collaboration ownership of the Computing Model for the MoU and RRB discussions
Conclusions

The CMS Data Challenge "04" was a success:
- Many functionalities measured in a "scientific" way
- Many failures and bottlenecks discovered (but the 25 Hz were reached!)
- Many things understood (??)
- Decisive Italian (INFN) contribution

The CMS Data Challenge "04" was not a success:
- It was not planned sufficiently
- It required the continuous (two months) presence and intervention of "willing" people (20 hours per day, weekends included) for on-the-fly fixes: ~30 people, world-wide
- There is still NO "objective" evaluation of the results
- Everything that worked (for better or worse) gets criticized a priori, without realistic alternative proposals…

Nevertheless CMS, having got over the "stress" of DC04, is recovering…

The CMS system is evolving into a permanent production and analysis effort…
Milestones 2004: specifics (1/2)

Participation of at least three sites in DC04 [March]
- Import into Italy (Tier1-CNAF) all the events reconstructed at the T0
- Distribute the selected streams over at least three sites (~6 streams, ~20 M events, ~5 TB of AOD)
- The selection covers the analysis of at least 4 signal channels and related backgrounds, plus the calibration studies
- Deliverable: Italian contribution to the DC04 report, as input to the C-TDR and the "preparation" of the P-TDR; results of the analysis of the channels assigned to Italy (at least 3 streams and 4 signal channels)
- Status: the end of DC04 slipped to April. Sites: Ba, Bo, Fi, LNL, Pd, Pi, CNAF-Tier1. 2 streams, but 4 analysis channels. DONE, 90%

Integration of the CMS Italy computing system into LCG [June]
- The Tier1, half of the Tier2s (LNL, Ba, Bo, Pd, Pi, Rm1) and a third of the Tier3s (Ct, Fi, Mi, Na, Pg, To) have the LCG software installed and are able to work in the LCG environment
- Implies installing the software packages coming from LCG AA and LCG GDA (from POOL to RLS etc.)
- Completion of the analysis using the LCG infrastructure, and further production of about 2 M events
- Deliverable: CMS Italy is integrated into LCG for more than half of its resources
- Status: sites integrated into LCG: CNAF-Tier1, LNL, Ba, Pd, Bo, Pi. The prolonged analysis of the DC04 results causes a slip of at least 3 months. In progress, 30%
Milestones 2004: specifics (2/2)

Participation in the C-TDR [October]
- Includes the definition of the Italian participation in the C-TDR in terms of: resources and sites (possibly all), manpower, funding and intervention plan
- Deliverable: C-TDR drafts with the Italian contribution
- Status: the Computing TDR is now due in July 2005, and the milestone slips accordingly. Stand-by/progress, 10%

Participation in the DC05 PCP of at least the Tier1 and the Tier2s [December]
- The Tier1 is CNAF and the Tier2s are: LNL, Ba, Bo, Pd, Pi, Rm1
- Production of ~20 M events for the P-TDR studies, or equivalent (the studies might require fast-MC or special programs)
- Contribution to the definition of the LCG-TDR
- Deliverable: production of the events needed for the validation of the fast-simulation tools and for the P-TDR studies (~20 M events on the Tier1 + Tier2/3s)
- Status: Data Challenge 05 slips to July 2005, and the milestone slips accordingly. Stand-by, 0%
Back-up Slides
CMS Computing Model

Computing Model design:
- Data location and access model
- Analysis (user) model
- CMS software and tools
- Infrastructure & organization (Tiers and LCG)
CPU Power Ramp Up

[Plot: CMS CPU power (kSI95 months, log scale 1-100000) vs year, 2002-2009, for CERN and OFFSITE; average slope x2.5/year; milestones marked: DAQ TDR, DC04, C-TDR, DC05/P-TDR/LCG TDR, DC06/readiness, LHC 2E33, LHC 1E34; time-shared resources vs dedicated CMS resources; actual PCP and DC04 levels indicated]
Estimated resources required by the LHC experiments in 2008 (first full year of data)

                                         Alice  Atlas  CMS    LHCb   Sum
CERN Tier-0 + Tier-1
  Disk (PetaBytes)                       0.5    2.0    1.8    0.3    5
  Mass storage (PetaBytes)               2.3    7.6    9.2    1.0    20
  Processing (M SI2000**)                5.6    5.4    5.7    2.7    19
Sum of resources at all Tier-1 centres
  Expected number of centres             3      6      6      5
  Disk (PetaBytes)                       3.0    6.8    8.7    1.3    20
  Mass storage (PetaBytes)               3.6    7.2    6.6    0.4    18
  Processing (M SI2000**)                9.1    13.6   12.6   9.5    45
Sum of resources at all Tier-2 centres
  Expected number of centres             16     24     25     15
  Disk (PetaBytes)                       3.0    3.8    5.0    0.6    12
  Mass storage (PetaBytes)               0.0    1.6    2.9    0.0    5
  Processing (M SI2000**)                7.2    8.4    7.5    16.4   40

** Current fast processor ~1K SI2000

Estimates prepared as input to the MoU Task Force; computing models under active development. NO HEAVY IONS INCLUDED YET!
Tier-1 Centers are Crucial to CMS

- CMS expects to have (external) T1 centers at CNAF, FNAL, Lyon, Karlsruhe, PIC and RAL, plus a Tier-1 center at CERN (still discussing the role of the CERN T1)
- The current Computing Model gives the total external T1 requirements, assumed over 6 centers, but not necessarily 6 equal centers
- Tier-1 centers will be crucial for calibration, reprocessing and data-serving, and to service the requirements of the Tier-2 centers, both from the region and via explicit relationships with external T2 centers
- They also service the analysis requirements of their 'regions'
- Next step is to iterate with the T1 centers/CMS country managements to understand what they can realistically hope to propose and possibly succeed in obtaining
Possible Sizing of Regional T1's

- Assume 1 T1 at CERN plus the sum of 6 external T1's
- Take the truncated sum of the collaboration in T1 countries and calculate the fractions in those countries
- Share the 6+1 T1's according to this algorithm to get an opening scenario for discussions:
  CERN: 1 T1 for CMS (by definition); France: 0.5 T1; Germany: 0.4 T1; Italy: 1.7 T1; Spain: 0.2 T1; UK: 0.4 T1; USA: 2.6 T1

[Table: country/agency, % of CMS physicists, truncated fraction of the T1 countries, proposed fraction of a canonical CMS T1 (T1 candidates in red). T1 candidates: CERN (committed to T0 and T1 for CMS) 6.2% → 10.6% → 1.0; France-CEA 1.4% → 2.4% → 0.1; France-IN2P3 4.3% → 7.3% → 0.4; Germany 4.3% → 7.3% → 0.4; Italy 16.5% → 28.1% → 1.7; Spain 2.4% → 4.1% → 0.2; United Kingdom 3.9% → 6.7% → 0.4; USA (DOE+NSF) 25.8% → 44.1% → 2.6. Totals: 64.78%, 110.6%, 7.0. Other countries (physicist share only): Austria 1.3%, Belgium 2.7%, Brazil, Bulgaria 0.4%, China 1.7%, Croatia 0.3%, Cyprus 0.1%, Estonia 0.3%, Finland 1.2%, Greece 1.7%, Hungary 0.8%, India 2.1%, Iran 0.3%, Ireland 0.1%, Korea 1.3%, New Zealand 0.3%, Pakistan 0.7%, Poland 0.7%, Portugal 0.5%, Russia-DMS 11.9%, Serbia 0.5%, Switzerland-ETHZ 1.6%, Switzerland-PSI 1.3%, Switzerland-UNIV 0.7%, Taipei 1.2%, Thailand, Turkey 1.3%]
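The sharing algorithm, as it can be read off the table, renormalises each T1-candidate country's physicist share over the sum for the non-CERN T1 countries and multiplies by the 6 external T1's, with CERN fixed at 1 by definition. A sketch (illustrative only; the percentages are the table's):

```python
# Reproduce the "proposed fraction of a canonical CMS T1" column.
t1_shares = {  # % of CMS physicists, external T1-candidate countries only
    "France-CEA": 1.4, "France-IN2P3": 4.3, "Germany": 4.3,
    "Italy": 16.5, "Spain": 2.4, "UK": 3.9, "USA": 25.8,
}
total = sum(t1_shares.values())  # ~58.6% of the collaboration

proposed = {c: round(6 * s / total, 1) for c, s in t1_shares.items()}
proposed["CERN"] = 1.0           # by definition

print(proposed)  # Italy -> 1.7, USA -> 2.6, Germany -> 0.4, ...
```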
Tier-2

- Ask now for intentions from all CMS agencies: I have an "old" list, and I request that you contact me with your intentions so I can bring it up to date
- T1 countries are making a very heavy commitment; they may need to demonstrate sharing of costs with the dependent T2's
- T2's need to start defining with which T1 they will enter into service agreements, and negotiating with them how costs will be distributed
Claudio Grandi, INFN Bologna
RLS performance

- Time to register the output of a single job (16 files) - left axis
- Load on the client machine at the time of registration - right axis

[Plot annotations: April 2nd, 18:00; 0.4 files/s (25 Hz); 0.16 files/s (10 Hz)]
RLS issues

- Total number of files registered in the RLS during DC04: 570K LFNs, each with 5-10 PFNs and 9 metadata attributes
- Inserting information into the RLS:
  - Inserting PFNs (file catalogue) was fast enough if using the appropriate tools, produced in-course: LRC C++ API programs (0.1-0.2 sec/file); POOL CLI with GUID (secs/file)
  - Inserting files with their attributes (file and metadata catalogue) was slow: we more or less survived, but higher data rates would be troublesome
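A rough capacity estimate from the timings above (arithmetic only): at the 25 Hz target the Tier-0 emitted ~0.4 files/s, so plain PFN inserts through the C++ LRC API had comfortable headroom, while tools taking seconds per file would barely keep up or fall behind.

```python
# Headroom of RLS PFN-insert throughput over the T0 file rate at 25 Hz.
files_per_s_at_25hz = 0.4  # file rate out of the Tier-0 at the 25 Hz target
headroom = {t: (1.0 / t) / files_per_s_at_25hz for t in (0.1, 0.2)}
print(headroom)  # roughly 12-25x margin at 0.1-0.2 s/file
for t_per_file in (1.0, 3.0):  # "secs/file" tools: marginal, or falling behind
    print(t_per_file, (1.0 / t_per_file) / files_per_s_at_25hz)
```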
[Plot: RLS real time by drop time, 2 Apr 18:00 - 5 Apr 10:00; the time to register the output of a Tier-0 job (16 files) reached ~3 sec/file]

Sometimes the load on the RLS increases and requires intervention on the server (e.g. log partition full, switch of the server node, un-optimized queries): able to keep up in optimal conditions, only just otherwise
PCP set-up: a hybrid model (by C. Grandi)

[Diagram: a Phys. Group asks for a new dataset in RefDB; the Production Manager defines assignments; a Site Manager starts an assignment; McRunjob + the CMSProd plug-in prepare the jobs, with dataset metadata from RefDB and job metadata in the BOSS DB (job-level and data-level queries); jobs then go either via shell scripts to a local batch manager on a computer farm, via JDL to the Grid (LCG) scheduler on LCG-0/1 with the RLS, or as DAGs via DAGMan (MOP) and Chimera VDL with a Virtual Data Catalogue and Planner on Grid3; push data or info / pull info]
PCP @ INFN statistics (1/4)

Generation step: ~79 Mevts in CMS; ~9.9 Mevts (~13%) done by INFN (strong contribution by LNL)

[Plots: generation step, INFN only and all CMS; Jun - mid-Aug 03 contribute to this slope]

CMS production steps: Generation → Simulation → ooHitformatting → Digitisation
PCP @ INFN statistics (2/4)

Simulation step [CMSIM+OSCAR]: ~75 Mevts in CMS; ~10.4 Mevts (~14%) done by INFN (strong contribution by the CNAF T1 + LNL)

[Plots: simulation step, INFN only and all CMS; Jul - Sep 03]

CMS production steps: Generation → Simulation → ooHitformatting → Digitisation
PCP @ INFN statistics (3/4)

ooHitformatting step: ~37 Mevts in CMS; ~7.8 Mevts (~21%) done by INFN

[Plots: ooHitformatting step, INFN only and all CMS; Dec 03 - end-Feb 04]

CMS production steps: Generation → Simulation → ooHitformatting → Digitisation

D. Bonacorsi
OSCAR in Production

[Plot: OSCAR production ramp-up; PCP04 production with OSCAR begins; ~20 million events in 6 months, ~750K per week]
Evolution of Transfer Requirements
José Hernández, CIEMAT

From GDB to analysis at T1

[Diagram: the chain from the GDB to analysis at a T1: transfer, replication, job preparation, job submission]
Real-Time (Fake) Analysis

Goals:
- Demonstrate that data can be analyzed in real time at the T1
- Fast feedback to reconstruction (e.g. calibration, alignment, checks of the reconstruction code, etc.)
- Establish automatic data replication to the T2s; make data available for offline analysis
- Measure the time elapsed between reconstruction at the T0 and analysis at the T1

Architecture:
- Set of software agents communicating via a local MySQL DB
- Replication, data-set completeness, job preparation & submission
- Use LCG to run jobs; private Grid Information System for CMS DC04; private Resource Broker

J. Hernandez
From GDB to analysis at T1

[Diagram: GDB → EB → T1 → T2, from reconstruction to analysis; publisher and configuration agents at the GDB, the EB agent at the EB, transfer and replication agents at the T1, drop and Fake Analysis agents at the end of the chain]

J. Hernandez
Real-time DC04 analysis: summary

- Real-time analysis: two weeks of quasi-continuous running!
- Total number of analysis jobs submitted: ~15000; overall Grid efficiency ~95-99%

Problems:
- The RLS query to prepare a POOL XML catalog had to be done using the file GUID, otherwise it was much slower
- The Resource Broker disk filling up caused RB unavailability for several hours; this problem was related to large input/output sandboxes. Possible solutions: set quotas on the RB space for sandboxes; configure RBs to be used in cascade
- Network problem at CERN, not allowing connections to the RLS and the CERN RB
- The Legnaro CE/SE disappeared from the Information System during one night
- Failures in updating the BOSS database due to overload of the MySQL server (~30%); the BOSS recovery procedure was used
N. De Filippis, A. Fanfani, F. Fanzago
Description of RLS usage in DC04

[Diagram, centred on the CERN RLS POOL catalogue: 1. EB agents (RM/SRM/SRB) register files; 2. the configuration agent finds the Tier-1 location (based on metadata); 3. files are copied to / deleted from the export buffers; 4. Tier-1 transfer agents copy files to the Tier-1's (Replica Manager, TMDB; SRB with GMCAT and an XML publication agent); 5. the Resource Broker submits the analysis job; 6. LCG ORCA analysis jobs process the DST and register private data in a local POOL catalogue. ORACLE mirroring to a CNAF RLS replica.]

Specific client tools: POOL CLI, Replica Manager CLI, C++ LRC API based programs, LRC Java API tools (SRB/GMCAT), Resource Broker
Context for the agent system

[Diagram: the agents (and the TMDB) at the centre, connected to replica managers, grid transfer tools, the file catalogue, the configuration agent, metadata and resource brokers(?); analysis: a separate world?; global system management/steering around it]
Real-time analysis schema

[Diagram: T0 → T1 Castor SE → T1 disk SE (CNAF) and T2 disk SE (LNL); b/τ datasets and muon datasets; DST files. The Replica Agent (1) replicates data to the disk SEs at the T1/2 and (2) notifies that new files are available for analysis. The Real-time Analysis Agent, on the CNAF UI: 1. checks whether a file-set (run) is ready to be analyzed ("green light"); 2. prepares the job to analyze the run; 3. submits the job via BOSS to the RB (CERN/CNAF), which runs it on the CNAF or LNL Computing Elements. CMS software (ORCA 8.0.1) is installed by the CMS software manager using a Grid job based on the xcmsi tool; ORCA 8.0.1 is available on the UI to compile the analysis code.]
ttH analysis results

Muon and neutrino information:
- transverse energy
- muon Pt
- isolated muon Pt

Isolation efficiency: single muon = 88% (98% wrt selection)
ttH analysis results

Jet information:
- total number of jets
- number of b jets
- Et of non-b jets
- Et of b jets
ttH analysis results

Reconstructed masses:
- leptonic top
- hadronic top
- hadronic W
05/05/2004 Federica Fanzago, INFN Padova

Data transfer and job preparation

[Diagram: T0 → T1 Castor → T1 disk SE (CNAF) and T2 disk SE (LNL); b/tau and muon datasets; DST files. The Replica agent notifies that new files are available for analysis. The Real-time analysis agent: only if the collection file has the "green light" does it prepare and submit a job (submission via BOSS) to the RB (CERN/CNAF) to analyse one run, on the CNAF or LNL testbed. The CMS software is installed by the CMS Software Manager using a Grid job based on the xcmsi tool; ORCA_8_0_1 is available on the UI to compile the analysis code.]
An example: replicas to disk-SEs

[Plots, just one day (Apr 19th): CNAF T1 disk-SE eth I/O input from the SE-EB; CNAF T1 Castor SE eth I/O input, TCP connections and RAM memory; Legnaro T2 disk-SE eth I/O input from the Castor SE]

D. Bonacorsi
Data Transfer

[Diagram: CERN EB (3 disk SEs) → CNAF disk SE (Tier-1, with Castor) → Legnaro disk SE (Tier-2); CERN EB → PIC disk SE (Tier-1, with Castor) → CIEMAT disk SE (Tier-2)]

Transfer tools:
- Replica Manager CLI used for EB → CNAF and CNAF → Legnaro: the Java-based CLI introduces a non-negligible overhead at start-up
- globus-url-copy + the LRC C++ API used for EB → PIC and PIC → Ciemat: faster

Performance has been good with both tools
- Total network throughput limited by the small file size
- Some transfer problems caused by the performance of the underlying MSS
- Always use a disk SE in front of an MSS in the future?

A. Fanfani
Real-time DC04 analysis: job time statistics

- Dataset bt03_ttbb_ttH analysed with the executable ttHWmu
- Total execution time ~28 minutes, of which ORCA execution time ~25 minutes
- Job waiting time before starting ~120 s (overhead of the GRID + waiting time in queue)
- Time for staging input and output files ~170 s
N. De Filippis, A. Fanfani, F. Fanzago
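The timings above roughly decompose as follows (arithmetic on the rounded slide figures only; the pieces do not add up exactly because everything is rounded):

```python
# Breakdown of the ~28-minute analysis-job turnaround.
total_s = 28 * 60             # ~1680 s total execution time
orca_s = 25 * 60              # ~1500 s of ORCA
wait_s, staging_s = 120, 170  # queue wait and file staging

non_orca_s = total_s - orca_s
print(f"non-ORCA time inside the job: ~{non_orca_s} s")
print(f"quoted wait + staging: ~{wait_s + staging_s} s, "
      f"i.e. ~{(wait_s + staging_s) / total_s:.0%} overhead on a ~{total_s} s job")
```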