LHCb is Beautiful?
Glenn Patrick, GridPP19, 29 August 2007
In the beginning…
LHCb – GridPP1 Era (May 2002)
Empty!
LHCb – GridPP2 Era (Mar 2005)
Not Beautiful!
LHCb December 2006

[Photo of the detector in the cavern, labelled along the beamline: VELO, RICH1, Magnet, Trackers, RICH2, Calorimeters and Muon system, with the p-p interaction point marked.]

Getting Pretty!
Summer 2008 – Beauty at Last?

1000 million B mesons/year

[Diagram of the B0 and anti-B0 mesons produced in the collisions, labelled with their b and d quark content.]

2008: Suddenly Beautiful!
…and so it is with the Grid?

[Diagram of the early LHCb data flow across the Grid: a Job running on a Compute Element writes Data to local disk, copies it to a Storage Element with globus-url-copy, registers it with register-local-file and publishes it in the Replica Catalogue; jobs elsewhere fetch copies with replica-get, and the Storage Element fronts a mass-storage system (mss). The testbed spans CERN, NIKHEF (Amsterdam) and the rest of the Grid.]

Origins of Grid for LHCb: GridPP at the NeSC Opening, 25 April 2002
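The flow in the diagram can be made concrete with a short, hypothetical Python sketch. globus-url-copy is the real Globus GridFTP client of the period; the catalogue step is a placeholder callable, since the registration commands varied between testbeds.

    import subprocess

    def ship_output(local_path: str, se_url: str, register_replica) -> None:
        """Sketch of the 2002-era flow: copy a job's output from local
        disk to a Storage Element, then record the new replica."""
        # Move the file to the SE over GridFTP (Globus Toolkit client).
        subprocess.run(
            ["globus-url-copy", f"file://{local_path}", se_url],
            check=True,
        )
        # 'register_replica' stands in for the Replica Catalogue client
        # (the register-local-file / publish steps in the diagram above).
        register_replica(local_path, se_url)

Jobs elsewhere on the Grid would then locate the file through the Replica Catalogue and pull a copy (replica-get in the diagram).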
DIRAC WMS Evolution (2006)

[Architecture diagram of the DIRAC Workload Management System. The Job Receiver accepts each job (its JDL and input sandbox) into the JobDB; the Data Optimizer checks the input data (checkData) against the LFC and places the job in the Task Queue. The Agent Director (checkJob) submits Pilot Jobs through the LCG Resource Brokers to Computing Elements. On the Worker Node the Pilot Agent asks the Matcher for a job matching the CE's JDL; a JobWrapper fetches the sandbox (getSandbox) and replicas (getReplicas), forks the User Application and executes it under glexec, then uploads output data to an SE, queuing failed transfers on a VO-box (putRequest). The Agent Monitor checks pilots (checkPilot), proxies come from the WMS Admin (getProxy), and the Job Monitor reports job state. Legend: DIRAC services / LCG services / workload on the WN.]
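The right-hand side of the diagram is the pilot-job pattern that made DIRAC robust: work is pulled by agents rather than pushed to sites. A minimal, hypothetical sketch of that loop (invented object names, not DIRAC's actual classes):

    import subprocess

    def run_pilot(matcher, monitor, resources: dict) -> None:
        """Hypothetical pilot-agent pull loop in the DIRAC style."""
        while True:
            # Ask the central Matcher for a job from the Task Queue
            # whose JDL requirements fit this worker node.
            job = matcher.request_job(resources)
            if job is None:
                break  # nothing suitable queued: the pilot simply exits
            monitor.set_status(job.id, "Running")
            # The JobWrapper stage: fork the user application
            # (under glexec in the diagram) and capture the outcome.
            result = subprocess.run(job.command, shell=True)
            monitor.set_status(
                job.id, "Done" if result.returncode == 0 else "Failed"
            )

Because a pilot only fetches a job after it has successfully started on a worker node, broken sites cost pilots rather than user jobs.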
DIRAC Production & Analysis

[Diagram of the full DIRAC system. User interfaces: production manager, GANGA UI, user CLI, job monitor, bookkeeping query web page and file-catalogue browser. DIRAC services: Job Management Service, Job Monitoring Service, Job Accounting Service (with its Accounting DB), Information Service, File Catalog Service, Monitoring Service and Bookkeeping Service. DIRAC resources: DIRAC CEs, DIRAC sites running Agents, the LCG Resource Broker feeding CEs 1-3, and DIRAC Storage holding disk files served via gridftp, bbftp and rfio.]

GridPP: Gennady Kuznetsov (RAL) – DIRAC Production Tools
DIRAC1: started 19.12.2002
DIRAC3 (data ready): due 2007
GANGA: Gaudi ANd Grid Alliance - 2001

[Diagram of the first GANGA concept: the GANGA GUI sits between the GAUDI program (job options, algorithms) and the collective and resource Grid services, returning histograms, monitoring and results to the user.]

First ideas: Pere Mato, LHCb Workshop, Bologna, 15 June 2001

GridPP: Alexander Soroko (Oxford), Karl Harrison (Cambridge), Ulrik Egede (Imperial), Alvin Tan (Birmingham)
Ganga Evolution: 2001-2007

[Diagram of Ganga's plugin structure, arranged in LHCb-specific, experiment-neutral and ATLAS-specific columns. Applications: Executable (experiment neutral); Athena (simulation/digitisation/reconstruction/analysis) and AthenaMC (production) for ATLAS; Gauss/Boole/Brunel/DaVinci (simulation/digitisation/reconstruction/analysis) for LHCb. Backends: Local, LSF, PBS, OSG, NorduGrid, the US-ATLAS WMS (which PANDA replaces) and the LHCb WMS.]
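In the Ganga user interface this plugin structure appears as plain attributes of a Job object, so moving from a local test to a Grid backend is a one-line change. Roughly, in the GPI of this era (exact option names varied between releases):

    # Inside a Ganga session, where the GPI pre-loads these classes.
    j = Job()
    j.application = Executable(exe="/bin/echo", args=["hello, grid"])
    j.backend = Local()   # quick test on the local machine
    j.submit()

    # The same job shipped to the Grid is just a different backend, e.g.
    #   j.backend = LCG()   # or the experiment-specific backends above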
Ganga 2007: Elegant Beauty?

[Screenshots of the Ganga GUI (CERN, September 2005; Cambridge, January 2006; Edinburgh, January 2007), showing the job builder, job details, logical folders, job monitoring, log window and scriptor panels.]
Ganga Users - 2007

806 unique users since 1 Jan 2007, of which LHCb accounts for 162.

[Pie chart of users by experiment: ATLAS, LHCb, Other.]
Ganga by Domain - 2007

[Pie chart of users by domain: CERN vs. other domains.]
LHCb “Grid” - circa 2001

[Map of the initial LHCb-UK “Testbed”, marking institutes as existing or planned: CERN (pcrd25.cern.ch, lxplus009.cern.ch); RAL CSF (120 Linux CPUs, IBM 3494 tape robot) and the RAL DataGrid Testbed; RAL (PPD); Liverpool MAP (300 Linux CPUs); the Glasgow/Edinburgh “Proto-Tier 2”; plus Bristol, Cambridge, Imperial College and Oxford.]
LHCb Computing Model

[Trigger and data-flow diagram: 40 MHz collision rate → Level-0 (hardware) → 1 MHz → Level-1 (software) → 40 kHz → HLT (software) → 2 kHz at 30 kB/event, i.e. 60 MB/s to storage.]
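The storage rate in the diagram follows directly from the final trigger output multiplied by the event size:

\[
2\,\mathrm{kHz} \times 30\,\mathrm{kB/event} = 60\,000\,\mathrm{kB/s} = 60\,\mathrm{MB/s}.
\]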
Monte Carlo Simulation 2007

Record of 9715 simultaneous jobs over 70+ sites on 28 Feb 2007.
700M events simulated since May 2006; 1.5M jobs submitted.

Raja Nandakumar (RAL)
Reconstruction & Stripping - 2007

…but it is not so often that we get all Tier 1 centres working together. Peak of 439 jobs.

[Plot of running jobs by site: CERN, CNAF, NIKHEF, RAL, IN2P3.]
Data Management - 2007

● Production jobs upload output to the associated Tier 1 SE (i.e. RAL in the UK).
● Multiple “failover” SEs and multiple VO boxes are used in case of failure (sketched below).
● Replication is done via FTS and a centralised Transfer DB.

eScience PhD: Andrew Smith (Edinburgh)
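A hypothetical sketch of that failover logic (object names invented for illustration, not real DIRAC API): try the associated Tier 1 SE first, then each failover SE, and leave a request on a VO box so the file is replicated to its proper home later.

    def upload_with_failover(local_file, primary_se, failover_ses, vobox):
        """Illustrative failover upload; names are not real DIRAC API."""
        for se in [primary_se, *failover_ses]:
            try:
                se.put(local_file)            # attempt the upload
            except IOError:
                continue                      # SE unavailable: try the next
            if se is not primary_se:
                # Data is safe but misplaced: queue a transfer request so
                # the centralised Transfer DB / FTS moves it home later.
                vobox.put_request(local_file, source=se, target=primary_se)
            return se
        raise RuntimeError("all storage elements refused the file")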
Data Transfer - 2007

• RAW data replicated from Tier 0 to one of six Tier 1 sites.
• gLite FTS used for T0 – T1 replication (see the sketch below).
• Transfers trigger automated job submission for reconstruction.
• Sustained total rate of 40 MB/s required (and achieved).

Further DAQ – T0 – T1 throughput tests at a 42 MB/s aggregate rate are scheduled for later in 2007.
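The T0 – T1 replication above was driven through the gLite FTS command-line client. A rough sketch of wrapping it from Python (the service endpoint is illustrative, and exact options varied between gLite releases):

    import subprocess

    # Illustrative endpoint only; each site published its own FTS URL.
    FTS = "https://fts.example.cern.ch:8443/glite-data-transfer-fts/services/FileTransfer"

    def replicate(source_surl: str, dest_surl: str) -> str:
        """Submit one RAW-file transfer to FTS; return the job ID
        that the CLI prints on stdout."""
        out = subprocess.run(
            ["glite-transfer-submit", "-s", FTS, source_surl, dest_surl],
            check=True, capture_output=True, text=True,
        )
        return out.stdout.strip()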
Bookkeeping (2007)

GridPP: Carmine Cioffi (Oxford)

[Architecture diagram: a web browser queries the Bookkeeping Query Servlet running under Tomcat on volhcb01; behind it, the Bookkeeping Service reads and writes the Oracle DB through the AMGA client, with read-only access also going via AMGA.]
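The read path in the diagram can be pictured as a plain HTTP query against the servlet; everything below is hypothetical (the real servlet URL and parameters on volhcb01 were internal to LHCb):

    from urllib.parse import urlencode
    from urllib.request import urlopen

    # Hypothetical endpoint, for illustration only.
    BK_URL = "http://volhcb01.cern.ch:8080/bookkeeping/query"

    def bk_query(**criteria) -> str:
        """Ask the Bookkeeping Query Servlet (Tomcat) for datasets; the
        service answers by reading the Oracle DB via the AMGA client."""
        with urlopen(BK_URL + "?" + urlencode(criteria)) as resp:
            return resp.read().decode()

    # e.g. bk_query(eventtype="13144002")   # invented parameter name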
LHCb CPU Use 2005-2007

Country       CPU use (%)
UK            34.0
Italy         16.1
Switzerland   13.7
France         9.8
Spain          7.1
Germany        4.8
Greece         4.0
Netherlands    4.0
Russia         2.0
Poland         1.8
Hungary        0.6

[Pie chart of CPU use, with CERN shown alongside the UK, Italy, Switzerland, France, Spain and Germany.]

Many thanks to: Birmingham, Bristol, Brunel, Cambridge, Durham, Edinburgh, Glasgow, Imperial, Lancaster, Liverpool, Manchester, Oxford, QMUL, RAL, RHUL, Sheffield and all others.
UKI Evolution for LHCb

[Maps of LHCb's UK and Ireland sites in 2004 and 2007: the Tier 1 at RAL plus the NorthGrid, SouthGrid, ScotGrid and London Tier 2 federations.]
GridPP3: Final Crucial Step(s)

[Timeline: 2001-2004 (GridPP1), 2004-2007 (GridPP2), 2007-2008, and 2008-2011 (GridPP3).]

Beauty!
Some 2007-2008 Milestones

● Sustain DAQ-T0-T1 throughput tests at 40+ MB/s.
● Reprocessing (second pass) of data at Tier 1 centres.
● Prioritisation of analysis, reconstruction and stripping jobs (all at Tier 1 for LHCb).
● CASTOR has to work reliably for all service classes!
● Ramp-up of hardware resources in the UK.
● Alignment: Monte Carlo was done with perfectly positioned detectors… reality will be different!
● Calibration: Monte Carlo was done with “well understood” detectors… reality will be different! The distributed Conditions Database plays a vital role.
● Analysis: an increasing load from individual users.
The End (and the Start)

Lyn Evans, EPS Conference on High Energy Physics, Manchester, 23 July 2007

GridPP3