Computing Division
URA Visiting Committee Review
March 12, 2004

Vicky White
Head, Computing Division
Computing Division Organization

[Org chart: CDF; D0; CMS; Experimental Astrophysics; Experiment Support; Division Office; Core Support Services; Computing and Communications Fabric; Computing and Engineering for Physics Applications – with per-department FTE counts 12 + (5), 15, 8, 14, 17, 60 + (1), 51 + (3), 55, and 26 + (3).]

Scientists of all sorts, Engineers, Technical, Computing, Admin = 258 + (12) = 270
Five Activity Areas

• Provide services, tools, and components, and operate computing facilities that serve the lab and the scientific program broadly.
• Provide dedicated help, leadership, and active participation in running and approved experiments, US-CMS, and other lab scientific programs (including support and expert help to the Beams Division).
• Work on projects funded competitively outside the base budget – e.g. SciDAC & GRID projects.
• Participate in planning and R&D for future experiments/lab activities.
• Run a computing organization and computer center.
Job Categories (FTEs) – March 2004

[Bar chart: FTEs by job category – Admin & Management, Computer Professionals, Engineering Physicists, Engineers, Scientists, Guest Scientists/Engineers, Other Technical Support, Clerical & Secretarial, Drafters, Service Workers, Technicians, Skilled Trades – for Sept-02, Sept-03, and Mar-04.]

Total FTEs: Sept-02 = 276, Sept-03 = 266, Mar-04 = 258
Common Services and Tools

Much of our work in this area is used by the whole lab and all of the experiments and scientific program stakeholders. This year we have worked hard to improve our efficiency, to develop metrics to monitor all of our systems, services, and performance, and to formalize all of our project work.

Digest of some of our metrics: http://www-csd.fnal.gov/metrics/CD_metrics_digest.htm
Our projects and their statuses: http://wwwserver2.fnal.gov/cfdocs/projectsdb/index.cfm
Some Common Services

Common Service | Customer/Stakeholder | Comments
Storage, data movement, and caching | CDF, D0, CMS, MINOS, Theory, SDSS, KTeV, all | Enstore – 1.5 petabytes of data! dCache, SRM
Databases | CDF, D0, MINOS, CMS, Accelerator, ourselves | Oracle 24x7; MySQL, Postgres
Networks, mail, print servers, Helpdesk, Windows, Linux, etc. | Everyone! | First-class, many 24x7, services + lead cyber security
SAM-GRID | CDF, D0, MINOS | Aligning with LHC
Simulation, MC, and analysis tools | CDF, D0, CMS, MINOS, Fixed Target, Accel. Div. | Growing needs
Farms | All experiments | Moving to GRID
Engineering support and R&D | CDF, D0, BTeV, JDEM, Accel. Div. projects | Queue outside our door
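The first row's division of labor – dCache disk pools in front of the Enstore tape store – is a read-through cache: reads are served from disk when possible and staged from tape only on a miss. A minimal sketch of the pattern (hypothetical interfaces; not the actual dCache or Enstore API):

    # Toy read-through cache: serve from a disk pool when the file is
    # cached, otherwise stage it from the (slow) tape store first.
    class ReadThroughCache:
        def __init__(self, tape_store):
            self.tape = tape_store      # dict-like backing store ("Enstore")
            self.disk = {}              # disk pool ("dCache")

        def read(self, name):
            if name not in self.disk:                 # cache miss
                self.disk[name] = self.tape[name]     # stage from tape
            return self.disk[name]                    # hits skip the tape

    tape = {"raw_run123.dat": b"..."}
    cache = ReadThroughCache(tape)
    cache.read("raw_run123.dat")    # first read stages from tape
    cache.read("raw_run123.dat")    # second read served from disk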
Run II Computing

Working very well in general. Reconstruction of raw data is keeping up, except for a few glitches. Luminosity increases and/or detector degradations and/or increased data rates present risks.

Large analysis capability at Fermilab for CDF, smaller farms; large farm resources at Fermilab for D0, smaller analysis capability.

Virtual computing center model: core computing is provided by Fermilab, according to reviewed Run II plans, with global computing contributions.

Adequate network bandwidth is clearly essential. After 1 year we finally have a contract for dark fiber to Starlight and will commission this link soon. ESnet is working with us on a Metropolitan Area Network between ANL, Fermilab, and Starlight.
D0 Reconstruction Processing

• Keeping up with the raw data
• ~800 processors in the farm at Fermilab
• Highly successful distributed reprocessing of data recently completed at 6 sites worldwide
• FermiNews article: "DZero Breaks New Ground in Global Computing Efforts. First steps toward Grid application with 'real data'"
  http://www.fnal.gov/pub/ferminews/ferminews04-02-01/p1.html
D0 worldwide reprocessing
D0 Computing

Central SGI processor being phased out; successful Linux farm analysis processing on CAB.

In a typical week we analyze 50 TB of data on the analysis systems, corresponding to 1.2 billion events. Wait times for delivery to caches from the central Enstore storage system are typically small compared to the CPU time used for analysis.

SAM-GRID data handling system used for all data delivery, tracking, and metadata.
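As a rough consistency check of these throughput figures, the per-event size and daily rate follow directly (a minimal sketch; only the 50 TB/week and 1.2 billion events come from the slide):

    # Back-of-the-envelope check of the figures above (50 TB and 1.2 billion
    # events per week are from the slide; everything else is arithmetic).
    TB = 1e12                      # decimal terabyte, in bytes

    weekly_bytes = 50 * TB
    weekly_events = 1.2e9

    avg_event_kb = weekly_bytes / weekly_events / 1e3
    daily_tb = 50 / 7

    print(f"average event size  ~ {avg_event_kb:.0f} kB")    # ~42 kB
    print(f"sustained data rate ~ {daily_tb:.1f} TB/day")    # ~7.1 TB/day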
CDF Computing
Building a CDF analysis facility (CAF)
CDF Enstore bytes/day transferred
Total Lab bytes/day – Enstore

25 terabytes per day!
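Expressed as a sustained network rate (a sketch; the 10 Gb/s comparison link is an assumption for scale, not a figure from the slides):

    # Convert the quoted 25 TB/day Enstore volume to an average bit rate.
    TB = 1e12
    rate_gbps = 25 * TB * 8 / 86400 / 1e9
    print(f"sustained rate ~ {rate_gbps:.1f} Gb/s")       # ~2.3 Gb/s

    # Hypothetical: fraction of an assumed 10 Gb/s wide-area link.
    print(f"~{rate_gbps / 10:.0%} of a 10 Gb/s link")     # ~23%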
CDF dCache data read
Accelerator Division Projects

15 FTEs of help on projects from throughout the division, with 5+ scientists strongly involved. The Tevatron BPM project, led by Steve Wolbers, is the largest effort. Various ongoing analysis, controls, database, and tools efforts.

20 people involved – limited by the nature of the tasks available.
Tev BPM Project will deliver
Joint CD/AD project
US-CMS Fermilab Tier-1 Activities

Preparations for major milestones:
• DC04 data challenge and preparation for the Computing TDR
• preparation for the Physics TDR
• roll-out of the LCG Grid service and federating it with the U.S. Grid facilities (Grid3)

Develop the required Grid and facilities infrastructure:
• increase the facility capacity through equipment upgrades, following the baseline plan
• commission Grid capabilities through the Grid3 and LCG-1 efforts
• develop and integrate required functionalities and services

Increase the capability of the User Analysis Facility:
• improve how physicists use the facilities and software
• facilities and environment improvements
• software releases, documentation, web presence, etc.
US-CMS Tier-1 Facility

Scaling up the Tier-1 equipment: on track for the baseline plan. Preparation for DC04 – the U.S. share to CMS: CPU, storage, data access. Planned procurement for the next upgrades.
Towards US Grid Infrastructure – Open Science Grid

Grid3 demonstrator for a multi-organizational Grid environment, together with US ATLAS, iVDGL, GriPhyN, PPDG, SDSS, and LIGO. Fermilab and US LHC facilities available through the shared Grid. Massive CMS production of simulated events and data movement. Hugely increased CPU resources available to CMS through opportunistic use, running on Grid3!
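Opportunistic use here means CMS jobs run on whatever slots other Grid3 sites are not using at the moment. A toy illustration of that matchmaking idea (site names and slot counts are invented; the real Grid3 middleware did this with Condor-based tools, not with code like this):

    # Toy sketch of opportunistic scheduling: place waiting jobs onto idle
    # CPU slots advertised by federated sites.
    idle_slots = {"fnal_tier1": 40, "us_atlas_site": 120, "sdss_tam": 16}

    def schedule(n_jobs, slots):
        """Greedily fill the largest pools first; return (placements, unplaced)."""
        placements = []
        for site in sorted(slots, key=slots.get, reverse=True):
            take = min(n_jobs, slots[site])
            if take:
                placements.append((site, take))
                n_jobs -= take
        return placements, n_jobs

    placed, waiting = schedule(150, idle_slots)
    print(placed)   # [('us_atlas_site', 120), ('fnal_tier1', 30)]
    print(waiting)  # 0 - the shared grid absorbed the whole batch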
LHC Production Grids

Federating U.S. Grid resources with the LHC Grid through Grid3
Lattice QCD SciDAC Clusters
Lattice QCD – using clusters for physics results
MINOS Computing contributions in:

• Offline software infrastructure
• Data handling (using SAM, dCache, Enstore)
• Databases development
• Control Room Logbook usage
• Near detector LAN installation
• System support and installs

(Use General Purpose Farms, FNALU, common storage system and services)
Computer Center(s)

Feynman Center is out of power, cooling, and space. It is the facility with uninterruptible power (UPS + generator). A satellite facility hosts the Lattice Gauge clusters – and soon CDF CAF Stage 4 and pre-fall Farms, part of Grid-accessible farm facilities for CMS, D0, and CDF.

Heroic work by FESS, the Directorate, and CD staff to plan for and execute a project for re-use of an experimental facility (formerly Wide Band): Stage 1 of a High Density Computing Facility. Demolition and construction start April 5.

Massive power and cooling needs for the Run II experiments and CMS, even accounting for Grid computing and worldwide contributions to Run II data processing.
Historical Power Growth + Projected Total

[Chart: CD computer power growth, 1995–2009, in KVA (0–2,500 scale) – projected KVA vs. actual KVA, with the FCC maximum marked.]

[Chart: experiment projections, FY04–FY09, in KVA (0–2,500 scale) – CDF, D0, CMS, Lattice Gauge, and total KVA.]
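The underlying message of both charts is compound growth in power demand against a fixed facility ceiling. A small sketch of that projection logic (the starting load and 15%/yr growth rate are hypothetical placeholders; only the roughly 2,500 KVA chart scale comes from the slide):

    # Sketch: years until computer power demand exceeds a facility ceiling,
    # assuming simple compound growth. All numbers are illustrative.
    def years_to_ceiling(load_kva, annual_growth, ceiling_kva):
        years = 0
        while load_kva < ceiling_kva:
            load_kva *= 1 + annual_growth
            years += 1
        return years

    # Hypothetical: 1,000 KVA load growing 15%/yr against a 2,500 KVA ceiling.
    print(years_to_ceiling(1000, 0.15, 2500))   # -> 7 years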
Conclusions

CD is running flat out on all cylinders – with improved efficiency in operations.

We have taken on more tasks (e.g. accelerator support and modeling, SNAP R&D, the US-CMS ramp-up of the Tier-1 facility and user support) – to the point of being stretched very thin, and we are hiring in targeted areas.

We worked safely – and received a trophy for 1 million hours worked without a lost-time injury.

We continue to evolve and plan for the transition: Run II era -> CMS era -> BTeV era. GRID computing, the Open Science Grid, and our partnership in the worldwide LHC Grid are important components of our strategy.

http://www.fnal.gov/pub/today/archive_2004/today04-03-01.html
Extra Slides
Job Categories – March 2004

[Bar chart: total FTEs (0–300 scale) by job category – Admin & Management, Computer Professionals, Engineering Physicists, Engineers, Scientists, Guest Scientists/Engineers, Other Technical Support, Clerical & Secretarial, Drafters, Service Workers, Technicians, Skilled Trades – with per-category values and totals 276 (Sep-02), 266 (Sep-03), 258 (Mar-04).]
CD – ages of technical staff

[Age-distribution histograms, binned from 20–25 up to 70–75:
• 11 Engineers – average age 46.3
• 18 Technicians – average age 50.2
• 37 Scientific staff – average age 46.5
• 152 Computing Professionals – average age 45]
D0 SAM files delivered
Experimental Astrophysics

Sloan Digital Sky Survey (SDSS):
• Started 5th year of operations; 2nd official data release in March
• We operate the SDSS data processing systems: process imaging and spectroscopic data, design plug plates, monitor the DAQ, and work on survey strategy
• The SDSS archive became a major component of a National Virtual Observatory
• Grid computing used for SDSS science analysis; the Terabyte Analysis Machine joined Grid3
• EAG members active in many science discoveries

JDEM/SNAP: joined the SNAP collaboration

Much more about all EAG activities in Kent's talk