Challenges and Success of HEP GRID
Faïrouz Malek, CNRS
3rd EGEE User Forum 2008, Clermont-Ferrand
The scales
High Energy Physics machines and detectors
[Detector cross-section: muon chambers, tracker, calorimeter]
LHC: pp @ √s = 14 TeV, L = 10^34 cm^-2 s^-1; 40 million collisions per second; LVL1: 1 kHz, LVL3: 100 Hz; 0.1 to 1 GB/s of digitized data recorded
Tevatron: L = 2×10^32 cm^-2 s^-1; 2.5 million collisions per second; LVL1: 10 kHz, LVL3: 50-100 Hz; 25 MB/s of digitized data recorded
LHC: 4 experiments … ready! First beam expected in autumn 2008
[Standard Model chart: three generations of quarks (u, c, t; d, s, b) and leptons (e, νe; μ, νμ; τ, ντ), the gauge bosons (γ, g, W, Z) and the Higgs (H)]
Professor Vangelis, what are you expecting from the LHC?
← CMS Simulation
Supersymmetry: a new world where each boson (e.g. the photon) and each fermion (e.g. the electron) has super-partner(s)
New dimensions of space, in which only some particles can propagate → gravitons, new bosons …
Towards string theory … gravitation is treated quantum mechanically, which holds only with 10 or more dimensions of space-time.
Alas! … or hopefully? The Standard Model is not so standard, and … hmm … maybe …
Calabi-Yau
Physicists see online/offline TRUE (top) events @ a running D0/Fermilab experiment
A collision @ LHC
@ CERN: acquisition, first-pass reconstruction, storage, distribution
The Data Acquisition
LHC computing: is it really a challenge?
• Signal/background ratio: 10^-9
• Data volume: high rate × large number of channels × 4 experiments → 15 PetaBytes of new data each year
• Compute power: event complexity × number of events × thousands of users → 60k of (today's) fastest CPUs
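A back-of-envelope check of the data-volume scale (a hedged sketch: the event rate, event size and live time below are illustrative assumptions, not figures from the talk):

    # Rough arithmetic behind "15 PB of new data each year" (illustrative numbers).
    rate_hz = 200          # assumed average recorded event rate per experiment
    event_size_mb = 1.5    # assumed average raw event size in MB
    live_seconds = 1e7     # a typical accelerator year is ~10^7 live seconds
    experiments = 4

    pb_per_year = rate_hz * event_size_mb * live_seconds * experiments / 1e9  # MB -> PB
    print(f"~{pb_per_year:.0f} PB of new raw data per year")  # ~12 PB, the right order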
Options as seen in 1996, before the Grid was invented
Timeline LHC Computing (1994-2008)
• 1994: LHC approved
• 1996: ATLAS & CMS approved; their Computing Technical Proposals (CTP) estimate 10^7 MIPS and 100 TB of disk
• 1997: ALICE approved
• 1998: LHCb approved
• 2001: "Hoffmann" Review: 7×10^7 MIPS and 1,900 TB of disk (ATLAS or CMS requirements for the first year at design luminosity)
• 2005: Computing TDRs: 55×10^7 MIPS (140 MSi2K) and 70,000 TB of disk
• 2008: LHC start
Evolution of CPU Capacity at CERN
[Plot: CPU capacity at CERN across accelerator eras: SC (0.6 GeV), PS (28 GeV), ISR (300 GeV), SPS (400 GeV), ppbar (540 GeV), LEP (100 GeV), LEP II (200 GeV), LHC (14 TeV). Costs (2007 Swiss Francs) include infrastructure (computer centre, power, cooling, …) and physics tapes. LHC tape & disk requirements: more than 10 times what CERN alone can provide.]
Timeline Grids (1994-2008)
• US projects: GriPhyN, iVDGL, PPDG → Grid3 → OSG
• European projects: EU DataGrid → EGEE 1 → EGEE 2 → EGEE 3
• LHC: LCG 1 → LCG 2, Data Challenges, Service Challenges, Cosmics, First physics
WLCG: a partially decentralized model: replicate the event data at about five regional centres, with data transfer via network or movable media
[Diagram: CERN connected to regional centres RC1, RC2, …]
The Tiers Model: Tier-0, Tier-1, Tier-2
WLCG Collaboration
• The Collaboration: 4 LHC experiments; ~250 computing centres; 12 large centres (Tier-0, Tier-1); 38 federations of smaller "Tier-2" centres; growing to ~40 countries; Grids: EGEE, OSG, NorduGrid
• Technical Design Reports: WLCG and the 4 experiments, June 2005
• Memorandum of Understanding (agreed in October 2005): guaranteed resources; quality of service (24/7, 4-hour intervention)
• Resources: 5-year forward look; target reliability and efficiency: 95%
Centers around the world form a Supercomputer
• The EGEE and OSG projects are the basis of the Worldwide LHC Computing Grid Project WLCG
Inter-operation between Grids is working!
Available Infrastructure
EGEE: ~250 sites, >45,000 CPUs; OSG: ~15 sites for LHC, >10,000 CPUs
¼ of the resources are contributed by groups external to the project
>25k simultaneous jobs
What about the Middleware?
• Security: Virtual Organization Management (VOMS); MyProxy
• Data management: file catalogue (LFC); File Transfer Service (FTS); Storage Element (SE); Storage Resource Management (SRM)
• Job management: Workload Management System (WMS); Logging and Bookkeeping (LB); Computing Element (CE); Worker Nodes (WN)
• Information system: BDII (Berkeley Database Information Index) and R-GMA (Relational Grid Monitoring Architecture) aggregate service information from multiple Grid sites, now moved to SAM (Site Availability Monitoring); monitoring & visualization (GridView, Dashboard, GridMap, etc.)
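To make the security and job-management chain concrete: a minimal sketch of driving these services from a user interface node, wrapped in Python. voms-proxy-init and glite-wms-job-submit are the standard gLite tools; the JDL contents and file names are illustrative.

    import subprocess

    # A minimal gLite job description (JDL). The attribute names are standard
    # JDL; the executable and sandbox files are placeholders.
    jdl = '''Executable    = "analysis.sh";
    Arguments     = "run2008.cfg";
    StdOutput     = "std.out";
    StdError      = "std.err";
    InputSandbox  = {"analysis.sh", "run2008.cfg"};
    OutputSandbox = {"std.out", "std.err"};
    '''
    with open("myjob.jdl", "w") as f:
        f.write(jdl)

    # 1) Security: obtain a VOMS proxy for your virtual organisation.
    subprocess.run(["voms-proxy-init", "--voms", "atlas"], check=True)

    # 2) Job management: submit through the gLite Workload Management System,
    #    which matches the job to a Computing Element and its Worker Nodes.
    subprocess.run(["glite-wms-job-submit", "-a", "myjob.jdl"], check=True)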
Grid Analysis Tools
• ATLAS: pathena/PANDA; GANGA together with gLite and NorduGrid
• CMS: CRAB together with gLite WMS and Condor-G
• LHCb: GANGA together with DIRAC
• ALICE: AliEn2, PROOF
GANGA
• User-friendly job submission tool, extensible via a plugin system
• Support for several applications: Athena, AthenaMC (ATLAS); Gaudi, DaVinci (LHCb); others …
• Support for several backends: LSF, PBS, SGE, etc.; gLite WMS, NorduGrid, Condor; DIRAC, PANDA
• GANGA job building blocks
• Various interfaces: command line, IPython, GUI
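A flavour of the GANGA command line (a minimal sketch: Job, Executable and LCG are genuine GANGA building blocks, while the executable and its arguments are placeholders):

    # Inside a GANGA session (an IPython shell):
    j = Job()                                    # assemble a job from building blocks
    j.application = Executable(exe='/bin/echo',  # the application to run (placeholder)
                               args=['hello grid'])
    j.backend = LCG()                            # target the EGEE/LCG grid
    j.submit()
    print(jobs)                                  # the job registry: status of all jobs

The plugin system is visible here: swapping only the backend attribute, e.g. to Local() or LSF(), moves the same job between a local test, a batch farm and the grid.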
In total 968 persons since January, 579 in ATLAS; per month ~275 users, 150 in ATLAS
[Pie chart of GANGA users: ATLAS, LHCb, others]
ATLAS Strategy
• On the EGEE and NorduGrid infrastructures, ATLAS uses direct submission to the middleware through GANGA (EGEE: LCG RB and gLite WMS; NorduGrid: ARC middleware)
• On OSG: the PANDA system, a pilot-based system (see the sketch below), also available at some EGEE sites
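The pilot idea in a nutshell: instead of binding a physics job to a site at submission time, a lightweight pilot lands on the worker node first and then pulls real work from a central queue. A minimal sketch, assuming a hypothetical HTTP task-queue endpoint (not PANDA's actual protocol):

    import json, subprocess, urllib.request

    TASK_QUEUE = "https://panda.example.org/getJob"   # hypothetical endpoint

    def pilot():
        """Runs on a worker node: pull payloads until the queue is empty."""
        while True:
            with urllib.request.urlopen(TASK_QUEUE) as resp:
                job = json.load(resp)       # e.g. {"id": 42, "cmd": [...]} or null
            if not job:
                break                       # nothing left: release the batch slot
            subprocess.run(job["cmd"])      # run the real payload
            # report success/failure back to the server (omitted in this sketch)

    if __name__ == "__main__":
        pilot()

Late binding is the payoff: a payload is assigned only once a healthy CPU is actually available.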
About 50k jobs since September
Tier:      0    1    2    3
Fraction:  8%  37%  40%  15%
Tier-1 share: 48% Lyon, 36% FZK
ATLAS Panda System
• Interoperability is important
• PANDA jobs on some EGEE sites
• PANDA is an additional backend for GANGA
• The positive aspect is that it gives ATLAS choices on how to evolve
CMS CRAB Features
• CMS Remote Analysis Builder: a user-oriented tool for grid submission and handling of analysis jobs
• Support for gLite WMS and Condor-G
• Command-line oriented tool: lets the user create and submit jobs, query status and retrieve output (sketched below)
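A hedged sketch of that command-line workflow (the crab.cfg keys below follow the CRAB 2 configuration format as best I recall and should be checked against the CRAB documentation; the dataset and file names are placeholders):

    import subprocess

    # Minimal crab.cfg: which dataset, which CMSSW config, how to split the jobs.
    cfg = """[CRAB]
    jobtype   = cmssw
    scheduler = glite

    [CMSSW]
    datasetpath            = /PlaceholderDataset/Sim/RECO
    pset                   = analysis_cfg.py
    total_number_of_events = -1
    events_per_job         = 10000

    [USER]
    return_data = 1
    """
    with open("crab.cfg", "w") as f:
        f.write(cfg)

    subprocess.run(["crab", "-create"], check=True)     # build the job collection
    subprocess.run(["crab", "-submit"], check=True)     # submit via gLite WMS / Condor-G
    subprocess.run(["crab", "-status"], check=True)     # query job status
    subprocess.run(["crab", "-getoutput"], check=True)  # retrieve output of finished jobs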
Mid-July to mid-August 2007: 645k jobs (20k jobs/day), 89% grid success rate
• LHCb: GANGA as user interface, DIRAC as backend
• ALICE: AliEn2
• AliEn and DIRAC are in many respects similar to PANDA
PROOF cluster
[Diagram of a PROOF cluster: the user's query (a data file list plus mySelector.C) goes to the master/scheduler, which dispatches it across the file catalog, storage and CPUs; feedback and the merged final output flow back to the user.]
• Cluster perceived as an extension of the local PC
• Same macro and syntax as in a local session (see the sketch below)
• More dynamic use of resources
• Real-time feedback
• Automatic splitting and merging
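What "same macro and syntax" means in practice, sketched through ROOT's Python bindings (TProof.Open, SetProof and Process are standard ROOT calls; the master URL, tree name and selector are placeholders):

    import ROOT

    # Connect to a PROOF cluster (placeholder master URL).
    proof = ROOT.TProof.Open("master.example.org")

    # Build the same TChain you would use in a local session ...
    chain = ROOT.TChain("events")   # placeholder tree name
    chain.Add("root://storage.example.org//data/run*.root")

    # ... then redirect its processing to the cluster:
    chain.SetProof()                # without this line, Process() runs locally
    chain.Process("mySelector.C+")  # the selector is compiled and run on the workers;
                                    # PROOF splits the files and merges the outputs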
Baseline Services
The basic baseline services, from the TDR (2005):
• Storage Element: Castor, dCache, DPM (with SRM 1.1); StoRM added in 2007; SRM 2.2, after long delays, is being deployed in production
• Basic transfer tools: GridFTP, …
• File Transfer Service (FTS)
• LCG File Catalog (LFC)
• LCG data management tools: lcg-utils (illustrated below)
• POSIX I/O: Grid File Access Library (GFAL)
• Synchronised databases, Tier-0 → Tier-1s: the 3D project
• Information System
• Compute Elements: Globus/Condor-C; web services (CREAM)
• gLite Workload Management: in production at CERN
• VO Management System (VOMS)
• VO Boxes
• Application software installation
• Job Monitoring Tools
… continuing evolution: reliability, performance, functionality, requirements
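A hedged sketch of the everyday data-management layer these services expose. lcg-cr, lcg-rep and lcg-cp are genuine lcg-utils commands; the logical file name, storage elements and paths are placeholders:

    import subprocess

    LFN = "lfn:/grid/atlas/user/demo/events.root"   # placeholder logical file name

    # Copy a local file to a Storage Element and register it in the LFC catalogue.
    subprocess.run(["lcg-cr", "--vo", "atlas",
                    "-l", LFN,                      # name to register in the catalogue
                    "-d", "se.example.org",         # placeholder destination SE
                    "file:/tmp/events.root"], check=True)

    # Replicate the file to a second site ...
    subprocess.run(["lcg-rep", "--vo", "atlas",
                    "-d", "se2.example.org", LFN], check=True)

    # ... and fetch it back anywhere on the grid via its logical name.
    subprocess.run(["lcg-cp", "--vo", "atlas",
                    LFN, "file:/tmp/events_copy.root"], check=True)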
3D: Distributed Deployment of Databases for LCG
• Oracle streaming with downstream capture (ATLAS, LHCb)
• SQUID/FroNTier web caching (CMS)
LHCOPN Architecture
[Diagram: the Tier-1 centres (IN2P3, TRIUMF, ASCC, FNAL, BNL, Nordic, CNAF, SARA, PIC, RAL, GridKa) interconnected around CERN on the LHC Optical Private Network, each with Tier-2s attached.]
Tier-2s and Tier-1s are inter-connected by the general-purpose research networks.
Any Tier-2 may access data at any Tier-1.
The usage, the number of jobs, the production: the real success!
Data Transfer out of Tier-0
Site Reliability
83 Tier-2 sites being monitored
Reliability targets and results, CERN + Tier-1s:
                 Before July | Jul 07 | Dec 07 | Avg. last 3 months
Each site:           88%     |   91%  |   93%  |        89%
8 best sites:        88%     |   93%  |   95%  |        93%
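For orientation, a toy version of how such a reliability figure falls out of periodic availability tests (a sketch under simplifying assumptions; WLCG derived the real numbers from SAM test results and also corrected for scheduled downtime, which is ignored here):

    # Toy reliability figure from a month of hourly SAM-style test results:
    # 1 = test passed, 0 = test failed (placeholder data).
    tests = [1] * 670 + [0] * 50    # 720 hours in a 30-day month

    reliability = sum(tests) / len(tests)
    print(f"site reliability: {reliability:.0%}")   # -> 93%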
Grid Production per VO in one year
HEP: 33 million jobs, ~110 million normalised CPU
HEP Grid Production in one year
[Chart: contributions from BaBar, D0, ILC, …]
CMS simulation, 2nd term 2007: ~675 M events
[Chart: contributions from CC-IN2P3, FNAL, PIC]
ATLAS: the data chain works (Sept 2007)
Tracks recorded in the muon chambers of the ATLAS detector were express-shipped to physicists all over the world, enabling simultaneous analysis at sites across the globe. About two million muon events were recorded over two weeks.
Terabytes of data were moved from the Tier-0 site at CERN to Tier-1 sites across Europe (seven sites), North America (one site in the U.S. and one in Canada) and Asia (one site in Taiwan). Data transfer rates reached the expected maximum. Real analysis (at Tier-2s) happened in quasi real-time at sites across Europe and the U.S.
Ramp-up Needed for Start-up
[Plots, Sep 06 / Jul 07 / Apr 08: installed capacity versus pledges and target usage, requiring growth factors of 3.7x, 3x, 2.9x, 2.3x and 3.7x across the five plots.]
The Grid is now in operation, working on: reliability, scaling up, sustainability
Summary
• Applications support in good shape
• WLCG service: baseline services in production, with the exception of SRM 2.2; continuously increasing capacity and workload; general site reliability is improving, but still a concern; data and storage remain the weak points
• Experiment testing progressing: now involving most sites, approaching full dress rehearsals
• Sites & experiments working well together to tackle the problems
• Major Combined Computing Readiness Challenge, Feb-May 2008, before the machine starts: essential to provide experience for site operations and storage systems, stressed simultaneously by all four experiments
• Steep ramp-up ahead to deliver the capacity needed for the 2008 run
Improving Reliability
• Monitoring
• Metrics
• Workshops
• Data challenges
• Experience
• Systematic problem analysis
• Priority from software developers