The ALICE Computing F.Carminati May 4, 2006 Madrid, Spain


Page 1

The ALICE Computing

F. Carminati

May 4, 2006

Madrid, Spain

Page 2

4/5/2006 fca @ CIEMAT

Trigger levels and data flow:

level 0 - special hardware

level 1 - embedded processors

level 2 - PCs

data recording & offline analysis

Rates along the chain: 8 kHz (160 GB/sec), 200 Hz (4 GB/sec), 30 Hz (2.5 GB/sec), 30 Hz (1.25 GB/sec)

Total weight 10,000 t

Overall diameter 16.00 m

Overall length 25 m

Magnetic Field 0.4 Tesla

ALICE Collaboration: ~1/2 of ATLAS or CMS, ~2x LHCb; ~1000 people, 30 countries, ~80 institutes

Page 3

The history

• Developed since 1998 along a coherent line

• Developed in close collaboration with the ROOT team

• No separate physics and computing team
– Minimise communication problems
– May lead to “double counting” of people

• Used for the simulations and reconstructions in the TDRs of all detectors and in the Computing TDR

Page 4

The framework

[Diagram: AliRoot framework components]

ROOT

AliEn + LCG

AliRoot

STEER: AliSimulation, AliReconstruction, AliAnalysis, ESD

Virtual MC: G3, G4, FLUKA

EVGEN: HIJING, MEVSIM, PYTHIA6, PDF, ISAJET, HBTP

Analysis: HBTAN, JETAN

Detector and other modules: ITS, TPC, TRD, TOF, PHOS, EMCAL, RICH, ZDC, PMD, FMD, MUON, CRT, START, STRUCT, RALICE

Page 5

The code

• 0.5 MLOC C++
• 0.5 MLOC “vintage” FORTRAN code
• Nightly builds
• Strict coding conventions
• Subset of C++ (no templates, STL or exceptions!)
– “Simple” C++, fast compilation and link (see R.Brun’s talk)
– No configuration management tools (only cvs)
– aliroot is a single package to install

• Maintained on several systems
– DEC-Tru64, Mac OSX, Linux RH/SLC/Fedora (i32:i64:AMD), Sun Solaris

• 30% developed at CERN and 70% outside

Page 6

The tools

• Coding convention checker
• Reverse engineering
• Smell detection
• Branch instrumentation
• Genetic testing (in preparation)
• Aspect Oriented Programming (in preparation)

Page 7

The Simulation

[Diagram: Virtual Monte Carlo (VMC) layout]

User Code

VMC

Transport codes: G3 transport (G3), G4 transport (G4), FLUKA transport (FLUKA)

Geometrical Modeller

Reconstruction

Visualisation

Generators

See A.Morsch’s talk
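The diagram places the user code in front of a single VMC layer, with G3, G4 and FLUKA transport behind it. A minimal sketch of that idea is an abstract interface with one implementation per transport code. All class and function names here are hypothetical stand-ins, not the actual VMC classes, and the step counts are dummy values chosen only to make the engines distinguishable.

```cpp
#include <cstdio>

// Hypothetical stand-in for the VMC layer: user code sees only this interface.
class TransportEngine {
public:
    virtual ~TransportEngine() {}
    virtual const char* Name() const = 0;
    // Transport one event and return a step count (dummy value in this sketch).
    virtual int ProcessEvent(int nPrimaries) = 0;
};

// One stub per transport code named on the slide.
class Geant3Stub : public TransportEngine {
public:
    const char* Name() const override { return "G3"; }
    int ProcessEvent(int nPrimaries) override { return 10 * nPrimaries; }
};

class Geant4Stub : public TransportEngine {
public:
    const char* Name() const override { return "G4"; }
    int ProcessEvent(int nPrimaries) override { return 11 * nPrimaries; }
};

class FlukaStub : public TransportEngine {
public:
    const char* Name() const override { return "FLUKA"; }
    int ProcessEvent(int nPrimaries) override { return 12 * nPrimaries; }
};

// User code, written once against the interface and reused for every engine.
int RunSimulation(TransportEngine& engine, int nEvents, int nPrimaries) {
    int totalSteps = 0;
    for (int i = 0; i < nEvents; ++i) {
        totalSteps += engine.ProcessEvent(nPrimaries);
    }
    std::printf("%s: %d events, %d steps\n", engine.Name(), nEvents, totalSteps);
    return totalSteps;
}
```

Calling `RunSimulation` with a `Geant3Stub`, a `Geant4Stub` or a `FlukaStub` exercises the same user code against each transport code, which is the property the diagram illustrates.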


Page 9

TGeo modeller

[Figure: Performance for "Where am I" - physics case (G3 geometries collected in 2002). Axis: microsec/point (1 million …), scale 0 to 30. Geometries: Gexam1, Gexam3, Gexam4, ATLAS, CMS, BRAHMS, CDF, MINOS_NEAR, BTEV, TESLA. Series: ROOT, G3.]

Page 10

The reconstruction

• Incremental process
– Forward propagation towards the vertex: TPC → ITS
– Back propagation: ITS → TPC → TRD → TOF
– Refit inward: TOF → TRD → TPC → ITS

• Continuous seeding
– Track segment finding in all detectors

• Combinatorial tracking in ITS
– Weighted two-track χ² calculated
– Effective probability of cluster sharing
– Probability not to cross a given layer for secondary particles

[Figure labels: Best track 1, Best track 2, Conflict!; detectors TRD, TPC, ITS, TOF]

See P.Hristov’s talk
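The incremental process can be read as three ordered passes over the detectors. The sketch below only records which detector updates a track in which order; `Track`, `Pass`, `updateTrack` and `reconstruct` are hypothetical names, and the actual track fit at each step is deliberately left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Minimal track stand-in: counts updates and records the visiting order.
struct Track {
    int nUpdates = 0;
    std::string history;
};

struct Pass {
    std::string name;
    std::vector<std::string> detectors;  // order in which the pass visits them
};

// The three passes listed on the slide, in the order they are run.
std::vector<Pass> trackingPasses() {
    return {
        {"forward propagation", {"TPC", "ITS"}},
        {"back propagation",    {"ITS", "TPC", "TRD", "TOF"}},
        {"refit inward",        {"TOF", "TRD", "TPC", "ITS"}},
    };
}

// Placeholder for the real update: a fit would use the detector's clusters here.
void updateTrack(Track& track, const std::string& detector) {
    if (!track.history.empty() && track.history.back() != '|') {
        track.history += '>';
    }
    track.history += detector;
    ++track.nUpdates;
}

// Run all passes on one track; '|' separates passes in the history string.
Track reconstruct() {
    Track track;
    const std::vector<Pass> passes = trackingPasses();
    for (std::size_t p = 0; p < passes.size(); ++p) {
        if (p > 0) track.history += '|';
        for (const std::string& detector : passes[p].detectors) {
            updateTrack(track, detector);
        }
    }
    return track;
}
```

The resulting history string makes the direction of each pass explicit: inward first, then outward through all four detectors, then inward again for the final refit.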

Page 11

Calibration

[Diagram: calibration data flow]

Sources, each accessed through an API: DAQ, Trigger, DCS, ECS, HLT, Physics data

shuttle

DCDB

AliEn+LCG: metadata, file store

calibration procedures, calibration files

AliRoot: Calibration classes

API: Application Program Interface

From URs: source, volume, granularity, update frequency, access pattern, runtime environment and dependencies

Page 12

Alignment

Simulation

Ideal Geometry

Misalignment

Reconstruction

Raw data

File from survey

Ideal Geometry

Alignment procedure

Page 13

Tag architecture

[Diagram: tag architecture]

Reconstruction writes one tag record per event: ev#guid with Tag1, tag2, tag3…

Index builder → Bitmap Index

Analysis job → Selection → List of ev#guid’s

The selected events are distributed to proof#1, proof#2, proof#3 … proof#n, each receiving guid#{ev1…evn}
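One way to picture the tag architecture is a bitmap index with one bit per event record for each tag, a selection that ANDs the bitmaps of the required tags, and a final grouping of the selected event numbers by file GUID, ready to hand to PROOF workers. The sketch below follows that reading; the types, the yes/no tag model and the sample data are assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One tag record per event (hypothetical layout): file GUID, event number,
// and Tag1, tag2, tag3... modelled as yes/no flags.
struct EventTags {
    std::string guid;
    int event;
    std::vector<bool> tags;
};

// Bitmap index: index[t][r] is true when record r has tag t set.
typedef std::vector<std::vector<bool> > BitmapIndex;

BitmapIndex buildIndex(const std::vector<EventTags>& records, std::size_t nTags) {
    BitmapIndex index(nTags, std::vector<bool>(records.size(), false));
    for (std::size_t r = 0; r < records.size(); ++r) {
        for (std::size_t t = 0; t < nTags && t < records[r].tags.size(); ++t) {
            index[t][r] = records[r].tags[t];
        }
    }
    return index;
}

// AND the bitmaps of all required tags (each must be < nTags), then group the
// surviving events by GUID: the result is one guid#{ev1...evn} list per file.
std::map<std::string, std::vector<int> > selectEvents(
        const std::vector<EventTags>& records,
        const BitmapIndex& index,
        const std::vector<std::size_t>& requiredTags) {
    std::vector<bool> mask(records.size(), true);
    for (std::size_t i = 0; i < requiredTags.size(); ++i) {
        const std::vector<bool>& bitmap = index[requiredTags[i]];
        for (std::size_t r = 0; r < records.size(); ++r) {
            mask[r] = mask[r] && bitmap[r];
        }
    }
    std::map<std::string, std::vector<int> > selection;
    for (std::size_t r = 0; r < records.size(); ++r) {
        if (mask[r]) selection[records[r].guid].push_back(records[r].event);
    }
    return selection;
}

// Made-up sample: two files with two events each.
std::vector<EventTags> sampleRecords() {
    return {
        {"guid-A", 1, {true,  false, true}},
        {"guid-A", 2, {true,  true,  false}},
        {"guid-B", 1, {true,  true,  true}},
        {"guid-B", 2, {false, false, true}},
    };
}
```

Grouping by GUID keeps all selected events of one file together, so a worker opens each file once.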

Page 14

Visualisation


See M.Tadel’s talk

Page 15

ALICE Analysis Basic Concepts

• Analysis Models
– Prompt reco/analysis at T0 using PROOF infrastructure
– Batch Analysis using GRID infrastructure
– Interactive Analysis using PROOF(+GRID) infrastructure

• User Interface
– ALICE users access any GRID infrastructure via AliEn or ROOT/PROOF UIs

• AliEn
– Native and “GRID on a GRID” (LCG/EGEE, ARC, OSG)
– Integrate as much as possible common components: LFC, FTS, WMS, MonALISA ...

• PROOF/ROOT
– Single + multitier static and dynamic PROOF cluster
– GRID API class: TGrid (virtual), TAliEn (real)

Page 16

If you thought this was difficult ...

NA49 experiment:

A Pb-Pb event

Page 17

ALICE Pb-Pb central event

Nch(-0.5 < η < 0.5) = 8000

… then what about this!

Page 18

ALICE Collaboration

[Map labels: UK, PORTUGAL, JINR, GERMANY, SWEDEN, CZECH REP., HUNGARY, NORWAY, SLOVAKIA, POLAND, NETHERLANDS, GREECE, DENMARK, FINLAND, SWITZERLAND, RUSSIA, CERN, FRANCE, MEXICO, CROATIA, ROMANIA, CHINA, USA, ARMENIA, UKRAINE, INDIA, ITALY, S. KOREA]

~1000 Members (63% from CERN MS)

~30 Countries

~80 Institutes

[Figure: ALICE Collaboration statistics, 1990 to 2004, vertical scale 0 to 1200; markers: LoI, MoU, TP, TRD]

Page 19

CERN computing power

• “High throughput” computing based on reliable commercial components

• More than 1500 dual-CPU PCs
– 5000 in 2007

• More than 3 PB of data on disks & tapes
– > 15 PB in 2007

Far from enough!

Page 20

EGEE production service

• >180 sites
• >15 000 CPUs
• ~14 000 jobs completed per day
• 20 VOs
• >800 registered users that represent thousands of scientists

http://gridportal.hep.ph.ic.ac.uk/rtm/

Situation 20 September 2005

Page 21

ALICE view on the current situation

EDG

AliEn

Exp specific services

LCG

AliEn arch + LCG code

EGEE

Exp specific services (AliEn’ for ALICE)

EGEE, ARC, OSG…

Page 22

[Diagram: job execution and file access on the ALICE Grid]

ALICE central services: ALICE Job Catalogue, Optimizer, ALICE File Catalogue, ALICE catalogues

ALICE Job Catalogue (the User submits jobs here):

Job 1.1 lfn1
Job 1.2 lfn2
Job 1.3 lfn3, lfn4
Job 2.1 lfn1, lfn3
Job 2.2 lfn2, lfn4
Job 3.1 lfn1, lfn3
Job 3.2 lfn2

ALICE File Catalogue entries: lfn, guid, {se’s}

Site: VO-Box, LCG, Computing Agent, RB, CE, WN, packman, SA, xrootd, LFC, SRM, MSS, User Job

Arrow labels: Submits job, Workload request, Execs agent, File access, GUID, SURL, Registers output

Page 23

Distributed analysis

[Diagram: distributed analysis flow]

User job (many events)

File Catalogue query → Data set (ESD’s, AOD’s)

Job Optimizer: Sub-job 1, Sub-job 2 … Sub-job n, grouped by SE files location

Job Broker: submit to CE with closest SE

CE and SE processing (one per sub-job)

Output file 1, Output file 2 … Output file n

File merging job → Job output
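The splitting step of the distributed analysis can be sketched as grouping the input files of a user job by the storage element (SE) that holds them, giving one sub-job per SE. The types and names below are hypothetical, and the sketch assumes exactly one SE per file, whereas the catalogue can list several SEs for a file.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical catalogue entry: a logical file name and the SE that stores it.
struct FileEntry {
    std::string lfn;
    std::string se;
};

// One sub-job per SE, carrying the files that live on that SE.
struct SubJob {
    std::string se;
    std::vector<std::string> lfns;
};

// Group the data set by SE so each sub-job can run on a CE close to its data.
std::vector<SubJob> splitBySE(const std::vector<FileEntry>& dataset) {
    std::map<std::string, std::vector<std::string> > bySE;
    for (std::size_t i = 0; i < dataset.size(); ++i) {
        bySE[dataset[i].se].push_back(dataset[i].lfn);
    }
    std::vector<SubJob> subJobs;
    std::map<std::string, std::vector<std::string> >::const_iterator it;
    for (it = bySE.begin(); it != bySE.end(); ++it) {
        SubJob job;
        job.se = it->first;
        job.lfns = it->second;
        subJobs.push_back(job);
    }
    return subJobs;
}
```

Each sub-job would then produce its own output file, and a final merging job would combine those outputs into the job output.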

Page 24

Data Challenge

• Last (!) exercise before data taking

• Test of the system started with simulation

• Up to 3600 jobs running in parallel

• Next will be reconstruction and analysis

Page 25

Pledged by external sites versus required MoU

                                 2007          2008          2009          2010
                                 T1     T2     T1     T2     T1     T2     T1     T2
CPU   TDR requirement (MSI 2K)   4.9    5.8    12.3   14.4   16.0   18.7   20.9   24.3
      Missing %                  -44%   -40%   -43%   -56%   -29%   -41%   -24%   -53%
Disk  TDR requirement (PB)       3.1    1.5    7.9    3.7    10.2   4.8    13.3   6.2
      Missing %                  -61%   -35%   -61%   -48%   -51%   -28%   -46%   -32%
MS    TDR requirement (TB)       2779   -      6947   -      9031   -      11740  -
      Missing %                  -45%   -      -45%   -      -15%   -      -9%    -

ALICE computing model

• For pp similar to the other experiments
– Quasi-online data distribution and first reconstruction at T0
– Further reconstructions at T1’s

• For AA different model
– Calibration, alignment, pilot reconstructions and partial data export during data taking
– Data distribution and first reconstruction at T0 in the four months after AA run (shutdown)
– Further reconstructions at T1’s

• T0: First pass reconstruction, storage of RAW, calibration data and first-pass ESD’s
• T1: Subsequent reconstructions and scheduled analysis, storage of a collective copy of RAW and one copy of data to be safely kept, disk replicas of ESD’s and AOD’s
• T2: Simulation and end-user analysis, disk replicas of ESD’s and AOD’s

Page 26

[Organisation chart: ALICE Offline]

Boards and external bodies: Management Board; International Computing Board; Offline Board (Chair: Comp Coord); Regional Tiers; Detector Projects; Software Projects; DAQ; HLT; LCG (SC2, PEB, GDB, POB); EU Grid coord.; US Grid coord.

Core Computing and Software

Offline Coord. (Deputy PL): Offline Coordination
• Resource planning
• Relation with funding agencies
• Relations with C-RRB

Production Environment Coord.
• Production environment (simulation, reconstruction & analysis)
• Distributed computing environment
• Database organisation

Framework & Infrastructure Coord.
• Framework development (simulation, reconstruction & analysis)
• Persistency technology
• Computing data challenges
• Industrial joint projects
• Tech. Tracking
• Documentation

Simulation Coord.
• Detector Simulation
• Physics simulation
• Physics validation
• GEANT 4 integration
• FLUKA integration
• Radiation Studies
• Geometrical modeler

Reconstruction & Physics Soft Coord.
• Tracking
• Detector reconstruction
• Global reconstruction
• Analysis tools
• Analysis algorithms
• Physics data challenges
• Calibration & alignment algorithms

Page 27

Conclusions

• ALICE has followed a single line of evolution for eight years

• Most of the initial choices have been validated by our experience

• Some parts of the framework still have to be populated by the sub-detectors

• Wish us good luck!
