56
Alice Alice Grid Activities Grid Activities Peter Peter Malzacher Malzacher ( GSI GSI) for the Alice Grid Project ) for the Alice Grid Project [email protected] [email protected] Seminar Datenverarbeitung in der Hochenergiephysik Seminar Datenverarbeitung in der Hochenergiephysik DESY, 27.5.2002 DESY, 27.5.2002

Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Embed Size (px)

Citation preview

Page 1: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Alice Alice Grid ActivitiesGrid Activities

Peter Peter MalzacherMalzacher ((GSIGSI) for the Alice Grid Project) for the Alice Grid [email protected]@gsi.de

Seminar Datenverarbeitung in der Hochenergiephysik Seminar Datenverarbeitung in der Hochenergiephysik DESY, 27.5.2002DESY, 27.5.2002

Page 2: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

ALICE GRID ProjectALICE GRID Project

G.G.AndronicoAndronico(CT), T.(CT), T.AnticicAnticic(IRB), R.(IRB), R.BarberaBarbera(CT), (CT), I.J.I.J.BloodworthBloodworth(Birmingham), M.(Birmingham), M.BotjeBotje(NIKHEF), P.(NIKHEF), P.BuncicBuncic(CERN), (CERN),

F.F.CarminatiCarminati(CERN), (CERN), P.P.CerelloCerello(TO)(TO), G., G.ChabratovaChabratova(DUBNA), (DUBNA), J.J.CleymansCleymans((CapetownCapetown), J.G.Contreras(MERIDA), D.), J.G.Contreras(MERIDA), D.DiBariDiBari(BA), (BA),

DeGirolamoDeGirolamo(BA), K.(BA), K.FlurchickFlurchick(OSC), J.(OSC), J.GossetGosset(SACLAY), (SACLAY), A.A.GrigoryanGrigoryan((YerevanYerevan), D.Johnson(OSC), M.L.), D.Johnson(OSC), M.L.LuvisettoLuvisetto(BO), (BO), P.P.MalzacherMalzacher(GSI), M.(GSI), M.MaseraMasera(CERN/TO), A. (CERN/TO), A. MasoniMasoni(CA), (CA),

K.K.MkoyanMkoyan((YerevanYerevan), D.Mura(CA), L.), D.Mura(CA), L.NellenNellen(UNAM), B.(UNAM), B.NilsenNilsen(OSU), (OSU), G.G.OdinyekOdinyek(LBNL), D.Olson(LBNL), S.K.Pal(Calcutta), F.(LBNL), D.Olson(LBNL), S.K.Pal(Calcutta), F.PierellaPierella(BO), (BO),

F.F.RademakersRademakers (CERN), I.(CERN), I.SacreidaSacreida(NERSC), P.(NERSC), P.SaizSaiz(CERN), (CERN), Y.Y.SchutzSchutz(SUBATECH), K.Schwarz(GSI), E.(SUBATECH), K.Schwarz(GSI), E.ScomparinScomparin(TO), (TO),

M.M.SittaSitta(TO), R.(TO), R.TurrisiTurrisi(PD), D.(PD), D.VicinanzaVicinanza(SA)(SA)

38 people, 21 institutions38 people, 21 institutions

�� http://www.to.http://www.to.infninfn.it/activities/experiments/.it/activities/experiments/alicealice--gridgrid

Page 3: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

OverviewOverview

§§ LHC computing challengeLHC computing challenge§ MONARC, EU DG, LCG

§§ The Grid metaphorThe Grid metaphor§ Globus

§§ Alice Grid ActivitiesAlice Grid Activities§ The Alice Offline Project§ Remote job submission and monitoring§ AliEn§ Interface Root-Globus, Root-Alien, Root-...§ Distributed analysis with proof

§§ Summary Summary

Page 4: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

OverviewOverview

èè LHC computing challengeLHC computing challenge§ MONARC, EU DG, LCG

§§ The Grid metaphorThe Grid metaphor§ Globus

§§ Alice Grid ActivitiesAlice Grid Activities§ The Alice Offline Project§ Remote job submission and monitoring§ AliEn§ Interface Root-Globus, Root-Alien, Root-...§ Distributed analysis with proof

§§ Summary Summary

Page 5: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

LEP (end 2000) LEP (end 2000) --> LHC (start > LHC (start 20020077))

Page 6: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

CMSATLAS

LHCb

The LHC DetectorsThe LHC Detectors

Page 7: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

source: CERN/LHCC/2001-004 - Report of the LHC Computing Review - 20 February 2001

(ATLAS with 270Hz trigger)Regional Grand

Tier 0 Tier 1 Total Centres TotalProcessing (K SI95) 1,727 832 2,559 4,974 7,533Disk (PB) 1.2 1.2 2.4 8.7 11.1Magnetic tape (PB) 16.3 1.2 17.6 20.3 37.9

---------- CERN ----------

Summary of Computing Capacity Required for all LHC Experiments in 2007

1 SPECint95 = 10 CERNunits = 40 MIPStoday dual PIII ~ 100 SPECint95

Computing Requirements

Page 8: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Department α βγ

Desktop

The MONARC RC Topology The MONARC RC Topology

CERN – Tier 0

Tier 1 FNAL RC...

IN2P3622 M

bps2.5 Gbps

622

Mbp

s

155 m

bps 155 mbps

Tier2 Lab aUni b Lab c

Uni n

Page 9: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

HEP Data Grid InitiativeHEP Data Grid Initiative

§§ European level coordination of national initiatives & projectsEuropean level coordination of national initiatives & projects§§ Principal goals:Principal goals:

§ Middleware for fabric & Grid management§ Large scale testbed - major fraction of one LHC experiment§ Production quality HEP demonstrations

¨ “mock data”, simulation analysis, current experiments§ Other science demonstrations

§§ Three year phased developments & demosThree year phased developments & demos§§ Complementary to other GRID projectsComplementary to other GRID projects

§ EuroGrid: Uniform access to parallel supercomputing resources§§ Synergy to be developed (GRID Forum, Industry and Research Synergy to be developed (GRID Forum, Industry and Research

Forum)Forum)

Page 10: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Work PackagesWork Packages

WWPP 11 GGrriidd WWoorrkkllooaadd MMaannaaggeemmeenntt CC..VViissttoollii//IINNFFNN WWPP 22 GGrriidd DDaattaa MMaannaaggeemmeenntt BB..SSeeggaall//CCEERRNN WWPP 33 GGrriidd MMoonniittoorriinngg sseerrvviicceess RR..MMiiddddlleettoonn//PPPPAARRCC WWPP 44 FFaabbrriicc MMaannaaggeemmeenntt TT..SSmmiitthh//CCEERRNN WWPP 55 MMaassss SSttoorraaggee MMaannaaggeemmeenntt JJ..GGoorrddoonn//PPPPAARRCC WWPP 66 IInntteeggrraattiioonn TTeessttbbeedd FF..EEttiieennnnee//CCNNRRSS WWPP 77 NNeettwwoorrkk SSeerrvviicceess CC..MMiicchhaauu//CCNNRRSS WWPP 88 HHEEPP AApppplliiccaattiioonnss FF..CCaarrmmiinnaattii//CCEERRNN WWPP 99 EEOO SScciieennccee AApppplliiccaattiioonnss CC..MMiicchhaauu//CCNNRRSS WWPP 1100 BBiioollooggyy AApppplliiccaattiioonnss CC..MMiicchhaauu//CCNNRRSS WWPP 1111 DDiisssseemmiinnaattiioonn GG..MMaassccaarrii//CCNNRR WWPP 1122 PPrroojjeecctt MMaannaaggeemmeenntt FF..GGaagglliiaarrddii//CCEERRNN

Page 11: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

More realistically More realistically -- a Grid Topologya Grid Topology

CERN – Tier 0

Tier 1 FNAL RC...

IN2P3622 M

bps2.5 Gbps

622

Mbp

s

155 m

bps 155 mbps

Tier2 Lab aUni b Lab c

Uni n

Department αβ

γ

Desktop

Page 12: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

LHC Computing ModelLHC Computing Model

CERNTier2

Lab a

Uni a

Lab c

Uni n

Lab m

Lab b

Uni bUni y

Uni x

PhysicsDepartment

α

β

γDesktop

Germany

Tier 1

USAFermiLab

UK

France

Italy

NL

USABrookhaven

……….

The LHC Computing Centre

Page 13: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

LHC Computing ModelLHC Computing Model

CERNTier2

Lab a

Uni a

Lab c

Uni n

Lab m

Lab b

Uni bUni y

Uni x

PhysicsDepartment

α

β

γDesktop

Germany

Tier 1

USAFermiLab

UK

France

Italy

NL

USABrookhaven

……….

The LHC Computing Centre

Page 14: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

LHC Computing ReviewLHC Computing Review

§§ Report issued on February 2001Report issued on February 2001§§ Recognition ofRecognition of

§ Size of LHC computing: 240MCHF initially and every 3y§ Understaffing of both IT and Offline core teams § Distributed hierarchical computing model (MONARC)

§§ RecommendationsRecommendations§ GRID technology development§ Common projects§ Establishment of proper management structure§ Continued support for common libraries and introduction of

the CERN support for FLUKA & ROOT§ Data challenges on common Prototype testbed§ Improvement of the planning of the Off-line Projects § Drafting of software MOU

Page 15: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

The LHC Computing Grid Project Structure

The LHC Computing Grid Project

LHCC

Project Overview Board

RTAG

Reports

Reviews

CommonComputing

RRB

Resource Matters

OtherComputing

GridProjects

EUDataGridProject

implementation teams

Other HEPGrid

Projects

OtherLabs

Project Manager

ProjectExecution

Board

Requirements,Monitoring

Software andComputingCommittee

(SC2)

Page 16: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

LHC GRID Computing Phase 1LHC GRID Computing Phase 1

§§ Technology development and data challengesTechnology development and data challenges§§ Production prototype at CERN, MS&NMS 2001Production prototype at CERN, MS&NMS 2001--2004 2004

§ Technical Design Report in 2003§ ~50% in complexity of one LHC experiment

§§ Training and technology opportunities Training and technology opportunities §§ Approved in the Council Meeting of 20Approved in the Council Meeting of 20--SeptSept--0101

§ ~ 40 FTEs for three years (more pledged)§ ~ 30 MCHF (>10 pledged from various sources)

§§ Critical level for “approval to start” reachedCritical level for “approval to start” reached§§ Need still more funds for the CERN prototypingNeed still more funds for the CERN prototyping

§ Contributions in kind, EU-FP6, industry contributions§§ Discussion with M & NM States, agencies, industriesDiscussion with M & NM States, agencies, industries

§ Work out convention with each funding agency

Page 17: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

The Problem The Problem -- SummarySummary

§§ ScalabilityScalability àà cost cost àà managementmanagement§ Thousands of processors, thousands of disks, PetaBytes

of data, Terabits/second of I/O bandwidth, ….§§ WideWide--area distributionarea distribution

§ WANs are and will be 1% of LANs§ Distribute, replicate, cache, synchronize the data§ Multiple ownership, policies, ….§ Integration of this amorphous collection of Regional

Centers ..§ .. with some attempt at optimization

§§ AdaptabilityAdaptability§ We shall only know how analysis is done once the data

arrives

Page 18: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

OverviewOverview

§§ LHC computing challengeLHC computing challenge§ MONARC, EU DG, LCG

èè The Grid metaphorThe Grid metaphor§ Globus

§§ Alice Grid ActivitiesAlice Grid Activities§ The Alice Offline Project§ Remote job submission and monitoring§ AliEn§ Interface Root-Globus, Root-Alien, Root-...§ Distributed analysis with proof

§§ Summary Summary

Page 19: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

The GRID metaphorThe GRID metaphor

§§ Unlimited ubiquitous Unlimited ubiquitous distributed computingdistributed computing

§§ Transparent access to Transparent access to multipetabytemultipetabyte distributed distributed data basesdata bases

§§ Easy to plug inEasy to plug in

§§ Hidden complexity of the Hidden complexity of the infrastructureinfrastructure

§§ Analogy with the electrical Analogy with the electrical power GRIDpower GRID

Ian Foster and Carl Kesselman, editors, “The Grid: Blueprint for a New Computing Infrastructure,” Morgan Kaufmann, 1999, http://www.mkp.com/grids

Page 20: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Five Emerging Models of Networked Computing from Five Emerging Models of Networked Computing from

§§ Distributed ComputingDistributed Computing§ || synchronous processing

§§ HighHigh--Throughput ComputingThroughput Computing§ || asynchronous processing

§§ OnOn--Demand ComputingDemand Computing§ || dynamic resources

§§ DataData--Intensive ComputingIntensive Computing§ || databases

§§ Collaborative ComputingCollaborative Computing§ || scientists

Page 21: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

How to build a Grid Application?How to build a Grid Application?

§§ Do It YourselfDo It Yourself§§ Legion = Legion = MetaOS MetaOS on top of local OSon top of local OS

“Legion is a reflective object-based wide-area system designed from first principles to address the difficult technical challenges of using distributed resources (cycles, data, people, applications)”

¨ www.legion.virginia.edu§ Globus = Protocol/Toolkit approach

“The Globus Toolkit provides a set of standard services for authentication, resource location, resource allocation, configuration, communication, file access, fault detection, and executable management. These services can be incorporated into applications and/or programming tools in a "mix-and-match" fashion to provide access to needed capabilities”

¨ www.globus.org

Page 22: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Globus ApproachGlobus Approach

§§ Focus on architecture issuesFocus on architecture issues§ Propose set of core services as basic

infrastructure§ Use to construct high-level, domain-

specific solutions§§ Design principlesDesign principles

§ Keep participation cost low§ Enable local control§ Support for adaptation

Diverse global services

Core Globusservices

Local OS

A p p l i c a t i o n s

GASS

GRAM

GIS

GSI

Page 23: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

“Data Grid” Architecture Elements“Data Grid” Architecture Elements

CPUCPU

resourcemanager

Enquiry (LDAP)Access (GRAM)

Storage

Storageresourcemanager

Enquiry (LDAP)Access (???)

Locationcataloging

Metadatacataloging

Replicaselection

Attribute-basedlookup

Reliablereplication

VirtualDataCachingTask mgmt

(Condor-G)

Virtual Datacataloging

Data requestmanagement

…A P P L I C A T I O N S

Page 24: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

The Grid from a Services ViewThe Grid from a Services View

:

:E.g.,

Applications

Resource-specific implementations of basic servicesE.g., Transport protocols, name servers, differentiated services, CPU schedulers, public keyinfrastructure, site accounting, directory service, OS bypass

Resource-independent and application-independent servicesauthentication, authorization, resource location, resource allocation, events, accounting,

remote data access, information, policy, fault detection

DistributedComputing

Toolkit

Grid Fabric(Resources)

Grid Services(Middleware)

ApplicationToolkits

Data-Intensive

ApplicationsToolkit

CollaborativeApplications

Toolkit

RemoteVisualizationApplications

Toolkit

ProblemSolving

ApplicationsToolkit

RemoteInstrumentation

ApplicationsToolkit

Applications Chemistry

Biology

Cosmology

High Energy Physics

Environment

Page 25: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

The Globus AdvantageThe Globus Advantage

§§ Single signSingle sign--on for all resourceson for all resources§ No need for user to keep track of accounts and passwords at

multiple sites§ No plaintext passwords

§§ Uniform interface to various local scheduling mechanismsUniform interface to various local scheduling mechanisms§ PBS, Condor, LSF, NQE, LoadLeveler, fork, etc.§ No need to learn and remember obscure command sequences at

different sites§§ Support for staging, etc. Support for staging, etc.

Page 26: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

OverviewOverview

§§ LHC computing challengeLHC computing challenge§ MONARC, EU DG, LCG

§§ The Grid metaphorThe Grid metaphor§ Globus

èè Alice Grid ActivitiesAlice Grid Activities§ The Alice Offline Project§ Remote job submission and monitoring§ AliEn§ Interface Root-Globus, Root-Alien, Root-...§ Distributed analysis with proof

§§ Summary Summary

Page 27: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

ALICE CollaborationALICE Collaboration

0

200

400

600

800

1000

1200

1990 1992 1994 1996 1998 2000 2002 2004

ALICE

Collaboration statistics

LoI

MoU

TP

TRD

UKPORTUGAL

JINR

GERMANY

SWEDEN

CZECH REP.HUNGARYNORWAY

SLOVAKIAPOLANDNETHERLANDS

GREECE

DENMARKFINLAND

SWITZERLAND

RUSSIA CERN

FRANCE

MEXICOCROATIA ROMANIA

CHINA

USAARMENIA

UKRAINE

INDIA

ITALYS. KOREA

937 members (63% from CERN MS)

77 Institutions, 28 Countries

Page 28: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

ALICE ComputingALICE Computing

§§ 1800 SI95, 2.8PB tapes/y, >1 PB disk1800 SI95, 2.8PB tapes/y, >1 PB disk§ Same order of magnitude than ATLAS or CMS

§§ Major strategic decisions already takenMajor strategic decisions already taken§ Move to C++ completed beginning 1999completed beginning 1999

¨ TDRs all produced with the new framework¨ PPR production being done with AliEn/AliRoot

§ Adoption of the ROOT framework§ Tightly knit Off-line team – single development line § Physics performance and computing in a single team§ Ambitious DC program on the LHC prototype§ ALICE DAQ/Computing integration realised during data challenges

in collaboration with ALICE DAQ team and IT division

Page 29: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

AliRootAliRoot layoutlayout

ROOT

AliRoot

STEER

Virtual MC

G3 G4 FLUKA

EVGEN

Page 30: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Alice Offline FrameworkAlice Offline Framework

§§ > 50 active users participate in the development of AliRoot from> 50 active users participate in the development of AliRoot from the the different detector groupsdifferent detector groups§ 71% of the code developed outside, 29% by the core Offline team

§§ AliRootAliRoot frameworkframework§ C++: 400kLOC + 225kLOC (generated) + macros: 77kLOC § FORTRAN: 13kLOC (ALICE) + 914kLOC (external packages)§

¨

¨

¨

¨

§ Maintained on Linux (any version!), HP-UX, DEC Unix, Solaris§§ Two packages to install (ROOT+Two packages to install (ROOT+AliRootAliRoot))

§ Less that 1 second to link (thanks to 37 shared libs)§ 1-click-away install: download and make (non-recursive makefile)

§§ AliEnAliEn§ 50kLOC of PERL5 (ALICE) + 1MLOC of PERL5 (external packages)

§§ Installed on more than 30 sites by physicistsInstalled on more than 30 sites by physicists

Page 31: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Offline MilestonesOffline Milestones

Physics ChallengePhysics Challenge§§ Determine readiness of the offline Determine readiness of the offline

framework for data processingframework for data processing§§ Verify the computing modelVerify the computing model

Data ChallengeData Challenge§§ Verify integration of Offline and DAQVerify integration of Offline and DAQ§§ Assess technologies and modelsAssess technologies and models§§ Determine readiness of DAQ systemDetermine readiness of DAQ system

Test of the final system for Test of the final system for reconstruction and analysis.reconstruction and analysis.

01/0601/06--06/0606/06

20%20%

Complete chain used for trigger studies.Complete chain used for trigger studies.Prototype of the analysis tools.Prototype of the analysis tools.Comparison with parameterised Comparison with parameterised MonteCarloMonteCarlo..Simulated raw data.Simulated raw data.

01/0401/04--06/0406/04

10%10%

First test of the complete chain from First test of the complete chain from simulation to reconstruction for the PPRsimulation to reconstruction for the PPRSimple analysis.Simple analysis.ROOT digits.ROOT digits.

06/0106/01--06/0206/02

5%5%

Physics ObjectivePhysics ObjectivePDC PeriodPDC Period((milestone)milestone)

((SizeSize))

Final system.Final system.5/20065/20061250TB | 1250 MB/s1250TB | 1250 MB/s

Prototype of the final HLT software.Prototype of the final HLT software.Prototype of the final remote data Prototype of the final remote data replication.replication.Raw digitsRaw digits for all detectors.for all detectors.

5/20055/2005750TB | 750 MB/s750TB | 750 MB/s

HLT prototype for all detectors that HLT prototype for all detectors that plan to make use of it.plan to make use of it.Remote reconstruction of partial data Remote reconstruction of partial data streams.streams.Raw digitsRaw digits for barrel and MUON.for barrel and MUON.

5/20045/2004450TB | 450 MB/s450TB | 450 MB/s

Integration of single detector HLT Integration of single detector HLT code, at least for TPC and ITS.code, at least for TPC and ITS.Quasi onQuasi on--line reconstruction at CERNline reconstruction at CERNPartial data replication to remote Partial data replication to remote centres.centres.

5/20035/2003300TB | 300 MB/s300TB | 300 MB/s

Rootification Rootification of raw data. of raw data. Raw data for TPC and ITS.Raw data for TPC and ITS.

10/200210/2002200TB | 200 MB/s200TB | 200 MB/s

Offline milestoneOffline milestoneADC MilestoneADC Milestone((Data to MSSData to MSS))

Page 32: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

ALICE GRID resourcesALICE GRID resources

Yerevan

CERN

Saclay

Lyon

Dubna

Capetown, ZA

Birmingham

Cagliari

NIKHEF

GSI

Catania

BolognaTorino

Padova

IRB

Kolkata, India

OSU/OSCLBL/NERSC

Merida

Bari38 people21 insitutions

FZK

Page 33: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

ALICE/GRID StrategyALICE/GRID Strategy

§§ Test / Use Test / Use Globus Globus ToolsTools§§ Physics Performance Report (Step 1) Physics Performance Report (Step 1)

§ Involving progressively more sites (Catania, CERN, GSI/FZK, Lyon, OSC, NIKHEF, NERSC, Torino)

§§ AliKitAliKit + + AliEnAliEn (+ (+ GlobusGlobus & MRTG & ...)& MRTG & ...)§ Don't wait for EU-DG1

§§ Integration with Integration with DataGRID DataGRID ((TestbedTestbed 1)1)§ AliKit into EU-DG1: straightforward§ Evaluate EU-DG1: Services§ AliEn into EU-DG1: interaction with middleware WP’s

§§ ALICE Data Challenge IVALICE Data Challenge IV§§ Development of distributed parallel analysis tools (PROOF)Development of distributed parallel analysis tools (PROOF)

Page 34: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

ALICE Physics Performance ReportALICE Physics Performance Report

§§ Evaluation of acceptance, efficiency, resolution for signalsEvaluation of acceptance, efficiency, resolution for signals§§ Step1: Simulation of ~10,000 central Step1: Simulation of ~10,000 central PbPb--PbPb eventsevents

§ Hits: 2 GB/event = 20 TB§ Summable Digits: 1 GB/event = 10 TB§ AliEn (Globus Authentication + Data Catalog) 1 user (Production Manager)

§§ Step2: Signal Superposition + Reconstruction 100,000 eventsStep2: Signal Superposition + Reconstruction 100,000 events§ Reconstructed Events = 10 TB x nsignals§ ROOT + Data Manager for access to distributed input: nsignals users

(Production Managers)§§ Step3: Event Analysis: Analysis Object Data = 1 TB x Step3: Event Analysis: Analysis Object Data = 1 TB x nnsignalssignals

§ PROOF + Data Manager + Workload Manager for chaotic access to distributed analysis input: 200 users

Page 35: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Web interface to loginon “The Grid”

Remote Job Submission and Monitoring

Web interface

for

job submissio

n !

CPU LoadDisk spaceNetwork

availability

Boo

k ke

epin

gsy

stem

Onl

y at

Lyo

n

LDAP server

for ALICE

(only in Italy)

Page 36: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

ALICE GRID: Monitoring ALICE GRID: Monitoring

Monitor of:CPU loadDISK availabilityNetwork load

Page 37: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

ALICE GRID August ProductionALICE GRID August Production

§§ Bookkeeping included: Bookkeeping included: § Check production & keep track of the event parameters for selection§ Central Server (Catania) + mirrors (Lyon, Bologna)http://alipc1.ct.infn.it/PPR/PPRmain.php?off=0&nrow=20&rev=1§ MySQL based, ORACLE interface available

Page 38: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

ALICE in EU DGALICE in EU DG

§§ First HEP event producedFirst HEP event producedon the on the DataGRID testbedDataGRID testbedreleased Dec 10, ’01released Dec 10, ’01

§§ ALICE providesALICE providesHEP coordinationHEP coordination

Page 39: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

² Distributed file catalogue, simulating a file structure.

² Each directory is a table in a SQL database, abstract interface to DB’s

² Ownership and privileges like in a UNIX system.

² Completely transparent access to mass storage systems.

² Automatic synchronization among all the databases.

² Secure authentication

² Replication & caching

² User shell interface, C/C++ API

² 150kL perl5, 95% from the WEB² SOAP for client/server and API

implementation

�P.Buncic P.Saiz, J-E.Revshback

Page 40: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Alien

§§ ALIALICE production distributed CE production distributed EnEnvironmentvironment§ Entirely ALICE developed

§§ File Catalogue as a global file system on a RDBFile Catalogue as a global file system on a RDB§§ TAG Catalogue, as extension TAG Catalogue, as extension §§ Secure AuthenticationSecure Authentication

§ Interface to Globus available§§ Central Queue Manager ("pull" Central Queue Manager ("pull" vsvs "push" model)"push" model)

§ Interface to EDG Resource Broker available§§ Monitoring infrastructureMonitoring infrastructure

The CORE GRID functionalityThe CORE GRID functionality§§ Automatic software installation with Automatic software installation with AliKitAliKit§§ http://alien.http://alien.cerncern..chch

Page 41: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

HI PPR Production SummaryHI PPR Production Summary

§§ Entirely Entirely AliEnAliEn based!based!§§ >5500 jobs validated (~24h each)>5500 jobs validated (~24h each)§§ 13 clusters actively used in production, 9 remote sites13 clusters actively used in production, 9 remote sites§§ GSI, GSI, KarlsruheKarlsruhe, , DubnaDubna, Nantes, Budapest, , Nantes, Budapest, BariBari, Zagreb, Birmingham are joining, Zagreb, Birmingham are joining§§ Hijing paramHijing param 8000/4000/2000 8000/4000/2000 dNdN//dydy§§ HijingHijing centrality binscentrality bins

§ fm: 0-5, 0-2, 5-8.6, 8.6-11.2, 11.2-13.2, 13.2-15, 15-inf. §§ HIJING + Jet/Photon Trigger HIJING + Jet/Photon Trigger

§ Minimum Transverse Momentum: 25, 50, 75, 100GeV

CERN CCIN2P3 LBL

Torino Catania Padova

OSC NIKHEF Bari

Page 42: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Other productionsOther productions

LBNL Lyon

Ohio Darmstadt §§ EMCAL productionEMCAL production§ Design and optimisation of the proposed

EM calorimeter§ Entirely AliEn based§ Decided, implemented and realised in two

months§ 2000 jobs, 4000h CPU

§§ pp production august 2001pp production august 2001§ ~ 10000 events generated at CERN in CASTOR§ Transport + TPC digitization

§§ pp production pp production octoberoctober 20012001§ Vertex sampling in the diamond§ > 10000 events – 80 GB in 3 days in Torino transferred to CERN with AliEn in ~20 hours

(band-width saturated)§ Test and tune tracking

Page 43: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

You execute a root script:You execute a root script:The Grid:The Grid:§ Finds convenient places for it to be run§ Organises efficient access to your data

¨ Caching, migration, replication of data and/or code§ Deals with authentication to the different sites that you will be using§ Interfaces to local site resource allocation mechanisms, policies§ Compiles/runs your scripts§ Monitors progress§ Recovers from problems

If there is scope for parallelism, it can also decompose your woIf there is scope for parallelism, it can also decompose your work into rk into convenient execution units based on the available resources, datconvenient execution units based on the available resources, data distribution, a distribution, . . .. . .

The The Vision:Vision:

This can work only, if there is a closecooperation of ROOT and Grid services !

ROOT with the help of Grid Services:

Page 44: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Show me theequation of state of quark gluon plasma

Use CaseUse Case:(:(chaoticchaotic) Analysis) Analysis

PhysicistRootscript

CERN

FNAL RALIN2P3

Lab aUni b Lab c

Uni n

α β γ

Page 45: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

DataGridDataGrid & ROOT& ROOT

Local

Remote

SelectionParameters

Procedure

Proc.C

Proc.C

Proc.C

Proc.C

Proc.C

PROOF

CPU

CPU

CPU

CPU

CPU

CPU

TagDB

RDB DB1

DB4

DB5

DB6

DB3

DB2

Bring the KB to the PB and not the PB to the KB

Page 46: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Parallel Script ExecutionParallel Script Execution

root

Remote PROOF Cluster

proof

proof

proof

TNetFile

TFile

Local PC

$ root

ana.Cstdout/obj

node1

node2

node3

node4

$ root

root [0] .x ana.C

$ root

root [0] .x ana.C

root [1] gROOT->Proof(“remote”)

$ root

root [0] .x ana.C

root [1] gROOT->Proof(“remote”)

root [2] gProof->Exec(“.x ana.C”)

ana.C

proof

proof = slave server

proof

proof = master server

#proof.confslave node1slave node2slave node3slave node4

*.root

*.root

*.root

*.root

TFile

TFile

Page 47: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Running a PROOF JobRunning a PROOF Job

gROOT->Proof();TDSet *treeset = new TDSet("TTree", "AOD");treeset->Add("lfn:/alien.cern.ch/alice/prod2002/file1");. . . gProof->Process(treeset, “myselector.C”);

gROOT->Proof();TDSet *objset = new TDSet("MyEvent", "*", "/events"); objset->Add("lfn:/alien.cern.ch/alice/prod2002/file1");. . . objset->Add(set2003);gProof->Process(objset, “myselector.C”);

Page 48: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

TGrid TGrid Class Class ––Abstract Interface to AliEnAbstract Interface to AliEn

class TGrid : public TObject {public:

virtual Int_t AddFile(const char *lfn, const char *pfn) = 0;virtual Int_t DeleteFile(const char *lfn) = 0;virtual TGridResult *GetPhysicalFileNames(const char *lfn) = 0;virtual Int_t AddAttribute(const char *lfn,

const char *attrname,const char *attrval) = 0;

virtual Int_t DeleteAttribute(const char *lfn,const char *attrname) = 0;

virtual TGridResult *GetAttributes(const char *lfn) = 0;virtual void Close(Option_t *option="") = 0;virtual const char *GetInfo() = 0;const char *GetGrid() const { return fGrid; }const char *GetHost() const { return fHost; }Int_t GetPort() const { return fPort; }virtual TGridResult *Query(const char *query) = 0;static TGrid *Connect(const char *grid, const char *uid = 0,

const char *pw = 0);ClassDef(TGrid,0) // ABC defining interface to GRID services

};

Page 49: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Running PROOF Using AliEnRunning PROOF Using AliEn

TGrid *alien = TGrid::Connect(“alien://alien.cern.ch”, rdm);TGridResult *res; res = alien->Query(“lfn:///alice/simulation/2001-04/V0.6*.root“);TDSet *treeset = new TDSet("TTree", "AOD");treeset->Add(res);

gROOT->Proof();gProof->Process(treeset, “myselector.C”);

// plot/write objects produced in myselector.C. . .

Page 50: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

DataGridDataGrid & ROOT& ROOT

Root RDB

Selection parameters TAG DB

selected events Grid RBLFN

#hits

Grid MDS

Grid perf log

Grid perf mon

Grid replica catalog

Grid cost evaluator

best places

Spawn PROOF tasks

Grid net ws

Grid replica manager

Grid autenticate

PROOF loop

Grid replica catalog

Update Root RDB

output LFNs

Send results back

Grid log & monitor

Page 51: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

First First StepsSteps::

üü Use Use GSI to GSI to provide provide a a single signsingle sign--onon viavia ""gridgrid--idid" " ((prototype ready for proofprototype ready for proof))

üü globusglobus--jobjob--runrun to start to start remote demonsremote demons§ we need jobmanager-fork or another way to launch interactive processes

üü TLdap TLdap to to access access MDSMDS§ API: OpenLDAP§ Information model: Globus, DataGrid

¨ Mapping of events to files¨ Mapping of files to replica¨ Ressource broker¨ ...

Page 52: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

MDS ApproachMDS Approach

§§ Based on Based on LDAPLDAP§ Lightweight Directory Access Protocol v3

(LDAPv3)§ Standard data model§ Standard query protocol

§§ Globus specific schemaGlobus specific schema§ Host-centric representation

§§ Globus specific toolsGlobus specific tools§ GRIS, GIIS§ Data discovery, publication,…

SNMP

GRIS

NIS

NWS

LDAP

LDAP API

Middleware

Application

GIIS…

Page 53: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

OverviewOverview

§§ LHC computing challengeLHC computing challenge§ MONARC, EU DG, LCG

§§ The Grid metaphorThe Grid metaphor§ Globus

§§ Alice Grid ActivitiesAlice Grid Activities§ The Alice Offline Project§ Remote job submission and monitoring§ AliEn§ Interface Root-Globus, Root-Alien, Root-...§ Distributed analysis with proof

èè Summary Summary

Page 54: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

The Grid from a Services ViewThe Grid from a Services View

:

:E.g.,

Applications

Resource-specific implementations of basic servicesE.g., Transport protocols, name servers, differentiated services, CPU schedulers, public keyinfrastructure, site accounting, directory service, OS bypass

Resource-independent and application-independent servicesauthentication, authorization, resource location, resource allocation, events, accounting,

remote data access, information, policy, fault detection

DistributedComputing

Toolkit

Grid Fabric(Resources)

Grid Services(Middleware)

ApplicationToolkits

Data-Intensive

ApplicationsToolkit

CollaborativeApplications

Toolkit

RemoteVisualizationApplications

Toolkit

ProblemSolving

ApplicationsToolkit

RemoteInstrumentation

ApplicationsToolkit

Applications Chemistry

Biology

Cosmology

High Energy Physics

Environment

Page 55: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

AliEnAliEn as a metaas a meta--GRIDGRID

AliEn User Interface

AliEn stackiVDGL stack EDG stack

Page 56: Alice Grid Activities - DESY · PDF fileAlice Grid Activities Peter Malzacher ... Infrastructure,” Morgan Kaufmann, 1999, ... Applications Chemistry Biology

Long Term Long Term UseUse CaseCase:(:(chaoticchaotic) Analysis) Analysis

Physicist

Rootscript

CERN

FNAL RALIN2P3

Lab aUni b Lab c

Uni n

α β γ

Rootsession

ResourceBroker

ReplicaLocationCatalog

EventCatalog

Physicist executes a root (c++) script to analyse eventsroot:•finds the name of the file(s) with the events •locates the file (and replica)•decides based on

network speed and load, CPUs/disks available and file usage pattern, whether -to fetch the file -to use only parts (via rootd)or -to execute the script at the remote side (via proof).

Physicist executes a root (c++) script to analyse eventsroot:•finds the name of the file(s) with the events •locates the file (and replica)•decides based on

network speed and load, CPUs/disks available and file usage pattern, whether -to fetch the file -to use only parts (via rootd)or -to execute the script at the remote side (via proof).