GS-EIS: Experiment Integration Support section

Page 1: GS-EIS: Experiment Integration Support section

22 February 2008 GS Group Meeting - EIS section

GS-EIS: Experiment Integration Support section

Five staff:
• Harry Renshall, Section Leader
• Simone Campana, ATLAS support
• Roberto Santinelli, LHCb support
• Andrea Sciaba, CMS support
• Patricia Mendez, ALICE support

Four INFN funded CERN fellows:
• Alessandro di Girolamo, ATLAS
• Elisa Lanciotti, LHCb
• Nicolo Magini, CMS and ALICE
• Vincenzo Miccio, CMS

One ASGC funded visitor:
• Gang Qin, ATLAS

In the future we would like to broaden the associations with single experiments where possible e.g. by leveraging common solutions or having limited duration task forces on a particular experiment problem area.

Page 2: GS-EIS: Experiment Integration Support section

• H. Renshall:
  • member of WLCG team preparing for LHC startup (CCRC'08) then production running
  • deputy group leader - attend/contribute to departmental management activities
  • section leader (light administrative load)
  • scientific secretary of the LHCC Computing Resources Review Board and also of the Computing Resource Scrutiny Group
  • IT link person to the LHCb experiment

Page 3: GS-EIS: Experiment Integration Support section

Simone Campana:

Page 4: GS-EIS: Experiment Integration Support section

• Experiment Integration Support for ATLAS
  ✓ Liaison with WLCG and EGEE
• In ATLAS organization: Facility Coordinator
  ✓ Coordinate Grid middleware and ATLAS distributed software deployment/updates/upgrades
  ✓ Primary contact for Tier facility managers
• Organize, plan and coordinate ATLAS-wide tests
  ✓ Tier-0 throughput, DDM Functional Test, CCRC08
  ✓ Includes scripting/development of tools, debugging, testing, follow-up…
    ➡ This is the most time-consuming activity
  ✓ This activity is carried out in close collaboration with Birger and Stephane Jerzequel from ATLAS
    ➡ A lot of overlap, but also different scopes
• Follow-up of Alessandro's activity on monitoring
  ✓ Only a little effort now from my side (he is now independent)

Page 5: GS-EIS: Experiment Integration Support section

Patricia Mendez:

Page 6: GS-EIS: Experiment Integration Support section

EIS Commitments

• WLCG Support to the ALICE Experiment
  – Maintenance and support of ALICE VOBOXES together with the AliEn software distribution
  – Implementation of gLite middleware within the ALICE WMS
  – Establishment of site/services contacts with the experiment
  – SAM implementation
  – ALICE FDR setup and planning

Page 7: GS-EIS: Experiment Integration Support section

EIS Commitments

• Support to communities beyond HEP
  – UNOSAT, Geant4, generic applications (theoretical physics, ITU, Garfield, HARP, QCD...)
  – Creation and setup of VOs: gear, UNOSAT, geant4
  – Resources research and setup
    • Depending on each application's requirements
  – Site/application contact
  – Gridification of the applications and merge onto the Grid environment

Page 8: GS-EIS: Experiment Integration Support section

EIS Commitments

• CCRC'08 exercises and services
  – Follow-up of the ALICE participation as WLCG contact person
  – Additional tasks such as SAM implementation for the experiments, VOBOXES setup, etc.
• EGI proposal
  – Application support working group
• EGEE-III

Page 9: GS-EIS: Experiment Integration Support section

Roberto Santinelli:

Page 10: GS-EIS: Experiment Integration Support section

Supporting LHCb: setting up LFC distributed service

• The goal
  – A redundant and reliable file catalogue service for LHCb based on LFC
  – A system that best matches the LHCb use cases
• Implementation
  – A master LFC at CERN and mirrored replicas at Tier-1 sites using Oracle Streams
• Several technical aspects to consider
  – Coherence of data and access control
  – Latency in the propagation of updates
• VO support team contributed to the project
  – Definition of the solution and “acceleration” of all steps in the software lifecycle (whenever this was possible)
  – Functionality and stress tests
  – Readiness of site implementation

The distributed LHCb file catalogue was deployed in time for the currently ongoing combined computing challenge (CCRC’08)

Page 11: GS-EIS: Experiment Integration Support section

Supporting LHCb: site readiness for CCRC and beyond

Not only monitoring resources and services (and writing custom tools for that), but also:
• Working with sites and the WLCG service to fix the problems that arise
• Negotiating resources
• Channeling problems to/from the VO

Service classes disk space monitoring charts

FTS matrix channels between all T1’s SRMv2

Page 12: GS-EIS: Experiment Integration Support section

Supporting LHCb: SAM tests

• LHCb uses the SAM framework to:
  – Check the availability of Computing Elements
    • Queues, WN hardware and software
  – Detect Operating System and architecture
  – Manage the deployment of LHCb software
    • Install (or remove) and publish appropriate software versions
    • Run test simulation, reconstruction, analysis
• LHCb SAM jobs run with high priority with software manager credentials
• LHCb sensors integrated in DIRAC infrastructure
  – When a pilot job arrives on a WN the test suite is executed
• Results are published in SAMDB

Nicolo' Magini - Third EGEE User Forum

Page 13: GS-EIS: Experiment Integration Support section

Andrea Sciaba:

Page 14: GS-EIS: Experiment Integration Support section

• CMS contact in EIS
  – “Grid expert” in CMS
    • Giving advice, solving problems reported by CMS users and developers
  – Site commissioning
    • Responsible for SAM in CMS
      – Managing SAM test submission
      – Interface with SAM and Dashboard developers
      – Development of CMS SAM tests
    • Debug site problems, mainly those exposed by SAM tests
  – VO management
    • CMS VO manager, processing registration requests and solving VOMRS/VOMS issues
    • Interface with VOMRS/VOMS developers
• Middleware testing
  – gLite WMS, CREAM, job priorities
• EGEE TCG
  – “alternate” CMS representative
  – Giving input on middleware-related issues and future developments
• OSG/EGEE Interoperability working group
  – Representing CMS
• Training and documentation
  – Editor of the gLite 3 User Guide
  – Giving tutorials

Page 15: GS-EIS: Experiment Integration Support section

Nicolo Magini

Page 16: GS-EIS: Experiment Integration Support section

Enabling Grids for E-sciencE

EGEE-II INFSO-RI-031688

SAM tests for SRMv2

• Start from higher-level functionality: lcg-util tests
  – SRMv2-get-SURLs
    • For ops/dteam: get path from BDII and corresponding space tokens
    • For VOs: replace with VO-specific plugins. Developed TFC test for CMS
  – SRMv2-lcg-cp
    • Copy a file to the SRMv2, copy it back
  – SRMv2-lcg-cr
    • Copy a file to the SRMv2 and register in LFC File Catalog
  – SRMv2-lcg-gt
    • Get a TransferURL with supported protocols
  – SRMv2-lcg-gt-rm-gt
    • Verify ability to correctly remove a file from SRMv2
  – SRMv2-lcg-ls-dir
    • List a directory on SRMv2
  – SRMv2-lcg-ls
    • List a file on SRMv2
• Other lower-level SRMv2 functionality will be added
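
The SRMv2-lcg-cp test listed above follows a copy-out, copy-back pattern. A minimal Python sketch of that pattern is given here, assuming a local directory stands in for the SRMv2 endpoint and `shutil.copyfile` stands in for the real transfer client. The names `round_trip_ok` and `demo`, the file names and the checksum comparison are illustrative assumptions and are not taken from the SAM test suite.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def round_trip_ok(source: Path, remote_dir: Path, work_dir: Path) -> bool:
    """Copy `source` to `remote_dir`, copy it back, and compare checksums.

    `remote_dir` is a local stand-in for the storage endpoint (assumption).
    """
    remote_copy = remote_dir / source.name
    returned_copy = work_dir / (source.name + ".back")
    shutil.copyfile(source, remote_copy)         # "copy a file to the SRMv2"
    shutil.copyfile(remote_copy, returned_copy)  # "copy it back"
    return sha256(source) == sha256(returned_copy)


def demo() -> bool:
    """Run the round trip on a small temporary file."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        remote = base / "remote"
        remote.mkdir()
        src = base / "testfile.txt"
        src.write_text("SAM test payload\n")
        return round_trip_ok(src, remote, base)


if __name__ == "__main__":
    print("round trip OK" if demo() else "round trip FAILED")
```

A real test would replace the two `copyfile` calls with the site's transfer client and report the result to the monitoring database; the pass/fail logic would stay the same.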

Page 17: GS-EIS: Experiment Integration Support section

VO support activity for CMS: DDT

• DebuggingDataTransfers
  – During CMS CSA07 period:
    • Define a metric and procedure to commission data transfer links between CMS Tiers
      – ~ 4 MB/s sustained for 5 days
    • Provide documentation and support on transfer debugging
      – FTS, SRM operations within the CMS PhEDEx middleware
    • ~ 250 links commissioned in 2007
  – Current efforts (CCRC08 and beyond):
    • Scale up the rates to the requirements for the data taking period
      – 20 MB/s over 24h
    • Ongoing support for transfer debugging
      – FTS, SRMv2 etc.
    • Current global traffic in PhEDEx counting DDT + CCRC transfers is approaching the 2008 requirements for CMS
      – 20 Gbps
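
To put the two commissioning thresholds side by side, the short Python sketch below converts each rate and duration into a data volume per link. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), which the slide does not specify; the helper name `volume_tb` is illustrative.

```python
SECONDS_PER_HOUR = 3600
BYTES_PER_MB = 10**6   # assumption: decimal megabytes
BYTES_PER_TB = 10**12  # assumption: decimal terabytes


def volume_tb(rate_mb_per_s: float, hours: float) -> float:
    """Data volume in TB moved at a sustained rate over a number of hours."""
    total_bytes = rate_mb_per_s * BYTES_PER_MB * hours * SECONDS_PER_HOUR
    return total_bytes / BYTES_PER_TB


# CSA07 metric: ~4 MB/s sustained for 5 days
csa07_tb = volume_tb(4, 5 * 24)
# CCRC08 target: 20 MB/s over 24 hours
ccrc08_tb = volume_tb(20, 24)

if __name__ == "__main__":
    print(f"CSA07 metric : {csa07_tb:.3f} TB per link")
    print(f"CCRC08 target: {ccrc08_tb:.3f} TB per link")
```

Under these unit assumptions both thresholds correspond to roughly 1.7 TB per link; the newer target asks for that volume in one day instead of five.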

Page 18: GS-EIS: Experiment Integration Support section

Gang Qin:

Page 19: GS-EIS: Experiment Integration Support section

Storage Space Monitor
For T0, T1s & T2s

• Reliable space-info:
  – of different storage classes (e.g. Atlas:custodial:nearline, Atlas:replica:online, etc.)
  – of different time periods
    • Last 24 hours
    • Last month
    • Last year
• Current status:
  – cronjobs running to fetch daily data
  – still a lot of inconsistency, since lots of things are changing (SRMv2 space tokens)
• To Do:
  – implement SRM2 function to have space info for each space token

Functions for different storage types
• Cross check between BDII (ldap query) & local command
  – DPM: dpm-qryconf
  – CASTOR: stager_qry on site VOBOX
  – dCache: no local query command, ldap query (BDII)
  – StoRM: no local query command, info from site admin
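
The cross check between the BDII value and the local command could look like the Python sketch below. It is only an illustration: the 5% tolerance, the function names and the idea of comparing free space in GB are assumptions, and the sketch does not run or parse any of the commands named above. The table of local sources simply restates the list on the slide.

```python
from typing import Dict, Optional

# Local query source per storage type, as listed on the slide.
# None means no local query command is available.
LOCAL_SOURCE: Dict[str, Optional[str]] = {
    "DPM": "dpm-qryconf",
    "CASTOR": "stager_qry",
    "dCache": None,
    "StoRM": None,
}


def can_cross_check(storage_type: str) -> bool:
    """True when a local command exists to compare against the BDII value."""
    return LOCAL_SOURCE.get(storage_type) is not None


def cross_check(bdii_free_gb: float, local_free_gb: float,
                tolerance: float = 0.05) -> bool:
    """True when both sources agree within a relative tolerance (assumed 5%)."""
    reference = max(abs(bdii_free_gb), abs(local_free_gb))
    if reference == 0:
        return True  # both report zero free space
    return abs(bdii_free_gb - local_free_gb) / reference <= tolerance


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(can_cross_check("DPM"), cross_check(100.0, 98.0))
    print(can_cross_check("dCache"), cross_check(100.0, 80.0))
```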

Page 20: GS-EIS: Experiment Integration Support section

Lumber: Lemon Sensor
• Monitor the status of user-specified processes
• Process status
  – ‘0’: everything is OK
  – ‘1’: process is not running (temporarily, a restart is tried)
  – ‘2’: process is closed (i.e. by expert working on the system)
  – ‘3’: process restart failed, ALARM mail sent
• Now in production and running on ATLAS VOBOXes
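
The four status codes can be read as a small decision table. The Python sketch below is one possible reading, not the Lumber implementation: the enum member names, the three boolean inputs and the order in which they are checked are all assumptions made for illustration.

```python
from enum import IntEnum


class ProcessStatus(IntEnum):
    """Status codes as listed on the slide; member names are illustrative."""
    OK = 0               # everything is OK
    NOT_RUNNING = 1      # not running, a restart is being tried
    CLOSED = 2           # deliberately stopped, e.g. by an expert
    RESTART_FAILED = 3   # restart failed, alarm mail sent


def evaluate(running: bool, closed_by_expert: bool,
             restart_failed: bool) -> ProcessStatus:
    """Map observed process state to a status code (assumed precedence)."""
    if running:
        return ProcessStatus.OK
    if closed_by_expert:
        return ProcessStatus.CLOSED
    if restart_failed:
        return ProcessStatus.RESTART_FAILED
    return ProcessStatus.NOT_RUNNING


def needs_alarm(status: ProcessStatus) -> bool:
    """Only a failed restart triggers an alarm mail."""
    return status is ProcessStatus.RESTART_FAILED


if __name__ == "__main__":
    status = evaluate(running=False, closed_by_expert=False, restart_failed=True)
    print(int(status), needs_alarm(status))
```

Using an `IntEnum` keeps the numeric values from the slide available to anything that expects a plain integer metric.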

Page 21: GS-EIS: Experiment Integration Support section

Alessandro di Girolamo

Page 22: GS-EIS: Experiment Integration Support section

ATLAS specific tests integration in the Service Availability Monitor framework

Storage & Computing Elements endpoints definition: intersection between GOCDB and TiersOfATLAS (ATLAS specific sites configuration file with Cloud Model)
• Different services and endpoints might need to be tested using different VOMS credentials
• ATLAS endpoints and paths must be explicitly tested
• The LFC of the Cloud (residing in the T1) is used

• Monitor the availability of ATLAS critical Site Services
• Verify the correct installation and the proper functioning of the ATLAS software on each site
• SE:
  – Put, Get and Del for each SRM endpoint
• CE:
  – GangaRobot on each site: execute a real analysis job based on a MC dataset
• A large part of the OPS suite also keeps running
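
The endpoint definition described above is a set intersection: only endpoints known to both GOCDB and TiersOfATLAS are tested. A minimal Python sketch follows, assuming each source has already been reduced to a collection of endpoint host names. The function name and the example host names are hypothetical, and no real GOCDB or TiersOfATLAS interface is used.

```python
from typing import Iterable, List


def endpoints_to_test(gocdb_endpoints: Iterable[str],
                      toa_endpoints: Iterable[str]) -> List[str]:
    """Return the endpoints present in both sources, sorted for stable output."""
    return sorted(set(gocdb_endpoints) & set(toa_endpoints))


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    gocdb = ["srm.site-a.example", "srm.site-b.example", "ce.site-c.example"]
    toa = ["srm.site-b.example", "ce.site-c.example", "srm.site-d.example"]
    for endpoint in endpoints_to_test(gocdb, toa):
        print(endpoint)
```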

Page 23: GS-EIS: Experiment Integration Support section

Tiers of ATLAS integration within the Grid

Lumber: a Lemon sensor to monitor the status of critical processes (like DataManagement and monitoring) running on the ATLAS VOBOXes
• fully integrated into Lemon (Exceptions/Alarms)
• availability output possible on Service Level Status (SLS)

The publication of the availability status of experiment-specific services into monitoring frameworks like Lemon and SLS is now in progress

Great effort in testing the Tiers (Tier0, 1 and 2) supporting ATLAS:
• commissioning of srm2 endpoints: installation, configuration and proper functioning
• verification of the middleware versions and client tools installed

Monitor of ATLAS specific critical services

Page 24: GS-EIS: Experiment Integration Support section

Enzo Miccio

Page 25: GS-EIS: Experiment Integration Support section

Page 26: GS-EIS: Experiment Integration Support section

Elisa Lanciotti

Page 27: GS-EIS: Experiment Integration Support section

Page 28: GS-EIS: Experiment Integration Support section
