MC Production System & DM catalogue
A. Fanfani, University of Bologna
CMS Italia – Napoli, 14 Feb 2007



Production System Overview

[Figure: a User Request enters ProdRequest, which feeds ProdMgr; several ProdAgents sit below ProdMgr, each sending jobs to a resource (LCG/EGEE, OSG, other). Arrow labels: User Request, Get Work, Report Progress, Jobs.]

ProdRequest
• User interface to create requests

ProdMgr
• Manages requests
• Allocates work to a ProdAgent (PA) when the PA requests it
• Tracks the global completion of the task

ProdAgent
• Asks for work
• Converts work into processing jobs
• Creates, submits and tracks jobs
• Manages merging, failures, resubmission, local cataloguing, etc.
• Various Grid/batch middleware to support

Status
• ProdRequest and ProdMgr: under heavy development; the basic chain PR → PM → PA works
• ProdAgent: in production since the summer
• Aim: automate as much as possible, with easy maintenance
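
The pull model described above (a PA asks for work, ProdMgr allocates it and tracks global completion) can be illustrated with a minimal sketch. It is not the ProdMgr or ProdAgent code: the class `ProdMgrSketch`, the function `run_agent`, and the parameters `slice_size` and `events_per_job` are hypothetical names chosen for the illustration, and job execution is assumed to always succeed.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """A production request for a total number of events (hypothetical model)."""
    name: str
    total_events: int
    allocated: int = 0   # events already handed out to agents
    completed: int = 0   # events reported back as done


@dataclass
class ProdMgrSketch:
    """Toy manager: hands out slices of work only when an agent asks for them."""
    requests: dict = field(default_factory=dict)

    def add_request(self, name, total_events):
        self.requests[name] = Request(name, total_events)

    def get_work(self, max_events):
        """Return (request name, n events) for the first request with work left, else None."""
        for req in self.requests.values():
            remaining = req.total_events - req.allocated
            if remaining > 0:
                n = min(max_events, remaining)
                req.allocated += n
                return req.name, n
        return None

    def report_progress(self, name, events_done):
        self.requests[name].completed += events_done

    def completion(self, name):
        """Fraction of the request reported as completed."""
        req = self.requests[name]
        return req.completed / req.total_events


def run_agent(mgr, events_per_job, slice_size):
    """Toy agent: pull slices, split each into jobs, report each slice as done.

    Returns the total number of jobs the agent would have created.
    """
    jobs_run = 0
    while (work := mgr.get_work(slice_size)) is not None:
        name, n_events = work
        jobs_run += -(-n_events // events_per_job)  # ceiling division
        mgr.report_progress(name, n_events)
    return jobs_run
```

The manager never pushes work: an agent that is busy or offline simply does not call `get_work`, so several agents can share one request without central scheduling.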


ProdAgent Processing Workflow

[Figure: ProdAgent sends processing jobs through the Grid WMS to Tier-1 and Tier-2 sites; each processing job leaves a small output file in the site's SE, and the output is registered in the local DBS/DLS.]

• Processing jobs are sent to sites
• Output data are left in the local SE
• Jobs report back to ProdAgent
• Data management: cataloguing (registration in local DBS/DLS)
• Failed jobs are handled automatically
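
A minimal sketch of the "failed jobs handled automatically" step, assuming a simple bounded-retry policy (the slides do not state the actual policy). `run_job` is a hypothetical callable standing in for Grid submission plus the job report, and the returned list stands in for registration in the local DBS/DLS.

```python
def run_with_resubmission(job_ids, run_job, max_attempts=3):
    """Run each job, resubmitting failures until max_attempts is reached.

    run_job(job_id, attempt) -> bool is a hypothetical stand-in for submitting
    the job and reading its report; True means the job succeeded.
    Returns (catalogued, abandoned) lists of job ids.
    """
    catalogued = []  # stand-in for registration in a local DBS/DLS
    abandoned = []   # jobs that failed on every attempt
    for job_id in job_ids:
        for attempt in range(1, max_attempts + 1):
            if run_job(job_id, attempt):
                catalogued.append(job_id)
                break
        else:
            # The inner loop finished without a success.
            abandoned.append(job_id)
    return catalogued, abandoned
```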


ProdAgent Merge Workflow

[Figure: ProdAgent sends merge jobs through the Grid WMS to Tier-1 and Tier-2 sites; each merge job writes a large output file to the site's SE, the output is registered in the local DBS/DLS, and a PhEDEx transfer is invoked by the PA.]

• Merge data at site
  • Watch DBS/DLS for produced unmerged data
  • Send merge jobs to the sites hosting the data
• Transfer data
  • PhEDEx injection
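
The merge step above (watch for unmerged data, send merge jobs to the sites hosting it) can be sketched as a grouping problem. This is an illustration under assumptions, not the Merge Sensor logic: the `(lfn, site, size)` record layout, the `target_size` threshold and the "close a job once the threshold is reached" rule are all hypothetical.

```python
from collections import defaultdict


def plan_merge_jobs(unmerged_files, target_size):
    """Group unmerged files by hosting site and pack them into merge jobs.

    unmerged_files: iterable of (lfn, site, size) tuples (hypothetical layout).
    A merge job is closed as soon as its accumulated size reaches target_size.
    Returns (jobs, pending): jobs is a list of (site, [lfn, ...]); pending is a
    list of (site, lfn) for files that do not yet fill a job.
    """
    by_site = defaultdict(list)
    for lfn, site, size in unmerged_files:
        by_site[site].append((lfn, size))

    jobs, pending = [], []
    for site in sorted(by_site):
        batch, batch_size = [], 0
        for lfn, size in by_site[site]:
            batch.append(lfn)
            batch_size += size
            if batch_size >= target_size:
                jobs.append((site, batch))
                batch, batch_size = [], 0
        # Leftover files wait for more unmerged data at the same site.
        pending.extend((site, lfn) for lfn in batch)
    return jobs, pending
```

Grouping by site first keeps every merge job local to the data, so no file has to move before it is merged; only the large merged output is handed to the transfer system.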


ProdAgent Architecture

• Core
  • MySQL database
  • Python API
  • Core services
• Work split into atomic Python components
• Asynchronous publish/subscribe model for inter-component communication
• Simple API to communicate between components: easy to add new functionality and build on existing features
• Persistent state recorded in the DB

[Figure: ProdAgent core]
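
A minimal sketch of an asynchronous publish/subscribe bus whose messages are persisted in a database, in the spirit of the architecture above. It is not the ProdAgent message service: the class `MessageBus`, the table layout, and the component and event names used with it are hypothetical, and `sqlite3` stands in for MySQL so that the example is self-contained.

```python
import sqlite3


class MessageBus:
    """Toy publish/subscribe bus that keeps undelivered messages in a DB table."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "event TEXT, payload TEXT, target TEXT, delivered INTEGER DEFAULT 0)"
        )
        # Subscriptions are kept in memory here for brevity; only messages persist.
        self.subscriptions = {}  # event name -> set of component names

    def subscribe(self, component, event):
        self.subscriptions.setdefault(event, set()).add(component)

    def publish(self, event, payload):
        """Store one row per subscriber and return without waiting for delivery."""
        for target in sorted(self.subscriptions.get(event, ())):
            self.conn.execute(
                "INSERT INTO messages (event, payload, target) VALUES (?, ?, ?)",
                (event, payload, target),
            )
        self.conn.commit()

    def poll(self, component):
        """Return undelivered (event, payload) pairs, oldest first, and mark them delivered."""
        rows = self.conn.execute(
            "SELECT id, event, payload FROM messages "
            "WHERE target = ? AND delivered = 0 ORDER BY id",
            (component,),
        ).fetchall()
        self.conn.executemany(
            "UPDATE messages SET delivered = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
        self.conn.commit()
        return [(event, payload) for _, event, payload in rows]
```

Because the publisher only writes rows, a component that is restarted can pick up its pending messages on the next `poll`, which is the practical benefit of recording state in the DB.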


Production Agent components

[Figure: components attached to the ProdAgent core: ProdMgr interface, Job Queue, Resource Monitor, Job Creator, Job Submitter, Job Tracking, Job Cleanup, Error Handler, Merge Sensor, Merge Accountant, DBS/DLS interface, PhEDEx interface. External systems shown: local DBS, local DLS, PhEDEx, BOSS DB, BOSS submit, LCG RB/gLite WMS. Arrow labels: "retrieve work", "workflow", "watches" (twice).]


Monitoring

• PA-level monitoring for operators (developed by the Bari team + Carlos)
• Status of each component
• Overview of the current job status (example shown: mc-physval-120-ZToMuMu-StartUpLumiPU)


Production System Status & Plan

• ProdAgent implemented and deployed operationally since the summer
  • CSA06 pre-production of 66M events + organized skimming run at Tier-1s
  • PhysVal + HLT samples with CMSSW_1_2_x, see Nicola's talk
• Focus is now on automation, to reduce manual work for operators
  • i.e. automatic block management and PhEDEx injections
• and on performance, to make the system more scalable and more robust
  • Bulk creation & bulk submission with gLite (with the LCG RB: 2000 jobs/day per PA)
• Deployment of the ProdRequest/ProdMgr/ProdAgent system
  • The production teams will no longer have to inject workflows taken from Twiki pages
  • The production coordinator will assign work to teams with a given priority via ProdMgr
• DBS-2 integration
• Alpgen integration
• True collaborative development effort:
  • Dave Evans, Frank Van Lingen, Giulio Eulisse (US)
  • Carlos Kavka, Alessandra Fanfani, William Bacchi, Giuseppe Codispoti, contribution from the Bari team (IT)


Data Management catalogues: DBS

• The Dataset Bookkeeping System (DBS) provides the means to define, discover and use CMS event data
• First version deployed for CSA06, including the data discovery browser
• Development of the 2nd generation (DBS-2)
  • Prepare the system for describing real data
  • Added info such as run, luminosity sections, primary dataset description
  • Preliminary support for Analysis datasets
    • A subset of a Processed Dataset representing a coherent sample for physics analysis
  • More functionality for browsing in data discovery
  • Deployable with Oracle at CERN for the Global DBS
  • Deployable with MySQL too, to be used as a "local scope" DBS
• Under integration with CRAB, ProdAgent, PhEDEx, MTCC data


Data Management catalogues: DLS

• The Data Location Service (DLS) provides the means to locate replicas of data in the distributed computing system
  • The DBS knows how datasets are organized in terms of file-blocks
  • The DLS maps file-blocks to storage elements (SEs)
• DLS based on LCG LFC used for CSA06:
  • some drawbacks (performance issues for reverse lookup, i.e. data discovery), some advantages (production service maintained by LCG, VOMS authentication-authorization, DLI)
  • No server-side work needed by CMS
  • Has served us well so far
• Evaluating whether to host the DLS on the same server as the DBS
  • Decide based on CMS use cases
  • Still keeping the DLS API functionality
  • Add support for the Resource Broker to talk to it directly (via DLI)
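
The division of labour between the two catalogues (DBS: dataset to file-blocks; DLS: file-block to SEs) can be sketched with plain dictionaries. This is an illustration only, not the DBS or DLS API: the function names and all dataset, block and SE names are made up. It also shows why reverse lookup is the expensive direction in a block-keyed mapping: it has to scan every entry.

```python
def locate_dataset(dataset, dbs, dls):
    """Forward lookup: dataset -> file-blocks (DBS) -> SEs holding each block (DLS)."""
    return {block: sorted(dls.get(block, ())) for block in dbs[dataset]}


def blocks_at_se(se, dls):
    """Reverse lookup: which file-blocks are hosted at a given SE.

    With a block-keyed mapping this scans every entry, unlike the forward lookup.
    """
    return sorted(block for block, ses in dls.items() if se in ses)


def ses_with_full_dataset(dataset, dbs, dls):
    """SEs that host every file-block of the dataset."""
    locations = [set(dls.get(block, ())) for block in dbs[dataset]]
    return sorted(set.intersection(*locations)) if locations else []


# Toy catalogue contents; every name below is hypothetical.
toy_dbs = {"/SampleDataset/Example/RECO": ["block-1", "block-2"]}
toy_dls = {
    "block-1": {"se.site-a.example", "se.site-b.example"},
    "block-2": {"se.site-a.example"},
    "block-9": {"se.site-b.example"},
}
```

A job scheduler would typically use the forward direction (where can this dataset run?), while data discovery asks the reverse question (what is at this site?), which is the case the slide flags as slow with the LFC-based DLS.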