MC Production System & DM catalogue
A. Fanfani, University of Bologna
CMS Italia – Napoli, 14 Feb 2007
Production System Overview
Components:
• ProdRequest
  • User interface to create requests
• ProdMgr
  • Manages requests
  • Allocates work to a ProdAgent (PA) when the PA requests it
  • Tracks the global completion of the task
• ProdAgent
  • Asks for work
  • Converts work into processing jobs
  • Creates, submits and tracks jobs
  • Manages merging, failures, resubmission, local cataloguing, etc.

Diagram flow: User Request → ProdRequest → ProdMgr; each ProdAgent gets work from ProdMgr and reports progress back; the ProdAgents send jobs to the resources (LCG/EGEE resource, OSG resource, other resource).

Status:
• ProdRequest, ProdMgr: under heavy development; the basic chain PR → PM → PA works
• ProdAgent: in production since the summer
• Aim at automating as much as possible, with easy maintenance
• Various Grid/batch middleware to support
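The pull model described on this slide (a PA asks for work, ProdMgr allocates it and tracks global completion) can be illustrated with a small sketch. The class and method names (`ProdMgrSketch`, `get_work`, `report_progress`, `split_into_jobs`) are hypothetical and are not the real ProdMgr or ProdAgent API; counting work in events is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """A production request, as created through a ProdRequest-like front end."""
    name: str
    total_events: int
    allocated: int = 0   # events handed out to agents so far
    completed: int = 0   # events reported as done by agents


@dataclass
class ProdMgrSketch:
    """Hypothetical stand-in for ProdMgr: allocates work only when asked."""
    requests: dict = field(default_factory=dict)

    def add_request(self, name, total_events):
        self.requests[name] = Request(name, total_events)

    def get_work(self, name, max_events):
        """Called by an agent: hand out up to max_events not yet allocated."""
        req = self.requests[name]
        n = min(max_events, req.total_events - req.allocated)
        req.allocated += n
        return n

    def report_progress(self, name, events_done):
        """Called by an agent: record finished events for global tracking."""
        self.requests[name].completed += events_done

    def completion(self, name):
        """Fraction of the request that has been reported as done."""
        req = self.requests[name]
        return req.completed / req.total_events


def split_into_jobs(n_events, events_per_job):
    """Agent side: convert an allocation into per-job event counts."""
    full, rest = divmod(n_events, events_per_job)
    return [events_per_job] * full + ([rest] if rest else [])
```

Because the manager only hands out work on request, several agents can drain the same request without being assigned more than they asked for.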
ProdAgent Processing Workflow
• Processing jobs sent to sites
• Output data left in local SE
• Report back to ProdAgent
• Data management cataloguing (registration in local DBS/DLS)
• Failed jobs handled automatically

Diagram: ProdAgent → Grid WMS → processing at Tier-1 and Tier-2 sites; each processing job leaves a small output file in the site SE, which is registered in the local DBS/DLS.
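As an illustration of the processing loop (run a job, register its output locally, resubmit failures automatically), here is a minimal sketch. The retry limit, the `run_job` callable and the dictionary standing in for the local DBS/DLS are all assumptions made for the example; the slides do not describe the actual resubmission policy.

```python
def run_processing(job_ids, run_job, max_retries=3):
    """Run each job, resubmitting failures up to max_retries attempts.

    run_job(job_id, attempt) is a hypothetical callable that returns the
    output file name on success or None on failure.
    """
    catalogue = {}   # job_id -> output file (stand-in for local DBS/DLS)
    failed = []      # jobs that exhausted their attempts
    for job_id in job_ids:
        for attempt in range(1, max_retries + 1):
            output = run_job(job_id, attempt)
            if output is not None:
                catalogue[job_id] = output   # register output locally
                break
        else:
            # No attempt succeeded: leave the job for operator attention.
            failed.append(job_id)
    return catalogue, failed
```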
ProdAgent Merge Workflow
• Merge data at site
  • Watch DBS/DLS for produced unmerged data
  • Send merge jobs to the sites hosting the data
• Transfer data
  • PhEDEx injection
  • PhEDEx transfer invoked by PA

Diagram: ProdAgent → Grid WMS → merging at Tier-1 and Tier-2 sites; each merge job leaves a large output file in the site SE; local DBS/DLS; PhEDEx.
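The "watch for unmerged data, send merge jobs to the hosting sites" step can be sketched as a grouping problem. The size threshold rule and the function name `plan_merge_jobs` are assumptions for illustration; the slides do not say how the real merge sensor decides when enough data has accumulated.

```python
from collections import defaultdict


def plan_merge_jobs(unmerged_files, min_merge_size):
    """Group unmerged files by hosting site into merge jobs.

    unmerged_files: iterable of (file_name, site, size) tuples.
    Returns (jobs, leftovers): jobs is a list of (site, [file names]);
    leftovers maps site -> files still waiting for more data.
    """
    by_site = defaultdict(list)
    for name, site, size in unmerged_files:
        by_site[site].append((name, size))

    jobs, leftovers = [], {}
    for site, files in by_site.items():
        batch, batch_size = [], 0
        for name, size in files:
            batch.append(name)
            batch_size += size
            if batch_size >= min_merge_size:
                # Enough data at this site: the merge job runs where the data is.
                jobs.append((site, batch))
                batch, batch_size = [], 0
        if batch:
            leftovers[site] = batch
    return jobs, leftovers
```

Keeping each merge job within a single site matches the slide's point that merge jobs are sent to the sites hosting the data, so no transfer is needed before merging.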
ProdAgent Architecture
• Core
  • MySQL database
  • Python API
  • Core services
• Work split into atomic Python components
• Asynchronous publish/subscribe model for inter-component communications
• Simple API to communicate between components
  • Easy to add new functionality and build on existing features
• Persistent state recorded in DB

Diagram: ProdAgent core
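A database-backed publish/subscribe bus of the kind outlined above might look like the following sketch. It uses `sqlite3` from the Python standard library as a stand-in for MySQL so that it runs without a server; the table layout, the `MessageBus` class and the component and event names in the usage example are hypothetical, not the actual ProdAgent schema or API.

```python
import sqlite3


class MessageBus:
    """Minimal publish/subscribe bus with messages persisted in a database."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS subscriptions ("
                     "component TEXT, event TEXT, UNIQUE(component, event))")
        conn.execute("CREATE TABLE IF NOT EXISTS messages ("
                     "id INTEGER PRIMARY KEY AUTOINCREMENT, "
                     "component TEXT, event TEXT, payload TEXT)")

    def subscribe(self, component, event):
        """Register a component's interest in an event type."""
        self.conn.execute("INSERT OR IGNORE INTO subscriptions VALUES (?, ?)",
                          (component, event))
        self.conn.commit()

    def publish(self, event, payload):
        """Queue one copy of the message for every subscribed component."""
        self.conn.execute(
            "INSERT INTO messages (component, event, payload) "
            "SELECT component, ?, ? FROM subscriptions WHERE event = ?",
            (event, payload, event))
        self.conn.commit()

    def get(self, component):
        """Return and delete the oldest pending message, or None."""
        row = self.conn.execute(
            "SELECT id, event, payload FROM messages "
            "WHERE component = ? ORDER BY id LIMIT 1", (component,)).fetchone()
        if row is None:
            return None
        self.conn.execute("DELETE FROM messages WHERE id = ?", (row[0],))
        self.conn.commit()
        return row[1], row[2]


# Usage: components only share event names, never call each other directly.
bus = MessageBus(sqlite3.connect(":memory:"))
bus.subscribe("JobSubmitter", "SubmitJob")
bus.publish("SubmitJob", "job-1")
message = bus.get("JobSubmitter")
```

Since pending messages live in a table, a component that restarts picks up where it left off (with a file or server database rather than the in-memory one used here), which is one way to read "persistent state recorded in DB".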
Production Agent components
Diagram: ProdAgent core surrounded by its components.
• Components: ProdMgr interface, Job Queue, Resource Monitor, Job Creator, Job Submitter, Job Tracking, Job Cleanup, Error handler, Merge Sensor, Merge Accountant, DBS/DLS interface, PhEDEx interface
• External systems: Local DBS, Local DLS, PhEDEx, BOSS DB, BOSS submit, LCG RB/gLite WMS
• Arrow labels: retrieve work, workflow, watches (twice)
Monitoring
• Status of each component
• Overview of current job status (example shown: mc-physval-120-ZToMuMu-StartUpLumiPU)
• PA-level monitoring for operators (developed by Bari team + Carlos)
Prod system Status & Plan
• ProdAgent implemented and deployed operationally since summer
  • CSA06 pre-production of 66M events + organized skimming run at Tier-1s
  • PhysVal + HLT samples with CMSSW12x, see Nicola’s talk
• Focus is now on automation, to reduce manual work for operators
  • i.e. automatic block management and PhEDEx injections
• and on performance, to make it more scalable and more robust
  • Bulk creation & bulk submission with gLite (with LCG RB: 2000 jobs/day per PA)
• Deployment of ProdRequest/ProdMgr/ProdAgent system
  • The production teams will no longer have to inject workflows taking them from Twiki pages
  • The production coordinator will assign work to teams with given priority via ProdManager
• DBS-2 integration
• Alpgen integration
• True collaborative development effort:
  • Dave Evans, Frank Van Lingen, Giulio Eulisse (US)
  • Carlos Kavka, Alessandra Fanfani, William Bacchi, Giuseppe Codispoti, contribution from Bari team (IT)
Data Management catalogues: DBS
• The Dataset Bookkeeping System (DBS) provides the means to define, discover and use CMS event data
• First version deployed for CSA06, including data discovery browser
• Development of the 2nd generation (DBS-2)
  • Prepare the system for describing real data
    • Added info such as run, luminosity sections, primary dataset description
  • Preliminary support for Analysis datasets
    • A subset of a Processed Dataset representing a coherent sample for physics analysis
  • More functionality for browsing and data discovery
  • Deployable with Oracle at CERN for the Global DBS
  • Deployable with MySQL too, to be used as a “local scope” DBS
  • Under integration with CRAB, ProdAgent, PhEDEx, MTCC data
Data Management catalogues: DLS
• The Data Location Service (DLS) provides the means to locate replicas of data in the distributed computing system
  • The DBS knows how datasets are organized in terms of file-blocks
  • The DLS maps file-blocks to storage elements (SEs)
• DLS based on LCG LFC used for CSA06:
  • Some drawbacks (performance issues for reverse lookup, i.e. data discovery)
  • Some advantages (production service maintained by LCG, VOMS authentication-authorization, DLI)
  • No server-side work needed by CMS
  • Has served us well so far
• Evaluating whether to host DLS on the same server as DBS
  • Decide based on CMS use case
  • Still keeping DLS API functionalities
  • Add support for the Resource Broker to talk to it directly (via DLI)
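The file-block to storage-element mapping, and why reverse lookup is a separate concern, can be shown with a toy in-memory index. This is only a sketch of the data structure; `LocationIndex` and its methods are hypothetical names and do not reflect the DLS API or the LFC back end.

```python
from collections import defaultdict


class LocationIndex:
    """Toy file-block -> storage-element map with a reverse index."""

    def __init__(self):
        self.block_to_ses = defaultdict(set)
        self.se_to_blocks = defaultdict(set)   # maintained for reverse lookup

    def add_replica(self, block, se):
        """Record that a replica of the file-block is hosted at the SE."""
        self.block_to_ses[block].add(se)
        self.se_to_blocks[se].add(block)

    def locate(self, block):
        """Forward lookup: which SEs host this file-block?"""
        return sorted(self.block_to_ses.get(block, ()))

    def blocks_at(self, se):
        """Reverse lookup: which file-blocks does this SE host?"""
        return sorted(self.se_to_blocks.get(se, ()))
```

Without the second dictionary, `blocks_at` would have to scan every file-block, which illustrates the kind of cost a catalogue keyed only by block pays for reverse lookup.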