PhEDEx Overview for CMS data operators

Natalia Ratnikova
Fermilab, WH1E
22nd November 2016



Introduction

•  Physics Experiment Data Export: central component of the CMS Data Management System, responsible for data location and placement
    –  created in 2004, has undergone significant evolution over time
    –  uses grid tools (FTS, SRM, etc.) to transfer files according to the CMS data placement policy
    –  manages CMS data at CMS grid sites: currently 150 nodes are registered in the CMS production instance
    –  uses a debug instance for load test transfers between the sites to ensure “links” quality
    –  provides tools for verifying data consistency, statistics and monitoring
    –  uses a clever routing algorithm, adjustable workload, and more

11/22/2016 | N. Ratnikova | PhEDEx overview for CMS data operators


CMS Data

•  Event data in files
    –  average file size reasonably large: ~2.5 GB
    –  output merged to help scaling in catalogs and storage
•  Files are grouped in file blocks to manage them in bulk
    –  ~10-1000 files/block
•  File blocks are grouped by physics content in datasets of variable size (0.1–100 TB)

~10^10 events/year, ~6x10^7 distinct files in 2016
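The file / block / dataset grouping can be pictured with a small data model. This is only an illustrative sketch, not PhEDEx code: the class names, the `/store/example/...` paths and the block naming are hypothetical, and the 2.5 GB file size is taken from the slide's average.

```python
from dataclasses import dataclass, field
from typing import List

GB = 10**9   # decimal units assumed
TB = 10**12


@dataclass
class File:
    lfn: str          # logical file name (hypothetical values below)
    size_bytes: int


@dataclass
class Block:
    name: str
    files: List[File] = field(default_factory=list)

    @property
    def size_bytes(self) -> int:
        # A block is handled in bulk, so its size is the sum of its files.
        return sum(f.size_bytes for f in self.files)


@dataclass
class Dataset:
    name: str
    blocks: List[Block] = field(default_factory=list)

    @property
    def n_files(self) -> int:
        return sum(len(b.files) for b in self.blocks)

    @property
    def size_bytes(self) -> int:
        return sum(b.size_bytes for b in self.blocks)


def make_example_dataset(n_blocks: int = 4, files_per_block: int = 100) -> Dataset:
    """Build a toy dataset of identical 2.5 GB files (made-up names)."""
    blocks = []
    for b in range(n_blocks):
        files = [
            File(f"/store/example/block{b}/file{i}.root", int(2.5 * GB))
            for i in range(files_per_block)
        ]
        blocks.append(Block(f"/Example/Dataset/TIER#block{b}", files))
    return Dataset("/Example/Dataset/TIER", blocks)


dataset = make_example_dataset()
print(dataset.n_files, "files,", dataset.size_bytes / TB, "TB")
```

With 4 blocks of 100 files each, the toy dataset lands inside the 0.1–100 TB range quoted above.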


PhEDEx components

•  Transfer Management Database at CERN (Oracle)
•  Site agents – set of Perl daemons running at every site, each performing a particular local data management task:
    –  file download, delete, stage from MSS, export for outbound transfer, verify
•  Central agents managing PhEDEx workflows and infrastructure
    –  transfer requests, data routing, bookkeeping, monitoring and other central activities
•  PhEDEx web site – set of interactive web applications to control and monitor the PhEDEx system
    –  new implementation uses a combination of Perl + JavaScript for more interactive features


PhEDEx workflow


PhEDEx transfer workflow

•  Central PhEDEx agents are middleware-agnostic
•  Site agents are integrated through plugins with WLCG DM middleware (e.g. FTS or SRM) to execute transfers
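The plugin idea can be sketched as a common interface with one implementation per middleware. This is a hypothetical Python illustration only (the real site agents are Perl daemons): the class names, the `BACKENDS` registry and the dry-run behaviour are assumptions, and no real FTS or SRM client is called.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple, Type

TransferPair = Tuple[str, str]  # (source PFN, destination PFN)


class TransferBackend(ABC):
    """Interface the (hypothetical) site agent codes against."""

    @abstractmethod
    def submit(self, pairs: List[TransferPair]) -> List[str]:
        """Submit transfers and return one status line per pair."""


class DryRunFTSBackend(TransferBackend):
    # Stand-in for an FTS plugin; only formats what it would submit.
    def submit(self, pairs: List[TransferPair]) -> List[str]:
        return [f"fts: {src} -> {dst}" for src, dst in pairs]


class DryRunSRMBackend(TransferBackend):
    # Stand-in for an SRM plugin; only formats what it would copy.
    def submit(self, pairs: List[TransferPair]) -> List[str]:
        return [f"srm: {src} -> {dst}" for src, dst in pairs]


# Registry keyed by the backend name a site would choose in its configuration.
BACKENDS: Dict[str, Type[TransferBackend]] = {
    "fts": DryRunFTSBackend,
    "srm": DryRunSRMBackend,
}


def run_transfers(backend_name: str, pairs: List[TransferPair]) -> List[str]:
    """The caller never needs to know which middleware is behind the name."""
    backend = BACKENDS[backend_name]()
    return backend.submit(pairs)


print(run_transfers("fts", [("srm://a.example.org/f1", "srm://b.example.org/f1")]))
```

Because the caller only depends on `TransferBackend`, swapping the middleware is a configuration change at the site and leaves the central logic untouched.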


PhEDEx building blocks

PhEDEx code has been refactored to provide generalized solutions and facilitate the implementation of new features:

•  Core agent framework
    –  provides a base Agent class and a set of modules for common functions: SQL statements, LFN to PFN conversions, etc.
•  Namespace framework
    –  provides an interface to various storage types (dCache, DPM, EOS, Castor, POSIX) to access file properties for consistency checks
•  Data service framework and the website
    –  implement the web site frontend and the APIs to access the PhEDEx database
•  LifeCycle agent framework
    –  simulates the full life cycle of the PhEDEx system, generating the workload; useful for performance and scalability tests, debugging and validation
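An LFN to PFN conversion, one of the common functions listed above, can be illustrated with ordered regular-expression rules. This is a minimal sketch under assumptions: the rule format, the `example.org` paths and the `srm://` prefix are hypothetical and do not reproduce any site's actual configuration.

```python
import re
from typing import List, Optional, Tuple

# (path-match regex, result template) pairs, tried in order.
# All values are made up for illustration.
RULES: List[Tuple[str, str]] = [
    (r"^/store/user/(.*)$", r"/pnfs/example.org/user/\1"),
    (r"^/store/(.*)$", r"/pnfs/example.org/data/\1"),
]


def lfn_to_pfn(
    lfn: str,
    rules: List[Tuple[str, str]] = RULES,
    prefix: str = "srm://se.example.org:8443",
) -> Optional[str]:
    """Map a logical file name to a physical one using the first matching rule."""
    for pattern, template in rules:
        match = re.match(pattern, lfn)
        if match:
            # expand() substitutes the captured groups into the template.
            return prefix + match.expand(template)
    return None  # no rule applies to this LFN


print(lfn_to_pfn("/store/user/alice/ntuple.root"))
print(lfn_to_pfn("/store/data/Run2016/file.root"))
```

Rule order matters here: the more specific `/store/user/` pattern has to come before the generic `/store/` one.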


Recent developments

•  Maintenance of the existing code
    –  port to new systems, external upgrades
•  Additional features for operational needs:
    –  integrate FTS 3 support
    –  automate file invalidation requests
    –  process consistency check results
•  Network-aware applications
•  Metrics for latency and popularity analytics
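Processing consistency check results could, for example, mean comparing what the catalog expects at a site with what a storage listing reports. The sketch below is an assumption about how such a comparison might look, not the PhEDEx implementation: the three categories, the function name and the sample file names are all hypothetical.

```python
from typing import Dict, List


def compare_catalog_to_storage(
    catalog: Dict[str, int], storage: Dict[str, int]
) -> Dict[str, List[str]]:
    """Compare {lfn: size_bytes} maps from the catalog and from a storage dump."""
    catalog_lfns = set(catalog)
    storage_lfns = set(storage)
    return {
        # In the catalog but not on storage: candidates for invalidation or re-transfer.
        "missing": sorted(catalog_lfns - storage_lfns),
        # On storage but unknown to the catalog: candidates for cleanup.
        "orphan": sorted(storage_lfns - catalog_lfns),
        # Present in both, but the sizes disagree.
        "size_mismatch": sorted(
            lfn for lfn in catalog_lfns & storage_lfns if catalog[lfn] != storage[lfn]
        ),
    }


# Made-up sample data.
catalog = {"/store/a.root": 100, "/store/b.root": 200, "/store/c.root": 300}
storage = {"/store/a.root": 100, "/store/b.root": 150, "/store/d.root": 400}

report = compare_catalog_to_storage(catalog, storage)
for category, lfns in report.items():
    print(category, lfns)
```

A list like `missing` is the kind of input an automated file invalidation request would need, which is why the two features appear together above.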


[Screenshot: general overview of the PhEDEx web page]


[Screenshot: file sizes breakdown and stats]

As of Jan 6, 2016:
Total files: 44 763 533
Total data size: 115.27 PB


Today’s numbers:
Total files: 64 363 757
Total data size: 165.27 PB
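As a quick cross-check of the two snapshots, the short calculation below derives the growth between them and the implied average file size. It assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes), which the slides do not state.

```python
PB = 10**15  # decimal petabyte, an assumption
GB = 10**9   # decimal gigabyte, an assumption

# Snapshot values copied from the slides.
files_jan, size_jan_pb = 44_763_533, 115.27
files_now, size_now_pb = 64_363_757, 165.27

added_files = files_now - files_jan
added_pb = size_now_pb - size_jan_pb

# Average file size implied by each snapshot, in GB.
avg_gb_jan = size_jan_pb * PB / files_jan / GB
avg_gb_now = size_now_pb * PB / files_now / GB

print(f"files added: {added_files:,}")
print(f"data added: {added_pb:.2f} PB")
print(f"relative growth in files: {added_files / files_jan:.1%}")
print(f"relative growth in data:  {added_pb / size_jan_pb:.1%}")
print(f"average file size: {avg_gb_jan:.2f} GB then, {avg_gb_now:.2f} GB now")
```

Both snapshots give an average slightly above 2.5 GB per file, in line with the ~2.5 GB figure on the CMS Data slide.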