ATLAS Databases: An Overview, Athena use of Geometry/Conditions DB, and Conditions Metadata Elizabeth Gallas - Oxford ATLAS-UK Distributed Computing Tutorial Edinburgh, UK – March 21-22, 2011


Page 1: ATLAS Databases: An Overview, Athena use of Geometry/Conditions DB, and Conditions Metadata Elizabeth Gallas - Oxford ATLAS-UK Distributed Computing Tutorial

ATLAS Databases: An Overview,
Athena use of Geometry/Conditions DB, and Conditions Metadata

Elizabeth Gallas - Oxford

ATLAS-UK Distributed Computing Tutorial
Edinburgh, UK – March 21-22, 2011

Page 2

Mar 2011 Elizabeth Gallas - Databases/COMA 2

Outline

  • Motivation: Databases
  • Overview of ATLAS Databases
  • Databases of Athena-based analysis interest
      • Geometry Database
      • Conditions Database
      • And how they are made accessible on the grid
  • COMA (Conditions Metadata)
      • Selected/derived Run/Lb-wise Conditions/Configuration in relational format
      • Data Periods in COMA
      • Other COMA reports
  • Summary and Conclusions

Page 3

Motivation: Database use in ATLAS

ATLAS “data” falls into 2 broad categories:

  • Event-wise data: stored in files (RAW, ESD, AOD, TAG …)
      • Know something about themselves but also have some ‘metadata’ pointers to the bigger picture
  • Non-event-wise data: stored in databases
      • Enable construction of the ‘bigger picture’
      • Important information needed at our fingertips
      • Usually by diverse clients

Database Management Systems (DBMS) provide:

  • persistent storage for large/small collections of data of varied complexity, in data structures that provide access flexibility
  • a powerful query language for data entry, modification and retrieval
  • transaction management: the appearance of isolation while providing multi-user simultaneous access

Page 4

Overview – Oracle usage in ATLAS

Oracle is used extensively at every stage of data taking, processing, and analysis. Some of the more common applications:

  • Configuration
      • PVSS – Detector Control System (DCS) Configuration & Monitoring
      • Trigger – Trigger Configuration (online and simulation data)
      • OKS – Configuration databases for the TDAQ
      • Geometry – Detector Description
  • File and Job management
      • T0 – Tier 0 processing
      • DQ2/DDM – distributed file and dataset management
      • Dashboard – monitor jobs and data movement on the ATLAS grid
      • PanDA – workload management: production & distributed analysis
  • Conditions data (non-event data for offline analysis)
      • Conditions Database
      • [POOL files in DDM (referenced from the Conditions DB)]
  • “Metadata” == data about data
      • AMI (ATLAS Metadata Interface) – Dataset metadata
      • COMA (COnditions MetadatA) – Configuration/Conditions metadata
      • TAGs (not an acronym) – Event-level metadata

Page 5

What does your Athena job need?

What does every Athena job need?

1. Data (Events)
2. Database (Geometry, Conditions)
3. Efficient I/O (sometimes across a network), CPU
4. (A Purpose and a) Place for Output

Next slides … more details about Geometry and Conditions:

  • What they contain
  • How Athena accesses them
  • How they are distributed for access on the grid
  • User interfaces, documentation, and help

Needs: 1. Food 2. Water 3. Love 4. Place for output

Page 6

Geometry Database

Relational DB: Primary Numbers for the ATLAS Detector Description

  • All data for building the GeoModel description in a single place
      • Primary numbers stored in Data Tables (leaf)
      • Organized by subsystem (branch)
  • Tagging (versioning) at various levels
      • Locked tags define a distinct detector description
      • And globally tagged/locked at higher levels
      • Associated with Software Releases
  • Evolution of Geometry tags is set up such that each new tag is compatible with older Releases

Location and Distribution:

  • Master copy: in Oracle server at CERN
  • Up to now: copy of entire database dumped into SQLite file
      • Delivered to sites using DB Release technology with each Software Release
  • Future … more diverse distribution model being tested (Frontier)
      • Update: (Vakho Tsulaia) in upcoming Software/Computing workshop
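
The multi-level tagging described above can be pictured with a toy model in Python. This is only a sketch under stated assumptions: it assumes a tag on a branch node selects one tag for each of its children, down to the leaf data tables. The node names, tag names, and the `resolve` helper are all made up for illustration and are not the real Geometry DB schema.

```python
# Toy model: each tag on a branch node names the tags of its children;
# a node/tag pair with no entry here is treated as a leaf (a data table version).
# All node and tag names below are hypothetical.
CHILD_TAGS = {
    ("ATLAS", "GEO-TOY-01"): {"InnerDetector": "ID-TOY-02", "Calorimeter": "CALO-TOY-01"},
    ("InnerDetector", "ID-TOY-02"): {"PixelLayers": "PixelLayers-03"},
    ("Calorimeter", "CALO-TOY-01"): {"TileModules": "TileModules-01"},
}


def resolve(node, tag):
    """Return {leaf table: leaf tag} selected by a single tag on a higher-level node."""
    children = CHILD_TAGS.get((node, tag))
    if children is None:  # nothing below this node: it is a leaf data table
        return {node: tag}
    leaves = {}
    for child, child_tag in children.items():
        leaves.update(resolve(child, child_tag))
    return leaves


print(resolve("ATLAS", "GEO-TOY-01"))
```

In this picture one global tag fixes every leaf table version at once, which is why a locked global tag can stand for a distinct detector description.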

Page 7

Geometry DB Browser

http://atlas.web.cern.ch/Atlas/GROUPS/OPERATIONS/dataBases/DDDB

Page 8

“Conditions”

“Conditions” is a general term for information that is not ‘event-wise’: it reflects the conditions or states of a system. Conditions are valid for an ‘interval of validity’ (IOV), ranging from very short to infinity.

IOVs can be expressed as a range in timestamps or in Run/LumiBlocks.

Any conditions data needed for offline processing and/or analysis must be stored in the ATLAS Conditions Database (aka COOL) or in its referenced POOL files (DDM).

[Figure: ATLAS Conditions Database, with labels ZDC, DCS, TDAQ, OKS, LHC, DQ]
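
The interval-of-validity idea can be made concrete with a minimal Python sketch. It rests on assumptions that are not stated on the slide: that a Run/LumiBlock pair is packed into one integer as `(run << 32) | lumiblock` so that it sorts like a timestamp, and that an interval is half-open, `[since, until)`. The `IOV` class, the `run_lb_key` helper, the run number, and the payload strings are hypothetical illustrations, not COOL's API or real data.

```python
from dataclasses import dataclass


def run_lb_key(run: int, lumiblock: int) -> int:
    """Pack (run, lumiblock) into one integer that sorts in run/LB order (assumed layout)."""
    return (run << 32) | lumiblock


INFINITY = (1 << 63) - 1  # stand-in for an open-ended ("to infinity") interval


@dataclass(frozen=True)
class IOV:
    since: int    # first key for which the payload is valid (inclusive)
    until: int    # first key for which it is no longer valid (exclusive)
    payload: str  # made-up payload, for illustration only

    def contains(self, key: int) -> bool:
        return self.since <= key < self.until


# A very short interval (one lumiblock) and an open-ended one, for a made-up run.
short = IOV(run_lb_key(100000, 10), run_lb_key(100000, 11), "state during one lumiblock")
open_ended = IOV(run_lb_key(100000, 0), INFINITY, "constants valid from this run onwards")

key = run_lb_key(100000, 10)
print(short.contains(key), open_ended.contains(key))
```

The same `contains` test works unchanged if `since` and `until` hold timestamps instead of packed Run/LumiBlock keys.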

Page 9

Conditions DB infrastructure in ATLAS

Relies on considerable infrastructure:

  • COOL, CORAL, Athena (developed by ATLAS and CERN IT): generic schema design which can store, accommodate, and deliver a large amount of data for a diverse set of subsystems
  • IOV (‘interval of validity’) DB in relational DB tables
  • Data organized into folders … foldersets
      • By schema (subdetector)
      • By instance (for real data and MC)
  • Stores data ‘inline’ but can have references to external POOL files (managed by DDM)
  • Athena / Conditions DB: data maps to transient C++ objects, which are accessible to Athena at run time through the Transient Store
  • COOL Tag (version): distinct sets of Conditions, making specific computations reproducible

Used at many stages of data taking and analysis: from online calibrations, alignment, monitoring, to offline … processing … more calibrations … further alignment … reprocessing … analysis … to luminosity and data quality
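
How folders, IOVs, and tags fit together can be sketched in a few lines of Python. This is a toy model, not the COOL schema or API: the `Folder` class, its `store`/`retrieve` methods, the folder name, and the tag names are hypothetical, and it assumes each stored entry stays valid until the next entry with the same tag begins.

```python
import bisect
from collections import defaultdict


class Folder:
    """Toy conditions folder: per tag, a sorted list of 'since' keys and their payloads."""

    def __init__(self, name):
        self.name = name
        self._sinces = defaultdict(list)    # tag -> sorted 'since' keys
        self._payloads = defaultdict(list)  # tag -> payloads, in the same order

    def store(self, tag, since, payload):
        # Keep 'since' keys sorted; an entry is valid until the next one starts.
        i = bisect.bisect_left(self._sinces[tag], since)
        self._sinces[tag].insert(i, since)
        self._payloads[tag].insert(i, payload)

    def retrieve(self, tag, key):
        # Find the last entry whose 'since' is <= key.
        i = bisect.bisect_right(self._sinces[tag], key) - 1
        if i < 0:
            raise KeyError(f"no conditions in {self.name} for tag {tag!r} at {key}")
        return self._payloads[tag][i]


folder = Folder("/TOY/Calib")            # made-up folder name
folder.store("calib-v1", 0, 1.00)
folder.store("calib-v1", 500, 1.02)
folder.store("calib-v2", 0, 1.01)        # a later version of the same conditions
print(folder.retrieve("calib-v1", 700), folder.retrieve("calib-v2", 700))
```

Because a lookup names the tag explicitly, rerunning a job with the same tag retrieves the same payloads even after newer versions are stored, which is the reproducibility point made above.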

Page 10

Conditions: User interfaces

  • Command line interface: https://twiki.cern.ch/twiki/bin/view/Atlas/AtlCoolConsole
  • Conditions TAG Browser: https://atlascoolbrowser.web.cern.ch/atlascoolbrowser

Page 11

Oracle Distribution of Conditions data

  • Oracle stores a huge amount of essential data ‘at our fingertips’
      • But ATLAS has many… many… many… fingers
      • May be looking for oldest to newest data
  • Conditions in Oracle: master copy at Tier-0, replicated to many Tier-1 sites
  • Running jobs at Oracle sites (direct access) performs well
  • But direct Oracle access on the grid from remote sites:
      • Even after tuning, direct access requires many back/forth network transactions, so the RTT (Round Trip Time) multiplies … SLOW
      • Cascade effect: jobs hold connections longer, which prevents new jobs from starting
  • Use alternative technologies, especially over WAN (Wide Area Network):
      • “caching” Conditions from Oracle when possible

[Simplified diagram: Online CondDB; Offline master CondDB; Tier-1 replicas; Tier-0 farm; computer centre vs. outside world; isolation / cut; calibration updates]
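
The "RTT multiplies" argument is simple arithmetic, sketched here in Python. The number of round trips per job and the two RTT values are assumptions chosen only to illustrate the scaling; they are not measurements from ATLAS.

```python
def db_wait_seconds(round_trips: int, rtt_ms: float) -> float:
    """Time a job spends waiting on the network alone, ignoring query execution time."""
    return round_trips * rtt_ms / 1000.0


ROUND_TRIPS = 5000  # assumed number of request/response exchanges per job

# Assumed RTTs: sub-millisecond inside a computer centre, ~150 ms across a WAN.
for site, rtt_ms in [("same computer centre", 0.5), ("remote site over WAN", 150.0)]:
    print(f"{site}: {db_wait_seconds(ROUND_TRIPS, rtt_ms):.1f} s")
```

With these assumed numbers the same job waits 2.5 s locally and 750 s remotely, and it holds its database connection for that whole time, which is where the cascade effect comes from.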

Page 12

Technologies for Conditions “caching”

  • “DB Release”: make a system of files containing all data ‘needed’
      • Used in reprocessing campaigns and for MC processing/analysis
      • Includes:
          • SQLite replicas: “mini” Conditions DB with specific Folders, IOV range, CoolTag (a ‘slice’: a small subset of all rows in the Oracle tables)
          • And associated POOL files and a PFC (file catalog)
  • “Frontier”: store results in a web cache
      • Developed by Fermilab (used by CDF, further refined for CMS)
      • Basic idea: Frontier / Squid servers located at/near the Oracle RAC
          • negotiate transactions between grid jobs and the Oracle DB
          • reduce the load on Oracle by caching results of repeated queries
          • reduce the latency observed connecting to Oracle over the WAN
      • Additional Squid servers at remote sites help even more
      • Used by default for user analysis jobs

Picture on next slide
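
The SQLite "slice" described above (specific folders, one tag, one IOV range) can be illustrated with Python's standard `sqlite3` module. This is a sketch under assumptions: the single-table schema, the `make_slice` helper, and the folder and tag names are invented for illustration, and an in-memory SQLite database stands in for the Oracle master copy. The real replicas are produced with ATLAS tools not shown here.

```python
import sqlite3

# Hypothetical one-table layout; the real Conditions DB schema is more elaborate.
SCHEMA = "CREATE TABLE iovs (folder TEXT, tag TEXT, since INTEGER, until INTEGER, payload TEXT)"


def make_slice(master, folders, tag, since, until):
    """Copy rows for the given folders and tag whose IOV overlaps [since, until)."""
    replica = sqlite3.connect(":memory:")  # a real replica would be a file shipped to sites
    replica.execute(SCHEMA)
    marks = ",".join("?" for _ in folders)
    rows = master.execute(
        "SELECT folder, tag, since, until, payload FROM iovs "
        f"WHERE folder IN ({marks}) AND tag = ? AND since < ? AND until > ?",
        (*folders, tag, until, since),
    ).fetchall()
    replica.executemany("INSERT INTO iovs VALUES (?, ?, ?, ?, ?)", rows)
    replica.commit()
    return replica


master = sqlite3.connect(":memory:")  # stand-in for the master copy
master.execute(SCHEMA)
master.executemany("INSERT INTO iovs VALUES (?, ?, ?, ?, ?)", [
    ("/TOY/Align", "v1", 0, 100, "a0"),
    ("/TOY/Align", "v1", 100, 200, "a1"),
    ("/TOY/Align", "v2", 0, 200, "b0"),
    ("/TOY/Noise", "v1", 0, 200, "n0"),
])

replica = make_slice(master, ["/TOY/Align"], "v1", since=120, until=180)
print(replica.execute("SELECT * FROM iovs").fetchall())
```

Only the one row matching folder, tag, and IOV range ends up in the replica, which is why a slice can be so much smaller than the full tables.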

Page 13

Conditions DB access via Frontier

Frontier for distributed database access

  • Used by default for user analysis jobs

Main components:

  • Frontier server
      • Communicates directly with Oracle server
      • Includes data caching
      • Provides data to Squids
  • Squid
      • Communicates with Frontier server over HTTP
      • Caches retrieved data locally for its clients

ATLAS:

  • Frontier in operation late in 2009
  • Frontier servers at T1 sites on replication
  • ~60 Squids all over the world
      • Mostly T2, some T3 too

[Figure labels: Tier 2, Tier 1]
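
The effect of layering Squid caches in front of a Frontier server can be sketched with a toy two-level cache in Python. The classes below are hypothetical stand-ins, not the Frontier or Squid software. The sketch assumes every job issues an identical query and that cached entries never expire, which real web caches do not guarantee.

```python
class Origin:
    """Stand-in for the Oracle server; counts how many queries actually reach it."""

    def __init__(self, data):
        self.data, self.hits = data, 0

    def get(self, query):
        self.hits += 1
        return self.data[query]


class Cache:
    """Stand-in for a Frontier server or a Squid: answer from cache, else ask upstream."""

    def __init__(self, upstream):
        self.upstream, self.store, self.hits = upstream, {}, 0

    def get(self, query):
        self.hits += 1
        if query not in self.store:  # cache miss: forward the request once
            self.store[query] = self.upstream.get(query)
        return self.store[query]


QUERY = "conditions for run 100000"            # made-up query key
oracle = Origin({QUERY: "payload"})
frontier = Cache(oracle)                       # located near the database
squids = [Cache(frontier) for _ in range(3)]   # one per remote site

for squid in squids:
    for _job in range(100):                    # 100 jobs per site ask the same thing
        squid.get(QUERY)

print(oracle.hits, frontier.hits)
```

In this toy run 300 job requests reach the Squids, 3 reach the Frontier server, and 1 reaches the database, which is the load reduction the slide describes.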

Page 14

DB Access in Athena

  • Athena applications access conditions and geometry DBs using the LCG software libraries POOL, COOL and CORAL
  • Allows for transparent usage of various technologies (Oracle, SQLite, FroNTier/Squid)

Page 15

Tips for Users (1)

What Global Conditions and Geometry tags to use?

  • Autoconfigure your job
      • Have job read global tags from its input file (ESD, AOD)
  • In job options:

        from RecExConfig.RecFlags import rec
        rec.AutoConfiguration=['everything']

  • In job transforms:
      • Command line parameter 'autoConfiguration=everything'

https://twiki.cern.ch/twiki/bin/view/Atlas/RecExCommonAutoConfiguration

Slide: V.Tsulaia

Page 16

Tips for Users (2)

How to configure my environment to access

  • FroNTier/Squid?
  • Conditions payload POOL files?
  • DB Release for geometry (and MC conditions if needed)?

All that is done for you automatically...

… just sit back and enjoy the ride!

Slide: V.Tsulaia

Page 17

Tips for Users (3)

If things go wrong … and it seems to be related to database access

Useful information on TWiki:

  • Athena DB Access: https://twiki.cern.ch/twiki/bin/view/Atlas/AthenaDBAccess
  • COOL Troubles: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/CoolTroubles
  • Atlas DB Release: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/AtlasDBRelease

These TWiki documents should help you narrow down the problem, and then you'll be in a position to

  • Either ask your site admin
  • Or send email to Database Operations <[email protected]>

Slide: V.Tsulaia

Page 18

Conclusions: Databases and DB Access from Athena

  • Databases are used extensively in ATLAS
      • At every stage of data taking, processing, analysis
      • Scratch the surface of many interactive user applications and you will find a database!
  • I’ve attempted to give an overview of the issues and considerations in DB access from Athena
      • The need to provide database information in a variety of access patterns, with potentially widely varying data volumes, from diverse clients, makes Athena access to ATLAS non-event-wise databases (Conditions and Geometry) complex.
      • Supporting different technologies allows us to optimally meet the various needs.
      • A lot of effort has gone into making DB access for user analysis as transparent as possible …
  • More details can be found:
      • See V.Tsulaia slides, Software Workshop in Tbilisi, Oct 26, 2010
      • On various TWiki pages