ATLAS: Database Strategy (and Happy Thanksgiving)
Elizabeth Gallas (Oxford)
Distributed Database Operations Workshop CERN: November 26-27, 2009
Outline
- Thanks!
- Overview of Oracle Usage in ATLAS
- Highlight and describe schemas which are distributed:
  - AMI (limited distribution: Lyon → CERN)
  - Trigger Database (summary and usage)
  - Conditions Database (many systems under one roof)
    - 'Conditions' in the ATLAS Conditions Database
    - Database distribution technologies; why Frontier?
  - TAG Database: architecture, services, resource planning
- Distributed database access at Tier-1:
  - Accounts and profiles
  - The ATLAS COOL Pilot
  - Monitoring
- Summary
Supporting the databases is critical! Special thanks to:
- CERN Physics Database Services
  - Support Oracle-based services at CERN
  - Coordinate Distributed Database Operations (WLCG 3D)
- Tier-1 (Tier-2) DBAs and system managers
  - An essential component for ATLAS success: the majority of ATLAS processing happens on "The Grid"
- ATLAS DBAs: Florbela Viegas, Gancho Dimitrov
  - Schedule/apply Oracle interventions
  - Advise us on application development
  - Coordinate database monitoring (experts, shifters)
Helping to develop, maintain and distribute this critical data takes specialized knowledge and considerable effort, which is frequently underestimated.
Thanks to Oracle Operations Support
Overview – Oracle usage in ATLAS
Oracle is used extensively at every stage of data taking & analysis:
- Configuration
  - Detector Description – geometry
  - OKS – configuration databases for the TDAQ
  - Trigger – trigger configuration (online and simulation)
  - PVSS – Detector Control System (DCS) configuration & monitoring
- File and job management
  - T0 – Tier-0 processing
  - DQ2/DDM – distributed file and dataset management
  - Dashboard – monitors jobs and data movement on the ATLAS grid
  - PanDA – workload management: production & distributed analysis
- Dataset selection
  - AMI (dataset selection catalogue)
- Conditions data (non-event data for offline analysis)
  - Conditions Database in Oracle
  - [POOL files in DDM (referenced from the Conditions DB)]
- Event summary – event-level metadata
  - TAGs – ease selection of and navigation to events of interest
[Table in original slide: each database mapped to its distribution – to grid sites, Lyon → CERN, or volunteer hosts.]
ATLAS 3D replication snapshot (Oracle Streams, CERN → Tier-1 sites)
- Conditions DB – data needed for offline analysis
- Trigger DB
- Muon conditions from the muon calibration centres (Munich, Michigan, Rome) → CERN
- AMI (Lyon → CERN)
- ATLAS TAG DB (customized distribution, not using Oracle Streams)
  - "Upload" mechanism described later
  - Only to volunteer host sites: CERN, TRIUMF, RAL, DESY, BNL?
AMI (Metadata Catalogue) replication
- AMI DB: master at Lyon
- AMI read-only replica at CERN, via Oracle Streams
- AMI web services – more robust
- Other Tier-1s – nothing to worry about (no local replica needed)
[Diagram (from Solveig Albrand / AMI team): AMI catalogues served by Oracle at CC-IN2P3 (ccami01/ccami02 servers, ami.in2p3.fr) and a legacy MySQL server at LPSC, with a read-only Oracle replica at CERN (ami.cern.ch) fed by 3D replication.]
Trigger Database: Usage and Replication
The Trigger DB lies at the heart of the trigger configuration system (red components in the diagram below). Paul Bell gave a summary in March 2009:
http://indicobeta.cern.ch/conferenceDisplay.py?confId=43390
- A subset of Trigger DB information is stored in the Conditions DB, used by many standard ATLAS analyses
- But more information is needed from the Trigger DB for specialized tasks
Therefore:
- Online Trigger DB on ATONR
  - replicated to ATLR via Oracle Streams
  - replicated to Tier-1 via Oracle Streams – allows (re-)running the HLT at Tier-1
- Simulation Trigger DB on ATLR
  - replicated to Tier-1 via Oracle Streams – allows the configuration of MC jobs
- Storage in Oracle (some previously in XML)
  - ensures safe archival of all menus used
  - easy browsing using web access tools
- Replication to Tier-2 for both Online and Simulation Trigger DB via "DB Release": SQLite files shipped with the DB Release
[Diagram: trigger configuration system components – TriggerTool and configuration uploading feeding the TriggerDB via CORAL "loader classes"; L1 H/W configuration, HLT S/W configuration, TriggerPanel, Run Control, online operation and offline access; Conf2COOL copying a subset to the COOL Conditions Database.]
"Conditions" in the Conditions DB
"Conditions" – a general term for information which is not 'event-wise', reflecting the conditions or states of a system; conditions are valid for an interval ranging from very short to infinity.
Any conditions data needed for offline processing and/or analysis must be stored in the ATLAS Conditions Database (aka COOL) or in its referenced POOL files (DDM).
[Diagram: the ATLAS Conditions Database (any non-event-wise data needed for offline processing/analysis) collects input from many systems – DCS, TDAQ, OKS, LHC, Data Quality, Configuration, ZDC, ...]
Conditions DB in ATLAS
- Used at many stages of data taking and analysis: from online calibrations, alignment and monitoring, to offline processing, more calibrations, further alignment, reprocessing, analysis, luminosity and data quality
- An IOV ('Interval of Validity') based database
  - All data are indexed by interval (time or Run/LB)
  - Makes it possible to extract subsets of the data in distinct intervals across subsystems – the basis of Conditions in the "DB Release" (a minimal sketch of the idea follows below)
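As a minimal sketch of what IOV-based indexing means (illustrative only – this is not the COOL API, the folder and payload names are invented, and the run/LB encoding shown is just a common convention, run << 32 | LB):

    # Illustrative sketch of IOV-based lookup (NOT the COOL API; names are invented).
    # Each payload is valid for a half-open interval [since, until).
    from bisect import bisect_right

    class IovFolder:
        def __init__(self):
            self._since = []      # sorted lower bounds of the stored intervals
            self._records = []    # (since, until, payload), kept sorted by 'since'

        def store(self, since, until, payload):
            """Register a payload valid for [since, until); assumes non-overlapping IOVs."""
            i = bisect_right(self._since, since)
            self._since.insert(i, since)
            self._records.insert(i, (since, until, payload))

        def lookup(self, when):
            """Return the payload valid at 'when' (a time or an encoded run/LB), or None."""
            i = bisect_right(self._since, when) - 1
            if i >= 0:
                since, until, payload = self._records[i]
                if since <= when < until:
                    return payload
            return None

    def run_lb(run, lb):
        """Encode a (run, lumi-block) pair into a single IOV key."""
        return (run << 32) | lb

    folder = IovFolder()
    folder.store(run_lb(90272, 0), run_lb(90272, 100), {"pixel_hv": "ON"})
    print(folder.lookup(run_lb(90272, 42)))    # -> {'pixel_hv': 'ON'}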
- Athena (the ATLAS software framework) allows jobs to access any Conditions
  - Considerable improvements in Release 15 (TWiki: DeliverablesForRelease15): more efficient use of COOL connections, enabling of Frontier / Squid connections to Oracle, IOVDbSvc refinements ... and much more
  - Continued refinement, by subsystem, as we chip away at inefficient usage
- Relies on considerable infrastructure: COOL, CORAL, Athena (developed by ATLAS and CERN IT) – a generic schema design which can store, accommodate and deliver a large amount of data for a diverse set of subsystems
Conditions Distribution Issues
- Oracle stores a huge amount of essential data 'at our fingertips'
  - But ATLAS has many... many... many... fingers, which may be looking for anything from the oldest to the newest data
- Conditions in Oracle: master copy at Tier-0, selected replication to Tier-1
  - Running jobs at the Oracle sites (direct access) performs well; important to continue testing and to optimize the RACs
- But direct Oracle access on the grid from a remote site goes over the Wide Area Network
  - Even after tuning efforts, direct access requires many back-and-forth communications on the network – excessive RTT (Round Trip Time)... SLOW (a rough estimate follows below)
  - Cascade effect: jobs hold connections for longer, preventing new jobs from starting
- Use alternative technologies, especially over the WAN: "cache" Conditions from Oracle when possible
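As a rough, illustrative estimate (assumed numbers, not a measurement): a job that issues 1,000 sequential Conditions queries over a WAN link with ~200 ms round-trip time spends about 1,000 × 0.2 s ≈ 200 s on network latency alone, before any server-side work – which is why answering repeated queries from a cache close to the job pays off.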
[Diagram: calibration updates flow into the Online Conditions DB, which feeds the offline master Conditions DB serving the Tier-0 farm inside the computer centre; the master is replicated to Tier-1 replicas for the outside world, with an isolation cut between the computer centre and the outside.]
Alternatives to direct Oracle access
- "DB Release": build a system of files containing the data needed; used in reprocessing campaigns. Includes:
  - SQLite replicas: "mini" Conditions DBs with specific folders, IOV range and COOL tag (a 'slice' – a small subset of all rows in the Oracle tables; see the sketch below)
  - And the associated POOL files and POOL file catalogue (PFC)
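As a rough sketch of what such a 'slice' amounts to (illustrative only – the real DB Release tooling and the COOL schema are more involved, and the table/column names here are invented): rows of the selected folders whose IOVs overlap the requested range are copied into a local SQLite file.

    # Illustrative sketch of building a "mini" Conditions slice in SQLite
    # (invented schema; the real DB Release machinery and COOL schema differ).
    import sqlite3

    def build_slice(master_rows, iov_min, iov_max, out_file):
        """Copy rows whose IOVs overlap [iov_min, iov_max] into an SQLite file."""
        con = sqlite3.connect(out_file)
        con.execute("""CREATE TABLE IF NOT EXISTS conditions
                       (folder TEXT, since INTEGER, until INTEGER, payload TEXT)""")
        selected = [r for r in master_rows
                    if r["until"] > iov_min and r["since"] <= iov_max]
        con.executemany(
            "INSERT INTO conditions VALUES (:folder, :since, :until, :payload)", selected)
        con.commit()
        con.close()
        return len(selected)

    # Tiny in-memory stand-in for the master copy:
    master = [
        {"folder": "/PIXEL/HV", "since": 100, "until": 200, "payload": "ON"},
        {"folder": "/PIXEL/HV", "since": 200, "until": 300, "payload": "OFF"},
    ]
    print(build_slice(master, 150, 250, "mini_conditions.db"))    # -> 2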
- "Frontier": store the results of Oracle queries in a web cache
  - Developed by Fermilab, used by CDF, adopted and further refined for CMS (thanks to Dave Dykstra/CMS for a lot of advice!)
  - Basic idea: Frontier / Squid servers located at or near the Oracle RAC negotiate transactions between grid jobs and the Oracle DB – load levelling
    - reduces the load on Oracle by caching the results of repeated queries (sketched below)
    - reduces the latency observed connecting to Oracle over the WAN
  - Additional Squid servers at remote sites help even more
  - Douglas Smith will talk later today about Frontier/Squid deployment
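The caching idea can be illustrated with a minimal sketch (illustrative only – the real Frontier service is a Tomcat servlet plus Squid caches, speaking HTTP and handling compression and cache consistency): identical queries are answered from a cache instead of reaching Oracle again.

    # Minimal sketch of the query-result caching idea behind Frontier/Squid
    # (illustrative only; it ignores HTTP, compression and cache consistency).
    import time

    class QueryCache:
        def __init__(self, backend, ttl_seconds=300):
            self.backend = backend      # callable that actually runs the query on Oracle
            self.ttl = ttl_seconds      # how long a cached result is considered fresh
            self._cache = {}            # query text -> (timestamp, result)

        def query(self, sql):
            now = time.time()
            hit = self._cache.get(sql)
            if hit is not None and now - hit[0] < self.ttl:
                return hit[1]           # served from cache: no round trip to Oracle
            result = self.backend(sql)  # cache miss: go to the database
            self._cache[sql] = (now, result)
            return result

    # Example with a fake backend standing in for Oracle:
    calls = []
    def fake_oracle(sql):
        calls.append(sql)
        return [("payload",)]

    cache = QueryCache(fake_oracle)
    cache.query("SELECT data FROM cond WHERE run = 142193")
    cache.query("SELECT data FROM cond WHERE run = 142193")
    print(len(calls))    # -> 1: the repeated query never reached the backend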
“DB Release” works well for reprocessing, but not well for many other use cases…
ATLAS TAGs in the ATLAS Computing Model
- Stages of ATLAS reconstruction:
  - RAW data file
  - ESD (Event Summary Data) ~ 500 kB/event
  - AOD (Analysis Object Data) ~ 100 kB/event
  - TAG (not an acronym) ~ 1 kB/event (stable)
- TAGs are produced in reconstruction in 2 formats:
  - File based: AthenaAwareNTuple format (AANT); TAG files are distributed to all Tier-1 sites
  - Oracle Database: the Event TAG DB is populated from the files in an 'upload' process
- Can be re-produced in re-processing
- Available globally through a network connection
- In addition:
  - 'Run Metadata' at temporal, fill, run and LB levels
  - File and dataset related metadata
- TAG Browser (ELSSI) – uses the combined Event, Run, File ... metadata (an illustrative selection query is sketched below)
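As an illustration of the kind of event-level selection this enables (a hypothetical example: the table and column names are invented, real TAG attributes differ, and ELSSI normally builds such queries for the user), a job could pick out events by a simple attribute cut and retrieve the navigation references into the AOD:

    # Hypothetical example of an event-level selection against a TAG collection
    # (table/column names are invented; ELSSI normally generates such queries).
    import cx_Oracle

    def select_events(connect_string, run_number, min_missing_et):
        """Return (run, event, POOL reference) for events passing a simple cut."""
        con = cx_Oracle.connect(connect_string)     # e.g. a TAGS reader account
        cur = con.cursor()
        cur.execute("""SELECT run_number, event_number, pool_ref
                         FROM my_tag_collection
                        WHERE run_number = :run
                          AND missing_et > :met""",
                    {"run": run_number, "met": min_missing_et})
        events = cur.fetchall()   # event identifiers plus references into the AOD files
        con.close()
        return events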
[Diagram: data flow RAW → ESD → AOD → TAG.]
TAG Database Architecture
- The TAG DB services / architecture model has evolved
  - From: everything deployed at all voluntary sites
  - To: specific aspects deployed to optimize resources
  - Decoupling of services is underway – it increases the flexibility of the system to deploy resources depending on the evolution of usage
- TAG DB (experts: Florbela Viegas, Elisabeth Vinek)
  - Is populated via an "upload" process from TAG files at Tier-0
    - For initial reconstruction at Tier-0: upload is automated
    - For reprocessing: upload is manually triggered (site specific)
  - All uploads are controlled by Tier-0, with detailed monitoring, integrated with AMI and DQ2 tools where appropriate
  - Balanced for read and write operations. Very stable.
- TAG DB voluntary hosts include: CERN, TRIUMF, DESY, RAL and BNL (?)
Tier-1 ATLAS DB Accounts and Profiles
Access to the distributed Oracle schemas is regulated via applications which access the database using accounts you have set up at the Tier-1s.
Tier-1s should refer to this ATLAS TWiki for more details:
https://twiki.cern.ch/twiki/bin/view/Atlas/OracleProcessesSessionsUsersProfiles
Tier-1 DB account overview:
- Conditions Database
  - ATLAS_COOL_READER_U: used by Athena to read the Conditions DB via direct Oracle access
  - You have recently disabled ATLAS_COOL_READER, which had a weak password (and was used by old ATLAS releases)
- Trigger Database
  - ATLAS_CONF_TRIGGER_V2_R
- TAGs (for volunteer sites hosting the ATLAS TAG DB)
  - ATLAS_TAGS_READER: reader account of the TAGS application
  - ATLAS_TAGS_WRITER: writer account of the TAGS application
- Special accounts for Frontier (at sites with Frontier)
  - ATLAS_FRONTIER_READER: used by Athena to read COOL and CONF via the Frontier servlet
  - ATLAS_FRONTIER_TRACKMOD: used by the Frontier cache-consistency mechanism to get the last table modification time
The ATLAS COOL Pilot
- A "message board" style queue to throttle COOL jobs based on load
- Described in detail here (Database Operations Meeting, 4 Nov 2009):
  http://indicobeta.cern.ch/getFile.py/access?subContId=2&contribId=4&resId=1&materialId=slides&confId=72342
- Tested at IN2P3, installed at several sites; not active now, but can be set up quickly if needed, as the code (pilot.py) is in the ATLAS software
- Monitoring tools allow the behaviour to be controlled via a web-based interface

File: pilot.py (excerpt)

    # (uses: from time import sleep, strftime, gmtime; import random, sys)
    # ...
    while True:
        # ...
        sensor.execute("select atlas_cool_pilot.load_status from dual")
        # ...
        if status == 'GO':
            # RW: avoid start of many jobs released at once
            sleep(attempt * random.randint(1, 300))   # exit(134)
            break
        else:
            print strftime(" %a, %d %b %Y %H:%M:%S ", gmtime()), \
                  " avoiding load of", load[3], "at", sessions, "concurrent COOL sessions"
            if attempt < 5:
                attempt = attempt + 1
                print attempt * sessions
                sleep(attempt * sessions)
            # ...
            if attempt == 5:
                print " FATAL: Killing job to avoid Oracle overload"
                sys.exit(134)
    # ...
Database monitoring at Tier-1
- The ATLAS Computing Model puts many types of analysis on the grid: a wide diversity of jobs accessing databases
  - Conditions DB access via Athena is especially variable
- DBAs and Conditions DB experts are looking into specific queries
  - Asking Tier-1s to send top-usage queries back to us
  - The Frontier evaluation has been useful in disclosing some issues (Savannah bug reports, fixes to CORAL, COOL, Frontier)
- Access will vary
  - With time and with increasing collision data
  - By site (different groups have different responsibilities)
- Experts at CERN and the Tier-1s use the resources and levers at their disposal to explore issues and find solutions
  - Example: recent talk by Carlos Gamboa (BNL):
    http://indicobeta.cern.ch/conferenceDisplay.py?confId=72342
  - New ATLAS TWiki logging issues, evaluation, resolution and follow up:
    https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasDBIssues
- Access patterns at Tier-1s will help us proactively find issues
  - Weekly (or regular) reports might help; example from Tier-0, with a simple per-account sketch below:
    http://atlas.web.cern.ch/Atlas/GROUPS/DATABASE/rac_report/
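One hedged sketch of what such a regular report could look like (gv$session is a standard Oracle view, but the report itself is only an assumption and needs an account privileged to read that view): count the current sessions per ATLAS application account so that unusual access patterns stand out.

    # Sketch of a simple per-account session report a Tier-1 could run periodically
    # (gv$session is a standard Oracle view; the report format is just an assumption).
    import cx_Oracle

    def session_report(connect_string):
        con = cx_Oracle.connect(connect_string)   # monitoring account able to read gv$session
        cur = con.cursor()
        cur.execute("""SELECT username, COUNT(*)
                         FROM gv$session
                        WHERE username LIKE 'ATLAS%'
                        GROUP BY username
                        ORDER BY COUNT(*) DESC""")
        for username, n_sessions in cur:
            print("%-30s %4d sessions" % (username, n_sessions))
        con.close()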
Summary
- Schemas are stable
- Access is controlled
- Monitoring is in place
- Depth of experience
- Some redundancy and tools in back pockets (COOL Pilot, Frontier)
It's November... and the data is coming... intensive and diverse analysis will follow...
Are we ready? YES!
Why Frontier?
TadaAki Isobe (DB access from Japan):
http://www.icepp.s.u-tokyo.ac.jp/~isobe/rc/conddb/DBforUserAnalysisAtTokyo.pdf
- ReadReal.py script, Athena 15.4.0; 3 methods compared:
  - SQLite: access files on NFS
  - Direct Oracle at Lyon: ~290 ms RTT to CC-IN2P3
  - FroNTier: ~200 ms RTT to BNL, zip-level 5
    - At zip-level 0 (i.e. no compression) it takes ~15% longer
- Work is ongoing to understand these kinds of tests.
What does a remote collaborator (or grid job) need?
- Use case: TRT monitoring (one of MANY examples...) – Fred Luehring, Indiana U
  - Needs: latest Conditions (Oracle + POOL files)
  - From: a Tier-3 site
  - Explored: all 3 access methods
  - Talked: in hallways... then at meetings (BNL Frontier, ATLAS DB)... with experts and many other users facing similar issues
- Many, many talented people from around the globe have been involved in this process – impressive collaboration!
- Collective realization that use cases continue to grow for distributed processing... calibration... alignment... analysis...
  - Expect a sustained surge in all use cases with collision data
- Frontier technology seems to satisfy the needs of most use cases in a reasonable time
- Now a matter of final testing to refine configurations, going global... for all sites wanting to run jobs with the latest data...
TAG Database Services
Components:
- TAG Database(s) at CERN and voluntary Tier-1s / Tier-2s
- ELSSI – Event Level Selection Service Interface
  - TAG usage (ELSSI and file-based) covered in every software tutorial
- Web services
  - Extract – dependencies: ATLAS software, AFS maxidisk to hold ROOT files from Extract
  - Skim – ATLAS software, DQ2, Ganga, ...
  - Surge in effort helping to make TAG jobs grid-enabled
Response times vary:
- O(sec) for interactive queries (event selection, histograms, ...)
- O(min) for Extract
- O(hr) for Skim
TAGs by Oracle Site (size in GB)
CERN ATLR (total 404 GB):
  ATLAS_TAGS_CSC_FDR      22
  ATLAS_TAGS_COMM_2009   142
  ATLAS_TAGS_USERMIX      18
  ATLAS_TAGS_COMM        200
  ATLAS_TAGS_ROME         22
BNL (total 383 GB):
  ATLAS_TAGS_CSC_FDR      18
  ATLAS_TAGS_COMM_2009   105
  ATLAS_TAGS_COMM        260
TRIUMF (total 206 GB):
  ATLAS_TAGS_CSC_FDR      16
  ATLAS_TAGS_COMM        190
DESY:
  ATLAS_TAGS_MC          231
RAL – gearing up ...
- A respectable level of TAG deployment – should accommodate a wide variety of users (commissioning and physics analysis)
- TAG upload is now routine for commissioning ... and will be automatic for collisions
- Intensive work on the deployment of TAG services is making them increasingly accessible to users (ELSSI)