Fermilab Database Experience in Run II
Fermilab Run II Database
Requirements
• Online databases are maintained at each experiment and are critical for data taking.
• Offline databases are maintained in the Feynman Computing Center and are critical for data processing and analysis.
• High availability is required for both online and offline database systems.
• Database Applications Overview
  – Detector and physics data
    • Detector Calibration
    • Trigger lists
    • Data Luminosity
    • Detector Slow Controls
    • Run and Run Quality information
  – Data Handling (The SAM Database)
    • Physics Metadata
    • File catalog
    • File replica management
    • Processing information
• Database storage growth is shown in the accompanying charts (D0 left, CDF right).
Oracle in Run II
• TOOLMAN
  – Provides an alternative to Oracle Enterprise Manager (OEM) for monitoring Oracle databases.
  – Can be customized in several ways for the machine and databases it monitors.
Database Monitoring
• Monitoring is done using Oracle Enterprise Manager (OEM, by Oracle Corp) and TOOLMAN, an in-house developed tool.
• OEM monitors the following:
  – Node up and down, Database Listener down, Intelligent Agent
  – Number of storage extents and space usage
  – Database alerts: database down, file corruption
  – Number of concurrent sessions, CPU usage, memory usage
  – Hit ratios for the Library Cache, Buffer Cache and other database resources.
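As an illustration of the hit-ratio metric listed above, the following is a minimal sketch of the commonly used buffer cache hit ratio formula, computed from three Oracle statistics (physical reads, db block gets, consistent gets). The function name and the way the counters are passed in are assumptions for illustration; the poster does not say how OEM or TOOLMAN compute or display this value.

```python
def buffer_cache_hit_ratio(physical_reads: int,
                           db_block_gets: int,
                           consistent_gets: int) -> float:
    """Fraction of logical reads that were served from the buffer cache.

    Uses the conventional formula:
        1 - physical_reads / (db_block_gets + consistent_gets)
    The inputs are assumed to be cumulative counters sampled by the caller.
    """
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        # No logical reads yet, so no meaningful ratio; report 0.0.
        return 0.0
    return 1.0 - physical_reads / logical_reads


if __name__ == "__main__":
    # Hypothetical counter values, not measurements from the Run II databases.
    print(buffer_cache_hit_ratio(100, 400, 600))
```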
Table Partitioning
• Partitioning has been implemented for very large tables in the database.
• D0 uses a partitioned Events table with 50M events in each partition.
• Each partition is stored in its own tablespace, and the corresponding indexes are also partitioned and stored in their own tablespaces.
• Partitioning improves query optimization and backup performance.
• Over 1 billion events are distributed over 24 partitions, and a new partition is started about once a month.
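The arithmetic behind these figures can be sketched as follows. The sketch assumes events are assigned to partitions by a sequential event count, which the poster does not state; the function names are hypothetical.

```python
EVENTS_PER_PARTITION = 50_000_000  # from the poster: 50M events per partition


def partition_for(event_seq: int) -> int:
    """Return the 1-based partition holding the event_seq-th event (1-based).

    Assumes partitions are filled in order by a sequential event count.
    """
    if event_seq < 1:
        raise ValueError("event_seq must be >= 1")
    return (event_seq - 1) // EVENTS_PER_PARTITION + 1


def partitions_needed(total_events: int) -> int:
    """Return how many partitions are needed to hold total_events events."""
    return -(-total_events // EVENTS_PER_PARTITION)  # ceiling division


if __name__ == "__main__":
    print(partition_for(50_000_001))
    print(partitions_needed(1_200_000_000))
```

Under this assumption, 24 partitions hold up to 1.2 billion events, which is consistent with "over 1 billion events" on the poster.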
Replication
• Replication is used to share data across a large user community.
• CDF has the same database structure for online and offline databases. Oracle's asynchronous replication is used to refresh offline tables from online tables periodically.
• One replica is used by Farm Users and the other is used by CAF and other READ ONLY users.
• A key feature of CDF replication is fail-over from one replica to another for high reliability.
• CDF plans to migrate soon to Oracle Streams replication, available from the 9.2.x release.
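The read-only fail-over between replicas can be illustrated with a minimal client-side sketch. This is not the CDF mechanism, which the poster does not describe; it only shows the idea of trying a second replica when the first is unavailable. The `run_query` callable and the use of `ConnectionError` to signal an unreachable replica are assumptions.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def read_with_failover(replicas: Sequence[str],
                       run_query: Callable[[str], T]) -> T:
    """Run a read-only query against the first replica that responds.

    replicas  -- replica names in order of preference
    run_query -- hypothetical callable that queries one named replica and
                 raises ConnectionError if that replica is unreachable
    """
    last_error = None
    for name in replicas:
        try:
            return run_query(name)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and try the next replica
    raise ConnectionError(f"all replicas failed: {list(replicas)}") from last_error


if __name__ == "__main__":
    def fake_query(name: str) -> str:
        # Simulate the first offline replica being down.
        if name == "cdfofprd":
            raise ConnectionError("replica unavailable")
        return f"result from {name}"

    print(read_with_failover(["cdfofprd", "cdfrep01"], fake_query))
```

Fail-over of this kind is safe only for read traffic, since both replicas are refreshed from the same online source and clients never write to them.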
[Figure: Event Partition Growth. Partition number (0 to 30) versus rollover date, from 10/17/2001 to 7/17/2004.]
[Figure: CDF replication. Labels: On-line DB (cdfonprd, 9.2.0.4); Off-line DB (cdfofprd, 8.7.1.4); Off-line DB (cdfrep01, 9.2.0.3); Basic replication links; DFC; On-line Users; Farm Users; CAF and Others; 4 Applications; Failover for Read Service ONLY.]
Run II Database Access
• For D0, only a subset of the online information was transferred to the offline database (lower left).
• All access to the D0 offline database was through the Calibration DB server (DAN, upper right) or the Data Handling server (SAM).
• CDF employed Basic Oracle replication to transfer all online database information to offline databases (see poster 'Oracle in Run II').
• FroNtier is a web-based, highly scalable approach being developed for CDF to provide high-performance database access to read-only information (lower right). http://whcdf03.fnal.gov/ntier-wiki
[Figure: DØ Online to Offline Database Copy. Legend: Online Host (DEC), Front End (68k), Level 3 Nodes (NT), Offline Host (Sun), Local Host. Labeled components include the front-end nodes and trigger control, Level 3 filter nodes, Datalogger, ONLINE DB, LUM server, calibration process, SAM process, SAM DB, OFFLINE DB, ENSTORE, data file and metadata paths, Web Entry, MFC Entry, and the online-to-offline connection.]
DØ Offline Caching Server: DAN (Database Access Network)
• CORBA interface to Client apps
• Memory (L1) and Disk (L2) caching
• Connection management to Database
• Server has common code base with SAM DB server
Read-only DB access
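The memory (L1) and disk (L2) caching idea can be sketched as a small two-level read-through cache. This is not DAN code: DAN exposes a CORBA interface and its internals are not described here. The class name, the pickle-on-disk layout, and the `load_from_db` callable are assumptions for illustration.

```python
import hashlib
import os
import pickle
from typing import Any, Callable, Dict


class TwoLevelCache:
    """Read-through cache: memory first (L1), then disk (L2), then the database."""

    def __init__(self, disk_dir: str, load_from_db: Callable[[str], Any]) -> None:
        self._memory: Dict[str, Any] = {}   # L1: in-process dictionary
        self._disk_dir = disk_dir           # L2: one pickle file per key
        self._load_from_db = load_from_db   # hypothetical database loader
        os.makedirs(disk_dir, exist_ok=True)

    def _path(self, key: str) -> str:
        # Hash the key so that any string maps to a safe file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self._disk_dir, digest + ".pkl")

    def get(self, key: str) -> Any:
        if key in self._memory:              # L1 hit
            return self._memory[key]
        path = self._path(key)
        if os.path.exists(path):             # L2 hit
            with open(path, "rb") as fh:
                value = pickle.load(fh)
        else:                                # miss: go to the database once
            value = self._load_from_db(key)
            with open(path, "wb") as fh:
                pickle.dump(value, fh)
        self._memory[key] = value            # promote to L1
        return value
```

The disk level survives a server restart, so a restarted process can repopulate its memory cache without querying the database again. The sketch has no eviction or invalidation, which a production server would need.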
FroNtier Overview
[Figure: FroNtier tiers. Client: FroNtier Client API Library. Caching: Squid Proxy/Caching Server. FroNtier Server: FroNtier Servlet running under Tomcat. Database: database (or other persistency service). Connection labels: HTTP, HTTP, JDBC. Other labels: CDF Persistent Object Templates (Java), XML Server Descriptors, DDL for Table Descriptions, C++ Header and Stubs.]
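Because FroNtier serves read-only data over HTTP through a Squid proxy, identical requests can be answered from the cache when they map to the same URL. The sketch below shows one way to build a deterministic request URL from query parameters. It does not use the FroNtier client API; the function name, the base URL, and the parameter names are hypothetical.

```python
from typing import Mapping
from urllib.parse import urlencode


def cacheable_url(base_url: str, params: Mapping[str, str]) -> str:
    """Build a GET URL whose text depends only on the parameter values.

    Sorting the parameters means two logically identical requests produce
    the same URL, so an HTTP cache keyed on the URL can serve the second
    request without contacting the server.
    """
    query = urlencode(sorted(params.items()))
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Hypothetical endpoint and parameter names, for illustration only.
    a = cacheable_url("http://example.invalid/query", {"table": "calib", "run": "1234"})
    b = cacheable_url("http://example.invalid/query", {"run": "1234", "table": "calib"})
    print(a == b)
```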
Run II Database Performance and Monitoring
[Figure labels: CDF Logging Server; CDF Client Applications; DAN Server (four instances).]
• Project Goal: Common tools for Application Monitoring
• Information Generation is Experiment Specific
• The Collector gathers and parses data
• The Archiver uses a MySQL Repository
• Plotting tools use JavaFreeChart
• Histogramming uses JAIDA
• Admin and automation scripts are included.
• http://dbsmon.fnal.gov
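The collector step (gather and parse data) can be sketched as a function that turns log lines into connections-per-minute counts, the kind of quantity the monitor plots. The log line format used here (`YYYY-MM-DD HH:MM:SS CONNECT ...`) is an assumption; the actual formats are experiment specific and are not given on the poster.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable


def connections_per_minute(lines: Iterable[str]) -> Dict[str, int]:
    """Count CONNECT events per minute from hypothetical log lines.

    Expected line shape (an assumption): "2004-07-17 10:15:42 CONNECT user".
    Lines that do not match are skipped rather than treated as errors.
    """
    counts: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3 or parts[2] != "CONNECT":
            continue
        try:
            stamp = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # malformed timestamp
        counts[stamp.strftime("%Y-%m-%d %H:%M")] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [
        "2004-07-17 10:15:02 CONNECT alice",
        "2004-07-17 10:15:42 CONNECT bob",
        "2004-07-17 10:16:01 CONNECT carol",
        "2004-07-17 10:16:05 QUERY carol",
    ]
    print(connections_per_minute(sample))
```

An archiver would then store these per-minute counts in the repository for the plotting tools to read.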
[Figure captions: Average duration of database connections for CDF. Top CPU users among CDF database applications over an 8-hour interval. Query counts for D0 SAM servers over a 24-hour interval.]
DB Monitor Overview
CDF or D0
• DBS Monitor is used for collecting information on database access and presenting it through a web interface
[Figures: DBS Monitor plots. Number of connections per minute for CDF; query counts per week for the D0 SAM station server; number of queries per hour for D0 Farm and Non-Farm servers.]
Database Monitoring is a crucial component of our Database Operation.