CASTOR: CERN’s data management system
CHEP03, 25/3/2003
Ben Couturier, Jean-Damien Durand, Olof Bärring CERN
25/3/2003 CASTOR: CERN's data management system 2
Introduction
• CERN Advanced STORage Manager
  – Hierarchical Storage Manager used to store user and physics files
  – Manages the secondary and tertiary storage
• History
  – Development started in 1999, based on SHIFT, CERN's tape and disk management system since the beginning of the 1990s (SHIFT was awarded the 21st Century Achievement Award by Computerworld in 2001)
  – In production since the beginning of 2001
• Currently holds more than 9 million files and 2000 TB of data
• http://cern.ch/castor/
Main Characteristics (1)
• CASTOR Namespace
  – All files belong to the “/castor” hierarchy
  – The access rights are standard UNIX rights
• POSIX Interface
  – Files are accessible through a standard POSIX-like interface; all calls are rfio_xxx (e.g. rfio_open, rfio_close…)
• RFIO Protocol
  – All remote file access is done using the Remote File IO protocol, developed at CERN.
Main Characteristics (2)
• Modularity
  – The components in CASTOR have well-defined roles and interfaces; a component can be changed without affecting the whole system
• Highly Distributed System
  – CERN uses a very distributed configuration with many disk servers/tape servers
  – Can also run in more limited environments
• Scalability
  – The number of disk servers, tape servers, name servers… is not limited
  – Use of an RDBMS (Oracle, MySQL) to improve the scalability of some critical components
Main Characteristics (3)
• Tape drive sharing– A large number of drives can be shared between users
or dedicated to some users/experiments– Drives can be shared with other applications: with TSM,
for example
• High Performance Tape Mover– Use of threads and circular buffers– Overlaid device and network I/O
• Grid Interfaces– A GridFTP daemon interfaced with CASTOR is currently
in test– A SRM Interface (V1.0) for CASTOR has been developed
Hardware Compatibility
• CASTOR runs on:
  – Linux, Solaris, AIX, HP-UX, Digital UNIX, IRIX
  – The clients and some of the servers run on Windows NT/2K
• Supported drives
  – DLT/SDLT, LTO, IBM 3590, STK 9840, STK 9940A/B (and older drives already supported by SHIFT)
• Libraries
  – SCSI libraries
  – ADIC Scalar, IBM 3494, IBM 3584, Odetics, Sony DMS24, STK Powderhorn
CASTOR Components
• Central servers
  – Name Server
  – Volume Manager
  – Volume and Drive Queue Manager (manages the volume and drive queues per device group)
  – UPV (authorization daemon)
• “Disk” subsystem
  – RFIO (Disk Mover)
  – Stager (Disk Pool Manager and Hierarchical Resource Manager)
• “Tape” subsystem
  – RTCOPY daemon (Tape Mover)
  – Tpdaemon (PVR)
CASTOR Architecture
[Architecture diagram: an RFIO client accesses files through the stager and RFIOD (disk mover) over the disk pool; the name server, volume manager, VDQM server, CUPV and MSGD coordinate with the tape side, where RTCPD (tape mover) drives the Tpdaemon (PVR).]
CASTOR Setup at CERN
• Disk servers
  – ~140 disk servers
  – ~70 TB of staging pools
  – ~40 stagers
• Tape drives and servers

    Model     Drives  Servers
    9940B       21       20
    9940A       28       10
    9840        15        5
    3590         4        2
    DLT7000      6        2
    LTO          6        3
    SDLT         2        1

• Libraries
  – 2 sets of 5 Powderhorn silos (2 x 27500 cartridges)
  – 1 Timberwolf (1 x 600 cartridges)
  – 1 L700 (1 x 600 cartridges)
Evolution of Data in CASTOR
Tape Mounts per group
Tape Mounts per drive type
ALICE Data Challenge
• Migration rate of 300 MB/s sustained for a week
  – Using 18 STK T9940B drives
  – ~20 disk servers managed by 1 stager
  – A separate name server was used for the data challenge
  – See the presentation of Roberto Divia