The GSI Mass Storage for Experiment Data DVEE-Palaver GSI Darmstadt Feb. 15, 2005 Horst Göringer, GSI Darmstadt [email protected]


The GSI Mass Storage for Experiment Data

DVEE-Palaver GSI Darmstadt

Feb. 15, 2005

Horst Göringer, GSI Darmstadt

[email protected]


Horst Göringer GSI DVEE Palaver 15.2.2005 2

Overview

• different views
• current status
• last enhancements:
  - write cache
  - on-line connection to DAQ
• future plans
• conclusions


GSI Mass Storage System

Gsi mass STORagE system

gstore


gstore: storage view

[Diagram with three columns: central tape, central disk, clients.
 Central tape: ATL.
 Central disk: read cache (StagePool, RetrievePool, ...) and write cache (ArchivePool, DAQPool, ...).
 Clients: tsmcli client, RFIO client, DAQ client.
 Further labels: disk, memory.]


gstore: hardware view

3 automatic tape libraries (ATL):

(1) IBM 3494 (AIX)

8 tape drives IBM 3590 (14 MByte/s)

ca. 2300 volumes (47 TByte, 13 TByte backup)

1 data mover (adsmsv1)

access via adsmcli, RFIO read

read cache 1.1 TByte

StagePool, RetrievePool


gstore: hardware view

(2) StorageTek L700 (Windows 2000)

8 tape drives LTO2 ULTRIUM (35 MByte/s)

ca. 170 volumes (32 TByte)

8 data movers (gsidmxx), connected via SAN

access via tsmcli, RFIO

read cache 2.5 TByte

StagePool, RetrievePool

write cache

ArchivePool: 0.28 TByte

DAQPool: 0.28 TByte


gstore: hardware view

(3) StorageTek L700 (Windows 2000)

4 tape drives LTO1 ULTRIUM (15 MByte/s)

ca. 80 volumes (10 TByte):

backup copy of 'irrecoverable' archives ...raw

mainly for backup of user data (~ 30 TByte)


gstore: software view

2 major components:

• TSM (Tivoli Storage Manager), commercial
  - handles tape drives and robots
  - data base
• GSI software (~ 80,000 lines of code)
  C, sockets, threads
  - interface to user (tsmcli / adsmcli, RFIO)
  - interface to TSM (TSM API client)
  - cache administration


gstore user view: tsmcli

tsmcli subcommands:

archive     file*   archive   path
retrieve    file*   archive   path
query       file*   archive   path*
stage       file*   archive   path
delete      file    archive   path
ws_query    file*   archive   path
pool_query  pool*

*: any combination of wildcard characters (*, ?) allowed

soon: file may contain list of files (with wildcard chars)
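
The wildcard rule for the starred arguments can be illustrated with a small Python sketch. It uses the standard `fnmatch` module as a stand-in; whether gstore matches patterns exactly this way is an assumption, and the file names are made up for illustration. Note that `fnmatch` also accepts `[...]` character sets, which the slide does not mention.

```python
from fnmatch import fnmatchcase


def match_files(pattern, names):
    """Return the names matching a pattern with '*' and '?' wildcards.

    '*' matches any run of characters, '?' exactly one character.
    Matching is case sensitive (an assumption about gstore).
    """
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical file names, not taken from the slides.
names = ["run001.lmd", "run002.lmd", "run010.lmd", "calib.dat"]

print(match_files("run00?.lmd", names))  # single-character wildcard
print(match_files("*.lmd", names))       # any-length wildcard
```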


gstore user view: RFIO

rfio_[f]open

rfio_[f]read

rfio_[f]write

rfio_[f]close

rfio_[f]stat

rfio_lseek

GSI extensions (for on-line DAQ connection):

rfio_[f]endfile

rfio_[f]newfile


gstore server view: query

[Diagram components: client, entry server, write cache server, read cache server, TSM server, DB (three instances).]


gstore server view: archive to cache

[Diagram components: client, entry server, write cache server, read cache server, TSM server, DB (three instances), data mover i (of n), mover server, write cache.]


gstore server view: archive from cache

[Diagram components: write cache server, TSM server, DB (two instances), data mover i (of n), archive server, write cache, TSM Storage Agent, SAN, tape.]


gstore server view: retrieve from tape

[Diagram components: client, entry server, write cache server, read cache server, TSM server, DB (three instances), data mover i (of n), mover server, read cache, TSM Storage Agent, SAN, tape.]


server view: retrieve from write cache

[Diagram components: client, entry server, write cache server, read cache server, TSM server, DB (three instances), data mover i and data mover j (each with a mover server), read cache, write cache.]


gstore: overall server view

[Diagram components: client, entry server, write cache server, read cache server, "... cache server", TSM server, DB (three instances), data mover i (of n), mover server, archive server, read cache, write cache, TSM Storage Agent, SAN, tapes.]


server view: gstore design concepts

• strict separation of control and data flow
• no bottleneck for data
• scalable in
  - capacity (tape and disk)
  - I/O bandwidth
• hardware independent (as long as supported by TSM)
• platform independent
• unique name space


server view: cache administration

• multithreaded servers for read and write cache
• each with own metadata DB
• main tasks:
  - lock/unlock files
  - select data movers and file systems
  - collect current information on disk space
    soon: data mover and disk load -> load balancing
  - trigger asynchronous archiving
  - disk cleaning
• several disk pools with different attributes:
  StagePool, RetrievePool, ArchivePool, DAQPool, ...


usage profile: batch farm

batch farm: ~120 double processor nodes

=> highly parallel mass storage access (read and write)

• read requests:
  - 'good' user: stage all files beforehand, use wildcard chars
  - 'bad' user: read lots of single files from tape
  - 'bad' system: stage disk/DM crashes during analysis
• write requests:
  - via write cache
  - distribute as uniformly as possible


usage profile: experiment DAQ

• several continuous data streams from DAQ
• keep same DM during life time of data stream
• only via RFIO
• GSI extensions necessary:
  rfio_[f]endfile, rfio_[f]newfile
• disks emptied faster than filled:
  network -> disk: ~10 MByte/s
  disk -> tape: ~30 MByte/s
  => time to stage for on-line analysis
• enough disk buffer necessary in case of problems
  (robot, TSM, ...)
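
The two rates above allow a rough estimate of how much slack the disk buffer provides. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: a single stream filling at ~10 MByte/s, decimal units (1 TByte = 10^6 MByte), and the 0.28 TByte DAQPool size from the hardware view. The function names are illustrative and not part of gstore.

```python
def outage_buffer_hours(pool_tbyte, fill_mbyte_per_s):
    """Hours until a pool is full if no data can be moved to tape."""
    pool_mbyte = pool_tbyte * 1_000_000  # assumption: decimal units
    return pool_mbyte / fill_mbyte_per_s / 3600


def drain_duty_cycle(fill_mbyte_per_s, drain_mbyte_per_s):
    """Fraction of time the disk -> tape copy must run to keep up."""
    return fill_mbyte_per_s / drain_mbyte_per_s


hours = outage_buffer_hours(0.28, 10.0)  # DAQPool size, network -> disk rate
duty = drain_duty_cycle(10.0, 30.0)      # network -> disk vs. disk -> tape

print(f"buffer lasts about {hours:.1f} h during an outage")
print(f"tape copy busy about {duty:.0%} of the time")
```

With several parallel streams sharing the pool, the time until the buffer is full shrinks accordingly.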


current plans: new hardware

more and safer disks:
• write cache: all RAID
  4 TByte (ArchivePool, DAQPool)
• read cache: +7.5 TByte new RAID
  StagePool, RetrievePool,
  new pools, e.g. with longer file life time
• 5 new data movers

new fail-safe entry server:
• hosts query server, cache administration servers
  -> query performance!
• take-over in case of host failure
• metadata DBs mirrored on 2nd host


current plans: merge tsmcli / adsmcli

new command gstore:
• replaces tsmcli and adsmcli
• unique name space (already available)
• users need not care in which robot data reside
• new archive: policy computing center


brief excursion: future of IBM 3494?

• still heavily used
• rather full
• hardware highly reliable
• should be decided this year!


usage IBM 3494 (AIX)


brief excursion: future of IBM 3494?

2 extreme options (and more in between):
• no further investment
  - use as long as possible
  - in a few years: move data to other robot
• upgrade tape drives and connect to SAN
  - 3590 (~30 GB, 14 MB/s) -> 3592 (300 GB, 40 MB/s)
  - new media: => 700 TByte capacity
  - access with available data movers via SAN
  - new fail-safe TSM server (Linux?)
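
The ~700 TByte figure is consistent with filling roughly the present number of slots (ca. 2300 volumes, from the hardware view) with 300 GB cartridges. That the slot count stays the same is an assumption, not something the slide states. A short Python check of this arithmetic, with illustrative function names and decimal units:

```python
def library_capacity_tbyte(volumes, gbyte_per_volume):
    """Total capacity in TByte (decimal units: 1 TByte = 1000 GByte)."""
    return volumes * gbyte_per_volume / 1000


def full_volume_read_minutes(gbyte_per_volume, mbyte_per_s):
    """Minutes to stream one full cartridge at the nominal drive rate."""
    return gbyte_per_volume * 1000 / mbyte_per_s / 60


# Assumption: about 2300 slots, as listed for the IBM 3494.
capacity_3592 = library_capacity_tbyte(2300, 300)
minutes_3592 = full_volume_read_minutes(300, 40)

print(f"3592 media: {capacity_3592:.0f} TByte in 2300 slots")
print(f"3592 drive: {minutes_3592:.0f} min per full cartridge")
```

The result, 690 TByte, rounds to the quoted value.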


current plans: load balancing

• acquire current info on no. of read/write processes
  for each disk, data mover, pool
• new write request:
  select resource with lowest load
• new read request:
  avoid 'hot spots'
  -> create additional instances of stage file
• new option '-randomize' for stage/retrieve
  - distribute equally to different data movers / disks
  - split into n (parallel) jobs
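
The two selection rules above can be sketched in a few lines of Python. This is a minimal illustration, not the gstore implementation: the load metric (number of active read/write processes), the data mover names, and the way '-randomize' splits a file list are all assumptions.

```python
import random


def select_least_loaded(load_by_resource):
    """Return the resource with the fewest active read/write processes."""
    return min(load_by_resource, key=load_by_resource.get)


def split_randomized(files, n_jobs, seed=None):
    """Shuffle the files and deal them round-robin into n_jobs lists."""
    rng = random.Random(seed)
    shuffled = list(files)
    rng.shuffle(shuffled)
    return [shuffled[i::n_jobs] for i in range(n_jobs)]


# Hypothetical data movers with their current process counts.
loads = {"dm01": 4, "dm02": 1, "dm03": 3}
target = select_least_loaded(loads)
loads[target] += 1  # account for the newly assigned write request

# Hypothetical stage request of 10 files, split into 3 parallel jobs.
files = [f"file{i:02d}" for i in range(10)]
jobs = split_randomized(files, n_jobs=3, seed=1)

print(target)
print([len(job) for job in jobs])
```

Dealing a shuffled list round-robin keeps the job sizes within one file of each other, which matches the goal of distributing requests equally.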


current plans: new org. of DMs

• Linux platform

more familiar environment (shell scripts, Unix commands, ...)

case sensitive file names

current mainstream OS for experiment DV

• '2nd level' data movers

no SAN connection

disks filled via ('1st level') DMs with SAN connection

for stage pools with guaranteed life time of files


current plans: new org. of DMs

• integration of selected group file servers
  as '2nd level' data movers

disk space (logically) reserved for owners

pool policy according to owners

many advantages:

no NFS => much faster I/O

files physically distributed over several servers

load balancing of gstore

disk cleaning

disadvantages:

only for exp. data, access via gstore interface


current plans: user interface

• a large number of user requests:
  - longer file names
  - option to rename files
  - more specific return codes
  - ...
• program code consolidation
• improved error recovery after HW failures
• support for successor of AliEn
• GRID support
  - gstore as Storage Element (SE)
  - Storage Resource Manager (SRM)
    -> new functionalities, e.g. reserve resources


Conclusions

• GSI concept for mass storage successfully verified
• hardware and platform independent
• scalable in capacity and bandwidth to keep up with
  - requirements of future batch farm(s)
  - data rates of future experiments
• gstore able to manage very different usage profiles
• but still a lot of work ...
  to fully realize all discussed plans