
CNAP 22nd March 2004: Summary of Atlas Petabyte Data Store User Group Meeting, March 4th 2004


Summary of recent developments

• LHC, PP community and hardware upgrade, and media migration (Tim Folkes)

• SRB interface (Bonny Strong)
• SE interface for GRIDPP (Jens Jensen)
• Belt and braces:
  • Improved environmental monitoring
  • Disaster recovery: new off-site back-up service

• OAIS, the RLG and trusted digital repositories (David Giaretta)


9940B connections

[Diagram: four RS6000 hosts, each with interfaces fsc0 and fsc1, connected through Switch_1 and Switch_2 (ports 11-15) to eight 9940B drives (numbered 1-8; devices rmt1-rmt4 and rmt5-8) in the STK 9310 “Powder Horn” library. The hosts are on a Gbit network; four boxes are labelled 1.2TB.]


SRB Example: CMS

• Largest project using CCLRC SRB services at present is the CERN CMS experiment.

• SRB chosen for Pre-Challenge Production in 2003, producing data for Data Challenge 2004.

• An ADS driver for SRB was developed to meet CMS's immediate needs.

• SRB server installed for CMS which interfaces to ADS.


Future Plans for SRB to ADS

• The SRB driver developed for CMS will be expanded for use by other projects.

• ADS will run an SRB server for integration into any SRB domain.

• Will translate the SRB user name and/or domain name into an ADS owner name.

• Will use the pathtape server to map SRB collection names to ADS 6-character tape names.
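The two translation steps listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function `ads_owner`, the class `PathTapeSketch`, the `T00001`-style tape names and the `domain.user` owner rule are all hypothetical, and the real pathtape server and SRB driver may work quite differently.

```python
from dataclasses import dataclass, field


def ads_owner(srb_user: str, srb_domain: str) -> str:
    """Hypothetical rule: derive an ADS owner name from SRB user and domain."""
    return f"{srb_domain}.{srb_user}".lower()


@dataclass
class PathTapeSketch:
    """Toy stand-in for a pathtape-style lookup (not the real service)."""
    _by_path: dict = field(default_factory=dict)
    _next_id: int = 1

    def tape_for(self, collection: str) -> str:
        """Return the 6-character tape name for an SRB collection path,
        allocating a new name the first time the path is seen."""
        if collection not in self._by_path:
            if self._next_id > 99999:
                raise RuntimeError("toy name space exhausted")
            # Assumed naming scheme: 'T' plus a 5-digit counter = 6 characters.
            self._by_path[collection] = f"T{self._next_id:05d}"
            self._next_id += 1
        return self._by_path[collection]

    def collection_for(self, tape: str) -> str:
        """Reverse lookup: which collection does a tape name stand for?"""
        for path, name in self._by_path.items():
            if name == tape:
                return path
        raise KeyError(tape)
```

The point of the sketch is that the long SRB collection name is never stored in the ADS itself; only the short name is, so the lookup table has to be kept safe alongside the data.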


APS Recent New Users & Potential New Users

• Recent New Users
  • National Crystallography Service, Southampton University (~2TB/yr?)
  • WASP (30TB/yr?)
  • VIRGO Consortium (3TB/yr?)
• Potential New Users
  • Integrative Biology (15TB/yr?)
  • Diamond? (1-3PB/yr?)
  • BBSRC (BITS)? (10-20TB/yr?)
  • Arts and Humanities Data Service? (2TB/yr)
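Taking the tentative figures above at face value, the extra yearly intake can be totalled. This is a small sketch only: the numbers are the slide's own question-marked estimates, 1 PB is assumed to be 1000 TB, and the variable and function names are illustrative.

```python
# (low, high) estimates in TB/yr, copied from the slide; all are tentative.
recent = {
    "National Crystallography Service": (2, 2),
    "WASP": (30, 30),
    "VIRGO Consortium": (3, 3),
}
potential = {
    "Integrative Biology": (15, 15),
    "Diamond": (1000, 3000),  # 1-3 PB/yr, assuming 1 PB = 1000 TB
    "BBSRC (BITS)": (10, 20),
    "Arts and Humanities Data Service": (2, 2),
}


def total(estimates):
    """Sum the low and high ends of each range separately."""
    low = sum(lo for lo, _ in estimates.values())
    high = sum(hi for _, hi in estimates.values())
    return low, high


print("recent users:   ", total(recent))     # (35, 35)
print("potential users:", total(potential))  # (1027, 3037)
```

On these figures Diamond alone would dominate: without it the potential new users add about 27-37 TB/yr, similar to the recent new users.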


Actual Growth 1997-2003

[Chart: data volume (GB), axis from -20000 to 100000, plotted quarterly from Jun-97 to Dec-03. Series: Cumulative Data Volume (GB) and Actual Growth (GB).]


Questionnaire responses

62% from CCLRC; 38% external

75% currently using ADS; 25% not currently using or not users.

Average years of use 7.4

Max years of use 20.0

Min years of use 0.8

SD years of use 6.6

Some role descriptions of those responding: “Sys admin”, “Data Analysis and data provision”, “Experiment coordinator”, “Archiver”, “User”, “Project Data Storage Manager”, “Responsible for project back-ups”, “Project Manager”.


Questionnaire – Motivation and assessment

“Convenient”, “Easy”, “Reliable”, “Support available”, “Secure”, “Long term back up”, “Large volume”

“No need to get involved with tape storage”, “No perceived alternative”

Mean Score (out of 10) 8.2

Min 5.0

Max 10.0

SD 1.8


Questionnaire – Web page usage

Web page usage %

Never 21

Rarely 14

Occasional 57

Often 7


Questionnaire – Communication & Awareness

Preferences for improved methods of communication

                               % For   % Against   % Maybe
Need for list server             71       29          0
Need for user group meeting      57       29         14

User awareness of recent developments

Awareness of         Aware (%)   Not aware (%)
Hardware upgrade        79           21
SE interface            29           71
SRB interface           50           50


Improvements or changes required to the service (1)

• Backup service available on a wide range of platforms, i.e. Windows PC etc.
• Require SRM interface
• Need to store data sets with long names (i.e. > 6 chars) - and better than pathtape look-up is required
• Native support for full path names (i.e. not having to use the pathtape service). Tiny tape names
• Use email more for known downtimes etc.
• Ability to store large files (> 2Gb)


Improvements or changes required to the service (2)

• More online storage / caching (depending on future requirements)
• Web / Grid interface
• User-queryable database of usage statistics, e.g. to find out my top-100 datasets, or to see how many times this year / month / etc. a particular item has been accessed. Having this as a database that I can query using JDBC from my own management applications would be even better than static reports.
• Metadata lookups: it would be useful to check the file size directly from flfsys
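To make the usage-statistics request concrete, the two queries the user describes might look something like this. It is a sketch under loud assumptions: the `access_log` table, its columns and the sample rows are invented for illustration, and Python's built-in `sqlite3` stands in for the JDBC access the user asked for; nothing here reflects an actual ADS schema.

```python
import sqlite3


def top_datasets(conn, limit=100):
    """Most frequently accessed datasets, busiest first."""
    return conn.execute(
        "SELECT dataset, COUNT(*) AS n FROM access_log "
        "GROUP BY dataset ORDER BY n DESC, dataset LIMIT ?",
        (limit,),
    ).fetchall()


def accesses_in_year(conn, dataset, year):
    """How many times one dataset was accessed in a given year."""
    row = conn.execute(
        "SELECT COUNT(*) FROM access_log "
        "WHERE dataset = ? AND strftime('%Y', accessed_on) = ?",
        (dataset, str(year)),
    ).fetchone()
    return row[0]


# Hypothetical schema and made-up sample rows, held in memory only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_log (dataset TEXT, accessed_on TEXT)")
conn.executemany(
    "INSERT INTO access_log VALUES (?, ?)",
    [("T00001", "2003-11-02"), ("T00001", "2004-01-15"),
     ("T00001", "2004-02-20"), ("T00002", "2004-02-21")],
)
print(top_datasets(conn, limit=2))
print(accesses_in_year(conn, "T00001", 2004))
```

A queryable table of this kind would let users build their own reports rather than wait for static ones, which is the advantage the respondent points to.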


Improvements or changes required to the service (3)

• Transparent file access (HSM) so that we could forget about (virtual) tapes
• Fix the problem between Solaris and the ADS software regarding multiple files on ADS datasets
• Provide a backup and archive interface for NT servers.
• Really good tape changer driver mapped into Windows server 2003. (More support required)
• Quicker access to off line tapes to improve speed of restores.
• More documentation.
• More user-friendly commands for such things as rules
• Price control.


Ranked User issues

Question / User-specified issue / Mean response (A-K)

3 Need to store data sets with long names (i.e. > 6 chars) - and better than pathtape look-up is required 7.9

4 Native support for full path names (i.e. not having to use the pathtape service). Tiny tape names 7.7

6 Ability to store large files (> 2Gb) 7.3

18 Price control. 6.5

8 Web / Grid interface 6.4

5 Use email more for known downtimes etc. 6.2

16 More documentation. 6.1

7 More online storage / caching (depending on future requirements) 5.6

17 More user-friendly commands for such things as rules 5.6

1 Backup service available on wide platform, i.e. Windows PC etc. 5.4

15 Quicker access to off line tapes to improve speed of restores. 5.3

9 User-queryable database of usage statistics 5.0


Conclusions (1)

1. Responses have been received mainly from technical, hands-on users, with a good balance between users within CCLRC and external users.

2. The majority of responses have been received from people who are currently using the Data Store. Most have many years of experience of using it.

3. The responses received represent approximately 20% of the active users. (Total number of active users = 84)

Given 1, 2 and 3 above, the responses received are from a knowledgeable section of experienced users, both internal and external to CCLRC, who comprise a representative proportion of all current active users. On this basis the responses can be trusted and used with confidence.


Conclusions (2)

Most users understand the advantages of the ADS, i.e. they know what they want.

Overall, most users get what they want from the service (8.2/10). We now have a measure from which to improve.

Some of the improvements identified by the users have already been, or are now being, addressed. For those that are not, further clarification is required in order to understand how important the issue is to other users, and to define the problem well enough to consider appropriate solutions. What mechanisms could be used to achieve this?

Most users were aware of the recent hardware upgrade, although a surprisingly high proportion of users (21%) were not. Most users were unaware of the SE interface, and only half were aware of the SRB interface. This matters because there are improved services coming on line from the development team, which some users may wish to take advantage of.


Conclusions (3)

Most users (64%) use the web page at least occasionally, whereas 35% use it rarely or never.

Communication between users and the development team needs to be improved. Given that most users make at least occasional use of the web pages, the simplest and most effective means of doing so is to keep the web site up-to-date with current developments. However, this will not be successful for around one third of users.

Almost 80% of users are in favour of an email list server service. The combination of this with an improved web site should be adequate.

Almost 60% of users are in favour of User Group meetings. These should be continued, probably yearly.


Backups


Digital Curation Centre (DCC)

• Joint collaboration between CCLRC, UKOLN, and Edinburgh and Glasgow Universities.

• Provide advice, support, research and development into aspects of Digital Curation for the UK HE community

• Funded jointly by JISC and EPSRC: £1m/year for three years initially (Feb 2004 - 2007)

• Establish collaboration with industrial partners…


ADS Running Costs 04/05. (Option 1).

H/W maintenance 11%

S/W maintenance 3%

Hardware 15%

Network 0%

Other 5%

Staff costs 66%


3590/9940 Drive connections (old)

[Diagram: STK 9310 library (~6000 slots) with four 3590 drives and four 9940 drives, attached to four RS6000 hosts labelled 54G, 216G, 108G and 108G, on a 100Mbit network.]


Real drive performance

[Chart: real drive performance, plotted fortnightly from 01/01/2003 to 28/01/2004. Left axis 0 to 20000; right axis 0 to 30 Mbytes/sec. Series: Mounts, Reads, Writes, Real mounts, MB/Sec. An annotation marks the upgrade.]