13
Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

Embed Size (px)

Citation preview

Page 1: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

Channel Archiver Stats & Problems

Kay Kasemir, Greg Lawson, Jeff Patton

Presented by Xiaosong Geng (ORNL/SNS)

March 2008

Page 2: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

2

Channel Archiver

1997 design goal: Write 10,000 samples/sec Raw 'write' test on today's computers:

>50,000 samples/sec

Used by several EPICS sites worldwide

Page 3: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

3

SNS Stats

Need only ~2,000 samples/secon average.

Configuration for ~80,000 channels split into ~70 sub-archives … so that maintenance or problems on

one sub-archive won't affect others Means users have to retrieve from

correct sub-archive CSS DataBrowser makes that a bit easier

(next talk)

Disk space: ~170 GB/month Keep running into disk space limitations

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

Page 4: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

4

Data Management Limitations

Difficult and time consuming Moving data around requires manual index updates

(takes hours, could be days)

Few Informational Tools Nothing prevents duplication Which channels contribute the most to data growth?

Storage only supports "Append new samples" Removal of selected channels impossible Removal of older data limited to complete 'sub archives' No practical way to use Java or Matlab code to replace original

samples with reduced sample count, .. or to insert computed data like daily statistics into "archive"

Page 5: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

5

Question

Continue to invest in Channel Archiver's data management tools? End up implementing a relational database? Why not use existing RDB?

Page 6: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

6

JLab: MySQL Transition (Chris Slominski)

Very promising performance tests!

Limitations by design Stores every update from IOC. No 'sampling'. 'Double' stored as 'float' to save space. Only small arrays. Metadata: Units. No limits, precision. No status/severity.

MySQL Issues Table size is limited

Need one table per channel Table count is limited

Custom code implements 'clustering' SQL "DELETE" doesn't free disk space, or is very slow

(delete ONE week data --> take TWO weeks)

Page 7: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

7

SNS Oracle Tests Basic JDBC test code:

up to 8,000 inserts/second via network

Tricks "Batching" ~500 inserts "Partitioning" spreads one big "sample" table over

disk partitions Currently one partition each day, automatically

added. Details are still being developed.

Expensive, but looks like the way to go Avoid MySQL workarounds SNS committed to Oracle anyway,

but will need partitioning license

QuickTime™ and aTIFF (Uncompressed) decompressorare needed to see this picture.

Page 8: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

8

Great! But what about SLAC?

Reported promising performance results for Oracle-based data storage Lee Ann Yasukawa, Robert Hall:

"Archiving Into Oracle", ICALEPCS2001

End of 2004: No more. What do we need to learn from that?

Page 9: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

9

Basic Sample Table Design

What data types to support? Time stamp detail, enumerated values, arrays, meta

data?

One table per channel (JLab)?

One table per data type (SLAC)?

SNS:One table for all scalar samples Possibly wasting space, but best to

use SQL across various channelsof different types

Array elements in additional table

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.

Page 10: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

10

Archive Engine Prototype Developed in Java

Eclipse/CSS command-line app

Reads existing engine config files "Scanned" and "monitored" sampling as before "Disabling"/"enabling" of groups of channels Scalars, arrays, enumerated samples

Web-server interface for status which is similar to the original engine

Writes into Oracle

Write performance OK for scalar tests Reaches the ~8000 samples/sec from original tests

Page 11: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

11

Engine's Web Server

Page 12: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

12

Archive Engine Prototype… JNI vs. CAJ

Only JNI dependably handles many Channel Access connections (>10,000)

JCA/JNI Performance improved on Mar. 11, use “yield” call not “sleep” 10 msec, approximately factor of 30 improvement(from 45 seconds down to 1.5 seconds for 1460 non-blocking connection requests on SNS physics server)

Can run multiple engines, but RDB enforces that they handle different channels

Arrays are slow One N-element array comparable to N scalars Only "Double" type array elements

Need to implement meta data handling (units, limits, …)

Plan is to support MySQL, but "you get what you pay for" No custom code to handle table size/count limitations MySQL "Timestamp" only handles seconds, no nanoseconds

Page 13: Channel Archiver Stats & Problems Kay Kasemir, Greg Lawson, Jeff Patton Presented by Xiaosong Geng (ORNL/SNS) March 2008

OAK RIDGE NATIONAL LABORATORYU. S. DEPARTMENT OF ENERGY

13

Summary

Testing Oracle setups and prototype sampling engine Oracle space currently only ~30GB

Data retrieval via CSS DataBrowser RDB is just another data source

Compared to current SNS ChannelArchiver setup, performance expected to be A little less … but sustainable in the long run.