ICAT Overview Tom Griffin, ISIS Facility ICAT Developer Workshop The Cosener’s House, Abingdon August 2009 tom.griffin@stfc.ac.uk


Page 1: ICAT Overview Tom Griffin, ISIS Facility ICAT Developer Workshop The Cosener’s House, Abingdon August 2009 tom.griffin@stfc.ac.uk


Page 2: The Problem(s)

• Large Data Volumes
• High Throughput
• Proliferation of data formats
• Multiple Data Analysis Steps
• Increasing complexity of data
• Data Access requirements (Sharing and Restriction)
• Versioning of data formats and associated software
• Distributed Computation (accessed offline from research chain)
• Common names and units for temperature, pressure etc.
• Changing / differing metadata requirements
• International users / federation of data from facilities
• Relating to Proposals and Publications
• Ontologies
• Provenance (Creation, Ownership, History)
• Governments want return on investment

Page 3: What is ICAT?

ICAT is a database (with a well-defined API) that provides a uniform interface to experimental data and a mechanism to link all aspects of research from proposal through to publication.

• Access data anywhere via the web
• Annotate your data
• Search for data in a meaningful way
– e.g. taxonomy, sample, temperature, pressure etc.
• Share data with colleagues
• Access data via your own programs
– (C++, Fortran, Java etc.) via the ICAT API
• Identify potential collaborations
• Utilise integrated e-Science High-Performance Computing and Visualisation resources
• Link to data from your publications
• Etc.

Proposals

Once you are awarded beamtime at ISIS, an entry is created in ICAT that describes your proposed experiment.

Experiment

Data collected from your experiment will be indexed by ICAT (with additional experimental conditions) and made available to your experimental team.

Analysed Data

You will be able to upload any analysed data and associate it with your experiments.

Publication

Using ICAT you will also be able to associate publications with your experiment and even reference data from your publications.

[Figure labels: Example ISIS Proposal; B-lactoglobulin protein interfacial structure; GEM – High intensity, high resolution neutron diffractometer; H2-(zeolite) vibrational frequencies vs polarising potential of cations]

Page 4: Overview

[Architecture diagram. Labels: RDBMS; ICAT API; Web Services API; Command Line Tools; Glassfish / JBOSS; Java, C++, Fortran; Data Storage / Delivery System; Single Sign On; User Database System; Proposal System; Publication System; e-Science Services; Software Repository]

Page 5: Federation

[Diagram: the ICAT stack repeated three times, with facility labels ISIS, SNS and ANSTO. Each stack shows: RDBMS; ICAT API; Web Services API; Data Storage / Delivery System; Single Sign On; User Database System; Proposal System; Publication System; e-Science Services; Software Repository. Further labels: Data Portal; TopCat]

Page 6: Data Model

[Entity diagram. Entities and their fields:]

• Investigation: Reference / Proposal Id, Previous Reference, Facility, Instrument, Title, Abstract, etc.
• Investigator: User Id, Role
• Publication: Full Reference, URL, Repository
• Keyword: Name
• Topic: Name, Parent Id, Topic Level
• Sample: Name, Chemical Formula, Safety Information
• Sample Parameter: Name, Units, String Value, Numeric Value, Range Top, Range Bottom, Error
• Dataset: Name, Sample Id, Description
• Dataset Parameter: Name, Units, String Value, Numeric Value, Range Top, Range Bottom, Error
• Datafile: Name, Description, Version, Location, Format, Format Version, Create Time, Modify Time, Size, Checksum
• Datafile Parameter: Name, Units, String Value, Numeric Value, Range Top, Range Bottom, Error
• Related Datafile: Source Datafile Id, Destination Datafile Id, Relation, S/W Application, S/W Version
• Parameter: Name/Units/Value etc., Searchable, Is Sample Parameter, Is Dataset Parameter, Is Datafile Parameter, Verified
• Authorisation: User Id, Role (e.g. Admin, Deleter, Updater, Reader, Creator, Downloader etc.), Element Type, Element Id
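The hierarchy on this slide (an Investigation holds Datasets, which hold Datafiles, each carrying Parameters) can be sketched as plain Python dataclasses. This is only an illustration under stated assumptions: the class layout, the attribute names and the `find_datafiles` helper are hypothetical and are not the ICAT schema or API. The example values are taken from the data model example slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    """A named value attached to a sample, dataset or datafile."""
    name: str
    units: str
    string_value: Optional[str] = None
    numeric_value: Optional[float] = None


@dataclass
class Datafile:
    name: str
    file_format: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    datafiles: List[Datafile] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    facility: str
    instrument: str
    keywords: List[str] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)


def find_datafiles(inv: Investigation, param_name: str, minimum: float) -> List[str]:
    """Hypothetical helper: names of datafiles whose numeric parameter
    `param_name` is at least `minimum`."""
    return [
        df.name
        for ds in inv.datasets
        for df in ds.datafiles
        for p in df.parameters
        if p.name == param_name
        and p.numeric_value is not None
        and p.numeric_value >= minimum
    ]


# Example values from the deck's data model example.
example = Investigation(
    title="SiMnSi2 100mev 8s 300k in CCR 45x45mm",
    facility="ISIS",
    instrument="MERLIN",
    datasets=[
        Dataset(
            name="Default",
            datafiles=[
                Datafile(
                    name="MER03790.raw",
                    file_format="isis neutron raw",
                    parameters=[
                        Parameter("total_proton_charge", "uAmpHours",
                                  numeric_value=0.233844)
                    ],
                )
            ],
        )
    ],
)

print(find_datafiles(example, "total_proton_charge", 0.2))  # ['MER03790.raw']
```

Keeping parameters as generic name/units/value records, rather than fixed columns, reflects the "changing / differing metadata requirements" point raised earlier in the talk.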

Page 7

[Data model diagram populated with example values. Entity labels: Investigation; Investigator; Keyword; Sample; Sample Parameter; Dataset; Dataset Parameter; Datafile; Datafile Parameter; Parameter]

• Investigation
– Facility: ISIS
– Instrument: MERLIN
– Title: SiMnSi2 100mev 8s 300k in CCR 45x45mm
– inv_type: experiment
– Bcat_inv_string: Mark Dr A - UniversitDr A,,
• Dataset
– Name: Default
– Type: experiment_raw
– Dataset_Status: complete
– Description: MER03766 Mark Dr A SiMnSi2 100mev 8s 300k
• Datafile
– Name: MER03790.raw
– Desc: Yb0.9Y0.1InCu4 15meV 4S 40K 3Kbar CuBe cell 10x22mm
– Format: isis neutron raw
• Sample
– Name: Vanadium L2=158 (Gm=91)
• Sample Parameters
– Name: sample_state, Units: N/A, String value: powder
– Name: sample_sitution, Units: N/A, String value: CCR
• Datafile parameter
– Name: total_proton_charge, Units: uAmpHours, Value: 0.233844
• Keyword
– Name: RAL
– Name: g_large
– Name: OSIRIS
– Name: YCo3D1.3

Page 8: ICAT API

• Service Oriented Architecture
– Services exposed as Web Services
– User required to authenticate in order to obtain Session Token
– Token is used in all subsequent API calls for authorisation
• The API is modular in order to fit the needs of the facilities
– Plug in own user database
– Plug in data delivery system
• Characteristics
– Platform independent [Java]
– Application Server independent [EJB3]
– Database independent (almost!) [JPA]
– Language independent [Web Services]
• Internals
– Core functionality implemented as POJOs using JPA
– For deployment, EJB3 Session Beans bind the core API, user db and data delivery aspects together
– Services are unit tested using JUnit
– Services are logged at every interaction point using Log4J
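The session-token pattern described on this slide (authenticate once, then present the token on every later call) can be illustrated with a toy in-memory service. This is a sketch under stated assumptions, not the ICAT web service: the class `ToySessionService`, its method names and the shape of the investigation records are all hypothetical, and the plain-text password table merely stands in for a pluggable user database.

```python
import secrets
from typing import Dict, List


class AuthenticationError(Exception):
    """Raised when the username/password pair is rejected."""


class SessionError(Exception):
    """Raised when a call presents an unknown session token."""


class ToySessionService:
    """Hypothetical stand-in for a token-protected service."""

    def __init__(self, users: Dict[str, str], investigations: List[dict]) -> None:
        self._users = users                    # username -> password (toy user database)
        self._investigations = investigations  # dicts with title, keywords, readers
        self._sessions: Dict[str, str] = {}    # session token -> username

    def login(self, username: str, password: str) -> str:
        """Authenticate and hand back a fresh session token."""
        if self._users.get(username) != password:
            raise AuthenticationError("bad credentials")
        token = secrets.token_hex(16)
        self._sessions[token] = username
        return token

    def _user_for(self, session_id: str) -> str:
        """Resolve a token to a user, or fail the call."""
        try:
            return self._sessions[session_id]
        except KeyError:
            raise SessionError("unknown session") from None

    def search_by_keyword(self, session_id: str, keyword: str) -> List[str]:
        """Return titles matching the keyword that this user may read."""
        user = self._user_for(session_id)
        return [
            inv["title"]
            for inv in self._investigations
            if keyword in inv["keywords"] and user in inv["readers"]
        ]


service = ToySessionService(
    users={"alice": "secret"},
    investigations=[
        {"title": "Inv A", "keywords": {"MERLIN"}, "readers": {"alice"}},
        {"title": "Inv B", "keywords": {"MERLIN"}, "readers": {"bob"}},
    ],
)
token = service.login("alice", "secret")
print(service.search_by_keyword(token, "MERLIN"))  # ['Inv A']
```

Because authorisation is resolved from the token on each call, the same search returns only the results the caller is permitted to see.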

Page 9: ICAT API Continued

Page 10: ICAT Client

Page 11: Data Portal

Page 12: Security

• Role based permissions
– [Super]
– Admin
– Create
– Delete
– Update
– Download
– Read
• Data Policy
– 3 year embargo on data (+1 if requested)
– Commercial data is never made public
– Instrument Scientists can access all data from their beamline
– Calibration data is public
– Any data that involves IPR (e.g. analysed) is private in perpetuity unless explicitly shared by user
• SSL
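The data policy bullets on this slide can be read as a small decision function. The sketch below is one possible reading, not the ICAT implementation: the `DataRecord` fields, the order in which the rules are checked, and the choice to start the embargo at the collection date are all assumptions. Access by instrument scientists and explicit sharing with named colleagues are not modelled; the function only answers whether a record is public.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataRecord:
    """Hypothetical record carrying only the flags the policy needs."""
    collected: date
    commercial: bool = False
    calibration: bool = False
    involves_ipr: bool = False
    embargo_extended: bool = False  # the "+1 if requested" year


def embargo_end(collected: date, years: int) -> date:
    """Date on which the embargo lapses (assumed to run from collection)."""
    try:
        return collected.replace(year=collected.year + years)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return collected.replace(year=collected.year + years, day=28)


def is_public(record: DataRecord, today: date) -> bool:
    """Apply the policy bullets; the rule precedence is an assumption."""
    if record.commercial:
        return False  # commercial data is never made public
    if record.calibration:
        return True   # calibration data is public
    if record.involves_ipr:
        return False  # private unless shared; sharing is not modelled here
    years = 4 if record.embargo_extended else 3
    return today >= embargo_end(record.collected, years)


raw = DataRecord(collected=date(2006, 8, 1))
print(is_public(raw, date(2009, 7, 31)))  # False
print(is_public(raw, date(2009, 8, 1)))   # True
```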

Page 13: Installation / Development

Development

Technologies Used
– Java
– NetBeans 6.1
– Glassfish UR2
– Ant
– JUnit
– JMeter
– Log4J
– EJB3
– JPA
– JAX-WS
– JAXB
– Oracle (10G / 11G)
– Subversion

Installation
» Any O/S
» Oracle 10G/11G
» Java 6 Update 6
» Apache Ant v1.7+
» Glassfish v2 UR2
» Installed & Configured Cog Kit
» Unzip download bundle
» Update properties files, e.g. database details
» Run Ant commands

Page 14: User Database

Page 15: Data Delivery

[Sequence diagram. Participants: user; Data Portal; ICAT API; Data.ISIS]

1. User performs search via application, e.g. Data Portal
2. Search is executed in ICAT
3. Permitted results are returned to application
4. Results are displayed to the user
5. User performs request to download datafile, multiple datafiles or dataset
6. ICAT creates http GET link and passes it back to the user (routed through application)
– sessionId
– email (optional)
– fileId(s) or datasetId
– action (i.e. download, zip, compressed)
7. User clicks http link
8. Data.ISIS calls ICAT API to check permissions
– sessionId & datafileId(s) or datasetId
9. Return Exception on failure or DownloadObject on success
– userId
– array [filename, cycle, run number]
10. User gets their data!
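Steps 6 to 9 of the flow on this slide can be sketched as two small functions: one builds the GET link, the other plays the role of the delivery service that parses the link and asks for a permission check. Everything concrete here is an assumption made for illustration: the host `data.example.org`, the function names, the exact query parameter spelling, and the dictionary that stands in for the DownloadObject.

```python
from typing import Callable, Iterable, List, Optional
from urllib.parse import parse_qs, urlencode, urlparse

DOWNLOAD_BASE = "http://data.example.org/download"  # hypothetical host


def build_download_link(session_id: str, file_ids: Iterable[int],
                        action: str = "download",
                        email: Optional[str] = None) -> str:
    """Step 6: encode the request as an http GET link."""
    params = [("sessionId", session_id), ("action", action)]
    if email:
        params.append(("email", email))
    params.extend(("fileId", str(file_id)) for file_id in file_ids)
    return DOWNLOAD_BASE + "?" + urlencode(params)


def handle_download(link: str,
                    check_permissions: Callable[[str, List[int]], dict]) -> dict:
    """Steps 7 to 9: parse the clicked link and delegate the permission check.

    `check_permissions` raises on failure or returns a download description.
    """
    query = parse_qs(urlparse(link).query)
    session_id = query["sessionId"][0]
    file_ids = [int(value) for value in query.get("fileId", [])]
    return check_permissions(session_id, file_ids)


def toy_check_permissions(session_id: str, file_ids: List[int]) -> dict:
    """Toy permission table: session 'abc' may download files 1 and 2."""
    allowed = {"abc": {1, 2}}
    if not set(file_ids) <= allowed.get(session_id, set()):
        raise PermissionError("not permitted")
    return {"userId": "user-for-" + session_id, "files": sorted(file_ids)}


link = build_download_link("abc", [1, 2])
print(link)
# http://data.example.org/download?sessionId=abc&action=download&fileId=1&fileId=2
print(handle_download(link, toy_check_permissions))
# {'userId': 'user-for-abc', 'files': [1, 2]}
```

The delivery service never decides access itself; it only forwards the session and file identifiers, which keeps the permission logic in one place.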

Page 16: Data Delivery Continued

Page 17: XML Ingest

[Diagram labels: Client; XMLIngest(xml); XSD; Validation; InvestigationId. The ICAT stack is also shown: RDBMS; ICAT API; Web Services API; Data Storage / Delivery System; Single Sign On; User Database System; Proposal System; Publication System; e-Science Services; Software Repository]
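The labels on this slide suggest that a client submits XML, the XML is validated, and an investigation id comes back. A rough sketch of that shape follows. It is not the ICAT ingest code, and it does not perform XSD validation: the Python standard library has no XSD validator, so a simple required-element check stands in for it. The element names, the `ToyIngestor` class and the id numbering are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical required child elements; the real XSD is not reproduced here.
REQUIRED_ELEMENTS = ("title", "facility", "instrument")


class ValidationError(Exception):
    """Raised when the submitted XML is rejected."""


class ToyIngestor:
    """Stores validated investigations in memory and hands out ids."""

    def __init__(self) -> None:
        self._store: Dict[int, Dict[str, str]] = {}

    def ingest(self, xml_text: str) -> int:
        """Validate the document, store it, and return its investigation id."""
        try:
            root = ET.fromstring(xml_text)
        except ET.ParseError as exc:
            raise ValidationError(f"not well-formed XML: {exc}") from exc
        if root.tag != "investigation":
            raise ValidationError("root element must be <investigation>")
        missing = [tag for tag in REQUIRED_ELEMENTS
                   if root.findtext(tag) in (None, "")]
        if missing:
            raise ValidationError("missing elements: " + ", ".join(missing))
        investigation_id = len(self._store) + 1
        self._store[investigation_id] = {
            tag: root.findtext(tag) for tag in REQUIRED_ELEMENTS
        }
        return investigation_id


ingestor = ToyIngestor()
good_xml = (
    "<investigation>"
    "<title>SiMnSi2 100mev 8s 300k in CCR 45x45mm</title>"
    "<facility>ISIS</facility>"
    "<instrument>MERLIN</instrument>"
    "</investigation>"
)
print(ingestor.ingest(good_xml))  # 1
```

Rejecting a document before anything is stored means a failed ingest leaves the catalogue unchanged.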

Page 18: ISIS Integration

[Diagram labels: Business System; Proposal System; Instrument Control; Data Storage; ICAT; Trigger]

• NXIngest
• RawIngest

Page 19: Developers

Page 20: Future Developments

• Design and develop new interface
• Release TopCat to ISIS users
• Move XML Ingest into asynchronous Message Driven Bean
• Rule-based policy implementation
• Expand and improve the supplied interface
• Proposal System integration
• Publication System integration
• Database independent
• Derived and simulated data upload
• Consequence…
• Look at issue/tickets & forum!

Page 21: Summary

• At ISIS
– Volume of data ~4TB
– ~3M datafiles (22 instruments, 330/hour)
– 6.7GB metadata, 33M rows
– 550+ unit & stress tests
• Attempt to solve problems as outlined earlier in this talk
• Software characteristics
– Scalability
– Maintainability
– Reliability
– Availability
– Extensibility
– Performance
– Manageability
– Security
• We want to drive this forward
• We would like to do it in collaboration with other facilities

Page 22: Acknowledgements

• ISIS
– Damian Flannery, Robert McGreevy, Kenneth Shankland, Stuart Ansell
– Freddie Akeroyd, Chris Moreton-Smith, Matt Clarke, Kevin Knowles, Steven King, Adrian Hillier, Alex Hannon, Rob Dalgleish
• e-Science
– Glen Drinkwater, Shoaib Sufi, Kerstin Kleese Van Dam, Laurent Lerusse, Rik Tyer, Phil Couch
– Gordon Brown, Kier Hawker, Carmine Coiffe
– Roger Downing

Page 23: Questions

http://code.google.com/p/icatproject