Summary of Metadata Workshop
Peter Hristov
28 February 2005
ALICE Computing Day
Metadata: Definition
Traditionally, metadata has been understood as "data about data".
Examples:
A library catalogue contains information (metadata) about publications (data).
A file system maintains permissions (metadata) about files (data).
More definitions: try Google.
General Applications of Metadata (Web)
Cataloguing (item and collection level)
Resource discovery
Electronic commerce
Intelligent software agents
Digital signatures
Content rating
Intellectual property rights
Privacy preferences & policies
Statement of the Problem
A user wants to process "some events".
He/she needs to know where they are:
Which file
Where in the file
Sounds simple!
Tools at Hand
In Grid file catalogues, files may have metadata that identifies them. However, we are not sure at this moment what the Grid catalogue will look like.
Given a ROOT file, a TRef allows us to "jump" directly to a given object.
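A minimal ROOT sketch of that "jump": a TRef remembers a reference to an object and resolves it back later. The refDemo function and TNamed payload are illustrative stand-ins, not AliRoot classes.

```cpp
// Minimal TRef sketch: store a reference to an object, then
// resolve it back later via GetObject(). Run as a ROOT macro.
#include "TRef.h"
#include "TNamed.h"
#include <cstdio>

void refDemo()
{
    TNamed *obj = new TNamed("track42", "a reconstructed track");
    TRef ref = obj;                       // remember the object

    // Later (e.g. in another processing step, or after reading the
    // referring object back from a file) the TRef resolves directly:
    TNamed *same = static_cast<TNamed*>(ref.GetObject());
    printf("resolved: %s\n", same->GetName());
}
```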
General Idea on the Grid
[Diagram: three catalogue levels, each entry carrying metadata (MD) fields and a <guid>]
TAG database (event catalogue): MD fields + keys -> <guid>
File catalogue: MD fields + LFN + SE -> <guid>
Local catalogue: MD fields + PFN -> <guid>
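To make the chain concrete, here is a hedged sketch of the three-level lookup with plain std::map stand-ins for the catalogues; none of the names or paths are AliEn/gLite API.

```cpp
// Hedged sketch of the catalogue chain in the diagram: an event
// query yields a GUID, the file catalogue maps the GUID to a
// logical file name (LFN) and storage element (SE), and the local
// catalogue maps it to a physical file name (PFN).
#include <map>
#include <string>
#include <iostream>

struct FileLocation { std::string lfn, se; };

int main()
{
    std::map<std::string, std::string> tagCatalogue   // selection key -> GUID
        = { {"run1234/event42", "guid-0001"} };
    std::map<std::string, FileLocation> fileCatalogue // GUID -> LFN + SE
        = { {"guid-0001", {"/alice/run1234/raw_000.root", "CERN-Castor"}} };
    std::map<std::string, std::string> localCatalogue // GUID -> PFN
        = { {"guid-0001", "/castor/cern.ch/alice/raw_000.root"} };

    const std::string guid = tagCatalogue["run1234/event42"];
    std::cout << "LFN: " << fileCatalogue[guid].lfn
              << " @ "   << fileCatalogue[guid].se  << "\n"
              << "PFN: " << localCatalogue[guid]    << "\n";
}
```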
Main Questions
What happens if we are not on the Grid? The system should not be too different.
What do we put in each catalogue?
How much do we depend on the file catalogue being exposed to us?
Which information do we put in the TAG database?
What are its dimensions and its efficiency?
Event Model
RAW data: written once, read (not too) many times. Size: 1 – 50 MB per event; only one per event.
ESD: written (not too) many times, read many times. Size: ~1/10 of RAW per event; only one per event.
AOD: written many times, read many times. Size: ~1/10 of ESD per event; many (~10) per event.
Tag: written (not too) many times, read many times. Size: 100 B – 1 kB per event; many per event. Tags exist for fast event data selection.
They are not directly meant for analysis, histogram production etc., though if the information happens to be there you may use it. For discussion.
Global experiment tags. Physics working group tags. User-defined tags.
Terminology
First, define common terms:
●Metadata: key-value pairs. Any data necessary to work on the Grid that does not live in the files.
●Entry: entity to which metadata is attached. Denoted by a string formatted like a Unix file path; wild-cards allowed.
●Collection: set of entries. Collections are themselves entries; think of directories.
●Attribute: name or key of a piece of metadata. Alphanumerical string starting with a letter.
●Value: value of an entry's attribute. Printable ASCII string.
●Schema: set of attributes of an entry. Classifies types of entries; a collection's schema is inherited by its entries.
●Storage type: how the back end stores a value. The back end may have different (SQL) datatypes than the application.
●Transfer type: how values are transferred. Values are transported as printable ASCII.
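As a rough illustration of these terms, a sketch with plain C++ types; the shapes are illustrative only, not the ARDA interface.

```cpp
// Hedged sketch of the terminology above: entries addressed by
// Unix-like paths, attributes as alphanumeric keys, values as
// printable ASCII strings, and a schema shared by the entries of
// a collection.
#include <map>
#include <set>
#include <string>
#include <vector>

using Attribute = std::string;             // e.g. "runNumber"
using Value     = std::string;             // printable ASCII

struct Entry {
    std::string path;                      // e.g. "/alice/run1234/ev42"
    std::map<Attribute, Value> metadata;   // key-value pairs
};

struct Collection {
    std::string path;                      // a collection is itself an entry
    std::set<Attribute> schema;            // inherited by its entries
    std::vector<Entry> entries;
};
```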
Event Metadata
Tag: event-building and physics-related information
Catalog information => how to retrieve the event
What else => links to other sources of information?
Tag Database
Karel Safarik
TAG Structure
Event-building information:
Allows one to find all the information about the event: the ESD and all the AODs, maybe also the RAW data (hopefully this will not be used often). ... (This is not my job.)
Physics information:
Query-able (this is what you select data on). Information about trigger, quality etc. Usually the same global physics variables, but one may also include quantities that make little physical sense but are good for selection.
TAG Size
Has to be reasonable, so that it can be queried in reasonable time: somewhere around disk size, O(100 GB).
Typical yearly number of events:
10^7 for heavy ion
10^9 for pp
The TAG size (in principle) is independent of multiplicity, but it is collision-system dependent, trigger dependent...
For heavy ion: a few kB per event gives a few tens of GB (a few kB x 10^7 events).
For pp: 100 B per event gives 100 GB (100 B x 10^9 events).
STAR: 500 physics tag fields in 0.5 kB (on average 1 B per parameter).
TAG Content
(Only physics information)
Technical part – the same for every TAG database:
Run number, event number, bunch crossing number, time stamp
Trigger flags (an event may be triggered by more than one trigger class), information from trigger detectors
Quality information: which detectors were actually on, what was their configuration, quality of reconstruction
Physics part – partly standard, partly trigger/physics/user dependent:
Charged particle multiplicity, maximum pt, sum of the pt, maximum el-mag energy, sum of el-mag energy, number of kaons, ...
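To make the field list concrete, a hedged sketch of one TAG record; field names and widths are illustrative, not the actual AliRoot tag classes.

```cpp
// Hedged sketch of a single per-event TAG record with the fields
// listed above.
#include <cstdint>

struct EventTag {
    // Technical part - common to every TAG database
    uint32_t runNumber;
    uint32_t eventNumber;
    uint32_t bunchCrossing;
    uint64_t timeStamp;
    uint64_t triggerMask;        // an event may satisfy several trigger classes
    uint64_t detectorMask;       // which detectors were actually on
    uint16_t recoQuality;        // quality of reconstruction

    // Physics part - partly standard, partly trigger/physics/user dependent
    uint32_t chargedMultiplicity;
    float    maxPt;
    float    sumPt;
    float    maxEmEnergy;
    float    sumEmEnergy;
    uint16_t nKaons;
};  // ~60 B if packed: well within the 100 B - 1 kB per-event budget
```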
TAG Construction
Basic (experiment-wide) TAG database:
Written during reconstruction (ESD production). But does it also have to navigate to (all?) the AODs (produced later)?
There is a part which is untouchable (nobody is allowed to modify it).
There is a part which may be modified, as a result of further analysis.
From this one, all other TAG databases start.
The real content of a definite instance of the TAG database is trigger dependent, detector-configuration dependent, physics-analysis dependent.
Define the physics group TAG databases, derived from the experiment-wide database.
Maybe allow for user TAG databases, derived from the physics group databases.
Useful tag fields are then pushed up in this hierarchy.
TAG Conclusion
We have to define a prototype of the experiment-wide TAG database
Implement it in the reconstruction program
Physics working groups: define the physics group databases
Test the mechanism of inheritance from the experiment-wide TAG database
Decide whether the 'event building' information has to allow navigation to all the AODs, or just to those created within that working group
When? Who?
Metadata in CDC
Fons Rademakers
AliMDC - ROOT Objectifier
The ROOT Objectifier reads the raw data stream via shared memory from the GDC.
The Objectifier has three output streams:
Raw event file, via rootd to CASTOR
Event catalog (tag DB)
File catalog
(These live in the raw event file, in MySQL, and in AliEn.)
RAW Data DB
A raw data file contains a tree of AliEvent objects:
An AliEventHeader object: 16 data members (72 bytes)
A TObjArray of AliEquipment objects, each with:
An AliEquipmentHeader object: 7 data members (28 bytes)
An AliRawData object: char array (variable length)
A TObjArray of sub-events (also AliEvents)
No compression (would need more CPU power)
Size of individual raw DB files: around 1.5 GB
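A hedged sketch of this layout in plain ROOT: a TTree whose entries each hold header fields plus a variable-length char array. The real AliEvent/AliEquipment classes live in AliRoot; generic branches stand in for them here.

```cpp
// Sketch of a raw-DB file: uncompressed TTree with per-event
// header fields and a variable-length raw payload. Run as a ROOT
// macro; all branch names are illustrative.
#include "TFile.h"
#include "TTree.h"

void rawDbSketch()
{
    TFile out("raw_000.root", "RECREATE");
    out.SetCompressionLevel(0);           // slide: no compression (CPU cost)

    Int_t  runNumber = 0, eventNumber = 0;
    Int_t  rawSize   = 0;
    Char_t rawData[1024];                 // variable-length payload buffer

    TTree tree("RAW", "tree of raw events");
    tree.Branch("runNumber",   &runNumber,   "runNumber/I");
    tree.Branch("eventNumber", &eventNumber, "eventNumber/I");
    tree.Branch("rawSize",     &rawSize,     "rawSize/I");
    tree.Branch("rawData",     rawData,      "rawData[rawSize]/B");

    for (eventNumber = 0; eventNumber < 10; ++eventNumber) {
        rawSize = 512;                    // pretend payload
        tree.Fill();
    }
    tree.Write();
}
```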
Event Catalog - Tag DB
The tag DB contains a tree of AliEventHeader objects:
Size, type, run number, event number, event id, trigger pattern, detector pattern, type attributes
Basic physics parameters (see Karel's talk)
Compressed
Used for fast event selection
Use compressed bitmap indices (as used in the STAR grid collector)?
Do we store LFNs for the events, or do we look the LFNs up in the file catalog using run/event number? We might need more than one LFN per event (RAW, RECO, ESD, AOD), or do we use naming conventions?
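A hedged sketch of "fast event selection" on such a tag tree in plain ROOT: run a cut over the (small) tag tree and collect the entry numbers of matching events. The file, tree, and branch names are assumptions.

```cpp
// Select events by querying the tag tree; ">>elist" fills a
// TEntryList with matching entry numbers instead of drawing.
#include "TFile.h"
#include "TTree.h"
#include "TEntryList.h"
#include "TDirectory.h"
#include <cstdio>

void tagSelect()
{
    TFile in("tags.root");
    TTree *tags = (TTree*) in.Get("TAG");      // tag tree, name assumed

    tags->Draw(">>elist", "maxPt > 5 && nKaons >= 2", "entrylist");
    TEntryList *elist = (TEntryList*) gDirectory->Get("elist");

    printf("%lld events selected\n", elist->GetN());
    // The selected entries (plus run/event ids or LFNs stored in
    // the tag) then drive retrieval of the full RAW/ESD/AOD data.
}
```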
File Catalog
The file catalog contains one AliStats object per raw data file:
Filename of the raw file, number of events, begin/end run number, begin/end event number, begin/end time, file size, quality histogram
The same info is also stored in a central MySQL RDBMS.
In the AliEn catalog only: LFN, raw filename, file size
LFN: /alice_md/dc/adc-<date>/<filename>
<Filename>: <hostname>_<date>_<time>.root
File/Event Collections
Not yet addressed.
A file collection could be one (or more) sub-directories in AliEn (with symbolic links to the original LFNs)?
An event collection could be a file collection with an associated event list per file?
Or fully dynamic: a collection is just a (stored) query on the event and file catalogs (like the grid collector)? A hedged sketch of these shapes follows.
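The sketch below renders the two static notions as plain data shapes; purely illustrative, not an AliEn structure.

```cpp
// Hedged sketch: a file collection as a list of LFNs, and an
// event collection as a file collection with an event list
// attached per file.
#include <map>
#include <string>
#include <vector>

using LFN = std::string;

struct FileCollection {
    std::vector<LFN> files;                       // e.g. symlinks in AliEn
};

struct EventCollection {
    FileCollection files;
    std::map<LFN, std::vector<long>> eventLists;  // events to read per file
};
```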
ARDA Experiences With Metadata
Nuno Santos
Experience with Metadata
● ARDA tested several metadata solutions from the experiments:
●LHCb Bookkeeping – XML-RPC with Oracle backend
●CMS: RefDB – PHP in front of MySQL, giving back XML tables
●ATLAS: AMI – SOAP server in Java in front of MySQL
●gLite (AliEn Metadata) – Perl in front of MySQL, parsing commands and streaming back text
● Tested performance, scalability and features => many plots omitted; see the original presentation.
Synthesis
● Generally, the scalability is poor
●Sending big responses in a single packet limits scalability
●Use of SOAP and XML-RPC worsens the problem
● Schema evolution is not really supported
●RefDB and AliEn don't do schema evolution at all
●AMI and LHCb Bookkeeping support it via admins adjusting tables
● No common metadata interface
Experience with existing software => propose a generic interface & a prototype as proof of concept
Design Decisions
● Metadata organized as a hierarchy - collect objects with shared attributes into collections
●Collections stored as tables - allows queries on SQL tables
●Analogy to a file system:
● Collection <-> Directory
● Entry <-> File
● Abstract the backend - allows supporting several backends:
●PostgreSQL, MySQL, Oracle, filesystem...
● Values restricted to ASCII strings, since the backend is unknown
● Scalability:
●Transfer large responses in chunks, streaming from the DB (sketched below)
● Decreases server memory requirements; no need to store the full response in memory
● Also implement a non-SOAP protocol
● Compare with SOAP: what is the performance price of SOAP?
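A hedged sketch of the chunked, iterator-style retrieval this design argues for: the server streams query results in fixed-size chunks so neither side holds the full response in memory. The interface names are illustrative, not the actual ARDA API.

```cpp
// Iterator-style chunked retrieval: each call to next() returns
// one chunk streamed from the DB rather than a materialized
// full response.
#include <string>
#include <vector>

// One chunk of rows plus a flag telling the client to keep pulling.
struct Chunk {
    std::vector<std::string> rows;   // ASCII-encoded values
    bool more = false;               // true if further chunks remain
};

class QueryIterator {
public:
    // Stateful server-side cursor over a running query.
    virtual Chunk next(std::size_t maxRows) = 0;
    virtual ~QueryIterator() = default;
};

// Client loop: constant memory regardless of result-set size.
void drain(QueryIterator &it)
{
    Chunk c;
    do {
        c = it.next(1000);           // e.g. 1000 rows per round trip
        // ... process c.rows ...
    } while (c.more);
}
```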
Conclusions on SOAP
Tests show:
SOAP performance is generally poor when compared to TCP streaming.
gSOAP is significantly faster than the other toolkits.
Iterators (with a stateful server) help considerably: results are returned in small chunks.
SOAP puts an additional load on the server.
SOAP interoperability is very problematic: writing an interoperable WSDL is hard, and not all toolkits are mature.
Conclusions
● Many problems were understood by studying the experiments' metadata implementations
● Common requirements exist
● ARDA proposes a generic interface to metadata on the Grid:
● Retrieving/updating of data
● Hierarchical view
● Schema discovery and management
(Discussed with gLite, GridPP, GAG; accepted by the PTF)
● A prototype with SOAP & streaming front ends was built
● SOAP can be as fast as streaming in many cases
● SOAP toolkits are still immature
● http://project-arda-dev.web.cern.ch/project-arda-dev/metadata/