CLAs Reconstruction and Analysis

Physics Data Processing with SOA based Framework

Vardan Gyurjyan on behalf of the CLAS12 software group

Outline

- Problem statement
- SOA based framework as a solution
- Current status of the ClaRA project
- Future plans
- Conclusion

April 19, 2023 V. Gyurjyan

Computing power

CMOS Technology

Single Chip Integration

Node/Rack Integration

Network Integration

Integration

- Chip: 4 cores
- Compute Card: 1 chip, 13.6 GF/s
- Node Card: 32 compute cards, 435 GF/s
- Rack: 32 node cards, 13.9 TF/s
- IBM Blue Gene: 72 racks, 1 PF/s
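
The per-level figures on this slide follow from multiplying up the packaging hierarchy. A quick check in Python, assuming each level's peak rate is simply the unit count times the peak rate of the level below (the variable names are illustrative and do not come from the slides):

```python
# Peak-rate roll-up for the packaging hierarchy on this slide.
# Assumption: peak of a level = (number of units) x (peak of the unit below).
COMPUTE_CARD_GFLOPS = 13.6   # one chip (4 cores) per compute card
CARDS_PER_NODE_CARD = 32
NODE_CARDS_PER_RACK = 32
RACKS = 72

node_card_gflops = CARDS_PER_NODE_CARD * COMPUTE_CARD_GFLOPS
rack_tflops = NODE_CARDS_PER_RACK * node_card_gflops / 1000.0
system_pflops = RACKS * rack_tflops / 1000.0

print(f"node card: {node_card_gflops:.0f} GF/s")
print(f"rack:      {rack_tflops:.1f} TF/s")
print(f"system:    {system_pflops:.2f} PF/s")
```

The roll-up reproduces the quoted 435 GF/s, 13.9 TF/s, and roughly 1 PF/s.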

Network Evolution

[Figure: normalized growth since 1980 (logarithmic axis, 1 to 1,000,000), plotted for 1980 to 2001]

- User Traffic: 2x / 12 months
- Router Capacity: 2.2x / 18 months
- Moore’s Law: 2x / 18 months
- Network Capacity: 2x / 7 months

- CAT4: 10 Mbps, 10base-T
- CAT5: 100 Mbps, 100base-T
- CAT5e: 1 Gbps, 1000base-T
- 2003: CAT6, 10 Gbps
- 2007: CAT7, 100 Gbps
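
The growth rates on this slide use different periods (7, 12, and 18 months), which makes them hard to compare at a glance. The sketch below converts each one to a per-year factor. The rates are the ones quoted on the slide; the dictionary layout and the helper name `annual_factor` are assumptions made for illustration.

```python
# Convert "factor per N months" growth rates into a common per-year factor.
RATES = {
    "User traffic": (2.0, 12),
    "Router capacity": (2.2, 18),
    "Moore's law": (2.0, 18),
    "Network capacity": (2.0, 7),
}

def annual_factor(factor: float, months: int) -> float:
    """Growth over 12 months for a quantity that grows by `factor` every `months` months."""
    return factor ** (12.0 / months)

for name, (factor, months) in RATES.items():
    print(f"{name:17s} x{annual_factor(factor, months):.2f} per year")
```

On this scale network capacity grows by roughly 3.3x per year, against roughly 1.6x per year for Moore's law.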

High Performance Computing Trends

1. Exponential growth in processor performance (coming to an end)

2. Power cost = System cost: invention required

3. Growth in level of parallelism (near term solution)

IBM Approach – Path to Petascale

- Multiple modest cores on a single chip rather than one high-performance processor
- Watts/FLOP will not improve much from future technologies.
- Linux environment and MPI (Message Passing Interface, the standard messaging interface)

"The Network is the Computer."

John Gage

Specifics of the Offline Software

- Lifetime of the software >= lifetime of the experiment.
- Collaborative nature of the development.
- Coexistence of parallel running applications for the single experiment.
- Unprecedented scale and complexity of the physics computing environment.
- Physics computing environment must keep up with fast growing computing technologies.
- Large worldwide user base.

PDP (Physics Data Processing) Application

Conventional vs. parallel/distributed

Running Conventional Software Application

[Flowchart with the steps Copy/checkout, Configure, Compile, Fix errors, and Run; a "Modified?" decision; yes/no branches; and the outcomes ok, Complain, and Give up]

Programming Errors

Compile time: program does not compile. Compiler reports a “best guess” of the problem.
- Undeclared variables or functions
- Missing semicolon or brace
- Typos
- Missing files or libraries
- Type ambiguities

Run time: executable crashes or has unexpected behavior. May not appear for all conditions or all data sets.
- Uninitialized variables
- Memory errors
- Numeric errors
- Type errors in print statements
- Closing a NULL file pointer
- Accessing a NULL pointer
- Variables out of scope
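
A run-time error that does not appear for all data sets can be as simple as a numeric error triggered only by an empty input. A minimal Python sketch follows; the function names and the sample values are made up for illustration and are not taken from any CLAS software.

```python
def mean_energy(hits):
    """Average of the hit energies; fails at run time when `hits` is empty."""
    return sum(hits) / len(hits)   # ZeroDivisionError for an empty list

def safe_mean_energy(hits):
    """Same calculation, with the empty-input case handled explicitly."""
    if not hits:
        return None                # no hits: report "no value" instead of crashing
    return sum(hits) / len(hits)

print(mean_energy([1.0, 2.0, 3.0]))   # works for this data set
try:
    mean_energy([])                   # same code, different data set
except ZeroDivisionError as err:
    print("run-time error:", err)
print(safe_mean_energy([]))
```

The first function passes any test that happens to use non-empty input, which is why such errors tend to surface late.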

Challenges of the Conventional Approach

- Difficult to organize and coordinate activities
- Difficult to maintain
- Inevitable fragmentation of the software
- Poor scalability
- Computing skills are required to use physics data processing applications

[Diagram: groups A, B, and C, shown for CLAS 6 and CLAS 12]

A+B << C

C: requires few or no programming skills

One way to eat an elephant

A bite at a time

Where do we start?

Each bite is a clear, simple, single purpose application, developed by a group B member.

Group A, in tight collaboration with groups B and C, shall control and manage the process, never losing maniacal focus on the big picture (the elephant).

“Things should be made as simple as possible, but not simpler.”

Albert Einstein

Language and Architecture Evolution

[Diagram: stages of language and architecture evolution]

- Assembly Language
- Structured and Procedural programming
- Object Oriented programming
- Service Oriented programming

SOA

- SOA promotes the goal of separating service users from the service implementation.
- Style of building reliable systems that deliver functionality as services
- Loose coupling between interacting services
- Directories and addressing mechanisms are at the center of SOA.

[Diagram: a Program (complex) takes an arbitrary format in and produces an arbitrary format out; a Service (specialized, simple) takes a standard format in and produces a standard format out]

Attributes of Services

- Well defined, easy-to-use, somewhat standardized interface
- Self-contained with no visible dependencies on other services
- (almost) Always available but idle until requests come
- Location transparency
- Easily accessible and readily usable, no “integration” required
- New services can be offered by combining existing services
- Quantifiable quality of service
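
The attribute "new services can be offered by combining existing services" can be sketched in a few lines. This is a toy model, not the ClaRA API: it assumes every service is a function from one dictionary-shaped message to another, and the names `compose`, `calibrate`, and `summarize` are hypothetical.

```python
from typing import Callable, Dict

Message = Dict[str, object]            # hypothetical standard message: a plain dictionary
Service = Callable[[Message], Message]

def compose(*services: Service) -> Service:
    """Build a new service that passes a message through `services` in order."""
    def combined(message: Message) -> Message:
        for service in services:
            message = service(message)
        return message
    return combined

def calibrate(message: Message) -> Message:
    # Toy service: scale raw values by a gain carried in the message.
    gain = message["gain"]
    return {**message, "values": [v * gain for v in message["values"]]}

def summarize(message: Message) -> Message:
    # Toy service: attach the sum of the current values.
    return {**message, "total": sum(message["values"])}

calibrate_and_summarize = compose(calibrate, summarize)
result = calibrate_and_summarize({"values": [1, 2, 3], "gain": 2})
print(result["total"])
```

Because both toy services share one message shape, the combined service needs no glue code beyond the ordering.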

Service Interface

- Standard message based
- Highly polymorphic
- Intent is enough
- Implementation can be changed in ways that do not break all the service consumers
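
One way to picture an implementation change that does not break consumers is a uniform message interface plus a name-based directory. The sketch below is an assumption-laden illustration, not ClaRA code: the `Service` base class, the `registry` dictionary, the message keys, and the field values are all invented for the example.

```python
from abc import ABC, abstractmethod

class Service(ABC):
    """Hypothetical uniform interface: every service accepts and returns a dict message."""
    @abstractmethod
    def execute(self, request: dict) -> dict: ...

class FieldMapV1(Service):
    def execute(self, request: dict) -> dict:
        # Toy implementation: constant field everywhere (made-up value).
        return {"status": "ok", "field": 5.0}

class FieldMapV2(Service):
    def execute(self, request: dict) -> dict:
        # Replacement implementation: field falls off linearly with |z| (made-up model).
        z = abs(request.get("z", 0.0))
        return {"status": "ok", "field": max(0.0, 5.0 - z)}

registry = {"field-map": FieldMapV1()}     # directory: service name -> implementation

def consumer(z: float) -> float:
    """The consumer knows only the service name and the message keys."""
    reply = registry["field-map"].execute({"z": z})
    return reply["field"]

before = consumer(1.0)
registry["field-map"] = FieldMapV2()       # swap the implementation; consumer code is unchanged
after = consumer(1.0)
print(before, after)
```

The consumer function is identical before and after the swap; only the directory entry changes.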

Service Orientation is scalable

End users can consume and combine a lot of services since they don’t have to know or “learn” how the services are made.

Service providers (A+B) can offer their services to a lot more consumers by optimizing:
- The user interface
- Access
- Implementations

“On Demand” Physics Data Processing

- Use software as you need
- Much lower setup time, forget about:
  - Installation
  - Implementation
  - Training
  - Maintenance
- Scalable and effective usage of resources
- Parallelism (CPU, Storage, Bandwidth…)

What is ClaRA?

- Framework that implements SOA.
- Service development environment.
- Toolbox of generic physics data processing services.
- Network distributed platform.
- The “Glue”, binding together services into an algorithmic data analysis application.

Design criteria

- Framework services shall be simple to use and easy to learn.
- Framework services should be customizable so that they can adapt to different data processing tasks.
- Framework shall provide context sensitive help and assistance, with many real world physics data processing application examples.
- Framework shall provide ready to use services, encapsulating essential functionalities of the physics data processing system.
- Services shall be reusable and easily replaceable.
- Physics data processing application design and implementation shall require few or no programming skills.
- Neither a specific computing environment nor compiling shall be necessary to build and run a physics data processing application.
- Framework shall provide a graphical environment for physics data processing application development.
- The framework's platform shall be network distributed, and shall have temporal continuity.
- The new system shall provide World Wide Web access to the services for remote configuration and execution of the data processing applications.
- The necessary security considerations must be addressed.

Data and Algorithm

Framework advocates clear separation between:
a) data and algorithm
b) transient and persistent data

Methods in the data object will be limited to manipulations of the internal data members only.

Algorithm will process one type of data and generate data objects of a different type.

[Diagram: Data → Algorithm → Data]
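
The separation described on this slide can be sketched as follows. The class names `Hit`, `Cluster`, and `CentroidAlgorithm` are hypothetical and chosen only to show the pattern: data objects whose methods touch only their own members, and an algorithm that consumes one data type and produces another.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Hit:
    """Data object: its only method works on its own members."""
    x: float
    y: float

    def radius_squared(self) -> float:
        return self.x * self.x + self.y * self.y

@dataclass
class Cluster:
    """A different data type, produced by the algorithm below."""
    x: float
    y: float
    n_hits: int

class CentroidAlgorithm:
    """Algorithm: consumes Hit objects, produces a Cluster, and stores no event data itself."""
    def process(self, hits: List[Hit]) -> Cluster:
        n = len(hits)
        return Cluster(sum(h.x for h in hits) / n, sum(h.y for h in hits) / n, n)

cluster = CentroidAlgorithm().process([Hit(0.0, 0.0), Hit(2.0, 4.0)])
print(cluster)
```

Keeping the averaging logic out of `Hit` means the same data class can feed other algorithms without modification.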

Persistent and Transient Data

Physics algorithm objects should not use data objects directly in the persistent storage.

Transient data storage as a means of communication between physics algorithms.

Two different optimization criteria for applications using persistent and transient data.

Being independent from the persistent storage technology.
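
A minimal sketch of these points, under stated assumptions: algorithms exchange data only through an in-memory transient store, and a separate converter moves data to and from persistent form. JSON stands in for the persistent technology here purely for illustration; the class names, the `raw_adc` key, and the pedestal value of 100 are all made up.

```python
import json

class TransientStore:
    """In-memory store that algorithms use to exchange data objects."""
    def __init__(self):
        self._objects = {}

    def record(self, key, obj):
        self._objects[key] = obj

    def retrieve(self, key):
        return self._objects[key]

class JsonPersistency:
    """One possible persistency backend; algorithms never call it directly."""
    def dumps(self, store, keys):
        return json.dumps({k: store.retrieve(k) for k in keys}, sort_keys=True)

    def loads(self, text, store):
        for key, obj in json.loads(text).items():
            store.record(key, obj)

def pedestal_subtraction(store):
    # Algorithm: reads from and writes to the transient store only.
    raw = store.retrieve("raw_adc")
    store.record("adc", [value - 100 for value in raw])

store = TransientStore()
JsonPersistency().loads('{"raw_adc": [130, 115, 100]}', store)   # persistent -> transient
pedestal_subtraction(store)
saved = JsonPersistency().dumps(store, ["adc"])                  # transient -> persistent
print(saved)
```

Replacing `JsonPersistency` with another backend would leave `pedestal_subtraction` untouched, which is the independence the slide asks for.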

Data Object categories

ClaRA Platform

[Diagram: ClaRA platform architecture; labels: cMsg, SCC, SOAP, Users]

Current Status

ClaRA cMsg Platform:
- Geometry Service
- Magnetic Field Map Service
- GEMC Service
- Tracking Service
- bCNU Service
- Event Data Service

ClaRA WebServices Platform (WWW):
- Math Service
- Stat Service
- Probability Service
- Geometry Service
- Matrices Service

Thin Clients

Examples

- EVIO event producer and EVIO event consumer services (C++).
- Data producer and data consumer services. C examples use cMsg payload (ASCII).
- C++ geometry service client example
- Java geometry service client example
- Web services JSP clients

Tracking composite application

[Diagram: tracking composite application on the ClaRA cMsg Platform, with Thin Clients]

Services: Space-point maker, Coarse track finder, Cluster Analyzer, Ambiguity solver, Track fitter, Histogram builder

Data: Transient data, Persistent data

Tracking application service decomposition

[Diagram: tracking application service decomposition]

Data objects: DetectorData, EvtData, StatData, TransientEvtData, TransientDetData, TransientStatData

Data products: Track candidates, Resolved Tracks, Space Points, Raw Data, Final Tracks

Services: SpacePointFormation, CoarseTrackFinder, SeadMaker, VertexFinder, ClusterAnalyzer, AmbiguitySolver, TrackFitter, TrackScoring

Control: Supervisor (start), Tracking State machine

Transient Storage operations: retrieve, record
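
The start / retrieve / record pattern in this diagram can be sketched as a small state machine. Everything below is an illustration under assumptions: the pairing of services with input and output data products is guessed from the labels, the service bodies are placeholders that do no physics, and none of the class names come from the ClaRA code base.

```python
# Chain assumed for illustration: (service name, input key, output key).
CHAIN = [
    ("SpacePointFormation", "Raw Data", "Space Points"),
    ("CoarseTrackFinder", "Space Points", "Track candidates"),
    ("AmbiguitySolver", "Track candidates", "Resolved Tracks"),
    ("TrackFitter", "Resolved Tracks", "Final Tracks"),
]

class TransientStorage:
    def __init__(self):
        self._data = {}

    def record(self, key, value):
        self._data[key] = value

    def retrieve(self, key):
        return self._data[key]

def make_service(name, input_key, output_key):
    """Placeholder service: retrieves its input, appends its name, records the output."""
    def start(storage):
        history = storage.retrieve(input_key)
        storage.record(output_key, history + [name])
    return start

class Supervisor:
    """State machine: the state is the index of the next service to start."""
    def __init__(self, chain, storage):
        self.services = [make_service(*step) for step in chain]
        self.storage = storage
        self.state = 0

    def step(self):
        self.services[self.state](self.storage)
        self.state += 1

    def run(self):
        while self.state < len(self.services):
            self.step()

storage = TransientStorage()
storage.record("Raw Data", [])
supervisor = Supervisor(CHAIN, storage)
supervisor.run()
print(storage.retrieve("Final Tracks"))
```

The services never call each other; the supervisor decides the order, and the transient storage carries the data between steps.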

Performance measurements
