
Page 1:

20.10.08, NSS 2008, Dresden: High Performance Event Building over InfiniBand Networks

GSI Helmholtzzentrum für Schwerionenforschung GmbH http://dabc.gsi.de

Work supported by EU FP6 project JRA1 FutureDAQ RII3-CT-2004-506078

High Performance Event Building over InfiniBand Networks

J. Adamczewski-Musch, H. G. Essel, N. Kurz, S. Linev

GSI Helmholtzzentrum für Schwerionenforschung GmbH, Experiment Electronics, Data Processing group

• CBM data acquisition
• Event building network
• InfiniBand & OFED
• Performance tests
• Data Acquisition Backbone Core DABC

Page 2:

CBM DAQ features summary

Complex trigger algorithms on full data:

• Self-triggered front-end electronics.
• Time stamped data channels.
• Transport full data into the filter farm.
• Data sorting over a switched network at the full data rate of ~1 TB/s.
• Sorting network: ~1000 nodes.

Is that possible (in 2012)?
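The numbers above imply a simple per-node bandwidth budget; a quick back-of-the-envelope check (the ~1 TB/s rate and ~1000 nodes are from the slide, the arithmetic is the only addition):

```python
# Back-of-the-envelope: what per-node rate does a ~1 TB/s
# event-building network over ~1000 nodes require?
total_rate_gb_s = 1000.0   # ~1 TB/s aggregate, expressed in GB/s
nodes = 1000               # ~1000 sorting-network nodes

per_node_gb_s = total_rate_gb_s / nodes
print(f"required per-node rate: {per_node_gb_s:.1f} GB/s")  # 1.0 GB/s
```

This matches the "up to 1000 lines, 1 GByte/s each" figure on the data-flow slide.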

Page 3:

Data flow principle

Detector electronics: time stamped data channels.
• Merge channels.
• Sort over switched network; units are not events, but time-slice data!
• Distribute complete data.
Processor farms: event definition, filtering, archiving.
Up to 1000 lines, 1 GByte/s each. Partial data enters the network; complete data leaves it.
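The principle above (every sender forwards its partial data for a time slice to the same destination, so each destination ends up with complete time slices) can be sketched as a minimal round-robin assignment; the node count and slice indices here are illustrative, not from the slides:

```python
# Minimal sketch: each time slice is assigned round-robin to one
# builder node, so every builder receives complete time slices.
def builder_for_slice(slice_index: int, n_builders: int) -> int:
    return slice_index % n_builders

# Every sender ships its partial data for slice t to the same builder,
# so that builder collects the complete data of time slice t.
n_builders = 4
assignment = [builder_for_slice(t, n_builders) for t in range(8)]
print(assignment)  # [0, 1, 2, 3, 0, 1, 2, 3]
```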

Page 4:

Software developments

Software packages developed:

1. Simulation with SystemC (flow control, scheduling)
   • Meta data on data network
2. Real dataflow core (round robin, with/without synchronization)
   • Linux, InfiniBand, GB Ethernet
   • Simulates data sources
3. Data Acquisition Backbone Core DABC (includes dataflow core)
   • Controls, configuration, monitoring, GUI ...
   • Real data sources
   • General purpose DAQ framework

Let's have a look at these.

Poster N30-98: (Wednesday, 10:30)  DABC, a Data Acquisition Backbone Core Library.

Page 5:

SystemC simulations 100x100 (1)

Figure: occupancy of the network by the different kinds of traffic; the same network is used for data and meta data.
• no traffic shaping: 40% network utilization
• traffic shaping: 95% network utilization
Optimization of network utilization for variable sized buffers; enhancement up to a factor of two.
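The traffic-shaping gain comes from avoiding output-port collisions. One common scheme (a sketch of the general idea, not necessarily the exact schedule used in these simulations) is a synchronized shifted round robin: in time slot t, sender i transmits to receiver (i + t) mod N, so no two senders ever target the same receiver:

```python
# Shifted round-robin schedule: in time slot t, sender i sends to
# receiver (i + t) % n.  In every slot, each receiver is targeted by
# exactly one sender, so output ports never collide.
def schedule(n: int):
    for t in range(n):
        yield [(i, (i + t) % n) for i in range(n)]

n = 4
for slot in schedule(n):
    receivers = [dst for _, dst in slot]
    assert sorted(receivers) == list(range(n))  # no collisions
print("collision-free for", n, "senders")
```

Without such shaping, random arrivals let several senders hit the same output port at once, which is consistent with the 40% utilization reported above.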

Page 6:

Real implementation: InfiniBand (2)

High-performance interconnect technology:
– switched fabric architecture
– up to 20 Gbit/s bidirectional serial link
– Remote Direct Memory Access (RDMA)
– quality of service
– zero-copy data transfer
– low latency (few microseconds)
– common in HPC

Software library: OpenFabrics (OFED) package; see www.openfabrics.org for more information.

Figure: InfiniBand switch of the GSI cluster.

Page 7:

OpenFabrics software stack*

SA: Subnet Administrator
MAD: Management Datagram
SMA: Subnet Manager Agent
PMA: Performance Manager Agent
IPoIB: IP over InfiniBand
SDP: Sockets Direct Protocol
SRP: SCSI RDMA Protocol (Initiator)
iSER: iSCSI RDMA Protocol (Initiator)
RDS: Reliable Datagram Service
UDAPL: User Direct Access Programming Lib
HCA: Host Channel Adapter
R-NIC: RDMA NIC

Figure: OpenFabrics software stack, from the hardware (InfiniBand HCA, iWARP R-NIC) through hardware-specific drivers and the kernel mid-layer (kernel-level verbs/API, connection manager, CMA, MAD, SA client), upper layer protocols (IPoIB, SDP, SRP, iSER, RDS, NFS-RDMA RPC, cluster file systems) and user-space APIs (user-level verbs/API, UDAPL, SDP lib, user-level MAD API, OpenSM, diag tools), up to the application level (various MPIs, sockets-based access, block storage access, access to file systems, clustered DB access, IP based apps). SDP and UDAPL provide kernel-bypass paths.

* slide from www.openfabrics.org

Subnet Administrator: multicast. Subnet Manager: configuration.

Page 8:

InfiniBand basic tests

Hardware for testing:
• GSI (November 2006): 4 nodes, single data rate (SDR)
• Forschungszentrum Karlsruhe* (March 2007): 23 nodes, double data rate (DDR)
• UNI Mainz** (August 2007): 110 nodes, DDR

Point-to-point tests:

                   SDR (GSI)   DDR (Mainz)
  Unidirectional   0.98 GB/s   1.65 GB/s
  Bidirectional    0.95 GB/s   1.3 GB/s

Multicast tests (2 KByte):

                      Rate per node
  GSI (4 nodes)       0.625 GB/s
  Mainz (110 nodes)   0.225 GB/s

* thanks to Frank Schmitz, Ivan Kondov and Project CampusGrid at FZK
** thanks to Klaus Merle and Markus Tacke at the Zentrum für Datenverarbeitung, Uni Mainz

Chart: measured rates in GByte/s for nominal, unidirectional and bidirectional SDR and DDR links, and for multicast on 4 and 110 nodes.

Page 9:

Scaling of non-synchronized round robin traffic

Page 10:

Effect of synchronization and topology optimization (gain: +25%)

Topology optimization for 110 nodes would have required different cabling of switches!

Page 11:

Motivation for DABC (3)

2004 → EU FP6 project JRA1 FutureDAQ*

2004 → CBM FutureDAQ for FAIR

* RII3-CT-2004-506078

1996 → MBS future: 50 installations at GSI, 50 external; http://daq.gsi.de

Use cases:
• Detector tests
• FE equipment tests
• Data transport
• Time distribution
• Switched event building
• Software evaluation
• MBS event builder
• General purpose DAQ

Data Acquisition Backbone Core

Intermediate demonstrator

Requirements:
• build events over fast networks
• handle triggered or self-triggered front-ends
• process time stamped data streams
• provide data flow control (to front-ends)
• connect (nearly) any front-ends
• provide interfaces to plug in application codes
• connect MBS readout or collector nodes
• be controllable by several controls frameworks

Poster N30-98: (Wednesday, 10:30)  DABC, a Data Acquisition Backbone Core Library.

Page 12:

DABC development branches

1. High speed event building over fast networks

2. Front-end readout chain tests (CBM, September 2008)

3. DABC as MBS* event builder (Ready)

4. DABC with mixed triggered (MBS) and time stamped data channels
   • Needs synchronization between both
   • Insert event number from trigger into the time stamped data stream
   • Read out time stamp from MBS via VME module (to be built)

From this, DABC evolved as a general purpose DAQ framework; the branches above are its main application areas.

* MultiBranchSystem: standard DAQ system at GSI

Page 13:

GE: Gigabit Ethernet; IB: InfiniBand

DABC design: global overview

Figure: front-ends (readout, data combiner, others) feed Linux PCs over PCIe and GE; DABC modules on the PCs perform data input, sorting, tagging, filtering and analysis, with schedulers coordinating transfers over the IB event-building network; analysis, filter and archive nodes are reached via TCP over GE. Application codes are included through plug-ins.

Page 14:

DABC design: Data flow engine

A module processes data of one or several data streams. Data streams propagate through ports, which are connected by transports and devices.

Figure: two DABC modules, each with processing code and ports; ports are connected either locally (by reference, through the object manager) or across the network by transport and device objects. A central data manager provides memory pools and buffer queues, and modules and transports are driven by threads.
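The module/port idea described above can be sketched in a few lines; the class and method names here are illustrative only, not the actual DABC C++ API:

```python
# Illustrative sketch of the DABC dataflow engine: modules exchange
# buffers through ports backed by queues.  Names are hypothetical,
# not the real DABC interfaces.
from collections import deque

class Port:
    def __init__(self):
        self.queue = deque()
    def send(self, buf):
        self.queue.append(buf)
    def recv(self):
        return self.queue.popleft() if self.queue else None

class Module:
    def __init__(self, name):
        self.name = name
        self.inp = Port()
        self.out = Port()
    def process(self):
        buf = self.inp.recv()
        if buf is not None:
            self.out.send(f"{self.name}:{buf}")

src, dst = Module("sorter"), Module("filter")
src.inp.send("timeslice-0")
src.process()
# Locally, connected ports just hand buffers over by reference;
# between nodes, a transport/device pair would carry them instead.
dst.inp.send(src.out.recv())
dst.process()
result = dst.out.recv()
print(result)  # filter:sorter:timeslice-0
```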

Page 15:

DABC and InfiniBand

DABC fully supports InfiniBand as data transport between nodes:
– connection establishment
– memory management for zero-copy transfer
– back-pressure
– error handling
– thread sharing
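Of these, back-pressure is what keeps a fast sender from overrunning a slow receiver. A minimal sketch with a bounded buffer queue (illustrative only, not the DABC implementation):

```python
# Back-pressure sketch: a sender is refused when the receiver's
# buffer queue is full, so buffers are never dropped; the sender
# must retry once the receiver has consumed data.
from collections import deque

class BoundedQueue:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
    def try_put(self, item) -> bool:
        if len(self.items) >= self.capacity:
            return False          # back-pressure: sender must wait
        self.items.append(item)
        return True
    def get(self):
        return self.items.popleft()

q = BoundedQueue(capacity=2)
accepted = [q.try_put(i) for i in range(3)]
print(accepted)   # [True, True, False] -- third buffer refused
q.get()           # receiver consumes one buffer...
ok = q.try_put(99)
print(ok)         # ...and the sender can proceed: True
```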

Figure: all-to-all test on the GSI InfiniBand cluster (four nodes, SDR).

Page 16:

Conclusion

• InfiniBand is a good candidate for a fast event-building network.
• Up to 520 MB/s bidirectional data rate was achieved on 110 nodes without optimization.
• A mixture of point-to-point and multicast traffic is possible.
• Further investigation of scheduling mechanisms is required.
• Further investigation of network topology is required.
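If the 520 MB/s bidirectional figure above is the per-node rate (consistent with the per-node rates quoted on the test slides), scaling it to the full cluster gives a feel for how far these tests are from the ~1 TB/s goal; the arithmetic is the only addition:

```python
# Aggregate event-building throughput implied by the per-node rate
# (assuming 520 MB/s is per node, as on the earlier test slides).
per_node_mb_s = 520
nodes = 110
aggregate_gb_s = per_node_mb_s * nodes / 1000.0
print(f"aggregate: {aggregate_gb_s:.1f} GB/s")  # 57.2 GB/s
# Reaching ~1 TB/s with ~1000 nodes would need ~1 GB/s per node,
# i.e. roughly a factor two over this unoptimized result.
```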

Thank You!