Christian Steinle, University of Mannheim, Institute of Computer Engineering

L1 Tracking – Status CBMROOT And Realisation

Christian Steinle, Andreas Kugel, Reinhard Männer
Computer Engineering, University of Mannheim

• Contents

– Status CBMROOT
– Realisation in hardware
– Outlook

Status CBMROOT

The code works in the current CBMROOT framework

• cbmroot/parameters/htrack
  – contains all table files
  – table files transform hit signatures into priority classes

• cbmroot/macro
  – contains two simulation and two reconstruction macros

• cbmroot/htrack
  – contains all source code files

Status CBMROOT: cbmroot/macro

Important macro entries:

• Load library:
    gSystem->Load("libHTrack");

• Create objects:
    CbmStsFindTracks* findTracks = new CbmStsFindTracks(iVerbose, NULL, kFALSE, "STS Track Finder");
    CbmHoughStsTrackFinder* trackFinder = new CbmHoughStsTrackFinder();

• Set task:
    fRun->AddTask(findTracks);

• Use for track finding:
    findTracks->UseFinder(trackFinder);

Status CBMROOT: Documentation

• Doxygen documentation is ready in the source code
• A how-to document is in review. It contains:
  – Main class description with constructors
  – Algorithm configuration via ASCII configuration file
      – parameter name, meaning, standard value, value range, value format, links to other related parameters
  – Signature definition via
      – ASCII table files
      – Automated generation algorithms
  – Major algorithm definitions in the source code
      – Peak finding definitions, for example the window type or size
      – Enabling/disabling of analysis
  – Example scripts

Realisation in hardware: Environment

• Data:
  – 10^7 events/s with 20000 hits per event lead to 2 * 10^11 hits/s
  – Each hit is encoded with 32 bit, giving 32 bit/hit

  Data rate = 2 * 10^11 hits/s * 32 bit/hit = 6.4 * 10^12 bit/s = 6.4 Tbit/s

• Network:
  – 10 Gbit/s per link

  Number of links = (6.4 Tbit/s) / (10 Gbit/s per link) = 640 links

• FPGA:
  – Process 1 hit per clock cycle with 10 Gbit/s per link and 32 bit/hit

  Clock = (10 Gbit/s) / (32 bit/hit) / (1 hit per clock cycle) = 312.5 * 10^6 1/s = 312.5 MHz
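
The three estimates above can be reproduced with a few lines of C++. This is only a sketch of the arithmetic: the struct and function names are hypothetical and not part of CBMROOT, and decimal prefixes (1 Gbit = 10^9 bit) are assumed.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the input figures quoted on the slide.
struct Environment {
    double eventsPerSecond;    // 1e7 events/s
    double hitsPerEvent;       // 20000 hits per event
    double bitsPerHit;         // 32 bit per hit
    double linkRateBitPerSec;  // 10 Gbit/s per link (decimal prefix assumed)
};

// Total data rate in bit/s: events/s * hits/event * bit/hit.
double dataRateBitPerSec(const Environment& env) {
    return env.eventsPerSecond * env.hitsPerEvent * env.bitsPerHit;
}

// Number of links needed to carry the data rate, rounded up to whole links.
long numberOfLinks(const Environment& env) {
    assert(env.linkRateBitPerSec > 0.0);
    return static_cast<long>(std::ceil(dataRateBitPerSec(env) / env.linkRateBitPerSec));
}

// Clock frequency in Hz needed to consume one hit per clock cycle from one link.
double clockHz(const Environment& env) {
    assert(env.bitsPerHit > 0.0);
    return env.linkRateBitPerSec / env.bitsPerHit;
}
```

With `Environment env{1e7, 20000.0, 32.0, 10e9};` the functions return 6.4e12 bit/s, 640 links and 312.5e6 Hz, matching the figures on the slide.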

Realisation in hardware: single-chip FPGA implementation (up to now)

[Figure: block diagram of the single-chip implementation; labelled blocks: HBuffer, LBuffer, Histogram Layer]

Realisation in hardware: multi-chip FPGA implementation (planned)

[Figure: block diagram labelled "Multi Chip"]

Realisation in hardware: multi-chip FPGA implementation (planned)

[Figure: multi-chip block diagram; annotation: "Just relocated HBuffer"]

Realisation in hardware: multi-chip FPGA implementation (planned)

No HBuffer is needed if enough processors exist for all histogram layers

Realisation in hardware: single-chip FPGA timing (up to now)

Realisation in hardware: multi-chip FPGA timing (planned)

[Figure: timing diagram; values shown: 248 (speedup: 19), 312 (speedup: 4), 245 (speedup: 4)]

Realisation in hardware: single-chip FPGA resources (up to now)

• PRELUT:
  – input: 20 bits (xy: 17, z: 3); output: γmin and γmax (2 x 8 bit)
  → 1 x (1M x 16) bits external RAM

• LUT:
  – input: 20 bits (xy: 17, z: 3); output: startPos and houghCmd (7 + 29 bit)
  → 2 x (1M x 18) bits external RAM

• HBuffer:
  – entry: γmax, inputLUT and previousListAddress (8 + 20 + 15 bit)
  – memory for 32k entries with 45 bits due to Blockram scalability
  → 80 Blockram, 500 + about 5000 logic cells

• Histogram: 30000 logic cells
• Peak finding: estimated 5000 logic cells
• LUT access: estimated 5000 logic cells

Resources: 45500 logic cells, 80 dual-ported Blockram and 7 MB RAM

1 x Xilinx Virtex 4 XC4VFX60
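
The totals on this slide can be cross-checked with a short tally. The sketch below is an illustration only: the function names are hypothetical, 1M is read as 2^20 words and 32k as 2^15 entries, and the Blockram size of 18432 bit (18 Kbit) is an assumption that is not stated on the slide. The Blockram count is a pure capacity bound and ignores how the memory is actually organised.

```cpp
#include <cassert>
#include <cstdint>

// Sum of the logic-cell figures listed on the slide.
std::uint32_t totalLogicCells() {
    const std::uint32_t hbuffer = 500 + 5000;  // "500 + about 5000"
    const std::uint32_t histogram = 30000;
    const std::uint32_t peakFinding = 5000;    // estimated
    const std::uint32_t lutAccess = 5000;      // estimated
    return hbuffer + histogram + peakFinding + lutAccess;
}

// Size in bits of `count` memories with `words` entries of `width` bits each.
std::uint64_t ramBits(std::uint64_t count, std::uint64_t words, std::uint64_t width) {
    return count * words * width;
}

// External RAM for PRELUT (1 x 1M x 16) plus LUT (2 x 1M x 18), in bits.
std::uint64_t externalRamBits() {
    const std::uint64_t oneM = 1ull << 20;  // 1M read as 2^20 (assumption)
    return ramBits(1, oneM, 16) + ramBits(2, oneM, 18);
}

// Blockrams needed to hold the HBuffer by capacity alone,
// assuming 18432-bit blocks (assumption, not from the slide).
std::uint64_t hbufferBlockrams(std::uint64_t entries, std::uint64_t entryBits) {
    const std::uint64_t blockBits = 18432;
    assert(entryBits > 0);
    return (entries * entryBits + blockBits - 1) / blockBits;  // round up
}
```

Under these assumptions the logic cells sum to 45500, the external RAM comes to 54525952 bit (6.5 MiB, in line with the roughly 7 MB quoted), and `hbufferBlockrams(32768, 45)` gives 80.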

Realisation in hardware: multi-chip FPGA resources (planned)

• Version 1 (histogram with registers)
  – MasterIn: PRELUT, LUT → 7 MB RAM and 5000 logic cells
  – Processing Units: Histogramming, Encoding, Diagonalization, 2D Peak finding → 30000 logic cells per histogram layer
  – MasterOut: 3D Peak finding → 5000 logic cells

MasterIn: 1 x Xilinx Virtex 4 XC4VFX12
Processing Units: 64 x Xilinx Virtex 4 XC4VFX100
MasterOut: 1 x Xilinx Virtex 4 XC4VFX12

Realisation in hardware: multi-chip FPGA resources (planned)

• Version 2 (histogram with Blockrams)
  – MasterIn: PRELUT, LUT → 7 MB RAM and 5000 logic cells
  – Processing Units: Histogramming, Encoding, Diagonalization, 2D Peak finding → 31 x 2 kB Blockram per layer
  – MasterOut: 3D Peak finding → 5000 logic cells

MasterIn: 1 x Xilinx Virtex 4 XC4VFX12
Processing Units: 16 x Xilinx Virtex 4 XC4VFX100
MasterOut: 1 x Xilinx Virtex 4 XC4VFX12

Realisation in hardware: multi-chip FPGA resources (planned)

• Version 3 (histogram with Blockrams and registers)
  – MasterIn: PRELUT, LUT → 7 MB RAM and 5000 logic cells
  – Processing Units: Histogramming, Encoding, Diagonalization, 2D Peak finding → 31 x 2 kB Blockram per layer or 30000 logic cells
  – MasterOut: 3D Peak finding → 5000 logic cells

MasterIn: 1 x Xilinx Virtex 4 XC4VFX12
Processing Units: 14 x Xilinx Virtex 4 XC4VFX100
MasterOut: 1 x Xilinx Virtex 4 XC4VFX12

Realisation in hardware: Estimation

• Data rate: 6.4 Tbit/s with 20000 hits/event
• Network: 640 links at 10 Gbit/s each
• FPGA: 32 bit/hit at 312.5 MHz

single chip:
• Minimal pipeline stall: 76400 clock cycles
• No streamlined processing is possible
• Five Hough transform units per data link lead to 3200 units

multi chip:
• Minimal pipeline stall: #(histogram dim2) = 31 clock cycles
• Accept just 19969 hits and discard the leading or trailing 31 hits
• Direct streamlined processing is possible
• One Hough transform unit per data link leads to 640 units
• Processing time speed-up: 19
• Hardware: at least 16 chips (14 x XC4VFX100 and 2 x XC4VFX12)
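
The unit counts in this estimate follow from simple arithmetic. The sketch below shows one plausible reading, which is an assumption and not stated on the slide: a Hough transform unit is busy for the processed hits of one event plus the pipeline stall, while a new event arrives every 20000 cycles, so the units per link are the busy time divided by that period, rounded up. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hits that can be accepted per event when `stallCycles` cycles of the
// per-event budget are lost to the pipeline stall (multi-chip case).
std::uint32_t acceptedHits(std::uint32_t hitsPerEvent, std::uint32_t stallCycles) {
    assert(stallCycles <= hitsPerEvent);
    return hitsPerEvent - stallCycles;
}

// Units needed per link if one unit is busy for processedHits + stallCycles
// cycles per event and a new event arrives every hitsPerEvent cycles.
// This busy-time model is an assumption, not taken from the slide.
std::uint32_t unitsPerLink(std::uint32_t processedHits,
                           std::uint32_t stallCycles,
                           std::uint32_t hitsPerEvent) {
    assert(hitsPerEvent > 0);
    const std::uint32_t busyCycles = processedHits + stallCycles;
    return (busyCycles + hitsPerEvent - 1) / hitsPerEvent;  // round up
}

// Total number of Hough transform units over all data links.
std::uint32_t totalUnits(std::uint32_t links, std::uint32_t perLink) {
    return links * perLink;
}
```

Under this reading, `unitsPerLink(20000, 76400, 20000)` gives 5 units per link (3200 in total for 640 links) for the single-chip design, and `unitsPerLink(acceptedHits(20000, 31), 31, 20000)` gives 1 unit per link (640 in total) for the multi-chip design.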

Realisation in hardware: Multi-chip FPGA vs. Cell implementation

[Figure: two diagrams labelled "Multi-chip FPGA" and "Cell Processor"]

A Cell processor can be used to develop concepts for a multi-chip FPGA implementation

Cheap and rapid prototyping with a Sony PlayStation 3

Realisation in hardware: Cell Processor

• 1 PowerPC
  – 64 bit architecture
  – 32 kB L1 Cache
  – 512 kB L2 Cache
  Handles the LUT processing, the job distribution and the 3D peak finding

• 8 Synergistic Processing Elements (SPE)
  – 128 registers with 128 bit
  – ALU with 128 bit SIMD
  – 256 kB local memory
  – Memory Flow Controller (MFC) with DMA transfer possibility
  Handle the Histogramming, Encoding, Diagonalization and 2D peak finding

• 1 XDR-RAM with up to 4.5 GB
  Memory for the LUTs, the HBuffer unit and the LBuffer unit

Outlook

• A manual is in progress

• A thesis is in progress

• Additional software analyses are in progress

• Development of the PS3 (Cell) source code is in progress

• single-chip FPGA concept + Cell concepts = multi-chip FPGA