Upload
diep
View
25
Download
0
Embed Size (px)
DESCRIPTION
Fast Data Processing in ALICE TRD FEE & GTU. Introduction FEE partitioning Design for Test Optical Readout Global Tracking Unit. Venelin Angelov Kirchhoff Institute of Physics Chair of Computer Science Prof. Dr. Volker Lindenstruth University Heidelberg, Germany - PowerPoint PPT Presentation
Citation preview
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 1
• Introduction• FEE partitioning• Design for Test• Optical Readout• Global Tracking Unit
Fast Data Processing in
ALICE TRD FEE & GTU
Venelin AngelovKirchhoff Institute of PhysicsChair of Computer ScienceProf. Dr. Volker LindenstruthUniversity Heidelberg, GermanyPhone: +49 6221 54 9812Fax: +49 6221 54 9809Email:[email protected]: www.ti.uni-hd.de
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 2
TR-detector
z
r
B=0.4
T
verte
x
18 Supermodules in azimuth
5 m
odule ri
ngs
TRD Structurestack
6 pl
anes
8 MCM 144 channels
module
max
. 16
padr
ows
MCM
18 channels
1.2 million channels 1.4 million ADCs peak data rate: 16 TB/s ~65000 MCMs processing time: 6 µs
PASA
TRAP
1080 optical links @2.5GbpsO
RI
ORI
G T U
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 3
Line Fit & Global Tracking
module profile
tracklet processor
straight line fit via a linear regression method
searching for dedicated patterns (for energy cut and electron - pion separation)
raw data buffer
global tracking
projection of ‘tracklets’ to a virtual plane
searching for tracklets belonging together
perform (accurate) energy cut and generate trigger
tracklets
virtualplane
projectedtracklets
track
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 4
FEE development
How to design the FEE with so many channels?- fast and with low latency- low power - precise power control (1mW/channel → 1.2kW)- low cost
- minimize using connectors- use some simple chip package (MCM + ball grid array)- standard components? which process? IP cores?
- flexible, as much as possible- make everything configurable- use CPU core(s) for final processing
- reliable, no possibility to repair anything later- redundancy, error and failure protection- self diagnostic features
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 5
FEE development(2)
Chip Design flow1. Detector simulations to understand the signals we get and what kind
of processing we need2. Select PASA shaping time, ADC sampling rate and resolution3. Behavior model of the digital processing including the bit-precision
in every arithmetic operation4. Estimate the processing time and select the clock speed of the design
(multiple of LHC and ADC sampling clock)5. Code the digital design, simulate it, synthesize it, estimate the timing
and area, optimize again…6. Submit the chip, this is the point of no return!7. Continue with the simulations, find some bugs and think about fixes8. Prepare the test setup
And so on, TRAP1, 2, TRAPADC, miniQPM, TRAP3, TRAP3a (final)
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 6
Partitioning, Data Flow & Reduction
PASA ADCTracklet
Preprocessor
TPPGTU
TrackletProcessor
TP
NetworkInterface
NI
L1 triggerto CTP
to HLT& DAQ
detector6 layers
1.2 millionanalog channels
chargesensitive
preamplifiershaper
10 Bit ADC10 MSPS
21 channels
digital filterpreprocess data
event buffer
builds readout treefor
trigger & raw data
fit tracklets fortrigger functionality
process raw datamonitoring
merge trackletsinto tracks for
triggerprocess raw data
for HLT
TRD
MCM - Multi Chip Module
event bufferstore raw data until L1A
during first 2 µs (drift time)33 MB
16 TB/s257 GB/s
1
after 3.5 µstime:data / event:
peak rate:mean rate:reduction:
after 4.1 µsmax. 80 KB600 GB/s
-~ 400
after 6 µssome bytes
--
to trigger decision
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 7
Detector Readout
2
4
4.6
6time [µs]
Pretrigger
Drift
Processing
Transmissiontime
GTU finished
0 1,200,000 channels, 1,400,000 ADCs10 bit 10 MSPS, 20 samples/event
preprocessing
65,000 MCMs, 18+3 channels/MCM,4 CPUs at 120 MHz
4,100 Readout Boards, 16+1(+1) MCMs8 bit 120MHz DDR readout tree
1,080 Optical Readout Interface links2 links/chamber at 2.5 Gb/s
90 Track Matching Units (GTU)1 TMU/module, FPGA based
Central Trigger Processor
BM
HCM
ORI
12 optical links
3+own:1
4:1
4:1
TMU5:1
DDL
x18
CTP
TGU
. . .
. . .
SMU
G T
U
A+D
Mer
ger
MCM
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 8
TRAP development – a long long way
MCM for 8 channels with the first prototypes of the Digital chip and Preamplifier, commercial ADCs Beg 2001
First tested TRAP chip, in „spider“ mode Summer of 2002
FaRo1AMS0.35μm
TRAP1UMC0.18μm
PASAAMS0.35μm
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 9
Multi Chip Module
Digital Frontendand Tracklet Preprocessor
CPU cores, memories & periphery
Network Interface
PASA
MasterState
Machine
External Pretrigger
SerialInterface
ReadoutNetwork
GlobalI/O-Bus
Internal ADCs (Kaiserslautern)
Serial Interface
slave
4 cm
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 10
-1.0
-0.5
0.0
0.5
1.0
Diff
eren
tial o
utpu
t vol
tage
, V
0.0 500.0n 1.0µ 1.5µ 2.0µ-1.02
-1.00
-0.98
Time, s
PASA - Preamplifier and Shaper
PASA - Preamplifier and Shaping Amplifier
FWHM (shaping time): 120 ns ENC: 850 electrons at 25 pF Gain: 12.5 mV/fC Integral Nonlinearity: 0.3% Power: 12 mW / channel Process: 0.35µm AMS Area: 21.3 mm²
ChargeSensitive
Preamplifier
P/Zcancellation
Shaper 1InputPadsx18
Shaper 2
diff outputto the ADC
Programmable test generator
EN_i
Vref
i1
dig_cntrl1
CLK
DIN
STR
DAC[7..0]
EN[17..0]EN[17..0]
CLK
DIN
STR
DAC
18x
Vref
VdnVup
V. Catanescu, H.K.Soltveit
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 11
Hit Selection
Hit Detection
FittingUnit
Fit Register File
FittingUnit
FittingUnit
FittingUnit
IMEMDMEM
GRF
FIFOFIFOFIFOFIFO
NI
CPU
IMEM
CPU
IMEM
CPU
IMEM
CFG
SCSN
GSM
StandbyArmedAcquireProcess
Send
TRAP
EventBuffer
Filter
21 ADCs
Nonlinearity Correction
Pedestal Correction
Gain Correction
Tail Cancellation
Crosstalk Suppression
Digital Filters
CPU
DMEM
PRFGRF
CONST
FRF
IMEM
DecoderPC
ALU
Flags
Bus(NI)Pipe 1 Pipe 2
CPU
TRAP block diagram10 bit 10 MHz,12.5 mW, low
latency
64 samples
4x RISC CPU @ 120 MHz8 bit 120 MHz DDR
24 Mb/s serial networkMemory:
4 x 4k for instr.1k x 32 Quad ported for dataHamming prot.
4 x 8 bit 120 MHzDDR inputs
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 12
ADC – principle of operation
Principle of the cyclic AD-ConversionExample of quantization process
essential Operations:- comparision- +/- V ref- x2- (Sample & Hold)
x2
x2
1 0 1
threshold
000
111
binary decisions
ADC- Stage:
Vin
Bi
COMP
S&H
+/-Vref
Vref
x2
DAC
Designed in Uni-Kaiserslautern,R.Tielert, D.Muthers
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 13
ADC parameters
Resolution 10bitSampling Rate 10.4MS/sProcess 0.18um CMOS + MIMCAPSSize (per ADC) 0.11mm2
Power 12.5mW @ 10.4MS/sInput (pro.) +/-1V - +/-1.4VSupply 1.8V, 3.3VDNL - 0.4 / +0.6 LSBINL - 0.8 / +0.7 LSBENOB @1MHz Signal 9.5 bitSNR @1MHz Signal 59.2 dBSFDR @1MHz Signal 73.0dBTHD @1MHz Signal - 69.5dB
0
200
400
600
800
1000
50 100 150 200 250 300 350 400
AD
C o
utp
ut
110 kHz sinwave, 10 MSPS, RMS=0.39, ENOB=9.57
-1
0
1
50 100 150 200 250 300 350 400
RE
S
samples
110 kHz sinwave, 10 MSPS, RMS=0.39, ENOB=9.57
CPU running
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 14
Filter and Tracklet Preprocessor
DFIL
Event Buffer
Condition Check
DFIL Event Buffer
DFIL
Event Buffer
Condition Check
DFIL
Event Buffer
Condition Check
DFIL Event Buffer
DFIL
Event Buffer
Condition Check
Qhit
hit
Qhit
Qhit
18+1 channels
COGPosition
CalcLUT
Para -meterCalc
Hit
Sel
ectU
nit (
max
. 4 h
its)
PositionCalcLUT
Para -meterCalc
PositionCalcLUT
Para -meterCalc
PositionCalcLUT
Para -meterCalc F
IT R
egis
ter
Fil
e an
d tr
ackl
et s
elec
tion
CPU0
CPU1
CPU2
CPU3
L R
C
A ACOG
A
Q
COG
COG
COG
ADC
ADC
ADC
ADC
ADC
ADC
Non-Lin
Offs Tail-canc
Gain Cross-talk
Digital FILter64 timebins deep
FIT register file is for the CPUs a readonly register file
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 15
Filter & Preprocessor21
dig
ital c
hann
els
from
AD
Cs
(10
MH
z)
21
x
preprocessor10
10
hist
ory
fifo
nonl
inea
rity
corr
.
pede
stal
cor
r.
gain
adj
ustm
ent
tail
canc
ella
tion
cros
stal
k ca
ncel
l.
even
t buf
fer
12
12 chan
nel s
elec
tion
ma
x. 4
po
sitio
n
posi
tion
calc
ulat
ion
posi
tion
corr
ectio
n 8
8
calc
ulat
e su
ms
for
regr
essi
on
sele
ct c
andi
date
s
ma
x. 4
ca
nd
.
232
232
to p
roce
sso
r
filter & event buffer
C
L
Ry y+1y-1
amp
posCOGorigin
deflection
time
bins
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 16
MIMD Processor
CPU Harvard style architecture two stage pipeline 32 Bit data path register to register operations fast ALU
32x32 multiplication 64/32 radix-4 divider
maskable interrupts synchronization mechanisms
MIMD processor 4 CPUs shared memory / register file global I/O bus arbiter separate instruction memory coupled data & control paths
decoder
pip
elin
ere
gis
ter
PC
ALU
select operands
CON FIT
write backinterrupt
I/O busarbiter
PRF
clks rst
powercontrol
externalinterrupts
CPU0
local I/O busses
global I/O bus
IMEM DMEM GRF
Preprocessor, 4 sets of fit data
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 17
SCSN - Slow Control Serial Network
Master(DCS)
Slave
Slave
Slave
Slave
Slave
ring
0
ring
1
Slave
serial
Slave
bridged
Master(DCS)
Slave
Slave
Slave
Slave
Slave
ring
0
ring
1
SCSN Up to 126 slaves per ring CRC protected 24 MBit/s transfer rate 16 addr., 32
data-bits/frame
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 18
NI Datapath
1616I/O 0
port0
1616
1616
1616
I/O 1
I/O 2
I/O 3
I/O G 16 16 16 16
10
16port1
10
16port2
10
16port3
10
16
16
10
Network Interface
FiFo64x16
clk
FiFo64x16
clk
FiFo64x16
clk
FiFo64x16
clk
local I/O 0
local I/O 1
local I/O 2
local I/O 3
global I/O
CPU 1
CPU 2
CPU 3
CPU 4
config
glo
bal
bu
s ar
bit
er
Processor
DMEM
IMEM
GRF
Network Interface
local & global I/O interfaces
input port with data resync. and DDR decoding
input FIFOs (zero latency)
port mux to define readout order
output port with DDR encoding and programmable delay unit
port4Delay units
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 19
Readout Boards and Readout Tree
MCM as Readout Network• data source only• data source and data merge• data merge only
2.5 Gbps850 nm
Half-chamber
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 20
0 200 400 600 800 10000
1
2
3
4
5
reg CPU 0 reg CPU 1 reg CPU 2 reg CPU 3
Total errors in CPU registerschip: isofruitI=60pA
To
tal b
it e
rro
rs
Time, sec
Radiation tests - TRAP
Hamming protection switched off
11 registers x 32 bit / CPU
Typical size of the real time CPU program <256 Words. Memory is fully protected from 1 bit error and will be refreshed periodically.
The data in event buffers remain <100s. We have parity bit for error detection.
0 200 400 600 800 10000
100
200
300
400
Total errors in instruction memory (each 4k x 32)chip: isofruitI=60pA
im0 im1 im2 im3
To
tal b
it e
rro
rs
Time, sec
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 22
Test flow of the MCM testing
• Apply voltages, control the currents• JTAG connectivity test • Basic test using SCSN (serial configuration bus)• Test of all internal components using the CPUs• Test of the fast readout• Test of the ADCs by applying 200 kHz Sin-wave• Test of the PASA by applying voltage steps through serial capacitors• Store all data for each MCM in separate directory, store in XML file the essential results• Export the result for MCM marking and sorting
Programmablesupply voltages,current control
MCM(DUT)
Progr.StepGener.
Progr.Sin-waveGener.
FPGA2
MCM0
MCM3
MCM1
MCM2
FPGA1
PCI(PC)
LVDS8 bit120MHzDDR
SCSN
+1.8Vd
+3.3Vd
+1.8Va
+3.3Va
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 23
TRAP internal tests
IM0DM
4 x 4k x 24 1k x 32
DB
256 x 32
StateM (~25)
NI (~12)
IRQ(64)
Counters(8)
Const(20)
~130
LUT-nonl(64)
Gain corr(42)
LUT-pos(128)
Fil/Pre(~44) ~280
Arbiter/
SCSN slv
r1 r0
CPU0
Global bus
r1 r0
Port 3
In total 434 configuration registers
NPGTC
Eventbuff
ADC
21x
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 24
TRAP wafer test and results
576 TRAP chips/wafer
Fully automatic partial test of the TRAP.
Serial configuration interface
Parallel readoutProgrammable
sin-wave generator
A
Programmable power supply
TRAP
Prep
100%
06/02
06/06 06/09 07/01
07/03-107/03-2
2525 49 49 3 2525
75%Up to now produced and tested 201 wafers with ~115,000 TRAPs, of them ~86,000 usable
25 n
ew w
afer
s
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 25
MCM Tester and results
T.Blank, FZK IPE (Karlsruhe)
• test of 3x3 or 4x4 MCMs• digital camera with pattern recognition software for precise positioning using an X-Y table• vertical lift for contacting• about 1 min/MCM for positioning and test
• store the result into a DB• mark later the tested MCMs with serial Nr. and test result code
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 26
0
2
4
6
8
10
12
-6 -4 -2 0 2 4 6
%, 100%
= 8
1408
0
2
4
6
8
10
12
90 95 100 105 110
%, 100%
= 8
1408
0 5
10
15
20
25 1
.2 1
.22
1.2
4 1
.26
1.2
8 1
.3 1
.32
1.3
4 1
.36
1.3
8 1
.4
%, 100% = 27136
TRAP ADC – conversion gain variation
TR
AP
AD
C V
ref,
V
Channel-Channel
variation on the same TRAP +/-2 % Chip-Chip
variation
+/- 5 %
Statistics based on ~27000 TRAPs
Amplitude, %
1.2
1.22
1.24
1.26
1.28
1.3
1.32
1.34
1.36
1.38
1.4
90 95 100 105 110
Essential for tracking!
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 27
TRAP ADC – programmable conv. gain
Ampl(DAC=0)/Ampl(DAC=10000b), % Ampl(DAC=11111b)/Ampl(DAC=10000b), %
The conversion gain can be changed by the ADCDAC parameter (5 bit) for all ADCs in the chip. This can be used to adjust the conversion gain according to the gas-gain variation on the chamber. The variations within one chip can be compensated by the digital gain correction in a range of +/- 12%.
0
1
2
3
4
5
6
7
8
115.5 116 116.5 117 117.5 118 118.5
%,
100%
= 8
1408
0
1
2
3
4
5
6
7
8
9
10
86 86.5 87 87.5 88
%,
100%
= 8
1408
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 28
TRAP ADC – summary
• The ADC reference voltage varies about +/- 5%. • The ADC conversion gain varies about +/- 5% and corresponds to the variation of the Vref. • The ADC conversion gain variation from channel to channel in the same TRAP is about +/- 2%.•The relative conversion gain controlled by the ADCDAC parameter has small variation (about +/- 0.5%).
• At ADCDAC=0 the measured amplitude is 1.170 times the amplitude at ADCDAC=10000b. • At ADCDAC=11111b the measured amplitude is 0.870 times the amplitude at ADCDAC=10000b.
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 29
Optical Readout Interface (ORI)
How to push the data out of the FEE?- no handshaking possible- magnetic field- don’t disturb the sensitive PASAData rate 240 MB/s- because of 8b/10b encoding this means at least 2.4 Gbps- full custom chip started, but not finished at the right time (OASE)- simple board based on commercially available chips- use local high stability quartz oscillator instead of LHC clock- use custom laser driver circuit to avoid using commercial SFP modules with unspecified stability in magnetic field
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 30
Optical Readout Interface (ORI)
LV
DS
-TT
L
CPLDSERDES2.5GBits/s
Laser Diode850 nm
DDR SDRResynchronization, status, counters
125MHz
Laser Driver
Conf. Mem.120MHz
8 bitDDR
+24 ns
VCSEL
+24 ns +300 ns
HC
M (
TR
AP
)
latency
16
I2C
All 1200 produced and tested, 1199 of them fully functional
Magnetic field & radiation tolerant!
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 31
ORI – production test (laser diode)
Several parameters controlled:
- Supply currents at various operation mode conditions- Voltages on the board in enabled/disabled state- Source, bias and modulation currents through the laser diode- Temperature of the laser driver chip- optical output power- photodiode current measured by the laser driver chip
0
5
10
15
20
25
1 2 3 4 5 6
% o
f O
RIs
, 1
00
%=
96
9
Ibias, mA at optical power 200uW
Ibias, mA at 200uW
0
2
4
6
8
10
12
14
16
18
20
0 50 100 150 200 250
% o
f O
RIs
, 1
00
%=
96
9
Imonitor, uA at optical power 200uW
Imonitor, uA at 200uW
0
5
10
15
20
25
0.35 0.36 0.37 0.38 0.39 0.4 0.41
% o
f O
RIs
, 1
00
%=
96
9
Ireset, A
Ireset, A
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 32
ORI radiation test in OsloNo permanently damaged components. Equivalent time in years in blue, * - the device fails.Recovery after power cycle or switching off for up to 12 hours (VR and Laser Driver).The configuration of the CPLD and EEPROM was not damaged.
CPLDLatticeLC4256V
15-50*
LVDS TransceiverNational Semiconductor
DS90LV048A 30
EEPROM24LC01 30*
Laser DriverLinear Technology
LTC5100 100*
VCSEL DiodeULM Photonics
ULM850-02-LC-TOSA 20
SerializerTexas InstrumentsTLK2501
250*
Voltage RegulatorsNational Semicondutor
LP3962-3.3 and -2.5 60-110*F.Rettig
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 33
Charge Cluster to Tracklet
•Local tracking units on detector perform linear fits and reject uninteresting data
Charge Cluster to Tracklet
•Local tracking units on detector perform linear fits and reject uninteresting data
TRD Trigger Timing
origin
deflection
tim
e
bin
s
“Tracklet”(32-Bit word):• y
position (origin)•
slope (deflection)• z
position (pad-row number)•
charge
♦ Trigger decision after 6 µs♦ Charge drift and data pre-processing uses
most of the time
♦ Trigger decision after 6 µs♦ Charge drift and data pre-processing uses
most of the time
Global Tracking
•Inside GTU (Global Tracking Unit)
•Objective: find high momentum tracks
•Search for tracklets belonging together
•Combine tracklets from all six layers
•Reconstruct pt, compare to
threshold and generate trigger
•Constraint:only approx. 1.5 µs processing time
Global Tracking
•Inside GTU (Global Tracking Unit)
•Objective: find high momentum tracks
•Search for tracklets belonging together
•Combine tracklets from all six layers
•Reconstruct pt, compare to
threshold and generate trigger
•Constraint:only approx. 1.5 µs processing time
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 34
Track Re-assembly
• Search for tracklets belonging together(3-dimensional matching task)Projection of tracklets to virtual planeSliding window algorithmA track is found, if ≥ 4 tracklets from different layers inside same multi-dimensional window
Track Re-assembly
• Search for tracklets belonging together(3-dimensional matching task)Projection of tracklets to virtual planeSliding window algorithmA track is found, if ≥ 4 tracklets from different layers inside same multi-dimensional window
Online Track Reconstruction
Reconstruction of the Transverse Momentum
•Calculate linear fit of (unprojected)y positions of tracklets. Estimate transverse momentum fromline parameter a uses look-up tables, additions and multiplications
Reconstruction of the Transverse Momentum
•Calculate linear fit of (unprojected)y positions of tracklets. Estimate transverse momentum fromline parameter a uses look-up tables, additions and multiplications
Module Profile
VirtualPlane
ProjectedTracklets
Track
a
y
b
x
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 35
GTU in the Framework
OnlineTracking
Trigger
Busy
Pre-Trigger based on TOF, L1 contribution based on GTU Cosmic Trigger
CTP GTU
Pre-TriggerT0 TOF
V0 ACORDE
Other Detectors
Other ALICE Systems TRD Systems
Outside Magnet
Inside Magnet
Pre-Trigger, L0, L1
Modified Tracklet Data
L0 contrib, Busy
TTC (L0, L1, L2, …)
L1 contrib, Busy
TGU
Szintillators
SMUsSMUsTMUsTMUsTMUs
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 36
Global Tracking Unit in the TRD Readout Chain
ORI
ORI
Module
Stack
TriggerDesign
EventBufferingDesign
Trackletsonly
Tracklets&
Raw Data
RX
TMU
5x
SMU
TriggerHandling &
Control
GTU Supermodule Segment
Front-End Electronicswithin L3 magnet
Half Chamber
DCSBoardTTCrx
DDLSIU
DATESoftware
DIU
D -RORC
DAQ System
CentralTrigger
Processor
TGU18
x
TrackConcentr.
GTU Racks
12
fib
res
Racks belowMuon System
L1 contribution+ busy
TTC triggersignals
♦ Track Matching Unit (TMU)
♦ Supermodule Unit (SMU)
♦ Trigger Generation Unit (TGU)
Storage
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 37
Event Buffering Design
16
/12
8
Event Shaper
SRAMController
write read
16
/12
8
ReadAddress
Logic&
Control
DataFormatting
TMU/SMUInterface
Readout Unit
4-MbitSRAM
SMU Control(L0-/L1-Trigger)
Scheduling
MemoryManagement
Even
t (n
, 0
)
Even
t (n
+1
, 0
)
Even
t (n
, 1
1)
Even
t (n
+1
, 1
1)
Event InfoFIFO Buffer
Superm
odule
Unit
Links
from
one S
tack
... 1
2x ...
SMU Control(L2-Message)
Data Block, Link 0 Data Block, Link 11
... 12x ...
em
pty
ad
dre
ssdata
status
DataStreamMerging
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 38
Trigger – Simulation Framework
AliRoot-based framework of simulation classes:AliEn integration, Offline-RawReader, TRAP/GTU simulation
CASTOR/AliEn
orFiles
OfflineRawStrea
mReader
TRD ResponseModel
Half-Chamber Models
TRAP models
18 SM, 5 Stacks, 6 Layers, 2 Half-Chambers
GTU Trigger Model
TMUsTMUsTMU
Models
SMUsSMU
Models
Trac
klets
Input of „empty“ and interesting events
TGUModel
Configuration parameters
Trigger Efficiency/Purity
Assessment
Simulation Framework
ADC data L1 contrib
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 39
Trigger Schemes
♦Di-Lepton Trigger• Tracking developed by Jan de Cuveland• Transverse momentum cuts
• simple coincidence of “high-pt seen flags” from different supermodules• Optimized for minimum latency → L1 contribution
♦Jet-Trigger• PhD thesis: Concept and Implementation of a Jet trigger in the GTU• fast selection of events (L1 contribution) based on
• thresholds for numbers of high-pt tracks within certain space regions/angular ranges• charge sums (if available) within certain space regions/angular ranges
• more complex calculations (invariant mass, …) and correlationsbeing under consideration currently → idea of L2 contribution by GTU
♦PhD theses on GTU trigger at KIP: S. Kirsch (A-A), F. Rettig (p-p)
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 40
Track Matching Unit (TMU)
• Trigger design currently optimized for high pt cut only• Simple di-lepton trigger foreseen in trigger unit (based on `binary‘ local decisions)
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 41
Track Matching Unit (TMU) Board
850 nmSFP-Transceiver
850 nmSFP-Transceiver
Com
pact
PC
I B
us
Com
pact
PC
I B
us
DDR2 SRAMDDR2 SRAM
(6U Height, Single Width)
3 Parallel Links (240 MHz DDR, 8
Bit LVDS)
From Left TMU
To Right TMU
To SMU Board
12 Fiber Optical Serial Links(2.5 GBit/s)
From 1 Detector
Stack
Virtex-4 FX100 FPGA:95k LCs, 768 I/Os,
20 Internal Multi-Gigabit Serializer/Deserializer
Units,2 PowerPC Cores
DDR2 SDRAMDDR2 SDRAM
DDR2 SRAM:High Bandwidth (28.8
GBit/s) Data Buffer
64 MB64 MB
4 MB4 MB
Config PROM
Config PROM
FPGA
Xilinx XC4VFX100 FF1152
Xilinx XC4VFX100 FF1152
Cust
om
LV
DS I/O
72
Pair
s
Cust
om
LV
DS I/O
72
Pair
s
JTAGJTAG
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 42
TRD Beam Test at CERN (2007-11)
November 2007 Beam Test Setupat CERN Proton Synchrotron
1 GTU Segment
1 TRD Super-module
(not toscale)
Fir
st
Tra
ckle
ts f
rom
TR
DMean: -0.0916 cm
RMS: 0.119 cm
Deflection Error / cm
Count
Single Tracklet Deflection Precision
♦ Accelerator: CERN Proton Synchrotron (PS)
♦ Particles: Electrons, Pions(Transverse Momenta: 0.5 – 6 GeV/c)
♦ Good statistics for detector calibration (More than 1 Mio. events per momentum value)
♦ 8 days of continuous operation♦ First run with tracklets, consistent with raw
data
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 43
Commissioning at CERN /ALICE 2008 Cosmics Runs
♦ Read-out chain running continuously♦ Writing cosmics data to DAQ via GTU♦ 60,000 events with pseudo-random test
pattern: No errors♦ Measured TRD readout timing: Fully able
to saturate DAQ link
GTU Installation at CERN(End of 2007, Work in Progress)
GTU Installation at CERN(End of 2007, Work in Progress)
TRD Readout Timing•Measured readout time for maximum black event(30 time bins):
•L0 → Event at GTU: 245 us (4 kHz)L0 → DDL Tx done: ~ 13 ms (~ 75 Hz)Divide times by a factor of 5–500 for zero-suppressed dataDDL performance: 120 Mbyte/s
•DDL utilization: 99 %
TRD Readout Timing•Measured readout time for maximum black event(30 time bins):
•L0 → Event at GTU: 245 us (4 kHz)L0 → DDL Tx done: ~ 13 ms (~ 75 Hz)Divide times by a factor of 5–500 for zero-suppressed dataDDL performance: 120 Mbyte/s
•DDL utilization: 99 %
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 44
Summary GTU
• The TRD Global Tracking Unit (GTU) is the central element of the ALICE TRD trigger system and readout chain
• It provides both complex online data analysis within tight low-latency requirements and high bandwidth raw data handling
• The GTU system has been fully installed
• Commissioning and extensive read-out and interoperability tests have been performed successfully
A GTU Segment in Operation
•Next steps•Further Tests of the online tracking and trigger functionality
• Preparations for first LHC operation in 2008
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 45
Questions?
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 46
SPARES
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 47
A Large Ion Collider Experiment
♦ p-p or Pb-Pb Collision
♦ Creation of Quark Gluon Plasma, a state of matter existing during the first few microseconds after the big bang
♦ TRD is used as a trigger detector due to its fastreadout time (2 µs):
• Transversal Momentum
• Electron/Pion Separation
Inner Tracking System (ITS) Time Projection Chamber (TPC)Transition Radiation Detector (TRD)
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 48
Transition Radiation Detector
TRD - Transition Radiation Detector used as trigger and tracking detector > 24000 particles / interaction in acceptance
of detector up to 8000 charged particles within the TRD trigger task is to find specific particle pairs
within 6 s.
ITS - Inner Tracking System event trigger vertex detection
TPC - Time Projection Chamber high resolution tracking detector but too slow for 8000 collisions / second
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 49
Tail Cancellation Filter
sub12 12
multaddregister multadd 15 15 15 15
12
13
13
multaddregister mult13 13 13 1312
12
13
12
unfiltered input
filtered output
LongExponetial
Coefficients
ShortExponetial
Coefficients
tSL
tLLTail tTRF 1)(
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 50
The TRAP chip bonded on the MCM
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 51
CPU 3 CPU 2
CPU 0 CPU 1
QuadPort
Memory
IMEM 0 IMEM 1
DataBuffer
IMEM 3 IMEM 2
FiFo
GRF
8 ch.evt.buf
13 c
h.e
vt.b
uf
21 independentDigital filterchannels
21 A
DC
Ch
ann
els
network IF
TRAP Layout
5x7mm
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 52
Delay unit
Experimental setup (50cm Experimental setup (50cm tp cable)tp cable)
0 1 2 3 4 5 6 7
0
1
2
3
4
5
6
7
8
out(0) out(4) simulation
dela
y, n
s
Programmed delay (0..7)
~ 3440 ps ~1720 ps ~860 psNI_out(9:0) + strobe
del[2:0] per bit
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 53
Tail Cancellation Optimization
Average Signal beforeTail Cancellation
Average Signalafter Tail Cancellation
Mean Signalduring Drift Time
Signal Decayafter Drift Time
tSL
tLLTail tTRF 1)(
2,,
,,,,
SLL
SLL
SLL Mean
DecayM
Minimization of Error Measure
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 54
Tracking Arithmetics
Hit Selection
PedestalSubstraction
Fit Register File
)(
)()( 11
tQ
tQtQ
i
ii
Look-UpTable
+
+ X(t)+ X(t)·Y(t) + X2(t)+ Y(t) + Y2 (t)+ Q (t) + 1
Timer
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 55
DCS board, slow control in ALICE
ARM+FPGA
EmbeddedLinux
Ethernet 10
TTC opticalReceiver
ADCs, Temp.
TRAP configSCSN interf.
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 56
The MIMD Architecture
FIT
CPU 0 CPU 1
D-MEM
CPU 2 CPU 3
GRF
Local Bus Local Bus
Local Bus Local Bus
Bus
Const.
Interrupt
Evt. Buffer
Config.
Net
wor
kIn
terf
ace
4
4
4x10
10
Four RISC CPU's Coupled by Registers (GRF)
and Quad ported data Memory Register coupling to the
Preprocessor Global bus for Periphery Local busses for
Communication, Event Buffer read and direct ADC read
I-MEM: 4 single ported SRAMs Serial Interface for Configuration IRQ Controller for each CPU Counter/Timer/PsRG for each
CPU and one on the global bus Low power design, CPU clocks
gated individually
Cnt/Timer
IMEM
IMEM
IMEM
IMEM
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 57
TRAP internal configuration
TRAPIM0
DM
4 x 4k x 24 1k x 32
DB
256 x 32
StateM (~25)NI (~12)IRQ(64)Counters(8)Const(20)~120
LUT-nonl(64)Gain corr(42)LUT-pos(128)Fil/Pre(~44) ~280
Arbiter/SCSN slv
r1 r0
CPU0
Global bus
r1 r0
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 58
The TRAP1 chip bonded on the MCM
1
2
3
4
5
6
1. ADC2. Filter/Preprocessor3. DMEM4. CPUs5. Network Interface6. IMEM
On-chip FuseID
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 59
Wafer test – bad contacts
26
40
26
39
5mm
7mm
42
40
20
29
5mm7m
m
Balance!
adc adc
Number of needles
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 60
Integration equations, simple but…
♦ PASA + TRAP = MCM♦ 17 or 18 MCMs connected properly = ROB (readout board)
♦ 6 or 8 ROBs + 1 ORI + 1 DCS connected properly = module (chamber) electronics
♦ 5 modules in one layer = one layer of the supermodule♦ x 6 layers = supermodule!♦ 18 supermodules = TRD!
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 61
Integration prototype, Hd 2004
MCM = PASA+TRAP
DCS boardTrigger & clock distribution, ARM CPU+FPGA, embedded Linux, Ethernet, serial link to the TRAPs …
Detector prototype prepared for the beam test at CERN
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 62
Supermodule I in Heidelberg, 2006
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 63
Installation of the first Supermodule
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 64
TMU - Momentum Reconstruction
• Estimate transverse momentum from line parameter a,
• Fixed point arithmetics, look-up tables, additions and multiplications only
• Fast cut condition for trigger: full calculation of p t too slow
aconst
pt
apconst t min
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 65
TMU - Track Re-assembly
• Search for tracklets belonging together(3-dimensional matching task)
• Projection of tracklets to virtual Plane
• Sliding window Algorithm
• A track is found, if ≥ 4 tracklets from different layers inside same multidimensional window
• Algorithm highly optimized for hardware primitives available
• only quantities really needed are computed!
Projected Tracklets Track
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 66
GTU: Hardware Production / Installation
♦ All components produced and installed
♦ 4 Types of PCBs• TMU/SMU Hardware• 130 PCBs (90 TMUs +
18 SMUs + 1 TGU + Spares)
• Main GTU (LVDS) Backplane
• 18 PCBs (+ Spares)• Trigger Concentrator
Backplane• 1 PCB (+ Spare)• CTP Interface Module• 1 PCB (+ Spare)
♦ 9 Wiener cPCI crates♦ Mechanical components,
custom front panels, ...
♦ All components produced and installed
♦ 4 Types of PCBs• TMU/SMU Hardware• 130 PCBs (90 TMUs +
18 SMUs + 1 TGU + Spares)
• Main GTU (LVDS) Backplane
• 18 PCBs (+ Spares)• Trigger Concentrator
Backplane• 1 PCB (+ Spare)• CTP Interface Module• 1 PCB (+ Spare)
♦ 9 Wiener cPCI crates♦ Mechanical components,
custom front panels, ...
Uplink to Data
Acquisition
FPGA/PROM Programming and Control via Ethernet
• 5 TMU boards and 1 SMU board per super-module
• In total: 108 FPGA nodes
60 Fiber Optical Links
One of the 18 GTU Segments
TTC Input
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 67
Cosmic Trigger – Current Selection Criteria
C
C
C
C
C
C
•TRAP cosmic mode: hit detection,fit selection and dedicated tracklet building
•TMU: upper limit on charge/hits for selection of individual tracklets → sort out perturbances like hot pads, high amplitude border noise and
600 kHz oscillations
•SMU: threshold on charge/hit sums for each chamber, currently nearly open (approx. one true tracklet)
•SMU: threshold on number of “active” layers per stack, currently 4
Currently used:
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 68
Trigger OutFrom TMU 0From TMU 1From TMU 2From TMU 3From TMU 4
SMU Concentrator Board
(6U Height, Double Width)
5 Parallel Links (120
MHz DDR, 8 Bit LVDS)
ALICE Detector Data Link (DDL)
ALICE Timing, Trigger &
Control Input (TTC)
Ethernet: System
Configuration and Control
Com
pact
PC
I B
us
Com
pact
PC
I B
us
Cust
om
LV
DS I/O
72
Pair
s
Cust
om
LV
DS I/O
72
Pair
s
JTAGJTAGDDR2 SRAMDDR2 SRAM
DDR2 SDRAMDDR2 SDRAM
64 MB64 MB
4 MB4 MB
Config PROM
Config PROM
FPGA
Xilinx XC4VFX100 FF1152
Xilinx XC4VFX100 FF1152
850 nmSFP-Transceiver
850 nmSFP-Transceiver
RJ 45RJ 45
Detector Control System (DCS)
Board
Detector Control System (DCS)
Board
DDL - Source Interface Unit (SIU)
DDL - Source Interface Unit (SIU)
TTCTTC
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 69
Cosmic Trigger – Nice Examples for Tuning
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 70
Cosmic Trigger – First Results
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 71
Cosmic Trigger – First Resultsev 18
ev 18
ev 20
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 72
Cosmic Trigger – Additional Trigger Criteria
As soon as TOF delivers proper L0 trigger or pre-trigger input:
•Stronger settings for all thresholds/limits,e.g. requirement of 5 active layers/stack
•Coincidence between supermodules
trg
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 73
Cosmic Trigger – TOF Pre-triggerTOF coincidencefor pre-trigger box input
Pict
ure
by M
inJu
ng, T
RD A
nnua
l Mee
ting
2008
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 74
Cosmic Trigger – MCM Level
Modified tracklets for cosmics trigger: no tracklet parameters, instead…ScaledCharge Sum
over all time bins and channels
8 Bit
RelevantMCM
ConfigurationParameters
8 Bit
Number of Hits
8 Bit
RelevantMCM
ConfigurationParameters
8 Bit
TrackletWord
360.000
255
Charge Sum
Charge Sum
N
N
Details: Presentation byVenelin Angelov, 30.06.2008
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 75
Cosmic Trigger – Chamber Level (TMU)
Summation of tracklet numbers, charge and hit numbersover all tracklets from both half-chamber links of each chamber.
HC HC
HC HC
HC HC
HC HC
HC HC
HC HC
Threshold conditions for trigger:
• Charge summed up over all tracklets per chamber
• Hits summed up over all tracklets per chamber
• Number of Tracklets per chamber (normal tracklets)
trg
trg
trg
trg
trg
trg
thresholds
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 76
Cosmic Trigger – Stack Level (SMU)Trigger condition: number of “active” chambers per stack
Optionally: minimum/maximum limit for number of stacks hit
C C
C C
C C
C C
C C
C C
thresholds
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
trg trg trg trg trg
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 77
Cosmic Trigger – Supermodule Level (TGU)
Trigger conditions:
♦single supermodule
♦one-to-many correlations between supermodules
Optionally:
♦“same-side veto” in one-to-many correlation
trg
trg
trg
trg
Events like shown above may be very rare → experience, rates?
Such events rejected by TOF-/ACORDE-coincidence settings?
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 78
Cosmic Trigger – Next Steps♦Precise measurements of all contributions to latency in pre-trigger, tracklet calculation and GTU tracking/trigger needed
♦Increase the number of time bins with pre-trigger in operation
♦Shift CTP L1 contribution time window to 6.2us?
♦Need for additional cosmic trigger schemes?
♦Update Münster Setup• benchmarking of cosmic trigger performance with Münster setup,
L1 trigger decision recorded in SMU header→ offline comparison: easy to determine purity and efficiency
• use L1 contribution as trigger too?→ adaption of setup to L1 contribution and generate L1a/r
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 79
Cosmic Trigger – Bad Examples for Tuning
KIP Uni-Heidelberg, V.Angelov ALICE Workshop Sibiu 2008 80
ORI - production test (data path)
Independently tested the following interfaces and components:
• I2C interface from the TRAP chips on the readout board and the configuration memory of the laser driver chip• J2C interface from the TRAP chips on the readout board and the configuration and status registers in the CPLD• JTAG to the CPLD• 120 MHz DDR parallel data between the half-chamber merger TRAP and the CPLD on ORI• Data word and parity error counters in the CPLD• Optical data transmission using a receiver module mounted on a PCI card
The optimal configuration for two optical power settings 200 and 300µW stored together with all other test data to a database.