DØ LEVEL 3 TRIGGER/DAQ SYSTEM STATUS
G. Watts (for the DØ L3/DAQ Group)
“More reliable than an airline*” (*or the GRID)
Overview of DØ Trigger/DAQ

• Full detector readout after a Level 2 accept
• A single node in the L3 farm makes the L3 trigger decision
• Standard scatter/gather architecture
• The event size is now about 300 KB/event
• This is the first full detector readout; L1 and L2 use some fast-outs
Standard HEP tiered trigger system:
• Level 1 (firmware): 1.7 MHz in, 2 kHz out
• Level 2 (firmware + software): 2 kHz in, 1 kHz out
• DAQ / L3 trigger farm (commodity): 1 kHz at 300 MB/sec in, 100 Hz at 30 MB/sec out
• Online system (commodity): records the 100 Hz output (the bandwidth arithmetic is checked below)
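Those bandwidth figures follow directly from the tier rates and the ~300 KB event size; a quick back-of-the-envelope check in Python:

```python
# Back-of-the-envelope check of the trigger-tier bandwidths quoted above.
EVENT_SIZE_KB = 300            # ~300 KB/event after full detector readout

l2_accept_rate_hz = 1_000      # event rate into the DAQ/L3 farm
l3_accept_rate_hz = 100        # event rate written out by L3

into_farm_mb_s = l2_accept_rate_hz * EVENT_SIZE_KB / 1_000   # KB/s -> MB/s
out_of_farm_mb_s = l3_accept_rate_hz * EVENT_SIZE_KB / 1_000

print(f"into L3 farm: {into_farm_mb_s:.0f} MB/sec")    # 300 MB/sec
print(f"out of L3:    {out_of_farm_mb_s:.0f} MB/sec")  # 30 MB/sec
```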
Overview Of Performance
System has been fully operational since March 2002.
Two pressures drive the system:
• The Tevatron increases luminosity → the number of multiple interactions increases → increased noise hits → increased data size and more CPU time per event.
• Physicists get clever → trigger list changes → more CPU time per event.
The trigger software is written by a large collection of physicists who are not real-time programmers. CPU time per event has more than tripled.

Continuous upgrades since operation started:
• About 10 new crates have been added.
• Started with 90 nodes; now have almost 250, none of them original.
• Single-core at the start; the latest purchase is dual quad-core.
• No major unplanned outages: an overwhelming success.
Trigger hardware improvements mean rejection moves to L1/L2.
The system runs 24/7.
[Luminosity plot: over an order of magnitude increase in peak luminosity. Constant pressure: L3 deadtime shows up in the gap between the curves!]
Basic Operation

Data flow:
• Directed, unidirectional flow
• Minimize copying of data
• Buffered at origin and at destination

Per-event control flow:
• 100% TCP/IP
• Bundle small messages to decrease network overhead (sketch below)
• Compress messages via configured lookup tables
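As an illustration only (not the DØ code), here is a minimal sketch of those two optimizations: small per-event messages are packed into batches, and verbose names are compressed to small integer IDs through a lookup table agreed on at configure time. All names and wire formats here are assumptions:

```python
import struct

# Hypothetical lookup table, distributed at configure time, that lets both
# ends refer to a farm node by a 2-byte ID instead of a string name.
NODE_IDS = {"farmnode01": 1, "farmnode02": 2, "farmnode03": 3}

def bundle(decisions, batch_size=10):
    """Pack (event_number, node_name) pairs into batched wire messages.

    One TCP send per tiny message wastes bandwidth on headers; batching
    ~10 of them amortizes that overhead.
    """
    messages = []
    for i in range(0, len(decisions), batch_size):
        batch = decisions[i:i + batch_size]
        payload = b"".join(
            struct.pack("!IH", event, NODE_IDS[node])  # 4-byte event + 2-byte node
            for event, node in batch
        )
        messages.append(struct.pack("!H", len(batch)) + payload)
    return messages
```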
The DAQ/L3 Trigger End Points
[Diagram: Read Out Crates (ROCs) on one side, farm nodes on the other.]
• Read Out Crates (ROCs) are VME crates that receive data from the detector.
• Most data is digitized on the detector and sent to the Movable Counting House.
• Detector-specific cards sit in the ROC; the DAQ hardware reads out the cards and makes the data format uniform.
• The event is built in the farm node: there is no separate event builder.
• The Level 3 trigger decision is rendered in the node.
• The farm nodes are located about 20 m away (electrically isolated).
• Between the two is a very large CISCO switch…
Hardware
The ROCs contain a Single Board Computer (SBC) to control the readout:
• VMIC 7750s: PIII at 933 MHz, 128 MB RAM
• VME access via a PCI Universe II chip
• Dual 100 Mb Ethernet; 4 have been upgraded to Gb Ethernet due to the increased data size

Farm nodes: 288 total, 2 and 4 cores per pizza box
• AMD and Xeon CPUs of differing classes and speeds
• Single 100 Mb Ethernet
• Fewer than at the last CHEP!

CISCO 6509 switch:
• 16 Gb/s backplane
• 9 module slots, all full
• 8-port Gb modules
• 112 MB of shared output buffer per 48 ports
Data Flow
[Diagram: the DØ Trigger Framework and Routing Master steer event fragments from the ROCs to the farm nodes.]
The Routing Master coordinates all data flow:
• The RM is an SBC installed in a special VME crate interfaced to the DØ Trigger Framework (TFW), which manages the L1 and L2 triggers.
• On each L2 accept, the RM receives the event number and the trigger bit mask of the L2 triggers (a sketch of this record follows).
• The TFW also tells the ROCs to send that event’s data to the SBCs, where it is buffered: the data is pushed to the SBCs.
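A minimal sketch of the record the RM works from on each L2 accept; the field names and bit width are illustrative, not the real TFW layout:

```python
from dataclasses import dataclass

@dataclass
class L2Accept:
    """What the Trigger Framework hands the Routing Master per L2 accept."""
    event_number: int
    l2_trigger_mask: int  # bit i set => L2 trigger bit i fired

    def fired_bits(self, n_bits=128):
        """The trigger bits set in the mask (bit width assumed here)."""
        return [i for i in range(n_bits) if (self.l2_trigger_mask >> i) & 1]
```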
The RM assigns a node: the RM decides which farm node should process the event (sketched below).
• It uses the trigger mask from the TFW and the run configuration information.
• It factors in how busy a node is, which automatically takes into account the node’s processing ability.
• Ten decisions are accumulated before being sent out, to reduce network traffic.
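A sketch of that assignment policy under stated assumptions: eligible nodes come from the run configuration for the fired trigger bits, and the candidate with the most free event buffers wins. Since a fast node drains its buffers sooner, its processing ability is accounted for implicitly. The names and the exact policy are inventions, not the DØ algorithm:

```python
def assign_node(accept, nodes_for_bit, free_buffers):
    """Choose the farm node for one L2Accept (see the sketch above).

    nodes_for_bit: run-configuration map, trigger bit -> eligible node names
    free_buffers:  node name -> event buffers the node has reported free
    """
    candidates = set()
    for bit in accept.fired_bits():
        candidates.update(nodes_for_bit.get(bit, ()))
    if not candidates:
        raise RuntimeError(f"no node configured for event {accept.event_number}")
    # Most free buffers == least busy; faster nodes free buffers sooner,
    # so node speed is folded in automatically.
    return max(candidates, key=lambda node: free_buffers.get(node, 0))
```

The resulting decisions would then be accumulated, ten at a time, into the bundled messages sketched earlier.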
The data moves:
• The SBCs send all event fragments to their proper node.
• Once all event fragments have been received, the farm node notifies the RM (if it has room for more events); a toy sketch of that bookkeeping follows.
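Because there is no central event builder, each farm node does its own assembly; the class and method names here are hypothetical:

```python
class EventBuilder:
    """Assemble one event from per-crate fragments inside a farm node."""

    def __init__(self, expected_crates):
        self.expected = set(expected_crates)  # from the static run configuration
        self.fragments = {}                   # crate id -> raw fragment bytes

    def add_fragment(self, crate_id, data):
        """Store one fragment; return True once the event is complete."""
        self.fragments[crate_id] = data
        return set(self.fragments) == self.expected

    def event(self):
        """Concatenate fragments in crate order for the L3 filter."""
        assert set(self.fragments) == self.expected
        return b"".join(self.fragments[c] for c in sorted(self.fragments))
```

Once an event completes and buffer space frees up, the node notifies the RM, closing the back-pressure loop.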
Configuration

• DØ online configuration design
• The Level 3 Supervisor
• Configuration performance
The Static Configuration
Farm nodes:
• Which read out crates to expect on every event
• The trigger list programming
• Where to send the accepted events and how to tag them

Routing Master:
• Which read out crates to expect for a given set of trigger bits
• Which nodes are configured to handle particular trigger bits

Front end crates (SBCs):
• The list of all nodes that data can be sent to

Much of this configuration information is cached in lookup tables for efficient communication at run time (a sketch follows).
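A sketch of how such a static configuration might look, with the run-time lookup table precomputed once at configure time; the structure and names are assumptions for illustration:

```python
# Illustrative static configuration, fixed at configure time.
RUN_CONFIG = {
    "farm_nodes": {
        # crates expected per event, trigger bits handled, output destination
        "farmnode01": {"crates": [1, 2, 3], "triggers": [0, 4], "output": "collector01"},
        "farmnode02": {"crates": [1, 2, 3], "triggers": [4, 7], "output": "collector01"},
    },
    # every SBC knows the full list of nodes it may be told to send data to
    "sbc_destinations": ["farmnode01", "farmnode02"],
}

def build_lookup_tables(config):
    """Invert the per-node config into the table the RM consults per event."""
    nodes_for_bit = {}
    for node, cfg in config["farm_nodes"].items():
        for bit in cfg["triggers"]:
            nodes_for_bit.setdefault(bit, []).append(node)
    return nodes_for_bit

NODES_FOR_BIT = build_lookup_tables(RUN_CONFIG)
# e.g. {0: ['farmnode01'], 4: ['farmnode01', 'farmnode02'], 7: ['farmnode02']}
```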
The DØ Control System

[Diagram: the DØ shifter and the configuration database drive “COOR”dinate (the master run control), which configures the Calorimeter, Level 2, Level 3, and Online Examines; Level 3 in turn controls the Routing Master, the SBCs, and the farm nodes.]

Standard hierarchical layout:
• Intent flows from the shifter down to the lowest levels of the system.
• Errors flow in reverse.
Level 3 Supervisor

Input:
• COOR sends a state description down to Level 3: the trigger list, the minimum # of nodes, where to send the output data, and which ROCs should be used.
• L3 component failures, crashes, etc.; the Supervisor updates its state and reconfigures if necessary.

Output:
• Commands sent to each L3 component to change the state of that component.

State configuration:
• The Supervisor calculates the minimum number of steps to get from the current state to the desired state (a toy version follows).
• The complete sequence is calculated before the first command is issued.
• This calculation takes ~9 seconds.
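In spirit this is declarative state reconciliation: diff the desired state against the current one and emit only the commands that close the gap. A toy version of the planning step (the real command set, and the ~9 second planning time, are DØ-specific):

```python
def plan_transition(current, desired):
    """Compute a minimal command sequence taking `current` to `desired`.

    Both arguments map component name -> state string. The whole plan is
    built before any command is issued, as the Supervisor does.
    """
    commands = []
    for component, want in desired.items():
        if current.get(component) != want:
            commands.append((component, "set_state", want))
    for component in current:
        if component not in desired:          # no longer part of the run
            commands.append((component, "stop", None))
    return commands

# Only the out-of-date components receive commands:
plan = plan_transition(
    current={"node01": "running", "node02": "idle"},
    desired={"node01": "running", "node02": "running", "node03": "running"},
)
# -> [('node02', 'set_state', 'running'), ('node03', 'set_state', 'running')]
```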
General Comments
Boundary conditions:
• Dead time is beam in the machine with no data flowing, so a run change contributes to downtime!
• The current operating efficiency is 90-94%, which includes 3-4% trigger deadtime.
• That translates to less than 1-2 minutes per day of downtime.
• Any configuration change means a new run: prescales for luminosity changes, for example, or a system that fails and needs to be reset.
• Clearly, run transitions have to be very quick!
Push responsibility down a level, and don’t touch configuration unless it must be changed. We didn’t start out this way!
Some Timings

Run-control state machine (Start → Configured → Running ⇄ Paused) with measured transition times (sketched below):
• Configure: 26 sec
• Start Run #nnnnn: 1 sec
• Stop Run #nnnnn: 1 sec
• Pause: 1 sec
• Resume: 1 sec
• Unconfigure: 4 sec
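Those transitions form a small state machine; a sketch with the quoted timings attached (the name of the unconfigured state is assumed):

```python
# (from_state, command) -> (to_state, typical seconds), per the timings above
TRANSITIONS = {
    ("unconfigured", "configure"):   ("configured",   26),
    ("configured",   "start_run"):   ("running",       1),
    ("running",      "stop_run"):    ("configured",    1),
    ("running",      "pause"):       ("paused",        1),
    ("paused",       "resume"):      ("running",       1),
    ("configured",   "unconfigure"): ("unconfigured",  4),
}

def step(state, command):
    """Apply one run-control command, returning (new_state, cost_in_seconds)."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"illegal command {command!r} in state {state!r}")
```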
Caching
COOR caches the complete state of the system:
• It is a single point of failure for the whole system, but it never crashes! Recovery is about 45 minutes.
• It can re-issue commands if L3 crashes.

The L3 Supervisor caches the desired state of the system (COOR’s view) and the actual state:
• COOR is never aware of any difficulties unless L3 cannot take data due to a failure.
• Some reconfigurations require a minor interruption of data flow (1-2 seconds).

The farm nodes cache the trigger filter programming:
• If a filter process goes bad, no one outside the node has to know (sketch below).
• The event that caused the problem is saved locally.

Fast response to problems == less downtime.
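A sketch of that local-recovery idea: run the filter as a separate process, and on a crash save the offending event and carry on, with nothing reported upstream. The binary name and paths are hypothetical:

```python
import shutil
import subprocess

def filter_event(event_path, crash_dir="/tmp/l3-crash-events"):
    """Run a (hypothetical) l3filter binary over one buffered event file.

    A crashed or wedged filter is a local matter: keep the event for
    debugging, restart the filter, and keep taking data. Fast response
    to problems == less downtime.
    """
    try:
        subprocess.run(["l3filter", event_path], timeout=60, check=True)
        return True                          # decision rendered normally
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
        shutil.copy(event_path, crash_dir)   # save the event that caused it
        return False                         # caller restarts the filter
```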
Future & Conclusions
Upgrades
Farm nodes:
• We continue to purchase farm nodes in small increments as old nodes pass their 3-4 year lifetimes.
• 8-core CPUs run 8-9 parallel processes filtering events with no obvious degradation in performance.
• The original plan called for 90 single-processor nodes.
• “Much easier to purchase extra nodes than re-write the tracking software from scratch.”

SBCs:
• We have finally used up our cache of spares.
• No capability upgrades are required, but we have not been able to make the new-model SBCs operate at the efficiency required by the highest-occupancy crates.

Other new ideas:
• We have had lots, of varying degrees of craziness.
• Management has been very reluctant to make major changes at this point. Management has been very smart.
Conclusion
• The DØ DAQ/L3 trigger has taken every single physics event for DØ since it started taking data in 2002.
• 63 VME sources powered by Single Board Computers send data to 328 off-the-shelf commodity CPUs.
• The data flow architecture is push-based, and is crash and glitch resistant. It has survived all the hardware, trigger, and luminosity upgrades smoothly.
• The farm has been upgraded from 90 to 328 nodes with no major change in architecture.
• The evolved control architecture means minimal deadtime is incurred during normal operations.
• Primary responsibility is carried out by 3 people (who also work on physics analysis).
• This works only because the Fermilab Computing Division takes care of the farm nodes...