ATLAS High Level Trigger
Chris Bee
• Introduction
• System Scalability
• Trigger Core Software Development
• Trigger Selection Algorithms
• Commissioning & Preparation for Cosmics & First Beam
Introduction
[Figure: trigger and data-flow time scales from 10^-9 s to 10^6 s (labels: 25 ns, 2 µs, 3 µs, 10 ms, 1 sec, hour, year), split into ON-line and OFF-line (reconstruction & analyses at TIER0/1/2 centers), against event rate in Hz from 10^8 down to 10^-4 for QED, W/Z, Top, Z* and Higgs processes. Data path: event rate → Level-1 → Level-2 → mass storage → offline analyses.]
• Level-1 Trigger, 40 MHz: hardware (ASIC, FPGA), massively parallel architecture, pipelines
• Level-2 Trigger, 100 kHz: s/w PC farm, local reconstruction
• Level-3 Trigger, 1 kHz: s/w PC farm, full reconstruction
Introduction
• ATLAS trigger comprises 3 levels
  – LVL1
    • Custom electronics: ASICs, FPGAs
    • Max. time 2.5 µs
    • Uses calorimeter and muon detector data
    • Reduces interaction rate to 75 kHz
  – LVL2
    • Software trigger based on Linux PC farm (~500 dual CPUs)
    • Mean processing time ~10 ms
    • Uses selected data from all detectors (Regions of Interest indicated by LVL1)
    • Reduces LVL1 rate to ~1 kHz
  – Event Filter
    • Software trigger based on Linux PC farm (~1600 dual CPUs)
    • Mean processing time ~1 s
    • Full event & calibration data available
    • Reduces LVL2 rate to ~100 Hz
    • Note: a large fraction of the HLT processor cost is deferred, so initial running will be with reduced computing capacity
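As a rough cross-check of the farm sizes quoted on this slide, here is a minimal sketch. In steady state the average number of busy processors is the input rate times the mean processing time (Little's law). The rates and times are the approximate slide values; treating each dual-CPU node as two independent processing slots, and the names `busy_cpus`, `rejection` and `LEVELS`, are assumptions made for illustration.

```python
# Steady-state farm occupancy: busy CPUs = input rate x mean processing time.
# Numbers are the approximate slide values; counting a dual-CPU node as two
# processing slots is an assumption, not something the slide states.

def busy_cpus(input_rate_hz: float, mean_time_s: float) -> float:
    """Average number of CPUs kept busy (Little's law)."""
    return input_rate_hz * mean_time_s


def rejection(input_rate_hz: float, output_rate_hz: float) -> float:
    """Rate reduction factor of one trigger level."""
    return input_rate_hz / output_rate_hz


LEVELS = {
    # name: (input rate Hz, output rate Hz, mean time s, dual-CPU nodes)
    "LVL2": (75e3, 1e3, 10e-3, 500),
    "EF": (1e3, 100.0, 1.0, 1600),
}


def summarize() -> dict:
    """Occupancy, capacity and rejection factor for each software level."""
    out = {}
    for name, (r_in, r_out, t_mean, nodes) in LEVELS.items():
        out[name] = {
            "busy_cpus": busy_cpus(r_in, t_mean),
            "available_cpus": 2 * nodes,
            "rejection": rejection(r_in, r_out),
        }
    return out
```

Calling `summarize()` gives the occupancy next to the available CPU count for each level, which shows how much headroom the quoted farm sizes leave over the mean load.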
ATLAS Trigger & DAQ Architecture
[Figure: Trigger, DAQ and dataflow block diagram. Recoverable labels:]
• Trigger: 40 MHz → LVL1 (2.5 µs; specialized h/w, ASICs, FPGA; Calo, MuTrCh) → Lvl1 acc = 75 kHz → LVL2 (~10 ms; RoI Builder (ROIB), L2 Supervisor (L2SV), L2 Network (L2N), L2 Processing Units (L2P); RoI requests, RoI data = 1-2%) → Lvl2 acc = ~2 kHz → Event Filter (~ sec; Event Filter Network (EFN), Event Filter Processors (EFP)) → EF acc = ~0.2 kHz
• DAQ / dataflow: detector read-out (Calo, MuTrCh, other detectors; FE pipelines) → Read-Out Drivers (ROD) → Read-Out Links → Read-Out Sub-systems (ROS) with Read-Out Buffers (ROB) → Event Builder (Event Building Network (EBN), Dataflow Manager (DFM), Sub-Farm Input (SFI)) → Event Filter → Sub-Farm Output (SFO)
• Bandwidth labels: 1 PB/s, 120 GB/s, ~2+4 GB/s, ~4 GB/s, ~300 MB/s
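The bandwidth and rate labels in the architecture figure can be checked against each other with bandwidth = rate × event size. The sketch below derives the event size implied by each pair of labels; pairing 120 GB/s with the 75 kHz LVL1 accept rate and ~300 MB/s with the ~0.2 kHz Event Filter accept rate is my reading of the figure, and decimal units (1 GB = 10^9 bytes) are an assumption.

```python
# Consistency check of the figure labels: bandwidth = rate x event size.
# Which bandwidth belongs to which rate is an interpretation of the figure;
# decimal byte units are assumed.

GB = 1e9
MB = 1e6


def implied_event_size(bandwidth_bytes_per_s: float, rate_hz: float) -> float:
    """Event size in bytes implied by a bandwidth at a given event rate."""
    return bandwidth_bytes_per_s / rate_hz


def roi_bandwidth(readout_bandwidth: float, roi_fraction: float) -> float:
    """Bandwidth needed when only a fraction of the read-out data is requested."""
    return readout_bandwidth * roi_fraction


size_after_lvl1 = implied_event_size(120 * GB, 75e3)    # bytes per event
size_to_storage = implied_event_size(300 * MB, 200.0)   # bytes per event
roi_low = roi_bandwidth(120 * GB, 0.01)                  # bytes per second
roi_high = roi_bandwidth(120 * GB, 0.02)                 # bytes per second
```

Both pairs imply an event size of roughly 1.5 MB, and 1-2% of the read-out bandwidth is of the order of a few GB/s, in line with the "~2+4 GB/s" label.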
ATLAS Three Level Trigger Architecture
• LVL1 (2.5 µs): decision made with coarse-granularity calorimeter data and muon trigger chamber data.
  – Buffering on detector
• LVL2 (~10 ms): uses Region of Interest data (ca. 2%) with full granularity and combines information from all detectors; performs fast rejection.
  – Buffering in ROBs
• Event Filter (~ sec): refines the selection; can perform event reconstruction at full granularity using the latest alignment and calibration data.
  – Buffering in EB & EF
LVL1 - Muons & Calorimetry
• Muon Trigger looking for coincidences in muon trigger chambers
• Calorimetry Trigger looking for e/γ/τ/jets
  – Various combinations of cluster sums and isolation criteria
[Figure label: Toroid]
ATLAS LVL1 Trigger
[Figure: LVL1 block diagram. Recoverable labels:]
• Calorimeter trigger: ~7000 calorimeter trigger towers → Pre-Processor (analogue ET) → Cluster Processor (e/γ, τ/h), with ET values (0.1×0.1) EM & HAD, and Jet / Energy-sum Processor, with ET values (0.2×0.2) EM & HAD
• Muon trigger: O(1M) RPC/TGC channels → Muon Barrel Trigger and Muon End-cap Trigger → Muon-CTP Interface (MUCTPI), with pT, η, φ information on up to 2 µ candidates/sector (208 sectors in total)
• Central Trigger Processor (CTP) inputs: multiplicities of µ for 6 pT thresholds; multiplicities of e/γ, τ/h, jet for 8 pT thresholds each; flags for ΣET, ΣET(jet), ETmiss over thresholds; multiplicity of fwd jets
• Timing, Trigger, Control (TTC): LVL1 Accept, clock, trigger-type to Front End systems, RODs, etc
• RoI pointers
RoI Mechanism
• LVL1 triggers on high pT objects
  – Calorimeter cells and muon chambers to find e/γ/τ/jet/µ candidates above thresholds
• LVL2 uses Regions of Interest as identified by Level-1
  – Local data reconstruction, analysis, and sub-detector matching of RoI data
• The total amount of RoI data is minimal
  – ~2% of the Level-1 throughput, but it has to be accessed at 75 kHz
[Figure: H → 2e + 2µ event with the 2e and 2µ Regions of Interest marked]
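To make the RoI idea concrete, the sketch below selects only those read-out buffers whose position falls inside an η-φ window around a LVL1 RoI, with the φ difference wrapped at ±π. It is an illustration under stated assumptions, not the ATLAS region-selection code: the `ReadoutElement` type, the `robs_in_roi` function and the default half-widths of 0.2 are all hypothetical.

```python
# Illustrative RoI lookup: which read-out buffers must LVL2 request for one RoI?
# All names and the window size are hypothetical.
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadoutElement:
    rob_id: int
    eta: float
    phi: float  # radians


def delta_phi(a: float, b: float) -> float:
    """Signed phi difference wrapped into [-pi, pi)."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi


def robs_in_roi(elements, roi_eta, roi_phi, half_eta=0.2, half_phi=0.2):
    """Sorted ROB ids of all elements inside the eta-phi window of the RoI."""
    return sorted({
        e.rob_id
        for e in elements
        if abs(e.eta - roi_eta) <= half_eta
        and abs(delta_phi(e.phi, roi_phi)) <= half_phi
    })
```

Only the buffers returned by such a lookup need to be read, which is how the data volume per event stays at the percent level even though requests arrive at the full LVL1 accept rate.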
Physics Selection Strategy
• ATLAS has an inclusive trigger strategy
  – LVL1 triggers on individual signatures
    • EM cluster
    • Muon track
    • Jets
    • Total Energy
    • Missing Energy
  – LVL2 confirms & refines LVL1 signature
    • requires seeding of LVL2 with LVL1 result, i.e. RoI
  – Event Filter confirms & refines LVL2 signature & performs more complete event reconstruction
    • Possibility of seeding of Event Filter with LVL2 result
    • tags accepted events according to physics selection
• Reject events early
  – Save resources
    • minimize data transfer
    • minimize required CPU power
System Scalability

ATLAS TDAQ Physical Layout
[Figure: physical layout of the TDAQ system; labels: Central Switches, Events Built]
System Scalability
• Extended testing programme for system scalability testing
  – Dedicated testbed for dataflow performance & networking issues (Data Acquisition group)
  – Large clusters worldwide for “node” scalability testing (Data Acquisition & Trigger groups)
    • Machine & run control
    • Start/end run cycling
    • Software distribution
    • Large scale configuration
  – Trigger focus on Event Filter
• Recent work
  – Use of LXSHARE cluster at CERN (~500 nodes) and WESTGRID cluster in Canada (~840 nodes)
• Plans
  – Use of 50-700+ nodes on LXSHARE this summer
  – http://atlas-tdaq-large-scale-tests.web.cern.ch
Summary of Recent Tests
• Conclusions
  – Primary goal was system porting and debugging
  – Important bug in CORBA lib was found and fixed
• Many other benefits obtained:
  – Experience in porting a large-scale DAQ system
  – Many specific indications of weak points and possible improvements
  – General impression of run control transition times
• LST @ CERN
  – June 6 – July 19
  – Many things being tested / investigated / measured
  – We are ready following experience from WestGrid
System Scalability
• Many hardware issues need attention
  – How to organize O(2000) PCs
    • racks, space, weight, heat & cooling, cabling
    • data I/O & networking
    • operating: booting, s/w installation, operational monitoring
    • dependency on ever evolving PC & CPU architectures and compilers, applicability of Moore’s Law
    • Remote farms
• Possible Involvement
  – Longer term possibilities of LSTs at SLAC?
  – Software development & testing work in the Event Filter to include requirements from overall ATLAS monitoring and calibration
  – Work on the specification development, installation, maintenance & running of the EF
Trigger Core Software Development
• Provides a coherent software framework for LVL2 and EF
• Coherent data access methods
• Re-use of some offline components where appropriate
• Development platform ~common across trigger & offline
  – Facilitates online/offline comparisons & ease of development
• Detailed collaboration with core offline development group as well as detector software development
  – Benefit from detailed expertise in each detector group
  – E.g. in last year’s testbeam, detector monitoring software developed for use in offline was also used online in the EF
  – Considerable exchange of ideas & development
  – Performance & efficiency improvements done for the trigger now benefit offline; some new offline functionality benefits the trigger
• More specific dedicated development for LVL2
HLT Event Selection Software
[Figure: UML package diagram of the HLT Selection Software (HLTSSW); legend: interface, dependency, package. Recoverable labels:]
• HLT Core Software (HLTSSW): Steering, Monitoring Service, MetaData Service, ROB Data Collector, Data Manager, HLT Algorithms, Event Data Model
• HLT Data Flow Software: Processing Task (Event Filter), L2PU Application
• Offline Core Software: Athena/Gaudi, StoreGate (imported)
• Offline Reconstruction: Event Data Model, Reconstruction Algorithms (imported)
• HLT Selection Software
  – Framework ATHENA/GAUDI
  – Reuse some offline components
  – Common to Level-2 and EF
  – ~Offline algorithms used in EF
LVL2 Development Environment
[Figure: online side (Data Flow → L2PU → Steering Controller → Algorithms) next to offline side (ATHENA Environment → athenaMT → Steering Controller → Algorithms); both link to the algorithm libraries and support multiple threads]
• Offline support for Level-2 developers
  – Multithreaded offline application athenaMT
  – Emulates complete L2PU environment
  – No need to set up complex Data Flow systems
  – As simple to run as a normal offline application: athenaMT <number of threads> <job-configuration>
  – Coding guidelines for Lvl2 developers
• Development and Data Flow setup for Level-2
  – HLT software development and testing in offline environment
  – Final “certification” procedure in Data Flow test-beds
Trigger Core Software Development
• Possible Involvement
  – Work & responsibility in specific s/w packages in the core s/w
  – Trigger configuration and algorithm control system
  – Trigger monitoring framework and strategy
  – Offline/online software integration
Trigger Selection Algorithms
• On-line event selection in the HLT based on algorithmic software tools running in LVL2 and EF farms, sequenced by HLT steering
  – LVL2 specialized algorithms, EF algorithms adapted from off-line
  – Important deployment in HLT test-beds to assess compliance with realistic on-line environment
• Building on expertise and development inside detector communities
  – Calorimeters, Inner Detector, Muon Spectrometer
• Studies of efficiency, rates, rejection factors, physics coverage organized around five main lines (“vertical slices”) coherently mapped to the Physics Combined Performance groups (see physics session)
  – Electrons and photons
    • Fundamental signatures for both precision measurements and discovery signals
  – Muons
    • Low- and high-pT objects, strategic also for B-physics programme
  – Jets / Taus / ETmiss
    • Models testing, new physics
  – b-tagging
    • Optimize physics coverage, add flexibility and redundancy to HLT selection starting from LVL2
  – B-physics
    • Rich program of work with new strategies dependent on luminosity
• Most recent talks on performance studies
  – http://agenda.cern.ch/fullAgenda.php?ida=a052747
Trigger Menus and Strategy
• Extracting tiny signals out of huge backgrounds requires the HLT selection strategy to be robust, redundant and flexible
  – Selections are mostly inclusive, with as-low-as-possible pT thresholds for fundamental objects
  – The usage of software tools at both HLT levels allows detailed studies of the boundary between LVL2 and EF
    • Different paths leading to approximately the same efficiency (electrons in the figure)
    • Example of flexibility and different selection sequences
    • Choice will depend on background conditions, detector knowledge, luminosity, …
• The building of complete Trigger Menus evolves from and complements the work done in the slices
  – Moving from single objects to complex topological signatures
  – Include issues of pre-scaled triggers, monitor triggers, etc
  – Optimize to environmental conditions
• Commissioning the HLT selection will be an important step towards physics data taking
  – Needs to be ready for cosmic period
  – Implies modification to algorithms, new sequences
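The pre-scaled triggers mentioned in the menu discussion keep only one out of every N events that satisfy a selection, so that high-rate signatures can still be sampled. A minimal counter-based sketch follows; the `Prescaler` class and `prescaled_rate` helper are hypothetical names, and a real system may well use a different (for example randomized) scheme.

```python
# Counter-based prescale: accept every N-th event that passed the selection.
# Hypothetical illustration; not taken from the trigger software.

class Prescaler:
    def __init__(self, factor: int):
        if factor < 1:
            raise ValueError("prescale factor must be >= 1")
        self.factor = factor
        self._count = 0

    def accept(self) -> bool:
        """Return True on every `factor`-th call, False otherwise."""
        self._count += 1
        if self._count == self.factor:
            self._count = 0
            return True
        return False


def prescaled_rate(raw_rate_hz: float, factor: int) -> float:
    """Average output rate of a trigger item after prescaling."""
    return raw_rate_hz / factor
```

A factor of 1 leaves the item unprescaled; larger factors trade statistics for bandwidth, which is one of the knobs available when optimizing a menu to running conditions.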
Trigger Selection
• Possible Involvement
  – Work in trigger algorithm development and selection performance evaluation
    • Jet / tau / ETmiss area is in particular need of increased effort
    • Other areas would also benefit from new manpower and groups willing to take on new responsibility
  – Preparation/adaptation of sets of algorithms & selection procedures for use in cosmic running and in initial beam periods (single beams, very initial collisions etc)
Commissioning & Preparation for Cosmics & First Beam

Commissioning
• Detailed planning for stepwise commissioning of the trigger system (LVL1 & HLT) is being prepared
  – Planning takes account of detector plans and triggering requirements for their commissioning
  – Planning in various phases with increasing levels of integration
• Commissioning planning is broken into 4 broad phases:
  – Subsystem standalone commissioning
  – Integrate subsystems into full detector
  – Cosmic rays, recording data, analyze/understand, distribute to remote sites
  – Single beam, first collisions, increasing rates
• Phases will overlap
[Figure label: TDAQ “pre-series” system]
TDAQ Pre-series system
• Fully functional, small scale, version of the complete HLT/DAQ system
  – Equivalent to a detector’s ‘module 0’
• Purpose and scope of the pre-series system:
  – Pre-commissioning phase:
    • To validate the complete, integrated, HLT/DAQ functionality
    • To validate the infrastructure, needed by HLT/DAQ, at point 1
  – Installed at point 1 (USA15 and SDX1)
  – Commissioning phase:
    • To validate a component (e.g. a ROS) or a deliverable (e.g. a Level-2 rack) prior to its installation and commissioning
  – TDAQ post-commissioning development system:
    • Validate new components (e.g. their functionality when integrated into a fully functional system)
    • Validate new software elements or software releases before moving them to the experiment
[Figure: pre-series racks in SDX1 and USA15]
• ROS, L2, EFIO and EF racks: one Local File Server, one or more Local Switches
• One Switch rack (TDAQ rack): 128-port GEth for L2+EB
• One ROS rack (TC rack + horiz. cooling): 12 ROS, 48 ROBINs
• One full L2 rack (TDAQ rack): 30 HLT PCs
• Partial Supervisor rack (TDAQ rack): 3 HE PCs
• Partial EFIO rack (TDAQ rack): 10 HE PCs (6 SFI, 2 SFO, 2 DFM)
• Partial EF rack (TDAQ rack): 12 HLT PCs
• Partial ONLINE rack (TDAQ rack): 4 HLT PCs (monitoring), 2 LE PCs (control), 2 Central File Servers
• RoIB rack (TC rack + horiz. cooling): 50% of RoIB
Commissioning
• Phase 1 commissioning will be completely defined after the experience with the pre-series
• Parallelize commissioning work as much as possible
  – Use data taken during detector commissioning to test data unpacking tools
  – Develop special algorithms to test component units
  – Extend offline s/w testing procedures
  – Provide infrastructure to collect systematic information from trigger selection studies:
    • List of selection variables
    • Graphs of rate and efficiency variation
  – There is a strong coupling with the offline commissioning activities
• Trigger commissioning extends well into data-taking
  – Need good coordination with physics groups
  – Treat the trigger as a single object to be commissioned (inc. LVL1)
  – Will need a clear strategy for the daily run meetings (data request)
    • It is clear that the “extra triggers” (monitoring, calibration, etc.) will be much larger than the foreseen 10% during the first months of data-taking
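The "graphs of rate and efficiency variation" listed among the selection-study information can be produced by scanning a cut on one selection variable. The sketch below does this for a simple "keep values at or above threshold" cut. The function name, the input lists and the definition of rejection as the fraction of background removed are assumptions for illustration.

```python
# Scan a threshold on one selection variable and tabulate signal efficiency
# and background rejection. Rejection is defined here (an assumption) as the
# fraction of background events removed by the cut.

def scan_cut(signal, background, thresholds):
    """Return a list of (threshold, efficiency, rejection) tuples."""
    rows = []
    for t in thresholds:
        efficiency = sum(v >= t for v in signal) / len(signal)
        kept_background = sum(v >= t for v in background) / len(background)
        rows.append((t, efficiency, 1.0 - kept_background))
    return rows
```

Plotting efficiency and rejection against the threshold around the chosen working point shows how sensitive the selection is to that cut, which is the information needed when cuts have to be retuned with real data.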
Commissioning
• Possible involvement
  – We would like to benefit from your experience in commissioning and running the BaBar experiment & elsewhere
  – Work in installing, developing and exploiting the pre-series system
  – Development of algorithms and procedures that allow rapid checks of the trigger performance with real data and monitoring of the overall progress of HLT commissioning
  – Responsibility in the more general trigger commissioning activities and in preparing the ATLAS trigger for cosmic tests and first beams in LHC
  – There is a considerable lack of effort in this area and there is room for major involvement and responsibility
Summary
• Outlined several areas within the ATLAS HLT system where members of the SLAC team could contribute and take responsibility
• Spread of areas ranging from more technical software design and implementation to much more physics oriented work
• Many interesting challenges ahead to lead ATLAS into data-taking and first physics
• TDAQ Workshop in Mainz, Germany 10-14 October 2005
WELCOME !!!
Backup

ATLAS LVL1 Trigger
[Figure: LVL1 block diagram; LVL1 Accept rate labelled 75 (100) kHz on each output path]
RoI reconstruction at LVL2 using muFast
[Figure: muon road in the BML chambers; labels: RPC station 2 (Pivot), RPC station 1 (Low Pt confirm), Z RPC 2, Z RPC 1, Z MDT, Muon Road]
ΔZ = (Z_RPC2 + Z_RPC1)/2 – Z_MDT
muFast Timing Measurements
• muFast latency is the CPU time taken by the algorithm without considering the data access/conversion time:
  – the presence of Cavern Background does not increase the muFast processing time.
• The total latency shows timings made on the same event sample before and after optimizing the MDT data access. Optimized version:
  – total data access time ~800 µs;
  – data access takes the same CPU time as muFast.
• Conditions: optimized code run on (Pentium III @ 2.3GHz); physics: single muon, pT = 100 GeV; cavern background: High Lumi x 2
Stepwise HLT Selection
• Selection takes place in steps
• Rejection can happen at every step
• Trigger Decision and Data Navigation are based on Trigger Elements
• Algorithms use the result from previous steps (seeding) using the Data Navigation and the Trigger Elements
• The initial seeds for the LVL2 steps are the LVL1 RoIs
[Figure: example chain for the signature e50i + e50i. LVL1 Trigger Elements EM50 + EM50 (from RoIs) → elecId → e50 + e50 → isolation → e50i + e50i → decision: event accepted]
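The stepwise logic can be sketched as a loop over steps in which each step is seeded by the trigger elements that survived the previous one, and the event is rejected as soon as too few candidates remain. This is an illustration modelled on the EM50 → e50 → e50i example in the figure, not the HLT steering API: `TriggerElement`, `run_chain`, `STEPS` and the cut values are hypothetical.

```python
# Stepwise selection with early rejection, modelled on the figure's
# EM50 -> e50 -> e50i chain. All names and cuts are hypothetical.
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class TriggerElement:
    label: str       # e.g. "EM50", "e50", "e50i"
    et_gev: float
    isolated: bool


Step = Tuple[str, Callable[[TriggerElement], bool]]


def run_chain(seeds: List[TriggerElement], steps: List[Step], required: int = 2):
    """Return (accepted, label of the last step reached)."""
    current = list(seeds)
    reached = "LVL1"
    if len(current) < required:
        return False, reached
    for label, passes in steps:
        # Each step runs only on elements produced by the previous step.
        current = [
            TriggerElement(label, te.et_gev, te.isolated)
            for te in current
            if passes(te)
        ]
        reached = label
        if len(current) < required:
            return False, reached  # early rejection: later steps never run
    return True, reached


STEPS: List[Step] = [
    ("e50", lambda te: te.et_gev >= 50.0),   # stand-in for electron identification
    ("e50i", lambda te: te.isolated),        # stand-in for the isolation step
]
```

Because a rejected event never reaches the later, more expensive steps, most of the CPU time is spent only on events that are still candidates, which is the resource saving the early-rejection strategy aims for.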
The Different Commissioning Phases (1)
• HLT standalone commissioning
  – Units of racks (a rack is considered to be a unit to be commissioned)
  – A rack delivered from installation has:
    • Power, cooling and network checked within and outside the rack
    • Operating system installed
  – Commissioning starts with the installation of the DAQ and offline software
    • Check internal Dataflow (preloaded data)
    • Monitoring tools
• Offline software
  – Offline software distribution procedures
  – Automatic testing procedures
  – Testing algorithms
The Different Commissioning Phases (2)
• Integrate subsystems into the full detector
  – These operations have a very strong coupling with the offline commissioning activities
  – First start with data unpacking algorithms
    • Monitoring infrastructure to check this step
  – Use any commissioning data taken by the detectors to debug this part of the system
    • Even if the data is corrupted, it might be very useful to test the robustness of the code
• Current activities (or areas where we need to concentrate effort)
  – Extend the pool of data prep algorithms
    • Algorithms must be scrutinized and broken up into simpler testing units
  – Testing procedures for both offline selection software and interface to DAQ software are being strengthened and run automatically every night
    • The goal is to arrive at a set of tests that almost guarantee further test-bed (or pre-series, etc) integration will succeed
  – Specify constraints and tests in the offline software before distribution
  – Software distribution
The Different Commissioning Phases (3)
• The remaining phases correspond to commissioning while data is being taken and assume:
  – Complete HLT Dataflow is working
  – The algorithms start selecting/rejecting events
• The trigger work will focus more on demonstrating that an algorithm gives an Xx.Yy% selection efficiency with some rejection rate
• These activities are very important:
  – Help to develop and tune the algorithms
  – Give us the building blocks to test the complete HLT chain
  – However, for commissioning, we also need to focus on some other aspects
• Have a centralized place where the complete set of parameters that algorithms use (will be inside the configuration in the future) is listed
  – Size of data request around the RoI
  – Set of selection cuts
• For every “selection variable” we need the graph of variation in selection efficiency and rejection rate around the chosen optimal point (we are sure we will have to tune it with data)
• Need to prepare a set of algorithms and methods that allow us to check the trigger performance with data:
  – Particles with known mass (selected by triggering on only one of their decay products)
  – How many hours of data-taking do we need to know the selection efficiency within a 5% precision?
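The closing question can be given a first-order answer with binomial statistics, sketched below under stated assumptions. If an efficiency ε is measured on N unbiased probe objects, its uncertainty is sqrt(ε(1−ε)/N), so a relative precision δ needs N = (1−ε)/(ε δ²) probes. Reading "5% precision" as a relative uncertainty is my interpretation, and the probe rate is a hypothetical input that would have to come from the actual sample (for example decays of a particle of known mass, triggered on one decay product).

```python
# How long to run to measure a selection efficiency to a given relative
# precision, using the binomial uncertainty sqrt(eff * (1 - eff) / N).
# The probe rate is a hypothetical input; "precision" is taken as relative.
import math


def probes_needed(efficiency: float, rel_precision: float) -> int:
    """Number of probe objects N such that sigma_eff / eff <= rel_precision."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    if rel_precision <= 0.0:
        raise ValueError("precision must be positive")
    return math.ceil((1.0 - efficiency) / (efficiency * rel_precision ** 2))


def hours_needed(efficiency: float, rel_precision: float, probe_rate_hz: float) -> float:
    """Data-taking time in hours to collect the required number of probes."""
    return probes_needed(efficiency, rel_precision) / (probe_rate_hz * 3600.0)
```

For example, `hours_needed(0.8, 0.05, 0.01)` estimates the running time for an 80% efficient selection with one usable probe every 100 seconds. The answer scales inversely with the probe rate and with the square of the requested precision, and it ignores backgrounds in the probe sample and systematic effects.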