Diana Scannicchio (D.F.N.T. and I.N.F.N of Pavia) - T/DAQ Workshop - Beatenberg 6-10 Dec 1999
Event Filter on SMP architecture
Subfarm design
Tests and results
Conclusions
Andrea Negri
Giacomo Polesello
Diana Scannicchio
Cristian Stanescu
Valerio Vercesi
Symmetric Multi Processor
The SMP architecture offers clear advantages for sharing and transferring data between the different hardware and software components
- all processors have symmetric access to the main memory and to many other system resources through a very high-speed system interconnect (system bus, crossbar switch, ...)
In the development phase of an Event Filter subfarm system one should
- avoid as far as possible any interference of critical operating system aspects with the subfarm code implementation itself
- obtain better reliability of both hardware and software components
The EF in the subfarm has been implemented on a commercial SMP with a proprietary operating system
The technical choice has been an HP SMP server running version 11.0 of the HP-UX operating system, which provides kernel-level POSIX threads and is POSIX 1003.1c compliant (draft 10)
After gaining experience with this implementation, the prototype has also been easily ported to a commodity SMP PC running Linux
POSIX compliance allows the code to be ported easily to other operating systems that follow the same standard
- all the EF code has been written according to POSIX
- the subfarm has already been ported to other environments (Solaris, Tru64-Unix, Linux)
To better exploit the hardware architecture, all the subfarm components have been implemented within a single multi-threaded process
- every component is assigned a thread scheduled directly by the OS kernel (“1x1” scheduling model: each user thread corresponds to one kernel thread)
One obvious by-product is that load balancing, a critical parameter in subfarm operation, is provided automatically by the OS scheduler
The multi-threaded implementation was chosen because it eases, in particular, the communication and the synchronisation among the different subfarm components
The subfarm implementation has been tested on
- HP K220: 4 CPU, PA-7200 120 MHz, 1+1 MB L2 cache, 512 MB RAM
- HP N4000: 8 CPU, PA-8500 440 MHz, 0.5+1 MB L1 cache on chip, 4 GB/s system bandwidth shared across two system buses, 8 GB RAM
- HP Exemplar V2500: 20 CPU, PA-8500 440 MHz, 0.5+1 MB L1 cache on chip, 15 GB/s 8x8 crossbar hyperplane, 16 GB RAM
- COMPAQ ProLiant 5500: 4 CPU, PII XEON 400 MHz, 512 KB L2 cache, 512 MB RAM
                  K220    N4000   V2500   ProLiant 5500
SPECint95         6.4     34.0    34.0    15.3
SPECfp95          9.1     51.4    51.4    11
SPECint_rate95    228     2403    5300    594
SPECfp_rate95     275     2075
We thank CILEA (Consorzio Interuniversitario Lombardo per l’Elaborazione Automatica, a computer centre located near Milan) for making the N4000 and V2500 servers available to us, which allowed us to perform the necessary tests
Subfarm Design
[Figure: subfarm block diagram. Labels: SFI, event backup, Distributor FIFO (physics), Distributor FIFO (calibration), PT, cal. PT, Collector FIFO, SFO, storage, delete event backup, Supervisor]
The Distributor and the Collector are implemented as FIFOs whose availability (controlled by semaphores built on mutexes and condition variables) regulates the flow of the events through the subfarm
The SFI (injector) thread stores the events in the DGB, sorts them by type (e.g. physics, calibration) and fills the two Distributor FIFOs
The PT threads get the events, process them and put the accepted ones into the Collector FIFO, which is eventually emptied by the SFO thread
- the “physics” PT runs the Calorec++ ATLAS EM Calorimeter reconstruction software (developed by C. Meessen)
- the “calibration” PT consumes CPU through mathematical operations; its processing time is set to ~10% of that of the Calorec++ PT
The DGB has to ensure that no events are lost on their way through the subfarm. It has been implemented as a disk partition on which each event is stored as a separate file; the file is removed once the event has been rejected by a PT or disposed of by the SFO
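The flow described above can be illustrated with a minimal Python sketch. It is not the subfarm code (which is written against the POSIX thread library); the class `BoundedFifo`, the function `run_subfarm`, the `STOP` marker and the event dictionaries are all hypothetical names chosen for the illustration, and the DGB is left out. The FIFO blocks its callers using one mutex and two condition variables, so its availability alone regulates the flow.

```python
import threading
from collections import deque


class BoundedFifo:
    """Fixed-capacity FIFO guarded by one mutex and two condition variables."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._lock = threading.Lock()
        self._not_full = threading.Condition(self._lock)
        self._not_empty = threading.Condition(self._lock)

    def put(self, item):
        with self._not_full:
            while len(self._items) >= self._capacity:
                self._not_full.wait()      # producer sleeps while the FIFO is full
            self._items.append(item)
            self._not_empty.notify()

    def get(self):
        with self._not_empty:
            while not self._items:
                self._not_empty.wait()     # consumer sleeps while the FIFO is empty
            item = self._items.popleft()
            self._not_full.notify()
            return item


STOP = object()  # hypothetical end-of-run marker, not part of the original design


def run_subfarm(events, n_pt=4, capacity=8):
    """Push events through SFI -> Distributor FIFOs -> PTs -> Collector FIFO -> SFO."""
    distributors = {"physics": BoundedFifo(capacity),
                    "calibration": BoundedFifo(capacity)}
    collector = BoundedFifo(capacity)
    stored = []

    def sfi():
        # Sort events by type into the two Distributor FIFOs.
        for event in events:
            distributors[event["type"]].put(event)
        for fifo in distributors.values():
            for _ in range(n_pt):
                fifo.put(STOP)             # one marker per PT of that type

    def pt(kind):
        fifo = distributors[kind]
        while True:
            event = fifo.get()
            if event is STOP:
                collector.put(STOP)
                return
            if event["accept"]:            # stand-in for the real filtering decision
                collector.put(event)

    def sfo():
        remaining = 2 * n_pt               # wait for every PT to finish
        while remaining:
            event = collector.get()
            if event is STOP:
                remaining -= 1
            else:
                stored.append(event["id"])

    threads = [threading.Thread(target=sfi), threading.Thread(target=sfo)]
    threads += [threading.Thread(target=pt, args=(kind,))
                for kind in distributors for _ in range(n_pt)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(stored)
```

Calling `run_subfarm` with a list of dictionaries carrying `id`, `type` and `accept` keys returns the identifiers of the accepted events. A slow SFO fills the Collector FIFO and stalls the PTs, which in turn stall the SFI, so no explicit rate control is needed.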
This is the prototype design compliant with Version 3 of the subfarm
There is only one process; every component is assigned a thread
In this implementation the control mechanism is embedded in the use of the POSIX thread library, which provides several system functions for managing the thread associated with each component
The statistics of each component are visible to the whole subfarm (global variables)
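A minimal sketch of statistics shared through global variables is given below, in Python for brevity. The names `stats`, `stats_lock`, `count_event` and `run_components` are hypothetical, and the lock is an assumption of this sketch: the slides do not say how the global counters are protected.

```python
import threading

# Global state: every thread of the single process sees the same objects.
stats_lock = threading.Lock()
stats = {}  # component name -> number of events handled


def count_event(component):
    """Increment the counter of one component (assumed to be mutex-protected)."""
    with stats_lock:
        stats[component] = stats.get(component, 0) + 1


def run_components(names, events_each):
    """Start one thread per component name and return a snapshot of the counters."""
    def work(name):
        for _ in range(events_each):
            count_event(name)

    threads = [threading.Thread(target=work, args=(name,), name=name)
               for name in names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    with stats_lock:
        return dict(stats)
```

Because all components live in one process, a supervising thread can read `stats` directly, without any inter-process communication.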
Error handling and recovery
Error handling has been implemented using all the means provided by the POSIX thread libraries
Some tests have been performed to check the error handling
- Simulating a system crash: the process was killed; the events still to be processed were found in the DGB, and the recovery system embedded in the multi-threaded implementation ensures that they are processed first when the subfarm restarts, before new events are accepted from the Distributor: no event is lost
- Causing errors in the PT threads: the threads were killed; the crashed thread is identified and deleted, and a new one is created
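The crash-recovery behaviour can be sketched as follows, assuming a buffer that keeps one file per event in a directory. `DiskBuffer`, `restart_order`, `demo` and the file naming scheme are hypothetical; the actual DGB is a disk partition whose layout is not described in the slides.

```python
import os
import tempfile


class DiskBuffer:
    """Hypothetical stand-in for the DGB: one file per event in a directory."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, event_id):
        return os.path.join(self.directory, f"event_{event_id:08d}.dat")

    def store(self, event_id, payload):
        # Write under a temporary name first, so that a crash cannot leave
        # a half-written file that looks like a complete event.
        tmp_path = self._path(event_id) + ".tmp"
        with open(tmp_path, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, self._path(event_id))

    def remove(self, event_id):
        # Called once the event has been rejected by a PT or written out by the SFO.
        os.remove(self._path(event_id))

    def pending(self):
        """Identifiers of the events still on disk, in increasing order."""
        ids = []
        for name in os.listdir(self.directory):
            if name.startswith("event_") and name.endswith(".dat"):
                ids.append(int(name[len("event_"):-len(".dat")]))
        return sorted(ids)


def restart_order(buffer, new_event_ids):
    """Events left over from a previous run come before any new event."""
    return buffer.pending() + list(new_event_ids)


def demo():
    """Store five events, finish two, 'crash', then reopen the same directory."""
    with tempfile.TemporaryDirectory() as directory:
        first_run = DiskBuffer(directory)
        for event_id in range(5):
            first_run.store(event_id, b"payload")
        first_run.remove(1)
        first_run.remove(3)
        second_run = DiskBuffer(directory)   # simulates the restarted process
        return restart_order(second_run, [5, 6])
```

In this sketch `demo()` returns `[0, 2, 4, 5, 6]`: the three unfinished events are scheduled ahead of the two new ones, which is the property the crash test checks.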
Conditions of the Tests
All tests have been performed on different machines and platforms
- 4 CPU HP K220 (PA-7200, 120 MHz), running HP-UX 11
- 8 CPU HP N4000 (PA-8500, 440 MHz), running HP-UX 11
- 20 CPU HP V2500 (PA-8500, 440 MHz), running HP-UX 11
- 4 CPU COMPAQ ProLiant 5500 (PII XEON, 400 MHz), running Linux
We have performed tests with different initial conditions
- number of CPUs
- number of PTs (“physics” and “calibration”)
- event size
- processing time (looping over the reconstruction software many times to simulate different realistic values)
- use of the DGB
The ATLAS EM Calorimeter MC events (~50 KB) are padded to 250 KB or to 1 MB to simulate the realistic size of an ATLAS event, and to 100 KB to simulate the “calibration” events
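The padding step can be sketched in a few lines of Python. The function `pad_event`, the zero fill byte and the convention 1 KB = 1024 bytes are assumptions of this sketch; the slides do not say how the padding was done.

```python
KB = 1024  # assumption: 1 KB = 1024 bytes


def pad_event(payload: bytes, target_size: int) -> bytes:
    """Return the payload extended with zero bytes up to target_size bytes."""
    if len(payload) > target_size:
        raise ValueError("event is already larger than the target size")
    return payload + b"\0" * (target_size - len(payload))


# A ~50 KB Monte Carlo event padded to the sizes quoted above.
mc_event = b"\x01" * (50 * KB)
physics_small = pad_event(mc_event, 250 * KB)
physics_large = pad_event(mc_event, 1024 * KB)
calibration = pad_event(mc_event, 100 * KB)
```

The original bytes stay at the start of the padded event, so the reconstruction software still reads the same data while the transfer and storage paths handle the larger size.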
Results
The tests aimed to
- measure the performance
- prove the absence of bottlenecks due to software or hardware
- test the scalability
All results show that the software and hardware architectures do not limit the behaviour of the subfarm
The global throughput
- is independent of the number of PTs (from 4 to 400)
- is inversely proportional to the processing time
- is independent of the event size
- scales with the number of CPUs (up to 20 on the V2500)
Running the two types of PT (“physics” and “calibration”) concurrently does not change the load balancing
- each PT is balanced with the others of the same type
- the relative numbers of PTs of the two types do not affect this result
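The behaviour listed above is what an idealised CPU-bound model predicts. The function below is only that model, not a fit to the measurements: `ideal_throughput` is a hypothetical name, and the model assumes that processing dominates, that every CPU is kept busy whenever there are at least as many PTs as CPUs, and that scheduling and I/O cost nothing.

```python
def ideal_throughput(n_cpus: int, n_pts: int, processing_time_s: float) -> float:
    """Events per second for an idealised, purely CPU-bound subfarm."""
    if processing_time_s <= 0:
        raise ValueError("processing time must be positive")
    busy_cpus = min(n_cpus, n_pts)   # extra PTs beyond the CPU count add nothing
    return busy_cpus / processing_time_s


# Illustrative numbers only: 4 CPUs, 0.5 s of processing per event.
few_pts = ideal_throughput(4, 4, 0.5)      # 8.0 events/s
many_pts = ideal_throughput(4, 400, 0.5)   # 8.0 events/s, same as with 4 PTs
slower = ideal_throughput(4, 4, 1.0)       # 4.0 events/s, half the rate
more_cpus = ideal_throughput(20, 20, 0.5)  # 40.0 events/s, five times the CPUs
```

The event size does not appear in the model at all, which matches the observation that the throughput does not depend on it when the DGB is not the limiting factor.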
To better simulate realistic, variable processing times, the tests have also been performed with a random number of loops of the reconstruction software (from 0.2 s to 4 s on the K220): the system scheduler still balances the processing tasks
Using the DGB reduces the global throughput, as expected, by an amount that depends on the hardware used to implement it (FW SCSI disk, Fibre Channel array, ...)
The effect of the DGB becomes negligible as the processing time increases
The tests performed on an Intel-based SMP (PII XEON) running Linux (RedHat 5.3, kernel 2.2.6) give the same results as those obtained on HP, proving the platform independence and opening the way to a low-cost, high-performance implementation
We have also performed a long-term reliability test: the subfarm processed more than 4 million events in 3 days without any problem
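As a rough cross-check, the reliability figure corresponds to a sustained average rate. The calculation below uses only the two numbers quoted above and assumes the run covered three full days:

```latex
\[
\bar{R} \;>\; \frac{4 \times 10^{6}\ \text{events}}{3 \times 86\,400\ \text{s}}
\;=\; \frac{4 \times 10^{6}}{259\,200}\ \text{events/s}
\;\approx\; 15.4\ \text{events/s}
\]
```

This is a lower bound, since the slides quote "more than" 4 million events.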
Throughput vs. number of PTs
Throughput (HP K220)
One processing time unit is ~220 msec
Scalability
Series of runs performed while increasing the number of active CPUs in the V2500 under different conditions (event size and DGB)
Load Balancing
8 “physics” PTs and 8 “calibration” PTs
Load Balancing
16 “physics” PTs and 4 “calibration” PTs
Load Balancing
K220: 8 “physics” PTs and 8 “calibration” PTs
N4000: 400 “physics” PTs
Load Balancing
N4000: 24 PTs
Threads labelled in the plot: Main thread, SFI thread, SFO thread
Throughput and DGB usage
One processing time unit is ~40 msec on N4000 and ~220 msec on K220
Throughput (ProLiant 5500)
Conclusions
The Event Filter subfarm has been successfully implemented on different SMP machines
The POSIX standard ensures easy porting of the code (only recompilation and linking are needed for a different platform)
The software robustness has been checked by proving that the global throughput
- is independent of the number of PTs
- is inversely proportional to the processing time
- is independent of the event size up to 1 MB
- scales almost perfectly with the number of CPUs
- is independent of the platform
The hardware robustness has been proved by testing the thread error recovery and the functionality of the DGB, and by performing long-term reliability tests
Outlook
Complete the implementation of the Version 3 (Object Oriented) design of the subfarm
Perform further studies on error handling and on different communication mechanisms
Since the results obtained show that the SMP implementation of the subfarm is independent of the platform used, we will perform further tests on 4 and 8 CPU Intel-based boards