ALICE TPC Online Tracking on GPU
David Rohr for the ALICE Collaboration
25.5.2010, Lisbon

Outline: Introduction, Tracking Algorithm, NVIDIA CUDA, Tracking on GPU, Results, Summary

Large Hadron Collider

The ALICE Experiment

Tracking

Figure panels: Clusters, Tracks

Real TPC pp-Event in Online Event Display

Simulated Heavy Ion Event

Tracks found in simulated Heavy Ion Event

Tracks in simulated Central Heavy Ion Event

TPC Clusters divided into Slices

The ALICE TPC

TPC Tracker and Merger

One ALICE TPC Sector (tracking is performed row by row)

Tracking Algorithm

Category of Task                          Name of Task                 Description of Task
(Initialization)
Combinatorial Part (Cellular Automaton)   I: Neighbors Finding         Construct seeds (track candidates)
                                          II: Evolution
Kalman Filter Part                        III: Tracklet Construction   Fit seed, extrapolate tracklet, find new clusters
                                          IV: Tracklet Selection       Select good tracklets
(Tracklet Output)

Step I: Neighbors Finder (fit best straight lines)
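The neighbors-finding step can be illustrated with a minimal sketch. It assumes each cluster is reduced to a (y, z) position within its row and that rows are equally spaced, so "best straight line" means the cluster lies closest to the midpoint of a pair of clusters in the two adjacent rows. The function name `find_neighbors`, the data layout, and the deviation measure are illustrative assumptions, not the tracker's actual code.

```python
from typing import List, Optional, Tuple

Cluster = Tuple[float, float]  # hypothetical (y, z) position of a cluster within its row


def find_neighbors(rows: List[List[Cluster]], max_dev: float = 1.0):
    """For every cluster, pick the (lower, upper) pair in the adjacent rows
    that best matches a straight line through the cluster.

    Returns links[r][i] = (index in row r-1, index in row r+1) or None.
    Assumes equally spaced rows (an assumption of this sketch).
    """
    links: List[List[Optional[Tuple[int, int]]]] = [[None] * len(row) for row in rows]
    for r in range(1, len(rows) - 1):
        for i, (y, z) in enumerate(rows[r]):
            best, best_dev = None, max_dev
            for d, (yd, zd) in enumerate(rows[r - 1]):
                for u, (yu, zu) in enumerate(rows[r + 1]):
                    # Deviation of the middle cluster from the midpoint of the pair.
                    dev = abs((yd + yu) / 2 - y) + abs((zd + zu) / 2 - z)
                    if dev < best_dev:
                        best, best_dev = (d, u), dev
            links[r][i] = best
    return links


# Two hypothetical straight tracks crossing three rows.
rows = [
    [(0.0, 0.0), (5.0, 5.0)],
    [(1.0, 1.0), (5.0, 4.0)],
    [(2.0, 2.0), (5.0, 3.0)],
]
links = find_neighbors(rows)
```

The brute-force pair search is chosen for readability only; a real implementation would restrict the search to a small window around each cluster.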

Step II: Evolution (keep coinciding links)
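A possible reading of "keep coinciding links" is sketched here: an upward link from a cluster is kept only if the cluster it points to links back down to it. This interpretation, the function name `keep_coinciding_links`, and the `up`/`down` index arrays are assumptions made for illustration.

```python
from typing import List, Optional


def keep_coinciding_links(up: List[List[Optional[int]]],
                          down: List[List[Optional[int]]]) -> List[List[Optional[int]]]:
    """Keep an upward link only if the target cluster links back down to its origin.

    up[r][i]   = index of the linked cluster in row r+1 (or None)
    down[r][j] = index of the linked cluster in row r-1 (or None)
    """
    kept = []
    for r, row_links in enumerate(up):
        kept_row = []
        for i, j in enumerate(row_links):
            reciprocal = (
                j is not None
                and r + 1 < len(down)
                and down[r + 1][j] == i
            )
            kept_row.append(j if reciprocal else None)
        kept.append(kept_row)
    return kept


# Hypothetical links for three rows with two clusters each.
up = [[0, 0], [1, None], [None, None]]
down = [[None, None], [0, None], [None, 0]]
kept = keep_coinciding_links(up, down)
```

In this example both clusters of row 0 point to the same cluster in row 1, but only the first one is pointed back to, so only its link survives. Chains of surviving links then serve as seeds.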

Step III: Tracklet Construction

Green: Seed
Red: Extrapolation
Clusters close to the extrapolation point are searched
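The extrapolate-and-search loop can be sketched as follows. The sketch deliberately swaps the Kalman filter for a straight line through the last two points and uses a single coordinate per row; `extend_tracklet`, `window`, and the sample values are hypothetical.

```python
from typing import List, Optional


def extend_tracklet(seed: List[float], rows: List[List[float]],
                    window: float = 0.5) -> List[Optional[float]]:
    """Follow a seed through further rows with straight-line extrapolation.

    seed   : y positions of the seed clusters in consecutive rows (at least two)
    rows   : y positions of the clusters in each following row
    window : maximum accepted distance between prediction and cluster

    A straight line through the last two points stands in for the Kalman
    filter fit; this is a simplification, not the real track model.
    """
    track: List[Optional[float]] = list(seed)
    prev, last = seed[-2], seed[-1]
    for clusters in rows:
        predicted = last + (last - prev)  # extrapolate to the next row
        # Search the cluster closest to the extrapolation point.
        nearest = min(clusters, key=lambda y: abs(y - predicted), default=None)
        if nearest is not None and abs(nearest - predicted) <= window:
            track.append(nearest)
            prev, last = last, nearest      # update the line with the new cluster
        else:
            track.append(None)              # no cluster found in this row
            prev, last = last, predicted    # continue along the extrapolation
    return track


track = extend_tracklet([0.0, 1.0], [[2.1, 7.0], [9.0], [4.0, 4.3]])
```

The second row contains no cluster inside the search window, so the tracklet records a gap there and keeps extrapolating, then picks up a cluster again in the third row.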

Neighbors Finder on Real Data

Evolution Step on Real Data

Tracklet Construction on Real Data

Tracklet Selection on Real Data

NVIDIA CUDA GPU

Parallel Tracklet Construction

Current Row

Tracklets are independent and can be processed simultaneously.
Because of data locality, the tracklets are processed for a common row.
Single instruction decoder: either parallel fitting or extrapolation.
Red: Initial Seed, Green: Extrapolation

Initial GPU Tracker Performance (Peripheral Pb-Pb)

(8 threads on Nehalem CPU with 8 virtual / 4 physical cores)
Focus on Tracklet Construction when optimizing

Active GPU Threads for the First Implementation

Active GPU Threads: 19%

Colors represent Tracklet Constructor steps:
Black: Idle
Blue: Track Fit
Green: Track Extrapolation

x-axis: threads, y-axis: time

Active GPU Threads using Dynamic Scheduling

Active GPU Threads: 62%

Colors represent Tracklet Constructor steps:
Black: Idle
Blue: Track Fit
Green: Track Extrapolation

x-axis: threads, y-axis: time
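Why dynamic scheduling raises the share of active threads can be illustrated with a toy model. It assumes tracklets differ only in the number of rows they need, compares fixed batches (a batch lasts as long as its longest tracklet) with a scheme where a finished thread immediately fetches the next tracklet, and ignores the lockstep constraint inside a thread group. The function names and tracklet lengths are hypothetical and do not reproduce the measured 19% and 62%.

```python
import heapq
from typing import List


def utilization_static(lengths: List[int], n_threads: int) -> float:
    """Tracklets are dealt out in fixed batches; a batch ends with its longest tracklet."""
    busy = sum(lengths)
    time = 0
    for start in range(0, len(lengths), n_threads):
        time += max(lengths[start:start + n_threads])
    return busy / (n_threads * time)


def utilization_dynamic(lengths: List[int], n_threads: int) -> float:
    """A thread that finishes its tracklet immediately fetches the next one."""
    busy = sum(lengths)
    finish_times = [0] * n_threads          # min-heap of per-thread finish times
    for length in lengths:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + length)
    return busy / (n_threads * max(finish_times))


# Hypothetical tracklet lengths (rows to process), not measured data.
lengths = [10, 1, 1, 1, 1, 10, 1, 1]
static = utilization_static(lengths, n_threads=2)
dynamic = utilization_dynamic(lengths, n_threads=2)
```

With these sample lengths the static scheme keeps the threads busy for 26 of 44 thread-steps, the dynamic scheme for 26 of 28, because short tracklets no longer wait for a long one in the same batch.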

Pipelining

Initialization / Output on CPU, Tracking on GPU and DMA Transfer can overlap

Figure axis: time
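The benefit of overlapping the three stages can be estimated with a simple timing model. It assumes constant per-event times for CPU work (initialization and output), DMA transfer, and GPU tracking, and an ideal pipeline with no contention; the function names and the numbers are illustrative assumptions.

```python
def sequential_time(n_events: int, cpu: float, dma: float, gpu: float) -> float:
    """No overlap: every event waits for all three stages of the previous one."""
    return n_events * (cpu + dma + gpu)


def pipelined_time(n_events: int, cpu: float, dma: float, gpu: float) -> float:
    """Ideal overlap: after the pipeline is filled, the slowest stage sets the pace."""
    if n_events == 0:
        return 0.0
    return (cpu + dma + gpu) + (n_events - 1) * max(cpu, dma, gpu)


# Hypothetical per-event stage times in arbitrary units, not measurements.
seq = sequential_time(10, cpu=2.0, dma=1.0, gpu=4.0)
pipe = pipelined_time(10, cpu=2.0, dma=1.0, gpu=4.0)
```

In this model the total time approaches the time of the slowest stage per event, so CPU work and transfers are effectively hidden behind the GPU tracking as long as they are shorter than it.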

Final Speedup (Central Heavy Ion)

(CPU performance was doubled as a side effect)

Final Speedup versus Event Size

(PP Mode: Special variant optimized for small scale events)

Tracking Efficiency

Figure panels: CPU, GPU

Summary

Twofold performance increase on CPU
2.5-fold performance increase on GPU compared to CPU
Tracking efficiency matches the CPU tracker's efficiency
A common source code ensures maintainability
CPU still available during the GPU tracking

Perspective for the Future

Next NVIDIA GPU generation (Fermi) may result in another boost
Track Merger may be adapted to run on the GPU