Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
MachineLearningatLHCbMikhailHushchyn
NationalResearchUniversityHigherSchoolofEconomicsMoscowInstituteofPhysicsandTechnology
Yandex SchoolofDataAnalysis
onbehalfoftheLHCb collaboration
ICPPA2017,3-6October,Moscow
TheLHCb Detector• LHCb isasingle-armforwardspectrometer.• ThemaingoalofthedetectoristosearchforindirectevidenceofnewphysicsinCPviolationandraredecaysofbeautyandcharmhadrons.
Int.J.Mod.Phys.A30(2015)1530022
2
3
WhydoweuseML?
40 MHz bunch crossing rate
450 kHzh±
400 kHzµ/µµ
150 kHze/γ
L0 Hardware Trigger : 1 MHz readout, high ET/PT signatures
Software High Level Trigger
12.5 kHz (0.6 GB/s) to storage
Partial event reconstruction, select displaced tracks/vertices and dimuons
Buffer events to disk, perform online detector calibration and alignment
Full offline-like event selection, mixture of inclusive and exclusive triggers
LHCb 2015 Trigger Diagram
TheLHCb detectorgeneratestoomuchdatatokeepitall.MachinelearningisneededforefficientselectionofthemostinterestingeventsinSoftwareHighLevelTrigger.
MLisalsousedin:• Trackpatternrecognition• Faketracksrejection• Particleidentification• Jetsidentification• DQmonitoring• Optimizeuseofstoragecapacity
TrackPatternRecognition
VELO track Downstream track
Long track
Upstream track
T track
VELOTT
T1 T2 T3
• VELOtracking:vertexreconstruction• Longtracks:usedinmajorityofanalyses(B/Ddecays)• Downstreamtracks:daughtersoflonglivedparticles• Averagetrackingefficiency>96%• Momentumresolutionvariesfrom0.5%atlowmomentumto1.0%at
200GeVInt.J.Mod.Phys.A30(2015)1530022
4
LongTracksReconstruction
StartingfromseedsintheVELO,tracksaresearchedinTstations:• SearchwindowinTstationsdefinedbyVELOtrack.• Projectx-hitintoreferenceplane.• Fit4-layer-x-cluster andremoveoutliers.• Addandfittrackwithstereohits.
TwoDeepNeuralNetworks:• Firstofthemistunedforrejectionofbad4-layer-x-clusters.• Secondoneistrainedforcandidatesselectionafterstereofit.
LHCb-PROC-2017-013
5
DownstreamTracksReconstruction
ThealgorithmisseededbytracksreconstructedinTstations.
FindmatchingTThits
Results:3- 5%improvementinfaketrackrejectionandincreaseinsignalefficiency
Rejectionofabout40%offakeT-SeedsusingBosaiBDT[JINST 8 P02013]
VELO track Downstream track
Long track
Upstream track
T track
VELOTT
T1 T2 T3
6
TMVAMLP
FakeTrackRejection
]2 [MeV/cπKm1800 1850 1900
)2ca
ndid
ates
/ (4
MeV
/c
0
500
1000
1500
2000
2500
3000
3500
LHCbpreliminary
π K→0all Dfakes
NNsaretrainedforbackgroundrejectionatgiven(97to99%)efficiency.Faketrack(ghost)probabilitybasedontheDNNoutputallowstoreducefakerate. Results:• Increasedefficiency• Reducedfakerate(22%→ 14%)
efficiency0 0.5 1
fake
trac
k re
ject
ion
0.4
0.5
0.6
0.7
0.8
0.9
1
LHCbpreliminary
ghost probability
/dof2χtrack fit
UsingDNNs
LHCb-PROC-2017-013
7
ParticleIdentification• Problem:identifyparticletypeassociatedwithatrack.• Particletypes:Ghost,Electron,Muon,Pion,Kaon,Proton.• LHCb subdetectors:RICH,ECAL,HCAL,MuonChambersandTrack
observables• Differentparticletypeshasdifferentresponsesinthesubdetectors.• Theproblemcanbeconsideredasmulticlassclassificationproblemin
machinelearning.
LHCb RICH
8
Int.J.Mod.Phys.A30(2015)1530022
ParticleIdentification• ThefirstmachinelearningalgorithmsusedforthePIDinLHCb isone-
hidden-layerneuralnetwork(TMVAMLP).• EachparticletypehasitsownbinaryNNtrainedinone-particle-vs-
restmode.
Σ# → 𝑝𝜇#𝜇& Σ# → 𝑝𝜇#𝜇&
Int.J.Mod.Phys.A30(2015)1530022
9
Plots:usingdatasidebands forbackgroundsandMonteCarlosimulation forthesignal
ParticleIdentification
• Onemodelforallparticletypes.
• ROCAUCs≈ 0.91 − 0.99 fordifferentparticletypes.
6xProbNNstrainedinone-vs-restmodeareconsideredasbaseline.
FurtherPIDperformanceimprovementisdoneusingdifferentmulticlassmodels:deepneuralnetworks(DNN)andBDTs(XGBoost andCatBoost):
Tracking System
ECAL & HCAL
RICH
Muon Chambers
!"#$%Ghost
10
LHCb Simulation,preliminary
ParticleIdentificationSeveralDBTmodelswithflatefficienciesalong𝑷, 𝑷𝒕, 𝜼 and𝒏𝑻𝒓𝒂𝒄𝒌𝒔areprovided.Themodelsaretrainedwithspeciallossfunctiondescribedin[JINST10(2015)T03002].
LHCb Simulation,preliminary
11
𝝅𝟎 − 𝜸 separationSignal:singlephoton𝛾.Background:photonsfrom𝜋= → 𝛾𝛾 decay.Problem: separatesignalandbackgroundclustersintheelectromagneticcalorimeter.
12
𝝅𝟎 − 𝜸 separationBaselinesolution:• Clustersshapeandsymetryaredescribedbysetoffeatures.• 2-layersMLPistrainedtoseparatesignalandbackgroundclusters.
𝐵= → 𝐾∗=𝛾
MLPresponse>0.6𝜀BCD~98%,𝜀HID~55%
LHCb-PUB-2015-016
13
𝝅𝟎 − 𝜸 separationNewapproach:• Responsesin5x5cellclustersforECALandpre-showerdetectorsare
consideredasnewfeatures.• SeveralNNandBDTmodelsaretrainedonthese2x25inputfeatures.• BDTmodelshowsbetterperformance.• Promisingpossibilityofaggressivebackgroundsuppressionis
demonstratedonsimulateddata.
14
JetTaggingProblem:identifybandcjetswithasmallmisidentificationprobabilityoflight-parton jets.Theidentificationof(b,c)jetsisperformedusingSVsfromthedecaysof(b,c)hadrons.
LHCb-PAPER-2015-016
15
JetTaggingTwoBDTmodelsareconsidered:DBT(𝑏𝑐|𝑢𝑑𝑠𝑔)istrainedtoseparate𝑏𝑐 and𝑙𝑖𝑔ℎ𝑡 jets,BDT(𝑏|𝑐)istrainedtoseparate𝑏 and𝑐 jets.BothBDTsaretrainedonsimulatedsamplesofb,candlight-parton jets.10kinematicobservablesofSVsareusedasinputs.
LHCb-PAPER-2015-016
W+jet events
16
JetTagging
The(b,c)-jetefficienciesversusthemistag probabilityoflight-partonjetsobtainedbyincreasingtheDBT(𝑏𝑐|𝑢𝑑𝑠𝑔)cut.
LHCb-PAPER-2015-016
17
TopologicalTrigger• ThegoalofHLT2topologicaltriggerisefficientselectionofanyB(and
D)decaywithatleast2chargeddaughters.• Itisdesignedtohandlethepossibleomissionofchildparticles.• InRun1,asimpleBDTwasusedtodefineinterestingSVs.• InRun2,thealgorithmisreoptimized usingseveralMLmodels.
LHCb-PUB-2011-016J.Phys.:Conf.Ser.664(2015)082025
18
TopologicalTriggerHLT-1trackislookingforonesuperhighPTorhighdisplacedtrack.HLT-1trackMVAclassifierislookingfortwotracksmakingavertex.HLT-2topologicalclassifierusesfullreconstructedeventtolookfor2,3,4andmoretracksmakingavertex.KinematicobservablesofSVsareusedastheclassifiersinputs.
19
J.Phys.:Conf.Ser.664(2015)082025
HLT-1track:90kHZ HLT-1trackMVA:40kHZ
TopologicalTrigger• SeveralMLmodelsareconsideredduringthetriggerreoptimization:
BDTs(MatrixNet),NeuralNetworks,LogisticRegression.• ROCcurveinaregionwithsmallFalsePositiveRateisoptimized.
20
J.Phys.:Conf.Ser.664(2015)082025
LHCb Simulation,preliminary
LHCb Simulation,preliminary
TopologicalTrigger• Mostn-bodyhadronicBdecays(n≥3)areonlytriggeredon
efficientlyinLHCb bythetopologicaltrigger.• Gain50%..80%efficiencyfordifferentchannels.
21
J.Phys.:Conf.Ser.664(2015)082025
𝜀VWX(𝑅𝑢𝑛2)/𝜀VWX(𝑅𝑢𝑛1)
JetTagging&TOPOTrigger• ThetopologicaltriggeralgorithmusesSVsthatsatisfysimilarcriteria
tothoseusedintheSV-taggeralgorithmtobuildtwo-,three- andfour-trackSVs.
• TheSVusedbytheTOPOtotriggerrecordingoftheeventcanalsobeusedtotagabjet.
• TheBDTusedintheTOPOalgorithmusessimilarinputsasjet-taggerBDTmodels.
The“loose”labelfortheTOPOreferstotheBDTrequirementusedinthetrigger forSVsthatcontainmuoncandidates.
22
LHCb-PAPER-2015-016
DQMonitoringRobo-Shifter
• Robo-shifterismachine-learningbasedsystemdesignedtoassiststheDQshifter.
• Givenrundataitcanpredictprobabilityofrunbeinggoodorbad.
• Providespotentialproblemsourcesextractedfromdecisiontrees.
• Thefirstversionofrobo-shifteriscurrentlybeingtestedbytheDQshifters.
23
MachinelearningiseverywhereatLHCb helpingtoimprovethedetectoroperationanddataprocessing.
24