Advanced event analysis methods at the energy frontier. Reinhard Schwienhorst, CPPM Seminar, July 5 2007.


Page 1: Event Analysis Seminar - schwier/talks/seminars/Marseille2007AdvancedMethods.pdf

Advanced event analysis methods at the energy frontier

Reinhard Schwienhorst

CPPM Seminar, July 5 2007

Page 2:

Reinhard Schwienhorst, Michigan State University

Outline
• Introduction
  – Physics at the energy frontier
• Analysis procedure
• Event analysis methods
  – Decision trees
    • Boosting
    • Random forest
  – Bayesian neural networks
  – Classifier comparisons
• Conclusions

Disclaimer: highlight general principles and guiding ideas
• Not necessarily mathematically rigorous
• Some of the same topics were addressed at PhyStat conferences

Page 3:

Typical event analysis procedures
1) Cut-based event counting
2) Peak in a characteristic distribution

Page 4:

Event counting
• Apply cuts to variables describing the event
  – Object identification
  – Kinematic cuts on objects
  – Event kinematics
• Goal: cut until the signal is visible
  – No background left
  – Or large S/√B
• Sensitive to any signal with this final state
• Requires understanding of background

Example: Z discovery at UA1
• 2 EM clusters, ET > 25 GeV
• 1 EM cluster track-matched
• Both EM clusters track-matched
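The cut-and-count logic above is easy to sketch numerically. The toy below is purely illustrative (the variable, its distributions, and the cut range are invented, not from the talk); it tightens a single ET cut and tracks the S/√B figure of merit quoted on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy events: one kinematic variable (an "ET" in GeV); both the
# distributions and the numbers are invented for illustration.
sig_et = rng.normal(40.0, 8.0, 1000)     # signal-like: peaked
bkg_et = rng.exponential(15.0, 100000)   # background: steeply falling

def significance(cut):
    """Count events passing ET > cut; return S, B and S/sqrt(B)."""
    s = np.count_nonzero(sig_et > cut)
    b = np.count_nonzero(bkg_et > cut)
    return s, b, s / np.sqrt(b) if b > 0 else float("inf")

# "Cut until the signal is visible": scan the cut, keep the best S/sqrt(B).
cuts = np.arange(0.0, 60.0, 1.0)
best = max(cuts, key=lambda c: significance(c)[2])
print(best, significance(best))
```

The scan makes the trade-off on the slide explicit: a tighter cut removes background faster than signal only up to a point, after which S/√B degrades again.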

Page 5:

Peak in a characteristic distribution ("Bump Hunting")
• Find a variable that has a smooth distribution for background
  – Typically an invariant mass
• Measure this distribution over a large range of possible values
• Look for possible resonance peaks
• Sensitive to any resonance with this final state
• Background estimated from sidebands

Example: b-quark discovery at Fermilab

Page 6:

The energy frontier
• Colliding particles at the highest available energies
• Probe the structure of matter at the most fundamental level
  – Observe interactions at the smallest possible distances
  – Produce never-before-seen particles

Present: Tevatron. Future: LHC.

Page 7:

Searches at the energy frontier
• Tevatron:
  – Single top quark production

(Plot: decision-tree output, with the low-DT and high-DT regions marked.)

Page 8:

Searches at the energy frontier
• Searches for new particles, phenomena, couplings
  – Tevatron:
    • Single top quark production
    • Higgs boson search
    • SUSY
    • Extra dimensions
    • ...
    • Statistics-limited

Tevatron Higgs sensitivity: multivariate techniques make this level of sensitivity possible.

Page 9:

Searches at the energy frontier
• Searches for new particles, phenomena, couplings
  – Tevatron:
    • Single top quark production
    • Higgs boson search
    • SUSY
    • Extra dimensions
    • ...
  – LHC:
    • Higgs searches

LHC Higgs sensitivity (vs SM background): multivariate techniques will be required to reach this level of sensitivity.

Page 10:

Searches at the energy frontier
• Searches for new particles, phenomena, couplings
  – Tevatron:
    • Single top quark production
    • Higgs boson search
    • SUSY
    • Extra dimensions
    • ...
  – LHC:
    • Higgs searches
    • SUSY
    • Extra dimensions
    • ...

LHC SUSY signature: Meff = ETmiss + ∑ pT(jets). (Plot: SUSY at 1 TeV vs SM background.)

Page 11:

Measurements at the energy frontier
• First measurements of properties, couplings
  – With samples of limited size
• Example: top quark mass
  – ~3 GeV with 1 fb^-1
  – Was the goal for 2 fb^-1!

(Plot: lepton+jets top mass, matrix element method.)

Page 12:

Measurements at the energy frontier
• First measurements of properties, couplings
  – With samples of limited size
  – Example: LHC SUSY particle masses

LHC sbottom mass: 100 signal events in 30 fb^-1. (Plot: M(llbb + ETmiss) [GeV].)

Page 13:

Physics at the energy frontier
• Searches for new particles, phenomena, couplings
• First measurements of properties, couplings
• Multivariate techniques: the equivalent of adding more data
  – Making the most out of small samples of events

Page 14:

How to improve upon event counting and bump hunting

Page 15:

Bayesian limit
• For each analysis, there exists a fully optimized signal-background separation
  – Target function, also called Bayes discriminant or Bayesian limit
• For a single discriminating variable, this ratio of signal and background likelihoods is easy to calculate
  – Monte Carlo procedure:
    • Generate signal and background MC events
    • Fill histograms for signal and background
    • Divide the two histograms

B(x) = L(S|x) / L(B|x)
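The three-step Monte Carlo procedure can be sketched directly. The distributions below are invented toy models; the point is the mechanics: fill two histograms on a common binning, divide them, and look the ratio up per event.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy MC samples in one discriminating variable x (illustrative only).
sig = rng.normal(1.0, 0.5, 50000)    # "signal" MC events
bkg = rng.normal(-1.0, 0.5, 50000)   # "background" MC events

# Fill signal and background histograms on a common binning...
edges = np.linspace(-3.0, 3.0, 61)
h_sig, _ = np.histogram(sig, bins=edges, density=True)
h_bkg, _ = np.histogram(bkg, bins=edges, density=True)

# ...and divide them: B(x) = L(S|x) / L(B|x), evaluated bin by bin.
with np.errstate(divide="ignore", invalid="ignore"):
    b_of_x = h_sig / h_bkg

def discriminant(x):
    """Look up B(x) for an event with value x."""
    i = np.clip(np.searchsorted(edges, x) - 1, 0, len(b_of_x) - 1)
    return b_of_x[i]

# Signal-like events give B >> 1, background-like events B << 1.
print(discriminant(1.0), discriminant(-1.0))
```

With finite MC statistics the ratio is noisy in sparsely populated bins, which is exactly the limitation the next slide raises for more than one dimension.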

Page 16:

Bayesian limit
• For each analysis, there exists a fully optimized signal-background separation
  – Target function, also called Bayes discriminant or Bayesian limit
• For a single discriminating variable, this ratio of signal and background likelihoods is easy to calculate
• For more than one variable, this is no longer possible
  – Not enough MC statistics to compute a many-dimensional likelihood: the "curse of dimensionality"

B(x) = L(S|x) / L(B|x)

Page 17:

Optimized event analysis
• Requires detailed understanding of signal and background
  – Only applicable to searches for a specific signal or measurements of a specific process

Optimized =
• Optimize signal-background separation
• Exploit full event information: event kinematics, angular correlations, ...
• Take all correlations into account

Goal: reach the Bayesian limit

Page 18:

Optimized event analysis
• Requires detailed understanding of signal and background
  – Only applicable to searches for a specific signal or measurements of a specific process
• Limited by background and signal modeling
  – MC statistics, MC model, background composition, shape, ...

Optimized =
• Optimize signal-background separation
• Exploit full event information: event kinematics, angular correlations, ...
• Take all correlations into account

If the signal model is wrong: the search is not sensitive.
If the background model is wrong: you find something that isn't there.

Goal: reach the Bayesian limit

Page 19:

Event analysis techniques
• Cut-based, neural networks, decision trees, likelihood, matrix elements
• Boosted decision trees, random forest; Bayesian neural networks
• Many others: kernel methods, support vector machines, ...

Page 20:

Event analysis techniques (overview diagram): cut-based, decision trees, matrix elements, likelihood, neural networks, Bayesian neural networks.

Page 21:

Cut-based analysis
• Sequential cuts: lepton pT > 41 GeV → event energy < 65 GeV → M(top) < 352 GeV → final event set
• Estimate the background yield, compare to data: Nobs = Ndata − NB
• Calculate the signal acceptance in the final event set: σ = Nobs / (A · L)
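The last two bullets amount to simple arithmetic. A minimal sketch with hypothetical numbers (the event counts, acceptance, and luminosity below are invented for illustration):

```python
def cross_section(n_data, n_bkg, acceptance, lumi):
    """Cut-and-count cross section: sigma = (N_data - N_B) / (A * L)."""
    n_obs = n_data - n_bkg
    return n_obs / (acceptance * lumi)

# Hypothetical numbers, for illustration only:
# 120 data events, 90 expected background, 3% acceptance, 1000 pb^-1.
print(cross_section(120, 90.0, 0.03, 1000.0))  # -> 1.0 (pb)
```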

Page 22:

Including events that fail a cut
– Create a tree of cuts
– Divide the sample into "pass" and "fail" sets
– Each node corresponds to a cut (branch)

(Tree diagram: pT(lepton) > 41, then pT(lepton) > 65 and M < 352, with pass/fail branches at each node.)

Page 23:

Trees and leaves
– Create a tree of cuts
– Divide the sample into "pass" and "fail" sets
– Each node corresponds to a cut (branch)
– A leaf corresponds to an end-point
– For each leaf, calculate the purity (from MC): purity = NS / (NS + NB)

Page 24:

Decision tree
– Create a tree of cuts
– Divide the sample into "pass" and "fail" sets
– Each node corresponds to a cut (branch)
– A leaf corresponds to an end-point
– For each leaf, calculate the purity (from MC): purity = NS / (NS + NB)
– Train the tree by optimizing the Gini improvement:
  • Gini = 2 NS NB / (NS + NB)
  • Each leaf will be either background- or signal-enhanced
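A minimal sketch of the training step, assuming the Gini definition on the slide: scan candidate cuts on one variable and keep the one that maximizes the improvement Gini(parent) − Gini(pass) − Gini(fail). The toy data at the bottom are invented.

```python
import numpy as np

def gini(n_s, n_b):
    """Node impurity as defined on the slide: Gini = 2 NS NB / (NS + NB)."""
    n = n_s + n_b
    return 2.0 * n_s * n_b / n if n > 0 else 0.0

def best_cut(x, is_signal):
    """Scan cut values on one variable, maximizing the Gini improvement
    Gini(parent) - Gini(pass) - Gini(fail)."""
    parent = gini(np.count_nonzero(is_signal), np.count_nonzero(~is_signal))
    best_gain, best_c = -np.inf, None
    for c in np.unique(x):
        p = x > c
        gain = parent - gini(is_signal[p].sum(), (~is_signal[p]).sum()) \
                      - gini(is_signal[~p].sum(), (~is_signal[~p]).sum())
        if gain > best_gain:
            best_gain, best_c = gain, c
    return best_c, best_gain

# Perfectly separable toy data: signal sits at large x.
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([False, False, False, True, True, True])
print(best_cut(x, y))  # the chosen cut cleanly separates signal from background
```

A full tree applies this search recursively to each pass/fail subsample, over all variables, until a stopping condition is met.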

Page 25:

Decision tree output
• Train on signal and background models (MC)
  – Stop and create a leaf when NMC < 100
• Compute the purity value for each leaf
• Send data events through the tree
  – Assign the purity value of the corresponding leaf to the event
• The result approximates a probability density distribution

Page 26:

Boosting

Page 27:

Boosting
• A general method to improve the performance of any weak classifier
  – Decision trees, neural networks, ...
• Linear combination of many filter functions:

F(x) = Σ_k a_k f_k(x)

  – a_k: coefficient, typically the result of minimizing an error function

Page 28:

Boosting procedure
1. Start from the initial training sample T_{k=1}
2. Train filter function f_k
3. Find coefficient a_k by minimizing the error function
4. Modify the training sample T_k and repeat

F = Σ_k a_k f_k

Page 29:

Adaptive boosting
• In each iteration, update the coefficient a_k
  – From minimizing the error function
  – Coefficients decrease at each iteration
• Update the weight of each event in training sample T_k
  – Determine which events have been misclassified
    • Signal events should have purity ≥ 0.5
    • Background events should have purity < 0.5
  – Increase the event weight for those events that have been misclassified
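The procedure can be sketched as discrete AdaBoost, one common realization of the scheme above (the specific weight-update rule and the decision-stump weak classifier below are standard textbook choices, not taken from the talk):

```python
import numpy as np

def adaboost(train_weak, x, y, n_rounds=10):
    """Discrete AdaBoost sketch. train_weak(x, y, w) must return a weak
    classifier, i.e. a function mapping x -> {-1, +1}."""
    w = np.full(len(y), 1.0 / len(y))        # event weights, sample T_k
    fs, alphas = [], []
    for _ in range(n_rounds):
        f = train_weak(x, y, w)              # train filter function f_k
        err = np.sum(w[f(x) != y])           # weighted misclassification rate
        if err == 0:                         # perfect weak classifier: keep it, stop
            fs.append(f); alphas.append(1.0)
            break
        if err >= 0.5:                       # no better than guessing: stop
            break
        a = 0.5 * np.log((1 - err) / err)    # coefficient a_k
        w *= np.exp(-a * y * f(x))           # boost misclassified events
        w /= w.sum()                         # renormalize training sample T_k
        fs.append(f); alphas.append(a)
    return lambda t: np.sign(sum(a * f(t) for a, f in zip(alphas, fs)))

# A minimal weak classifier: a one-variable threshold cut (decision stump).
def train_stump(x, y, w):
    best = (np.inf, 0.0, 1)
    for c in np.unique(x):
        for s in (1, -1):
            e = np.sum(w[np.where(x > c, s, -s) != y])
            if e < best[0]:
                best = (e, c, s)
    _, c, s = best
    return lambda t: np.where(t > c, s, -s)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([-1, -1, -1, 1, 1, 1])
F = adaboost(train_stump, x, y)
print(F(x))  # recovers the labels
```

Misclassified events gain weight through the exp(−a·y·f(x)) factor, so each new weak classifier concentrates on the events the previous ones got wrong.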

Page 30:

Boosting performance
• DØ single top search with decision trees
• 3 different sets of discriminating variables

Page 31:

Random forest
• Average over many decision trees
  – Typically O(100)
• Each tree is grown using m variables
  – For N total variables, m << N
• Very fast algorithm, even with a large number of variables
• Very few parameters to adjust, typically only m
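A simplified sketch of the idea, with depth-1 trees standing in for full decision trees (real implementations grow deeper trees; the toy data, the purity-based split score, and the parameter values here are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def grow_stump(x, y):
    """Depth-1 'tree': the single cut (over all given columns) that best
    separates the purity values NS/(NS+NB) of its pass and fail leaves."""
    best = None
    for j in range(x.shape[1]):
        for c in np.unique(x[:, j]):
            p = x[:, j] > c
            if p.all() or not p.any():
                continue
            pur_pass, pur_fail = y[p].mean(), y[~p].mean()
            score = abs(pur_pass - pur_fail)
            if best is None or score > best[0]:
                best = (score, j, c, pur_pass, pur_fail)
    _, j, c, pur_pass, pur_fail = best
    return lambda t: np.where(t[:, j] > c, pur_pass, pur_fail)

def random_forest(x, y, n_trees=100, m=2):
    """Average over many trees; each is grown on a bootstrap sample using
    only m of the N variables (m << N)."""
    n, n_feat = x.shape
    trees = []
    for _ in range(n_trees):
        rows = rng.integers(0, n, n)                  # bootstrap events
        cols = rng.choice(n_feat, m, replace=False)   # m random variables
        trees.append((cols, grow_stump(x[np.ix_(rows, cols)], y[rows])))
    return lambda t: np.mean([s(t[:, c]) for c, s in trees], axis=0)

# Toy data: signal (y=1) is shifted in variable 0; variables 1-3 are noise.
x = rng.normal(0.0, 1.0, (400, 4))
y = (rng.random(400) < 0.5).astype(float)
x[y == 1, 0] += 4.0
F = random_forest(x, y, n_trees=50, m=2)
print(F(x)[y == 1].mean(), F(x)[y == 0].mean())  # signal scores higher
```

Averaging decorrelated trees is what stabilizes the output; trees that happen to draw only noise variables contribute little separation but also little bias.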

Page 32:

Event analysis techniques (overview revisited): cut-based, neural networks, decision trees, likelihood, matrix elements; boosted decision trees, random forest; Bayesian neural networks.

Page 33:

Neural networks
• Input nodes: one for each variable x_i
• Hidden nodes: each is a sigmoid depending on the input variables:

n_k(x, w_k) = 1 / (1 + e^(−Σ_i w_ik x_i))

• Output node: a linear combination of the hidden nodes, mapped onto [0, 1] (0 = background, 1 = signal):

f(x) = Σ_k w'_k n_k(x, w_k)
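The two formulas compose into a short forward pass. The weights below are illustrative, not trained:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_output(x, w, w_out):
    """One hidden layer, as on the slide:
    hidden nodes n_k(x, w_k) = sigmoid(sum_i w_ik x_i),
    output f(x) = sum_k w'_k n_k(x, w_k)."""
    hidden = sigmoid(x @ w)   # one sigmoid value per hidden node
    return w_out @ hidden     # linear combination of hidden nodes

# Illustrative (untrained) weights: 3 input variables, 2 hidden nodes.
w = np.array([[1.0, -1.0],
              [0.5,  0.5],
              [-1.0, 2.0]])
w_out = np.array([0.7, 0.3])
print(nn_output(np.array([1.0, 2.0, 3.0]), w, w_out))
```

Training (next slide) means adjusting w and w_out so that f(x) approaches 1 for signal events and 0 for background events.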

Page 34:

Neural network training
– Find optimum NN parameters on training signal/background events
– Apply the NN to an independent set of signal and background events (testing sample)
– Stop training when the error on the testing sample starts increasing (overfitting)

(Example: DØ single top search.)

Page 35:

Bayesian neural networks
• Bayesian idea: rather than finding one value for each weight, determine the posterior probability for each weight
• Form many networks by sampling from the posterior
  – Typical case: ~100 individual neural networks
  – Each network gets a weight based on training performance
• Avoids overfitting
• But: very slow, due to the integration required to determine the posterior
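Once weight sets have been sampled from the posterior, the prediction is an average over the resulting networks. In the sketch below the "posterior samples" are fake random draws around a common mean (real BNNs obtain them by MCMC, which is the slow part, and may weight each network by training performance; equal weights are used here for simplicity):

```python
import numpy as np

rng = np.random.default_rng(3)

def net(x, w, w_out):
    """A single network: one sigmoid hidden layer, linear output node."""
    return w_out @ (1.0 / (1.0 + np.exp(-(x @ w))))

# Stand-ins for ~100 weight sets drawn from the posterior p(w | training
# data); here they are just illustrative draws around a common mean.
samples = [(rng.normal(1.0, 0.1, (3, 2)), rng.normal(0.5, 0.05, 2))
           for _ in range(100)]

def bnn_output(x):
    """Posterior-averaged prediction: the mean over sampled networks."""
    return np.mean([net(x, w, w_out) for w, w_out in samples])

print(bnn_output(np.array([0.2, -0.1, 0.3])))
```

Averaging over many plausible weight sets, instead of committing to one, is what suppresses the overfitting of a single trained network.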

Page 36:

Comparing multivariate methods

How optimal can an optimal event analysis be?

Page 37:

Classifier comparison, DØ single top (Wbb)

(Plot: background efficiency (%) vs signal efficiency (%) for R LDA, SPR Boost w/bag, D0 DT, SPR LDA, SPR Boost Dec Splits, D0 NN (p17), and SPR RF; the neural network, random forest, and likelihood analysis are highlighted.)

25 variables, 10000 events. Classifiers are evaluated at fixed points; lines connect the points for better visibility.

Page 38:

BaBar muon ID

(Plot, with the interesting region highlighted.) 25 variables, 10000 events.

Page 39:

MiniBooNE comparison

(Plot: background efficiency vs signal efficiency for Random Forest, Roe (boosted DT), and Neal (Bayesian NN).)

Here 54 variables, 130000 events; training was also done with 300 variables. The BDT uses 1000 trees.

Page 40:

GLAST background rejection

(Plot, logarithmic background axis: background efficiency vs signal efficiency for R Random Forest, R LDA, R LR, SPR BDT, SPR LDA, and SPR RF; the boosted decision tree and random forests outperform the likelihood methods.)

35 variables.

Page 41:

Summary

(Schematic: background efficiency vs signal efficiency, ordered from worst to best performance:)
• Random guess
• Cut-based or likelihood
• Neural networks, simple decision trees, etc.
• Boosted decision trees, Bayesian neural networks, random forests

Page 42:

Conclusions
• Multivariate event analysis techniques are now a common tool in HEP
  – In the past mostly neural networks, now also decision-tree-related methods
    • GLAST, MiniBooNE, ATLAS, DØ
• Modern classification tools make life easy
  – Very few parameters to adjust
  – Can use many variables
    • A ranking of variables is automatically provided
  – Implemented in several software packages

Page 43:

Advanced event analysis enables discoveries.

Page 44:

Resources
• PhyStat code repository: https://plone4.fnal.gov:4430/P0/phystat/
• PhyStat 2007 conference: http://phystat-lhc.web.cern.ch/phystat-lhc/
• Jim Linnemann's collection of statistics links: http://www.pa.msu.edu/people/linnemann/stat_resources.html
• Statistical analysis tool R: http://www.r-project.org/
• TMVA (multivariate analysis tools in ROOT): http://tmva.sourceforge.net/
• Neural networks in hardware: http://neuralnets.web.cern.ch/NeuralNets/nnwInHep.html
• Boosted decision trees in MiniBooNE: http://arxiv.org/abs/physics/0508045
• Decision tree introduction: http://www.statsoft.com/textbook/stcart.html
• GLAST decision trees: http://scipp.ucsc.edu/~atwood/Talks%20Given/CPAforGLAST.ppt