Jens Zimmermann, MPI für Physik München, ACAT 2005 Zeuthen

Page 1: Statistical Learning Basics

Jens Zimmermann
zimmerm@mppmu.mpg.de

Max-Planck-Institut für Physik, München

Forschungszentrum Jülich GmbH

• XEUS and MAGIC
• Example: γ-hadron Separation
• Basic Concepts and Notions
• Classical Methods
• Statistical Learning Methods
• Statistical Learning Theory
• Conclusion

Page 2: XEUS and MAGIC

XEUS (X-ray Evolving Universe Spectroscopy Mission)

• X-rays: 0.1 – 10 keV
• Launched into space ~ 2015
• First galaxies, metal synthesis, IGM
• Detected signal: charge distribution

MAGIC (Major Atmospheric Gamma Imaging Cerenkov Telescope)

• Gamma-rays: 10 – 1000 GeV
• Built in La Palma, 2003
• AGN, SNR, GRB
• Detected signal: Cherenkov photons (ellipse)

Page 3: Example: γ-hadron separation

• photon: small α
• hadron: uniform α

„Hillas“ parameters:

• length
• width
• size
• ...

Photon excess for small α:

• significance of excess
• number of excess events

[Figure: α distributions before and after the separation]

Page 4: Example: γ-hadron separation

• Choose (preprocessed) inputs (length, width, size, ...)
• Classification: photon vs. hadron
• Offline analysis: train with simulated photons
• Comparison to classical „supercuts“ method
• Neural network based on linear separation

(details to be discussed)

Page 5: Training of Statistical Learning Methods

Statistical Learning Method: From N examples $(x_i, y_i)$ infer a rule $x \to \mathrm{out}(x)$

Important: Generalisation vs. Overtraining

• Without noise and separable by complicated boundary?
• Easily separable but with noise?

[Figure: example data sets, both axes from 0 to 6]

Too high a degree of polynomial results in interpolation, but too low a degree means bad approximation.
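The trade-off between interpolation and bad approximation can be tried out numerically. The following is a minimal sketch, not the data of the talk: the sine target, the noise level and the chosen degrees are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: a smooth target observed with Gaussian noise.
def target(x):
    return np.sin(x)

x_train = np.linspace(0.0, 6.0, 8)
y_train = target(x_train) + rng.normal(0.0, 0.2, x_train.size)
x_valid = np.linspace(0.4, 5.6, 8)
y_valid = target(x_valid) + rng.normal(0.0, 0.2, x_valid.size)

def mse(degree):
    """Fit a polynomial of the given degree; return (training, validation) error."""
    coeffs = np.polyfit(x_train, y_train, degree)
    err_train = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    err_valid = np.mean((np.polyval(coeffs, x_valid) - y_valid) ** 2)
    return err_train, err_valid

# Degree 7 passes through all 8 training points (interpolation),
# degree 1 is too rigid to follow the target.
errors = {degree: mse(degree) for degree in (1, 3, 7)}
for degree, (err_train, err_valid) in errors.items():
    print(f"degree {degree}: training MSE {err_train:.4f}, validation MSE {err_valid:.4f}")
```

The training error alone always favours the highest degree; comparing it with the error on independent validation points is what exposes overtraining.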

Page 6: Classification vs. Regression

Classification

• XEUS: pileup vs. single photon
• MAGIC: gamma vs. muon vs. hadron event

[Figure: XEUS – Photon recognition. Output distribution between 0 and 1 for photons and pileups; background rejection vs. signal efficiency (both 0 to 100) for training and validation.]

Regression

• XEUS: reconstruction of the incident position with subpixel resolution
• MAGIC: reconstruction of the primary photon energy

[Figure: distribution of the deviation Δ = x_out − x_true (XEUS – x[µm]) and Δ = E_out − E_true (MAGIC – E[GeV]); the quoted figure of merit has the form …² = <Δ>² + …², with the remaining symbols lost.]
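Signal efficiency and background rejection follow from the classifier output once a threshold is chosen. A minimal sketch is given below; the output values are invented for illustration, and the convention that signal sits near 1 is an assumption.

```python
import numpy as np

def efficiency_and_rejection(out_signal, out_background, threshold):
    """Accept events whose output exceeds the threshold (assumed: signal near 1)."""
    out_signal = np.asarray(out_signal, dtype=float)
    out_background = np.asarray(out_background, dtype=float)
    signal_efficiency = float(np.mean(out_signal > threshold))
    background_rejection = float(np.mean(out_background <= threshold))
    return signal_efficiency, background_rejection

# Hypothetical classifier outputs in [0, 1], not taken from the talk.
photons = [0.95, 0.80, 0.70, 0.40, 0.90]
pileups = [0.10, 0.30, 0.60, 0.05, 0.20]

# Scanning the threshold traces out the efficiency/rejection curve.
curve = {t: efficiency_and_rejection(photons, pileups, t) for t in (0.25, 0.5, 0.75)}
for t, (eff, rej) in curve.items():
    print(f"threshold {t:.2f}: signal efficiency {eff:.0%}, background rejection {rej:.0%}")
```

Evaluating the same scan separately on training and validation events gives the two curves that are compared to detect overtraining.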

Page 7: Inputs and Preprocessing

Reasonable selection of inputs = steering the search in function space

• as many as necessary, as few as possible
• highest possible analysis level
• make use of symmetries
  • reflection
  • rotation
• measure importance of inputs
  • correlation
  • relevance

[Figure: examples for MAGIC and XEUS; the pattern (A B / C D) is mapped onto (A C / B D)]
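Both ideas, exploiting a symmetry and measuring the importance of inputs, can be sketched in a few lines. The code assumes that the (A B / C D) to (A C / B D) diagram stands for a mirror symmetry of a 2x2 pattern; the function names and the toy numbers are hypothetical.

```python
import numpy as np

def canonical_pattern(pattern):
    """Map a 2x2 pattern and its mirror image (transpose) to one representative."""
    pattern = np.asarray(pattern)
    mirrored = pattern.T  # (A B / C D) -> (A C / B D)
    # Use the lexicographically smaller flattening as the canonical form.
    if tuple(mirrored.ravel()) < tuple(pattern.ravel()):
        return mirrored
    return pattern

def input_correlations(inputs, target):
    """Pearson correlation of each input column with the target."""
    inputs = np.asarray(inputs, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.array([np.corrcoef(column, target)[0, 1] for column in inputs.T])

a = np.array([[1, 2], [3, 4]])
same = np.array_equal(canonical_pattern(a), canonical_pattern(a.T))

# Toy inputs: the first column determines the target, the second is uncorrelated.
correlations = input_correlations([[1, 5], [2, 3], [3, 6], [4, 4]], [2, 4, 6, 8])
print(same, correlations)
```

After canonicalisation the learning method sees each symmetric pair only once, which shrinks the function space it has to search. A low correlation alone does not prove an input irrelevant, since it only captures linear dependence.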

Page 8: Motivation and Training Data

Lack of time

• „Online“ application, usually trigger
• Very fast hardware implementations of statistical learning methods (down to a few 100 ns)
• Training with offline analysis

Example: neural network trigger at the H1 experiment

Page 9: Motivation and Training Data

Lack of knowledge

• „Offline“ application
• No theoretical description of the data
• Theoretical prediction does not match data
• Theory too complicated to construct algorithm
• Performance increase with statistical learning methods
• Incorporate prior knowledge by preprocessing
• Training with
  • Monte Carlo simulation (careful!) or
  • modified experiment

Example: mesh experiment

Page 10: Classical Methods

Classification: “cuts”

• two univariate cuts vs. one multivariate cut
• XEUS: patterns which can be generated by single photons
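The difference between two univariate cuts and one multivariate cut is easy to see in code. This is a sketch with invented events and cut values; the linear combination used for the multivariate cut is only one possible choice.

```python
import numpy as np

# Hypothetical events with two inputs each; values are invented for illustration.
events = np.array([[0.2, 0.3], [0.7, 0.1], [0.4, 0.5], [0.9, 0.8]])

def two_univariate_cuts(x, cut0=0.5, cut1=0.6):
    """Accept an event only if each input passes its own threshold."""
    return (x[:, 0] < cut0) & (x[:, 1] < cut1)

def one_multivariate_cut(x, weights=(1.0, 1.0), cut=1.0):
    """Accept an event if a linear combination of the inputs passes one threshold."""
    return x @ np.asarray(weights) < cut

accepted_univariate = two_univariate_cuts(events)
accepted_multivariate = one_multivariate_cut(events)
print(accepted_univariate, accepted_multivariate)
```

The univariate cuts select a rectangle in input space, while the multivariate cut selects a half-plane with a tilted boundary, so the second event is rejected by the former and accepted by the latter.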

Regression: “fit”

• MAGIC: estimate energy of the primary photon
• minimise relative error
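A fit that minimises the relative error can be written as an ordinary least-squares problem. The sketch below assumes a linear model in the inputs, which is a simplification chosen for illustration and not the parametrisation used for MAGIC; all names and numbers are hypothetical.

```python
import numpy as np

def fit_relative(features, energy_true):
    """Linear fit minimising sum(((E_est - E_true) / E_true)**2)."""
    features = np.asarray(features, dtype=float)
    energy_true = np.asarray(energy_true, dtype=float)
    design = np.column_stack([np.ones(len(features)), features])
    # Dividing each row by E_true turns the relative error into an ordinary residual.
    weighted = design / energy_true[:, None]
    params, *_ = np.linalg.lstsq(weighted, np.ones(len(energy_true)), rcond=None)
    return params

def predict(params, features):
    """Evaluate the fitted linear model on new inputs."""
    features = np.asarray(features, dtype=float)
    return params[0] + features @ params[1:]

# Toy data that follow E = 10 + 20 * input exactly.
inputs = np.array([[1.0], [2.0], [3.0], [4.0]])
energies = 10.0 + 20.0 * inputs[:, 0]
params = fit_relative(inputs, energies)
print(params, predict(params, [[5.0]]))
```

Compared with minimising the absolute error, this weighting keeps low-energy events from being dominated by the few events at the highest energies.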

Page 11: Statistical Learning Methods

Decision Trees

• C4.5
• CART
• CAL5

Local Density Estimators

• Naïve Bayes
• “Maximum Likelihood”
• k-Nearest Neighbours

Linear Separation

• Neural Network
• Support Vector Machine
• Linear Discriminant Analysis

Meta Learning Strategies

• Bagging
• Boosting
• Random Subspace

Page 12: Some Events

# formulas   # slides   class
42           21         Theorist
28           8          Theorist
71           19         Theorist
64           31         Theorist
29           36         Theorist
15           34         Experimentalist
48           44         Experimentalist
56           51         Experimentalist
25           55         Experimentalist
12           16         Experimentalist

[Figure: scatter plot of the events, # slides vs. # formulas, both axes from 0 to 6 (x10)]

Page 13: Decision Trees

First split, on all events (# formulas):

• #formulas < 20 → exp
• #formulas > 60 → th
• 20 < #formulas < 60 → ?

Second split, on the subset 20 < #formulas < 60 (# slides):

• #slides > 40 → exp
• #slides < 40 → th

Resulting tree:

all events
  #formulas < 20: exp
  #formulas > 60: th
  rest (subset 20 < #formulas < 60)
    #slides < 40: th
    #slides > 40: exp
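The tree above translates directly into nested conditions. In this sketch the (# formulas, # slides) pairs are the ten events of the "Some Events" slide; how to treat values exactly on a cut (20, 60 or 40) is not specified on the slide, so the handling below is an assumption.

```python
def classify(n_formulas, n_slides):
    """Decision tree from the slide: first split on # formulas, then on # slides."""
    if n_formulas < 20:
        return "exp"
    if n_formulas > 60:
        return "th"
    # Remaining subset, 20 < # formulas < 60 (boundary values also land here).
    return "exp" if n_slides > 40 else "th"

# (# formulas, # slides) pairs from the "Some Events" slide.
events = [(42, 21), (28, 8), (71, 19), (64, 31), (29, 36),
          (15, 34), (48, 44), (56, 51), (25, 55), (12, 16)]
labels = [classify(n_formulas, n_slides) for n_formulas, n_slides in events]
print(labels)
```

Each leaf corresponds to a rectangular region of the input plane, which is why a tree can be read as a sequence of univariate cuts.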

Page 14: Local Density Estimators

Search for similar events that are already classified and count the members of the two classes.

[Figure: two scatter plots of the events, # slides vs. # formulas, both axes from 0 to 6 (x10)]
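Searching for similar, already classified events and counting the classes is what a k-nearest-neighbour classifier does. The sketch uses the ten events of the "Some Events" slide; the class assignment (first five theorists, last five experimentalists) is read off the slide layout and is an assumption, as are k = 3 and the Euclidean distance.

```python
import numpy as np

# (# formulas, # slides) from the "Some Events" slide; labels are an assumed reading.
events = np.array([(42, 21), (28, 8), (71, 19), (64, 31), (29, 36),
                   (15, 34), (48, 44), (56, 51), (25, 55), (12, 16)], dtype=float)
labels = np.array(["th"] * 5 + ["exp"] * 5)

def knn_classify(query, k=3):
    """Count the classes among the k nearest already-classified events."""
    distances = np.linalg.norm(events - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    classes, counts = np.unique(labels[nearest], return_counts=True)
    winner = str(classes[np.argmax(counts)])
    return winner, dict(zip(classes.tolist(), counts.tolist()))

print(knn_classify((50, 50)))
print(knn_classify((60, 15)))
```

The ratio of the counts is a local estimate of the class densities around the query point, so no global model of the boundary is needed.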

Page 15: Methods Based on Linear Separation

Divide the input space into regions separated by one or more hyperplanes.

Extrapolation is done!

[Figure: two scatter plots of the events, # slides vs. # formulas, both axes from 0 to 6 (x10)]
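A single separating hyperplane can be obtained, for example, with Fisher's linear discriminant. This is a sketch of one such method, not necessarily the one shown in the talk; the events are those of the "Some Events" slide and their class assignment is an assumed reading of that slide.

```python
import numpy as np

# (# formulas, # slides) from the "Some Events" slide; labels are an assumed reading.
events = np.array([(42, 21), (28, 8), (71, 19), (64, 31), (29, 36),
                   (15, 34), (48, 44), (56, 51), (25, 55), (12, 16)], dtype=float)
is_theorist = np.array([True] * 5 + [False] * 5)

def fisher_discriminant(x, positive):
    """Return (w, b) of the hyperplane w.x + b = 0 from Fisher's linear discriminant."""
    x_pos, x_neg = x[positive], x[~positive]
    mean_pos, mean_neg = x_pos.mean(axis=0), x_neg.mean(axis=0)
    # Within-class scatter matrix, summed over both classes.
    scatter = ((x_pos - mean_pos).T @ (x_pos - mean_pos)
               + (x_neg - mean_neg).T @ (x_neg - mean_neg))
    w = np.linalg.solve(scatter, mean_pos - mean_neg)
    b = -w @ (mean_pos + mean_neg) / 2.0  # threshold halfway between the class means
    return w, b

w, b = fisher_discriminant(events, is_theorist)
accuracy = float(np.mean((events @ w + b > 0) == is_theorist))
far_away_is_theorist = bool(np.array([200.0, 0.0]) @ w + b > 0)
print(accuracy, far_away_is_theorist)
```

These ten events cannot be separated perfectly by one line, so a single hyperplane leaves some training events misclassified; more hyperplanes are needed for a perfect split. The hyperplane also assigns a class to the point (200, 0), far outside the training data, which is the extrapolation the slide warns about.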

Page 16: Meta-Learning Strategies

Training Data → Classifier 1, Classifier 2, Classifier 3, ..., Classifier n

Combine different classifications to one final decision
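Bagging is one way to build the n classifiers and combine them. The sketch below trains decision stumps on bootstrap samples and takes a majority vote; the stump as base classifier, the number of classifiers and the seed are assumptions, and the events with their class assignment are an assumed reading of the "Some Events" slide.

```python
import numpy as np

# (# formulas, # slides) from the "Some Events" slide; +1 = theorist, -1 = experimentalist (assumed).
events = np.array([(42, 21), (28, 8), (71, 19), (64, 31), (29, 36),
                   (15, 34), (48, 44), (56, 51), (25, 55), (12, 16)], dtype=float)
labels = np.array([1] * 5 + [-1] * 5)

def train_stump(x, y):
    """Base classifier: the single threshold cut on one input with the fewest errors."""
    best = None
    for feature in range(x.shape[1]):
        for threshold in np.unique(x[:, feature]):
            for sign in (1, -1):
                prediction = np.where(x[:, feature] > threshold, sign, -sign)
                errors = int(np.sum(prediction != y))
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, sign)
    return best[1:]

def predict_stump(stump, x):
    feature, threshold, sign = stump
    return np.where(x[:, feature] > threshold, sign, -sign)

def bagging(x, y, n_classifiers=25, seed=0):
    """Train each classifier on a bootstrap sample of the training data."""
    rng = np.random.default_rng(seed)
    stumps = []
    for _ in range(n_classifiers):
        sample = rng.integers(0, len(x), size=len(x))  # draw with replacement
        stumps.append(train_stump(x[sample], y[sample]))
    return stumps

def predict_bagging(stumps, x):
    """Combine the individual classifications into one decision by majority vote."""
    votes = sum(predict_stump(stump, x) for stump in stumps)
    return np.where(votes >= 0, 1, -1)

stumps = bagging(events, labels)
predictions = predict_bagging(stumps, events)
print(predictions)
```

An odd number of classifiers avoids tied votes. Boosting and the random subspace method differ in how the individual training sets are formed, but end with the same kind of combination step.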

Page 17: Statistical Learning Theory

• error on training set vs. true error
• loss function, here misclassifications

PAC-Learning (probably approximately correct)

Finite hypothesis space H, size of training set is n, target function y is in H.

[Formula: PAC bound, with its two parts labelled “probably” and “approximately”]
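The formulas of this slide are not part of the extracted text. For orientation, the standard textbook statements under the same assumptions (finite H, target function in H, misclassification loss) read as follows; the symbols R, L, h, ε and δ are notation chosen here and may differ from the slide.

```latex
% Error on the training set and true error of a hypothesis h,
% with the misclassification (0-1) loss L
\[
R_{\mathrm{train}}(h) = \frac{1}{n}\sum_{i=1}^{n} L\bigl(y_i, h(x_i)\bigr),
\qquad
R(h) = \mathbb{E}_{(x,y)}\bigl[L\bigl(y, h(x)\bigr)\bigr],
\qquad
L(y, \hat{y}) =
\begin{cases}
0 & \text{if } \hat{y} = y \\
1 & \text{if } \hat{y} \neq y
\end{cases}
\]

% Probability that some hypothesis with zero training error
% still has a true error above epsilon
\[
P\bigl(\exists\, h \in H :\; R_{\mathrm{train}}(h) = 0 \;\wedge\; R(h) > \varepsilon\bigr)
\;\le\; |H|\,(1-\varepsilon)^{n} \;\le\; |H|\, e^{-\varepsilon n}
\]

% Requiring this probability to be at most delta gives the sample size
\[
|H|\, e^{-\varepsilon n} \le \delta
\quad\Longleftrightarrow\quad
n \;\ge\; \frac{1}{\varepsilon}\Bigl(\ln|H| + \ln\frac{1}{\delta}\Bigr)
\]
```

Here δ carries the "probably" (the bound may fail with probability at most δ) and ε the "approximately" (the true error is at most ε).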

Page 18: Statistical Learning Theory

VC-Framework (Vapnik-Chervonenkis)

VC-dimension of linear separation in two dimensions is three because three points can be shattered but not four points.
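The shattering statement can be checked by enumeration for concrete point sets. The sketch below uses a perceptron with an iteration cap as separability test, which is a heuristic choice, and two hypothetical point sets; it shows that one particular set of four points cannot be shattered, whereas the VC statement holds for every set of four points.

```python
import itertools
import numpy as np

def linearly_separable(points, labels, max_epochs=1000):
    """Perceptron check: True if a separating line is found within max_epochs."""
    x = np.column_stack([np.asarray(points), np.ones(len(points), dtype=int)])  # bias input
    w = np.zeros(x.shape[1], dtype=int)
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(x, labels):
            if yi * (w @ xi) <= 0:  # misclassified or on the boundary
                w = w + yi * xi
                mistakes += 1
        if mistakes == 0:
            return True
    return False  # no separating line found within the iteration budget

def count_separable_labelings(points):
    """Number of the 2**N labelings that a single line can realise."""
    return sum(linearly_separable(points, labels)
               for labels in itertools.product((-1, 1), repeat=len(points)))

three_points = [(0, 0), (1, 0), (0, 1)]
four_points = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(count_separable_labelings(three_points), "of", 2 ** 3)
print(count_separable_labelings(four_points), "of", 2 ** 4)
```

All 8 labelings of the three points are realised, so the set is shattered. For the four corner points only 14 of 16 labelings work; the two missing ones are the XOR-type labelings with equal labels on the diagonals.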

Bound for the true error depending on VC-dimension

“generalisation error”
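The bound itself is not part of the extracted text. A commonly quoted form due to Vapnik is given below as a reference point; the symbols d (VC-dimension), n (size of training set) and δ are notation chosen here, and the slide may show a different variant.

```latex
% With probability at least 1 - delta, for every hypothesis h of a class
% with VC-dimension d trained on n examples:
\[
R(h) \;\le\; R_{\mathrm{train}}(h)
\;+\; \sqrt{\frac{d\,\bigl(\ln\frac{2n}{d} + 1\bigr) + \ln\frac{4}{\delta}}{n}}
\]
```

The square-root term grows with the VC-dimension and shrinks with the number of training examples, so a richer class of functions needs more data for the same guarantee.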

Page 19: Conclusion

• Many applications for statistical learning methods in high energy and astrophysics
• Classification or Regression
• Online or Offline
• Many different methods with three basic ideas (decision trees, local density estimators, linear separation)
• Rich theory

Page 20: Next Talk

Very promising results with statistical learning methods

But:

• Can they be trusted?
• Can they be controlled?
• Can one calculate uncertainties?