Computing and Statistical Data Analysis / Stat 5: Multivariate Methods


  • G. Cowan Computing and Statistical Data Analysis / Stat 5 1

    Computing and Statistical Data Analysis Stat 5: Multivariate Methods

    London Postgraduate Lectures on Particle Physics;

    University of London MSci course PH4515

    Glen Cowan
    Physics Department
    Royal Holloway, University of London
    [email protected]
    www.pp.rhul.ac.uk/~cowan

    Course web page: www.pp.rhul.ac.uk/~cowan/stat_course.html

  • G. Cowan Computing and Statistical Data Analysis / Stat 5 2

    Finding an optimal decision boundary

    In particle physics one usually starts by making simple "cuts":

    xi < ci
    xj < cj

    Maybe later try some other type of decision boundary (e.g. linear or nonlinear); a cut-based selection sketch follows below.

    [Figure: H0 and H1 event distributions with cut-based, linear and nonlinear decision boundaries.]
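
    Not from the slides: a minimal sketch of such a cut-based selection in Python, with hypothetical variables x_i, x_j and cut values c_i, c_j (all names and numbers illustrative).

    import numpy as np

    rng = np.random.default_rng(3)
    x_i = rng.normal(size=1000)        # toy discriminating variables
    x_j = rng.normal(size=1000)

    c_i, c_j = 0.5, 0.5                # hypothetical cut values
    accepted = (x_i < c_i) & (x_j < c_j)   # event passes both cuts
    print("selection efficiency: %.2f" % accepted.mean())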

  • G. Cowan Computing and Statistical Data Analysis / Stat 5 3

Multivariate methods

    Many new (and some old) methods:

    Fisher discriminant (see the sketch after this list)
    Neural networks
    Kernel density methods
    Support Vector Machines
    Decision trees
    Boosting
    Bagging

    New software for HEP, e.g.,
    TMVA: Höcker, Stelzer, Tegenfeldt, Voss, Voss, physics/0703039
    StatPatternRecognition: I. Narsky, physics/0507143
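
    As an illustration of the first method on the list, a minimal Fisher discriminant sketch on toy Gaussian data (not from the slides; it assumes the standard construction w ∝ W⁻¹(μ1 − μ0), with W the summed within-class covariance):

    import numpy as np

    rng = np.random.default_rng(4)
    cov = [[1.0, 0.4], [0.4, 1.0]]
    X0 = rng.multivariate_normal([0.0, 0.0], cov, size=500)   # background
    X1 = rng.multivariate_normal([1.0, 1.0], cov, size=500)   # signal

    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    W = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)   # within-class covariance
    w = np.linalg.solve(W, mu1 - mu0)                         # Fisher coefficients

    t0, t1 = X0 @ w, X1 @ w                                   # discriminant values
    print("mean discriminant: background %.2f, signal %.2f" % (t0.mean(), t1.mean()))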


  • G. Cowan Computing and Statistical Data Analysis / Stat 5 24

    Overtraining

    If the decision boundary is too flexible, it will conform too closely to the training points → overtraining. Monitor by applying the classifier to an independent validation sample.

    [Figure: the same flexible decision boundary applied to the training sample and to an independent validation sample.]

  • G. Cowan Computing and Statistical Data Analysis / Stat 5 25

Choose the classifier that minimizes the error function for the validation sample; a sketch of this monitoring procedure follows below.
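
    A minimal sketch, assuming a flexible polynomial-feature logistic classifier as a stand-in for the slides' classifier (all data and parameters are illustrative): track the validation error during training and keep the classifier at its minimum.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_sample(n):
        # two overlapping Gaussian classes in 2D (hypothetical toy data)
        x0 = rng.normal(loc=-0.5, scale=1.0, size=(n, 2))
        x1 = rng.normal(loc=+0.5, scale=1.0, size=(n, 2))
        return np.vstack([x0, x1]), np.concatenate([np.zeros(n), np.ones(n)])

    def features(X, degree=4):
        # polynomial features: the "flexibility" of the decision boundary
        cols = [np.ones(len(X))]
        for i in range(1, degree + 1):
            for j in range(i + 1):
                cols.append(X[:, 0] ** (i - j) * X[:, 1] ** j)
        return np.column_stack(cols)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

    def error(w, Phi, y):
        # cross-entropy error function
        p = sigmoid(Phi @ w)
        return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

    X_tr, y_tr = make_sample(100)            # training sample
    X_va, y_va = make_sample(100)            # independent validation sample
    Phi_tr, Phi_va = features(X_tr), features(X_va)

    w = np.zeros(Phi_tr.shape[1])
    best_err, best_epoch = np.inf, 0
    for epoch in range(2000):
        p = sigmoid(Phi_tr @ w)
        w -= 0.05 * Phi_tr.T @ (p - y_tr) / len(y_tr)   # gradient step
        e_va = error(w, Phi_va, y_va)
        if e_va < best_err:
            best_err, best_epoch = e_va, epoch   # keep min-validation-error classifier
    print("min validation error %.3f at epoch %d" % (best_err, best_epoch))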

  • G. Cowan Computing and Statistical Data Analysis / Stat 5 26

    Neural network example from LEP II

    Signal: e+e- → W+W- (often 4 well separated hadron jets)
    Background: e+e- → qq̄gg (4 less well separated hadron jets)

    Input variables are based on jet structure, event shape, ...; none by itself gives much separation.

    [Figure: distributions of the input variables and of the neural network output for signal and background.]

    (Garrido, Juste and Martinez, ALEPH 96-144)
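
    Not the ALEPH analysis: a minimal sketch of a one-hidden-layer network combining several weakly separating inputs on toy data, to illustrate how a network can build separation that no single variable provides.

    import numpy as np

    rng = np.random.default_rng(1)

    n, d = 2000, 5                    # d weakly separating input variables
    X = np.vstack([rng.normal(0.0, 1.0, size=(n, d)),    # background
                   rng.normal(0.3, 1.0, size=(n, d))])   # signal: small shift each
    y = np.concatenate([np.zeros(n), np.ones(n)])

    H = 8                             # hidden nodes
    W1 = rng.normal(0.0, 0.5, size=(d, H))
    b1 = np.zeros(H)
    W2 = rng.normal(0.0, 0.5, size=H)
    b2 = 0.0

    def forward(X):
        h = np.tanh(X @ W1 + b1)                  # hidden layer
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid output
        return h, p

    eta = 0.1
    for epoch in range(500):          # gradient descent on cross-entropy
        h, p = forward(X)
        g = (p - y) / len(y)          # gradient at the output preactivation
        W2g = h.T @ g
        b2g = g.sum()
        hg = np.outer(g, W2) * (1.0 - h ** 2)     # backprop through tanh
        W1g = X.T @ hg
        b1g = hg.sum(axis=0)
        W2 -= eta * W2g; b2 -= eta * b2g
        W1 -= eta * W1g; b1 -= eta * b1g

    _, p = forward(X)
    print("mean NN output: background %.2f, signal %.2f"
          % (p[y == 0].mean(), p[y == 1].mean()))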


Kernel-based PDE (KDE, Parzen window)

    Consider d dimensions and N training events x1, ..., xN; estimate f(x) with

        f̂(x) = (1/N) Σ_{i=1..N} (1/h^d) K((x − xi)/h),

    where h is the kernel bandwidth (smoothing parameter). Use e.g. a Gaussian kernel:

        K(u) = (2π)^{−d/2} exp(−|u|²/2)

    Need to sum N terms to evaluate the function (slow); faster algorithms only count events in the vicinity of x (k-nearest neighbor, range search).
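
    A minimal sketch implementing the formula above directly, with toy data and an illustrative bandwidth h; note the O(N) sum per evaluation, which is exactly what makes the naive method slow.

    import numpy as np

    def kde(x, events, h):
        # f_hat(x) = (1/N) sum_i (1/h^d) K((x - x_i)/h), Gaussian kernel K
        N, d = events.shape
        u = (x - events) / h                       # (N, d) scaled distances
        K = (2.0 * np.pi) ** (-d / 2.0) * np.exp(-0.5 * np.sum(u ** 2, axis=1))
        return K.sum() / (N * h ** d)

    rng = np.random.default_rng(2)
    events = rng.normal(size=(1000, 2))            # N = 1000 training events, d = 2
    print(kde(np.zeros(2), events, h=0.3))         # ~ 1/(2*pi) ~ 0.159 for std normal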
