Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Page 1: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Page 2: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Overview
• Introduction
  – The Standard Model and the mass problem
  – Higgs search at the LHC (and ANNs)
• ttH, H to bb and other channels
  – Production process
  – Decay
  – Experimental signatures
  – Background processes
• ANNs: a possible solution (theory)
• ANN development issues (on a simple 2-D classification problem + results)
• ANNs applied to Higgs data (results)
• Summary

Page 3: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Introduction
• The origin of mass is the last big unanswered question in the SM.
• The Standard Model:
  – To make the physical equations of the SM gauge invariant we require new terms; these correspond directly to gauge bosons (e.g. the photon).
  – Massless particles would preserve the SM's gauge symmetry (the easiest option, but not what is observed).
  – The Higgs mechanism allows the generation of mass in the SM by breaking the gauge invariance of the vacuum (spontaneous symmetry breaking).
  – This needs a further particle: the HIGGS BOSON!!
• So the search for the Higgs is important to our understanding of particle interactions.
• It may be that nature has chosen another mass-generating mechanism, but whatever this mechanism is, it should show itself at the LHC.

Page 4: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Search at the Large Hadron Collider (LHC)
(Higgs discovery is one of its main aims)
• The H mass is not predicted by the SM, but production and decay rates can be predicted as a function of mH.
  – From LEP:
    • mH (SM) > 114.4 GeV.
  – The LHC, with its detectors ATLAS and CMS (due to go online in 2007), will collide p-p at
    • 14 TeV
    • Higgs mass reach to about 1 TeV.
  – High luminosity:
    • But Higgs production is very rare! (10^16 proton-proton interactions will occur per year, but fewer than 100,000 Higgs bosons will be produced.)
  – As well as the Higgs, the LHC hopes to find evidence for new physics:
    • Supersymmetry (SUSY) modifies the SM to include a whole new series of particles, the supersymmetric partners of all the particles so far known. It has many desirable features, mending some shortcomings of the SM.
    • If SUSY is the theory, we do not know how many Higgs bosons we would see (minimum 5).

Page 5: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

ttH, H to bb Channel
• H production processes:
  – Gluon fusion, gg to H, is the dominant Higgs production process (but it is difficult to separate the signal from the large QCD background).
  – Associated production, ttH: lower cross section, but it has leptonic final states.
• The dominant decay mode at mH < 130 GeV is H to bb.

[Figure: Branching ratios for the SM Higgs boson (bb, WW, ZZ) versus Higgs mass (GeV), for masses from 100 to 200 GeV.]

Page 6: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

• ttH, H to bb could account for half the Higgs discovery potential at ATLAS (Cammin).
• Background:
  – ttjj (most important, 94% after TDR analysis)
• Full reconstruction of the final state is necessary to minimize combinatorial background and to discriminate the signal from the large background.

[Figure: diagrams of the final state, with labels b, bbar, W+, W-, e-/μ- and jets.]

TDR analysis has 3 steps:
• Preselection
  – 1 isolated lepton,
  – at least 6 jets,
  – exactly 4 tagged as b-jets.
• Reconstruction
  – reconstruction of 2 top quarks, minimise:
    $\Delta^2 = (m_{\ell\nu b} - m_t)^2 + (m_{jjb} - m_t)^2$
• Cuts on the reconstructed t and H masses (where ANNs come in)
  – Reconstructed top masses must be within 20 GeV of $m_t$.
  – $m_{bb} = m_H \pm 30$ GeV
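The reconstruction step can be sketched as a small minimisation over candidate jet assignments. This is only an illustration: the top mass value, the function names and the candidate masses are assumptions made for the example, not values from the TDR analysis.

```python
M_TOP = 175.0  # GeV; assumed nominal top mass, not quoted in the slides


def delta2(m_lnub, m_jjb, m_top=M_TOP):
    """Delta^2 = (m_lnub - m_t)^2 + (m_jjb - m_t)^2 for one jet assignment."""
    return (m_lnub - m_top) ** 2 + (m_jjb - m_top) ** 2


def best_assignment(candidates, m_top=M_TOP):
    """Return the (label, m_lnub, m_jjb) candidate with the smallest Delta^2."""
    return min(candidates, key=lambda c: delta2(c[1], c[2], m_top))


# Invented reconstructed masses (GeV) for three hypothetical jet assignments.
candidates = [
    ("assignment A", 150.0, 210.0),
    ("assignment B", 172.0, 181.0),
    ("assignment C", 190.0, 160.0),
]
best = best_assignment(candidates)
```

In a real event the candidate list would come from looping over the possible ways of assigning the jets to the two top quarks; here it is written by hand to keep the example short.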

Page 7: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

• After this TDR analysis, significance S/√B = 1.94 (for a 120 GeV Higgs).
• Could increase significance by:
  – Better jet pairing
  – Improving the 'final selection' (after t reconstruction, applied to events in $m_{bb} = m_H \pm 30$ GeV)
Applying ANNs is promising because it makes use of event topology, not just mass cuts! I.e. minimising $\Delta^2 = (m_{\ell\nu b} - m_t)^2 + (m_{jjb} - m_t)^2$ does not take into account additional information such as the spatial differences between jets!
• I looked at the final selection!
• Used 10 variables generated by Pythia (which gave separation between the signal and background distributions).
• Fed the variables into a neural network (to classify each event as signal or background).

Page 8: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

ANNs
• Artificial Neural Networks (ANNs) are computational modelling tools.
• Inspired by the biological nervous system.
• Good at:
  – generalization,
  – non-linear problems,
  – learning by example.
• Want to train the network with examples to recognise the right data (classification task) and reject the rest.
• (In theory ANNs perform better than cut-based methods because they can separate classes in feature space non-linearly.)
• (But training is difficult; optimisation is harder than for cut-based methods.)
[Figure: two panels with axes x1 and x2.]

Page 9: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

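How do ANNs Work?
[Figure: a neural node (inputs x1, x2, x3 with weights w1, w2, w3; output y = g(∑w.x)) and a neural network (inputs xk, weights wjk, hidden nodes hj, weights wij, outputs oi).]
• Response function:
  $o_i = g\left(\sum_j w_{ij}\, g\left(\sum_k w_{jk} x_k\right)\right)$
  which is non-linear, so the network is able to perform non-linear mappings.
• Architecture and weight settings are what change the classification!
• We want the network to output 1 for signal and 0 for all background.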
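A minimal sketch of the response function for one hidden layer is given below. The logistic form of g is an assumption (the slides do not say which activation was used), and the weights are arbitrary illustrative numbers.

```python
import math


def g(a):
    """Logistic activation (assumed; the slides only call it g)."""
    return 1.0 / (1.0 + math.exp(-a))


def forward(x, w_jk, w_ij):
    """Compute h_j = g(sum_k w_jk x_k) and o_i = g(sum_j w_ij h_j).

    Biases are omitted, as in the response function on the slide.
    """
    h = [g(sum(w * xk for w, xk in zip(row, x))) for row in w_jk]
    o = [g(sum(w * hj for w, hj in zip(row, h))) for row in w_ij]
    return h, o


x = [0.5, -1.0]                                # two inputs x_k
w_jk = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # 3 hidden nodes, 2 inputs
w_ij = [[1.0, -1.0, 0.5]]                      # 1 output node
h, o = forward(x, w_jk, w_ij)
```

The output always lies strictly between 0 and 1, which is what allows a target of 1 for signal and 0 for background.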

Page 10: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

• Weights are changed in proportion to the difference (error) between the target output and the actual network output for each example.
• Minimize the summed square error function
  $E = \frac{1}{2} \sum_p \sum_i \left(o_i^{(p)} - t_i^{(p)}\right)^2$
  with respect to the weights.
• The error is a function of all the weights and forms an irregular multidimensional surface with many peaks, saddle points and minima.
• The error is minimized by finding the set of weights that corresponds to the global minimum (i.e. outputs close to 1 for signal and close to 0 for background).
• Done with the gradient descent method (weights incrementally updated in proportion to $\partial E/\partial w_{ij}$).
[Figure: error surface.]
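To make the error function and a single gradient descent step concrete, here is a sketch for the simplest case of one logistic node with no hidden layer. The activation, the patterns and the value of η are all assumptions chosen for illustration.

```python
import math


def g(a):
    """Logistic activation (assumed)."""
    return 1.0 / (1.0 + math.exp(-a))


def sse(w, patterns):
    """E = 1/2 * sum_p (o^(p) - t^(p))^2 for a single node o = g(w.x)."""
    return 0.5 * sum(
        (g(sum(wk * xk for wk, xk in zip(w, x))) - t) ** 2 for x, t in patterns
    )


def grad(w, patterns):
    """dE/dw_k = sum_p (o - t) * o * (1 - o) * x_k, using g' = g (1 - g)."""
    gr = [0.0] * len(w)
    for x, t in patterns:
        o = g(sum(wk * xk for wk, xk in zip(w, x)))
        for k, xk in enumerate(x):
            gr[k] += (o - t) * o * (1.0 - o) * xk
    return gr


# Invented patterns: (inputs, target), target 1 = signal, 0 = background.
patterns = [([1.0, 0.2], 1.0), ([0.1, 0.9], 0.0), ([0.8, 0.4], 1.0)]
w = [0.1, -0.1]
eta = 0.5  # learning parameter (illustrative value)

e_before = sse(w, patterns)
w_new = [wk - eta * gk for wk, gk in zip(w, grad(w, patterns))]
e_after = sse(w_new, patterns)
```

With a small enough η the step moves downhill on the error surface, so `e_after` is smaller than `e_before`; too large an η can overshoot a minimum.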

Page 11: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Summary of learning algorithm
1. Initialize $w_{ij}$ and $w_{jk}$ with random values.
2. Pick pattern p from the training set.
  • Present the input and calculate the output from:
    $o_i = g\left(\sum_j w_{ij}\, g\left(\sum_k w_{jk} x_k\right)\right)$
  • Update the weights according to:
    $w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}$
    $w_{jk}(t+1) = w_{jk}(t) + \Delta w_{jk}$
    where $\Delta w = -\eta\, \partial E/\partial w$.
    (…etc…for extra hidden layers).
• When no change (within some accuracy) occurs, the weights are frozen and the network is ready to use on data it has never seen.
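The algorithm above can be sketched as a short online back-propagation loop. This is an illustrative implementation under several assumptions: a logistic g, a fixed number of epochs instead of a convergence test, a constant third input acting as a bias, and a toy OR problem as training data. It is not the SNNS code used for the results.

```python
import math
import random


def g(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(x, w_jk, w_ij):
    h = [g(sum(w * xk for w, xk in zip(row, x))) for row in w_jk]
    o = [g(sum(w * hj for w, hj in zip(row, h))) for row in w_ij]
    return h, o


def sse(patterns, w_jk, w_ij):
    """Summed square error E over all patterns."""
    return 0.5 * sum(
        (oi - ti) ** 2
        for x, t in patterns
        for oi, ti in zip(forward(x, w_jk, w_ij)[1], t)
    )


def train(patterns, n_hidden, eta=0.5, epochs=2000, seed=1):
    rng = random.Random(seed)
    n_in, n_out = len(patterns[0][0]), len(patterns[0][1])
    # Step 1: random initial weights.
    w_jk = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w_ij = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    e_start = sse(patterns, w_jk, w_ij)
    order = list(patterns)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, t in order:  # Step 2: present one pattern at a time
            h, o = forward(x, w_jk, w_ij)
            # Output deltas: (o_i - t_i) * g'(a_i), with g' = g (1 - g).
            d_out = [(oi - ti) * oi * (1 - oi) for oi, ti in zip(o, t)]
            # Hidden deltas, propagated back through w_ij before it is updated.
            d_hid = [
                hj * (1 - hj) * sum(d_out[i] * w_ij[i][j] for i in range(n_out))
                for j, hj in enumerate(h)
            ]
            # w(t+1) = w(t) + dw, with dw = -eta * dE/dw.
            for i in range(n_out):
                for j in range(n_hidden):
                    w_ij[i][j] -= eta * d_out[i] * h[j]
            for j in range(n_hidden):
                for k in range(n_in):
                    w_jk[j][k] -= eta * d_hid[j] * x[k]
    return w_jk, w_ij, e_start


# Toy OR problem; the constant third input plays the role of a bias.
patterns = [
    ([0.0, 0.0, 1.0], [0.0]),
    ([0.0, 1.0, 1.0], [1.0]),
    ([1.0, 0.0, 1.0], [1.0]),
    ([1.0, 1.0, 1.0], [1.0]),
]
w_jk, w_ij, e_start = train(patterns, n_hidden=3)
e_end = sse(patterns, w_jk, w_ij)
```

Comparing `e_end` with `e_start` shows how far the error has dropped; in practice one would also monitor the error on a separate validation set before freezing the weights.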

Page 12: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

2-D problem
• Initially looked at a simple ANN classification problem:
  – Separate out a single point in a 2-D plane of randomly generated numbers.
• Generated 2 sets of random numbers.
• Fed the network (using SNNS; 2 inputs, 1 output) (show diagram) examples of signal and background data (desired output 1 and 0 respectively).
• Used 300 patterns in both the training and validation sets.
• Background to signal ratio was 3 to 1.
• Looked at various net architectures.
• Results:
  – Learning shown by error curves
  – Projections show hyperplanes
  – 3 hidden nodes solve the classification task fully! (Effectively, 1 hidden node is the equivalent of 1 linear hyperplane.)
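The remark that one hidden node corresponds to one linear hyperplane can be illustrated with hand-set weights. In the sketch below three hidden nodes define three lines that enclose a small triangle around a signal point at (0.5, 0.5), and the output node fires only when all three hidden nodes fire. The weights, the signal location and the hard threshold are assumptions chosen to make the geometry exact; they are not the trained SNNS network.

```python
def step(a):
    """Hard threshold, used instead of a smooth g so the boundaries are exact."""
    return 1.0 if a > 0 else 0.0


# Each hidden node is (w1, w2, bias) and fires where w1*x1 + w2*x2 + bias > 0.
HIDDEN = [
    (0.0, 1.0, -0.4),    # above the line x2 = 0.4
    (1.0, -1.0, 0.1),    # below the line x2 = x1 + 0.1
    (-1.0, -1.0, 1.1),   # below the line x2 = -x1 + 1.1
]
# Output node: fires only if all three hidden nodes fire (sum > 2.5).
OUTPUT = (1.0, 1.0, 1.0, -2.5)


def classify(x1, x2):
    h = [step(w1 * x1 + w2 * x2 + b) for w1, w2, b in HIDDEN]
    return step(OUTPUT[0] * h[0] + OUTPUT[1] * h[1] + OUTPUT[2] * h[2] + OUTPUT[3])


inside = classify(0.5, 0.5)    # point inside the triangle
outside = classify(0.1, 0.1)   # point outside the triangle
```

Two lines alone cannot close off a bounded region in a plane, which is consistent with three hidden nodes being the smallest architecture that solves the task fully.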

Page 13: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

– Got spiking behaviour of some error curves.
  • Showed inconsistent learning (updating of weights).
  • Was solved by adjusting some network parameters (made learning more stable!):
    – Learning parameter, η.
    – dmax.
    – Shuffle option.
• To get a deeper understanding of learning, also looked at weight and bias variables.

Page 14: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC
Page 15: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Using ANNs for Higgs search
• Worked with data after reconstruction of top quarks.
• Variables used:
  – mbb: the invariant mass of the two b-jets assigned to the Higgs boson,
  – Δη(tnear, bb): the difference in pseudorapidity between the bb-system and the reconstructed top quark nearest in ΔR,
  – cos θ*(b,b): the cosine of the decay angle of the two b-jets from the Higgs boson in the rest frame of the bb-system,
  – Δη(b,b): the difference in pseudorapidity between the two b-jets from the Higgs boson,
  – mbb(1): the combination with the smallest invariant mass mbb out of the six combinations which are possible when selecting two b-jets out of four b-jets,
  – mbb(2): the combination with the second smallest invariant mass mbb out of the six combinations which are possible when selecting two b-jets out of four b-jets,
  – φt1 − φt2: the difference in phi between the reconstructed top quarks,
  – pTt1 + pTt2: the sum of the transverse momenta of the reconstructed top quarks.
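A few of the variables listed above can be computed directly from jet four-vectors. The sketch below shows the invariant mass of a jet pair, the pseudorapidity difference, and the two smallest masses out of the six b-jet pairings. The four-vectors are invented, the (E, px, py, pz) convention and the use of an absolute value for Δη are assumptions, and the function names are hypothetical.

```python
import math
from itertools import combinations


def inv_mass(p1, p2):
    """Invariant mass of two four-vectors (E, px, py, pz) in GeV."""
    e = p1[0] + p2[0]
    px, py, pz = (p1[i] + p2[i] for i in (1, 2, 3))
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def pseudorapidity(p):
    """eta = asinh(pz / pT), equivalent to -ln tan(theta / 2)."""
    pt = math.hypot(p[1], p[2])
    return math.asinh(p[3] / pt)


def sorted_pair_masses(bjets):
    """Masses of all two-jet combinations, smallest first (6 for 4 b-jets)."""
    return sorted(inv_mass(a, b) for a, b in combinations(bjets, 2))


# Four invented massless b-jets (E, px, py, pz) in GeV.
bjets = [
    (60.0, 60.0, 0.0, 0.0),
    (60.0, -60.0, 0.0, 0.0),
    (50.0, 0.0, 30.0, 40.0),
    (50.0, 30.0, 0.0, -40.0),
]
masses = sorted_pair_masses(bjets)
m_bb_1, m_bb_2 = masses[0], masses[1]   # smallest and second smallest
d_eta = abs(pseudorapidity(bjets[2]) - pseudorapidity(bjets[3]))
```

The first two jets are back to back with 60 GeV each, so their pair mass is 120 GeV, the largest of the six.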

Page 16: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Signal is RED

Page 17: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

• (Only ttjj background used.)
• Rescaled data to [0,1].
• Separated data into training and validation sets.
• Used 1:1 for signal to background.
• Looked at various architectures (1 and 2 hidden layers).
• Weak generalization:
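The rescaling and the training/validation split listed above could look like the following sketch. Min-max scaling per variable and a random half-and-half split are assumptions, since the slides do not say how either was done, and the event values are invented.

```python
import random


def rescale(rows):
    """Min-max rescale each column of `rows` to the interval [0, 1]."""
    n_cols = len(rows[0])
    lo = [min(r[k] for r in rows) for k in range(n_cols)]
    hi = [max(r[k] for r in rows) for k in range(n_cols)]
    return [
        [(r[k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0
         for k in range(n_cols)]
        for r in rows
    ]


def split(rows, labels, frac_train=0.5, seed=0):
    """Shuffle event indices and split into training and validation sets."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_train = int(frac_train * len(idx))
    train = [(rows[i], labels[i]) for i in idx[:n_train]]
    val = [(rows[i], labels[i]) for i in idx[n_train:]]
    return train, val


# Invented events with two variables each; label 1 = signal, 0 = background.
rows = [[120.0, 0.5], [80.0, 2.5], [200.0, 1.5], [160.0, 1.0]]
labels = [1, 0, 0, 1]
scaled = rescale(rows)
train_set, val_set = split(scaled, labels)
```

Keeping the validation events out of the weight updates is what makes it possible to spot weak generalization: the training error keeps falling while the validation error does not.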

Page 18: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

• Output for best architecture (6-20-20-1) gave:

Signal is RED

Page 19: Exploring Artificial Neural Networks to discover the Higgs boson at the LHC

Summary

• Optimisation difficulties and solutions have been identified in net development.
• Some classification produced for Higgs data.
• More work on the architecture could be needed (more data, lack of generalization).
• S/√B as a function of the cut on the network output.
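The last point, S/√B as a function of the cut on the network output, can be sketched as a simple scan. The network outputs below are invented, and the per-event weights `w_sig` and `w_bkg` are hypothetical parameters that would be needed to scale Monte Carlo event counts to expected yields.

```python
import math


def significance_scan(sig_out, bkg_out, cuts, w_sig=1.0, w_bkg=1.0):
    """Return (cut, S/sqrt(B)) for each cut that keeps some background.

    S and B are the weighted numbers of signal and background events whose
    network output is above the cut.
    """
    results = []
    for c in cuts:
        s = w_sig * sum(1 for o in sig_out if o > c)
        b = w_bkg * sum(1 for o in bkg_out if o > c)
        if b > 0:  # S/sqrt(B) is undefined when no background survives
            results.append((c, s / math.sqrt(b)))
    return results


# Invented network outputs for a handful of events.
sig_out = [0.9, 0.8, 0.7, 0.4]
bkg_out = [0.6, 0.3, 0.2, 0.1]
scan = significance_scan(sig_out, bkg_out, cuts=[0.0, 0.5, 0.95])
```

In this toy example, tightening the cut from 0.0 to 0.5 raises S/√B from 2.0 to 3.0, while the cut at 0.95 is skipped because no background events survive it.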