G. Cowan, RHUL Physics Discussion on significance page 1 Discussion on significance ATLAS Statistics Forum CERN/Phone, 2 December, 2009 Glen Cowan Physics Department Royal Holloway, University of London [email protected] www.pp.rhul.ac.uk/~cowan



Discussion on significance

ATLAS Statistics Forum

CERN/Phone, 2 December, 2009

Glen Cowan
Physics Department
Royal Holloway, University of London
[email protected]
www.pp.rhul.ac.uk/~cowan


p-values

The standard way to quantify the significance of a discovery is to give the p-value of the background-only hypothesis H0:

p = Prob( data equally or more incompatible with H0 | H0 )

Requires a definition of what data values constitute a lesser level of compatibility with H0 relative to the level found with the observed data.

Define this to get high probability to reject H0 if a particular signal model (or class of models) is true.

Note that actual confidence in whether a real discovery is made depends also on other factors, e.g., plausibility of signal, degree to which it describes the data, reliability of the model used to find the p-value. The p-value is really only the first step!


Significance from p-value

Often define significance Z as the number of standard deviations that a Gaussian variable would fluctuate in one direction to give the same p-value.

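$p = 1 - \Phi(Z)$, $Z = \Phi^{-1}(1 - p)$, with $\Phi$ the standard Gaussian cumulative distribution.

Relevant ROOT functions: TMath::Prob, TMath::NormQuantile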


Z = 5 corresponds to $p = 2.87 \times 10^{-7}$
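As a quick numerical check of the conversion between p and Z, here is a minimal Python sketch using only the standard library. The function names `p_from_z` and `z_from_p` are illustrative choices and are not taken from ROOT or any ATLAS tool.

```python
import math
from statistics import NormalDist


def p_from_z(z):
    """One-sided Gaussian tail probability, p = 1 - Phi(z)."""
    # erfc avoids the cancellation that 1 - Phi(z) suffers for large z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def z_from_p(p):
    """Significance Z = Phi^{-1}(1 - p), evaluated as -Phi^{-1}(p) for accuracy."""
    return -NormalDist().inv_cdf(p)


print(p_from_z(5.0))            # close to 2.87e-7
print(z_from_p(p_from_z(5.0)))  # round trip back to Z = 5
```

Working with the lower tail (`-inv_cdf(p)`) rather than `inv_cdf(1 - p)` is a design choice that keeps precision when p is very small.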


Sensitivity (expected significance)

The significance with which one rejects the SM depends on the particular data set obtained.

To characterize the sensitivity of a planned analysis, give the expected (e.g., mean or median) significance assuming a given signal model.

Determining this accurately could in principle require an MC study. It is often sufficient to evaluate it with representative (e.g. “Asimov”) data.


Significance for single counting experiment

Suppose we measure n events, expect s signal, b background.

n ~ Poisson(s+b)

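Find p-value of s = 0 hypothesis. Data values with $n \ge n_{\rm obs}$ constitute lesser compatibility, so

$$p = P(n \ge n_{\rm obs} \mid b) = \sum_{n = n_{\rm obs}}^{\infty} \frac{b^n}{n!} e^{-b}$$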
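The Poisson tail probability for the counting experiment can be evaluated directly. The sketch below sums the upper tail term by term; `poisson_p_value` is a hypothetical helper name, and the numbers in the usage lines (10 observed events on b = 2.5) are illustrative only.

```python
import math
from statistics import NormalDist


def poisson_p_value(n_obs, b):
    """p = P(n >= n_obs | s = 0) for n ~ Poisson(b), summing the upper tail."""
    if n_obs <= 0:
        return 1.0
    if b <= 0.0:
        return 0.0  # with no background, n_obs >= 1 cannot occur under H0
    n = n_obs
    # Start from P(n_obs; b), computed in log space to avoid overflow.
    term = math.exp(-b + n * math.log(b) - math.lgamma(n + 1))
    total = 0.0
    while True:
        total += term
        n += 1
        term *= b / n  # P(n; b) from P(n - 1; b)
        if n > b and term <= 1e-17 * total:
            break
    return total


p = poisson_p_value(10, 2.5)
z = -NormalDist().inv_cdf(p)
print(p, z)
```

Summing the tail itself, rather than computing one minus the lower sum, keeps the result meaningful when p is far below machine epsilon.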


Simple counting experiment with LR

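Equivalently can write expectation value of n as

$$E[n] = \mu s + b$$

where $\mu$ is a strength parameter (background-only is $\mu = 0$). To test a value of $\mu$, construct likelihood ratio

$$\lambda(\mu) = \frac{L(\mu)}{L(\hat{\mu})}$$

where $\hat{\mu}$ is the Maximum Likelihood Estimator (MLE), which we constrain to be non-negative:

$$\hat{\mu} = \begin{cases} (n - b)/s & n \ge b \\ 0 & \text{otherwise} \end{cases}$$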


p-value from LR

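Also define

$$q_\mu = -2 \ln \lambda(\mu)$$

High values correspond to increasing incompatibility with $\mu$.

For discovery we are testing $\mu = 0$. We find

$$q_0 = \begin{cases} 2\left(n \ln\dfrac{n}{b} + b - n\right) & n > b \\ 0 & \text{otherwise} \end{cases}$$

The p-value is

$$p = P(q_0 \ge q_{0,\rm obs} \mid \mu = 0)$$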
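For the single-bin Poisson model the discovery statistic can be worked out explicitly. The short derivation below is a sketch that assumes the likelihood is just the Poisson probability for n with mean $\mu s + b$, and that the estimator satisfies $\hat{\mu} s + b = n$ whenever $n > b$.

```latex
\begin{align}
L(\mu) &= \frac{(\mu s + b)^n}{n!}\, e^{-(\mu s + b)}, \\
\ln \lambda(0) &= \ln L(0) - \ln L(\hat{\mu})
   = \left( n \ln b - b \right) - \left( n \ln n - n \right)
   \qquad (n > b), \\
q_0 &= -2 \ln \lambda(0)
   = 2 \left( n \ln \frac{n}{b} + b - n \right)
   \qquad (n > b), \\
q_0 &= 0 \qquad (n \le b,\ \text{since then } \hat{\mu} = 0 \text{ and } \lambda(0) = 1).
\end{align}
```

Because $q_0$ increases monotonically with n for $n > b$, the requirement $q_0 \ge q_{0,\rm obs}$ is equivalent to $n \ge n_{\rm obs}$, so this p-value coincides with the Poisson tail probability of the simple counting test.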


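Significance from LR using $\chi^2$ approx.

For large enough n, we can regard $q_\mu$ as continuous, and find

$$p = \int_{q_{\mu,\rm obs}}^{\infty} f(q_\mu \mid \mu)\, dq_\mu$$

Furthermore, for large enough n, the distribution of $q_\mu$ approaches a form related to the chi-square distribution for 1 d.o.f.

Complications arise from the requirement that $\hat{\mu}$ be non-negative, but the end result is simple. For a test of $\mu = 0$ (discovery), the significance is

$$Z = \sqrt{q_0} = \sqrt{2\left(n \ln\frac{n}{b} + b - n\right)} \qquad (n > b)$$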
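The chi-square-based significance is easy to evaluate numerically. The following is a minimal sketch, assuming the single-bin counting model with known b; `q0` and `z_from_counts` are hypothetical helper names and the inputs are illustrative.

```python
import math


def q0(n, b):
    """Discovery statistic q0 = -2 ln lambda(0) for n ~ Poisson(mu*s + b), b known."""
    if n <= b:
        return 0.0  # mu_hat is clipped to zero, so lambda(0) = 1
    return 2.0 * (n * math.log(n / b) + b - n)


def z_from_counts(n, b):
    """Approximate significance Z = sqrt(q0), valid for large enough n."""
    return math.sqrt(q0(n, b))


print(z_from_counts(20, 10.0))
print(z_from_counts(8, 10.0))   # deficit: no evidence for signal
```

For small counts the result can be compared with the exact Poisson tail probability to judge how well the asymptotic approximation holds.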


Sensitivity for simple counting exp.

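Find median significance from median n, which is approximately s + b when this is sufficiently large.

Or, if using the approximate formula based on chi-square, approximate the median by substituting s + b for n (“Asimov” data):

$$\mathrm{med}[Z] \approx \sqrt{2\left((s+b)\ln\left(1 + \frac{s}{b}\right) - s\right)}$$

For $s \ll b$, expanding the logarithm and keeping terms to $O(s^2)$,

$$\mathrm{med}[Z] \approx \frac{s}{\sqrt{b}}$$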
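To see how the two sensitivity approximations relate, the sketch below evaluates the Asimov expression and its small-signal limit side by side. The function names and the example values of s and b are illustrative assumptions.

```python
import math


def z_asimov(s, b):
    """Median significance from q0 evaluated at n = s + b (Asimov data)."""
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


def z_simple(s, b):
    """Leading term of the expansion for s much smaller than b."""
    return s / math.sqrt(b)


for s, b in [(1.0, 100.0), (10.0, 100.0), (10.0, 5.0)]:
    print(s, b, z_asimov(s, b), z_simple(s, b))
```

The two agree closely when s is small compared with b and drift apart as s/b grows, with s/sqrt(b) giving the larger value.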


Simple counting exp. with bkg. uncertainty

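Suppose b consists of several components, and that these are not precisely known but estimated from subsidiary measurements:

$m_i \sim \mathrm{Poisson}(\tau_i b_i)$, with $\tau_i$ a known scale factor,

$n \sim \mathrm{Poisson}(\mu s + b_{\rm tot})$, with $b_{\rm tot} = \sum_i b_i$.

Likelihood function for full set of measurements is:

$$L(\mu, \mathbf{b}) = \frac{(\mu s + b_{\rm tot})^n}{n!}\, e^{-(\mu s + b_{\rm tot})} \prod_i \frac{(\tau_i b_i)^{m_i}}{m_i!}\, e^{-\tau_i b_i}$$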


Profile likelihood ratio

To account for the nuisance parameters (systematics), test $\mu$ with the profile likelihood ratio:

$$\lambda(\mu) = \frac{L(\mu, \hat{\hat{b}})}{L(\hat{\mu}, \hat{b})}$$

Double hat: maximize L for the given $\mu$.

Single hats: maximize L wrt $\mu$ and b.

Important point is that $q_\mu = -2 \ln \lambda(\mu)$ is still related to the chi-square distribution even with nuisance parameters (for sufficiently large sample), so retain the simple formula for significance:

$$Z = \sqrt{q_0}$$
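For a single background component the profiling can be done in closed form. The sketch below assumes the simplest case, n ~ Poisson(mu*s + b) with one subsidiary measurement m ~ Poisson(tau*b), where tau is a known scale factor; the symbol tau, the helper names, and the example counts are assumptions made for illustration. Under mu = 0 the conditional estimate of b is (n + m)/(1 + tau), which leads to the expression coded here.

```python
import math


def _xlogy(x, y):
    """x * ln(y) with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def q0_profile(n, m, tau):
    """q0 = -2 ln lambda(0) for n ~ Poisson(mu*s + b), m ~ Poisson(tau*b)."""
    if n * tau <= m:
        return 0.0  # mu_hat would be negative; clipped to zero, so q0 = 0
    tot = n + m
    return 2.0 * (_xlogy(n, n * (1.0 + tau) / tot)
                  + _xlogy(m, m * (1.0 + tau) / (tau * tot)))


def z_profile(n, m, tau):
    """Approximate significance Z = sqrt(q0) with b profiled."""
    return math.sqrt(q0_profile(n, m, tau))


print(z_profile(20, 10, 1.0))
# A very precise control measurement approaches the known-b result.
print(q0_profile(20, 2.5e6, 1e6), 2.0 * (20 * math.log(20 / 2.5) + 2.5 - 20))
```

With several background components there is in general no closed form, and the conditional maximization would have to be done numerically.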


Examples from recent HN posts

From recent hypernews posts (Tetiana Hrynova, Xavier Prudent), consider s = 20.4, b = 2.5 ± 1.5. What is the “correct” sensitivity?

First suppose b = 2.5 exactly, then:

1) Use MC to find the median, assuming s = 20.4, of the significance Z. Best(?)

2) Use formula based on chi-square approx. for likelihood ratio: $Z = \sqrt{2\left((s+b)\ln(1+s/b) - s\right)}$. Good for s+b > dozen?

3) Use $Z = s/\sqrt{b}$. OK for $s \ll b$, b > dozen?
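The three approaches can be compared numerically for s = 20.4 and b = 2.5 taken as exact. This is a sketch, not the calculation behind the original posts: instead of an MC study it obtains the median n exactly from the Poisson cumulative distribution, which is valid because the exact significance increases monotonically with n. All function and variable names are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_tail(n_obs, mean):
    """P(n >= n_obs) for n ~ Poisson(mean), summing the upper tail."""
    if n_obs <= 0:
        return 1.0
    if mean <= 0.0:
        return 0.0
    n = n_obs
    term = math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))
    total = 0.0
    while True:
        total += term
        n += 1
        term *= mean / n
        if n > mean and term <= 1e-17 * total:
            break
    return total


def poisson_median(mean):
    """Smallest n with P(N <= n) >= 0.5 for N ~ Poisson(mean)."""
    n, term = 0, math.exp(-mean)
    cdf = term
    while cdf < 0.5:
        n += 1
        term *= mean / n
        cdf += term
    return n


s, b = 20.4, 2.5
n_med = poisson_median(s + b)                            # median n under s + b
z_exact = -NormalDist().inv_cdf(poisson_tail(n_med, b))  # approach 1 (exact p-value)
z_asimov = math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))  # approach 2
z_simple = s / math.sqrt(b)                              # approach 3
print(n_med, z_exact, z_asimov, z_simple)
```

In this regime, where s is much larger than b, the s/sqrt(b) value comes out far above the other two, which stay close to each other.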


Examples from recent HN posts (2)

To take into account the uncertainty in the background, need to understand the origin of the 2.5 ± 1.5.

Is this e.g. an estimate based on a Poisson measurement?

Use profile likelihood for nuisance parameter b.

Or is it a Gaussian prior (truncated at zero) with mean 2.5, $\sigma$ = 1.5?

Use “Cousins-Highland”
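One way to read the prior-based option is to average the Poisson p-value over the prior for b. The sketch below does this with a simple trapezoid rule over a Gaussian truncated at zero; it is a rough illustration of the idea under those assumptions, not a reference implementation of the Cousins-Highland procedure. The helper names, grid settings, and the choice of 10 observed events are hypothetical.

```python
import math


def poisson_tail(n_obs, mean):
    """P(n >= n_obs) for n ~ Poisson(mean), summing the upper tail."""
    if n_obs <= 0:
        return 1.0
    if mean <= 0.0:
        return 0.0
    n = n_obs
    term = math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))
    total = 0.0
    while True:
        total += term
        n += 1
        term *= mean / n
        if n > mean and term <= 1e-17 * total:
            break
    return total


def smeared_p_value(n_obs, b0, sigma, n_grid=2001, n_sigma=6.0):
    """Average P(n >= n_obs | b) over a Gaussian prior in b truncated at zero."""
    lo = max(0.0, b0 - n_sigma * sigma)
    hi = b0 + n_sigma * sigma
    step = (hi - lo) / (n_grid - 1)
    num = den = 0.0
    for i in range(n_grid):
        bi = lo + i * step
        w = math.exp(-0.5 * ((bi - b0) / sigma) ** 2)  # unnormalized prior
        if i in (0, n_grid - 1):
            w *= 0.5  # trapezoid end-point weights
        num += w * poisson_tail(n_obs, bi)
        den += w
    return num / den  # dividing by den normalizes the truncated prior


p_fixed = poisson_tail(10, 2.5)
p_smeared = smeared_p_value(10, 2.5, 1.5)
print(p_fixed, p_smeared)
```

The averaged p-value is larger than the fixed-b one here, reflecting the loss of significance from the background uncertainty.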


Look-elsewhere effect

The p-value should give the probability of rejecting the background-only hypothesis if it is true, i.e., the probability of a false discovery.

But, we carry out many tests, e.g., we look for a Higgs of many different masses.

Need to correct for the fact that the probability that one of these will result in a 5 sigma effect is then > $2.87 \times 10^{-7}$. Several approaches:

Treat signal parameter (e.g. Higgs mass) as a floating parameter in the likelihood ratio (Wilks’ thm no good?)

Compute trials factor with MC (find probability that one will reject bkg-only for some (any) point in signal par. space).

Approx. correction, e.g., ~ mass range / mass resolution.
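The approximate correction in the last approach can be sketched as follows, under the strong assumption that the search amounts to a number of independent tests equal to the mass range divided by the mass resolution. The function name and the example trials factor of 100 are hypothetical.

```python
import math


def global_p_value(p_local, n_trials):
    """P(at least one of n_trials independent tests gives p <= p_local)."""
    # 1 - (1 - p)^N, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n_trials * math.log1p(-p_local))


n_trials = 100.0  # hypothetical: mass range / mass resolution
print(global_p_value(2.87e-7, n_trials))  # close to n_trials * p_local
```

For small p the result is essentially the local p-value multiplied by the trials factor; correlated tests, as in overlapping mass windows, would need the MC-based approach instead.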

Ongoing discussion but should move towards more concrete guidelines.


Provisional conclusions

Key is to view p-value as the basic quantity of interest; Z is equivalent, and all “magic formulae” are various approximations for Z.

Also other considerations for discovery (and limits) beyond p-value, e.g., level to which signal described by data, plausibility of signal model, reliability of model for p-value, …

Also consider e.g. Bayes factors for complementary info.

StatForum should move towards firm recommendations on what formulae to use where possible, but cannot investigate every approximation; analysts must take some responsibility here.

Draft note (INT) attached to agenda on discovery significance; will also have partner note on limits.