Goodness of fit, confidence intervals and limits. Jorge Andre Swieca School, Campos do Jordão, January 2003. Fourth lecture.


Page 1: Goodness of fit,  confidence intervals and limits

Goodness of fit, confidence intervals and limits

Jorge Andre Swieca School

Campos do Jordão, January 2003

fourth lecture

Page 2: Goodness of fit,  confidence intervals and limits

References

• Statistical Data Analysis, G. Cowan, Oxford, 1998

• Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences, R. Barlow, J. Wiley & Sons, 1989

• Particle Data Group (PDG), Review of Particle Physics, 2002 electronic edition

• Data Analysis: Statistical and Computational Methods for Scientists and Engineers, S. Brandt, Third Edition, Springer, 1999

Page 3: Goodness of fit,  confidence intervals and limits

Limits

“Do you, like Hamlet, dread the unknown? But what is known? What is it that you know, that you should call anything in particular unknown?”

Álvaro de Campos (Fernando Pessoa)

“If you have the truth, keep it!” Lisbon Revisited, Álvaro de Campos

Page 4: Goodness of fit,  confidence intervals and limits

Statistical tests

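A statistical test quantifies how well the data agree with the probabilities predicted by a hypothesis.

null hypothesis $H_0$: $f(x|H_0)$

alternatives: $f(x|H_1)$, $f(x|H_2)$, ...

test statistic: a function of the measured variables, $t(x)$, with p.d.f. $g(t|H_0)$ under $H_0$

error of the first kind (significance level), for a critical region $t > t_{cut}$:

$$\alpha = \int_{t_{cut}}^{\infty} g(t|H_0)\, dt$$

error of the second kind:

$$\beta = \int_{-\infty}^{t_{cut}} g(t|H_1)\, dt$$

power $= 1 - \beta$: the power to discriminate against $H_1$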
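As a rough numerical illustration of these definitions, the sketch below evaluates α, β and the power for a single cut. The Gaussian shapes, their means and the value of the cut are assumptions chosen only for the example; they do not come from the lecture.

```python
from statistics import NormalDist

# Hypothetical setup (not from the lecture): t is Gaussian with unit width,
# centred at 0 under H0 and at 3 under H1; H0 is rejected when t > t_cut.
g_h0 = NormalDist(mu=0.0, sigma=1.0)
g_h1 = NormalDist(mu=3.0, sigma=1.0)


def error_rates(t_cut):
    """Return (alpha, beta, power) for the critical region t > t_cut."""
    alpha = 1.0 - g_h0.cdf(t_cut)  # integral of g(t|H0) from t_cut to infinity
    beta = g_h1.cdf(t_cut)         # integral of g(t|H1) from -infinity to t_cut
    return alpha, beta, 1.0 - beta


alpha, beta, power = error_rates(2.0)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Moving the cut to larger t lowers α but raises β, which is the trade-off that the choice of the cut has to settle.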

Page 5: Goodness of fit,  confidence intervals and limits

Neyman-Pearson lemma

Where to place $t_{cut}$? Take $H_0$ as signal and $H_1$ as background.

1-D: the cut sets the efficiency (and purity).

m-D, $t = (t_1, \ldots, t_m)$: the definition of the acceptance region is not obvious.

Neyman-Pearson lemma: the highest power (highest signal purity) for a given significance level α is obtained with the region of t-space such that

$$\frac{g(t|H_0)}{g(t|H_1)} > c$$

where $c$ is determined by the desired efficiency.

Page 6: Goodness of fit,  confidence intervals and limits

Goodness of fit

how well a given null hypothesis H0 is compatible with the observed data (no reference to other alternative hypothesis)

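coins: $N$ tosses, $n_h$ heads, $n_t = N - n_h$ tails. Is the coin “fair”? Are H and T equally likely?

test statistic: $n_h$, binomial distribution with $p = 0.5$

$$f(n_h; N) = \frac{N!}{n_h!\,(N - n_h)!} \left(\frac{1}{2}\right)^{n_h} \left(\frac{1}{2}\right)^{N - n_h}$$

$N = 20$, $n_h = 17$

$E[n_h] = Np = 10$

Values of $n_h$ at least as far from 10 as the observed one: 0, 1, 2, 3 and 17, 18, 19, 20.

$$P = f(0;20) + f(1;20) + f(2;20) + f(3;20) + f(17;20) + f(18;20) + f(19;20) + f(20;20)$$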
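The sum over the two tails can be checked numerically. The following is a minimal sketch for the fair-coin case; the function names are illustrative, and "at least as far from the expected value" is the compatibility ordering assumed here.

```python
from math import comb


def binomial_pmf(n_h, n_tosses, p=0.5):
    """Probability of n_h heads in n_tosses tosses."""
    return comb(n_tosses, n_h) * p**n_h * (1 - p) ** (n_tosses - n_h)


def two_sided_p_value(n_obs, n_tosses, p=0.5):
    """Sum the pmf over outcomes at least as far from N*p as n_obs."""
    expected = n_tosses * p
    distance = abs(n_obs - expected)
    return sum(
        binomial_pmf(k, n_tosses, p)
        for k in range(n_tosses + 1)
        if abs(k - expected) >= distance
    )


p_value = two_sided_p_value(17, 20)
print(f"P = {p_value:.4f}")  # P = 0.0026
```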

Page 7: Goodness of fit,  confidence intervals and limits

Goodness of fit

P = 0.0026. P-value: the probability, under H0, of obtaining a result as compatible with H0 as the one actually observed, or less so.

The P-value is a random variable; α is a constant specified before carrying out the test.

Bayesian statistics: use Bayes' theorem to assign a probability to H0 (this requires specifying the prior probability).

The P-value is often interpreted incorrectly as the probability of H0.

P-value: the fraction of times one would obtain data as compatible with H0, or less so, if the experiment (20 coin tosses) were repeated under similar circumstances.

Page 8: Goodness of fit,  confidence intervals and limits

Goodness of fit

For the coin it is easy to identify the region of values of t with an equal or lesser degree of compatibility with the hypothesis than the observed value (alternative hypothesis: p ≠ 0.5)

“optional stopping problem”

Page 9: Goodness of fit,  confidence intervals and limits

Significance of an observed signal

Whether a discrepancy between data and expectation is sufficiently significant to merit a claim for a new discovery

signal events: $n_s$, Poisson variable with mean $\nu_s$

background events: $n_b$, Poisson variable with mean $\nu_b$

$$n = n_s + n_b, \qquad \nu = \nu_s + \nu_b$$

prob. to observe n events:

$$f(n; \nu_s, \nu_b) = \frac{(\nu_s + \nu_b)^n}{n!}\, e^{-(\nu_s + \nu_b)}$$

experiment: $n_{obs}$ events; quantify our degree of confidence in the discovery of a new effect ($\nu_s \neq 0$)

How likely is it to find $n_{obs}$ events or more from background alone?

Page 10: Goodness of fit,  confidence intervals and limits

Significance of an observed signal

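$$P(n \geq n_{obs}) = \sum_{n=n_{obs}}^{\infty} f(n; \nu_s = 0, \nu_b) = 1 - \sum_{n=0}^{n_{obs}-1} f(n; \nu_s = 0, \nu_b) = 1 - \sum_{n=0}^{n_{obs}-1} \frac{\nu_b^n}{n!}\, e^{-\nu_b}$$

Ex: expect $\nu_b = 0.5$, $n_{obs} = 5$: $P(n \geq n_{obs}) = 1.7 \times 10^{-4}$

This is not the prob. of the hypothesis $\nu_s = 0$!

This is the prob., under the hypothesis $\nu_s = 0$, of obtaining as many events as observed or more.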
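The quoted value can be reproduced with a few lines. This is a small sketch of the background-only tail probability; the function names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(n, mean):
    """Probability of observing n events for a Poisson variable of given mean."""
    return mean**n * exp(-mean) / factorial(n)


def background_only_p_value(n_obs, nu_b):
    """P(n >= n_obs) for a Poisson variable of mean nu_b, i.e. with nu_s = 0."""
    return 1.0 - sum(poisson_pmf(n, nu_b) for n in range(n_obs))


p_value = background_only_p_value(5, 0.5)
print(f"P(n >= 5) = {p_value:.1e}")  # P(n >= 5) = 1.7e-04
```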

Page 11: Goodness of fit,  confidence intervals and limits

Significance of an observed signal

How to report the measurement?

estimate of $\nu_s$: $\hat{\nu}_s = n_{obs} - \nu_b = 4.5 \pm 2.2$

misleading:
• only two std. deviations from zero
• gives the impression that $\nu_s$ is not very incompatible with zero

yes: prob. that a Poisson variable of mean $\nu_b$ will fluctuate up to $n_{obs}$ or higher

no: prob. that a variable with mean $n_{obs}$ will fluctuate down to $\nu_b$ or lower

Page 12: Goodness of fit,  confidence intervals and limits

Pearson’s $\chi^2$ test

histogram of x with N bins: $n_i$ entries observed and $\nu_i$ expected in bin i

construct a statistic which reflects the level of agreement between observed and expected histograms

$$\chi^2 = \sum_{i=1}^{N} \frac{(n_i - \nu_i)^2}{\nu_i}$$

data $n = (n_1, \ldots, n_N)$: Poisson distributed with means $\nu = (\nu_1, \ldots, \nu_N)$, approx. Gaussian (for $n_i \geq 5$)

Then $\chi^2$ follows a $\chi^2$ distribution for N degrees of freedom
• regardless of the distribution of x
• distribution free

larger $\chi^2$ → larger discrepancy between data and the hypothesis

Page 13: Goodness of fit,  confidence intervals and limits

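Pearson’s $\chi^2$ test

$$P = \int_{\chi^2}^{\infty} f(z; n_d)\, dz, \qquad E[\chi^2] = n_d$$

$$\frac{\chi^2}{n_d} \approx 1 \quad \text{(rule of thumb for a good fit)}$$

$\chi^2 = 15$, $n_d = 10$: $P = 0.13$

$\chi^2 = 150$, $n_d = 100$: $P = 9.0 \times 10^{-4}$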
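A short sketch of the statistic and its P-value, using only the standard library, is given below. The closed form for the upper tail holds only for an even number of degrees of freedom, which covers both examples on this slide; odd values would need an incomplete gamma function instead. The four-bin histogram is made up for illustration.

```python
from math import exp


def pearson_chi2(observed, expected):
    """Pearson's statistic: sum over bins of (n_i - nu_i)^2 / nu_i."""
    return sum((n - nu) ** 2 / nu for n, nu in zip(observed, expected))


def chi2_p_value_even_dof(chi2, n_d):
    """Upper-tail probability of a chi-square variable; even n_d only."""
    if n_d <= 0 or n_d % 2:
        raise ValueError("closed form needs a positive even n_d")
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n_d // 2):
        term *= half / k  # (chi2/2)^k / k!
        total += term
    return exp(-half) * total


# Hypothetical histogram: four bins with ten entries expected in each.
chi2 = pearson_chi2([12, 9, 11, 8], [10, 10, 10, 10])
print(chi2, chi2_p_value_even_dof(chi2, 4))

print(f"{chi2_p_value_even_dof(15, 10):.2f}")   # 0.13
print(f"{chi2_p_value_even_dof(150, 100):.1e}")
```

Both slide examples have the same ratio of 1.5 between the statistic and the number of degrees of freedom, yet very different P-values, so the ratio alone is only a rule of thumb.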

Page 14: Goodness of fit,  confidence intervals and limits

Pearson’s $\chi^2$ test

Page 15: Goodness of fit,  confidence intervals and limits

Pearson’s $\chi^2$ test

Before: $n_{tot} = \sum_{i=1}^{N} n_i$ is a Poisson variable with mean $\nu_{tot} = \sum_{i=1}^{N} \nu_i$

Set $n_{tot}$ fixed: the $n_i$ are distributed as multinomial with prob. $p_i = \nu_i / n_{tot}$

Not testing the total number of expected and observed events, but only the distribution of x.

$$\chi^2 = \sum_{i=1}^{N} \frac{(n_i - p_i n_{tot})^2}{p_i n_{tot}}$$

For a large number of entries in each bin and known $p_i$, this follows a $\chi^2$ distribution for N-1 degrees of freedom.

In general, if m parameters are estimated from the data, $n_d = N - m$.

Page 16: Goodness of fit,  confidence intervals and limits

ML: estimator for θ

Standard deviation as stat. error

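n observations of x, hypothesis: p.d.f. $f(x;\theta)$

ML estimator $\hat{\theta}(x_1, \ldots, x_n)$; its standard deviation $\sigma_{\hat{\theta}}$ can be obtained by:
• analytic method
• RCF bound
• Monte Carlo
• graphical method

measurement: $\hat{\theta}_{obs} \pm \hat{\sigma}_{\hat{\theta}}$

Repeated estimates, each based on n observations: the estimator is distributed according to $g(\hat{\theta};\theta)$, centered around the true value θ and with true standard deviation $\sigma_{\hat{\theta}}$; these are estimated by $\hat{\theta}_{obs}$ and $\hat{\sigma}_{\hat{\theta}}$.

Most practical estimators: $g(\hat{\theta};\theta)$ becomes approx. Gaussian in the large sample limit.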

Page 17: Goodness of fit,  confidence intervals and limits

Classical confidence intervals

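n obs. of x, evaluate an estimator $\hat{\theta}(x_1, \ldots, x_n)$ for a param. θ

$\hat{\theta}_{obs}$ obtained, and its p.d.f. $g(\hat{\theta};\theta)$ (for a given, unknown θ)

$u_\alpha(\theta)$: value above which $\hat{\theta}$ falls with prob. α

$v_\beta(\theta)$: value below which $\hat{\theta}$ falls with prob. β

$$\alpha = P(\hat{\theta} \geq u_\alpha(\theta)) = \int_{u_\alpha(\theta)}^{\infty} g(\hat{\theta};\theta)\, d\hat{\theta} = 1 - G(u_\alpha(\theta);\theta)$$

$$\beta = P(\hat{\theta} \leq v_\beta(\theta)) = \int_{-\infty}^{v_\beta(\theta)} g(\hat{\theta};\theta)\, d\hat{\theta} = G(v_\beta(\theta);\theta)$$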

Page 18: Goodness of fit,  confidence intervals and limits

Classical confidence intervals

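prob. for the estimator to be inside the belt, regardless of θ:

$$P(v_\beta(\theta) \leq \hat{\theta} \leq u_\alpha(\theta)) = 1 - \alpha - \beta$$

$u_\alpha(\theta)$, $v_\beta(\theta)$: monotonic increasing functions of θ, with inverses

$$a(\hat{\theta}) = u_\alpha^{-1}(\hat{\theta}), \qquad b(\hat{\theta}) = v_\beta^{-1}(\hat{\theta})$$

$$\hat{\theta} \geq u_\alpha(\theta) \;\Rightarrow\; a(\hat{\theta}) \geq \theta, \qquad \hat{\theta} \leq v_\beta(\theta) \;\Rightarrow\; b(\hat{\theta}) \leq \theta$$

$$P(a(\hat{\theta}) \geq \theta) = \alpha, \qquad P(b(\hat{\theta}) \leq \theta) = \beta$$

$$P(a(\hat{\theta}) \leq \theta \leq b(\hat{\theta})) = 1 - \alpha - \beta$$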

Page 19: Goodness of fit,  confidence intervals and limits

Classical confidence intervals

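Usually: central confidence interval, $\alpha = \beta = \gamma/2$

$$P(a \leq \theta \leq b) = 1 - \gamma$$

a: hypothetical value of θ for which a fraction α of the repeated estimates $\hat{\theta}$ would be higher than the obtained $\hat{\theta}_{obs}$.

$$\hat{\theta}_{obs} = u_\alpha(a) = v_\beta(b)$$

$$\alpha = \int_{\hat{\theta}_{obs}}^{\infty} g(\hat{\theta}; a)\, d\hat{\theta} = 1 - G(\hat{\theta}_{obs}; a)$$

$$\beta = \int_{-\infty}^{\hat{\theta}_{obs}} g(\hat{\theta}; b)\, d\hat{\theta} = G(\hat{\theta}_{obs}; b)$$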

Page 20: Goodness of fit,  confidence intervals and limits

Classical confidence intervals

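Relationship between a conf. interval and a test of goodness of fit:

test the hypothesis $\theta = a$, taking values $\hat{\theta} \geq \hat{\theta}_{obs}$ as having equal or less agreement than the result obtained

P-value = α (random variable) and θ = a is specified

Confidence interval: α is specified first, a is a random quantity depending on the data

The interval $[a, b]$ is reported as $\hat{\theta}^{+d}_{-c}$, with $c = \hat{\theta} - a$ and $d = b - \hat{\theta}$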

Page 21: Goodness of fit,  confidence intervals and limits

Classical confidence intervals

Many experiments: the interval would include the true value θ in a fraction $1 - \alpha - \beta$ of them.

It does not mean that the probability that the true value of θ is in the fixed interval is $1 - \alpha - \beta$.

Frequency interpretation: θ is not a random variable, but the interval fluctuates since it is constructed from data.
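The frequency interpretation can be illustrated with a toy Monte Carlo. The sketch below assumes a Gaussian estimator with known standard deviation (the case treated in the following slides); the true value, the width, the number of experiments and the seed are arbitrary choices for the example.

```python
import random
from statistics import NormalDist


def coverage(theta_true, sigma, alpha, beta, n_experiments=20000, seed=1):
    """Fraction of simulated experiments whose interval [a, b] contains theta_true."""
    rng = random.Random(seed)
    z_a = NormalDist().inv_cdf(1.0 - alpha)
    z_b = NormalDist().inv_cdf(1.0 - beta)
    inside = 0
    for _ in range(n_experiments):
        theta_hat = rng.gauss(theta_true, sigma)  # one simulated measurement
        a = theta_hat - sigma * z_a               # interval limits move with the data
        b = theta_hat + sigma * z_b
        if a <= theta_true <= b:
            inside += 1
    return inside / n_experiments


fraction = coverage(theta_true=2.0, sigma=0.5, alpha=0.05, beta=0.05)
print(f"covered in {fraction:.3f} of the experiments (nominal 0.90)")
```

The true value stays fixed throughout; only the interval changes from one simulated experiment to the next.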

Page 22: Goodness of fit,  confidence intervals and limits

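Gaussian distributed $\hat{\theta}$

Simple and very important application

Central limit theorem: any estimator that is a linear function of a sum of random variables becomes Gaussian in the large sample limit.

$$G(\hat{\theta}; \theta, \sigma_{\hat{\theta}}) = \int_{-\infty}^{\hat{\theta}} \frac{1}{\sqrt{2\pi\sigma_{\hat{\theta}}^2}} \exp\left(-\frac{(\hat{\theta}' - \theta)^2}{2\sigma_{\hat{\theta}}^2}\right) d\hat{\theta}'$$

$\sigma_{\hat{\theta}}$ known, experiment resulted in $\hat{\theta}_{obs}$

$$\alpha = 1 - G(\hat{\theta}_{obs}; a, \sigma_{\hat{\theta}}) = 1 - \Phi\left(\frac{\hat{\theta}_{obs} - a}{\sigma_{\hat{\theta}}}\right)$$

$$\beta = G(\hat{\theta}_{obs}; b, \sigma_{\hat{\theta}}) = \Phi\left(\frac{\hat{\theta}_{obs} - b}{\sigma_{\hat{\theta}}}\right)$$

where Φ is the cumulative distribution of the standard Gaussian.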

Page 23: Goodness of fit,  confidence intervals and limits

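Gaussian distributed $\hat{\theta}$

$$a = \hat{\theta}_{obs} - \sigma_{\hat{\theta}}\, \Phi^{-1}(1 - \alpha)$$

$$b = \hat{\theta}_{obs} + \sigma_{\hat{\theta}}\, \Phi^{-1}(1 - \beta)$$

$$\Phi^{-1}(\beta) = -\Phi^{-1}(1 - \beta)$$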
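These two limits translate directly into code. The sketch below uses the standard library's `NormalDist.inv_cdf` for the inverse of Φ; the function name and the measured value are illustrative assumptions.

```python
from statistics import NormalDist


def gaussian_interval(theta_obs, sigma, alpha, beta):
    """Limits [a, b] for a Gaussian estimator with known standard deviation."""
    phi_inv = NormalDist().inv_cdf  # quantile of the standard Gaussian
    a = theta_obs - sigma * phi_inv(1.0 - alpha)
    b = theta_obs + sigma * phi_inv(1.0 - beta)
    return a, b


# Hypothetical measurement: 10.0 with sigma = 2.0, central 95% interval.
a, b = gaussian_interval(10.0, 2.0, alpha=0.025, beta=0.025)
print(f"[{a:.2f}, {b:.2f}]")  # [6.08, 13.92]
```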

Page 24: Goodness of fit,  confidence intervals and limits

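Gaussian distributed $\hat{\theta}$

Choose the quantile:

central interval                        one-sided limit
$\Phi^{-1}(1-\gamma/2)$    $1-\gamma$       $\Phi^{-1}(1-\alpha)$    $1-\alpha$
1                          0.6827           1                        0.8413
2                          0.9544           2                        0.9772
3                          0.9973           3                        0.9987

Choose the confidence level:

central interval                        one-sided limit
$1-\gamma$    $\Phi^{-1}(1-\gamma/2)$       $1-\alpha$    $\Phi^{-1}(1-\alpha)$
0.90          1.645                         0.90          1.282
0.95          1.960                         0.95          1.645
0.99          2.576                         0.99          2.326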
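The tabulated numbers can be regenerated with the standard Gaussian from Python's standard library. This is only a sketch; the helper names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def central_level(quantile):
    """Confidence level 1 - gamma of the central interval at +/- quantile sigma."""
    return 2.0 * std_normal.cdf(quantile) - 1.0


def one_sided_level(quantile):
    """Confidence level 1 - alpha of a one-sided limit at quantile sigma."""
    return std_normal.cdf(quantile)


def central_quantile(level):
    """Quantile for a central interval with confidence level 1 - gamma."""
    return std_normal.inv_cdf(1.0 - (1.0 - level) / 2.0)


def one_sided_quantile(level):
    """Quantile for a one-sided limit with confidence level 1 - alpha."""
    return std_normal.inv_cdf(level)


for q in (1, 2, 3):
    print(q, f"{central_level(q):.4f}", f"{one_sided_level(q):.4f}")

for cl in (0.90, 0.95, 0.99):
    print(cl, f"{central_quantile(cl):.3f}", f"{one_sided_quantile(cl):.3f}")
```

The central level for a quantile of 2 evaluates to 0.95450, which rounds to 0.9545; the tabulated 0.9544 is the same number truncated to four digits.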