
Hypothesis Testing the Bayesian Way

Bayesian Scientific Computing, Spring 2013

Prof. Nicholas Zabaras
Materials Process Design and Control Laboratory
Sibley School of Mechanical and Aerospace Engineering
101 Frank H. T. Rhodes Hall
Cornell University
Ithaca, NY 14853-3801
Email: [email protected]
URL: http://mpdc.mae.cornell.edu/

December 27, 2013

Introduction to Bayesian Statistics

Reverend Thomas Bayes (ca. 1702–1761). His sole probability paper, "Essay Towards Solving a Problem in the Doctrine of Chances", was published posthumously in 1763.

References

C. P. Robert, The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation, Springer-Verlag, NY, 2001 (online resource).
A. Gelman, J. B. Carlin, H. S. Stern and D. B. Rubin, Bayesian Data Analysis, Chapman and Hall/CRC Press, 2nd edition, 2003.
J. M. Marin and C. P. Robert, Bayesian Core, Springer-Verlag, 2007 (online resource).
D. Sivia and J. Skilling, Data Analysis: A Bayesian Tutorial, Oxford University Press, 2006.
B. Vidakovic, Bayesian Statistics for Engineering, online course at Georgia Tech.

Additional references with links are provided in the lecture slides.

Contents

Hypothesis testing the Bayesian way
Examples of parametric Bayesian models
Examples of Bayesian prediction
The sequential nature of Bayesian inference: a Gaussian example
Bayes vs. MLE (the limit of large data sets)
Example: Bayes and the Poisson model
Hypothesis testing in a Bayesian framework (integration in parameter space)
Summary: Bayesian point estimates

Hypothesis Testing the Bayesian Way

Consider two hypotheses in coin flipping:

h1: the coin is fair, with prior p(h1)
h2: the coin always produces tails, with prior p(h2)

We flip the coin 5 times and obtain the data x = {H T H T T}.

Inference: we want to assess the validity of the two hypotheses.

Likelihoods $f(x \mid h_i)$:

$$f(x \mid h_1) = \left(\frac{1}{2}\right)^5, \qquad f(x \mid h_2) = 0$$

Posterior:

$$\frac{\pi(h_1 \mid x)}{\pi(h_2 \mid x)} = \frac{f(x \mid h_1)\,p(h_1)}{f(x \mid h_2)\,p(h_2)}$$

Since the data contain heads, $f(x \mid h_2) = 0$ and the posterior puts all its mass on h1: here there is no effect of the prior.

Hypothesis Testing the Bayesian Way

Consider the same two hypotheses in coin flipping:

h1: the coin is fair, with prior p(h1)
h2: the coin always produces tails, with prior p(h2)

We flip the coin 5 times and obtain the data x = {T T T T T}.

Inference: we want to assess the validity of the two hypotheses.

Likelihoods $f(x \mid h_i)$:

$$f(x \mid h_1) = \left(\frac{1}{2}\right)^5 = \frac{1}{32}, \qquad f(x \mid h_2) = 1$$

Posterior:

$$\frac{\pi(h_1 \mid x)}{\pi(h_2 \mid x)} = \frac{f(x \mid h_1)\,p(h_1)}{f(x \mid h_2)\,p(h_2)} = \frac{1}{32}\,\frac{p(h_1)}{p(h_2)}$$

The data (evidence) point to "tails", but the posterior inferences also depend strongly on the priors!
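A minimal numerical sketch of both cases (not from the original slides; the equal priors p(h1) = p(h2) = 1/2 are an assumption for the illustration):

def posterior(data, p_h1=0.5, p_h2=0.5):
    # Posterior over {h1: fair coin, h2: always tails} for a string of H/T flips.
    lik_h1 = 0.5 ** len(data)                 # fair coin: 1/2 per flip
    lik_h2 = 0.0 if "H" in data else 1.0      # always-tails coin
    evidence = lik_h1 * p_h1 + lik_h2 * p_h2  # m(x)
    return lik_h1 * p_h1 / evidence, lik_h2 * p_h2 / evidence

print(posterior("HTHTT"))  # (1.0, 0.0): h2 is ruled out, whatever the priors
print(posterior("TTTTT"))  # (0.0303..., 0.9696...): now the priors matter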

Parametric Bayesian Models

Let $\theta \in [0,1]$ be the probability that the coin lands heads.

Let the prior $\pi(\theta)$ be uniform:

$$\pi(\theta) = \mathcal{U}_{[0,1]}(\theta)$$

Consider that we have the data x = {H T H T T}. We want to make an inference about $\theta$.

Likelihood (binomial):

$$f(x \mid \theta) \propto \theta^2 (1-\theta)^3$$

Posterior:

$$\pi(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{m(x)} = \frac{\theta^2 (1-\theta)^3}{B(3,4)} = \mathcal{B}e(3,4)$$

[Figure: posterior pdfs of $\theta$ for the data sets 2H-3T, 20H-30T, and 0H-5T.]
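The original slide links to a MATLAB implementation; in its place, here is a minimal Python sketch (assuming numpy, scipy, and matplotlib) that reproduces the posterior densities in the figure:

import numpy as np
from scipy.stats import beta
import matplotlib.pyplot as plt

# Under a uniform prior, nH heads and nT tails give a Beta(nH+1, nT+1) posterior.
theta = np.linspace(0, 1, 500)
for nH, nT in [(2, 3), (20, 30), (0, 5)]:
    plt.plot(theta, beta.pdf(theta, nH + 1, nT + 1), label=f"{nH}H - {nT}T")
plt.xlabel("theta")
plt.ylabel("pdf")
plt.legend()
plt.show()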

Parametric Bayesian Models

Let $\theta$ be the probability that the coin lands heads.

Suppose that $\mathcal{B}e(a,b)$ is now our prior distribution:

$$\pi(\theta) = \frac{1}{B(a,b)}\,\theta^{a-1}(1-\theta)^{b-1} = \mathcal{B}e(a,b)$$

We are given the data x = {H T H T T}.

Likelihood (binomial):

$$f(x \mid \theta) \propto \theta^2 (1-\theta)^3$$

Posterior:

$$\pi(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{m(x)} = \frac{\theta^{a+2-1}(1-\theta)^{b+3-1}}{B(a+2,\,b+3)} = \mathcal{B}e(a+2,\,b+3)$$

[Figure: posterior pdfs of $\theta$ under Beta priors with (a,b) = (1,1), (100,100), and (100,1).]
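To see how strongly these priors pull the posterior after only five flips, a quick check (assuming scipy) of the posterior means, which equal $(a+2)/(a+b+5)$:

from scipy.stats import beta

# Data x = {HTHTT}: 2 heads, 3 tails, so the posterior is Beta(a+2, b+3).
for a, b in [(1, 1), (100, 100), (100, 1)]:
    post = beta(a + 2, b + 3)
    print(f"a={a:3d}, b={b:3d}: posterior mean = {post.mean():.3f}")
# a=  1, b=  1: posterior mean = 0.300
# a=100, b=100: posterior mean = 0.498
# a=100, b=  1: posterior mean = 0.962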

Parametric Bayesian Models

Data given: the coin is flipped n times, and nH of the flips came up heads.

Prior: we consider the Beta prior $\mathcal{B}e(a,b)$ as in the earlier slide.

Posterior:

$$\pi(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{m(x)} = \frac{\theta^{a+n_H-1}(1-\theta)^{b+n-n_H-1}}{B(a+n_H,\ b+n-n_H)} = \mathcal{B}e(a+n_H,\ b+n-n_H)$$

Posterior mean:

$$\mathbb{E}[\theta \mid x] = \frac{a+n_H}{a+b+n} \;\longrightarrow\; \frac{n_H}{n} \quad as\ n \to \infty$$

Posterior variance:

$$\mathrm{Var}[\theta \mid x] = \frac{(a+n_H)(b+n-n_H)}{(a+b+n)^2\,(a+b+n+1)}$$

Note that $\mathrm{Var}[\theta \mid x] \to 0$ as $O(1/n)$.
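A minimal numerical check of these asymptotics (the true heads probability 0.3 and the Beta(2,2) prior are assumptions for the illustration):

from scipy.stats import beta

a, b = 2.0, 2.0
true_theta = 0.3
for n in [10, 100, 1000, 10000]:
    nH = round(true_theta * n)          # stand-in for the observed number of heads
    post = beta(a + nH, b + n - nH)
    print(f"n={n:6d}: mean={post.mean():.4f}, var={post.var():.2e}, n*var={n * post.var():.3f}")
# n*var approaches theta*(1-theta) = 0.21, i.e. the variance decays as O(1/n).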

Prediction

Suppose we have observed x and we want to make a prediction about (future) unknown observables: what is the probability of observing data $\tilde{x}$ if we have already observed the data x?

This means finding $g(\tilde{x} \mid x)$. We have:

$$g(\tilde{x} \mid x) = \int g(\tilde{x}, \theta \mid x)\,d\theta = \int \frac{\pi(\tilde{x}, x, \theta)}{m(x)}\,d\theta = \int \frac{\pi(\tilde{x}, x, \theta)}{\pi(x, \theta)}\,\frac{\pi(x, \theta)}{m(x)}\,d\theta = \int f(\tilde{x} \mid \theta, x)\,\pi(\theta \mid x)\,d\theta = \int \underbrace{f(\tilde{x} \mid \theta)}_{likelihood}\,\underbrace{\pi(\theta \mid x)}_{posterior}\,d\theta$$

(the last step uses the conditional independence of $\tilde{x}$ and x given $\theta$).

Compare this with the normalizing factor:

$$m(x) = \int \underbrace{f(x \mid \theta)}_{Likelihood}\,\underbrace{\pi(\theta)}_{Prior}\,d\theta$$

Prediction: Example

Consider the coin-flipping example:

Let $\theta$ be the probability that the coin draws heads.
Consider a uniform prior $\pi(\theta)$.
Given the data x = {H T H T T}, we have seen that the posterior is $\mathcal{B}e(3,4)$.

What is the probability that the next draw $\tilde{x}$ will be heads?

$$g(H \mid x) = \int f(H \mid \theta)\,\pi(\theta \mid x)\,d\theta = \int_0^1 \theta\,\frac{\theta^2(1-\theta)^3}{B(3,4)}\,d\theta = \frac{B(4,4)}{B(3,4)} = \frac{3}{7}$$

Prediction

Consider the same coin-flipping example: uniform prior $\pi(\theta)$, data x = {H T H T T}, posterior $\mathcal{B}e(3,4)$.

What is the probability that the next 5 draws will all be heads?

$$g(HHHHH \mid x) = \int f(HHHHH \mid \theta)\,\pi(\theta \mid x)\,d\theta = \int_0^1 \theta^5\,\frac{\theta^2(1-\theta)^3}{B(3,4)}\,d\theta = \frac{B(8,4)}{B(3,4)} = \frac{1}{22}$$

Note that $1/22 \approx 0.045$ is larger than $(3/7)^5 \approx 0.014$: the draws are exchangeable but not independent, since each observed head makes larger values of $\theta$ more plausible.
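A quick sketch checking both predictive probabilities, in closed form and by Monte Carlo over the $\mathcal{B}e(3,4)$ posterior (assuming numpy and scipy):

import numpy as np
from scipy.special import beta as B   # the Beta function B(a, b)
from scipy.stats import beta          # the Beta distribution

theta = beta(3, 4).rvs(100_000, random_state=0)  # draws from the posterior

# Next draw is heads: B(4,4)/B(3,4) = 3/7 vs. Monte Carlo estimate of E[theta | x]
print(B(4, 4) / B(3, 4), theta.mean())
# Next five draws all heads: B(8,4)/B(3,4) = 1/22 vs. Monte Carlo E[theta^5 | x]
print(B(8, 4) / B(3, 4), (theta ** 5).mean())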

The Bayesian Analysis Is Sequential

Using some given data x, we computed the posterior $\pi(\theta \mid x)$. If new data $x^*$ arrive, how can we update our inference?

We assume that x and $x^*$ are conditionally independent given $\theta$, i.e.:

$$f(x, x^* \mid \theta) = f(x \mid \theta)\,f(x^* \mid \theta)$$

The augmented posterior (based on both x and $x^*$) is:

$$\pi(\theta \mid x, x^*) = \frac{f(x, x^* \mid \theta)\,\pi(\theta)}{m(x, x^*)} = \frac{f(x^* \mid \theta)\,f(x \mid \theta)\,\pi(\theta)}{m(x, x^*)} \propto f(x^* \mid \theta)\,\pi(\theta \mid x)$$

Note that the prior now is our old posterior computed with the data x. Thus Bayesian analysis is sequential.

Sequential Nature of Bayesian Inference

Assume we have observed $x_1 \sim \mathcal{N}(\theta, \sigma^2)$ and computed the corresponding posterior. Now we observe a second realization $X_2 \mid \theta \sim \mathcal{N}(\theta, \sigma^2)$.

We are interested in updating our posterior:

$$\pi(\theta \mid x_1, x_2) \propto f(x_2 \mid \theta, x_1)\,\pi(\theta \mid x_1) = f(x_2 \mid \theta)\,\pi(\theta \mid x_1) \propto f(x_2 \mid \theta)\,f(x_1 \mid \theta)\,\pi(\theta)$$

Updating the prior one observation at a time or with all observations together does not matter (see the sketch below).

The sequential approach is useful for massive data sets:

$$\pi(\theta \mid x_1, x_2, \ldots, x_n) \propto f(x_n \mid \theta)\,\pi(\theta \mid x_1, x_2, \ldots, x_{n-1})$$

i.e. the prior at time n is the posterior at time n-1.
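A small sketch (with a hypothetical string of coin flips and the conjugate Beta model from the earlier slides) showing that one-at-a-time updating and batch updating give the same posterior:

data = "HTHTTHHTTT"        # hypothetical flips: 4 heads, 6 tails

# Sequential: today's prior is yesterday's posterior, starting from Beta(1,1).
a, b = 1, 1
for flip in data:
    if flip == "H":
        a += 1
    else:
        b += 1
print(a, b)                # Beta(5, 7)

# Batch: all observations at once give the same answer.
nH = data.count("H")
print(1 + nH, 1 + len(data) - nH)   # Beta(5, 7) again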

A Gaussian Example

Consider $X \mid \theta \sim \mathcal{N}(\theta, \sigma^2)$, with prior $\theta \sim \mathcal{N}(\mu_0, \tau_0^2)$.

Then we can derive the following:

$$\pi(\theta \mid x_1) \propto f(x_1 \mid \theta)\,\pi(\theta) \propto \exp\!\left(-\frac{(x_1-\theta)^2}{2\sigma^2}\right)\exp\!\left(-\frac{(\theta-\mu_0)^2}{2\tau_0^2}\right)$$

Collecting the quadratic terms in $\theta$:

$$\pi(\theta \mid x_1) \propto \exp\!\left(-\frac{1}{2}\left[\frac{(x_1-\theta)^2}{\sigma^2} + \frac{(\theta-\mu_0)^2}{\tau_0^2}\right]\right) \propto \exp\!\left(-\frac{1}{2\tau_1^2}(\theta-\mu_1)^2\right)$$

Hence:

$$\theta \mid x_1 \sim \mathcal{N}(\mu_1, \tau_1^2) \quad with \quad \frac{1}{\tau_1^2} = \frac{1}{\tau_0^2} + \frac{1}{\sigma^2} \quad and \quad \mu_1 = \tau_1^2\left(\frac{\mu_0}{\tau_0^2} + \frac{x_1}{\sigma^2}\right)$$
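A minimal sketch of this update with made-up numbers ($\sigma^2 = 1$, prior $\mathcal{N}(0, 4)$, and the single observation $x_1 = 2.5$ are all assumptions for the illustration):

sigma2 = 1.0                  # known observation variance
mu0, tau0_sq = 0.0, 4.0       # prior mean and variance
x1 = 2.5                      # one hypothetical observation

tau1_sq = 1.0 / (1.0 / tau0_sq + 1.0 / sigma2)   # posterior variance
mu1 = tau1_sq * (mu0 / tau0_sq + x1 / sigma2)    # posterior mean
print(mu1, tau1_sq)           # 2.0, 0.8: a precision-weighted average of mu0 and x1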

A Gaussian Example: Continued

To predict the distribution of a new observation $\tilde{X} \mid \theta \sim \mathcal{N}(\theta, \sigma^2)$ in light of $x_1$, we use the predictive distribution as follows:

$$f(\tilde{x} \mid x_1) = \int f(\tilde{x} \mid \theta, x_1)\,\pi(\theta \mid x_1)\,d\theta \propto \int \underbrace{e^{-\frac{(\tilde{x}-\theta)^2}{2\sigma^2}}}_{Likelihood}\,\underbrace{e^{-\frac{(\theta-\mu_1)^2}{2\tau_1^2}}}_{Posterior}\,d\theta$$

We use the properties of the bivariate normal distribution. The product in the integrand is the exponential of a quadratic function of $(\tilde{x}, \theta)$; hence $(\tilde{x}, \theta)$ have a joint normal distribution. Completing the square, one can verify that the exponent is $-\frac{(\tilde{x}-\mu_1)^2}{2(\sigma^2+\tau_1^2)}$ plus a quadratic in $\theta$ that integrates out.

Thus the marginal (integrating out $\theta$) is a Gaussian with:

$$\tilde{X} \mid x_1 \sim \mathcal{N}(\mu_1,\ \sigma^2 + \tau_1^2)$$
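A quick Monte Carlo check of this result (assuming numpy), reusing the numbers from the previous sketch: draw $\theta$ from the posterior, then $\tilde{x}$ given $\theta$, and compare with $\mathcal{N}(\mu_1, \sigma^2 + \tau_1^2)$:

import numpy as np

rng = np.random.default_rng(0)
mu1, tau1_sq, sigma2 = 2.0, 0.8, 1.0                 # posterior and model variance
theta = rng.normal(mu1, np.sqrt(tau1_sq), 100_000)   # theta ~ posterior
x_new = rng.normal(theta, np.sqrt(sigma2))           # x_new | theta ~ N(theta, sigma^2)
print(x_new.mean(), x_new.var())                     # ~2.0 and ~1.8 = sigma^2 + tau1^2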

A Gaussian Example: Continued

We can also derive the same result using the fact that $f(\tilde{x} \mid x_1)$ is Gaussian, so it is characterized fully by its mean and variance (use the earlier posterior results).

Use $\mathbb{E}[u] = \mathbb{E}[\mathbb{E}(u \mid v)]$:

$$\mathbb{E}[\tilde{X} \mid x_1] = \mathbb{E}\big[\mathbb{E}(\tilde{X} \mid \theta, x_1) \mid x_1\big] = \mathbb{E}[\theta \mid x_1] = \mu_1 \quad (the\ posterior\ mean)$$

Similarly, use $\mathrm{Var}(u) = \mathbb{E}[\mathrm{Var}(u \mid v)] + \mathrm{Var}[\mathbb{E}(u \mid v)]$:

$$\mathrm{Var}(\tilde{X} \mid x_1) = \underbrace{\mathbb{E}\big[\mathrm{Var}(\tilde{X} \mid \theta, x_1) \mid x_1\big]}_{model\ variance\ =\ \sigma^2} + \underbrace{\mathrm{Var}\big[\mathbb{E}(\tilde{X} \mid \theta, x_1) \mid x_1\big]}_{variance\ due\ to\ posterior\ uncertainty\ in\ \theta\ =\ \tau_1^2} = \sigma^2 + \tau_1^2$$

Thus we obtain the same result as before:

$$\tilde{X} \mid x_1 \sim \mathcal{N}(\mu_1,\ \sigma^2 + \tau_1^2)$$

Proof of the Conditional Expectations

Note that $\mathbb{E}[u] = \mathbb{E}[\mathbb{E}(u \mid v)]$ follows from:

$$\mathbb{E}[u] = \int u\,\pi(u)\,du = \int u \int \pi(u \mid v)\,\pi(v)\,dv\,du = \int \left[\int u\,\pi(u \mid v)\,du\right]\pi(v)\,dv = \mathbb{E}\big[\mathbb{E}(u \mid v)\big]$$

Let us re-write this equation conditioning on $x_1$:

$$\mathbb{E}[u \mid x_1] = \int u\,\pi(u \mid x_1)\,du = \int u \int \pi(u \mid v, x_1)\,\pi(v \mid x_1)\,dv\,du = \mathbb{E}\big[\mathbb{E}(u \mid v, x_1) \mid x_1\big]$$

A similar derivation can be given for the variance:

$$\mathrm{Var}(u \mid x_1) = \mathbb{E}\big[\mathrm{Var}(u \mid v, x_1) \mid x_1\big] + \mathrm{Var}\big[\mathbb{E}(u \mid v, x_1) \mid x_1\big]$$

Gaussian Example: Bayes versus MLE

We have seen that the ML estimate of $\theta$ at time N is simply the sample mean:

$$\hat{\theta}_{ML} = \arg\sup_{\theta} \prod_{i=1}^N f(x_i \mid \theta) = \frac{1}{N}\sum_{i=1}^N x_i$$

The posterior of $\theta$ at time N is (simply generalizing the earlier result):

$$\theta \mid x_1, \ldots, x_N \sim \mathcal{N}(\mu_N, \tau_N^2)$$

where

$$\frac{1}{\tau_N^2} = \frac{1}{\tau_0^2} + \frac{N}{\sigma^2} \;\Rightarrow\; \tau_N^2 \sim \frac{\sigma^2}{N}, \qquad \mu_N = \tau_N^2\left(\frac{\mu_0}{\tau_0^2} + \frac{1}{\sigma^2}\sum_{i=1}^N x_i\right) \sim \frac{1}{N}\sum_{i=1}^N x_i$$

As $N \to \infty$, the prior is washed out by the data and the posterior mean is the MLE estimate:

$$\mathbb{E}[\theta \mid x_1, \ldots, x_N] = \mu_N \longrightarrow \hat{\theta}_{ML}$$
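A small sketch (with synthetic data and a deliberately poor prior mean, both assumptions for the illustration) of the posterior mean approaching the MLE as N grows:

import numpy as np

rng = np.random.default_rng(1)
sigma2, mu0, tau0_sq = 1.0, -5.0, 1.0         # known variance, deliberately bad prior
x = rng.normal(2.0, np.sqrt(sigma2), 10_000)  # synthetic data with true theta = 2

for N in [1, 10, 100, 10_000]:
    tauN_sq = 1.0 / (1.0 / tau0_sq + N / sigma2)
    muN = tauN_sq * (mu0 / tau0_sq + x[:N].sum() / sigma2)
    print(f"N={N:6d}: posterior mean = {muN:.3f}, MLE = {x[:N].mean():.3f}")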

Gaussian Example: Bayes versus MLE

The information provided by the Bayesian approach is much richer than the simple MLE estimate.

You can compute, for example, posterior probabilities

$$\Pr(\theta \in A \mid x_1, \ldots, x_N) \quad or \quad \mathrm{Var}(\theta \mid x_1, \ldots, x_N)$$

You can also predict future data:

$$f(\tilde{x} \mid x_1, \ldots, x_N)$$

Bayes and the Poisson Model

Assume you have some counting observations, i.e.

$$X_i \overset{i.i.d.}{\sim} \mathcal{P}(\theta), \qquad f(x_i \mid \theta) = e^{-\theta}\,\frac{\theta^{x_i}}{x_i!}$$

Assume we adopt a Gamma prior for $\theta$, i.e.

$$\theta \sim \mathcal{G}(\alpha, \beta), \qquad \pi(\theta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,\theta^{\alpha-1}e^{-\beta\theta} = \mathcal{G}(\theta;\, \alpha, \beta)$$

You can easily show that:

$$\pi(\theta \mid x_1, \ldots, x_N) = \mathcal{G}\!\left(\theta;\ \alpha + \sum_{i=1}^N x_i,\ \beta + N\right)$$
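A minimal sketch of this Gamma-Poisson update (the synthetic counts and the Gamma(1,1) prior are assumptions; note scipy parameterizes Gamma by scale = 1/rate):

import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(2)
x = rng.poisson(3.0, size=50)          # hypothetical counts with true rate 3.0
alpha, beta_rate = 1.0, 1.0            # Gamma(alpha, beta) prior on theta

# Posterior: Gamma(alpha + sum x_i, beta + N)
post = gamma(a=alpha + x.sum(), scale=1.0 / (beta_rate + len(x)))
print(post.mean(), post.std())         # posterior mean should be near 3.0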

Testing Hypotheses in a Bayesian Framework

Consider the problem where we have $X \mid \theta \sim \mathcal{B}in(n, \theta)$ and $\pi(\theta) = \mathcal{U}[0,1]$, so that

$$\Pr(\theta \mid x) \propto \binom{n}{x}\theta^{x}(1-\theta)^{n-x} \;\Rightarrow\; \pi(\theta \mid x) = \mathcal{B}e(x+1,\ n-x+1)$$

To test

$$H_0: \theta \le \tfrac{1}{2} \quad vs. \quad H_1: \theta > \tfrac{1}{2}$$

using the posterior, we simply compute:

$$\pi(H_0 \mid x) = 1 - \pi(H_1 \mid x), \qquad \pi(H_1 \mid x) = \int_{1/2}^{1} \pi(\theta \mid x)\,d\theta$$

Note that the integration is in parameter space. In Bayesian statistics you never integrate with respect to observations.

Contrary to the frequentist approach, hypothesis testing the Bayesian way is never based on data you do not observe!
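A minimal sketch of this test (the data, 7 heads in 10 flips, are hypothetical):

from scipy.stats import beta

n, x = 10, 7
post = beta(x + 1, n - x + 1)    # Beta(x+1, n-x+1) posterior under the uniform prior
p_H1 = 1.0 - post.cdf(0.5)       # Pr(theta > 1/2 | x), the integral above
print(p_H1, 1.0 - p_H1)          # posterior probabilities of H1 and H0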

Posterior Inference: Point Estimates

Starting from the posterior

$$\pi(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{m(x)}$$

Maximum A Posteriori (MAP) estimate:

$$\theta^{*} = \arg\max_{\theta}\,\log \pi(\theta \mid x) = \arg\max_{\theta}\,\big[\log f(x \mid \theta) + \log \pi(\theta)\big]$$

Posterior mean:

$$\mathbb{E}_{\pi(\theta \mid x)}[\theta] = \int \theta\,\pi(\theta \mid x)\,d\theta$$

Posterior quantiles $q_a$:

$$\Pr(\theta \le q_a \mid x) = \int_{-\infty}^{q_a} \pi(\theta \mid x)\,d\theta = a$$
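As a closing sketch, the three point estimates for the $\mathcal{B}e(3,4)$ posterior from the coin example (assuming scipy):

from scipy.stats import beta

post = beta(3, 4)                          # posterior from the coin example
map_est = (3 - 1) / (3 + 4 - 2)            # Beta(a,b) mode: (a-1)/(a+b-2)
print("MAP:", map_est)                     # 0.4
print("posterior mean:", post.mean())      # 3/7 = 0.4285...
print("5% / 95% quantiles:", post.ppf([0.05, 0.95]))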