Introduction to Bayesian Statistics

Christiana Kartsonaki

February 11th, 2015


Introduction

Bayes’ theorem is an elementary result in probability theory which relates the conditional probability P(A given B) to P(B given A), for two events A and B.


Bayes’ theorem

P(A): probability of event A

P(A | B): conditional probability of event A given that event B has occurred

P(A | B) = P(B | A) P(A) / P(B)
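A quick numeric sanity check of the theorem, using a hypothetical diagnostic-test setting (all numbers below are illustrative assumptions, not from the slides):

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A | B) = P(B | A) P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# A = disease present, B = test positive (assumed, illustrative rates)
p_a = 0.01                 # prior P(A)
p_b_given_a = 0.95         # P(B | A), sensitivity
p_b_given_not_a = 0.05     # P(B | not A), false-positive rate

# law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

posterior = bayes(p_b_given_a, p_a, p_b)   # P(A | B) ≈ 0.161
```

Even with a 95% sensitive test, the posterior probability of the event given a positive result stays low here because the prior P(A) is small.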


Bayesian inference

P(conclusion | data) ∝ P(data | conclusion) × P(conclusion)

posterior ∝ likelihood × prior


Example

buses arrive in a random pattern (Poisson process), mean time interval unknown

data → likelihood

you know something about the interval (e.g. likely not every 1 hour, but not 1 minute either)

prior distribution

the two sources of information are synthesized as a probability distribution

posterior distribution

→ merge information from data with ‘external’ information


Statistical inference

parameter θ

data x

model f(x, θ)


Bayes’ theorem

p(θ | x) = p(x | θ) π(θ) / p(x)

π(θ): prior distribution

p(x | θ): likelihood

p(θ | x): posterior distribution

p(x): predictive probability of x (normalizing factor)

posterior ∝ likelihood × prior
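The relation "posterior ∝ likelihood × prior" can be checked by brute force on a grid. A minimal sketch, assuming a Bernoulli likelihood with a flat prior (an illustrative setup, not the slides' example); for 7 successes in 10 trials the exact posterior is Beta(8, 4), and the grid recovers its mean 2/3:

```python
import numpy as np

theta = np.linspace(0.0005, 0.9995, 1000)   # grid over the parameter
d = theta[1] - theta[0]                     # grid spacing

prior = np.ones_like(theta)                 # flat prior π(θ)
k, n = 7, 10                                # assumed data: 7 successes in 10 trials
likelihood = theta**k * (1 - theta)**(n - k)

unnorm = likelihood * prior                 # posterior up to a constant
posterior = unnorm / (unnorm.sum() * d)     # normalize so it integrates to 1

post_mean = (theta * posterior).sum() * d   # ≈ 2/3, the Beta(8, 4) mean
```

The normalizing constant p(x) never needs to be computed analytically; dividing by the grid sum plays its role.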


Bayesian inference

Any quantity that does not depend on θ cancels out from the denominator and numerator of Bayes’ theorem.

So if we can recognise which density is proportional to the product of the likelihood and the prior, regarded solely as a function of θ, we know the posterior density of θ.


Frequentist and Bayesian statistics

Frequentist approaches → typically treat θ as an unknown constant

Bayesian approaches → treat θ as a random variable


Likelihood

Likelihood → used in most approaches to formal statistical inference.

Describes the data generating process.


Prior and posterior distribution

prior distribution → represents information about the parameters other than that supplied by the data under analysis

posterior distribution → probability distribution as revised in the light of the data, determined by applying standard rules of probability theory

Sometimes the prior distribution is ‘flat’ over a particular scale, intended to represent the absence of initial information.

In complex problems with many nuisance parameters the use of flat prior distributions is suspect and, at the very least, needs careful study using sensitivity analyses.


Example

X1, . . . , Xn: random sample from an exponential distribution with density f(x | θ) = θ e^{−θx}, x > 0

prior π(θ) = λ e^{−λθ}, θ > 0, for some known value of λ

Then the likelihood is

f(x1, . . . , xn | θ) = θ^n e^{−θ Σ_{i=1}^n x_i}.

So the posterior distribution is

p(θ | x) ∝ π(θ) f(x | θ) ∝ λ e^{−λθ} θ^n e^{−θ Σ x_i} ∝ θ^n e^{−θ (λ + Σ x_i)},

which is the Gamma(n + 1, λ + Σ x_i) density.
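The conjugacy can be verified numerically: prior × likelihood, divided by the Gamma(n + 1, λ + Σ x_i) density, should be constant in θ. A sketch with illustrative numbers for λ and the data:

```python
import math

lam = 2.0                  # prior rate λ (assumed known)
x = [0.3, 1.2, 0.7, 2.1]   # illustrative observations
n, s = len(x), sum(x)

def unnorm_posterior(theta):
    """prior × likelihood up to a constant: λ e^{-λθ} · θ^n e^{-θ Σ x_i}"""
    return lam * math.exp(-lam * theta) * theta**n * math.exp(-theta * s)

def gamma_pdf(theta, shape, rate):
    """Gamma(shape, rate) density at theta."""
    return rate**shape * theta**(shape - 1) * math.exp(-rate * theta) / math.gamma(shape)

# a constant ratio across several θ values confirms the posterior's family
ratios = [unnorm_posterior(t) / gamma_pdf(t, n + 1, lam + s) for t in (0.5, 1.0, 1.7)]
```

The constant ratio is exactly the normalizing factor p(x) that cancels in Bayes’ theorem.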


History

‘inverse probability’

prior distribution intended to represent initial ignorance → used systematically in statistical analysis by Gauss and especially Laplace (circa 1800)

the approach was criticized during the 19th century

in the middle of the 20th century attention shifted to a personalistic view of probability – individual belief as expressed by individual choice in a decision-making context


Prior distribution

Controversial aspects concern the prior. What does it mean?

What is the prior probability that treatment and control have identicaleffect?

What is the prior probability that the difference between two groups is between 5 and 15 units?

prior distribution must be specified explicitly, i.e. in effect numerically


Prior distribution

Broadly three approaches:

1. Summary of other data. Empirical Bayes.

2. Prior measures personalistic opinion of investigator about conclusions. Not useful for ‘public’ transmission of knowledge.

3. Objective degree of uncertainty.
   1. Agreed measure of uncertainty.
   2. Ignorance, reference, flat prior for interval estimate. Laplace’s principle of indifference.


Empirical Bayes

‘empirical’ → frequency interpretation implied

e.g. an unknown parameter representing a mean of some measurement – likely to vary under different circumstances

can be represented by a widely dispersed distribution

leading to a posterior distribution with a frequency interpretation

Example: variances of gene expression for different probes on a microarray may be assumed to be a sample from a distribution with a common parameter


Personalistic prior

reflects the investigator’s subjective beliefs

prior distribution is based on relatively informally recalled experience of a field, for example on data that have been seen only informally


Flat prior

a prior which aims to insert as little new information as possible

for relatively simple problems, limiting forms of the prior often reproduce, approximately or exactly, posterior intervals equivalent to confidence intervals


Priors

Is the prior distribution a positive insertion of evidence? If so, what is its basis and has the consistency of that evidence with the current data been checked?

If flat/ignorance/reference priors have been used, how have they been chosen? Has there been a sensitivity analysis? If the number of parameters over which a prior distribution is defined is appreciable then the choice of a flat prior distribution could be misleading.

Each of a substantial number of individuals may have been allocated a value of an unknown parameter, the values having a stable frequency distribution across individuals – empirical Bayes.


Posterior distribution

Conclusions can be summarized using for example

posterior mean

posterior variance

credible intervals


Credible intervals

a region Cα(x) is a 100(1 − α)% credible region for θ if

∫_{Cα(x)} p(θ | x) dθ = 1 − α

there is posterior probability 1 − α that θ is in Cα(x)

credible interval – special case of credible region

analogous to (frequentist) confidence intervals, different interpretation
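An equal-tailed credible interval can be obtained by inverting the posterior's cumulative distribution. A grid-based sketch for the Gamma(n + 1, λ + Σ x_i) posterior of the exponential example (the data summary and λ below are illustrative assumptions):

```python
import numpy as np

n, s, lam = 4, 4.3, 2.0                        # illustrative data summary and prior rate
theta = np.linspace(1e-4, 10.0, 200000)        # grid covering essentially all posterior mass
d = theta[1] - theta[0]

dens = theta**n * np.exp(-(lam + s) * theta)   # unnormalized Gamma(n+1, λ+Σx_i) density
dens /= dens.sum() * d                         # normalize
cdf = np.cumsum(dens) * d                      # grid approximation to the posterior CDF

i_lo, i_hi = np.searchsorted(cdf, 0.025), np.searchsorted(cdf, 0.975)
lo, hi = theta[i_lo], theta[i_hi]              # 95% equal-tailed credible interval
coverage = cdf[i_hi] - cdf[i_lo]               # ≈ 0.95 by construction
```

Unlike a confidence interval, the statement "θ lies in (lo, hi) with probability 0.95" is made directly about θ under the posterior.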


Hypothesis testing

Frequentist approach to hypothesis testing → compares a null hypothesis H0 with an alternative H1 through a test statistic T that tends to be larger under H1 than under H0 and rejects H0 for small p-values p = P_H0(T ≥ t_obs), where t_obs is the value of T actually observed and the probability is computed as if H0 were true

Bayesian approach → attaches prior probabilities to models corresponding to H0 and H1 and compares their posterior probabilities using the Bayes factor

B10 = P(x | H1) / P(x | H0)
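For two simple hypotheses the Bayes factor reduces to a likelihood ratio. A sketch with an illustrative coin-tossing setup (H0: θ = 0.5 versus H1: θ = 0.7, with 7 heads in 10 tosses; not an example from the slides):

```python
from math import comb

k, n = 7, 10   # assumed data: 7 heads in 10 tosses

def binom_lik(theta):
    """P(x | θ) for k heads out of n under a binomial model."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

# B10 > 1 favours H1; the binomial coefficient cancels in the ratio
B10 = binom_lik(0.7) / binom_lik(0.5)   # ≈ 2.28
```

A value near 2 is only weak evidence for H1 on conventional scales for interpreting Bayes factors.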


Computation

conjugate prior → when the prior and the posterior are from the same family of distributions (for example normal prior and normal likelihood)

makes calculations easier

however, often unrealistic, so posterior distributions need to be evaluated numerically


Markov chain Monte Carlo (MCMC)

Markov chain Monte Carlo (MCMC): a stochastic simulation technique which is used for computing inferential quantities which cannot be obtained analytically

MCMC simulates a discrete-time Markov chain

it produces a dependent sequence of random variables {θ(1), . . . , θ(M)} with approximate distribution the posterior distribution of interest

MCMC is an iterative procedure, such that given the current state of the chain, θ(i), the algorithm makes a probabilistic update to θ(i + 1)

Markov chains can automatically be constructed to match any posterior density
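The iterative update described above can be sketched as a random-walk Metropolis–Hastings sampler, here targeting the Gamma(n + 1, λ + Σ x_i) posterior of the exponential example (proposal scale, run length, burn-in and the data summary are illustrative choices; a real analysis needs tuning and convergence diagnostics):

```python
import math
import random

random.seed(0)              # reproducible illustration

n, s, lam = 4, 4.3, 2.0     # illustrative data summary and prior rate

def log_post(theta):
    """log unnormalized posterior: n log θ - (λ + Σ x_i) θ, for θ > 0."""
    if theta <= 0:
        return -math.inf
    return n * math.log(theta) - (lam + s) * theta

theta = 1.0                 # initial state θ(1)
samples = []
for _ in range(50_000):
    prop = theta + random.gauss(0.0, 0.5)   # symmetric random-walk proposal
    # accept with probability min(1, post(prop) / post(theta))
    if random.random() < math.exp(min(0.0, log_post(prop) - log_post(theta))):
        theta = prop
    samples.append(theta)

burned = samples[5_000:]    # discard burn-in
post_mean = sum(burned) / len(burned)   # should approach (n+1)/(λ+Σx_i) ≈ 0.794
```

Only the unnormalized posterior is needed: the unknown normalizing constant cancels in the acceptance ratio, which is what makes MCMC practical.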


MCMC

Two of the most general procedures for MCMC simulation from a target distribution:

• Metropolis–Hastings algorithm

• Gibbs sampler

Software:

• WinBugs – a Windows version of BUGS (Bayesian analysis Using the Gibbs Sampler)

• CODA: a collection of convergence diagnostics and sample output analysis programs

• JAGS (Just Another Gibbs Sampler)


MCMC – priors

MCMC mostly uses flat priors.

Flat for θ is not the same as flat for e.g. log(θ).

For models with fairly few parameters and reasonable data, posterior intervals approximately attain the stated confidence level.

For a large number of parameters it may give a very bad answer. No general theory known.
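The point that flat for θ is not the same as flat for log(θ) is just the change-of-variables formula: if θ is uniform on (a, b), then ψ = log θ has density e^ψ / (b − a) on (log a, log b), which is far from constant. A small numeric illustration (the endpoints are arbitrary):

```python
import math

a, b = 0.1, 10.0   # illustrative support for the flat prior on θ

def density_log_theta(psi):
    """Density of ψ = log θ when θ ~ Uniform(a, b): p(ψ) = e^ψ / (b - a)."""
    if not (math.log(a) < psi < math.log(b)):
        return 0.0
    return math.exp(psi) / (b - a)

# the 'flat' prior on θ puts far more density at large ψ than at small ψ
low = density_log_theta(math.log(0.2))
high = density_log_theta(math.log(5.0))
```

So a prior that looks non-informative on one scale is strongly informative on another, which is why the choice of scale for a flat prior needs justification.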


Discussion

Bayesian inference → based on Bayes’ theorem

differences in interpretation between Bayesian and frequentist inference

choice of prior controversial

computation usually done numerically; MCMC useful but to be used with caution


Further reading

Cox, D. R. (2006). Frequentist and Bayesian Statistics: A Critique (Keynote Address). In Statistical Problems in Particle Physics, Astrophysics and Cosmology (Vol. 1, p. 3).

Cox, D. R. and Donnelly, C. A. (2011). Principles of Applied Statistics. Cambridge University Press.

Davison, A. C. (2003). Statistical Models (Vol. 11). Cambridge University Press.

Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. (2003). Bayesian Data Analysis. Chapman and Hall/CRC Texts in Statistical Science.

WinBugs: http://www.mrc-bsu.cam.ac.uk/software/bugs/
