
Information Theory


This article is about the form of Bayes' theorem. For the decision rule, see Bayes estimator. For the use of Bayes factor in model selection, see Bayes factor.


In probability theory and applications, Bayes' rule relates the odds of event $A_1$ to the odds of event $A_2$, before and after conditioning on event $B$. The relationship is expressed in terms of the Bayes factor, $\Lambda$. Bayes' rule is derived from and closely related to Bayes' theorem. Bayes' rule may be preferred to Bayes' theorem when the relative probability (that is, the odds) of two events matters, but the individual probabilities do not. This is because in Bayes' rule, $P(B)$ is eliminated and need not be calculated (see Derivation). It is commonly used in science and engineering, notably for model selection.

Under the frequentist interpretation of probability, Bayes' rule is a general relationship between $O(A_1:A_2 \mid B)$ and $O(A_1:A_2)$, for any events $A_1$, $A_2$ and $B$ in the same event space. In this case, $\Lambda$ represents the impact of the conditioning on the odds.

Under the Bayesian interpretation of probability, Bayes' rule relates the odds on probability models $M_1$ and $M_2$ before and after evidence $E$ is observed. In this case, $\Lambda$ represents the impact of the evidence on the odds. This is a form of Bayesian inference: the quantity $O(M_1:M_2)$ is called the prior odds, and $O(M_1:M_2 \mid E)$ the posterior odds. By analogy to the prior and posterior probability terms in Bayes' theorem, Bayes' rule can be seen as Bayes' theorem in odds form. For more detail on the application of Bayes' rule under the Bayesian interpretation of probability, see Bayesian model selection.

Contents

1 The rule
  1.1 Single event
  1.2 Multiple events
2 Derivation
3 Examples
  3.1 Frequentist example
  3.2 Model selection
4 External links


The rule

Single event

Given events $A_1$, $A_2$ and $B$, Bayes' rule states that the conditional odds of $A_1:A_2$ given $B$ are equal to the marginal odds of $A_1:A_2$ multiplied by the Bayes factor $\Lambda$:

$$O(A_1:A_2 \mid B) = \Lambda(A_1:A_2 \mid B) \cdot O(A_1:A_2),$$

where

$$\Lambda(A_1:A_2 \mid B) = \frac{P(B \mid A_1)}{P(B \mid A_2)}.$$

In the special case that $A_1 = A$ and $A_2 = \neg A$, this may be written as

$$O(A \mid B) = \Lambda(A \mid B) \cdot O(A).$$
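As a concrete check, the single-event rule can be worked numerically. The probabilities below are illustrative values chosen for this sketch, not figures from the article; the point is that the posterior odds obtained via the Bayes factor agree with the ratio of posteriors from Bayes' theorem, with $P(B)$ cancelling.

```python
# Odds-form update for the special case A2 = not-A1.
# All probabilities here are made-up illustrative values.
p_a1 = 0.2          # P(A1)
p_a2 = 1 - p_a1     # P(A2), taking A2 = not-A1
p_b_a1 = 0.7        # P(B | A1)
p_b_a2 = 0.1        # P(B | A2)

prior_odds = p_a1 / p_a2                     # O(A1:A2)
bayes_factor = p_b_a1 / p_b_a2               # Lambda(A1:A2 | B)
posterior_odds = bayes_factor * prior_odds   # O(A1:A2 | B)

# Cross-check against Bayes' theorem: P(B) appears in both posteriors
# and cancels in the ratio, so the odds form never needs it.
p_b = p_b_a1 * p_a1 + p_b_a2 * p_a2
direct_odds = (p_b_a1 * p_a1 / p_b) / (p_b_a2 * p_a2 / p_b)
assert abs(posterior_odds - direct_odds) < 1e-12

print(round(posterior_odds, 6))  # 1.75
```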

Multiple events

Bayes' rule may be conditioned on an arbitrary number of events. For two events $B$ and $C$,

$$O(A_1:A_2 \mid B \wedge C) = \Lambda(A_1:A_2 \mid B \wedge C) \cdot \Lambda(A_1:A_2 \mid C) \cdot O(A_1:A_2),$$

where

$$\Lambda(A_1:A_2 \mid B \wedge C) = \frac{P(B \mid A_1 \wedge C)}{P(B \mid A_2 \wedge C)}.$$

In the same special case ($A_1 = A$ and $A_2 = \neg A$), the equivalent notation is

$$O(A \mid B \wedge C) = \Lambda(A \mid B \wedge C) \cdot \Lambda(A \mid C) \cdot O(A).$$
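Conditioning on several events amounts to repeated multiplication by Bayes factors. The numbers below are made up, and for simplicity the sketch assumes $B$ and $C$ are conditionally independent given each hypothesis, so that the factor for $B$ given $C$ reduces to $P(B \mid A_i)$-ratios; the general rule does not require this assumption.

```python
# Sequential odds update with two pieces of evidence, B then C.
# Illustrative probabilities; conditional independence given each
# hypothesis is assumed so each Bayes factor is a simple ratio.
p_a1, p_a2 = 0.5, 0.5           # equal prior probabilities
p_b = {"A1": 0.8, "A2": 0.3}    # P(B | Ai)
p_c = {"A1": 0.4, "A2": 0.9}    # P(C | Ai), assumed independent of B given Ai

odds = p_a1 / p_a2              # prior odds O(A1:A2) = 1
odds *= p_b["A1"] / p_b["A2"]   # multiply by Bayes factor for B
odds *= p_c["A1"] / p_c["A2"]   # multiply by Bayes factor for C

print(round(odds, 4))  # 1.1852
```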

Derivation

Consider two instances of Bayes' theorem:

$$P(A_1 \mid B) = \frac{P(B \mid A_1)\,P(A_1)}{P(B)}, \qquad P(A_2 \mid B) = \frac{P(B \mid A_2)\,P(A_2)}{P(B)}.$$

Combining these gives

$$\frac{P(A_1 \mid B)}{P(A_2 \mid B)} = \frac{P(B \mid A_1)}{P(B \mid A_2)} \cdot \frac{P(A_1)}{P(A_2)}.$$

Now defining

$$O(A_1:A_2 \mid B) = \frac{P(A_1 \mid B)}{P(A_2 \mid B)}, \qquad O(A_1:A_2) = \frac{P(A_1)}{P(A_2)}, \qquad \Lambda(A_1:A_2 \mid B) = \frac{P(B \mid A_1)}{P(B \mid A_2)},$$

this implies

$$O(A_1:A_2 \mid B) = \Lambda(A_1:A_2 \mid B) \cdot O(A_1:A_2).$$

A similar derivation applies for conditioning on multiple events, using the appropriate extension of Bayes' theorem.
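The derivation can be sanity-checked numerically: over randomly drawn probabilities, the ratio of the two Bayes'-theorem posteriors always equals the Bayes factor times the prior odds, since $P(B)$ cancels.

```python
# Numeric check of the derivation on random inputs.
import random

random.seed(0)
for _ in range(1000):
    p_a1 = random.uniform(0.01, 0.99)
    p_a2 = 1 - p_a1                       # take A2 = not-A1 for simplicity
    p_b_a1 = random.uniform(0.01, 0.99)   # P(B | A1)
    p_b_a2 = random.uniform(0.01, 0.99)   # P(B | A2)
    p_b = p_b_a1 * p_a1 + p_b_a2 * p_a2   # P(B), by total probability

    lhs = (p_b_a1 * p_a1 / p_b) / (p_b_a2 * p_a2 / p_b)  # P(A1|B) / P(A2|B)
    rhs = (p_b_a1 / p_b_a2) * (p_a1 / p_a2)              # Lambda * prior odds
    assert abs(lhs - rhs) < 1e-9

print("derivation verified on 1000 random cases")
```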

Examples

Frequentist example

Consider the drug testing example in the article on Bayes' theorem.

The same results may be obtained using Bayes' rule. The prior odds on an individual being a drug-user are 199 to 1 against, as $P(\text{user}) = 0.5\%$ and $P(\neg\text{user}) = 99.5\%$. The Bayes factor when an individual tests positive is

$$\frac{P(+ \mid \text{user})}{P(+ \mid \neg\text{user})} = \frac{0.99}{0.01} = 99$$

in favour of being a drug-user: this is the ratio of the probability of a drug-user testing positive to the probability of a non-drug user testing positive. The posterior odds on being a drug-user are therefore $1 \times 99 : 199 \times 1 = 99:199$, which is very close to $1:2$. In round numbers, only one in three of those testing positive are actually drug-users.
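The example's arithmetic can be worked in odds form directly, using the figures from the drug-testing example (prevalence 0.5%, true-positive rate 99%, false-positive rate 1%):

```python
# Drug-testing example in odds form.
p_user = 0.005          # prevalence: P(user)
p_pos_user = 0.99       # P(+ | user)
p_pos_nonuser = 0.01    # P(+ | non-user)

prior_odds = p_user / (1 - p_user)          # 1:199, i.e. 199 to 1 against
bayes_factor = p_pos_user / p_pos_nonuser   # 99 in favour
posterior_odds = prior_odds * bayes_factor  # 99:199, close to 1:2

# Convert odds back to a probability: about one in three positives
# are actually drug-users.
p_user_given_pos = posterior_odds / (1 + posterior_odds)
print(round(p_user_given_pos, 3))  # 0.332
```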

Model selection