SADC Course in Statistics The binomial distribution (Session 06)


SADC Course in Statistics

The binomial distribution

(Session 06)


Learning Objectives

At the end of this session you will be able to:

• describe the binomial probability distribution including the underlying assumptions

• calculate binomial probabilities for simple situations

• apply the binomial model in appropriate practical situations


Study of child-headed households

• One devastating effect of the HIV and AIDS pandemic is the emergence of child-headed households, i.e. ones where both parents have died and the children are left to fend for themselves.

• Suppose it is of interest to study in greater detail those households that are child-headed.

• The statistical techniques that may be employed first require knowledge of the distribution of the random variable X, the number of child-headed households.


A probability distribution for X

• Interest is in the distribution of

X = number of child-headed households

• Under certain assumptions, X has a binomial distribution.

• To introduce this distribution, we first deal with a simpler (but related) distribution.


The Bernoulli Distribution

The simplest probability distribution is one describing the behaviour of a dichotomous (binary) random variable, i.e. one with two possible outcomes: (Success, Failure), (Yes, No), (Female, Male), etc.

Outcome    Value of random variable    Probability
Success    1                           p
Failure    0                           1-p
Total                                  1


Background

In general, we have a sequence of n trials, each with just two possible outcomes, e.g. visiting n households in turn and recording whether each one is child-headed.

Call one outcome a “success”, the other a “failure”. Let probability (of a success) = p.

The word success is a generic term used to represent the outcome of interest, e.g. if a sampled household is child-headed we call it a “success” because that is the outcome of interest.


Basics and terminology

Let X be the number of successes in n trials.

X is said to have a binomial distribution if:

• The probability of success p is the same for each trial.

• The trials have independent outcomes.

In the context of our example, X = number of child-headed HHs from n HHs sampled, and

p = probability of a HH being child-headed.

Under what conditions would X be binomial?
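The two conditions can be pictured with a small simulation: each household visit is one independent trial with the same success probability, and X is the count of successes. The sketch below is illustrative only; the sample size of 50 households, the value p = 0.03, the seed, and the function name `simulate_binomial` are all assumptions made for the example, not figures from the course.

```python
import random

def simulate_binomial(n, p, rng):
    """Count successes in n independent trials, each with success probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(6)  # fixed seed so the run is repeatable (arbitrary choice)

# Hypothetical survey: 50 households visited, each child-headed with probability 0.03.
n_households, p_child_headed = 50, 0.03
counts = [simulate_binomial(n_households, p_child_headed, rng) for _ in range(10_000)]

average = sum(counts) / len(counts)
print(f"average number of successes over 10,000 simulated surveys: {average:.3f}")
print(f"n * p = {n_households * p_child_headed:.3f}")
```

If the probability changed from household to household, or if the outcome for one household influenced the next, the counts produced this way would no longer follow a binomial distribution.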


Binomial Probability Distribution

The probability of finding k successes out of n trials is given by

\[
P(X = k) = \frac{n!}{k!\,(n-k)!}\, p^{k} (1-p)^{n-k}, \qquad k = 0, 1, \ldots, n.
\]

Here n! = n(n-1)(n-2) … (3)(2)(1), and 0! = 1.

Thus, for example, 4! = 4 x 3 x 2 x 1 = 24.

Exercise: If p=0.2 and n=10, confirm that

\[
P(X = 3) = \frac{10!}{3!\,(10-3)!}\,(0.2)^{3}(1-0.2)^{10-3} \approx 0.2
\]
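The exercise can also be checked numerically. The following is a minimal sketch using Python's standard library; the function name `binomial_pmf` is chosen here for illustration and is not part of the course material.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    # comb(n, k) equals n! / (k! (n-k)!)
    return comb(n, k) * p**k * (1 - p)**(n - k)

prob = binomial_pmf(3, 10, 0.2)
print(round(prob, 4))  # 0.2013
```

Rounded to one decimal place this agrees with the value 0.2 in the exercise.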


Binomial Probability Distribution

In computing binomial probabilities, the value of p is often unknown.

It is then estimated by the proportion of successes in the sample.

i.e.

\[
\hat{p} = \frac{\text{observed no. of successes (say } r\text{) in } n \text{ trials}}{\text{total sample size, } n},
\qquad \text{i.e.} \qquad \hat{p} = \frac{r}{n}.
\]

The following graphs show binomial probabilities for n=10 and differing values of p.


Example 1: Left-handedness

Suppose the probability of a person being left-handed is p=0.2.

Let X be the number of left-handed persons in a group of 10.

The graph shows the probability of 0, 1, 2, … left-handed persons.

[Figure: Binomial distribution with p = 0.2. Bar chart of Probability (vertical axis, 0.00 to 0.35) against X (horizontal axis, 0 to 10).]

There are 11 possible outcomes. The graph shows P(X=2)=0.3, P(X=3)=0.2, and P(X>6) almost equal to 0.
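The values read off the graph for Example 1 can be reproduced directly from the binomial formula. This is a sketch only; `binomial_pmf` is an illustrative helper name, not something defined in the course.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.2  # group of 10 people, P(left-handed) = 0.2
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(probs):
    print(f"P(X={k:2d}) = {prob:.4f}")

# P(X > 6) = P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10)
tail = sum(probs[7:])
print(f"P(X > 6) = {tail:.5f}")
```

The list has 11 entries, one for each possible value of X from 0 to 10, and the entries add up to 1.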


Example 2: Tossing a coin

A coin is tossed. The probability of getting a head is p=0.5.

Let X be the number of heads in 10 tosses of the coin.

The graph shows the probability of getting 0, 1, 2, … heads.

[Figure: Binomial distribution with p = 0.5. Bar chart of Probability (vertical axis, 0.00 to 0.30) against X (horizontal axis, 0 to 10).]

The distribution is symmetrical. We find P(X=2)=P(X=8)=0.044, P(X=3)=P(X=7)=0.12, etc.
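The symmetry in the coin-tossing example can be checked in a few lines. With p = 0.5 every sequence of 10 tosses has probability 1/2^10, so P(X = k) is simply the number of sequences with k heads divided by 2^10. The sketch below assumes nothing beyond the binomial formula.

```python
from math import comb

n = 10  # number of tosses
# With p = 0.5, p^k (1-p)^(n-k) = 0.5^n for every k.
probs = [comb(n, k) / 2**n for k in range(n + 1)]

# Print each value of k next to its mirror image n - k.
for k in range(n // 2 + 1):
    print(f"P(X={k}) = {probs[k]:.3f}    P(X={n - k}) = {probs[n - k]:.3f}")
```

The two columns match because the number of ways of choosing k tosses to be heads equals the number of ways of choosing n - k tosses to be heads.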


Example 3: Selecting a rural village

The ratio of rural villages to urban villages is 4:1.

Suppose 10 villages are selected at random. Let X be the number of rural villages selected.

The graph shows the probability of getting 0, 1, 2, … rural villages.

[Figure: Binomial distribution with p = 0.8. Bar chart of Probability (vertical axis, 0.00 to 0.35) against X (horizontal axis, 0 to 10).]

The distribution is now concentrated to the right.

Here P(X<4) is almost zero.
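For Example 3, the 4:1 ratio gives p = 4/5 = 0.8. The sketch below computes P(X < 4) and also checks that this distribution is the mirror image of the p = 0.2 case in Example 1. The helper name `binomial_pmf` is illustrative, and the calculation assumes the 10 selections behave as independent trials with the same p.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 10
p_rural = 4 / (4 + 1)  # ratio 4:1 of rural to urban gives p = 0.8

probs = [binomial_pmf(k, n, p_rural) for k in range(n + 1)]

# P(X < 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
p_less_than_4 = sum(probs[:4])
print(f"P(X < 4) = {p_less_than_4:.6f}")

# Mirror image: k rural villages (p = 0.8) is as likely as
# n - k successes when the success probability is 0.2.
mirror = [binomial_pmf(n - k, n, 1 - p_rural) for k in range(n + 1)]
print(all(abs(a - b) < 1e-12 for a, b in zip(probs, mirror)))
```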


Properties of Binomial Distribution

The mean (average) of the binomial distribution with parameters n and p is np.

e.g. In a population of size 1000, suppose the probability of selecting a child-headed HH is p=0.03. Then the mean number of child-headed HHs is 1000 x 0.03 = 30.

Recall that the mean = expected value of X. Thus

\[
E(X) = \sum_{x=0}^{n} x\, \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x} = np.
\]

Note: Since the binomial is a probability distribution,

\[
\sum_{x=0}^{n} \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x} = 1.
\]
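Both sums can be evaluated term by term for the child-headed household figures (n = 1000, p = 0.03). This is a numerical check under floating-point arithmetic, not a proof; `binomial_pmf` is an illustrative helper name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 1000, 0.03

# Sum of all probabilities: should equal 1 up to rounding error.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))

# E(X) = sum over x of x * P(X = x): should equal n * p up to rounding error.
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))

print(f"sum of probabilities = {total:.6f}")
print(f"E(X) = {mean:.6f}, n * p = {n * p:.6f}")
```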


Further Properties:

The standard deviation of the binomial distribution is

\[
\sqrt{np(1-p)}
\]

For n=1000, p=0.2 the standard deviation is therefore [1000*0.2*0.8]^{1/2} = 12.65.

The theoretical derivation is given below.

\[
E(X^{2}) = \sum_{x=0}^{n} x^{2}\, \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}
\]

The above can be shown to be

\[
E(X^{2}) = np(1-p) + n^{2}p^{2}.
\]

Hence

\[
\mathrm{Var}(X) = E(X^{2}) - n^{2}p^{2} = np(1-p),
\]

and the standard deviation is

\[
\sqrt{np(1-p)}.
\]
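The variance result can be checked numerically for n = 1000 and p = 0.2 by computing E(X) and E(X^2) directly from the probabilities. This is a sketch for verification only; the helper name `binomial_pmf` is an assumption made for the example.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 1000, 0.2
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * px for x, px in enumerate(pmf))        # E(X)
second_moment = sum(x**2 * px for x, px in enumerate(pmf))  # E(X^2)
variance = second_moment - mean**2                    # Var(X) = E(X^2) - [E(X)]^2
std_dev = sqrt(variance)

print(f"E(X^2) = {second_moment:.4f}, np(1-p) + n^2 p^2 = {n * p * (1 - p) + (n * p)**2:.4f}")
print(f"Var(X) = {variance:.4f}, np(1-p) = {n * p * (1 - p):.4f}")
print(f"standard deviation = {std_dev:.2f}")
```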


Practical work follows to ensure learning objectives are achieved…