22
Sections 3.6, 4.1, and 4.2 Elementary Statistics for the Biological and Life Sciences (Stat 205)

Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

  • Upload
    others

  • View
    10

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

Sections 3.6, 4.1, and 4.2

Elementary Statistics for the Biological and Life Sciences

(Stat 205)

Page 2: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

3.6 Binomial random variable

Independent-trials model A series of n independent trials isconducted. Each trial results in success or failure. Theprobability of success is equal to p for each trial, regardless ofthe outcomes of the other trials.

The binomial distribution defines a discrete random variableY that counts the number, out of the n trials, exhibiting acertain trait with probability p in the “independent trialsmodel.”

2 / 22

Page 3: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Example 3.6.1 Albinism

If both parents carry the gene for being albino, each kid has ap = 0.25 chance of being albino. Each child has the samechance of being albino independent of whether the otherchildren are albino.

Let Y count the number of kids out of two that are albino. Ycan be 0, 1, or 2.

3 / 22

Page 4: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Probability tree for albinism

Probability tree for albinism among two children of carriers of thegene for albinism.

4 / 22

Page 5: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Albino example, cont’d

Let the four possible experimental outcomes for thefirst/second child be albino/albino, albino/not, not/albino,not/not.

Y = 0 corresponds to not/not, Y = 1 corresponds to eitheralbino/not or not/albino, and Y = 2 corresponds toalbino/albino.

Pr{Y = 0} = Pr{not/not} = 916 .

Pr{Y = 1} = Pr{albino/not}+ Pr{not/albino} = 316 + 3

16 =616 .

Pr{Y = 2} = Pr{albino/albino} = 116 .

5 / 22

Page 6: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Probability distribution in tabular form

Y is binomial with p = 0.25 and n = 2.

6 / 22

Page 7: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

Binomial coefficient: Combination Rule

The number of ways of selecting x objects from n objects,irrespective of order is equal to: nCj = n!

j!(n−j)!

Example: If there are 5 blog post, and you only have time to read2 of them, how many ways to select 2 articles to read?

5!

(5− 2)!2!=

5!

3!2!= 5C2 = 10

Page 8: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Binomial distribution formula

Definition: A binomial random variable Y with probability p andnumber of trials n has the probability of j successes (and n − jfailures) given by

Pr{j successes} = Pr{Y = j} = nCj pj (1− p)n−j .

The binomial coefficient nCj counts the number of ways to orderj “successes” and n − j failures. For example, if n = 4 and j = 2then 4C2 = 6 because there’s 6 orderings

SSFF SFSF SFFS FSSF FSFS FFSS

8 / 22

Page 9: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Binomial coefficient, formal definition

The binomial coefficient is

nCj =n!

j!(n − j)!

where x! is read “x factorial” given by

x! = x(x − 1)(x − 2) · · · (3)(2)(1).

The first few are

0! = 1

1! = 1

2! = (2)(1) = 2

3! = (3)(2)(1) = 6

4! = (4)(3)(2)(1) = 24

5! = 120

9 / 22

Page 10: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Binomial probabilities

There are a lot of formulas on the previous slide.

It’s possible to compute probabilities like Pr{Y = 2} by handusing the formulas.

For Y binomial with n trials and probability p, R computesPr{Y = j} easily using dbinom(j,n,p)

Use R for your homework!

10 / 22

Page 11: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Example 3.6.4 Mutant cats!

Study in Omaha, Nebraska found p = 0.37 have a mutanttrait.

Randomly draw n = 5 cats and count Y , the number ofmutants.

Y is binomial with p = 0.37 and n = 5. Let’s have R find theprobability of Y = 0, Y = 1, Y = 2, Y = 3, Y = 4, Y = 5:

> dbinom(0,5,0.37)

[1] 0.09924365

> dbinom(1,5,0.37)

[1] 0.2914298

> dbinom(2,5,0.37)

[1] 0.3423143

> dbinom(3,5,0.37)

[1] 0.2010418

> dbinom(4,5,0.37)

[1] 0.05903607

> dbinom(5,5,0.37)

[1] 0.006934396

11 / 22

Page 12: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Probability distribution for n = 5 and p = 0.37

Questions What is Pr{Y ≤ 2}? Pr{Y ≤ 4}? Pr{3 ≤ Y ≤ 4}?

12 / 22

Page 13: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Cumulative probabilities using R

For a Binomial distribution with n = 5 and p = 0.37 we cansolve cumulative probabilities like Pr{Y ≤ 2}, Pr{Y ≤ 4},Pr{3 ≤ Y ≤ 4}? using R.

In R, the command “pbinom” gives the cumulative probability“up to” a value in the binomial distribution

> pbinom(2, 5, 0.37)

[1] 0.7329878

> pbinom(4, 5, 0.37)

[1] 0.9930656

Thus:Pr{Y ≤ 2} = 0.7329878, Pr{Y ≤ 4} = 0.9930656Pr{3 ≤ Y ≤ 4} = 0.9930656− 0.7329878 = 0.2600778.

13 / 22

Page 14: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Mean and standard deviation of binomial random variable

Let Y be binomial with n trials and probability p.

µY = n p

σY =√n p (1− p)

Example: for mutant cats, µY = 5(0.37) = 1.85 cats andσY =

√5(0.37)(0.63) = 1.08 cats.

µY = np = 1.85 is the expected value of Y.

14 / 22

Page 15: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Section 4.1 Normal curves

“Bell-shaped curve”

The normal density curve defines a continuous randomvariable Y .

Normal curves approximate lots of real data densities(examples coming up).

A normal curve is defined by the mean µ and standarddeviation σ.

We will also find that sample means Y are approximatelynormal in Chapter 5. So are sample proportions p (morelater).

Let’s look at some real data examples...

15 / 22

Page 16: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Serum cholesterol in n = 727 12–14 year-old children

16 / 22

Page 17: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Normal fit to cholesterol data with µ = 162 mg/dl andσ = 28 mg/dl.

17 / 22

Page 18: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Normal distribution of bill length

Bill length of Blue jay. µ = 25.4 mm & σ = 0.8 mm

18 / 22

Page 19: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

4.2 Normal density functions

The density function is given by

f (x) =1

σ√

2πexp

(−(x − µ)2

2σ2

)All normal curves have the same shape. They have a mode atµ and are the more spread out – the flatter – the larger σ is.

Almost all of the probability is contained between µ− 3σ andµ+ 3σ.

The area under every normal density is one.

If Y has a normal density with mean µ and standard deviationσ, we can write Y ∼ N(µ, σ).

19 / 22

Page 20: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Normal curve with mean µ and standard deviation σ

20 / 22

Page 21: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Three normal curves with different means and standarddeviations

21 / 22

Page 22: Sections 3.6, 4.1, and 4 - University of South Carolinapeople.stat.sc.edu/jiana/STAT 205/Lecture notes/L8-sec3_6to4_2.pdf3.6 Binomial distribution Sections 4.1 & 4.2 normal distributions

3.6 Binomial distributionSections 4.1 & 4.2 normal distributions

Discussion

Introduced two random variables, binomial and normal.binomial is discrete, normal continuous.

Binomial has a probability table with Pr{Y = j} forj = 0, 1, . . . , n, normal has density function f (x).

Binomial sometimes written Y ∼ bin(n, p)

Normal sometimes written Y ∼ N(µ, σ).

R computes probabilities for both.

In next section we’ll discuss how to get probabilities fornormal random variables.

22 / 22