Chapter 3
Introduction to Probability
3.1 What is Probability?
Table 3.1: 5-year incidence of breast cancer among 2000 women aged 45-54

Had first child         Total   Diagnosed with breast cancer   Proportion
Before the age of 20    1000    4                              0.004
After the age of 30     1000    5                              0.005
Total                   2000    9                              0.0045
BIOS 2041 Statistical Methods Abdus S. Wahed
• Is this evidence enough to confirm a difference in risk between
the two groups?
• What if we increase the sample sizes 10-fold?
Table 3.2: 5-year incidence of breast cancer among 20000 women aged 45-54

Had first child         Total   Diagnosed with breast cancer   Proportion
Before the age of 20    10000   40                             0.004
After the age of 30     10000   50                             0.005
Total                   20000   90                             0.0045
• Not sure, as these differences in risk might just be due to chance.
Thus, we need a formal way of judging whether such differences in a
random phenomenon could be attributed to chance alone.
3.2 Probability
3.2.1 Experiment
An experiment is any action or process that generates observations.
1. Tossing a coin once,
2. Rolling a die twice,
3. Measuring blood pressure levels,
4. Obtaining blood types,
5. Picking a student from this class at random and asking what
grade he/she expects in this class, etc.
3.2.2 Sample space
The sample space of an experiment, denoted by S, is the set of all
possible outcomes of the experiment. For the experiments mentioned
in the previous section,
1. S = {H, T},
2. S = {11, 12, 13, 14, 15, 16, 21, 22, . . . , 61, 62, 63, 64, 65, 66},
3. S = {x : x ≥ 0},
4. S = {A+, A−, B+, B−, . . .},
5. S = {A+, A, A−, . . .}.
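As an illustration that is not part of the original notes, the sample space for rolling a die twice can be enumerated in Python. This is only a sketch: outcomes are written as two-character strings such as "11" to mirror the notation above, and the variable names are assumptions made for this example.

```python
from itertools import product

# Faces of a single six-sided die.
faces = range(1, 7)

# Sample space for rolling a die twice: every ordered pair of faces,
# written as a two-character string such as "11" or "36".
sample_space = [f"{first}{second}" for first, second in product(faces, repeat=2)]

print(len(sample_space))                    # number of outcomes in S
print(sample_space[:3], sample_space[-1])   # first few outcomes and the last one
```

Listing the outcomes this way makes it easy to count them and, later, to pick out events as subsets.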
3.2.3 Event
An event is any collection of outcomes contained in the sample space.
For the sample spaces mentioned in the previous section,
1. • E1 = {H, T} = Heads or Tails,
• E2 = {H} = Heads only,
• E3 = {T} = tails only,
• E4 = {} = ∅ (Empty Set)= Nothing, etc.
2. • E1 = {11} = Both rolls show 1,
• E2 = {11, 12, 13, 21, 22, 31} = Sum of the numbers is less
than 5, etc.
3. • E1 = {x : 80 < x < 92} = Blood pressure level is between
80 and 92,
• E2 = {x : x > 100} = Blood pressure level exceeds 100, etc.
4. • E1 = {A+} = A positive blood group, etc.
5. • E1 = {A+, A, A−} = At least an A,
• E2 = {A+, A, A−, B+, B, B−} = At least a B, etc.
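Since an event is just a subset of the sample space, the two-dice events listed above can be built with Python sets. The following is a minimal sketch; representing each outcome as a tuple (first roll, second roll) is a choice made for this example.

```python
from itertools import product

# Sample space for rolling a die twice, as ordered pairs (first, second).
sample_space = set(product(range(1, 7), repeat=2))

# E1: both rolls show 1.  E2: the sum of the two numbers is less than 5.
e1 = {outcome for outcome in sample_space if outcome == (1, 1)}
e2 = {outcome for outcome in sample_space if sum(outcome) < 5}

# Every event is a subset of the sample space, including the empty event.
assert e1 <= sample_space and e2 <= sample_space and set() <= sample_space

print(sorted(e2))
```

The printed pairs correspond to the outcomes 11, 12, 13, 21, 22 and 31 in the notation of the notes.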
3.2.4 Probability
In the coin-tossing example, if the experiment is conducted fairly,
the chance of “H” appearing is “50-50” in any toss. Why is that?
From experience we know that if we toss the coin a large number of
times, the number of times we see “H” will closely match the number
of times we see “T”. Thus we say that in this experiment the two
outcomes are equally likely, or equivalently, that the outcome “H”
will occur with a probability of 1/2. We write
$$\Pr(H) = \frac{1}{2} = \Pr(T).$$
Note that if the coin is weighted in such a way that “H” is three
times as likely to occur as “T”, then
$$\Pr(H) = \frac{3}{4}, \quad \text{and} \quad \Pr(T) = \frac{1}{4}.$$
Similarly, while tossing an “unbiased” die once, every number is
equally likely to show up, resulting in
$$\Pr(1) = \Pr(2) = \Pr(3) = \Pr(4) = \Pr(5) = \Pr(6) = \frac{1}{6}.$$
In both cases we have assigned a number between 0 and 1 to each
of the outcomes in the sample space such that the total equals 1.
In a similar fashion, one can define the probabilities of events. For
example, while tossing a fair die, the two events
E1 : Less than 4
and
E2 : Greater than 3
are equally likely. We write
$$\Pr(E_1) = \Pr(E_2) = \frac{1}{2}.$$
Assigning probabilities in the blood pressure measurement example
or in the blood type examples is not as straightforward as in these
toy examples. However, the concept of probability easily generalizes
to those situations. In the coin toss example, we assigned a
probability of 1/2 to the outcome “H” because in a “large” number of
tosses you would expect “heads” to occur 50% of the time. Similarly,
if we measure the blood pressure a “large” number of times, we would
be able to know in what proportion of the measurements the blood
pressure level stays between 80 and 92. This proportion will serve as
an “estimate” of the probability of the corresponding event. That is,
$$\Pr(E) = \Pr\{x : 80 < x < 92\} = \frac{\text{Number of measurements greater than 80 but less than 92}}{\text{Total number of measurements}}.$$
Such probabilities are known as “empirical probabilities”.
In Table 3.1 of FOB, the probability of a male live birth during 1965
is given by
$$\frac{1{,}927{,}054}{3{,}760{,}358} = 0.51247.$$
If my chair randomly picks a student from this class to ask about
my teaching style, what is the probability that the student will
be female?
$$\Pr(F) = \frac{\text{Number of female students in this class}}{\text{Total number of students}}.$$
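The empirical probabilities above are simple relative frequencies, which can be sketched in a few lines of Python. The function name `empirical_probability` and the list of blood pressure readings are assumptions made for illustration; the readings are invented and do not come from FOB or from the notes.

```python
def empirical_probability(event_count: int, total_count: int) -> float:
    """Relative frequency of an event: occurrences divided by trials."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return event_count / total_count

# Male live births in 1965, using the counts quoted from FOB above.
p_male = empirical_probability(1_927_054, 3_760_358)
print(round(p_male, 5))

# Hypothetical blood pressure readings, made up purely for illustration.
readings = [78, 85, 91, 88, 95, 82, 79, 90, 101, 86]
in_range = sum(1 for x in readings if 80 < x < 92)
p_between = empirical_probability(in_range, len(readings))
print(p_between)
```

With only ten readings the second estimate is crude; the notes' point is that the proportion becomes a useful estimate when the number of measurements is large.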
BASIC PROBABILITY LAWS
(i) For any event E, 0 ≤ Pr(E) ≤ 1.
(ii) Pr(S) = 1.
Intersection, Union and Complement
The union of two events E1 and E2, denoted by E1 ∪ E2 and read
as “E1 or E2”, is the event consisting of all outcomes that are either
in E1 or in E2 or in both. If in the blood pressure measurement
example,
E1 = {x : x < 90}
and
E2 = {x : 90 ≤ x < 95},
then
E1 ∪ E2 = {x : x < 95}.
The intersection of two events E1 and E2, denoted by E1 ∩ E2 and
read as “E1 and E2”, is the event consisting of all outcomes that are
both in E1 and in E2. In the above example,
E1 ∩ E2 = ∅.
But if we define another event as E3 = {x : x < 94}, then
E1 ∩ E3 = {x : x < 90},
and
E2 ∩ E3 = {x : 90 ≤ x < 94}.
The complement of an event E, denoted by $\bar{E}$ or $E^c$, is the event
consisting of all outcomes that are not in E. In the above example,
$$\bar{E}_1 = \{x : x \ge 90\},$$
and
$$\bar{E}_2 = \{x : x < 90 \text{ or } x \ge 95\}.$$
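Union, intersection and complement map directly onto Python's set operators. The sketch below assumes, purely for illustration, that blood pressure is recorded as a whole number between 0 and 199, so that the events become finite sets; the notes themselves treat x as any non-negative value.

```python
# Assumed universe: whole-number blood pressure readings from 0 to 199.
S = set(range(200))

E1 = {x for x in S if x < 90}
E2 = {x for x in S if 90 <= x < 95}
E3 = {x for x in S if x < 94}

union_12 = E1 | E2          # outcomes in E1 or E2 (or both)
inter_12 = E1 & E2          # outcomes in both E1 and E2
inter_23 = E2 & E3          # outcomes in both E2 and E3
complement_2 = S - E2       # outcomes not in E2

print(union_12 == {x for x in S if x < 95})
print(inter_12 == set())
print(sorted(inter_23))
```

An empty intersection, as for E1 and E2 here, is exactly what it means for two events to share no outcomes.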
Disjoint events
Two events E1 and E2 are disjoint or mutually exclusive if they cannot
both happen at the same time. In other words, two disjoint events E1
and E2 do not share any common outcomes. For example, in the coin
tossing example, the two events E2 = {H} and E3 = {T} are mutually
exclusive. However, E1 = {H, T} and E2 are not disjoint. In the blood
pressure measuring example, the events E1 = {x : x < 90} and
E2 = {x : 90 ≤ x < 95} are disjoint.
BASIC PROBABILITY LAWS
For mutually exclusive events E1 and E2,
(i) E1 ∩ E2 = ∅, and
(ii) Pr(E1 ∪ E2) = Pr(E1) + Pr(E2).
(iii) For any event E, $\Pr(\bar{E}) = 1 - \Pr(E)$.
Example 3.2.1. Example 3.12 (FOB). Suppose A = mother
is hypertensive (DBP ≥ 95), B = father is hypertensive. Further
suppose Pr(A) = 0.1 and Pr(B) = 0.2.
1. What is the probability that the father is not hypertensive?
$\Pr(\bar{B}) = 1 - \Pr(B) = 1 - 0.2 = 0.8.$
2. What can we tell about the probability that both mother and
father are hypertensive?
3.2.5 Independent Events
Two events E1 and E2 are called independent events if the occurrence
of one does not depend on the occurrence of the other. In terms of
probability,
Multiplicative Probability Law for Independent Events
For two independent events E1 and E2,
(i) Pr(E1 ∩ E2) = Pr(E1) × Pr(E2)
Example 3.2.2. Example 3.13 (FOB). Suppose A = mother
is hypertensive (DBP ≥ 95), B = father is hypertensive. Further
suppose Pr(A) = 0.1 and Pr(B) = 0.2.
1. What can we tell about the probability that both mother and
father are hypertensive?
If we assume that the hypertensive status of the mother does
not depend at all on that of the father, then the probability
that both mother and father are hypertensive is Pr(A ∩ B) =
Pr(A) × Pr(B) = 0.1 × 0.2 = 0.02.
3.2.6 Dependent Events
Two events E1 and E2 are called dependent events if the occurrence
of one depends on the occurrence of the other. In terms of probability,
for two dependent events E1 and E2,
Pr(E1 ∩ E2) ≠ Pr(E1) × Pr(E2).
Example 3.2.3. Example 3.15 (FOB). Suppose
A+ = Doctor A makes a positive diagnosis,
B+ = Doctor B makes a positive diagnosis.
Given that
Pr(A+) = 0.1,
Pr(B+) = 0.17,
and
Pr(A+ ∩ B+) = 0.08,
do you think that doctors A and B make independent diagnoses?
Pr(A+) × Pr(B+) = 0.1 × 0.17 = 0.017 ≠ Pr(A+ ∩ B+) = 0.08.
Thus events A+ and B+ are not independent.
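The multiplicative law gives a direct numerical check for independence: compare the joint probability with the product of the two marginal probabilities. The sketch below applies it to the two hypertension and diagnosis examples; the function name `are_independent` and the tolerance are assumptions made for this illustration.

```python
import math

def are_independent(p_a: float, p_b: float, p_both: float, tol: float = 1e-9) -> bool:
    """Return True when Pr(A and B) equals Pr(A) * Pr(B) within a tolerance."""
    return math.isclose(p_both, p_a * p_b, abs_tol=tol)

# Example 3.13: independence is assumed, so the product gives the joint probability.
p_mother, p_father = 0.1, 0.2
p_both_parents = p_mother * p_father

# Example 3.15: the joint probability is given, so independence can be checked.
doctors_independent = are_independent(0.1, 0.17, 0.08)
print(round(p_both_parents, 2), doctors_independent)
```

A tolerance is used because floating-point products such as 0.1 × 0.2 are not stored exactly.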
Additive Probability Law
For two events E1 and E2,
(i) Pr(E1 ∪ E2) = Pr(E1) + Pr(E2) − Pr(E1 ∩ E2).
Example 3.2.4. Example 3.16 (FOB). Suppose
A+ = Doctor A makes a positive diagnosis,
B+ = Doctor B makes a positive diagnosis.
Given that
Pr(A+) = 0.1,
Pr(B+) = 0.17,
and
Pr(A+ ∩ B+) = 0.08,
what is the probability that a patient will be diagnosed positive by
at least one of the two doctors?
Pr(A+ ∪ B+) = Pr(A+) + Pr(B+) − Pr(A+ ∩ B+)
= 0.1 + 0.17 − 0.08
= 0.19.
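The additive law is a one-line computation. As a small sketch (the function name `prob_union` and the disjoint-case numbers are assumptions made for this example), it can be written as:

```python
def prob_union(p_a: float, p_b: float, p_both: float) -> float:
    """Additive law: Pr(A or B) = Pr(A) + Pr(B) - Pr(A and B)."""
    return p_a + p_b - p_both

# Two-doctor example: positive diagnosis by at least one doctor.
p_at_least_one = prob_union(0.1, 0.17, 0.08)
print(round(p_at_least_one, 2))

# For mutually exclusive events the intersection has probability 0,
# so the law reduces to a simple sum (hypothetical numbers).
p_disjoint = prob_union(0.25, 0.5, 0.0)
print(p_disjoint)
```

Subtracting the intersection avoids counting twice the patients whom both doctors diagnose as positive.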
3.3 Conditional Probability
In the above example, suppose doctor A diagnoses a patient as
positive. The patient wonders what would have happened had doctor B
seen the patient instead. Can our probability theory help here?
Given A+, what can we say about B+?
In what proportion of cases does doctor B diagnose positive when
doctor A diagnoses positive?
$$\Pr(B^+ \mid A^+) = \frac{\Pr(B^+ \cap A^+)}{\Pr(A^+)} = \frac{0.08}{0.10} = 0.80.$$
• The probability that B occurs, given that A has already occurred,
denoted by Pr(B|A) (read as probability of B given A), is known as
the conditional probability of B given A and is given by the formula:
$$\Pr(B \mid A) = \frac{\Pr(B \cap A)}{\Pr(A)}. \tag{3.3.1}$$
Similarly,
$$\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{\Pr(B \mid A) \times \Pr(A)}{\Pr(B)}. \tag{3.3.2}$$
Some Properties
For two independent events E1 and E2,
(i) $\Pr(E_1 \mid E_2) = \Pr(E_1)$
(ii) $\Pr(E_1 \mid \bar{E}_2) = \Pr(E_1)$
(iii) $\Pr(\bar{E}_1 \mid E_2) = \Pr(\bar{E}_1)$
(iv) $\Pr(\bar{E}_1 \mid \bar{E}_2) = \Pr(\bar{E}_1)$
3.3.1 Relative Risk
The relative risk of B given A is defined as
$$RR = \frac{\Pr(B \mid A)}{\Pr(B \mid \bar{A})}. \tag{3.3.3}$$
Example 3.3.1. Example 3.20 (FOB).
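$$\Pr(B^+ \mid A^+) = \frac{\Pr(B^+ \cap A^+)}{\Pr(A^+)} = \frac{0.08}{0.10} = 0.80.$$
$$\Pr(B^+ \mid A^-) = \frac{\Pr(B^+ \cap A^-)}{\Pr(A^-)} = \frac{\Pr(B^+) - \Pr(A^+ \cap B^+)}{\Pr(A^-)} = \frac{0.17 - 0.08}{1 - 0.10} = 0.10.$$
$$RR = \frac{\Pr(B^+ \mid A^+)}{\Pr(B^+ \mid A^-)} = \frac{0.8}{0.1} = 8,$$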
indicating that doctor B is 8 times as likely to diagnose a patient as
positive when doctor A diagnoses the patient as positive than when
doctor A diagnoses the patient as negative.
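The conditional probabilities and the relative risk in this example can be reproduced with a short Python sketch. The helper name `conditional` and the variable names are assumptions made for illustration; the numbers are those given for the two doctors.

```python
def conditional(p_joint: float, p_given: float) -> float:
    """Pr(B | A) = Pr(B and A) / Pr(A); requires Pr(A) > 0."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given

p_a_pos, p_b_pos, p_both_pos = 0.10, 0.17, 0.08

# Pr(B+ | A+)
p_b_given_a_pos = conditional(p_both_pos, p_a_pos)

# Pr(B+ | A-): B+ without A+ has probability Pr(B+) - Pr(A+ and B+),
# and Pr(A-) = 1 - Pr(A+).
p_b_given_a_neg = conditional(p_b_pos - p_both_pos, 1 - p_a_pos)

relative_risk = p_b_given_a_pos / p_b_given_a_neg
print(round(p_b_given_a_pos, 2), round(p_b_given_a_neg, 2), round(relative_risk, 1))
```

The guard against a zero denominator reflects the fact that a conditional probability is only defined when the conditioning event can occur.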
3.3.2 Total Probability
Let us consider the following example:
Example 3.3.2. A chain of drug stores sells three different brands
of over-the-counter (OC) pain relievers. Of its OC pain reliever sales,
50% are brand A, 30% are brand B, and 20% are brand C. Each
manufacturer offers a 6-month satisfaction warranty. It is known
that 10% of brand A is returned to the store for refund within 6
months, whereas the corresponding percentages for brands B and C
are 7% and 3%, respectively.
1. What is the probability that a randomly selected purchaser who
has bought an OC pain reliever will return to the store for a
refund within 6 months?
2. If a customer returns to the store for a refund, what is the
probability that it is a brand A pain reliever? A brand B pain
reliever? A brand C pain reliever?
P(A) = .5    P(R|A) = .10    P(not R|A) = .90
P(B) = .3    P(R|B) = .07    P(not R|B) = .93
P(C) = .2    P(R|C) = .03    P(not R|C) = .97
Law of total probability
Suppose A1, A2, . . ., An are mutually exclusive events such that
A1 ∪ A2 ∪ . . . ∪ An = S.
If B is another event in the sample space, then
$$\Pr(B) = \sum_{i=1}^{n} \Pr(B \mid A_i)\Pr(A_i).$$
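Applied to the first question of Example 3.3.2, the law of total probability weights each brand's return rate by that brand's share of sales. The following is a minimal sketch; the dictionary layout and the function name `total_probability` are assumptions made for this example.

```python
# Brand shares Pr(A_i) and return rates Pr(R | A_i) from Example 3.3.2.
brand_share = {"A": 0.5, "B": 0.3, "C": 0.2}
return_rate = {"A": 0.10, "B": 0.07, "C": 0.03}

def total_probability(priors: dict, likelihoods: dict) -> float:
    """Pr(B) = sum over i of Pr(B | A_i) * Pr(A_i) for a partition A_1..A_n."""
    return sum(likelihoods[k] * priors[k] for k in priors)

p_return = total_probability(brand_share, return_rate)
print(round(p_return, 3))
```

The three products being summed are .1(.5), .07(.3) and .03(.2), so the overall return probability is their total, 0.077.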
Example 3.3.3. FOB 3.19, 3.21. Suppose that 20 in 100,000
women with negative mammograms will develop breast cancer within
2 years, whereas 1 woman in 10 with positive mammograms will have
developed breast cancer within 2 years. Suppose that only 7% of the
general population of women will have a positive mammogram.
1. What is the probability that a randomly selected woman will
develop breast cancer within 2 years of having a mammogram?
2. Suppose that a woman is diagnosed with breast cancer. What is
the probability that she had a negative result on her last
mammogram?
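One possible way to work these two questions is sketched below; it is offered as an illustration, not as the solution given in the notes or in FOB. Question 1 uses the law of total probability over the two mammogram results, and question 2 uses the conditional probability formula (3.3.2). The variable names are assumptions.

```python
# Quantities stated in Example 3.3.3.
p_pos = 0.07                         # Pr(positive mammogram)
p_neg = 1 - p_pos                    # Pr(negative mammogram)
p_cancer_given_pos = 1 / 10          # Pr(cancer | positive)
p_cancer_given_neg = 20 / 100_000    # Pr(cancer | negative)

# Question 1: law of total probability over {positive, negative}.
p_cancer = p_cancer_given_pos * p_pos + p_cancer_given_neg * p_neg

# Question 2: Pr(negative | cancer) = Pr(cancer | negative) Pr(negative) / Pr(cancer).
p_neg_given_cancer = p_cancer_given_neg * p_neg / p_cancer

print(round(p_cancer, 6), round(p_neg_given_cancer, 4))
```

Under these inputs, Pr(cancer) is the sum 0.007 + 0.000186, and only a small fraction of the women who develop cancer had a negative mammogram.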
3.3.3 Bayes’ Rule
Bayes’ Theorem
Suppose A1, A2, . . ., An are mutually exclusive events such that
A1 ∪ A2 ∪ . . . ∪ An = S.
If B is another event in the sample space, then
$$\Pr(A_j \mid B) = \frac{\Pr(B \mid A_j)\Pr(A_j)}{\sum_{i=1}^{n} \Pr(B \mid A_i)\Pr(A_i)}.$$
In Example 3.3.2, to answer the second question
If a customer returns to the store for a refund, what is the
probability that it is a brand A pain reliever? A brand B
pain reliever? A brand C pain reliever?
we use Bayes’ theorem:
$$\Pr(A \mid R) = \frac{\Pr(R \mid A)\Pr(A)}{\Pr(R \mid A)\Pr(A) + \Pr(R \mid B)\Pr(B) + \Pr(R \mid C)\Pr(C)} = \frac{.1(.5)}{.1(.5) + .07(.3) + .03(.2)} = \frac{50}{77} = 0.65. \tag{3.3.4}$$
$$\Pr(B \mid R) = \frac{\Pr(R \mid B)\Pr(B)}{\Pr(R \mid A)\Pr(A) + \Pr(R \mid B)\Pr(B) + \Pr(R \mid C)\Pr(C)} = \frac{.07(.3)}{.1(.5) + .07(.3) + .03(.2)} = \frac{3}{11} = 0.27. \tag{3.3.5}$$
$$\Pr(C \mid R) = \frac{\Pr(R \mid C)\Pr(C)}{\Pr(R \mid A)\Pr(A) + \Pr(R \mid B)\Pr(B) + \Pr(R \mid C)\Pr(C)} = \frac{.03(.2)}{.1(.5) + .07(.3) + .03(.2)} = \frac{6}{77} = 0.08. \tag{3.3.6}$$
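The three Bayes' theorem calculations share the same denominator, so they can be done in one pass. The sketch below is one way to organize this in Python; the function name `posterior` and the dictionary layout are assumptions made for illustration.

```python
def posterior(priors: dict, likelihoods: dict) -> dict:
    """Bayes' theorem over a partition: Pr(A_j | B) for every j."""
    joint = {k: likelihoods[k] * priors[k] for k in priors}
    evidence = sum(joint.values())          # Pr(B), by total probability
    return {k: joint[k] / evidence for k in joint}

# Brand shares and return rates from Example 3.3.2.
brand_share = {"A": 0.5, "B": 0.3, "C": 0.2}
return_rate = {"A": 0.10, "B": 0.07, "C": 0.03}

brand_given_return = posterior(brand_share, return_rate)
for brand, p in brand_given_return.items():
    print(brand, round(p, 2))
```

Because the events A, B and C partition the sample space, the three posterior probabilities add up to 1.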
Positive Predictive Value/Predictive Value Positive
Positive Predictive Value (PPV)/Predictive Value Positive (PV+) of
a screening test is the probability that a person has a disease given
that the test is positive.
PV+ = Pr(disease | test+).
Negative Predictive Value/Predictive Value Negative
Negative Predictive Value (NPV)/Predictive Value Negative (PV−)
of a screening test is the probability that a person does not have a
disease given that the test is negative.
PV− = Pr(no disease | test−).
Example 3.3.4. Example 3.3.3 Continued. For the mammogram
test data, the positive predictive value of the mammogram test is
$$PV^+ = \Pr(\text{Breast Cancer} \mid \text{Mammogram}^+) = \frac{1}{10} = 0.1.$$
The negative predictive value is
$$PV^- = \Pr(\text{No Breast Cancer} \mid \text{Mammogram}^-) = 1 - \frac{20}{100000} = 0.9998.$$
Sensitivity
The sensitivity of a test is the probability that the test is
positive when the person has the disease, i.e.,
Sensitivity = Pr(Positive Test|disease).
Specificity
The specificity of a test is the probability that the test is
negative when the person is disease-free, i.e.,
Specificity = Pr(Negative Test|no disease).
Example 3.3.5. Review Question 3, Page 59, FOB.
                   PSA test result
Prostate cancer    +      −      Total
+                  92     46     138
−                  27     72     99
Total              119    118    237
1. Sensitivity of PSA test
$$\text{Sensitivity} = \Pr(\text{Positive Test} \mid \text{disease}) = \frac{92}{138} = 0.67. \tag{3.3.7}$$
In 67% of the cases the PSA test detects prostate cancer when
the patient has cancer.
Specificity of PSA test
$$\text{Specificity} = \Pr(\text{Negative Test} \mid \text{no disease}) = \frac{72}{99} = 0.73. \tag{3.3.8}$$
In 73% of the cases the PSA test correctly declares that there is
no prostate cancer when the patient does not have cancer.
2. Positive and negative predictive values
$$PV^+ = \Pr(\text{Prostate Cancer} \mid \text{PSA}^+) = \frac{92}{119} = 0.77.$$
The negative predictive value is
$$PV^- = \Pr(\text{No Prostate Cancer} \mid \text{PSA}^-) = \frac{72}{118} = 0.61.$$
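All four screening-test measures come from the same 2×2 table, which a short Python sketch makes explicit. The abbreviations tp, fn, fp and tn (true/false positives and negatives) are labels introduced for this example, not notation used in the notes. Note that PV− divides the 72 cancer-free patients with a negative test by the 118 negative tests in total.

```python
# Counts from the PSA table: rows are true disease status, columns are test result.
tp, fn = 92, 46   # prostate cancer present: test +, test -
fp, tn = 27, 72   # prostate cancer absent:  test +, test -

sensitivity = tp / (tp + fn)   # Pr(test + | disease)
specificity = tn / (fp + tn)   # Pr(test - | no disease)
ppv = tp / (tp + fp)           # Pr(disease | test +)
npv = tn / (fn + tn)           # Pr(no disease | test -)

for name, value in [("sensitivity", sensitivity), ("specificity", specificity),
                    ("PV+", ppv), ("PV-", npv)]:
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity condition on the row totals (disease status), whereas the predictive values condition on the column totals (test result).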
Example 3.3.6. Mental Health: Table 3.5 on Page 69,
FOB.
Table 3.3: Prevalence of Alzheimer’s disease (cases per 100 population)

Age group   Males   Females
65-69       1.6     0.0
70-74       0.0     2.2
75-79       4.9     2.3
80-84       8.6     7.8
85+         35.0    27.9
Suppose an unrelated 77-year-old man, 76-year-old woman, and
82-year-old woman are selected from the community. Let
A:{77-year-old man has Alzheimer’s disease},
B:{76-year-old woman has Alzheimer’s disease}, and
C:{82-year-old woman has Alzheimer’s disease}. Then,
Pr(A) = 0.049
Pr(B) = 0.023
Pr(C) = 0.078
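3.17. Pr(All three have Alzheimer’s disease)
$$\Pr(ABC) = \Pr(A)\Pr(B)\Pr(C) = 0.000087906$$
3.20. Pr(Exactly one of the three has Alzheimer’s disease)
$$\begin{aligned}
&= \Pr(A\bar{B}\bar{C}) + \Pr(\bar{A}B\bar{C}) + \Pr(\bar{A}\bar{B}C) \\
&= 0.049 \times 0.977 \times 0.922 + 0.951 \times 0.023 \times 0.922 + 0.951 \times 0.977 \times 0.078 \\
&= 0.137
\end{aligned}$$
3.22. Let D: {exactly two of the three people have Alzheimer’s disease}.
That is, $D = AB\bar{C} \cup A\bar{B}C \cup \bar{A}BC$.
$$\Pr(BC \mid D) = \frac{\Pr(BCD)}{\Pr(D)} = \frac{\Pr(\bar{A}BC)}{\Pr(D)} = \frac{\Pr(\bar{A}BC)}{\Pr(AB\bar{C}) + \Pr(A\bar{B}C) + \Pr(\bar{A}BC)}$$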
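These three calculations can be checked by enumerating all eight combinations of disease status for the three people. The sketch below assumes, as the example does for unrelated individuals, that the three events are independent; the function name `joint` and the dictionary-based representation are choices made for this illustration.

```python
from itertools import product

# Prevalences read from Table 3.3 for the three (assumed independent) people.
p = {"A": 0.049, "B": 0.023, "C": 0.078}

def joint(status: dict) -> float:
    """Probability of one full has/does-not-have pattern under independence."""
    result = 1.0
    for person, has_disease in status.items():
        result *= p[person] if has_disease else 1 - p[person]
    return result

# All 8 patterns of (A, B, C) disease status.
patterns = [dict(zip("ABC", flags)) for flags in product([True, False], repeat=3)]

p_all_three = joint({"A": True, "B": True, "C": True})
p_exactly_one = sum(joint(s) for s in patterns if sum(s.values()) == 1)
p_exactly_two = sum(joint(s) for s in patterns if sum(s.values()) == 2)

# Pr(B and C | exactly two): only the pattern (not A, B, C) qualifies.
p_bc_given_two = joint({"A": False, "B": True, "C": True}) / p_exactly_two

print(round(p_all_three, 9), round(p_exactly_one, 3), round(p_bc_given_two, 3))
```

Enumerating patterns scales poorly beyond a handful of people, but for three it keeps every term of the hand calculation visible.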