CHAPTER 3: Probability Theory
• 3.1 - Basic Definitions and Properties
• 3.2 - Conditional Probability and Independence
• 3.3 - Bayes’ Formula
• 3.4 - Applications (biomedical)
POPULATION
P(POPULATION) = 1
A = “Lung Cancer”
• P(A) corresponds to the ratio of the probability of A, relative to the entire population.
A = lung cancer (sub-)population
B = “Smoker”
B = smoking (sub-)population
A ∩ B
Probability of lung cancer
Probability of lung cancer and smoker
Informal Description…
• P(A ⋂ B) = the probability that both events occur simultaneously in the population.
CONDITIONAL PROBABILITY
Informal Description…
• P(A | B) = probability of lung cancer, given smoker. It corresponds to the ratio of the probability of A ⋂ B, relative to the probability of B:
P(A | B) = P(A ⋂ B) / P(B)
Probability of “Primary Color,” given “Hot Color” = ?

POPULATION
Outcome   Probability
Red       0.10
Orange    0.15
Yellow    0.20
Green     0.25
Blue      0.30
Total     1.00

E = “Primary Color” = {Red, Yellow, Blue}, P(E) = 0.60
F = “Hot Color” = {Red, Orange, Yellow}, P(F) = 0.45

Probability Table
      E      Ec
F     0.30   0.15   0.45
Fc    0.30   0.25   0.55
      0.60   0.40   1.00

Venn Diagram: E ⋂ F = {Red, Yellow} (0.30), E only = {Blue} (0.30), F only = {Orange} (0.15), neither = {Green} (0.25)
Conditional Probability
P(E | F) = P(E ⋂ F) / P(F) = 0.30 / 0.45 = 0.667
P(F | E) = P(F ⋂ E) / P(E) = 0.30 / 0.60 = 0.5
P(Ec | F) = 1 – P(E | F) = 1 – 0.667 = 0.333
P(E | Fc) = P(E ⋂ Fc) / P(Fc) = 0.30 / 0.55 = 0.545
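The conditional probabilities in this color example can be reproduced directly from the outcome table. The following is a minimal sketch, not part of the original slides; the helper names `prob` and `cond_prob` are illustrative assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Outcome probabilities from the POPULATION table (as exact fractions).
outcomes = {
    "Red": Fraction(10, 100),
    "Orange": Fraction(15, 100),
    "Yellow": Fraction(20, 100),
    "Green": Fraction(25, 100),
    "Blue": Fraction(30, 100),
}
E = {"Red", "Yellow", "Blue"}      # "Primary Color"
F = {"Red", "Orange", "Yellow"}    # "Hot Color"
everything = set(outcomes)


def prob(event):
    """P(event): add up the probabilities of the outcomes in the event."""
    return sum(outcomes[o] for o in event)


def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); assumes P(b) > 0."""
    return prob(a & b) / prob(b)


p_e_given_f = cond_prob(E, F)                 # 0.30 / 0.45
p_f_given_e = cond_prob(F, E)                 # 0.30 / 0.60
p_ec_given_f = cond_prob(everything - E, F)   # 1 - P(E | F)
p_e_given_fc = cond_prob(E, everything - F)   # 0.30 / 0.55

for name, value in [("P(E|F)", p_e_given_f), ("P(F|E)", p_f_given_e),
                    ("P(Ec|F)", p_ec_given_f), ("P(E|Fc)", p_e_given_fc)]:
    print(name, "=", value, "≈", round(float(value), 3))
```

Note that P(E | F) and P(F | E) differ: conditioning is not symmetric.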
Example:
               Women   Men    Total
Fractures      952     343    1295
No Fractures   1293    1417   2710
Total          2245    1760   4005

n = 4005

P(Fracture, given Woman) ≈ 952 / 2245 = 0.424
P(Fracture, given Man) ≈ 343 / 1760 = 0.195
P(Man, given Fracture) ≈ 343 / 1295 = 0.265
P(Woman, given Fracture) = 1 – 343 / 1295 = 952 / 1295 = 0.735
P(Fracture) ≈ 1295 / 4005 = 0.323
P(Fracture and Woman) ≈ 952 / 4005 = 0.238
“Osteoporosis-related fractures are more than twice as likely to
occur among women than men.”
“A person who suffers an osteoporosis-related fracture is almost three times more
likely to be a woman than a man.”
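Each of the two quoted statements compares a different pair of conditional probabilities. A minimal sketch under the counts in the table above (the variable names are illustrative assumptions, not from the slides):

```python
# Counts from the 2x2 table: (outcome, sex) -> number of people.
counts = {
    ("Fracture", "Women"): 952,
    ("Fracture", "Men"): 343,
    ("No Fracture", "Women"): 1293,
    ("No Fracture", "Men"): 1417,
}
n = sum(counts.values())

women = counts[("Fracture", "Women")] + counts[("No Fracture", "Women")]
men = counts[("Fracture", "Men")] + counts[("No Fracture", "Men")]
fractures = counts[("Fracture", "Women")] + counts[("Fracture", "Men")]

# Condition on sex: how likely is a fracture within each group?
p_fracture_given_woman = counts[("Fracture", "Women")] / women
p_fracture_given_man = counts[("Fracture", "Men")] / men

# Condition on fracture: how likely is each sex among fracture cases?
p_woman_given_fracture = counts[("Fracture", "Women")] / fractures
p_man_given_fracture = counts[("Fracture", "Men")] / fractures

# First statement: ratio of fracture probabilities (women vs. men).
ratio_given_sex = p_fracture_given_woman / p_fracture_given_man
# Second statement: ratio of sexes among people with a fracture.
ratio_given_fracture = p_woman_given_fracture / p_man_given_fracture

print(round(ratio_given_sex, 2), round(ratio_given_fracture, 2))
```

The first ratio conditions on sex and the second conditions on fracture, which is why the two statements give different numbers from the same table.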
Def: The conditional probability of event A, given event B, is denoted by P(A | B), and calculated via the formula
P(A | B) = P(A ⋂ B) / P(B).
Thus, for any two events A and B, it follows that P(A ⋂ B) = P(A | B) × P(B).
• B occurs with prob P(B)
• Given that B occurs, A occurs with prob P(A | B)
• Both A and B occur, with prob P(A ⋂ B)
Example: P(Live to 75) × P(Live to 80 | Live to 75) = P(Live to 80)
Example: Randomly select two cards with replacement from a fair deck. P(Both Aces) = ?
P(Ace1) = 4/52, P(Ace2 | Ace1) = 4/52, so P(Ace1 ⋂ Ace2) = (4/52)^2
Example: Randomly select two cards without replacement from a fair deck. P(Both Aces) = ?
P(Ace1) = 4/52, P(Ace2 | Ace1) = 3/51, so P(Ace1 ⋂ Ace2) = (4/52)(3/51)
Exercises: P(Neither is an Ace) = ? P(Exactly one is an Ace) = ? P(At least one is an Ace) = ?
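The two "Both Aces" answers can be checked by brute-force enumeration of all ordered pairs of cards. This is a small sketch, assuming a 52-card deck in which only the ace/non-ace distinction matters; the function name `both_aces` is illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

# A 52-card deck reduced to what matters here: 4 aces, 48 other cards.
deck = ["Ace"] * 4 + ["Other"] * 48


def both_aces(draws):
    """Fraction of equally likely ordered pairs in which both cards are aces."""
    draws = list(draws)
    hits = sum(1 for first, second in draws if first == "Ace" and second == "Ace")
    return Fraction(hits, len(draws))


# With replacement: the same card may be drawn twice (52 * 52 pairs).
with_replacement = both_aces(product(deck, repeat=2))
# Without replacement: the two positions must differ (52 * 51 pairs).
without_replacement = both_aces(permutations(deck, 2))

print(with_replacement, without_replacement)
```

Both results agree with the multiplication rule P(Ace1 ⋂ Ace2) = P(Ace1) × P(Ace2 | Ace1).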
Tree Diagrams
Multiply together “branch probabilities” to obtain “intersection probabilities”:
P(B)  × P(A | B)   = P(A ⋂ B)
P(B)  × P(Ac | B)  = P(Ac ⋂ B)
P(Bc) × P(A | Bc)  = P(A ⋂ Bc)
P(Bc) × P(Ac | Bc) = P(Ac ⋂ Bc)

Event   A           Ac
B       P(A ⋂ B)    P(Ac ⋂ B)
Bc      P(A ⋂ Bc)   P(Ac ⋂ Bc)
Example: Bob must take two trains to his home in Manhattan after work: the A and the B, in either order. At 5:00 PM…
• The A train arrives first with probability 0.65, and takes 30 mins to reach its last stop at Times Square.
• The B train arrives first with probability 0.35, and takes 30 mins to reach its last stop at Grand Central Station.
• At Times Square, Bob exits, and catches the second train. The A arrives first with probability 0.4, then travels to Brooklyn. The B train arrives first with probability 0.6, and takes 30 minutes to reach a station near his home.
• At Grand Central Station, the A train arrives first with probability 0.8, and takes 30 minutes to reach a station near his home. The B train arrives first with probability 0.2, then travels to Queens.
With what probability will Bob be exiting the subway at 6:00 PM?
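Tree diagram (5:00 → 5:30 → 6:00), where subscript 1 = first train, 2 = second train:
P(A1) = 0.65
    P(A2 | A1) = 0.4  ⟹  P(A1 ⋂ A2) = 0.26
    P(B2 | A1) = 0.6  ⟹  P(A1 ⋂ B2) = 0.39
P(B1) = 0.35
    P(A2 | B1) = 0.8  ⟹  P(B1 ⋂ A2) = 0.28
    P(B2 | B1) = 0.2  ⟹  P(B1 ⋂ B2) = 0.07
MULTIPLY along each branch to get the four intersection probabilities.
ADD the two branches that bring Bob home by 6:00 PM: P(A1 ⋂ B2) + P(B1 ⋂ A2) = 0.39 + 0.28 = 0.67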
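The multiply-then-add procedure on the tree can be written out in a few lines. This is an illustrative sketch only; the dictionary layout and variable names are assumptions, while the probabilities come from the problem statement.

```python
# P(first train): which train arrives first at 5:00 PM.
first = {"A": 0.65, "B": 0.35}

# P(second train | first train): which train arrives first at the transfer station.
second_given_first = {
    "A": {"A": 0.4, "B": 0.6},   # at Times Square
    "B": {"A": 0.8, "B": 0.2},   # at Grand Central Station
}

# MULTIPLY branch probabilities to get each path (intersection) probability.
paths = {
    (f, s): first[f] * second_given_first[f][s]
    for f in first
    for s in second_given_first[f]
}

# ADD the paths where Bob rides both trains (A then B, or B then A);
# riding the same line twice sends him to Brooklyn or Queens instead.
p_home_by_six = sum(p for (f, s), p in paths.items() if f != s)

for path, p in paths.items():
    print(path, round(p, 2))
print("P(exits subway at 6:00 PM) =", round(p_home_by_six, 2))
```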
Example:
Let events C, D, and E be defined as:
C = Active vitamin C
D = Disease (Total Cancer)
E = Active vitamin E

Treatment          D+     D–      Totals
Placebo E and C    479    3174    3653
E (+ placebo C)    491    3168    3659
C (+ placebo E)    480    3193    3673
Active E and C     493    3163    3656
Totals             1943   12698   14641

P(E) ≈ 7315 / 14641 = 0.5 “balanced”
P(C) ≈ 7329 / 14641 = 0.5
P(D) ≈ 1943 / 14641 = 0.133
P(D, given C) ≈ 973 / 7329 = 0.133
P(D, given E) ≈ 984 / 7315 = 0.135
These study results suggest that D is statistically independent of both C and E, i.e., no association exists.
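The group sizes and conditional probabilities quoted for this trial follow from the four treatment arms. A minimal sketch under the table's counts (the data layout and the helper `p_disease` are illustrative assumptions):

```python
# (active_E, active_C) -> (D+ count, D- count), one entry per treatment arm.
arms = {
    (False, False): (479, 3174),   # Placebo E and C
    (True, False): (491, 3168),    # E (+ placebo C)
    (False, True): (480, 3193),    # C (+ placebo E)
    (True, True): (493, 3163),     # Active E and C
}
n = sum(d_pos + d_neg for d_pos, d_neg in arms.values())


def p_disease(select):
    """P(D | arms chosen by `select`), estimated as cases / group size."""
    cases = sum(d_pos for arm, (d_pos, d_neg) in arms.items() if select(arm))
    size = sum(d_pos + d_neg for arm, (d_pos, d_neg) in arms.items() if select(arm))
    return cases / size


p_d = p_disease(lambda arm: True)            # everyone
p_d_given_e = p_disease(lambda arm: arm[0])  # active vitamin E
p_d_given_c = p_disease(lambda arm: arm[1])  # active vitamin C

print(round(p_d, 3), round(p_d_given_e, 3), round(p_d_given_c, 3))
```

The three values are nearly equal, which is what "no association" looks like numerically. Whether the small differences are statistically significant is a separate question that this sketch does not address.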
POPULATION
Outcome   Probability
Red       0.10
Orange    0.18
Yellow    0.17
Green     0.22
Blue      0.33
Total     1.00

Probability Table
      E      Ec
F     0.27   0.18   0.45
Fc    0.33   0.22   0.55
      0.60   0.40   1.00

Venn Diagram: E ⋂ F = {Red, Yellow} (0.27), E only = {Blue} (0.33), F only = {Orange} (0.18), neither = {Green} (0.22)
E = “Primary Color” = {Red, Yellow, Blue}, P(E) = 0.60
F = “Hot Color” = {Red, Orange, Yellow}, P(F) = 0.45

Conditional Probability
P(E | F) = P(E ⋂ F) / P(F) = 0.27 / 0.45 = 0.60 = P(E)
P(F | E) = P(F ⋂ E) / P(E) = 0.27 / 0.60 = 0.45 = P(F)
P(E | F) = P(E)
P(F | E) = P(F)
Events E and F are “statistically independent”
“Primary colors” comprise 60% of the “hot colors,” and 60% of the general population.
“Hot colors” comprise 45% of the “primary colors,” and 45% of the general population.
Def: Two events A and B are said to be statistically independent if P(A | B) = P(A), which is equivalent to P(A ⋂ B) = P(A) × P(B).
Neither event provides any information about the other.
If either of these two conditions fails, then A and B are statistically dependent.
Example: E = “Primary Color” = {Red, Yellow, Blue}, F = “Hot Color” = {Red, Orange, Yellow}
Is P(E ⋂ F) = P(E) P(F)? Is 0.27 = 0.60 × 0.45? YES! Events E and F are “statistically independent.”
Example: Are events A = “Ace” and B = “Black” statistically independent?
P(A) = 4/52 = 1/13, P(B) = 26/52 = 1/2, P(A ⋂ B) = 2/52 = 1/26 = P(A) P(B). YES!
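The product test P(A ⋂ B) = P(A) P(B) is easy to automate. The sketch below applies it to both color distributions from this chapter and to the Ace/Black example; the function name `is_independent` is an illustrative assumption, and exact fractions are used so that equality is meaningful.

```python
from fractions import Fraction


def is_independent(p_a, p_b, p_a_and_b):
    """Product test: A and B are independent iff P(A and B) = P(A) * P(B)."""
    return p_a_and_b == p_a * p_b


# Revised color distribution: P(E) = 0.60, P(F) = 0.45, P(E and F) = 0.27.
colors_revised = is_independent(Fraction(60, 100), Fraction(45, 100), Fraction(27, 100))

# Earlier color distribution: same marginals, but P(E and F) = 0.30.
colors_earlier = is_independent(Fraction(60, 100), Fraction(45, 100), Fraction(30, 100))

# Cards: A = "Ace", B = "Black"; two of the four aces are black.
ace_black = is_independent(Fraction(4, 52), Fraction(26, 52), Fraction(2, 52))

print(colors_revised, colors_earlier, ace_black)
```

The same marginals P(E) and P(F) can go with either an independent or a dependent pair of events; only the intersection probability decides it.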
Example: According to the American Red Cross, the US population is distributed as shown.

                    Rh Factor
Blood Type          +       –       Row marginals
O                   .384    .077    .461
A                   .323    .065    .388
B                   .094    .017    .111
AB                  .032    .007    .039
Column marginals    .833    .166    .999

Are “Type O” and “Rh+” statistically independent?
P(O ⋂ Rh+) = .384, P(O) = .461, P(Rh+) = .833
Is .384 = .461 × .833? YES! (to three decimal places)
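With rounded table entries, the product test has to be applied with a tolerance rather than exact equality. A minimal sketch, assuming a tolerance of half a unit in the third decimal place (that threshold is an assumption made here, not something stated on the slide):

```python
# Joint probabilities from the table: (blood type, Rh factor) -> probability.
joint = {
    ("O", "+"): 0.384, ("O", "-"): 0.077,
    ("A", "+"): 0.323, ("A", "-"): 0.065,
    ("B", "+"): 0.094, ("B", "-"): 0.017,
    ("AB", "+"): 0.032, ("AB", "-"): 0.007,
}

# Marginals: sum the joint probabilities over the other variable.
p_o = sum(p for (blood, rh), p in joint.items() if blood == "O")
p_rh_pos = sum(p for (blood, rh), p in joint.items() if rh == "+")
p_o_and_rh_pos = joint[("O", "+")]

# Compare P(O and Rh+) with P(O) * P(Rh+), allowing for rounding in the table.
gap = abs(p_o_and_rh_pos - p_o * p_rh_pos)
tolerance = 0.0005  # assumed: half a unit in the third decimal place
print(round(p_o, 3), round(p_rh_pos, 3), round(p_o * p_rh_pos, 4), gap < tolerance)
```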
IMPORTANT FORMULAS
P(Ac) = 1 – P(A)
P(A ⋃ B) = P(A) + P(B) – P(A ⋂ B), where P(A ⋂ B) = 0 if A and B are disjoint
P(A | B) = P(A ⋂ B) / P(B), so P(A ⋂ B) = P(A | B) P(B)
A and B are statistically independent if: P(A | B) = P(A), equivalently P(A ⋂ B) = P(A) P(B)
DeMorgan’s Laws
(A ⋃ B)c = Ac ⋂ Bc
(A ⋂ B)c = Ac ⋃ Bc
Distributive Laws
A ⋂ (B ⋃ C) = (A ⋂ B) ⋃ (A ⋂ C)
A ⋃ (B ⋂ C) = (A ⋃ B) ⋂ (A ⋃ C)
Others…
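The set identities above can be spot-checked on a small example with Python's built-in sets. This is a sanity check on one arbitrary choice of sets (the universe `U` and the sets `A`, `B`, `C` are made up for illustration), not a proof.

```python
# A small universe and three arbitrary events inside it (illustrative values).
U = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}
C = {3, 5, 7}


def complement(s):
    """Complement of s relative to the universe U."""
    return U - s


# DeMorgan's Laws
de_morgan_union = complement(A | B) == (complement(A) & complement(B))
de_morgan_intersection = complement(A & B) == (complement(A) | complement(B))

# Distributive Laws
distributive_1 = (A & (B | C)) == ((A & B) | (A & C))
distributive_2 = (A | (B & C)) == ((A | B) & (A | C))

# Inclusion-exclusion for counts, the analogue of the formula for P(A ⋃ B)
inclusion_exclusion = len(A | B) == len(A) + len(B) - len(A & B)

print(de_morgan_union, de_morgan_intersection,
      distributive_1, distributive_2, inclusion_exclusion)
```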
Example: In a population of individuals, let A = Adult and B = Male.
• 60% of adults are male: P(B | A) = 0.6
• 40% of males are adults: P(A | B) = 0.4
• 30% are men: P(A ⋂ B) = 0.3
What percentage are adults?
What percentage are males?
Are “adult” and “male” statistically independent in this population?
Venn diagram regions: Men = A ⋂ B (0.3), Women = A ⋂ Bc, Boys = Ac ⋂ B, Girls = Ac ⋂ Bc
P(B ⋂ A) = 0.6 P(A) = 0.3 ⟹ P(A) = 0.3 / 0.6 = 0.5, or 50% are adults
P(A ⋂ B) = 0.4 P(B) = 0.3 ⟹ P(B) = 0.3 / 0.4 = 0.75, or 75% are males
Women: 0.5 – 0.3 = 0.2
Boys: 0.75 – 0.3 = 0.45
Girls: 1 – (0.3 + 0.2 + 0.45) = 0.05

          Adult   Child
Male      0.30    0.45    0.75
Female    0.20    0.05    0.25
          0.50    0.50    1.00

P(A | B) = P(A)? OR P(B | A) = P(B)? OR P(A ⋂ B) = P(A) P(B)?
NO
P(A | B) = 0.4 ≠ 0.5 = P(A)
P(B | A) = 0.6 ≠ 0.75 = P(B)
P(A ⋂ B) = 0.3 ≠ (0.5)(0.75) = 0.375
“Adult” and “male” are statistically dependent in this population.
A = Adult, B = Male
P(A) = 0.5, i.e., 50%
P(B) = 0.75, i.e., 75%
P(A ⋂ B) = 0.3, i.e., 30%
P(A | B) = 0.4

          Adult   Child
Male      0.30    0.45    0.75
Female    0.20    0.05    0.25
          0.50    0.50    1.00

What percentage of males are boys?
P(Ac | B) = P(Ac ⋂ B) / P(B) = 0.45 / 0.75 = 0.6, i.e., 60%
- OR -
P(Ac | B) = 1 – P(A | B) = 1 – 0.4 = 0.6
What percentage of females are women?
P(A | Bc) = P(A ⋂ Bc) / P(Bc) = 0.20 / 0.25 = 0.8, i.e., 80%
What percentage of children are girls?
P(Bc | Ac) = P(Bc ⋂ Ac) / P(Ac) = 0.05 / 0.50 = 0.1, i.e., 10%
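The whole adult/male example, from the three given facts to the 2×2 table and the follow-up questions, can be retraced step by step. A minimal sketch with exact fractions; the variable and key names are illustrative assumptions.

```python
from fractions import Fraction

# Given: A = Adult, B = Male.
p_b_given_a = Fraction(6, 10)   # 60% of adults are male
p_a_given_b = Fraction(4, 10)   # 40% of males are adults
p_a_and_b = Fraction(3, 10)     # 30% are men

# Rearranged multiplication rule: P(A) = P(A and B) / P(B | A), and likewise for P(B).
p_a = p_a_and_b / p_b_given_a
p_b = p_a_and_b / p_a_given_b

# Fill in the remaining cells of the 2x2 table by subtraction.
table = {
    "Men": p_a_and_b,
    "Women": p_a - p_a_and_b,
    "Boys": p_b - p_a_and_b,
}
table["Girls"] = 1 - sum(table.values())

# Independence check via the product rule.
independent = p_a_and_b == p_a * p_b

# Follow-up conditional probabilities.
boys_among_males = table["Boys"] / p_b
women_among_females = table["Women"] / (1 - p_b)
girls_among_children = table["Girls"] / (1 - p_a)

print(p_a, p_b, table, independent)
print(boys_among_males, women_among_females, girls_among_children)
```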
Example: In a population of individuals, let A = Adult and B = Male.
• 60% of adults are male: P(B | A) = 0.6
• 40% of males are adults: P(A | B) = 0.4
• 5% are girls ⟹ 95% are not girls: P(A ⋃ B) = 0.95
What percentage are adults?
What percentage are males?
What percentage are men?
P(B ⋂ A) = 0.6 P(A) and P(A ⋂ B) = 0.4 P(B) ⟹ 0.4 P(B) = 0.6 P(A) ⟹ P(B) = 1.5 P(A)
P(A ⋃ B) = P(A) + P(B) − P(A ⋂ B)
0.95 = P(A) + 1.5 P(A) − 0.6 P(A)
0.95 = 1.9 P(A) ⟹ P(A) = 0.95 / 1.9 = 0.5
Hence P(B) = 1.5 × 0.5 = 0.75 and P(A ⋂ B) = 0.6 × 0.5 = 0.3, i.e., 30% are men.
Venn diagram regions: Men = 0.3, Women = 0.2, Boys = 0.45, Girls = 0.05
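The algebra in this variant reduces to one linear equation in P(A). A short sketch under the stated givens, with exact fractions (variable names are illustrative assumptions):

```python
from fractions import Fraction

# Given: A = Adult, B = Male.
p_b_given_a = Fraction(6, 10)   # 60% of adults are male
p_a_given_b = Fraction(4, 10)   # 40% of males are adults
p_girls = Fraction(5, 100)      # 5% are girls

# Girls are "neither adult nor male", so P(A or B) = 1 - P(girls).
p_a_or_b = 1 - p_girls

# P(A and B) = P(B|A) P(A) = P(A|B) P(B)  =>  P(B) = ratio * P(A).
ratio = p_b_given_a / p_a_given_b

# P(A or B) = P(A) + P(B) - P(A and B) = (1 + ratio - P(B|A)) * P(A).
p_a = p_a_or_b / (1 + ratio - p_b_given_a)
p_b = ratio * p_a
p_men = p_b_given_a * p_a

print(p_a, p_b, p_men)
```

The answers match the previous version of the example, where "30% are men" was given instead of "5% are girls."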