Chapter 5: Multivariate Distributions


Professor Ron Fricker
Naval Postgraduate School
Monterey, California

Reading Assignment: Sections 5.1 – 5.12

Goals for this Chapter

•  Bivariate and multivariate probability distributions
   –  Bivariate normal distribution
   –  Multinomial distribution
•  Marginal and conditional distributions
•  Independence, covariance, and correlation
•  Expected value & variance of a function of r.v.s
   –  Linear functions of r.v.s in particular
•  Conditional expectation and variance

Section 5.1: Bivariate and Multivariate Distributions

•  A bivariate distribution is a probability distribution on two random variables
   –  I.e., it gives the probability of the simultaneous outcomes of the two random variables
•  For example, when playing craps a gambler might want to know the probability that, in a simultaneous roll of two dice, each comes up 1
•  Another example: In an air attack on a bunker, the probability that the bunker is not hardened and does not have SAM protection is of interest
•  A multivariate distribution is a probability distribution for more than two r.v.s

Joint Probabilities

•  Bivariate and multivariate distributions are joint probabilities – the probability that two or more events occur
   –  It's the probability of the intersection of events $\{Y_1 = y_1\}, \{Y_2 = y_2\}, \ldots, \{Y_n = y_n\}$ for $n \ge 2$
   –  We'll denote the joint (discrete) probabilities as $P(Y_1 = y_1, Y_2 = y_2, \ldots, Y_n = y_n)$
•  We'll sometimes use the shorthand notation $p(y_1, y_2, \ldots, y_n)$
   –  This is the probability that the event $\{Y_1 = y_1\}$ and the event $\{Y_2 = y_2\}$ and … and the event $\{Y_n = y_n\}$ all occurred simultaneously

Section 5.2: Bivariate and Multivariate Distributions

•  A simple example of a bivariate distribution arises when throwing two dice
   –  Let Y1 be the number of spots on die 1
   –  Let Y2 be the number of spots on die 2
•  Then there are 36 possible joint outcomes (remember the mn rule: 6 x 6 = 36)
   –  The outcomes (y1, y2) are (1,1), (1,2), (1,3), …, (6,6)
•  Assuming the dice are independent, all the outcomes are equally likely, so
   $p(y_1, y_2) = P(Y_1 = y_1, Y_2 = y_2) = 1/36, \quad y_1 = 1, 2, \ldots, 6, \; y_2 = 1, 2, \ldots, 6$

Bivariate PMF for the Example


Source: Wackerly, D.D., W.M. Mendenhall III, and R.L. Scheaffer (2008). Mathematical Statistics with Applications, 7th edition, Thomson Brooks/Cole.

Defining Joint Probability Functions (for Discrete Random Variables)

•  Definition 5.1: Let Y1 and Y2 be discrete random variables. The joint (or bivariate) probability (mass) function for Y1 and Y2 is
   $p(y_1, y_2) = P(Y_1 = y_1, Y_2 = y_2), \quad -\infty < y_1 < \infty, \; -\infty < y_2 < \infty$
•  Theorem 5.1: If Y1 and Y2 are discrete r.v.s with joint probability function $p(y_1, y_2)$, then
   1.  $p(y_1, y_2) \ge 0$ for all $y_1, y_2$
   2.  $\sum_{y_1, y_2} p(y_1, y_2) = 1$, where the sum is over all non-zero $p(y_1, y_2)$

Back to the Dice Example

•  Find $P(2 \le Y_1 \le 3, \; 1 \le Y_2 \le 2)$.
•  Solution:
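Since the worked solution is left blank on the slide, here is a minimal R sketch (not from the original slides; R is used elsewhere in this deck) that checks the answer by brute-force enumeration of the 36 equally likely outcomes:

   # Enumerate all 36 joint outcomes and sum the probabilities of those
   # satisfying 2 <= y1 <= 3 and 1 <= y2 <= 2
   outcomes <- expand.grid(y1 = 1:6, y2 = 1:6)
   outcomes$p <- 1/36
   in_event <- with(outcomes, y1 >= 2 & y1 <= 3 & y2 >= 1 & y2 <= 2)
   sum(outcomes$p[in_event])   # 4/36 = 1/9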

Textbook Example 5.1

•  A NEX has 3 checkout counters.
   –  Two customers arrive independently at the counters at different times when no other customers are present.
   –  Let Y1 denote the number (of the two) who choose counter 1 and let Y2 be the number who choose counter 2.
   –  Find the joint distribution of Y1 and Y2.
•  Solution:

Textbook Example 5.1 Solution Continued


Joint Distribution Functions

•  Definition 5.2: For any random variables (discrete or continuous) Y1 and Y2, the joint (bivariate) (cumulative) distribution function is
   $F(y_1, y_2) = P(Y_1 \le y_1, Y_2 \le y_2), \quad -\infty < y_1 < \infty, \; -\infty < y_2 < \infty$
•  For two discrete r.v.s Y1 and Y2,
   $F(y_1, y_2) = \sum_{t_1 \le y_1} \sum_{t_2 \le y_2} p(t_1, t_2)$

Back to the Dice Example

•  Find $F(2, 3) = P(Y_1 \le 2, Y_2 \le 3)$.
•  Solution:

Textbook Example 5.2

•  Back to Example 5.1. Find $F(-1, 2)$, $F(1.5, 2)$, and $F(5, 7)$.
•  Solution:

Properties of Joint CDFs

•  Theorem 5.2: If Y1 and Y2 are random variables with joint cdf F(y1, y2), then
   1.  $F(-\infty, -\infty) = F(y_1, -\infty) = F(-\infty, y_2) = 0$
   2.  $F(\infty, \infty) = 1$
   3.  If $y_1^* \ge y_1$ and $y_2^* \ge y_2$, then
       $F(y_1^*, y_2^*) - F(y_1^*, y_2) - F(y_1, y_2^*) + F(y_1, y_2) \ge 0$

Property 3 follows because
   $F(y_1^*, y_2^*) - F(y_1^*, y_2) - F(y_1, y_2^*) + F(y_1, y_2) = P(y_1 < Y_1 \le y_1^*, \; y_2 < Y_2 \le y_2^*) \ge 0$

Joint Probability Density Functions

•  Definition 5.3: Let Y1 and Y2 be continuous random variables with joint distribution function F(y1, y2). If there exists a non-negative function f(y1, y2) such that
   $F(y_1, y_2) = \int_{t_1=-\infty}^{y_1} \int_{t_2=-\infty}^{y_2} f(t_1, t_2)\, dt_2\, dt_1$
   for all $-\infty < y_1 < \infty, \; -\infty < y_2 < \infty$, then Y1 and Y2 are said to be jointly continuous random variables. The function f(y1, y2) is called the joint probability density function.

Properties of Joint PDFs

•  Theorem 5.3: If Y1 and Y2 are jointly continuous random variables with joint pdf f(y1, y2), then
   1.  $f(y_1, y_2) \ge 0$ for all $y_1, y_2$
   2.  $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(y_1, y_2)\, dy_1\, dy_2 = 1$
•  An illustrative joint pdf, which is a surface in 3 dimensions: [figure of a surface f(y1, y2) plotted over the (y1, y2) plane]

Volumes Correspond to Probabilities

•  For jointly continuous random variables, volumes under the pdf surface correspond to probabilities
•  E.g., for the pdf in Figure 5.2, the probability $P(a_1 \le Y_1 \le a_2, \; b_1 \le Y_2 \le b_2)$ corresponds to the volume shown
•  It's equal to
   $\int_{b_1}^{b_2} \int_{a_1}^{a_2} f(y_1, y_2)\, dy_1\, dy_2$

Source: Wackerly, D.D., W.M. Mendenhall III, and R.L. Scheaffer (2008). Mathematical Statistics with Applications, 7th edition, Thomson Brooks/Cole.

Textbook Example 5.3

•  Suppose a radioactive particle is located in a square with sides of unit length. Let Y1 and Y2 denote the particle's location and assume it is uniformly distributed in the square:
   $f(y_1, y_2) = \begin{cases} 1, & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   a.  Sketch the pdf
   b.  Find F(0.2, 0.4)
   c.  Find $P(0.1 \le Y_1 \le 0.3, \; 0.0 \le Y_2 \le 0.5)$
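Because the density is constant (equal to 1) on the unit square, these probabilities are just areas of rectangles. A minimal R sketch (not from the original slides) that checks parts (b) and (c), including a Monte Carlo confirmation:

   F_b <- 0.2 * 0.4                           # F(0.2, 0.4) = 0.08
   p_c <- (0.3 - 0.1) * (0.5 - 0.0)           # part (c) = 0.10

   set.seed(1)
   y1 <- runif(1e6); y2 <- runif(1e6)
   mean(y1 <= 0.2 & y2 <= 0.4)                # approximately 0.08
   mean(y1 >= 0.1 & y1 <= 0.3 & y2 <= 0.5)    # approximately 0.10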

Textbook Example 5.3 Solution


Textbook Example 5.3 Solution (cont’d)


Source: Wackerly, D.D., W.M. Mendenhall III, and R.L. Scheaffer (2008). Mathematical Statistics with Applications, 7th edition, Thomson Brooks/Cole.

Textbook Example 5.4

•  Gasoline is stored on a FOB in a bulk tank.
   –  Let Y1 denote the proportion of the tank available at the beginning of the week after restocking.
   –  Let Y2 denote the proportion of the tank that is dispensed over the week.
   –  Note that Y1 and Y2 must be between 0 and 1 and y2 must be less than or equal to y1.
   –  Let the joint pdf be
      $f(y_1, y_2) = \begin{cases} 3y_1, & 0 \le y_2 \le y_1 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   a.  Sketch the pdf
   b.  Find $P(0 \le Y_1 \le 0.5, \; 0.25 \le Y_2)$
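For part (b), a hedged R sketch (not from the original slides) that estimates the double integral by Monte Carlo; since the density is $3y_1$ on the triangle $0 \le y_2 \le y_1 \le 1$, the probability is the mean of $f$ times the event indicator over uniform draws on the unit square:

   set.seed(1)
   y1 <- runif(1e6); y2 <- runif(1e6)
   f  <- ifelse(y2 <= y1, 3 * y1, 0)                 # joint density, 0 off the support
   mean(f * (y1 <= 0.5 & y2 >= 0.25))                # about 0.039 (= 5/128 analytically)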

Textbook Example 5.4 Solution


•  First, sketching the pdf:


Source: Wackerly, D.D., W.M. Mendenhall III, and R.L. Scheaffer (2008). Mathematical Statistics with Applications, 7th edition, Thomson Brooks/Cole.

Textbook Example 5.4 Solution


•  Now, to solve for the probability:

Calculating Probabilities from Multivariate Distributions

•  Finding probabilities from a bivariate pdf requires double integration over the proper region of distributional support
•  For multivariate distributions, the idea is the same – you just have to integrate over n dimensions:
   $P(Y_1 \le y_1, Y_2 \le y_2, \ldots, Y_n \le y_n) = F(y_1, y_2, \ldots, y_n) = \int_{-\infty}^{y_1} \int_{-\infty}^{y_2} \cdots \int_{-\infty}^{y_n} f(t_1, t_2, \ldots, t_n)\, dt_n \cdots dt_1$

Section 5.2 Homework

•  Do problems 5.1, 5.4, 5.5, 5.9


Section 5.3: Marginal and Conditional Distributions

•  Marginal distributions connect the concept of (bivariate) joint distributions to univariate distributions (such as those in Chapters 3 & 4)
   –  As we'll see, in the discrete case, the name "marginal distribution" follows from summing across rows or down columns of a table
•  Conditional distributions are what arise when, in a joint distribution, we fix the value of one of the random variables
   –  The name follows from traditional lexicon like "The distribution of Y1, conditional on Y2 = y2, is…"

An Illustrative Example of Marginal Distributions

•  Remember the dice-tossing experiment of the previous section, where we defined
   –  Y1 to be the number of spots on die 1
   –  Y2 to be the number of spots on die 2
•  There are 36 possible joint outcomes (y1, y2), which are (1,1), (1,2), (1,3), …, (6,6)
•  Assuming the dice are independent, the joint distribution is
   $p(y_1, y_2) = P(Y_1 = y_1, Y_2 = y_2) = 1/36, \quad y_1 = 1, 2, \ldots, 6, \; y_2 = 1, 2, \ldots, 6$

An Illustrative Example (cont’d)

•  Now, we might want to know the probability of the univariate event $P(Y_1 = y_1)$
•  Given that all of the joint events are mutually exclusive, we have that
   $P(Y_1 = y_1) = \sum_{y_2=1}^{6} p(y_1, y_2)$
   –  For example,
      $P(Y_1 = 1) = p_1(1) = \sum_{y_2=1}^{6} p(1, y_2) = p(1,1) + p(1,2) + p(1,3) + p(1,4) + p(1,5) + p(1,6) = 1/36 + 1/36 + 1/36 + 1/36 + 1/36 + 1/36 = 1/6$

An Illustrative Example (cont’d)

•  In tabular form, the joint distribution of Y1 and Y2, p(y1, y2), with the marginal distribution of Y2, p2(y2), in the right margin (row totals) and the marginal distribution of Y1, p1(y1), in the bottom margin (column totals):

   y2 \ y1 |  1     2     3     4     5     6   | Total
   --------+------------------------------------+------
      1    | 1/36  1/36  1/36  1/36  1/36  1/36 |  1/6
      2    | 1/36  1/36  1/36  1/36  1/36  1/36 |  1/6
      3    | 1/36  1/36  1/36  1/36  1/36  1/36 |  1/6
      4    | 1/36  1/36  1/36  1/36  1/36  1/36 |  1/6
      5    | 1/36  1/36  1/36  1/36  1/36  1/36 |  1/6
      6    | 1/36  1/36  1/36  1/36  1/36  1/36 |  1/6
   --------+------------------------------------+------
   Total   |  1/6   1/6   1/6   1/6   1/6   1/6 |
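A small R sketch (not from the original slides) that builds this 6 x 6 joint pmf table and recovers the marginals by summing across the table's margins:

   # Joint pmf for the two independent dice, rows indexed by y2, columns by y1
   joint <- matrix(1/36, nrow = 6, ncol = 6,
                   dimnames = list(y2 = 1:6, y1 = 1:6))
   p1 <- colSums(joint)   # marginal of Y1: each entry 1/6
   p2 <- rowSums(joint)   # marginal of Y2: each entry 1/6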

Defining Marginal Probability Functions

•  Definition 5.4a: Let Y1 and Y2 be discrete random variables with pmf p(y1, y2). The marginal probability functions of Y1 and Y2, respectively, are
   $p_1(y_1) = \sum_{\text{all } y_2} p(y_1, y_2)$  and  $p_2(y_2) = \sum_{\text{all } y_1} p(y_1, y_2)$

Defining Marginal Density Functions

•  Definition 5.4b: Let Y1 and Y2 be jointly continuous random variables with pdf f(y1, y2). The marginal density functions of Y1 and Y2, respectively, are
   $f_1(y_1) = \int_{-\infty}^{\infty} f(y_1, y_2)\, dy_2$  and  $f_2(y_2) = \int_{-\infty}^{\infty} f(y_1, y_2)\, dy_1$

Textbook Example 5.5

•  From a group of three Navy officers, two Marine Corps officers, and one Army officer, an amphibious assault planning team of two officers is to be randomly selected.
   –  Let Y1 denote the number of Navy officers and Y2 denote the number of Marine Corps officers on the planning team.
•  Find the joint probability function of Y1 and Y2 and then find the marginal pmf of Y1.
•  Solution:

Textbook Example 5.5 Solution (cont’d)


Textbook Example 5.6

•  Let Y1 and Y2 have joint density
   $f(y_1, y_2) = \begin{cases} 2y_1, & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Sketch f(y1, y2) and find the marginal densities for Y1 and Y2.
•  Solution:

Textbook Example 5.6 Solution (cont’d)


Conditional Distributions

•  The idea of conditional distributions is that we know some information about one (or more) of the jointly distributed r.v.s
   –  This knowledge changes our probability calculus
•  Example:
   –  An unconditional distribution: The distribution of heights of a randomly chosen NPS student
   –  A conditional distribution: Given that the randomly chosen student is female, the distribution of heights
•  The information that the randomly chosen student is female changes the distribution of heights

Connecting Back to Event-based Probabilities

•  Remember from Chapter 2, for events A and B, the probability of the intersection is
   $P(A \cap B) = P(A)P(B|A)$
•  Now, consider two numerical events, $(Y_1 = y_1)$ and $(Y_2 = y_2)$. We can then write
   $p(y_1, y_2) = p_1(y_1)p(y_2|y_1) = p_2(y_2)p(y_1|y_2)$
   where the interpretation of $p(y_1|y_2)$ is the probability that r.v. Y1 equals y1 given that Y2 takes on the value y2

Defining Conditional Probability Functions

•  Definition 5.5: If Y1 and Y2 are jointly discrete r.v.s with pmf p(y1, y2) and marginal pmfs p1(y1) and p2(y2), respectively, then the conditional probability function of Y1 given Y2 is
   $p(y_1|y_2) = P(Y_1 = y_1 | Y_2 = y_2) = \dfrac{P(Y_1 = y_1, Y_2 = y_2)}{P(Y_2 = y_2)} = \dfrac{p(y_1, y_2)}{p_2(y_2)}$
   –  Note that $p(y_1|y_2)$ is not defined if $p_2(y_2) = 0$
   –  The conditional probability function of Y2 given Y1 is similarly defined

Textbook Example 5.7

•  Back to Example 5.5: Find the conditional distribution of Y1 given Y2 =1. That is, find the conditional distribution of the number of Navy officers on the planning team given that we’re told one Marine Corps officer is on the team.

•  Solution:


Textbook Example 5.7 Solution (cont’d)


Generalizing to Continuous R.V.s

•  Defining a conditional probability function for continuous r.v.s is problematic because $P(Y_1 = y_1 | Y_2 = y_2)$ is undefined, since both $(Y_1 = y_1)$ and $(Y_2 = y_2)$ are events with zero probability
   –  The problem, of course, is that for continuous random variables, the probability the r.v. takes on any particular value is zero
   –  So, generalizing Definition 5.5 doesn't work because the denominator is always zero
   –  But we can make headway by using a different approach…

Defining Conditional Distribution Functions

•  Definition 5.6: Let Y1 and Y2 be jointly continuous random variables with pdf f(y1, y2). The conditional distribution function of Y1 given Y2 = y2 is
   $F(y_1|y_2) = P(Y_1 \le y_1 | Y_2 = y_2)$
•  And, with a bit of derivation (see the text), we have that
   $F(y_1|y_2) = \int_{-\infty}^{y_1} \dfrac{f(t_1, y_2)}{f_2(y_2)}\, dt_1$
•  From this, we can define conditional pdfs…

Defining Conditional Density Functions

•  Definition 5.7: Let Y1 and Y2 be jointly continuous r.v.s with pdf f(y1, y2) and marginal densities f1(y1) and f2(y2), respectively. For any y2 such that f2(y2) > 0, the conditional density function of Y1 given Y2 = y2 is
   $f(y_1|y_2) = \dfrac{f(y_1, y_2)}{f_2(y_2)}$
•  The conditional pdf of Y2 given Y1 = y1 is similarly defined as
   $f(y_2|y_1) = f(y_1, y_2)/f_1(y_1)$

Textbook Example 5.8

•  A soft drink machine has a random amount Y2 (in gallons) in supply at the beginning of the day and dispenses a random amount Y1 during the day. It is not resupplied during the day, so $Y_1 \le Y_2$, and the joint pdf is
   $f(y_1, y_2) = \begin{cases} 1/2, & 0 \le y_1 \le y_2 \le 2 \\ 0, & \text{elsewhere} \end{cases}$
   What is the probability that less than 1/2 gallon will be sold given that ("conditional on") the machine contains 1.5 gallons at the start of the day?

Textbook Example 5.8 Solution

•  Solution:


Textbook Example 5.8 Solution (cont’d)


Section 5.3 Homework

•  Do problems 5.19, 5.22, 5.27


Section 5.4: Independent Random Variables

•  In Chapter 2 we defined the independence of events – here we extend that idea to the independence of random variables
•  Two jointly distributed random variables Y1 and Y2 are independent if the probabilities associated with Y1 are the same regardless of the observed value of Y2
   –  The idea is that learning something about Y2 does not tell you anything, probabilistically speaking, about the distribution of Y1

The Idea of Independence

•  Remember that two events A and B are independent if
   $P(A \cap B) = P(A) \times P(B)$
•  Now, when talking about r.v.s, we're often interested in events like $P(a < Y_1 \le b \cap c < Y_2 \le d)$ for some numbers $a < b$, $c < d$
•  So, for consistency with the notion of independence of events, what we want is
   $P(a < Y_1 \le b, \; c < Y_2 \le d) = P(a < Y_1 \le b) \times P(c < Y_2 \le d)$
   for any real numbers $a < b$, $c < d$
   –  That is, if Y1 and Y2 are independent, then the joint probabilities are the product of the marginal probabilities

Defining Independence for R.V.s

•  Definition 5.8: Let Y1 have cdf F1(y1), Y2 have cdf F2(y2), and Y1 and Y2 have joint cdf F(y1, y2). Then Y1 and Y2 are independent iff
   $F(y_1, y_2) = F_1(y_1)F_2(y_2)$
   for every pair of real numbers $(y_1, y_2)$.
•  If Y1 and Y2 are not independent, then they are said to be dependent
•  It is hard to demonstrate independence according to this definition
   –  The next theorem will make it easier to determine if two r.v.s are independent

Determining Independence

•  Theorem 5.4: If Y1 and Y2 are discrete r.v.s with joint pmf p(y1, y2) and marginal pmfs p1(y1) and p2(y2), then Y1 and Y2 are independent iff
   $p(y_1, y_2) = p_1(y_1)p_2(y_2)$
   for all pairs of real numbers $(y_1, y_2)$.
•  If Y1 and Y2 are continuous r.v.s with joint pdf f(y1, y2) and marginal pdfs f1(y1) and f2(y2), then Y1 and Y2 are independent iff
   $f(y_1, y_2) = f_1(y_1)f_2(y_2)$
   for all pairs of real numbers $(y_1, y_2)$.

Textbook Example 5.9

•  For the die-tossing problem of Section 5.2, show that Y1 and Y2 are independent.

•  Solution:


Textbook Example 5.10

•  In Example 5.5, is the number of Navy officers independent of the number of Marine Corps officers? (I.e., is Y1 independent of Y2?)

•  Solution:


Textbook Example 5.11

•  Let
   $f(y_1, y_2) = \begin{cases} 6y_1y_2^2, & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Show that Y1 and Y2 are independent.
•  Solution:

Textbook Example 5.11 Solution (cont’d)


Textbook Example 5.12

•  Let
   $f(y_1, y_2) = \begin{cases} 2, & 0 \le y_2 \le y_1 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Show that Y1 and Y2 are dependent.
•  Solution:

Textbook Example 5.12 Solution (cont’d)


A Useful Theorem for Determining Independence of Two R.V.s

•  Theorem 5.5: Let Y1 and Y2 have joint density f(y1, y2) that is positive iff $a \le y_1 \le b$ and $c \le y_2 \le d$, for constants a, b, c, and d; and f(y1, y2) = 0 otherwise. Then Y1 and Y2 are independent r.v.s iff
   $f(y_1, y_2) = g(y_1)h(y_2)$
   where g(y1) is a nonnegative function of y1 alone and h(y2) is a nonnegative function of y2 alone
   –  Note that g(y1) and h(y2) do not have to be densities!

Textbook Example 5.13

•  Let Y1 and Y2 have joint density
   $f(y_1, y_2) = \begin{cases} 2y_1, & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Are Y1 and Y2 independent?
•  Solution:

Textbook Example 5.13 Solution (cont’d)


Textbook Example 5.14

•  Referring back to Example 5.4, is Y1, the proportion of the tank available at the beginning of the week, independent of Y2, the proportion of the tank that is dispensed over the week? Remember, the joint pdf is
   $f(y_1, y_2) = \begin{cases} 3y_1, & 0 \le y_2 \le y_1 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
•  Solution:

Textbook Example 5.14 Solution (cont’d)


Section 5.4 Homework

•  Do problems 5.45, 5.49, 5.53


Section 5.5: Expected Value of a Function of R.V.s

•  Definition 5.9: Let g(Y1, Y2,…, Yk) be a function of the discrete random variables Y1, Y2,…, Yk, which have pmf p(y1, y2,…, yk). Then the expected value of g(Y1, Y2,…, Yk) is
   $E[g(Y_1, Y_2, \ldots, Y_k)] = \sum_{\text{all } y_k} \cdots \sum_{\text{all } y_2} \sum_{\text{all } y_1} g(y_1, y_2, \ldots, y_k)\, p(y_1, y_2, \ldots, y_k)$
   If Y1, Y2,…, Yk are continuous with pdf f(y1, y2,…, yk), then
   $E[g(Y_1, Y_2, \ldots, Y_k)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(y_1, y_2, \ldots, y_k)\, f(y_1, y_2, \ldots, y_k)\, dy_1\, dy_2 \ldots dy_k$

Textbook Example 5.15

•  Let Y1 and Y2 have joint density
   $f(y_1, y_2) = \begin{cases} 2y_1, & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Find E(Y1Y2).
•  Solution:
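A hedged R sketch (not from the original slides) that checks E(Y1Y2) by Monte Carlo. Since $f(y_1, y_2) = 2y_1 \cdot 1$ factors on a rectangle, Y1 and Y2 are independent with $f_1(y_1) = 2y_1$ (so $Y_1 = \sqrt{U}$ for $U \sim$ Uniform(0,1)) and Y2 Uniform(0,1):

   set.seed(1)
   y1 <- sqrt(runif(1e6))    # density 2*y1 on [0, 1]
   y2 <- runif(1e6)          # Uniform(0, 1), independent of y1
   mean(y1 * y2)             # close to 1/3, since E(Y1) = 2/3 and E(Y2) = 1/2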

Note the Consistency in Univariate and Bivariate Expectation Definitions

•  Let g(Y1, Y2) = Y1. Then we have
   $E[g(Y_1, Y_2)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(y_1, y_2) f(y_1, y_2)\, dy_2\, dy_1$
   $= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y_1 f(y_1, y_2)\, dy_2\, dy_1$
   $= \int_{-\infty}^{\infty} y_1 \left[ \int_{-\infty}^{\infty} f(y_1, y_2)\, dy_2 \right] dy_1$
   $= \int_{-\infty}^{\infty} y_1 f_1(y_1)\, dy_1 = E(Y_1)$

Textbook Example 5.16

•  Let Y1 and Y2 have joint density
   $f(y_1, y_2) = \begin{cases} 2y_1, & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Find E(Y1).
•  Solution:

Textbook Example 5.17

•  For Y1 and Y2 with joint density as described in Example 5.15, in Figure 5.6 it appears that the mean of Y2 is 0.5. Find E(Y2) and evaluate whether the visual estimate is correct or not.

•  Solution:


Textbook Example 5.18

•  Let Y1 and Y2 have joint density
   $f(y_1, y_2) = \begin{cases} 2y_1, & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Find V(Y1).
•  Solution:

Textbook Example 5.19

•  An industrial chemical process yields a product with two types of impurities.
   –  In a given sample, let Y1 denote the proportion of impurities in the sample and let Y2 denote the proportion of "type I" impurities (out of all the impurities).
   –  The joint density of Y1 and Y2 is
      $f(y_1, y_2) = \begin{cases} 2(1 - y_1), & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Find the expected value of the proportion of type I impurities in the sample.

Textbook Example 5.19 Solution


Section 5.6: Special Theorems

•  The univariate expectation results from Sections 3.3 and 4.3 generalize directly to joint distributions of random variables
•  Theorem 5.6: Let c be a constant. Then
   $E(c) = c$
•  Theorem 5.7: Let g(Y1, Y2) be a function of the random variables Y1 and Y2 and let c be a constant. Then
   $E[cg(Y_1, Y_2)] = cE[g(Y_1, Y_2)]$

•  Theorem 5.8: Let Y1 and Y2 be random variables and g1(Y1, Y2), g2(Y1, Y2),…, gk(Y1, Y2) be functions of Y1 and Y2. Then
   $E[g_1(Y_1, Y_2) + g_2(Y_1, Y_2) + \cdots + g_k(Y_1, Y_2)] = E[g_1(Y_1, Y_2)] + E[g_2(Y_1, Y_2)] + \cdots + E[g_k(Y_1, Y_2)]$
•  We won't prove these…the proofs follow directly from the way we proved the results of Sections 3.3 and 4.3

Textbook Example 5.20

•  Referring back to Example 5.4, find the expected value of Y1 - Y2, where
   $f(y_1, y_2) = \begin{cases} 3y_1, & 0 \le y_2 \le y_1 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
•  Solution:

Textbook Example 5.20 Solution (cont’d)


A Useful Theorem for Calculating Expectations

•  Theorem 5.9: Let Y1 and Y2 be independent r.v.s and g(Y1) and h(Y2) be functions of only Y1 and Y2, respectively. Then
   $E[g(Y_1)h(Y_2)] = E[g(Y_1)]E[h(Y_2)]$
   provided the expectations exist.
•  Proof:

Proof of Theorem 5.9 Continued


Textbook Example 5.21

•  Referring back to Example 5.19, with the pdf below, Y1 and Y2 are independent.
   $f(y_1, y_2) = \begin{cases} 2(1 - y_1), & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Find E(Y1Y2).
•  Solution:

Textbook Example 5.21 Solution (cont’d)


Section 5.5 and 5.6 Homework

•  Do problems 5.72, 5.74, 5.77


Section 5.7: The Covariance of Two Random Variables

•  When looking at two random variables, a common question is whether they are associated with one another
   –  That is, for example, as one tends to increase, does the other do so as well?
   –  If so, the colloquial terminology is to say that the two variables are correlated
•  Correlation is also a technical term used to describe the association
   –  But it has a precise definition (that is more restrictive than the colloquial use of the term)
   –  And there is a second measure from which correlation is derived: covariance

Covariance and Correlation

•  Covariance and correlation are measures of dependence between two random variables

•  The figure below shows two extremes: perfect dependence and independence


Defining Covariance

•  Definition 5.10: If Y1 and Y2 are r.v.s with means $\mu_1$ and $\mu_2$, respectively, then the covariance of Y1 and Y2 is defined as
   $\mathrm{Cov}(Y_1, Y_2) = E[(Y_1 - \mu_1)(Y_2 - \mu_2)]$
•  Idea: [scatterplot of positively associated data, with quadrants drawn at the means] For observations above both means and for observations below both means, $(Y_1 - \mu_1)(Y_2 - \mu_2) > 0$, so the expected value is positive. And note that if the observations had fallen around a line with a negative slope, then the expected value would be negative.

An Example of Low Covariance

•  [scatterplot of weakly associated data, with quadrants drawn at the means] For observations in the two quadrants where both deviations have the same sign, $(Y_1 - \mu_1)(Y_2 - \mu_2) > 0$; for observations in the other two quadrants, $(Y_1 - \mu_1)(Y_2 - \mu_2) < 0$. So the positives and negatives balance out and the expected value is very small to zero.

Understanding Covariance

•  Covariance gives the strength and direction of the linear relationship between Y1 and Y2:
   –  Strong positive: large positive covariance
   –  Strong negative: large negative covariance
   –  Weak positive: small positive covariance
   –  Weak negative: small negative covariance
•  But what is "large" and "small"?
   –  It depends on the measurement units of Y1 and Y2

•  Example: parents' heights plotted twice – the same picture, different scale, different covariance
   –  Height in inches: Cov = 1.46
   –  Height in feet: Cov = 0.0101

Big Covariance?

Correlation of Random Variables

•  The correlation (coefficient) of two random variables is defined as
   $\rho = \dfrac{\mathrm{Cov}(Y_1, Y_2)}{\sigma_1 \sigma_2}$
•  As with covariance, correlation is a measure of the dependence of two r.v.s Y1 and Y2
   –  But note that it's re-scaled to be unit-free and measurement invariant
   –  In particular, for any random variables Y1 and Y2, $-1 \le \rho \le +1$

Correlation is Unitless

•  [The same parents' heights scatterplots as before, in inches and in feet]
   –  Height in inches: $\rho = 0.339$
   –  Height in feet: $\rho = 0.339$

Interpreting the Correlation Coefficient

•  A positive correlation coefficient ($\rho > 0$) means there is a positive association between Y1 and Y2
   –  As Y1 increases, Y2 also tends to increase
•  A negative correlation coefficient ($\rho < 0$) means there is a negative association between Y1 and Y2
   –  As Y1 increases, Y2 tends to decrease
•  A correlation coefficient equal to 0 ($\rho = 0$) means there is no linear association between Y1 and Y2

Interactive Correlation Demo


From R (with the shiny package installed), run: shiny::runGitHub('CorrelationDemo','rdfricker')

An Alternative Covariance Formula

•  Theorem 5.10: If Y1 and Y2 are r.v.s with means $\mu_1$ and $\mu_2$, respectively, then
   $\mathrm{Cov}(Y_1, Y_2) = E[(Y_1 - \mu_1)(Y_2 - \mu_2)] = E(Y_1Y_2) - E(Y_1)E(Y_2)$
•  Proof:

Textbook Example 5.22

•  Referring back to Example 5.4, find the covariance between Y1 and Y2, where
   $f(y_1, y_2) = \begin{cases} 3y_1, & 0 \le y_2 \le y_1 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
•  Solution:

Textbook Example 5.22 Solution (cont’d)


Textbook Example 5.23

•  Let Y1 and Y2 have the following joint pdf
   $f(y_1, y_2) = \begin{cases} 2y_1, & 0 \le y_1 \le 1, \; 0 \le y_2 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   Find the covariance of Y1 and Y2.
•  Solution:

Correlation and Independence

•  Theorem 5.11: If Y1 and Y2 are independent r.v.s, then $\mathrm{Cov}(Y_1, Y_2) = 0$. That is, independent r.v.s are uncorrelated.
•  Proof:

Textbook Example 5.24

•  Let Y1 and Y2 be discrete r.v.s with joint pmf as shown in Table 5.3. Show that Y1 and Y2 are dependent but have zero covariance.

•  Solution:


Textbook Example 5.24 Solution (cont’d)


Important Take-Aways

•  High correlation merely means that there is a strong linear relationship
   –  Not that Y1 causes Y2 or Y2 causes Y1
•  E.g., there could be a third factor that causes both Y1 and Y2 to move in the same direction
   –  No causal connection between Y1 and Y2
   –  But high correlation
•  Remember:
   ✓ High correlation does not imply causation
   ✓ Zero correlation does not mean there is no relationship between Y1 and Y2

Section 5.7 Homework

•  Do problems 5.89, 5.91, 5.92, 5.95


The Expected Value and Variance of Linear Functions of R.V.s (Section 5.8)

•  In OA3102 you will frequently encounter "estimators" that are linear combinations of random variables
•  For example,
   $U_1 = a_1Y_1 + a_2Y_2 + \cdots + a_nY_n = \sum_{i=1}^{n} a_iY_i$
   where the Yi are r.v.s and the ai are constants
•  Thus, for example, we might want to know the expected value of U1, or the variance of U1, or perhaps the covariance between U1 and U2

Expected Value, Variance, and Covariance of Linear Combinations of R.V.s

•  Theorem 5.12: Let Y1, Y2, …, Yn and X1, X2, …, Xm be r.v.s with $E(Y_i) = \mu_i$ and $E(X_j) = \xi_j$. Define
   $U_1 = \sum_{i=1}^{n} a_iY_i$  and  $U_2 = \sum_{j=1}^{m} b_jX_j$
   for constants a1,…, an and b1,…, bm. Then:
   a.  $E(U_1) = \sum_{i=1}^{n} a_i\mu_i$ and similarly $E(U_2) = \sum_{j=1}^{m} b_j\xi_j$
   b.  $V(U_1) = \sum_{i=1}^{n} a_i^2 V(Y_i) + 2\sum\sum_{1 \le i < j \le n} a_i a_j \mathrm{Cov}(Y_i, Y_j)$, with a similar expression for $V(U_2)$

Expected Value, Variance, and Covariance of Linear Combinations of R.V.s (cont’d)

   c.  And, $\mathrm{Cov}(U_1, U_2) = \sum_{i=1}^{n} \sum_{j=1}^{m} a_i b_j \mathrm{Cov}(Y_i, X_j)$

•  How will this be relevant?
   –  In OA3102 you will need to estimate the population mean $\mu$ using data
   –  You will use the sample average as an "estimator," which is defined as
      $\bar{Y} = \dfrac{1}{n}\sum_{i=1}^{n} Y_i$
   –  This is a linear combination of r.v.s with constants $a_i = 1/n$ for all i

Textbook Example 5.25

•  Let Y1, Y2, and Y3 be r.v.s, where:
   –  $E(Y_1) = 1, \; E(Y_2) = 2, \; E(Y_3) = -1$
   –  $V(Y_1) = 1, \; V(Y_2) = 3, \; V(Y_3) = 5$
   –  $\mathrm{Cov}(Y_1, Y_2) = -0.4, \; \mathrm{Cov}(Y_1, Y_3) = 0.5, \; \mathrm{Cov}(Y_2, Y_3) = 2$
   Find the expected value and variance of U = Y1 - 2Y2 + Y3. If W = 3Y1 + Y2, find Cov(U, W).
•  Solution:
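A hedged R sketch (not from the original slides) that applies Theorem 5.12 to the numbers given above; the variable names are just for illustration:

   mu    <- c(1, 2, -1)                        # E(Y1), E(Y2), E(Y3)
   Sigma <- matrix(c( 1.0, -0.4, 0.5,
                     -0.4,  3.0, 2.0,
                      0.5,  2.0, 5.0),
                   nrow = 3, byrow = TRUE)     # variance-covariance matrix
   a <- c(1, -2, 1)                            # coefficients of U = Y1 - 2*Y2 + Y3
   b <- c(3,  1, 0)                            # coefficients of W = 3*Y1 + Y2

   sum(a * mu)                                 # E(U), part (a) of Theorem 5.12
   drop(t(a) %*% Sigma %*% a)                  # V(U), part (b)
   drop(t(a) %*% Sigma %*% b)                  # Cov(U, W), part (c)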

Textbook Example 5.25 Solution (cont’d)


Textbook Example 5.25 Solution (cont’d)


Proof of Theorem 5.12

•  Now, let’s prove Theorem 5.12, including parts a and b…


Proof of Theorem 5.12 Continued


Proof of Theorem 5.12 Continued


Textbook Example 5.26

•  Referring back to Examples 5.4 and 5.20, for the joint distribution
   $f(y_1, y_2) = \begin{cases} 3y_1, & 0 \le y_2 \le y_1 \le 1 \\ 0, & \text{elsewhere} \end{cases}$
   find the variance of Y1 - Y2.
•  Solution:

Textbook Example 5.26 Solution (cont’d)


Textbook Example 5.26 Solution (cont’d)


Textbook Example 5.27

•  Let Y1, Y2, …, Yn be independent random variables with $E(Y_i) = \mu$ and $V(Y_i) = \sigma^2$. Show that $E(\bar{Y}) = \mu$ and $V(\bar{Y}) = \sigma^2/n$ for
   $\bar{Y} = \dfrac{1}{n}\sum_{i=1}^{n} Y_i$
•  Solution:

Textbook Example 5.27 Solution (cont’d)


Textbook Example 5.28

•  The number of defectives Y in a sample of n = 10 items sampled from a manufacturing process follows a binomial distribution. An estimator for the fraction defective in the lot is the random variable $\hat{p} = Y/n$. Find the expected value and variance of $\hat{p}$.
•  Solution:

Textbook Example 5.28 Solution (cont’d)


Textbook Example 5.29

•  Suppose that an urn contains r red balls and (N - r) black balls.
   –  A random sample of n balls is drawn without replacement and Y, the number of red balls in the sample, is observed.
   –  From Chapter 3 we know that Y has a hypergeometric distribution.
   –  Find E(Y) and V(Y).
•  Solution:

Textbook Example 5.29 Solution (cont’d)


Textbook Example 5.29 Solution (cont’d)


Section 5.8 Homework

•  Do problems 5.102, 5.103, 5.106, 5.113


Section 5.9: The Multinomial Distribution

•  In a binomial experiment there are n trials, each with a binary outcome, where the success probability is the same on every trial
•  But there are many situations in which the number of possible outcomes per trial is greater than two
   –  When testing a missile, the outcomes may be "miss," "partial kill," and "complete kill"
   –  When randomly sampling students at NPS, their affiliation may be Army, Navy, Marine Corps, US civilian, non-US military, non-US civilian, or other

Defining a Multinomial Experiment

•  Definition 5.11: A multinomial experiment has the following properties:
   1.  The experiment consists of n identical trials.
   2.  The outcome of each trial falls into one of k classes or cells.
   3.  The probability that the outcome of a single trial falls into cell i, pi, i = 1,…, k, remains the same from trial to trial, and $p_1 + p_2 + \cdots + p_k = 1$.
   4.  The trials are independent.
   5.  The random variables of interest are Y1, Y2, …, Yk, the number of outcomes that fall in each of the cells, where $Y_1 + Y_2 + \cdots + Y_k = n$.

Defining the Multinomial Distribution

•  Definition 5.12: Assume that pi > 0, i = 1,…, k, are such that $p_1 + p_2 + \cdots + p_k = 1$. The random variables Y1, Y2, …, Yk have a multinomial distribution with parameters n and p1, p2,…, pk if their joint distribution is
   $p(y_1, y_2, \ldots, y_k) = \dfrac{n!}{y_1!y_2! \cdots y_k!}\, p_1^{y_1} p_2^{y_2} \cdots p_k^{y_k}$
   where, for each i, yi = 0, 1, 2, …, n and $\sum_{i=1}^{k} y_i = n$.
•  Note that the binomial is just a special case of the multinomial with k = 2

Textbook Example 5.30

•  According to recent Census figures, the table below gives the proportion of US adults that fall into each age category:

   Age (years)   Proportion
   18-24         0.18
   25-34         0.23
   35-44         0.16
   45-64         0.27
   65+           0.16

•  If 5 adults are randomly sampled from the population, what's the probability the sample contains one person between 18 and 24 years, two between 25 and 34, and two between 45 and 64 years?
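A hedged R sketch (not from the original slides) that computes this probability with dmultinom(), with category counts ordered as in the table (18-24, 25-34, 35-44, 45-64, 65+):

   dmultinom(x = c(1, 2, 0, 2, 0),
             prob = c(0.18, 0.23, 0.16, 0.27, 0.16))   # about 0.021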

Textbook Example 5.30 Solution

•  Solution:


Expected Value, Variance, and Covariance of the Multinomial Distribution

•  Theorem 5.13: If Y1, Y2, …, Yk have a multinomial distribution with parameters n and p1, p2,…, pk, then
   1.  $E(Y_i) = np_i$
   2.  $V(Y_i) = np_iq_i$, where, as always, $q_i = 1 - p_i$
   3.  $\mathrm{Cov}(Y_s, Y_t) = -np_sp_t$, if $s \ne t$
•  Proof:

Proof of Theorem 5.13 Continued


Application Example

•  Under certain conditions against a particular type of target, the Tomahawk Land Attack Missile – Conventional (TLAM-C) has a 50% chance of a complete target kill, a 30% chance of a partial kill, and a 20% chance of no kill/miss. In 10 independent shots, each at a different target, what's the probability:
   –  There are 8 complete kills, 2 partial kills, and 0 no kills/misses?
   –  There are at least 9 complete or partial kills?

Application Example Solution


Application Example Solution in R

•  The first question via brute force:

•  Now, using the dmultinom() function

•  For the second question, reduce it to a binomial where the probability of a complete or partial kill is 0.8
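The R code itself did not survive extraction; here is a minimal sketch of what the three steps could look like (the exact code on the original slide may differ):

   # First question by brute force: the multinomial pmf written out directly
   factorial(10) / (factorial(8) * factorial(2) * factorial(0)) *
     0.5^8 * 0.3^2 * 0.2^0

   # The same probability using dmultinom()
   dmultinom(x = c(8, 2, 0), prob = c(0.5, 0.3, 0.2))

   # Second question: at least 9 complete or partial kills, treating
   # "complete or partial kill" as a single success with probability 0.8
   sum(dbinom(9:10, size = 10, prob = 0.8))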


Section 5.9 Homework

•  Do problems 5.122, 5.125


Section 5.10: The Bivariate Normal Distribution

•  The bivariate normal distribution is an important one, both in terms of statistics (you'll need it in OA3103) as well as probability modeling
•  The distribution has 5 parameters:
   –  $\mu_1$, the mean of variable Y1
   –  $\mu_2$, the mean of variable Y2
   –  $\sigma_1^2$, the variance of variable Y1
   –  $\sigma_2^2$, the variance of variable Y2
   –  $\rho$, the correlation between Y1 and Y2

The Bivariate Normal PDF

•  The bivariate normal pdf is a bit complicated
•  It's
   $f(y_1, y_2) = \dfrac{e^{-Q/2}}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}}$
   where
   $Q = \dfrac{1}{1 - \rho^2}\left[\dfrac{(y_1 - \mu_1)^2}{\sigma_1^2} - \dfrac{2\rho(y_1 - \mu_1)(y_2 - \mu_2)}{\sigma_1\sigma_2} + \dfrac{(y_2 - \mu_2)^2}{\sigma_2^2}\right]$
   for $-\infty < y_1 < \infty, \; -\infty < y_2 < \infty$
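A hedged R sketch (not from the original slides) that evaluates this pdf directly from the formula above; the function name and default parameter values are just illustrative:

   dbvnorm <- function(y1, y2, mu1 = 0, mu2 = 0, s1 = 1, s2 = 1, rho = 0.5) {
     Q <- ((y1 - mu1)^2 / s1^2 -
           2 * rho * (y1 - mu1) * (y2 - mu2) / (s1 * s2) +
           (y2 - mu2)^2 / s2^2) / (1 - rho^2)
     exp(-Q / 2) / (2 * pi * s1 * s2 * sqrt(1 - rho^2))
   }
   dbvnorm(0, 0)   # density at the means; with rho = 0.5 this is 1/(2*pi*sqrt(0.75))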

Interactive Bivariate Normal Distribution Demo


From R (with the shiny package installed), run: shiny::runGitHub('BivariateNormDemo','rdfricker')

Some Facts About the Bivariate Normal

•  If Y1 and Y2 are jointly distributed according to a bivariate normal distribution, then
   –  Y1 has a (univariate) normal distribution with mean $\mu_1$ and variance $\sigma_1^2$
   –  Y2 has a (univariate) normal distribution with mean $\mu_2$ and variance $\sigma_2^2$
   –  $\mathrm{Cov}(Y_1, Y_2) = \rho\sigma_1\sigma_2$, and if $\mathrm{Cov}(Y_1, Y_2) = 0$ (equivalently $\rho = 0$) then Y1 and Y2 are independent
•  Note that this result is not true in general: zero correlation between any two random variables does not imply independence
•  But it does for two jointly (bivariate) normally distributed random variables!

Illustrative Bivariate Normal Application


From R (with the shiny package installed), run: shiny::runGitHub('BivariateNormExample','rdfricker')

The Multivariate Normal Distribution

•  The multivariate normal distribution is a joint distribution for k > 1 random variables
•  The pdf for a multivariate normal is
   $f(\mathbf{y}) = \dfrac{1}{(2\pi)^{k/2}|\Sigma|^{1/2}} \exp\left\{-\dfrac{1}{2}(\mathbf{y} - \boldsymbol{\mu})'\Sigma^{-1}(\mathbf{y} - \boldsymbol{\mu})\right\}$
   where
   –  $\mathbf{y}$ and $\boldsymbol{\mu}$ are k-dimensional vectors for the location at which to evaluate the pdf and the means, respectively
   –  $\Sigma$ is the variance-covariance matrix
   –  For the bivariate normal: $\mathbf{y} = \{y_1, y_2\}$, $\boldsymbol{\mu} = \{\mu_1, \mu_2\}$, and
      $\Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}$

Section 5.11: Conditional Expectations

•  In Section 5.3 we learned about conditional pmfs and pdfs
   –  The idea: A conditional distribution is the distribution of one random variable given information about another jointly distributed random variable
•  Conditional expectations are similar
   –  It's the expectation of one random variable given information about another jointly distributed random variable
   –  They're defined just like univariate expectations except that the conditional pmfs and pdfs are used

Defining the Conditional Expectation

•  Definition 5.13: If Y1 and Y2 are any random variables, the conditional expectation of g(Y1) given that Y2 = y2 is defined as
   $E(g(Y_1)|Y_2 = y_2) = \int_{-\infty}^{\infty} g(y_1)f(y_1|y_2)\, dy_1$
   if Y1 and Y2 are jointly continuous, and
   $E(g(Y_1)|Y_2 = y_2) = \sum_{\text{all } y_1} g(y_1)p(y_1|y_2)$
   if Y1 and Y2 are jointly discrete.

Textbook Example 5.31

•  For r.v.s Y1 and Y2 from Example 5.8, with joint pdf
   $f(y_1, y_2) = \begin{cases} 1/2, & 0 \le y_1 \le y_2 \le 2 \\ 0, & \text{elsewhere} \end{cases}$
   find the conditional expectation of Y1 given Y2 = 1.5.
•  Solution:

Textbook Example 5.31 Solution (cont’d)


A Bit About Conditional Expectations

•  Note that the conditional expectation of Y1 given Y2 = y2 is a function of y2
   –  For example, in Example 5.31 we obtained $E(Y_1|Y_2 = y_2) = y_2/2$
•  It then follows that $E(Y_1|Y_2) = Y_2/2$
   –  That is, the conditional expectation is a function of a random variable and so it's a random variable itself
   –  Thus, it has a mean and a variance as well

The Expected Value of a Conditional Expectation

•  Theorem 5.14: Let Y1 and Y2 denote random variables. Then
   $E[E(Y_1|Y_2)] = E(Y_1)$
   where the inside expectation is with respect to the conditional distribution and the outside expectation is with respect to the pdf of Y2.
•  Proof:

Proof of Theorem 5.14 Continued

•  Note: This theorem is often referred to as the Law of Iterated Expectations
   –  Then it's also frequently written in the other direction:
      $E(Y_1) = E[E(Y_1|Y_2)]$

Textbook Example 5.32

•  A QC plan requires sampling n = 10 items and counting the number of defectives, Y.
   –  Assume Y has a binomial distribution with p, the probability of observing a defective.
   –  But p varies from day to day according to a U(0, 0.25) distribution.
   –  Find the expected value of Y.
•  Solution:
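A hedged R sketch (not from the original slides): a Monte Carlo check of the Law of Iterated Expectations for this setup, simulating p ~ Uniform(0, 0.25) and then Y | p ~ Binomial(10, p):

   set.seed(1)
   p <- runif(1e6, min = 0, max = 0.25)
   y <- rbinom(1e6, size = 10, prob = p)
   mean(y)   # close to E(Y) = 10 * E(p) = 10 * 0.125 = 1.25
   var(y)    # anticipates Example 5.33: V(Y) = E[V(Y|p)] + V[E(Y|p)]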

Textbook Example 5.32 Solution (cont’d)


The Conditional Variance

•  The conditional variance of Y1 given Y2 = y2 is defined by analogy with the ordinary variance:
   $V(Y_1|Y_2 = y_2) = E(Y_1^2|Y_2 = y_2) - [E(Y_1|Y_2 = y_2)]^2$
•  Note that the calculations are basically the same, except the expectations on the right-hand side of the equality are taken with respect to the conditional distribution
•  Just like the conditional expectation, the conditional variance is a function of y2

Another Way to Calculate the Variance

•  Theorem 5.15: Let Y1 and Y2 denote random variables. Then
   $V(Y_1) = E[V(Y_1|Y_2)] + V[E(Y_1|Y_2)]$
•  Proof:

Proof of Theorem 5.15 Continued


Textbook Example 5.33

•  From Example 5.32, find the variance of Y.•  Solution:


Textbook Example 5.33 Solution (cont’d)


Note How Conditioning Can Help

•  Examples 5.32 & 5.33 could have been solved by first finding the unconditional distribution of Y and then calculating E(Y) and V(Y)
•  But sometimes that can get pretty complicated, requiring:
   –  First finding the joint distribution of Y and the parameter (p in the previous examples)
   –  Then solving for the marginal distribution of Y
   –  And then, finally, calculating E(Y) and V(Y) in the usual way

Note How Conditioning Can Help

•  However, sometimes, like here, it can be easier to work with conditional distributions and the results of Theorems 5.14 and 5.15
•  In the examples:
   –  The distribution conditioned on a fixed value of the parameter was clear
   –  Then putting a distribution on the parameter itself accounted for the day-to-day differences
•  So E(Y) and V(Y) can be found without having to first solve for the marginal distribution of Y

Section 5.11 Homework

•  Do problems 5.133, 5.139, 5.140


Section 5.12: Summary

•  Whew – this was an intense chapter!
   –  But the material is important and will come up in future classes
•  For example:
   –  Linear functions of random variables: OA3102, OA3103
   –  Covariance, correlation, and the bivariate normal distribution: OA3103, OA3602
   –  Conditional expectation and variance, Law of Iterated Expectations, etc.: OA3301, OA4301

What We Have Just Learned

•  Bivariate and multivariate probability distributions
   –  Bivariate normal distribution
   –  Multinomial distribution
•  Marginal and conditional distributions
•  Independence, covariance, and correlation
•  Expected value & variance of a function of r.v.s
   –  Linear functions of r.v.s in particular
•  Conditional expectation and variance