
Probability & Statistics Lecture 8


Page 1: Probability & Statistics Lecture 8

Page 2: Probability & Statistics Lecture 8

1. Reminder: Probability Density Function (pdf); Cumulative Distribution Function (cdf).

2. Relation between the Binomial, Poisson and Normal distributions

3. Measures of Central Tendency: Expectation, Variance, higher moments

4. Work in groups, Mathematica

Page 3: Probability & Statistics Lecture 8

Topic 1. Probability distribution function, also called "cumulative distribution function" (CDF). <This part is carried over from Lecture 7; equation numbers start with 7.>

1. Continuous random variable

From the "outside", random distributions are well described by the probability distribution function (we will use CDF for short) F(x), defined as

F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\, dy \qquad (7.13)

This formula can also be rewritten in the following very useful form:

P(a < X \le b) = F(b) - F(a) \qquad (7.14)

Page 4: Probability & Statistics Lecture 8

To see what the distribution functions look like, we return to our examples.

a. The uniform distribution (7.15):

f(x) = \frac{1}{b-a} for a \le x \le b, and f(x) = 0 otherwise.

Using the definition (7.16) and Mathematica, try to find F(x) for the uniform distribution. Prove that F(x) = 0 for x \le a; F(x) = (x-a)/(b-a) for a \le x \le b; F(x) = 1 for x > b.

Draw the CDF for several a and b. Consider an important special case a=0, b=1. How is it related to the spinner problem? To the balanced die?
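One way to check the result is Mathematica's built-in distribution objects (a sketch for self-checking, not part of the lecture notebooks):

CDF[UniformDistribution[{a, b}], x]                         (* returns the piecewise F(x) stated above *)
Plot[CDF[UniformDistribution[{0, 1}], x], {x, -0.5, 1.5}]   (* the special case a = 0, b = 1 *)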

b. The exponential distribution (7.17):

f(x) = \lambda e^{-\lambda x} for x \ge 0, and f(x) = 0 otherwise.

Before solving the problems involving integration, please read the following file: Self-Test and About the integrals.
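The analogous built-in check for the exponential CDF (a sketch; lambda is left symbolic, and lambda = 1 in the plot is an arbitrary choice):

CDF[ExponentialDistribution[lambda], x]               (* 1 - e^(-lambda x) for x >= 0, and 0 otherwise *)
Plot[CDF[ExponentialDistribution[1], x], {x, 0, 5}]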

Page 5: Probability & Statistics Lecture 8

c. Gamma and chi-square distributions

We already introduced the 2-parameter Gamma distribution when discussing maximum likelihood. Its pdf is

f(x) = \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^{\alpha} \Gamma(\alpha)} for x > 0, and f(x) = 0 otherwise. (7.18)

\alpha > 0 and \beta > 0 are the parameters. At \alpha = 1, f(x) turns into the exponential distribution (7.17).

Please plot f(x) (7.18) and the corresponding CDF for \alpha = 1, 2, 4 and \beta = 1.

A specific case of the Gamma distribution with \beta = 2 and \alpha = \nu/2 describes the so-called chi-square distribution with \nu degrees of freedom. This distribution is very important in many statistical applications, e.g. for hypothesis testing.
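A sketch of the requested plots using the built-in GammaDistribution, whose {shape, scale} parameters match \alpha, \beta in (7.18):

Plot[Evaluate[Table[PDF[GammaDistribution[a, 1], x], {a, {1, 2, 4}}]], {x, 0, 10}]  (* pdfs for alpha = 1, 2, 4 and beta = 1 *)
Plot[Evaluate[Table[CDF[GammaDistribution[a, 1], x], {a, {1, 2, 4}}]], {x, 0, 10}]  (* the corresponding CDFs *)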

Page 6: Probability & Statistics Lecture 8

d. The standard and general normal distributions

f_snd(x) = (2\pi)^{-1/2} \exp(-x^2/2) \qquad (7.19)

f_nd(x) = (2\pi\sigma^2)^{-1/2} \exp[-(x-\mu)^2/(2\sigma^2)] \qquad (7.20)

Using Mathematica and Eq. (7.12), find F(x) for the standard normal distribution (SND) and the general normal distribution (ND). Use the NIntegrate[f[t], {t, -Infinity, x}] and Plot[…] functions.
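A minimal sketch of that computation for the SND pdf (7.19); the plot range is an arbitrary choice:

fsnd[t_] := Exp[-t^2/2]/Sqrt[2 Pi]                           (* Eq. (7.19) *)
Fsnd[x_?NumericQ] := NIntegrate[fsnd[t], {t, -Infinity, x}]  (* F(x) by numerical integration *)
Plot[Fsnd[x], {x, -4, 4}]                                    (* the CDF of the standard normal *)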

2. CDF for discrete random variables

For discrete variables the integration is replaced by summation:

F(x) = P(X \le x) = \sum_{x_i \le x} p(x_i) \qquad (7.21)

It is clear from this formula that if X takes only a finite number of values, the distribution function looks like a stairway.


Page 7: Probability & Statistics Lecture 8

[Figure: the stairway-shaped CDF F(x) of a discrete random variable taking four values x_1 < x_2 < x_3 < x_4; at each x_i the graph jumps by p(x_i), and F(x) reaches 1 after the last jump.]

Page 8: Probability & Statistics Lecture 8

Topic 2. Relation between the binomial and normal distributions

If n is large and if neither p nor q = 1 - p is too close to zero, the binomial distribution can be closely approximated by the normal distribution with \mu = np and \sigma = (npq)^{1/2}:

p(X, p, n) \approx \frac{1}{\sqrt{2\pi npq}}\, \exp\left[-\frac{(X - np)^2}{2npq}\right] \qquad (8.2)

Here p is the probability function for the binomial distribution and X is the random variable (the number of successes)
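A quick numerical check of (8.2); n = 100, p = 0.3 and the point X = 30 are arbitrary choices:

n = 100; p = 0.3; q = 1 - p;
binom = PDF[BinomialDistribution[n, p], 30];             (* exact binomial probability *)
normal = Exp[-(30 - n p)^2/(2 n p q)]/Sqrt[2 Pi n p q];  (* the approximation (8.2) *)
{binom, normal} // N                                     (* both come out close to 0.087 *)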

Page 9: Probability & Statistics Lecture 8

Let’s now open Lect8/Mathematica/Lect8_BinPoisNormal.nb and run some examples.

Topic 3: Measures of Central Tendency

When a large collection of data is considered, as in a census or in a massive blood test, we are usually interested not in individual numbers but rather in certain descriptive numbers such as the expected value and the variance.

Page 10: Probability & Statistics Lecture 8

Definition of mathematical expectation

(a) Discrete random variables

A very important concept in probability and statistics is the mathematical expectation, or expected value, or briefly the expectation of a random variable. For a discrete random variable X having the possible values x_1, x_2, …, x_n,

E(X) = x_1 p(x_1) + x_2 p(x_2) + \dots + x_n p(x_n) = \sum_{j=1}^{n} x_j p(x_j) \qquad (8.3)

A special case of (8.3) is the uniform distribution, where the probabilities are equal:

E(X) = \frac{x_1 + x_2 + \dots + x_n}{n} \qquad (8.4)

(8.3) can also be rewritten as

E(X) = \sum_{x} x\, p(x) \qquad (8.3')

where the summation in the rhs is taken over all the points of the sample space.

Page 11: Probability & Statistics Lecture 8

Example

What is the expectation for the balanced die?

Roll one die and let X be the number that appears: P(X = x) = 1/6 for x = 1, 2, 3, 4, 5, 6.

E(X) = (1/6)(1 + 2 + … + 6) = (1/6)·[6(6+1)/2] = 3.5
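The same result from a built-in distribution (a one-line check):

Mean[DiscreteUniformDistribution[{1, 6}]]   (* 7/2 = 3.5 *)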

Page 12: Probability & Statistics Lecture 8

Expectation for a function of a random variable

Functions of random variables are very important in various applications; we will only briefly touch on this concept here. A simple example is the kinetic energy mv^2/2, which is a quadratic function of the random variable v, the velocity of a particle.

Theorem. Let X be a discrete random variable with probability function p(x), and let g(X) be a real-valued function of X. Then the expected value of g(X) is given by

E[g(X)] = \sum_{j=1}^{n} g(x_j)\, p(x_j) \qquad (8.3'')
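For instance, for the balanced die and g(X) = X^2, a one-line application of (8.3''):

Sum[x^2 (1/6), {x, 1, 6}]   (* E[X^2] = 91/6, which differs from (E X)^2 = 49/4 *)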

Page 13: Probability & Statistics Lecture 8

b. Continuous random variable

For a continuous random variable X having density function f(x), the expectation of X is defined as

E(X) = \int_{-\infty}^{\infty} x f(x)\, dx \qquad (8.5)

The expectation of X is often called the mean of X and is denoted by \mu_X, or simply \mu. If g(X) is a function of the random variable X, then the expected value can be found as

E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\, dx \qquad (8.6)

Page 14: Probability & Statistics Lecture 8

Some Examples (check the results at home)

1. Suppose that a game is played with a single fair die. The player wins $20 if X = 2 (the face 2 is up), $40 if X = 4, and loses $30 if X = 6; he neither wins nor loses if any other face turns up.

Find the expected sum of money to be won.

2. The density function of a random variable X is given by f(x) = x/2 for 0 < x < 2, and 0 otherwise. Find the expected value of X.

3. The same with f (x) = 4x3 for 0<x<1, 0 otherwise.

Page 15: Probability & Statistics Lecture 8

Some Examples (answers)

1. Suppose that a game is played with a single fair die. The player wins $20 if X = 2 (the face 2 is up), $40 if X = 4, and loses $30 if X = 6; he neither wins nor loses if any other face turns up.

Find the expected sum of money to be won.

Solution:

X ($):   20     40     -30     0
p:       1/6    1/6    1/6     1/2

E(X) = \sum_{j} x_j\, p(x_j) = 20(1/6) + 40(1/6) - 30(1/6) + 0(1/2) = 5

Page 16: Probability & Statistics Lecture 8

Example 2. The density function of a random variable X is given by f(x) = x/2 for 0 < x < 2, and 0 otherwise. Find the expected value of X.

Solution

Let's use Eq. (8.5), E(X) = \int_{-\infty}^{\infty} x f(x)\, dx, and Mathematica.

The answer: 4/3
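The corresponding one-liners (the second covers Example 3 as well):

Integrate[x (x/2), {x, 0, 2}]     (* Example 2: 4/3 *)
Integrate[x (4 x^3), {x, 0, 1}]   (* Example 3: 4/5 *)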

Page 17: Probability & Statistics Lecture 8

The interpretation of the expected value

In statistics, one is frequently concerned with average values.

Average = (Sum of entries)/(Number of entries)

For any finite experiment, where the number of entries is small or moderate, the average of the outcomes is not predictable. However, we will eventually prove that the average will usually be close to E(X) if we repeat the experiment a large number of times.

Let us illustrate these observations using the Lect8_AverageVsExpectation.nb
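A sketch of the kind of experiment that notebook illustrates (the notebook itself is not reproduced here): roll a balanced die many times and watch the running average approach E(X) = 3.5.

rolls = RandomVariate[DiscreteUniformDistribution[{1, 6}], 10^4];
N[Mean[rolls]]                                  (* typically close to 3.5 *)
ListLinePlot[N[Accumulate[rolls]/Range[10^4]]]  (* running average settling near 3.5 *)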

Page 18: Probability & Statistics Lecture 8

Expectation of a function of a random variable (see for details G&S, 6.1, pp. 229-230; note that they use the notation m(x) instead of p(x) for the probability function)

This discussion can be considered a “proof” of the equation (8.3’’).

Suppose that f(X) is a real-valued function of the discrete random variable X. How do we determine its expected value?

Let's consider Example 6.6 of G&S, where X describes the outcomes of tossing three coins, and Y, a function of X, is the number of "runs". A sequence of identical consecutive outcomes (such as HH, HHH, or TT) is counted as one run (see 6.6 for details).

The three-coin experiment can be described in terms of X and Y by the following table:

Page 19: Probability & Statistics Lecture 8

X      Y
HHH    1
HHT    2
HTH    3
HTT    2
THH    2
THT    3
TTH    2
TTT    1

Notice that this table indeed describes Y as a function of X, by mapping each possible outcome x to one of the numbers 1, 2 or 3. The fact that the outcomes x are "three-letter words" should not be confusing; they can equally well be described as numbers. Using, for instance, the convention H -> 1 and T -> 2, we get HHH -> 111, THT -> 212, etc.

To define the expected value of Y we can build a probability function p(Y). It can be done with the help of the same table, by grouping the values of X corresponding to a common Y-value and adding their probabilities: p(Y=1) = 2/8 = 1/4; p(2) = 1/2; p(3) = 1/4.

Now we can easily find the expected value of Y: E(Y) = 1·(1/4) + 2·(1/2) + 3·(1/4) = 2.

Page 20: Probability & Statistics Lecture 8

(The same X-Y table as on the previous page.)

But what if we did not group the values of X with a common Y-value, but instead simply multiplied each value of Y(X) by the probability of the corresponding X? In other words, we multiply each value in the right column by the probability of the corresponding outcome in the left column (= 1/8):

1·(1/8) + 2·(1/8) + 3·(1/8) + 2·(1/8) + 2·(1/8) + 3·(1/8) + 2·(1/8) + 1·(1/8) = 16/8 = 2.

This illustrates the following general rule:

If X and Y are two random variables, and Y can be written as a function of X, then one can compute the expected value of Y using the distribution function of X.

In other words,

E(f(X)) = \sum_{x} f(x)\, p(x) \qquad (8.6)
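Both ways of computing E(Y) in a few lines (Split groups identical consecutive symbols, so Length[Split[...]] counts runs):

outcomes = Tuples[{"H", "T"}, 3];       (* the 8 equally likely outcomes *)
y = Length[Split[#]] & /@ outcomes;     (* number of runs for each outcome *)
Total[y]/8                              (* ungrouped sum over outcomes: 2 *)
Tally[y]                                (* grouped: {{1, 2}, {2, 4}, {3, 2}}, i.e. p(Y) up to the factor 1/8 *)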

Page 21: Probability & Statistics Lecture 8

The sum of two random variables and its expected value

Sometimes the sum of random variables is easy to define. For instance, the definition is straightforward for dice or coins: one simply adds up the corresponding numbers (0 or 1 for the coins, 1 through 6 for the dice).

But what if X = coin and Y = die? What does X + Y mean in this case? In such a case it is reasonable to define the "joint" random variable Z = {X, Y} whose outcomes are the ordered pairs {x, y}, where x ∈ {0, 1} and y ∈ {1, 2, 3, 4, 5, 6}.

Each outcome, such as {0, 5} or {1, 4}, now has probability p = 1/12, and all the characteristics of the distribution can easily be obtained (see Example 6.7 from G&S, p. 230):

E(Z) = (1/12)({0,1} + {0,2} + … + {0,6}) + (1/12)({1,1} + {1,2} + … + {1,6}) = (1/12){0,21} + (1/12){6,21} = {1/2, 3.5},

i.e. componentwise E(Z) = {E(X), E(Y)}, so E(X+Y) = 1/2 + 3.5 = 4.
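A quick enumeration check of that value (a sketch; all 12 outcomes are equally likely):

pairs = Tuples[{{0, 1}, Range[6]}];  (* the 12 equally likely {coin, die} outcomes *)
Mean[Total /@ pairs]                 (* E(X+Y) = 4 *)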

Page 22: Probability & Statistics Lecture 8

Theorem: Let X and Y be random variables with finite expected values. Then,

E(X+Y)= E(X) + E(Y) ; E(X-Y)= E(X) - E(Y) (8.7)

and if c is any constant, then

E(cX)=cE(X) (8.8)

The proof is quite simple and can be found in Grinstead & Snell, p. 231.

Let’s discuss (8.7).

We can consider the random variable X+Y as the result of applying the function f(x, y) = x + y to the joint random variable (X, Y). Then according to (8.6) we have:

E(X+Y) = \sum_i \sum_j (x_i + y_j)\, p(x_i, y_j)
       = \sum_i x_i \sum_j p(x_i, y_j) + \sum_j y_j \sum_i p(x_i, y_j)
       = \sum_i x_i\, p(x_i) + \sum_j y_j\, p(y_j)
       = E(X) + E(Y)

Here we used the conditions that

\sum_j p(x_i, y_j) = p(x_i) \quad and \quad \sum_i p(x_i, y_j) = p(y_j).

In plain language these conditions mean that the probability that X = x_i while Y takes any of its allowed values (i.e. regardless of the value of Y) is given by the probability function of X, and vice versa.

Page 23: Probability & Statistics Lecture 8

An Important Comment:

In the derivation of (8.7) it was not assumed that the summands are mutually independent.

The fact that the expectations add, whether or not the summands are mutually independent, is sometimes referred to as the

First Fundamental Mystery of Probability.
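A tiny illustration of the "mystery": take Y = X for a balanced die, maximally dependent summands, and the expectations still add.

xs = Range[6];
{Mean[xs + xs], 2 Mean[xs]}   (* both equal 7, even though Y = X is fully dependent on X *)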

Page 24: Probability & Statistics Lecture 8

b. The variance and standard deviation

While the expectation of a random variable describes its "center", the variance describes how strongly the distribution is localized near its center. In other words, the variance (or standard deviation) is a measure of the dispersion, or scatter, of the values of the random variable about the mean value. Let's start with an example of the normal distribution, for which Var(X) = \sigma^2.


Page 26: Probability & Statistics Lecture 8

Let us now define the variance and have some practice:

Var(X) = E[(X - \mu)^2] \qquad (8.6)

The positive square root of the variance is called the "standard deviation" and is given by

\sigma_X = \sqrt{Var(X)} = \sqrt{E[(X - \mu)^2]} \qquad (8.7)

Where no confusion can result, the index in the standard deviation is dropped.

Remember: Var = \sigma^2.

Page 27: Probability & Statistics Lecture 8

Discrete distribution:

\sigma_X^2 = \sum_{j=1}^{n} (x_j - \mu)^2\, p(x_j) \qquad (8.8)

Continuous distribution:

\sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx \qquad (8.9)
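For example, (8.8) for the balanced die, written out in Mathematica (note the population mean of squared deviations, not the built-in sample-variance function Variance):

xs = Range[6]; mu = Mean[xs];
Mean[(xs - mu)^2]    (* sigma^2 = 35/12 for one balanced die *)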

Page 28: Probability & Statistics Lecture 8

Some theorems on variance (their proofs can be found in G&S, 6.2)

Theorem 1. Var(X) = E[(X - \mu)^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - \mu^2.

Theorem 2. If c is any constant, Var(cX) = c^2 Var(X).

Theorem 3. If X and Y are independent random variables, then

Var(X+Y) = Var(X) + Var(Y), \qquad Var(X-Y) = Var(X) + Var(Y).

Let's prove Theorem 3.

Page 29: Probability & Statistics Lecture 8

Theorem 3. If X and Y are independent random variables, then

Var(X+Y) = Var(X) + Var(Y), \qquad Var(X-Y) = Var(X) + Var(Y).

Proof

We will use the fact that for independent X and Y, E(XY) = E(X)E(Y). It can be proven using the definition of independence: p(x, y) = p(x)p(y).

Let E(X) = a and E(Y) = b.

Var(X+Y) = (according to Theorem 1) E[(X+Y)^2] - (a + b)^2 = E(X^2) + E(2XY) + E(Y^2) - a^2 - 2ab - b^2 = (E(X^2) - a^2) + (E(Y^2) - b^2) = (using once again Theorem 1) Var(X) + Var(Y).

Var(X-Y) = E[(X-Y)^2] - (a - b)^2 = E(X^2) - E(2XY) + E(Y^2) - a^2 + 2ab - b^2 = (E(X^2) - a^2) + (E(Y^2) - b^2) = Var(X) + Var(Y).

As promised, we used that E(2XY) = 2E(X)E(Y) = 2ab.
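A numerical sanity check of Theorem 3 with two independent samples (sample variances, so the agreement is approximate; the distributions here are arbitrary choices):

x = RandomVariate[NormalDistribution[0, 1], 10^5];
y = RandomVariate[ExponentialDistribution[2], 10^5];
{Variance[x + y], Variance[x] + Variance[y]}   (* both close to 1 + 1/4 = 1.25 *)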

Page 30: Probability & Statistics Lecture 8

Working in groups

(1) Find the expectation and variance for the Poisson (discrete) and exponential (continuous) distributions (open Lect8/Lect8_ClassPract_ExpectVar_forStudents.nb).

(2) Find (a) the expectation, (b) the variance and (c) the standard deviation of the sum obtained in tossing two fair dice.
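A sketch for checking the answers afterwards (lambda is kept symbolic; the dice part enumerates all 36 outcomes and uses the population variance rather than Mathematica's sample-variance Variance):

{Mean[PoissonDistribution[lambda]], Variance[PoissonDistribution[lambda]]}          (* {lambda, lambda} *)
{Mean[ExponentialDistribution[lambda]], Variance[ExponentialDistribution[lambda]]}  (* {1/lambda, 1/lambda^2} *)
sums = Flatten[Outer[Plus, Range[6], Range[6]]];   (* sums for all 36 outcomes of two dice *)
mu = Mean[sums]; var = Mean[(sums - mu)^2];
{mu, var, Sqrt[var]}                               (* {7, 35/6, Sqrt[35/6]} *)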