miles-shaw
Discrete Random Variables
For a discrete random variable X the probability distribution is described by the probability function, p(x), which has the following properties:
1. $0 \le p(x) \le 1$
2. $\sum_x p(x) = 1$
3. $P[a \le X \le b] = \sum_{a \le x \le b} p(x)$
Comment:
• For a discrete random variable the number of possible values (i.e. x such that p(x) > 0) is either finite or countably infinite (in a 1-1 correspondence with positive integers.)
Recall
p(x) = P[X = x] = the probability function of X.
This can be defined for any random variable X.
For a continuous random variable, p(x) = 0 for all values of x.
Let $S_X = \{x \mid p(x) > 0\}$. This set is countable (i.e. it can be put into a 1-1 correspondence with the positive integers):
$S_X = \{x \mid p(x) > 0\} = \{x_1, x_2, x_3, x_4, \dots\}$
Thus let
$$\sum_x p(x) = \sum_{i=1}^{\infty} p(x_i)$$
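Proof (that the set $S_X = \{x \mid p(x) > 0\}$ is countable, i.e. can be put into a 1-1 correspondence with the positive integers):
$S_X = S_1 \cup S_2 \cup S_3 \cup \dots$ where
$$S_i = \left\{x \,\middle|\, \tfrac{1}{i+1} < p(x) \le \tfrac{1}{i}\right\}$$
i.e.
$S_1 = \{x \mid \tfrac{1}{2} < p(x) \le 1\}$. Note: $n(S_1) < 2$
$S_2 = \{x \mid \tfrac{1}{3} < p(x) \le \tfrac{1}{2}\}$. Note: $n(S_2) < 3$
$S_3 = \{x \mid \tfrac{1}{4} < p(x) \le \tfrac{1}{3}\}$. Note: $n(S_3) < 4$
Thus the number of elements of $S_i$ satisfies $n(S_i) < i + 1$ (is finite), because each element of $S_i$ has probability greater than $\tfrac{1}{i+1}$ and the probabilities cannot add up to more than 1.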
Thus the elements of $S_X = S_1 \cup S_2 \cup S_3 \cup \dots$
can be arranged as $\{x_1, x_2, x_3, x_4, \dots\}$
by choosing the first elements to be the elements of $S_1$,
the next elements to be the elements of $S_2$,
the next elements to be the elements of $S_3$,
the next elements to be the elements of $S_4$,
etc.
This allows us to write
$$\sum_{i=1}^{\infty} p(x_i) \quad \text{for} \quad \sum_x p(x)$$
A Discrete Random Variable
A random variable X is called discrete if
$$\sum_x p(x) = \sum_{i=1}^{\infty} p(x_i) = 1$$
That is, all the probability is accounted for by values, x, such that p(x) > 0.
Discrete Random Variables
For a discrete random variable X the probability distribution is described by the probability function p(x), which has the following properties:
1. $0 \le p(x) \le 1$
2. $\sum_x p(x) = \sum_{i=1}^{\infty} p(x_i) = 1$
3. $P[a \le X \le b] = \sum_{a \le x \le b} p(x)$
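Graph: Discrete Random Variable
[Figure: bar graph of p(x) against x, with a and b marked on the horizontal axis to illustrate $P[a \le X \le b] = \sum_{a \le x \le b} p(x)$]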
Some Important Discrete distributions
The Bernoulli distribution
Suppose that we have an experiment that has two outcomes:
1. Success (S)
2. Failure (F)
These terms are used in reliability testing.
Suppose that p is the probability of success (S) and q = 1 – p is the probability of failure (F).
This experiment is sometimes called a Bernoulli trial.
Let
$$X = \begin{cases} 0 & \text{if the outcome is F} \\ 1 & \text{if the outcome is S} \end{cases}$$
Then
$$p(x) = P[X = x] = \begin{cases} q & x = 0 \\ p & x = 1 \end{cases}$$
The probability distribution with this probability function is called the Bernoulli distribution.
[Figure: bar graph of the Bernoulli probability function, with a bar of height q = 1 – p at x = 0 and a bar of height p at x = 1]
The Binomial distribution
Suppose that we have an experiment that has two outcomes (a Bernoulli trial):
1. Success (S)
2. Failure (F)
Suppose that p is the probability of success (S) and q = 1 – p is the probability of failure (F).
Now assume that the Bernoulli trial is repeated independently n times.
Let X = the number of successes occurring in the n trials.
Note: the possible values of X are {0, 1, 2, …, n}
For n = 5 the outcomes, together with the values of X and the probability of each outcome, are given in the table below:
X   Probability of each outcome   Outcomes
0   $q^5$                         FFFFF
1   $pq^4$                        SFFFF, FSFFF, FFSFF, FFFSF, FFFFS
2   $p^2q^3$                      SSFFF, SFSFF, SFFSF, SFFFS, FSSFF, FSFSF, FSFFS, FFSSF, FFSFS, FFFSS
3   $p^3q^2$                      SSSFF, SSFSF, SSFFS, SFSSF, SFSFS, SFFSS, FSSSF, FSSFS, FSFSS, FFSSS
4   $p^4q$                        SSSSF, SSSFS, SSFSS, SFSSS, FSSSS
5   $p^5$                         SSSSS
For n = 5 the following table gives the different possible values of X, x, and p(x) = P[X = x]:
x                 0       1         2            3            4         5
p(x) = P[X = x]   $q^5$   $5pq^4$   $10p^2q^3$   $10p^3q^2$   $5p^4q$   $p^5$
For general n, the outcome of the sequence of n Bernoulli trials is a sequence of S’s and F’s of length n.
SSFSFFSFFF…FSSSFFSFSFFS
• The value of X for such a sequence is k = the number of S’s in the sequence.
• The probability of such a sequence is $p^k q^{n-k}$ (a p for each S and a q for each F).
• There are $\binom{n}{k}$ such sequences containing exactly k S’s.
• $\binom{n}{k}$ is the number of ways of selecting the k positions for the S’s (the remaining n – k positions are for the F’s).
Thus
$$p(k) = P[X = k] = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, 2, 3, \dots, n-1, n$$
These are the terms in the expansion of $(p + q)^n$ using the Binomial Theorem:
$$(p + q)^n = \binom{n}{0} p^0 q^n + \binom{n}{1} p^1 q^{n-1} + \binom{n}{2} p^2 q^{n-2} + \dots + \binom{n}{n} p^n q^0$$
For this reason the probability function
$$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \dots, n$$
is called the probability function for the Binomial distribution.
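The binomial probability function can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `binomial_pmf` and the value p = 0.3 are illustrative assumptions. It tabulates p(x) for n = 5 and confirms that the terms add up to $(p + q)^n = 1$.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P[X = x] when X counts successes in n independent Bernoulli trials."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

n, p = 5, 0.3  # p = 0.3 is an arbitrary illustrative value
table = [binomial_pmf(x, n, p) for x in range(n + 1)]
total = sum(table)

for x, px in enumerate(table):
    print(x, round(px, 5))
print("sum of p(x):", total)
```

With x = 2 the coefficient is 10 and the term is $10p^2q^3$, matching the count of ten outcomes with exactly two S’s.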
Summary
We observe a Bernoulli trial (S,F) n times.
Let X denote the number of successes in the n trials. Then X has a binomial distribution, i.e.
$$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \dots, n$$
where
1. p = the probability of success (S), and
2. q = 1 – p = the probability of failure (F)
Example
A coin is tossed n = 7 times.
Let X denote the number of heads (H) in the n = 7 trials. Then X has a binomial distribution, with p = ½ and n = 7.
Thus
$$p(x) = P[X = x] = \binom{7}{x} \left(\tfrac{1}{2}\right)^x \left(\tfrac{1}{2}\right)^{7-x} = \binom{7}{x} \left(\tfrac{1}{2}\right)^7, \quad x = 0, 1, 2, \dots, 7$$
x      0       1       2        3        4        5        6       7
p(x)   1/128   7/128   21/128   35/128   35/128   21/128   7/128   1/128
[Figure: bar graph of p(x) against x for x = 0, 1, …, 7]
Example
If a surgeon performs “eye surgery” the chance of “success” is 85%. Suppose that the surgery is performed n = 20 times.
Let X denote the number of successful surgeries in the n = 20 trials. Then X has a binomial distribution, with p = 0.85 and n = 20.
Thus
$$p(x) = P[X = x] = \binom{20}{x} (0.85)^x (0.15)^{20-x}, \quad x = 0, 1, 2, \dots, 20$$
[Figure: bar graph of p(x) against x for x = 0, 1, …, 20]
x      0        1        2        3        4        5
p(x)   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000
x      6        7        8        9        10       11
p(x)   0.0000   0.0000   0.0000   0.0000   0.0002   0.0011
x      12       13       14       15       16       17
p(x)   0.0046   0.0160   0.0454   0.1028   0.1821   0.2428
x      18       19       20
p(x)   0.2293   0.1368   0.0388
The probability that at least sixteen operations are successful
= P[X ≥ 16]
= p(16) + p(17) + p(18) + p(19) + p(20)
= 0.1821 + 0.2428 + 0.2293 + 0.1368 + 0.0388
= 0.8298
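The tail probability above can be reproduced directly from the binomial probability function. This is a small Python sketch added for illustration; the variable names are assumptions, not from the slides.

```python
from math import comb

n, p = 20, 0.85
q = 1 - p

# p(x) for x = 0, 1, ..., 20
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# P[X >= 16] = p(16) + p(17) + p(18) + p(19) + p(20)
at_least_16 = sum(pmf[16:])
print(round(at_least_16, 4))
```

Summing unrounded terms may differ from the slide value in the last decimal place, since the slide adds probabilities already rounded to four decimals.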
Other discrete distributions
• Poisson distribution
• Geometric distribution
• Negative Binomial distribution
• Hypergeometric distribution
The Poisson distribution
• Suppose events are occurring randomly and uniformly in time.
• Let X be the number of events occurring in a fixed period of time. Then X will have a Poisson distribution with parameter λ:
$$p(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, 3, 4, \dots$$
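Some properties of the probability function for the Poisson distribution with parameter λ.
1. $\sum_{x=0}^{\infty} p(x) = \sum_{x=0}^{\infty} \frac{\lambda^x}{x!} e^{-\lambda} = 1$
since
$$\sum_{x=0}^{\infty} \frac{\lambda^x}{x!} e^{-\lambda} = e^{-\lambda} \left(1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \frac{\lambda^4}{4!} + \dots \right) = e^{-\lambda} e^{\lambda} = 1$$
using
$$e^u = 1 + u + \frac{u^2}{2!} + \frac{u^3}{3!} + \frac{u^4}{4!} + \dots$$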
2. If
$$p_{Bin}(x; n, p) = \binom{n}{x} p^x (1 - p)^{n-x}$$
is the probability function for the Binomial distribution with parameters n and p, and we allow n → ∞ and p → 0 such that np = a constant (= λ say) then
$$\lim_{n \to \infty,\ p \to 0} p_{Bin}(x; n, p) = p_{Poisson}(x; \lambda) = \frac{\lambda^x}{x!} e^{-\lambda}$$
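Proof:
$$p_{Bin}(x; n, p) = \binom{n}{x} p^x (1 - p)^{n-x}$$
Suppose $np = \lambda$ or $p = \frac{\lambda}{n}$. Then
$$p_{Bin}(x; n, p) = p_{Bin}\left(x; n, \tfrac{\lambda}{n}\right) = \frac{n!}{x!\,(n-x)!} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x}$$
$$= \frac{\lambda^x}{x!} \cdot \frac{n!}{(n-x)!\; n^x} \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n}$$
$$= \frac{\lambda^x}{x!} \cdot \frac{n(n-1)\cdots(n-x+1)}{n \cdot n \cdots n} \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n}$$
$$= \frac{\lambda^x}{x!} \,(1)\left(1 - \frac{1}{n}\right) \cdots \left(1 - \frac{x-1}{n}\right) \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n}$$
Now
$$\lim_{n \to \infty} p_{Bin}\left(x; n, \tfrac{\lambda}{n}\right) = \frac{\lambda^x}{x!} \lim_{n \to \infty} \left(1 - \frac{1}{n}\right) \cdots \left(1 - \frac{x-1}{n}\right) \left(1 - \frac{\lambda}{n}\right)^{-x} \left(1 - \frac{\lambda}{n}\right)^{n} = \frac{\lambda^x}{x!} \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n}$$
Now using the classic limit
$$\lim_{n \to \infty} \left(1 + \frac{u}{n}\right)^n = e^u$$
we get
$$\lim_{n \to \infty} p_{Bin}\left(x; n, \tfrac{\lambda}{n}\right) = \frac{\lambda^x}{x!} \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n} = \frac{\lambda^x}{x!} e^{-\lambda} = p_{Poisson}(x; \lambda)$$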
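The limit can also be observed numerically. The sketch below is an illustration only: the values λ = 2 and x = 3 and the function names are assumptions chosen for the demonstration. It compares the Binomial(n, λ/n) probability with the Poisson(λ) probability as n grows.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Binomial probability function with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Poisson probability function with parameter lam."""
    return lam**x / factorial(x) * exp(-lam)

lam, x = 2.0, 3  # illustrative values, not taken from the slides

errors = []
for n in (10, 100, 1000, 10000):
    # Keep np = lam fixed while n grows, so p = lam / n shrinks to 0.
    err = abs(binom_pmf(x, n, lam / n) - poisson_pmf(x, lam))
    errors.append(err)
    print(n, err)
```

The printed gap shrinks as n increases, which is what the proof predicts.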
Graphical Illustration
Suppose a time interval is divided into n equal parts and that one event may or may not occur in each subinterval.
[Figure: a time interval divided into n subintervals, each marked either "event occurs" or "event does not occur"]
X = # of events is Bin(n, p).
As n → ∞, events can occur over the continuous time interval.
X = # of events is Poisson(λ).
Example
The number of hurricanes, X, over a period of a year in the Caribbean is known to have a Poisson distribution with λ = 13.1.
Determine the probability function of X.
Compute the probability that X is at most 8.
Compute the probability that X is at least 10.
Given that at least 10 hurricanes occur, what is the probability that X is at most 15?
Solution
• X will have a Poisson distribution with parameter λ = 13.1, i.e.
$$p(x) = \frac{\lambda^x}{x!} e^{-\lambda} = \frac{13.1^x}{x!} e^{-13.1}, \quad x = 0, 1, 2, 3, 4, \dots$$
Table of p(x)
x    p(x)        x    p(x)
0    0.000002    10   0.083887
1    0.000027    11   0.099901
2    0.000175    12   0.109059
3    0.000766    13   0.109898
4    0.002510    14   0.102833
5    0.006575    15   0.089807
6    0.014356    16   0.073530
7    0.026866    17   0.056661
8    0.043994    18   0.041237
9    0.064036    19   0.028432
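$$P[\text{at most } 8] = P[X \le 8] = p(0) + p(1) + \dots + p(8) = 0.09527$$
$$P[\text{at least } 10] = P[X \ge 10] = 1 - P[X \le 9] = 1 - \left[p(0) + p(1) + \dots + p(9)\right] = 0.8407$$
$$P[\text{at most } 15 \mid \text{at least } 10] = P[X \le 15 \mid X \ge 10] = \frac{P[\{X \le 15\} \cap \{X \ge 10\}]}{P[X \ge 10]} = \frac{P[10 \le X \le 15]}{P[X \ge 10]}$$
$$= \frac{p(10) + p(11) + \dots + p(15)}{0.8407} = 0.708$$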
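The three hurricane probabilities can be recomputed from the Poisson probability function. This is a hedged Python sketch; the helper names `poisson_pmf` and `cdf` are assumptions for illustration and do not come from the slides.

```python
from math import exp, factorial

lam = 13.1

def poisson_pmf(x):
    """p(x) = lam^x / x! * e^(-lam) for the hurricane example."""
    return lam**x / factorial(x) * exp(-lam)

def cdf(k):
    """P[X <= k] = p(0) + p(1) + ... + p(k)."""
    return sum(poisson_pmf(x) for x in range(k + 1))

p_at_most_8 = cdf(8)
p_at_least_10 = 1 - cdf(9)
# P[X <= 15 | X >= 10] = P[10 <= X <= 15] / P[X >= 10]
p_conditional = (cdf(15) - cdf(9)) / p_at_least_10

print(round(p_at_most_8, 5), round(p_at_least_10, 4), round(p_conditional, 3))
```

Adding the tabulated values p(0) through p(9) gives 0.1593, so the complement P[X ≥ 10] comes out near 0.8407.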
The Geometric distribution
Suppose a Bernoulli trial (S,F) is repeated until a success occurs.
Let X = the trial on which the first success (S) occurs.
Find the probability distribution of X.
Note: the possible values of X are {1, 2, 3, 4, 5, … }
The sample space for the experiment (repeating a Bernoulli trial until a success occurs) is:
S = {S, FS, FFS, FFFS, FFFFS, … , FFF…FFFS, …}
$p(x) = P[X = x] = P[\{FFF \dots FFFS\}] = (1 - p)^{x-1} p$, where the sequence contains (x – 1) F’s.
Thus the probability function of X is:
$$P[X = x] = p(x) = p(1 - p)^{x-1} = pq^{x-1}, \quad x = 1, 2, 3, \dots$$
A random variable X that has this distribution is said to have the Geometric distribution.
Reason: $p(1) = p,\ p(2) = pq,\ p(3) = pq^2,\ p(4) = pq^3, \dots$ forms a geometric series
$$\sum_{x=1}^{\infty} p(x) = p(1) + p(2) + p(3) + \dots = p + pq + pq^2 + pq^3 + \dots = \frac{p}{1 - q} = \frac{p}{p} = 1$$
The Negative Binomial distribution
Suppose a Bernoulli trial (S,F) is repeated until k successes occur.
Let X = the trial on which the kth success (S) occurs.
Find the probability distribution of X.
Note: the possible values of X are {k, k + 1, k + 2, k + 3, … }
The sample space for the experiment (repeating a Bernoulli trial until k successes occur) consists of sequences of S’s and F’s having the following properties:
1. Each sequence will contain k S’s.
2. The last outcome in the sequence will be an S.
SFSFSFFFFS FFFSF … FFFFFFS
This is a sequence of length x containing exactly k S’s. The last outcome is an S, and the # of S’s in the first x – 1 trials is k – 1.
$$p(x) = P[X = x] = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \dots$$
where
• $\binom{x-1}{k-1}$ is the # of ways of choosing, from the first x – 1 trials, the positions for the first k – 1 S’s.
• $p^k q^{x-k}$ is the probability of a sequence containing k S’s and x – k F’s.
The Hypergeometric distribution
Suppose we have a population containing N objects. Suppose the elements of the population are partitioned into two groups. Let a = the number of elements in group A and let b = the number of elements in the other group (group B). Note N = a + b.
Now suppose that n elements are selected from the population at random. Let X denote the number of elements in the sample that come from group A. (n – X will be the number of elements from group B.)
Find the probability distribution of X.
[Figure: a population split into Group A (a elements) and Group B (b elements); the sample of n elements contains x elements from Group A and n – x from Group B]
Thus the probability function of X is:
$$p(x) = P[X = x] = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}}$$
where
• $\binom{a}{x}$ is the number of ways x elements can be chosen from Group A.
• $\binom{b}{n-x}$ is the number of ways n – x elements can be chosen from Group B.
• $\binom{N}{n}$ is the total number of ways n elements can be chosen from N = a + b elements.
A random variable X that has this distribution is said to have the Hypergeometric distribution.
The possible values of X are integer values that range from max(0, n – b) to min(n, a).
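Discrete distributions
The Bernoulli distribution
$$p(x) = P[X = x] = \begin{cases} q = 1 - p & x = 0 \\ p & x = 1 \end{cases}$$
$$X = \begin{cases} 1 & \text{Bernoulli trial} = S \\ 0 & \text{Bernoulli trial} = F \end{cases}$$
[Figure: bar graph of p(x) at x = 0 and x = 1]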
The Binomial distribution
$$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \dots, n$$
X = the number of successes in n repetitions of a Bernoulli trial
p = the probability of success
[Figure: bar graph of p(x) against x for x = 0, 1, …, 20, followed by a second bar graph whose horizontal axis runs from 0 to 40]
The Poisson distribution
Events are occurring randomly and uniformly in time.
X = the number of events occurring in a fixed period of time.
$$p(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, 3, 4, \dots$$
The Geometric Distribution
The Negative Binomial Distribution
The Binomial distribution, the Geometric distribution and the Negative Binomial distribution each arise when Bernoulli trials are repeated independently.
For the Binomial distribution, the Bernoulli trials are repeated independently a fixed number of times n, and X = the number of successes.
For the Negative Binomial distribution, the Bernoulli trials are repeated independently until a fixed number, k, of successes has occurred, and X = the trial on which the kth success occurred.
For the Geometric distribution, the Bernoulli trials are repeated independently until the first success occurs (k = 1), and X = the trial on which the 1st success occurred.
The Geometric distribution
Example
Suppose a die is rolled until a six occurs
Success = S = {six} , p = 1/6.
Failure = F = {no six} q = 1 – p = 5/6.
1. What is the probability that it took at most 5 rolls of a die to roll a six?
2. What is the probability that it took at least 10 rolls of a die to roll a six?
3. What is the probability that the “first six” occurred on an even number toss?
4. What is the probability that the “first six” occurred on a toss divisible by 3 given that the “first six” occurred on an even number toss?
Solution
Let X denote the toss on which the first six occurs.
Then X has a geometric distribution with p = 1/6 and q = 1 – p = 5/6:
$$P[X = x] = p(x) = pq^{x-1} = \frac{1}{6} \left(\frac{5}{6}\right)^{x-1}, \quad x = 1, 2, 3, \dots$$
The four questions become:
1. P[X ≤ 5]?
2. P[X ≥ 10]?
3. P[X is divisible by 2]?
4. P[X is divisible by 3 | X is divisible by 2]?
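1. P[X ≤ 5]?
$$P[X \le 5] = p(1) + p(2) + p(3) + p(4) + p(5)$$
$$= \tfrac{1}{6}\left(\tfrac{5}{6}\right)^0 + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^1 + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^2 + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^3 + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^4$$
$$= \tfrac{1}{6}\left[1 + \tfrac{5}{6} + \left(\tfrac{5}{6}\right)^2 + \left(\tfrac{5}{6}\right)^3 + \left(\tfrac{5}{6}\right)^4\right]$$
Using
$$a + ar + ar^2 + \dots + ar^{n-1} = \frac{a(1 - r^n)}{1 - r}$$
we get
$$P[X \le 5] = \frac{1}{6} \cdot \frac{1 - \left(\frac{5}{6}\right)^5}{1 - \frac{5}{6}} = 1 - \left(\frac{5}{6}\right)^5$$
Note also
$$a + ar + ar^2 + \dots = \frac{a}{1 - r}$$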
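2. P[X ≥ 10]?
$$P[X \ge 10] = p(10) + p(11) + p(12) + \dots$$
$$= \tfrac{1}{6}\left(\tfrac{5}{6}\right)^9 + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^{10} + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^{11} + \dots$$
$$= \tfrac{1}{6}\left(\tfrac{5}{6}\right)^9 \left[1 + \tfrac{5}{6} + \left(\tfrac{5}{6}\right)^2 + \dots\right] = \tfrac{1}{6}\left(\tfrac{5}{6}\right)^9 \cdot \frac{1}{1 - \frac{5}{6}} = \left(\frac{5}{6}\right)^9$$
using
$$a + ar + ar^2 + \dots = \frac{a}{1 - r}$$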
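3. P[X is divisible by 2]?
$$P[X \text{ is divisible by } 2] = p(2) + p(4) + p(6) + \dots$$
$$= \tfrac{1}{6}\left(\tfrac{5}{6}\right)^1 + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^3 + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^5 + \dots$$
$$= \tfrac{1}{6} \cdot \tfrac{5}{6} \left[1 + \left(\tfrac{5}{6}\right)^2 + \left(\tfrac{5}{6}\right)^4 + \dots\right] = \frac{5}{36} \cdot \frac{1}{1 - \frac{25}{36}} = \frac{5}{36} \cdot \frac{36}{11} = \frac{5}{11}$$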
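4. P[X is divisible by 3 | X is divisible by 2]?
$$P[X \text{ is divisible by } 3 \mid X \text{ is divisible by } 2] = \frac{P[X \text{ is divisible by } 3 \,\cap\, X \text{ is divisible by } 2]}{P[X \text{ is divisible by } 2]} = \frac{P[X \text{ is divisible by } 6]}{P[X \text{ is divisible by } 2]}$$
$$P[X \text{ is divisible by } 6] = p(6) + p(12) + p(18) + \dots$$
$$= \tfrac{1}{6}\left(\tfrac{5}{6}\right)^5 + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^{11} + \tfrac{1}{6}\left(\tfrac{5}{6}\right)^{17} + \dots$$
$$= \tfrac{1}{6}\left(\tfrac{5}{6}\right)^5 \left[1 + \left(\tfrac{5}{6}\right)^6 + \left(\tfrac{5}{6}\right)^{12} + \dots\right] = \tfrac{1}{6}\left(\tfrac{5}{6}\right)^5 \cdot \frac{1}{1 - \left(\frac{5}{6}\right)^6} = \frac{5^5}{6^6 - 5^6} = \frac{3125}{31031}$$
Hence
$$P[X \text{ is divisible by } 3 \mid X \text{ is divisible by } 2] = \frac{P[X \text{ is divisible by } 6]}{P[X \text{ is divisible by } 2]} = \frac{3125/31031}{5/11} = \frac{6875}{31031} \approx 0.2216$$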
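The four die-rolling answers can be checked with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction`; the variable names are illustrative assumptions. Each infinite sum is replaced by its closed form a / (1 – r), with 6^6 – 5^6 = 46656 – 15625 = 31031 appearing in the divisible-by-6 case.

```python
from fractions import Fraction

p = Fraction(1, 6)   # probability of rolling a six
q = 1 - p            # probability of not rolling a six

def pmf(x):
    """Geometric probability function p(x) = p * q^(x - 1)."""
    return p * q**(x - 1)

at_most_5 = sum(pmf(x) for x in range(1, 6))   # finite sum, equals 1 - q^5
at_least_10 = pmf(10) / (1 - q)                # a / (1 - r) with a = p(10), r = q
even = pmf(2) / (1 - q**2)                     # a = p(2), r = q^2
div_by_6 = pmf(6) / (1 - q**6)                 # a = p(6), r = q^6
conditional = div_by_6 / even                  # P[div. by 3 | div. by 2]

print(at_most_5, at_least_10, even, div_by_6, conditional)
```

`Fraction` reduces automatically, so the conditional probability prints in lowest terms as 625/2821, which equals 6875/31031.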
The Negative Binomial distribution
Example
Suppose the chance of winning any prize in a lottery is 3%. Suppose that I play the lottery until I have won k = 5 times.
Let X denote the number of times that I play the lottery.
Find the probability function, p(x), of X
$$p(x) = P[X = x] = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \dots$$
With k = 5, p = 0.03 and q = 1 – p = 0.97:
$$p(x) = \binom{x-1}{4} (0.03)^5 (0.97)^{x-5}, \quad x = 5, 6, 7, \dots$$
[Figure: Graph of p(x) for x from 0 to 600]
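The lottery probability function can be tabulated to see where the graph peaks. This is a minimal Python sketch, not from the slides; the function name `negbin_pmf` and the truncation point 3000 are assumptions (the probability beyond that point is negligible for these parameters).

```python
from math import comb

k, p = 5, 0.03
q = 1 - p  # 0.97

def negbin_pmf(x):
    """P[the k-th win occurs on play x], for x = k, k + 1, ..."""
    return comb(x - 1, k - 1) * p**k * q**(x - k)

xs = range(k, 3001)  # truncated support; 3000 is an arbitrary cut-off
total = sum(negbin_pmf(x) for x in xs)
mode = max(xs, key=negbin_pmf)

print("sum of p(x):", total)
print("most likely number of plays:", mode)
```

The ratio p(x) / p(x – 1) = (x – 1)q / (x – k) stays above 1 while x ≤ (k – q) / p ≈ 134.3, so the peak sits at x = 134.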
The Hypergeometric distribution
Example: Estimating the size of a wildlife population
• Suppose that N (unknown) is the size of a wildlife population.
• To estimate N, T animals are caught, tagged and replaced in the population. (T is known.)
• A second sample of n animals is caught and the number, t, of tagged animals is noted. (n is known and t is the observation that will be used to estimate N.)
Note
• The observation, t, will have a hypergeometric distribution:
$$f(t; N) = \frac{\binom{T}{t} \binom{N-T}{n-t}}{\binom{N}{n}} = L(N)$$
To estimate N we find the value $\hat{N}$ that maximizes
$$L(N) = \frac{\binom{T}{t} \binom{N-T}{n-t}}{\binom{N}{n}}$$
To determine when L(N) is maximized, compute the ratio
$$\frac{L(N)}{L(N-1)}$$
and determine when the ratio is greater than 1 and less than 1.
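Now
$$\frac{L(N)}{L(N-1)} = \frac{\binom{T}{t} \binom{N-T}{n-t} \Big/ \binom{N}{n}}{\binom{T}{t} \binom{N-1-T}{n-t} \Big/ \binom{N-1}{n}} = \frac{\binom{N-T}{n-t} \binom{N-1}{n}}{\binom{N-1-T}{n-t} \binom{N}{n}}$$
$$= \frac{(N-T)!\,(N-T-n+t-1)!}{(N-T-1)!\,(N-T-n+t)!} \cdot \frac{(N-1)!\,(N-n)!}{N!\,(N-n-1)!}$$
$$= \frac{(N-T)(N-n)}{(N-T-n+t)\,N}$$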
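Now
$$\frac{L(N)}{L(N-1)} = \frac{(N-T)(N-n)}{(N-T-n+t)\,N} \ge 1$$
if
$$(N-T)(N-n) \ge (N-T-n+t)\,N$$
or
$$N^2 - (n+T)N + nT \ge N^2 + (t-n-T)N$$
$$nT \ge tN \quad \text{or} \quad N \le \frac{nT}{t}$$
and hence
$$\frac{L(N)}{L(N-1)} > 1 \quad \text{if } N < \frac{nT}{t}$$
and
$$\frac{L(N)}{L(N-1)} < 1 \quad \text{if } N > \frac{nT}{t}$$
also
$$\frac{L(N)}{L(N-1)} = 1 \quad \text{if } N = \frac{nT}{t}$$
[Figure: number line for N with the points nT/t – 1 and nT/t marked]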
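Let [x] = greatest integer less than or equal to x.
Thus
$$\hat{N} = \left[\frac{nT}{t}\right] = \text{greatest integer less than or equal to } \frac{nT}{t}$$
If $\frac{nT}{t}$ is an integer then $\hat{N} = \frac{nT}{t}$ or $\hat{N} = \frac{nT}{t} - 1$.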
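The estimate can be compared with a brute-force search over N. The following Python sketch is illustrative only: the survey numbers T = 50, n = 40, t = 7 and the upper search bound 2000 are hypothetical values, and `likelihood` is an assumed helper name. Exact fractions are used so that comparisons between likelihoods are not affected by rounding.

```python
from fractions import Fraction
from math import comb

def likelihood(N, T, n, t):
    """L(N): probability of seeing t tagged animals in a sample of n."""
    return Fraction(comb(T, t) * comb(N - T, n - t), comb(N, n))

T, n, t = 50, 40, 7  # hypothetical survey numbers, not from the slides

# N must be at least T + (n - t): the tagged animals plus the untagged
# animals seen in the second sample.
N_min = T + (n - t)
candidates = range(N_min, 2001)  # 2000 is an arbitrary upper search bound

N_hat_brute = max(candidates, key=lambda N: likelihood(N, T, n, t))
N_hat_formula = (n * T) // t  # greatest integer <= nT/t

print(N_hat_brute, N_hat_formula)
```

Here nT/t = 2000/7 ≈ 285.7 is not an integer, so the maximizer is unique; when nT/t is an integer the likelihood takes the same value at nT/t and nT/t – 1.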
Example: Hyper-geometric distribution
Suppose that N = 10 automobiles have just come off the production line. Also assume that a = 3 are defective (have serious defects). Thus b = 7 are defect-free.
A sample of n = 4 are selected and tested to see if they are defective. Let X = the number in the sample that are defective. Find the probability function of X.
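From the above discussion X will have a hyper-geometric distribution, i.e.
$$p(x) = P[X = x] = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}} = \frac{\binom{3}{x} \binom{7}{4-x}}{\binom{10}{4}}, \quad x = 0, 1, 2, 3$$
Table and Graph of p(x)
x   p(x)
0   0.1667
1   0.5000
2   0.3000
3   0.0333
[Figure: bar graph of p(x) against x for x = 0, 1, 2, 3]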
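The table for the automobile example can be reproduced exactly. This is a short Python sketch added for illustration; `hyper_pmf` is an assumed helper name.

```python
from fractions import Fraction
from math import comb

N, a, n = 10, 3, 4   # population size, defectives, sample size
b = N - a            # defect-free automobiles

def hyper_pmf(x):
    """P[X = x]: x defectives among the n sampled automobiles."""
    return Fraction(comb(a, x) * comb(b, n - x), comb(N, n))

# Possible values run from max(0, n - b) to min(n, a).
lo, hi = max(0, n - b), min(n, a)
table = {x: hyper_pmf(x) for x in range(lo, hi + 1)}

for x, px in table.items():
    print(x, px, round(float(px), 4))
```

The exact values are 1/6, 1/2, 3/10 and 1/30, which round to the four decimals shown in the table.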
Sampling with and without replacement
Suppose we have a population containing N objects.
Suppose the elements of the population are partitioned into two groups. Let a = the number of elements in group A and let b = the number of elements in the other group (group B). Note N = a + b.
Now suppose that n elements are selected from the population at random. Let X denote the number of elements in the sample that come from group A. (n – X will be the number of elements from group B.)
Find the probability distribution of X.
1. If the sampling was done with replacement.
2. If the sampling was done without replacement
Solution:
1. If the sampling was done with replacement.
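Then the distribution of X is the Binomial distribution, i.e.
$$p_{Binom}(x) = P[X = x] = \binom{n}{x} \left(\frac{a}{N}\right)^x \left(\frac{b}{N}\right)^{n-x}$$
with $p = \frac{a}{N}$ and $q = 1 - p = \frac{b}{N}$.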
2. If the sampling was done without replacement.
Then the distribution of X is the hyper-geometric distribution, i.e.
$$p_{Hyper}(x) = P[X = x] = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}}$$
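Note:
$$\frac{p_{Hyper}(x)}{p_{Binom}(x)} = \frac{\binom{a}{x} \binom{b}{n-x} \Big/ \binom{N}{n}}{\binom{n}{x} \left(\frac{a}{N}\right)^x \left(\frac{b}{N}\right)^{n-x}}$$
$$= \frac{a!}{x!\,(a-x)!} \cdot \frac{b!}{(n-x)!\,(b-n+x)!} \cdot \frac{n!\,(N-n)!}{N!} \cdot \frac{x!\,(n-x)!}{n!} \cdot \frac{N^n}{a^x\, b^{n-x}}$$
$$= \frac{a(a-1)\cdots(a-x+1) \; b(b-1)\cdots(b-n+x+1)}{a^x\, b^{n-x}} \cdot \frac{N^n}{N(N-1)\cdots(N-n+1)}$$
$$= \frac{(1)\left(1 - \frac{1}{a}\right) \cdots \left(1 - \frac{x-1}{a}\right) \; (1)\left(1 - \frac{1}{b}\right) \cdots \left(1 - \frac{n-x-1}{b}\right)}{(1)\left(1 - \frac{1}{N}\right) \cdots \left(1 - \frac{n-1}{N}\right)} \to 1 \quad \text{as } N, a, b \to \infty$$
Thus for large values of N, a and b
$$p_{Hyper}(x) = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}} \approx \binom{n}{x} \left(\frac{a}{N}\right)^x \left(\frac{b}{N}\right)^{n-x} = p_{Binom}(x)$$
Thus for large values of N, a and b (with n fixed), sampling with replacement is approximately equivalent to sampling without replacement.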
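The approximation can be observed numerically by scaling up the population while keeping the proportion a/N fixed. The sketch below is illustrative only: the starting values a = 3, b = 7, n = 4, the choice x = 1 and the scale factors are assumptions for the demonstration.

```python
from math import comb

def hyper_pmf(x, a, b, n):
    """Sampling without replacement: hypergeometric probability."""
    return comb(a, x) * comb(b, n - x) / comb(a + b, n)

def binom_pmf(x, n, p):
    """Sampling with replacement: binomial probability."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, x = 4, 1  # fixed sample size and value of interest (illustrative)

gaps = []
for scale in (1, 10, 100, 1000):
    a, b = 3 * scale, 7 * scale       # keep a/N = 0.3 while N grows
    N = a + b
    gap = abs(hyper_pmf(x, a, b, n) - binom_pmf(x, n, a / N))
    gaps.append(gap)
    print(N, gap)
```

For N = 10 the two probabilities differ noticeably (0.5 against about 0.41), and the gap shrinks steadily as N grows.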
Summary
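Discrete distributions
The Bernoulli distribution
$$p(x) = P[X = x] = \begin{cases} q = 1 - p & x = 0 \\ p & x = 1 \end{cases}$$
$$X = \begin{cases} 1 & \text{Bernoulli trial} = S \\ 0 & \text{Bernoulli trial} = F \end{cases}$$
[Figure: bar graph of p(x) at x = 0 and x = 1]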
The Binomial distribution
$$p(x) = P[X = x] = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \dots, n$$
X = the number of successes in n repetitions of a Bernoulli trial
p = the probability of success
[Figure: bar graph of p(x) against x for x = 0, 1, …, 20, followed by a second bar graph whose horizontal axis runs from 0 to 40]
The Poisson distribution
Events are occurring randomly and uniformly in time.
X = the number of events occurring in a fixed period of time.
$$p(x) = \frac{\lambda^x}{x!} e^{-\lambda}, \quad x = 0, 1, 2, 3, 4, \dots$$
For the Negative Binomial distribution, the Bernoulli trials are repeated independently until a fixed number, k, of successes has occurred, and X = the trial on which the kth success occurred:
$$p(x) = P[X = x] = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \dots$$
For the Geometric distribution, the Bernoulli trials are repeated independently until the first success occurs (k = 1), and X = the trial on which the 1st success occurred:
$$P[X = x] = p(x) = p(1 - p)^{x-1} = pq^{x-1}, \quad x = 1, 2, 3, \dots$$
Geometric ≡ Negative Binomial with k = 1
The Hypergeometric distribution
Suppose we have a population containing N objects.
The population is partitioned into two groups.
• a = the number of elements in group A
• b = the number of elements in the other group (group B).
Note N = a + b.
• n elements are selected from the population at random.
• X = the number of elements in the sample that come from group A. (n – X will be the number of elements from group B.)
$$p(x) = P[X = x] = \frac{\binom{a}{x} \binom{b}{n-x}}{\binom{N}{n}}$$