ENGG2450 Probability and Statistics for Engineers

1 Introduction
2 Organization and description of data
3 Probability
4 Probability distributions
5 Probability densities
6 Sampling distributions
7 Inferences concerning a mean
8 Comparing two treatments
9 Inferences concerning variances
A Random processes

6 Sampling distributions
6.2 The sampling distribution of the mean (σ known)
6.3 The sampling distribution of the mean (σ unknown)
6.4 The sampling distribution of the variance

(revision: 2.1 Populations and samples)
Random Samples (finite population)
A set of observations X1, X2, …, Xn constitutes a random sample of size n from a finite population of size N if its values are chosen so that each subset of n of the N elements of the population has the same probability of being selected.
e.g. N = 100, n = 4
X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14, X15, X16, …, X99, X100
The upper case represents the random variables before they are observed.
x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, …, x99, x100
We may also apply the term random sample to x1, x2, …, xn, the set of observed values of the random variables X1, X2, …, Xn.

(revision: 2.1 Populations and samples)
Random Samples (infinite population)
A set of observations X1, X2, …, Xn constitutes a random sample of size n from the infinite population f(x) if
1. Each Xi is a random variable whose distribution is given by f(x).
2. These n random variables are independent.
X1, X2, X3, …, X99, X100, …, X1001, X1002, X1003, X1004, …
The upper case represents the random variables before they are observed.
We may also apply the term random sample to the set of observed values x1, x2, …, xn of the random variables.

e.g.
x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24
(a) How many different samples of size n = 2 can be chosen from a finite population of size N = 7?
(b) Repeat (a) with N = 24.
(c) What is the probability of each sample in part (a) if the samples are to be random?
(d) Repeat (c) with N = 24.

Sln.
(a) The number of possible samples = C(7,2) = 7×6 / 2 = 21.
(b) The number of possible samples = C(24,2) = 24×23 / 2 = 276.
(c) The probability of each sample in part (a) is 1/21.
(d) The probability of each sample in part (b) is 1/276.
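
The counts in this solution can be rechecked with a short Python sketch. The function name `sample_count_and_probability` is an illustrative choice, not part of the course material; it only uses the standard-library `math.comb`.

```python
from fractions import Fraction
from math import comb

def sample_count_and_probability(N, n):
    """Return (number of size-n subsets of N elements, probability of each subset)."""
    count = comb(N, n)
    # In a random sample every subset is equally likely.
    return count, Fraction(1, count)

for N in (7, 24):
    count, prob = sample_count_and_probability(N, 2)
    print(f"N={N}: {count} possible samples, each with probability {prob}")
```

Exact fractions are used so that the probabilities print as 1/21 and 1/276 rather than as rounded decimals.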

6.2 The sampling distribution of the mean (σ known)
A set of observations X1, X2, …, Xn constitutes a random sample of size n from the infinite population f(x) if each Xi is a random variable whose distribution is given by f(x) and these n random variables are independent.
A random sample of n (say 10) observations is taken from some population. The mean of the sample is computed to estimate the mean of the population.
x1, x2, x3, …, x15, x16, …, x99, x100, x101, x102, x103, x104, x105, …
Suppose 50 random samples of size n = 10 are taken from a population having the discrete uniform distribution f(x) = 0.1 for x = 0, 1, 2, …, 9 and f(x) = 0 for other values of x.
Sampling is with replacement, so we are sampling from an infinite population.
Proceeding in this way, we get 50 samples whose means are
4.4 3.2 5.0 3.5 4.1 4.4 3.6 6.5 5.3 4.4
3.1 5.3 3.8 4.3 3.3 5.0 4.9 4.8 3.1 5.3
3.0 3.0 4.6 5.8 4.6 4.0 3.7 5.2 3.7 3.8
5.3 5.5 4.8 6.4 4.9 6.5 3.5 4.5 4.9 5.3
3.6 2.7 4.0 5.0 2.6 4.2 4.4 5.6 4.7 4.3

means          Frequency
[2.0, 3.0)      2
[3.0, 4.0)     14
[4.0, 5.0)     19
[5.0, 6.0)     12
[6.0, 7.0)      3
Total          50

The population has the discrete uniform distribution, but the means of the 50 random samples have a bell-shaped distribution. Why?
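
The frequency table can be rechecked directly from the 50 sample means. The sketch below is illustrative only: the helper name `frequency_table` is an assumption, and the classes are taken to be the unit-width intervals [k, k+1) used on the slide.

```python
from collections import Counter
from math import floor

# The 50 sample means as listed on the slide.
MEANS = [
    4.4, 3.2, 5.0, 3.5, 4.1, 4.4, 3.6, 6.5, 5.3, 4.4,
    3.1, 5.3, 3.8, 4.3, 3.3, 5.0, 4.9, 4.8, 3.1, 5.3,
    3.0, 3.0, 4.6, 5.8, 4.6, 4.0, 3.7, 5.2, 3.7, 3.8,
    5.3, 5.5, 4.8, 6.4, 4.9, 6.5, 3.5, 4.5, 4.9, 5.3,
    3.6, 2.7, 4.0, 5.0, 2.6, 4.2, 4.4, 5.6, 4.7, 4.3,
]

def frequency_table(values):
    """Count values in unit-width classes [k, k+1), keyed by (k, k+1)."""
    counts = Counter(floor(v) for v in values)
    return {(k, k + 1): counts[k] for k in sorted(counts)}

# Print each class with its frequency and a crude text histogram.
for (lo, hi), freq in frequency_table(MEANS).items():
    print(f"[{lo}.0, {hi}.0)  {freq:2d}  {'*' * freq}")
```

The row of asterisks gives a rough picture of the bell shape that the slide asks about.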

To answer this kind of question, we need to investigate the theoretical sampling distribution of the sample mean
\[ \bar{X} = \frac{X_1 + \cdots + X_n}{n}. \]

Formulas for $\mu_{\bar{X}}$ and $\sigma_{\bar{X}}^2$
Theorem 1: If a random sample of size n is taken from a population having the mean $\mu$ and the variance $\sigma^2$, then
(a) $\bar{X}$ is a random variable whose distribution has the mean $\mu$,
(b) for samples from infinite populations, the variance of this distribution is
\[ \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}, \]
(c) for samples from finite populations, the variance of this distribution is
\[ \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}, \]
where $(N-n)/(N-1)$ is the finite population correction factor.
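
The finite-population formula $\frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$ can be checked by brute force on a small case. The sketch below is not from the slides: the population 0, 1, …, 6 and the helper names are hypothetical, and $\sigma^2$ is taken as the population variance with divisor N. It enumerates every equally likely subset of size n and compares the exact variance of the sample mean with the formula.

```python
from fractions import Fraction
from itertools import combinations

def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)

def variance(values):
    """Average squared deviation from the mean (divisor = number of values)."""
    values = list(values)
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def sample_mean_moments(population, n):
    """Exact mean and variance of the sample mean over all size-n subsets."""
    means = [mean(Fraction(x) for x in subset)
             for subset in combinations(population, n)]
    return mean(means), variance(means)

population = [0, 1, 2, 3, 4, 5, 6]   # hypothetical finite population, N = 7
N, n = len(population), 2

mu = mean(Fraction(x) for x in population)
sigma2 = variance(Fraction(x) for x in population)
mean_xbar, var_xbar = sample_mean_moments(population, n)
predicted = sigma2 / n * Fraction(N - n, N - 1)   # formula with correction factor

print("mean of sample mean equals mu:", mean_xbar == mu)
print("variance matches corrected formula:", var_xbar == predicted)
```

Because the arithmetic is done with exact fractions, the two comparisons test equality rather than closeness.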

Theorem 1(a): If a random sample of size n is taken from a population having the mean $\mu$ and the variance $\sigma^2$, then $\bar{X}$ is a random variable which has the mean $\mu$.
note: $\bar{x}$ is an outcome of the random variable $\bar{X} = (X_1 + \cdots + X_n)/n$ representing the sample mean.
note: the random variables $X_1, \dots, X_n$ have joint pdf $f(x_1, \dots, x_n)$.
note: $x_1, \dots, x_n$ are dummy variables representing outcomes of $X_1, X_2, \dots, X_n$.

Pf: The mean of the sample mean is obtained by integrating against the joint pdf, which factorizes as $f(x_1) f(x_2) \cdots f(x_n)$ because the $X_i$ are independent:
\[
\begin{aligned}
\mu_{\bar{X}} &= \int\!\cdots\!\int \sum_{i=1}^{n} \frac{x_i}{n}\, f(x_1, x_2, \dots, x_n)\, dx_1\, dx_2 \cdots dx_n \\
&= \frac{1}{n} \sum_{i=1}^{n} \int\!\cdots\!\int x_i\, f(x_1) f(x_2) \cdots f(x_n)\, dx_1\, dx_2 \cdots dx_n \\
&= \frac{1}{n} \Big[ \int x_1 f(x_1)\,dx_1 \int f(x_2)\,dx_2 \cdots \int f(x_n)\,dx_n \\
&\qquad + \int f(x_1)\,dx_1 \int x_2 f(x_2)\,dx_2 \int f(x_3)\,dx_3 \cdots \int f(x_n)\,dx_n \\
&\qquad + \cdots + \int f(x_1)\,dx_1 \int f(x_2)\,dx_2 \cdots \int f(x_{n-1})\,dx_{n-1} \int x_n f(x_n)\,dx_n \Big] \\
&= \frac{1}{n}\,(\mu + \mu + \cdots + \mu) = \mu
\end{aligned}
\]
= the population mean, since each $\int f(x_j)\,dx_j = 1$ and each $\int x_i f(x_i)\,dx_i = \mu$.

Theorem 1(b): If a random sample of size n is taken from an infinite population having the mean $\mu$ and the variance $\sigma^2$, then $\bar{X}$ is a random variable of the variance $\sigma^2/n$.

Pf: Without loss of generality, we assume $\mu = 0$ and so
\[ \sigma_{\bar{X}}^2 = \int\!\cdots\!\int \bar{x}^2\, f(x_1, x_2, \dots, x_n)\, dx_1\, dx_2 \cdots dx_n, \]
where
\[ \bar{x}^2 = \frac{1}{n^2}\,(x_1 + \cdots + x_n)(x_1 + \cdots + x_n) = \frac{1}{n^2} \Big[ \sum_{i=1}^{n} x_i^2 + \sum_{i \ne j} x_i x_j \Big]. \]
Recall that the variance is $\sigma^2 = E[(X - \mu)^2] = \int (x - \mu)^2 f(x)\,dx$, which equals $\int x^2 f(x)\,dx$ when $\mu = 0$. Then
\[
\begin{aligned}
\sigma_{\bar{X}}^2 &= \frac{1}{n^2} \sum_{i=1}^{n} \int\!\cdots\!\int x_i^2\, f(x_1, x_2, \dots, x_n)\, dx_1\, dx_2 \cdots dx_n \\
&\qquad + \frac{1}{n^2} \sum_{i \ne j} \int\!\cdots\!\int x_i x_j\, f(x_1, x_2, \dots, x_n)\, dx_1\, dx_2 \cdots dx_n \\
&= \frac{1}{n^2} \sum_{i=1}^{n} \int\!\cdots\!\int x_i^2\, f(x_1) f(x_2) \cdots f(x_n)\, dx_1\, dx_2 \cdots dx_n \\
&\qquad + \frac{1}{n^2} \sum_{i \ne j} \int\!\cdots\!\int x_i x_j\, f(x_1) f(x_2) \cdots f(x_n)\, dx_1\, dx_2 \cdots dx_n \\
&= \frac{1}{n^2} \sum_{i=1}^{n} \int x_i^2 f(x_i)\,dx_i + \frac{1}{n^2} \sum_{i \ne j} \int x_i f(x_i)\,dx_i \int x_j f(x_j)\,dx_j \\
&= \frac{1}{n^2} \sum_{i=1}^{n} \sigma^2 + 0 = \frac{\sigma^2}{n}.
\end{aligned}
\]
The cross terms vanish because each $\int x_i f(x_i)\,dx_i = \mu = 0$.

Chebyshev's Theorem
By Theorem 1, for samples from infinite populations the sample mean $\bar{X} = (X_1 + \cdots + X_n)/n$ has mean $\mu$ and variance $\sigma^2/n$, so its standard deviation is $\sigma/\sqrt{n}$. Chebyshev's Theorem applied to $\bar{X}$ gives
\[ P\big(|\bar{X} - \mu| \ge k\sigma/\sqrt{n}\big) \le \frac{1}{k^2}, \qquad \text{equivalently} \qquad P\big(|\bar{X} - \mu| < k\sigma/\sqrt{n}\big) \ge 1 - \frac{1}{k^2}. \]
[Figure: density f(x) of $\bar{X}$ with the interval from $\mu - k\sigma/\sqrt{n}$ to $\mu + k\sigma/\sqrt{n}$ marked.]
Setting $\varepsilon = k\sigma/\sqrt{n}$, so that $1/k^2 = \sigma^2/(n\varepsilon^2)$,
\[ P\big(|\bar{X} - \mu| < \varepsilon\big) \ge 1 - \frac{\sigma^2}{n\varepsilon^2}. \]
For any given $\varepsilon > 0$, the probability $P(|\bar{X} - \mu| < \varepsilon)$ can be made arbitrarily close to 1 by choosing n sufficiently large.
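
To see how quickly the Chebyshev bound $P(|\bar{X} - \mu| < \varepsilon) \ge 1 - \sigma^2/(n\varepsilon^2)$ approaches 1, the following sketch evaluates it for growing n. The choices here are illustrative assumptions: $\sigma^2 = 8.25$ is the variance of the discrete uniform population on 0, …, 9 used elsewhere in these slides, $\varepsilon = 0.5$ is arbitrary, and the function name is hypothetical.

```python
from fractions import Fraction

def chebyshev_lower_bound(sigma2, n, eps):
    """Lower bound 1 - sigma^2 / (n * eps^2) on P(|Xbar - mu| < eps), floored at 0."""
    bound = 1 - Fraction(sigma2) / (n * Fraction(eps) ** 2)
    # A negative value carries no information, so report 0 instead.
    return max(bound, Fraction(0))

sigma2 = Fraction(33, 4)   # 8.25, variance of the discrete uniform on 0..9
eps = Fraction(1, 2)       # arbitrary tolerance chosen for illustration
for n in (10, 100, 1000, 10000):
    print(n, float(chebyshev_lower_bound(sigma2, n, eps)))
```

The bound is uninformative for small n and is usually far from tight; it only guarantees that the probability tends to 1.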

Law of large numbers
Theorem 2: Let X1, X2, …, Xn be independent random variables each having the same mean $\mu$ and variance $\sigma^2$, and let $\bar{X} = (X_1 + \cdots + X_n)/n$. Then, for any $\varepsilon > 0$,
\[ P\big(|\bar{X} - \mu| > \varepsilon\big) \to 0 \quad \text{as } n \to \infty. \]
As the sample size increases unboundedly, the probability that the sample mean differs from the population mean $\mu$ by more than an arbitrary amount $\varepsilon$ converges to zero.
This follows from Chebyshev's Theorem:
\[ P\big(|\bar{X} - \mu| < \varepsilon\big) \ge 1 - \frac{\sigma^2}{n\varepsilon^2}. \]
For any given $\varepsilon > 0$, the probability $P(|\bar{X} - \mu| < \varepsilon)$ can be made arbitrarily close to 1 by choosing n sufficiently large.

X1, X2, …, Xn are random variables.
$\bar{X}$ = (X1 + … + Xn)/n, called the sample mean, is a random variable.

e.g. Consider an experiment where a specified event A has probability p of occurring. Suppose that, when the experiment is repeated n times, outcomes from different trials are independent. Show that
relative frequency of A = (number of times A occurs in n trials) / n
becomes arbitrarily close to p, with arbitrarily high probability, as the number of times the experiment is repeated grows unboundedly.

Sln. We can define n random variables X1, X2, …, Xn where Xi = 1 if A occurs on the i-th trial and Xi = 0 otherwise.
Then X1 + X2 + … + Xn is the number of times that event A occurs in n trials of the experiment, and $\bar{X}$, the sample mean, is the relative frequency of A in n trials.
\[ E[X_i] = \sum_{\text{all } x_k} x_k f(x_k) = 1 \cdot p + 0 \cdot (1 - p) = p \]
\[ E[X_i^2] = 1^2 \cdot p + 0^2 \cdot (1 - p) = p \]
The Xi are independent and identically distributed with mean $\mu = p$ and variance $\sigma^2 = E[X_i^2] - \mu^2 = p - p^2 = p(1 - p)$.
By Theorem 2 (Law of large numbers), as the sample size n increases unboundedly, the probability that the sample mean $\bar{X}$ differs from the population mean $\mu$ (which is equal to p) by more than an arbitrary amount $\varepsilon$ converges to zero, i.e.
\[ P\big(|\bar{X} - p| > \varepsilon\big) \to 0 \quad \text{as } n \to \infty. \]
Here the sample mean is the relative frequency of A in n trials, and the population mean is p.
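
A small simulation illustrates the relative frequency settling near p. This is only a sketch under stated assumptions: p = 0.3 is a hypothetical probability for event A, the seed is arbitrary and fixed for repeatability, and `relative_frequency` is an illustrative helper built on Python's standard `random` module.

```python
import random

def relative_frequency(p, n, rng):
    """Relative frequency of an event with probability p in n independent trials."""
    # Each trial sets X_i = 1 when a uniform draw falls below p, else X_i = 0.
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n

rng = random.Random(2450)   # fixed, arbitrarily chosen seed
p = 0.3                     # hypothetical probability of event A
for n in (10, 100, 1000, 10000, 100000):
    freq = relative_frequency(p, n, rng)
    print(f"n={n:6d}  relative frequency={freq:.4f}  |error|={abs(freq - p):.4f}")
```

A single run shows the error tending to shrink as n grows, though not necessarily at every step; the theorem is a statement about probabilities, not about one particular sequence of trials.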

Theorem 1(b): If a random sample of size n is taken from an infinite population having the mean $\mu$ and the variance $\sigma^2$, then the sample mean $\bar{X}$ is a random variable of the variance $\sigma^2/n$.
The reliability of the sample mean as an estimate of the population mean is often measured by the standard deviation of the mean, $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, which is also called the standard error of the mean.

e.g. Suppose 50 random samples of size n = 10 are taken from a population having the discrete uniform distribution f(x) = 0.1 for x = 0, 1, 2, …, 9 and f(x) = 0 for other values of x.
50 samples whose means are
4.4 3.2 5.0 3.5 4.1 4.4 3.6 6.5 5.3 4.4
3.1 5.3 3.8 4.3 3.3 5.0 4.9 4.8 3.1 5.3
3.0 3.0 4.6 5.8 4.6 4.0 3.7 5.2 3.7 3.8
5.3 5.5 4.8 6.4 4.9 6.5 3.5 4.5 4.9 5.3
3.6 2.7 4.0 5.0 2.6 4.2 4.4 5.6 4.7 4.3
The mean and the variance of these 50 sample means are
\[ \bar{x}_{\bar{x}} = \frac{1}{50} \sum_{i=1}^{50} \bar{x}_i = 4.428, \qquad s_{\bar{x}}^2 = \frac{1}{49} \sum_{i=1}^{50} (\bar{x}_i - \bar{x}_{\bar{x}})^2 = 0.9298. \]

Suppose 50 random samples of size n = 10 are taken from a population having the discrete uniform distribution f(x) = 0.1 for x = 0, 1, 2, …, 9 and f(x) = 0 for other values of x.
Theorem 1: If a random sample of size n is taken from an infinite population having the mean $\mu$ and the variance $\sigma^2$, then the sample mean $\bar{X}$ has mean $\mu$ and variance $\sigma^2/n$.
The population mean and variance are
\[ \mu = \sum_{x=0}^{9} x f(x) = \frac{1}{10} \sum_{x=0}^{9} x = 4.5, \]
\[ \sigma^2 = \sum_{x=0}^{9} (x - \mu)^2 f(x) = \frac{1}{10} \sum_{x=0}^{9} (x - 4.5)^2 = 8.25. \]
By Theorem 1, the mean and variance of the sample mean $\bar{X}$ are respectively
\[ \mu_{\bar{X}} = \mu = 4.5, \qquad \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} = \frac{8.25}{10} = 0.825. \]
These theoretical values are close to those computed from the 50 sample means:
\[ \bar{x}_{\bar{x}} = \frac{1}{50} \sum_{i=1}^{50} \bar{x}_i = 4.428, \qquad s_{\bar{x}}^2 = \frac{1}{49} \sum_{i=1}^{50} (\bar{x}_i - \bar{x}_{\bar{x}})^2 = 0.9298. \]
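
The comparison between theory and the 50 observed sample means can be reproduced as follows. This is a minimal sketch: the variable names are illustrative, exact fractions are used for the population quantities, and `statistics.variance` is assumed to match the slide's sample variance because it divides by one less than the number of values (here 49).

```python
from fractions import Fraction
from statistics import mean, variance

# The 50 sample means as listed on the slide.
MEANS = [
    4.4, 3.2, 5.0, 3.5, 4.1, 4.4, 3.6, 6.5, 5.3, 4.4,
    3.1, 5.3, 3.8, 4.3, 3.3, 5.0, 4.9, 4.8, 3.1, 5.3,
    3.0, 3.0, 4.6, 5.8, 4.6, 4.0, 3.7, 5.2, 3.7, 3.8,
    5.3, 5.5, 4.8, 6.4, 4.9, 6.5, 3.5, 4.5, 4.9, 5.3,
    3.6, 2.7, 4.0, 5.0, 2.6, 4.2, 4.4, 5.6, 4.7, 4.3,
]

n = 10                    # size of each sample
support = range(10)       # x = 0, 1, ..., 9
f = Fraction(1, 10)       # f(x) = 0.1 on the support

mu = sum(x * f for x in support)                    # population mean, 9/2
sigma2 = sum((x - mu) ** 2 * f for x in support)    # population variance, 33/4
var_xbar = sigma2 / n                               # Theorem 1(b): sigma^2 / n

print(f"theory : mean {float(mu):.3f}, variance {float(var_xbar):.4f}")
print(f"samples: mean {mean(MEANS):.3f}, variance {variance(MEANS):.4f}")
```

With only 50 samples some gap between the two lines is expected; the agreement should tighten as more samples are drawn.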

Central Limit Theorem
Theorem 3: If $\bar{X}$ is the mean of a random sample of size n taken from a population having the mean $\mu$ and the variance $\sigma^2$, then
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
is a random variable whose distribution approaches that of the standard normal distribution as $n \to \infty$.

X1, X2, …, Xn are independent random variables with p.d.f. $p_{x_1}, p_{x_2}, \dots, p_{x_n}$ respectively. For Y = X1 + X2 + … + Xn, the p.d.f. of Y is
\[ p_y(y) = p_{x_1} * p_{x_2} * \cdots * p_{x_n}, \]
where $*$ is convolution.

Central Limit Theorem
Theorem: If n is very large, then whatever the individual $p_{x_i}$ (with finite means $\mu_i$ and variances $\sigma_i^2$), the p.d.f. of Y is approximately normal:
\[ p_y(y) \approx \frac{1}{\sqrt{2\pi}\,\sigma_y}\, e^{-(y - \mu_y)^2 / (2\sigma_y^2)}, \]
where
\[ \mu_y = \mu_1 + \mu_2 + \cdots + \mu_n, \qquad \sigma_y^2 = \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2. \]
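
The convolution statement can be tried numerically for the discrete uniform population on 0, …, 9 used earlier in these slides. The sketch below is illustrative only: it works with probability mass functions rather than densities, the helper names are assumptions, and it compares the exact distribution of Y = X1 + … + Xn for n = 10 with the normal density having mean n × 4.5 and variance n × 8.25.

```python
from math import exp, pi, sqrt

def convolve(p, q):
    """Discrete convolution of two pmfs stored as lists indexed from 0."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pmf_of_sum(pmf, n):
    """pmf of Y = X1 + ... + Xn for independent Xi sharing the given pmf."""
    result = [1.0]            # pmf of the constant 0
    for _ in range(n):
        result = convolve(result, pmf)
    return result

def normal_pdf(y, mu, sigma2):
    """Normal density with mean mu and variance sigma2, evaluated at y."""
    return exp(-(y - mu) ** 2 / (2 * sigma2)) / sqrt(2 * pi * sigma2)

uniform = [0.1] * 10          # f(x) = 0.1 for x = 0, ..., 9
n = 10
p_y = pmf_of_sum(uniform, n)
mu_y, sigma2_y = n * 4.5, n * 8.25   # sums of the individual means and variances

max_gap = max(abs(p - normal_pdf(y, mu_y, sigma2_y)) for y, p in enumerate(p_y))
print(f"largest |p_y(y) - normal density| for n={n}: {max_gap:.5f}")
```

Since the values of Y are one unit apart, the probability mass at each y is directly comparable with the normal density at that point.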