ECON1203/ECON2292 Business and Economic Statistics
Week 6
Week 6 topics
• Normal distribution
  • Calculating areas under the normal curve
  • Normal approximation to the binomial
• Concept of an estimator
  • Properties of estimators
• Key references
  • Keller 8.2, 9.2 pp 321-6, 10.1
Normal distribution
• Plays a pivotal role in statistical theory & practice
• In practice many continuous variables, e.g. heights of men & women, have distributions that are bell-shaped and well approximated by normal distributions
• Will play a prominent role in the upcoming discussion of statistical inference & estimators
• A normal rv X is a continuous rv
  • P(X = x) = 0 for every x
  • P(a < X < b) = area under the pdf curve between a & b
  • Completely characterized by its mean μ & variance σ²
• Graphically, the probability density function (pdf) is symmetric, unimodal & bell-shaped
  • ⇒ mean = median = mode
• Basic features include
  • Range of “support” is unlimited, so -∞ < x < ∞
  • Despite the unlimited range there is little area in the tails of a normal distribution
  • 4.5% outside μ ± 2σ; 0.3% outside μ ± 3σ (confirm from tables)
• Relevant tables in Keller, Table 3, p. B8-B9
  • Also in tutorial program
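A quick numerical check of the tail areas quoted above can stand in for the tables. The sketch below is only illustrative: it assumes Python's standard-library `statistics.NormalDist`, and the helper name `area_outside` is made up for this example.

```python
from statistics import NormalDist

# Standard normal: tail areas measured in standard deviations do not
# depend on the particular mean and standard deviation.
Z = NormalDist(mu=0.0, sigma=1.0)

def area_outside(k: float) -> float:
    """Probability that a normal rv lies more than k sd's from its mean."""
    return 2.0 * (1.0 - Z.cdf(k))  # two equal tails, by symmetry

outside_2sd = area_outside(2.0)
outside_3sd = area_outside(3.0)
print(f"outside mean +/- 2 sd: {outside_2sd:.4f}")
print(f"outside mean +/- 3 sd: {outside_3sd:.4f}")
```

To four decimals the areas are 0.0455 and 0.0027, in line with the rounded 4.5% and 0.3% figures.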
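• If X is normally distributed with mean μ and variance σ² then we write X ~ N(μ, σ²)
• The algebraic formula for the pdf of X:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty$$

where $e \approx 2.718$ and $\pi \approx 3.14$. When $\mu = 0$, $\sigma = 1$ the pdf becomes

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}, \qquad -\infty < x < \infty$$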
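The pdf formula can be evaluated directly. The following is a minimal sketch, not part of the course material: the function name `normal_pdf` and the values μ = 2, σ = 3 are hypothetical, chosen only to compare the hand-coded formula with the standard library's `NormalDist.pdf`.

```python
import math
from statistics import NormalDist

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) at x, written out from the formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Height of the standard normal curve at its peak: 1 / sqrt(2 * pi).
peak_standard = normal_pdf(0.0)

# Hypothetical mean and sd, used only to cross-check against the library.
mine = normal_pdf(1.5, mu=2.0, sigma=3.0)
library = NormalDist(mu=2.0, sigma=3.0).pdf(1.5)

print(f"peak of standard normal pdf: {peak_standard:.4f}")
print(f"formula vs library: {mine:.6f} vs {library:.6f}")
```

Note that the pdf value is a height, not a probability; probabilities come from areas under the curve.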
Normal distribution as a model
“All models are wrong but some are useful” (George Box)
• When is a normal distribution appropriate?
  • No “exact” way to justify!
  • Plot a histogram to see if it has a bell shape
  • Use past experience
  • There might be theoretical or context-specific reasons to justify why a normal is useful
  • Can also use statistical hypothesis testing methods to assess whether a normal “model” might be reasonable
    • Refer to Week 12
Standard normal
• Recall Z scores
  • Z = (X - μ)/σ
  • This standardization yields a rv with zero mean & standard deviation of one
• How are Z & X related?
  • Z = a + bX (where a = -(μ/σ) and b = 1/σ)
  • So Z is a linear function (linear transformation) of X
  • An important theorem tells us the following: linear combinations of normal rv's are also normal
  • So Z is a normal rv, called a standard normal rv. Write Z ~ N(0,1)
• This result assists calculations of probabilities
  • Any probability calculation involving X ~ N(μ, σ²) translates into an equivalent statement using Z ~ N(0,1)
Calculating normal probabilities
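• If X ~ N(μ, σ²), how do we compute P(a < X < b)?

$$P(a < X < b) = \int_a^b \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\, dx$$

$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) = P(u < Z < v)$$

where $u = \frac{a-\mu}{\sigma}$ and $v = \frac{b-\mu}{\sigma}$.

• The integral has no closed form. We use standard normal tables to compute P(u < Z < v).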
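The equivalence between integrating the pdf of X and looking up a standard normal probability can be checked numerically. This is a sketch under stated assumptions: the error function `math.erf` plays the role of the table, the midpoint rule is just one simple way to approximate the integral, and the values a = 1, b = 4, μ = 2, σ = 3 are hypothetical.

```python
import math

def phi(z: float) -> float:
    """Standard normal cdf P(Z < z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_by_standardizing(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X < b) as P(u < Z < v) with u, v the standardized limits."""
    u = (a - mu) / sigma
    v = (b - mu) / sigma
    return phi(v) - phi(u)

def prob_by_integration(a: float, b: float, mu: float, sigma: float,
                        steps: int = 10_000) -> float:
    """Midpoint-rule approximation to the integral of the N(mu, sigma^2) pdf."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h          # midpoint of the i-th slice
        z = (x - mu) / sigma
        total += math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return total * h

# Hypothetical values chosen only for illustration.
p_std = prob_by_standardizing(1.0, 4.0, mu=2.0, sigma=3.0)
p_int = prob_by_integration(1.0, 4.0, mu=2.0, sigma=3.0)
print(f"standardizing: {p_std:.6f}, numerical integration: {p_int:.6f}")
```

The two routes agree to many decimal places, which is why a single table for Z is enough for every normal distribution.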
Standard normal tables
• Be careful as these can come in different forms
  • “BES” tables provide P(0 < Z < z), z > 0
  • Keller provides P(-∞ < Z < z)
• Use the table (next slide) to compute
  • What is P(0 < Z < 0.50)?
  • What is P(0 < Z < 1.00)?
  • What is P(0 < Z < 1.96)?
How to use the table to compute
• P(Z < z), z > 0?
• P(Z > z)?
• P(z1 < Z < z2)?
• P(-z < Z < 0)?
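One way to see how each of these probabilities follows from a table of P(0 < Z < z) is to code the table lookup once and build everything else from it by symmetry. The sketch below is illustrative only: `table_area` imitates a table entry using `statistics.NormalDist`, and all function names are invented for this example.

```python
from statistics import NormalDist

_Z = NormalDist()  # standard normal: mean 0, sd 1

def table_area(z: float) -> float:
    """Imitate a table entry P(0 < Z < z), available only for z >= 0."""
    if z < 0:
        raise ValueError("the table only lists z >= 0")
    return _Z.cdf(z) - 0.5

def p_less(z: float) -> float:
    """P(Z < z) for z > 0: the left half (0.5) plus the table area."""
    return 0.5 + table_area(z)

def p_greater(z: float) -> float:
    """P(Z > z) for z > 0: the right half (0.5) minus the table area."""
    return 0.5 - table_area(z)

def p_neg_to_zero(z: float) -> float:
    """P(-z < Z < 0) for z > 0: equal to P(0 < Z < z) by symmetry."""
    return table_area(z)

def p_between(z1: float, z2: float) -> float:
    """P(z1 < Z < z2) for z1 < z2, using signed table areas."""
    def signed(z: float) -> float:
        return table_area(z) if z >= 0 else -table_area(-z)
    return signed(z2) - signed(z1)

print(f"P(0 < Z < 1.96)     = {table_area(1.96):.4f}")
print(f"P(Z < 1.96)         = {p_less(1.96):.4f}")
print(f"P(Z > 1.96)         = {p_greater(1.96):.4f}")
print(f"P(-1.96 < Z < 1.96) = {p_between(-1.96, 1.96):.4f}")
```

Every helper calls only `table_area`, mirroring how a single table of positive z values covers all four cases.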
Calculating normal probabilities
• Suppose the time (in minutes) taken to assemble a computer is assumed to be X ~ N(50, 100). What is the probability that a computer is assembled in 45 to 60 minutes?

$$\begin{aligned}
P(45 < X < 60) &= P\left(\frac{45-50}{10} < \frac{X-\mu}{\sigma} < \frac{60-50}{10}\right) \\
&= P(-0.5 < Z < 1) \\
&= P(-0.5 < Z < 0) + P(0 < Z < 1) \\
&= P(0 < Z < 0.5) + P(0 < Z < 1) \\
&= 0.1915 + 0.3413 = 0.5328
\end{aligned}$$

• Probability of assembly time being between 45 & 60 minutes is 0.5328
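The assembly-time calculation can be reproduced in a few lines. This is a sketch assuming Python's `statistics.NormalDist`; note that it takes the standard deviation (10), not the variance (100), as its second argument.

```python
from statistics import NormalDist

# X ~ N(50, 100): variance 100 means standard deviation 10.
X = NormalDist(mu=50, sigma=10)
direct = X.cdf(60) - X.cdf(45)

# Same answer via the table route: standardize, then add the two areas
# P(0 < Z < 0.5) and P(0 < Z < 1) on either side of zero.
Z = NormalDist()
u = (45 - 50) / 10   # -0.5
v = (60 - 50) / 10   #  1.0
via_table = (Z.cdf(-u) - 0.5) + (Z.cdf(v) - 0.5)

print(f"P(45 < X < 60) directly : {direct:.4f}")
print(f"P(45 < X < 60) via Z    : {via_table:.4f}")
```

Both routes give 0.5328 to four decimals, matching the table-based working.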
Calculating normal percentiles
• Tables can be used to solve 2 types of problems
  • Given a particular z find P(0 < Z < z), or
  • Given a particular probability A find z_A such that P(Z > z_A) = A or P(Z < z_A) = 1 - A
• Note that z_A is the 100(1-A)th percentile of a standard normal
• Use tables to verify that z_0.025 = 1.96
• What is the
  • 97.5th percentile?
  • 2.5th percentile?
  • 97.5th percentile in the computer assembly line example?
    1.96 = (x_0.025 - 50)/10
    x_0.025 = (1.96)10 + 50 = 69.6
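Percentiles are the inverse of the table lookup, and the standard library exposes this as `NormalDist.inv_cdf`. The following is a hedged sketch of the percentile questions above; the variable names are invented for this example.

```python
from statistics import NormalDist

Z = NormalDist()                    # standard normal
X = NormalDist(mu=50, sigma=10)     # assembly time, sd = sqrt(100)

z_975 = Z.inv_cdf(0.975)            # 97.5th percentile of Z, i.e. z_0.025
z_025 = Z.inv_cdf(0.025)            # 2.5th percentile of Z
x_975 = X.inv_cdf(0.975)            # 97.5th percentile of assembly time

# "Un-standardizing" the z percentile gives the same x value.
x_from_z = 50 + 10 * z_975

print(f"z_0.025 = {z_975:.2f}, 2.5th percentile = {z_025:.2f}")
print(f"97.5th percentile of X = {x_975:.1f} minutes")
```

By symmetry the 2.5th percentile is simply the negative of the 97.5th.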
Normal approximation to the binomial
• Have used the formula or tables to evaluate probabilities for a binomial rv
  • Convenient for a small number of trials (n)
  • What if n is large?
• An important application of the normal distribution is to approximate the binomial for large n
  • See example from Keller, Fig. 9.8, p. 310, where n = 20, p = 0.5
  • Do you think the approximation would be better or worse if p = 0.2?
• Denote the binomial rv by X_B
  • Know that E(X_B) = np = 10 & Var(X_B) = np(1-p) = 5
  • Natural to choose the approximating normal rv as X_N ~ N(10, 5)
• How good is the approximation?
  • Consider P(10 ≤ X_B ≤ 12) = 0.1762 + 0.1602 + 0.1201 = 0.4565 while P(10 ≤ X_N ≤ 12) = P(0 ≤ Z ≤ 0.89) = 0.3133
• What went wrong with the approximation?
  • Would you approximate P(X_B = 10) by P(X_N = 10)?
• Need a continuity correction (Keller p. 311)
  • Would approximate P(X_B = 10) by P(9.5 ≤ X_N ≤ 10.5)
• In general approximate
  • P(X_B ≤ x) by P(X_N ≤ x + 0.5)
  • P(X_B ≥ x) by P(X_N ≥ x - 0.5)
  • How do you approximate P(X_B < x)?
• Now reconsider the approximation
  • P(10 ≤ X_B ≤ 12) = 0.4565
  • Use P(9.5 ≤ X_N ≤ 12.5) instead of P(10 ≤ X_N ≤ 12)
  • Does this improve the approximation?
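$$\begin{aligned}
P(9.5 \le X_N \le 12.5) &= P\left(\frac{9.5-10}{\sqrt{5}} \le \frac{X_N-\mu}{\sigma} \le \frac{12.5-10}{\sqrt{5}}\right) \\
&= P(-0.22 \le Z \le 1.12) \\
&= P(0 \le Z \le 0.22) + P(0 \le Z \le 1.12) \\
&= 0.0871 + 0.3686 = 0.4557
\end{aligned}$$

Clearly 0.4557 ≈ 0.4565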
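The effect of the continuity correction can be seen by computing the exact binomial probability alongside both normal approximations. This is a sketch assuming n = 20 and p = 0.5 as in the example; the helper `binom_pmf` is written out here rather than taken from any library.

```python
import math
from statistics import NormalDist

n, p = 20, 0.5

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X_B = k) for a binomial rv with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Exact binomial probability P(10 <= X_B <= 12).
exact = sum(binom_pmf(k, n, p) for k in range(10, 13))

# Approximating normal with the same mean and variance.
XN = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
naive = XN.cdf(12) - XN.cdf(10)          # no continuity correction
corrected = XN.cdf(12.5) - XN.cdf(9.5)   # with continuity correction

print(f"exact binomial      : {exact:.4f}")
print(f"normal, uncorrected : {naive:.4f}")
print(f"normal, corrected   : {corrected:.4f}")
```

The corrected value differs slightly from the hand calculation because the code does not round z to two decimals before the lookup, but the conclusion is the same: the correction removes most of the error.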
Airline meals
• On today's flight all 160 passengers are offered a lunch choice of beef or chicken
• Past data indicate 60% choose beef over chicken
• Passenger choices appear to be independent
• On this flight what is the probability that more than 110 passengers will choose beef?
• Binomial distribution is appropriate but n = 160 is large
• For the normal approximation to the binomial use:

$$\mu = np = 160(0.6) = 96, \qquad \sigma = \sqrt{np(1-p)} = \sqrt{160(0.6)(0.4)} = 6.197$$

$$P(X_B > 110) \approx P(X_N > 110.5) = P\left(Z > \frac{110.5 - 96}{6.197}\right) = P(Z > 2.34) = 0.5 - P(0 < Z < 2.34) = 0.0096$$

• Thus the airline could justify taking only 110 beef meals on the flight, as there is only approximately a 1 in 100 chance that they will run out of beef meals.
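With a computer the exact binomial tail is also within reach, so the approximation can be compared against it. The sketch below is illustrative and assumes only the figures given in the example (n = 160, p = 0.6).

```python
import math
from statistics import NormalDist

n, p = 160, 0.6

# Parameters of the approximating normal.
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# Normal approximation with continuity correction: P(X_B > 110) ~ P(X_N > 110.5).
approx = 1.0 - NormalDist(mu, sigma).cdf(110.5)

# Exact binomial tail: sum P(X_B = k) for k = 111, ..., 160.
exact = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
            for k in range(111, n + 1))

print(f"mu = {mu:.0f}, sigma = {sigma:.3f}")
print(f"normal approximation : {approx:.4f}")
print(f"exact binomial tail  : {exact:.4f}")
```

Either way the chance of running out of beef meals is roughly 1 in 100.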
Estimation
• Inferential statistics
  • Extracting information about population parameters on the basis of sample statistics
  • “Past data indicates 60% choose beef over chicken”
  • What does a sample mean tell us about the population mean?
• In practical situations parameters are unknown because they are difficult or impossible to determine
  • Using sample statistics may be the only practical alternative
• Estimators
  • Consider a generic parameter θ that characterizes the pdf, f(x), of a rv X
  • Suppose X1, X2, ..., Xn is a sample of size n from f(x)
  • A statistic is any function of sample data
  • An estimator is a statistic whose purpose is to estimate a parameter or some function thereof
  • A point estimator is simply a formula (rule) for combining sample information to produce a single number to estimate θ
• Estimators are random variables (because they are a function of rv's X1, X2, ..., Xn)
• Examples of point estimators
  • Sample mean is a point estimator for the population mean
  • Sample variance is an estimator of the population variance
  • Why does it make no sense to expect an estimator to always produce an estimate equal to the parameter of interest?
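A small simulation makes the point that an estimator is a random variable: the same rule applied to different samples gives different estimates. This is only a sketch; the population N(50, 100), the sample size 25 and the seed are arbitrary choices for illustration.

```python
import random

rng = random.Random(1203)        # fixed seed so the run is repeatable
mu, sigma, n = 50.0, 10.0, 25    # hypothetical population and sample size

def sample_mean(rng: random.Random, n: int) -> float:
    """Draw one sample of size n and return its sample mean."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    return sum(xs) / n

# Five independent samples give five different estimates of the same mu.
estimates = [sample_mean(rng, n) for _ in range(5)]
for i, est in enumerate(estimates, start=1):
    print(f"sample {i}: estimate of mu = {est:.2f}")
```

Each estimate lands near 50 but none equals it exactly, since a continuous rv takes any single value with probability zero.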
Properties of estimators
• Sample mean is a ‘natural’ choice of estimator for the population mean
  • But there may be other (better?) estimators
  • Why not use (n-1)s²/n as an estimator for σ²?
• Desirable properties of estimators
  • Unbiasedness: On average, does the estimator achieve the correct value?
  • Consistency: As the sample size gets larger does the probability that the estimator deviates from the parameter by more than a ‘small’ amount get small?
  • Relative efficiency: If there are two competing estimators of a parameter, does one have less expected dispersion?
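• Suppose $\hat{\theta}$ is an estimator of $\theta$; then $\hat{\theta}$ is an unbiased estimator of $\theta$ if

$$E(\hat{\theta}) = \theta$$

• $\hat{\theta}$ is a consistent estimator of $\theta$ if for any small $\varepsilon > 0$

$$\lim_{n \to \infty} P(|\hat{\theta} - \theta| < \varepsilon) = 1$$

• If $\hat{\theta}$ & an alternative estimator $\tilde{\theta}$ are both unbiased then $\hat{\theta}$ is relatively efficient if

$$\operatorname{var}(\hat{\theta}) < \operatorname{var}(\tilde{\theta})$$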
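The consistency definition can be illustrated for the sample mean. The sketch below rests on an assumption not derived in this section: for independent normal observations the sample mean is itself normal with mean μ and variance σ²/n (a linear combination of normal rv's). The values μ = 50, σ = 10 and ε = 1 are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()
mu, sigma, eps = 50.0, 10.0, 1.0   # hypothetical population and tolerance

def prob_within(n: int) -> float:
    """P(|Xbar - mu| < eps), assuming Xbar ~ N(mu, sigma^2 / n)."""
    se = sigma / math.sqrt(n)       # standard deviation of the sample mean
    return 2.0 * Z.cdf(eps / se) - 1.0

sizes = (10, 100, 1_000, 10_000)
probs = [prob_within(n) for n in sizes]
for n, pr in zip(sizes, probs):
    print(f"n = {n:>6}: P(|Xbar - mu| < {eps}) = {pr:.4f}")
```

The probability climbs towards 1 as n grows, which is what the limit in the definition requires.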
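• Recall our definition of the sample variance as

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} \quad \text{rather than} \quad \hat{\sigma}^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}$$

• $E(s^2) = \sigma^2$ & thus $s^2$ is an unbiased estimator of $\sigma^2$, but

$$E(\hat{\sigma}^2) = E\left(\frac{n-1}{n}\, s^2\right) = \frac{n-1}{n}\,\sigma^2 \ne \sigma^2$$

& thus $\hat{\sigma}^2$ is a biased estimator
• However, for large n the bias is small & $\hat{\sigma}^2$ is a consistent estimator of $\sigma^2$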
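The bias of the divide-by-n estimator shows up clearly in a simulation with small samples. This is a sketch under assumed settings (normal data with σ² = 4, samples of size 5, 20,000 replications, fixed seed), none of which come from the slides.

```python
import random

rng = random.Random(6)      # fixed seed so the run is repeatable
sigma2 = 4.0                # true population variance (sd = 2)
n, reps = 5, 20_000         # small n makes the bias easy to see

sum_s2 = 0.0                # running total of the divide-by-(n-1) estimator
sum_hat = 0.0               # running total of the divide-by-n estimator
for _ in range(reps):
    xs = [rng.gauss(0.0, 2.0) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)   # sum of squared deviations
    sum_s2 += ss / (n - 1)
    sum_hat += ss / n

mean_s2 = sum_s2 / reps
mean_hat = sum_hat / reps
print(f"average of s^2         : {mean_s2:.3f} (target {sigma2})")
print(f"average of sigma-hat^2 : {mean_hat:.3f} (theory {(n - 1) / n * sigma2})")
```

The average of s² sits close to 4 while the divide-by-n version averages close to (n-1)/n × 4 = 3.2; rerunning with a larger n shrinks the gap.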
Progress report #4
• Have introduced a selection of distributions
  • Binomial, uniform & normal
  • These enable us to model a range of phenomena
  • Normal also plays a key role in the theory of estimation
• Have introduced the basics of estimation
  • Need to understand better the notion of a point estimator as a rv
    • Leads us to sampling distributions
  • Role of the normal distribution in the theory of estimation comes through the Central Limit Theorem
  • Can also use interval rather than point estimators
    • Leads us to confidence intervals