The Normal Distribution
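[Figure: bell-shaped density f(x) plotted against x, centered at $\mu$ with standard deviation $\sigma$; mean = median = mode]
• It is bell-shaped
• It is symmetrical around the mean
• The random variable has an infinite theoretical range: $-\infty$ to $+\infty$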
• If a random variable X has a normal distribution with mean $\mu$ and variance $\sigma^2$, this is written
\[ X \sim N(\mu, \sigma^2) \]
• where the probability density function is
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]
• The cumulative distribution function is
\[ F(x_0) = P(X \le x_0) = \int_{-\infty}^{x_0} f(x)\,dx \]
[Figure: area under f(x) to the left of $x_0$]
• The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean and half is below:
\[ P(-\infty < X < \infty) = 1.0, \qquad P(-\infty < X < \mu) = 0.5, \qquad P(\mu < X < \infty) = 0.5 \]
The Standardized Normal (Standard Normal Distribution)
• Any normal distribution can be transformed into the standardized normal distribution (Z), with mean 0 and variance 1:
\[ Z = \frac{X-\mu}{\sigma}, \qquad Z \sim N(0, 1) \]
[Figure: standard normal density f(Z), centered at 0 with standard deviation 1]
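• Note that the distribution is the same; only the scale is standardized. With F denoting the standard normal cumulative distribution function,
\[ P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) = F\left(\frac{b-\mu}{\sigma}\right) - F\left(\frac{a-\mu}{\sigma}\right) \]
[Figure: the area between a and b under f(x) equals the area between $(a-\mu)/\sigma$ and $(b-\mu)/\sigma$ on the Z scale]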
• The Standardized Normal Table gives the cumulative probability for any value of z
• Ex: $X \sim N(8, 25) \Rightarrow P(X < 8.6) = ?$
\[ Z = \frac{X-\mu}{\sigma} = \frac{8.6-8}{5} = 0.12, \qquad P(Z < 0.12) = 0.5478 \]
[Figure: P(X < 8.6) under $\mu = 8$, $\sigma = 5$ equals P(Z < 0.12) under $\mu = 0$, $\sigma = 1$]
• 54.78% of the values that the random variable X can take lie below 8.6
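The table lookup in this example can be checked numerically. The sketch below assumes Python 3.8+ and uses `statistics.NormalDist` from the standard library in place of the printed table; the variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0            # X ~ N(8, 25), so sigma = sqrt(25) = 5
x0 = 8.6

z = (x0 - mu) / sigma           # standardize: Z = (X - mu) / sigma
p_via_z = NormalDist().cdf(z)   # P(Z < z) on the standard normal scale
p_direct = NormalDist(mu, sigma).cdf(x0)  # P(X < x0) on the original scale

print(f"z = {z:.2f}")
print(f"P(Z < z)  = {p_via_z:.4f}")
print(f"P(X < x0) = {p_direct:.4f}")
```

Both routes give the same probability, which is the point of standardizing: the table for Z serves every normal distribution.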
• For negative Z-values, use the fact that the distribution is symmetric
• Ex: $P(Z < -2.00) = ?$
\[ P(Z < -2.00) = P(Z > 2.00) = 1 - P(Z < 2.00) = 1 - 0.9772 = 0.0228 \]
[Figure: the area 0.0228 to the left of -2.00 mirrors the area 0.0228 to the right of 2.00; the remaining area is 0.9772]
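• Ex: Finding the X value for a known probability
- If $X \sim N(8, 25)$, which value of X lies above 20% of all the values X can take?
[Figure: area 0.20 to the left of the unknown X value (mean 8.0); on the Z scale, area 0.80 lies to the left of 0.84]
• From the standard normal table, the Z value in question has magnitude 0.84; because the point lies below the mean, Z = -0.84. Then
\[ Z = \frac{X-\mu}{\sigma} \;\Rightarrow\; X = \mu + Z\sigma = 8 + (-0.84)(5) = 3.8 \]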
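The reverse lookup (probability to X value) can be sketched the same way. This assumes Python 3.8+ and `statistics.NormalDist.inv_cdf` standing in for the printed table; the names `z20` and `x20` are made up for the illustration.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0                # X ~ N(8, 25)

z20 = NormalDist().inv_cdf(0.20)    # Z value with 20% of the area to its left
x20 = mu + z20 * sigma              # undo the standardization: X = mu + Z * sigma

print(f"z = {z20:.4f}, x = {x20:.2f}")
```

The computed Z value is slightly more precise than the two-decimal table entry, so the resulting X differs from 3.8 only in the second decimal.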
Lognormal Distribution
• If X (= ln Y) is normally distributed with mean $\mu$ and standard deviation $\sigma$, then Y has a lognormal distribution:
\[ \ln(Y) \sim N(\mu, \sigma^2) \]
• The lognormal distribution is used to model continuous random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables
• The lognormal is skewed to the right: the logarithm compresses large values (ln 100 = 4.6, ln 10 = 2.3), so a symmetric distribution for ln Y becomes a long right tail for Y
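Right skew can be seen in a small simulation. The sketch below is an illustration under assumed values ($\mu = 0$, $\sigma = 1$, 20,000 draws, fixed seed), none of which come from the slides: it exponentiates normal draws and compares the mean with the median.

```python
import math
import random
from statistics import mean, median

random.seed(0)                      # fixed seed so the run is repeatable
mu, sigma = 0.0, 1.0                # assumed parameters of ln(Y)

# Y = exp(X) with X ~ N(mu, sigma^2) is lognormal
y = [math.exp(random.gauss(mu, sigma)) for _ in range(20_000)]

y_mean, y_median = mean(y), median(y)
print(f"mean = {y_mean:.3f}, median = {y_median:.3f}")
```

For a right-skewed distribution the mean sits above the median; here the theoretical values are $e^{\mu} = 1$ for the median and $e^{\mu + \sigma^2/2} \approx 1.65$ for the mean.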
DISTRIBUTION OF SAMPLE STATISTICS
Sampling from a Population
• Example: Suppose our population consists of the numbers 2, 4, 6, 6, 7, 8
• We can select a sample of 3 elements from these numbers. Let these be 2, 6, 7.
• The mean of these 3 numbers is 5.
• The population mean, on the other hand, is 5.5.
• If we keep drawing samples:
Sample      Mean
2, 6, 7     5
2, 7, 8     5.67
4, 7, 8     6.33
2, 4, 7     4.33
• This gives us an idea of how much the means of 3-element samples can vary (4.33, 5, ..., 5.67); this is the distribution of sample means
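Because the population is tiny, every possible 3-element sample can be listed. The sketch below does this with `itertools.combinations`, treating the two 6s as distinct items and sampling without replacement (an assumption; the slides do not say how samples are drawn).

```python
from itertools import combinations

population = [2, 4, 6, 6, 7, 8]
population_mean = sum(population) / len(population)

# All 3-element samples; combinations() works by position,
# so the two 6s are treated as different items.
samples = list(combinations(population, 3))
sample_means = [sum(s) / len(s) for s in samples]

mean_of_means = sum(sample_means) / len(sample_means)
print(f"{len(samples)} samples, means from {min(sample_means):.2f} "
      f"to {max(sample_means):.2f}")
print(f"population mean = {population_mean}, mean of sample means = {mean_of_means:.2f}")
```

Individual sample means vary, but their average over all possible samples equals the population mean.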
Sampling Distribution of Sample Means
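• Central Limit Theorem: As n becomes large, the distribution of
\[ Z = \frac{\bar{X}-\mu}{\sigma_{\bar{X}}} = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \]
approaches the standard normal distribution regardless of the underlying probability distribution. That is, approximately,
\[ \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \]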
• The standard deviation of the distribution of $\bar{X}$, which is $\sigma/\sqrt{n}$, decreases as the sample size n increases
• Law of large numbers: the central limit theorem states that, for large n, $\bar{X} \sim N(\mu, \sigma^2/n)$.
• Hence, as n becomes large, the variance $\sigma^2/n$ shrinks toward zero and the sample mean $\bar{X}$ converges to the population mean $\mu$.
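The shrinking spread of $\bar{X}$ can be illustrated by simulation. The sketch below draws from a Uniform(0, 1) population, chosen here only because it is clearly not normal; the sample sizes, replication count, and seed are likewise assumptions made for the illustration.

```python
import random
from math import sqrt
from statistics import stdev

random.seed(1)                      # fixed seed so the run is repeatable
SIGMA = sqrt(1 / 12)                # standard deviation of Uniform(0, 1)

def sd_of_sample_means(n, reps=5000):
    """Simulate `reps` samples of size n and return the sd of their means."""
    means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]
    return stdev(means)

# Map each sample size to (simulated sd, theoretical sigma / sqrt(n))
results = {n: (sd_of_sample_means(n), SIGMA / sqrt(n)) for n in (4, 16, 64)}
for n, (simulated, theoretical) in results.items():
    print(f"n = {n:3d}: simulated sd = {simulated:.4f}, sigma/sqrt(n) = {theoretical:.4f}")
```

Each fourfold increase in n should roughly halve the standard deviation of the sample mean.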
CONFIDENCE INTERVAL ESTIMATION: ONE POPULATION
• A point estimator of a population parameter is a function of the sample information that yields a single number
• An interval estimator of a population parameter is a rule for determining (based on the sample information) a range, or interval, in which the parameter is likely to fall
Interval Estimation
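• Let $\theta$ be an unknown population parameter, and suppose the sample information yields endpoints a and b such that
\[ P(a < \theta < b) = 1 - \alpha \]
- the quantity $100(1-\alpha)\%$ is called the confidence level of the interval
- the interval from a to b is called the $100(1-\alpha)\%$ confidence interval of $\theta$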
Confidence Interval Estimation for the Mean of a Normal Distribution: Population Variance Known
• Example: Suppose we draw a sample of n observations from a population with mean $\mu$ and standard deviation $\sigma$, and want to estimate the population mean with an interval
• For instance, if we are interested only in the central 90% of the distribution, we cut off 5% from each tail
• Cutting off the right tail gives the Z value 1.645; cutting off the left tail gives its mirror image, -1.645
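- The 90% confidence interval can be found as follows:
\[ 0.90 = P(-1.645 < Z < 1.645) \]
\[ -1.645 < \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} < 1.645 \]
\[ -\frac{1.645\,\sigma}{\sqrt{n}} < \bar{x}-\mu < \frac{1.645\,\sigma}{\sqrt{n}} \]
\[ \bar{x} - \frac{1.645\,\sigma}{\sqrt{n}} < \mu < \bar{x} + \frac{1.645\,\sigma}{\sqrt{n}} \]
Moving 1.645 standard deviations of the sample mean ($\sigma/\sqrt{n}$) to the right and to the left of the sample mean gives the 90% confidence interval for the population mean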
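The derivation above translates directly into a few lines of code. This is a minimal sketch assuming Python 3.8+; the function name `z_confidence_interval` and the input values (sample mean 50, $\sigma = 10$, n = 25) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, level=0.90):
    """Interval for the mean when the population sigma is known."""
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # e.g. about 1.645 for 90%
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical inputs, not taken from the slides
lo, hi = z_confidence_interval(xbar=50.0, sigma=10.0, n=25, level=0.90)
print(f"90% CI: ({lo:.2f}, {hi:.2f})")
```

Passing `level=0.95` or `level=0.99` widens the interval, since the critical value grows with the confidence level.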
• Different samples give different confidence intervals for $\mu$
[Figure: confidence intervals for $\mu$ obtained from repeated samples]
• 90% of these confidence intervals will contain $\mu$
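The 90% coverage claim can be checked by repeated sampling. The sketch below assumes a normal population with $\mu = 5$ and $\sigma = 0.1$, samples of size 16, 2,000 repetitions, and a fixed seed; all of these are illustrative choices.

```python
import random
from math import sqrt

random.seed(42)                     # fixed seed so the run is repeatable
mu, sigma, n = 5.0, 0.1, 16         # assumed population and sample size
half_width = 1.645 * sigma / sqrt(n)

trials = 2000
covered = 0
for _ in range(trials):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    # Does this sample's 90% interval contain the true mean?
    if xbar - half_width < mu < xbar + half_width:
        covered += 1

coverage = covered / trials
print(f"share of intervals containing mu: {coverage:.3f}")
```

The share should be close to 0.90, with small deviations due to simulation noise.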
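• General form of the confidence interval:
\[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]
- Besides 90%, the most commonly used confidence levels are 95% and 99%
• For these, the $\alpha$ values are 5% and 1% respectively
• The z values, defined by $F(z_{\alpha/2}) = 1 - \alpha/2$, are
\[ z_{\alpha/2} = z_{0.025} = 1.96 \]
\[ z_{\alpha/2} = z_{0.005} = 2.575 \]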
Confidence Interval Estimation for the Mean of a Normal Distribution: Population Variance Unknown: The t Distribution
• For a random sample from a normal population with mean $\mu$ and variance $\sigma^2$, the random variable $\bar{X}$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$; i.e.
\[ Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \]
has the standard normal distribution.
• But if $\sigma$ is unknown, the sample estimate $s_x$ is usually used instead:
\[ t = \frac{\bar{X}-\mu}{s_x/\sqrt{n}} \]
In this case the random variable t follows the Student's t distribution with (n - 1) degrees of freedom
• A random variable having the Student's t distribution with $\nu$ degrees of freedom will be denoted $t_\nu$. Then $t_{\nu,\alpha}$ is the number for which
\[ P(t_\nu > t_{\nu,\alpha}) = \alpha \]
• A $100(1-\alpha)\%$ confidence interval for the population mean, variance unknown, is given by
\[ \bar{x} - t_{n-1,\alpha/2}\frac{s_x}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2}\frac{s_x}{\sqrt{n}} \]
• Example: The fuel economy, in miles per gallon, of 6 randomly selected cars is: 18.6, 18.4, 19.2, 20.8, 19.4 and 20.5. If fuel economy in the population from which these cars were drawn is normally distributed, find a 90% confidence interval for the population mean
- Since the population variance is not given, we first compute the sample variance and then apply the t-based interval formula. For the sample variance:
i       x_i      x_i^2
1       18.6     345.96
2       18.4     338.56
3       19.2     368.64
4       20.8     432.64
5       19.4     376.36
6       20.5     420.25
Sums    116.9    2282.41
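• Hence the sample mean is
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{116.9}{6} = 19.48 \]
the sample variance is (carrying $\bar{x} = 19.4833$ to avoid rounding error)
\[ s_x^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1} = \frac{2282.41 - 6\,(19.4833)^2}{5} = 0.96 \]
and the standard deviation is
\[ s_x = \sqrt{0.96} = 0.98 \]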
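• The confidence interval we are looking for is
\[ \bar{x} - t_{n-1,\alpha/2}\frac{s_x}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2}\frac{s_x}{\sqrt{n}} \]
where $n = 6$ and $\alpha/2 = 0.10/2 = 0.05$, so $t_{5,0.05} = 2.015$:
\[ 19.48 - \frac{2.015\,(0.98)}{\sqrt{6}} < \mu < 19.48 + \frac{2.015\,(0.98)}{\sqrt{6}} \]
hence
\[ 18.67 < \mu < 20.29 \]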
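The fuel-economy calculation can be reproduced with the standard library. Python has no built-in t distribution, so the sketch below simply hard-codes the tabulated value $t_{5,0.05} = 2.015$ quoted in the example; everything else is computed from the six observations.

```python
from math import sqrt
from statistics import mean, stdev

mpg = [18.6, 18.4, 19.2, 20.8, 19.4, 20.5]
n = len(mpg)

xbar = mean(mpg)        # sample mean
s_x = stdev(mpg)        # sample standard deviation (divides by n - 1)
t_crit = 2.015          # t_{5, 0.05}, copied from the t table as in the example

half_width = t_crit * s_x / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"xbar = {xbar:.2f}, s_x = {s_x:.2f}")
print(f"90% CI: ({lower:.2f}, {upper:.2f})")
```

Working at full precision moves the lower limit by about 0.01 relative to the hand calculation, which rounds $\bar{x}$ and $s_x$ to two decimals first.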
• The intervals obtained at other confidence levels are as follows
[Figure: confidence intervals for the fuel-economy example at several confidence levels]
HYPOTHESIS TESTING
• We test the validity of a claim about a population parameter using sample data
• Null Hypothesis: The hypothesis that is maintained unless there is strong evidence against it
• Alternative Hypothesis: The hypothesis that is accepted when the null hypothesis is rejected
- Note: Failing to reject the null hypothesis does not mean that you accept it. You just fail to reject it
• Simple Hypothesis: A hypothesis that the population parameter, $\theta$, is equal to a specific value, $\theta_0$
\[ H_0 : \theta = \theta_0 \]
• Composite Hypothesis: A hypothesis that the population parameter lies in a range of values
• Hypothesis Test Decisions:
- Type I Error: Rejecting a true null hypothesis
- Type II Error: Failing to reject a false null hypothesis
- Significance Level of a Test: The probability of making a Type I error, usually stated as a percentage and denoted by $\alpha$
- Power of a Test: The probability of not making a Type II error, i.e. of rejecting a false null hypothesis

                       Null is True     Null is False
Reject Null            Type I Error     Correct
Fail to Reject Null    Correct          Type II Error

• Type I and Type II errors are inversely related: as one increases, the other decreases (but not one for one)
Tests of the Mean of a Normal Distribution: Population Variance Known
• A random sample of n observations is obtained from a normally distributed population with mean $\mu$ and known variance $\sigma^2$. We know that the standardized sample mean
\[ Z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \]
has a standard normal distribution, with mean 0 and variance 1
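• A test with significance level $\alpha$ of the null hypothesis
\[ H_0 : \mu = \mu_0 \]
against the alternative
\[ H_1 : \mu > \mu_0 \]
is obtained by using the following decision rule:
\[ \text{Reject } H_0 \text{ if } \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} > z_\alpha \]
or equivalently $\bar{x} > \mu_0 + z_\alpha\,\sigma/\sqrt{n}$
• Shown in a figure:
[Figure: sampling distribution of $\bar{x}$ under $H_0$, with the rejection region of area $\alpha$ to the right of $\mu_0 + z_\alpha\,\sigma/\sqrt{n}$]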
• In this case $\alpha$ is the significance level of the test (the probability of rejecting a true null hypothesis)
• If this were a two-sided test using the same critical value in both tails, the significance level would be $2\alpha$
• Yet the power of the test (the probability of rejecting a false null hypothesis) is not $1 - 2\alpha$
- because if the null hypothesis is false, the alternative holds, which means the underlying distribution is different
• Example: When a production process is operating correctly, product weights are observed to be normally distributed with mean 5 kg and standard deviation 0.1 kg. The production manager makes a change intended to raise the mean weight while leaving the standard deviation unchanged. After the change, a random sample of 16 items has a mean weight of 5.038 kg. Test the null hypothesis that the mean weight in the new population is 5 kg against the alternative that it is greater than 5 kg, at the 5% and 10% significance levels
- We want to test the hypothesis
\[ H_0 : \mu = 5 \]
against the alternative
\[ H_1 : \mu > 5 \]
- We can reject $H_0$ in favor of $H_1$ when
\[ \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} > z_\alpha \]
- Given in the problem: $\bar{x} = 5.038$, $\mu_0 = 5$, $n = 16$, $\sigma = 0.1$; hence
\[ \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} = \frac{5.038-5}{0.1/\sqrt{16}} = 1.52 \]
- At the 5% significance level, the z value from the standard normal table is
\[ z_{0.05} = 1.645 \]
Since 1.52 is not greater than this value, we fail to reject the null hypothesis at the 5% significance level
- At the 10% significance level, the z value from the standard normal table is
\[ z_{0.1} = 1.28 \]
This time 1.52 is greater than this value, so we reject the null hypothesis at the 10% significance level
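The two decisions in this example can be reproduced in code. The sketch below assumes Python 3.8+ and uses `NormalDist().inv_cdf` for the critical values instead of the printed table; the function name `upper_tail_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_z_test(xbar, mu0, sigma, n, alpha):
    """Test H0: mu = mu0 against H1: mu > mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))          # test statistic
    z_alpha = NormalDist().inv_cdf(1 - alpha)     # upper-tail critical value
    return z, z_alpha, z > z_alpha

z5, crit5, reject5 = upper_tail_z_test(5.038, 5.0, 0.1, 16, alpha=0.05)
z10, crit10, reject10 = upper_tail_z_test(5.038, 5.0, 0.1, 16, alpha=0.10)
print(f"z = {z5:.2f}")
print(f"alpha = 0.05: critical value {crit5:.3f}, reject H0: {reject5}")
print(f"alpha = 0.10: critical value {crit10:.3f}, reject H0: {reject10}")
```

The same statistic leads to different decisions because only the critical value changes with the significance level.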
• Probability Value (p-value): In the previous example we could not reject the null hypothesis at the 5% significance level, but we could at 10%. Hence it is possible to find the smallest significance level at which the null hypothesis is rejected; this is called the p-value of the test. Formally, if a random sample of n observations is obtained from a normally distributed population with mean $\mu$ and known variance $\sigma^2$, and the observed sample mean is $\bar{x}$, the null hypothesis
\[ H_0 : \mu = \mu_0 \]
is tested against the alternative
\[ H_1 : \mu > \mu_0 \]
The p-value of the test is
\[ p\text{-value} = P\left( \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \ge z_p \,\middle|\, H_0 : \mu = \mu_0 \right), \qquad z_p = \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} \]
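• Example: In the previous example we found
\[ \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} = \frac{5.038-5}{0.1/\sqrt{16}} = 1.52 \]
The $\alpha$ value satisfying $z_\alpha = 1.52$ can be read from the standard normal table as 0.0643 (= 1 - 0.9357); this is the p-value of the test.
[Figure: standard normal density with the area 0.0643 to the right of 1.52 shaded]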
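The p-value lookup is a one-line computation once the test statistic is known. A minimal sketch, assuming Python 3.8+ and taking the statistic 1.52 from the example:

```python
from statistics import NormalDist

z_obs = 1.52                               # observed test statistic from the example
p_value = 1 - NormalDist().cdf(z_obs)      # upper-tail area beyond z_obs

print(f"p-value = {p_value:.4f}")
```

The result lies between 0.05 and 0.10, which matches the earlier decisions: fail to reject at 5%, reject at 10%.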
Simple Null Against Two-Sided Alternative
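• To test the null hypothesis
\[ H_0 : \mu = \mu_0 \]
against the alternative
\[ H_1 : \mu \neq \mu_0 \]
at significance level $\alpha$, use the following decision rule:
\[ \text{Reject } H_0 \text{ if } \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} < -z_{\alpha/2} \quad \text{or} \quad \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} > z_{\alpha/2} \]
• Shown in a figure:
[Figure: standard normal density with rejection regions of area $\alpha/2$ in each tail, beyond $-z_{\alpha/2}$ and $z_{\alpha/2}$]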
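The two-sided rule can be sketched as follows. The function name `two_sided_z_test` is hypothetical, and the inputs reuse the production example's numbers (sample mean 5.038, $\mu_0 = 5$, $\sigma = 0.1$, n = 16) purely for illustration; the slides only test that example one-sided.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(xbar, mu0, sigma, n, alpha):
    """Test H0: mu = mu0 against H1: mu != mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    z_half = NormalDist().inv_cdf(1 - alpha / 2)    # alpha / 2 in each tail
    reject = z < -z_half or z > z_half
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # both tails
    return z, z_half, reject, p_value

z, z_half, reject, p_value = two_sided_z_test(5.038, 5.0, 0.1, 16, alpha=0.10)
print(f"z = {z:.2f}, critical value = {z_half:.3f}, reject H0: {reject}")
print(f"two-sided p-value = {p_value:.4f}")
```

With the rejection area split across two tails, the critical value at the 10% level is 1.645 rather than 1.28, so a statistic of 1.52 no longer leads to rejection.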
Tests of the Mean of a Normal Distribution: Population Variance Unknown
• We are given a random sample of n observations from a normally distributed population with mean $\mu$. Using the sample mean and sample standard deviation, $\bar{x}$ and $s_x$ respectively, we can use the following tests with significance level $\alpha$
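1. To test the null hypothesis
\[ H_0 : \mu = \mu_0 \quad \text{or} \quad H_0 : \mu \le \mu_0 \]
against the alternative
\[ H_1 : \mu > \mu_0 \]
the decision rule is as follows
\[ \text{Reject } H_0 \text{ if } \frac{\bar{x}-\mu_0}{s_x/\sqrt{n}} > t_{n-1,\alpha} \]
2. To test the null hypothesis
\[ H_0 : \mu = \mu_0 \quad \text{or} \quad H_0 : \mu \ge \mu_0 \]
against the alternative
\[ H_1 : \mu < \mu_0 \]
the decision rule is as follows
\[ \text{Reject } H_0 \text{ if } \frac{\bar{x}-\mu_0}{s_x/\sqrt{n}} < -t_{n-1,\alpha} \]
3. To test the null hypothesis
\[ H_0 : \mu = \mu_0 \]
against the alternative
\[ H_1 : \mu \neq \mu_0 \]
the decision rule is as follows
\[ \text{Reject } H_0 \text{ if } \frac{\bar{x}-\mu_0}{s_x/\sqrt{n}} < -t_{n-1,\alpha/2} \quad \text{or} \quad \frac{\bar{x}-\mu_0}{s_x/\sqrt{n}} > t_{n-1,\alpha/2} \]
Shown in a figure:
[Figure: t density with rejection regions of area $\alpha/2$ in each tail]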
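Rule 1 can be illustrated with the six fuel-economy observations used earlier in these notes. The null value $\mu_0 = 19$ is a hypothetical choice made for this sketch, and because the Python standard library has no t distribution, the critical value $t_{5,0.05} = 2.015$ is hard-coded from the t table.

```python
from math import sqrt
from statistics import mean, stdev

mpg = [18.6, 18.4, 19.2, 20.8, 19.4, 20.5]   # sample from the earlier example
mu0 = 19.0                                   # hypothetical null value (assumption)
t_crit = 2.015                               # t_{5, 0.05} from the t table

n = len(mpg)
t_stat = (mean(mpg) - mu0) / (stdev(mpg) / sqrt(n))   # test statistic
reject = t_stat > t_crit                              # upper-tail rule (case 1)

print(f"t = {t_stat:.3f}, critical value = {t_crit}, reject H0: {reject}")
```

For rule 2 the comparison becomes `t_stat < -t_crit`, and for rule 3 the critical value is taken at $\alpha/2$ and both tails are checked.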
Assessing the Power of a Test
Determining the Probability of Type II Error
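• Consider the test
\[ H_0 : \mu = \mu_0 \]
against the alternative
\[ H_1 : \mu > \mu_0 \]
using the decision rule
\[ \text{Reject } H_0 \text{ if } \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} > z_\alpha \]
• Now suppose the null hypothesis is false and the population mean, $\mu^*$, is in the region of $H_1$. A Type II error is the failure to reject a false null hypothesis. Thus we consider $\mu = \mu^*$ with $\mu^* > \mu_0$. Writing $\bar{x}_c = \mu_0 + z_\alpha\,\sigma/\sqrt{n}$ for the critical value of the sample mean, the probability of making a Type II error is
\[ \beta = P(\bar{X} \le \bar{x}_c \mid \mu = \mu^*) = P\left(Z < \frac{\bar{x}_c-\mu^*}{\sigma/\sqrt{n}}\right) \]
and therefore the power of the test (the probability of not making a Type II error) is
\[ 1 - \beta \]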
• Example: In the earlier example, using a random sample of 16 items, we tested the null hypothesis that the mean product weight is 5 kg against the alternative that it is greater than 5 kg, at the 5% significance level
- We want to test the hypothesis
\[ H_0 : \mu = 5 \]
against the alternative
\[ H_1 : \mu > 5 \]
- Given in the problem: $\mu_0 = 5$, $n = 16$, $\sigma = 0.1$, $z_\alpha = z_{0.05} = 1.645$; hence the decision rule for rejecting $H_0$ in favor of $H_1$ is
\[ \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x}-5}{0.1/4} > 1.645 \]
or $\bar{x} > 1.645\,(0.1/4) + 5 = 5.041$. This means that whenever the sample mean is below 5.041, we fail to reject the null hypothesis
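- Suppose the population mean is 5.05 (so the alternative hypothesis is true), and let us find the probability of committing a Type II error by failing to reject the null hypothesis; that is, the probability that the sample mean is below 5.041 when the population mean is 5.05:
\[ P(\bar{X} \le 5.041) = P\left(Z \le \frac{5.041-\mu^*}{\sigma/\sqrt{n}}\right) = P\left(Z \le \frac{5.041-5.05}{0.1/4}\right) = P(Z \le -0.36) = 1 - 0.64 = 0.36 \]
hence the power of our test is
\[ \text{Power} = 1 - \beta = 0.64 \]
• Shown in a figure:
[Figure: sampling distributions of $\bar{X}$ centered at 5 and at 5.05; the area of the second to the left of 5.041 is $\beta = 0.36$ and the area to the right is the power, 0.64]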
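The Type II error probability and the power in this example can be computed directly. A minimal sketch, assuming Python 3.8+; the variable names are illustrative, and `NormalDist` replaces the table lookup.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_star = 5.0, 5.05        # null value and the assumed true mean
sigma, n, alpha = 0.1, 16, 0.05

se = sigma / sqrt(n)                                      # sd of the sample mean
xbar_crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se    # reject H0 above this value

# Type II error: the sample mean falls below the critical value
# even though the true mean is mu_star.
beta = NormalDist(mu_star, se).cdf(xbar_crit)
power = 1 - beta

print(f"critical sample mean = {xbar_crit:.3f}")
print(f"beta = {beta:.2f}, power = {power:.2f}")
```

Moving `mu_star` further above 5 shrinks beta and raises the power, since the true sampling distribution shifts away from the critical value.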