emma-campbell
Religious Preference   Percent   Proportion
Protestant                65.6        0.656
Catholic                  24.2        0.242
Jewish                     2.3        0.023
Other                      1.2        0.012
None                       6.1        0.061
No Answer                  0.5        0.005
Religious Preference   Percent   Proportion   Proportion²
Protestant                65.6        0.656         0.430
Catholic                  24.2        0.242         0.059
Jewish                     2.3        0.023         0.001
Other                      1.2        0.012         0.000
None                       6.1        0.061         0.004
No Answer                  0.5        0.005         0.000
$\sum_{i=1}^{K} p_i^2 = 0.494$
What does this mean?
When there is perfect dispersion, IQV = 1.00
When there is no dispersion, IQV = 0.00
Religious Preference   Percent   Proportion   Proportion²
Protestant               16.67       0.1667        0.0279
Catholic                 16.67       0.1667        0.0279
Jewish                   16.67       0.1667        0.0279
Other                    16.67       0.1667        0.0279
None                     16.67       0.1667        0.0279
No Answer                16.67       0.1667        0.0279

$\sum_{i=1}^{K} p_i^2 = 0.1674$

IQV = (1 - 0.1674) / [(6 - 1) / 6]
    = (0.833) / (5 / 6) = (0.833) / (0.833)
    = 1.00
Religious Preference   Percent   Proportion   Proportion²
Protestant              100.00         1.00          1.00
Catholic                  0.00         0.00          0.00
Jewish                    0.00         0.00          0.00
Other                     0.00         0.00          0.00
None                      0.00         0.00          0.00
No Answer                 0.00         0.00          0.00

$\sum_{i=1}^{K} p_i^2 = 1.000$

IQV = (1 - 1.000) / [(6 - 1) / 6]
    = (0.000) / (5 / 6) = (0.000) / (0.833)
    = 0.00
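The IQV arithmetic on these slides can be checked with a short script. The following is a minimal sketch in Python (not the SAS used elsewhere in the deck); the function name `iqv` and the variable names are illustrative assumptions, and the formula is the one used in the worked examples, IQV = (1 - sum of squared proportions) / [(K - 1) / K]. For the observed religious-preference proportions it gives roughly 0.61.

```python
def iqv(proportions):
    """Index of qualitative variation for a list of category proportions.

    Uses the formula from the worked examples:
    (1 - sum of p_i squared) / ((K - 1) / K), where K is the number of categories.
    """
    k = len(proportions)
    if k < 2:
        raise ValueError("IQV needs at least two categories")
    sum_of_squares = sum(p * p for p in proportions)
    return (1.0 - sum_of_squares) / ((k - 1) / k)


# Proportions taken from the religious-preference tables
observed = [0.656, 0.242, 0.023, 0.012, 0.061, 0.005]
uniform = [1 / 6] * 6                           # perfect dispersion
concentrated = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # no dispersion

iqv_observed = iqv(observed)
iqv_uniform = iqv(uniform)
iqv_concentrated = iqv(concentrated)

print(round(iqv_observed, 3), round(iqv_uniform, 2), round(iqv_concentrated, 2))
```

The two boundary cases reproduce the limits stated above: equal proportions give 1.00 and a single occupied category gives 0.00.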
For example, take the following 12 values (N = 12):
15, 2, 27, 32, 3, 5, 35, 7, 31, 42, 37, 39
To determine any of the so-called quantile statistics such as the range, the scores first must be ranked or ordered, here in descending order:
 1st  42
 2nd  39
 3rd  37
 4th  35
 5th  32
 6th  31
 7th  27
 8th  15
 9th   7
10th   5
11th   3
12th   2
Univariate and EDA Statistics
PPD 404

 Stem Leaf                                      #  Boxplot
    7 9                                         1
    7
    6
    6
    5
    5
    4
    4
    3
    3 4                                         1     *
    2 8                                         1     *
    2
    1 59                                        2     0
    1 2                                         1     |
    0 555556666777778889                       18  +--+--+
    0 111111111111111111111111222222333344444  39  *-----*
      ----+----+----+----+----+----+----+----
 Multiply Stem.Leaf by 10**+3
 1st  42
 2nd  39
 3rd  37
------------------------------- Q3 = (37.5 + 34.5)/2 = 36.0
 4th  35
 5th  32
 6th  31
 7th  27
 8th  15
 9th   7
------------------------------- Q1 = (7.5 + 4.5)/2 = 6.0
10th   5
11th   3
12th   2
IQR = Q3 – Q1 = 36.0 – 6.0 = 30.0
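The quartile procedure above can be sketched in a few lines. This is an illustrative Python sketch (not SAS), and the function name `quartiles_by_boundary` is an assumption. It takes each quartile as the midpoint between the two scores on either side of the quarter cut, which equals the slide's midpoint of the real limits: (37.5 + 34.5)/2 is the same as (37 + 35)/2. It assumes the number of scores is divisible by 4, as it is here.

```python
def quartiles_by_boundary(values):
    """Return (Q1, Q3) as midpoints between the scores adjacent to each quarter cut.

    Assumes len(values) is a multiple of 4, so each cut falls between two scores.
    """
    ranked = sorted(values, reverse=True)   # descending order, as on the slides
    n = len(ranked)
    if n == 0 or n % 4 != 0:
        raise ValueError("this sketch needs a non-empty list whose length is divisible by 4")
    cut = n // 4
    q3 = (ranked[cut - 1] + ranked[cut]) / 2          # between upper quarter and the rest
    q1 = (ranked[n - cut - 1] + ranked[n - cut]) / 2  # between the rest and lower quarter
    return q1, q3


# Ranked scores as listed on the slides
scores = [42, 39, 37, 35, 32, 31, 27, 15, 7, 5, 3, 2]
q1, q3 = quartiles_by_boundary(scores)
iqr = q3 - q1
print(q1, q3, iqr)
```

Other definitions of quartiles exist (the SAS output later in the deck reports `Def=5`), so results from statistical software may differ slightly from this hand method.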
Y_i        Ȳ     (Y_i − Ȳ)
 33    19.333      13.667
 27    19.333       7.667
 19    19.333      -0.333
 14    19.333      -5.333
 12    19.333      -7.333
 11    19.333      -8.333
              Σ =   0.000 (0.003)
The Sum of the Deviations
Values: 1, 2, 3, 4, 5
Mean = 3.0
Deviations from the mean: -2, -1, 0, +1, +2
Sum = (-2) + (+2) + (-1) + (+1) + (0) = 0.0
Y_i        Ȳ     (Y_i − Ȳ)    (Y_i − Ȳ)²
 33    19.333      13.667       186.787
 27    19.333       7.667        58.783
 19    19.333      -0.333         0.111
 14    19.333      -5.333        28.441
 12    19.333      -7.333        53.773
 11    19.333      -8.333        69.439
              Σ =   0.000   Σ = 397.334

sy2 = 397.334 / (6 - 1) = 79.467
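The deviation table can be reproduced step by step. The sketch below is in Python (not SAS) and its variable names are illustrative assumptions; it follows the slide's method of dividing the sum of squared deviations by N - 1. Because it keeps full precision instead of rounding the mean to 19.333, the sum of deviations comes out at zero apart from floating-point error, and the sum of squares differs from the slide's 397.334 only in the third decimal place.

```python
# Scores from the deviation table
scores = [33, 27, 19, 14, 12, 11]

n = len(scores)
mean = sum(scores) / n

# Deviation of each score from the mean, and the squared deviations
deviations = [y - mean for y in scores]
squared_deviations = [d * d for d in deviations]

sum_of_deviations = sum(deviations)        # zero, up to floating-point error
sum_of_squares = sum(squared_deviations)

# Sample variance: sum of squared deviations divided by N - 1
variance = sum_of_squares / (n - 1)

print(round(mean, 3), round(sum_of_squares, 3), round(variance, 3))
```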
Z-scores: pure numbers with a mean of 0.0 and a standard deviation of 1.00
z1 = (68 - 70.0) / 6.45 = (-2.00) / 6.45 = - 0.31
z1 = (68 - 70.0) / 12.88 = (-2.00) / 12.88 = - 0.16
$z_i = \dfrac{Y_i - \bar{Y}}{s_Y}$
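The two z-score calculations above can be checked with a small helper. This is a minimal Python sketch (not SAS); the function name `z_score` and the variable names are illustrative assumptions. It shows the same raw score of 68, with the same mean of 70.0, standardized against the two different standard deviations used in the examples.

```python
def z_score(y, mean, sd):
    """Standardize a raw score: (y - mean) / sd."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (y - mean) / sd


# Same raw score and mean, two different standard deviations
z_narrow = z_score(68, 70.0, 6.45)    # smaller spread
z_wide = z_score(68, 70.0, 12.88)     # larger spread

print(round(z_narrow, 2), round(z_wide, 2))
```

The larger the standard deviation, the closer the same raw deviation of -2.00 sits to the mean in standard units.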
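Using SAS to Produce Z-Scores

libname old 'a:\';
libname library 'a:\';
options ps=66 nodate nonumber;

/* Copy POPULAT into POPSTD so the raw values are kept alongside the z-scores */
data temp1;
  set old.cities;
  popstd = populat;
run;

/* Rescale POPSTD to a mean of 0.0 and a standard deviation of 1.0 */
proc standard data=temp1 mean=0.0 std=1.0 out=temp2;
  var popstd;
run;

/* List each raw value next to its z-score */
proc print data=temp2;
  id populat;
  var popstd;
  title1 'Z-Scores Produced by PROC STANDARD';
  title2;
  title3 'PPD 404';
run;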
Z-Scores Produced by PROC STANDARD

PPD 404

POPULAT      POPSTD
    275    -0.28030
    116    -0.42296
    127    -0.41309
    497    -0.08112
    117    -0.42206
    301    -0.25698
     82    -0.45347
    641     0.04808
    453    -0.12060
    100    -0.43732
    241    -0.31081
     82    -0.45347
    101    -0.43642
     72    -0.46244
    393    -0.17443
     86    -0.44988
    175    -0.37002
     68    -0.46603
    108    -0.43014
libname mydata 'a:\';
libname library 'a:\';
options ps=66 nodate nonumber;

proc univariate data=mydata.cities;
  var populat;
  title1 'Univariate Statistics';
run;
Univariate Statistics

PPD 404

Univariate Procedure

Variable=POPULAT    NUMBER OF RESIDENTS, IN 1,000S

Moments

N            63          Sum Wgts     63
Mean         587.4127    Sum          37007
Std Dev      1114.554    Variance     1242231
Skewness     5.090201    Kurtosis     30.74326
USS          98756687    CSS          77018305
CV           189.7395    Std Mean     140.4206
T:Mean=0     4.183237    Pr>|T|       0.0001
Num ^= 0     63          Num > 0      63
M(Sign)      31.5        Pr>=|M|      0.0001
Sgn Rank     1008        Pr>=|S|      0.0001
W:Normal     0.468356    Pr<W         0.0001
Quantiles(Def=5)

100% Max     7896        99%     7896
 75% Q3       641        95%     1949
 50% Med      278        90%      906
 25% Q1       100        10%       72
  0% Min       56         5%       60
                          1%       56

Range        7840
Q3-Q1         541
Mode           56

Extremes

Lowest    Obs      Highest    Obs
    56(    30)        1511(    56)
    56(    24)        1949(    55)
    58(    46)        2816(    54)
    60(    21)        3367(    53)
    65(    51)        7896(    52)
Calculate the INDEX OF QUALITATIVE VARIATION for the data in the following table.
===============================================================
Service Branch       Frequency        P        P2
---------------------------------------------------------------
Air Force                   56
Army                       166
Marine Corps                14
Merchant Marines             1
Navy                        70
                       -------
Total                      307
---------------------------------------------------------------
===============================================================
Service Branch       Frequency        P        P2
---------------------------------------------------------------
Air Force                   56    0.182     0.033
Army                       166    0.541     0.292
Marine Corps                14    0.046     0.002
Merchant Marines             1    0.003     0.000
Navy                        70    0.228     0.052
                           ---             ------
Total                      307              0.379
---------------------------------------------------------------
INDEX OF QUALITATIVE VARIATION = (1 - 0.379) / [(5 - 1) / 5] = (0.621) / (0.800) = 0.776
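When the data arrive as frequencies rather than proportions, the same index can be computed directly from the counts. The following Python sketch (not SAS) is illustrative; the function name `iqv_from_counts` is an assumption. Note that it works with unrounded proportions, so it gives about 0.775, whereas the slide's 0.776 comes from summing squared proportions that were first rounded to three decimals.

```python
def iqv_from_counts(counts):
    """IQV computed from category frequencies instead of proportions."""
    k = len(counts)
    total = sum(counts.values())
    if k < 2 or total <= 0:
        raise ValueError("need at least two categories and a positive total")
    # Convert each frequency to a proportion, then square and sum
    sum_of_squares = sum((f / total) ** 2 for f in counts.values())
    return (1.0 - sum_of_squares) / ((k - 1) / k)


# Frequencies from the service-branch table
service_branch = {
    "Air Force": 56,
    "Army": 166,
    "Marine Corps": 14,
    "Merchant Marines": 1,
    "Navy": 70,
}

iqv_service = iqv_from_counts(service_branch)
print(round(iqv_service, 3))
```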
Here are data once again from 16 European countries.
==============================================================================
                 Gross Domestic     Percent in     Crude Birth
Nation           Product (GDP)      Agriculture    Rate per
                 (in billion $)                    1,000
------------------------------------------------------------------------------
Austria                  3              18             18
Belgium                  4               7             16
Denmark                  6              23             18
Finland                  7              38             17
France                   8              25             18
Germany                112               8             17
Great Britain           98               5             18
Greece                   9              48             18
Ireland                 10              42             22
Italy                   17              24             19
Netherlands             18              13             18
Norway                   7              24             18
Portugal                 4              48             23
Spain                   18              36             21
Sweden                  20              18             16
Switzerland             14              15             19
------------------------------------------------------------------------------
What is the RANGE for the PERCENT IN AGRICULTURE?
What is the INTERQUARTILE RANGE for the PERCENT IN AGRICULTURE?
First, rank the values in descending order. Find the difference between the HIGHEST and LOWEST values (and add 1).
48, 48, 42, 38, 36, 25, 24, 24, 23, 18, 18, 15, 13, 8, 7, 5
RANGE = H – L + 1 = 48 – 5 + 1 = 44.0
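The ranking and the range calculation can be sketched as follows. This is an illustrative Python sketch (not SAS) with assumed variable names; it uses the inclusive definition shown above, highest minus lowest plus 1.

```python
# Percent in agriculture for the 16 countries, in table order
pct_agriculture = [18, 7, 23, 38, 25, 8, 5, 48, 42, 24, 13, 24, 48, 36, 18, 15]

# Rank the values in descending order
ranked = sorted(pct_agriculture, reverse=True)

highest = ranked[0]
lowest = ranked[-1]

# Inclusive range: H - L + 1
inclusive_range = highest - lowest + 1
print(ranked)
print(inclusive_range)
```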
Having ranked the values in descending order, determine the value at the location dividing the upper 4 values from the lower 12 values. Then determine the value at the location dividing the upper 12 values from the lower 4 values. Find the difference between these two values.
48
48
42
38
-- Q3 = (38.5 + 35.5) / 2 = 37.0
36
25
24
24
23
18
18
15
-- Q1 = (15.5 + 12.5) / 2 = 14.0
13
 8
 7
 5
IQR = Q3 – Q1 = 37.0 – 14.0 = 23.0
What is the STANDARD DEVIATION for GDP?
What is Germany’s Z-SCORE for GDP?
Next, determine the deviations and squared deviations for each value.
GDP (Y_i)    (Y_i − Ȳ)     (Y_i − Ȳ)²
      3      -19.1875      368.1602
      4      -18.1875      330.7852
      6      -16.1875      262.0352
      7      -15.1875      230.6602
      8      -14.1875      201.2852
    112       89.8125     8066.285
     98       75.8125     5747.535
      9      -13.1875      173.9102
     10      -12.1875      148.5352
     17       -5.1875       26.91016
     18       -4.1875       17.53516
      7      -15.1875      230.6602
      4      -18.1875      330.7852
     18       -4.1875       17.53516
     20       -2.1875        4.785156
     14       -8.1875       67.03516
  -----                   ---------
    355                   16224.44
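The remaining steps, from the sum of squared deviations to the standard deviation and Germany's z-score, can be sketched as below. This is an illustrative Python sketch (not SAS) with assumed variable names. It divides by N - 1, following the variance example earlier in the deck; the deck's own final answers are not shown in this excerpt, so the printed values are what that method yields rather than quoted results.

```python
import math

# GDP (in billion $) for the 16 countries, from the table
gdp = {
    "Austria": 3, "Belgium": 4, "Denmark": 6, "Finland": 7,
    "France": 8, "Germany": 112, "Great Britain": 98, "Greece": 9,
    "Ireland": 10, "Italy": 17, "Netherlands": 18, "Norway": 7,
    "Portugal": 4, "Spain": 18, "Sweden": 20, "Switzerland": 14,
}

values = list(gdp.values())
n = len(values)
mean_gdp = sum(values) / n

# Sum of squared deviations from the mean
sum_of_squares = sum((y - mean_gdp) ** 2 for y in values)

# Sample standard deviation (N - 1 in the denominator)
sd_gdp = math.sqrt(sum_of_squares / (n - 1))

# Germany's z-score: deviation from the mean in standard-deviation units
z_germany = (gdp["Germany"] - mean_gdp) / sd_gdp

print(round(mean_gdp, 4), round(sum_of_squares, 2))
print(round(sd_gdp, 3), round(z_germany, 3))
```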