raj-malhotra
8/4/2019 Introduction to Statistics With SAS
Section 1.1
Fundamental Statistical Concepts
Objectives
Explain the purpose of statistics.
Decide what tasks to complete before you analyze your data.
Distinguish between populations and samples.
What Is Statistics?
[Figure: ten people labeled with HEIGHT values 5'4", 5'10", 5'2", 5'8", 5'8", 6'1", 5'5", 6', 5'11", and 5'.]
Descriptive Statistics
[Figure: the ten heights in order (5', 5'2", 5'4", 5'5", 5'8", 5'8", 5'10", 5'11", 6', 6'1") with MIN, AVERAGE, and MAX marked. The average is 5' 7.3''.]
Inferential Statistics
[Figure: the same ten ordered heights (5', 5'2", 5'4", 5'5", 5'8", 5'8", 5'10", 5'11", 6', 6'1") with MIN, AVERAGE, and MAX marked. The average is 5' 7.3''.]
Defining the Problem
Before you begin any analysis, you should complete certain tasks.
1. Outline the purpose of the study.
2. Document the study questions.
3. Define the population of interest.
4. Determine the need for sampling.
5. Define the data collection protocol.
Cereal Example
[Figure: a box of Rise n Shine cereal labeled 15 ounces.]
Defining the Problem
The purpose of the study is to determine whether Rise n Shine cereal boxes contain 15 ounces of cereal.
The study question is whether the average amount of cereal in Rise n Shine boxes is equal to 15 ounces.
Population
[Figure: the population, shown as a grid of Rise n Shine cereal boxes.]
Sample
[Figure: a sample of Rise n Shine cereal boxes.]
Simple Random Sampling
[Figure: Rise n Shine cereal boxes illustrating simple random sampling.]
Convenience Sampling
[Figure: Rise n Shine cereal boxes illustrating convenience sampling.]
Parameters and Statistics
Statistics are used to approximate population parameters.

                     Population Parameters   Sample Statistics
Mean                 μ                       x̄
Variance             σ²                      s²
Standard Deviation   σ                       s
Levels of Measurement
The two levels of measurement of data used in this course are
continuous
discrete.
Describing Your Data
The goals when you are describing data are to
screen for unusual data values
inspect the spread and shape of continuous variables
characterize the central tendency
draw preliminary conclusions about your data.
Process of Data Analysis
[Figure: flow diagram with the labels Population, Random Sample, Sample Statistics, Describe, and Make Inferences.]
Section 1.2
Examining Distributions
Objectives
Examine distributions of data.
Explain and interpret measures of location, dispersion, and shape.
Use the MEANS and UNIVARIATE procedures to produce summary statistics.
Use the UNIVARIATE procedure to generate stem-and-leaf, box-and-whisker, and normal probability plots and histograms.
Cereal Data Set
[Figure: the Rise n Shine cereal data set, with columns WEIGHT and ID NUMBER.]
Skewed Distributions
[Figure: histogram of a skewed distribution; x-axis WEIGHT, y-axis FREQUENCY.]
Measures of Central Tendency
The mean is the balancing point of your data.
[Figure: the values 15.02, 14.98, 15.01, and 14.99 balance at the mean, 15.00.]
Percentiles
[Figure: histogram of WEIGHT (y-axis FREQUENCY) with the 40th percentile marked; 40% of the values fall on one side and 60% on the other.]
Measures of Shape
[Figure: three histograms of WEIGHT (y-axis FREQUENCY) titled Skewed to Right, Symmetric, and Skewed to Left.]
Measures of Shape
Light-tailed
Normal
Heavy-tailed
The MEANS Procedure
PROC MEANS DATA=SAS-data-set;
   VAR variable;
RUN;
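A minimal sketch of the general form above. The data set name work.cereal, the variables idnumber and weight, and the weights themselves are assumptions made up for illustration; they are not the course data.

```sas
/* Hypothetical data: box ID number and net weight in ounces (illustrative values only) */
data work.cereal;
   input idnumber weight;
   datalines;
1 15.02
2 14.98
3 15.01
4 14.99
5 15.03
6 14.97
;
run;

/* With no statistic keywords, PROC MEANS reports N, Mean, Std Dev, Minimum, and Maximum */
proc means data=work.cereal;
   var weight;
run;
```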
The UNIVARIATE Procedure
PROC UNIVARIATE DATA=SAS-data-set;
   VAR variable;
   ID variable;
   HISTOGRAM variable /;
   PROBPLOT variable /;
RUN;
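One way the general form above might be filled in, as a sketch only. The data set work.cereal and its values are hypothetical, and the NORMAL options after each slash are just one possible choice of plot options.

```sas
/* Hypothetical data: box ID number and net weight in ounces (illustrative values only) */
data work.cereal;
   input idnumber weight;
   datalines;
1 15.02
2 14.98
3 15.01
4 14.99
5 15.03
6 14.97
;
run;

proc univariate data=work.cereal;
   var weight;
   id idnumber;                                  /* identify extreme observations by ID number */
   histogram weight / normal;                    /* histogram with a fitted normal curve */
   probplot weight / normal(mu=est sigma=est);   /* reference line from the estimated mean and std dev */
run;
```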
Descriptive Statistics
This demonstration illustrates using the MEANS and UNIVARIATE procedures to calculate descriptive statistics for continuous variables.
Graphical Displays of Distributions
PROC UNIVARIATE produces three kinds of plots for examining the distribution of your data values:
stem-and-leaf plots
box-and-whisker plots
normal probability plots.
PROC UNIVARIATE can also generate histograms and graphically enhanced normal probability plots.
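The three plots listed above are requested with the PLOT option on the PROC UNIVARIATE statement. The sketch below uses a hypothetical data set. Turning ODS Graphics off is an assumption made here so that the text-style plots are produced; how the plots are rendered can depend on the SAS release and settings.

```sas
/* Hypothetical data: box ID number and net weight in ounces (illustrative values only) */
data work.cereal;
   input idnumber weight;
   datalines;
1 15.02
2 14.98
3 15.01
4 14.99
5 15.03
6 14.97
;
run;

ods graphics off;   /* assumption: request the traditional text plots */

/* PLOT adds a stem-and-leaf (or bar) plot, a box plot, and a normal probability plot */
proc univariate data=work.cereal plot;
   var weight;
run;
```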
Stem-and-Leaf Plots
Stem Leaf
   9 01338
   8 0012347789
   7 0013455667799
   6 03568
   5 8
   4
   3 9
   2 0
   1 4
Multiply Stem.Leaf by 10**1
Box-and-Whisker Plots
The mean is denoted by +.
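[Figure: box-and-whisker plot on a vertical axis running up to 100. Annotations:
 * marks points more than 3 IQ units from the box
 0 marks points more than 1.5 IQ units from the box
 the upper whisker ends at the max point within 1.5 IQ units of the box
 the box spans the 75th percentile, the 50th percentile (median), and the 25th percentile
 the lower whisker ends at the min point within 1.5 IQ units of the box.]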
Examining Distributions
This demonstration illustrates using PROC UNIVARIATE to generate stem-and-leaf, box-and-whisker, and normal probability plots and histograms.
Section 1.3
Confidence Intervals for the Mean
Objectives
Explain and interpret the confidence intervals for the mean.
Explain the central limit theorem.
Calculate confidence intervals using the MEANS procedure.
Point Estimates
x̄ estimates μ
s estimates σ
Variability among Samples
[Figure: separate samples from the same population give different sample means, for example a mean of 15.02.]
Standard Error of the Mean
A statistic that measures the variability of your estimate is the standard error of the mean.
It differs from the sample standard deviation because
the sample standard deviation deals with the variability of your data
the standard error of the mean deals with the variability of your sample mean.
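The distinction can be made concrete with the usual formula, given here as a reference sketch. The symbols are assumptions that follow the notation of this chapter: s is the sample standard deviation, n is the sample size, and s_x̄ is the standard error of the mean.

```latex
s_{\bar{x}} = \frac{s}{\sqrt{n}}
% Worked example with the four weights 15.02, 14.98, 15.01, 14.99 (mean 15.00):
% s = \sqrt{0.001 / 3} \approx 0.0183, \quad s_{\bar{x}} \approx 0.0183 / \sqrt{4} \approx 0.0091
```

Because of the division by the square root of n, the standard error shrinks as the sample grows, even when the spread of the data stays the same.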
Confidence Intervals
[Figure: two intervals drawn as ( | | ) and | ( | ), labeled 95% Confidence and 5% Confidence.]
Assumptions about Confidence Intervals
The types of confidence intervals in this course make the assumption that the sample means are normally distributed.
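For reference, a sketch of the standard t-based interval for the mean that relies on this assumption. The notation is assumed rather than taken from the slides: n is the sample size, α is one minus the confidence level, and s_x̄ = s/√n is the standard error of the mean.

```latex
\bar{x} \;\pm\; t_{1-\alpha/2,\; n-1} \, s_{\bar{x}},
\qquad s_{\bar{x}} = \frac{s}{\sqrt{n}}
```

For a 95% interval, α = 0.05 and the multiplier is the 97.5th percentile of the t distribution with n − 1 degrees of freedom.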
Distribution of Sample Means
[Figure: two histograms, one of Weight and one of Mean of Weight.]
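Normal Distribution
Useful Probabilities for Normal Distributions
[Figure: normal curve with the axis marked from μ-3σ to μ+3σ; about 68% of values fall within μ±σ, 95% within μ±2σ, and 99% within μ±3σ.]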
Confidence Intervals
[Figure: Distribution of the Sample Means, with x̄ marked and the central 95% indicated.]
Confidence Intervals
This demonstration illustrates calculating confidence intervals using PROC MEANS.
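A sketch of what such a step could look like. The data set and values are hypothetical and are not the demonstration's actual code or data. The CLM keyword requests two-sided confidence limits for the mean, and ALPHA=0.05 corresponds to 95% confidence.

```sas
/* Hypothetical data: box ID number and net weight in ounces (illustrative values only) */
data work.cereal;
   input idnumber weight;
   datalines;
1 15.02
2 14.98
3 15.01
4 14.99
5 15.03
6 14.97
;
run;

/* N, mean, standard error, and 95% confidence limits for the mean */
proc means data=work.cereal n mean stderr clm alpha=0.05;
   var weight;
run;
```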
Objectives
Define some common terminology related to hypothesis testing.
Perform hypothesis testing using the UNIVARIATE procedure.
Compare the means of paired groups using the TTEST procedure.
Judicial Analogy
Hypothesis
Collect Evidence
Decision Rule
Significance Level
Coin Analogy
Hypothesis
Collect Evidence
Decision Rule
Significance Level
Types of Errors
You used a decision rule to make a decision, but was the decision correct?

                 ACTUAL
DECISION         Fair Coin       Not Fair Coin
Fair Coin        correct         Type II error
Not Fair Coin    Type I error    correct
Modified Coin Experiment
Which coins are fair?
[Figure: four coin experiments with the results 63 Heads / 37 Tails, 40 Heads / 60 Tails, 55 Heads / 45 Tails, and 15 Heads / 85 Tails. The p-values shown are .27, < .01, < .01, and .04.]
Statistical Hypothesis Test
[Figure: the hypothesis-testing steps applied to the Rise n Shine 15 oz. example, with the labels Set Hypothesis (H0: equality, H1: difference), set Significance Level, Collect Data, Decision Rule, and p-value.]
Comparing α and the p-Value
In general, you
reject the null hypothesis if p < α
fail to reject the null hypothesis if p ≥ α.
Performing a Test of Hypothesis
To test the null hypothesis H0: μ = μ0, SAS software calculates the t statistic
$$t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}$$
Two-Sided Test of Hypothesis
[Figure: distribution of the t statistic; axis T from -3 to 3.]
One-Sided Test of Hypothesis
In many situations, you are only interested in one direction. Perhaps you only want evidence that the mean is significantly lower than fifteen.
For example, instead of testing
H0: μ = 15 versus H1: μ ≠ 15
you test
H0: μ ≥ 15 versus H1: μ < 15
One-Sided Test of Hypothesis
[Figure: distribution of the t statistic; axis T from -3 to 3.]
Hypothesis Testing
This demonstration illustrates using PROC UNIVARIATE to perform hypothesis testing.
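A sketch of how the cereal question might be tested. The data set and values are hypothetical and are not the demonstration's actual code or data. The MU0= option sets the null-hypothesis value of the mean, and the resulting Tests for Location table includes the Student's t statistic and its two-sided p-value.

```sas
/* Hypothetical data: box ID number and net weight in ounces (illustrative values only) */
data work.cereal;
   input idnumber weight;
   datalines;
1 15.02
2 14.98
3 15.01
4 14.99
5 15.03
6 14.97
;
run;

/* Test H0: mean weight = 15 ounces */
proc univariate data=work.cereal mu0=15;
   var weight;
run;
```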
Paired Samples
[Figure: Sales BEFORE ADVERTISING and Sales AFTER ADVERTISING.]
The TTEST Procedure
PROC TTEST DATA=SAS-data-set;
   CLASS variable;
   VAR variables;
   PAIRED variable*variable;
RUN;
Paired t-Test
This demonstration illustrates using PROC TTEST to conduct a paired sample t-test.
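A sketch of a paired test in the spirit of the sales example. The data set work.sales, the variables store, before, and after, and all values are hypothetical. PAIRED before*after analyzes the differences before minus after.

```sas
/* Hypothetical data: sales for the same stores before and after advertising */
data work.sales;
   input store before after;
   datalines;
1 120 131
2  98 104
3 143 140
4 110 125
5 101 109
;
run;

/* Tests H0: mean of (before - after) = 0 */
proc ttest data=work.sales;
   paired before*after;
run;
```

The CLASS and VAR statements are not used here; each row carries both measurements, so the pairing is defined by the PAIRED statement alone.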
Section 1.5
Two-Sample t-Tests
Objectives
Recognize and validate the assumptions of a two-sample t-test.
Analyze two populations with the TTEST procedure.
Cereal Example
[Figure: two cereal brands, Rise n Shine and Morning.]
Assumptions
independent observations
normally distributed data for each group
equal variances for each group.
Comparing Two Populations
[Figure: the Morning and Rise n Shine populations, labeled 1 and 2.]
Test Statistics and p-Values
F Test for equal variances: H0: σ1² = σ2²
Variance Test:
F = 1.51 DF = (3,3) Prob > F = 0.7446
t-Tests for equal means: H0: μ1 = μ2
Unequal Variance t-test:
T = 7.4017 DF = 5.8 Prob > |T| = 0.0004
Equal Variance t-test:
T = 7.4017 DF = 6.0 Prob > |T| = 0.0003
Test Statistics and p-Values
F Test for equal variances: H0: σ1² = σ2²
Variance Test:
F = 15.28 DF = (9,4) Prob > F = 0.0185
t-Tests for equal means: H0: μ1 = μ2
Unequal Variance t-test:
T = -2.4518 DF = 11.1 Prob > |T| = 0.0320
Equal Variance t-test:
T = -1.7835 DF = 13.0 Prob > |T| = 0.0979
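Results of this kind (a test for equal variances plus equal-variance and unequal-variance t-tests) come from a two-sample PROC TTEST step with a CLASS statement. The sketch below uses a hypothetical data set; the name work.brands, the brand labels, and the weights are assumptions and do not reproduce the numbers shown on the slides.

```sas
/* Hypothetical data: cereal brand and net weight in ounces (illustrative values only) */
data work.brands;
   input brand :$12. weight;
   datalines;
Morning    14.95
Morning    14.99
Morning    15.04
Morning    14.97
RiseNShine 15.02
RiseNShine 15.06
RiseNShine 15.01
RiseNShine 15.05
;
run;

/* CLASS names the two-level grouping variable; VAR names the response */
proc ttest data=work.brands;
   class brand;
   var weight;
run;
```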
Section 1.6
Output Delivery System
Objectives
Introduce the Output Delivery System (ODS).
Examine some simple statements in ODS.
Use ODS to capture some specific UNIVARIATE procedure output.
Use ODS to generate a report in the HTML format.
Use ODS to generate data sets with specific PROC UNIVARIATE output.
Output Delivery System
SAS procedure computes results
Output object created in ODS
ODS converts data component into SAS data set
ODS Statements
TRACE provides information about the output object, such as the name and path.
LISTING opens, manages, or closes the Listing destination.
OUTPUT creates a SAS data set from an output object.
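A sketch of how these statements might be combined with PROC UNIVARIATE. The data set, the output data set name work.basic_measures, and the file name univariate_report.html are hypothetical. BasicMeasures and Moments are names of PROC UNIVARIATE output objects, which ODS TRACE lists in the log.

```sas
/* Hypothetical data: box ID number and net weight in ounces (illustrative values only) */
data work.cereal;
   input idnumber weight;
   datalines;
1 15.02
2 14.98
3 15.01
4 14.99
5 15.03
6 14.97
;
run;

/* 1. Write the name, label, and path of each output object to the log */
ods trace on;
proc univariate data=work.cereal;
   var weight;
run;
ods trace off;

/* 2. Capture one output object as a SAS data set */
ods output BasicMeasures=work.basic_measures;
proc univariate data=work.cereal;
   var weight;
run;

/* 3. Send only the Moments table to an HTML report */
ods html file='univariate_report.html';
ods select Moments;
proc univariate data=work.cereal;
   var weight;
run;
ods html close;
```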
Output Delivery System
This demonstration illustrates the Output Delivery System by introducing some simple concepts and building on that knowledge.
Section 1.7
Exercises
Section 1.8
Chapter Summary
Section 1.9
Solutions to Exercises