INFO 515 Lecture #10 1
Action Research Review
INFO 515
Glenn Booker
INFO 515 Lecture #10 2
Why do we do this?
Measurements are needed to understand a system and predict its future behavior
Statistical techniques provide a commonly accepted means of analyzing measurements
Statistics is based on recognizing that measurements tend to fall over a range of values, not just one precise number
INFO 515 Lecture #10 3
Types of Research
Historical (what happened?)
Descriptive (what is happening?)
Developmental (over time)
Case and Field (study an organization)
Correlational (does A affect B?)
Causal Comparative (what caused it?)
True Experimental (single / double blind)
Quasi-Experimental
Action Research
INFO 515 Lecture #10 4
Data Analysis
Raw data, such as one survey result
Refined data, such as the distribution of ages of Philadelphia residents
Derived data, such as comparing the age distribution of Philadelphia residents to that of the country
INFO 515 Lecture #10 5
Population vs. Sample
Often the subject of interest (population) is so big it isn’t feasible to measure it all
Then a sample of measurements can be made, and we want to relate the sample measurement to the population
INFO 515 Lecture #10 6
Sampling
Sampling can be done using probabilistic techniques (e.g. various random samples)
  Simple or stratified random, Cluster (geographic), or Systematic (every Nth) samples
Or using non-probabilistic methods (whoever’s convenient, specific groups, or experts)
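As an illustration of the probabilistic techniques above, here is a minimal Python sketch of a simple random sample and a systematic (every Nth) sample. The population of 100 numbered members, the sample size of 10, and the seed are made-up values for the example, not from the lecture.

```python
import random

population = list(range(1, 101))   # hypothetical members numbered 1..100
rng = random.Random(515)           # fixed seed so the example is repeatable

# Simple random sample: every member has the same chance of selection
simple_sample = rng.sample(population, 10)

# Systematic sample: random starting point, then every Nth member
step = len(population) // 10
start = rng.randrange(step)
systematic_sample = population[start::step]

print(sorted(simple_sample))
print(systematic_sample)
```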
INFO 515 Lecture #10 7
Customer Satisfaction Surveys
A special case of sampling, customer satisfaction surveys are often done using:
  In-person interview
  Telephone interview
  Questionnaire by mail
Sample sizes are based on the allowable error, population size, and the result obtained
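One way those three inputs can combine is sketched below, assuming the common formula for estimating a proportion, n0 = z^2 * p * (1 - p) / e^2, with a finite population correction. The function name and the example numbers are illustrative assumptions; the lecture does not state which formula it uses.

```python
import math

def survey_sample_size(error, population, proportion=0.5, z=1.96):
    """Sample size needed to estimate a proportion within +/- error.

    Assumes simple random sampling. z = 1.96 corresponds to 95% confidence;
    proportion is the expected result (0.5 is the most conservative guess).
    """
    n0 = (z ** 2) * proportion * (1 - proportion) / error ** 2  # very large population
    n = n0 / (1 + (n0 - 1) / population)                         # finite population correction
    return math.ceil(n)

# Hypothetical survey: +/- 5% allowable error, 10,000 customers
print(survey_sample_size(error=0.05, population=10_000))
```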
INFO 515 Lecture #10 8
Measurement Scales
Measurements can use four major types of scales; the types of analysis possible depend strongly on the type of measurements used
  Nominal (named buckets, without sequence)
  Ordinal (ordered buckets)
  Interval (intervals mean something, can + -)
  Ratio (you can form ratios, can + - * /)
INFO 515 Lecture #10 9
Discrete versus Continuous
Discrete (nonparametric) measurements use nominal or ordinal scales; only specific values are allowed
  Car make = Chevy, or cost = High
Continuous (parametric) measurements use interval or ratio scales, and generally have integer or real number values
  Temperature = 98.6 deg F, Height = 172.1 cm
INFO 515 Lecture #10 10
Descriptive Statistics
Many common statistics can describe the central tendency and spread of a set of measurements
  Average (arithmetic mean)
  Minimum, Maximum, Range
  Median (middle value)
  Mode (most common value)
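A short sketch of these statistics using Python's standard `statistics` module. The list of ages is made up for the example.

```python
import statistics

ages = [23, 25, 25, 31, 38, 42, 57]   # made-up sample

mean = statistics.mean(ages)           # arithmetic average
median = statistics.median(ages)       # middle value of the sorted data
mode = statistics.mode(ages)           # most common value
value_range = max(ages) - min(ages)    # maximum minus minimum

print(mean, median, mode, value_range)
```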
INFO 515 Lecture #10 11
Normal Distribution
Many measurements can be described by a “normal” distribution, which is summarized by an average value and a standard deviation, or s
We can predict how likely any range of values is to occur for a normal distribution (how often is X between 5 and 8?)
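The "between 5 and 8" question can be sketched with `statistics.NormalDist`. The mean of 6 and standard deviation of 2 are assumed values chosen only for illustration.

```python
from statistics import NormalDist

# Assumed distribution: mean 6, standard deviation 2 (illustrative values)
dist = NormalDist(mu=6.0, sigma=2.0)

# Probability that X falls between 5 and 8 is the difference of the
# cumulative probabilities at the two endpoints
p_between = dist.cdf(8) - dist.cdf(5)
print(f"P(5 < X < 8) = {p_between:.3f}")
```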
INFO 515 Lecture #10 12
Z Score
Z scores measure how far from the mean a single measurement is, in units of the standard deviation
  z = (Xi - mean) / s
Same formula used for finding “t” too
The formula does not apply only to a normal distribution, but when the distribution is normal we can predict the probability of that value or higher/lower occurring
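A minimal sketch of a z score and the tail probability that goes with it when the distribution is normal. The measurement of 130 against a mean of 100 and standard deviation of 15 is a made-up example.

```python
from statistics import NormalDist

def z_score(x, mean, s):
    """Distance of one measurement from the mean, in standard deviations."""
    return (x - mean) / s

z = z_score(130, mean=100, s=15)

# Only valid if the measurements are normally distributed:
p_higher = 1 - NormalDist().cdf(z)   # probability of this value or higher
p_lower = NormalDist().cdf(z)        # probability of this value or lower

print(z, round(p_higher, 4))
```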
INFO 515 Lecture #10 13
Standard Error
A sample of N measurements will have a standard error
  SEx = s / sqrt(N)
The standard error allows us to define the confidence interval, CI
  CI = mean +/- crit*SEx
where “crit” is the critical z score for a large sample, or the critical t score for a small sample
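The two formulas above can be sketched as one small function. The sample values (mean 50, s = 10, N = 100) are made up, and the helper name is an assumption for the example.

```python
import math

def confidence_interval(mean, s, n, crit):
    """Return (low, high) for mean +/- crit * SEx, with SEx = s / sqrt(N)."""
    se = s / math.sqrt(n)
    return mean - crit * se, mean + crit * se

# Made-up large sample: mean 50, s = 10, N = 100, critical z = 1.96 (95%)
low, high = confidence_interval(50.0, 10.0, 100, 1.96)
print(f"95% CI: {low:.2f} to {high:.2f}")
```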
INFO 515 Lecture #10 14
Critical z and t
The critical z score is only a function of the desired confidence level of the results (zc = 1.96 for 95% confidence level)
Critical t score is a function of the sample size (degrees of freedom, df = n-1) and the desired confidence level
  As df gets very large, critical t approaches critical z
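A sketch of where 1.96 comes from, using the inverse normal CDF in the standard library. The standard library has no t distribution, so the critical t values below are copied from a standard two-tailed 95% t table rather than computed.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-tailed critical z for a confidence level such as 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

# Two-tailed 95% critical t values, keyed by df, from a standard t table
critical_t_95 = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}

print(f"critical z (95%): {critical_z(0.95):.3f}")
for df, t in critical_t_95.items():
    print(f"df = {df:>3}: critical t = {t}")
```

Reading down the table shows critical t shrinking toward critical z as df grows.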
INFO 515 Lecture #10 15
Confidence Level
We have to accept some level of uncertainty in a statistical analysis; our conclusion might be wrong!
Generally, a 95% level of confidence is used, unless life is on the line; then a 99% level of confidence is required
  Use 95% typically, hence critical significance is 0.050
INFO 515 Lecture #10 16
Confidence Level
The level of confidence of your results, plus the critical significance, always equals exactly one
For practically every statistical test, having the Significance of the result less than the critical value means to reject the null hypothesis
  If Sig actual < Sig crit, reject null hypothesis
INFO 515 Lecture #10 17
Frequency and Percentage
Frequency graphs and crosstabs can provide a lot of information just from counts of a nominal or ordinal measurement occurring, possibly given with the percentages of each event’s occurrence
Histograms can provide similar charts for ratio or interval scaled data
INFO 515 Lecture #10 18
Scatterplots
Scatter plots or diagrams show the relationship between two or more measures
The horizontal axis is generally the independent variable (X), sometimes also called a factor or grouping variable
The vertical axis is generally the dependent variable (Y), which is the measure you’re trying to understand
INFO 515 Lecture #10 19
Hypothesis Testing
Some statistics are used in the context of testing a hypothesis - a statement whose truth you wish to determine
  Are Philadelphians more likely to be Nobel Prize winners?
The Null hypothesis is the opposite of the hypothesis, and generally says there is no difference or no effect observed
  Philadelphians no more likely to be Nobel Prize winners than any other group
INFO 515 Lecture #10 20
Hypothesis Testing
Can’t truly PROVE anything - only determine if the differences observed are “not likely to be due to chance”
Select one or more “Tests of Significance” to determine if there is a statistically significant difference (Yes/No); if Yes, then can
Select one or more “Measures of Association” to describe the strength of the difference, and possibly its direction
INFO 515 Lecture #10 21
One versus Two Tailed Tests
A null hypothesis which tests for “no difference” uses a two tailed test
A null hypothesis which specifically tests for “greater than” uses a one tailed test
A null hypothesis which specifically tests for “less than” uses a one tailed test
One versus two tailed changes the critical z or t score; a one tailed test generally makes it easier to show significance, which is why the more conservative two tailed test is commonly used
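A small sketch of how the critical z score changes between the two kinds of test at the same critical significance of 0.050. It assumes a normal distribution and uses only the standard library.

```python
from statistics import NormalDist

alpha = 0.05  # critical significance for 95% confidence

# Two tailed: alpha is split between both tails
z_two_tailed = NormalDist().inv_cdf(1 - alpha / 2)

# One tailed: all of alpha sits in a single tail, so the cutoff is smaller
z_one_tailed = NormalDist().inv_cdf(1 - alpha)

print(f"two tailed: {z_two_tailed:.3f}, one tailed: {z_one_tailed:.3f}")
```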
INFO 515 Lecture #10 22
Z or T Test
The z or t tests can be used to compare two distribution means, or compare one distribution mean to a fixed value (interval or ratio data)
Compare the actual z or t score to the critical z or t score
If the actual z or t score is closer to zero than the critical value, accept the null hypothesis
INFO 515 Lecture #10 23
Z or T Test (Two Tailed)
[Figure: z or t scale centered on the mean, with -crit and +crit marked. Accept Null Hypothesis between -crit and +crit; Reject Null Hypothesis in each tail beyond them. An X marks the actual z or t.]
Notice this is for the z or t value, NOT the significance of that value
INFO 515 Lecture #10 24
Z or T Test (One Tailed)
[Figure: z or t scale centered on the mean, with +crit marked. Accept Null Hypothesis below +crit; Reject Null Hypothesis in the tail above it. An X marks the actual z or t.]
(Case here is testing if the actual value is greater than the mean; for a “less than” case, use only the negative critical value.)
INFO 515 Lecture #10 25
Is My Sample Normal?
Boxplots and stem-and-leaf diagrams can help show graphically whether a sample has a fairly normal distribution
The skewness and kurtosis of a data set can help identify non-normality, if their values are more than two times their own standard errors
INFO 515 Lecture #10 26
T Tests
T tests compare means for ratio or interval data
Independent t test is for two different strata within one data set
Paired t test is to compare measures of the same group before and after some event (drug test), or when the samples are otherwise believed to be dependent on each other
One-sample t test compares one sample to a fixed value
INFO 515 Lecture #10 27
T Tests
Null hypothesis is that there is no difference between the means
Results (e.g. significance) may differ if variances are not equal, since df changes
The Levene test checks for equal variances
  Null hypothesis for the Levene test is that the variances are equal
  If the Levene significance < 0.050, variances are not equal (reject the null hypothesis)
INFO 515 Lecture #10 28
Independent T Test Evaluation
Three ways to check the results of a T test
  If the T test’s significance < 0.050, reject the null hypothesis
  Check the stated t value against the critical t value for this ‘df’ level; if |t(actual)| > t(critical), reject the null hypothesis
  If the confidence interval for the difference between the means does not include zero, reject the null hypothesis
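The second and third checks can be sketched by hand with the standard library. This assumes equal variances (pooled estimate), uses made-up measurements, and takes the critical t from a standard t table; computing the significance itself would need a t distribution, which the standard library does not provide.

```python
import math
import statistics

group_a = [5.1, 4.9, 5.6, 5.8, 6.0]   # made-up measurements
group_b = [4.1, 4.5, 4.3, 4.9, 4.2]

n_a, n_b = len(group_a), len(group_b)
diff = statistics.mean(group_a) - statistics.mean(group_b)

# Pooled variance: assumes the two groups have equal variances
df = n_a + n_b - 2
pooled_var = ((n_a - 1) * statistics.variance(group_a)
              + (n_b - 1) * statistics.variance(group_b)) / df
se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_actual = diff / se_diff
t_critical = 2.306   # two tailed, 95%, df = 8, from a standard t table

# Check 2: compare the magnitude of t to the critical value
reject_by_t = abs(t_actual) > t_critical

# Check 3: does the confidence interval for the difference include zero?
ci_low = diff - t_critical * se_diff
ci_high = diff + t_critical * se_diff
reject_by_ci = not (ci_low <= 0 <= ci_high)

print(f"t = {t_actual:.3f}, CI = ({ci_low:.3f}, {ci_high:.3f})")
print(reject_by_t, reject_by_ci)
```

The two checks always agree with each other because the interval is built from the same critical t.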
INFO 515 Lecture #10 29
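Evaluating Significance
[Figure: Significance scale starting at 0, with the critical value 0.050 marked. Reject Null Hypothesis between 0 and 0.050; Accept Null Hypothesis above 0.050. An X marks the actual Sig.]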
INFO 515 Lecture #10 30
Paired T Test Evaluation
Checks before and after test cases
Includes a correlation factor (like ‘r’)
  Can use paired test if significance < 0.050
  Larger correlation factor means stronger relationship between the variables
Test evaluation is the same as for the Independent T Test
  Significance, ‘t’ value, and confidence interval
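A hand-computed sketch of the paired t value: the test reduces to a one-sample t test on the before/after differences. The scores are made up and the critical t comes from a standard t table.

```python
import math
import statistics

before = [80, 75, 90, 68, 85, 77]   # made-up scores for the same six subjects
after = [84, 78, 91, 73, 88, 80]

# Work with the per-subject differences
differences = [a - b for a, b in zip(after, before)]
n = len(differences)
mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(n)

t_actual = mean_diff / se_diff
t_critical = 2.571   # two tailed, 95%, df = n - 1 = 5, from a standard t table

print(f"t = {t_actual:.3f}, reject null: {abs(t_actual) > t_critical}")
```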
INFO 515 Lecture #10 31
One-Sample T Test
Compare a sample mean to a fixed value
Test shows the actual values of means, with their std deviation and std error
Same interpretation of results
  Significance, ‘t’ value, and confidence interval
INFO 515 Lecture #10 32
F Test and ANOVA
Compare several means against each other using Analysis of Variance (ANOVA) and the F test
Like extending the T tests to more than two groups
Want data from random samples of normal populations with equal variances
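A minimal sketch of the one-way ANOVA F ratio, computed as the between-group mean square divided by the within-group mean square. The three groups are made up, and the critical F is taken from a standard F table.

```python
import statistics

groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]   # made-up samples

all_values = [x for g in groups for x in g]
grand_mean = statistics.mean(all_values)

# Variation of the group means around the grand mean
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
# Variation of each value around its own group mean
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_actual = (ss_between / df_between) / (ss_within / df_within)
f_critical = 5.14   # 95%, df = (2, 6), from a standard F table

print(f"F = {f_actual:.2f}, reject null: {f_actual > f_critical}")
```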
INFO 515 Lecture #10 33
F Test and ANOVA
Output includes the Levene test
  Want significance for Levene > 0.050, so that equal variances can be assumed
  Otherwise, should not use ANOVA
Evaluate F by its significance
  If Sig. < 0.050, reject the null hypothesis (there is a significant difference among the means)
INFO 515 Lecture #10 34
Additional ANOVA Tests
Once the F test shows there is some difference in the means across a subset, additional ANOVA tests can help identify more specific trends and differences
Types of tests (see end of lecture 6) include
  Pairwise Multiple Comparisons
  Post Hoc Range Tests
INFO 515 Lecture #10 35
Pairwise Multiple Comparisons
Pairwise Multiple Comparisons check two subsets of data at a time
  Bonferroni test is better for a small number of subsets
  Tukey test is better for many subsets
Both assume subset variances are equal
For each pair of subset values, Sig < 0.050 means the difference in means is significant
INFO 515 Lecture #10 36
Post Hoc Range Tests
Post Hoc Range Tests look for groups within each subset which all have similar means
  Tukey and Tukey’s-b tests include Post Hoc Range Tests
Each column of the output is a subset with statistically similar means
  Subsets may overlap substantially
INFO 515 Lecture #10 37
Contrasts Across Means
Look across subset means to see if there is a trend, such as a linear increase or decrease across subsets
Can check for Linear, Quadratic, or Cubic relationships (i.e. first, second, or third order polynomials)
Check Significance of F for the Unweighted version of each relationship (Linear, etc.); if Sig. < 0.050, reject the null hypothesis
INFO 515 Lecture #10 38
Determine Linearity
An option under Compare Means / Means allows checking just for linearity
This confirms the ANOVA test result for Linearity
And gives R and Eta parameters, which are Measures of Association
INFO 515 Lecture #10 39
R and Eta
Pearson’s R * measures how well the data fits the regression (-1 is a perfect negative correlation, 0 is no relationship, 1 is perfect positive correlation); R squared describes the amount of variance the two variables share
Eta squared gives how much of the variance in one variable is explained by the changes in the other variable
* Named for English statistician Karl Pearson, 1857-1936 (per http://human-nature.com/nibbs/03/kpearson.html)
INFO 515 Lecture #10 40
Regression Analysis
Regression Analysis looks at two interval or ratio-scaled variables (generically X and Y) and tries to fit an equation between them
A dozen different equations are available
  Linear, Power, Logarithmic, Exponential, etc.
Significance is checked by ANOVA F, and Sig. of the regression coefficients; association is measured with R Squared
INFO 515 Lecture #10 41
Regression Analysis
For a regression to have any significance, we must have ANOVA’s Sig. F < 0.050
Then each variable’s coefficient (b0, b1, etc.) must have significance < 0.050
  Otherwise the coefficient might be zero
Then the better regression equations are ranked in order of strength by R Square, which is confirmed visually by plotting
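For the linear case, the coefficients b0 and b1 and R Square can be sketched with ordinary least squares written out by hand. The data points are made up, and the sketch does not compute the significance of F or of the coefficients, which a statistics package would report.

```python
x = [1, 2, 3, 4, 5]                 # made-up independent variable
y = [2.1, 3.9, 6.2, 7.8, 10.1]      # made-up dependent variable

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross products around the means
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

b1 = s_xy / s_xx                    # slope
b0 = mean_y - b1 * mean_x           # intercept
r_squared = s_xy ** 2 / (s_xx * s_yy)

print(f"Y = {b0:.2f} + {b1:.2f} * X, R Square = {r_squared:.4f}")
```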
INFO 515 Lecture #10 42
Regression Analysis
The standard error of coefficients is given, so confidence intervals can be formed
  Also helps report them meaningfully, so you don’t report a value as 4.861435 if it has a standard error of 0.92
  Depending on the accuracy of the source data, you could report that result as 5 +/- 1, or 4.9 +/- 0.9, or 4.86 +/- 0.92
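The three reporting choices above amount to rounding the value and its standard error to the same number of decimal places. A small sketch, with a hypothetical helper name:

```python
def report(value, std_error, decimals):
    """Format a value and its standard error to the same precision."""
    return f"{value:.{decimals}f} +/- {std_error:.{decimals}f}"

for decimals in (0, 1, 2):
    print(report(4.861435, 0.92, decimals))
```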
INFO 515 Lecture #10 43
Crosstabs
Crosstabs display data sorted by two or more variables in table form
Often just counts of each category, and/or the percentage of counts
Recoding data allows interval or ratio scale data to be put into groups (e.g. age 18-25)
INFO 515 Lecture #10 44
Pearson’s Chi Square
Measures how much the actual (observed) data differs from an even (expected) distribution of data
The “expected” data can be a random distribution (same number of counts per cell), or adjusted for the actual total counts for each row and column
INFO 515 Lecture #10 45
Pearson’s Chi Square Evaluation
When chi square is larger than the critical value, reject the null hypothesis
Or if the significance of chi square is < 0.050, reject the null hypothesis
Can also generate Chi square for a single variable
Beware that Chi square is less meaningful for large matrices
  Or, it’s too easy for large matrices to show significance falsely using Chi square
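A sketch of the chi square calculation for a crosstab, with the expected counts adjusted for the row and column totals. The 2x2 counts are made up, and the critical value is taken from a standard chi square table.

```python
def chi_square(observed):
    """Return (chi square, df) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

chi2, df = chi_square([[20, 30], [30, 20]])   # made-up counts
chi2_critical = 3.841                         # 95%, df = 1, from a chi square table

print(f"chi square = {chi2:.2f}, df = {df}, reject null: {chi2 > chi2_critical}")
```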
INFO 515 Lecture #10 46
Residuals
A residual is the difference between the Observed and Estimated values for a cell
Residuals can be plotted to look for outliers
Residuals can be standardized by dividing by their standard deviation
  Cells with a standardized residual magnitude > 2 contribute a lot to Chi square
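A sketch of standardized residuals for a crosstab. It assumes the common Pearson form, (observed - expected) / sqrt(expected), as the standardization; the lecture does not spell out which estimate of the standard deviation it uses. The counts are made up.

```python
import math

def standardized_residuals(observed):
    """Pearson residual (observed - expected) / sqrt(expected) for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    result = []
    for i, row in enumerate(observed):
        result_row = []
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            result_row.append((obs - expected) / math.sqrt(expected))
        result.append(result_row)
    return result

residuals = standardized_residuals([[30, 10], [10, 30]])   # made-up counts
for row in residuals:
    print([round(r, 3) for r in row])
```

With this form, the squared residuals add up to the chi square value, which is why large residuals point to the cells that drive the result.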
INFO 515 Lecture #10 47
Measures of Association
Measures of Association between two variables can be symmetric or directional
Dozens of measures have been developed to work with chi square test
Interpret them like ‘r’ - zero means no correlation, larger values mean a stronger correlation
  Some can be > 1
INFO 515 Lecture #10 48
Measures of Association
Symmetric measures don’t care which variable is dependent (Y)
Directional measures DO care which variable is dependent (A = f(B) is not B = f(A))
Some directional measures have a “symmetric” value, the weighted average of the other two
INFO 515 Lecture #10 49
Symmetric Measures
The “Contingency Coefficient” is the main symmetric measure with a Chi Square test
  Works even with nominal data
  Evaluated like Pearson’s r
Phi and Cramer’s V are other symmetric measures
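All three symmetric measures can be derived from chi square and the total count. The sketch below uses the standard textbook definitions; the chi square of 4.0 on 100 observations in a 2x2 table is a made-up example.

```python
import math

def symmetric_measures(chi2, n, rows, cols):
    """Return (phi, contingency coefficient, Cramer's V) from a chi square value."""
    phi = math.sqrt(chi2 / n)
    contingency = math.sqrt(chi2 / (chi2 + n))
    cramers_v = math.sqrt(chi2 / (n * (min(rows, cols) - 1)))
    return phi, contingency, cramers_v

phi, contingency, cramers_v = symmetric_measures(chi2=4.0, n=100, rows=2, cols=2)
print(f"phi = {phi:.3f}, C = {contingency:.3f}, V = {cramers_v:.3f}")
```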
INFO 515 Lecture #10 50
Directional Measures
Directional measures range from 0 to 1
Lambda is the recommended directional measure - tells what proportion of the dependent variable is predicted by the independent variable (like Eta)
Eta can be applied here if one variable is interval or ratio scaled
INFO 515 Lecture #10 51
Relative Risk and Odds Ratio
Use only with 2x2 tables
Are quite directional
Tells how much more likely one cell is to occur than the others
Need to be very careful when interpreting
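A sketch of both measures for a 2x2 table, using the standard definitions. The layout (rows are exposed / not exposed, columns are event / no event) and the counts are assumptions for the example; swapping rows or columns changes the result, which is one reason interpretation needs care.

```python
# Made-up 2x2 table
#                event   no event
# exposed          a        b
# not exposed      c        d
a, b = 30, 70
c, d = 10, 90

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)

relative_risk = risk_exposed / risk_unexposed   # ratio of the two risks
odds_ratio = (a * d) / (b * c)                  # ratio of the two odds

print(f"relative risk = {relative_risk:.2f}, odds ratio = {odds_ratio:.2f}")
```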
INFO 515 Lecture #10 52
Square Tables
Tables with the same number of rows and columns (RxR), and the same variables in those rows and columns, can use kappa
  Measures strength of association, like ‘r’
  Check results for significance (<0.050)
  Then judge the value of kappa using a fixed scale
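A sketch of kappa for a square table, assuming the measure meant here is Cohen's kappa: observed agreement on the diagonal, corrected for the agreement expected by chance. The counts are made up, and the sketch does not compute the significance.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table given as a list of rows of counts."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Proportion of cases on the diagonal (both variables agree)
    observed = sum(table[i][i] for i in range(len(table))) / total
    # Agreement expected by chance from the row and column totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    return (observed - expected) / (1 - expected)

kappa = cohens_kappa([[20, 5], [10, 15]])   # made-up counts
print(f"kappa = {kappa:.2f}")
```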
INFO 515 Lecture #10 53
General RxC Measures
Many measures can be used with a general table of R rows and C columns
Gamma is the recommended measure (symmetric)
Spearman’s Correlation Coefficient is also widely used
  Ranges from -1 to +1, based on ordered categories
INFO 515 Lecture #10 54
Yule’s Q
Yule’s Q is a special case of gamma for a 2x2 table
Is judged on a fixed scale, like ‘r’
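A sketch showing that gamma computed from concordant and discordant pairs reduces to Yule's Q on a 2x2 table. It assumes the Goodman and Kruskal definition of gamma, (C - D) / (C + D), with rows and columns in category order; the counts are made up.

```python
def gamma(table):
    """Gamma = (C - D) / (C + D) for a table with ordered rows and columns."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):        # only rows below cell (i, j)
                for m in range(n_cols):
                    if m > j:                     # below and to the right
                        concordant += table[i][j] * table[k][m]
                    elif m < j:                   # below and to the left
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)

def yules_q(table):
    """Yule's Q = (ad - bc) / (ad + bc) for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d - b * c) / (a * d + b * c)

table = [[30, 70], [10, 90]]   # made-up counts
print(f"gamma = {gamma(table):.4f}, Q = {yules_q(table):.4f}")
```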