Analysis of Survey Results Nora Galambos, PhD Office of Institutional Research Stony Brook University


Page 1: Galambos N Analysis Of Survey Results

Analysis of Survey Results

Nora Galambos, PhD

Office of Institutional Research

Stony Brook University


Survey Planning: Questions to Ask

» What hypotheses are being tested?

» What types of analyses are planned to test the hypotheses?

» Look over the instrument and create a map or outline of possible analysis methods

» What is the magnitude of the differences you would like to detect?


Importance of Pilot Testing

» The most obvious reason for pilot testing is to be able to estimate the sample size.

» Find potential sources of bias

» Assists in power calculations

» Discover possible distribution problems prior to surveying the entire sample


Bonferroni Adjustment

» When multiple tests are performed in an experiment the experiment-wise error rate increases.

» For example, with α = 0.05, about 1 out of every 20 tests of a true null hypothesis will be significant by chance.

» Let’s say we are comparing males and females on 15 independent factors with α = 0.05. If the null hypothesis holds for all of them, the chance of at least one being significant is actually 0.54.

» Study error rate = 1 - (1 - α)^n, where n is the number of independent tests

» To control this error rate the significance level can be adjusted using α/n
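
The 0.54 figure can be checked with a few lines of Python. This is a minimal sketch that assumes the tests are independent; the function names are illustrative and do not come from any package.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level after a Bonferroni adjustment."""
    return alpha / n_tests


unadjusted = familywise_error_rate(0.05, 15)
per_test = bonferroni_alpha(0.05, 15)
adjusted = familywise_error_rate(per_test, 15)

print(f"Unadjusted study error rate: {unadjusted:.2f}")
print(f"Bonferroni per-test level:   {per_test:.4f}")
print(f"Adjusted study error rate:   {adjusted:.3f}")
```

With the adjusted per-test level, the study error rate stays just under the nominal 0.05.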


Type I and Type II Errors

» A Type I error occurs when a true null hypothesis is rejected. The probability of a Type I error is denoted by α, and is the significance level of the hypothesis test, with 0.05 being a common value for α.

» On the other hand, a Type II error occurs when the null hypothesis is false and it is not rejected. The probability of a Type II error is denoted by β and is often set to 0.20.


Hypothesis Testing Table

                        True Results
Experimental Results    H0 is true                  H0 is false
Reject H0               α (Type I error rate)       Power = 1 - β
Accept H0               Correct decision (1 - α)    β (Type II error rate)


Power Calculations

» Statistical Power Analysis for the Behavioral Sciences—Jacob Cohen

» The power of a significance test is the probability of rejecting a false null hypothesis, and is equal to 1 - β. If β is 0.20, the power = 0.80.

» A power of 0.80 is generally considered to be adequate

» Since sample size and power are related, a small sample size results in less power, or reduced probability of rejecting a false null hypothesis.


Using Sample Size Tables

» Based on initial planning, what differences do you hope or need to detect?

» For example, you may want to find the sample size needed for a t test to evaluate the difference between two means (where the standard deviation is the same in both groups).

» Calculate the effect size: d = (μ1 - μ2) / σ, the difference between the two group means divided by the common standard deviation
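
When pilot data are available, the effect size can be estimated from the two samples. The sketch below assumes Cohen's d with a pooled standard deviation as the estimate of the common σ; the function name and the pilot numbers are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)


# Hypothetical pilot responses for two groups.
pilot_a = [2, 4, 6]
pilot_b = [1, 3, 5]
print(f"Estimated effect size d = {cohens_d(pilot_a, pilot_b):.2f}")
```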


Power for a two-sided test, α=0.01

n (for each group)    d = 0.2    d = 0.5    d = 0.8
30                    0.03       0.24       0.66
40                    0.04       0.35       0.82
50                    0.06       0.45       0.91
60                    0.07       0.55       >0.995
80                    0.12       0.82       >0.995
100                   0.29       0.99       >0.995
200                   0.29       >0.995     >0.995
500                   0.72       >0.995     >0.995

d = 0.2, 0.5, 0.8 (small, medium, and large effects)
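
Power for a given effect size and group size can also be approximated directly. The sketch below uses a normal approximation to the two-sample, two-sided test rather than the noncentral t distribution behind published tables, so its values run slightly high for small n and need not match a printed table exactly. The function names are illustrative.

```python
import math
from statistics import NormalDist


def approx_power_two_sample(d, n_per_group, alpha=0.01):
    """Normal-approximation power of a two-sided, two-sample test."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    shift = d * math.sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


def n_for_power(d, target=0.80, alpha=0.01):
    """Smallest per-group n whose approximate power reaches the target."""
    n = 2
    while approx_power_two_sample(d, n, alpha) < target:
        n += 1
    return n


for n in (30, 50, 100):
    row = [f"{approx_power_two_sample(d, n):.2f}" for d in (0.2, 0.5, 0.8)]
    print(n, row)
print("n per group for 0.80 power at d = 0.5:", n_for_power(0.5))
```

Because power can only rise as n grows for a fixed effect size, a recomputation like this also gives a quick sanity check on tabulated entries.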


Types of Missing Data

» Missing Completely at Random (MCAR)

˃ Given two variables X and Y, the missingness is unrelated to either. The missing values in X are independent of Y and vice versa.

˃ If the data are MCAR, then listwise deletion is appropriate

» Missing at Random (MAR)

˃ Given two variables X and Y, the missingness is related to or dependent upon X, but not Y. Suppose X = age and Y = income. If income is more often missing in certain age groups, but within each age group no income group is missing more often than any other, then the data are MAR.

» Nonignorable

˃ Given two variables X and Y, the missingness is related to X, but may also be related to Y. In our age-income example, certain income groups within an age group may be less likely to respond.


Evaluating Missing Data

» Select items with a missing percentage greater than 1% or 2%.

» Recode them into binary variables with 1 = missing and 0 = non-missing.

» Analyze these variables by the demographic variables using t-tests or chi-square, as appropriate.

» Significant results indicate that missingness is associated with one or more of the demographic variables.
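
The recode-and-test steps can be sketched for one item and one two-level demographic variable. The data, group labels, and function names below are hypothetical. The chi-square test is computed by hand for a 2x2 table without a continuity correction, and its p-value uses the one-degree-of-freedom identity P(χ² > x) = erfc(sqrt(x / 2)).

```python
import math


def missing_indicator(value):
    """Recode a response as 1 = missing, 0 = non-missing."""
    return 1 if value is None else 0


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (df = 1) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical survey: (demographic group, income response or None).
responses = (
    [("A", None)] * 5 + [("A", 40000)] * 45
    + [("B", None)] * 20 + [("B", 52000)] * 30
)

# Rows are groups; columns are counts of non-missing (0) and missing (1).
groups = sorted({group for group, _ in responses})
table = [[0, 0] for _ in groups]
for group, income in responses:
    table[groups.index(group)][missing_indicator(income)] += 1

statistic, p_value = chi_square_2x2(table)
print(table, f"chi-square = {statistic:.2f}", f"p = {p_value:.4f}")
```

A small p-value here would suggest that missingness on the income item differs between the two groups.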


Data Reduction Methods

» Used to uncover relationship patterns among a group of variables with the goal of reducing the variables to a smaller group

» Two types of data reduction methods--confirmatory and exploratory

» Exploratory factor analysis does not assume any particular structure prior to the analysis and is used to “explore” relationships between variables

» Confirmatory factor analysis is used to test hypotheses regarding the underlying structure of a group of variables

» Traditional factor analysis and principal components analysis are exploratory data reduction methods


Principal Components Analysis

» Principal components analysis is a method often used for reducing the number of variables

» Principal components analysis is part of the factor analysis procedures in SAS and SPSS

» Although factor analysis (FA) and principal components analysis (PCA) have mathematical differences, the results are often similar

» Many authors loosely use the term “factor analysis” to refer to data reduction methods, in general


Principal Components Analysis

» Finds groups that are correlated with each other, possibly measuring the same construct.

» Reduces the variables in the data to a smaller number of items that account for most of the variance of all of the variables in the data

» The first component accounts for the greatest amount of variance. The second one accounts for the greatest amount not accounted for by the first component and is uncorrelated with the first component.
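
To make the idea of successive components concrete, the sketch below extracts them from a small correlation matrix. It uses power iteration with deflation, chosen here only because it fits in a few lines of plain Python; it is not how SAS or SPSS compute components. The correlation matrix is hypothetical.

```python
import math


def power_iteration(matrix, iterations=500):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    size = len(matrix)
    # An uneven start vector avoids starting orthogonal to the target.
    vector = [float(i + 1) for i in range(size)]
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(size))
                   for i in range(size)]
        norm = math.sqrt(sum(value * value for value in product))
        vector = [value / norm for value in product]
    product = [sum(matrix[i][j] * vector[j] for j in range(size))
               for i in range(size)]
    eigenvalue = sum(vector[i] * product[i] for i in range(size))
    return eigenvalue, vector


def principal_components(matrix, count):
    """Extract components one at a time, removing each from the matrix."""
    working = [row[:] for row in matrix]
    found = []
    for _ in range(count):
        eigenvalue, vector = power_iteration(working)
        found.append((eigenvalue, vector))
        # Deflation: subtract the variance already accounted for.
        working = [[working[i][j] - eigenvalue * vector[i] * vector[j]
                    for j in range(len(matrix))] for i in range(len(matrix))]
    return found


# Hypothetical correlation matrix for three survey items.
correlations = [[1.0, 0.8, 0.3],
                [0.8, 1.0, 0.2],
                [0.3, 0.2, 1.0]]
components = principal_components(correlations, 3)
for number, (eigenvalue, _) in enumerate(components, start=1):
    share = eigenvalue / len(correlations)
    print(f"Component {number}: eigenvalue {eigenvalue:.3f}, {share:.1%} of variance")
```

Each eigenvalue divided by the number of variables gives the proportion of variance that component accounts for, and the eigenvectors of different components are orthogonal.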


Necessary Assumptions

» Suggested sample size: at least 100 subjects and 10 observations per variable

» A correlation analysis of the variables should result in most correlations greater than 0.3

» Bartlett’s test of sphericity is significant (p < 0.05)

» Kaiser-Meyer-Olkin (KMO) test of sampling adequacy ≥ 0.6

» Determinant >0.00001 which indicates that multicollinearity is not a problem
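
Two of these checks are easy to compute by hand for a small example. The sketch below finds the determinant of a hypothetical 3x3 correlation matrix and the Bartlett sphericity statistic, using the standard form χ² = -[(n - 1) - (2p + 5)/6] · ln|R| with p(p - 1)/2 degrees of freedom. The sample size of 100 is an assumption, the statistic is compared with the 0.05 critical value for 3 degrees of freedom (7.815), and the KMO measure is not computed here.

```python
import math


def determinant_3x3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def bartlett_sphericity(corr_determinant, n_subjects, n_variables):
    """Bartlett's chi-square statistic and its degrees of freedom."""
    multiplier = (n_subjects - 1) - (2 * n_variables + 5) / 6
    statistic = -multiplier * math.log(corr_determinant)
    degrees_of_freedom = n_variables * (n_variables - 1) // 2
    return statistic, degrees_of_freedom


# Hypothetical correlation matrix and sample size.
correlations = [[1.0, 0.8, 0.3],
                [0.8, 1.0, 0.2],
                [0.3, 0.2, 1.0]]
det = determinant_3x3(correlations)
statistic, df = bartlett_sphericity(det, n_subjects=100, n_variables=3)

CRITICAL_CHI2_DF3 = 7.815  # 0.05 critical value for 3 degrees of freedom
print(f"Determinant = {det:.3f} (want > 0.00001)")
print(f"Bartlett chi-square = {statistic:.1f} on {df} df, "
      f"significant: {statistic > CRITICAL_CHI2_DF3}")
```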


Obtaining a PCA

» In SPSS select principal components under “extraction method”

» Select varimax rotation.

˃ A rotation uses a transformation to aid in the interpretation of the factor solution

˃ A varimax rotation is orthogonal, so the components remain uncorrelated; it maximizes the variance of the squared loadings within each column


Evaluating PCA Results

» Kaiser criterion—choose components with eigenvalues greater than one.

» Scree plot: plot of eigenvalues

˃ Retain the components whose eigenvalues come before the leveling-off point of the plot.

» Want the proportion of variance accounted for by each factor (or component) to be 5% to 10%

» Cumulative variance accounted for should be 70% to 80%


Abbreviated Table of Variance Explained

Total Variance Explained

            Initial Eigenvalues                     Extraction Sums of Squared Loadings     Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %    Total   % of Variance   Cumulative %    Total   % of Variance   Cumulative %
1           14.26   47.53           47.53           14.26   47.53           47.53           7.22    24.06           24.06
2           2.55    8.49            56.02           2.55    8.49            56.02           5.79    19.31           43.37
3           1.37    4.56            60.58           1.37    4.56            60.58           4.41    14.70           58.07
4           1.09    3.64            64.22           1.09    3.64            64.22           1.84    6.15            64.22
5           0.98    3.26            67.48
6           0.86    2.86            70.33
7           0.80    2.67            73.00
8           0.75    2.51            75.51
9           0.68    2.25            77.76
10          0.62    2.06            79.82
11          0.58    1.93            81.75
12          0.56    1.88            83.63
13          0.49    1.64            85.27
14          0.48    1.59            86.85
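
The Kaiser criterion and the variance percentages can be reproduced from the initial eigenvalues in this table. The sketch assumes 30 variables in the analysis; that count is inferred from 14.26 / 0.4753 ≈ 30 and is not stated in the table. Small differences from the printed percentages come from the eigenvalues being rounded to two decimals.

```python
# Initial eigenvalues for the first 14 components, copied from the table.
eigenvalues = [14.26, 2.55, 1.37, 1.09, 0.98, 0.86, 0.80,
               0.75, 0.68, 0.62, 0.58, 0.56, 0.49, 0.48]

# Assumption: total variance equals the number of standardized variables.
N_VARIABLES = 30

# Kaiser criterion: keep components with eigenvalues greater than one.
retained = [value for value in eigenvalues if value > 1.0]

cumulative = 0.0
for number, value in enumerate(eigenvalues, start=1):
    percent = value / N_VARIABLES * 100
    cumulative += percent
    flag = "retain" if value > 1.0 else ""
    print(f"{number:>2}  {value:5.2f}  {percent:5.2f}%  {cumulative:5.2f}%  {flag}")

cumulative_retained = sum(retained) / N_VARIABLES * 100
print(f"{len(retained)} components retained, "
      f"{cumulative_retained:.1f}% of variance accounted for")
```

Here the Kaiser criterion keeps four components, which together account for about 64% of the variance, short of the 70% to 80% guideline.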


Scree Plot


More about PCA Results

» There should be at least three items with significant loadings on each component

» Check the conceptualization of the component items

» With an orthogonal rotation the factor loadings = correlation between variable and component

» A communality is the proportion of variance in a variable that is accounted for by the retained components or factors. A variable’s communality is large if the variable loads heavily on at least one component.
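
With an orthogonal rotation, a variable's communality is the sum of its squared loadings across the retained components. A minimal sketch follows; the item names and loadings are hypothetical.

```python
def communality(loadings):
    """Sum of squared loadings across the retained components."""
    return sum(loading ** 2 for loading in loadings)


# Hypothetical rotated loadings on two retained components.
rotated_loadings = {
    "item_1": [0.80, 0.30],
    "item_2": [0.25, 0.75],
    "item_3": [0.20, 0.15],
}

for item, loadings in rotated_loadings.items():
    print(f"{item}: communality = {communality(loadings):.2f}")
```

In this example the third item loads weakly on both components, so little of its variance is accounted for by the solution.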


Obtaining Scores

» Factor score

˃ Save the regression scores as variables

˃ Standardize the survey responses

˃ For each subject, multiply each standardized survey response by the corresponding regression weight and add the results

» Factor-based score

˃ Average the responses of the items in the component

˃ Check for reverse codings and missing data.
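
Both kinds of score can be sketched in a few lines. The item names, weights, and the 1 to 5 response scale are assumptions for illustration. In the factor-based score, missing items are skipped and reverse-coded items are flipped before averaging.

```python
from statistics import mean


def factor_score(standardized_responses, regression_weights):
    """Weighted sum of one subject's standardized responses."""
    return sum(standardized_responses[item] * weight
               for item, weight in regression_weights.items())


def factor_based_score(responses, items, reverse_items=(),
                       scale_min=1, scale_max=5):
    """Average of the answered items, after flipping reverse-coded ones."""
    values = []
    for item in items:
        value = responses.get(item)
        if value is None:  # skip missing responses
            continue
        if item in reverse_items:
            value = scale_min + scale_max - value
        values.append(value)
    return mean(values) if values else None


# Hypothetical subject: q2 is missing and q3 is reverse coded.
subject = {"q1": 4, "q2": None, "q3": 2}
print(factor_based_score(subject, ["q1", "q2", "q3"], reverse_items={"q3"}))

# Hypothetical standardized responses and regression weights.
print(factor_score({"q1": 1.0, "q2": -0.5}, {"q1": 0.4, "q2": 0.2}))
```

A stricter version might return no score when too few items in the component were answered.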


Cronbach’s Alpha

» Cronbach’s Alpha is used to measure the reliability or the internal consistency of the factors or components.

» The variables in a scale are all entered into the calculation to obtain the alpha score.

» A Cronbach’s alpha > 0.7 is considered to be sufficient for demonstrating internal consistency for most social science research, while values > 0.6 are marginally acceptable
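
Cronbach's alpha for a scale of k items can be computed as k / (k - 1) × (1 - sum of item variances / variance of the total score). The sketch below applies that formula to hypothetical responses from five subjects on three items; the function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of responses per item."""
    k = len(items)
    # Total scale score for each subject.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: three items, five subjects, no missing data.
scale_items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
alpha = cronbach_alpha(scale_items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Reverse-coded items should be flipped and missing responses handled before the items are passed to the calculation.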