SPSS 2: Data analysis By Wendiann Sethi Spring 2011


Page 1: SPSS 2: Data analysis

SPSS 2: Data analysis
By Wendiann Sethi

Spring 2011

Page 2: SPSS 2: Data analysis

The second stage of using SPSS is data analysis. We will review descriptive statistics and then move on to other methods of data exploration: crosstabulations, inferences on the mean, regression and ANOVA. Students are encouraged to bring data that they are analyzing in class or in projects to discuss which methods would be best to use.

Course description:

Page 3: SPSS 2: Data analysis

Descriptive statistics
Identifying outliers
Missing data – by variable, by respondent
Manipulating data
◦ Reversing the scale
◦ Collapsing a continuous variable into groups
Choosing the right statistic
Data analysis
◦ Cross tabulations
◦ Correlations and regression
◦ Tests about means and proportions
◦ ANOVA
Multiple responses

Course objectives

Page 4: SPSS 2: Data analysis

Measures of central tendency
Measures of spread

Analyze > Descriptive Statistics

What do you use for each type of variable and why?

Descriptive Statistics
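
To make the question above concrete, here is a minimal Python sketch (outside SPSS, using made-up data and variable names). It assumes the common convention that the mode suits nominal variables, while the mean, median and standard deviation suit scale variables; it is an illustration, not the course's own answer.

```python
import statistics

# Hypothetical data: one nominal variable and one scale variable.
region = ["north", "south", "south", "east", "south", "north"]
age = [23, 25, 31, 35, 41, 67]

# Nominal data: the mode is the only meaningful measure of central tendency.
region_mode = statistics.mode(region)

# Scale data: mean and median for the centre, standard deviation and range for spread.
age_mean = statistics.mean(age)
age_median = statistics.median(age)
age_sd = statistics.stdev(age)      # sample standard deviation (divides by n - 1)
age_range = max(age) - min(age)

print(region_mode, age_mean, age_median, round(age_sd, 2), age_range)
```

The mean sits above the median here because one value (67) pulls it upward, which is the kind of gap the next slide uses as an outlier signal.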

Page 5: SPSS 2: Data analysis

Look at the mean, median, standard deviation and skewness to determine whether there might be outliers

A box plot can also be used to see them

Outliers
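
The two checks on this slide can be sketched in Python as follows. This is a rough illustration with hypothetical values: it uses the usual 1.5 × IQR box-plot rule and a simple moment-based skewness, and both the quartile method and the skewness formula are assumptions that may differ slightly from what SPSS reports.

```python
import statistics

def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

incomes = [21, 23, 24, 25, 26, 27, 28, 30, 95]   # hypothetical values

print("outliers:", iqr_outliers(incomes))
print("mean:", statistics.fmean(incomes), "median:", statistics.median(incomes))
print("skewness:", skewness(incomes))
```

A mean well above the median together with positive skewness points to unusually high values, and the box-plot rule names the cases responsible.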

Page 6: SPSS 2: Data analysis

Two potential problems with missing data:
1. Large amount of missing data – the number of valid cases decreases, which lowers the statistical power
2. Nonrandom missing data – related to respondent characteristics and/or respondent attitudes – may create a bias

Missing Data revisited

Page 7: SPSS 2: Data analysis

Missing Data Analysis

Examine missing data
◦ By variable
◦ By respondent
◦ By analysis

If no problem is found, go directly to your analysis
If a problem is found:
◦ Delete the cases with missing data
◦ Try to estimate the value of the missing data

Page 8: SPSS 2: Data analysis

Use Analyze > Descriptive Statistics > Frequencies

Look at the frequency tables to see how much data is missing

If more than 5% is missing, that is too much and the missing data needs to be analyzed further.

Amount of missing data by variable
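
As an illustration of the 5% check, the sketch below computes the percentage of missing values per variable in plain Python. The data, the variable names and the use of `None` for a missing answer are all assumptions made for the example.

```python
# Hypothetical survey variables; None marks a missing answer.
data = {
    "age":    [34, 29, 41, 52, 38, 45, 27, 33, 60, 48],
    "income": [52, None, 61, None, 47, 58, None, 39, 72, 55],
    "rating": [4, 5, 3, 4, None, 2, 5, 4, 3, 4],
}

def percent_missing(values):
    """Share of cases without a valid answer, as a percentage."""
    return 100 * sum(v is None for v in values) / len(values)

missing_report = {name: percent_missing(values) for name, values in data.items()}

# Flag variables above the 5% rule of thumb from the slide.
flagged = [name for name, pct in missing_report.items() if pct > 5]

print(missing_report)
print("needs a closer look:", flagged)
```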

Page 9: SPSS 2: Data analysis

1. Use Transform > Count
2. Create NMISS in the target variable
3. Pick a set of variables that have more missing data
4. Click on Define Values
5. Click on System- or user-missing
6. Click Add
7. Click Continue and then OK
8. Use the frequency table to show you the results of NMISS

Missing data by respondent
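
The counting steps above amount to the following, sketched in Python with hypothetical question names (`q1` to `q3`) and `None` standing in for a missing value. It mirrors the idea of the NMISS variable, not the SPSS dialog itself.

```python
from collections import Counter

# Hypothetical respondents; None marks a missing answer.
rows = [
    {"q1": 4,    "q2": 3,    "q3": 5},
    {"q1": None, "q2": 2,    "q3": None},
    {"q1": 5,    "q2": None, "q3": 4},
    {"q1": 3,    "q2": 4,    "q3": 2},
]
check = ["q1", "q2", "q3"]   # the variables picked for counting

# NMISS: number of missing answers per respondent.
nmiss = [sum(row[v] is None for v in check) for row in rows]

# Frequency table of NMISS: how many respondents miss 0, 1, 2, ... answers.
nmiss_freq = Counter(nmiss)

print(nmiss)
print(sorted(nmiss_freq.items()))
```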

Page 10: SPSS 2: Data analysis

Use Analyze > Descriptive Statistics > Crosstabs

Look to see if there is a relationship between NMISS (row) and another variable (column)

Use column percents to view the % of missing for the value of the variable

Missing data patterns
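
A small Python sketch of the column-percent idea follows. The grouping variable and the counts are invented for illustration; each pair holds a respondent's group (the column) and NMISS value (the row).

```python
from collections import Counter

# Hypothetical (group, nmiss) pairs, one per respondent.
cases = [
    ("female", 0), ("female", 0), ("female", 1), ("female", 0),
    ("male", 2), ("male", 0), ("male", 1), ("male", 1),
]

def column_percents(pairs):
    """Percentage of each NMISS value within each column (group)."""
    cell = Counter(pairs)
    col_total = Counter(col for col, _ in pairs)
    return {(col, row): 100 * n / col_total[col] for (col, row), n in cell.items()}

pct = column_percents(cases)
for (group, n_missing), value in sorted(pct.items()):
    print(group, n_missing, f"{value:.0f}%")
```

If the percentages differ sharply between columns, the missing data may be nonrandom with respect to that variable.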

Page 11: SPSS 2: Data analysis

Proceed anyway
Estimate (impute) the missing data by substituting the mean or median value

What to do about the missing data?
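
Mean or median substitution can be sketched in a few lines of Python. The function name `impute` and the sample scores are hypothetical, and `None` again stands for a missing value.

```python
import statistics

def impute(values, method="mean"):
    """Replace None with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if method == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

scores = [10, None, 14, 12, None, 40]   # hypothetical scores

print(impute(scores, "mean"))
print(impute(scores, "median"))
```

The two methods can give quite different fills when the data are skewed, as here, and either one reduces the variability of the variable because every imputed case gets the same value.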

Page 12: SPSS 2: Data analysis

Recoding
Calculating
When to create a new variable versus changing an existing one

Manipulating data
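
The two manipulations named in the course objectives, reversing a scale and collapsing a continuous variable into groups, can be sketched as below. The 1 to 5 scale and the age cut points (30 and 50) are assumptions chosen only for illustration.

```python
def reverse_scale(value, low=1, high=5):
    """Reverse-score an item: on a 1-5 scale, 1 becomes 5, 2 becomes 4, and so on."""
    return low + high - value

def collapse_age(age):
    """Collapse age in years into three groups using hypothetical cut points."""
    if age < 30:
        return 1      # under 30
    if age < 50:
        return 2      # 30 to 49
    return 3          # 50 and over

answers = [1, 2, 3, 4, 5]
ages = [19, 29, 30, 49, 50, 73]

# Results go into new lists, so the original values stay available.
answers_reversed = [reverse_scale(a) for a in answers]
age_groups = [collapse_age(a) for a in ages]

print(answers_reversed)
print(age_groups)
```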

Page 13: SPSS 2: Data analysis

What do you want to explore?

What do you need in your data to do that exploration?

Choosing the right statistic

Page 14: SPSS 2: Data analysis

Analyze > Descriptive Statistics > Crosstabs

Good for categorical data to see the relationship between two or more variables

Statistics: correlation, Chi Square, association

Cells: Percentages – row or column

Clustered bar charts

Crosstabulation
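
To show what the Chi Square statistic measures, here is a minimal Python sketch that computes the Pearson chi-square from a table of counts. The counts are hypothetical, and the sketch stops at the statistic and degrees of freedom; SPSS additionally reports the significance level.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical 2 x 2 crosstab: rows are groups, columns are yes/no answers.
counts = [[30, 10],
          [20, 40]]

stat, df = chi_square(counts)
print(round(stat, 3), df)
```

The larger the gap between observed and expected counts, the larger the statistic; a table whose rows have identical proportions gives zero.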

Page 15: SPSS 2: Data analysis

Finding the relationship between two scale or ordinal variables.

Analyze > Correlate > Bivariate

Analyze > Regression > Linear

Correlation and regression

Page 16: SPSS 2: Data analysis

Aim: find out whether a relationship exists and determine its magnitude and direction

Two correlation coefficients:
◦ Pearson product moment correlation coefficient (r) – interval or ratio scale variables
◦ Spearman rank order correlation coefficient (rho) – ordered or ranked data

Assumptions:
◦ Related pairs of scores
◦ Relationship of the variables is linear
◦ Variables are measured at least at the ordinal level
◦ Homoscedasticity – variability of the y variable should remain constant at all values of the x variable

Correlation
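
The difference between the two coefficients can be seen in a short Python sketch. It computes Pearson's r directly and Spearman's rho as Pearson's r on the ranks (ties get the average rank). The data and function names are made up for the example.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rank order correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

hours = [1, 2, 3, 4, 5]        # hypothetical scores
score = [1, 4, 9, 16, 25]      # rises with hours, but not in a straight line

print(pearson_r(hours, score), spearman_rho(hours, score))
```

Because the relationship is perfectly ordered but curved, rho reaches 1 while r stays just below it, which is why the linearity assumption matters for Pearson's r.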

Page 17: SPSS 2: Data analysis

Aim: After finding a correlation, find an appropriate linear model to predict the dependent variable (DV) from one or more independent variables (IVs)

Assumptions:
◦ Related pairs of scores
◦ Relationship of the variables is linear
◦ Variables are measured at least at the ordinal level
◦ Homoscedasticity – variability of the y variable should remain constant at all values of the x variable

Procedure: Linear Regression
◦ One IV to one DV
◦ ANALYZE>REGRESSION>LINEAR
◦ After placing the appropriate DV and IV, click STATISTICS
◦ Click CONTINUE and then OK to run the analysis

Regression
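
For the one-IV case, the model that linear regression fits can be sketched with ordinary least squares in plain Python. The data and the function name `fit_line` are hypothetical; the SPSS output contains more than this (for example standard errors and significance tests).

```python
def fit_line(x, y):
    """Least-squares slope, intercept and R squared for one IV (x) and one DV (y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R squared: share of the DV's variation explained by the line.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

iv = [1, 2, 3, 4, 5]     # hypothetical independent variable
dv = [2, 4, 5, 4, 5]     # hypothetical dependent variable

slope, intercept, r_squared = fit_line(iv, dv)
print(slope, intercept, r_squared)

# Predicted DV for a new IV value of 6.
print(intercept + slope * 6)
```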

Page 18: SPSS 2: Data analysis

Comparing the means of a scale (or ordinal) variable when grouped by a category

Analyze > Compare Means
◦ Means: the simplest form; the DV is the scale variable to be compared and the IV gives the categories
◦ One-sample t-test: tests the mean of the variable against a set value
◦ Independent-samples t-test: looks at the difference between two means of the variable given a grouping variable (two groups only)
◦ Paired-samples t-test: looks at the difference of the means when there is paired data (pre-test vs. post-test)
◦ One-way ANOVA: compares the means of dependent variables (scale or ordinal) given a factor (one categorical IV)

Comparing means
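
As a small illustration of the one-sample t-test in the list above, this Python sketch computes the t statistic and degrees of freedom for made-up ratings tested against a set value of 5. It leaves out the p-value, which SPSS reports alongside.

```python
import math
import statistics

def one_sample_t(values, test_value):
    """t statistic and degrees of freedom for a mean tested against a set value."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    t = (statistics.fmean(values) - test_value) / standard_error
    return t, n - 1

ratings = [5, 6, 7, 8, 9]     # hypothetical ratings

t, df = one_sample_t(ratings, 5)
print(round(t, 3), df)
```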

Page 19: SPSS 2: Data analysis

Aim: Testing the differences between the means of two independent samples or groups

Requirements:
◦ Only one independent (grouping) variable IV (ex. Gender)
◦ Only two levels for that IV (ex. Male or Female)
◦ Only one dependent variable (DV)

Assumptions:
◦ Sampling distribution of the difference between the means is normally distributed
◦ Homogeneity of variances – tested by Levene’s Test for Equality of Variances

Procedure:
◦ ANALYZE>COMPARE MEANS>INDEPENDENT SAMPLES T-TEST
◦ Test variable – DV
◦ Grouping variable – IV
◦ DEFINE GROUPS (need to remember your coding of the IV)
◦ Can also divide a range by using a cut point

T-test for independent groups
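
The homogeneity-of-variances assumption matters because it decides which version of the t statistic to read. The sketch below, with hypothetical group scores, computes both: the pooled-variance version (equal variances assumed) and the separate-variance version (equal variances not assumed). Levene's test itself and the p-values are not reproduced here.

```python
import math
import statistics

def independent_t(a, b):
    """Return (t with pooled variance, its df, t with separate variances)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    diff = statistics.fmean(a) - statistics.fmean(b)
    # Equal variances assumed: pool the two sample variances.
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t_equal = diff / math.sqrt(pooled * (1 / na + 1 / nb))
    # Equal variances not assumed: each group keeps its own variance.
    t_unequal = diff / math.sqrt(va / na + vb / nb)
    return t_equal, na + nb - 2, t_unequal

group_1 = [4, 5, 6, 7, 8]     # hypothetical scores, IV coded 1
group_2 = [1, 2, 3, 4, 5]     # hypothetical scores, IV coded 2

t_equal, df, t_unequal = independent_t(group_1, group_2)
print(t_equal, df, t_unequal)
```

With equal group sizes and equal variances, as in this example, the two versions coincide; they diverge as the group variances or sizes differ.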

Page 20: SPSS 2: Data analysis

Aim: Used in repeated-measures or correlated-groups designs, where each subject is tested twice on the same variable; also used for matched pairs

Requirements:
◦ Looking at two sets of data (ex. pre-test vs. post-test)
◦ Two sets of data must be obtained from the same subjects or from two matched groups of subjects

Assumptions:
◦ Sampling distribution of the means is normally distributed
◦ Sampling distribution of the difference scores should be normally distributed

Procedure:
◦ ANALYZE>COMPARE MEANS>PAIRED SAMPLES T-TEST

Paired Samples T-test
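
A paired-samples t-test works on the difference scores, which the following Python sketch makes explicit. The pre-test and post-test values are invented, and only the t statistic and degrees of freedom are computed.

```python
import math
import statistics

def paired_t(before, after):
    """t statistic and degrees of freedom computed from the difference scores."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error, n - 1

pre_test = [10, 12, 9, 11]      # hypothetical scores, same subjects
post_test = [12, 15, 10, 14]

t, df = paired_t(pre_test, post_test)
print(round(t, 2), df)
```

The order of subjects must match in the two lists, since each difference pairs one subject's two measurements.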

Page 21: SPSS 2: Data analysis

Aim: Looks at the means from several independent groups; an extension of the independent-samples t-test

Requirements:
◦ Only one IV
◦ More than two levels for that IV
◦ Only one DV

Assumptions:
◦ The populations from which the samples are drawn are normally distributed
◦ Homogeneity of variances
◦ Observations are all independent of one another

Procedure:
◦ ANALYZE>COMPARE MEANS>One-Way ANOVA
◦ Dependent List – DV
◦ Factor – IV

One-way Analysis of Variance
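
The F statistic behind a one-way ANOVA compares variation between group means with variation within groups. Here is a minimal Python sketch with three hypothetical groups; it returns F and its two degrees of freedom and omits the significance level and post hoc tests.

```python
def one_way_anova(groups):
    """F statistic with between-groups and within-groups degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Between groups: how far each group mean lies from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: how far each value lies from its own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical DV scores for three levels of one IV.
levels = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

f, df_between, df_within = one_way_anova(levels)
print(f, df_between, df_within)
```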

Page 22: SPSS 2: Data analysis

How to deal with questions where the participant can select several choices.

ANALYZE>MULTIPLE RESPONSE
◦ Define sets
◦ Frequencies
◦ Crosstabs

Example data: survey_sample.sav
◦ Eth1, 2, 3 – multiple response method
◦ News 1, 2, 3 – multiple dichotomy method

Multiple responses
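
The two ways of storing a "select all that apply" question can be compared in a short Python sketch. Everything here is hypothetical (the option names, the variable names and the answers) and is not taken from survey_sample.sav; the point is only that both codings lead to the same frequency table.

```python
from collections import Counter

# Dichotomy coding: one 0/1 variable per answer option.
dichotomy_rows = [
    {"tv": 1, "radio": 0, "web": 1},
    {"tv": 1, "radio": 1, "web": 0},
    {"tv": 0, "radio": 0, "web": 1},
]
dichotomy_counts = Counter()
for row in dichotomy_rows:
    for option, picked in row.items():
        if picked == 1:
            dichotomy_counts[option] += 1

# Category coding: each variable stores one chosen option; None is an unused slot.
category_rows = [
    {"choice1": "tv",  "choice2": "web",   "choice3": None},
    {"choice1": "tv",  "choice2": "radio", "choice3": None},
    {"choice1": "web", "choice2": None,    "choice3": None},
]
category_counts = Counter(
    value for row in category_rows for value in row.values() if value is not None
)

# Percent of cases: responses can add up to more than 100% across options.
n_cases = len(dichotomy_rows)
percent_of_cases = {opt: 100 * n / n_cases for opt, n in dichotomy_counts.items()}

print(dichotomy_counts, category_counts)
print(percent_of_cases)
```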

Page 23: SPSS 2: Data analysis

Wendiann [email protected]

AS 202C or SC 128

Thank you