Another Information-Gathering Technique & Introduction to Quantitative Data Analysis. Neuman and Robson Chapter 11. Research Data library at SFU: http://www.sfu.ca/rdl/


Page 1: Another Information-Gathering Technique & Introduction to Quantitative Data Analysis Neuman and Robson Chapter 11. Research Data library at SFU

Another Information-Gathering Technique & Introduction to Quantitative Data Analysis


Neuman and Robson Chapter 11.

Research Data library at SFU: http://www.sfu.ca/rdl/

Page 2

Quiz 2 Coverage

• New material from the lectures and from the following chapters:
  – 7 (Sampling), 8 (Surveys), 10 (Nonreactive Measures & Existing Statistics) and the beginning of Chapter 11 (univariate statistics)

• The quiz may also include material covered in the first quiz, especially:
  – Standardization & rates
  – Scales & indices
  – Validity & reliability
  – Levels of measurement
  – The notions of exhaustive & mutually exclusive categories

Page 3

Types of Equivalence for comparative research using existing statistics

• lexicon equivalence (technique of back translation)
• contextual equivalence (ex. role of religious leaders in different societies)
• conceptual equivalence (ex. income)
• measurement equivalence (ex. different measure for same context)

Page 4

Ethical Issues in Comparative Research

• ethical issues sometimes very important
  – ex. impact of demographic research on funding of developing countries, controversy surrounding studies of the origins of AIDS

• sensitivity, privacy etc. sometimes still issues even if “subjects” dead

Page 5

Quantitative Data

• Types of Statistics
  – Descriptive
  – Inferential

• Common Ways of Presenting Statistics
  – Tables
  – Charts
  – Graphs

Page 6

Data Preparation

• Recall: Coding Issues with War & Peace Journalism codes last day

• Entering Data into Spreadsheet or data processing software

• Cleaning Data

Page 7

Recall: Coding Principles

• categories
  – exhaustive
  – mutually exclusive

• consistent for all cases
• comparable with other studies

Page 8

Ways of Developing Coding Categories

• pre-defined coding schemes
  – e.g. close-ended questions
  – Ex. Coding Missing Values (conventions not always used)
    • not applicable=77
    • don’t know=88
    • no response=99

• post-collection analysis
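
The missing-value conventions on this slide can be applied when the data are prepared for analysis. A minimal sketch in Python, assuming the 77/88/99 codes listed above; the function name `recode_missing` and the sample answers are hypothetical and only illustrate the idea:

```python
# Missing-value codes from the slide (conventions, not always used).
MISSING_CODES = {77: "not applicable", 88: "don't know", 99: "no response"}


def recode_missing(values):
    """Return (valid_values, missing_counts) for a list of coded answers."""
    valid = []
    missing_counts = {label: 0 for label in MISSING_CODES.values()}
    for value in values:
        if value in MISSING_CODES:
            missing_counts[MISSING_CODES[value]] += 1
        else:
            valid.append(value)
    return valid, missing_counts


# Hypothetical answers to a 1-5 agreement item.
responses = [1, 4, 99, 3, 88, 5, 77, 99, 2]
valid, missing = recode_missing(responses)
print(valid)    # the substantive answers only
print(missing)  # how many cases fall under each missing-value code
```

Keeping the three kinds of missing answer separate, rather than dropping them all at once, makes it possible to report later how many cases were lost and why.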

Page 9

More Examples of Coding Process

• Sheet for One Television Commercial
• Excel spreadsheet showing entered codes
• SPSS example

Page 10

Data entry conventions

Page 11

Discrete & Continuous Variables

• Continuous
  – Variable can take infinite (or large) number of values within range
    • Ex. Age measured by exact date of birth

• Discrete
  – Attributes of variable that are distinct but not necessarily continuous
    • Ex. Age measured by age groups (Note: techniques exist for making assumptions about discrete variables in order to use techniques developed for continuous variables)

Page 12

Cleaning Data

• checking accuracy & removing errors
  – Possible code cleaning
    • check for impossible codes (errors)
    • some software checks at data entry
    • examine distributions to look for impossible codes
  – Contingency cleaning
    • inconsistencies between answers (impossible logical combinations, illogical responses to skip or contingency questions)
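
Both kinds of cleaning can be automated once the data are entered. The sketch below is illustrative only: the variable names (`smokes`, `cigs_per_day`), the codebook and the records are hypothetical, chosen to show one impossible code and one inconsistent answer to a contingency question.

```python
# Hypothetical survey records: 'smokes' is coded 1=yes, 2=no;
# 'cigs_per_day' is a contingency question asked only of smokers
# (None means the question was skipped).
VALID_SMOKES_CODES = {1, 2, 99}  # 99 = no response

records = [
    {"id": 1, "smokes": 1, "cigs_per_day": 10},
    {"id": 2, "smokes": 2, "cigs_per_day": None},
    {"id": 3, "smokes": 4, "cigs_per_day": None},   # impossible code
    {"id": 4, "smokes": 2, "cigs_per_day": 15},     # inconsistent answers
]


def possible_code_errors(rows):
    """Possible-code cleaning: ids of cases whose code is not in the codebook."""
    return [row["id"] for row in rows if row["smokes"] not in VALID_SMOKES_CODES]


def contingency_errors(rows):
    """Contingency cleaning: non-smokers who answered the smokers-only item."""
    return [
        row["id"]
        for row in rows
        if row["smokes"] == 2 and row["cigs_per_day"] is not None
    ]


print(possible_code_errors(records))  # cases to check against the original forms
print(contingency_errors(records))
```

The checks only flag cases; deciding whether to correct, recode as missing, or drop a flagged case still means going back to the original coding sheet.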

Page 13

Descriptive Statistics (some topics for next few weeks)

• Univariate (one variable)
  – Frequency distributions
  – Graphs & charts
  – Measures of central tendency
  – Measures of dispersion

• Bivariate (two variables)
  – Crosstabulations
  – Scattergrams & other types of graphs
  – Measures of association

• Multivariate (more than two variables)
  – Statistical control
  – Partials
  – Elaboration paradigm

Page 14

Frequency Distribution (Univariate)

Table 5-1 Alienation of Workers
---------------------------------------------------------
Level of Alienation        Frequency
---------------------------------------------------------
High                          20
Medium                        67
Low                           13
(Sub Total)                  100    (N=150)
No Response                   60
(Total)                             (N=210)

Page 15

Simple Univariate Frequency Distributions and Percentages

• univariate = one variable
• “raw count” (frequencies, percentages)

Page 16

Conventions in table design

• total number of cases (N=)
• grouping cases
  – pro: see patterns
  – con: lose information

Page 17

Graph of Frequency Distribution (Univariate)

Page 18

Another visual representation of a distribution: Pie charts

Page 19

Critically Analyzing Data on Frequency Distributions: Collapsing Categories and Treatment of Missing Data

• Consider Raw Data (Numbers) not just percentages

• Examine data preparation
  – Treatment of missing cases?
  – Collapsing categories?

Johnson, A. G. (1977). Social Statistics Without Tears. Toronto: McGraw Hill.

Page 20

Treatment of Missing Data: Raw Data

Table 5-1 Alienation of Workers
---------------------------------------------------------
Level of Alienation        Frequency
---------------------------------------------------------
High                          20
Medium                        67
Low                           13
(Sub Total)                  100    (N=150)
No Response                   60
(Total)                             (N=210)

Page 21

Treatment of Missing Data (%)

• Comparison of % distributions with and without non-respondents

Table 5-1 Alienation of Workers

Level of Alienation      F      %
High                    30     14
Medium                 100     48
Low                     20     10
No Response             60     29
(Total)                210    100

Table 5-1 Alienation of Workers

Level of Alienation      F      %
High                    30     20
Medium                 100     67
Low                     20     13
(Total)                150    100
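
The effect of the missing cases on the percentages can be reproduced directly from the frequencies in Table 5-1. A minimal sketch, assuming whole-number rounding as on the slide; the helper name `percent_distribution` is hypothetical:

```python
# Frequencies from Table 5-1 (Alienation of Workers).
frequencies = {"High": 30, "Medium": 100, "Low": 20, "No Response": 60}


def percent_distribution(freqs, exclude=()):
    """Percentage of cases in each category, rounded to whole numbers.

    Categories named in `exclude` are dropped before the base (N) is computed.
    """
    kept = {k: v for k, v in freqs.items() if k not in exclude}
    n = sum(kept.values())
    return {k: round(100 * v / n) for k, v in kept.items()}, n


with_missing, n_all = percent_distribution(frequencies)
without_missing, n_valid = percent_distribution(frequencies, exclude=("No Response",))
print(n_all, with_missing)
print(n_valid, without_missing)
```

The same 30 highly alienated workers are 14% of all cases but 20% of those who answered, which is why the base (N) has to be reported with any percentage. Rounded percentages need not add up to exactly 100.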

Page 22

Treatment of Missing Data (%)

• Comparison with high & medium collapsed

Table 5-1 Alienation of Workers (non-respondents included)

Level of Alienation      F      %
High & Medium          130     62
Low                     20     10
No Response             60     29
(Total)                210    100

Table 5-1 Alienation of Workers (non-respondents eliminated)

Level of Alienation      F      %
High & Medium          130     87
Low                     20     13
(Total)                150    100

Page 23

Treatment of Missing Data (%)

• Comparison with medium & low collapsed

Table 5-1 Alienation of Workers (non-respondents included)

Level of Alienation      F      %
High                    30     14
Medium & Low           120     58
No Response             60     29
(Total)                210    100

Table 5-1 Alienation of Workers (non-respondents eliminated)

Level of Alienation      F      %
High                    30     20
Medium & Low           120     80
(Total)                150    100

Page 24

Grouping Response Categories (%)

• Comparison with high & medium response categories collapsed

Table 5-1 Alienation of Workers

Level of Alienation    Freq      %
High & Medium                   87
Low                             13
(Total)                 150

Table 5-1 Alienation of Workers

Level of Alienation    Freq      %
High & Medium                   62
Low                             10
No Response                     29
(Total)                 210    100
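
Collapsing categories is a simple recoding step, and it can be combined with either treatment of the non-respondents. The sketch below uses the Table 5-1 frequencies; the helper names `collapse` and `percentages` are hypothetical:

```python
frequencies = {"High": 30, "Medium": 100, "Low": 20, "No Response": 60}


def collapse(freqs, groups):
    """Merge categories: `groups` maps a new label to the old labels it absorbs.

    Categories not mentioned in `groups` are carried over unchanged.
    """
    absorbed = {old for olds in groups.values() for old in olds}
    collapsed = {new: sum(freqs[old] for old in olds) for new, olds in groups.items()}
    collapsed.update({k: v for k, v in freqs.items() if k not in absorbed})
    return collapsed


def percentages(freqs):
    """Whole-number percentage of cases in each category."""
    n = sum(freqs.values())
    return {k: round(100 * v / n) for k, v in freqs.items()}


high_medium = collapse(frequencies, {"High & Medium": ["High", "Medium"]})
print(percentages(high_medium))   # non-respondents included

valid_only = {k: v for k, v in high_medium.items() if k != "No Response"}
print(percentages(valid_only))    # non-respondents eliminated
```

Depending on the two choices, the same data can be presented as "62% high or medium alienation" or "87% high or medium alienation", so both decisions should be stated alongside the table.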

Page 25

Core Notions in Basic Univariate Statistics

Ways of describing data about one variable (“uni”=one)

– Measures of central tendency
  • Summarize information about one variable
  • three types of “averages”: arithmetic mean, median, mode

– Measures of dispersion
  • Analyze variations or “spread”
  • Range, standard deviation, percentiles, z-scores

Page 26

Mode

Babbie (1995: 378)

• most common or frequently occurring category or value (for all types of data)

Page 27

Graph (Normal Distribution) with single mode

Page 28

Bimodal Distribution

• When there are two “most common” values that are almost the same (or the same)

Page 29

Median

Babbie (1995: 378)

• middle point of rank-ordered list of all values (only for ordinal, interval or ratio data)

Page 30

Mean (arithmetic mean)

Babbie (1995: 378)

– Arithmetic “average” = sum of values divided by number of cases (only for ratio and interval data)
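
The three “averages” can be compared on a small data set. This sketch uses Python's standard `statistics` module; the scores and the transport categories are hypothetical, chosen only for illustration:

```python
import statistics

# Hypothetical scores for illustration.
scores = [1, 2, 2, 3, 7]

print(statistics.mode(scores))    # most frequently occurring value
print(statistics.median(scores))  # middle value of the rank-ordered list
print(statistics.mean(scores))    # sum of values divided by number of cases

# The mode also works for nominal data, where median and mean do not apply.
print(statistics.mode(["bus", "car", "bus", "bike"]))

# multimode returns every most-common value, which reveals bimodal data.
print(statistics.multimode([1, 1, 2, 3, 3]))
```

Here the single high score of 7 pulls the mean above the median and the mode, which is the pattern to expect in a distribution skewed to the right.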

Page 31

Two Data Sets with the Same Mean

Page 32

Normal Distribution & Measures of Central Tendency

Neuman (2000: 319)

• Symmetric
• Also called the “Bell Curve”

Page 33

Skewed Distributions & Measures of Central Tendency

Neuman (2000: 319)

Skewed to the left

Skewed to the right

Page 34

Normal & Skewed Distributions

Page 35

Why Measures of Central Tendency are not enough to describe distributions: Crowd Example

• 7 people at bus stop in front of bar aged 25, 26, 27, 30, 33, 34, 35
  – median = 30, mean = 30

• 7 people in front of ice-cream parlour aged 5, 10, 20, 30, 40, 50, 55
  – median = 30, mean = 30

• BUT issue of “spread” socially significant

Page 36

Measures of Variation or Dispersion

• range: distance between largest and smallest scores
• standard deviation: for comparing distributions
• percentiles: for understanding position in distribution
  – % up to and including the number (from below)
• z-scores: for comparing individual scores taking into account the context of different distributions

Page 37

Range & Interquartile range

• distance between largest and smallest scores
  – what does a short distance between the scores tell us about the sample?
  – problems of “outliers” or extreme values may occur

Page 38

Interquartile range (IQR)

• distance between the 75th percentile and the 25th percentile
• range of the middle 50% (approximately) of the data
• Eliminates problem of outliers or extreme values

• Example from StatCan website (11 in sample)
  – Data set: 6, 47, 49, 15, 43, 41, 7, 39, 43, 41, 36
  – Ordered data set: 6, 7, 15, 36, 39, 41, 41, 43, 43, 47, 49
  – Median: 41
  – Upper quartile: 43
  – Lower quartile: 15
  – IQR = 43 − 15 = 28
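
The quartiles in this example can be found by taking the median of each half of the ordered data. The sketch below assumes that convention (the overall median is left out of both halves when the number of cases is odd); other quartile conventions exist and can give slightly different values. The function name `quartiles_by_halves` is hypothetical.

```python
import statistics


def quartiles_by_halves(values):
    """Lower quartile, median, upper quartile via medians of the two halves.

    When the number of cases is odd, the overall median is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2 :]
    return (
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
    )


data = [6, 47, 49, 15, 43, 41, 7, 39, 43, 41, 36]
q1, median, q3 = quartiles_by_halves(data)
print(q1, median, q3, q3 - q1)  # lower quartile, median, upper quartile, IQR
```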

Page 39

Standard Deviation and Variance

• Interquartile range eliminates problem of outliers BUT eliminates half the data

• Solution? measure variability from the center of the distribution.

• standard deviation & variance measure how far on average scores deviate or differ from the mean.

Page 40

Calculation of Standard Deviation

Neuman (2000: 321)

Page 41

Calculation of Standard Deviation

Neuman (2000: 321)

Page 42

Standard Deviation Formula

Neuman (2000: 321)
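
For reference, a common way of writing the standard deviation is given below. This is the standard textbook form and is not copied from Neuman's figure. It assumes the population version, which divides by N; a sample version divides by N − 1 instead. X_i is an individual score, X̄ is the mean and N is the number of cases.

```latex
s = \sqrt{\frac{\sum_{i=1}^{N} \left(X_i - \bar{X}\right)^2}{N}}
```

In words: subtract the mean from each score, square each difference, add the squares, divide by the number of cases, and take the square root. The value before the square root is the variance.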

Page 43

Calculation of Standard Deviation

Neuman (2000: 321)

Page 44

Interpreting Standard Deviation

• amount of variation from mean
• social meaning depends on exact case

Page 45

Details on the Calculation of Standard Deviation

Neuman (2000: 321)

Page 46

The Bell Curve & standard deviation

Page 47

Discussion of Preceding Diagram

• “Many biological, psychological and social phenomena occur in the population in the distribution we call the bell curve (Portney & Watkins, 2000).”

• Preceding picture:
  – a symmetrical bell curve
  – average score [i.e., the mean] in the middle, where the ‘bell’ shape is tallest
  – most of the people [i.e., 68% of them, or 34% + 34%] have performance within 1 segment [i.e., a standard deviation] of the average score

Page 48

Interpreting Standard Deviation

• amount of variation from mean
• Illustration: high & low standard deviation
• meaning depends on exact case

Page 49

Another Diagram of Normal Curve (Showing Ideal Random Sampling Distribution, Standard Deviation & Z-scores)

Page 50

Example: Central Tendency & Dispersion (description of distributions)

Recall:
• 7 people at bus stop in front of bar aged 25, 26, 27, 30, 33, 34, 35
  – median = 30, mean = 30
  – Range = 10, standard deviation = 3.8

• 7 people in front of ice-cream parlour aged 5, 10, 20, 30, 40, 50, 55
  – median = 30, mean = 30
  – Range = 50, standard deviation = 17.9
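
The two crowds can be checked with a few lines of Python. This is a sketch that assumes the population form of the standard deviation (dividing by N), computed with `statistics.pstdev`; the helper name `describe` is hypothetical.

```python
import statistics

bus_stop = [25, 26, 27, 30, 33, 34, 35]
ice_cream = [5, 10, 20, 30, 40, 50, 55]


def describe(ages):
    """Central tendency and dispersion for a list of ages."""
    return {
        "mean": statistics.mean(ages),
        "median": statistics.median(ages),
        "range": max(ages) - min(ages),
        # pstdev divides by N (population form); stdev would divide by N - 1.
        "sd": round(statistics.pstdev(ages), 1),
    }


print(describe(bus_stop))
print(describe(ice_cream))
```

Both groups share the same mean and median, so only the range and the standard deviation distinguish the tightly clustered bus-stop crowd from the widely spread ice-cream crowd.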

Page 51

Other ways of characterizing dispersion or spread

Techniques for understanding position of a case (or group of cases) in the context of all cases

• Percentiles
• Standard Scores
  – z-scores

Page 52

Percentile

• First calculate rank, then choose a rank (score) and figure out the percentage equal to or less than the rank (score)

• % up to and including the number (from below)
  – “A percentile rank is typically defined as the proportion of scores in a distribution that a specific score is greater than or equal to. For instance, if you received a score of 95 on a math test and this score was greater than or equal to the scores of 88% of the students taking the test, then your percentile rank would be 88. You would be in the 88th percentile.”

• Also used in other ways (for example to eliminate cases)
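
The “up to and including” definition translates into a short calculation. A minimal sketch; the function name `percentile_rank` and the ten test scores are hypothetical:

```python
def percentile_rank(scores, score):
    """Percentage of scores that are less than or equal to `score`."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


# Hypothetical test scores for ten students.
scores = [55, 60, 62, 70, 71, 75, 80, 88, 95, 99]
print(percentile_rank(scores, 95))  # share of students scoring 95 or lower
```

Nine of the ten scores are 95 or lower, so a score of 95 sits at the 90th percentile of this group even though it is not the top score.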

Page 53

Normal Distribution with Percentiles

Page 54

z-scores

• For understanding how a score is positioned in the data set

• To enable comparisons with other scores from other data sets
  – comparing individual scores in different distributions
    • example of two students from different schools with different GPAs
  – comparing sample distributions to population. How representative is sample to population under study?

Page 55

Calculating Z-Scores

• z-score = (score – sample mean) / standard deviation of set

Page 56

Calculating Z-Scores

Page 57

Using Z-scores to compare two students from different schools

• Susan has GPA of 3.62 & Jorge has GPA of 3.64
• Susan from College A
  – Susan’s Grade Point Average = 3.62
  – Mean GPA = 2.62
  – SD = .50
  – Susan’s z-score = (3.62 − 2.62)/.50 = 1.00/.50 = 2
  – Susan’s grade is two standard deviations above the mean at her school

Page 58

Using Z-scores to compare two students from different schools (continued)

• Jorge from College B
  – Jorge’s GPA = 3.64
  – Mean GPA = 3.24
  – SD = .40
  – Jorge’s z-score = (3.64 − 3.24)/.40 = .40/.40 = 1
  – Jorge’s grade is one standard deviation above the mean at his school

• Susan’s absolute grade is lower but her position relative to other students at her school is much higher than Jorge’s position at his school
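
The comparison of the two students can be written as a small function. A minimal sketch using the GPAs, means and standard deviations from the slides; the function name `z_score` is hypothetical:

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd


susan = z_score(3.62, mean=2.62, sd=0.50)  # College A
jorge = z_score(3.64, mean=3.24, sd=0.40)  # College B
print(round(susan, 2), round(jorge, 2))
```

Because each z-score is expressed in standard deviations of the student's own school, the two values can be compared directly even though the schools grade differently.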

Page 59

Another Diagram of Normal Curve with Standard Deviation & Z-scores

Page 60

Discussion of Previous Case

• Relationship of sampling distribution to population (use mean of sample to estimate mean of population)

Page 61

If Time: Begin Bivariate Statistics (Results with two variables)

• Types of relationships between two variables:
  – Correlation (or covariation)
    • when two variables ‘vary together’
      – a type of association
      – not necessarily causal
    • Can be same direction (positive correlation or direct relationship)
    • Can be in different directions (negative correlation or indirect relationship)
  – Independence
    • No correlation, no relationship
    • Cases with values in one variable do not have any particular value on the other variable

Page 62

Techniques for examining relationships between two variables

• Graphs, scattergrams or plots
• Cross-tabulations or percentaged tables
• Measures of association (e.g. correlation coefficient, etc.)
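
As one illustration of a measure of association, the sketch below computes the Pearson correlation coefficient by hand. The slides do not say which coefficient is meant, so Pearson's r is an assumption here, and the three variables and their values are hypothetical:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical data: years of schooling, income (in $1000s), hours of TV per day.
schooling = [8, 10, 12, 14, 16]
income = [20, 28, 35, 41, 52]
tv_hours = [5, 4, 4, 2, 1]

print(round(pearson_r(schooling, income), 2))    # positive (direct) relationship
print(round(pearson_r(schooling, tv_hours), 2))  # negative (indirect) relationship
```

The coefficient runs from −1 to +1: values near +1 mean the variables rise together, values near −1 mean one falls as the other rises, and values near 0 indicate no linear relationship. A strong coefficient still says nothing about whether the relationship is causal.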

Page 63

Scattergram (Bivariate)

Page 64

Tables: Basic Terminology (Tables)

• Parts of a Table
  – title (conventions)
    • Order of naming of variables
    • Dependent, independent, control
  – body, cell, column, row
  – “marginals”

• sources, date

Page 65

Bivariate Statistics: Parts of the Table

Page 66

Example of Raw Data Table (computer printout-bivariate)

Regan, T. (1985). In search of sobriety: Identifying factors contributing to the recovery from alcoholism. Kentville, NS.

Page 67

Another Style of Presentation of Percentaged Tables

Table 1. Percentage in support of strike by type of school

Type of School      Percent supporting strike
Secondary           60% (800)
Elementary          30% (1000)
__________________________________________________________
N = 1800

Parts labelled on the table:
• Serial number
• Descriptive caption
• Dependent variable
• Independent variable
• Variable categories
• One category of dichotomous dependent variable
• Marginals for independent variable
• Total sample

Page 68

Presentation of Percentaged Tables (cont’d)

Table 2. Percentage who support strike by type of school and sex

                    Sex
                    Female                        Male
Type of School      Per cent supporting strike    Per cent supporting strike
Secondary           60% (400)                     60% (400)
Elementary          30% (900)                     30% (100)
__________________________________________________________
Female = .30 : Male = .30                         N = 1800

Parts labelled on the table:
• Dependent variable
• Independent variable
• Control variable
• Categories of control variable

Page 69

Some Important Factors in Interpretation of Tables

• percentages vs. “raw” frequencies, need to know absolute number of cases (N=)

• grouping categories, missing cases
• direction of calculation of percentages (for bivariate and multivariate statistics)

Page 70

Collapsing categories (U.N. example)

Babbie, E. (1995). The practice of social research. Belmont, CA: Wadsworth.

Page 71

Collapsing Categories & omitting missing data

Babbie, E. (1995). The practice of social research. Belmont, CA: Wadsworth.

Page 72

Grouping Response Categories

• To make new categories
• Facilitate analysis of trends
• But decisions have effects on the interpretation of patterns