Final review - statistics Spring 03 Also, see final review - research design

Page 1: Final review - statistics Spring 03 Also, see final review - research design

Final review - statistics, Spring 03

Also, see final review - research design

Page 2:

Statistics

Descriptive Statistics

Statistics to summarize and describe the data we collected

Inferential Statistics

Statistics to make inferences from samples to the populations

Page 3:

A summary of your data

Center / Central Tendencies

Indicates a central value for the variable

Measures of Dispersion (Variability / Spread)

Indicate how much participants’ scores vary from one another

Measures of Association

Indicates how much variables go together

(Shown in Tables, Graphs, Distributions)

Page 4:

Measures of Center

Mode: The value with the highest frequency (the most common value)

Median: The “middle” score

Mean: Average
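A minimal sketch of the three measures of center, using Python's standard `statistics` module. The scores are hypothetical, invented only for illustration; they do not come from the slides.

```python
from statistics import mean, median, multimode

# Hypothetical scores, invented for illustration (not from the slides).
scores = [2, 3, 3, 4, 5, 5, 5, 6, 9]

center = {
    "mode": multimode(scores),  # list of the most frequent value(s)
    "median": median(scores),   # middle score once the data are sorted
    "mean": mean(scores),       # sum of the scores divided by their count
}
print(center)
```

`multimode` returns a list because a distribution can have more than one mode.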

Page 5:

WHY are LEVELS / SCALES of MEASUREMENT IMPORTANT?

Because you need to match the statistic you use to the kind of variable you have

Page 6:

Measures of Central Tendency, Center

Nominal: Mode

Ordinal: Mode, Median

Interval/Ratio: Mode, Median, Mean

Page 7:

Summary

[Table: Level of Measurement (Nominal, Ordinal, Interval, Ratio) by information about differences among values (Difference, Order, Equal Interval, Meaningful Zero, Calculate Math)]

Page 8:

Why “Equal Distance” Matters

If the distances between values are equal (as in interval or ratio data), you are able to calculate (add, subtract, multiply, divide) values

You can get a mean only for interval/ratio variables

A wider variety of statistical tests is available for interval/ratio variables

Page 9:

[Figure: distribution of scores; horizontal axis runs from 4 to 10]

What are the Mean, Median, and Mode for this distribution?

What is this distribution shape called?

Page 10:

Types of Measures of Dispersion (Variability / Spread)

Frequencies / Percentages

Range: The distance between the highest score and the lowest score (highest – lowest)

Standard deviation / Variance

Page 11:

Variance / Standard Deviation

Variance (S-squared): An approximate average of the squared deviations from the mean

Standard Deviation (S or SD): Square root of variance

The larger the variance/SD, the higher the variability of the data: scores vary more widely around the mean.
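A short sketch of the variance and SD definitions on this slide. It assumes that "approximate average" means dividing the summed squared deviations by n - 1 (the usual sample variance); the slide does not state the divisor. Both data sets are hypothetical.

```python
from math import sqrt

def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed divisor)."""
    m = sum(scores) / len(scores)
    squared_deviations = [(x - m) ** 2 for x in scores]
    return sum(squared_deviations) / (len(scores) - 1)

def sample_sd(scores):
    """Standard deviation: square root of the variance."""
    return sqrt(sample_variance(scores))

# Two hypothetical data sets with the same mean (5) but different spread.
tight = [4, 5, 5, 6]
wide = [1, 3, 7, 9]

print(sample_variance(tight), sample_sd(tight))
print(sample_variance(wide), sample_sd(wide))
```

The second data set has the same center but a much larger variance and SD, which is what "higher variability" means here.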

Page 12:
Page 13:

Measures of Dispersion

Nominal: Frequency, %

Ordinal: Frequency, %; Range, IQR

Interval/Ratio: Frequency, %; Range, IQR; Standard Deviation, Variance

Page 14:

CORRELATION

Co-relation: 2 variables tend to “go together”

Indicates how strongly and in which direction two variables are correlated with each other

*** Correlation does NOT EQUAL cause

Page 15:

SIGN

0: No systematic relationship

• Positive correlation: As one variable increases, so does the 2nd

• Negative correlation: As one variable increases, the 2nd gets smaller

Page 16:

Correlation Coefficient

-1: Perfect negative
0: None
+1: Perfect positive

The relationship is weaker near 0 and stronger toward -1 or +1.

Page 17:

SIZE

Ranges from –1 to +1
0 or close to 0 indicates NO relationship
+/- .2 - .4 weak
+/- .4 - .6 moderate
+/- .6 - .8 strong
+/- .8 - .9 very strong
+/- 1.00 perfect

Negative relationships are NOT weaker!
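A sketch of how a correlation coefficient could be computed and labeled with the size bands from this slide. `pearson_r` is the standard Pearson formula; `strength` is a hypothetical helper. Two choices are assumptions, not from the slide: each band includes its lower bound, and everything from .8 up to (but not including) 1 counts as "very strong".

```python
from math import isclose, sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)

def strength(r):
    """Label the SIZE of r; the sign is ignored, so -.7 is as strong as +.7."""
    size = abs(r)
    if isclose(size, 1.0, abs_tol=1e-9):
        return "perfect"
    if size < 0.2:
        return "none"
    if size < 0.4:
        return "weak"
    if size < 0.6:
        return "moderate"
    if size < 0.8:
        return "strong"
    return "very strong"

# Hypothetical data: y falls as x rises, so r is negative.
r = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r, strength(r))
```

Taking the absolute value before labeling is the point of "Negative relationships are NOT weaker": the sign gives direction, the size gives strength.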

Page 18:

Significance Test

The correlation coefficient also comes with a significance test (p-value)

p = .05: if there were no correlation in the population, a sample correlation this strong would occur with probability .05. Rejecting H0 at this cutoff means a 5% risk of TYPE I Error (95% confidence level)

If p < .05, reject H0 and support Ha at the 95% confidence level

Page 19:

1. Infer characteristics of a population from the characteristics of the samples.

2. Hypothesis Testing

3. Statistical Significance

4. The Decision Matrix

Page 20:

Sample Statistics: X (mean), SD, n

Population Parameters: N

[Figure: sample statistics are used to infer the population parameters]

Page 21:

Inferential Statistics

Assess: are the sample statistics indicators of the population parameters?

Differences between 2 groups: did they happen by chance?

What effect do random sampling errors have on our results?

Page 22:

Random sampling error

Random sampling error: Difference between the sample characteristics and the population characteristics caused by chance

Sampling bias: Difference between the sample characteristics and the population characteristics caused by biased (non-random) sampling
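A small simulation sketch of the two definitions above. Everything in it is hypothetical: the population (10,000 scores with mean about 50 and SD about 10), the sample size of 100, and the "biased" sampling rule (drawing only from the top half of the population).

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 scores.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = mean(population)

# Random sampling: each sample mean misses the population mean by chance,
# sometimes too high and sometimes too low.
random_errors = [
    mean(random.sample(population, 100)) - population_mean
    for _ in range(200)
]

# Biased (non-random) sampling: only the top half of the population can be drawn,
# so every sample mean misses in the same direction.
top_half = sorted(population)[len(population) // 2:]
biased_errors = [
    mean(random.sample(top_half, 100)) - population_mean
    for _ in range(200)
]

print("average random sampling error:", round(mean(random_errors), 2))
print("average error under sampling bias:", round(mean(biased_errors), 2))
```

Random sampling errors tend to cancel out across many samples, while the error from biased sampling does not shrink no matter how many samples are drawn.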

Page 23:

Probability

Probability (p) ranges between 0 and 1
p = 1 means that the event would occur in every trial
p = 0 means the event would never occur in any trial
The closer the probability is to 1, the more likely that the event will occur
The closer the probability is to 0, the less likely the event will occur

Page 24:

P > .05 means that …

Means of two groups fall in the central 95% area of a normal distribution with one population mean

[Figure: normal curve with the central 95% marked; Mean 1 and Mean 2 labeled]

Page 25:

P < .05 means that …

[Figure: group means labeled 1 and 2]

Means of two groups do NOT fall in the central 95% area of a normal distribution with one population mean, so it is more reasonable to assume that they belong to different populations

Page 26:

Null Hypothesis

Says IV has no influence on DV

There is no difference between the two variables.

There is no relationship between the two variables.

Page 27:

Null Hypothesis

States there is NO true difference between the groups
If sample statistics show any difference, it is due to random sampling error
Referred to as H0 (Research Hypothesis = Ha)
If you can reject H0, you can support Ha
If you fail to reject H0, you reject Ha

Page 28:

Be conservative.

What are the chances I would get these results if the null hypothesis is true?

Only if the pattern is highly unlikely (p < .05) do you reject the null hypothesis and support your hypothesis

Since you cannot be 100% sure your conclusion is correct, you take up to a 5% risk.

Your p-value tells you the risk (the probability) of making a TYPE I Error

Page 29:

[Decision matrix: True state (including “Wrong person to marry”) by what you think (including “You think it’s the wrong person to marry”). Cells: Correct, Type I error, Type II error, Correct]

Page 30:

[Decision matrix: True state (including “No fire”) by alarm (including “No Alarm”). Cells: Correct, Type I error, Type II error, Correct]

Page 31:

Ho = null hypothesis = there is NO fire

Ha = alternative hyp. = there IS a FIRE

You decide... vs. True State:

Accept Ho (no alarm) when Ho (no fire) is true: Correct
Accept Ho (no alarm) when Ha is true: Type II error
Reject Ho when Ho (no fire) is true: Type I error
Reject Ho when Ha is true: Correct

Page 32:

Easy ways to LOSE points

Use the word “prove”

Better to say support the hypothesis or consistent with the hypothesis

Tentative statements acknowledge possibility of making a Type 1 or Type 2 error

Use the word “random” incorrectly

Page 33:

Significance Test

Significance test examines the probability of TYPE I error (falsely rejecting H0)

Significance test examines how probable it is that the observed difference is caused by random sampling error

Reject the null hypothesis if probability is < .05 (probability of TYPE I error is smaller than .05)

Page 34:

Principle Logic

P < .05

Reject Null Hypothesis (H0)

Support Your Hypothesis (Ha)
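The logic on this slide can be sketched as a one-line rule. `decide` is a hypothetical helper, and the wording of the non-significant outcome ("fail to reject H0") is a choice made here. The three p-values are the ones reported for the hair-length items later in this review.

```python
ALPHA = 0.05  # the 5% level of significance used throughout this review

def decide(p_value, alpha=ALPHA):
    """Apply the rule: p < alpha means reject H0 and support Ha."""
    if p_value < alpha:
        return "reject H0, support Ha"
    return "fail to reject H0"

# p-values reported on the hair-length slides.
for p in (0.03, 0.01, 0.89):
    print(p, "->", decide(p))
```

Note the strict comparison: a p-value of exactly .05 does not count as p < .05 under this rule.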

Page 35:

Logic of Hypothesis Testing

Statistical tests used in hypothesis testing deal with the probability of a particular event occurring by chance.

Is the result a common or a rare occurrence if only chance is operating?

A score (or result of a statistical test) is “Significant” if the score is unlikely to occur on the basis of chance alone.

Page 36:

Level of Significance

The “Level of Significance” is a cutoff point for determining significantly rare or unusual scores.

Scores outside the middle 95% of a distribution are considered “Rare” when we adopt the standard “5% Level of Significance”

This level of significance can be written as: p = .05

Page 37:

Decision Rules

Reject Ho (accept Ha) when the sample statistic is statistically significant at the chosen p level, otherwise accept Ho (reject Ha).

Possible errors:

A. You reject the Null Hypothesis when in fact it is true, a Type I Error, or Error of Rashness.

B. You accept the Null Hypothesis when in fact it is false, a Type II Error, or Error of Caution.

Page 38:

Your decision vs. True state:

Data indicates something significant is happening (reject null) when data results are by chance (Null is true): Type I error
Data indicates something significant is happening (reject null) when something is happening (Null is false): Correct
There is nothing happening except chance variation (accept the null) when data results are by chance (Null is true): Correct
There is nothing happening except chance variation (accept the null) when something is happening (Null is false): Type II error
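The decision matrix (true state crossed with your decision) can be sketched as a small function. `outcome` and its parameter names are hypothetical, chosen only for this illustration.

```python
def outcome(null_is_true, reject_null):
    """Return the cell of the decision matrix for one true state and one decision."""
    if reject_null and null_is_true:
        return "Type I error"   # rejected a null hypothesis that was true
    if not reject_null and not null_is_true:
        return "Type II error"  # accepted a null hypothesis that was false
    return "Correct"

# Print all four cells of the matrix.
for null_is_true in (True, False):
    for reject_null in (True, False):
        print(
            f"Null is true: {null_is_true}, reject null: {reject_null} -> "
            f"{outcome(null_is_true, reject_null)}"
        )
```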

Page 39:

To compare two groups on Mean Scores use t-test.

For more than 2 groups use Analysis of Variance (ANOVA)

Can’t get a mean from nominal or ordinal data.

Chi Square tests the difference in Frequency Distributions of two or more groups.
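The rules on this slide can be sketched as a lookup. `choose_test` and its arguments are hypothetical names; the sketch covers only the three tests named here and assumes the level of measurement of the outcome variable is what decides between a mean-based test and chi-square.

```python
def choose_test(level, n_groups):
    """Pick a test from the outcome variable's level of measurement and the number of groups."""
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    if level in ("interval", "ratio"):
        # A mean can be computed, so compare group means.
        return "t-test" if n_groups == 2 else "ANOVA"
    if level in ("nominal", "ordinal"):
        # No mean available, so compare frequency distributions instead.
        return "chi-square"
    raise ValueError(f"unknown level of measurement: {level}")

print(choose_test("ratio", 2))
print(choose_test("interval", 4))
print(choose_test("nominal", 3))
```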

Page 40:

Parametric Tests

Used with data that have a mean score or standard deviation.

t-test, ANOVA and Pearson’s Correlation r.

Use a t-test to compare mean differences between two groups (e.g., male/female and married/single).

Page 41:

Parametric Tests

Use ANalysis Of VAriance (ANOVA) to compare more than two groups (such as age and family income) to get probability scores for the overall group differences.

Use Post Hoc Tests to identify which subgroups differ significantly from each other.
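The slides do not show how ANOVA arrives at its result. As a hedged sketch, the standard one-way ANOVA F ratio (between-group variance divided by within-group variance) can be computed as below. The three groups are hypothetical, and turning F into a p-value is left to a statistics package.

```python
from statistics import mean

def one_way_f(groups):
    """Standard one-way ANOVA F ratio for a list of groups of scores."""
    all_scores = [x for group in groups for x in group]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Three hypothetical groups, invented for illustration.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio = one_way_f(groups)
print(f_ratio)
```

A large F says the group means differ more than chance variation within groups would suggest, but not which groups differ; that is the job of the post hoc tests.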

Page 42:

When comparing two groups on MEAN SCORES use the t-test.

$$ t = \frac{\text{Mean}_1 - \text{Mean}_2}{\sqrt{\dfrac{SD_1^2}{n_1} + \dfrac{SD_2^2}{n_2}}} $$
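A sketch of the t formula on this slide, read as the difference between the two group means divided by the square root of SD1²/n1 + SD2²/n2. `t_statistic` is a hypothetical helper and the input numbers are invented; converting t into a p-value requires a t table or a statistics package and is not shown.

```python
from math import sqrt

def t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """t = (Mean1 - Mean2) / sqrt(SD1^2 / n1 + SD2^2 / n2)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Hypothetical summary statistics for two groups of 50 people each.
t = t_statistic(mean1=5.0, sd1=2.0, n1=50, mean2=4.0, sd2=2.0, n2=50)
print(t)
```

The same mean difference gives a larger t when the SDs are smaller or the groups are larger, because the denominator shrinks.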

Page 43:

T-test

If p < .05, we conclude that the two groups are drawn from populations with different distributions (reject H0) at the 95% confidence level

Page 44:

When comparing two groups on MEAN SCORES use the t-test.

$$ t = \frac{\text{Mean}_1 - \text{Mean}_2}{\sqrt{\dfrac{SD_1^2}{n_1} + \dfrac{SD_2^2}{n_2}}} $$

Our Research Hypothesis: hair length leads to different perceptions of a person.

The Null Hypothesis: there will be no difference between the pictures.

Page 45:

“I think she is one of those people who quickly earns respect.”

Short Hair: Mean = 2.2, SD = 1.9, n = 100

Long Hair: Mean = 4.1, SD = 1.8, n = 100

p = .03

Accept Ha: Mean scores come from different distributions.

Accept Ho: Mean scores reflect just chance differences from a single distribution.

[Figure: axis marked at 2.2, 3.1, and 4.1]

Page 46:

“In my opinion, she is a mature person.”

Short Hair: Mean = 1.6, SD = 1.7, n = 100

Long Hair: Mean = 3.6, SD = 1.2, n = 100

p = .01

Accept Ha: Mean scores come from different distributions.

Accept Ho: Mean scores reflect just chance differences from a single distribution.

[Figure: axis marked at 1.6, 2.6, and 3.6]

Page 47:

“I think we are quite similar to one another.”

Short Hair: Mean = 3.7, SD = 1.8, n = 100

Long Hair: Mean = 3.9, SD = 1.5, n = 100

p = .89

Accept Ha: Mean scores come from different distributions.

Accept Ho: Mean scores are just chance differences from a single distribution.

[Figure: axis marked at 3.7, 3.8, and 3.9]

Page 48:

A nonsignificant result may be caused by a

A. low sample size.
B. very cautious significance level.
C. weak manipulation of independent variables.
D. true null hypothesis.

Page 49:

When to use various statistics

Parametric: Interval or ratio data

Non-parametric: Ordinal and nominal data

Page 50:

Chi-Square (X²)

Chi Square tests the difference in frequency distributions of two or more groups.

Test of Significance of two nominal variables or of a nominal variable & an ordinal variable

Used with a cross tabulation table
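A sketch of how the chi-square statistic could be computed from a cross tabulation table. The slides do not give the formula; this uses the standard one, summing (observed - expected)² / expected over all cells. `chi_square` is a hypothetical helper and the counts are invented. Looking up the p-value for the resulting statistic and degrees of freedom is left to a table or a statistics package.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for a cross tabulation (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom

# Hypothetical 2 x 2 cross tabulation: two groups (rows) by yes/no answer (columns).
crosstab = [
    [30, 20],
    [20, 30],
]
statistic, df = chi_square(crosstab)
print(statistic, df)
```

If both groups had identical frequency distributions, every observed count would equal its expected count and the statistic would be 0.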