Inferential statistics

Why statistics are important

• Statistics are concerned with difference: how much does one feature of an environment differ from another?

• Magnitude: the comparative strength of two variables.

• Reliability: the degree to which the measure of the magnitude of a variable can be replicated with other samples drawn from the same population.

Why statistics are important

• Relationships: how much does one feature of the environment change as another measure changes?

Correlation or regression

r = 0.73, N = 20, p < 0.01

Arithmetic mean or average

The mean ($M$ or $\bar{X}$) is the sum ($\sum X$) of all the sample values ($X_1 + X_2 + X_3 + \dots + X_{22}$) divided by the sample size ($N$).

$\sum X = 45$, $N = 22$, so $M = \sum X / N = 45/22 = 2.05$
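
The calculation above can be sketched in a few lines of Python. The slide gives only the totals (sum 45, N 22), so the raw list used here is hypothetical and exists only to show the function at work.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

# Totals taken from the slide; the 22 raw values are not given.
slide_mean = 45 / 22
print(round(slide_mean, 2))  # 2.05

# Hypothetical sample, for illustration only.
sample = [1, 2, 2, 3, 4]
print(mean(sample))  # 2.4
```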

The median

• The median is the "middle" value of the sample: there are as many sample values above the sample median as below it.

• If the sample size is odd (say, 2a + 1), then the median is the (a+1)st largest data value. If the sample size is even (say, 2a), then the median is defined as the average of the ath and (a+1)st largest data values.
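
The odd and even cases can be written out directly. This is a minimal sketch of the rule as stated, using made-up data; the function name and the sample values are assumptions for illustration.

```python
def median(values):
    """Median by the rule above: the middle value, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    a = n // 2
    if n % 2 == 1:
        # n = 2a + 1: take the (a+1)th value, which is index a when counting from 0
        return ordered[a]
    # n = 2a: average the a-th and (a+1)th values (indices a-1 and a)
    return (ordered[a - 1] + ordered[a]) / 2

# Hypothetical data, for illustration only.
print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```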

Other measures of central tendency

• The mode is the single most frequently occurring data value.

• The midrange is the midpoint of the sample -- the average of the smallest and largest data values in the sample.

• Find the Mean, Median and Mode
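
The data for this exercise did not survive in the slide text, so the following sketch uses a hypothetical set of scores. It relies on Python's standard `statistics` module; `midrange` is a small helper written for this example.

```python
from statistics import mean, median, mode

def midrange(values):
    """Average of the smallest and largest values in the sample."""
    return (min(values) + max(values)) / 2

# Hypothetical scores, for illustration only.
scores = [2, 3, 3, 5, 8, 3, 11]

print(mean(scores))      # 5
print(median(scores))    # 3
print(mode(scores))      # 3
print(midrange(scores))  # 6.5
```

Note how the one large score (11) pulls the mean and the midrange above the median and the mode.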

The underlying distribution of the data

Normal distribution

All normal distributions have similar properties. The percentage of the scores that lies between one standard deviation (s) below the mean and one standard deviation above it is always 68.27%.

[Figure: distribution of scores. Mean = 77.48, SD = 7.15, N = 62. Horizontal axis marked at -2 SD (-14.30), -1 SD (-7.15), 0, +1 SD (+7.15) and +2 SD (+14.30) from the mean.]
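
The "roughly 68% within one standard deviation" property can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library with the mean and SD quoted on the slide; the helper name `share_within` is an assumption for this example.

```python
from statistics import NormalDist

# Mean and SD as quoted on the slide.
scores = NormalDist(mu=77.48, sigma=7.15)

def share_within(dist, k):
    """Proportion of a normal distribution lying within k SDs of its mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower

print(f"{share_within(scores, 1):.2%}")  # 68.27%
print(f"{share_within(scores, 2):.2%}")  # 95.45%
```

The result does not depend on the particular mean or SD, which is what "all normal distributions have similar properties" means in practice.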

Is there a difference between Rich and Poor scores?

Is there a significant difference between Polynesian and “other” scores?

Mean = 75.0, SD = 6.8, N = 20

Mean = 81.9, SD = 6.5, N = 20

Three things we must know before we can say events are different

1. the difference in mean scores of two or more events

- the bigger the gap between means the greater the difference

2. the degree of variability in the data

- the less variability in the data, the easier it is to detect a real difference between means

Variance and Standard Deviation

These are estimates of the spread of data. They are calculated by measuring the distance between each data point and the mean.

The variance ($s^2$) is the average of the squared deviations of each sample value from the mean: $s^2 = \sum (X - M)^2 / (N - 1)$

The standard deviation ($s$) is the square root of the variance: $s = \sqrt{s^2}$

Calculating the variance and the standard deviation for the Rich sample

Rich (X)   X-M      (X-M)²
72         -9.85    97.02
75         -6.8     46.9
75         -6.8     46.9
76         -5.8     34.2
76         -5.8     34.2
76         -5.8     34.2
77         -4.8     23.5
77         -4.8     23.5
78         -3.8     14.8
80         -1.8     3.4
80         -1.8     3.4
82         0.2      0.0
87         5.2      26.5
87         5.2      26.5
87         5.2      26.5
88         6.2      37.8
89         7.2      51.1
89         7.2      51.1
91         9.2      83.7
95         13.2     172.9

Total: ΣX = 1637, Σ(X-M)² = 838.55
Mean (Mx) = 81.9, Nx = 20
Variance(x) = 41.9, Standard deviation (Sx) = 6.5

Note: 41.9 is 838.55 divided by N = 20. Dividing by N - 1 = 19, as in the variance formula, gives 44.1, which is the Rich variance reported in the t-test output.
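
The worked table can be reproduced from the twenty Rich scores. This sketch computes the sum of squared deviations once and then shows both divisors, since the table's 41.9 corresponds to dividing by N while the variance formula divides by N - 1. Variable names are chosen for this example.

```python
import math

# The twenty Rich scores from the table.
rich = [72, 75, 75, 76, 76, 76, 77, 77, 78, 80,
        80, 82, 87, 87, 87, 88, 89, 89, 91, 95]

n = len(rich)
m = sum(rich) / n
ss = sum((x - m) ** 2 for x in rich)  # sum of squared deviations from the mean

var_n = ss / n          # divisor N: reproduces the table's 41.9
var_n1 = ss / (n - 1)   # divisor N - 1: the sample variance formula

print(round(m, 2), round(ss, 2))                       # 81.85 838.55
print(round(var_n, 1), round(math.sqrt(var_n), 1))     # 41.9 6.5
print(round(var_n1, 1), round(math.sqrt(var_n1), 1))   # 44.1 6.6
```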

Three things we must know before we can say events are different

3. The extent to which the sample is representative of the population from which it is drawn

- the bigger the sample the greater the likelihood that it represents the population from which it is drawn

- small samples have unstable means. Big samples have stable means.

Estimating difference

The measure of stability of the mean is the Standard Error of the Mean (SEM) = standard deviation / the square root of the number in the sample: $SEM = s / \sqrt{N}$

So the stability of the mean is determined by the variability in the sample (this can be affected by the consistency of measurement) and the size of the sample.

The standard error of the mean (SEM) is the standard deviation of the normal distribution of the mean if we were to measure it again and again.
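
A short sketch of the SEM formula, using the group summaries given earlier in the deck (SD = 6.5 and 6.8, N = 20). The function name `sem` is an assumption for this example, and the N = 80 case is hypothetical, added only to show the effect of sample size.

```python
import math

def sem(sd, n):
    """Standard error of the mean: standard deviation over the square root of N."""
    return sd / math.sqrt(n)

print(round(sem(6.5, 20), 2))  # 1.45
print(round(sem(6.8, 20), 2))  # 1.52

# Hypothetical: the same spread with four times as many people halves the SEM.
print(round(sem(6.5, 80), 2))  # 0.73
```

The second value comes out as 1.52 rather than 1.53 because the SD was rounded to 6.8 before dividing.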

Yes, it's significant. The standard errors of the mean are 1.45 and 1.53, so the 95% confidence interval will be about 3 points (1.96 × 1.5) either side of each mean. Each mean falls outside the other's confidence interval.

Is the difference between means significant?

What is clear is that the mean of the Rich group is well outside of the area where there is a 95% chance that the mean for the Poor Group will fall, so it is likely that the Rich mean comes from a different population than the Poor mean.

The convention is to say that if mean 2 falls outside the area (the confidence interval) where 95% of mean 1 scores are estimated to lie, then mean 2 is significantly different from mean 1. We say the probability of mean 1 and mean 2 being the same is less than 0.05 (p<0.05) and the difference is significant.
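
The convention can be sketched as a confidence-interval check. The numbers are the Poor (mean 75.0, SD 6.8) and Rich (mean 81.9, SD 6.5) summaries from the deck, each with N = 20; the function names are assumptions for this example, and 1.96 is the usual normal multiplier for 95%.

```python
import math

def ci95(mean, sd, n, z=1.96):
    """Approximate 95% confidence interval for a mean: mean +/- z * SEM."""
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

def outside(value, interval):
    """True if the value lies outside the interval."""
    low, high = interval
    return value < low or value > high

poor_ci = ci95(75.0, 6.8, 20)   # roughly 72.0 to 78.0
rich_ci = ci95(81.9, 6.5, 20)   # roughly 79.1 to 84.7

print(outside(81.9, poor_ci), outside(75.0, rich_ci))  # True True
```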

The significance of significance

• Not an opinion

• A sign that very specific criteria have been met

• A standardised way of saying that:

There is a difference between two groups (p<0.05);
There is no difference between two groups (p>0.05);
There is a predictable relationship between two groups (p<0.05); or
There is no predictable relationship between two groups (p>0.05).

• A way of getting around the problem of variability

If you argue for a one-tailed test, saying the difference can only be in one direction, then you can add the 2.5% error from the side where no data is expected to the side where it is.

[Figure: the M1 distribution under a 2-tailed test (2.5% in each tail, 95% in the middle) and under a 1-tailed test.]
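
The effect of moving all 5% of the error into one tail can be seen in the cut-off values. This sketch uses the standard normal distribution from Python's `statistics` module; it illustrates the idea and is not taken from the slides.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1
alpha = 0.05

# Two-tailed: 2.5% in each tail, so the cut-off sits at the 97.5th percentile.
two_tailed_cutoff = z.inv_cdf(1 - alpha / 2)

# One-tailed: all 5% in one tail, so the cut-off sits at the 95th percentile.
one_tailed_cutoff = z.inv_cdf(1 - alpha)

print(round(two_tailed_cutoff, 2), round(one_tailed_cutoff, 2))  # 1.96 1.64
```

The one-tailed cut-off is lower, so a given difference reaches significance more easily, which is why the direction has to be argued for in advance.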

T-test results

t-Test: Two-Sample Assuming Equal Variances

                               Poor    Rich
Mean                           75      81.9
Variance                       49.1    44.1
Observations                   20      20
Pooled Variance                46.6
Hypothesized Mean Difference   0
df                             38
t Stat                         -3.2
P(T<=t) one-tail               0
t Critical one-tail            1.69
P(T<=t) two-tail               0
t Critical two-tail            2.02
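
The pooled variance, degrees of freedom and t statistic in this output can be recomputed from the summary rows alone. This is a sketch of the standard equal-variance two-sample t formula; the function name is an assumption, and the critical value 2.02 is copied from the table rather than computed.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic assuming equal variances, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df, pooled_var

# Poor then Rich, as in the table.
t_stat, df, pooled_var = pooled_t(75.0, 49.1, 20, 81.9, 44.1, 20)

print(round(t_stat, 1), df, round(pooled_var, 1))  # -3.2 38 46.6
print(abs(t_stat) > 2.02)  # True: beyond the two-tail critical value
```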

Tests of significance

• Tests of difference – t-tests, analysis of variance, chi-square, odds ratios

• Tests of relationship – correlation, regression analysis

• Tests of difference and relationship – analysis of covariance, multiple regression analysis.

Chi-squared (χ²) comparison of age in the sample vs the Waitakere population

Participants in each category

Age            Observed (Sample) O   Expected (Waitakere) E   O-E     (O-E)²   (O-E)²/E
16-34 years    26                    23.35                    2.65    7.00     0.30
35-54          23                    23.85                    -0.85   0.72     0.03
55-74          10                    11.52                    -1.52   2.30     0.20
75 and older   3                     3.29                     -0.29   0.09     0.03
Total          62                    62.01                                     0.56

N = 4 categories, DF = 3, p = 0.05. NS = not significant.
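
The chi-squared total in the table is the sum of (O-E)²/E over the four age bands. A minimal sketch using the observed and expected counts above; the criterion 7.82 is the value the deck quotes for Age (3 degrees of freedom, p = 0.05), copied rather than computed.

```python
# Observed (sample) and expected (Waitakere) counts from the table.
observed = [26, 23, 10, 3]
expected = [23.35, 23.85, 11.52, 3.29]

# Chi-squared: sum over categories of (O - E)^2 / E.
chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

criterion = 7.82  # quoted in the deck for df = 3 at p = 0.05

print(round(chi_squared, 2), df)  # 0.56 3
print(chi_squared > criterion)    # False: not significant
```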

Values of chi-square for the research project

Only two groups show a significant difference: the sample does not differ significantly from the Waitakere population except on culture and qualifications.

Group            Chi-squared obtained   Chi-squared criterion   P        Significance
Occupation       15.56                  21.03                   p>0.05   NS
Age              0.56                   7.82                    p>0.05   NS
Family context   0.39                   7.82                    p>0.05   NS
Culture          20.13                  11.07                   p<0.05   Significant
Gender           0.01                   3.84                    p>0.05   NS
Qualifications   6.12                   5.99                    p<0.05   Significant
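
The decision rule behind the Significance column is simply "obtained value greater than criterion value". A small sketch using the obtained and criterion figures from the table; the dictionary layout is an assumption for this example.

```python
# (obtained chi-squared, criterion chi-squared) for each group, from the table.
results = {
    "Occupation": (15.56, 21.03),
    "Age": (0.56, 7.82),
    "Family context": (0.39, 7.82),
    "Culture": (20.13, 11.07),
    "Gender": (0.01, 3.84),
    "Qualifications": (6.12, 5.99),
}

# A group is significant (p < 0.05) when the obtained value exceeds the criterion.
significant = [group for group, (obtained, criterion) in results.items()
               if obtained > criterion]

print(significant)  # ['Culture', 'Qualifications']
```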

Person   Height (inches) - X   Self Esteem score/5 - Y
1        68                    4.1
2        71                    4.6
3        62                    3.8
4        75                    4.4
5        58                    3.2
6        60                    3.1
7        67                    3.8
8        68                    4.1
9        71                    4.3
10       69                    3.7
11       68                    3.5
12       67                    3.2
13       63                    3.7
14       62                    3.3
15       60                    3.4
16       63                    4.0
17       65                    4.1
18       67                    3.8
19       63                    3.4
20       61                    3.6

$r = \dfrac{\sum (X - M_X)(Y - M_Y)}{N \, S_X \, S_Y}$

$r$ = correlation coefficient

$X$ = Height

$Y$ = Self Esteem

$M_X$ = Mean of X

$M_Y$ = Mean of Y

$S_X$ = Standard deviation of X

$S_Y$ = Standard deviation of Y
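
The formula can be applied to the twenty height and self-esteem pairs from the deck. This sketch follows the formula term by term; because the denominator multiplies by N, the standard deviations here use the divisor N. The function name is an assumption for this example.

```python
import math

# Height (inches) and self-esteem scores for persons 1 to 20, from the data table.
heights = [68, 71, 62, 75, 58, 60, 67, 68, 71, 69,
           68, 67, 63, 62, 60, 63, 65, 67, 63, 61]
esteem = [4.1, 4.6, 3.8, 4.4, 3.2, 3.1, 3.8, 4.1, 4.3, 3.7,
          3.5, 3.2, 3.7, 3.3, 3.4, 4.0, 4.1, 3.8, 3.4, 3.6]

def pearson_r(xs, ys):
    """Correlation coefficient: sum of cross-products over N * SX * SY."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Standard deviations with divisor N, to match the N in the denominator.
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    cross_products = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cross_products / (n * sx * sy)

print(round(pearson_r(heights, esteem), 2))  # 0.73
```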

r = 0.73, N = 20

Level of Significance: Two-Tailed Probabilities

Probability of error             0.1           0.05         0.01          0.001
Chance of not being correlated   10% or 1/10   5% or 1/20   1% or 1/100   0.1% or 1/1000
r value when n=20                0.378         0.444        0.561         0.679
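
Reading the table amounts to finding the strictest level whose critical r is still reached. A sketch using the n = 20 row above; the function and constant names are assumptions for this example, and the table applies only to two-tailed tests with n = 20.

```python
# Critical values of r for n = 20, two-tailed, copied from the table above.
CRITICAL_R_N20 = {0.1: 0.378, 0.05: 0.444, 0.01: 0.561, 0.001: 0.679}

def strictest_level(r, table=CRITICAL_R_N20):
    """Smallest probability of error at which |r| reaches the critical value, or None."""
    passed = [p for p, critical in table.items() if abs(r) >= critical]
    return min(passed) if passed else None

print(strictest_level(0.73))  # 0.001
print(strictest_level(0.40))  # 0.1
print(strictest_level(0.20))  # None
```

With r = 0.73 the correlation passes even the 0.001 criterion, which is consistent with reporting it as p < 0.01.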

One or two tails?

What degrees of freedom?

What level of significance should be chosen?

Correlations

The perfect positive correlation

The perfect negative correlation

No correlation at all

A perfect relationship, but not a correlation

[Figure: y plotted against x]
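
The slide's figure is not available, so the following is a hypothetical illustration of the idea: a curve where y is completely determined by x, yet the linear correlation coefficient is zero. The choice of y = x² over a symmetric range of x is an assumption made for this example.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient: sum of cross-products over N * SX * SY."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    cross_products = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cross_products / (n * sx * sy)

# Hypothetical data: y is fully determined by x, but the relationship is curved.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

print(pearson_r(xs, ys))  # 0.0
```

The correlation coefficient measures only straight-line association, so the rising and falling halves of the curve cancel out.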

How correlation is used and misused

Normality of residuals, Linearity, & Homoscedasticity