Measures of Dispersion CUMULATIVE FREQUENCIES INTER-QUARTILE RANGE RANGE MEAN DEVIATION VARIANCE and STANDARD DEVIATION STATISTICS: DESCRIBING VARIABILITY Variability = Uncertainty Probabilities


Measures of Dispersion

CUMULATIVE FREQUENCIES

INTER-QUARTILE RANGE

RANGE MEAN DEVIATION

VARIANCE and STANDARD DEVIATION

STATISTICS: DESCRIBING VARIABILITY

Variability = Uncertainty

Probabilities


STATISTICS: PROBABILITIES

CALCULATING PROBABILITIES

NORMAL DISTRIBUTION

What are probabilities?

z-DISTRIBUTION

t-DISTRIBUTION

Probabilities

Probability

The probability of an event A, symbolized by P(A), is a number between 0 and 1

The higher the P value the more probable the event

P(A) = (The Number Of Ways Event A Can Occur) / (The Total Number Of Possible Outcomes)

The chance of throwing a particular face of a six-sided die = 1/6 = 0.167

The chance of NOT throwing that face = 5/6 = 0.833 = 1 – 0.167
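The ratio definition of P(A) can be sketched in a few lines of Python. This is only an illustration: the function name `probability` is made up for the example, and a fair six-sided die is assumed as the source of the 1/6 and 5/6 figures.

```python
from fractions import Fraction


def probability(ways_event_can_occur: int, total_outcomes: int) -> Fraction:
    """P(A) = number of ways A can occur / total number of possible outcomes."""
    if total_outcomes <= 0 or not 0 <= ways_event_can_occur <= total_outcomes:
        raise ValueError("need total > 0 and 0 <= ways <= total")
    return Fraction(ways_event_can_occur, total_outcomes)


# Assumption: a fair six-sided die, one favourable face.
p_face = probability(1, 6)
p_not_face = 1 - p_face  # complement: any of the other five faces

print(round(float(p_face), 3))      # 0.167
print(round(float(p_not_face), 3))  # 0.833
```

Using `Fraction` keeps 1/6 exact until the final rounding, so the complement is exactly 5/6.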

STATISTICS: PROBABILITY

What is the probability of picking a student 1.65 m tall from the class?

[Figure: histogram of frequency against height (m)]

Height (m)   Frequency
1.20          1
1.25          1
1.30          1
1.35          2
1.40          2
1.45          3
1.50          5
1.55          6
1.60          8
1.65         10
1.70         12
1.75         15
1.80         11
1.85          9
1.90          7
1.95          5
2.00          3
2.05          2
2.10          1
2.15          1
2.20          1

Class has 106 students (n = 106)

10 are 1.65 m tall

P(1.65) = 10/106 = 0.094

P(NOT 1.65) = 1 – P(1.65) = 1 – 0.094 = 0.906

The probability is a number between 0 and 1

Depends on how the data are distributed
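As a rough check of the arithmetic on this slide, the class table can be turned into a probability directly. The dictionary below is transcribed from the height/frequency table; the variable names are illustrative only.

```python
# Height (m) -> number of students, transcribed from the class table.
height_counts = {
    1.20: 1, 1.25: 1, 1.30: 1, 1.35: 2, 1.40: 2, 1.45: 3, 1.50: 5,
    1.55: 6, 1.60: 8, 1.65: 10, 1.70: 12, 1.75: 15, 1.80: 11, 1.85: 9,
    1.90: 7, 1.95: 5, 2.00: 3, 2.05: 2, 2.10: 1, 2.15: 1, 2.20: 1,
}

n = sum(height_counts.values())   # class size
p_165 = height_counts[1.65] / n   # P(1.65)
p_not_165 = 1 - p_165             # P(NOT 1.65)

print(n)                    # 106
print(round(p_165, 3))      # 0.094
print(round(p_not_165, 3))  # 0.906
```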

STATISTICS: PROBABILITY

Frequency Distributions

[Figure: two histograms of the class heights (m), one with y-axis "No. of observations" and one with y-axis "Frequency (%)"]

Area under graph = total number of observations

Can display frequency distributions as % or proportion

Area under graph = 1.0 (for proportion) = 100 (for percentage)

12 people = 0.113

10 people = 0.094
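The change from counts to proportions or percentages is a simple rescaling. A minimal sketch, assuming the 21 class frequencies from the height table (listed in order from 1.20 m to 2.20 m):

```python
import math

# Frequencies from the class height table, 1.20 m to 2.20 m in 0.05 m steps.
counts = [1, 1, 1, 2, 2, 3, 5, 6, 8, 10, 12, 15, 11, 9, 7, 5, 3, 2, 1, 1, 1]

n = sum(counts)                               # total number of observations
proportions = [c / n for c in counts]         # sums to 1.0
percentages = [100 * p for p in proportions]  # sums to 100

print(n)                      # 106
print(round(12 / n, 3))       # 0.113 (the bar with 12 people)
print(round(10 / n, 3))       # 0.094 (the bar with 10 people)
print(math.isclose(sum(proportions), 1.0))    # True
print(math.isclose(sum(percentages), 100.0))  # True
```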

STATISTICS: PROBABILITY

Normal Distribution

[Figure: frequency (%) against height (m), with the tails of the distribution labelled]

• Data clustered around the mean
• Therefore good chance (high probability) of picking (at random) a student with a height close to the mean
• Small chance (low probability) of picking (at random) a student who is either very tall or very short

POPULATION DYNAMICS

Required background knowledge:

• Data and variability concepts
• Data collection
• Measures of central tendency and dispersion (mean, median, mode, variance, stdev)
• Normal distribution and Standard Error
• Student’s t-test and 95% confidence intervals
• Chi-Square tests
• MS Excel

STATISTICS: PROBABILITY

Properties of a normal distribution:
• The mean, median and mode are the same
• The frequency distribution is completely symmetrical either side of the mean
• The area under the curve is proportional to the number of observations

[Figure: frequency (%) against height (mm); Σ = 100%]

The shape of the curve depends on the variance or standard deviation: the spread of values about the mean

[Figure: frequency against X for normal curves with x̄ = 10 and s² = 4, s² = 8, s² = 12 and s² = 16]
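How the variance changes the shape of the curve can be illustrated with the standard normal density formula. This is a sketch rather than the method used to draw the slide's figure; `normal_pdf` is a name chosen for the example, and the mean of 10 and variances 4, 8, 12 and 16 are taken from the figure labels.

```python
import math


def normal_pdf(x: float, mean: float, variance: float) -> float:
    """Height of the normal curve at x for a given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(
        2 * math.pi * variance
    )


mean = 10
for variance in (4, 8, 12, 16):
    peak = normal_pdf(mean, mean, variance)
    # A larger variance gives a lower peak and a wider spread about the mean.
    print(f"s2 = {variance:2d}: height at the mean = {peak:.4f}")

# Symmetry: points equally far either side of the mean have the same height.
print(math.isclose(normal_pdf(8, mean, 4), normal_pdf(12, mean, 4)))  # True
```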

STATISTICS: PROBABILITY

Some datasets are normally distributed, but NOT all.

[Figure: frequency (%) against height (mm); Σ = 100%]

The normal curve has fixed mathematical properties, irrespective of:
• The scale on which it is drawn
• The magnitude or units of its mean
• The magnitude or units of its standard deviation

…and these render it susceptible to STATISTICAL ANALYSIS…

Can use the normal distribution to calculate probabilities


STATISTICS: CALCULATING PROBABILITIES

To calculate the probability of a particular value x being drawn from a normally distributed population of data, you need to know the mean AND the standard deviation of the data

Z = (x – μ) / σ     (Equation 1)

x = value you are considering
μ = population mean
σ = population standard deviation

Z = how many standard deviations away from the mean the value x is

If Z is a small number then x is close to the mean
If Z is a large number then x is far from the mean

Z-values form the Z-DISTRIBUTION… Z is based on data that are normally distributed, so the Z distribution is also normally distributed.

Once we know the Z-value we use statistical tables to calculate the associated probability…

Z     0       1       2       3       4       5
0.0   0.5000  0.4960  0.4920  0.4880  0.4840  0.4801
0.1   0.4602  0.4562  0.4522  0.4483  0.4443  0.4404
0.2   0.4207  0.4168  0.4129  0.4090  0.4052  0.4013
0.3   0.3821  0.3783  0.3745  0.3707  0.3669  0.3632
0.4   0.3446  0.3409  0.3372  0.3336  0.3300  0.3264
0.5   0.3085  0.3050  0.3015  0.2981  0.2946  0.2912
0.6   0.2743  0.2709  0.2676  0.2643  0.2611  0.2578
0.7   0.2420  0.2389  0.2358  0.2327  0.2297  0.2266
0.8   0.2119  0.2090  0.2061  0.2033  0.2005  0.1977
0.9   0.1841  0.1814  0.1788  0.1762  0.1736  0.1711
1.0   0.1587  0.1562  0.1539  0.1515  0.1492  0.1469
1.1   0.1357  0.1335  0.1314  0.1292  0.1271  0.1251
1.2   0.1151  0.1131  0.1112  0.1093  0.1075  0.1056
1.3   0.0968  0.0951  0.0934  0.0918  0.0901  0.0885
1.4   0.0808  0.0793  0.0778  0.0764  0.0749  0.0735
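Equation 1 and the table lookup can be sketched in Python. The table starts at 0.5000 for Z = 0 and falls as Z grows, so it appears to list the upper-tail probability P(Z > z); that reading is an assumption here. The helper names `z_score` and `upper_tail` are made up for the example, and `math.erfc` is the standard-library complementary error function.

```python
import math


def z_score(x: float, mu: float, sigma: float) -> float:
    """Equation 1: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable (assumed to be what the table lists)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Rebuild the first three rows of the table: the row label is Z to one
# decimal place, the column (0 to 5) supplies the second decimal place.
for row in (0.0, 0.1, 0.2):
    cells = [f"{upper_tail(row + col / 100):.4f}" for col in range(6)]
    print(f"{row:.1f}", *cells)
```

In a spreadsheet or script this replaces the printed table; for hand calculation the table remains the quicker route.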

[Figure: frequency against height (m)]

STATISTICS: CALCULATING PROBABILITIES – an example

What is the probability of randomly drawing a student measuring more than 1.95 m from the population if μ = 1.55 m and σ = 0.3 m?

Step 1: calculate Z using known information

Z = (x – μ) / σ
Z = (1.95 – 1.55) / 0.3
Z = 0.4 / 0.3
Z = 1.33

Step 2: Look up Z in Z-tables

The row gives Z to the 1st decimal place (1.3) and the column gives the 2nd decimal place (3). The table entry is the PROBABILITY, P(A): 0.0918.

[Figure: frequency against height (m), from 0.45 to 2.55, with the matching Z axis from -3.67 to 3.33; the area beyond 1.95 m (Z = 1.33) is marked p > 1.95 = 0.0918]

The probability of randomly drawing a student measuring more than 1.95 m from the population is 0.0918 [P(> 1.95) = 0.0918]

REMEMBER: The higher the P-value the more probable the event
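The two steps of this example can be reproduced as follows. This is a sketch under one assumption worth flagging: Z is rounded to two decimal places before the probability is computed, to mimic the table lookup. Without that rounding the result differs slightly in the fourth decimal place.

```python
import math

mu, sigma, x = 1.55, 0.3, 1.95

# Step 1: calculate Z using known information (Equation 1).
z = (x - mu) / sigma
z_table = round(z, 2)  # the Z-table is indexed to two decimal places

# Step 2: upper-tail probability P(Z > z), computed instead of looked up.
p_table = 0.5 * math.erfc(z_table / math.sqrt(2))
p_exact = 0.5 * math.erfc(z / math.sqrt(2))  # using the unrounded Z

print(z_table, f"{p_table:.4f}")  # 1.33 0.0918
print(f"{p_exact:.4f}")           # slightly smaller, because unrounded Z > 1.33
```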

Now you try:

A population of bone measurements is normally distributed with μ = 60 mm and σ = 10 mm. What is the probability of selecting a bone with a length greater than 66 mm?

Step 1: calculate Z using known information

Z = (x – μ) / σ

Step 2: Look up the P-value in the Z-tables

ANSWER:

Z = (66 – 60) / 10 = 0.60

Therefore p = 0.2743

In Excel:
• Enter x, μ and σ into different cells
• Formula: =(x – μ)/σ
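As an alternative to the Excel formula, the bone-length answer can be checked with Python's standard-library `statistics.NormalDist` (available from Python 3.8). This sketch skips the explicit standardisation step and asks the distribution for the tail area directly; the variable names are illustrative.

```python
from statistics import NormalDist

# Population of bone lengths: mean 60 mm, standard deviation 10 mm.
bones = NormalDist(mu=60, sigma=10)

z = (66 - bones.mean) / bones.stdev  # Step 1: Z = (x - mu) / sigma
p_longer = 1 - bones.cdf(66)         # Step 2: P(length > 66 mm)

print(z)                   # 0.6
print(f"{p_longer:.4f}")   # 0.2743
```

`cdf(66)` gives the area below 66 mm, so subtracting it from 1 gives the upper tail that the Z-table reports.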


Standard Deviation vs. Standard Error

NOTE:

Standard Deviation = tells you the variation around the mean

Standard Error = tells you how well you’ve estimated the mean

Standard Error

[Figure: top panel, frequency against height (m), with X1, X2, X3 and X4 marked: the population has a normal distribution. Bottom panel, frequency against sample means: the sample means also have a normal distribution]

Standard Error

The variance of the mean

[Figure: repeated samples X1 to X12 drawn from the population, with sample sizes n = 4, n = 7, n = 10 and n = 12 labelled]

σ²x̄ = σ² / n     (Equation 2)

As n increases… so σ²x̄ decreases
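The claim that the variance of the mean shrinks as n grows can be checked with a small simulation. This is a sketch only: the population mean of 1.7 and standard deviation of 0.2 are hypothetical values chosen for illustration, as are the number of repeats and the seed. The sample sizes match those labelled on the slide.

```python
import random
import statistics

MU, SIGMA = 1.7, 0.2  # hypothetical population mean and standard deviation


def variance_of_sample_means(n: int, repeats: int = 20_000, seed: int = 1) -> float:
    """Draw many samples of size n and return the variance of their means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))
        for _ in range(repeats)
    ]
    return statistics.pvariance(means)


results = {n: variance_of_sample_means(n) for n in (4, 7, 10, 12)}
for n, simulated in results.items():
    expected = SIGMA**2 / n  # variance of the mean = sigma^2 / n
    print(f"n = {n:2d}: simulated {simulated:.5f}, sigma^2/n = {expected:.5f}")
```

The simulated values should sit close to σ²/n and fall as n rises, although the exact digits depend on the seed.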

Standard Error

σ²x̄ = σ² / n

Square root both sides:

σx̄ = σ / √n     (Equation 2: STANDARD ERROR)

Standard Error

[Figure: top panel, frequency against height (m): the population has a normal distribution. Bottom panel, frequency against sample means: the sample means also have a normal distribution]

A normal deviate referring to the normal distribution of Xi values:

Z = (x – μ) / σ

A normal deviate referring to the normal distribution of means:

Z = (x̄ – μ) / σx̄

Standard Error

What is the probability of obtaining a random sample of nine measurements with a mean greater than 50.0 mm, from a population having a mean of 47.0 mm and a standard deviation of 12.0 mm?

n = 9, x̄ = 50.0 mm, μ = 47.0 mm, σ = 12.0 mm

Step 1: calculate Z using known information

σx̄ = σ / √n = 12.0 / √9 = 12 / 3 = 4     (Equation 2)

Z = (x̄ – μ) / σx̄ = (50.0 – 47.0) / 4 = 3 / 4 = 0.75

Step 2: Look up Z in Z-tables

STATISTICS: z-DISTRIBUTION

Step 2: Look up Z in Z-tables

Z = 0.75

P = 0.2266

So 0.2266 is the probability of obtaining a random sample of nine measurements with a mean greater than 50.0 mm, from a population having a mean of 47.0 mm and a standard deviation of 12.0 mm.

For probability values, always report 4 decimal places
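The nine-measurement example can be verified with a few lines of Python. This is a sketch, with the upper-tail probability computed from `math.erfc` instead of being read from the Z-table; the variable names are illustrative.

```python
import math

n, x_bar, mu, sigma = 9, 50.0, 47.0, 12.0

# Step 1: standard error (Equation 2), then Z for the sample mean.
standard_error = sigma / math.sqrt(n)
z = (x_bar - mu) / standard_error

# Step 2: P(Z > z), the value the Z-table gives for Z = 0.75.
p = 0.5 * math.erfc(z / math.sqrt(2))

print(standard_error, z, f"{p:.4f}")  # 4.0 0.75 0.2266
```

Dividing by the standard error and not by σ is the only change from the single-value case: the question is about a sample mean, so the spread of means is what matters.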

Now you try:

What is the probability of obtaining a random sample of 5 measurements with a mean greater than 60.0 mm, from a population having a mean of 57 mm and a standard deviation of 7.0 mm?

Step 1: calculate Z using known information

σx̄ = σ / √n

Z = (x̄ – μ) / σx̄

Step 2: Look up the P-value in the Z-tables

ANSWER:

Z = 0.96

Therefore p = 0.1685

In Excel:
• Enter x̄, μ, σ and n
• Formula: =(x̄ - μ)/(σ/(SQRT(n)))

STATISTICS: z-DISTRIBUTION

In the last couple of equations for Z we have used the population parameters: μ, σ and σx̄

BUT we don’t usually have access to population data and must make do with sample estimators x̄, s and sx̄

IF n is very, very large, s is a good PROXY for σ (and sx̄ for σx̄): we use the Z distribution to calculate normal deviates

Z = (x̄ – μ) / σx̄

If n is not large, we must use the t distribution:

t = (x̄ – μ) / sx̄     (Equation 3)
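Equation 3 can be sketched with a small sample. Two things here are assumptions, not something the slides state: the five measurements are hypothetical values invented for illustration, and the sample standard error is taken as sx̄ = s / √n by analogy with Equation 2. The resulting t would then be looked up in a t-table with n – 1 degrees of freedom, not in the Z-table.

```python
import math
import statistics

sample = [48.0, 52.0, 50.0, 47.0, 53.0]  # hypothetical measurements (mm)
mu = 47.0                                # hypothesised population mean (mm)

n = len(sample)
x_bar = statistics.fmean(sample)  # sample mean, the estimator of mu
s = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)
s_x_bar = s / math.sqrt(n)        # assumed sample standard error, s / sqrt(n)

t = (x_bar - mu) / s_x_bar        # Equation 3
degrees_of_freedom = n - 1

print(f"x_bar = {x_bar}, s = {s:.4f}, s_x_bar = {s_x_bar:.4f}")
print(f"t = {t:.4f} with {degrees_of_freedom} degrees of freedom")
```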