N318b Winter 2002 Nursing Statistics: Normal distribution, Z-scores, Central Limit Theorem, Probability (Lecture 4)




Nur 318b 2002 Lecture 4: page 2

School of Nursing

Institute for Work & Health

Today’s Class

- Normal distribution
- Z-scores
- Central limit theorem
- << 10 min break >>
- Probability
- Applying knowledge to assigned readings (Wolfe et al., 1996)

No work group today!


A Quick Review from Last Week

- Data presentation
  - Bar graphs, pie charts
  - Histograms, polygons (lines)
  - Box plots
- Measures of distribution shape
  - Skew (asymmetry)
  - Kurtosis (peakedness)


Normal Distribution: “what is all the fuss about?!”

- Statistics is a branch of applied math
- Most statistical tests are based on a set of basic assumptions about data
- Most assumptions refer to the distribution of the data
- If the assumptions are not true, the tests are not valid!

Review: How do you check normality of data?
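One way to approach the review question numerically is sketched below. It is only an illustration, not the course's prescribed method: the helper name `shape_summary` and the blood pressure readings are made up for the example. It compares the mean with the median and computes skew and excess kurtosis, all of which should sit close to their "normal" values (mean = median, skew = 0, excess kurtosis = 0) when data are roughly normal.

```python
import statistics


def shape_summary(values):
    """Return simple numeric checks of distribution shape (hypothetical helper)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population-style SD, used only for the shape measures
    # Skew: average cubed standardized distance from the mean (0 for a symmetric curve).
    skew = sum(((x - mean) / sd) ** 3 for x in values) / n
    # Excess kurtosis: average fourth power minus 3 (0 for a normal curve).
    excess_kurtosis = sum(((x - mean) / sd) ** 4 for x in values) / n - 3
    return {
        "mean": mean,
        "median": statistics.median(values),
        "skew": skew,
        "excess_kurtosis": excess_kurtosis,
    }


# Made-up systolic BP readings (mmHg), symmetric around 110.
symmetric_readings = [98, 104, 107, 109, 110, 110, 111, 113, 116, 122]
# Made-up lengths of stay (days) with a long right tail.
skewed_stays = [1, 1, 2, 2, 3, 4, 6, 10, 20]

symmetric_summary = shape_summary(symmetric_readings)
skewed_summary = shape_summary(skewed_stays)
print(symmetric_summary)
print(skewed_summary)
```

In the symmetric example the mean and median coincide and the skew is essentially zero; in the second example the mean is pulled above the median and the skew is clearly positive. Plots such as histograms and box plots remain the natural first check.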


The (Standard) Normal Curve

- a hypothetical distribution that forms the basis of statistical theory (also called the Gaussian curve)

(See Figure 3.1 in textbook, page 64)


Why use the normal curve?

- Many variables are normally distributed
- Many tests require a normal distribution
- Allows for tests of inference since study results can be compared against it (i.e. it is a probability or “chance” distribution)

“Understanding the normal curve prepares you for understanding the concept of hypothesis testing” (Textbook page 64)


Where did the normal curve come from?

- There is an elegant mathematical formula (theory) underlying the distribution (you don’t need to know it!)
- Discovered in the 1700s by de Moivre, developed later by Gauss (1800s), and then used by Galton (medicine)
- Another example of mathematical theory helping to explain observed phenomena
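For the curious, the formula mentioned on this slide is the normal density, shown here for reference only (it is not required for the course). The symbols μ (mean) and σ (standard deviation) are the usual textbook notation and are an assumption here, not notation taken from the slides:

```latex
\[
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
\]
```

Setting μ = 0 and σ = 1 gives the standard normal curve.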


What is the normal curve used for?

Test if your observed value (e.g. BP) is different from expected value (i.e. can use standardized or Z-scores to check this)

Estimate precision of observed study mean (i.e. confidence intervals)

Tests based on probability (likelihood) that observed results “fit” normal curve


What are the properties of the normal curve?

- X-axis is measured in SDs (from the mean)
- Y-axis is frequency (units or counts)
- Mean, median and mode are all the same
- Symmetrical (“bell-shaped”) around the mean
- +/- 1 SD includes about 68% of the population
- +/- 2 SDs includes about 95% of the population
- “Tails” hold a very small % of the population

(REMEMBER: total area under curve = 100% or 1.0)
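The area figures for the normal curve can be checked with a few lines of Python. This is a minimal sketch using the standard library's `statistics.NormalDist`; the function name `share_within` is made up for the example.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, SD 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k_sd):
    """Share of a normal population lying within +/- k_sd SDs of the mean."""
    return standard_normal.cdf(k_sd) - standard_normal.cdf(-k_sd)


for k in (1, 2, 3):
    print(f"+/- {k} SD: {share_within(k):.1%}")
```

The shares come out at roughly 68.3%, 95.4% and 99.7%, so the tails beyond 2 SDs hold under 5% of the population in total.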


[Figure: Standard normal curve, with the x-axis marked at -2 SD, -1 SD, Mean, +1 SD and +2 SD. +/- 1 SD either side of the mean includes about 68% of the sample; +/- 2 SD includes about 95% of the sample.]


Z-scores

If a variable is normally distributed, then observed (mean) values can be converted to a Z-score

A Z-score is just another name for the “distance” from the population mean, measured in SDs

WHY?

Test if your study mean (e.g. BP) is different from the expected value


Z-scores – an example

HOW?

\[ Z = \frac{X - \bar{X}}{SD} \]

where \(\bar{X}\) = sample mean and SD = sample SD

A population has a mean systolic BP of 110 mmHg and an SD of 15 mmHg

What proportion (%) of people have a BP between 95 and 120?


Z-scores – an example

\[ Z = \frac{X - \bar{X}}{SD} \]

where \(\bar{X}\) = sample mean and SD = sample SD

\[ Z_1 = \frac{95 - 110}{15} = -1.0 \]

\[ Z_2 = \frac{120 - 110}{15} = 0.67 \]


Z-scores – an example

Now need to extract % values from the Z-scores using a table (e.g. Appendix A, pg. 417-8 of textbook)

Negative values are % areas to the left of the mean; positive values are to the right of the mean

From the table in Appendix A:

- Z1 = -1.0 = 34.13% (between 95 and 110)
- Z2 = 0.67 = 24.86% (between 110 and 120)

Total area = 34.13 + 24.86 = 58.99%
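The table lookup on this slide can be reproduced in code. The sketch below assumes, from the values quoted on the slide, that the Appendix A table lists the area between the mean and a given Z; the function name `area_mean_to_z` is made up for the example.

```python
from statistics import NormalDist


def area_mean_to_z(z):
    """Percent of the curve between the mean and z (assumed layout of the Z table)."""
    return (NormalDist().cdf(abs(z)) - 0.5) * 100


# Z-scores for BP = 95 and BP = 120 (mean 110, SD 15), rounded to 2 decimals as in a table.
z1 = round((95 - 110) / 15, 2)
z2 = round((120 - 110) / 15, 2)

left_area = area_mean_to_z(z1)    # area between 95 and 110
right_area = area_mean_to_z(z2)   # area between 110 and 120
total_area = left_area + right_area
print(f"{left_area:.2f} + {right_area:.2f} = {total_area:.2f}")
```

The sign of Z only says which side of the mean the area is on, which is why the function takes the absolute value before looking up the area.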


Z-scores – example 2

What proportion (%) of people have a systolic BP above 140?

\[ Z = \frac{X - \bar{X}}{SD} = \frac{140 - 110}{15} = 2.0 \]

From the table in Appendix A: Z = 2.0 = 47.72%. But this represents what?

- People with a BP between 110 and 140

So > 140 = 50 – 47.72 = 2.28%
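Both blood pressure examples can also be worked without a printed table. The sketch below is illustrative only: it models the population as a normal distribution with mean 110 and SD 15, and the function names are made up.

```python
from statistics import NormalDist

# Population from the lecture example: mean systolic BP 110 mmHg, SD 15 mmHg.
bp = NormalDist(mu=110, sigma=15)


def proportion_between(dist, low, high):
    """Share of the population with values between low and high."""
    return dist.cdf(high) - dist.cdf(low)


def proportion_above(dist, cutoff):
    """Share of the population with values above cutoff (upper tail)."""
    return 1 - dist.cdf(cutoff)


between_95_and_120 = proportion_between(bp, 95, 120)
above_140 = proportion_above(bp, 140)
print(f"Between 95 and 120: {between_95_and_120:.1%}")
print(f"Above 140: {above_140:.2%}")
```

The first result is about 58.9% rather than exactly 58.99%, because the table method rounds Z to 0.67 before the lookup. The second matches the table answer of about 2.28%.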


Central Limit Theorem - What is it?

When large enough samples (e.g. n >= 25) are drawn from a population with a known variance, the sample means will be approximately normally distributed

Theorem holds even if the underlying distribution is moderately non-normal (e.g. a bit skewed)

i.e. if you plot the sample means (\(\bar{X}\)’s) you get a bell-curve
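A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not part of the lecture: the population is a deliberately skewed exponential distribution (mean 1, SD 1), the sample size of 25 follows the slide's rule of thumb, and the seed and number of samples are arbitrary.

```python
import random
import statistics

random.seed(318)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 25      # n, the size of each sample
NUM_SAMPLES = 2000    # how many samples to draw
POP_MEAN = 1.0        # mean of the exponential population used here
POP_SD = 1.0          # SD of the exponential population used here


def draw_sample_mean(n):
    """Draw one sample of size n from the skewed population and return its mean."""
    sample = [random.expovariate(1.0) for _ in range(n)]
    return statistics.fmean(sample)


sample_means = [draw_sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = POP_SD / SAMPLE_SIZE ** 0.5  # SD / sqrt(n)

# Share of sample means within 1 SE of the population mean (about 68% if bell-shaped).
share_within_1_se = sum(
    abs(m - POP_MEAN) <= expected_se for m in sample_means
) / NUM_SAMPLES

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f} (SD / sqrt(n) = {expected_se:.3f})")
print(f"within 1 SE of mean:  {share_within_1_se:.1%}")
```

Even though individual values are strongly skewed, the sample means cluster around the population mean with a spread close to SD / sqrt(n), and roughly two thirds of them fall within one standard error.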


Central Limit Theorem – What is its importance?

- Now have the ability to statistically test the likelihood of an observed (sample) mean
- Variation (“dispersion”) about the true mean is called the “standard error” (SE) of the mean
- SE (of mean) and SD (of sample) are directly related mathematically:

\[ SE = \frac{SD}{\sqrt{n}} \]

(where n = sample size)
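As a quick worked illustration of the SE formula, using the SD of 15 mmHg from the blood pressure example in this lecture and two sample sizes (n = 100 and n = 10):

```latex
\[
SE_{n=100} = \frac{15}{\sqrt{100}} = \frac{15}{10} = 1.5 \text{ mmHg}
\qquad
SE_{n=10} = \frac{15}{\sqrt{10}} \approx \frac{15}{3.16} \approx 4.74 \text{ mmHg}
\]
```

The larger sample gives a much smaller SE, i.e. a more precise estimate of the mean.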


Z-scores – for means

How likely is it (i.e. what %) that a sample of size n = 100 will have a mean systolic BP > 113 (assuming μ = 110 and σ = 15)?

\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{113 - 110}{15 / 10} = 2.0 \]

From the Z-score table in Appendix A: Z = 2.0 = 47.72% of the area to the right of μ. But once again, this represents what?

- Sample means between 110 and 113 mmHg

So for \(\bar{X}\) > 113: 50 – 47.72 = 2.28%

- Want Z-scores >= about 2!


Effect of sample size on mean?

What happens if the sample size drops to 10 (i.e. n = 10, \(\bar{X}\) > 113 and μ = 110, σ = 15)?

\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{113 - 110}{15 / 3.16} = 0.63 \]

From the table in Appendix A: Z = 0.63 = 23.57%. But once again, this represents what?

- Sample means that fall between 110 and 113 mmHg

So for \(\bar{X}\) > 113: 50 – 23.57 = 26.43%
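The effect of sample size can be explored by repeating the same calculation for several values of n. This is a minimal sketch assuming a population mean of 110 and SD of 15 as in the lecture example; the function name `prob_sample_mean_above` and the extra sample size of 25 are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_above(cutoff, mu, sigma, n):
    """Chance that the mean of a sample of size n exceeds cutoff."""
    se = sigma / sqrt(n)          # standard error of the mean
    z = (cutoff - mu) / se        # Z-score of the cutoff for sample means
    return 1 - NormalDist().cdf(z)


results = {
    n: prob_sample_mean_above(113, mu=110, sigma=15, n=n)
    for n in (10, 25, 100)
}
for n, p in results.items():
    print(f"n = {n:>3}: P(sample mean > 113) = {p:.1%}")
```

With n = 100 a sample mean above 113 is unlikely (about 2.3%), while with n = 10 it happens roughly a quarter of the time (about 26%), because the smaller sample has a much larger standard error.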


10 minute break !


Probability

- Think of it as a statistical measure of chance
- A proportion (e.g. %) that lets you make intelligent guesses about future events
- Often expressed as a “p-value”
- The p-value “rules” in (quantitative) research

\[ P(\text{event}) = \frac{\text{number of events}}{\text{number of subjects}} \]

(Often expressed as % when multiplied by 100)


Probability – cont’d

You read a well-done clinical trial that followed 1000 women with breast cancer (BC), 200 of whom had died from BC at 5 yrs

You then see a woman with BC on the ward and she asks you if she is going to live. What do you tell her?

She has a 20% probability, or a 1 in 5 chance, of dying from BC within 5 yrs
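The arithmetic behind that answer, written out with the event-over-subjects definition of probability and the trial numbers given above:

```latex
\[
P(\text{death from BC within 5 yrs})
  = \frac{\text{number of events}}{\text{number of subjects}}
  = \frac{200}{1000} = 0.20 = 20\% \quad (\text{1 in 5})
\]
```

By the same reasoning, the probability of not dying from BC within 5 years is 1 - 0.20 = 0.80, or 80%.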


Probability – cont’d

What if she then tells you she is node negative and the tumour was small?

Then she tells you her mother and sister both died from BC by age 45

Probability is a way of quantifying risk or likelihood of events occurring (usually according to a set of criteria)


Probability – Facts

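- Probabilities are always between 0 and 1 (0 = min value = no chance; 1 = max value = definite event)
- P-value = “probability due to chance”, i.e. the probability of getting a result at least as extreme as the one observed if chance alone were at work
- The cut-off for significance is arbitrarily “set” at p <= 0.05 in most cases, but it can vary from 0.2 to < 0.01
- P-value refers to the “tails” of the normal curve distribution (lower = better!)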
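The link between the tails of the normal curve and the p-value can be sketched in code. The slide does not say whether one or both tails are meant; the example below assumes a two-tailed test, and the function name `two_tailed_p` is made up.

```python
from statistics import NormalDist


def two_tailed_p(z):
    """Area in both tails of the standard normal curve beyond +/- |z|."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail


p_small_z = two_tailed_p(0.63)   # a Z-score close to the mean
p_cutoff = two_tailed_p(1.96)    # the Z-score usually paired with p = 0.05
p_large_z = two_tailed_p(2.0)    # a Z-score of 2

for label, p in (("0.63", p_small_z), ("1.96", p_cutoff), ("2.00", p_large_z)):
    print(f"Z = {label}: two-tailed p = {p:.3f}")
```

Under this assumption a Z-score of about 2 corresponds to a p-value just under 0.05, while a Z-score of 0.63 gives a p-value well above it.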


Probability – Rules

Conditional Probabilities

- probability of event A given event B, written P(A | B)

Multiplication Rule (Independence!)

- probability of A and B = P(A) x P(B)

Addition Rule (Mutually exclusive!)

- probability of A or B = P(A) + P(B)
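The three rules can be checked by counting outcomes directly. The sketch below uses two fair dice, a hypothetical example that does not come from the lecture, and exact fractions so that no rounding is involved; all function names are made up.

```python
from fractions import Fraction
from itertools import product

# Sample space: the 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) = number of outcomes where the event occurs / number of outcomes."""
    hits = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(hits, len(outcomes))


def first_is_six(outcome):
    return outcome[0] == 6


def second_is_six(outcome):
    return outcome[1] == 6


def total_is_two(outcome):
    return sum(outcome) == 2


def total_is_twelve(outcome):
    return sum(outcome) == 12


# Multiplication rule: the two dice are independent, so P(A and B) = P(A) x P(B).
p_both_six = prob(lambda o: first_is_six(o) and second_is_six(o))

# Addition rule: a total of 2 and a total of 12 cannot happen together
# (mutually exclusive), so P(A or B) = P(A) + P(B).
p_two_or_twelve = prob(lambda o: total_is_two(o) or total_is_twelve(o))

# Conditional probability: P(A given B) = P(A and B) / P(B).
p_twelve_given_first_six = (
    prob(lambda o: total_is_twelve(o) and first_is_six(o)) / prob(first_is_six)
)

print(p_both_six, p_two_or_twelve, p_twelve_given_first_six)
```

Knowing that the first die shows a six raises the chance of a total of twelve from 1 in 36 to 1 in 6, which is the same idea as revising a patient's risk once more is known about her.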


Part 2: Application to the Assigned Reading


Wolfe et al. (1996)

Quick summary of the paper:

- an etiologic study aimed at exploring possible causal pathways between back pain and osteoarthritis (OA) of the knee
- a 3-year consecutive series of 368 knee OA patients via a rheumatology clinic
- cross-sectional questionnaire assessment of key study variables (possible bias?)


Wolfe et al. (1996)

Typical example of a sophisticated multistage exploratory analysis:

- Descriptive analysis
- Exploratory univariate analysis
- Causal pathway multivariate analysis


Some questions …

What does Figure 1 tell us?

Why did they group BMI in quartiles?


Some questions …

Do you understand the major features of the data in Table 1?

What do all the columns mean? e.g. “unadjusted” vs. “adjusted”

Odds ratios and confidence intervals will be studied later (CIs in the next lecture!)


Next Week - Lecture 5: Inference testing, Type I and Type II errors, p-values, and Confidence Intervals

For next week’s class please review:

1. Page 14 in syllabus
2. Textbook Chapter 3, pages 80-91
3. Syllabus papers:
   i) Birenbaum et al. (1996)
   ii) Gulick (1995)


Research Practicum

Can those who signed up please stay for a few extra minutes to decide placements?

Do those who signed up last term and did NOT get placed want to be put back in the “pool” to be placed?