Part III The General Linear Model Chapter 10 GLM. ANOVA

Chapter 10.1 Single Sample t-test

GLM, applied to ANOVA

Single Sample t-test. Sleep data example

William Sealy Gosset, AKA Student

1. Construct Model

• Verbal
  – Drug A increases time slept.
• Graphical model
  – Define quantity of interest: T = TDrugA – TControl
• Formal
  – Response:
  – Explanatory:
  – Formal:

[Diagram labels: D, V, G, F]

2. Execute analysis

Data  =  Model  +  Res      Res²
T     =  βo     +  ε        ε²
0.7   =  0.75   +  -0.05    0.002
-1.6  =  0.75   +  -2.35    5.523
-0.2  =  0.75   +  -0.95    0.903
-1.2  =  0.75   +  -1.95    3.803
-0.1  =  0.75   +  -0.85    0.723
3.4   =  0.75   +  2.65     7.023
3.7   =  0.75   +  2.95     8.703
0.8   =  0.75   +  0.05     0.003
0     =  0.75   +  -0.75    0.563
2     =  0.75   +  1.25     1.563
                   ∑=0      ∑=28.809?
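The Data = Model + Res table can be rebuilt in R, the language of the `lm()` calls used later in these slides. This is a minimal sketch that assumes the ten Drug A values are typed in by hand; the names `T_obs`, `lm0` and `decomp` are illustrative and do not come from the slides.

```r
# Extra hours slept under Drug A, copied from the Data column of the slide
T_obs <- c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0)

# Intercept-only GLM: T = beta_o + error
lm0 <- lm(T_obs ~ 1)
beta_o <- unname(coef(lm0)[1])   # estimate of beta_o (the sample mean)
res <- unname(residuals(lm0))    # Res = Data - Model

# Rebuild the Data = Model + Res table, with squared residuals
decomp <- data.frame(Data = T_obs, Model = beta_o, Res = res, Res2 = res^2)
print(round(decomp, 3))

sum_res <- sum(res)     # zero, up to floating-point error
ss_res  <- sum(res^2)   # residual sum of squares
```

Summing the unrounded squares gives 28.805. The slide's 28.809 is the total of the squares after each has been rounded to three decimals.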

3. Evaluate Model

□ Straight line model ok?
□ Errors homogeneous?
□ Errors normal?
□ Errors independent?

[Slide annotations: NA, NA]
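The slides do not say which diagnostics were used for the checklist. As one possible sketch, the normality and independence checks could be run on the residuals as below. The Shapiro-Wilk test and the lag-1 correlation are choices made for this example, not the author's stated method, and `T_obs` is an illustrative name.

```r
T_obs <- c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0)
res <- unname(residuals(lm(T_obs ~ 1)))

# Errors normal? Shapiro-Wilk test and a normal Q-Q plot of the residuals
sw <- shapiro.test(res)
qqnorm(res)
qqline(res)

# Errors independent? Correlation between each residual and the next one
lag1_r <- cor(res[-length(res)], res[-1])

print(c(shapiro_p = sw$p.value, lag1_r = lag1_r))
```

With a single sample there is no fitted line and only one group, so the straight-line and homogeneity checks have nothing to compare.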

4. State population and whether sample is representative.

a) Differences in time slept relative to control for all possible subjects

b) All possible differences in time slept relative to control, given the experimental protocol

c) Differences in time slept relative to control, given the experimental protocol

5. Decide on mode of inference. Is hypothesis testing appropriate?

• Does Drug A affect hours of sleep?
  – Don’t know if the answer is yes or no

6. State HA / Ho pair, with tolerance for Type I error

• HA: βo ≠ 0
• Ho: βo = 0
• State test statistic: F-ratio
• Distribution of test statistic: F-distribution
• Tolerance for Type I error: 5%

7. ANOVA- Calculate df, SS, MS, according to model

GLM:  T      =  βo    +  ε
df:   10        1        9
SS:   34.43     5.62     28.81

ANOVA Table
Source   df   SS      MS     F      P-value
DrugA     1    5.62   5.62   1.76   0.218
Res       9   28.81   3.20
Total    10   34.43
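The partition of df and SS can be cross-checked by hand in R. This sketch assumes the total SS is taken about zero (df = n), which is what reproduces the slide's 34.43; variable names are illustrative.

```r
T_obs <- c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0)
n <- length(T_obs)

ss_total <- sum(T_obs^2)                  # total SS about zero, df = n
ss_model <- n * mean(T_obs)^2             # SS explained by beta_o, df = 1
ss_res   <- sum((T_obs - mean(T_obs))^2)  # residual SS, df = n - 1

ms_model <- ss_model / 1
ms_res   <- ss_res / (n - 1)
F_ratio  <- ms_model / ms_res
p_value  <- pf(F_ratio, 1, n - 1, lower.tail = FALSE)

# Cross-check: the one-sample t statistic squared equals the F-ratio
tt <- t.test(T_obs, mu = 0)
t_squared <- unname(tt$statistic)^2

print(round(c(SS_total = ss_total, SS_model = ss_model, SS_res = ss_res,
              F = F_ratio, p = p_value), 3))
```

The single-sample t-test and the intercept-only GLM give the same p-value, which is why the slides can treat the t-test as a special case of ANOVA.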

8. Recompute p-value if necessary.

• Assumptions met so no need to recompute

9. Declare and report decision about model terms (compare p to α).

• p = 0.218 > α = 0.05, so reject HA: βo ≠ 0
• Report decision:
  – There is no significant difference in extra time slept, for drug A (F1,9 = 1.76, p = 0.218)
  – But might Type II error be a problem here?

• There may be a difference, but it is hidden in the variance
• Power analysis:
  – Compute the minimum detectable difference
    • T = 0.75 hours  F = 1.76  p = 0.218
    • T = 1.00 hours  F = 3.12  p = 0.111
    • T = 1.25 hours  F = 4.88  p = 0.054
    • T = 1.28 hours  F = 5.12  p = 0.050
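The minimum detectable difference can be sketched in R by holding n and the residual mean square fixed and asking what F a hypothetical mean difference would give. The function names `f_for` and `p_for` are illustrative, and the fixed-variance assumption is ours; it does reproduce the F values listed above.

```r
n      <- 10
ms_res <- 28.805 / 9   # residual mean square from the ANOVA step
alpha  <- 0.05

# F-ratio and p-value if the mean difference were d hours (variance unchanged)
f_for <- function(d) n * d^2 / ms_res
p_for <- function(d) pf(f_for(d), 1, n - 1, lower.tail = FALSE)

candidates <- c(0.75, 1.00, 1.25, 1.28)
power_tab <- data.frame(diff_h  = candidates,
                        F_ratio = round(f_for(candidates), 2),
                        p_value = round(p_for(candidates), 3))
print(power_tab)

# Smallest difference that reaches p = alpha exactly
f_crit <- qf(1 - alpha, 1, n - 1)
d_min  <- sqrt(f_crit * ms_res / n)
print(round(d_min, 2))
```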

• Another experiment, with more subjects, should be considered before concluding there is no evidence of an effect
• Power analysis:
  – Compute sample size needed to detect a difference
    • n = 10  F = 1.76  p = 0.218
    • n = 20  F = 3.71  p = 0.0692
    • n = 24  F = 4.49  p = 0.0451
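The slides do not state how the F values for larger n were projected. One assumption that reproduces them is that the mean stays at 0.75 hours and the residual SS grows in proportion to n, as if the observed data were replicated. A sketch under that assumption (function names are illustrative):

```r
mean_T    <- 0.75     # observed mean difference (hours)
ss_res_10 <- 28.805   # residual SS observed with n = 10
alpha     <- 0.05

# Assumption: with n subjects the residual SS scales as n / 10
f_for_n <- function(n) {
  ms_res <- ss_res_10 * (n / 10) / (n - 1)
  n * mean_T^2 / ms_res
}
p_for_n <- function(n) pf(f_for_n(n), 1, n - 1, lower.tail = FALSE)

n_grid <- c(10, 20, 24)
size_tab <- data.frame(n       = n_grid,
                       F_ratio = round(f_for_n(n_grid), 2),
                       p_value = round(p_for_n(n_grid), 4))
print(size_tab)
```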

10. Report and interpret parameters of biological interest.

• Parameters are not of interest here because there appears to be no difference
• BUT, with this sampling effort and variability, the study needs to be repeated to be conclusive.

Chapter 10.2 Two Sample t-test

GLM, applied to ANOVA

Two Sample t-test. Sleep data example

William Sealy Gosset, AKA Student

1. Construct Model

• Verbal
  – Extra time slept depends on drug
• Graphical model
• Formal model
  – Response:
  – Explanatory:

Measurement scale?

[Diagram labels: D, V, G, F]

2. Execute analysis

lm1 <- lm(diff~drug, data=drugs)

diff   drug    fits   res
0.7    DrugA   0.75   -0.05
-1.6   DrugA   0.75   -2.35
-0.2   DrugA   0.75   -0.95
-1.2   DrugA   0.75   -1.95
-0.1   DrugA   0.75   -0.85
3.4    DrugA   0.75   2.65
3.7    DrugA   0.75   2.95
0.8    DrugA   0.75   0.05
0      DrugA   0.75   -0.75
2      DrugA   0.75   1.25
1.9    DrugB   2.33   -0.43
0.8    DrugB   2.33   -1.53
1.1    DrugB   2.33   -1.23
0.1    DrugB   2.33   -2.23
-0.1   DrugB   2.33   -2.43
4.4    DrugB   2.33   2.07
5.5    DrugB   2.33   3.17
1.6    DrugB   2.33   -0.73
4.6    DrugB   2.33   2.27
3.4    DrugB   2.33   1.07

Parameter estimates

Based on output:

GLM routine:

T = βo + βDrug · Drug + ε
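The `lm()` call on the slide needs a `drugs` data frame that the slides do not show being built. A minimal sketch, assuming the twenty values are entered from the table above and that `drug` is stored as a factor:

```r
# Data frame built from the slide's table (construction is an assumption)
drugs <- data.frame(
  diff = c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0,
           1.9,  0.8,  1.1,  0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4),
  drug = factor(rep(c("DrugA", "DrugB"), each = 10))
)

lm1 <- lm(diff ~ drug, data = drugs)

# With R's default treatment coding, the intercept is the DrugA mean and
# the drugDrugB coefficient is the DrugB mean minus the DrugA mean
beta <- coef(lm1)
print(beta)

# Rebuild the slide's columns of fitted values and residuals
out <- cbind(drugs, fits = fitted(lm1), res = residuals(lm1))
print(round(out[, c("diff", "fits", "res")], 2))
```

The same twenty values ship with R as the built-in `sleep` data set (columns `extra` and `group`), so `lm(extra ~ group, data = sleep)` should give the same fit.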

3. Evaluate Model

□ Straight line model ok?
□ Errors homogeneous?
□ Errors normal?
□ Errors independent?

[Slide annotations: NA, X]
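With two groups the homogeneity check becomes meaningful. The slides do not show how it was assessed; one possible sketch compares the residual variance in each drug group. The use of `var.test()` here is our choice for illustration, not the author's stated method.

```r
drugs <- data.frame(
  diff = c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0,
           1.9,  0.8,  1.1,  0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4),
  drug = factor(rep(c("DrugA", "DrugB"), each = 10))
)
lm1 <- lm(diff ~ drug, data = drugs)
res <- residuals(lm1)

# Errors homogeneous? Residual variance within each drug group
group_var <- tapply(res, drugs$drug, var)
print(round(group_var, 2))

# F test comparing the two group variances (one possible formal check)
vt <- var.test(diff ~ drug, data = drugs)
print(vt$p.value)

# Residuals against fitted values, the usual visual check
plot(fitted(lm1), res, xlab = "fits", ylab = "res")
```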

4. State population and whether sample is representative.

• All possible differences in time slept between the two groups

5. Decide on mode of inference. Is hypothesis testing appropriate?

• Yes. The question is whether one drug is better than the other.
• It is not clear whether the greater hours of sleep due to the one drug is more than just chance.

6. State HA / Ho pair, with tolerance for Type I error

• HA:
• Ho:
• State test statistic:
• Distribution of test statistic:
• Tolerance for Type I error:

7. ANOVA- Calculate df, SS, MS, according to model

GLM:     T = βo + βDrug · Drug + ε
Source:  Total    Drug     Res
df:      19       1        18
SS:      77.37    12.48    64.89
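The ANOVA partition for the two-drug model can be read off `anova()`. A sketch, assuming the `drugs` data frame is built from the table in step 2:

```r
drugs <- data.frame(
  diff = c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0,
           1.9,  0.8,  1.1,  0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4),
  drug = factor(rep(c("DrugA", "DrugB"), each = 10))
)
lm1 <- lm(diff ~ drug, data = drugs)

tab <- anova(lm1)
print(tab)

ss_drug  <- tab["drug", "Sum Sq"]
ss_res   <- tab["Residuals", "Sum Sq"]
ss_total <- ss_drug + ss_res        # total SS about the grand mean
F_ratio  <- tab["drug", "F value"]
p_value  <- tab["drug", "Pr(>F)"]
```

Here the total SS is taken about the grand mean (df = 19), so the Drug SS is the difference between the Total and Res values quoted on the slide.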

8. Recompute p-value if necessary.

• When assumptions not met, recompute if:
  – n small (n = 20, so _____)
  – p near α (p = 0.079, so _____)
• Colquhoun (1971) carried out a randomization test
  – p = 0.0813 (976/12000)
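The slides give only the outcome of Colquhoun's randomization test. As a rough sketch of the idea, the drug labels can be reshuffled at random and the difference in means recomputed each time. The two-sided, unpaired design and the seed below are assumptions made for this example, so the p-value will be close to, but not exactly, 976/12000.

```r
drug_a <- c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0)
drug_b <- c(1.9,  0.8,  1.1,  0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4)
pooled <- c(drug_a, drug_b)
obs_diff <- mean(drug_b) - mean(drug_a)

set.seed(1971)    # arbitrary seed so the sketch is repeatable
n_perm <- 12000   # same number of random allocations as quoted on the slide

perm_diff <- replicate(n_perm, {
  idx <- sample(length(pooled), length(drug_a))  # random relabelling
  mean(pooled[-idx]) - mean(pooled[idx])
})

# Two-sided p-value: share of relabellings at least as extreme as observed
# (the small tolerance guards against floating-point ties)
p_rand <- mean(abs(perm_diff) >= abs(obs_diff) - 1e-12)
print(p_rand)
```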

9. Declare and report decision about model terms (compare p to α).

• p = 0.0813 > α = 0.05, so reject HA
• Report decision:
  – There is no significant difference in extra time slept for the two drugs (F1,18 = 3.46, p = 0.081)
  – Again, Type II error may be a problem
• Run a Power Analysis to guide future study

10. Report and interpret parameters of biological interest.

• Parameters are not of interest here because there appears to be no difference
• BUT, with this sampling effort and variability, the study needs to be repeated to be conclusive.
  – The inclusion of 10 more samples may allow the detection of a significant difference

Chapter 10.3 One way ANOVA, Fixed Effects

GLM, applied to ANOVA. One way ANOVA, Fixed Effects

• Pea section growth data, from Box 9.4 in Sokal and Rohlf (1995).
• Does growth depend on treatment (control versus 4 different sugars with auxin present)?

[Figure: pictorial equation of the form … = … + ε]

1. Construct Model

• Verbal
  – Pea section length in treated groups differs from the control (untreated) group.
• Graphical model
• Formal model
  – Response:
  – Explanatory: (Fixed effect)

Measurement scale?

2. Execute analysis

lm1 <- lm(len~trt, data=peas)

Len = βo + βTrt · Trt + ε

len   trt        fits   res
75    Control    70.1   4.9
67    Control    70.1   -3.1
70    Control    70.1   -0.1
75    Control    70.1   4.9
65    Control    70.1   -5.1
71    Control    70.1   0.9
67    Control    70.1   -3.1
67    Control    70.1   -3.1
76    Control    70.1   5.9
68    Control    70.1   -2.1
57    Glucose    59.3   -2.3
58    Glucose    59.3   -1.3
60    Glucose    59.3   0.7
59    Glucose    59.3   -0.3
62    Glucose    59.3   2.7
60    Glucose    59.3   0.7
60    Glucose    59.3   0.7
57    Glucose    59.3   -2.3
59    Glucose    59.3   -0.3
61    Glucose    59.3   1.7
58    Fructose   58.2   -0.2
61    Fructose   58.2   2.8
56    Fructose   58.2   -2.2

Parameter estimates
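The fit can be sketched in R once a `peas` data frame exists. The slide lists only the first 23 rows; the remaining values below are filled in from Box 9.4 of Sokal and Rohlf (1995) as an assumption and should be checked against the book. They do reproduce the group means and sums of squares quoted on these slides. The level label `G+F` is our shorthand for the mixed glucose + fructose treatment.

```r
# ASSUMPTION: rows beyond the 23 shown on the slide are taken from
# Sokal and Rohlf (1995), Box 9.4; verify before relying on them.
len <- c(75, 67, 70, 75, 65, 71, 67, 67, 76, 68,   # Control
         57, 58, 60, 59, 62, 60, 60, 57, 59, 61,   # Glucose
         58, 61, 56, 58, 57, 56, 61, 60, 57, 58,   # Fructose
         58, 59, 58, 61, 57, 56, 58, 57, 57, 59,   # Glucose + Fructose
         62, 66, 65, 63, 64, 62, 65, 65, 62, 67)   # Sucrose
lev <- c("Control", "Glucose", "Fructose", "G+F", "Sucrose")
peas <- data.frame(len = len,
                   trt = factor(rep(lev, each = 10), levels = lev))

lm1 <- lm(len ~ trt, data = peas)

# Intercept = Control mean; each trt coefficient = group mean minus Control
print(round(coef(lm1), 2))

group_means <- tapply(peas$len, peas$trt, mean)
print(group_means)
```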

3. Evaluate Model

□ Straight line model ok?
□ Errors homogeneous?
□ Errors normal?
□ Errors independent?

[Slide annotation: NA]

4. State population and whether sample is representative.

• Population is all possible measurements, given the method of applying treatments and the protocol for taking measurements.
• It is taken to be representative (i.e. not biased)

5. Decide on mode of inference. Is hypothesis testing appropriate?

• Yes. We want to know if there are any differences between treatments
• ANOVA tells us if there is ANY variance among group means (i.e. any differences among groups)

6. State HA / Ho pair

Research Hypothesis (HA)
Treatment effects differ
μC ≠ μG ≠ μF ≠ μG+F ≠ μS
OR
var(βTrt · Trt) > 0

Null Hypothesis (Ho)
Treatment effects do not differ
μC = μG = μF = μG+F = μS
OR
var(βTrt · Trt) = 0

6. HA / Ho pairs for planned comparisons

• 5·4/2 = 10 possible pairwise comparisons
• We usually have some expectations about the direction of the contrasts among groups
  – Based on these, we can undertake planned or a priori comparisons
  – Examples:
    • HA: βC ≠ (1/4)(βG + βF + βG+F + βS) [Control vs. Treatment]
    • HA: βG+F ≠ (1/3)(βG + βF + βS) [Mixed vs. Pure]
    • HA: βS ≠ (1/2)(βG + βF) [Disaccharide vs. Monosaccharides]

6. HA / Ho pair, test statistic, distribution, tolerance of Type I error

• State test statistic: F-ratio
• Distribution of test statistic: F-distribution
• Tolerance for Type I error: 5%
• BUT we need to adjust α for planned comparisons
  – 5% = 1 in 20; hence, we would expect to reject one true Ho if we did 20 comparisons
  – Adjust α level for three comparisons (use Dunn-Sidak method)
    • per-comparison α′ = 1 – (1 – αexpwise)^(1/k) = 1 – (1 – 0.05)^(1/3) = 0.017
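The Dunn-Sidak adjustment is a one-line calculation. The sketch below also shows the experimentwise error rate that three unadjusted tests at 5% would carry, which is the quantity the adjustment is designed to hold at 5%. Variable names are illustrative.

```r
alpha <- 0.05   # tolerance for Type I error across the whole set
k     <- 3      # number of planned comparisons

# Dunn-Sidak per-comparison level
alpha_per <- 1 - (1 - alpha)^(1 / k)

# Error rate across k comparisons if each were tested at alpha unadjusted
alpha_family_unadjusted <- 1 - (1 - alpha)^k

print(round(c(per_comparison    = alpha_per,
              unadjusted_family = alpha_family_unadjusted), 4))
```

Testing each comparison at the per-comparison level brings the experimentwise rate back to 5%.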

7. ANOVA

GLM:     Len – βo     =  βTrt Trt    +  ε
Source:  Total           Trt            Error
df:      n – 1 = 49      5 – 1 = 4      49 – 4 = 45
SS:      1322.8          1077.3         245.5

n = 50
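The partition can be cross-checked with `anova()`. As before, the full data set is an assumption: only 23 rows appear on the slides and the rest are taken from Box 9.4 of Sokal and Rohlf (1995).

```r
# ASSUMPTION: values not shown on the slides come from Sokal and Rohlf
# (1995), Box 9.4; verify before relying on them.
len <- c(75, 67, 70, 75, 65, 71, 67, 67, 76, 68,   # Control
         57, 58, 60, 59, 62, 60, 60, 57, 59, 61,   # Glucose
         58, 61, 56, 58, 57, 56, 61, 60, 57, 58,   # Fructose
         58, 59, 58, 61, 57, 56, 58, 57, 57, 59,   # Glucose + Fructose
         62, 66, 65, 63, 64, 62, 65, 65, 62, 67)   # Sucrose
lev <- c("Control", "Glucose", "Fructose", "G+F", "Sucrose")
peas <- data.frame(len = len,
                   trt = factor(rep(lev, each = 10), levels = lev))

tab <- anova(lm(len ~ trt, data = peas))
print(tab)

ss_trt   <- tab["trt", "Sum Sq"]
ss_err   <- tab["Residuals", "Sum Sq"]
ss_total <- ss_trt + ss_err
F_ratio  <- tab["trt", "F value"]
```

The treatment SS is the large component here: (1077.3 / 4) / (245.5 / 45) gives the F-ratio of 49.37 reported in step 9.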

8. Recompute p-value if necessary.

• Assumptions met, skip

9. Declare decision about model terms.

• p < 0.001
• p < 0.05 so accept HA that μC ≠ μG ≠ μF ≠ μG+F ≠ μS
• Pea section length differs significantly among the 5 groups (control and 4 treatments).
  – F4,45 = 49.37, p < 0.0001

10. Report and interpret parameters of biological interest.

• Where are the differences, among the 5 groups?
  – Two approaches: A priori, A posteriori
• First planned comparison: Growth in treated media differs from that in untreated

• Run t-test on Control vs. Treatment

lm2 <- lm(len~comp1, data=peas)
anova(lm2)

len   trt        comp1
75    Control    Control
67    Control    Control
70    Control    Control
75    Control    Control
65    Control    Control
71    Control    Control
67    Control    Control
67    Control    Control
76    Control    Control
68    Control    Control
57    Glucose    Treatment
58    Glucose    Treatment
60    Glucose    Treatment
59    Glucose    Treatment
62    Glucose    Treatment
60    Glucose    Treatment
60    Glucose    Treatment
57    Glucose    Treatment
59    Glucose    Treatment
61    Glucose    Treatment
58    Fructose   Treatment

Analysis of Variance Table

Response: len
          Df Sum Sq Mean Sq F value    Pr(>F)
comp1      1 832.32  832.32   81.45 6.516e-12 ***
Residuals 48 490.50   10.22

• First work out LCL ≤ mean ≤ UCL
  – Control: 67.6 ≤ 70.1 ≤ 72.6 units (n=10)
  – Treatment: 58.9 ≤ 59.9 ≤ 60.9 units (n=40)
• Then work out degree of difference
  – (μCon – μTrt) / μCon
  – (70.1 – 59.9) / 70.1 = 15%
• Sugar suppressed growth by 15%
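The first planned comparison and the degree of difference can be sketched together. The slides use a `comp1` column without showing how it was made; the `ifelse()` construction below is an assumption, as are the data rows not shown on the slides (taken from Box 9.4 of Sokal and Rohlf 1995).

```r
# ASSUMPTION: values not shown on the slides come from Sokal and Rohlf
# (1995), Box 9.4; verify before relying on them.
len <- c(75, 67, 70, 75, 65, 71, 67, 67, 76, 68,   # Control
         57, 58, 60, 59, 62, 60, 60, 57, 59, 61,   # Glucose
         58, 61, 56, 58, 57, 56, 61, 60, 57, 58,   # Fructose
         58, 59, 58, 61, 57, 56, 58, 57, 57, 59,   # Glucose + Fructose
         62, 66, 65, 63, 64, 62, 65, 65, 62, 67)   # Sucrose
lev <- c("Control", "Glucose", "Fructose", "G+F", "Sucrose")
peas <- data.frame(len = len,
                   trt = factor(rep(lev, each = 10), levels = lev))

# Collapse the four sugar treatments into a single "Treatment" level
peas$comp1 <- factor(ifelse(peas$trt == "Control", "Control", "Treatment"))

lm2 <- lm(len ~ comp1, data = peas)
tab2 <- anova(lm2)
print(tab2)

# Degree of difference: (mean Control - mean Treatment) / mean Control
m <- tapply(peas$len, peas$comp1, mean)
pct_reduction <- 100 * (m[["Control"]] - m[["Treatment"]]) / m[["Control"]]
print(round(pct_reduction, 1))
```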

• Conclusions from the 3 planned comparisons:
  – A 2% sugar solution reduces growth
    • F1,45 = 152.564, p < 0.0001 < α = 0.017
  – Mixed glucose + fructose reduces growth relative to pure sugars
    • F1,45 = 8.82, p = 0.00476 < α = 0.017
  – The monosaccharides (fructose, glucose) suppress growth more than the disaccharide (sucrose)
    • F1,45 = 34.98, p = 0.00000417 < α = 0.017
• Conclusions reaffirmed using CL
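The three F-ratios can be reproduced from the group means, the group sizes and the error mean square of the full model. This is a sketch: the helper `contrast_F` is hypothetical, the weights follow the HA statements in step 6, and the data rows not shown on the slides are assumed from Box 9.4 of Sokal and Rohlf (1995).

```r
# ASSUMPTION: values not shown on the slides come from Sokal and Rohlf
# (1995), Box 9.4; verify before relying on them.
len <- c(75, 67, 70, 75, 65, 71, 67, 67, 76, 68,   # Control
         57, 58, 60, 59, 62, 60, 60, 57, 59, 61,   # Glucose
         58, 61, 56, 58, 57, 56, 61, 60, 57, 58,   # Fructose
         58, 59, 58, 61, 57, 56, 58, 57, 57, 59,   # Glucose + Fructose
         62, 66, 65, 63, 64, 62, 65, 65, 62, 67)   # Sucrose
lev <- c("Control", "Glucose", "Fructose", "G+F", "Sucrose")
peas <- data.frame(len = len,
                   trt = factor(rep(lev, each = 10), levels = lev))

group_means <- tapply(peas$len, peas$trt, mean)
n_per  <- tapply(peas$len, peas$trt, length)
ms_err <- anova(lm(len ~ trt, data = peas))["Residuals", "Mean Sq"]

# F-ratio (1 and 45 df) for a contrast whose weights w sum to zero
contrast_F <- function(w) {
  est <- sum(w * group_means)          # estimated contrast
  ss  <- est^2 / sum(w^2 / n_per)      # contrast sum of squares
  ss / ms_err
}

# Weights in the order Control, Glucose, Fructose, G+F, Sucrose
F_ctrl_vs_trt   <- contrast_F(c(1, -1/4, -1/4, -1/4, -1/4))
F_mixed_vs_pure <- contrast_F(c(0, -1/3, -1/3, 1, -1/3))
F_suc_vs_mono   <- contrast_F(c(0, -1/2, -1/2, 0, 1))

print(round(c(F_ctrl_vs_trt, F_mixed_vs_pure, F_suc_vs_mono), 2))
```

Each contrast is tested against the error mean square of the five-group model (45 df), which is why the first F-ratio is larger than the 81.45 obtained from the two-level `comp1` model.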

Chapter 10.4 One way ANOVA, Random Effects

GLM, applied to ANOVA. One way ANOVA, Random Effects

• Example. Box 9.1 of Sokal and Rohlf 1995, p. 210.
• Does tick size, as measured by scutum width, differ among hosts (rabbits)?
• Random effects example – contrast with fixed effects

1. Construct Model

• Verbal
  – Scutum width depends on host identity
• Graphical model NEXT
• Formal model
  – Response:
  – Explanatory: (Random effect)

Wscut = βo + βH · H + ε

βo = 359.7

βH = [+12.55, -5.33, -4.4, +1.6]

[Figures: graphical model shown three times, highlighting SSTotal, SSH, and SSRes]

2. Execute analysis.

3. Evaluate model.

4. State the population and whether the sample is representative.

– Fixed vs. Random
  • Fixed: All possible measurements of scutum widths from ticks on these four rabbits only
    – A fixed factor has levels that are the only ones of interest (e.g. different sugar treatments).
  • Random: All possible measurements of scutum widths from ticks found on all possible rabbits
    – A random factor has levels that are considered a sample from some larger population of levels (such as rabbits)

More Fixed vs. Random

• Because we want to infer to a larger population level
• Because we’re usually interested in specific contrasts
• Depends on context

5. Decide on mode of inference. Is hypothesis testing appropriate?

6. State HA / Ho pair, test statistic, distribution, tolerance for Type I error.

– Q: Is additional variation in size due to their host?
– HA: Var(βH·H) > 0
– Ho: Var(βH·H) = 0

7. ANOVA - Compute and partition the df in the response variable according to the model

[ANOVA table: p = 0.004]

Recommendation: Cross-check your workings with the computer generated ANOVA table. This ensures the computer did what you wanted it to do!

8. Recompute p-value by randomization if necessary.

9. Declare and report statistical decision, with evidence

– The variance among hosts exceeds variance within hosts (F3,33 = 5.26, p = 0.004)

10. Report and interpret parameters of biological interest.

– In this example the interest was in whether there was variance among the hosts.
– There was no stated interest in which hosts differed, or by how much.
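In the spirit of the cross-check recommended in step 7, the reported p-value can be recovered from the F-ratio and its degrees of freedom alone. The tick measurements themselves are not reproduced on the slides, so this sketch uses only the quoted summary values.

```r
F_ratio <- 5.26   # F-ratio reported in step 9
df_host <- 3      # among hosts
df_res  <- 33     # within hosts

# Upper-tail probability of the F-distribution
p_value <- pf(F_ratio, df_host, df_res, lower.tail = FALSE)
print(round(p_value, 3))
```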