
EMBnet Course – Introduction to Statistics for Biologists, Jan 2009

Linear Models II

http://bcf.isb-sib.ch/teaching/introStat/

Design of Experiments, Analysis of Variance and Multiple Regression

The research process

• Scientific question of interest
• Decision on what data to collect (and how)
• Collection and analysis of data
• Conclusions, generalization
• Communication and dissemination of results

Generic Question: Does a ‘treatment’ have an ‘effect’?

Examples:

• Does wine prevent cancer?
• Does smoking cause lung cancer?
• Does milk reduce osteoporosis?
• Does physical exercise slow arteriosclerosis?
• Does statin treatment lower blood lipids?

Experimental Design – why do we care?

• Poor design costs:
  – time, money, ethical considerations
• Good design ensures that relevant data are collected and can be analyzed to test the scientific hypothesis/question of interest
  – Decide in advance how data will be analyzed
  – ‘Designing the experiment’ = ‘Planning the analysis’
• The design is about the science (biology)

Planning an Experiment

• What measurements to make (response)
• What conditions to study (treatments)
• What experimental material to use (units)

A “good” experiment

• tests what you want to test / estimates the effects you are interested in
• controls for everything else (exclusion, blocking, adjustment) to avoid bias and confounding

Example: Cancer Diagnosis

• Blood samples were taken from 25 cancer patients and a control group of 25 healthy people.
• The healthy people were a consecutive series that came to hospital as blood donors.
• The laboratory analyzed the “positive” samples in March and the “negative” samples in April.
• What can go wrong in this study?

Example: Agricultural experiment

• Response = crop yield
• Treatments: two different varieties of potatoes are compared
• Units: two pieces of land can be used

[Figure: Field 1 and Field 2]

Example: two blocks

Is this a good design?

[Figure: Block 1 and Block 2; legend: Type A, Type B]

Blocking and Replication

• Replication is needed to estimate the scale of random effects, measurement errors
• Fields are subdivided into smaller areas; the choice of potato variety to be planted is randomized inside the two blocks

[Figure: Block 1 and Block 2; 5 replicates for each treatment in the first block and 8 in the second.]

Addressing the question

• A basic means to address this type of question involves comparing two groups of study subjects
  – Control group: provides a baseline for comparison
  – Treatment group: group receiving the ‘treatment’

Types of variability

• Planned systematic (difference between the conditions, wanted)
• Chance variation (can handle this with statistical models)
• Unplanned systematic differences (NOT wanted)
  – Can bias results
  – Can only be corrected for if they can be included in the model (adjusting)
  – e.g. time of measurements

Confounding factors

• Ideally, both the treatment and control groups are exactly alike in all respects (except for group membership)
• A confounding factor (or confounder) is associated with both the group membership and the response
• Example: strong association of gender and lung cancer, confounded by smoking
• Unbalanced factors that are not associated with the response are not confounding

Replication, Randomization, Blocking

• Replication – to reduce random variation of the test statistic; also increases generalizability
• Randomization – to remove bias
• Blocking – to reduce unwanted variation
• The idea is that units within a block are similar to each other, but differ between blocks
• ‘Block what you can, randomize what you cannot’

Experimental vs. Observational studies

• Controlled experiment: subjects assigned to groups by the investigator
  – randomization: protects against bias in assignment to groups
  – blind, double-blind: protects against bias in outcome assessment/measurement
  – placebo: fake ‘treatment’
• Observational study: subjects ‘assign’ themselves to groups
  – confounder: associated with both group membership and the outcome of interest

Observational studies

• Advantages
  – often easier to carry out
  – don’t ‘interfere’ with the system, what you see is ‘natural’ rather than ‘artificial’
  – variation is biologically relevant, as it has been unaltered
  – sometimes manipulation is not possible
• Drawbacks
  – confounders

Hibernation example

• General question: How do changes in an animal’s environment cause the animal to start hibernating?
• What changes should be studied?
  – temperature
  – photoperiod (day length: long or short)
• What measurement(s) to take?
  – nerve activity enzyme (Na+K+ATP-ase)
• What animal to study?
  – golden hamster, 2 organs (brain, heart)

Specific question

• General question: How do changes in an animal’s environment cause the animal to start hibernating?
• => Specific question: What is the effect of changing day length on the concentration of the sodium pump enzyme in two golden hamster organs?

Sources of variability

• Variability due to conditions of interest (wanted)
  – Day length (long vs. short)
  – Organ (heart vs. brain)
• Variability in the response (NOT wanted): measurement error
  – Preparation of enzyme suspension
  – Instrument calibration
• Variability in experimental units (NOT wanted)
  – Biological differences among hamsters
  – Environmental differences

Basic designs: Completely randomized

• Focus on 1 organ (heart, say)
• Random assignment: use chance to assign hamsters to long and short days
• ‘Random’ is not the same as ‘haphazard’
• For balance, assign the same number to short and long
• Example (8 hamsters):

  Long: 4, 1, 7, 2
  Short: 3, 8, 5, 6
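A minimal sketch of how such a random assignment could be drawn in R. The hamster labels 1 to 8 follow the example above; the seed is an arbitrary assumption, used only so that the draw can be reproduced.

```r
# Completely randomized design: 8 hamsters, 4 per day length.
set.seed(2009)                             # arbitrary seed (assumption)
hamsters <- 1:8
long  <- sort(sample(hamsters, size = 4))  # 4 hamsters drawn at random for long days
short <- setdiff(hamsters, long)           # the remaining 4 get short days
list(Long = long, Short = short)
```

Using sample() rather than picking animals by hand is what makes the assignment ‘random’ and not ‘haphazard’.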

Basic designs: Randomized block

• Suppose that the hamsters came from 4 different litters, with 2 hamsters per litter
• Expect hamsters from the same litter to be more similar than hamsters from different litters
• Can take each pair of hamsters and randomly assign short or long to one member of each pair
• Example (coin flip, say):

  S, L // L, S // S, L // S, L
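The coin flip per litter could be sketched in R as follows. The row and column labels (litter1, hamster1, ...) are hypothetical names chosen for illustration, and the seed is arbitrary.

```r
# Randomized block design: within each litter (block), one hamster gets
# short days (S) and the other long days (L), in random order.
set.seed(2009)                              # arbitrary seed (assumption)
litters <- 4
assignment <- t(replicate(litters, sample(c("S", "L"))))
dimnames(assignment) <- list(paste0("litter", 1:litters),
                             c("hamster1", "hamster2"))
assignment
```

Each row contains exactly one S and one L, so day length is balanced within every litter.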

Basic designs: Factorial crossing

• Compare 2 (or more) sets of conditions in the same experiment: Long vs. Short and Heart vs. Brain
• In this example, there are 4 combinations of conditions:
  – Long/Heart, Long/Brain, Short/Heart, Short/Brain
• Example (2 coin flips, say):

  L/H: 7, 2    L/B: 4, 1
  S/H: 3, 5    S/B: 8, 6

Basic designs: Split plot / repeated measures

• First, randomly assign Long days to 4 hamsters and Short days to the other 4
• Then, use each hamster twice: once to get the Heart concentration, and once to get the Brain concentration
• This design has units of different sizes for each factor
  – for day length, the unit is a hamster
  – for organ, the unit is a part of a hamster

Summary

• Optimize precision of the estimates among main comparisons of interest
• Must satisfy scientific and physical constraints of the experiment
• You can save a lot of time, money and heartache by consulting with an experienced analyst on design issues before any steps of the experiment have been carried out

X categorical, Y continuous

• We can visually inspect how the distribution of Y depends on X with a series of boxplots or stripcharts

ANOVA

• Stands for ANalysis Of VAriance
• But it is a test of differences in means: it generalizes the t-test to more than two groups, defined for example by one categorical variable

The Observations y_ij

Treatment group   i = 1     i = 2     …   i = k
means             m_1       m_2       …   m_k
                  y_1,1     y_2,1     …   y_k,1
                  y_1,2     y_2,2     …   y_k,2
                  …         …         …   …
                  y_1,n1    y_2,n2    …   y_k,nk

Mathematical Principle

• The differences can be partitioned into between-group and within-group sums of squares (SS)
• Total variation = total SS = SS between groups + SS within groups
• TSS = MSS + RSS: total variation = variation explained by the model + residual variation inside the groups (measurement error)
• MS (mean square) = SS / degrees of freedom, computed for the error (MSE) and for each factor
• F test (Fisher): variance ratio, treatment MS / error MS; expected to be about 1 if the treatment does not explain more variation than error
• Coefficient of determination: $R^2 = \mathrm{MSS} / \mathrm{TSS}$
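The partition above can be checked by hand. The sketch below uses PlantGrowth, a data set shipped with R that is not part of these slides (an assumption made only to have one response and one grouping factor), and compares the hand-computed F with the one from anova().

```r
# One-way layout: plant weight in 3 groups (built-in data, not from the slides).
y <- PlantGrowth$weight
g <- PlantGrowth$group
m  <- mean(y)            # overall mean
mi <- ave(y, g)          # group mean, repeated for each observation

TSS <- sum((y - m)^2)    # total sum of squares
MSS <- sum((mi - m)^2)   # between groups (model)
RSS <- sum((y - mi)^2)   # within groups (residual)

k <- nlevels(g)
n <- length(y)
F_stat <- (MSS / (k - 1)) / (RSS / (n - k))            # treatment MS / error MS
p_val  <- pf(F_stat, k - 1, n - k, lower.tail = FALSE)
R2     <- MSS / TSS

# Reference: the same quantities from R's own ANOVA table
fit <- anova(lm(weight ~ group, data = PlantGrowth))
c(by_hand = F_stat, anova = fit[["F value"]][1])
```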

The ANOVA table

• The analysis is usually laid out in a table
• For a one-way layout (where the response is assumed to vary according to grouping on one factor):

Source   df     SS                                            MS              F                p-val
Model    k-1    MSS = $\sum_i \sum_j (m_i - m)^2$             MSS/(k-1)       MS(Model)/MSE    *
Error    n-k    RSS = $\sum_i \sum_j (y_{ij} - m_i)^2$        MSE = RSS/(n-k)
Total    n-1    TSS = $\sum_i \sum_j (y_{ij} - m)^2$

m = overall mean, m_i = mean within group i; the sums run over all observations j in all groups i

Assumptions

• Have random samples from each separate population
• The error variance is the same in each treatment group
• The samples are sufficiently large that the CLT holds for each sample mean (or the individual population distributions are normal)

ANOVA TEST: What does it mean when we reject H?

• F test (Fisher): the variance ratio assesses the null hypothesis that all population means are equal (joint hypothesis), $H: \mu_1 = \mu_2 = \dots = \mu_k$
• When we reject the null, that does NOT mean that the means are all different!
• It means that at least one is different
• To find out which is different, can do ‘post hoc’ testing (pairwise t-tests, for example)
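A short sketch of the overall F test followed by post hoc pairwise t-tests. The PlantGrowth data set ships with R and is not part of these slides. The Holm correction for multiple testing is one possible choice and an assumption here, since the slides do not specify one.

```r
# Overall test: are all three group means equal?
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)

# Post hoc: which pairs of groups differ? P-values adjusted with Holm's method.
ph <- pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                      p.adjust.method = "holm")
ph$p.value   # lower-triangular matrix of adjusted pairwise p-values
```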

Interaction

• Interaction is very common (and very important) in science
• Interaction is a difference of differences
• Interaction is present if the effect of one factor is different for different levels of the other factor
• Main effects can be difficult to interpret in the presence of interaction, because the effect of one factor depends on the level of the other factor

Factorial crossing

• Compare 2 (or more) sets of conditions in the same experiment
• Designs with factorial treatment structure allow you to measure interaction between two (or more) sets of conditions that influence the response – you will look at this in more detail during the exercises today
• Factorial designs may be either observational or experimental

Interaction in Models

• A linear model with two main effects:

  $E(Y) = \beta_0 + \beta_1 x + \beta_2 z$

• If x and z represent groups coded 0, 1, then E(Y) is:

         X=0                    X=1
  Z=0    $\beta_0$              $\beta_0 + \beta_1$
  Z=1    $\beta_0 + \beta_2$    $\beta_0 + \beta_1 + \beta_2$

• $\beta_1$ estimates the difference in means for X=1 compared to X=0, regardless of the Z status

Interaction in Models

• A linear model with two main effects and interaction:

  $E(Y) = \beta_0 + \beta_1 x + \beta_2 z + \beta_3 (x \cdot z)$

• $\beta_1$ estimates the difference in means for X=1 compared to X=0 when Z=0
• $\beta_1 + \beta_3$ estimates the difference in means for X=1 compared to X=0 when Z=1

         X=0                    X=1
  Z=0    $\beta_0$              $\beta_0 + \beta_1$
  Z=1    $\beta_0 + \beta_2$    $\beta_0 + \beta_1 + \beta_2 + \beta_3$
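To make the table concrete, the sketch below fills in the four cell means for made-up coefficient values (the numbers 10, 2, 3 and -1 are purely illustrative assumptions) and shows that the effect of X differs between Z=0 and Z=1 by exactly the interaction coefficient.

```r
# Hypothetical coefficients: intercept, effect of x, effect of z, interaction
beta <- c(b0 = 10, b1 = 2, b2 = 3, b3 = -1)

cells <- expand.grid(x = 0:1, z = 0:1)
cells$mean <- with(cells,
  beta[["b0"]] + beta[["b1"]] * x + beta[["b2"]] * z + beta[["b3"]] * x * z)
cells

# Effect of X (X=1 minus X=0) at each level of Z
effect_z0 <- cells$mean[cells$x == 1 & cells$z == 0] -
             cells$mean[cells$x == 0 & cells$z == 0]   # equals b1
effect_z1 <- cells$mean[cells$x == 1 & cells$z == 1] -
             cells$mean[cells$x == 0 & cells$z == 1]   # equals b1 + b3
effect_z1 - effect_z0   # the 'difference of differences', equals b3
```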

Interaction plots

[Figure: interaction plots for four cases: $\beta_3 = 0$ (no interaction), $\beta_3 < 0$, $\beta_3 > 0$, $\beta_3 \gg 0$]

More on model formulas

• A model formula with main effects only:

  yvar ~ xvar1 + xvar2 + xvar3

• We can also include interaction terms in a model formula. Examples:
  – yvar ~ xvar1 + xvar2 + xvar3 + xvar1:xvar2
  – yvar ~ (xvar1 + xvar2 + xvar3)^2
  – yvar ~ (xvar1 * xvar2 * xvar3)

More on model formulas

• The generic form is response ~ predictors
• The predictors can be numeric or factor
• Other symbols to create formulas with combinations of variables (e.g. interactions):

  +     to add more variables
  -     to leave out variables
  :     to introduce interactions between two terms
  *     to include both interactions and the terms (a*b is the same as a+b+a:b)
  ^n    adds all terms including interactions up to order n
  I()   treats what’s in () as a mathematical expression
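One way to see what a formula expands to is to ask R for its term labels. The helper term_labels() below is a hypothetical convenience wrapper, and the variable names a, b, c, x, y are placeholders; no data are needed because the formula is only parsed, not fitted.

```r
# Hypothetical helper: list the terms a model formula expands to.
term_labels <- function(f) attr(terms(f), "term.labels")

term_labels(y ~ a * b)           # "a" "b" "a:b"
term_labels(y ~ (a + b + c)^2)   # main effects plus all two-way interactions
term_labels(y ~ a * b - a:b)     # '-' removes the interaction again
term_labels(y ~ x + I(x^2))      # I() keeps x^2 as arithmetic, not formula syntax
```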

Interpreting R output

> chicks.aov <- aov(Weight ~ House + Protein*LP*LS)
> summary(chicks.aov)
              Df  Sum Sq Mean Sq F value    Pr(>F)
House          1  708297  708297 15.8153 0.0021705 **
Protein        1  373751  373751  8.3454 0.0147366 *
LP             2  636283  318141  7.1037 0.0104535 *
LS             1 1421553 1421553 31.7414 0.0001524 ***
Protein:LP     2  858158  429079  9.5808 0.0038964 **
Protein:LS     1    7176    7176  0.1602 0.6966078
LP:LS          2  308888  154444  3.4485 0.0687641 .
Protein:LP:LS  2   50128   25064  0.5596 0.5868633
Residuals     11  492640   44785
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Multiple linear regression

• You can also use more than one ‘X’ variable to predict Y:

  predicted $y = a + b_1 x_1 + b_2 x_2$

• Example: predict ventricular shortening velocity (Y) from blood glucose (X1) and age (X2)
• The ‘slopes’ b1 and b2 are called coefficients
• The prediction function for Y is still linear in the parameters (a, b1, b2)
• As in simple regression, minimize the total squared deviation from the prediction surface (instead of a line it is a plane or a higher-dimensional hyperplane)

Example: cystic fibrosis

> library(ISwR)
> data(cystfibr)
> round(cor(cystfibr),2)
         age   sex height weight   bmp  fev1    rv   frc   tlc pemax
age     1.00 -0.17   0.93   0.91  0.38  0.29 -0.55 -0.64 -0.47  0.61
sex    -0.17  1.00  -0.17  -0.19 -0.14 -0.53  0.27  0.18  0.02 -0.29
height  0.93 -0.17   1.00   0.92  0.44  0.32 -0.57 -0.62 -0.46  0.60
weight  0.91 -0.19   0.92   1.00  0.67  0.45 -0.62 -0.62 -0.42  0.64
bmp     0.38 -0.14   0.44   0.67  1.00  0.55 -0.58 -0.43 -0.36  0.23
fev1    0.29 -0.53   0.32   0.45  0.55  1.00 -0.67 -0.67 -0.44  0.45
rv     -0.55  0.27  -0.57  -0.62 -0.58 -0.67  1.00  0.91  0.59 -0.32
frc    -0.64  0.18  -0.62  -0.62 -0.43 -0.67  0.91  1.00  0.70 -0.42
tlc    -0.47  0.02  -0.46  -0.42 -0.36 -0.44  0.59  0.70  1.00 -0.18
pemax   0.61 -0.29   0.60   0.64  0.23  0.45 -0.32 -0.42 -0.18  1.00

Pairwise plots of cystic fibrosis vars

> pairs(cystfibr)

Many variables

• Pairwise correlations, similarity, clustering, heatmap

R: multiple regression using lm

> attach(cystfibr)
> summary(lm(pemax~age+sex+height+weight))

Call:
lm(formula = pemax ~ age + sex + height + weight)

Residuals:
    Min      1Q  Median      3Q     Max
-47.791 -18.683   2.747  13.413  43.190

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  70.66072   82.50906   0.856    0.402
age           1.57395    3.13953   0.501    0.622
sex         -11.54392   11.23902  -1.027    0.317
height       -0.06308    0.80183  -0.079    0.938
weight        0.79124    0.86147   0.918    0.369

Residual standard error: 27.38 on 20 degrees of freedom
Multiple R-Squared: 0.4413,     Adjusted R-squared: 0.3296
F-statistic: 3.949 on 4 and 20 DF,  p-value: 0.01604

Example Confounding

• Example: Y = weight and the two groups are the genders
• $\beta_1$ is the weight difference between male and female
• The two groups might differ in other characteristics, for example age or race
• These cause a difference in weight as well
• The coefficient is affected (biased)
• Similarly for continuous predictors

Confounding Study Case 1

• Association of y with x1 in the univariate model: higher mean when x1 is higher
• After adjustment for x2: x1 has no additional effect, coeff = 0
• Positive confounding

[Figure: y plotted against x2]

EMBnet Course – Introduction to Statistics for Biologists

Page 24: Linear Models II - Bioinformatics · –!Can only be corrected for if it can be included in the model (adjusting) –!e.g. time of measurements EMBnet Course – Introduction to Statistics

Confounding Study Case 2

X2 has an effect.

Now x1 improves the fitting, coeff <0 after adjustment with X2.

Negative confounding (masking)

x2

y

No association of y with x1

in the univariate model,

Equal mean when x1 higher

Confounding

• Positive: the effect is overestimated in the univariate model compared to the refined model
  – when two predictors are positively correlated and have effects of the same sign, OR are inversely correlated and have effects of the opposite sign
• Negative: the effect is underestimated (attenuated, masked) in the univariate model
  – when two predictors are positively correlated but have effects of the opposite sign, OR are inversely correlated but have effects of the same sign
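A small simulation can illustrate positive confounding. Everything below is made up for illustration (sample size, effect sizes, seed): x2 truly affects y, x1 is correlated with x2 but has no effect of its own.

```r
# Simulated data (assumption): y depends on x2 only; x1 is correlated with x2.
set.seed(1)                 # arbitrary seed
n  <- 1000
x2 <- rnorm(n)
x1 <- x2 + rnorm(n)         # x1 and x2 positively correlated
y  <- 2 * x2 + rnorm(n)     # true coefficient of x1 is 0

b_uni <- coef(lm(y ~ x1))[["x1"]]        # univariate model: x1 looks 'effective'
b_adj <- coef(lm(y ~ x1 + x2))[["x1"]]   # adjusted for x2: coefficient near 0
c(univariate = b_uni, adjusted = b_adj)
```

With these settings the univariate slope is expected to be close to 1 (covariance 2 divided by variance 2), while the adjusted coefficient should be close to 0, which corresponds to Study Case 1.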

EMBnet Course – Introduction to Statistics for Biologists

Page 25: Linear Models II - Bioinformatics · –!Can only be corrected for if it can be included in the model (adjusting) –!e.g. time of measurements EMBnet Course – Introduction to Statistics

Adjustment

!! Aim is to estimate effects unbiased with a refined model

!! Ideally, all potential confounders y, z, …are known, measured and can be included in a model

!! Aim is to estimate unbiased effects with a refined model

!! E(Y) = !0 + !1x + !2y + !3z

!! under consideration of the effects of each predictor (adjustment)

Confounding

• In prospective randomized studies the groups should be approximately balanced for all potential confounders
• Stratification can be used in designed studies
• In observational studies confounding is to be expected and we do our best to control it in multipredictor models

What to do?

Modeling Overview

• Want to capture important features of the relationship between a (set of) variable(s) and one or more response(s)
• Many models are of the form

  $g(Y) = f(x) + \text{error}$

• Models differ in the form of g and f and in the distributional assumptions about the error term
• R: lm, glm

Non-linear relations in lm

• X and Y can show a curvilinear relation
• Transformation: $Y = a + b X^3$; with $Z = X^3$ this becomes $Y = a + b Z$
• Multivariate model, for example polynomial: $Y = a + b X + c X^2$; with $Z = X^2$ this becomes $Y = a + b X + c Z$
• Linear Models (R: lm) handle these cases
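A minimal sketch of the polynomial case with lm. The data are made up and noise-free (a = 1, b = 2, c = 0.5 are assumptions chosen only so that the recovered coefficients are easy to recognize).

```r
# Made-up, noise-free data following Y = 1 + 2*X + 0.5*X^2
x <- 1:10
y <- 1 + 2 * x + 0.5 * x^2

# The model is curved in x but still linear in the parameters a, b, c.
# I() is needed so that ^ means 'square', not a formula operator.
fit <- lm(y ~ x + I(x^2))
coef(fit)   # estimates of a, b and c
```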

Linearization examples

• If $Y \approx b\,x^c$, then $\log(Y) \approx \log(b) + c \log(x)$, i.e. $Y' \approx b' + c\,x'$
• If $Y \approx b + \exp(cx)$, then $\log(Y - b) \approx \log(\exp(cx)) = cx$, i.e. $Y' \approx c\,x$

Nonlinear regression:
• Different, iterative algorithm
• Initial trial values

Example: Non-linear relation, nlm

• X and Y can show a curvilinear relation

[Figure: Michaelis-Menten saturation enzyme kinetics]

Linearizing it (Lineweaver-Burk Plot, Double-Reciprocal Plot)

• slope b = Km / Vm
• y-intercept a = 1 / Vm
• 1/v approaches infinity as [S] decreases:
  – undue weight to inaccurate measurements at low concentration
  – insufficient weight to accurate measurements at high concentration
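The double-reciprocal form follows from the Michaelis-Menten equation v = Vm [S] / (Km + [S]), which gives 1/v = (Km/Vm) (1/[S]) + 1/Vm. The sketch below compares the linearized fit with a direct nonlinear least-squares fit. It uses the Puromycin data set that ships with R (not part of these slides, an assumption for illustration) and takes the Lineweaver-Burk estimates as the initial trial values for nls().

```r
# Enzyme kinetics data shipped with R: reaction rate vs. substrate concentration
treated <- subset(Puromycin, state == "treated")

# Lineweaver-Burk: regress 1/v on 1/[S]; intercept = 1/Vm, slope = Km/Vm
lb    <- lm(I(1 / rate) ~ I(1 / conc), data = treated)
Vm_lb <- 1 / coef(lb)[[1]]
Km_lb <- coef(lb)[[2]] * Vm_lb

# Nonlinear regression on the original scale, started from the linearized estimates
fit <- nls(rate ~ Vm * conc / (Km + conc), data = treated,
           start = list(Vm = Vm_lb, Km = Km_lb))

rbind(lineweaver_burk = c(Vm = Vm_lb, Km = Km_lb), nls = coef(fit))
```

The two rows generally differ, because the reciprocal transformation changes the weight given to measurements at low and high concentration.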

Acknowledgement

Contributions to slides, lab and ideas by Darlene Goldstein

Books:
• Peter Dalgaard, “Introductory Statistics with R”, Springer
  Ch 5 Regression and correlation
  Ch 6 Analysis of variance
  …
  Ch 9 Multiple regression
  Ch 10 Linear models
• Eric Vittinghoff et al., “Regression Methods in Biostatistics”, Springer
  Ch 3 Basic Statistical Methods
  Ch 4 Linear Regression
• Roger Mead, “The Design of Experiments”, Cambridge University Press