ANOVA and Regression
Brian Healy, PhD
Objectives
ANOVA
– Multiple comparisons
Introduction to regression
– Relationship to correlation/t-test
Comments from reviews
Please fill them out because I read them
More examples and not just MS
More depth on technical details/statistical theory/equations
– First time ever!!
– I have made slides from more in-depth courses available on-line so that you have access to formulas for t-test, ANOVA, etc.
Talks too fast for non-native speakers
Review
Types of data
p-value
Steps for hypothesis test
– How do we set up a null hypothesis?
Choosing the right test
– Continuous outcome variable/dichotomous explanatory variable: Two sample t-test
Steps for hypothesis testing
1) State null hypothesis
2) State type of data for explanatory and outcome variable
3) Determine appropriate statistical test
4) State summary statistics
5) Calculate p-value (stat package)
6) Decide whether to reject or not reject the null hypothesis
   – NEVER accept null
7) Write conclusion
Example
In the previous class, two groups were compared on a continuous outcome
What if we have more than two groups?
Ex. A recent study compared the intensity of structures on MRI in normal controls (NC), benign MS (BMS) patients and secondary progressive MS (SPMS) patients
Question: Is there any difference among these groups?
Two approaches
Compare each group to each other group using a t-test
– Problem with multiple comparisons
Complete a global comparison to see if there is any difference
– Analysis of variance (ANOVA)
– Good first step even if we eventually complete pairwise comparisons
Types of analysis-independent samples
Outcome         Explanatory     Analysis
Continuous      Dichotomous     t-test, Wilcoxon test
Continuous      Categorical     ANOVA, linear regression
Continuous      Continuous      Correlation, linear regression
Dichotomous     Dichotomous     Chi-square test, logistic regression
Dichotomous     Continuous      Logistic regression
Time to event   Dichotomous     Log-rank test
Global test-ANOVA
As a first step, we can compare across all groups at once
The null hypothesis for ANOVA is that the means in all of the groups are equal
ANOVA compares the within group variance and the between group variance
– If the patients within a group are very alike and the group means are far apart, the groups are likely different
Hypothesis test
1) H0: mean_normal = mean_BMS = mean_SPMS
2) Outcome variable: continuous
   Explanatory variable: categorical
3) Test: ANOVA
4) mean_normal = 0.41; mean_BMS = 0.34; mean_SPMS = 0.30
5) Results: p=0.011
6) Reject null hypothesis
7) Conclusion: At least one of the groups is significantly different than the others
Technical aside
Our F-statistic is the ratio of the between group variance and the within group variance
$$F = \frac{s^2_{between}}{s^2_{within}}$$
$$s^2_{between} = \frac{\sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2}{k - 1}$$
$$s^2_{within} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \dots + (n_k - 1)s_k^2}{n_1 + n_2 + \dots + n_k - k}$$
Here k is the number of groups, n_i, \bar{x}_i and s_i^2 are the size, mean and variance of group i, and \bar{x} is the overall mean
This ratio of variances has a known distribution (F-distribution)
If our calculated F-statistic is high, the between group variance is higher than the within group variance, meaning the differences between the groups are not likely due to chance
Therefore, the probability of the observed result or something more extreme will be low (low p-value)
[Figure: the F-distribution under the null hypothesis. The small shaded region in the tail is the part of the distribution that is equal to or more extreme than the observed value: the p-value!!!]
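The F-statistic from the technical aside can be computed by hand. The sketch below follows the between/within formulas as reconstructed above; the three lists of intensities are made-up numbers for illustration, not the study data, and the function name `f_statistic` is our own.

```python
from statistics import mean, variance

def f_statistic(groups):
    """Ratio of between group variance to within group variance."""
    k = len(groups)                               # number of groups
    n_total = sum(len(g) for g in groups)         # n_1 + ... + n_k
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between group variance: spread of the group means around the overall mean
    s2_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)

    # Within group variance: pooled sample variance of the groups
    s2_within = sum((len(g) - 1) * variance(g) for g in groups) / (n_total - k)

    return s2_between / s2_within

# Hypothetical MRI intensities (NOT the data from the study in the slides)
normal = [0.45, 0.40, 0.38, 0.41]
bms = [0.36, 0.33, 0.35, 0.32]
spms = [0.31, 0.28, 0.30, 0.31]

f_value = f_statistic([normal, bms, spms])
print(f"F = {f_value:.2f} on {3 - 1} and {12 - 3} degrees of freedom")
```

Turning F into a p-value needs the F-distribution, which is what the stat package does for you in step 5.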
Now what
The question often becomes which groups are different
Possible comparisons
– All pairs
– All groups to a specific control
– Pre-specified comparisons
If we do many tests, we should account for multiple comparisons
Type I error
Type I error is when you reject the null hypothesis even though it is true (α = P(reject H0 | H0 is true))
We accept making this error 5% of the time
If we run a large experiment with 100 tests and the null hypothesis was true in each case, how many times would we expect to reject the null?
Multiple comparisons
For this problem, three comparisons
– NC vs. BMS; NC vs. SPMS; BMS vs. SPMS
If we complete each test at the 0.05 level, what is the chance that we make a type I error?
– P(reject at least 1 | H0 is true) = 1 - P(fail to reject all three | H0 is true) = 1 - 0.95^3 = 0.143
Inflated type I error rate
Can correct p-value for each test to maintain experiment type I error
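The arithmetic on this slide is easy to check and to extend to more tests. This is a small sketch that, like the slide's 1 - 0.95^3 calculation, treats the tests as independent; the function name is our own.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one type I error across n_tests independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests

for m in (1, 3, 10, 100):
    print(f"{m:>3} tests: {familywise_error_rate(0.05, m):.3f}")
```

With three tests the rate is already about 14%, and it keeps climbing as tests are added.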
Bonferroni correction
The Bonferroni correction multiplies all p-values by the number of comparisons completed
– In our experiment, there were 3 comparisons, so we multiply by 3
– Any p-value that remains less than 0.05 is significant
The Bonferroni correction is conservative (it is more difficult to obtain a significant result than it should be), but it is an extremely easy way to account for multiple comparisons.
– Can be a very harsh correction with many tests
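A minimal sketch of the Bonferroni correction as described here. The raw p-values are invented for illustration (the slides do not give uncorrected pairwise p-values), and capping the result at 1 is a common convention that we have added, since a probability cannot exceed 1.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical uncorrected p-values for the three pairwise comparisons
comparisons = ["NC vs. BMS", "NC vs. SPMS", "BMS vs. SPMS"]
raw = [0.04, 0.004, 0.30]

adjusted = bonferroni(raw)
for name, p_raw, p_adj in zip(comparisons, raw, adjusted):
    flag = "*" if p_adj < 0.05 else ""
    print(f"{name:<14} raw={p_raw:.3f}  adjusted={p_adj:.3f} {flag}")
```

Note how the first comparison is significant before the correction but not after it.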
Other corrections
All pairwise comparisons
– Tukey's test
All groups to a control
– Dunnett's test
MANY others
False discovery rate
Example
For our three-group comparison, we compare each pair and get the following results from Tukey's test
Groups          Mean diff   p-value   Significant
NC vs. BMS      0.075       0.10
NC vs. SPMS     0.114       0.012     *
BMS vs. SPMS    0.039       0.60
Questions to ask yourself
What is the null hypothesis?
We would like to test the null hypothesis at the 0.05 level
If well defined prior to the experiment, the correction for multiple comparisons, if necessary, will be clear
Hypothesis generating vs. hypothesis testing
Conclusions
If you are doing a multiple group comparison, always specify before the experiment which comparisons are of interest if possible
If the null hypothesis is that all the groups are the same, test global null using ANOVA
Complete appropriate additional comparisons with corrections if necessary
No single right answer for every situation
Types of analysis-independent samples
Outcome         Explanatory     Analysis
Continuous      Dichotomous     t-test, Wilcoxon test
Continuous      Categorical     ANOVA, linear regression
Continuous      Continuous      Correlation, linear regression
Dichotomous     Dichotomous     Chi-square test, logistic regression
Dichotomous     Continuous      Logistic regression
Time to event   Dichotomous     Log-rank test
Correlation
Is there a linear relationship between IL-10 expression and IL-6 expression?
The best graphical display for this data is a scatter plot
Correlation
Definition: the degree to which two continuous variables are linearly related
– Positive correlation- As one variable goes up, the other goes up (positive slope)
– Negative correlation- As one variable goes up, the other goes down (negative slope)
Correlation (ρ) ranges from -1 (perfect negative correlation) to 1 (perfect positive correlation)
A correlation of 0 means that there is no linear relationship between the two variables
[Figure: four scatter plots titled "Positive correlation", "Negative correlation", "No correlation" and "No correlation (quadratic)"]
Hypothesis test
1) H0: correlation between IL-10 expression and IL-6 expression=0
2) Outcome variable: IL-6 expression- continuous
   Explanatory variable: IL-10 expression- continuous
3) Test: correlation
4) Summary statistic: correlation=0.51
5) Results: p=0.011
6) Reject null hypothesis
7) Conclusion: A statistically significant correlation was observed between the two variables
Technical aside-correlation
The formal definition of the correlation is given by:
$$Corr(x, y) = \frac{Cov(x, y)}{\sqrt{Var(x)\,Var(y)}}$$
Note that this is a dimensionless quantity
This equation shows that if the covariance between the two variables is the same as the variance in the two variables, we have perfect correlation because all of the variability in x and y is explained by how the two variables change together
How can we estimate the correlation?
The most common estimator of the correlation is Pearson's correlation coefficient, given by:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
This is an estimate that requires that both x and y are normally distributed. Since we use the mean in the calculation, the estimate is sensitive to outliers.
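Pearson's correlation coefficient can be computed directly from its definition. This is a plain-Python sketch; the function name and the small data sets are our own, chosen to mirror the "positive", "negative" and "quadratic" pictures shown earlier.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's correlation coefficient for paired observations x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Made-up illustrative data
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])               # perfectly increasing line
r_negative = pearson_r([1, 2, 3], [3, 2, 1])                     # perfectly decreasing line
r_quadratic = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])      # y = x^2, no linear trend

print(r_positive, r_negative, r_quadratic)
```

The quadratic case is a reminder that a correlation of 0 rules out a linear relationship, not every relationship.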
Distribution of the test statistic
The standard error of the sample correlation coefficient is given by
$$\hat{se}(r) = \sqrt{\frac{1 - r^2}{n - 2}}$$
The resulting distribution of the test statistic is a t-distribution with n-2 degrees of freedom where n is the number of patients (not the number of measurements)
$$t = \frac{r - 0}{\sqrt{\frac{1 - r^2}{n - 2}}} = r\sqrt{\frac{n - 2}{1 - r^2}}$$
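The test statistic for a correlation is short enough to compute by hand. In the sketch below r = 0.51 is the correlation from the IL-10/IL-6 example, but the sample size n = 24 is a hypothetical value we picked for illustration, since the slides do not state how many patients were in the study.

```python
from math import sqrt

def correlation_t(r, n):
    """t statistic and degrees of freedom for testing H0: correlation = 0."""
    se_r = sqrt((1 - r ** 2) / (n - 2))   # standard error of r
    t = (r - 0) / se_r
    return t, n - 2

# r from the slides; n is an ASSUMED sample size, not a reported one
t_value, df = correlation_t(r=0.51, n=24)
print(f"t = {t_value:.2f} with {df} degrees of freedom")
```

The p-value then comes from comparing this t to a t-distribution with n-2 degrees of freedom in a stat package.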
Regression-Everything in one place
All analyses we have done to this point can be completed using regression!!!
Quick math review
As you remember, the equation of a line is y=mx+b
For every one unit increase in x, there is an m unit increase in y
b is the value of y when x is equal to zero
[Figure: plot of the line y = 1.5x + 4]
Picture
Does there seem to be a linear relationship in the data?
Are the data perfectly linear?
Could we fit a line to these data?
[Figure: scatter plot of the example data]
How do we find the best line?
Linear regression tries to find the best line (curve) to fit the data
Let's look at three candidate lines
Which do you think is the best?
What is a way to determine the best line to use?
What is linear regression?
The method of finding the best line (curve) is least squares, which minimizes the sum of the squared vertical distances from each of the points to the line
The equation of the line is y=1.5x + 4
[Figure: scatter plot of the example data with the fitted line y = 1.5x + 4]
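To make "best line" concrete, the sketch below fits a least squares line and compares its sum of squared residuals with two other candidate lines. The data points and the candidate lines are invented for illustration; they are not the points plotted on the slide.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data scattered around a line
x = [1, 2, 4, 6, 8, 10]
y = [5.0, 7.5, 9.5, 13.5, 15.5, 19.5]

b0, b1 = least_squares_line(x, y)
best_ssr = sum_squared_residuals(x, y, b0, b1)
print(f"fitted line: y = {b1:.2f}x + {b0:.2f}, SSR = {best_ssr:.2f}")

# Two other candidate lines, given as (intercept, slope)
candidates = {"y = 1.0x + 6": (6.0, 1.0), "y = 2.0x + 2": (2.0, 2.0)}
candidate_ssr = {name: sum_squared_residuals(x, y, a, b) for name, (a, b) in candidates.items()}
for name, ssr in candidate_ssr.items():
    print(f"{name}: SSR = {ssr:.2f}")
```

No other straight line can have a smaller sum of squared residuals than the fitted one, which is exactly what "least squares" means.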
Example
For our investigation of the relationship between IL-10 and IL-6, we can set up a regression equation
$$IL6_i = \beta_0 + \beta_1 \cdot IL10_i + \varepsilon_i$$
β_0 is the expression of IL-6 when IL-10=0 (intercept)
β_1 is the change in IL-6 for every 1 unit increase in IL-10 (slope)
ε_i is the residual from the line
The final regression equation is
$$\widehat{IL6} = 26.4 + 0.63 \cdot IL10$$
The coefficients mean
– the estimate of the mean expression of IL-6 for a patient with IL-10 expression=0 (β_0)
– an increase of one unit in IL-10 expression leads to an estimated increase of 0.63 in the mean expression of IL-6 (β_1)
Tough question
In our correlation hypothesis test, we wanted to know if there was an association between the two measures
If there was no relationship between IL-10 and IL-6 in our system, what would happen to our regression equation?
– No effect means that the change in IL-6 is not related to the change in IL-10
– β_1=0
Is β_1 significantly different than zero?
Hypothesis test
1) H0: no relationship between IL-6 expression and IL-10 expression, β_1=0
2) Outcome variable: IL-6- continuous
   Explanatory variable: IL-10- continuous
3) Test: linear regression
4) Summary statistic: β_1 = 0.63
5) Results: p=0.011
6) Reject null hypothesis
7) Conclusion: A significant correlation was observed between the two variables
Wait a second!!
Let's check something
– p-value from correlation analysis = 0.011
– p-value from regression analysis = 0.011
– They are the same!!
Regression leads to same conclusion as correlation analysis
Other similarities as well from models
Technical aside-Estimates of regression coefficients
Once we have solved the least squares equation, we obtain estimates for the β's, which we refer to as \hat{\beta}_0, \hat{\beta}_1
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
To test if this estimate is significantly different than 0, we use the following equation:
$$t = \frac{\hat{\beta}_1}{\hat{se}(\hat{\beta}_1)}$$
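The matching p-values from the correlation and regression analyses are not a coincidence: the two t statistics are algebraically identical. The sketch below checks this numerically. The paired values are invented, and the standard error formula used for the slope, sqrt(SSR/(n-2) / Σ(x_i - x̄)²), is the usual textbook one rather than something stated on the slides.

```python
from math import sqrt

def slope_and_correlation_t(x, y):
    """Return (beta0_hat, beta1_hat, t for the slope, t for the correlation)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least squares estimates
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar

    # Standard error of the slope from the residual variance (textbook formula)
    ssr = sum((yi - (beta0 + beta1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_beta1 = sqrt(ssr / (n - 2) / sxx)
    t_slope = beta1 / se_beta1

    # Test statistic for the correlation coefficient
    r = sxy / sqrt(sxx * syy)
    t_corr = r * sqrt((n - 2) / (1 - r ** 2))
    return beta0, beta1, t_slope, t_corr

# Hypothetical IL-10 and IL-6 expression values (NOT the study data)
il10 = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
il6 = [29.0, 27.5, 31.0, 30.0, 33.5, 32.0]

beta0, beta1, t_slope, t_corr = slope_and_correlation_t(il10, il6)
print(f"slope t = {t_slope:.4f}, correlation t = {t_corr:.4f}")
```

Both statistics are compared to the same t-distribution with n-2 degrees of freedom, so the p-values must agree.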
Assumptions of linear regression
Linearity
– Linear relationship between outcome and predictors
– E(Y|X=x) = β_0 + β_1 x_1 + β_2 x_2^2 is still a linear regression equation because each of the β's is to the first power
Normality of the residuals
– The residuals, ε_i, are normally distributed, N(0, σ^2)
Homoscedasticity of the residuals
– The residuals, ε_i, have the same variance
Independence
– All of the data points are independent
– Correlated data points can be taken into account using multivariate and longitudinal data methods
Linear regression with dichotomous predictor
Linear regression can also be used for dichotomous predictors, like sex
Last class we compared relapsing MS patients to progressive MS patients
To do this, we use an indicator variable, R, which equals 1 for relapsing and 0 for progressive. The resulting regression equation for expression is
$$ex_i = \beta_0 + \beta_1 \cdot R_i + \varepsilon_i$$
Interpretation of model
The meanings of the coefficients in this case are
– β_0 is the estimate of the mean expression when R=0, in the progressive group
– β_0+β_1 is the estimate of the mean expression when R=1, in the relapsing group
– β_1 is the estimate of the mean increase in expression between the two groups
The difference between the two groups is β_1
If there was no difference between the groups, what would β_1 equal?
[Figure: regression output annotated "Mean in wildtype = β_0", "Mean in Progressive group = β_0" and "Difference between groups = β_1"]
Hypothesis test
1) Null hypothesis: mean_progressive = mean_relapsing (β_1=0)
2) Explanatory: group membership- dichotomous
   Outcome: cytokine production- continuous
3) Test: Linear regression
4) β_1=6.87
5) p-value=0.199
6) Fail to reject null hypothesis
7) Conclusion: The difference between the groups is not statistically significant
T-test
As hopefully you remember, you could have tested this same null hypothesis using a two sample t-test
Very similar result to previous class
If we had assumed equal variance for our t-test, we would have gotten the same result!!!
ANOVA can also be carried out using regression with more than one indicator variable
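The link between the indicator-variable regression and the equal-variance t-test can be checked numerically. In this sketch the expression values for the two groups are made up, and the slope standard error uses the usual textbook formula; the point is only that the slope equals the difference in group means and the two t statistics coincide.

```python
from math import sqrt
from statistics import mean, variance

def indicator_regression(relapsing, progressive):
    """Regress the outcome on R (1 = relapsing, 0 = progressive).
    Returns (beta0_hat, beta1_hat, t statistic for beta1)."""
    x = [1] * len(relapsing) + [0] * len(progressive)
    y = list(relapsing) + list(progressive)
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar
    ssr = sum((yi - (beta0 + beta1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_beta1 = sqrt(ssr / (n - 2) / sxx)
    return beta0, beta1, beta1 / se_beta1

def pooled_t(group1, group0):
    """Two sample t statistic assuming equal variances."""
    n1, n0 = len(group1), len(group0)
    sp2 = ((n1 - 1) * variance(group1) + (n0 - 1) * variance(group0)) / (n1 + n0 - 2)
    return (mean(group1) - mean(group0)) / sqrt(sp2 * (1 / n1 + 1 / n0))

# Hypothetical expression values (NOT the data from the class example)
relapsing = [30.1, 42.5, 25.0, 38.2, 33.7]
progressive = [24.9, 31.0, 22.4, 35.6, 20.3, 27.8]

beta0, beta1, t_reg = indicator_regression(relapsing, progressive)
t_test = pooled_t(relapsing, progressive)
print(f"beta0 = {beta0:.2f} (progressive mean), beta1 = {beta1:.2f} (difference in means)")
print(f"regression t = {t_reg:.4f}, equal-variance t-test t = {t_test:.4f}")
```

An unequal-variance t-test uses a different standard error, which is why it gives a similar but not identical answer.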
Multiple regression
A large advantage of regression is the ability to include multiple predictors of an outcome in one analysis
A multiple regression equation looks just like a simple regression equation.
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + e$$
Example
Brain parenchymal fraction (BPF) is a measure of disease severity in MS
We would like to know if gender has an effect on BPF in MS patients
We also know that BPF declines with age in MS patients
Is there an effect of sex on BPF if we control for age?
[Figure: two plots of BPF, one against Sex and one against Age. Blue=males; Red=females]
Is age a potential confounder?
We know that age has an effect on BPF from previous research
We also know that male patients have a different disease course than female patients so the age at time of sampling may also be related to sex
[Diagram: Sex, BPF and Age, with Age linked to both Sex and BPF]
Model
The multiple linear regression model includes a term for both age and sex
$$BPF_i = \beta_0 + \beta_1 \cdot gender_i + \beta_2 \cdot age_i + \varepsilon_i$$
What are the values gender_i takes on?
– gender_i=0 if the patient is female
– gender_i=1 if the patient is male
Expression
Females:
– BPF_i = β_0 + β_2*age_i + ε_i
Males:
– BPF_i = (β_0 + β_1) + β_2*age_i + ε_i
What is different about the equations?
– Intercept
What is the same?
– Slope
This model allows an effect of gender on the intercept, but not on the change with age
Interpretation of coefficients
The meaning of each coefficient
– β_0: the average BPF when age is 0 and the patient is female
– β_1: the average difference in BPF between males and females, HOLDING AGE CONSTANT
– β_2: the average increase in BPF for a one unit increase in age, HOLDING GENDER CONSTANT
Note that the interpretation of the coefficient requires mention of the other variables in the model
Estimated coefficients
Here is the estimated regression equation
$$\widehat{BPF}_i = 0.942 + 0.017 \cdot sex_i - 0.0026 \cdot age_i$$
The average difference between males and females is 0.017 holding age constant
For every one unit increase in age, the mean BPF decreases 0.0026 units holding sex constant
Are either of these effects statistically significant?
– What is the null hypothesis?
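Plugging values into the estimated equation shows what "holding age constant" means in practice. The sketch below uses the coefficients from the slide with the coding sex = 1 for males and 0 for females from the model slide; treating age as measured in years is our assumption, and the function name is our own.

```python
def predicted_bpf(sex, age):
    """Estimated mean BPF from the fitted equation on the slide.
    sex: 1 = male, 0 = female. age: ASSUMED to be in years."""
    return 0.942 + 0.017 * sex - 0.0026 * age

female_40 = predicted_bpf(sex=0, age=40)
male_40 = predicted_bpf(sex=1, age=40)
female_50 = predicted_bpf(sex=0, age=50)

print(f"female, age 40: {female_40:.3f}")
print(f"male,   age 40: {male_40:.3f}")
print(f"female, age 50: {female_50:.3f}")

# Same age, different sex: the gap is the sex coefficient
print(f"male - female at age 40: {male_40 - female_40:.3f}")
# Same sex, ten units older: the gap is ten times the age coefficient
print(f"age 50 - age 40 (female): {female_50 - female_40:.3f}")
```

The male and female lines are parallel: the model shifts the intercept by sex but uses one common slope for age.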
Hypothesis test
1) H0: No effect of sex, controlling for age (β_1=0)
2) Continuous outcome, dichotomous predictor
3) Linear regression controlling for age
4) Summary statistic: β_1=0.017
5) p-value=0.37
6) Since the p-value is more than 0.05, we fail to reject the null hypothesis
7) We conclude that there is no significant association between sex and BPF controlling for age
Hypothesis test
1) H0: No effect of age, controlling for sex (β_2=0)
2) Continuous outcome, continuous predictor
3) Linear regression controlling for sex
4) Summary statistic: β_2=-0.0026
5) p-value=0.004
6) Since the p-value is less than 0.05, we reject the null hypothesis
7) We conclude that there is a significant association between age and BPF controlling for sex
[Figure: regression output annotated with "Estimated effect of age", "p-value for age", "Estimated effect of sex" and "p-value for sex", followed by a plot of BPF against Age]
Conclusions
Although there was a marginally significant association of sex and BPF, this association was not significant after controlling for age
The significant association between age and BPF remained statistically significant after controlling for sex
What we learned (hopefully)
ANOVA
Correlation
Basics of regression