Correlation, Regression and Covariate-Adjusted Group Comparisons
Robert Boudreau, PhD
Co-Director of Methodology Core
PITT-Multidisciplinary Clinical Research Center for Rheumatic and Musculoskeletal Diseases
Core Director for Biostatistics
Center for Aging and Population Health
Dept. of Epidemiology, GSPH
Flow chart for group comparisons
Measurements to be compared:
• continuous: Distribution approx normal or N ≥ 20?
    No: Non-parametrics
    Yes: T-tests
• discrete (binary, nominal, ordinal with few values):
    Chi-square, Fisher’s Exact
Flow chart for regression models
(includes adjusted group comparisons)
Outcome variable continuous or dichotomous?
• dichotomous: Time-to-event available?
    No: Multiple logistic regression
    Yes: Cox proportional hazards regression
• continuous: Predictor variable categorical?
    No: Multiple linear regression
    Yes (e.g. groups): ANCOVA (multiple linear regression using dummy variable(s) for categorical var(s))
Health, Aging and Body Composition Study (HABC)
Observational study of 3075 men and women (now in year 13 followup)
• age 70-79
• 45% African-American
• Pittsburgh, PA and Memphis, TN
• Able to walk 1/4 mile and climb 10 steps (study eligibility criteria)
Designed to assess the relationship of weight and body composition to incident weight related diseases and disability
Funded by National Institute on Aging 1997-continuing
• University of Pittsburgh
• University of Tennessee, Memphis
• Coordinating Center: University of California, San Francisco
• Laboratory for Epidemiology, Demography and Biometry, NIA
HABC: Knee OA Substudy
Year 2: Knee x-rays and MRIs were done on participants with “qualifying knee pain”
“knee OA” Cases (N=862):
                           Sx (Knee-Pain)   SxRxKOA (KL ≥ 2)
African-American female    263              147
African-American male      124               56
White female               252               93
White male                 223               87
Today’s Objective
In HABC: Examine association between SxRxKOA (knee OA) and CRP, adjusted for BMI.
Sowers M, Hochberg M et al. C-reactive protein as a biomarker of emergent osteoarthritis. Osteoarthritis and Cartilage, Volume 10, Issue 8, August 2002, Pages 595-601
• N=1025 women aged 27-53; 18% of those aged 40-53 had knee OA
• Higher CRP associated with prevalent KOA (also incident KOA)
• Bilateral KOA had higher CRP than unilateral
Conclusion: “CRP is highly associated with Knee OA; however, its high correlation with obesity limits its utility as an exclusive marker for knee OA”
All White Females in HABC (N=844) [includes SxRxKOA (n=93); also rest of parent study cohort]
N=5 had CRP > 30 (max=63.2)
log CRP
White Females
                  Knee OA                           P-value
          No (n=752)     Yes (n=92)
          Mean (SD)      Mean (SD)      Equal vars   Unequal
logCRP    0.43 (0.83)    0.76 (0.58)    0.0002       < 0.0001
BMI       25.4 (4.3)     28.8 (5.2)     < 0.0001     < 0.0001
Difference in average logCRP: 0.76 - 0.43 = 0.33
logCRP SDs were significantly different (p<0.0001) => use the Satterthwaite unequal-variance test
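The two p-value columns above correspond to the pooled (equal variance) and Satterthwaite (unequal variance) t-tests. A minimal sketch of how such a comparison could be requested in SAS, assuming the same dataset and variable names used in the regression examples in this talk (kneeOA_vs_noOA, logCRP, BMI, kneeOA, female, white):

```sas
/* Sketch only: dataset and variable names are assumed to match the    */
/* proc reg examples in this talk.                                      */
proc ttest data=kneeOA_vs_noOA;
  class kneeOA;                 /* grouping variable: 0 = No OA, 1 = knee OA */
  var logCRP BMI;               /* one two-group comparison per variable     */
  where female=1 and white=1;   /* restrict to white females                 */
run;
```

PROC TTEST reports both the Pooled and the Satterthwaite results, together with a folded F test for equality of variances, so the choice between the two t-tests can be made from a single run.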
Two-Group Unadjusted Comparison of Means Using Regression with Dummy-coded Groups
* No OA is “referent” group (i.e. kneeOA=0)
HABCID   logCRP     kneeOA   BMI
1000      1.10972   0        22.5922
1001      0.16526   0        22.2751
1002      1.50988   0        26.1207
1003     -0.62048   0        26.9536
1014      0.65657   1        26.5266
1017      0.82039   1        30.2526
1033      0.84323   1        29.8458
1048      1.67787   1        39.8597
proc reg data=kneeOA_vs_noOA;
  model logCRP = KneeOA;
  where female=1 and white=1;
run;
Model: logCRP = 0.42682 + 0.33091*kneeOA   (0.42682 is the intercept)
KneeOA=0: logCRP = 0.42682 + 0.33091*0 = 0.42682
KneeOA=1: logCRP = 0.42682 + 0.33091*1 = 0.75773
White Females: 2-Group Comparison Using Dummy-coded Groups
* No OA is “referent” group (KneeOA=0);
proc reg data=kneeOA_vs_noOA;
  model logCRP = KneeOA;
  where female=1 and white=1;
run;
Note: Regression using Dummy (0, 1) for group variable (e.g. KneeOA=0,1). In regression, equal (pooled) variance is assumed.
Intercept estimate: “No OA” mean
KneeOA estimate: “kneeOA” mean difference from referent (same p-value as equal variance t-test)
Pearson Correlation
Pearson Correlation = a measure of linear association
Pearson vs Spearman Correlation
Spearman:
• A measure of rank order correlation
• Works for any general trend that is increasing or decreasing and not necessarily linear
• Equals Pearson Correlation using the ranks of the observations instead of actual values
Heuristically: Spearman measures the degree to which low goes with low, middle with middle, high with high
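Both coefficients can be requested in one step. A minimal sketch, assuming the same dataset and variable names as the regression examples in this talk; the choice of subgroup in the where statement is an assumption made only for illustration:

```sas
/* Sketch only: names follow the proc reg examples; the subgroup in    */
/* the where statement is an illustrative assumption.                  */
proc corr data=kneeOA_vs_noOA pearson spearman;
  var logCRP bmi;               /* correlate logCRP with BMI */
  where female=1 and white=1 and kneeOA=1;
run;
```

The output contains one correlation matrix per requested coefficient, each with a p-value for H0: correlation = 0.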
Regression on a continuous independent variable (BMI)
proc reg data=kneeOA_vs_noOA;
  model logCRP = bmi;
  where female=1 and white=1 and kneeOA=1;
run;
logCRP = -0.59337 + 0.04699*BMI
Exact same p-value as test of H0: Correlation=0
Effect of Centering BMI at 25
proc reg data=kneeOA_vs_noOA;
  model logCRP = bmi_minus25;
  where female=1 and white=1 and kneeOA=1;
run;
logCRP = 0.58144 + 0.04699*(BMI-25)
       = 0.58144 at BMI=25 (see graphic)
Model 2: logCRP = 0.58144 + 0.04699*(BMI-25)
                = 0.58144 - 25*0.04699 + 0.04699*BMI
                = -0.59337 + 0.04699*BMI
(The fifth decimal of the intercept differs slightly if computed by hand from the rounded slope shown here.)
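The centered model needs a variable holding BMI minus 25. A minimal sketch of a data step that could create it, assuming the uncentered variable is named bmi as in the earlier proc reg call; overwriting the dataset in place is an illustrative choice, not necessarily how the original analysis was set up:

```sas
/* Sketch only: creates the centered covariate used in the model       */
/* statement. Overwriting the input dataset is an assumption.          */
data kneeOA_vs_noOA;
  set kneeOA_vs_noOA;
  bmi_minus25 = bmi - 25;   /* 0 now corresponds to BMI = 25 */
run;
```

Centering changes only the intercept, which becomes the predicted logCRP at BMI=25; the slope and its p-value are unchanged.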
[Figure annotation: Unadjusted Mean Difference]
Notice: At any BMI level, the mean logCRP difference between KneeOA vs Not is smaller than the unadjusted difference
ANCOVA (Analysis of Covariance)
Compare logCRP adjusted for BMI
proc reg data=kneeOA_vs_noOA;
  model logCRP = KneeOA bmi;
  where female=1 and white=1;
run;
Note: Equal BMI slopes in each group are being modeled
Unadjusted diff was 0.33; BMI partially “explains” this difference
ANCOVA (Analysis of Covariance)
Centering BMI at 25
proc reg data=kneeOA_vs_noOA;
  model logCRP = KneeOA bmi_minus25;
  where female=1 and white=1;
run;
Note: Equal BMI slopes in each group are being modeled
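The same equal-slopes ANCOVA can also be fit with PROC GLM, which additionally reports BMI-adjusted group means. This is a sketch of an alternative and is not taken from the slides; it assumes the same dataset and variable names as the proc reg examples:

```sas
/* Sketch only: an alternative way to fit the same equal-slopes model. */
proc glm data=kneeOA_vs_noOA;
  class kneeOA;                          /* treat kneeOA as a grouping factor   */
  model logCRP = kneeOA bmi / solution;  /* common BMI slope in both groups     */
  lsmeans kneeOA / pdiff cl;             /* adjusted means and their difference */
  where female=1 and white=1;
run;
quit;
```

By default LSMEANS evaluates the group means at the overall mean BMI of the analyzed sample. With a CLASS statement PROC GLM uses the last level (kneeOA=1) as the reference, so the sign of the kneeOA estimate is reversed relative to the dummy-coded proc reg fit; the size of the adjusted difference is the same.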
Check of ANCOVA Assumption:
Equality of BMI slopes: KneeOA vs Not
proc reg data=kneeOA_vs_noOA;
  model logCRP = KneeOA bmi BMI_x_KneeOA;
  where female=1 and white=1;
run;
BMI_x_KneeOA is the “interaction term”
HABCID   logCRP     kneeOA   BMI       BMI_x_KneeOA
1000      1.10972   0        22.5922    0.0000
1001      0.16526   0        22.2751    0.0000
1002      1.50988   0        26.1207    0.0000
1003     -0.62048   0        26.9536    0.0000
1014      0.65657   1        26.5266   26.5266
1017      0.82039   1        30.2526   30.2526
1033      0.84323   1        29.8458   29.8458
1048      1.67787   1        39.8597   39.8597
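As the listing shows, the interaction column is simply BMI multiplied by the 0/1 group indicator. A minimal sketch of a data step that could build it, assuming the variable names shown in the listing; overwriting the dataset in place is an illustrative choice:

```sas
/* Sketch only: builds the product term used in the slope-equality     */
/* check. Overwriting the input dataset is an assumption.              */
data kneeOA_vs_noOA;
  set kneeOA_vs_noOA;
  BMI_x_KneeOA = bmi * kneeOA;   /* 0 when kneeOA=0, BMI when kneeOA=1 */
run;
```

In the fitted model the coefficient of BMI_x_KneeOA is the difference in BMI slope between the knee OA group and the referent group, so its p-value tests whether the two slopes are equal.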
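The “BMI” slopes are not significantly different between groups (interaction p=0.8019) => the data are consistent with parallel slopes, so the equal-slopes ANCOVA model is reasonable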
logCRP between KneeOA vs Not
Adjusted for BMI, Age and Anti-inflammatory Meds
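Adding further covariates extends the ANCOVA model statement in the obvious way. A minimal sketch; the variable names age and antiinflam_med are hypothetical placeholders, since the slides do not show the actual names or coding used for these covariates:

```sas
/* Sketch only: age and antiinflam_med are hypothetical variable names */
/* (antiinflam_med assumed coded 0/1 for anti-inflammatory med use).   */
proc reg data=kneeOA_vs_noOA;
  model logCRP = KneeOA bmi age antiinflam_med;
  where female=1 and white=1;
run;
```

The KneeOA estimate is then the mean logCRP difference between groups at fixed values of BMI, age and medication use, under the same equal-slopes assumption for each covariate.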
Thank you
Questions, comments, suggestions or insights?
Remaining time: Open consultation …