Bias, Variance, and Fit for Three Measures of Expression: AvDiff, Li & Wong’s, and AvLog(PM-BG)

Rafael A. Irizarry
Department of Biostatistics, JHU

(joint work with Bridget Hobbs and Terry Speed, Walter & Eliza Hall Institute of Medical Research)

Summary

• Summarize the expression level of a probe set by Average Log2(PM-BG)

• PMs need to be normalized

• Background makes no use of probe-specific MM

• Evaluate and compare through bias, variance and model fit to AvDiff and the Li & Wong algorithm

• Use Gene Logic spike-in and dilution study

• All three expression measures performed well

• AvLog(PM-BG) is arguably the best of the three

SD vs. Avg of Defective Probes

Normalization at Probe Level

Expression after Normalization

Background Distribution

Average Log2(PM-BG)

• Normalize probe level data

• Compute BG = background mean by estimating the mode of the MM distribution

• Subtract BG from each PM

• If PM-BG < 0, use the minimum of the positive PM-BG values divided by 2

• Take the average of log2(PM-BG) over the probe set
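The steps above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function names, the histogram-based mode estimate, and the choice to take the "minimum of positives" over whatever array is passed in are all assumptions, and probe-level normalization is assumed to have been done already.

```python
import numpy as np

def estimate_background(mm, bins=50):
    """Crude mode estimate (an assumption, not the slide's estimator):
    midpoint of the fullest histogram bin of log2(MM), mapped back
    to the raw intensity scale."""
    log_mm = np.log2(np.asarray(mm, dtype=float))
    counts, edges = np.histogram(log_mm, bins=bins)
    k = int(np.argmax(counts))
    return 2.0 ** (0.5 * (edges[k] + edges[k + 1]))

def avlog_pm_bg(pm, bg):
    """Average of log2(PM - BG) for one probe set.

    Non-positive differences are replaced by half the smallest positive
    difference, so that the logarithm is defined for every probe."""
    diff = np.asarray(pm, dtype=float) - bg
    positives = diff[diff > 0]
    if positives.size == 0:
        raise ValueError("no PM value exceeds the background")
    floor = positives.min() / 2.0
    adjusted = np.where(diff > 0, diff, floor)
    return float(np.mean(np.log2(adjusted)))

# Made-up intensities for illustration only.
example_pm = [132.0, 164.0, 228.0, 90.0]
example_value = avlog_pm_bg(example_pm, bg=100.0)
```

In this toy call the differences are 32, 64, 128 and -10, so the negative value is replaced by 16 before logs are taken.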

Spike-In Experiments

• Add concentrations (0.5 pM – 100 pM) of 11 foreign-species cRNAs to the hybridization mixture

• Set A: 11 control cRNAs were spiked in, all at the same concentration, which varied across chips.

• Set B: 11 control cRNAs were spiked in, all at different concentrations, which varied across chips. The concentrations were arranged in a 12x12 cyclic Latin square (with 3 replicates)
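For readers unfamiliar with the design, a cyclic Latin square can be built by shifting one list of levels by one position per row. The sketch below is illustrative only: the level labels `c0` to `c11` are placeholders rather than the concentrations used in the study, and the reading of rows and columns as chips and transcripts is an assumption.

```python
def cyclic_latin_square(levels):
    """Return an n x n cyclic Latin square: row r is `levels` rotated by r."""
    n = len(levels)
    return [[levels[(row + col) % n] for col in range(n)] for row in range(n)]

# Placeholder labels, not the actual concentrations.
levels = [f"c{i}" for i in range(12)]
square = cyclic_latin_square(levels)

# Each level occurs exactly once in every row and every column.
rows_ok = all(len(set(row)) == 12 for row in square)
cols_ok = all(len({square[r][c] for r in range(12)}) == 12 for c in range(12))
```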

Why Remove Background?

Probe Level Data (12 chips)

What Did We Learn?

• Don’t subtract or divide by MM

• Probe effect is additive on log scale

• Take logs
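A toy illustration of why an additive probe effect on the log scale favors taking logs: if each probe's signal is the expression level times a probe-specific affinity, the log ratio between two chips is the same for every probe, so the probe effect cancels. The affinities and expression levels below are invented for the example and are not data from the study.

```python
import numpy as np

# Hypothetical probe affinities (multiplicative on the raw scale).
affinity = np.array([0.5, 1.0, 2.0, 4.0])

# Two chips with a fourfold difference in expression (made-up values).
chip_a = 100.0 * affinity
chip_b = 400.0 * affinity

# On the log2 scale the probe effect is an additive offset and cancels.
log_ratios = np.log2(chip_b) - np.log2(chip_a)

# On the raw scale the differences still depend on the probe.
raw_diffs = chip_b - chip_a
```

Here every entry of `log_ratios` equals log2(4) = 2, while `raw_diffs` ranges from 150 to 1200.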

Expression Level

Spike-In B

Gene      Conc 1 (pM)   Conc 2 (pM)   Rank
BioB-5    100           0.5           1
BioB-3    0.5           25.0          2
BioC-5    2.0           75.0          3
BioB-M    1.0           35.7          4
BioDn-3   1.5           50.0          5
DapX-3    35.7          3.0           6
CreX-3    50.0          5.0           7
CreX-5    12.5          2.0           8
BioC-3    25.0          100           9
DapX-5    5.0           1.5           10
DapX-M    3.0           1.0           11

Later we consider 24 different combinations of concentrations
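The Rank column in the table above appears to order the spiked-in genes by the size of the true fold change between the two concentrations, largest first. The slide does not state this, so treat it as an inference; the sketch below recomputes the ranking under that assumption, with function and variable names chosen for illustration.

```python
import math

# (Conc 1, Conc 2) as listed in the table.
spikes = {
    "BioB-5": (100.0, 0.5), "BioB-3": (0.5, 25.0), "BioC-5": (2.0, 75.0),
    "BioB-M": (1.0, 35.7), "BioDn-3": (1.5, 50.0), "DapX-3": (35.7, 3.0),
    "CreX-3": (50.0, 5.0), "CreX-5": (12.5, 2.0), "BioC-3": (25.0, 100.0),
    "DapX-5": (5.0, 1.5), "DapX-M": (3.0, 1.0),
}

def rank_by_fold_change(table):
    """Rank genes by |log2(Conc 2 / Conc 1)|, largest change ranked 1."""
    size = {gene: abs(math.log2(c2 / c1)) for gene, (c1, c2) in table.items()}
    ordered = sorted(size, key=size.get, reverse=True)
    return {gene: position + 1 for position, gene in enumerate(ordered)}

ranks = rank_by_fold_change(spikes)
```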

Differential Expression

Observed vs True Ratio

Dilution Experiment

• cRNA hybridized to human chip (HGU_95) in range of proportions and dilutions

• Dilution series begins at 1.25 µg cRNA per GeneChip array, and rises through 2.5, 5.0, 7.5, 10.0, to 20.0 µg per array. 5 replicate chips were used at each dilution

• Normalize just within each set of 5 replicates

• For each probe set compute expression, average and SD over replicates, and fit a line to log expression vs. log concentration

• Regression line should have slope 1 and high R²
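The regression check in the last two bullets can be sketched as an ordinary least-squares fit of log expression on log concentration. The cRNA amounts are those listed on the slide; the expression values are synthetic (an idealized probe set whose signal is exactly proportional to the amount), and the function name is an assumption.

```python
import numpy as np

def slope_and_r2(log_conc, log_expr):
    """Least-squares line of log expression on log concentration.

    Returns the fitted slope and the coefficient of determination R^2."""
    x = np.asarray(log_conc, dtype=float)
    y = np.asarray(log_expr, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(slope), float(1.0 - ss_res / ss_tot)

# cRNA amounts per array from the slide.
amounts = np.array([1.25, 2.5, 5.0, 7.5, 10.0, 20.0])
log_conc = np.log2(amounts)

# Synthetic, noise-free expression values: proportional to the amount.
log_expr = log_conc + 3.0

slope, r2 = slope_and_r2(log_conc, log_expr)
```

For this idealized input the slope and R² both come out at 1, up to floating-point error; real probe sets would be judged by how close they come to that.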

Dilution Experiment Data

Expression and SD

Slope Estimates and R²

Model check

• Compute observed SD of 5 replicate expression estimates

• Compute RMS of 5 nominal SDs

• Compare by taking the log ratio

• Closeness of observed and nominal SD taken as a measure of goodness of fit of the model
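A minimal sketch of this check for a single probe set, under two assumptions not stated on the slide: the observed SD uses the n-1 denominator, and the log ratio is taken base 2. The function name and the example numbers are illustrative.

```python
import numpy as np

def model_check(estimates, nominal_sds):
    """Log2 ratio of the observed SD to the RMS of the nominal SDs.

    Values near 0 mean the model's stated uncertainty matches the
    variation actually seen across replicates."""
    observed_sd = np.std(np.asarray(estimates, dtype=float), ddof=1)
    rms_nominal = np.sqrt(np.mean(np.square(np.asarray(nominal_sds, dtype=float))))
    return float(np.log2(observed_sd / rms_nominal))

# Made-up replicate estimates and model-based SDs for one probe set.
replicates = [1.0, 2.0, 3.0, 4.0, 5.0]
nominal = [np.sqrt(2.5)] * 5
log_ratio = model_check(replicates, nominal)
```

In this example the sample variance of the replicates is 2.5, which matches the nominal SDs, so the log ratio is approximately 0.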

Observed vs. Model SE

Observed vs. Model SE

Conclusion

• Take logs

• PMs need to be normalized

• Using global background improves on use of probe-specific MM

• Gene Logic spike-in and dilution study show all three expression measures performed very well

• AvLog(PM-BG) is arguably the best in terms of bias, variance and model fit

• Future: better BG; robust/resistant summaries

Acknowledgements

• Gene Brown’s group at Wyeth/Genetics Institute, and Uwe Scherf’s Genomics Research & Development Group at Gene Logic, for generating the spike-in and dilution data

• Gene Logic for permission to use these data

• Francois Collin (Gene Logic)

• Ben Bolstad (UC Berkeley)

• Magnus Åstrand (Astra Zeneca Mölndal)