Dahlia Nielsen North Carolina State University Bioinformatics Research Center


Microarray Animation

http://www.bio.davidson.edu/Courses/genomics/chip/chip.html

Importing data into JMP/Genomics

Need two (paired) tables:
- Data: expression intensities
- Experimental design

Data probably originally exists in separate files: one file per sample/microarray
- first create the experimental design file

Experimental Design File

Required Columns
- columnname
- file
- Array (can be “made up” values)
- intensity (if using text file input)
- dye (or channel), if two-color platform: cy3 vs cy5

Experimental Design File

Required Columns

Other columns
- information about samples
  - treatment
  - class
  - phenotype
  - …
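The layout described above can be sketched as a small table with one row per sample/microarray. This is only an illustration: the file names, array labels, and treatment values are hypothetical, and the check that `columnname` values are unique rests on the assumption that this column is what ties each design row to a column of the paired data table.

```python
import csv
import io

# Hypothetical sample sheet: file names, array labels, and treatments are
# made up for illustration only.
samples = [
    {"columnname": "ctrl_1", "file": "ctrl_1.txt", "array": 1, "treatment": "control"},
    {"columnname": "ctrl_2", "file": "ctrl_2.txt", "array": 2, "treatment": "control"},
    {"columnname": "trt_1", "file": "trt_1.txt", "array": 3, "treatment": "treated"},
    {"columnname": "trt_2", "file": "trt_2.txt", "array": 4, "treatment": "treated"},
]

# Columns this sketch treats as required (an assumption based on the slide).
REQUIRED = ["columnname", "file", "array"]


def write_design(rows, handle):
    """Write one design-file row per sample after checking required columns."""
    fieldnames = list(rows[0].keys())
    missing = [c for c in REQUIRED if c not in fieldnames]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    names = [r["columnname"] for r in rows]
    if len(set(names)) != len(names):
        # Assumption: columnname links a design row to a data column.
        raise ValueError("columnname values must be unique")
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_design(samples, buffer)
design_text = buffer.getvalue()
print(design_text)
```

Extra sample information (class, phenotype, and so on) would simply be further keys in each row.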

Data Analysis Steps

- QC
  - distribution analysis
  - correlation plots
- Normalization
- more QC
  - same as above
- Analysis
- Results
  - visualization

JMP/Genomics creates a script for each of these steps
- can run the script to re-create results (without re-doing the analyses)

QC

Distribution analysis
- visualization of how consistent your data/samples are
- useful for detecting problem arrays

Correlation plots
- also a measure of array consistency
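Both QC checks can be approximated numerically before looking at any plots. The sketch below is a minimal illustration, not the JMP/Genomics procedure: the intensities are hypothetical, the distribution summary is reduced to quartiles, and "lowest mean correlation with the other arrays" is just one assumed way of pointing at a possible problem array.

```python
import math
import statistics

# Hypothetical log2 intensities: one list per array, same gene order in each.
arrays = {
    "array_1": [7.1, 8.3, 9.0, 10.2, 11.5],
    "array_2": [7.0, 8.5, 9.1, 10.0, 11.6],
    "array_3": [9.9, 7.2, 11.0, 8.1, 9.4],  # deliberately inconsistent
}


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Distribution summary per array: quartiles of the intensities.
for name, values in arrays.items():
    q1, med, q3 = statistics.quantiles(values, n=4)
    print(f"{name}: Q1={q1:.2f} median={med:.2f} Q3={q3:.2f}")

# Mean correlation of each array with all the others.
mean_corr = {}
for name, values in arrays.items():
    others = [pearson(values, v) for k, v in arrays.items() if k != name]
    mean_corr[name] = statistics.fmean(others)
    print(f"{name}: mean correlation with other arrays = {mean_corr[name]:.3f}")

# The array that agrees least with the rest is the first one to inspect.
suspect = min(mean_corr, key=mean_corr.get)
print("least consistent array:", suspect)
```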

Normalization

- Lots of choices
- Lots of discussion
- No right / wrong
- Depends in part on your goals
- Different degrees
  - very “light” (mixed model)
  - intermediate (loess)
  - more “heavy-handed” (quantile)
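To make the "heavy-handed" end of that range concrete, here is a minimal sketch of quantile normalization on hypothetical data. It forces every array to share one distribution (the mean of the sorted columns); ties are broken by row order rather than averaged, which is a simplification and not necessarily what JMP/Genomics does.

```python
import statistics

# Hypothetical intensities: rows are genes, columns are arrays.
data = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.5, 6.0],
    [4.0, 2.0, 8.0],
]


def quantile_normalize(matrix):
    """Give every column the same distribution (mean of the sorted columns)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    # Reference distribution: mean across arrays at each rank.
    sorted_cols = [sorted(col) for col in columns]
    reference = [
        statistics.fmean([c[i] for c in sorted_cols]) for i in range(n_rows)
    ]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        # Row indices ordered from smallest to largest value in this array.
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = reference[rank]
    return result


normalized = quantile_normalize(data)
for row in normalized:
    print([round(v, 2) for v in row])
```

After this step each array has identical quantiles, so any remaining differences are in the ranking of genes within an array, which is why the method sits at the aggressive end of the scale.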

More QC

Indication of success of the normalization procedure
- as before: check consistency between arrays/samples
- detect problem arrays

Analysis

Generally performed one gene at a time

Hypothesis-testing framework
- ANOVA (test for changes in expression levels across treatment groups)
- multiple-testing adjustment necessary

Exploratory procedures
- PCA
- cluster analysis
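The hypothesis-testing steps above can be sketched for a single gene and a handful of p-values. The slides do not name a particular adjustment, so Benjamini-Hochberg is used here purely as one common choice; all values are hypothetical, and turning the F statistic into a p-value (which needs the F distribution) is left out of this stdlib-only sketch.

```python
import statistics


def anova_f(groups):
    """One-way ANOVA F statistic for one gene; groups is a list of lists."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical expression values for one gene in two treatment groups.
f_stat = anova_f([[8.1, 8.3, 8.2], [9.0, 9.2, 9.1]])
print(f"F = {f_stat:.1f}")

# Hypothetical raw p-values, one per gene.
raw_p = [0.001, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
print([round(p, 3) for p in adjusted_p])
```

In a real analysis `anova_f` would be applied to every gene in turn, and the adjustment would run over the full vector of resulting p-values.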

Volcano plots

Visualization tool to display results
- plot of effect size (x-axis) vs. significance level (y-axis)
- Some genes may display large differences between treatment groups, but also high variance (less significance)
- Some genes might display smaller effect sizes, but very consistent expression values (low variance), and so smaller p-values
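The coordinates of a volcano plot are simple to compute. The sketch below assumes the common convention of plotting -log10(p) on the y-axis, so that more significant genes sit higher; the slides only say "significance level", and the gene names and numbers are hypothetical.

```python
import math

# Hypothetical per-gene results: difference in mean log2 expression between
# two treatment groups (effect size) and the raw p-value from the test.
results = [
    {"gene": "gene_A", "log2_diff": 2.5, "p": 0.20},    # big effect, noisy
    {"gene": "gene_B", "log2_diff": 0.4, "p": 0.0001},  # small effect, consistent
    {"gene": "gene_C", "log2_diff": -1.8, "p": 0.001},  # large and significant
]


def volcano_points(rows):
    """Return (gene, x, y) with x = effect size and y = -log10(p)."""
    return [(r["gene"], r["log2_diff"], -math.log10(r["p"])) for r in rows]


points = volcano_points(results)
for gene, x, y in points:
    print(f"{gene}: x={x:+.2f}  y={y:.2f}")
```

Here gene_A lands far to the right but low on the plot, while gene_B sits near the center but high, which are the two cases described above.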

Final results

Probably should consider not only p-values, but also magnitude of effect
- small changes (in spite of small p-values) might not be replicable, given:
  - the inherent accuracy limits of microarrays
  - the tendency to perform experiments with small sample sizes
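One way to act on this is to require both a small p-value and a minimum effect size before calling a gene a result. The sketch below is illustrative only: the cutoffs (0.05 and a log2 difference of 1.0) and the field names are assumptions, not recommendations from the slides.

```python
# Hypothetical results; the thresholds below are illustrative, not recommendations.
results = [
    {"gene": "gene_A", "log2_diff": 2.5, "adj_p": 0.30},
    {"gene": "gene_B", "log2_diff": 0.2, "adj_p": 0.001},
    {"gene": "gene_C", "log2_diff": -1.8, "adj_p": 0.004},
]


def select_genes(rows, max_p=0.05, min_abs_diff=1.0):
    """Keep genes that pass both the significance and the effect-size cutoff."""
    return [
        r["gene"]
        for r in rows
        if r["adj_p"] <= max_p and abs(r["log2_diff"]) >= min_abs_diff
    ]


selected = select_genes(results)
print(selected)
```

With these inputs gene_B is dropped despite its small p-value, because its effect is too small to be confident it would replicate.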

Final check on results

Once you identify genes with significant results
- e.g. expression levels significantly different between treatment groups

Examine the data
- Is the change identified (above) readily apparent?
- In the normalized data … and in the raw data
