Sai Moturu
Introduction
• Current approaches to microarray data analysis
– Analysis of experimental data followed by a posterior process where biological information is incorporated to make inferences
• Integrative analysis technique in this paper
– Integrate gene annotation with expression data to discover intrinsic associations among both data sources based on co-occurrence patterns
Methods and Data
– Association Rules Discovery
– Gene expression data
– Gene annotation: Gene ontology categories, metabolic pathways and transcriptional regulators
– Applied to two previously studied experiments
Association Rules Discovery
– Antecedent -> Consequent X -> Y
– Measures of Quality
• Support: P(X∪Y), the fraction of genes that contain every item in both X and Y
• Confidence: P(Y|X) = P(X∪Y)/P(X)
• Improvement: Confidence/P(Consequent) = P(X∪Y)/(P(X)*P(Y))
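The three measures can be computed directly from counts. The following is a minimal sketch, assuming each gene is represented as a set of items (a "transaction"); the function names and the toy item labels are hypothetical and chosen only for illustration. Confidence is taken as P(X∪Y)/P(X) and improvement as confidence divided by P(Y).

```python
from typing import FrozenSet, Iterable, Sequence, Tuple


def support_count(transactions: Sequence[FrozenSet[str]], items: Iterable[str]) -> int:
    """Number of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for t in transactions if wanted <= t)


def rule_measures(
    transactions: Sequence[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[float, float, float]:
    """Return (support, confidence, improvement) for the rule X -> Y."""
    n = len(transactions)
    x, y = frozenset(antecedent), frozenset(consequent)
    n_xy = support_count(transactions, x | y)
    n_x = support_count(transactions, x)
    n_y = support_count(transactions, y)
    support = n_xy / n if n else 0.0
    confidence = n_xy / n_x if n_x else 0.0            # P(X∪Y) / P(X)
    improvement = confidence / (n_y / n) if n_y else 0.0  # confidence / P(Y)
    return support, confidence, improvement


# Toy data (hypothetical labels): five genes, each a set of items.
toy = [
    frozenset({"annot:TCA", "t6:over"}),
    frozenset({"annot:TCA", "t6:over"}),
    frozenset({"annot:TCA"}),
    frozenset({"t6:over"}),
    frozenset(),
]
support, confidence, improvement = rule_measures(toy, {"annot:TCA"}, {"t6:over"})
```

An improvement above 1 means the consequent is seen more often among genes matching the antecedent than among all genes.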
Association Rules Discovery
– Itemsets
• Genes and the set of experiments in which the gene is over- or underexpressed
• Gene characteristics
– Constraint
• Antecedent needs to be gene annotation
– Expression Thresholds
• Genes with log expression values >1 are overexpressed and <-1 are underexpressed (two fold)
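As a rough illustration of how a gene might be turned into an itemset, the sketch below applies the thresholds above to per-experiment log ratios and merges the result with the gene's annotations. It assumes base-2 log ratios (so that 1 and -1 correspond to a two-fold change); the function names and the "annot:", ":over" and ":under" labels are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set


def expression_items(
    log_ratios: Dict[str, float], upper: float = 1.0, lower: float = -1.0
) -> Set[str]:
    """Map each experiment to an 'over' or 'under' item when its log ratio
    is strictly beyond the thresholds; unchanged experiments yield no item."""
    items = set()
    for experiment, value in log_ratios.items():
        if value > upper:
            items.add(f"{experiment}:over")
        elif value < lower:
            items.add(f"{experiment}:under")
    return items


def build_transaction(
    annotations: Iterable[str], log_ratios: Dict[str, float]
) -> FrozenSet[str]:
    """Combine annotation items and expression items into one itemset."""
    annotation_items = {f"annot:{a}" for a in annotations}
    return frozenset(annotation_items | expression_items(log_ratios))


# Hypothetical gene: one pathway annotation, three time points.
gene = build_transaction(["TCA cycle"], {"t1": 0.3, "t6": 1.8, "t7": -2.1})
```

Prefixing annotation items makes it easy to enforce the constraint that only annotation may appear in the antecedent.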
Mining Association Rules
– The association rules that we are interested in have low support values and high confidence values
– A variant of the Apriori algorithm is used, one that has previously proved useful for mining biologically significant patterns with low support and high confidence
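To make the search concrete, here is a minimal sketch of constrained rule mining on toy data. It is not the Apriori variant referred to above: it simply enumerates small annotation-only antecedents and single expression items as consequents, which is workable only for small inputs. The function name, the parameter names, the `max_antecedent` cap, and the use of "at least" for each threshold are assumptions.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str], int, float, float]


def mine_rules(
    transactions: Sequence[FrozenSet[str]],
    is_annotation: Callable[[str], bool],
    min_support_count: int = 5,
    min_confidence: float = 0.4,
    min_improvement: float = 1.0,
    max_antecedent: int = 2,
) -> List[Rule]:
    """Exhaustively list rules (X, Y, support count, confidence, improvement)
    where X holds only annotation items and Y is one expression item."""
    n = len(transactions)

    def count(items: FrozenSet[str]) -> int:
        return sum(1 for t in transactions if items <= t)

    all_items = set().union(*transactions)
    annotation_items = sorted(i for i in all_items if is_annotation(i))
    expression_items = sorted(i for i in all_items if not is_annotation(i))

    rules: List[Rule] = []
    for size in range(1, max_antecedent + 1):
        for combo in combinations(annotation_items, size):
            x = frozenset(combo)
            n_x = count(x)
            # The rule's support can never exceed the antecedent's support.
            if n_x == 0 or n_x < min_support_count:
                continue
            for item in expression_items:
                y = frozenset([item])
                n_xy = count(x | y)
                if n_xy < min_support_count:
                    continue
                confidence = n_xy / n_x
                improvement = confidence / (count(y) / n)
                if confidence >= min_confidence and improvement >= min_improvement:
                    rules.append((x, y, n_xy, confidence, improvement))
    return rules


# Toy data (hypothetical labels): 20 genes.
toy = (
    [frozenset({"annot:TCA", "t6:over"})] * 6
    + [frozenset({"annot:TCA"})] * 2
    + [frozenset({"t6:over"})] * 4
    + [frozenset({"annot:other"})] * 8
)
rules = mine_rules(toy, lambda item: item.startswith("annot:"))
```

Expressing minimum support as a gene count keeps low-support rules reachable, which matters when the interesting patterns cover only a handful of genes.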
Filtering
– A major drawback of association rules is that the number of rules generated is huge
– There is also redundancy among the rules
– Two filters take care of this
• Redundant filter
• Single antecedent filter
Diauxic shift dataset
– Gene expression accompanying the metabolic shift from fermentation to respiration in fermenting yeast cells
– Expression levels recorded at 7 time points
– External information
• Metabolic pathways
• Transcriptional regulators
Results
– Association rules among metabolic pathways and expression patterns
• 1126 out of over 6000 genes were annotated with at least one pathway
• Association rules with minimum support of 5, minimum confidence of 40% and minimum improvement of 1
• Redundant and single antecedent filters applied
• 21 association rules
Results
– Association rules among transcriptional regulators and expression patterns
• 3490 genes were annotated with at least one regulator
• Association rules with minimum support of 5, minimum confidence of 80% and minimum improvement of 1
• Redundant filter applied
• 28 association rules
Results
– Association rules among transcriptional regulators, metabolic pathways and expression patterns
• 3882 genes
• Association rules with minimum support of 5, minimum confidence of 80% and minimum improvement of 1
• Redundant filter applied
• 37 association rules
Serum stimulation dataset
– Gene expression program of human fibroblast after serum exposure
– External information
• Gene ontology terms
Results
– Association rules among biological process annotation and expression patterns
• 4092 genes of over 8000
• Support of 4, min confidence of 10% and min improvement of 1
• Single antecedent and redundant filters applied
• 12 associations
Results
– Association rules among terms from all GO categories
• 4630 genes of over 8000
• Support of 4, min confidence of 10% and min improvement of 1
• Redundant filter applied
• 31 associations
Conclusions
– Some of the biological implications matched the ones found experimentally
– The others could be explored further
– Integrative data analysis is very useful for meaningful discoveries using gene expression data