Sai Moturu

Introduction

• Current approaches to microarray data analysis

– Analysis of experimental data followed by a posterior process where biological information is incorporated to make inferences

• Integrative analysis technique in this paper

– Integrate gene annotation with expression data to discover intrinsic associations among both data sources based on co-occurrence patterns

Methods and Data

– Association Rules Discovery

– Gene expression data

– Gene annotation: Gene ontology categories, metabolic pathways and transcriptional regulators

– Applied to two previously studied experiments

Association Rules Discovery

– Antecedent -> Consequent: X -> Y

– Measures of Quality

• Support: P(X∪Y), the fraction of transactions that contain every item in X and in Y

• Confidence: P(Y|X) = P(X∪Y)/P(X)

• Improvement: Confidence/P(Y) = P(X∪Y)/(P(X)*P(Y))
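
The three measures can be computed directly from occurrence counts. The following is a minimal sketch, assuming each gene is one transaction (a set of items) and using the standard definition of confidence, P(X∪Y)/P(X). The function names, the item labels such as `pathway:TCA` and `t5:over`, and the toy data are hypothetical and do not come from the paper.

```python
from typing import Dict, FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def fraction_containing(transactions: List[Transaction], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)


def rule_measures(
    transactions: List[Transaction],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Support, confidence and improvement of the rule antecedent -> consequent."""
    x, y = frozenset(antecedent), frozenset(consequent)
    p_x = fraction_containing(transactions, x)
    p_y = fraction_containing(transactions, y)
    p_xy = fraction_containing(transactions, x | y)  # all items of X and Y together
    confidence = p_xy / p_x if p_x else 0.0
    improvement = confidence / p_y if p_y else 0.0
    return {"support": p_xy, "confidence": confidence, "improvement": improvement}


# Hypothetical toy data: one transaction per gene.
genes = [
    frozenset({"pathway:TCA", "t5:over"}),
    frozenset({"pathway:TCA", "t5:over"}),
    frozenset({"pathway:TCA"}),
    frozenset({"t5:over"}),
    frozenset(),
]
measures = rule_measures(genes, {"pathway:TCA"}, {"t5:over"})
```

An improvement above 1 means the consequent occurs more often among genes matching the antecedent than in the dataset as a whole, which is why the slides later use a minimum improvement of 1 as a cutoff.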

Association Rules Discovery

– Itemsets

• Genes and the set of experiments in which the gene is over- or underexpressed

• Gene characteristics

– Constraint

• Antecedent needs to be gene annotation

– Expression Thresholds

• Genes with log expression values > 1 are overexpressed and those with values < -1 are underexpressed (twofold change)
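
The itemset construction above can be sketched as follows. This is an illustration only: the helper names, the `condition:over` / `condition:under` item labels and the example values are assumptions, and the only detail taken from the slides is the cutoff of 1 on the log expression value.

```python
from typing import FrozenSet, Iterable, Sequence


def expression_items(
    log_ratios: Sequence[float],
    condition_names: Sequence[str],
    threshold: float = 1.0,
) -> FrozenSet[str]:
    """Turn one gene's log expression values into over/under expression items."""
    items = set()
    for name, value in zip(condition_names, log_ratios):
        if value > threshold:
            items.add(f"{name}:over")
        elif value < -threshold:
            items.add(f"{name}:under")
        # Values within [-threshold, threshold] produce no item.
    return frozenset(items)


def gene_transaction(
    annotations: Iterable[str],
    log_ratios: Sequence[float],
    condition_names: Sequence[str],
) -> FrozenSet[str]:
    """Combine a gene's annotation items with its expression items."""
    return frozenset(annotations) | expression_items(log_ratios, condition_names)


# Hypothetical gene: one pathway annotation and three measured conditions.
example = gene_transaction({"pathway:glycolysis"}, [0.2, 1.5, -2.0], ["t1", "t2", "t3"])
```

Keeping annotation and expression items in one set per gene is what lets a rule miner look for co-occurrence between the two data sources.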

Mining Association Rules

– The association rules of interest have low support and high confidence

– A variant of the Apriori algorithm is used; it has previously helped with mining low-support, high-confidence patterns that are biologically significant
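
To make the search concrete, here is a small brute-force sketch. It is not the Apriori variant used in the paper; it simply enumerates candidate rules whose antecedent consists of annotation items and whose consequent is a single expression item, then applies the thresholds. The function name, the single-item consequent, the `max_antecedent` limit and the toy data are all assumptions made for illustration. Support is counted in genes, in line with the absolute support values quoted in the results.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], str, int, float, float]


def mine_rules(
    transactions: List[FrozenSet[str]],
    is_annotation: Callable[[str], bool],
    min_support: int = 5,
    min_confidence: float = 0.8,
    min_improvement: float = 1.0,
    max_antecedent: int = 2,
) -> List[Rule]:
    """Enumerate annotation -> expression rules that pass all three thresholds."""
    n = len(transactions)

    def count(items: FrozenSet[str]) -> int:
        return sum(1 for t in transactions if items <= t)

    all_items = {item for t in transactions for item in t}
    annotations = sorted(i for i in all_items if is_annotation(i))
    expressions = sorted(i for i in all_items if not is_annotation(i))

    rules: List[Rule] = []
    for size in range(1, max_antecedent + 1):
        for combo in combinations(annotations, size):
            x = frozenset(combo)
            n_x = count(x)
            if n_x < min_support:
                continue  # adding a consequent can only lower the count further
            for y in expressions:
                n_xy = count(x | {y})
                if n_xy < min_support:
                    continue
                confidence = n_xy / n_x
                improvement = confidence / (count(frozenset({y})) / n)
                if confidence >= min_confidence and improvement >= min_improvement:
                    rules.append((x, y, n_xy, confidence, improvement))
    return rules


# Hypothetical toy data: 20 genes, 6 annotated with one pathway.
toy = (
    [frozenset({"pathway:TCA", "t6:over"})] * 5
    + [frozenset({"pathway:TCA"})]
    + [frozenset({"t6:over"})] * 2
    + [frozenset()] * 12
)
found = mine_rules(toy, lambda item: item.startswith("pathway:"))
```

In this toy set the rule covers only 5 of 20 genes yet holds for 5 of the 6 annotated genes, which is the low-support, high-confidence situation described above.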

Filtering

– A major drawback of association rules is the huge number of rules generated

– Many of the rules are also redundant

– Two filters take care of this

• Redundant filter

• Single antecedent filter
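
The slides do not define the two filters, so the sketch below only shows one plausible reading and should not be taken as the paper's definition. It assumes that a rule is redundant when another rule with the same consequent has a strictly smaller antecedent and at least the same confidence, and that the single antecedent filter keeps only rules with exactly one antecedent item. The `Rule` type and the example rules are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    antecedent: FrozenSet[str]
    consequent: str
    confidence: float


def redundant_filter(rules: List[Rule]) -> List[Rule]:
    """Drop rules dominated by a more general rule (assumed definition)."""
    kept = []
    for rule in rules:
        dominated = any(
            other.consequent == rule.consequent
            and other.antecedent < rule.antecedent  # proper subset
            and other.confidence >= rule.confidence
            for other in rules
        )
        if not dominated:
            kept.append(rule)
    return kept


def single_antecedent_filter(rules: List[Rule]) -> List[Rule]:
    """Keep only rules with exactly one antecedent item (assumed definition)."""
    return [rule for rule in rules if len(rule.antecedent) == 1]


# Hypothetical rules for illustration.
r1 = Rule(frozenset({"pathway:TCA"}), "t5:over", 0.90)
r2 = Rule(frozenset({"pathway:TCA", "regulator:HAP4"}), "t5:over", 0.85)
r3 = Rule(frozenset({"pathway:TCA", "regulator:HAP4"}), "t6:under", 0.95)

after_redundant = redundant_filter([r1, r2, r3])
after_single = single_antecedent_filter([r1, r2, r3])
```

Here `r2` is dropped by the redundant filter because `r1` reaches the same consequent with fewer antecedent items and higher confidence, while `r3` survives because no simpler rule predicts its consequent.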

Diauxic shift dataset

– Gene expression accompanying the metabolic shift from fermentation to respiration that occurs when fermenting yeast cells exhaust the available glucose

– Expression levels recorded at 7 time points

– External information

• Metabolic pathways

• Transcriptional regulators

Results

– Association rules among metabolic pathways and expression patterns

• 1126 out of over 6000 genes were annotated with at least one pathway

• Association rules with minimum support of 5, minimum confidence of 40% and minimum improvement of 1

• Redundant and single antecedent filters applied

• 21 association rules

Results

– Association rules among transcriptional regulators and expression patterns

• 3490 genes were annotated with at least one regulator

• Association rules with minimum support of 5, minimum confidence of 80% and minimum improvement of 1

• Redundant filter applied

• 28 association rules

Results

– Association rules among transcriptional regulators, metabolic pathways and expression patterns

• 3882 genes

• Association rules with minimum support of 5, minimum confidence of 80% and minimum improvement of 1

• Redundant filter applied

• 37 association rules

Serum stimulation dataset

– Gene expression program of human fibroblast after serum exposure

– External information

• Gene ontology terms

Results

– Association rules among biological process annotation and expression patterns

• 4092 genes of over 8000

• Association rules with minimum support of 4, minimum confidence of 10% and minimum improvement of 1

• Single antecedent and redundant filters applied

• 12 associations

Results

– Association rules among terms from all GO categories

• 4630 genes of over 8000

• Association rules with minimum support of 4, minimum confidence of 10% and minimum improvement of 1

• Redundant filter applied

• 31 associations

Conclusions

– Some of the biological implications matched the ones found experimentally

– The others could be explored further

– Integrative data analysis is very useful for meaningful discoveries using gene expression data
