Reconstructing Gene Networks Presented by Andrew Darling Based on article “Research Towards Reconstruction of Gene Networks from Expression Data by Supervised Learning” - Soinov, Krestyaninova, Brazma


Reconstructing Gene Networks

Presented by Andrew Darling

Based on article

“Research Towards Reconstruction of Gene Networks from Expression Data by Supervised Learning”

- Soinov, Krestyaninova, Brazma

Outline

Why study another microarray algorithm?
Background info
Methods
Results
Discussion
Conclusion

Why study another microarray algorithm?

Study of microarray data continues
  Still unclear on what the data means
  Still unclear on how the genome works

Confirm existing knowledge about gene networks using existing datasets

Proof of concept in a new algorithm using existing knowledge and datasets

This algorithm actually explains its reasoning

Background information

What is a gene network?
What is supervised learning?
What are decision trees / classifiers?
Why use classifiers?

What is a gene network?

A model of genes affecting other genes
  What other genes affect a given gene
  How other genes affect a given gene
    Positive, negative, complicated
Several model types – graphs, nodes, edges
  Boolean ( on – off )
  Bayesian network ( conditional probability )
  Differential equations ( derivatives, integrals )
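
As a minimal sketch of the graph view, a gene network can be held as a directed graph whose edges carry a sign (positive or negative effect). The gene names, the edges, and the Boolean update rule below are all made up for illustration and are not taken from the paper.

```python
# Hypothetical toy network: regulator -> {target: sign}
# +1 means the regulator activates the target, -1 means it represses it.
network = {
    "geneA": {"geneB": +1, "geneC": -1},
    "geneB": {"geneC": +1},
    "geneC": {},
}

def regulators_of(target, net):
    """Return {regulator: sign} for every gene with an edge into `target`."""
    return {src: edges[target] for src, edges in net.items() if target in edges}

def boolean_step(state, net):
    """One synchronous Boolean (on/off) update.

    Assumed rule, chosen only for illustration: a gene is on if at least
    one of its activators is on and none of its repressors is on.
    """
    new_state = {}
    for gene in net:
        regs = regulators_of(gene, net)
        if not regs:
            new_state[gene] = state[gene]  # no inputs: keep current value
            continue
        activated = any(state[r] for r, sign in regs.items() if sign > 0)
        repressed = any(state[r] for r, sign in regs.items() if sign < 0)
        new_state[gene] = activated and not repressed
    return new_state
```

Calling `regulators_of("geneC", network)` answers "what other genes affect a given gene", and the sign on each edge answers "how".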

Gene network - example

What is supervised learning?

The paper was unclear on the subject
Perhaps a reference to the type of algorithm used
It may have involved human interaction with the software
Possibly, the software produced the classifiers in the form of a decision tree, then users interpreted the output into classification rules
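
In general machine-learning usage, "supervised learning" means fitting a model to examples whose correct labels are already known, then using the fitted model to label new examples. Whether the paper meant anything beyond this is not clear from the slides. A minimal sketch with made-up values, where the known target-gene states are the supervision:

```python
def train_stump(values, labels):
    """Learn a one-feature rule 'value > threshold -> up' from labeled examples.

    values: expression levels of a hypothetical regulator gene
    labels: known states of the target gene, "up" or "down"
    """
    best_threshold, best_errors = None, len(values) + 1
    for threshold in sorted(set(values)):
        predictions = ["up" if v > threshold else "down" for v in values]
        errors = sum(p != y for p, y in zip(predictions, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def predict(value, threshold):
    """Apply the learned rule to a new, unlabeled measurement."""
    return "up" if value > threshold else "down"

# Made-up training data: each level comes with a known label.
levels = [0.2, 0.4, 0.9, 1.3, 1.8]
states = ["down", "down", "down", "up", "up"]
threshold = train_stump(levels, states)
```

A decision-tree inducer does the same kind of thing with many features and nested rules instead of a single threshold.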

What are decision trees / classifiers?

Acyclic directed graph - tree
Each graph explains what other genes affect a specific gene
  Inner nodes are gene products of other genes
  Edges are thresholds of concentration of the gene products of the other genes – rules of the tree
  Leaf nodes are effects on transcription of the specific gene

Each graph is a classifier for a specific gene
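
The structure described above can be sketched as a small data type: inner nodes test one regulator's level against a threshold, and leaves hold the predicted transcription state of the target gene. The class names, gene names, and thresholds here are hypothetical, not the paper's.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    state: str  # predicted transcription state of the target gene

@dataclass
class Node:
    gene: str          # regulator whose expression level is tested
    threshold: float   # rule on the edges: level <= threshold goes to `low`
    low: Union["Node", Leaf]
    high: Union["Node", Leaf]

def classify(tree, levels):
    """Walk from the root to a leaf using the measured levels of other genes."""
    while isinstance(tree, Node):
        tree = tree.low if levels[tree.gene] <= tree.threshold else tree.high
    return tree.state

def rules(tree, path=()):
    """Flatten the tree into readable if-then rules, one per leaf."""
    if isinstance(tree, Leaf):
        return [" and ".join(path) + " -> " + tree.state]
    low = rules(tree.low, path + (f"{tree.gene} <= {tree.threshold}",))
    high = rules(tree.high, path + (f"{tree.gene} > {tree.threshold}",))
    return low + high

# Hypothetical classifier for one target gene.
tree = Node("geneA", 0.5, Leaf("down"),
            Node("geneB", 1.2, Leaf("up"), Leaf("down")))
```

`rules(tree)` lists every root-to-leaf path as one rule, which is the sense in which the tree explains its own reasoning.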

Classifiers – model of gene networks

Expression of gene is function of transcription
Transcription of gene is in discrete states
  Expressed more than average
  Expressed less than average

Transcription state affected by amount of other gene products (expression of other genes)

Use yeast cell cycle data to test algorithm and previous knowledge to judge accuracy
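
The two discrete states can be illustrated with a short sketch. It assumes the "average" is the mean of the gene's own measurements across the time series; the slides do not say exactly how the average was computed, and the function name is hypothetical.

```python
def discretize(levels):
    """Label each measurement "up" if above the gene's own mean, else "down"."""
    mean = sum(levels) / len(levels)
    return ["up" if x > mean else "down" for x in levels]
```

The resulting labels are what a classifier for that gene would be trained to predict from the expression of the other genes.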

Why use classifiers?

The products affecting a specific gene are listed in the tree

Allows for continuous values for concentrations
Each additional dataset refines the decision information
Decision trees are easy to read and interpret
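
One common way a tree inducer handles continuous concentrations is to search for the threshold that best separates the classes. C4.5 is known to score splits with a gain-ratio criterion; the sketch below uses plain information gain for brevity, so it is an illustration of the idea rather than C4.5 itself. Function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the samples at `value <= threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing gains nothing
    weighted = (len(left) * entropy(left)
                + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

def best_threshold(values, labels):
    """Try each midpoint between consecutive distinct values; keep the best."""
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    if not candidates:
        return None  # all values identical: nothing to split on
    return max(candidates, key=lambda t: information_gain(values, labels, t))
```

Because the threshold is chosen from the data, no concentration has to be binned in advance.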

Classifier - example

Methods

Use induction algorithm to generate decision trees
  Program called C4.5
Apply program three ways
  Regulation of target gene as a function of other genes at same time (simultaneous)
  Regulation of target gene as a function of other genes at previous times (time delay)
  Regulation as a function of change of other genes (changes)
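
The three ways of applying the program differ only in how the training table is built from the time series. The sketch below is one possible reading: it assumes a delay of one time step, labels the target by comparison with its own mean, and keeps the target's state as the label in the "changes" case. None of these details are stated in the slides, and all names are hypothetical.

```python
def mean_states(series):
    """Label each time point "up" or "down" relative to the series mean."""
    mean = sum(series) / len(series)
    return ["up" if x > mean else "down" for x in series]

def build_datasets(expression, target):
    """Build three (rows, labels) training tables for one target gene.

    expression: {gene: [level at t0, level at t1, ...]}
    Each row holds one feature per other gene, in the order of `others`.
    """
    others = sorted(g for g in expression if g != target)
    labels = mean_states(expression[target])
    n = len(labels)

    # 1. Simultaneous: other genes at time t -> target state at time t
    simultaneous = ([[expression[g][t] for g in others] for t in range(n)],
                    labels)

    # 2. Time delay: other genes at time t-1 -> target state at time t
    delayed = ([[expression[g][t - 1] for g in others] for t in range(1, n)],
               labels[1:])

    # 3. Changes: change in other genes from t-1 to t -> target state at time t
    changes = ([[expression[g][t] - expression[g][t - 1] for g in others]
                for t in range(1, n)],
               labels[1:])

    return others, simultaneous, delayed, changes
```

Each table can then be handed to a tree inducer such as C4.5 to produce one classifier per target gene and per setup.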

Results - given

These genes and yeast datasets

Spellman, Cho, … Cdc28 Alpha-factor

Results – produced this

Results – with this accuracy

Discussion

Some concern that accuracy was only between 70% and 94%, even on systems with known interactions

Does that imply that the microarray data is wrong or the algorithm is flawed?

Conclusions

Decision trees and classifiers seem a better way to explain gene expression

This paper did not do a good job of explaining how to make / use them

Reference to the algorithm itself was almost specious