49
25th Clemson Mini-Conference on Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra J. Knisley 25th Clemson Mini-Conference on Discrete Math and Algorithms October 7, 2010 Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 1 / 50

25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Two recent applications of graph theory inmolecular biology

Debra J. Knisley

25th Clemson Mini-Conference on Discrete Math and Algorithms

October 7, 2010

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 1 / 50

Page 2: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Outline

New applications in molecular biology

Graphs as Secondary RNA structuresGraphs as Amino AcidsGraphs as Proteins

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 2 / 50

Page 3: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

DNA to RNA to Protein

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 3 / 50

Page 4: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Amino Acid 3-letter 1-letter Polarity

Alanine Ala A nonpolar

Arginine Arg R polar

Asparagine Asp N polar

Cysteine Cys C polar

Glutamic acid Glu E nonpolar

Glycine Gln Q polar

Histidine His H polar

Isoleucine Ile I nonpolar

Leucine Leu L polar

Lysine Lys K nonpolar

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 4 / 50

Page 5: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Human Genome Project

How many genes are in the Human Genome?

Majority of biologists believed a lot more than50,000

Rice genome contains about 50,000 genes

Big surprise, humans only have about half thatmany!

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 5 / 50

Page 6: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Major changes in the science of molecularbiology

The number of genes in the genome was just thefirst of many surprises!

The belief that the sole function of RNA is to carrythe ”code” of instructions for a protein is not true

It is now known that more than half of the RNAmolecules that have been discovered are”non-coding”

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 6 / 50

Page 7: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Systems Biology

The two surprising discoveries are related–

It is not how many genes, it is how they areregulated that is important!

Now everyone is trying to understand regulatorynetworks - Systems Biology

The study of RNA molecules is at the forefront ofthe field of Systems Biology

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 7 / 50

Page 8: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Database of RNA Tree Graphs

Home RNA Tree Graph Subproject Home How to Represent RNA Trees as Planar Graphs

List of RNA Tree Graphs by Number of Vertices2 Vertices

3 Vertices

4 Vertices

5 Vertices

6 Vertices

7 Vertices

8 Vertices

9 Vertices

10 Vertices

List of RNA Tree Graphs by Functional RNA Class

P5abc

RNA in Signal Recognition Complex

Single Strand RNA

tRNA

U2 snRNA

5S rRNA

70S

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 8 / 50

Page 9: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Secondary RNA structure: Tamar Schlicket.al, web resource RAG (RnaAsGraphs)

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 9 / 50

Page 10: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Vertices represent internal loops

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 10 / 50

Page 11: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Tree model of secondary RNA structure

See http://www.monod.nyu.edu

(Schlick et al.)

Secondary RNA structure

And corresponding tree.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 11 / 50

Page 12: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

They can be large, but most foldingalgorithms predict smaller tree structures

The tree representation

of an RNA molecule’s

secondary structure.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 12 / 50

Page 13: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

The RAG Database

The six trees of order 6 with their RAG (n.z) index andcolor classification

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 13 / 50

Page 14: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Modeling RNA Bonding

When two trees T1 and T2 containing n and m verticesrespectively are merged, the resulting tree T1+2 has atotal of n + m− 1 vertices.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 14 / 50

Page 15: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Modeling RNA Bonding

To accurately model RNA molecule bonding, we mustconsider all possible vertex identifications between twoRNA tree models.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 15 / 50

Page 16: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Modeling RNA Bonding

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 16 / 50

Page 17: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Modeling RNA Bonding

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 17 / 50

Page 18: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Modeling RNA Bonding

For all 94 graphs on 2 through 9 vertices, werecorded every possible vertex identificationresulting in a graph on 9 or less vertices

The data from each vertex identification wasrecorded and translated into a data vector

[〈c1, c2, deg(v1), deg(v2)〉 , 〈y1, y2〉]

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 18 / 50

Page 19: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Observations

Examples of Data Vectors from Red Trees and BlackTrees

Red Trees 8.5 and 8.15 Black Trees 8.22 and 8.23Data Vectors Data Vectors

[< 1, 1, 1, 1 >,< 1, 0 >] [< 1, 1, 1, 5 >,< 0, 1 >][< 1, 1, 1, 2 >,< 1, 0 >] [< 1, 1, 2, 4 >,< 0, 1 >][< 1, 1, 2, 1 >,< 1, 0 >] [< 1, 1, 3, 4 >,< 0, 1 >][< 1, 1, 2, 2 >,< 1, 0 >] [< 1, 0, 1, 6 >,< 0, 1 >]

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 19 / 50

Page 20: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Artificial Neural Network

Each tree’s final classification was calculated as anaverage of a linear combination of prediction values fromthe vertex identifications

Code Prediction

a[8.5] := 2 ∗ Classify(〈1, 1, 1, 2〉); 〈1.99966, 0.00135〉b[8.5] := 4 ∗ Classify(〈1, 1, 1, 1〉);

⟨4.00000, 2.6820110−7⟩

c[8.5] := 4 ∗ Classify(〈1, 1, 1, 1〉);⟨4.00000, 2.6820110−7⟩

d[8.5] := 4 ∗ Classify(〈1, 1, 2, 1〉); 〈3.99932, 0.00270〉

Class[8.5] := a[8.5]+b[8.5]+c[8.5]+d[8.5]14 〈0.99991, 0.00034〉

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 20 / 50

Page 21: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Results: Values for the Classified RAG Trees

RAG Color ANN ANNIndex Class Prediction Result

7.1 Red 1.00000 Highly RNA-Like7.10 Black 0.59091 Unclassifiable7.11 Black 0.00045 Highly Not-RNA-Like7.2 Red 0.99860 Highly RNA-Like7.3 Red 0.99990 Highly RNA-Like7.9 Black 0.43130 Unclassifiable

8.10 Red 0.99994 Highly RNA-Like8.15 Red 0.56343 Unclassifiable8.17 Black 0.36524 Not-RNA-Like8.18 Black 0.00595 Highly Not-RNA-Like9.6 Red 0.99991 Highly RNA-Like

9.11 Red 0.99993 Highly RNA-Like9.13 Red 0.99740 Highly RNA-Like9.27 Red 0.99795 Highly RNA-Like

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 21 / 50

Page 22: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 22 / 50

Page 23: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Domination number = 4

Domination number = 3

Domination number = 2

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 23 / 50

Page 24: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Locating-dominating number1

2

34

56 7

8

9

10

Vertex Dominating Vertex

1 9

2 7

3 4 & 10

5 4

6 7 & 10

8 4, 7 & 9

Combinatorial optimization

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 24 / 50

Page 25: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

locating domination number = 4

locating domination number = 5

locating domination number = 5

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 25 / 50

Page 26: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Graphical measures

We used two measures defined by dominationinvariants

P1 =γ + γt + γa

n, P2 =

γL + γD

n

where γ is the domination number, γt is the totaldomination number, γa is the global allieddomination number, γL is the locating dominationnumber, γD is the differentiating dominationnumber, and n is the number of vertices in thegraph.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 26 / 50

Page 27: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Results

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 27 / 50

Page 28: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Combinatorial Molecular Biology?

In combinatorics, we find optimal structuralproperties by defining graphical invariants based onminimal/maximal achievements such as thedomination number, chromatic number, etc.

Can we utilize the graphical invariants from graphtheory as descriptors of biomolecules? That is, dobiomolecules exhibit optimal structural properties?

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 28 / 50

Page 29: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Computational Chemistry- Chemical GraphTheory

Topological indices together with quantities derivedfrom chemical properties are called moleculardescriptors

There are over 3,000 molecular descriptors in theHandbook of Molecular Descriptors. Of those,approximately 1600 can be classified as topologicalindices.

A tool of QSAR, the process by which chemicalstructure is quantitatively correlated with a welldefined process, such as biological activity orchemical reactivity.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 29 / 50

Page 30: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

The dilemma

It is generally believed that topological indices arenot valid for large graphs

Unlike RNA, proteins are very large molecules

Graphical invariants are suited for large graphs, butare often intractable

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 30 / 50

Page 31: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Determining the function of a protein

When a new protein is discovered, we may notknow its function

There are two ways to address the question offunction

Compare its primary sequence (sequence of aminoacids) to all other known proteins

Predict (or determine) its 3D conformation andcompare its conformation to all other known 3Dconformations

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 31 / 50

Page 32: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Hamming distance

R-T-S-C-A-G-G-W-A-C target sequenceR-T-S-C-Y-Y-W-G-A-C second sequence

R-T-S-C-A-G-G-W-A-C target sequenceK-T-S-C-G-A-G-Y-A-C third sequenceWhat is the Hamming distance in each case.

Are these “equally close”

No, some amino acids are more likely to substitutefor each other than other amino acids. There existScoring Matrices (often based on the likelyhood ofone amino acid being replaced by another)

See: AAindex database

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 32 / 50

Page 33: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Graphs for amino acids

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 33 / 50

Page 34: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Graphs for amino acids

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 34 / 50

Page 35: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Table of graph invariants fromvertex-weighted graphs

Row acid G g d c m p

1 A 12 12 2 0 1.000 12

2 R 54 36 9 0 1.750 14

3 N 42 24 5 0 1.600 15

4 D 44 24 5 0 1.600 16

5 C 12 12 3 0 1.333 32

6 Q 42 24 6 0 1.667 15

7 E 44 24 6 0 1.667 16

8 G 0 0 0 0 0.000 0

9 H 40 36 6 5 2.000 14

10 I 24 24 5 0 1.600 12

11 L 36 24 5 0 1.600 12

12 K 38 24 8 0 1.667 14

13 M 44 24 6 0 1.600 12

14 F 36 24 7 6 2.000 12

15 P 24 12 4 4 2.000 12

16 S 12 12 3 0 1.333 16

17 T 12 12 3 0 1.500 14

18 W 62 48 9 9 2.182 12

19 Y 52 24 8 6 2.000 16

20 V 12 12 3 0 1.500 12

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 35 / 50

Page 36: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Selecting the invariants

WYFHMLIEQDNKRCGPSVTA

10.88

7.25

3.63

0.00

Amino acids

Distance

Dendrogram based on invariants g_1, d, c and pWard Linkage, Euclidean Distance

WYFHKMLIEQDNRCGPSVTA

14.81

9.87

4.94

0.00

Amino acids

Distance

Dendrogram using 5 invariantsWard Linkage, Euclidean Distance

We see that the measure

m - average degree

changed the relationship of R

and K quite a bit.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 36 / 50

Page 37: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Selecting the invariants

WYFHMLIEQDNKRCGPSVTA

10.88

7.25

3.63

0.00

Amino acids

Distance

Dendrogram based on invariants g_1, d, c and pWard Linkage, Euclidean Distance

Four graphical invariants that capture the basic

properties of the amino acids

Possible add-on tool for scoring multiple sequence alignments

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 37 / 50

Page 38: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

What is reverse engineering?

Building a mathematical model from observations

The model is a representation of the biomolecule,network or whatever system under observation

The model is adjusted to fit the observations

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 38 / 50

Page 39: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

The DREAM CHALLENGE

What is DREAM?

Discussions for Reverse Engineering Assessmentand Methods“The fundamental question for DREAM is simple:How can researchers assess how well they aredescribing the biological systems?”

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 39 / 50

Page 40: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Sponsors of DREAM

Columbian University Center for Multiscale AnalysisGenomic and Cellular Networks (MAGNet)

NIH Roadmap Initiative

IBM Computational Biology Center

The New York Academy of Sciences

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 40 / 50

Page 41: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

A Challenge is posted each year: Anopportunity for “assessment”

DREAM4(2009)- Organized by Gustavo Stolovitzky,IBM Computational Biology Center and AndreaCalifano, Columbia University

The PDZ challenge – Given the primary sequence of5 proteins, almost identical, can you predict thebinding target of each one. That is, what is thebinding affinity difference of each ”mutant”.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 41 / 50

Page 42: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

The Challenge

There is a binding pocket in each PDZ domain

Mutations in the binding pocket result in a changein the binding affinity

Given a mutation, can you predict the change inbinding affinity?

A single amino acid change can have ”synergistic”consequences

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 42 / 50

Page 43: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Higlighted amino acids are attracted todifferent amino acids in the target protein

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 43 / 50

Page 44: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Our Approach

1 Use the graph-theoretic models and correspondinggraphical invariants of individual amino acids

2 Replace each alphabet string representing theprimary structure of each domain (andcorresponding target) with a numerical string. Thenumerical string was determined by the observationof the properties of the amino acids.

3 Train a neural network using these numerical stringsas inputs.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 44 / 50

Page 45: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Conclusion

“The idea of optimization is intimately connected withmodern science. Pioneers like Galileo, Fermat andNewton, were convinced that the world had been createdby a benevolent God who had established the laws ofnature as the most efficient way to achieve his purposes:in short, this is the best of all possible worlds and it isthe task of science to find out why and how. Graduallythis view was overturned, leaving optimization as animportant tool for the human-engineered world.”

Taken from the abstract of Ivar Ekeland, University ofBritish Columbia, for his upcoming Math MattersLecture on The Best of All Possible Worlds: The Idea ofOptimization.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 45 / 50

Page 46: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

Joint work with a number of people atETSU

Jeff Knisley- applied mathematics

Teresa Haynes- graph theory

Edith Seier-statistics

Denise Koessler- Mathematics graduate student

Cade Herron and Jordon Shipley- Mathematicsundergraduate students

– Acknowledgement: Tamar Schlick and the webresource folks at NYU

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 46 / 50

Page 47: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

ReferencesG. Benedetti and S. Morosetti, A Graph-Topological Approach toRecognition of Pattern and Similarity in RNA SecondaryStructures, Biol Chem. 22 (1996) 179-184.

N. Bose and P. Liang, Neural Network Fundamentals with Graphs,Algorithms, and Applications, New York: McGraw-Hill (1996).

G. Chartrand, Introductory Graph Theory, Boston: Prindle, Weber& Schmitt (1977).

H. Gan, S. Pasquali, and T. Schlick, Exploring the Repertoire ofRNA Secondary Motifs Using Graph Theory: Implications for RNADesign, Nucleic Acids Research, 31 (2003) 2926-2943.

F. Harary and G. Prings, The Number of HomeomorphicallyIrreducible Trees and Other Species, Acta Math, 101 (1959)141-162.

T. Haynes, S. Hedetniemi, and P. Slater, Fundamentals ofDomination in Graphs, New York: Marcel Dekker, Inc. (1998).

D. Knisley, T. Haynes T, E. Seier, and Y. Zou, A QuantitativeAnalysis of Secondary RNA Structure Using Domination BasedParameters on Trees, BMC Bioinformatics 7 (2006) 108.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 47 / 50

Page 48: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

ReferencesD. Knisley D, T. Haynes T, and J. Knisley, Using a Neural Networkto Identify Secondary RNA Structures Quantified by GraphicalInvariants, Communications in Mathematical and in ComputerChemistry / MATCH, 60 (2008) 277-290.S. Le, R. Nussinov, and J. Maziel, Tree Graphs of RNA SecondaryStructures and their Comparison, Comp. Biomed. Res., 22 (1989)461-473.M. Nelson and S. Istrail, RNA Structure and PredictionComputational Molecular Biology,[http://tuvalu.santafe.edu/ pth/rna.html], (1996).T. Schlick, D. Fera, N. Kim, N. Shiffeldrim, J. Zorn, U. Laserson,and H. Gan, RAG: RNA-As-Graphs Web Resource, BMCBioinformatics, 5 (2004) 88.T. Schlick, D. Fera, N. Kim, N. Shiffeldrim, J. Zorn, U. Laserson,and H. Gan, RAG: RNA as Graphs Web Resource,[http://monod.biomath.nyu.edu/rna/rna.php], (2007).T. Schlick, D. Fera, N. Kim, N. Shiffeldrim, J. Zorn, U. Laserson,H. Gan, and M. Tang, RAG: RNA-As-Graphs Database-Concepts,Analysis, and Features, Bioinformatics, 20 (2004) 1285-1291.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 48 / 50

Page 49: 25th Clemson Mini-Conference on Discrete Math and ...goddard/MINI/2010/knisley.pdf · Discrete Math and Algorithms Two recent applications of graph theory in molecular biology Debra

tugraz

25th Clemson Mini-Conference onDiscrete Math and Algorithms

ReferencesT. Schlick, N. Kim, N. Shiffeldrim, and H. Gan, Candidates forNovel RNA Topologies, J Mol Biol, 341 (2004) 1129-1144.

J. Shao, Linear model selection by cross-validation, J. Am.Statistical Association, 88 (1993) 486-494.

S. Vishveshwara, K. V. Brinda, and N. Kannan, Protein Structure:Insights from Graph Theory, Journal of Theoretical andComputational Chemistry, 1 (2002) 187-211.

Debra J. Knisley October 7, 2010 Two recent applications of graph theory in molecular biology 49 / 50