Bleker, Clif, Garcia 1 Carissa Bleker, Ashley Cliff, Sergio Garcia

Carissa Bleker, Ashley Cliff, Sergio Garcia · web.eecs.utk.edu/~cphill25/cs594_spring2017/...


Carissa Bleker, Ashley Cliff, Sergio Garcia


Test Questions

1. Which thresholding method did we try to use?

2. Name a type of calculated edge.

3. Name one type of graph that can be used to represent metabolic networks.


Ashley Cliff

● Bredesen Center Student (DSE)
  ○ Advisor: Dan Jacobson (ORNL)

● Central College, Pella, IA
  ○ BA: Physics & Computer Science

● From: Decorah, IA
  ○ Population: ~8,000


Carissa Bleker

● Bredesen Center Student (DSE)
  ○ Advisor: Dr Langston

● Stellenbosch University
  ○ BScHons Mathematics

● From Cape Town, South Africa
  ○ Population of 3.7 million
  ○ Not the tip of Africa…

[Photo caption: Mojo & Amper. Map labels: Cape Town, Cape L’Agulhas]


Sergio Garcia

● From: Murcia (Population 439K), Spain.

● Pursuing PhD in Chemical and Biomolecular Engineering.

● I play the piano.


Presentation Outline

1. Background

2. Building Graphs from Real Data

3. Time Varying Graphs

4. Network Analysis Tools

5. Common Networks in Molecular Biology

6. Higher Level Biological Examples

7. Yeast Life Cycle Time Varying Graph

8. Issues


1. Background


The Discipline of Systems Biology

● Systems biology is the computational and mathematical modeling of complex biological systems.

● Focus on complex interactions within biological systems, using a holistic approach (holism instead of the more traditional reductionism) to biological research.

● Model and discover emergent properties of cells, tissues and organisms functioning as a system.

● Applications: disease, biocatalysis, waste management, …


[Figure labels: Cell, Communities]


High-throughput Collection of Biological Data


2. Building Graphs from Real Data


Building Graphs from Real (Dirty) Data

● Formatting
  ○ File format: tab vs. space vs. comma vs. no format
  ○ Remove corrupted lines and odd characters (#, %)
  ○ Create ‘simple’ DIMACS files or adjacency matrices

● Data cleaning
  ○ Duplicate data
  ○ Are zeros actually zeros?
  ○ Missing values (remove, ignore, impute)
  ○ Standardizing/normalizing

These steps could take longer than the graph analysis - never assume your data is clean/formatted
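The formatting and cleaning steps above can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' pipeline: the function name `clean_rows`, the set of missing-value markers, and the choice to drop (rather than repair) corrupted lines are all assumptions.

```python
import re

def clean_rows(lines, missing=("", "NA", "NaN")):
    """Parse delimited text into rows of floats, with None for missing values."""
    rows, seen = [], set()
    for line in lines:
        line = line.strip()  # note: this also drops leading/trailing tabs
        # Skip blank lines and lines starting with odd characters such as # or %.
        if not line or line[0] in "#%":
            continue
        # Accept a comma, a tab, or a run of spaces as the field separator.
        fields = re.split(r" *[,\t] *| +", line)
        try:
            row = tuple(None if f in missing else float(f) for f in fields)
        except ValueError:
            continue  # corrupted line: a field is neither numeric nor a missing marker
        if row not in seen:  # drop exact duplicate rows
            seen.add(row)
            rows.append(list(row))
    return rows
```

Missing values are kept as `None` so that the remove/ignore/impute decision can be made later, once the data is in a uniform shape.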


Determining Vertices and Edges

● What are the vertices?
  ○ Genes, locations, molecules, …

● And edges?
  ○ Interactions, covariation, proximity, …
  ○ Roads, wired connections
  ○ Calculated edges: correlation
  ○ Directed, weighted

● Multigraphs
  ○ Ex: vertices are cities; edges are roads (blue), direct flights (red), etc.

● What different insights are gained by changing or swapping how we identify vertices and edges?
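One possible way to hold a multigraph such as the cities example is to label each edge with its kind. The class below is a hypothetical sketch; the name `MultiGraph` and the `kind` and `weight` fields are illustrative choices, not part of the slides.

```python
from collections import defaultdict

class MultiGraph:
    """Undirected multigraph: parallel edges are told apart by a 'kind' label."""

    def __init__(self):
        self.adj = defaultdict(list)  # vertex -> list of (neighbor, kind, weight)

    def add_edge(self, u, v, kind, weight=1.0):
        self.adj[u].append((v, kind, weight))
        self.adj[v].append((u, kind, weight))

    def neighbors(self, u, kind=None):
        """Neighbors of u, optionally restricted to one kind of edge."""
        return sorted({v for v, k, _ in self.adj[u] if kind is None or k == kind})

g = MultiGraph()
g.add_edge("A", "B", "road")
g.add_edge("A", "B", "flight")   # parallel edge of a different kind
g.add_edge("A", "C", "flight")
```

Restricting `neighbors` to one kind gives the single-layer view (roads only, flights only), which is one way to see how the choice of edges changes the insight.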


Calculated Edges

● Pearson correlation coefficient

○ Pearson p-value

● Spearman (rank-based Pearson)

○ Spearman p-value

● Cosine

● Euclidean

● Mutual information

● ...
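For reference, several of the measures listed above can be written directly in plain Python. This is a sketch for small inputs; the function names are illustrative, p-values and mutual information are left out, and constant input vectors (zero variance) are not handled.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation is Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
```

Note that Euclidean distance is a dissimilarity (smaller means closer), so it has to be thresholded in the opposite direction from the correlation measures.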


Calculate Edges


Thresholding

A correlation matrix assigns a weight to every pair of vertices, so the graph it defines is complete. Thresholding aims to separate signal from noise by keeping only the stronger edges.


Thresholding

After thresholding, only the significant edges/associations between vertices remain.
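A hard threshold on a correlation matrix can be sketched as follows. The function name `threshold_edges` and the `use_abs` flag (keep strong negative correlations as well) are assumptions made for illustration.

```python
def threshold_edges(names, corr, t, use_abs=True):
    """Return the edges (u, v, weight) whose correlation passes threshold t."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle: undirected, no self-loops
            w = corr[i][j]
            if (abs(w) if use_abs else w) >= t:
                edges.append((names[i], names[j], w))
    return edges

names = ["g1", "g2", "g3"]
corr = [[1.0, 0.9, 0.2],
        [0.9, 1.0, -0.85],
        [0.2, -0.85, 1.0]]
```

With `t = 0.8`, the complete graph on three vertices is reduced to the two strong edges; the weak g1-g3 association is dropped.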


Thresholding: Spectral methods

• Weaker (smaller-weighted) edges connect dissimilar clusters of the graph

• As t (the threshold) is increased:
  → weaker edges are removed
  → dissimilar clusters are less connected
  → the number of “nearly-disconnected” clusters increases

Perkins, A. D., & Langston, M. A. (2009). Threshold selection in gene co-expression networks using spectral graph theory techniques. BMC Bioinformatics, 10(Suppl 11), S4.


Thresholding: Spectral methods

Finding “nearly-disconnected” clusters:

● Extract the largest connected component

● Compute the Laplacian of G

● Sort the values of the eigenvector associated with the second-smallest eigenvalue

● The result is an ascending step-like function; each step corresponds to a transition from one cluster to another

Perkins, A. D., & Langston, M. A. (2009). Threshold selection in gene co-expression networks using spectral graph theory techniques. BMC Bioinformatics, 10(Suppl 11), S4.
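The steps above can be illustrated on a toy graph. The sketch assumes the standard combinatorial Laplacian L = D − A (degree matrix minus adjacency matrix) and a connected, unweighted graph. To stay dependency-free it finds the eigenvector by projected power iteration on cI − L, which is a stand-in chosen here and not the method of the cited paper; in practice a linear algebra library would be used.

```python
def laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of an unweighted, undirected graph."""
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L

def fiedler_vector(L, iters=500):
    """Eigenvector of the second-smallest eigenvalue, by projected power iteration."""
    n = len(L)
    c = 2 * max(L[i][i] for i in range(n))   # 2 * max degree bounds the largest eigenvalue
    x = [float(i + 1) for i in range(n)]     # arbitrary deterministic start
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]          # remove the all-ones (eigenvalue 0) component
        x = [c * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]  # (cI - L) x
        norm = sum(xi * xi for xi in x) ** 0.5
        x = [xi / norm for xi in x]
    return x

# Two triangles joined by a single bridge edge (2-3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
v = fiedler_vector(laplacian(6, edges))
steps = sorted(v)  # ascending values; the jump in the middle separates the two triangles
```

Sorting the entries gives the step-like function described above: vertices of one triangle share one sign, vertices of the other triangle share the opposite sign.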


Thresholding: Spectral methods

• Select the t that maximises the number of “nearly-disconnected” components, and therefore minimises the number of edges connecting dissimilar parts of the network.

Perkins, A. D., & Langston, M. A. (2009). Threshold selection in gene co-expression networks using spectral graph theory techniques. BMC Bioinformatics, 10(Suppl 11), S4.


Thresholding: Random Matrix Theory

Based on the nearest neighbor spacing distribution (NNSD) of eigenvalues from the adjacency/correlation matrix.

NNSD: differences between subsequent (ordered) eigenvalues

Jalan, S., & Bandyopadhyay, J. N. (2007). Random matrix analysis of complex networks. Physical Review E, 76(4), 046107.

[Figure panels: random network, scale-free network, small-world network]


Thresholding: Random Matrix Theory

Based on the nearest neighbor spacing distribution (NNSD) of eigenvalues from the adjacency/correlation matrix.

NNSD: differences between subsequent (ordered) eigenvalues

The NNSD of the eigenvalues of a random matrix can be approximated by the Wigner surmise

The NNSD of a non-random matrix appears Poisson

Iterate over t until the NNSD transitions from one distribution to the other

Gibson, S. M., Ficklin, S. P., Isaacson, S., Luo, F., Feltus, F. A., & Smith, M. C. (2013). Massive-scale gene co-expression network construction and robustness testing using random matrix theory. PLoS One, 8(2), e55871.
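The two reference distributions have simple closed forms: the Wigner surmise for the Gaussian orthogonal ensemble is P(s) = (π/2) s exp(−π s²/4), and the Poisson case is P(s) = exp(−s). The sketch below computes spacings from a list of eigenvalues and evaluates both curves. Dividing by the mean spacing is a crude stand-in for the spectral unfolding step used in practice, and the function names are illustrative.

```python
import math

def nnsd(eigenvalues):
    """Nearest-neighbor spacings of the sorted eigenvalues, scaled to mean 1."""
    ev = sorted(eigenvalues)
    gaps = [b - a for a, b in zip(ev, ev[1:])]
    mean = sum(gaps) / len(gaps)
    return [g / mean for g in gaps]

def wigner_surmise(s):
    """Spacing density expected for a random (GOE) matrix."""
    return (math.pi / 2) * s * math.exp(-math.pi * s * s / 4)

def poisson(s):
    """Spacing density expected for uncorrelated eigenvalues."""
    return math.exp(-s)
```

The key qualitative difference is at small spacings: the Wigner surmise vanishes at s = 0 (eigenvalues repel), while the Poisson density is largest there. A goodness-of-fit test of the observed spacings against each curve, repeated for each candidate t, locates the transition.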


Thresholding: Random Matrix Theory


3. Time-varying Graphs


Time-varying Graphs

● Dynamic undirected graphs with a fixed underlying vertex set
  ○ Edges change over time; vertices do not

● AKA time-evolving graphs (TEG)

● Useful for time-based data
  ○ Gene expression over time
  ○ Congestion at intersections, etc.

● Use simple (or more complicated) metrics to ‘label’ the differences between time steps
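One simple way to label the difference between consecutive time steps is to compare edge sets. The sketch below reports gained edges, lost edges, and the Jaccard similarity; the choice of these three metrics and the function names are assumptions made for illustration.

```python
def edge_set(edges):
    """Normalize undirected edges so (u, v) and (v, u) compare equal."""
    return {tuple(sorted(e)) for e in edges}

def compare_steps(snapshots):
    """For each pair of consecutive time steps, report gained/lost edges and Jaccard similarity."""
    report = []
    for prev, curr in zip(snapshots, snapshots[1:]):
        a, b = edge_set(prev), edge_set(curr)
        union = a | b
        jaccard = len(a & b) / len(union) if union else 1.0
        report.append({"gained": sorted(b - a), "lost": sorted(a - b), "jaccard": jaccard})
    return report

snapshots = [
    [("g1", "g2"), ("g2", "g3")],   # edges at time 1
    [("g2", "g1"), ("g3", "g4")],   # edges at time 2 (same vertex set)
]
report = compare_steps(snapshots)
```

Because the vertex set is fixed, the edge sets at different time steps are directly comparable without any vertex matching.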


4. Network Analysis Tools


Network Properties

● Density: Some biological networks are sparse
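For a simple undirected graph, density is commonly defined as the fraction of possible edges that are present, 2|E| / (N(N − 1)). A minimal sketch, assuming that definition:

```python
def density(n_vertices, n_edges):
    """Fraction of possible undirected edges that are present: 2|E| / (N(N-1))."""
    if n_vertices < 2:
        return 0.0
    return 2 * n_edges / (n_vertices * (n_vertices - 1))
```

A complete graph has density 1; a sparse network has density close to 0.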


Network Properties

● Clustering coefficient

N = |V|; E_i = |edges between neighbors of i|; k_i = degree of i

[Figure: Average clustering coefficient for the metabolic networks of 43 organisms (colors represent different taxonomic domains). N is the number of nodes. The diamonds correspond to a scale-free network with the same number of nodes and edges.]
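With the symbols defined above, the usual local clustering coefficient is C_i = 2 E_i / (k_i (k_i − 1)), and the average is C = (1/N) Σ C_i. The sketch below assumes that definition and sets C_i = 0 for vertices of degree below 2, which is one common convention.

```python
from itertools import combinations

def clustering_coefficients(adj):
    """Local clustering coefficient C_i = 2 E_i / (k_i (k_i - 1)) for each vertex."""
    coeffs = {}
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs[i] = 0.0  # convention for vertices of degree 0 or 1
            continue
        # E_i: number of edges among the neighbors of i
        e_i = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        coeffs[i] = 2 * e_i / (k * (k - 1))
    return coeffs

def average_clustering(adj):
    c = clustering_coefficients(adj)
    return sum(c.values()) / len(c)

# Triangle 1-2-3 with a pendant vertex 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
c = clustering_coefficients(adj)
```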


Network Properties

● Other:
  ○ Diameter: shortest distance between the two most distant nodes in the network. The diameter of metabolic networks is conserved even across distant organisms.
  ○ Average path length: average of the shortest-path lengths over all pairs of nodes.
  ○ Degree distribution: number of nodes with a certain degree. Biological networks tend to follow a power law.
  ○ ...
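These three properties can be computed with breadth-first search on small graphs. The sketch assumes a connected, unweighted, undirected graph stored as an adjacency dictionary; the function names are illustrative.

```python
from collections import Counter, deque

def bfs_distances(adj, source):
    """Shortest-path distance (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_and_average_path_length(adj):
    """Assumes a connected graph; uses all ordered pairs of distinct vertices."""
    lengths = [d for s in adj for t, d in bfs_distances(adj, s).items() if t != s]
    return max(lengths), sum(lengths) / len(lengths)

def degree_distribution(adj):
    """Map each degree to the number of vertices having it."""
    return dict(Counter(len(nbrs) for nbrs in adj.values()))

# Path graph 1 - 2 - 3 - 4
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
diam, apl = diameter_and_average_path_length(adj)
```

Running one BFS per vertex costs O(N(N + E)), which is fine for illustration but slow for genome-scale networks.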


Complex Network Models


Node Centralities and Ranking

● Degree centrality: Nodes with high degree centrality are hubs. While biological networks are robust against perturbation, the removal of hubs often leads to system failure.
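The effect of removing a hub can be seen on a toy star graph. The sketch below normalizes degree by N − 1 (one common convention) and assumes at least two vertices with neighbor sets stored as Python sets; the helper names are illustrative.

```python
def degree_centrality(adj):
    """Degree of each vertex divided by the maximum possible degree, N - 1."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def hubs(adj, top=1):
    """The `top` vertices with the highest degree centrality."""
    dc = degree_centrality(adj)
    return sorted(dc, key=dc.get, reverse=True)[:top]

def remove_vertex(adj, v):
    """Copy of the graph with vertex v and its incident edges removed."""
    return {u: nbrs - {v} for u, nbrs in adj.items() if u != v}

# Star graph: hub "h" connected to three leaves.
adj = {"h": {"a", "b", "c"}, "a": {"h"}, "b": {"h"}, "c": {"h"}}
without_hub = remove_vertex(adj, hubs(adj)[0])
```

Removing a leaf leaves the rest of the star connected, while removing the hub isolates every remaining vertex.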


Node Centralities and Ranking

● Other:
  ○ Closeness centrality: indicates important nodes that can communicate quickly with other nodes. Used to identify key central metabolites and extract the core metabolic network.
  ○ Betweenness centrality: nodes that appear on many shortest paths rank higher. In metabolic networks, these are metabolites controlling flux between two modules. In telecommunication networks, such a node would have higher control.
  ○ Eigenvector centrality: ranks higher the nodes that are connected to important neighbors. Used to identify pairs of genes that cause sickness/death.
  ○ ...
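Closeness and eigenvector centrality can both be sketched in plain Python for a connected, undirected graph. Closeness here is (N − 1) divided by the sum of shortest-path distances. The eigenvector version runs power iteration on A + I instead of A; the two matrices share eigenvectors, and the shift avoids oscillation on bipartite graphs. That shift and the function names are choices made for this example.

```python
from collections import deque

def closeness_centrality(adj):
    """(N - 1) / (sum of shortest-path distances); assumes a connected graph."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[s] = (len(adj) - 1) / sum(dist.values())
    return result

def eigenvector_centrality(adj, iters=200):
    """Power iteration on (A + I): each score becomes its own plus its neighbors' scores."""
    x = {v: 1.0 for v in adj}
    for _ in range(iters):
        nxt = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = sum(s * s for s in nxt.values()) ** 0.5
        x = {v: s / norm for v, s in nxt.items()}
    return x

# Path graph a - b - c: the middle vertex is the most central under both measures.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
close = closeness_centrality(adj)
eig = eigenvector_centrality(adj)
```

Betweenness centrality is omitted because an efficient implementation (Brandes' algorithm) is considerably longer.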


Clustering

● Clusters are parts of a graph that are highly associated

● In biology, clusters are of interest for a number of reasons:
  ○ Finding co-regulated vertices
  ○ Finding vertices that are part of the same process
  ○ Hypothesising functionality of unannotated vertices


Clustering Algorithms: Markov Clustering

Underlying idea: a random walk on a transition graph that starts within a cluster is more likely to stay within that cluster than to leave it.


Clustering Algorithms: Markov Clustering

Two steps on a transition matrix:

1. Matrix squaring (expansion): simulates random walks through the graph

2. Elementwise matrix squaring (inflation): strengthens strong transition probabilities and weakens low probabilities


Clustering Algorithms: Markov Clustering

Transition matrix

$P_{i,j} = P(i \mid j)$ = probability of walking from j to i

Each column consists of the probabilities of each way you can leave that node, and sums to 1

$$
M = \begin{pmatrix}
P_{1,1} & P_{1,2} & \cdots & P_{1,5} \\
P_{2,1} & \ddots & & \vdots \\
\vdots & & \ddots & \vdots \\
P_{5,1} & \cdots & \cdots & P_{5,5}
\end{pmatrix}
$$
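As a small illustration, a column-stochastic transition matrix can be obtained from an adjacency matrix by dividing each column by its sum. The sketch assumes every vertex has at least one edge (or self-loop), so no column sums to zero; the function name is illustrative.

```python
def transition_matrix(adjacency):
    """Column-normalize an adjacency matrix so M[i][j] = P(walk from j to i)."""
    n = len(adjacency)
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    return [[adjacency[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

# Vertex 0 is connected to vertices 1 and 2.
A = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
M = transition_matrix(A)
```

From vertex 0 the walk moves to vertex 1 or 2 with probability 0.5 each (column 0), while from vertex 1 or 2 it must return to vertex 0 (columns 1 and 2).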


Clustering Algorithms: Markov Clustering

Transition matrix multiplication

$$(M^2)_{i,j} = \sum_k P_{i,k} P_{k,j} = \sum_k P(\text{walk to } i \text{ from } j \text{ through } k)$$

$(M^2)_{2,3}$ = probability of walking to 2 from 3, over all 2-step paths

$M^2 = M \times M$, the 5 × 5 transition matrix multiplied by itself


Clustering Algorithms: Markov Clustering

G is a graph
add self-loops to G
set inflation parameter I
set M to the column-stochastic matrix of G
change = ∞
while (change > ε) {
    M’ = M x M
    M’ = Γ_I(M’)
    make M’ stochastic
    change = max |M - M’|
    M = M’
}
Clustering is the components of M
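The loop above can be written as a short, dependency-free Python sketch. It is meant for tiny dense matrices only. The names `mcl` and `normalize_columns`, the iteration cap, the 1e-6 cutoff used to read off clusters, and the interpretation of nonzero rows as attractors are all choices made for this example.

```python
def normalize_columns(m):
    """Rescale each column so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def mcl(adjacency, inflation=2.0, eps=1e-9, max_iters=100):
    n = len(adjacency)
    # Add self-loops, then make the matrix column-stochastic.
    m = normalize_columns([[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                           for i in range(n)])
    for _ in range(max_iters):
        # Expansion: M x M simulates two-step random walks.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: raise each entry to the power I, then renormalize the columns.
        new = normalize_columns([[v ** inflation for v in row] for row in expanded])
        change = max(abs(new[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = new
        if change < eps:
            break
    # Each row with remaining mass is an attractor; its nonzero columns form a cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    return sorted(sorted(c) for c in clusters if c)

# Two triangles (0,1,2) and (3,4,5) joined by the bridge edge 2-3.
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
clusters = mcl(A)
```

Larger inflation values tend to give finer clusters; for real networks a sparse-matrix implementation with pruning is needed.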

Clustering Algorithms: Markov Clustering

[Slides 40-43: figures only]

Clustering Algorithms: Paraclique

[Slides 44-54: figures only, each labeled g = 3]

5. Common Networks in Molecular Biology


Metabolic Networks

● Network of chemical reactions enabling the conversion of substrates into energy and biomass. Can be represented by:
  ○ Simple graphs
  ○ (Directed) bipartite graphs
  ○ (Directed) hypergraphs
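The three representations can be compared on a toy example. The reactions `R1: A + B -> C` and `R2: C -> D + E` are hypothetical placeholders, and the function names are illustrative. The `reactions` dictionary itself plays the role of the directed hypergraph: each reaction is a hyperedge from a set of substrates to a set of products.

```python
def bipartite_edges(reactions):
    """Directed bipartite graph: substrate -> reaction node -> product."""
    edges = []
    for name, (substrates, products) in reactions.items():
        edges += [(s, name) for s in substrates]
        edges += [(name, p) for p in products]
    return edges

def metabolite_graph(reactions):
    """Simple-graph projection: link every substrate to every product of the same reaction."""
    return sorted({(s, p) for subs, prods in reactions.values() for s in subs for p in prods})

# Hypothetical reactions, as a directed hypergraph (substrate set -> product set).
reactions = {
    "R1": (["A", "B"], ["C"]),
    "R2": (["C"], ["D", "E"]),
}
```

The simple graph loses the information that A and B are consumed together; the bipartite and hypergraph forms keep it.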


Hierarchical Modularity in Metabolic Networks


Protein-protein Interaction Networks

● Represent how different proteins operate in coordination with others to enable biological processes within the cell.


Gene Co-expression Networks


6. Higher Level Biological Examples


Epilepsy Seizure Prediction


fMRI in Schizophrenia


Health Disparities

● WHO child cause-of-death (COD) data for 194 countries in 2013

● Method:
  ○ Pearson correlation between countries, across COD categories

● Graph:
  ○ Vertices: countries
  ○ Edges: similarities in COD

● Threshold at 0.95

● Markov clustering
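The graph-building part of this workflow (one profile per vertex, Pearson correlation between profiles, hard threshold) can be sketched as below. The vertex names `c1`, `c2`, `c3` and their numbers are made-up placeholders, not WHO data, and the function names are illustrative.

```python
import math
from itertools import combinations

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def similarity_graph(profiles, t=0.95):
    """Edges between vertices whose profiles have Pearson correlation >= t."""
    return [(a, b) for a, b in combinations(sorted(profiles), 2)
            if pearson(profiles[a], profiles[b]) >= t]

# Placeholder profiles: one value per category for each vertex.
profiles = {
    "c1": [10, 20, 30, 40],
    "c2": [11, 19, 31, 42],
    "c3": [40, 30, 20, 10],
}
edges = similarity_graph(profiles)
```

The resulting edge list would then be passed to a clustering step such as Markov clustering.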


Health Disparities Graph

[Figure legend: Africa, Americas, East Mediterranean, Europe, South East Asia, Western Pacific]


Health Disparities Clustering

[Figure legend: Africa, Americas, East Mediterranean, Europe, South East Asia, Western Pacific]


7. Yeast Life Cycle Time Varying Graph


Yeast Life Cycle

Goal: determine how gene-gene associations change over the life cycle of a yeast cell.


Yeast Life Cycle Data

● Yeast gene expression data collected from synchronised cultures

● 24 time points at 10-minute increments

● 6,178 genes

[Figure axes: time points, genes]


Expression Over Time

[Figure: normalized expression value vs. time (s)]


Yeast Example Process

1. Removed genes with:
  ○ More than 4 missing values
  ○ Low variance over all time points (< 0.6)

2. Calculated all-to-all pairwise Spearman correlations for each time step

3. Attempted spectral thresholding, which did not work
  ○ Used a hard cutoff of 0.8 for all graphs instead

4. Metric calculations
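The gene-filtering step (step 1) can be sketched as follows. The use of sample variance over the non-missing values, the `None` marker for missing values, and the gene names are assumptions; the slides do not say how variance was computed. Each gene is assumed to have at least two observed values.

```python
def variance(values):
    """Sample variance (n - 1 denominator) of the non-missing values."""
    xs = [v for v in values if v is not None]
    mean = sum(xs) / len(xs)
    return sum((v - mean) ** 2 for v in xs) / (len(xs) - 1)

def filter_genes(expr, max_missing=4, min_variance=0.6):
    """Drop genes with more than max_missing missing values or variance below min_variance."""
    kept = {}
    for gene, values in expr.items():
        if sum(v is None for v in values) > max_missing:
            continue
        if variance(values) < min_variance:
            continue
        kept[gene] = values
    return kept

# Hypothetical expression profiles (None marks a missing measurement).
expr = {
    "g_flat": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],            # variance 0: removed
    "g_sparse": [None, None, None, None, None, 1.0, 3.0],  # 5 missing values: removed
    "g_ok": [0.0, 2.0, None, -2.0, 1.0, -1.0],            # variance 2.5: kept
}
kept = filter_genes(expr)
```

The surviving genes would then go through the pairwise Spearman correlation and the hard cutoff of 0.8 described in steps 2 and 3.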


Variance

[Figure: number of genes vs. variance]


Time-varying Co-expression Network

[Slides 72-77: figure of an expression matrix with rows G-1, G-2, … and columns t-1, t-2, …, t-M]


Time Graphs - Time 1 through Time 5

[Slides 78-82: one graph figure per time step]


8. Issues

● Figure out how to threshold

● Better metrics to pinpoint differences in time based graphs

● Network validation, particularly for less studied systems

● Noise in high-throughput data


Test Questions

1. Which thresholding method did we try to use?

2. Name a type of calculated edge.

3. Name one type of graph that can be used to represent metabolic networks.