Systematic Analysis of Interactome:
A New Trend in Bioinformatics
KOCSEA Technical Symposium 2010
Young-Rae Cho, Ph.D.
Assistant Professor
Department of Computer Science
Baylor University
History of Bioinformatics
Stage 1. Sequence Analysis
• Gene sequencing
• Sequence alignment
• Homolog search
• Motif finding
Stage 2. Structure Analysis
• Protein folding
• Homolog search
• Binding site prediction
• Function prediction
Stage 3. Expression Analysis
• Function prediction
• Gene clustering
• Sample classification
Stage 4. Network Analysis
• Network modeling
• Interaction prediction
• Function prediction
• Pathway identification
• Module detection
Related fields: Computational Biology, Functional Genomics, Systems Biology
Biological Networks
Definition
• Maps of biochemical reactions, interactions, and regulations between genes or proteins
Importance
• Provide insights into the mechanisms of molecular function within a cell
• Significant resource for functional characterization of genes or proteins
• Require computational and systematic approaches
Examples
• Metabolic networks
• Protein-protein interaction networks
• Genetic interaction networks
• Gene regulatory networks (Signal transduction networks)
Protein Interaction Networks
Determination
• Experimental methods: yeast two-hybrid (Y2H), mass spectrometry (MS), protein microarray
• Computational methods: homolog search, gene fusion analysis, phylogenetic profiles
• Genome-scale protein-protein interactions form the interactome
Representation
• Unweighted, undirected graph
Challenges
• Unreliability
• Large scale
• Complex connectivity
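The unweighted, undirected representation above can be sketched as a dictionary of adjacency sets. This is only an illustration: `build_network` and the protein identifiers `P1`, `P2`, `P3` are hypothetical names, not part of the talk.

```python
from collections import defaultdict

def build_network(interactions):
    """Build an unweighted, undirected graph from (protein, protein) pairs.

    Each interaction is stored in both directions, so duplicate or
    reversed pairs collapse into a single undirected edge.
    """
    adjacency = defaultdict(set)
    for a, b in interactions:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)

# Hypothetical interaction list; ("P2", "P1") repeats ("P1", "P2").
network = build_network([("P1", "P2"), ("P2", "P1"), ("P2", "P3")])
```

Adjacency sets keep neighbor lookups cheap, which matters for genome-scale networks.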
Network Re-structuring
Strategy
• Goal: to resolve complex connectivity
• Converts the complex graph to a hierarchical tree structure
• Uses the concepts of path strength, functional linkage, and centrality
Process
• Input: a protein interaction network
• Output: a list of functional modules
[Figure: processing pipeline]
unweighted network → (edge weighting) → weighted network → (functional linkage measurement) → score matrix → (network restructuring) → structured network → (hub confidence measurement) → hubs → (network clustering) → clusters
Path Strength
Path Strength Model
• Assumption: each node in a path chooses a succeeding edge based on the weighted probability
Path Strength Factors
• Edge weights
• Path length
• Node weighted degree
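The three factors above can be combined in a small sketch. It assumes, for illustration only, that the probability of node u choosing edge (u, v) is the edge weight divided by the weighted degree of u, and that the strength of a path is the product of these probabilities. The exact formula is not shown on the slide, so `step_probability`, `path_strength`, and the toy weights are hypothetical.

```python
from math import prod

# Toy weighted, undirected network: weights[u][v] == weights[v][u].
# Node names and weights are illustrative assumptions.
weights = {
    "A": {"B": 0.8, "C": 0.2},
    "B": {"A": 0.8, "C": 0.5},
    "C": {"A": 0.2, "B": 0.5},
}

def weighted_degree(node):
    """Sum of the weights of all edges incident to `node`."""
    return sum(weights[node].values())

def step_probability(u, v):
    """Assumed probability that u picks (u, v) as its succeeding edge."""
    return weights[u][v] / weighted_degree(u)

def path_strength(path):
    """Assumed strength of a path: product of its step probabilities."""
    return prod(step_probability(u, v) for u, v in zip(path, path[1:]))
```

Under this assumption every extra step multiplies in a factor of at most 1, so longer paths can never be stronger than their prefixes, and high-degree nodes dilute the strength of paths that pass through them.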
Functional Linkage
Measurement
• Path strength of the strongest path between two nodes
Computational problem
• Needs a heuristic approach
• Uses a user-specified threshold of the max path length
Formula
• k-length path strength:
• Functional linkage:
[Figure label: shortest path length threshold]
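The role of the max path length threshold can be sketched with a bounded search for the strongest path between two nodes. This is a minimal sketch under assumptions: a step from u to v contributes weight(u, v) divided by the weighted degree of u, only simple paths are explored, and the function name `strongest_path_strength` is hypothetical. How the k-length values are combined into the final functional linkage score is not shown on the slide, so the sketch simply returns the maximum.

```python
def strongest_path_strength(weights, source, target, max_len):
    """Strength of the strongest simple path with at most `max_len` edges.

    `weights[u][v]` is the weight of the undirected edge (u, v).
    Returns 0.0 if no path within the length threshold exists.
    """
    wdeg = {u: sum(nbrs.values()) for u, nbrs in weights.items()}
    best = 0.0
    # Each stack entry: (current node, strength so far, nodes visited).
    stack = [(source, 1.0, (source,))]
    while stack:
        node, strength, path = stack.pop()
        if node == target:
            best = max(best, strength)
            continue
        if len(path) - 1 == max_len:  # threshold reached, stop extending
            continue
        for nxt, w in weights[node].items():
            if nxt not in path:
                stack.append((nxt, strength * w / wdeg[node], path + (nxt,)))
    return best
```

The threshold bounds the search depth, which is what keeps the enumeration tractable on a large network; raising it can only keep or increase the returned strength.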
Network Restructuring
Centrality
• Weighted closeness:
Algorithm
• Computes centrality for each node a
• Selects a set of ancestor nodes, T(a), of a by
• Selects a parent node, p(a), of a by
[Figure: example]
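The selection rules for T(a) and p(a) are not shown on the slide, so the following is only one plausible reading, stated as assumptions: centrality is taken as the sum of a node's functional linkage scores, the ancestors of a are the nodes with higher centrality than a, and the parent is the ancestor with the strongest linkage to a. The function `restructure` and the toy scores are hypothetical.

```python
def restructure(linkage):
    """Turn a symmetric score matrix into a parent map (a tree or forest).

    `linkage[a][b]` is the functional linkage score between a and b.
    Assumptions (not confirmed by the slides):
      * centrality(a) = sum of linkage scores of a
      * T(a) = nodes with strictly higher centrality than a
      * p(a) = node in T(a) with the highest linkage to a
    """
    centrality = {a: sum(row.values()) for a, row in linkage.items()}
    parent = {}
    for a, row in linkage.items():
        ancestors = [b for b in row if centrality[b] > centrality[a]]
        parent[a] = max(ancestors, key=lambda b: row[b]) if ancestors else None
    return centrality, parent

# Toy symmetric score matrix (illustrative values only).
scores = {
    "a": {"b": 0.6, "c": 0.5, "d": 0.1},
    "b": {"a": 0.6, "c": 0.1, "d": 0.4},
    "c": {"a": 0.5, "b": 0.1, "d": 0.1},
    "d": {"a": 0.1, "b": 0.4, "c": 0.1},
}
centrality, parent = restructure(scores)
```

Because each node's parent always has strictly higher centrality, the parent map cannot contain a cycle, so the result is a hierarchy rooted at the most central node.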
Hub Confidence
Measurement
• Selects a set of child nodes, D(a), of a by
• Selects a set of descendant nodes, La, of a by
• Computes the hub confidence, H(a), of a by
[Figure: example]
Clustering
Algorithm
• Iteratively selects a hub a with the highest hub confidence
• Outputs the sub-tree La including a as a cluster (functional module)
Cluster Depth
• The max path length from the root of the sub-tree to a leaf
[Figure: example]
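The clustering step and the cluster depth can be sketched on a tree given as a parent map. Hub confidence values are taken as an input here because their formula is not shown; the function names and the choice to let clusters nest (rather than removing nodes after each selection) are assumptions made for illustration.

```python
from collections import defaultdict

def build_children(parent):
    """Invert a {node: parent or None} map into {parent: [children]}."""
    children = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)
    return children

def subtree(children, root):
    """All nodes in the sub-tree rooted at `root`, including the root."""
    nodes, stack = set(), [root]
    while stack:
        node = stack.pop()
        nodes.add(node)
        stack.extend(children.get(node, []))
    return nodes

def cluster_depth(children, root):
    """Max path length (in edges) from `root` down to a leaf."""
    kids = children.get(root, [])
    return 0 if not kids else 1 + max(cluster_depth(children, k) for k in kids)

def select_clusters(parent, hub_confidence, num_clusters):
    """Pick hubs by descending confidence; return (hub, members, depth)."""
    children = build_children(parent)
    ranked = sorted(hub_confidence, key=hub_confidence.get, reverse=True)
    return [(hub, subtree(children, hub), cluster_depth(children, hub))
            for hub in ranked[:num_clusters]]
```

With nesting allowed, a hub near the root yields a deep, general module and a hub lower in the tree yields a shallower, more specific one.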
Topological Assessment of Hubs
Network Vulnerability
• Random attack: repeatedly disrupts a randomly selected node
• Degree-based hub attack: repeatedly disrupts the highest-degree node
• Structural hub attack: repeatedly disrupts the node with the highest hub confidence
• For each iteration, observes the largest component
Results
[Figure: fraction of largest component vs. number of nodes, for random attack, degree-based hub attack, and structural hub attack]
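The three attack strategies can be simulated with one loop that removes a node per iteration and records the largest component. This is a sketch under assumptions: the fraction is taken relative to the original number of nodes, the degree-based attack recomputes degrees among surviving nodes, and hub confidence scores are supplied from outside. All function names are hypothetical.

```python
import random

def largest_component_size(adj, removed):
    """Size of the largest connected component among surviving nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        best = max(best, size)
    return best

def attack(adj, pick_next, steps):
    """Remove `steps` nodes (steps <= len(adj)); return the fraction of
    the original network in the largest component after each removal."""
    removed, fractions = set(), []
    for _ in range(steps):
        alive = [n for n in adj if n not in removed]
        removed.add(pick_next(adj, alive, removed))
        fractions.append(largest_component_size(adj, removed) / len(adj))
    return fractions

def random_pick(adj, alive, removed):
    """Random attack: any surviving node."""
    return random.choice(alive)

def degree_pick(adj, alive, removed):
    """Degree-based hub attack: highest degree among surviving nodes."""
    return max(alive, key=lambda n: sum(m not in removed for m in adj[n]))

def ranked_pick(scores):
    """Structural hub attack: highest externally supplied score first."""
    def pick(adj, alive, removed):
        return max(alive, key=lambda n: scores.get(n, 0.0))
    return pick
```

Passing `ranked_pick(hub_confidence)` as `pick_next` gives the structural hub attack once hub confidence values are available.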
Biological Assessment of Hubs
Protein Lethality
• Determines lethal / viable proteins by knock-out experiment
• Lethality represents functional essentiality
• Orders proteins by degree and hub confidence
• Observes the cumulative proportion of lethal proteins for every 10 proteins
Results
[Figure: average lethality vs. number of hubs, for structural hubs and degree-based hubs]
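The cumulative lethality measurement described above can be sketched as follows. The ranked list and the set of lethal proteins are assumed inputs (for example, proteins ordered by degree or by hub confidence, and knock-out annotations); `cumulative_lethality` is a hypothetical name.

```python
def cumulative_lethality(ranked_proteins, lethal, step=10):
    """Cumulative proportion of lethal proteins at every `step` proteins.

    `ranked_proteins` is ordered from the strongest hub downward;
    `lethal` is the set of proteins whose knock-out is lethal.
    Returns a list of (number of proteins considered, proportion lethal).
    """
    results = []
    lethal_so_far = 0
    for count, protein in enumerate(ranked_proteins, start=1):
        lethal_so_far += protein in lethal
        if count % step == 0:
            results.append((count, lethal_so_far / count))
    return results
```

Running it once on the degree-based ordering and once on the hub-confidence ordering gives the two curves to compare.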
Topological Assessment of Clusters
Modularity
• A combined measure of density within each cluster and separability among clusters
• Estimated by the ratio of the number of edges within a cluster (sub-graph) to the number of all edges starting from the nodes in the cluster (sub-graph)
• Observes the average modularity of clusters with respect to the cluster depth
Results
• More specific functional modules have higher modularity
• Justifies the general-to-specific concept of hierarchical functional modules
[Figure: average modularity vs. cluster depth (1 to 12)]
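The modularity ratio defined above can be computed directly from an adjacency map. A minimal sketch, assuming an undirected network stored as adjacency sets; `cluster_modularity` is a hypothetical name.

```python
def cluster_modularity(adj, cluster):
    """Edges inside the cluster divided by all edges touching the cluster.

    `adj` maps each node to the set of its neighbors (undirected).
    Each undirected edge is counted once by storing it as a frozenset.
    """
    members = set(cluster)
    inside, touching = set(), set()
    for u in members:
        for v in adj[u]:
            edge = frozenset((u, v))
            touching.add(edge)
            if v in members:
                inside.add(edge)
    return len(inside) / len(touching) if touching else 0.0
```

A value of 1.0 means the cluster has no edges leaving it; values near 0 mean most of its edges lead outside.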
Biological Assessment of Clusters
f-Measure
• Compares each output cluster X with the real functional annotation Y (from MIPS)
• Recall = (# of common proteins of X and Y) / (# of proteins in Y)
• Precision = (# of common proteins of X and Y) / (# of proteins in X)
• f-measure = 2 × Recall × Precision / (Recall + Precision)
Results
• Compared with the results from previous hierarchical clustering methods, e.g., edge-betweenness (top-down approach) and ProDistIn (bottom-up approach)
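The recall, precision, and f-measure formulas above translate directly into code. The function name `f_measure` and the example protein labels are hypothetical; returning 0.0 when X and Y share no proteins is an added convention to avoid dividing by zero.

```python
def f_measure(cluster, annotation):
    """f-measure of an output cluster X against a functional annotation Y."""
    x, y = set(cluster), set(annotation)
    common = len(x & y)
    if common == 0:
        return 0.0  # convention: no overlap scores zero
    recall = common / len(y)
    precision = common / len(x)
    return 2 * recall * precision / (recall + precision)

# X has 4 proteins, Y has 3, and they share 2:
# recall = 2/3, precision = 1/2, f-measure = 4/7.
score = f_measure({"a", "b", "c", "d"}, {"c", "d", "e"})
```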
Conclusion
Motivation
• Significant functional knowledge in protein interaction networks (interactome)
• Complex connectivity
Contributions
• Converts an unstructured network to a structured network
• Conserves functional information through pathways
• Identifies hubs with high network vulnerability and low functional lethality as potential drug targets
• Applicable to various fields, e.g., social networks, WWW
• Foundation of structural dynamics during network evolution