Lecture 8. Topics in Biological Networks (Basics) The Chinese University of Hong Kong CSCI5050 Bioinformatics and Computational Biology


CSCI5050 Bioinformatics and Computational Biology | Kevin Yip-cse-cuhk | Fall 2015

Lecture outline
1. Definition and different types of biological networks
2. Some high-throughput experimental methods for probing biological networks
  – Important databases
3. Some computational methods for reconstructing biological networks
4. Data analysis
  – Analyzing the networks
  – Using the networks to analyze other data
  – Visualization and analysis tools

Last update: 22-Oct-2015

DEFINITION AND TYPES

Part 1

Biological networks
• A biological network is represented by a graph $G = (V, E)$
  – $V$: a set of nodes (vertices). Each node $v_i \in V$ represents an object
    • A gene, protein, metabolite, drug, ...
  – $E$: a set of edges. Each edge $e_{ij} \in E$ connects two nodes $v_i$ and $v_j$, and represents a relationship between the two objects
    • Protein-protein interaction (PPI), gene regulation, ...
    • Undirected ($e_{ij} \in E \Leftrightarrow e_{ji} \in E$, e.g., PPI) or directed ($e_{ij} \in E$ does not imply $e_{ji} \in E$, e.g., gene regulation)
  – May have additional node and edge attributes, such as confidence of interaction

[Figure: example graph with nodes $v_1$, $v_2$, $v_3$, $v_4$]
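The graph definition above can be sketched in a few lines of Python. This is a minimal illustration, not a format used in the lecture: the function name `build_adjacency`, the node names, and the attribute keys are all hypothetical.

```python
from collections import defaultdict

def build_adjacency(edges, directed=False):
    """Return {node: {neighbor: attributes}} from (u, v, attributes) triples."""
    adjacency = defaultdict(dict)
    for u, v, attributes in edges:
        adjacency[u][v] = attributes
        adjacency.setdefault(v, {})      # make sure v is listed even without out-edges
        if not directed:
            adjacency[v][u] = attributes  # undirected: e_ij in E implies e_ji in E
    return dict(adjacency)

# Hypothetical PPI edges (undirected), with a confidence attribute.
ppi = build_adjacency([("v1", "v2", {"confidence": 0.9}),
                       ("v2", "v3", {"confidence": 0.4})])

# Hypothetical regulatory edge (directed): v1 regulates v2, not vice versa.
regulation = build_adjacency([("v1", "v2", {"sign": "activation"})], directed=True)
```

In the undirected case each edge is stored twice, once per endpoint, which keeps neighbor lookups symmetric at the cost of some memory.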

Network types
• Gene regulatory networks [project]
  – Transcription factor binding
    • Promoters
    • Distal regulatory elements
  – Micro-RNA [project]
• Co-expression networks
• Protein-protein interaction networks (lecture)
• Genetic interaction networks [project]
• Metabolic networks [project]
• Gene-drug interaction networks [project]
• Signaling networks
• Neural networks
• Disease transmission networks
• Phylogenetic networks
• Food web
• ...

[Slide annotations grouping the network types by scale: Molecular: DNA; Molecular: RNA; Molecular: proteins; Cellular: pathways; Multi-cellular; Inter-organism; Inter-species]

TF regulatory networks
• Each node represents a gene and the protein(s) that it encodes
• An edge $e_{ij}$ exists if $v_i$ represents a transcription factor (TF) and it regulates the gene represented by $v_j$
  – Edges are directed
  – Edges should be signed (activation vs. repression), although this information is usually unavailable
  – May have edge weights to indicate confidence
  – Should record only direct regulation
  – The network itself does not provide information about the relationships between different edges
• Other types of gene regulatory networks (e.g., miRNA networks) are defined in similar ways

Image credit: Deneris and Wyler, Nature Neuroscience, published online 26 February 2012

Co-expression networks
• Each node represents a gene
• An edge $e_{ij}$ exists if the genes represented by $v_i$ and $v_j$ co-express
  – Co-expression could be measured by correlation across multiple samples/conditions
    • May have edge weights to represent the degree of co-expression
  – Edges are usually undirected
    • Unless measures like expression ranks are used
  – It is usually more meaningful to measure protein abundance, but RNA levels are easier to measure
  – Co-expression may suggest functional relationships

Image credit: Prieto et al., PLoS One 3(12):e3911, (2008)

Node color indicates some network statistics to be explained later.
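As a rough sketch of how such a network could be built, the code below connects two genes when the Pearson correlation of their expression profiles is strong. The threshold of 0.8, the use of the absolute value (so that strong negative correlation also counts), and the gene names and values are illustrative assumptions, not choices made in the lecture.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def coexpression_edges(expression, threshold=0.8):
    """Return {(gene_i, gene_j): r} for gene pairs with |r| >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r   # r doubles as the edge weight
    return edges

# Hypothetical expression levels of three genes across five samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],   # rises together with geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # no clear trend with geneA
}
edges = coexpression_edges(expression)
```

Only the geneA-geneB pair passes the threshold here. Because every gene pair is examined, the running time grows quadratically with the number of genes.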

Protein-protein interaction (PPI) networks
• Each node represents a protein
• An edge $e_{ij}$ exists if the proteins represented by $v_i$ and $v_j$ physically interact
  – Edges are undirected
  – Permanent and transient interactions are usually not distinguished
  – In some datasets/databases, $e_{ij}$ simply indicates that the proteins represented by $v_i$ and $v_j$ both participate in a complex, but they may not physically interact directly
  – Whether different interactions can happen simultaneously is usually not considered
  – There are networks for specific types of interactions, e.g., phosphorylation networks

[Figure: Human Calcineurin heterodimer (1AUI)]

Image source: RCSB Protein Data Bank

Genetic interaction networks
• The term “genetic interaction” in general means any type of relationship between genes
• Specifically, it has been used to describe some particular types of scenarios:
  – Each node represents a gene
  – An edge $e_{ij}$ exists if the growth rate of the cell is affected by the knockout/knockdown/overdose of the genes as shown in the table
  – Depending on the type, the edges can be directed or undirected

Type                      Definition
Synthetic lethality       0 = ij < i, j
Synthetic sick            0 < ij < i, j
Synthetic rescue          0 ≤ i < ij
Dosage lethality          0 = ij* < i
Dosage sick               0 < ij* < i
Dosage rescue             0 ≤ i < ij*
Phenotypic enhancement    ij < E[ij]
Phenotypic suppression    E[ij] < ij

*: overdose

Image credit: Drees et al., Genome Biology 6(4):R38, (2005)
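The first three rows of the table can be read as a simple decision rule. The sketch below assumes that i, j, and ij stand for the growth rates observed when gene i, gene j, or both are knocked out/down; that reading, the function name `classify_pair`, and the example values are assumptions. The dosage and phenotypic types are left out, and real measurements would need a tolerance rather than an exact comparison with zero.

```python
def classify_pair(rate_i, rate_j, rate_ij):
    """Classify a gene pair from non-negative growth rates.

    Assumption: rate_i, rate_j, rate_ij play the roles of i, j, ij in the table.
    """
    if rate_ij < min(rate_i, rate_j):
        # Double mutant grows worse than both single mutants.
        return "synthetic lethality" if rate_ij == 0 else "synthetic sick"
    if rate_i < rate_ij or rate_j < rate_ij:
        # Double mutant grows better than one of the single mutants.
        return "synthetic rescue"
    return "none of the three synthetic types"

# Hypothetical growth rates relative to the wild type.
examples = {
    "lethal": classify_pair(1.0, 0.9, 0.0),
    "sick": classify_pair(1.0, 0.9, 0.3),
    "rescue": classify_pair(0.0, 1.0, 0.5),
    "neutral": classify_pair(1.0, 1.0, 1.0),
}
```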

Metabolic pathways
• Each node is a metabolite
• An edge $e_{ij}$ exists if there is a reaction that turns the metabolite represented by $v_i$ into the metabolite represented by $v_j$
  – Edges are directed
  – Both $e_{ij}$ and $e_{ji}$ exist if the reaction is reversible
  – Each edge is labeled by the enzyme that accelerates the reaction in the cell
• There is a dual representation, in which each node is a reaction, and an edge $e_{ij}$ exists if the reaction represented by $v_i$ produces a product that is a substrate of the reaction represented by $v_j$


Metabolic pathways: an example


Image source: Kyoto Encyclopedia of Genes and Genomes


Signaling pathways
• Describing the events that happen in a cell in response to an external signal
• A heterogeneous network involving different types of data
  – Protein-protein interaction
    • Phosphorylation
  – Gene regulation
  – ...

Image source: Kyoto Encyclopedia of Genes and Genomes

Handling many types of relationship
• Need a systematic way to represent the many different types of relationship

Image credit: Lu et al., Trends in Biochemical Sciences 32(7):320-331, (2007)

Phylogenetic networks
• Generalization of phylogenetic trees, allowing non-tree structures (i.e., cycles, due to, for example, horizontal gene transfers)
• Each node is a species/clade
• An edge $e_{ij}$ exists if the species represented by $v_j$ diverged from, or received genetic material from, the species represented by $v_i$
  – Network based on a single gene vs. network based on the whole genome of a species

[Figure panels: Phylogenetic tree; Phylogenetic network]

Image credit: Wikipedia; Smets and Barkay, Nature Reviews Microbiology 3(9):675-678, (2005)


HIGH-THROUGHPUT EXPERIMENTAL METHODS

Part 2

Probing gene regulatory networks
• Transcription factor binding targets
  – Chromatin immunoprecipitation followed by
    • Microarray (ChIP-chip)
    • Sequencing (ChIP-seq)
• miRNA targets
  – Over-expression/silencing of miRNA, followed by profiling of changes in mRNA/protein levels
    • Including direct and indirect targets
  – Cross-linking immunoprecipitation-high-throughput sequencing (CLIP-seq)

PPI: Yeast-two-hybrid (Y2H)
• To test whether two proteins physically interact
• Fuse one protein with a DNA binding domain (BD)
• Fuse the other with an activation domain (AD)
• If the two proteins physically interact, a reporter gene is expressed
• Can fix the first protein (the “bait”), and try many different second proteins (the “preys”)

Image source: Wikipedia

Protein complex: TAP-MS
• Tandem affinity purification followed by mass spectrometry
  – Adding a TAP tag to a bait protein
  – The protein and other proteins that bind to it (directly or indirectly) bind to IgG beads, while other proteins are washed away
  – The identity of the proteins pulled down in this way can be determined by mass spectrometry

Image source: Wikipedia

Synthetic lethality
• There are different methods
• One of them is Synthetic Genetic Array (SGA)
  – Create single mutation strains of different mating types
  – Mate and select for double mutation
  – Growth rate measured by visual inspection or image analysis of colony size

Image credit: Tong et al., Science 294(5550):2364-2368, (2001)

Databases
• There are many databases for biological networks
• BioGRID is a general database for various types of interactions in multiple species
• Gene Expression Omnibus (GEO) contains a lot of gene expression data
• The Kyoto Encyclopedia of Genes and Genomes (KEGG) contains information about pathways
• The Protein Data Bank (PDB) contains some crystal structures of interacting biological objects
• There are species-specific databases
  – Human Protein Reference Database (HPRD)
  – Saccharomyces Genome Database (SGD)
  – ...
• There are also databases that integrate other databases
  – Biological Networks database (IntegromeDB)

File formats
• Two main ways to store networks:
  – Adjacency matrix
  – Adjacency list
• Since most biological networks are sparse, the adjacency list is more commonly used
• Simplest formats:
  – <Object 1><Tab><Object 2>
  – Simple interaction format (SIF): <Object 1><Tab><Type><Tab><Object 2>
  – XML
  – Formats with visualization information (e.g., GML)
(See http://wiki.cytoscape.org/Cytoscape_User_Manual/Network_Formats for some commonly used formats)
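A minimal sketch of reading the three-column format into an adjacency list is shown below. It handles only the one-target-per-line form given on the slide; the function name `parse_sif`, the gene names, and the type labels "pp" and "pd" are chosen for illustration.

```python
from collections import defaultdict

def parse_sif(lines):
    """Parse '<Object 1><Tab><Type><Tab><Object 2>' lines into an adjacency list."""
    adjacency = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        source, interaction_type, target = line.split("\t")
        adjacency[source].append((interaction_type, target))
    return dict(adjacency)

# Hypothetical file content, one interaction per line.
sif_lines = ["geneA\tpp\tgeneB", "geneA\tpd\tgeneC", "geneB\tpp\tgeneC"]
network = parse_sif(sif_lines)
```

Only objects that actually have interactions take up space, which is why this layout suits sparse networks better than an n-by-n matrix.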


COMPUTATIONAL NETWORK RECONSTRUCTION METHODS

Part 3

Problem definition
• Network reconstruction as a machine learning problem
  – As a low-cost supplement to experimental methods
• Inputs
  – A set of nodes $V$, each node $v_i$ described by a vector of features $x_i$
  – Each node pair $(v_i, v_j)$ described by a (potentially empty) vector of features $z_{ij}$
  – A (potentially empty) set of positive example edges $E^+ \subseteq V \times V$ (ideally $E^+ \subseteq E$)
  – A (potentially empty) set of negative example edges $E^- \subseteq V \times V$
• Goal: For each node pair $(v_i, v_j)$, determine whether the edge $(v_i, v_j)$ is in the unknown set of edges, $E$
• Evaluating accuracy of predictions:
  – Cross-validation (using some examples for training and some for testing; repeat for different training/testing splits)
  – Functional enrichment analysis
  – Experimental validation
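The cross-validation step can be sketched as a k-fold split of the example node pairs. The function name `k_fold_splits`, the value k = 5, and the toy examples are assumptions for illustration; the learning method itself is left out.

```python
import random

def k_fold_splits(examples, k=5, seed=0):
    """Yield (training, testing) lists so that each example is tested exactly once."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        testing = folds[i]
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        yield training, testing

# Hypothetical labeled node pairs (v_i, v_j, label): label 1 for E+, 0 for E-.
examples = [(f"v{i}", f"v{i + 1}", i % 2) for i in range(10)]
splits = list(k_fold_splits(examples, k=5))
```

A model would be trained on each `training` list and scored on the matching `testing` list, with the scores averaged over the k splits.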

Example: TF regulation
• Inputs
  – $V$: the set of all genes (TFs and non-TFs), each node $v_i$ described by a vector $x_i$ of node features:
    • Expression level of the gene at different time points
    • Sequence at the promoter region of the gene
    • ...
  – Each node pair $(v_i, v_j)$ is described by a vector $z_{ij}$ of features:
    • (If $v_i$ represents a TF) Binding signal of the TF represented by $v_i$ at the promoter region of the gene represented by $v_j$
    • (If $v_i$ represents a TF) Expression of the gene represented by $v_j$ when the gene represented by $v_i$ is knocked out/down
    • ...
  – In some settings, there are no input positive examples
  – Usually there are no negative examples
• Goal: Determine which genes each TF regulates (and how, i.e., activation vs. repression, coefficients, etc.)

Some common difficulties
• Big data size ($O(n^2)$ node pairs for $n$ nodes)
  – Long computational time
  – Large memory consumption
• Small number of positive examples
• Noisy positive examples (false positives)
• Lack of negative examples
• How node features should be used to predict edges is not trivial
• Weak features
• Non-linear relationship between features and class (interaction/no interaction)
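To make the $O(n^2)$ growth concrete, the number of unordered node pairs can be worked out directly. The value $n = 6000$ is an illustrative assumption (roughly the scale of a small genome), not a figure from the lecture.

```latex
\[
\binom{n}{2} = \frac{n(n-1)}{2},
\qquad
n = 6000 \;\Rightarrow\; \frac{6000 \times 5999}{2} = 17{,}997{,}000 \ \text{unordered pairs}
\]
\[
\text{Directed networks: } n(n-1) = 6000 \times 5999 = 35{,}994{,}000 \ \text{ordered pairs}
\]
```

Even a short feature vector per pair therefore adds up to a large table, which is where the time and memory costs listed above come from.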

DATA ANALYSIS

Part 4

How to interpret these hair balls?

[Figure panels: Transcription factor binding; Protein-protein interactions; Phosphorylation; Metabolic; Genetic interactions]

Image credit: Zhu et al., Genes & Development 21(9):1010-1024, (2007)

Interpreting biological networks
• Network statistics
  – Identifying important nodes/edges
• Network generation process
  – Understanding the formation/evolution of networks
• Network modules
  – Identifying functional object groups
• Network motifs
  – Understanding working principles

Network statistics
• Some statistics about a network:
  – Degree of a node: the number of edges incident on the node
    • In-degree and out-degree for a directed graph
  – Clustering coefficient of a node: the fraction of pairs of neighbors of the node that are connected to each other
  – Shortest path length between two nodes
  – Eccentricity of a node: the maximum of its shortest path lengths to all other nodes
  – Betweenness of a node: the number of shortest paths that pass through the node
    • Similar definition for the betweenness of an edge
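Most of these statistics take only a few lines to compute on an undirected, unweighted network stored as a dictionary of neighbor sets. The sketch below is one straightforward way to do it; the function names and the four-node example are hypothetical.

```python
from collections import deque
from itertools import combinations

def degree(adj, node):
    """Number of edges incident on `node`."""
    return len(adj[node])

def clustering_coefficient(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = adj[node]
    if len(neighbors) < 2:
        return 0.0
    pairs = list(combinations(neighbors, 2))
    linked = sum(1 for a, b in pairs if b in adj[a])
    return linked / len(pairs)

def shortest_path_lengths(adj, source):
    """Breadth-first search: edges on a shortest path to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentricity(adj, node):
    """Largest shortest-path length from `node` (over reachable nodes only)."""
    return max(shortest_path_lengths(adj, node).values())

# Hypothetical undirected network: a triangle a-b-c with a tail c-d.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
```

In this example node c has degree 3 and clustering coefficient 1/3, since only one of its three neighbor pairs (a, b) is connected.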

Identifying important objects
• A hub is an object with a large degree
  – It is likely important because, if it is disrupted, many interactions could be affected
• A bottleneck is an object with a large betweenness
  – It is likely important because, if it is disrupted, the information flow between many node pairs could be affected

Image credit: Yu et al., PLoS Computational Biology 3(4):e59, (2007)
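Bottlenecks can be found by computing betweenness for every node. The sketch below uses Brandes' algorithm, a standard method that is not named in the lecture, on an undirected, unweighted network. When several shortest paths connect a pair, each contributes a fraction, which is a common refinement of the simple path count.

```python
from collections import deque

def betweenness(adj):
    """Node betweenness for an unweighted, undirected graph (Brandes' algorithm).

    A node's score sums, over node pairs (s, t) not including the node, the
    fraction of shortest s-t paths that pass through it.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # In an undirected graph each pair was counted from both endpoints.
    return {v: value / 2 for v, value in score.items()}

# Hypothetical network: a triangle a-b-c with a tail c-d.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
scores = betweenness(adj)
```

Here c is the only node on the shortest paths a-d and b-d, so it is the bottleneck even though the network is too small to have a clear hub.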

Degree distribution
• It has been found that in many biological (and non-biological) networks, the degree distribution has a long tail
  – Most nodes have few interactions
  – A few nodes have many interactions
  – It has been proposed that these networks are “scale-free”, where the degree distribution follows a power law: $P(k) \sim c k^{-\gamma}$ (usually $2 < \gamma < 3$)
    • Preferential attachment is one way to produce a scale-free network: the rich become richer, the poor become poorer

[Figure panels: An Erdős-Rényi random network; A scale-free network]

Image source: Wikipedia
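Preferential attachment is easy to simulate. In the sketch below each new node attaches to a single existing node chosen with probability proportional to its current degree; the single-edge start, one link per new node, and the network size are simplifying assumptions.

```python
import random
from collections import Counter

def preferential_attachment(n_nodes, seed=0):
    """Grow a network one node at a time; each new node links to one old node
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]      # start from a single edge
    endpoints = [0, 1]    # each node appears once per incident edge
    for new in range(2, n_nodes):
        old = rng.choice(endpoints)   # picking an endpoint is degree-proportional
        edges.append((new, old))
        endpoints.extend([new, old])
    return edges

edges = preferential_attachment(1000)
degree = Counter(node for edge in edges for node in edge)
degree_counts = Counter(degree.values())   # degree k -> number of nodes with degree k
```

Plotting `degree_counts` on log-log axes is the usual way to check for a long tail; a roughly straight line would be consistent with a power law.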

Identifying important pathways
• In functional enrichment analysis, we check if an unexpectedly large fraction of genes in a target set share a common annotation
• This idea can be generalized: whether the genes in a target set are unexpectedly similar to each other
• A biological network provides a natural way to compute similarity
  – Finding clusters of genes with many direct connections (similar to finding protein complexes from PPI)
    • Alternatively, finding such highly-connected modules could suggest gene sets for performing standard functional enrichment analysis
  – Finding clusters of genes that are close to each other in the network
  – Finding genes (in the target set or not) that are close to the genes in the target set
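One common way to quantify “unexpectedly large” is a hypergeometric test, sketched below. The lecture does not say which test it has in mind, so the choice of test, the function name, and the numbers are assumptions.

```python
from math import comb

def enrichment_p_value(population, annotated, target, overlap):
    """P(X >= overlap) when `target` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the annotation."""
    total = comb(population, target)
    upper = min(annotated, target)
    favourable = sum(comb(annotated, k) * comb(population - annotated, target - k)
                     for k in range(overlap, upper + 1))
    return favourable / total

# Hypothetical numbers: 20 genes in total, 5 of them annotated,
# and a target set of 4 genes of which 3 are annotated.
p = enrichment_p_value(population=20, annotated=5, target=4, overlap=3)
```

A small p-value suggests that the overlap is larger than expected by chance. When many annotations are tested, a multiple-testing correction would also be needed.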


Network modules


Image credit: Palla et al., Nature 435(7043):814-818, (2005)


Network modules


Image credit: Costanzo et al., Science 327(5964):425-431, (2010)

Genetic interaction network
• Between-pathway vs. within-pathway explanations for negative interactions (phenotype of the double knock-out worse than expected based on the two single knock-outs)

Image credit: Dixon et al., Annual Review of Genetics 43(1):601-625, (2009)


Phenotype-associating sub-networks

• A biological network can also be used to find consistent signals in sub-networks (and average out noise)


Image credit: Chuang et al., Molecular Systems Biology 3:140, (2007)


Biological networks and network motifs


Image credit: Milo et al., Science 298(5594):824-827, (2002)

Statistical significance of a motif
• To evaluate whether a pattern is over-represented, we want to know how many such patterns would be found in a “random” network
• How to form a random network?
  – Erdős-Rényi random graphs: define the nodes, then each edge appears with a certain probability
    • Not close to reality in many cases
  – Price/Barabási-Albert model: add the nodes one by one, where the chance for the new node to connect to an old node is proportional to the number of edges the old node already has
    • Closer to reality
  – Permuting the graph by reconnecting edges
    • Preserving the total number of nodes
    • Preserving the total number of edges
    • Preserving the number of edges of each node
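The third option, permuting the graph while preserving every node's degree, can be sketched with repeated edge swaps. The version below assumes an undirected network without self-loops or duplicate edges; the function names and the example network are hypothetical.

```python
import random
from collections import Counter

def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Randomize an undirected simple graph by swapping edge endpoints.

    Each accepted swap turns (a, b), (c, d) into (a, d), (c, b), so every node
    keeps its degree. Swaps that would create a self-loop or a duplicate edge
    are skipped.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present or new1 == new2:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def degrees(edges):
    return Counter(node for edge in edges for node in edge)

# Hypothetical network: a six-node ring with one chord.
original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
shuffled = degree_preserving_shuffle(original, n_swaps=100)
```

Counting a pattern in many such shuffled networks gives a background distribution against which the count in the real network can be compared.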


Statistical significance


Image credit: Milo et al., Science 298(5594):824-827, (2002)


Actual numbers observed


Image credit: Milo et al., Science 298(5594):824-827, (2002)

Possible functions of network motifs
• A coherent feed-forward loop can reject rapid variations in the input, so that output is produced only when there is a persistent input
• A single input motif (SIM) can turn on and turn off several downstream devices at different times according to their activation thresholds

[Figure labels: X, Y, Z]

Image credit: Shen-Orr et al., Nature Genetics 31(1):64-68, (2002)
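The filtering behavior of a coherent feed-forward loop can be illustrated with a toy discrete-time model. The AND logic at Z, the fixed activation delay for Y, and the immediate switch-off are modeling assumptions made here for illustration; they are not details given on the slide.

```python
def simulate_ffl(x_signal, delay=3):
    """Toy coherent feed-forward loop X -> Y -> Z with X -> Z and AND logic at Z.

    Assumption: Y becomes active only after `delay` consecutive steps of
    active X, and switches off as soon as X does.
    """
    z_signal = []
    steps_on = 0
    for x in x_signal:
        steps_on = steps_on + 1 if x else 0
        y = steps_on >= delay
        z_signal.append(int(bool(x) and y))   # Z needs both X and Y
    return z_signal

brief_pulse = [0, 1, 1, 0, 0, 0, 0, 0]
persistent = [0, 1, 1, 1, 1, 1, 0, 0]
z_brief = simulate_ffl(brief_pulse)
z_persistent = simulate_ffl(persistent)
```

The two-step pulse never activates Z, while the five-step input turns Z on from its third active step until the input stops.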

Visualization tools
• Aisee: a tool for generating network figures in vector format
• Cytoscape: one of the most popular tools; a visualization tool and a platform with many open-source plugins for various types of analysis
• JUNG
• N-Browse
• Osprey
• Pajek
• tYNA
• ...

Analysis tools
• Some of the tools listed on the last slide
• GeneSpring: popular tool for pathway analysis (commonly used for microarray data)
• GraphWeb
• HCE, Weka, ... (for clustering and other types of data mining/machine learning tasks)
• NetBox
• Pandora
(See http://wiki.reactome.org/index.php/Reactome_Resource_Guide for a long list of tools)

Summary
• There are many types of biological networks
  – Gene regulatory
  – Protein-protein interaction
  – Metabolic
  – ...
• There are high-throughput experimental methods for identifying the interactions
• There are also many computational methods for supplementing the noisy networks from experimental data
• Networks can be used to study object relationships, identify important objects and modules, and find associations with a phenotype