Patching the Puzzle of Genetic Network Grace S. Shieh Institute of Statistical Science, Academia Sinica [email protected]


Page 1: Patching the Puzzle of Genetic Network

Patching the Puzzle of Genetic Network

Grace S. Shieh

Institute of Statistical Science, Academia Sinica

[email protected]

Page 2: Patching the Puzzle of Genetic Network
Page 3: Patching the Puzzle of Genetic Network

Outline

What is a genetic network?

Why is this area one of the frontiers?

How do statistical modeling and computational algorithms simplify the complex puzzle?

Applications

Page 4: Patching the Puzzle of Genetic Network

Dogma of biology

DNA -> mRNA -> Protein

Proteins: the elements that function in organisms, e.g. yeast and human.

Page 5: Patching the Puzzle of Genetic Network

Somatic mutations affect key pathways in lung adenocarcinoma (Nature, Oct. 2008)

Page 6: Patching the Puzzle of Genetic Network

Science, Sept, 2008

Page 7: Patching the Puzzle of Genetic Network
Page 8: Patching the Puzzle of Genetic Network

Complex human disease

Digenic effects may underlie: type II diabetes, schizophrenia, retinitis pigmentosa and glaucoma

Tong et al., Science 2004

Page 9: Patching the Puzzle of Genetic Network

Complex human disease

These diseases may have similar synthetic effects in the yeast genetic interaction map

The topology of the genetic network in the neighborhood of SGS1 (Tong et al., 2004)

Elements of genetic networks derived from model organisms, e.g. yeast, are likely to be conserved

Page 10: Patching the Puzzle of Genetic Network

Experimental method to reveal genetic interactions

Systematic Genetic Analysis with Ordered Arrays of Yeast Deletion Mutants (Tong et al., 2001, Science)

Global mapping of the yeast genetic interaction network (Tong et al., 2004, Science)

Genome landscape of a cell (Costanzo et al., 2010, Science)

Page 11: Patching the Puzzle of Genetic Network

Costanzo et al., Science 2010

Page 12: Patching the Puzzle of Genetic Network

Synthetic sick or lethal (SSL) gene pairs: when both genes are mutated, the organism becomes sick or dies, although neither mutation alone is lethal

SSL is important for understanding how an organism tolerates genetic mutations

Hartman, Garvik and Hartwell, 2001, Science

Page 13: Patching the Puzzle of Genetic Network
Page 14: Patching the Puzzle of Genetic Network

Scenarios resulting in synthetic interaction

[Figure: four scenarios resulting in SSL, drawn as pathway diagrams with lettered nodes: (1) 2 partially redundant pathways; (2) partially redundant genes; (3) 3 partially redundant pathways, 2 required; (4) protein complex tolerating 1 but not 2 destabilizing mutations. Figure annotations: < 2%, < 4% *]

Page 15: Patching the Puzzle of Genetic Network

A Pattern Recognition Approach to Infer Gene Networks

Grace S. Shieh

joint work with C.-L. Chuang, C.-H. Jen and C.-M. Chen

Bioinformatics 2008

Page 16: Patching the Puzzle of Genetic Network

Excerpted from Tong et al. (2001) Science

Page 17: Patching the Puzzle of Genetic Network

Transcriptional compensation (TC) and transcriptional reverse compensation (TRC) interactions (Lesage et al., 2004; Wong & Roth, 2005, Genetics; Kafri et al., 2005, Nature Genetics):

among paralogues or SSL gene pairs, when one gene is mutated, its partner gene's expression increases (TC) or decreases (TRC)

Goal: to predict TC and TRC interactions among SSL gene pairs

Page 18: Patching the Puzzle of Genetic Network

Four sets of yeast (Saccharomyces cerevisiae) microarray gene expression data (Spellman et al., 1998) were used. The red channel R: intensities of yeast synchronized by alpha factor arrest, by arrest of a cdc15 or cdc28 mutant, and by elutriation. The green channel G: average of non-synchronized yeast.

Page 19: Patching the Puzzle of Genetic Network

Cell cycles of CLN2 gene

Page 20: Patching the Puzzle of Genetic Network

qRT-PCR experiments

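For a given pair of SSL genes:

Experimental group: gene A's expression, with gene B knocked out

Control group: gene A's expression, with gene B wild type

If A's expression in the experimental group >> that in the control group => A & B may be TC

If A's expression in the experimental group << that in the control group => A & B may be TRC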
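The decision rule on this slide can be sketched in a few lines of Python. This is only an illustration: the slide does not say how large a difference counts as ">>" or "<<", so the `fold` threshold of 2.0 and the function name `classify_pair` are assumptions, not part of the original study.

```python
def classify_pair(expr_knockout, expr_wildtype, fold=2.0):
    """Label an SSL pair from gene A's expression in the two groups.

    expr_knockout: A's expression with gene B knocked out (experimental group)
    expr_wildtype: A's expression with gene B wild type (control group)
    fold: hypothetical threshold standing in for ">>" and "<<" on the slide
    """
    if expr_knockout <= 0 or expr_wildtype <= 0:
        raise ValueError("expression levels must be positive")
    ratio = expr_knockout / expr_wildtype
    if ratio >= fold:
        return "TC"    # partner expression increases: compensation
    if ratio <= 1.0 / fold:
        return "TRC"   # partner expression decreases: reverse compensation
    return "undetermined"


print(classify_pair(4.0, 1.0))  # TC
print(classify_pair(1.0, 4.0))  # TRC
```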

Page 21: Patching the Puzzle of Genetic Network
Page 22: Patching the Puzzle of Genetic Network

Gene expression of Transcription Compensation (TC) pairs

Page 23: Patching the Puzzle of Genetic Network

Gene expression of Transcription Reverse Compensation (TRC) pairs

Page 24: Patching the Puzzle of Genetic Network

The dependence of patterns and their associated interactions

Assumption for PARE: the dependence between CP (SP) patterns and TC (TD) interactions is significant.

To test this hypothesis: Fisher's exact test

Page 25: Patching the Puzzle of Genetic Network

The Proportion of Complementary Pattern (CP) in TC

Screening for genes with significant changes over time, by

$$\frac{\max_t G_i(t)}{\min_t G_i(t)} > 1.5,$$

resulted in 35 gene pairs.

Fisher's exact test: p-value < 0.02, significant at the 95% level

         CP    SP    Total
TC       13     9     22
TD        2    11     13
Total    15    20     35
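As a check on the table above, here is a minimal sketch of a two-sided Fisher's exact test written with the Python standard library only. The function name is illustrative, and the two-sided convention used (summing all tables no more probable than the observed one) is an assumption, since the slide does not say which variant was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, col2 = a + b, a + c, b + d
    k_min, k_max = max(0, row1 - col2), min(row1, col1)

    def weight(k):
        # Unnormalized hypergeometric probability of k in the top-left cell.
        return comb(col1, k) * comb(col2, row1 - k)

    observed = weight(a)
    # Integer comparison avoids floating-point ties.
    extreme = sum(weight(k) for k in range(k_min, k_max + 1)
                  if weight(k) <= observed)
    return extreme / comb(col1 + col2, row1)


# Rows: TC, TD; columns: CP, SP (counts from the slide).
p_value = fisher_exact_two_sided(13, 9, 2, 11)
print(f"p = {p_value:.4f}")
```

Under this convention the p-value comes out at roughly 0.016, which agrees with the "p-value < 0.02" reported on the slide.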

Page 26: Patching the Puzzle of Genetic Network

PARE

The gene expression of the regulating gene is treated as the object contour, and the lag-1 expression of the target gene as the boundary of interest, as in an image segmentation algorithm

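$$t' = t + 1$$

$$E^{\mathrm{D1}}_{i,j} \overset{\mathrm{def}}{=} \sum_t \frac{\partial G_i(t)}{\partial t} \cdot \frac{\partial G_j(t')}{\partial t'}$$

$$E^{\mathrm{D2}}_{i,j} \overset{\mathrm{def}}{=} \sum_t \frac{\partial^2 G_i(t)}{\partial t^2} \cdot \frac{\partial^2 G_j(t')}{\partial t'^2}$$

$$E^{\mathrm{Area}}_{i,j} \overset{\mathrm{def}}{=} \frac{1}{2} \sum_{t \in \Im} \vec{g}_i(t) \times \vec{g}_j(t')$$

$$\Im = \{\, t : \theta(\vec{g}_i(t), \vec{g}_j(t')) > 90^\circ \,\}$$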

Page 27: Patching the Puzzle of Genetic Network

Discrete Signals

Because gene expression is a discrete signal, the 1st- and 2nd-order partial differential terms can be modified as follows:

$$\frac{\partial G_i(t)}{\partial t} = \frac{G_i(t+1) - G_i(t)}{\Delta t}$$

$$\frac{\partial^2 G_i(t)}{\partial t^2} = \frac{G_i(t+2) - 2\,G_i(t+1) + G_i(t)}{\Delta t^2}$$

The interaction $S_{i,j}$ can be determined as a weighted sum of the internal and external energies:

$$S_{i,j} = \alpha \cdot E^{\mathrm{D1}}_{i,j} + \beta \cdot E^{\mathrm{D2}}_{i,j} - \gamma \cdot E^{\mathrm{Area}}_{i,j}$$
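The forward differences and the weighted score described on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are made up, the target gene is paired with the regulator at lag t' = t + 1 as on the previous slide, and the area term is passed in as a precomputed number (`e_area`) because the slides do not define the vectors it is built from.

```python
def first_diff(g, dt=1.0):
    """Forward difference: (G(t+1) - G(t)) / dt."""
    return [(g[t + 1] - g[t]) / dt for t in range(len(g) - 1)]


def second_diff(g, dt=1.0):
    """Second forward difference: (G(t+2) - 2 G(t+1) + G(t)) / dt^2."""
    return [(g[t + 2] - 2 * g[t + 1] + g[t]) / dt ** 2
            for t in range(len(g) - 2)]


def lagged_energy(di, dj):
    """Sum over t of di(t) * dj(t'), with t' = t + 1."""
    return sum(di[t] * dj[t + 1] for t in range(len(di) - 1))


def score(gi, gj, alpha, beta, gamma, e_area=0.0, dt=1.0):
    """S_ij = alpha * E_D1 + beta * E_D2 - gamma * E_Area.

    e_area is supplied by the caller (an assumption of this sketch).
    """
    e_d1 = lagged_energy(first_diff(gi, dt), first_diff(gj, dt))
    e_d2 = lagged_energy(second_diff(gi, dt), second_diff(gj, dt))
    return alpha * e_d1 + beta * e_d2 - gamma * e_area


# Toy series: gj repeats gi one time point later.
gi = [0, 1, 0, -1, 0, 1]
gj = [-1, 0, 1, 0, -1, 0]
print(score(gi, gj, alpha=1.0, beta=0.5, gamma=1.0, e_area=1.0))  # 7.0
```

In the toy example the target follows the regulator with a lag of one time point, so both derivative energies are positive (4 and 8) and the score is 4 + 0.5 * 8 - 1 = 7.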

Page 28: Patching the Puzzle of Genetic Network

PARE

In this study, each gene is represented by a node in a graphical model, which is denoted by $G_i$, where i = 1, 2, …, N. The edge $S_{i,j}$ represents the gene-gene interaction between $G_i$ and $G_j$, where the enhancer gene $G_i$ plays a key role in activating or repressing the target gene $G_j$.

Page 29: Patching the Puzzle of Genetic Network

Training set vs test set

Leave-one-out cross validation: among n pairs, use n-1 pairs to train PARE, then predict the remaining pair; repeat for each of the n pairs.

3-fold cross validation: among all pairs, use 2/3 of the pairs to train, then predict the remaining 1/3, over all combinations; repeat this N times.
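The two splitting schemes can be sketched as plain Python generators. The names and the use of a `seed` to produce a different random 3-fold partition on each repeat are assumptions made for illustration; training and prediction with PARE itself are not shown.

```python
import random


def leave_one_out(pairs):
    """Yield (train, test) with each pair held out once."""
    for k in range(len(pairs)):
        yield pairs[:k] + pairs[k + 1:], [pairs[k]]


def three_fold(pairs, seed=0):
    """Yield (train, test) for one random 3-fold partition.

    Calling this with seed = 0, 1, ..., N-1 gives N repeats.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[k::3] for k in range(3)]
    for k in range(3):
        train = [p for m, fold in enumerate(folds) if m != k for p in fold]
        yield train, folds[k]


pairs = list(range(35))  # stand-ins for 35 labelled gene pairs
print(sum(1 for _ in leave_one_out(pairs)))            # 35
print([len(test) for _, test in three_fold(pairs)])    # [12, 12, 11]
```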

Page 30: Patching the Puzzle of Genetic Network

Experimental Results (TC/TRC)

alpha data set (18 time points)

Table 1. The prediction results, checked against the qRT-PCR experiments

                    Training            Test
Method              TPR     FPR      TPR     Std     FPR
Lagged Corr.                         46%
EB-GGMs                              52%
PARE, n-fold        76%     20%      73%             23%
PARE, 3-fold        78%*    18%*     71%*    3%      23%*

*Since 500 rounds of 3-fold CV were performed, only averages of TPRs are reported.

Page 31: Patching the Puzzle of Genetic Network

Experimental Results (TC/TRC)

For the alpha dataset, PARE yields a true-positive rate of 71-73% and a prediction accuracy of 81%.

The FPR for predicting TC (TD) interactions was bounded by 12% (10%) genome-wide.

Page 32: Patching the Puzzle of Genetic Network

Experimental Results (TC/TRC)

Page 33: Patching the Puzzle of Genetic Network

Checking against published literature

These genetic interactions are consistent with the following experimental results:

Sgs1 and Srs2 are known redundant pathways in replication (Ira et al., 1999; Lee et al., 1999)

Ex: Srs2 and Sgs1-Top3 suppress crossovers during double-strand break repair in yeast.

Page 34: Patching the Puzzle of Genetic Network

The Sgs1/Top3/Rmi1 and Mus81/Mms4 complexes are involved in both double-strand break repair and homologous recombination (Fabre et al., 2002).

This indicates that Sgs1/Top3/Rmi1 and Mus81/Mms4 are alternative pathways to resolve recombination intermediates.

Page 35: Patching the Puzzle of Genetic Network

Inferring transcriptional interactions

132 pairs of activator-target (AT) and repressor-target (RT) gene interactions were collected from the published literature (MIPS, Mewes et al., 1999, Nucleic Acids Research; Gancedo, 1998, Microbiology & Molecular Biology; Draper et al., 1994, Molecular & Cellular Biology, etc.)

Page 36: Patching the Puzzle of Genetic Network
Page 37: Patching the Puzzle of Genetic Network

Test for CP (SP) associated with RT (AT) pairs in the data

Chi-Squared test

Page 38: Patching the Puzzle of Genetic Network

Experimental Results (AT/RT)

Table 2. The prediction results using the Elu data set, checked against the 132 TIs from the literature.

                    Training            Test
Method              TPR     FPR      TPR     Std     FPR
Lagged Corr.                         51%
EB-GGMs                              59%
PARE, n-fold        79%     16%      77%             17%
PARE, 3-fold        81%*    16%*     74%*    3%      19%*

*Average over 500 repeats.

FPRs for genome-wide TI predictions are bounded by 21%.

Page 39: Patching the Puzzle of Genetic Network

Conclusions

The proposed PARE learns gene expression patterns and can then predict similar genetic interactions using microarray data.

TPRs of PARE applied to the alpha (Elu) dataset are about 73% (77%) for inferring TC/TD interactions (TIs), respectively.

Page 40: Patching the Puzzle of Genetic Network
Page 41: Patching the Puzzle of Genetic Network

Inferring genesis of obesity in human (joint work with Karine & Jean-Daniel)

MGED from human adipocyte-derived cell lines

Adipocytes: cells that primarily compose adipose tissue, specialized in storing energy as fat

[Figure: two panels plotted against day (0 to 10). "C/EBP alpha (time-course)": expression level (log2). "C/EBP alpha (MGED in ratio)": J_i/J_{i-1}. Panel labels: Time-course, MGED.]

Page 42: Patching the Puzzle of Genetic Network

PARE to infer genesis of obesity in human

Training stage: MGED of human adipocytes-derived cell lines

70 known transcriptional interactions (TIs) from iHOP

Prediction results: 40+ pairs of TIs and some genetic interactions predicted

Some are consistent with existing experimental results, some are novel

Page 43: Patching the Puzzle of Genetic Network

Inferring TIs

Data preparation: select significantly expressed genes:

P-value < 0.01

Significantly expressed in at least 1 time point (5 time points in total)

-> 36 genes with a function of interest

Interact with 14 genes of interest (AP2, CCL2, CCL5, LEP, etc.)

-> 504 gene pairs

Page 44: Patching the Puzzle of Genetic Network

WebPARE: web computing service of PARE (Chuang+, Wu+, Cheng and Shieh*, 2010, Bioinformatics)

To provide a simple web-interface for users to infer GIs/TIs using time course gene expression data and existing knowledge, e.g. pre-stored validated TIs in yeast, mouse, human, etc (TRANSFAC)

Page 45: Patching the Puzzle of Genetic Network


Page 46: Patching the Puzzle of Genetic Network

An example:

A list of genes involved in the cell cycle and a data set (e.g. Elu) were uploaded to WebPARE; the TIs of these pairs were of interest.

Using integrated (pre-stored) pairs of TIs in yeast, PARE correctly predicted 118 out of 176 TIs, mTPR=67%
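The reported rate follows directly from the counts on this slide. A minimal sketch, with a hypothetical helper name, treating the 176 stored TIs as the positives and the 118 correct predictions as true positives:

```python
def true_positive_rate(true_positives, total_positives):
    """TPR = correctly predicted positives / all known positives."""
    return true_positives / total_positives


tpr = true_positive_rate(118, 176)
print(f"{tpr:.0%}")  # 67%
```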

e.g. The significant predicted network from 66 pairs ->


Page 47: Patching the Puzzle of Genetic Network

WebPARE html www.stat.sinica.edu.tw/WebPARE

Page 48: Patching the Puzzle of Genetic Network

Demo

WebPARE can be accessed at:

http://www.stat.sinica.edu.tw/WebPARE

Page 49: Patching the Puzzle of Genetic Network

Acknowledgement

Dr. Ting-Fang Wang and Da-Yow Huang, Inst. of Biological Chemistry, Academia Sinica

Drs. Karine Clement and J-D. Zucker, INSERM & IRD, France

Cheng-Long Chuang, Chin-Yuan Guo, Chia-Chang Wang, Dr. Shi-Fong Guo, Yu-Bin Wang, Jia-Hung Wu, Inst. of Statistical Science

Page 50: Patching the Puzzle of Genetic Network

Thank you for your attention!

Page 51: Patching the Puzzle of Genetic Network

Wanted

Part-time PhD students and research assistants to work at the Shieh lab (Prof. Grace S. Shieh's laboratory), Institute of Statistical Science, Academia Sinica

Page 52: Patching the Puzzle of Genetic Network

Parameter estimation

Next, we estimate the parameters via the particle swarm optimization (PSO) algorithm (Kennedy and Eberhart, 1995), a stochastic optimization technique that simulates the behavior of a flock of birds.
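To make the idea concrete, here is a minimal PSO sketch in Python. It is a generic textbook variant, not the configuration used for PARE: the inertia weight and acceleration constants are common default choices, the function names are illustrative, and the quadratic objective is a toy stand-in for whatever training criterion the parameters were fitted to.

```python
import random


def pso_minimize(objective, bounds, n_particles=30, n_iter=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize objective over a box given as [(low, high), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # each particle's best point
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best point
    for _ in range(n_iter):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[k][d] = (inertia * vel[k][d]
                             + c_personal * r1 * (best_pos[k][d] - pos[k][d])
                             + c_global * r2 * (g_pos[d] - pos[k][d]))
                lo, hi = bounds[d]
                pos[k][d] = min(hi, max(lo, pos[k][d] + vel[k][d]))
            val = objective(pos[k])
            if val < best_val[k]:
                best_pos[k], best_val[k] = pos[k][:], val
                if val < g_val:
                    g_pos, g_val = pos[k][:], val
    return g_pos, g_val


def toy_objective(w):
    # Hypothetical stand-in with its minimum at (1, -2, 0.5).
    return (w[0] - 1) ** 2 + (w[1] + 2) ** 2 + (w[2] - 0.5) ** 2


weights, value = pso_minimize(toy_objective, [(-5.0, 5.0)] * 3)
print([round(w, 3) for w in weights])
```

A maximization problem can be handled by minimizing the negated objective.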

Page 53: Patching the Puzzle of Genetic Network

Example (finding largest gradient)

Evolutionary Process of PSO

Page 54: Patching the Puzzle of Genetic Network

Gene expression of Activator-Target (AT) gene pairs

Page 55: Patching the Puzzle of Genetic Network

Gene expression of Repressor-Target (RT) gene pairs