16
Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Embed Size (px)

Citation preview

Page 1: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Detecting robust time-delayed regulation in Mycobacterium tuberculosis

Iti Chaturvedi and Jagath C Rajapakse

INCOB 2009

Page 2: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Gene Regulatory Networks (GRN)

• GRN represents the regulatory effects (causal effects) among genes involved in a particular pathway.

• Signal transduction may is transient – dynamic delayed regulations seem to exist.

• Distributed nature causes intense cross-talk• M. tuberculosis is a bacteria causing TB in

man and has a very slow growth rate in vitro .

• The DNA repair pathway is activated when a damage to DNA occurs.

• System consists of LexA and RecA and upto 40 other genes that are regulated by these two proteins.

linB

Rv2719c

lexA

fadd21recA

dnaB

fadd23

ruvC

dnaE2

agenda

Page 3: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Methods of Building GRN

• Bayesian networks (BN): graphical models with regulations denoted by conditional probabilities (Heckerman et al 1995)

• Dynamic BN (DBN): Transition network over time can model cyclic events (Friedman, N. et al 1998)

• Higher-order (HDBN): Extended transition network for longer delays

(Zheng et al 2006)• Skip-chain BN : Can model very long

delays of arbitary length using feature functions. (Galley, M. 2006)

g1

g2

g3 g4

,1

( ) ( | )n

i ii

P x p x a

ai are the parents of gene i

Page 4: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Optimization Using GA

g2

g1

g3

g4

g1

g2g3

g4

g1

g2 g3

g4

g1

g2 g3

g4

g1 g2 g3 g4

g1

g2

g3

g4

2

1

3

(a) (b) (c) (d)

Desired Network Chromosome 1 Chromosome 2 After Crossover at g3

0 0 0 0

000

000

000

g1 g2 g3 g4

g1

g2

g3

g4

2

0

0

0 0 0 0

000

000

0(1)0

g1 g2 g3 g4

g1

g2

g3

g4

0

1

3

0 0 0 0

0(1)0

000

000

g1 g2 g3 g4

g1

g2

g3

g4

2

1

3

0 0 0 0

000

000

000

Crossover involves swapping several rows between two parents.

Each individual is an o-nary interaction matrix where

for no interaction and for o-order interaction

,{ }i j n nC c

, 0i jc ,i jc o

Page 5: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

GA Algorithm,{ }i j n nC c ,

max ( , , )

0 , ( , , )l

i j

M i j l thlc l o

if M i j l th

Initialize N individuals with 0.7 similarity using Mutual Information

Rank all individuals using fitness function.

1 1 1

( | , )i

ijk

qn dNijk

i j k

p x S

1

ijkijk d

ijkk

N

N

Elite individual E is sent to next generation. Select two parents using roulette strategy for mating.

Swap last few rows of the selected parents to generate two new children.Try another crossover point if population similarity is < 0.7

,

,,

max ( , , ) , , 0

0 , 0

l i j

i ji j

M i j l th if clc l o

if c

Invert a random cell to cause mutation1( ) ( )q q

gad p E p E

If dga < 1 for 20 generations or number of generations greater than Q

No

Yes

E is optimal network

Initialize

Selection

Mating

Crossover

Mutate

Terminate

Page 6: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Skip-chain Model

• The likelihood of a gene expression xi is given by a weighted sum of linear and skip edge scores :

• Linear-chain feature functions represent local dependencies of o-order.

• The skip-chain features represent long range

dependencies in a GRN using a HMM

( : )log ( ) ( , , ) (1 ) ( , , , )i i i t o t i i tp x f x a t h x a s t

( : )( , , )i i t o tf x a t

( , , , )i i th x a s t

yt-1 yt yt+1

xt-1 xt xt+1

yt+10 yt+11

xt+10 xt+11

agenda

Page 7: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Viterbi Forward Path

• The skip-edge score is given as a the normalized MAP interaction:

• We can use maximum likelihood to estimate state transition and emission probabilities :

where denotes number of occurrences for• The most probable path is given by MAP estimate using dynamic

programming :

::

1( , , , ) log max ( | )

( ) ts tt

i i t s ty

t

h x a s t p y xt s

' '

,

,

,, 1

,1

, ,l ml m l mn t t

l mm

Ma x g x g G

M

,

,

,

1

( ) , , { , 1,....., },kl

l t t ldkl

k

Mb k k t s s t g G

M

' ',( 1) ( )max{ ( ) }m m l l m

lt b k t a

,l mM , ,, , 11

l t m tx x

Page 8: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Priors for Networks

• Most higher-order Markov models are sensitive to change in pathways and associated data.– Gibbs prior is used to model target network prior. – Interaction potentials, denotes an interaction in target network

and no interaction

– Here a small and large will reflect prior more and vice versa.

• We use adaption to reduce over-fitting due to sparse feature

specific data.• Adaption model can combine the reliable DBN with a volatile feature

specific HMM for long delays.

{ , }

log ( | ) log( | ) iji j S

p S x x S U

1 2{ , }ijU

12

1 2

where

Page 9: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Dirichlet Prior over Parameters

• We extend the MLE to a Bayesian learning. Dirichlet is a conjugate prior for multinomial distribution.

• We can maximize probability as (MAP) :

• Using the linear feature as a Dirichlet conjugate prior for the skip feature of a gene we get :

• Lastly, the interpolated probability of gene based on linear and skip-edges is :

argmax ( | ) ( )MAP p x p

' ( ) ( ) ( ) 1

1

argmax i i i

nh x h x f x

MAP ii

where'( ) ( )i ih x h x

( ) (1 ) ( )i i if x h x

where '( )ih x

Page 10: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Experiments : M. tuberculosis

• Here we looked at the response of bacteria to drug-induced stress. Treatment with Mitocyin C caused DNA damage and hence led to the upregulation of associated repair genes.

• Eight time points are available at NCBI Gene Expression Omnibus (GSE1642-GPL1396 series) 0.33hr, 0.75hr,1.5hr, 2hr, 4hr, 6hr, 8hr and 12hr after DNA damage.

• Data was discretized into 0 for down and 1 for up regulation.• The corresponding skip probabilities were calculated as described

in methods . Upto seven time points of delays were allowed.• Firstly, we used 9 genes previously specified. In order to get an

expanded dataset, the original dataset was subjected to ICA and the components closest to 9 genes were identified. This gave us a second dataset of 32 genes.

agenda

Page 11: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Table 1. Predicted by DBN, HDBN, and skip-chain without priors

Higher-order edges(hrs)

# Genes Model:o ML 1(0.75) 2(1.5) 3(2) 4(4) 5(6)

9DBN:1 -14.7 9HDBN:3 -8.69 8 2 7SKIP-CHAIN:1/5 -6.05 13 (3)

32DBN:1 -48.9 36HDBN:4 -39.4 20 6 14 20SKIP-CHAIN:2/5 -37.2 54 18 (41) (4)

Time delays

It can be seen that the ML of the underlying skip-chain prediction is much higher than the DBN or HDBN, confirming that the network fits data well.

Page 12: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Time delays : Priors

Higher-order edges(hrs)

# Genes Model:o ML 1(0.75) 2(1.5) 3(2) 4(4) 5(6)

9

SKIP-CHAIN:1 -6.05 13 (3)

SKIP-CHAIN(Gibbs):1 -5.8 11 (2)

SKIP-CHAIN(Dirichlet):2 -5.2 7 13 (11) (1)

SKIP-CHAIN(Gibbs and Dirichlet):3 -3.27 2 7 (4) (5)

32

SKIP-CHAIN:2 -37.2 54 18 (41) (4)

SKIP-CHAIN(Gibbs):3 -35.2 37 16 24 (40) (3)

SKIP-CHAIN(Dirichlet):2 -35.05 54 16 (37) (4)

SKIP-CHAIN(Gibbs and Dirichlet):2 -34.54 50 15 (41) (4)

Table 2. Predicted by skip-chain models with priors: Gibbs prior for structures and Dirichlet prior for

parameters

Using priors further increased likelihood and gave many new time-delayed interactions. The combined use of both Dirichlet and Gibbs priors is optimal.

Page 13: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Networks 9 genes

Fig 3 : Time-delayed interactions in predicted network of 9 genes (a) DBN network, (b) HDBN network, (c) Skip-chain network, (d) Skip-chain network with Gibbs prior, (e) Skip-chain network with Dirichlet prior, (f) Skip-chain network with Gibbs and Dirichlet prior

A small number of transcription factors (TF) regulate the rest of the repair system. At the same time the in-degree is low, as each gene is regulated by just one TF.

Page 14: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Networks 32 genes

Fig 4 : Time-delayed interactions in predicted network of 32 genes (a) DBN network, (b) HDBN network, (c) Skip-chain network, (d) Skip-chain network with Gibbs prior, (e) Skip-chain network with Dirichlet prior, (f) Skip-chain network with Gibbs and Dirichlet prior

The second dataset of 32 genes indicated that our method is good for identifying core genes. Use of priors gave better networks with fewer hubs.

Page 15: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Conclusion

• An organism responds to changes in its environment by altering the level of expression of critical genes.

• Skip-chain models address the difficulties of a HDBN by easily incorporating long time-delayed regulations.

• Numerous time-delays are identified between the same pair of genes. The forward Viterbi path determines the best long-distant time delay between two genes.

• Using priors gave us higher likelihood and improved the over-fitting in building the regulatory networks.

• The work can be extended to skip-chain conditional random fields and fusion with protein interaction networks.

Page 16: Detecting robust time-delayed regulation in Mycobacterium tuberculosis Iti Chaturvedi and Jagath C Rajapakse INCOB 2009

Thank You

backup