XPRIME: A Novel Motif Searching Method Rachel L. Poulsen Department of Statistics Brigham Young University June 15, 2009

DESCRIPTION

Presentation prepared for the WNAR conference held at Portland State University in 2009

Page 1: XPRIME: A Novel Motif Searching Method

XPRIME: A Novel Motif Searching Method

Rachel L. Poulsen

Department of Statistics, Brigham Young University

June 15, 2009

Introduction

DNA contains the genetic instructions that uniquely define an organism

RNA is created to carry genetic instructions from the DNA to the rest of the cell

The process of DNA “talking” to the rest of the cell is called transcription

Transcription

Figure: Transcription of DNA into RNA

Position Weight Matrix (PWM) (Hertz et al 1990)

ETS1 TF binding motif

Position:  1      2      3    4    5    6      7      8
A          0.067  0.333  0.0  0.0  1.0  0.533  0.267  0.067
C          0.933  0.600  0.0  0.0  0.0  0.133  0.067  0.400
G          0.000  0.000  1.0  1.0  0.0  0.000  0.667  0.000
T          0.000  0.067  0.0  0.0  0.0  0.333  0.000  0.533
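A PWM can be held in code as one row of probabilities per base. The sketch below stores the ETS1 values from this slide, checks that each column sums to roughly 1 (the slide values are rounded to three decimals), and reads off the most probable base at each position. The function names are illustrative and do not come from the presentation.

```python
BASES = "ACGT"

# ETS1 PWM from the slide: one row per base, one column per motif position 1..8.
ETS1_PWM = {
    "A": [0.067, 0.333, 0.0, 0.0, 1.0, 0.533, 0.267, 0.067],
    "C": [0.933, 0.600, 0.0, 0.0, 0.0, 0.133, 0.067, 0.400],
    "G": [0.000, 0.000, 1.0, 1.0, 0.0, 0.000, 0.667, 0.000],
    "T": [0.000, 0.067, 0.0, 0.0, 0.0, 0.333, 0.000, 0.533],
}


def column_sums(pwm):
    """Sum the four base probabilities at each motif position."""
    width = len(pwm["A"])
    return [sum(pwm[b][j] for b in BASES) for j in range(width)]


def consensus(pwm):
    """Return the most probable base at each motif position."""
    width = len(pwm["A"])
    return "".join(max(BASES, key=lambda b: pwm[b][j]) for j in range(width))


sums = column_sums(ETS1_PWM)
ets1_consensus = consensus(ETS1_PWM)
print(ets1_consensus)  # CCGGAAGT
```

The consensus string discards the uncertainty at positions such as 6 and 8, which is the reason for working with the full matrix rather than a single word.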

Sequence Logos

Figure: DNA binding motif for the ETS1 TF
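A sequence logo is usually drawn from the information content of each PWM column. The sketch below assumes the common definition of 2 bits minus the column entropy, with no small-sample correction; the slides do not say which variant was used for the figure, so treat this as an illustration only.

```python
import math

BASES = "ACGT"

# ETS1 PWM from the PWM slide (rows A, C, G, T; columns are positions 1..8).
ETS1_PWM = {
    "A": [0.067, 0.333, 0.0, 0.0, 1.0, 0.533, 0.267, 0.067],
    "C": [0.933, 0.600, 0.0, 0.0, 0.0, 0.133, 0.067, 0.400],
    "G": [0.000, 0.000, 1.0, 1.0, 0.0, 0.000, 0.667, 0.000],
    "T": [0.000, 0.067, 0.0, 0.0, 0.0, 0.333, 0.000, 0.533],
}


def information_content(pwm):
    """Bits per position: 2 minus the Shannon entropy, treating 0 * log 0 as 0."""
    width = len(pwm["A"])
    bits = []
    for j in range(width):
        entropy = -sum(
            pwm[b][j] * math.log2(pwm[b][j]) for b in BASES if pwm[b][j] > 0
        )
        bits.append(2.0 - entropy)
    return bits


bits = information_content(ETS1_PWM)
# In a logo, the height of base b at position j is pwm[b][j] * bits[j].
heights_pos3 = {b: ETS1_PWM[b][2] * bits[2] for b in BASES}
```

Positions 3 to 5, where a single base has probability 1, reach the full 2 bits; the mixed positions are shorter.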

De Novo motif searching

Regular expression enumeration

1 Actual count vs. expected count

2 Dictionary-based sequence model (Bussemaker et al. 2000)

PWM updating

1 MEME (Bailey et al 1995)

2 Gibbs Motif Sampler (GMS) (Lawrence et al 1993)

3 BioProspector (Liu et al 2001)

4 AlignACE (Roth et al 1998)
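The "actual count vs. expected count" idea can be sketched as follows. This assumes the simplest background, independent bases with frequencies estimated from the input sequences; the slides do not specify the background model, and the toy sequences are hypothetical.

```python
from collections import Counter


def observed_vs_expected(sequences, k):
    """For each k-mer, return (observed count, expected count, ratio).

    The expected count assumes independent bases whose frequencies are
    estimated from the sequences themselves.
    """
    total_bases = sum(len(s) for s in sequences)
    base_counts = Counter("".join(sequences))
    freq = {b: c / total_bases for b, c in base_counts.items()}
    n_windows = sum(max(len(s) - k + 1, 0) for s in sequences)

    observed = Counter(
        s[i:i + k] for s in sequences for i in range(len(s) - k + 1)
    )
    results = {}
    for word, count in observed.items():
        p = 1.0
        for base in word:
            p *= freq[base]
        expected = n_windows * p
        results[word] = (count, expected, count / expected)
    return results


# Hypothetical toy sequences that all contain the word GGAA.
sequences = ["ACGGAAGTAC", "TTCCGGAAGT", "GGAAGTCCAT"]
results = observed_vs_expected(sequences, 4)
count, expected, ratio = results["GGAA"]
```

Words whose ratio is far above 1 are candidate motifs. A real analysis would also need a significance measure, which this sketch omits.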

Known Motif Search

1 GREP

2 Database search with scoring function (Hertz et al 1990)
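A GREP-style search for a known motif amounts to matching a fixed pattern against each sequence. The sketch below uses Python's `re` module and also scans the reverse complement, since a binding site can sit on either strand. The pattern `GGA[AT]` and the sequence are hypothetical examples, not taken from the presentation.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def grep_sites(sequence, pattern):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets finditer report overlapping matches.
    return [m.start() for m in re.finditer(f"(?=(?:{pattern}))", sequence)]


sequence = "TTACCGGAAGTGCACTTCCGG"  # hypothetical sequence
forward_hits = grep_sites(sequence, "GGA[AT]")
reverse_hits = grep_sites(reverse_complement(sequence), "GGA[AT]")
print(forward_hits, reverse_hits)  # [5] [2]
```

Exact matching of this kind treats every allowed word as equally good, which is the limitation that a scoring function over a PWM addresses.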

XPRIME: An Improved Method

TRANSFAC (Matys et al 2003)

Information pulled from in vitro experiments and literature

Most methods justify results using TRANSFAC

XPRIME incorporates prior information

XPRIME can search for both de novo motifs and known motifs simultaneously

Notation and Data

Indices

w: width of motif

L: length of sequence

m: motif indicator

i: position in sequence

j: position in motif

s: indicates sequence

The data, $z_s$

$$z_s = (y_{is}, \Delta_{1i}, \Delta_{2i}, \ldots, \Delta_{(m+1)i})$$

$y_i$ represents the position (w-mer)

$\Delta_{mi}$ indicates if $y_i$ belongs to motif m or not

$\Delta_{(m+1)i}$ indicates if $y_i$ belongs to the background motif or not

The Scoring Function

$$\text{MotifScore} = f(y) = \prod_{j=1}^{w} \sum_{i \in \{A,C,G,T\}} p_{ij}\, I(y_j = i).$$
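The scoring function multiplies, over the w positions of a w-mer, the PWM entry for the base observed at that position; the indicator simply selects that entry. A minimal sketch using the ETS1 PWM from the earlier slide follows. The function names are illustrative, and the code assumes sequences contain only A, C, G and T.

```python
BASES = "ACGT"

# ETS1 PWM from the PWM slide (rows A, C, G, T; columns are positions 1..8).
ETS1_PWM = {
    "A": [0.067, 0.333, 0.0, 0.0, 1.0, 0.533, 0.267, 0.067],
    "C": [0.933, 0.600, 0.0, 0.0, 0.0, 0.133, 0.067, 0.400],
    "G": [0.000, 0.000, 1.0, 1.0, 0.0, 0.000, 0.667, 0.000],
    "T": [0.000, 0.067, 0.0, 0.0, 0.0, 0.333, 0.000, 0.533],
}


def motif_score(wmer, pwm):
    """f(y): product over positions j of sum_i p_ij * I(y_j = i)."""
    score = 1.0
    for j, base in enumerate(wmer):
        score *= sum(pwm[i][j] * (1 if base == i else 0) for i in BASES)
    return score


def score_all_wmers(sequence, pwm):
    """Score every overlapping w-mer; returns (start, score) pairs."""
    width = len(pwm["A"])
    return [
        (i, motif_score(sequence[i:i + width], pwm))
        for i in range(len(sequence) - width + 1)
    ]


best_score = motif_score("CCGGAAGT", ETS1_PWM)   # consensus word
zero_score = motif_score("CCGGTAGT", ETS1_PWM)   # T at position 5 has probability 0
scores = score_all_wmers("TTACCGGAAGTGCA", ETS1_PWM)  # hypothetical sequence
```

Because any zero entry in the PWM forces the whole product to zero, implementations often add pseudocounts or work with log scores; neither is shown on the slide.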

Methods: Complete Data Likelihood

(m+1)-component mixture model

$$L(\theta \mid z) = \prod_{i=1}^{L_s} C(y_i)\, [r_1 f_1(y_i)]^{\Delta_{1i}} [r_2 f_2(y_i)]^{\Delta_{2i}} \cdots [r_{m+1} f_{m+1}(y_i)]^{\Delta_{(m+1)i}}$$

f(y) is the Motif Score equation
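Taking logarithms of the complete data likelihood turns the product into a sum, which is the form usually worked with. The component index $k$ below is introduced only for this illustration and does not appear on the slide:

$$\log L(\theta \mid z) = \sum_{i=1}^{L_s} \left[ \log C(y_i) + \sum_{k=1}^{m+1} \Delta_{ki} \left\{ \log r_k + \log f_k(y_i) \right\} \right]$$

If each w-mer belongs to exactly one component, which is how the indicators appear to be defined, then for every $i$ only one $\Delta_{ki}$ equals 1, and the inner sum keeps a single term $\log r_k + \log f_k(y_i)$.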

Methods: Priors

$f_{m+1}(y)$ is fixed a priori

The $\Delta_{(m+1)i}$'s are missing a priori

$f_1(y), \ldots, f_m(y)$ have product Dirichlet priors such that

$$\pi(f_m(y)) \propto \prod_{j=1}^{L} \prod_{k \in \{A,C,G,T\}} p_{mjk}^{\,a_{p_{mij}} - 1}$$

r also has a Dirichlet prior

$$\pi(r) \propto \prod_{i=1}^{M} r_i^{\,a_{r_i} - 1}$$
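One way to picture a product Dirichlet prior on a PWM is to give each motif position its own Dirichlet distribution over the four bases. The sketch below builds hyperparameters from the TRANSFAC-style ETS1 PWM using a hypothetical rule (a prior weight times the PWM entry, plus a small floor) and draws one PWM from that prior. The rule, the weight and the floor are assumptions for illustration; the slides do not say how XPRIME sets its hyperparameters.

```python
import random

BASES = "ACGT"

# ETS1 PWM from the PWM slide (rows A, C, G, T; columns are positions 1..8).
ETS1_PWM = {
    "A": [0.067, 0.333, 0.0, 0.0, 1.0, 0.533, 0.267, 0.067],
    "C": [0.933, 0.600, 0.0, 0.0, 0.0, 0.133, 0.067, 0.400],
    "G": [0.000, 0.000, 1.0, 1.0, 0.0, 0.000, 0.667, 0.000],
    "T": [0.000, 0.067, 0.0, 0.0, 0.0, 0.333, 0.000, 0.533],
}


def prior_hyperparameters(pwm, prior_weight, floor):
    """Hypothetical rule: a_jk = prior_weight * p_jk + floor, one row per position j."""
    width = len(pwm["A"])
    return [[prior_weight * pwm[b][j] + floor for b in BASES] for j in range(width)]


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalising independent Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(2009)
alpha = prior_hyperparameters(ETS1_PWM, prior_weight=10.0, floor=0.5)
prior_draw = [sample_dirichlet(row, rng) for row in alpha]  # one PWM drawn from the prior
prior_mean_g3 = alpha[2][BASES.index("G")] / sum(alpha[2])  # prior mean of G at position 3
```

A larger prior weight concentrates the draws around the known motif, while a flat row such as `[1, 1, 1, 1]` would correspond to a de novo motif with no prior preference.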

Methods: Gibbs Algorithm

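1 Draws $\Delta$'s from a multinomial distribution

$$p_{\Delta} \propto r_M \, f_M(y)$$

2 Draws r from a Dirichlet distribution

$$\alpha_r = \sum_{i=1}^{L} \Delta_{Mi} + a_M$$

3 Draws $p_{mij}$ from a Dirichlet distribution

$$\alpha_{p_{mij}} = \sum_{i=1}^{L} \sum_{k \in \{A,C,G,T\}} \Delta_{mi}\, I(y_{ij} = k) + a_{p_{mij}}$$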
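The three steps can be sketched as a small sampler. This is an illustration under stated assumptions, not the XPRIME implementation: it treats every overlapping w-mer as one observation, uses a fixed uniform background as the last component, and takes flat hyperparameters. The function name `gibbs`, its arguments and the toy sequences are all hypothetical.

```python
import random

BASES = "ACGT"


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalising Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def score(wmer, pwm):
    """Motif score: product over positions of the entry for the observed base."""
    s = 1.0
    for j, base in enumerate(wmer):
        s *= pwm[j][BASES.index(base)]
    return s


def gibbs(wmers, prior_counts, background, a_r, n_iter, seed=0):
    """prior_counts[m][j][k]: hyperparameter for motif m, position j, base k.
    background: fixed PWM, one row of four probabilities per position.
    a_r: hyperparameters for the mixing weights, motifs first, background last.
    """
    rng = random.Random(seed)
    n_motifs = len(prior_counts)
    width = len(background)
    pwms = [[sample_dirichlet(row, rng) for row in pc] for pc in prior_counts]
    r = sample_dirichlet(a_r, rng)
    delta = []
    for _ in range(n_iter):
        # Step 1: draw each indicator with probability proportional to r_k * f_k(y_i).
        components = pwms + [background]
        delta = []
        for y in wmers:
            weights = [r[k] * score(y, components[k]) for k in range(n_motifs + 1)]
            delta.append(rng.choices(range(n_motifs + 1), weights=weights)[0])
        # Step 2: draw r from a Dirichlet with parameters (assignment counts + a_r).
        counts = [delta.count(k) for k in range(n_motifs + 1)]
        r = sample_dirichlet([c + a for c, a in zip(counts, a_r)], rng)
        # Step 3: draw each motif column from a Dirichlet (base counts + prior).
        for m in range(n_motifs):
            for j in range(width):
                alpha = list(prior_counts[m][j])
                for y, d in zip(wmers, delta):
                    if d == m:
                        alpha[BASES.index(y[j])] += 1
                pwms[m][j] = sample_dirichlet(alpha, rng)
    return pwms, r, delta


# Hypothetical toy data: one de novo motif of width 4 with a flat prior.
sequences = ["ACGGAAGTAC", "TTCCGGAAGT", "GGAAGTCCAT"]
w = 4
wmers = [s[i:i + w] for s in sequences for i in range(len(s) - w + 1)]
prior = [[[1.0, 1.0, 1.0, 1.0] for _ in range(w)]]
background = [[0.25] * 4 for _ in range(w)]
pwms, r, delta = gibbs(wmers, prior, background, a_r=[1.0, 1.0], n_iter=50)
```

Replacing the flat rows in `prior` with hyperparameters built from a known PWM is how a known motif and a de novo motif could be searched for in the same run, with one entry of `prior` per motif.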

An Example: ETS1

We hypothesize that ETS1 has a specific binding site

The Data

1 ETS1 only

2 GABP only

3 ETS1 and GABP

ETS1 Binding Motifs

(a) ETS1 from TRANSFAC (b) ETS1 from ETS1 only

(c) ETS1 from GABP only (d) ETS1 from ETS1/GABP

Justification of Prior Information

Pete Hollenhorst sequence logo

Figure: Motif found without prior specification

Figure: Motif found with prior specification

Conclusions and Future Research

XPRIME successfully searches for de novo and known motifs

Evidence found suggesting ETS1 has its own binding motif

Hidden Markov Models and the forward-backward algorithm

Prior information on r
