Hidden Markov Model

Special case of Dynamic Bayesian network

Single (hidden) state variable
Single (observed) observation variable
Transition probability P(S’|S) assumed to be sparse

Usually encoded by a state transition graph

[Figure: template network G (S → S’, S’ → O’), initial network G0 (S0, O0), and the unrolled network S0 → S1 → S2 → S3 with observations O1, O2, O3.]

Hidden Markov Model

State transition representation

[Figure: state transition graph over the states S1, S2, S3, S4.]

P(S’|S) (rows: current state S, columns: next state S’)

        s1     s2     s3     s4
s1      0.2    0.8    0      0
s2      0      0      1      0
s3      0.4    0      0      0.6
s4      0      0.5    0      0.5
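The transition table above can be stored sparsely, keeping only the nonzero entries, which is exactly the edge list of the state transition graph. The following is a minimal sketch; the names `TRANSITIONS`, `edges`, and `is_row_stochastic` are illustrative and not part of the slides.

```python
# Sparse encoding of P(S'|S): only nonzero entries are stored.
# Outer key = current state S, inner key = next state S'.
TRANSITIONS = {
    "s1": {"s1": 0.2, "s2": 0.8},
    "s2": {"s3": 1.0},
    "s3": {"s1": 0.4, "s4": 0.6},
    "s4": {"s2": 0.5, "s4": 0.5},
}


def edges(transitions):
    """List the edges (S, S', probability) of the state transition graph."""
    return [(s, t, p) for s, row in transitions.items() for t, p in row.items()]


def is_row_stochastic(transitions, tol=1e-9):
    """Check that every row of P(S'|S) sums to 1."""
    return all(abs(sum(row.values()) - 1.0) < tol for row in transitions.values())


if __name__ == "__main__":
    print(len(edges(TRANSITIONS)), "edges out of", len(TRANSITIONS) ** 2, "possible")
    print("rows sum to 1:", is_row_stochastic(TRANSITIONS))
```

Only 7 of the 16 possible transitions are present, which is the sparsity the slide refers to.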

Joint Probability Distribution

[Figure: unrolled network S0 → S1 → S2 → S3 with observations O1, O2, O3.]

$$P(S, O) = P(S_0) \prod_{i=1}^{n} P(S_i \mid S_{i-1})\, P(O_i \mid S_i)$$
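As a small illustration of the factorization P(S, O) = P(S_0) ∏ P(S_i | S_{i-1}) P(O_i | S_i), the sketch below evaluates the joint probability of one state path and one observation sequence. The two-state model and all of its numbers are made up for the example, and the computation is done in log-space.

```python
import math

# Hypothetical two-state HMM; every number here is invented for illustration.
PRIOR = {"a": 0.6, "b": 0.4}                      # P(S_0)
TRANS = {"a": {"a": 0.7, "b": 0.3},               # P(S_i | S_{i-1})
         "b": {"a": 0.1, "b": 0.9}}
EMIT = {"a": {"x": 0.5, "y": 0.5},                # P(O_i | S_i)
        "b": {"x": 0.2, "y": 0.8}}


def log_joint(states, observations):
    """log P(S_0..S_n, O_1..O_n) as a sum of log CPD entries.

    `states` has n+1 entries (S_0..S_n); `observations` has n entries (O_1..O_n).
    """
    if len(states) != len(observations) + 1:
        raise ValueError("need one more state than observations (S_0 emits nothing)")
    total = math.log(PRIOR[states[0]])
    for i, obs in enumerate(observations, start=1):
        total += math.log(TRANS[states[i - 1]][states[i]])   # P(S_i | S_{i-1})
        total += math.log(EMIT[states[i]][obs])              # P(O_i | S_i)
    return total


if __name__ == "__main__":
    print(math.exp(log_joint(["a", "a", "b"], ["x", "y"])))
```

For the path a, a, b with observations x, y the product is 0.6 · (0.7 · 0.5) · (0.3 · 0.8).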

Exact Inference Variable Elimination

Inference in a simple chain X1 → X2 → X3

Computing P(X2)

$$P(X_2) = \sum_{x_1} P(x_1, X_2) = \sum_{x_1} P(x_1)\, P(X_2 \mid x_1)$$

All the numbers for this computation are in the CPDs of the original Bayesian network

O(|X1||X2|) operations

Exact Inference Variable Elimination

Inference in a simple chain X1 → X2 → X3

Computing P(X3)

$$P(X_3) = \sum_{x_2} P(x_2, X_3) = \sum_{x_2} P(x_2)\, P(X_3 \mid x_2)$$

P(X3|X2) is a given CPD

P(X2) was computed above, as $P(X_2) = \sum_{x_1} P(x_1)\, P(X_2 \mid x_1)$

O(|X1||X2|+|X2||X3|) operations

Exact Inference Variable Elimination

Inference in a general chain X1 → X2 → X3 → ... → Xn

Computing P(Xn)

Compute each P(Xi+1) from P(Xi)
k^2 operations for each computation (assuming |Xi|=k)
O(nk^2) operations for the inference
Compare to k^n operations required in summing over all possible entries in the joint distribution over X1,...,Xn

Inference in a general chain can be done in linear time!

Exact Inference Variable Elimination

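[Figure: chain X1 → X2 → X3 → X4.]

$$
\begin{aligned}
P(X_4) &= \sum_{X_1} \sum_{X_2} \sum_{X_3} P(X_1, X_2, X_3, X_4) \\
&= \sum_{X_1} \sum_{X_2} \sum_{X_3} P(X_1)\, P(X_2 \mid X_1)\, P(X_3 \mid X_2)\, P(X_4 \mid X_3) \\
&= \sum_{X_3} P(X_4 \mid X_3) \sum_{X_2} P(X_3 \mid X_2) \sum_{X_1} P(X_1)\, P(X_2 \mid X_1) \\
&= \sum_{X_3} P(X_4 \mid X_3) \sum_{X_2} P(X_3 \mid X_2)\, \phi(X_2) \\
&= \sum_{X_3} P(X_4 \mid X_3)\, \phi(X_3) \\
&= \phi(X_4)
\end{aligned}
$$

where each $\phi$ is the factor produced by the innermost sum of the previous line, e.g. $\phi(X_2) = \sum_{X_1} P(X_1)\, P(X_2 \mid X_1)$

Pushing summations = Dynamic programming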
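The effect of pushing the summations inward can be checked numerically. The sketch below computes P(Xn) for a chain in two ways: by propagating one marginal at a time (O(nk^2) operations) and by summing every entry of the joint (k^n terms). The function names and the binary CPDs are assumptions made for the example.

```python
from itertools import product


def chain_marginal(prior, cpds):
    """P(X_n) by pushing sums inward: one k-by-k pass per edge of the chain.

    prior[a]      = P(X_1 = a)
    cpds[t][a][b] = P(X_{t+2} = b | X_{t+1} = a) for t = 0, 1, ...
    """
    marginal = list(prior)
    for cpd in cpds:
        k_next = len(cpd[0])
        marginal = [sum(marginal[a] * cpd[a][b] for a in range(len(marginal)))
                    for b in range(k_next)]
    return marginal


def brute_force_marginal(prior, cpds):
    """P(X_n) by summing every entry of the joint distribution."""
    sizes = [len(prior)] + [len(cpd[0]) for cpd in cpds]
    result = [0.0] * sizes[-1]
    for assignment in product(*(range(k) for k in sizes)):
        p = prior[assignment[0]]
        for t, cpd in enumerate(cpds):
            p *= cpd[assignment[t]][assignment[t + 1]]
        result[assignment[-1]] += p
    return result


# Hypothetical binary chain X1 -> X2 -> X3 -> X4 with made-up CPDs.
PRIOR = [0.3, 0.7]
CPDS = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.4, 0.6]],
    [[0.6, 0.4], [0.3, 0.7]],
]

if __name__ == "__main__":
    print(chain_marginal(PRIOR, CPDS))
    print(brute_force_marginal(PRIOR, CPDS))
```

Both functions return the same distribution; only the amount of work differs.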

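Inference

[Figure: unrolled network S0 → S1 → S2 → S3 with observations O1, O2, O3.]

Computing P(Si)

$$
\begin{aligned}
P(S_i) &= \sum_{S_0,\dots,S_{i-1},\,O_1,\dots,O_i} P(S_0) \prod_{j=1}^{i} P(S_j \mid S_{j-1})\, P(O_j \mid S_j) \\
&= \sum_{S_1,\dots,S_{i-1},\,O_1,\dots,O_i} P(O_1 \mid S_1) \prod_{j=2}^{i} P(S_j \mid S_{j-1})\, P(O_j \mid S_j) \sum_{S_0} P(S_0)\, P(S_1 \mid S_0) \\
&= \sum_{S_1,\dots,S_{i-1},\,O_1,\dots,O_i} P(O_1 \mid S_1) \prod_{j=2}^{i} P(S_j \mid S_{j-1})\, P(O_j \mid S_j)\, F(S_1) \\
&= \sum_{S_2,\dots,S_{i-1},\,O_1,\dots,O_i} P(O_2 \mid S_2) \prod_{j=3}^{i} P(S_j \mid S_{j-1})\, P(O_j \mid S_j) \sum_{S_1} P(S_2 \mid S_1)\, P(O_1 \mid S_1)\, F(S_1) \\
&= \sum_{S_2,\dots,S_{i-1},\,O_1,\dots,O_i} P(O_2 \mid S_2) \prod_{j=3}^{i} P(S_j \mid S_{j-1})\, P(O_j \mid S_j)\, F(O_1, S_2) \\
&= \sum_{S_2,\dots,S_{i-1},\,O_2,\dots,O_i} P(O_2 \mid S_2) \prod_{j=3}^{i} P(S_j \mid S_{j-1})\, P(O_j \mid S_j) \sum_{O_1} F(O_1, S_2) \\
&= \sum_{S_2,\dots,S_{i-1},\,O_2,\dots,O_i} P(O_2 \mid S_2) \prod_{j=3}^{i} P(S_j \mid S_{j-1})\, P(O_j \mid S_j)\, F(S_2)
\end{aligned}
$$

Each F is the factor left behind by the variable that was just summed out: $F(S_1) = \sum_{S_0} P(S_0)\, P(S_1 \mid S_0)$, $F(O_1, S_2) = \sum_{S_1} P(S_2 \mid S_1)\, P(O_1 \mid S_1)\, F(S_1)$, and $F(S_2) = \sum_{O_1} F(O_1, S_2)$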

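Inference: Forward-Backward Algorithm

Computing P(Si|O1,...,On)

$$
\begin{aligned}
P(S_i \mid O_1,\dots,O_n) &= \frac{P(S_i, O_1,\dots,O_n)}{P(O_1,\dots,O_n)} \\
&= \frac{P(O_1,\dots,O_i, S_i)\, P(O_{i+1},\dots,O_n \mid S_i, O_1,\dots,O_i)}{P(O_1,\dots,O_n)} \\
&= \frac{P(O_1,\dots,O_i, S_i)\, P(O_{i+1},\dots,O_n \mid S_i)}{P(O_1,\dots,O_n)}
\end{aligned}
$$

In the last line, P(O1,...,Oi, Si) is the Forward term, P(Oi+1,...,On | Si) is the Backward term, and P(O1,...,On) is the Normalization factor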

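Computing the Forward Step

Define: $\alpha_j(i) = P(O_1,\dots,O_{i-1}, S_i = j)$

Initialization: $\alpha_j(0) = P(S_0 = j)$

Induction step:

$$
\begin{aligned}
\alpha_j(i+1) &= P(O_1,\dots,O_i, S_{i+1} = j) \\
&= \sum_x P(O_1,\dots,O_i, S_i = x, S_{i+1} = j) \\
&= \sum_x P(O_1,\dots,O_{i-1}, S_i = x)\, P(O_i, S_{i+1} = j \mid O_1,\dots,O_{i-1}, S_i = x) \\
&= \sum_x P(O_1,\dots,O_{i-1}, S_i = x)\, P(O_i, S_{i+1} = j \mid S_i = x) \\
&= \sum_x \alpha_x(i)\, P(O_i \mid S_i = x)\, P(S_{i+1} = j \mid S_i = x)
\end{aligned}
$$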
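The forward recursion can be sketched in a few lines. This example follows the convention in which the forward quantity at time i, written alpha here, is P(O_1..O_{i-1}, S_i = j) and the recursion starts from P(S_0 = j). It assumes that S_0 emits no observation, so the step from time 0 to time 1 uses only the transition probability. The model numbers and the integer encoding of states and symbols are invented for the example.

```python
def forward(prior, trans, emit, obs):
    """Return alpha with alpha[i][j] = P(O_1..O_{i-1}, S_i = j) for i = 0..n.

    prior[j]    = P(S_0 = j)
    trans[x][j] = P(S_{i+1} = j | S_i = x)
    emit[x][o]  = P(O_i = o | S_i = x)
    obs         = [O_1, ..., O_n] as integer symbol indices
    """
    k = len(prior)
    alpha = [list(prior)]                      # initialization: alpha_j(0) = P(S_0 = j)
    for i in range(len(obs)):                  # builds alpha(i + 1) from alpha(i)
        prev = alpha[-1]
        # Emission weight of S_i: 1 for S_0 (assumed silent), else P(O_i | S_i = x).
        weight = [1.0 if i == 0 else emit[x][obs[i - 1]] for x in range(k)]
        alpha.append([sum(prev[x] * weight[x] * trans[x][j] for x in range(k))
                      for j in range(k)])
    return alpha


# Hypothetical two-state, two-symbol model.
PRIOR = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.1, 0.9]]
EMIT = [[0.5, 0.5], [0.2, 0.8]]

if __name__ == "__main__":
    for i, row in enumerate(forward(PRIOR, TRANS, EMIT, [0, 1])):
        print(i, row)
```

Each step costs k^2 multiplications, so the whole pass is O(nk^2), the same bound as for the plain chain.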

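Computing the Backward Step

Define: $\beta_j(i) = P(O_i,\dots,O_n \mid S_i = j)$

Initialization: $\beta_j(n+1) = 1$

Induction step:

$$
\begin{aligned}
\beta_j(i) &= P(O_i,\dots,O_n \mid S_i = j) \\
&= \sum_x P(O_i,\dots,O_n, S_{i+1} = x \mid S_i = j) \\
&= \sum_x P(O_i, S_{i+1} = x \mid S_i = j)\, P(O_{i+1},\dots,O_n \mid O_i, S_{i+1} = x, S_i = j) \\
&= \sum_x P(O_i \mid S_i = j)\, P(S_{i+1} = x \mid O_i, S_i = j)\, P(O_{i+1},\dots,O_n \mid S_{i+1} = x) \\
&= \sum_x P(O_i \mid S_i = j)\, P(S_{i+1} = x \mid S_i = j)\, \beta_x(i+1)
\end{aligned}
$$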
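A matching sketch of the backward recursion, under the convention that the backward quantity at time i, written beta here, is P(O_i..O_n | S_i = j) and equals 1 at time n+1. The list index is kept equal to the time index, so entry 0 is unused. The model numbers are again hypothetical.

```python
def backward(trans, emit, obs):
    """Return beta with beta[i][j] = P(O_i..O_n | S_i = j) for i = 1..n+1.

    trans[j][x] = P(S_{i+1} = x | S_i = j)
    emit[j][o]  = P(O_i = o | S_i = j)
    obs         = [O_1, ..., O_n] as integer symbol indices
    beta[0] is left as None so that list indices match the time index.
    """
    n, k = len(obs), len(trans)
    beta = [None] * (n + 2)
    beta[n + 1] = [1.0] * k                    # initialization: beta_j(n+1) = 1
    for i in range(n, 0, -1):
        beta[i] = [emit[j][obs[i - 1]] *
                   sum(trans[j][x] * beta[i + 1][x] for x in range(k))
                   for j in range(k)]
    return beta


# Hypothetical two-state, two-symbol model.
TRANS = [[0.7, 0.3], [0.1, 0.9]]
EMIT = [[0.5, 0.5], [0.2, 0.8]]

if __name__ == "__main__":
    for i, row in enumerate(backward(TRANS, EMIT, [0, 1])):
        print(i, row)
```

Because the transition rows sum to 1, the last real step reduces to beta_j(n) = P(O_n | S_n = j).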

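Computing Evidence Probability

Since $\alpha_j(i) = P(O_1,\dots,O_{i-1}, S_i = j)$, then:

$$P(O) = \sum_x P(O_1,\dots,O_n, S_n = x) = \sum_x \alpha_x(n)\, P(O_n \mid S_n = x)$$

Since $\beta_j(i) = P(O_i,\dots,O_n \mid S_i = j)$, then:

$$P(O) = \sum_x P(S_1 = x)\, P(O_1,\dots,O_n \mid S_1 = x) = \sum_x P(S_1 = x)\, \beta_x(1)$$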
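The two routes to the evidence probability should agree, and combining the forward and backward quantities gives the posterior over each hidden state. The sketch below assumes the conventions alpha_j(i) = P(O_1..O_{i-1}, S_i = j) and beta_j(i) = P(O_i..O_n | S_i = j), under which alpha_j(i) · beta_j(i) = P(O_1..O_n, S_i = j). It also assumes S_0 emits nothing; the model numbers and helper names are invented.

```python
def forward(prior, trans, emit, obs):
    """alpha[i][j] = P(O_1..O_{i-1}, S_i = j), i = 0..n (S_0 assumed silent)."""
    k, alpha = len(prior), [list(prior)]
    for i in range(len(obs)):
        w = [1.0 if i == 0 else emit[x][obs[i - 1]] for x in range(k)]
        alpha.append([sum(alpha[i][x] * w[x] * trans[x][j] for x in range(k))
                      for j in range(k)])
    return alpha


def backward(trans, emit, obs):
    """beta[i][j] = P(O_i..O_n | S_i = j), i = 1..n+1 (beta[0] unused)."""
    n, k = len(obs), len(trans)
    beta = [None] * (n + 1) + [[1.0] * k]
    for i in range(n, 0, -1):
        beta[i] = [emit[j][obs[i - 1]] *
                   sum(trans[j][x] * beta[i + 1][x] for x in range(k))
                   for j in range(k)]
    return beta


def evidence_and_posteriors(prior, trans, emit, obs):
    """Return P(O) computed two ways, and P(S_i = j | O) for i = 1..n."""
    n, k = len(obs), len(prior)
    alpha, beta = forward(prior, trans, emit, obs), backward(trans, emit, obs)
    # Forward route: sum_x alpha_x(n) * P(O_n | S_n = x).
    ev_forward = sum(alpha[n][x] * emit[x][obs[n - 1]] for x in range(k))
    # Backward route: sum_x P(S_1 = x) * beta_x(1), where alpha_x(1) = P(S_1 = x).
    ev_backward = sum(alpha[1][x] * beta[1][x] for x in range(k))
    posteriors = {i: [alpha[i][j] * beta[i][j] / ev_forward for j in range(k)]
                  for i in range(1, n + 1)}
    return ev_forward, ev_backward, posteriors


# Hypothetical two-state, two-symbol model.
PRIOR = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.1, 0.9]]
EMIT = [[0.5, 0.5], [0.2, 0.8]]

if __name__ == "__main__":
    print(evidence_and_posteriors(PRIOR, TRANS, EMIT, [0, 1]))
```

For long sequences these products underflow, so a practical version would rescale each step or work in log-space.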

Assignment 3

Part 1: Constructing and evaluating a nucleosome probability model
Model 1: zero order Markov model
Model 2: first order Markov model

Both models have two components:
PN: Position-dependent distribution over nucleotides
PL: Position-independent distribution over nucleotides
P=PN/PL

Assignment 3

Estimating PN

PN:
Markov order 0: $P_N(S) = \prod_{i=1}^{147} P_{N,i}(S_i)$
Markov order 1: $P_N(S) = P_{N,1}(S_1) \prod_{i=2}^{147} P_{N,i}(S_i \mid S_{i-1})$

Create an alignment from all nucleosome reads and the reverse complement of each read

Estimate PN,i from counts in the data
Example for Markov order 1:

$$P_{N,k}(S_k = i \mid S_{k-1} = j) = \frac{\#(S_k = i \mid S_{k-1} = j)}{\sum_x \#(S_k = x \mid S_{k-1} = j)}$$

where #(Sk=i|Sk-1=j) is the number of times that the nucleotide at position k in the alignment is i, AND the nucleotide at position k-1 in the alignment is j
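The counting estimate for the order-1 position-dependent model can be sketched as follows. This assumes the reads are already aligned, of equal length, and contain only A, C, G, T; the function names are illustrative, and no pseudocounts are added (the slides do not mention any), so contexts never seen in the data are simply absent from the result.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse the read and complement every nucleotide."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def estimate_pn_order1(reads):
    """Estimate P_{N,k}(S_k = i | S_{k-1} = j) from equal-length aligned reads.

    Returns a dict keyed by (k, j, i) with 1-based positions k = 2..L.
    Each read is counted together with its reverse complement.
    """
    aligned = list(reads) + [reverse_complement(r) for r in reads]
    length = len(aligned[0])
    if any(len(r) != length for r in aligned):
        raise ValueError("all reads must have the same length")
    pair_counts = Counter()      # (k, j, i) -> #(S_k = i | S_{k-1} = j)
    context_counts = Counter()   # (k, j)    -> sum over x of #(S_k = x | S_{k-1} = j)
    for read in aligned:
        for idx in range(1, length):          # idx is the 0-based index of position k
            j, i = read[idx - 1], read[idx]
            pair_counts[(idx + 1, j, i)] += 1
            context_counts[(idx + 1, j)] += 1
    return {key: count / context_counts[key[:2]]
            for key, count in pair_counts.items()}


if __name__ == "__main__":
    # Toy 3bp reads; real nucleosome reads would be 147bp long.
    for key, p in sorted(estimate_pn_order1(["ACG", "ACT", "AAG"]).items()):
        print(key, p)
```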

Assignment 3

Estimating PL

PL:
Markov order 0: $P_L(S) = \prod_{i=1}^{147} P_L(S_i)$
Markov order 1: $P_L(S) = P_L(S_1) \prod_{i=2}^{147} P_L(S_i \mid S_{i-1})$

For Markov order 0: compute the average number of reads that cover each of the possible 4 basepairs in the genome

For Markov order 1: compute the average number of reads that cover each of the possible 16 dinucleotides in the genome

Estimate PL from counts in the data
Example for Markov order 1:

$$P_L(S_k = i \mid S_{k-1} = j) = \frac{A(S_k = i, S_{k-1} = j)}{\sum_x A(S_k = x, S_{k-1} = j)}$$

where A(Sk=i, Sk-1=j) is the average coverage of the dinucleotide i,j, computed as explained above
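One possible reading of the order-1 estimate is sketched below. It assumes a precomputed array `pair_coverage`, where entry p is the number of reads covering both genome[p] and genome[p + 1]; that input, the toy genome, and the function names are all hypothetical. Dictionary keys are ordered as (previous nucleotide j, current nucleotide i).

```python
from collections import defaultdict


def average_dinucleotide_coverage(genome, pair_coverage):
    """A[(j, i)] = mean coverage over all occurrences of j followed by i.

    pair_coverage[p] is assumed to be the number of reads covering both
    genome[p] and genome[p + 1].
    """
    totals, occurrences = defaultdict(float), defaultdict(int)
    for p in range(len(genome) - 1):
        key = (genome[p], genome[p + 1])
        totals[key] += pair_coverage[p]
        occurrences[key] += 1
    return {key: totals[key] / occurrences[key] for key in totals}


def estimate_pl_order1(avg_cov):
    """P_L(S_k = i | S_{k-1} = j) = A(i, j) / sum over x of A(x, j)."""
    context_total = defaultdict(float)
    for (j, _i), a in avg_cov.items():
        context_total[j] += a
    return {(j, i): a / context_total[j]
            for (j, i), a in avg_cov.items() if context_total[j] > 0}


if __name__ == "__main__":
    avg = average_dinucleotide_coverage("AACAAG", [2, 4, 1, 6, 3])
    for key, p in sorted(estimate_pl_order1(avg).items()):
        print(key, p)
```

In the toy input, the dinucleotide AA occurs twice with coverage 2 and 6, so its average coverage is 4.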

Assignment 3 Evaluating the model

Construct the model in a cross validation scheme, i.e., create it only using the data of chromosomes 1-8

Test the model (order 0 & 1) on the held-out chromosomes

Compute the log-likelihood of all held-out nucleosome reads (work in log-space!)

Compare to the log-likelihood of a random selection of sequences from the genome

Compare to the log-likelihood of permutations of the sequences

$$\log P(S) = \log \prod_i P(S_i) = \sum_i \log P(S_i)$$
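Working in log-space turns the product over positions into a sum. The sketch below scores one sequence under an order-0 model with P = PN/PL, so each position contributes log PN,i(S_i) − log PL(S_i). The table layout, the toy two-position model, and the function name are assumptions for the example, and all probabilities are assumed to be nonzero.

```python
import math


def log_score_order0(seq, pn, pl):
    """log P(S) = sum over i of [log P_{N,i}(S_i) - log P_L(S_i)].

    pn[i][base] = position-dependent probability at 0-based position i
    pl[base]    = position-independent probability
    """
    if len(seq) != len(pn):
        raise ValueError("sequence length must match the model length")
    return sum(math.log(pn[i][b]) - math.log(pl[b]) for i, b in enumerate(seq))


# Toy model over 2 positions; a real nucleosome model would have 147.
PN = [
    {"A": 0.5, "C": 0.5, "G": 0.0001, "T": 0.0001},
    {"A": 0.25, "C": 0.75, "G": 0.0001, "T": 0.0001},
]
PL = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

if __name__ == "__main__":
    print(log_score_order0("AC", PN, PL))
```

The log-likelihood of a set of held-out reads is then the sum of the per-read scores, which stays well within floating-point range where the raw product would underflow.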

Assignment 3 Evaluating the model (cont.)

Test the model (order 0 & 1) on the held-out chromosomes

Create an ROC evaluation
Select a threshold t, equal to the average number of reads per basepair in the genome
Define ‘positive’ regions as maximal contiguous regions in which every basepair is above t. Remove regions whose size is <50bp
Define ‘negative’ regions as maximal contiguous regions in which every basepair is below t. Remove regions whose size is <50bp

Use the model to score each region, as the average score of the basepairs it contains, where the score of each basepair is the average of the 147 model scores whose 147bp windows cover that basepair

Create an ROC score using these positive and negative regions. This is done by ranking all regions according to the model scores (above), and plotting, at each rank, the false positive rate (x-axis) vs. true positive rate (y-axis)

Compute the AUC (area under the curve)
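The region definition and the ranking-based ROC computation described above can be sketched as follows. The half-open (start, end) representation, the handling of tied scores, and the trapezoid integration are choices made for this example rather than requirements stated in the assignment.

```python
def contiguous_regions(coverage, t, above=True, min_size=50):
    """Maximal runs in which every basepair is above (or below) t.

    Returns half-open (start, end) intervals; runs shorter than min_size are dropped.
    """
    regions, start = [], None
    for pos, c in enumerate(list(coverage) + [None]):   # sentinel closes the last run
        inside = c is not None and (c > t if above else c < t)
        if inside and start is None:
            start = pos
        elif not inside and start is not None:
            if pos - start >= min_size:
                regions.append((start, pos))
            start = None
    return regions


def roc_auc(pos_scores, neg_scores):
    """Rank all regions by score, trace (FPR, TPR) at each rank, integrate."""
    ranked = sorted([(s, 1) for s in pos_scores] + [(s, 0) for s in neg_scores],
                    key=lambda pair: -pair[0])
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:  # ties move together
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / len(neg_scores), tp / len(pos_scores)))
        i = j
    # Trapezoid rule over the (FPR, TPR) curve.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    cov = [5, 5, 5, 1, 1, 5, 5]
    print(contiguous_regions(cov, t=3, above=True, min_size=2))
    print(contiguous_regions(cov, t=3, above=False, min_size=2))
    print(roc_auc([0.9, 0.8], [0.1, 0.2]))
```

A perfect ranking gives an AUC of 1, and scores that cannot separate the two classes give 0.5.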

Assignment 3

Use the model in an HMM framework and compute the average nucleosome occupancy at each basepair

Easiest to view as a generalized HMM with two states
Si=0: no nucleosome starts at position i
Si=1: nucleosome starts at position i

Notes
Emission probability given S=1 is taken from nucleosome model
Emission probability given S=0 is uniform over all basepairs
Placing a nucleosome ‘emits’ 147 basepairs
Implement a uniform non-normalized transition probability between the two states, i.e., W(S=0)=1, W(S=1)=1

Compute P(Si=0|O) and P(Si=1|O) for every basepair
Compute the average occupancy at each basepair as

$$P(i) = \sum_{j=i-146}^{i} P(S_j = 1 \mid O)$$
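One way to organize this computation is a forward and a backward pass over sequence positions, where a nucleosome state consumes a whole footprint at once. The sketch below is an assumption-laden illustration: `nuc_weight[j]` stands for the nucleosome-model emission weight of the footprint starting at 0-based position j, `bg_weight` for the uniform per-basepair emission in the free state, and the transition weights W(S=0) = W(S=1) = 1 drop out of the products. It works with raw weights, so a genome-scale version would need rescaling or log-space arithmetic.

```python
def nucleosome_occupancy(nuc_weight, n, footprint=147, bg_weight=0.25):
    """Start posteriors and average occupancy in a two-state generalized HMM.

    nuc_weight[j] = weight of a nucleosome starting at 0-based position j,
                    for j = 0 .. n - footprint
    Returns (start_post, occupancy), where start_post[j] = P(S_j = 1 | O) and
    occupancy[i] is the sum of start_post over starts j = i - footprint + 1 .. i.
    """
    if n < footprint or len(nuc_weight) != n - footprint + 1:
        raise ValueError("nuc_weight must have n - footprint + 1 entries")
    # fwd[i] = total weight of all configurations of the first i basepairs.
    fwd = [0.0] * (n + 1)
    fwd[0] = 1.0
    for i in range(1, n + 1):
        fwd[i] = fwd[i - 1] * bg_weight
        if i >= footprint:
            fwd[i] += fwd[i - footprint] * nuc_weight[i - footprint]
    # bwd[i] = total weight of all configurations of basepairs i .. n-1.
    bwd = [0.0] * (n + 1)
    bwd[n] = 1.0
    for i in range(n - 1, -1, -1):
        bwd[i] = bg_weight * bwd[i + 1]
        if i + footprint <= n:
            bwd[i] += nuc_weight[i] * bwd[i + footprint]
    total = fwd[n]
    start_post = [fwd[j] * nuc_weight[j] * bwd[j + footprint] / total
                  for j in range(n - footprint + 1)]
    occupancy = []
    for i in range(n):
        lo = max(0, i - footprint + 1)
        hi = min(i, n - footprint)
        occupancy.append(sum(start_post[lo:hi + 1]))
    return start_post, occupancy


if __name__ == "__main__":
    # Toy case: 3 basepairs, footprint of 2, so three configurations with weights 1, 2, 3.
    print(nucleosome_occupancy([2.0, 3.0], n=3, footprint=2, bg_weight=1.0))
```

In the toy case the three configurations (no nucleosome, start at 0, start at 1) have weights 1, 2 and 3, so the start posteriors are 1/3 and 1/2 and the middle basepair has occupancy 5/6.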

Assignment 3 Evaluating the HMM model

Generate a plot of average occupancy of the real data and the model predictions at a 2000bp region of your choice

Perform the same ROC analysis as with the previous model, except that scores of the positive and negative regions are now computed as the average nucleosome occupancy of those regions according to your genome-wide computation