
Similar Techniques For Molecular Sequencing and Network Security Doug Madory 27 APR 05

Page 1: Similar Techniques For Molecular Sequencing and Network Security Doug Madory 27 APR 05

Similar Techniques For Molecular Sequencing and Network Security

Doug Madory
27 APR 05

  • Big Picture
  • Protein Structure
  • Sequencing using Profile HMM

Page 2:

Big Picture

  • PQS for Network Security (Us)
      • Design HMM for network event
      • Find event within linear stream of observed network events
  • Sequencing using Profile HMM (Bioinformatics)
      • Train HMM using known information about subsequence
      • Find subsequence within linear protein / genome sequence

Q: Did an event happen?

Q: If it exists, where is sequence?

Page 3:
Page 4:
Page 5:
Page 6:
Page 7:
Page 8:

Profile HMM - Simple Case

  • Train HMM
  • Viterbi Scoring
  • Backtrace Viterbi

Query: A-, AA, TA
DB: ATA

Page 9:

HMM Training

Build HMM with 2 M states because there are 2 columns in query

[Diagram: profile HMM with states Begin, I0, M1, I1, D1, M2, I2, D2, End; M1 and M2 each list emissions for A, C, G, T]
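
A minimal Python sketch of the state layout described on this slide. It assumes the usual profile HMM topology, in which every query column k gets a match, insert, and delete state and Begin acts as column 0. The function name `profile_hmm_topology` and the exact wiring of the last column are illustrative assumptions, not taken from the slides.

```python
def profile_hmm_topology(aligned_query):
    """Return (states, transitions) with one match state per query column.

    Assumption: standard profile HMM wiring; names are illustrative only.
    """
    n_match = len(aligned_query[0])  # 2 columns in the query -> 2 M states
    states = ["Begin", "I0"]
    for k in range(1, n_match + 1):
        states += [f"M{k}", f"I{k}", f"D{k}"]
    states.append("End")

    transitions = {}
    for k in range(n_match + 1):
        # States sitting at column k; Begin stands in for M0 and there is no D0.
        sources = ["Begin" if k == 0 else f"M{k}", f"I{k}"]
        if k > 0:
            sources.append(f"D{k}")
        if k < n_match:
            targets = [f"M{k + 1}", f"I{k}", f"D{k + 1}"]
        else:
            targets = ["End", f"I{k}"]  # last column can only insert or finish
        for source in sources:
            transitions[source] = list(targets)
    return states, transitions


states, transitions = profile_hmm_topology(["A-", "AA", "TA"])
print(states)
print(transitions["Begin"])
```

With the query from the deck (A-, AA, TA) this yields nine states, matching the set named in the diagram.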

Page 10:

HMM Training

Step 1 – add pseudocount to each transition and emission

M1 emission counts: A 1, C 1, G 1, T 1
M2 emission counts: A 1, C 1, G 1, T 1

[Diagram: same HMM (Begin, I0, M1, I1, D1, M2, I2, D2, End) with every transition count set to 1]
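
To illustrate why the pseudocount step matters, here is a small Python sketch that turns the symbols seen in the first query column (A, A, T) into emission probabilities, with and without a pseudocount. The helper name `emission_probs` is hypothetical. Without the pseudocount, unseen symbols get probability 0, whose logarithm is undefined in the later log-based scoring.

```python
from collections import Counter


def emission_probs(observed, alphabet="ACGT", pseudocount=1):
    """Estimate emission probabilities from observed symbols plus a pseudocount."""
    counts = Counter(observed)
    total = len(observed) + pseudocount * len(alphabet)
    return {c: (counts[c] + pseudocount) / total for c in alphabet}


# Column 1 of the query A-, AA, TA contains the symbols A, A, T.
smoothed = emission_probs("AAT")                  # pseudocount of 1, as on the slide
unsmoothed = emission_probs("AAT", pseudocount=0)

print(smoothed)
print(unsmoothed)
```

With the pseudocount, A comes out as 3/7, the same value used for e(A) in the Viterbi scoring slides.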

Page 11:

HMM Training

Step 2 – train with A-

M1 emission counts: A 2, C 1, G 1, T 1
M2 emission counts: A 1, C 1, G 1, T 1

[Diagram: same HMM; transition counts Begin → M1 = 2, M1 → D2 = 2, D2 → End = 2, all others 1]

Page 12:

HMM Training

Step 3 – train with AA

M1 emission counts: A 3, C 1, G 1, T 1
M2 emission counts: A 2, C 1, G 1, T 1

[Diagram: same HMM; transition counts Begin → M1 = 3, M1 → M2 = 2, M2 → End = 2, M1 → D2 = 2, D2 → End = 2, all others 1]

Page 13:

HMM Training

Step 4 – train with TA

M1 emission counts: A 3, C 1, G 1, T 2
M2 emission counts: A 3, C 1, G 1, T 1

[Diagram: same HMM; transition counts Begin → M1 = 4, M1 → M2 = 3, M2 → End = 3, M1 → D2 = 2, D2 → End = 2, all others 1]

Page 14:

HMM Training

Fully trained HMM

M1 emission counts: A 3, C 1, G 1, T 2
M2 emission counts: A 3, C 1, G 1, T 1

[Diagram: same HMM; transition counts Begin → M1 = 4, M1 → M2 = 3, M2 → End = 3, M1 → D2 = 2, D2 → End = 2, all others 1]
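
The training steps can be sketched in Python as plain counting: start every transition and emission at the pseudocount, then walk each aligned query row through the model and increment what it uses. The `TOPOLOGY` table and the function names `train` and `normalize` are assumptions for illustration; a gap (`-`) is taken to mean the delete state for that column.

```python
ALPHABET = "ACGT"

# Assumed wiring for the 2-column model (not stated explicitly in the slides).
TOPOLOGY = {
    "Begin": ["M1", "I0", "D1"],
    "I0": ["M1", "I0", "D1"],
    "M1": ["M2", "I1", "D2"],
    "I1": ["M2", "I1", "D2"],
    "D1": ["M2", "I1", "D2"],
    "M2": ["End", "I2"],
    "I2": ["End", "I2"],
    "D2": ["End", "I2"],
}


def train(aligned_rows, pseudocount=1):
    """Count transitions and match emissions used by each aligned row."""
    trans = {s: {t: pseudocount for t in ts} for s, ts in TOPOLOGY.items()}
    emit = {m: {c: pseudocount for c in ALPHABET} for m in ("M1", "M2")}
    for row in aligned_rows:
        prev = "Begin"
        for k, symbol in enumerate(row, start=1):
            if symbol == "-":
                state = f"D{k}"          # gap: skip this column
            else:
                state = f"M{k}"          # residue: emit it from the match state
                emit[state][symbol] += 1
            trans[prev][state] += 1
            prev = state
        trans[prev]["End"] += 1
    return trans, emit


def normalize(counts):
    """Turn each row of counts into probabilities."""
    return {s: {t: c / sum(row.values()) for t, c in row.items()}
            for s, row in counts.items()}


trans, emit = train(["A-", "AA", "TA"])
trans_probs = normalize(trans)
emit_probs = normalize(emit)
print(trans["Begin"], emit["M1"], emit["M2"])
```

Under these assumptions the counts agree with the fully trained model: Begin → M1 is 4 out of 6, and M1 emits A with probability 3/7.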

Page 15:

Viterbi Scoring

State   X (0)    A (1)    T (2)    A (3)
B/I0    V_B=0
M1
I1
D1
M2
I2
D2
E

[Grid legend: columns are Observations, rows are States; arrows mark Insert, Match, and Delete moves; some cells are marked as Illegal Moves]

$V_{I_0}(1) = \log a_{B,I_0}$
$V_{M_1}(0) = 0$
$V_{I_1}(0) = 0$
$V_{D_1}(0) = \log a_{B,D_1}$

Page 16:

Viterbi Scoring

State   X (0)    A (1)    T (2)    A (3)
B/I0    V_B=0    -0.78
M1      0
I1      0
D1      -0.78
M2
I2
D2
E

[Grid legend: columns are Observations, rows are States; arrows mark Insert, Match, and Delete moves]

$V_{I_0}(2) = V_{I_0}(1) + \log a_{I_0,I_0}$
$V_{I_0}(3) = V_{I_0}(2) + \log a_{I_0,I_0}$
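
A short Python sketch of the I0 row recursion. It assumes base-10 logarithms (which reproduce the -0.78 on the slide), transition probabilities of 1/6 for Begin → I0 and 1/3 for I0 → I0 read off the trained counts, and an insert emission log-odds of 0. The function name `insert0_scores` is hypothetical.

```python
import math

A_BEGIN_I0 = 1 / 6   # assumed: Begin has counts M1=4, I0=1, D1=1
A_I0_I0 = 1 / 3      # assumed: I0 has counts M1=1, I0=1, D1=1


def insert0_scores(n_observations):
    """Return V_I0(1..n): each extra observation adds one I0 -> I0 self-loop."""
    scores = [math.log10(A_BEGIN_I0)]
    for _ in range(1, n_observations):
        scores.append(scores[-1] + math.log10(A_I0_I0))
    return scores


v_i0 = insert0_scores(3)  # observations A, T, A
print([round(v, 3) for v in v_i0])
```

The unrounded values differ from the slide's in the second decimal place, since the slides appear to round each log term before summing.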

Page 17:

Viterbi Scoring

State   X (0)    A (1)    T (2)    A (3)
B/I0    V_B=0    -0.78    -1.25    -1.72
M1      0
I1      0
D1      -0.78
M2
I2
D2
E

[Grid legend: columns are Observations, rows are States; arrows mark Insert, Match, and Delete moves]

$V_{M_1}(1) = \log\frac{e(A)}{q} + V_B + \log a_{B,M_1}$
$V_{M_1}(1) = \log\frac{3/7}{1/4} + 0 - 0.17$
$V_{M_1}(1) = 0.23 - 0.17 = 0.06$
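
The match-state arithmetic can be checked with a few lines of Python. This is a sketch assuming base-10 logarithms, a uniform background q = 1/4, and a Begin → M1 probability of 4/6 taken from the trained counts; `match_score` is a hypothetical helper name.

```python
import math


def match_score(emission, background, previous_score, transition):
    """Log-odds emission term plus the best predecessor score and its transition."""
    return math.log10(emission / background) + previous_score + math.log10(transition)


# e(A) in M1 is 3/7, q is 1/4, V_B is 0, a(Begin -> M1) is assumed to be 4/6.
v_m1_1 = match_score(3 / 7, 1 / 4, 0.0, 4 / 6)
print(round(v_m1_1, 2))
```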

Page 18:

Viterbi Scoring

State   X (0)    A (1)    T (2)    A (3)
B/I0    V_B=0    -0.78    -1.25    -1.72
M1      0        0.06
I1      0
D1      -0.78
M2
I2
D2
E

[Grid legend: columns are Observations, rows are States; arrows mark Insert, Match, and Delete moves]

$V_{D_1}(1) = V_{I_0}(1) + \log a_{I_0,D_1}$
$V_{D_1}(1) = -0.78 - 0.47 = -1.25$

$V_{I_1}(1) = 0 + \max\{\, V_{M_1}(0) + \log a_{M_1,I_1},\; V_{I_1}(0) + \log a_{I_1,I_1},\; V_{D_1}(0) + \log a_{D_1,I_1} \,\}$
$V_{I_1}(1) = 0 + \max\{\, 0 - 0.78,\; 0 - 0.47,\; -0.78 - 0.47 \,\}$
$V_{I_1}(1) = -0.47$
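
A Python sketch of the delete and insert updates for column 1. The transition probabilities (1/3 for I0 → D1, 1/6 for M1 → I1, 1/3 for I1 → I1 and D1 → I1) are assumptions derived from the trained counts, and the dictionary names `A` and `V` are illustrative. The insert emission term is taken as 0, matching the leading "0 +" in the slide's formula.

```python
import math

# Assumed transition probabilities, read off the trained counts.
A = {
    ("I0", "D1"): 1 / 3,
    ("M1", "I1"): 1 / 6,
    ("I1", "I1"): 1 / 3,
    ("D1", "I1"): 1 / 3,
}

# Scores already filled in on earlier slides (unrounded).
V = {
    ("I0", 1): math.log10(1 / 6),
    ("M1", 0): 0.0,
    ("I1", 0): 0.0,
    ("D1", 0): math.log10(1 / 6),
}

# Delete consumes no observation, so it stays in the same column as I0.
v_d1_1 = V[("I0", 1)] + math.log10(A[("I0", "D1")])

# Insert consumes one observation and picks the best predecessor in column 0.
insert_emission = 0.0
v_i1_1 = insert_emission + max(
    V[("M1", 0)] + math.log10(A[("M1", "I1")]),
    V[("I1", 0)] + math.log10(A[("I1", "I1")]),
    V[("D1", 0)] + math.log10(A[("D1", "I1")]),
)

print(round(v_d1_1, 3), round(v_i1_1, 3))
```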

Page 19:

Viterbi Scoring

State   X (0)    A (1)    T (2)    A (3)
B/I0    V_B=0    -0.78    -1.25    -1.72
M1      0        0.06
I1      0        -0.47
D1      -0.78    -1.25
M2
I2
D2
E

[Grid legend: columns are Observations, rows are States; arrows mark Insert, Match, and Delete moves]

Page 20:

Viterbi Scoring

State   X (0)    A (1)    T (2)    A (3)
B/I0    V_B=0    -0.78    -1.25    -1.72
M1      0        0.06     -1.19    -1.49
I1      0        -0.47    -0.72    -1.19
D1      -0.78    -1.25    -1.72    -1.25
M2      0        -0.47    -0.41    -1.19
I2      0        -1.85    -1.07    -1.01
D2      -1.25    -1.25    -0.58    -1.36
E       V_E = -1.31

[Grid legend: columns are Observations, rows are States; arrows mark Insert, Match, and Delete moves]

Page 21:

Profile HMM - Simple Case

Demo in Python

Page 22:

Big Picture Revisited

  • PQS for Network Security (Us)
      • Design HMM for network event
      • Find event within linear stream of observed network events
  • Sequencing using Profile HMM (Bioinformatics)
      • Train HMM using known information about subsequence
      • Find subsequence within linear protein / genome sequence

Q: Did an event happen?

Q: If it exists, where is sequence?