Dynamic Programming for Pairwise Alignment
Dr Alexei Drummond
Department of Computer Science
BIOSCI 359, Semester 2, 2006
Dynamic Programming
• method for solving combinatorial optimisation problems
• guaranteed to give optimal solution
• generalisation of “divide-and-conquer”
• relies on “Principle of Optimality”
i.e. sub-optimal solution of subproblem cannot be part of optimal solution of original problem instance.
Principle of Optimality
[Figure: map showing Auckland, Te Kuiti and Wellington.]
Key to efficiency
• computation is carried out bottom-up
• store solutions to subproblems in a table
• all possible subproblems solved once each, beginning with smallest subproblems
• work up to original problem instance
• only optimal solutions to subproblems are used to compute solution to problem at next level
• DO NOT carry out computation in a naive recursive, top-down manner: without a table, the same subproblems would be solved many times
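The cost of repeated subproblems can be made concrete with a small sketch. It assumes a hypothetical two-index problem in which subproblem (i, j) depends on (i-1, j-1), (i-1, j) and (i, j-1), which is the dependency pattern the alignment recurrences in this lecture use. The function names are illustrative only.

```python
def naive_calls(i, j):
    """Count the calls made by a naive top-down recursion for subproblem (i, j)."""
    if i == 0 or j == 0:
        return 1  # base case: answered directly
    # Each call recurses into three smaller subproblems, with no table.
    return (1 + naive_calls(i - 1, j - 1)
              + naive_calls(i - 1, j)
              + naive_calls(i, j - 1))


def table_cells(m, n):
    """Count the subproblems a bottom-up table solves: one per cell."""
    return (m + 1) * (n + 1)


if __name__ == "__main__":
    for size in (2, 4, 8):
        print(size, naive_calls(size, size), table_cells(size, size))
```

The recursive count grows exponentially with the problem size, while the table grows only as the product of the two dimensions.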
Pairwise alignment
Sequences
x = a c g g t s
y = a w g c c t t
Alignment
x = a – c g g – t s
y = a w – g c c t t
Scoring
• Numeric score associated with each column
• Total score = sum of column scores
• Column types:
  (1) Identical (+ve)
  (2) Conservative (+ve)
  (3) Non-conservative (-ve)
  (4) Gap (-ve)
x = a – c g g – t s
y = a w – g c c t t
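As a minimal sketch of "total score = sum of column scores", the function below scores an alignment given as two equal-length rows. The scoring values are hypothetical (+2 for identical residues, -1 for any substitution, gap penalty 2) and are not taken from the lecture; the conservative/non-conservative distinction would come from a real substitution matrix. Gaps are written with an ASCII "-".

```python
def score_alignment(row_x, row_y, score, gap_penalty):
    """Sum the column scores of an alignment given as two equal-length rows."""
    if len(row_x) != len(row_y):
        raise ValueError("alignment rows must have equal length")
    total = 0
    for a, b in zip(row_x, row_y):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gaps")
        if a == "-" or b == "-":
            total -= gap_penalty      # gap column: negative contribution
        else:
            total += score(a, b)      # identical or substitution column
    return total


def toy_score(a, b):
    # Hypothetical scores: +2 for identical residues, -1 for any substitution.
    return 2 if a == b else -1


if __name__ == "__main__":
    # The alignment from the slide, with ASCII gap symbols.
    print(score_alignment("a-cgg-ts", "aw-gcctt", toy_score, gap_penalty=2))
```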
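Gap penalties
For a gap of length g (a run of g gap symbols in x or y):
• Linear score: $\gamma(g) = -gd$, where d is the gap penalty
• Affine score: $\gamma(g) = -d - (g-1)e$, where d is the gap-open penalty and e is the gap-extension penalty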
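The two gap scores can be written directly as functions of the gap length g. This is a small sketch; the parameter values in the usage lines are arbitrary illustrations, not values from the lecture.

```python
def gamma_linear(g, d):
    """Linear gap score: every gap position costs d."""
    return -g * d


def gamma_affine(g, d, e):
    """Affine gap score: d to open the gap, e for each further position."""
    return -d - (g - 1) * e


if __name__ == "__main__":
    for g in (1, 2, 3):
        print(g, gamma_linear(g, d=8), gamma_affine(g, d=12, e=2))
```

With e smaller than d, the affine score penalises one long gap less than several short gaps of the same total length.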
BLOSUM50 matrix
“Blocks Amino Acid Substitution Matrix”
[Figure: the BLOSUM50 matrix; image not available.]
Needleman & Wunsch algorithm
• Dynamic programming algorithm for global alignment
• Needleman & Wunsch (‘70), modified by Gotoh (‘82)
Assumptions:
• Linear gap score, with gap penalty d
• Symmetric scoring matrix S
  s(a,b) = s(b,a): score from lining up a and b
  s(a,-) = s(-,a) = -d: score from lining up a with -
Principle of Optimality
Given sequences:
$X = (x_1, x_2, \ldots, x_m)$
$Y = (y_1, y_2, \ldots, y_n)$
Define:
$F(i,j)$ = score of best alignment between $(x_1, x_2, \ldots, x_i)$ and $(y_1, y_2, \ldots, y_j)$
Principle of Optimality
Optimal alignment of $(x_1, x_2, \ldots, x_i)$ and $(y_1, y_2, \ldots, y_j)$ has score $F(i,j)$.
Looks like ……
• $(x_1, \ldots, x_{i-1})$ aligned with $(y_1, \ldots, y_{j-1})$, followed by $x_i$ aligned with $y_j$: score $F(i-1,j-1) + s(x_i,y_j)$
or ……
• $(x_1, \ldots, x_i)$ aligned with $(y_1, \ldots, y_{j-1})$, followed by $-$ aligned with $y_j$: score $F(i,j-1) - d$
or ……
• $(x_1, \ldots, x_{i-1})$ aligned with $(y_1, \ldots, y_j)$, followed by $x_i$ aligned with $-$: score $F(i-1,j) - d$
so ……
$$F(i,j) = \max \begin{cases} F(i-1,j-1) + s(x_i,y_j) \\ F(i,j-1) - d \\ F(i-1,j) - d \end{cases}$$
Principle of Optimality
Basis:
$F(0,0) = 0$
$F(i,0) = F(i-1,0) + s(x_i,-)$   ($x_1, x_2, \ldots, x_i$ aligned with $i$ gap symbols)
$F(0,j) = F(0,j-1) + s(-,y_j)$   ($y_1, y_2, \ldots, y_j$ aligned with $j$ gap symbols)
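The basis and the three-way recurrence translate into a short table-filling routine. The sketch below assumes a linear gap penalty d and takes the substitution score as a function `s`; the function name `nw_fill` and the toy scoring in the usage lines are illustrative assumptions, not part of the lecture.

```python
def nw_fill(x, y, s, d):
    """Fill the (m+1) x (n+1) matrix F for global alignment of x and y."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    # Basis: a prefix aligned entirely against gap symbols.
    for i in range(1, m + 1):
        F[i][0] = F[i - 1][0] - d
    for j in range(1, n + 1):
        F[0][j] = F[0][j - 1] - d
    # Recurrence: best of the three possible last columns.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = max(F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),  # x_i with y_j
                          F[i][j - 1] - d,                          # gap with y_j
                          F[i - 1][j] - d)                          # x_i with gap
    return F


if __name__ == "__main__":
    toy = lambda a, b: 1 if a == b else -1   # hypothetical scores
    for row in nw_fill("AC", "AGC", toy, d=2):
        print(row)
```

Python strings are indexed from 0, so `x[i - 1]` plays the role of $x_i$.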
Filling up table
[Figure: the F matrix, with rows 0, 1, 2, …, m indexed by X and columns 0, 1, 2, …, n indexed by Y, filled in step by step starting from the 0 in cell (0,0). The last cell, F(m,n), holds the optimal alignment score.]
Constructing alignment
[Figure: the F matrix with the optimal alignment score marked; the alignment is constructed by tracing back from that cell.]
Example
F matrix (X = PAWHEAE down the rows, i = 0, …, m; Y = HEAGAWGHEE across the columns, j = 0, …, n):

            H    E    A    G    A    W    G    H    E    E
       0   -8  -16  -24  -32  -40  -48  -56  -64  -72  -80
P     -8   -2   -9  -17  -25  -33  -41  -49  -57  -65  -73
A    -16  -10   -3   -4  -12  -20  -28  -36  -44  -52  -60
W    -24  -18  -11   -6   -7  -15   -5  -13  -21  -29  -37
H    -32  -14  -18  -13   -8   -9  -13   -7   -3  -11  -19
E    -40  -22   -8  -16  -16   -9  -12  -15   -7    3   -5
A    -48  -30  -16   -3  -11  -11  -12  -12  -15   -5    2
E    -56  -38  -24  -11   -6  -12  -14  -15  -12   -9    1

Optimal alignment score: F(m,n) = 1
Alignment (built from right to left, one column per traceback step starting at F(m,n)):
Y = H E A G A W G H E - E
X = - - P - A W - H E A E
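The example can be reproduced with a short sketch of the fill and traceback. Several things here are assumptions rather than statements from the slides: the labelling X = PAWHEAE and Y = HEAGAWGHEE, the gap penalty d = 8 (inferred from the first row of the table), the BLOSUM50 values entered by hand for the six residues involved, and the tie-breaking order in the traceback (diagonal first).

```python
RESIDUES = "AEGHPW"
# BLOSUM50 scores for the six residues in the example, entered by hand.
BLOSUM50_SUBSET = [
    [5, -1, 0, -2, -1, -3],    # A
    [-1, 6, -3, 0, -1, -3],    # E
    [0, -3, 8, -2, -2, -3],    # G
    [-2, 0, -2, 10, -2, -3],   # H
    [-1, -1, -2, -2, 10, -4],  # P
    [-3, -3, -3, -3, -4, 15],  # W
]


def s(a, b):
    """Substitution score for residues a and b."""
    return BLOSUM50_SUBSET[RESIDUES.index(a)][RESIDUES.index(b)]


def needleman_wunsch(x, y, d=8):
    """Return (F, aligned_x, aligned_y) for a global alignment of x and y."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = -i * d
    for j in range(1, n + 1):
        F[0][j] = -j * d
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = max(F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                          F[i][j - 1] - d,
                          F[i - 1][j] - d)
    # Traceback from F(m, n): at each cell, step to a predecessor that
    # explains its value, emitting one alignment column per step.
    ax, ay = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif j > 0 and F[i][j] == F[i][j - 1] - d:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
        else:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
    return F, "".join(reversed(ax)), "".join(reversed(ay))


if __name__ == "__main__":
    F, ax, ay = needleman_wunsch("PAWHEAE", "HEAGAWGHEE")
    print(F[-1][-1])
    print(ay)
    print(ax)
```

Where several predecessors explain a cell equally well, a different tie-breaking order can yield a different alignment with the same score.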
Time and space
• $(m+1) \times (n+1)$ table entries $\Rightarrow \Theta(mn)$ space
• Each entry computed in constant time $\Rightarrow \Theta(mn)$ time
Smith & Waterman algorithm
Computes local alignment,
i.e. looks for the best alignment of subsequences of X and Y, ignoring scores of regions on either side.
[Figure: sequences X and Y with the best subsequence alignment marked.]
Principle of Optimality
Given sequences
$X = (x_1, x_2, \ldots, x_m)$
$Y = (y_1, y_2, \ldots, y_n)$
Define $F(i,j)$ = score of best suffix alignment between $(x_s, x_{s+1}, \ldots, x_i)$, where $s \le i$, and $(y_r, y_{r+1}, \ldots, y_j)$, where $r \le j$
N.B. Includes empty alignment with score 0
Dynamic Programming recurrences
Optimal alignment of $(x_s, x_{s+1}, \ldots, x_i)$ and $(y_r, y_{r+1}, \ldots, y_j)$ has score $F(i,j)$.
Looks like ……
• $(x_s, \ldots, x_{i-1})$ aligned with $(y_r, \ldots, y_{j-1})$, followed by $x_i$ aligned with $y_j$: score $F(i-1,j-1) + s(x_i,y_j)$
or ……
• $(x_s, \ldots, x_i)$ aligned with $(y_r, \ldots, y_{j-1})$, followed by $-$ aligned with $y_j$: score $F(i,j-1) - d$
or ……
• $(x_s, \ldots, x_{i-1})$ aligned with $(y_r, \ldots, y_j)$, followed by $x_i$ aligned with $-$: score $F(i-1,j) - d$
or ……
• the empty alignment: score $0$
so ……
$$F(i,j) = \max \begin{cases} 0 \\ F(i-1,j-1) + s(x_i,y_j) \\ F(i,j-1) - d \\ F(i-1,j) - d \end{cases}$$
Basis: $F(i,0) = F(0,j) = 0$
Example
F matrix (X = PAWHEAE down the rows, Y = HEAGAWGHEE across the columns):

          H   E   A   G   A   W   G   H   E   E
      0   0   0   0   0   0   0   0   0   0   0
P     0   0   0   0   0   0   0   0   0   0   0
A     0   0   0   5   0   5   0   0   0   0   0
W     0   0   0   0   2   0  20  12   4   0   0
H     0  10   2   0   0   0  12  18  22  14   6
E     0   2  16   8   0   0   4  10  18  28  20
A     0   0   8  21  13   5   0   4  10  20  27
E     0   0   6  13  18  12   4   0   4  16  26

Alignment:
Y = A W G H E
X = A W - H E
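A minimal sketch of local alignment follows: the fill adds the 0 option, and the traceback starts at the highest-scoring cell and stops when it reaches a 0. The labelling X = PAWHEAE and Y = HEAGAWGHEE, the gap penalty d = 8 and the hand-entered BLOSUM50 values are assumptions made for this illustration.

```python
RESIDUES = "AEGHPW"
# BLOSUM50 scores for the six residues in the example, entered by hand.
BLOSUM50_SUBSET = [
    [5, -1, 0, -2, -1, -3],    # A
    [-1, 6, -3, 0, -1, -3],    # E
    [0, -3, 8, -2, -2, -3],    # G
    [-2, 0, -2, 10, -2, -3],   # H
    [-1, -1, -2, -2, 10, -4],  # P
    [-3, -3, -3, -3, -4, 15],  # W
]


def s(a, b):
    """Substitution score for residues a and b."""
    return BLOSUM50_SUBSET[RESIDUES.index(a)][RESIDUES.index(b)]


def smith_waterman(x, y, d=8):
    """Return (best score, aligned_x, aligned_y) for a local alignment."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]   # basis: first row/column are 0
    best, bi, bj = 0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = max(0,                                        # empty alignment
                          F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                          F[i][j - 1] - d,
                          F[i - 1][j] - d)
            if F[i][j] > best:
                best, bi, bj = F[i][j], i, j
    # Traceback from the best cell until a cell with score 0 is reached.
    ax, ay = [], []
    i, j = bi, bj
    while i > 0 and j > 0 and F[i][j] > 0:
        if F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif F[i][j] == F[i][j - 1] - d:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
        else:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))


if __name__ == "__main__":
    print(smith_waterman("PAWHEAE", "HEAGAWGHEE"))
```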
Repeated (local) matches
Long sequences: interested in all local alignments with significant score, i.e. score > threshold T,
e.g. copies of a repeated domain or motif in a protein.
X = sequence containing motif
Y = target sequence
Method is asymmetric
[Figure: target sequence Y with the regions matching parts of X marked.]
Principle of Optimality
Given sequences
$X = (x_1, x_2, \ldots, x_m)$
$Y = (y_1, y_2, \ldots, y_n)$
Define $F(i,j)$ $(i \ge 1)$ = best sum of match scores in $(x_1, x_2, \ldots, x_i)$ and $(y_1, y_2, \ldots, y_j)$, assuming $y_j$ is in a matched region and match ends in $x_i$ or $y_j$
Ends of matches
$F(0,0) = 0$
$F(0,j)$ = best sum of completed match scores to $(y_1, y_2, \ldots, y_j)$, assuming that $y_j$ is not in a matched region
$$F(0,j) = \max \begin{cases} F(0,j-1) \\ F(i,j-1) - T, & i = 1, \ldots, m \end{cases}$$
Row 0 therefore marks unmatched regions and ends of matches in Y.
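General recurrence
$$F(i,j) = \max \begin{cases} F(0,j) & \text{start of new match} \\ F(i-1,j-1) + s(x_i,y_j) & \text{extension of previous match} \\ F(i,j-1) - d & \text{extension of previous match} \\ F(i-1,j) - d & \text{extension of previous match} \end{cases}$$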
Example
F matrix (X = PAWHEAE down the rows, Y = HEAGAWGHEE across the columns; the extra cell at the end of row 0 holds the final total score):

          H   E   A   G   A   W   G   H   E   E
      0   0   0   0   1   1   1   1   1   3   9   9
P     0   0   0   0   1   1   1   1   1   3   9
A     0   0   0   5   1   6   1   1   1   3   9
W     0   0   0   0   2   1  21  13   5   3   9
H     0  10   2   0   1   1  13  19  23  15   9
E     0   2  16   8   1   1   5  11  19  29  21
A     0   0   8  21  13   6   1   5  11  21  28
E     0   0   6  13  18  12   4   1   5  17  27

Alignment:
Y = H E A G A W G H E E
X = H E A . A W - H E .
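The fill for repeated matches can be sketched column by column: row 0 is updated first from the previous column, then the rest of the column from the general recurrence. The threshold T = 20 is an assumption (it is the value that reproduces the tabulated scores; the slides do not state it), as are d = 8, the labelling X = PAWHEAE and Y = HEAGAWGHEE, and the hand-entered BLOSUM50 values.

```python
RESIDUES = "AEGHPW"
# BLOSUM50 scores for the six residues in the example, entered by hand.
BLOSUM50_SUBSET = [
    [5, -1, 0, -2, -1, -3],    # A
    [-1, 6, -3, 0, -1, -3],    # E
    [0, -3, 8, -2, -2, -3],    # G
    [-2, 0, -2, 10, -2, -3],   # H
    [-1, -1, -2, -2, 10, -4],  # P
    [-3, -3, -3, -3, -4, 15],  # W
]


def s(a, b):
    """Substitution score for residues a and b."""
    return BLOSUM50_SUBSET[RESIDUES.index(a)][RESIDUES.index(b)]


def repeated_matches(x, y, T, d=8):
    """Return (F, total); F[0][n + 1] is the extra cell with the final total.

    Assumes x is non-empty.
    """
    m, n = len(x), len(y)
    F = [[0] * (n + 2) for _ in range(m + 1)]
    for j in range(1, n + 2):
        # Row 0: stay unmatched, or end a match from column j - 1 (cost T).
        best_prev = max(F[i][j - 1] for i in range(1, m + 1))
        F[0][j] = max(F[0][j - 1], best_prev - T)
        if j == n + 1:
            break  # the extra column holds only the row-0 total
        for i in range(1, m + 1):
            F[i][j] = max(F[0][j],                                   # start new match
                          F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),   # extend match
                          F[i][j - 1] - d,
                          F[i - 1][j] - d)
    return F, F[0][n + 1]


if __name__ == "__main__":
    F, total = repeated_matches("PAWHEAE", "HEAGAWGHEE", T=20)
    print(total)
```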
Overlap matches
[Figure: four arrangements of X and Y with overhanging ends.]
Don’t penalise overhanging ends, i.e. set $F(i,0) = F(0,j) = 0$
Otherwise
$$F(i,j) = \max \begin{cases} F(i-1,j-1) + s(x_i,y_j) \\ F(i,j-1) - d \\ F(i-1,j) - d \end{cases}$$
Example
F matrix (X = PAWHEAE down the rows, Y = HEAGAWGHEE across the columns):

          H   E   A   G   A   W   G   H   E   E
      0   0   0   0   0   0   0   0   0   0   0
P     0  -2  -1  -1  -2  -1  -4  -2  -2  -1  -1
A     0  -2  -3   4  -1   3  -4  -4  -4  -3  -2
W     0  -3  -5  -4   1  -4  18  10   2  -6  -6
H     0  10   2  -6  -6  -1  10  16  20  12   4
E     0   2  16   8   0  -7   2   8  16  26  18
A     0  -2   8  21  13   5  -3   2   8  18  25
E     0   0   4  13  18  12   4  -4   2  14  24

Alignment:
Y = G A W G H E E
X = P A W - H E A
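A sketch of overlap matching follows. The zero basis and the recurrence come from the slide; starting the traceback at the best score found in the last row or last column, and stopping when the first row or column is reached, is the usual way to leave overhanging ends unpenalised and is stated here as an assumption. The labelling X = PAWHEAE and Y = HEAGAWGHEE, d = 8 and the hand-entered BLOSUM50 values are also assumptions.

```python
RESIDUES = "AEGHPW"
# BLOSUM50 scores for the six residues in the example, entered by hand.
BLOSUM50_SUBSET = [
    [5, -1, 0, -2, -1, -3],    # A
    [-1, 6, -3, 0, -1, -3],    # E
    [0, -3, 8, -2, -2, -3],    # G
    [-2, 0, -2, 10, -2, -3],   # H
    [-1, -1, -2, -2, 10, -4],  # P
    [-3, -3, -3, -3, -4, 15],  # W
]


def s(a, b):
    """Substitution score for residues a and b."""
    return BLOSUM50_SUBSET[RESIDUES.index(a)][RESIDUES.index(b)]


def overlap_alignment(x, y, d=8):
    """Return (best score, aligned_x, aligned_y) for an overlap match."""
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]   # F(i,0) = F(0,j) = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = max(F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                          F[i][j - 1] - d,
                          F[i - 1][j] - d)
    # Best end point: anywhere on the last column or the last row.
    ends = ([(F[i][n], i, n) for i in range(m + 1)]
            + [(F[m][j], m, j) for j in range(n + 1)])
    best, i, j = max(ends)
    # Traceback until the first row or first column is reached.
    ax, ay = [], []
    while i > 0 and j > 0:
        if F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif F[i][j] == F[i][j - 1] - d:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
        else:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))


if __name__ == "__main__":
    print(overlap_alignment("PAWHEAE", "HEAGAWGHEE"))
```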
Affine gap penalties
• Affine score: $\gamma(g) = -d - (g-1)e$, where d is the gap-open penalty and e is the gap-extension penalty
• Different penalties associated with extending alignment with gap symbol:
Y = C C T W P
X = C S T W -
different from
Y = C C T W P
X = C S T - -
General recurrence
For $i, j > 0$:
$$F(i,j) = \max \begin{cases} F(i-1,j-1) + s(x_i,y_j) & \text{extend by matching } x_i \text{ and } y_j \\ F(k,j) + \gamma(i-k), \; k = 0, 1, \ldots, i-1 & \text{extend by matching suffix of X to gap of length } i-k \\ F(i,k) + \gamma(j-k), \; k = 0, 1, \ldots, j-1 & \text{extend by matching suffix of Y to gap of length } j-k \end{cases}$$
Problem: Procedure runs in worst-case time $\Theta(n^3)$
$\Theta(n^2)$ version
Extra variables
$M(i,j)$ = best score of alignment of $(x_1, x_2, \ldots, x_i)$ and $(y_1, y_2, \ldots, y_j)$ given that $x_i$ is aligned with $y_j$
$I_x(i,j)$ = best score of alignment of $(x_1, x_2, \ldots, x_i)$ and $(y_1, y_2, \ldots, y_j)$ given that $x_i$ is aligned with a gap
$I_y(i,j)$ = best score of alignment of $(x_1, x_2, \ldots, x_i)$ and $(y_1, y_2, \ldots, y_j)$ given that $y_j$ is aligned with a gap
Recurrences
For $i, j > 0$:
$$M(i,j) = \max \begin{cases} M(i-1,j-1) + s(x_i,y_j) \\ I_x(i-1,j-1) + s(x_i,y_j) \\ I_y(i-1,j-1) + s(x_i,y_j) \end{cases}$$
$$I_x(i,j) = \max \begin{cases} M(i-1,j) - d & x_i \text{ aligned to start of gap} \\ I_x(i-1,j) - e & x_i \text{ aligned to continuation of gap} \end{cases}$$
$$I_y(i,j) = \max \begin{cases} M(i,j-1) - d & y_j \text{ aligned to start of gap} \\ I_y(i,j-1) - e & y_j \text{ aligned to continuation of gap} \end{cases}$$
Procedure runs in worst-case time $\Theta(n^2)$
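The three recurrences can be sketched as three tables filled together. The slides do not give the basis for M, Ix and Iy, so the initialisation below is an assumption: M(0,0) = 0, a leading gap of length g costs d + (g-1)e, and every other boundary cell is minus infinity. The function name and the toy scoring in the usage lines are also illustrative.

```python
NEG_INF = float("-inf")


def affine_global_score(x, y, s, d, e):
    """Best global alignment score with gap-open d and gap-extension e."""
    m, n = len(x), len(y)
    M = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    Ix = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    Iy = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    # Assumed basis: empty alignment, or one leading gap of length i (or j).
    M[0][0] = 0
    for i in range(1, m + 1):
        Ix[i][0] = -d - (i - 1) * e
    for j in range(1, n + 1):
        Iy[0][j] = -d - (j - 1) * e
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sij = s(x[i - 1], y[j - 1])
            # x_i aligned with y_j, after any of the three states.
            M[i][j] = max(M[i - 1][j - 1], Ix[i - 1][j - 1], Iy[i - 1][j - 1]) + sij
            # x_i aligned with a gap: open a gap (d) or continue one (e).
            Ix[i][j] = max(M[i - 1][j] - d, Ix[i - 1][j] - e)
            # y_j aligned with a gap: open a gap (d) or continue one (e).
            Iy[i][j] = max(M[i][j - 1] - d, Iy[i][j - 1] - e)
    return max(M[m][n], Ix[m][n], Iy[m][n])


if __name__ == "__main__":
    toy = lambda a, b: 1 if a == b else -1   # hypothetical scores
    print(affine_global_score("AAAA", "AA", toy, d=3, e=1))
```

Each cell of each table is computed in constant time, which is where the quadratic running time comes from.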
Linear space alignment
Hirschberg’s insight
[Figure: the F matrix, from cell (0,0) to cell (m,n), divided at row $\lfloor m/2 \rfloor$.]
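One building block of linear-space alignment can be sketched directly: each row of F depends only on the row above it, so the scores in any given row can be computed while storing just two rows. This sketch is not Hirschberg's full method, which additionally recovers the alignment by dividing the problem at the middle row; it only shows the score computation, under the linear gap penalty used earlier, with an illustrative function name and toy scoring.

```python
def nw_last_row(x, y, s, d):
    """Return the last row of the global alignment matrix F for x and y.

    Only two rows are stored at any time, so the space used is
    proportional to len(y) rather than len(x) * len(y).
    """
    n = len(y)
    prev = [-j * d for j in range(n + 1)]        # row 0: F(0, j) = -j * d
    for i in range(1, len(x) + 1):
        curr = [-i * d] + [0] * n                # F(i, 0) = -i * d
        for j in range(1, n + 1):
            curr[j] = max(prev[j - 1] + s(x[i - 1], y[j - 1]),
                          curr[j - 1] - d,
                          prev[j] - d)
        prev = curr                              # the older row is discarded
    return prev


if __name__ == "__main__":
    toy = lambda a, b: 1 if a == b else -1       # hypothetical scores
    print(nw_last_row("AC", "AGC", toy, d=2))
```

The final entry of the returned row is the optimal global alignment score F(m,n).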