Multiple Sequence Alignment
黃光璿, Department of Computer Science and Information Engineering, National Chi Nan University, 2004/05/31
What is a multiple alignment?
An alignment of ten I-set immunoglobulin superfamily domains
Motivation
A multiple alignment may suggest
- a common structure of the protein products;
- a common function;
- a common evolutionary source.
Issues
How do we define a meaningful scoring function for an alignment?
- evolutionarily correct alignment (more difficult!)
- structural alignment
How do we find the best alignment?
- by algorithms
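One common answer to the scoring question is the sum-of-pairs score that these slides name later. The sketch below is illustrative only: the `MATCH`, `MISMATCH`, and `GAP` values and the function names are assumptions, not taken from the slides, and a real protein scorer would use a substitution matrix instead of a flat match/mismatch score.

```python
from itertools import combinations

# Hypothetical scoring parameters (assumptions, not from the slides).
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(a: str, b: str) -> int:
    """Score one aligned pair of symbols; '-' denotes a gap."""
    if a == "-" and b == "-":
        return 0  # a gap aligned with a gap costs nothing
    if a == "-" or b == "-":
        return GAP
    return MATCH if a == b else MISMATCH


def sum_of_pairs(alignment: list[str]) -> int:
    """Add up the pairwise scores of every column over all pairs of rows."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have equal length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            total += pair_score(a, b)
    return total


alignment = [
    "ACG-T",
    "A-GCT",
    "ACGCT",
]
print(sum_of_pairs(alignment))  # prints 3
```

Each column is scored independently, so the total is also the sum of the scores of the three induced pairwise alignments.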
Three types of alignment problems
- DNA
- protein (joined by disulfide bonds)
- RNA (more difficult due to long-range correlations)
We focus on alignment problems for DNA and protein sequences.
To prove that a computational problem is NP-hard, we reduce a known NP-complete (or NP-hard) problem to it in polynomial time.
When a computational problem is NP-hard, we deal with it by
- heuristics: convince other people by experiments
- approximation: how do we analyze the performance?
- randomization: how do we design a reasonable algorithm?
Branch-and-bound heuristic for the DP algorithm for sum-of-pairs alignment: Carrillo & Lipman (1988)
The idea was implemented in the well-known program MSA (Lipman, Altschul & Kececioglu, 1989).
MSA can align 6 sequences of length ~200 in reasonable time.
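The pruning idea can be sketched as follows, assuming a sum-of-pairs score with linear gap costs and a lower bound `sigma` on the optimal score, taken from any heuristic multiple alignment. In the optimal multiple alignment, the projection onto sequences k and l must then score at least `sigma` minus the sum of the optimal pairwise scores of all other pairs. The scoring values and function names below are hypothetical, and this is not the code of the MSA program.

```python
from itertools import combinations

# Hypothetical linear scoring scheme (an assumption, not from the slides).
MATCH, MISMATCH, GAP = 1, -1, -2


def pairwise_optimum(x: str, y: str) -> int:
    """Optimal global pairwise alignment score (Needleman-Wunsch, linear gaps)."""
    prev = [j * GAP for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        curr = [i * GAP]
        for j in range(1, len(y) + 1):
            sub = MATCH if x[i - 1] == y[j - 1] else MISMATCH
            curr.append(max(prev[j - 1] + sub, prev[j] + GAP, curr[j - 1] + GAP))
        prev = curr
    return prev[-1]


def carrillo_lipman_bounds(seqs: list[str], sigma: int) -> dict:
    """Lower bound beta[(k, l)] on the score of each pairwise projection.

    sigma is the sum-of-pairs score of any known multiple alignment,
    so it is a lower bound on the optimal score.
    """
    best = {
        (k, l): pairwise_optimum(seqs[k], seqs[l])
        for k, l in combinations(range(len(seqs)), 2)
    }
    total = sum(best.values())
    # beta_kl = sigma - (sum of optimal scores of all OTHER pairs)
    return {pair: sigma - (total - score) for pair, score in best.items()}


seqs = ["ACGT", "AGCT", "ACGCT"]
# sigma = 3 is the sum-of-pairs score of the heuristic alignment
# ACG-T / A-GCT / ACGCT under the scoring scheme above.
print(carrillo_lipman_bounds(seqs, sigma=3))
# prints {(0, 1): -1, (0, 2): 1, (1, 2): 1}
```

A cell of the full dynamic-programming lattice needs to be visited only if, for every pair (k, l), the best pairwise alignment passing through the cell's projection scores at least `beta[(k, l)]`. That restriction is what shrinks the search space.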
References and image sources
1. R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998.