Recent Progress in Multiple Sequence Alignments: A Survey

Page 1: Recent Progress in Multiple Sequence Alignments: A Survey

Cédric Notredame (21/04/23)

Recent Progress in Multiple Sequence Alignments: A Survey

Cédric Notredame

Page 2: Recent Progress in Multiple Sequence Alignments: A Survey

Our Scope

What Are the Existing Methods?

How Do They Work?
-Assembly Algorithms
-Weighting Schemes

When Do They Work?

What Does the Future Hold?

Page 3: Recent Progress in Multiple Sequence Alignments: A Survey

Outline

-Introduction

-A taxonomy of the existing Packages

-A few algorithms…

-Performance Comparison using BaliBase

Page 4: Recent Progress in Multiple Sequence Alignments: A Survey

Introduction

Page 5: Recent Progress in Multiple Sequence Alignments: A Survey

What Is A Multiple Sequence Alignment?

An MSA is a MODEL

It Indicates the RELATIONSHIP between residues of different sequences.

It REVEALS
-Similarities
-Inconsistencies

LIKE ANY MODEL

Page 6: Recent Progress in Multiple Sequence Alignments: A Survey

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP

chite AATAKQNYIRALQEYERNGG-
wheat ANKLKGEYNKAIAAYNKGESA
trybr AEKDKERYKREM---------
mouse AKDDRIRYDNEMKSWEEQMAE

[Conservation line (* : .) under each block: column positions lost in extraction]

How Can I Use A Multiple Sequence Alignment?

Extrapolation

Motifs/Patterns

Phylogeny

Profiles

Structure Prediction

Multiple Alignments Are CENTRAL to MOST Bioinformatics Techniques.

Page 7: Recent Progress in Multiple Sequence Alignments: A Survey

How Can I Use A Multiple Sequence Alignment?

Multiple Alignment Is the Most INTEGRATIVE Method Available Today.

We Need MSA to INCORPORATE existing DATA

Page 8: Recent Progress in Multiple Sequence Alignments: A Survey

Why Is It Difficult To Compute A Multiple Sequence Alignment?

A CROSSROAD PROBLEM

BIOLOGY: What Is A Good Alignment?

COMPUTATION: What Is THE Good Alignment?

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP

[Conservation line (* : .): column positions lost in extraction]

Page 9: Recent Progress in Multiple Sequence Alignments: A Survey

Why Is It Difficult To Compute A Multiple Sequence Alignment?

CIRCULAR PROBLEM....

[Figure: cycle linking Good Sequences and Good Alignment, between BIOLOGY and COMPUTATION]

Page 10: Recent Progress in Multiple Sequence Alignments: A Survey

A Taxonomy of Multiple Sequence Alignment Methods

Page 11: Recent Progress in Multiple Sequence Alignments: A Survey

Grouping According to the assembly Algorithm

Page 12: Recent Progress in Multiple Sequence Alignments: A Survey

Simultaneous As opposed to Progressive

Exact As opposed to Heuristic

Stochastic As opposed to Deterministic

Iterative As opposed to Non Iterative

[Simultaneous: they simultaneously use all the information]

[Heuristics: cut corners, like BLAST versus Smith-Waterman (SW)]

[Heuristics: do not guarantee an optimal solution]

[Stochastic: contain an element of randomness]

[Stochastic: Example of a Monte Carlo surface estimation]

[Iterative: Most stochastic methods are iterative]

[Iterative: run the same algorithm many times]
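
The Monte Carlo surface estimation mentioned in the notes can be sketched as follows. This is only an illustration of what "an element of randomness" means, not code from any alignment package; the function names, the unit disc test shape and the fixed seed are assumptions chosen for the example.

```python
import random


def monte_carlo_area(inside, n_samples, seed=0):
    """Estimate the area of a shape contained in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # fixed seed only so that the run is repeatable
    hits = 0
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if inside(x, y):
            hits += 1
    # The square has area 4; the fraction of hits scales it down to the shape.
    return 4.0 * hits / n_samples


def unit_disc(x, y):
    """Hypothetical test shape: the disc of radius 1, whose true area is pi."""
    return x * x + y * y <= 1.0


estimate = monte_carlo_area(unit_disc, 200_000)
print(estimate)
```

Each run with a different seed gives a slightly different answer, and more samples give a better estimate, which is also why stochastic methods tend to be iterative.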

Page 13: Recent Progress in Multiple Sequence Alignments: A Survey

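[Figure: packages arranged by assembly algorithm; exact placement not recoverable]

Categories: Iterative, Progressive, Simultaneous, Non tree based (GAs, HMMs)

Packages shown: Iteralign, Prrp, SAM, HMMer, SAGA (GA), Clustal, Dialign, T-Coffee, MSA, POA, OMA, Praline, MAFFT, DCA, Combalign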

Page 14: Recent Progress in Multiple Sequence Alignments: A Survey

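[Figure: packages arranged by assembly algorithm; exact placement not recoverable]

Categories: Iterative, Progressive, Simultaneous, Stochastic

Packages shown: Iteralign, Prrp, SAM, HMMer, SAGA (GA), Clustal, Dialign, T-Coffee, MSA, POA, OMA, Praline, MAFFT, DCA, Combalign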

Page 15: Recent Progress in Multiple Sequence Alignments: A Survey

NEARLY EVERY OPTIMISATION ALGORITHM HAS BEEN APPLIED TO THE MSA PROBLEM!!!

Page 16: Recent Progress in Multiple Sequence Alignments: A Survey

Grouping According to the Objective Function

Page 17: Recent Progress in Multiple Sequence Alignments: A Survey

Scoring an Alignment: Evolutionary based methods

BIOLOGY: How many events separate my sequences?

Such an evaluation relies on a biological model.

COMPUTATION: Every position must be independent

Page 18: Recent Progress in Multiple Sequence Alignments: A Survey

REAL Tree

Model: ALL the sequences evolved from the same ancestor

Column: A A A C C

[Figure: true tree relating the five residues A, A, A, C, C]

Tree: Cost=1

PROBLEM: We do not know the true tree

Page 19: Recent Progress in Multiple Sequence Alignments: A Survey

STAR Tree

Model: ALL the sequences have the same ancestor

Column: A A A C C

[Figure: star tree joining the five residues to a single ancestor A]

Star Tree: Cost=2

PROBLEM: the star tree is phylogenetically wrong

Page 20: Recent Progress in Multiple Sequence Alignments: A Survey

Sums of Pairs

Model=Every sequence is the ancestor of every sequence

Column: A A A C C

[Figure: every residue of the column linked to every other residue]

Sums of Pairs: Cost=6

PROBLEM:
-over-estimation of the mutation costs
-Requires a weighting scheme

$S(m_i) = \sum_{k<l} s(m_i^k, m_i^l)$

[s(a,b): matrix]

[i: column i]

[k, l: seq index]
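
The costs quoted for the column AAACC under the star tree and the sums of pairs can be checked with a short sketch. It assumes a unit cost of 1 for any mismatch and 0 for a match, which is an illustrative choice rather than a real substitution matrix; the function names are hypothetical.

```python
from itertools import combinations


def unit_cost(a, b):
    """Assumed cost: 0 for identical residues, 1 for any substitution."""
    return 0 if a == b else 1


def sum_of_pairs_cost(column, cost=unit_cost):
    """Every residue is compared with every other residue of the column."""
    return sum(cost(a, b) for a, b in combinations(column, 2))


def star_cost(column, cost=unit_cost):
    """All residues descend from one ancestor; keep the cheapest ancestor."""
    return min(sum(cost(ancestor, r) for r in column) for ancestor in set(column))


column = "AAACC"
print(sum_of_pairs_cost(column))  # 6
print(star_cost(column))          # 2
```

The same single A to C change is counted once by the true tree, twice by the star tree and six times by the sums of pairs, which is the over-estimation the slide points at.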

Page 21: Recent Progress in Multiple Sequence Alignments: A Survey

Sums of Pairs: Some of its limitations (Durbin, p140)

Column of N Leucines (L):

Cost = 5*N*(N-1)/2 [5: Leucine Vs Leucine with Blosum50]

Same column with one Leucine replaced by a Glycine (G):

Cost = 5*N*(N-1)/2 - (5)*(N-1) + (-4)*(N-1) [glycine effect]

Cost = 5*N*(N-1)/2 - (9)*(N-1)

Page 22: Recent Progress in Multiple Sequence Alignments: A Survey

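Sums of Pairs: Some of its limitations (Durbin, p140)

Column of N Leucines (L) in which one Leucine is replaced by a Glycine (G)

$\Delta = \frac{2 \cdot 9 \cdot (N-1)}{5 \cdot N \cdot (N-1)} = \frac{2 \cdot 9}{5 \cdot N}$

[Plot: Delta against N]

Conclusion: The more Leucines, the less expensive it gets to add a Glycine to the column...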
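
A small sketch makes the glycine effect concrete. It uses only the two scores quoted on these slides (5 for Leucine against Leucine, -4 for Leucine against Glycine); the table and function names are hypothetical and this is not a full substitution matrix.

```python
from itertools import combinations

# Only the two values quoted on the slides; keys are alphabetically sorted pairs.
SCORES = {("L", "L"): 5, ("G", "L"): -4}


def pair_score(a, b):
    return SCORES[tuple(sorted((a, b)))]


def sp_score(column):
    """Unweighted sum-of-pairs score of one column."""
    return sum(pair_score(a, b) for a, b in combinations(column, 2))


def relative_glycine_cost(n):
    """Score lost by swapping one L for a G, relative to the all-L column."""
    conserved = sp_score("L" * n)
    with_glycine = sp_score("L" * (n - 1) + "G")
    return (conserved - with_glycine) / conserved


for n in (3, 5, 10, 50):
    print(n, relative_glycine_cost(n))
```

The printed ratio follows 18/(5*N): it shrinks as the column grows, so a Glycine looks relatively cheaper in a deep column of Leucines.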

Page 23: Recent Progress in Multiple Sequence Alignments: A Survey

Entropy based Functions

Model: Minimize the entropy (variety) in each Column

Column: A A A C C

PROBLEM:
-requires a simultaneous alignment
-assumes independent sequences

$c_{ia} = \sum_j \delta(m_i^j = a)$ [number of residues a (e.g. Alanine) in column i]

$S(m_i) = -\sum_a c_{ia} \log P_{ia}$ [Score of column i] [a: alphabet]

[P can incorporate pseudocounts]

S=0 if the column is conserved
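
As a rough sketch of the entropy score of one column, the code below takes P as the plain observed frequency of each residue in the column. That choice, the absence of pseudocounts and the function name are assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy_score(column):
    """Minus the sum over residues of count * log(frequency); lower is better."""
    counts = Counter(column)
    total = len(column)
    return -sum(c * math.log(c / total) for c in counts.values())


for column in ("AAAAA", "AAAAC", "AAACC"):
    print(column, column_entropy_score(column))
```

A fully conserved column scores 0, and the score grows with the variety in the column.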

Page 24: Recent Progress in Multiple Sequence Alignments: A Survey

Consistency based Functions

Model: Maximise the consistency (agreement) with a list of constraints (alignments)

PROBLEM:
-requires a list of constraints

Column: A A A C C

$S_i = \sum_{k<l} \delta(m_i^k, m_i^l)$ [k and l are sequences, i is a column]

$\delta(m_i^k, m_i^l) = 1$ if the pair $(m_i^k, m_i^l)$ exists [the two residues are found aligned in the list of constraints]
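
A consistency score for one column can be sketched like this. Residues are identified by a (sequence index, residue index) tuple and the list of constraints is held as a set of unordered pairs; this representation and the names are assumptions for the example, not the data structures of any package.

```python
from itertools import combinations


def consistency_score(column, constraints):
    """Count the residue pairs of the column that appear in the constraints.

    column: list of (sequence_index, residue_index) tuples, or None for a gap.
    constraints: set of frozensets, each holding two such tuples.
    """
    present = [residue for residue in column if residue is not None]
    return sum(
        1 for a, b in combinations(present, 2) if frozenset((a, b)) in constraints
    )


# Hypothetical constraint list: residue 3 of sequence 0 was found aligned with
# residue 3 of sequence 1 and with residue 4 of sequence 2.
constraints = {
    frozenset({(0, 3), (1, 3)}),
    frozenset({(0, 3), (2, 4)}),
}

print(consistency_score([(0, 3), (1, 3), (2, 5)], constraints))  # 1
print(consistency_score([(0, 3), (1, 3), (2, 4)], constraints))  # 2
```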

Page 25: Recent Progress in Multiple Sequence Alignments: A Survey

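[Figure: packages grouped by objective function; exact placement not recoverable]

Categories: Consistency Based, Weighted Sums of Pairs, Entropy

Packages shown: Iteralign, Dialign, T-Coffee, Praline, Combalign, Prrp, Clustal, POA, MSA, MAFFT, OMA, DCA, SAGA, SAM, HMMer, GIBBS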

Page 26: Recent Progress in Multiple Sequence Alignments: A Survey

A few Multiple Sequence Alignment Algorithms

Page 27: Recent Progress in Multiple Sequence Alignments: A Survey

A Few Algorithms

MSA and DCA

ClustalW

Dialign II

Prrp

SAGA

GIBBS Sampler

MAFFT

POA

Page 28: Recent Progress in Multiple Sequence Alignments: A Survey

Simultaneous: MSA and DCA

Page 29: Recent Progress in Multiple Sequence Alignments: A Survey

Simultaneous Alignments: MSA

1) Set Bounds on each pair of sequences (Carrillo and Lipman)

2) Compute the multiple alignment (MALN) within the bounded Hyperspace

-Few Small Closely Related Sequences.

-Do Well When They Can Run.

-Memory and CPU hungry

Page 30: Recent Progress in Multiple Sequence Alignments: A Survey

MSA: the Carrillo and Lipman bounds

chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP
mouse -----KPKRPRSAYNIYVSESFQ----EAKDDS-AQGKLKLVNEAWKNLSP

Projection on chite and wheat:
chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
wheat --DPNKPKRAPSAFFVFMGEFREEFKQKNPKNKSVAAVGKAAGERWKSLSE

Projection on chite and trybr:
chite ---ADKPKRPLSAYMLWLNSARESIKRENPDFK-VTEVAKKGGELWRGLKD
trybr KKDSNAPKRAMTSFMFFSSDFRS----KHSDLS-IVEMSKAAGAAWKELGP

S(multiple alignment) = S(projection chite, wheat) + S(projection chite, trybr) + …

[Pairwise projection of sequences k and l]

Page 31: Recent Progress in Multiple Sequence Alignments: A Survey

MSA: the Carrillo and Lipman bounds

a(k,l)=score of the projection k l in the optimal MSA

â(k,l)=score of the optimal alignment of k l

Σ(a(x,y))=score of the complete multiple alignment

[Figure: scale of scores for the pairs (k,l) and (k,m)]

Upper bound: a(k,l) <= â(k,l) and a(k,m) <= â(k,m)

Lower bound: ?

Page 32: Recent Progress in Multiple Sequence Alignments: A Survey

MSA: the Carrillo and Lipman bounds

LM: a lower bound for the complete MSA

LM <= Σ(â(x,y)) - (â(k,l) - a(k,l))

a(k,l) >= LM + â(k,l) - Σ(â(x,y))

[Figure: scale of scores for the pair (k,l)]

LM + â(k,l) - Σ(â(x,y)) <= a(k,l) <= â(k,l)

Page 33: Recent Progress in Multiple Sequence Alignments: A Survey

MSA: the Carrillo and Lipman bounds

LM: can be measured on ANY heuristic alignment

[ä(k,l): score of the projection k l in the heuristic alignment]

LM = Σ(ä(x,y))

[Figure: scale of scores for the pair (k,l), showing ä(k,l)]

LM + â(k,l) - Σ(â(x,y)) <= a(k,l) <= â(k,l)

The better LM, the tighter the bounds…
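
The bound on each pair can be computed directly once the pairwise scores are known. The sketch below assumes the scores are already available as dictionaries keyed by sequence pair; the numbers are invented for the example and the function name is hypothetical.

```python
def carrillo_lipman_lower_bounds(optimal_pair_scores, heuristic_pair_scores):
    """Lower bound on a(k,l) for every pair: LM + â(k,l) - Σ(â(x,y)).

    optimal_pair_scores:   â(k,l), score of the optimal pairwise alignment.
    heuristic_pair_scores: ä(k,l), score of the pair projected from a
                           heuristic multiple alignment; their sum is LM.
    """
    lm = sum(heuristic_pair_scores.values())
    total_optimal = sum(optimal_pair_scores.values())
    return {
        pair: lm + best - total_optimal
        for pair, best in optimal_pair_scores.items()
    }


# Invented scores for three sequences A, B and C.
optimal = {("A", "B"): 10, ("A", "C"): 8, ("B", "C"): 9}
heuristic = {("A", "B"): 9, ("A", "C"): 7, ("B", "C"): 8}

print(carrillo_lipman_lower_bounds(optimal, heuristic))
```

With these numbers LM is 24 and the projection of A and B in the optimal multiple alignment must score between 7 and 10. A better heuristic alignment raises LM and narrows that interval.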

Page 34: Recent Progress in Multiple Sequence Alignments: A Survey

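MSA: the Carrillo and Lipman bounds

[Figure: backward and forward dynamic programming matrices, each with axes 0..M and 0..N]

Backward: Best(M-i, N-j) + Forward: Best(0-i, 0-j)

The sum is the best score of any pairwise alignment going through cell (i, j). Cells for which this sum falls below the lower bound on a(k,l) can be excluded.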
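
The forward and backward passes can be sketched for one pair of sequences as below. The scoring (match 1, mismatch -1, linear gap -1) is an assumption chosen to keep the example short, and the function names are hypothetical; this illustrates the idea and is not the code of the MSA program.

```python
def dp_matrix(x, y, match=1, mismatch=-1, gap=-1):
    """Global alignment scores: cell [i][j] is the best score for x[:i] versus y[:j]."""
    rows, cols = len(x) + 1, len(y) + 1
    best = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        best[i][0] = i * gap
    for j in range(1, cols):
        best[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            best[i][j] = max(
                best[i - 1][j - 1] + sub,  # align the two residues
                best[i - 1][j] + gap,      # gap in y
                best[i][j - 1] + gap,      # gap in x
            )
    return best


def best_through_each_cell(x, y):
    """Forward score up to (i, j) plus backward score from (i, j) to the end."""
    forward = dp_matrix(x, y)
    # Backward pass: align the reversed sequences, then flip the indices back.
    reverse = dp_matrix(x[::-1], y[::-1])
    m, n = len(x), len(y)
    return [
        [forward[i][j] + reverse[m - i][n - j] for j in range(n + 1)]
        for i in range(m + 1)
    ]


def cells_within_bound(x, y, lower_bound):
    """Cells that some alignment scoring at least lower_bound can go through."""
    through = best_through_each_cell(x, y)
    return {
        (i, j)
        for i, row in enumerate(through)
        for j, value in enumerate(row)
        if value >= lower_bound
    }


x, y = "GATTACA", "GCATGCU"
optimal = dp_matrix(x, y)[-1][-1]
print(len(cells_within_bound(x, y, optimal)), "cells lie on an optimal path")
print(len(cells_within_bound(x, y, optimal - 2)), "cells kept with a looser bound")
```

The tighter the lower bound, the fewer cells survive, which is what keeps the search inside a small part of the hyperspace.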

Page 35: Recent Progress in Multiple Sequence Alignments: A Survey

Simultaneous Alignments: MSA

1) Set Bounds on each pair of sequences (Carrillo and Lipman)

2) Compute the multiple alignment (MALN) within the bounded Hyperspace

-Few Small Closely Related Sequences.

-Do Well When They Can Run.

-Memory and CPU hungry

Page 36: Recent Progress in Multiple Sequence Alignments: A Survey

Simultaneous Alignments: DCA

-Few Small Closely Related Sequences, but less limited than MSA

-Do Well When They Can Run.

-Memory and CPU hungry, but less than MSA

Page 37: Recent Progress in Multiple Sequence Alignments: A Survey

Simultaneous With a New Sequence Representation: POA - Partial Order Graph

Page 38: Recent Progress in Multiple Sequence Alignments: A Survey

Page 39: Recent Progress in Multiple Sequence Alignments: A Survey

Page 40: Recent Progress in Multiple Sequence Alignments: A Survey

POA

POA makes it possible to represent complex relationships:

-domain deletion
-domain inversions

Page 41: Recent Progress in Multiple Sequence Alignments: A Survey

Progressive: ClustalW

Page 42: Recent Progress in Multiple Sequence Alignments: A Survey

Progressive Alignment: ClustalW

Feng and Doolittle, 1987; Taylor, 1988

Clustering

Page 43: Recent Progress in Multiple Sequence Alignments: A Survey

Dynamic Programming Using A Substitution Matrix

Progressive Alignment: ClustalW

Page 44: Recent Progress in Multiple Sequence Alignments: A Survey

Tree based Alignment : Recursive Algorithm

Align (Node N)
{
    /* each child is either an internal node (align its subtree first) or a leaf sequence */
    if (N->left_child is a Node)            A1 = Align (N->left_child)
    else if (N->left_child is a Sequence)   A1 = N->left_child

    if (N->right_child is a Node)           A2 = Align (N->right_child)
    else if (N->right_child is a Sequence)  A2 = N->right_child

    return dp_alignment (A1, A2)
}

[Figure: guide tree with leaves A, B, C, D, E, F, G]
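
A runnable sketch of the recursion is given below. The guide tree is written as nested tuples, and `dp_alignment` is a toy group-to-group Needleman-Wunsch with an averaged match/mismatch score and a linear gap penalty; all of these choices are assumptions for illustration and none of them is ClustalW's actual scoring.

```python
def column_score(col_a, col_b, match=1, mismatch=-1, gap=-1):
    """Average pairwise score between two alignment columns (assumed scheme)."""
    total = 0
    for a in col_a:
        for b in col_b:
            if a == "-" or b == "-":
                total += 0 if a == b else gap
            else:
                total += match if a == b else mismatch
    return total / (len(col_a) * len(col_b))


def dp_alignment(group_a, group_b, gap=-1):
    """Align two groups of already aligned rows, column against column."""
    cols_a, cols_b = list(zip(*group_a)), list(zip(*group_b))
    n, m = len(cols_a), len(cols_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + column_score(cols_a[i - 1], cols_b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Traceback from the bottom-right corner, collecting merged columns.
    gap_a, gap_b = ("-",) * len(group_a), ("-",) * len(group_b)
    columns, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + column_score(
            cols_a[i - 1], cols_b[j - 1]
        ):
            columns.append(cols_a[i - 1] + cols_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            columns.append(cols_a[i - 1] + gap_b)
            i -= 1
        else:
            columns.append(gap_a + cols_b[j - 1])
            j -= 1
    columns.reverse()
    return ["".join(row) for row in zip(*columns)]


def align(node):
    """A leaf is a sequence string; an internal node is a (left, right) tuple."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return dp_alignment(align(left), align(right))


guide_tree = (("GARFIELD", "GARFELD"), "GRFIELD")  # hypothetical sequences
for row in align(guide_tree):
    print(row)
```

The rows come out in the left-to-right order of the leaves. Gaps inserted at an inner node are never revisited higher up in the tree, which is the greedy behaviour of progressive methods.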

Page 45: Recent Progress in Multiple Sequence Alignments: A Survey

Progressive Alignment : ClustalW

-Depends on the ORDER of the sequences (Tree).

-Depends on the CHOICE of the sequences.

-Depends on the PARAMETERS:

•Substitution Matrix.

•Penalties (Gop, Gep).

•Sequence Weight.

•Tree making Algorithm.

Page 46: Recent Progress in Multiple Sequence Alignments: A Survey

Progressive Alignment: ClustalW Weighting

Weighting Within ClustalW

Page 47: Recent Progress in Multiple Sequence Alignments: A Survey

Progressive Alignment: ClustalW GOP

Position Specific GOP (Gap Opening Penalty)

Page 48: Recent Progress in Multiple Sequence Alignments: A Survey

Progressive Alignment: ClustalW

ClustalW is the most Popular Method

-Fast

-Greedy Heuristic (No Guarantee).

-Scales Well: $N^3$, $N^2 L^2$

Page 49: Recent Progress in Multiple Sequence Alignments: A Survey

Progressive Alignment With a Heuristic DP: MAFFT

Page 50: Recent Progress in Multiple Sequence Alignments: A Survey

Page 51: Recent Progress in Multiple Sequence Alignments: A Survey

Progressive and Consistency Based: Dialign II

Page 52: Recent Progress in Multiple Sequence Alignments: A Survey

Dialign II

1) Identify the best chain of segments on each pair of sequences. Assign a P-value to each Segment Pair.

2) Re-evaluate each segment pair according to its consistency with the others

3) Assemble the alignment according to the segment pairs.

Page 53: Recent Progress in Multiple Sequence Alignments: A Survey

Dialign II

-May Align Too Few Residues

-No Gap Penalty

-Does well with ESTs

Page 54: Recent Progress in Multiple Sequence Alignments: A Survey

Progressive and Consistency Based: T-COFFEE

Page 55: Recent Progress in Multiple Sequence Alignments: A Survey

Mixing Local and Global Alignments

Local Alignment Global Alignment

Extension

Multiple Sequence Alignment

Page 56: Recent Progress in Multiple Sequence Alignments: A Survey

What is a library?

Extension+T-Coffee

2
Seq1 MySeq
Seq2 MyotherSeq
#1 2
1 1 25
3 8 70
….

3
Seq1 anotherseq
Seq2 atsecondone
Seq3 athirdone
#1 2
1 1 25
#1 3
3 8 70
….
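
The library examples on this slide can be read with a few lines of code. The sketch assumes the following reading of the layout, which is an interpretation of the slide and not a statement about the real T-Coffee file format: a first line with the number of sequences, one name and sequence per line, a `#k l` line naming a pair of sequences, then lines of the form residue in k, residue in l, weight. The function name is hypothetical.

```python
def parse_library(text):
    """Parse the simplified library layout shown on the slide (assumed format)."""
    lines = [line.split() for line in text.strip().splitlines() if line.strip()]
    n_sequences = int(lines[0][0])
    names = [lines[1 + i][0] for i in range(n_sequences)]
    constraints = {}
    pair = None
    for fields in lines[1 + n_sequences:]:
        if fields[0].startswith("#"):
            # "#k l" opens the block of constraints for sequences k and l.
            pair = (int(fields[0][1:]), int(fields[1]))
            constraints.setdefault(pair, [])
        else:
            residue_k, residue_l, weight = (int(f) for f in fields)
            constraints[pair].append((residue_k, residue_l, weight))
    return names, constraints


example = """
3
Seq1 anotherseq
Seq2 atsecondone
Seq3 athirdone
#1 2
1 1 25
#1 3
3 8 70
"""

names, constraints = parse_library(example)
print(names)
print(constraints)
```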

Page 57: Recent Progress in Multiple Sequence Alignments: A Survey

Iterative

Page 58: Recent Progress in Multiple Sequence Alignments: A Survey

Iterative Methods

-HMMs, HMMER, SAM.

-Slow, Sometimes Inaccurate

-Good Profile Generators

Page 59: Recent Progress in Multiple Sequence Alignments: A Survey

Iterative Methods: Prrp

[Flowchart]

Initial Alignment

Outer Iteration: Tree and weights computation, then "Weights converged?" YES: End. NO: go to the Inner Iteration.

Inner Iteration: Realign two sub-groups, then "Alignment converged?" NO: realign again. YES: back to the Outer Iteration.

Page 60: Recent Progress in Multiple Sequence Alignments: A Survey

Iterative Stochastic: SAGA, The Genetic Algorithm

Page 61: Recent Progress in Multiple Sequence Alignments: A Survey

Page 62: Recent Progress in Multiple Sequence Alignments: A Survey

Page 63: Recent Progress in Multiple Sequence Alignments: A Survey

Page 64: Recent Progress in Multiple Sequence Alignments: A Survey

Automatic scheduling of the operators

Page 65: Recent Progress in Multiple Sequence Alignments: A Survey

Page 66: Recent Progress in Multiple Sequence Alignments: A Survey

Weighting Schemes

Page 67: Recent Progress in Multiple Sequence Alignments: A Survey

The Problem

The sequences Contain Correlated Information

Most scoring Schemes Ignore this Correlation

Page 68: Recent Progress in Multiple Sequence Alignments: A Survey

Weighting Sequence Pairs with a Tree: Carrillo and Lipman

Rationale I

Page 69: Recent Progress in Multiple Sequence Alignments: A Survey

[Figure: tree with leaves A, B, C, D, E, F, G]

E=EDGE

P=Evolutionary Path from A to X

E must contribute the same weight to every path P that goes through it.

QUESTION: Which Weight for a Pair of Sequences?

All the weights using E must sum to 1: $\sum_P W_{P,E} = 1$.

$W_P = \prod_{k \in P} \frac{1}{N_k - 1}$

N_k: Number of Edges meeting on Node k.
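
The pair weights can be computed on a small tree as follows. The tree is an adjacency list over leaves and internal nodes; the node names and the two example trees are hypothetical, and the weight of a pair is taken as the product of 1/(N_k - 1) over the internal nodes met on the path between the two leaves.

```python
from itertools import combinations


def path(tree, start, goal, seen=None):
    """Unique path between two nodes of a tree given as an adjacency list."""
    seen = seen or {start}
    if start == goal:
        return [start]
    for neighbour in tree[start]:
        if neighbour not in seen:
            rest = path(tree, neighbour, goal, seen | {neighbour})
            if rest:
                return [start] + rest
    return None


def pair_weights(tree, leaves):
    """Weight of each leaf pair: product of 1 / (N_k - 1) along its path."""
    weights = {}
    for a, b in combinations(leaves, 2):
        weight = 1.0
        for node in path(tree, a, b)[1:-1]:  # internal nodes only
            weight *= 1.0 / (len(tree[node]) - 1)
        weights[(a, b)] = weight
    return weights


# Three leaves joined at one node, then four leaves joined by an inner edge.
star = {"A": ["n"], "B": ["n"], "C": ["n"], "n": ["A", "B", "C"]}
quartet = {
    "A": ["u"], "B": ["u"], "C": ["v"], "D": ["v"],
    "u": ["A", "B", "v"], "v": ["C", "D", "u"],
}

print(pair_weights(star, ["A", "B", "C"]))
print(pair_weights(quartet, ["A", "B", "C", "D"]))
```

In the quartet, the four paths that cross the inner edge each weigh 0.25, so the weights using that edge sum to 1 as required. Branch lengths never enter the calculation.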

Page 70: Recent Progress in Multiple Sequence Alignments: A Survey

USAGE

$\mathrm{Score}(R_A^x, R_B^y) = W_{AB} \cdot \mathrm{Mat}[R_A^x][R_B^y]$

Page 71: Recent Progress in Multiple Sequence Alignments: A Survey

PROBLEM: Weight Depends only on the Tree topology

[Figure: two trees joining A, B and C, with the same topology]

First tree: AB: 0.5, AC: 0.5, BC: 0.5

Second tree: AB: 0.5, AC: 0.5, BC: 0.5

Page 72: Recent Progress in Multiple Sequence Alignments: A Survey

Weighting Sequences with a Tree

ClustalW Weights

Page 73: Recent Progress in Multiple Sequence Alignments: A Survey

[Figure: rooted tree with leaves A, B, C, D, E, F, G; edges labelled W=Length*1/4, W=Length*1/2 and W=Length*1]

QUESTION: Which Weight for Sequences?

$W_G = \sum W$

$W_{seq} = \sum \frac{\text{Edge Length}}{\text{Number of Sequences Sharing Edge}}$
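
The weight of each sequence can be sketched on a rooted tree with branch lengths. The tree encoding (nested tuples of children and branch length), the example lengths and the function names are assumptions for illustration; the weights are left unnormalised.

```python
def leaf_names(node):
    """Leaves found below a node; a leaf is (name, length), a node is ([children], length)."""
    label, _length = node
    if isinstance(label, str):
        return [label]
    return [name for child in label for name in leaf_names(child)]


def sequence_weights(node, inherited=0.0, weights=None):
    """Sum, from root to leaf, of edge length / number of sequences sharing the edge."""
    if weights is None:
        weights = {}
    label, length = node
    share = inherited + length / len(leaf_names(node))
    if isinstance(label, str):
        weights[label] = share
    else:
        for child in label:
            sequence_weights(child, share, weights)
    return weights


# Hypothetical tree: A and B are close relatives, C is on its own long branch.
tree = ([([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)], 0.0)
weights = sequence_weights(tree)
print(weights)
```

A and B split the length of the edge they share, while C keeps the whole of its own branch and therefore gets the largest weight.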

Page 74: Recent Progress in Multiple Sequence Alignments: A Survey

USAGE

$\mathrm{Score}(R_A^x, R_B^y) = W_A \cdot W_B \cdot \mathrm{Mat}[R_A^x][R_B^y]$

Page 75: Recent Progress in Multiple Sequence Alignments: A Survey

PROBLEM: Overweight of distant sequences

[Figure: tree in which C is distant from D, E, F and G]

-C Will dominate the Alignment

-C Will be very Difficult to align

Page 76: Recent Progress in Multiple Sequence Alignments: A Survey

Performance Comparison Using Collections of Reference Alignments: BaliBase and Ribosomal RNA

Page 77: Recent Progress in Multiple Sequence Alignments: A Survey

What Is BaliBase?

BaliBase is a collection of reference Multiple Alignments

The Structures of the Sequences are known and were used to assemble the MALN.

Evaluation is carried out by Comparing the Structure Based Reference Alignment With its Sequence Based Counterpart
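
One common way to carry out such a comparison is to count how many residue pairs aligned in the reference are also aligned in the alignment under test. The sketch below follows that idea; it is a simplified illustration with hypothetical names and toy sequences, not the scoring program distributed with BaliBase.

```python
def aligned_pairs(rows):
    """Set of residue pairs placed in the same column of an alignment."""
    pairs = set()
    position = [0] * len(rows)  # index of the next residue in each sequence
    for column in zip(*rows):
        present = []
        for seq_index, char in enumerate(column):
            if char != "-":
                present.append((seq_index, position[seq_index]))
                position[seq_index] += 1
        for x in range(len(present)):
            for y in range(x + 1, len(present)):
                pairs.add((present[x], present[y]))
    return pairs


def sum_of_pairs_accuracy(test, reference):
    """Fraction of the reference residue pairs recovered by the test alignment."""
    reference_pairs = aligned_pairs(reference)
    return len(reference_pairs & aligned_pairs(test)) / len(reference_pairs)


reference = ["AC-T", "ACGT"]  # toy structure based reference
test = ["ACT-", "ACGT"]       # toy sequence based alignment
print(sum_of_pairs_accuracy(test, reference))
```

Here two of the three reference pairs are recovered, because the final T of the first sequence is aligned with G instead of T.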

Page 78: Recent Progress in Multiple Sequence Alignments: A Survey

What Is BaliBase?

[Figure: a structure based alignment (DALI, Sap …) and the alignment from Method X feed a Comparison]

Page 79: Recent Progress in Multiple Sequence Alignments: A Survey

What Is BaliBase?

PROBLEM / Description

Source: BaliBase, Thompson et al, NAR, 1999

-Even Phylogenetic Spread.
-One Outlier Sequence
-Two Distantly related Groups
-Long Internal Indel
-Long Terminal Indel

Page 80: Recent Progress in Multiple Sequence Alignments: A Survey

Choosing The Right Method

Page 81: Recent Progress in Multiple Sequence Alignments: A Survey

Choosing The Right Method (POA Evaluation)

Page 82: Recent Progress in Multiple Sequence Alignments: A Survey

Choosing The Right Method (POA Evaluation)

Page 83: Recent Progress in Multiple Sequence Alignments: A Survey

Choosing The Right Method (MAFFT evaluation)

Page 84: Recent Progress in Multiple Sequence Alignments: A Survey

Choosing The Right Method (MAFFT evaluation)

Page 85: Recent Progress in Multiple Sequence Alignments: A Survey

Choosing The Right Method (MAFFT evaluation)

Page 86: Recent Progress in Multiple Sequence Alignments: A Survey

Conclusion

Page 87: Recent Progress in Multiple Sequence Alignments: A Survey

Which Method?

PROBLEM / Strategy

Source: BaliBase, Thompson et al, NAR, 1999

ClustalW, T-Coffee, MSA, DCA

PrrP, T-Coffee

Dialign

T-Coffee

T-Coffee

Dialign

T-Coffee

Page 88: Recent Progress in Multiple Sequence Alignments: A Survey

Methods / Situations

1-Carrillo and Lipman:
-MSA, DCA.
-Few Small Closely Related Sequences.
-Do Well When They Can Run.

2-Segment Based:
-DIALIGN, MACAW.
-May Align Too Few Residues
-Good For Long Indels

3-Iterative:
-HMMs, HMMER, SAM.
-Slow, Sometimes Inaccurate
-Good Profile Generators

4-Progressive:
-ClustalW, Pileup, Multalign…
-Fast and Sensitive

Page 89: Recent Progress in Multiple Sequence Alignments: A Survey

Addresses

MAFFT: Progressive, www.biophys.kyoto-u.jp/katoh
POA: Progressive/Simultaneous, www.bioinformatics.ucla.edu/poa
MUSCLE: Progressive/Iterative, www.drive5.com/muscle/