Construction of Substitution Matrices

• BLOSUM: BLOcks SUbstitution Matrix

• PAM: Point Accepted Mutations

Substitution Matrices

• Contain values proportional to the probability that amino acid A mutates into amino acid B for all pairs of amino acids through a period of evolution

• Are constructed from a large and diverse sample of sequence alignments

• Multiple alignment of well-studied gene sequences from different species

• Use orthologs (functionally similar)

• Observed substitutions tend to preserve function

• Minimal gaps

How to Construct Substitution Matrices

Tabulate substitutions:

• A to A: 9867 times

• A to R: 2 times

• A to N: 9 times

• etc.
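The tabulation step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to obtain the counts above: it counts unordered residue pairs column by column, which is closer to the BLOSUM style of counting, and it skips gap characters. The function name `tabulate_substitutions` and the three-sequence toy alignment are made up for the example.

```python
from collections import Counter
from itertools import combinations


def tabulate_substitutions(alignment):
    """Count unordered residue pairs observed in each alignment column.

    `alignment` is a list of equal-length aligned sequences.
    Pairs involving a gap character ("-") are ignored.
    """
    counts = Counter()
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" or b == "-":
                continue
            # Sort so that (R, K) and (K, R) land in the same bucket.
            counts[tuple(sorted((a, b)))] += 1
    return counts


# Hypothetical toy alignment: three sequences, four columns.
toy_alignment = ["ARNA", "ARDA", "AKNA"]
pair_counts = tabulate_substitutions(toy_alignment)

for pair, n in sorted(pair_counts.items()):
    print(pair, n)
```

Identical pairs such as (A, A) play the role of the "A to A" entry, and mixed pairs such as (K, R) play the role of "A to R". Directional counts inferred from a phylogenetic tree, as in the PAM approach, would need more than this sketch.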

Finding the Random Mutation Rate

• Compute overall occurrence of an amino acid in a protein database

http://www.ebi.ac.uk/swissprot/sptr_stats/index.html
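As a rough sketch of this step, the overall occurrence of each amino acid can be computed by counting residues across all sequences in a database. The three-sequence `toy_database` below is invented for illustration; a real calculation would read a full protein database. The helper `expected_pair_rate` assumes, as a simplification, that the two residues occur independently, so the chance rate of a pair is the product of the two background frequencies.

```python
from collections import Counter


def background_frequencies(sequences):
    """Return the fraction of all residues accounted for by each amino acid."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def expected_pair_rate(freqs, a, b):
    """Chance rate of seeing a paired with b, assuming independence."""
    return freqs[a] * freqs[b]


# Hypothetical toy "database" of three short sequences.
toy_database = ["MKWR", "AAWK", "RRAA"]
freqs = background_frequencies(toy_database)
random_w_to_r = expected_pair_rate(freqs, "W", "R")
```

The resulting frequencies sum to 1, and `random_w_to_r` is the kind of expected rate that an observed substitution rate is compared against.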

Finding the Random Mutation Rate

Example:

Suppose the expected random mutation rate is 1 in 10,000 and the observed mutation rate of W to R is 1 in 10.

Score = log10 (observed / expected) = log10 (0.1 / 0.0001) = log10 (1000) = +3
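The same arithmetic can be written as a small Python function. This is only a sketch of the log-odds idea using the numbers from the example; the function name `log_odds_score` is made up here, and base 10 is used to match the example (published matrices typically apply a different base and scaling factor).

```python
import math


def log_odds_score(observed_rate, expected_rate):
    """Log (base 10) of how much more often a substitution is seen than chance predicts."""
    return math.log10(observed_rate / expected_rate)


# Values from the example: observed 1 in 10, expected 1 in 10,000.
score = log_odds_score(0.1, 0.0001)
print(round(score, 6))
```

A positive score means the substitution is seen more often than expected by chance, a negative score means less often, and a score of zero means it occurs at the chance rate.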

PAM Matrices

[1 point mutation per 100 amino acids]

• Does not take into account different evolutionary rates between conserved and non-conserved regions

• PAM1 is 1% average change in amino acids

• PAM250: ??
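Higher-numbered PAM matrices are usually described as extrapolations of PAM1: the PAM1 mutation probability matrix is multiplied by itself N times to model N PAM units of divergence. The sketch below shows that operation on a made-up three-letter alphabet. `toy_pam1` is not real PAM data, only a small row-stochastic matrix with roughly 1% change per step.

```python
def mat_mul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power by repeated squaring."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = m
    while power > 0:
        if power & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        power >>= 1
    return result


# Hypothetical 3-letter "PAM1": each row sums to 1, about 99% chance of no change.
toy_pam1 = [
    [0.990, 0.006, 0.004],
    [0.006, 0.990, 0.004],
    [0.004, 0.004, 0.992],
]
toy_pam250 = mat_pow(toy_pam1, 250)

for row in toy_pam250:
    print([round(p, 3) for p in row])
```

Note that 250 PAM units does not mean every position has changed: repeated and reverse substitutions hit the same site, so the diagonal of the extrapolated matrix stays well above zero.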

PAM vs. BLOSUM

Basic Local Alignment Search Tool (BLAST)

• Heuristic method

BLAST Algorithm

What can we search and compare?

DNA vs DNA

Protein vs Protein

DNA vs Protein

Protein vs DNA

Reading Frames
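Comparing DNA against protein requires reading the DNA as codons, and a DNA sequence can be read in six ways: three offsets on the forward strand and three on the reverse complement. The following is a minimal sketch of splitting a sequence into those six frames. The function names and the nine-base example sequence are invented for illustration, and translation of codons into amino acids is left out.

```python
def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(dna))


def six_reading_frames(dna):
    """Split a DNA sequence into codons for all six reading frames.

    Keys are "+1", "+2", "+3" (forward strand) and "-1", "-2", "-3"
    (reverse complement). Trailing bases that do not fill a codon are dropped.
    """
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            usable = len(seq) - offset
            usable -= usable % 3
            codons = [seq[i:i + 3] for i in range(offset, offset + usable, 3)]
            frames[f"{strand}{offset + 1}"] = codons
    return frames


# Hypothetical short sequence for illustration.
frames = six_reading_frames("ATGGCCTGA")
for name, codons in frames.items():
    print(name, codons)
```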

The best BLAST program

Recommended