Correlated Mutations and Co-evolution May 1st, 2002

Correlated Mutations and Co-evolution

May 1st, 2002

What is Co-evolution (Correlated Mutation)?

• Individual regions of proteins interact

• Regions can be either on the same chain or on different chains (complexes)

• A mutation in one half of the pair induces a change in the other half of the pair

• “the tendency of positions in proteins to mutate co-ordinately” (Pazos et al. 1997)

“Correlated Mutations Contain Information about Protein-protein interactions” Pazos et al. 1997

• A possible aid to the “docking” problem, using only sequence information

• Docking: the process by which protein domains fit together when they interact with one another

Methodology

The correlation coefficient

• S is the similarity between the residues found at position i (or j) in sequences k and l

• Arbitrarily chosen cutoff: the M pairs with the greatest correlation values are taken as predicted contacts, with M = L/2
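
The correlation and cutoff described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the similarity function is a toy placeholder (identical residues score 1, everything else 0) rather than a real substitution matrix, all sequence pairs are weighted equally, and the function names are hypothetical. It computes a Pearson-style correlation between the similarity profiles of two alignment columns and keeps the L/2 highest-scoring pairs.

```python
from itertools import combinations
import math


def toy_similarity(a, b):
    # Assumption: placeholder similarity, NOT a real substitution matrix.
    return 1.0 if a == b else 0.0


def correlation(alignment, i, j, sim=toy_similarity):
    """Pearson-style correlation between columns i and j of an alignment.

    For every pair of sequences (k, l), the similarity of their residues
    at column i is compared with the similarity at column j.
    """
    pairs = list(combinations(range(len(alignment)), 2))
    s_i = [sim(alignment[k][i], alignment[l][i]) for k, l in pairs]
    s_j = [sim(alignment[k][j], alignment[l][j]) for k, l in pairs]
    mean_i = sum(s_i) / len(s_i)
    mean_j = sum(s_j) / len(s_j)
    sd_i = math.sqrt(sum((x - mean_i) ** 2 for x in s_i) / len(s_i))
    sd_j = math.sqrt(sum((y - mean_j) ** 2 for y in s_j) / len(s_j))
    if sd_i == 0.0 or sd_j == 0.0:
        # Fully conserved column: correlation is undefined, treated as 0 here.
        return 0.0
    cov = sum((x - mean_i) * (y - mean_j) for x, y in zip(s_i, s_j)) / len(s_i)
    return cov / (sd_i * sd_j)


def predicted_contacts(alignment, sim=toy_similarity):
    """Return the M = L/2 column pairs with the greatest correlation."""
    length = len(alignment[0])
    scored = [
        (correlation(alignment, i, j, sim), i, j)
        for i, j in combinations(range(length), 2)
    ]
    scored.sort(reverse=True)
    m = length // 2
    return [(i, j) for _, i, j in scored[:m]]


# Columns 0 and 1 change together; column 2 is fully conserved.
example = ["ACA", "ACA", "GTA", "GTA"]
contacts = predicted_contacts(example)
```

With the toy alignment, columns 0 and 1 vary together across the sequences, so that pair is the one selected.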

The Harmonic Average (Xd)

• Measure of “correlatedness”

• Pic: percentage of correlated pairs at a given distance; Pia: the same percentage for all pairs
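
A rough sketch of how such a measure could be computed is given below. It assumes the form Xd = Σ (Pic − Pia) / (d_i · n), summed over n distance bins with d_i the upper limit of bin i. The bin width of 4 Å, the 60 Å upper limit, and the function name are assumptions made for illustration, and both input lists are assumed to be non-empty.

```python
def harmonic_average_xd(correlated_dists, all_dists, bin_width=4.0, max_dist=60.0):
    """Distance-weighted difference between two distance distributions.

    Positive values mean the correlated pairs are shifted towards
    shorter distances than the full set of pairs.
    """
    n_bins = int(max_dist / bin_width)

    def percentages(dists):
        counts = [0] * n_bins
        for d in dists:
            # Distances beyond max_dist are placed in the last bin.
            idx = min(int(d // bin_width), n_bins - 1)
            counts[idx] += 1
        return [100.0 * c / len(dists) for c in counts]

    p_c = percentages(correlated_dists)
    p_a = percentages(all_dists)
    total = 0.0
    for i, (pc, pa) in enumerate(zip(p_c, p_a)):
        upper_limit = (i + 1) * bin_width  # d_i, upper limit of bin i
        total += (pc - pa) / upper_limit
    return total / n_bins


# Correlated pairs that sit closer together than pairs in general.
xd_close = harmonic_average_xd([2.0, 3.0], [2.0, 3.0, 50.0, 55.0])
# Identical distributions carry no signal.
xd_same = harmonic_average_xd([2.0, 50.0], [2.0, 50.0])
```

Because each bin is divided by its distance, differences at short range dominate the score.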

Comparisons of Correlations

Docking solutions test

• Note: larger percentages imply worse performance

• Special mention of 2gcr and 3adk

• “sequence information does not seem to be sufficient to discriminate”

Figure 5: Scatter plot of Xd vs RMS distance

9pap

Hemoglobin 1hbb

Prediction: Hsc70

• Figure 6: predicted contacts of Nt and Ct domains of Hsc70

• Could be verified experimentally

Coevolving Protein Residues: Maximum Likelihood and Relationship to Structure.

Pollock et al. 1999

• Using size and charge characteristics to define co-evolution (correlation)

• Negative correlation: correlation arising from differences in charge (which also counts as co-evolution)

The Markov process model (simulated evolution)

• Two states, A and a

• Equation 1: the probability of transitioning between states

• λ: rate parameter

• π: equilibrium frequency
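
As an illustration, the standard transition probabilities for a two-state continuous-time Markov process with rate λ and equilibrium frequencies π can be written as below. This is the textbook form, and it is an assumption that Equation 1 on the slide matches it exactly. The function name and the example frequencies are hypothetical.

```python
import math


def transition_probability(start, end, t, lam, pi):
    """Probability of being in state `end` after time t, given `start`.

    lam is the rate parameter and pi maps each state to its equilibrium
    frequency (the two frequencies are assumed to sum to 1).
    """
    decay = math.exp(-lam * t)
    if start == end:
        return pi[end] + (1.0 - pi[end]) * decay
    return pi[end] * (1.0 - decay)


# Hypothetical equilibrium frequencies for the two states.
pi = {"A": 0.7, "a": 0.3}

p_stay = transition_probability("A", "A", 0.5, 1.0, pi)
p_switch = transition_probability("A", "a", 0.5, 1.0, pi)
```

At t = 0 no change has occurred, and for large t the probabilities approach the equilibrium frequencies regardless of the starting state.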

Use of parameters in model

• Basic model for how they simulate evolutionary steps

Likelihood Ratio Test Statistic (LR)

• LI and LD: maximum likelihood values for the independent and dependent models

• Method of determining whether dependence is statistically significant
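
A minimal sketch of such a test follows, assuming the conventional statistic LR = −2 ln(LI / LD) computed from log-likelihoods. Whether the paper uses exactly this scaling is an assumption, as is the choice to judge significance against LR values simulated under the independent model. Both function names are hypothetical.

```python
def likelihood_ratio(log_l_independent, log_l_dependent):
    """LR = -2 * ln(LI / LD), computed from the two log-likelihoods."""
    return -2.0 * (log_l_independent - log_l_dependent)


def empirical_p_value(observed_lr, simulated_lrs):
    """Fraction of null-model LR values at least as large as the observed one.

    The +1 terms keep the estimate away from exactly zero.
    """
    exceed = sum(1 for x in simulated_lrs if x >= observed_lr)
    return (exceed + 1) / (len(simulated_lrs) + 1)


# Hypothetical log-likelihoods for one pair of sites.
lr = likelihood_ratio(-105.0, -100.0)
# Hypothetical LR values from simulations under independence.
p = empirical_p_value(lr, [1.0, 2.0, 3.0, 12.0])
```

A larger LR means the dependent model fits the pair of sites much better than the independent one.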

Test of Significance (LR values for change in parameters)

Myoglobin

• Used structure of myoglobin; compared differences in sequences

• Variety of species used for sequence information; sperm whale 3D protein structure

LR distributions for myoglobin: size and charge

• Note the large negative correlation LR values in charge

Co-evolution of Proteins with their Interaction Partners, Goh et al. 2000

• Applied to PGK

• Chemokines

What is PGK?

Methodology

• Two independent sequence alignments, for N and C regions, using PSI-BLAST

• ClustalW to create distance matrix between complete domains

• To determine correlation, a correlation coefficient r was computed between the two distance matrices

• X and Y correspond to the domains; r is a measure of relatedness between these domains
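
One way to picture the calculation is as a linear (Pearson) correlation over the matching entries of the two distance matrices. The sketch below assumes that form, assumes both matrices list the same species in the same order, and uses hypothetical function names and toy distances.

```python
import math


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[a][b] for a in range(n) for b in range(a + 1, n)]


def matrix_correlation(dist_x, dist_y):
    """Pearson correlation r between two pairwise distance matrices."""
    x = upper_triangle(dist_x)
    y = upper_triangle(dist_y)
    if len(x) != len(y):
        raise ValueError("matrices must cover the same set of species")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)) * math.sqrt(
        sum((b - mean_y) ** 2 for b in y)
    )
    if den == 0.0:
        raise ValueError("correlation undefined for constant distances")
    return num / den


# Toy distance matrices for three species (hypothetical values).
domain_x = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
domain_y = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]  # same tree shape, twice the rate
r = matrix_correlation(domain_x, domain_y)
```

Here the second domain evolves twice as fast but along the same pattern, so r comes out at essentially 1.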

PGK correlations

Chemokines

• Role of chemokines; importance in immunity (HIV, cancer)

• Four categories, mean nothing to me

Clustering of Chemokines

Clustering of Chemokine receptors