DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT: ESTIMATING THE dN/dS RATIO FOR GENE SEQUENCES IN THE PRESENCE OF RECOMBINATION. Danny Wilson, 12th October 2004


Page 1: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

ESTIMATING THE dN/dS RATIO FOR GENE SEQUENCES IN THE PRESENCE OF RECOMBINATION

Danny Wilson, 12th October 2004

Page 2: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Menu

Codon-based models of molecular evolution

A new method for estimating omega with recombination

Does it work? Simulation studies and example data

Page 3: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Part one

Codon-based models of molecular evolution

Page 4: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Underlying rates of non-synonymous mutation are usually confounded with selection against inviable mutants.

Thus it is convenient to model functional constraint as mutational bias (or rather, to make no attempt to disentangle the two).

[Figure: an ancestral type produces neutral mutants and inviable mutants at the mutation stage, which is followed by the selection stage. Sampling usually occurs at this point, i.e. post-selection.]

Page 5: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

The genetic code, with codons numbered 1 to 64:

   1 UUU Phe     2 UUC Phe     3 UUA Leu     4 UUG Leu
   5 UCU Ser     6 UCC Ser     7 UCA Ser     8 UCG Ser
   9 UAU Tyr    10 UAC Tyr    11 UAA STOP   12 UAG STOP
  13 UGU Cys    14 UGC Cys    15 UGA STOP   16 UGG Trp
  17 CUU Leu    18 CUC Leu    19 CUA Leu    20 CUG Leu
  21 CCU Pro    22 CCC Pro    23 CCA Pro    24 CCG Pro
  25 CAU His    26 CAC His    27 CAA Gln    28 CAG Gln
  29 CGU Arg    30 CGC Arg    31 CGA Arg    32 CGG Arg
  33 AUU Ile    34 AUC Ile    35 AUA Ile    36 AUG Met
  37 ACU Thr    38 ACC Thr    39 ACA Thr    40 ACG Thr
  41 AAU Asn    42 AAC Asn    43 AAA Lys    44 AAG Lys
  45 AGU Ser    46 AGC Ser    47 AGA Arg    48 AGG Arg
  49 GUU Val    50 GUC Val    51 GUA Val    52 GUG Val
  53 GCU Ala    54 GCC Ala    55 GCA Ala    56 GCG Ala
  57 GAU Asp    58 GAC Asp    59 GAA Glu    60 GAG Glu
  61 GGU Gly    62 GGC Gly    63 GGA Gly    64 GGG Gly

[Table: 64 × 64 matrix, with rows and columns indexed by the codon numbers above, in which each filled cell gives the type of change between a pair of codons using the key below.]

Key

  0  change involving a stop codon
  1  no change
  2  synonymous transversion
  3  synonymous transition
  4  non-synonymous transversion
  5  non-synonymous transition
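The key can be turned into a small classifier for any pair of codons. This is a minimal sketch and not the code used to build the slide: the function name `classify` is hypothetical, returning `None` for pairs that differ at more than one nucleotide is an assumption, and code 0 is applied to every single-nucleotide change that involves a stop codon, following the key as worded.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def classify(codon_a, codon_b):
    """Return the key code (0-5) for a codon pair, or None if the two
    codons differ at more than one nucleotide (an assumption)."""
    diffs = [(a, b) for a, b in zip(codon_a, codon_b) if a != b]
    if not diffs:
        return 1  # no change
    if len(diffs) > 1:
        return None  # not a single-nucleotide change
    if CODE[codon_a] == "*" or CODE[codon_b] == "*":
        return 0  # change involving a stop codon
    old, new = diffs[0]
    # A transition stays within purines or within pyrimidines.
    transition = (old in PURINES) == (new in PURINES)
    if CODE[codon_a] == CODE[codon_b]:
        return 3 if transition else 2
    return 5 if transition else 4


print(classify("UUG", "UUA"), classify("UUG", "AUG"))
```

Calling `classify` on every ordered pair of the 64 codons would give one possible version of the 64 × 64 matrix.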

Page 6: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Types of single nucleotide mutation: transitions vs. transversions

Purines: A, G
Pyrimidines: T, C

Transitions: changes within a class (A ↔ G, T ↔ C)
Transversions: changes between classes (purine ↔ pyrimidine)

For any base there are always 2 possible transversions and 1 possible transition.

Page 7: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Types of codon mutation: synonymous vs. non-synonymous

Synonymous: TTG → TTA (Leucine → Leucine)
Non-synonymous: TTG → ATG (Leucine → Methionine)

Leucine: (CH3)2-CH-CH2-CH(NH2)-COOH, isoelectric point (pI) 5.98, 6-fold degeneracy in the genetic code
Methionine: CH3-S-(CH2)2-CH(NH2)-COOH, isoelectric point (pI) 5.74, single unique codon ATG

Page 8: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Example: CTT (Leucine)

Changes at position 1:
  TTT  Phe  Non-synonymous transition
  ATT  Ile  Non-synonymous transversion
  GTT  Val  Non-synonymous transversion

Changes at position 2:
  CCT  Pro  Non-synonymous transition
  CAT  His  Non-synonymous transversion
  CGT  Arg  Non-synonymous transversion

Changes at position 3:
  CTC  Leu  Synonymous transition
  CTA  Leu  Synonymous transversion
  CTG  Leu  Synonymous transversion
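The nine single-nucleotide neighbours of a codon can be enumerated directly from the standard genetic code. The sketch below does this for CTT; the function name `single_nucleotide_neighbours` is hypothetical and the DNA alphabet (T rather than U) follows this slide.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def single_nucleotide_neighbours(codon):
    """List (mutant codon, amino acid, effect, kind) for all 9 neighbours."""
    rows = []
    for pos, old in enumerate(codon):
        for new in BASES:
            if new == old:
                continue
            mutant = codon[:pos] + new + codon[pos + 1:]
            same_class = (old in PURINES) == (new in PURINES)
            kind = "transition" if same_class else "transversion"
            same_aa = CODE[mutant] == CODE[codon]
            effect = "synonymous" if same_aa else "non-synonymous"
            rows.append((mutant, CODE[mutant], effect, kind))
    return rows


for row in single_nucleotide_neighbours("CTT"):
    print(*row)
```

Amino acids are printed as one-letter codes, with `*` for a stop codon.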

Page 9: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Nielsen and Yang (1998) codon-based model of molecular evolution

Mutation rate, by type of change:
  • Synonymous transversion
  • Synonymous transition
  • Non-synonymous transversion
  • Non-synonymous transition
  • Other

Interpretation of the parameters:
  • Transition-transversion ratio
  • omega (dN/dS): relative viability of non-synonymous mutations
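The rate structure of this family of codon models is commonly written with a transition-transversion ratio (here `kappa`), a dN/dS ratio (`omega`) and equilibrium codon frequencies (`pi`). The sketch below follows that common form; the symbol names, the uniform `pi`, and the function name `substitution_rate` are assumptions for illustration and may differ from the exact parameterisation on the slide.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def substitution_rate(codon_i, codon_j, kappa, omega, pi):
    """Relative off-diagonal rate from codon_i to codon_j.

    Zero for pairs that are identical, that differ at more than one
    nucleotide, or that involve a stop codon ("Other").
    """
    diffs = [(a, b) for a, b in zip(codon_i, codon_j) if a != b]
    if len(diffs) != 1 or "*" in (CODE[codon_i], CODE[codon_j]):
        return 0.0
    old, new = diffs[0]
    rate = pi[codon_j]
    if (old in PURINES) == (new in PURINES):
        rate *= kappa  # transition
    if CODE[codon_i] != CODE[codon_j]:
        rate *= omega  # non-synonymous
    return rate


# Hypothetical parameter values, with uniform frequencies over sense codons.
sense_codons = [c for c, aa in CODE.items() if aa != "*"]
pi = {c: 1.0 / len(sense_codons) for c in sense_codons}
print(substitution_rate("CTT", "CTC", kappa=2.0, omega=0.5, pi=pi))
```

Under this form, omega below 1 slows non-synonymous changes relative to synonymous ones (functional constraint), and omega above 1 speeds them up (diversifying selection).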

Page 10: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

codeML

Pros
  • Viable method for detecting mode of selection on a codon sequence

Cons
  • Categorizes possible values for omega into a small number of discrete intervals
  • Results can be misleading in the presence of recombination

Page 11: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Part two

A new method for estimating omega with recombination

Page 12: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

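Inference with recombination

Here X is the sequence data, G a genealogy, Θ the model parameters, G_1, ..., G_M a sample of M genealogies and Q a proposal distribution.

$$\Pr(\Theta \mid X) \propto \Pr(X \mid \Theta)\,\Pr(\Theta)$$

$$\Pr(X \mid \Theta) = \int \Pr(X \mid G, \Theta)\,\Pr(G)\,dG$$

$$\Pr(X \mid \Theta) \approx \frac{1}{M} \sum_{i=1}^{M} \Pr(X \mid G_i, \Theta)$$

$$\Pr(X \mid \Theta) \approx \frac{1}{M} \sum_{i=1}^{M} \frac{\Pr(X \mid G_i, \Theta)\,\Pr(G_i)}{Q(G_i)}$$

$$\Pr(X \mid \Theta) = \;?$$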
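Averaging a likelihood over unobserved genealogies G can in principle be approximated by importance sampling: draw genealogies from a proposal distribution Q and reweight each draw. The sketch below is a toy with three discrete "genealogies" and made-up numbers, so that the exact answer is available for comparison. The function name and all values are hypothetical, and this is not the method of the talk.

```python
import random


def importance_sampling_estimate(likelihood, prior, proposal, n_draws, rng):
    """Estimate sum_g likelihood[g] * prior[g] using draws from proposal."""
    states = list(proposal)
    weights = [proposal[g] for g in states]
    total = 0.0
    for g in rng.choices(states, weights=weights, k=n_draws):
        # Each draw is reweighted by prior / proposal.
        total += likelihood[g] * prior[g] / proposal[g]
    return total / n_draws


# Toy numbers (hypothetical): three candidate genealogies.
prior = {"g1": 0.7, "g2": 0.2, "g3": 0.1}
likelihood = {"g1": 0.01, "g2": 0.2, "g3": 0.9}
proposal = {"g1": 1 / 3, "g2": 1 / 3, "g3": 1 / 3}

exact = sum(likelihood[g] * prior[g] for g in prior)
estimate = importance_sampling_estimate(
    likelihood, prior, proposal, 20000, random.Random(1)
)
print(exact, estimate)
```

With real genealogies the space of G is enormous and a good Q is hard to find, which is the practical difficulty with this approach.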

Page 13: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

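Li and Stephens (2003): approximation to the likelihood

With Θ denoting the model parameters and X_1, ..., X_n the n sequences:

$$\Pr(X \mid \Theta) = \Pr(X_1 \mid \Theta)\,\Pr(X_2 \mid X_1, \Theta) \cdots \Pr(X_n \mid X_1, \ldots, X_{n-1}, \Theta)$$

$$\Pr(X \mid \Theta) \approx \hat{\pi}(X_1 \mid \Theta)\,\hat{\pi}(X_2 \mid X_1, \Theta) \cdots \hat{\pi}(X_n \mid X_1, \ldots, X_{n-1}, \Theta)$$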
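The product-of-conditionals idea can be sketched in a few lines: the log-likelihood is a sum, over sequences, of the log conditional probability of each sequence given those before it. Both function names are hypothetical, and `toy_conditional` is a deliberately crude stand-in (copy one earlier sequence whole, with independent mismatches and no recombination), not the Li and Stephens conditional.

```python
import math


def pac_log_likelihood(sequences, conditional_log_prob):
    """Sum log Pr(X_n | X_1, ..., X_{n-1}) over sequences in the given order."""
    total = 0.0
    for n, seq in enumerate(sequences):
        total += conditional_log_prob(seq, sequences[:n])
    return total


def toy_conditional(seq, previous, mismatch=0.1):
    """Toy conditional: copy one earlier sequence, chosen uniformly."""
    if not previous:
        # First sequence: every base equally likely.
        return len(seq) * math.log(0.25)
    probs = []
    for template in previous:
        d = sum(a != b for a, b in zip(seq, template))
        probs.append((mismatch / 3) ** d * (1 - mismatch) ** (len(seq) - d))
    return math.log(sum(probs) / len(probs))


seqs = ["AAAA", "AAAT", "TTTT"]
print(pac_log_likelihood(seqs, toy_conditional))
print(pac_log_likelihood(seqs[::-1], toy_conditional))
```

The two printed values differ, which illustrates that an approximation of this kind can depend on the order in which the sequences are considered.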

Page 14: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Li and Stephens (2003): approximation to the likelihood

TTTGATACTGTTGCCGAAGGTTTGGGCGAAATTCGCGATTTATTGCGCCGTTATCATCAT

TTTGATACCGTTGCCGAAGGTTTGGGTGAAATTCGCGATTTATTGCGCCGTTACCACCGC

TTTGATACCGTTGCCGAAGGTTTGGGTAAAATTCGCGATTTATTGCGCCGTTACCACCGC

TTTGATACCGTTGCCGAAGGTTTGGGCGAAATTCGTGATTTATTGCGCCGTTATCATCAT

$$\Pr(X_4 \mid X_1, \ldots, X_3, \Theta)$$

Page 15: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

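Li and Stephens (2003): approximation to the likelihood

TTTGATACTGTTGCCGAAGGTTTGGGCGAAATTCGCGATTTATTGCGCCGTTATCATCAT

TTTGATACCGTTGCCGAAGGTTTGGGTGAAATTCGCGATTTATTGCGCCGTTACCACCGC

TTTGATACCGTTGCCGAAGGTTTGGGTAAAATTCGCGATTTATTGCGCCGTTACCACCGC

$$\Pr(X_4 \mid X_1, \ldots, X_3, \Theta)$$

[Equations: X_4 is modelled as an imperfect mosaic of the k sequences already observed. The slide gives the probability that the copied sequence changes between adjacent sites i and i+1 through recombination (rec), and the probability of the observed X_{4,i} given the sequence being copied at site i.]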
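The copying idea can be sketched as a hidden Markov model evaluated with the forward algorithm: the hidden state at each site is which earlier sequence is being copied. The version below is a simplification on nucleotides, not the slide's formula. The fixed `switch` and `mismatch` probabilities are hypothetical stand-ins for the recombination and mutation terms, and the function name is hypothetical too.

```python
import math


def copying_log_likelihood(target, panel, switch=0.05, mismatch=0.01):
    """Log-probability of target as an imperfect mosaic of panel sequences.

    switch:   probability of jumping to a uniformly chosen panel sequence
              between adjacent sites (stand-in for recombination).
    mismatch: probability that the target differs from the copied base,
              shared equally among the three other bases (stand-in for
              mutation).
    """
    k = len(panel)

    def emit(j, i):
        return 1 - mismatch if panel[j][i] == target[i] else mismatch / 3

    forward = [emit(j, 0) / k for j in range(k)]
    log_lik = 0.0
    for i in range(1, len(target)):
        total = sum(forward)
        log_lik += math.log(total)
        forward = [f / total for f in forward]  # rescale to avoid underflow
        forward = [
            ((1 - switch) * forward[j] + switch / k) * emit(j, i)
            for j in range(k)
        ]
    return log_lik + math.log(sum(forward))


panel = ["AAAA", "CCCC"]
print(copying_log_likelihood("AACC", panel, switch=0.2))
print(copying_log_likelihood("AACC", panel, switch=1e-6))
```

The mosaic target "AACC" is far more probable when switching is allowed, because otherwise two mismatches are needed to explain it.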

Page 16: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

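My modification to Li and Stephens (2003)

Writing X_{4,i} for site i of X_4 and X_{\cdot,i} for site i of the sequence being copied:

$$\Pr(X_{4,i} \mid X_{\cdot,i}) = \int_0^\infty \Pr(X_{4,i} \mid X_{\cdot,i}, t)\,\Pr(t)\,dt$$

[Equation, continued: the integrand is written out using an exponential density for t with rate k and a transition probability p between the two codons over time t.]

[Figure: X_{4,i} and X_{\cdot,i} separated by time t.]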

Page 17: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Estimating variable omega

The problem
  • A constant omega model is prone to averaging over regions of positive selection (omega > 1) and negative selection (omega < 1) within a gene
  • Allowing every site its own omega leaves little information for inference

The solution
  • A change-point model where windows of adjacent sites share the same omega

Page 18: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Estimating variable omega

[Figure: a sequence divided into five blocks, numbered 1 to 5]

MCMC moves:
  • Change omega for a single block
  • Extend a block 5’ or 3’
  • Split an existing block
  • Merge adjacent blocks
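The four moves can be sketched as proposal functions acting on a list of blocks. This shows proposals only, under assumed details: the `(start, omega)` representation, the log-normal perturbations, and the length-weighted mean on merging are all hypothetical choices, and a real sampler would accept or reject each proposal with a Metropolis-Hastings ratio that is not shown.

```python
import math
import random

# Assumed representation: a list of (start, omega) pairs sorted by start.
# Block b covers sites start_b .. start_{b+1} - 1; the last block runs to
# n_sites - 1.


def block_end(blocks, b, n_sites):
    """One past the last site of block b."""
    return blocks[b + 1][0] if b + 1 < len(blocks) else n_sites


def change_omega(blocks, n_sites, rng, scale=0.5):
    """Rescale omega in one randomly chosen block."""
    b = rng.randrange(len(blocks))
    start, omega = blocks[b]
    new = list(blocks)
    new[b] = (start, omega * math.exp(rng.gauss(0.0, scale)))
    return new


def extend_block(blocks, n_sites, rng):
    """Shift one internal boundary by a single site, 5' or 3'."""
    if len(blocks) < 2:
        return list(blocks)
    b = rng.randrange(1, len(blocks))
    start, omega = blocks[b]
    shifted = start + rng.choice((-1, 1))
    # Both neighbouring blocks must keep at least one site.
    if not blocks[b - 1][0] < shifted < block_end(blocks, b, n_sites):
        return list(blocks)
    new = list(blocks)
    new[b] = (shifted, omega)
    return new


def split_block(blocks, n_sites, rng, scale=0.5):
    """Cut one block in two; the 3' half gets a perturbed omega."""
    b = rng.randrange(len(blocks))
    start, omega = blocks[b]
    end = block_end(blocks, b, n_sites)
    if end - start < 2:
        return list(blocks)
    cut = rng.randrange(start + 1, end)
    new = list(blocks)
    new.insert(b + 1, (cut, omega * math.exp(rng.gauss(0.0, scale))))
    return new


def merge_blocks(blocks, n_sites, rng):
    """Remove one internal boundary; merged omega is a length-weighted mean."""
    if len(blocks) < 2:
        return list(blocks)
    b = rng.randrange(len(blocks) - 1)
    (s1, w1), (s2, w2) = blocks[b], blocks[b + 1]
    end = block_end(blocks, b + 1, n_sites)
    merged = (w1 * (s2 - s1) + w2 * (end - s2)) / (end - s1)
    return blocks[:b] + [(s1, merged)] + blocks[b + 2:]


MOVES = (change_omega, extend_block, split_block, merge_blocks)

rng = random.Random(0)
state = [(0, 1.0)]
for _ in range(1000):
    # Proposals are applied unconditionally here, purely to exercise them.
    state = rng.choice(MOVES)(state, 60, rng)
print(len(state), state[:3])
```

Split and merge change the number of blocks, so in a full sampler they would be trans-dimensional moves and need particular care in the acceptance ratio.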

Page 19: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Part three

Does it work? Simulation studies and example data

Page 20: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Posterior distribution for known and unknown genealogy

Page 21: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Posterior distribution for known and unknown genealogy

Page 22: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Neutral dataset

True omega

Posterior mean

Posterior HPD interval

Page 23: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Non-neutral dataset

True omega

Posterior mean

Posterior HPD interval

Page 24: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

HIV envelope gene: Slow Non-Progressors vs Rapid Progressors

Slow Non-Progressors Rapid Progressors

Page 25: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

HIV envelope gene: Slow Non-Progressors vs Rapid Progressors

Slow Non-Progressors Rapid Progressors

Page 26: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Neisseria meningitidis PorB3

Page 27: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Neisseria meningitidis PorB3

95% HPD upper: 0.0386

95% HPD lower: 0.0187
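An HPD (highest posterior density) interval is often estimated from MCMC output as the shortest interval containing the requested fraction of the samples. The sketch below shows that common recipe on made-up numbers; the function name `hpd_interval` is hypothetical and this is not necessarily how the bounds above were computed.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    m = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of m consecutive sorted samples and keep the narrowest.
    best = min(range(n - m + 1), key=lambda i: xs[i + m - 1] - xs[i])
    return xs[best], xs[best + m - 1]


# Made-up samples with a long right tail.
samples = [10, 0, 1, 2, 3, 50, 60, 70]
print(hpd_interval(samples, mass=0.5))
```

For a skewed posterior this interval differs from the equal-tailed one, because it is allowed to sit wherever the samples are densest.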

Page 28: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Work in progress…

  • Variable recombination rate
  • Model indels
  • Falsifiability test
  • Test for sensitivity to rate heterogeneity

Page 29: DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT

Acknowledgements

  • Gil McVean (Supervisor)
  • Martin Maiden (Supervisor)
  • Ziheng Yang
  • Rachel Urwin (meninge data)
  • Charlie Edwards (HIV data)