Evolution of the Genetic Code: Before and After the LUCA
1. The genetic code evolved to its canonical form before the Last Universal Common Ancestor (LUCA) of Archaea, Bacteria and Eukaryotes, more than 3 billion years ago. It appears to be highly optimized. How did it get to be this way?
2. Numerous small changes have occurred to the canonical code since then. What is the mechanism of codon reassignment?
Codon Reassignment – The genetic code is variable in mitochondria (and also in some other types of genomes)
The canonical genetic code (each cell lists the amino acids for third position U, C, A, G):

First       Second position
position    U          C          A                G
U           F F L L    S S S S    Y Y Stop Stop    C C Stop W
C           L L L L    P P P P    H H Q Q          R R R R
A           I I I M    T T T T    N N K K          S S R R
G           V V V V    A A A A    D D E E          G G G G
UGA Stop to Trp
AUA Ile to Met
CUN Leu to Thr
CGN Arg to unassigned
AGR Arg to Ser to Stop/Gly
etc.
But how can this happen? It should be disadvantageous.
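The effect of a reassignment on translation can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names `reassign` and `translate`, the example sequence, and the choice to combine the UGA and AUA changes into one variant code are all assumptions made for the example.

```python
from itertools import product

BASES = "UCAG"
# Canonical amino acids in the order UUU, UUC, UUA, UUG, UCU, ... ("*" = Stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CANONICAL = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reassign(code, changes):
    """Return a copy of `code` with some codons given new meanings."""
    new_code = dict(code)
    new_code.update(changes)
    return new_code


def translate(rna, code):
    """Translate an RNA string codon by codon, stopping at the first Stop."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = code[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Illustrative variant: UGA Stop -> Trp and AUA Ile -> Met (two of the changes listed above).
VARIANT = reassign(CANONICAL, {"UGA": "W", "AUA": "M"})

message = "AUGAUAUGAUAA"  # hypothetical sequence: AUG AUA UGA UAA
print(translate(message, CANONICAL))  # stops at UGA
print(translate(message, VARIANT))    # reads through UGA, stops at UAA
```

The same message gives different proteins under the two codes, which is why a reassignment is expected to be disadvantageous while the codon is still in use.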
Reassignments in Metazoa
[Figure: phylogeny of the metazoan groups Porifera, Cnidaria, Arthropoda, Nematoda, Platyhelminthes, Lophotrochozoa, Echinodermata, Hemichordata, Urochordata, Cephalochordata and Craniata, with the following events marked on its branches.]
AAA : Lys -> Asn
Loss of tRNA-Ile(CAU) but AUA remains Ile
Loss of tRNA-Arg(UCU) and AGR : Arg -> Ser
AGR : Ser -> Stop
AGR : Ser -> Gly
AUA : Ile -> Met
Loss of many tRNAs + import from cytoplasm
AAA : Lys -> unassigned
Example 1: AUA was reassigned from Ile to Met during the early evolution of the mitochondrial genome.
Before
Amino acid   Codon   Anticodon
Ile          AUU     GAU
Ile          AUC     GAU
Ile          AUA     k2CAU
Met          AUG     CAU
Notes:
G in the wobble position of the tRNA-Ile can pair with U and C in the third codon position.
Bacteria and some protist mitochondria possess another tRNA-Ile with a modified base that translates AUA only.
The tRNA-Met translates AUG only.
After
Amino acid   Codon   Anticodon
Ile          AUU     GAU
Ile          AUC     GAU
Met          AUA     UAU or f5CAU
Met          AUG     UAU or f5CAU
Notes:
In animal mitochondria the k2CAU tRNA has been deleted.
There is a gain of function of the tRNA-Met by a mutation or a base modification.
Example 2: UGA was reassigned from Stop to Trp many times (12 times in mitochondria).
Before
Meaning   Codon   Read by
Stop      UGA     RF
Trp       UGG     CCA
Notes:
Release Factor (RF) recognizes the UGA codon.
Normal tRNA-Trp translates only UGG codons.
After
Meaning   Codon   Anticodon
Trp       UGA     UCA
Trp       UGG     UCA
Notes:
In animal mitochondria (and elsewhere) there is a gain of function of the tRNA-Trp via mutation or base modification so that it translates both UGG and UGA.
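The anticodon tables in the two examples above can be reproduced with a small lookup of which third-position bases each wobble base reads. This is a simplified sketch that encodes only the pairings stated in these two examples; the table `WOBBLE_READS` and the function `codons_read` are names invented for the illustration, and real wobble rules are more varied.

```python
# Third-position codon bases read by each anticodon wobble base,
# restricted to the cases stated in Examples 1 and 2.
WOBBLE_READS = {
    "G": "UC",    # GAU reads AUU and AUC
    "C": "G",     # CAU reads AUG only; CCA reads UGG only
    "U": "AG",    # UCA reads UGA and UGG; UAU reads AUA and AUG
    "k2C": "A",   # k2CAU reads AUA only
    "f5C": "AG",  # f5CAU reads AUA and AUG
}
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(wobble, rest):
    """List the codons read by an anticodon written 5'->3'.

    wobble: first anticodon base (possibly modified, e.g. "k2C").
    rest:   the second and third anticodon bases, e.g. "AU" for GAU.
    """
    first = PAIR[rest[1]]   # codon position 1 pairs with anticodon position 3
    second = PAIR[rest[0]]  # codon position 2 pairs with anticodon position 2
    return [first + second + third for third in WOBBLE_READS[wobble]]


print(codons_read("G", "AU"))    # tRNA-Ile
print(codons_read("C", "CA"))    # tRNA-Trp before the change
print(codons_read("U", "CA"))    # tRNA-Trp after the change
print(codons_read("k2C", "AU"))  # second tRNA-Ile
```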
The GAIN-LOSS framework
(Sengupta & Higgs, Genetics 2005)
LOSS = deletion or loss of function of a tRNA or RF
GAIN = gain of a new tRNA or a gain of function of an existing one.
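[Figure: two routes from the initial code to the new code.]
Initial code. No problem.
GAIN first: ambiguous codon. Selective disadvantage. A subsequent LOSS gives the new code.
LOSS first: unassigned codon. Selective disadvantage. A subsequent GAIN gives the new code.
New code. Selective disadvantage because codons are used in the wrong places.
Mutations in coding sequences then lead to:
New code. Codons now used in the right places. No problem.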
Note – the strength of the selective disadvantage depends on the number of times the codon is used. There is no disadvantage if the codon disappears.
Four possible mechanisms of codon reassignment.
1. Codon Disappearance - The codon disappears. The order of the gain and loss is irrelevant.
For the other three mechanisms the codon does not disappear.
2. Ambiguous Intermediate – The gain happens before the loss. There is a period when the gain is fixed in the population and translation is ambiguous.
3. Unassigned Codon – The loss happens before the gain. There is a period when the loss is fixed in the population and the codon is unassigned.
4. Compensatory Change – The gain and loss are fixed in the population simultaneously (although they do not arise at the same time). There is no intermediate period between the old and the new codes. - cf. theory of compensatory substitutions in RNA helices.
Sengupta & Higgs (2005) showed that all four mechanisms work in a population genetics simulation
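The four mechanisms differ only in whether the codon disappears and in the order in which the gain and the loss are fixed. A minimal sketch of that decision logic follows; the function name and the use of numeric fixation times are assumptions for illustration, and this is not the population genetics simulation of Sengupta & Higgs.

```python
def classify_reassignment(codon_disappears, gain_fixed_at, loss_fixed_at):
    """Name the mechanism from the order in which gain and loss are fixed.

    gain_fixed_at / loss_fixed_at are fixation times in arbitrary units.
    """
    if codon_disappears:
        # The order of the gain and loss is irrelevant in this case.
        return "Codon Disappearance"
    if gain_fixed_at < loss_fixed_at:
        return "Ambiguous Intermediate"
    if loss_fixed_at < gain_fixed_at:
        return "Unassigned Codon"
    return "Compensatory Change"


print(classify_reassignment(False, gain_fixed_at=10, loss_fixed_at=25))
print(classify_reassignment(False, gain_fixed_at=30, loss_fixed_at=12))
```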
Summary of Codon Reassignments in Mitochondria
(Mechanisms: CD = Codon Disappearance, UC = Unassigned Codon, AI = Ambiguous Intermediate.)

Codon reassignment              No. of   Explained by GC -> AU   Change in      Is mispairing important?    Mechanism
                                times    mutation pressure?      no. of tRNAs
UAG: Stop -> Leu                2        G -> A at 3rd pos.      +1             No                          CD
UAG: Stop -> Ala                1        G -> A at 3rd pos.      +1             No                          CD
UGA: Stop -> Trp                12       G -> A at 2nd pos.      0              Possibly. CA at 3rd pos.    CD
CUN: Leu -> Thr                 1        C -> U at 1st pos.      0              No                          CD
CGN: Arg -> Unass.              5        C -> A at 1st pos.      -1             No                          CD
AUA: Ile -> Met or Unassigned   3 / 5    No                      -1             Yes. GA at 3rd pos.         UC
AAA: Lys -> Asn                 2        No                      0              Yes. GA at 3rd pos.         AI
AAA: Lys -> Unass.              1        No                      0              Possibly. GA at 3rd pos.    UC or AI
AGR: Arg -> Ser                 1        No                      -1             Yes. GA at 3rd pos.         UC
AGR: Ser -> Stop                1        No                      0              No                          AI(b)
AGR: Ser -> Gly                 1        No                      +1             No                          AI(b)
UUA: Leu -> Stop                1        No                      0              No                          UC or AI
UCA: Ser -> Stop                1        No                      0              No                          UC or AI
CD mechanism explains disappearance of stop codons because they are rare initially. Only a few examples of CD for sense codons. UC and AI are important for sense codons.
Three examples in yeasts (Mutation pressure GC to AU)
[Figure: the canonical codon table, annotated with the three reassignments below.]
CGN is rare (replaced by AGR)
CGN Arg codons become unassigned.
CUN is rare (replaced by UUR)
CUN Leu to Thr
AUA and AUU are common and AUC is rare.
Nevertheless AUA is reassigned to Met. The codon does not disappear.
Leu and Arg codons in yeasts
Codon Disappearance causes reassignments

Species   Leu CUN   Leu UUR   Arg CGN   Arg AGR
S         53        192       7         33
Y.        44        618       0**       75
C         3         279       12        29
C         132       397       47        26
C         66        547       39        45
P         25        714       18        67
K         0         286       0**       48
C         11*       294       1**       45
S         33*       333       7         49
S         19*       274       0**       40
S         22*       300       0**       46
* CUN = Thr. Unusual tRNA-Thr present instead of tRNA-Leu
** CGN = unassigned. tRNA-Arg is deleted
Codon Usage: AUA Ile to Met in Yeasts

Species   AUU   AUC   AUA   AUG   AUA is   tRNA
J         133   40    32    48    Ile      k2CAU
O         161   34    0     57    Absent   none
P         113   39    49    51    Ile      k2CAU

C         119   81    229   100   Ile      k2CAU
C         303   32    193   117   Ile      k2CAU
P         274   18    562   105   Ile      k2CAU
K         213   16    7     63    ?        none
C         207   21    16    73    Met      C*AU
S         239   31    60    73    Met      C*AU
S         203   7     101   56    Met      C*AU
S         218   11    95    70    Met      C*AU
Codon   Amino acid   Anticodon
AUU     Ile          GAU
AUC     Ile          GAU
AUA     Ile          k2CAU
AUG     Met          CAU
Evolution of the canonical code - Before the LUCA
The canonical code seems to be optimized to reduce the effects of translational and mutational errors.
Neighbouring codons code for similar amino acids.
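[Figure: the 20 amino acids placed along a scale running from about 5 to 13; labels read from left to right: C, L, I, F, W, M, Y, V, P, T, A, H, Q, S, G, N, K, R, E, D.]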
Woese’s polar requirement scale
Measure difference between amino acid properties by how far apart they are on this scale.
Principal Component Analysis projects the 8-d space into the two 'most important' dimensions.
[Figure: PCA plot of the amino acids; one axis runs from Small to Big, the other from Hydrophobic to Hydrophilic.]
Cost function g(a,b) for replacing amino acid a by amino acid b
e.g. difference in Polar Requirement
$$E = \frac{\sum_i \sum_j r_{ij}\, g(a_i, a_j)}{\sum_i \sum_j r_{ij}}$$
$a_i$ = amino acid assigned to codon $i$
$r_{ij}$ = rate of mistaking codon $i$ for codon $j$
= 1 for single position mistakes, 0 otherwise
E = measure of error associated with a code
Generate random codes by permuting the 20 amino acids in the code table
E is smaller for the canonical code than for almost all random codes.
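[Figure: distribution p(E) of E over random codes, with the value Ereal of the canonical code marked; f is the fraction of random codes with E smaller than Ereal.]
f ~ 10^-6
One in a million codes is better (Freeland and Hurst).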
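The calculation described above can be sketched as follows. This is a rough illustration under stated assumptions, not the Freeland and Hurst analysis: the cost g is taken as the absolute difference in polar requirement, all single-position mistakes are weighted equally, pairs involving Stop codons are skipped, and the polar requirement values are the commonly tabulated ones and should be checked against the original scale before any serious use. With only a few hundred random codes it cannot resolve a fraction as small as 10^-6.

```python
import random
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Polar requirement values (assumed; verify against Woese's published scale).
POLAR = {
    "A": 7.0, "R": 9.1, "N": 10.0, "D": 13.0, "C": 4.8, "Q": 8.6, "E": 12.5,
    "G": 7.9, "H": 8.4, "I": 4.9, "L": 4.9, "K": 10.1, "M": 5.3, "F": 5.0,
    "P": 6.6, "S": 7.5, "T": 6.6, "W": 5.2, "Y": 5.4, "V": 5.6,
}


def neighbours(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def error_cost(code):
    """E = sum r_ij g(a_i, a_j) / sum r_ij with r_ij = 1 for single-position mistakes."""
    total, weight = 0.0, 0
    for codon_i, aa_i in code.items():
        if aa_i == "*":
            continue
        for codon_j in neighbours(codon_i):
            aa_j = code[codon_j]
            if aa_j == "*":
                continue  # assumption: mistakes to Stop codons are ignored
            total += abs(POLAR[aa_i] - POLAR[aa_j])
            weight += 1
    return total / weight


def random_code(rng):
    """Permute the 20 amino acids among the codon blocks; Stop codons stay put."""
    names = sorted(POLAR)
    shuffled = names[:]
    rng.shuffle(shuffled)
    swap = dict(zip(names, shuffled))
    return {codon: (aa if aa == "*" else swap[aa]) for codon, aa in CODE.items()}


def fraction_better(n_codes, seed=0):
    """Fraction of random codes whose E is below that of the canonical code."""
    rng = random.Random(seed)
    e_real = error_cost(CODE)
    better = sum(error_cost(random_code(rng)) < e_real for _ in range(n_codes))
    return better / n_codes


print("E for the canonical code:", round(error_cost(CODE), 3))
print("fraction of random codes with smaller E:", fraction_better(200))
```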
[Figure: the canonical genetic code shown as a grid of the 64 codons, each labelled with its amino acid.]
The statistical argument shows that the code is highly non-random but it does not explain how the code evolved to be that way. Need a step-by-step evolutionary argument that leads from a proposed first stage of the code to today’s code.
Random permutations – Not Possible
Random swaps – seems unlikely
The earliest code probably had few amino acids. Which were the first? Selection acts when new amino acids are added.
[Figure: the canonical genetic code shown as a grid of the 64 codons, each labelled with its amino acid.]
Time scale for the origin of life
The origin of the genetic code is the end of the RNA World
What preceded RNA? Another polymer? Metabolism only?
Dating of rocks and meteorites
Last ocean-vaporizing impact. Lunar craters.
Isotopic evidence for life
Microfossil evidence
Stromatolites.
Phylogenetic methods (divergence after LUCA)
Prebiotic synthesis of organic molecules: Miller-Urey experiment (1953)
Began with a mixture of CH4, NH3, H2O and H2. Energy source = electric spark or UV light. Obtained 10 amino acids.
Atmospheres and Chemistry
reducing: CH4, NH3, H2O, H2; or CO2, N2, H2; or CO, N2, H2. There is hydrogen gas and/or hydrogen is present combined with other elements (methane, ammonia, water).
neutral: CO or CO2, N2, H2O. No hydrogen or oxygen gas.
oxidizing: O2, CO2, N2. Oxygen gas is present.
Prebiotic chemists favour reducing atmospheres.
Yields in the Miller-Urey experiment are higher and more diverse in reducing than in neutral atmospheres. It doesn't work in an oxidizing atmosphere.
Planetary Atmospheres
The major element in the universe is H (big bang), so doesn't it make sense that the atmosphere was reducing?
Jupiter retains the original mixture: H2, He + small amounts of CH4, NH3, H2O.
Smaller planets lose H2.
A new atmosphere is created by outgassing from the interior.
Geologists & Astronomers favour an intermediate atmosphere.
(i) Venus - about 90 Earth atmospheres of pressure! Mostly CO2 and N2
(ii) Carbonates in sedimentary rocks on Earth suggest previously lots of CO2
Current Earth: Mostly N2, O2 + small amounts of CO2 and H2O – changed by life.
Mars: very low pressure – mostly CO2 and N2
So maybe Miller and Urey were wrong? :-(
Alternative suggestion – Hydrothermal vents
Sea water passes through vents.
Heated to 350 °C. Cools to 2 °C in the surrounding ocean.
Supply of H2, H2S etc.
Fierce debate as to whether these conditions favour formation or breakup of organic molecules (Miller & Lazcano, 1995)
Organic compounds in meteorites
Most widely studied meteorite is the Murchison meteorite. Fell in Australia in 1969. Carbonaceous chondrite.
Contained both biological and non-biological amino acids
Both optical isomers (later shown to be not quite equal)
Compounds are not contamination
Just about all the building block molecules have now been found in carbonaceous meteorites (Sephton, 2002).
Astrochemistry: molecular clouds; icy grains; parent bodies of meteorites....
Delivery by: dust particles; meteorites; comets....
Was external delivery an important source of organic molecules?
The earliest code probably had few amino acids. Which were the first? Selection acts when new amino acids are added.
[Figure: the canonical genetic code shown as a grid of the 64 codons, each labelled with its amino acid.]
Prebiotic Synthesis of amino acids
Higgs and Pudritz (2009) Astrobiology
Amino acids are found in
• Meteorites
• Atmospheric chemistry experiments (Miller-Urey)
• Hydrothermal synthesis
• Icy dust grains in space
Rank amino acids in order of decreasing frequency in 12 observations. Derive ranking.
Columns: Miller, Murchison, Yamato, Ice Exp. (rows with fewer than four values have blank cells whose column positions were lost)
Gly 1.000 1.00 1.000 1.000
Ala 1.795 0.34 0.380 0.293
Asp 0.077 0.19 0.035 0.022
Glu 0.018 0.40 0.110
Val 0.044 0.19 0.100 0.012
Ser 0.011 0.003 0.072
Ile 0.011 0.13 0.060
Leu 0.026 0.04 0.035
Pro 0.003 0.29 0.001
Thr 0.002 0.003
Comparison of amino acid frequencies produced non-biologically
10 amino acids are found in the Miller-Urey experiments. Very similar ones are also found in meteorites, an Ice grain analogue experiment, and other places.
These are ‘early’ amino acids that were available for use by the first organisms.
G A D E V S I L P T
The other 10 are not seen. These are late amino acids that were only used when organisms evolved a means of synthesizing them biochemically.
K R H F Q N Y W C M
(Concentrations are normalized relative to Gly.)
The earliest amino acids are those that are cheapest to form thermodynamically
Positions of early and late amino acids....
What does this mean?
[Figure: the canonical codon table with the positions of early and late amino acids marked; M and F are labelled separately.]
Maybe only 2nd position was relevant initially.
Late amino acids took over codons previously assigned to amino acids with similar properties.
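One way to look at the positions of early and late amino acids is to group the canonical code by second position and list which of each kind appear in each column. The sketch below does this using the early set (G A D E V S I L P T) and late set (K R H F Q N Y W C M) given above; the function name `amino_acids_by_column` is an assumption for the example.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

EARLY = set("GADEVSILPT")
LATE = set("KRHFQNYWCM")


def amino_acids_by_column(code):
    """Collect the amino acids found in each second-position column."""
    columns = {base: set() for base in BASES}
    for codon, aa in code.items():
        if aa != "*":  # skip Stop codons
            columns[codon[1]].add(aa)
    return columns


columns = amino_acids_by_column(CODE)
for base in BASES:
    early = "".join(sorted(columns[base] & EARLY))
    late = "".join(sorted(columns[base] & LATE))
    print(f"second position {base}: early = {early or '-'}, late = {late or '-'}")
```

Every column contains at least one early amino acid, and the C column contains only early ones.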
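Four-column code (the same assignment holds for every first position U, C, A, G and every third position U, C, A, G):

Second position   U     C     A     G
Amino acid        Val   Ala   Asp   Gly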
Propose that the four earliest amino acids were Val, Ala, Asp, Gly
Four column code. (Higgs Biol. Direct. 2009)
This is a triplet code but only the second base means anything.
The second base is the most important for codon-anticodon recognition. Unlikely to make a mistake at second position.
All first and third position mistakes are synonymous.
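The claim that all first and third position mistakes are synonymous in the four-column code can be checked directly. This is a small sketch; the dictionary `FOUR_COLUMN` and the function `synonymous_fraction` are names chosen for the illustration.

```python
from itertools import product

BASES = "UCAG"
# Proposed four-column code: only the second base determines the amino acid.
COLUMN_AA = {"U": "Val", "C": "Ala", "A": "Asp", "G": "Gly"}
FOUR_COLUMN = {"".join(c): COLUMN_AA[c[1]] for c in product(BASES, repeat=3)}


def synonymous_fraction(code, position):
    """Fraction of single-base changes at `position` (0, 1 or 2) that keep the amino acid."""
    same = total = 0
    for codon, aa in code.items():
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += code[mutant] == aa
    return same / total


for position in range(3):
    print("position", position + 1, "synonymous fraction:",
          synonymous_fraction(FOUR_COLUMN, position))
```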
Code structure after addition of the 10 early amino acids.
Add new amino acids in positions that were formerly occupied by amino acids with similar properties. This minimizes disruption to existing gene sequences.
Summary of my argument -
Selection acts at the time of addition of new amino acids to the code. The new amino acid is assigned to codons that formerly coded for an amino acid with similar properties. This minimizes disruption to existing genes.
The result is that codons in the same columns end up assigned to amino acids with similar properties. The column structure is retained from the earliest code.
Hence the code appears to minimize translational error with respect to randomly reshuffled codes, even though translational error was not the main factor being selected.
Pathways of amino acid synthesis in modern organisms (from Di Giulio 2008)
Other points –
Column structure suggests that translational errors were more important than mutational errors (tRNA structure/RNA world)
Precursor-product pairs tend to be neighbours (but doubts over statistical significance). Maybe late amino acids took over codons previously assigned to their biochemical precursors.
Direct chemical interactions between RNA motifs and amino acids (“stereochemical theory”). In vitro selection experiments suggest binding sites of aptamers preferentially contain codon and anticodon sequences.
RNA World
First hypothesis (almost certainly true):
There was a stage of evolution when RNA molecules performed both genetic and catalytic roles. DNA later took over the genetic role and proteins took over the catalytic role.
Translation depends on RNA:
mRNA supplies the information for protein synthesis.
The active ingredient of the ribosome is rRNA – 3D structures show the site of the peptidyl transferase reaction. Proteins were probably a late addition to the ribosome.
tRNAs are also essential for translation.
Second hypothesis (the jury is still out):
The RNA world arose de novo in the form of self-replicating ribozymes.
Self-splicing introns. First RNA catalysts to be discovered. Tom Cech (1982).
‘RNA World’ term coined by Walter Gilbert (1986).
RNA world idea originated in 60’s as a theoretical solution to the chicken and egg problem of DNA and proteins.
Hammerhead ribozyme
Cleaves RNA at a specific point.
Rolling circle mechanism of replication of virus-like RNAs in plants. Chops long strand into pieces.
Example of an RNA catalyst
What can ribozymes do? Ligases
T. A. Lincoln, G. F. Joyce, Science 323, 1229 (2009)
A + B -> E (catalyzed by E')
An Autocatalytic Set Made from Ligases
T. A. Lincoln, G. F. Joyce, Self-Sustained Replication of an RNA Enzyme, Science 323, 1229, (2009)
Given a supply of A, B, A’, B’, the E and E’ make more of themselves.
A + B -> E (catalyzed by E'); A' + B' -> E' (catalyzed by E)
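The cross-catalytic scheme can be illustrated with a toy simulation in which each enzyme ligates the substrates of the other. This is a sketch only: the mass-action rate law, the rate constant `k`, the time step and the starting amounts are hypothetical values chosen for the example, not measurements from Lincoln & Joyce.

```python
def simulate(steps=2000, dt=0.01, k=0.01, substrate=10.0, seed_enzyme=0.01):
    """Euler integration of A + B -> E (catalyzed by E') and A' + B' -> E' (catalyzed by E)."""
    a = b = a_p = b_p = substrate
    e = e_p = seed_enzyme
    history = [(e, e_p)]
    for _ in range(steps):
        # Amount ligated in this step, capped by the substrate still available.
        made_e = min(k * e_p * a * b * dt, a, b)
        made_e_p = min(k * e * a_p * b_p * dt, a_p, b_p)
        a -= made_e
        b -= made_e
        e += made_e
        a_p -= made_e_p
        b_p -= made_e_p
        e_p += made_e_p
        history.append((e, e_p))
    return history, (a, b, a_p, b_p)


history, leftovers = simulate()
print("E at start and end:", history[0][0], round(history[-1][0], 3))
print("A remaining:", round(leftovers[0], 3))
```

While substrates are plentiful the amounts of E and E' grow roughly exponentially, and growth stops once A, B, A' and B' run out.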
What can ribozymes do? Recombinases
E.J. Hayden, G.v. Kiedrowski & N. Lehman, Angew. Chem. Int. Edit. (2008) 120, 8552
Catalyst is autocatalytic given a supply of W, X, Y, Z. The non-covalent assembly is also a catalyst.
What can ribozymes do? Polymerases
[Figure legend: black + blue – ribozyme; red – template; orange – primer.]
Primer extended by up to 14 nucleotides
Johnston et al. (2001) Science
Gradual improvement of Polymerases in the lab
Wochner et al. (2011) Science - up to 95 nucleotides
What can ribozymes do? Nucleotide Synthetases
Unrau and Bartel, (1998) Nature
An RNA organism must have had a metabolism.
Hypothetical pathway for RNA catalyzed RNA synthesis (Joyce)
Synthesis of nucleosides
Phosphorylation
Generation of NTPs
Creation of activated nucleotides
Stepwise polymerization
Clutter of RNA synthesis (Joyce)
Why is this particular set of monomers used for nucleic acids?
How is this set synthesized specifically?
Where is the chemistry occurring? Earth, or space? Hydrothermal vents?
MW Powner et al. Nature 459, 239-242 (2009) doi:10.1038/nature08013
Previously assumed synthesis of β-ribocytidine-2',3'-cyclic phosphate 1 (blue; note the failure of the step in which cytosine 3 and ribose 4 are proposed to condense together) and the successful new synthesis described here (green). p, pyranose; f, furanose.
A new route to Pyrimidine ribonucleotide assembly.
Chemical synthesis of monomers and polymers must have occurred before the origin of ribozymes.
Ferris (2002) Orig. Life Evol. Biosph.: Montmorillonite-catalyzed synthesis of RNA oligonucleotides (30-50 mers)
Rajamani et al. (2008) Orig. Life Evol. Biosph.: Lipid-assisted synthesis of RNA-like polymers from mononucleotides
Costanzo et al. (2009) J. Biol. Chem.: Synthesis of long RNA strands from cyclic nucleotides in water
Rajamani et al. (2010) J. Am. Chem. Soc.: Measurements of error rates in non-enzymatic RNA replication
There are still some experimental issues… But this is a logical necessity!
How could the RNA world have got started?
Getting from chemistry to biology….
RNA replicators must have emerged from prebiotic synthesis of random sequences
Jump-starting the RNA World – Wu & Higgs (2009) J. Mol. Evol.
[Figure: reaction scheme. Precursors -> (synthesis) -> Monomers -> (activation) -> Activated monomers -> (polymerization) -> Short polymers -> (polymerization) -> Long polymers. Ribozymes arise among the polymers and catalyze the steps of the scheme.]
Are there alternatives to RNA?
[Figure: structures of RNA and four alternatives.]
a – Threose Nucleic Acid – TNA
b – Peptide nucleic acid – PNA
c – Glycerol-derived nucleic acid
d – Pyranosyl RNA
RNA hybridizes with other nucleic acids. Information is not lost.
DNA-RNA hybrids: DNA takes over at the end of the RNA world.
Maybe TNA or PNA preceded the RNA world. Information passed to RNA.
Would need to show that the alternative was easier to synthesize than RNA.
Two scenarios from Segré & Lancet (2000)
A – RNA first (strong RNA world hypothesis)
B – Lipids first (lipid world hypothesis – compositional genomes – metabolism without genes)