Talk by Jonathan Eisen on Phylogenomics of DNA repair at Lake Arrowhead Small Genomes meeting in 2000.
TIGR
Topics of Discussion
• DNA Repair
• Why study evolution of repair?
• Evolution of specific pathways with examples from recent genome projects (e.g., A. thaliana, Vibrio cholerae, Shewanella putrefaciens, Buchnera aphidicola symbiont)
• Big picture – evolutionary origins of repair
Damage is not just to DNA
General Mechanisms of Resistance to Cellular Damaging Agents
• Damage protection/prevention
• Damage tolerance
• Repair and recovery
Classes of DNA Repair
• Direct repair
  – Photoreactivation
  – Alkylation transfer
  – DNA ligation/non-homologous end joining
• Excision repair
  – Base excision repair
  – Mismatch excision repair
  – Nucleotide excision repair
• Recombinational repair
Excision Repair Outline
[Figure: two parallel excision repair tracks.
Nucleotide and mismatch excision: endonuclease; exonuclease, helicase, polymerase.
Base excision: N-glycosylase; AP endo; exonuclease, polymerase.
Labels shared by the diagram: damage recognition, ligase.]
Recombination Outline
[Figure: steps of recombinational repair with the proteins labeled at each step.
1. Generation of single-strand overhang: RecBCD; RecE,T; RecQ,J; Rad50, MRE11, XRS2
2. Initiation, alignment: RecA, RecFOR; Rad52
3. Strand invasion: RecA; Rad51,55,57
4. DNA synthesis
5. Branch migration and resolution: RuvABC; RecG, RUS?; Rad54?]
“Nothing in biology makes sense except in the light of evolution.”
T. H. Dobzhansky (1973)
Why Study Evolution and Repair?
• Repair variation leads to differences in evolutionary patterns within and between species.
• Evolutionary analysis can identify mutation/repair biases.
• Evolutionary studies can improve our understanding of repair proteins and pathways.
• Comparisons of repair genes can be used to infer evolutionary history.
• Information on mutation processes improves sequence and phylogenetic analysis.
• Evolutionary analysis is required to infer the origins and history of repair processes.
Steps in Phylogenomic Analysis
• Create database of genes of interest
• Presence/absence of homologs in complete genomes
• Phylogenetic trees of each gene family
• Infer evolutionary events (gene origin, duplication, loss and transfer)
• Refine presence/absence (orthologs, paralogs, subfamilies)
• Functional predictions and functional evolution
• Analysis of pathways
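The first two steps above (a gene database, then presence/absence of homologs in complete genomes) can be sketched as follows. This is a minimal illustration, not the pipeline used for the talk: the `HITS` list is hand-written from statements made elsewhere in these slides about the two ligase families, and the names `presence_absence` and `format_table` are hypothetical helpers. A real analysis would generate the hits from similarity searches.

```python
from collections import defaultdict

# Illustrative (genome, gene family) hits, hand-copied from ligase statements
# elsewhere in this talk. Assumption: a real run would produce these from
# similarity searches against complete genomes.
HITS = [
    ("E. coli", "LigaseI"),
    ("B. subtilis", "LigaseI"), ("B. subtilis", "LigaseII"),
    ("M. tuberculosis", "LigaseI"), ("M. tuberculosis", "LigaseII"),
    ("A. aeolicus", "LigaseI"), ("A. aeolicus", "LigaseII"),
    ("M. jannaschii", "LigaseII"),
    ("S. cerevisiae", "LigaseII"),
]


def presence_absence(hits):
    """Return {genome: {family: bool}} covering every family seen in hits."""
    families = sorted({family for _, family in hits})
    found = defaultdict(set)
    for genome, family in hits:
        found[genome].add(family)
    return {g: {f: f in fams for f in families} for g, fams in found.items()}


def format_table(table):
    """Render the matrix as plain text with + for present and - for absent."""
    families = sorted(next(iter(table.values())))
    lines = ["genome".ljust(18) + " ".join(f.ljust(9) for f in families)]
    for genome in sorted(table):
        marks = " ".join(
            ("+" if table[genome][f] else "-").ljust(9) for f in families
        )
        lines.append(genome.ljust(18) + marks)
    return "\n".join(lines)


TABLE = presence_absence(HITS)
print(format_table(TABLE))
```

The later steps (trees per gene family, then refining the matrix into orthologs and paralogs) need alignment and tree-building tools and are not sketched here.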
Nucleotide Excision Repair
Protein name(s) | Biochemical activity | Homolog presence per genome (Bacteria, then Archaea, then Eukarya)

Bacterial NER
UvrA | Binds damaged DNA | + + + + + + + + + + + + + + + - - - - -
UvrB | Helicase, 3' incision endonuclease | + + + + + + + + + + + + + + + - - - - -
UvrC | 5' incision endonuclease | + + + + + + + + + + + + + + + - - - - -
UvrD | Excision helicase | + + + + + ++ + + + ++ + + + + ++ - - - + +
MFD | Transcription repair coupling | + + + + + + - - + + + + - + - - - - - -

Eukaryotic NER: Recognition
Rad14 (XPA) | Binds damaged DNA | - - - - - - - - - - - - - - - - - - + + +
RFA1/RPA1 | ssDNA binding w/ RFA2,3 | - - - - - - - - - - - - - - ± - - - + + +
RFA2/RPA2 | ssDNA binding w/ RFA1,3 | - - - - - - - - - - - - - - - - - - + ++ +
RFA3/RPA3-human | ssDNA binding w/ RFA1,2 | - - - - - - - - - - - - - - - - - - - + +
RFA3/RPA3-yeast | ssDNA binding w/ RFA1,2 | - - - - - - - - - - - - - - - - - - + +

Eukaryotic NER: Initiation
Rad3 (XPD) (ERCC2) | TFIIH component – helicase | - - - - - - - - - - - - - - - - - ± + + +
Rad25 (XPB) (ERCC3) | TFIIH component – helicase | - - - - - - - - - + - + - + - - + + + + +
SSL1 (p44) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
TFB1 (p62) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
TFB2 (p52) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
TFB3 (MAT1) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
TFB4 (p34) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
CCL1 (CyclinH) | TFIIH component | - - - - - - - - - - - - - - - - - - + + +
Kin28 (CDK7) | TFIIH component - protein kinase | - - - - - - - - - - - - - - - - - - + + +

Eukaryotic NER: Incision
Rad2 (XPG) (ERCC5) | 3' incision (flap endonuclease) | - - - - - - - - - - - - - - + + + + + + +
Rad10 (ERCC1) | 5' incision endonuclease w/ Rad1 | - - - - - - - - - - - - - - - - - - + + +
Rad1 (XPF) (ERCC4) | 5' incision endonuclease w/ Rad10 | - - - - - - - - - - - - - + + + + + + +

Eukaryotic NER: Specificity
Rad4 (XPC) | Repair of inactive DNA | - - - - - - - - - - - - - - - - - - + + +
Rad23 (HHRAD23) | Repair of inactive DNA | - - - - - - - - - - - - - - - - - - + ++ +
Rad7 | Repair of inactive DNA | - - - - - - - - - - - - - - - - - - + +
Rad16 | Repair of inactive DNA | - - - - - - - - - - - - - - - - - - + + +
Rad26 (CSB) (ERCC6) | Transcription-repair coupling | - - - - - - - - - - - - - - - - - - + + +
CSA (ERCC8) | Transcription-repair coupling | - - - - - - - - - - - - - - - - - - ± + +
Recombinational Repair
Protein name(s) | Biochemical activity | Homolog presence per genome (Bacteria, then Archaea, then Eukarya)

Initiation: RecBCD pathway
RecB | ExoV Helicase | + + + - - - - - - + + - - + - - - - - -
RecC | ExoV Nuclease | + + + - - - - - - + ±+ - - + - - - - - -
RecD | ExoV Helicase | + + + - ± ± - - - + ±+ - - + - - - - - -

Initiation: RecF pathway
RecF | Assists RecA filamentation | + + - - + + - - + + - + - + - - ± - ± ±
RecJ | 5'-3' ssDNA exonuclease | + + + + + + - - + - + + + + - - - - - -
RecO | Binds ssDNA, assists RecF? | + + + - + + - - + + - - - + - - - - - -
RecR | ATP binding, assists RecF? | + + + ±+ + + - - + + - + + + - - - - - -
RecN | ATP binding | + + + + + + - - + + - + + + - - ± - - -
RecQ | 3'-5' DNA helicase | + + + - ± + - - + - - + - + - - - - + ++ +

Initiation: RecE pathway
RecE/ExoVIII | 5'-3' dsDNA exonuclease | + - - - - - - - - - - - - + - - - - - -
RecT | Binds ssDNA, promotes pairing | + - - - + + - - - - - - - + - - - - - -

Initiation: SbcBCD pathway
SbcB/ExoI | 3'-5' ssDNA exonuclease | + + - - - - - - - - - - - + - - - - - -
SbcC | dsDNA exonuclease (w/ SbcD) | + - - - ±+ + - - + - + + + + ± ± ± ± ± ± ±
SbcD | dsDNA exonuclease (w/ SbcC) | + - - - - + - - + - + + + + ± ± ± ± ± ± ±

Initiation: AddAB pathway
AddA/RexA | Exonuclease + helicase w/ AddB | - - + - + + - - - - - + - + - - - - - -
AddB/RexB | Exonuclease + helicase w/ AddA | - - + - + + - - - - - - - + - - - - - -

Initiation: Rad52 pathway
Rad52, Rad59 | n/a | - - - - - - - - - - - - - - - - - - ++ + +
Mre11/Rad32 | Nuclease w/ Rad50 | ± - - - ± ± - - ± - ± ± ± ± + + + + + + +
Rad50 | Nuclease w/ Mre11 | ± - - - ± ± - - ± - ± ± ± ± + + + + + + +

Recombinase
RecA, Rad51 | DNA binding, strand exchange | + + + + + + + + + + + + + + + + + + ++ ++ ++

Branch migration/resolution: Branch migration
RuvA | Binds junctions. Helicase w/ RuvB | + + + + + + + + + + + + - + - - - - - -
RuvB | 5'-3' junction helicase w/ RuvA | + + + + + + + + + + + + - + - - - - - -
RecG | Resolvase, 3'-5' junction helicase | + + + + + + - - + + + + + + - - - - - -

Branch migration/resolution: Resolvases
RuvC | Junction endonuclease | + + + + - - - - + + - + - + - - - - - -
RecG | Resolvase, 3'-5' junction helicase | + + + + + + - - + + + + + + - - - - - -
Rus | Junction endonuclease | + - - - - - - - - ±+ - - ±+ + - - - - - -
CCE1 | Junction endonuclease | - - - - - - - - - - - - - - - - - - + +

Other recombination proteins
Rad54 | n/a | - - - - - - - - - - - - - - - - - - + + +
Rad55 | n/a | - - - - - - - - - - - - - - - - - - + + +
Rad57 | n/a | - - - - - - - - - - - - - - - - - - + + +
Xrs2 | Assists Rad50/MRE11? | - - - - - - - - - - - - - - - - - - + +
Evolution of Specific Pathways
Photoreactivation and Photolyases
• All photoreactivation is carried out by enzymes in the photolyase family
• Two main classes of photolyases – class I and class II – are distantly related to each other and likely the result of an ancient duplication
• PhrI and PhrII missing from most species for which complete genomes are available.
• Many cases of functional change (e.g., CPD -> 6-4) and some are not even involved in DNA repair
• Many of the eukaryotic proteins appear to be of an organellar ancestry
Uses of Evolution: Photoreactivation
• All known enzymes that perform photoreactivation are part of a single large photolyase gene family
• Some members of the family do not function as photolyases, but instead work as blue-light receptors
• If a species does not encode a member of the photolyase gene family, it likely does not have photoreactivation capability
• If a species encodes a photolyase homolog, one cannot conclude from that alone that it has photolyase activity
• Position of photolyase homologs within photolyase tree helps predict what activities they have
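The asymmetry in these bullets (absence of a family member is informative, presence is not) can be written down as a small rule. The sketch below is only an illustration of that reasoning: the group labels are the ones shown on the photolyase tree in this talk, while the function name `predict_photoreactivation` and the mapping `ACTIVITY_BY_GROUP` are hypothetical.

```python
# Group labels follow the photolyase tree shown in this talk; the mapping to a
# predicted activity is an illustrative assumption, not a curated annotation.
ACTIVITY_BY_GROUP = {
    "MTHF-type class I CPD photolyases": "CPD photolyase",
    "8-HDF-type CPD photolyases": "CPD photolyase",
    "6-4 photolyases": "6-4 photolyase",
    "blue-light receptors": "blue-light receptor (no photoreactivation)",
}


def predict_photoreactivation(tree_groups):
    """Predict activity from the tree groups a genome's homologs fall into.

    tree_groups is an empty list when the genome encodes no family member.
    """
    if not tree_groups:
        # No homolog at all: the only case that supports a firm negative call.
        return "photoreactivation likely absent"
    activities = sorted({ACTIVITY_BY_GROUP.get(g, "unknown") for g in tree_groups})
    return "homolog(s) present; predicted by tree position: " + ", ".join(activities)


print(predict_photoreactivation([]))
print(predict_photoreactivation(["blue-light receptors", "6-4 photolyases"]))
```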
[Figure: phylogenetic tree of the photolyase gene family. Labeled groups: MTHF-type class I CPD photolyases; 6-4 photolyases; blue-light receptors; 8-HDF-type CPD photolyases. Asterisked sequences: ORFA00965, ORF02295.Vibch, ORF01792.Vibch.]
Photolyases in A. thaliana
[Figure: photolyase family tree including A. thaliana sequences. Labeled groups: Crys Group with -Proteobacteria; Other Bacteria; Eukaryotic; Cyano/Plastid.]
Alkyltransferases
• All known alkyltransferases are members of a single gene family
• Found in most but not all species
• Likely present in LUCA
• Ada protein in E. coli originated by fusion between an alkyltransferase and a transcription-regulatory domain
• Gram-positive bacteria have the Ada domain fused to an alkylation glycosylase instead of alkyltransferase
Alkylation Repair Genes
[Figure: domain architecture of alkylation repair proteins. Domains: AlkA domain (O6-Me-G glycosylase); Ogt domain (O6-Me-G alkyltransferase); Ada domain (transcription regulator). Proteins shown: Ada E. coli, Ada H. infl, Ogt E. coli, Ogt H. infl, Ogt Gram+, Ogt D. radio, AlkA Gram+, AlkA E. coli, MGMT Euks.]
DNA Ligases
• Two major ligase families
• Ligase I
  – NAD dependent
  – Found in all bacteria and only in bacteria
• Ligase II
  – ATP dependent
  – Found in all Archaea and eukaryotes
  – Found in some bacteria
  – Duplicated in many eukaryotes
DNA Ligases in A. thaliana
[Figure: tree of DNA ligase homologs from A. thaliana, yeast, C. elegans, Drosophila, and several Archaea and bacteria.]
Mismatch Excision Repair
• Core of process highly homologous between bacteria and eukaryotes (all use MutS and MutL homologs).
• Eukaryotes encode multiple MutS and MutL homologs, not all of which are involved in mismatch repair.
• Two major MutS groups: MutS-I proteins involved in MMR and MutS-II proteins involved in chromosome segregation.
• MutS1 and MutL missing from many bacteria, especially pathogens. Other MMR proteins also defective in some.
• Few homologs in Archaea – some encode MutS2, none encode MutS1, and some may encode MutL.
• Some evolutionary and functional relationships to restriction-modification systems (MutH, MED1, Vsr).
[Figure: tree of the MutS family with bootstrap values. MutS-I (mismatch repair): MSH6, MSH3, MutS1, MSH1, MSH2. MutS-II (chromosome crossover and segregation): MSH5, MSH4, MutS2. A proposed duplication separates the two groups.]
Ancient Duplication in MutS Family
[Figure: panels A and B, each marking a gene duplication that gave rise to the MutS1 and MutS2 lineages. Species shown: E. coli, H. influenzae, N. gonorrhoeae, H. pylori, Synechocystis sp., B. subtilis, S. pyogenes, M. pneumoniae, M. genitalium, A. aeolicus, D. radiodurans, T. pallidum, B. burgdorferi.]
Parallel Loss of MutLS
Lost in mycoplasmal lineage (present in B. subtilis and S. pyogenes)
Lost in M. tuberculosis lineage (found in some other high-GC Gram-positives)
Lost in H. pylori / C. jejuni lineage (present in many other Proteobacteria)
Possibly lost in Euryarchaeota lineage
Defective in many “wild” E. coli and S. typhimurium strains
Loss of genes may give an advantage in some conditions by increasing mutation rate or recombination rate between species.
Nucleotide Excision Repair
• Bacterial and eukaryotic systems are not homologous, despite having very similar mechanisms
• Most of the eukaryotic and bacterial proteins originated within each of these domains
• Some of the eukaryotic proteins are shared with Archaea (Rad1, Rad2, Rad25).
• All free-living bacteria encode UvrABCD. B. aphidicola encodes Mfd but not UvrABCD.
• UvrABC also found in one archaeon.
• Some functional and evolutionary relationships with drug resistance and transport
Evolution of UvrA Family
[Figure, panel A: ABC transporters. Tree placing UvrA1 and UvrA2 among ABC transporter families (OppDF, UUP, NodI, LivF, XylG, NrtDC, PstB, MDR, HlyB, TAP1, CFTR, SUR).
Panel B: UvrA subfamily. A duplication in the UvrA family separates UvrA2 (UvrA2 S. coelicolor, DrrC S. peucetius, UvrA2 D. radiodurans) from UvrA1 (UvrA from H. influenzae, E. coli, N. gonorrhoeae, R. prowazekii, S. mutans, S. pyogenes, S. pneumoniae, B. subtilis, M. luteus, M. tuberculosis, M. thermoautotrophicum, H. pylori, C. jejuni, P. gingivalis, C. tepidum, D. radiodurans, T. thermophilus, T. pallidum, B. burgdorferi, T. maritima, A. aeolicus, Synechocystis sp.).]
UvrA Evolution
[Figure: diversification of the ABC family. A tandem duplication of an ABC domain (ABC1, ABC2) gives UvrA with N- and C-terminal halves (UvrAN, UvrAC); a later gene duplication gives UvrA1 (UvrA1N, UvrA1C) and UvrA2 (UvrA2N, UvrA2C).]
Base Excision Repair Glycosylases
• Distribution patterns highly uneven but some glycosylases have been found in all species
• Some are ancient enzymes, probably present in LUCA (e.g., MutY-Nth), others more recent (e.g., TagI).
• Many families are distantly related to each other (e.g., Ogg, AlkA, MutY-Nth)
• Many cases of gene duplication, loss and possibly transfer, especially from organellar genomes to nucleus
• Orthologs frequently have different specificity
A. thaliana TAG homologs
[Figure: tree of TAG homologs from A. thaliana (K23L20 1, MBK21.7, F23A5.15, T24D18.7, MTI20 23, F9E10.6) with bacterial homologs from C. crescentus, V. cholerae, H. influenzae, E. coli, M. tuberculosis, N. meningitidis A, and N. meningitidis B.]
AP Endonucleases
• All species encode either Nfo or Xth homologs. Some encode both.
• Only Nfo: mycoplasmas, Aquifex, M. jannaschii, yeast
• Only Xth: many bacteria, A. fulgidus, humans (so far)
• Both: E. coli, B. subtilis, M. tuberculosis, M. thermoautotrophicum
• Both Nfo and Xth are likely ancient.
• Many cases of gene loss of one or the other, but never both
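The claim that one family or the other may be lost, but never both, is easy to check once the distribution is tabulated. The sketch below encodes only the species named on this slide; the names `AP_ENDONUCLEASES`, `classify`, and `lacking_both` are hypothetical, and the "many bacteria" entry is left out because the slide does not list its members.

```python
# Families per genome, copied from the bullets above.
AP_ENDONUCLEASES = {
    "mycoplasmas": {"Nfo"},
    "Aquifex": {"Nfo"},
    "M. jannaschii": {"Nfo"},
    "yeast": {"Nfo"},
    "A. fulgidus": {"Xth"},
    "humans": {"Xth"},
    "E. coli": {"Nfo", "Xth"},
    "B. subtilis": {"Nfo", "Xth"},
    "M. tuberculosis": {"Nfo", "Xth"},
    "M. thermoautotrophicum": {"Nfo", "Xth"},
}


def classify(families):
    """Label a genome by which AP endonuclease families it encodes."""
    if families >= {"Nfo", "Xth"}:
        return "both"
    if "Nfo" in families:
        return "only Nfo"
    if "Xth" in families:
        return "only Xth"
    return "neither"


def lacking_both(table):
    """Genomes that would contradict the 'never both lost' observation."""
    return sorted(g for g, fams in table.items() if classify(fams) == "neither")


for genome in sorted(AP_ENDONUCLEASES):
    print(genome, "->", classify(AP_ENDONUCLEASES[genome]))
print("genomes lacking both:", lacking_both(AP_ENDONUCLEASES))
```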
Recombinational Repair
• RecA homologs found in all free-living species (B. aphidicola encodes RecBCD but not RecA)
• Most recombination initiation pathways are of recent origin
  – RecBCD, RecE within Proteobacteria/Gram-positives
  – RecF within bacteria
  – AddAB within low-GC Gram-positives
  – SbcCD may be of ancient origin (possibly homologous to MRE11/Rad50)
• Resolution pathways are also of somewhat recent origin
  – CCE1 within eukaryotes
  – RuvABC, RecG near origin of bacteria
  – Rus within bacteria (phage origin?)
• Many cases of gene loss in initiation, resolution pathways.
[Figure: phylogenetic tree of sequences from Proteobacteria (scale bar 0.1). The asterisked sequence is MBBAD17TF.]
[Figure: tree of UmuC-family proteins: RumB R391, A05970, ImpB, MucB, UmuCs, RulB, RumB (asterisked), DinP1, DinP2, UvrX, DinP3.]
Big Picture: Evolutionary Origin
Likely Ancient Repair Processes/Proteins
Process | Proteins
Mismatch repair | MutL, MutS
AP endonuclease | Xth, Nfo
Recombinase | RecA/RadA/Rad51
Alkylation reversal | Ogt/MGMT
Photolyase | PhrII, PhrI
dGTP/GTP clean up | MutT
Base excision glycosylases | MutY/Nth, AlkA, Ung?
Recombination endonuclease | SbcC/Rad50, SbcD/MRE11
Other | SMS, Lon, UmuC
Originated within Bacteria
Process | Proteins
Mismatch repair | MutH, Vsr
Alkylation reversal | Ada (fusion of Ogt)
Base excision | Fpg-Nei, TagI
Recombination initiation | RecFJNOR, AddAB, RecBCD, RecET, SbcB
Recombination resolution | RecG, RuvABC, Rus
Nucleotide excision repair | UvrABCD
Transcription-coupled repair | MFD
Induction | LexA
Other | SSB, LigaseI
Originated within Eukaryotes
Process | Proteins
Mismatch repair | duplications of MutS, MutL
Base excision | 3MG?
Recombinase | duplication of RecA
Recombination initiation | duplications of RecQ
Recombination resolution | CCE1
Nucleotide excision repair | Most XPs, TFIIH, etc.
Transcription-coupled repair | CSA, CSB
Induction | P53
Non-homologous end joining | XRCC4, Kus, DNA-PKcs
Other | RFAs, Rad52-59, XRS2
Originated in Eukaryote-Archaea Lineage
Process Proteins
Base excision Ogg
Nucleotide excision repair Rad1, Rad2, Rad25?
Ligation LigaseII
Repair Genes in Archaea
• All species: RecA, MRE11, Rad50, MutY-Nth, Ogt, Rad2, Lig-II, PCNA
• UvrABCD in M. thermoautotrophicum
• PhrI and PhrII in some species
• Variety of glycosylases in some species
• No Ung homologs in any species, but alternative glycosylases have Ung activity
• Rad1 in many species.
• New Holliday junction resolvase
DNA Repair Genes in D. radiodurans Complete Genome
Process | Genes in D. radiodurans
Nucleotide excision repair | UvrABCD, UvrA2
Base excision repair | AlkA, Ung, Ung2, GT, MutM, MutY-Nths, MPG
AP endonuclease | Xth
Mismatch excision repair | MutS, MutL
Recombination: initiation | RecFJNRQ, SbcCD, RecD
Recombination: recombinase | RecA
Recombination: migration and resolution | RuvABC, RecG
Replication | PolA, PolC, PolX, phage Pol
Ligation | DnlJ
dNTP pools, cleanup | MutTs, RRase
Other | LexA, RadA, HepA, UVDE, MutS2
Problem:
The list of DNA repair gene homologs in the D. radiodurans genome is not significantly different from that of other bacterial genomes of similar size.
Unusual Features of D. radiodurans DNA Repair Genes
Process Genes
Nucleotide excision repair Two UvrAs
Base excision repair Four MutY-Nths
Recombination RecD but not RecBC
Replication Four Pol genes
dNTP pools Many MutTs, two RRases
Other UVDE
Gain and Loss of Repair Genes
[Figure: tree of complete genomes grouped as BACTERIA, ARCHAEA, and EUKARYOTES, with repair-gene events marked on the branches. On the slide, + marks a gain and - a loss; the d and t prefixes are kept as shown. The assignment of events to branches is not recoverable from the extracted text.
Genomes: Ecoli, Haein, Neigo, Helpy, Bacsu, Strpy, Mycge, Mycpn, Borbu, Trepa, Synsp, Metjn, Arcfu, Metth, Human, Yeast.
Loss sets (one per branch): -Ogt-RecFRQN-RuvC-Dut-SMS; -PhrI-AlkA-Nfo-Vsr-SbcCD-LexA-UmuC; -PhrI-PhrII-AlkA-Fpg-Nfo-MutLS-RecFORQ-SbcCD-LexA-UmuC-TagI; -PhrI-Ogt-AlkA-Xth-MutLS-RecFJORQN-Mfd-SbcCD-RecG-Dut-PriA-LexA-SMS-MutT; -PhrI-PhrII?-AlkA-Fpg-Nfo-RecO-LexA-UmuC; -PhrI-Ung?-MutLS-RecQ?-Dut-UmuC; -PhrII-Ogg; -Ogt-AlkA-TagI-Nfo-Rec-SbcCD-LexA; -Ogt-AlkA-Nfo-RecQ-SbcD?-Lon-LexA; -AlkA-Xth-Rad25?; -AlkA-Rad25; -Nfo; -Ogt-Ung-Nfo-Dut-Lon; -Ung; -PhrII; -PhrI; -PhrII-RuvC.
Other events: from mitochondria; +Ada+MutH+SbcB; dPhr; +TagI?+Fpg; +UvrABCD+Mfd; +RecFJNOR+RuvABC; +RecG+LigI; +LexA+SSB; +PriA+Dut?; +Rus+UmuD; +Nei?+RecE; tRecT?; +Vsr+RecBCD?; +RFAs+TFIIH; +Rad4,10,14,16,23,26+CSA; +Rad52,53,54+DNA-PK, Ku; dSNF2 dMutS dMutL dRecA; +Rad1+Rad2; +Rad25?+Ogg+LigII; +Ung?+SSB; +Dut?; +PhrI, PhrII+Ogt; +Ung, AlkA, MutY-Nth+AlkA; +Xth, Nfo?+MutLS?; +SbcCD+RecA; +UmuC+MutT; +Lon dMutSI/MutSII; dRecA/SMS dPhrI/PhrII; +Sprt3MG; +Rad7+CCE1; +P53 dRecQ dRad23 +MAG?; tRad25; +TagI?; +RecT; tUvrABCD; tTagI?]
Repair Studies in Different Species (determined by Medline searches as of 1998)
Humans | 7028
E. coli | 3926
S. cerevisiae | 988
Drosophila | 387
B. subtilis | 284
S. pombe | 116
Xenopus | 56
C. elegans | 25
A. thaliana | 20
Methanogens | 16
Haloferax | 5
Giardia | 0
Evolution of Repair Summary
• Mycoplasmas have lost many repair genes, which may explain their high mutation rate.
• Mismatch repair genes absent in many pathogens (is high mutation rate advantageous?)
• Whole pathways frequently lost as units (e.g., MutLS).
• May be able to predict pathway interactions by correlated loss of genes.
• Archaeal genomes have few homologs of bacterial or eukaryotic repair proteins.
• Some eukaryotic repair proteins have likely mitochondrial and plastid ancestry
• Many ancient duplications (MutS, SNF2, UvrC).
• Some unusual distributions (XPB, UvrABCD)
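The idea of predicting pathway interactions from correlated loss can be illustrated by comparing presence/absence profiles between genes. The sketch below is a minimal version under stated assumptions: the five genomes and their profiles are hand-encoded from statements in this talk (MutL and MutS1 lost in the M. tuberculosis and H. pylori lineages; UvrA and RecA present in all free-living bacteria), and `co_lost_pairs` is a hypothetical helper that only flags identical profiles. A real analysis would use many more genomes and account for phylogeny.

```python
from itertools import combinations

# Column order for the profiles below.
GENOMES = ["E. coli", "B. subtilis", "S. pyogenes", "M. tuberculosis", "H. pylori"]

# 1 = homolog present, 0 = absent; hand-encoded from statements in this talk.
PROFILES = {
    "MutS1": (1, 1, 1, 0, 0),
    "MutL": (1, 1, 1, 0, 0),
    "UvrA": (1, 1, 1, 1, 1),
    "RecA": (1, 1, 1, 1, 1),
}


def co_lost_pairs(profiles):
    """Gene pairs with identical profiles that include at least one loss.

    Genes present everywhere are skipped: without any loss, their profiles
    carry no information about whether they are lost together.
    """
    pairs = []
    for a, b in combinations(sorted(profiles), 2):
        if profiles[a] == profiles[b] and 0 in profiles[a]:
            pairs.append((a, b))
    return pairs


print(co_lost_pairs(PROFILES))
```

On this small table the only flagged pair is MutL with MutS1, which matches the slide's example of a pathway lost as a unit.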
Acknowledgements
NIEHS: Ben Van Houten
TIGR: Craig Venter, Claire Fraser, John Heidelberg, Owen White, Steve Salzberg
Stanford: Phil Hanawalt, Rick Myers, D. Crowley
Louisiana State University: John Battista
U.C. Berkeley: Michael Eisen, A. J. Clark
Funding: DOE, OBER; NIH; NSF
Other: J. Laval, F. Taddei, A. Britt, J. Miller
Unusual Distributions
• XP-B like gene in some bacteria and some Archaea.
• LigaseII in M. tuberculosis, B. subtilis, and A. aeolicus
• UvrABCD in M. thermoautotrophicum
• Mycoplasmas and some low-GC Gram-positives do not have any Holliday junction resolving homologs (RuvC, RecG, Rus)
• Mycoplasmas are the only species without MutY-Nth homologs
• MutS2 unevenly distributed among bacteria, Archaea
• Genes in RecF pathway not always present as a unit
• Uracil glycosylase missing from Archaea and some bacteria
Big Picture: Duplication and Loss
Genes Lost in Mycoplasmal Lineage
Process | Protein
Base excision repair | MutY/Nth, AlkA
Recombination initiation | RecF pathway, SbcCD
Recombination resolution | RecG, RuvC
Mismatch repair | MutLS
Transcription-coupled repair | MFD
Induction | LexA
Direct repair | PhrI, Ogt
AP endonuclease | Xth
Other | MutT, Dut, PriA, SMS
Parallel Loss of MutLS
Lost in mycoplasmal lineage (present in B. subtilis and S. pyogenes)
Lost in M. tuberculosis lineage (found in some other high-GC Gram-positives)
Lost in H. pylori lineage (present in many other Proteobacteria)
Possibly lost in Euryarchaeota lineage
Defective in many “wild” E. coli and S. typhimurium strains
Loss of genes may give an advantage in some conditions by increasing mutation rate or recombination rate between species.
Need for Experimental Studies in Archaea
• No novel repair genes cloned in Archaea. All repair genes show homology to repair genes in other species.
• Many novel repair genes found in bacteria and eukaryotes because of experimental work in these species.
• Since novel repair pathways appear to evolve frequently in bacteria and eukaryotes, there is a need for more genetic and experimental studies of repair in Archaea.
Repair Genes in all Archaea
Process Protein
Nucleotide excision repair Rad2, Rad1 ±
Recombination RecA, Mre11, Rad50
Replication PolB, PCNA
Ligase Ligase II
Base excision repair MutY-Nth
dNTP pools MutT family
Alkyltransferase Ogt in all species
DNA Repair Gene Summary
• Most of the standard eukaryotic DNA repair genes are found
• Some likely plastid repair genes are found
• Some duplications relative to other species
Acknowledgements
• Genome duplications: S. Salzberg, J. Heidelberg, O. White, A. Stoltzfus, J. Peterson
• Genome sequences and analysis: J. Heidelberg, T. Read, H. Tettelin, K. Nelson, J. Peterson, R. Fleischmann, D. Bryant
• Horizontal transfers: K. Nelson, W. F. Doolittle
• TIGR: C. Fraser, J. Venter, M-I. Benito, S. Kaul, Seqcore
• $$$: DOE, NSF, NIH, ONR
Evolution of Uracil Glycosylase
• Many non-homologous proteins have uracil-DNA glycosylase activity (Ung, GAPDH, MUG, cyclin)
• Therefore, absence of homologs of these genes should not be used to infer likely absence of activity
• However, presence of homologs of Ung and MUG genes can be used to indicate presence of activity because all homologs of these genes have this activity
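For uracil glycosylase the inference runs in the opposite direction from photoreactivation: presence of an Ung or MUG homolog is informative, absence is not. A minimal sketch of that rule follows; `INFORMATIVE_FAMILIES` and `predict_uracil_glycosylase` are hypothetical names, and the rule is only as good as the slide's premise that all Ung and MUG homologs have the activity.

```python
# Families whose homologs are all stated above to have uracil-DNA glycosylase activity.
INFORMATIVE_FAMILIES = {"Ung", "MUG"}


def predict_uracil_glycosylase(families_found):
    """Predict activity from the gene families detected in a genome."""
    if INFORMATIVE_FAMILIES & set(families_found):
        return "activity likely present"
    # Non-homologous proteins can supply the activity, so a missing homolog
    # does not justify a negative call.
    return "undetermined"


print(predict_uracil_glycosylase(["Ung", "MutY-Nth"]))
print(predict_uracil_glycosylase(["MutY-Nth"]))
```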
Ambiguous Origin
Process Proteins
Base excision 3MG, GT MMR, Ung
Nucleotide excision repair Rad25
Recombination initiation RecQ
Other Dut
Big Picture: Distribution Patterns
Present in All Bacteria
Process Proteins
Nucleotide Excision Repair
Recombinase
Replication PolA,C
Single-strand DNA Binding SSB
Ligase LigaseI
Present in All Free-Living Bacteria
Process Proteins
Nucleotide Excision Repair UvrABCD
Recombinase RecA
Replication PolA,C
Single-strand DNA Binding SSB
Ligase LigaseI
Present in Most Bacteria
Process | Protein
Nucleotide excision repair | UvrABCD
Holliday junction resolution | RuvABC
Recombination | RecA; RecJ, RecG
Replication | PolA,C; PriA; SSB
Ligase | DnlJ
Transcription-coupled repair | Mfd
Base excision repair | Ung, MutY-Nth
AP endonuclease | Xth
Present in Bacteria or Eukaryotes (But Not Both)
Process Bacteria Eukaryotes
Transcription-coupled repair CSB, CSA
Mismatch strand recognition MutH -
Nucleotide excision repair UvrABC XPs, TFIIH, etc.
Recombination initiation RecBCD, RecF KU, DNA-PK
Holliday junction resolution RuvABC CCE1
Base excision -
Inducible responses LexA P53
Evolution of Alkyltransferases
• All known alkyltransferases share a conserved, homologous alkyltransferase domain
• Therefore, if a species does not encode any protein with this domain, it likely does not have alkyltransferase activity
• If a species does encode a member of this gene family, it likely has alkyltransferase activity
Standard Eukaryotic Repair Genes
Pathway Genes
Mismatch Repair MSH2-6, MutLs
Base Excision Repair Ogg, MutY-Nth, Tag, 3MG, Ung
Nucleotide Excision Repair XPA, Rad1, Rad2, Rad3, Rad10, Rad25, etc.
Recombination MRE11, Rad50, Rad51
Direct Repair Phr, Dnl1
Other PCNA, Dut, Lon
Missing Eukaryotic Repair Genes?
Pathway | Genes
Mismatch Repair |
Base Excision Repair |
Nucleotide Excision Repair | XPA, Btf2, Btf3, Kin28
Recombination |
Direct Repair | Ogt
Other |
[Figure: bootstrap tree of the MutS family across bacteria, Archaea, and eukaryotes, including A. thaliana sequences. Labeled subfamilies: MutS2, MSH5, MSH4, MSH1, MSH2, MutS1, MSH3, MSH6.]
[Figure: bootstrap tree of bacterial RecA sequences, including A. thaliana homologs. Labeled groups: Spirochetes, Cyanobacteria, Mycoplasmas, D/T, Green Non-Sulfur, Chlamydia, High GC Gram +, Green Sulfur, Low GC Gram +, Hydrogenobacteria.]
[Figure: bootstrap tree of the SNF2 family. Labeled subfamilies: DDM1, SNF2L, SNF2, CHD3, CHD1, YGPO, SRCAP, ETL1, CSB, ATRX, RAD54, MOT1, RAD16, HEPA1, HARP, HEPA2.]
[Figure: bootstrap tree of the p53 family, with p53 sequences from vertebrates, squid, and fly, and the related p51, p63, and p73 sequences.]
Mitochondrial Repair Genes
Pathway | Genes
Mismatch Repair |
Base Excision Repair | TagI
Nucleotide Excision Repair |
Recombination | RecA?
Direct Repair | Phr
Other | Lon, SSB, AlkB
Plastid Repair Genes
Pathway | Genes
Mismatch Repair | MutS2a, MutS2b
Base Excision Repair | Fpg
Nucleotide Excision Repair | Mfd
Recombination | RecA, RecG, SMS
Direct Repair | Phr
Other | Lon, PolI, SSB?