Evolution of bacterial regulatory systems
Mikhail Gelfand
Research and Training Center “Bioinformatics”
Institute for Information Transmission Problems
Moscow, Russia
January 2008
Plan
• Individual sites
• Transcription factors and their binding signals
• Regulatory systems and regulons
Birth and death of sites is a very dynamic process
NadR-binding sites upstream of pnuB seem absent in Klebsiella pneumoniae and Serratia marcescens
… but there are candidate sites further upstream …
… and they are clearly different (not simply misaligned).
Cryptic sites and loss of regulators
Loss of RbsR in Y. pestis (the ABC transporter is also lost)
Start codon of rbsD
RbsR binding site
Unexpected conservation of non-consensus positions in orthologous sites
Regulatory site of LexA upstream of lexA (consensus nucleotides are in caps)
Escherichia coli TgCTGTATATActcACAGcA
Salmonella typhi aACTGTATATActcACAGcA
Yersinia pestis agCTGTATATActcACAGcA
Haemophilus influenzae atCTGTATAcAatacCAGTt
Pasteurella multocida TtCTGTATATAataACAGTt
Vibrio cholerae cACTGgATATActcACAGTc
wrong consensus?
TF PurR, gene purL
Escherichia coli ACGCAAACGgTTtCGT
Salmonella typhi ACGCAAACGgTTtCGT
Yersinia pestis ACGCAAACGgTTtCGT
Haemophilus influenzae AtGCAAACGTTTGCtT
Pasteurella multocida ACGCAAACGTTTtCGT
Vibrio cholerae ACGCAAACGgTTGCtT
TF PurR, gene purM
Escherichia coli tCGCAAACGTTTGCtT
Salmonella typhi tCGCAAACGTTTGCtT
Yersinia pestis tCGCAAACGTTTGCcT
Haemophilus influenzae tCGCAAACGTTTGCtT
Pasteurella multocida tCGCAAACGTTTGCtT
Vibrio cholerae ACGCAAACGTTTtCcT
Non-consensus positions are more conserved than synonymous codon positions
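The conservation of non-consensus positions can be checked directly on the LexA alignment from the earlier slide. The following is a minimal sketch, not the method used in the talk: the function name and the 50% threshold are assumptions chosen for illustration. It reports alignment columns where the same lowercase (non-consensus) nucleotide recurs in at least half of the orthologous sites.

```python
# LexA sites upstream of lexA, copied from the slide (caps = consensus nucleotides).
LEXA_SITES = {
    "Escherichia coli": "TgCTGTATATActcACAGcA",
    "Salmonella typhi": "aACTGTATATActcACAGcA",
    "Yersinia pestis": "agCTGTATATActcACAGcA",
    "Haemophilus influenzae": "atCTGTATAcAatacCAGTt",
    "Pasteurella multocida": "TtCTGTATATAataACAGTt",
    "Vibrio cholerae": "cACTGgATATActcACAGTc",
}


def conserved_nonconsensus_columns(sites, min_fraction=0.5):
    """Return (column, base, count) for every lowercase base that occurs
    in at least min_fraction of the aligned sites (threshold is an assumption)."""
    rows = list(sites.values())
    hits = []
    for col in range(len(rows[0])):
        lowercase = [row[col] for row in rows if row[col].islower()]
        for base in sorted(set(lowercase)):
            count = lowercase.count(base)
            if count / len(rows) >= min_fraction:
                hits.append((col, base, count))
    return hits


for col, base, count in conserved_nonconsensus_columns(LEXA_SITES):
    print(f"column {col}: non-consensus '{base}' in {count} of {len(LEXA_SITES)} sites")
```

Columns are 0-based. The central "ctc" block stands out: these positions deviate from the consensus yet are shared by most of the genomes.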
Regulators and their motifs
• Cases of motif conservation at surprisingly large distances
• Subtle changes at close evolutionary distances
• Correlation between contacting nucleotides and amino acid residues
• Changes in symmetry patterns
NrdR (regulator of ribonucleotide reductases and some other replication-related genes): conservation at large distances
DNA motifs and protein-DNA interactions
CRP PurR
IHF TrpR
Entropy at aligned sites and the number of contacts (heavy atoms in a base pair at a distance <cutoff from a protein atom)
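The entropy half of this comparison is easy to reproduce. The sketch below computes the Shannon entropy of each column for the PurR sites upstream of purL shown earlier. The function name is an assumption, and the contact counts are not computed here because they require a protein-DNA structure that is not part of this transcript.

```python
import math
from collections import Counter

# PurR sites upstream of purL, copied from the slide
# (E. coli, S. typhi, Y. pestis, H. influenzae, P. multocida, V. cholerae).
PURR_PURL_SITES = [
    "ACGCAAACGgTTtCGT",
    "ACGCAAACGgTTtCGT",
    "ACGCAAACGgTTtCGT",
    "AtGCAAACGTTTGCtT",
    "ACGCAAACGTTTtCGT",
    "ACGCAAACGgTTGCtT",
]


def column_entropies(sites):
    """Shannon entropy in bits for each alignment column; letter case is ignored."""
    rows = [site.upper() for site in sites]
    entropies = []
    for column in zip(*rows):
        n = len(column)
        counts = Counter(column)
        entropies.append(sum((c / n) * math.log2(n / c) for c in counts.values()))
    return entropies


for position, h in enumerate(column_entropies(PURR_PURL_SITES)):
    print(f"{position:2d}  {h:.3f}")
```

A column with entropy 0 is invariant across the six genomes. Higher values mark positions that tolerate substitutions, which the slide relates to a smaller number of protein contacts.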
The LacI family: subtle changes in motifs at close distances
[Figure: binding motifs]
Specificity-determining positions in the LacI family
Training set: 459 sequences, average length: 338 amino acids, 85 specificity groups
10 residues contact NPF (analog of the effector)
6 residues in the intersubunit contacts
7 residues contact the operator sequence
7 residues in the effector contact zone (5 Å < dmin < 10 Å)
5 residues in the intersubunit contact zone (5 Å < dmin < 10 Å)
6 residues in the operator contact zone (5 Å < dmin < 10 Å)
– 44 SDPs
LacI from E.coli
The CRP/FNR family of regulators
FNR
HcpR
CooA
Gamma
Desulfovibrio
Desulfovibrio
TGTCGGCnnGCCGACA
TTGTgAnnnnnnTcACAA
TTGTGAnnnnnnTCACAA
TTGATnnnnATCAA
Correlation between contacting nucleotides and amino acid residues
• CooA in Desulfovibrio spp.
• CRP in Gamma-proteobacteria
• HcpR in Desulfovibrio spp.
• FNR in Gamma-proteobacteria
DD COOA ALTTEQLSLHMGATRQTVSTLLNNLVR
DV COOA ELTMEQLAGLVGTTRQTASTLLNDMIR
EC CRP  KITRQEIGQIVGCSRETVGRILKMLED
YP CRP  KXTRQEIGQIVGCSRETVGRILKMLED
VC CRP  KITRQEIGQIVGCSRETVGRILKMLEE
DD HCPR DVSKSLLAGVLGTARETLSRALAKLVE
DV HCPR DVTKGLLAGLLGTARETLSRCLSRMVE
EC FNR  TMTRGDIGNYLGLTVETISRLLGRFQK
YP FNR  TMTRGDIGNYLGLTVETISRLLGRFQK
VC FNR  TMTRGDIGNYLGLTVETISRLLGRFQK
TGTCGGCnnGCCGACA
TTGTgAnnnnnnTcACAA
TTGTGAnnnnnnTCACAA
TTGATnnnnATCAA
Contacting residues: REnnnR
TG: 1st arginine
GA: glutamate and 2nd arginine
The correlation holds for other factors in the family
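The REnnnR pattern can be read off the helix fragments in the alignment above. This is a rough sketch under two assumptions: the fragments are the 27-residue sequences as parsed from the slide (two-letter genome code, then factor name), and the three contacting residues sit at 0-based offsets 14, 15 and 19, which is where R, E and R fall in the CRP rows.

```python
# Helix fragments as read from the slide: (genome code, factor) -> sequence.
HELICES = {
    ("DD", "CooA"): "ALTTEQLSLHMGATRQTVSTLLNNLVR",
    ("DV", "CooA"): "ELTMEQLAGLVGTTRQTASTLLNDMIR",
    ("EC", "CRP"): "KITRQEIGQIVGCSRETVGRILKMLED",
    ("YP", "CRP"): "KXTRQEIGQIVGCSRETVGRILKMLED",
    ("VC", "CRP"): "KITRQEIGQIVGCSRETVGRILKMLEE",
    ("DD", "HcpR"): "DVSKSLLAGVLGTARETLSRALAKLVE",
    ("DV", "HcpR"): "DVTKGLLAGLLGTARETLSRCLSRMVE",
    ("EC", "FNR"): "TMTRGDIGNYLGLTVETISRLLGRFQK",
    ("YP", "FNR"): "TMTRGDIGNYLGLTVETISRLLGRFQK",
    ("VC", "FNR"): "TMTRGDIGNYLGLTVETISRLLGRFQK",
}

# Assumption: 0-based offsets of the R, E, R positions of the REnnnR pattern,
# inferred from the CRP rows rather than taken from a structure.
CONTACT_POSITIONS = (14, 15, 19)


def contact_residues(fragment, positions=CONTACT_POSITIONS):
    """Return the residues found at the assumed DNA-contacting positions."""
    return "".join(fragment[i] for i in positions)


def residues_by_factor(helices):
    """Group the contacting-residue triplets by transcription factor."""
    grouped = {}
    for (_, factor), fragment in helices.items():
        grouped.setdefault(factor, set()).add(contact_residues(fragment))
    return grouped


for factor, triplets in residues_by_factor(HELICES).items():
    print(factor, sorted(triplets))
```

Within each factor the triplet is identical across genomes, and it differs between factors that recognize different motifs (CooA and FNR versus CRP and HcpR), which is the kind of residue-to-nucleotide correlation the slide describes.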
NrtR (regulator of NAD metabolism): systematic search for correlated positions
• analysis of correlated positions in proteins and sites
• analysis of specificity determining positions
• the same positions in one alpha-helix identified
• plans for experimental verification
NiaR: changed dimer structure?
The GalR family and C-proteins of RM-systems: direct and inverted repeats
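Whether a motif is an inverted repeat can be tested by comparing it with its own reverse complement. The sketch below applies this to the CRP/FNR-family motifs listed earlier in the deck. The helper names are assumptions, and the direct-repeat example is a made-up string used only as a negative control.

```python
# 'N' stands for any nucleotide and is its own complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(motif):
    """Reverse complement of a motif; case is ignored, 'n' is a wildcard."""
    return "".join(COMPLEMENT[base] for base in reversed(motif.upper()))


def is_inverted_repeat(motif):
    """True if the motif reads the same on both DNA strands (a palindrome)."""
    return motif.upper() == reverse_complement(motif)


# Motifs copied from the CRP/FNR family slide.
FAMILY_MOTIFS = [
    "TGTCGGCnnGCCGACA",
    "TTGTgAnnnnnnTcACAA",
    "TTGTGAnnnnnnTCACAA",
    "TTGATnnnnATCAA",
]

for motif in FAMILY_MOTIFS:
    print(motif, is_inverted_repeat(motif))

# Hypothetical direct repeat (not from the slides), used as a negative control.
print("TGTAnnTGTA", is_inverted_repeat("TGTAnnTGTA"))
```

A direct repeat fails the test because its two half-sites have the same orientation, so the reverse complement no longer matches the original string.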
BirA: changed spacing
What are the events leading to the present-day state?
• Expansion and contraction of regulons
• New regulators (where from?)
• Duplications of regulators with or without regulated loci
• Loss of regulators with or without regulated loci
• Re-assortment of regulators and structural genes
• … especially in complex systems
• Horizontal transfer
Trehalose/maltose catabolism in alpha-proteobacteria
Duplicated LacI-family regulators: lineage-specific post-duplication loss
The binding motifs are very similar (the blue branch is somewhat different: to avoid cross-recognition?)
Utilization of an unknown galactoside in gamma-proteobacteria
Loss of regulator and merger of regulons: it seems that LacI-X was present in the common ancestor (Klebsiella is an outgroup)
Yersinia and Klebsiella: two regulons, GalR and LacI-X
Erwinia: one regulon, GalR
Utilization of maltose/maltodextrin in Firmicutes
Displacement: invasion of a regulator from a different subfamily (horizontal transfer from a related species?) – blue sites
Orthologous TFs with completely different regulons (alpha-proteobacteria and Xanthomonadales)
Catabolism of gluconate in proteobacteria
Extreme variability of the regulation of “marginal” regulon members
γ
Pseudomonas spp.
β
Regulation of amino acid biosynthesis in Firmicutes
• Interplay between regulatory RNA elements and transcription factors
• Expansion of T-box systems (normally – RNA structures regulating aminoacyl-tRNA-synthetases)
Three regulatory systems for the methionine biosynthesis
A. SAM-dependent riboswitch
B. Met-T-box
C. MtaR: repressor of transcription
MtaR
Methionine regulatory systems: loss of S-box regulons
• S-boxes (SAM-1 riboswitch)
  – Bacillales
  – Clostridiales
  – the Zoo:
    • Petrotoga
    • actinobacteria (Streptomyces, Thermobifida)
    • Chlorobium, Chloroflexus, Cytophaga
    • Fusobacterium
    • Deinococcus
    • proteobacteria (Xanthomonas, Geobacter)
• Met-T-boxes (Met-tRNA-dependent attenuator) + SAM-2 riboswitch for metK
  – Lactobacillales
• MET-boxes (candidate transcription signal)
  – Streptococcales
Lact. Strep. Bac. Clostr.
ZOO
Recent duplications and bursts: Arg-T-box in Clostridium difficile
LJ_ARGS
LME_ARGS
LR_ARGS
LP_ARGS
CBE_ARGS
CPE_ARGS
CB_ARGS
CTC_ARGS
CAC_ARGS
CDF_YQIXYZ
RDF02391
CDF_ARGC
CDF_ARGH
BC_ARGS2
EF_ARGS
BH_ARGS
LSA_ARGS
PPE_ARGS
LGA_ARGS
Bacillales
argS
yqiXYZ
RDF02391
argCJBDF
predicted amino acid transporters
NEW
argG
argH
Clostridium difficile
amino acid biosynthetic genes
: ARG-specific T-box regulatory site
aminoacyl-tRNA synthetase
biosynthetic genes
amino acid transporters
NEW
Lactobacillales Clostridiales
argS argS
others
… following transcription factor loss
Expansion of T-box regulon
regulation of expression of arginine biosynthetic and transport genes by T-box antitermination
: ARG-specific T-box regulatory site
Binding to the 5’ UTR gene region → regulation of gene expression
Other clostridia spp. (CA, CTC, CTH, CPE, CB, CPE)
yqiXYZ
argC
argH
yqiXYZ
argC
argG
argH
AhrC regulatory protein (negative regulation of arginine metabolism, positive regulation of arginine catabolism)
...AhrC site
: AhrC binding site
Gram+ bacteria:
Clostridium difficile:
AhrC is lost
5’
Regulon expansion, or how FruR has become CRA
• CRA (a.k.a. FruR) in Escherichia coli:
  – global regulator
  – well-studied in experiment (many regulated genes known)
• Going back in time: looking for candidate CRA/FruR sites upstream of (orthologs of) genes known to be regulated in E. coli
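Looking for candidate sites upstream of orthologous genes is commonly done with a position weight matrix (PWM). The sketch below is a generic illustration, not the software used for this analysis. Since no CRA/FruR sites are listed in this transcript, the matrix is trained on the PurR sites from the earlier slide. The upstream sequence is synthetic, and the pseudocount, uniform background and threshold are all assumptions.

```python
import math

# Training sites: PurR sites upstream of purL from the earlier slide,
# used only as a stand-in because no CRA/FruR sites appear in the transcript.
TRAINING_SITES = [
    "ACGCAAACGgTTtCGT", "ACGCAAACGgTTtCGT", "ACGCAAACGgTTtCGT",
    "AtGCAAACGTTTGCtT", "ACGCAAACGTTTtCGT", "ACGCAAACGgTTGCtT",
]


def build_pwm(sites, pseudocount=0.5):
    """Log-odds matrix (bits) against a uniform background; case is ignored."""
    rows = [site.upper() for site in sites]
    total = len(rows) + 4 * pseudocount
    pwm = []
    for column in zip(*rows):
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / 0.25)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold=0.0):
    """Return (start, score, window) for every window scoring >= threshold."""
    seq = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if set(window) - set("ACGT"):
            continue  # skip windows with ambiguous characters
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score, window))
    return hits


# Synthetic upstream region (hypothetical): one training-like site with flanks.
UPSTREAM = "GGATCC" + "ACGCAAACGTTTTCGT" + "AAGCTT"

pwm = build_pwm(TRAINING_SITES)
best = max(scan(UPSTREAM, pwm), key=lambda hit: hit[1])
print(best)
```

In a comparative setting the same scan would be run on the upstream regions of orthologs in each genome. A site retained across several genomes is a stronger candidate than an isolated high-scoring window.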
Common ancestor of gamma-proteobacteria
[Figure labels. Genes: icdA, aceA, aceB, aceEF, pckA, ppsA, pykF, adhE, gpmA, pgk, tpiA, gapA, pfkA, fbp, fruK, fruBA, eda, edd, epd, ptsHI-crr, manXYZ, mtlD, mtlA. Substrates: Fructose, Glucose, Mannose, Mannitol. Legend: Gamma-proteobacteria]
Common ancestor of the Enterobacteriales
[Figure: same gene and substrate labels as in the preceding figure. Legend: Gamma-proteobacteria, Enterobacteriales]
Common ancestor of Escherichia and Salmonella
[Figure: same gene and substrate labels as in the preceding figure. Legend: Gamma-proteobacteria, Enterobacteriales, E. coli and Salmonella spp.]
Life without Fur
Regulation of iron homeostasis (the Escherichia coli paradigm)
Iron:
• essential cofactor (limiting in many environments)
• dangerous at large concentrations
FUR (responds to iron):
• synthesis of siderophores
• transport (siderophores, heme, Fe2+, Fe3+)
• storage
• iron-dependent enzymes
• synthesis of heme
• synthesis of Fe-S clusters
Similar in Bacillus subtilis
Regulation of iron homeostasis in α-proteobacteria
Experimental studies:
• FUR/MUR: Bradyrhizobium, Rhizobium and Sinorhizobium
• RirA (Rrf2 family): Rhizobium and Sinorhizobium
• Irr (FUR family): Bradyrhizobium, Rhizobium and Brucella
[Figure: scheme of iron-responsive regulation. Transcription factors: RirA, Irr, Fur, IscR. Signal labels: FeS, heme, Fe, "degraded", "FeS status of cell". Regulated functions: iron uptake systems (siderophore uptake, Fe2+/Fe3+ uptake), iron storage ferritins, FeS synthesis, heme synthesis, iron-requiring enzymes [iron cofactor]. Regulation states are marked [+Fe] and [- Fe].]
Distribution of transcription factors in genomes
Search for candidate motifs and binding sites using standard comparative genomic techniques
Regulation of genes in functional subsystems
Rhizobiales
Bradyrhizobiaceae
Rhodobacteriales
The Zoo (likely ancestral state)
Reconstruction of history
Appearance of the iron-Rhodo motif
Frequent co-regulation with Irr
Strict division of function with Irr
All logos and Some Very Tempting Hypotheses:
1. Cross-recognition of FUR and IscR motifs in the ancestor.
2. When FUR had become MUR, and IscR had been lost in Rhizobiales, emerging RirA (from the Rrf2 family, with a rather different general consensus) took over their sites.
3. Iron-Rhodo boxes are recognized by IscR: directly testable
Summary and open problems
• Regulatory systems are very flexible
  – easily lost
  – easily expanded (in particular, by duplication)
  – may change specificity
  – rapid turnover of regulatory sites
• With more stories like these, we can start thinking about a general theory
  – catalog of elementary events; how frequent?
  – mechanisms (duplication, birth e.g. from enzymes, horizontal transfer)
  – conserved (regulon cores) and non-conserved (marginal regulon members) genes in relation to metabolic and functional subsystems/roles
  – (TF family-specific) protein-DNA recognition code
  – distribution of TF families in genomes; distribution of regulon sizes; etc.
People
• Andrei A. Mironov – software, algorithms
• Alexandra Rakhmaninova – SDP, protein-DNA correlations

• Anna Gerasimova (now at U. Michigan) – NadR
• Olga Kalinina (on loan to EMBL) – SDP
• Yuri Korostelev – protein-DNA correlations
• Ekaterina Kotelnikova (now at Ariadne Genomics) – evolution of sites
• Olga Laikova – LacI
• Dmitry Ravcheev – CRA/FruR
• Dmitry Rodionov (on loan to Burnham Institute) – iron etc.
• Alexei Vitreschak – T-boxes and riboswitches

• Andy Jonson (U. of East Anglia) – experimental validation (iron)
• Leonid Mirny (MIT) – protein-DNA, SDP
• Andrei Osterman (Burnham Institute) – experimental validation

• Howard Hughes Medical Institute
• Russian Foundation of Basic Research
• Russian Academy of Sciences, program “Molecular and Cellular Biology”
• INTAS