Genetics of Bacterial Genomes
http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Symplectic biology: Universals in microbial genomes

24 April 2006


Symplectic biology: The Delphic Boat

Biology is a science of relationships between objects rather than of objects: from συν, together, and πλεκτειν, to weave
Proteins are part of complexes, as are parts in an engine
As for constructing a boat, failing to understand their relationships will result in ultimate failure of synthetic objects

The Delphic Boat: Harvard University Press, February 2003


What is Life?

Three processes are needed for Life:

Information transfer (Living Computers?) => the goal of genomics is to decipher the blueprint of the “read-only” memory of the machine

Driving force for a coupling between the genome structure and the structure of the cell:

Metabolism
Compartmentalisation


What is computing?

Two processes are needed for computing:

A read/write machine

A program on a physical support (typically, a tape illustrates the sequential string of symbols that makes up the programme), split (in practice) into two entities:

Programme (providing the goal)
Data (providing the context)

The machine is distinct from the programme


Cells as computers

Genomics rests on an alphabetic metaphor, that of a text written with a four-letter alphabet, acting as a programme

Conjecture: do cells behave as computers?

Genetic engineering
Viruses
Horizontal gene transfer
Cloning animal cells

all point to separation between
Machine
Data + Programme


If the machine has not only to behave as a computer but has also to construct the machine itself, one must find an image of the machine somewhere in the machine (John von Neumann)

Is there a map of the cell in the chromosome?


Genome organisation

Is the gene order random in the chromosomes?

At first sight, consistent with different DNA management processes, not much is conserved, and genes transferred from other organisms are distributed throughout genomes

However, groups of genes such as operons or pathogenicity islands tend to cluster in specific places, and they code for proteins with common functions. « Persistent » genes are clustered together

Also, some motifs are ubiquitously present, suggesting general rules constraining genome organisation

E Larsabal, A Danchin
Genomes are covered with ubiquitous 11bp periodic patterns, the "class A flexible patterns"
BMC Bioinformatics (2005) 6: 206
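One way to look for a periodic pattern of roughly 11 bp is to count how often a short motif recurs at each distance along the sequence. The sketch below is only an illustration under assumptions: the function name `spacing_counts`, the default motif and the toy sequence are hypothetical, and this is not the method of the cited paper.

```python
def spacing_counts(seq, motif="AA", max_dist=100):
    """Count pairs of motif occurrences separated by each distance d (in bp).

    A periodic signal shows up as peaks at multiples of the period.
    """
    seq = seq.upper()
    k = len(motif)
    # Start positions of every occurrence of the motif (overlaps allowed).
    hits = {i for i in range(len(seq) - k + 1) if seq[i:i + k] == motif}
    counts = {}
    for d in range(1, max_dist + 1):
        # Number of occurrences that have another occurrence d bp downstream.
        counts[d] = sum(1 for i in hits if i + d in hits)
    return counts


# Toy sequence (hypothetical): "AA" placed every 11 bp, twenty times.
toy = ("AA" + "C" * 9) * 20
toy_counts = spacing_counts(toy, "AA", max_dist=30)
print(toy_counts[11], toy_counts[22], toy_counts[5])
```

On a real genome the counts would need to be compared with those expected from base composition alone, as the "real", "model" and "difference" curves of such an analysis suggest.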


A universal feature of the program: the period of 10-11.5

[Figure: Helicobacter pylori, motifs; curves labelled "real", "model" and "difference"; x-axis 0 to 100 bp; y-axis label fs(G-), ticks from -0.01 to 0.01]


Flexible motifs of type A

methods

1- xAxxxxTxxxxAxxxxTTxxxxxAxxxxTxxxxAxxx : All kingdoms
2- xxxxxxxxxxxGxxxxTTxxxCxxxxxTxxxxxxxxx : Proteobacteria
4- xxxxxxTxxxxAGxxxTTxxxxxxxxTxxxxxxxxxx : Archaea
5'- xxx-10xxxxxxxxx0xxxxxxxx10xxxxxx bp -3'

[Figure: a class A flexible pattern (TTxxxGxxxTxxxxxxxxxxTT) shown from two sides]

The nucleotides composing this class A flexible pattern are fully accessible through this side and the dinucleotides are set in major grooves.

The nucleotides composing this class A flexible pattern are accessible through this side too but the dinucleotides are set in minor grooves.


To lead or to lag...

Is it possible to see whether the position of genes in the chromosome is randomly distributed on the leading and lagging strand?


[Figure: circular chromosome maps from Ori to Ter (0, 90, 180, 270), showing CDS density and leading CDS density]

Escherichia coli: 55% leading
Treponema pallidum: 65% leading
Bacillus subtilis: 75% leading
Thermoanaerobacter tengcongensis: 87% leading
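The "% leading" figures can be computed from an annotation once the origin and terminus are known. The sketch below is a minimal illustration under stated assumptions: a circular chromosome, bidirectional replication from Ori to Ter, and "+" strand genes transcribed in the direction of increasing coordinates. The function names and the toy gene list are hypothetical.

```python
def on_leading_strand(start, strand, ori, ter, length):
    """True if a gene is co-oriented with the replication fork that copies it."""
    # Position of the gene and of Ter, measured clockwise from Ori.
    offset = (start - ori) % length
    ter_offset = (ter - ori) % length
    # On the clockwise replichore the fork moves towards increasing coordinates,
    # so "+" genes are leading; on the other replichore "-" genes are leading.
    clockwise_replichore = offset < ter_offset
    return (strand == "+") == clockwise_replichore


def leading_fraction(genes, ori, ter, length):
    """Fraction of genes located on the leading strand."""
    leading = sum(on_leading_strand(s, st, ori, ter, length) for s, st in genes)
    return leading / len(genes)


# Toy annotation (hypothetical): (start coordinate, strand) on a 1000 bp circle.
toy_genes = [(100, "+"), (200, "-"), (600, "-"), (900, "-")]
toy_fraction = leading_fraction(toy_genes, ori=0, ter=500, length=1000)
print(toy_fraction)
```

Applied to a full annotation with the experimentally known Ori and Ter, the same ratio would give values comparable to the percentages listed for each species.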


To lag or to lead...

Choosing arbitrarily an origin of replication and a property of the strand (base composition, codon composition, codon usage, amino acid composition of the coded protein…), one can use discriminant analysis to see whether the hypothesis holds.

E. Rocha, A. Danchin & A. Viari
Universal replication biases in bacteria.
Mol. Microbiol. (1999) 32: 11-16
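The idea of testing a candidate origin can be sketched as follows: label each gene as leading or lagging relative to that origin, describe it by a composition vector, and measure how well the two classes can be told apart. The code below uses a nearest-centroid rule as a simplified stand-in for discriminant analysis; it is not the procedure of the cited paper, and the function names and toy feature values are hypothetical.

```python
def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def discrimination_accuracy(features, labels):
    """Fraction of genes that lie closer to the centroid of their own class.

    `labels` holds True for genes called leading under the candidate origin.
    This is accuracy on the same data used to build the centroids; a careful
    analysis would use held-out genes.
    """
    group_a = [f for f, lab in zip(features, labels) if lab]
    group_b = [f for f, lab in zip(features, labels) if not lab]
    ca, cb = centroid(group_a), centroid(group_b)
    correct = sum(
        (sq_dist(f, ca) < sq_dist(f, cb)) == bool(lab)
        for f, lab in zip(features, labels)
    )
    return correct / len(features)


# Toy data (hypothetical): two composition features per gene.
toy_features = [[0.30, 0.20], [0.31, 0.19], [0.20, 0.30], [0.19, 0.31]]
toy_labels = [True, True, False, False]
toy_accuracy = discrimination_accuracy(toy_features, toy_labels)
print(toy_accuracy)
```

Repeating the calculation for origins placed at successive positions around the chromosome would give an accuracy-versus-position curve; a value near 0.5 means the chosen property does not separate the two strands for that origin.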


To lag or to lead, that is the question

[Figure: accuracy versus position (%), one panel each for Bacillus subtilis, Borrelia burgdorferi, Chlamydia trachomatis, Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Methanobacterium thermoautotrophicum, Mycobacterium tuberculosis and Treponema pallidum; curves for bases, amino acids, codons and dinucleotides]


Visible even in proteins…


Essentiality in B. subtilis

[Figure: bar chart (0% to 100%) of leading and lagging strand genes, for essential genes and non-essential genes, each split into highly expressed and non-highly expressed]

EPC Rocha, A Danchin
Essentiality, not expressiveness, drives gene-strand bias in bacteria
Nature Genetics (2003) 34: 377-378


When polymerases collide

Co-oriented: DNAP deceleration; end of transcription
Head-on: arrest of RNAP & DNAP; transcription abortion

Consequences:
1. Replication slow-down
2. Loss of transcripts

Consequences:
1. Aborted transcripts
2. Truncated essential proteins


Three examples of the role of the context

Microbial genes are of infinite diversity, but there exist universals; only about 10% of their genes are of persistent and recognized function; we do not yet have a fair idea of the number of microbial species; the number of genes in a given species is highly variable (horizontal gene transfer)

Example 1: persistent genes
Example 2: orphan genes and universal amino acids
[Example 3: a new metabolic pathway] ....


An extension of essentiality: Gene persistence


Gene persistence

Some of the genes missing from the list of persistent genes have diverged considerably. To assess the contribution of this effect we measured, for each pair of genomes, the correlation between the similarity of orthologous pairs and that of the 16S rRNA. The correlations were high. For example (A), 38% (resp. 48%) of B. subtilis (resp. E. coli) persistent genes showed a correlation coefficient >0.9 between the sequence similarity of the pair of orthologs and that of the 16S rRNA.

In contrast, some genes (B) evolve in an erratic way. This may be due to horizontal gene transfer, local adaptations leading to a faster or slower evolutionary pace, or simply wrong assignments of orthology. The latter can be a significant problem, especially in large protein families. The genes presenting such an erratic pattern are rare in the persistent set.

G Fang, EPC Rocha, A Danchin
How essential are non-essential genes?
Mol Biol Evol (2005) 22: 2147-2156
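The comparison described on this slide amounts to a correlation, across genome pairs, between the similarity of a gene's orthologs and the similarity of the 16S rRNA. A minimal sketch, assuming a plain Pearson coefficient (the slide does not say which coefficient was used); the function name and the toy similarity values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy values (hypothetical): one entry per genome pair.
rrna_similarity = [0.98, 0.92, 0.85, 0.80, 0.75]
ortholog_similarity = [0.90, 0.78, 0.66, 0.55, 0.50]
r = pearson(rrna_similarity, ortholog_similarity)
print(round(r, 3))
```

A gene whose coefficient stays above 0.9 would fall in the well-behaved group (A); a low or unstable coefficient would correspond to the erratic pattern (B).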


Genomic islands

A clustering method based on the analysis of codon usage biases, using information theory, groups the genes into homogeneous clusters, which are not distributed randomly in the chromosome. One cluster corresponds to highly expressed genes. Other clusters are linked to specific functions or processes: horizontally transferred genes, motility or intermediary metabolism.

M. Bailly-Béchet

M Bailly-Béchet, A Danchin, M Iqbal, M Marsili, M Vergassola
Codon usage domains over bacterial chromosomes
PLoS Computational Biology (2006) 2: april 20th
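To make the idea of an information-theoretic comparison of codon usage concrete, the sketch below computes the codon frequencies of two coding sequences and their Jensen-Shannon divergence, which could serve as a distance for clustering. This is an assumption chosen for illustration, not the clustering method of the cited paper; the function names and toy sequences are hypothetical.

```python
from collections import Counter
import math


def codon_freqs(cds):
    """Relative frequency of each codon in a coding sequence (frame 0)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between two codon distributions."""
    keys = set(p) | set(q)
    # Mixture distribution of p and q.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(a):
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Toy sequences (hypothetical): same amino acids, different synonymous codons.
gene_1 = codon_freqs("GCTGCTGCTAAA")
gene_2 = codon_freqs("GCCGCCGCCAAG")
print(js_divergence(gene_1, gene_1), js_divergence(gene_1, gene_2))
```

The divergence is 0 for identical usage and 1 bit when two genes share no codon, so genes with similar bias end up close together under this distance.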


Genome islands

One cluster is related to expression levels. Other groups feature an over-representation of genes belonging to different functional groups: horizontally transferred genes, motility and intermediary metabolism. Genes with a similar bias are close on the chromosome and organized in coherent domains, more extended than operons, demonstrating a role of translation in structuring bacterial chromosomes. A sizeable contribution to this effect comes from the dynamic compartmentalization induced by the recycling of tRNAs, leading to gene expression rates dependent on their genomic and expression context


Genome organization

[Figure: P. haloplanktis genome; origins]


P. haloplanktis: Correspondence Analysis

[Figure: Pseudoalteromonas haloplanktis (cold); groups labelled IIMPs, Intermediary metabolism, Outer membrane or secreted, Information transfer, Unknown function / Phage proteins]


Universal biases in amino acid composition

First axis: separates Integral Inner Membrane Proteins (IIMP) from the rest; driven by opposition between charged and large hydrophobic residues

Second axis: separates proteins according to an opposition driven by the G+C content of the first codon base

Third axis: separates proteins by their content in aromatic amino acids; enriched in orphan proteins


Temperature-dependent biases in protein amino acid composition

The general trend of amino acid composition bias is to avoid some amino acids at higher temperatures (associated with aging processes)

Mesophilic bacteria belong to at least two different classes (in a five-cluster analysis)

Biases are always dominated by the IIMP clustering

C Médigue, E Krin, G Pascal, V Barbe, A Bernsel, PN Bertin, F Cheung, S Cruveiller, S D'Amico, A Duilio, G Fang, G Feller, C Ho, S Mangenot, G Marino, J Nilsson, E Parrilli, EPC Rocha, Z Rouy, A Sekowska, ML Tutino, D Vallenet, G von Heijne, A Danchin
Coping with cold: the genome of the versatile marine Antarctica bacterium Pseudoalteromonas haloplanktis TAC125
Genome Research (2005) 15: 1325-1335


A specific asparagine bias in psychrophiles

Comparative proteomics

[Figure: labels IIMPs; 53% mesophiles; 62% thermophiles; 55% psychrophiles; Motility; Cell wall, outer membrane; Transport (TonB), secretion; Adaptation to stress; Metabolism of DNA and RNA; isoaspartate]


Asparagine deamidates: a major contribution to protein aging

Chemistry

Main post-translational modification

Reaction still poorly understood

Spontaneous reaction (untargeted?)

Affects the protein structure (and function?)

Role in regulating protein folding

Signal for degradation of intracellular proteins

[Scheme: Asparagine (N); Intermediary: succinimide; Degradation: succinimide; Isoaspartate; Aspartate]


The first discovery of genomics

In 1991, at the EU meeting on genome programs in Elounda, Greece, the presentation of the yeast chromosome III and the first 100 kb of the Bacillus subtilis genome revealed that, contrary to expectation (the only cases where this had been observed were phages, for obvious reasons), at least half of the genes uncovered were totally unknown, whether in structure or in function


Orphans: the gluons

A remarkable role of aromatic amino acids creates a universal bias. Expressed orphan proteins are enriched in these residues, suggesting that they might participate in a process of gain of function during evolution. We postulate that the majority is made of proteins (gluons) involved in stabilising complexes, thus defining the "self" of the species.

G Pascal, C Médigue, A Danchin
Universal biases in protein composition of model prokaryotes
Proteins (2005) 60: 27-35


Why aromatic amino acids in orphan proteins?

From Orphans to « Gluons »

♣ Orphan proteins lose their status during evolution (Rocha, 2002; Pedulla, 2003)


Thank you