
Are essential genes conserved?

Fatemeh Ashari Ghomi1, Paul Gardner1, Lars Barquist2

1 School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
2 Institute for Molecular Infection Biology, University of Würzburg, Würzburg, Germany

Transposon-directed insertion-site sequencing is an approach for studying the essentiality of genes in prokaryotes. In this method, pools of single-insertion mutants are constructed using transposon mutagenesis, and the effect of each mutation on the mutants’ survival is evaluated by sequencing the survivors. This can lead to the identification of essential genes.

We have used transposon-directed insertion-site sequencing to study the essentiality of genes in 12 strains of Enterobacteriaceae, which are depicted in the figure. To do this, we first studied different biases that can affect our transposon insertion experiment. After correcting for these biases, we studied the relation between the essentiality of genes and their conservation.

Klebsiella pneumoniae Ecl8

Salmonella Typhimurium SL1344

Salmonella Enteritidis P125109

Escherichia coli UPEC ST131

Salmonella Typhimurium D23580

Escherichia coli ETEC CS17

Salmonella Typhimurium A130

Enterobacter cloacae NCTC 9394

Citrobacter rodentium ICC168

Escherichia coli ETEC H10407

Klebsiella pneumoniae RH201207

Salmonella Typhi Ty2

Introduction

Questions

1. Are there any biases that affect the results of transposon insertion experiments?

2. Is the conservation of essentiality consistent with the species tree?

3. Are essentiality of genes and their conservation related?


Transposon insertion is the process of inserting a nucleotide sequence into a gene so that it disrupts the gene and causes the gene to lose its function.

• If the gene is essential, the organism will not be able to survive.

• If it is non-essential, the organism will be able to survive.

• If it is a beneficial loss, the organism will benefit from losing it.

After genome sequencing:

• No or few transposon insertions are spotted in essential genes.

• An intermediate number of transposon insertions are detected in non-essential genes.

• Many transposon insertions are observed in beneficial losses.
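The three outcomes above can be sketched as a simple classifier. This is only an illustration: the definition of the insertion index as insertions per base pair, the `Gene` record, and both cutoff values are assumptions made for the example, not the thresholds used in this study.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str             # hypothetical gene identifier
    length_bp: int        # gene length in base pairs
    insertion_count: int  # transposon insertions observed in the gene


def insertion_index(gene: Gene) -> float:
    """Insertions per base pair (an assumed, simplified definition)."""
    if gene.length_bp <= 0:
        raise ValueError("gene length must be positive")
    return gene.insertion_count / gene.length_bp


def classify(gene: Gene, essential_max: float = 0.005,
             beneficial_min: float = 0.1) -> str:
    """Label a gene from its insertion index.

    Both cutoffs are placeholder values chosen for illustration only.
    """
    ii = insertion_index(gene)
    if ii <= essential_max:
        return "essential"        # no or few insertions
    if ii >= beneficial_min:
        return "beneficial loss"  # many insertions
    return "non-essential"        # intermediate number of insertions
```

In practice the cutoffs would be derived from the observed distribution of insertion indices rather than fixed in advance.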

Transposon-directed insertion-site sequencing

We have divided each gene into 3 segments: the first 5% of the gene at the 5’ end, the last 20% at the 3’ end, and the internal region in between. The figure shows that the 5’ and 3’ ends carry more insertions than the internal region in essential genes, and fewer insertions than the internal region in beneficial losses.
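A minimal sketch of how an insertion could be assigned to one of these three segments follows. The function name, the 1-based inclusive coordinates, and the strand handling are assumptions for illustration; only the 5% and 20% fractions come from the text.

```python
def segment_of(position: int, gene_start: int, gene_end: int,
               strand: str = "+", first_frac: float = 0.05,
               last_frac: float = 0.20) -> str:
    """Return which segment of a gene an insertion position falls in.

    Coordinates are assumed 1-based and inclusive, with
    gene_start <= gene_end on the forward strand.
    """
    if not gene_start <= position <= gene_end:
        raise ValueError("position lies outside the gene")
    length = gene_end - gene_start + 1
    # Measure the offset from the 5' end, which depends on the strand.
    if strand == "+":
        offset = position - gene_start
    else:
        offset = gene_end - position
    relative = offset / length
    if relative < first_frac:
        return "5' end"
    if relative >= 1 - last_frac:
        return "3' end"
    return "internal"
```

Averaging the insertion index separately over each segment label would then give the per-segment comparison described above.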

[Figure: mean insertion index by position within the gene (5’ to 3’), in two panels, Essential and Beneficial loss. Legend: first 5%, internal, last 20%.]

Are transposon insertions evenly distributed within genes?

[Figure: Distance bias. Insertion index plotted against distance from the origin.]

We have also investigated whether the number of insertions in a gene is related to the position of the gene within the genome or to the G-C content of the gene. The results suggest that the farther a gene is from the origin of replication, the fewer insertions it receives (Distance bias figure). Moreover, the GC bias figure shows no bias when the G-C content is greater than 0.5.
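The two covariates examined here can be computed as in the sketch below. It assumes a circular chromosome and expresses distance as a fraction of genome length, which would be consistent with the 0 to 0.5 range of the distance axis; that reading, and both function names, are assumptions rather than details stated on the poster.

```python
def distance_from_origin(gene_pos: int, origin_pos: int,
                         genome_length: int) -> float:
    """Shortest distance around a circular genome, as a fraction of its length.

    The result lies between 0.0 (at the origin) and 0.5 (opposite the origin).
    """
    if genome_length <= 0:
        raise ValueError("genome length must be positive")
    d = abs(gene_pos - origin_pos) % genome_length
    return min(d, genome_length - d) / genome_length


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)
```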

[Figure: GC bias. Insertion index plotted against GC content.]

To test whether the results are biased towards certain motifs, we have generated sequence logos from the 10 nucleotides flanking the 100 most frequent insertion sites. The analysis shows no significant bias.
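The quantity a sequence logo displays, information content in bits per position, can be sketched as follows. This is the standard 2 minus Shannon entropy calculation for a four-letter alphabet, without a small-sample correction; the function name is hypothetical and the input is assumed to contain only A, C, G and T.

```python
import math
from collections import Counter


def information_content(sequences: list[str]) -> list[float]:
    """Per-position information content (bits) of equal-length DNA sequences.

    A position where every sequence has the same base scores 2 bits;
    a position where all four bases are equally frequent scores 0 bits.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    width = len(sequences[0])
    if any(len(s) != width for s in sequences):
        raise ValueError("all sequences must have the same length")
    bits = []
    for i in range(width):
        counts = Counter(s[i].upper() for s in sequences)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total)
                       for c in counts.values())
        bits.append(2.0 - entropy)
    return bits
```

Values near 0 bits at every flanking position would correspond to the absence of motif bias reported above.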

[Figure: sequence logos (probability and bits, positions 1 to 20) for the top 100 most frequent insertion sites.]

Are transposons biased towards certain positions in the genome?

We have compared the number of genes that are conserved in different strains in our study with the number of genes that are essential in these strains. The results suggest that although the conservation of genes follows a tree-like trend, essentiality does not show a tree-like signal.
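A comparison of this kind rests on set intersections across strains. The sketch below uses toy data with hypothetical strain and gene names to show how genes shared by all strains, and the number of strains carrying each gene, could be counted; it is not the pipeline used for the figure.

```python
from collections import Counter


def shared_by_all(gene_sets: dict[str, set[str]]) -> set[str]:
    """Genes present in every strain's set (empty if no strains are given)."""
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


def strains_per_gene(gene_sets: dict[str, set[str]]) -> Counter:
    """Number of strains in which each gene appears."""
    counts = Counter()
    for genes in gene_sets.values():
        counts.update(set(genes))
    return counts


# Toy input: strain labels and gene names are placeholders, not real data.
conserved = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB"},
    "strain3": {"geneA", "geneC"},
}
core_genes = shared_by_all(conserved)  # genes found in all three strains
```

Running the same functions on per-strain sets of essential genes, instead of conserved genes, gives the second set of intersection sizes to compare.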

[Figure: two panels, Conservation and Essentiality, each showing intersection sizes of gene sets across strains. Conservation intersection sizes: 1909, 779, 471, 261, 208, 165, 121, 97, 93, 93, 84, 82, 78, 77, 74, 64, 61, 61, 58, 56. The two largest essentiality intersection sizes are 180 and 124. Strains shown in both panels: Escherichia coli K-12 MG1655, Salmonella Typhi Ty2, Escherichia coli ETEC H10407, Salmonella Typhimurium A130, Escherichia coli ETEC CS17, Salmonella Typhimurium D23580, Escherichia coli UPEC ST131, Klebsiella pneumoniae RH201207, Klebsiella pneumoniae Ecl8, Salmonella Enteritidis P125109, Salmonella Typhimurium SL1344, Enterobacter cloacae NCTC 9394, Citrobacter rodentium ICC168.]

Is the conservation of essentiality consistent with the species tree?

We have divided the genes in our 12 strains into 3 groups: genus-specific genes, genes with one copy per genome, and genes with multiple copies per genome. The study of essentiality in these groups shows that most of the essential genes are present as a single copy per genome and most of the beneficial losses are genus specific.

We have performed a pathway enrichment analysis on different groups of genes in Salmonella Typhi using KOBAS 2.0. The results indicate that essential genes are mostly involved in essential pathways such as replication and translation; the enrichment of the pathways related to non-essential genes is not statistically significant; and the beneficial losses are mostly involved in pathways that are not needed in nutrient-rich broth.
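Pathway enrichment is commonly assessed with a hypergeometric over-representation test. The sketch below shows that generic test and the −log10 transform used on the figure axis; it is an assumption that this resembles the statistic behind the reported values, it is not KOBAS code, and the function and parameter names are hypothetical.

```python
from math import comb, log10


def hypergeom_pvalue(hits: int, group_size: int,
                     pathway_size: int, total_genes: int) -> float:
    """P(X >= hits) when drawing group_size genes out of total_genes,
    of which pathway_size belong to the pathway."""
    upper = min(group_size, pathway_size)
    favourable = sum(
        comb(pathway_size, i) * comb(total_genes - pathway_size, group_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(total_genes, group_size)


def neg_log10(p_value: float) -> float:
    """Transform a P-value to -log10(P), as plotted in enrichment charts."""
    return -log10(p_value)
```

For example, `hypergeom_pvalue(5, 5, 5, 10)` asks how likely it is that all 5 genes of a group fall in a 5-gene pathway when 10 genes exist in total; a real analysis would also correct for testing many pathways.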

[Figure: histograms of insertion index (x-axis) against frequency (y-axis), with genes classed as essential, non-essential, or beneficial loss. Panels: All clusters (n = 6550), Genus specific (n = 2884), Single copy (n = 2742), Multiple copy (n = 924).]

[Figure: pathway enrichment shown as −log10(P-value) per pathway, in three panels.
Essential: Protein export; DNA replication; Homologous recombination; Terpenoid backbone biosynthesis; Ribosome.
Non-essential: Flagellar assembly; Microbial metabolism in diverse environments; Phosphotransferase system (PTS); Sulfur metabolism; Two-component system.
Beneficial losses: Phosphotransferase system (PTS); Lipopolysaccharide biosynthesis; Bacterial invasion of epithelial cells; Salmonella infection; Bacterial secretion system.]

Are essential genes more likely to be conserved?

• The 5’ and 3’ ends of genes have a different tolerance for insertions compared to the internal region in transposon-directed insertion-site sequencing.

• The number of transposons inserted into a gene is related to the distance of the gene from the origin of replication.

• The transposons are not biased towards certain motifs or G-C content of the gene.

• The conservation of essentiality is not consistent with the species tree.

• Essential genes are more likely to be conserved.

Conclusions

Contact

Fatemeh Ashari [email protected]
