126
Genomics of Pathogens: Positive selection and disruptive technology Neil Hall @neilhall_uk

Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Genomics of Pathogens: Positive selection and disruptive

technology

Neil Hall

@neilhall_uk

Page 2: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Hall Lab

• Parasites

• Microbiomes

• Plants

Page 3: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

The Centre for Genomic Research

A national centre for genomic technology

Funded by

NERC (Core funding)

MRC (core funding)

BBSRC

http://www.liv.ac.uk/genomic-research/

Page 4: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

What’s a bioinformatician? (and what am I?)

Me

http://blog.fejes.ca/?p=2418

Anthony P. Fejes

Page 5: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Overview (Part 1 Pre-NGS)

• NGS – Why cost matters – NGS as an assay

• Sequencing as an assay

– The red queen

• Genomics with 1 genome (old school genomics) – Plasmodium falciparum genome project – The disease and the parasite – The genome – Metabolism

Page 6: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Overview (Part 2) • Genomics with >1 genome

– What we learn from other Plasmodium spp – Transcriptomes, proteomes and gene regulation

• Genomics with >100 genomes – Population genomics in Plasmodium – Mapping drug resistance

• Biology-by-sequencing in Entamoeba

– The Entamoeba genome – Sex and entamoeba – Mapping virulence – Transcriptomics and more sex

Page 7: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Genomics After NGS New Technology

Page 8: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

GENOMIC EVOLUTION

Genes Genomes Populations

1975 1995 > 2007

Page 9: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Properties of sequence reads

• Sequencing reads properties

– Length

– Quality

– Cost

– Time to production

@SRR001666.1

CACAAAGTGGCAAAGTACAATTAGTCTATAATGAACATGTTGAACTGATAGAGGTACCAATAAAACCTAGT

GATCGTTTAAAAGCTCGTGATATGTTGGGTAAAC

+SRR001666.1

BABBBFF@DFAFGGFDGGBGFGFHGHHHHHHHHHHHHHHHHHFHHFHDHHHHFHGHHHHHHHHHHH

GHHGHHHHHGFHGHHFHHHHHGFHFHGHHHHGHGGHFH

Qualities that scientists would say they are interested in

Qualities that scientists are actually interested in

Page 10: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Revolution in costs and throughput

http://www.genome.gov/sequencingcosts/ Hall, N Genome Biology 2013, 14:115

Rea

d le

ngt

h

2001 2012

Rea

d q

ual

ity

Page 11: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Sequencing costs have long been seen as a crucial issue for scientists

Science 1990

Costs • Reagent cost • Machine depreciation cost • Human cost

• DNA cost

Nature 1992

Page 12: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Process bottlenecks have changed

10 years ago Now Sample Prep

Sequencing

Analysis

Page 13: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Relative costs of a genome project

Sboner et al. Genome Biology 2011 12:125

Proportion of project costs

The questions we ask of the data are far more specific

Page 14: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

De novo

RNAseq

ChIP-Seq

SNP discovery

Transcriptomics

Metagenomics

Exome resequencing

Diagnostics

Community profiling

Mutation detection

Population genetics

RADtags

Metylation profiling

15 years ago now

APPLICATIONS

Page 15: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Shendure and Aiden Nature Biotechnology 30, 1084–1094 (2012)

APPLICATIONS

Page 16: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Sequencing as experiments

Shendure and Aiden Nature Biotechnology 30, 1084–1094 (2012)

All experiments have essentially the same workflow……

Page 17: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

The Red Queen A hypothesis driven sequencing experiment

"Now, here, you see", said the Red Queen to Alice in Lewis Carroll's Through the Looking Glass, "it takes all the running you can do, to keep in the same place".

Patterson et al 2012 Nature 464:275

Page 18: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Haldane 1949 – Disease and Evolution

• ‘genetical diversity as regards resistance to disease is vastly greater than that regards resistance to predators’

• ‘it is much easier for a mouse to get a set of genes which enable it to resist Bacillus typhimurium than a set which enable it to resist cats’

– Reprinted in ‘Evolution’ Mark Ridley

Page 19: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Polymorphism vs divergence

• Can use polymorphism within a species to generate null hypothesis for divergence between species

• Null hypothesis (MacDonald-Kreitman)

• positive selection should spread rapidly and become fixed. – Dn/Ds > Pn/Ps

• Under purifying selection – Dn/Ds <Pn/Ps

 

DN

DS=PN

PS

Page 20: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Pseudomonas & phage B

acte

rial

res

ista

nce

P

hag

e in

fect

ivit

y

Time

Past phage Current phage

Future phage

Past bact Current bact

Future bact

Page 21: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Pseudomonas & phage evolution through time

Bac

teri

al r

esis

tan

ce

Ph

age

infe

ctiv

ity

Time

Current phage

Current bact

Patterson et al 2012 Nature 464:275

Page 22: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

host

Experimental evolution

Phage adapt to hosts and hosts adapt to phage ‘co-evolution’

F2 phage in Pseudomonas fluorescens

Phage adapt to fixed host genotype ‘evolution’

Patterson et al 2012 Nature 464:275

Page 23: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Next-gen sequencing

• 454 sequencing of replicate populations – c. 200x coverage

• Measure frequencies of variants arising by mutation and subsequently selected – Rate of molecular evolution with and without Red

Queen?

– Diversity within and divergence between populations?

– What genes?

Patterson et al 2012 Nature 464:275

Page 24: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Replicates: (1) followed a similar trajectory away from the

ancestral sequence as they adapted to laboratory conditions;

(2) evolved similarly among replicates within a treatment but differently in different treatments

(3) showed independent evolution within each replicate, and higher rate in the coevolved than the evolved treatment.

Patterson et al 2012 Nature 464:275

Page 25: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Divergence from ancestor Diversity within population

Patterson et al 2012 Nature 464:275

Page 26: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Analysis of molecular variation within and between populations

Page 27: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Conclusions

• Positive selection is stronger in the co-evolved phage.

• Co-evolved phage show twice the genetic distance from the ancestor

• Phage attachment proteins in the tail contain the most diversity within and between populations

• Suggests that there is fluctuating selection within populations

Page 28: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Genome of Plasmodium falciparum

Page 29: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Protist Parasite Genomes: An evolutionary perspective

Baldauf et al (2003) Science 300:1703

Page 30: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Malaria

300-500 million cases,1.5-2 million deaths/yr Malaria kills a child every 40 seconds

Resistance to the cheapest antimalarial drug, chloroquine, is present in almost all endemic countries

Resistance has developed to most of the “new” antimalarials

No practical vaccine available

New methods for prevention and treatment of malaria are required

Caused by Plasmodium spp.

A eukaryotic single celled organism

Complex life cycle

Page 31: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Global distribution of malaria in 2007

Hey et al (2009) Plos Medicine 6:e1000048 (figures 2007)

Page 32: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Map of GDP per capita in 2006 based on IMF figures. Map from Wikimedia.org

GDP per capita, 2006

Page 33: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

The malaria parasite is

intracellular, spending much

of its life cycle living and

dividing within host cells.

Avoids host immune system

and receives a plentiful

supply of nutrients.

Vaccines against malaria must

target liver stages, sexual

stages or invasive stages.

Drugs need to access the

parasite in RBC or liver

stage.

The Plasmodium life cycle

Page 34: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Infection in non-immune humans; data derived from treatment of neuro-syphillis with malaria

The parasite keeps changing to evade the immune response and so

persists for a long time

Page 35: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Cytoadherence and antigenic variation

Erythrocyte invasion R

Mn

M

D

N

P

R = rhoptries Mn = micronemes D = dense granules

Page 36: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Cytoadherence and antigenic variation are crucial to pathogenicity in P. falciparum

• The parasite expresses proteins that are exported to the red blood cell surface

• The PfEMP proteins (encoded by var genes) mediate interaction with host endothelial cells causing cytoadherence.

• Cytoadherence causes cerebral malaria that can lead to death

• Var and Rif genes are also antigenic causing immune response

• The expressed genes can change (antigeneic variation)

Fernandez et al. 193 190 (10): 1393

Page 37: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

The Plasmodium genome project: A brief history

1995

Wellcome Trust funds

chromosome 1 pilot

1996

US funds $1m

pilot project

Consortium established

Sanger, TIGR , Stanford

Oxford, NMRI

1998

Chromosome 2

Published By TIGR

1999

Chromosome 3

Published by Sanger

2002 Whole genome published

Gardener, Hall, and Fung et al (2002) Nature 419:498-511 Hall et al (2002) Nature 419:527-31 Lasonder E et al (2002) 419:537-42

Page 38: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Not everyone celebrated

The Times (UK) 2002

Page 39: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Strategy

• Separate chromosomes by PFGE.

• Shotgun sequence individual chromosomes

• Align Contigs to map and close gaps using PCR/primer walking.

Page 40: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

DNA

Contiguous sequence

pUC clone

end sequence

large clone

end sequence

physical gap sequence gap

STS-1 STS-2 STS-3 STS-4

Shotgun Sequencing (back in the day)

14x coverage, YAC clones shotgun sequenced to anchor contigs

Page 41: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

The Plasmodium falciparim genome • 22853764 bp

• 19.4% GC

• 1 gene every 4.3kb (52% coding)

• Minimal set of tRNAs

• ~90% of genes have not previously described in Plasmodium.

• ~50% have no known homologues

Gardener, Hall, and Fung et al (2002) Nature 419:498-511

Page 42: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Chromosome Structure

SB-3 SB-2 SB-4 SB-4f SB-5

telomere

TARE2-5

Rep20

Subtelomeric repeats Var genes Rifin/stevor genes Other gene families

antigens House-keeping antigens

Page 43: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Adhesive domains of pfEMP-1

DBL-1 DBL-2 DBL-3 DBL-4 ATS

a b d e g

DBL-5

Blood

Group A

antigen

CR1

GAG’s CD36

IgM

ICAM-1 CD31 CSA

CIDR

Extracellular TM Intracellular

C2 CIDR TM

pfEMP-1 (250-350 kDa)

var (8-15 kb) Exon II Exon I

3 types of domain. DBL- duffy binding like CIDR- cystine rich interdomain region C2 - constant2

Page 44: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

VAR genes have a domain structure in Pf3D7

• 59 var genes in total.

• Telomeric Var genes tend to be type 1

• All internal Var clusters tend to be type 1

• Three upstream sequences – UPSa – 11 members- Inverted vars

– UPSb -35 members - telomeric vars

– UPSc 13 members - Internal vars

Gardener, Hall, and Fung et al (2002) Nature 419:498-511

Page 45: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Why put these genes at telomeres?

• Telomeres are hotspots for recombination

• Telomeres co-localize in the nucleus and non-homologous recombination can occur

• Lots of gene conversion + generation of new genes.

Page 46: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Var gene organisation is critical to antigeneic variation

Guizetti J,& Scherf A. 2013 Cell Microbiol. 15(5):718-26

UPS sequences regulate switch rate.

Page 47: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Regulation of gene expression

• Plasmodium falciparum has less transcription factors than you would expect in a genome of its size

– Either they are highly diverged.

– other mechanisms play a major roll in regulation of gene expression.

Gardner, Hall & Fung et al (2002) Nature 419:498-511.

Page 48: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Pfam Domain Proportion of matching domains Proportion normalised By genome size

Genes/1000 in P.falciparum

av

Classification of DNA binding domains in P. falciparum

Coulson, Hall, Ouzounis (2004) Genome Res. 14, 1548

Page 49: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Why so few regulators?

• Plasmodium has a predetermined life cycle.

• It does not have nutritional “choices” as it lives in defined, stable and nutrient rich environment.

• It is not competing with other species.

• The environments it lives in are varied but occur sequentially i.e. it knows what is coming next.

Page 50: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

IMP

AMP

XMP

GMP

GTP

GDP

glutamate N-acetyl-glutamate glutamine

Amino Compounds

aspartate

methionine salvage pathway NOVEL INHIBITORS

glycine serine cysteine alanine

proline asparagine ornithine

ornithine spermidine putrescine

Purine salvage,

Pyrimidine synthesis

PEP

pyruvate

GLYCEROL

GLUCOSE

fructose-6P

fructose-1,6-bisP

dihydroxyacetone-P +

glyceraldehyde-3P

Glycolysis

myo-inositol-1P

L-LACTATE

6-phospho-gluconate

ribulose-5P ribose-5P

Pentose Phosphate

Pathway

PRPP

glucosamine-6P

glucosamine

glucosamine-1P

dihydroorotate

orotate

dUMP Methylene THF

DHF THF

xylulose-5P +

erythrose-4P

CDP

dCDP

DNA RNA

chorismate ?

Shikimic Acid

Pathway

CoA

dephosphoCoA riboflavin

FMN

FAD

CO2

pABA

Folate biosynthesis

glycosyl

phophatidylinositol

(GPI anchors)

NOVEL

INHIBITORS

Glycerolipid Metabolism

3-deoxyarabino-heptulosanate- 7-phosphate

dTMP

Pyrimethamine

Cycloguanyl

oxaloacetate

glycerol triacylglycerol

phosphatidylcholine choline

ethanolamine phosphatidylethanolamine

Protohaem (FPIX2+)

porpho-bilinogen

acetoacetyl-ACP + malonyl-ACP

APICOPLAST

isopentenyl-PP

3-oxoacyl-ACP

3-hydroxy-

acyl-ACP enoyl-ACP

acyl-ACP

acetyl-CoA

Haem

Biosynthesis

Fatty acid

elongation

Glycerolipids

Triclosan

Thiolactomycin

ALA

Haem A Haem C

Ubiquinone (UQ)

acetate

glycine

NAD+

NADH modified tRNAs

?

malonyl-CoA

pyruvate

glucose-6P

malate

NOVEL

INHIBITORS

2C-methylerythrose-4P

deoxyxylulose-5P

NOVEL

INHIBITORS

O2

FPIX2+

Large peptides

Small peptides

Amino acids Haemoglobin

FPIX3+ O2

-

Haemozoin

FOOD VACUOLE

Chloroquine

Artemesinin

Quinine

PROTEASE INHIBITORS

PROTEASE INHIBITORS

F, V, & P-type ATPases

ADP ATP

H+

V

ADP ATP

P-lipids, Cu2+, other cations?

(16)

P

Na+ H+ Ca2+ H+ Zn2+ H+ Mn2+ H+ water/

glycerol nucleotide

or nt-sugar?

(2)

nucleo- side/base

ADP ATP

H+

F

?

? H+

NOVEL

INHIBITORS

Sulfonamides

ATP

hypoxanthine

xanthine

guanine

Folate Biosynthesis

7,8-dihydropteroate

dihydrofolate (DHF)

tetrahydrofolate (THF)

pABA

Pyrimethamine

Cycloguanyl

Purines and

Pyrimidines

H2O Cytc Fe3+

Cytc Fe2+

UQ

UQH2

O2 or

Atovaquone CO2 +

aspartate

MITOCHONDRION acetyl-CoA

glucose-1P

DOXP Pathway

Fosmidomycin

aminolevulinic acid (ALA)

oxoglutarate

citrate

Tricarboxylic acid

cycle

oxaloacetate

malate

fumarate

succinate

succinyl-CoA

isocitrate

cis-aconitate

Fatty Acid

Biosynthesis

NOVEL

INHIBITORS

PPi

H+

(2) Pi ADP ATP

ABC transporters

(13)

drugs?

NOVEL

INHIBITORS

NOVEL

INHIBITORS

?

malate

oxaloacetate

or

ubiquinone pool

Gardener, Hall, and Fung et al (2002) Nature 419:498-511

Page 51: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Purine salvage,

Pyrimidine synthesis

PEP

glutamate N-acetyl-glutamate glutamine

Amino Compounds

aspartate

methionine salvage pathway NOVEL INHIBITORS

glycine serine cysteine alanine

proline asparagine ornithine

ornithine spermidine putrescine

pyruvate

GLYCEROL

GLUCOSE

fructose-6P

fructose-1,6-bisP

dihydroxyacetone-P +

glyceraldehyde-3P

Glycolysis

myo-inositol-1P

L-LACTATE

6-phospho-gluconate

ribulose-5P ribose-5P

Pentose Phosphate

Pathway

PRPP

glucosamine-6P

glucosamine

glucosamine-1P

dUMP Methylene THF

DHF THF

xylulose-5P +

erythrose-4P

CDP

dCDP

DNA RNA

chorismate ?

Shikimic Acid

Pathway

CoA

dephosphoCoA riboflavin

FMN

FAD

CO2

pABA

Folate biosynthesis

glycosyl

phophatidylinositol

(GPI anchors)

NOVEL

INHIBITORS

Glycerolipid Metabolism

3-deoxyarabino-heptulosanate- 7-phosphate

dTMP

Pyrimethamine

Cycloguanyl

oxaloacetate

glycerol triacylglycerol

phosphatidylcholine choline

ethanolamine phosphatidylethanolamine

Protohaem (FPIX2+)

porpho-bilinogen

acetoacetyl-ACP + malonyl-ACP

APICOPLAST

isopentenyl-PP

3-oxoacyl-ACP

3-hydroxy-

acyl-ACP enoyl-ACP

acyl-ACP

acetyl-CoA

Haem

Biosynthesis

Fatty acid

elongation

Glycerolipids

Triclosan

Thiolactomycin

ALA

Haem A Haem C

Ubiquinone (UQ)

acetate

glycine

NAD+

NADH modified tRNAs

?

malonyl-CoA

pyruvate

glucose-6P

malate

2C-methylerythrose-4P

deoxyxylulose-5P

NOVEL

INHIBITORS

O2

FPIX2+

Large peptides

Small peptides

Amino acids Haemoglobin

FPIX3+ O2

-

Haemozoin

FOOD VACUOLE

Chloroquine

Artemesinin

Quinine

PROTEASE INHIBITORS

PROTEASE INHIBITORS

F, V, & P-type ATPases

ADP ATP

H+

V

ADP ATP

P-lipids, Cu2+, other cations?

(16)

P

Na+ H+ Ca2+ H+ Zn2+ H+ Mn2+ H+ water/

glycerol nucleotide

or nt-sugar?

(2)

nucleo- side/base

ADP ATP

H+

F

?

? H+

NOVEL

INHIBITORS

Sulfonamides

IMP

AMP ATP

XMP hypoxanthine

xanthine GMP

GTP

guanine GDP

Folate Biosynthesis

7,8-dihydropteroate

dihydrofolate (DHF)

tetrahydrofolate (THF)

pABA

Pyrimethamine

Cycloguanyl

Purines and

Pyrimidines

H2O Cytc Fe3+

Cytc Fe2+

UQ O2

or

Atovaquone CO2 +

aspartate

MITOCHONDRION acetyl-CoA

glucose-1P

DOXP Pathway

Fosmidomycin

aminolevulinic acid (ALA)

oxoglutarate

citrate

Tricarboxylic acid

cycle

oxaloacetate

malate

fumarate

succinate

succinyl-CoA

isocitrate

cis-aconitate

Fatty Acid

Biosynthesis

NOVEL

INHIBITORS

PPi

H+

(2) Pi ADP ATP

ABC transporters

(13)

drugs?

NOVEL

INHIBITORS

NOVEL

INHIBITORS

?

malate

oxaloacetate

or

ubiquinone pool

dihydroorotate

orotate

NOVEL

INHIBITORS

UQH2

Gardener, Hall, and Fung et al (2002) Nature 419:498-511

Page 52: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2
Page 53: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Pac Man Evolution of Malaria

Algae forms symbiotic

relationship with host

eukaryote

Ancient algae

chloroplast

Eukaryote cell

cyanobacterium

Modern Plasmodium Apicoplast

DNA from apicoplast

migrates to nucleus

Page 54: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Apicoplast targeting signals

~ 10% of nuclear genes may be imported into the

apicoplast

Foth et al Science, 299, 705-708, 2003

Page 55: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Evolution • 551 (10%) of proteins thought to be targeted to apicoplast by

identification of transit peptide.

• 246 targeted to mitochondria.

• Phylogenetic analysis of 333 putative plastid targeted proteins • 26 plastid derived

• 35 mitochondrial

• 85 possibly mitochondrial

• Redirection of targeted proteins ?

• “a range of herbicides that target plastid metabolism of undesired plants are also parasiticidal, making them potential new leads for antimalarial drugs” Kalanon and McFadden (2012)

Page 56: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

• A detailed understanding of the molecular basis antigenic variation

• A metabolic map for drug target identification

• A list of genes targeted to the apicoplast a known drug target.

What has the genome achieved?

Page 57: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Comparative Genomics of Malaria Parasites

Page 58: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Plasmodium species are highly host restricted

• Host jumping is rare

– P. Knowlesi into human

• Most plasmodium parasites are highly adapted to a single host or a few closely related species.

Hall, N Nature genetics (2013)

Page 59: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Eukaryotic Evolution 450 myr

310 mya

75mya

7mya

Chordate evolution

75mya

Plasmodium falciparum

Plasmodium berghei

Plasmodium chabaudi

Plasmodium evolution

Page 60: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Overview of assembly and gene Count

P. berghei P. c. chabaudi P. y. yoelii * P. falciparum

Size (bp) 17,996,878 16,866,661 23,125,449 22,853,764

No. contigs 7497 10679 5687 93

Av. contig

size (bp) 2,400 1,580 4,066 213,586

Sequence

coverage 4x 4x 5x 14.5x

No. protein

coding genes 5864 5698 5878 5268

*Carlton (2002) Nature 419:512-9.

Hall et al (2005) Science 307:82-6

Page 61: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2
Page 62: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

What is a good N50?….it depends

Page 63: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Orthology, Homology, Parology

• Homology

– Genes that are similar due to related ancestry.

• Orthologs

– Homologous genes separated due to a speciation event

• Paralogs

– Homologous genes separated by a duplication event

Page 64: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Orthologue Identification in between Plasmodium species

• Reciprocal orthologues identified using a BLAST cutoff of score of 50.

A B

orthologue

A B

Not ortholog

C

where are the differences ?

•5268 Genes in P. falciparum •4391 have orthologs in other species •Another 109 orthologs can be identified using synteny •736 have no orthologs.

•genes in RMP genomes clustered using Tribe-MCl

Page 65: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Plasmodium genomes are syntenic

Page 66: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Most (575) Species Specific genes are at the sub-telomeres

Orthologue map of P. falciparum Chr1

Of the 736 genes without orthologs. Only 161 are in internal regions. Of these 45 are members of Pf specific gene families 19 are Pf specific expansions Only 12 have predicted functions

Hall et al (2005) Science 307:82-6

Page 67: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Telomeric gene families

P. falciparum P. chabaudi P. berghei P. yoelii

VAR 59 - - -

Rifin 177 - - -

PIR - 138 180 838

Pyst-a 1 108 45 168

Pyst-b - 10 34 57

Pcst-b - 75 11 7

Hall et al (2005) Science 307:82-6

No

vel g

ene

fam

ilies

P

f sp

ecif

ic f

amili

es

Families clustered using Tribe-MCL

Page 68: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Evolution by gene family expansion

• There are very few genes that are truly “unique” to a Plasmodium species.

• There is a lot of gene family expansion that is species specific.

• This mostly occurs at telomeres.

• Some genes which are single copy in one species are multi-copy in another…and presumably have acquiring new functions.

Page 69: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Molecular selection analysis

Page 70: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Silent and Replacement Substitutions

Silent substitution:

Sequence 1: UUU CAU CGU

Sequence 2: UUU CAC CGU

Coded Amino Acids: Phe His Arg

Replacement substitution:

Sequence 1: UUU CAU CGU

Sequence 2: UUU CAG CGU

Coded Amino Acids: Phe His Arg Gln

‘Housekeeping Genes’.

Negative (Purifying) selection.

Change is Bad.

Genes that have a role in adaption.

Positive (Adaptive) selection.

Change is Good.

Selectively neutral genes

Genetic drift.

Change does not matter

Page 71: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

dN/dS analysis

Orthologous gene

Paml calcluate model of Evolution using maximum likelihood

dN/dS evolution rate

Muscle Protein alignment

nucleotide alignment

Page 72: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Positive selection on plasmodium genes is stronger in the mosquito vector

-The dN/dS of secretory proteins expressed in the vertebrate host is significantly higher than that of non secretory proteins Vertebrate expressed secretory proteins have a significantly higher dn/ds compared to mosquito expressed

Page 73: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Regulation of gene expression

• Plasmodium falciparum has less transcription factors than you would expect in a genome of its size

– Either they are highly diverged.

– other mechanisms play a major roll in regulation of gene expression.

Gardner, Hall & Fung et al (2002) Nature 419:498-511.

Page 74: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Expression data for P. berghei

• Amplicon based microarray, data

from RBC and gametocyte stages

• Proteomic data from RBC, mosquito and sporozoite stages.

Page 75: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Hall et al Science (2005)

Found Exclusively in this stage Non-exclusive to stage

Proteomic data

Page 76: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Identification of putative translational repressor motif in P. berghei

Identify transcripts upregulated only in the gametocyte

Of these find those that only have detectable protein products in ookinetes

49

9

Extract 3’ and 5’ UTRs and search for a conserved motif using MEME

Search for Motif back against the genome

Compare to expression data Hall et al Science, 2005

Page 77: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Putative regulatory motif in 3’ UTRs of gametocyte repressed transcripts

• 29 genes identified with 47bp degenerate motif (E value <10-5).

• 22 have clear orthologs in P. falciparum.

• 16 Pf orthologs are upregulated in Gametocytes.

• Only 2 identified in Pf. gametocyte proteome.

• Motif was not found in P. falciparum.

Hall et al Science, 2005

Page 78: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Dozi represses transcription in the female gametocyte

Mair , Braks, Garver, Wiegant, Hall, et al . 2006 Science. 313:667-9

Page 79: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Why translational control?

Transcription is ‘relatively’ slow. If your environment changes dramatically this could be a problem If you know what’s coming you can prepare for it. Hence gene regulation by transcriptional control is reduced in P.f

Page 80: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Population genomics in Plasmodium

Page 81: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Within species Plasmodium comparisons

• compared 470 regions 18 strains in total from around the world. – PCR amplification and sanger

sequence

• Measured total variation in each gene

• Diversity is strongly associated with host interaction

High diversity = Red Intermediate = Grey No diversity =Black

Volkman et al (2007) Nature Genetics 39:115

Page 82: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

MalariaGen population wide sequecning of malaria

• The MalariaGEN aim is to characterize genetic variation in P. falciparum populations.

• Goal is to establish a repository of population genomic data for P. falciparum that is of direct and immediate benefit to ongoing scientific research and ultimately to malaria control.

• 1685 samples from 25 separate locations in 17 countries

• Map haplotype frequencies for key mutations associated with drug resistance and other important phenotypes in P. falciparum

Page 83: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Allele frequency spectrum of SNPs genotyped in this study.

• Minor alleles dominate

– Consistent with recent expansion

• More low-freq alleles in Africa

– African origin

• Derived alleles associated with NS mutation

– Directional selection

Africa SE asia Paupa New Guinea

Manske (2012) Nature 487:375

Page 84: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

.

Genetic distance matrix between the 227 samples analysed

Manske (2012) Nature 487:375

Large separation between continents Separation of populations in areas of low transmission

Page 85: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Quantification of within-host diversity

• Inbreeding coefficient (Fws) can be estimated based on heterozygositiy within a host.

• Fws is probably related to transmission rates and human population distribution

• Inbreeding can lead to drug resistance.

Manske (2012) Nature 487:375

Page 86: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Identifying drug resistance loci

• Traditional genetic methods

• Allele selection

• Genome wide studies for selective sweeps

Page 87: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Genetic crosses for identifying drug resistance loci

• P. falciparum is very host specific. Genetic crosses cant be carried out in vitro

• Crosses require use of higher primates (Chimpanzees) and infection using live mosquitos.

• To date there have been 2 crosses of P. falciparum, published for identify drug resistance loci.

Anderson et al Pharmacogenomics (2011) 12(1), 59–85

Page 88: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Traditional methods for identifying drug resistance

• Take 2 strains with different phenotype

• Cross in mosquito and infect vertebrate host

• Clone and phenotype progeny

• Identify markers associated with the phenontype

Anderson et al Pharmacogenomics (2011) 12(1), 59–85

• The recombination rate in P. falciparum is high (1 cM = 17 kb) which is approximately 50-times greater than in humans (1 cM = 900 kb), and 15-times greater than Drosophila (1 cM = 250 kb).

• This means that mapping an provide relatively good resolution with few markers.

Page 89: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Allele selection

Allele selection has been used extensively in mouse model, P. chabaudii However the loci that are important in rodent parasites are not always the same as those important in human parasites.

Anderson et al Pharmacogenomics (2011) 12(1), 59–85

Multiple alleles contribute to chloroquine resistance in P. chabaudii Modrzynska et al. BMC Genomics 2012, 13:106

Page 90: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

GWAS analysis

• Alleles under recent selection (i.e from drugs) are often recently derived (new)

• Hence they have not been broken up by recombination

• These alleles are therefore often on large blocks and can be identified in populations as such.

Page 91: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

GWAS analysis

• In natural populations haplotypes are much smaller compared to in vitro crosses

• More markers are required to find linkage

• Also there is more variations so more individuals must be analysed to find a significant association

Page 92: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Long range LD and low diversity are a marker for drug resistance

Resistant lines in Blue have less diversity at DR loci than ancestral allele in red

Volkman et al (2007) Nature Genetics 39:115

Page 93: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Genetics from scratch in Entamoeba histolytica

Page 94: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Entamoeba histolytica • Transmission is by fecal oral route • Cysts in contaminated water or food enter gut

• Excystation occurs in the small bowel to form

motile trophozoites.

• Encystation of trophozoites produce new cysts that are passed in stools.

• Colitis occurs when parasites invade the gut wall

• Parasites entering the blood stream can invade the liver causing liver abscesses.

• 40-50 million develop invasive disease annually (WHO, 1985)

• 40-110,000 deaths annually

• 80-90% infected people asymptomatic

Page 95: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Entamoeba

http://www.medicine.virginia.edu/

Page 96: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2
Page 97: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Entamoeba spp are a diverse genus of commensals (and some parasites)

Clark et al (2006) Int J Syst Evol Microbiol 56, 2235-2239

Page 98: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Findings from E. histolytica genome

• Significant loss/degradation of variety of functions/organelles as a result of parasitic lifestyle in nutrient rich environment

• Concomitant expansion in specific domain families related to vesicle trafficking/phagocytosis & sensing of its environment

• Some unusual metabolic pathways may be present based on absence of synthesis pathways for key metabolites .

• Lateral gene transfer of genes related to various aspects of metabolism into the genome.

• Unusual or bacterial like processes may serve as good therapeutic targets as they are not found in the host.

Loftus et al (2005) Nature 433:865

Page 99: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Wilson Weedall and Hall. Parasite Immunology (2011)

Page 100: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Genetics of E. histolytica

• Disease is only occurs in a minority of cases of E. histolytica

• Some strains are more virulent than others in model infections (eg Rahman vs HM1)

• We hypothesise that there is an underlying genetic cause for this

• Unfortunately as there are no microsatellites in the E. histolytica genome no genetic studies have been performed

Page 101: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Why sex is important

• Recombination of genes

• New variants

• Allows establishment into new niches

• Variation helps adapt to immune system of host

• Spread of important traits .. drug resistance/virulence

Page 102: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Lots of Sex vs Occasional Sex vs No Sex

• Some parasites (Plasmodium) have meiosis every life cycle

• Others only replicate clonally (Leishmania major )

• Others have clonal epidemics with occasional recombination between strains (T. brucei)

• Others we are not sure about…. A-Clonal B-Panmictic C & D - Epidemic

Maynard Smith et al (1993)

Page 103: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Genome re-sequencing

* Unknown origin (isolated from sailor) ** Unknown origin (possibly Liberia or Colombia)

8 axenic strains sequenced (provided by C. Graham Clark, LSHTM)

Weedall et al Genome Biol. 2012 13(5):R38

Page 104: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

n

coverage

Defined ~4000 ‘high quality candidate marker’ sites = homozygous, coding, in all strains

>400 >400 <10 <10

Weedall et al Genome Biol. 2012 13(5):R38

Page 105: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

A test for recombination

4-gametes (4-haplotypes) test: 4 haplotypes must be due to recombination under the infinite sites model (no recurrent mutation)

X

3 1 2 1

3 2 4 2 1

} Recombination between haplotypes 1 and 3 creates new haplotype

- New mutation creates new haplotype

Weedall et al Genome Biol. 2012 13(5):R38

Page 106: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

A test for recombination

average distance between: 2-hap sites = 18,454 bp 3-hap sites = 46,644 bp 4-hap sites = 60,176 bp Proportion of site-pairs with 4 haplotypes increases with distance (on same reference scaffold) Pattern would be expected given recombination (probability increases with distance) Suggests historical recombination events

Used ~4000 ‘high quality candidate marker’ sites

Weedall et al Genome Biol. 2012 13(5):R38

Page 107: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Could we be observing gene conversion?

Recombination was modeled by specifying a recombination parameter 4.Ne.r,

where Ne is the effective population size and r is the per generation probability of a crossover occurring.

Gene conversion was modeled by specifying a parameter 4.Ne.f,

where f in the per generation probability of a gene conversion event in the sequence as well as the length of the 'converted' region.

Page 108: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Sources of variation in E. histolytica

• SNP

• Gene loss/gain

• Copy number variation

2352 / 8306 genes are polymorphic ~5 % have >= 4 SNP kb-1 ~1 % have >= 6 SNP kb-1

Page 109: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Rahman

MS84

HK9

CP-A1

Genomic plasticity: gene copy number variation

Page 110: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Rahman, transcriptome

Rahman, genome

PVBM08B, transcriptome

PVBM08B, genome

Copy number variation between strains is associated with differential transcription

Preativatanyou et al unpublished

Page 111: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Sources of intra-species genetic variation

• Recombination

• Genome plasticity – CNV –Gene loss gain

• These processes will affect the emergence and spread of – Drug resistance

– Virulence

Page 112: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Genotyping of field samples Gilchrest et al BMC Microbiol. 2012 12:151.

• 19 amebic liver aspirates;

• 26 xenic cultures – 14 from asymptomatic infections

– 12 from diarrheal infections

• 20 E. histolytica positive samples from diarrheal stool

• 21 marker loci selected for genotyping by Illumina sequencing. Rajshahi, Bangladesh

Page 113: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

SNP Genotypes from a single population are consistent with frequent recombination

• distribution of the SNP suggest a highly recombining population with few clustered branches.

• Suggesting a lot of recombination between alleles

Gilchrest et al BMC Microbiol. 2012 12:151.

Page 114: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

SNPs in the EHI_080100 locus segregate with disease

Reference alleles significantly more common in the Diarrhea/Dysentery sample relative to the Colitis and Liver abscess sample

Gilchrest et al BMC Microbiol. 2012 12:151.

Page 115: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

If recombination is occurring… when is it occurring?

Page 116: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Transcriptome of development in Entamoeba Ehrenkaufer et al Genome Biol. 2013 14(7):R77

• Using E. invadens as a model for Encystation

• Extract RNA over time-course • Reads mapped using TopHat

• Expression profiles clustered using

STEM

• Differential expression using CuffDiff

Page 117: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Encystation Excystation

10 µm

Trophozoite 8h Cyst 24h Cyst 48h Cyst 72h Cyst Excyst (2/8h)

10 µm 10 µm 10 µm 10 µm 10 µm

Library size: 58,449,143 58,660,483 59,468,129 59,024,235 48,908,902 48,106,288/ 41,257,294

Intra-timepoint correlation: 0.76 0.96 0.80 0.82 0.50 0.98/0.98

Changes in cell morphology during induced encystation and excystation

2 samples per time point and a third for northern confirmation

Ehrenkaufer et al Genome Biol. 2013 14(7):R77

Page 118: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

A

B

C

275

0

de

pth

EIN_014820

EIN_035150

975

0

de

pth

2

2

0

0 bits

Analysis of E. invadens introns

2599 3295 960

Introns from RNAseq mapping (n=3559)

Predicted E. invadens introns (n=5894)

Ehrenkaufer et al Genome Biol. 2013 14(7):R77

Page 119: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Temporal expression profiles

A B

127

1760

1384

146

90

430

219

428

537

325

129

169

360

378

45

41

44

47

4

8

15

19

49

37

40

48

17 18

7

16

11

0

16

15

41 24

31 34

37 19

21

20

35

13

45

32

Clustering of gene expression profiles using Short Time-Series Expression Miner (STEM) software During early encystation we observe a general trend of down regulation During excystation we observe a general trend of genes being switched on.

ENCYSTATION EXCYSTATION

Ehrenkaufer et al Genome Biol. 2013 14(7):R77

Page 120: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

FP

KM

04

00

01

00

00

FP

KM

04

00

80

01

20

0

FP

KM

05

000

15

00

0

FP

KM

02

00

040

00

60

00

FP

KM

04

00

80

01

20

0

FP

KM

040

00

80

00

12

000

FP

KM

04

00

08

000

8000

4000

10000

4000

1200

800

400

6000

2000

15000

5000

12000

4000

1200

800

400 EIN_192230

hypothetical

protein

EIN_017100

phospholipase D

EIN_060150

peflin

EIN_202650

eIF5A

EIN_093390

enolase

EIN_099680

hypothetical

protein

1kb

1kb

1kb

1kb

1kb

1.5kb

1.5kb

EIN_166570

hypothetical

protein

Tro

ph

Cys

t 2

4h

Cys

t 7

2h

Ex

cys

t

8h

Tro

ph

Cys

t 2

4h

Cys

t 7

2h

Ex

cys

t

8h

FP

KM

8h Cyst 24h Cyst 48h Cyst 72h Cyst 2h Excyst 8h Excyst T:8h T:24h T:48h T:72h T:2Ex T:8Ex 8h:24h 8h:48h 8h:72h 8h:2Ex 8h:8Ex 24h:48h 24h:72h 24h:2Ex 24h:8Ex 48h:72h 48h:2Ex 48h:8Ex 72h:2Ex 72h:8Ex

Pairwise Comp.

numb

er of

gene

s reg

ulated

020

040

060

080

010

0012

00

Up

Down

T:8h T:24h T:48h T:72h T:2Ex T:8Ex 8h:24h 8h:48h 8h:72h 8h:2Ex 8h:8Ex 24h:48h 24h:72h 24h:2Ex 24h:8Ex 48h:72h 48h:2Ex 48h:8Ex 72h:2Ex 72h:8Ex

Pairwise Comp.

nu

mb

er

of

gen

es

reg

ula

ted

02

00

40

06

00

80

01

00

01

20

0 Up

Down

T:8h T:24h T:48h T:72h T:2Ex T:8Ex 8h:24h 8h:48h 8h:72h 8h:2Ex 8h:8Ex 24h:48h 24h:72h 24h:2Ex 24h:8Ex 48h:72h 48h:2Ex 48h:8Ex 72h:2Ex 72h:8Ex

Pairwise Comp.

numb

er of

gene

s reg

ulated

020

040

060

080

010

0012

00

Up

Down

Comparison to trophozoites Comparison to 72h cysts T:8h T:24h T:48h T:72h T:2Ex T:8Ex 8h:24h 8h:48h 8h:72h 8h:2Ex 8h:8Ex 24h:48h 24h:72h 24h:2Ex 24h:8Ex 48h:72h 48h:2Ex 48h:8Ex 72h:2Ex 72h:8Ex

Pairwise Comp.n

um

be

r o

f g

en

es r

eg

ula

ted

02

00

40

06

00

80

01

00

012

00 Up

Down

0

20

0

40

0

60

0

80

0

10

00

nu

mb

er o

f ge

nes

re

gula

ted

RNAseq data are confirmed by Northern analysis

Ehrenkaufer et al Genome Biol. 2013 14(7):R77

Page 121: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

010

02

00

300

40

0500

conditon

FP

KM

0 8 24 48 72 80

EIN_040930

EIN_168780

Changes in the estimated expression levels of known encystation associated genes

A) chitin synthases

B ) chitin binding lectins

C) chitinase domain proteins

FPK

M

FPK

M

FPK

M

0

1

00

2

00

3

00

4

00

5

00

0

10

00

3

00

0

50

00

0

10

00

20

00

30

00

Troph 8h 24h 48h 72h 2h 8h

Troph 8h 24h 48h 72h 2h 8h

Troph 8h 24h 48h 72h 2h 8h

010

00

2000

30

00

400

050

00

conditon

FP

KM

0 8 24 48 72 80

EIN_015880EIN_016240EIN_040990EIN_050710EIN_058620EIN_104770EIN_137570EIN_186850EIN_294450EIN_312260

01

00

0200

0300

0

conditon

FP

KM

0 8 24 48 72 80

EIN_053310

EIN_059870

EIN_059900

EIN_084170

EIN_134930

EIN_239240

EIN_243430

EIN_289570

Page 122: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

expression levels of genes involved in meiosis increase at 24 h after encystation F

PK

M

05

01

00

15

020

02

50

Troph Cyst 8h Cyst 24h Cyst 48h Cyst 72h Excyst 2h Excyst 8h

All meiosis-associated genes

05

00

10

00

15

00

20

00

25

00

Troph Cyst 8h Cyst 24h Cyst 48h Cyst 72h Excyst 2h Excyst 8h

Meiosis-specific genes

Ehrenkaufer et al Genome Biol. 2013 14(7):R77

Ramesh et al 2005, Current Biol. 15:185–191

Page 123: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Troph Cyst_8h Cyst_24h Cyst_48h Cyst_72h

FP

KM

05

01

00

15

02

00

25

0 EIN_017100

EIN_196230

Troph Cyst 8h Cyst 24h Cyst 48h Cyst 72h

FP

KM

0

10

0

2

00

Troph Cyst_8h Cyst_24h Cyst_48h Cyst_72h

FP

KM

05

01

00

15

02

00

25

0 EIN_017100

EIN_196230

Phospholipse D expression and activity increases during encystation

• E. invadens has two genes encoding PLDs: EIN_017100 and EIN_196230, both of which are highly upregulated during encystation

• Phospholipase D activity

increased during encystation.

Ehrenkaufer et al Genome Biol. 2013 14(7):R77

Ency

sta

on

eff

ici

enc

y (%

)

Treatment

0

4

0

8

0

12

0

none n‐butanol t‐butanol

t‐butanol n‐butanol

• n-butanol was used to inhibit PLD,

t-butanol used as a control

• N-butanol significantly reduces encystation in culture.

Page 124: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Meiosis and Entamoeba

• Genomic data support the hypothesis that Recombination is occurring in Entamoeba

• We have described the transcriptional changes during encystation and excystation in E. invadens.

• We see strong transcriptional evidence for meiosis occurring during the encystation process

• These data further suggests that meiosis is integral to encystation and a common process in Entamoeba

• We have identified a lipid signaling molecule (PLD) that is involved in induction of encystation.

Page 125: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2

Final thoughts

• Genomics is cool

• You don’t need to me a superman/superwoman to be a genome scientist

• Remember you’re a biologist

Page 126: Genomics of Pathogens: Positive selection and disruptive technologyevomicsorg.wpengine.netdna-cdn.com/wp-content/uploads/... · 2017. 3. 14. · Malaria 300-500 million cases,1.5-2