

Origins, Evolution, and Dissemination of Subtype C HIV-1 in Zimbabwe: Bayesian and Phylogenetic Analysis

Sudeb Dalai, Seble Kassaye, Tulio de Oliveira, Gordon Harkins, Jennifer Lint, Elizabeth Johnston, and David Katzenstein.

Stanford University School of Medicine, Stanford, CA, USA; and South African National Bioinformatics Institute, University of the Western Cape, Cape Town, South Africa.

Contact Information: [email protected]

Acknowledgements

Sudeb Dalai is supported by the Howard Hughes Medical Institute Research Fellowship and the Paul and Daisy Soros Fellowship for New Americans.

I. Background and Objectives

HIV-1 subtype C now accounts for approximately 50% of the estimated 33 million people living with HIV/AIDS and half of the 1-2 million new infections annually. The predominance of a single clade of HIV-1 in the most severely affected sub-Saharan countries has been ascribed to a founder effect, while the rapid heterosexual transmission and dissemination of subtype C have been attributed to sexual-social factors, the frequency of concurrent partnerships, increased virus load, and shedding of virus alongside sexually transmitted infections. Comparisons with subtype B virus isolates have demonstrated that subtype C viruses have enhanced tropism for macrophages and dendritic cells, increased viral replication rates through transcriptional regulation, and, most recently, a higher rate of mutation and drug resistance among women receiving single-dose nevirapine.

A high rate of infection is documented in young antenatal women, among whom infection rates between ages 16 and 24 are 2-5% per year. HIV testing of pregnant women for prevention of mother-to-child transmission (PMTCT) programs provides a consistent population from which recently transmitted viruses can be identified and in which the evolution of new subtypes, recombinants, and temporal changes in subtype C can be characterized.

Routine sequencing of pol genes of circulating virus, as a surveillance tool for drug resistance, has increasingly been used for evolutionary and phylogenetic mapping, and to explore the origins, molecular epidemiology, and genetic diversity of different HIV-1 subtypes. We analyzed subtype C pol sequence data over time from successive cohorts of women screened for HIV infection in antenatal clinics in Zimbabwe. Evaluation of genetic diversity and phylogenetic relationships in these cohorts enabled formulation of a Bayesian molecular clock, which provided information on the origins, timing, and epidemic growth patterns of the subtype C epidemic in southern Africa.

II. Methods

•We sequenced HIV-1 pol from samples obtained from 4 sequential cohorts of pregnant, HIV-positive women presenting to antenatal clinics in Harare, Zimbabwe from 1991 to 2006.

•Maximum-likelihood phylogenetic trees were constructed with PhyML v2.4.4.

•Ancestral sequences were reconstructed with HyPhy v1.0.

•Sequence divergence from ancestral sequences was used to derive a nucleotide evolutionary rate, which we used to calibrate a Bayesian Markov chain Monte Carlo (MCMC) analysis (BEAST v1.4) under several different models of population growth. BEAST was used to reconstruct most recent common ancestor (MRCA) sequences, date the introduction of infection in this region, and estimate a time-resolved phylogeny and the viral dynamics of the epidemic.

•Other concurrent southern African subtype C sequences were added to the dataset, and the Slatkin-Maddison statistical test for gene flow (implemented in MacClade v4.0) was used to test the hypothesis of compartmentalization vs. migration of subtype C HIV-1 among southern African countries.
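
The rate-derivation step above can be illustrated with a simple regression of divergence on sampling year. The sketch below is only an illustration under stated assumptions: the function name `clock_rate` and the divergence values are hypothetical (toy numbers chosen so the fit is easy to check, not study data), and it stands in for, rather than reproduces, the HyPhy and BEAST workflow. The slope approximates substitutions/site/year and the x-intercept approximates the date at which divergence from the ancestor is zero.

```python
def clock_rate(years, divergences):
    """Least-squares fit of divergence (subs/site) against sampling year.

    Returns (rate, root_date): the slope in substitutions/site/year and the
    x-intercept, i.e. the year at which the fitted divergence is zero.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, divergences))
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    root_date = -intercept / rate
    return rate, root_date


# Hypothetical toy data: divergences lie exactly on a line with an assumed
# rate of 0.002 subs/site/year and an assumed zero-divergence year of 1970.
years = [1991, 1998, 2000, 2006]
divergences = [0.002 * (y - 1970) for y in years]

rate, root_date = clock_rate(years, divergences)
print(f"rate = {rate:.4f} subs/site/year, zero-divergence year = {root_date:.1f}")
```

With real data the points scatter around the line, and a Bayesian analysis such as the one described above would additionally report uncertainty on both quantities.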

III. Results

•Pol sequences obtained from 212 women (39 in 1991; 56 in 1998; 27 in 2000; 90 in 2006) demonstrated a significant increase in sequence diversity over the 16-year period, assessed by genetic distance (p<0.0001, one-way ANOVA).

•Sequences clustered by sampling year, with the most recent (2006) sequences most divergent from the ancestral node (Figure 1).

•BEAST analysis calibrated the molecular clock at an evolutionary rate of 2-3 × 10⁻³ nucleotide substitutions/site/year across several different models of population growth.

•The estimated date of the MRCA was 1973 across all population growth models (Figure 3), with clear evidence for multiple introductions of subtype C HIV-1 from neighboring countries during 1979-1981 (Figures 2 & 4).

•Lineage calculations at various timepoints (as a percentage of current lineages) indicated that most lineage diversity was introduced during 1980-1985 (Figure 2).

•Zimbabwean subtype C sequences clustered most closely with sequences from neighboring African nations, and were more divergent from subtype C sequences isolated from other regions of Africa or the world.

•Bayesian skyline analysis implemented in BEAST (Figure 4) demonstrated three epidemic growth phases: an initial slow phase in which HIV-1 was seeded in the 1970s, followed by exponential growth in the 1980s and linear expansion of the epidemic to the present.

•Slatkin-Maddison tests of compartmentalization indicated migratory patterns consistent with flow of HIV-1 between Zimbabwe and South Africa, Zambia, and Botswana (before 1998) and additional migratory patterns between Zimbabwe and Malawi/Mozambique (after 1998) (Figures 5A-D).
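
The comparison of genetic distance across sampling years reported above relies on a one-way ANOVA. As a minimal sketch of what that test computes, the code below derives the F statistic from grouped values in plain Python. The function name `one_way_anova_f` and the distance values are hypothetical toy numbers, not the study's data; a p-value would come from the F distribution with the returned degrees of freedom, typically via a statistics library.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical genetic distances (percent) grouped by sampling year.
distances_by_year = {
    1991: [1.0, 2.0, 3.0],
    1998: [2.0, 3.0, 4.0],
    2006: [5.0, 6.0, 7.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(distances_by_year.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F indicates that the differences between yearly means are large relative to the variation within each year.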

Figure 5 panel labels: Before 1998 (n=563); After 1998 (n=483).

IV. Summary and Conclusions

•The Zimbabwean HIV epidemic likely originated from multiple introductions of subtype C virus in the late 1970s and early 1980s. Historically, this corresponds to a change in political boundaries (Zimbabwean independence) and rapid population influx from neighboring countries.

•The timing, phylogenetic clustering, and genetic diversity of Zimbabwean subtype C sequences are consistent with an origin in southern Africa, followed by rapid expansion as modeled by Bayesian MCMC sampling of trees.

•Characterizing the origins of subtype C HIV-1 in southern Africa, its molecular evolution in a changing landscape of host, virus, and ARV pressures, and its epidemic patterns in at-risk populations is critical in guiding development of the next generation of drugs, vaccines, and prevention strategies. Further studies should elucidate the complex selection factors driving viral evolution and diversity.

Figure 1 legend (sampling year): 2006, N=90; 2001, N=27; 1998, N=56; 1991, N=39.

Figure 1. Maximum-likelihood tree of 178 ZW subtype C sequences, demonstrating temporal clustering and increasing divergence over the 15-yr epidemic period.

Figure 2. Time-resolved phylogenies constructed with Bayesian MCMC. Across various population growth models, rapid epidemic expansion is seen during 1979-1981 with multiple clusters of introduction. Percent lineage calculations demonstrate that most of the current genetic diversity was introduced during 1980-1985.
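
The percent lineage calculation mentioned in this caption can be sketched as counting the branches of a time-resolved tree that are alive at a given year and dividing by the number of current lineages. The code below is a hedged illustration only: the helper names, the toy tree, and its node dates are all hypothetical, and the denominator is assumed here to be the number of sampled tips, which the poster does not specify.

```python
def lineages_at(branches, year):
    """Count branches (start_year, end_year) that are alive at `year`."""
    return sum(1 for start, end in branches if start <= year < end)


def percent_of_current(branches, year, n_current):
    """Lineages alive at `year` as a percentage of `n_current` lineages."""
    return 100.0 * lineages_at(branches, year) / n_current


# Hypothetical toy tree with made-up node dates: root in 1970, internal
# nodes in 1978, 1981 and 1984, and five tips all sampled in 2000.
branches = [
    (1970, 1978), (1970, 1981),   # root to the two internal nodes
    (1978, 2000), (1978, 2000),   # first internal node to two tips
    (1981, 1984), (1981, 2000),   # second internal node to a node and a tip
    (1984, 2000), (1984, 2000),   # third internal node to two tips
]
n_tips = 5  # assumed denominator: number of sampled tips

for year in (1975, 1980, 1985):
    count = lineages_at(branches, year)
    pct = percent_of_current(branches, year, n_tips)
    print(year, count, f"{pct:.0f}%")
```

In this toy tree every lineage that survives to sampling already exists by 1985, which is the kind of pattern the percent lineage calculation is meant to expose.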

Figure 3. Bayesian MCMC estimates of MRCA are consistent with a founder virus introduced in ZW ~1973-75, across various population models.

Figure 4. Skyline plot analysis indicates multi-phase epidemic patterns (lag, explosive, linear) for ZW subtype C, with rapid expansion in the early 1980s.

Figure 5. Phylogeographic analysis (Slatkin-Maddison method), showing the frequency of gene flow (migrations) to/from various southern African nations. The size of each circle is proportional to the percentage of observed migrations. A. Before 1998, ZW migrations mainly involve South Africa, Zambia, and Botswana. B. After 1998, ZW migration expanded to include Mozambique and Malawi. C & D. Geographic representations of observed migrations. Size of arrow represents percent migration.
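
The idea behind the Slatkin-Maddison method in this caption can be sketched in a few lines: count the minimum number of location changes on the tree by Fitch parsimony, then compare that count with counts obtained after shuffling the location labels among the tips. The code below is a simplified illustration, not the MacClade implementation; the tree, tip names, country assignments, and function names are all hypothetical.

```python
import random


def fitch_migrations(tree, location):
    """Return (state_set, n_changes) for a nested-tuple binary tree.

    Tips are strings; internal nodes are 2-tuples of subtrees.
    """
    if isinstance(tree, str):  # tip: its state is its sampling location
        return {location[tree]}, 0
    left_set, left_n = fitch_migrations(tree[0], location)
    right_set, right_n = fitch_migrations(tree[1], location)
    shared = left_set & right_set
    if shared:  # children agree on at least one location: no change needed
        return shared, left_n + right_n
    # children disagree: one migration event is required at this node
    return left_set | right_set, left_n + right_n + 1


def slatkin_maddison(tree, location, n_perm=1000, seed=0):
    """Observed migration count and permutation p-value for structure."""
    rng = random.Random(seed)
    observed = fitch_migrations(tree, location)[1]
    tips = sorted(location)
    labels = [location[t] for t in tips]
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        permuted = dict(zip(tips, shuffled))
        # Count shuffles needing as few or fewer migrations than observed.
        if fitch_migrations(tree, permuted)[1] <= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical toy tree: two clades, each sampled in a single country.
tree = ((("zw1", "zw2"), ("zw3", "zw4")), (("za1", "za2"), ("za3", "za4")))
location = {name: name[:2].upper() for name in
            ("zw1", "zw2", "zw3", "zw4", "za1", "za2", "za3", "za4")}

observed, p_value = slatkin_maddison(tree, location)
print(f"migrations = {observed}, permutation p = {p_value:.3f}")
```

A small p-value means the tree needs fewer migrations than shuffled labels would, which points to compartmentalization; a count close to the shuffled values points to frequent gene flow between locations.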
