The Cobweb of life revealed by Genome-Scale estimates of Horizontal Gene Transfer

Fan Ge, Li-San Wang, Junhyong Kim

Mourya Vardhan

Pallapotu, Naga Venkata Alekhya

Outline

• Controversy: the extent to which HGT affects the core genealogical history

• Examination of this controversy by assessing the extent of HGT among core orthologous genes

• A novel statistical method: assesses the extent of HGT based on comparisons of tree topology


Introduction

• Horizontal gene transfer (HGT) refers to the transfer of genes between organisms in a manner other than traditional reproduction.

• Whole genome analyses of different prokaryotes have been thought to indicate rampant HGTs

• There is an ongoing debate over the estimation of HGT frequency and its impact on phylogeny

• Inference of HGT from tree comparisons should be done under a proper statistical framework


Methodology to Assess the Extent of HGT

• New method to explicitly test for phylogenetic incongruence due to horizontal transfer versus statistical tree errors

• Used Clusters of Orthologous Groups (COGs) from the NCBI databases

• Extracted the most reliable COGs

• Built a gene tree for every COG and integrated them to construct the whole-genome (W-G) tree

• Compared each gene tree with the W-G tree to infer significant HGT

• Extended this method to pairwise comparisons of gene trees to detect conflicts


High-Quality Gene Groups and the W-G Tree

• The COG database was rebuilt by redoing the sequence comparisons across 43 genomes

• This resulted in the retention of 297 high-quality COG entries out of 3,852

• To approximate the W-G tree, the authors used a median tree estimator

• The estimate used bootstrap values from bootstrap sampling
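A median tree can be read as the candidate tree with the smallest total distance to all gene trees. The sketch below illustrates only that idea and is not the estimator used in the paper. It assumes every tree covers the same taxa, takes the gene trees themselves as the candidate set, measures distance by symmetric difference, and ignores bootstrap values. The nested-tuple tree encoding and all function names are hypothetical.

```python
def leaves(tree):
    """Return the leaf labels of a nested-tuple tree as a frozenset."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the tree, treating it as unrooted."""
    taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = taxa - clade
        # Keep only splits with at least two taxa on each side.
        if len(clade) >= 2 and len(other) >= 2:
            found.add(frozenset([clade, other]))
        return clade

    visit(tree)
    return found


def sd(t1, t2):
    """Symmetric difference: splits present in exactly one of the two trees."""
    return len(splits(t1) ^ splits(t2))


def median_tree(candidates, gene_trees):
    """Candidate with the smallest summed distance to all gene trees."""
    return min(candidates, key=lambda c: sum(sd(c, g) for g in gene_trees))


# Toy data: two gene trees agree, one disagrees on the placement of B and C.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
gene_trees = [tree_a, tree_a, tree_b]

wg_estimate = median_tree(gene_trees, gene_trees)
print(wg_estimate)
```

With the toy data, `tree_a` has total distance 2 to the gene trees and `tree_b` has total distance 4, so `tree_a` is selected.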


Detection of HGT events

• HGTs are inferred by comparing estimated gene trees against other gene trees or against trees that represent the history of genomes

• Discrepancies between the trees may be caused by HGT or by other errors

• Distance metrics are used to test discrepancies

• As an additional precaution, the paper explicitly asks whether the discrepancies are caused by HGT events


Comparison Metrics

• Maximum agreement subtree (MAST): two trees that differ in some branches still share a common subtree; the MAST metric bounds the size of that shared subtree, and is low when few taxa must be pruned to reach it

• Symmetric difference (SD): counts the branch splits that appear in one tree but not in the other
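To make the two metrics concrete, here is a minimal sketch on toy trees. It assumes both trees have the same leaf set, encodes trees as nested tuples (a hypothetical representation), and computes the MAST distance by brute force as the smallest number of leaves to prune before the trees agree. The slides say PAUP was used for the actual calculations; this code is only an illustration, and the brute-force search is practical only for small trees.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf labels of a nested-tuple tree as a frozenset."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the tree, treating it as unrooted."""
    taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = taxa - clade
        if len(clade) >= 2 and len(other) >= 2:
            found.add(frozenset([clade, other]))
        return clade

    visit(tree)
    return found


def symmetric_difference(t1, t2):
    """SD: number of splits found in exactly one of the two trees."""
    return len(splits(t1) ^ splits(t2))


def restrict(tree, keep):
    """Prune leaves not in `keep` and collapse nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [k for k in (restrict(c, keep) for c in tree) if k is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)


def mast_distance(t1, t2):
    """Fewest leaves to prune so both trees have identical splits."""
    taxa = sorted(leaves(t1))
    for removed in range(len(taxa) + 1):
        for drop in combinations(taxa, removed):
            keep = set(taxa) - set(drop)
            if splits(restrict(t1, keep)) == splits(restrict(t2, keep)):
                return removed


# Toy pair: taxon A sits at one end of the first tree and next to F in the second.
tree_1 = ("A", ("B", ("C", ("D", ("E", "F")))))
tree_2 = ("B", ("C", ("D", ("E", ("F", "A")))))

print(symmetric_difference(tree_1, tree_2), mast_distance(tree_1, tree_2))
```

In this toy pair a single relocated taxon changes every internal split, so SD is 6, while pruning that one taxon restores agreement, so the MAST distance is 1.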


Interpretation of HGT events…

• Case 1: If both MAST and SD are low, the trees are most likely not different

• Case 2: If both metrics are large, the cause can be either HGT events or errors

• Case 3: If SD is large and MAST is low, the cause is most likely an HGT event

• Case 4: Large MAST with low SD cannot occur, for algorithmic reasons


SD and MAST scores for Gene Tree 1 and the W-G tree are 2 and 2, while the scores for Gene Tree 2 and the W-G tree are 8 and 2


The Hypothesis Test

• Test statistic Ɣ: the difference of the two metrics

• Significance is computed from a null distribution generated by bootstrapping gene trees

• HGT was inferred when the observed Ɣ was significant, with a p-value below the 5% level

• In simulation studies applied to each COG tree, the test at the 5% significance level detected HGT events at the following rates:

HGT Events   Detection Rate (%)

1   53.8

2   70

3   77.3
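The step from a bootstrapped null distribution to a p-value can be sketched as follows. This is a generic empirical p-value, not the paper's exact procedure: it assumes larger values of the test statistic indicate stronger evidence of HGT, and the null values here are placeholder numbers rather than statistics computed from bootstrapped gene trees. The function name and the add-one correction are choices made for this illustration.

```python
def empirical_p_value(observed, null_values):
    """One-sided empirical p-value with an add-one correction.

    Counts null replicates at least as large as the observed statistic.
    The add-one correction keeps the p-value strictly above zero.
    """
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Placeholder null distribution: 99 replicate statistics (assumed values).
null_values = list(range(99))

p_extreme = empirical_p_value(97, null_values)  # only 97 and 98 reach it
p_typical = empirical_p_value(50, null_values)  # 49 replicates reach it

for label, p in [("extreme", p_extreme), ("typical", p_typical)]:
    verdict = "putative HGT" if p < 0.05 else "not significant"
    print(label, p, verdict)
```

With these placeholder values the extreme statistic gets p = 0.03 and would be flagged at the 5% level, while the typical one gets p = 0.5 and would not.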


• ds is the SD metric

• dm is the MAST metric

• m, n are the numbers of branch splits in the two trees

• X is the number of taxa

• The PAUP software was used for the calculations


HGT Estimation via Comparisons between Each Gene Tree and the W-G Tree

• Hypothesis Test was applied to each COG

• Observations showed that the test result does not vary significantly with the p-value threshold

• At 5% level, 33/297 (11.1%) COGs showed putative HGTs

• These COGs are termed hCOGs


The Relationship between Detecting COG entries with HGT and the p-Values


HGT Estimation via Comparisons among Gene Trees

• A problem with comparing each gene tree to the W-G tree is that the results are sensitive to the W-G tree

• COG entries do not all share the same taxa

• If a COG is an hCOG, its tree should test as different in all of its comparisons

• 14,004 pairs of gene trees that shared at least six taxa were compared

• At 5% level, 1,764/14,004 (12.6%) pairs were significant
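The pair selection can be sketched as below: enumerate every pair of gene trees and keep those that share at least six taxa, so each comparison can be restricted to the shared taxa. The COG labels and taxon names are made up for illustration, and the function name is hypothetical. The last lines only re-check the percentage reported above.

```python
from itertools import combinations


def comparable_pairs(taxa_by_cog, min_shared=6):
    """Return (cog_a, cog_b, shared_taxa) for pairs with enough shared taxa."""
    pairs = []
    for a, b in combinations(sorted(taxa_by_cog), 2):
        shared = taxa_by_cog[a] & taxa_by_cog[b]
        if len(shared) >= min_shared:
            pairs.append((a, b, shared))
    return pairs


# Hypothetical taxon sets for three gene trees.
taxa_by_cog = {
    "cog_a": {"t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8"},
    "cog_b": {"t1", "t2", "t3", "t4", "t5", "t6"},
    "cog_c": {"t5", "t6", "t7", "t8", "t9", "t10"},
}

pairs = comparable_pairs(taxa_by_cog)
print([(a, b, len(shared)) for a, b, shared in pairs])

# Share of significant pairs reported on the slide, as a percentage.
significant_percent = round(100 * 1764 / 14004, 1)
print(significant_percent)
```

Only the first two toy gene trees share six taxa, so only that pair is kept.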


Identification of transferred branches in gene trees.

• For each COG that tested positive for HGT events, transferred branches were found by exhaustive enumeration of possible subtree matches

• Searched all combinations of branch prunings to find the “troublesome” branches

• If there is only one way to prune that makes the trees congruent, the pruned branch is taken to be an HGT event
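The pruning search can be illustrated with a small brute-force sketch. It simplifies the procedure described above by pruning single leaves rather than whole branches, assumes both trees have the same leaf set, and uses a hypothetical nested-tuple tree encoding. It lists every smallest set of leaves whose removal makes the two trees congruent; a unique answer points at one candidate transfer, while several answers leave the case ambiguous.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf labels of a nested-tuple tree as a frozenset."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the tree, treating it as unrooted."""
    taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = taxa - clade
        if len(clade) >= 2 and len(other) >= 2:
            found.add(frozenset([clade, other]))
        return clade

    visit(tree)
    return found


def restrict(tree, keep):
    """Prune leaves not in `keep` and collapse nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [k for k in (restrict(c, keep) for c in tree) if k is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)


def minimal_prunings(t1, t2):
    """All smallest leaf sets whose removal makes the trees congruent."""
    taxa = sorted(leaves(t1))
    for size in range(len(taxa) + 1):
        hits = [
            drop
            for drop in combinations(taxa, size)
            if splits(restrict(t1, set(taxa) - set(drop)))
            == splits(restrict(t2, set(taxa) - set(drop)))
        ]
        if hits:
            return hits
    return []


# Unique pruning: taxon A moved from one end of the tree to the other.
reference = ("A", ("B", ("C", ("D", ("E", "F")))))
gene_tree = ("B", ("C", ("D", ("E", ("F", "A")))))
unique_case = minimal_prunings(reference, gene_tree)

# Ambiguous pruning: a local rearrangement among A, B and C.
ambiguous_case = minimal_prunings(
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
)
print(unique_case, ambiguous_case)
```

In the first pair only the removal of A reconciles the trees, which matches the one-way-to-prune criterion. In the second pair removing any one of A, B or C works, so no single branch can be singled out.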


Color HGT Rates

Red >4%

Yellow 3%–4%

Pink 2%–3%

Blue 1%–2%

Green <1%


References

1. Goddard W, Kubicka E, Kubicki G, McMorris FR (1994) The agreement metric for labeled binary trees. Math Biosci 123: 215–226.

2. Robinson DF, Foulds LR (1981) Comparison of phylogenetic trees. Math Biosci 53: 131–147.

3. Conover WJ (1999) Practical nonparametric statistics, 3rd ed. New York: Wiley. 584 p.

4. Eisen JA (2000) Horizontal gene transfer among microbial genomes: New insights from complete genome analysis. Curr Opin Genet Dev 10: 606–611.


Thank You!