http://comet.retrovirology.lu
COMET: Rapid and reliable HCV subtype predictor
(Context-based Modeling for Expeditious Typing)
Daniel Struck, CRP-SANTÉ
Laboratory of Retrovirology ([email protected])
Presented at the 10th EU Meeting on HIV & Hepatitis, 28 - 30 March 2012, Barcelona
Background
HCV genotype/subtype is a strong predictor of sustained virological response (SVR) to ribavirin/interferon and to new HCV inhibitor therapies
Only one online HCV genotyping/subtyping tool based on phylogenetic analysis:
Oxford HCV Automated Subtyping Tool (Version 2.0)
COMET: context-based modeling for classification of HIV sequences, introduced in 2010, now adapted to HCV subtyping
Comparison of Oxford tool vs COMET:
Paired NS3-NS5B sequences (CRP dataset):
=> concordance in 182/183 cases (mismatch due to recombinant)
Subtyping full-length NS5B versus the conserved region (PR3-PR5, euHCVdb dataset):
=> 98.1% (1933/1970) match
Dataset 1: NS3, NS5B sequences
Gene   Origin         n      Oxford / COMET subtype agreement   Discordance details
NS3    LANL, CRP      762    99.70%                             0.3% recombination events
NS5B   euHCVdb, CRP   2309   94.00%                             2.1% COMET & Oxford WRONG
                                                                1.14% unassigned by COMET
                                                                6.6% unassigned by Oxford
COMET HCV subtyping improvements
Improvements:
Adapt recombination detection threshold to HCV
Introduce bootstrap values to assess quality of prediction
Dataset 2 : euHCVdb
The European HCV Database
min. length: 250
max length: 8500
exclude identical sequences
only seq. with provisional subtype
10223 sequences
average length: 726
79 different subtypes
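The selection criteria above can be sketched in a few lines of Python. This is only an illustration of the stated filters, not the pipeline used for the study: the function name `filter_sequences`, the record layout `(seq_id, sequence, provisional_subtype)`, the inclusive length bounds, and the case-insensitive duplicate check are all assumptions.

```python
def filter_sequences(records, min_len=250, max_len=8500):
    """Apply the dataset filters to (seq_id, sequence, provisional_subtype) tuples.

    Hypothetical helper: keeps sequences whose length lies within
    [min_len, max_len] (bounds assumed inclusive), that carry a provisional
    subtype, and that are not identical to an already kept sequence.
    """
    seen = set()   # sequences already kept, used to drop exact duplicates
    kept = []
    for seq_id, seq, subtype in records:
        seq = seq.upper()  # assumption: identity is checked case-insensitively
        if not (min_len <= len(seq) <= max_len):
            continue       # outside the length window
        if not subtype:
            continue       # no provisional subtype available
        if seq in seen:
            continue       # identical to a sequence kept earlier
        seen.add(seq)
        kept.append((seq_id, seq, subtype))
    return kept
```

The first copy of a duplicated sequence is the one retained; the slides do not say which copy was kept, so that choice is arbitrary here.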
dataset subtype   #      %
1b                2814   27.53
1a                2609   25.52
3a                1225   11.98
4a                595    5.82
5a                361    3.53
6a                343    3.36
2a                203    1.99
2b                174    1.70
3b                149    1.46
4d                131    1.28
other (<1%)       1619   15.84
Result genotype (n=10223)
(*) genotype verified with neighbor joining tree using the LANL HCV reference sequences 2012

COMET <> euHCVdb provisional genotype
result                                             bootstrap value   #
same genotype                                                        10193 (99.7%)
different genotype, unassigned                                       15 (0.1%)
different genotype, wrong genotype for COMET (*)   100.00%           1
                                                   80% - <100%       3
                                                   <80%              9
                                                   total             13 (0.1%)

Oxford <> euHCVdb provisional genotype
result                                             Oxford output        #
same genotype                                                           9403 (92%)
different genotype, unassigned                     Check the Bootscan   4
                                                   Check the report     483
                                                   Sequence Error       272
                                                   Unassigned           61
                                                   total                820 (8%)
Result subtype (n=10223)
COMET <> euHCVdb provisional subtype
result                               bootstrap (average)   #      %        COMET training (median)
same subtype                         99.58%                9613   94.03%   1-5 (2)
different subtype, unassigned        /                     15     0.10%
different subtype, “wrong” subtype   78.43%                595    5.82%    0-1 (0)

Oxford <> euHCVdb provisional subtype
result                               bootstrap (average)   #      %
same subtype                         99.94%                8094   79.17%
different subtype, unassigned        /                     820    8.02%
different subtype, only genotype     98.17%                419    4.10%
different subtype, “wrong” subtype   96.19%                890    8.71%
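As a quick check of the arithmetic, the Oxford percentages can be recomputed from the raw counts. The sketch below assumes only the four counts reported on this slide; the helper name `shares` and the dictionary keys are illustrative.

```python
def shares(counts):
    """Return the total and each category's share of it in percent (2 decimals)."""
    total = sum(counts.values())
    return total, {name: round(100 * n / total, 2) for name, n in counts.items()}


# Counts for the Oxford tool versus the euHCVdb provisional subtype
oxford_counts = {
    "same subtype": 8094,
    "unassigned": 820,
    "only genotype": 419,
    "wrong subtype": 890,
}

total, pct = shares(oxford_counts)
print(total)  # the four categories together cover all n=10223 sequences
print(pct)
```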
COMET unassigned results (15)
euHCVdb  COMET                         Recco  Oxford                support
1a       unassigned_1;1b,1a            1b-1a  Genotype 1 subtype a  95
1b       unassigned_1;1b,2a            2a-1b  Genotype 1 subtype b  100
2c       unassigned_1;2a,2c            2a-2c  Genotype 2 subtype c  100
2e       unassigned_1;2q,2a            2a-2q  Genotype 2            100
2k       unassigned_1;1b,2a,2k         2k-1b  Check the Bootscan
2k       unassigned_1;1b,2a,2k         2k-1b  Check the Bootscan
2e       unassigned_1;2q,2a            2q-2j  Genotype 2            100
3a       unassigned_1;6e,6h,3a         3a-6h  Check the Bootscan
2k       unassigned_1;2q,2k            2k     Genotype 2 subtype k  100
2k       unassigned_1;2q,2k            2k     Genotype 2 subtype k  100
3b       unassigned_1;6e,3b,3a         3b     Genotype 3 subtype b  100
3b       unassigned_1;6e,3b,3a         3b     Genotype 3 subtype b  100
3b       unassigned_1;3b,3a            3b     Genotype 3 subtype b  100
6d       unassigned_1;6e,6d            6d     Genotype 6 subtype d  100
6h       unassigned_1;6h,6i,1a,1c,6e   6h     Genotype 6 subtype h  100
Benchmark
Server setup: 2x quad core (Q1 2009, 2.93 GHz)
Analysis of the 10223 HCV sequences took 68 seconds
=> 150 sequences / second
Analysis of 500,000 sequences would take about 55 minutes
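The projection follows directly from the measured throughput. A minimal sketch of the calculation, assuming runtime scales linearly with the number of sequences on the same server (the function names are illustrative):

```python
def throughput(n_sequences, seconds):
    """Sequences processed per second."""
    return n_sequences / seconds


def projected_minutes(n_sequences, seqs_per_second):
    """Estimated runtime in minutes, assuming linear scaling."""
    return n_sequences / seqs_per_second / 60


rate = throughput(10223, 68)                # roughly 150 sequences per second
minutes = projected_minutes(500_000, rate)  # roughly 55 minutes
print(f"{rate:.0f} sequences/s, {minutes:.0f} min for 500,000 sequences")
```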
Summary
Reliable prediction of HCV subtype
Limitations due to the availability of near full length reference sequences
Improve COMET training for specific genes?
Generally it is best to compare the results of different approaches to define the subtype of a sequence (COMET: alternative to the phylogeny-based Oxford tool)
High performance and scalability suitable for deep sequencing (454) analysis
Acknowledgements
CRP-Santé, Laboratory of Retrovirology
Jean-Claude Schmit
Carole Devaux
Danielle Perez Bercoff
Sandrine Ortiou
COMET website
Poster: O_15
http://comet.retrovirology.lu
COMET website results