24/09/17
Sequence quality: GMI Proficiency Tests for
Whole Genome Sequencing of bacteria
Research Group of Genomic Epidemiology, National Food Institute, Technical University of Denmark
EURL-AR Training course 2017
Presented by Pimlapas Leekitcharoenphon (Shinny)
(DTU-Food)
GMI
(www.globalmicrobialidentifier.org)
Objectives of GMI PT
• The main objective of the annual proficiency test (PT) is to facilitate the production of reliable laboratory results of consistently good quality within the area of whole genome sequencing (WGS) by
– Selecting two strains of three species of public health importance
– Selecting species that range in sequencing difficulty
– Assessing the sequencing quality based on a set of quality markers, e.g. N50 and number of contigs, but also the ability to identify epidemiological markers such as MLST and resistance genes
– Identifying underperforming participants
• To facilitate harmonization and standardization of whole genome sequencing and data analysis by setting tentative arbitrary quality control thresholds
Objectives of GMI PT
- To quantify differences among laboratories in order to facilitate the development of reliable laboratory results of consistently good quality within the area of DNA preparation, sequencing, and analysis (e.g. phylogeny).
- To facilitate harmonization and standardization in whole genome sequencing and data analysis
Structure of GMI PT, wet-lab
Component 1a
Material provided: Bacterial cultures (lyophilized)
• DNA extraction, purification
• Library preparation and whole-genome sequencing of six bacterial cultures
Component 1b
Material provided: Purified DNA (pre-prepared, dried)
• Library preparation and whole-genome sequencing of the same six bacterial cultures
Results
• Submission of reads (via a portal or ftp site)
• Survey response
– Method details
– MLST (optional)
– Resistance genes (optional)
Development of GMI PT
2014 – pilot PT
2015 – ‘full roll-out’
Salmonella (2), E. coli (2), S. aureus (2)
2016
K. pneumoniae (2), L. monocytogenes (2), C. coli (1), C. jejuni (1)
Participation in the 2016 GMI PT
• 46 laboratories in 22 countries provided data for at least one of the PT components
– Australia (3), Austria, Belgium (2), Canada (2), Denmark (3), Finland, France, Germany (3), Hong Kong, Italy (7), Latvia, Luxembourg, Mexico, the Netherlands (3), Poland, Portugal, Singapore (2), Sweden (2), Switzerland, Taiwan, the United Kingdom (2), the United States (6)
DTU Food, Technical University of Denmark
Measured QC parameters
• Number of reads mapped to
– reference total DNA sequence
– reference chromosome
– reference plasmid #1
– reference plasmid #2
– reference plasmid #3
– and unmapped reads
• Proportion of reads mapped to the above
• Depth of coverage of the above
• Size of assembled genome
• Size of assembled genome per total size of DNA sequence (%)
• Total number of contigs
• Number of contigs > 200 bp
• N50
• NG50
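The read-mapping parameters above can be illustrated with a minimal sketch. The slides do not say how the PT pipeline derives these numbers, so the `Target` structure, the function name, and the depth approximation (mapped reads × mean read length / reference length) are assumptions, and the read counts are made-up illustration values, not PT data.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """One mapping target, e.g. the chromosome or a plasmid (hypothetical structure)."""
    name: str
    length_bp: int      # length of the closed reference sequence
    mapped_reads: int   # number of reads mapped to this target


def mapping_summary(targets, unmapped_reads, mean_read_length_bp):
    """Return per-target proportion of reads (%) and approximate depth of coverage."""
    total_reads = sum(t.mapped_reads for t in targets) + unmapped_reads
    summary = {}
    for t in targets:
        proportion = 100.0 * t.mapped_reads / total_reads
        # Approximation: every mapped read contributes its full length.
        depth = t.mapped_reads * mean_read_length_bp / t.length_bp
        summary[t.name] = {"proportion_pct": proportion, "depth_x": depth}
    summary["unmapped"] = {"proportion_pct": 100.0 * unmapped_reads / total_reads}
    return summary


# Illustrative numbers only.
example = mapping_summary(
    [Target("chromosome", 1_600_000, 950_000), Target("plasmid_1", 40_000, 30_000)],
    unmapped_reads=20_000,
    mean_read_length_bp=150,
)
```

Because unmapped reads are part of the denominator, the mapped and unmapped proportions always sum to 100%.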
Individual participant reports
Pending for the 2016 trial
QC parameters output
Plot legend:
• Resistance gene partly as expected
• Resistance gene not as expected
• Resistance gene as expected
• 2 times standard deviation
• 3 times standard deviation
Data from participants with obvious errors are omitted prior to analysis.
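As a minimal sketch of how values could be placed relative to 2 and 3 times the standard deviation: the slides do not state the reference point, so measuring the distance from the mean of all participants (after obvious errors are omitted) is an assumption, as are the function name and the lab labels. The values are made up.

```python
from statistics import mean, stdev


def sd_flags(values, warn_k=2.0, fail_k=3.0):
    """Label each participant's value by its distance from the mean in SD units."""
    centre = mean(values.values())
    spread = stdev(values.values())  # sample standard deviation
    flags = {}
    for lab, value in values.items():
        distance = abs(value - centre)
        if distance > fail_k * spread:
            flags[lab] = "beyond 3 SD"
        elif distance > warn_k * spread:
            flags[lab] = "beyond 2 SD"
        else:
            flags[lab] = "within 2 SD"
    return flags


# Illustrative values (% of reads mapped): 13 similar labs and one low result.
values = {f"lab_{i:02d}": 99.0 for i in range(1, 14)}
values["lab_14"] = 60.0
flags = sd_flags(values)
```

A single extreme value inflates the standard deviation itself, so with few participants even a gross outlier may stay inside 3 SD.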
Proportion of reads mapped to reference DNA sequence (%)
• The proportion of reads produced which map directly to the closed genome of the same strain (cannot exceed 100%)
Campylobacter; GMI16-001 – omitted #114 (plots; axis in %; outliers marked)
Only in Bact samples
• Indication of contamination or strain mix-up
• #83 and #115 missing reads
Size of assembled genome per total size of DNA sequence (%)
• The proportion of contigs which map directly to the closed genome of the same strain (should not exceed 100%)
Campylobacter; GMI16-001 (plots; axis in %; outliers marked)
Clear contamination
• Assembly exceeds the expected size of the reference
• #83 and #79 of the DNA and #71 of both sample types
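A minimal sketch of the size check, following the slide title (total assembly size relative to the size of the closed reference). The function names and the 5% tolerance are assumptions, since the slides only say the value should not exceed 100%; the contig lengths are made up.

```python
def assembly_size_pct(contig_lengths, reference_length_bp):
    """Total assembled size as a percentage of the closed reference size."""
    return 100.0 * sum(contig_lengths) / reference_length_bp


def exceeds_reference(contig_lengths, reference_length_bp, tolerance_pct=5.0):
    """True if the assembly is larger than the reference by more than the tolerance.

    tolerance_pct is a hypothetical allowance, not a GMI PT threshold.
    """
    return assembly_size_pct(contig_lengths, reference_length_bp) > 100.0 + tolerance_pct


# Illustrative contig lengths against a 1.6 Mbp reference.
clean = [800_000, 780_000]
oversized = [900_000, 500_000, 400_000]
clean_pct = assembly_size_pct(clean, 1_600_000)
oversized_pct = assembly_size_pct(oversized, 1_600_000)
```

An assembly well above 100% contains sequence that the reference strain does not have, which is why the slide reads it as contamination.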
Number of contigs
- Fewer is better
N50
(Diagram labels: total size of contigs; 50% of size; size of contig; N50)
N50
• Definition: The length for which the collection of all contigs of that length or longer contains at least half of the sum of the lengths of all contigs, and for which the collection of all contigs of that length or shorter also contains at least half of the sum of the lengths of all contigs. An N50 of more than 15.000 bp normally indicates good quality.
Campylobacter; GMI16-001 (plots; axis in bp; threshold line at 15.000; outlier marked)
Poor performance – short contigs
• #79 and #105 for the Bact sample
• #71, #105, and #110 for the DNA sample
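The N50 definition above can be sketched in a few lines: sort the contigs from longest to shortest and return the length of the contig at which the running total first reaches half of the assembly size. NG50 is listed among the measured parameters but not defined in the slides; the version here uses the common definition (half of the reference length as the target), which is an assumption about what the PT computes. The function names and toy lengths are hypothetical.

```python
def _nx(contig_lengths, target_bp):
    """Length of the contig at which the running total first reaches target_bp."""
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= target_bp:
            return length
    return None  # the assembly never reaches the target


def n50(contig_lengths):
    """Contigs of this length or longer hold at least half of the assembly."""
    return _nx(contig_lengths, sum(contig_lengths) / 2)


def ng50(contig_lengths, reference_length_bp):
    """Like N50, but the target is half of the reference length."""
    return _nx(contig_lengths, reference_length_bp / 2)


# Toy contig lengths (bp) for illustration only.
lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(lengths)
toy_ng50 = ng50(lengths, reference_length_bp=400)
```

Here the total is 290 bp, and the two longest contigs (80 + 70 = 150) already cover half of it, so N50 is 70. Against a 400 bp reference the third contig is needed to reach 200 bp, so NG50 is 50.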
Total number of contigs
• The total number of contigs assembled. Fewer than 1.000 contigs normally indicates good quality.
Campylobacter; GMI16-001 (plots; threshold line at 1.000; outlier marked)
Poor performance – large number of contigs
• #71, #79 and #105 for the Bact sample
• #71 and #105 for the DNA sample
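A minimal sketch of the contig counts, covering both the total number of contigs and the number of contigs > 200 bp from the list of measured parameters. The function names are hypothetical, and treating exactly 1000 contigs as already too many is an assumption about the boundary.

```python
def contig_counts(contig_lengths, min_length_bp=200):
    """Return (total number of contigs, number of contigs longer than min_length_bp)."""
    total = len(contig_lengths)
    longer = sum(1 for length in contig_lengths if length > min_length_bp)
    return total, longer


def too_fragmented(contig_lengths, max_contigs=1000):
    """True if the assembly does not have fewer than max_contigs contigs."""
    return len(contig_lengths) >= max_contigs


# Illustrative lengths (bp); 200 itself is not counted as "> 200 bp".
total, longer = contig_counts([150, 200, 201, 5000])
fragmented = too_fragmented([500] * 1200)
```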
SNP analysis
Number of SNPs per strain; Campylobacter; GMI16-001

#83:
Strain      SampleType   Number of SNPs
GMI16-001   Culture      3
GMI16-001   DNA          0

• 3 SNPs difference to the ref. (#83)
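The slides do not describe how SNPs were called. As a toy illustration of what a SNP count against the reference means, the sketch below compares two already-aligned sequences of equal length and skips positions with an ambiguous base or a gap. This is an assumption-laden simplification; real pipelines call SNPs from read mappings with quality filtering. The function name and sequences are hypothetical.

```python
def count_snps(reference, sample, ignore="N-"):
    """Count mismatching positions between two aligned sequences.

    Positions where either sequence has a character listed in `ignore`
    (ambiguous base or gap) are not counted.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    snps = 0
    for ref_base, sample_base in zip(reference.upper(), sample.upper()):
        if ref_base in ignore or sample_base in ignore:
            continue
        if ref_base != sample_base:
            snps += 1
    return snps


# Made-up sequences: two substitutions and one gap that is ignored.
ref = "ACGTACGTACGT"
alt = "ACGAACGTTCG-"
n_snps = count_snps(ref, alt)
```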
Overall results – poor performance
• Obvious outliers removed; #114 submitted data of another strain
• #83 Bact, indication of contamination
– Detected AMR genes not present in the reference genome
– Proportion of reads mapping to ref. much less than 100%
– Proportion of size per total size of ref. much higher than 100%
– 3 SNPs difference to the ref.
• #79 Bact, indication of contamination and poor performance
– Detected AMR genes not present in the reference genome
– Proportion of size per total size of ref. much higher than 100%
– A total no. of contigs higher than 1.000
– N50 lower than 15.000 bp
Overall results – poor performance, cont.
• #115 Bact, indication of contamination
– Proportion of reads mapping to ref. much less than 100%
• #71 Both sample types, indication of contamination and poor performance
– Proportion of size per total size of ref. much higher than 100%
– A total no. of contigs higher than 1.000
– N50 lower than 15.000 bp for DNA
• #105 Both sample types, indication of poor performance
– A total no. of contigs higher than 1.000
– N50 lower than 15.000 bp for DNA
Overall results – poor performance, cont.
• #110 Both sample types, indication of poor performance
– N50 lower than 15.000 bp for DNA
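The indicators used in the overall results can be combined into a simple rule-based check, sketched below. The contig and N50 limits (1.000 contigs, 15.000 bp) come from the slides; the cut-offs standing in for "much less than 100%" and "much higher than 100%" (90% and 110%) are placeholders chosen for illustration, and the function name, dictionary keys, and example values are hypothetical.

```python
def assess(metrics, min_mapped_pct=90.0, max_size_pct=110.0,
           max_contigs=1000, min_n50_bp=15_000):
    """Return a list of findings for one submission (empty list = nothing flagged)."""
    findings = []
    # Indicators of contamination (percentage cut-offs are placeholders).
    if metrics["mapped_pct"] < min_mapped_pct:
        findings.append("contamination: low proportion of reads mapped")
    if metrics["size_pct"] > max_size_pct:
        findings.append("contamination: assembly larger than reference")
    if metrics["unexpected_amr_genes"]:
        findings.append("contamination: AMR genes not in reference")
    # Indicators of poor sequencing performance (limits from the slides).
    if metrics["contigs"] > max_contigs:
        findings.append("poor performance: many contigs")
    if metrics["n50_bp"] < min_n50_bp:
        findings.append("poor performance: low N50")
    return findings


# Made-up submissions for illustration.
good = {"mapped_pct": 99.2, "size_pct": 98.5, "unexpected_amr_genes": [],
        "contigs": 45, "n50_bp": 180_000}
bad = {"mapped_pct": 71.0, "size_pct": 135.0, "unexpected_amr_genes": ["geneX"],
       "contigs": 2300, "n50_bp": 4_000}
good_findings = assess(good)
bad_findings = assess(bad)
```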
Summary of PT 2016
• The interpretation of the MLST data and final layout of the QC are pending but scheduled to be finished in May ’17
• The individual participant reports will be disseminated before July ’17
• PT report 2016 online before July ’17 and 2015 report before Sep ’17
• Satisfactory results for most labs, except for
– #71, #79, #83, #114, #115 due to contamination
– #71, #79, #105, #110 due to poor sequencing performance
• Continuation in 2017 focusing on Salmonella, E. coli and S. aureus
Acknowledgement
Oksana Lukjancenko (DTU Food)
Susanne Klarsmose Pedersen (DTU Food)
Pimlapas Leekitcharoenphon (DTU Food)
Rolf Sommer Kaas (DTU Food)
Inge Marianne Hansen (DTU Food)
Jacob Dyring Jensen (DTU Food)
Frank Aarestrup (DTU Food)
Ole Lund (DTU Systems Biology)
Jose Luis Bellod Cisneros (DTU Systems Biology)
James Pettengill (US FDA)
Division of Microbiology (CFSAN/FDA)
Anthony Underwood (PHE)
Brian Beck (Microbiologics)
Isabel Cuesta de la Plaza (ISCIII)
Angel Zaballos (ISCIII)
Jorge De La Barrera Martinez (ISCIII)
… and the rest of WG 4 (‘advisory group’)
GMI is supported by:
Thank you for your attention
Pimlapas Leekitcharoenphon (Shinny), PhD
Research Group Genomic Epidemiology
WHO Collaborating Centre for Antimicrobial Resistance in Foodborne Pathogens and Genomics
European Union Reference Laboratory for Antimicrobial Resistance
National Food Institute, Technical University of Denmark