Introduction to nullarbor
The Milner Centre for Evolution
Department of Biology & Biochemistry, University of Bath
http://www.climb.ac.uk/
http://www.sheppardlab.com/
Ben Pascoe
with Maciej Filocha (Warwick) & Mark Pallen (Warwick)
1. Introduction to nullarbor
2. Setting up nullarbor with the test Campylobacter dataset
3. Launching your own VM
4. (Something for you to try later: Shigellosis nullarbor tutorial)
5. Checking your nullarbor output
Introduction to nullarbor
Sequencing
https://github.com/tseemann/nullarbor
Nullarbor - "Reads to report" for public health and clinical microbiology
http://www.slideshare.net/torstenseemann/bioinformatics-tools-for-the-diagnostic-laboratory-tseemann-antimicrobials-2016-melb-au-sat-27-feb-2016
What is nullarbor?
Per isolate
Clean/trim sequence reads (Trimmomatic)
• Remove adaptors, trim on quality scores
Species identification (Kraken)
• K-mer analysis against KNOWN database
De novo assembly (MEGAHIT/SPAdes)
• Fast, confident genome assembly
Annotation (Prokka)
• Genome annotation
MLST calling
• From KNOWN databases
Resistome (Abricate)
• ID AMR genes from KNOWN database
Variant calling from reads compared to reference
Per dataset
Core genome SNPs (Snippy, from reads)
Phylogenetic trees (FastTree)
Accessory genome (Roary)
Report generation
Workshop isolates
4 Campylobacter isolates
All LAB strains – should all be VERY similar…
Run nullarbor
How similar are the isolates?
Is there an explanation for any difference observed?
Implications: 11168 is widely used as a lab strain, and many molecular studies are based on this reference strain
Campylobacter: background
Sheppard et al. (2009) Clinical Infectious Diseases 48:1072–1078
Sheppard et al. (2010) Applied Environmental Microbiology 76, 5269-5277
Campylobacter: source attribution
Campylobacter: introgression
Campylobacter: GWAS
Linking phenotypes and genotypes using GWAS:
Asymptomatic isolates Vs Symptomatic isolates
Weights association compared to relative position on the tree
Sheppard et al, PNAS 2013; Pascoe et al, Environmental Microbiology 2015; Monteil et al, Microbial Genomics 2016; Yahara & Meric et al, Environmental Microbiology 2017
Development of GWAS for use with bacteria: GWAS within clonal complex
Sheppard et al (2013) PNAS 110: (29) 11923-11927
Cattle isolates Vs Chicken isolates
Pascoe et al (2015) Environmental Microbiology
DOI: 10.1111/1462-2920.13051
Good Vs Bad Biofilm isolates
Previous studies were confined to single clonal complex:
Bacteria are clonal – associations are difficult to detect and are biased by lineage effects (inheritance from a common ancestor).
Accessory genome – bacterial genomes are all different sizes!
Development of GWAS for use with bacteria: pan-genome GWAS
Symptomatic / Asymptomatic
Paired isolates for pan-genome GWAS
FastML tree of 36 paired isolates (pan-genome)
Reduce false positives
Maintain statistical power
Not confined to single clonal complex
Zero unmapped words
Mageiros & Meric et al, unpublished; Pascoe et al, unpublished
Previous studies were confined to single clonal complex:
Association weighted against the clonal frame (tree)
Paired isolates from many CCs.
Use of reference pan-genome instead of 1 single reference genome.
Genome-wide association of Campylobacter genetic elements with disease severity / asymptomatic carriage
Pascoe et al, unpublished
High statistical association: glycosylation genes
Iron uptake Motility
*scores for all genes in pan-genome from all 77 isolates – 2,996 genes
Thousands of ‘this’ in ~3,000 genes!
Access VM using VM box
Using your IP address:
gambia-1: 137.205.69.151
gambia-2: 137.205.69.153
gambia-3: 137.205.69.154
gambia-4: 137.205.69.155
gambia-5: 137.205.69.156
gambia-6: 137.205.69.157
gambia-7: 137.205.69.158
gambia-8: 137.205.69.159
gambia-9: 131.251.130.226
gambia-10: 131.251.130.227
For all:
User: ubuntu
Password: password123
Check we have all the files you need
What do we need?
• Input file: allinput.tab
• Reference genome: al111168.fasta
• Reads from MiSeq: *.fastq.gz (8 files, 4 isolates)
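The file check can be scripted before setting up the run. The sketch below is only an illustration: the function name check_inputs is hypothetical, the file names are those listed above, and the three-column check assumes that the nullarbor input file holds one isolate per line as tab-separated isolate ID, R1 path and R2 path.

```shell
#!/usr/bin/env bash
# Sketch only: check_inputs is a hypothetical helper, not part of nullarbor.
# It looks for the files named on the slide inside one directory.
check_inputs() {
    local dir="${1:-.}" status=0 f n

    # The input table and the reference genome must exist and be non-empty.
    for f in allinput.tab al111168.fasta; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done

    # Assumption: each line of the input table is ID <TAB> R1 <TAB> R2.
    if [ -s "$dir/allinput.tab" ] &&
       ! awk -F'\t' 'NF != 3 { bad = 1 } END { exit bad + 0 }' "$dir/allinput.tab"; then
        echo "allinput.tab: expected 3 tab-separated columns per line" >&2
        status=1
    fi

    # 4 paired-end isolates should give 8 read files.
    n=$(find "$dir" -maxdepth 1 -name '*.fastq.gz' | wc -l | tr -d '[:space:]')
    echo "found $n read files"
    if [ "$n" -ne 8 ]; then
        echo "expected 8 *.fastq.gz files" >&2
        status=1
    fi

    return "$status"
}
```

Calling check_inputs with the working directory as its argument returns 0 only when every check passes, so it can be chained in front of the setup command with &&.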
Setup nullarbor
nullarbor.pl --name gambia --mlst campylobacter --ref al111168.fasta --input allinput.tab --outdir output --verbose
• Type the command above to set up nullarbor
• Nullarbor will perform checks and give you the command to use to start the run:
nice make -j 1 -C /home/ubuntu/gambia/output
• Can also run with ‘no hangup’, so the run survives logging out:
nohup nice make -j 1 -C /home/ubuntu/gambia/output &
It will run for a couple of hours…
Launching your own VM
https://discourse.climb.ac.uk/
Nullarbor output: report example
Workshop isolates
Are all four isolates very similar?
Which of the 4 isolates were contaminated?
Which isolate was passaged through a chicken?
Nullarbor tutorials on discourse.climb.ac.uk: Can you run this on your own VM?