Meet the ants
Camponotus floridanus: Carpenter ant
Harpegnathos saltator: Jumping ant
Solenopsis invicta: Red imported fire ant
Pogonomyrmex barbatus: Harvester ant
Linepithema humile: Argentine ant
Atta cephalotes, Acromyrmex echinatior: Leafcutter ants
Now meet their genomes…
Species | Citation | Platform (coverage) | Assembly program(s) | Scaffold N50 (total length)
Harpegnathos saltator (Jumping ant) | Bonasio et al. 2010 Science | Illumina (104x) | SOAPdenovo (6 libraries: 3 paired end, 3 mate pair) | 598 Kb (297 Mb)
Camponotus floridanus (Carpenter ant) | Bonasio et al. 2010 Science | Illumina (102x) | SOAPdenovo (3 paired end, 3 mate pair) | 603 Kb (238 Mb)
Acromyrmex echinatior (Leafcutter ant) | Nygaard et al. 2011 Genome Research | Illumina (123x) | SOAPdenovo (5 libraries: 2 paired end, 3 mate pair) | 1.1 Mb (300 Mb)
Atta cephalotes (Leafcutter ant) | Suen et al. 2011 PLoS Genetics | 454 (18-20x) | Roche GS Assembler | 5.1 Mb (317 Mb)
Solenopsis invicta (Fire ant) | Wurm et al. 2011 PNAS | 454 + Illumina (~55x) | SOAPdenovo + Roche GS Assembler | 720 Kb (353 Mb)
Linepithema humile (Argentine ant) | Smith et al. 2011 PNAS | 454 + Illumina (23x) | Roche GS Assembler + Celera CABOG | 1.3 Mb (43 Mb)
Pogonomyrmex barbatus (Harvester ant) | Smith et al. 2011 PNAS | 454 (10-12x) | Celera CABOG | 793 Kb (235 Mb)
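The scaffold N50 reported for each genome is the length L such that scaffolds of length L or longer contain at least half of the assembled bases. A minimal sketch of that calculation follows; the function name `n50` and the example lengths are illustrative and do not come from any of the cited assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    summed length of all scaffolds.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no lengths given")


# Illustrative lengths only (not real assembly data).
example_lengths = [10, 20, 30, 40]
print(n50(example_lengths))
```

A larger N50 for the same total length means the assembly is split into fewer, longer pieces.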
Generic assembly procedure
Assemble fragments into contigs
Scaffolding: connecting contigs using mate-pair information
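The scaffolding idea can be illustrated with a toy example. The sketch below assumes each mate pair has already been mapped and reduced to a `(left_contig, right_contig)` link, and it chains contigs whose links have enough support. It ignores contig orientation and gap-size estimation, which real scaffolders handle, and `build_scaffolds` and `min_links` are hypothetical names, not part of any assembler.

```python
from collections import Counter


def build_scaffolds(contigs, mate_links, min_links=2):
    """Chain contigs into scaffolds using counts of mate-pair links.

    mate_links is a list of (left_contig, right_contig) tuples, one per
    mate pair whose two reads map to different contigs.
    """
    support = Counter(mate_links)
    successor = {}      # contig -> contig placed to its right
    has_pred = set()    # contigs that already have a left neighbour

    # Accept the best-supported links first; each contig gets at most
    # one neighbour on each side.
    for (left, right), n in support.most_common():
        if n < min_links or left == right:
            continue
        if left in successor or right in has_pred:
            continue
        successor[left] = right
        has_pred.add(right)

    scaffolds, placed = [], set()
    for contig in contigs:
        if contig in has_pred:
            continue  # not the start of a chain
        chain = [contig]
        while chain[-1] in successor:
            chain.append(successor[chain[-1]])
        scaffolds.append(chain)
        placed.update(chain)

    # Contigs caught in a circular set of links are left unscaffolded.
    scaffolds.extend([c] for c in contigs if c not in placed)
    return scaffolds


links = [("c1", "c2")] * 3 + [("c2", "c3")] * 2 + [("c3", "c4")]
print(build_scaffolds(["c1", "c2", "c3", "c4"], links))
```

Here the single link between c3 and c4 falls below `min_links`, so c4 stays as its own scaffold.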
Steps involved in Illumina Assembly
1) Download data (qseq file: sequences with quality scores)
2) Filter data
A) Filter low quality reads
B) Trim adapter sequences
3) SOAPdenovo steps
A) Preassembly error correction: identify pairs of reads sharing a common sequence (a k-mer, e.g. k = 17-20), estimate k-mer frequency, and remove erroneous k-mers
B) Construct contigs based on short insert libraries (200-800 bp)
C) Join contigs into scaffolds using information from large insert mate pair libraries (1-10 Kb)
D) Do local reassembly of unresolved gap regions using GapCloser for SOAPdenovo
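The k-mer frequency idea behind preassembly error correction (step 3A) can be sketched in a few lines. This is a simplified illustration, not SOAPdenovo's own corrector: it counts k-mers across all reads, treats k-mers seen fewer than `min_count` times as likely sequencing errors, and flags the reads that contain them. Reverse complements are ignored, and the function names and thresholds are assumptions.

```python
from collections import Counter


def count_kmers(reads, k=17):
    """Count every k-mer across all reads, skipping k-mers with an N."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def low_frequency_kmers(counts, min_count=2):
    """Return the k-mers seen fewer than min_count times."""
    return {kmer for kmer, n in counts.items() if n < min_count}


def flag_suspect_reads(reads, k=17, min_count=2):
    """Return one boolean per read: True if it holds a low-frequency k-mer."""
    weak = low_frequency_kmers(count_kmers(reads, k), min_count)
    return [
        any(read[i:i + k] in weak for i in range(len(read) - k + 1))
        for read in reads
    ]


# Toy reads with a short k; the last read carries one substitution.
reads = ["ACGTACGA", "ACGTACGA", "ACGTACGA", "ACGTTCGA"]
print(flag_suspect_reads(reads, k=4, min_count=2))
```

With real data the cutoff is usually chosen from the k-mer frequency distribution rather than fixed in advance.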
2) Filtering data (specifics)
• A) Remove low quality reads
– Remove reads that do not pass the GA analysis Failed_Chastity filter (these have an N in the last column of the GA export file)
– Can use the R Bioconductor ShortRead package (may have to convert files from qseq to fastq format)
• B) Remove adapter sequences
– Need adapter sequence information from the person who did the sequencing
– Can use vectorstrip in EMBOSS
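The two filtering steps can be sketched in plain Python as an alternative to ShortRead and vectorstrip. The sketch assumes the usual 11-column tab-separated qseq layout, in which the last field is the filter flag (1 for pass, 0 for fail; export files use Y and N instead), unknown bases are written as ".", and qualities are Phred+64. The adapter string, `min_len`, and all function names are placeholders, and the trimming is a simple exact-match scheme, not vectorstrip's algorithm.

```python
def parse_qseq_line(line):
    """Split one qseq line into (read_id, sequence, quality, passed_filter)."""
    f = line.rstrip("\n").split("\t")
    read_id = ":".join(f[0:7]) + "/" + f[7]
    seq = f[8].replace(".", "N")      # qseq writes unknown bases as "."
    passed = f[10] in ("1", "Y")      # assumed pass flags (qseq / export)
    return read_id, seq, f[9], passed


def trim_adapter(seq, qual, adapter, min_overlap=5):
    """Cut the read at a full adapter match, or at a partial match at the 3' end."""
    pos = seq.find(adapter)
    if pos == -1:
        # Look for the start of the adapter running off the end of the read.
        for k in range(min(len(adapter) - 1, len(seq)), min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                pos = len(seq) - k
                break
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def qseq_to_fastq(lines, adapter, min_len=20, phred_shift=31):
    """Drop failed reads, trim the adapter, and return fastq records.

    phred_shift=31 converts Phred+64 qualities to Phred+33 (an assumption
    about the input; use 0 if the file is already Phred+33).
    """
    records = []
    for line in lines:
        if not line.strip():
            continue
        read_id, seq, qual, passed = parse_qseq_line(line)
        if not passed:
            continue
        seq, qual = trim_adapter(seq, qual, adapter)
        if len(seq) < min_len:
            continue
        qual33 = "".join(chr(ord(c) - phred_shift) for c in qual)
        records.append(f"@{read_id}\n{seq}\n+\n{qual33}\n")
    return records
```

The real adapter sequence still has to come from whoever prepared the libraries; the function only applies it.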
Computational power and time required for SOAPdenovo?
Li et al. 2010 Genome Research
And compared to other programs
Lin et al. 2011 Genomics
Acromyrmex echinatior genome raw data
NCBI: SRA Acromyrmex genome
Mate pair libraries (more redundant, to build scaffolds)
Shotgun libraries (broader coverage, to build contigs)
Paired end sequencing (<1 Kb)
Mate pair library, paired end sequencing (>1 Kb)