Tomato Genome SL2.50 and Beyond…

Surya Saha, Jeremy Edwards and Lukas Mueller

Sol Genomics Network (SGN)

Boyce Thompson Institute, Ithaca, NY

[email protected] @SahaSurya

Slides: http://bit.ly/PAGbld230

https://fanart.tv/movie/196/back-to-the-future-part-iii/


Gene to Genome – The BIG picture

[Figure labels: GENES, CONTIGS, SCAFFOLDS, CHROMOSOMES, SCAFFOLD GAPS, CHROMOSOME GAPS]

SGN Workshop, PAG 2015

TM2 (Chr 9)

L2 (Chr 10)

Tomato Build SL2.40 SL2.50


Lindsay Shearer

Stephen Stack

Genome Assembly @NCBI

Contigs

• Components

Tiling Path file (TPF)

• Accession numbers

• Can have nested components

Accession

Golden Path files (AGP)

• Scaffold IDs

• Orientation

• Chromosome from contig AGP

• Chromosome from scaffold AGP

• Scaffold from contig AGP

NCBI
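To make the AGP layer on the slide above concrete, here is a minimal Python sketch that reads AGP lines into component and gap records. It assumes the tab-separated AGP 2.0 column layout (gap rows carry component type `N` or `U`). The object and scaffold names in the example are hypothetical and not taken from the SL2.50 files, and this is not how Bio-GenomeUpdate itself is implemented.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Union

GAP_TYPES = {"N", "U"}  # AGP component_type codes that mark gap rows


@dataclass
class Component:
    obj: str          # object being built (e.g. a chromosome or scaffold)
    obj_beg: int
    obj_end: int
    part: int
    comp_type: str
    comp_id: str      # ID of the piece placed in the object
    comp_beg: int
    comp_end: int
    orientation: str  # "+" or "-" in this sketch


@dataclass
class Gap:
    obj: str
    obj_beg: int
    obj_end: int
    part: int
    comp_type: str
    length: int
    gap_type: str
    linkage: str


def parse_agp(lines: Iterable[str]) -> Iterator[Union[Component, Gap]]:
    """Yield one record per non-comment AGP line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        obj, beg, end, part, ctype = f[0], int(f[1]), int(f[2]), int(f[3]), f[4]
        if ctype in GAP_TYPES:
            yield Gap(obj, beg, end, part, ctype, int(f[5]), f[6], f[7])
        else:
            yield Component(obj, beg, end, part, ctype,
                            f[5], int(f[6]), int(f[7]), f[8])


# Hypothetical records, for illustration only.
example = [
    "# illustrative AGP rows",
    "chr_demo\t1\t1000\t1\tW\tscaffold_demo_1\t1\t1000\t+",
    "chr_demo\t1001\t1100\t2\tN\t100\tscaffold\tyes\tmap",
    "chr_demo\t1101\t1600\t3\tW\tscaffold_demo_2\t1\t500\t-",
]
records = list(parse_agp(example))
```

Keeping components and gaps as separate record types makes later steps, such as summing gap lengths or flipping a scaffold, simple filters over the record list.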

Jeremy Edwards

https://github.com/solgenomics/Bio-GenomeUpdate

FISH

• Order

• Orientation

• Gap sizes

Tiling Path file (TPF)

Accession

Golden Path files (AGP)

NCBI

Gap extension

Scaffold flip
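The two corrections named on this slide, scaffold flip and gap extension, can be sketched as operations on an ordered list of chromosome parts. This is a minimal illustration under assumed names (`Part`, `flip_scaffold`, `resize_gap`, `layout` are all hypothetical); the actual Bio-GenomeUpdate code may work quite differently.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Part:
    kind: str                          # "scaffold" or "gap"
    length: int
    name: Optional[str] = None         # scaffold ID; None for gaps
    orientation: Optional[str] = None  # "+" or "-" for scaffolds


def flip_scaffold(parts: List[Part], name: str) -> List[Part]:
    """Return a copy with the named scaffold's orientation reversed."""
    flipped = {"+": "-", "-": "+"}
    return [
        replace(p, orientation=flipped[p.orientation])
        if p.kind == "scaffold" and p.name == name else p
        for p in parts
    ]


def resize_gap(parts: List[Part], index: int, new_length: int) -> List[Part]:
    """Return a copy with the gap at `index` set to `new_length`."""
    if parts[index].kind != "gap":
        raise ValueError("part at index is not a gap")
    out = list(parts)
    out[index] = replace(parts[index], length=new_length)
    return out


def layout(parts: List[Part]) -> List[Tuple[int, int, Part]]:
    """Recompute 1-based inclusive chromosome coordinates for every part."""
    pos, out = 1, []
    for p in parts:
        out.append((pos, pos + p.length - 1, p))
        pos += p.length
    return out


# Hypothetical chromosome: two scaffolds separated by a 100 bp gap.
parts = [
    Part("scaffold", 1000, "scaffold_demo_1", "+"),
    Part("gap", 100),
    Part("scaffold", 500, "scaffold_demo_2", "+"),
]
updated = resize_gap(flip_scaffold(parts, "scaffold_demo_2"), 1, 300)
placed = layout(updated)
```

Because every downstream coordinate shifts when a gap changes size, the sketch recomputes the whole layout after each edit instead of patching coordinates in place.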

Jeremy Edwards

https://github.com/solgenomics/Bio-GenomeUpdate

SL2.40 Annotation

• SL2.40 AGP

• SL2.50 AGP

• SL2.40 GFF3

SL2.50 Annotation

• SL2.50 GFF3

• Validated via Fasta

Errors corrected

• Start/end coordinates in different scaffolds

• Start > end coordinates for UTRs

• Start or end coordinates in gap region

• Dropped Solyc03g053140.1 and Solyc12g032910.1
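Two of the error classes listed above (start > end, and a start or end coordinate falling in a gap) lend themselves to a simple automated check. The following Python sketch assumes standard nine-column GFF3 lines and a precomputed table of gap intervals per sequence. The function names and the example coordinates are hypothetical, and the sketch does not cover the cross-scaffold case.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # 1-based inclusive (start, end)


def parse_gff3_line(line: str) -> Tuple[str, str, int, int, str]:
    """Return (seqid, feature type, start, end, attributes) from a GFF3 row."""
    f = line.rstrip("\n").split("\t")
    return f[0], f[2], int(f[3]), int(f[4]), f[8]


def check_feature(seqid: str, start: int, end: int,
                  gaps: Dict[str, List[Interval]]) -> List[str]:
    """List the problems found for one feature; empty list means none."""
    problems = []
    if start > end:
        problems.append("start > end")
    for g_start, g_end in gaps.get(seqid, []):
        # Flag features whose start or end lands inside a gap.
        if g_start <= start <= g_end or g_start <= end <= g_end:
            problems.append("endpoint in gap")
            break
    return problems


# Hypothetical gap table and feature row, for illustration only.
gaps = {"chr_demo": [(500, 600)]}
row = "chr_demo\tsrc\tgene\t550\t700\t.\t+\t.\tID=gene_demo_1"
seqid, ftype, start, end, attrs = parse_gff3_line(row)
report = check_feature(seqid, start, end, gaps)
```

Returning a list of problem labels per feature makes it easy to tally each error class separately before deciding whether to fix or drop a gene model.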

SL2.50 Availability

JBrowse

FTP Site

SGN Locus/Gene Pages

NCBI


SL2.50 Genome Release

Genome build 2.5 Fasta + ITAG 2.4 GFFs

CHADO

FTP site

Website

JBrowse

Blast DBs

State of the SL2.50 Build

[Figure: chart by chromosome (0 to 12); y-axis ticks from 0 to 120,000,000]

State of the SL2.50 Build

[Figure: chart by chromosome (0 to 12); y-axis 0% to 100%; series: Sequence, Scaffold gap length, Component gap length]

State of the SL2.50 Build

[Figure: chart by chromosome (0 to 12); y-axis 0% to 100%; series: Sequence, Scaffold gap length, Component gap length; annotated with the genome-wide totals below]

Length 823Mb

Sequence 737Mb

Component gaps 43Mb (5.30%)

Scaffold gaps 42Mb (5.17%)

Total gaps 86Mb (10.47%)
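The percentages above are gap lengths divided by total build length. A minimal Python sketch of that arithmetic follows; `gap_summary` is a hypothetical helper. The inputs used here are the rounded Mb figures from the slide, so the results will differ slightly from the slide's percentages, which were presumably computed from exact base-pair counts.

```python
from typing import Dict


def gap_summary(sequence: float, component_gaps: float,
                scaffold_gaps: float) -> Dict[str, float]:
    """Return total length and gap percentages; inputs share one unit."""
    total = sequence + component_gaps + scaffold_gaps

    def pct(n: float) -> float:
        return 100.0 * n / total

    return {
        "total": total,
        "component_gap_pct": pct(component_gaps),
        "scaffold_gap_pct": pct(scaffold_gaps),
        "total_gap_pct": pct(component_gaps + scaffold_gaps),
    }


# Rounded values in Mb as shown on the slide (not exact bp counts).
summary = gap_summary(sequence=737, component_gaps=43, scaffold_gaps=42)
```

Feeding the function exact lengths summed from the AGP files, rather than rounded Mb values, would be the way to reproduce the slide's figures.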


BAC Resources


Bruce Roe

HTGS Phase 1: 332

HTGS Phase 2: 520

HTGS Phase 3: 2751

http://www.ncbi.nlm.nih.gov/genbank/htgs/faq


HTGS Phase 3 BACs

Chromosome   BACs
Chr 0          53
Chr 1         589
Chr 2         248
Chr 3         137
Chr 4         147
Chr 5         117
Chr 6         104
Chr 7         111
Chr 8         249
Chr 9         119
Chr 10        620
Chr 11        100
Chr 12         86
Unknown        84

Jeremy Edwards

https://github.com/solgenomics/Bio-GenomeUpdate

BAC assemblies

• Phrap

• ACE files

BAC sets

• Assembled BACs

• Singleton BACs

Align to SL2.50

• Nucmer

• 100bp word size

• 500bp minimum alignment

• 99% identity

Novel sequences

• Extensions

• Gap coverage
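The alignment thresholds and the search for novel sequence can be illustrated with a small post-filter. This Python sketch does not run Nucmer. It assumes alignments have already been parsed into records with BAC coordinates and percent identity, applies the 500 bp and 99% cutoffs from the slide, and reports BAC intervals left uncovered as candidate novel sequence. The `Alignment` type, function names and example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alignment:
    bac_start: int   # 1-based inclusive position on the BAC
    bac_end: int
    identity: float  # percent identity


def keep(a: Alignment, min_len: int = 500, min_identity: float = 99.0) -> bool:
    """Apply the minimum alignment length and identity cutoffs."""
    return (a.bac_end - a.bac_start + 1) >= min_len and a.identity >= min_identity


def uncovered(bac_length: int, alignments: List[Alignment]) -> List[Tuple[int, int]]:
    """Return BAC intervals not covered by any alignment that passes keep()."""
    spans = sorted((a.bac_start, a.bac_end) for a in alignments if keep(a))
    gaps, pos = [], 1
    for start, end in spans:
        if start > pos:
            gaps.append((pos, start - 1))
        pos = max(pos, end + 1)
    if pos <= bac_length:
        gaps.append((pos, bac_length))
    return gaps


# Hypothetical alignments of one 10 kb BAC contig against the reference.
alns = [
    Alignment(1, 4000, 99.8),      # kept
    Alignment(4200, 4500, 99.9),   # dropped: shorter than 500 bp
    Alignment(5000, 5999, 99.5),   # kept
    Alignment(6000, 10000, 98.0),  # dropped: below 99% identity
]
novel = uncovered(10000, alns)
```

Whether an uncovered interval is an extension or gap coverage would then depend on where the flanking alignments land on the reference, which this sketch leaves out.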

HTGS Phase 3 BACs

[Figure: chart by chromosome (1 to 12); y-axis ticks from 0 to 700]

Phrap Assembly (HTGS Phase 3 BACs)

[Figure: chart by chromosome (1 to 12); y-axis 0% to 100%; series: Assembled BACs, Singleton BACs]

Phrap Assembly (HTGS Phase 3 BACs)


Chr10 Contig68 10 BACs (242Kb!!)

Chr2 Contig185 7 BACs (566Kb!!)

Future Work

• Manually examine assembled BAC contigs with < 99% identity

• Evaluate HTGS phase 2 BACs

• Use PCR walking to close gaps

• Create TPF files for SL3.0

• Annotate SL3.0 and lift over annotations from SL2.50


Acknowledgements
