VectorBase

VectorBase probe mapping

[Diagram: VectorBase components: Automatic Annotation, Manual Annotation, CHADO, XML, Array data, browser]

Integration aims

• Contigview
  – View alignment track
• Geneview
  – Reporters
  – Experiments
  – Expression patterns?
• Reporter page
  – Positions mapped
  – Genes overlapped
  – Experiments used in

contigview

• Linked to location: feature track
• Link to BASE
• Detail popup: start-end, name, %id…

geneview

• Associated with gene: Reporter, Experiments(?)

DAS vs. e!

DAS
• Slow
• Delayed data flow
• Limited scope for adaptation
  – a load of links…

DAS - request:response

[Diagram: the browser sends a request to the DAS server, which answers from its DB]
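As a rough illustration of the request half of this exchange, the sketch below builds a DAS 1.x `features` request URL. It assumes the server root listed later in the deck (http://base.vectorbase.org:8080/das); the data source name `mmc1` and the helper name `das_features_url` are hypothetical, since the slides do not name the individual sources.

```python
from urllib.parse import urlencode


def das_features_url(base_url, source, segment, start, end):
    """Build a DAS 1.x 'features' request URL for one genomic segment."""
    # DAS addresses a region as  reference:start,stop  in the 'segment' parameter.
    query = urlencode({"segment": f"{segment}:{start},{end}"}, safe=":,")
    return f"{base_url.rstrip('/')}/{source}/features?{query}"


# 'mmc1' is a hypothetical data source name; only the server root is given in the slides.
url = das_features_url("http://base.vectorbase.org:8080/das", "mmc1", "2L", 1, 100000)
print(url)
```

The browser would issue one such request per displayed region and render the XML response as a track, which is part of why the DAS route is described as slow.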

data flow

[Diagram: data flow via DAS, steps numbered 1 to 4. Components: Automatic Annotation, Manual Annotation, CHADO, XML, Array data, DAS, browser. Labels: name, chr::start-end::strand]
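If the label on this slide is read as a location key of the form chr::start-end::strand, it could be split into its parts as sketched below. The reading itself, the `Location` type, and the use of 1 / -1 for strand are all assumptions; the slide does not show an example value.

```python
from typing import NamedTuple


class Location(NamedTuple):
    chrom: str
    start: int
    end: int
    strand: int


def parse_location(text):
    """Split a 'chr::start-end::strand' key into typed fields."""
    chrom, span, strand = text.split("::")
    start, end = span.split("-")
    return Location(chrom, int(start), int(end), int(strand))


# Made-up example value; strand is assumed to be encoded as 1 or -1.
loc = parse_location("2L::1200-1700::-1")
print(loc)
```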

DAS vs. e!

DAS
• Slow
• Delayed data flow
• Limited scope for adaptation
  – a load of links…

e!
• Fast
• Data generated with new assembly
• Fully integrated with ensembl
  – Pages are extendable & adaptable

affy feature

cf. affy_feature

featureview

e! features

est sequences

• DNA_Align_Feature (> 97% identity, > 90% coverage)
  – ContigView: display track
  – FeatureView: positions in genome
• Xref: is the feature associated with any external databases? (e.g. EMBL)
  – GeneView: display information with associated gene
  – FeatureView: positions in genome, other features the est is associated with
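The alignment cutoffs quoted on this slide (> 97% identity, > 90% coverage) can be applied as a simple filter. The sketch below is a minimal illustration, not the pipeline's own code: the `EstHit` record, its field names, and the definition of coverage as aligned bases over EST length are assumptions.

```python
from dataclasses import dataclass


@dataclass
class EstHit:
    est_id: str
    chrom: str
    percent_identity: float
    aligned_bases: int
    est_length: int

    @property
    def coverage(self):
        # Assumed definition: share of the EST that takes part in the alignment.
        return 100.0 * self.aligned_bases / self.est_length


def passes_cutoffs(hit, min_identity=97.0, min_coverage=90.0):
    """Keep a hit only if it exceeds both cutoffs from the slide."""
    return hit.percent_identity > min_identity and hit.coverage > min_coverage


# Made-up hits for illustration only.
hits = [
    EstHit("est1", "2L", 99.2, 480, 500),  # 96% coverage: kept
    EstHit("est2", "3R", 98.0, 220, 500),  # 44% coverage: dropped
    EstHit("est3", "2L", 95.0, 495, 500),  # identity too low: dropped
]
kept = [h.est_id for h in hits if passes_cutoffs(h)]
print(kept)
```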

probe types

• MMC1
  – Spotted array: cDNA clones
• MMC2
  – Spotted array: PCR products
• Affy
  – Short oligo, tiling path
• Agilent
  – Long oligo, tiling path?

Already handled by ensembl

1 - est2reporter

• MMC1 (clones)
  – exonerate: EST2genes
  – DNA_Align_Feature (ESTs)
  – collapse ESTs into clones
  – Misc_Feature (REPORTER)
• ContigView: display all features in ‘probes’ track
• FeatureView: positions in genome

2 - est2reporter

[Diagram: est1 (300bp) and est2 (500bp) aligned to 2L and 3R]

len (est2): 500 = 500
len (est2) + len (est1): 500 + 300 = 800

3 - est2reporter

[Diagram: est1 (50bp) and est2 (500bp) aligned to 2L and 3R]

len (est2): 500 = 500
len (est2) + len (est1): 500 + 50 = 550
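One reading of the two worked est2reporter diagrams is that each candidate location for a clone is scored by the summed length of the ESTs aligned there. The sketch below reproduces the slide arithmetic under that assumption. The function name is hypothetical, and which chromosome arm carries which hits is not recoverable from the slides, so the 2L / 3R assignment is illustrative.

```python
from collections import defaultdict


def supporting_length(est_lengths, est_hits):
    """Sum, per location, the lengths of the distinct ESTs aligned there."""
    totals = defaultdict(int)
    for est_id, location in set(est_hits):  # count each EST once per location
        totals[location] += est_lengths[est_id]
    return dict(totals)


# Assumed layout: est2 alone hits 2L, est1 and est2 both hit 3R.
hits = [("est2", "2L"), ("est1", "3R"), ("est2", "3R")]

case_2 = supporting_length({"est1": 300, "est2": 500}, hits)
case_3 = supporting_length({"est1": 50, "est2": 500}, hits)
print(case_2, case_3)
```

With a 300bp est1 the two-EST location scores 800 against 500; with a 50bp est1 the margin shrinks to 550 against 500, which is presumably why the second case is shown separately.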

pcr2reporter

• MMC1 (clones)
  – exonerate: EST2genes
  – DNA_Align_Feature (ESTs)
  – collapse ESTs into clones
  – Misc_Feature (REPORTER)
• MMC2 (primers)
  – e-PCR
  – Marker_Feature (PCR primers)
  – assess significance?
  – Misc_Feature (REPORTER)
• ContigView: display all features in ‘probes’ track
• FeatureView: positions in genome

sts pipeline

[Diagram: PCR products mapped to 2L and 3R]

550bp => map weight 0
100bp => map weight 1
800bp => map weight 1

id      left_primer              right_primer           distance  name                accession           species
1.1.1   GATTACAACATCCAGAAGGAGTC  GTAGTACTTGAGGACAGCAAG  104       ENSANGG00000020724  ENSANGG00000020724  A.gambiae
1.1.10  GCCTTTGCCGGGCTGC         TTCGGGGGTTTCGAGCAG     497       ENSANGG00000002666  ENSANGG00000002666  A.gambiae
1.1.11  GATAATTCAGCGCTACACATTA   CACCGTAATGCTAACATCGAA  146       ENSANGG00000002705  ENSANGG00000002705  A.gambiae
1.1.11  GATAATTCAGCGCTACACATTA   CACCGTAATGCTAACATCGAA  146       ENSANGG00000020019  ENSANGG00000020019  A.gambiae
1.1.12  CAGTGCTACGTGAAGAATGA     TCCGCTGTCGAGGGAAC      473       ENSANGG00000003095  ENSANGG00000003095  A.gambiae
1.1.2   TCGTCCAACAGTTTCTCCTAC    GATCGTTTGCTGCTTGCATA   449       ENSANGG00000000521  ENSANGG00000000521  A.gambiae
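The primer listing on this slide can be read into records with a few lines of code. The sketch below assumes the rows are whitespace-separated with the seven columns shown; the function name `read_sts` is hypothetical. It also picks out primer pairs that are listed against more than one gene, as 1.1.11 is here.

```python
import io
from collections import Counter

STS_COLUMNS = ["id", "left_primer", "right_primer", "distance",
               "name", "accession", "species"]

# The six rows shown on the slide.
SAMPLE = """\
1.1.1 GATTACAACATCCAGAAGGAGTC GTAGTACTTGAGGACAGCAAG 104 ENSANGG00000020724 ENSANGG00000020724 A.gambiae
1.1.10 GCCTTTGCCGGGCTGC TTCGGGGGTTTCGAGCAG 497 ENSANGG00000002666 ENSANGG00000002666 A.gambiae
1.1.11 GATAATTCAGCGCTACACATTA CACCGTAATGCTAACATCGAA 146 ENSANGG00000002705 ENSANGG00000002705 A.gambiae
1.1.11 GATAATTCAGCGCTACACATTA CACCGTAATGCTAACATCGAA 146 ENSANGG00000020019 ENSANGG00000020019 A.gambiae
1.1.12 CAGTGCTACGTGAAGAATGA TCCGCTGTCGAGGGAAC 473 ENSANGG00000003095 ENSANGG00000003095 A.gambiae
1.1.2 TCGTCCAACAGTTTCTCCTAC GATCGTTTGCTGCTTGCATA 449 ENSANGG00000000521 ENSANGG00000000521 A.gambiae
"""


def read_sts(handle):
    """Yield one dict per non-empty row, with the distance as an integer."""
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        row = dict(zip(STS_COLUMNS, fields))
        row["distance"] = int(row["distance"])
        yield row


rows = list(read_sts(io.StringIO(SAMPLE)))
per_pair = Counter(row["id"] for row in rows)
multi_gene_pairs = [pair_id for pair_id, n in per_pair.items() if n > 1]
print(len(rows), multi_gene_pairs)
```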

1 - reporter2gene

[Diagram: gene ENSANG00012345 on 2L with its reporters]

ENSANG00012345:
• MMC2: 9713
• MMC1: 4A3B-AAG-D08

2 - reporter2gene

• MMC1 (clones)
  – exonerate: EST2genes
  – DNA_Align_Feature (ESTs)
  – collapse ESTs into clones
  – Misc_Feature (REPORTER)
• MMC2 (primers)
  – e-PCR
  – Marker_Feature (PCR primers)
  – assess significance?
  – Misc_Feature (REPORTER)
• Xref
  – criteria? % alignment overlap with exon boundaries
• ContigView: display all features in ‘probes’ track
• GeneView: display reporters (& experiments?)
• FeatureView: positions in genome
  – Links to genes
  – Links to experiments
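The slide leaves the Xref criterion open and only suggests the percentage of an alignment that overlaps exons. A minimal sketch of that measure follows, assuming 1-based inclusive coordinates and non-overlapping exons. The function names are hypothetical and the 50% threshold is an illustrative placeholder, not a value from the slides.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Number of bases shared by two inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def percent_in_exons(align_start, align_end, exons):
    """Percentage of the alignment that falls inside the given exons."""
    align_len = align_end - align_start + 1
    covered = sum(overlap(align_start, align_end, s, e) for s, e in exons)
    return 100.0 * covered / align_len


# Made-up coordinates: a 500bp alignment against two exons of one gene.
exons = [(900, 1199), (1300, 1399)]
pct = percent_in_exons(1000, 1499, exons)

MIN_PERCENT = 50.0  # hypothetical threshold
link_reporter_to_gene = pct >= MIN_PERCENT
print(pct, link_reporter_to_gene)
```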

MMC1 locations

MMC1 locations table

locations  arraymap.v1  no est seqs  anoest_v7.5  no est seqs
0          11315        3060         2573         3060
1          4779         0            8042         0
2          484          0            3252         0
3          14           0            1587         0
4          21           0            603          0
5          3            0            197          0
6          0            0            90           0
7          0            0            46           0
8          3            0            37           0
9          0            0            22           0
10         2            0            14           0
>10        2            0            160          0

MMC1 genes

MMC1 genes table

hits  anoest_v7.1  arraymap.v1
-1    5433         12615
0     9049         1760
1     4215         4881
2     718          326
3     123          27
4     70           14
5     18           14
6     16           6
7     4            6
8     8            3
9     1            3
10    7            1
>10   21           27

MMC2 locations

MMC2 genes

MMC2 table

genes  count
-1     615
0      1913
1      8699
2      587
3      95
4      36
5      17
6      3
7      2
9      4
10     1
>10    23

aligns  count
0       615
1       9673
2       630
3       214
4       168
5       164
6       110
7       64
8       43
9       34
10      24
>10     256

MMC DAS tracks

http://base.vectorbase.org:8080/das

MMC listings

http://base.vectorbase.org:8080/MMC1.jsp
http://base.vectorbase.org:8080/MMC2.jsp

fin

Thanks…

– Bob, George, Fotis
– Dan, Karyn, Martin @ EBI
– Ian Sealy @ Sanger
– Informatics support @ Sanger

EST quality

TOTAL: 19280
• OK: 17186 (89.14%)
• POOR: 2094 (10.86%)
  – repetitive: 2094
  – short: 36
• avg length (bp): 576.39
• avg repeat %: 1.05%

general runnable

• e! DB
• runnableDB
  – get object from db
  – DB record >> perl object
  – perl object >> DB record
• runnable
  – process object
  – run executable
• Executables: Exonerate, BLAT, ePCR
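The division of labour on this slide, where one layer converts between database records and objects and another wraps the executable, can be sketched as two small classes. The real pipeline is written in Perl; the Python below is only an illustration of the pattern. The in-memory dict standing in for the e! DB and the `fake_aligner` function standing in for Exonerate, BLAT or ePCR are assumptions.

```python
class Runnable:
    """Wraps one analysis program: objects in, result objects out."""

    def __init__(self, program):
        self.program = program  # callable standing in for the executable

    def run(self, query):
        return self.program(query)


class RunnableDB:
    """Moves data between the database and a Runnable."""

    def __init__(self, db, runnable):
        self.db = db
        self.runnable = runnable
        self.query = None
        self.output = []

    def fetch_input(self, input_id):
        # DB record >> object
        self.query = self.db["input"][input_id]

    def run(self):
        self.output = self.runnable.run(self.query)

    def write_output(self):
        # object >> DB record
        self.db["output"].extend(self.output)


def fake_aligner(seq):
    # Stand-in for an aligner: reports only the query name and length.
    return [{"name": seq["name"], "length": len(seq["seq"])}]


db = {"input": {"est1": {"name": "est1", "seq": "ACGTACGT"}}, "output": []}
job = RunnableDB(db, Runnable(fake_aligner))
job.fetch_input("est1")
job.run()
job.write_output()
print(db["output"])
```

Keeping the database conversion out of the `Runnable` is what lets the same wrapper serve different feature types, with only the `RunnableDB` side changing between the MMC1 and MMC2 cases.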

MMC1 runnable

• e! DB
• runnableDB
  – DB record >> dna align object
  – dna align object >> DB record
• runnable
• Executables: Exonerate, BLAT, ePCR
• MMC1_xref: reporter XREF

MMC2 runnable

• e! DB
• runnableDB
  – DB record >> marker object
  – marker object >> DB record
• runnable
• Executables: Exonerate, BLAT, ePCR
• MMC2_xref: reporter XREF
