Andy Jenkinson, EBI The DAS Protocol


Andy Jenkinson, EBI

The DAS Protocol


Summary of Topics

• Technical overview

• Principles of communication

• Pros and cons

• DAS capabilities


DAS Architecture

• A client asks for data from many servers
  • HTTP requests
  • identically structured URLs, the same parameters

• Each server behaves in the same way
  • pre-defined set of behaviours
  • e.g. provide a sequence, provide annotations of a sequence

• Each server provides different data in the same format
  • DAS-XML


DAS Concepts

Reference object
• usually a sequence
• e.g. “chromosome X” or “NT_025741”

Annotation
• information attached to a location within a segment
• e.g. “substitution at residue 326 of BRCA1”


DAS Concepts

Reference server
• server that provides “core” reference object data
• e.g. GRCh37 sequence data

Annotation server
• server that provides annotations of reference objects

Segment
• part of a reference object
• e.g. “bases 100 to 200 of chromosome X”
• ties together annotation and reference servers


Architectural Overview


The DAS Protocol

Defines 3 constraints
• transport layer: HTTP
• query format: constrained REST URLs
• response format: constrained XML

Keyword: constrained


Data transport
• Standard HTTP
• Includes compression
• Some additional headers, e.g. to indicate DAS version
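A minimal Python sketch of the transport points above, assuming gzip is the compression in use and that the version header is named `X-DAS-Version` (taken from the DAS specification, not from these slides). The helper names `build_das_request` and `decode_das_response` are hypothetical. Nothing here touches the network: the request is only constructed, and the response is passed in as headers plus raw bytes.

```python
import gzip
from urllib.request import Request


def build_das_request(url: str) -> Request:
    """Build (but do not send) a plain HTTP GET that advertises gzip support."""
    return Request(url, headers={"Accept-Encoding": "gzip"})


def decode_das_response(headers: dict, body: bytes) -> tuple:
    """Return (DAS version header or None, decoded XML text).

    `headers` is a plain mapping of response header names to values;
    HTTP header names are case-insensitive, so they are lower-cased first.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("content-encoding") == "gzip":
        body = gzip.decompress(body)
    # Assumption: the server reports its DAS version in X-DAS-Version.
    return lowered.get("x-das-version"), body.decode("utf-8")
```

A real client would send the request with `urllib.request.urlopen` and pass the response headers and body to `decode_das_response`.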


Well-defined query URLs
• A client can issue a command

http://das.sanger.ac.uk/das/ccds_mouse/features?segment=...

• site: http://das.sanger.ac.uk
• prefix: das
• das source: ccds_mouse
• command: features
• arguments: segment=...
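Because every DAS URL has the same shape, it can be taken apart mechanically. The sketch below splits a URL into the five parts labelled above. `split_das_url` is a hypothetical helper, and the `segment=X:100,200` argument in the usage line is an illustrative value, not the one elided on the slide.

```python
from urllib.parse import urlsplit


def split_das_url(url: str) -> dict:
    """Split a DAS query URL into site, prefix, source, command and arguments.

    Assumes the path has exactly the form /<prefix>/<source>/<command>.
    """
    parts = urlsplit(url)
    prefix, source, command = parts.path.strip("/").split("/")
    return {
        "site": f"{parts.scheme}://{parts.netloc}",
        "prefix": prefix,
        "source": source,
        "command": command,
        "arguments": parts.query,
    }


if __name__ == "__main__":
    url = "http://das.sanger.ac.uk/das/ccds_mouse/features?segment=X:100,200"
    for name, value in split_das_url(url).items():
        print(f"{name}: {value}")
```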


XML format
• server responds with a simple XML document

<SEGMENT id="X" start="1" stop="100">
  <FEATURE id="exon1">
    <TYPE id="exon">exon</TYPE>
    ...


Why DAS?

Fast, targeted queries
• suitable for visual display

Based on existing simple tech
• XML/HTTP/CGI
• “dumb server, clever client”: relatively low knowledge barrier for bioinformaticians with data to expose

Scalable
• integrators (client software) get more data for zero cost


Why not DAS?

One-dimensional queries
• query only by sequence position
• not by developmental stage, tissue type, etc
• (yet)

Constrained generic format
• clients aren’t “tailored” to each data source
• possible data types are to some extent limited

Not semantically rich
• ontology support optional


Commands: the basics

Sequence
• give me the DNA sequence for a given segment of a reference object
• e.g. “bases 100k – 200k of chromosome 15”

Features
• give me all annotations offered by the data source that are attached to a given segment of the sequence


The sequence command

/das/<source>/sequence?<params>

Parameters:

segment=ID:start,end (one or more)
• ID: the ID of the reference object

Example:

/das/<source>/sequence?segment=X:100,200;segment=Y:500,600
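The `segment=ID:start,end` syntax is simple enough to model directly. The sketch below is one way a client might represent a segment and convert it to and from the parameter value. The `Segment` class and its field names are hypothetical, and the check that start does not exceed end is an assumption about sensible input rather than a rule stated on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A region of a reference object, e.g. bases 100 to 200 of X."""
    ref_id: str
    start: int
    end: int

    @classmethod
    def parse(cls, text: str) -> "Segment":
        """Parse the 'ID:start,end' form used by the segment parameter."""
        # Split on the last colon so that IDs containing ':' still work.
        ref_id, _, span = text.rpartition(":")
        start, end = (int(number) for number in span.split(","))
        if not ref_id or start > end:
            raise ValueError(f"not a valid segment: {text!r}")
        return cls(ref_id, start, end)

    def __str__(self) -> str:
        return f"{self.ref_id}:{self.start},{self.end}"


if __name__ == "__main__":
    seg = Segment.parse("X:100,200")
    print(seg.ref_id, seg.start, seg.end)
    print(f"segment={seg}")
```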


The sequence command

Response:

<DASSEQUENCE>
  <SEQUENCE id="X" start="100" stop="200" version="1.0">
    cctgagccagcagtggcaacccaatggggtccctttcca...
  </SEQUENCE>
  <SEQUENCE id="Y" start="500" stop="600" version="1.0">
    ctggacagcccggaaaatgagctcctcatctctaaccca...
  </SEQUENCE>
</DASSEQUENCE>
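A response of this shape can be read with a standard XML parser. The following is a minimal sketch using Python's `xml.etree.ElementTree`; `parse_sequences` is a hypothetical helper, and the sample document reuses only the first ten bases of each sequence shown above, so it is deliberately shortened test data.

```python
import xml.etree.ElementTree as ET

# Shortened sample: only the first ten bases of each slide sequence.
SAMPLE = """<DASSEQUENCE>
  <SEQUENCE id="X" start="100" stop="200" version="1.0">
    cctgagccag
  </SEQUENCE>
  <SEQUENCE id="Y" start="500" stop="600" version="1.0">
    ctggacagcc
  </SEQUENCE>
</DASSEQUENCE>"""


def parse_sequences(xml_text: str) -> dict:
    """Map (id, start, stop) to the sequence string with whitespace removed."""
    root = ET.fromstring(xml_text)
    sequences = {}
    for element in root.iter("SEQUENCE"):
        key = (element.get("id"), int(element.get("start")), int(element.get("stop")))
        # Sequence text may be wrapped or indented, so drop all whitespace.
        sequences[key] = "".join((element.text or "").split())
    return sequences


if __name__ == "__main__":
    for key, bases in parse_sequences(SAMPLE).items():
        print(key, bases)
```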


The features command

/das/<source>/features?<params>

Parameters:

segment=ID:start,end (one or more)

type=foo (zero or more)

category=bar (zero or more)

Example:

/das/<source>/features?segment=X:100,200;segment=Y:500,600;type=SNP
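The parameter rules above (one or more segments, zero or more types and categories) can be sketched as a small query builder. `features_query` is a hypothetical helper and `mysource` a placeholder source name. Arguments are joined with `;` as in the example, and values are assumed to be URL-safe already, so no escaping is done.

```python
def features_query(source, segments, types=(), categories=()):
    """Build the path and query string for a features request.

    segments:   one or more 'ID:start,end' strings (required)
    types:      zero or more type filters
    categories: zero or more category filters
    """
    if not segments:
        raise ValueError("at least one segment is required")
    arguments = [f"segment={segment}" for segment in segments]
    arguments += [f"type={type_id}" for type_id in types]
    arguments += [f"category={category}" for category in categories]
    return f"/das/{source}/features?" + ";".join(arguments)


if __name__ == "__main__":
    print(features_query("mysource", ["X:100,200", "Y:500,600"], types=["SNP"]))
```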


The features command

Response:

<DASGFF>
  <GFF version="1.01" href="...">
    <SEGMENT id="X" start="100" stop="200">
      <FEATURE id="X">
        <START>100</START>
        <END>200</END>
        <TYPE id="SNP" category="variation">SNP</TYPE>
        <METHOD id="sequencing">sequencing</METHOD>
        <SCORE>86.4</SCORE>
        <ORIENTATION>+</ORIENTATION>
      </FEATURE>
      ...


Other Commands

Stylesheet
• hints on how to render different types of feature
• e.g. “exons as blue boxes, SNPs as red triangles”

/das/<source>/stylesheet

Types
• lists the types of feature available

/das/<source>/types


Metadata

A client can be written that knows how to query a server and parse the response

BUT something is missing…
• which data sources are available on a server?
• which commands does a source support?
• what kind of reference objects does it know about?


The sources command

<server>/das/sources

• Lists a server’s data sources

For each source:
• text description
• list of “capabilities” (commands)
• list of coordinate systems (type of reference object)
• etc
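The slides do not show what a sources response looks like, so the sketch below rests on assumptions throughout. The element and attribute names (`SOURCE`, `VERSION`, `CAPABILITY`, `COORDINATES`, `title`, `description`, `type`, `authority`, `version`, `source`) follow the DAS specification and should be checked against it. All values in the sample are invented placeholders, and `summarise_sources` is a hypothetical helper. It shows how a client might pull out the three items listed above.

```python
import xml.etree.ElementTree as ET

# ASSUMED response shape with placeholder values; not taken from the slides.
SAMPLE = """<SOURCES>
  <SOURCE uri="example_source" title="Example source"
          description="Hypothetical annotation source">
    <VERSION uri="example_source" created="2010-01-01">
      <COORDINATES uri="http://example.org/coords/1"
                   authority="GRCh" version="37" source="Chromosome"/>
      <CAPABILITY type="das1:features"/>
      <CAPABILITY type="das1:types"/>
    </VERSION>
  </SOURCE>
</SOURCES>"""


def summarise_sources(xml_text: str) -> list:
    """Return description, capabilities and coordinate systems per source."""
    root = ET.fromstring(xml_text)
    summary = []
    for source in root.iter("SOURCE"):
        summary.append({
            "title": source.get("title"),
            "description": source.get("description"),
            "capabilities": [c.get("type") for c in source.iter("CAPABILITY")],
            "coordinates": [
                (c.get("authority"), c.get("version"), c.get("source"))
                for c in source.iter("COORDINATES")
            ],
        })
    return summary


if __name__ == "__main__":
    for entry in summarise_sources(SAMPLE):
        print(entry)
```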


DAS Registry

• third component of DAS
• catalogue of DAS sources

Human interface
• validate, register, search, view statistics

Programmatic interface
• http://www.dasregistry.org/das/sources
• http://www.dasregistry.org/das/coordinatesystem
• http://www.dasregistry.org/das/organism


SOA

[Diagram: Registry, Client and Server, linked by the labels Find, Bind and Publish]


Links

DAS Homepage
• http://www.biodas.org/

DAS Specification
• http://www.biodas.org/documents/spec-1.6.html

DAS in Ensembl
• http://www.ensembl.org/info/docs/das/index.html

Mailing list
• http://biodas.org/mailman/listinfo/das