PhyloTastic: Names-Based Phyloinformatic Data Integration Rutger Vos


DESCRIPTION

Lightning talk at the 2013 TDWG conference symposium on phyloinformatics: a brief report on PhyloTastic, with special attention to the taxonomic name reconciliation service TaxoSaurus.


Page 1: PhyloTastic: names-based phyloinformatic data integration

PhyloTastic: Names-Based Phyloinformatic Data Integration

Rutger Vos

Page 2: PhyloTastic: names-based phyloinformatic data integration

Re-use of phylogenetic knowledge

Currently, most phylogenetic knowledge is not easily re-used due to a lack of:

• archiving;

• awareness of best practices;

• community-wide standards for formatting data, naming entities, and annotating data.

Most attempts at data re-use seem to end in disappointment. Nevertheless, we find many positive examples of data re-use, particularly those that involve customized species trees generated by grafting to, and pruning from, a much larger tree.

Page 3: PhyloTastic: names-based phyloinformatic data integration

Phylomatic: automated re-use of phylogenetic knowledge

• In a recent survey of practices of re-use of phylogenetic knowledge, Phylomatic was the most frequently used method for obtaining trees, e.g. in studies of phylogenetic community structure.

• Phylomatic takes a set of input taxa and extracts them from a reference phylogeny by pruning and grafting.

• The reference phylogeny is usually APG-III.

• Taxon names are matched exactly or grafted on.

• Branch lengths are either retained or modeled (bladj).

Page 4: PhyloTastic: names-based phyloinformatic data integration

Phylotastic: generalizing and modularizing Phylomatic-ish functionality

Phylotastic was conceived by NESCent’s Hackathons, Interoperability, Phylogenies (HIP) working group and was initiated by several dozen participants at a NESCent hackathon on June 4-8, 2012. A second hackathon took place at iPlant’s headquarters in Tucson, Arizona on January 28 through February 1, 2013.

Page 5: PhyloTastic: names-based phyloinformatic data integration

Phylotastic: a design pattern for phylogenetic data re-use

1. Input list of names

2. Controller queries TNRS with list of names

3. TNRS provides token with redirect to results

4. Controller gets TNRS results

5. Controller queries Treestore for trees with TNRS taxa

6. Controller POSTs Treestore matches, GETs subtree back

7. Treestore (or proxy) performs pruning and grafting

8. Annotated subtree is returned
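The eight steps above can be sketched as a small controller that chains three services. This is only an illustration of the pattern, not the Phylotastic implementation: the function names, the data shapes, and the toy services (which return a flat star tree rather than doing real pruning and grafting) are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical service signatures; none of these names or data shapes
# come from the Phylotastic project itself.
Resolver = Callable[[List[str]], Dict[str, str]]  # steps 2-4: name -> accepted name
TreeFinder = Callable[[List[str]], List[str]]     # step 5: IDs of matching trees
Pruner = Callable[[str, List[str]], str]          # steps 6-7: subtree as Newick


def controller(names: List[str], resolve: Resolver,
               find_trees: TreeFinder, prune: Pruner) -> Dict[str, object]:
    """Run the pattern against injected service callables."""
    accepted = resolve(names)                      # steps 2-4
    taxa = sorted(set(accepted.values()))
    tree_ids = find_trees(taxa)                    # step 5
    if not tree_ids:
        raise LookupError("no tree covers the resolved taxa")
    subtree = prune(tree_ids[0], taxa)             # steps 6-7
    # Step 8: return the subtree together with its provenance.
    return {"names": accepted, "source_tree": tree_ids[0], "subtree": subtree}


# Toy in-memory services, for illustration only.
def toy_resolve(names: List[str]) -> Dict[str, str]:
    synonyms = {"Felis domesticus": "Felis catus"}
    return {name: synonyms.get(name, name) for name in names}


def toy_find_trees(taxa: List[str]) -> List[str]:
    return ["tree1"]


def toy_prune(tree_id: str, taxa: List[str]) -> str:
    # Emits a flat star tree; a real treestore would prune a reference tree.
    return "(" + ",".join(t.replace(" ", "_") for t in taxa) + ");"


result = controller(["Felis domesticus", "Canis lupus"],
                    toy_resolve, toy_find_trees, toy_prune)
print(result["subtree"])  # (Canis_lupus,Felis_catus);
```

Passing the services in as callables mirrors the modular intent of the pattern: any TNRS or treestore can be swapped in without changing the controller.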

Page 6: PhyloTastic: names-based phyloinformatic data integration

Cross-pollinations and spin-offs

Page 7: PhyloTastic: names-based phyloinformatic data integration

TaxoSaurus: the PhyloTastic TNRS

• A simple, asynchronous, RESTful API that communicates in JSON.

• Modular design: multiple taxonomies can be ingested and queried

• Built around the iPlant TNRS service

• Available at taxosaurus.org

Page 8: PhyloTastic: names-based phyloinformatic data integration

TaxoSaurus: the PhyloTastic TNRS

/submit - POST or GET a list of scientific names to the service and retrieve a JSON token to access results.

Parameters:

• query: newline separated list of scientific names.

OR

• file: a text file containing newline separated scientific names.

• source (optional): a comma separated list of taxonomic source ids (see /sources/list).

• code (optional): the abbreviation for one of the nomenclature codes (ICZN/ICN/ICNB).
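As a rough illustration of the /submit parameters, the following sketch assembles a GET request URL without sending it. The parameter names (query, source, code) are taken from the slide; the `http://` scheme, the helper name `build_submit_url`, and the choice of GET over POST are assumptions for the example.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

# Assumption: the service root is http://taxosaurus.org (the slides give
# only the host name).
BASE_URL = "http://taxosaurus.org"


def build_submit_url(names: Iterable[str],
                     sources: Optional[Iterable[str]] = None,
                     code: Optional[str] = None) -> str:
    """Build a /submit GET URL (hypothetical helper, nothing is sent)."""
    # query: newline separated list of scientific names
    params = {"query": "\n".join(names)}
    if sources:
        # source: comma separated list of taxonomic source ids
        params["source"] = ",".join(sources)
    if code:
        # code: nomenclature code abbreviation, e.g. ICZN
        params["code"] = code
    return f"{BASE_URL}/submit?{urlencode(params)}"


url = build_submit_url(["Panthera leo", "Pan troglodytes"], code="ICZN")
print(url)
# http://taxosaurus.org/submit?query=Panthera+leo%0APan+troglodytes&code=ICZN
```

For long name lists a POST (or the file parameter) is likely the better fit, since the whole list otherwise ends up in the URL.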

Page 9: PhyloTastic: names-based phyloinformatic data integration

TaxoSaurus: the PhyloTastic TNRS

/retrieve/<token> - GET the result of a TNRastic query.

• Parameters: none

• Returns: a JSON object containing the accepted names

/sources/list - GET a ranked list of available sources

• Parameters: none

• Returns: a JSON object containing the list of source IDs
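Because the API is asynchronous, a client typically has to call /retrieve/<token> more than once before the accepted names are available. The sketch below shows one way to poll. The slides do not document the shape of the JSON response, so the `status` and `names` keys, the helper name, and the readiness check are all hypothetical, and the HTTP call is replaced by a simulated `fetch` callable.

```python
import time
from typing import Callable, Dict


def retrieve_when_ready(fetch: Callable[[], Dict],
                        is_ready: Callable[[Dict], bool],
                        attempts: int = 10,
                        delay: float = 1.0) -> Dict:
    """Call fetch() until is_ready(payload) holds or attempts run out."""
    for _ in range(attempts):
        payload = fetch()
        if is_ready(payload):
            return payload
        time.sleep(delay)  # wait before asking again
    raise TimeoutError(f"results not ready after {attempts} attempts")


# Simulated responses standing in for GET /retrieve/<token>; the keys
# "status" and "names" are assumptions, not the documented format.
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"names": ["Panthera leo"]},
])

result = retrieve_when_ready(lambda: next(responses),
                             lambda payload: "names" in payload,
                             delay=0.0)
print(result)  # {'names': ['Panthera leo']}
```

In a real client, `fetch` would wrap an HTTP GET on the token URL and `is_ready` would inspect whatever completion signal the service actually returns.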

Page 10: PhyloTastic: names-based phyloinformatic data integration

TaxoSaurus: the PhyloTastic TNRS

/sources/<source_id> - GET the details about a particular source, or all sources if no ID specified

• Parameters: <source_id> or none

• Returns: a JSON object containing the source details

/delete/<token> - GET, POST, or DELETE to cancel a running job

• Parameters: <token>, the hash of the job to cancel

• Returns: a JSON object indicating success or an error

Page 11: PhyloTastic: names-based phyloinformatic data integration

TaxoSaurus: the PhyloTastic TNRS

Page 12: PhyloTastic: names-based phyloinformatic data integration
