assembling a draft overall tree of life from phylogenetic trees and taxonomic databases Jonathan A Rees US National Evolutionary Synthesis Center Duke University [email protected] TDWG, 31 October 2013

Assembling a draft overall tree of life from phylogenetic trees and taxonomic databases

DESCRIPTION

Presentation at Symposium on sharing and delivery of reusable phylogenetic knowledge at TDWG conference 2013, http://wiki.tdwg.org/twiki/bin/view/Phylogenetics/PhyloSharingWorkshop2013

Page 1: Assembling a draft overall tree of life from phylogenetic trees and taxonomic databases

assembling a draft overall tree of life from phylogenetic trees and taxonomic databases

Jonathan A Rees

US National Evolutionary Synthesis Center

Duke University

[email protected]

TDWG, 31 October 2013

software team:

Jim Allman
Joseph Brown
Karen Cranston
Cody Hinchliff
Mark Holder
Jonathan Leto
Emily McTavish
Peter Midford
Rick Ree
Stephen Smith

funding:

US NSF

what is open tree of life?

1. collect phylogenetic trees for best possible coverage of entire tree of life

Drew BT, Gazis R, Cabezas P, Swithers KS, Deng J, et al. (2013) Lost Branches on the Tree of Life. PLoS Biol 11(9): e1001636. http://dx.doi.org/10.1371/journal.pbio.1001636

2. normalize tips so that they match between source trees

label normalization

Hemsleya amabilis HS454 → 524163 Hemsleya amabilis

Theria → 4267989 Theria in Arthropoda

Nicotiana suaveolans var excelsior → 232354 Nicotiana rotundifolia

Selysia prunifera → 949305 Cayaponia prunifera
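
A minimal sketch of what label normalization could look like, using the labels and ids shown on this slide. The lookup tables, the specimen-code pattern, and the function name `normalize_label` are assumptions made for illustration, not the project's actual code.

```python
import re

# Hypothetical lookup tables; names and ids are the ones shown on the slide.
SYNONYMS = {
    "Selysia prunifera": "Cayaponia prunifera",
    "Nicotiana suaveolans var excelsior": "Nicotiana rotundifolia",
}
TAXON_IDS = {
    "Hemsleya amabilis": 524163,
    "Cayaponia prunifera": 949305,
    "Nicotiana rotundifolia": 232354,
}

# Assumed shape of a trailing specimen code such as "HS454".
SPECIMEN_CODE = re.compile(r"\s+[A-Z]{1,4}\d+$")


def normalize_label(label):
    """Return (taxon_id, taxon_name) for a tip label, or None if unmatched."""
    name = " ".join(label.split())  # collapse stray whitespace
    if name not in SYNONYMS and name not in TAXON_IDS:
        name = SPECIMEN_CODE.sub("", name)  # drop a trailing specimen code
    name = SYNONYMS.get(name, name)  # replace a known synonym
    taxon_id = TAXON_IDS.get(name)
    return (taxon_id, name) if taxon_id is not None else None


print(normalize_label("Hemsleya amabilis HS454"))
print(normalize_label("Selysia prunifera"))
```

Homonyms such as Theria are not handled here; telling them apart needs context from the rest of the source tree.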

3. synthesize a single ‘big tree’ algorithmically from the source trees

Smith SA, Brown JW, Hinchliff CE (2013) Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs. PLoS Comput Biol 9(9): e1003223. http://dx.doi.org/10.1371/journal.pcbi.1003223
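
A toy illustration of why matching tips matter for synthesis. This is not the tree alignment graph method of the paper cited above; it only pools the groupings (tip sets) found in several source trees and counts how many trees contain each one. The nested-tuple tree encoding, the tip names, and both function names are assumptions.

```python
from collections import Counter


def clusters(tree):
    """Return (tips, clusters) for a tree given as nested tuples of tip names."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    tips = frozenset()
    found = []
    for child in tree:
        child_tips, child_clusters = clusters(child)
        tips |= child_tips
        found.extend(child_clusters)
    found.append(tips)  # the tip set under this internal node
    return tips, found


def cluster_support(trees):
    """Count how many source trees contain each multi-tip cluster."""
    support = Counter()
    for tree in trees:
        _, found = clusters(tree)
        support.update(set(found))  # count each cluster once per tree
    return support


# Two hypothetical source trees over normalized tip labels A..D.
source_trees = [(("A", "B"), ("C", "D")), (("A", "B"), "C")]
support = cluster_support(source_trees)
print(support[frozenset({"A", "B"})])
```

The grouping of A with B is found in both trees only because the tip labels match exactly, which is what the normalization step is for.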

4. expose source trees and ‘big tree’ in various ways

exposing provenance

• links to studies

• links to data deposits (e.g. treebase)

• links to taxonomic database records

• methods documentation

• versioning

reference taxonomy

• used for normalization, internal node labeling, gap-filling

• need NCBI taxonomy

• supplement with GBIF

• patch system

• future: other sources
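
One way the bullets above could fit together, as a sketch: a primary taxonomy, a supplement that only fills gaps, and a patch step applied last. The child-to-parent dictionaries, the placeholder taxon names, the rule that the primary source wins on conflicts, and the function name are all assumptions.

```python
def build_reference_taxonomy(primary, supplement, patches):
    """Merge child -> parent maps: supplement fills gaps, patches are applied last."""
    taxonomy = dict(supplement)
    taxonomy.update(primary)  # assumed rule: the primary source wins on conflicts
    for child, parent in patches.items():
        if parent is None:
            taxonomy.pop(child, None)  # a patch may delete a taxon
        else:
            taxonomy[child] = parent  # or set or correct its parent
    return taxonomy


# Placeholder names only, not real taxonomic records.
ncbi = {"genus_a": "family_x", "genus_b": "family_x"}
gbif = {"genus_b": "family_y", "genus_c": "family_y"}
patches = {"genus_c": "family_x"}

print(build_reference_taxonomy(ncbi, gbif, patches))
```

Keeping the patches separate from both sources means they can be reapplied whenever a source is refreshed.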

‘open’

trees are not creative expression

... ergo no © protection

... ergo © licensing is meaningless

... CC0 is nice (and required by Dryad), but no CC0 for legacy data or NCBI

lessons

• NeXML and badgerfish are good

• machine-processable tip identity would be awfully nice

• we were surprised by tree rooting problem

• provenance is an uphill battle

• to be seen: github for data curation?
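
For readers who have not met badgerfish: it is a convention for writing XML as JSON, with attributes under "@name" keys, text under "$", and repeated child elements as arrays. A minimal sketch of the conversion on a simplified NeXML-like fragment follows. The fragment omits namespaces, which this sketch does not handle, and the function name `badgerfish` is an assumption rather than any library's API.

```python
import json
import xml.etree.ElementTree as ET


def badgerfish(element):
    """Convert an element to its badgerfish value (namespaces are ignored here)."""
    value = {"@" + name: text for name, text in element.attrib.items()}
    if element.text and element.text.strip():
        value["$"] = element.text.strip()  # text content goes under "$"
    for child in element:
        converted = badgerfish(child)
        if child.tag not in value:
            value[child.tag] = converted
        elif isinstance(value[child.tag], list):
            value[child.tag].append(converted)
        else:
            # second child with the same name: switch to an array
            value[child.tag] = [value[child.tag], converted]
    return value


xml_text = (
    '<otus id="otus1">'
    '<otu id="otu1" label="Hemsleya amabilis"/>'
    '<otu id="otu2" label="Cayaponia prunifera"/>'
    "</otus>"
)
root = ET.fromstring(xml_text)
doc = {root.tag: badgerfish(root)}
print(json.dumps(doc, indent=2))
```

The two otu elements come out as one array under the "otu" key, each carrying "@id" and "@label".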