Interoperability of Taxon Treatments. Donat Agosti, Plazi. Brussels, June 2, 2014. Supported by the European Commission through its FP7 research funding programme.

2 donat agosti-1


DESCRIPTION

Interoperability of taxon treatments. Lecture at the Final Meeting of the Pro-iBiosphere Conference, Meise, Belgium. http://wiki.pro-ibiosphere.eu/wiki/Final_Conference


Page 1: 2 donat agosti-1

Donat Agosti, Plazi

Brussels, June 2, 2014 

Supported by the European Commission through its FP7 research funding programme 

Interoperability of Taxon Treatments

Page 2: 2 donat agosti-1

Hardisty, Nature 502, 171 (2013) 

BUT: predictive ecology has substantial data needs 

Harfoot, BIH2013, Rome, 2013

The big question

What is the future of the biological world? 

Imagine if we could:

…Predict community level dynamics of ecosystems at scales from local to global, based on the ecology and biology of all individual organisms

Page 3: 2 donat agosti-1

200,000,000+ printed pages
1,900,000 species described
20,000,000+ species treatments
17,000 new species per year

Biodiversity libraries

BUT: The data are hidden

Incomplete digitization
Publications are not semantically enhanced
Collections are incomplete
Data is not linked
Most data are not open

Page 4: 2 donat agosti-1

Interoperability of taxa

Can we build a system (e.g. Open Biodiversity Knowledge Management System) that includes a component that extracts, stores and serves information on taxa in a system that is agnostic of Biota?

Traditionally Floras, Faunas, Mycotas are dealt with by different communities

Page 5: 2 donat agosti-1

The Pro‐iBiosphere project is to develop a blueprint of an Open Knowledge Management System

It is not building a system

Pilots to demonstrate specific issues
interoperability of taxa
explore workflows to produce recommendations of «best» practices
interoperability of infrastructures
registration of names
advanced publishing

Do not expect production level products

Page 6: 2 donat agosti-1

Each taxonomic name usage has its treatment

Treatment

Formica obsoleta Linnaeus, 1758: 580 

Page 7: 2 donat agosti-1

Treatment as standard containers

http://en.wikipedia.org

Page 8: 2 donat agosti-1
Page 9: 2 donat agosti-1

Pilot 1: Taxa used for markup

Taxa          Documents   Treatments
Mistletoes    3           124
Chenopodium   15          174
Fungi         5           5
Bryophyta     2           25
Nephrolepis   1           35
Centipedes    50          154
Ants          40          486
Spiders       30          219
TOTAL         ca. 140     ca. 1500

Page 10: 2 donat agosti-1

Chenopodium pilot

Page 11: 2 donat agosti-1

Pardosa logunovi

Spider pilot: machine access to content through markup

Page 12: 2 donat agosti-1

Spider pilot: overview of 34 OA Zootaxa publications 

5170 specimens
4062 plottable specimens from 1138 unique locations
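The three figures on this slide (specimens, plottable specimens, unique locations) can be derived from marked-up specimen records along the following lines. This is only a minimal sketch: the record fields `lat` and `lon`, the function name `summarize`, the sample coordinates and the rounding precision are all assumptions for illustration, not Plazi's actual data model or data.

```python
from typing import Optional, TypedDict


class Specimen(TypedDict):
    taxon: str
    lat: Optional[float]  # hypothetical field: decimal latitude, None if not georeferenced
    lon: Optional[float]  # hypothetical field: decimal longitude, None if not georeferenced


def summarize(specimens: list[Specimen]) -> dict[str, int]:
    """Count all specimens, those with coordinates, and distinct coordinate pairs."""
    plottable = [s for s in specimens if s["lat"] is not None and s["lon"] is not None]
    # Rounding to 4 decimals is an arbitrary choice to merge near-identical points.
    unique_locations = {(round(s["lat"], 4), round(s["lon"], 4)) for s in plottable}
    return {
        "specimens": len(specimens),
        "plottable": len(plottable),
        "unique_locations": len(unique_locations),
    }


# Invented sample records; the coordinates are placeholders, not real localities.
records: list[Specimen] = [
    {"taxon": "Pardosa logunovi", "lat": 50.1, "lon": 90.2},
    {"taxon": "Pardosa logunovi", "lat": 50.1, "lon": 90.2},
    {"taxon": "Pardosa logunovi", "lat": None, "lon": None},
]

summary = summarize(records)
print(summary)
```

A specimen counts as plottable only when both coordinates are present, which is why the plottable count is lower than the specimen count.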

Page 13: 2 donat agosti-1

Pseudomyrmex ants and Vachellia ant‐acacias are a classic example of mutualism in biology.

Species labels in the network figure: allenii, melanoceras, ruddiae, chiapensis, collinsii, cookii, cornigera, globulifera, hindsii, janzenii, mayana, sphaerocephala, boopis, flavicornis, hesperius, ita, janzeni, kuenckeli, mixtecus, nigrocinctus, nigropilosus, opaciceps, particeps, peperi, reconditus, satanicus, simulans, spinicola, subtilissimus, veneficus, ferrugineus, gentlei, gracilis

Transbiotic link network
Associated species linked through references in taxonomic treatments

Acacia‐ant species: Pseudomyrmex gracilis

Treatment: original description

Treatment: redescription

Associated ant‐acacia: Acacia gentlei

Ants Plants

Photocredits: Alex Wild

Treatment

Treatments linked through citations

Transbiotic interoperability
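Linking treatments through the taxa they mention can be sketched as a small graph-building step. In the sketch below, the taxon names come from the slide, while the treatment identifiers (`t1`, `t2`, `t3`), the field names, and the assignment of mentions to individual treatments are hypothetical and hand-written for illustration only.

```python
from collections import defaultdict

# Hypothetical records: each treatment names its taxon and the other taxa it mentions.
treatments = [
    {"id": "t1", "taxon": "Pseudomyrmex gracilis", "kind": "original description", "mentions": []},
    {"id": "t2", "taxon": "Pseudomyrmex gracilis", "kind": "redescription", "mentions": ["Acacia gentlei"]},
    {"id": "t3", "taxon": "Acacia gentlei", "kind": "original description", "mentions": []},
]


def build_links(records):
    """Return sorted (source_id, target_id) pairs: a treatment linked to every
    treatment of a taxon that it mentions."""
    by_taxon = defaultdict(list)
    for t in records:
        by_taxon[t["taxon"]].append(t["id"])

    links = set()
    for t in records:
        for name in t["mentions"]:
            for target in by_taxon.get(name, []):
                links.add((t["id"], target))
    return sorted(links)


links = build_links(treatments)
print(links)
```

Because the links are keyed on taxon names rather than on the kingdom, the same step connects ant treatments to plant treatments without any group-specific logic.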

Page 14: 2 donat agosti-1
Page 15: 2 donat agosti-1

Pro‐iBiosphere: 1,000 treatments
Plazi: 10,000 treatments
Pensoft: 23,000
Total: 34,000 treatments

Legacy literature

Prospective literature

Page 16: 2 donat agosti-1

Page 17: 2 donat agosti-1

All data in Plazi

14,590 specimens
8900 plottable specimens from 1138 unique locations

Page 18: 2 donat agosti-1

Brazil

5170 specimens
4062 plottable specimens from 1138 unique locations

Page 19: 2 donat agosti-1

Brazil

Page 20: 2 donat agosti-1

Journal of Hymenoptera Research

5170 specimens
4062 plottable specimens from 1138 unique locations

Page 21: 2 donat agosti-1

Interoperability of taxa

Can we build a system (e.g. Open Biodiversity Knowledge Management System) that includes a component that extracts, stores and serves information on taxa in a system that is agnostic of Biota?

Yes, we can.

Page 22: 2 donat agosti-1

Legacy Prospective

Digitization √

OCR / Text capture √

Markup √ (√)

Standardization √ √

Strategies to markup √

External links √ (√)

Semantic enhancement √ (√)

Create content √ (√)

Issues and Recommendations

Page 23: 2 donat agosti-1

Find the right mix of generic and domain specific solutions

Plazi SRS

find scan «OCR» markup store

? domain domain generic

Digitization and Markup Workflow:

$$$$ ?

Page 24: 2 donat agosti-1

200,000 Taxonomic Articles in Zoological Record Since 1864

Create Content: selection strategy

Page 25: 2 donat agosti-1

Markup / data extraction strategies

Dedicated external services, bulk
Applications for individual contributor, small scale
Involve community / crowd / wikimedia
Ad hoc Web Services, individual
Mixed strategies

Combination with re‐publishing, small scale

Create market for treatments, large scale

Page 26: 2 donat agosti-1

Variation in status labels

TaxStatus          Total
comb. nov.         246
G. N.              65
gen. nov.          19
gen.nov.           10
hybr. nov.         13
n sp               12
n. comb.           2
n. nom.            6
n. sp.             267
n. stat.           5
n. subg.           3
new combination    139
new species        651
NEW STATUS         114
nomen novum        6
nov. spec.         1
REVISED STATUS     10
s. str.            1
sp. n.             130
sp. nov.           4057
sp.n.              3
spec. nov.         34
stat. nov.         56
Status revised     9
subsp. nov.        26
var. nov.          80
(blank)
Grand Total        5965

Standardize and apply in prospective publishing …

Quality Control and Standardization

«sp.nov.»
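Collapsing the label variants to one standard form can be sketched as a lookup on a punctuation-free key. The variant spellings and counts below are taken from the table on this slide. The decision that these eight variants all mean "new species", and the names `normalize_status` and `NEW_SPECIES_VARIANTS`, are assumptions made for illustration.

```python
import re

# Counts as shown on the slide for the variants assumed to mean "new species".
NEW_SPECIES_VARIANTS = {
    "sp. n.": 130,
    "sp. nov.": 4057,
    "sp.n.": 3,
    "spec. nov.": 34,
    "n sp": 12,
    "n. sp.": 267,
    "new species": 651,
    "nov. spec.": 1,
}


def _key(label: str) -> str:
    """Lower-case the label and drop everything except letters."""
    return re.sub(r"[^a-z]", "", label.lower())


# Map every known variant key to the standard label used on the slide.
CANONICAL = {_key(variant): "sp.nov." for variant in NEW_SPECIES_VARIANTS}


def normalize_status(label: str) -> str:
    """Return the standard label for known variants, else the trimmed input."""
    return CANONICAL.get(_key(label), label.strip())


merged_total = sum(NEW_SPECIES_VARIANTS.values())
print(normalize_status("n. sp."), normalize_status("comb. nov."), merged_total)
```

Labels outside the lookup, such as "comb. nov.", pass through unchanged, so the same pattern can be extended with one table per status type.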

Page 27: 2 donat agosti-1

Standardization of markup

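Formica rufa Linnaeus 1758: 426

Genus name: Formica
Species epithet: rufa
Name: Formica rufa
Authority: Linnaeus
Year of publication: 1758
Page of publication: 426
Bibliographic reference: Linnaeus 1758: 426
Treatment citation: Formica rufa Linnaeus 1758: 426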
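The parts of a treatment citation named on this slide can be pulled apart with a regular expression. The sketch below handles only the simple binomial pattern of the two examples in this deck ("Formica rufa Linnaeus 1758: 426" and "Formica obsoleta Linnaeus, 1758: 580"). The pattern and the name `parse_citation` are assumptions for illustration, not the markup rules used by Plazi.

```python
import re

# Genus, species epithet, authority, optional comma, four-digit year, colon, page.
CITATION = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+"
    r"(?P<epithet>[a-z]+)\s+"
    r"(?P<authority>[A-Z][\w.\- ]*?),?\s+"
    r"(?P<year>\d{4}):\s*"
    r"(?P<page>\d+)$"
)


def parse_citation(text: str):
    """Return the citation parts as a dict, or None if the text does not match."""
    match = CITATION.match(text.strip())
    return match.groupdict() if match else None


for example in ("Formica rufa Linnaeus 1758: 426", "Formica obsoleta Linnaeus, 1758: 580"):
    print(parse_citation(example))
```

Real citations vary far more than this (subgenera, multiple authors, parentheses around the authority), so a production parser would need a broader grammar.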

Page 28: 2 donat agosti-1

Linking of treatment as an example for external links

Treatment citation

Treatment identifier

Page 29: 2 donat agosti-1

Conclusions

• Biodiversity literature is very rich in data

• Biodiversity literature has a basic structure (treatments) across all Biota

• Legacy literature should be strategically marked up

• Prospective literature should be semantically enhanced

• Markup tools exist and should be optimized

• Identifiers for treatments exist to link to treatments

Page 30: 2 donat agosti-1

Thank you very much!

Donat Agosti, Plazi

[email protected]