L. Shyamal muscicapa @ crosswinds.net http://www.crosswinds.net/~muscicapa/ Taxonomic databases - Some possibilities

Taxonomy and computers


DESCRIPTION

Presentation made in 2001 at the pre-Association of Tropical Biology meet in Bangalore. The main suggestion is for biologists to contribute to the web, at least in free text, without being burdened by technology or knowledge of computers.


Page 1: Taxonomy and computers

L. Shyamal

muscicapa @ crosswinds.net

http://www.crosswinds.net/~muscicapa/

Taxonomic databases - Some possibilities

Page 2: Taxonomy and computers

Database aims

What to store? To what purpose?

• Species information

– Taxonomic - structure, naming, literature, identification

• Collection (Observations) information

– identified (linked to species)

– unidentified (linkable)

• Analysis and reporting

– searching

– identification-key generation (with geographic/taxonomic bounds)

– phylogenetic analyses

– biological studies

Page 3: Taxonomy and computers

The case for plain text

• Knowledge level of an index-card user

• Not tied to software (applications / OS)

• Ease of conversion

• Searchable – with readily available tools and search engines

– Can be filtered in advanced ways

• Usable with systems that extract data from natural language (e.g. the Finnish Lepidoptera database draws distribution maps based on country names)

• In the absence of entry tools - entry needs care

• greater storage

• the web is a text based database
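The point about extracting data from natural language can be made concrete with a small sketch. It is only an illustration of the idea and makes no claim about how the Finnish Lepidoptera database works; the country list and the sample note are invented for the example.

```python
import re

# Hypothetical, deliberately short gazetteer; a real one would be far longer.
COUNTRIES = ["India", "Sri Lanka", "Nepal", "Finland", "Australia"]


def countries_mentioned(text):
    """Return the gazetteer countries that occur as whole words in free text."""
    found = []
    for name in COUNTRIES:
        # \b keeps "India" from matching inside a longer word such as "Indiana".
        if re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE):
            found.append(name)
    return found


# Invented free-text distribution note.
note = "Recorded from southern India and Sri Lanka; one old record from Nepal."
print(countries_mentioned(note))
```

The returned names could then be passed to any mapping tool; the text itself needs no special structure beyond consistent spelling of place names.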

Page 4: Taxonomy and computers

Tagged fields - a taxonomic entry in HISPID format

{
CBG_NO 7702638,
COLL_YEAR 1975,
COLL_MONTH 'JUL',
COLL_DAY 20,
COLLECTOR 'Crisp, B.C.',
FIELD_NUMBER '383',
COUNTRY 'AUSTRALIA',
REGION_CODE 'WHA',
LOCALITY 'Great Sandy Desert; Wolf Creek Meteorite crater.',
LAT_DEG 19,
LAT_MIN 10,
LAT_DIR 'S',
LONG_DEG 127,
LONG_MIN 48,
LONG_DIR 'E',
GEOCODE_SOURCE 1,
HABITAT_A 'Red sand dune.',
NOTES_A 'Shrub to 1 m.',
HERBARIUM_ITEMS '1',
HERBARIUM_MAT 'H',
DETERMINED 'Crisp, M.D.',
SUPRA_FAMILY 'PD',
FAMILY 'FABACEAE',
GENUS 'Jacksonia',
SPECIES 'aculeata',
SP_AUTHOR 'W.V. Fitzg.',
}

Royal Botanic Gardens, Sydney
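A tagged record of this kind is easy to read by program. The following is a minimal sketch that handles only the shape visible in the entry above (an upper-case tag followed by a quoted string or a bare number, then a comma); it is not a full HISPID parser, and the function names are made up for the example.

```python
import re

# One "TAG value," pair: an upper-case tag, then a quoted string or a number.
FIELD = re.compile(r"([A-Z_]+)\s+(?:'([^']*)'|(-?\d+(?:\.\d+)?))\s*,")


def parse_record(text):
    """Turn a tagged record into a dict, converting bare numbers to int/float."""
    record = {}
    for tag, quoted, number in FIELD.findall(text):
        if number:
            record[tag] = float(number) if "." in number else int(number)
        else:
            record[tag] = quoted
    return record


def decimal_degrees(degrees, minutes, direction):
    """Convert degrees and minutes to signed decimal degrees (S and W negative)."""
    value = degrees + minutes / 60
    return -value if direction in ("S", "W") else value


# A subset of the fields from the entry above.
sample = """{ CBG_NO 7702638, COLL_YEAR 1975, COLL_MONTH 'JUL',
COLLECTOR 'Crisp, B.C.', LAT_DEG 19, LAT_MIN 10, LAT_DIR 'S',
GENUS 'Jacksonia', SPECIES 'aculeata', }"""

rec = parse_record(sample)
latitude = decimal_degrees(rec["LAT_DEG"], rec["LAT_MIN"], rec["LAT_DIR"])
print(rec["GENUS"], rec["SPECIES"], round(latitude, 4))
```

Because commas inside quoted values (as in the collector name) are consumed as part of the quoted string, they do not split the field.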

Page 5: Taxonomy and computers

Tagged Data

<specimen id="AB1001">
  <collector> X </collector>
  <location latitude="12.83N" longitude="77.5E" date="19DEC1999">
    Bangalore
  </location>
  <comments>feeding on ant pupae</comments>
  <identification>
    <determined> Y </determined>
    <family>Calliphoridae</family>
    <genus>Bengalia</genus>
    <species>lateralis</species>
  </identification>
  <image href="http://…">wing image</image>
</specimen>
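Once a specimen is tagged like this, standard tools can pull out individual fields. Below is a small sketch using Python's standard-library XML parser. The record is a trimmed copy of the one above with straight quotes, and the longitude is written as 77.5E since a longitude takes E or W; the `summarize` function is a name invented for the example.

```python
import xml.etree.ElementTree as ET

SPECIMEN_XML = """
<specimen id="AB1001">
  <collector> X </collector>
  <location latitude="12.83N" longitude="77.5E" date="19DEC1999">
    Bangalore
  </location>
  <comments>feeding on ant pupae</comments>
  <identification>
    <determined> Y </determined>
    <family>Calliphoridae</family>
    <genus>Bengalia</genus>
    <species>lateralis</species>
  </identification>
</specimen>
"""


def summarize(xml_text):
    """Extract a few fields from one <specimen> record into a plain dict."""
    root = ET.fromstring(xml_text.strip())
    ident = root.find("identification")
    location = root.find("location")
    genus = ident.findtext("genus").strip()
    species = ident.findtext("species").strip()
    return {
        "id": root.get("id"),
        "name": f"{genus} {species}",
        "family": ident.findtext("family").strip(),
        "place": location.text.strip(),
        "date": location.get("date"),
    }


summary = summarize(SPECIMEN_XML)
print(summary)
```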

Page 6: Taxonomy and computers

Other Data Elements

Literature

Descriptions - other notes, biology

links to pictures

links to sound

Maps - locations + time

Biochemical/sequence data

Cross references

to other species - hosts - predator - parasite, web sites, etc.

Other databases

Page 7: Taxonomy and computers

Extensible Markup Language

• XML - open specification (= ‘generic HTML’ )

• advantages of plain text + metadata

• checking and validation can be automated

• searching/querying/filtering the data is easy

• standard schema can be adopted

• can be used on the internet via browser
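The bullet on automated checking can be illustrated with a short sketch. It is not schema validation in the formal sense (the Python standard library does not validate against a DTD or XML Schema); it only checks well-formedness and a hypothetical list of required elements chosen for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal rule set: paths every specimen record must contain.
REQUIRED = ["collector", "location", "identification/genus", "identification/species"]


def validate(xml_text):
    """Return a list of problems; an empty list means the record passed."""
    try:
        root = ET.fromstring(xml_text.strip())
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []
    if root.tag != "specimen":
        problems.append(f"unexpected root element <{root.tag}>")
    if not root.get("id"):
        problems.append("missing id attribute")
    for path in REQUIRED:
        node = root.find(path)
        if node is None or not (node.text or "").strip():
            problems.append(f"missing or empty <{path}>")
    return problems


good = """<specimen id="AB1001"><collector>X</collector>
<location>Bangalore</location>
<identification><genus>Bengalia</genus><species>lateralis</species></identification>
</specimen>"""
bad = """<specimen><collector>X</collector>
<identification><genus>Bengalia</genus></identification></specimen>"""

print(validate(good))
print(validate(bad))
```

A check like this can run automatically on every submitted record, which is what makes tagged text easier to keep clean than untagged notes.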

Page 8: Taxonomy and computers

[Diagram: Database (XML/text) - Webserver - internet - Web-browser (Applets)]

• Search
• Query
• Filter
• Edit
• Enter

• Applets to render distribution maps
• Applets to generate keys
• Interactive identification programs
• Applets to submit data

Page 9: Taxonomy and computers

Current standards and models

• DELTA - DEscription Language for TAxonomy

– M. J. Dallwitz and T. A. Paine

• XDELTA (XML)

• Linnaeus II (ETI) UNESCO/University of Amsterdam

• TDWG - Taxonomic Databases Working Group

• ITIS - USDA

• Host-plant database, NHM - public domain

• WCMC Red-data list

Page 10: Taxonomy and computers

Available Systems

Linnaeus II - ITIS - SpeciesAnalyst

Page 11: Taxonomy and computers

Character matrices

automated key generation

cladistic analysis

            Character 1  Character 2  Character 3  Character 4  Character 5
Species 1        1            1            0            1            -
Species 2        1            2            1            2            1
Species 3        1            2            0            2            -
Species 4        0            1            1            2            -
Species 5        1            2            1            2            -

Metadata - characters - states (ordinal/nominal), weightage

Weights c2 > c1 > c3 > c4 > c5

c2
    (s1,s4)
        c1
            s1
            s4
    (s2,s3,s5)
        c3
            s3
            (s2,s5)
                c5
                    s2
                    s5
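The key shown on this slide can be reproduced from the matrix with a few lines of code. This is a minimal sketch of one simple strategy (always split on the highest-weighted character that separates the remaining species), not the algorithm of any particular key-generation package. It assumes that a missing score ("-", stored as None) counts as a state of its own, which is what the slide's final split on c5 implies.

```python
# Character states per species, in the order c1..c5; None stands for "-".
MATRIX = {
    "s1": [1, 1, 0, 1, None],
    "s2": [1, 2, 1, 2, 1],
    "s3": [1, 2, 0, 2, None],
    "s4": [0, 1, 1, 2, None],
    "s5": [1, 2, 1, 2, None],
}

# Character indices in descending weight: c2 > c1 > c3 > c4 > c5.
ORDER = [1, 0, 2, 3, 4]


def build_key(species, order=ORDER):
    """Return a nested key: a species name, or (character, {state: subkey})."""
    if len(species) == 1:
        return species[0]
    for i, char in enumerate(order):
        groups = {}
        for sp in species:
            groups.setdefault(MATRIX[sp][char], []).append(sp)
        if len(groups) > 1:  # this character separates at least two groups
            rest = order[:i] + order[i + 1:]
            return (f"c{char + 1}",
                    {state: build_key(members, rest)
                     for state, members in groups.items()})
    return list(species)  # no remaining character tells these apart


key = build_key(sorted(MATRIX))
print(key)
```

Restricting the species list first (for example to those recorded in one region) and then calling `build_key` gives a shorter key, which is the idea behind generating keys within geographic or taxonomic bounds.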

Page 12: Taxonomy and computers

Distribution

Diversity

Page 13: Taxonomy and computers

Local Distributions

Propylea 14-punctata

The 14-spot ladybird

Grid size 2 x 2 km

Black dots represent records since 1995.

The hashed square is an older record given only to 10 km by 10 km accuracy.

http://www.wbrc.org.uk/general/maps.htm
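A dot map of this kind comes down to assigning each record to a grid square. The sketch below assumes records are available as easting and northing in metres plus a year, and treats "since 1995" as including 1995; the coordinates are invented and the function names are made up for the example.

```python
from collections import Counter

CELL_M = 2000  # side of one grid square in metres (2 x 2 km)


def grid_cell(easting_m, northing_m, cell_m=CELL_M):
    """South-west corner (in metres) of the grid square containing a point."""
    return (easting_m // cell_m * cell_m, northing_m // cell_m * cell_m)


def occupied_cells(records, since=1995):
    """Count records per grid square, keeping only records from `since` onward."""
    counts = Counter()
    for easting, northing, year in records:
        if year >= since:
            counts[grid_cell(easting, northing)] += 1
    return counts


# Invented records: (easting in m, northing in m, year).
records = [
    (430150, 265900, 1997),
    (431900, 264100, 1999),
    (436500, 270250, 2000),
    (440000, 280000, 1988),  # too old, so it is left off the recent-records map
]

cells = occupied_cells(records)
print(dict(cells))
```

Each key in the result is one square to be marked with a dot; records known only to 10 km accuracy would need the same function called with `cell_m=10000`.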

Page 14: Taxonomy and computers

Automated Identification

Adsavakulchai, S., V. Baimai, W. Prachyabrued, P. J. Grote and S. Lertlum (1998). Morphometric study using wing image analysis for identification of the Bactrocera dorsalis complex (Diptera: Tephritidae). The World Wide Web Journal of Biology.

http://www.epress.com/w3jbio/vol3/Adsavakulchai/index.html

Page 15: Taxonomy and computers
Page 16: Taxonomy and computers

Automated Data acquisition

• optical character recognition (OCR)

• extraction of species and character information from well-structured descriptions in old literature (e.g. Fauna of British India)
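Extraction from well-structured descriptions can be sketched with pattern matching. The entry below is invented for illustration and is not a quotation from any volume; the layout it assumes (entry number, binomial, author, a "Length" range in mm, and a "Hab." list of localities) is a guess at the kind of regularity such texts have, and real OCR output would need patterns tuned to the actual source.

```python
import re

# Invented entry, loosely in the style of an old faunal volume.
ENTRY = ("123. Examplea typica, Smith. Head black; thorax yellow. "
         "Length 9-11 mm. Hab. Bangalore; Mysore.")

HEADER = re.compile(r"^(\d+)\.\s+([A-Z][a-z]+)\s+([a-z]+),\s+([^.]+)\.")
LENGTH = re.compile(r"Length\s+(\d+(?:\.\d+)?)(?:\s*-\s*(\d+(?:\.\d+)?))?\s*mm")
HABITAT = re.compile(r"Hab\.\s+([^.]+)\.")


def parse_entry(text):
    """Pull the name, length range and localities out of one description."""
    result = {}
    header = HEADER.search(text)
    if header:
        result["number"] = int(header.group(1))
        result["genus"] = header.group(2)
        result["species"] = header.group(3)
        result["author"] = header.group(4).strip()
    length = LENGTH.search(text)
    if length:
        low = float(length.group(1))
        high = float(length.group(2)) if length.group(2) else low
        result["length_mm"] = (low, high)
    habitat = HABITAT.search(text)
    if habitat:
        result["localities"] = [p.strip() for p in habitat.group(1).split(";")]
    return result


parsed = parse_entry(ENTRY)
print(parsed)
```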

Page 17: Taxonomy and computers

Remote access

• Databases as well as collections

• Remote examination of specimens

– web cams - manual / robotic

– field / lab

• Sharing and copyright issues

• Quality control

Page 18: Taxonomy and computers

Online references

Digital Taxonomy Resources ( http://www.geocities.com/RainForest/Vines/8695/ )

ITIS - USDA Taxonomists workbench ( http://itis.usda.gov/ )

SpeciesAnalyst XML ( http://habanero.nhm.ukans.edu/ )

ETI, University of Amsterdam – World Biodiversity Database ( http://www.eti.uva.nl )

DELTA ( http://www.keil.ukans.edu/delta/, http://www.biodiversity.uno.edu/delta/ )

XDELTA ( http://www.bath.ac.uk/~ccslrd/delta/index.html )

Software for Field biologists ( http://www.euronet.nl/users/mbleeker/prog/soflis_e.html )

Key generation ( http://prod.library.utoronto.ca/polyclave/, http://www.lucidcentral.com )

Page 19: Taxonomy and computers

Data Standards

Conn, B. (ed.). (1996). HISPID3. Herbarium information standards and protocols for interchange of data. Version Three. Sydney: Royal Botanic Gardens. 126 pp.

Dallwitz, M.J., T.A. Paine & E.J. Zurcher. (1993). DELTA user's guide. A general system of processing taxonomic descriptions. Fourth ed., Canberra: CSIRO Division of Entomology. 136 pp.

Dallwitz, M. J. (1980). A general system for coding taxonomic descriptions. Taxon 29, 41–6.

Dallwitz, M. J. (1980). User’s guide to the DELTA system. A general system for coding taxonomic descriptions. CSIRO Aust. Div. Entomol. Rep. No. 13, 71 pp.

Dallwitz, M. J. (1984). User’s guide to the DELTA system: a general system for coding taxonomic descriptions. 2nd edition. CSIRO Aust. Div. Entomol. Rep. No. 13, 1–93.

Dallwitz, M. J., and Paine, T. A. (1986). User’s guide to the DELTA system: a general system for processing taxonomic descriptions. 3rd edition. CSIRO Aust. Div. Entomol. Rep. No. 13, 1–106.

Pankhurst, R. J. (1986). A package of computer programs for handling taxonomic databases. CABIOS 2, 33–9.

Watson, L., Dallwitz, M. J., Gibbs, A. J. and Pankhurst, R. J. (1988). Automated taxonomic descriptions. Pp. 292-304 in: Hawksworth, D. L., Davies, R. G. & Bisby, F. A. (eds.), Prospects in Systematics. Clarendon Press, Oxford.

Page 20: Taxonomy and computers

Automated identification

• Yu, D.S., Kokko, E.G., Barron, J.R., Schaalje, G.B. & Gowen, B.E. (1992). Identification of ichneumonid wasps using image analysis of wings. Systematic Entomology, 17, 389-395.

• White, I.M. & Scott, P.R. (1994). Computerized information resources for pest identification: a review, pp. 129-137 in: Hawksworth, D.L. (ed.), The Identification and Characterisation of Pest Organisms. CAB International, Wallingford, UK.

• Howell, V.D., Hoelmer, K., Norman, P. & Allen, T. (1982). Computer-assisted measurement and identification of honey bees (Hymenoptera: Apidae). Annals of the Entomological Society of America, 75, 591-594.

• Frampton, E.R., Fry, J., Stephenson, B.P. & Cowley, J.M. (1991). A Computerised system for data management during a fruit fly outbreak. The International Symposium on the Biology and control of fruit flies.