The SAIAB Biodiversity Data Curation Platform
Willem Coetzer South African Institute for Aquatic Biodiversity
Grahamstown
Overview
Introduction to SAIAB
Specimen collections
Research platforms
Case study using BRUV data
The natural science collections community
Natural Science Collections Facility (NSCF)
The role of SAIAB in hosting museums’ biodiversity data for the NSCF
Concluding remarks
Data publication
Introduction to SAIAB
http://www.saiab.ac.za/information-brochures.htm
Collections Platform
• National Fish Collection
Research platforms (for use by the SA scientific community)
• African Coelacanth Ecosystem Programme (ACEP): Marine Platform
• Acoustic Tracking Array Platform (ATAP)
• Marine Remote Imagery Platform (Ma-RIP) *
• Vessels and instruments (e.g. remotely operated vehicle & remote underwater camera)
Stills from remote underwater video
Baited remote underwater video (BRUV)
How would you characterise these data?
a) Marine biodiversity data
b) Fish data
c) Behavioural data
d) Ecological data
e) Camera-trap data
f) All of the above
Baited remote underwater video (BRUV)
How would you characterise these data?
a) ‘Marine biodiversity data’: A marine biodiversity scientist can model knowledge…
b) ‘Fish data’: An ichthyologist can model knowledge…
c) ‘Behavioural data’: An ethologist can model knowledge…
d) ‘Ecological data’: An ecologist can model knowledge …
e) ‘Camera-trap data’?
Fundamentally we are talking about ‘raw data records’ of events and occurrences. We describe
these using metadata, which is an organised form of knowledge (i.e. a computer file).
We create a knowledge model of abstract concepts (e.g. an ontology) in a particular domain.
The knowledge model can then be applied to the raw data, and used as a ‘cookie cutter’ to extract useful or actionable biodiversity information from the knowledge/data fusion.
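As a rough illustration of the ‘cookie cutter’ idea, the sketch below treats a very small knowledge model as a set of classes, each with the properties a raw record must carry to count as an instance of that class. The property names are borrowed from Darwin Core (eventID, eventDate, scientificName, individualCount), but the model itself, the `extract` function and the sample records are hypothetical and made up for illustration; they are not SAIAB's actual model or data.

```python
# Hypothetical, minimal knowledge model: each class lists the properties
# a raw record must carry to count as an instance of that class.
KNOWLEDGE_MODEL = {
    "Event": {"eventID", "eventDate"},
    "Occurrence": {"eventID", "scientificName", "individualCount"},
}


def extract(raw_records, model=KNOWLEDGE_MODEL):
    """Sort raw records into the model classes whose properties they satisfy."""
    extracted = {name: [] for name in model}
    for record in raw_records:
        # Only properties with a non-empty value count as present.
        present = {key for key, value in record.items() if value not in ("", None)}
        for name, required in model.items():
            if required <= present:
                extracted[name].append({key: record[key] for key in sorted(required)})
    return extracted


# Made-up sample records, for illustration only.
raw = [
    {"eventID": "BRUV-001", "eventDate": "2016-03-14"},
    {"eventID": "BRUV-001", "scientificName": "Chrysoblephus laticeps", "individualCount": 3},
    {"eventID": "BRUV-001", "scientificName": "", "individualCount": 1},  # no name: fits no class
]
result = extract(raw)
```

Records that fit no class are left out of the result, which is the sense in which the model ‘cuts’ usable information out of the raw data. A different domain scientist could supply a different model and cut different information from the same records.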
Metadata properties of the Occurrence class
Workbench: Mapping BRUV spreadsheet columns to Specify fields
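Conceptually, a column mapping of this kind amounts to renaming spreadsheet columns to target fields and flagging any column that has no target. The sketch below shows that idea only; it does not use the Specify Workbench or its API. The spreadsheet headers, the sample row and the choice of Darwin Core-style target names are all assumptions, including the decision to map the BRUV abundance measure MaxN to individualCount.

```python
import csv
import io

# Hypothetical mapping from BRUV spreadsheet headers to target field names.
# The target names are Darwin Core-style terms, not actual Specify schema fields.
COLUMN_MAP = {
    "Deployment": "eventID",
    "Date": "eventDate",
    "Lat": "decimalLatitude",
    "Long": "decimalLongitude",
    "Species": "scientificName",
    "MaxN": "individualCount",  # assumption: MaxN is reported as the count
}


def map_rows(csv_text, column_map=COLUMN_MAP):
    """Rename mapped columns in each row and report the unmapped headers."""
    reader = csv.DictReader(io.StringIO(csv_text))
    unmapped = [header for header in reader.fieldnames if header not in column_map]
    rows = [
        {column_map[h]: (v or "").strip() for h, v in row.items() if h in column_map}
        for row in reader
    ]
    return rows, unmapped


# Made-up sample spreadsheet content, for illustration only.
sample = (
    "Deployment,Date,Lat,Long,Species,MaxN,Observer\n"
    "BRUV-001,2016-03-14,-34.05,23.40,Chrysoblephus laticeps,3,observer-1\n"
)
rows, unmapped = map_rows(sample)
```

Reporting unmapped columns, rather than silently dropping them, lets the data steward decide whether a column needs a new target field or can be left out.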
Vertical integration of events and occurrences
Natural Science Collections Facility (NSCF)
• The NSCF (virtual facility) is a network of institutions/museums holding natural science collections that are accessible to external researchers (open data).
• The NSCF will ensure that natural science collections/data are used for research.
• The NSCF is funded through the South African Research Infrastructure Roadmap (SARIR), the long-term funding programme of the Department of Science and Technology (DST).
Biodiversity Data Curation Platform (hosted by the SAIAB Data Centre)
Description
• Three data centres (2 replicated in real time and 1 for backup)
• Systems Administrator / Data Custodian
• Museum web servers hosted by SAIAB Data Centre
• Specify Software (Specify Collections Consortium, USA)
• Specify 7 (web version)
• Information Manager / Data Steward
Intention
• Primarily for the use of SAIAB scientists (originally used for collection data)
• Open to all natural science museums in South Africa
• Collaborative research in capacity development, specifically for biodiversity data curation in the context of South African natural science museums
• Not necessarily a long-term solution but is useful for NSCF objectives
Participating museums
Museum                           Application    Status of vertebrates migration
Ditsong Museum                   Specify 7      Complete
Durban Natural Science Museum    Specify 7      Complete
Port Elizabeth Museum            Specify 7      Complete
Albany Museum                    Specify 6/7
East London Museum               Specify 7      Complete
KwaZulu-Natal Museum             Specify 7      Complete
McGregor Museum                  Specify 7      Complete
Biodiversity Data Curation Platform: Support
Biodiversity Data Curation Platform: Backup
• Data hosted by SAIAB is replicated across three independent data centres (>250 m apart)
• Two data centres replicate in real time, and the third is dedicated to storing backups.
• High-capacity tape backup will be added in the near future
• As an additional measure, cloud storage is used to store daily extracts of Specify databases, which are retained for one year.
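The one-year retention of daily extracts can be pictured as a simple pruning rule. The sketch below is only an illustration of such a rule, not a description of how the SAIAB Data Centre implements it: the function name, the reading of ‘one year’ as 365 days, and the sample dates are all assumptions.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=365)  # assumption: "one year" taken as 365 days


def expired_extracts(extract_dates, today, retention=RETENTION):
    """Return the extract dates older than the retention window, oldest first."""
    cutoff = today - retention
    return sorted(d for d in extract_dates if d < cutoff)


# Made-up dates, for illustration only.
today = date(2023, 6, 30)
extracts = [date(2022, 6, 29), date(2022, 6, 30), date(2023, 6, 29)]
to_delete = expired_extracts(extracts, today)
```

With these sample dates only the extract from 29 June 2022 falls outside the window; the extract taken exactly 365 days before `today` is still kept.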
Specify 7 Workbench: an application for importing a spreadsheet (e.g. the web application used by a data steward at a remote museum) *
Publication of data (Global Biodiversity Information Facility)
Concluding Remarks
• The museum / research institute dichotomy is reflected in culture and attitudes to data
• Capacity development for data curation in the biodiversity community
• The quality of data, and the adherence to metadata standards
• Encourage researchers to publish data (e.g. Biodiversity Data Journal)