GEOTRACES International Data Assembly Centre

Page 1: GEOTRACES International Data Assembly Centre

GEOTRACES International Data Assembly Centre

Edward Mawji

GDAC/BODC

[email protected]

Page 2: GEOTRACES International Data Assembly Centre

Discussion Points

Important concepts in data management

What and who are GDAC (GEOTRACES International Data Assembly Centre)?

Attempt to bring together data managers for WG3 meeting

Issues with management of GEOTRACES data across Europe

Membership of WG

Dates for first WG3 meeting

Page 3: GEOTRACES International Data Assembly Centre

Guiding principles of data management

• quality assurance of data

• treat all information as data

• data that lack sufficient metadata have limited value beyond the research program for which they were collected

• metadata should include sufficient information to support discovery, value assessment and accurate re-use of data

“data stewardship”

The data collection generated by a research project is a valuable component of its legacy.

Page 4: GEOTRACES International Data Assembly Centre

The GEOTRACES International Data Assembly Centre (GDAC) was initially created in 2008 to serve the data management needs of the GEOTRACES community. At present GDAC is jointly funded by NSF and NERC.

Sole responsibility is data management and storage

Starting premise: data must be secure and readily usable in the short term by GEOTRACES participants, and in the long term without reference to the originator.

What and who are GDAC

GDAC is located at the British Oceanographic Data Centre, Liverpool, UK

The website is: http://www.bodc.ac.uk

The GDAC data manager is Edward Mawji

Page 5: GEOTRACES International Data Assembly Centre

GDAC role

• Main role of GDAC will be to compile global datasets for all key GEOTRACES parameters.

• Provide PIs with guidance on metadata requirements

• Capture and record supporting documentation (metadata)

• Make data easily accessible to participating scientists and the larger science community.

• Communicate with national data centres

• The policy for data release will be determined by the Scientific Steering Committee (SSC).

• Website for data delivery is under construction; the delivery aspect will be developed once data have been submitted to GDAC.

Page 6: GEOTRACES International Data Assembly Centre

Maintain contact with national data centres

Provide advice and/or assistance for PIs

prior to planning workshops prior to cruise …

cruise metadata forms discuss data management strategy

to support research post-cruise … data publishing …

data inventory cruise documentation data contributions …

To meet the requirements of the GEOTRACES programme, GEOTRACES cruises require a high level of data management

GDAC responsibilities

Page 7: GEOTRACES International Data Assembly Centre

GEOTRACES Data Management set-up

[Diagram: Flow of Data. Box labels: Ship-based measurements; National Data Centres; Store data and all metadata for all GEOTRACES cruises; Ship-based TEI measurements; GDAC]

Page 8: GEOTRACES International Data Assembly Centre

Policy

Before cruise: PI/PSO informs GDAC of intended cruise

Download pre-cruise metadata form and scientific sampling event log forms from website

Identify appropriate DAC

If there is no DAC, inform GDAC, who will act as DAC (cruise planning stage)

After cruise: Chief Scientist

Submits metadata forms and event logs to GDAC and DAC (1 week)

Submits underway navigational files and data (1 week)

Submits CTD data to GDAC (1 week)

Submits cruise report (8-16 weeks)

• DAC carries out data tracking and submits final data to GDAC. If there is no DAC, GDAC carries out data tracking
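
The post-cruise deadlines above can be turned into calendar dates. The following is a minimal sketch, assuming the periods are counted from the last day of the cruise (the slide only says "after cruise") and reading "8-16 weeks" as a window with an earliest and a latest date. The function name and the example date are hypothetical.

```python
from datetime import date, timedelta

# Weeks after the end of the cruise, as listed on the slide.
# Each entry is (earliest, latest); the cruise report is given as a range.
SUBMISSION_WEEKS = {
    "metadata forms and event logs": (1, 1),
    "underway navigational files and data": (1, 1),
    "CTD data": (1, 1),
    "cruise report": (8, 16),
}


def submission_windows(cruise_end: date) -> dict[str, tuple[date, date]]:
    """Return the (earliest, latest) due date for each post-cruise item.

    Assumption: deadlines are measured from cruise_end.
    """
    return {
        item: (
            cruise_end + timedelta(weeks=first),
            cruise_end + timedelta(weeks=last),
        )
        for item, (first, last) in SUBMISSION_WEEKS.items()
    }


if __name__ == "__main__":
    # Hypothetical cruise end date, for illustration only.
    for item, (first, last) in submission_windows(date(2009, 3, 31)).items():
        print(f"{item}: {first.isoformat()} to {last.isoformat()}")
```

A DAC or GDAC could use a table like this for the data tracking step, comparing the dates against what has actually been received.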

Page 9: GEOTRACES International Data Assembly Centre

Progress to date

Initial steps

Designed and published a GDAC website: www.bodc.ac.uk/geotraces/

Please have a look; feedback would be welcome

Page 10: GEOTRACES International Data Assembly Centre

Website progress and issues

Things to do

• Create links under the relevant section for metadata forms and example event logs (waiting for feedback from DMC)

• Link to the IMBER data management cookbook (once published).

• Add a full description of every parameter measured on each cruise. This can only be achieved when PIs or national data centres pass on this information.

A cruise report would help.

Unified description of key GEOTRACES parameters

• Add cruises to POGO

• Development of the data delivery function will be put on hold until it is necessary.

Page 11: GEOTRACES International Data Assembly Centre

Accessing Data

GEOTRACES data specifications

Ask for user name and details when a data request is made

Provide information about the data with the data file = standard processing file OR metadata form

Link from GEOTRACES website to GDAC website

No data will be made public without appropriate approval

Page 12: GEOTRACES International Data Assembly Centre

Attempt to bring together data managers for WG3 meeting

Initially a meeting was planned for May 2009

Poor response, so cancelled

Why?

Lack of contact with national data centres

Too early for a data meeting?

Page 13: GEOTRACES International Data Assembly Centre

Why a WG3 meeting is important

Centralisation of data

Version control

Communication

Metadata forms and requirements

It is advisable that all GEOTRACES cruises have an on-board data manager

This greatly helps the organisation of data. IMBER have developed an online guide, which will be made available and/or adapted for GEOTRACES once the IMBER SSC approve the draft

Data Management Issues

Page 14: GEOTRACES International Data Assembly Centre

IMBER Data Management Cook Book

Feedback from the GEOTRACES community would be useful for our own cookbook

http://planktondata.net/imber/

Page 15: GEOTRACES International Data Assembly Centre

Likely Problems

GEOTRACES Problems

Lack of cooperation from scientists and data centres.

Lead nations need to identify GEOTRACES data managers

Mistrust, e.g. Australia: IPY data will not be made available until data is published

Integrate GEOTRACES data into GDAC’s database

Version control of data: potential problem from holding GEOTRACES data in multiple locations.

Measures needed to ensure international data management works smoothly

Data managers should get together regularly to discuss progress.
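
One way to detect the version-control problem noted above (the same data set held in multiple locations drifting apart) is to compare checksums of each copy. This is only an illustrative sketch, not a description of how GDAC or any data centre works; the function names and location labels are hypothetical.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """SHA-256 digest of a data file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def group_copies_by_digest(copies: dict[str, bytes]) -> dict[str, list[str]]:
    """Group holding locations by the digest of the copy they hold.

    More than one group means at least two locations hold different versions.
    """
    groups: dict[str, list[str]] = {}
    for location, data in copies.items():
        groups.setdefault(content_digest(data), []).append(location)
    return groups


def copies_agree(copies: dict[str, bytes]) -> bool:
    """True when every location holds byte-identical content."""
    return len(group_copies_by_digest(copies)) <= 1


if __name__ == "__main__":
    # Hypothetical copies of one file held at three locations.
    copies = {
        "national data centre": b"station,depth,value\n1,10,0.52\n",
        "GDAC": b"station,depth,value\n1,10,0.52\n",
        "PI laptop": b"station,depth,value\n1,10,0.55\n",
    }
    print(copies_agree(copies))
    print(sorted(group_copies_by_digest(copies).values()))
```

A checksum only shows that copies differ; deciding which copy is the current version still needs the kind of communication between data managers that the slide calls for.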

Page 16: GEOTRACES International Data Assembly Centre

Contacts and progress with IPY GEOTRACES data

The following contacts have been made:

Germany - PANGAEA (DE) ([email protected]). Contact via email. Not much data in PANGAEA (Sea-Bird bottle data and some radionuclide data for cruise ARK-XXII/2). Issues over version control exist, which will hopefully be resolved before data are transferred.

Netherlands - Taco de Bruin ([email protected]). IPY data manager, but will also deal with GEOTRACES IPY. New to the job; GDAC have only just received his details (March 2009). Hein de Baar and Michiel Rutgers van der Loeff are confident they can make good progress in submitting data for the Polarstern cruises ARK-XXII/2 and ANT XXIV/3 in the next few months.

France - Marie-Paule Torre (FR) ([email protected]). Contact via email; Ed Mawji will arrange a meeting in the next few months to discuss BONUS-GOOD HOPE data. Marie-Paule Torre expects data to start arriving around May 2009.

USA - Cyndy Chandler ([email protected]). Very helpful; no data at present. Will probably start with metadata.

Page 17: GEOTRACES International Data Assembly Centre

New Zealand - Philip Boyd ([email protected]) will handle all NZ GEOTRACES data. Ed Mawji is in the process of setting up arrangements for GDAC to act as local data centre for the GEOTRACES process study FeCycle II (March 2009).

Australia - Andy Bowie and Edward Butler are in charge of IPY-GEOTRACES data. Butler’s group is still analysing samples. Bowie has finalised data for SAZ-SENSE, but will not transfer data until it is published.

No contact with Norway and Sweden, but GDAC have been advised that the following people are in charge of IPY data:

Norway - Oystein Godoy ([email protected])

Sweden - Jan Szaron ([email protected]), SMHI oceanography

No contact with Japan or China.

Page 18: GEOTRACES International Data Assembly Centre

Summary of progress and problems

Without a detailed cruise report or metadata forms, tracking data becomes difficult (no record of what parameters were measured).

Metadata forms have been developed; waiting for feedback from Reiner and Chris.

Distrust between some lead PIs and data centres, and between data centres.

Data retrieval methods have not been tested but are likely to vary between data centres and countries. No details at present.

No data sets have arrived at GDAC yet. It is not possible to set up BODC parameter codes until data with detailed methods are received.

Future progress

The most important task is mapping all GEOTRACES parameters measured on all IPY cruises

Page 19: GEOTRACES International Data Assembly Centre

Membership of WG

Need to increase the data management membership

Please nominate an appropriate person from each nation

Dates for the first meeting will depend on increased membership

Page 20: GEOTRACES International Data Assembly Centre

Feedback

Thank You

Feedback, comments and suggestions welcome and encouraged

Names of relevant data managers for GEOTRACES data

Page 21: GEOTRACES International Data Assembly Centre
Page 22: GEOTRACES International Data Assembly Centre
Page 23: GEOTRACES International Data Assembly Centre
Page 24: GEOTRACES International Data Assembly Centre

Data management support for large oceanographic research projects

Facilitate the interchange of data within the project community

Work up and quality assure data

Assemble project data into a single high quality coherent data set, maintaining spatial and temporal relationships

Ensure documentation of data sets via developed metadata forms

Final banking and publication of the project data set

Encourage future utilisation of data

Page 25: GEOTRACES International Data Assembly Centre

Quality Control of GEOTRACES Data

Reformat data to BODC internal format

Check parameters, units, time zone

Visually inspect data using in-house software

Data checked for spikes, gaps, physically unreasonable values

Data compared with that previously received from a site

Adjacent stations compared to check unusual signals

Problems discussed and resolved with data suppliers

Accompanying documentation compiled

Data loaded to BODC data bank

Data made available (via the Web, or on request)
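
The automated part of the checks listed above (spikes, gaps, physically unreasonable values) can be sketched as follows. This is a simplified illustration under stated assumptions, not BODC's in-house software: the flag characters, the spike rule (a point that departs from both neighbours in the same direction by more than a threshold) and the limits in the example are all hypothetical.

```python
from typing import Optional, Sequence

# Hypothetical flag characters, chosen for this sketch only.
GOOD, SUSPECT, MISSING = "G", "S", "M"


def _is_spike(values: Sequence[Optional[float]], i: int, threshold: float) -> bool:
    """True if values[i] departs from both neighbours, in the same
    direction, by more than threshold. End points are never spikes."""
    if i == 0 or i == len(values) - 1:
        return False
    prev, cur, nxt = values[i - 1], values[i], values[i + 1]
    if prev is None or cur is None or nxt is None:
        return False
    d_prev, d_next = cur - prev, cur - nxt
    return abs(d_prev) > threshold and abs(d_next) > threshold and d_prev * d_next > 0


def screen(values: Sequence[Optional[float]], valid_min: float,
           valid_max: float, spike_threshold: float) -> list[str]:
    """Return one flag per value; the input values are left unchanged."""
    flags = []
    for i, v in enumerate(values):
        if v is None:                                   # gap in the record
            flags.append(MISSING)
        elif not (valid_min <= v <= valid_max):         # physically unreasonable
            flags.append(SUSPECT)
        elif _is_spike(values, i, spike_threshold):     # isolated spike
            flags.append(SUSPECT)
        else:
            flags.append(GOOD)
    return flags


if __name__ == "__main__":
    # Made-up temperature series with a spike, a gap and an out-of-range value.
    temperatures = [10.1, 10.2, 15.9, 10.3, None, 10.4, -9.0]
    print(screen(temperatures, valid_min=-2.5, valid_max=40.0, spike_threshold=2.0))
```

Checks like these only mark candidates; per the list above, visual inspection, comparison with adjacent stations and discussion with the data supplier remain the deciding steps.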

Page 26: GEOTRACES International Data Assembly Centre

Mission statement of GDAC

To operate world class data management for GEOTRACES

providing data management support for all cruises

maintaining and developing a TEI database

international exchange and management of oceanographic data

making high quality TEI data readily available to research scientists in academia, government and industry

Page 27: GEOTRACES International Data Assembly Centre

Quality control and data integration

The processing stage includes five main steps:

1) Initial quality control of metadata

Reported metadata are checked against other available sources (e.g. ship’s log, scientific log, cruise report, other data centres)

Disagreements are investigated and the originator may be contacted if any doubt remains

The originator may also be contacted if there appears to be a problem with the data or if information about methodology is insufficient to start banking the data. GEOTRACES metadata forms should eliminate this problem

2) Reformatting

Attribution of BODC parameter codes.

CTD and underway data are transferred to BODC internal binary format
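
Step 1, checking reported metadata against other available sources, can be sketched as a field-by-field comparison. This is an illustration only: the field names, source names and values below are hypothetical and are not taken from the GEOTRACES metadata forms.

```python
from typing import Any


def find_disagreements(
    reported: dict[str, Any],
    other_sources: dict[str, dict[str, Any]],
) -> list[tuple[str, str, Any, Any]]:
    """Compare reported metadata with each other source.

    Returns (field, source name, reported value, source value) for every
    field that a source also records but with a different value. Fields a
    source does not record are skipped, since they cannot be checked.
    """
    disagreements = []
    for source_name, source in other_sources.items():
        for field, reported_value in reported.items():
            if field in source and source[field] != reported_value:
                disagreements.append(
                    (field, source_name, reported_value, source[field])
                )
    return disagreements


if __name__ == "__main__":
    # Entirely made-up example values.
    reported = {"cruise": "EXAMPLE-01", "ship": "RV Example", "departure": "2009-01-10"}
    other_sources = {
        "ship's log": {"ship": "RV Example", "departure": "2009-01-10"},
        "cruise report": {"cruise": "EXAMPLE-01", "departure": "2009-01-11"},
    }
    for row in find_disagreements(reported, other_sources):
        print(row)
```

Each returned row is a candidate for investigation and, where doubt remains, for a query to the originator.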

Page 28: GEOTRACES International Data Assembly Centre

Full list of IPY cruises

Cruise | Departure date | GEOTRACES PI | Region | Ship

IPY

Sea of Okhotsk | 05/08/2007 | Jun Nishioka | Arctic | R/V Professor Khromov

ATOS-1 | | Antonia Tovar-Sanchez | Arctic |

SAZ-SENSE | 17/01/2007 | Andy Bowie and Edward Butler | Antarctic | Aurora Australis

ARK-XXII/2 (IPY-Cruise) | 28/07/2007 | Michiel Rutgers van der Loeff | Arctic | Polarstern

SIPEX | 05/09/2007 | Andy Bowie | Antarctic | Aurora Australis

ANT XXIV/3 (Caso-GEOTRACES; Zero & Drake) | 06/02/2008 | Hein de Baar | Antarctic | Polarstern

BONUS/GOOD-HOPE | 07/02/2008 | Marie Boye | Antarctic | Marion Dufresne

SR3-GEOTRACES | 28/03/2008 | Andy Bowie | Antarctic | Aurora Australis

International Siberian Shelf Study (ISSS 08) | 15/08/2008 | Per Andersson | |

Future IPY cruises

Beaufort Sea (IPY) | 2009 | Roger Francois and Kristin Orians | Arctic |

ATOS-2 (IPY) | 2009 | Antonia Tovar-Sanchez | Antarctic |

Page 29: GEOTRACES International Data Assembly Centre

GEOTRACES Cruises

Cruise | Departure date | GEOTRACES PI | Region | Ship

Process studies

DynaLife | Jan-March 2009 | Anne-Carlijn Alderkamp | Antarctic | Nathaniel Palmer

FeCycle II | 2008 (Sept) | Philip Boyd | | FRV Tangaroa

GEOTRACES cruises

Intercalibration 1st leg | 08/06/2008 | Greg Cutter | |

Intercalibration 2nd leg | 29/06/2008 | Greg Cutter | |

Future cruises

NL. Iceland to Bermuda | provisionally scheduled in 2010 (June) | | West Atlantic | RV Pelagia

NL. Bermuda to Barbados | 2010 (August) | | West Atlantic | RV Pelagia

UK 40°S cruise | 2010 (Oct) | | Atlantic |

US. Atlantic section | 2010 | | |

NL. Barbados to Recife | 2011 | | West Atlantic | RV Pelagia

NL. Recife to Buenos Aires | 2011 | | West Atlantic | RV Pelagia

Page 30: GEOTRACES International Data Assembly Centre

Quality control at GDAC/BODC

• Quality control
    Visualisation of data (“screening”)
    Anomalous data points marked
    Developing more complex automated systems

• Metadata assembly
    Oracle tables
    Link data to time, position, originator, restrictions, XML documentation…

• Audit

• Banking
    Final version to a secure location
    Visible via GDAC/BODC web software / NDG

Page 31: GEOTRACES International Data Assembly Centre

Screening of CTD and underway data

Visual screening on high speed graphics workstation using BODC’s graphics editor Serplo

Serplo enables the display of multiple parameters and rapid zooming

Page 32: GEOTRACES International Data Assembly Centre

Screening of CTD and underway data

So a suspect value is NOT edited but a 1-byte quality control flag becomes associated with it
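
The principle of flagging rather than editing can be shown with a small sketch: the measured values are stored unchanged, and a parallel array holds one byte of quality control information per value. This is only an illustration of the idea; the class name and the flag characters are hypothetical and are not BODC's actual flag scheme or file format.

```python
# Hypothetical one-byte flag codes for this sketch.
GOOD = ord("G")
SUSPECT = ord("S")


class FlaggedSeries:
    """Values that are never edited, each paired with a 1-byte QC flag."""

    def __init__(self, values):
        self.values = tuple(values)                        # immutable: no editing
        self.flags = bytearray([GOOD] * len(self.values))  # one byte per value

    def mark_suspect(self, index: int) -> None:
        """Record doubt about a value without changing the value itself."""
        self.flags[index] = SUSPECT

    def good_values(self) -> list:
        """Values whose flag is still GOOD, for users who want screened data."""
        return [v for v, f in zip(self.values, self.flags) if f == GOOD]


if __name__ == "__main__":
    # Made-up salinity values; the last one looks implausible.
    series = FlaggedSeries([34.9, 35.0, 99.9])
    series.mark_suspect(2)
    print(series.values)        # original values, untouched
    print(bytes(series.flags))  # one flag byte per value
    print(series.good_values())
```

Keeping the original value means a later user can disagree with the screener's judgement and still recover the measurement as delivered.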

Page 33: GEOTRACES International Data Assembly Centre

GEOTRACES DAC

BODC is taking on responsibility for assembling and delivering GEOTRACES data

Who are BODC?

A national facility for storing and distributing data concerning the marine environment (started in 1969).

• BODC deal with biological, chemical, physical and geophysical data, and our databases contain measurements of over 10,000 different variables.

This includes quality control, dissemination and long-term archival

• BODC are involved in international projects such as Argo, CLIVAR, WOCE, GLOSS and GEBCO, and now GEOTRACES