GEOTRACES Data Assembly Centre (GDAC) Activities
GEOTRACES project team: Donna Cockwell, Gwen Moncoiffe, Helen Snaith, Richenda Houseago-Stokes, Graham Allen
Mohamed Adjou, Data manager
British Oceanographic Data Centre, NOC, Liverpool, UK
JULY 2018: SSC – TAIWAN
Outline
IDP download from GDAC server
Catch-up on tasks inherited from the prioritized work of IDP2017 delivery
Improved Reporting of GEOTRACES Data Ingestion QC
Improving data workflow
Online management of GEOTRACES parameter names via the NERC vocabulary server
IDP download from GDAC server
IDP2014 and IDP2017 total downloads (to 10 July)
[Bar chart: All Downloads and Unique Users per IDP version; values are the chart's data labels]
Version       All Downloads   Unique Users
IDP2014       440             211
IDP2014_V2    392             195
IDP2014_V3    622             282
IDP2017_V1    551             205
IDP2017_V2    172             92
IDP download from GDAC server
IDP2014 and IDP2017 total downloads between 10 April and 10 July
[Bar chart: All Downloads and Unique Users per IDP version; values are the chart's data labels]
Version       All Downloads   Unique Users
IDP2014       0               0
IDP2014_V2    0               0
IDP2014_V3    3               2
IDP2017_V1    32              13
IDP2017_V2    88              47
IDP download from GDAC server
IDP2014 and IDP2017 downloads by format (to 10 July)
[Bar chart: Total IDP downloads by format; values are the chart's data labels]
Version       ODV   ASCII   NETCDF   Excel
IDP2014       233   207     0        0
IDP2014_V2    144   42      47       159
IDP2014_V3    249   76      96       201
IDP2017_V1    231   90      106      124
IDP2017_V2    75    20      23       54
IDP download from GDAC server
IDP2014 and IDP2017 downloads by data type (to 10 July)
[Bar chart: Total IDP downloads by data type; values are the chart's data labels]
Version       Sample   CTD   Aerosol/Rain
IDP2014       312      128   0
IDP2014_V2    282      110   0
IDP2014_V3    449      173   0
IDP2017_V1    290      178   83
IDP2017_V2    136      0     36
IDP2014 and IDP2017 downloads by country (to 10 July)
[Bar chart: IDP downloads by country, one series per version (IDP2014, IDP2014_V2, IDP2014_V3, IDP2017_V1, IDP2017_V2); per-country values are not present in the extracted text]
IDP2014 and IDP2017 downloads by country (to 10 July, without USA & UK)
[Bar chart: IDP downloads by country, same series, USA and UK excluded; per-country values are not present in the extracted text]
Catch-up on tasks inherited from the prioritized work of IDP2017 delivery
Ensuring GDAC correspondence is kept up to date ([email protected]).
Updating cruise inventory: https://www.bodc.ac.uk/geotraces/data/inventories/
Updating section maps: https://www.bodc.ac.uk/geotraces/cruises/section_maps/
Section cruises per ocean basin
Process studies and compliant data
Miscellaneous information updates:
1. A dedicated IDP2017 web page, indicating that IDP2014 and previous versions of the IDP are still available, with a link to the new web page;
2. Updating the IDP2017 bibliographic reference;
3. Updating the download agreement.
Improved Reporting of GEOTRACES Data Ingestion QC
Increasing data QC standards was presented as a priority task during the last DMC meeting.
A working document on a QC strategy was submitted to the DMC (end of May)
GDAC proposes to deliver a standardised data QC report for each dataset, together with a final document summarising the QC analyses.
Improving data workflow
How do we store sample data?
[Diagram: data flow from ORIGINATOR to the BODC database (DB)]
Parameter name & Unit; Value
From information submitted in the “Data metadata” file
Sample (x, y, z, t): geographical coordinates, depth and time
From information submitted in the “Data” file
Actions to streamline processing and reduce the chance of human error
For more fluid data processing at GDAC:
1) Provide GEOTRACES parameter names in data file headers: DMC proposal via the data portal.
2) Systematically supply data with event and bottle identifiers: this will be done using cruise event and sample information that could be supplied by the chief scientist after the cruise.
Here is an example of the headers of the cruise log master file:
Event Number | Gear (CTD, FISH, Net etc.) | Time begin | Time end | Long | Lat | Bottle number | Depth | Comment
Once the file has been submitted by the chief scientist, we can make it available to data submitters through the BODC SFTP location.
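As an illustration of why event and bottle identifiers help, the sketch below shows how a submitted data row could be matched against the cruise log master file to pick up position, depth and time. It is a minimal sketch, not GDAC's ingestion code: the CSV layout, the shortened column names ("Gear", "Time begin", "Time end"), the log contents, the sample value and the function names are all assumptions made for this example.

```python
import csv
import io

# Column names adapted from the cruise log master file headers (assumed spelling).
LOG_COLUMNS = ["Event Number", "Gear", "Time begin", "Time end",
               "Long", "Lat", "Bottle number", "Depth", "Comment"]

# Hypothetical cruise log content, for illustration only.
CRUISE_LOG_CSV = """Event Number,Gear,Time begin,Time end,Long,Lat,Bottle number,Depth,Comment
12,CTD,2018-01-05T08:00,2018-01-05T09:30,-25.0,40.0,1,1000,
12,CTD,2018-01-05T08:00,2018-01-05T09:30,-25.0,40.0,2,500,
13,FISH,2018-01-05T11:00,2018-01-05T11:20,-25.1,40.1,,2,surface tow
"""


def load_cruise_log(text):
    """Parse the log and index its rows by (event number, bottle number)."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in LOG_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"cruise log is missing columns: {missing}")
    return {(row["Event Number"], row["Bottle number"]): row for row in reader}


def attach_log_info(sample, log_index):
    """Return a copy of a data row with position, depth and time from the log."""
    key = (sample["Event Number"], sample["Bottle number"])
    if key not in log_index:
        raise KeyError(f"no cruise log entry for event/bottle {key}")
    entry = log_index[key]
    merged = dict(sample)
    for column in ("Long", "Lat", "Depth", "Time begin"):
        merged[column] = entry[column]
    return merged


log_index = load_cruise_log(CRUISE_LOG_CSV)
# Hypothetical submitted row: identifiers plus one parameter value.
sample = {"Event Number": "12", "Bottle number": "2", "Fe_D_CONC_BOTTLE": "0.45"}
merged = attach_log_info(sample, log_index)
print(merged)
```

With both identifiers present, the match is an exact key lookup; without them, the same match would have to be inferred from position, depth and time, which is where manual errors tend to creep in.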
Online management of GEOTRACES parameter names via the NERC vocabulary server
Why?
Would replace stand-alone spreadsheets
Would make the GEOTRACES parameter names searchable and referenceable
Would enable GDAC to improve data workflow
Current status
Workload for this task submitted by GDAC to GIPO
When approved, we will need up-to-date parameter lists
IDP2017 (V2) Access
www.bodc.ac.uk/geotraces/data/idp2017
To download files, users need to:
1) Log in to the BODC system
2) Agree to the download agreement
Metadata form update
We recently added two entries to the GDAC metadata submission template:
1) A list of the contributors, with their respective ranks and affiliations.
2) The error calculation method (1 or 2 times the standard deviation, etc.).
Flags
SeaDataNet Quality Control Flags
The following single character qualifying flags may be associated with one or more individual parameters with a data cycle:
Flag   Description
0      no quality control
1      good value
2      probably good value
3      probably bad value
4      bad value
5      changed value
6      value below detection
7      value in excess
8      interpolated value
9      missing value
A      value phenomenon uncertain
Q      value below limit of quantification
BODC Quality Control Flags
The following single character qualifying flags may be associated with one or more individual parameters with a data cycle:
Flag   Description
Blank  Unqualified
<      Below detection limit
>      In excess of quoted value
A      Taxonomic flag for affinis (aff.)
B      Beginning of CTD Down/Up Cast
C      Taxonomic flag for confer (cf.)
D      Thermometric depth
E      End of CTD Down/Up Cast
G      Non-taxonomic biological characteristic uncertainty
H      Extrapolated value
I      Taxonomic flag for single species (sp.)
K      Improbable value - unknown quality control source
L      Improbable value - originator's quality control
M      Improbable value - BODC quality control
N      Null value
O      Improbable value - user quality control
P      Trace/calm
Q      Indeterminate
R      Replacement value
S      Estimated value
T      Interpolated value
U      Uncalibrated
W      Control value
X      Excessive difference
All the SeaDataNet data quality flags have a corresponding BODC flag, except flag 2 (“probably good value”).
Source: https://odv.awi.de/fileadmin/user_upload/odv/misc/ODV4_QualityFlagSets.pdf
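To illustrate how a flag table of this kind can feed a per-dataset QC summary, here is a minimal sketch that counts SeaDataNet flags by description. The flag descriptions are taken from the list above; the function name and the sample flag string are hypothetical, and no SeaDataNet-to-BODC mapping is encoded because the individual pairings are not given here.

```python
from collections import Counter

# SeaDataNet quality flags, as listed on the slide.
SEADATANET_FLAGS = {
    "0": "no quality control",
    "1": "good value",
    "2": "probably good value",
    "3": "probably bad value",
    "4": "bad value",
    "5": "changed value",
    "6": "value below detection",
    "7": "value in excess",
    "8": "interpolated value",
    "9": "missing value",
    "A": "value phenomenon uncertain",
    "Q": "value below limit of quantification",
}


def flag_summary(flags):
    """Count how often each flag description occurs in a sequence of flags."""
    unknown = sorted({f for f in flags if f not in SEADATANET_FLAGS})
    if unknown:
        raise ValueError(f"not SeaDataNet flags: {unknown}")
    return Counter(SEADATANET_FLAGS[f] for f in flags)


# Hypothetical flags for six values of one parameter.
summary = flag_summary("1121Q9")
for description, count in summary.most_common():
    print(f"{description}: {count}")
```

Rejecting unknown characters up front keeps flags from another scheme (for example BODC's "M" or "N") from being silently counted under the wrong description.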
Work to budget/income for past and future IDPs
Scope of IDP increases with each version (new parameters, new media)
Effort for GDAC increases
Effort for whole IDP > GDAC income
NERC funding pays for different activities
We will work with funded budget
BODC has still worked more than funded, so reduced FTE this year
GEOTRACES Data portal
GDAC compiled requirements
GDAC reviewed ‘buy’ options: none specific enough for IDP
DMC in contact with Guillaume Brissebrat (Toulouse) to build and manage the portal
IDP portal is not specific to GDAC
IDP2017_V1 and IDP2017_V2
V2 was also an opportunity to correct V1 errors of various origins:
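[Pie chart: Origins of updates/corrections in IDP2017 V2. Legend: GDAC; Data originator; Miscommunication; Aerosol group/naming group update; ODV formatting outside GDAC; Not present in approved data list in V1; Cell quotas; New dataset. Slice labels as extracted: 37%, 31%, 11%, 6%, 3%, 3%, 3%, 6%; the pairing of slices with legend entries is not recoverable from the extracted text.]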
IDP2017_V2 was always planned to include the cell quota data (Twining) that were not included in V1.