The IP3 Data Archive

Michael Allchin, IP3 Data & Information Manager


Aim

To make IP3 data available to the broader scientific community and the general public in a permanent legacy archive, as required under the terms of the funding agreement with CFCAS

Principles

US National Academy of Sciences: 3 pillars of ‘data husbandry’

1. Integrity
2. Access
3. Stewardship

Kleppner et al. 2009. Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age. National Academy of Sciences. ISBN 978-0-309-13684-6

Tasks

Obtain

Understand

Data Organisation: Good

Data Organisation: Less Good

Tasks

Obtain

Understand

Validate

Process

- Build continuous series in Excel (mostly manual)
- First-pass programmatic (+ Mk1 Eyeball) validation (check date progression, interval consistency, watch for estimation formulae, etc.)
- Write to data model in RDBMS (Access MDB)
- Plot and check for consistency / problem areas: resolve or delete!
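The first-pass programmatic checks listed above (date progression and interval consistency) might look something like the following. This is only a minimal sketch, not the procedure actually used for IP3: the function name `first_pass_checks`, the rule that the most common positive step is the intended interval, and the sample timestamps are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def first_pass_checks(timestamps):
    """Return (index, problem) tuples for a series' timestamps.

    Checks that dates progress strictly forward and that the
    sampling interval stays consistent from record to record.
    """
    problems = []
    if len(timestamps) < 2:
        return problems

    # Differences between consecutive timestamps.
    steps = [b - a for a, b in zip(timestamps, timestamps[1:])]

    # Assumption: the most common positive step is the intended interval.
    positive = [s for s in steps if s > timedelta(0)]
    expected = Counter(positive).most_common(1)[0][0] if positive else None

    for i, step in enumerate(steps, start=1):
        if step <= timedelta(0):
            problems.append((i, "date does not progress"))
        elif step != expected:
            problems.append((i, f"interval {step} differs from expected {expected}"))
    return problems


# Made-up half-hourly series with one gap and one out-of-order record.
series = [
    datetime(2008, 5, 1, 0, 0),
    datetime(2008, 5, 1, 0, 30),
    datetime(2008, 5, 1, 1, 0),
    datetime(2008, 5, 1, 2, 0),   # gap: one record missing before this one
    datetime(2008, 5, 1, 1, 30),  # out of order
    datetime(2008, 5, 1, 2, 0),
]
issues = first_pass_checks(series)
```

Records flagged this way would still need the manual ('Mk1 Eyeball') look and the plotting step before anything is resolved or deleted.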

Tasks

Obtain

Understand

Validate

Archive

Vital Statistics

- 7 basins: 44 stations
- 424 individual datasets
- 29.7 million values
- LiDAR for principal research basins (~89 GB)

One Major Problem

How to ensure open-ended public accessibility to large volumes of complex and disparate data and associated information, with no ongoing budget or staff establishment?

Solution: Part 1

Go low-tech: write datasets to simply formatted text files; make available for download from website (hosted indefinitely by U.Sask.)

Demo 1

To include…

- Principal originator (‘Basin Lead’) and co-authors
- General contact details
- Official citation
- Other funding agencies / contributors / support
- Disclaimer
- ‘Licensing’ text
- Basin / Station details
- Instrumentation and contextual information (where available)
- Notes
- Flag key
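One way to carry these items in a simply formatted text file is a commented header block followed by plain comma-separated values. The sketch below assumes that layout; the `#` comment prefix, the column names, the flag meanings and every placeholder value are hypothetical, and it does not reproduce the actual IP3 file format.

```python
import csv
import io


def write_dataset(stream, header_fields, column_names, rows):
    """Write a commented metadata header followed by comma-separated data."""
    for key, value in header_fields.items():
        stream.write(f"# {key}: {value}\n")
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(column_names)
    writer.writerows(rows)


# Placeholder metadata and values, for illustration only.
header = {
    "Principal originator": "<Basin Lead name>",
    "Official citation": "<citation text>",
    "Basin / Station": "<basin> / <station>",
    "Flag key": "0 = measured, 1 = estimated",
}
rows = [
    ("2008-05-01 00:00", 1.5, 0),
    ("2008-05-01 00:30", 1.7, 1),
]

# An in-memory buffer stands in for a file opened with open(path, "w").
buffer = io.StringIO()
write_dataset(buffer, header, ["timestamp", "value", "flag"], rows)
text = buffer.getvalue()
```

Keeping the header inside the same file means the citation, disclaimer and flag key travel with the data when a file is downloaded on its own.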

Solution: Part 2

Implement metadatabase on server to support basic searches

Demo 2
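As a rough illustration of what a metadatabase supporting basic searches could involve, the sketch below keeps one row of metadata per dataset file and filters on it. SQLite is used here purely for convenience; the table layout, column names, file names and the `search` helper are all hypothetical and are not taken from the IP3 server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dataset (
        id        INTEGER PRIMARY KEY,
        basin     TEXT NOT NULL,
        station   TEXT NOT NULL,
        variable  TEXT NOT NULL,
        file_name TEXT NOT NULL
    )
    """
)
# Placeholder records, for illustration only.
conn.executemany(
    "INSERT INTO dataset (basin, station, variable, file_name) VALUES (?, ?, ?, ?)",
    [
        ("Basin A", "Station 1", "air temperature", "basin_a_station_1_airtemp.txt"),
        ("Basin A", "Station 2", "precipitation", "basin_a_station_2_precip.txt"),
        ("Basin B", "Station 1", "air temperature", "basin_b_station_1_airtemp.txt"),
    ],
)


def search(conn, basin=None, variable=None):
    """Return file names matching the given filters; None means 'any'."""
    query = "SELECT file_name FROM dataset WHERE 1 = 1"
    params = []
    if basin is not None:
        query += " AND basin = ?"
        params.append(basin)
    if variable is not None:
        query += " AND variable = ?"
        params.append(variable)
    query += " ORDER BY file_name"
    return [row[0] for row in conn.execute(query, params)]


matches = search(conn, variable="air temperature")
```

Because the metadatabase only points at the text files, the files themselves remain usable even if the search layer is unavailable.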

Other Routes: 1

Make full database available for download as Access MDB (with schema)

Other Routes: 2

Partner with WE-Hub

Cutting-edge environmental data repository: will host clone of IP3 data archive

Lessons

Organise early: adopting standardised procedures and protocols for gathering, validating, storing and transmitting data will streamline generation of high-quality datasets, provide better support for collaborative research, and enhance credibility / defensibility
