Inter-American Workshop on Environmental Data Access Panel discussion on scientific and technical issues Merilyn Gentry, LBA-ECO Data Coordinator NASA / LBA-ECO Project Office & University of Tennessee 04-March-2004


Inter-American Workshop on Environmental Data Access

Panel discussion on scientific and technical issues

Merilyn Gentry, LBA-ECO Data Coordinator

NASA / LBA-ECO Project Office &

University of Tennessee

04-March-2004


NASA Field Experiments

• Over the last two decades, NASA has funded field experiments for ground-truthing satellite observations:

– FIFE – Grasslands/prairie of the midwestern U.S. (Kansas)

– BOREAS – Boreal forests of Canada

– Safari 2000 – South Africa

– LBA-ECO – LBA project led by Brazil

• Data from these experiments are archived at ORNL DAAC


Data Policy Considerations

• NASA Data Policy

– Data should be made available to the public as quickly as reasonably possible, allowing for adequate quality assurance

• Brazilian Law

– Data collected in Brazil must remain in Brazil

• LBA Data Policy

– All data resulting from the LBA study will be archived in Brazil

– All LBA data will be made available to the public


LBA Data

[Figure: LBA data handling by maturity stage and process step]

• Maturity: Raw, Preliminary (Year 1), Preliminary (Years 1..2.. >>), Final

• Process: Register metadata, Documentation, Update metadata, FTP to INPE/CPTEC

• Status shown in the figure: Complete (four times)

• Available to public before leaving Brazil


Goal: Long-term archive & distribution

• The “scientific and technical requirements for long-term preservation and accessibility of environmental data” were a key factor in the system design

• ORNL DAAC is part of a network of NASA data archive and distribution centers forming the Earth Observing System Data and Information System (EOSDIS)

• As a member of this network, ORNL DAAC must conform to certain EOSDIS protocols (interoperability)

• ORNL DAAC has also evolved its own data archive standards and recommendations, including the “20-year rule”: data should remain accessible, retrievable, and usable


Build a prototype -- KISS

• NASA funded ORNL to develop a system:

– Efficient, simple, highly automated

– Computer platform independent

– Use Web browser interfaces / software

– Same system works as data matures from raw, preliminary, to final data

– PI maintains full control of data visibility

– Painless user “learning curve”

– Yet with flexible, comprehensive searching


The System Today

• 2 browser-based interfaces

– LME – The LBA Metadata Editor

– Beija-flor – Metadata Search & Data Retrieval System

• Uses traditional search engine technology, e.g. Yahoo, Altavista

• However, it searches only sources identified as LBA DIS metadata

• Metadata

• Data


Technical interoperability of LBA DIS Hardware & Software

• Metadata

– Uses XML (ASCII) code and standard metatag conventions

– Uses FGDC metadata standards + LBA-specific fields

– Imports from/exports to a DIF (Directory Interchange Format)

• Data

– Data formats are not dictated by LBA DIS, though proprietary formats are discouraged

– Data files can reside at LBA DIS nodes, other data centers, PI web sites
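A minimal sketch of what such a metadata record could look like when built in code. The element names below (including the `lba` and `links` blocks), the helper `build_metadata_record`, and all sample values are assumptions for illustration only; they are not the actual LBA DIS schema.

```python
import xml.etree.ElementTree as ET


def build_metadata_record(title, abstract, bbox, data_urls, lba_fields):
    """Return an XML string for one data set description.

    Element names are illustrative: the nesting loosely follows FGDC-style
    short names, and the <lba> block stands in for LBA-specific fields.
    """
    root = ET.Element("metadata")
    idinfo = ET.SubElement(root, "idinfo")
    ET.SubElement(idinfo, "title").text = title
    ET.SubElement(idinfo, "abstract").text = abstract

    # Bounding coordinates in decimal degrees (west, east, north, south).
    bounding = ET.SubElement(idinfo, "bounding")
    for name, value in zip(("westbc", "eastbc", "northbc", "southbc"), bbox):
        ET.SubElement(bounding, name).text = str(value)

    # Project-specific extension fields (hypothetical names).
    lba = ET.SubElement(root, "lba")
    for name, value in lba_fields.items():
        ET.SubElement(lba, name).text = value

    # Data files may live at any node or web site, so store them as URLs.
    links = ET.SubElement(root, "links")
    for url in data_urls:
        ET.SubElement(links, "url").text = url

    return ET.tostring(root, encoding="unicode")


record_xml = build_metadata_record(
    title="Example soil moisture data set",
    abstract="Hypothetical record used only to illustrate the layout.",
    bbox=(-55.0, -54.5, -2.5, -3.5),
    data_urls=["http://example.org/data/Data1.txt"],
    lba_fields={"team_id": "XX-00", "maturity": "preliminary"},
)
```

Because the record is plain ASCII XML, any platform can read it, and the data files themselves can stay wherever the URLs point.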


LBA Metadata File – the key to technical interoperability

Metadata_File.xml

• ASCII text and contains standard metatags that are accessible to many search engines

• Also contains URLs to allow users to link to related data, documentation, and ancillary files, regardless of format

[Figure: a search engine reads Metadata_File.xml, which links to Data1.txt, Data2.xls, File.jpg, and Doc.txt]
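The linking role of the metadata file can be sketched as follows: a small reader pulls every URL out of a record and reports the linked files, whatever their format. The record layout, the `url` element name, the `example.org` addresses, and the helper `linked_files` are hypothetical; only the file names come from the slide.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical record; the real LBA metadata layout may differ.
SAMPLE_METADATA = """<metadata>
  <title>Example data set</title>
  <links>
    <url>http://example.org/lba/Data1.txt</url>
    <url>http://example.org/lba/Data2.xls</url>
    <url>http://example.org/lba/File.jpg</url>
    <url>http://example.org/lba/Doc.txt</url>
  </links>
</metadata>"""


def linked_files(metadata_xml):
    """Return (file name, extension) pairs for every URL in the record."""
    root = ET.fromstring(metadata_xml)
    files = []
    for element in root.iter("url"):
        # Keep only the path part of the URL, then split off the file name.
        path = PurePosixPath(urlparse(element.text.strip()).path)
        files.append((path.name, path.suffix.lower()))
    return files


files = linked_files(SAMPLE_METADATA)
```

The reader never opens the data files; it only needs the text metadata, which is why mixed formats such as `.xls` and `.jpg` do not get in the way of discovery.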


Semantic interoperability of environmental data across disciplines and languages

• English is the language standard

• Every LBA-ECO team has a U.S. PI and a Brazilian Co-Investigator

• Minimize space science jargon

• Beija-flor offers multiple search approaches:

– Fielded searches – pick lists provided for values in the metadata

– Character string searches accommodate more open-ended queries (and possibly less-expert users)

– Map-based / spatial searches and temporal range searches

– Combination searches

– Browsing
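The search approaches listed above can be combined in one filter. The following is a sketch under stated assumptions, not Beija-flor's implementation: the `Record` fields, the `search` function, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    title: str
    abstract: str
    parameter: str   # value a pick list would offer (fielded search)
    bbox: tuple      # (west, south, east, north) in decimal degrees
    start: date
    end: date


def boxes_overlap(a, b):
    """True when two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records, parameter=None, text=None, bbox=None, period=None):
    """Apply only the criteria that were supplied; all of them must match."""
    hits = []
    for r in records:
        if parameter is not None and r.parameter != parameter:
            continue  # fielded search: exact match on a pick-list value
        if text is not None and text.lower() not in (r.title + " " + r.abstract).lower():
            continue  # character string search: case-insensitive substring
        if bbox is not None and not boxes_overlap(r.bbox, bbox):
            continue  # spatial search: bounding boxes must intersect
        if period is not None and not (r.start <= period[1] and period[0] <= r.end):
            continue  # temporal search: date ranges must overlap
        hits.append(r)
    return hits


records = [
    Record("Tower flux data", "Carbon dioxide flux at a forest site", "carbon flux",
           (-55.0, -3.1, -54.9, -3.0), date(2000, 1, 1), date(2001, 12, 31)),
    Record("Soil survey", "Soil carbon in pasture plots", "soil carbon",
           (-63.0, -10.8, -62.8, -10.7), date(1999, 6, 1), date(1999, 9, 30)),
]

# Combination search: free text + map box + time range.
hits = search(records, text="carbon", bbox=(-56.0, -4.0, -54.0, -2.0),
              period=(date(2001, 1, 1), date(2001, 6, 30)))
```

Calling `search(records)` with no criteria returns every record, which corresponds to browsing.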


Facilitating interdisciplinary and international access to environmental data resources

• Both countries have committed long-term support for the archive and distribution of the LBA data collection:

– LBA DIS in Brazil

– ORNL DAAC in the U.S.

• Global Change Master Directory will include a “DIF” for every LBA data set archived in the U.S.

• Links to non-LBA Amazonian-related data are available via Beija-flor

• The LBA metadata will be available for indexing by non-LBA search engines and metadata databases


Human factors affecting data availability

• Scientists want to hold on to their data as long as they can

• Data collected are often part of students’ theses

• Few incentives for scientists to publish their data

• Documentation requirements are often prohibitive


[Figure: LBA data flow from the PI to the data archive at CPTEC and on to users]

1. PI produces data (data products, data set descriptions, papers, posters)

2. PI registers/updates metadata using the LBA Metadata Editor (LME)

3. PI transfers data to CPTEC (data archive at CPTEC)

4. The LBA community and the public can access via Beija-flor (search, receive)

• Metadata are compiled in the Beija-flor search engine for LBA data

• Transfers occur at each maturity stage: Raw, Preliminary (Year 1), Preliminary (Years 2+ >), Final (QA’d, with documentation)

• Other label in the figure: Brazilian Counterpart


What is Needed for Archiving LBA-ECO Data Sets at the ORNL DAAC?

Reference: Best Practices for Archiving Data

1. Metadata

– LBA project parameters (i.e., Beija-flor metadata) must comply with the latest GCMD / EOSDIS standards

2. Data files

– Suggested format: tabular data in ASCII, gridded data in ASCII Grid, image data in binary or non-proprietary format

– Self-describing to identify key entries such as parameter names and units of measure

www.daac.ornl.gov/DAAC/PI/info.html
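One way to make a tabular ASCII file self-describing is to carry the parameter names and units in the file itself. The sketch below assumes a simple layout (one comment line, one row of names, one row of units, then data); that layout, the function names, and the sample values are illustrative assumptions, not an ORNL DAAC requirement.

```python
import csv
import io


def write_self_describing(rows, columns, description):
    """Return CSV text whose header carries a description, names, and units.

    columns is a list of (parameter name, unit) pairs.
    """
    buffer = io.StringIO()
    buffer.write(f"# {description}\n")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([name for name, _ in columns])   # parameter names
    writer.writerow([unit for _, unit in columns])   # units of measure
    writer.writerows(rows)
    return buffer.getvalue()


def read_self_describing(text):
    """Return (description, {name: unit}, data rows) from such a file."""
    lines = text.splitlines()
    description = lines[0].lstrip("# ").strip()
    reader = csv.reader(lines[1:])
    names = next(reader)
    units = next(reader)
    return description, dict(zip(names, units)), list(reader)


text = write_self_describing(
    rows=[["2002-07-01", 24.6, 0.31], ["2002-07-02", 25.1, 0.29]],
    columns=[("date", "YYYY-MM-DD"), ("air_temp", "degC"), ("soil_moisture", "m3/m3")],
    description="Hypothetical daily means, example only",
)
description, units, data = read_self_describing(text)
```

A reader who finds only this file, without the accompanying documentation, can still tell what each column holds and in which unit.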


What is Needed for Archiving LBA-ECO Data Sets at the ORNL DAAC? (continued)

3. Data Set Documentation

Documentation should include what a user would need to know about the data 20+ years from now; i.e., the 20-year rule

– Data collection goals and description

– Description of sample collection sites

– Description of measurement methods (e.g., calibration, calculations, software)

– Known errors and problems

– Description of data file organization

– Description of data reporting conventions (e.g., parameter names, units, codes, flags, example data records)

– Key information from Beija-flor (e.g., investigator(s), abstract, spatial and temporal attributes, data set citation, references, etc.)