Inter-American Workshop on Environmental Data Access
Panel discussion on scientific and technical issues
Merilyn Gentry, LBA-ECO Data Coordinator
NASA / LBA-ECO Project Office &
University of Tennessee
04-March-2004
NASA Field Experiments
• Over the last 2 decades, NASA has been funding field experiments for ground-truthing satellite observations
– FIFE – Grasslands/prairie of midwestern U.S. - Kansas
– BOREAS – Boreal forests of Canada
– Safari 2000 – South Africa
– LBA-ECO – LBA project led by Brazil
• Data from these experiments are archived at ORNL DAAC
Data Policy Considerations
• NASA Data Policy
– Data should be made available to the public as quickly as reasonably possible, allowing for adequate quality assurance
• Brazilian Law
– Data collected in Brazil must remain in Brazil
• LBA Data Policy
– All data resulting from the LBA study will be archived in Brazil
– All LBA data will be made available to the public
[Figure: LBA data. Maturity runs from Raw, through Preliminary (Year 1) and Preliminary (Years 1..2..), to Final. Process steps shown: Register Metadata, Documentation, Update Metadata, FTP to INPE/CPTEC, each marked Complete. Other labels: Before Leaving Brazil; Available to Public.]
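The maturity progression in the figure can be sketched as a small state model. This is only an illustration: the class and field names are hypothetical, and the rule that a data set needs registered metadata and documentation before it becomes final is an assumption read from the figure labels, not a documented LBA DIS rule.

```python
from dataclasses import dataclass
from enum import IntEnum


class Maturity(IntEnum):
    """Maturity stages named in the figure."""
    RAW = 0
    PRELIMINARY = 1
    FINAL = 2


@dataclass
class DataSet:
    """Hypothetical record of one data set's progress."""
    name: str
    maturity: Maturity = Maturity.RAW
    metadata_registered: bool = False
    documented: bool = False

    def advance(self) -> None:
        """Move to the next stage.

        Assumption: FINAL requires registered metadata and documentation.
        """
        if self.maturity is Maturity.FINAL:
            raise ValueError("data set is already final")
        next_stage = Maturity(self.maturity + 1)
        if next_stage is Maturity.FINAL and not (
            self.metadata_registered and self.documented
        ):
            raise ValueError("final data need metadata and documentation")
        self.maturity = next_stage
```

The same object is used at every stage, mirroring the design goal that one system works as data mature from raw to final.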
Goal: Long-term archive & distribution
• The “scientific and technical requirements for long-term preservation and accessibility of environmental data” were a key factor in the system design
• ORNL DAAC is part of a network of NASA data archive and distribution centers forming the Earth Observing System Data and Information System (EOSDIS)
• As a member of this network, ORNL DAAC must conform to certain EOSDIS protocols (interoperability)
• ORNL DAAC has also evolved its own data archive standards and recommendations, including the “20-year rule”: accessible, retrievable, usable
Build a prototype -- KISS
• NASA funded ORNL to develop a system:
– Efficient, simple, highly automated
– Computer platform independent
– Use Web browser interfaces / software
– Same system works as data matures from raw, preliminary, to final data
– PI maintains full control of data visibility
– Painless user “learning curve”
– Yet with flexible, comprehensive searching
The System Today
• 2 browser-based interfaces
– LME – The LBA Metadata Editor
– Beija-flor – Metadata Search & Data Retrieval System
• Uses traditional search engine technology, e.g. Yahoo, Altavista
• However, searches only sources identified as LBA DIS metadata
Technical interoperability of LBA DIS Hardware & Software
• Metadata
– Uses XML (ASCII) code and standard metatag conventions
– Uses FGDC metadata standards + LBA-specific fields
– Imports from/exports to a DIF
• Data
– Data formats are not dictated by LBA DIS, though proprietary formats are discouraged
– Data files can reside at LBA DIS nodes, other data centers, PI web sites
LBA Metadata File – the key to technical interoperability
Metadata_File.xml
• ASCII text; contains standard metatags that are accessible to many search engines
• Also contains URLs to allow users to link to related data, documentation, and ancillary files, regardless of format
[Diagram: Search engine, Metadata_File.xml, and the linked files Data1.txt, Data2.xls, File.jpg, Doc.txt]
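A minimal sketch of the idea that one ASCII XML metadata file carries URLs to files of any format. The element names, the `type` attribute, and the example.org URLs are hypothetical. They are not the FGDC or LBA DIS schema, which is not reproduced in these slides.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical metadata file: plain ASCII XML with links to related files.
METADATA_XML = """<metadata>
  <title>Example data set</title>
  <investigator>Example PI</investigator>
  <link type="data">http://example.org/lba/Data1.txt</link>
  <link type="data">http://example.org/lba/Data2.xls</link>
  <link type="image">http://example.org/lba/File.jpg</link>
  <link type="documentation">http://example.org/lba/Doc.txt</link>
</metadata>
"""


def linked_files(xml_text: str) -> Dict[str, List[str]]:
    """Group the URLs in a metadata record by their declared link type."""
    root = ET.fromstring(xml_text)
    links: Dict[str, List[str]] = {}
    for link in root.findall("link"):
        kind = link.get("type", "unknown")
        links.setdefault(kind, []).append((link.text or "").strip())
    return links


if __name__ == "__main__":
    for kind, urls in linked_files(METADATA_XML).items():
        print(kind, urls)
```

Because the record only points to the files, a search tool needs to read the XML alone; the linked data can stay in whatever format the PI chose.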
Semantic interoperability of environmental data across disciplines and languages
• English is the language standard
• Every LBA-ECO team has a U.S. PI and a Brazilian Co-Investigator
• Minimize space science jargon
• Beija-flor offers multiple search approaches:
– Fielded searches – pick lists provided for values in the metadata
– Character string searches accommodate more open-ended queries (and possibly less-expert users)
– Map-based / spatial searches and temporal range searches
– Combination searches
– Browsing
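The search approaches listed above can be sketched as simple filters over metadata records. This is an illustration under stated assumptions, not how Beija-flor is implemented: the `Record` fields, the sample records, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Record:
    """Hypothetical metadata record with spatial and temporal extent."""
    title: str
    investigator: str
    abstract: str
    west: float
    east: float
    south: float
    north: float
    start: date
    end: date


def fielded(records: List[Record], field: str, value: str) -> List[Record]:
    """Fielded search: exact match on one metadata field (pick-list style)."""
    return [r for r in records if getattr(r, field) == value]


def text_search(records: List[Record], text: str) -> List[Record]:
    """Character string search: case-insensitive match in title or abstract."""
    needle = text.lower()
    return [r for r in records
            if needle in r.title.lower() or needle in r.abstract.lower()]


def spatial(records: List[Record], west: float, east: float,
            south: float, north: float) -> List[Record]:
    """Spatial search: records whose bounding box overlaps the query box."""
    return [r for r in records
            if r.west <= east and r.east >= west
            and r.south <= north and r.north >= south]


def temporal(records: List[Record], start: date, end: date) -> List[Record]:
    """Temporal search: records whose date range overlaps the query range."""
    return [r for r in records if r.start <= end and r.end >= start]


# Hypothetical sample records for illustration only.
RECORDS = [
    Record("Soil moisture at a forest site", "Investigator A",
           "Hourly soil moisture measurements.",
           -55.0, -54.5, -3.5, -2.5, date(1999, 1, 1), date(2001, 12, 31)),
    Record("Pasture CO2 flux", "Investigator B",
           "Eddy covariance flux of CO2 over pasture.",
           -62.5, -62.0, -11.0, -10.5, date(2000, 6, 1), date(2002, 5, 31)),
]
```

A combination search is then a chain of filters, for example `temporal(text_search(RECORDS, "soil"), date(2000, 1, 1), date(2000, 12, 31))`.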
Facilitating interdisciplinary and international access to environmental data resources
• Both countries have committed long-term support for the archive and distribution of the LBA data collection:
– LBA DIS in Brazil
– ORNL DAAC in the U.S.
• Global Change Master Directory will include a “DIF” for every LBA data set archived in the U.S.
• Links to non-LBA Amazonian-related data are available via Beija-flor
• The LBA metadata will be available for indexing by non-LBA search engines and metadata databases
Human factors affecting data availability
• Scientists want to hold on to their data as long as they can
• Data collected are often part of students’ theses
• Few incentives for scientists to publish their data
• Documentation requirements are often prohibitive
[Figure: LBA-ECO data flow]
1. PI produces data (Raw; Preliminary Year 1; Preliminary Years 2+; Final, QA’d with documentation)
2. PI registers/updates metadata with the LBA Metadata Editor (LME)
3. PI transfers data to CPTEC (data archive at CPTEC)
4. The LBA community and the public can access via Beija-Flor (search, receive)
– Metadata are compiled in the Beija-Flor Search Engine for LBA Data
– Other labels in the figure: Data Products, Data Set Descriptions, Papers, Posters; Transfer; Brazilian Counterpart
What is Needed for Archiving LBA-ECO Data Sets at the ORNL DAAC?
Reference: Best Practices for Archiving Data
1. Metadata
– LBA project parameters (i.e., Beija-flor metadata) must comply with the latest GCMD / EOSDIS standards
2. Data files
– Suggested format: tabular data in ASCII, gridded data in ASCII Grid, image data in binary or non-proprietary format
– Self-describing to identify key entries such as parameter names and units of measure
www.daac.ornl.gov/DAAC/PI/info.html
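As an illustration of a self-describing tabular ASCII file, the sketch below reads a file whose first row holds parameter names and whose second row holds units. This two-row header layout, the column names, and the sample values are assumptions made for the example, not a format prescribed by the ORNL DAAC.

```python
import csv
import io
from typing import Dict, List, Tuple

# Hypothetical self-describing file: row 1 = parameter names, row 2 = units.
SAMPLE = """\
site,date,soil_temp,soil_moisture
text,YYYY-MM-DD,degC,percent
Site_A,2001-07-01,24.6,31.2
Site_A,2001-07-02,25.1,30.8
"""


def read_self_describing(text: str) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Return (units by parameter name, data rows keyed by parameter name)."""
    reader = csv.reader(io.StringIO(text))
    names = next(reader)
    units = next(reader)
    if len(names) != len(units):
        raise ValueError("each parameter name needs a unit entry")
    rows = [dict(zip(names, row)) for row in reader if row]
    return dict(zip(names, units)), rows


if __name__ == "__main__":
    units, rows = read_self_describing(SAMPLE)
    print(units)
    print(rows[0])
```

Keeping names and units inside the file means a reader can interpret the values without a separate description, which is the point of the self-describing recommendation.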
What is Needed for Archiving LBA-ECO Data Sets at the ORNL DAAC? (continued)
3. Data Set Documentation
Documentation should include what a user would need to know about the data 20+ years from now; i.e., the 20-year rule
– Data collection goals and description
– Description of sample collection sites
– Description of measurement methods (e.g., calibration, calculations, software)
– Known errors and problems
– Description of data file organization
– Description of data reporting conventions (e.g., parameter names, units, codes, flags, example data records)
– Key information from Beija-flor (e.g., investigator(s), abstract, spatial and temporal attributes, data set citation, references, etc.)