HDF-EOS 2/5 to netCDF Converter
Bob Bane, Richard Ullman, Jingli Yang
Data Usability Group
NASA/Goddard Space Flight Center
Introduction
• Status report
• Properties of netCDF and HDF-EOS
• Conversion strategy
Status Report
• Last year: hdfeos52netcdf
  – HDF-EOS 5 -> netCDF
  – COARDS compatible
• Current
  – Uses he25 interoperability library, so does both HDF-EOS 2 and 5
  – CF compatible
Data Formats and Conventions
• Generic data containers
  – HDF, netCDF
• Conventions for domain-specific metadata
  – HDF-EOS, COARDS/CF
• HDF -> HDF-EOS
• netCDF -> COARDS/CF
netCDF
• netCDF files contain:
  – Variables
    • multi-dimensional arrays of basic data types (character/integer/float)
  – Dimensions
    • named sizes for dimensions of variables
  – Attributes
    • named one-dimensional arrays
    • properties of variables
netCDF Conventions
• Metadata is stored in attributes
  – Conventions for names: “units”
• Coordinate vector
  – Variable with the same name as a dimension
  – Value is a vector of the same size as the dimension
  – Is a mapping between (0,1,2…) dimension indexing and physical quantities for the dimension
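The coordinate-vector convention can be illustrated with a minimal sketch. It models a netCDF-style file with plain Python dictionaries rather than a real netCDF library; the variable names, values, and helper functions below are hypothetical and chosen only for illustration.

```python
# Minimal in-memory model of a netCDF-style file (plain dicts, no netCDF library).
# All names and values here are illustrative assumptions.
dimensions = {"lat": 4}

variables = {
    # Coordinate variable: same name as its dimension, one value per index.
    "lat": {"dims": ("lat",), "attrs": {"units": "degrees_north"},
            "data": [-45.0, -15.0, 15.0, 45.0]},
    # Ordinary data variable defined over the "lat" dimension.
    "temperature": {"dims": ("lat",), "attrs": {"units": "K"},
                    "data": [271.5, 288.0, 299.2, 280.1]},
}


def is_coordinate_variable(name):
    """True if `name` is a 1-D variable named after its own dimension."""
    var = variables.get(name)
    return (var is not None and name in dimensions
            and var["dims"] == (name,)
            and len(var["data"]) == dimensions[name])


def physical_value(dim_name, index):
    """Map a 0-based index along `dim_name` to its physical coordinate."""
    if not is_coordinate_variable(dim_name):
        raise KeyError(f"no coordinate variable for dimension {dim_name!r}")
    return variables[dim_name]["data"][index]
```

Here `physical_value("lat", 2)` looks up index 2 of the `lat` coordinate variable, which is the mapping from dimension indexing to physical quantities that the slide describes.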
COARDS Conventions
• Cooperative Ocean/Atmospheric Research Data Service
  – Conventions for use of netCDF
    • Order of dimensions for variables
    • Names of attributes (“units”, “_FillValue”)
    • Coordinate variables
  – http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html
CF Conventions
• Climate and Forecast
  – Follow-on to COARDS
  – Tighter
    • Many attributes optional in COARDS are required in CF
  – More capable
    • Multi-dimensional geolocation support
  – http://www.cgd.ucar.edu/cms/eaton/cf-metadata/
HDF
• Hierarchical Data Format
• HDF files contain:
  – Datasets
    • multi-dimensional arrays of basic data types
  – Dimensions
    • Named sizes of dataset dimensions
  – Groups
    • Named groups of datasets (and groups)
  – Attributes
    • Named properties of datasets and groups, similar to netCDF
HDF-EOS
• Conventions and API for HDF
• HDF-EOS files contain:
  – Fields (datasets)
  – Points
    • Individually geolocated measurements
  – Swaths
    • Groups of data and geolocation fields, and mappings between them
  – Grids
    • Groups of data fields with rectilinear geolocation
HDF-EOS (cont)
• HDF-EOS 2 over HDF4
• HDF-EOS 5 over HDF5
  – HDF5 very different from HDF4
  – HDF-EOS 2/5 near identical API
  – Our he25 library allows uniform access to HDF-EOS 2/5, so converter works for both
    • Looks/works like HDF-EOS 5
    • On HDF-EOS 2 (HDF4) files, translates in/out
Observations
• HDF-EOS is “bigger” than netCDF
  – Additional structured metadata (ODL)
  – HDF-EOS API calls for geolocation
• netCDF file ~= HDF-EOS Swath/Grid
  – Both are groups of related datasets
Conversion Strategies
• One HDF-EOS file -> one netCDF file
  – Alternative is one Swath/Grid -> one file
• COARDS/CF: if the original HDF-EOS file followed the conventions, the converted netCDF file will too
  – Most HDF-EOS data producers are aware of COARDS/CF
• Skip HDF-EOS Point datasets
  – Reconsider this if real-world Point data emerges
Conversion Strategies (cont)
• Convert data to enable future processing
  – Geolocation data, attributes (units)
  – Other metadata less important
    • Could transfer ODL metadata as a string, but why?
  – Can always go back to the original file and use good HDF-EOS tools
Conversion in General
HDF-EOS:
Swath s1
  Dimensions(lat,lon,time)
  Datafield f1(lat,lon,time)
  Geofield f2(lat,lon,time)
Swath s2
  Dimensions(lat,lon,time)
  Datafield f3(lat,lon,time)
  Geofield f4(lat,lon,time)

netCDF:
Dimensions(lat,lon,time,s2_time)
Variable s1_f1(lat,lon,time)
Variable s1_GEO_f2(lat,lon,time)
Variable s2_f3(lat,lon,s2_time)
Variable s2_GEO_f4(lat,lon,s2_time)
• Flatten HDF-EOS hierarchy
• Encode names, types in variable names
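The flattening shown on this slide can be sketched in a few lines of Python. This is not the converter's actual code (the converter is a C application). The dictionary layout, the dimension sizes, and in particular the rule for when a dimension gets a swath prefix such as `s2_time` are assumptions made to reproduce the example.

```python
def flatten_swaths(swaths):
    """Flatten a list of swath dicts into netCDF-style dimensions and variables.

    Assumed rule (not confirmed by the slides): a dimension keeps its bare
    name unless an earlier swath already defined that name with a different
    size, in which case it is prefixed with the swath name.
    """
    dimensions = {}   # netCDF dimension name -> size
    variables = {}    # netCDF variable name -> tuple of dimension names

    for swath in swaths:
        rename = {}   # swath-local dimension name -> netCDF dimension name
        for dim, size in swath["dims"].items():
            if dim in dimensions and dimensions[dim] != size:
                target = f"{swath['name']}_{dim}"
            else:
                target = dim
            dimensions[target] = size
            rename[dim] = target

        # Data fields become <swath>_<field>; geolocation fields get a GEO tag.
        for field, dims in swath["datafields"].items():
            variables[f"{swath['name']}_{field}"] = tuple(rename[d] for d in dims)
        for field, dims in swath["geofields"].items():
            variables[f"{swath['name']}_GEO_{field}"] = tuple(rename[d] for d in dims)

    return dimensions, variables


# Hypothetical sizes: s2 has a different "time" length than s1.
swaths = [
    {"name": "s1", "dims": {"lat": 10, "lon": 20, "time": 5},
     "datafields": {"f1": ("lat", "lon", "time")},
     "geofields": {"f2": ("lat", "lon", "time")}},
    {"name": "s2", "dims": {"lat": 10, "lon": 20, "time": 8},
     "datafields": {"f3": ("lat", "lon", "time")},
     "geofields": {"f4": ("lat", "lon", "time")}},
]
dimensions, variables = flatten_swaths(swaths)
```

With these inputs the sketch yields the dimensions `lat`, `lon`, `time`, `s2_time` and the four variables listed on the netCDF side of the slide.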
Swaths
HDF-EOS:
Swath s2
  Dimensions(lat,glat,lon,glon,time)
  DimensionMap(lat, glat, 0, 1)
  DimensionMap(lon, glon, 0, 1)
  Datafield f3(lat,lon,time)
  Geofield f4(glat,glon,time)

netCDF:
Dimensions(lat,glat,lon,glon,time,s2_time)
Attributes:
  s2_DimensionMap: “lat/glat, lon/glon”
  s2_DMOffsets: (0,0)
  s2_DMIncrements: (1,1)
Variable s2_f3(lat,lon,s2_time)
  Attributes:
    coordinates: “s2_GEO_f4”
Variable s2_GEO_f4(glat,glon,s2_time)
• Swath name, geofield type encoded in variable names
• Record dimension map in global attributes
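As a rough illustration of recording dimension maps in global attributes, the sketch below encodes a list of maps into the three attributes shown on the slide and decodes them again. The function names and the tuple layout are hypothetical. Only the attribute names and value shapes come from the slide.

```python
def dimension_map_attributes(swath_name, dimension_maps):
    """Encode swath dimension maps as three global attributes.

    Each map is (first_dim, second_dim, offset, increment), in the order the
    slide writes DimensionMap(lat, glat, 0, 1).
    """
    pairs = ", ".join(f"{a}/{b}" for a, b, _, _ in dimension_maps)
    return {
        f"{swath_name}_DimensionMap": pairs,
        f"{swath_name}_DMOffsets": tuple(off for _, _, off, _ in dimension_maps),
        f"{swath_name}_DMIncrements": tuple(inc for _, _, _, inc in dimension_maps),
    }


def parse_dimension_maps(swath_name, attrs):
    """Inverse of dimension_map_attributes: rebuild the list of map tuples."""
    pairs = [p.strip().split("/")
             for p in attrs[f"{swath_name}_DimensionMap"].split(",")]
    offsets = attrs[f"{swath_name}_DMOffsets"]
    increments = attrs[f"{swath_name}_DMIncrements"]
    return [(a, b, off, inc)
            for (a, b), off, inc in zip(pairs, offsets, increments)]


maps = [("lat", "glat", 0, 1), ("lon", "glon", 0, 1)]
attrs = dimension_map_attributes("s2", maps)
```

The round trip only works if dimension names contain neither "/" nor ",", which this sketch assumes without checking.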
Grids
HDF-EOS:
Grid g1
  Dimensions(lat,lon,time)
  Corners(upleft, upright, lowleft, lowright)
  Datafield f1(lat,lon,time)

netCDF:
Dimensions(lat,lon,time)
Variable lat(lat) = (lowright, … upright)
Variable lon(lon) = (lowleft, … upleft)
Variable g1_f1(lat,lon,time)
• Grid geolocation becomes coordinate variables
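One way to picture how grid corners could turn into coordinate variables is to fill each coordinate vector with evenly spaced values between two end values. This is a simplified sketch, not the converter's algorithm. It assumes a regular geographic grid, treats the end values as inclusive, and ignores map projections and pixel-edge versus pixel-center registration. The function name and the sample numbers are hypothetical.

```python
def coordinate_vector(first, last, size):
    """Return `size` evenly spaced values from `first` to `last`, inclusive."""
    if size < 1:
        raise ValueError("size must be at least 1")
    if size == 1:
        return [float(first)]
    step = (last - first) / (size - 1)
    return [first + i * step for i in range(size)]


# Hypothetical corner values and dimension sizes for a small grid.
lat = coordinate_vector(-10.0, 10.0, 5)    # values for Variable lat(lat)
lon = coordinate_vector(100.0, 130.0, 4)   # values for Variable lon(lon)
```

Each resulting list has the same length as its dimension, so it satisfies the coordinate-variable convention described earlier in the deck.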
Converter
• C command-line application
  – hdfeos2netcdf HDF_file netCDF_file
• Should be portable to all HDF-EOS5/netCDF platforms
  – Naturally uses all libraries
Where is the Software?
• http://hdfeos.gsfc.nasa.gov
  – ‘Tools’ category
  – System ‘hdfeos2netcdf’
Big Picture
HDF-EOS:
File Attributes
  fa1: “fa value”
Swath s1
  Attributes:
    sa1: “sa value”
  Dimensions(lat,lon,time)
  Datafield f1(lat,lon,time)
  Geofield f2(lat,lon,time)
Swath s2
  Dimensions(lat,lon,time)
  Datafield f3(lat,lon,time)
  Geofield f4(lat,lon,time)

netCDF:
File Attributes:
  fa1: “fa value”
  s1_sa1: “sa value”
Dimensions(lat,lon,time,s2_time)
Variable s1_f1(lat,lon,time)
Variable s1_GEO_f2(lat,lon,time)
Variable s2_f3(lat,lon,s2_time)
Variable s2_GEO_f4(lat,lon,s2_time)
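The attribute handling in this picture, where swath attribute `sa1` becomes the file-level attribute `s1_sa1`, can be sketched as follows. The function name and the collision check are assumptions added for illustration. The slides do not say how the converter handles name clashes.

```python
def flatten_attributes(file_attrs, structures):
    """Merge file attributes and per-structure attributes into one global dict.

    `structures` maps a swath or grid name to its attribute dict. Each
    structure attribute is renamed to <structure>_<attribute>.
    """
    merged = dict(file_attrs)
    for name, attrs in structures.items():
        for key, value in attrs.items():
            flat = f"{name}_{key}"
            if flat in merged:
                # Assumed behavior: refuse to overwrite an existing attribute.
                raise ValueError(f"attribute name collision: {flat}")
            merged[flat] = value
    return merged


global_attrs = flatten_attributes(
    {"fa1": "fa value"},
    {"s1": {"sa1": "sa value"}, "s2": {}},
)
```

File-level attributes keep their names, so `fa1` passes through unchanged while `sa1` picks up the `s1_` prefix.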