melvin-biglin
Data Standards Workflow (OpenEarth)
Extract: Raw data (OpenEarthRawData)
• Store raw data in subversion to keep track of history
Transform: Scripts (OpenEarthTools)
• Add meta information
• Script to convert raw data into netcdf
Load: Database (OPeNDAP)
• Stored files (netcdf) accessible through the web
Provide: Charts & Maps
• Tools and websites
Add metadata
• Use the INSPIRE metadata form to store information about the dataset.
• http://www.inspire-geoportal.eu/inspireEditor.htm
• Click "Launch editor"
Store in course/Pcnumber/inspire_description.xml
Save metadata file
1. Save metadata file (local)
2. Add to subversion (local)
3. Commit => metadata into subversion (remote)
Store in netcdf
• What's netcdf?
• Write a script to transform data into netcdf
• Using CF convention
What is netcdf
• Data format defined by Unidata
• Data store used for coverage data and multidimensional data
• CF metadata convention
[Figure: axes X, Y, Z and T]
• An array-based data structure for storing multidimensional data
• N-dimensional coordinate systems
  • X coordinate (e.g. longitude)
  • Y coordinate (e.g. latitude)
  • Z coordinate (e.g. altitude)
  • Time dimension
  • … other dimensions
• Variables: support for multiple variables
  • Temperature, humidity, pressure, salinity, etc.
• Geometry: implicit or explicit
  • Regular grid (implicit)
  • Irregular grid
  • Points
Storing Multidimensional Data
X Y Z Q
1 1 1 0.5
1 1 2 0.3
1 2 1 0.6
1 2 2 0.1
2 1 1 0.4
2 1 2 0.2
2 2 1 0.9
2 2 2 0.3
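Stored as a flat table (8 rows of X, Y, Z, Q): 32 numbers

Stored as arrays (coordinate vectors plus one 2 x 2 x 2 block of Q values): 14 numbers
X = 1 2
Y = 1 2
Z = 1 2
Q at Z = 1 (rows Y = 1, 2; columns X = 1, 2):
0.5 0.4
0.6 0.9
Q at Z = 2 (rows Y = 1, 2; columns X = 1, 2):
0.3 0.2
0.1 0.3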
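The count on this slide can be checked with a few lines of Matlab. The sketch below is only an illustration: it takes the eight table rows from the slide, pulls out the coordinate vectors, and fills a 3-D array so that the position in the array encodes X, Y and Z. The variable names (tbl, x, y, z, Q, n_flat, n_array) are chosen for this example and do not come from the course material.

```matlab
% Flat table from the slide: one row per point, columns X, Y, Z, Q.
tbl = [1 1 1 0.5
       1 1 2 0.3
       1 2 1 0.6
       1 2 2 0.1
       2 1 1 0.4
       2 1 2 0.2
       2 2 1 0.9
       2 2 2 0.3];

% Coordinate vectors: every axis value is stored only once.
x = unique(tbl(:, 1));
y = unique(tbl(:, 2));
z = unique(tbl(:, 3));

% Fill a 3-D array Q(ix, iy, iz); the array position replaces the X, Y, Z columns.
Q = NaN(numel(x), numel(y), numel(z));
for k = 1:size(tbl, 1)
    Q(tbl(k, 1) == x, tbl(k, 2) == y, tbl(k, 3) == z) = tbl(k, 4);
end

% Compare how many numbers each representation needs.
n_flat  = numel(tbl);
n_array = numel(Q) + numel(x) + numel(y) + numel(z);
fprintf('flat table: %d numbers, arrays: %d numbers\n', n_flat, n_array);
```

For a 2 x 2 x 2 grid the saving is small, but the coordinate overhead grows with the sum of the axis lengths while the flat table grows with four times their product.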
Data Model
Data model for netcdf and others.
Also usable for HDF, OPeNDAP, GRIB, etc. See the Java library for details.
ArcGIS
ArcGIS also reads and writes netcdf files.
Your favorite text editor
XML representation of a netcdf file
Other Tools
NCO
# diff
ncdiff -v time file1.nc file2.nc
# compression & packing
ncpdq -4 -L 9 in.nc out.nc # Deflated packing (~80% lossy compression)
# selecting variables by regex
ncks -v '^Q..' in.nc # Q01--Q99, QAA--QZZ, etc.
IDV
Very useful
Web hyperslabs, cool!
Not so stable.
Write script
• Read raw data
  • Read header line
  • Read data
  • Read all data
  • Create function to read all data
  • Use function in Matlab
• Raw data into empty netcdf file
  • Create empty netcdf file
  • Add dimensions and variables
  • Store variables
• Read values
Reading raw data into memory
• Use one of the following Matlab functions to read the file data into an array
  • fscanf
Example: Transect.txt file
1999 58
-135 3531 -130 3541 -125 3631 -120 4171 -115 6221 -110 8231 -105 9841 -100 10971 -95 12171 -90 12951
…
200 -2415 210 -2995 220 -3595 99999999999 99999999999
2000 58
-135 3531 -130 3541 -125 3631 -120 4171 -115 6221 -110 8231 -105 9841 -100 10971 -95 12171 -90 12951
Header line: year, number of points
Points: X Z X Z …. 9999999
Location: OpenEarthRawData\course\example\raw
Read header line
>> fid = fopen('..\raw\transect.txt')
fid =
    15
>> header = fscanf(fid, '%d', 2)
header =
    2000
    58
>> year = header(1)
year =
    2000
>> npoint = header(2)
npoint =
    58
Read data
% read header
header = fscanf(fid, '%d', 2);
year = header(1);
% store year in time
time(i) = year;
npoint = header(2);
% read data
data = fscanf(fid, '%d', npoint*2);      % step 1
data = reshape(data, [2, npoint]);       % step 2
% use column vectors
data = data';                            % step 3

Step 1
>> % read data
>> data = fscanf(fid, '%d', npoint*2)
data =
    -150
    3741
    -140
    3581
    -135
    …
Step 2
>> data = reshape(data, [2, npoint])
data =
  Columns 1 through 7
    -150  -140  -135  -130  …
    3741  3581  3531  3541  …
Step 3
>> % use column vectors
>> data = data'
data =
    -150  3741
    -140  3581
    -135  3531
    …
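The three steps work because reshape fills its output column by column. A minimal sketch of that behaviour, using the first six values shown on the slide instead of reading a file (the names raw, pairs, x and z are chosen for this example only):

```matlab
% Interleaved X Z values, in the order fscanf returns them.
raw = [-150; 3741; -140; 3581; -135; 3531];
npoint = numel(raw) / 2;

% reshape fills column by column, so each column becomes one (X, Z) pair.
pairs = reshape(raw, [2, npoint]);

% Transpose so that X and Z each end up in their own column.
data = pairs';
x = data(:, 1);
z = data(:, 2);
disp(data);
```

Reshaping directly to [npoint, 2] would not give the same result: it would put the first three values in column one and the last three in column two, mixing X and Z.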
Read all data
% preallocate all data
% (time, coastward)
transectseries = NaN(3, 58);
coastward_distance = NaN(58, 1);
time = NaN(3, 1);
% open file and get file id
fid = fopen('..\raw\transect.txt');
i = 1;
while (~feof(fid))
    % read header
    header = fscanf(fid, '%d', 2);
    % stop if no complete header is left (e.g. trailing blank line)
    if numel(header) < 2, break; end
    year = header(1);
    % store year in time
    time(i) = year;
    npoint = header(2);
    % read data
    data = fscanf(fid, '%d', npoint*2);
    data = reshape(data, [2, npoint]);
    % use column vectors
    data = data';
    % store data in transect series
    transectseries(i,:) = data(:,2);
    coastward_distance(:) = data(:,1);
    % skip the rest of the line (fill values)
    fgetl(fid);
    i = i + 1;
end
% close the file again
fclose(fid);
Create a function
function transect = readtransect(filename)
% preallocate all data
% (time, coastward)
transectseries = NaN(3, 58);
coastward_distance = NaN(58, 1);
time = NaN(3, 1);
% open file and get file id
fid = fopen(filename);
i = 1;
while (~feof(fid))
    % read header
    header = fscanf(fid, '%d', 2);
    % stop if no complete header is left (e.g. trailing blank line)
    if numel(header) < 2, break; end
    year = header(1);
    % store year in time
    time(i) = year;
    npoint = header(2);
    % read data
    data = fscanf(fid, '%d', npoint*2);
    data = reshape(data, [2, npoint]);
    % use column vectors
    data = data';
    % store data in transect series
    transectseries(i,:) = data(:,2);
    coastward_distance(:) = data(:,1);
    % skip the rest of the line (fill values)
    fgetl(fid);
    i = i + 1;
end
% close the file again
fclose(fid);
transect = struct('series', transectseries, ...
    'distance', coastward_distance, 'time', time);
end
Use the new function
>> data = readtransect('..\raw\transect.txt')
data =
      series: [3x58 double]
    distance: [58x1 double]
        time: [3x1 double]
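To try the reading loop without the course data, the sketch below writes a tiny file in the same layout and reads it back. Everything about the file is made up for illustration: the values, the three points per year, and the single 99999 fill value at the end of each record are assumptions, not the real transect.txt. The loop itself follows the slides, with an explicit check for an incomplete header instead of relying on feof alone.

```matlab
% Write a small synthetic file: header "year npoint", then X Z pairs and a fill value.
fname = [tempname() '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1999 3\n-10 1001 0 2001 10 3001 99999\n');
fprintf(fid, '2000 3\n-10 1101 0 2101 10 3101 99999\n');
fclose(fid);

% Preallocate (time, coastward), as on the slides but for the small file.
nyear = 2;
npoint = 3;
series = NaN(nyear, npoint);
distance = NaN(npoint, 1);
time = NaN(nyear, 1);

fid = fopen(fname, 'r');
i = 0;
while true
    header = fscanf(fid, '%d', 2);
    if numel(header) < 2
        break;                      % no complete header left: end of file
    end
    i = i + 1;
    time(i) = header(1);
    data = fscanf(fid, '%d', header(2) * 2);
    data = reshape(data, [2, header(2)])';
    distance(:) = data(:, 1);
    series(i, :) = data(:, 2);
    fgetl(fid);                     % skip the fill value at the end of the record
end
fclose(fid);
delete(fname);

transect = struct('series', series, 'distance', distance, 'time', time);
disp(transect);
```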
Loading data into netcdf
• What does a netcdf file look like
• Required meta information
Netcdf file
transect.nc
netcdf transect {
dimensions:
    coastward = 58 ;
    time = 3 ;
variables:
    float coastward_distance(coastward) ;
        coastward_distance:unit = "metre" ;
    float year(time) ;
        year:unit = "year" ;
    float height(time, coastward) ;
        height:unit = "metre" ;
data:
    coastward_distance = -135, -130,…, 150, 160, 170, 180, 190, 200, 210, 220 ;
    year = 1999, 2000, 2001 ;
    height = 353, 354, … -142, -146, -170, -206, -232, -273, -309, -346, -375, -388, … -32, … -92, -110, -127, -143, -156, -177, -211, -259, -303, -334 ;
}
Create an empty netcdf file
>> nc_create_empty(outputfile)
>> nc_dump(outputfile)
netcdf transect.nc {
dimensions:
variables:
}
Add dimensions
>> nc_add_dimension(outputfile, 'coastward', 58)
>> nc_add_dimension(outputfile, 'time', 3)
>> nc_dump(outputfile)
netcdf transect.nc {
dimensions:
    coastward = 58 ;
    time = 3 ;
variables:
}
help nc_add_dimension
Add variables
coastwardVariable = struct(...
    'Name', 'coastward_distance', ...
    'Nctype', 'float', ...
    'Dimension', {{'coastward'}}, ...
    'Attribute', struct('Name', 'unit', 'Value', 'metre') ...
    );
nc_addvar(outputfile, coastwardVariable);
timeVariable = struct(...
    'Name', 'year', ...
    'Nctype', 'float', ...
    'Dimension', {{'time'}}, ...
    'Attribute', struct('Name', 'unit', 'Value', 'year') ...
    );
nc_addvar(outputfile, timeVariable);
heightVariable = struct(...
    'Name', 'height', ...
    'Nctype', 'float', ...
    'Dimension', {{'time', 'coastward'}}, ...
    'Attribute', struct('Name', 'unit', 'Value', 'metre') ...
    );
nc_addvar(outputfile, heightVariable);
nc_dump(outputfile)
help nc_addvar
Result
netcdf transect.nc {
dimensions:
    coastward = 58 ;
    time = 3 ;
variables:
    float coastward_distance(coastward), shape = [58]
        coastward_distance:unit = "metre"
    float year(time), shape = [3]
        year:unit = "year"
    float height(time,coastward), shape = [3 58]
        height:unit = "metre"
}
Store variables
nc_varput(outputfile, 'height', data.series)
nc_varput(outputfile, 'year', data.time)
nc_varput(outputfile, 'coastward_distance', data.distance)
help nc_varput
Result: Netcdf file
transect.nc
netcdf transect {
dimensions:
    coastward = 58 ;
    time = 3 ;
variables:
    float coastward_distance(coastward) ;
        coastward_distance:unit = "metre" ;
    float year(time) ;
        year:unit = "year" ;
    float height(time, coastward) ;
        height:unit = "metre" ;
data:
    coastward_distance = -135, -130,…, 150, 160, 170, 180, 190, 200, 210, 220 ;
    year = 1999, 2000, 2001 ;
    height = 353, 354, … -142, -146, -170, -206, -232, -273, -309, -346, -375, -388, … -32, … -92, -110, -127, -143, -156, -177, -211, -259, -303, -334 ;
}
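The slides use the snctools functions (nc_create_empty, nc_add_dimension, nc_addvar, nc_varput). As an alternative, not taken from the course, the same create, store and read-back steps can be sketched with the netCDF functions that ship with recent Matlab versions (nccreate, ncwriteatt, ncwrite, ncread). The small data struct below is a hypothetical stand-in for the output of readtransect, and the temporary file name is likewise an assumption.

```matlab
outputfile = [tempname() '.nc'];

% Hypothetical stand-in for the struct returned by readtransect (2 years, 3 points).
data.distance = [-10; 0; 10];
data.time     = [1999; 2000];
data.series   = [1001 2001 3001; 1101 2101 3101];   % rows: time, columns: coastward

ncoast = numel(data.distance);
ntime  = numel(data.time);

% Create the variables; the dimensions are created along with them.
nccreate(outputfile, 'coastward_distance', ...
    'Dimensions', {'coastward', ncoast}, 'Datatype', 'single');
nccreate(outputfile, 'year', ...
    'Dimensions', {'time', ntime}, 'Datatype', 'single');
% These functions list the fastest-varying dimension first, so this variable
% appears as height(time, coastward) in an ncdump listing.
nccreate(outputfile, 'height', ...
    'Dimensions', {'coastward', ncoast, 'time', ntime}, 'Datatype', 'single');

% Same attribute name and values as on the slides.
ncwriteatt(outputfile, 'coastward_distance', 'unit', 'metre');
ncwriteatt(outputfile, 'year', 'unit', 'year');
ncwriteatt(outputfile, 'height', 'unit', 'metre');

% Store the variables; height is transposed to (coastward, time).
ncwrite(outputfile, 'coastward_distance', single(data.distance));
ncwrite(outputfile, 'year', single(data.time));
ncwrite(outputfile, 'height', single(data.series'));

% Read the values back and restore the (time, coastward) layout.
height_back = ncread(outputfile, 'height')';
year_back   = ncread(outputfile, 'year');
unit_back   = ncreadatt(outputfile, 'height', 'unit');
delete(outputfile);
```

The main difference to watch for is the dimension order: with these functions the Matlab array is the transpose of what the ncdump listing suggests, which is why data.series is transposed before writing.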
Read values
surface(nc_varget(outputfile, 'height')')
[Figure: surface plot of height; horizontal axes run from 1 to 3 and from 0 to 60, vertical axis from -5000 to 15000]
CF convention
Standard used by USGS, NOAA, ArcGIS, GDAL
Climate and Forecast (CF) Convention
http://www.unidata.ucar.edu/software/netcdf/docs/conventions.html
Initially developed for
• Climate and forecast data
• Atmosphere, surface and ocean model-generated data
• Also used for observational datasets
• CF is the most widely used convention for geospatial netCDF data.
Improve output
• Store extra attributes
  • Title
  • Author
  • Standard_name
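A minimal sketch of storing such attributes, again using Matlab's built-in ncwriteatt rather than snctools. The file, the title and the author are placeholders invented for this example. The standard name 'altitude' and the 'Conventions' value are assumptions about what would suit this dataset; check them against the CF standard name table before using them.

```matlab
% Small throwaway file with one variable, so the example is self-contained.
ncfile = [tempname() '.nc'];
nccreate(ncfile, 'height', ...
    'Dimensions', {'coastward', 3, 'time', 2}, 'Datatype', 'single');

% Global attributes describe the whole file ('/' addresses the file itself).
ncwriteatt(ncfile, '/', 'title', 'Example coastal transects');   % placeholder title
ncwriteatt(ncfile, '/', 'author', 'A. N. Author');               % placeholder author
ncwriteatt(ncfile, '/', 'Conventions', 'CF-1.6');                % assumed CF version

% Variable attributes: CF expects 'units' (plural) and 'standard_name'.
ncwriteatt(ncfile, 'height', 'units', 'm');
ncwriteatt(ncfile, 'height', 'standard_name', 'altitude');       % assumed to fit this data

% Read two attributes back to confirm they were stored.
title_back = ncreadatt(ncfile, '/', 'title');
name_back  = ncreadatt(ncfile, 'height', 'standard_name');
delete(ncfile);
```

Note that the earlier slides store the unit under the attribute name 'unit'; CF-aware tools look for 'units'.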