GSIF utilities

Page 1: GSIF utilities

Global Soil Information Facilities

Software developments

Tomislav Hengl

ISRIC – World Soil Information, Wageningen University

GSM2011.org, June 20–24th 2011

Page 2: GSIF utilities

Acknowledgement

1. Hannes Reuter – GIS and WPS functionality.
2. Pierre Roudier & Dylan Beaudette – plotKML package.
3. Brendon Malone & David Jacquier – spline fitting function.
4. Keith Shepherd & Bob MacMillan – Soil Reference Library.

Page 3: GSIF utilities

GSIF components

1. Cyber infrastructure for input, analysis and visualization of data.
2. Global databases (legacy data, gridded covariates) that are main inputs to global soil mapping.
3. Software tools (modules and packages) and manuals for creation of geoinformation, for instance, according to the GlobalSoilMap.net specifications.
4. Standards and protocols for data entry, map generation and data sharing.

Page 4: GSIF utilities

Book chapter

Page 5: GSIF utilities

Overview

(Servers) cyber infrastructure

Open Soil Profiles
- Soil variables
- Soil site info
- Soil analytical data
- Descriptive properties

Soil covariates (worldgrids)
- 5.6 km repository: Global
- 1 km repository: Continental scale
- 250 m repository: Country/state-level

R packages
- Map import module
- Data entry module
- Harmonization module
- Spline fitting
- Spatial analysis module
- Data visualization
- Data export

Soil property maps
- 100 m (250 m, 1 km and 5.6 km)
- Global coverage
- Six+four key soil parameters (organic carbon, pH, clay, silt, sand, coarse fragments) at six standard depths (0-5, 5-15, 15-30, 30-60, 60-100, 100-200 cm) and with included upper and lower 95% probability ranges

Webmapping API
- Real-time spatial prediction (Google Maps)
- GlobalSoilMap.net functionality for web-applications
- Geo-serving and geoprocessing functionality
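
The six standard depths rarely coincide with horizon boundaries recorded in the field. The following is a minimal sketch in Python of how horizon values could be mapped onto those intervals, assuming a plain thickness-weighted average. This is a simpler stand-in for the spline fitting named in the R packages list, not the GSIF implementation, and the tuple layout and function name are hypothetical.

```python
# Six standard depth intervals (cm) as listed on the slide.
STANDARD_DEPTHS = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]


def to_standard_depths(horizons):
    """Thickness-weighted mean of horizon values per standard interval.

    horizons: list of (top_cm, bottom_cm, value) tuples (hypothetical layout).
    Returns one value per standard interval, or None where no horizon overlaps.
    """
    result = []
    for top, bottom in STANDARD_DEPTHS:
        weighted_sum = 0.0
        covered = 0.0
        for h_top, h_bottom, value in horizons:
            # Thickness (cm) shared by the horizon and the standard interval.
            overlap = min(bottom, h_bottom) - max(top, h_top)
            if overlap > 0:
                weighted_sum += overlap * value
                covered += overlap
        result.append(weighted_sum / covered if covered > 0 else None)
    return result


# Illustrative profile with invented organic carbon values (%).
profile = [(0, 20, 2.0), (20, 50, 1.0), (50, 120, 0.5)]
standard = to_standard_depths(profile)
```

An interval that is only partly covered by horizons (here 100-200 cm) is averaged over the covered part only; a production tool would probably flag such cases.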

Page 6: GSIF utilities

GSIF modules

HARMONISATION MODULE
- Translation of laboratory methods (correlation)
- Upscaling / downscaling functionality
- Data translation (re-formatting) functionality

DATA ENTRY MODULE
- soilprofiles.org: live data entry forms for point data;
- Geo-registry module;
- Automated data screening and detection of gross errors and artifacts;

SUPPORT MODULE
- Help and F.A.Q.
- Variable descriptions (meta-data)
- Search functionality (manuals and user forums, demos and multimedia)

DATA EXPORT MODULE
- Subsetting and export to GIS data formats (GeoTIFF and ESRI Shapefile), KML and table formats (DBF);
- API services to serve the data without accessing URL (e.g. via mobile-phone);

SPATIAL ANALYSIS MODULE
- Overlay (covariates) and regression analysis;
- Multiscale prediction - trend models;
- Variogram analysis (automap);
- Prediction and simulations;
- Cross-validation;
- Predict secondary soil parameters (PTF)

MAP IMPORT MODULE
- Upload of gridded maps to the soilgrids.org repository;
- Meta-data generation tool;
- Automated map matching and validation (mask maps);
- Automated conversion and harmonization of soil polygon maps;

Inputs
- Open Soil Profiles (soilprofiles.org): New soil profile data
- Soil Gridded Covariates (soilgrids.org): New covariates

Data serving
- Soil property maps (globalsoilmap.net)
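
As an illustration of the automated data screening listed under the data entry module, the sketch below checks one horizon record for gross errors using physical limits only (pH between 0 and 14, texture fractions between 0 and 100 % and summing to about 100 %, horizon bottom below top). The field names, the 1 % tolerance and the function name are assumptions made for this example, not part of any GSIF module.

```python
def screen_record(record, tolerance=1.0):
    """Return a list of problems found in one horizon record (a dict).

    Field names (ph, clay, silt, sand, top, bottom) are hypothetical.
    tolerance: allowed deviation (in %) of clay + silt + sand from 100.
    """
    problems = []

    ph = record.get("ph")
    if ph is not None and not 0 <= ph <= 14:
        problems.append("pH outside 0-14")

    names = ("clay", "silt", "sand")
    fractions = [record.get(name) for name in names]
    for name, frac in zip(names, fractions):
        if frac is not None and not 0 <= frac <= 100:
            problems.append(f"{name} outside 0-100 %")
    # Only check the sum when all three fractions were reported.
    if all(f is not None for f in fractions):
        if abs(sum(fractions) - 100) > tolerance:
            problems.append("texture fractions do not sum to 100 %")

    top, bottom = record.get("top"), record.get("bottom")
    if top is not None and bottom is not None and bottom <= top:
        problems.append("horizon bottom not below top")

    return problems
```

Checks of this kind catch unit mix-ups and typing errors; detecting statistical outliers would need reference distributions that are outside the scope of this sketch.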

Page 7: GSIF utilities

Proposed implementation

1. Produce a suite of utilities to import, re-format, analyze and visualize spatial soil data
2. Design them so they fit the needs of operational global soil mapping
3. Focus on using R+OSGeo
4. Get the whole DSM community involved (in design, in development, in use)
5. Provide training in development and use to countries and nodes

Page 12: GSIF utilities

List of utilities

1. Global soil mapping (core) package – GSIF
2. Soil visualization package – plotKML
3. Soil Reference Library – SRL
4. Geo-services (PythonWPS, Geoserver, RServe, GDAL utilities)

Page 16: GSIF utilities

Functionality (GSIF)

- Estimate the spatial domain and the tiling system;
- Fit splines to soil horizon records and convert from block to point support in vertical dimension;
- Query and download point and gridded data from the data portals;
- Convert harmonized soil profile data from relational structure to single table records;
- Automatically filter suspicious records and detect outliers in the soil profile records;
- Generate one set of globally consistent predictions using point observations (OSP) and gridded predictors (worldgrids);
- Convert gridded predictions to formats required for submission to GSIF;
- Generate metadata and data analysis reports using XML formats;
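
The conversion from a relational structure to single table records can be pictured as a join of a site table with a horizon table. The sketch below is one possible reading, assuming one output row per horizon with the site attributes repeated. All column names and the function name are hypothetical, and the GSIF package may use a different layout.

```python
def flatten_profiles(sites, horizons):
    """Join a site table and a horizon table into one row per horizon.

    sites: dict mapping site id -> dict of site attributes.
    horizons: list of dicts, each carrying a "site_id" key.
    All column names are hypothetical.
    """
    rows = []
    for hor in horizons:
        site = sites.get(hor["site_id"])
        if site is None:
            continue  # orphan horizon: no matching site record, skip it
        row = {"site_id": hor["site_id"]}
        row.update(site)  # repeat the site attributes on every horizon row
        row.update({k: v for k, v in hor.items() if k != "site_id"})
        rows.append(row)
    return rows


# Invented example data.
sites = {"P1": {"lon": 5.66, "lat": 51.97}}
horizons = [
    {"site_id": "P1", "top": 0, "bottom": 20, "ph": 6.1},
    {"site_id": "P1", "top": 20, "bottom": 50, "ph": 6.4},
    {"site_id": "P9", "top": 0, "bottom": 10, "ph": 5.0},  # no site record
]
table = flatten_profiles(sites, horizons)
```

Skipping orphan horizons silently keeps the example short; a real import would more likely report them.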

Page 17: GSIF utilities

Status (GSIF)

- Progress so far:
  - Derive cell ID for any location in the world and estimate number of 1-degree blocks required to map an area (based on a land mask);
  - Fit equal-area splines to soil profile data (the method of Bishop et al. (1999));
  - Get values at point locations from worldgrids.org (covariates);
  - Convert site-horizon DB to single-table structure;
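
To make the first item concrete, here is a minimal sketch of deriving a block ID for a location and counting the 1-degree blocks touched by a bounding box. The row-major numbering from the south-west corner is an assumption made for illustration, the function names are hypothetical, and no land mask is applied, so the count is an upper bound for areas that include sea.

```python
import math


def block_id(lon, lat):
    """Return an integer ID for the 1-degree block containing (lon, lat).

    Hypothetical scheme: 360 columns by 180 rows, numbered row by row
    starting at the south-west corner (-180, -90).
    """
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("coordinates out of range")
    col = min(int(math.floor(lon + 180)), 359)  # lon = 180 goes to last column
    row = min(int(math.floor(lat + 90)), 179)   # lat = 90 goes to last row
    return row * 360 + col


def blocks_for_bbox(west, south, east, north):
    """Set of block IDs touched by a bounding box (no land mask applied)."""
    first_col = int(math.floor(west + 180))
    last_col = int(math.ceil(east + 180)) - 1
    first_row = int(math.floor(south + 90))
    last_row = int(math.ceil(north + 90)) - 1
    return {
        row * 360 + col
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    }


# A box of 4.5 by 3 degrees touches 5 columns and 4 rows of blocks.
n_blocks = len(blocks_for_bbox(3.0, 50.5, 7.5, 53.5))
```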

Page 18: GSIF utilities

Functionality (plotKML)

- Visualize soil profile measurements (using the original soil colors);
- Visualize soil profile photographs;
- Plot results of prediction (soil property maps) using standard color schemes;
- Default distribution model for the GlobalSoilMap.net property maps (?);
- Visualize uncertainty of the maps;
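
For readers unfamiliar with the target format, the sketch below writes soil profile locations as KML placemarks using only the Python standard library. It shows the structure of a KML file (Document, Placemark, Point, coordinates in lon,lat order) and is not the plotKML API; the function name and the example point are invented.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def points_to_kml(points):
    """Build a KML document string with one Placemark per point.

    points: list of (name, lon, lat) tuples in WGS84 decimal degrees.
    """
    # Serialize the KML namespace as the default namespace (no prefix).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in points:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML writes coordinates as lon,lat,altitude.
        coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
        coords.text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


kml_text = points_to_kml([("Profile P1", 5.66, 51.97)])
```

Saving `kml_text` to a file with the `.kml` extension should be enough to open the points in Google Earth.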

Page 20: GSIF utilities

Soil profile attribute plot

Page 21: GSIF utilities

Organic carbon mapped using RK

Page 22: GSIF utilities

Soil grids as transparent polygons

Page 23: GSIF utilities

Soil type maps

Page 24: GSIF utilities

Multiple layers (above each other)

Page 26: GSIF utilities

Why KML? (1)

Google Earth is #1: more than 350 million downloads!

Page 27: GSIF utilities

Why KML? (2)

People that made Google Earth understand statistics

Page 28: GSIF utilities

plotKML

Page 29: GSIF utilities

SRL package

- Harmonization of soil profile data;
- Estimation of secondary soil properties using pedo-transfer functions;
- Estimation of soil properties using soil spectroscopy;

Page 30: GSIF utilities

Overview

SOIL REFERENCE LIBRARY

Components:
- Conversion functions (various R packages for generalized linear modeling, fuzzy matching, regression trees etc.)
- Conversion coefficients (most accurate models to estimate standard parameters; extendible)
- Soil Reference Data (soil referent profiles with complete laboratory methods, soil description and scanned soil spectra)
- Soil Spectral Library
- Dependent R libraries: HydroMe, soiltexture, soil.spec, aqp

Flowchart:
- Input: Unharmonized record (new data)
- Decision: Conversion model available? (YES / NO)
- Design the conversion model
- Fit conversion model parameters
- Decision: Fits required accuracy? (YES / NO)
- Obtain additional field data
- Estimate values of the standardized variable
- Output: Standardized value + Associated uncertainty
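
The "fit conversion model parameters" and "fits required accuracy?" steps of the flowchart can be sketched as follows, assuming the simplest possible conversion model: a straight line fitted by ordinary least squares between paired measurements from two laboratory methods, with the RMSE of the fit used as the associated uncertainty. The model form, the accuracy criterion and the function names are assumptions for illustration; the slides do not specify them.

```python
import math


def fit_linear(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b, rmse)."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least 3 paired measurements")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    rmse = math.sqrt(sum(r ** 2 for r in residuals) / n)
    return a, b, rmse


def harmonize(value, model, max_rmse):
    """Apply a fitted model if it meets the accuracy target.

    Returns (standardized value, uncertainty), or None when the model is
    not accurate enough and more reference data would be needed.
    """
    a, b, rmse = model
    if rmse > max_rmse:
        return None
    return a + b * value, rmse
```

With paired reference measurements `x` (source method) and `y` (standard method), `harmonize(new_value, fit_linear(x, y), max_rmse=...)` either returns a standardized value with its uncertainty or signals that additional field data is required.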

Page 31: GSIF utilities

Status

- It is not difficult to build a package; the difficult part is getting soil reference data
- We would need (at least) 300–500 points:
  - Points have to be representative (hypercube sampling, the whole world)
  - Each point should be sampled using a standard protocol (soil field description, soil lab analysis, soil spectroscopy)
  - Project designers need to decide if existing samples can be used as well as new ones
  - We probably need new point samples

Page 32: GSIF utilities

ISRIC monoliths

Figure: ISRIC referent samples (monoliths) and occurrence probability. Derived using the MaxEnt package (climatic images, HWSD and vegetation maps).

Page 33: GSIF utilities

Main principles of programming

1. Hide complexity from the users (scale, effective precision, 3D geostat)
2. Deliver data and results so that no software training is required to open it (KML)
3. Link to R+OSGeo community (do not invent functionality that already exists and is operational)

Page 34: GSIF utilities

Why R?

1. It is trustworthy software because it is open source
2. It is accessible to anyone on most operating systems
3. People in developing countries can start picking things up today!
4. It is one of the fastest-growing open source environments for statistical computing
5. It can handle space-time data
6. It is professional (made by top minds in the business)

Page 40: GSIF utilities

Next steps

- Release plotKML and GSIF packages v0.1
- Continue developing the functionality via R-forge
- Use user feedback to improve
- Optimize the processing speed and improve usability on various platforms
- Incorporate this functionality within WPS

Page 41: GSIF utilities

Would you like to join GSIF?

- Join the GSIF workshop on Friday
- There are some expectations:
  1. You take responsibility for delivering the functionality on time
  2. You share most of Edzer Pebesma's open data principles
  3. You should be familiar with R / LaTeX
