Planet Labels: How do we use our Earth?cs229.stanford.edu/proj2015/173_poster.pdf · Planet Labels: How do we use our Earth? Rishabh Bhargava , Timon Ruban, Vincent Sitzmann Dataset

Planet Labels: How do we use our Earth?

Rishabh Bhargava , Timon Ruban, Vincent Sitzmann

Dataset

We use high-resolution satellite images to classify the land visible according to its cover, such as forest, water or urban land. The input to

our algorithms is a tile of 128x128 pixels cut out from a satellite image. Tiles are combined with labels from the 2011 National Land Cover

Dataset (NLCD) to yield a supervised data set. Random forests outperform a number of models yielding 92.45% accuracy on a test set.

Methodology

128 px

128 px

MODE

of all pixel

labels

NLCD labelsImage Tile

+

Satellite images of California from RapidEye satellites from

Planet Labs Explorer[1] program

5m x 5m resolution for each pixel

5 bands: B, G, R, Red Edge, Near Infrared

Tiles of size 128 pixels x 128 pixels extracted

Tiles matched with US government’s 2011 National Land

Cover Database[2] in a manner demonstrated below

Land cover classes selected on basis of Anderson’s Land

Cover Classification System[3]:

Water, Developed, Barren, Forest, Shrubland,

Herbaceous, Cultivated, Wetland

Only tiles where more than 90% of pixels belong to single

class were considered

Total dataset size: 78,082 tiles

Dataset split:

20% Test set

20% Validation set

60% Training set

Features used:

Color: HSV Histograms (convert from RGB)

Near Infrared / Red Edge Histograms

Texture: Gabor filters

Tested models:

Logistic regression with L2 regularization

Support Vector Machines with RBF Kernels

Random Forests with 500, 1000 trees and p = 𝑛predictors per tree

Convolutional Neural Networks with LeNet architecture

Random Forests outperformed all other approaches:

Logistic Regression incurred high bias (high training

and test error)

SVMs with suffered from heavy overfitting and long

training times

Lowest accuracies for classes Herbaceous, Shrubland and

Barren were observed

Water and Forest classes had the best accuracy.

Improve granularity of results by adopting per-pixel

approach

Update the NLCD dataset with RapidEye imagery

Create land cover databases for different countries using

latest satellite images

Discover temporal changes in land cover – this would help in

identifying illegal deforestation activities, patterns in urban

land growth, and shrinkage of water bodies

References

Model Parameters Validation Error

Logistic Regression L2 Regularization 0.2316

Random Forest 100 trees 0.0780

1000 trees 0.0755

Potential Next Steps

Random Forest Confusion Matrix Error Rates of

different feature vectors

Learning CurveResults

Validation set errors for different models and their parameters

[1] Owned by Planet Labs, in the frame of the Planet Labs Explorers Program: www.planet.com/explorers

[2] Homer, C.G., Dewitz, J.A., Yang, L., Jin, S., Danielson, P., Xian, G., Coulston, J., Herold, N.D., Wickham, J.D., and Megown, K., 2015, Completion of the 2011 National Land Cover Database for the conterminous United States-Representing a decade of land cover change information. Photogrammetric Engineering and Remote Sensing, v. 81, no. 5, p. 345-354

[3] Anderson, James Richard. A land use and land cover classification system for use with remote sensor data. Vol. 964. US Government Printing Office, 1976.

0.6

0.65

0.7

0.75

0.8

0.85

0.9

0.95

1

Precision

Recall

-0.02

0

0.02

0.04

0.06

0.08

0.1

0.12

9360 18721 28082 37443 46804

Training Set Size

Training Error

Test Error

0

5000

10000

15000

20000

25000

30000

Num

ber

of T

iles

Land Cover Class

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.2

hsv_hist hsv_hist +re_nir

hsv_hist +re_nir + gab

Precision and Recall of land cover classes

RapidEye image overlaid with land cover predictions

RapidEye satellite image

Documents

Planet Labels: How do we use our Earth?cs229.stanford.edu/proj2015/173_poster.pdf · Planet Labels: How do we use our Earth? Rishabh Bhargava , Timon Ruban, Vincent Sitzmann Dataset