
Source: cs229.stanford.edu/proj2015/173_poster.pdf


Planet Labels: How do we use our Earth?

Rishabh Bhargava, Timon Ruban, Vincent Sitzmann

Dataset

We use high-resolution satellite images to classify the visible land according to its cover, such as forest, water or urban land. The input to our algorithms is a tile of 128x128 pixels cut out from a satellite image. Tiles are combined with labels from the 2011 National Land Cover Database (NLCD) to yield a supervised dataset. Random forests outperform the other models we tested, yielding 92.45% accuracy on a test set.

Methodology

[Figure: an image tile of 128 px x 128 px is combined with the NLCD labels; the tile label is the mode of all pixel labels]
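
The labeling step in the figure can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `label_tile`, the nested-list tile layout, and the tiny 4x4 example are assumptions. The purity threshold mirrors the rule used for this dataset, where only tiles with more than 90% of pixels in a single class are kept.

```python
from collections import Counter


def label_tile(nlcd_tile, min_purity=0.9):
    """Return the mode of the per-pixel labels, or None if the tile is too mixed.

    nlcd_tile is a 2-D list of per-pixel class labels (hypothetical layout).
    """
    pixels = [label for row in nlcd_tile for label in row]
    mode_label, mode_count = Counter(pixels).most_common(1)[0]
    # Keep the tile only if more than min_purity of its pixels share the mode.
    if mode_count / len(pixels) > min_purity:
        return mode_label
    return None


# Tiny 4x4 stand-in for a 128x128 tile: 15 of 16 pixels are "Forest".
tile = [["Forest"] * 4 for _ in range(4)]
tile[0][0] = "Water"
print(label_tile(tile))

# An evenly mixed tile is rejected.
mixed = [["Forest", "Water"], ["Forest", "Water"]]
print(label_tile(mixed))
```

Returning `None` for mixed tiles keeps the filtering decision next to the labeling rule, so a caller can simply drop unlabeled tiles.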

Satellite images of California from RapidEye satellites, obtained through the Planet Labs Explorer [1] program

5m x 5m resolution for each pixel

5 bands: B, G, R, Red Edge, Near Infrared

Tiles of size 128 pixels x 128 pixels extracted

Tiles matched with the US government’s 2011 National Land Cover Database [2] by assigning each tile the mode of its NLCD pixel labels

Land cover classes selected on the basis of Anderson’s Land Cover Classification System [3]: Water, Developed, Barren, Forest, Shrubland, Herbaceous, Cultivated, Wetland

Only tiles where more than 90% of pixels belong to a single class were considered

Total dataset size: 78,082 tiles

Dataset split: 60% training set, 20% validation set, 20% test set
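
A 60/20/20 split of this kind could be produced as follows. This is a sketch under stated assumptions: the poster does not say how tiles were assigned to each set, so the random shuffle, the seed, and the helper name `split_indices` are illustrative only.

```python
import random


def split_indices(n_tiles, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle tile indices and cut them into train / validation / test lists."""
    indices = list(range(n_tiles))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(n_tiles * train_frac)
    n_val = int(n_tiles * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder, roughly 20%
    return train, val, test


train, val, test = split_indices(78082)
print(len(train), len(val), len(test))
```

Giving the remainder to the test set guarantees that every tile lands in exactly one of the three sets, even when the fractions do not divide the dataset size evenly.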

Features used:

Color: HSV histograms (converted from RGB)

Near Infrared / Red Edge histograms

Texture: Gabor filters
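
The histogram features can be illustrated with the standard library alone. The sketch below is an assumption-laden example rather than the authors' pipeline: the bin count of 8, the scaling of band values to [0, 1], and the function names `hsv_histogram` and `band_histogram` are all hypothetical, and the Gabor texture features are not covered.

```python
import colorsys


def band_histogram(values, bins=8):
    """Normalized histogram of band values that are assumed to lie in [0, 1]."""
    hist = [0] * bins
    for v in values:
        # Clamp so that a value of exactly 1.0 falls in the last bin.
        hist[min(int(v * bins), bins - 1)] += 1
    return [count / len(values) for count in hist]


def hsv_histogram(rgb_pixels, bins=8):
    """Concatenate normalized H, S and V histograms for (r, g, b) pixels in [0, 1]."""
    hsv_pixels = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in rgb_pixels]
    features = []
    for channel in range(3):  # 0 = hue, 1 = saturation, 2 = value
        features += band_histogram([p[channel] for p in hsv_pixels], bins)
    return features


# Two made-up pixels plus made-up near-infrared values for the same pixels.
pixels = [(0.1, 0.5, 0.2), (0.0, 0.0, 1.0)]
nir = [0.3, 0.8]
features = hsv_histogram(pixels) + band_histogram(nir)
print(len(features))
```

Normalizing each histogram by the pixel count makes the feature vector independent of tile size, which is convenient if tiles of other dimensions are ever used.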

Tested models:

Logistic regression with L2 regularization

Support Vector Machines with RBF Kernels

Random Forests with 500, 1000 trees and p = √n predictors per tree

Convolutional Neural Networks with LeNet architecture
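
For orientation, a random forest along these lines could be configured with scikit-learn as below. This is a sketch, not the authors' setup: the poster does not name a library, and it is an assumption that p refers to the common square-root rule for the number of predictors. Note that scikit-learn applies `max_features` at each split rather than once per tree. The helper name `build_random_forest` and the toy data are hypothetical.

```python
import random

from sklearn.ensemble import RandomForestClassifier


def build_random_forest(n_trees=500, seed=0):
    """Random forest that considers sqrt(n_features) predictors at each split."""
    return RandomForestClassifier(
        n_estimators=n_trees,
        max_features="sqrt",
        random_state=seed,
    )


# Toy stand-in data: 60 samples with 16 random features and a simple binary target.
rng = random.Random(0)
X = [[rng.random() for _ in range(16)] for _ in range(60)]
y = [int(row[0] > 0.5) for row in X]

model = build_random_forest(n_trees=20).fit(X, y)
print(model.predict(X[:3]))
```

With the real feature vectors, the number of trees would be chosen on the validation set, as the comparison of forest sizes in the poster suggests.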

Results

Random Forests outperformed all other approaches:

Logistic Regression incurred high bias (high training and test error)

SVMs suffered from heavy overfitting and long training times

The lowest accuracies were observed for the classes Herbaceous, Shrubland and Barren

The Water and Forest classes had the best accuracy

Validation set errors for different models and their parameters:

Model                 Parameters          Validation Error
Logistic Regression   L2 Regularization   0.2316
Random Forest         100 trees           0.0780
Random Forest         1000 trees          0.0755

[Figures: Random Forest Confusion Matrix; Error Rates of different feature vectors; Learning Curve]

Potential Next Steps

Improve granularity of results by adopting a per-pixel approach

Update the NLCD dataset with RapidEye imagery

Create land cover databases for different countries using the latest satellite images

Discover temporal changes in land cover; this would help in identifying illegal deforestation activities, patterns in urban land growth, and shrinkage of water bodies

References

[1] Owned by Planet Labs, in the frame of the Planet Labs Explorers Program: www.planet.com/explorers

[2] Homer, C.G., Dewitz, J.A., Yang, L., Jin, S., Danielson, P., Xian, G., Coulston, J., Herold, N.D., Wickham, J.D., and Megown, K., 2015, Completion of the 2011 National Land Cover Database for the conterminous United States-Representing a decade of land cover change information. Photogrammetric Engineering and Remote Sensing, v. 81, no. 5, p. 345-354

[3] Anderson, James Richard. A land use and land cover classification system for use with remote sensor data. Vol. 964. US Government Printing Office, 1976.

[Figure: Precision and Recall of land cover classes]

[Figure: Training Error and Test Error versus Training Set Size (9360 to 46804)]

[Figure: Number of Tiles per Land Cover Class]

[Figure: error for the feature sets hsv_hist, hsv_hist + re_nir, and hsv_hist + re_nir + gab]

[Figure: RapidEye satellite image]

[Figure: RapidEye image overlaid with land cover predictions]
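
Per-class precision and recall of the kind plotted for the land cover classes can be derived directly from a confusion matrix. The sketch below shows the arithmetic only; the counts are made up for illustration and are not the poster's results, and the function name `precision_recall` is hypothetical.

```python
def precision_recall(confusion, classes):
    """Per-class (precision, recall) from a square confusion matrix.

    confusion[i][j] = number of tiles with true class i predicted as class j.
    """
    scores = {}
    for k, name in enumerate(classes):
        true_pos = confusion[k][k]
        predicted_k = sum(row[k] for row in confusion)  # column sum
        actual_k = sum(confusion[k])                    # row sum
        precision = true_pos / predicted_k if predicted_k else 0.0
        recall = true_pos / actual_k if actual_k else 0.0
        scores[name] = (precision, recall)
    return scores


# Made-up counts for three of the eight classes.
classes = ["Water", "Forest", "Shrubland"]
confusion = [
    [50, 0, 0],
    [2, 40, 8],
    [0, 10, 30],
]
scores = precision_recall(confusion, classes)
for name, (precision, recall) in scores.items():
    print(f"{name}: precision={precision:.3f} recall={recall:.3f}")
```

Reading the matrix by rows gives recall and by columns gives precision, which makes it easy to see which classes are confused with each other.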