Persello Bruzzone IGARSS 2011.pptx

Remote Sensing Laboratory

Dept. of Information Engineering and Computer Science

University of Trento

Via Sommarive, 14, I-38123 Povo, Trento, Italy

A NOVEL ACTIVE LEARNING STRATEGY FOR DOMAIN ADAPTATION IN THE CLASSIFICATION OF REMOTE SENSING IMAGES

Web page: http://rslab.disi.unitn.it

C. Persello

L. Bruzzone

Outline

1) Background on Domain Adaptation and Active Learning

2) Aim of the Work

3) Proposed Approach to Address Domain Adaptation Problems with Active Learning

4) Experimental Results

5) Conclusions

Introduction

Scenario: The growing availability of space-borne data makes it possible to develop many applications related to land-cover mapping and monitoring.

Problem: Common automatic classification techniques are based on supervised learning methods, which require a new set of training samples every time a new remote sensing image has to be classified.

Need: Efficient techniques capable of adapting a supervised classifier trained on one image to the classification of another similar but not identical image acquired either:

1) on a different area, or

2) on the same area at a different time.


Background on Domain Adaptation

Domain Adaptation: models the problem of adapting a supervised classifier trained on a

given image (source domain) to the classification of another similar but not identical

image (target domain) acquired either on a different area, or on the same area at a

different time.

Assumption: Source and target domain share the same set of land cover classes.

[1] L. Bruzzone and D. Fernandez Prieto, “Unsupervised retraining of a maximum-likelihood classifier for the analysis of multitemporal remote-sensing images,” IEEE Trans. Geosci. Remote Sens., vol. 39, no. 2, pp. 456-460, 2001.

[2] L. Bruzzone and M. Marconcini, “Domain Adaptation Problems: a DASVM Classification Technique and a Circular Validation Strategy,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 32, no. 5, pp. 770-787, 2010.

[Figure: Source Domain and Target Domain sample distributions; the legend includes "Unknown Class".]

Semisupervised techniques (e.g., [1], [2]). Problem: correct convergence is not always possible.


Working Assumption

Working Assumption: In this work we assume that some samples (as few as possible) from the target domain can be labeled by the user and added to the existing training set.

Proposed solution: use an Active Learning procedure [1], [2] to select the most informative samples of the target domain.


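[Figure: general active learning process. Block diagram with the elements G, Q, S, U and X, in which the training set T(i-1) is updated to T(i) and used for classification.]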

G: Supervised classifier;

Q: Query function;

S: Supervisor;

T: Training set;

U: Unlabeled data
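The roles of G, Q, S, T and U listed above can be illustrated with a minimal sketch. It assumes a toy nearest-centroid classifier for G and a distance-margin rule for Q; both are placeholders chosen for brevity and are not taken from the slides, and all function names are hypothetical.

```python
import numpy as np

def fit_centroids(X, y):
    """G: toy nearest-centroid classifier (stand-in for any supervised classifier)."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def margin_query(model, U, batch):
    """Q: pick the `batch` unlabeled samples whose two nearest centroids
    are closest to being equidistant (assumes at least two classes)."""
    _, centroids = model
    d = np.linalg.norm(U[:, None, :] - centroids[None, :, :], axis=2)
    d.sort(axis=1)
    margin = d[:, 1] - d[:, 0]          # small margin = uncertain sample
    return np.argsort(margin)[:batch]

def active_learning(T_X, T_y, U, supervisor, n_iter=5, batch=2):
    """Iterate: train G on T, query U with Q, ask S for labels, update T."""
    for _ in range(n_iter):
        if len(U) == 0:
            break
        model = fit_centroids(T_X, T_y)        # train G on T(i-1)
        idx = margin_query(model, U, batch)    # Q selects samples X from U
        new_y = supervisor(U[idx])             # S labels the selected samples
        T_X = np.vstack([T_X, U[idx]])         # T(i) = T(i-1) plus the new samples
        T_y = np.concatenate([T_y, new_y])
        U = np.delete(U, idx, axis=0)          # remove them from the unlabeled pool
    return fit_centroids(T_X, T_y), T_X, T_y, U
```

Here `supervisor` is any callable that returns labels for the queried samples; in practice it stands for the human user.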

[1] S. Rajan, J. Ghosh, and M. M. Crawford, “An active learning approach to hyperspectral data classification,” IEEE Transactions on

Geoscience and Remote Sensing, vol. 46, no. 4, pp. 1231-1242, Apr. 2008.

[2] B. Demir, C. Persello, and L. Bruzzone, “Batch mode active learning methods for the interactive classification of remote sensing images,”

IEEE Transactions on Geoscience and Remote Sensing, vol. 49, no. 3, pp. 1014-1031, Mar. 2011.


Aim of the Work

Aim of the Work: propose a novel Domain Adaptation technique based on Active

Learning, which aims at classifying the target image, while requiring the minimum

number of labeled samples from the new image.

Basic Idea: iterative process based on

1) labeling and adding to the training set the most informative samples from the target

domain (query+), while

2) removing from the training set the source-domain samples that do not fit with the

distributions of the classes in the target domain (query-).

Example:

[Figure: Source Domain and Target Domain samples with the Query+ and Query- selections, iterated until convergence is reached.]


Proposed Technique


[Figure: class-conditional densities plotted along x, with the largest and the second largest class-conditional density marked.]
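The formula on this slide did not survive extraction. As an illustration only, the sketch below ranks unlabeled samples by the gap between the largest and the second largest class-conditional log-density, assuming Gaussian class models. The Gaussian assumption, the use of log-densities, and the names `gaussian_logpdf` and `query_plus` are assumptions made for this example, not the authors' definition.

```python
import numpy as np

def gaussian_logpdf(X, mean, cov):
    """Log of a multivariate Gaussian density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    sol = np.linalg.solve(cov, diff.T).T
    maha = np.einsum("ij,ij->i", diff, sol)       # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + d * np.log(2.0 * np.pi))

def query_plus(U, means, covs, batch):
    """Indices of the `batch` samples with the smallest gap between the largest
    and the second largest class-conditional log-density (hypothetical rule)."""
    logp = np.stack([gaussian_logpdf(U, m, c) for m, c in zip(means, covs)], axis=1)
    top2 = np.sort(logp, axis=1)[:, -2:]          # columns: second largest, largest
    gap = top2[:, 1] - top2[:, 0]
    return np.argsort(gap)[:batch]
```

A small gap means two classes explain the sample almost equally well, so labeling it is expected to be informative.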


Proposed Technique


[Figure: along x, with a point x- marked, the class-conditional density computed using source-domain samples and the class-conditional density computed using samples at iteration i.]


Proposed Technique


[Figure: the class-conditional density computed using source-domain samples compared with the class-conditional density computed using samples at iteration i.]
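The removal criterion itself is not recoverable from the extracted slides. One plausible reading, offered purely as an assumption, is to compare, for each source-domain sample of a class, its density under the model estimated from source-domain samples with its density under the model estimated from the training samples at iteration i, and to remove the samples for which the first most exceeds the second. The Gaussian models, the scoring rule, and the name `query_minus` are hypothetical.

```python
import numpy as np

def gaussian_logpdf(X, mean, cov):
    """Log of a multivariate Gaussian density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + d * np.log(2.0 * np.pi))

def fit_gaussian(X, reg=1e-6):
    """Sample mean and (slightly regularized) covariance of the rows of X."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + reg * np.eye(X.shape[1])
    return mean, cov

def query_minus(X_src, X_cur, n_remove):
    """For one class, return indices of the source samples with the largest
    log p_source(x) - log p_current(x) (an assumed score, not the authors')."""
    m_s, c_s = fit_gaussian(X_src)      # density from source-domain samples
    m_c, c_c = fit_gaussian(X_cur)      # density from samples at iteration i
    score = gaussian_logpdf(X_src, m_s, c_s) - gaussian_logpdf(X_src, m_c, c_c)
    return np.argsort(score)[::-1][:n_remove]
```

With this score, a source sample that the current class model explains poorly is the first candidate for removal.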


Data Set Description: VHR data set


Data set: Two Quickbird images acquired in 2006 over two rural areas in Trento, Italy.

Reference labeled data: Two sets of labeled samples for each image.

Land-cover classes: Vineyard, water, agriculture fields, forest, apple tree, urban area.

[Figures: Image QB1 and Image QB2.]


Data Set Description


Distribution of labeled samples on bands 3 and 4 of the two Quickbird images

[Figure: two scatter plots of band 3 against band 4, one for the Source Domain and one for the Target Domain. Legend: Vineyard, Water, Agriculture Fields, Forest, Apple Tree, Urban Area.]


Averaged learning curves over ten trials

[Figure: overall accuracy (%) on TS2 versus number of labeled samples of the target domain (0 to 600). Curves: AL on QB2; AL random on QB2; q+; q+ random; Proposed DA method (q+ and q-).]

[Figure: Bhattacharyya distance (horizontal axis 0 to 600) for each class (Vineyard, Water, Agriculture Fields, Forest, Apple Tree, Urban Area) and on average.]
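The plots report a Bhattacharyya distance per class. The extracted text does not say which pair of distributions is compared or how the distance is computed, so the following is only a sketch of the standard closed form for two Gaussian distributions; the function name and the Gaussian assumption are introduced here for illustration.

```python
import numpy as np

def bhattacharyya_gaussian(m1, S1, m2, S2):
    """Bhattacharyya distance between N(m1, S1) and N(m2, S2)."""
    S = 0.5 * (S1 + S2)                              # averaged covariance
    diff = m1 - m2
    term_mean = 0.125 * diff @ np.linalg.solve(S, diff)
    _, ld = np.linalg.slogdet(S)
    _, ld1 = np.linalg.slogdet(S1)
    _, ld2 = np.linalg.slogdet(S2)
    term_cov = 0.5 * (ld - 0.5 * (ld1 + ld2))        # 0.5 * ln(|S| / sqrt(|S1||S2|))
    return term_mean + term_cov
```

The distance is zero for identical distributions and grows as the means separate or the covariances differ.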


Data Set Description: hyperspectral data set

Study area: Okavango Delta, Botswana.

Data set: Hyperspectral image acquired by the Hyperion sensor of the EO-1 satellite (145 noise-free bands).

Classes: 14 different land-cover types.

Reference labeled data was collected in two disjoint areas

and four different sets were defined:

• a training set T1

• a spatially correlated test set TS1

• a training set T2 spatially disjoint from T1

• a test set TS2 spatially correlated with T2


[Figure: map of the study area showing Area 1 and Area 2 with the sets T1, TS1, T2 and TS2.]



Averaged learning curves over ten trials

[Figure: overall accuracy (%) on TS2 versus number of labeled samples of the target domain (0 to 600). Curves: AL on Area 2; AL random on Area 2; q+; q+ random; Proposed DA method (q+ and q-).]

[Figure: average Bhattacharyya distance versus number of labeled samples of the target domain (0 to 600).]


Conclusion

A novel approach to address Domain Adaptation problems with Active Learning has been proposed.

Assuming that an image and the related reference labeled samples are available, the proposed technique can be used either:

1) to classify another image acquired on another geographical area with similar characteristics and the same land-cover classes, or

2) to update the land-cover map given a new image acquired on the same area at a different time.

We introduced a stop criterion that does not require a test set defined on the target domain.

Future Developments:

Include a diversity criterion in the query+ function.

Extend the proposed method to kernel-based classifiers.
