
A semisupervised classification approach for HSI was presented based on a hierarchical learning paradigm, which is composed of multiple layers feeding into each other with spectral and spatial features. Ji et al. (2014) proposed a semisupervised hyperspectral image classification method to address both the pixel spectral and spatial constraints. In this method, the relationship among pixels is formulated in a hypergraph structure. In the constructed hypergraph, each vertex denotes a pixel in the hyperspectral image.
Advances in Combining AL and SSL for Hyperspectral Image Classification
Combining AL and SSL is a promising choice, and many existing works have explored this combination. Samiappan and Moorhead (2015) proposed a semisupervised approach that adapts active learning to a cotraining framework in which the algorithm automatically selects new training samples from the unlabeled samples. This approach is validated with a probabilistic support vector machine classifier. CASSL combines AL and SSL to invoke a collaborative labeling process by both human experts and classifiers. In CASSL, an AL-based pseudolabel verification process is performed to improve the pseudolabeling accuracy and thereby facilitate SSL. The unlabeled data with low pseudolabeling confidence in SSL become the query candidates in AL.
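This confidence-based routing of unlabeled samples is the core of the combination and can be sketched as follows (a minimal illustration under our own assumptions: the function name, the two thresholds, and the use of class probabilities as confidence are ours, not the published CASSL algorithm):

import numpy as np

def route_unlabeled(proba, query_thresh=0.5, pseudo_thresh=0.9):
    """Route unlabeled samples in a CASSL-style loop (illustrative sketch).

    proba: (n_samples, n_classes) class probabilities from the current
    classifier. Low-confidence samples become AL query candidates for the
    human experts; high-confidence samples receive pseudolabels for SSL.
    """
    confidence = proba.max(axis=1)                        # top-class probability
    query_idx = np.where(confidence < query_thresh)[0]    # -> human experts (AL)
    pseudo_idx = np.where(confidence >= pseudo_thresh)[0] # -> pseudolabels (SSL)
    pseudo_labels = proba[pseudo_idx].argmax(axis=1)      # tentative labels
    return query_idx, pseudo_idx, pseudo_labels

Samples whose confidence falls between the two thresholds are simply deferred to a later iteration.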
Proposed Methodology
As stated above, a framework integrating AL and SSL to enable collaborative labeling by both human experts and classifiers can obtain more accurate results in hyperspectral image classification. Assigning pseudolabels to unlabeled data and retraining the classifier with both the labeled data and the pseudolabeled data are the crucial issues. A good threshold must avoid introducing incorrect pseudolabels: a high threshold will result in fewer samples meeting the condition at each iteration and may lead to an "empty loop" phenomenon, whereas a low threshold can assign incorrect pseudolabels to unlabeled data and thus deteriorate classification performance. Moreover, the effectiveness of the pseudolabeling procedure may depend heavily on the initial labeled data set. If the initial training set does not match the underlying class distributions, it is difficult to train a reliable classifier at the very initial stage, and the judgment of samples is constantly changing. If a single fixed threshold is selected for the pseudolabeling procedure, it may not be able to adapt to the changing model. Therefore, the subsequent pseudolabeling procedure may invoke so many wrong pseudolabels that they deteriorate the performance of the final classifier.
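This trade-off can be made concrete with a small helper (hypothetical, not part of the proposed method) that counts how many unlabeled samples would be pseudolabeled at each candidate threshold:

import numpy as np

def count_pseudolabel_candidates(proba, thresholds=(0.6, 0.8, 0.95, 0.99)):
    """Count unlabeled samples whose top-class probability clears each
    threshold. A very high threshold can return zero candidates (the
    "empty loop"), while a low one admits many samples, some of which
    will carry wrong pseudolabels."""
    confidence = proba.max(axis=1)
    return {t: int((confidence >= t).sum()) for t in thresholds}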
Discriminative Information Mining and Multiple Verification
DSC-CASSL integrates two different AL strategies and SSL in a collaborative manner for hyperspectral image classification. This framework enables a collaborative labeling procedure by both human experts and three classifiers (two check classifiers and one base classifier) to obtain more confidently labeled samples and improve the classification performance. In DSC-CASSL, we apply nEQB and MCLU as part of ESAL to select the informative samples for manual labeling. nEQB, a query-by-committee strategy, considers the uncertainty of the samples through the maximum disagreement among the committee of learners and can thus select the most informative samples from the pool of uncertain samples. MCLU selects the most informative samples according to their confidence values. Both methods are widely applied in hyperspectral image classification.
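The two criteria can be sketched as follows, using their usual definitions from the active-learning literature (the function names and array layouts are our assumptions): MCLU flags samples where the two strongest one-vs-rest SVM decision values nearly tie, while nEQB flags samples where a bagged committee of classifiers disagrees most.

import numpy as np

def mclu_confidence(decision_values):
    """MCLU confidence: gap between the two largest one-vs-rest SVM
    decision values per sample; the smallest gaps mark the most
    informative samples. decision_values: (n_samples, n_classes)."""
    top2 = np.sort(decision_values, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

def neqb_uncertainty(committee_preds, n_classes):
    """nEQB uncertainty: entropy of the committee vote distribution per
    sample, normalized by the log of the number of classes the committee
    actually voted for; the largest values mark the most informative
    samples. committee_preds: (n_members, n_samples) hard labels."""
    n_members, n_samples = committee_preds.shape
    scores = np.empty(n_samples)
    for i in range(n_samples):
        votes = np.bincount(committee_preds[:, i], minlength=n_classes)
        p = votes[votes > 0] / n_members
        # a unanimous committee gives zero (fully certain)
        scores[i] = 0.0 if len(p) == 1 else -(p * np.log(p)).sum() / np.log(len(p))
    return scores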
Discriminative information, which represents the quality of the training data, is vital to improving the generalization ability of the classifiers. Because the labeled training data are limited, using the unlabeled data to enhance the training set is a natural choice. In DSC-CASSL, the discriminative information is improved both by adding the newly labeled data in the active-learning process and by assigning pseudolabels to unlabeled data. Meanwhile, the generalization ability of the classifiers is improved gradually at each iteration by adding a batch of informative samples to the labeled data set with two different query functions.
Detailed Steps of the DSC-CASSL Framework
In this section, we describe the details of the proposed DSC-CASSL framework, which is divided into two parts. ESAL (Cui, Kai, and Zhongjun 2018) is the basis of the verification part and also reinforces the performance of the base classifier. It is similar to the usual supervised active-learning method, but the process of selecting informative samples has been changed: we integrate MCLU and nEQB into a collaborative sample-selection strategy. At each iteration, we assume that the number of samples to be labeled is Q, the number of samples contributed by MCLU is q_1, and the number contributed by the nEQB strategy is q_2. It is worth noting that q_1 and q_2 represent not only the number of samples but also the information of the samples. Designing an appropriate function is the most important and challenging task for ESAL. In this paper, the fitness function of ESAL is defined as follows:
$q_1 = Q \times w_A, \qquad q_2 = Q \times w_B$  (7)

$Q = q_1 + q_2 + R$  (8)
where w_A and w_B are the weight parameters assigned to MCLU and nEQB, respectively. A range of parameters (w_A = 0.1, 0.3, 0.5, 0.7, 0.9) was considered in Cui, Kai, and Zhongjun (2018), and w_A = 0.5 performs better than the other settings at each iteration. This phenomenon can be attributed to the fact that nEQB and MCLU complement each other and thus achieve better performance together. Hence, we set w_A and w_B to the same value, 0.5. If q_1 ∩ q_2 ≠ ∅, the two strategies have selected some of the same valuable samples, so there are |q_1 ∩ q_2| vacancies, and the difference between Q and the number of distinct selected samples is filled by R. R is a random factor whose function is to select valuable samples within the threshold constraints. For example (illustrative values), with Q = 10 and w_A = w_B = 0.5, Equation 7 yields q_1 = q_2 = 5; if MCLU and nEQB happen to pick two samples in common, only eight distinct samples are obtained, and R = 2 fills the remaining vacancies in Equation 8. It should be underlined that R does not exist at every iteration; whether it is needed is determined by the results of ESAL. The pseudocode for ESAL is illustrated in Algorithm 1 (see next page).
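Since Algorithm 1 is not reproduced here, the following is a minimal sketch of the collaborative selection step implied by Equations 7 and 8, assuming that overlap between the two strategies is filled by random draws from the remaining pool (the function and variable names are ours):

import numpy as np

def esal_select(mclu_conf, neqb_unc, Q, w_a=0.5, w_b=0.5, rng=None):
    """Collaborative selection per Equations 7 and 8 (our sketch).

    mclu_conf: MCLU confidence per unlabeled sample (smaller = more
    informative). neqb_unc: nEQB uncertainty (larger = more informative).
    Returns indices of the Q samples to pass to the human experts."""
    rng = np.random.default_rng() if rng is None else rng
    q1 = int(Q * w_a)                            # Eq. 7: MCLU share
    q2 = int(Q * w_b)                            # Eq. 7: nEQB share
    sel_mclu = np.argsort(mclu_conf)[:q1]        # lowest confidence first
    sel_neqb = np.argsort(neqb_unc)[-q2:]        # highest uncertainty first
    chosen = np.union1d(sel_mclu, sel_neqb)      # duplicates collapse here
    vacancies = Q - chosen.size                  # Eq. 8: filled by R
    if vacancies > 0:
        pool = np.setdiff1d(np.arange(mclu_conf.size), chosen)
        chosen = np.concatenate([chosen, rng.choice(pool, vacancies, replace=False)])
    return chosen

Because np.union1d collapses duplicates, the random factor R comes into play only in iterations where the two strategies overlap, matching the observation above that R does not exist at every iteration.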
Initially, we utilize the initial labeled samples and the pseudolabeled samples to train the base classifier, and the base classifier is utilized to predict the unlabeled data. At the beginning of the iterations, the pseudolabeled data set is empty. Next, we select the q_1 most informative unlabeled samples by utilizing the MCLU technique and the q_2 most informative unlabeled samples by utilizing the nEQB algorithm, and these selected unlabeled data are labeled by human experts. We denote the q_1 newly labeled samples as L_{Q1} and the q_2 newly labeled samples as L_{Q2}. It should be underlined that DSC-CASSL does not increase the labeling cost, because CASSL and DSC-CASSL have the same number of unlabeled samples that need to be labeled by human experts at each iteration; we suppose both CASSL and DSC-CASSL select Q unlabeled samples at each iteration. When the MCLU technique and the nEQB technique simultaneously select the same samples, we utilize the random factor R as a supplement. Simultaneously, the labeled set and the unlabeled set are updated by adding the newly labeled samples to the labeled set and removing them from the unlabeled set. The updated labeled set is applied to retrain the base classifier.
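A single iteration of this loop can be sketched as follows (a hedged illustration: the RBF SVM base classifier, the MCLU-only selection, and the oracle stand-in for the human experts are our simplifying assumptions, not the paper's exact components):

import numpy as np
from sklearn.svm import SVC

def labeling_iteration(X_lab, y_lab, X_pseudo, y_pseudo, X_unlab, oracle, Q=10):
    """One pass of the collaborative labeling loop (illustrative sketch)."""
    # train the base classifier on labeled plus pseudolabeled data
    X_train = np.vstack([X_lab, X_pseudo]) if len(X_pseudo) else X_lab
    y_train = np.concatenate([y_lab, y_pseudo]) if len(y_pseudo) else y_lab
    base = SVC(kernel="rbf").fit(X_train, y_train)

    # MCLU confidence on the unlabeled pool (assumes more than two
    # classes, so the one-vs-rest decision values form an
    # (n_samples, n_classes) array)
    dec = base.decision_function(X_unlab)
    top2 = np.sort(dec, axis=1)[:, -2:]
    conf = top2[:, 1] - top2[:, 0]
    # the full framework would merge this with the nEQB pick and the
    # random factor R, as in the ESAL sketch above
    idx = np.argsort(conf)[:Q]

    # human experts (the oracle stand-in) label the selected samples;
    # move them from the unlabeled set to the labeled set
    X_lab = np.vstack([X_lab, X_unlab[idx]])
    y_lab = np.concatenate([y_lab, oracle(X_unlab[idx])])
    X_unlab = np.delete(X_unlab, idx, axis=0)
    return base, X_lab, y_lab, X_unlab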