PE&RS November 2019 Full - page 847

labeled data (see Table 3). In the Pavia University data set, we randomly divided the total available data into two parts: 75% for training and 25% for testing (see Table 4). Then, for the 75% training data, we randomly selected thirty samples in each class as the initial labeled data.
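
The split described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function name `split_and_seed`, the toy labels, and the fixed seed are assumptions, and the text does not say whether the 75/25 split was stratified by class, so a plain random split is assumed here.

```python
import random
from collections import defaultdict


def split_and_seed(labels, train_fraction=0.75, n_initial=30, seed=0):
    """Return (initial_idx, pool_idx, test_idx) as lists of sample indices."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)

    # 75% of the shuffled samples form the training part, the rest the test part.
    n_train = int(round(train_fraction * len(indices)))
    train_idx, test_idx = indices[:n_train], indices[n_train:]

    # Group the training indices by class label.
    by_class = defaultdict(list)
    for i in train_idx:
        by_class[labels[i]].append(i)

    # Draw up to n_initial samples per class as the initial labeled set.
    initial_idx = []
    for cls in sorted(by_class):
        members = by_class[cls]
        initial_idx.extend(rng.sample(members, min(n_initial, len(members))))

    # Everything else in the training part stays in the unlabeled pool.
    chosen = set(initial_idx)
    pool_idx = [i for i in train_idx if i not in chosen]
    return initial_idx, pool_idx, test_idx


# Toy labels (hypothetical): 3 classes with 200 samples each.
labels = [c for c in range(3) for _ in range(200)]
initial_idx, pool_idx, test_idx = split_and_seed(labels)
print(len(initial_idx), len(pool_idx), len(test_idx))  # 90 360 150
```

Repeating the call with different `seed` values would give the different initial labeled sets used across runs.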

In the experiments, for every algorithm, ten runs were executed on each image with different initial labeled data. The experimental environment was a notebook with an Intel Core™ i7-6700HQ CPU @ 2.6 GHz and 16 GB of memory, running Windows 10. The experiments were implemented with the Python scikit-learn package. The adopted classifier is an SVM based on the radial-basis-function kernel. There are two parameters for the SVM classifier, i.e., the Gaussian kernel parameter G and the regularization parameter C. They are tuned via grid search over the search space defined by $G = \{2^{-15}, 2^{-13}, \ldots, 2^{?}\}$ and $C = \{2^{-5}, 2^{-3}, \ldots, 2^{?}\}$ (Chang and Lin 2011; Mountrakis, Im, and Ogole 2011).
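
The search space above can be built as in the following sketch. The lower exponents and the step of 2 come from the text; the upper exponents are not recoverable from it, so `GAMMA_MAX_EXP` and `C_MAX_EXP` are placeholder assumptions (values commonly used with LIBSVM-style grids), and the helper name `power_of_two_grid` is hypothetical.

```python
def power_of_two_grid(min_exp, max_exp, step=2):
    """Return [2**min_exp, 2**(min_exp + step), ...] up to at most 2**max_exp."""
    return [2.0 ** e for e in range(min_exp, max_exp + 1, step)]


# ASSUMPTION: the upper exponents below are placeholders, not values from the paper.
GAMMA_MAX_EXP = 3
C_MAX_EXP = 15

gammas = power_of_two_grid(-15, GAMMA_MAX_EXP)  # 2^-15, 2^-13, ...
cs = power_of_two_grid(-5, C_MAX_EXP)           # 2^-5, 2^-3, ...

# Every (gamma, C) pair that an exhaustive grid search would evaluate.
param_grid = [{"gamma": g, "C": c} for g in gammas for c in cs]
print(len(gammas), len(cs), len(param_grid))
```

With scikit-learn, the two lists could instead be passed as `param_grid={"gamma": gammas, "C": cs}` to `GridSearchCV` wrapped around `SVC(kernel="rbf")`, which cross-validates each pair and keeps the best one.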

Moreover, DSC-CASSL has two parameters, Q and m, where m is the size of the candidate query set and Q is the size of the actual query set. We set Q = 10 and m = 40. At each iteration, 10 samples were selected for manual labeling. The principle of active learning is to achieve a better training result with fewer labeled samples. Therefore, the number of labeled samples represents the labor cost and measures the consumption of active learning in the iterative phase. Because samples are extracted in batches, the minimum unit cost of manual labeling is Q, the number of samples per batch. The cost of manual labeling is as follows:

$$\text{Cost} = (N + 1) \times h - \frac{h}{2} = Nh + \frac{1}{2} \cdot h, \qquad P(N) < p < P(N + 1) \tag{9}$$

where P(N) is the model precision after N iterations, p is the required precision, and h is the number of samples labeled per iteration. The value of P(N) can be specified as overall accuracy (OA), AA, or the Kappa coefficient. For example, if OA = 80% is required, the cost of active learning can be calculated by setting h = 10 and p = 80%; if N = 15 satisfies the condition in Equation 9, the cost is as follows:

$$\text{Cost} = 15 \times 10 + \frac{1}{2} \times 10 = 155 \tag{10}$$
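
The cost calculation can be checked with a short sketch. It assumes Equation 9 reads Cost = (N + 1)h − h/2 = Nh + h/2 with P(N) < p < P(N + 1), which reproduces the value 155 for N = 15 and h = 10. The function names and the accuracy curve in the second example are hypothetical and are not taken from the paper.

```python
def labeling_cost(n_iterations, batch_size):
    """Cost = (N + 1) * h - h / 2, which simplifies to N * h + h / 2."""
    return (n_iterations + 1) * batch_size - batch_size / 2


def bracketing_iteration(precision_by_iteration, target):
    """Return the first N with P(N) < target < P(N + 1), or None if there is none.

    precision_by_iteration[k] is taken to be P(k), with k = 0 the initial model.
    """
    for n in range(len(precision_by_iteration) - 1):
        if precision_by_iteration[n] < target < precision_by_iteration[n + 1]:
            return n
    return None


# Worked example from the text: N = 15, h = 10.
print(labeling_cost(15, 10))  # 155.0

# Hypothetical accuracy curve (not from the paper), target p = 0.80.
curve = [0.70, 0.76, 0.79, 0.81, 0.83]
n = bracketing_iteration(curve, 0.80)
print(n, labeling_cost(n, 10))  # 2 25.0
```

The half-batch term reflects that the target precision is reached somewhere between iteration N and iteration N + 1, so the cost is taken at the midpoint of that batch.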

Experiment on the Indian Pines Data Set

To demonstrate the effectiveness of the proposed framework, we compare DSC-CASSL with the CASSL, MCLU, nEQB, and random sampling (RS) techniques. First, regarding single-strategy active learning methods such as MCLU and nEQB, this plot reveals the advantage of using an active learning heuristic instead of RS. We can observe that MCLU outperforms nEQB. This can be attributed to the fact that the nEQB technique yields poorer classification accuracies with small values of Q and limited initial labeled samples. The computational complexity of the nEQB technique is also very high when few samples are selected. Interestingly, CASSL performs worse than MCLU in average accuracy (AA) until the number of labeled samples reaches 700. Because the initial samples of all compared methods are limited, the trained model has low confidence and the preestimated pseudolabels are biased. Once the training samples reach a certain amount, the accuracy of the pseudolabels is guaranteed and CASSL starts to outperform MCLU.
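
To make the contrast between a heuristic query and RS concrete, the sketch below ranks unlabeled samples by an MCLU-style margin. It assumes the usual description of MCLU, in which a sample's confidence is the gap between its two largest per-class SVM decision values and the samples with the smallest gap are queried first. The function names and decision values are hypothetical, and the sketch covers neither nEQB nor the pseudolabeling steps of CASSL and DSC-CASSL.

```python
import random


def mclu_margin(class_scores):
    """Gap between the two largest class scores; a small gap means high uncertainty."""
    best, second = sorted(class_scores, reverse=True)[:2]
    return best - second


def select_most_uncertain(scores_by_id, batch_size):
    """Return the batch_size sample ids with the smallest margins."""
    ranked = sorted(scores_by_id, key=lambda sid: mclu_margin(scores_by_id[sid]))
    return ranked[:batch_size]


def select_random(sample_ids, batch_size, seed=0):
    """Random-sampling (RS) baseline: ignores the classifier entirely."""
    return random.Random(seed).sample(sorted(sample_ids), batch_size)


# Hypothetical per-class decision values for four unlabeled samples.
scores_by_id = {
    "a": [2.1, -0.5, -1.0],  # margin 2.6
    "b": [0.3, 0.2, -0.9],   # margin about 0.1
    "c": [1.0, 0.1, -0.4],   # margin 0.9
    "d": [0.5, 0.0, -0.3],   # margin 0.5
}
print(select_most_uncertain(scores_by_id, 2))  # ['b', 'd']
```

In the experiments the batch size would be Q = 10 rather than 2. RS would draw the same number of samples with `select_random`, without looking at the scores.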

For the Indian Pines data set, DSC-CASSL does not perform better than the other compared methods before the newly labeled data reach 150 samples. The main reason is that the double-strategy-check framework may suffer from a “cold start” problem: it is widely acknowledged that if the initial labeled data set is too small, the performance of the pseudolabeling procedure can deteriorate considerably. If the initial training set does not match the class distributions, it is impossible to obtain an effective classifier at the very initial stage. After more than 15 iterations, we can observe that DSC-CASSL consistently outperforms the other methods. Table 5 compares the five algorithms at different iterations. In Figure

Table 4. Numbers of samples for the corresponding classes of the Pavia University data set.

Class Name              No. Samples
Asphalt                 6631
Gravel                  2099
Metal sheet             1345
Bitumen                 1330
Shadows                 947
Meadows                 18 649
Trees                   3064
Bare soil               5029
Self-blocking bricks    3682

Table 5. Comparison of overall accuracy between the compared algorithms and DSC-CASSL on the Indian Pines data set.

             Algorithm, %
Iteration    RS       MCLU     nEQB     CASSL    DSC-CASSL    Increase, %
30           71.40    75.60    74.99    75.33    76.40        +1.42
45           74.99    79.37    78.34    79.13    80.00        +1.10
60           77.39    81.20    80.34    81.45    82.53        +1.33
75           79.04    82.55    82.09    82.92    83.87        +1.15
90           80.27    83.69    82.79    84.13    84.83        +0.83
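
The text does not define the "Increase, %" column. The check below is a sketch of one reading, not a statement of the authors' definition: the tabulated values are consistent with the relative gain of DSC-CASSL over CASSL, that is, 100 × (DSC-CASSL − CASSL) / CASSL rounded to two decimals. The function name `relative_gain_percent` is illustrative.

```python
def relative_gain_percent(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return round(100.0 * (improved - baseline) / baseline, 2)


# (CASSL, DSC-CASSL, reported increase) per row of Table 5, keyed by the first column.
table5 = {
    30: (75.33, 76.40, 1.42),
    45: (79.13, 80.00, 1.10),
    60: (81.45, 82.53, 1.33),
    75: (82.92, 83.87, 1.15),
    90: (84.13, 84.83, 0.83),
}

for row, (cassl, dsc_cassl, reported) in table5.items():
    computed = relative_gain_percent(cassl, dsc_cassl)
    print(row, computed, reported, computed == reported)
```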

Figure 6. OA, AA, and Kappa results of the different algorithms on the Indian Pines data set. (a) Scaled-up version. (b)

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING

November 2019

847
