PE&RS October 2018 Full

In addition, the computation process and the algorithms for the object matching step are complex, and the choice of object matching method and discrepancy index directly affects the final evaluation result. Therefore, how to choose or design an optimal and universal object matching method and discrepancy index for different applications and high spatial resolution remote sensing images is a key issue for future research on the supervised evaluation method.

Unsupervised Evaluation Method

The unsupervised evaluation method is also known as the empirical goodness evaluation method or the stand-alone evaluation method. This method requires no reference image and evaluates image segmentation results according to a goodness index, which is based on human perception of the ideal segmentation result. The widely recognized definition of an ideal segmentation result, given by Haralick and Shapiro (1985), is as follows: regions should be uniform and homogeneous with respect to some characteristic(s); adjacent regions should have significant differences with respect to the characteristic for which they are uniform; region interiors should be simple and without holes; and boundaries should be simple, not ragged, and spatially accurate. Based on this idea, Ming et al. (2009) proposed a multi-index unsupervised segmentation evaluation method, in which six measures (including removed region number, non-uniformity within region measure, contrast across region measure, variance contrast across region measure, and edge gradient measure) were employed to evaluate the segmentation quality.

However, for natural images with clear texture and structure, and especially for high spatial resolution remote sensing images, normally only the first two of the rules proposed by Haralick and Shapiro (1985) can be satisfied. Because a good segmentation of a high spatial resolution remote sensing image should have both high intra-region homogeneity and high inter-region heterogeneity, most unsupervised evaluation methods first calculate the intra-region homogeneity and inter-region heterogeneity of each segmented object, and then obtain an overall index for the whole image by combining the local results of all the segmented objects. The multi-index measures proposed by Ming et al. (2016) correspond mainly to intra-region homogeneity and inter-region heterogeneity. Lastly, the two indexes, homogeneity and heterogeneity, are jointly used to obtain a final goodness index, which is the key to evaluating the segmentation result of high spatial resolution remote sensing images. The following text systematically summarizes the commonly used goodness indexes of intra-region homogeneity and inter-region heterogeneity, and then summarizes the methods for compositing the goodness indexes.
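The two-stage procedure described above (per-object measures first, then an image-level aggregate) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not any of the cited indexes: it assumes a single-band image and a 2-D integer label map, uses area-weighted variance as the homogeneity measure, and uses the mean absolute difference between the mean values of 4-connected neighboring objects as the heterogeneity measure. All function names are hypothetical.

```python
import numpy as np


def intra_region_homogeneity(image, labels):
    """Area-weighted mean of per-object variance (lower = more homogeneous)."""
    image = np.asarray(image, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for rid in np.unique(labels):
        pixels = image[labels == rid]
        # Weight each object's variance by its area (pixel count).
        total += pixels.size * pixels.var()
    return total / image.size


def adjacent_pairs(labels):
    """Set of label pairs (a, b), a < b, sharing a 4-connected border."""
    labels = np.asarray(labels)
    pairs = set()
    # Compare each pixel with its right neighbor, then with its lower neighbor.
    for first, second in ((labels[:, :-1], labels[:, 1:]),
                          (labels[:-1, :], labels[1:, :])):
        differs = first != second
        for p, q in zip(first[differs].tolist(), second[differs].tolist()):
            pairs.add((min(p, q), max(p, q)))
    return pairs


def inter_region_heterogeneity(image, labels):
    """Mean absolute difference of object means over adjacent object pairs."""
    image = np.asarray(image, dtype=float)
    labels = np.asarray(labels)
    means = {int(r): image[labels == r].mean() for r in np.unique(labels)}
    pairs = adjacent_pairs(labels)
    if not pairs:
        return 0.0  # a single object has no neighbors
    return float(np.mean([abs(means[a] - means[b]) for a, b in pairs]))
```

On a toy image whose left half is 1 and right half is 5, a label map that follows that boundary gives zero within-object variance and a between-object difference of 4, whereas a single all-covering object gives a variance of 4 and no heterogeneity. A good segmentation should therefore score low on the first function and high on the second.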

Homogeneity Goodness Indexes

Homogeneity goodness is an index based on the first rule of ideal segmentation results. It can directly and efficiently evaluate segmentation results by calculating the homogeneity within a region. As shown in Figure 5, four kinds of indexes are currently used to evaluate homogeneity: goodness based on spectral difference, goodness based on spectral variance, goodness based on texture, and goodness based on entropy. In the field of remote sensing, Chabrier et al. (2006) compared the performance of several proposed goodness indexes for evaluating segmentation results of remote sensing imagery. Among the goodness indexes mentioned above, those based on spectral difference, texture, and entropy are less commonly applied in remote sensing.

Goodness Based on Difference

Weszka and Rosenfeld (1978) proposed a goodness index based on the spectral difference, $D_{wr}$, between the original pixels of the segmented object and the pixels after threshold segmentation; it is applied to evaluate foreground and background image segmentation algorithms that are based on a threshold value. Zéboudj (1988) derived the Max Within-Region Contrast (MWC) goodness from the maximum spectral difference between the intra-region pixels of the segmented object and the adjacent pixels. Chen and Wang (2004) proposed the Intra-Region Visual Error ($E_{intra}$) goodness on the basis of the difference between the spectral value of each pixel in the segmented object and the average spectral value within the object.

Goodness Based on Spectral Variance

Goodness based on spectral difference mostly considers intra-region homogeneity only at the local pixel level rather than at the segmented object level. In comparison, goodness based on spectral variance measures intra-region homogeneity at the segmented object level and therefore reflects it better. Liu and Yang (1994) proposed the F goodness based on the average spectral variance of segmented objects, in which over-segmentation is penalized by weighting with the square root of the number of segmented objects. However, F is still biased: as the number of segmented objects increases, the value of F rapidly decreases to zero regardless of a high spectral variance value. Borsotti et al. (1998) described the F′ and Q goodness to correct the biases in F. Although Q is better at segmentation evaluation, its formula was developed through empirical analysis and may still cause biased evaluation in some special situations. Zhang (1996a) took the spectral variance of the segmented objects as a metric and put forward the Non-Uniformity within Region Measure (UM) goodness. The smaller the UM value, the stronger the homogeneity within the segmented object. Furthermore, the Normalized Uniformity Measure (NU) introduced by Sahoo et al. (1988) is the normalized version of UM. The larger the NU value, the stronger the homogeneity within the segmented object. Otsu (1979) defined the Within-Class Variance (WV) goodness in accordance with the weighted sum of the spectral variances of the segmented object and its adjacent object.
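The bias of F noted in this section can be illustrated with a short sketch. It assumes the form in which Liu and Yang's index is commonly cited, F = sqrt(R) · Σ e_i² / sqrt(A_i), where R is the number of segmented objects, A_i is the area of object i, and e_i² is the summed squared deviation of its pixels from the object mean. That form, the omission of any image-size normalization constant, the single-band input, and the function name `f_goodness` are assumptions made for illustration, not details given in the text.

```python
import numpy as np


def f_goodness(image, labels):
    """F-style index, sqrt(R) * sum_i(e_i^2 / sqrt(A_i)); lower is better."""
    image = np.asarray(image, dtype=float)
    labels = np.asarray(labels)
    region_ids = np.unique(labels)
    total = 0.0
    for rid in region_ids:
        pixels = image[labels == rid]
        area = pixels.size
        # e_i^2: summed squared deviation from the object mean.
        squared_error = np.sum((pixels - pixels.mean()) ** 2)
        total += squared_error / np.sqrt(area)
    # sqrt(R) penalizes segmentations with many objects.
    return float(np.sqrt(region_ids.size) * total)


# Toy image: left half is 1, right half is 5.
image = np.array([[1, 1, 5, 5], [1, 1, 5, 5]])
correct = np.array([[0, 0, 1, 1], [0, 0, 1, 1]])   # two objects on the true boundary
merged = np.zeros((2, 4), dtype=int)               # one object (under-segmented)
per_pixel = np.arange(8).reshape(2, 4)             # one object per pixel

scores = {
    "correct": f_goodness(image, correct),
    "merged": f_goodness(image, merged),
    "per_pixel": f_goodness(image, per_pixel),
}
```

Under these assumptions, the per-pixel segmentation scores exactly as well as the correct one, because every single-pixel object has zero squared error. This is the kind of over-segmentation bias that the later F′ and Q indexes were designed to reduce.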

WV considers

Figure 5. Hierarchical map of commonly used homogeneity goodness indexes.

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING

October 2018
