PE&RS August 2017 Public - page 541

A Relative Evaluation of Random Forests for
Land Cover Mapping in an Urban Area
Di Shi and Xiaojun Yang
Abstract
Random forests as a novel ensemble learning algorithm have
significant potential for land cover mapping in complex areas
but have not been sufficiently tested by the remote sensing
community relative to some more popular pattern classifiers.
In this research, we implemented random forests as a pattern
classifier for land cover mapping from a satellite image covering
a complex urban area, and evaluated the performance relative to
several popular classifiers including Gaussian maximum likelihood
(GML), multi-layer-perceptron networks (MLP), and support vector
machines (SVM). Each classifier was carefully configured with the
parameter settings recommended by recent literature, and identical
training data were used in each classification. The accuracy of
each classified map was further evaluated using identical
reference data.
Random forests were slightly more accurate than SVM and MLP but
significantly better than GML in overall map accuracy. Random
forests and support vector machines generated almost identical
overall map accuracy, but the former produced a smaller standard
deviation of categorical accuracies, suggesting their better
overall capability in classifying both homogeneous and
heterogeneous land cover classes. Random forests have shown their
robustness through the most accurate classification on the whole,
relatively balanced performance across all land cover categories,
and relative ease of implementation. These findings should help
promote the use of random forests for land cover classification
in complex areas.
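
The two summary measures named in the abstract, overall map accuracy and the standard deviation of categorical accuracies, can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: "categorical accuracy" is computed here as per-class producer's accuracy, the population standard deviation is used, and both confusion matrices are hypothetical toy values, not results from the study.

```python
from statistics import pstdev

def overall_accuracy(cm):
    """Share of all reference samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def per_class_accuracy(cm):
    """Per-class producer's accuracy (assumed: rows = reference classes)."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]

def summarize(cm):
    """Return (overall accuracy, population std. dev. of per-class accuracies)."""
    return overall_accuracy(cm), pstdev(per_class_accuracy(cm))

# Hypothetical confusion matrices (rows = reference, columns = predicted).
# Both classify 135 of 150 samples correctly, but the errors are spread
# differently across the three classes.
balanced = [[45, 5, 0], [5, 45, 0], [0, 5, 45]]
uneven = [[50, 0, 0], [0, 50, 0], [15, 0, 35]]

for name, cm in (("balanced", balanced), ("uneven", uneven)):
    overall, spread = summarize(cm)
    print(f"{name}: overall={overall:.3f}, std of class accuracies={spread:.3f}")
```

In this toy case the two maps share the same overall accuracy, yet the second concentrates its errors in one class and therefore shows a larger spread, which is the kind of difference the abstract describes between two classifiers with almost identical overall accuracy.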
Introduction
Land-cover classification inventories the elements of the Earth's
surface, supporting various environmental planning and resource
management applications (Foley et al., 2005). Remote sensor
imagery associated with pattern recognition techniques can help
automate the land-cover classification process (Townsend, 1992;
Vogelmann et al., 1998; Yang and Lo, 2002; Knorn et al., 2009;
Hansen and Loveland, 2012; Liu and Yang, 2015). Nevertheless,
remote sensing-based land cover mapping in heterogeneous
environments, such as urban areas, has been challenging due to
the presence of spectrally and spatially complex surface features
(Jensen and Cowen, 1999; Herold et al., 2003; Franke et al., 2009;
Adam et al., 2014).
Conventional classification methods, such as the maximum
likelihood classifier and unsupervised clustering, are largely
built upon parametric statistics, which generally work well in
relatively homogeneous areas, but not in heterogeneous regions or
when scenes contain severe noise due to the increase in image
resolution (Jensen, 2014). Over the years, substantial research
efforts have been made to improve image classification
performance, particularly in spectrally and spatially complex
areas (e.g., Civco, 1993; Skidmore et al., 1997; Huang et al.,
2002; Del Frate et al., 2007; Knorn et al., 2009; Liu and Yang,
2013; Akbari et al., 2016).
This research focuses on random forests (RF), a novel ensemble
learning algorithm that has been increasingly used in the remote
sensing community. The fundamental idea of using random forests
for image classification is to construct multiple full-grown
decision tree classifiers, each of which labels a pixel as a
certain class; the final identity is based on the most popular
vote from all the class labels assigned by individual classifiers
(Breiman, 2001). Critical for random forests is the introduction
of randomness for each individual tree classifier. Specifically,
each individual decision tree is grown using a different, randomly
selected subset of about two-thirds of the training samples, drawn
through a bootstrap sampling process (Breiman, 1996); the left-out
one-third of the data are used for an internal error estimation
called out-of-bag error (Wolpert and Macready, 1999). In addition,
a user-specified number of randomly selected features is used to
split the nodes of individual trees: the best of all binary splits
generated by the randomly selected features is chosen based on the
Gini index or information gain and used to split the node. Due to
the randomized selection procedures, each tree is grown to be
partially independent. Random forests are capable of processing
large and high-dimensional datasets without having to use feature
reduction or selection techniques and are robust to outliers and
noise in training data (Breiman, 2001). They are considered to be
a straightforward, fast and accurate classifier (Biau, 2012). More
discussion about the theoretical underpinnings of random forests
can be found in several seminal works published by Leo Breiman
(e.g., Breiman, 2001; Breiman and Cutler, 2004).
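
The mechanics described above (bootstrap sampling, a random feature subset at each node, Gini-based binary splits, unpruned trees, majority voting, and out-of-bag error) can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the implementation used in this study: all function names are hypothetical, split thresholds are taken as midpoints between consecutive feature values, and the two-band toy samples are invented for demonstration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, n_features, rng):
    """Best binary split (lowest weighted Gini) over a random feature subset."""
    best = None  # (score, feature index, threshold)
    for f in rng.sample(range(len(X[0])), n_features):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # assumed: midpoint thresholds
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow_tree(X, y, n_features, rng):
    """Grow an unpruned tree; leaves are class labels, nodes are tuples."""
    if len(set(y)) == 1:
        return y[0]
    split = best_split(X, y, n_features, rng)
    if split is None:  # sampled features cannot separate the samples
        return Counter(y).most_common(1)[0][0]
    _, f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            grow_tree([X[i] for i in left], [y[i] for i in left], n_features, rng),
            grow_tree([X[i] for i in right], [y[i] for i in right], n_features, rng))

def predict_tree(node, row):
    """Follow splits down to a leaf label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    """Each tree gets its own bootstrap sample; unused samples are out-of-bag."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]  # draw n with replacement
        oob = sorted(set(range(n)) - set(bag))
        tree = grow_tree([X[i] for i in bag], [y[i] for i in bag], n_features, rng)
        forest.append((tree, oob))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree, _ in forest)
    return votes.most_common(1)[0][0]

def oob_error(forest, X, y):
    """Error of majority votes cast only by trees that did not see the sample."""
    votes = {}
    for tree, oob in forest:
        for i in oob:
            votes.setdefault(i, Counter())[predict_tree(tree, X[i])] += 1
    if not votes:
        return float("nan")
    wrong = sum(1 for i, v in votes.items() if v.most_common(1)[0][0] != y[i])
    return wrong / len(votes)

# Hypothetical two-band pixel samples, invented for illustration only.
X = [[0.10, 0.80], [0.15, 0.75], [0.20, 0.90], [0.12, 0.85],
     [0.70, 0.20], [0.80, 0.25], [0.75, 0.10], [0.85, 0.30]]
y = ["vegetation"] * 4 + ["impervious"] * 4

forest = fit_forest(X, y, n_trees=25, n_features=1, seed=0)
print(predict_forest(forest, [0.18, 0.82]))
print(oob_error(forest, X, y))
```

Because each bootstrap sample is drawn with replacement, roughly one-third of the training samples are left out of any given tree, which is what makes the out-of-bag estimate possible without a separate validation set.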
Over the years, various efforts have been made to investigate the
utility of random forests for remote sensor image classification.
Random forests have been used to classify different types of
remote sensor data such as multispectral, hyperspectral,
high-spatial-resolution, and microwave images at the per-pixel
level (e.g., Ham et al., 2005; Lawrence et al., 2006; Chan and
Paelinckx, 2008; Guo et al., 2011; Hayes et al., 2014). They have
also been used for image classification at the sub-pixel level
(e.g., Reschke and Hüttich, 2014) as well as at the object level
(e.g., Puissant et al., 2014). Moreover, random forests have been
used for multitemporal image classification in connection to
change detection and thematic mapping from multi-sensor images
(e.g., Clark et al., 2012; Grinand et al., 2013). These studies
have shown considerable potential of random forests as a suitable
image classifier for use in the remote sensing community.
Along with the above efforts, there have been several studies
investigating the performance of random forests relative to other
image classifiers. For example, Pal (2005) evaluated
Di Shi is with the Department of Geography and Atmospheric
Science, University of Kansas, Lawrence, KS 66045.
Xiaojun Yang is with the Institute of Groundwater and
Earth Sciences, Jinan University, Guangzhou, China; and
the Department of Geography, Florida State University,
Tallahassee, FL 32306.
Photogrammetric Engineering & Remote Sensing
Vol. 83, No. 8, August 2017, pp. 541–552.
0099-1112/17/541–552
© 2017 American Society for Photogrammetry
and Remote Sensing
doi: 10.14358/PERS.83.8.541