PE&RS June 2016 Full - page 438

higher omission error rates compared to commission error rates, there is a need to generate regional, systematic maps of potential locations of unmapped wetlands. The aim of this work is therefore to map the probability of palustrine wetland occurrence (PEM, PFO, and PSS) in West Virginia. We used digital terrain variables generated from a photogrammetrically derived digital elevation model (DEM) and random forest (RF) machine learning classifications. In addition to generating a statewide map of wetland probability, regional maps were produced in order to assess how reliable a probability model generated in one physiographic region is when applied to other regions. Since there is a wide range of potential DEM-derived variables, the importance of different variables in creating the predictive model was also evaluated. The effectiveness of the wetland probability models was evaluated by comparing them against NWI data not used in training the model. In addition, the models were evaluated using a separate dataset of wetlands not identified in the NWI mapping, thus allowing an assessment of the value of the approach for identifying wetlands beyond that of the NWI dataset.
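The omission and commission error rates mentioned above can be made concrete with a small sketch. It assumes binary labels (1 = wetland, 0 = upland) and uses the conventional definitions: omission error is the share of reference wetlands that the map misses, and commission error is the share of mapped wetlands that are not wetlands in the reference data. The function name and the sample labels are hypothetical and are not taken from the study.

```python
def omission_commission(reference, predicted):
    """Return (omission_error, commission_error) for the wetland class (label 1)."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # wetlands correctly mapped
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # wetlands missed by the map
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # uplands mapped as wetland
    omission = fn / (tp + fn) if (tp + fn) else float("nan")
    commission = fp / (tp + fp) if (tp + fp) else float("nan")
    return omission, commission


# Hypothetical labels for eight validation sites (not data from the paper).
reference = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
omission, commission = omission_commission(reference, predicted)
print(f"omission={omission:.2f}, commission={commission:.2f}")
```

In this toy case the omission error exceeds the commission error, which is the pattern the text describes as motivating maps of potentially unmapped wetlands.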
Background
Remote Sensing for Mapping Wetlands
Wetland mapping using classification of remotely sensed data has previously been explored using moderate spatial resolution satellite imagery, such as Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) data (for example, Butera, 1983; Frazier and Page, 2000; Townsend and Walsh, 2001; Baker et al., 2006; Wright and Gallant, 2007; Frohn et al., 2009; Gong et al., 2011), the combination of multispectral and radar imagery (Hess et al., 1990; Hess et al., 1995; Leahy, 2003; Costa and Telmer, 2006; Rosenqvist et al., 2004; Corcoran et al., 2011), high spatial resolution imagery (Xie et al., 2015), and the combination of multispectral imagery and GIS rule-based methods (Sader et al., 1995; Manju et al., 2005). Ozesmi and Bauer (2002) suggest that Landsat Multispectral Scanner (MSS), TM, and Satellite Pour l’Observation de la Terre (SPOT) are the most commonly used sensors for mapping wetlands.
Because wetlands are hydrological features, their distribution is controlled in part by topography and landscape position, which has prompted a number of researchers to investigate the value of terrain variables for mapping wetlands (for example, Treitz and Howarth, 2000; Hogg and Todd, 2007; Wright and Gallant, 2007; Corcoran et al., 2011; Knight et al., 2013; Rampi et al., 2014). Topographic models are of particular interest because they have the potential to overcome the problems previously discussed regarding identifying wetlands in forested areas, where wetlands under the forest canopy may not be visible from above, and therefore may be inherently ambiguous for visual interpretation, but may have a unique topographic signature that can be distinguished using digital terrain data.
Generally, researchers have found that terrain variables derived from DEM data are indeed valuable for mapping wetlands, although there are some inconsistencies in their evaluation of specific DEM-derived variables. For example, Wright and Gallant (2007), in a study in Yellowstone National Park, found that topographic slope was important for differentiating palustrine wetlands from uplands, but that the compound topographic moisture index (CTMI) was of comparatively little value. On the other hand, Rampi et al. (2014) identify CTMI as a key variable for differentiating wetlands from uplands in Minnesota, in the Upper Midwest United States. Knight et al. (2013) found that CTMI and the computationally simpler surface curvature were similarly valuable for predicting the presence of wetlands. The topographic position index, conceptually similar to the dissection measure used in this study, has also been found to be of value for mapping wetlands (Merot et al., 2003; Hogg and Todd, 2007; Bwangoy et al., 2010; De Reu et al., 2013). An additional notable finding from Knight et al. (2013) is that the source of the topographic data was of little importance in their study, since the accuracies of wetlands mapped from the National Elevation Dataset (NED) and from light detection and ranging (lidar) terrain data were not statistically different.
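To make two of these terrain variables concrete, the sketch below computes them for single cells. It assumes the commonly cited form of the compound topographic index, ln(a / tan β), where a is the specific catchment area and β is the local slope, and a topographic position index defined as the center elevation minus the mean elevation of its neighbors. The cited studies may use different formulations, window sizes, or flow-routing methods; the function names, the 3×3 window, and the `min_slope` clamp are illustrative assumptions.

```python
import math


def compound_topographic_index(specific_catchment_area, slope_radians, min_slope=1e-3):
    """ln(a / tan(beta)); near-flat cells are clamped to min_slope (an assumption)."""
    beta = max(slope_radians, min_slope)  # avoids division by zero on flat terrain
    return math.log(specific_catchment_area / math.tan(beta))


def topographic_position_index(window):
    """Center elevation minus the mean of its eight neighbors in a 3x3 window."""
    center = window[1][1]
    neighbors = [window[r][c] for r in range(3) for c in range(3) if (r, c) != (1, 1)]
    return center - sum(neighbors) / len(neighbors)


# Hypothetical cells: a gently sloping cell with a large contributing area,
# and a steep cell with a small contributing area.
wet_candidate = compound_topographic_index(100.0, math.atan(0.01))
dry_candidate = compound_topographic_index(5.0, math.atan(0.5))
print(wet_candidate, dry_candidate)

# A local depression: the center is lower than all of its neighbors.
pit = [[10.0, 10.0, 10.0],
       [10.0, 8.0, 10.0],
       [10.0, 10.0, 10.0]]
print(topographic_position_index(pit))
```

Higher index values and negative position values both point toward low, flat, water-accumulating settings, which is why such variables are candidates for separating wetlands from uplands.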
Prior research has generally focused on traditional “hard” classifications of wetlands, where pixels are labeled in a binary fashion, as either wetlands or uplands. However, wetlands are inherently complex and dynamic landscape features, generally characterized by a gradational interface with the adjacent uplands (Mitsch and Gosselink, 2007). This suggests a traditional hard classification may not necessarily be the optimal approach to mapping wetlands. An example of an alternative conceptualization is provided by Wright and Gallant (2007), who used a combination of Landsat TM imagery and ancillary environmental data, including terrain characteristics derived from a DEM, to generate a probabilistic map of palustrine wetlands in Yellowstone National Park. They demonstrated that the probability models were more informative than per-pixel classifications in capturing the spatial and temporal variability of palustrine wetlands. Their study is also notable for identifying wetlands that were not mapped in the NWI.
Mapping Wetlands with Machine Learning and Random Forests
Machine learning algorithms have emerged as an alternative to parametric classifiers and have been shown to be more accurate and efficient when applied to high-dimensional, complex data spaces (Hansen et al., 1996; Huang et al., 2002; Rogan et al., 2003). Machine learning algorithms, such as artificial neural networks (ANNs) (Del Frate et al., 2003), support vector machines (SVMs) (Pal, 2005; Pal and Mather, 2005), decision trees (DTs) (Waske and Braun, 2009), and RF (Breiman, 2001), are not based on specific data distribution assumptions (Loosvelt et al., 2012). In remote sensing, machine learning algorithms are of interest because they offer the potential to handle complex spectral measurement space, correlated data, and large volumes of data, and require less processing time compared to traditional classifiers (Hansen and Reed, 2000).
A variety of machine learning approaches have been employed in wetland mapping. For example, both Hogg and Todd (2011) and Wright and Gallant (2007) used a DT classification to discriminate uplands and wetlands, with the latter in their conclusions pointing to the need to explore ensemble DT methods. Knight et al. (2013) used boosted classification and regression trees (boosted CART) to map palustrine wetland types in Minnesota, while Baker et al. (2006) used the same algorithm to distinguish wetlands and riparian areas from the rest of the landscape in the Gallatin Valley of Southwest Montana. Xi et al. (2015) used SVMs to map coastal wetlands. RF was previously used for wetland mapping by Corcoran et al. (2011), who used a combination of optical, topographic, and radar data for mapping a variety of freshwater wetland types in Minnesota.
RF, introduced by Breiman (2001), uses multiple DTs to improve upon the accuracy and consistency of single tree classifications. For each tree generated, a random bootstrap sample of the data is drawn with replacement (called “bagging”) instead of using the entire training dataset (Breiman, 1996). In addition, an out-of-bag (OOB) random sample is withheld that can be used for accuracy assessment. In growing each tree in the ensemble, RF uses a random subset of the predictor variables (the number of which is defined by the user). This results in a decrease in the classification accuracy of a single tree; however, the correlation between trees is reduced. As a result, the randomized predictor variable selection reduces the generalization error.
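The mechanics described above (bagging, out-of-bag samples, and a random subset of predictors) can be illustrated with a deliberately small sketch in pure Python. It is not the implementation used in this study: each "tree" is a single-split stump, so drawing the predictor subset once per tree is equivalent to drawing it per split, and all function names, the `mtry` parameter name, and the sample values (slope in degrees, a wetness index) are hypothetical. The share of trees voting "wetland" serves as a simple probability estimate.

```python
import random


def fit_stump(rows, labels, features):
    """Find the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            for polarity in (0, 1):  # polarity = class predicted when value <= t
                errors = sum(
                    (polarity if r[f] <= t else 1 - polarity) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    return best[1:]


def predict_stump(stump, row):
    f, t, polarity = stump
    return polarity if row[f] <= t else 1 - polarity


def fit_forest(rows, labels, n_trees=25, mtry=1, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]     # bootstrap, with replacement
        oob = set(range(n)) - set(in_bag)                 # samples this tree never saw
        features = rng.sample(range(n_features), mtry)    # random predictor subset
        stump = fit_stump([rows[i] for i in in_bag],
                          [labels[i] for i in in_bag], features)
        forest.append((stump, oob))
    return forest


def wetland_probability(forest, row):
    """Fraction of trees voting for class 1 (wetland)."""
    return sum(predict_stump(s, row) for s, _ in forest) / len(forest)


def oob_accuracy(forest, rows, labels):
    """Score each sample using only the trees for which it was out-of-bag."""
    correct = total = 0
    for i, (row, y) in enumerate(zip(rows, labels)):
        votes = [predict_stump(s, row) for s, oob in forest if i in oob]
        if votes:
            total += 1
            correct += int((sum(votes) / len(votes) >= 0.5) == bool(y))
    return correct / total if total else float("nan")


# Hypothetical training cells: (slope_degrees, wetness_index); 1 = wetland.
rows = [(1.0, 12.0), (2.0, 11.0), (1.5, 10.5), (3.0, 9.5),
        (12.0, 5.0), (15.0, 4.0), (20.0, 3.5), (9.0, 6.0)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
forest = fit_forest(rows, labels)
print(wetland_probability(forest, (0.5, 13.0)))   # flat, wet cell
print(wetland_probability(forest, (25.0, 3.0)))   # steep, dry cell
print(oob_accuracy(forest, rows, labels))
```

Restricting each stump to one randomly chosen predictor makes individual stumps weaker but less alike, which is the trade-off the paragraph describes; a full RF grows deep trees and repeats the predictor draw at every split.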
RF has been used for classification in remote sensing (for example, Ghimire et al., 2010; Burkholder
PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING