PE&RS June 2016 Full

higher omission error rates compared to commission error rates, there is a need to generate regional, systematic maps of potential locations of unmapped wetlands. The aim of this work is therefore to map the probability of palustrine wetland occurrence (PEM, PFO, and PSS) in West Virginia. We used digital terrain variables generated from a photogrammetrically derived digital elevation model (DEM) and random forest (RF) machine learning classifications. In addition to generating a statewide map of wetland probability, regional maps were produced in order to assess how reliable a probability model generated in one physiographic region is when applied to other regions. Since there is a wide range of potential DEM-derived variables, the importance of different variables in creating the predictive model was also evaluated. The effectiveness of the wetland probability models was evaluated by comparing them against NWI data not used in training the model. In addition, the models were evaluated using a separate dataset of wetlands not identified in the NWI mapping, thus allowing an assessment of the value of the approach for identifying wetlands beyond that of the NWI dataset.
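
As a rough illustration of the omission and commission error rates referred to above, the following Python sketch computes both for the wetland class from paired reference and predicted labels. The function name and the eight example sites are hypothetical and are not taken from the study. The definitions assumed here are the conventional ones: omission error is the share of reference wetlands that the map misses, and commission error is the share of mapped wetlands that are not wetlands in the reference data.

```python
def error_rates(reference, predicted):
    """Return (omission, commission) error rates for the wetland class.

    reference, predicted: equal-length sequences of booleans (True = wetland).
    """
    pairs = list(zip(reference, predicted))
    true_pos = sum(1 for r, p in pairs if r and p)
    false_neg = sum(1 for r, p in pairs if r and not p)   # missed wetlands
    false_pos = sum(1 for r, p in pairs if not r and p)   # false wetlands

    # Guard against empty denominators (no reference or no predicted wetlands).
    omission = false_neg / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    commission = false_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    return omission, commission


# Hypothetical labels for eight withheld validation sites.
reference = [True, True, True, True, False, False, False, False]
predicted = [True, True, False, False, True, False, False, False]
omission, commission = error_rates(reference, predicted)
```

In this made-up example the map misses two of the four reference wetlands, while only one of its three wetland predictions is wrong, so omission exceeds commission.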

Background

Remote Sensing for Mapping Wetlands

Wetland mapping using classification of remotely sensed data has previously been explored using moderate spatial resolution satellite imagery, such as Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) data (for example, Butera, 1983; Fraziere and Page, 2000; Townsend and Walsh, 2001; Baker et al., 2006; Wright and Gallant, 2007; Frohn et al., 2009; Gong et al., 2011), the combination of multispectral and radar imagery (Hess et al., 1990; Hess et al., 1995; Leahy, 2003; Costa and Telmer, 2006; Rosenqvist et al., 2004; Corcoran et al., 2011), high spatial resolution imagery (Xie et al., 2015), and the combination of multispectral imagery and GIS rule-based methods (Sader et al., 1995; Manju et al., 2005). Ozesmi and Bauer (2002) suggest that Landsat Multispectral Scanner (MSS), TM, and Satellite Pour l’Observation de la Terre (SPOT) are the most commonly used sensors for mapping wetlands.

Because wetlands are hydrological features, their distribution is in part controlled by topography and landscape position, which has prompted a number of researchers to investigate the value of terrain variables for mapping wetlands (for example, Treitz and Howarth, 2000; Hogg and Todd, 2007; Wright and Gallant, 2007; Corcoran et al., 2011; Knight et al., 2013; Rampi et al., 2014). Topographic models are of particular interest because they have the potential to overcome the previously discussed problems of identifying wetlands in forested areas. Wetlands under a forest canopy may not be visible from above, and may therefore be inherently ambiguous for visual interpretation, yet they may have a unique topographic signature that can be distinguished using digital terrain data.

Generally, researchers have found that terrain variables derived from DEM data are indeed valuable for mapping wetlands, although there are some inconsistencies in their evaluation of specific DEM-derived variables. For example, Wright and Gallant (2007), in a study in Yellowstone National Park, found that topographic slope was important for differentiating palustrine wetlands from uplands, but that the compound topographic moisture index (CTMI) was of comparatively little value. On the other hand, Rampi et al. (2014) identify CTMI as a key variable for differentiating wetlands from uplands in Minnesota, in the Upper Midwest United States. Knight et al. (2013) found that CTMI and the computationally simpler surface curvature were similarly valuable for predicting the presence of wetlands. The topographic position index, conceptually similar to the dissection measure used in this study, has also been found to be of value for mapping wetlands (Merot et al., 2003; Hogg and Todd, 2007; Bwangoy et al., 2010; De Reu et al., 2013). An additional notable finding from Knight et al. (2013) is that the source of the topographic data was of little importance in their study, since the accuracies of wetlands mapped from the National Elevation Dataset (NED) and from light detection and ranging (lidar) terrain data were not statistically different.
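
To make the terrain variables named above more concrete, the following Python sketch computes three of them for a single interior cell of a small gridded DEM. It is only an illustration under stated assumptions and is not the processing chain used in any of the cited studies: slope is taken from simple central differences, the topographic position index is taken as the cell elevation minus the mean of its eight neighbours, and the wetness index uses the commonly cited form ln(a / tan β), where a is the specific catchment area and β the slope. The function names, the flat-cell floor of 1e-6, and the example grids are all hypothetical.

```python
import math


def slope_degrees(dem, row, col, cell_size):
    """Slope (degrees) at an interior cell from central differences."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


def topographic_position_index(dem, row, col):
    """Cell elevation minus the mean elevation of its eight neighbours."""
    neighbours = [
        dem[r][c]
        for r in (row - 1, row, row + 1)
        for c in (col - 1, col, col + 1)
        if (r, c) != (row, col)
    ]
    return dem[row][col] - sum(neighbours) / len(neighbours)


def wetness_index(specific_catchment_area, slope_deg):
    """ln(a / tan(beta)); tan(beta) is floored (assumed 1e-6) for flat cells."""
    tan_beta = max(math.tan(math.radians(slope_deg)), 1e-6)
    return math.log(specific_catchment_area / tan_beta)


# Hypothetical 3 x 3 grids (elevations in metres, 1 m cells).
tilted_plane = [[0.0, 1.0, 2.0],
                [0.0, 1.0, 2.0],
                [0.0, 1.0, 2.0]]
depression = [[10.0, 10.0, 10.0],
              [10.0, 8.0, 10.0],
              [10.0, 10.0, 10.0]]

plane_slope = slope_degrees(tilted_plane, 1, 1, cell_size=1.0)
plane_tpi = topographic_position_index(tilted_plane, 1, 1)
pit_tpi = topographic_position_index(depression, 1, 1)
plane_wetness = wetness_index(100.0, plane_slope)
```

The depression yields a negative position index (the cell sits below its surroundings), which is the kind of topographic signature that might be expected to favour wetland occurrence, whereas the uniformly tilted plane yields an index of zero.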

Prior research has generally focused on traditional “hard” classifications of wetlands, where pixels are labeled in a binary fashion, as either wetlands or uplands. However, wetlands are inherently complex and dynamic landscape features, generally characterized by a gradational interface with the adjacent uplands (Mitsch and Gosselink, 2007). This suggests that a traditional hard classification may not necessarily be the optimal approach to mapping wetlands. An example of an alternative conceptualization is provided by Wright and Gallant (2007), who used a combination of Landsat TM imagery and ancillary environmental data, including terrain characteristics derived from a DEM, to generate a probabilistic map of palustrine wetlands in Yellowstone National Park. They demonstrated that the probability models were more informative than per-pixel classifications in capturing the spatial and temporal variability of palustrine wetlands. Their study is also notable for identifying wetlands that were not mapped in the NWI.
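
The contrast between a hard classification and a probability surface can be sketched in a few lines of Python. The example below is hypothetical: the transect values, the 0.5 threshold, and the 0.25 to 0.75 "transitional" band are assumptions chosen for illustration, not values from Wright and Gallant (2007) or from this study.

```python
def harden(probabilities, threshold=0.5):
    """Collapse wetland probabilities into binary wetland/upland labels."""
    return [p >= threshold for p in probabilities]


def transitional_cells(probabilities, low=0.25, high=0.75):
    """Indices of cells whose probability falls in an assumed ambiguous band."""
    return [i for i, p in enumerate(probabilities) if low <= p <= high]


# Hypothetical wetland probabilities along a transect from wetland to upland.
transect = [0.95, 0.90, 0.70, 0.55, 0.45, 0.30, 0.10, 0.05]

hard_labels = harden(transect)
ambiguous = transitional_cells(transect)
```

The binary labels place a sharp boundary between the fourth and fifth cells, while the probabilities show that four cells around that boundary are uncertain, which is the gradational information a hard map discards.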

Mapping Wetlands with Machine Learning and Random Forests

Machine learning algorithms have emerged as an alternative to parametric classifiers and have been shown to be more accurate and efficient when applied to high-dimensional, complex data spaces (Hansen et al., 1996; Huang et al., 2002; Rogan et al., 2003). Machine learning algorithms, such as artificial neural networks (ANNs) (Del Frate et al., 2003), support vector machines (SVMs) (Pal, 2005; Pal and Mather, 2005), decision trees (DTs) (Waske and Braun, 2009), and RF (Breiman, 2001), are not based on specific data distribution assumptions (Loosvelt et al., 2012). In remote sensing, machine learning algorithms are of interest because they offer the potential to handle complex spectral measurement space, correlated data, and large volumes of data, and require less processing time compared to traditional classifiers (Hansen and Reed, 2000).

A variety of machine learning approaches have been employed in wetland mapping. For example, both Hogg and Todd (2011) and Wright and Gallant (2007) used a DT classification to discriminate uplands and wetlands, with the latter pointing in their conclusions to the need to explore ensemble DT methods. Knight et al. (2013) used boosted classification and regression trees (boosted CART) to map palustrine wetland types in Minnesota, while Baker et al. (2006) used the same algorithm to distinguish wetlands and riparian areas from the rest of the landscape in the Gallatin Valley of Southwest Montana. Xi et al. (2015) used SVMs to map coastal wetlands. RF was previously used for wetland mapping by Corcoran et al. (2011), who used a combination of optical, topographic, and radar data for mapping a variety of freshwater wetland types in Minnesota.

RF, introduced by Breiman (2001), uses multiple DTs to improve upon the accuracy and consistency of single tree classifications. For each tree generated, a random bootstrap sample of the data is drawn with replacement (called “bagging”) instead of using the entire training dataset (Breiman, 1996). The observations left out of a tree's bootstrap sample form an out-of-bag (OOB) sample that can be used for accuracy assessment. In growing each tree in the ensemble, RF considers only a random subset of the predictor variables at each split (the size of the subset is defined by the user). This results in a decrease in the classification accuracy of a single tree; however, the correlation between trees is reduced. As a result, the randomized predictor variable selection reduces the generalization error.
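
The two sources of randomness described above can be sketched with the Python standard library alone. This is not a full RF implementation and fits no trees; it only shows, under simplifying assumptions, how a bootstrap sample and its OOB complement might be drawn for each tree and how a random subset of predictors might be picked for one split. The function names, predictor names, sample size, and seed are hypothetical.

```python
import random


def bootstrap_and_oob(n_samples, rng):
    """Draw one bootstrap sample (with replacement) and list the OOB indices."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


def predictors_for_split(predictors, m, rng):
    """Pick m distinct predictors to consider at a single split."""
    return rng.sample(predictors, m)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
predictors = ["slope", "curvature", "ctmi", "dissection"]  # hypothetical names
n_samples = 20  # hypothetical number of training observations

ensemble = []
for _ in range(5):  # five "trees", each with its own bootstrap sample
    in_bag, out_of_bag = bootstrap_and_oob(n_samples, rng)
    first_split = predictors_for_split(predictors, 2, rng)
    ensemble.append((in_bag, out_of_bag, first_split))
```

Because each tree sees a different bootstrap sample and a restricted set of candidate predictors, individual trees differ from one another, which is the decorrelation the paragraph above credits for the lower generalization error. In a real implementation the predictor subset would be redrawn at every split, not only the first.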

RF has been used for classification in remote sensing (for example, Ghimire et al., 2010; Burkholder


June 2016

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING
