PE&RS March 2016 full version - page 193

masked version is then developed by setting all predictions to
zero when $\hat{y} - \hat{\tau} \cdot \mathrm{sd}(\hat{y}) \le 0$. This approach approximates a 95
percent certainty for areas of no canopy cover. To implement,
we used 2,000 bootstrap replicates in our Monte Carlo assessment.
To estimate $\hat{\tau}$ we simply restricted the Monte Carlo
assessment dataset to the observations where the true value
was zero and then estimated $\hat{\tau}$ from that subset. Additionally, we developed 95
percent prediction intervals for all predicted values.
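The masking rule and the interval construction can be sketched for a single pixel as follows. This is a minimal illustration, not the authors' implementation. It assumes that the prediction is the mean of the individual-tree predictions, that sd is the sample standard deviation of those tree-level predictions, and that the interval is symmetric about the prediction. The function names `mask_prediction` and `prediction_interval` are hypothetical.

```python
from statistics import mean, stdev


def mask_prediction(tree_preds, tau):
    """Return (y_hat, sd, masked) for one pixel.

    tree_preds: individual-tree predictions for the pixel (assumed input).
    tau: multiplier estimated from observations whose true value is zero.
    """
    y_hat = mean(tree_preds)   # forest prediction, taken here as the mean over trees
    sd = stdev(tree_preds)     # sample standard deviation of tree-level predictions
    # Masking rule: set the prediction to zero when y_hat - tau * sd <= 0.
    masked = 0.0 if y_hat - tau * sd <= 0 else y_hat
    return y_hat, sd, masked


def prediction_interval(y_hat, sd, tau):
    """Approximate interval y_hat +/- tau * sd (symmetric form is an assumption)."""
    return y_hat - tau * sd, y_hat + tau * sd
```

With tree-level predictions of 0, 2, and 4 and a multiplier of 1.5, the lower bound is negative, so the prediction would be masked to zero.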
Results
Random Forest Models
Using the 500 samples we fit random forest models for
$y_1$, $y_2$, and $y_3$. The pseudo-$R^2$ and root mean square error (RMSE)
for the RFy1 model were 0.85 and 2.85, respectively. For the
RFy2 model the pseudo-$R^2$ was 0.90 and the RMSE was 2.28.
The RFy3 random forest model had a pseudo-$R^2$ of 0.73 and a
RMSE of 10.20.
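For reference, the two fit statistics can be computed as in the sketch below. It assumes the common random forest convention in which the pseudo-$R^2$ is one minus the mean squared error divided by the variance of the observed values (with divisor n), usually evaluated on out-of-bag predictions. The text does not state which convention was used, so the definition here is an assumption, and the function names are hypothetical.

```python
from math import sqrt


def mean_squared_error(observed, predicted):
    """Mean of squared differences between observed and predicted values."""
    n = len(observed)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n


def rmse(observed, predicted):
    """Root mean square error."""
    return sqrt(mean_squared_error(observed, predicted))


def pseudo_r2(observed, predicted):
    """1 - MSE / Var(observed), with the variance computed using divisor n."""
    n = len(observed)
    y_bar = sum(observed) / n
    variance = sum((o - y_bar) ** 2 for o in observed) / n
    return 1 - mean_squared_error(observed, predicted) / variance
```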
Monte Carlo Error
The 500 original samples were resampled via the bootstrap 2,000
times, which yielded a Monte Carlo assessment dataset where
each of the 500 observations had ~740 observed and predicted
values for each of the dependent variables ($y_1$, $y_2$, $y_3$). Figure 3
shows the relationship between the variability in individual
tree-level predictions and the variability of the true error for
each dependent variable. The correlations between these two
quantities summarized for the 500 observations were 0.75,
0.78, and 0.93 for RFy1, RFy2, and RFy3, respectively. The
$\hat{\tau}$ values for constructing approximate 95 percent prediction
intervals were 1.36, 1.18, and 1.31 for RFy1, RFy2, and RFy3,
respectively. These steps are illustrated in Figure 1B and 1D.
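Two quantities in this paragraph lend themselves to a short sketch. First, if each observation contributes an observed and predicted pair only in the replicates where it is left out of the bootstrap sample (an assumption, since this section does not say so explicitly), the expected count per observation is 2,000 × (1 − 1/500)^500, or about 735, which is consistent with the ~740 reported. Second, one plausible way to obtain a multiplier for approximate 95 percent intervals is the empirical 95th percentile of the absolute error divided by the tree-level standard deviation. That estimator is an assumption for illustration, not necessarily the authors' procedure, and the function names are hypothetical.

```python
from math import ceil


def expected_oob_count(n_obs, n_replicates):
    """Expected number of replicates in which a given observation is not drawn."""
    return n_replicates * (1 - 1 / n_obs) ** n_obs


def estimate_tau(observed, predicted, sds, coverage=0.95):
    """Smallest multiplier t with |observed - predicted| <= t * sd
    for at least `coverage` of the cases (cases with sd == 0 are skipped)."""
    ratios = sorted(
        abs(o - p) / s for o, p, s in zip(observed, predicted, sds) if s > 0
    )
    k = ceil(coverage * len(ratios))  # number of cases that must be covered
    return ratios[k - 1]
```

Restricting `observed`, `predicted`, and `sds` to the cases where the true value is zero would give the corresponding multiplier for the masking step under the same assumption.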
Simulated Data: Predicting Values for New Observations and Uncertainty
Predicted maps were made and 95 percent prediction intervals
were examined for the Normal High ($Y_1$), Normal Low ($Y_2$), and
Model Misspecification ($Y_3$) populations. For example, we randomly
selected 100 pixels from the Normal High ($Y_1$) population
Plate 1. Example modeling area east of Atlanta, Georgia. NAIP imagery for the area is shown in panel A. Percent tree canopy cover based
on a random forest model is shown in panel B.
Figure 3. The variability of tree-level predictions versus the variability of the true error for y1, y2, and y3 random forest models based on
Monte Carlo assessment. The solid lines denote linear regression lines and are just for visualization.
PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING
March 2016