m above the ground and that elevation). Calculated pXX values
were interpolated to produce a rasterized surface. Within each
pXX CHM representation of each GLAS footprint, all pixels were
screened for the local maximum height; this value was selected
to represent the maximum height of the entire GLAS footprint.
The local maximum was selected because GLAS theoretically
observes the maximum height within each footprint. Point cloud
(all and first returns) height percentiles were calculated
directly from the point cloud data across the entire GLAS
footprint. Although little difference is expected between ALS
heights derived from the all-return and first-return point
clouds, the effect of intermediate returns on height values in
the former is unknown and cannot be assumed to be negligible.
As a result, ALS heights from both point cloud data sources are
compared with GLAS heights in the results.
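The two ways of summarizing ALS heights within a footprint can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: it assumes a circular footprint of a given radius (the true GLAS footprint shape is not modeled), heights already normalized to the ground, and a linear-interpolation percentile. The function names and the radius value are hypothetical.

```python
import math

def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (q / 100.0) * (len(ordered) - 1)
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (rank - lo) * (ordered[hi] - ordered[lo])

def in_footprint(x, y, cx, cy, radius):
    """Assumption: the footprint is a circle centred on (cx, cy)."""
    return math.hypot(x - cx, y - cy) <= radius

def point_cloud_percentile(points, cx, cy, radius, q):
    """pXX computed directly from (x, y, height) returns in the footprint."""
    heights = [h for x, y, h in points if in_footprint(x, y, cx, cy, radius)]
    return percentile(heights, q)

def chm_local_maximum(pixels, cx, cy, radius):
    """Largest rasterized pXX value among (x, y, value) pixel centres
    that fall inside the footprint."""
    return max(v for x, y, v in pixels if in_footprint(x, y, cx, cy, radius))

# Hypothetical data: three returns inside the footprint, one far outside.
points = [(0, 0, 10.0), (1, 1, 20.0), (2, 0, 30.0), (100, 100, 99.0)]
pixels = [(0, 0, 12.0), (3, 0, 18.5), (50, 0, 40.0)]
p50 = point_cloud_percentile(points, 0, 0, 5.0, 50)
chm_max = chm_local_maximum(pixels, 0, 0, 5.0)
```

The point cloud route yields one percentile per footprint, whereas the CHM route first rasterizes pXX and then takes the maximum pixel, so the two can differ even for the same percentile.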
All GLAS estimates of RH_ROS and RH_100 were compared with ALS
height percentiles to act as control datasets for each
respective height method. Each control dataset was established
to act as a basis against which differences in GLAS/ALS height
comparisons were assessed for statistical differences when GLAS
data were stratified. Assessments of GLAS/ALS height comparisons
for control and stratified datasets are made by using simple
statistics, namely root mean squared error (RMSE), a
modification of the coefficient of determination (R^2), fraction
of predictions within 20 percent of observations (F_20), and
fractional bias (F_B). The gradient of fitted linear models is
employed to determine the difference of the fitted GLAS/ALS
model from an ideal 1:1 fit.
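As an illustration of the summary statistics named above, the sketch below computes each one for paired observations (ALS) and predictions (GLAS). The formulas are assumptions where the text does not spell them out: fractional bias is taken as 2(mean_obs - mean_pred)/(mean_obs + mean_pred), which is one common convention, and the gradient is taken from a least-squares fit of GLAS on ALS with the intercept fixed at zero. The function names are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean squared error between paired observations and predictions."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def f20(obs, pred):
    """Fraction of predictions within 20 percent of the paired observation."""
    hits = sum(1 for o, p in zip(obs, pred) if abs(p - o) <= 0.2 * abs(o))
    return hits / len(obs)

def fractional_bias(obs, pred):
    """Assumed convention: 2 * (mean_obs - mean_pred) / (mean_obs + mean_pred)."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    return 2.0 * (mean_obs - mean_pred) / (mean_obs + mean_pred)

def origin_forced_gradient(obs, pred):
    """Least-squares slope of pred = slope * obs with the intercept fixed at 0."""
    return sum(o * p for o, p in zip(obs, pred)) / sum(o * o for o in obs)

# Hypothetical heights in metres.
als = [10.0, 20.0, 30.0]
glas = [11.0, 20.0, 40.0]
summary = {
    "rmse": rmse(als, glas),
    "f20": f20(als, glas),
    "fb": fractional_bias(als, glas),
    "gradient": origin_forced_gradient(als, glas),
}
```

A gradient of 1 with RMSE = 0 m, F_20 = 1, and F_B = 0 would correspond to the ideal 1:1 fit.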
The produced statistics were analyzed to indicate which ALS
height percentile(s) best relate to GLAS height derivations
(RH_100 and RH_ROS). The best comparisons are found by
evaluating the minimum of the sum of RMSE and F_B, and the
maximum of the sum of R^2 and F_20, across unique ALS height
percentiles for each data source (point cloud or CHM). This step
essentially searches for the fitted models whose summary
statistics are closest to perfect (i.e., RMSE = 0 m, R^2 = 1,
F_20 = 1, and F_B = 0). This method is applicable for assessing
the quality of control datasets and all stratified data
comparisons.
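The selection step can be sketched as below. This is one possible reading of the text rather than a confirmed implementation: the absolute value of F_B is used so that negative and positive biases are penalized alike (an assumption), and the two criteria are reported separately because the text does not say how a disagreement between them is resolved. The percentile labels and statistic values are hypothetical.

```python
def best_percentiles(stats):
    """Return the percentile minimizing RMSE + |F_B| and the percentile
    maximizing R^2 + F_20.

    stats maps a percentile label to a dict with keys
    'rmse', 'fb', 'r2', and 'f20'.
    """
    error_sum = {k: s["rmse"] + abs(s["fb"]) for k, s in stats.items()}
    skill_sum = {k: s["r2"] + s["f20"] for k, s in stats.items()}
    best_by_error = min(error_sum, key=error_sum.get)
    best_by_skill = max(skill_sum, key=skill_sum.get)
    return best_by_error, best_by_skill

# Hypothetical summary statistics for three ALS height percentiles.
example_stats = {
    "p95": {"rmse": 4.0, "fb": -0.05, "r2": 0.80, "f20": 0.70},
    "p99": {"rmse": 3.0, "fb": 0.02, "r2": 0.85, "f20": 0.75},
    "p100": {"rmse": 3.5, "fb": 0.10, "r2": 0.82, "f20": 0.76},
}
by_error, by_skill = best_percentiles(example_stats)
```

Note that RMSE carries units of metres while F_B is dimensionless, so in an unweighted sum the RMSE term will usually dominate.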
The nature of each stratification implies that different sample
sizes (i.e., number of footprints within the stratification) are
present. For example, considering stratification by laser
number, the number of footprints recorded from laser 1 in our
three-site data population is different from the number of
footprints from lasers 2 and 3; such sample size differences are
noted for other stratifications. Differing sample sizes within
particular stratifications may bias results towards larger
sample sizes. The heteroscedasticity within stratifications is
addressed by randomly sampling the largest common denominator of
footprints (with respect to each sub-population) from each
sample, hence yielding a common sample size across
stratifications.
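A minimal sketch of this equal-size sampling follows. It assumes that the "largest common denominator" is the size of the smallest sub-population, so that every stratum is randomly reduced to that size without replacement; the function name, the seed, and the footprint identifiers are hypothetical.

```python
import random

def equalize_strata(strata, seed=0):
    """Randomly subsample every stratum to the size of the smallest one.

    strata maps a stratum name to a list of footprint identifiers.
    Sampling is without replacement; the seed makes the draw repeatable.
    """
    rng = random.Random(seed)
    common_size = min(len(items) for items in strata.values())
    return {name: rng.sample(items, common_size)
            for name, items in strata.items()}

# Hypothetical footprint identifiers per laser.
footprints = {
    "laser1": list(range(24)),
    "laser2": list(range(100, 180)),
    "laser3": list(range(200, 400)),
}
balanced = equalize_strata(footprints, seed=1)
```

Repeating the draw with different seeds would show how sensitive the stratified statistics are to the particular random subset.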
Coefficient of Determination
While they are not identical, a histogram of ALS point cloud
z-values and GLAS waveforms captured over coincident sample
areas should demonstrate similar characteristics, such that
height measures approximating the canopy maximum are expected to
co-vary. In an ideal situation, canopy maximum heights derived
from each method (i.e., RH_100 and pXX) will produce a 1:1
regression relationship. In practice, unknown effects of
signal-to-noise ratio (GLAS footprint) and point spacing (ALS)
are expected to produce some non-deterministic noise in these
relationships.
Here an origin-forcing method is necessitated to eliminate any
offset that exists between observations (ALS) and predictions
(GLAS) at zero; any offset would otherwise propagate to all
height derivations, leading to spurious results. Additionally,
for measurements of identical objects (by similar methods) a 1:1
relationship is expected; hence, allowing an offset at zero has
no physical explanation. By this method, measures of the
coefficient of determination (R^2) often become large and less
meaningful, as the mean of the dependent variable, Ȳ, is set to
zero in its calculation (Eisenhauer, 2003). In the debate on
correlation measures for origin-forced linear models in the
statistical community, Eisenhauer (2003) suggests that
persisting with a non-zero value of Ȳ yields a more meaningful
value for R^2, provided sufficient correlation between data can
be found; this form of R^2 is adopted throughout this study.
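The difference between the two forms of R^2 can be illustrated with the sketch below. It reflects our reading of the text, not a formula quoted from it: the origin-forced slope is the least-squares estimate with zero intercept, the "uncentered" R^2 uses a total sum of squares about zero (Ȳ set to zero), and the "centered" R^2 keeps the total sum of squares about the non-zero sample mean Ȳ. The function name and data are hypothetical.

```python
def origin_forced_fit(x, y):
    """Fit y = slope * x (no intercept) and report both forms of R^2."""
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    residual_ss = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    mean_y = sum(y) / len(y)
    # Total sum of squares about the sample mean (non-zero Y-bar).
    total_ss_centered = sum((yi - mean_y) ** 2 for yi in y)
    # Total sum of squares about zero (Y-bar set to zero).
    total_ss_uncentered = sum(yi ** 2 for yi in y)
    return {
        "slope": slope,
        "r2_centered": 1.0 - residual_ss / total_ss_centered,
        "r2_uncentered": 1.0 - residual_ss / total_ss_uncentered,
    }

# Hypothetical ALS (x) and GLAS (y) canopy heights in metres.
fit = origin_forced_fit([10.0, 20.0, 30.0], [14.0, 19.0, 33.0])
```

Because the sum of squares about zero exceeds the sum of squares about Ȳ whenever Ȳ is non-zero, the uncentered value is never smaller than the centered one for the same fit, which is why it tends to look large. The centered form can become negative when the origin-forced model fits worse than a horizontal line at Ȳ.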
Data Distributions
GLAS data will frequently be influenced by more than one of the
effects investigated; e.g., a footprint can be subject to both
low energy and summertime phenological conditions. To mitigate
bias when performing some of the tests, sample populations are
normalized through random footprint selection to ensure
comparable sample sizes. However, due to operational constraints
associated with missions (e.g., laser 1 was short-lived, whereas
laser 3 exhibited longevity), comparable sample populations are
not achievable in some cases without removing too many
footprints and thus compromising the characterization of an
already relatively small data sample. For transparency, the
relative sample populations associated with each investigative
test are noted in Table 2.
Results
Controls
Control comparisons between ALS height percentiles and GLAS
derivations of RH_100 and RH_ROS across all sites for each ALS
data source (all returns, first returns, and CHM) are evaluated,
where the best comparisons (per ALS data source) are shown in
Figure 4; accompanying summary statistics are given in Table 3
for all comparisons. Data in Figure 4 are illustrated with
origin-forced linear models (so as to negate the introduction of
physical bias) and 90 percent confidence intervals.
ALS representations of canopy height from rasterized CHMs
(Figure 4c) exhibit less variability in summary statistics (as
a function of height percentile) when compared to similar
Table 2. Data distributions of GLAS footprint characteristics
for each investigated test (laser number, phenology, and
transmission energy); the sample population is indicated in
parentheses in each test's column heading, i.e., for laser
number, 24 samples were tested for lasers 1, 2, and 3. The table
is read by row, where the percentage of data from a specific
test is indicated with respect to other tests; e.g., 23 percent
of data employed to test the effect of high transmission
energies were collected during summer, and the remaining 77
percent collected during winter.
Data Percentage [%]

                       Laser Number (24)   Phenology (105)   Transmission Energy (30)
Test                   1     2     3       Summer  Winter    High    Low
Laser Number
  1                    -     -     -       100     0         100     0
  2                    -     -     -       54      46        54      46
  3                    -     -     -       59      41        46      54
Phenology
  Summer               7     74    19      -       -         55      45
  Winter               0     83    17      -       -         54      46
Transmission Energy
  High                 0     13    87      23      77        -       -
  Low                  0     23    77      23      77        -       -
PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING, May 2016