The comparison of results from these two methods shows that VGS tends
to over-segment; that is, details of a complete structure are
segmented as independent parts. In contrast, SVGS tends to
under-segment, retaining multiple objects as a single segment. For
example, adjacent planar facets of the same facade are identified as
one planar surface. Note that in the SVGS result, many small
independent details, such as the frames of neighboring windows, are
merged into one segment, whereas in the VGS result, over-segmented
objects consisting of isolated voxels are removed from the output as
outliers. Naturally, such a removal step in VGS is counterproductive
to the completeness of the output.
For the quantitative evaluation, we compare the proposed methods with
baseline methods against the manually segmented ground truth. In this
test, the voxel resolution used in VGS, SVGS, and LCCP is set to
0.1 m, equal to the radius of normal vector estimation in RG and the
small radius of normal estimation in DON. The seed resolution of
supervoxels in SVGS and LCCP is 0.2 m, equal to the graph size used in
VGS and the large radius of normal vector estimation in DON. The
threshold δ of graph segmentation used in VGS and SVGS is empirically
set to 0.7 and 0.35, respectively. The threshold of normal difference
used in RG and DON for smoothness is set to 15°. For the LCCP method,
both the convexity tolerance and smoothness are set to 3.0. As
displayed in Table 2, the proposed methods outperform the baseline
methods, with F1-measures of about 0.80 (VGS) and 0.82 (SVGS).
Interestingly, the RG method shows results comparable with those of
VGS and SVGS, but as will be discussed later, VGS and SVGS require
less execution time.
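
The parameter settings described above can be collected in one place, which makes the shared fine (0.1 m) and coarse (0.2 m) scales explicit. The following is only a sketch: the dictionary layout, the key names, and the helper `shared_scales` are hypothetical and do not correspond to any published implementation; only the numeric values are taken from the text.

```python
# Hypothetical parameter table; key names are illustrative assumptions,
# only the numeric values are taken from the text.
PARAMS = {
    "VGS": {"voxel_resolution_m": 0.1, "graph_size_m": 0.2, "delta": 0.7},
    "SVGS": {"voxel_resolution_m": 0.1, "seed_resolution_m": 0.2, "delta": 0.35},
    "LCCP": {
        "voxel_resolution_m": 0.1,
        "seed_resolution_m": 0.2,
        "convexity_tolerance": 3.0,  # units not stated in the text
        "smoothness": 3.0,           # units not stated in the text
    },
    "RG": {"normal_radius_m": 0.1, "normal_difference_deg": 15.0},
    "DON": {
        "small_radius_m": 0.1,
        "large_radius_m": 0.2,
        "normal_difference_deg": 15.0,
    },
}


def shared_scales(params):
    """Return (fine, coarse) scale in metres if all methods agree on them."""
    fine = {
        params["VGS"]["voxel_resolution_m"],
        params["SVGS"]["voxel_resolution_m"],
        params["LCCP"]["voxel_resolution_m"],
        params["RG"]["normal_radius_m"],
        params["DON"]["small_radius_m"],
    }
    coarse = {
        params["VGS"]["graph_size_m"],
        params["SVGS"]["seed_resolution_m"],
        params["LCCP"]["seed_resolution_m"],
        params["DON"]["large_radius_m"],
    }
    if len(fine) != 1 or len(coarse) != 1:
        raise ValueError("methods do not share the same fine/coarse scales")
    return fine.pop(), coarse.pop()
```

Calling `shared_scales(PARAMS)` raises an error if one method is configured at a different scale from the others, which guards against an unintentionally unfair comparison.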
Table 2. Evaluation of segmentation results of Sample 1 in the building facade dataset.
Method   Laser scanned
         Precision   Recall   F1
RG       0.8857      0.6957   0.7793
DON      0.5560      0.6160   0.5844
LCCP     0.6523      0.6840   0.6677
VGS      0.8562      0.7559   0.8029
SVGS     0.8403      0.8084   0.8240
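
As a consistency check on Table 2, the F1 column can be recomputed from the precision and recall columns. The sketch below assumes that the F1-measure is the usual harmonic mean of precision and recall, which is not restated in this section; the function and variable names are illustrative.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 2; the last row is SVGS.
TABLE_2 = {
    "RG": (0.8857, 0.6957, 0.7793),
    "DON": (0.5560, 0.6160, 0.5844),
    "LCCP": (0.6523, 0.6840, 0.6677),
    "VGS": (0.8562, 0.7559, 0.8029),
    "SVGS": (0.8403, 0.8084, 0.8240),
}


def max_f1_deviation(table) -> float:
    """Largest absolute gap between recomputed and reported F1 values."""
    return max(abs(f1_measure(p, r) - f1) for p, r, f1 in table.values())


if __name__ == "__main__":
    for name, (p, r, f1) in TABLE_2.items():
        print(f"{name:5s} recomputed={f1_measure(p, r):.4f} reported={f1:.4f}")
```

Under this assumption, the recomputed values agree with the reported ones to within about 0.0001, which is consistent with precision and recall having been rounded to four decimals.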
Results of the Construction Site Scene (TUM Dataset)
For the tests conducted on the construction site scene, Figures 9c
through 9f depict the segmentation results obtained from the VGS and
SVGS methods on both the lidar and photogrammetric point clouds. As
for the building facades, the segments are rendered with varying gray
values. The parameters of the methods are the same as those used for
the test on the building facade dataset. The construction site scene
appears to be more complex than the building facade scene, which
significantly increases the difficulty of segmentation. This
hypothesis is backed by the obtained results, which are apparently
inferior to those of the building facade tests. Comparing the results
for lidar and photogrammetric point clouds by visual inspection, the
lidar data seem to yield generally much better results for segmenting
the major structures of the given point cloud. One possible
explanation is that, in contrast to lidar points, our photogrammetric
point clouds normally have a somewhat higher percentage of outliers
caused by matching errors during the stereo matching process, which
may affect the segmentation result. Additionally, with respect to the
preservation of concave and “stair-like” connections, the VGS method
performs better than the SVGS method on the photogrammetric dataset.
One possible reason is that, for the SVGS method, the supervoxels are
clustered based on normal vectors, which are sensitive to the
stronger noise and outliers in our photogrammetric point clouds.
Quantitative evaluations are given in Tables 3 and 4. For both
samples, the VGS and SVGS methods outperform the other methods, with
F1-measures exceeding 0.67 for both the laser scanning and
photogrammetric datasets. For the laser scanning datasets, all the
methods show similar performance for both samples. However, for the
photogrammetric dataset, the results of Sample 3 are inferior to
those of Sample 2 for all methods except LCCP. One possible reason is
that the errors in the photogrammetric point cloud are more
pronounced than those in the laser scanning point cloud. This is
mainly because of the limited observing positions for acquiring
optical images of the construction site: we could obtain images of
different parts of the scene only from certain positions, so that the
distance from the camera to the objects in Sample 3 is larger than in
Sample 2. A larger object distance decreases the ground resolution of
the image (i.e., enlarges the footprint of a pixel) and generates
sparser point clouds. This may decrease the quality of the
segmentation result, because a sparse point density may, to some
degree, affect the reliability of the eigenvalues computed from the
points.
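
The relation between object distance and pixel footprint can be illustrated with a simple pinhole camera model. This is a sketch under assumptions that are not stated in the text: a fronto-parallel surface, one reconstructed 3D point per pixel at most, and hypothetical camera values (pixel pitch, focal length, distances) chosen only for illustration.

```python
def pixel_footprint_m(object_distance_m, pixel_pitch_m, focal_length_m):
    """Side length of one pixel's footprint on a fronto-parallel surface."""
    if min(object_distance_m, pixel_pitch_m, focal_length_m) <= 0:
        raise ValueError("all inputs must be positive")
    # Similar triangles in the pinhole model: footprint / distance = pitch / focal.
    return object_distance_m * pixel_pitch_m / focal_length_m


def max_points_per_m2(object_distance_m, pixel_pitch_m, focal_length_m):
    """Upper bound on point density if every pixel yielded one 3D point."""
    footprint = pixel_footprint_m(object_distance_m, pixel_pitch_m, focal_length_m)
    return 1.0 / (footprint * footprint)


# Hypothetical camera: 5 micrometre pixels, 35 mm lens, two object distances.
PITCH_M, FOCAL_M = 5e-6, 0.035
near = pixel_footprint_m(20.0, PITCH_M, FOCAL_M)
far = pixel_footprint_m(40.0, PITCH_M, FOCAL_M)
```

In this model, doubling the object distance doubles the pixel footprint and reduces the attainable point density by a factor of four, which is in line with the sparser point cloud described for Sample 3.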
Table 3. Evaluation results of Sample 2 in the construction site dataset.
Method   Laser scanned                 Photogrammetric
         Precision   Recall   F1       Precision   Recall   F1
RG       0.6098      0.5799   0.5945   0.6371      0.6807   0.6582
DON      0.5875      0.5160   0.5495   0.5649      0.5269   0.5452
LCCP     0.5950      0.5250   0.5578   0.6104      0.5694   0.5892
VGS      0.7105      0.7077   0.7091   0.7655      0.7306   0.7476
SVGS     0.7205      0.7283   0.7244   0.7163      0.7420   0.7289
Interestingly, for the Sample 2 point cloud, the F1-measures of both
the VGS and SVGS methods are even higher for the photogrammetric
dataset than for the lidar data. These numbers are inconclusive,
however, because the manually segmented ground truth of the
photogrammetric dataset differs from (is coarser than) that of the
lidar dataset. The photogrammetric dataset we used is difficult to
segment manually, even for a human operator, because of its
relatively poor quality. Although the quality of the ground truth
leads to counterintuitive F1-measures, the ranking of the methods for
identical input data still shows that the proposed methods perform
best.
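
The ranking argument can be made concrete by sorting the methods by F1-measure separately for each input dataset, so that only results computed against the same ground truth are compared. The sketch below uses the F1 values of Table 3; the helper name `rank_methods` and the column constants are illustrative.

```python
# F1-measures copied from Table 3 (Sample 2): (laser scanned, photogrammetric).
TABLE_3_F1 = {
    "RG": (0.5945, 0.6582),
    "DON": (0.5495, 0.5452),
    "LCCP": (0.5578, 0.5892),
    "VGS": (0.7091, 0.7476),
    "SVGS": (0.7244, 0.7289),
}

LIDAR, PHOTO = 0, 1  # column indices into the tuples above


def rank_methods(f1_by_method, column):
    """Method names sorted from highest to lowest F1 in the given column."""
    return sorted(f1_by_method, key=lambda m: f1_by_method[m][column], reverse=True)


lidar_ranking = rank_methods(TABLE_3_F1, LIDAR)
photo_ranking = rank_methods(TABLE_3_F1, PHOTO)
```

Although the absolute F1 values differ between the two columns, VGS and SVGS occupy the first two places in both rankings, and the order of the three baseline methods is the same.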
Table 4. Evaluation results of Sample 3 in the construction site dataset.
Method   Laser scanned                 Photogrammetric
         Precision   Recall   F1       Precision   Recall   F1
RG       0.6687      0.6532   0.6609   0.56286     0.5837   0.5731
DON      0.5745      0.5324   0.5526   0.5113      0.5446   0.5275
LCCP     0.6743      0.5676   0.6164   0.6144      0.6012   0.6077
VGS      0.7960      0.7113   0.7513   0.6843      0.6674   0.6758
SVGS     0.8404      0.7301   0.7814   0.7054      0.6413   0.6718
Influence of Parameter Variation
To fully investigate the performance of the proposed method, we
generate precision-recall (PR) curves of segmentation results using
manually segmented samples for both the proposed approaches and
different baseline methods, by
386
June 2018
PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING