PE&RS June 2018 Full

The comparison of results from these two methods shows that VGS tends to over-segment; that is, details of a complete structure are segmented as independent parts. In contrast, SVGS tends to under-segment, retaining multiple objects as one single segment. For example, adjacent planar facets of the same facade are identified as one planar surface. Note that in the result of SVGS, many small independent details, such as the frames of neighboring windows, are merged into one segment, whereas in the case of VGS, over-segmented objects consisting of isolated voxels are removed from the output as outliers. Naturally, such a removal step in VGS is detrimental to the completeness of the output.

For the quantitative evaluation, we compare the proposed methods with the baseline methods against the manually segmented ground truth. In this test, the voxel resolution used in VGS, SVGS, and LCCP is set to 0.1 m, equal to the radius of normal vector estimation in RG and the small radius of normal estimation in DON. The seed resolution of supervoxels in SVGS and LCCP is 0.2 m, equal to the graph size used in VGS and the large radius of normal vector estimation in DON. The threshold δ of graph segmentation used in VGS and SVGS is empirically set to 0.7 and 0.35, respectively. The threshold of normal difference used in RG and DON for smoothness is set to 15°. For the LCCP method, both the convexity tolerance and smoothness are set to 3.0. As displayed in Table 2, the proposed methods outperform the baseline methods, with F1-measures reaching approximately 0.81. Interestingly, the RG method shows results comparable with those of VGS and SVGS but, as will be discussed later, VGS and SVGS require less execution time and are therefore more computationally efficient.
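For reproducibility, the parameter settings listed above can be gathered in a single table. The following Python sketch is illustrative only: the dictionary layout, the key names (for example `voxel_resolution_m`), and the helper `length_scales` are hypothetical and do not correspond to the API of any particular library. Only the numeric values are taken from the text.

```python
# Hypothetical parameter table. Key names are illustrative assumptions;
# the values are those stated in the text (lengths in meters, angles in degrees).
PARAMS = {
    "VGS": {"voxel_resolution_m": 0.1, "graph_size_m": 0.2, "delta": 0.7},
    "SVGS": {"voxel_resolution_m": 0.1, "seed_resolution_m": 0.2, "delta": 0.35},
    "LCCP": {
        "voxel_resolution_m": 0.1,
        "seed_resolution_m": 0.2,
        "convexity_tolerance": 3.0,
        "smoothness": 3.0,
    },
    "RG": {"normal_radius_m": 0.1, "normal_diff_threshold_deg": 15.0},
    "DON": {
        "small_radius_m": 0.1,
        "large_radius_m": 0.2,
        "normal_diff_threshold_deg": 15.0,
    },
}


def length_scales(params):
    """Collect the length-valued settings (keys ending in '_m') of each method."""
    return {
        method: sorted(v for k, v in settings.items() if k.endswith("_m"))
        for method, settings in params.items()
    }
```

Calling `length_scales(PARAMS)` makes the matching of scales across methods visible: every method operates at the 0.1 m scale, and all except RG also use a 0.2 m scale.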

Table 2. Evaluation of segmentation results of Sample 1 in the building facade dataset.

Method   Laser scanned
         Precision  Recall   F1
RG       0.8857     0.6957   0.7793
DON      0.5560     0.6160   0.5844
LCCP     0.6523     0.6840   0.6677
VGS      0.8562     0.7559   0.8029
SVGS     0.8403     0.8084   0.8240
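The F1 column can be cross-checked against the precision and recall columns. The sketch below assumes the standard definition of the F1-measure as the harmonic mean of precision and recall, which this section does not state explicitly. The names `f1_score`, `TABLE_2_ROWS`, and `max_f1_deviation` are hypothetical, and only the first four rows of Table 2 are used.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the first four rows of Table 2.
TABLE_2_ROWS = {
    "RG": (0.8857, 0.6957, 0.7793),
    "DON": (0.5560, 0.6160, 0.5844),
    "LCCP": (0.6523, 0.6840, 0.6677),
    "VGS": (0.8562, 0.7559, 0.8029),
}


def max_f1_deviation(rows):
    """Largest absolute gap between the recomputed and the reported F1."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())
```

Under this assumption, the recomputed values agree with the reported ones to within rounding, which suggests that the table uses the usual harmonic-mean definition.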

Results of the Construction Site Scene (TUM Dataset)

For the tests conducted on the construction site scene, Figures 9c through 9f depict the segmentation results obtained from the VGS and SVGS methods on both the lidar and photogrammetric point clouds. As in the building facade results, the segments are rendered with varying gray values. The parameters of the methods are the same as those used for the test on the building facade dataset. The construction site scene appears to be more complex than the building facade scene, which significantly increases the difficulty of segmentation. This hypothesis is supported by the obtained results, which are visibly inferior to those of the building facade tests. Comparing the results for the lidar and photogrammetric point clouds by visual inspection, it seems that for segmenting major structures of the given point cloud, the result obtained with lidar data is generally much better. One possible explanation is that, in contrast to the lidar points, our photogrammetric point clouds normally have a somewhat higher percentage of outliers caused by errors in the stereo matching process, which may affect the segmentation result. Additionally, with respect to the preservation of concave and “stair-like” connections, the VGS method performs better than the SVGS method on the photogrammetric dataset. One possible reason is that in the SVGS method the supervoxels are clustered based on normal vectors, which are sensitive to the stronger noise and outliers in our photogrammetric point clouds.

Quantitative evaluations are given in Tables 3 and 4. For both samples, the VGS and SVGS methods outperform the other methods, with F1-measures exceeding 0.67 for both the laser scanning and photogrammetric datasets. For the laser scanning datasets, all the methods show similar performance for both samples. However, for the photogrammetric dataset, the results of Sample 3 are inferior to those of Sample 2 for all methods. One possible reason is that the errors in the photogrammetric point cloud are more pronounced than those in the laser scanning point cloud. This is mainly because of the limited observing positions for acquiring optical images of the construction site; that is, we can only obtain images of different parts of the scene from certain positions, so the distance from the camera to objects in Sample 3 is larger than in Sample 2. A larger object distance coarsens the ground resolution of the image (i.e., enlarges the footprint of each pixel), generating sparser point clouds, which may decrease the quality of the segmentation result because, to some degree, the sparse point density may affect the reliability of the eigenvalues computed from the points.
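The dependence of ground resolution on object distance can be illustrated with the standard pinhole-camera relation, footprint = pixel pitch × object distance / focal length. The sketch below is only an illustration under that assumption: the function name `pixel_footprint_m` and all camera values (pixel pitch, focal length, distances) are hypothetical and are not taken from the paper.

```python
def pixel_footprint_m(pixel_pitch_m: float, focal_length_m: float,
                      object_distance_m: float) -> float:
    """Footprint of one pixel on a fronto-parallel surface (pinhole model)."""
    if focal_length_m <= 0.0:
        raise ValueError("focal length must be positive")
    return pixel_pitch_m * object_distance_m / focal_length_m


# Hypothetical camera: 5 micrometer pixel pitch, 35 mm focal length.
near = pixel_footprint_m(5e-6, 0.035, 20.0)  # object at 20 m
far = pixel_footprint_m(5e-6, 0.035, 40.0)   # object at 40 m
```

In this model the footprint grows linearly with distance, so doubling the object distance doubles the footprint. If roughly one 3D point is reconstructed per pixel, the point density per unit area would then drop by about a factor of four, which is consistent with the sparser point cloud described for Sample 3.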

Table 3. Evaluation results of Sample 2 in the construction site dataset.

Method   Laser scanned                  Photogrammetric
         Precision  Recall   F1         Precision  Recall   F1
RG       0.6098     0.5799   0.5945     0.6371     0.6807   0.6582
DON      0.5875     0.5160   0.5495     0.5649     0.5269   0.5452
LCCP     0.5950     0.5250   0.5578     0.6104     0.5694   0.5892
VGS      0.7105     0.7077   0.7091     0.7655     0.7306   0.7476
SVGS     0.7205     0.7283   0.7244     0.7163     0.7420   0.7289

Interestingly, for the Sample 2 point cloud, the F1-measures obtained on the photogrammetric dataset appear to be even better than those on the lidar data for both the VGS and SVGS methods. However, these numbers are inconclusive, because the manually segmented ground truth of the photogrammetric dataset differs from (is coarser than) that of the lidar dataset. The photogrammetric dataset we used is difficult to segment manually, even for a human operator, because of its relatively poor quality. Although the quality of the ground truth results in counterintuitive values for the F1-measures, the ranking of the methods achieved for identical input data still shows that the proposed methods perform best.
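The ranking argument can be checked directly from the F1 columns of Table 3. The following sketch sorts the methods by F1-measure separately for each input dataset; the names `F1_SAMPLE_2`, `rank_methods`, and `RANKINGS` are hypothetical, and the values are copied from Table 3.

```python
# F1-measures for Sample 2 as listed in Table 3.
F1_SAMPLE_2 = {
    "laser": {
        "RG": 0.5945, "DON": 0.5495, "LCCP": 0.5578,
        "VGS": 0.7091, "SVGS": 0.7244,
    },
    "photogrammetric": {
        "RG": 0.6582, "DON": 0.5452, "LCCP": 0.5892,
        "VGS": 0.7476, "SVGS": 0.7289,
    },
}


def rank_methods(f1_by_method):
    """Return the method names ordered from highest to lowest F1-measure."""
    return sorted(f1_by_method, key=f1_by_method.get, reverse=True)


# Rankings are computed per dataset because the ground truths differ.
RANKINGS = {name: rank_methods(scores) for name, scores in F1_SAMPLE_2.items()}
```

Comparing methods only within one dataset avoids mixing scores that were computed against different ground truths. In both rankings VGS and SVGS occupy the first two places, followed by RG, LCCP, and DON.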

Table 4. Evaluation results of Sample 3 in the construction site dataset.

Method   Laser scanned                  Photogrammetric
         Precision  Recall   F1         Precision  Recall   F1
RG       0.6687     0.6532   0.6609     0.56286    0.5837   0.5731
DON      0.5745     0.5324   0.5526     0.5113     0.5446   0.5275
LCCP     0.6743     0.5676   0.6164     0.6144     0.6012   0.6077
VGS      0.7960     0.7113   0.7513     0.6843     0.6674   0.6758
SVGS     0.8404     0.7301   0.7814     0.7054     0.6413   0.6718

Influence of Parameter Variation

To fully investigate the performance of the proposed method, we generate precision-recall (PR) curves of segmentation results using manually segmented samples for both the proposed approaches and different baseline methods, by


June 2018

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING
