Classification of Aerial Photogrammetric 3D Point Clouds
C. Becker, E. Rosinskaya, N. Häni, E. d’Angelo, and C. Strecha
Abstract
We present a powerful method to extract per-point semantic class labels from aerial photogrammetry data. Labeling this kind of data is important for tasks such as environmental modeling, object classification, and scene understanding. Unlike previous point cloud classification methods that rely exclusively on geometric features, we show that incorporating color information yields a significant increase in accuracy in detecting semantic classes. We test our classification method on four real-world photogrammetry datasets that were generated with Pix4Dmapper and have varying point densities. We show that off-the-shelf machine learning techniques coupled with our new features allow us to train highly accurate classifiers that generalize well to unseen data, processing point clouds containing 10 million points in less than three minutes on a desktop computer. We also demonstrate that our approach can be used to generate accurate Digital Terrain Models, outperforming approaches based on simpler heuristics such as Maximally Stable Extremal Regions.
Introduction
Extraction of semantic information from point clouds enables us to understand a scene, classify objects, and generate high-level models with CAD-like geometries from them. It can also provide a significant improvement to existing algorithms, such as those used to construct Digital Terrain Models (DTMs) from Digital Surface Models (DSMs) (Unger et al., 2009). With the growing popularity of laser scanners, the availability of drones as surveying tools, and the rise of commercial photogrammetry software capable of generating millions of points from images, there is an increasing need for fully automated extraction of semantic information from this kind of data. Although some commercial photogrammetry packages available today offer tools such as automated DTM extraction (Pix4Dmapper, 2017; Photoscan, 2017), semantic classification is typically left to specialized software packages (eCognition, 2017; GlobalMapper, 2017) that rely on 2.5D orthomosaics and DSMs as input.
The need for semantic modeling of 3D point data has inspired many research and application engineers to model specific structures. Often the proposed solutions were handcrafted for the application at hand: buildings have been modeled using common image processing techniques such as edge detection (Haala et al., 1998; Brenner, 2000) or by fitting planes to point clouds (Rusu et al., 2007); road networks have been modeled with handcrafted features; and DTM algorithms have used heuristics about the size of objects to create a DTM from a DSM. While successful and valuable, these approaches are inherently limited because they cannot easily be applied to detect new classes of objects. The huge boost in the performance of machine learning algorithms in recent years allows for more flexible and general learning and classification algorithms. If supervised or semi-supervised learning, and especially classification, becomes fast and reliable, machine learning approaches to point cloud classification will find their way into common photogrammetric workflows. We therefore focus here on machine learning techniques that allow users to detect object categories of their own choice.
In this paper we present a method to classify aerial photogrammetry point clouds. Our approach exploits both geometric and color information to classify individual points as belonging to one of the following classes from the LAS standard: buildings, terrain, high vegetation, roads, human-made objects (HMO), or cars. Unlike previous point cloud classification methods that rely exclusively on geometric features, we show that incorporating color information yields a significant increase in accuracy.
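As a rough illustration of this idea (not the paper's exact pipeline), the sketch below combines covariance-based geometric features of each point's local neighborhood with its RGB color and trains an off-the-shelf classifier on the result. The neighborhood size k, the three eigenvalue-based features, and the choice of a random forest are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree
from sklearn.ensemble import RandomForestClassifier

def point_features(xyz, rgb, k=20):
    """Concatenate covariance-based geometric features with RGB color."""
    tree = cKDTree(xyz)
    _, idx = tree.query(xyz, k=k)                   # k nearest neighbors per point
    feats = []
    for nbrs in idx:
        cov = np.cov(xyz[nbrs].T)                   # 3x3 local covariance
        w = np.sort(np.linalg.eigvalsh(cov))[::-1]  # eigenvalues, descending
        w = w / (w.sum() + 1e-12)
        linearity = (w[0] - w[1]) / (w[0] + 1e-12)
        planarity = (w[1] - w[2]) / (w[0] + 1e-12)
        sphericity = w[2] / (w[0] + 1e-12)
        feats.append([linearity, planarity, sphericity])
    return np.hstack([np.asarray(feats), rgb / 255.0])

# Hypothetical labeled training data: positions, colors, per-point labels.
xyz = np.random.rand(1000, 3)
rgb = np.random.randint(0, 256, (1000, 3))
labels = np.random.randint(0, 6, 1000)              # six classes, as in the paper

clf = RandomForestClassifier(n_estimators=100)
clf.fit(point_features(xyz, rgb), labels)
```

Appending the color channels to the geometric descriptor is the simplest way to let the classifier exploit both cues; the key point is that the color columns carry information the covariance features cannot.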
We evaluate our approach on four challenging datasets and show that off-the-shelf machine learning techniques together with our new features result in highly accurate and efficient classifiers that generalize well to unseen data. The datasets used for evaluation are publicly available at pix4d.com/research.
Moreover, we show that our classification approach can be used to generate accurate Digital Terrain Models without the need for hand-designed heuristics such as Maximally Stable Extremal Regions (MSER) detection on a Digital Surface Model.
Related Work
Methods used to extract semantic information from point clouds can be split into two groups: those that try to segment coherent objects from a scene, and those that focus on assigning an individual class label to each point. Early works using the first approach often converted the point data into a regular 2.5D height grid so that standard image processing techniques, e.g., edge detection, could be applied (Haala et al., 1998; Haala and Brenner, 1999; Wang and Schenk, 2000). A scan-line-based approach (Sithole and Vosselman, 2003) was proposed for structure detection in urban scenes. Building extraction approaches typically use geometric primitives during the segmentation step. A multitude of such primitives has been proposed, both in 2D, such as planes and polyhedra (Vosselman et al., 2001; Dorninger and Nothegger, 2007), and in 3D (Lafarge and Mallet, 2012; Xiao and Furukawa, 2014). In Rusu et al. (2007) the authors fit sampled parametric models to the data for object recognition. Similarly, Oesau et al. (2016) investigate supervised machine learning techniques to represent small indoor datasets with planar models for object recognition.
C. Becker, E. Rosinskaya, E. d'Angelo, and C. Strecha are with Pix4D SA, EPFL Innovation Park, Building F, 1015 Lausanne, Switzerland. N. Häni is with the University of Minnesota.