Co-registering of Terrain-level Features
Following the argument of this research, the x-disparity measured
in epipolar-resampled stereo images represents only the
off-terrain objects if the corresponding ground-level objects are
co-registered. Accordingly, a shift is required to geometrically
align the terrain-level objects (e.g., roads) in both images and
thus eliminate their x-disparity. This alignment must maintain
the epipolar condition already achieved in the previous step;
therefore, it must be along the row direction only.
The required transformation for this step is a shift in the
x-direction (epipolar direction/rows) of one epipolar image
(e.g., the right one) with respect to the other. Given an
accurately matched ground-level point pair, the x-shift can be
calculated and applied to that image. This shift can be
implemented simultaneously with the y-shift mentioned in the
previous step, as described in Equation 2:
$$I(x', y') = I(x, y) + S(\Delta x, \Delta y) \qquad (2)$$
where $I(x', y')$ is one of the projected images reoriented along
the epipolar direction, and $I(x, y)$ is the same image after
co-registration using the shifts in the x-direction ($\Delta x$)
and y-direction ($\Delta y$) calculated from a point pair on the
ground level.
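The co-registration shift can be illustrated with a minimal sketch. It assumes integer shifts, a row-major image stored as a list of rows, and the sign convention that the shift moves the right-image point onto the left-image point; the function names `ground_shift` and `shift_image` and the sample coordinates are hypothetical and are not taken from the paper.

```python
def ground_shift(left_pt, right_pt):
    """Return (dx, dy) that moves the right-image point onto the left-image point."""
    (xl, yl), (xr, yr) = left_pt, right_pt
    return xl - xr, yl - yr


def shift_image(image, dx, dy, fill=0):
    """Translate a row-major image by integer dx columns and dy rows."""
    rows, cols = len(image), len(image[0])
    out = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            sx, sy = x - dx, y - dy  # source pixel that lands on (x, y)
            if 0 <= sx < cols and 0 <= sy < rows:
                out[y][x] = image[sy][sx]
    return out


# Hypothetical ground-level (e.g., road) point matched in both epipolar images,
# given as (x, y) = (column, row).
dx, dy = ground_shift(left_pt=(4, 1), right_pt=(2, 1))

# Toy right epipolar image; the value 9 marks the ground point at (2, 1).
right = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
shifted = shift_image(right, dx, dy)
print(dx, dy, shifted[1])
```

After the shift, the ground point sits at the same column in both images, so its x-disparity is zero, while dy stays zero whenever the epipolar condition already holds.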
Mapping the Disparity of the Aboveground Objects
The whole process of the proposed RMAD technique is flowcharted
in Figure 3. The last step in RMAD is the generation of a
disparity map for the aboveground features by applying an image
matching technique based on the epipolar constraint.
Several image matching algorithms are available for creating
disparity maps. Alobeid et al. (2010) provided a critical review
of the state-of-the-art matching algorithms for dense disparity
map generation. They concluded that the most suitable matching
algorithm for urban areas is the semi-global matching (SGM)
technique introduced in Hirschmüller (2008), owing to several
advantages. For instance, it provides sub-pixel matching accuracy
and preserves the discontinuities at building edges. Furthermore,
it is insensitive to illumination and reflection changes.
Consequently, this study adopts this algorithm for the generation
of the disparity information.
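To make the epipolar constraint concrete, the sketch below searches for matches along a single image row only. It is not SGM: a naive winner-takes-all search over a sum of absolute differences is swapped in purely for illustration, with no smoothness term and no sub-pixel refinement. The function name `row_disparity`, the window size, the search range, and the toy pixel values are all assumptions, as is the convention that disparity is the left column minus the right column.

```python
def row_disparity(left_row, right_row, max_disp, half_win=1):
    """Winner-takes-all SAD matching along one epipolar row.

    For each left pixel x, candidate matches are right pixels x - d
    for d in [0, max_disp]; the d with the lowest window cost wins.
    """
    n = len(left_row)
    disp = [0] * n
    for x in range(n):
        best_cost, best_d = None, 0
        for d in range(max_disp + 1):
            cost = 0
            for k in range(-half_win, half_win + 1):
                xl = min(max(x + k, 0), n - 1)      # clamp at the borders
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left_row[xl] - right_row[xr])
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disp[x] = best_d
    return disp


# Toy co-registered rows: ground pixels coincide in both images, while a
# bright "roof" (90, 95) sits at columns 5-6 on the left and 3-4 on the right.
left_row = [10, 20, 30, 40, 50, 90, 95, 80, 15, 25, 35, 45]
right_row = [10, 20, 30, 90, 95, 60, 70, 80, 15, 25, 35, 45]

disp = row_disparity(left_row, right_row, max_disp=3)
# Roof pixels 5 and 6 get disparity 2; ground pixels 1 and 9 get 0.
print(disp[5], disp[6], disp[1], disp[9])
```

Because the ground pixels are already co-registered, only the elevated feature ends up with a non-zero disparity, which is the behaviour the RMAD workflow relies on.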
Disparity-based Building Detection
A procedure can now be designed for building detection using the
pan-sharpened VHR image along with the accurately co-registered
disparity map. The multispectral bands of the VHR image are used
to define the borders of the scene features, while the normalized
disparity information identifies the aboveground objects (e.g.,
buildings and trees). Based on that, the general steps for a
disparity-based building detection procedure can be proposed as
follows:
1. Image Segmentation. Image pixels are grouped into
classes based on a homogeneity measure of the color
information. The resulting segments represent meaningful
objects in the scene. Several segmentation techniques
have been introduced in the literature. However,
multi-resolution segmentation (Baatz and Schäpe, 2000)
is one of the most appropriate techniques for segmenting
VHR images in urban areas, as concluded by Dey (2013).
For improved segmentation results, the VHR image is
required to be pan-sharpened. The UNB pan-sharpening
technique, introduced by Zhang (2004), works best for
VHR images of the new satellite sensors because it
preserves the color information that is critical for
segmentation techniques.
2. Vegetation Suppression. The disparity values have
a crucial role in the subsequent steps for building
detection. Thus, other elevated features, such as trees,
must first be removed to avoid confusion with buildings.
Vegetation indices based on the red and infra-red bands
of VHR images are used effectively to detect and
delineate vegetation. One of the most accurate and
popular vegetation indices is the Normalized Difference
Vegetation Index (NDVI).
All image segments related to trees and grass objects
can be identified by applying a thresholding operation
to the calculated NDVI values as in Equation 3, and
subsequently, their corresponding disparity values are
suppressed (set to zero).
$$\mathrm{Disp}(x, y) = \begin{cases} 0, & \text{if } \mathrm{NDVI} \geq t \\ \mathrm{Disp}(x, y), & \text{otherwise} \end{cases} \qquad (3)$$
where $\mathrm{Disp}(x, y)$ is the disparity value at the x and y
location in the object space plane, and $t$ is the threshold value.
3. Disparity Thresholding. At this stage, building objects
can be detected directly and without confusion by
simply applying a threshold to the disparity map. This
is because, after the last step, the remaining disparity
values represent buildings only. Furthermore, in an
off-nadir VHR stereo pair, the disparities should
represent building rooftops only, without any building
façades. This takes advantage of the geometry of image
acquisition, which makes building façades visible in one
stereo image not appear in the other one (Figure 4).
Because of the façades' dissimilarity and/or the imposed
pre-specified search range in the SGM algorithm, pixels
that belong to building façades cannot normally be
matched in a stereo pair.
4. Result Finishing. A post-processing procedure may be
applied to enhance the representation of the detected
buildings. The procedure includes removing the noise,
filling the holes, and merging the detected segments.
Object-based image analysis software packages (e.g.,
Figure 3. The work flow for generating disparity maps for off-
terrain objects (RMAD technique).
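Steps 2 and 3 of the building detection procedure (vegetation suppression as in Equation 3, followed by disparity thresholding) can be sketched as follows. This is a simplified per-pixel illustration, whereas the procedure described in the text operates on image segments. NDVI is computed with its standard definition, (NIR - Red) / (NIR + Red); the function names, the threshold values `t` and `min_disp`, and the toy band values are assumptions chosen only for the example.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel (or a segment mean)."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def suppress_vegetation(disp, nir, red, t):
    """Equation 3: set the disparity to zero wherever NDVI >= t."""
    return [
        [0 if ndvi(n, r) >= t else d for d, n, r in zip(drow, nrow, rrow)]
        for drow, nrow, rrow in zip(disp, nir, red)
    ]


def threshold_disparity(disp, min_disp):
    """Binary building mask: True where the remaining disparity exceeds min_disp."""
    return [[d > min_disp for d in row] for row in disp]


# Toy 2 x 3 scene: the middle column is a building, and the pixel at
# row 0, column 2 is a tree (elevated, with a high near-infrared response).
disp = [[0, 6, 5],
        [0, 7, 0]]
nir = [[50, 60, 200],
       [40, 55, 45]]
red = [[45, 58, 40],
       [38, 50, 44]]

no_veg = suppress_vegetation(disp, nir, red, t=0.3)   # hypothetical threshold
buildings = threshold_disparity(no_veg, min_disp=2)   # hypothetical threshold
print(no_veg)
print(buildings)
```

The tree pixel carries a large disparity before suppression, which is why vegetation has to be removed before the disparity threshold is applied.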
538
July 2016
PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING