PE&RS September 2015 - page 735

features have different properties, and the overlap between them is small if not empty (Mikolajczyk et al., 2005).
There are several methods for robustly extracting local invariant features (Cheng et al., 2008; Cheng et al., 2013). For example, Cheng et al. (2008) proposed the ED-MSER (entropy and spatial dispersion constrained MSER) approach for robust and well-distributed MSER region extraction. For reliable and uniform robust MSER and Harris-Affine feature extraction in this paper, an advanced strategy is used according to our previous research, the Uniform Robust SIFT (UR-SIFT) algorithm (Sedaghat et al., 2011). The main idea of this approach is a selection strategy that distributes the features over the full range of locations and scales, while the feature quality is guaranteed by stability and distinctiveness constraints (Sedaghat et al., 2011). This strategy is adapted for both the Harris-Affine and MSER algorithms, and a uniform robust integrated approach is presented for affine invariant feature extraction. For a detailed explanation of this method, please see Sedaghat et al. (2011).
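The core of such a uniform selection strategy can be illustrated with a simplified grid-based sketch. The function below is only an assumption of how the spatial part of the strategy might look (the actual UR-SIFT algorithm additionally enforces stability and distinctiveness constraints and works across scales); the function name and parameters are hypothetical.

```python
import numpy as np

def select_uniform_features(keypoints, scores, image_shape, grid=(8, 8), per_cell=5):
    """Keep the `per_cell` highest-scoring features in each grid cell.

    `keypoints` is an (N, 2) array of (x, y) positions and `scores` an
    (N,) array of quality values (e.g., detector response). This is a
    simplified sketch of spatially uniform selection only; UR-SIFT also
    applies stability and distinctiveness constraints.
    """
    keypoints = np.asarray(keypoints, dtype=float)
    scores = np.asarray(scores, dtype=float)
    h, w = image_shape
    # Assign each keypoint to a cell of a coarse grid over the image.
    cell_x = np.minimum((keypoints[:, 0] / w * grid[1]).astype(int), grid[1] - 1)
    cell_y = np.minimum((keypoints[:, 1] / h * grid[0]).astype(int), grid[0] - 1)
    cell_id = cell_y * grid[1] + cell_x

    keep = []
    for c in np.unique(cell_id):
        idx = np.where(cell_id == c)[0]
        # Rank the cell's features by quality and keep only the best ones.
        best = idx[np.argsort(scores[idx])[::-1][:per_cell]]
        keep.extend(best.tolist())
    return sorted(keep)
```

Capping the number of features per cell prevents texture-rich areas from dominating the match set, which is what yields the uniform spatial distribution the paper aims for.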
After feature extraction, the next step is the descriptor generation process. The well-known SIFT algorithm is used for descriptor generation in the proposed method. The SIFT descriptor is a 3D histogram of gradient locations and orientations: the location is quantized into a 4 × 4 grid and the gradient angle into eight orientations, resulting in a 128-dimensional descriptor. The Harris-Affine and MSER detectors provide elliptical regions of different sizes. All the regions are resampled to a circular region of constant radius (a 41 × 41 patch) to obtain scale and affine invariance. In order to achieve orientation invariance, before SIFT descriptor generation, one or more orientations are assigned to each feature point based on local image gradient directions. Then, the normalized regions are rotated in the direction of the dominant gradient orientation.
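The 4 × 4 × 8 = 128-bin structure of the descriptor can be sketched as follows. This is a minimal illustration of the histogram idea only; it omits the Gaussian weighting, trilinear interpolation, and clipping of the full SIFT descriptor.

```python
import numpy as np

def sift_like_descriptor(patch):
    """Build a simplified 4 x 4 x 8 = 128-dimensional gradient histogram.

    `patch` is a normalized, rotated 41 x 41 grayscale window. Location is
    quantized into a 4 x 4 grid and gradient orientation into 8 bins;
    each pixel votes with its gradient magnitude.
    """
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)          # angles in [0, 2*pi)
    ori_bin = np.minimum((ang / (2 * np.pi) * 8).astype(int), 7)

    n = patch.shape[0]
    desc = np.zeros((4, 4, 8))
    for i in range(n):
        for j in range(n):
            cy = min(i * 4 // n, 3)                      # 4 x 4 location cell
            cx = min(j * 4 // n, 3)
            desc[cy, cx, ori_bin[i, j]] += mag[i, j]

    desc = desc.ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc             # unit-length vector
```

Normalizing the final vector to unit length gives the descriptor its invariance to linear illumination changes.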
Feature Correspondence and Mismatch Elimination
After combined feature extraction and description in both the reference and search images, the initial cross-matching process is performed using the Euclidean distance between feature descriptors. Cross matching is a dual strategy that confirms each match by a reverse certification (Yu et al., 2008). The ratio between the first and second minimum distances is also used to increase the reliability of the initial matches (Lowe, 2004). Initial matches whose distance ratio is greater than T_ED = 0.8 (as suggested by Lowe, 2004) are rejected.
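The combination of the ratio test and the reverse certification can be sketched as below. This is a brute-force illustration under the stated thresholds, not the paper's implementation; practical systems replace the full distance matrix with a k-d tree search.

```python
import numpy as np

def cross_match(desc_ref, desc_search, t_ed=0.8):
    """Initial matching with Lowe's ratio test plus a reverse (cross) check.

    `desc_ref` (N, D) and `desc_search` (M, D) are descriptor arrays. A
    pair (i, j) is accepted only if the first-to-second distance ratio is
    below `t_ed` and i and j are mutual nearest neighbours.
    """
    # Full Euclidean distance matrix between the two descriptor sets.
    d = np.linalg.norm(desc_ref[:, None, :] - desc_search[None, :, :], axis=2)

    matches = []
    for i in range(d.shape[0]):
        order = np.argsort(d[i])
        j = order[0]
        # Ratio test: reject ambiguous matches (ratio >= t_ed).
        if len(order) > 1 and d[i, order[1]] > 0 and d[i, j] / d[i, order[1]] >= t_ed:
            continue
        # Reverse certification: i must also be j's nearest neighbour.
        if np.argmin(d[:, j]) == i:
            matches.append((i, j))
    return matches
```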
There are several methods for identifying and discarding mismatched points. Generally, a geometric transformation model is estimated between the two images. In this paper, an epipolar constraint based on the fundamental matrix and the RANSAC algorithm is used to reject false matches (Zhang et al., 2010). The initial feature matches that are not consistent with the estimated epipolar geometry are identified as false matches and rejected. For this purpose, a threshold T_E = 1 pixel is used for the distance between each feature point and its estimated epipolar line. For more details, please see Zhang et al. (2010). The output of the first step is therefore a set of initial corresponding elliptical regions whose positional accuracy is improved in the next step.
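The rejection step can be sketched as a point-to-epipolar-line distance check, assuming the fundamental matrix has already been estimated (in practice with RANSAC, e.g. via OpenCV's cv2.findFundamentalMat); only the filtering against T_E is shown here.

```python
import numpy as np

def epipolar_filter(pts1, pts2, F, t_e=1.0):
    """Reject matches farther than `t_e` pixels from their epipolar lines.

    `pts1`, `pts2` are (N, 2) matched points and `F` a 3 x 3 fundamental
    matrix. The point-to-line distance is checked symmetrically in both
    images; indices of surviving (inlier) matches are returned.
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    h1 = np.hstack([pts1, np.ones((len(pts1), 1))])      # homogeneous coordinates
    h2 = np.hstack([pts2, np.ones((len(pts2), 1))])

    l2 = h1 @ F.T                                        # epipolar lines in image 2 (F x1)
    l1 = h2 @ F                                          # epipolar lines in image 1 (F^T x2)
    # Perpendicular distance |ax + by + c| / sqrt(a^2 + b^2).
    d2 = np.abs(np.sum(l2 * h2, axis=1)) / np.hypot(l2[:, 0], l2[:, 1])
    d1 = np.abs(np.sum(l1 * h1, axis=1)) / np.hypot(l1[:, 0], l1[:, 1])
    return np.where((d1 <= t_e) & (d2 <= t_e))[0]
```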
Accurate Image Matching Based on Oriented Least Square
Before presenting the proposed accurate oriented least square
matching, a brief review of the original least square method is
presented.
Least Square Matching
The well-known Least Square Matching (LSM) algorithm is a very potent, flexible, and accurate technique for all kinds of data matching problems (Gruen, 1985). In the LSM method, each point to be matched is the center of a small window of pixels in a reference image (the template), which is statistically compared with equally sized windows of pixels in the search image (the matching window). The position and shape of the matching window are iteratively estimated in an adjustment process until the grey-level differences between the deformed matching window and the template reach a minimum.
Suppose that f(x,y) and g(x,y) are the template and matching windows, i.e., conjugate patches of a point pair in the reference and search images, respectively. The g(x,y) patch is projected onto the f(x,y) patch by a geometric and a radiometric transformation. So, by considering e(x,y) as image noise, the observation equation can be written as:

e(x,y) = f(x,y) – T_R(g(T_G(x,y)))   (1)
where T_R and T_G are the radiometric and geometric transformation models, respectively. A linear function with two parameters is generally applied for the radiometric transformation. Because the template and matching windows are small patches, a planar affine model with six unknown parameters or a projective model with eight unknown parameters is considered an effective geometric transformation. Equation 1 is a nonlinear observation equation and should be linearized using a Taylor expansion. For each conjugate pixel in the template and matching window, an observation equation is written, and the unknown parameters, including the radiometric and geometric transformation parameters, are determined using least square adjustment.
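Equation 1 can be made concrete with a short sketch that evaluates the residuals for a given parameter set, assuming the six-parameter planar affine T_G and the two-parameter linear radiometric T_R described above. Nearest-neighbour sampling is used here only for brevity; real LSM implementations interpolate bilinearly.

```python
import numpy as np

def lsm_residuals(f, g, affine, r0=0.0, r1=1.0):
    """Evaluate Equation 1: e(x,y) = f(x,y) - T_R(g(T_G(x,y))).

    `f` is the template and `g` the matching window. T_G is the planar
    affine model `affine` = (a0, a1, a2, b0, b1, b2), mapping
    (x, y) -> (a0 + a1*x + a2*y, b0 + b1*x + b2*y); T_R is the linear
    radiometric model r0 + r1 * g.
    """
    a0, a1, a2, b0, b1, b2 = affine
    h, w = f.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Geometric transformation of the template grid into g's coordinates.
    u = np.rint(a0 + a1 * xs + a2 * ys).astype(int)
    v = np.rint(b0 + b1 * xs + b2 * ys).astype(int)
    u = np.clip(u, 0, g.shape[1] - 1)
    v = np.clip(v, 0, g.shape[0] - 1)
    # Radiometric transformation of the resampled matching window.
    return f - (r0 + r1 * g[v, u])
```

The adjustment then seeks the eight parameters that minimize the sum of squares of these residuals.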
The nonlinearity of the original estimation model requires iterations of the linearized model. In each iteration, a shift vector Δd = [Δx, Δy] is computed with respect to the initial position of g(x,y) and is used to refine the accuracy of the correspondence. As a rule of thumb, the approximate matching window location should not be displaced by more than half the window size from the desired point (Luhmann et al., 2006). The iteration stops when the solution converges. The number of iterations and the success of the LSM solution depend on various factors, such as the similarity, size, and information content of the template and matching windows, the accuracy of the initial values, and the chosen transformation model. For a detailed explanation of this method, please see Gruen (1985).
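The iterative scheme can be sketched for the simplest case, a pure translation: each iteration linearizes the model with the gradients of g, solves for the shift update Δd = [Δx, Δy], and stops once the update converges. This is only a translation-only illustration, not the full affine/radiometric adjustment of the paper; `g_func` is a hypothetical helper that samples the search image at continuous coordinates.

```python
import numpy as np

def lsm_shift(f, g_func, x0=0.0, y0=0.0, max_iter=20, tol=1e-6):
    """Estimate a template's shift by iterating the linearized LSM model.

    `f` is the template; `g_func(xs, ys)` evaluates the search image at
    continuous coordinates. Each iteration solves the least-squares
    problem A * delta = e, where A stacks the gradients of g and e is the
    grey-level residual f - g.
    """
    h, w = f.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    px, py = x0, y0
    for _ in range(max_iter):
        g = g_func(xs + px, ys + py)
        gy, gx = np.gradient(g)                          # linearization terms
        e = (f - g).ravel()
        A = np.column_stack([gx.ravel(), gy.ravel()])    # design matrix
        delta, *_ = np.linalg.lstsq(A, e, rcond=None)
        px, py = px + delta[0], py + delta[1]
        if np.hypot(*delta) < tol:                       # convergence check
            break
    return px, py
```

As the rule of thumb above suggests, the initial position must lie close enough to the true one (well within half the window size) for these iterations to converge.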
LSM performance decreases significantly when image pairs have large geometric distortion. In general, identical (equally shaped) template and matching windows are used, which cannot yield truly accurate corresponding points in image pairs with significant geometric distortion, because the approximate matching window shape is inadequate even when good initial location values are used.
Based on our experimental analysis described in the Results and Discussion Section, the standard LSM method is relatively sensitive to significant rotation, scale, and viewpoint differences. Therefore, it cannot be effectively applied to accurate, robust, viewpoint-invariant image matching. In fact, due to the significant local distortions in a convergent image pair captured from considerably different viewpoints, the identical matching window of the original LSM method does not have enough overlap for successful precise matching. Figure 2 shows successful and unsuccessful examples of LSM matching.
As a common approach to overcome this problem, the search image can be coarsely aligned with the reference image using a global transformation model. This process requires preregistration and resampling steps that increase the computational complexity of the matching. On the other hand, if the two images have very significant local distortion, especially that caused by relief displacement, a global transformation model cannot align them appropriately. In this paper, to increase the capability of the LSM, an Oriented Least Square Matching (OLSM) method is proposed. In the following, the details of the proposed method are presented.