peak in the accumulate array. Through an analysis of the peak
of votes, the parameters of a line in feature space and feature
points that belong to this line can be found simultaneously.
The main characteristics of Hough transform are as follows:
1. Globally optimal. The final result of votes is deter-
mined by all feature points. Thus, the peaks of votes
reflect the entire consistency of these features.
2. Robust to random noise. Random noise will vote into
the accumulate array randomly, and they usually do
not coincidently form a strong peak.
3. Parameters of the pattern and features that belong to
this pattern can be discovered simultaneously by ana-
lyzing the peaks of votes.
These characteristics of Hough transform make it
suitable for rejecting incompatible initial candidate
matches. The method will be described in detail in the
next section.
Geometric Consistency Voting Strategy
The first step is to decide what geometrical relationship to use
between the initial candidate matches in the Hough trans-
form. The rigorous relationship between matches is popularly
known as the epipolar geometry. However, the parameters are
too many and inconvenient for use in the Hough transform.
The scale and the rotation parameters between two images
are appropriate choices. Although the two parameters are not
constant throughout the image, they are still effective global
constraints to the image matches as long as an appropriate bin
size is chosen when discretizing the parameter space. Choos-
ing these two parameters to perform the Hough transform is
effective even when the two images are under a significant 3D
viewpoint change deformation. Unlike the method mentioned
in the Introduction (Jegou
et al
., 2008; Zhou
et al
., 2013), the
scale parameter and the orientation parameter in our method
are not provided by the feature description method.
The translation parameters are not used in our method
because of the following reasons: (a) When using translation
parameters together with the scale and rotation parameters,
the accumulate array will become four dimensional instead
of two dimensional; (b) The variation range of the translation
parameters is much larger than the scale and rotation param-
eters; (c) The sampling interval of the translation parameters
is harder to provide than the scale and rotation parameters;
and (d) Using the scale and rotation parameters can obtain a
satisfactory result already.
Using only one parameter (either the scale parameter or the
rotation parameter) can work effectively if the ratio of correct
matches is not too low, such as higher than 35 percent (this
number is empirical according to our experience). The perfor-
mance decreases if an increasing number of incorrect matches
are involved. Using two parameters together with a weighted
voting strategy can improve the robustness of the algorithm.
In the following section, the method of using one parameter
is initially described. Then, the method of using both param-
eters and the weighted voting strategy are described after. The
input of the algorithm includes one set of feature points and
their
k
nearest neighbors from another set of feature points.
Using the Scale Parameter Alone
The algorithm consists of the following five steps:
1. The variation range of the scale parameter is deter-
mined. Suppose that the difference in scales between
two images does not exceed 5. Then, the variation
range of the scale parameter could be
1
5
to 5.
2. The sampling interval of the parameter space is de-
termined. In our method, the accumulate array is one
dimensional with 17 bins. These bins correspond to the
scale parameter of
1
5
,
1
4 5.
,
1
4
,
1
3 5.
,
1
3
,
1
2 5.
,
1
2
,
1
1 5.
, 1, 1.5, 2,
2.5, 3, 3.5, 4, 4.5, 5, respectively. Then, the accumulate
array is initialized to zero.
3. The voting process is performed.
a) Suppose total
n
feature points exist in the first
feature set, denoted by
p
i
i
∈
[0,
n
). Each feature
point has
k
nearest neighbors in another feature set,
denoted by
q
i_m
,
m
∈
[0,
k
), indicating the
m
th
nearest
neighbor of
p
i
. Thus, they form total
w
initial candi-
date matches (
w
=
n
×
k
).
b) An overall accumulate array
A
whose dimension is
17
×
1
is set up.
c) An individual accumulate array
A
i_m
is set up for
each candidate match
p
i
and. Then, one feature
point
p
j,
j
≠
i
is selected from the first feature set, form-
ing a vector
p p
i j
. In the meantime,
q
i_m
,
q
j_0
(the first
nearest neighbor of
p
j
) from the second feature set is
used to form the second vector
q q
i m j
_
_0
. Whichever
neighbor
q
i_m
is considered for
p
i
, the first nearest
neighbor
q
j_
0
is always used for
p
j
, because more
correct matches appear in the first nearest neighbor,
using them to vote for the match
p
i
and
q
i_m
is better.
d) For each
j
, the length ratio of the two vectors is cal-
culated. The length ratio indicates the scale param-
eter between two images. Then, the counts of bins
in accumulate array
A
i_m
are increased accordingly.
Finally,
(n – 1)
votes will occur in the accumulate
array
A
i_m
, thereby forming a peak at some point in
the array. The votes of the peak are the confidence of
match
p
i
and
q
i_m
to be a correct match. The confi-
dence and the index of the peak bin for this match
are saved (the index of the peak bin reveals which
scale parameter this match supports).
e) The votes of accumulate array
A
i_m
are added into
the overall accumulate array
A
bin to bin.
4. After all the matches complete the voting process, the
final voting result can be obtained from the overall
accumulate array
A
. The scale value related to the bin
of most votes in the overall accumulate array is the op-
timal average scale parameter between the two feature
sets. Considering the influence of discretization of the
parameter and the variety of the scale parameter within
the image, the adjacent bins of the peak bin are consid-
ered correct bins as well. For example, the bins with in-
dexes from
b
p
– d
to
b
p
+ d
are considered correct bins
in which
b
p
is the index of the peak bin. Furthermore,
d
is an integer that indicates how far the correct bins can
be away from the peak bin, as long as the votes in these
bins are not less than a certain percentage of the peak
bin, such as 40 percent.
5. Each match is labeled as correct or incorrect. A match
is accepted as correct if its peak bin index obtained in
Step 3-d is within the correct bin indexes mentioned
in Step 4. Furthermore, the confidence of a match
obtained in Step 3-d can reveal useful information. If
we arrange all matches in descending order according
to the confidence of this match and show the sorted
confidence as a curve, then the curve will create an
interesting pattern, as illustrated in Figure 3b. From
the highest confidence to the lowest confidence, three
stages exist. In the first stage, the confidence is high
and decreases slowly until it reaches a turning point,
where the confidence drops rapidly, toward the third
stage with low confidence. Matches that belong to the
first stage are found to be correct matches, and matches
that belong to the other stages are incorrect matches.
Thus, the confidence of a match is an equally effective
criterion of the correctness of this match.
PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING
July 2016
561