PE&RS July 2016 Public

peak in the accumulate array. Through an analysis of the peak

of votes, the parameters of a line in feature space and feature

points that belong to this line can be found simultaneously.

The main characteristics of Hough transform are as follows:

1. Globally optimal. The final result of votes is deter-

mined by all feature points. Thus, the peaks of votes

reflect the entire consistency of these features.

2. Robust to random noise. Random noise will vote into

the accumulate array randomly, and they usually do

not coincidently form a strong peak.

3. Parameters of the pattern and features that belong to

this pattern can be discovered simultaneously by ana-

lyzing the peaks of votes.

These characteristics of Hough transform make it

suitable for rejecting incompatible initial candidate

matches. The method will be described in detail in the

next section.

Geometric Consistency Voting Strategy

The first step is to decide what geometrical relationship to use

between the initial candidate matches in the Hough trans-

form. The rigorous relationship between matches is popularly

known as the epipolar geometry. However, the parameters are

too many and inconvenient for use in the Hough transform.

The scale and the rotation parameters between two images

are appropriate choices. Although the two parameters are not

constant throughout the image, they are still effective global

constraints to the image matches as long as an appropriate bin

size is chosen when discretizing the parameter space. Choos-

ing these two parameters to perform the Hough transform is

effective even when the two images are under a significant 3D

viewpoint change deformation. Unlike the method mentioned

in the Introduction (Jegou

et al

., 2008; Zhou

et al

., 2013), the

scale parameter and the orientation parameter in our method

are not provided by the feature description method.

The translation parameters are not used in our method

because of the following reasons: (a) When using translation

parameters together with the scale and rotation parameters,

the accumulate array will become four dimensional instead

of two dimensional; (b) The variation range of the translation

parameters is much larger than the scale and rotation param-

eters; (c) The sampling interval of the translation parameters

is harder to provide than the scale and rotation parameters;

and (d) Using the scale and rotation parameters can obtain a

satisfactory result already.

Using only one parameter (either the scale parameter or the

rotation parameter) can work effectively if the ratio of correct

matches is not too low, such as higher than 35 percent (this

number is empirical according to our experience). The perfor-

mance decreases if an increasing number of incorrect matches

are involved. Using two parameters together with a weighted

voting strategy can improve the robustness of the algorithm.

In the following section, the method of using one parameter

is initially described. Then, the method of using both param-

eters and the weighted voting strategy are described after. The

input of the algorithm includes one set of feature points and

their

k

nearest neighbors from another set of feature points.

Using the Scale Parameter Alone

The algorithm consists of the following five steps:

1. The variation range of the scale parameter is deter-

mined. Suppose that the difference in scales between

two images does not exceed 5. Then, the variation

range of the scale parameter could be

1

5

to 5.

2. The sampling interval of the parameter space is de-

termined. In our method, the accumulate array is one

dimensional with 17 bins. These bins correspond to the

scale parameter of

1

5

,

1

4 5.

,

1

4

,

1

3 5.

,

1

3

,

1

2 5.

,

1

2

,

1

1 5.

, 1, 1.5, 2,

2.5, 3, 3.5, 4, 4.5, 5, respectively. Then, the accumulate

array is initialized to zero.

3. The voting process is performed.

a) Suppose total

n

feature points exist in the first

feature set, denoted by

p

i

∈

[0,

n

). Each feature

point has

k

nearest neighbors in another feature set,

denoted by

q

i_m

,

m

∈

[0,

k

), indicating the

m

th

nearest

neighbor of

p

i

. Thus, they form total

w

initial candi-

date matches (

w

=

n

×

k

).

b) An overall accumulate array

A

whose dimension is

17

×

1

is set up.

c) An individual accumulate array

A

i_m

is set up for

each candidate match

p

i

and. Then, one feature

point

p

j,

j

≠

i

is selected from the first feature set, form-

ing a vector

p p

i j

. In the meantime,

q

i_m

,

q

j_0

(the first

nearest neighbor of

p

j

) from the second feature set is

used to form the second vector

q q

i m j

_

_0

. Whichever

neighbor

q

i_m

is considered for

p

i

, the first nearest

neighbor

q

j_

0

is always used for

p

j

, because more

correct matches appear in the first nearest neighbor,

using them to vote for the match

p

i

and

q

i_m

is better.

d) For each

j

, the length ratio of the two vectors is cal-

culated. The length ratio indicates the scale param-

eter between two images. Then, the counts of bins

in accumulate array

A

i_m

are increased accordingly.

Finally,

(n – 1)

votes will occur in the accumulate

array

A

i_m

, thereby forming a peak at some point in

the array. The votes of the peak are the confidence of

match

p

i

and

q

i_m

to be a correct match. The confi-

dence and the index of the peak bin for this match

are saved (the index of the peak bin reveals which

scale parameter this match supports).

e) The votes of accumulate array

A

i_m

are added into

the overall accumulate array

A

bin to bin.

4. After all the matches complete the voting process, the

final voting result can be obtained from the overall

accumulate array

A

. The scale value related to the bin

of most votes in the overall accumulate array is the op-

timal average scale parameter between the two feature

sets. Considering the influence of discretization of the

parameter and the variety of the scale parameter within

the image, the adjacent bins of the peak bin are consid-

ered correct bins as well. For example, the bins with in-

dexes from

b

p

– d

to

b

p

+ d

are considered correct bins

in which

b

p

is the index of the peak bin. Furthermore,

d

is an integer that indicates how far the correct bins can

be away from the peak bin, as long as the votes in these

bins are not less than a certain percentage of the peak

bin, such as 40 percent.

5. Each match is labeled as correct or incorrect. A match

is accepted as correct if its peak bin index obtained in

Step 3-d is within the correct bin indexes mentioned

in Step 4. Furthermore, the confidence of a match

obtained in Step 3-d can reveal useful information. If

we arrange all matches in descending order according

to the confidence of this match and show the sorted

confidence as a curve, then the curve will create an

interesting pattern, as illustrated in Figure 3b. From

the highest confidence to the lowest confidence, three

stages exist. In the first stage, the confidence is high

and decreases slowly until it reaches a turning point,

where the confidence drops rapidly, toward the third

stage with low confidence. Matches that belong to the

first stage are found to be correct matches, and matches

that belong to the other stages are incorrect matches.

Thus, the confidence of a match is an equally effective

criterion of the correctness of this match.

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING

July 2016

561

PE&RS July 2016 Public - page 561

Warning.