PE&RS August 2018 Full

In the following Section, the details of the proposed method are
described. We demonstrate and analyze the experimental results
using diverse datasets from different sensors (UAV-borne,
airborne, and spaceborne data) under challenging scenarios

such as images of bushes, forests, and crops. The final Section

concludes this paper and provides insight on future work.

Related Work

In general, existing matching algorithms can be
divided into area-based methods and feature-based methods

(Xiong and Zhang, 2009). In area-based methods, image data

in a rectangular window is used in the form of a matrix of

grey values to describe the central pixel. Among area-based

methods, cross-correlation is one of the most widely used

similarity measurements. The Normalized Cross-Correlation
(NCC) was proposed to increase the robustness of
cross-correlation to linear grayscale variation and slight
geometric distortion (Helava, 1978; Ackermann, 1984; Lhuillier
and Quan, 2002). To improve the matching reliability, some matching

strategies like least squares matching (Gruen, 1985, 2002,
2005) and relaxation labeling (Wu and Pairman, 1995) were
proposed to constrain the similarity measurement. An advantage
of area-based methods is that they can achieve sub-pixel
accuracy, or even better in some cases, but they are sensitive to
nonlinear image intensity changes and perspective geometric
deformation (Gruen, 2012).
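The robustness of NCC to linear grayscale variation can be illustrated with a minimal sketch. It assumes the usual zero-mean formulation of NCC over two equally sized windows of grey values; the function name `ncc` and the sample windows are hypothetical and are not taken from the cited works.

```python
import math


def ncc(window_a, window_b):
    """Zero-mean normalized cross-correlation of two equally sized windows.

    Each window is a list of rows of grey values. The result lies in [-1, 1].
    """
    a = [v for row in window_a for v in row]
    b = [v for row in window_b for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("windows must be non-empty and of the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # A flat window has no texture; the correlation is undefined.
        # Returning 0 here is a convention chosen for this sketch.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


# Hypothetical 3x3 window and a copy with a linear grey-value change.
patch = [[10, 20, 30], [40, 50, 60], [70, 80, 95]]
brighter = [[2 * v + 15 for v in row] for row in patch]
score = ncc(patch, brighter)  # close to 1 despite the gain and offset
```

Because the window mean is subtracted and the result is divided by the product of the standard deviations, a gain and offset applied to one window leave the score essentially unchanged. Nonlinear intensity changes are not compensated in this way.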

Feature-based methods can overcome some of the aforementioned
problems of area-based methods by exploring
local image information to construct transformation-invariant
descriptors for detected features. Feature-based matching
methods have attracted much attention in recent years following
the success of the well-known Scale Invariant Feature Transform
(SIFT) (Lowe, 2004). In brief, the paradigm of feature-based
methods can generally be divided into three steps: feature
detection, description, and matching. Existing methods either
focus on one or two of these three steps, or address all of
them in sequence. The most commonly used detectors

are the Förstner detector (Förstner and Gülch, 1987), Harris-based
detectors (Harris and Stephens, 1988; Mikolajczyk and
Schmid, 2004), the Smallest Univalue Segment Assimilating
Nucleus (SUSAN) detector (Smith and Brady, 1997), the Phase
Congruency detector (Kovesi, 1999), the Difference of Gaussians
(DoG) detector (Lowe, 2004), and Maximally Stable Extremal
Regions (MSER) (Matas et al., 2004), as they show comparatively
higher feature repeatability rates in either benchmarks or
practical tests (Mikolajczyk et al., 2005; Tuytelaars and
Mikolajczyk, 2008). As for descriptors,

SIFT, Speeded Up Robust Features (SURF) (Bay et al., 2008),
Histograms of Oriented Gradients (HOG) (Dalal and Triggs, 2005),
and DAISY (Tola et al., 2010) are among the top performers due
to their robustness and accuracy in practical scenarios
(Mikolajczyk and Schmid, 2005; Gauglitz et al., 2011).
Different measures like mutual information and

Euclidean distance can be adopted to evaluate feature similarity
across different images (Maes et al., 1997). The matching
performance can be improved by using some iterative matching
strategies (Morel and Yu, 2009; Yu et al., 2012; Chen et al.,
2013; Chen et al., 2017). Feature-based methods in remote

sensing image matching can be specially treated, since the

spatial and geometric information of the often geo-referenced

images can be used to constrain the correspondence search

area. This not only accelerates the matching procedure but

also improves the matching precision (the ratio between the

number of correct matches and the number of total matches).
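As a rough illustration of how a geo-referenced position can bound the correspondence search, the sketch below keeps only candidates within a radius of a predicted location and then picks the nearest descriptor by Euclidean distance. The function names, the circular search area, and the sample values are assumptions made for illustration; the precision function simply restates the ratio defined above.

```python
import math


def constrained_match(predicted_xy, candidates, query_desc, radius):
    """Return the index of the best candidate near the predicted position.

    candidates is a list of ((x, y), descriptor) tuples. Candidates farther
    than `radius` from `predicted_xy` are skipped, so fewer descriptor
    comparisons are needed and distant look-alikes cannot be selected.
    Returns None if no candidate lies inside the search area.
    """
    best_index, best_dist = None, math.inf
    for i, (xy, desc) in enumerate(candidates):
        if math.dist(xy, predicted_xy) > radius:
            continue  # outside the geometrically plausible search area
        d = math.dist(desc, query_desc)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index


def matching_precision(num_correct, num_total):
    """Ratio of correct matches to total matches (0.0 if there are none)."""
    return num_correct / num_total if num_total else 0.0


# Hypothetical candidates: two of them share an identical descriptor.
candidates = [
    ((100.0, 100.0), [0.1, 0.9]),
    ((400.0, 400.0), [0.1, 0.9]),
    ((105.0, 98.0), [0.8, 0.2]),
]
near_first = constrained_match((102.0, 101.0), candidates, [0.1, 0.9], 10.0)
near_second = constrained_match((398.0, 402.0), candidates, [0.1, 0.9], 10.0)
```

In this example the descriptor alone cannot separate the first two candidates, while the predicted position resolves the ambiguity in each case.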

Although these well-investigated feature detection and
matching methods generally work in many practical scenarios,
such seemingly robust methods invariably run into
problems when applied to images with highly repetitive
textures. To improve the matching performance in
such scenarios, the global context is often explored by
computing the inter-relationships between correspondences as a
spatial constraint (Mortensen et al., 2005; Duchenne et al., 2011).

Inspired by this idea, methods were proposed to match
repetitive patterns by matching point pairs and then finding
one-to-one correspondences from the matched pairs based
on geometric constraints (Fan et al., 2011). Local ambiguities

can be reduced by extending support regions to extract more

information from larger regions. These methods work well

for images with a moderate level of repetitive patterns, although
the matching performance relies heavily on the initially

matched pairs. If repetitive textures take a large portion of

the image content, the information extracted from extended

regions may not be distinctive enough. Existing methods also

considered scale and orientation constraints provided by

geo-referencing information, e.g., Global Navigation Satellite
System and Inertial Navigation System (GNSS/INS) data. These
data can be combined with local features to match images
with a large number of similar patterns (Habib et al., 2016).

The use of geometric constraints can reduce false
matches to a certain extent, but the feature descriptors may
still be computed from similar texture regions, which limits the
matching performance.

To further improve the capability of handling repetitive
patterns, increasing the feature support region size may bring
fewer benefits than drawbacks, since it raises both the
chance of geometric deformation (i.e., perspective or affine
distortion) and the computational complexity. Therefore,
in this paper, we instead explore the texture structures
within a local region by discovering distinctive features
within a texturally repetitive area.

Methods

Our proposed method considers a full pipeline of feature

point detection, description, and matching. Unlike
the traditional paradigm, which detects features and computes
descriptors in two separate steps, the feature description is

integrated into the feature detection step. As a first step of
the method, a novel LDF detector is proposed to measure a
two-level distinctiveness of the pixel and the support region
through a modified response function, and to detect LDFs with
high matching potential. Seed points are then selected from
the LDFs and matched using descriptor similarity through
a bidirectional matching strategy. These seed point matching
results are further filtered through a matching reliability
indicator, where only the pair of seed points with the highest
reliability value in each area is kept. The remaining seed

point matching results are filtered again using Random
Sample Consensus (RANSAC) (Fischler and Bolles, 1981).

Based on these matches, a coarse geometric transformation

(e.g., affine or projective transformation) can be computed

between the source and target images to bound the search area
for more refined matching. Finally, a FIPS-based search
strategy is proposed. It is combined with descriptor similarity and
geometric constraints to compute the correspondence for each LDF
on the source image. The workflow of the proposed matching
method is summarized in Figure 1.
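The seed-point matching step relies on a bidirectional strategy whose exact formulation is not given in this overview. The sketch below assumes one common reading, a mutual nearest-neighbour check under Euclidean descriptor distance: a pair is kept only if each point is the other's best match. The function names and sample descriptors are hypothetical, and the sketch omits the reliability indicator and RANSAC filtering that follow in the pipeline.

```python
import math


def nearest(desc, pool):
    """Index of the descriptor in `pool` closest to `desc` (Euclidean)."""
    return min(range(len(pool)), key=lambda j: math.dist(desc, pool[j]))


def bidirectional_matches(source_descs, target_descs):
    """Return (source_index, target_index) pairs that agree in both directions.

    A source point i is matched to its nearest target j only if the nearest
    source point of j is again i. One-directional matches are discarded.
    """
    if not source_descs or not target_descs:
        return []
    matches = []
    for i, desc in enumerate(source_descs):
        j = nearest(desc, target_descs)
        if nearest(target_descs[j], source_descs) == i:
            matches.append((i, j))
    return matches


# Hypothetical 2-D descriptors for illustration only.
source = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
target = [[0.1, 0.0], [5.0, 5.2], [0.9, 1.1]]
pairs = bidirectional_matches(source, target)

# Two similar source points compete for one target; only the closer survives.
ambiguous = bidirectional_matches([[0.0, 0.0], [0.3, 0.0]], [[0.2, 0.0]])
```

The second call shows why the check helps with repetitive texture: both source points select the same target as their nearest neighbour, but only the pair that is also the best match in the reverse direction is retained.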

LDF Detector

Classic corner detectors, for instance the Harris detector, use
corner response values computed from a covariance matrix to
identify potential corner pixels as interest points. Since
the covariance matrix is aggregated over a small window,
such corner points can be distinctive within this local window
while being ambiguous at a larger scale (i.e., the feature
support region used to compute the descriptor). Such globally
non-distinctive feature points normally become disturbing


August 2018

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING
