PE&RS December 2018 Full

sensor artifact removal, and atmospheric compensation and

reflectance conversion (Kurz

et al

., 2013; Okyay

et al

., 2016).

For this purpose, in-house preprocessing routines were de-

veloped using

MATLAB

2016b (The Mathworks, Inc.). The dark

current signal was removed using an average frame obtained

from the dark image collected during data acquisition. Along

with the manufacturer provided non-uniformity correction

(

NUC

), dark frame subtraction removed most of the systematic

striping in the imagery. Image artifacts associated with the

poor performance of individual elements in the 2D sensor ar-

ray and push-broom scanning were identified on each image

band and replaced by the mean of adjacent lines in the spatial

domain. Brightness gradient caused mainly by the cylindri-

cal geometry of the panoramic images were corrected for each

spectral band as outlined in Okyay

et al

. (2016). For atmo-

spheric compensation and reflectance conversion, Empirical

Line Calibration (

ELC

) (Smith and Milton, 1999) was used

employing two calibration panels with ~3% and ~99% re-

flectance. Some of the image artifacts caused by bad pixels on

the sensor array can only be seen after the reflectance conver-

sion. Thus, following the reflectance conversion, bad pixels

were identified in the spectral domain and corrected using

linear interpolation. Subsequently, prior to spatial image co-

registration the

VNIR

image was down-sampled, i.e., the size of

the

VNIR

image was decreased, to match the dimensions of the

SWIR

image.

Spatial Image Co-registration

The homologous points between the ground-based hyperspec-

tral images needed for spatial co-registration were automati-

cally extracted using a feature-based image matching algo-

rithm: Scale-invariant Feature Transform (

SIFT

) (Lowe, 2004).

The implementation of

SIFT

used in this study (Wu, 2007) is

derived from the original work of Lowe (2004). The

SIFT

al-

gorithm has been widely used in remote sensing applications

due to its robustness to variable geometric and radiometric

conditions (Khan

et al

., 2011; Wessel

et al

., 2007). Although

a brief overview of the

SIFT

algorithm is provided, it is by no

means an exhaustive discussion. Thus, for further details of

the

SIFT

algorithm and parameters the reader is referred to the

original work of Lowe (2004 and 1999) which is considered to

be the best material on

SIFT

.

SIFT

initially identifies potential interest points (keypoints)

on input images separately. This is achieved by searching

local extrema over Difference of Gaussians (

DoG

) obtained

by differencing adjacent images progressively blurred with

Gaussians. However, not all keypoints will be stable such as

those have low contrast or are poorly localized (i.e., along

edges). If the intensity of a keypoint is less than a thresh-

old, called

contrast threshold

, it is considered as potentially

unstable and rejected. Similarly, if the ratio between the

principal curvatures of a keypoint is greater than a threshold,

called

edge threshold

, it is considered poorly localized and

rejected. While these parameters are rather arbitrary, as per

Lowe (2004), a

contrast threshold

of 0.03, an

edge thresh-

old

of 10, and a

nearest neighbor ratio

of 0.8 were used in

this study. Each stable keypoint is then assigned a principal

orientation based on local image gradient directions. Subse-

quently, an invariant keypoint descriptor which is essentially

a spatial histogram based on image gradient magnitude and

orientation and represented as a 128-dimensional feature

vector is created at each keypoint. Following their deriva-

tion, distinctive keypoints in individual images are matched

based on minimum Euclidean distance between the invariant

descriptors (i.e., Nearest Neighbor search). However, not all

keypoints will have correct matches and thus, the keypoints

with incorrect matches need to be discarded. For this, ratio of

distances to closest match and second-closest match, called

nearest neighbor ratio

is calculated. If the ratio is greater than

a threshold, the match is considered incorrect and eliminated.

Lowe (2004) has shown that a

nearest neighbor ratio

thresh-

old at 0.8 eliminates 90% of incorrect matches while elimi-

nating only less than 5% of correct matches. Any outliers, i.e.,

incorrect matches, which may still remain within matching

points identified in

SIFT

was further eliminated using the Ran-

dom Sampling and Consensus (

RANSAC

) algorithm (Fischler

and Bolles, 1981) leaving only the inlier points.

RANSAC

is an

iterative algorithm to estimate a mathematical model from a

data set containing outliers. As

RANSAC

works by identifying

the outliers in a data set, it could also be interpreted as an

outlier detection method (Strutz, 2011).

Although the

SIFT

algorithm is robust to radiometric

changes, it could further benefit from the use of images

acquired at similar wavelengths like any other feature-based

matching algorithm (Schwind

et al

., 2014; Sima and Buck-

ley, 2013). However, the low signal-to-noise ratio of the

VNIR

image bands within the overlapping spectral range (briefly

discussed below) causes speckles in the images, which sub-

sequently affect keypoint identification and image matching.

In attempt to evaluate the effect of input image selection on

image matching, four independent runs of

SIFT

were per-

formed using the same parameters but different input images.

Once the homologous points were extracted, the spatial image

co-registration was performed in ENVI version 5.3 (Harris

Geospatial, Boulder, Colorado) using both affine and first-

order polynomial transformations. The affine transformation

accounts for rotation, scaling, and translation (

RST

) but does

not allow for shearing, whereas the polynomial transforma-

tion accommodates image shearing in addition to rotation,

scaling, and translation (ENVI User Guide, 2017). Thus, in

order to account for potential shearing in the images, first-

order polynomial transformation is used in addition to affine

transformation.

Spectral Concatenation

Proper spatial co-registration of the ground-based hyperspec-

tral images, in theory, should provide continuous

VNIR

+

SWIR

spectra that allow a more complete spectral analysis. Unfor-

tunately, in reality this is an overly optimistic expectancy

due mainly to non-optimal sensor properties. Dissimilar

spectral performance of the hyperspectral cameras hampers

the spectral concatenation of

VNIR

and

SWIR

images. A pre-

liminary analysis of image spectra showed a large increase

in noise towards the longer end of the

VNIR

camera spectrum

caused by decreasing sensitivity of the camera sensor (Figure

2). The noise in wavelengths longer than approximately 920

nm becomes more prominent and potentially preventative

for spectral analysis (



in Figure 2); therefore, image bands

of these wavelengths were excluded from further analysis (



in Figure 2). In addition, due to the relatively poor spectral

Table 1. Specifications of the

VNIR

and

SWIR

cameras from

SpecIm, Finland.

VNIR Camera

SWIR Camera

Spectrograph ImSpector V10E

ImSpector N25E

Spectral range

400 – 1000 nm

900 – 2500 nm

Spectral Sampling 0.72 – 5.8 nm

1

6.3 nm

Sensor

HgCdTe (MCT)

Charged-couple

device (CCD)

Spatial dimension up to 1600 pixels

2

320 pixels

Spectral dimension up to 840 pixels

1

256 pixels

Pixel pitch

7.4 µm

30 µm

Digitization

12-bit

14-bit

Frame rate

up to 120 fps

up to 100 fps

1

Adjustable by spectral binning;

2

Adjustable by spatial binning

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING

December 2018

783

PE&RS December 2018 Full - page 783

Warning.