
[Equation 2: a ratio of the form cov(·, ·) / var(·); the operand symbols were lost in extraction.]   (2)

in which var(X) indicates the variance of image X, and cov(X, Y) denotes the covariance between two images X and Y. The injection coefficient is optimized by least-squares fitting based on Equation 1, which can also be derived from the mathematical development of the geoscience approaches; details are described in Aiazzi, Baronti, and Selva (2007), Vivone et al. (2015), and Aiazzi et al. (2017).
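A ratio of the form cov(X, Y) / var(Y) is the slope of the ordinary least-squares line that predicts X from Y, which is why a coefficient of this form can be described as a least-squares fit. The sketch below illustrates this on synthetic data. The function name `injection_coefficient`, the arrays `x` and `y`, and the choice of which image plays which role are assumptions made for illustration, not the paper's definitions.

```python
import numpy as np

def injection_coefficient(x, y):
    """Return cov(x, y) / var(y) computed over all pixels of two same-sized images."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    dx = x - x.mean()
    dy = y - y.mean()
    # The 1/N factors of covariance and variance cancel in the ratio.
    return float((dx * dy).sum() / (dy * dy).sum())

# Synthetic, hypothetical images: x is roughly 0.8 * y + 5 plus small noise.
rng = np.random.default_rng(0)
y = rng.uniform(0.0, 1.0, size=(32, 32))
x = 0.8 * y + 5.0 + rng.normal(0.0, 0.01, size=y.shape)

g = injection_coefficient(x, y)

# Independent check: slope of a degree-1 least-squares fit of x against y.
slope, intercept = np.polyfit(y.ravel(), x.ravel(), deg=1)
print(g, slope)
```

The two printed values should agree to within floating-point error, and both should be close to the slope of 0.8 used to generate the data.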

Residual Dense Network

After the first stage, the fusion result and the true Landsat image at the prediction date still differ in spatial resolution by a factor of 2. To obtain the final fused result on the prediction date from the transitional fused results, a super-resolution reconstruction method is used. The residual dense network is effective in super-resolution reconstruction (Zhang et al. 2018): it makes full use of the local and global features of the original images and can thus accurately reconstruct the mapping relationship between input data and output data. We therefore introduce the residual dense network for super-resolution reconstruction to obtain the final fused result.

The network mainly consists of four parts, as shown in Figure 2. First, two convolution layers are used to extract shallow features from low-resolution images. Second, residual dense blocks (RDBs) are used to extract deep features. Third, dense feature fusion (DFF) is used to fuse the multilevel features extracted from previous layers. Finally, an up-sampling network is used for super-resolution reconstruction to obtain high-resolution images. Each part is described in detail below.

Shallow Feature Extraction

Two convolution layers are used for shallow feature extraction (SFE) from the low-resolution image $I_{LR}$. Following the notation of Zhang et al. (2018), the features extracted by the two layers can be respectively formulated as

$F_{-1} = H_{SFE1}(I_{LR})$;   (3)

$F_{0} = H_{SFE2}(F_{-1})$;   (4)

where $H_{SFE1}(\cdot)$ and $H_{SFE2}(\cdot)$ represent the first and the second convolutional layer, respectively, and $F_{-1}$ and $F_{0}$ denote the outputs of these two convolution operations, respectively.
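The two-layer shallow feature extraction can be sketched in NumPy as two stacked 3 × 3 convolutions. This is a minimal illustration under stated assumptions: the weights are random, the six input bands and the 16 × 16 patch are hypothetical, no activation is applied after either layer, and `conv2d_same` is a helper written for this example rather than a library function.

```python
import numpy as np

def conv2d_same(x, w, b):
    """Stride-1, zero-padded convolution (cross-correlation, as in deep learning).

    x: (C_in, H, W), w: (C_out, C_in, k, k), b: (C_out,) -> (C_out, H, W).
    """
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))  # zero padding keeps H and W unchanged
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Add the contribution of kernel tap (i, j) for every output pixel.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out + b[:, None, None]

rng = np.random.default_rng(0)
bands, filters = 6, 64                      # assumed input bands; 64 filters per layer
i_lr = rng.normal(size=(bands, 16, 16))     # hypothetical low-resolution patch

w1 = rng.normal(scale=0.1, size=(filters, bands, 3, 3))
w2 = rng.normal(scale=0.1, size=(filters, filters, 3, 3))
b1 = np.zeros(filters)
b2 = np.zeros(filters)

f_m1 = conv2d_same(i_lr, w1, b1)   # first layer, Equation 3
f_0 = conv2d_same(f_m1, w2, b2)    # second layer, Equation 4
print(f_m1.shape, f_0.shape)
```

Both outputs keep the spatial size of the input, so the first-layer output can later be added back to deeper features of the same shape.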

Deep Feature Extraction

The extracted shallow feature $F_{0}$ is then used as the input of the first residual dense block. After passing through $D$ residual dense blocks, the hierarchical features $F_{d}$ are extracted, which can be represented as

$F_{d} = H_{RDB,d}(F_{d-1}) = H_{RDB,d}(H_{RDB,d-1}(\cdots(H_{RDB,1}(F_{0}))\cdots))$   (5)

where $H_{RDB,d}$ represents the $d$th RDB and can be a composite function of convolution and rectified linear units (ReLU).
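The chained composition of residual dense blocks can be sketched as a loop that feeds each block's output to the next and keeps every intermediate result for later fusion. The block below is a simplified stand-in, not the network used in the paper: it uses 1 × 1 channel mixing in place of 3 × 3 convolutions, random weights, and hypothetical names (`make_block`, `run_blocks`, `growth`, `layers`). The dense connections, local fusion, and local residual follow the general design described by Zhang et al. (2018).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def mix(x, w):
    """1 x 1 convolution: mix channels independently at each pixel."""
    return np.einsum("oc,chw->ohw", w, x)

def make_block(rng, channels, growth, layers):
    """Build one simplified residual dense block as a closure over random weights."""
    ws = [rng.normal(scale=0.1, size=(growth, channels + i * growth))
          for i in range(layers)]
    w_fuse = rng.normal(scale=0.1, size=(channels, channels + layers * growth))

    def block(f_prev):
        feats = [f_prev]
        for w in ws:
            # Dense connection: each layer sees the block input and all earlier outputs.
            feats.append(relu(mix(np.concatenate(feats, axis=0), w)))
        # Local fusion back to `channels`, then a local residual connection.
        return f_prev + mix(np.concatenate(feats, axis=0), w_fuse)

    return block

def run_blocks(f_0, blocks):
    """Apply F_d = H_RDB,d(F_{d-1}) for d = 1..D and return [F_1, ..., F_D]."""
    features, f = [], f_0
    for block in blocks:
        f = block(f)
        features.append(f)
    return features

rng = np.random.default_rng(0)
f_0 = rng.normal(size=(8, 5, 5))                      # hypothetical shallow feature
blocks = [make_block(rng, channels=8, growth=4, layers=3) for _ in range(4)]  # D = 4
features = run_blocks(f_0, blocks)
print(len(features), features[-1].shape)
```

Returning the whole list rather than only the last output is what makes the subsequent fusion of all block outputs possible.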

Multilayer Features Fusion

To fully use the features extracted from all the preceding layers, the DFF is further conducted with two operations. First, global feature fusion (GFF) is used to fuse the multilevel features extracted from residual dense blocks 1, …, D. Adding GFF effectively improves the performance of the network and helps stabilize the training process, as demonstrated through quantitative and visual analyses (Zhang et al. 2018). The output $F_{GF}$ can be formulated as

$F_{GF} = H_{GFF}([F_{1}, \ldots, F_{D}])$   (6)

where $H_{GFF}$ is a composite function of a set of convolution operators. The kernel size of each convolution layer can be set to 1 × 1 or 3 × 3: the 1 × 1 convolutional layer fuses the multilevel features, and the 3 × 3 layer is used to further extract features (Ledig et al. 2017).

Then, the output feature $F_{GF}$ and the shallow feature $F_{-1}$ extracted from the first convolutional layer are fused through global residual learning to obtain the global dense feature $F_{DF}$:

$F_{DF} = F_{-1} + F_{GF}$   (7)
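The two fusion steps can be sketched as a channel-wise concatenation followed by a 1 × 1 convolution, and then an element-wise addition with the first-layer shallow feature. This is a minimal sketch under stated assumptions: the 3 × 3 convolution that may follow the 1 × 1 layer is omitted, and the function names, shapes, and averaging weights are hypothetical choices made so that the result is easy to verify.

```python
import numpy as np

def global_feature_fusion(block_outputs, w_1x1):
    """Concatenate F_1..F_D on the channel axis and mix back to C channels (1 x 1 part only)."""
    stacked = np.concatenate(block_outputs, axis=0)      # (D * C, H, W)
    return np.einsum("oc,chw->ohw", w_1x1, stacked)      # (C, H, W)

def global_residual(f_m1, f_gf):
    """Global residual learning: add the first-layer shallow feature to the fused feature."""
    return f_m1 + f_gf

rng = np.random.default_rng(0)
D, C, H, W = 3, 4, 5, 5
block_outputs = [rng.normal(size=(C, H, W)) for _ in range(D)]   # stand-ins for F_1..F_D
f_m1 = rng.normal(size=(C, H, W))                                # stand-in for F_{-1}

# Illustrative weights: output channel c is the mean of channel c over all D blocks.
w_1x1 = np.tile(np.eye(C), (1, D)) / D                           # (C, D * C)

f_gf = global_feature_fusion(block_outputs, w_1x1)
f_df = global_residual(f_m1, f_gf)
print(f_gf.shape, f_df.shape)
```

With these particular weights the fused feature is simply the average of the block outputs; a trained network would learn the mixing weights instead.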

Super-Resolution Reconstruction

The multilevel features extracted by the preceding layers are in the low-resolution space. An up-sampling net in the high-resolution space is introduced to reconstruct the final high-resolution image $I_{HR}$:

$I_{HR} = H_{RDN}(I_{LR})$   (8)

where $H_{RDN}$ represents the composite function of the residual dense network.
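This passage does not specify how the up-sampling net is built. One common choice in super-resolution networks, assumed here purely for illustration, is sub-pixel convolution: a convolution produces r × r times as many channels, and a pixel-shuffle step rearranges them into an image r times larger in each dimension. The sketch below shows only the rearrangement for r = 2; `pixel_shuffle` is a helper written for this example.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (C * r * r, H, W) features into a (C, H * r, W * r) image.

    Output pixel (h * r + i, w * r + j) of channel c is taken from
    input channel c * r * r + i * r + j at position (h, w).
    """
    c_rr, h, w = x.shape
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)       # split the channel axis into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)     # reorder to (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# Four 2 x 2 feature maps become one 4 x 4 image when r = 2.
x = np.arange(16.0).reshape(4, 2, 2)
y = pixel_shuffle(x, 2)
print(y.shape)
```

Each 2 × 2 group of output pixels is filled from the four input channels at the same low-resolution location, so no values are interpolated or discarded.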

Experiments and Results

To assess the effectiveness of the proposed method, its performance is analyzed and compared with that of STARFM and Fit-FC. Two commonly used Landsat-MODIS data sets are selected for testing. The two data sets are characterized by phenological and land-cover type changes, respectively (Emelyanova et al. 2013).

In the two experiments, most of the parameter settings in the residual dense network follow the original paper (Zhang et al. 2018): the convolution kernel size is set to 1 × 1 for local and global feature fusion, the kernel size of all other convolution layers is set to 3 × 3, the number of filters is 64, all of the activation functions are ReLU (Nair and Hinton 2010), and the network is updated using the Adam optimizer. All algorithms are tested on an Intel Xeon Gold 6134 CPU at 3.20 GHz and GPU Tesla P100 16

Data Set Introduction

The first study area is located in the Coleambally Irrigation Area (CIA), in southern New South Wales, Australia (34.0034°S, 145.0675°E). There are 17 cloud-free Landsat-MODIS image pairs from the summer growing season of 2001–2002. All Landsat images in this area are from Landsat ETM+, covering 2193 km², and the data set has six bands. The image size is 1720 × 2040 pixels at a resolution of 25 m.
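As a consistency check, the image size and pixel resolution quoted above imply a ground footprint of 43 km × 51 km, which matches the coverage figure of 2193 when it is read as square kilometres. The helper name `footprint_km2` is introduced here for illustration only.

```python
def footprint_km2(rows, cols, pixel_size_m):
    """Ground area in square kilometres of a rows x cols image with square pixels."""
    height_km = rows * pixel_size_m / 1000.0
    width_km = cols * pixel_size_m / 1000.0
    return height_km * width_km

area = footprint_km2(1720, 2040, 25)
print(area)  # 2193.0
```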

Figure 2. The structure of residual dense network.

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING

December 2019

909
