PE&RS March 2017 Public

Spatial-Spectral Unsupervised Convolutional Sparse Auto-Encoder Classifier for Hyperspectral Imagery

Xiaobing Han, Yanfei Zhong, and Liangpei Zhang

Abstract

Traditional spatial-spectral classification methods for hyperspectral remote sensing imagery build the spatial information vector and the spectral information vector separately and then combine them, which may cause information loss and a weak coupling between the spatial and spectral information. In addition, the traditional morphological-based spatial-spectral classification methods require handcrafted features designed from experience, which is far from automatic and generalizes poorly. To automatically represent the spatial-spectral features around the central pixel within a spatial neighborhood window, a novel spatial-spectral feature classification method based on the unsupervised convolutional sparse auto-encoder (UCSAE) with a window-in-window strategy is proposed in this study. The UCSAE algorithm features a unique spatial-spectral feature extraction approach which is executed in two stages. The first stage represents the spatial-spectral features within a spatial neighborhood window on the basis of spatial-spectral feature extraction of sub-windows with a sparse auto-encoder (SAE). The second stage exploits the spatial-spectral feature representation with a convolution mechanism for the larger outer windows. The UCSAE algorithm was validated on two widely used hyperspectral imagery datasets (the Pavia University dataset and the Washington DC Mall dataset), obtaining accuracies of 90.03 percent and 96.88 percent, respectively, which are better than those obtained by the traditional hyperspectral spatial-spectral classification approaches.
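The two-stage, window-in-window idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window sizes, the number of hidden units, the random weights standing in for a trained SAE encoder, and the final mean-pooling step are all assumptions chosen only to make the data flow concrete. Stage one is represented by an encoder defined on small sub-windows; stage two slides that encoder over every sub-window position of the larger outer window.

```python
import math
import random


def sigmoid(z):
    """Logistic activation used by the encoder."""
    return 1.0 / (1.0 + math.exp(-z))


def encode(patch_vec, weights, biases):
    """Apply one encoder layer: sigmoid(W x + b) for each hidden unit."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, patch_vec)) + b)
        for row, b in zip(weights, biases)
    ]


def sub_window_vector(window, top, left, size):
    """Flatten a size x size sub-window (all bands) into one vector."""
    return [
        value
        for r in range(top, top + size)
        for c in range(left, left + size)
        for value in window[r][c]
    ]


def window_in_window_features(window, weights, biases, sub_size):
    """Slide the encoder over every sub-window of the outer window.

    Returns an n_pos x n_pos grid of hidden-unit activations, where
    n_pos = outer_size - sub_size + 1 (stride 1, no padding).
    """
    n_pos = len(window) - sub_size + 1
    return [
        [
            encode(sub_window_vector(window, r, c, sub_size), weights, biases)
            for c in range(n_pos)
        ]
        for r in range(n_pos)
    ]


def mean_pool(feature_map):
    """Average each hidden unit over all positions (assumed pooling step)."""
    cells = [cell for row in feature_map for cell in row]
    n_hidden = len(cells[0])
    return [sum(cell[k] for cell in cells) / len(cells) for k in range(n_hidden)]


# Hypothetical toy sizes: 7x7 outer window, 4 bands, 3x3 sub-windows, 5 hidden units.
rng = random.Random(0)
OUTER, BANDS, SUB, HIDDEN = 7, 4, 3, 5
window = [[[rng.random() for _ in range(BANDS)] for _ in range(OUTER)]
          for _ in range(OUTER)]
# Random stand-ins for weights that an SAE would learn from sub-windows.
weights = [[rng.uniform(-0.1, 0.1) for _ in range(SUB * SUB * BANDS)]
           for _ in range(HIDDEN)]
biases = [0.0] * HIDDEN

feature_map = window_in_window_features(window, weights, biases, SUB)
feature_vector = mean_pool(feature_map)
print(len(feature_map), len(feature_map[0]), len(feature_vector))
```

In this sketch the pooled `feature_vector` would be the input to a separate classifier for the central pixel. Training the SAE itself (reconstruction loss plus a sparsity penalty) is deliberately left out.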

Introduction

Hyperspectral imagery (HSI) data have become a valuable tool in a wide variety of applications (Chang et al., 2013; Fauvel et al., 2013) such as agriculture, surveillance, astronomy, mineralogy, and environmental sciences, due to their rich spectral and spatial information (Chuvieco et al., 1989; Congalton et al., 2015; Lu et al., 2013; Qin et al., 2014; Svejkovsky et al., 2012). Across these fields, the most common use of HSI data is ground feature classification, which usually means assigning each pixel of the HSI to a land-cover category (Grahn et al., 2007; Camps-Valls et al., 2014; Landgrebe et al., 2003).

HSIs contain rich spectral information, which increases the possibility of more accurately discriminating ground features. Furthermore, with the development of hyperspectral sensors, the fine spatial resolution of HSIs means that most neighboring pixels have homogeneous spectral signatures, which enables the small spatial structures in the images to be correctly delineated (Ji et al., 2014; Jimenez et al., 2005; Kang et al., 2014; Yuan et al., 2014; Zhou et al., 2015). Therefore, efficiently exploiting the combination of the finer spatial and spectral information from the HSI is important for further improving the classification performance for ground features. Various spatial-spectral feature classification methods have been proposed, including algorithms based on neighborhood window opening operations (Chen et al., 2014; Plaza et al., 2009), morphological operations (Fauvel et al., 2008; Mauro et al., 2011), and segmentation approaches. In general, all these methods can be categorized as spatially constrained approaches, which process the spatial and spectral features separately and rely on handcrafted feature design, especially the morphological-based methods. However, two significant issues impede the development of the traditional spatial-spectral classification methods. On the one hand, handcrafted feature design requires expert experience to choose specific parameter settings for different experimental data, which is far from automatic. On the other hand, the direct concatenation of the spatial and spectral features within a certain local spatial neighborhood window in the HSI may lead to an inefficient feature representation and cannot mine the deeper-level features within the spatial window. Whether an algorithm can combine automatic feature extraction with an exhaustive feature representation within a certain spatial window is an interesting and worthwhile research direction.
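To make the "direct concatenation" limitation concrete, the following is a minimal sketch of how a stacked spatial-spectral vector might be built for one pixel. It is an illustration under stated assumptions, not a reproduction of any cited method: the function names are hypothetical, and band averaging is used as a simple stand-in for the dimensionality reduction that such pipelines usually apply before flattening the neighborhood.

```python
def spectral_vector(image, row, col):
    """Spectrum of the central pixel: one value per band."""
    return list(image[row][col])


def spatial_vector(image, row, col, half):
    """Flattened neighborhood of band-averaged intensities.

    Band averaging is an assumed stand-in for a dimensionality
    reduction step applied before flattening.
    """
    return [
        sum(image[r][c]) / len(image[r][c])
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
    ]


def concatenated_feature(image, row, col, half):
    """Join the spectral and spatial vectors end to end."""
    return spectral_vector(image, row, col) + spatial_vector(image, row, col, half)


# Toy 5x5 image with 3 bands; pixel (r, c) holds [r, c, r + c].
image = [[[float(r), float(c), float(r + c)] for c in range(5)] for r in range(5)]
feature = concatenated_feature(image, 2, 2, half=1)
print(len(feature))  # 3 spectral values + 9 spatial values
```

The two halves of `feature` are computed independently and only meet at the final list concatenation, so nothing in the representation captures how the spectrum varies across the neighborhood. This is the separation that the paragraph above identifies as a source of inefficiency.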

From the above description, automatically and effectively representing the spatial and spectral features within a certain spatial neighborhood window is a critical problem. Along with the development of deep learning (Hinton and Salakhutdinov, 2006; Hinton et al., 2006; Bengio et al., 2007), an automatic feature learning and feature representation framework for classification tasks has also been constructed (LeCun et al., 2015). Among the deep learning models, supervised feature learning models and unsupervised feature learning models are the two typical feature learning approaches. A typical supervised feature learning model is the stacked sparse auto-encoder (SSAE). A typical unsupervised feature learning model is the sparse auto-encoder (SAE). In recent deep learning research, some models have been non-convolutional (e.g., the SSAE) and some have been based on the convolution mechanism. Generally, for the non-convolutional models, the spatial and spectral information are processed separately (Chen et al., 2014), usually by concatenating the spatial and spectral information vectors, which may cause information loss. Meanwhile,

State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430079, P.R. China; and the Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430079, P.R. China (zhongyanfei@whu.edu.cn).

Photogrammetric Engineering & Remote Sensing

Vol. 83, No. 3, March 2017, pp. 195–206.

0099-1112/17/195–206

doi: 10.14358/PERS.83.3.195
