TY - JOUR
T1 - Unsupervised linear feature-extraction methods and their effects in the classification of high-dimensional data
AU - Jiménez-Rodríguez, Luis O.
AU - Arzuaga-Cruz, Emmanuel
AU - Vélez-Reyes, Miguel
N1 - Funding Information:
Manuscript received May 9, 2005; revised June 11, 2006. This work was supported in part by the U.S. Army Corps of Engineers Topographic Engineering Center under Grant DACA76-97-K-0007, in part by the National Science Foundation Engineering Research Center Program under Grant EEC-9986821, in part by the NASA University Research Centers Program under Grant NCC5-518, and in part by the National Imagery and Mapping Agency under Contract NMA2010112014.
PY - 2007/2
Y1 - 2007/2
N2 - This paper presents an analysis and comparison of different linear unsupervised feature-extraction methods applied to hyperdimensional data and their impact on classification. The dimensionality reduction methods studied fall under the category of unsupervised linear transformations: principal component analysis, projection pursuit (PP), and band subset selection. Special attention is paid to an optimized version of PP introduced in this paper: optimized information divergence PP, which maximizes the information divergence between the probability density function of the projected data and the Gaussian distribution. This work is particularly relevant to current and next-generation hyperspectral sensors, which acquire more information in a higher number of spectral channels or bands than multispectral sensors. Uncovering patterns in these high-dimensional data is not simple: challenges such as the Hughes phenomenon and the curse of dimensionality affect high-dimensional data analysis. Unsupervised feature extraction, implemented as a linear projection from a higher dimensional space to a lower dimensional subspace, is necessary for hyperspectral data analysis because it can overcome some of the difficulties of high-dimensional data. An objective of unsupervised feature extraction in hyperspectral data analysis is to reduce the dimensionality of the data while maintaining its capability to discriminate data patterns of interest from unknown cluttered background that may be present in the data set. This paper presents a study of the impact these mechanisms have on the classification process. The impact is studied for supervised classification, even under the condition of a small number of training samples, and for unsupervised classification, where unknown structures are to be uncovered and detected.
AB - This paper presents an analysis and comparison of different linear unsupervised feature-extraction methods applied to hyperdimensional data and their impact on classification. The dimensionality reduction methods studied fall under the category of unsupervised linear transformations: principal component analysis, projection pursuit (PP), and band subset selection. Special attention is paid to an optimized version of PP introduced in this paper: optimized information divergence PP, which maximizes the information divergence between the probability density function of the projected data and the Gaussian distribution. This work is particularly relevant to current and next-generation hyperspectral sensors, which acquire more information in a higher number of spectral channels or bands than multispectral sensors. Uncovering patterns in these high-dimensional data is not simple: challenges such as the Hughes phenomenon and the curse of dimensionality affect high-dimensional data analysis. Unsupervised feature extraction, implemented as a linear projection from a higher dimensional space to a lower dimensional subspace, is necessary for hyperspectral data analysis because it can overcome some of the difficulties of high-dimensional data. An objective of unsupervised feature extraction in hyperspectral data analysis is to reduce the dimensionality of the data while maintaining its capability to discriminate data patterns of interest from unknown cluttered background that may be present in the data set. This paper presents a study of the impact these mechanisms have on the classification process. The impact is studied for supervised classification, even under the condition of a small number of training samples, and for unsupervised classification, where unknown structures are to be uncovered and detected.
KW - Classification
KW - Dimensionality reduction
KW - Feature extraction
KW - Feature selection
KW - Hyperspectral data
KW - Pattern recognition
KW - Principal component analysis (PCA)
KW - Projection pursuit (PP)
UR - http://www.scopus.com/inward/record.url?scp=33846627768&partnerID=8YFLogxK
U2 - 10.1109/TGRS.2006.885412
DO - 10.1109/TGRS.2006.885412
M3 - Article
AN - SCOPUS:33846627768
SN - 0196-2892
VL - 45
SP - 469
EP - 483
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
IS - 2
ER -