A method of dimensionality reduction by selection of components in principal component analysis for text classification

Yangwu Zhang, Guohe Li, Heng Zong


Dimensionality reduction, which includes feature extraction and feature selection, is a key step in text classification. In this paper, we propose a mixed method of dimensionality reduction that combines principal component analysis (PCA) with the selection of components. PCA is a method of feature extraction. Not all of the components in PCA contribute to classification, because the PCA objective is not a form of discriminant analysis (see, e.g., Jolliffe, 2002). In this context, we present a component selection function that returns the components useful for classification, based on performance indicators measured on different subsets of the components. Compared with traditional feature selection methods, SVM classifiers trained on the selected components show improved classification performance and reduced computational overhead.
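The idea of selecting components by a performance indicator can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how subsets are enumerated or which indicator is used, so the greedy forward search, the names `select_components` and `centroid_accuracy`, and the toy data are all assumptions. A leave-one-out nearest-centroid classifier stands in for the SVM, and the input is assumed to be samples already projected onto the principal components.

```python
from typing import Callable, List, Sequence, Tuple

Rows = Sequence[Sequence[float]]


def centroid_accuracy(rows: Rows, labels: Sequence[int]) -> float:
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Assumption: a cheap stand-in for the SVM-based indicator in the abstract.
    """
    correct = 0
    for i, (x, y) in enumerate(zip(rows, labels)):
        sums, counts = {}, {}
        # Build class centroids from every sample except the held-out one.
        for j, (r, lab) in enumerate(zip(rows, labels)):
            if j == i:
                continue
            acc = sums.setdefault(lab, [0.0] * len(r))
            for k, v in enumerate(r):
                acc[k] += v
            counts[lab] = counts.get(lab, 0) + 1
        # Predict the class whose centroid is closest (squared Euclidean).
        pred = min(
            sums,
            key=lambda lab: sum(
                (x[k] - sums[lab][k] / counts[lab]) ** 2 for k in range(len(x))
            ),
        )
        correct += pred == y
    return correct / len(rows)


def select_components(
    scores: Rows,
    labels: Sequence[int],
    indicator: Callable[[Rows, Sequence[int]], float] = centroid_accuracy,
) -> Tuple[List[int], float]:
    """Greedy forward selection of component indices (hypothetical strategy).

    Adds the component that most improves the indicator and stops as soon
    as no remaining component gives a strict improvement.
    """
    remaining = list(range(len(scores[0])))
    selected: List[int] = []
    best = 0.0
    while remaining:
        trials = []
        for c in remaining:
            subset = selected + [c]
            projected = [[row[k] for k in subset] for row in scores]
            trials.append((indicator(projected, labels), c))
        # Prefer the higher score; break ties by the lower component index.
        score, comp = max(trials, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break
        selected.append(comp)
        remaining.remove(comp)
        best = score
    return sorted(selected), best


# Toy component scores (made up): component 0 has the largest variance but
# is unrelated to the class; component 1 has small variance but separates
# the two classes.
scores = [
    [-5.0, -1.0],
    [5.0, -0.8],
    [-4.0, 1.1],
    [4.0, 0.9],
    [-6.0, -1.2],
    [6.0, 1.0],
]
labels = [0, 0, 1, 1, 0, 1]

selected, best = select_components(scores, labels)
print(selected, best)  # → [1] 1.0
```

In this toy case the highest-variance component is discarded and only the second component is kept, which mirrors the point that variance ranking alone does not indicate usefulness for classification. Swapping in a cross-validated SVM accuracy as `indicator` would bring the sketch closer to the setting described in the abstract.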
