Boosting Kernel Discriminant Analysis and Its Application to Tissue Classification of Gene Expression Data

Guang Dai, Dit-Yan Yeung

Kernel discriminant analysis (KDA) is one of the most effective nonlinear techniques for dimensionality reduction and feature extraction. It can be applied to a wide range of applications involving high-dimensional data, including images, gene expression data, and text. This paper develops a new algorithm that further improves the overall performance of KDA by integrating boosting with KDA. The proposed method, called boosting kernel discriminant analysis (BKDA), possesses several appealing properties. First, like all kernel methods, it handles nonlinearity in a disciplined manner that is also computationally attractive; second, by introducing pairwise class discriminant information into the discriminant criterion and simultaneously employing boosting to adjust this information robustly, it further improves classification accuracy; third, by computing the significant discriminant information in the null space of the within-class scatter operator, it also deals effectively with the small sample size problem, which is widely encountered in real-world applications of KDA; fourth, by taking advantage of both boosting and KDA, it constitutes a strong ensemble-based KDA framework. Experimental results on gene expression data demonstrate the promising performance of the proposed methodology.
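
To make two of the ingredients named above more concrete, the following is a minimal sketch, not the BKDA algorithm itself. It assumes a linear (non-kernel) setting, and the function name `pairwise_null_space_lda`, the `pair_weights` argument, and the particular pairwise weighting formula are hypothetical choices made for illustration. It restricts the search to the null space of the within-class scatter, which is non-trivial when there are fewer samples than features, and maximizes a pairwise-weighted between-class scatter there. In the paper's setting the pairwise information is adjusted by boosting; here the weights are simply fixed.

```python
import numpy as np

def pairwise_null_space_lda(X, y, pair_weights=None, n_components=None, tol=1e-10):
    """Illustrative linear null-space discriminant analysis (an assumption,
    not the authors' kernel-based method)."""
    classes, idx = np.unique(y, return_inverse=True)
    k, n = len(classes), X.shape[0]
    means = np.array([X[idx == c].mean(axis=0) for c in range(k)])
    counts = np.bincount(idx)

    # Rows of Xw span the range of the within-class scatter S_w = Xw^T Xw.
    Xw = X - means[idx]
    _, s, Vt = np.linalg.svd(Xw, full_matrices=True)
    rank = int(np.sum(s > tol))
    null_basis = Vt[rank:].T  # orthonormal basis of null(S_w)
    if null_basis.shape[1] == 0:
        raise ValueError("within-class scatter has full rank; no null space")

    # Pairwise-weighted between-class scatter, expressed in the null space.
    # With pair_weights=None every class pair gets weight 1 (assumed default).
    if pair_weights is None:
        pair_weights = np.ones((k, k))
    d0 = null_basis.shape[1]
    Sb = np.zeros((d0, d0))
    for i in range(k):
        for j in range(i + 1, k):
            diff = null_basis.T @ (means[i] - means[j])
            w = pair_weights[i, j] * counts[i] * counts[j] / n**2
            Sb += w * np.outer(diff, diff)

    # Leading eigenvectors of Sb give the discriminant directions.
    evals, evecs = np.linalg.eigh(Sb)
    order = np.argsort(evals)[::-1]
    if n_components is None:
        n_components = k - 1
    return null_basis @ evecs[:, order[:n_components]]

# Small-sample-size toy data: 6 samples, 10 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 10))
y = np.array([0, 0, 1, 1, 2, 2])
W = pairwise_null_space_lda(X, y)
Z = X @ W  # projected samples
```

Because every direction in `W` lies in the null space of the within-class scatter, training samples of the same class collapse to a single point in `Z`, while the class means stay separated. A kernel variant would carry out the same steps in the feature space induced by the kernel, and a boosting loop would re-weight the class pairs between rounds; both are left out of this sketch.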

URL: http://www.cse.ust.hk/~dyyeung/paper/pdf/yeung.ijcai2007b.pdf