De-biasing Covariance-Regularized Discriminant Analysis

Haoyi Xiong, Wei Cheng, Yanjie Fu, Wenqing Hu, Jiang Bian, Zhishan Guo

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 2889-2897. https://doi.org/10.24963/ijcai.2018/401

Fisher's Linear Discriminant Analysis (FLD) is a well-known technique for linear classification, feature extraction, and dimension reduction. Empirical FLD relies on two key estimates from the data: the mean vector of each class and the (inverse) covariance matrix. To improve the accuracy of FLD under High Dimension Low Sample Size (HDLSS) settings, Covariance-Regularized FLD (CRLD) has been proposed, which uses shrunken covariance estimators, such as Graphical Lasso, to strike a balance between bias and variance. Although CRLD can achieve better classification accuracy, the shrinkage usually incurs bias, so the estimator converges to the optimal result at a slower asymptotic rate. Inspired by recent progress on the de-biased Lasso, we propose a novel FLD classifier, DBLD, which improves the classification accuracy of CRLD through de-biasing. Theoretical analysis shows that DBLD enjoys better asymptotic properties than CRLD. We conduct experiments on both synthetic and real-world datasets to confirm our theoretical analysis and to demonstrate the superiority of DBLD over classical FLD, CRLD, and other competitors under HDLSS settings.
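As a rough illustration of the covariance-regularized baseline that DBLD builds on, the sketch below fits a binary Fisher discriminant whose inverse covariance is estimated with scikit-learn's GraphicalLasso. The function names (crld_fit, crld_predict), the penalty value alpha, and the synthetic HDLSS data are illustrative assumptions, not the paper's implementation, and the de-biasing step that distinguishes DBLD from CRLD is not shown here.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def crld_fit(X, y, alpha=0.2):
    """Covariance-regularized Fisher discriminant (CRLD-style baseline, binary case).

    X: (n, p) data matrix, y: labels in {0, 1}.
    alpha: Graphical Lasso sparsity penalty (illustrative default).
    """
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pool the within-class centered samples and estimate a shrunken precision matrix.
    centered = np.vstack([X0 - mu0, X1 - mu1])
    gl = GraphicalLasso(alpha=alpha, assume_centered=True).fit(centered)
    precision = gl.precision_            # regularized inverse covariance estimate
    w = precision @ (mu1 - mu0)          # Fisher discriminant direction
    b = -0.5 * w @ (mu0 + mu1)           # midpoint threshold (equal class priors assumed)
    return w, b

def crld_predict(X, w, b):
    # Classify by the sign of the linear discriminant score.
    return (X @ w + b > 0).astype(int)

if __name__ == "__main__":
    # Toy HDLSS setting: fewer samples per class than features.
    rng = np.random.default_rng(0)
    n, p = 40, 50
    mu = np.zeros(p)
    mu[:5] = 1.0                          # only a few features separate the classes
    X = np.vstack([rng.normal(0.0, 1.0, (n, p)), rng.normal(mu, 1.0, (n, p))])
    y = np.repeat([0, 1], n)
    w, b = crld_fit(X, y, alpha=0.2)
    print("train accuracy:", (crld_predict(X, w, b) == y).mean())
```

In this sketch the regularization enters only through the Graphical Lasso precision estimate; the paper's contribution is to correct the bias that this shrinkage introduces before forming the discriminant.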
Keywords:
Machine Learning: Data Mining
Machine Learning Applications: Applications of Supervised Learning