Can Cross Entropy Loss Be Robust to Label Noise?

Lei Feng, Senlin Shu, Zhuoyi Lin, Fengmao Lv, Li Li, Bo An

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 2206-2212. https://doi.org/10.24963/ijcai.2020/305

Trained with the standard cross entropy loss, deep neural networks can achieve great performance on correctly labeled data. However, if the training data is corrupted with label noise, deep models tend to overfit the noisy labels, thereby achieving poor generalization performance. To remedy this issue, several loss functions have been proposed and demonstrated to be robust to label noise. Although most of these robust loss functions stem from Categorical Cross Entropy (CCE) loss, they fail to embody the intrinsic relationships between CCE and other loss functions. In this paper, we propose a general framework dubbed Taylor cross entropy loss to train deep models in the presence of label noise. Specifically, our framework enables us to weight the extent to which the training labels are fitted by controlling the order of the Taylor series expansion of CCE, and hence it can be robust to label noise. In addition, our framework clearly reveals the intrinsic relationships between CCE and other loss functions, such as Mean Absolute Error (MAE) and Mean Squared Error (MSE). Moreover, we present a detailed theoretical analysis to certify the robustness of this framework. Extensive experimental results on benchmark datasets demonstrate that our proposed approach significantly outperforms state-of-the-art counterparts.
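To make the truncation idea concrete: for 0 < p <= 1, CCE satisfies the identity -log(p) = sum_{t>=1} (1 - p)^t / t, so keeping only the first few terms yields a loss that interpolates between MAE-like behavior (one term) and full CCE (infinitely many terms). Below is a minimal sketch of such a truncated loss in PyTorch; the function name and the order parameter are illustrative assumptions, not the authors' reference implementation.

import torch
import torch.nn.functional as F

def taylor_cross_entropy(logits, targets, order=2):
    # Illustrative sketch (not the paper's code): truncate the expansion
    # -log(p_y) = sum_{t>=1} (1 - p_y)^t / t after `order` terms.
    # order=1 gives 1 - p_y, which for one-hot labels equals MAE up to a
    # factor of 2; letting order grow recovers the standard CCE loss.
    probs = F.softmax(logits, dim=1)
    p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    loss = sum((1.0 - p_y) ** t / t for t in range(1, order + 1))
    return loss.mean()

Usage would mirror any other criterion, e.g. loss = taylor_cross_entropy(model(x), y, order=2); a lower order fits the (possibly noisy) labels less aggressively.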
Keywords:
Machine Learning: Classification
Data Mining: Classification, Semi-Supervised Learning