An Empirical Study of the Noise Impact on Cost-Sensitive Learning

Xingquan Zhu, Xindong Wu, Taghi Khoshgoftaar, Shi Yong

In this paper, we perform an empirical study of the impact of noise on cost-sensitive (CS) learning by observing how a CS learner reacts to mislabeled training examples in terms of misclassification cost and classification accuracy. Our empirical results and theoretical analysis indicate that mislabeled training examples can raise serious concerns for cost-sensitive classification, especially when misclassifying some classes becomes extremely expensive. Compared to general inductive learning, noise handling and data cleansing are more crucial in CS learning and should be carefully investigated to ensure its success.