A Novel Data Representation for Effective Learning in Class Imbalanced Scenarios

Sri Harsha Dumpala, Rupayan Chakraborty, Sunil Kumar Kopparapu

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 2100-2106. https://doi.org/10.24963/ijcai.2018/290

Class imbalance refers to the scenario where certain classes are heavily under-represented in the training data relative to other classes. This hinders the application of conventional machine learning algorithms to the many classification problems in which class imbalance is prominent. Most existing methods that address class imbalance rely on either sampling techniques or cost-sensitive learning, and thus inherit the shortcomings of those approaches. In this paper, we introduce a novel approach to the class imbalance problem that differs from both sampling-based and cost-sensitive techniques: two samples are considered simultaneously to train the classifier. Further, we propose a mechanism that uses a single base classifier, instead of an ensemble of classifiers, to obtain the output label of a test sample by majority voting. Experimental results on several benchmark datasets indicate the advantage of the proposed approach over existing state-of-the-art techniques.
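The two ideas named in the abstract, training on two samples at once and voting with a single base classifier, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how samples are paired, how labels are encoded, or which classifier is used. The sketch assumes binary labels, concatenates every ordered pair of training samples, encodes the pair's label as `2 * y_i + y_j`, and uses a hand-written nearest-centroid model as a stand-in base classifier. The names `make_pairs`, `NearestCentroid`, and `predict_by_vote` are hypothetical.

```python
import numpy as np

def make_pairs(X, y):
    """Concatenate every ordered pair of samples (assumed pairing scheme)."""
    n = len(X)
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    i, j = i.ravel(), j.ravel()
    X_pairs = np.hstack([X[i], X[j]])
    # Assumed joint-label encoding for binary y: (y_i, y_j) -> 2*y_i + y_j
    y_pairs = 2 * y[i] + y[j]
    return X_pairs, y_pairs

class NearestCentroid:
    """Stand-in base classifier; any multi-class model could be used here."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Squared Euclidean distance from each row to each class centroid
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

def predict_by_vote(clf, x_test, X_ref):
    """Pair x_test with each reference sample and majority-vote its label."""
    pairs = np.hstack([np.tile(x_test, (len(X_ref), 1)), X_ref])
    joint = clf.predict(pairs)
    votes = joint // 2  # decode the label of the first element of each pair
    return int(np.bincount(votes, minlength=2).argmax())

# Toy imbalanced data (made up for illustration): 20 majority, 4 minority samples
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(5.0, 0.5, size=(4, 2))])
y = np.array([0] * 20 + [1] * 4)

X_pairs, y_pairs = make_pairs(X, y)
clf = NearestCentroid().fit(X_pairs, y_pairs)

label_minority = predict_by_vote(clf, np.array([5.0, 5.0]), X)
label_majority = predict_by_vote(clf, np.array([0.0, 0.0]), X)
```

In this sketch, pairing turns 24 training samples into 576 paired samples, and the 4 minority samples appear in 176 of them, so the under-represented class contributes far more training instances than it does in the original data. One trained model produces one vote per reference sample, which is how a single classifier can supply the votes that would otherwise come from an ensemble.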
Keywords:
Machine Learning: Classification
Machine Learning: Machine Learning
Machine Learning: Neural Networks
Machine Learning: Ensemble Methods