Dealing with Concept Drift and Class Imbalance in Multi-Label Stream Classification
Eleftherios Spyromitros Xioufis, Myra Spiliopoulou, Grigorios Tsoumakas, Ioannis Vlahavas
Data streams containing objects that are (or can be) associated with more than one label at the same time are ubiquitous. In spite of its important applications, classification of streaming multi-label data is largely unexplored. Existing approaches try to tackle the problem by transferring traditional single-label stream classification practices to the multi-label domain. Nevertheless, they fail to consider some of the unique properties of the problem such as within and between class imbalance and multiple concept drift. To deal with these challenges, this paper proposes a novel multi-label stream classification approach that employs two windows for each label, one for positive and one for negative examples. Instance-sharing is exploited for space efficiency, while a time-efficient instantiation based on the k-Nearest Neighbor algorithm is also proposed. Finally, a batch-incremental thresholding technique is proposed to further deal with the class imbalance problem. Results of an empirical comparison against two other methods on three real world datasets are in favor of the proposed approach.