Detecting Changes in Unlabeled Data Streams using Martingale
Shen-Shyang Ho, Harry Wechsler
The martingale framework for detecting changes in data streams, so far applicable only to labeled data, is extended here to unlabeled data using a clustering concept. The one-pass incremental change-detection algorithm (i) does not require a sliding window on the data stream, (ii) does not require monitoring the performance of the clustering algorithm as data points stream in, and (iii) works well for high-dimensional data streams. To improve the performance of the martingale change-detection method, a multiple-martingale test using multiple views is proposed. Experimental results show (i) the feasibility of the martingale method for detecting changes in unlabeled data streams, and (ii) that the multiple-martingale test compares favorably with alternative methods in terms of recall and precision on the video-shot change detection problem.
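To make the idea concrete, the following is a minimal sketch of a martingale change test on an unlabeled stream. It is not taken from this abstract, and every specific in it is an assumption: scalar data points, strangeness measured as the distance to the running mean (a one-cluster stand-in for the clustering concept), the randomized power martingale with epsilon = 0.92, and an alarm threshold of 20. The function names are hypothetical.

```python
import random


def p_value(strangeness, theta):
    """Randomized p-value of the newest point (last entry of the list).

    Counts points strictly stranger than the newest one, plus a random
    fraction theta of the ties (the newest point ties with itself).
    """
    newest = strangeness[-1]
    greater = sum(1 for s in strangeness if s > newest)
    ties = sum(1 for s in strangeness if s == newest)
    return (greater + theta * ties) / len(strangeness)


def martingale_step(m, p, epsilon=0.92):
    """One multiplicative update of the randomized power martingale."""
    p = max(p, 1e-12)  # guard against p == 0 when theta happens to be 0.0
    return m * epsilon * p ** (epsilon - 1.0)


def detect_changes(stream, epsilon=0.92, threshold=20.0, seed=0):
    """Return the stream indices at which the martingale exceeds the threshold.

    Assumption: points are scalars and strangeness is the distance to the
    mean of all points seen since the last alarm. No sliding window is used;
    the state is reset only when a change is declared.
    """
    rng = random.Random(seed)
    alarms = []
    points, total, m = [], 0.0, 1.0
    for t, x in enumerate(stream):
        points.append(x)
        total += x
        center = total / len(points)
        strangeness = [abs(v - center) for v in points]
        m = martingale_step(m, p_value(strangeness, rng.random()), epsilon)
        if m > threshold:
            alarms.append(t)
            points, total, m = [], 0.0, 1.0  # start over after a change
    return alarms
```

While the points remain exchangeable, the p-values are uniformly distributed and the martingale tends to stay small; after a change, new points look strange, their p-values shrink, and the martingale grows until it crosses the threshold. For high-dimensional data, the absolute difference would be replaced by a distance to a cluster center, and a multiple-view variant would run one such martingale per feature representation.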