Multi-Modality Tracker Aggregation: From Generative to Discriminative / 1937
Xiaoqin Zhang, Wei Li, Mingyu Fan, Di Wang, Xiuzi Ye
Visual tracking is an important research topic in the computer vision community. Although numerous tracking algorithms exist in the literature, none outperforms the others under all circumstances, and the best algorithm for a particular dataset may not be known a priori. This motivates a fundamental problem: the need for ensemble learning over different tracking algorithms to overcome their individual drawbacks and to increase generalization ability. This paper proposes a multi-modality ranking aggregation framework for fusing multiple tracking algorithms. In our work, each tracker is viewed as a "ranker" that outputs a ranked list of the candidate image patches based on its own appearance model in a particular modality. The proposed algorithm then aggregates the rankings of the different rankers to produce a joint ranking. Moreover, the level of expertise of each ranker, based on its historical ranking results, is also used in our model. The proposed model not only provides a general framework for fusing multiple tracking algorithms on multiple modalities, but also provides a natural way to combine the advantages of generative-model-based trackers and discriminative-model-based trackers. Because it operates on rankings, it does not need to directly compare the raw outputs of different trackers, a comparison that is usually heuristic. Extensive experiments demonstrate the effectiveness of our work.
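The idea of treating each tracker as a ranker can be illustrated with a minimal sketch. The abstract does not specify the aggregation model, so the code below substitutes a weighted Borda count, a standard rank aggregation rule, purely for illustration; it is not the paper's method. The function name `aggregate_rankings`, the integer candidate indices, and the `weights` argument (standing in for each ranker's level of expertise) are all assumptions.

```python
from typing import Dict, List, Sequence


def aggregate_rankings(
    rankings: Sequence[Sequence[int]],
    weights: Sequence[float],
) -> List[int]:
    """Fuse per-tracker rankings with a weighted Borda count.

    Illustrative stand-in only; the paper's aggregation model is not
    described in the abstract.

    rankings: one list per tracker, candidate patch indices ordered from
        best to worst (hypothetical representation).
    weights: one non-negative weight per tracker, standing in for its
        level of expertise (hypothetical representation).
    """
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")

    scores: Dict[int, float] = {}
    for ranking, weight in zip(rankings, weights):
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            # The top-ranked candidate earns n points, the last earns 1.
            points = weight * (n - position)
            scores[candidate] = scores.get(candidate, 0.0) + points

    # Highest score first; ties are broken by candidate index.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Three hypothetical trackers ranking three candidate patches.
rankings = [[0, 1, 2], [1, 0, 2], [1, 2, 0]]
joint_equal = aggregate_rankings(rankings, [1.0, 1.0, 1.0])
joint_expert = aggregate_rankings(rankings, [5.0, 1.0, 1.0])
```

Only rank positions enter the fusion, so the trackers' raw confidence scores never have to be put on a common scale. With equal weights the majority preference for patch 1 wins; raising the first tracker's weight lets its preferred patch 0 come out on top.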