Domain Adaptation with Topical Correspondence Learning / 1280
Zheng Chen, Weixiong Zhang
A serious and ubiquitous issue in machine learning is the lack of sufficient training data in a domain of interest. Domain adaptation isan effective approach to dealing with this problem by transferring information or models learned from related, albeit distinct, domains to the target domain. We develop a novel domain adaptation method for text document classification under the framework of Non-negative Matrix Factorization. Two key ideas of our method are to construct a latent topic space where a topic is decomposed into common words shared by all domains and words specific to individual domains, and then to establish associations between words in different domains through the common words as a bridge for knowledge transfer. The correspondence between cross-domain topics leads to more coherent distributions of source and target domains in the new representation while preserving the predictive power. Our new method outperformed several state-of-the-art domain adaptation methods on several benchmark datasets.