Incorporating User Behaviors in New Word Detection

In this paper, we propose a novel method for detecting new words in domain-specific fields based on user behaviors. First, we select the most representative words from a domain-specific lexicon. Then, combining these words with user behaviors, we discover potential experts in the field, namely users who use this terminology frequently. Finally, we identify new words from the behaviors of those experts: words used much more frequently in this community than by other users are most likely new words. In brief, our method works in a manner similar to collaborative filtering: it first goes from words to professional experts, and then from experts to new words, which differs from traditional new word detection methods. Our method achieves an accuracy of up to 0.86 on a computer science related data set. Moreover, the proposed method can be easily extended to the related-words retrieval task. We compare our method with Google Sets and Bayesian Sets. Experiments show that our method and Bayesian Sets give better results than Google Sets.
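
The two-step procedure described above (from words to experts, then from experts to new words) can be illustrated with a minimal sketch. This is not the exact formulation used in the paper: the function names `find_experts` and `rank_candidates`, the seed-share expert score, the add-one smoothed frequency ratio, and the toy usage data are all assumptions made for illustration only.

```python
from collections import Counter


def find_experts(user_words, seed_words, top_k):
    """Step 1 (words -> experts): rank users by the share of their
    word usage that falls on the seed terms. Hypothetical scoring."""
    scores = {}
    for user, counts in user_words.items():
        total = sum(counts.values())
        if total == 0:
            continue
        seed_hits = sum(counts.get(w, 0) for w in seed_words)
        scores[user] = seed_hits / total
    # Sort by descending score; break ties by user id for determinism.
    ranked = sorted(scores, key=lambda u: (-scores[u], u))
    return ranked[:top_k]


def rank_candidates(user_words, experts, known_words, smoothing=1.0):
    """Step 2 (experts -> new words): rank words not already in the
    lexicon by how much more often experts use them than other users.
    The smoothed ratio below is an assumed stand-in for the paper's measure."""
    expert_set = set(experts)
    expert_counts, other_counts = Counter(), Counter()
    for user, counts in user_words.items():
        target = expert_counts if user in expert_set else other_counts
        target.update(counts)

    expert_total = sum(expert_counts.values())
    other_total = sum(other_counts.values())
    vocab_size = len(set(expert_counts) | set(other_counts))

    ratios = {}
    for word, count in expert_counts.items():
        if word in known_words:
            continue
        p_expert = (count + smoothing) / (expert_total + smoothing * vocab_size)
        p_other = (other_counts.get(word, 0) + smoothing) / (
            other_total + smoothing * vocab_size
        )
        ratios[word] = p_expert / p_other
    return sorted(ratios, key=lambda w: (-ratios[w], w))


# Toy usage data (invented): user id -> word usage counts.
user_words = {
    "u1": Counter({"compiler": 5, "kernel": 3, "hadoop": 4, "weather": 1}),
    "u2": Counter({"kernel": 4, "compiler": 2, "hadoop": 3, "music": 1}),
    "u3": Counter({"weather": 6, "music": 4, "kernel": 1}),
    "u4": Counter({"music": 5, "weather": 3, "football": 4}),
}
seed_words = {"compiler", "kernel"}  # representative lexicon words (assumed)

experts = find_experts(user_words, seed_words, top_k=2)
candidates = rank_candidates(user_words, experts, known_words=seed_words)
print(experts)
print(candidates)
```

In this toy example, the two users who rely most heavily on the seed terms are selected as experts, and the word they share that the other users never use is ranked first among the candidates. A real system would need a much larger usage log and a threshold on the ratio rather than a plain ranking.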

Yabin Zheng, Zhiyuan Liu, Maosong Sun, Liyun Ru, Yang Zhang