One Class per Named Entity: Exploiting Unlabeled Text for Named Entity Recognition

Yingchuan Wong, Hwee Tou Ng

In this paper, we present a simple yet novel method of exploiting unlabeled text to further improve the accuracy of a high-performance state-of-the-art named entity recognition (NER) system. The method utilizes the empirical property that many named entities occur in one name class only. Using only unlabeled text as the additional resource, our improved NER system achieves an F1 score of 87.13%, an improvement of 1.17% in F1 score and a 8.3% error reduction on the CoNLL 2003 English NER official test set. This accuracy places our NER system among the top 3 systems in the CoNLL 2003 English shared task.