Automatic Construction and Evaluation of a Large Semantically Enriched Wikipedia / 2894
Alessandro Raganato, Claudio Delli Bovi, Roberto Navigli
The hyperlink structure of Wikipedia constitutes a key resource for many Natural Language Processing tasks and applications, as it provides several million semantic annotations of entities in context. Yet only a small fraction of mentions across the entire Wikipedia corpus is linked. In this paper we present the automatic construction and evaluation of a Semantically Enriched Wikipedia in which the overall number of linked mentions has been more than tripled solely by exploiting the structure of Wikipedia itself and the wide-coverage sense inventory of BabelNet. As a result we obtain a sense-annotated corpus with more than 200 million annotations of over 4 million different concepts and named entities. We then show that our corpus leads to competitive results on multiple tasks, such as Entity Linking and Word Similarity.