Reading Between the Lines

Reading involves, among others, identifying what is implied but not expressed in text. This task, known as textual entailment, offers a natural abstraction for many NLP tasks, and has been recognized as a central tool for the new area of Machine Reading. Important in the study of textual entailment is making precise the sense in which something is implied by text. The operational definition often employed is a subjective one: something is implied if humans are more likely to believe it given the truth of the text, than otherwise. In this work we propose a natural objective definition for textual entailment. Our approach is to view text as a partial depiction of some underlying hidden reality. Reality is mapped into text through a possibly stochastic process, the author of the text. Textual entailment is then formalized as the task of accurately, in a defined sense, recovering information about this hidden reality. We show how existing machine learning work can be applied to this information recovery setting, and discuss the implications for the construction of machines that autonomously engage in textual entailment. We then investigate the role of using multiple inference rules for this task. We establish that such rules cannot be learned and applied in parallel, but that layered learning and reasoning are necessary.

Loizos Michael