Towards Reading Comprehension for Long Documents

Yuanxing Zhang, Yangbin Zhang, Kaigui Bian, Xiaoming Li

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 4588-4594. https://doi.org/10.24963/ijcai.2018/638

Machine reading comprehension has gained attention from both industry and academia. It is a challenging task that draws on several areas, including language comprehension, knowledge inference, and summarization. Previous studies mainly focus on reading comprehension over short paragraphs, and these approaches fail to perform well on long documents. In this paper, we propose a hierarchical match attention model that instructs the machine to extract answers from a specific short span of passages for the long document reading comprehension (LDRC) task. The model takes advantage of a hierarchical LSTM to learn paragraph-level representations, and implements a match mechanism (i.e., quantifying the relationship between two contexts) to find the most appropriate paragraph, the one that contains the hint of the answer. The task can then be decoupled into a reading comprehension task on a short paragraph, from which the answer is produced. Experiments on the modified SQuAD dataset show that our proposed model outperforms existing reading comprehension models by at least 20% in exact match (EM), F1, and the proportion of identified paragraphs that are exactly the short paragraphs where the original answers are located.
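The two-stage idea described in the abstract (first pick the paragraph that best matches the question, then run short-paragraph reading comprehension on it) can be illustrated with a minimal sketch. This is not the proposed model: the bag-of-words cosine similarity below is an assumed stand-in for the learned hierarchical-LSTM match mechanism, the function names and example paragraphs are hypothetical, and the EM and F1 functions are simplified versions of the usual SQuAD-style metrics (they lowercase and drop punctuation but do not strip articles). The answer extraction stage itself is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_score(question, paragraph):
    """Cosine similarity of bag-of-words counts.

    Assumption: a crude stand-in for the learned match mechanism,
    which quantifies the relationship between two contexts.
    """
    q, p = Counter(tokenize(question)), Counter(tokenize(paragraph))
    dot = sum(q[t] * p[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in p.values())))
    return dot / norm if norm else 0.0


def select_paragraph(question, paragraphs):
    """Stage 1: return the index of the best-matching paragraph."""
    scores = [match_score(question, p) for p in paragraphs]
    return max(range(len(paragraphs)), key=scores.__getitem__)


def exact_match(prediction, truth):
    """1.0 if the token sequences are identical, else 0.0 (simplified EM)."""
    return float(tokenize(prediction) == tokenize(truth))


def f1_score(prediction, truth):
    """Token-overlap F1 between predicted and reference answers (simplified)."""
    pred, gold = tokenize(prediction), tokenize(truth)
    common = sum((Counter(pred) & Counter(gold)).values())
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical long document, split into paragraphs.
paragraphs = [
    "The Amazon rainforest covers much of the Amazon basin of South America.",
    "The Eiffel Tower was completed in 1889 for the World's Fair in Paris.",
    "Photosynthesis converts light energy into chemical energy in plants.",
]
question = "When was the Eiffel Tower completed?"

best = select_paragraph(question, paragraphs)
print(best, paragraphs[best])
```

In this sketch, a short-paragraph reading comprehension model would then be applied to `paragraphs[best]` only, and its predicted span would be scored against the reference answer with `exact_match` and `f1_score`.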
Keywords:
Natural Language Processing: Natural Language Processing
Natural Language Processing: Question Answering
Natural Language Processing: Coreference Resolution