Beyond Pure Text: Summarizing Financial Reports Based on Both Textual and Tabular Data

Ziao Wang, Zelin Jiang, Xiaofeng Zhang, Jaehyeon Soon, Jialu Zhang, Wang Xiaoyao, Hongwei Du

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 5233-5241. https://doi.org/10.24963/ijcai.2023/581

Abstractive text summarization aims to generate concise summaries that preserve both the salient information and the overall semantic meaning of the given documents. However, real-world documents such as financial reports generally contain rich data, including charts and tables, which renders most existing text summarization approaches inapplicable. This paper therefore proposes a novel approach that summarizes textual and tabular data simultaneously. Specifically, we first manually construct a “table+text → summary” dataset. The tabular data is then embedded both row-wise and column-wise, and the textual data is encoded at the sentence level by a pre-trained model. We propose a salient detector gate that is applied between each pair of row/column and sentence embeddings. Highly correlated content is treated as salient information that must be summarized. Extensive experiments on our constructed dataset yield promising results that demonstrate the effectiveness of the proposed approach with respect to a number of automatic and human evaluation criteria.
Keywords:
Natural Language Processing: NLP: Summarization
Natural Language Processing: NLP: Applications
Natural Language Processing: NLP: Language generation
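
The pairwise gating idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the gate's functional form, so everything below is an assumption made for illustration: the gate is taken to be a sigmoid over a scaled dot product, the embeddings are random stand-ins for the row-wise, column-wise, and sentence-level encoders, and the names `salient_gate`, `random_embeddings`, and `sent_salience` are hypothetical rather than taken from the paper.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def salient_gate(unit_embs, sent_embs):
    """Gate value for every (table unit, sentence) pair.

    Assumption: the gate is sigmoid(dot(unit, sentence) / sqrt(dim)).
    The paper may define it differently.
    """
    scale = math.sqrt(len(sent_embs[0]))
    return [
        [sigmoid(sum(u * s for u, s in zip(unit, sent)) / scale)
         for sent in sent_embs]
        for unit in unit_embs
    ]


def random_embeddings(rng, count, dim):
    """Random vectors standing in for real encoder outputs."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
dim = 8
row_embs = random_embeddings(rng, 3, dim)   # one vector per table row
col_embs = random_embeddings(rng, 4, dim)   # one vector per table column
sent_embs = random_embeddings(rng, 5, dim)  # one vector per sentence

row_gate = salient_gate(row_embs, sent_embs)  # 3 x 5 gate values
col_gate = salient_gate(col_embs, sent_embs)  # 4 x 5 gate values

# A sentence is scored by its strongest gate value against any row or column.
sent_salience = [
    max(max(g[j] for g in row_gate), max(g[j] for g in col_gate))
    for j in range(len(sent_embs))
]

# Sentence indices ordered from most to least salient under this score.
ranked = sorted(range(len(sent_embs)),
                key=lambda j: sent_salience[j], reverse=True)
```

Taking the maximum over rows and columns is only one way to aggregate the pairwise gate values; a learned aggregation, or feeding the gated embeddings directly to a decoder, would be equally consistent with the abstract.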