FinBERT: A Pre-trained Financial Language Representation Model for Financial Text Mining

Zhuang Liu, Degen Huang, Kaiyu Huang, Zhuang Li, Jun Zhao

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Special Track on AI in FinTech. Pages 4513-4519. https://doi.org/10.24963/ijcai.2020/622

There is growing interest in financial text mining. Over the past few years, deep-learning-based Natural Language Processing (NLP) has advanced rapidly and shown promising results on financial text mining tasks. However, NLP models require large amounts of labeled training data, and because such data are scarce in the financial domain, applying deep learning to financial text mining is often unsuccessful. To address this issue, we present FinBERT (BERT for Financial Text Mining), a domain-specific language model pre-trained on large-scale financial corpora. Unlike BERT, FinBERT is trained with six pre-training tasks covering a broader range of knowledge, simultaneously on general corpora and financial-domain corpora, which enables the model to better capture language knowledge and semantic information. The results show that FinBERT outperforms all current state-of-the-art models, and extensive experiments demonstrate its effectiveness and robustness. The source code and pre-trained models of FinBERT are available online.
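The abstract describes training FinBERT on six pre-training tasks simultaneously. One common way to realize such multi-task pre-training is to combine the per-task losses into a single objective at each optimization step. The sketch below illustrates that idea only; the task names and equal weighting are hypothetical, since the abstract does not specify the six tasks or how their losses are combined.

```python
def joint_pretraining_loss(task_losses, weights=None):
    """Combine per-task losses into one scalar objective.

    task_losses: dict mapping task name -> scalar loss value.
    weights: optional dict of per-task weights; defaults to 1.0 each.
    """
    if weights is None:
        weights = {name: 1.0 for name in task_losses}
    return sum(weights[name] * loss for name, loss in task_losses.items())


# Hypothetical task names for illustration -- the paper's actual six
# pre-training tasks are described in the full text, not the abstract.
losses = {
    "task_1": 2.1,
    "task_2": 0.4,
    "task_3": 1.3,
    "task_4": 0.9,
    "task_5": 1.7,
    "task_6": 0.6,
}
total = joint_pretraining_loss(losses)
print(total)  # sum of the six losses with unit weights
```

In practice the combined scalar would be backpropagated through a shared encoder, so every optimization step updates the model with signal from all tasks at once.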
Keywords:
Foundation for AI in FinTech: Data mining and knowledge discovery for FinTech
Foundation for AI in FinTech: Deep learning and representation for FinTech
Foundation for AI in FinTech: General
Foundation for AI in FinTech: Analyzing big financial data
AI for lending: General
AI for marketing: General
AI for marketing: AI for consumer sentiment analysis
AI for payment: AI for payment risk modeling
Other areas: Financial decision-support system