AgriBERT: Knowledge-Infused Agricultural Language Models for Matching Food and Nutrition

Saed Rezayi, Zhengliang Liu, Zihao Wu, Chandra Dhakal, Bao Ge, Chen Zhen, Tianming Liu, Sheng Li

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), AI for Good track, pages 5150-5156. https://doi.org/10.24963/ijcai.2022/715

Pretraining domain-specific language models remains an important challenge that limits their applicability in various areas such as agriculture. This paper investigates the effectiveness of leveraging food-related text corpora (e.g., food and agricultural literature) to pretrain transformer-based language models. We evaluate our trained language model, called AgriBERT, on the task of semantic matching, i.e., establishing a mapping between food descriptions and nutrition data, which is a long-standing challenge in the agricultural domain. In particular, we formulate the task as an answer selection problem, fine-tune the trained language model with the help of an external source of knowledge (e.g., the FoodOn ontology), and establish a baseline for this task. The experimental results reveal that our language model substantially outperforms other language models and baselines on the task of matching food descriptions with nutrition data.
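
To illustrate the answer-selection formulation the abstract describes, the following is a minimal sketch: a cross-encoder scores a food description against each candidate nutrition entry and the highest-scoring candidate is selected. The checkpoint name `bert-base-uncased` is a placeholder (the actual AgriBERT weights are not assumed available here), the `rank_candidates` helper is hypothetical, and the FoodOn knowledge-infusion step from the paper is omitted.

```python
# Sketch of food-to-nutrition matching as answer selection with a
# cross-encoder. Placeholder checkpoint; scores are meaningful only
# after the model has been fine-tuned on matched/mismatched pairs.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # stand-in for an AgriBERT checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def rank_candidates(food_description, candidates):
    """Score each candidate nutrition entry against the food description."""
    inputs = tokenizer(
        [food_description] * len(candidates),  # pair the query with every candidate
        candidates,
        padding=True,
        truncation=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    # Probability of the "match" class (index 1) for each pair.
    scores = torch.softmax(logits, dim=-1)[:, 1].tolist()
    return sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)

ranked = rank_candidates(
    "cheddar cheese, shredded",
    ["Cheese, cheddar", "Milk, whole", "Yogurt, plain"],
)
print(ranked[0])  # best-matching nutrition entry and its score
```

In this framing, fine-tuning reduces to training the pair classifier on positive (matched) and negative (mismatched) description-entry pairs, which is why a domain-pretrained encoder such as AgriBERT can improve ranking quality.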
Keywords:
Natural Language Processing: Language Models
Multidisciplinary Topics and Applications: Health and Medicine
Natural Language Processing: Applications
Natural Language Processing: Named Entities