Unveiling Maternity and Infant Care Conversations: A Chinese Dialogue Dataset for Enhanced Parenting Support
Bo Xu, Liangzhi Li, Junlong Wang, Xuening Qiao, Erchen Yu, Yiming Qian, Linlin Zong, Hongfei Lin
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 8304-8312.
https://doi.org/10.24963/ijcai.2025/923
The rapid development of large language models has greatly advanced human-computer dialogue research. However, applying these models to specialized fields like maternity and infant care often leads to subpar performance due to a lack of domain-specific datasets. To address this problem, we have created MicDialogue, a Chinese dialogue dataset for maternity and infant care. MicDialogue covers a wide range of specialized topics, including gynecological health, pediatric care, pregnancy preparation, and emotional counseling, among others. The dataset is curated from two types of Chinese social media: short videos and blog posts. Short videos capture real-time interactions and pragmatic dialogue patterns, while blog posts offer comprehensive coverage of various topics within the domain. We have also included detailed annotations for topics, diseases, symptoms, and causes, enabling in-depth research. Additionally, we developed a knowledge-driven benchmark model using LLM-based prompt learning and multiple knowledge graphs to address diverse dialogue topics. Experiments validate MicDialogue's usability, providing benchmarks for future research and essential data for fine-tuning language models in maternity and infant care.
Keywords:
Natural Language Processing: NLP: Resources and evaluation
Natural Language Processing: NLP: Applications
Natural Language Processing: NLP: Dialogue and interactive systems
Natural Language Processing: NLP: Discourse and pragmatics
