R2DQG: A Quality Meets Diversity Framework for Question Generation over Knowledge Bases

Yimeng Ren, Yanhua Yu, Lizi Liao, Yuhu Shang, Kangkang Lu, Mingliang Yan

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 8231-8240. https://doi.org/10.24963/ijcai.2025/915

The task of Knowledge-Based Question Generation (KBQG) involves generating natural language questions from structured knowledge sources, posing unique challenges in balancing linguistic diversity and semantic relevance. Existing models often focus on maximizing surface-level similarity to ground-truth questions, neglecting the need for diverse syntactic forms and leading to semantic drift during generation. To overcome these challenges, we propose Refine-Reinforced Diverse Question Generation (R2DQG), a two-phase framework leveraging a generation-then-refinement paradigm. The Generator first constructs a diverse set of expressive templates using dependency parse tree similarity, capturing a wide range of syntactic patterns and styles. These templates guide the creation of question drafts, ensuring both diversity and semantic relevance. In the second phase, a Corrector module refines the drafts to mitigate semantic drift and enhance overall coherence and quality. Experiments on public datasets show that R2DQG outperforms state-of-the-art models in generating diverse, contextually accurate questions. Moreover, synthetic datasets generated by R2DQG enhance downstream QA performance, underscoring the practical utility of our approach.
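The abstract describes a two-phase pipeline: a Generator that assembles syntactically diverse templates by comparing dependency parse trees, and a Corrector that refines the resulting drafts. As a rough illustration of the template-selection idea only, the sketch below scores parse-tree similarity with a simple Jaccard measure over dependency arcs and picks a diverse subset with a farthest-point greedy heuristic. All names here (dep_arc_set, select_diverse_templates, the toy templates and arcs) are illustrative assumptions, not the paper's actual measure or selection procedure.

```python
def dep_arc_set(arcs):
    """Represent a dependency parse as a set of (head_pos, dep_label, child_pos) arcs."""
    return frozenset(arcs)


def jaccard_sim(a, b):
    """Jaccard similarity between two dependency-arc sets (1.0 = identical structure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_diverse_templates(templates, k):
    """Greedily pick k templates whose parse trees are maximally dissimilar.

    templates: list of (template_string, arc_set) pairs.
    Starts from an arbitrary seed, then repeatedly adds the template whose
    maximum similarity to the already-selected set is smallest
    (a farthest-point heuristic, used here as a stand-in for the paper's
    unspecified selection procedure).
    """
    selected = [templates[0]]
    rest = templates[1:]
    while len(selected) < k and rest:
        best = min(
            rest,
            key=lambda t: max(jaccard_sim(t[1], s[1]) for s in selected),
        )
        selected.append(best)
        rest.remove(best)
    return [text for text, _ in selected]


if __name__ == "__main__":
    # Toy templates with hand-written dependency arcs; in practice these
    # would come from a dependency parser run over candidate questions.
    templates = [
        ("What is the <relation> of <subject>?",
         dep_arc_set([("VBZ", "attr", "WP"), ("NN", "prep", "IN")])),
        ("Which <type> has <subject> as its <relation>?",
         dep_arc_set([("VBZ", "dobj", "NN"), ("NN", "det", "WDT")])),
        ("Name the <relation> of <subject>.",
         dep_arc_set([("VB", "dobj", "NN"), ("NN", "prep", "IN")])),
    ]
    print(select_diverse_templates(templates, k=2))
```

In a full system the arc sets would be produced by an off-the-shelf dependency parser (e.g., spaCy or Stanza), and k would trade off syntactic diversity against template coverage; the Corrector phase, which repairs semantic drift in the drafts, is not sketched here.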
Keywords:
Natural Language Processing: NLP: Language generation
Data Mining: DM: Knowledge graphs and knowledge base completion
Natural Language Processing: NLP: Question answering
Natural Language Processing: NLP: Summarization