In-context Learning Demonstration Generation with Text Distillation
Wuyuqing Wang, Erkun Yang, Zilan Zhou, Cheng Deng
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 6433-6441.
https://doi.org/10.24963/ijcai.2025/716
In-context learning (ICL), a paradigm enabled by large language models (LLMs), holds significant promise but is notably sensitive to the choice of input demonstrations. While numerous methods have been developed to select optimal demonstrations from existing datasets, our work instead proposes to generate representative demonstrations through a Distillation-based Demonstration Generation (DDG) framework.
Specifically, our approach aims to generate demonstrations that encapsulate the essential attributes of the target dataset. Rather than optimizing these demonstrations directly, we design a generative model and refine it by minimizing the discrepancy between models trained on the generated demonstrations and models trained on the original dataset, respectively. Additionally, we leverage a teacher-student framework to stabilize the training process and improve the quality of the synthesized samples. Extensive experiments conducted across ten prevalent text datasets demonstrate that our DDG method substantially outperforms existing state-of-the-art methodologies. Our code will be available at https://github.com/wwyq1/DDG.
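To make the distillation-style objective described above concrete, the following is a minimal PyTorch sketch. It is not the authors' implementation: the gradient-matching loss, the linear student model, the feature dimensions, and the EMA teacher update are all illustrative assumptions standing in for the paper's actual choices.

    # Hypothetical sketch: refine a generator of synthetic demonstrations by matching
    # the training signal (here, gradients) that real data and synthetic data induce.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    dim, n_classes = 32, 4

    # Generator produces synthetic demonstration embeddings; its parameters,
    # not the demonstrations themselves, are optimized.
    generator = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))
    student = nn.Linear(dim, n_classes)   # model trained on the synthetic demonstrations
    teacher = nn.Linear(dim, n_classes)   # slowly updated copy that stabilizes training
    teacher.load_state_dict(student.state_dict())

    opt_gen = torch.optim.Adam(generator.parameters(), lr=1e-3)
    opt_stu = torch.optim.SGD(student.parameters(), lr=1e-2)

    def grad_match_loss(model, real_x, real_y, syn_x, syn_y):
        """Discrepancy between gradients from real data and from synthetic demos."""
        g_real = torch.autograd.grad(F.cross_entropy(model(real_x), real_y),
                                     model.parameters(), create_graph=False)
        g_syn = torch.autograd.grad(F.cross_entropy(model(syn_x), syn_y),
                                    model.parameters(), create_graph=True)
        return sum(F.mse_loss(a, b.detach()) for a, b in zip(g_syn, g_real))

    for step in range(100):
        real_x = torch.randn(64, dim)                 # stand-in for real text features
        real_y = torch.randint(0, n_classes, (64,))
        seed = torch.randn(n_classes * 2, dim)        # two demonstrations per class
        syn_y = torch.arange(n_classes).repeat(2)
        syn_x = generator(seed)

        # Refine the generator so synthetic demos mimic the real data's training signal.
        opt_gen.zero_grad()
        grad_match_loss(teacher, real_x, real_y, syn_x, syn_y).backward()
        opt_gen.step()

        # Train the student on the synthetic demonstrations.
        opt_stu.zero_grad()
        F.cross_entropy(student(generator(seed).detach()), syn_y).backward()
        opt_stu.step()

        # Exponential moving average keeps the teacher a smoothed copy of the student.
        with torch.no_grad():
            for t, s in zip(teacher.parameters(), student.parameters()):
                t.mul_(0.99).add_(s, alpha=0.01)

In this sketch the teacher provides a stable reference network for the matching loss while the student is trained on the synthetic demonstrations; the particular loss and update schedule are placeholders for whatever the DDG paper actually uses.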
Keywords:
Machine Learning: ML: Few-shot learning
Natural Language Processing: NLP: Text classification
