Genetic Prompt Search via Exploiting Language Model Probabilities

Jiangjiang Zhao, Zhuoran Wang, Fangchun Yang

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 5296-5305. https://doi.org/10.24963/ijcai.2023/588

Prompt tuning for large-scale pretrained language models (PLMs) has shown remarkable potential, especially in low-resource scenarios such as few-shot learning. Moreover, derivative-free optimisation (DFO) techniques make it possible to tune prompts for a black-box PLM to better fit downstream tasks. However, there are usually preconditions to apply existing DFO-based prompt tuning methods, e.g. the backbone PLM needs to provide extra APIs so that hidden states (and/or embedding vectors) can be injected into it as continuous prompts, or carefully designed (discrete) manual prompts need to be available beforehand, serving as the initial states of the tuning algorithm. To waive such preconditions and make DFO-based prompt tuning ready for general use, this paper introduces a novel genetic algorithm (GA) that evolves from empty prompts, and uses the predictive probabilities derived from the backbone PLM(s) on the basis of a (few-shot) training set to guide the token selection process during prompt mutations. Experimental results on diverse benchmark datasets show that the proposed precondition-free method significantly outperforms the existing DFO-style counterparts that require preconditions, including black-box tuning, genetic prompt search and gradient-free instructional prompt search.
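To make the high-level idea in the abstract concrete, the sketch below shows one way a probability-guided genetic prompt search could be structured: a population that starts from the empty prompt, a fitness score given by the PLM's average log-probability of the gold labels on a few-shot set, and mutations that insert tokens proposed from the PLM's predictive distribution. This is only an illustrative assumption, not the paper's exact algorithm; in particular, `score_fn`, `propose_tokens`, and all hyperparameters are hypothetical placeholders supplied by the caller.

```python
# Minimal, illustrative sketch of a probability-guided genetic prompt search.
# NOT the authors' exact method: `score_fn`, `propose_tokens`, and the
# mutation/selection details below are assumptions made for illustration.
import random
from typing import Callable, List, Sequence, Tuple

Prompt = Tuple[str, ...]  # a discrete prompt represented as a tuple of tokens


def evolve_prompts(
    train_set: Sequence[Tuple[str, str]],            # few-shot (input, label) pairs
    score_fn: Callable[[Prompt, str, str], float],   # log P(label | prompt + input) from the PLM
    propose_tokens: Callable[[Prompt, Sequence[Tuple[str, str]]], List[str]],
    generations: int = 20,
    population_size: int = 16,
    keep_top: int = 4,
    max_len: int = 8,
    seed: int = 0,
) -> Prompt:
    """Evolve discrete prompts from the empty prompt, guided by PLM probabilities."""
    rng = random.Random(seed)

    def fitness(prompt: Prompt) -> float:
        # Average log-probability of the gold labels on the few-shot training set.
        return sum(score_fn(prompt, x, y) for x, y in train_set) / len(train_set)

    def mutate(prompt: Prompt) -> Prompt:
        # Insert (or, at maximum length, replace) a PLM-suggested token.
        token = rng.choice(propose_tokens(prompt, train_set))
        if len(prompt) < max_len:
            pos = rng.randrange(len(prompt) + 1)
            return prompt[:pos] + (token,) + prompt[pos:]
        pos = rng.randrange(len(prompt))
        return prompt[:pos] + (token,) + prompt[pos + 1:]

    population: List[Prompt] = [()]  # start from the empty prompt
    for _ in range(generations):
        parents = sorted(population, key=fitness, reverse=True)[:keep_top]
        children = [mutate(rng.choice(parents)) for _ in range(population_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    # Toy demo with stand-in scorer/proposer (no real PLM involved):
    data = [("great movie", "positive"), ("terrible plot", "negative")]
    toy_score = lambda p, x, y: float("review" in p)          # rewards prompts containing "review"
    toy_propose = lambda p, d: ["review", "sentiment", "It", "was"]
    print(evolve_prompts(data, toy_score, toy_propose, generations=5))
```

In practice, `score_fn` would query the black-box PLM for the label probability given the prompt prepended to each training example, and `propose_tokens` would rank candidate tokens by the PLM's predictive distribution rather than returning a fixed list.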
Keywords:
Natural Language Processing: NLP: Language models
Machine Learning: ML: Few-shot learning
Natural Language Processing: NLP: Other