Improving Few-Shot Text-to-SQL with Meta Self-Training via Column Specificity

Xinnan Guo, Yongrui Chen, Guilin Qi, Tianxing Wu, Hao Xu

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Main Track. Pages 4150-4156. https://doi.org/10.24963/ijcai.2022/576

The few-shot problem is a pressing challenge for single-table text-to-SQL. Existing methods ignore the potential value of unlabeled data and rely on a coarse-grained Meta-Learning (ML) algorithm that neglects the differences in how columns contribute to the optimization objective. This paper proposes a Meta Self-Training text-to-SQL (MST-SQL) method to address both limitations. Specifically, MST-SQL is based on the column-wise HydraNet and adopts self-training as a mechanism for learning from readily available unlabeled samples. During each training epoch, it first predicts pseudo-labels for the unlabeled samples and then uses them to update the model parameters. The update uses a fine-grained ML algorithm that weighs the contribution of each column by its specificity, in order to further improve generalizability. Extensive experimental results on both open-domain and domain-specific benchmarks show that MST-SQL has significant advantages in few-shot scenarios and is also competitive in standard supervised settings.
Keywords:
Natural Language Processing: Question Answering
Data Mining: Information Retrieval
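
The per-epoch procedure described in the abstract (predict pseudo-labels for unlabeled samples, then update with them, weighting columns by specificity) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define column specificity or the meta-learning update, so the specificity scores here are treated as given inputs, the confidence threshold is an added assumption, and a toy one-dimensional nearest-centroid classifier stands in for HydraNet. All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, str]                      # (feature, label)
Predictor = Callable[[float], Tuple[str, float]]  # x -> (label, confidence)


def fit_centroids(samples: Sequence[Sample]) -> Predictor:
    """Toy stand-in model: one centroid per label, nearest centroid wins."""
    groups = {}
    for x, label in samples:
        groups.setdefault(label, []).append(x)
    centroids = {label: sum(xs) / len(xs) for label, xs in groups.items()}

    def predict(x: float) -> Tuple[str, float]:
        dists = {label: abs(x - c) for label, c in centroids.items()}
        best = min(dists, key=dists.get)
        total = sum(dists.values())
        # Confidence is 1.0 when x sits on a centroid, lower near boundaries.
        confidence = 1.0 if total == 0 else 1.0 - dists[best] / total
        return best, confidence

    return predict


def pseudo_label(predict: Predictor, unlabeled: Sequence[float],
                 threshold: float) -> List[Sample]:
    """Keep only unlabeled samples predicted with enough confidence.

    The threshold is an assumption of this sketch, not taken from the paper.
    """
    kept = []
    for x in unlabeled:
        label, confidence = predict(x)
        if confidence >= threshold:
            kept.append((x, label))
    return kept


def self_training(labeled: Sequence[Sample], unlabeled: Sequence[float],
                  epochs: int = 3, threshold: float = 0.8) -> Predictor:
    """Each epoch: predict pseudo-labels, then refit on labeled + pseudo."""
    predict = fit_centroids(labeled)
    for _ in range(epochs):
        pseudo = pseudo_label(predict, unlabeled, threshold)
        predict = fit_centroids(list(labeled) + pseudo)
    return predict


def specificity_weighted_loss(column_losses: Sequence[float],
                              specificity: Sequence[float]) -> float:
    """Weighted mean of per-column losses.

    `specificity` holds hypothetical per-column scores; how the paper
    computes them is not stated in the abstract.
    """
    if len(column_losses) != len(specificity):
        raise ValueError("one specificity score is needed per column loss")
    total = sum(specificity)
    if total == 0:
        raise ValueError("specificity scores must not sum to zero")
    return sum(l * w for l, w in zip(column_losses, specificity)) / total


if __name__ == "__main__":
    model = self_training([(0.0, "a"), (10.0, "b")], [1.0, 9.0, 5.2])
    print(model(2.0)[0], model(8.0)[0])
    print(specificity_weighted_loss([1.0, 3.0], [1.0, 3.0]))
```

In this toy run, the ambiguous sample near the decision boundary is left out of the pseudo-labeled set, which is the usual motivation for filtering by confidence. The weighted loss shows only the general idea of letting some columns count more than others in the objective.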