Learning to Converse with Noisy Data: Generation with Calibration

Mingyue Shang, Zhenxin Fu, Nanyun Peng, Yansong Feng, Dongyan Zhao, Rui Yan

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 4338-4344. https://doi.org/10.24963/ijcai.2018/603

The availability of abundant conversational data on the Internet has brought prosperity to generation-based open-domain conversation systems. When training generation models, existing methods generally treat all training data equally. However, data crawled from websites may contain considerable noise, and blindly training on noisy data can harm the performance of the final generation model. In this paper, we propose a generation-with-calibration framework that allows high-quality data to have more influence on the generation model and reduces the effect of noisy data. Specifically, for each instance in the training set, we employ a calibration network to produce a quality score, which is then used to weight the update of the generation model's parameters. Experiments show that the calibrated model outperforms baseline methods on both automatic evaluation metrics and human annotations.
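
The idea of a score-weighted parameter update can be illustrated with a minimal sketch. It is not the paper's implementation: the generation model is replaced by a toy one-parameter regression, the quality scores are set by hand rather than produced by a calibration network, and the name `weighted_sgd_step` and all numeric values are hypothetical.

```python
from typing import Sequence, Tuple


def weighted_sgd_step(
    w: float,
    batch: Sequence[Tuple[float, float]],
    scores: Sequence[float],
    lr: float = 0.1,
) -> float:
    """One SGD step on y ~ w * x, scaling each instance's gradient by its score."""
    if len(batch) != len(scores):
        raise ValueError("need one quality score per instance")
    grad = 0.0
    for (x, y), s in zip(batch, scores):
        # Gradient of 0.5 * (w * x - y) ** 2 with respect to w, times the score.
        grad += s * (w * x - y) * x
    return w - lr * grad / len(batch)


# Toy data: two clean pairs following y = 2x and one noisy pair.
batch = [(1.0, 2.0), (2.0, 4.0), (1.0, -4.0)]
uniform = [1.0, 1.0, 1.0]       # every instance treated equally
calibrated = [1.0, 1.0, 0.05]   # assumed scores: the noisy pair is down-weighted

w_uniform = 0.0
w_calibrated = 0.0
for _ in range(200):
    w_uniform = weighted_sgd_step(w_uniform, batch, uniform)
    w_calibrated = weighted_sgd_step(w_calibrated, batch, calibrated)

print(w_uniform, w_calibrated)
```

With uniform weights the noisy pair pulls the estimate well away from the clean slope of 2, while the down-weighted run stays close to it. In the framework described above, the scores would come from the calibration network rather than being fixed in advance.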
Keywords:
Natural Language Processing: Dialogue
Natural Language Processing: Natural Language Generation