Improving Function Word Alignment with Frequency and Syntactic Information
Jingyi Zhang, Hai Zhao
In statistical word alignment for machine translation, function words often degrade alignment quality because they lack clear correspondences across languages. This paper proposes a novel approach that improves word alignment by pruning, with high precision and recall, the function word alignments produced by an existing alignment model. First, a language-independent function word recognition algorithm is proposed, based on monolingual and bilingual frequency characteristics. Then, a group of carefully defined syntactic structures, combined with content word alignments, is used to further prune function word alignments. The experimental results show that the proposed approach improves both word alignment quality and statistical machine translation performance on the Chinese-to-English, German-to-English, and French-to-English language pairs.
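The abstract does not specify the recognition algorithm, so the following is only a minimal sketch of how frequency-based function word recognition and pruning might look, not the authors' method. It assumes, purely for illustration, that a function word is one with high monolingual relative frequency and a high-entropy distribution over the target words it is aligned to. The function names, the thresholds `min_rel_freq` and `min_entropy`, and the corpus layout are all hypothetical, and the syntactic-structure step is omitted entirely.

```python
import math
from collections import Counter, defaultdict


def function_word_candidates(corpus, min_rel_freq=0.05, min_entropy=1.0):
    """Return source words that look like function words.

    corpus: iterable of (src_tokens, tgt_tokens, links), where links is a
    set of (src_index, tgt_index) pairs from an existing aligner.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    src_counts = Counter()
    translations = defaultdict(Counter)
    for src, tgt, links in corpus:
        src_counts.update(src)
        for i, j in links:
            translations[src[i]][tgt[j]] += 1

    total = sum(src_counts.values())
    candidates = set()
    for word, count in src_counts.items():
        # Monolingual cue: relative frequency in the source corpus.
        rel_freq = count / total
        # Bilingual cue: entropy (in bits) of the aligned-target distribution.
        aligned = translations[word]
        n = sum(aligned.values())
        entropy = 0.0
        if n:
            entropy = -sum((c / n) * math.log2(c / n) for c in aligned.values())
        if rel_freq >= min_rel_freq and entropy >= min_entropy:
            candidates.add(word)
    return candidates


def prune_links(src, links, candidates):
    """Drop every alignment link whose source word is a candidate."""
    return {(i, j) for i, j in links if src[i] not in candidates}
```

Removing every link of a recognized word is a deliberately crude stand-in here; the approach described above instead decides which function word links to prune using syntactic structures together with content word alignments.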