Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation

Paper title

Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation

Authors

Jiquan Li, Junliang Guo, Yongxin Zhu, Xin Sheng, Deqiang Jiang, Bo Ren, Linli Xu

Abstract

The task of Grammatical Error Correction (GEC) has received remarkable attention in recent years, with wide applications in Natural Language Processing (NLP). One of the key principles of GEC is to keep the correct parts of a sentence unchanged and avoid over-correction. Previous sequence-to-sequence (seq2seq) models, however, generate results from scratch; their outputs are not guaranteed to follow the original sentence structure and may suffer from the over-correction problem. Meanwhile, the recently proposed sequence tagging models can overcome the over-correction problem by generating only edit operations, but they are conditioned on human-designed, language-specific tagging labels. In this paper, we combine the pros and alleviate the cons of both models by proposing a novel Sequence-to-Action (S2A) module. The S2A module jointly takes the source and target sentences as input and automatically generates a token-level action sequence before predicting each token, where each action is one of three choices: SKIP, COPY and GEN (generate). The actions are then fused with the basic seq2seq framework to provide the final predictions. We conduct experiments on the benchmark datasets of both English and Chinese GEC tasks. Our model consistently outperforms the seq2seq baselines, while significantly alleviating the over-correction problem and offering better generality and diversity in the generated results than the sequence tagging models.
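To make the three actions concrete, the following is a minimal sketch of how a token-level action sequence could turn a source sentence into a corrected one. It is an illustration only, not the authors' implementation: the abstract does not spell out the action semantics, so the sketch assumes that COPY emits the current source token and advances, SKIP advances past the current source token without emitting anything, and GEN emits a newly generated token without consuming a source token. The function name `apply_actions`, the `generated` argument (standing in for tokens a seq2seq decoder would produce), and the example sentence are all hypothetical.

```python
from typing import List, Sequence

# Action labels named in the abstract; their exact semantics below are assumptions.
SKIP, COPY, GEN = "SKIP", "COPY", "GEN"


def apply_actions(
    source: Sequence[str],
    actions: Sequence[str],
    generated: Sequence[str],
) -> List[str]:
    """Apply a token-level action sequence to a source sentence.

    `generated` supplies, in order, the tokens emitted by GEN actions;
    in a real model these would come from the seq2seq decoder.
    """
    output: List[str] = []
    src_pos = 0  # index of the next unread source token
    gen_pos = 0  # index of the next unused generated token
    for action in actions:
        if action == COPY:
            # Keep the current source token unchanged.
            output.append(source[src_pos])
            src_pos += 1
        elif action == SKIP:
            # Drop the current source token.
            src_pos += 1
        elif action == GEN:
            # Emit a new token without consuming a source token.
            output.append(generated[gen_pos])
            gen_pos += 1
        else:
            raise ValueError(f"unknown action: {action!r}")
    return output


# Hypothetical example: replace "go" with "went" and keep everything else.
source = "She go to school yesterday".split()
actions = [COPY, SKIP, GEN, COPY, COPY, COPY]
generated = ["went"]
corrected = apply_actions(source, actions, generated)
print(" ".join(corrected))
```

Under these assumptions, a sentence that needs no correction maps to an all-COPY action sequence, which is one way to see why predicting actions can discourage over-correction: unchanged tokens are carried over rather than regenerated.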
