Paper Title
Delexicalized Paraphrase Generation
Authors
Abstract
We present a neural model for paraphrasing and train it to generate delexicalized sentences. We achieve this by creating training data in which each input is paired with a number of reference paraphrases. These sets of reference paraphrases represent a weak type of semantic equivalence based on annotated slots and intents. To understand semantics from different types of slots, other than anonymizing slots, we apply convolutional neural networks (CNN) prior to pooling on slot values and use pointers to locate slots in the output. We show empirically that the generated paraphrases are of high quality, leading to an additional 1.29% exact match on live utterances. We also show that natural language understanding (NLU) tasks, such as intent classification and named entity recognition, can benefit from data augmentation using automatically generated paraphrases.
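The abstract's slot encoder applies a CNN over the tokens of each slot value and then pools to a fixed-size vector, so the model can learn slot semantics rather than treat slots as opaque anonymized symbols. Below is a minimal, hypothetical numpy sketch of that idea (the embedding table, filter shapes, and dimensions are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_slot_value(token_ids, embeddings, filters):
    """Hypothetical sketch of a slot-value encoder: embed the slot's
    tokens, slide a 1-D convolution over them, and max-pool over time
    to obtain a fixed-size representation of the slot value."""
    x = embeddings[token_ids]          # (seq_len, embed_dim)
    k, _, hidden_dim = filters.shape   # kernel width, embed dim, num filters
    seq_len = len(token_ids)
    # valid 1-D convolution along the token dimension
    conv = np.stack([
        np.tensordot(x[i:i + k], filters, axes=([0, 1], [0, 1]))
        for i in range(seq_len - k + 1)
    ])                                 # (seq_len - k + 1, hidden_dim)
    conv = np.maximum(conv, 0.0)       # ReLU nonlinearity
    return conv.max(axis=0)            # max-pool over time -> (hidden_dim,)

# Illustrative dimensions (assumptions, not from the paper)
vocab_size, embed_dim, hidden_dim, kernel = 50, 8, 16, 3
embeddings = rng.normal(size=(vocab_size, embed_dim))
filters = rng.normal(size=(kernel, embed_dim, hidden_dim))

# A slot value of four tokens is pooled into one fixed-size vector,
# regardless of the value's length.
slot_vec = encode_slot_value([4, 17, 9, 30], embeddings, filters)
print(slot_vec.shape)  # (16,)
```

Because the pooling step collapses the token dimension, slot values of different lengths map to vectors of the same size, which is what lets the paraphrase model condition on slot content uniformly; the paper additionally uses pointers to place these slots in the generated output.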