Robust Textual Embedding against Word-level Adversarial Attacks

Paper Title

Robust Textual Embedding against Word-level Adversarial Attacks

Authors

Yichen Yang, Xiaosen Wang, Kun He

Abstract

We attribute the vulnerability of natural language processing models to the fact that similar inputs are mapped to dissimilar representations in the embedding space, which leads to inconsistent outputs. To address this, we propose a novel robust training method, termed Fast Triplet Metric Learning (FTML). Specifically, we argue that, for better robustness, an original sample should have a representation similar to those of its adversarial counterparts and distinct from those of other samples. To this end, we incorporate triplet metric learning into standard training to pull words closer to their positive samples (i.e., synonyms) and push them away from their negative samples (i.e., non-synonyms) in the embedding space. Extensive experiments demonstrate that FTML significantly improves model robustness against various advanced adversarial attacks while maintaining competitive classification accuracy on original samples. Moreover, our method is efficient: it only needs to adjust the embedding and adds very little overhead to standard training. Our work shows the great potential of improving textual robustness through robust word embeddings.
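
To make the pull/push idea in the abstract concrete, the following is a minimal sketch of a generic hinge-style triplet loss on word embeddings. It is not the authors' FTML objective, whose exact form the abstract does not give. The function names, the Euclidean distance, the margin value, and the toy 2-D embeddings are all assumptions chosen for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positives, negatives, margin):
    """Generic hinge-style triplet loss for one word (illustrative only).

    Penalizes the anchor when its mean distance to positives (synonyms)
    is not at least `margin` smaller than its mean distance to
    negatives (non-synonyms).
    """
    pos = sum(euclidean(anchor, p) for p in positives) / len(positives)
    neg = sum(euclidean(anchor, n) for n in negatives) / len(negatives)
    return max(0.0, pos - neg + margin)


# Hypothetical 2-D embeddings; real word embeddings have far more dimensions.
emb = {
    "good": [1.0, 0.0],
    "great": [1.0, 1.0],   # synonym of "good" (positive sample)
    "table": [3.0, 0.0],   # non-synonym (negative sample)
}

loss_before = triplet_loss(emb["good"], [emb["great"]], [emb["table"]], margin=2.0)

# Pulling the synonym closer to the anchor lowers the loss.
emb["great"] = [1.0, 0.5]
loss_after = triplet_loss(emb["good"], [emb["great"]], [emb["table"]], margin=2.0)

print(loss_before, loss_after)
```

In an actual training setup, a term of this kind would presumably be added to the classification loss and minimized by gradient descent over the embedding table. How the two terms are weighted and how negatives are sampled are details the abstract leaves unspecified.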
