Paper Title
Neural Knowledge Bank for Pretrained Transformers
Paper Authors
Paper Abstract
The ability of pretrained Transformers to remember factual knowledge is essential but still limited for existing models. Inspired by existing work that regards Feed-Forward Networks (FFNs) in Transformers as key-value memories, we design a Neural Knowledge Bank (NKB) and a knowledge injection strategy to introduce extra factual knowledge into pretrained Transformers. The NKB takes the form of additional knowledgeable memory slots for the FFN, and this memory-like architecture makes it highly interpretable and flexible. When injecting extra knowledge with the Salient Span Masking (SSM) pretraining objective, we fix the original pretrained model and train only the NKB. This training strategy ensures that the general language modeling ability of the original pretrained model is not affected. By mounting the NKB onto the T5 model, we verify its strong ability to store extra factual knowledge on three closed-book question answering datasets. We also show that mounting the NKB does not degrade the general language modeling ability of T5 on two representative tasks, summarization and machine translation. Further, we thoroughly analyze the interpretability of the NKB and reveal the meaning of its keys and values in a human-readable way. Finally, we demonstrate the flexibility of the NKB by directly modifying its value vectors to update the factual knowledge stored in it.
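To make the key-value memory view concrete, below is a minimal sketch, not the authors' released code, of how extra trainable memory slots could be appended to a pretrained FFN layer while the original weights stay frozen. The class and parameter names (`FFNWithKnowledgeBank`, `n_extra_slots`, `freeze_pretrained`) are hypothetical illustrations of the idea described in the abstract.

```python
# Sketch only: a Transformer FFN augmented with extra key-value memory slots,
# where knowledge injection trains only the new slots (hypothetical names).
import torch
import torch.nn as nn


class FFNWithKnowledgeBank(nn.Module):
    def __init__(self, d_model: int, d_ffn: int, n_extra_slots: int):
        super().__init__()
        # Original FFN weights, viewed as key-value memories:
        # rows of w_in act as keys, columns of w_out act as values.
        self.w_in = nn.Linear(d_model, d_ffn)
        self.w_out = nn.Linear(d_ffn, d_model)
        # Extra knowledgeable memory slots (the NKB-style extension).
        self.nkb_keys = nn.Linear(d_model, n_extra_slots)
        self.nkb_values = nn.Linear(n_extra_slots, d_model, bias=False)
        self.act = nn.ReLU()

    def freeze_pretrained(self):
        # Keep the original pretrained FFN fixed; only the extra slots are trained.
        for p in list(self.w_in.parameters()) + list(self.w_out.parameters()):
            p.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        original = self.w_out(self.act(self.w_in(x)))   # pretrained memory lookup
        extra = self.nkb_values(self.act(self.nkb_keys(x)))  # NKB memory lookup
        return original + extra


# Usage: freeze the pretrained part, then fine-tune only the NKB slots.
layer = FFNWithKnowledgeBank(d_model=512, d_ffn=2048, n_extra_slots=1024)
layer.freeze_pretrained()
out = layer(torch.randn(2, 16, 512))  # (batch, seq_len, d_model)
```

Because the extra slots only add a residual term to the original FFN output, updating or inspecting individual value vectors leaves the rest of the model untouched, which is the property the abstract highlights as interpretability and flexibility.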