Paper Title
CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning
Paper Authors
Paper Abstract
Computer vision models suffer from a phenomenon known as catastrophic forgetting when learning novel concepts from continuously shifting training data. Typical solutions for this continual learning problem require extensive rehearsal of previously seen data, which increases memory costs and may violate data privacy. Recently, the emergence of large-scale pre-trained vision transformer models has enabled prompting approaches as an alternative to data rehearsal. These approaches rely on a key-query mechanism to generate prompts and have been found to be highly resistant to catastrophic forgetting in the well-established rehearsal-free continual learning setting. However, the key mechanism of these methods is not trained end-to-end with the task sequence. Our experiments show that this leads to a reduction in their plasticity, hence sacrificing new-task accuracy and the ability to benefit from expanded parameter capacity. We instead propose to learn a set of prompt components which are assembled with input-conditioned weights to produce input-conditioned prompts, resulting in a novel attention-based end-to-end key-query scheme. Our experiments show that we outperform the current SOTA method DualPrompt on established benchmarks by as much as 4.5% in average final accuracy. We also outperform the state of the art by as much as 4.4% accuracy on a continual learning benchmark which contains both class-incremental and domain-incremental task shifts, corresponding to many practical settings. Our code is available at https://github.com/GT-RIPL/CODA-Prompt
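The abstract's core idea, assembling a set of learned prompt components with input-conditioned weights via an attention-based key-query scheme, can be illustrated with a minimal sketch. The shapes, module name `DecomposedPrompt`, and the choice of a frozen ViT [CLS] feature as the query are illustrative assumptions, not the authors' exact implementation; for that, see the linked repository.

```python
# Minimal sketch of decomposed, attention-based prompt assembly.
# Assumptions (not from the abstract): component count, prompt length,
# embedding dimension, and that the query is a frozen ViT [CLS] feature.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecomposedPrompt(nn.Module):
    def __init__(self, num_components=100, prompt_len=8, embed_dim=768):
        super().__init__()
        # Prompt components P, keys K, and attention vectors A are all
        # learnable, so the key-query scheme trains end-to-end with the tasks.
        self.P = nn.Parameter(torch.randn(num_components, prompt_len, embed_dim))
        self.K = nn.Parameter(torch.randn(num_components, embed_dim))
        self.A = nn.Parameter(torch.randn(num_components, embed_dim))

    def forward(self, query):
        # query: (B, D), e.g. the [CLS] feature from a frozen pre-trained ViT.
        # Attention vectors modulate the query element-wise, per component.
        attended = query.unsqueeze(1) * self.A.unsqueeze(0)                  # (B, M, D)
        # Cosine similarity with each key yields the assembly weights alpha.
        alpha = F.cosine_similarity(attended, self.K.unsqueeze(0), dim=-1)   # (B, M)
        # The input-conditioned prompt is a weighted sum of the components.
        return torch.einsum('bm,mld->bld', alpha, self.P)                    # (B, L, D)
```

In this reading, the resulting prompt would be inserted into the layers of a frozen pre-trained transformer backbone, and only the prompt parameters (plus a classifier head) are updated across the task sequence.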