Paper Title


The Challenges of Continuous Self-Supervised Learning

Authors

Senthil Purushwalkam, Pedro Morgado, Abhinav Gupta

Abstract


Self-supervised learning (SSL) aims to eliminate one of the major bottlenecks in representation learning - the need for human annotations. As a result, SSL holds the promise to learn representations from data in-the-wild, i.e., without the need for finite and static datasets. Instead, true SSL algorithms should be able to exploit the continuous stream of data being generated on the internet or by agents exploring their environments. But do traditional self-supervised learning approaches work in this setup? In this work, we investigate this question by conducting experiments on the continuous self-supervised learning problem. While learning in the wild, we expect to see a continuous (infinite) non-IID data stream that follows a non-stationary distribution of visual concepts. The goal is to learn a representation that is robust and adaptive, yet not forgetful of concepts seen in the past. We show that a direct application of current methods to such a continuous setup 1) is inefficient both computationally and in the amount of data required, 2) leads to inferior representations due to temporal correlations (non-IID data) in some sources of streaming data, and 3) exhibits signs of catastrophic forgetting when trained on sources with non-stationary data distributions. We propose the use of replay buffers as an approach to alleviate the issues of inefficiency and temporal correlations. We further propose a novel method to enhance the replay buffer by maintaining the least redundant samples. Minimum redundancy (MinRed) buffers allow us to learn effective representations even in the most challenging streaming scenarios composed of sequential visual data obtained from a single embodied agent, and alleviate the problem of catastrophic forgetting when learning from data with non-stationary semantic distributions.
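The core idea of a minimum-redundancy buffer can be sketched as follows. This is a hypothetical illustration, not the paper's exact algorithm: it assumes each sample comes with a feature embedding, measures redundancy by pairwise cosine similarity, and, when the buffer overflows, evicts the sample most similar to one of its neighbors. The class name `MinRedBuffer` and the eviction rule are assumptions for illustration.

```python
import numpy as np

class MinRedBuffer:
    """Replay buffer that keeps the least redundant samples (sketch).

    Redundancy is approximated by cosine similarity between unit-norm
    feature embeddings; on overflow, the sample with the highest
    similarity to any other buffered sample is evicted.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.samples = []  # raw data items
        self.feats = []    # unit-norm feature vectors

    def add(self, sample, feat):
        f = np.asarray(feat, dtype=float)
        f = f / (np.linalg.norm(f) + 1e-12)  # normalize so dot = cosine
        self.samples.append(sample)
        self.feats.append(f)
        if len(self.samples) > self.capacity:
            self._evict_most_redundant()

    def _evict_most_redundant(self):
        F = np.stack(self.feats)               # (n, d) feature matrix
        sim = F @ F.T                          # pairwise cosine similarities
        np.fill_diagonal(sim, -np.inf)         # ignore self-similarity
        idx = int(np.argmax(sim.max(axis=1)))  # most redundant sample
        self.samples.pop(idx)
        self.feats.pop(idx)

    def sample_batch(self, rng, batch_size):
        # Draw a decorrelated batch for the SSL training step.
        idx = rng.choice(len(self.samples), size=batch_size, replace=False)
        return [self.samples[i] for i in idx]
```

In a streaming setting, each incoming frame is embedded and pushed into the buffer, while training batches are drawn from the buffer rather than from the raw (temporally correlated) stream, which is how replay mitigates the non-IID issue described above.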
