Paper title
MAST: A Memory-Augmented Self-supervised Tracker
Paper authors
Paper abstract
Recent interest in self-supervised dense tracking has yielded rapid progress, but performance remains far from that of supervised methods. We propose a dense tracking model trained on videos without any annotations that surpasses previous self-supervised methods on existing benchmarks by a significant margin (+15%) and achieves performance comparable to supervised methods. In this paper, we first reassess the traditional choices used for self-supervised training and reconstruction loss by conducting thorough experiments that elucidate the optimal choices. Second, we further improve on existing methods by augmenting our architecture with a crucial memory component. Third, we benchmark on large-scale semi-supervised video object segmentation (a.k.a. dense tracking) and propose a new metric: generalizability. Our first two contributions yield a self-supervised network that, for the first time, is competitive with supervised methods on standard evaluation metrics of dense tracking. When measuring generalizability, we show that self-supervised approaches are actually superior to the majority of supervised methods. We believe this new generalizability metric can better capture the real-world use cases for dense tracking and will spur new interest in this research direction.