Paper title
Perception Over Time: Temporal Dynamics for Robust Image Understanding
Paper authors
Paper abstract
While deep learning surpasses human-level performance in narrow and specific vision tasks, it is fragile and over-confident in classification. For example, minor transformations in perspective, illumination, or object deformation in the image space can result in drastically different labeling, which is especially apparent under adversarial perturbations. Human visual perception, on the other hand, is orders of magnitude more robust to changes in the input stimulus. Unfortunately, we are far from fully understanding and integrating the underlying mechanisms that result in such robust perception. In this work, we introduce a novel method of incorporating temporal dynamics into static image understanding. We describe a neuro-inspired method that decomposes a single image into a series of coarse-to-fine images, simulating how biological vision integrates information over time. Next, we demonstrate how our novel visual perception framework can utilize this information "over time" using a biologically plausible algorithm with recurrent units and, as a result, significantly improve its accuracy and robustness over standard CNNs. We also compare our proposed approach with state-of-the-art models and explicitly quantify its adversarial robustness properties through multiple ablation studies. Our quantitative and qualitative results convincingly demonstrate exciting and transformative improvements over the standard computer vision and deep learning architectures used today.
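To make the idea of a coarse-to-fine image series concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the decomposition is computed, so the use of Gaussian blur, the `sigmas` schedule, and every function name here are assumptions chosen for illustration. The sketch turns one grayscale image (a list of rows of floats) into a sequence of frames that runs from heavily blurred to sharp, which a recurrent model could then consume one frame per time step.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel; sigma <= 0 gives the identity."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter along rows, then along columns."""
    kernel = gaussian_kernel(sigma)
    blurred = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(blurred), kernel))


def coarse_to_fine(image, sigmas=(4.0, 2.0, 1.0, 0.0)):
    """Return frames ordered from coarsest (largest sigma) to finest.

    The sigma schedule is a hypothetical choice, not taken from the paper.
    """
    return [gaussian_blur(image, s) for s in sigmas]


# Usage: an 8x8 checkerboard decomposed into four frames.
image = [[float((x + y) % 2) for x in range(8)] for y in range(8)]
frames = coarse_to_fine(image)
print(len(frames), len(frames[0]), len(frames[0][0]))
```

The last frame uses a sigma of zero, so it reproduces the input exactly, and the earlier frames keep only progressively lower spatial frequencies. Feeding the frames in this order lets a recurrent unit start from a coarse view and refine its state as detail arrives, which is the general behaviour the abstract describes.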