Paper title
How to Train Your Energy-Based Model for Regression
Paper authors
Abstract
Energy-based models (EBMs) have become increasingly popular within computer vision in recent years. While they are commonly employed for generative image modeling, recent work has also applied EBMs to regression tasks, achieving state-of-the-art performance on object detection and visual tracking. Training EBMs is, however, known to be challenging. While a variety of techniques have been explored for generative modeling, the application of EBMs to regression is not a well-studied problem. How EBMs should be trained for the best possible regression performance is thus currently unclear. We therefore take on the task of providing the first detailed study of this problem. To that end, we propose a simple yet highly effective extension of noise contrastive estimation, and carefully compare its performance to six popular methods from the literature on the tasks of 1D regression and object detection. The results of this comparison suggest that our training method should be considered the go-to approach. We also apply our method to the visual tracking task, achieving state-of-the-art performance on five datasets. Notably, our tracker achieves 63.7% AUC on LaSOT and 78.7% Success on TrackingNet. Code is available at https://github.com/fregu856/ebms_regression.
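The abstract does not spell out the proposed extension, so the following is only a minimal sketch of a standard ranking-style noise contrastive estimation (NCE) objective for a conditional EBM on 1D regression, not the authors' method. The names `nce_loss`, `energy_fn`, `num_noise`, and `noise_std` are hypothetical, and the choice of a single Gaussian noise distribution centered at the true target is an assumption made for illustration.

```python
import math
import random


def gaussian_log_pdf(y, mean, std):
    """Log-density of a 1D Gaussian N(mean, std^2) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi * std ** 2) - (y - mean) ** 2 / (2.0 * std ** 2)


def nce_loss(energy_fn, x, y_true, num_noise=16, noise_std=0.5, rng=None):
    """Ranking-style NCE loss for a single (x, y_true) pair.

    energy_fn(x, y) returns an unnormalized log-density score f(x, y).
    Noise samples come from q(y) = N(y_true, noise_std^2) (an assumption).
    """
    if rng is None:
        rng = random.Random(0)
    # Candidate 0 is the true target; the remaining ones are noise samples.
    candidates = [y_true] + [rng.gauss(y_true, noise_std) for _ in range(num_noise)]
    # Logit of each candidate: model score minus log noise density.
    logits = [energy_fn(x, y) - gaussian_log_pdf(y, y_true, noise_std) for y in candidates]
    # Numerically stable log-sum-exp over all candidates.
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    # Negative log-probability of identifying the true target among the candidates.
    return log_norm - logits[0]
```

As a usage example, a score function that peaks at the true target, such as `lambda x, y: -((y - 2.0 * x) ** 2) / 0.02` with `x = 1.0` and `y_true = 2.0`, should yield a lower loss than one that peaks far from it. In practice `energy_fn` would be a neural network and the loss would be averaged over a mini-batch and minimized with a gradient-based optimizer.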