Paper Title
Conflicting Bundles: Adapting Architectures Towards the Improved Training of Deep Neural Networks
Paper Authors
Paper Abstract
Designing neural network architectures is a challenging task, and knowing which specific layers of a model must be adapted to improve performance is almost a mystery. In this paper, we introduce a novel theory and metric to identify layers that decrease the test accuracy of trained models; this identification can be done as early as the beginning of training. In the worst case, such a layer can lead to a network that cannot be trained at all. More precisely, we identify the layers that worsen performance because they produce conflicting training bundles, as we show in our novel theoretical analysis, complemented by extensive empirical studies. Based on these findings, we introduce a novel algorithm that automatically removes performance-decreasing layers. Architectures found by this algorithm achieve competitive accuracy compared with state-of-the-art architectures. While keeping such high accuracy, our approach drastically reduces memory consumption and inference time for different computer vision tasks.
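To make the "conflicting bundles" notion concrete, the following is a minimal illustrative sketch, not the authors' exact metric: it assumes a bundle is a group of samples whose activations at a given layer are identical up to a tolerance `delta` (a hypothetical threshold), and a bundle conflicts when it contains samples of more than one class. The function name and grouping procedure are assumptions for illustration.

```python
import numpy as np

def conflicting_bundles(activations, labels, delta=1e-6):
    """Group samples whose activations at one layer are (nearly)
    identical into 'bundles' and count bundles that mix labels.

    activations: (n_samples, n_features) outputs of one layer
    labels: (n_samples,) integer class labels
    delta: closeness tolerance (hypothetical choice)
    """
    n = len(labels)
    assigned = np.full(n, -1)          # bundle index per sample, -1 = none
    bundles = []
    for i in range(n):
        if assigned[i] >= 0:
            continue
        # all not-yet-assigned samples within delta of sample i
        close = np.all(np.abs(activations - activations[i]) <= delta, axis=1)
        members = np.where(close & (assigned < 0))[0]
        assigned[members] = len(bundles)
        bundles.append(members)
    # a bundle conflicts if it contains more than one distinct label
    conflicts = sum(1 for b in bundles if len(set(labels[b])) > 1)
    return conflicts, len(bundles)
```

Layers producing many conflicting bundles at the start of training would then be candidates for removal under the abstract's described procedure.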