Paper Title

Exploring the Loss Landscape in Neural Architecture Search

Paper Authors

Colin White, Sam Nolen, Yash Savani

Paper Abstract

Neural architecture search (NAS) has seen a steep rise in interest over the last few years. Many algorithms for NAS consist of searching through a space of architectures by iteratively choosing an architecture, evaluating its performance by training it, and using all prior evaluations to come up with the next choice. The evaluation step is noisy: the final accuracy varies based on the random initialization of the weights. Prior work has focused on devising new search algorithms to handle this noise, rather than quantifying or understanding the level of noise in architecture evaluations. In this work, we show that (1) the simplest hill-climbing algorithm is a powerful baseline for NAS, and (2) when the noise in popular NAS benchmark datasets is reduced to a minimum, hill climbing outperforms many popular state-of-the-art algorithms. We further back up this observation by showing that the number of local minima is substantially reduced as the noise decreases, and by giving a theoretical characterization of the performance of local search in NAS. Based on our findings, for NAS research we suggest (1) using local search as a baseline, and (2) denoising the training pipeline when possible.
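The local search baseline described in the abstract amounts to greedy hill climbing over a discrete architecture space. Below is a minimal sketch of that idea, assuming hypothetical `get_neighbors` and `evaluate` callables that stand in for a concrete NAS benchmark API (architectures one edit apart, and their validation accuracy); it illustrates the general technique, not the authors' implementation.

```python
def hill_climb(init_arch, get_neighbors, evaluate, max_evals=100):
    """Greedy hill climbing (first-improvement local search) over an architecture space.

    get_neighbors(arch) -> iterable of architectures one edit away from arch
    evaluate(arch)      -> validation accuracy (possibly noisy)
    Both callables are placeholders for a real NAS benchmark interface.
    """
    best_arch = init_arch
    best_acc = evaluate(best_arch)
    evals = 1
    improved = True
    while improved and evals < max_evals:
        improved = False
        for neighbor in get_neighbors(best_arch):
            acc = evaluate(neighbor)
            evals += 1
            if acc > best_acc:
                # Move to the first improving neighbor and restart the scan.
                best_arch, best_acc = neighbor, acc
                improved = True
                break
            if evals >= max_evals:
                break
    return best_arch, best_acc


# Toy usage on a bit-string "search space" (a stand-in for a real NAS benchmark):
space_dim = 8
init = tuple(0 for _ in range(space_dim))
neighbors = lambda a: [a[:i] + (1 - a[i],) + a[i + 1:] for i in range(space_dim)]
score = lambda a: sum(a) / space_dim  # pretend validation accuracy
print(hill_climb(init, neighbors, score))
```

With a noiseless `evaluate`, the procedure stops only at a local minimum of the loss landscape (local maximum of accuracy), which is why reducing evaluation noise, as the paper argues, directly benefits this baseline.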
