Paper Title
Bayesian Neural Architecture Search using A Training-Free Performance Metric
Paper Authors
Paper Abstract
Recurrent neural networks (RNNs) are a powerful approach for time series prediction. However, their performance is strongly affected by their architecture and hyperparameter settings. The architecture optimization of RNNs is a time-consuming task, where the search space is typically a mixture of real, integer, and categorical values. To allow for shrinking and expanding the size of the network, the representation of architectures often has a variable length. In this paper, we propose to tackle the architecture optimization problem with a variant of the Bayesian Optimization (BO) algorithm. To reduce the evaluation time of candidate architectures, the Mean Absolute Error Random Sampling (MRS), a training-free method to estimate the network performance, is adopted as the objective function for BO. Also, we propose three fixed-length encoding schemes to cope with the variable-length architecture representation. The result is a new perspective on the accurate and efficient design of RNNs, which we validate on three problems. Our findings show that 1) the BO algorithm can explore different network architectures using the proposed encoding schemes and successfully design well-performing architectures, and 2) the optimization time is significantly reduced by using MRS, without compromising the performance compared to architectures obtained from the actual training procedure.
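The core idea of MRS can be sketched as follows: instead of training each candidate network, it is evaluated under several random weight initializations, and the distribution of the resulting mean absolute errors serves as a cheap proxy for trained performance. The sketch below is illustrative only; the function name `mrs_estimate`, the toy linear "model" standing in for an RNN, and the use of the sample mean as the summary statistic are assumptions for demonstration, not the paper's exact procedure.

```python
import numpy as np

def mrs_estimate(build_model, X, y, n_samples=30, seed=0):
    """Training-free performance proxy (MRS-style sketch).

    Instantiates the candidate architecture n_samples times with fresh
    random weights, computes the mean absolute error of each untrained
    instance, and returns the mean and spread of those errors.
    """
    rng = np.random.default_rng(seed)
    maes = []
    for _ in range(n_samples):
        predict = build_model(rng)  # fresh random weights, no training
        maes.append(float(np.mean(np.abs(y - predict(X)))))
    return float(np.mean(maes)), float(np.std(maes))

# Toy stand-in for a candidate RNN: a randomly initialized linear predictor.
def toy_builder(rng):
    w = rng.normal(size=3)
    return lambda X: X @ w

X = np.random.default_rng(1).normal(size=(50, 3))
y = X @ np.array([0.5, -1.0, 2.0])
mean_mae, std_mae = mrs_estimate(toy_builder, X, y)
```

In a BO loop, `mean_mae` (or a statistic derived from the sampled error distribution) would replace the post-training validation error as the objective, which is what makes each candidate evaluation cheap.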