Paper title
Incorporating Expert Prior in Bayesian Optimisation via Space Warping
Paper authors
Paper abstract
Bayesian optimisation is a well-known sample-efficient method for the optimisation of expensive black-box functions. However, when dealing with large search spaces, the algorithm passes through several low-function-value regions before reaching the optimum of the function. Since function evaluations are expensive in terms of both money and time, it is desirable to alleviate this problem. One approach to shortening this cold-start phase is to use prior knowledge that can accelerate the optimisation. In its standard form, Bayesian optimisation assumes that every point in the search space is equally likely to be the optimum. Therefore, any prior knowledge that provides information about the optimum of the function should improve the optimisation performance. In this paper, we represent prior knowledge about the function optimum through a prior distribution. The prior distribution is then used to warp the search space in such a way that the space expands around the high-probability region of the function optimum and shrinks around the low-probability region. We incorporate this prior directly into the function model (a Gaussian process) by redefining the kernel matrix, which allows the method to work with any acquisition function, i.e. it is an acquisition-agnostic approach. We demonstrate the superiority of our method over standard Bayesian optimisation through the optimisation of several benchmark functions and the hyperparameter tuning of two algorithms: Support Vector Machines (SVM) and Random Forest.
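The following is a minimal sketch of the space-warping idea described in the abstract, under the assumption that the warping is realised by mapping each input through the cumulative distribution function of the expert prior before evaluating the kernel, so that the kernel is effectively redefined on the warped inputs. The one-dimensional [0, 1] search space, the normal prior centred at 0.7, the length-scale, and the helper names (warp, warped_rbf_kernel) are illustrative assumptions, not details taken from the paper.

import numpy as np
from scipy.stats import norm

# Hypothetical expert belief about where the optimum lies (assumption, not from the paper).
PRIOR_MEAN, PRIOR_STD = 0.7, 0.1

def warp(x):
    """Map points from the original search space through the prior CDF.

    Where the prior density is high the CDF changes rapidly, so nearby points are
    pushed apart (the space expands); where the density is low the CDF is nearly
    flat, so points are pulled together (the space shrinks).
    """
    return norm.cdf(x, loc=PRIOR_MEAN, scale=PRIOR_STD)

def warped_rbf_kernel(X1, X2, length_scale=0.2):
    """RBF kernel evaluated on warped inputs, i.e. k(g(x), g(x'))."""
    Z1, Z2 = warp(X1), warp(X2)
    sq_dist = (Z1[:, None] - Z2[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / length_scale ** 2)

if __name__ == "__main__":
    X = np.linspace(0.0, 1.0, 5)
    K = warped_rbf_kernel(X, X)
    # Points near the prior mean become less correlated (expanded space), while
    # points in low-prior regions remain highly correlated (shrunken space).
    print(np.round(K, 3))

Because the prior enters only through the kernel on the warped inputs, this construction leaves the acquisition function untouched, which is consistent with the acquisition-agnostic claim in the abstract.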