Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders

Paper Title

Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders

Authors

Samuel Stanton, Wesley Maddox, Nate Gruver, Phillip Maffettone, Emily Delaney, Peyton Greenside, Andrew Gordon Wilson

Abstract

Bayesian optimization (BayesOpt) is a gold standard for query-efficient continuous optimization. However, its adoption for drug design has been hindered by the discrete, high-dimensional nature of the decision variables. We develop a new approach (LaMBO) which jointly trains a denoising autoencoder with a discriminative multi-task Gaussian process head, allowing gradient-based optimization of multi-objective acquisition functions in the latent space of the autoencoder. These acquisition functions allow LaMBO to balance the explore-exploit tradeoff over multiple design rounds, and to balance objective tradeoffs by optimizing sequences at many different points on the Pareto frontier. We evaluate LaMBO on two small-molecule design tasks, and introduce new tasks optimizing in silico and in vitro properties of large-molecule fluorescent proteins. In our experiments, LaMBO outperforms genetic optimizers and does not require a large pretraining corpus, demonstrating that BayesOpt is practical and effective for biological sequence design.
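The abstract's notion of a Pareto frontier can be made concrete with a minimal sketch. This is not the LaMBO implementation and involves no autoencoder or Gaussian process; it only shows how non-dominated candidates are picked out when every objective is to be maximized. The function names `dominates` and `pareto_front` and the score values are hypothetical, chosen for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical scores for five candidate sequences on two objectives.
scores = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4), (0.9, 0.05)]
front = pareto_front(scores)
print(front)
```

Candidates on the front represent different tradeoffs between the objectives; a multi-objective optimizer aims to improve many of them rather than a single best point.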
