Paper Title

Mind the Retrosynthesis Gap: Bridging the divide between Single-step and Multi-step Retrosynthesis Prediction

Authors

Alan Kai Hassen, Paula Torren-Peraire, Samuel Genheden, Jonas Verhoeven, Mike Preuss, Igor Tetko

Abstract

Retrosynthesis is the task of recursively breaking down a chemical compound, step by step, into molecular precursors until a set of commercially available molecules is found. Consequently, the goal is to provide a valid synthesis route for a molecule. As more single-step models are developed, we see increasing accuracy in the prediction of molecular disconnections, potentially improving the creation of synthetic paths. Multi-step approaches repeatedly apply the chemical information stored in single-step retrosynthesis models. However, this connection is not reflected in contemporary research, which fixes either the single-step model or the multi-step algorithm. In this work, we establish a bridge between both tasks by benchmarking the performance of different single-step retrosynthesis models and their transfer to the multi-step domain, using two common search algorithms, Monte Carlo Tree Search and Retro*. We show that models designed for single-step retrosynthesis, when extended to multi-step, can have a tremendous impact on the route-finding capabilities of current multi-step methods, improving performance by up to 30% compared to the most widely used model. Furthermore, we observe no clear link between contemporary single-step and multi-step evaluation metrics, showing that single-step models need to be developed and tested for the multi-step domain, and not as an isolated task, to find synthesis routes for molecules of interest.
