Paper Title
A 3D Generative Model for Structure-Based Drug Design
Authors
Abstract
We study a fundamental problem in structure-based drug design: generating molecules that bind to specific protein binding sites. Although deep generative models have had great success in drug design, existing methods are mostly string-based or graph-based. They lack spatial information and therefore cannot be applied to structure-based design tasks. In particular, such models have little or no knowledge of how molecules interact with their target proteins in 3D space. In this paper, we propose a 3D generative model that generates molecules given a designated 3D protein binding site. Specifically, given a binding site as the 3D context, our model estimates the probability density of atom occurrences in 3D space: positions that are more likely to contain atoms are assigned higher probability. To generate 3D molecules, we propose an auto-regressive sampling scheme in which atoms are sampled sequentially from the learned distribution until there is no room for new atoms. Combined with this sampling scheme, our model can generate valid and diverse molecules, and it could be applied to various structure-based molecular design tasks such as molecule sampling and linker design. Experimental results demonstrate that molecules sampled from our model exhibit high binding affinity to specific targets and good drug properties such as drug-likeness, even though the model is not explicitly optimized for them.
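The auto-regressive sampling scheme described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_density` is a hand-written stand-in for the learned density (not the paper's network), the candidate positions are a coarse grid, and the names `sample_molecule`, `contact`, `clash`, `bond_max`, and `min_mass` are hypothetical. The sketch tracks atom positions only and ignores element types.

```python
import numpy as np


def toy_density(candidates, pocket, placed, contact=3.0, clash=1.2, bond_max=2.0):
    """Hypothetical stand-in for the learned density (NOT the paper's model).

    Gives each candidate position a non-negative score that is high when the
    position sits at a contact distance from the pocket, and zero when it
    clashes with a pocket atom or with an atom that was already placed.
    """
    # Distance from every candidate to its nearest pocket atom.
    d_pocket = np.linalg.norm(
        candidates[:, None, :] - pocket[None, :, :], axis=-1
    ).min(axis=1)
    score = np.exp(-((d_pocket - contact) ** 2))
    score[d_pocket < clash] = 0.0

    if len(placed) > 0:
        # Condition on the atoms generated so far (the auto-regressive context).
        d_placed = np.linalg.norm(
            candidates[:, None, :] - np.asarray(placed)[None, :, :], axis=-1
        ).min(axis=1)
        score[d_placed < clash] = 0.0     # no overlap with existing atoms
        score[d_placed > bond_max] = 0.0  # stay close to the growing molecule
    return score


def sample_molecule(pocket, candidates, rng, max_atoms=20, min_mass=1e-3):
    """Place atoms one at a time until the density leaves no room."""
    placed = []
    for _ in range(max_atoms):
        score = toy_density(candidates, pocket, placed)
        total = score.sum()
        if total < min_mass:  # assumed proxy for "no room for new atoms"
            break
        idx = rng.choice(len(candidates), p=score / total)
        placed.append(candidates[idx])
    return np.array(placed)


# Toy binding site: eight pocket atoms at the corners of a cube.
pocket = np.array(
    [[x, y, z] for x in (-4.0, 4.0) for y in (-4.0, 4.0) for z in (-4.0, 4.0)]
)
# Candidate positions: a 5 x 5 x 5 grid with 1.5 spacing inside the cube.
axis = np.linspace(-3.0, 3.0, 5)
candidates = np.array([[x, y, z] for x in axis for y in axis for z in axis])

rng = np.random.default_rng(0)
mol = sample_molecule(pocket, candidates, rng)
print(f"placed {len(mol)} atoms")
```

The loop recomputes the density after every placement, which is what makes the scheme auto-regressive: each new atom is drawn from a distribution conditioned on the binding site and on all atoms sampled before it. In this sketch the stopping rule is a threshold on the total unnormalized score, with `max_atoms` as a safety cap; both are illustrative choices.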