Paper title

Protein structure generation via folding diffusion

Authors

Wu, Kevin E., Yang, Kevin K., Berg, Rianne van den, Zou, James Y., Lu, Alex X., Amini, Ava P.

Abstract

The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a new diffusion-based generative model that designs protein backbone structures via a procedure that mirrors the native folding process. We describe protein backbone structure as a series of consecutive angles capturing the relative orientation of the constituent amino acid residues, and generate new structures by denoising from a random, unfolded state towards a stable folded structure. Not only does this mirror how proteins biologically twist into energetically favorable conformations, the inherent shift and rotational invariance of this representation crucially alleviates the need for complex equivariant networks. We train a denoising diffusion probabilistic model with a simple transformer backbone and demonstrate that our resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns akin to those of naturally-occurring proteins. As a useful resource, we release the first open-source codebase and trained models for protein structure diffusion.
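
To make the angle-based formulation in the abstract concrete, the following is a minimal sketch of the forward (noising) half of a denoising diffusion process applied to a list of backbone angles. The abstract does not state the noise schedule, the angle range, or how periodicity is handled, so the linear beta schedule, the wrapping into [-pi, pi), the example angle values, and every function name here are illustrative assumptions rather than the authors' implementation. The learned reverse (denoising) step, which the abstract attributes to a transformer, is not shown.

```python
import math
import random


def wrap_angle(x):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (x + math.pi) % (2 * math.pi) - math.pi


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    The linear schedule and its endpoints are assumed for illustration.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_angles(angles, alpha_bar_t, rng):
    """Draw a noised copy of `angles` from the standard DDPM forward
    distribution at one timestep, then wrap the result back onto the circle."""
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [wrap_angle(scale * a + sigma * rng.gauss(0.0, 1.0)) for a in angles]


rng = random.Random(0)
# Hypothetical backbone angles in radians, chosen only for illustration
angles = [-1.05, -0.79, 3.14] * 4
alpha_bars = linear_alpha_bar(1000)
slightly_noised = noise_angles(angles, alpha_bars[0], rng)    # close to the input
heavily_noised = noise_angles(angles, alpha_bars[-1], rng)    # close to pure noise
```

Because the structure is encoded as relative angles rather than Cartesian coordinates, translating or rotating the whole protein leaves the input to this process unchanged, which is the invariance the abstract points to as removing the need for equivariant networks.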
