Paper title
Lattice protein design using Bayesian learning
Authors
Abstract
Protein design is the inverse of three-dimensional (3D) structure prediction and is used to elucidate the relationship between 3D structures and amino acid sequences. In general, the computation for protein design involves a double loop: an outer loop over amino acid sequence changes and an inner loop that performs an exhaustive conformational search for each sequence. Herein, we propose a novel statistical mechanical design method using Bayesian learning, which can design lattice proteins without the exhaustive conformational search. We consider a thermodynamic hypothesis of the evolution of proteins and apply it to the prior distribution of amino acid sequences. Furthermore, we take the effect of water into account from the viewpoint of the grand canonical picture. When applied to the 2D lattice hydrophobic-polar (HP) model, our design method successfully finds an amino acid sequence for which the target conformation is the unique ground state. However, the performance was not as good for the 3D lattice HP model as for the 2D model. The performance of the 3D model improves when 20-letter lattice proteins are used. Furthermore, we find a strong linear relationship between the chemical potential of water and the number of surface residues, thereby revealing the relationship between protein structure and the effect of water molecules. The advantage of our method is that it greatly reduces computation time, because it does not require the long calculation of the partition function that corresponds to an exhaustive conformational search. As our method uses a general form of Bayesian learning and statistical mechanics and is not limited to lattice proteins, the results presented here elucidate some heuristics used successfully in previous protein design methods.
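To make the cost of the conventional double loop concrete, the following is a minimal sketch of brute-force design on the 2D lattice HP model. It illustrates the baseline that the abstract describes, not the Bayesian method proposed in the paper. It assumes the standard HP energy convention (-1 for each pair of H residues that are lattice neighbours but not chain neighbours) and treats conformations related by rotation or reflection as identical. The function names (`conformations`, `hp_energy`, `ground_states`, `design_by_double_loop`) are hypothetical and chosen only for this illustration.

```python
from itertools import product


def conformations(n):
    """Yield n-site self-avoiding walks on the square lattice, one per class
    under rotation and reflection (first step is +x, first turn is +y)."""
    def extend(path, occupied, turned):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            if len(path) == 1 and (dx, dy) != (1, 0):
                continue  # fix the first step to remove rotations
            if not turned and (dx, dy) == (0, -1):
                continue  # fix the first turn to remove reflections
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue  # self-avoidance
            path.append(nxt)
            occupied.add(nxt)
            yield from extend(path, occupied, turned or dy != 0)
            occupied.remove(nxt)
            path.pop()
    yield from extend([(0, 0)], {(0, 0)}, False)


def hp_energy(sequence, walk):
    """Assumed HP energy: -1 per H-H lattice contact between non-bonded residues."""
    index_at = {site: i for i, site in enumerate(walk)}
    energy = 0
    for i, (x, y) in enumerate(walk):
        if sequence[i] != "H":
            continue
        for dx, dy in ((1, 0), (0, 1)):  # visit each lattice edge once
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy -= 1
    return energy


def ground_states(sequence):
    """Inner loop: exhaustive conformational search for one sequence."""
    best_energy, best_walks = None, []
    for walk in conformations(len(sequence)):
        e = hp_energy(sequence, walk)
        if best_energy is None or e < best_energy:
            best_energy, best_walks = e, [walk]
        elif e == best_energy:
            best_walks.append(walk)
    return best_energy, best_walks


def design_by_double_loop(target):
    """Outer loop: keep every HP sequence whose unique ground state is `target`.
    `target` must be in the canonical orientation used by `conformations`."""
    designed = []
    for letters in product("HP", repeat=len(target)):
        sequence = "".join(letters)
        _, walks = ground_states(sequence)
        if walks == [target]:
            designed.append(sequence)
    return designed


if __name__ == "__main__":
    u_shape = ((0, 0), (1, 0), (1, 1), (0, 1))  # 4-residue toy target
    print(design_by_double_loop(u_shape))
```

Both loops grow exponentially with chain length in this sketch: the number of candidate sequences and the number of self-avoiding walks each increase by a roughly constant factor per added residue. This is why a method that avoids the inner search can save substantial computation time.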