Non-equilibrium molecular geometries in graph neural networks

Title

Non-equilibrium molecular geometries in graph neural networks

Authors

Raza, Ali; Henle, E. Adrian; Fern, Xiaoli

Abstract

Graph neural networks have become a powerful framework for learning complex structure-property relationships and fast screening of chemical compounds. Recently proposed methods have demonstrated that using 3D geometry information of the molecule along with the bonding structure can lead to more accurate prediction on a wide range of properties. A common practice is to use 3D geometries computed through density functional theory (DFT) for both training and testing of models. However, the computational time needed for DFT calculations can be prohibitively large. Moreover, many of the properties that we aim to predict can often be obtained with little or no overhead on top of the DFT calculations used to produce the 3D geometry information, voiding the need for a predictive model. To be practically useful for high-throughput chemical screening and drug discovery, it is desirable to work with 3D geometries obtained using less-accurate but much more efficient non-DFT methods. In this work we investigate the impact of using non-DFT conformations in the training and the testing of existing models and propose a data augmentation method for improving the prediction accuracy of classical forcefield-derived geometries.
