Paper title
Molecular Dynamics of Polymer-lipids in Solution from Supervised Machine Learning
Paper authors
Abstract
Machine learning techniques, including neural networks, are popular tools among materials and chemical scientists, with applications that may provide viable alternative methods for analyzing the structure and energetics of systems ranging from crystals to biomolecules. Efforts to predict dynamics, however, are less abundant. Here we explore the ability of three well-established recurrent neural network architectures to forecast the energetics of a macromolecular polymer-lipid aggregate solvated in ethyl acetate at ambient conditions. Data models generated from the recurrent neural networks are trained and tested on nanoseconds-long time series, generated from Molecular Dynamics and containing half a million points, of the intra-macromolecular potential energy and the interaction energy of the macromolecules with the solvent. Our exhaustive analyses show that the three recurrent neural networks investigated generate data models with limited capability of reproducing the energetic fluctuations, and that they yield short- or long-term energetics forecasts whose underlying distribution of points is inconsistent with the distributions of the input series. We propose an in silico experimental protocol that consists of forming an ensemble of artificial network models trained on an ensemble of series, with additional features taken from time series containing pre-clustered time patterns of the original series. The forecast process improves by predicting a band of forecasted time series with a spread of values consistent with the span of the molecular dynamics energy fluctuations. However, the distribution of points from the band of forecasts is not optimal. Although the three inspected recurrent neural networks were unable to generate single models that reproduce the actual fluctuations of the inspected molecular system energies in thermal equilibrium at the nanosecond scale, the proposed protocol provides useful estimates of the molecular fate.
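The idea of a band of forecasts can be illustrated with a minimal sketch. The abstract does not say how the band is constructed, so everything here is an assumption: the band is taken as a pointwise percentile envelope over the ensemble members, the helper names `forecast_band` and `coverage` are hypothetical, and the random arrays are synthetic stand-ins for the ensemble forecasts and the Molecular Dynamics energy series, not data from the paper.

```python
import numpy as np


def forecast_band(forecasts, lower_pct=5.0, upper_pct=95.0):
    """Pointwise percentile band across an ensemble of forecasts.

    forecasts: array of shape (n_models, horizon), one row per model.
    Returns two arrays of shape (horizon,): lower and upper band edges.
    The percentile envelope is an assumed choice, not the paper's method.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    lower = np.percentile(forecasts, lower_pct, axis=0)
    upper = np.percentile(forecasts, upper_pct, axis=0)
    return lower, upper


def coverage(reference, lower, upper):
    """Fraction of reference points that fall inside the band."""
    reference = np.asarray(reference, dtype=float)
    inside = (reference >= lower) & (reference <= upper)
    return float(inside.mean())


# Synthetic stand-in data (assumption): 20 ensemble forecasts and one
# reference energy series, all drawn from the same normal distribution.
rng = np.random.default_rng(0)
horizon = 1000
reference = rng.normal(0.0, 1.0, size=horizon)
ensemble = rng.normal(0.0, 1.0, size=(20, horizon))

lower, upper = forecast_band(ensemble)
print("band coverage of reference series:", coverage(reference, lower, upper))
print("mean band width:", float(np.mean(upper - lower)))
print("reference fluctuation span:", float(reference.max() - reference.min()))
```

Comparing the mean band width with the span of the reference series is one simple way to check whether the spread of the forecasts is consistent with the energy fluctuations. It says nothing about whether the distribution of points inside the band matches the reference distribution, which is the limitation the abstract itself points out.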