Paper Title
Molecular Geometry Pretraining with SE(3)-Invariant Denoising Distance Matching
Paper Authors
Paper Abstract
Molecular representation pretraining is critical in various applications for drug and material discovery due to the limited number of labeled molecules, and most existing work focuses on pretraining on 2D molecular graphs. However, the power of pretraining on 3D geometric structures has been less explored. This is owing to the difficulty of finding a sufficient proxy task that can empower the pretraining to effectively extract essential features from the geometric structures. Motivated by the dynamic nature of 3D molecules, where the continuous motion of a molecule in 3D Euclidean space forms a smooth potential energy surface, we propose GeoSSL, a 3D coordinate denoising pretraining framework to model such an energy landscape. Further, by leveraging an SE(3)-invariant score matching method, we propose GeoSSL-DDM, in which the coordinate denoising proxy task is effectively boiled down to denoising the pairwise atomic distances in a molecule. Our comprehensive experiments confirm the effectiveness and robustness of our proposed method.
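As a rough illustration of why pairwise atomic distances are a natural SE(3)-invariant target, the sketch below checks numerically that rotating and translating a toy set of 3D coordinates leaves the distance matrix unchanged, and then perturbs the coordinates with Gaussian noise to show the kind of distance residual a denoising task could regress. This is not the paper's implementation: the function names, the random 5-atom "molecule", and the noise scale of 0.1 are all assumptions made for illustration.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the (N, N) matrix of Euclidean distances between N points."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def random_rigid_transform(rng):
    """Sample a proper rotation (det = +1) and a translation vector."""
    # QR of a Gaussian matrix yields an orthogonal matrix q.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        # Flip one column so q is a rotation rather than a reflection.
        q[:, 0] = -q[:, 0]
    return q, rng.normal(size=3)


rng = np.random.default_rng(0)

# Hypothetical toy molecule: 5 atoms with random 3D positions.
coords = rng.normal(size=(5, 3))

# Apply a rigid motion (an element of SE(3)) to every atom.
rotation, translation = random_rigid_transform(rng)
moved = coords @ rotation.T + translation

# Distances are unchanged by the rigid motion.
invariant = np.allclose(pairwise_distances(coords), pairwise_distances(moved))

# Perturb coordinates with Gaussian noise (scale 0.1 is an arbitrary choice)
# and express the perturbation in terms of pairwise distances.
noisy = coords + 0.1 * rng.normal(size=coords.shape)
distance_residual = pairwise_distances(noisy) - pairwise_distances(coords)

# The distance residual is itself unaffected by moving the whole system.
noisy_moved = noisy @ rotation.T + translation
residual_moved = pairwise_distances(noisy_moved) - pairwise_distances(moved)
```

Because the residual is defined on distances rather than raw coordinates, it does not depend on how the molecule is positioned or oriented in space, which is the property the abstract attributes to the distance-based formulation.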