Paper Title

A Supervised Machine Learning Approach for Sequence-Based Protein-Protein Interaction (PPI) Prediction

Authors

Debnath, Soumyadeep; Mollah, Ayatullah Faruk

Abstract

Computational protein-protein interaction (PPI) prediction techniques can greatly reduce time, cost, and false-positive interactions compared with experimental approaches. Sequence is one of the primary pieces of information about a protein and plays a crucial role in PPI prediction. Several machine learning approaches have been applied to exploit the characteristics of PPI datasets. However, these datasets greatly influence the performance of predictive models, so care should be taken in both dataset curation and the design of predictive models. Here, we describe our submitted solution and its results in the SeqPIP competition, whose objective was to develop comprehensive PPI predictive models from sequence information using high-quality, bias-free interaction datasets. We were given a training set of 2000 positive and 2000 negative interactions with sequences. Our method was evaluated on three independent high-quality interaction test datasets, alongside other competitors' solutions.
