Paper title
A mixed clustering coefficient centrality for identifying essential proteins
Paper authors
Abstract
Essential proteins play a crucial role in cellular life. Identifying them can not only advance drug target discovery but also contribute to understanding the mechanisms of biological evolution. Many researchers have worked on discovering essential proteins from the topological structure of protein networks and from biological information, yet the accuracy of identification still needs to be improved. In this paper, we propose a method that integrates the clustering coefficient within protein complexes with topological properties to determine the essentiality of proteins. First, we define the In-clustering coefficient (IC) to describe the properties of protein complexes. Then we propose a new method, complex edge and node clustering coefficient (CENC), to identify essential proteins. Protein-Protein Interaction (PPI) networks of Saccharomyces cerevisiae from the MIPS and DIP databases are used as experimental data. Experiments with a logistic regression model show that CENC improves the ability to recognize essential proteins compared with the existing methods DC, BC, EC, SC, LAC, and NC and the recent method UC.
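For readers unfamiliar with the topological quantities named in the abstract, the following is a minimal sketch of two standard measures on an undirected PPI graph: degree centrality (DC) and the ordinary local clustering coefficient of a node. It is not the paper's IC or CENC, whose definitions the abstract does not give. The function names and the toy protein labels are assumptions made purely for illustration.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj, node):
    """DC: the number of interaction partners of a protein."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Standard local clustering coefficient:
    the fraction of neighbor pairs that also interact with each other."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbors; 0 by convention
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Toy network with hypothetical protein labels (not real data).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]
adj = build_graph(edges)

for protein in sorted(adj):
    print(protein, degree_centrality(adj, protein),
          round(clustering_coefficient(adj, protein), 3))
```

In this toy graph, protein A has three partners but only one of its three neighbor pairs interacts, so its clustering coefficient is 1/3, while B and C each sit in a fully connected neighborhood. Scores of this kind are the sort of per-protein features that could be fed to a logistic regression model for comparison across methods.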