The assessment of essential genes in the stability of PPI networks using critical node detection problem

Document Type : Original Article


1 Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran

2 Department of Biotechnology, College of Science, University of Tehran, Tehran, Iran


Essential genes and proteins as their products encode the basic functions of a cell in a variety of conditions and are vital for the survival of a cell. Analyzing the characteristics of these proteins provides important biological information. An interesting analysis is to demonstrate the correlation between the topological importance of a protein in protein-protein interaction networks and its essentiality. Different centrality criteria such as degree, between ness, closeness, and eigenvector centralities are used to investigate such a correlation. Despite the remarkable results obtained by these methods, it is shown that the centrality criteria in scale-free networks show a high level of correlations which indicate that they share similar topo[1]logical information of the networks. In this paper, we use a different approach for analyzing this correlation and use a well-known problem in the field of graph theory, Critical Node Detection Problem and solve it on the protein-protein interaction networks to obtain a subset of proteins called critical nodes which have the most effect on the network stability. Our results show that essential proteins have a more prominent presence in the set of critical nodes than what is expected at random samples. Furthermore, the essential proteins represented in the set of critical nodes have a different distribution of topological properties compared to the essential proteins recovered by the centrality-based methods. All the source codes and data are available at “ PPI networks/”.


[1] R. Albert, Network inference, analysis, and modeling in systems biology, The Plant Cell, 19 (2007), pp. 3327– 3338.
[2] E. Alm and A. P. Arkin, Biological networks, Current opinion in structural biology, 13 (2003), pp. 193–202.
[3] M. Altaf-Ul-Amin, S. H. Wijaya, D. F. Chandra, and S. Kanaya, Centrality values of yeast proteins in a ppi network are related to their essentiality and functions, Journal of Computer Aided Chemistry, 18 (2017), pp. 94–109.
[4] M. Ambriz-Rivas, N. Pastor, and G. del Rio, Relating protein structure and function through a bijection and its implications on protein structure prediction, Protein Interactions, 349 (2012).
[5] R. Aringhieri, A. Grosso, P. Hosteins, and R. Scatamacchia, A general evolutionary framework for different classes of critical node problems, Engineering Applications of Artificial Intelligence, 55 (2016), pp. 128–145.
[6] O. Aromolaran, T. Beder, M. Oswald, J. Oyelade, E. Adebiyi, and R. Koenig, Essential gene prediction in drosophila melanogaster using machine learning approaches based on sequence and functional features, Computational and structural biotechnology journal, 18 (2020), pp. 612–621.
[7] A. Arulselvan, C. W. Commander, L. Elefteriadou, and P. M. Pardalos, Detecting critical nodes in sparse graphs, Computers & Operations Research, 36 (2009), pp. 2193–2200.
[8] M. Ashtiani, A. Salehzadeh-Yazdi, Z. Razaghi-Moghadam, H. Hennig, O. Wolkenhauer, M. Mirzaie, and M. Jafari, A systematic survey of centrality measures for protein-protein interaction networks, BMC systems biology, 12 (2018), pp. 1–17.
[9] N. N. Batada, L. D. Hurst, and M. Tyers, Evolutionary and physiological importance of hub proteins, PLoS computational biology, 2 (2006), p. e88.
[10] S. P. Borgatti, Identifying sets of key players in a social network, Computational & Mathematical Organization Theory, 12 (2006), pp. 21–34.
[11] H. Buchner, Displaying centrality of a network using orbital layout, Master’s Thesis, University of Passau/National ICT Sydney, (2006).
[12] C.-Y. Chen and J.-J. Huang, A novel centrality for finding key persons in a social network by the bidirectional influence map, Symmetry, 12 (2020), p. 1747.
[13] W.-H. Chen, G. Lu, X. Chen, X.-M. Zhao, and P. Bork, Ogee v2: an update of the online gene essentiality database with special focus on differentially essential genes in human cancer cell lines, Nucleic acids research, (2016), p. gkw1013.
[14] J. Cheng, W. Wu, Y. Zhang, X. Li, X. Jiang, G. Wei, and S. Tao, A new computational strategy for predicting essential genes, BMC genomics, 14 (2013), pp. 1–13.
[15] M. Di Summa, A. Grosso, and M. Locatelli, Complexity of the critical node problem over trees, Computers & Operations Research, 38 (2011), pp. 1766–1774.
[16] T. N. Dinh, M. T. Thai, and H. T. Nguyen, Bound and exact methods for assessing link vulnerability in complex networks, Journal of Combinatorial Optimization, 28 (2014), pp. 3–24.
[17] B. Duan, J. Liu, M. Zhou, and L. Ma, A comparative analysis of network robustness against different link attacks, Physica A: Statistical Mechanics and its Applications, 448 (2016), pp. 144–153.
[18] E. Estrada, Virtual identification of essential proteins within the protein interaction network of yeast, Proteomics, 6 (2006), pp. 35–40.
[19] L. Faramondi, G. Oliva, R. Setola, F. Pascucci, A. Esposito Amideo, and M. P. Scaparra, Perfor[1]mance analysis of single and multi-objective approaches for the critical node detection problem, in International Conference on Optimization and Decision Science, Springer, 2017, pp. 315–324.
[20] F. Grando, Methods for the approximation of network centrality measures, (2018).
[21] S. Gund ¨ uc¸ and R. Eryi ¨ git ˘ , Time dependent correlations between the probability of a node being infected and its centrality measures, Physica A: Statistical Mechanics and its Applications, 563 (2021), p. 125483.
[22] S. Gurumayum, P. Jiang, X. Hao, T. L. Campos, N. D. Young, P. K. Korhonen, R. B. Gasser, P. Bork, X.-M. Zhao, L.-j. He, et al., Ogee v3: Online gene essentiality database with increased coverage of organisms and human cell lines, Nucleic acids research, 49 (2021), pp. D998–D1003.
[23] A. Hagberg, P. Swart, and D. Schult, Exploring network structure, dynamics, and function using networkx, tech. rep., Los Alamos National Lab.(LANL), Los Alamos, NM (United States), 2008.
[24] M. W. Hahn and A. D. Kern, Comparative genomics of centrality and essentiality in three eukaryotic protein-interaction networks, Molecular biology and evolution, 22 (2005), pp. 803–806.
[25] S. Iyer, T. Killingback, B. Sundaram, and Z. Wang, Attack robustness and centrality of complex networks, PloS one, 8 (2013), p. e59613.
[26] H. Jeong, S. P. Mason, A.-L. Barabasi, and Z. N. Oltvai ´ , Lethality and centrality in protein networks, Nature, 411 (2001), pp. 41–42.
[27] M. Lalou, M. A. Tahraoui, and H. Kheddouci, The critical node detection problem in networks: A survey, Computer Science Review, 28 (2018), pp. 92–117.
[28] Y.-Y. Liu, J.-J. Slotine, and A.-L. Barabasi ´ , Control centrality and hierarchical structure in complex networks, PLOS ONE, 7 (2012), pp. 1–7.
[29] Y. Lopez, K. Nakai, and A. Patil ´ , Hitpredict version 4: comprehensive reliability scoring of physical protein–protein interactions from more than 100 species, Database, 2015 (2015).
[30] H. Luo, Y. Lin, F. Gao, C.-T. Zhang, and R. Zhang, Deg 10, an update of the database of essential genes that includes both protein-coding genes and noncoding genomic elements, Nucleic acids research, 42 (2014), pp. D574–D580.
[31] N. Matas, Comparing network centrality measures as tools for identifying key concepts in complex networks: A case of wikipedia., Journal of Digital Information Management, 15 (2017).
[32] N. Meghanathan, Correlation coefficient analysis of centrality metrics for complex network graphs, in Computer Science On-line Conference, Springer, 2015, pp. 11–20.
[33] A. E. Motter and Y.-C. Lai, Cascade-based attacks on complex networks, Physical Review E, 66 (2002), p. 065102.
[34] T. Murali, S. Pacifico, J. Yu, S. Guest, G. G. Roberts, and R. L. Finley, Droid 2011: a comprehensive, integrated resource for protein, transcription factor, rna and gene interactions for drosophila, Nucleic acids research, 39 (2011), pp. D736–D743.
[35] T. Nie, Z. Guo, K. Zhao, and Z.-M. Lu, New attack strategies for complex networks, Physica A: Statistical Mechanics and its Applications, 424 (2015), pp. 248–253.
[36] S. Oldham, B. Fulcher, L. Parkes, A. Arnatkevici ˘ ut¯ e, C. Suo, and A. Fornito ˙ , Consistency and differences between centrality measures across distinct classes of networks, PloS one, 14 (2019), p. e0220061.
[37] M. Oosten, J. H. Rutten, and F. C. Spieksma, Disconnecting graphs by removing vertices: a polyhedral approach, Statistica Neerlandica, 61 (2007), pp. 35–60.
[38] S. Orchard, M. Ammari, B. Aranda, L. Breuza, L. Briganti, F. Broackes-Carter, N. H. Camp[1]bell, G. Chavali, C. Chen, N. Del-Toro, et al., The mintact project—intact as a common curation platform for 11 molecular interaction databases, Nucleic acids research, 42 (2014), pp. D358–D363.
[39] R. Oughtred, C. Stark, B.-J. Breitkreutz, J. Rust, L. Boucher, C. Chang, N. Kolas, L. O’Donnell, G. Leung, R. McAdam, et al., The biogrid interaction database: 2019 update, Nucleic acids research, 47 (2019), pp. D529–D541.
[40] S. Pundir, M. J. Martin, C. O’Donovan, and U. Consortium, Uniprot tools, Current protocols in bioinformatics, 53 (2016), pp. 1–29.
[41] S. Rasti and C. Vogiatzis, A survey of computational methods in protein-protein interaction networks, Ann. Oper. Res., 276 (2019), pp. 35–87.
[42] J. Rezaei, F. Zare-Mirakabad, S. A. MirHassani, and S.-A. Marashi, Eia-cndp: an exact iterative algorithm for critical node detection problem, Computers & Operations Research, 127 (2021), p. 105138.
[43] L. Salwinski, C. S. Miller, A. J. Smith, F. K. Pettit, J. U. Bowie, and D. Eisenberg, The database of interacting proteins: 2004 update, Nucleic acids research, 32 (2004), pp. D449–D451. [44] P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang, D. Ramage, N. Amin, B. Schwikowski, and T. Ideker, Cytoscape: a software environment for integrated models of biomolec[1]ular interaction networks, Genome research, 13 (2003), pp. 2498–2504. [45] S. Shen, J. C. Smith, and R. Goli, Exact interdiction models and algorithms for disconnecting networks via node deletions, Discrete Optimization, 9 (2012), pp. 172–188.
[46] J. G. Siek, L.-Q. Lee, and A. Lumsdaine, The boost graph library: User guide and reference manual, portable documents, Pearson Education, 2001.
[47] J. L. Snoep and H. V. Westerhoff, From isolation to integration, a systems biology approach for building the silicon cell, in Systems biology, Springer, 2005, pp. 13–30.
[48] S.-w. Sun, Y.-l. Ma, R.-q. Li, L. Wang, and C.-y. Xia, Tabu search enhances network robustness under targeted attacks, Physica A: Statistical Mechanics and its Applications, 446 (2016), pp. 82–91.
[49] D. Szklarczyk, A. L. Gable, D. Lyon, A. Junge, S. Wyder, J. Huerta-Cepas, M. Simonovic, N. T. Doncheva, J. H. Morris, P. Bork, et al., String v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets, Nucleic acids research, 47 (2019), pp. D607–D613.
[50] X. Tang, J. Wang, J. Zhong, and Y. Pan, Predicting essential proteins based on weighted degree centrality, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 11 (2013), pp. 407–418.
[51] D. Vella, S. Marini, F. Vitali, D. Di Silvestre, G. Mauri, and R. Bellazzi, Mtgo: Ppi network analysis via topological and functional module identification, Scientific reports, 8 (2018), pp. 1–13.
[52] M. Ventresca and D. Aleman, A randomized algorithm with local search for containment of pandemic disease spread, Computers & Operations Research, 48 (2014), pp. 11–19.
[53] A. Veremyev, V. Boginski, and E. L. Pasiliao, Exact identification of critical nodes in sparse networks via new compact formulations, Optimization Letters, 8 (2014), pp. 1245–1259.
[54] J. Wang, W. Peng, and F.-X. Wu, Computational approaches to predicting essential proteins: a survey, PROTEOMICS–Clinical Applications, 7 (2013), pp. 181–192.
[55] C. Wierling, T. Kessler, L. A. Ogilvie, B. M. Lange, M.-L. Yaspo, and H. Lehrach, Network and systems biology: essential steps in virtualising drug discovery and development, Drug Discovery Today: Technologies, 15 (2015), pp. 33–40.
[56] L. Yang, J. Wang, H. Wang, Y. Lv, Y. Zuo, and W. Jiang, Characterization of essential genes by topological properties in the perturbation sensitivity network, Biochemical and biophysical research communications, 448 (2014), pp. 473–479.
[57] H. Yu, D. Greenbaum, H. X. Lu, X. Zhu, and M. Gerstein, Genomic analysis of essentiality within protein networks, TRENDS in Genetics, 20 (2004), pp. 227–231.
[58] H. Yu, P. M. Kim, E. Sprecher, V. Trifonov, and M. Gerstein, The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics, PLoS computational biology, 3 (2007), p. e59.
[59] J. Yu, S. Pacifico, G. Liu, and R. L. Finley, Droid: the drosophila interactions database, a comprehensive resource for annotated gene and protein interactions, BMC genomics, 9 (2008), pp. 1–9. [60] F. S. Zaidi, U. Fatima, B. A. Usmani, and A. R. Jafri, Comprehending nodes essentiality through centrality measures in biological networks, IJCSNS, 19 (2019), p. 65.
[61] E. Zajmi, The ottoman period in albanian historiography (1915-2015), (2018).
[62] X. Zhao and Z.-P. Liu, Analysis of topological parameters of complex disease genes reveals the importance of location in a biomolecular network, Genes, 10 (2019), p. 143.
[63] Z. Zhu, Discovering the influential users oriented to viral marketing based on online social networks, Physica A: Statistical Mechanics and its Applications, 392 (2013), pp. 3459–3469.
[64] E. Zotenko, J. Mestre, D. P. O’Leary, and T. M. Przytycka, Why do hubs in the yeast protein interaction network tend to be essential: reexamining the connection between the network topology and essentiality, PLoS computational biology, 4 (2008), p. e1000140.