A survey on hierarchical community detection in large-scale complex networks

Document Type : Review Article

Authors

College of Engineering and Computer Science, Australian National University, Canberra, Australia

Abstract

Vertices in a real-world social network can be grouped into densely connected communities that are sparsely connected to other groups, and these com[1]munities can be partitioned into successively more cohesive communities. Given the ever-growing pile of research on community detection, various researchers have surveyed the evolution of various community detection methods such as flat community detection, overlapping community detection, dynamic community detection and community search. Yet, the problem of hierarchical community detection, despite being well studied, has not been surveyed and the evolution of methods to identify hierarchies of communities in large-scale complex networks has not been documented. In this survey, we study the hierarchical community detection problem and formally define this problem. We then classify the existing works on hierarchical community detection and discuss some of the flat community detection approaches that are capable of producing hierarchies. We then introduce a set of empirical analysis tools, such as benchmark datasets and accuracy measures to evaluate the performance of a hierarchical community detection method.

Keywords


[1] T. Akiba, Y. Iwata, and Y. Yoshida, Linear-time enumeration of maximal k-edge-connected subgraphs in large networks by random contraction, in Proceedings of the 22nd ACM international conference on Information & Knowledge Management, 2013, pp. 909–918.
[2] S. Bandyopadhyay, G. Chowdhary, and D. Sengupta, Focs: Fast overlapped community search, TKDE’15, 27 (2015), pp. 2974–2985.
[3] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, Complex networks: Structure and dynamics, Physics reports, 424 (2006), pp. 175–308.
[4] J. A. Bondy, U. S. R. Murty, et al., Graph theory with applications, vol. 290, Citeseer, 1976.
[5] R. S. Burt, Positions in networks, Social forces, 55 (1976), pp. 93–122.
[6] J. Cohen, Trusses: Cohesive subgraphs for social network analysis, National Security Agency Technical Report, (2008), p. 16.
[7] L. Dhulipala, D. Eisenstat, J. Lacki, V. Mirrokni, and J. Shi, Hierarchical agglomerative graph clustering in nearly-linear time, in Proceedings of the 38th International Conference on Machine Learning, M. Meila and T. Zhang, eds., vol. 139 of Proceedings of Machine Learning Research, PMLR, 18–24 Jul 2021, pp. 2676–2686.
[8] N. Du, B. Wang, and B. Wu, Overlapping community structure detection in networks, in CIKM’08, 2008, pp. 1371–1372.
[9] D. A. Fell and A. Wagner, The small world of metabolism, Nature biotechnology, 18 (2000), p. 1121.
[10] S. Fortunato, Community detection in graphs, Physics reports, 486 (2010), pp. 75–174.
[11] S. Fortunato, V. Latora, and M. Marchiori, Method to find community structures based on information centrality, Physical review E, 70 (2004), p. 056104.
[12] L. C. Freeman, A set of measures of centrality based on betweenness, Sociometry, (1977), pp. 35–41.
[13] M. Girvan and M. E. Newman, Community structure in social and biological networks, PNAS’02, 99 (2002), pp. 7821–7826.
[14] P. K. Gopalan and D. M. Blei, Efficient discovery of overlapping communities in massive networks, PNAS’13, 110 (2013), pp. 14534–14539.
[15] J. H. W. Jr., Hierarchical grouping to optimize an objective function, Journal of the American Statistical Association, 58 (1963), pp. 236–244.
[16] R. Kannan, S. Vempala, and A. Vetta, On clusterings: Good, bad and spectral, J. ACM, 51 (2004), pp. 497–515.
[17] J. M. Kleinberg, Navigation in a small world, Nature, 406 (2000), p. 845.
[18] A. Lancichinetti, S. Fortunato, and J. Kertesz ´ , Detecting the overlapping and hierarchical community structure in complex networks, New Journal of Physics, 11 (2009), p. 033015.
[19] C. Lee, F. Reid, A. McDaid, and N. Hurley, Detecting highly overlapping community structure by greedy clique expansion, SNA/KDD’10, (2010), pp. 33–42.
[20] F. Luo, J. Z. Wang, and E. Promislow, Exploring local community structures in large networks, Web Intelligence and Agent Systems, 6 (2008), pp. 387–400.
[21] M. Mihail, C. Gkantsidis, A. Saberi, and E. Zegura, On the semantics of internet topologies, Tech. Rep. GIT-CC-02-07, College of Computing, Georgia Institute of Technology, Atlanta, GA, 2002.
[22] M. E. Newman, Coauthorship networks and patterns of scientific collaboration, PNAS’04, 101 (2004), pp. 5200–5205.
[23] , Modularity and community structure in networks, PNAS’06, 103 (2006), pp. 8577–8582.
[24] M. E. Newman and M. Girvan, Mixing patterns and community structure in networks, in Statistical mechanics of complex networks, Springer, 2003, pp. 66–87.
[25] G. Palla, I. Derenyi, I. Farkas, and T. Vicsek ´ , Uncovering the overlapping community structure of complex networks in nature and society, Nature, 435 (2005), pp. 814–818.
[26] P. Pons and M. Latapy, Computing communities in large networks using random walks, in Computer and Information Sciences - ISCIS 2005, p. Yolum, T. G¨ung¨or, F. G¨urgen, and C. Ozturan, eds., Berlin, Heidelberg, ¨ 2005, Springer Berlin Heidelberg, pp. 284–293.
[27] M. J. Rattigan, M. Maier, and D. Jensen, Graph clustering with network structure indices, in Proceedings of the 24th International Conference on Machine Learning, ICML ’07, New York, NY, USA, 2007, Association for Computing Machinery, p. 783–790.
[28] E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barabasi ´ , Hierarchical organization of modularity in metabolic networks, Science, 297 (2002), pp. 1551–1555.
[29] M. Rezvani, W. Liang, C. Liu, and J. X. Yu, Efficient detection of overlapping communities using asymmetric triangle cuts, IEEE Transactions on Knowledge and Data Engineering, 30 (2018), pp. 2093–2105.
[30] M. Rezvani, W. Liang, W. Xu, and C. Liu, Identifying top-k structural hole spanners in large-scale social networks, in CIKM’15, 2015, pp. 263–272.
[31] M. Rezvani, Q. Wang, and W. Liang, Fach: Fast algorithm for detecting cohesive hierarchies of communities in large networks, in Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 2018, pp. 486–494.
 [32] M. Rosvall and C. T. Bergstrom, Maps of random walks on complex networks reveal community structure, PNAS’08, 105 (2008), pp. 1118–1123.
[33] B. Saha, A. Hoch, S. Khuller, L. Raschid, and X.-N. Zhang, Dense subgraphs with restrictions and applications to gene annotation graphs, in Research in Computational Molecular Biology, Springer, 2010, pp. 456–472.
[34] M. Sales-Pardo, R. Guimera, A. A. Moreira, and L. A. N. Amaral, Extracting the hierarchical organization of complex systems, PNAS’07, 104 (2007), pp. 15224–15229.
[35] P. Sarkar and A. W. Moore, Random walks in social networks and their applications: A survey, in Social Network Data Analytics, C. C. Aggarwal, ed., Springer US, Boston, MA, 2011, pp. 43–77.
[36] S. E. Schaeffer, Graph clustering, Computer science review, 1 (2007), pp. 27–64.
[37] H. Shen, X. Cheng, K. Cai, and M.-B. Hu, Detect overlapping and hierarchical community structure in networks, Physica A: Statistical Mechanics and its Applications, 388 (2009), pp. 1706–1712.
[38] S. H. Strogatz, Exploring complex networks, nature, 410 (2001), p. 268.
[39] S. S. Tabatabaei, M. Coates, and M. Rabbat, Ganc: Greedy agglomerative normalized cut for graph clustering, Pattern Recognition, 45 (2012), pp. 831–843.
[40] H. Tong, S. Papadimitriou, C. Faloutsos, S. Y. Philip, and T. Eliassi-Rad, Gateway finder in large graphs: problem definitions and fast solutions, Information retrieval, 15 (2012), pp. 391–411.
[41] D. J. Watts and S. H. Strogatz, Collective dynamics of ‘small-world’ networks, nature, 393 (1998), pp. 440–442.
[42] J. J. Whang, D. F. Gleich, and I. S. Dhillon, Overlapping community detection using seed set expansion, in CIKM’13, 2013, pp. 2099–2108.
[43] Y. Wu, R. Jin, J. Li, and X. Zhang, Robust local community detection: on free rider effect and its elimination, VLDB’15, 8 (2015), pp. 798–809.
[44] J. Xie, S. Kelley, and B. K. Szymanski, Overlapping community detection in networks: The state-of-theart and comparative study, ACM Computing Surveys, 45 (2013), p. 43.
[45] J. Yang and J. Leskovec, Overlapping community detection at scale: a nonnegative matrix factorization approach, in WSDM’13, ACM, 2013, pp. 587–596.
[46] Y. Zhang and S. Parthasarathy, Extracting analyzing and visualizing triangle k-core motifs within networks, in 2012 IEEE 28th international conference on data engineering, IEEE, 2012, pp. 1049–1060.
[47] Y. Zhao, G. Karypis, and U. Fayyad, Hierarchical clustering algorithms for document datasets, Data mining and knowledge discovery, 10 (2005), pp. 141–168.
[48] H. Zhou, Distance, dissimilarity index, and network community structure, Phys. Rev. E, 67 (2003), p. 061901.
[49] , Network landscape from a brownian particle’s perspective, Phys. Rev. E, 67 (2003), p. 041908.
[50] H. Zhou and R. Lipowsky, Network brownian motion: A new method to measure vertex-vertex proximity and to identify communities and subcommunities, in International conference on computational science, Springer, 2004, pp. 1062–1069.
[51] R. Zhou, C. Liu, J. X. Yu, W. Liang, B. Chen, and J. Li, Finding maximal k-edge-connected subgraphs from a large graph, in EDBT’12, 2012, pp. 480–491.