Research article Special Issues

INSDPC: A density peaks clustering algorithm based on interactive neighbors similarity

  • Published: 27 April 2025
  • MSC: 62H30

  • The density peaks clustering (DPC) algorithm has gained significant attention in various fields due to its simplicity and effectiveness. However, its performance is constrained by the way local density is calculated and by the choice of the cutoff distance $ d_c $, a parameter that depends mainly on the global data distribution and neglects local characteristics. In addition, the one-step assignment strategy in DPC is prone to chain errors: a single misassigned point can cause the points that follow it to be misassigned as well, which degrades clustering performance. To address these limitations, this paper proposes the interactive neighbors similarity-based density peaks clustering (INSDPC) algorithm. The algorithm introduces an interactive neighbors similarity measure that combines information from interactive neighbors and shared neighbors to redefine local density. Furthermore, a two-step assignment strategy, which uses interactive neighbors similarity and neighborhood information, is designed to keep an incorrectly assigned point from causing further errors. Experimental results on synthetic and real-world datasets demonstrate that INSDPC improves cluster-center identification and enhances clustering precision.
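To make the two limitations named in the abstract concrete, here is a minimal sketch of the baseline DPC procedure (the Rodriguez and Laio formulation with a cutoff kernel), not of INSDPC itself. The function name `dpc_baseline`, its parameters, and the rule of picking centers by the largest product of density and distance are illustrative assumptions; the paper's own definitions may differ in detail.

```python
import numpy as np


def dpc_baseline(points, d_c, n_clusters):
    """Sketch of baseline DPC (not INSDPC); names here are illustrative."""
    X = np.asarray(points, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Local density: number of other points closer than the global cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from highest to lowest density (ties broken by index).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: delta is its largest distance to any point.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]  # points ranked as denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Assumed center rule: the n_clusters largest values of rho * delta.
    gamma = rho * delta
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_clusters]]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # One-step assignment: each remaining point copies the label of its
    # nearest denser neighbor, so one wrong label is inherited downstream.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta
```

In this sketch every density value depends on the single global `d_c`, and the final loop copies labels along the `parent` chain, which is where one misassignment can spread to all points that follow it.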

    Citation: Shihu Liu, Yirong He, Xiyang Yang, Zhiqiang Yu. INSDPC: A density peaks clustering algorithm based on interactive neighbors similarity[J]. AIMS Mathematics, 2025, 10(4): 9748-9772. doi: 10.3934/math.2025447

  • © 2025 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)