As the cost of data labeling continues to escalate, feature selection for partially labeled multiset-valued decision information systems (p-MSVDISs) has become a core challenge in data mining. Information entropy, as an uncertainty measure, can drive feature selection via global equivalence relations, but it fails to capture the behavior of features within local data regions, and in partially labeled scenarios its inability to handle missing labels degrades selection accuracy. In contrast, local conditional entropy can accurately characterize the discriminative ability of features in local regions by quantifying the information carried by the data. To address feature selection in a p-MSVDIS, this paper proposes two algorithms. First, we use the Hellinger distance in a p-MSVDIS to define the tolerance classes induced by different attributes. Second, we introduce local conditional entropy and design two feature selection algorithms for a p-MSVDIS with predicted labels. Finally, comparative experimental results demonstrate that the proposed algorithms significantly improve classification performance and reduce redundant features in partially labeled data scenarios.
Citation: Dongliang Li, Yanlan Zhang. Feature selection in partially labeled multiset-valued decision information systems based on local conditional entropy[J]. Electronic Research Archive, 2025, 33(11): 6672-6699. doi: 10.3934/era.2025295
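The two ingredients named in the abstract, Hellinger-distance tolerance classes and local conditional entropy, can be sketched minimally as follows. This is an illustrative reading only: the function names (`hellinger`, `tolerance_classes`, `local_conditional_entropy`), the frequency normalization of multisets, and the distance threshold are assumptions for demonstration, not the paper's exact definitions, which appear in the full text.

```python
import math
from collections import Counter

def hellinger(ms1, ms2):
    """Hellinger distance between two multisets, compared via their
    normalized value-frequency distributions. Returns a value in [0, 1]."""
    p, q = Counter(ms1), Counter(ms2)
    n1, n2 = sum(p.values()), sum(q.values())
    support = set(p) | set(q)
    s = sum((math.sqrt(p[v] / n1) - math.sqrt(q[v] / n2)) ** 2 for v in support)
    return math.sqrt(s) / math.sqrt(2)

def tolerance_classes(objects, threshold):
    """Tolerance class of each object: all objects within the given
    Hellinger distance. objects is a list of multisets (value lists)."""
    return [
        [j for j, y in enumerate(objects) if hellinger(x, y) <= threshold]
        for x in objects
    ]

def local_conditional_entropy(classes, labels):
    """Conditional entropy of the decision given the tolerance classes,
    restricted to the labeled objects (a generic 'local' restriction;
    labels maps object index -> decision label for labeled objects)."""
    labeled = set(labels)
    n = len(labeled)
    h = 0.0
    for i in labeled:
        cls = [j for j in classes[i] if j in labeled]
        counts = Counter(labels[j] for j in cls)
        m = len(cls)
        h -= sum(c / m * math.log2(c / m) for c in counts.values()) / n
    return h
```

Under this reading, a greedy feature selection loop would add, at each step, the attribute whose tolerance classes most reduce the local conditional entropy on the labeled (or label-predicted) objects.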