Rumors circulating on social media can have significant adverse impacts on society, highlighting the urgent need for effective rumor detection. However, most existing methods identify rumors only after widespread dissemination has occurred, by which point substantial harm has already been inflicted. Early-stage rumor detection remains a major challenge because limited reach and small sample sizes restrict the use of large datasets and traditional propagation models. To overcome these challenges, a cross-modal deep fusion method based on small samples is proposed for early rumor detection. It comprises a multimodal feature extraction network, a multimodal deep information extraction network, and a cross-modal deep fusion network. The multimodal feature extraction network captures features from multiple modalities, the multimodal deep information extraction network derives deep representations from these modalities, and the cross-modal deep fusion network integrates textual and visual features for rumor classification. Furthermore, an enhanced meta-learning training approach based on model-agnostic meta-learning is proposed, which improves the efficiency of rumor detection by employing distinct learning rates within and between tasks. Experimental results on two publicly available datasets demonstrate that the proposed cross-modal deep fusion method outperforms baseline methods.
Citation: Junqing Yang, Hongzhe Chen, Yao Zhao, Wing-Kuen Ling, Yang Zhou. Cross-modal deep fusion based on small samples for early rumor detection[J]. Electronic Research Archive, 2026, 34(1): 232-250. doi: 10.3934/era.2026012
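The abstract describes a training scheme based on model-agnostic meta-learning that uses one learning rate within each task and a different one between tasks. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes a first-order approximation of MAML, a toy scalar regression model instead of the rumor classifier, and hypothetical names and values (`inner_lr`, `outer_lr`, `make_task`, the slope range) chosen purely for illustration.

```python
import random


def mse(w, data):
    """Mean squared error of the scalar model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def mse_grad(w, data):
    """Gradient of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def make_task(rng, slope, n=5):
    """Hypothetical few-shot task: n support and n query points on y = slope * x."""
    def sample():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        return [(x, slope * x) for x in xs]
    return sample(), sample()


def adapt(w, support, inner_lr, inner_steps=1):
    """Within-task update: a few gradient steps on the support set."""
    for _ in range(inner_steps):
        w = w - inner_lr * mse_grad(w, support)
    return w


def fomaml_step(w, tasks, inner_lr, outer_lr):
    """Between-task update (first-order MAML, an assumed simplification).

    The query-set gradient is evaluated at the adapted weights and applied
    directly to the shared initialization, using a separate learning rate.
    """
    meta_grad = 0.0
    for support, query in tasks:
        w_task = adapt(w, support, inner_lr)
        meta_grad += mse_grad(w_task, query)
    return w - outer_lr * meta_grad / len(tasks)


def train(seed=0, meta_iters=200, inner_lr=0.1, outer_lr=0.05):
    """Meta-train a shared initialization over randomly drawn toy tasks."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(meta_iters):
        tasks = [make_task(rng, rng.uniform(1.0, 3.0)) for _ in range(4)]
        w = fomaml_step(w, tasks, inner_lr, outer_lr)
    return w


if __name__ == "__main__":
    w0 = train()
    rng = random.Random(1)
    support, query = make_task(rng, 2.8)
    w_new = adapt(w0, support, inner_lr=0.1)
    print("query loss before adaptation:", mse(w0, query))
    print("query loss after adaptation: ", mse(w_new, query))
```

Keeping `inner_lr` and `outer_lr` as separate parameters lets the per-task adaptation move quickly on a handful of samples while the shared initialization changes more conservatively across tasks. The values used here are arbitrary and are not taken from the paper.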