Research article

CoReFuNet: A coarse-to-fine registration and fusion network for typhoon intensity classification using multimodal satellite imagery

  • Published: 02 April 2025
  • Typhoons cause significant damage to coastal and inland areas, making the accurate classification of typhoon cloud images essential for effective monitoring and forecasting. While integrating multimodal data from different satellites can improve classification accuracy, existing methods often rely on aligned images and fail to account for radiometric and structural differences, leading to performance degradation during image fusion. Among registration methods designed to address this issue, two-stage approaches estimate deformation fields inaccurately, while one-stage methods typically overlook radiometric differences between typhoon cloud images. Additionally, fusion methods suffer from noise accumulation and insufficient cross-modal feature utilization due to their cascaded structures. To address these issues, this study proposed a coarse-to-fine registration and fusion network (CoReFuNet) that integrated a one-stage registration module with a cross-modal fusion module for multimodal typhoon cloud image classification. The registration module adopted a one-stage coarse-to-fine strategy, using cross-modal style alignment to address radiometric differences and affine-based global spatial registration to resolve positional differences. Bidirectional local feature refinement (BLFR) then performed fine registration by evaluating feature points in each image and applying precise local adjustments. Following registration, the fusion module employed a dual-branch alternating enhancement (DAE) approach, which reduced noise by learning cross-modal mapping relationships and applying feedback-based adjustments. Additionally, a cross-modal feature interaction (CMFI) module merged low-level, high-level, intra-modal, and inter-modal features through a residual structure, minimizing modality differences and maximizing feature utilization. Experiments on the FY-HMW (Feng Yun-Himawari) dataset, constructed from Feng Yun and Himawari satellite data, showed that CoReFuNet outperformed existing registration methods (VoxelMorph and SIFT) and fusion methods (IFCNN and DenseFuse), achieving 84.34% accuracy and 87.16% G-mean on the FY test set, and 82.88% accuracy and 85.54% G-mean on the HMW test set. These results represented significant improvements, particularly in unaligned data scenarios, highlighting the method's potential for real-world typhoon monitoring and forecasting.

    Citation: Zongsheng Zheng, Jia Du, Yuewei Zhang, Xulong Wang. CoReFuNet: A coarse-to-fine registration and fusion network for typhoon intensity classification using multimodal satellite imagery[J]. Electronic Research Archive, 2025, 33(4): 1875-1901. doi: 10.3934/era.2025085
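
    The abstract describes a three-stage architecture: coarse global registration via an affine transform, fine registration via bidirectional local feature refinement, and cross-modal fusion followed by intensity classification. The PyTorch sketch below is a minimal, hypothetical rendering of how such stages could compose; all module internals, layer configurations, and the choice of six intensity classes are placeholder assumptions for illustration, not the authors' implementation.

    ```python
    # Hypothetical sketch of a coarse-to-fine registration + fusion pipeline.
    # Module names mirror the abstract's stages; internals are placeholders.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class CoarseAffineRegistration(nn.Module):
        """Coarse stage: predict a global 2x3 affine transform and warp the moving image."""
        def __init__(self, in_ch=2):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.fc = nn.Linear(32, 6)
            # Start from the identity transform so early training applies "no warp".
            nn.init.zeros_(self.fc.weight)
            with torch.no_grad():
                self.fc.bias.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

        def forward(self, moving, fixed):
            theta = self.fc(self.encoder(torch.cat([moving, fixed], dim=1))).view(-1, 2, 3)
            grid = F.affine_grid(theta, moving.shape, align_corners=False)
            return F.grid_sample(moving, grid, align_corners=False)


    class LocalRefinement(nn.Module):
        """Fine stage (stand-in for BLFR): predict a dense residual displacement field."""
        def __init__(self, in_ch=2):
            super().__init__()
            self.flow = nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 2, 3, padding=1),
            )

        def forward(self, moving, fixed):
            n = moving.shape[0]
            displacement = self.flow(torch.cat([moving, fixed], dim=1))  # (N, 2, H, W)
            identity = F.affine_grid(
                torch.eye(2, 3, device=moving.device).unsqueeze(0).repeat(n, 1, 1),
                moving.shape, align_corners=False)
            grid = identity + displacement.permute(0, 2, 3, 1)  # offsets in (x, y) order
            return F.grid_sample(moving, grid, align_corners=False)


    class FusionClassifier(nn.Module):
        """Stand-in for the DAE/CMFI fusion module followed by an intensity classifier."""
        def __init__(self, num_classes=6):
            super().__init__()
            self.fuse = nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU())
            self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(32, num_classes))

        def forward(self, a, b):
            return self.head(self.fuse(torch.cat([a, b], dim=1)))


    class CoReFuNetSketch(nn.Module):
        """Compose the stages: coarse affine registration -> local refinement -> fusion."""
        def __init__(self, num_classes=6):
            super().__init__()
            self.coarse = CoarseAffineRegistration()
            self.fine = LocalRefinement()
            self.classifier = FusionClassifier(num_classes)

        def forward(self, fy, hmw):
            hmw_coarse = self.coarse(hmw, fy)   # register the HMW image onto the FY frame
            hmw_fine = self.fine(hmw_coarse, fy)
            return self.classifier(fy, hmw_fine)


    if __name__ == "__main__":
        fy = torch.rand(1, 1, 128, 128)   # single-channel FY-style patch (placeholder)
        hmw = torch.rand(1, 1, 128, 128)  # misaligned Himawari-style patch (placeholder)
        print(CoReFuNetSketch()(fy, hmw).shape)  # torch.Size([1, 6])
    ```

    Note that in the paper's one-stage design the coarse and fine registration steps (together with cross-modal style alignment) are trained jointly rather than as separate pipelines; the sketch omits style alignment and all loss terms.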

    References


    [1] X. Sun, Y. Tian, W. Lu, P. Wang, R. Niu, H. Yu, et al., From single- to multi-modal remote sensing imagery interpretation: A survey and taxonomy, Sci. China Inf. Sci., 66 (2023), 140301. https://doi.org/10.1007/s11432-022-3588-0 doi: 10.1007/s11432-022-3588-0
    [2] L. Li, H. Ling, M. Ding, H. Cao, H. Hu, A deep learning semantic template matching framework for remote sensing image registration, ISPRS J. Photogramm. Remote Sens., 181 (2021), 205–217. https://doi.org/10.1016/j.isprsjprs.2017.12.012 doi: 10.1016/j.isprsjprs.2017.12.012
    [3] R. Hang, Z. Li, P. Ghamisi, D. Hong, G. Xia, Q. Liu, Classification of hyperspectral and LiDAR data using coupled CNNs, IEEE Trans. Geosci. Remote Sens., 58 (2020), 4939–4950. https://doi.org/10.1109/TGRS.2020.2969024 doi: 10.1109/TGRS.2020.2969024
    [4] S. Morchhale, V. P. Pauca, R. J. Plemmons, T. C. Torgersen, Classification of pixel-level fused hyperspectral and lidar data using deep convolutional neural networks, in 2016 8th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), (2016), 1–5. https://doi.org/10.1109/WHISPERS.2016.8071715
    [5] S. Deldari, H. Xue, A. Saeed, D. V. Smith, F. D. Salim, Cocoa: Cross modality contrastive learning for sensor data, Proc. ACM Interact. Mobile Wearable Ubiquitous Technol., 6 (2022), 1–28. https://doi.org/10.1145/3550316 doi: 10.1145/3550316
    [6] D. Hong, L. Gao, N. Yokoya, J. Yao, J. Chanussot, Q. Du, et al., More diverse means better: Multimodal deep learning meets remote-sensing imagery classification, IEEE Trans. Geosci. Remote Sens., 59 (2020), 4340–4354. https://doi.org/10.1109/TGRS.2020.3016820 doi: 10.1109/TGRS.2020.3016820
    [7] H. Li, X. Wu, DenseFuse: A fusion approach to infrared and visible images, IEEE Trans. Image Process., 28 (2018), 2614–2623. https://doi.org/10.1109/TIP.2018.2887342 doi: 10.1109/TIP.2018.2887342
    [8] Y. Zhang, Y. Liu, P. Sun, H. Yan, X. Zhao, X. Zhang, IFCNN: A general image fusion framework based on convolutional neural network, Inf. Fusion, 54 (2020), 99–118. https://doi.org/10.1016/J.INFFUS.2019.07.011 doi: 10.1016/J.INFFUS.2019.07.011
    [9] H. Xu, J. Ma, J. Jiang, X. Guo, H. Ling, U2Fusion: A unified unsupervised image fusion network, IEEE Trans. Pattern Anal. Mach. Intell., 44 (2020), 502–518. https://doi.org/10.1109/TPAMI.2020.3012548 doi: 10.1109/TPAMI.2020.3012548
    [10] N. Li, Y. Li, J. Jiao, Multimodal remote sensing image registration based on adaptive multi-scale PIIFD, Multimed. Tools Appl., 83 (2024), 1–13. https://doi.org/10.1007/s11042-024-18756-1 doi: 10.1007/s11042-024-18756-1
    [11] H. Xie, J. Qiu, Y. Dai, Y. Yang, C. Cheng, Y. Zhang, SA-DNet: An on-demand semantic object registration network adapting to non-rigid deformation, preprint, arXiv: 2210.09900. https://doi.org/10.48550/arXiv.2210.09900
    [12] R. Feng, H. Shen, J. Bai, X. Li, Advances and opportunities in remote sensing image geometric registration: A systematic review of state-of-the-art approaches and future research directions, IEEE Geosci. Remote Sens. Mag., 9 (2021), 120–142. https://doi.org/10.1109/MGRS.2021.3081763 doi: 10.1109/MGRS.2021.3081763
    [13] W. M. Wells III, P. Viola, H. Atsumi, S. Nakajima, R. Kikinis, Multi-modal volume registration by maximization of mutual information, Med. Image Anal., 1 (1996), 35–51. https://doi.org/10.1016/s1361-8415(01)80004-9 doi: 10.1016/s1361-8415(01)80004-9
    [14] A. Goshtasby, G. C. Stockman, C. V. Page, A region-based approach to digital image registration with subpixel accuracy, IEEE Trans. Geosci. Remote Sens., 24 (1986), 390–399. https://doi.org/10.1109/TGRS.1986.289597 doi: 10.1109/TGRS.1986.289597
    [15] X. He, C. Meile, S. M. Bhandarkar, Multimodal registration of FISH and nanoSIMS images using convolutional neural network models, preprint, arXiv: 2201.05545. https://doi.org/10.48550/arXiv.2201.05545
    [16] D. G. Lowe, Object recognition from local scale-invariant features, in Proceedings of the Seventh IEEE International Conference on Computer Vision, 2 (1999), 1150–1157. https://doi.org/10.1109/ICCV.1999.790410
    [17] J. Li, Q. Hu, M. Ai, RIFT: Multi-modal image matching based on radiation-variation insensitive feature transform, IEEE Trans. Image Process., 29 (2019), 3296–3310. https://doi.org/10.1109/TIP.2019.2959244 doi: 10.1109/TIP.2019.2959244
    [18] L. Tang, Y. Deng, Y. Ma, J. Huang, J. Ma, SuperFusion: A versatile image registration and fusion network with semantic awareness, IEEE/CAA J. Autom. Sin., 9 (2022), 2121–2137. https://doi.org/10.1109/JAS.2022.106082 doi: 10.1109/JAS.2022.106082
    [19] B. K. P. Horn, B. G. Schunck, Determining optical flow, Artif. Intell. 17 (1981), 185–203. https://doi.org/10.1016/0004-3702(81)90024-2 doi: 10.1016/0004-3702(81)90024-2
    [20] J. Xiong, Y. Luo, G. Tang, An improved optical flow method for image registration with large-scale movements, Acta Autom. Sin., 34 (2008), 760–764. https://doi.org/10.3724/SP.J.1004.2008.00760 doi: 10.3724/SP.J.1004.2008.00760
    [21] H. Li, J. Zhao, J. Li, Z. Yu, G. Lu, Feature dynamic alignment and refinement for infrared-visible image fusion: Translation robust fusion, Inf. Fusion, 95 (2023), 26–41. https://doi.org/10.1016/j.inffus.2023.02.011 doi: 10.1016/j.inffus.2023.02.011
    [22] S. Tang, P. Miao, X. Gao, Y. Zhong, D. Zhu, H. Wen, et al., Point cloud-based registration and image fusion between cardiac SPECT MPI and CTA, preprint, arXiv: 2402.06841. https://doi.org/10.48550/arXiv.2402.06841
    [23] Z. Xu, J. Yan, J. Luo, X. Li, J. Jagadeesan, Unsupervised multimodal image registration with adaptative gradient guidance, in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (2021), 1225–1229. https://doi.org/10.1109/ICASSP39728.2021.9414320
    [24] H. Xu, J. Ma, J. Yuan, Z. Le, W. Liu, RFNet: Unsupervised network for mutually reinforcing multi-modal image registration and fusion, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2022), 19647–19656. https://doi.org/10.1109/CVPR52688.2022.01906
    [25] H. Xu, J. Yuan, J. Ma, MURF: Mutually reinforcing multi-modal image registration and fusion, IEEE Trans. Pattern Anal. Mach. Intell., 45 (2023), 12148–12166. https://doi.org/10.1109/TPAMI.2023.3283682 doi: 10.1109/TPAMI.2023.3283682
    [26] D. Wang, J. Liu, X. Fan, R. Liu, Unsupervised misaligned infrared and visible image fusion via cross-modality image generation and registration, preprint, arXiv: 2205.11876. https://doi.org/10.48550/arXiv.2205.11876
    [27] D. Wang, J. Liu, L. Ma, R. Liu, X. Fan, Improving misaligned multi-modality image fusion with one-stage progressive dense registration, IEEE Trans. Circuits Syst. Video Technol., 34 (2024). https://doi.org/10.1109/TCSVT.2024.3412743 doi: 10.1109/TCSVT.2024.3412743
    [28] Z. Zhang, H. Li, T. Xu, X. Wu, J. Kittler, BusReF: Infrared-visible images registration and fusion focus on reconstructible area using one set of features, preprint, arXiv: 2401.00285. https://doi.org/10.48550/arXiv.2401.00285
    [29] L. Z. Li, L. Han, M. Ding, H. Cao, Multimodal image fusion framework for end-to-end remote sensing image registration, IEEE Trans. Geosci. Remote Sens., 61 (2023), 1–14. https://doi.org/10.1109/TGRS.2023.3247642 doi: 10.1109/TGRS.2023.3247642
    [30] S. Mai, Y. Zeng, H. Hu, Multimodal information bottleneck: Learning minimal sufficient unimodal and multimodal representations, IEEE Trans. Multimedia, 25 (2022), 4121–4134. https://doi.org/10.1109/TMM.2022.3171679 doi: 10.1109/TMM.2022.3171679
    [31] Q. Wang, Y. Chi, T. Shen, J. Song, Z. Zhang, Y. Zhu, Improving RGB-infrared object detection by reducing cross-modality redundancy, Remote Sens., 14 (2022), 2020. https://doi.org/10.3390/rs14092020 doi: 10.3390/rs14092020
    [32] S. Cui, J. Cao, X. Cong, J. Sheng, Q. Li, T. Liu, et al., Enhancing multimodal entity and relation extraction with variational information bottleneck, IEEE/ACM Trans. Audio Speech Lang. Process., 32 (2024), 1274–1285. https://doi.org/10.1109/TASLP.2023.3345146 doi: 10.1109/TASLP.2023.3345146
    [33] H. Chen, Y. Li, Three-stream attention-aware network for RGB-D salient object detection, IEEE Trans. Image Process., 28 (2019), 2825–2835. https://doi.org/10.1109/TIP.2019.2891104 doi: 10.1109/TIP.2019.2891104
    [34] X. Sun, L. Zhang, H. Yang, T. Wu, Y. Cen, Y. Guo, Enhancement of spectral resolution for remotely sensed multispectral image, IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens., 8 (2014), 2198–2211. https://doi.org/10.1109/JSTARS.2014.2356512 doi: 10.1109/JSTARS.2014.2356512
    [35] S. Malec, D. Rogge, U. Heiden, A. Sanchez-Azofeifa, M. Bachmann, M. Wegmann, Capability of spaceborne hyperspectral EnMAP mission for mapping fractional cover for soil erosion modeling, Remote Sens., 7 (2015), 11776–11800. https://doi.org/10.3390/rs70911776 doi: 10.3390/rs70911776
    [36] N. Yokoya, T. Yairi, A. Iwasaki, Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion, IEEE Trans. Geosci. Remote Sens., 50 (2020), 528–537. https://doi.org/10.1109/TGRS.2011.2161320 doi: 10.1109/TGRS.2011.2161320
    [37] L. Mou, X. Zhu, RiFCN: Recurrent network in fully convolutional network for semantic segmentation of high resolution remote sensing images, preprint, arXiv: 1805.02091. https://doi.org/10.48550/arXiv.1805.02091
    [38] H. Ma, X. Yang, R. Fan, W. Han, K. He, L. Wang, Refined water-body types mapping using a water-scene enhancement deep models by fusing optical and SAR images, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 17 (2024), 17430–17441. https://doi.org/10.1109/JSTARS.2024.3459916 doi: 10.1109/JSTARS.2024.3459916
    [39] D. Hong, N. Yokoya, J. Chanussot, X. Zhu, CoSpace: Common subspace learning from hyperspectral-multispectral correspondences, IEEE Trans. Geosci. Remote Sens., 57 (2019), 4349–4359. https://doi.org/10.1109/TGRS.2018.2890705 doi: 10.1109/TGRS.2018.2890705
    [40] D. Hong, N. Yokoya, N. Ge, J. Chanussot, X. Zhu, Learnable manifold alignment (LeMA): A semi-supervised cross-modality learning framework for land cover and land use classification, ISPRS J. Photogramm. Remote Sens., 147 (2019), 193–205. https://doi.org/10.1016/j.isprsjprs.2018.10.006 doi: 10.1016/j.isprsjprs.2018.10.006
    [41] G. Li, Z. Liu, M. Chen, Z. Bai, W. Lin, H. Ling, Hierarchical alternate interaction network for RGB-D salient object detection, IEEE Trans. Image Process., 30 (2021), 3528–3542. https://doi.org/10.1109/TIP.2021.3062689 doi: 10.1109/TIP.2021.3062689
    [42] M. Chen, L. Xing, Y. Wang, Y. Zhang, Enhanced multimodal representation learning with cross-modal kd, in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2023), 11766–11775.
    [43] D. Hong, J. Yao, D. Meng, Z. Xu, J. Chanussot, Multimodal GANs: Toward crossmodal hyperspectral-multispectral image segmentation, IEEE Trans. Geosci. Remote Sens., 59 (2020), 5103–5113. https://doi.org/10.1109/TGRS.2020.3020823 doi: 10.1109/TGRS.2020.3020823
    [44] D. Hong, N. Yokoya, G. Xia, J. Chanussot, X. Zhu, X-ModalNet: A semi-supervised deep cross-modal network for classification of remote sensing data, ISPRS J. Photogramm. Remote Sens., 167 (2020), 12–23. https://doi.org/10.1016/j.isprsjprs.2020.06.014 doi: 10.1016/j.isprsjprs.2020.06.014
    [45] J. Huang, X. Huang, J. Yang, Residual enhanced multi-hypergraph neural network, in 2021 IEEE International Conference on Image Processing (ICIP), (2021), 3657–3661. https://doi.org/10.1109/ICIP42928.2021.9506153
    [46] N. Xu, W. Mao, A residual merged neural network for multimodal sentiment analysis, in 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA), (2017), 6–10. https://doi.org/10.1109/ICBDA.2017.8078794
    [47] K. He, Z. Zhang, Y. Dong, D. Cai, Y. Lu, W. Han, Improving geological remote sensing interpretation via a contextually enhanced multiscale feature fusion network, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 17 (2024), 6158–6173. https://doi.org/10.1109/JSTARS.2024.3374818 doi: 10.1109/JSTARS.2024.3374818
    [48] A. M. Saxe, Y. Bansal, J. Dapello, M. Advani, A. Kolchinsky, B. D. Tracey, et al., On the information bottleneck theory of deep learning, J. Stat. Mech.: Theory Exp., 2019 (2019), 124020. https://doi.org/10.1088/1742-5468/ab3985 doi: 10.1088/1742-5468/ab3985
    [49] B. Chen, B. Chen, H. Lin, R. L. Elsberry, Estimating tropical cyclone intensity by satellite imagery utilizing convolutional neural networks, Weather Forecast., 34 (2019), 447–465. https://doi.org/10.1007/s13351-024-3186-y doi: 10.1007/s13351-024-3186-y
    [50] J. Lee, J. Im, D. H. Cha, H. Park, S. Sim, Tropical cyclone intensity estimation using multi-dimensional convolutional neural networks from geostationary satellite data, Remote Sens., 12 (2019), 108. https://doi.org/10.3390/rs12010108 doi: 10.3390/rs12010108
    [51] R. Zhang, Q. Liu, R. Hang, Tropical cyclone intensity estimation using two-branch convolutional neural network from infrared and water vapor images, IEEE Trans. Geosci. Remote Sens., 58 (2019), 586–597. https://doi.org/10.1109/TGRS.2019.2938204 doi: 10.1109/TGRS.2019.2938204
    [52] W. Jiang, G. Hu, T. Wu, L. Liu, B. Kim, Y. Xiao, et al., DMANet_KF: Tropical cyclone intensity estimation based on deep learning and Kalman filter from multi-spectral infrared images, IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 16 (2023), 4469–4483. https://doi.org/10.1109/JSTARS.2023.3273232 doi: 10.1109/JSTARS.2023.3273232
    [53] M. Brown, S. Süsstrunk, Multi-spectral SIFT for scene category recognition, in CVPR 2011, (2011), 177–184. https://doi.org/10.1109/CVPR.2011.5995637
    [54] L. Tang, H. Zhang, H. Xu, J. Ma, Rethinking the necessity of image fusion in high-level vision tasks: A practical infrared and visible image fusion network based on progressive semantic injection and scene fidelity, Inf. Fusion, 99 (2023), 101870. https://doi.org/10.1016/j.inffus.2023.101870 doi: 10.1016/j.inffus.2023.101870
    [55] G. Balakrishnan, A. Zhao, M. R. Sabuncu, J. Guttag, A. V. Dalca, VoxelMorph: A learning framework for deformable medical image registration, IEEE Trans. Med. Imaging, 38 (2019), 1788–1800. https://doi.org/10.1109/TMI.2019.2897538 doi: 10.1109/TMI.2019.2897538
    [56] X. Fan, X. Wang, J. Gao, J. Wang, Z. Luo, R. Liu, Bi-level learning of task-specific decoders for joint registration and one-shot medical image segmentation, in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2024), 11726–11735. https://doi.org/10.1109/CVPR52733.2024.01114
    [57] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-cam: Visual explanations from deep networks via gradient-based localization, in Proceedings of the IEEE International Conference on Computer Vision, (2017), 618–626. https://doi.org/10.1109/ICCV.2017.74
    [58] V. F. Dvorak, Tropical cyclone intensity analysis and forecasting from satellite imagery, Mon. Weather Rev., 103 (1975), 420–430. https://doi.org/10.1175/1520-0493(1975)103%3C0420:TCIAAF%3E2.0.CO;2 doi: 10.1175/1520-0493(1975)103%3C0420:TCIAAF%3E2.0.CO;2
  • © 2025 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)