Correlation filter algorithms are widely used in object tracking because of their excellent tracking performance and real-time efficiency. Traditional correlation filter trackers improve performance along three lines: feature representation, spatial regularization, and temporal smoothness. However, these methods overlook both the removal of redundant and interfering information from the filter and the protection of the filter in occluded scenes, which leads to poor performance in complex scenarios such as cluttered backgrounds and occlusion. Notably, interfering information is sparse but often lacks the structural property of combinatorial sparsity. Inspired by this observation, this paper proposes a correlation filter object tracking algorithm based on spatial and channel attention mechanisms. Specifically, the algorithm imposes fiber group sparsity constraints in the spatial direction of the filter and slice group sparsity constraints in the row, column, and channel directions. In this way, the structural sparsity of the filter is further exploited to allocate attention across spatial and channel features, thereby removing redundant and interfering information. In addition, the reliability of the best candidate sample is assessed against a history template pool to decide whether the filter should be updated, which avoids the tracking failures caused by filter degradation under occlusion. Experiments on several datasets show that the proposed method outperforms other state-of-the-art trackers and achieves good accuracy and robustness.
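The fiber and slice group sparsity constraints mentioned above can be illustrated with mixed l2,1-style norms on a 3-D filter tensor: a fiber groups all channel values at one spatial position, while a slice groups all values along a row, column, or channel direction. This is a minimal sketch under assumed definitions (the function names and the toy tensor are illustrative), not the paper's exact formulation.

```python
import numpy as np

def fiber_group_norm(H):
    """l2,1-style norm over spatial fibers of a (rows, cols, channels)
    tensor: each (row, col) position groups its channel values into one
    fiber; summing fiber l2 norms encourages whole positions to vanish."""
    return float(np.sum(np.sqrt(np.sum(H ** 2, axis=2))))

def slice_group_norm(H, axis):
    """l2,1-style norm over slices along one direction (0: rows,
    1: columns, 2: channels); encourages entire slices to vanish."""
    other = tuple(a for a in range(3) if a != axis)
    return float(np.sum(np.sqrt(np.sum(H ** 2, axis=other))))

# Toy filter tensor with a single active spatial fiber.
H = np.zeros((4, 4, 3))
H[1, 2, :] = [3.0, 4.0, 0.0]          # one fiber with l2 norm 5

print(fiber_group_norm(H))            # 5.0
print(slice_group_norm(H, axis=2))    # 7.0 (per-channel slice norms 3 + 4 + 0)
```

Penalizing these norms during filter learning drives whole fibers or slices to zero, which is how structured sparsity can act as spatial and channel attention.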
Citation: Kaiwei Chen, Yingpin Chen, Ronghuan Zhang, Yiling Chen, Hongshuo Han, Yijing He, Wenjie Xu, Wenbing Ye, Jinghao Li. Correlation filter object tracking algorithm based on spatial and channel attention mechanism[J]. Electronic Research Archive, 2025, 33(8): 4857-4892. doi: 10.3934/era.2025219
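The history-template-pool reliability check can be sketched as follows: the best candidate patch is compared with stored templates, and the filter update is skipped when similarity is low, as happens under occlusion. The similarity measure (normalized cross-correlation) and the threshold `tau` here are assumptions for illustration, not the paper's exact criterion.

```python
import numpy as np

def is_reliable(candidate, template_pool, tau=0.5):
    """Gate the filter update: compare the candidate patch against a pool
    of historical templates via normalized cross-correlation (an assumed
    similarity measure). If even the best match falls below tau, e.g.,
    when an occluder covers the target, skip the update so the filter is
    not contaminated by corrupted appearance."""
    def ncc(a, b):
        a = (a - a.mean()) / (a.std() + 1e-12)
        b = (b - b.mean()) / (b.std() + 1e-12)
        return float(np.mean(a * b))
    return max(ncc(candidate, t) for t in template_pool) >= tau

rng = np.random.default_rng(0)
template = rng.standard_normal((8, 8))
pool = [template, template + 0.1 * rng.standard_normal((8, 8))]

print(is_reliable(template, pool))    # True: candidate matches the pool
print(is_reliable(-template, pool))   # False: anti-correlated "occluder" patch
```

In a tracker loop, a reliable candidate would both trigger the filter update and be pushed into the pool, so the pool tracks gradual appearance change while rejecting abrupt occlusions.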