Research article

A practical object detection-based multiscale attention strategy for person reidentification

  • Received: 03 October 2024 · Revised: 20 November 2024 · Accepted: 10 December 2024 · Published: 18 December 2024
  • In person reidentification (PReID) tasks, challenges such as occlusion and small object sizes frequently arise. High-precision object detection methods can accurately locate small objects, while attention mechanisms help focus on the strong feature regions of objects; together, these approaches mitigate, to some extent, the mismatches caused by occlusion and small objects. This paper proposes a PReID method based on object detection and attention mechanisms (ODAMs) to achieve enhanced object matching accuracy. In the proposed ODAM-based PReID system, You Only Look Once version 7 (YOLOv7) was utilized as the detection algorithm, and a size attention mechanism was integrated into the backbone network to further improve the detection accuracy of the model. For feature extraction, ResNet-50 was employed as the base network and augmented with residual attention mechanisms (RAMs) for PReID. This network emphasizes the key local information of the target object, enabling the extraction of more effective features. Extensive experimental results demonstrate that the proposed method achieves a mean average precision (mAP) of 90.1% and a Rank-1 accuracy of 97.2% on the Market-1501 dataset, as well as an mAP of 82.3% and a Rank-1 accuracy of 91.4% on the DukeMTMC-reID dataset. The proposed PReID method offers significant practical value for intelligent surveillance systems. By integrating multiscale attention and RAMs, the method improves both object detection accuracy and feature extraction robustness, enabling more efficient individual identification in complex scenes. These improvements are crucial for enhancing the real-time performance and accuracy of video surveillance systems, thus providing effective technical support for intelligent monitoring and security applications.

    Citation: Bin Zhang, Zhenyu Song, Xingping Huang, Jin Qian, Chengfei Cai. A practical object detection-based multiscale attention strategy for person reidentification[J]. Electronic Research Archive, 2024, 32(12): 6772-6791. doi: 10.3934/era.2024317
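The abstract describes augmenting a ResNet-50 backbone with residual attention mechanisms, but does not specify the module's internals. As a rough illustration only, one common form of residual attention is a squeeze-and-excitation-style channel gate added back onto an identity path; the sketch below in plain NumPy is an assumption for exposition, not the paper's actual module, and all names and weight shapes (`channel_attention`, `w1`, `w2`, reduction ratio `r`) are hypothetical:

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """SE-style channel attention: squeeze (global average pool),
    excite (two small dense layers), then rescale the feature map.
    feat: (C, H, W); w1: (C, C//r); w2: (C//r, C)."""
    squeezed = feat.mean(axis=(1, 2))              # (C,) global average pool
    hidden = np.maximum(squeezed @ w1, 0.0)        # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid gate in (0, 1), shape (C,)
    return feat * scale[:, None, None]             # per-channel reweighting

def residual_attention_block(feat, w1, w2):
    """Residual attention: identity path plus attended features,
    so attention refines rather than replaces the representation."""
    return feat + channel_attention(feat, w1, w2)

# Toy usage with random weights and a small feature map.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C, C // r)) * 0.1
w2 = rng.standard_normal((C // r, C)) * 0.1
out = residual_attention_block(feat, w1, w2)
print(out.shape)  # (8, 4, 4)
```

The residual connection keeps the block shape-preserving, so such a module can be dropped between stages of a backbone like ResNet-50 without altering downstream layers.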






  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
