Deep learning-based small object detection: A survey

Qihan Feng; Xinzheng Xu; Zhixiao Wang; Qihan Feng; Xinzheng Xu; Zhixiao Wang

doi:10.3934/mbe.2023282

Mathematical Biosciences and Engineering

2023, Volume 20, Issue 4: 6551-6590. doi: 10.3934/mbe.2023282

Previous Article Next Article

Survey

Deep learning-based small object detection: A survey

1.
College of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
2.
Mine Digitization Engineering Research Center of the Ministry of Education, Xuzhou 221116, China

Academic Editor: Vladimir Mityushev

Received: 18 October 2022 Revised: 21 December 2022 Accepted: 26 December 2022 Published: 02 February 2023

Small object detection (SOD) is significant for many real-world applications, including criminal investigation, autonomous driving and remote sensing images. SOD has been one of the most challenging tasks in computer vision due to its low resolution and noise representation. With the development of deep learning, it has been introduced to boost the performance of SOD. In this paper, focusing on the difficulties of SOD, we analyze the deep learning-based SOD research papers from four perspectives, including boosting the resolution of input features, scale-aware training, incorporating contextual information and data augmentation. We also review the literature on crucial SOD tasks, including small face detection, small pedestrian detection and aerial image object detection. In addition, we conduct a thorough performance evaluation of generic SOD algorithms and methods for crucial SOD tasks on four well-known small object datasets. Our experimental results show that network configuring to boost the resolution of input features can enable significant performance gains on WIDER FACE and Tiny Person. Finally, several potential directions for future research in the area of SOD are provided.
- small object detection,
- deep learning,
- computer vision,
- neural network,
- benchmark
Citation: Qihan Feng, Xinzheng Xu, Zhixiao Wang. Deep learning-based small object detection: A survey[J]. Mathematical Biosciences and Engineering, 2023, 20(4): 6551-6590. doi: 10.3934/mbe.2023282

Related Papers:

Abstract

Small object detection (SOD) is significant for many real-world applications, including criminal investigation, autonomous driving and remote sensing images. SOD has been one of the most challenging tasks in computer vision due to its low resolution and noise representation. With the development of deep learning, it has been introduced to boost the performance of SOD. In this paper, focusing on the difficulties of SOD, we analyze the deep learning-based SOD research papers from four perspectives, including boosting the resolution of input features, scale-aware training, incorporating contextual information and data augmentation. We also review the literature on crucial SOD tasks, including small face detection, small pedestrian detection and aerial image object detection. In addition, we conduct a thorough performance evaluation of generic SOD algorithms and methods for crucial SOD tasks on four well-known small object datasets. Our experimental results show that network configuring to boost the resolution of input features can enable significant performance gains on WIDER FACE and Tiny Person. Finally, several potential directions for future research in the area of SOD are provided.

References

[1]	S. Agarwal, J. O. D. Terrail, F. Jurie, Recent advances in object detection in the age of deep convolutional neural networks, preprint, arXiv: 1809.03193.
[2]	R. Girshick, J. Donahue, T. Darrell, J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation, in 2014 IEEE Conference on Computer Vision and Pattern Recognition, (2014), 580–587. https://doi.org/10.1109/CVPR.2014.81
[3]	R. Girshick, Fast R-CNN, in 2015 IEEE International Conference on Computer Vision (ICCV), (2015), 1440–1448. https://doi.org/10.1109/ICCV.2015.169
[4]	S. Ren, K. He, R. Girshick, J. Sun, Faster R-CNN: towards real-time object detection with region proposal networks, IEEE Trans. Pattern Anal. Mach. Intell., 39 (2016), 1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031 doi: 10.1109/TPAMI.2016.2577031
[5]	K. He, G. Gkioxari, P. Dollár, R. Girshick, Mask R-CNN, IEEE Trans. Pattern Anal. Mach. Intell., 42 (2020), 386–397. https://doi.org/10.1109/TPAMI.2018.2844175 doi: 10.1109/TPAMI.2018.2844175
[6]	J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You Only Look Once: Unified, real-time object detection, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016), 779–88. https://doi.org/10.1109/CVPR.2016.91
[7]	J. Redmon, A. Farhadi, YOLOv3: An incremental improvement, preprint, arXiv: 1804.02767.
[8]	J. C. Y. Wang, A. Bochkovskiy, H. Y. M. Liao, YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors, preprint, arXiv: 2207.02696.
[9]	K. Kang, H. Li, J. Yan, X. Zeng, B. Yang, T. Xiao, et al., T-CNN: tubelets with convolutional neural networks for object detection from videos, IEEE Trans. Circuits Syst. Video Technol., (2017), 2896–2907. https://doi.org/10.1109/TCSVT.2017.2736553 doi: 10.1109/TCSVT.2017.2736553
[10]	T. Yin, X. Zhou, P. Krahenbuhl, Center-based 3d object detection and tracking, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2021), 11784–11793. https://doi.org/10.1109/CVPR46437.2021.01161
[11]	J. Dai, K. He, J. Sun, Instance-aware semantic segmentation via multi-task network cascades, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016), 3150–3158. https://doi.org/10.1109/CVPR.2016.343
[12]	B. Hariharan, P. Arbeláez, R. Girshick, J. Malik, Hypercolumns for object segmentation and fine-grained localization, in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2015), 447–456. https://doi.org/10.1109/CVPR.2015.7298642
[13]	B. Hariharan, P. Arbeláez, R. Girshick, J. Malik, Simultaneous detection and segmentation, in Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part VII 13, (2014), 297–312. https://doi.org/10.1007/978-3-319-10584-0_20
[14]	C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, et al., Going deeper with convolutions, in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2015), 1–9. https://doi.org/10.1109/CVPR.2015.7298594
[15]	H. Wang, F. He, Z. Peng, T. Shao, Y. L. Yang, K. Zhou, et al., Understanding the robustness of skeleton-based action recognition under adversarial attack, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2021), 14656–14665. https://doi.org/10.1109/CVPR46437.2021.01442
[16]	L. Wang, Z. Tong, B. Ji, G. Wu, TDN: Temporal difference networks for efficient action recognition, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2021), 1895–1904. https://doi.org/10.48550/arXiv.2012.10071
[17]	D. Li, Z. Qiu, Y. Pan, T. Yao, H. Li, T. Mei, Representing videos as discriminative sub-graphs for action recognition, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2021), 3310–3319. https://doi.org/10.48550/arXiv.2201.04027
[18]	C. F. R. Chen, R. Panda, K. Ramakrishnan, R. Feris, J. Cohn, A. Oliva, et al., Deep analysis of cnn-based spatio-temporal representations for action recognition, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2021), 6165–6175. https://doi.org/10.1109/CVPR46437.2021.00610
[19]	S. Jha, C. Seo, E. Yang, G. P. Joshi, Real time object detection and trackingsystem for video surveillance system, Multimed. Tools Appl., 80 (2021), 3981–3996. https://doi.org/10.1007/s11042-020-09749-x doi: 10.1007/s11042-020-09749-x
[20]	M. A. Farooq, A. A. Khan, A. Ahmad, R. H. Raza, Effectiveness of state-of-the-art super resolution algorithms in surveillance environment, in Conference on Multimedia, Interaction, Design and Innovation, 1376 (2021), 79–88. https://doi.org/10.48550/arXiv.2107.04133
[21]	X. Zheng, X. Li, K. Xu, X. Jiang, T. Sun, Gait identification under surveillance environment based on human skeleton, preprint, arXiv: 2111.11720.
[22]	F. Wu, Q. Wang, J. Bian, H. Xiong, N. Ding, F. Lu, et al., A survey on video action recognition in sports: datasets, methods and applications, preprint, arXiv: 2206.01038.
[23]	C. J. Roros, A. C. Kak, maskGRU: Tracking small objects in the presence of large background motions, preprint, arXiv: 2201.00467.
[24]	Y. B. Can, A. Liniger, D. P. Paudel, L. Van Gool, Structured bird's-eye-view traffic scene understanding from onboard images, in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), (2021), 15641–15650. https://doi.org/10.1109/ICCV48922.2021.01537
[25]	S. Hampali, S. Stekovic, S. D. Sarkar, C. S. Kumar, F. Fraundorfer, V. Lepetit, Monte carlo scene search for 3d scene understanding, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2021), 13804–13813. https://doi.org/10.1109/CVPR46437.2021.01359
[26]	J. Hou, B. Graham, M. Niessner, S. Xie, Exploring data-efficient 3d scene understanding with contrastive scene contexts, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2021), 15587–15597. https://doi.org/10.1109/CVPR46437.2021.01533
[27]	Y. Liu, R. Wang, S. Shan, X. Chen, Structure inference net: object detection using scene-level context and instance-level relationships, in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2018), 6985–6994. https://doi.org/10.1109/CVPR.2018.00730
[28]	M. Schön, M. Buchholz, K. Dietmayer, MGNet: monocular geometric scene understanding for autonomous driving, in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), (2021), 15784–15795. https://doi.org/10.1109/ICCV48922.2021.01551
[29]	K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016), 770–778. https://doi.org/10.1109/CVPR.2016.90
[30]	S. H. Gao, M. M. Cheng, K. Zhao, X. Y. Zhang, M. H. Yang, P. Torr, Res2Net: a new multi-scale backbone architecture, in IEEE Trans. Pattern Anal. Mach. Intell., 43 (2021), 652–662. https://doi.org/10.1109/TPAMI.2019.2938758
[31]	K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, preprint, arXiv: 1409.1556.
[32]	A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, et al., MobileNets: efficient convolutional neural networks for mobile vision applications, preprint, arXiv: 1704.04861.
[33]	M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L. C. Chen, MobileNetV2: inverted residuals and linear bottlenecks, in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2018), 4510–4520. https://doi.org/10.48550/arXiv.1801.04381
[34]	K. He, X. Zhang, S. Ren, J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, IEEE Trans. Pattern Anal. Mach. Intell., 37 (2015), 1904–1916. https://doi.org/10.1109/TPAMI.2015.2389824 doi: 10.1109/TPAMI.2015.2389824
[35]	T. Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, S. Belongie, Feature pyramid networks for object detection, in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017), 936–944. https://doi.org/10.1109/CVPR.2017.106
[36]	W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Y. Fu, et al., SSD: single shot multibox detector, in European Conference on Computer Vision, (2016), 21–37. https://doi.org/10.1007/978-3-319-46448-0_2
[37]	C. Zhu, Y. He, M. Savvides, Feature selective anchor-free module for single-shot object detection, in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2019), 840–849.
[38]	H. Law, J. Deng, CornerNet: Detecting objects as paired keypoints, in European Conference on Computer Vision, (2018), 765–781. https://doi.org/10.1007/978-3-030-01264-9_45
[39]	Z. Tian, C. Shen, H. Chen, T. He, FCOS: fully convolutional one-stage object detection, in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), (2019), 9626–9635. https://doi.org/10.1109/ICCV.2019.00972
[40]	X. Zhou, D. Wang, P. Krähenbühl, Objects as points, preprint, arXiv: 1904.07850.
[41]	C. Eggert, S. Brehm, A. Winschel, D. Zecha, R. Lienhart, A closer look: small object detection in faster R-CNN, in 2017 IEEE International Conference on Multimedia and Expo (ICME), (2017), 421–426. https://doi.org/10.1109/ICME.2017.8019550
[42]	C. Chen, M. Y. Liu, O. Tuzel, J. Xiao, R-CNN for small object detection, in Asian Conference on Computer Vision, 10115 (2017), 214–230. https://doi.org/10.1007/978-3-319-54193-8_14
[43]	T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, et al., Microsoft COCO: common objects in context, in European Conference on Computer Vision, (2014), 740–755. https://doi.org/10.48550/arXiv.1405.0312
[44]	J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, F. Li, ImageNet: a large-scale hierarchical image database, in 2009 IEEE Conference on Computer Vision and Pattern Recognition, (2009), 248–255. https://doi.org/10.1109/CVPR.2009.5206848
[45]	M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, A. Zisserman, The pascal visual object classes (voc) challenge, Int. J. Comput. Vis., 88 (2010), 303–338. https://doi.org/10.1007/s11263-009-0275-4 doi: 10.1007/s11263-009-0275-4
[46]	Z. Zong, G. Song, Y. Liu, DETRs with collaborative hybrid assignments training, preprint, arXiv: 2211.12860.
[47]	S. Yang, P. Luo, C. C. Loy, X. Tang, WIDER FACE: a face detection benchmark, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016), 5525–5533. https://doi.org/10.1109/CVPR.2016.596
[48]	A. B. Chan, Z. S. J. Liang, N. Vasconcelos, Privacy preserving crowd monitoring: counting people without people models or tracking, in 2008 IEEE Conference on Computer Vision and Pattern Recognition, (2008), 1–7. https://doi.org/10.1109/CVPR.2008.4587569
[49]	L. Wang, J. Shi, G. Song, Object detection combining recognition and segmentation, in Asian Conference on Computer Vision, 4843 (2007), 189.
[50]	E. Bondi, R. Jain, P. Aggrawal, S. Anand, R. Hannaford, A. Kapoor, et al., BIRDSAI: a dataset for detection and tracking in aerial thermal infrared videos, in 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), (2020), 1736–1745. https://doi.org/10.1109/WACV45572.2020.9093284
[51]	L. Neumann, M. Karg, S. Zhang, C. Scharfenberger, E. Piegert, S. Mistr, et al., NightOwls: a pedestrians at night dataset, in Asian Conference on Computer Vision, (2019), 691–705. https://doi.org/10.1007/978-3-030-20887-5_43
[52]	K. Behrendt, L. Novak, R. Botros, A deep learning approach to traffic lights: Detection, tracking, and classification, in 2017 IEEE International Conference on Robotics and Automation (ICRA), (2017), 1370–1377. https://doi.org/10.1109/ICRA.2017.7989163
[53]	C. Ertler, J. Mislej, T. Ollmann, L. Porzi, G. Neuhold, Y. Kuang, The Mapillary Traffic sign dataset for detection and classification on a global scale, in European Conference on Computer Vision, (2020), 68–84. https://doi.org/10.48550/arXiv.1909.04422
[54]	J. Zhang, M. Huang, X. Jin, X. Li, A real-time chinese traffic sign detection algorithm based on modified yolov2, Algorithms, 10 (2017), 127. https://doi.org/10.3390/a10040127 doi: 10.3390/a10040127
[55]	D. Tabernik, D. Skočaj, Deep learning for large-scale traffic-sign detection and recognition, preprint, arXiv: 1904.00649.
[56]	Z. Zhu, D. Liang, S. Zhang, X. Huang, B. Li, S. Hu, Traffic-sign detection and classification in the wild, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016), 2110–2118. https://doi.org/10.1109/CVPR.2016.232
[57]	Z. Zhao, P. Zheng, S. T. Xu, X. Wu, Object detection with deep learning: a review, IEEE Trans. Neural Networks Learn. Syst., 30 (2019), 3212–3232. https://doi.org/10.1109/TNNLS.2018.2876865 doi: 10.1109/TNNLS.2018.2876865
[58]	K. Li, G. Wan, G. Cheng, L. Meng, J. Han, Object detection in optical remote sensing images: A survey and a new benchmark, ISPRS J. Photogramm. Remote Sens., 159 (2020), 296–307. https://doi.org/10.1016/j.isprsjprs.2019.11.023 doi: 10.1016/j.isprsjprs.2019.11.023
[59]	K. Oksuz, B. C. Cam, S. Kalkan, E. Akbas, Imbalance problems in object detection: a review, preprint, arXiv: 1909.00169.
[60]	A. G. Menezes, G. de Moura, C. Alves, A. C. P. L. F. de Carvalho, Continual object detection: a review of definitions, strategies, and challenges, preprint, arXiv: 2205.15445.
[61]	L. Jiao, R. Zhang, F. Liu, S. Yang, B. Hou, L. Li, et al., New generation deep learning for video object detection: a survey, IEEE Trans. Neural Networks Learn. Syst., 33 (2022), 3195–3215. https://doi.org/10.1109/TNNLS.2021.3053249 doi: 10.1109/TNNLS.2021.3053249
[62]	L. Jiao, F. Zhang, F. Liu, S. Yang, L. Li, Z. Feng, et al., A survey of deep learning-based object detection, IEEE Access, 7 (2019), 128837–128868. https://doi.org/10.1109/ACCESS.2019.2939201 doi: 10.1109/ACCESS.2019.2939201
[63]	G. Chen, H. Wang, K. Chen, Z. Li, Z. Song, Y. Liu, et al., A survey of the four pillars for small object detection: multiscale representation, contextual information, super-resolution, and region proposal, IEEE Trans. Syst. Man Cybern, Syst., 52 (2022), 936–953. https://doi.org/10.1109/TSMC.2020.3005231 doi: 10.1109/TSMC.2020.3005231
[64]	K. Chen, J. Wang, J. Pang, Y. Cao, Y. Xiong, X. Li, et al., MMDetection: open mmlab detection toolbox and benchmark, preprint, arXiv: 1906.07155.
[65]	K. Tong, Y. Wu, F. Zhou, Recent advances in small object detection based on deep learning: A review, Image Vis. Comput., 97 (2020), 103910. https://doi.org/10.1016/j.imavis.2020.103910 doi: 10.1016/j.imavis.2020.103910
[66]	Y. Liu, P. Sun, N. Wergeles, Y. Shang, A survey and performance evaluation of deep learning methods for small object detection, Expert Syst. Appl., 172 (2021), 114602. https://doi.org/10.1016/j.eswa.2021.114602 doi: 10.1016/j.eswa.2021.114602
[67]	K. Tong, Y. Wu, Deep learning-based detection from the perspective of small or tiny objects: A survey, Image Vis. Comput., 123 (2022), 104471. https://doi.org/10.1016/j.imavis.2022.104471 doi: 10.1016/j.imavis.2022.104471
[68]	A. M. Rekavandi, L. Xu, F. Boussaid, A. K. Seghouane, S. Hoefs, M. Bennamoun, A guide to image and video based small object detection using deep learning: case study of maritime surveillance, preprint, arXiv: 2207.12926.
[69]	G. Cheng, X. Yuan, X. Yao, K. Yan, Q. Zeng, J. Han, Towards large-scale small object detection: survey and benchmarks, preprint, arXiv: 2207.14096.
[70]	S. Liu, L. Qi, H. Qin, J. Shi, J. Jia, Path aggregation network for instance segmentation, in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2018), 8759–8768. https://doi.org/10.1109/CVPR.2018.00913
[71]	M. Tan, R. Pang, Q. V. Le, EfficientDet: scalable and efficient object detection, preprint, arXiv: 1911.09070.
[72]	S. Liu, D. Huang, Y. Wang, Learning spatial fusion for single-shot object detection, preprint, arXiv: 1911.09516.
[73]	G. Ghiasi, T. Y. Lin, R. Pang, Q. V. Le, NAS-FPN: learning scalable feature pyramid architecture for object detection, preprint, arXiv: 1904.07392.
[74]	T. Y. Lin, P. Goyal, R. Girshick, K. He, P. Dollár, Focal loss for dense object detection, in 2017 IEEE International Conference on Computer Vision (ICCV), (2017), 2999–3007. https://doi.org/10.1109/ICCV.2017.324
[75]	Z. Li, F. Zhou, FSSD: feature fusion single shot multibox detector, preprint, arXiv: 1712.00960.
[76]	L. Cui, R. Ma, P. Lv, X. Jiang, Z. Gao, B. Zhou, et al., MDSSD: multi-scale deconvolutional single shot detector for small objects, preprint, arXiv: 1805.07009.
[77]	Y. Gong, X. Yu, Y. Ding, X. Peng, J. Zhao, Z. Han, Effective fusion factor in fpn for tiny object detection, preprint, arXiv: 2011.02298.
[78]	Z. Liu, G. Gao, L. Sun, Z. Fang, HRDNet: High-resolution detection network for small objects, preprint, arXiv: 2006.07607.
[79]	Z. Liu, G. Gao, L. Sun, L. Fang, IPG-Net: image pyramid guidance network for small object detection, in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), (2020), 4422–4430. https://doi.org/10.1109/CVPRW50498.2020.00521
[80]	P. Y. Chen, J. W. Hsieh, C. Y. Wang, H. Y. M. Liao, Recursive hybrid fusion pyramid network for real-time small object detection on embedded devices, in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), (2020), 1612–1621. https://doi.org/10.1109/CVPRW50498.2020.00209
[81]	C. Yang, Z. Huang, N. Wang, QueryDet: cascaded sparse query for accelerating high-resolution small object detection, preprint, arXiv: 2103.09136.
[82]	C. Deng, M. Wang, L. Liu, Y. Liu, Y. Jiang, Extended feature pyramid network for small object detection, IEEE Trans. Multimedia, 24 (2022), 1968–1979. https://doi.org/10.1109/TMM.2021.3074273 doi: 10.1109/TMM.2021.3074273
[83]	J. Li, X. Liang, Y. Wei, T. Xu, J. Feng, S. Yan, Perceptual generative adversarial networks for small object detection, in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017), 1951–1959. https://doi.org/10.1109/CVPR.2017.211
[84]	Y. Bai, Y. Zhang, M. Ding, B. Ghanem, SOD-MTGAN: small object detection via multi-task generative adversarial network, in European Conference on Computer Vision, 11217 (2018), 210–226. https://doi.org/10.1007/978-3-030-01261-8_13
[85]	J. Noh, W. Bae, W. Lee, J. Seo, G. Kim, Better to follow, follow to be better: towards precise supervision of feature super-resolution for small object detection, in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), (2019), 9724–9733. https://doi.org/10.1109/ICCV.2019.00982
[86]	F. Zhang, L. Jiao, L. Li, F. Liu, X. Liu, MultiResolution attention extractor for small object detection, preprint, arXiv: 2006.05941.
[87]	J. Rabbi, N. Ray, M. Schubert, S. Chowdhury, D. Chao, Small-object detection in remote sensing images with end-to-end edge-enhanced gan and object detector network, preprint, arXiv: 2003.09085.
[88]	K. Jiang, Z. Wang, P. Yi, G. Wang, T. Lu, J. Jiang, Edge-enhanced GAN for remote sensing image super-resolution, IEEE Trans. Geosci. Remote Sens., 57 (2019), 5799–5812. https://doi.org/10.1109/TGRS.2019.2902431 doi: 10.1109/TGRS.2019.2902431
[89]	X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, et al., ESRGAN: enhanced super-resolution generative adversarial networks, in Proceedings of the European conference on computer vision (ECCV), (2018). https://doi.org/10.1007/978-3-030-11021-5_5
[90]	A. Jolicoeur-Martineau, The relativistic discriminator: a key element missing from standard gan, preprint, arXiv: 1807.00734.
[91]	I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, et al., Generative adversarial nets, Adv. Neural Inf. Process Syst., 27 (2014). https://doi.org/10.48550/arXiv.1406.2661 doi: 10.48550/arXiv.1406.2661
[92]	J. Cao, Y. Pang, S. Zhao, X. Li, High-level semantic networks for multi-scale object detection, IEEE Trans. Circuits Syst. Video Technol., 30 (2020), 3372–3386. https://doi.org/10.1109/TCSVT.2019.2950526 doi: 10.1109/TCSVT.2019.2950526
[93]	K. Zhang, Z. Zhang, Z. Li, Y. Qiao, Joint face detection and alignment using multitask cascaded convolutional networks, IEEE Signal Process. Lett., 23 (2016), 1499–1503. https://doi.org/10.1109/LSP.2016.2603342 doi: 10.1109/LSP.2016.2603342
[94]	Z. Hao, Y. Liu, H. Qin, J. Yan, X. Li, X. Hu, Scale-aware face detection, in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017), 1913–1922. https://doi.org/10.1109/CVPR.2017.207
[95]	B. Singh, L. S. Davis, An analysis of scale invariance in object detection - snip, in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2018), 3578–3587. https://doi.org/10.1109/CVPR.2018.00377
[96]	B. Singh, M. Najibi, L. S. Davis, SNIPER: efficient multi-scale training, Adv. Neural Inf. Process Syst., 31 (2018). https://doi.org/10.48550/arXiv.1805.09300 doi: 10.48550/arXiv.1805.09300
[97]	Y. Kim, B. N. Kang, D. Kim, SAN: learning relationship between convolutional features for multi-scale object detection, in European Conference on Computer Vision, 11209 (2018), 328–343. https://doi.org/10.1007/978-3-030-01228-1_20
[98]	Y. Li, Y. Chen, N. Wang, Z. Zhang, Scale-aware trident networks for object detection, preprint, arXiv: 1901.01892.
[99]	J. Peng, M. Sun, Z. X. Zhang, T. Tan, J. Yan, POD: practical object detection with scale-sensitive network, in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), (2019), 9606–9615. https://doi.org/10.1109/ICCV.2019.00970
[100]	A. Oliva, A. Torralba, The role of context in object recognition, Trends Cogn. Sci., 11 (2007), 520–527. https://doi.org/10.1016/j.tics.2007.09.009 doi: 10.1016/j.tics.2007.09.009
[101]	S. Bell, C. L. Zitnick, K. Bala, R. Girshick, Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016), 2874–2883. https://doi.org/10.1109/CVPR.2016.314
[102]	C. Y. Fu, W. Liu, A. Ranga, A. Tyagi, A. C. Berg, DSSD: deconvolutional single shot detector, preprint, arXiv: 1701.06659.
[103]	W. Xiang, D. Q. Zhang, H. Yu, V. Athitsos, Context-aware single-shot detector, in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), (2018), 1784–1793. https://doi.org/10.1109/WACV.2018.00198
[104]	X. Chen, A. Gupta, Spatial memory for context reasoning in object detection, in 2017 IEEE International Conference on Computer Vision (ICCV), (2017), 4106–4116. https://doi.org/10.1109/ICCV.2017.440
[105]	K. Fu, J. Li, L. Ma, K. Mu, Y. Tian, Intrinsic relationship reasoning for small object detection, preprint, arXiv: 2009.00833.
[106]	J. S. Lim, M. Astrid, H. J. Yoon, S. I. Lee, Small object detection using context and attention, in 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), (2021), 181–186. https://doi.org/10.1109/ICAIIC51459.2021.9415217
[107]	A. Bochkovskiy, C. Y. Wang, H. Y. M. Liao, YOLOv4: optimal speed and accuracy of object detection, preprint, arXiv: 2004.10934.
[108]	H. Zhang, M. Cisse, Y. N. Dauphin, D. Lopez-Paz, Mixup: beyond empirical risk minimization, preprint, arXiv: 1710.09412.
[109]	S. Yun, D. Han, S. J. Oh, S. Chun, J. Choe, Y. Yoo, CutMix: regularization strategy to train strong classifiers with localizable features, in Proceedings of the IEEE International Conference on Computer Vision, (2019), 6023–6032. https://doi.org/10.1109/ICCV.2019.00612
[110]	M. Kisantal, Z. Wojna, J. Murawski, J. Naruniec, K. Cho, Augmentation for small object detection, preprint, arXiv: 1902.07296.
[111]	C. Chen, Y. Zhang, Q. Lv, S. Wei, X. Wang, X. Sun, et al., RRNet: a hybrid detector for object detection in drone-captured images, in 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), (2019), 100–108. https://doi.org/10.1109/ICCVW.2019.00018
[112]	F. O. Unel, B. O. Ozkalayci, C. Cigla, The power of tiling for small object detection, in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), (2019), 582–591. https://doi.org/10.1109/CVPRW.2019.00084
[113]	Y. Chen, P. Zhang, Z. Li, Y. Li, X. Zhang, L. Qi, et al., Dynamic scale training for object detection, preprint, arXiv: 2004.12432.
[114]	B. Zoph, E. D. Cubuk, G. Ghiasi, T. Y. Lin, J. Shlens, Q. V. Le, Learning data augmentation strategies for object detection, in European Conference on Computer Vision, (2020), 566–583. https://doi.org/10.1007/978-3-030-58583-9_34
[115]	E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, Q. V. Le, AutoAugment: learning augmentation policies from data, preprint, arXiv: 1805.09501.
[116]	Y. Chen, Y. Li, T. Kong, L. Qi, R. Chu, L. Li, et al., Scale-aware automatic augmentation for object detection, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2021), 9563–9572. https://doi.org/10.1109/CVPR46437.2021.00944
[117]	N. Samet, S. Hicsonmez, E. Akbas, Reducing label noise in anchor-free object detection, preprint, arXiv: 2008.01167.
[118]	K. Duan, S. Bai, L. Xie, H. Qi, Q. Huang, Q. Tian, CenterNet++ for object detection, preprint, arXiv: 2204.08394.
[119]	J. Wang, C. Xu, W. Yang, L. Yu, A normalized gaussian wasserstein distance for tiny object detection, preprint, arXiv: 2110.13389.
[120]	C. Xu, J. Wang, W. Yang, H. Yu, L. Yu, G. Xia, RFLA: Gaussian receptive field based label assignment for tiny object detection, in Proceedings of the European conference on computer vision (ECCV), (2022). https://doi.org/10.1007/978-3-031-20077-9_31
[121]	C. Lee, S. Park, H. Song, J. Ryu, S. Kim, H. Kim, et al., Interactive multi-class tiny-object detection, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2022), 14136–14145. https://doi.org/10.1109/CVPR52688.2022.01374
[122]	F. C. Akyon, S. Altinuc, A. Temi̇zel, Slicing aided hyper inference and fine-tuning for small object detection, preprint, arXiv: 2202.06934.
[123]	P. Hu, D. Ramanan, Finding tiny faces, in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017), 1522–1530. https://doi.org/10.1109/CVPR.2017.166
[124]	S. Zhang, X. Zhu, Z. Lei, H. Shi, X. Wang, S. Z. Li, S.3FD: single shot scale-invariant face detector, in 2017 IEEE International Conference on Computer Vision (ICCV), (2017), 192–201. https://doi.org/10.1109/ICCV.2017.30
[125]	Y. Bai, Y. Zhang, M. Ding, B. Ghanem, Finding tiny faces in the wild with generative adversarial network, in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2018), 21–30. https://doi.org/10.1109/CVPR.2018.00010
[126]	P. Samangouei, M. Najibi, L. Davis, R. Chellappa, Face-magnet: magnifying feature maps to detect small faces, preprint, arXiv: 1803.05258.
[127]	C. Zhu, R. Tao, K. Luu, M. Savvides, Seeing small faces from robust anchor's perspective, preprint, arXiv: 1802.09058.
[128]	Y. Zhu, H. Cai, S. Zhang, C. Wang, Y. Xiong, TinaFace: strong but simple baseline for face detection, preprint, arXiv: 2011.13183.
[129]	J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, et al., Deformable convolutional networks, in 2017 IEEE International Conference on Computer Vision (ICCV), (2017), 764–773. https://doi.org/10.1109/ICCV.2017.89
[130]	Z. Zheng, P. Wang, W. Liu, J. Li, R. Ye, D. Ren, Distance-IoU loss: faster and better learning for bounding box regression, in Proceedings of the AAAI conference on artificial intelligence, 34 (2019), 12993–13000. https://doi.org/10.1609/aaai.v34i07.6999
[131]	A. Shrivastava, A. Gupta, R. Girshick, Training region-based object detectors with online hard example mining, in Proceedings of the IEEE conference on computer vision and pattern recognition, (2016), 761–769. https://doi.org/10.1109/CVPR.2016.89
[132]	Z. Zhang, W. Shen, S. Qiao, Y. Wang, B. Wang, A. Yuille, Robust face detection via learning small faces on hard images, in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, (2020), 1361–1370. https://doi.org/10.48550/arXiv.1811.11662
[133]	T. Song, L. Sun, D. Xie, H. Sun, S. Pu, Small-scale pedestrian detection based on somatic topology localization and temporal feature aggregation, preprint, arXiv: 1807.01438.
[134]	S. Das, P. S. Mukherjee, U. Bhattacharya, Seek and you will find: a new optimized framework for efficient detection of pedestrian, preprint, arXiv: 1912.10241.
[135]	W. Liu, S. Liao, W. Ren, W. Hu, Y. Yu, High-level semantic feature detection: a new perspective for pedestrian detection, in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2019), 5182–5191. https://doi.org/10.1109/CVPR.2019.00533
[136]	X. Yu, Y. Gong, N. Jiang, Q. Ye, Z. Han, Scale match for tiny person detection, in 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), (2020), 1246–1254. https://doi.org/10.1109/WACV45572.2020.9093394
[137]	D. Božić-Štulić, Ž. Marušić, S. Gotovac, Deep learning approach in aerial imagery for supporting land search and rescue missions, Int. J. Comput Vis., 127 (2019), 1256–1278. https://doi.org/10.1007/s11263-019-01177-1 doi: 10.1007/s11263-019-01177-1
[138]	G. Adaimi, S. Kreiss, A. Alahi, Perceiving traffic from aerial images, preprint, arXiv: 2009.07611.
[139]	C. Gheorghe, N. Filip, Road traffic analysis using unmanned aerial vehicle and image processing algorithms, in 2022 IEEE International Conference on Automation, Quality and Testing, Robotics (AQTR), (2022), 1–5. https://doi.org/10.1109/AQTR55203.2022.9802058
[140]	J. Han, J. Ding, J. Li, G. S. Xia, Align deep features for oriented object detection, IEEE Trans. Geosci. Remote Sens., 60 (2022), 5602511. https://doi.org/10.1109/TGRS.2021.3062048 doi: 10.1109/TGRS.2021.3062048
[141]	X. Yang, J. Yang, J. Yan, Y. Zhang, T. Zhang, Z. Guo, et al., SCRDet: towards more robust detection for small, cluttered and rotated objects, in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), (2019), 8231–8240. https://doi.org/10.1109/ICCV.2019.00832
[142]	X. Xie, G. Cheng, J. Wang, X. Yao, J. Han, Oriented r-cnn for object detection, in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), (2021), 3500–3509. https://doi.org/10.1109/ICCV48922.2021.00350
[143]	R. Qin, Q. Liu, G. Gao, D. Huang, Y. Wang, MRDet: a multi-head network for accurate oriented object detection in aerial images, preprint, arXiv: 2012.13135.
[144]	X. Zhang, E. Izquierdo, K. Chandramouli, Dense and small object detection in uav vision based on cascade network, in 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), (2019), 118–126. https://doi.org/10.1109/ICCVW.2019.00020
[145]	J. Yi, P. Wu, B. Liu, Q. Huang, H. Qu, D. Metaxas, Oriented object detection in aerial images with box boundary-aware vectors, in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, (2021), 2150–2159. https://doi.org/10.1109/WACV48630.2021.00220
[146]	O. Ronneberger, P. Fischer, T. Brox, U-Net: convolutional networks for biomedical image segmentation, in Medical Image Computing and Computer-Assisted Intervention, (2015), 234–241. https://doi.org/10.1007/978-3-319-24574-4_28
[147]	J. Han, J. Ding, N. Xue, G. S. Xia, ReDet: a rotation-equivariant detector for aerial object detection, preprint, arXiv: 2103.07733.
[148]	J. Ding, N. Xue, Y. Long, G. S. Xia, Q. Lu, Learning ROI transformer for oriented object detection in aerial images, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2019), 2849–2858. https://doi.org/10.1109/CVPR.2019.00296
[149]	M. Zand, A. Etemad, M. Greenspan, Oriented bounding boxes for small and freely rotated objects, IEEE Trans. Geosci. Remote Sensing, 60 (2022), 1–15. https://doi.org/10.1109/TGRS.2021.3076050 doi: 10.1109/TGRS.2021.3076050
[150]	Z. Yang, S. Liu, H. Hu, L. Wang, S. Lin, RepPoints: point set representation for object detection, in Proceedings of the IEEE/CVF International Conference on Computer Vision, (2019), 9657–9666. https://doi.org/10.1109/ICCV.2019.00975
[151]	W. Li, Y. Chen, K. Hu, J. Zhu, Oriented reppoints for aerial object detection, preprint, arXiv: 2105.11111.
[152]	C. Xu, J. Wang, W. Yang, L. Yu, Dot distance for tiny object detection in aerial images, in IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), (2021), 1192–1201, https://doi.org/10.1109/CVPRW53098.2021.00130
[153]	X. Fang, F. Hu, M. Yang, T. Zhu, R. Bi, Z. Zhang, Z. Gao, Small object detection in remote sensing images based on super-resolution, Pattern Recognit. Lett., 153 (2022), 107–112. https://doi.org/10.1016/j.patrec.2021.11.027.5 doi: 10.1016/j.patrec.2021.11.027.5
[154]	Y. Li, Q. Huang, X. Pei, Y. Chen, L. Jiao, R. Shang, Cross-layer attention network for small object detection in remote sensing imagery, IEEE J. Sel. Top Appl. Earth Obs. Remote Sens., 14 (2021), 2148–2161. https://doi.org/10.1109/JSTARS.2020.3046482 doi: 10.1109/JSTARS.2020.3046482
[155]	O. C. Koyun, R. K. Keser, İ. B. Akkaya, B. U. Töreyin, Focus-and-detect:a small object detection framework for aerial images, Signal Process. Image Commun., 104 (2022), 116675. https://doi.org/10.1016/j.image.2022.116675 doi: 10.1016/j.image.2022.116675
[156]	B. F. Klare, B. Klein, E. Taborsky, A. Blanton, J. Cheney, K. Allen, et al., Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A, in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2015), 1931–1939. https://doi.org/10.1109/CVPR.2015.7298803
[157]	Y. Yuan, W. Yang, W. Ren, J. Liu, W. J. Scheirer, Z. Wang, UG²⁺: a collective benchmark effort for evaluating and advancing image understanding in poor visibility environments, preprint, arXiv: 1904.04474.
[158]	H. Nada, V. A. Sindagi, H. Zhang, V. M. Patel, Pushing the limits of unconstrained face detection: a challenge ataset and baseline results, in 2018 IEEE 9th International Conference on Biometrics Theory, Applications and Systems (BTAS), (2018), 1–10. https://doi.org/10.1109/BTAS.2018.8698561
[159]	M. K. Yucel, Y. C. Bilge, O. Oguz, N. Ikizler-Cinbis, P. Duygulu, R. G. Cinbis, Wildest faces: face detection and recognition in violent settings, preprint, arXiv: 1805.07566.
[160]	S. Zhang, Y. Xie, J. Wan, H. Xia, S. Z. Li, G. Guo, WiderPerson: A diverse dataset for dense pedestrian detection in the wild, IEEE Trans. Multimedia, 22 (2020), 380–393. https://doi.org/10.1109/TMM.2019.2929005 doi: 10.1109/TMM.2019.2929005
[161]	M. Braun, S. Krebs, F. Flohr, D. M. Gavrila, The eurocity persons dataset: a novel benchmark for object detection, preprint, arXiv: 1805.07193.
[162]	S. Zhang, R. Benenson, B. Schiele, CityPersons: a diverse dataset for pedestrian detection, in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017), 4457–4465. https://doi.org/10.1109/CVPR.2017.474
[163]	P. Dollar, C. Wojek, B. Schiele, P. Perona, Pedestrian detection: a benchmark, in 2009 IEEE Conference on Computer Vision and Pattern Recognition, (2009), 304–311. https://doi.org/10.1109/CVPR.2009.5206631
[164]	P. Zhu, L. Wen, D. Du, X. Bian, H. Fan, Q. Hu, et al., Detection and tracking meet drones challenge, IEEE Trans. Pattern Anal. Mach. Intell., 44 (2022), 7380–7399. https://doi.org/10.1109/TPAMI.2021.3119563 doi: 10.1109/TPAMI.2021.3119563
[165]	D. Du, Y. Qi, H. Yu, Y. Yang, K. Duan, G. Li, et al., The unmanned aerial vehicle benchmark: object detection and tracking, in Proceedings of the European Conference on Computer Vision (ECCV), (2018), 370–386. https://doi.org/10.1007/s11263-019-01266-1
[166]	G. S. Xia, X. Bai, J. Ding, Z. Zhu, S. Belongie, J. Luo, et al., DOTA: a large-scale dataset for object detection in aerial images, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2018), 3974–3983. https://doi.org/10.1109/CVPR.2018.00418
[167]	G. Cheng, J. Han, P. Zhou, L. Guo, Multi-class geospatial object detection and geographic image classification based on collection of part detectors, ISPRS J. Photogramm. Remote Sens., 98 (2014), 119–132. https://doi.org/10.1016/j.isprsjprs.2014.10.002 doi: 10.1016/j.isprsjprs.2014.10.002
[168]	H. Zhu, X. Chen, W. Dai, K. Fu, Q. Ye, J. Jiao, Orientation robust object detection in aerial images using deep convolutional neural network, in 2015 IEEE International Conference on Image Processing (ICIP), (2015), 3735–3739. https://doi.org/10.1109/ICIP.2015.7351502
[169]	L. Tuggener, I. Elezi, J. Schmidhuber, M. Pelillo, T. Stadelmann, DeepScores-a dataset for segmentation, detection and classification of tiny objects, in 2018 24th International Conference on Pattern Recognition (ICPR), (2018), 3704–3709. https://doi.org/10.1109/ICPR.2018.8545307
[170]	A. Geiger, P. Lenz, R. Urtasun, Are we ready for autonomous driving? The KITTI vision benchmark suite, in 2012 IEEE Conference on Computer Vision and Pattern Recognition, (2012), 3354–3361. https://doi.org/10.1109/CVPR.2012.6248074
[171]	S. Song, S. P. Lichtenberg, J. Xiao, SUN RGB-D: a rgb-d scene understanding benchmark suite, in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2015), 567–576. https://doi.org/10.1109/CVPR.2015.7298655
[172]	S. Zhang, L. Wen, X. Bian, Z. Lei, S. Z. Li, Single-shot refinement neural network for object detection, in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2018), 4203–4212. https://doi.org/10.1109/CVPR.2018.00442
[173]	J. Cao, H. Cholakkal, R. M. Anwer, F. S. Khan, Y. Pang, L. Shao, D2Det: towards high quality object detection and instance segmentation, in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2020), 11482–11491.
[174]	Y. Chen, J. Li, H. Xiao, X. Jin, S. Yan, J. Feng, Dual path networks, Adv. Neural Inf. Process Syst., 30 (2017). https://doi.org/10.48550/arXiv.1707.01629 doi: 10.48550/arXiv.1707.01629
[175]	Y. Zhu, C. Zhao, J. Wang, X. Zhao, Y. Wu, H. Lu, CoupleNet: coupling global structure with local parts for object detection, in 2017 IEEE International Conference on Computer Vision (ICCV), (2017), 4146–4154. https://doi.org/10.1109/ICCV.2017.444
[176]	H. Hu, J. Gu, Z. Zhang, J. Dai, Y. Wei, Relation networks for object detection, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2018), 3588–3597. https://doi.org/10.1109/CVPR.2018.00378
[177]	L. Tychsen-Smith, L. Petersson, Improving object localization with fitness nms and bounded iou loss, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2018), 6877–6885. https://doi.org/10.1109/CVPR.2018.00719
[178]	S. Xu, X. Wang, W. Lv, Q. Chang, C. Cui, K. Deng, et al., PP-YOLOE: an evolved version of YOLO, preprint, arXiv: 2203.16250.
[179]	J. Leng, Y. Ren, W. Jiang, X. Sun, Y. Wang, Realize your surroundings: exploiting context information for small object detection, Neurocomputing, 433 (2021). https://doi.org/10.1016/j.neucom.2020.12.093 doi: 10.1016/j.neucom.2020.12.093
[180]	C. L. Zitnick, P. Dollár, Edge Boxes: locating object proposals from edges, in European Conference on Computer Vision, (2014), 391–405. https://doi.org/10.1007/978-3-319-10602-1_26
[181]	A. Howard, M. Sandler, G. Chu, L. C. Chen, B. Chen, M. Tan, et al., Searching for MobileNetV3, in Proceedings of the IEEE/CVF International Conference on Computer Vision, (2019), 1314–1324. https://doi.org/10.1109/ICCV.2019.00140
[182]	X. Tang, D. K. Du, Z. He, J. Liu, PyramidBox: a context-assisted single shot face detector, in Proceedings of the European Conference on Computer Vision (ECCV), (2018), 797–813. https://doi.org/10.1007/978-3-030-01240-3_49
[183]	J. Deng, J. Guo, Y. Zhou, J. Yu, I. Kotsia, S. Zafeiriou, RetinaFace: single-stage dense face localisation in the wild, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2019), 5203–5212. https://doi.org/10.1109/CVPR42600.2020.00525
[184]	Z. Liu, J. Du, F. Tian, J. Wen, MR-CNN: a multi-scale region-based convolutional neural network for small traffic sign recognition, IEEE Access, 7 (2019), 57120–57128. https://doi.org/10.1109/ACCESS.2019.2913882 doi: 10.1109/ACCESS.2019.2913882
[185]	X. Lu, B. Li, Y. Yue, Q. Li, J. Yan, Grid R-CNN, in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2019), 7355–7364, https://doi.org/10.1109/CVPR.2019.00754.(2018).
[186]	J. Li, Y. Wang, C. Wang, Y. Tai, J. Qian, J. Yang, et al., DSFD: dual shot face detector, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2019), 5060–5069. https://doi.org/10.1109/CVPR.2019.00520
[187]	X. Zhang, F. Wan, C. Liu, R. Ji, Q. Ye, FreeAnchor: learning to match anchors for visual object detection, IEEE Trans. Pattern Anal. Mach. Intell., 44 (2022), 3096–3109. https://doi.org/10.48550/arXiv.1909.02466 doi: 10.48550/arXiv.1909.02466
[188]	J. Pang, K. Chen, J. Shi, H. Feng, W. Ouyang, D. Lin, Libra R-CNN: towards balanced learning for object detection, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2019), 821–830. https://doi.org/10.1109/CVPR.2019.00091
[189]	G. Zhang, S. Lu, W. Zhang, CAD-Net: a context-aware detection network for objects in remote sensing imagery, IEEE Trans. Geosci. Remote Sens., 57 (2019), 10015–10024. https://doi.org/10.1109/TGRS.2019.2930982 doi: 10.1109/TGRS.2019.2930982
[190]	N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, S. Zagoruyko, End-to-end object detection with transformers, in European Conference on Computer Vision, 12346 (2020), 213–229. https://doi.org/10.1007/978-3-030-58452-8_13
[191]	S. Li, F. Liu, L. Jiao, X. Liu, P. Chen, Learning salient feature for salient object detection without labels, IEEE Trans. Cybern., 53 (2022), 1012–1025. https://doi.org/10.1109/TCYB.2022.3209978 doi: 10.1109/TCYB.2022.3209978
[192]	F. Liu, X. Qian, L. Jiao, X. Zhang, L. Li, Y. Cui, Contrastive learning-based dual dynamic gcn for sar image scene classification, IEEE Trans. Neural Networks Learn Syst., (2022), 1–15. https://doi.org/10.1109/TNNLS.2022.3174873 doi: 10.1109/TNNLS.2022.3174873
[193]	Y. Du, F. Liu, L. Jiao, Z. Hao, S. Li, X. Liu, et al., Augmentative contrastive learning for one-shot object detection, Neurocomputing, 513 (2022), 13–24. https://doi.org/10.1016/j.neucom.2022.09.125 doi: 10.1016/j.neucom.2022.09.125

Reader Comments

Your name:*

Email:*
© 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)