Research article

CD-YOLO: A lightweight end-to-end detection model for cigarette appearance defects


  • Published: 14 January 2026
  • Appearance defect detection is essential for ensuring cigarette quality during production, and achieving high-precision, lightweight automated detection of cigarette appearance defects has long been a key focus for manufacturers. However, existing methods struggle to balance detection accuracy and speed. This paper proposes a high-performance detection model for cigarette defects, named cigarette defect YOLO (CD-YOLO), which builds upon the YOLOv10 network with three major improvements. First, an intra-scale feature interaction (ISFI) module is designed to enhance the model's ability to distinguish different defects. Second, a multi-scale feature fusion (MSFF) network is developed to improve the model's recognition of small-scale and subtle defects. Finally, a lightweight group convolution detection head (LGCDH) is implemented to substantially reduce the model's computational complexity and parameter count, accelerating detection. Experimental results demonstrate that CD-YOLO achieves a favorable trade-off between accuracy and speed, maintaining a detection speed exceeding 500 FPS with a mAP@0.5 of 96.2%. Additionally, a novel data augmentation strategy is introduced: low-rank adaptation (LoRA) is used to fine-tune a pretrained Stable Diffusion model, which generates synthetic defect samples to alleviate data scarcity.
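As background for the parameter savings claimed for the LGCDH, the sketch below (not the paper's implementation; the layer sizes are illustrative) counts the weights of a standard versus a grouped 2-D convolution. With g groups, each filter only spans C_in/g input channels, so the weight count drops by roughly a factor of g:

```python
def conv_params(c_in, c_out, k, groups=1, bias=True):
    """Weight count of a 2-D convolution layer: each of the c_out
    filters covers k*k spatial taps over c_in/groups input channels."""
    assert c_in % groups == 0 and c_out % groups == 0
    weights = c_out * (c_in // groups) * k * k
    return weights + (c_out if bias else 0)

# Illustrative detection-head layer: 256 -> 256 channels, 3x3 kernel
standard = conv_params(256, 256, 3)            # 256*256*9 + 256 = 590,080
grouped  = conv_params(256, 256, 3, groups=4)  # 256*64*9  + 256 = 147,712
print(standard, grouped)
```

This channel-splitting trick (popularized by AlexNet and ShuffleNet, both cited by the paper) is what makes a group-convolution head cheaper than a dense one; the paper's exact channel widths and group counts are not given in this abstract.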

    Citation: Yuanyuan Liu, Hao Wu, Hao Zhou, Guowu Yuan. CD-YOLO: A lightweight end-to-end detection model for cigarette appearance defects[J]. AIMS Electronics and Electrical Engineering, 2026, 10(1): 71-91. doi: 10.3934/electreng.2026004
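The LoRA-based augmentation mentioned in the abstract fine-tunes only small low-rank matrices while the pretrained Stable Diffusion weights stay frozen. A minimal NumPy sketch of the core update W' = W + (alpha/r)·BA (shapes and zero-initialization of B follow the LoRA paper; the dimensions here are illustrative, not the article's training configuration):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 4, 8        # illustrative sizes; r << d

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, rank-r
B = np.zeros((d_out, r))                    # trainable, zero-initialized

def forward(x, W, A, B, scale=alpha / r):
    # Frozen path plus low-rank correction; since B = 0 at init,
    # the adapted layer exactly reproduces the pretrained one.
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_in)
print(np.allclose(forward(x, W, A, B), W @ x))   # identity at init

# Trainable parameters: r*(d_in + d_out) instead of d_in*d_out
print(r * (d_in + d_out), d_in * d_out)          # 512 vs 4096
```

Training only A and B (512 values here versus 4096 for full fine-tuning) is what makes adapting a large pretrained diffusion model to a small defect dataset feasible, which is the role the abstract assigns to LoRA.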

  • © 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)


Figures and Tables

Figures(15)  /  Tables(5)
