To address the limited small-target detection ability and weak generalization of the YOLOv7 algorithm, this paper proposes an improved YOLOv7-based algorithm for steel surface defect detection. First, a Transformer-InceptionDWConvolution (TI) module is designed, which combines the Transformer module with InceptionDWConvolution to improve the network's ability to detect small objects. Second, the spatial pyramid pooling fast cross-stage partial channel (SPPFCSPC) structure is introduced to enhance network training performance. Third, a global attention mechanism (GAM) is designed to optimize the network structure, suppress irrelevant information in the defect image, and improve the algorithm's ability to detect small defects. Meanwhile, the Mish function is used as the activation function of the feature extraction network to improve the model's generalization and feature extraction ability. Finally, a minimum partial distance intersection over union (MPDIoU) loss function is used as the localization loss, to resolve the direction mismatch between the prediction box and the real box that arises with the complete intersection over union (CIoU) loss. Experimental results show that on the Northeastern University Defect Detection (NEU-DET) dataset, the improved YOLOv7 model improves the mean average precision (mAP) by 6% compared with the original algorithm, while on the VOC2012 dataset the mAP improves by 2.6%. These results indicate that the proposed algorithm effectively improves the detection of small defects on steel surfaces.
Citation: Yinghong Xie, Biao Yin, Xiaowei Han, Yan Hao. Improved YOLOv7-based steel surface defect detection algorithm[J]. Mathematical Biosciences and Engineering, 2024, 21(1): 346-368. doi: 10.3934/mbe.2024016
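The abstract names the Mish activation and the MPDIoU loss without stating their formulas. The following is a minimal sketch of both, assuming the commonly published definitions: Mish as x·tanh(softplus(x)), and MPDIoU as IoU minus the squared distances between the two boxes' top-left and bottom-right corners, each normalized by the squared image diagonal. The function names, the (x1, y1, x2, y2) box layout, and the `img_w`/`img_h` parameters are illustrative assumptions, not the authors' implementation.

```python
import math


def mish(x: float) -> float:
    """Mish activation, assumed here as x * tanh(softplus(x))."""
    # Numerically stable softplus: log(1 + exp(x)) = max(x, 0) + log1p(exp(-|x|))
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def mpdiou_loss(pred, target, img_w: float, img_h: float) -> float:
    """Return 1 - MPDIoU for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed formulation: MPDIoU = IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2),
    where d1 and d2 are the distances between the top-left and bottom-right
    corners of the two boxes, and w, h are the input image width and height.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap)
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner distances, normalized by the squared image diagonal
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2
    norm = img_w ** 2 + img_h ** 2

    return 1.0 - (iou - d1_sq / norm - d2_sq / norm)
```

Under these assumptions the loss is zero for identical boxes and stays informative for non-overlapping boxes, since the corner-distance terms still vary when the IoU is zero. That property matters for small defects, where the predicted and ground-truth boxes often do not overlap early in training.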
Let
● (Divisorial)
● (Flipping)
● (Mixed)
Note that the mixed case can occur only if either
We can almost always choose the initial
Our aim is to discuss a significant special case where the
Definition 1 (MMP with scaling). Let
By the
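$$
\begin{array}{ccccc}
(X_j,\Theta_j) & \stackrel{\phi_j}{\longrightarrow} & Z_j & \stackrel{\psi_j}{\longleftarrow} & (X_{j+1},\Theta_{j+1}) \\
 & g_j\searrow & \downarrow & \swarrow g_{j+1} & \\
 & & S & &
\end{array}
\tag{1.1}
$$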
where
(2)
(3)
(4)
Note that (4) implies that
In general such a diagram need not exist, but if it does, it is unique and then
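$$
\begin{array}{ccccc}
(X,\Theta) & \stackrel{\phi}{\longrightarrow} & Z & \stackrel{\phi^+}{\longleftarrow} & (X^+,\Theta^+) \\
 & g\searrow & \downarrow & \swarrow g^+ & \\
 & & S & &
\end{array}
\tag{1.5}
$$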
We say that the MMP terminates with
(6) either
(7) or
Warning 1.8. Our terminology is slightly different from [7], where it is assumed that
One advantage is that our MMP steps are uniquely determined by the starting data. This makes it possible to extend the theory to algebraic spaces [33].
Theorem 2 is formulated for Noetherian base schemes. We do not prove any new results about the existence of flips, but Theorem 2 says that if the MMP with scaling exists and terminates, then its steps are simpler than expected, and the end result is more controlled than expected.
On the other hand, for 3-dimensional schemes, Theorem 2 can be used to conclude that, in some important cases, the MMP runs and terminates, see Theorem 9.
Theorem 2. Let
(i)
(ii)
(iii)
(iv)
(v) The
We run the
(1)
(a) either
(b) or
(2) The
(3)
Furthermore, if the MMP terminates with
(4)
(5) if
Remark 2.6. In applications the following are the key points:
(a) We avoided the mixed case.
(b) In the flipping case we have both
(c) In (3) we have an explicit, relatively ample, exceptional
(d) In case (5) we end with
(e) In case (5) the last MMP step is a divisorial contraction, giving what [35] calls a Kollár component; no further flips needed.
Proof. Assertions (1-3) concern only one MMP-step, so we may as well drop the index
Let
$$\sum h_i (E_i\cdot C) = -r^{-1}(E_\Theta\cdot C). \tag{2.7}$$
By Lemma 3 this shows that the
$$\sum h_i\bigl(e'(E_i\cdot C) - e(E_i\cdot C')\bigr) = 0. \tag{2.8}$$
By the linear independence of the
Assume first that
$$\phi_*(E_\Theta + rH) = \sum_{i>1} (e_i + r h_i)\,\phi_*(E_i)$$
is
Otherwise
$$g^{-1}\bigl(g(\operatorname{supp}(E_\Theta + rH))\bigr) = \operatorname{supp}(E_\Theta + rH). \tag{2.9}$$
If
Thus
Assume next that the flip
Finally, if the MMP terminates with
Lemma 3. Let
$$\sum_{i=1}^n h_i v_i = \gamma v_0 \quad\text{for some } \gamma\in L.$$
Then
Proof. We may assume that
$$\sum_{i=1}^n h_i a_i = \gamma a_0 \quad\text{and}\quad \sum_{i=1}^n h_i b_i = \gamma b_0.$$
This gives that
$$\sum_{i=1}^n h_i (b_0 a_i - a_0 b_i) = 0.$$
Since the
Lemma 4. Let
Proof. Assume that
$$\sum_{i=1}^n s_i h_i = -\Bigl(\sum_{i=1}^n s_i e_i\Bigr)\cdot \sum_{i=0}^n r_i h_i.$$
If
The following is a slight modification of [3, Lem. 1.5.1]; see also [17, 5.3].
Lemma 5. Let
Comments on
Conjecture 6. Let
(1)
(2) The completion of
Using [30,Tag 0CAV] one can reformulate (6.2) as a finite type statement:
(3) There are elementary étale morphisms
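$$\bigl(x, X, \textstyle\sum D^X_i\bigr) \leftarrow \bigl(u, U, \textstyle\sum D^U_i\bigr) \rightarrow \bigl(y, Y, \textstyle\sum D^Y_i\bigr).$$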
Almost all resolution methods commute with étale morphisms, thus if we want to prove something about a resolution of
A positive answer to Conjecture 6 (for
(Note that [27] uses an even stronger formulation: Every normal, analytic singularity has an algebraization whose class group is generated by the canonical class. This is, however, not true, since not every normal, analytic singularity has an algebraization.)
Existence of certain resolutions.
7 (The assumptions 2.i-v). In most applications of Theorem 2 we start with a normal pair
Typically we choose a log resolution
We want
The existence of a
8 (Ample, exceptional divisors). Assume that we blow up an ideal sheaf
Claim 8.1. Let
Resolution of singularities is also known for 3-dimensional excellent schemes [10], but in its original form it does not guarantee projectivity in general. Nonetheless, combining [6,2.7] and [23,Cor.3] we get the following.
Claim 8.2. Let
Next we mention some applications. In each case we use Theorem 2 to modify the previous proofs to get more general results. We give only some hints as to how this is done; we refer to the original papers for definitions and details of proofs.
The first two applications are to dlt 3-folds. In both cases Theorem 2 allows us to run MMP in a way that works in every characteristic and also for bases that are not
Relative MMP for dlt 3-folds.
Theorem 9. Let
Then the MMP over
(1) each step
(a) either a contraction
(b) or a flip
(2)
(3) if either
Proof. Assume first that the MMP steps exist and the MMP terminates. Note that
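$$K_X + E + g^{-1}_*\Delta \sim_{\mathbb{R}} g^*(K_Y+\Delta) + \sum_j \bigl(1 + a(E_j, Y, \Delta)\bigr)E_j \sim_{g,\mathbb{R}} \sum_j \bigl(1 + a(E_j, Y, \Delta)\bigr)E_j =: E_\Theta.$$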
We get from Theorem 2 that (1.a-b) are the possible MMP-steps, and (2-3) from Theorem 15-5.
For existence and termination, all details are given in [6,9.12].
However, I would like to note that we are in a special situation, which can be treated with the methods that are in [1,29], at least when the closed points of
The key point is that everything happens inside
Contractions for reducible surfaces have been treated in [1,Secs.11-12], see also [12,Chap.6] and [31].
The presence of
The short note [34] explains how [15,3.4] gives 1-complemented 3-fold flips; see [16,3.1 and 4.3] for stronger results.
Inversion of adjunction for 3-folds. Using Theorem 9 we can remove the
Corollary 10. Let
This implies that one direction of Reid's classification of terminal singularities using 'general elephants' [28,p.393] works in every characteristic. This could be useful in extending [2] to characteristics
Corollary 11. Let
Divisor class group of dlt singularities. The divisor class group of a rational surface singularity is finite by [24], and [8] plus an easy argument shows that the divisor class group of a rational 3-dimensional singularity is finitely generated. Thus the divisor class group of a 3-dimensional dlt singularity is finitely generated in characteristic
Proposition 12. [21,B.1] Let
It seems reasonable to conjecture that the same holds in all dimensions, see [21,B.6].
Grauert-Riemenschneider vanishing. One can prove a variant of the Grauert-Riemenschneider (abbreviated as G-R) vanishing theorem [13] by following the steps of the MMP.
Definition 13 (G-R vanishing). Let
Let
(1)
(2)
Then
We say that G-R vanishing holds over
By an elementary computation, if
If
G-R vanishing also holds over 2-dimensional, excellent schemes by [24]; see [20,10.4]. In particular, if
However, G-R vanishing fails for 3-folds in every positive characteristic, as shown by cones over surfaces for which Kodaira's vanishing fails. Thus the following may be the type of G-R vanishing result that one can hope for.
Theorem 14. [5] Let
Proof. Let
A technical problem is that we seem to need various rationality properties of the singularities of the
For divisorial contractions
For flips
From G-R vanishing one can derive various rationality properties for all excellent dlt pairs. This can be done by following the method of 2 spectral sequences as in [19] or [20,7.27]; see [5] for an improved version.
Theorem 15. [5] Let
(1)
(2) Every irreducible component of
(3) Let
See [5,12] for the precise resolution assumptions needed. The conclusions are well known in characteristic 0, see [22,5.25], [12,Sec.3.13] and [20,7.27]. For 3-dimensional dlt varieties in
The next two applications are in characteristic 0.
Dual complex of a resolution. Our results can be used to remove the
Corollary 16. Let
Theorem 17. Let
(1)
(2)
(3)
Then
Proof. Fix
Let us now run the
Note that
We claim that each MMP-step as in Theorem 2 induces either a collapse or an isomorphism of
By [11,Thm.19] we get an elementary collapse (or an isomorphism) if there is a divisor
It remains to deal with the case when we contract
Dlt modifications of algebraic spaces. By [25], a normal, quasi-projective pair
However, dlt modifications are rarely unique, thus it was not obvious that they exist when the base is not quasi-projective. [33] observed that Theorem 2 gives enough uniqueness to allow for gluing. This is not hard when
Theorem 18 (Villalobos-Paz). Let
(1)
(2)
(3)
(4)
(5) either
I thank E. Arvidsson, F. Bernasconi, J. Carvajal-Rojas, J. Lacini, A. Stäbler, D. Villalobos-Paz, C. Xu for helpful comments and J. Witaszek for numerous e-mails about flips.
[1] S. Mei, Y. D. Wang, G. J. Wen, Automatic fabric defect detection with a multi-scale convolutional denoising autoencoder network model, Sensors, 18 (2018), 1064. http://doi.org/10.3390/S18041064
[2] Z. Q. He, Q. F. Liu, Deep regression neural network for industrial surface defect detection, IEEE Access, 8 (2020), 35583–35591. http://doi.org/10.1109/ACCESS.2020.2975030
[3] J. X. Luo, Z. Y. Yang, S. P. Li, Y. Wu, FPCB surface defect detection: a decoupled two-stage object detection framework, IEEE Trans. Instrum. Meas., 70 (2021). http://doi.org/10.1109/TIM.2021.3092510
[4] L. H. Shao, E. R. Zhang, Q. R. Ma, M. Li, Pixel-wise semisupervised fabric defect detection method combined with multitask mean teacher, IEEE Trans. Instrum. Meas., 71 (2022). http://doi.org/10.1109/TIM.2022.3162286
[5] M. Q. Chen, L. J. Yu, C. Zhi, R. Sun, S. Zhu, Z. Gao, et al., Improved faster R-CNN for fabric defect detection based on Gabor filter with genetic algorithm optimization, Comput. Ind., 134 (2022). http://doi.org/10.1016/j.compind.2021.103551
[6] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: Unified, real-time object detection, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2016), 27–30. http://doi.org/10.1109/CVPR.2016.91
[7] J. Redmon, A. Farhadi, YOLO9000: Better, faster, stronger, in 30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2017), 21–26. http://doi.org/10.1109/CVPR.2017.690
[8] J. Redmon, A. Farhadi, YOLOv3: An incremental improvement, preprint, arXiv:1804.02767.
[9] A. Bochkovskiy, C. Y. Wang, H. Y. M. Liao, YOLOv4: Optimal speed and accuracy of object detection, preprint, arXiv:2004.10934.
[10] X. H. Qian, X. Wang, S. Y. Yang, J. Lei, LFF-YOLO: A YOLO algorithm with lightweight feature fusion network for multi-scale defect detection, IEEE Access, 10 (2022), 130339–130349. http://doi.org/10.1109/ACCESS.2022.3227205
[11] N. Yang, W. Guo, Application of improved YOLOv5 model for strip surface defect detection, in 2022 Global Reliability and Prognostics and Health Management (PHM-Yantai), (2022), 1–5. http://doi.org/10.1109/PHM-Yantai55411.2022.9942194
[12] Y. Wan, H. Y. Wang, Z. H. Xin, Efficient detection model of steel strip surface defects based on YOLO-V7, IEEE Access, 10 (2022), 133936–133944. http://doi.org/10.1109/ACCESS.2022.3230894
[13] X. Wang, K. Zhuang, An improved YOLOX method for surface defect detection of steel strips, in 2023 IEEE 3rd International Conference on Power, Electronics and Computer Applications (ICPECA), (2022), 152–157. http://doi.org/10.1109/ICPECA56706.2023.10075827
[14] C. Li, L. Li, H. Jiang, K. Weng, Y. Geng, L. Li, et al., YOLOv6: A single-stage object detection framework for industrial applications, preprint, arXiv:2209.02976.
[15] C. Y. Wang, A. Bochkovskiy, H. Y. M. Liao, YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2023), 7464–7475. http://doi.org/10.48550/arXiv.2207.02696
[16] F. Akhyar, C. Y. Lin, K. Muchtar, T. Y. Wu, H. F. Ng, High efficient single-stage steel surface defect detection, in 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), (2019), 18–21. http://doi.org/10.1109/AVSS.2019.8909834
[17] V. Nath, C. Chattopadhyay, S2D2Net: An improved approach for robust steel surface defects diagnosis with small sample learning, in IEEE International Conference on Image Processing (ICIP), (2021), 1199–1203. http://doi.org/10.26599/TST.2018.9010090
[18] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, et al., Attention is all you need, in Advances in Neural Information Processing Systems, 30 (2017). http://doi.org/10.1109/ICIP42928.2021.9506405
[19] W. Yu, P. Zhou, S. Yan, X. Wang, Inceptionnext: When inception meets convnext, preprint, arXiv:2303.16900.
[20] Y. Liu, Z. Shao, N. Hoffmann, Global attention mechanism: Retain information to enhance channel-spatial interactions, preprint, arXiv:2112.05561.