
To address the challenge of filtering noise from the impulsive vibration signals of rolling bearings, this paper presents a novel filtering method based on an improved Morlet wavelet. By replacing the shape factor of the traditional Morlet wavelet with the width of the Gaussian waveform, the improved wavelet has a clear physical meaning and is more amenable to parameter optimization. The marine predation algorithm was employed to optimize the shape width and center frequency of the improved Morlet wavelet, with the minimum Shannon entropy as the optimization index. The optimized Morlet wavelet closely matched the vibration waveform of the rolling bearing. Shannon entropy was also used as the evaluation index of noise filtering, which enabled a quantitative analysis of the filtering effect. Experimental validation showed that the method is effective in eliminating noise from rolling bearing signals. The method is of significance for vibration signal preprocessing, feature extraction and fault recognition of rolling bearings.
Citation: Yu Chen, Qingyang Meng, Zhibo Liu, Zhuanzhe Zhao, Yongming Liu, Zhijian Tu, Haoran Zhu. Research on filtering method of rolling bearing vibration signal based on improved Morlet wavelet[J]. Electronic Research Archive, 2024, 32(1): 241-262. doi: 10.3934/era.2024012
Retinal tears arise from vitreous traction on the retina or from degeneration and atrophy of the retina, and they are frequently observed in individuals with acute posterior vitreous detachment [1]. Retinal tears are a risk factor for retinal detachment, and identifying them poses a significant challenge. Without timely detection and intervention, 30–50% of cases progress to retinal detachment [2], a condition that can lead to blindness. In most cases, retinal tears can be diagnosed by indirect fundoscopy in conjunction with scleral pressure examination [3]. However, when the patient's refractive media are opaque, B-scan ultrasound is one of the few viable alternative diagnostic tools. Ultrasound is also more accessible and less expensive than other modalities such as OCT and ultra-wide-field imaging, and it is available in many local hospitals and primary community clinics. However, conventional manual reading requires highly skilled physicians to avoid oversight or misdiagnosis [4]. Only a few large hospitals in China have professional sonographers, as is the case in other developing countries and regions. As a result, the development of a model capable of automatically diagnosing retinal tears is critical and urgent [5].
Deep learning represents the most effective approach to automating the development of diagnostic systems. Previous studies have proposed a multitude of models, with predominant focus on the utilization of convolutional neural networks (CNNs) [6,7]. For example, Li et al. [8] screened for notable peripheral retinal lesions (NPRLs) by using numerous models, such as InceptionResNetV2, InceptionV3, ResNet50 and VGG16. Furthermore, with an accuracy of 79.8%, a system based on seResNet50 was developed by Zhang et al. [9] to screen numerous types of NPRLs. However, the inability of the CNN to capture long-distance image features hinders its continued development. In this context, Dosovitskiy et al. [10] proposed the vision transformer (ViT) as a solution to this problem, using the excellent transformer [11] from natural language processing as a point of reference. Subsequently, ViT was observed to outperform CNNs in a multitude of tests after self-attention methods were substituted for convolutional processes. Accordingly, several researchers have made efforts to implement the model in the treatment of ophthalmic disorders, particularly, retinal issues. Jiang et al. [12] employed a ViT to automatically identify normal eyes, age-related macular degeneration, and diabetic macular edema, achieving a classification accuracy of 99.69%. Furthermore, a deep learning model based on a ViT was introduced by Wu et al. [13] to assess diabetic retinopathy, and it realized an accuracy of 91.4% and a kappa score of 0.935. However, studies that report on the automatic diagnosis of retinal tears are few.
The present study involved the collection and construction of a retinal tear dataset comprising 1831 images, with the aim of developing more effective diagnostic algorithms. Despite the widely acknowledged fact that ViT is data-driven and performs exceptionally well with ample training data, our study encountered a hurdle due to the limited availability of data. Although the use of transfer learning has been demonstrated to be able to partially address this challenge, it should be noted that this approach may not be sufficient and could potentially lead to an increase in computational resources. Consequently, a hybrid structure was devised to introduce inductive bias and enhance the model's adaptability to our limited dataset. Furthermore, through experimental analysis, it has been observed that the utilization of deformable convolution [14] affords superior adaptability to the contour of lesions and yields improved performance. Thus, based on the aforementioned rationales, we proposed a novel framework called the deformable convolution and transformer network (DCT-Net) in the current study, which integrates the merits of deformable convolution and the vision transformer. The model was subjected to rigorous testing on two datasets to assess its overall performance and efficacy. Additionally, attention maps were generated in order to validate their interpretability. The current body of research on retinal tear diagnostic systems is limited, and our study has partially addressed this research gap.
To summarize, the main contributions of the present study can be succinctly stated as follows:
● A dataset comprising 1831 B-scan ultrasound images of retinal tears was assembled.
● A novel model that is more appropriate for small datasets of medical images is proposed. To our knowledge, this study represents the first investigation into the utilization of ViT-based architecture for the purpose of identifying retinal tears through the analysis of ultrasound images.
● The efficacy of the model in terms of lesion detection, as well as its commendable performance, are demonstrated through the analysis of two datasets.
The current study comprises three primary modules: data collection and preprocessing; model design and validation; and interpretability analysis and external validation. The flowchart is shown in Figure 1.
The investigation was carried out in accordance with the Declaration of Helsinki, as amended in 2013.
A comprehensive set of 1902 ultrasound B-scan images was collected for this retrospective study. These samples were obtained from the eye hospital of Wenzhou Medical University for the period from October 2017 to April 2022. All positive samples were verified by professional ophthalmologists. However, the images were collected from a variety of devices with varying resolutions and file types. Thus, to accommodate the model's input, each image underwent a resizing process to 224 × 224 pixels, and any blurry pixels were removed. Finally, 1831 samples (910 positive and 927 negative) were utilized for subsequent investigations.
Data augmentation is a data processing technique that increases the quantity and diversity of training samples by transforming existing data. There are two categories of data augmentation, namely, online augmentation and offline augmentation. Typically, the former is used for larger datasets, with the operations applied to each data batch during training. Conversely, the latter is used for smaller datasets, with the operations applied directly to the original data [15]. Given the limited dataset available for our study, the offline method was selected. Various techniques, including rotation, cropping, brightness shift, contrast modification, horizontal flipping and vertical flipping, can be employed for image augmentation [16,17]. However, not all of them are universally applicable, because a transformation may change the category label of an image. After analysis, we opted for horizontal flip, vertical flip and brightness shift to augment the original dataset. Figure 2 illustrates these augmentation operations.
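The three selected operations can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a grayscale image stored as a nested list of 8-bit pixel values, and the function names and the brightness offset of 30 are hypothetical choices made for the example.

```python
def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Reverse the order of the rows (top to bottom)."""
    return [row[:] for row in image[::-1]]


def brightness_shift(image, delta):
    """Add delta to every pixel and clip to the 8-bit range [0, 255]."""
    return [[min(255, max(0, pixel + delta)) for pixel in row] for row in image]


def augment_offline(image, delta=30):
    """Return the original image plus its three augmented copies.

    delta=30 is an assumed value; the source does not state the shift used.
    """
    return [
        image,
        horizontal_flip(image),
        vertical_flip(image),
        brightness_shift(image, delta),
    ]


# Tiny 2 x 3 example image (hypothetical pixel values).
image = [[0, 50, 100],
         [150, 200, 250]]
augmented = augment_offline(image)
```

In an offline setting, the augmented copies would be written to disk alongside the originals before training starts, so that every epoch sees the same enlarged dataset.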
The ViT model is based on direct global relationship modeling and has demonstrated significant accomplishments in the extraction of global features through the use of a multi-head self-attention mechanism. However, it has limitations in its ability to effectively accommodate minuscule lesions, and it proves inadequate when confronted with a limited size of training data. In this context, convolution operations, specifically deformable convolutions, exhibit better adaptability to local detail characteristics. This study presents a novel approach that integrates the ViT and deformable convolution to realize the accurate detection of retinal tears with enhanced precision. Figure 3 presents a visual representation of the proposed model. Furthermore, the utilization of transfer learning technology was employed in this particular aspect to enhance network performance and expedite the training process.
The input image (H × W × C) was split into n patches. Each patch was flattened and converted to a D-dimensional vector by a linear projection layer. A class token was also added, as in BERT [18]. After position embedding, the D-dimensional vectors were passed to the Transformer Encoder. The dimensions of the vectors were kept unchanged throughout the entire process.
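The patch embedding step can be sketched as below. This is an illustrative sketch under several assumptions: a single-channel image stored as a nested list, a patch size of 16 (which would give n = (224/16)² = 196 patches for a 224 × 224 input; the source does not state the patch size), and the class token placed at the front of the sequence as in the standard ViT. All function names are hypothetical.

```python
def split_into_patches(image, patch_size):
    """Split an H x W single-channel image into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by the patch size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patches.append([image[top + r][left + c]
                            for r in range(patch_size)
                            for c in range(patch_size)])
    return patches


def linear_projection(patches, weight_columns):
    """Project each flattened patch to D dimensions.

    weight_columns holds D columns, each of length patch_size * patch_size.
    """
    return [[sum(p * w for p, w in zip(patch, column)) for column in weight_columns]
            for patch in patches]


def embed(image, patch_size, weight_columns, class_token, position_embeddings):
    """Build the encoder input: class token + projected patches + position embedding."""
    tokens = [class_token] + linear_projection(
        split_into_patches(image, patch_size), weight_columns)
    return [[t + p for t, p in zip(token, position)]
            for token, position in zip(tokens, position_embeddings)]


# Toy example: 4 x 4 image, 2 x 2 patches, D = 2.
toy_image = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
toy_weights = [[1, 0, 0, 0], [0, 0, 0, 1]]   # picks first and last pixel of a patch
toy_positions = [[0, 0] for _ in range(5)]   # zeros keep the example easy to check
toy_tokens = embed(toy_image, 2, toy_weights, [0, 0], toy_positions)
```

In a trained model the projection weights, class token and position embeddings are all learned parameters; fixed values are used here only so the result can be checked by hand.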
In the Transformer Encoder, the input vectors first undergo layer normalization, which speeds up the convergence of the network. The procedure is given by Eq (1), where μ and σ are the mean and standard deviation of the input, respectively, and ε is a small constant that prevents division by zero.
$$\mathrm{LayerNorm}(x_i)=\frac{x_i-\mu}{\sqrt{\sigma^{2}+\epsilon}} \tag{1}$$
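As a quick numerical check of Eq (1), the following sketch normalizes one token vector. It follows the equation as written, so the learnable scale and shift parameters that deep learning frameworks usually add are omitted; the value of ε is an assumed default.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and (approximately) unit variance, as in Eq (1)."""
    mean = sum(x) / len(x)
    variance = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(variance + eps) for v in x]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
```

The statistics are computed over the features of a single token, which is why layer normalization does not depend on the batch size.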
The resulting output is used to compute the mutual attention in the multi-head attention layers (as shown in Eqs (2)–(4)). Subsequently, a Layer Norm layer and a Multi-Layer Perceptron layer were employed to obtain the final outputs. The residual connections in this process effectively mitigated the vanishing gradient problem. To make full use of the pre-trained weights from transfer learning, we used the same number of encoders as the conventional ViT model.
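$$Q_i=QW_i^{Q},\quad K_i=KW_i^{K},\quad V_i=VW_i^{V} \tag{2}$$

$$\mathrm{head}_i=\mathrm{Attention}(Q_i,K_i,V_i) \tag{3}$$

$$\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\mathrm{head}_2,\ldots,\mathrm{head}_{12}) \tag{4}$$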
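Equations (2)–(4) can be traced with a small pure-Python sketch. The text does not spell out the Attention function, so scaled dot-product attention from the original transformer is assumed here; the helper names are hypothetical, and only the concatenation of Eq (4) is shown, without any output projection.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention (assumed form of Attention in Eq (3))."""
    d_k = len(q[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)


def multi_head(q, k, v, w_q, w_k, w_v):
    """Eqs (2)-(4): project per head, attend, then concatenate the heads."""
    heads = [attention(matmul(q, wq), matmul(k, wk), matmul(v, wv))
             for wq, wk, wv in zip(w_q, w_k, w_v)]
    return [sum((head[t] for head in heads), []) for t in range(len(q))]


# Toy example: 2 tokens, 2 features, 2 heads with identity projections.
eye = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]
output = multi_head(tokens, tokens, tokens, [eye, eye], [eye, eye], [eye, eye])
```

With 12 heads, as in Eq (4), the lists `w_q`, `w_k` and `w_v` would each hold 12 learned projection matrices, and the concatenated output would have the same width D as the input.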
$$y(P_0)=\sum_{P_n\in R} w(P_n)\cdot x(P_0+P_n+\Delta P_n) \tag{5}$$
The diagnosis of retinal tears using ultrasound images is highly dependent on the position and shape of the small lesion areas. However, the standard ViT is insufficient for acquiring such localized information. Because conventional convolution employs regular kernels, its receptive field remains constant and is ill-equipped to accommodate variations in edge shape. By adding a learnable offset to each sampling position of the standard convolution kernel, deformable convolution can modify the shape of the sampling area, bringing it closer to the object's edge. The sampling procedures of deformable convolution and ordinary convolution are presented in Figure 4. Equation (5) gives the calculation, where P0 is the output location, R is the regular sampling grid of the kernel, Pn enumerates the positions in R, w is the kernel weight, x is the input feature map and ΔPn is the learned offset.
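Equation (5) can be illustrated for a single output location with the sketch below. It is a simplified, single-channel example with a 3 × 3 grid and hypothetical names. Fractional sampling positions are resolved by bilinear interpolation, as in the original deformable convolution formulation, and positions outside the image are assumed to contribute zero. In a real layer the offsets are predicted by an additional convolution, whereas here they are passed in directly.

```python
import math

# Regular 3 x 3 sampling grid R around the output location P0.
GRID = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def bilinear_sample(image, y, x):
    """Sample the image at a fractional (y, x); out-of-range pixels count as 0."""
    height, width = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < height and 0 <= xx < width:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * image[yy][xx]
    return value


def deformable_conv_at(image, p0, weights, offsets):
    """Eq (5): sum over Pn in R of w(Pn) * x(P0 + Pn + delta Pn)."""
    y0, x0 = p0
    total = 0.0
    for (dy, dx), w, (oy, ox) in zip(GRID, weights, offsets):
        total += w * bilinear_sample(image, y0 + dy + oy, x0 + dx + ox)
    return total


feature_map = [[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]]
ones = [1.0] * 9
no_offsets = [(0.0, 0.0)] * 9
shift_right = [(0.0, 0.5)] * 9   # every sampling point moved half a pixel right

plain = deformable_conv_at(feature_map, (1, 1), ones, no_offsets)
shifted = deformable_conv_at(feature_map, (1, 1), ones, shift_right)
```

With all offsets set to zero the result equals an ordinary 3 × 3 convolution at the same location, which makes the role of ΔPn easy to see.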
Subsequently, a residual deformable convolution block was devised to enhance the extraction of intricate features. Similar to the Transformer Encoder, the designed module first employs a Batch Norm layer to convert the inputs into data with a mean of 0 and a variance of 1. Two deformable convolutional layers were used to capture local detail features. To enhance nonlinearity while minimizing the computational workload, the convolutional kernel of the first layer was designed to be larger than that of the second layer. An adaptive average pooling layer was then added to improve the efficiency of feature extraction and computation. Furthermore, a residual connection was incorporated into the block, drawing inspiration from ResNet [19], to mitigate the vanishing gradient problem [20].
The pooling layers in a CNN merge position information, so details can be lost and the resulting heat maps are coarse [21,22]. Our model is founded upon a self-attention mechanism and effectively captures global features, which allows it to deliver sufficiently detailed visualizations [23]. However, attention-based networks are incompatible with the traditional Grad-CAM [24] method. This is because a CNN permits the aggregation of feature map weights from multiple channels, whereas the ViT does not allow distinct patches to be added together. Therefore, we adopted the attention rollout method proposed by Abnar and Zuidema [25]. In essence, attention rollout calculates the product of the attention matrices from the low level to the high level of the network. It is computed recursively, layer by layer, propagating information from the input layer to the higher levels, and the residual connections must be taken into account. It is represented by Eq (6).
$$\mathrm{AttentionRollout}_L=(A_L+I)\,\mathrm{AttentionRollout}_{L-1} \tag{6}$$

where A_L is the attention matrix of layer L and I is the identity matrix.
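The recursion in Eq (6) can be sketched as follows. The sketch follows the equation as written and assumes one attention matrix per layer, for example the average over the heads of that layer. Published implementations often also re-normalize the rows after adding the identity; that step is not part of Eq (6) and is therefore left out. The function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention_rollout(attention_matrices):
    """Apply Eq (6) from the first layer to the last.

    attention_matrices: list of n x n matrices, ordered from low to high layer.
    The recursion starts from the identity matrix (rollout of "layer 0").
    """
    n = len(attention_matrices[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a in attention_matrices:
        # A_L + I accounts for the residual connection around the attention layer.
        with_residual = [[a[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                         for i in range(n)]
        rollout = matmul(with_residual, rollout)
    return rollout


# Two-layer toy example with two tokens.
layer_1 = [[0.5, 0.5], [0.5, 0.5]]
layer_2 = [[1.0, 0.0], [0.0, 1.0]]
rollout = attention_rollout([layer_1, layer_2])
```

To draw an attention map, the row of the final rollout matrix that belongs to the class token is typically reshaped to the patch grid and upsampled to the image size.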
A transfer learning strategy was adopted to expedite the training process and enhance the performance of the model. Pre-training was conducted on the ImageNet dataset, which comprises a vast collection of natural images in more than 1000 categories. The cross-entropy loss [26,27] was employed as the loss function in our study. This choice was made because the derivative of the sigmoid function is prone to saturation, which results in slow gradient updates. Furthermore, the Adam optimizer [28] was used, as it offers rapid convergence and its hyperparameters are relatively easy to configure.
Furthermore, an early stopping strategy was developed with the intention of mitigating the issue of overfitting. Following each iteration of training, a comprehensive evaluation was conducted on the designated test dataset. The training process was deemed to be complete once the accuracy on the test set ceased to exhibit substantial improvements and stabilized after approximately 10 epochs.
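The stopping rule can be sketched as a patience counter. This is one possible reading of the description above, not the authors' exact criterion: training stops once the evaluation accuracy has not improved for `patience` consecutive epochs, with `patience=10` taken from the "approximately 10 epochs" in the text and `min_delta` added as a hypothetical threshold for a "substantial" improvement.

```python
def early_stopping_epoch(accuracies, patience=10, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    accuracies: evaluation accuracy after each epoch, in order.
    Training stops after `patience` consecutive epochs without an
    improvement larger than `min_delta` over the best accuracy so far.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, accuracy in enumerate(accuracies, start=1):
        if accuracy > best + min_delta:
            best = accuracy
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical accuracy curve: improves for 3 epochs, then plateaus.
history = [0.90, 0.95, 0.96] + [0.96] * 10
stop_epoch = early_stopping_epoch(history)
```

In practice the weights from the best epoch, rather than the last one, are usually kept for the final evaluation.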
In order to enhance the precision of an evaluation of the performance of the designed model, a set of widely recognized state-of-the-art (SOTA) models, viz. Alexnet [29], Inception v3 [30], Resnet101 [19], VGG16 [31] and ViT, were chosen as the baseline models. The preprocessing steps and training strategies remained consistent across all baselines, with the exception of Inception v3, which required an input size of 299 × 299 pixels.
Table 1 presents the performance metrics of the baseline models and of the model designed in this study. The confusion matrices of the models on the test set are depicted in Figure 5. Each cell shows the number of images with the corresponding combination of true and predicted labels, together with its percentage of the total number of images with that true label. Among the CNN-based models, Inception v3 exhibited the best performance, achieving an accuracy of 96.82%, an F1 score of 0.9605 and an AUC of 0.9828. The ViT model with the pure self-attention mechanism did not perform well; its accuracy was lower than that of three of the four CNN baselines. In contrast, our designed model surpassed all other models across all metrics, and only 10 samples were classified incorrectly. Its recall of 97.13% also exceeds the sensitivity of 96% reported for human experts [32].
Table 1. Performance metrics of the baseline models and DCT-Net on the test set.
Model | Accuracy | Precision | Recall | F1 Score | AUC |
AlexNet | 95.11% | 94.64% | 95.68% | 0.9456 | 0.9286 |
Inception v3 | 96.82% | 96.55% | 96.37% | 0.9605 | 0.9828 |
ResNet101 | 96.74% | 96.94% | 96.42% | 0.9599 | 0.9772 |
VGG16 | 96.52% | 96.42% | 96.66% | 0.9595 | 0.9598 |
ViT | 95.76% | 95.66% | 95.87% | 0.9515 | 0.9444 |
DCT-Net | 97.78% | 97.34% | 97.13% | 0.9682 | 1.0000 |
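For reference, the threshold-based metrics reported in the tables can be derived from the four cells of a binary confusion matrix, as sketched below. The counts in the example are hypothetical and are not taken from Figure 5; AUC is omitted because it requires the predicted scores rather than the confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: positive samples predicted as positive/negative.
    tn/fp: negative samples predicted as negative/positive.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the formulas.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
```

Recall and sensitivity are the same quantity, so the Recall column of Table 1 is directly comparable to sensitivity values reported elsewhere.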
As an external validation step, we used the ORIGA dataset in this section to examine whether the proposed model generalizes well and can adapt to other types of data. The dataset comprises a total of 650 images for glaucoma diagnosis. To allow a comparison with other models documented in the literature [33,34,35], we used the original dataset without any augmentation. Table 2 shows the results, where NMD denotes that the pre-training was performed on a non-medical dataset, SOD denotes that the pre-training was performed on a similar ophthalmic dataset and CT-Net denotes the variant in which common convolution replaced the deformable convolution. The ViT did not perform well, most likely as a result of the limited dataset. In contrast, DCT-Net achieved the highest accuracy, at 83.8%. The benefit of deformable convolution is also apparent from the comparison with CT-Net (80.5%).
Table 2. External validation results on the ORIGA dataset.
Model | Accuracy | Sensitivity | Specificity |
CNN | 70.4% | 70.7% | 74.8% |
VGG | 70.1% | 69.8% | 71.0% |
GoogLeNet | 71.8% | 69.8% | 73.5% |
ResNet | 71.5% | 71.3% | 71.7% |
Chen [34] | 70.8% | 69.2% | 71.0% |
Shibata [35] | 73.3% | 73.2% | 76.7% |
NMD+CNN | 74.5% | 68.7% | 80.7% |
SOD+CNN | 73.9% | 80.9% | 72.2% |
NMD+Attention | 74.9% | 71.2% | 77.7% |
Xu [33] | 76.6% | 75.3% | 77.2% |
ViT | 71.4% | 74.0% | 67.8% |
CT-Net | 80.5% | 81.7% | 80.1% |
DCT-Net | 83.8% | 82.7% | 82.4% |
Models that are easily interpretable offer valuable insights into their inner workings, thereby benefiting both patients and clinicians. Figure 6 displays three attention maps that were generated from our private dataset. The lesions are marked with red circles in the original images. In the attention maps, a higher color intensity indicates a greater level of attention. The images show a strong correspondence between the regions of heightened attention and the lesion areas. This indicates that the model bases its predictions on the lesion regions and is therefore interpretable.
The hardware configuration used in this study is as follows. The central processing unit (CPU) was a 7-core Intel(R) Xeon(R) CPU E5-2680 v4 operating at 2.40 GHz. The system also included a single graphics processing unit, an RTX 3070 Ti with 8 GB of dedicated memory. The training process used Python 3.8, PyTorch 1.10.0 and CUDA 11.3.
CNNs have demonstrated remarkable performance on previous image processing tasks and are widely acknowledged as the SOTA approach. For instance, Yu et al. [36,37] employed CNNs for the purpose of detecting concrete cracks, achieving exceptional performance. Ragupathy and Karunakaran [38] proposed a CNN-based model for the detection of meningioma brain tumors. The model demonstrated promising performance metrics. However, due to the constraints imposed by the small convolutional kernel, CNNs may not be able to effectively extract global features. As shown in Table 1, it appears that the performance of the CNN-based model has encountered a bottleneck, making further improvements challenging. When comparing the CNN with the ViT, it can be observed that the ViT utilizes the attention mechanism to calculate the relationship between global pixels, thereby enabling a comprehensive global perspective. Numerous studies have substantiated the impressive efficacy of the ViT model [39]. However, our investigation revealed that the pure ViT did not perform well on small datasets of retinal tears (with the accuracy of 95.76%).
To enhance the efficacy of lesion detection on limited datasets, a novel architecture was initially devised, integrating the merits of convolution and attention mechanisms. As shown in Table 2, the utilization of global feature extraction techniques contributes to the generation of a relatively comprehensive latent space feature representation. Concurrently, as a result of incorporating the inductive bias of convolution, the proposed model demonstrates substantial enhancements on the limited public dataset, achieving an accuracy of 80.5%. Moreover, replacing ordinary convolutions with deformable convolutions has been found to yield more favorable outcomes, as evidenced by an accuracy rate of 83.8%. This phenomenon could potentially be attributed to the enhanced precision resulting from extracting both the location and shape of the lesion areas. From the perspective of external validation and interpretable analysis, the model possesses robustness and sufficient accuracy.
Notwithstanding the enhanced performance achieved in this study, certain constraints remain. First, ophthalmic ultrasound is highly dependent on the equipment, technique and examiner experience. However, the data collected for this study came from a variety of devices. This may compromise the validity of the results. Second, all of the retinal tear images utilized in this study were procured from a single hospital. This may lead to an absence of diversity in the cases. Moreover, only retinal tears were included in our study. Ultrasound imaging can, in fact, be utilized to diagnose additional retinal disorders. Correspondingly, the value of the model can be enhanced through the incorporation of additional disease types. Finally, the incorporation of the residual deformable convolution module and the utilization of a ViT as the feature extractor resulted in an increased number of parameters for our model (Table 3). This results in increased demands on the environment in terms of model deployment.
Utilizing ultrasound to identify retinal tears is an extremely practical method. It is superior to alternative approaches when it comes to handling intricate clinical scenarios, such as ocular media opacity. However, the extraction of useful features via conventional machine learning methods is hampered by low resolution. Fortunately, the progress that has been made in deep learning enables the analysis of these images in an efficient manner. Our current research is, without a doubt, preliminary in nature. Moving forward, we aim to enhance the model's architecture and implement global vision technology that is more streamlined or possesses a reduced number of parameters. This will allow the effortless deployment of lightweight models across diverse environments. Furthermore, our objective is to enhance the quantity and range of samples gathered in order to prevent issues with model generalization that may arise from discrepancies in the training data. Lastly, we will collaborate with clinicians and conduct additional multicenter studies to precisely quantify the extent to which this model can benefit physicians.
Table 3. Number of parameters of each model.
Model | Parameters (×10⁶) |
Alexnet | 57.01 |
Inception v3 | 25.12 |
Resnet101 | 42.5 |
VGG16 | 134.27 |
Vision Transformer | 85.80 |
DCT-Net | 138.36 |
A novel model was developed for the diagnosis of ophthalmological conditions in the current study. The model demonstrated superior performance on both our proprietary dataset and the glaucoma dataset that was publicly available. The framework is a comprehensive computing framework that exhibits superior performance and does not necessitate the generation of manually designed features. Overall, this technology provides significant practical value in the field of clinical application, particularly in the realm of automated diagnosis.
The authors declare that they have not used artificial intelligence tools in the creation of this article.
We would like to thank all editors and reviewers for their careful review and revision of the paper. This research was supported in part by the National Key R&D Program of China [2018YFA0701700].
The authors declare that there is no conflict of interest.
[1] |
M. Cerrada, R. V. Sanchez, C. Li, F. Pacheco, D. Cabrera, J. V. de Oliveira, et al., A review on data-driven fault severity assessment in rolling bearings, Mech. Syst. Signal Process., 99 (2018), 169–196. https://doi.org/10.1016/j.ymssp.2017.06.012 doi: 10.1016/j.ymssp.2017.06.012
![]() |
[2] |
M. Xia, T. Li, L. Xu, L. Liu, C. W. de Silva, Fault diagnosis for rotating machinery using multiple sensors and convolutional neural networks, IEEE/ASME Trans. Mechatron., 23 (2017), 101–110. https://doi.org/10.1109/TMECH.2017.2728371 doi: 10.1109/TMECH.2017.2728371
![]() |
[3] |
M. Liang, K. Zhou, Probabilistic bearing fault diagnosis using Gaussian process with tailored feature extraction, Int. J. Adv. Manuf. Technol., 119 (2022), 2059–2076. https://doi.org/10.1007/s00170-021-08392-6 doi: 10.1007/s00170-021-08392-6
![]() |
[4] |
W. Yang, R. Court, Experimental study on the optimum time for conducting bearing maintenance, Measurement, 46 (2013), 2781–2791. https://doi.org/10.1016/j.measurement.2013.04.016 doi: 10.1016/j.measurement.2013.04.016
![]() |
[5] |
C. Mongia, D. Goyal, S. Sehgal, Vibration response-based condition monitoring and fault diagnosis of rotary machinery, Mater. Today Proc., 50 (2022), 679–683. https://doi.org/10.1016/j.matpr.2021.04.395 doi: 10.1016/j.matpr.2021.04.395
![]() |
[6] |
W. Ahmad, S. A. Khan, J. M. Kim, A hybrid prognostics technique for rolling element bearings using adaptive predictive models, IEEE Trans. Ind. Electron., 65 (2017), 1577–1584. https://doi.org/10.1109/TIE.2017.2733487 doi: 10.1109/TIE.2017.2733487
![]() |
[7] |
M. A. Ugwiri, M. Carratú, V. Paciello, C. Liguori, Benefits of enhanced techniques combining negentropy, spectral correlation and kurtogram for bearing fault diagnosis, Measurement, 185 (2021), 110013. https://doi.org/10.1016/j.measurement.2021.110013 doi: 10.1016/j.measurement.2021.110013
![]() |
[8] |
S. Gawde, S. Patil, S. Kumar, P. Kamat, K. Kotecha, A. Abraham, Multi-fault diagnosis of Industrial Rotating Machines using Data-driven approach: a review of two decades of research, Eng. Appl. Artif. Intell., 123 (2023), 106139. https://doi.org/10.1016/j.engappai.2023.106139 doi: 10.1016/j.engappai.2023.106139
![]() |
[9] |
Y. Xu, Z. Li, S. Wang, W. Li, T. Sarkodie-Gyan, S. Feng, A hybrid deep-learning model for fault diagnosis of rolling bearings, Measurement, 169 (2021), 108502. https://doi.org/10.1016/j.measurement.2020.108502 doi: 10.1016/j.measurement.2020.108502
![]() |
[10] |
H. Shao, H. Jiang, Y. Lin, X. Li, A novel method for intelligent fault diagnosis of rolling bearings using ensemble deep auto-encoders, Mech. Syst. Signal Process., 102 (2018), 278–297. https://doi.org/10.1016/j.ymssp.2017.09.026 doi: 10.1016/j.ymssp.2017.09.026
![]() |
[11] |
L. Wen, X. Li, L. Gao, Y. Zhang, A new convolutional neural network-based data-driven fault diagnosis method, IEEE Trans. Ind. Electron., 65 (2018), 5990–5998. https://doi.org/10.1109/TIE.2017.2774777 doi: 10.1109/TIE.2017.2774777
![]() |
[12] |
M. Gan, C. Wang, C. Zhu, Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings, Mech. Syst. Signal Process., 72–73 (2016), 92–104. https://doi.org/10.1016/j.ymssp.2015.11.014 doi: 10.1016/j.ymssp.2015.11.014
![]() |
[13] |
X. F. Xu, S. T. Hu, P. M. Shi, H. S. Shao, R. X. Li, Z. Li, Natural phase space reconstruction-based broad learning system for short-term wind speed prediction: case studies of an offshore wind farm, Energy, 262 (2023), 125342. https://doi.org/10.1016/j.energy.2022.125342
[14] |
X. F. Xu, S. T. Hu, H. S. Shao, P. M. Shi, R. X. Li, D. G. Li, A spatio-temporal forecasting model using optimally weighted graph convolutional network and gated recurrent unit for wind speed of different sites distributed in an offshore wind farm, Energy, 284 (2023), 128565. https://doi.org/10.1016/j.energy.2023.128565
[15] |
L. J. Zhang, J. W. Xu, J. H. Yang, D. B. Yang, D. D. Wang, Multiscale morphology analysis and its application of fault diagnosis, Mech. Syst. Signal Process., 22 (2008), 597–610. https://doi.org/10.1016/j.ymssp.2007.09.010
[16] |
Z. Li, S. Cai, X. Li, S. Shao, X. Y. Yang, Fault diagnosis of rolling bearing for motor based on LSTM-EEMD and genetic optimization, J. Phys.: Conf. Ser., 2549 (2023), 012025. https://doi.org/10.1088/1742-6596/2549/1/012025
[17] |
K. Zhou, J. Tang, A wavelet neural network informed by time-domain signal preprocessing for bearing remaining useful life prediction, Appl. Math. Modell., 122 (2023), 220–241. https://doi.org/10.1016/j.apm.2023.05.042
[18] |
Q. Miao, C. Tang, W. Liang, M. Pecht, Health assessment of cooling fan bearings using wavelet-based filtering, Sensors, 13 (2013), 274–291. https://doi.org/10.3390/s130100274
[19] |
K. Belaid, A. Miloudi, H. Bournine, The processing of resonances excited by gear faults using continuous wavelet transform with adaptive complex Morlet wavelet and sparsity measurement, Measurement, 180 (2021), 109576. https://doi.org/10.1016/j.measurement.2021.109576
[20] |
P. Liang, W. Wang, X. Yuan, S. Liu, L. Zhang, Y. Cheng, Intelligent fault diagnosis of rolling bearing based on wavelet transform and improved ResNet under noisy labels and environment, Eng. Appl. Artif. Intell., 115 (2022), 105269. https://doi.org/10.1016/j.engappai.2022.105269
[21] |
J. Ma, H. Li, Y. Chen, J. Wang, Z. Zou, Application of VMD and dynamic wavelet noise reduction techniques in rolling bearing fault diagnosis, J. Phys.: Conf. Ser., 2528 (2023), 012048. https://doi.org/10.1088/1742-6596/2528/1/012048
[22] |
G. Naima, H. A. Elias, S. Salah, An improved fast kurtogram based on an optimal wavelet coefficient for wind turbine gear fault detection, J. Electr. Eng. Technol., 17 (2022), 1335–1346. https://doi.org/10.1007/s42835-021-00937-9
[23] |
L. Liang, G. H. Xu, C. G. Hou, Continuous wavelet transform denoising method based on singular value decomposition, J. Xi'an Jiaotong Univ., 38 (2004), 904–908. https://doi.org/10.3321/j.issn:0253-987X.2004.09.006
[24] |
J. Lin, L. S. Qu, Feature extraction based on Morlet wavelet and its application for mechanical fault diagnosis, J. Sound Vib., 234 (2000), 135–148. https://doi.org/10.1006/jsvi.2000.2864
[25] |
W. Zhang, M. P. Jia, L. Zhu, An adaptive Morlet wavelet filter method and its application in detecting early fault feature of ball bearings (in Chinese), J. Southeast Univ. (Nat. Sci. Ed.), 46 (2016), 457–463. https://doi.org/10.3969/j.issn.1001-0505.2016.03.001
[26] |
P. W. Tse, D. Wang, The automatic selection of an optimal wavelet filter and its enhancement by the new Sparsogram for bearing fault detection: part 2 of the two related manuscripts that have a joint title as "two automatic vibration-based fault diagnostic methods using the novel sparsity measurement-Parts 1 and 2", Mech. Syst. Signal Process., 40 (2013), 520–544. https://doi.org/10.1016/j.ymssp.2013.05.018
[27] |
Y. Jiang, B. Tang, Y. Qin, W. Liu, Feature extraction method of wind turbine based on adaptive Morlet wavelet and SVD, Renewable Energy, 36 (2011), 2146–2153. https://doi.org/10.1016/j.renene.2011.01.009
[28] |
M. Behzad, A. Kiakojouri, H. A. Arghand, A. Davoodabadi, Inaccessible rolling bearing diagnosis using a novel criterion for Morlet wavelet optimization, J. Vib. Control, 28 (2022), 1239–1250. https://doi.org/10.1177/1077546321989503
[29] |
X. Gu, S. Yang, Y. Liu, F. Deng, B. Ren, Compound faults detection of the rolling element bearing based on the optimal complex Morlet wavelet filter, Proc. Inst. Mech. Eng., Part C: J. Mech. Eng. Sci., 232 (2018), 1786–1801. https://doi.org/10.1177/0954406217710673
[30] |
Y. Zhang, B. P. Tang, Z. R. Liu, R. X. Chen, An adaptive demodulation approach for bearing fault detection based on adaptive wavelet filtering and spectral subtraction, Meas. Sci. Technol., 27 (2015), 025001. https://doi.org/10.1088/0957-0233/27/2/025001
[31] |
W. Su, F. Wang, H. Zhu, Z. Zhang, Z. Guo, Rolling element bearing faults diagnosis based on optimal Morlet wavelet filter and autocorrelation enhancement, Mech. Syst. Signal Process., 24 (2010), 1458–1472. https://doi.org/10.1016/j.ymssp.2009.11.011
[32] |
X. Han, J. Xu, S. Song, J. Zhou, Crack fault diagnosis of vibration exciter rolling bearing based on genetic algorithm–optimized Morlet wavelet filter and empirical mode decomposition, Int. J. Distrib. Sens. Netw., 18 (2022). https://doi.org/10.1177/15501329221114566
[33] |
M. X. Cohen, A better way to define and describe Morlet wavelets for time-frequency analysis, Neuroimage, 199 (2019), 81–86. https://doi.org/10.1016/j.neuroimage.2019.05.048
[34] |
A. Dey, S. Bhattacharyya, S. Dey, D. Konar, J. Platos, V. Snasel, et al., A review of quantum-inspired metaheuristic algorithms for automatic clustering, Mathematics, 11 (2023), 2018. https://doi.org/10.3390/math11092018
[35] |
A. Faramarzi, M. Heidarinejad, S. Mirjalili, A. H. Gandomi, Marine Predators Algorithm: a nature-inspired metaheuristic, Expert Syst. Appl., 152 (2020), 113377. https://doi.org/10.1016/j.eswa.2020.113377
[36] | S. Devendiran, K. Manivannan, Vibration based condition monitoring and fault diagnosis technologies for bearing and gear components: a review, Int. J. Appl. Eng. Res., 11 (2016), 3966–3975. |
[37] |
N. G. Nikolaou, I. A. Antoniadis, Demodulation of vibration signals generated by defects in rolling element bearings using complex shifted Morlet wavelets, Mech. Syst. Signal Process., 16 (2002), 677–694. https://doi.org/10.1006/mssp.2001.1459
[38] |
P. K. Kankar, S. C. Sharma, S. P. Harsha, Rolling element bearing fault diagnosis using wavelet transform, Neurocomputing, 74 (2011), 1638–1645. https://doi.org/10.1016/j.neucom.2011.01.021
[39] |
R. Dubey, V. Rajpoot, A. Chaturvedi, A. Dixit, S. Maheshwari, Ball-bearing fault classification using comparative analysis of wavelet coefficient based on entropy measurement, IETE J. Res., 25 (2022). https://doi.org/10.1080/03772063.2022.2142685
[40] |
S. Dong, X. Xu, R. Chen, Application of fuzzy C-means method and classification model of optimized K-nearest neighbor for fault diagnosis of bearing, J. Braz. Soc. Mech. Sci. Eng., 38 (2016), 2255–2263. https://doi.org/10.1007/s40430-015-0455-9
[41] |
B. Wang, Y. Lei, N. Li, N. Li, A hybrid prognostics approach for estimating remaining useful life of rolling element bearings, IEEE Trans. Reliab., 69 (2018), 401–412. https://doi.org/10.1109/TR.2018.2882682
[42] |
T. H. Loutas, D. Roulias, G. Georgoulas, Remaining useful life estimation in rolling bearings utilizing data-driven probabilistic e-support vectors regression, IEEE Trans. Reliab., 62 (2013), 821–832. https://doi.org/10.1109/TR.2013.2285318
[43] |
X. F. Xu, B. Li, Z. J. Qiao, P. M. Shi, H. S. Shao, R. X. Li, Caputo-Fabrizio fractional order derivative stochastic resonance enhanced by ADOF and its application in fault diagnosis of wind turbine drivetrain, Renewable Energy, 219 (2023), 119398. https://doi.org/10.1016/j.renene.2023.119398
[44] |
W. A. Smith, R. B. Randall, Rolling element bearing diagnostics using the Case Western Reserve University data: a benchmark study, Mech. Syst. Signal Process., 64–65 (2015), 100–131. https://doi.org/10.1016/j.ymssp.2015.04.021
Model | Accuracy | Precision | Recall | F1 Score | AUC |
Alexnet | 95.11% | 94.64% | 95.68% | 0.9456 | 0.9286 |
Inception V3 | 96.82% | 96.55% | 96.37% | 0.9605 | 0.9828 |
Resnet101 | 96.74% | 96.94% | 96.42% | 0.9599 | 0.9772 |
VGG16 | 96.52% | 96.42% | 96.66% | 0.9595 | 0.9598 |
Vit | 95.76% | 95.66% | 95.87% | 0.9515 | 0.9444 |
DCT-Net | 97.78% | 97.34% | 97.13% | 0.9682 | 1.0000 |
Model | Accuracy | Sensitivity | Specificity |
CNN | 70.4% | 70.7% | 74.8% |
VGG | 70.1% | 69.8% | 71.0% |
GoogLeNet | 71.8% | 69.8% | 73.5% |
ResNet | 71.5% | 71.3% | 71.7% |
Chen [34] | 70.8% | 69.2% | 71.0% |
Shibata [35] | 73.3% | 73.2% | 76.7% |
NMD+CNN | 74.5% | 68.7% | 80.7% |
SOD+CNN | 73.9% | 80.9% | 72.2% |
NMD+Attention | 74.9% | 71.2% | 77.7% |
Xu [33] | 76.6% | 75.3% | 77.2% |
ViT | 71.4% | 74.0% | 67.8% |
CT-Net | 80.5% | 81.7% | 80.1% |
DCT-Net | 83.8% | 82.7% | 82.4% |
Model | Parameters (× 10^6) |
Alexnet | 57.01 |
Inception v3 | 25.12 |
Resnet101 | 42.5 |
VGG16 | 134.27 |
Vision Transformer | 85.80 |
DCT-Net | 138.36 |