Research article

The research of recognition of peep door open state of ethylene cracking furnace based on deep learning


  • Received: 13 September 2021 Revised: 09 January 2022 Accepted: 18 January 2022 Published: 26 January 2022
  • In the chemical industry, the ethylene cracking furnace is the core piece of ethylene production equipment, and its safe and stable operation must be ensured. The peep door (fire gate) is the only observation window for monitoring the high-temperature operating conditions inside the cracking furnace, so in the automated monitoring of ethylene production, accurate identification of its open or closed state is particularly important. Building on a study of the ethylene cracking production process, this work recognizes the open and closed states of the fire gate using deep learning. First, a series of preprocessing and augmentation steps are applied to the originally collected fire gate image data. Then, a recognition model is constructed based on a convolutional neural network and trained on the preprocessed data; optimization algorithms such as Adam are used to update the model parameters and improve the model's generalization ability. Finally, the proposed recognition model is validated on the test set and compared with a transfer learning model. The experimental results show that the proposed model can accurately recognize the open state of the fire door and is more stable than the transfer learning model.

    Citation: Qirui Li, Baikun Zhang, Delong Cui, Zhiping Peng, Jieguang He. The research of recognition of peep door open state of ethylene cracking furnace based on deep learning[J]. Mathematical Biosciences and Engineering, 2022, 19(4): 3472-3486. doi: 10.3934/mbe.2022160




    High-resolution (HR) magnetic resonance imaging (MRI) reveals enhanced structural details and textures, which are essential for accurate diagnosis and pathological analysis of organs. However, the resolution of a medical image is often constrained by factors such as imaging hardware limitations, prolonged scanning durations, and low signal-to-noise ratios (SNR) [1]. Improving spatial resolution usually comes at the cost of decreased SNR and increased scanning time [2].

    Recently, super-resolution (SR) has emerged as a post-processing technique for upscaling the resolution of MRI images [2,3,4]. Existing SR methods include interpolation-based, regularization-based, and learning-based approaches [5,6]. Interpolation methods usually blur sharp edges and can hardly recover fine details or handle complex textures [7]. Deep convolutional neural networks (CNNs) have shown notable success in high-quality SR reconstruction [8]. After the pioneering work of SRCNN [9], a multitude of CNN-based SR models have been proposed, such as EDSR [10], RCAN [11], and SwinIR [12], significantly improving SR performance. The superior reconstruction performance of CNN-based methods such as SAN [13] and HAN [14] primarily stems from their deep architectures, residual learning, and diverse attention mechanisms [7,15]. Deepening the network enlarges the receptive field and helps the model learn the intricate mapping between low-resolution (LR) inputs and their HR counterparts. Residual learning, in turn, enables deeper SR networks by effectively mitigating gradient vanishing and explosion. As CNN-based SR methods matured, transformer-based SR methods emerged to further improve performance [12,16,17]. Unlike CNNs, transformer-based methods exploit long-range dependencies rather than only local features, greatly improving SR performance. However, transformer-based SR models usually have large parameter counts and are difficult to train.

    Although previous work has made significant progress, deep SR models remain challenging to train because of their expensive GPU computation and time costs, which limits the practical performance of state-of-the-art methods [18]. The SR methods above are therefore ill-suited to the limited computational resources and tight diagnosis times of medical applications.

    To tackle the aforementioned issues and challenges, we propose the multi-distillation residual network (MDRN), which achieves a superior trade-off between reconstruction quality and computational cost. Specifically, we propose the feature multi-distillation residual block (FMDRB), the building block of MDRN, which selectively retains certain features and passes the rest on to subsequent layers. To maximize the feature distillation capability, we incorporate a contrast-aware channel attention (CCA) layer that enhances the aggregation of the diverse refined information. Our approach focuses on leveraging more informative features, such as edges, textures, and small vessels, for MRI image reconstruction.

    In general, our main contributions can be summarized as follows:

    1) We propose a multi-distillation residual network (MDRN) for efficient and fast MRI super-resolution, which learns more discriminative feature representations while remaining lightweight enough for limited computation budgets. MDRN is therefore suitable for super-resolution MRI in clinical applications.

    2) We introduce a CCA block into our FMDRB to guide the model toward recovering high-frequency information, maximizing the capability of the MDRN network. The CCA block is tailored to low-level vision and performs better than the plain channel attention block.

    3) Thanks to its design, MDRN outperforms previous CNN-based SR models even under tight GPU memory budgets. The proposed method achieves the best trade-off between inference time and reconstruction quality, demonstrating a competitive advantage over state-of-the-art (SOTA) methods, supported by both quantitative and qualitative evidence.

    We propose a multi-distillation residual network (MDRN) for efficient and fast super-resolution MRI, whose architecture is shown in Figure 1. In Section 2.1, we provide an overview of the MDRN structure. In Section 2.2, we introduce the core module: feature multi-distillation residual block (FMDRB). Drawing inspiration from the common residual block (RB) [10] and information multi-distillation block (IMDB) [19], our network comprises a series of stacked FMDRBs forming the main chain, as demonstrated in Figure 1.

    Figure 1.  The architecture of MDRN.

    Given $I_{LR}$ as the LR input of MDRN, the network reconstructs the SR output $I_{SR}$ from it. As in previous works, we adopt a shallow feature extraction part, a deep feature extraction part, and a post-upsampling structure. The shallow feature $F_0$ is extracted from the input $I_{LR}$ as follows:

    $$F_0 = D_{\mathrm{SF}}(I_{LR}), \tag{1}$$

    where $D_{\mathrm{SF}}(\cdot)$ denotes the shallow feature extractor, specifically a single convolution operation.

    The subsequent part of MDRN integrates multiple FMDRBs, connected in a chain with feature distillation connections. This design gradually refines the initially extracted features, culminating in the deep features. The deep feature extraction part can be described as

    $$F_k = D_{\mathrm{DF}}^{k}(F_{k-1}), \quad k = 1, \ldots, n, \tag{2}$$

    where $D_{\mathrm{DF}}^{k}(\cdot)$ denotes the $k$-th FMDRB, and $F_{k-1}$ and $F_k$ are its input and output features, respectively. After this iterative refinement, a $1\times1$ convolution layer at the end of the feature extraction part assembles the fused distilled features, and a following $3\times3$ convolution layer smooths the inductive bias of the aggregated features, as follows:

    $$F_{\mathrm{fusion}} = D_{\mathrm{aggregated}}(\mathrm{Concat}(F_1, \ldots, F_n)), \tag{3}$$

    where $\mathrm{Concat}$ denotes channel-wise concatenation of all the distilled features, $D_{\mathrm{aggregated}}$ denotes the aggregation operation (a $1\times1$ convolution followed by a $3\times3$ convolution), and $F_{\mathrm{fusion}}$ is the fused and aggregated feature. Finally, the SR output $I_{SR}$ is generated by the reconstruction module as follows:

    $$I_{SR} = D_{\mathrm{REC}}(F_{\mathrm{fusion}} + F_0), \tag{4}$$

    where $D_{\mathrm{REC}}(\cdot)$ denotes the upscale reconstruction part. The initially extracted feature $F_0$ is added to the assembled feature $F_{\mathrm{fusion}}$ through a skip connection, and $I_{SR}$ is the network output. The upsampling reconstruction consists of a $3\times3$ convolution layer, whose number of output channels grows quadratically with the upscale factor, followed by a non-parametric sub-pixel shuffle operation.

    The shallow features predominantly contain low-frequency information, whereas the deep features focus on restoring the missing high-frequency information. The skip connection path lets MDRN transmit low frequencies directly to the reconstruction stage, which helps combine the two kinds of information and stabilizes training.
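    To make the data flow of Eqs. (1)-(4) concrete, the following is a minimal PyTorch sketch of the MDRN top level. It assumes the FMDRB module sketched in Section 2.2 below; the channel width, block count, and default scale are illustrative placeholders, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class MDRN(nn.Module):
    def __init__(self, in_ch=1, n_feats=64, n_blocks=6, scale=4):
        super().__init__()
        # Eq. (1): shallow feature extractor D_SF, a single convolution
        self.shallow = nn.Conv2d(in_ch, n_feats, 3, padding=1)
        # Eq. (2): chain of FMDRBs (see the Section 2.2 sketch)
        self.blocks = nn.ModuleList([FMDRB(n_feats) for _ in range(n_blocks)])
        # Eq. (3): 1x1 conv assembles the concatenated distilled features,
        # then a 3x3 conv smooths the aggregated features
        self.fuse = nn.Sequential(
            nn.Conv2d(n_feats * n_blocks, n_feats, 1),
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
        )
        # Eq. (4): output channels grow quadratically with the scale factor,
        # then the non-parametric sub-pixel shuffle rearranges them
        self.upsample = nn.Sequential(
            nn.Conv2d(n_feats, in_ch * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, x):
        f0 = self.shallow(x)                        # F_0
        feats, f = [], f0
        for block in self.blocks:                   # F_k = D_DF^k(F_{k-1})
            f = block(f)
            feats.append(f)
        fused = self.fuse(torch.cat(feats, dim=1))  # F_fusion
        return self.upsample(fused + f0)            # skip connection, then I_SR
```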

    Inspired by feature distillation and residual learning, we designed the core module, the feature multi-distillation residual block (FMDRB), which is more efficient and lightweight than traditional residual modules. Unlike the common residual block (two convolutions and one activation with an identity connection), the FMDRB adds a convolutional distillation path at each level and stacks improved residual blocks in the main chain as refinement layers that process coarse features gradually. The complete structure is as follows:

    $$\begin{aligned}
    F_{\mathrm{distilled}_1} &= D_1(F_{\mathrm{in}}), & F_{\mathrm{remain}_1} &= R_1(F_{\mathrm{in}}),\\
    F_{\mathrm{distilled}_2} &= D_2(F_{\mathrm{remain}_1}), & F_{\mathrm{remain}_2} &= R_2(F_{\mathrm{remain}_1}),\\
    F_{\mathrm{distilled}_3} &= D_3(F_{\mathrm{remain}_2}), & F_{\mathrm{remain}_3} &= R_3(F_{\mathrm{remain}_2}),\\
    F_{\mathrm{remain}_4} &= R_4(F_{\mathrm{remain}_3}), & &\\
    F_{\mathrm{out}} &= \mathrm{Concat}(F_{\mathrm{distilled}_1}, F_{\mathrm{distilled}_2}, F_{\mathrm{distilled}_3}, F_{\mathrm{remain}_4}), & &
    \end{aligned} \tag{5}$$

    where $D_i$ denotes the $i$-th distillation operation and $R_i$ the $i$-th layer for the remaining features. The output feature $F_{\mathrm{out}}$ fuses the right-most features processed in the main chain with the distilled features from the distillation paths. As the equations show, distillation works concurrently with residual learning; this structure is more efficient and flexible than the plain residual block commonly used, hence the name feature multi-distillation residual block.

    As shown in Figure 1, the feature distillation path at each level is a single $1\times1$ convolution layer that compresses the feature channels at a fixed ratio; we halve the number of input channels. Although most convolutions in SR models use a $3\times3$ kernel, a $1\times1$ convolution for channel reduction, as in many other CNN models, is far more efficient: reducing 64 channels to 32, for example, costs $64\times32\times9 = 18{,}432$ weights with a $3\times3$ kernel but only $2{,}048$ with a $1\times1$ kernel. Replacing the convolutions in the distillation paths therefore reduces the parameter count significantly. The convolutions in the main body of MDRN still use a $3\times3$ kernel, which better refines the features in the main path and more effectively exploits spatial context.

    As shown in Figure 1, in addition to the improvements above, we adopt BSRB [20] as the base unit of the FMDRB, which allows more flexible residual learning than a common residual block. Specifically, it consists of a $3\times3$ blueprint separable convolution (BSConv) [21], an identity connection, and a ReLU activation layer. BSConv differs from the standard convolution: it is a $1\times1$ pointwise convolution followed by a $3\times3$ depthwise convolution.
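    The following PyTorch sketch assembles BSConv, the BSRB, and the FMDRB of Eq. (5). The trailing 1×1 fusion convolution, the CCA layer (sketched in the next subsection), and the outer residual connection reflect the block tail described in the text; treat that exact tail layout as an assumption rather than the authors' definitive design.

```python
import torch
import torch.nn as nn

class BSConv(nn.Module):
    """Blueprint separable convolution [21]: 1x1 pointwise, then 3x3 depthwise."""
    def __init__(self, ch):
        super().__init__()
        self.pw = nn.Conv2d(ch, ch, 1)
        self.dw = nn.Conv2d(ch, ch, 3, padding=1, groups=ch)

    def forward(self, x):
        return self.dw(self.pw(x))

class BSRB(nn.Module):
    """Base unit [20]: BSConv with an identity connection and ReLU activation."""
    def __init__(self, ch):
        super().__init__()
        self.conv = BSConv(ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.conv(x) + x)

class FMDRB(nn.Module):
    """Feature multi-distillation residual block, following Eq. (5)."""
    def __init__(self, ch):
        super().__init__()
        d = ch // 2  # each distillation path halves the channel count via a 1x1 conv
        self.d1, self.r1 = nn.Conv2d(ch, d, 1), BSRB(ch)
        self.d2, self.r2 = nn.Conv2d(ch, d, 1), BSRB(ch)
        self.d3, self.r3 = nn.Conv2d(ch, d, 1), BSRB(ch)
        self.r4 = BSRB(ch)
        # assumed block tail: 1x1 fusion back to ch channels, CCA, outer residual
        self.fuse = nn.Conv2d(3 * d + ch, ch, 1)
        self.cca = CCALayer(ch)  # sketched in the next subsection

    def forward(self, x):
        s1, x1 = self.d1(x), self.r1(x)    # distill / refine, level 1
        s2, x2 = self.d2(x1), self.r2(x1)  # level 2
        s3, x3 = self.d3(x2), self.r3(x2)  # level 3
        x4 = self.r4(x3)                   # final refinement
        out = self.fuse(torch.cat([s1, s2, s3, x4], dim=1))  # F_out of Eq. (5)
        return self.cca(out) + x
```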

    The initial concept of channel attention, widely recognized as the squeeze-and-excitation (SE) module, has been used extensively in image processing tasks. There, the significance of a feature map is predominantly determined by the activation of high-value regions, since these areas are critical for classification or detection, so global average or maximum pooling is commonly used to capture global information in such high- or mid-level vision tasks. While average pooling can indeed raise the PSNR value, it cannot retain the structural, textural, and edge information that is crucial for image detail (as reflected in SSIM) [19]. As illustrated in Figure 1, the contrast-aware channel attention module is designed specifically for low-level vision: we replace global average pooling with the sum of the standard deviation and the mean, which evaluates the contrast of a feature map. Let $X = [x_1, x_2, \ldots, x_c, \ldots, x_C]$ denote the input, consisting of $C$ feature maps of spatial size $H \times W$. The contrast information value can then be calculated by

    $$z_c = H_{GC}(x_c) = \sqrt{\frac{1}{HW}\sum_{(i,j)\in x_c}\left(x_c^{i,j} - \frac{1}{HW}\sum_{(i,j)\in x_c} x_c^{i,j}\right)^2} + \frac{1}{HW}\sum_{(i,j)\in x_c} x_c^{i,j}, \tag{6}$$

    where $z_c$ is the $c$-th element of the output and $H_{GC}(\cdot)$ denotes the global contrast information evaluation function. With the assistance of the CCA module, our network can steadily improve super-resolution accuracy.
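    A minimal sketch of the CCA layer implementing Eq. (6): the per-channel standard-deviation-plus-mean statistic replaces the SE module's global average pooling, and the usual two-layer gating follows. The channel-reduction ratio of 16 is an assumed SE-style default, not a value taken from the paper.

```python
import torch
import torch.nn as nn

class CCALayer(nn.Module):
    """Contrast-aware channel attention: the (std + mean) statistic of Eq. (6)
    in place of the SE module's global average pooling."""
    def __init__(self, ch, reduction=16):  # reduction ratio is an assumed default
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(ch, ch // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # z_c = std(x_c) + mean(x_c), computed over the H x W spatial extent
        mean = x.mean(dim=(2, 3), keepdim=True)
        std = ((x - mean) ** 2).mean(dim=(2, 3), keepdim=True).sqrt()
        return x * self.gate(std + mean)
```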

    We used the public clinical dataset from The Cancer Imaging Archive [22], available at https://www.cancerimagingarchive.net/collection/vestibular-schwannoma-seg/ and named MRI-brain below. The dataset contains labeled MRI images of 242 patients diagnosed with vestibular schwannoma who received Gamma Knife radiation treatment. The images were acquired on a 32-channel Siemens Avanto 1.5 T scanner. We used 5000 slices of the MRI-brain dataset as the training set and the remaining 1000 slices as the testing set; since one patient contributes approximately 140-160 slices, the dataset is sufficient for training and testing.

    In data preprocessing, we first converted the raw DICOM files to NumPy voxel arrays. Second, the pixel values were clipped to at most 2000 and normalized to the range [0, 1]. Third, we used bicubic interpolation as the degradation function that produces the LR image from the original HR image. The preprocessing workflow is shown in Figure 2.

    Figure 2.  Preprocessing workflow of our data.
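    A sketch of this preprocessing pipeline, assuming pydicom for reading DICOM files and OpenCV's bicubic resize as the degradation function; the lower clip bound of 0 and the per-slice handling are assumptions.

```python
import numpy as np
import pydicom
import cv2

def preprocess_slice(dicom_path, scale=4, clip_max=2000.0):
    """DICOM -> clipped, normalized HR array -> bicubic-degraded LR array."""
    voxels = pydicom.dcmread(dicom_path).pixel_array.astype(np.float32)
    hr = np.clip(voxels, 0.0, clip_max) / clip_max   # clip, then normalize to [0, 1]
    h, w = hr.shape
    lr = cv2.resize(hr, (w // scale, h // scale),
                    interpolation=cv2.INTER_CUBIC)   # bicubic HR -> LR degradation
    return lr, hr
```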

    We trained our model with an initial learning rate of $5\times10^{-4}$, updated by a StepLR scheduler, minimizing the L1 loss. To reduce the training burden, we cropped $192\times192$ patches from the whole HR images as network inputs. We used the Adam optimizer with $\beta_1$ = 0.9 and $\beta_2$ = 0.99. The entire MDRN procedure took approximately 48 h (20,000 iterations per epoch, 200 epochs) for training and evaluation on the MRI dataset on a single GeForce RTX 3090 GPU with 24 GB of memory.
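    A minimal sketch of this training setup. The optimizer settings, learning rate, loss, and patch size follow the description above; the StepLR step size and decay factor, the batch handling, and the `loader` object are assumptions.

```python
import torch
import torch.nn as nn

model = MDRN().cuda()  # from the Section 2.1 sketch
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, betas=(0.9, 0.99))
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)
criterion = nn.L1Loss()

def random_patch(hr, lr, patch=192, scale=4):
    """Crop a random 192x192 HR patch and the spatially aligned LR patch."""
    lp = patch // scale  # sample on the LR grid so the two crops stay aligned
    i = torch.randint(0, lr.shape[-2] - lp + 1, (1,)).item()
    j = torch.randint(0, lr.shape[-1] - lp + 1, (1,)).item()
    return (hr[..., i * scale:(i + lp) * scale, j * scale:(j + lp) * scale],
            lr[..., i:i + lp, j:j + lp])

for epoch in range(200):
    for hr_full, lr_full in loader:  # `loader` iterates over preprocessed slice pairs
        hr, lr = random_patch(hr_full.cuda(), lr_full.cuda())
        loss = criterion(model(lr), hr)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```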

    Following previous works, peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) were used to assess the model's performance. These evaluation metrics are calculated as follows:

    $$\mathrm{PSNR} = 10\log_{10}\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right), \quad \mathrm{MSE} = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\left[I_x(i,j) - I_y(i,j)\right]^2, \tag{7}$$
    $$\mathrm{SSIM} = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}. \tag{8}$$
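    For reference, Eq. (7) translates directly into a few lines of NumPy for images normalized to [0, 1] (so MAX = 1); for Eq. (8), a windowed implementation such as skimage.metrics.structural_similarity is the standard choice rather than a hand-rolled one.

```python
import numpy as np

def psnr(x, y, max_val=1.0):
    """Eq. (7) for two same-shaped images; MAX = 1 after [0, 1] normalization."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)
```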

    We verified the effectiveness of each proposed component of MDRN on the same dataset under the same experimental setting. Table 1 itemizes the performance of the resulting model variants.

    Table 1.  Ablation study of the different components. The best PSNR values on the 4× dataset are listed below.

    Component                            Base   R1     R2     R3     R4     R5     R6     R7     Ours
    Multi-distillation (inside block)
    BSRB
    Using CCA
    Multi-distillation (outside block)
    PSNR                                 31.07  31.08  31.26  31.54  31.89  31.89  31.97  31.53  32.46


    The Base model is EDSR, i.e., common residual blocks stacked in a single path with one long skip connection, the basic design of most SR SOTA models. The result of R1 shows the effectiveness of the distillation path outside the FMDRB. The result of R2 verifies the effectiveness of the basic unit (BSRB): used alone, it already overtakes the model built from common residual blocks. The result of R3 shows the contribution of CCA. The results of R4 to R7, which toggle the feature distillation operation outside/inside the proposed FMDRB, the BSRB, and the CCA, all outperform the Base model, further verifying the effectiveness of each proposed component. When the basic residual units are simply stacked in a chain, the common structure in popular SR models, performance is lower; adding the feature distillation connections to the main chain of residual blocks, i.e., the full FMDRB, yields better performance.

    The distillation structure is useful not only inside the enhanced distillation block but also outside it. Comparing R6 (without the CCA layer) with the full model (with CCA), the CCA variant performs better, which verifies that the CCA layer maximizes the performance of the FMDRB.

    We place the contrast-aware channel attention block at the tail of the proposed FMDRB, which maximizes the capability of the module. To prove the effectiveness of this attention module, we compared it with other attention blocks, such as CA and IIA. As shown in Table 2, the results of the attention-block ablation show that CCA is effective and handles intermediate features best.

    Table 2.  Effects of different attention blocks.

    Attention block    w/o      CA       IIA      CCA
    PSNR               31.97    31.98    32.12    32.46
    SSIM               0.8767   0.8771   0.8778   0.8761


    The proposed MDRN inherits the advantages of the residual network and combines them with those of the feature distillation network. To demonstrate its performance, we compared our model with popular state-of-the-art SR models, including the NTIRE2017 winner EDSR [10], RCAN [11], the large-scale SAN [13], HAN [14], the novel IGAN [15], RFDN [23], and the recent DIPNet [24]. Since most SR SOTA models are evaluated on DIV2K, which consists of 3-channel natural images, the comparison cannot be taken directly from the cited papers; all methods were re-tested on the MRI-brain dataset, which consists of single-channel clinical images.

    Table 3 presents the quantitative comparison for 2×, 4×, and 8× SR. Our MDRN outperforms the existing methods on the MRI-brain test sets at all scales. Without tricks such as self-ensemble, MDRN still achieves significant improvements over recent advanced methods. Notably, our model is much better than EDSR, which shares a similar basic architecture with MDRN, and shows clear superiority over RFDN, which also uses a feature distillation strategy. MDRN also outperforms methods with more computationally intensive attention modules, such as SAN and IGAN. Specifically, MDRN improves PSNR by 1.82 dB over the EDSR baseline at 4× scale, and it surpasses DIPNet by up to 0.44 dB in PSNR.

    Table 3.  Comparison of quantitative results with state-of-the-art SR methods on the Brain Vestibular-Schwannoma dataset at 2×, 4×, and 8× scales. The best and second-best performances are in red and blue colors, respectively.

    Method       Memory [M]  Time (ms)  2× PSNR/SSIM     4× PSNR/SSIM      8× PSNR/SSIM
    Bicubic      --          --         33.66/0.9299     28.44/0.8159      24.40/0.6580
    EDSR [10]    2192.74     72.36      34.98*/0.9025*   30.64*/0.8697*    26.17*/0.7513*
    RCAN [11]    2355.20     498.26     38.27*/0.9614*   31.65**/0.9019*   26.21*/0.7778*
    SAN [13]     5017.60     805.23     34.85*/0.9318*   31.09*/0.8432*    25.39*/0.7359*
    IGAN [15]    2099.20     335.77     33.91*/0.9173*   31.73*/0.8744*    26.32*/0.7804*
    HAN [14]     5038.98     719.07     34.97*/0.9576*   31.03*/0.8424*    25.66*/0.7612*
    RFDN [23]    813.06      49.51      38.31**/0.9620*  31.98*/0.8795*    26.28*/0.7794*
    DIPNet [24]  521.02      28.79      38.27**/0.9614*  32.02**/0.8712*   26.33*/0.7884*
    Ours         325.21      27.88      39.19/0.9686     32.46/0.8761      26.47/0.8696
    *p < 0.05, **p < 0.001


    The efficiency of an SR model can be assessed through various metrics, such as the number of parameters, runtime, computational complexity (FLOPs), and GPU memory consumption, each of which matters for deployment in a different setting. Among them, runtime is the most direct indicator of a network's efficiency and serves as our primary evaluation metric. Memory consumption is also important because it determines whether the model can be deployed on an edge device: in a clinical setting, the SR MRI model will run on a small GPU, and models requiring large-memory GPUs will not work as intended. As shown in Table 3, our MDRN achieves the best PSNR (over 32 dB at 4×) while using only 325.21 M of GPU memory and 27.88 ms of runtime, a competitive advantage over the other methods. To assess the statistical validity of the results, we analyzed their significance: the p values in Table 3 were computed by treating the per-epoch results as collections of random variables.
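    For reproducibility, runtime and peak-memory numbers of the kind reported in Table 3 can be measured as in the following sketch, using CUDA events for timing and PyTorch's peak-memory counter; the input size, warm-up count, and run count are assumptions.

```python
import torch

@torch.no_grad()
def benchmark(model, lr_shape=(1, 1, 64, 64), warmup=10, runs=100):
    """Average per-image runtime (ms) and peak GPU memory (MiB) of a model."""
    x = torch.randn(*lr_shape, device="cuda")
    for _ in range(warmup):            # warm-up excludes one-off CUDA init costs
        model(x)
    torch.cuda.reset_peak_memory_stats()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(runs):
        model(x)
    end.record()
    torch.cuda.synchronize()           # wait so the event timings are valid
    ms = start.elapsed_time(end) / runs
    mib = torch.cuda.max_memory_allocated() / 2 ** 20
    return ms, mib
```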

    Table 4.  Comparison of quantitative results on other datasets.

    Method       BraTS-Gli PSNR/SSIM   BraTS-Meni PSNR/SSIM
    Bicubic      32.94/0.9099          30.25/0.8689
    EDSR [10]    36.35*/0.9610*        33.33*/0.9196*
    RCAN [11]    36.94**/0.9513*       33.86*/0.9160*
    SAN [13]     37.06*/0.9514*        34.02*/0.9191*
    IGAN [15]    37.09*/0.9620*        34.13*/0.9217*
    HAN [14]     37.33*/0.9521*        33.83*/0.9197*
    RFDN [23]    38.17**/0.9600**      34.08**/0.9214*
    DIPNet [24]  38.38**/0.9623*       34.17*/0.9218*
    Ours         38.92/0.9635          34.25/0.9225
    *p < 0.05, **p < 0.001


    For a more intuitive demonstration of the gap between these methods, we show zoomed comparisons of their results. As shown in Figure 3, we randomly selected results from the test set for evaluation. Taking "img_050112" as an example, most SR methods reconstruct the general composition, but only IGAN and MDRN recover the finer textures and sharper edges. In the zoomed details of "img_05011", IGAN, SAN, and RFDN fail to clearly restore the small vessels, while our MDRN does (red arrows). Additionally, in "img_05024", MDRN is closer to the ground truth, recovering the cerebrospinal fluid without generating blurring artifacts (yellow arrows). Our MDRN outputs more high-frequency information, such as contrast-enhanced edges, than the other methods. These visual results verify that MDRN surpasses previous works in representing and recovering complex features.

    Figure 3.  Visual comparison of SR methods at 4× scale on the MRI-brain dataset, with zoomed details and colored visualizations for better comparison.

    Deep learning-based methods have proven effective in medical image processing, including SR reconstruction of MR images. Targeting the computational bottleneck of the SR task, we proposed MDRN, a novel lightweight and fast SR model based on multi-distillation residual learning.

    Figure 4 provides an overview of the performance and computational efficiency of the proposed method versus the others. MDRN clearly achieves the best execution time. SAN and HAN, which rely on self-attention, have computational complexity $O(n^2)$, whereas the other models are $O(n)$; the quadratic complexity in the query/key/value sequence length $n$ leads to high computational costs when self-attention is applied with a global receptive field. For a precise assessment of the computational cost of our method, we compare it against several representative open-source models using quantitative metrics, as shown in Table 3. The quantitative results show that MDRN consumes fewer computational resources while maintaining a PSNR above 32 dB; MDRN thus offers a better trade-off between performance and cost.

    Figure 4.  Comparison of computation efficiency and performance between our method and other methods.

    We conducted generalization experiments by applying the super-resolution model trained on head-and-neck magnetic resonance imaging (MRI) images to pelvic CT images, to validate the model's generalization performance on different datasets (Table 5). The results show that our model achieves a PSNR of 31.4 dB on the pelvic dataset at a 4× magnification factor, indicating that MDRN generalizes favorably and can complete super-resolution tasks on new datasets. Visual quality is shown in Figure 5.

    Table 5.  Generalization analysis on pelvic CT images.

    Scale   2×      4×      8×
    PSNR    36.55   32.35   27.79
    SSIM    0.8882  0.8938  0.8928

    Figure 5.  Visual quality of SR results on pelvic CT images for generalization study.

    In this paper, we proposed MDRN, a lightweight CNN model for efficient and fast MRI super-resolution built on an innovative multi-distillation strategy. Our findings show remarkable superiority of MDRN over current SR methods, supported by both quantitative metrics and visual evidence. Notably, by integrating the feature distillation mechanism into the network architecture, MDRN excels at learning discriminative features and strikes a better balance between computational efficiency and reconstruction performance. Extensive evaluations on an MRI-brain dataset underline the favorable performance of MDRN over existing methods in both computational cost and accuracy for medical scenarios.

    We declare that we have not used generative AI tools to generate the scientific writing of this paper.

    We declare that we have no known financial interests or personal relationships that could have appeared to influence the work reported in this paper. There is no professional or other personal interest of any kind in any product, service or company that could influence the work reported in this paper.



    [1] Z. Peng, J. He, D. Cui, Q. Li, J. Qiu, Study of dual-phase drive synchronization method and temperature measurement algorithm for measuring external surface temperatures of ethylene cracking furnace tubes, Appl. Petrochem. Res., 8 (2018), 163–172. https://doi.org/10.1007/s13203-018-0205-x doi: 10.1007/s13203-018-0205-x
    [2] Y. Li, W. Chen, H. Yan, X. Li, Learning graph-based embedding for personalized product recommendation, Chin. J. Comput., 42 (2019), 1767–1778. https://doi.org/10.11897/SP.J.1016.2019.01767 doi: 10.11897/SP.J.1016.2019.01767
    [3] C. Yan, C. Wang, Development and application of convolutional neural network model, J. Front. Comput. Sci. Technol., 15 (2021), 27–46. https://doi.org/10.3778/j.issn.1673-9418.2008016 doi: 10.3778/j.issn.1673-9418.2008016
    [4] J. Galvis, S. Morales, C. Kasmi, F. Vega, Denoising of video frames resulting from video interface leakage using deep learning for efficient optical character recognition, IEEE Lett. Electromagn. Compat. Pract. Appl., 3 (2021), 82–86. https://doi.org/10.1109/LEMCPA.2021.3073663
    [5] X. Liu, B. Hu, Q. Chen, X. Wu, J. You, Stroke sequence-dependent deep convolutional neural network for online handwritten Chinese character recognition, IEEE Trans. Neural Netw. Learn. Syst., 31 (2020), 4637–4648. https://doi.org/10.1109/TNNLS.2019.2956965 doi: 10.1109/TNNLS.2019.2956965
    [6] R. Khanna, D. Oh, Y. Kim, Through-wall remote human voice recognition using doppler radar with transfer learning, IEEE Sens. J., 19 (2019), 4571–4576. https://doi.org/10.1109/JSEN.2019.2901271 doi: 10.1109/JSEN.2019.2901271
    [7] B. Sisman, J. Yamagishi, S. King, H. Li, An overview of voice conversion and its challenges: from statistical modeling to deep learning, IEEE-ACM Trans. Audio Speech Lang., 29 (2021), 132–157. https://doi.org/10.1109/TASLP.2020.3038524 doi: 10.1109/TASLP.2020.3038524
    [8] R. He, X. Wu, Z. Sun, T. Tan, Wasserstein CNN: learning invariant features for NIR-VIS face recognition, IEEE Trans. Pattern Anal. Mach. Intell., 41 (2019), 1761–1773. https://doi.org/10.1109/TPAMI.2018.2842770 doi: 10.1109/TPAMI.2018.2842770
    [9] L. Zhang, J. Liu, B. Zhang, D. Zhang, C. Zhu, Deep cascade model-based face recognition: when deep-layered learning meets small data, IEEE Trans. Image Process., 29 (2020), 1016–1029. https://doi.org/10.1109/TIP.2019.2938307 doi: 10.1109/TIP.2019.2938307
    [10] H. Li, P. Wang, C. Shen, Toward end-to-end car license plate detection and recognition with deep neural networks, IEEE Trans. Intell. Transp. Syst., 20 (2019), 1126–1136. https://doi.org/10.1109/TITS.2018.2847291 doi: 10.1109/TITS.2018.2847291
    [11] W. Wang, J. Yang, M. Chen, P. Wang, A light CNN for end-to-end car license plates detection and recognition, IEEE Access, 7 (2019), 173875–173883. https://doi.org/10.1109/ACCESS.2019.2956357 doi: 10.1109/ACCESS.2019.2956357
    [12] Y. Deng, T. Zhang, G. Lou, X. Zheng, J. Jin, Q. Han, Deep learning-based autonomous driving systems: a survey of attacks and defenses, IEEE Trans. Ind. Inform., 17 (2021), 7897–7912. https://doi.org/10.1109/TII.2021.3071405 doi: 10.1109/TII.2021.3071405
    [13] S. Kuutti, R. Bowden, Y. Jin, P. Barber, S. Fallah, A survey of deep learning applications to autonomous vehicle control, IEEE Trans. Intell. Transp. Syst., 22 (2021), 712–733. https://doi.org/10.1109/TITS.2019.2962338 doi: 10.1109/TITS.2019.2962338
    [14] A. Krizhevsky, I. Sutskever, G. Hinton, ImageNet classification with deep convolutional neural networks, Commun. ACM, 60 (2017), 84–90. https://doi.org/10.1145/3065386 doi: 10.1145/3065386
    [15] I. Hammad, K. El-Sankary, Impact of approximate multipliers on VGG deep learning network, IEEE Access, 6 (2018), 60438–60444. https://doi.org/10.1109/ACCESS.2018.2875376 doi: 10.1109/ACCESS.2018.2875376
    [16] H. Zhu, M. Sun, H. Fu, N. Du, J. Zhang, Training a seismogram discriminator based on ResNet, IEEE Trans. Geosci. Remote Sensing, 59 (2021), 7076–7085. https://doi.org/10.1109/TGRS.2020.3030324 doi: 10.1109/TGRS.2020.3030324
    [17] Z. Ma, G. He, Y. Yuan, Fume hood window state recognition method based on few-shot deep learning, J. East China Univ. Sci. Technol., 46 (2020), 428–435. https://doi.org/10.14135/j.cnki.1006-3080.20190412004 doi: 10.14135/j.cnki.1006-3080.20190412004
    [18] C. Y. Hsu, Y. Qiao, C. Wang, S. T. Chen, Machine learning modeling for failure detection of elevator doors by three-dimensional video monitoring, IEEE Access, 8 (2020), 211595–211609. https://doi.org/10.1109/ACCESS.2020.3037185 doi: 10.1109/ACCESS.2020.3037185
    [19] D. Y. Choi, B. C. Song, Facial micro-expression recognition using two-dimensional landmark feature maps, IEEE Access, 8 (2020), 121549–121563. https://doi.org/10.1109/ACCESS.2020.3006958 doi: 10.1109/ACCESS.2020.3006958
    [20] Y. Ma, P. Tang, L. Zhao, Z. Zhang, Review of data augmentation for image in deep learning, J. Image Graphics, 26 (2021), 487–502. https://doi.org/10.11834/jig.200089 doi: 10.11834/jig.200089
    [21] Y. Zuo, Y. Shen, Regularization of the Tanh activation function, J. Zhoukou Normal Univ., 37 (2020), 23–27. https://doi.org/10.13450/j.cnki.jzknu.2020.05.006 doi: 10.13450/j.cnki.jzknu.2020.05.006
    [22] Q. Zeng, D. Tan, F. Wang, Improved convolutional neural network based on fast exponentially linear unit activation function, IEEE Access, 7 (2019), 151359–151367. https://doi.org/10.1109/ACCESS.2019.2948112 doi: 10.1109/ACCESS.2019.2948112
    [23] J. Tian, Y. Li, T. Li, Contrastive study of activation function in convolutional neural network, J. Syst. Appl., 27 (2018), 43–49. https://doi.org/10.15888/j.cnki.csa.006463 doi: 10.15888/j.cnki.csa.006463
    [24] S. G. Zadeh, M. Schmid, Bias in cross-entropy-based training of deep survival networks, IEEE Trans. Pattern Anal. Mach. Intell., 43 (2021), 3126–3137. https://doi.org/10.1109/TPAMI.2020.2979450 doi: 10.1109/TPAMI.2020.2979450
    [25] H. Fu, Y. Chi, Y. Liang, Guaranteed recovery of one-hidden-layer neural networks via cross entropy, IEEE Trans. Signal Process., 68 (2020), 3225–3235. https://doi.org/10.1109/TSP.2020.2993153 doi: 10.1109/TSP.2020.2993153
    [26] W. Liu, X. Liang, H. Qu, Learning performance of convolutional neural networks with different pooling models, J. Image Graphics, 21 (2016), 1178–1190. https://doi.org/10.11834/jig.20160907 doi: 10.11834/jig.20160907
    [27] M. Li, H. Li, J. Chen, Adam optimization algorithm based on differential privacy protection, Comput. Appl. Softw., 21 (2020), 253–258. https://doi.org/10.3969/j.issn.1000-386x.2020.06.044 doi: 10.3969/j.issn.1000-386x.2020.06.044
    [28] G. Huang, Z. Liu, L. V. D. Maaten, K. Q. Weinberger, Densely connected convolutional networks, in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (2017), 2261–2269. https://doi.org/10.1109/CVPR.2017.243
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
