
The use of computational models in medical image classification is a steadily growing trend, driven by the goal of helping medical professionals reach swift and precise diagnoses. Since the COVID-19 pandemic, many researchers have focused on better classification and diagnosis of lung diseases in particular, as lung conditions were reported to be among the diseases most severely affecting human beings. This study introduces a computer-assisted model tailored for the classification of 13 lung diseases from chest radiograph images using deep learning techniques. The work flows from data collection and image quality enhancement through feature extraction to a comparative analysis of classification performance. For data collection, an open-source dataset of 112,000 chest X-ray images was used. Since image quality was critical to the work, it was enhanced through preprocessing techniques such as Otsu-based binary conversion, noise reduction driven by contrast limited adaptive histogram equalization, and Canny edge detection. Feature extraction incorporates connected regions, histogram of oriented gradients, the gray-level co-occurrence matrix and the Haar wavelet transformation, complemented by feature selection via regularized neighbourhood component analysis. The paper proposes an optimized hybrid model, improved Aquila optimization convolutional neural networks (CNN), which combines an optimized CNN and DENSENET121 with batch equalization, providing novelty compared with other similar works. A comparative evaluation of classification performance among CNN, DENSENET121 and the proposed hybrid model is also carried out.
The findings highlight the proposed hybrid model's superiority, with 97.00% accuracy, 94.00% precision, 96.00% sensitivity, 96.00% specificity and a 95.00% F1-score. Future avenues include exploring explainable machine learning to interpret model decisions and optimizing performance through strategic model restructuring.
Citation: Binju Saju, Neethu Tressa, Rajesh Kumar Dhanaraj, Sumegh Tharewal, Jincy Chundamannil Mathew, Danilo Pelusi. Effective multi-class lung disease classification using the hybrid feature engineering mechanism[J]. Mathematical Biosciences and Engineering, 2023, 20(11): 20245-20273. doi: 10.3934/mbe.2023896
[1] Cong Lin, Yiquan Huang, Wenling Wang, Siling Feng, Mengxing Huang. Lesion detection of chest X-Ray based on scalable attention residual CNN. Mathematical Biosciences and Engineering, 2023, 20(2): 1730-1749. doi: 10.3934/mbe.2023079
[2] Boyang Wang, Wenyu Zhang. MARnet: multi-scale adaptive residual neural network for chest X-ray images recognition of lung diseases. Mathematical Biosciences and Engineering, 2022, 19(1): 331-350. doi: 10.3934/mbe.2022017
[3] Akansha Singh, Krishna Kant Singh, Michal Greguš, Ivan Izonin. CNGOD-An improved convolution neural network with grasshopper optimization for detection of COVID-19. Mathematical Biosciences and Engineering, 2022, 19(12): 12518-12531. doi: 10.3934/mbe.2022584
[4] Michael James Horry, Subrata Chakraborty, Biswajeet Pradhan, Maryam Fallahpoor, Hossein Chegeni, Manoranjan Paul. Factors determining generalization in deep learning models for scoring COVID-CT images. Mathematical Biosciences and Engineering, 2021, 18(6): 9264-9293. doi: 10.3934/mbe.2021456
[5] Yingjian Yang, Wei Li, Yingwei Guo, Nanrong Zeng, Shicong Wang, Ziran Chen, Yang Liu, Huai Chen, Wenxin Duan, Xian Li, Wei Zhao, Rongchang Chen, Yan Kang. Lung radiomics features for characterizing and classifying COPD stage based on feature combination strategy and multi-layer perceptron classifier. Mathematical Biosciences and Engineering, 2022, 19(8): 7826-7855. doi: 10.3934/mbe.2022366
[6] Boyang Wang, Wenyu Zhang. ACRnet: Adaptive Cross-transfer Residual neural network for chest X-ray images discrimination of the cardiothoracic diseases. Mathematical Biosciences and Engineering, 2022, 19(7): 6841-6859. doi: 10.3934/mbe.2022322
[7] Hui Li, Xintang Liu, Dongbao Jia, Yanyan Chen, Pengfei Hou, Haining Li. Research on chest radiography recognition model based on deep learning. Mathematical Biosciences and Engineering, 2022, 19(11): 11768-11781. doi: 10.3934/mbe.2022548
[8] Guoli Wang, Pingping Wang, Jinyu Cong, Benzheng Wei. MRChexNet: Multi-modal bridge and relational learning for thoracic disease recognition in chest X-rays. Mathematical Biosciences and Engineering, 2023, 20(12): 21292-21314. doi: 10.3934/mbe.2023942
[9] Shenghan Li, Linlin Ye. Multi-level thresholding image segmentation for rubber tree secant using improved Otsu's method and snake optimizer. Mathematical Biosciences and Engineering, 2023, 20(6): 9645-9669. doi: 10.3934/mbe.2023423
[10] Zhenwu Xiang, Qi Mao, Jintao Wang, Yi Tian, Yan Zhang, Wenfeng Wang. Dmbg-Net: Dilated multiresidual boundary guidance network for COVID-19 infection segmentation. Mathematical Biosciences and Engineering, 2023, 20(11): 20135-20154. doi: 10.3934/mbe.2023892
Today, especially in the medical field, digital image processing has become significant in the diagnosis and classification of various diseases. Information technology, along with electronic healthcare systems, has gained considerable relevance in the medical industry in recent years, since it helps practitioners give patients the best possible care [1]. Accurate analysis of medical images, including cancer and tumor detection, segmentation and quantification, is crucial in clinical applications [2]. The classification of medical images is recognized as one of the essential applications of artificial intelligence in healthcare. Numerous studies also seek to automate medical diagnosis or e-warning tasks using deep-learning algorithms, which achieve performance comparable to human experts [3].
A vital step in the diagnosis and treatment process is medical imaging. A technician on the medical team examines the gathered data and generates a report describing what was observed in the examination. The referring doctor reviews the report and prescribes a treatment plan based on these images and the technician's filed findings. Medical imaging is typically required as part of the scientific procedure for confirming the efficacy of a therapy [4]. X-ray analysis is a typical diagnostic procedure that aids in evaluating various disorders. An X-ray is a type of radiation similar to visible light, but with energy high enough to penetrate many objects easily. X-rays can form images of body parts such as tissues, organs and bones on media like film [5]. An image is taken by the X-ray machine by passing high-energy electromagnetic rays through the body part being recorded. Several studies have investigated machine learning strategies for predicting diagnostic information from X-ray images. This is a critical moment to address the issue, given the availability of computing power and the vast amounts of publicly accessible records. Many advances in X-ray technology have helped produce high-quality X-ray images [6].
A chest X-ray is one of the most efficient medical diagnostic modes for identifying various chest-related diseases. After the COVID-19 pandemic, the World Health Organization (WHO) reported that the number of patients suffering from various lung diseases had increased exponentially, and many researchers began focusing mainly on the diagnosis and effective classification of lung diseases. For these reasons, the chest X-ray became the main resource for classification procedures. The diseases identifiable from chest X-rays range from COVID-19, nodules, cardiomegaly, pneumonia, hernia, fibrosis, edema, emphysema, consolidation, pneumothorax, pleural thickening, effusion and masses to infiltration. Because so many diseases can appear in a chest X-ray, medical experts may find it difficult to identify the exact disease. One solution to this problem is a more efficient deep learning-based model that gives diagnosis suggestions to the medical expert.
The main objective of the optimization task is to find the best possible partition that increases throughput and minimizes the communication cost between accelerators while training deep neural networks (DNNs); it also minimizes the training error. Machine learning and deep learning models have many hyperparameters, and their settings have a huge impact on the model. Traditional trial-and-error and enumeration methods are inefficient for setting them. An effective solution is to introduce an intelligent optimization algorithm that tunes the model's hyperparameters to improve prediction accuracy.
The Aquila optimizer (AO) has the advantages of strong global search ability, high search efficiency and fast convergence speed [7]. The AO algorithm is a typical swarm intelligence (SI) algorithm that simulates four predator-prey behaviors of the Aquila through four strategies: selecting the search space by high soar with a vertical stoop; exploring within a divergent search space by contour flight with a short glide attack; exploiting within a convergent search space via low flight with a slow descent attack; and swooping, walking and grabbing prey. The effect of the Levy flight function leaves the Aquila's search of the solution space insufficient, so it tends to fall into local optima, and its local exploitation ability is weak. The traditional AO algorithm also initializes population positions randomly with the rand function, which cannot ensure a uniform distribution over the solution space and thus reduces the algorithm's efficiency.
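To make the exploration/exploitation split and the Levy-flight step concrete, the following is a minimal toy sketch, not the paper's improved AO nor a faithful four-strategy implementation. The schedule, step scales and the `ao_like_minimize` name are illustrative assumptions; only the Mantegna formula for Levy-stable steps is standard.

```python
import numpy as np
from math import gamma, sin, pi

def levy_step(dim, beta=1.5, rng=None):
    # Mantegna's algorithm for heavy-tailed Levy-flight step sizes
    rng = rng or np.random.default_rng()
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def ao_like_minimize(f, lo, hi, pop=20, iters=200, seed=0):
    """Toy AO-flavoured minimizer: wide exploration early, Levy-flight
    exploitation around the best solution later, greedy replacement."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    dim = lo.size
    X = rng.uniform(lo, hi, (pop, dim))      # random init (the step improved AO replaces)
    fit = np.array([f(x) for x in X])
    b = int(fit.argmin())
    best, best_fit = X[b].copy(), fit[b]
    for t in range(iters):
        shrink = 1.0 - t / iters             # exploration -> exploitation schedule
        for i in range(pop):
            if rng.random() < shrink:        # explore: uniform perturbation of the best
                cand = best + shrink * rng.uniform(-1, 1, dim) * (hi - lo) * 0.1
            else:                            # exploit: Levy-flight walk near the best
                cand = best + 0.01 * levy_step(dim, rng=rng) * (hi - lo)
            cand = np.clip(cand, lo, hi)
            fc = f(cand)
            if fc < fit[i]:                  # greedy replacement
                X[i], fit[i] = cand, fc
                if fc < best_fit:
                    best, best_fit = cand.copy(), fc
    return best, best_fit
```

The heavy tail of the Levy distribution occasionally produces a large jump, which is exactly the mechanism the text credits with escaping (or, when too weak, failing to escape) local optima.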
In this work, chest X-ray images from a reliable open-source dataset serve as the model's input; after the necessary pre-processing, potential features are extracted and feature selection is performed. With the efficient classification model thus built, lung diseases can be effectively classified. The remainder of the article is organized as follows: Section 2 reviews recent related research; the research gap and contributions are explained in Section 3; Section 4 presents the materials and methods, including the dataset and the steps of the proposed methodology; Section 5 covers the experimentation and analysis, including the experimental setup, parameters and performance evaluation; finally, the conclusion and future scope are discussed in Section 6.
Deep learning has transformed the computer vision field by enabling highly accurate and efficient image recognition, object detection and segmentation. Artificial neural networks (ANNs) and DNNs are commonly used techniques for classification. The two fundamental steps of image preprocessing are noise removal and image enhancement. Depending on intensity or color, the segmentation step divides the image into background and foreground. Finally, feature extraction condenses the information down to the most crucial aspects, saving time and resources. The study [8] proposed a method that combined a standard softmax classifier with a classifier based on a transfer learning algorithm to evaluate classification efficiency, applied to classifying tissue in cancer images. Combining a support vector machine (SVM) classifier with a softmax connection layer via the transfer learning algorithm improved classification accuracy.
Chowdhury et al. [9] presented a deep convolutional neural network (DCNN) model using a transfer learning algorithm for computer-aided identification of pneumonia. The authors built the model and evaluated its performance using chest X-ray images. CheXNet, a variant of DENSENET, fared better than the other networks when image augmentation was not used. Sreeja et al. [10] put forward a deep learning-based model to detect the presence of COVID-19 in a patient's chest X-ray. They used a CNN model along with a histogram of oriented gradients (HOG). The proposed model was compared with various statistical methods and its performance analyzed; it obtained an accuracy of 92.95%, recall of 85.00% and precision of 91.50%, more effective than the compared models and other models in the literature.
Soni et al. [11] proposed a hybrid model to properly diagnose pulmonary disease, considered one of the most widespread diseases in the world; it generally affects the lungs as blockage due to other lung diseases. The work combined a space transformer network with a CNN, naming the model the space transformer network convolutional neural network (STNCNN). The model was implemented using the National Institutes of Health (NIH) chest X-ray dataset, openly available on Kaggle. The suggested model was compared with vanilla grey, vanilla red-green-blue (RGB) and CNN models. On the sample dataset, the proposed model obtained an accuracy of 69.00%, whereas vanilla grey achieved 67.00%, vanilla RGB 69.50% and CNN 63.80%. The proposed model also had a shorter execution time. Indumathi and Siva [12] proposed a hybrid of a mask-regional convolutional neural network (M-RCNN) combined with bidirectional long short-term memory (BiLSTM) and the crystal algorithm. Initially, the dependencies were captured using a bidirectional LSTM acting as the fully connected layer of the M-RCNN. Three datasets, the COVID-19 radiograph dataset, the Tuberculosis (TB) chest X-ray dataset and the NIH chest X-ray dataset, were used for training and testing. The evaluation covered BiLSTM, M-RCNN, LSTM, BiLSTM + M-RCNN and the proposed BiLSTM + M-RCNN + crystal algorithm hybrid, which obtained accuracies of 90.54, 89.56, 95.95, 93.53 and 99.00%, respectively. The proposed model also performed well on other metrics such as precision, specificity and the ROC curve.
Shamrat et al. [13] proposed a model termed LungNet22 with a structure similar to VGG16, aiming to classify 10 different lung diseases from chest X-ray images. The study compared eight deep learning models for classifying lung diseases; VGG16 achieved the highest classification accuracy, 92.95%. With the Adam optimizer applied to VGG16, LungNet22 obtained a higher accuracy of 98.89%. Rajagopal et al. [14] proposed an effective classification model for detecting lung diseases from chest radiograph (CXR) images, implemented using a deep convolutional spiking neural network. The work started with preprocessing using anisotropic diffusion filter-based unsharp masking, with noise removed using a crispening scheme. An empirical wavelet transform was then applied to extract features, which were fed into a discontinuity capturing shallow neural network (DCSNN), with the model's weights optimized via the arithmetic optimization algorithm (AOA). The proposed model obtained an accuracy of 96.65%, specificity of 98.52%, F1-score of 91.09%, recall of 97.74% and precision of 93.61%.
A deep learning method based on transfer learning was proposed by Kim et al. [15] to categorize lung illnesses in CXR images. The approach is a one-step, end-to-end learning method that feeds raw CXR images directly into a deep learning model (EfficientNet v2-M) to extract significant characteristics for disease categorization. On the three classes of normal, pneumonia and pneumothorax in the American NIH dataset, the authors obtained validation performances of loss = 0.6933, accuracy = 82.15%, sensitivity = 81.40% and specificity = 91.65%. On the normal, pneumonia, pneumothorax and tuberculosis classes, testing accuracy was 63.60%, with sensitivity and specificity of 82.20%, 81.40% and 94.48%. For automatic lung disease categorization from chest X-ray images, Farhan and Yang [16] proposed a novel hybrid deep learning algorithm (HDLA) framework. The model includes automatic feature extraction, detection and pre-processing of CXR images. Using highly effective 1-D feature estimates from the input images, the suggested 2-D CNN model ensures robust feature learning; min-max scaling is applied to the extracted 1-D features because of their large scale variability. According to the experimental findings, the suggested model improved accuracy by 3.10% over the existing literature.
In a paper by Buragadda et al. [17], UNET, a U-shaped encoder-decoder network architecture, and cyclic generative adversarial networks (GANs) are combined to increase the dataset size before the multi-class classification. Most real-world data is unbalanced, which affects the overall architecture and performance; by performing segmentation with the UNET operation, the improved cyclic GAN process helps create a balanced dataset with enhanced or reconstructed CXR images. The proposed model, which integrated the semantic segmentation component "UNET" instead of pretrained models, achieved the highest accuracy of 97.19%, around 1.50% higher than the basic model. The pretrained EfficientNetB0, EfficientNetB1 and EfficientNetB2 models are the multichannel models employed in a study by Ravi et al. [18]. The characteristics of the EfficientNet models are fused, passed through several fully connected non-linear layers and then fed into a stacked ensemble learning classifier for lung disease diagnosis: random forest and SVM in the early stages, logistic regression in the later stage. For pediatric pneumonia, TB and COVID-19 lung disease, the suggested technique demonstrated detection accuracies of 98.00, 99.00 and 98.00%, respectively.
In similar work, Ismael and Şengür [19] proposed a new CNN-based model for diagnosing COVID-19 from CXR images, using a dataset of 200 healthy and 180 COVID-19 chest X-rays. The model applied three deep CNN approaches: deep feature extraction, fine-tuning and end-to-end training of pre-trained CNN models. Five deep CNN models were used specifically for deep feature extraction, and an SVM classifier with four kernel functions classified the deep features. The deep features extracted from the ResNet50 model, combined with an SVM classifier and the linear kernel function, produced an accuracy of 94.70%; fine-tuning the ResNet50 model yielded 92.60%, and end-to-end training of the new CNN model 91.60%. In this work, the deep learning approaches were shown to outperform the alternatives.
A study by Blain et al. [20] suggested that applying deep learning approaches and tools to chest X-ray images is viable for determining the severity of lung diseases, including COVID-19. The study was carried out on 65 chest X-rays from a hospital in Italy, reviewed by two radiologists, of which 48 images were of COVID-19 patients. Deep learning models were used for lung segmentation and for detecting the level of opacity. Using Cohen's kappa analysis for interpretation, the concordance was 0.51 for alveolar opacities and 0.71 for interstitial opacities. The authors also showed that the severity of lung opacities was related to increased age, multimorbidity and the need for precautions.
Further related works were surveyed, varying in the methods used, the diseases classified and the accuracy attained. A hybrid quantum-classical CNN gave 98.60% [21], a deep CNN model gave 97.80% [22], a DFC mechanism with a CNN gave 96.00% [23], a GAN with transfer learning and LSTM gave 99.00% [24], DENSENET121, MobileNet and NasNet models based on optimistic-prediction majority voting gave 100.00% [25], a combination of five different pre-trained CNN models on three different datasets gave 96.10, 99.50 and 99.70%, respectively [26], and four different pre-trained deep learning models gave 99.31, 98.61, 97.22 and 95.13%, respectively [27].
The different diseases that can be identified from chest X-rays are pneumothorax, consolidation, infiltration, emphysema, atelectasis, effusion, fibrosis, pneumonia, pleural thickening, hernia, cardiomegaly, nodule mass and edema. After conducting a thorough literature study on the similar recent works conducted from 2020 onwards, the following research gaps were identified:
1) It was observed that the highest accuracy obtained for classifying the maximum number of the thirteen above-mentioned lung diseases is 82.15%, which could be increased for more efficient classification [8,9,10,11,12].
2) The research also identified over-fitting and under-fitting in the existing models, which could be eliminated [13,14,15].
3) Even models that classified lung diseases into only a few classes achieved unacceptably low accuracy [16,17,18,19].
4) The datasets used with existing models had fewer observations, which can directly or indirectly affect the quality of the work [20,21,22,23].
5) The majority of the similar deep learning models were not satisfactory in enhancing the image quality of X-rays [24,25,26].
6) Most of the existing works followed a single feature engineering mechanism, which could be made hybrid by employing multiple feature engineering mechanisms to improve accuracy.
Machine learning-based models can be a powerful tool for predicting diagnostic information from X-ray images, but this requires careful data collection, preprocessing, feature extraction and model selection, as well as rigorous testing and validation to ensure the model is accurate and reliable. In view of the research gaps identified, the proposed work provides the following contributions:
1) Since the images in the dataset are of different sizes, proper image resizing is done to make them uniform before further processing, which eventually yields justifiable observations.
2) The Otsu algorithm is used for binary conversion of images, as it makes machine learning models more effective than non-converted images.
3) X-ray images contain considerable speckle noise, which affects classification efficiency. Noise removal is done using contrast limited adaptive histogram equalization (CLAHE), one of the effective algorithms for speckle noise removal.
4) Edge detection is one of the most significant steps when classifying images. The Canny edge detector is used to extract the edges of the images.
5) Feature extraction is one of the important steps in the proposed methodology. A potential set of shape, texture and wavelet features was extracted with the following robust techniques: shape features using connected regions and HOG, texture features using the gray-level co-occurrence matrix (GLCM), and wavelet features using the Haar wavelet transformation (HWT).
6) Regularized neighborhood component analysis (RNCA) is applied for feature selection.
7) An optimized hybrid model consisting of CNN and DENSENET121, along with batch equalization, was used for a novel classification of thirteen types of lung diseases from chest X-ray images.
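To illustrate one of the texture-feature techniques named in the contributions, the following is a minimal numpy sketch of a gray-level co-occurrence matrix for a single pixel offset, with three classic Haralick-style statistics. It is an illustrative assumption, not the paper's implementation; the `levels` count, the offset and the feature set are choices made here for brevity.

```python
import numpy as np

def quantize(img, levels=8):
    # Rescale intensities to [0, levels-1] integer gray levels
    img = np.asarray(img, float)
    if img.max() > img.min():
        img = (img - img.min()) / (img.max() - img.min())
    else:
        img = np.zeros_like(img)
    return np.minimum((img * levels).astype(int), levels - 1)

def glcm(img, levels=8, dx=1, dy=0):
    """Normalized co-occurrence matrix for the single offset (dx, dy)."""
    q = quantize(img, levels)
    h, w = q.shape
    M = np.zeros((levels, levels))
    for y in range(h - dy):
        for x in range(w - dx):
            M[q[y, x], q[y + dy, x + dx]] += 1  # count pixel pairs
    return M / M.sum()

def glcm_features(M):
    i, j = np.indices(M.shape)
    return {
        "contrast":    float(((i - j) ** 2 * M).sum()),
        "energy":      float((M ** 2).sum()),
        "homogeneity": float((M / (1 + np.abs(i - j))).sum()),
    }
```

A flat region yields zero contrast and maximal energy, while a checkerboard pattern concentrates mass far from the matrix diagonal, giving high contrast; this is the sense in which GLCM statistics summarize texture.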
For medical imaging diagnosis, chest X-ray scanning is the most commonly used and cheapest method available to date. However, because of the non-uniformity of chest X-ray scan images, direct processing of the images is difficult. The National Institutes of Health (NIH) Clinical Center released an open-source dataset for clinical researchers to aid artificial-intelligence systems in effectively diagnosing and classifying various diseases. This dataset was made publicly available via https://nihcc.app.box.com/v/ChestXray-NIHCC/folder/36938765345. The NIH chest X-ray dataset consists of twelve files containing 112,120 images taken from X-rays of 30,805 patients. The images cover 14 classes: 13 lung diseases (pneumothorax, consolidation, infiltration, emphysema, atelectasis, effusion, fibrosis, pneumonia, pleural thickening, hernia, cardiomegaly, nodule mass and edema) plus normal lung images.
The effectiveness of the classification increases with the number of images in the dataset and with the amount of pre-processing and quality enhancement applied. Before the NIH Clinical Center released this dataset, the only large open-source dataset available was provided by Openi, with only 4143 images. The lack of resources for labeling large numbers of images was the crucial cause behind the scarcity of large datasets. In the NIH dataset, labeling was done using natural language processing (NLP) to text-mine disease classifications from the associated laboratory documents. The labels are estimated to be more than 90.00% accurate and well suited for supervised learning. A sample of the labeled dataset is provided in Figure 1.
The dataset was divided into two splits: 77.00% of the data (86,524 images) was used for training and the remaining 23.00% (25,596 images) for testing. The data table based on observations from the radiographical reports, with the corresponding frequencies, is drawn as Table 1.
Sl No | Labels | Observations | Frequency |
1 | No Finding | 60,361 | 0.426468 |
2 | Infiltration | 19,894 | 0.140557 |
3 | Effusion | 13,317 | 0.0940885 |
4 | Atelectasis | 11,559 | 0.0816677 |
5 | Nodule/Mass | 12,113 | 0.0447304 |
6 | Pneumothorax | 5302 | 0.0374602 |
7 | Consolidation | 4667 | 0.0329737 |
8 | Pleural_Thickening | 3385 | 0.023916 |
9 | Cardiomegaly | 2776 | 0.0196132 |
10 | Emphysema | 2516 | 0.0177763 |
11 | Edema | 2303 | 0.0162714 |
12 | Fibrosis | 1686 | 0.0119121 |
13 | Pneumonia | 1431 | 0.0101104 |
14 | Hernia | 227 | 0.00160382 |
A general machine learning pipeline was adopted for the study, described in detail in later sections. The pipeline begins with reading the data, followed by various preprocessing techniques, including grayscale conversion, image resizing, image conversions, noise removal, image enhancement and detection of edges and patterns [28,29]. After the necessary pre-processing, features are extracted and selected, then sent to the model. The model is trained with the data and then tested. Performance is evaluated after testing using the performance metrics accuracy, sensitivity, precision, specificity and F1-score. Figure 2 shows the general machine learning pipeline used in the proposed work.
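The staged structure described above can be sketched as a simple sequence of named callables. This is a generic pattern, not the paper's code; the stage bodies below are hypothetical stand-ins for the real resizing, grayscale and feature-extraction algorithms.

```python
import numpy as np

class Pipeline:
    """Run a fixed sequence of (name, callable) stages, each stage
    consuming the previous stage's output."""
    def __init__(self, steps):
        self.steps = steps

    def run(self, x):
        for name, fn in self.steps:
            x = fn(x)
        return x

# Hypothetical placeholder stages standing in for the real algorithms:
pipe = Pipeline([
    ("resize",   lambda im: im[:8, :8]),        # placeholder for real resizing
    ("gray",     lambda im: im.mean(axis=2)),   # naive channel-average grayscale
    ("features", lambda im: im.ravel()),        # flatten as a crude feature vector
])
```

Feeding a 16 × 16 RGB array through `pipe.run` yields a 64-element feature vector, mirroring how each block in Figure 2 hands its output to the next.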
Machine learning has been vital for classifying lung disorders and improving the precision and effectiveness of identifying lung diseases such as pneumonia, TB and lung cancer. Figure 3 portrays the methodology followed in the work: reading the chest X-ray images, followed by pre-processing, feature extraction, feature selection, classification and performance evaluation.
Pre-processing is considered the most significant part of every classification model, as it enhances the quality of the input, thereby improving the output quality. Figures 4 and 5 depict the pre-processing steps followed in this work.
Image resizing is a common pre-processing step in machine learning that alters the size of an image while preserving its aspect ratio. It improves training speed and brings uniformity to the input data, and in many studies has enhanced model accuracy. In this work, all chest X-ray images are resized to 224 × 224 pixels to make constructing the model faster and easier, as follows:
J = imresize(I, [224, 224]). | (1)
This method accepts the target pixel counts for the rows and columns and returns the image resized to 224 rows and 224 columns.
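For readers without MATLAB, a minimal numpy equivalent of the resize call is sketched below. Note this sketch uses nearest-neighbour sampling, whereas MATLAB's `imresize` defaults to bicubic interpolation; the function name `imresize_nn` is this sketch's own.

```python
import numpy as np

def imresize_nn(I, out_shape=(224, 224)):
    """Nearest-neighbour resize of a 2-D (or H x W x C) image array."""
    h, w = I.shape[:2]
    rows = np.arange(out_shape[0]) * h // out_shape[0]  # source row per output row
    cols = np.arange(out_shape[1]) * w // out_shape[1]  # source col per output col
    return I[np.ix_(rows, cols)]
```

Applied to a 100 × 150 image, this returns a 224 × 224 array whose pixels are copies of the nearest source pixels.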
Grayscale conversion is a common pre-processing step in image analysis and computer vision. It simplifies the algorithm and reduces computation. It converts a color image to a grayscale image, where each pixel indicates the brightness of the corresponding pixel in the color image. The steps involved are as follows:
1) Scan the image to extract the RGB color components of the input image into three 2D matrices;
2) Create a zero matrix of the same order as the RGB image;
3) Convert each pixel value to grayscale value using weighted sum method on corresponding pixel position such as (i, j) as follows:
grayscale value at (i, j) = 0.299 × R(i, j) + 0.587 × G(i, j) + 0.114 × B(i, j).  (2)
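The three steps above can be sketched directly in NumPy (the sample pixel values are illustrative):

```python
import numpy as np

def rgb_to_grayscale(rgb):
    """Weighted-sum (luminosity) conversion of Eq (2)."""
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]  # step 1: split channels
    gray = np.zeros(rgb.shape[:2])                   # step 2: zero matrix
    gray[:, :] = 0.299 * R + 0.587 * G + 0.114 * B   # step 3: weighted sum
    return gray

rgb = np.ones((2, 2, 3)) * np.array([100.0, 50.0, 200.0])
gray = rgb_to_grayscale(rgb)  # every pixel: 29.9 + 29.35 + 22.8 = 82.05
```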
Adaptive histogram equalization is employed in this study to convert the input images into grayscale-based images as in Figure 6. This will improve the contrast through several histograms by redistributing the image luminance values.
Otsu's method is a widely used technique for thresholding grayscale images to change them into binary images [30]. The binary image consists of pixels that are either white (with a value of 1) or black (with a value of 0), based on whether their intensity values exceed a certain threshold value. Otsu's method involves finding the best threshold value which separates the grayscale image into two classes (foreground and background). The threshold is chosen such that the variance between the two classes is maximized, while the variance within each class is minimized. The steps involved in Otsu's method for thresholding a grayscale image are as follows:
1) Form the histogram of the grayscale image showing the distribution of pixel intensities.
2) Normalize the histogram to determine the probability distribution function (PDF).
3) Calculate cumulative distribution function (CDF) of the PDF.
4) Find the mean intensity value of the whole image.
5) For each possible threshold value, calculate the between-class variance.
6) Choose the threshold value that maximizes the between-class variance.
σ²(t) = w₀(t)σ₀²(t) + w₁(t)σ₁²(t),  (3)
where w₀ and w₁ are the probabilities of the two classes separated by threshold t, which are calculated from the L bins of the histogram as shown below:
w₀(t) = Σ_{i=0}^{t−1} p(i),  (4)
w₁(t) = Σ_{i=t}^{L−1} p(i).  (5)
Once the optimal threshold value has been determined, the grayscale image can be converted into binary format by setting the intensity value of each pixel to 1 if its intensity value exceeds the threshold, and to 0 otherwise.
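A minimal NumPy sketch of the whole Otsu procedure follows; it scans every candidate threshold using the equivalent between-class form w₀w₁(μ₀ − μ₁)², and the two-cluster test image is illustrative:

```python
import numpy as np

def otsu_threshold(gray, levels=256):
    """Otsu's method: keep the threshold t that maximizes the
    between-class variance (equivalently, minimizes the weighted
    within-class variance of Eq (3))."""
    hist, _ = np.histogram(gray, bins=levels, range=(0, levels))
    p = hist / hist.sum()            # step 2: normalized histogram (PDF)
    weighted = np.arange(levels) * p
    best_t, best_var = 0, 0.0
    for t in range(1, levels):
        w0, w1 = p[:t].sum(), p[t:].sum()   # class probabilities, Eqs (4), (5)
        if w0 == 0 or w1 == 0:
            continue
        mu0 = weighted[:t].sum() / w0       # class means
        mu1 = weighted[t:].sum() / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Two well-separated intensity clusters: the threshold lands
# between them and the bright cluster maps to 1.
gray = np.array([[20, 25, 30], [200, 210, 220]])
t = otsu_threshold(gray)
binary = (gray > t).astype(np.uint8)
```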
CLAHE is an efficient algorithm for noise removal [31]. To reduce noise in the images, we used the CLAHE method along with high-boost filtering. The proposed technique employs an effective filter chain, consisting of a noise suppression filter, a high-pass filter and a CLAHE filter, to process the 2D X-ray images. The approach enhances the contrast of 2D X-ray images automatically, without user intervention, and provides tissue contrast customized for each examination in order to facilitate accurate diagnosis.
The CLAHE algorithm enhances the image quality based on clip limit as well as tile size, following Rayleigh distribution as given below:
pixelvalue(j) = pixelvalue_min + √(2λ² ln(1/(1 − sum(j)))),  (6)
where pixelvaluemin is a lower limit of pixel value, λ is Rayleigh's parameter for scaling and sum(j) is the cumulative probability.
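The clipping-and-redistribution idea at the heart of CLAHE can be sketched for a single tile as follows (full CLAHE applies this per tile of the image and interpolates between tiles; the clip limit and the low-contrast ramp image are illustrative):

```python
import numpy as np

def clipped_hist_equalize(gray, clip_limit=0.02, levels=256):
    """Single-tile sketch of CLAHE's core step: clip the histogram
    at clip_limit, redistribute the clipped excess uniformly over
    all bins, then equalize with the resulting CDF."""
    hist, _ = np.histogram(gray, bins=levels, range=(0, levels))
    p = hist / gray.size                          # normalized histogram
    excess = np.maximum(p - clip_limit, 0).sum()  # mass above the clip limit
    p = np.minimum(p, clip_limit) + excess / levels
    cdf = np.cumsum(p)                            # cumulative probability sum(j)
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[gray.astype(np.uint8)]

# A narrow-range ramp (low contrast) gets stretched over a much
# wider range of gray levels.
gray = np.repeat(np.arange(100, 140), 10).reshape(40, 10)
enhanced = clipped_hist_equalize(gray)
```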
Canny edge detection is one of the best edge detection algorithms for detecting edges in X-ray images [32]. The first step of the Canny operator edge detection technique is smoothing the image using a Gaussian filter; the next step is non-maximum suppression of the gradient amplitude along the gradient direction. The third step employs the double-threshold technique to identify and detect the edges. Figure 7 depicts the block diagram for these phases.
The Canny operator edge detection technique seeks local maxima of the gradient amplitude. Since the original Canny edge detection algorithm cannot handle color images, it is only used on grayscale images, and its parameters are very difficult to determine adaptively. Figure 5, provided in a previous section, illustrates the pre-processing followed in the study with sample images.
Diagnosing chest illness is difficult, and performance is low when the images are sent directly for classification. In order to achieve the best detection results, it is crucial to extract the appropriate features. Based on the literature survey, we identified that shape, texture and wavelet features are key features from which lung diseases can be differentiated much better than with existing frameworks; these are explained below in detail.
The object retrieval procedure makes use of connected regions, which are ideally suited for obtaining shape information from the input samples [33]. The formulas used to calculate shape features are given in Table 2 and sample histograms are shown in Figure 8.
Sl No | Feature | Equation
1 | Perimeter | Σ_{i,j=0}^{M,N} E_d(i, j)
2 | Area | Σ_{i,j=0}^{M,N} b(i, j)
3 | Circularity | 4πA / P²
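A possible NumPy sketch of the three shape features in Table 2, using a 4-connected boundary count for the perimeter (the 5 × 5 square mask is illustrative):

```python
import numpy as np

def shape_features(mask):
    """Table 2 features for one binary region: area = foreground
    pixel count, perimeter = count of foreground pixels with at
    least one 4-connected background neighbour, and
    circularity = 4*pi*A / P^2 (close to 1 for a disc)."""
    padded = np.pad(mask, 1)  # zero (background) border
    # Interior pixels have all four 4-connected neighbours set.
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    boundary = mask & ~interior
    area = int(mask.sum())
    perimeter = int(boundary.sum())
    circularity = 4 * np.pi * area / perimeter ** 2
    return area, perimeter, circularity

# 5 x 5 filled square: area 25, boundary ring of 16 pixels.
mask = np.zeros((7, 7), dtype=bool)
mask[1:6, 1:6] = True
area, perimeter, circ = shape_features(mask)
```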
Gray level co-occurrence matrix (GLCM) is an effective method for texture feature extraction from X-ray images [34]. The spatial arrangement of pixel values in the image corresponds to the texture features. The GLCM approach is preferred by the suggested model to obtain texture-based characteristics. The full set of features was composed of textural, first-order, statistical and higher-order textural features based on the GLCM, gray-level run length matrix (GLRLM), gray level size zone matrix (GLSZM) and neighboring gray tone difference matrix (NGTDM) of the image.
The GLCM method determines the precise relationship between two pixels in an image that are separated by a particular displacement. The suggested work extracts statistical features from the many features that the GLCM approach provides. Many features are extracted using the suggested feature extraction techniques, potentially increasing computing complexity; consequently, choosing an ideal feature collection is crucial for reducing it. The formulas for feature extraction in this study are shown in Table 3 and the flow is shown in Figure 9.
Sl No | Feature | Formula
1 | Contrast | Σ_{i,j=0}^{N−1} P_{i,j}(i − j)²
2 | Correlation | Σ_{i,j=0}^{N−1} P_{i,j}[(i − μᵢ)(j − μⱼ)/√(σᵢ²σⱼ²)]
3 | Energy | Σ_{i,j=0}^{N−1} P_{i,j}²
4 | Entropy | Σ_{i,j=0}^{N−1} P_{i,j}(−ln P_{i,j})
5 | Mean | μᵢ = Σ_{i,j=0}^{N−1} i·P_{i,j},  μⱼ = Σ_{i,j=0}^{N−1} j·P_{i,j}
6 | Variance | σᵢ² = Σ_{i,j=0}^{N−1} P_{i,j}(i − μᵢ)²,  σⱼ² = Σ_{i,j=0}^{N−1} P_{i,j}(j − μⱼ)²
7 | Standard Deviation | σᵢ = √σᵢ²,  σⱼ = √σⱼ²
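The GLCM construction and the Table 3 features for a single pixel offset can be sketched as follows (the two-level stripe image and the (0, 1) offset are illustrative):

```python
import numpy as np

def glcm_features(gray, levels, offset=(0, 1)):
    """Build the normalized GLCM P for one displacement and derive
    Table 3 features from it."""
    dr, dc = offset
    a = gray[:gray.shape[0] - dr, :gray.shape[1] - dc]  # reference pixels
    b = gray[dr:, dc:]                                  # displaced neighbours
    P = np.zeros((levels, levels))
    np.add.at(P, (a.ravel(), b.ravel()), 1)             # co-occurrence counts
    P /= P.sum()                                        # normalize to probabilities
    i, j = np.indices((levels, levels))
    contrast = (P * (i - j) ** 2).sum()
    energy = (P ** 2).sum()
    entropy = -(P[P > 0] * np.log(P[P > 0])).sum()
    mu_i, mu_j = (i * P).sum(), (j * P).sum()
    var_i = (P * (i - mu_i) ** 2).sum()
    var_j = (P * (j - mu_j) ** 2).sum()
    correlation = (P * (i - mu_i) * (j - mu_j)).sum() / np.sqrt(var_i * var_j)
    return contrast, correlation, energy, entropy

# Alternating 0/1 columns: horizontally adjacent pixels always
# differ, so contrast is 1 and correlation is -1.
stripes = np.tile(np.array([0, 1]), (4, 2))
contrast, correlation, energy, entropy = glcm_features(stripes, levels=2)
```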
The simplest wavelet transform technique is the Haar transform [35]. Using many stretches and shifts, the HWT technique cross-multiplies a function over the Haar wavelet. The HWT works by dividing an image into smaller sub-regions or blocks, called wavelets. Each wavelet is a scaled and shifted version of a simple waveform, called the mother wavelet. The transformation is carried out in a hierarchical manner, with each level of the transformation generating a set of coefficients that represent the frequency content of the image at that level. The mother wavelet function of Haar wavelet can be described as:
ψ(t) = 1 for 0 ≤ t < 1/2;  −1 for 1/2 ≤ t < 1;  0 otherwise.  (7)
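One level of the 2D Haar decomposition can be sketched with pairwise scaled sums and differences (the constant 4 × 4 block is illustrative; on it, all detail coefficients vanish):

```python
import numpy as np

def haar2d_level(x):
    """One level of the 2D Haar wavelet transform: pairwise scaled
    sums/differences along rows, then along columns, yielding the
    approximation (LL) and detail (LH, HL, HH) sub-bands."""
    def pairwise(a):
        lo = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)  # averaging filter
        hi = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)  # differencing filter
        return lo, hi
    lo, hi = pairwise(x)
    ll, hl = pairwise(lo.T)
    lh, hh = pairwise(hi.T)
    return ll.T, lh.T, hl.T, hh.T

# A constant block has no detail anywhere: all energy collapses
# into the LL (approximation) band.
x = np.full((4, 4), 5.0)
ll, lh, hl, hh = haar2d_level(x)
```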
By selecting the most useful characteristics, feature selection brings down the high dimensionality of the data [36]. Neighborhood components analysis (NCA) is used for feature selection in the proposed method. Using feature weights, this technique can rank features in the best possible way. The goal of this stage is to reduce computing complexity and improve classification accuracy. The suggested RNCA is a multi-level feature selection method: the relief algorithm followed by the NCA algorithm. The pseudocode of the relief and NCA algorithms is shown below (Algorithm 1):
Algorithm 1: Algorithm for RNCA
Input: Extracted features
Output: Reduced features
Feature dimensionality reduction: features, SD, threshold
START
featureout = features
for i = 1 to m do
    Decision1 = threshold / featureout[i]
    Decision2 = average / featureout[i]
    if (Decision1 > threshold and Decision2 > threshold) then
        featureout[i] = []
    endif
endfor
END
Instances are selected at random from the input dataset; for each selected instance, the algorithm updates every feature weight using the differences between that instance and two nearby instances, one of the same class and one of the opposite class. Relevant features are then retained based on the threshold value, and the number of features is thus reduced.
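The thresholded reduction step of Algorithm 1 can be sketched as follows (the feature-weight values and the threshold are illustrative; the relief and NCA weighting passes that precede this step are omitted):

```python
import numpy as np

def threshold_feature_reduction(features, threshold):
    """Sketch of the reduction step in Algorithm 1: a feature is
    dropped when both decision ratios (threshold / feature and
    mean / feature) exceed the threshold, i.e. when its weight is
    small relative to the population."""
    average = features.mean()
    keep = []
    for f in features:
        decision1 = threshold / f
        decision2 = average / f
        if not (decision1 > threshold and decision2 > threshold):
            keep.append(f)
    return np.array(keep)

# Feature weights (e.g. after the relief/NCA passes); the tiny
# weights are discarded.
weights = np.array([0.9, 0.05, 0.8, 0.01, 0.7])
reduced = threshold_feature_reduction(weights, threshold=2.0)
```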
There are several machine learning models that can be used to classify images using their features or properties. Our proposed model is a hybrid model: an optimized CNN_DENSENET121 (densely-connected convolutional neural network) with batch equalization.
The proposed hybrid model is a combination of CNN and DENSENET121. Initially, the CNN gathers the input of selected features, and the further processing is performed by DENSENET in the network system. The loss in the network is optimized using the improved Aquila optimization (IAO) technique, whose goal is to minimize the loss function through weight parameter updates. The combination of a convolutional neural network and a dense network (CNN + DENSENET121) is an effective approach for many modern image recognition tasks. The CNN acts as the frontend and the DENSENET acts as the backend: the CNN extracts the features by applying relevant filters, and the DENSENET analyzes these features, taking into consideration the information received from previous layers.
The network starts with a traditional 2D CNN followed by batch normalization, ELU activation, max-pooling and dropout with a dropout rate of 50%. Three such convolutional layers are placed sequentially with their corresponding activations. The convolutional layers are followed by the permute and reshape layers, which are necessary because the shape of the feature vector differs between the CNN and the DENSENET: the convolutional layers operate on 3D feature vectors, whereas the dense networks operate on 2D feature vectors. The permute layers change the order of the axes of the feature vectors, and the reshape layers then convert the result into a 2D feature vector. Finally, the output of these layers is fed to the time-distributed dense layers followed by the fully connected layer.
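The permute-and-reshape bridge between the convolutional frontend and the dense backend amounts to an axis reorder followed by a flatten; a shape-level sketch in NumPy (the batch and feature-map sizes here are illustrative, not the trained model's):

```python
import numpy as np

# Output of the convolutional stack: one 3D feature volume
# (height x width x channels) per image in the batch.
batch = np.random.rand(8, 28, 28, 64)

# Permute layer: reorder the axes (here to channels-first).
permuted = np.transpose(batch, (0, 3, 1, 2))

# Reshape layer: flatten each volume into a feature vector, giving
# the 2D (samples x features) matrix the dense backend works on.
reshaped = permuted.reshape(permuted.shape[0], -1)
```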
The CNN is a common deep neural network type for image classification, object recognition and other computer vision applications [37]. The input to a CNN model is typically a grayscale or RGB image, which is pre-processed by resizing and normalizing the pixel values to a common scale. The model parameters are updated using backpropagation over multiple epochs until the model converges to a stable set of parameters.
DENSENET belongs to the deep learning network class. Figure 10 narrates the architecture of DENSENET121. There are variations of the DENSENET, including DENSENET-121, DENSENET-169 and DENSENET-201 [38]. The number in the name refers to how many layers there are in the neural network. The 121 layers of DENSENET121 are as follows: a basic convolution layer with 64 filters of size 7 × 7 and a stride of 2; a basic pooling layer with 3 × 3 max pooling and a stride of 2; Dense Block 1 with 2 convolutions repeated 6 times; transition layer 1 (1 Conv + 1 AvgPool); Dense Block 2 with 2 convolutions repeated 12 times; transition layer 2 (1 Conv + 1 AvgPool); Dense Block 3 with 2 convolutions repeated 24 times; transition layer 3 (1 Conv + 1 AvgPool); Dense Block 4 with 2 convolutions repeated 16 times; a global average pooling layer, which accepts all the feature maps of the network to perform classification; and the output layer.
The DENSENET consists of many Dense Blocks [39], each of which has the same dimensions but a different number of filters. Between blocks, the transition layer applies batch normalization and down-sampling, a crucial step in the CNN.
The classification performance of CNN+DENSENET121 is influenced by the loss function used. Thus, the loss function is minimized by updating the weight parameters using the IAO approach [40]. The loss function is described as
LF = (1/T) Σ_{i=1}^{T} (aᵢ − k′ᵢ)²,  (8)
where T represents the total number of iterations, aᵢ is the actual value and k′ᵢ is the predicted value. The disease in the image is detected based on the objective function [41], which is expressed as
Objective function = Min[LF].  (9)
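Eqs (8) and (9) can be sketched as follows: the loss is a mean squared error and the optimizer simply prefers the candidate weights that yield the smaller loss (the label and prediction vectors are illustrative):

```python
import numpy as np

def loss_function(actual, predicted):
    """Mean squared error of Eq (8): LF = (1/T) * sum (a_i - k'_i)^2."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return ((actual - predicted) ** 2).mean()

# Eq (9): between candidate weight settings, IAO keeps the one
# producing the smaller loss.
a = np.array([1.0, 0.0, 1.0, 1.0])    # actual values
k1 = np.array([0.9, 0.1, 0.8, 1.0])   # predictions of candidate 1
k2 = np.array([0.5, 0.5, 0.5, 0.5])   # predictions of candidate 2
lf1, lf2 = loss_function(a, k1), loss_function(a, k2)
best = min(lf1, lf2)
```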
The goal of this IAO method is to minimize the loss function through weight parameter [42] updates. It is one of the population-based approaches and is motivated by the hunting behavior of the Aquila. Using IAO, the feature initialization is done by
features = [ l_{1,1} … l_{1,j} … l_{1,K−1} l_{1,K}
             l_{2,1} … l_{2,j} … l_{2,K}
             ⋮         l_{i,j}         ⋮
             l_{P−1,1} … l_{P−1,j} … l_{P−1,K}
             l_{P,1} … l_{P,j} … l_{P,K−1} l_{P,K} ].  (10)
The present feature solution is represented in Eq (10), in which l_{i,j} denotes the position of feature j in solution i, P denotes the overall number of features and the problem size is signified as K. After initialization, the IAO approach randomly creates the features [43], given as
F_{ij} = rand × (ub_j − lb_j) + lb_j,  i = 1, 2, …, P;  j = 1, 2, …, K,  (11)
where ub_j signifies the upper bound in the jth position, the lower bound in the jth position is indicated as lb_j and rand is a random number. The IAO approach performs the stages of expanded exploration, narrowed exploration, expanded exploitation, narrowed exploitation, weight strategy and normalization to obtain the objective function; these are explained in the following subsections.
1) Process of expanded exploration: Here [44], the search agent moves to search for effective features and chooses the optimal set of features for robust lung disease detection. It is given as
L₁(h+1) = L_best(h) × (1 − h/H) + (L_N(h) − L_best(h) × rand),  (12)
where L₁(h+1) is the solution of the next iteration, L_best(h) is the best solution obtained so far, the expanded exploration is maintained by applying the factor (1 − h/H), rand is a random number ranging from 0 to 1, the present iteration number is specified as h and the maximum number of iterations is indicated as H. The mean of the current solutions, L_N(h), is calculated as
L_N(h) = (1/P) Σ_{i=1}^{P} L_i(h),  ∀j = 1, 2, …, K,  (13)
where P refers to the total number of features, K mentions the problem size and the present iteration is represented as h.
2) Narrowed exploration process: In this process [45], the locations of features are efficiently targeted by the search agent. It then encircles the important features and selects them for the lung disease detection process. The formulation for updating the position of the search agent is determined as
L₂(h+1) = L_best(h) × γ(P) + L_r(h) + (x − y) × rand,  (14)
where L₂(h+1) represents the solution of the next iteration, P mentions the position of the feature, the Levy flight distribution function is denoted as γ(P) and L_r(h) represents a random solution. The Levy flight distribution is computed as
γ(P) = v × (g × β) / |ρ|^(1/δ),  (15)
where v specifies the constant value 0.7, and g and ρ are random numbers ranging between 0 and 1. The constant term β is computed as
β = Γ(1+δ) × sin(πδ/2) / (Γ((1+δ)/2) × δ × 2^((δ−1)/2)).  (16)
Expanded exploitation: The exploitation process [46] is expanded by the search agent through vertically descending to develop the feature attributes. The search agent then begins to choose the features that are appropriate for analyzing the disease in the given input images. Depending on size and quality, the search agent can select the features. Furthermore, the features that maintain the fitness function are chosen as optimal.
Narrowed exploitation: Based on the feature's robustness, the search agent can choose the significant features for the process of detection. In the final location, the search agent can select the features.
The computation of quality function [47] is evaluated as
Qualityfunction(h) = h^((2 × rand − 1)/(1 − H)²),  (17)
R₁ = 2 × rand − 1,  (18)
R₂ = 2 × (1 − h/H),  (19)
where Qualityfunction(h) mentions the value of the quality function at iteration h and rand denotes a random number ranging from 0 to 1. The performance of the optimization approach is enhanced by introducing a weight strategy.
3) Weight strategy: Each feature is in a varied position and initially the search space is very large. As the iterations approach the maximum, the features' distribution scope is compressed, making the search space small. The optimization algorithm can then move into a local optimum due to the reduced feature set. To enhance the diversity of features and recover the optimization algorithm from local optima, a weight factor is applied. Here [48], the objective function is evaluated at each iteration and the results attained in every iteration are compared with each other. By analyzing the best objective function, the optimal solution is obtained. The pseudocode of the IAO approach [49] is illustrated in Algorithm 2.
Algorithm 2: IAO approach
Initialization of features
Initialize the parameters such as δ, γ, ...
while (the termination condition is not satisfied) do
    Compute the objective function using Eq (13)
    L_best(h) = determine the robust solution based on the objective function
    for (i = 1, 2, …, P) do
        Update the current solution L_N(h)
        Update h, r, R₁, R₂ and γ(P)
        if rand ≤ 0.5 then
            Update the solution using Eq (14)
            if Objfunc(L₁(h+1)) < Objfunc(L(h)) then L(h) = L₁(h+1) end if
        else
            Update the solution using Eq (15)
            if Objfunc(L₂(h+1)) < Objfunc(L(h)) then L(h) = L₂(h+1) end if
        end if
        if rand ≤ 0.7 then
            Update the solution using Eq (16)
            if Objfunc(L₃(h+1)) < Objfunc(L(h)) then L(h) = L₃(h+1) end if
        else
            Update the solution using Eq (17)
            if Objfunc(L₄(h+1)) < Objfunc(L(h)) then
                Using Eq (15), compute the quality function
                Apply the weight strategy using Eq (16)
            end if
        end if
    end for
end while
return
4) Normalization [50] is a procedure to change the values of the numeric variables in the dataset to a common scale, without distorting differences in the ranges of values. The main benefit of batch processing in machine learning is that it can improve the accuracy and stability of model training and optimization. The input at each layer is whitened, which helps keep the distributions of inputs fixed and removes the ill effects of internal covariate shift.
Normalizing the activations [51] of the earlier layer means that the assumptions the subsequent layer makes about the spread and distribution of its inputs will not change significantly during the weight update. This has the effect of stabilizing and accelerating the training procedure of the DNN. This normalization may be applied to the input variables for the first hidden layer or to the activations of a hidden layer for deeper layers [52]. It can be used with most deep network types, for example, CNN and DENSENET.
The disease classes, ranging from 1 to 14, are not distributed uniformly. In the proposed experimental data, the number of images varies per class. When a mini-batch is designed randomly, the grading model obtains information from unbalanced occurrences and becomes biased. This issue is solved by applying a batch-balancing strategy: an equal number of occurrences is randomly chosen during the construction of each mini-batch, i.e., 25 images from every group, generating a fully balanced batch.
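The batch-balancing strategy can be sketched as equal-quota sampling per class (the class counts are illustrative; sampling rare classes with replacement is one possible way to fill the quota):

```python
import numpy as np

def balanced_minibatch(labels, per_class=25, seed=0):
    """Batch-balancing sketch: draw the same number of instance
    indices from every class so each mini-batch is fully balanced
    even when the class frequencies are heavily skewed."""
    rng = np.random.default_rng(seed)
    batch = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        # Sample with replacement so that rare classes can still
        # fill their quota of per_class slots.
        batch.extend(rng.choice(idx, size=per_class, replace=True))
    batch = np.array(batch)
    rng.shuffle(batch)
    return batch

# Skewed class counts (100 / 5 / 40) still yield 25 picks per class.
labels = np.array([0] * 100 + [1] * 5 + [2] * 40)
batch = balanced_minibatch(labels)
```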
Training and validation: The pre-processed and feature-extracted data are then used to train the chosen model. In this study, the dataset is divided into training and testing data: 80.00% of the data (89,696 images) was used for training and the remaining 20.00% (22,424 images) for testing.
Model evaluation: Once the model completes the training, its efficiency is checked by providing a test dataset and performance is measured using various evaluation metrics. This is crucial for ensuring that the model can generalize successfully with novel, previously unexplored data. Performance evaluation is discussed in detail in the next section. Overall, machine learning can be a powerful tool for detecting lung diseases, but it requires careful data collection, pre-processing, feature extraction and model selection, as well as rigorous testing and validation to ensure that the model is accurate and reliable.
The experimental setup is as shown in Figure 11 with 16 GB RAM and Intel Core i5 8th Gen CPU with 3.0 GHz speed. Anaconda 3.0 was used for implementing the proposed scheme.
The experimental setup is provided in Table 4. Parameter settings of CNN and DENSENET121 are shown in Table 5.
Manufacturer | Acer
Processor | Intel® Core™ i5-4670S CPU @ 3.10 GHz
RAM | 16.0 GB (15.9 GB usable)
System Type | 64-bit Operating System
Parameter | CNN | DENSENET121
Number of epochs | 20 | 10
Iterations | 500 | 500
Iterations per epoch | 25 | 50
The efficiency or caliber of the proposed model is assessed using a number of metrics, also known as performance metrics or evaluation metrics. Performance analysis of machine learning models is a crucial step in evaluating the effectiveness and efficiency of the model. It involves measuring the accuracy and other relevant metrics of the performance of the model on a test dataset, which is separate from the dataset used for training the model. In this study, the model performance is evaluated through accuracy, precision, sensitivity and ROC curve for CNN and DENSENET121.
Classification accuracy [53] is the number of correct predictions divided by the total number of predictions, multiplied by 100.
Accuracy = (Number of correct predictions / Total number of predictions) × 100.  (20)
Precision [54] is the proportion of positive predictions that are actually correct.
Precision = TP / (TP + FP),  (21)
where TP stands for true positive and FP stands for false positive.
Sensitivity [55] is measured as the proportion of actual positives identified correctly.
Sensitivity = TP / (TP + FN),  (22)
where FN stands for false negatives.
F1-score [56] is the harmonic mean of precision and sensitivity.
F1-score = 2 × Precision × Sensitivity / (Precision + Sensitivity).  (23)
Specificity [57] is the number of correct negative predictions divided by the total number of actual negatives.
Specificity = TN / (TN + FP),  (24)
where TN stands for true negative.
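Eqs (20)–(24) can be computed together from the four confusion-matrix counts; the counts below are illustrative, not the paper's results:

```python
def evaluation_metrics(tp, fp, tn, fn):
    """Eqs (20)-(24) from the confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn) * 100              # Eq (20)
    precision = tp / (tp + fp)                                    # Eq (21)
    sensitivity = tp / (tp + fn)                                  # Eq (22)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # Eq (23)
    specificity = tn / (tn + fp)                                  # Eq (24)
    return accuracy, precision, sensitivity, f1, specificity

acc, prec, sens, f1, spec = evaluation_metrics(tp=90, fp=10, tn=85, fn=15)
```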
Performance analysis in terms of accuracy, precision, sensitivity, F1-score and specificity of each class is displayed in Figures 12–16, respectively.
In the performance analysis, the proposed model is more effective in more than 12 classes of lung disease classification. Overall performance is also analyzed for all three models. Figure 17 shows the overall performance of the three models. The values of various performance metrics obtained by each model are compared.
Medical image classification is essential to many fields of medical study [58]. This study used deep learning models to create a comprehensive computer-aided model for classifying chest X-ray images. The proposed model is a hybrid model optimized with the improved AO, with batch equalization applied. Various preprocessing steps were executed to improve model performance, including binary conversion using the Otsu algorithm, noise removal using CLAHE and edge detection using Canny edge detection. Feature extraction was then performed, covering shape features using connected regions, texture features using GLCM and wavelet features using HWT, with feature selection done using RNCA. Two well-known deep learning models, CNN and DENSENET121, were implemented for the classification, where DENSENET121 was found to be more effective in classifying all 14 classes with respect to accuracy, precision and sensitivity. DENSENET121 performed with 85.00% accuracy, followed by CNN with 80.00%. DENSENET121 also proved more efficient in terms of precision and sensitivity. It is clearly visible from the study that DENSENET121 is very effective in classifying the 13 categories of diseases. Advanced methods such as explainable and interpretable machine learning can be used in the future to find the reason for classification based on the features selected, and performance can be further enhanced by restructuring the model according to the interpretations identified. The proposed model can be generalized to any chest X-ray image dataset; owing to the non-availability of a different dataset with all disease types, we could not test our model on other datasets. The model may not detect the existence of more than one lung disease in an input image; instead, the image is classified based on the highest intensity of infection among the diseases present.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Danilo Pelusi is an editorial board member for Mathematical Biosciences and Engineering and was not involved in the editorial review or the decision to publish this article. All authors declare that there are no competing interests.
[1] | A. Sinha, A. R P, M. Suresh, N. M. R, A. D, A. G. Singerji, Brain tumour detection using deep learning, in 2021 Seventh International conference on Bio Signals, Images, and Instrumentation (ICBSII), (2021), 1–5. https://doi.org/10.1109/ICBSII51839.2021.9445185 |
[2] | B. Saju, V. Asha, A. Prasad, V. A, A. S, S. P. Sreeja, Prediction analysis of hypothyroidism by association, in 2023 Third International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT), (2023), 1–6. https://doi.org/10.1109/ICAECT57570.2023.10117641 |
[3] |
H. Tang, Z. Hu, Research on medical image classification based on machine learning, IEEE Access, 8 (2020), 93145–93154. https://doi.org/10.1109/ACCESS.2020.2993887 doi: 10.1109/ACCESS.2020.2993887
![]() |
[4] |
S. K. Zhou, H. Greenspan, C. Davatzikos, J. S. Duncan, B. V. Ginneken, A. Madabhushi, et al., A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises, Proc. IEEE, 109 (2021), 820–838. https://doi.org/10.1109/JPROC.2021.3054390 doi: 10.1109/JPROC.2021.3054390
![]() |
[5] |
P. Uppamma, S. Bhattacharya, Deep learning and medical image processing techniques for diabetic retinopathy: A survey of applications, challenges, and future trends, J. Healthcare Eng., 2023 (2023), 2728719. https://doi.org/10.1155/2023/2728719 doi: 10.1155/2023/2728719
![]() |
[6] | Z. Shi, L. He, Application of neural networks in medical image processing, in Proceedings of the Second International Symposium on Networking and Network Security, (2010), 23–26. |
[7] |
L. Abualigah, D. Yousri, M. A. Elaziz, A. A. Ewees, M. A. A. Al-qaness, A. H. Gandomi, Aquila optimizer: A novel meta-heuristic optimization algorithm, Comput. Ind. Eng., 157 (2021), 107250. https://doi.org/10.1016/j.cie.2021.107250 doi: 10.1016/j.cie.2021.107250
![]() |
[8] |
H. Chen, M. M. Rogalski, J. N. Anker, Advances in functional X-ray imaging techniques and contrast agents, Phys. Chem. Chem. Phys., 14 (2012), 13469–13486. https://doi.org/10.1039/c2cp41858d doi: 10.1039/c2cp41858d
![]() |
[9] |
M. E. H. Chowdhury, T. Rahman, A. Khandakar, R. Mazhar, M. A. Kadir, Z. B. Mahbub, et al., Can AI help in screening viral and COVID-19 pneumonia?, IEEE Access, 8 (2020), 132665–132676. https://doi.org/10.1109/ACCESS.2020.3010287 doi: 10.1109/ACCESS.2020.3010287
![]() |
[10] | S. P. Sreeja, V. Asha, B. Saju, P. K. C, P. Manasa, V. C. R, Classifying chest X-rays for COVID-19 using deep learning, in 2023 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE), (2023), 1084–1089. https://doi.org/10.1109/IITCEE57236.2023.10090915 |
[11] |
M. Soni, S. Gomathi, P. Kumar, P. P. Churi, M. A. Mohammed, A. O. Salman, Hybridizing convolutional neural network for classification of lung diseases, Int. J. Swarm Intell. Res., 13 (2022), 1–15. https://doi.org/10.4018/IJSIR.287544 doi: 10.4018/IJSIR.287544
![]() |
[12] |
V. Indumathi, R. Siva, An efficient lung disease classification from X-ray images using hybrid Mask-RCNN and BiDLSTM, Biomed. Signal Process. Control, 81 (2023), 104340. https://doi.org/10.1016/j.bspc.2022.104340 doi: 10.1016/j.bspc.2022.104340
![]() |
[13] |
F. M. J. M. Shamrat, S. Azam, A. Karim, R. Islam, Z. Tasnim, P. Ghosh, et al., LungNet22: A fine-tuned model for multiclass classification and prediction of lung disease using X-ray images, J. Pers. Med., 12 (2022), 680. https://doi.org/10.3390/jpm12050680 doi: 10.3390/jpm12050680
![]() |
[14] |
R. Rajagopal, R. Karthick, P. Meenalochini, T. Kalaichelvi, Deep Convolutional Spiking Neural Network optimized with Arithmetic optimization algorithm for lung disease detection using chest X-ray images, Biomed. Signal Process. Control, 79 (2023), 104197. https://doi.org/10.1016/j.bspc.2022.104197 doi: 10.1016/j.bspc.2022.104197
![]() |
[15] |
S. Kim, B. Rim, S. Choi, A. Lee, S. Min, M. Hong, Deep learning in multi-class lung diseases' classification on chest X-ray images, Diagnostics, 12 (2022), 915. https://doi.org/10.3390/diagnostics12040915 doi: 10.3390/diagnostics12040915
![]() |
[16] |
A. M. Q. Farhan, S. Yang, Automatic lung disease classification from the chest X-ray images using hybrid deep learning algorithm, Multimed. Tools Appl., (2023), 38561–38587. https://doi.org/10.1007/s11042-023-15047-z doi: 10.1007/s11042-023-15047-z
![]() |
[17] |
S. Buragadda, K. S. Rani, S. V. Vasantha, M. K. Chakravarthi, HCUGAN: Hybrid cyclic UNET GAN for generating augmented synthetic images of chest X-Ray images for multi classification of lung diseases, Int. J. Eng. Trends Technol., 70 (2020), 249–253. http://doi.org/10.14445/22315381/IJETT-V70I2P227 doi: 10.14445/22315381/IJETT-V70I2P227
![]() |
[18] |
V. Ravi, V. Acharya, M. Alazab, A multichannel Efficient Net deep learning-based stacking ensemble approach for lung disease detection using chest X-ray images, Cluster Comput., 26 (2023), 1181–1203. https://doi.org/10.1007/s10586-022-03664-6 doi: 10.1007/s10586-022-03664-6
![]() |
[19] |
A. M. Ismael, A. Şengür, Deep learning approaches for COVID-19 detection based on chest X-ray images, Expert Syst. Appl., 164 (2021), 114054. https://doi.org/10.1016/j.eswa.2020.114054 doi: 10.1016/j.eswa.2020.114054
![]() |
[20] |
M. Blain, M. T. Kassin, N. Varble, X. Wang, Z. Xu, D. Xu, et al., Determination of disease severity in COVID-19 patients using deep learning in chest X-ray images, Diagn. Interv. Radiol., 27 (2021), 20–27. https://doi.org/10.5152/dir.2020.20205 doi: 10.5152/dir.2020.20205
![]() |
[21] |
E. H. Houssein, Z. Abohashima, M. Elhoseny, W. M. Mohamed, Hybrid quantum-classical convolutional neural network model for COVID-19 prediction using chest X-ray images, J. Comput. Design Eng., 9 (2022), 343–363. https://doi.org/10.1093/jcde/qwac003 doi: 10.1093/jcde/qwac003
![]() |
[22] |
W. A. Shalaby, W. Saad, M. Shokair, F. E. A. El-Samie, M. I. Dessouky, COVID-19 classification based on deep convolutional neural networks over a wireless network, Wireless Pers. Commun., 120 (2021), 1543–1563. https://doi.org/10.1007/s11277-021-08523-y doi: 10.1007/s11277-021-08523-y
![]() |
[23] |
W. Saad, W. A. Shalaby, M. Shokair, F. A. El-Samie, M. Dessouky, E. Abdellatef, COVID-19 classification using deep feature concatenation technique, J. Ambient Intell. Human. Comput., 13 (2022), 2025–2043. https://doi.org/10.1007/s12652-021-02967-7 doi: 10.1007/s12652-021-02967-7
![]() |
[24] |
S. Sheykhivand, Z. Mousavi, S. Mojtahedi, T. Y. Rezaii, A. Farzamnia, S. Meshgini, et al., Developing an efficient deep neural network for automatic detection of COVID-19 using chest X-ray images, Alexandria Eng. J., 60 (2021), 2885–2903. https://doi.org/10.1016/j.aej.2021.01.011 doi: 10.1016/j.aej.2021.01.011
![]() |
[25] | V. Agarwal, M. C. Lohani, A. S. Bist, D. Julianingsih, Application of voting based approach on deep learning algorithm for lung disease classification, in 2022 International Conference on Science and Technology (ICOSTECH), (2022), 1–7. https://doi.org/10.1109/ICOSTECH54296.2022.9828806 |
[26] |
A. Narin, C. Kaya, Z. Pamuk, Automatic detection of coronavirus disease (COVID-19) using X-ray images and deep convolutional neural networks, Pattern Anal. Appl., 24 (2021), 1207–1220. https://doi.org/10.1007/s10044-021-00984-y doi: 10.1007/s10044-021-00984-y
[27] V. Kumar, A. Zarrad, R. Gupta, O. Cheikhrouhou, COV-DLS: Prediction of COVID-19 from X-rays using enhanced deep transfer learning techniques, J. Healthcare Eng., 2022 (2022), 6216273. https://doi.org/10.1155/2022/6216273
[28] Q. Lv, S. Zhang, Y. Wang, Deep learning model of image classification using machine learning, Adv. Multimedia, 2022 (2022), 3351256. https://doi.org/10.1155/2022/3351256
[29] M. Xin, Y. Wang, Research on image classification model based on deep convolutional neural network, J. Image Video Process., 2019 (2019). https://doi.org/10.1186/s13640-019-0417-8
[30] A. H. Setianingrum, A. S. Rini, N. Hakiem, Image segmentation using the Otsu method in Dental X-rays, in 2017 Second International Conference on Informatics and Computing (ICIC), (2017), 1–6. https://doi.org/10.1109/IAC.2017.8280611
[31] S. Sahu, A. K. Singh, S. P. Ghrera, M. Elhoseny, An approach for de-noising and contrast enhancement of retinal fundus image using CLAHE, Opt. Laser Technol., 110 (2019), 87–98. https://doi.org/10.1016/j.optlastec.2018.06.061
[32] S. K. Jadwaa, X-Ray lung image classification using a Canny edge detector, J. Electr. Comput. Eng., 2022 (2022), 3081584. https://doi.org/10.1155/2022/3081584
[33] P. G. Bhende, A. N. Cheeran, A novel feature extraction scheme for medical X-ray images, Int. J. Eng. Res. Appl., 6 (2016), 53–60.
[34] P. K. Mall, P. K. Singh, D. Yadav, GLCM based feature extraction and medical X-ray image classification using machine learning techniques, in 2019 IEEE Conference on Information and Communication Technology, (2019), 1–6. https://doi.org/10.1109/CICT48419.2019.9066263
[35] S. Gunasekaran, S. Rajan, L. Moses, S. Vikram, M. Subalakshmi, B. Shudhersini, Wavelet based CNN for diagnosis of COVID-19 using chest X-ray, in First International Conference on Circuits, Signals, Systems and Securities, 1084 (2021). https://doi.org/10.1088/1757-899X/1084/1/012015
[36] W. Yang, K. Wang, W. Zuo, Neighborhood component feature selection for high-dimensional data, J. Comput., 7 (2012), 161–168.
[37] M. Ramprasath, M. V. Anand, S. Hariharan, Image classification using convolutional neural networks, Int. J. Pure Appl. Math., 119 (2018), 1307–1319.
[38] G. Wang, Z. Guo, X. Wan, X. Zheng, Study on image classification algorithm based on improved DENSENET, J. Phys.: Conf. Ser., 1952 (2021), 022011. https://doi.org/10.1088/1742-6596/1952/2/022011
[39] N. Hasan, Y. Bao, A. Shawon, Y. Huang, DENSENET convolutional neural networks application for predicting COVID-19 using CT image, SN Comput. Sci., 2 (2021), 389. https://doi.org/10.1007/s42979-021-00782-7
[40] B. Sasmal, A. G. Hussien, A. Das, K. G. Dhal, A comprehensive survey on Aquila optimizer, Arch. Comput. Methods Eng., 30 (2023), 4449–4476. https://doi.org/10.1007/s11831-023-09945-6
[41] S. Ekinci, D. Izci, E. Eker, L. Abualigah, An effective control design approach based on novel enhanced Aquila optimizer for automatic voltage regulator, Artif. Intell. Rev., 56 (2023), 1731–1762. https://doi.org/10.1007/s10462-022-10216-2
[42] M. H. Nadimi-Shahraki, S. Taghian, S. Mirjalili, L. Abualigah, Binary Aquila optimizer for selecting effective features from medical data: A COVID-19 case study, Mathematics, 10 (2022), 1929. https://doi.org/10.3390/math10111929
[43] A. A. Ewees, Z. Y. Algamal, L. Abualigah, M. A. A. Al-qaness, D. Yousri, R. M. Ghoniem, et al., A Cox proportional-hazards model based on an improved Aquila optimizer with whale optimization algorithm operators, Mathematics, 10 (2022), 1273. https://doi.org/10.3390/math10081273
[44] F. Gul, I. Mir, S. Mir, Aquila optimizer with parallel computation application for efficient environment exploration, J. Ambient Intell. Human. Comput., 14 (2023), 4175–4190. https://doi.org/10.1007/s12652-023-04515-x
[45] S. Akyol, A new hybrid method based on Aquila optimizer and tangent search algorithm for global optimization, J. Ambient Intell. Human. Comput., 14 (2023), 8045–8065. https://doi.org/10.1007/s12652-022-04347-1
[46] K. G. Dhal, R. Rai, A. Das, S. Ray, D. Ghosal, R. Kanjilal, Chaotic fitness-dependent quasi-reflected Aquila optimizer for superpixel based white blood cell segmentation, Neural Comput. Appl., 35 (2023), 15315–15332. https://doi.org/10.1007/s00521-023-08486-0
[47] A. Ait-Saadi, Y. Meraihi, A. Soukane, A. Ramdane-Cherif, A. B. Gabis, A novel hybrid chaotic Aquila optimization algorithm with simulated annealing for unmanned aerial vehicles path planning, Comput. Electr. Eng., 104 (2022), 108461. https://doi.org/10.1016/j.compeleceng.2022.108461
[48] S. Wang, H. Jia, L. Abualigah, Q. Liu, R. Zheng, An improved hybrid Aquila optimizer and Harris hawks algorithm for solving industrial engineering optimization problems, Processes, 9 (2021), 1551. https://doi.org/10.3390/pr9091551
[49] Y. Zhang, Y. Yan, J. Zhao, Z. Gao, AOAAO: The hybrid algorithm of arithmetic optimization algorithm with Aquila optimizer, IEEE Access, 10 (2022), 10907–10933. https://doi.org/10.1109/ACCESS.2022.3144431
[50] J. Zhong, H. Chen, W. Chao, Making batch normalization great in federated deep learning, preprint, arXiv: 2303.06530.
[51] M. Segu, A. Tonioni, F. Tombari, Batch normalization embeddings for deep domain generalization, Pattern Recognit., 135 (2023), 109115. https://doi.org/10.1016/j.patcog.2022.109115
[52] N. Talat, A. Alsadoon, P. W. C. Prasad, A. Dawoud, T. A. Rashid, S. Haddad, A novel enhanced normalization technique for a mandible bones segmentation using deep learning: batch normalization with the dropout, Multimed. Tools Appl., 82 (2023), 6147–6166. https://doi.org/10.1007/s11042-022-13399-6
[53] G. M. M. Alshmrani, Q. Ni, R. Jiang, H. Pervaiz, N. M. Elshennawy, A deep learning architecture for multi-class lung diseases classification using chest X-ray (CXR) images, Alexandria Eng. J., 64 (2023), 923–935. https://doi.org/10.1016/j.aej.2022.10.053
[54] F. J. M. Shamrat, S. Azam, A. Karim, K. Ahmed, F. M. Bui, F. De Boer, High-precision multiclass classification of lung disease through customized MobileNetV2 from chest X-ray images, Comput. Biol. Med., 155 (2023), 106646. https://doi.org/10.1016/j.compbiomed.2023.106646
[55] K. Subramaniam, N. Palanisamy, R. A. Sinnaswamy, S. Muthusamy, O. P. Mishra, A. K. Loganathan, et al., A comprehensive review of analyzing the chest X-ray images to detect COVID-19 infections using deep learning techniques, Soft Comput., 27 (2023), 14219–14240. https://doi.org/10.1007/s00500-023-08561-7
[56] S. Sharma, K. Guleria, A deep learning based model for the detection of pneumonia from chest X-Ray images using VGG-16 and neural networks, Procedia Comput. Sci., 218 (2023), 357–366. https://doi.org/10.1016/j.procs.2023.01.018
[57] V. T. Q. Huy, C. Lin, An improved DENSENET deep neural network model for tuberculosis detection using chest X-Ray images, IEEE Access, 11 (2023), 42839–42849. https://doi.org/10.1109/ACCESS.2023.3270774
[58] V. Sreejith, T. George, Detection of COVID-19 from chest X-rays using ResNet-50, J. Phys.: Conf. Ser., 1937 (2021), 012002. https://doi.org/10.1088/1742-6596/1937/1/012002
Sl No | Labels | Observations | Frequency
----- | ------ | ------------ | ---------
1 | No Finding | 60,361 | 0.426468
2 | Infiltration | 19,894 | 0.140557
3 | Effusion | 13,317 | 0.0940885
4 | Atelectasis | 11,559 | 0.0816677
5 | Nodule/Mass | 12,113 | 0.0447304
6 | Pneumothorax | 5302 | 0.0374602
7 | Consolidation | 4667 | 0.0329737
8 | Pleural_Thickening | 3385 | 0.023916
9 | Cardiomegaly | 2776 | 0.0196132
10 | Emphysema | 2516 | 0.0177763
11 | Edema | 2303 | 0.0162714
12 | Fibrosis | 1686 | 0.0119121
13 | Pneumonia | 1431 | 0.0101104
14 | Hernia | 227 | 0.00160382
Sl No | Feature | Equation
----- | ------- | --------
1 | Perimeter | $P = \sum_{i,j=0}^{M,N} E_d(i,j)$
2 | Area | $A = \sum_{i,j=0}^{M,N} b(i,j)$
3 | Circularity | $4\pi A / P^{2}$
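The three shape descriptors above can be sketched in plain NumPy. This is an illustrative reading only: here the edge map $E_d(i,j)$ is approximated by foreground pixels that touch the background, whereas the paper derives it from connected regions and Canny edges.

```python
import numpy as np

def shape_features(mask):
    """Area, perimeter and circularity of a binary region.

    Area counts foreground pixels b(i, j); the perimeter counts
    foreground pixels with at least one background 4-neighbour
    (a pixel-count approximation of the edge map E_d(i, j)).
    """
    mask = mask.astype(bool)
    area = int(mask.sum())

    # Pad so border pixels see background neighbours.
    padded = np.pad(mask, 1, constant_values=False)
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1] &
        padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    perimeter = int((mask & ~interior).sum())

    # Circularity 4*pi*A/P^2; pixel-count perimeters are approximate,
    # so digital shapes can exceed the continuous upper bound of 1.
    circularity = 4 * np.pi * area / perimeter**2 if perimeter else 0.0
    return area, perimeter, circularity

# A filled 5x5 square: 25 foreground pixels, 16 of them on the boundary.
square = np.ones((5, 5), dtype=bool)
area, perim, circ = shape_features(square)
print(area, perim, round(circ, 3))  # → 25 16 1.227
```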
Sl No | Feature | Formula
----- | ------- | -------
1 | Contrast | $\sum_{i,j=0}^{N-1} P_{i,j}(i-j)^{2}$
2 | Correlation | $\sum_{i,j=0}^{N-1} P_{i,j}\dfrac{(i-\mu_i)(j-\mu_j)}{\sqrt{\sigma_i^{2}\sigma_j^{2}}}$
3 | Energy | $\sum_{i,j=0}^{N-1} P_{i,j}^{2}$
4 | Entropy | $\sum_{i,j=0}^{N-1} P_{i,j}(-\ln P_{i,j})$
5 | Mean | $\mu_i = \sum_{i,j=0}^{N-1} i\,P_{i,j}$, $\mu_j = \sum_{i,j=0}^{N-1} j\,P_{i,j}$
6 | Variance | $\sigma_i^{2} = \sum_{i,j=0}^{N-1} P_{i,j}(i-\mu_i)^{2}$, $\sigma_j^{2} = \sum_{i,j=0}^{N-1} P_{i,j}(j-\mu_j)^{2}$
7 | Standard Deviation | $\sigma_i = \sqrt{\sigma_i^{2}}$, $\sigma_j = \sqrt{\sigma_j^{2}}$
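A minimal NumPy sketch of these GLCM features follows. It assumes a single co-occurrence offset (horizontal neighbour at distance 1) and a pre-quantized image; the paper does not fix these choices, so they are assumptions of this sketch.

```python
import numpy as np

def glcm_features(img, levels=8):
    """Gray-level co-occurrence matrix (horizontal offset, distance 1)
    and the features listed in the table above."""
    # Co-occurrence counts for the pixel pairs (img[r, c], img[r, c+1]).
    P = np.zeros((levels, levels))
    for a, b in zip(img[:, :-1].ravel(), img[:, 1:].ravel()):
        P[a, b] += 1
    P /= P.sum()  # normalise to joint probabilities P_{i,j}

    i, j = np.indices(P.shape)
    mu_i = (i * P).sum()
    mu_j = (j * P).sum()
    var_i = (P * (i - mu_i) ** 2).sum()
    var_j = (P * (j - mu_j) ** 2).sum()

    return {
        "contrast": (P * (i - j) ** 2).sum(),
        "correlation": (P * (i - mu_i) * (j - mu_j)).sum()
                       / np.sqrt(var_i * var_j),
        "energy": (P ** 2).sum(),
        "entropy": -(P[P > 0] * np.log(P[P > 0])).sum(),
        "mean": (mu_i, mu_j),
        "std": (np.sqrt(var_i), np.sqrt(var_j)),
    }

quantized = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [2, 2, 3, 3]])
feats = glcm_features(quantized, levels=4)
```

For this diagonal-heavy toy image the contrast is low (1/3) and the correlation high (0.9), as expected for strongly co-varying neighbour pairs.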
Algorithm 1: Algorithm for RNCA
Input: Extracted features
Output: Reduced features
Feature dimensionality reduction: features, SD, threshold
START
featureout = features
for i = 1 to m do
    Decision1 = threshold / featureout[i]
    Decision2 = average / featureout[i]
    if (Decision1 > threshold and Decision2 > threshold) then
        featureout[i] = []
    end if
end for
END
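Algorithm 1 leaves its scoring quantities underspecified. A hedged Python reading, which treats the two decisions as ratios over each feature column's standard deviation and mean (an assumption of this sketch, not the paper's exact rule), might look like:

```python
import numpy as np

def reduce_features(features, threshold):
    """Drop feature columns flagged by both ratio tests of Algorithm 1.

    Decision1 compares the threshold against the column's spread (SD);
    Decision2 compares the global average against the column's mean.
    Columns failing either test are kept.
    """
    features = np.asarray(features, dtype=float)
    avg = features.mean()  # "average" in Algorithm 1
    keep = []
    for idx in range(features.shape[1]):
        col = features[:, idx]
        decision1 = threshold / (col.std() + 1e-12)
        decision2 = avg / (col.mean() + 1e-12)
        if not (decision1 > threshold and decision2 > threshold):
            keep.append(idx)
    return features[:, keep]

# Column 0 is near-constant and tiny, so both ratios exceed the
# threshold and it is removed; column 1 survives.
X = np.array([[0.01, 5.0],
              [0.02, 6.0]])
reduced = reduce_features(X, threshold=2.0)
print(reduced.shape)  # → (2, 1)
```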
Algorithm 2: AO approach
Initialize the features
Initialize the parameters such as δ, γ, ...
while (the termination condition is not satisfied) do
    Compute the objective function using Eq (13)
    Lbest(h) = the best solution found so far according to the objective function
    for i = 1, 2, ..., P do
        Update the current solution LN(h)
        Update h, r, R1, R2 and γ(P)
        if rand ≤ 0.5 then
            Update the solution using Eq (14)
            if Objfunc(L1(h+1)) < Objfunc(L(h)) then L(h) = L1(h+1) end if
        else
            Update the solution using Eq (15)
            if Objfunc(L2(h+1)) < Objfunc(L(h)) then L(h) = L2(h+1) end if
        end if
        if rand ≤ 0.7 then
            Update the solution using Eq (16)
            if Objfunc(L3(h+1)) < Objfunc(L(h)) then L(h) = L3(h+1) end if
        else
            Update the solution using Eq (17)
            if Objfunc(L4(h+1)) < Objfunc(L(h)) then
                L(h) = L4(h+1)
                Compute the quality function using Eq (15)
                Apply the weight strategy using Eq (16)
            end if
        end if
    end for
end while
return Lbest(h)
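The improvement-gated structure of the AO loop can be illustrated with a greatly simplified, generic search. This is a sketch of the control flow only: the paper's update rules Eqs (13)–(17), the quality function, and the weight strategy are replaced here by placeholder exploitation/exploration moves.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    """Toy objective: sum of squares, minimum 0 at the origin."""
    return float((x ** 2).sum())

def aquila_like_search(obj, dim=2, pop=20, iters=200, lb=-5.0, ub=5.0):
    """Simplified AO-style loop: each candidate either contracts toward
    the best solution (exploitation) or takes a heavy-tailed random jump
    (exploration); a move is kept only if it improves the objective,
    mirroring the Objfunc(...) comparisons in Algorithm 2."""
    X = rng.uniform(lb, ub, (pop, dim))
    fit = np.array([obj(x) for x in X])
    best = X[fit.argmin()].copy()
    for _ in range(iters):
        for i in range(pop):
            if rng.random() <= 0.5:
                # Exploitation: narrow move around the best solution.
                cand = best + rng.normal(0, 0.1, dim) * (best - X[i])
            else:
                # Exploration: Levy-flight-like Cauchy jump.
                cand = X[i] + 0.1 * rng.standard_cauchy(dim)
            cand = np.clip(cand, lb, ub)
            if obj(cand) < fit[i]:       # greedy acceptance
                X[i], fit[i] = cand, obj(cand)
                if fit[i] < obj(best):
                    best = X[i].copy()
    return best, obj(best)

best, val = aquila_like_search(sphere)
```

Swapping `sphere` for the classification objective (Eq (13)) and the placeholder moves for Eqs (14)–(17) recovers the shape of the paper's improved AO.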
Manufacturer | Acer
Processor | Intel® Core™ i5-4670S CPU @ 3.10 GHz
RAM | 16.0 GB (15.9 GB usable)
System Type | 64-bit operating system
Parameter | CNN | DENSENET121
--------- | --- | -----------
Number of epochs | 20 | 10
Iterations | 500 | 500
Iterations per epoch | 25 | 50