
An Autonomous Underwater Vehicle (AUV) operates autonomously in complex marine environments. After a severe accident, an AUV loses its power and relies on its small reserve buoyancy to ascend slowly. If the reserve buoyancy is insufficient, it rapidly decreases to zero when the vehicle reaches the thermocline; consequently, the AUV experiences prolonged lateral drift within the thermocline. This study focuses on developing a method for predicting the drift trajectory of an AUV after a long-term power-loss accident, with the aim of forecasting the potential resurfacing location and providing technical support for surface search and salvage of the disabled vehicle. To the best of our knowledge, there is currently no mature and effective method for predicting long-term AUV underwater drift trajectories. In response, and based on real AUV accidents, this paper studies the prediction of long-term AUV underwater drift trajectories in cases of power loss. We propose a three-dimensional trajectory prediction method based on the Lagrange tracking approach, which takes into account the AUV's longitudinal velocity, the time taken to reach different depths, and ocean current data at various depths. The reason the AUV fails to ascend to the sea surface is that the remaining buoyancy is too small to overcome the thermocline; as a result, the AUV drifts within the thermocline for a long time. To address this, a method for estimating thermocline currents is proposed, which can be used to predict the lateral drift trajectory of the AUV within the thermocline. A simulation is conducted to compare the results of the proposed method with those of a real accident. The results demonstrate that the proposed approach exhibits small directional and positional errors, validating its effectiveness.
Citation: Shuwen Zheng, Mingjun Zhang, Jing Zhang, Jitao Li. Lagrange tracking-based long-term drift trajectory prediction method for Autonomous Underwater Vehicle[J]. Mathematical Biosciences and Engineering, 2023, 20(12): 21075-21097. doi: 10.3934/mbe.2023932
Tumor-infiltrating lymphocytes (TILs) are immune cells that reside in tumor tissues and are of great significance for the diagnosis and prognosis of cancer [1]. As the gold standard for cancer diagnosis, pathological images contain a wealth of information [2]. TILs can be observed in pathological images, and as the main immune cells in the tumor microenvironment their role is particularly important [3,4]. Many studies have now shown that the number and spatial characteristics of TILs in pathological images can serve as predictors of breast cancer prognosis [5,6]. Examples of pathological images containing TILs are shown in Figure 1.
Pathological image analysis relies on professional doctors and is time-consuming and laborious; moreover, the specificity of pathological images affects the reliability of doctors' diagnoses [7]. Deep learning technology has attracted extensive attention in the medical field because of its autonomy and intelligence [8], and has gradually been applied to many tasks such as medical image classification [9,10], detection [11,12] and segmentation [13,14]. Using deep learning methods to segment TILs in pathological images and to quantify their number and characteristics has become a research hotspot. However, owing to the specificity of pathological images and cells, TILs segmentation faces three challenges: 1) cell adhesion and overlap, since many cells cluster together during sampling because of cell movement; 2) the coexistence of multiple cell types in one pathological image, which makes it difficult to segment a single cell type accurately; and 3) the large imbalance between foreground and background, since cells occupy a small area relative to the background and are easy to miss during segmentation.
Considering the above challenges, we take advantage of deep learning technology to design a segmentation network called SAMS-Net. The proposed network makes three contributions:
● The squeeze-and-attention with residual structure (SAR) module fuses local and global context features, compensating for the spatial information lost in the ordinary convolution process.
● The multi-scale feature fusion (MSFF) module is integrated into the network to capture TILs of smaller size and combines context features to enrich the decoding-stage features.
● The convolution module with residual structure (RS) merges feature maps from different scales to strengthen the fusion of high-level and low-level semantic information.
Early cell segmentation methods, such as the threshold segmentation method [15] and the watershed algorithm [16], mostly use local features while ignoring global features, so their segmentation accuracy leaves room for improvement. Cell segmentation algorithms based on deep learning, such as fully convolutional networks (FCN) [17], UNet [18] and the DeepLab networks [19], have been proposed and widely used in medical image segmentation. Experiments have shown that these networks outperform traditional segmentation algorithms.
Automated cell segmentation methods have been studied extensively in the literature [20,21,22,23,24]. The work in [20] introduced a combined loss function and adopted 4 × 4 max-pooling layers instead of the widely used 2 × 2 ones to reinforce learning of the cell boundary area, thereby improving network performance. The study [21] applied a weakly supervised multi-task learning algorithm for cell segmentation and detection, which effectively handled cases that are difficult to segment. Zhang et al. [22] put forward a dense dual-task network (DDTNet) that uses a feature pyramid network as the backbone; a boundary-sensing module and a feature fusion strategy are designed to detect and segment TILs simultaneously. Their results show that DDTNet is not only superior to other advanced methods on detection and segmentation metrics, but can also automatically annotate unlabeled TILs. The study [23] opened a new approach for the prognosis and treatment of hepatocellular carcinoma by utilizing Mask R-CNN to segment lymphocytes and extract spatial features from images. Based on the autoencoder concept, Budginaite et al. [24] devised a multiple-image input layer architecture for automatic TILs segmentation, in which convolutional texture blocks both improve model performance and reduce complexity. However, the methods above are single network models that do not consider the characteristics of pathological images and cells; exploiting those characteristics can further improve the segmentation of cells.
The attention mechanism is a method for measuring the importance of different features [25]. It was originally used in machine translation but has gradually been applied to semantic segmentation because of its ability to filter high-value features. Attention mechanisms can be divided into soft attention and hard attention; since hard attention is difficult to train, soft attention modules are more often used to extract key features [26].
Related research has shown that the spatial correlation between features can be captured by integrating a learning mechanism into the network. The study [27] presented the squeeze-and-excitation (SE) module, introducing channel learning to emphasize useful features and suppress useless ones. The residual attention network [28] exploited stacked attention modules to generate attention-aware features, with residual learning coupled to the attention modules making the network easier to scale. Furthermore, Yin et al. [29] employed a selective attention regularization module on top of a traditional classification network to improve model interpretability and reliability. These attention modules use only channel attention to enhance the main features while ignoring spatial features, and are therefore not well suited to segmentation tasks. Following the success of the transformer architecture in many natural language processing tasks, Gao et al. [30] proposed UTNet, which integrates self-attention into the UNet framework to enhance performance. In addition, the work in [31] argued that semantic segmentation comprises two aspects, pixel-wise prediction and pixel grouping, and accordingly designed the squeeze-and-attention (SA) module to generate pixel-group attention masks that improve segmentation.
Ordinary segmentation networks apply single convolution and pooling operations to extract features, which leads to under-segmentation because relevant contextual information is lost. To address this problem, a number of studies have proposed multi-scale feature fusion methods that mine context information and improve segmentation. The feature pyramid network [32] extracts semantic feature maps at different scales via a top-down architecture with lateral connections. The atrous spatial pyramid pooling (ASPP) module capitalizes on dilated convolutions with different expansion rates to obtain multi-scale contextual semantic information. UNet++ [33] introduced nested and dense skip connections to aggregate semantic features from different scales. Moreover, UNet3+ [34] exploited full-scale skip connections to make full use of multi-scale features, combining low-level details and high-level semantics in full-scale feature maps to improve segmentation accuracy. Atrous convolution and deformable convolution likewise obtain multi-scale semantic information by changing the size and position of the convolution kernel.
In this section, we elaborate on the proposed TILs segmentation network. First, the pathological images of TILs are labeled with the labelme software and then segmented by the SAMS-Net algorithm, whose framework is shown in Figure 2. Specifically, the coding structure of the model consists of SA modules combined with residual structures, named SAR modules, with the blocks connected by down-sampling operations. The SAR modules enhance the spatial features of the pathological images while extracting their features. At the second and third layers, multi-scale feature fusion (MSFF) modules are added to fuse the low-level and high-level features. In the decoding stage, RS modules based on the residual network are designed to enhance the feature recovery capability of the model.
As the depth of a network increases, the vanishing-gradient problem follows. A common solution is residual learning. The residual learning structure was first proposed by He et al. [35]; it mainly uses skip connections to realize an identity mapping from upper-layer features to the lower-layer network. The formula is as follows:
$H(x) = F(x) + x$  (1)
where $x$ indicates the network input of the current layer and $F(x)$ stands for the residual learning part. This paper applies the residual network idea to design the residual block; thanks to the short connection, the convergence speed of the network is accelerated. The residual idea is used in both the encoding and decoding stages: in the encoding stage, the residual structure enhances the ability of feature extraction, while in the decoding stage it fuses features from different scales to enhance the feature recovery ability. As shown in Figure 3, two 3 × 3 convolutions extract features in the decoding module, and a 1 × 1 convolution forms the residual connection, so that the network can be extended to integrate high-level and low-level features.
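To make this concrete, the following is a minimal PyTorch sketch of such a decoding-stage residual (RS) block; the class name, channel arguments and normalization placement are our own illustrative assumptions, not the authors' released code:

```python
import torch
import torch.nn as nn

class RSBlock(nn.Module):
    """Decoding-stage residual block, a sketch of Eq (1), H(x) = F(x) + x:
    two 3x3 convolutions form the residual branch F(x), and a 1x1
    convolution projects the shortcut so the two terms can be added."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        # 1x1 convolution aligns channel counts for the skip connection
        self.shortcut = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # H(x) = F(x) + x, with x projected to out_ch channels
        return self.relu(self.body(x) + self.shortcut(x))
```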
The SA module and the residual structure are used to extract image features simultaneously. In the encoding module, two 3 × 3 convolutions run in parallel with the SA module and the residual structure. Each SA module includes two parts, compression and attention extraction. The compression part uses global average pooling to obtain feature vectors; the attention extraction part realizes multi-scale feature aggregation through two attention convolution channels and up-sampling operations, generating a global soft attention mask at the same time. In addition, for an input feature map $X \in \mathbb{R}^{H \times W \times C}$, a 1 × 1 convolution is used to match the output feature maps. Finally, the attention mask obtained from the SA module and the feature map generated by the trunk convolutions are combined to capture the key features; here the role of the SA module is to enhance the pixel-grouping attention. The encoding module is shown in Figure 4, where the output feature map is obtained by combining three terms, as follows:
$X_a = \mathrm{Up}(F_a(\mathrm{Apl}(X_{in}), C))$  (2)

$X_{out} = X_{in} + F(X_{in}, C) * X_a + X_a$  (3)
where $X_{in} \in \mathbb{R}^{H \times W \times C}$ and $X_{out} \in \mathbb{R}^{H' \times W' \times C'}$ are the input and output feature maps, $F(\cdot)$ is the residual function, and $C$ stands for the two 3 × 3 convolutions. $\mathrm{Up}(\cdot)$ represents the up-sampling operation, which expands the attention branch's output back to the required feature-map size, and $\mathrm{Apl}(\cdot)$ represents the average pooling layer, which implements the compression operation of the SA module.
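The PyTorch sketch below shows one possible reading of Eqs (2) and (3); the pooling stride, attention-branch width and layer names are illustrative assumptions (the paper does not publish this code):

```python
import torch
import torch.nn as nn

class SARBlock(nn.Module):
    """Encoding-stage squeeze-and-attention with residual structure (SAR).
    Sketch of Eqs (2)-(3); assumes even spatial dimensions so the 2x
    down-sampling and 2x up-sampling in the attention branch cancel."""
    def __init__(self, in_ch, out_ch, squeeze=4):
        super().__init__()
        # Trunk: the two 3x3 convolutions, F(X_in, C)
        self.trunk = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch),
        )
        # Attention branch: Apl (average pooling) then Fa (attention convs)
        self.attn = nn.Sequential(
            nn.AvgPool2d(2),
            nn.Conv2d(in_ch, out_ch // squeeze, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch // squeeze, out_ch, 3, padding=1),
        )
        self.up = nn.Upsample(scale_factor=2, mode='bilinear',
                              align_corners=False)
        self.skip = nn.Conv2d(in_ch, out_ch, 1)  # 1x1 conv matching channels

    def forward(self, x):
        xa = self.up(self.attn(x))                     # Eq (2): X_a
        return self.skip(x) + self.trunk(x) * xa + xa  # Eq (3): X_out
```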
The receptive field is often regarded as the region of the input image that a convolutional neural network (CNN) can see; its size increases as the number of network layers deepens [36]. A large number of studies show that features at different scales differ greatly: a small receptive field carries more detailed low-level information, while a large receptive field carries stronger semantic information. The receptive field is calculated as follows:
$RF_{i+1} = RF_i + (K_{i+1} - 1) \times \prod_{j=1}^{i} S_j$  (4)
where $i$ denotes the current network layer, $K_{i+1}$ is the convolution kernel size of layer $i+1$, and $S_j$ is the stride of layer $j$. When $i = 0$, $RF_0$ is the receptive field of the input layer, and $RF_0 = 1$.
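As a worked example, the following Python snippet iterates Eq (4) over a stack of layers; the layer configuration is illustrative, not SAMS-Net's exact architecture:

```python
def receptive_field(layers):
    """Iterate Eq (4): RF_{i+1} = RF_i + (K_{i+1} - 1) * prod(S_1..S_i).
    `layers` is a list of (kernel_size, stride) pairs, input layer first."""
    rf, jump = 1, 1  # RF_0 = 1; `jump` is the running product of strides
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# Two 3x3 convs (stride 1) then a 2x2 pool (stride 2), repeated twice:
print(receptive_field([(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]))  # 16
```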
Using features of different scales in the segmentation task yields richer semantic information, which helps improve the segmentation effect. The feature fusion method of early network models is the skip connection between corresponding layers, which employs only single-scale features and makes no use of multi-scale features. Experimental verification showed that the receptive fields of the second and third layers of SAMS-Net are well suited to capturing TILs in pathological images; therefore, this study uses the second and third layers of the encoding part as the multi-scale feature fusion layers. To effectively combine shallow detail information with deep semantic information, feature maps of different scales are connected to each layer of the decoding module through up-sampling or pooling operations. The specific implementation is shown in Figure 5.
Taking D4 as an example of the multi-scale feature fusion process: when the image passes through the coding module, the features from the E2 and E3 layers are fused with the features of the E4 layer through max-pooling operations of different sizes, together with the E5 features from the decoding part after an up-sampling operation, to obtain rich joint context information.
Assume that $E$ and $D$ denote the input feature maps of the encoding part and the output feature maps of the decoding part, respectively, and that $i$ indicates the current network layer. $H(\cdot)$ represents the nonlinear transformation of layer $i$, realized by a series of operations such as ReLU, batch normalization and pooling. The formula of the MSFF module is as follows:

$D_i = H([E_2, E_3, E_i, D_{i+1}])$  (5)

where $[\cdot]$ is the concatenation operation, $E_2$ and $E_3$ stand for the feature maps of the second and third encoding layers, $E_i$ is the feature map of the current layer in the encoding stage, and $D_i$ is the feature map of the current layer in the decoding stage.
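A minimal PyTorch sketch of Eq (5) follows; it assumes the decoder layer sits deeper than E3 (as for D4 in Figure 5), so E2 and E3 are pooled down and $D_{i+1}$ is up-sampled before concatenation. The channel bookkeeping and names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MSFF(nn.Module):
    """Multi-scale feature fusion (Eq (5)): resize E2, E3 and D_{i+1} to
    the size of the current encoder map Ei, concatenate, then transform."""
    def __init__(self, ch_e2, ch_e3, ch_ei, ch_d, out_ch):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(ch_e2 + ch_e3 + ch_ei + ch_d, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, e2, e3, ei, d_next):
        size = ei.shape[-2:]
        e2 = F.adaptive_max_pool2d(e2, size)   # pool shallow maps down
        e3 = F.adaptive_max_pool2d(e3, size)
        d_next = F.interpolate(d_next, size=size, mode='bilinear',
                               align_corners=False)  # up-sample deeper map
        return self.fuse(torch.cat([e2, e3, ei, d_next], dim=1))  # Eq (5)
```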
The experiments use the HER2-positive breast cancer tumor-infiltrating lymphocyte dataset from the literature [37], which was annotated by a professional pathologist; the image size is 100 × 100 pixels. Because a small dataset risks overfitting, data augmentation methods such as cropping, mirror transformation and flipping are used to prevent it. The dataset was divided into training, validation and test sets at a ratio of 8:1:1, and ten-fold cross-validation is used to evaluate the generalization performance of the model.
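A minimal torchvision sketch of such an augmentation pipeline is shown below; the paper names cropping, mirror transformation and flipping but not their parameters, so the crop size and probabilities are assumptions:

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomCrop(96),               # random crop of the 100x100 patch
    transforms.RandomHorizontalFlip(p=0.5),  # mirror transformation
    transforms.RandomVerticalFlip(p=0.5),    # flipping
    transforms.ToTensor(),
])
```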
The SAMS-Net algorithm is implemented with the PyTorch 1.8.1 deep learning framework and trained on a platform with an Intel(R) Core(TM) i5-1135G7 CPU and an NVIDIA Tesla V100 32 GB GPU. The initial learning rate is set to 0.0025. Adaptive moment estimation (Adam) is used as the optimizer, Dice loss as the loss function, and an L2 regularization operation to prevent overfitting.
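A sketch of this training setup in PyTorch is given below; the Dice loss implementation, the stand-in model and the weight-decay coefficient are our assumptions (the paper reports only the optimizer, the loss type, the learning rate and the use of L2 regularization):

```python
import torch
import torch.nn as nn

class DiceLoss(nn.Module):
    """Soft Dice loss, 1 - DSC (cf. Eq (7)), for binary segmentation."""
    def __init__(self, eps=1e-6):
        super().__init__()
        self.eps = eps

    def forward(self, logits, target):
        prob = torch.sigmoid(logits)
        inter = (prob * target).sum(dim=(1, 2, 3))
        union = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
        dice = (2 * inter + self.eps) / (union + self.eps)
        return 1 - dice.mean()

model = nn.Conv2d(3, 1, 3, padding=1)  # stand-in for the assembled SAMS-Net
criterion = DiceLoss()
# In PyTorch, L2 regularization is applied via the optimizer's weight_decay;
# the coefficient 1e-4 is an assumption, as the paper does not report it.
optimizer = torch.optim.Adam(model.parameters(), lr=0.0025, weight_decay=1e-4)
```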
To verify the effectiveness of the proposed algorithm, we use IoU, DSC, positive predictive value (PPV), F1 score, pixel accuracy (PA), recall and Hausdorff distance (Hd) to evaluate its performance. IoU measures the overlap between the predicted map and the ground truth; DSC measures the similarity between the predicted map and the ground truth, and the closer its value is to 1, the better the segmentation. Conversely, the Hausdorff distance is a distance defined between any two sets in a metric space, and the closer its value is to 0, the better the segmentation. The calculation formulas are:
$\mathrm{IoU} = \dfrac{P \cap G}{P \cup G}$  (6)

$\mathrm{DSC} = \dfrac{2|P \cap G|}{|P| + |G|}$  (7)

$\mathrm{PPV} = \dfrac{TP}{TP + FP}$  (8)

$F1 = \dfrac{2TP}{2TP + FP + FN}$  (9)

$\mathrm{PA} = \dfrac{TP + TN}{TP + TN + FP + FN}$  (10)

$\mathrm{Recall} = \dfrac{TP}{TP + FN}$  (11)

$H_d = \max\{h(P, G),\, h(G, P)\}$  (12)
In Eqs (6), (7) and (12), $P$ represents the TILs region predicted in the segmentation result and $G$ represents the TILs region in the ground-truth image. In Eqs (8)–(11), TP, FP, TN and FN denote the numbers of true positive, false positive, true negative and false negative pixels, respectively.
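For reference, the area-overlap and confusion-count metrics can be computed from binary masks as in the following sketch (NumPy, with our own helper names; zero-division guards are omitted for brevity):

```python
import numpy as np

def iou_dsc(pred, gt):
    """IoU (Eq (6)) and DSC (Eq (7)) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union, 2 * inter / (pred.sum() + gt.sum())

def confusion_metrics(pred, gt):
    """PPV, F1, PA and recall from pixel-wise counts (Eqs (8)-(11))."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    ppv = tp / (tp + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    pa = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    return ppv, f1, pa, recall
```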
To use multi-scale features more effectively, the fusion strategy between different layers of the algorithm was studied experimentally. The results show that fusing multi-scale information from different layers improves TILs segmentation accuracy to a certain extent; however, the second and third layers of SAMS-Net retain the semantic information of TILs best, improve the overall segmentation effect, and perform best in the TILs segmentation task. The experimental results are shown in Table 1, where E1, E2, E3 and E4 represent the first, second, third and fourth layers of the encoding part, respectively. The table shows that jointly using the E2 and E3 feature vectors gives the best results for the SAMS-Net algorithm.
Model | IoU (%) ↑ | DSC (%) ↑ | PPV (%) ↑ | F1 (%) ↑ | PA (%) ↑ | Recall (%) ↑ | Hd↓ |
E1 + E2 | 77.2 | 87.0 | 92.7 | 92.4 | 96.1 | 92.2 | 3.400
E1 + E3 | 76.1 | 86.3 | 92.0 | 91.9 | 96.2 | 92.1 | 3.503 |
E1 + E4 | 76.7 | 86.8 | 92.4 | 91.7 | 94.9 | 91.3 | 3.781 |
E2 + E4 | 76.2 | 86.4 | 92.3 | 92.0 | 96.1 | 91.8 | 3.450 |
E3 + E4 | 75.8 | 86.1 | 92.0 | 91.8 | 95.8 | 91.9 | 3.443 |
E2+E3 | 77.5 | 87.2 | 93.0 | 92.6 | 96.4 | 92.1 | 3.354 |
Note: metrics comparing the automated results with the ground truth, where ↑ means that larger values are better and ↓ means that smaller values are better. The best results are highlighted in bold.
To verify the effectiveness of the proposed algorithm, SAMS-Net is compared with other classical segmentation algorithms (such as the FCN, DeepLabV3+ and UNet networks) on the same experimental platform. The experimental results are shown in Table 2. SAMS-Net performs best in the TILs segmentation task, with its IoU, DSC and other indicators optimal among the eight segmentation algorithms.
Model | IoU (%) ↑ | DSC (%) ↑ | PPV (%) ↑ | F1 (%) ↑ | PA (%) ↑ | Recall (%) ↑ | Hd↓ |
FCN [17] | 74.5 | 85.1 | 91.8 | 91.3 | 95.6 | 91.0 | 3.460 |
DeepLabV3+ [19] | 70.1 | 82.3 | 90.5 | 89.7 | 95.0 | 89.2 | 4.177 |
SegNet [38] | 73.2 | 84.4 | 90.9 | 90.8 | 95.6 | 91.0 | 3.729 |
ENet [39] | 51.5 | 67.9 | 81.9 | 81.0 | 91.2 | 81.1 | 4.465 |
UNet [18] | 73.7 | 84.7 | 90.1 | 91.1 | 95.7 | 90.8 | 3.498 |
R2UNet [40] | 74.1 | 85.1 | 92.0 | 91.2 | 95.8 | 90.7 | 3.574 |
UNet++ [33] | 75.6 | 85.8 | 92.3 | 91.7 | 96.0 | 91.3 | 3.368 |
SAMS-Net(ours) | 77.5 | 87.2 | 93.0 | 92.6 | 96.4 | 92.1 | 3.354 |
Note: metrics comparing the automated results with the ground truth, where ↑ means that larger values are better and ↓ means that smaller values are better. The best results are highlighted in bold.
The experimental results show that SAMS-Net performs well in the TILs segmentation task, and its IoU, DSC and other indicators achieve the best results among the eight segmentation algorithms. Compared with UNet, IoU increases by 3.8% and DSC by 2.5%; compared with FCN, DeepLabV3+, SegNet, R2UNet and UNet++, IoU increases by 3.0, 7.4, 4.3, 3.4 and 1.9%, and DSC by 2.1, 4.9, 2.8, 2.1 and 1.4%, respectively, which demonstrates the effectiveness of SAMS-Net in segmentation. The analysis shows that the FCN and SegNet networks suffer from long training times due to their large numbers of parameters, and their failure to consider global information makes them prone to losing image details, so their segmentation is not fine enough. To reduce the number of model parameters, the ENet algorithm performs an early down-sampling operation, which causes serious loss of spatial information and poor segmentation ability. The DeepLabV3+ algorithm adds various modules to reduce model parameters and enhance feature extraction, but this introduces redundant feature information that prevents the network from learning the key information, lowering its segmentation performance. Although the UNet, UNet++ and R2UNet networks consider the relationship between pixels, they fail to fully relate the context information to obtain richer features and thus lose part of the edge information, resulting in slightly lower segmentation ability.
Because of the residual attention module and the multi-scale feature fusion module designed in the proposed SAMS-Net algorithm, the network not only attends to the key information in the image but also considers the context, so it produces better segmentation results. To analyze the segmentation effect in more detail, this study visually compares SAMS-Net with the other algorithms; the comparison results are shown in Figure 6.
According to the segmentation results, the SegNet, UNet and UNet++ algorithms mistakenly classify normal cells as TILs. FCN and DeepLabV3+ show cell-edge adhesion during segmentation, and ENet produces unclear segmentation edges and burrs. Compared with the other segmentation networks, SAMS-Net effectively avoids under-segmentation and over-segmentation, and its overall segmentation effect is better. Nevertheless, although SAMS-Net improves the segmentation of TILs, some regions still show unclear edges and segmentation errors, which may be caused by the small dataset and the imbalance between foreground and background pixels. Adding more training samples to enhance the feature learning ability of the network could further improve the segmentation effect.
To measure the generalization performance of the algorithm and explore the influence of the different modules, the improved modules were separated and ablation experiments were used to validate the contribution of each module to SAMS-Net. The results are shown in Table 3: compared with the basic network, each module contributes to the segmentation task of this paper, and the combination of all modules achieves the best effect.
SA | MSFF | RS | IoU (%) ↑ | DSC (%) ↑ | PPV (%) ↑ | F1 (%) ↑ | PA (%) ↑ | Recall (%) ↑ | Hd↓ |
74.8 | 85.3 | 91.2 | 91.4 | 96.0 | 91.7 | 3.610 | |||
✔ | 76.2 | 86.4 | 92.4 | 92.0 | 96.2 | 91.9 | 3.477 | ||
✔ | 76.3 | 86.4 | 92.5 | 92.0 | 96.2 | 91.7 | 3.388 | ||
✔ | 75.6 | 85.9 | 91.4 | 91.7 | 95.4 | 91.3 | 3.512 | ||
✔ | ✔ | 75.9 | 86.2 | 92.5 | 91.9 | 96.1 | 91.5 | 3.506 | |
✔ | ✔ | 76.1 | 86.3 | 92.4 | 92.0 | 96.1 | 91.7 | 3.454 | |
✔ | ✔ | 75.7 | 86.0 | 92.5 | 91.8 | 96.1 | 91.4 | 3.498 | |
✔ | ✔ | ✔ | 77.5 | 87.2 | 93.0 | 92.6 | 96.4 | 92.1 | 3.354 |
Note: ablation results of the different components, where ↑ means that larger values are better and ↓ means that smaller values are better. The best results are highlighted in bold.
To verify the effectiveness of the data augmentation operation and the L2 regularization [41] method, the baseline algorithm is compared with the algorithm after adding data augmentation and L2 regularization; the comparison results are shown in Figure 7.
Here, Base is the algorithm without data augmentation or L2 regularization, Aug stands for the data augmentation operation, and L2 for the L2 regularization method. Compared with the Base network, the IoU of the algorithm increases by 4.4% and the DSC by 3% after adding data augmentation and L2 regularization, showing that these two operations help improve the segmentation effect.
Related research shows that TILs can predict cancer chemotherapy response and survival outcomes [42], and can provide a basis for precise cancer treatment. This paper proposes a segmentation network based on the squeeze-attention mechanism and multi-scale feature fusion to segment TILs in breast cancer pathological images. SAMS-Net has three modules: the SAR module, the MSFF module and the RS module. Unlike the traditional attention mechanism, the SAR module effectively takes the interdependence between spatial positions and channels into consideration, which enhances dense pixel-level prediction. The MSFF module effectively combines low-level and high-level semantic features in feature maps of different scales while enhancing context features. The RS module strengthens gradient back-propagation to speed up training.
Missing spatial image information and the pixel imbalance of the segmentation target are common problems in traditional segmentation networks that make them unsuitable for cell segmentation. Building on the traditional network, this paper takes into account the segmentation effect of different receptive fields on the cell area and proposes the MSFF module, which combines multiple receptive fields to address the difficulty of capturing small cell regions during segmentation. SAMS-Net uses the attention mechanism combined with the residual structure to extract richer semantic information. Extensive experiments have shown that, among state-of-the-art methods, SAMS-Net achieves a better segmentation effect and can further provide important evidence for the prognosis and treatment of cancer. In addition, this work could also be applied to the diagnosis of diseases imaged by optical coherence tomography, such as age-related macular degeneration and Stargardt's disease [43,44,45]. However, the use of multiple modules to improve segmentation increases the number of parameters and the computational cost of the model; in the future, the network needs further refinement to reduce both.
This work is supported by the National Natural Science Foundation of China (No. 61872225), the Natural Science Foundation of Shandong Province (No. ZR2020KF013, No. ZR2020ZD44, No. ZR2019ZD04, No. ZR2020QF043), the Introduction and Cultivation Program for Young Creative Talents in Colleges and Universities of Shandong Province (No. 2019-173), and the Special Fund of the Qilu Health and Health Leading Talents Training Project.
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
[1] X. Xiang, C. Yu, Q. Zhang, On intelligent risk analysis and critical decision of underwater robotic vehicle, Ocean Eng., 140 (2017), 453–465. https://doi.org/10.1016/j.oceaneng.2017.06.020
[2] W. Wawrzyński, M. Zieja, M. Żokowski, N. Sigiel, Optimization of Autonomous Underwater Vehicle mission planning process, Bull. Pol. Acad. Sci. Tech. Sci., 70 (2022), e140371. https://doi.org/10.24425/bpasts.2022.140371
[3] X. Chen, N. Bose, M. Brito, F. Khan, B. Thanyamanta, T. Zou, A review of risk analysis research for the operations of Autonomous Underwater Vehicles, Reliab. Eng. Syst. Saf., 216 (2021), 108011. https://doi.org/10.1016/j.ress.2021.108011
[4] S. Xia, X. Zhou, H. Shi, S. Li, C. Xu, A fault diagnosis method based on attention mechanism with application in Qianlong-2 Autonomous Underwater Vehicle, Ocean Eng., 233 (2021), 109049. https://doi.org/10.1016/j.oceaneng.2021.109049
[5] D. Chaos, D. Moreno-Salinas, J. Aranda, Fault-tolerant control for AUVs using a single thruster, IEEE Access, 10 (2022), 22123–22139. https://doi.org/10.1109/ACCESS.2022.3152190
[6] Y. Yu, J. Zhang, T. Zhang, AUV drift track prediction method based on a modified neural network, Appl. Sci., 12 (2022), 12169. https://doi.org/10.3390/app122312169
[7] S. Meng, W. Lu, Y. Li, H. Wang, L. Jiang, A study on the leeway drift characteristic of a typical fishing vessel common in the Northern South China Sea, Appl. Ocean Res., 109 (2021), 102498. https://doi.org/10.1016/j.apor.2020.102498
[8] H. Tu, L. Mu, K. Xia, X. Wang, K. Zhu, Determining the drift characteristics of open lifeboats based on large-scale drift experiments, Front. Mar. Sci., 9 (2022), 1017042. https://doi.org/10.3389/fmars.2022.1017042
[9] J. R. Frost, L. D. Stone, Review of search theory: Advances and applications to search and rescue decision support, TRB Annu. Meet., 2001.
[10] National SAR Manual, National Search and Rescue Manual, EXHIBIT/P-00112, 1998. Available from: http://www.oshsi.nl.ca/userfiles/files/p00112.pdf.
[11] L. P. Perera, P. Oliveira, C. Guedes Soares, Maritime traffic monitoring based on vessel detection, tracking, state estimation, and trajectory prediction, IEEE Trans. Intell. Transp. Syst., 13 (2012), 1188–1200. https://doi.org/10.1109/TITS.2012.2187282
[12] J. Zhang, Â. P. Teixeira, C. Guedes Soares, X. Yan, Probabilistic modelling of the drifting trajectory of an object under the effect of wind and current for maritime search and rescue, Ocean Eng., 129 (2017), 253–264. https://doi.org/10.1016/j.oceaneng.2016.11.002
[13] P. Miron, F. J. Beron-Vera, M. J. Olascoaga, P. Koltai, Markov-chain-inspired search for MH370, Chaos: Interdiscip. J. Nonlinear Sci., 29 (2019), 041105. https://doi.org/10.1063/1.5092132
[14] M. Zhao, J. Zhang, M. H. Rashid, Predicting the drift position of ships using deep learning, in the 2nd International Conference on Computing and Data Science, Association for Computing Machinery, (2021), 1–5. https://doi.org/10.1145/3448734.3450922
[15] A. A. Pereira, J. Binney, G. A. Hollinger, G. S. Sukhatme, Risk-aware path planning for Autonomous Underwater Vehicles using predictive ocean models, J. Field Rob., 30 (2013), 741–762. https://doi.org/10.1002/rob.21472
[16] D. N. Subramani, Q. J. Wei, P. F. J. Lermusiaux, Stochastic time-optimal path-planning in uncertain, strong, and dynamic flows, Comput. Methods Appl. Mech. Eng., 333 (2018), 218–237. https://doi.org/10.1016/j.cma.2018.01.004
[17] Z. Wu, H. R. Karimi, C. Dang, An approximation algorithm for graph partitioning via deterministic annealing neural network, Neural Networks, 117 (2019), 191–200. https://doi.org/10.1016/j.neunet.2019.05.010
[18] Z. Wu, Q. Gao, B. Jiang, H. R. Karimi, Solving the production transportation problem via a deterministic annealing neural network method, Appl. Math. Comput., 411 (2021), 126518. https://doi.org/10.1016/j.amc.2021.126518
[19] D. Tong, B. Ma, Q. Chen, Y. Wei, P. Shi, Finite-time synchronization and energy consumption prediction for multilayer fractional-order networks, IEEE Trans. Circuits Syst. II Express Briefs, 70 (2023), 2176–2180. https://doi.org/10.1109/TCSII.2022.3233420
[20] G. Yang, D. Tong, Q. Chen, W. Zhou, Fixed-time synchronization and energy consumption for Kuramoto-oscillator networks with multilayer distributed control, IEEE Trans. Circuits Syst. II Express Briefs, 70 (2022), 1555–1559. https://doi.org/10.1109/TCSII.2022.3221477
[21] C. Xu, D. Tong, Q. Chen, W. Zhou, P. Shi, Exponential stability of Markovian jump systems via adaptive sliding mode control, IEEE Trans. Syst. Man Cybern.: Syst., 51 (2019), 954–964. https://doi.org/10.1109/TSMC.2018.2884565
[22] K. Zhu, L. Mu, H. Tu, Exploration of the wind-induced drift characteristics of typical Chinese offshore fishing vessels, Appl. Ocean Res., 92 (2019), 101916. https://doi.org/10.1016/j.apor.2019.101916
[23] H. Yasukawa, N. Hirata, Y. Nakayama, A. Matsuda, Drifting of a dead ship in wind, Ship Technol. Res., 70 (2023), 26–45. https://doi.org/10.1080/09377255.2021.1954835
[24] H. W. Tu, X. D. Wang, L. Mu, J. L. Sun, A study on the drift prediction method of wrecked fishing vessels at sea, in OCEANS 2021: San Diego–Porto, IEEE, (2021), 1–6. https://doi.org/10.23919/OCEANS44145.2021.9705751
[25] D. Sumangala, A. Joshi, H. Warrior, Modelling freshwater plume in the Bay of Bengal with artificial neural networks, Curr. Sci., 123 (2022), 73–80. https://doi.org/10.18520/cs/v123/i1/73-80
[26] L. Ren, Z. Hu, M. Hartnett, Short-term forecasting of coastal surface currents using high frequency radar data and artificial neural networks, Remote Sens., 10 (2018), 850. https://doi.org/10.3390/rs10060850
[27] H. Kalinić, H. Mihanović, S. Cosoli, M. Tudor, I. Vilibić, Predicting ocean surface currents using numerical weather prediction model and Kohonen neural network: A northern Adriatic study, Neural Comput. Appl., 28 (2017), 611–620. https://doi.org/10.1007/s00521-016-2395-4
[28] H. Guan, X. Dong, C. Xue, Z. Luo, H. Yang, T. Wu, Optimization of POM based on parallel supercomputing grid cloud platform, in 2019 Seventh International Conference on Advanced Cloud and Big Data (CBD), IEEE, (2019), 49–54. https://doi.org/10.1109/cbd.2019.00019
[29] A. K. Das, A. Sharma, S. Joseph, A. Srivastava, D. R. Pattanaik, Comparative performance of HWRF model coupled with POM and HYCOM for tropical cyclones over North Indian Ocean, MAUSAM, 72 (2021), 147–166. https://doi.org/10.54302/mausam.v72i1.127
[30] C. D. Dong, T. H. H. Nguyen, T. H. Hou, C. C. Tsai, Integrated numerical model for the simulation of the T.S. Taipei oil spill, J. Mar. Sci. Technol., 27 (2019), 7. https://doi.org/10.6119/JMST.201908_27(4).0007
[31] J. Xu, J. Y. Bao, C. Y. Zhan, X. H. Zhou, Tide model CST1 of China and its application for the water level reducer of bathymetric data, Mar. Geod., 40 (2017), 74–86. https://doi.org/10.1080/01490419.2017.1308896
[32] H. Xu, Tracking Lagrange trajectories in position–velocity space, Meas. Sci. Technol., 19 (2008), 075105. https://doi.org/10.1088/0957-0233/19/7/075105
[33] T. Heus, G. van Dijk, H. J. J. Jonker, H. E. A. Van den Akker, Mixing in shallow cumulus clouds studied by Lagrange particle tracking, J. Atmos. Sci., 65 (2008), 2581–2597. https://doi.org/10.1175/2008JAS2572.1
[34] N. B. Engdahl, R. M. Maxwell, Quantifying changes in age distributions and the hydrologic balance of a high-mountain watershed from climate induced variations in recharge, J. Hydrol., 522 (2015), 152–162. https://doi.org/10.1016/j.jhydrol.2014.12.032
[35] M. Jing, F. Heße, R. Kumar, O. Kolditz, T. Kalbacher, S. Attinger, Influence of input and parameter uncertainty on the prediction of catchment-scale groundwater travel time distributions, Hydrol. Earth Syst. Sci., 23 (2019), 171–190. https://doi.org/10.5194/hess-23-171-2019
[36] Y. H. Zhu, S. Q. Peng, 40 years of marine data products in the South China Sea (1980–2019) (1/10 degree) (hourly) (netcdf), the Key Special Project for Introduced Talents Team of Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou) (GML2019ZD0303), 2019. Available from: http://data.scsio.ac.cn/metaDatadetail/1480813599763386368.
[37] NOAA Physical Sciences Laboratory (PSL), NCEP/NCAR Reanalysis. Available from: https://psl.noaa.gov/.
[38] W. Ekman, Eddy-viscosity and skin-friction in the dynamics of winds and ocean-currents, in Memoirs of the Royal Meteorological Society, Stanford, (1928), 161–172.
[39] N. P. Fofonoff, Physical properties of seawater: A new salinity scale and equation of state for seawater, J. Geophys. Res., 90 (1985), 3332–3342. https://doi.org/10.1029/JC090iC02p03332
[40] Y. Jiang, Y. Li, Y. Su, J. Cao, Y. Li, Y. Wang, et al., Statics variation analysis due to spatially moving of a full ocean depth Autonomous Underwater Vehicle, Int. J. Nav. Archit. Ocean Eng., 11 (2019), 448–461. https://doi.org/10.1016/j.ijnaoe.2018.08.002
[41] K. Zhang, New gravity acceleration formula research (in Chinese), Prog. Geophys., 26 (2011), 824–828. https://doi.org/10.3969/j.issn.1004-2903.2011.03.006
[42] Y. K. Wang, Simulation research on the full-ocean-depth AUV diving and floating motion (in Chinese), Harbin Eng. Univ., 2020. https://doi.org/10.27060/d.cnki.ghbcu.2019.000077
[43] A. Chen, J. Ye, Research on four-layer back propagation neural network for the computation of ship resistance, in 2009 International Conference on Mechatronics and Automation, IEEE, (2009), 2537–2541. https://doi.org/10.1109/icma.2009.5245975
[44] X. Chen, C. Wei, G. Zhou, H. Wu, Z. Wang, S. A. Biancardo, Automatic identification system (AIS) data supported ship trajectory prediction and analysis via a deep learning model, J. Mar. Sci. Eng., 10 (2022), 1314. https://doi.org/10.3390/jmse10091314