
[1] | Xiaomeng Feng, Taiping Wang, Xiaohang Yang, Minfei Zhang, Wanpeng Guo, Weina Wang . ConvWin-UNet: UNet-like hierarchical vision Transformer combined with convolution for medical image segmentation. Mathematical Biosciences and Engineering, 2023, 20(1): 128-144. doi: 10.3934/mbe.2023007 |
[2] | Yu Li, Meilong Zhu, Guangmin Sun, Jiayang Chen, Xiaorong Zhu, Jinkui Yang . Weakly supervised training for eye fundus lesion segmentation in patients with diabetic retinopathy. Mathematical Biosciences and Engineering, 2022, 19(5): 5293-5311. doi: 10.3934/mbe.2022248 |
[3] | Yunling Liu, Yaxiong Liu, Jingsong Li, Yaoxing Chen, Fengjuan Xu, Yifa Xu, Jing Cao, Yuntao Ma . ECA-TFUnet: A U-shaped CNN-Transformer network with efficient channel attention for organ segmentation in anatomical sectional images of canines. Mathematical Biosciences and Engineering, 2023, 20(10): 18650-18669. doi: 10.3934/mbe.2023827 |
[4] | Biao Cai, Qing Xu, Cheng Yang, Yi Lu, Cheng Ge, Zhichao Wang, Kai Liu, Xubin Qiu, Shan Chang . Spine MRI image segmentation method based on ASPP and U-Net network. Mathematical Biosciences and Engineering, 2023, 20(9): 15999-16014. doi: 10.3934/mbe.2023713 |
[5] | Xi Lu, Zejun You, Miaomiao Sun, Jing Wu, Zhihong Zhang . Breast cancer mitotic cell detection using cascade convolutional neural network with U-Net. Mathematical Biosciences and Engineering, 2021, 18(1): 673-695. doi: 10.3934/mbe.2021036 |
[6] | Mingju Chen, Sihang Yi, Mei Yang, Zhiwen Yang, Xingyue Zhang . UNet segmentation network of COVID-19 CT images with multi-scale attention. Mathematical Biosciences and Engineering, 2023, 20(9): 16762-16785. doi: 10.3934/mbe.2023747 |
[7] | Zhenyin Fu, Jin Zhang, Ruyi Luo, Yutong Sun, Dongdong Deng, Ling Xia . TF-Unet:An automatic cardiac MRI image segmentation method. Mathematical Biosciences and Engineering, 2022, 19(5): 5207-5222. doi: 10.3934/mbe.2022244 |
[8] | Jinke Wang, Xiangyang Zhang, Liang Guo, Changfa Shi, Shinichi Tamura . Multi-scale attention and deep supervision-based 3D UNet for automatic liver segmentation from CT. Mathematical Biosciences and Engineering, 2023, 20(1): 1297-1316. doi: 10.3934/mbe.2023059 |
[9] | Tongping Shen, Fangliang Huang, Xusong Zhang . CT medical image segmentation algorithm based on deep learning technology. Mathematical Biosciences and Engineering, 2023, 20(6): 10954-10976. doi: 10.3934/mbe.2023485 |
[10] | Hong'an Li, Man Liu, Jiangwen Fan, Qingfang Liu . Biomedical image segmentation algorithm based on dense atrous convolution. Mathematical Biosciences and Engineering, 2024, 21(3): 4351-4369. doi: 10.3934/mbe.2024192 |
In recent years, with the improvement of computer hardware performance, deep learning (DL) has been applied in many industrial fields and has demonstrated excellent performance. One of the most important application areas is medical image segmentation and classification [1,2,3]. Compared with traditional machine learning and computer vision methods, DL offers significantly better segmentation accuracy [4,5,6].
Over the past decade, medical image segmentation technology based on deep learning has focused on developing efficient and robust segmentation methods [7]. Unet is a milestone work [8]: it establishes an encoder-decoder convolutional network structure with skip connections that is simple and efficient for medical image segmentation and requires only small datasets. In recent years, Unet-like structures have become the backbone of almost all leading medical image segmentation methods. Following Unet, many important extension networks have emerged, such as Unet++ [9], Res-Unet [10], AttentionUnet [11] and Trans-Unet [12].
Unet draws on the experience of the fully convolutional network (FCN), and its network structure consists of two parts: the contracting network on the left captures the contextual information in the image, and the expanding network on the right achieves accurate localization of the parts of the image to be segmented. Unet also uses skip connections for feature fusion, combining the down-sampling features of the first half with the up-sampling features of the second half to obtain more accurate context information and a better segmentation result.
Zhou et al. proposed Unet++ [9], which is composed of a set of Unets of different depths whose decoders are densely connected at the same resolution through redesigned skip connections. Despite the improved performance, the Unet++ model is complex, requires additional learnable parameters, and some of its components are redundant for specific tasks [13]. Inspired by the deep residual network (ResNet) [14] and Unet, Zhang et al. proposed the deep residual Unet (Res-Unet) [10]. Res-Unet is still based on the Unet architecture, but a series of stacked residual units replaces ordinary neural units as the basic blocks, which effectively deepens the network. However, as the network depth increases, the training time becomes very long. Researchers have also considered introducing self-attention mechanisms into CNNs to improve network performance [15,16,17]. Oktay et al. integrated attention gates into the skip connections of the U-shaped structure for medical image segmentation [11]. The attention gate (AG) mechanism implicitly generates soft region proposals, highlighting salient features useful for specific tasks; by suppressing features from irrelevant regions, it improves the sensitivity and accuracy of the model for dense label prediction. Some researchers have attempted to minimize interference from extraneous regions by preprocessing the data prior to network training. Rani et al. [18] found that bone structures in chest X-rays could interfere with feature extraction from lung regions, thereby reducing the accuracy of models in detecting, localizing, and visualizing infections during COVID-19 screening. To improve overall accuracy, they applied bone suppression and lung segmentation as preprocessing steps, which minimize the visibility of bones within the lung region while preserving maximum spatial information and resolution [19]. Subsequently, Rani et al. [20] used data augmentation, histogram equalization, and pre-segmentation of the L1 vertebra to compute the vertebral center as a reference for kidney and ureter localization, and verified with the proposed KUB-UNet network that this method enhances the segmentation of urinary organs in KUB X-ray images.
Currently, the Transformer, originally designed for sequence-to-sequence prediction, has become an alternative architecture with a global self-attention mechanism [21,22,23,24,25]. Chen et al. proposed Trans-Unet [12], which combines the advantages of Transformers and Unet. On the one hand, the Transformer encodes tokenized image patches from the CNN feature map into an input sequence to extract the global context. On the other hand, the decoder up-samples the encoded features and combines them with high-resolution CNN feature maps for accurate localization.
The encoder-decoder structure and skip connections of Unet have proved to be an efficient and stable network design [26,27,28,29]. As mentioned above, many novel network structures based on Unet have been proposed. However, most of these improvements concern the backward propagation and fusion of features, with few changes to the forward propagation of the network and the forward fusion of information. To enlarge the receptive field and perform pixel-level prediction, the Unet network must carry out a series of up-sampling and down-sampling operations, which inevitably cause information loss and under-utilization. To address this disadvantage, we design a feedback mechanism Unet (FM-Unet) in this paper, adding a feedback path to the encoder and decoder paths of the network to help the network integrate information from the following step into the current encoder and decoder blocks. The main contributions of this paper are summarized as follows:
1) A feedback mechanism Unet model for semantic segmentation of medical images is proposed. A feedback path is introduced into Unet, which integrates the contextual information of the convolution blocks and compensates for information loss to improve segmentation accuracy.
2) Compared with most improved networks based on Unet, the FM-Unet model has fewer parameters, which reduces computation time and memory cost to a certain extent.
3) In FM-Unet, the concatenation of context feature maps on the feedback path, the concatenation of encoder-decoder primary-path feature maps with feedback-path feature maps, and the concatenation of the same node at different time points better fuse information at each scale and alleviate the problem of gradient vanishing.
The rest of this paper is organized as follows: Section 2 reviews Unet-based segmentation networks and related techniques. Section 3 describes the proposed method. Section 4 gives the experimental results and analysis. Finally, a summary of the proposed model is presented.
Unet, whose network structure is shaped like the letter 'U', is composed of convolution, down-sampling, up-sampling, skip connection and other operations, including the down-sampling contraction path on the left and the up-sampling expansion path on the right. The Unet contraction path extracts the semantic information of the image, reduces the image resolution and expands the receptive field. The contraction path consists of five blocks, each containing two 3 × 3 convolutional layers; each convolutional layer is followed by a ReLU activation function, and a down-sampling operation produces the input of the next block. The expansion path predicts pixel by pixel, accurately locates the target position, and restores the image to a size similar to that of the input image. The expansion path contains four blocks, and each block also contains 3 × 3 convolutional layers, ReLU and an up-sampling operation. Skip connections are added between the contraction path and the expansion path: Unet crops the contraction-path feature map to the same size as the corresponding expansion-path feature map and then concatenates them, which helps the network recover details lost along the contraction path.
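For concreteness, the sketch below (PyTorch assumed; layer sizes are illustrative, not taken from the paper) shows one contraction-path double-convolution block, the 2 × 2 max-pooling step, and the channel-wise concatenation used by the skip connection. Padded convolutions are used here, so the cropping step of the original Unet is unnecessary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def double_conv(in_ch, out_ch):
    # Two 3x3 convolutions, each followed by ReLU, as in a Unet block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

enc = double_conv(3, 64)              # one contraction-path block
x = torch.randn(1, 3, 256, 256)       # dummy input image
skip = enc(x)                         # feature map kept for the skip connection
down = F.max_pool2d(skip, 2)          # 2x2 down-sampling for the next, deeper block

# Expansion path: up-sample and concatenate with the stored contraction feature map.
up = F.interpolate(down, scale_factor=2)
fused = torch.cat([skip, up], dim=1)  # skip connection by channel-wise concatenation
```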
He et al. proposed the residual network [14], which introduces an identity-mapping design to solve the degradation and gradient vanishing problems in multilayer neural networks. For a stacked layer structure with input x, the feature to be learned is H(x), and the residual function is defined as F(x) = H(x) - x, so the learned feature can be expressed as H(x) = F(x) + x. In this way, the network is optimized by adjusting the residual function F(x), which is easier than directly learning the original mapping H(x). The residual structure uses a shortcut connection, which can be understood as a quick channel through which the feature maps of different layers are added. Note that F(x) and x must have the same shape, so x is often projected with a 1 × 1 convolution on the shortcut path, and the addition is performed element-wise on the feature maps.
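A minimal residual-unit sketch (PyTorch assumed; the block layout and channel sizes are illustrative) of H(x) = F(x) + x with a 1 × 1 shortcut projection when the shapes differ:

```python
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(  # the residual branch F(x)
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch),
        )
        # 1x1 convolution on the shortcut path so x matches F(x) in channel count.
        self.shortcut = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        # Element-wise addition of F(x) and the (possibly projected) identity x.
        return torch.relu(self.body(x) + self.shortcut(x))

y = ResidualUnit(32, 64)(torch.randn(1, 32, 64, 64))  # H(x) = F(x) + x
```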
Huang et al. proposed the DenseNet model [30]. Its basic idea is similar to that of ResNet, but it establishes dense connections between all preceding layers and the following layers, which is where its name comes from. Another major feature of DenseNet is feature-map reuse through concatenating features along the channel dimension. These properties enable DenseNet to achieve better performance than ResNet with fewer parameters and lower computational cost. Compared with ResNet, DenseNet adopts a more radical dense-connectivity mechanism: connecting all layers. In DenseNet, each layer is connected with all previous layers in the channel dimension (the feature maps of these layers have the same size) and used as the input of the next layer.
In a traditional network, the output of layer $l$ is $x_l = H_l(x_{l-1})$, that is, only the previous layer is used as input. In DenseNet, all previous layers are concatenated as input: $x_l = H_l([x_0, x_1, x_2, \cdots, x_{l-1}])$, where $H_l(\cdot)$ represents a nonlinear transformation function, a composite operation that includes a series of Batch Normalization (BN), ReLU, Pooling and Conv operations.
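The dense connectivity $x_l = H_l([x_0, \ldots, x_{l-1}])$ can be sketched as follows (PyTorch assumed; the growth rate and layer count are illustrative, and transition pooling is omitted for brevity):

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_ch, growth_rate=12, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(num_layers):
            # H_l: BN -> ReLU -> 3x3 Conv, producing growth_rate new channels.
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth_rate, 3, padding=1),
            ))
            ch += growth_rate

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # Each layer receives the channel-wise concatenation of all previous outputs.
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

out = DenseBlock(16)(torch.randn(1, 16, 32, 32))  # 16 + 4*12 = 64 output channels
```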
The concept of feedback in cybernetics involves adjusting the input based on changes in the output. Researchers have incorporated this idea into Unet networks to enhance the accuracy of medical image segmentation. Shibuya et al. [31] proposed a Feedback U-Net with convolutional LSTM, in which the output of the first round of the Unet is fed back as the input of the second round, and a convolutional LSTM extracts features based on those obtained in the first round. However, this approach only implements feedback between two stages. Such long-distance feedback inevitably introduces a semantic gap, limiting the transfer of features learned in the first stage to the second stage. Lin et al. [32] proposed Refine U-Net. To alleviate the semantic gap in the skip connections, a global refinement module was added to the Unet skip connections in the middle layers. To this end, the encoder outputs are progressively up-sampled as feedback features and fused with the corresponding decoder-side output features. However, this work only skip-connects the feedback feature map of the encoder to the decoder and does not consider feedback of decoder information. Furthermore, these works seldom consider feedback of information within the encoder and within the decoder.
Zhou et al. proposed Unet++ [9], which is composed of a set of Unets of different depths whose decoders are densely connected at the same resolution through redesigned skip connections. Inspired by the Inception module, which allows CNNs to compute more efficiently, Ibtehaz et al. introduced MultiResUnet [33], an enhanced Unet architecture that uses a chain of convolutional layers with residual connections, rather than simply connecting features from the encoder path to the decoder path. These residual connections not only reduce the semantic gap between encoder and decoder features, but also make learning easier while robustly segmenting images from various modalities at different scales. Valanarasu et al. argued that Unet performs poorly at detecting small anatomical structures with fuzzy, noisy boundaries and proposed Ki-Net [34], an overcomplete network architecture that projects data into higher dimensions. When combined with Unet, Ki-Net can capture details and segment targets better than Unet alone. The introduction of the self-attention mechanism also improves network performance [35,36]. Oktay et al. integrated attention gates into the skip connections of the U-shaped structure for medical image segmentation [11]. At the same time, researchers are trying to combine CNNs and Transformers. Chen et al. combined a Transformer and CNNs to form a powerful encoder for 2D medical image segmentation [12], exploiting the complementarity of the Transformer and the CNN to improve the segmentation capability of the model.
Although these variants of the Unet architecture show good results in biomedical image segmentation, the problem of information loss due to convolution operations in the encoder and decoder paths still exists, which affects the final segmentation accuracy.
FM-Unet utilizes the classic U-shaped network as the basic network architecture. By introducing a feedback mechanism, the output feature map of each convolution block is updated with information from the next convolution block through a feedback convolution block, as shown in Figure 1(a). At the same time, the feedback output in the feedback path is also passed to the next feedback convolution block through down-sampling or up-sampling. The feedback mechanisms designed for the encoder and decoder of FM-Unet are shown in Figure 1(b), (c).
The mechanism integrates not only the output feature map of the next convolution block in the primary path, but also the output feature map of the previous convolution block in the feedback path. This structure fuses contextual information well, so less information is lost. The complete FM-Unet consists of a basic Unet primary path (yellow blocks in Figure 1) and a feedback path (green blocks in Figure 1). In the encoder of Figure 1(b), layer i of the primary path receives the down-sampled feature map FXi-1_0 and produces the output Xi_0 of this layer. Xi_0 is then down-sampled and passed to the next layer to obtain Xi+1_0 on the primary path. The feedback-path convolution block receives the output Xi+1_0 from the next layer of the primary path and the output Xi-1_1 from the previous layer of the feedback path. Through a short skip connection, the feedback-path output Xi_1 is concatenated with the original primary-path convolution block output Xi_0 in the channel dimension (green dashed line in Figure 1). This short skip connection keeps the semantic gap between the two concatenated feature maps small and eases training [37]. The feature map obtained after the feedback concatenation is FXi_0; this feedback output of the primary path completely replaces the previous Xi_0 and is used in all subsequent operations. The feedback structure of the decoder is shown in Figure 1(c); its implementation is similar to that of the encoder. Detailed implementation diagrams of the designed feedback mechanism are given in Figures 2 and 3.
In the encoder path, the detailed implementation diagram of the feedback mechanism is shown in Figure 2.
First, the primary path receives the input H×W×C feature map, obtained by 2 × 2 max pooling of the output of the previous convolution block FXi-1_0 (step 1). Note that if the input is the initial image, the received feature map does not come from down-sampling of a previous module; it is the original input image with 1 or 3 channels. Every convolution block consists of two convolutional layers with corresponding activation functions. At this node, we obtain the primary-path output Xi_0. The output of this convolution block passes through 2 × 2 max pooling and becomes the input of the next layer in the primary path (step 2); the convolution block output of that next layer is then up-sampled into the feedback path (step 3). The input to the feedback path may also come from the layer above in the feedback path (step 4). The output of the feedback convolution block is a feature map incorporating contextual information. This feedback feature map is concatenated in the channel dimension with the previously output feature map of the current layer's primary path (steps 5 and 6), and the concatenated feature map is used as input to the current primary-path convolution block. The feature map produced by this convolution block is the updated feature map of that primary-path layer, giving the updated output FXi_0. This feature map completely replaces the previous output of the layer in all subsequent operations.
The encoder feedback mechanism that we designed enables the i-th layer node in the encoder to perform a feedback update on its original output Xi_0. The feedback output Xi_1 is obtained by capturing, at the feedback node, the information Xi+1_0 of the next layer in the primary path and the information Xi-1_1 of the layer above in the feedback path. Xi_1 is concatenated with the original output Xi_0 and updated by a convolution block, so that the primary path finally returns the output FXi_0. This operation lets the contracting network on the left capture the image's contextual information, thus mitigating information loss.
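The sketch below (PyTorch assumed) illustrates our reading of this encoder feedback update for a single layer i. The tensor names mirror the paper's notation, while the channel sizes, the exact convolution blocks and the sampling operators are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

C = 64
primary_i   = conv_block(C, C)          # primary-path block of layer i
primary_ip1 = conv_block(C, 2 * C)      # primary-path block of layer i+1
feedback_i  = conv_block(2 * C + C, C)  # feedback block of layer i
update_i    = conv_block(2 * C, C)      # fuses Xi_0 with the feedback output Xi_1

x_in   = torch.randn(1, C, 128, 128)          # down-sampled FX_{i-1}_0 (step 1)
Xi_0   = primary_i(x_in)                      # original output of layer i
Xip1_0 = primary_ip1(F.max_pool2d(Xi_0, 2))   # next-layer primary output (step 2)
Xim1_1 = torch.randn(1, C, 256, 256)          # feedback output of the layer above (assumed given)

fb_in = torch.cat([F.interpolate(Xip1_0, scale_factor=2),  # step 3: up-sample the next layer
                   F.max_pool2d(Xim1_1, 2)], dim=1)        # step 4: down-sample the upper feedback
Xi_1  = feedback_i(fb_in)                                  # feedback-path output of layer i
FXi_0 = update_i(torch.cat([Xi_0, Xi_1], dim=1))           # steps 5-6: concatenate and update
# FXi_0 replaces Xi_0 in all subsequent operations.
```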
In the decoder path, the detailed implementation diagram of the feedback mechanism is shown in Figure 3.
First, the primary path receives the bilinear-interpolation up-sampled feature map of the decoder (step 1). Meanwhile, the updated feedback feature map from the encoder primary path is fused through a skip connection (step 2). The skip connection follows the standard Unet architecture and concatenates the two feature maps in the channel dimension. The output of this convolution block is up-sampled to the next layer of the decoder primary path (step 3), and that layer is similarly concatenated with the encoder primary-path feature map (step 4) and the up-sampled feature map. The feature map of this layer is then down-sampled into the feedback path of the decoder (step 5), and the input of this feedback path may also include the feature map from the previous layer of the decoder feedback path (step 6). The output of this feedback convolution block is connected to the primary path of the current layer via a short skip connection (step 7). At this point, the primary path concatenates the feedback convolution block output feature map, the previous output feature map of the current layer's primary path (step 8) and the skip-connected feedback feature map of the encoder primary path (step 9). We obtain the updated primary-path output FYi_0. The output of the convolution block applied to this concatenated input is the current feedback feature map of that primary-path layer and, as in the encoder, completely replaces the previous output of that layer in subsequent operations.
We designed the decoder feedback mechanism to enable the original output Yi_0 of the i-th layer node in the decoder to be updated with feedback. The feedback output Yi_1 is obtained by capturing, at the feedback node, the information Yi+1_0 of the next layer of the primary path and the information Yi-1_1 of the layer above in the feedback path. The feedback-path output Yi_1, the original output Yi_0 and the encoder primary-path feedback output FXi_0 are concatenated and passed through a convolution block to obtain the updated primary-path feedback output FYi_0. This operation allows the expanding network on the right to accurately locate the required segmented part of the image and provide more detailed information.
In this section, we conduct extensive experiments to evaluate the performance of the proposed image segmentation framework and compare it with the baseline model on several benchmark datasets.
Breast Ultrasound Images (BUSI) [38] is a dataset of breast ultrasound scans for breast cancer. The images were collected in 2018 from 600 female patients aged 25 to 75. The dataset consists of 780 images, all cropped to different sizes to remove unused and unimportant borders. The images are in PNG format and divided into three categories: normal, benign and malignant, and each image has a corresponding ground truth (mask) image. Both benign and malignant images are used in the experiments. To standardize the network input size and take advantage of GPU parallelization, we resized the 647 used images to 256 × 256 RGB.
The International Skin Imaging Collaboration 2018 (ISIC 2018) challenge [39] is the world's largest skin image analysis challenge and has organized the world's largest public dermoscopy image library. The challenge was divided into three image analysis tasks: lesion segmentation, lesion attribute detection and disease classification; we performed only the lesion segmentation task. The dataset contains 2594 images in three categories: 20.0% melanoma, 72.0% nevus and 8.0% seborrheic keratosis. The images have various resolutions, and for the same reason as above we resized all of them to 512 × 512 RGB.
The STARE dataset [40] was first introduced by Michael Goldbaum in 1975 as a color fundus image database designed for retinal vessel segmentation. It comprises 20 fundus images, 10 with lesions and 10 without, at a resolution of 605 × 700. To avoid overfitting, we randomly cropped each image to 256 × 256 four times and introduced random noise into the images. This not only enlarges the dataset but also satisfies the standardized network input size and the requirements of GPU parallel data processing.
All experiments were run on a Tesla P40 (24 GB) graphics card. FM-Unet is implemented in Python 3.8 and trained with an SGD optimizer with a learning rate of 0.001, momentum of 0.9 and weight decay of 0.0001. The SGD optimizer is combined with a cosine annealing learning-rate schedule with a minimum learning rate of 0.00001 and a cosine period (T_max) of one epoch. The batch size is set to 4 and a total of 100 epochs are performed. Each dataset is randomly split into a 70% training set and a 30% validation set. To increase the diversity of the training samples and give the trained model stronger generalization ability, we also augmented the data with random rotations and random adjustments of hue, brightness and cropping of the images.
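A minimal training-setup sketch matching the hyper-parameters reported above (PyTorch assumed; `model` is a placeholder for FM-Unet, and stepping the scheduler once per epoch is our interpretation of a T_max of one epoch):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, 3, padding=1)  # placeholder for FM-Unet

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=1e-4)
# Cosine annealing with a period of one epoch and a minimum learning rate of 1e-5.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1, eta_min=1e-5)

for epoch in range(100):
    # ... one pass over the 70% training split with batch size 4, then validation ...
    scheduler.step()
```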
FM-Unet uses a combination of binary cross-entropy and the Dice coefficient as the loss function for training on all of the above datasets. The cross-entropy loss function is shown in Eq (1).
$$\mathrm{BCE}(Y,\hat{Y}) = -\sum_{i=1}^{N} Y(x_i)\cdot\log\hat{Y}(x_i) \qquad (1)$$
where $Y$ and $\hat{Y}$ are the true image label and the predicted image respectively, and $x_i$ is the $i$-th pixel of the image. The calculation formula of the Dice coefficient is shown in Eq (2).
$$\mathrm{Dice} = \frac{2\cdot|Y\cap\hat{Y}|}{|Y|+|\hat{Y}|} \qquad (2)$$
where $Y\cap\hat{Y}$ denotes the intersection of the sets $Y$ and $\hat{Y}$, and $|Y|$ and $|\hat{Y}|$ denote the number of their elements. For the segmentation task, $Y$ and $\hat{Y}$ denote the ground truth (GT) and the predicted mask of the segmentation. The Dice loss is $\mathrm{DiceLoss} = 1 - \mathrm{Dice}$, calculated as shown in Eq (3).
$$\mathrm{DiceLoss}(Y,\hat{Y}) = 1 - \frac{2\cdot|Y\cap\hat{Y}|}{|Y|+|\hat{Y}|} \qquad (3)$$
The combination of binary cross-entropy and the Dice loss between $Y$ and $\hat{Y}$ is used as the final loss function, expressed as Eq (4).
$$L = 0.5\,\mathrm{BCE}(Y,\hat{Y}) + \mathrm{DiceLoss}(Y,\hat{Y}) \qquad (4)$$
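A minimal sketch of the combined loss in Eq (4) (PyTorch assumed). A soft Dice with a small smoothing term, computed on sigmoid probabilities, is used here; this is a common implementation choice rather than the authors' exact code.

```python
import torch
import torch.nn.functional as F

def bce_dice_loss(logits, target, smooth=1e-6):
    prob = torch.sigmoid(logits)
    bce = F.binary_cross_entropy_with_logits(logits, target)
    inter = (prob * target).sum(dim=(1, 2, 3))                 # soft |Y ∩ Y_hat|
    denom = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    dice = (2 * inter + smooth) / (denom + smooth)             # Eq (2)
    return 0.5 * bce + (1 - dice).mean()                       # Eq (4): 0.5*BCE + DiceLoss

loss = bce_dice_loss(torch.randn(4, 1, 256, 256),
                     torch.randint(0, 2, (4, 1, 256, 256)).float())
```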
In order to evaluate the performance of the proposed framework relative to the baseline approaches, we use the F1-score and Intersection-Over-Union (IOU) as evaluation metrics. The F1-score is a common measure for binary and multi-class classification problems and is often used as the final metric. It is the harmonic mean of precision and recall, with a maximum of 1 and a minimum of 0. The F1-score is defined as shown in Eq (5).
$$F1 = \frac{2\cdot Y\cdot\hat{Y}}{Y+\hat{Y}} \qquad (5)$$
The IOU is a standard performance measure for object segmentation problems, and its definition is shown in Eq (6). Given a set of images, IOU gives the similarity between the predicted regions and GT of the objects present in the set.
$$\mathrm{IOU} = \frac{|Y\cap\hat{Y}|}{|Y\cup\hat{Y}|} \qquad (6)$$
where $|\cdot|$ denotes the cardinality of a set. The IOU is the area of overlap between the predicted segmentation and the GT divided by the area of their union. For binary or multi-class segmentation, the mean IOU is calculated by averaging the IOU over all classes. The IOU ranges from 0 to 1, where 1 represents a perfect match between the GT and the predicted segmentation and 0 indicates a complete mismatch.
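A minimal sketch of the IOU (Eq 6) and F1 (Eq 5) computations on binarized predictions (PyTorch assumed; the 0.5 threshold is an assumption):

```python
import torch

def iou_f1(pred_prob, target, thr=0.5, eps=1e-6):
    pred = (pred_prob > thr).float()
    inter = (pred * target).sum()              # |Y ∩ Y_hat|
    union = pred.sum() + target.sum() - inter  # |Y ∪ Y_hat|
    iou = (inter + eps) / (union + eps)        # Eq (6)
    f1 = (2 * inter + eps) / (pred.sum() + target.sum() + eps)  # Eq (5) on binary masks
    return iou.item(), f1.item()

iou, f1 = iou_f1(torch.rand(1, 1, 256, 256),
                 torch.randint(0, 2, (1, 1, 256, 256)).float())
```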
In this section, FM-Unet is compared with existing models to verify the effectiveness of the method. We choose classical and widely used models for medical image segmentation: Unet [8], Unet++ [9], ResUnet [10], Attention Unet [11], TransUnet [12], MedFormer [41] and UNeXt [42].
Figure 4 shows the experimental results of FM-Unet and the other models on the BUSI dataset, comparing the loss function on the training set and the evaluation metrics on the validation set over 100 epochs. We performed a random split of the BUSI dataset into a 70% training set and a 30% validation set, and augmented the data with random rotation and adjustments of hue, brightness and cropping of the images.
Figure 4(a)–(c) show the changes in the loss function, IOU coefficient and F1-score of each model on the validation set as the number of iterations increases over 100 epochs. In Figure 4(a), the loss function of FM-Unet decreases the fastest and stabilizes at around 70 epochs. At the same time, Figure 4(b), (c) show that FM-Unet performs best on both the IOU coefficient and the F1-score and achieves good results in fewer iterations; FM-Unet already reaches a good IOU coefficient and F1-score at 50 epochs. Compared with TransUnet and AttentionUnet, FM-Unet achieves better segmentation results with fewer iterations. Compared with Unet, Unet++ and ResUnet, the IOU coefficient and F1-score of FM-Unet increase steadily in the early training period, while those of the remaining networks jump sharply, especially the Unet network, which oscillates most drastically. This indicates that FM-Unet is more adaptable, robust and effective, and that the overall network design is sound.
Table 1 lists the parameter counts (Params) of FM-Unet, Unet and its improved architectures, which reflect the spatial complexity and, indirectly, the relative computation time of the models. The data in Table 1 were obtained on an input tensor of 256 × 256 × 3. From Table 1, the Params of Unet are about 31.13 MB. Unet++ halves the output channels of the Unet network and consolidates Unet structures of different sizes into one network; its Params are about 9.16 MB, more than three times smaller than Unet's. Res-Unet and Att-Unet are also improvements based on Unet, but with relatively large increases in Params, 62.74 MB and 51.99 MB respectively. With the recent surge of Transformers, Trans-Unet, a fusion of Transformers and the Unet network, has seen a huge increase in Params. Much current medical image segmentation research focuses on attention mechanisms and Transformers, but neglects computation time and memory consumption in the pursuit of better segmentation performance. FM-Unet has somewhat more Params than Unet, but at 48.03 MB it is still smaller than the popular attention-based and Transformer-related networks: 3.96 MB smaller than Att-Unet and 57.29 MB smaller than Trans-Unet.
Networks | Params (in MB) |
Unet | 31.13 |
Unet++ | 9.16 |
Res-Unet | 62.74 |
Att-Unet | 51.99 |
Trans-Unet | 105.32 |
MedFormer | 28.07 |
UNeXt | 1.47 |
FM-Unet | 48.03 |
a) Results on Breast Ultra Sound Images (BUSI) dataset
Table 2 shows the experimental results of several of the most popular segmentation networks on the BUSI segmentation dataset. As shown in Table 2, FM-Unet obtained the highest IOU and F1-score, 70.21% and 80.53% respectively. Compared with the results of the other seven models, the IOU coefficient of FM-Unet improved by 8.16%, 7.49%, 5.39%, 4.96%, 3.29%, 8.95% and 3.26%, respectively, and the F1-score improved by 4.51%, 4.10%, 3.06%, 3.03%, 1.23%, 6.05% and 1.16%, respectively. This shows that FM-Unet achieves better results than the other models and reaches state-of-the-art performance.
Networks | BUSI IOU (%) | BUSI F1-score (%) | ISIC 2018 IOU (%) | ISIC 2018 F1-score (%) | STARE IOU (%) | STARE F1-score (%) |
Unet | 62.05 | 76.02 | 72.69 | 83.72 | 63.72 | 77.79 |
Unet++ | 62.72 | 76.43 | 74.24 | 84.76 | 64.62 | 78.46 |
Res-Unet | 64.82 | 77.47 | 73.16 | 84.07 | 64.42 | 78.83 |
Att-Unet | 65.25 | 77.50 | 75.10 | 85.38 | 63.52 | 77.53 |
Trans-Unet | 66.92 | 79.30 | 80.51 | 88.91 | 65.44 | 79.06 |
MedFormer | 61.26 | 74.48 | 81.14 | 88.98 | 64.50 | 78.38 |
UNeXt | 66.95 | 79.37 | 81.70 | 89.70 | 64.27 | 78.21 |
FM-Unet | 70.21 | 80.53 | 82.14 | 89.95 | 66.13 | 79.47 |
b) Results on International Skin Imaging Collaboration (ISIC 2018) dataset
Table 2 also shows the experimental results of several advanced segmentation networks on the ISIC 2018 segmentation dataset. The experimental results show that FM-Unet obtained the highest IOU and F1-score, 82.14% and 89.95% respectively. Compared with the results of the other seven models, the IOU coefficient of FM-Unet improved by 9.45%, 7.90%, 8.98%, 7.04%, 1.63%, 1.00% and 0.44%, respectively, and the F1-score improved by 6.22%, 5.51%, 5.88%, 4.57%, 1.04%, 0.97% and 0.25%, respectively. This indicates that FM-Unet achieves better results than the other models on the ISIC 2018 dataset and reaches state-of-the-art performance.
c) Results on Structured Analysis of the Retina (STARE) dataset
Table 2 also shows the experimental results of several advanced segmentation networks on the STARE segmentation dataset. The experimental results show that FM-Unet obtained the highest IOU and F1-score, 66.13% and 79.47% respectively. Compared with the results of the other seven models, the IOU coefficient of FM-Unet improved by 2.41%, 1.51%, 1.71%, 2.61%, 0.69%, 1.63% and 1.86%, respectively, and the F1-score improved by 1.68%, 1.01%, 0.64%, 1.94%, 0.41%, 1.09% and 1.26%, respectively. This indicates that FM-Unet achieves better results than the other models on the STARE dataset and reaches state-of-the-art performance.
Quantitative evaluation alone may not be sufficient to fully understand the advantages of the proposed model. As shown by the evaluation results in Table 2, FM-Unet achieves the best scores, but visual inspection is required to determine whether the proposed model works as expected. To this end, Figure 5 gives some visual comparison examples of segmentation on the BUSI, ISIC 2018 and STARE datasets.
FM-Unet employs a feedback mechanism and achieves better results than other state-of-the-art segmentation networks. These visual segmentation results show that FM-Unet can recover finer segmentation details successfully, and unexpected segmentation results are less likely to occur for complex backgrounds.
In this paper, we introduce a novel feedback encoder-decoder deep convolutional network architecture based on the U-shaped structure for medical image segmentation. The core idea is to add an encoder feedback path and a decoder feedback path to the basic Unet framework. The proposed feedback path targets the information loss caused by up-sampling and down-sampling and fuses contextual information well. FM-Unet is also efficient, with lower complexity and fewer parameters than most improved Unet-based networks. Experimental results show that the proposed architecture outperforms state-of-the-art baselines on several benchmarks, and that the network achieves excellent segmentation results on complex backgrounds. Our work has some limitations. FM-Unet relies entirely on convolutional operations, and the locality and weight sharing of the receptive field in convolution make it difficult for the network to learn global information. In future work, we plan to enhance the information-extraction ability of the feedback mechanism so that the feedback module has a global receptive field, and to explore a more effective information-fusion architecture between the primary and feedback paths.
This work is partially supported by the Natural Science Foundation of Fujian Province (Grant Nos. 2020J01816 and 2022J01916), the National Natural Science Foundation of China (Grant No. 62106092) and the Principal Foundation of Minnan Normal University (Grant No. KJ18010).
The authors have no conflict of interest.