
Cytokine signaling in the modulation of post-acute and chronic systemic inflammation: a review of the influence of exercise and certain drugs

  • Acute inflammation in response to stimuli such as infection can be of deleterious amplitude and/or duration in some individuals and often tends towards chronicity in older adults. This inflammatory pattern appears to be causally linked to higher all-cause mortality and other adverse outcomes such as frailty, sarcopenia, mood disorders and impaired cognitive function. Patients in this clinical state have a persistent pro-inflammatory cytokine profile. Exercise has been shown to shift baseline levels of tumor necrosis factor (TNF), interleukin-1 (IL-1) and other cytokines to a less inflamed setting, with interleukin-6 (IL-6) playing a key modulating role. Drugs can also modulate innate immune cells and their biochemical networks with a shift to a surveillance pattern. Theophylline and chloroquine are examples of drugs that could have clinical value as immune modulators. For example, theophylline induces a 20 percent fall in TNF and around a 200 percent increase in IL-10 production by blood-harvested mononuclear cells, and a fall of about 50 percent in interferon-gamma (IFN-γ) release. Pharmacological activity in that domain could be exploited in clinical practice, with the aim of establishing a less pro-inflammatory innate immune milieu after provocations such as infection, trauma or major surgery.

    Citation: Stephen C Allen. Cytokine signaling in the modulation of post-acute and chronic systemic inflammation: a review of the influence of exercise and certain drugs[J]. AIMS Allergy and Immunology, 2020, 4(4): 100-116. doi: 10.3934/Allergy.2020009



    As a key technology of the internet of things, speech recognition plays an important role in electronic products such as smart homes and vehicle-mounted equipment. However, interference from surrounding environmental noise can seriously degrade the quality and intelligibility of the speech signal. In response to this problem, speech enhancement technology, aimed at improving speech quality, reducing noise, and enhancing speech information, has emerged [1,2].

    In the last century, with limited computing resources and immature technology, researchers relied largely on traditional signal processing methods. Boll [3] attempted to recover clean speech by subtracting an estimate of the noise spectrum from the noisy spectrum, but spectral subtraction performs poorly on nonstationary noise. To address this, Ephraim et al. [4] reduced the impact of noise by averaging the samples within a sliding window, and their experiments showed that the quality and intelligibility of the speech improved significantly compared to earlier methods. To further strengthen performance against nonstationary noise, some researchers replaced each sample with the median of the values in the window, which improved denoising of nonstationary and impulsive noise [5,6]. To overcome the limitations of median filtering, Widrow et al. [7] used adaptive filtering, which automatically adjusts its parameters according to the signal and noise, improves signal quality, effectively suppresses various noise types, and is suitable for complex noise environments and real-time processing. Although traditional methods achieved much in speech enhancement, their scope remains limited, for example in recovering fine detail of the speech signal and in the range of usable environments. Deep learning methods can compensate for these deficiencies through data-driven feature learning, thereby achieving better noise suppression and speech enhancement [8,9].
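As a concrete illustration of the classical approach described above, the following sketch implements basic magnitude spectral subtraction in NumPy. The function name, frame length, and framing scheme are our own illustrative choices, not details from the cited works:

```python
import numpy as np

def spectral_subtraction(noisy, noise_est, frame_len=256):
    """Basic magnitude spectral subtraction (Boll-style sketch).

    noisy:     1-D noisy speech signal
    noise_est: 1-D noise-only segment used to estimate the noise spectrum
    """
    # Average noise magnitude spectrum from a noise-only segment
    noise_frames = noise_est[:len(noise_est) // frame_len * frame_len]
    noise_frames = noise_frames.reshape(-1, frame_len)
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)

    out = np.zeros(len(noisy), dtype=float)
    for start in range(0, len(noisy) - frame_len + 1, frame_len):
        frame = noisy[start:start + frame_len]
        spec = np.fft.rfft(frame)
        mag = np.abs(spec) - noise_mag          # subtract the noise estimate
        mag = np.maximum(mag, 0.0)              # half-wave rectify negative values
        phase = np.angle(spec)                  # reuse the noisy phase
        out[start:start + frame_len] = np.fft.irfft(mag * np.exp(1j * phase),
                                                    n=frame_len)
    return out
```

This makes the stated weakness visible: the noise estimate is a single averaged spectrum, so any noise whose spectrum changes over time (nonstationary noise) is over- or under-subtracted.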

    Up to now, speech enhancement technology has completed the transition from traditional signal processing methods to deep learning methods [10,11]. Grais et al. [12,13] used a deep neural network (DNN) to model the spectral or time-domain characteristics of the speech signal and to learn the nonlinear relationship between speech and noise. As speech enhancement tasks grew more complex, Strake et al. [14,15] introduced the convolutional neural network (CNN), which is popular with researchers for its efficient feature extraction and small parameter count. Nonetheless, a CNN cannot learn features directly from the raw signal when processing speech, which limits its ability to model time-series data [16]. To address this, Choi et al. [17,18] introduced recurrent neural networks (RNN) into speech enhancement models to improve the modeling of speech and noise. Hsieh et al. [19,20] combined CNN and RNN, which not only improved the model's handling of time-series data but also sped up training and inference. In recent years, under the data-driven paradigm, autoencoders (AED) [21] and generative adversarial networks (GAN) [22] have emerged. The AED learns low-dimensional representations of data without supervision, reducing the need for labels and making training more flexible. The GAN, consisting of a generator and a discriminator, is likewise an unsupervised method that improves its output through adversarial training. Pascual et al. [23,24] demonstrated for the first time that GANs significantly outperform other models in speech enhancement. However, GANs still face many problems in practical applications [25,26]. To further improve performance, Hao et al. [27] introduced attention mechanisms into the GAN model, and their experiments showed that the model can effectively capture local features and establish long-range dependencies in the data. To further strengthen feature extraction and data generation, Pandey et al. [28] combined the AED and GAN models into a more flexible enhancement strategy.

    GAN-based models perform well on speech signals. The generator can synthesize samples close to real speech and improve them through adversarial training; the network can learn complex speech attributes such as speaking rate, pitch, and noise, bringing its output closer to real speech; as an unsupervised method, a GAN does not require large amounts of clearly labeled speech data, reducing the difficulty of data acquisition; and the generator can simulate multiple noise types, making the model robust across environments. These properties make the GAN a powerful tool for speech enhancement tasks. Nevertheless, such models have drawbacks, notably the absence of aggregated feature information. Discrete, non-aggregated features can arise from the structural design of the network: mismatched hierarchies between the encoder and decoder, and the lack of an effective information-transmission mechanism between them. An overly simple network structure cannot fully capture and transmit the correlations in complex data, so the continuity and integrity of feature information are lost during transmission. As a result, the above models still fall short of the best possible enhancement. Our investigation found that they ignore the impact on performance of feature-information aggregation between the encoder and decoder; this article therefore focuses on the non-aggregation of feature information in the GAN generator.

    Considering the above factors, this paper fully exploits the network advantages of the temporal convolutional network (TCN) [29]. By introducing modules such as multilayer convolutional layers, dilated causal convolutions, and residual connections from the TCN to aggregate feature information and let it interact effectively, the goal is to capture the feature information between the encoder and decoder and improve the feature expression ability of the overall network. The main contributions of this article are summarized as follows:

    ● A novel speech enhancement model is proposed. We extend the Self-Attention Generative Adversarial Network for Speech Enhancement (SASEGAN) model [30]. By integrating the TCN network with the generator, the model captures both local and long-distance feature information to solve the problem of non-aggregation of feature information. Moreover, our model markedly improves speech signal quality and intelligibility.

    ● This article conducts experimental verification on Chinese and English datasets based on the SEGAN and SASEGAN models, respectively. The experimental results are strong, validating the effectiveness and generalization of the model. During training, the model shows a relatively smooth and stable loss curve, verifying that it is more stable and fits better than the other models.

    The remainder of this paper is organized as follows. We introduce the two baseline models of SEGAN and SASEGAN in Section 2. In Section 3, the SASEGAN-TCN model is proposed. In Section 4, we introduce the relevant configuration of the experiment, and the results of multiple sets of experimental data are analyzed and discussed in depth.

    Assume that the input to the GAN model is the noisy speech signal $\tilde{X} = X + N$, where $X$ is the clean speech signal and $N$ is additive noise. As shown in Figure 1, the goal of speech enhancement is to recover the clean signal $X$ from the noisy signal $\tilde{X}$. SEGAN generates enhanced data $\hat{X} = G(\tilde{X}, Z)$ with a generator $G$, where $Z$ is the latent variable passed from the encoder to the decoder. The discriminator $D$ learns to distinguish the enhanced data from real clean signals, classifying them as real or fake, while the generator $G$ learns to produce enhanced signals that $D$ classifies as real. SEGAN is trained through this adversarial process with a least-squares loss. The least-squares objective functions of $D$ and $G$ can be expressed as:

    $$\min_D \mathcal{L}_{LS}(D) = \frac{1}{2}\mathbb{E}_{X,\tilde{X}\sim p_{\mathrm{data}}(X,\tilde{X})}\big[(D(X,\tilde{X})-1)^2\big] + \frac{1}{2}\mathbb{E}_{Z\sim p_Z(Z),\,\tilde{X}\sim p_{\mathrm{data}}(\tilde{X})}\big[D(G(Z,\tilde{X}),\tilde{X})^2\big], \quad (2.1)$$
    $$\min_G \mathcal{L}_{LS}(G) = \frac{1}{2}\mathbb{E}_{Z\sim p_Z(Z),\,\tilde{X}\sim p_{\mathrm{data}}(\tilde{X})}\big[(D(G(Z,\tilde{X}),\tilde{X})-1)^2\big] + \lambda\,\|G(Z,\tilde{X})-X\|_1, \quad (2.2)$$

    where $p_{\mathrm{data}}$ and $p_Z$ denote the probability density functions of the real data and the latent variable, respectively; $X$, $N$, and $\mathbb{E}$ denote the clean speech signal, the additive background noise, and the expectation with respect to the distribution in the subscript; and $\lambda$ weights the $L_1$ term.

    Figure 1.  The structure of SEGAN model.
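The least-squares objectives (2.1) and (2.2) can be sketched numerically as follows. This is a minimal NumPy illustration, assuming the discriminator outputs are given as arrays; it uses a mean rather than a sum for the $L_1$ term and $\lambda = 100$, which are common choices in SEGAN implementations rather than values stated here:

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss, Eq. (2.1):
    push D(X, X~) toward 1 and D(G(Z, X~), X~) toward 0."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake, enhanced, clean, lam=100.0):
    """Least-squares generator loss, Eq. (2.2):
    adversarial term plus the lambda-weighted L1 reconstruction term."""
    adv = 0.5 * np.mean((d_fake - 1.0) ** 2)
    l1 = lam * np.mean(np.abs(enhanced - clean))
    return adv + l1
```

Both losses reach zero exactly when the discriminator scores real pairs as 1 and fakes as 0, and when the generator both fools the discriminator and reconstructs the clean signal.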

    When traditional GANs perform speech enhancement, they rely entirely on the layer-wise convolution operations of the CNN in the model, which can blur correlations across the whole sequence and offers no way to capture dependencies between distant parts of the speech data. The SASEGAN model combines a self-attention layer, which adapts to nonlocal features, with the convolutional layers of the SEGAN model, and the effect improves significantly.

    The structure diagram of the self-attention layer is shown in Figure 2. The conv and pooling blocks in the figure represent the convolutional layer and the max pooling layer, respectively. Assume the input speech feature data is $F \in \mathbb{R}^{L\times C}$, and one-dimensional convolutions are used to compute the one-dimensional feature data. The query (Q), key (K), and value (V) vectors are derived as follows:

    $$Q = FW^Q,\quad K = FW^K,\quad V = FW^V, \quad (2.3)$$

    where $L$ and $C$ represent the time dimension and the number of channels, respectively, and $W^Q, W^K, W^V \in \mathbb{R}^{C\times C_k}$ are weight matrices whose values are produced by convolution layers with $C_k$ channels and kernel size $(1\times 1)$. The feature dimension is optimized by setting the variable $k$. At the same time, $K$ and $V$ of appropriate dimensions (denoted $\bar{K}$ and $\bar{V}$) are selected by introducing the variable $p$, which lowers the complexity. The attention map $A$ and output $O$ are then:

    $$A = \mathrm{softmax}(Q\bar{K}^T), \quad A \in \mathbb{R}^{L\times L/p}, \quad (2.4)$$
    $$O = (A\bar{V})W^O, \quad W^O \in \mathbb{R}^{C_k\times C}, \quad (2.5)$$

    where, in the example of Figure 2, $k = 2$, $p = 3$, $C = 4$, and $L = 6$. A learnable scalar $\beta$ is introduced to weight the attention output; after the convolution and other nonlinear operations, the final output $O_{out}$ can be expressed as:

    $$O_{out} = \beta O + F. \quad (2.6)$$
    Figure 2.  The structure of self-attention network.
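A minimal NumPy sketch of the self-attention computation in Eqs. (2.3)–(2.6) is given below. For clarity it omits the subsampling of $K$ and $V$ by the factor $p$ (so the attention map is $L \times L$ here), and all function names are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stable softmax
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_1d(F, WQ, WK, WV, WO, beta=0.1):
    """Non-local self-attention over an (L, C) feature map, Eqs. (2.3)-(2.6).
    WQ, WK, WV: (C, Ck) projection matrices; WO: (Ck, C).
    The pooling of K and V by the factor p is omitted for clarity."""
    Q, K, V = F @ WQ, F @ WK, F @ WV        # Eq. (2.3): linear projections
    A = softmax(Q @ K.T, axis=-1)           # Eq. (2.4): (L, L) attention map
    O = (A @ V) @ WO                        # Eq. (2.5): attended output
    return beta * O + F                     # Eq. (2.6): residual, scalar beta
```

With `beta=0` the layer is the identity, which mirrors the common practice of initializing β at zero so training starts from purely convolutional behavior.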

    In the generator, existing approaches often ignore the aggregation of feature information between the encoder and the decoder, so the model cannot obtain long-distance feature dependencies. To enhance the feature representation ability between the encoder and the decoder, this paper proposes the SASEGAN-TCN model, whose generator structure is presented in Figure 3.

    In Figure 3, the speech signal is first turned into matrix data with a dimension of (8192×16) through feature extraction. Second, a downsampling operation is performed through a multilayer CNN to compress the feature information, and the self-attention layer is used to capture long-distance feature dependencies, until the latent variable Z between the encoder and the decoder is extracted. Finally, the obtained feature information is aggregated again through the TCN network layer. By virtue of the dilated causal convolutions and residual connection modules in the TCN, the network not only avoids problems such as vanishing gradients and long-term dependence in traditional CNNs, but also achieves the aggregation of feature information between the encoder and the decoder.

    Figure 3.  The generator structure of SASEGAN-TCN model.

    Although the SASEGAN model generates feature vectors at each time step in the encoder, these features describe only the local information of the input sequence, and the output at each time step of the decoder is related only to the preceding input. This leads to the problem of non-aggregation of feature data in the variable Z. We choose the SASEGAN model with the self-attention mechanism at the 10th layer for research and analysis. When processing time-series data, a traditional CNN has limitations: with a fixed kernel size, the receptive field of the model is limited, so the model cannot capture time dependencies beyond a limited range. In view of these challenges, dilated causal convolution combines the characteristics of dilated convolution and causal convolution to enlarge the receptive field while improving parameter efficiency and parallelism. It handles long-term trends and periodic patterns well and achieves the effect of feature information aggregation. Its structure is shown in Figure 4.

    In Figure 4, assuming the input time series is $z=[z[0],z[1],z[2],z[3],z[4],z[5],\ldots,z[i]]$, the output of the dilated causal convolution is computed as:

    $$l[t] = \sum_{c=0}^{k-1} z[t - d\cdot c]\,w[c], \quad (3.1)$$

    where $t$, $d$, $c$, $k$, $l[t]$, $z[t-d\cdot c]$, and $w[c]$ represent the time step, the dilation rate, the index within the convolution kernel, the kernel size, the output at time step $t$, the input at time step $t-d\cdot c$, and the kernel weight at index $c$, respectively.

    Figure 4.  The structure diagram of dilated causal convolution.
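Equation (3.1) can be checked with a direct NumPy implementation. This is an illustrative sketch assuming zero padding for indices before the start of the sequence, which is what makes the convolution causal:

```python
import numpy as np

def dilated_causal_conv(z, w, d):
    """Dilated causal 1-D convolution, Eq. (3.1):
    l[t] = sum_c w[c] * z[t - d*c], zero-padded where t - d*c < 0."""
    k = len(w)
    l = np.zeros(len(z), dtype=float)
    for t in range(len(z)):
        for c in range(k):
            idx = t - d * c
            if idx >= 0:                    # causal: never look ahead of t
                l[t] += w[c] * z[idx]
    return l
```

For example, with kernel `w = [1, 2]` and dilation `d = 2`, each output mixes the current sample with the sample two steps back, so a unit impulse at t = 0 reappears (scaled by 2) at t = 2.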

    This paper also takes into account the vanishing and exploding gradient problems that traditional recurrent neural networks face when processing time-series data. The TCN network therefore uses residual connections to bypass the convolution layers and pass the original feature information directly to the output layer, alleviating the gradient problem and improving information transfer. Assuming the input is $x$ and the output after the Rectified Linear Unit (ReLU) nonlinear operations is $F$, the final output $o$ of the residual network is:

    $$o = x + F(x, W), \quad (3.2)$$

    where F(x,W) and W represent the nonlinear operation and network weight of the residual part, respectively.

    The residual connection module in the TCN network is shown in Figure 5. Through multilayer convolution layers, dilated causal convolutions, and residual connections, the TCN aggregates feature information well and realizes feature interaction, improving the overall network's performance and feature expression ability. Accordingly, we integrate the SASEGAN model with the TCN network, processing the final output of the encoder in the generator (the latent variable Z) through a two-layer TCN network to aggregate feature information and improve the speech enhancement effect.

    Figure 5.  The structure of TCN network residual module.
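Combining the dilated causal convolution with the residual connection of Eq. (3.2) gives a minimal TCN-style residual block. The sketch below is illustrative only: it omits the weight normalization and dropout used in full TCN implementations, and the helper and function names are our own:

```python
import numpy as np

def dcconv(z, w, d):
    """Dilated causal 1-D convolution: out[t] = sum_c w[c] * z[t - d*c]."""
    out = np.zeros(len(z), dtype=float)
    for c, wc in enumerate(w):
        if d * c < len(z):
            out[d * c:] += wc * z[:len(z) - d * c]  # shift by d*c, zero-pad front
    return out

def relu(x):
    return np.maximum(x, 0.0)

def tcn_residual_block(x, w1, w2, d):
    """TCN residual block sketch of Eq. (3.2), o = x + F(x, W):
    F is two dilated causal convolutions, each followed by ReLU."""
    h = relu(dcconv(x, w1, d))
    h = relu(dcconv(h, w2, d))
    return x + h   # identity shortcut: gradients flow past F unchanged
```

The identity shortcut is the mechanism the text describes: even if the convolutional branch F saturates, the input x still reaches the output, which keeps gradients from vanishing through stacked blocks.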

    This article uses the Valentini English dataset [31] and the THCHS30 Chinese dataset [32], both with an audio sampling rate of 16 kHz. The Valentini dataset contains audio from 30 speakers in the Voice Bank corpus. The training set was recorded by 28 speakers and mixed with 10 types of noise at signal-to-noise ratios of 15, 10, 5, and 0 dB. The test set was recorded by 2 speakers and mixed with 5 types of noise from the Demand audio library at signal-to-noise ratios of 17.5, 12.5, 7.5, and 2.5 dB. For the Chinese data, we first adjust the sampling rate of 15 audio signals in NoiseX-92 and concatenate them into one long noise recording. We then traverse the training and test sets of THCHS30, randomly select a long segment of the noise recording, and mix it in at one of four signal-to-noise ratios: 0, 5, 10, or 15 dB. Table 1 shows the output dimensions of each layer of the generator in this experiment.

    Table 1.  Output dimensions of each convolutional layer in the generator.
    Layer 1 2 3 4 5 6 7 8 9 10 11
    Encoder (8192×16) (4096×32) (2048×32) (1024×64) (512×64) (256×128) (128×128) (64×256) (32×256) (16×512) (8×1024)
    Decoder (16×512) (32×256) (64×256) (128×128) (256×128) (512×64) (1024×64) (2048×32) (4096×32) (8192×16) (16384×1)


    These experiments are conducted on an NVIDIA RTX 2060 GPU with 6 GB of memory under Windows, using Python 3.7 and TensorFlow 1.13. At training time, raw audio segments in each batch are sampled from the training data with 50% overlap, followed by a high-frequency pre-emphasis filter with a coefficient of 0.95. Because the hardware configuration is limited, the TCN network used in this article has only two layers, with 32 and 16 channels, respectively. The models are trained for 10 epochs with a batch size of 10, and the learning rates of the generator and discriminator are both 0.0002.
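The pre-emphasis filter and 50%-overlap segmentation used at training time can be sketched as follows. The frame length of 16384 matches the generator output dimension in Table 1; the function names are our own illustrative choices:

```python
import numpy as np

def preemphasis(x, coeff=0.95):
    """High-frequency pre-emphasis: y[t] = x[t] - coeff * x[t-1]."""
    return np.concatenate(([x[0]], x[1:] - coeff * x[:-1]))

def frame_with_overlap(x, frame_len=16384, hop=None):
    """Slice a waveform into frames with 50% overlap (hop = frame_len // 2)."""
    if hop is None:
        hop = frame_len // 2
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop:i * hop + frame_len] for i in range(n)])
```

Pre-emphasis boosts the high-frequency components that carry consonant detail, and the 50% overlap doubles the number of training segments drawn from each recording.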

    To evaluate the effectiveness of the experiments, this article analyzes several indicators. PESQ is an objective measure of speech quality, typically ranging from -0.5 to 4.5; a higher PESQ score indicates better speech quality, and it is a pivotal metric for assessing speech coding, decoding, and communication systems. CSIG is a mean opinion score (MOS) prediction of signal distortion attending only to the speech signal; a higher CSIG score reflects less distortion of the speech. CBAK is a MOS prediction of the intrusiveness of background noise and measures the extent of noise reduction in the speech signal; a higher CBAK score signifies more effective background noise suppression. COVL is a MOS prediction of the overall effect and offers a more thorough evaluation of system performance across quality levels. The segmental signal-to-noise ratio (SSNR) assesses the ratio of speech signal to noise averaged over short segments. Finally, STOI measures short-time objective intelligibility, reported as a percentage.
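Of these metrics, SSNR is simple enough to sketch directly. The version below assumes the conventional per-frame clipping range of [-10, 35] dB; the function name and frame length are illustrative:

```python
import numpy as np

def segmental_snr(clean, enhanced, frame_len=256, eps=1e-10):
    """Segmental SNR sketch: per-frame SNR in dB, clipped to [-10, 35],
    then averaged over frames."""
    n = len(clean) // frame_len * frame_len
    c = clean[:n].reshape(-1, frame_len)
    e = enhanced[:n].reshape(-1, frame_len)
    noise_energy = np.sum((c - e) ** 2, axis=1) + eps   # residual error per frame
    signal_energy = np.sum(c ** 2, axis=1) + eps
    snr = 10.0 * np.log10(signal_energy / noise_energy)
    return float(np.mean(np.clip(snr, -10.0, 35.0)))
```

The clipping keeps silent frames (near-zero signal energy) and near-perfect frames (near-zero error) from dominating the average.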

    In order to verify the effectiveness of this method, this paper first conducts experiments on the Valentini dataset. As Table 2 shows, SEGAN-TCN improves on PESQ, STOI, SSNR, and other indicators compared with the SEGAN model. Specifically, PESQ, CBAK, COVL, and STOI reach 2.1476, 2.8472, 2.7079, and 92.61%, improvements of 9.0, 16.7, 3.0, and 0.5% over the noisy data, and the SSNR increases by 5.3724 dB. However, CSIG is slightly reduced, owing to the choice of data processing method and insufficient model training, which is elaborated later.

    Table 2.  SEGAN and SEGAN-TCN experimental results on the Valentini dataset.
    PESQ CSIG CBAK COVL SSNR STOI
    NOISY 1.97 3.35 2.44 2.63 1.68 92.11
    SEGAN [23] 1.8176 3.0043 2.4423 2.3691 3.4108 91.24
    SEGAN-TCN 2.1476 3.3388 2.8472 2.7079 7.0524 92.61


    During training of the SEGAN and SEGAN-TCN models, the curves of the discriminator's fake-sample loss (d_fk_loss), the discriminator's real-sample loss (d_rl_loss), the generator's adversarial loss (g_adv_loss), and the generator's L1 loss (g_l1_loss) are shown in Figure 6; data are recorded and plotted every 100 steps. As Figure 6 shows, the SEGAN-TCN loss curves decline more smoothly than the SEGAN curves, and the training process is relatively stable. A falling d_fk_loss denotes the discriminator's increasing proficiency at recognizing generated samples as counterfeit, while a falling d_rl_loss indicates its growing ability to classify genuine samples as authentic. A diminishing g_adv_loss suggests the generator is succeeding in fooling the discriminator with realistic samples, and a decreasing g_l1_loss signifies sample-level similarity between the generated and authentic signals.

    Figure 6.  The loss curve of SEGAN and SEGAN-TCN on the Valentini dataset.

    In order to further verify the generalization and effectiveness of the network, we continue with experiments based on the SASEGAN model. As Table 3 shows, SASEGAN-TCN achieves 2.1636, 3.4132, 2.8272, 2.7631, and 92.78% on PESQ, CSIG, CBAK, COVL, and STOI on the Valentini dataset; compared with the noisy data, these are improvements of 9.8, 1.9, 15.9, 5.1, and 0.7%, and the SSNR improves by 4.4907 dB. Data analysis reveals that SASEGAN-TCN performs well on CSIG, but its processing introduces some external noise into the speech signal, leading to slightly lower PESQ, CBAK, and SSNR values than SASEGAN. To confront and resolve these issues, we continue with further experiments and analysis.

    Table 3.  SASEGAN and SASEGAN-TCN experimental results on the Valentini dataset.
    PESQ CSIG CBAK COVL SSNR STOI
    NOISY 1.97 3.35 2.44 2.63 1.68 92.11
    SASEGAN [30] 2.2027 3.3331 2.9883 2.7441 8.3832 92.56
    SASEGAN-TCN 2.1636 3.4132 2.8272 2.7631 6.1707 92.78


    As can be seen from Figure 7, during the training phase the SASEGAN-TCN model not only fits successfully to a good state but also exhibits more stable loss curves than the SASEGAN model, confirming its higher stability and easier convergence during training and underscoring its advantage in processing the training data. The reduction in the discriminator losses (d_fk_loss, d_rl_loss) indicates improved recognition of fake and real samples; a lower g_adv_loss indicates the generator deceives the discriminator successfully, while a lower g_l1_loss represents sample-level similarity between generated and real samples.

    Figure 7.  The loss curve of SASEGAN and SASEGAN-TCN on the Valentini dataset.

    Given that the SASEGAN model can reduce speech quality and introduce external noise when processing the Valentini data, this article further verifies the effectiveness and applicability of the network on the THCHS30 Chinese dataset. The experimental results are shown in Table 4: PESQ, CSIG, CBAK, COVL, and STOI reach 1.8077, 2.9350, 2.4360, 2.3009, and 83.54%, and the SSNR increases to 4.6332 dB. Analysis of the experimental data shows that SASEGAN attains a higher SSNR but lower PESQ and STOI, indicating that it introduces additional noise during training and causes signal distortion. Nevertheless, the SASEGAN-TCN model proposed in this article not only keeps the SSNR from attenuating too much, but also effectively raises the PESQ and STOI levels.

    Table 4.  SASEGAN and SASEGAN-TCN experimental results on the THCHS30 dataset.
    PESQ CSIG CBAK COVL SSNR STOI
    NOISY 1.3969 2.3402 1.9411 1.78 1.3101 80.33
    SASEGAN [30] 1.7212 2.8051 2.3813 2.1815 4.9159 83.07
    SASEGAN-TCN 1.8077 2.9350 2.4360 2.3009 4.6332 83.54


    During the training phase, the loss curves of the SASEGAN and SASEGAN-TCN models on the THCHS30 dataset are shown in Figure 8. SASEGAN-TCN remains very stable and achieves a better fit than the other models, which indicates that our model improves the discriminator's ability to distinguish fake from real samples and enhances the generator's ability to produce fake samples extremely similar to real ones. The experiments also reveal some issues worth noting. Specifically, integrating the TCN module increases the number of model parameters, which raises the hardware cost of the experiments. In addition, while the model performs well on long speech data, its performance on short speech data may be poorer.

    Figure 8.  The loss curve of SASEGAN and SASEGAN-TCN on the THCHS30 dataset.

    Finally, this article verifies the recognition effect of the enhanced audio in speech recognition. First, the last five model checkpoints saved during SASEGAN-TCN training are used for testing, yielding enhanced audio for each of the five checkpoints. Second, the enhanced test audio is fed to a multi-core two-dimensional causal convolution fusion network with attention mechanism for end-to-end speech recognition (ASKCC-DCNN-CTC) [33]. The recognition results are shown in Table 5: the proposed model clearly improves the quality and intelligibility of the speech signal and significantly reduces the recognition error rate.

    Table 5.  Recognition results.

    Type              Test WER (%)
    Noisy audio data  60.8189
    First             50.9427
    Second            51.5100
    Third             51.3780
    Fourth            52.5014
    Fifth             50.2238
    Average           51.3112

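    The word error rates in Table 5 follow the standard definition: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. A minimal sketch of that computation (the standard metric, not the paper's exact scoring script):

    ```python
    def wer(reference, hypothesis):
        """Word error rate: (S + D + I) / N via edit distance over words."""
        ref, hyp = reference.split(), hypothesis.split()
        # dp[i][j] = edit distance between ref[:i] and hyp[:j]
        dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dp[i][0] = i
        for j in range(len(hyp) + 1):
            dp[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                               dp[i][j - 1] + 1,        # insertion
                               dp[i - 1][j - 1] + cost)  # substitution
        return dp[len(ref)][len(hyp)] / max(len(ref), 1)
    ```

    Note that the 17.4% reduction quoted below is relative to the noisy baseline: (60.8189 − 50.2238) / 60.8189 ≈ 0.174 for the best checkpoint.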

    To enhance the quality and intelligibility of speech signals effectively, this paper analyzed the characteristics of the TCN and used its multilayer convolution, dilated causal convolution, and residual connections to avoid problems such as vanishing gradients. Moreover, feature information between the encoder and decoder is aggregated, improving the network's speech-enhancement performance and feature-expression ability. Experimental results show that the proposed model yields clear improvements on the Valentini and THCHS30 datasets and remains stable during training. In addition, when the enhanced speech data were used for speech recognition, the word error rate fell by 17.4% compared with the original noisy audio data. These results indicate that the SASEGAN-TCN model used the characteristics of the TCN to solve the feature non-aggregation problem, improved the model's speech-enhancement performance and feature-expression capability, and effectively raised the quality and intelligibility of noisy speech data. Additionally, the speech recognition scheme proposed in this article maintains high recognition accuracy in noisy environments.
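    The two TCN ingredients named above, dilated causal convolution and residual connections, can be sketched in a toy single-channel numpy form (an illustration of the mechanism, not the actual SASEGAN-TCN layers):

    ```python
    import numpy as np

    def dilated_causal_conv1d(x, w, dilation):
        """1-D causal convolution: the output at time t only sees
        x[t], x[t-d], x[t-2d], ... Left zero-padding guarantees that
        no future samples leak into the current output."""
        k = len(w)
        pad = (k - 1) * dilation
        xp = np.concatenate([np.zeros(pad), x])
        return np.array([sum(w[j] * xp[t + pad - j * dilation] for j in range(k))
                         for t in range(len(x))])

    def tcn_residual_block(x, w, dilation):
        """Residual connection around the dilated conv, as in a TCN block;
        the identity path helps gradients flow through deep stacks."""
        return x + np.tanh(dilated_causal_conv1d(x, w, dilation))
    ```

    Stacking such blocks with dilations 1, 2, 4, ... grows the receptive field exponentially with depth, which is why TCNs handle long speech sequences well, consistent with the long-vs-short utterance behaviour reported above.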

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This work was supported by the National Natural Science Foundation of China (NSFC, No. 61702320).

    There are no conflicts of interest to report in this study.




    [1] Rockwood K, Howlett SE (2018) Fifteen years of progress in understanding frailty and health in aging. BMC Med 16: 220. doi: 10.1186/s12916-018-1223-3
    [2] Nelke C, Dziewas R, Minnerup J, et al. (2019) Skeletal muscle as potential central link between sarcopenia and immune senescence. EBioMedicine 49: 381-388. doi: 10.1016/j.ebiom.2019.10.034
    [3] Howcroft TK, Campisi J, Louis GB, et al. (2013) The role of inflammation in age-related disease. Aging 5: 84-93. doi: 10.18632/aging.100531
    [4] Giacconi R, Malavolta M, Costarelli L, et al. (2015) Cellular senescence and inflammatory burden as determinants of mortality in elderly people until extreme old age. EBioMedicine 2: 1316-1317. doi: 10.1016/j.ebiom.2015.09.015
    [5] Degens H (2019) Human ageing: impact on muscle force and power. Muscle and Exercise Physiology London: Academic Press, 423-432. doi: 10.1016/B978-0-12-814593-7.00019-0
    [6] Brüünsgaard H, Pedersen BK (2003) Age-related inflammatory cytokines and disease. Immunol Allergy Clin North Am 23: 15-39. doi: 10.1016/S0889-8561(02)00056-5
    [7] Franceschi C, Campisi J (2014) Chronic inflammation (inflammaging) and its potential contribution to age-associated diseases. J Gerontol A Biol Sci Med Sci 69: S4-S9. doi: 10.1093/gerona/glu057
    [8] Wener MH, Daum PR, McQuillan GM (2000) The influence of age, sex and race on the upper reference limit of serum C-reactive protein concentration. J Rheumatol 27: 2351-2359.
    [9] Chung HY, Cesari M, Anton S, et al. (2009) Molecular inflammation: Underpinnings of aging and age-related diseases. Ageing Res Rev 8: 18-30. doi: 10.1016/j.arr.2008.07.002
    [10] Caspersen CJ, Pereira MA, Curran KM (2000) Changes in physical activity patterns in the United States, by sex and cross-sectional age. Med Sci Sports Exercise 32: 1601-1609. doi: 10.1097/00005768-200009000-00013
    [11] Driver JA, Djousse L, Logroscino G, et al. (2008) Incidence of cardiovascular disease and cancer in advanced age: Prospective cohort study. BMJ 337: a2467. doi: 10.1136/bmj.a2467
    [12] Cowie CC, Rust KF, Byrd-Holt DD, et al. (2006) Prevalence of diabetes and impaired fasting glucose in adults in the US population: NHANES survey 1999–2002. Diabetes Care 29: 1263-1268. doi: 10.2337/dc06-0062
    [13] Starr ME, Saito H (2014) Sepsis in old age: Review of human and animal studies. Aging Dis 5: 126-136.
    [14] Boyd AR, Orihuela CJ (2011) Dysregulated inflammation as a risk factor for pneumonia in the elderly. Aging Dis 2: 487-500.
    [15] Kalaria RN, Maestre GE, Arizaga R, et al. (2008) Alzheimer's disease and vascular dementia in developing countries: Prevalence, management, and risk factors. Lancet Neurol 7: 812-826. doi: 10.1016/S1474-4422(08)70169-8
    [16] Coresh J, Selvin E, Stevens LA, et al. (2007) Prevalence of chronic kidney disease in the United States. JAMA 298: 2038-2047. doi: 10.1001/jama.298.17.2038
    [17] Dagenais S, Garbedian S, Wai EK (2009) Systematic review of the prevalence of radiographic primary hip osteoarthritis. Clin Orthop Relat Res 467: 623-637. doi: 10.1007/s11999-008-0625-5
    [18] Ballou SP, Lozanski FB, Hodder S, et al. (1996) Quantitative and qualitative alterations of acute-phase proteins in healthy elderly persons. Age Ageing 25: 224-230. doi: 10.1093/ageing/25.3.224
    [19] Ershler WB, Sun WH, Binkley N, et al. (1993) Interleukin-6 and aging: Blood levels and mononuclear cell production increase with advancing age and in vitro production is modifiable by dietary restriction. Lymphokine Cytokine Res 12: 225-230.
    [20] Wei J, Xu H, Davies JL, et al. (1992) Increase in plasma IL-6 concentration with age in healthy subjects. Life Sci 51: 1953-1956. doi: 10.1016/0024-3205(92)90112-3
    [21] Ahluwalia N, Mastro AM, Ball R, et al. (2001) Cytokine production by stimulated mononuclear cells did not change with aging in apparently healthy, well-nourished women. Mech Ageing Dev 122: 1269-1279. doi: 10.1016/S0047-6374(01)00266-4
    [22] Beharka AA, Meydani M, Wu D, et al. (2001) Interleukin-6 production does not increase with age. J Gerontol A Biol Sci Med Sci 56: 81-88. doi: 10.1093/gerona/56.2.B81
    [23] Kabagambe EK, Judd SE, Howard VJ, et al. (2011) Inflammation biomarkers and risk of all-cause mortality in the REGARDS cohort. Am J Epidemiol 174: 284-292. doi: 10.1093/aje/kwr085
    [24] DeMartinis M, Franceschi C, Monti D, et al. (2006) Inflammation markers predicting frailty and mortality in the elderly. Exp Mol Pathol 80: 219-227. doi: 10.1016/j.yexmp.2005.11.004
    [25] Jensen GL (2008) Inflammation: Roles in aging and sarcopenia. JPEN J Parenter Enteral Nutr 32: 656-659. doi: 10.1177/0148607108324585
    [26] Penninx BW, Kritchevsky SB, Newman AB, et al. (2004) Inflammatory markers and incident mobility limitation in the elderly. J Am Geriatr Soc 52: 1105-1113. doi: 10.1111/j.1532-5415.2004.52308.x
    [27] Christian LM, Glaser R, Porter K, et al. (2011) Poorer self-related health is associated with elevated inflammatory markers among older adults. Psychoneuroendocrinology 36: 1495-1504. doi: 10.1016/j.psyneuen.2011.04.003
    [28] Michaud M, Balardy L, Moulis G, et al. (2013) Proinflammatory cytokines, aging, and age-related diseases. J Am Med Dir Assoc 14: 877-882. doi: 10.1016/j.jamda.2013.05.009
    [29] Golbidi S, Laher I (2013) Exercise and the aging endothelium. J Diabetes Res 2013: 1-12. doi: 10.1155/2013/789607
    [30] Bruunsgaard H, Skinhoj P, Qvist J, et al. (1999) Elderly humans show prolonged in vivo inflammatory activity during pneumococcal infections. J Infect Dis 180: 551-554. doi: 10.1086/314873
    [31] Krabbe KS, Bruunsgaard H, Hansen CM, et al. (2001) Ageing is associated with a prolonged fever in human endotoxemia. Clin Diagn Lab Immunol 8: 333-338. doi: 10.1128/CDLI.8.2.333-338.2001
    [32] Wu J, Xia S, Kalionis B, et al. (2014) The role of oxidative stress and inflammation in cardiovascular aging. BioMed Res Int 2014: 1-13.
    [33] McFarlin BK, Flynn MG, Campbell W, et al. (2006) Physical activity status, but not age, influences inflammatory biomarkers and toll-like receptor 4. J Gerontol A Biol Sci Med Sci 61: 388-393. doi: 10.1093/gerona/61.4.388
    [34] Rotman-Pikielny P, Roash V, Chen O, et al. (2006) Serum cortisol levels in patients admitted to the department of medicine: Prognostic correlations and effects of age, infection and co-morbidity. Am J Med Sci 332: 61-67. doi: 10.1097/00000441-200608000-00002
    [35] Johnson DB, Kip KE, Marroquin OC, et al. (2004) Serum amyloid A as a predictor of coronary artery disease and cardiovascular outcome in women. Circulation 109: 726-732. doi: 10.1161/01.CIR.0000115516.54550.B1
    [36] Beavers KM, Brinkley TE, Nicklas BJ (2010) Effect of exercise training on chronic inflammation. Clin Chim Acta 411: 785-793. doi: 10.1016/j.cca.2010.02.069
    [37] Everett BM, Bansal S, Rifai N, et al. (2009) Interleukin-18 and the risk of future cardiovascular disease among initially healthy women. Atherosclerosis 202: 282-288. doi: 10.1016/j.atherosclerosis.2008.04.015
    [38] Gokkusu C, Aydin M, Ozkok E, et al. (2010) Influences of genetic variants in interleukin-15 gene and interleukin-15 levels on coronary heart disease. Cytokine 49: 58-63. doi: 10.1016/j.cyto.2009.09.004
    [39] Caruso DJ, Carmack AJ, Lockeshwar VB, et al. (2008) Osteopontin and interleukin-8 expression is independently associated with prostate cancer recurrence. Clin Cancer Res 14: 4111-4118. doi: 10.1158/1078-0432.CCR-08-0738
    [40] Pan W, Stone KP, Hsuchou H, et al. (2011) Cytokine signaling modulates blood-brain barrier function. Curr Pharm Des 17: 3729-3740. doi: 10.2174/138161211798220918
    [41] Allison DJ, Ditor DS (2014) The common inflammatory etiology of depression and cognitive impairment: A therapeutic target. J Neuroinflammation 11: 1-12. doi: 10.1186/s12974-014-0151-1
    [42] Tizard I (2008) Sickness behaviour, its mechanisms and significance. Anim Health Res Rev 9: 87-99. doi: 10.1017/S1466252308001448
    [43] Zoladz JA, Majerczak J, Zeligowska E, et al. (2014) Moderate-intensity interval training increases serum brain-derived neurotrophic factor level and decreases inflammation in Parkinson's disease patients. J Physiol Pharmacol 65: 441-448.
    [44] Ohman H, Savikko N, Strandberg TE, et al. (2014) Effect of physical exercise on cognitive performance in older adults with mild cognitive impairment or dementia: A systematic review. Dementia Geriatr Cognit Disord 38: 347-365. doi: 10.1159/000365388
    [45] Vinik AI, Erbas T, Casellini CMJ (2013) Diabetic cardiac autonomic neuropathy, inflammation and cardiovascular disease. J Diabetes Invest 4: 4-18. doi: 10.1111/jdi.12042
    [46] Tishler M, Caspi D, Yaron M (1985) C-reactive protein levels in patients with rheumatoid arthritis. Clin Rheumatol 4: 321-324. doi: 10.1007/BF02031616
    [47] Pal M, Febbraio MA, Whitham M (2014) From cytokine to myokine: The emerging role of interleukin-6 in metabolic regulation. Immunol Cell Biol 92: 331-339. doi: 10.1038/icb.2014.16
    [48] Pedersen BK, Febbraio M (2005) Muscle-derived interleukin-6: A possible link between skeletal muscle, adipose tissue, liver and brain. Brain Behav Immun 19: 371-376. doi: 10.1016/j.bbi.2005.04.008
    [49] Kishimoto T (2010) IL-6: From its discovery to clinical applications. Int Immunol 22: 347-352. doi: 10.1093/intimm/dxq030
    [50] Mikkelsen UR, Couppe C, Karlsen A, et al. (2013) Life-long endurance exercise in humans: Circulating levels of inflammatory markers and leg muscle size. Mech Ageing Dev 134: 531-540. doi: 10.1016/j.mad.2013.11.004
    [51] Pedersen BK, Steensberg A, Fischer C, et al. (2003) Searching for the exercise factor: Is IL-6 a candidate? J Muscle Res Cell Motil 24: 113-119. doi: 10.1023/A:1026070911202
    [52] Pedersen AMW, Pedersen BK (2005) The anti-inflammatory effect of exercise. J Appl Physiol 98: 1154-1162. doi: 10.1152/japplphysiol.00164.2004
    [53] Wojewoda M, Kmiecik K, Majerczak J, et al. (2015) Skeletal muscle response to endurance training in IL-6-/- mice. Int J Sports Med 36: 1163-1169. doi: 10.1055/s-0035-1555851
    [54] Fischer CP (2006) Interleukin-6 in acute exercise and training; what is the biological relevance? Exerc Immunol Rev 12: 6-33.
    [55] Woods JA, Veira VJ, Keylock KT (2009) Exercise, inflammation and innate immunity. Immunol Allergy Clin North Am 29: 381-393. doi: 10.1016/j.iac.2009.02.011
    [56] Pedersen BK (2011) Exercise-induced myokines and their role in chronic disease. Brain Behav Immun 25: 811-816. doi: 10.1016/j.bbi.2011.02.010
    [57] Brandt C, Pedersen BK (2010) The role of exercise-induced myokines in muscle homeostasis and the defence against chronic diseases. J Biomed Biotechnol 2010: 1-6. doi: 10.1155/2010/520258
    [58] Narici MV, Maffulli N (2010) Sarcopenia: Characteristics, mechanisms and functional significance. Br Med Bull 95: 139-159. doi: 10.1093/bmb/ldq008
    [59] Leeuwenburgh C (2003) Role of apoptosis in sarcopenia. J Gerontol A Biol Sci Med Sci 58: M999-M1001. doi: 10.1093/gerona/58.11.M999
    [60] Demontis F, Rosanna P, Goldberg AL, et al. (2013) Mechanisms of skeletal muscle aging: Insights from Drosophila and mammalian models. Dis Models Mech 6: 1339-1352. doi: 10.1242/dmm.012559
    [61] Walrand S, Guillet C, Salles J, et al. (2011) Physiopathological mechanism of sarcopenia. Clin Geriatr Med 27: 365-385. doi: 10.1016/j.cger.2011.03.005
    [62] Barnes PJ (2006) Theophylline for COPD. Thorax 61: 742-744. doi: 10.1136/thx.2006.061002
    [63] Culpitt SV, de Matos C, Russell RE, et al. (2002) Effect of theophylline on induced sputum inflammatory indices and neutrophil chemotaxis in chronic obstructive pulmonary disease. Am J Respir Crit Care Med 165: 1371-1376. doi: 10.1164/rccm.2105106
    [64] Neuner P, Klosner G, Schauer E, et al. (1994) Pentoxifylline in vivo down-regulates the release of IL-1 beta, IL-6, IL-8 and TNF alpha by human peripheral blood mononuclear cells. Immunology 83: 262-267.
    [65] Mascali JJ, Cvietusa P, Negri J, et al. (1996) Anti-inflammatory effects of theophylline: Modulation of cytokine production. Ann Allergy Asthma Immunol 77: 34-38. doi: 10.1016/S1081-1206(10)63476-X
    [66] Ito K, Lim S, Caramori G, et al. (2002) A molecular mechanism of the action of theophylline: Induction of histone deacetylase activity to decrease inflammatory gene expression. Proc Natl Acad Sci U S A 99: 8921-8926. doi: 10.1073/pnas.132556899
    [67] Ichiyama T, Hasegawa S, Matsubara T, et al. (2001) Theophylline inhibits NF-kappa B activation and I kappa B alpha degradation in human pulmonary epithelial cells. Naunyn-Schmiedeberg's Arch Pharmacol 364: 558-561. doi: 10.1007/s00210-001-0494-x
    [68] Spatafora M, Chiappara G, Merendino AM, et al. (1994) Theophylline suppresses the release of TNF alpha by blood monocytes and alveolar macrophages. Eur Respir J 7: 223-228. doi: 10.1183/09031936.94.07020223
    [69] Yoshimura T, Usami E, Kurita C, et al. (1995) Effect of theophylline on the production of IL-1 beta, TNF alpha and IL-8 by human peripheral blood mononuclear cells. Biol Pharm Bull 18: 1405-1408. doi: 10.1248/bpb.18.1405
    [70] Subramanian V, Ragulan AB, Jindal A, et al. (2015) The study of tolerability and safety of theophylline given along with formoterol plus budesonide in COPD. J Clin Diagn Res 9: 10-13.
    [71] Hancock REW, Nijnik A, Philpott DJ (2012) Modulating immunity as a therapy for bacterial infections. Nat Rev Microbiol 10: 243-254. doi: 10.1038/nrmicro2745
    [72] Shih YN, Chen YT, Seethala R, et al. (2015) Effect of the use of theophylline and sepsis outcomes. Crit Care Med 43: 274. doi: 10.1097/01.ccm.0000474919.64656.12
    [73] Zhang J, Feng MX, Qu JM (2012) Low dose theophylline showed an inhibitory effect on the production of IL-6 and IL-8 in primary lung fibroblasts from patients with COPD. Mediators Inflammation 2012: 1-7.
    [74] Mosire K, Renvall MJ, Ramsdell JW, et al. (1996) The effect of theophylline on metabolic rate in COPD patients. J Am Coll Nutr 15: 403-407. doi: 10.1080/07315724.1996.10718616
    [75] Cosio BG, Iglesias A, Rios A, et al. (2009) Low-dose theophylline enhances the anti-inflammatory effects of steroids during exacerbations of COPD. Thorax 64: 424-429. doi: 10.1136/thx.2008.103432
    [76] Horby P, Lim WS, Emberson J, et al. (2020) Effect of dexamethasone in hospitalized patients with Covid-19: preliminary report. MedRxiv. In press.
    [77] Bodera P, Stankiewicz W (2011) Immunomodulatory properties of thalidomide analogs: Pomalidomide and lenalidomide, experimental and therapeutic applications. Recent Pat Endocr Metab Immune Drug Discovery 5: 192-196. doi: 10.2174/187221411797265890
    [78] Eski M, Sahin I, Sengezer M, et al. (2008) Thalidomide decreases the plasma levels of IL-1 and TNF following burn injury: Is it the new drug for modulation of systemic inflammatory response. Burns 34: 104-108. doi: 10.1016/j.burns.2007.01.007
    [79] Lee SY, Kim W, Park HW, et al. (2015) Anti-sarcopenic effects of diamino-diphenyl sulfone observed in elderly female leprosy survivors: A cross-sectional study. J Cachexia Sarcopenia Muscle 7: 322-329. doi: 10.1002/jcsm.12074
    [80] Borne B, Dijkmans BAC, Rooij HH, et al. (1997) Chloroquine and hydroxychloroquine equally affect TNF alpha, IL-6 and IF gamma production by peripheral blood mononuclear cells. J Rheumatol 24: 55-60.
    [81] Allen SC, Tiwari D (2019) The potential to use chloroquine and other 4-aminoquinoline analogues to modulate persisting inflammation in old age. SM Gerontol Geriatric Res 3: 1020-1024. doi: 10.36876/smggr.1020
    [82] Landi F, Marzetti E, Liperoti R, et al. (2013) Nonsteroidal anti-inflammatory drug (NSAID) use and sarcopenia in older people: Results from the ilSIRENTE study. J Am Med Dir Assoc 14: 626.e9-e13. doi: 10.1016/j.jamda.2013.04.012
    [83] Wu K, Tian S, Zhou H, et al. (2013) Statins protect human endothelial cells from TNF-induced inflammation via ERK5 activation. Biochem Pharmacol 85: 1753-1760. doi: 10.1016/j.bcp.2013.04.009
    [84] Saisho Y (2015) Metformin and inflammation: Its potential beyond glucose-lowering effect. Endocr Metab Immune Disord Drug Targets 15: 196-205. doi: 10.2174/1871530315666150316124019
    [85] Hattori Y, Hattori K, Hayashi T (2015) Pleiotropic benefits of metformin: Macrophage targeting its anti-inflammatory mechanisms. Diabetes 64: 1907-1909. doi: 10.2337/db15-0090
  • © 2020 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
