Research article

Model for sustainable carbon emission reduction energy development and smart grid technology strategy

  • Received: 26 August 2024 Revised: 30 October 2024 Accepted: 07 November 2024 Published: 28 November 2024
  • In the context of sustainable energy development to reduce carbon emissions, new energy sources and smart grid technologies are being applied ever more widely in power systems. However, current research on power system technology strategies for carbon emission reduction remains unsatisfactory. To address this problem, a model for optimal power system operation and scheduling based on a prediction error mechanism and synthetic fuel technology is proposed. The model uses a carbon trading mechanism to further reduce carbon emissions and the carnivorous plant algorithm to optimize the scheduling strategy. The results indicate that the model has significant advantages in carbon emissions (60.8 kg), total operating cost (2517.5 yuan), prediction accuracy (96.5%), and energy utilization efficiency (90.2%), showing that it uses energy more fully and helps enhance the overall energy efficiency of the system. The calculation time of the optimized power system was only 12.5 s, its stability was as high as 98.7%, and its user satisfaction rate was 95.6%. Compared with other contemporary designs, the proposed model can successfully reduce the system's carbon emissions while increasing energy efficiency. The model has positive implications for smart grids and sustainable development.

    Citation: Kangli Xiang, Keren Chen, Simin Chen, Wanqing Chen, Jinyu Chen. Model for sustainable carbon emission reduction energy development and smart grid technology strategy[J]. AIMS Energy, 2024, 12(6): 1206-1224. doi: 10.3934/energy.2024055




    Presently, real-life multivariate time series (MTS) datasets have drawn considerable interest from diverse fields, including time series of sensor events [1], the Internet of Things [2], compartmental epidemiological models [3], and human conduct prediction [4]. Classical linear models include the autoregressive (AR) model [5], moving average (MA) model [6], autoregressive moving average (ARMA) model [7], and autoregressive integrated moving average (ARIMA) model [8]. Nonlinear MTS models such as the threshold AR [9] and bilinear models [10] have also been applied to MTS prediction. However, the classical MTS models only partially exploit the hidden spatial relationships between variable pairs and are limited by complex constraints. Support vector regression (SVR) [11], artificial neural networks (ANN) [12], gradient-boosted regression trees (GBRT) [13], and gradient-boosting decision trees (GBDT) [14] are prominent machine learning (ML) algorithms recognized for their strong prediction performance. However, conventional ML algorithms and feed-forward ANNs (FFANN) are restricted in MTS forecasting because they cannot fully account for the correlation among the multi-dimensional input variables over time [15]. Recurrent neural networks (RNNs) are a natural technique for modeling sequence-based systems with memory and are therefore more appropriate than FFANNs [16]. RNNs have been recognized as among the most efficient models for predicting MTS [17]. Nevertheless, traditional RNNs still handle long-range time-dependent MTS datasets poorly, which decreases prediction accuracy [18].

    To address this shortcoming of conventional RNNs, researchers [19,20,21,22] have employed the long short-term memory (LSTM) model, which can be viewed as an extended RNN with memory gates. Subsequent studies employed the bidirectional LSTM (BiLSTM) to overcome the limitation that LSTM trains on signals in only one direction. However, the feature learning characteristics of BiLSTM remain unclear [23], and shallow BiLSTMs are still ineffective at representing MTS features that exhibit high nonlinearity and long-term dependencies. Ewees et al. [24] employed a novel LSTM based on a heap-based optimizer to address complex optimization and engineering problems. Liu et al. [25] introduced an integrated hybrid convolutional neural LSTM network based on error correction and variational mode decomposition for hourly stepwise solar irradiance forecasting. Neshat et al. [26] proposed a deep hybrid LSTM prediction model utilizing a convolutional neural network featuring BiLSTM and an adaptive decomposition-based algorithm for wave power prediction. Li et al. [27] operated an evolutionary attention-based multi-layer LSTM network (EA-LSTM), trained through learning with competitive random search, for MTS forecasting. In general, deep LSTM or BiLSTM architectures can acquire good representations of high-dimensional, complex, and strongly nonlinear MTS signals.

    Notably, the deep belief network (DBN) [28] introduced by Hinton et al. can be considered a layered feature extraction method consisting of multiple restricted Boltzmann machines (RBMs). DBNs express the latent rules of the input features and exhibit good generalization capacities. Unlike other traditional nonlinear models, the apparent merit of the DBN is its distinctive unsupervised pre-training, which mitigates over-fitting during training. Another advantage of a DBN is that it is simple to sample signals from top to bottom, and the topology of a DBN suits ANN models trained with the backpropagation algorithm. However, DBNs still have some inherent drawbacks, such as high computational complexity, slow training, and poor scalability to large-scale data. One of the most troubling drawbacks in applications is that DBNs are weak at handling time series with complex patterns or nonlinear dynamic systems. To overcome this disadvantage, we can either extend the DBN structure or build hybrid models based on DBNs; for example, we can add RNN layers with dynamic characteristics, or encoder layers able to handle transient data, to the original DBN structure. In recent years, hybrid deep DBN-based RNNs have shown strong potential for time series forecasting. Sun et al. [29] applied a DBN-based echo state network (ESN) for MTS forecasting that can efficiently learn layered characteristics of sequence datasets. Li et al. [30] constructed a deep ANN based on an improved DBN and ESN for blockchain virtual currency prediction. Wu et al. [31] employed a chained-structure ESN based on stacked subnetwork modules for MTS prediction.

    To enhance the forecasting accuracy on MTS datasets and improve the feature extraction capability of the traditional BiLSTM, this study proposes a new deep belief improved BiLSTM (DBI-BiLSTM). Although a BiLSTM can learn data features from different directions, a single-layer BiLSTM is insufficient for MTS data with many features and high-dimensional complexity. Therefore, a DBN is added on top of the multi-layer BiLSTM to improve the data representation of the DBI-BiLSTM, and a chained structure is introduced to prevent the loss of features during model training. In the DBI-BiLSTM, the feature vectors discovered by the DBN are divided into different clusters based on a variance-based global sensitivity analysis (GSA) method. These clustered features are then used as inputs to different BiLSTM modules: the outputs of the preceding BiLSTM neurons are concatenated with the additional clustered features selected by GSA to form new input sequences for the following layer.

    This study has three critical innovations and contributions that can be emphasized as follows:

    1) Our proposed method, DBI-BiLSTM, leverages the advantages of the DBN and the DI-BiLSTM (the deep improved BiLSTM module), utilizing the DBN for efficient feature extraction and the DI-BiLSTM for generating a hierarchical data representation from different perspectives. This combination allows for more comprehensive modeling of MTS datasets, resulting in improved performance.

    2) The proposed DBI-BiLSTM model utilizes a deep BiLSTM architecture composed of stacked BiLSTM modules to capture dynamic features into hierarchical layers. This design enhances the model's flexibility and robustness in handling MTS forecasting.

    3) The variance-based GSA algorithm is employed to identify the sensitivity of the DBN module's output features. Primary features with high sensitivity are injected into the first layer of the DI-BiLSTM. In contrast, supplementary features with low sensitivity are fed into the subsequent layers of the DI-BiLSTM.

    This paper is structured as follows: Section 2 provides a fundamental overview of DBN, GSA, single-layer BiLSTM, and multi-layer BiLSTM. Section 3 thoroughly interprets the employed DBI-BiLSTM model's architecture and training methodology. Section 4 describes the MTS experimental outcomes and discusses the distribution of the weights. An overall conclusion is presented in the last section.

    A DBN is a probabilistic generative network that consists of several RBM layers. A DBN extracts data features using an unsupervised layer-by-layer training method [32], as shown in Figure 1. An RBM is an unsupervised nonlinear feature extractor based on a Markov random field, comprising two essential layers: a visible layer and a hidden layer. The output of each RBM hidden layer is fully connected to the next RBM layer by symmetric undirected weights. These properties make the units within one RBM layer conditionally independent given the states of the other layer.

    Figure 1.  The architecture of DBN consists of multiple RBMs.

    The joint probability distribution defined by the RBM's weights is based on an energy function of {v, h}, as shown in the following equation:

    $E(v,h;\theta) = -v^{T}Wh - a^{T}v - b^{T}h = -\sum_{i=1}^{D_v}\sum_{j=1}^{D_h} w_{ij}v_ih_j - \sum_{i=1}^{D_v} a_iv_i - \sum_{j=1}^{D_h} b_jh_j,$ (1)

    where $\theta=\{a_i, b_j, w_{ij}\}$, $w_{ij}$ represents the weight from visible unit $i$ to hidden unit $j$, and $a_i$ and $b_j$ are the biases of units $i$ and $j$, respectively. The joint probability distribution of the RBM over the visible and hidden units is computed from Eq (1) as follows:

    $P(v,h;\theta) = \frac{1}{Z(\theta)}\exp(-E(v,h;\theta)),$ (2)

    where $Z(\theta)$ is a normalizing constant (the partition function) obtained by summing over all possible configurations of the visible and hidden units:

    $Z(\theta) = \sum_{v}\sum_{h}\exp(-E(v,h;\theta)).$ (3)

    The RBM can obtain the probability of the input datasets through the energy equation. According to the joint probability distribution, the conditional probabilities of the hidden and visible units are given by the following functions:

    $P(h_j=1|v)=\delta\left(b_j+\sum_i v_iw_{ij}\right),$ (4)
    $P(v_i=1|h)=\delta\left(a_i+\sum_j h_jw_{ij}\right),$ (5)
    $\delta(x)=\frac{1}{1+\exp(-x)}.$ (6)

    The input state is reconstructed by setting each $v_i$ to 1 with the probability given by Eq (5). Thus, the states of the hidden units are gradually updated to represent the reconstruction features. Training the RBM amounts to maximizing the log-likelihood of the training data with respect to the model parameters:

    $\underset{\{a_i,b_j,w_{ij}\}}{\text{maximize}}\ \frac{1}{m}\sum_{l=1}^{m}\log P(v^{l}),$ (7)

    where m represents the number of training samples. The objective is a log-likelihood term that would normally be optimized by gradient methods. However, the exact gradient of the log-likelihood is difficult to compute due to the presence of $Z(\theta)$. Thus, sampling methods such as contrastive divergence [33] and persistent contrastive divergence [34] are used to approximate the gradient instead.
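    As an illustration of Eqs (4)-(7), the following is a minimal NumPy sketch of one contrastive-divergence (CD-1) update for a Bernoulli RBM. The function name, learning rate, and array shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr=0.01):
    """One CD-1 step for a Bernoulli RBM, sketching Eqs (4)-(7).

    v0: (batch, Dv) visible data; W: (Dv, Dh); a: (Dv,); b: (Dh,).
    """
    # Positive phase: P(h = 1 | v0), Eq (4), then sample h0
    ph0 = sigmoid(v0 @ W + b)
    h0 = (np.random.rand(*ph0.shape) < ph0).astype(float)
    # Negative phase: reconstruct v via Eq (5), then recompute P(h = 1 | v1)
    pv1 = sigmoid(h0 @ W.T + a)
    ph1 = sigmoid(pv1 @ W + b)
    # Contrastive-divergence approximation of the log-likelihood gradient, Eq (7)
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / batch
    a += lr * (v0 - pv1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b
```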

    LSTM is an improved version of the conventional RNN that utilizes specially designed memory units to efficiently express the long-term dependencies of MTS datasets. In contrast to classical RNNs, the LSTM design offers an effective answer to the vanishing-gradient issue. The LSTM cell learns the present hidden state from the current input and the prior hidden state, but it replaces the architecture of the hidden units with a memory cell that can represent the long-term dependence of the MTS signals. As shown in Figure 2, the LSTM network introduces four controlled components (an input gate, an output gate, a forget gate, and a self-loop memory cell) to manipulate the interaction of the information streams between memory neurons. In the hidden LSTM layers, the forget gate determines which information from the previous moment should be preserved or discarded, the input gate decides whether the input signals should be injected into the memory units, and the output gate controls whether the states of the memory units are exposed. Given the input $x_t$ of the MTS and the dynamic output state $h_t$, the gate states, hidden-layer outputs, and neuron states are computed using the following equations:

    $i_t=\sigma(U_ix_t+W_ih_{t-1}+b_i),$ (8)
    $f_t=\sigma(U_fx_t+W_fh_{t-1}+b_f),$ (9)
    $o_t=\sigma(U_ox_t+W_oh_{t-1}+b_o),$ (10)
    $\tilde{c}_t=\tanh(U_cx_t+W_ch_{t-1}+b_c),$ (11)
    $c_t=f_t\odot c_{t-1}+i_t\odot\tilde{c}_t,$ (12)
    $h_t=o_t\odot\tanh(c_t).$ (13)
    Figure 2.  The framework for the basic shallow LSTM.

    In the LSTM equations, the recurrent weight matrices are represented by $W_i$, $W_f$, $W_o$, and $W_c$, while the input weight matrices for the input, forget, output, and memory cell gates are denoted by $U_i$, $U_f$, $U_o$, and $U_c$, respectively. The gate biases are expressed as $b_i$, $b_f$, $b_o$, and $b_c$. At each time step, $h_t$ refers to the state of the hidden layer, and $o_t$ is the output gate activation. The candidate cell state $\tilde{c}_t$ is used to update the memory cell state $c_t$. The hyperbolic tangent function is represented by tanh, and the logistic sigmoid activation function is denoted by $\sigma$. Elementwise multiplication is denoted by $\odot$.
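    The gate equations (8)-(13) translate directly into code. Below is a hedged NumPy sketch of a single LSTM step; the parameter dictionary `p` and its key names are assumptions made for illustration.

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step implementing Eqs (8)-(13); p holds the U, W, b arrays."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    i = sig(p["Ui"] @ x_t + p["Wi"] @ h_prev + p["bi"])            # input gate, Eq (8)
    f = sig(p["Uf"] @ x_t + p["Wf"] @ h_prev + p["bf"])            # forget gate, Eq (9)
    o = sig(p["Uo"] @ x_t + p["Wo"] @ h_prev + p["bo"])            # output gate, Eq (10)
    c_tilde = np.tanh(p["Uc"] @ x_t + p["Wc"] @ h_prev + p["bc"])  # candidate state, Eq (11)
    c = f * c_prev + i * c_tilde                                   # cell update, Eq (12)
    h = o * np.tanh(c)                                             # hidden state, Eq (13)
    return h, c
```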

    The traditional LSTM may inadvertently discard important sequential information during training because it processes input signals in only one direction. As a result, the time series data cannot be thoroughly analyzed. To address this limitation, the BiLSTM was developed with a bidirectional structure that can capture representations of MTS data through forward and backward directions. This procedure is illustrated in Figure 3.

    Figure 3.  The architecture diagram of the BiLSTM network.

    The BiLSTM consists of two LSTM layers that run in parallel in opposite directions. In the forward propagation direction, the hidden LSTM state, represented by $h_f(t)$, retains information from past sequence values. In the backward propagation direction, the hidden state, represented by $h_b(t)$, contains information from future MTS values. $h_f(t)$ and $h_b(t)$ are concatenated to create the final outputs of the BiLSTM. The t-th hidden states of the BiLSTM for the forward and backward directions are calculated using the following equations:

    $h_f(t)=\psi(W_{fh}x_t+W_{fhh}h_f(t-1)+b_{fb}),$ (14)
    $h_b(t)=\psi(W_{bh}x_t+W_{bhh}h_b(t+1)+b_b).$ (15)

    The weight matrices $W_{fh}$ and $W_{bh}$ represent the forward and backward synaptic weights from the input to the internal units. Similarly, $W_{fhh}$ and $W_{bhh}$ represent the forward and backward recurrent feedback weights. Additionally, $b_{fb}$ and $b_b$ are the bias terms for the two directions. In this study, the hidden-layer activation function $\psi$ is tanh. Using these components, the output $y_t$ of the BiLSTM is described by the following equation:

    $y_t=\sigma(W_{fhy}h_f(t)+W_{bhy}h_b(t)+b_y),$ (16)

    where the output layer's forward and backward weights are denoted by $W_{fhy}$ and $W_{bhy}$, respectively. The activation function $\sigma$ of the output layer is generally either a sigmoidal or a linear function, and $b_y$ denotes the output bias.
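    Putting Eqs (14)-(16) together, a single BiLSTM layer amounts to two LSTM passes over the sequence in opposite directions. The sketch below reuses the `lstm_step` function from the previous sketch and treats the output activation as linear; all names are illustrative.

```python
import numpy as np

def bilstm_forward(X, params_f, params_b, Wfhy, Wbhy, by):
    """Run one BiLSTM layer over a sequence X of shape (T, input_dim), Eqs (14)-(16)."""
    T = X.shape[0]
    Nh = Wfhy.shape[1]                       # hidden size per direction
    hf, hb = np.zeros((T, Nh)), np.zeros((T, Nh))
    h, c = np.zeros(Nh), np.zeros(Nh)
    for t in range(T):                       # forward pass over time, Eq (14)
        h, c = lstm_step(X[t], h, c, params_f)
        hf[t] = h
    h, c = np.zeros(Nh), np.zeros(Nh)
    for t in reversed(range(T)):             # backward pass over time, Eq (15)
        h, c = lstm_step(X[t], h, c, params_b)
        hb[t] = h
    # Combine both directions into the layer output, Eq (16), with a linear sigma
    return hf @ Wfhy.T + hb @ Wbhy.T + by
```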

    Due to the high dimensionality and numerous features of the MTS data, the output of the DBN, which serves as the input of the DI-BiLSTM, remains complex after DBN processing. Therefore, a sensitivity analysis of the DBN output is necessary to reduce the input complexity of the multi-layer BiLSTM. Among the available algorithms for network output sensitivity analysis (SA), variance-based SA algorithms, which apply Monte Carlo sampling and are based on model decomposition, have demonstrated their versatility and effectiveness. An improved variance-based GSA algorithm was proposed by Saltelli et al. [35] to compute a model's total sensitivity indices (SI). GSA has been applied to parametric sensitivity analysis for various models with excellent stability and minimal computational cost. Considering a nonlinear model $Y=f(X)=f(x_1,x_2,\dots,x_k)$, where $Y$ is the output and $X=(x_1,x_2,\dots,x_k)$ contains $k$ input factors, the total model variance is decomposed as follows:

    $V(Y)=\sum_{i=1}^{k}V_i+\sum_{i=1}^{k}\sum_{j=i+1}^{k}V_{ij}+\cdots+V_{1,2,\dots,k},$ (17)

    where $X$ is rescaled to the k-dimensional unit hypercube $\Omega^k=\{X \mid 0\le x_i\le 1, i=1,\dots,k\}$, $V(Y)$ denotes the total variance, $V_i=V[E(Y|x_i)]$ is the variance contributed by the parameter $x_i$, and $V_{ij}$ is the variance of the interaction between parameters $x_i$ and $x_j$. The first-order sensitivity $S_i$ and total sensitivity $S_{Ti}$ of parameter $x_i$ are expressed as follows:

    $S_i=\frac{V_i}{V(Y)}=\frac{V[E(Y|x_i)]}{V(Y)},$ (18)
    $S_{Ti}=S_i+\sum_{j\ne i}S_{ij}+\cdots=\frac{E[V(Y|x_{\sim i})]}{V(Y)},$ (19)

    where $x_{\sim i}$ stands for all factors except $x_i$.

    To calculate the total sensitivity value for factor $x_i$, we must create two independent sampling matrices, P and Q, each of size (N, k), where N is the sample length and k is the number of model variables. Every row in the matrices corresponds to an input parameter vector X for the model. We can use Monte Carlo methods to approximate $\hat{V}(Y)$, $\hat{S}_i$, and $\hat{S}_{Ti}$. The computations are as follows:

    $\hat{f}_0=\frac{1}{N}\sum_{j=1}^{N}f(P)_j,$ (20)
    $\hat{V}(Y)=\frac{1}{N}\sum_{j=1}^{N}\left(f(P)_j\right)^2-\hat{f}_0^2,$ (21)
    $\hat{S}_i=\frac{\frac{1}{N}\sum_{j=1}^{N}f(Q)_j\left(f(P_Q^{(i)})_j-f(P)_j\right)}{\hat{V}(Y)},$ (22)
    $\hat{S}_{Ti}=\frac{\frac{1}{2N}\sum_{j=1}^{N}\left(f(P)_j-f(P_Q^{(i)})_j\right)^2}{\hat{V}(Y)},$ (23)

    where $\hat{f}_0$ is the average value of the model's output and $P_Q^{(i)}$ is the matrix obtained by replacing the i-th column of P with the i-th column of Q. Calculating $\hat{S}_i$ and $\hat{S}_{Ti}$ requires $N(k+2)$ model simulations. Due to interaction effects, the sum of $\hat{S}_{Ti}$ over the k parameters can exceed 1, so $\hat{S}_{Ti}$ is normalized. In this study, the total sensitivity of each parameter in the model was evaluated according to Eq (23).
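    The estimators in Eqs (20)-(23) can be implemented compactly. The sketch below assumes a vectorized model `f` defined on the unit hypercube; since only the total index of Eq (23) is needed here, it spends N(k+1) model evaluations rather than the full N(k+2) used when Eq (22) is also computed. All names are illustrative.

```python
import numpy as np

def total_sensitivity(f, k, N=1024, seed=None):
    """Estimate normalized total sensitivity indices S_Ti via Eqs (20)-(23).

    f: model taking an (n, k) array of inputs in [0, 1]^k and returning (n,) outputs.
    """
    rng = np.random.default_rng(seed)
    P = rng.random((N, k))                     # first sampling matrix
    Q = rng.random((N, k))                     # second, independent matrix
    fP = f(P)
    f0 = fP.mean()                             # Eq (20)
    V = (fP ** 2).mean() - f0 ** 2             # Eq (21)
    ST = np.empty(k)
    for i in range(k):
        PQ = P.copy()
        PQ[:, i] = Q[:, i]                     # P with its i-th column taken from Q
        ST[i] = 0.5 * ((fP - f(PQ)) ** 2).mean() / V   # Eq (23)
    return ST / ST.sum()                       # normalized, as described in the text

# Toy usage: three factors, one interaction term
# ST = total_sensitivity(lambda X: X[:, 0] + 2 * X[:, 1] * X[:, 2], k=3)
```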

    Comprehensive research has shown that adding depth to ANNs can efficiently increase the overall performance of ANN models [36]. Similarly, the academic community has been impressed by the learning capabilities of deep RNNs in MTS forecasting [32,37]. Inspired by the learning capacity of multi-layer RNNs and the feature extraction ability of the DBN, we develop a DBI-BiLSTM model for MTS prediction, consisting of multiple BiLSTM modules connected through a chained structure, as shown in Figure 4. The DBI-BiLSTM model has three main parts: a DBN-based feature extraction component, a sensitivity analysis component, and an improved BiLSTM-based prediction model. In the DBI-BiLSTM network, the DBN output vectors are categorized into major and minor features according to their sensitivity indices (SI) with respect to the model's output: the major features are highly sensitive to the model's output, while the minor features are less sensitive. The major features with high sensitivity are injected into the first layer of the DI-BiLSTM, whereas the supplementary features with low sensitivity are fed into the subsequent layers of the DI-BiLSTM.

    Figure 4.  The proposed DBI-BiLSTM structure.

    The significant benefits of the proposed DBI-BiLSTM can be outlined in three aspects.

    1) DBN's powerful feature extraction capability ensures the feature representation ability of DBI-BiLSTM for MTS datasets, which further reduces the prediction error of the employed DBI-BiLSTM for MTS prediction.

    2) The DBI-BiLSTM model incorporates both multi-layer and bidirectional RNN structures. The multi-layer structure enhances the training and generalization capabilities of DBI-BiLSTM, while propagation in different directions enables the DBI-BiLSTM to efficiently train on the MTS signals from two directions. This characteristic makes DBI-BiLSTM better able to adapt to the natural features of MTS datasets.

    3) The multi-layer architecture of DBI-BiLSTM differs from the traditional deep LSTM structure; it is an improved structure that includes multiple BiLSTM modules, each of which can dynamically map feature vectors learned by the DBN layer. The forecasting results of DBI-BiLSTM are refined by continuously training on the outputs of the preceding BiLSTM neurons and by injecting additional input features into the different BiLSTM hidden states. This makes the proposed DBI-BiLSTM more reliable and stable for MTS forecasting than traditional deep LSTM structures.

    A structural diagram of the DI-BiLSTM with three layers is shown in Figure 5, in which each layer is represented by a distinct color.

    Figure 5.  Structure diagram and principle of DI-BiLSTM with three layers.

    A consistent hidden layer size $N_h$ is used for both propagation directions of each BiLSTM layer to keep the experiments compact. Let $I_i$ and $O_i$ represent the number of input and output units, respectively, for layer $i$. The output signals of the forward and backward recurrent neurons of the i-th BiLSTM layer are represented as $h_f^{(i)}(t)$ and $h_b^{(i)}(t)$, while the bias signals of the i-th BiLSTM module for the two directions are represented by $b_{fb}^{(i)}$ and $b_b^{(i)}$. $b_y^{(i)}$ denotes the output bias signals of the i-th BiLSTM module. The output states of the recurrent layer and the final output can be expressed as follows:

    $h_f^{(i)}(t)=\psi(W_{fh}[x_t^{(i)};y_t^{(i-1)}]+W_{fhh}h_f^{(i)}(t-1)+b_{fb}^{(i)}),$ (24)
    $h_b^{(i)}(t)=\psi(W_{bh}[x_t^{(i)};y_t^{(i+1)}]+W_{bhh}h_b^{(i)}(t+1)+b_b^{(i)}),$ (25)
    $y_t^{(i)}=\sigma(W_{fhy}h_f^{(i)}(t)+W_{bhy}h_b^{(i)}(t)+b_y^{(i)}).$ (26)

    In the experiments of this paper, the adaptive moment estimation (Adam) method [38] is used as the learning optimizer to update the weights of the DBI-BiLSTM model, with an initial learning rate set to 0.001. For MTS prediction, the loss function of the training and testing procedure is defined as the mean square error (MSE):

    $\text{MSE}=\frac{1}{n}\sum_{t=1}^{n}(y_t-\hat{y}_t)^2,$ (27)

    where $\hat{y}_t$ represents the forecasted signal, $y_t$ denotes the desired output signal, and n denotes the length of $y_t$.

    Figure 4 shows that the model begins by using multiple DBN layers to map inputs to their feature representations. The feature vectors clustered by GSA are then fed into the DI-BiLSTM layers, which process them in different directions. The outputs of the DI-BiLSTM layers are passed through a fully connected layer, which serves as the regression neurons and uses a linear activation function. A dropout probability (set to 0.2) is applied to the proposed forecasting model to prevent overfitting on the MTS datasets. Table 1 summarizes the proposed DBI-BiLSTM parameters.
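    To make the architecture concrete, the following is a hedged PyTorch sketch of the chained DI-BiLSTM part of Figure 4. It assumes the DBN features have already been extracted and split by GSA into one major group and a list of minor groups; the class name, dimensions, and wiring details are assumptions based on the description above, not the authors' released code. Pairing this module with the Adam optimizer (learning rate 0.001), an MSE loss, and dropout 0.2 matches the training settings reported here.

```python
import torch
import torch.nn as nn

class DIBiLSTM(nn.Module):
    """Sketch of the chained DI-BiLSTM: layer 1 sees the major (high-SI) features;
    each later layer sees its minor-feature slice concatenated with the previous
    layer's output, echoing the chained inputs of Eqs (24)-(26)."""

    def __init__(self, major_dim, minor_dims, hidden=40, out_dim=1, dropout=0.2):
        super().__init__()
        self.layers = nn.ModuleList()
        in_dim = major_dim
        for extra in [0] + list(minor_dims):   # first layer takes no minor slice
            self.layers.append(nn.LSTM(in_dim + extra, hidden,
                                       batch_first=True, bidirectional=True))
            in_dim = 2 * hidden                # forward plus backward states
        self.drop = nn.Dropout(dropout)
        self.head = nn.Linear(2 * hidden, out_dim)  # fully connected regression layer

    def forward(self, major, minors):
        # major: (B, T, major_dim); minors: list of (B, T, minor_dim_i) tensors
        h, _ = self.layers[0](major)
        for lstm, m in zip(self.layers[1:], minors):
            h, _ = lstm(torch.cat([self.drop(h), m], dim=-1))
        return self.head(self.drop(h))
```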

    Table 1.  Parameter values for the DBI-BiLSTM models of the different MTS simulations.

    Tasks | Batch size | DBN layers (Md) | Neurons in RBM (Nd) | Hidden neurons of BiLSTM (Nh) | BiLSTM layers (Mh) | Activation function ψ
    HX    | 100 | 2 | 10 | 50 | 3 | tanh
    SML   | 100 | 3 | 12 | 40 | 3 | tanh
    BS    | 100 | 2 | 8  | 35 | 4 | tanh
    MITV  | 100 | 3 | 10 | 40 | 3 | tanh

    The learning mechanism of DBI-BiLSTM is shown as follows:

    Algorithm: Employed DBI-BiLSTM model
    Initialization: the datasets are divided into training, validation, and testing sets, which are normalized before training;
    Input: MTS data V = {v1, v2, ..., vn};
    Output: extracted feature vectors X = {x1, x2, ..., xn} of data V;
    for data in the training and testing data do
      Extract the features X of all the datasets with the DBN layer and update the weights of the DBN layer through Eqs (4)-(7);
    end for
      Calculate the total SI $\hat{S}_{Ti}$ of each input feature vector X according to Eqs (20)-(23) and rank the features by SI;
      Divide X into major input features (high SI) and minor input features (low SI) based on the ranked SI, and initialize the weights and layers of the BiLSTM;
    Input: feature representation vectors X = [Xmajor, Xminor]^T = {x1, x2, ..., xn};
    Output: desired output Y = {y1, y2, ..., yn}
    for t ← 1 to n do
      Calculate the gate outputs through Eqs (8)-(13);
      Update the forward and backward hidden-layer states through Eqs (24) and (25);
      Obtain the actual output Yt of the DI-BiLSTM through Eq (26);
    end for
      Calculate the training error MSE through Eq (27);
      Update all the weights of the BiLSTM layers with the Adam optimizer;
      Repeat until the training MSE converges;
    Test the DBI-BiLSTM on the testing datasets.
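    A minimal Python rendering of this training procedure, reusing the earlier sketches (`total_sensitivity` for the SI ranking and the `DIBiLSTM` module), might look as follows. The helper names, the single minor-feature group, and the tensor shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import torch

def split_by_si(X, si, n_major):
    """Rank features by total SI and split them into major / minor groups."""
    order = np.argsort(si)[::-1]          # most sensitive features first
    return X[..., order[:n_major]], X[..., order[n_major:]]

def train(model, major, minor, y, epochs=100):
    """Fit the DI-BiLSTM sketch with Adam (lr = 0.001) and the MSE loss of Eq (27)."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(major, [minor]), y)   # one minor group for simplicity
        loss.backward()
        opt.step()
    return model
```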

    In this section, we simulate four real-world MTS datasets to evaluate the performance of the employed DBI-BiLSTM network. These datasets include the heat exchanger (HX) system [39], the small-medium-large (SML) system [40], the bike sharing (BS) dataset [41], and the metro interstate traffic volume (MITV) dataset from the UCI machine learning repository. For each MTS simulation experiment, the data was divided into three sections: a training set, a validation set, and a testing set. The numbers of neurons and layers in the DBN and BiLSTM modules are jointly determined based on the length and dimensionality of the MTS data. In this study, we used the validation data to identify the hyperparameters of the DBI-BiLSTM model, namely the number of DBN layers (Md), the number of neurons per RBM (Nd), the number of BiLSTM layers (Mh), and the number of neurons per BiLSTM recurrent layer (Nh), by grid search or greedy search until the DBI-BiLSTM's validation MSE was minimized. All experiments were performed on an Intel Core i5-8265U (1.60 GHz CPU with 8 GB RAM). Table 1 summarizes the hyperparameters used in the simulation experiments.

    The DBI-BiLSTM model is assessed through the following loss functions: percentage improvement (IM%), normalized mean squared error (NMSE), mean absolute error (MAE), and symmetric mean absolute percentage error (SMAPE):

    $\text{NMSE}=\frac{\sum_{t=1}^{n}(y_t-\hat{y}_t)^2}{n\sigma^2},$ (28)
    $\text{MAE}=\frac{1}{n}\sum_{t=1}^{n}|y_t-\hat{y}_t|,$ (29)
    $\text{SMAPE}=\frac{100}{n}\sum_{t=1}^{n}\frac{|y_t-\hat{y}_t|}{(|\hat{y}_t|+|y_t|)/2},$ (30)
    $\text{IM\%}=\frac{\text{NMSE}_{\text{LSTM}}-\text{NMSE}_{\text{Model}}}{\text{NMSE}_{\text{LSTM}}}\times 100\%,$ (31)

    where $\hat{y}_t$ represents the forecasted signal, $y_t$ denotes the desired output signal, $\sigma^2$ is the variance of $y_t$, and n denotes the length of $y_t$. $\text{NMSE}_{\text{LSTM}}$ represents the prediction performance of a single-layer LSTM, while $\text{NMSE}_{\text{Model}}$ indicates the forecasting performance of the comparison method. The IM% value therefore denotes the percentage improvement in performance achieved by a prediction model compared to a single-layer LSTM model. In the following experiments, the mean NMSE, MAE, and SMAPE testing errors are obtained by averaging over ten experimental runs on the MTS datasets.
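    For reference, the four metrics of Eqs (28)-(31) reduce to a few lines of NumPy; this is a straightforward sketch, with function names chosen for illustration.

```python
import numpy as np

def nmse(y, yhat):
    return np.sum((y - yhat) ** 2) / (len(y) * np.var(y))       # Eq (28)

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))                            # Eq (29)

def smape(y, yhat):
    return 100.0 * np.mean(np.abs(y - yhat)
                           / ((np.abs(y) + np.abs(yhat)) / 2))  # Eq (30)

def improvement(nmse_lstm, nmse_model):
    return (nmse_lstm - nmse_model) / nmse_lstm * 100.0         # Eq (31)
```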

    To demonstrate the performance and efficacy of the DBI-BiLSTM model on different MTS, five ablation models were employed: a single-layer LSTM, a shallow (single-layer) BiLSTM, a multi-layer LSTM, a multi-layer BiLSTM, and a DI-BiLSTM. These ablation models were used to further illustrate the DBI-BiLSTM model's superiority. Furthermore, we evaluated and compared the performance of the proposed DBI-BiLSTM model with several classical and state-of-the-art models, including SVR, GBRT, DBN-ANN, Elman [42], gated recurrent units (GRU) [43], ESN, attention-LSTM [44], stacked bidirectional and unidirectional LSTM (SBU-LSTM) [45], EA-LSTM [27], and LSTM-FCN [46]. Here, SVR and GBRT are typical ML models used for comparison, DBN-ANN and Elman are classical ANN and single-layer RNN models, GRU is a gate-based single-layer RNN similar to LSTM, and ESN is an RNN with a different training algorithm from LSTM. Similar to DBI-BiLSTM, the attention-LSTM, SBU-LSTM, EA-LSTM, and LSTM-FCN are recently proposed structural improvements to the original LSTM: multi-layer LSTM models that acquire features from the original data by adding feature representation layers before the multi-layer LSTM layers. Overall, this study employed various models to demonstrate the effectiveness of the proposed DBI-BiLSTM model, which was compared to classical and state-of-the-art models to establish its relative performance.

    In the following experiments, several parameters were identified as significantly impacting the model's performance, including Md, Mh, Nd, and Nh. The parameter values that minimized the validation NMSE were chosen as the final values for the model. Figures 6-9 illustrate the validation NMSE values obtained by varying the Md, Mh, Nd, and Nh parameters for the four MTS tasks, and Table 1 summarizes the final parameter values for the employed DBI-BiLSTM models. To ensure a fair comparison in testing, the parameters of the LSTM-based comparison models, including the number of hidden-layer neurons, learning method, activation function, initial learning rate, and dropout constant, were set equal to the corresponding parameter values of the DBI-BiLSTM model.

    Figure 6.  Validation NMSE for different numbers of BiLSTM hidden neurons.
    Figure 7.  Validation NMSE for different numbers of BiLSTM layers.
    Figure 8.  Validation NMSE for different values of N1.
    Figure 9.  Validation NMSE for different values of N2.

    The HX task used in this paper is from the literature [39]. HX is a complex nonlinear MTS task involving the effective heat exchange between two streams utilizing their temperature difference. The task presents significant difficulties, including flow turbulence, fluid flow geometry, and complex thermal behavior. Figure 10 illustrates the whole sequential HX dataset, which comprises a total of 4,000 data points. First, the HX data are normalized to [−1, 1]; 4,000 data steps are taken for modeling and testing the DBI-BiLSTM model, of which the first 2,000 steps are used for training, the next 1,000 steps are used as a validation set for parameter selection, and the last 1,000 steps are used for performance testing of the DBI-BiLSTM. The grid search method is used to select the optimal parameters (Md, Mh, Nd, and Nh), as shown in Table 1. The DBI-BiLSTM model is then constructed based on the parameters in Table 1; the initial learning rate is set to 0.001, and the bias of each BiLSTM is set to 1. Meanwhile, a dropout probability of 0.2 is set for the BiLSTM layers to ensure that the DBI-BiLSTM does not overfit the time series datasets.
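    As a sketch of the preprocessing just described (min-max scaling to [−1, 1] followed by a 2000/1000/1000 split), assuming the HX series is available as a NumPy array:

```python
import numpy as np

def prepare_hx(series):
    """Scale a (4000, n_features) array to [-1, 1] and split it 2000/1000/1000."""
    lo, hi = series.min(axis=0), series.max(axis=0)
    scaled = 2.0 * (series - lo) / (hi - lo) - 1.0   # min-max normalization
    return scaled[:2000], scaled[2000:3000], scaled[3000:4000]
```

    (A stricter variant would compute the scaling bounds from the training portion only; the sketch follows the description above.)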

    Figure 10.  Sequential trend of the HX task.

    The experimental results of the DBI-BiLSTM model were evaluated against several ablation LSTM-based models, including a shallow BiLSTM (one layer), a multi-layer LSTM, a multi-layer BiLSTM, and a DI-BiLSTM. To ensure a fair comparison, the same parameter values were used for the ablation models as for the DBI-BiLSTM model, as shown in Table 1. The forecasting outcomes acquired by DBI-BiLSTM and the shallow BiLSTM for the HX dataset over a selected length of 200 data points are shown in Figure 11. The NMSE, MAE, and SMAPE losses and the improvement IM% of the shallow BiLSTM, multi-layer LSTM, multi-layer BiLSTM, DI-BiLSTM, and DBI-BiLSTM for the HX experiment are summarized in Table 2.

    Figure 11.  Fitting and test absolute error of DBI-BiLSTM and shallow BiLSTM for the HX experiment.
    Table 2.  The testing NMSE, MAE, SMAPE, IM%, and running time of DBI-BiLSTM and the ablation models for the four MTS forecasting benchmarks.

    Benchmark | Metric | One-layer LSTM | One-layer BiLSTM | Multi-layer LSTM | Multi-layer BiLSTM | DI-BiLSTM | DBI-BiLSTM
    HX | NMSE | 0.0658 | 0.0282 | 0.0273 | 0.0215 | 0.0158 | 0.0096
    HX | MAE | 0.0527 | 0.0332 | 0.0324 | 0.0279 | 0.0193 | 0.0138
    HX | SMAPE | 52.976 | 28.612 | 27.492 | 24.787 | 21.035 | 15.418
    HX | IM% | - | 57.14 | 58.51 | 67.32 | 77.61 | 85.41
    HX | Time (s) | 30.2 | 59.1 | 76.1 | 128.9 | 129.1 | 130.7
    SML | NMSE | 0.2451 | 0.1582 | 0.1295 | 0.1131 | 0.0801 | 0.0601
    SML | MAE | 0.0623 | 0.0361 | 0.0304 | 0.0285 | 0.0199 | 0.0149
    SML | SMAPE | 17.253 | 13.571 | 9.417 | 8.836 | 8.036 | 6.713
    SML | IM% | - | 35.45 | 47.16 | 53.85 | 66.94 | 75.47
    SML | Time (s) | 34.4 | 52.7 | 84.3 | 145.2 | 145.8 | 147.8
    BS | NMSE | 0.3712 | 0.3276 | 0.3033 | 0.1751 | 0.1502 | 0.1423
    BS | MAE | 0.1053 | 0.0987 | 0.0891 | 0.0561 | 0.0507 | 0.0501
    BS | SMAPE | 58.468 | 53.169 | 49.051 | 37.849 | 36.306 | 35.513
    BS | IM% | - | 11.74 | 18.29 | 52.82 | 56.91 | 61.66
    BS | Time (s) | 27.1 | 40.5 | 87.5 | 146.6 | 147.1 | 148.2
    MITV | NMSE | 0.4723 | 0.4587 | 0.4453 | 0.4229 | 0.3472 | 0.3272
    MITV | MAE | 0.1352 | 0.1236 | 0.1198 | 0.1125 | 0.0839 | 0.0897
    MITV | SMAPE | 41.821 | 40.243 | 38.602 | 35.145 | 34.070 | 32.493
    MITV | IM% | - | 2.87 | 5.71 | 10.46 | 20.01 | 30.72
    MITV | Time (s) | 38.4 | 59.3 | 95.9 | 166.3 | 167.6 | 168.4

    The SML dataset is an open dataset from the UCI machine learning repository, collected from a monitoring system mounted in a smart house. The SML data are sampled at one-minute intervals and smoothed by averaging over 15 minutes (open-source download link: https://archive.ics.uci.edu/dataset/274/sml2010). This dataset includes sensor readings such as weather forecast temperature, relative humidity, lighting, rain, sun dusk, wind speed, sunlight on the west, east, and south facades, sun irradiance, outdoor temperature, outdoor relative humidity, and room temperature, the last of which is the target output for this experiment. Figure 12 illustrates the input and output sequential data of the SML benchmark. In this experiment, the SML data are normalized to [−1, 1]. The total length of the SML dataset used in this paper is 4,137: the first 3,000 items are used for training, the next 537 steps are used as a validation set for parameter selection, and the last 600 steps are used for performance testing of the DBI-BiLSTM. The grid search method is used to select the optimal parameters (Md, Mh, Nd, and Nh), as shown in Table 1. The DBI-BiLSTM model is then constructed based on the parameters in Table 1; the initial learning rate, the bias of each BiLSTM, and the dropout probability are set as in the HX experiment.

    Figure 12.  Sequential trend of the SML task.

    To evaluate the effectiveness of the proposed DBI-BiLSTM model, we compare its performance with several ablation LSTM-based models, as described previously. The ablation models' parameters are chosen according to the values of DBI-BiLSTM (see Table 1) to ensure a fair comparison. The prediction results acquired by DBI-BiLSTM and a shallow BiLSTM over a selected length of 200 testing signals for the SML benchmark are presented in Figure 13. Table 2 summarizes the NMSE, MAE, SMAPE, and enhancement IM% of the shallow BiLSTM, multi-layer LSTM, multi-layer BiLSTM, DI-BiLSTM, and DBI-BiLSTM models for the SML benchmark.

    Figure 13.  Fitting and test absolute error of DBI-BiLSTM and shallow BiLSTM for the SML benchmark.

    The BS dataset is an open dataset from the UCI machine learning repository that represents a new generation of traditional bike rental systems (open-source download link: https://archive.ics.uci.edu/dataset/275/bike+sharing+dataset). This automated system allows membership eligibility, bike rentals, and returns to be completed entirely through mechanical processes. The dataset consists of a two-year usage record (2011-2012) of the Capital Bikeshare system and corresponding weather and seasonal information. The sensor data include working day, weather, temperature, feeling temperature, humidity, wind speed, and the total number of bicycle rentals per hour. The target value to be predicted in this experiment is the total number of bicycle rentals per hour. Figure 14 displays the sequential trend (input and output) of the BS benchmark. In this experiment, the BS data are normalized to [−1, 1], as described previously. The total length of the BS dataset is 17,389, but only 5,000 points were used for the experiments due to a periodic pattern in the data. The first 3,000 items are used for training, the next 1,000 steps are used as a validation set for parameter selection, and the last 1,000 points are used for performance testing of the DBI-BiLSTM. The grid search method is used to select the optimal parameters (Md, Mh, Nd, and Nh), as shown in Table 1. The DBI-BiLSTM model is then constructed based on the parameters in Table 1; the initial learning rate, the bias of each BiLSTM, and the dropout probability are set as in the HX and SML experiments.

    Figure 14.  Sequential trend of the BS benchmark.

    The performance of the DBI-BiLSTM model was compared against the same ablation LSTM-based models described earlier. The parameters used in the ablation models were selected according to the values of the DBI-BiLSTM model (see Table 1). The forecasting performance of the shallow BiLSTM and DBI-BiLSTM models for the BS task over a 200-length testing dataset is shown in Figure 15. The performance of the shallow BiLSTM, multi-layer LSTM, multi-layer BiLSTM, DI-BiLSTM, and DBI-BiLSTM models for the BS task in terms of NMSE, MAE, SMAPE, and IM% is shown in Table 2.

    Figure 15.  Fitting and test absolute error of DBI-BiLSTM and shallow BiLSTM for the BS benchmark.

    The MITV dataset is an open dataset from the UCI machine learning repository that represents an MTS regression setting, where the employed network aims to forecast continuous variables (open download link: https://archive.ics.uci.edu/dataset/492/metro+interstate+traffic+volume). This dataset includes hourly traffic volume data for the MN DoT ATR station 301, located approximately halfway between Minneapolis and St. Paul, MN. The sensor data include holidays, temperature, rainfall, snowfall, percentage of cloud cover, weather descriptions, and hourly traffic volume. The hourly traffic volume serves as the target predicted variable in this experiment. The input and output sequential data for the MITV task are displayed in Figure 16. In this experiment, the MITV data are normalized to [−1, 1], as described previously. The total length of the MITV dataset is 48,204, but only 10,000 points were used for the experiments due to a strong periodic pattern. The first 6,000 items are used for training, the next 2,000 steps are used as a validation set for parameter selection, and the last 2,000 steps are used for performance testing of the DBI-BiLSTM. The grid search method is used to select the optimal parameters (Md, Mh, Nd, and Nh), as shown in Table 1. The DBI-BiLSTM model is then constructed based on the parameters in Table 1; the initial learning rate, the bias of each BiLSTM, and the dropout probability are set as in the HX, SML, and BS experiments.

    Figure 16.  Sequential trend of the MITV benchmark.

    As in the previous experiments, we compared the DBI-BiLSTM model against the same ablation LSTM-based models. The parameters for the ablation models were set to the same values as for the DBI-BiLSTM model, as shown in Table 1. Figure 17 displays the prediction performance over a 200-length testing dataset for both the DBI-BiLSTM and shallow BiLSTM models for the MITV task. Table 2 shows the NMSE, MAE, SMAPE, and improvement IM% for the shallow BiLSTM, multi-layer LSTM, multi-layer BiLSTM, DI-BiLSTM, and DBI-BiLSTM models for the MITV task.

    Figure 17.  Fitting and test absolute error of DBI-BiLSTM and shallow BiLSTM for the MITV task.

    To analyze and validate the impact of the DBN layers and the deep chained structure on the performance of DBI-BiLSTM, comparative ablation models, including single-layer and multi-layer LSTM-based models, are used to test the selected time series datasets. The forecasting results comparing DBI-BiLSTM and the ablation models are shown in Table 2. The running time (s) reported in Table 2 is the total time of the training and testing process.

    From Table 2, we can see that the running time of the bidirectional ablation models is significantly longer than that of the unidirectional LSTM models, and the running time of the multi-layer ablation models is longer than that of the single-layer ablation models. The computational complexity of our proposed DBI-BiLSTM model is comparable to that of the multi-layer BiLSTM, which suggests that the DBN module and the stacked chained structure in the DBI-BiLSTM do not significantly increase the computational burden of the multi-layer BiLSTM.

    To comprehensively assess the performance of the DBI-BiLSTM model in MTS forecasting, we conducted a comparative test with several fundamental models. These models include conventional SVR, GBRT, DBN-ANN, the traditional Elman RNN, the classical GRU variant of the RNN, and the ESN. Additionally, we compared the DBI-BiLSTM model with several recently proposed LSTM models, including the EA-LSTM model, which can be adjusted by evolutionary computation, the SBU-LSTM model with a deep stack structure, and the attention-LSTM model. To ensure a fair comparison, we set all parameters used in the DBN-based and LSTM-based models to the same values as those used in the DBI-BiLSTM model (as given in Table 1). The performance comparison between DBI-BiLSTM and the other MTS forecasting models is presented in Table 3. The running time (s) reported in Table 3 is the total time of the training and testing process. Furthermore, we conducted a 10-fold cross-validation experiment on the four selected MTS datasets and obtained NMSE performance results, which are shown as a box plot in Figure 18.

    Table 3.  The performance comparison of DBI-BiLSTM and various MTS forecasting models.

    Forecasting model | Test performance | HX | SML | BS | MITV
    SVR | NMSE | 1.3657 | 1.5824 | 0.6700 | 0.5766
    SVR | MAE | 0.2967 | 0.1969 | 0.1007 | 0.1311
    SVR | SMAPE | 121.368 | 46.960 | 60.795 | 52.047
    SVR | IM% | −1975.53 | −545.61 | −80.49 | −22.08
    SVR | Time (s) | 21.2 | 35.6 | 38.3 | 50.1
    GBRT | NMSE | 1.4174 | 1.7838 | 0.2902 | 0.4502
    GBRT | MAE | 0.2850 | 0.1978 | 0.0712 | 0.1315
    GBRT | SMAPE | 125.196 | 47.867 | 33.743 | 43.491
    GBRT | IM% | −2054.10 | −627.78 | 21.82 | 4.68
    GBRT | Time (s) | 23.2 | 38.1 | 40.8 | 53.7
    DBN-ANN | NMSE | 1.0372 | 1.7490 | 0.8257 | 0.5203
    DBN-ANN | MAE | 0.1823 | 0.2675 | 0.1223 | 0.1396
    DBN-ANN | SMAPE | 84.619 | 54.822 | 66.427 | 52.225
    DBN-ANN | IM% | −1476.29 | −613.58 | −122.44 | −10.16
    DBN-ANN | Time (s) | 16.1 | 20.3 | 25.1 | 34.8
    Elman | NMSE | 0.0892 | 0.2201 | 0.3902 | 0.4554
    Elman | MAE | 0.0813 | 0.0586 | 0.1023 | 0.1481
    Elman | SMAPE | 38.305 | 8.292 | 60.581 | 51.307
    Elman | IM% | −33.56 | 10.20 | −5.11 | 3.58
    Elman | Time (s) | 18.3 | 26.6 | 30.1 | 43.3
    GRU | NMSE | 0.0832 | 0.2045 | 0.3648 | 0.4352
    GRU | MAE | 0.0724 | 0.0519 | 0.0915 | 0.1349
    GRU | SMAPE | 36.363 | 7.203 | 59.205 | 50.275
    GRU | IM% | −26.44 | 16.56 | 1.72 | 7.855
    GRU | Time (s) | 29.2 | 33.8 | 26.1 | 37.9
    ESN (Nh = 500) | NMSE | 0.0216 | 0.3060 | 0.4113 | 0.4899
    ESN (Nh = 500) | MAE | 0.0263 | 0.0810 | 0.1079 | 0.1430
    ESN (Nh = 500) | SMAPE | 25.876 | 9.820 | 65.836 | 52.266
    ESN (Nh = 500) | IM% | 67.17 | −24.84 | −10.80 | −3.72
    ESN (Nh = 500) | Time (s) | 4.2 | 5.6 | 5.8 | 7.1
    Attention-LSTM | NMSE | 0.0480 | 0.1858 | 0.1923 | 0.3871
    Attention-LSTM | MAE | 0.0401 | 0.0396 | 0.0728 | 0.1254
    Attention-LSTM | SMAPE | 32.372 | 8.710 | 47.726 | 46.289
    Attention-LSTM | IM% | 27.05 | 24.19 | 48.19 | 18.04
    Attention-LSTM | Time (s) | 138.1 | 168.7 | 186.5 | 260.7
    SBU-LSTM | NMSE | 0.0383 | 0.1926 | 0.1830 | 0.3995
    SBU-LSTM | MAE | 0.0361 | 0.0446 | 0.0604 | 0.1129
    SBU-LSTM | SMAPE | 30.845 | 7.576 | 47.157 | 40.482
    SBU-LSTM | IM% | 41.79 | 21.42 | 50.70 | 15.41
    SBU-LSTM | Time (s) | 127.8 | 140.9 | 145.5 | 160.9
    EA-LSTM | NMSE | 0.0210 | 0.0791 | 0.1654 | 0.3427
    EA-LSTM | MAE | 0.0221 | 0.0252 | 0.0608 | 0.1262
    EA-LSTM | SMAPE | 17.845 | 8.435 | 48.191 | 45.577
    EA-LSTM | IM% | 68.08 | 67.76 | 55.44 | 27.44
    EA-LSTM | Time (s) | 156.6 | 191.8 | 212.3 | 307.9
    LSTM-FCN | NMSE | 0.0279 | 0.1135 | 0.2131 | 0.3722
    LSTM-FCN | MAE | 0.0292 | 0.0314 | 0.0766 | 0.1335
    LSTM-FCN | SMAPE | 23.467 | 8.290 | 49.478 | 46.270
    LSTM-FCN | IM% | 57.60 | 53.69 | 42.58 | 21.19
    LSTM-FCN | Time (s) | 161.8 | 205.6 | 240.4 | 319.2
    DBI-BiLSTM | NMSE | 0.0096 | 0.0601 | 0.1423 | 0.3272
    DBI-BiLSTM | MAE | 0.0138 | 0.0149 | 0.0501 | 0.0897
    DBI-BiLSTM | SMAPE | 15.418 | 6.713 | 35.513 | 32.493
    DBI-BiLSTM | IM% | 85.41 | 75.47 | 61.66 | 30.72
    DBI-BiLSTM | Time (s) | 130.7 | 147.8 | 148.2 | 168.4
    Figure 18.  Box plot of test NMSE using DBI-BiLSTM and many other MTS forecasting models with 10-fold cross validation.

    As can be seen from the running time in Table 3, the computational complexity of the DBI-BiLSTM model is significantly higher than that of the statistical models and the single-layer RNN models. However, the computational complexity of the DBI-BiLSTM model is still acceptable compared with the recently proposed multi-layer LSTM model (Attention-LSTM, SBU-LSTM, EA-LSTM, and LSTM-FCN) based on feature learning.

    Heteroscedasticity, in contrast to homoscedasticity, occurs when the variance of the stochastic error terms of the fitted network is not constant. Specifically, if the variance of the error term changes with the independent variable, the errors are heteroscedastic. Figure 19 displays the results of the heteroscedasticity test for each MTS dataset. From the behavior of the residuals against the fitted values in Figure 19, the values are typically spread around 0, and the variance remains essentially steady as the fitted values increase. Figure 19 therefore shows little heteroscedasticity in the DBI-BiLSTM models, and homoscedasticity can be considered maintained.
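    A simple way to reproduce such a check is to bin the residuals by fitted value and compare the per-bin standard deviations; roughly equal values across bins indicate homoscedasticity. The helper below is an illustrative sketch, not the authors' test.

```python
import numpy as np

def residual_spread(y, yhat, n_bins=10):
    """Residual standard deviation per fitted-value bin (flat => homoscedastic)."""
    resid = y - yhat
    order = np.argsort(yhat)                   # sort residuals by fitted value
    bins = np.array_split(resid[order], n_bins)
    return np.array([b.std() for b in bins])
```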

    Figure 19.  Performance and results of the heterogeneity test for the DBI-BiLSTM prediction model.

    Given that the feature vectors output by DBN are typically more complex and have a higher dimension, it becomes necessary to incorporate the GSA process. This is because GSA provides a framework for attributing the uncertainty in the model's output to various sources of uncertainty in the input factors of the model. Figure 20 illustrates the schematic chart of the total SI acquired by the GSA algorithm for the four MTS benchmarks. The SI in Figure 20 enables us to identify the output of the DBN with high and low sensitivity to DBI-BiLSTM. This information helps us to classify the input data of DI-BiLSTM into major and minor features.

    Figure 20.  SA of DBN's output to DBI-BiLSTM performance.

    To demonstrate the different characteristics of the one-direction prediction model (DBI-LSTM) and the bidirectional forecasting model (DBI-BiLSTM), as well as the differences in weight distribution resulting from two-direction transmission, partial output-weight heatmaps of the DBI-LSTM (unidirectional) and the DBI-BiLSTM (bidirectional forward and bidirectional backward) are plotted. Figure 21 displays these heatmaps, where the red rectangle on the left represents the weight heatmap of the DBI-LSTM, and the green boxes represent the forward and backward heatmaps of the DBI-BiLSTM. As indicated in Table 2 and Figure 21, bidirectional propagation outperforms one-direction propagation for both shallow and multi-layer LSTM structures. Additionally, Figure 21 shows that the weights from the recurrent layer to the output layer of the one-direction LSTM are mostly uniformly distributed within an interval symmetric about 0, indicating that the features in the one-direction LSTM are transmitted in a single direction. Conversely, the forward and backward output weights of the bidirectional LSTM models are clearly distinguishable in Figure 21: if most of the forward weights are greater than zero, then most of the backward weights are less than zero, and vice versa. This clearly illustrates that the DBI-BiLSTM model can effectively learn the characteristics of the MTS in different directions, which further increases the interpretability of the DBI-BiLSTM.
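    A comparable visualization can be produced with matplotlib by rendering the three output-weight matrices side by side; the function below is a sketch, with the weight matrices assumed to be available as 2-D NumPy arrays.

```python
import matplotlib.pyplot as plt

def plot_weight_heatmaps(w_uni, w_fwd, w_bwd):
    """Heatmaps of unidirectional vs. forward/backward output weights (cf. Figure 21)."""
    fig, axes = plt.subplots(1, 3, figsize=(12, 4))
    panels = [(w_uni, "unidirectional"), (w_fwd, "forward"), (w_bwd, "backward")]
    for ax, (w, title) in zip(axes, panels):
        im = ax.imshow(w, cmap="coolwarm", aspect="auto")  # diverging map centered near 0
        ax.set_title(title)
        fig.colorbar(im, ax=ax)
    plt.tight_layout()
    plt.show()
```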

    Figure 21.  Heatmaps analysis presentation of the partial output weights (left part is unidirectional).

    This paper proposes a novel deep improved BiLSTM network for MTS forecasting. The proposed network comprises a DBN, a GSA module, and a stacked BiLSTM network. The DBN layer is used for unsupervised feature learning, and the learned features are divided by GSA into major and minor parts, which are then fed into the corresponding BiLSTM modules. The different layers of the BiLSTM learn the features of the input data and integrate the final output results. The DBI-BiLSTM network leverages the BiLSTM and RBM to thoroughly capture the transient information of the signals at different layers, collecting diverse and rich information in the forward and backward directions. Four real-world MTS datasets were applied to test the performance of DBI-BiLSTM. Comparative experimental results on the MTS tasks demonstrate that the proposed DBI-BiLSTM outperforms some conventional ML forecasting models, several classical RNN-based MTS prediction models, and a few recently proposed LSTM-based models. The percentage improvements of DBI-BiLSTM over the original shallow LSTM on the four MTS datasets are 85.41%, 75.47%, 61.66%, and 30.72%, respectively.

    The proposed DBI-BiLSTM can effectively extract features from MTS data and learn them sufficiently in a multi-layer chained structure to improve performance. Additionally, the DBI-BiLSTM is more robust and flexible in forecasting MTS datasets. The visualization and interpretation of the input and output weights reflect the proposed model's reasonableness. The ideas presented in this paper can also be applied to other neural networks. Future work will involve evaluating other tasks with the proposed method and using advanced optimization algorithms to optimize the network parameters and improve training efficiency.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    The authors gratefully acknowledge the support of the following foundations: Research Project of State Grid Ningbo Electric Power Supply Company under Grant 2022YXKJ-002, National Natural Science Foundation of China (62002183), Zhejiang Provincial Natural Science Foundation of China (LQ20F020012).

    The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.



    [1] Butt OM, Zulqarnain M, Butt TM (2021) Recent advancement in smart grid technology: Future prospects in the electrical power network. Ain Shams Eng J 12: 687-695. https://doi.org/10.1016/j.asej.2020.05.004
    [2] Priyanka EB, Thangavel S, Gao XZ (2021) Review analysis on cloud computing based smart grid technology in the oil pipeline sensor network system. Pet Res 6: 77-90. https://doi.org/10.1016/j.ptlrs.2020.10.001
    [3] Pal R, Chavhan S, Gupta D, et al. (2021) A comprehensive review on IoT-based infrastructure for smart grid applications. IET Renewable Power Gener 15: 3761-3776. https://doi.org/10.1049/rpg2.12272
    [4] Lopez J, Rubio JE, Alcaraz C (2021) Digital twins for intelligent authorization in the B5G-enabled smart grid. IEEE Wireless Commun 28: 48-55. https://doi.org/10.1109/MWC.001.2000336
    [5] Li Y, Yan J (2022) Cybersecurity of smart inverters in the smart grid: A survey. IEEE Trans Power Electron 38: 2364-2383. https://doi.org/10.1109/TPEL.2022.3206239
    [6] Xu S, Yu B (2021) Current development and prospect of hydrogen energy technology in China. J Beijing Inst Technol (Social Sciences Edition) 23: 1-12.
    [7] Pingkuo L, Xue H (2022) Comparative analysis on similarities and differences of hydrogen energy development in the World's top 4 largest economies: A novel framework. Int J Hydrogen Energy 47: 9485-9503. https://doi.org/10.1016/j.ijhydene.2022.01.038
    [8] Tarasov BP, Fursikov PV, Volodin AA, et al. (2021) Metal hydride hydrogen storage and compression systems for energy storage technologies. Int J Hydrogen Energy 46: 13647-13657. https://doi.org/10.1016/j.ijhydene.2020.07.085
    [9] Diaz IU, de Queiróz Lamas W, Lotero RC (2023) Development of an optimization model for the feasibility analysis of hydrogen application as energy storage system in microgrids. Int J Hydrogen Energy 48: 16159-16175. https://doi.org/10.1016/j.ijhydene.2023.01.128
    [10] Zhang X (2021) The development trend of and suggestions for China's hydrogen energy industry. Engineering 7: 719-721. https://doi.org/10.1016/j.eng.2021.04.012
    [11] Abomazid AM, El-Taweel NA, Farag HEZ (2022) Optimal energy management of hydrogen energy facility using integrated battery energy storage and solar photovoltaic systems. IEEE Trans Sustainable Energy 13: 1457-1468. https://doi.org/10.1109/TSTE.2022.3161891
    [12] Shao C, Feng C, Shahidehpour M, et al. (2021) Optimal stochastic operation of integrated electric power and renewable energy with vehicle-based hydrogen energy system. IEEE Trans Power Syst 36: 4310-4321. https://doi.org/10.1109/TPWRS.2021.3058561
    [13] Qays MO, Ahmad I, Abu-Siada A, et al. (2023) Key communication technologies, applications, protocols and future guides for IoT-assisted smart grid systems: A review. Energy Rep 9: 2440-2452. https://doi.org/10.1016/j.egyr.2023.01.085
    [14] Jenkins JD, Sepulveda NA (2021) Long-duration energy storage: A blueprint for research and innovation. Joule 5: 2241-2246. https://doi.org/10.1016/j.joule.2021.08.002
    [15] Magdy G, Bakeer A, Alhasheem M (2021) Superconducting energy storage technology-based synthetic inertia system control to enhance frequency dynamic performance in microgrids with high renewable penetration. Prot Control Mod Power Syst 6: 1-13. https://doi.org/10.1186/s41601-021-00212-z
    [16] Chatterjee S, Parsapur RK, Huang KW (2021) Limitations of ammonia as a hydrogen energy carrier for the transportation sector. ACS Energy Lett 6: 4390-4394. https://doi.org/10.1021/acsenergylett.1c02189
    [17] Li J, Gu C, Xiang Y, et al. (2022) Edge-cloud computing systems for smart grid: state-of-the-art, architecture, and applications. J Mod Power Syst Clean Energy 10: 805-817. https://doi.org/10.35833/MPCE.2021.000161
    [18] Ari I (2023) A low carbon pathway for the turkish electricity generation sector. Green Low-Carbon Econ 1: 147-153. https://doi.org/10.47852/bonviewGLCE3202552
    [19] Xu X, Zhou Q, Yu D (2022) The future of hydrogen energy: Bio-hydrogen production technology. Int J Hydrogen Energy 47: 33677-33698. https://doi.org/10.1016/j.ijhydene.2022.07.261
    [20] Scovell MD (2022) Explaining hydrogen energy technology acceptance: A critical review. Int J Hydrogen Energy 47: 10441-10459. https://doi.org/10.1016/j.ijhydene.2022.01.099
    [21] Liu X, Liu X, Jiang Y, et al. (2022) Photovoltaics and energy storage integrated flexible direct current distribution systems of buildings: definition, technology review, and application. CSEE J Power Energy Syst 9: 829-845. https://doi.org/10.17775/CSEEJPES.2022.04850
    [22] Așchilean I, Cobȋrzan N, Bolboaca A, et al. (2021) Pairing solar power to sustainable energy storage solutions within a residential building: A case study. Int J Energy Res 45: 15495-15511. https://doi.org/10.1002/er.6982
    [23] Kaur A, Narang N (2024) Multi-objective generation scheduling of integrated energy system using hybrid optimization technique. Neural Comput Appl 36: 1215-1236. https://doi.org/10.1007/s00521-023-09091-x
    [24] Zhong Z, Fan N, Wu L (2024) Multistage robust optimization for the day-ahead scheduling of hybrid thermal-hydro-wind-solar systems. J Global Optim 88: 999-1034. https://doi.org/10.1007/s10898-023-01328-2
    [25] Liu Z, Huang B, Hu X, et al. (2023) Blockchain-based renewable energy trading using information entropy theory. IEEE Trans Network Sci Eng 11: 5564-5575. https://doi.org/10.1109/TNSE.2023.3238110
    [26] Sun Q, Han R, Zhang H, et al. (2015) A multiagent-based consensus algorithm for distributed coordinated control of distributed generators in the energy internet. IEEE Trans Smart Grid 6: 3006-3019. https://doi.org/10.1109/TSG.2015.2412779
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
