
In recent years, financial activity has surged alongside rapid economic development, and market trends have become increasingly complex. Simultaneously, advances in information technology have led to the accumulation of vast amounts of financial transaction data. Leveraging this historical data to uncover patterns in financial activity and predict its evolution has become a key focus of both academic and industry research. Such predictions benefit investors and profit-driven organizations by enabling informed decision-making at the micro level, with the goal of maximizing profit. However, the inherent nonlinearity and complexity of financial data make accurate financial price prediction an exceptionally challenging task [1].
Early researchers primarily adopted statistical models and econometric methods to construct prediction models. These models assume that the data exhibit linear characteristics, an assumption that does not align with the nonlinear nature of financial data. Subsequently, nonlinear machine learning models, including support vector machines (SVM) and neural networks, were widely used in financial forecasting. More recently, with the broad adoption of deep learning across domains, techniques such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks have been extensively employed in financial forecasting, as they are particularly effective at modeling nonlinear problems [2]. To further enhance feature representation, researchers have begun applying transformer models in a variety of fields. This paper introduces the transformer model into stock price prediction, combining it with a CNN to extract local features effectively. Additionally, to make the predictions more practical, this work explores multi-step forecasting, which provides a more comprehensive view of future stock price trends.
The primary contributions of this paper are summarized as follows:
● We propose a compact hybrid model that fully leverages the complementary advantages of CNNs and transformers.
● We achieve not only one-step but also multi-step stock price prediction. Multi-step forecasts capture stock trends on a larger time scale, which is beneficial for decision-making.
● We selected stock data from four different sectors to demonstrate the generalization ability and adaptability of our method. Experiments show that the proposed method achieves superior performance.
The remainder of the paper is organized as follows: Section 2 provides an overview of recent literature on stock price prediction. In Section 3, we introduce the proposed fusion approach. Section 4 presents the experimental setup, followed by a discussion of the results. Finally, Section 5 concludes the paper and outlines directions for future research.
Researchers have made significant advancements in predicting stock prices. Early efforts primarily utilized time series analysis models such as autoregressive conditional heteroskedasticity (ARCH) [3] and generalized autoregressive conditional heteroskedasticity (GARCH) [4]. To better address the nonlinearity and volatility of financial data, machine learning models, including artificial neural networks (ANN) [5] and support vector machines (SVM) [6], have been widely applied. Tay et al. were the first to apply SVM to stock market prediction [7]. Birgul et al. used ANN to predict stock returns [8], and Kara et al. further confirmed the significant performance of ANN and SVM models in predicting stock prices [9].
However, these models have drawbacks, such as being prone to local optima and parameter tuning difficulties. To address these issues, researchers have proposed hybrid models based on these traditional approaches to enhance performance. Armano et al. introduced a hybrid neural network model that uses a genetic classifier to control the activation of feedforward neural networks [10]. Fu et al. proposed the Bayesian Ying-Yang neural network, inspired by the ancient Chinese Yin-Yang philosophy [11]. Additionally, Choudhary et al. developed a hybrid model combining a genetic algorithm with SVM, which was applied to stock market prediction [12]. Vijh et al. [13] combined an artificial neural network with random forest techniques to predict the next day's stock closing price, demonstrating the effectiveness of their model. Meanwhile, Chandar et al. [14] assessed the efficiency of a hybrid prediction model using a multi-layer perceptron (MLP) and cat swarm optimization (CSO) algorithm. In our previous work [15], we also proposed a hybrid model combining particle swarm optimization with SVM. The experimental results showed that the hybrid model outperforms single models.
With the rise of deep learning technology, deep neural networks have been increasingly employed for stock market prediction. Wanjawa et al. applied deep neural networks to predict three stocks on the New York Stock Exchange, achieving favorable prediction results [16]. CNNs, known for their exceptional performance in image recognition, have also been utilized in stock prediction tasks. Tsantekidis et al. constructed a 1-D CNN for stock price prediction, demonstrating that CNN outperforms multi-layer perceptron and SVM models [17]. Gudelek et al. used a 2-D CNN to predict stock price trends by classifying trading images [18].
Furthermore, recurrent neural networks (RNNs) are crucial for addressing sequential problems and have been widely applied in fields such as speech recognition and natural language processing. Samarawickrama et al. employed RNNs for daily stock price prediction of listed companies, achieving superior results compared to feedforward neural networks [19]. Roondiwala et al. used LSTM networks to predict the stock returns of NIFTY 50 [20]. Selvin et al. compared the performance of three models (RNN, CNN, and LSTM) in predicting stock prices of NSE-listed companies, finding that RNN and LSTM were capable of capturing long-term dependencies in the data, while CNN excelled at capturing short-term dependencies [21]. Building on these findings, Lu et al. proposed a CNN-LSTM hybrid model [22], which uses CNNs to extract local features and LSTM networks to learn temporal features. The results once again demonstrated the superiority of hybrid models over single models.
In recent years, transformer models [23] have set new benchmarks in natural language processing and are increasingly being utilized in areas such as image recognition [24] and multimodal tasks [25]. Lai et al. [26] introduced a differential transformer model for predicting stock movement, incorporating a temporal attention-augmented bilinear layer and a temporal convolutional network to filter noise from the data. Tao et al. [27] developed the series decomposition transformer with period-correlation (SDTP) to discern relationships in historical data, thereby enhancing trend prediction in the stock market with high accuracy and generalizability. Mishra et al. [28] combined a transformer with GARCH and LSTM models to improve volatility forecasting and financial risk assessment. Recently, Shi et al. [29] proposed a new model called MambaStock, which effectively mines historical stock market data to predict future stock prices without handcrafted features.
The research literature indicates that transformers excel at capturing global features, while CNNs are more effective at representing local features. Therefore, in this paper, we combine the strengths of CNNs and transformers in a hybrid model for stock price prediction (CNN-Trans-SPP).
The task of stock price prediction involves forecasting the price for the next day or several days using stock trading data from the previous days. Predicting the stock price for only the next day is referred to as single-step prediction, while forecasting the stock prices for several consecutive days is known as multi-step prediction. In this paper, we focus on predicting the opening price of the next day by leveraging trading data from the previous five days. The formal description of the single-step prediction task is shown in Eqs (3.1) and (3.2).
$$x^{open}_t = f(x_{t-5}, x_{t-4}, x_{t-3}, x_{t-2}, x_{t-1}) \tag{3.1}$$

$$x_t = [x^{open}_t, x^{high}_t, x^{low}_t, x^{close}_t, x^{vol}_t] \tag{3.2}$$
Here, $x_t$ represents the day-$t$ trading data, comprising the opening price, highest price, lowest price, closing price, and trading volume, and $t$ denotes a specific day in the time series.
For the multi-step prediction task, we predict the opening prices for the next three days using the trading data from the previous five days. The multi-step prediction is formalized as shown in Eq (3.3).
$$[x^{open}_t, x^{open}_{t+1}, x^{open}_{t+2}] = f(x_{t-5}, x_{t-4}, x_{t-3}, x_{t-2}, x_{t-1}) \tag{3.3}$$
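To make the data preparation concrete, the following sketch (ours, not the authors' released code) builds single-step and multi-step training samples from a daily OHLCV array; the function name and array layout are illustrative assumptions.

```python
import numpy as np

def make_windows(data, lookback=5, horizon=1):
    """Build (X, y) pairs from a daily OHLCV array.

    data: array of shape (T, 5) with columns [open, high, low, close, volume].
    lookback: number of past days fed to the model (5 in this paper).
    horizon: 1 for single-step, 3 for multi-step prediction.
    Returns X of shape (N, lookback, 5) and y of shape (N, horizon),
    where y holds the future opening prices (column 0).
    """
    X, y = [], []
    for t in range(lookback, len(data) - horizon + 1):
        X.append(data[t - lookback:t])     # x_{t-5}, ..., x_{t-1}
        y.append(data[t:t + horizon, 0])   # x^{open}_t, ..., x^{open}_{t+horizon-1}
    return np.stack(X), np.stack(y)

# Example with 100 days of synthetic OHLCV data.
prices = np.random.rand(100, 5)
X1, y1 = make_windows(prices, lookback=5, horizon=1)  # single-step: y1 is (95, 1)
X3, y3 = make_windows(prices, lookback=5, horizon=3)  # multi-step:  y3 is (93, 3)
```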
The CNN-Trans-SPP model consists of five parts: a Conv1d embedding layer, position encoding, an encoder, a decoder, and a linear layer, as shown in Figure 1. The Conv1d embedding layer consists of 512 convolutional kernels. For position encoding, the standard transformer captures relative position information through sine and cosine functions; however, absolute position information is more beneficial for time series prediction tasks. Therefore, we utilize a tabular position encoding, which limits position values to the range [0, 1] and distributes them evenly. The corresponding formula is shown in Eq (3.4).
$$y_i = \begin{cases} 0 & \text{if } i = 0, \\ y_{i-1} + \dfrac{1}{n} & \text{if } 0 < i < k-1, \\ 1 & \text{if } i = k-1, \end{cases} \tag{3.4}$$
where $y_i$ denotes the position value of the $i$-th element, $k$ denotes the length of the time series, and $n = k-1$, so that the $k$ position values are evenly spaced over [0, 1]. A linear layer is then used to embed each position value. The encoder consists of three encoder blocks, each comprising a multi-head attention mechanism, residual connections, and a feedforward neural network layer. The decoder likewise consists of three decoder blocks, each with an additional encoder-decoder attention mechanism.
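A minimal sketch of this tabular position encoding, assuming a PyTorch implementation with $n = k-1$; the module name and the `d_model` default are illustrative choices, not the authors' exact code.

```python
import torch
import torch.nn as nn

class TabularPositionEncoding(nn.Module):
    """Evenly spaced absolute positions in [0, 1], followed by a learned linear embedding."""
    def __init__(self, d_model=512):
        super().__init__()
        self.embed = nn.Linear(1, d_model)  # maps the scalar position to d_model dimensions

    def forward(self, x):
        # x: (batch, k, d_model); k is the sequence length.
        k = x.size(1)
        # Eq (3.4) with n = k - 1: y_0 = 0, y_i = y_{i-1} + 1/(k-1), y_{k-1} = 1.
        pos = torch.linspace(0.0, 1.0, k, device=x.device).unsqueeze(-1)  # (k, 1)
        return x + self.embed(pos)  # broadcasts over the batch dimension
```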
LSTM is an advanced form of RNN designed to overcome the vanishing and exploding gradient problems that arise during training. It excels at capturing long-term dependencies and is particularly effective for time series prediction [30,31]. LSTM replaces the hidden neurons of the RNN with memory cells, which selectively retain or discard information through a gated mechanism with three gates: the input gate, forget gate, and output gate. The architecture of an LSTM memory cell is illustrated in Figure 2, and its functioning is expressed by Eqs (3.5)–(3.9).
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \tag{3.5}$$

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \tag{3.6}$$

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \tag{3.7}$$

$$c_t = f_t \otimes c_{t-1} + i_t \otimes \tanh(W_c x_t + U_c h_{t-1} + b_c) \tag{3.8}$$

$$h_t = o_t \otimes \tanh(c_t) \tag{3.9}$$
The forget gate $f_t$ determines how much of the previous cell state $c_{t-1}$ is retained or discarded. The input gate $i_t$ governs how much of the current input is integrated into the updated cell state $c_t$. The output gate $o_t$ controls the output derived from the current cell state, which is forwarded to the next time step. In Eqs (3.5)–(3.9), $W$ and $U$ represent the weight matrices to be optimized, while $b$ denotes the bias term. The symbol $\sigma$ stands for the sigmoid activation function, and $\otimes$ represents element-wise multiplication.
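For reference, Eqs (3.5)–(3.9) translate directly into code. The following NumPy sketch of a single memory-cell update is illustrative; the dictionary-based parameter layout is an assumption made for readability.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM memory-cell update following Eqs (3.5)-(3.9).

    W, U, b are dicts holding the weight matrices and bias vectors for the
    forget (f), input (i), output (o), and candidate (c) transforms.
    """
    f_t = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])   # Eq (3.5): forget gate
    i_t = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])   # Eq (3.6): input gate
    o_t = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])   # Eq (3.7): output gate
    c_t = f_t * c_prev + i_t * np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])  # Eq (3.8)
    h_t = o_t * np.tanh(c_t)                                 # Eq (3.9): hidden state
    return h_t, c_t
```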
The attention mechanism was initially introduced in computer vision, where it mimics the human visual system by focusing on key local features while filtering out irrelevant information. Incorporating attention into deep learning models helps manage information overload, improving both efficiency and performance. Specifically, the mechanism trains a learnable scoring vector whose softmax-normalized scores produce a weighted combination of the inputs. In the context of LSTM models, attention can be applied across time steps, allowing the model to concentrate on the most informative temporal contexts.
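A sketch of one common way to realize this idea, softmax-weighted pooling of LSTM hidden states across time steps; the module and layer names here are illustrative, not the paper's exact attention-based LSTM architecture.

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Softmax-weighted pooling of LSTM hidden states across time steps."""
    def __init__(self, hidden_size=64):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)  # learnable scoring vector

    def forward(self, h):
        # h: (batch, steps, hidden_size) -- all LSTM hidden states.
        weights = torch.softmax(self.score(h), dim=1)  # (batch, steps, 1)
        return (weights * h).sum(dim=1)                # (batch, hidden_size)

# Usage with an LSTM over 5 days of 5 features each:
lstm = nn.LSTM(input_size=5, hidden_size=64, batch_first=True)
attn = TemporalAttention(hidden_size=64)
out, _ = lstm(torch.randn(8, 5, 5))   # out: (8, 5, 64)
context = attn(out)                   # (8, 64), fed to a final linear layer
```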
The experimental data used in this study were obtained from BaoStock, a free and open-source securities data platform. To validate the performance of CNN-Trans-SPP, we selected eight representative stocks listed on the Shanghai Stock Exchange and the Shenzhen Stock Exchange, spanning four sectors: finance, technology, agriculture, and industry. The daily data for each stock include five features: opening price, closing price, highest price, lowest price, and trading volume. The selected stock names and their corresponding codes are presented in Table 1.
| Category | Stock name | Stock code |
| --- | --- | --- |
| Finance | China Bank (CB) | 601988 |
| Finance | Ping An Bank (PAB) | 000001 |
| Technology | Inspur Group (IG) | 000977 |
| Technology | Tongfang Co. (TC) | 600100 |
| Agriculture | Longping Hi-Tech (LHT) | 000998 |
| Agriculture | Denghai Seeds (DS) | 002041 |
| Industry | China Heavy Industries (CHI) | 601989 |
| Industry | XCMG Machinery (XCMG) | 000425 |
To verify the effectiveness of the proposed methods, we utilized three widely adopted quality metrics: root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE). RMSE is frequently used to evaluate regression models and is particularly sensitive to outliers and extreme values, meaning larger errors have a more pronounced impact, as shown in Eq (4.1). MAPE offers an intuitive, scale-free measure of model performance applicable to various data types; unlike RMSE, it is less influenced by outliers, making it more robust, as shown in Eq (4.2). MAE provides a linear evaluation of error magnitude and a straightforward representation of errors, although it does not differentiate between the impacts of small and large errors, as shown in Eq (4.3). In these equations, $y_i$ and $\hat{y}_i$ represent the actual and predicted values, respectively, while $N$ denotes the number of samples.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} \tag{4.1}$$

$$\mathrm{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{4.2}$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i| \tag{4.3}$$
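Eqs (4.1)–(4.3) correspond directly to the following NumPy implementations (ours, shown for clarity):

```python
import numpy as np

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))        # Eq (4.1)

def mape(y, y_hat):
    return np.mean(np.abs((y - y_hat) / y)) * 100.0  # Eq (4.2), in percent

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))                # Eq (4.3)
```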
We select five features for each stock: the opening price, highest price, lowest price, closing price, and trading volume for each day. The data from the previous five days are used as input to predict the opening price for the next day or the next three days. We apply min-max normalization to each of the five features, $x_{norm} = (x - x_{min})/(x_{max} - x_{min})$, where $x$ denotes the original value, $x_{norm}$ the normalized value, and $x_{min}$ and $x_{max}$ the minimum and maximum values, respectively. We use 80% of the data as the training set and the rest as the test set. For the LSTM and attention-based LSTM models, the normalized features of each day serve as the input at each time step. For CNN-Trans-SPP and MLP-Trans-SPP, the normalized features are mapped to a feature vector of size 512 using either a 1D convolution or a linear layer; positional encoding is then added, and the result serves as the input to the transformer.
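A short sketch of this preprocessing step; fitting the min-max statistics on the training split only is an assumed convention here, since the paper does not state which split supplies them.

```python
import numpy as np

def minmax_fit_transform(train, test):
    """Per-feature min-max scaling; statistics come from the training split only
    (an assumed convention -- the paper does not specify)."""
    x_min = train.min(axis=0)
    x_max = train.max(axis=0)
    scale = np.where(x_max > x_min, x_max - x_min, 1.0)  # guard against constant features
    return (train - x_min) / scale, (test - x_min) / scale

data = np.random.rand(1000, 5)        # T days x 5 OHLCV features
split = int(0.8 * len(data))          # 80% train / 20% test, as in the paper
train_n, test_n = minmax_fit_transform(data[:split], data[split:])
```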
In the single-step prediction experiments, we employed an LSTM model with an input layer of 5 nodes, a hidden layer of 64 nodes, and an output layer of 1 node. The time step was set to 5. We conducted experiments on PAB stock, setting the number of hidden layer nodes to 16, 32, 48, 64, and 80. The results obtained for the MAPE metric were 1.13%, 0.94%, 0.89%, 0.82%, and 0.91%, respectively. Therefore, we set the number of hidden layer nodes to 64. During the training phase, the learning rate was set to 0.001, and the number of epochs was set to 1000. For the attention-based LSTM, we used the same network structure as the standard LSTM to facilitate a direct comparison of effectiveness. The only difference was that the learning rate was adjusted to 0.0001. In the multi-step prediction experiments, we utilized an LSTM model with an input layer of 5 nodes, a hidden layer of 64 nodes, and an output layer of 3 nodes. All other experimental settings were consistent with the single-step prediction experiments.
To validate the effectiveness and robustness of the proposed method, we conducted experiments on eight stocks across the finance, technology, agriculture, and industry sectors. The LSTM model and the attention-based LSTM model, both commonly used for time series prediction, were employed as comparison methods, and performance was evaluated using MAPE, RMSE, and MAE. For ease of analysis, the experimental results are presented in Table 2. Observing the average values over the eight stocks, CNN-Trans-SPP achieves the best results on all three indicators (0.66% in MAPE, 8.75 in RMSE, and 5.90 in MAE), compared to LSTM (1.20%, 16.45, and 12.46), attention-based LSTM (0.97%, 12.71, and 9.47), and MLP-Trans-SPP (1.04%, 13.88, and 8.08). In terms of MAPE, the proposed CNN-Trans-SPP decreased the error by 32% compared to attention-based LSTM and by 45% compared to LSTM. Compared to the other transformer variant (MLP-Trans-SPP), CNN-Trans-SPP improved by approximately 36.8%. At the level of individual stocks, CNN-Trans-SPP outperforms attention-based LSTM, LSTM, and MLP-Trans-SPP on all metrics for seven of the eight stocks, the exception being LHT. The performance of MLP-Trans-SPP lies between that of LSTM and ALSTM, and LSTM achieved the worst results overall.
| Metrics | Methods | Avg | PAB | CB | TC | IG | DS | LHT | CHI | XCMG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MAPE(%)↓ | LSTM | 1.20 | 0.89 | 1.94 | 0.95 | 0.93 | 1.05 | 0.72 | 1.29 | 1.82 |
| | ALSTM | 0.97 | 0.82 | 1.53 | 0.72 | 0.56 | 0.81 | 0.41 | 1.31 | 1.62 |
| | MTSP | 1.04 | 0.94 | 1.56 | 1.01 | 0.49 | 0.86 | 0.93 | 1.53 | 1.03 |
| | CTSP | 0.66 | 0.38 | 1.37 | 0.53 | 0.24 | 0.34 | 0.47 | 1.01 | 0.93 |
| RMSE(e-2)↓ | LSTM | 16.45 | 18.67 | 11.21 | 13.74 | 33.87 | 19.39 | 17.15 | 7.03 | 10.51 |
| | ALSTM | 12.71 | 12.36 | 9.67 | 9.72 | 17.66 | 21.26 | 12.75 | 8.37 | 9.85 |
| | MTSP | 13.88 | 18.41 | 11.95 | 13.51 | 12.53 | 16.44 | 18.82 | 8.93 | 10.46 |
| | CTSP | 8.75 | 9.17 | 9.01 | 9.08 | 8.81 | 9.58 | 9.46 | 6.27 | 8.63 |
| MAE(e-2)↓ | LSTM | 12.46 | 12.83 | 6.46 | 7.93 | 26.61 | 18.26 | 12.45 | 5.62 | 9.54 |
| | ALSTM | 9.47 | 11.59 | 6.09 | 5.93 | 15.93 | 14.46 | 7.26 | 5.93 | 8.57 |
| | MTSP | 8.08 | 10.21 | 6.39 | 6.22 | 7.13 | 10.31 | 12.61 | 5.32 | 6.47 |
| | CTSP | 5.90 | 5.59 | 5.53 | 4.59 | 7.37 | 6.12 | 8.02 | 4.51 | 5.41 |

Note: CTSP is the abbreviation for CNN-Trans-SPP. ALSTM is the abbreviation for attention-based LSTM. MTSP is the abbreviation for MLP-Trans-SPP. Avg represents the average value over the eight stocks.
To provide a more intuitive comparison of the methods, we plot the first 100 predicted values in Figure 3. In Figure 3, especially in subgraphs (a), (b), (e), and (h), the brown dashed line (LSTM) and the blue dashed line (MLP-Trans-SPP) deviate significantly from the black line (target), whereas most of the red dashed line (CNN-Trans-SPP) overlaps with the black line or fluctuates near it. Subgraphs (d) and (f) show that all methods performed well on these two stocks (IG and LHT), almost matching the target. Across all stocks, CNN-Trans-SPP exhibits good robustness and high performance.
To gain insight into the fluctuation trends of stock prices, it is essential to predict prices over a longer horizon. We therefore implemented a multi-step prediction approach and evaluated its performance, again comparing CNN-Trans-SPP, MLP-Trans-SPP, attention-based LSTM, and LSTM using MAPE, RMSE, and MAE. The detailed experimental results are presented in Table 3. Observing the average values over the eight stocks, CNN-Trans-SPP achieves the best results on all three indicators (0.99% in MAPE, 13.50 in RMSE, and 9.83 in MAE), compared to LSTM (1.37%, 20.34, and 14.91), attention-based LSTM (1.20%, 17.31, and 12.75), and MLP-Trans-SPP (1.28%, 17.99, and 11.44). In terms of MAPE, the proposed CNN-Trans-SPP decreased the error by 18% compared to attention-based LSTM and by 28% compared to LSTM. Similarly, CNN-Trans-SPP improved the prediction performance over MLP-Trans-SPP by approximately 22.8%. Although the improvement in multi-step prediction is smaller than in single-step prediction, CNN-Trans-SPP outperforms the other methods in all metrics.
| Metrics | Methods | Avg | PAB | CB | TC | IG | DS | LHT | CHI | XCMG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MAPE(%)↓ | LSTM | 1.37 | 1.79 | 1.93 | 1.25 | 0.68 | 1.51 | 0.82 | 1.37 | 1.58 |
| | ALSTM | 1.20 | 1.46 | 1.86 | 1.09 | 0.62 | 1.05 | 0.85 | 1.59 | 1.13 |
| | MTSP | 1.28 | 1.61 | 2.38 | 1.23 | 0.51 | 1.29 | 0.81 | 1.41 | 1.03 |
| | CTSP | 0.99 | 0.97 | 1.89 | 0.82 | 0.49 | 0.71 | 0.69 | 1.38 | 0.98 |
| RMSE(e-2)↓ | LSTM | 20.34 | 38.85 | 12.49 | 16.06 | 28.71 | 33.61 | 16.23 | 6.19 | 10.61 |
| | ALSTM | 17.31 | 33.25 | 11.09 | 12.61 | 18.79 | 25.81 | 20.52 | 7.79 | 8.65 |
| | MTSP | 17.99 | 35.41 | 15.03 | 15.72 | 16.69 | 25.58 | 16.72 | 9.33 | 9.43 |
| | CTSP | 13.50 | 22.05 | 11.62 | 9.32 | 15.31 | 16.93 | 17.59 | 7.52 | 7.64 |
| MAE(e-2)↓ | LSTM | 14.91 | 26.21 | 7.83 | 10.31 | 19.86 | 26.65 | 14.11 | 5.78 | 8.59 |
| | ALSTM | 12.75 | 21.44 | 7.64 | 8.73 | 17.64 | 18.63 | 14.83 | 6.82 | 6.28 |
| | MTSP | 11.44 | 20.61 | 8.04 | 6.31 | 14.35 | 15.17 | 14.32 | 6.42 | 6.31 |
| | CTSP | 9.83 | 14.29 | 7.81 | 6.52 | 13.91 | 12.56 | 12.08 | 5.93 | 5.54 |

Note: CTSP is the abbreviation for CNN-Trans-SPP. ALSTM is the abbreviation for attention-based LSTM. MTSP is the abbreviation for MLP-Trans-SPP. Avg represents the average value over the eight stocks.
To compare the methods more intuitively for multi-step prediction, we again plot the first 100 predicted values in Figure 4. Compared with Figure 3, all methods show larger fluctuations, as in Figure 4(a). In Figure 4(c), (e), and (h), the red dashed line remains superior to the other lines, while all methods again perform well on IG and LHT, almost matching the target. These observations are consistent with Table 3.
Choice of the input layer architecture. We examine whether a CNN-based input layer has advantages over a linear layer by training and testing two variants: MLP-Trans-SPP and CNN-Trans-SPP. As shown in Figure 5, the CNN-Trans-SPP model outperforms the MLP-Trans-SPP model in terms of MAPE on all the selected stocks, indicating that one-dimensional convolution captures features better than an MLP in stock price prediction.
Effect of the multi-head attention mechanism. The multi-head attention mechanism learns different features from various perspectives, thereby enhancing the model's expressive ability. Figure 6 shows the influence of the number of heads in terms of MAPE. The impact of the number of heads varies across stocks: the model performs best with 4 heads for Inspur Group (IG) and Denghai Seeds (DS), and with 16 heads for China Bank (CB). Overall, the model exhibits relatively stable performance across all stocks with 8 heads. This is likely because a 16-head attention mechanism captures more high-frequency information while neglecting low-frequency information, whereas a 4-head mechanism captures low-frequency information well but is insufficient for high-frequency information. We therefore adopt an 8-head attention mechanism in our experiments.
Effect of the embedding dimension. We further investigated the impact of the embedding dimension on model performance, conducting experiments on all stocks with three settings: 128, 512, and 1024. The results are presented in Figure 7. Contrary to the intuition that higher dimensions yield better results, the model achieves stable and good performance with 512 dimensions. This may be because a small model cannot exploit the additional capacity provided by high embedding dimensions.
Effect of the number of transformer blocks. The number of transformer blocks is a critical factor in the architecture of the transformer model. We conducted experiments with 1, 3, 5, 8, and 12 blocks. The results, displayed in Figure 8, reveal that the impact of depth varies across stocks. To provide a comprehensive evaluation, we assessed each configuration by its average MAPE across all stocks, shown as black solid dots in Figure 8. The transformer with 3 blocks performed best, likely because our dataset is small, so deeper transformers overfit and fail to achieve the desired results. Consequently, we used the 3-block transformer in our subsequent experiments.
In this paper, we propose a hybrid model based on CNN and transformer architectures for stock price prediction. The model leverages CNN to capture local features and the transformer to extract periodic global features. We evaluated the model on eight stocks across four major sectors, and the results demonstrate its superior prediction performance. Future work will focus on exploring pre-training techniques in stock price prediction to further enhance accuracy.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
This research was supported in part by the National Natural Science Foundation of China (grant No. 62273212), National Social Science Fund Project of China (grant No. 17BGL058), Shandong Province Natural Science Foundation (grant No. ZR2022MG045).
The authors declare there are no conflicts of interest.