Research article Special Issues

Forecasting volatility using combination across estimation windows: An application to S&P500 stock market index

  • The paper focuses on GARCH-type models for analysing and forecasting the S&P500 stock market index. The aim is to empirically evaluate and compare alternative forecast combinations across estimation windows for directly dealing with possible structural breaks in the observed time series. In the in-sample analysis, alternative conditional volatility dynamics, suitable to account for stylized facts, have been considered along with different conditional distributions for the innovations. Moreover, an analysis of structural breaks in the unconditional variance of the series has been performed. In the out-of-sample analysis, for each model specification, the proposed forecast combinations have been evaluated and compared in terms of their predictive ability through the model confidence set. The results give evidence of the presence of structural breaks and, as a consequence, of parameter instability in the S&P500 series. Moreover, averaging across volatility forecasts generated by individual forecasting models estimated using different window sizes performs well for all the considered GARCH-type specifications and for all the implemented conditional distributions for the innovations, and it appears to offer a useful approach to forecasting the S&P500 stock market index.

    Citation: Davide De Gaetano. Forecasting volatility using combination across estimation windows: An application to S&P500 stock market index[J]. Mathematical Biosciences and Engineering, 2019, 16(6): 7195-7216. doi: 10.3934/mbe.2019361



    Coal spontaneous combustion is a serious disaster in China: it has caused nearly 4,000 fires, with direct economic losses from combusted coal amounting to billions [1]. It also triggers secondary disasters such as gas and dust explosions and air pollution. Effective prediction is therefore the key to the prevention and control of coal spontaneous combustion. As coal oxidizes and its temperature rises, it releases indicator gases such as CO, CO2, CH4, C2H6, C2H4, C2H2 and N2, so there is a highly complex nonlinear relationship between the degree of coal spontaneous combustion and its gaseous products. By identifying the relationships between these indicator gases and the coal temperature, and by monitoring indexes such as the gaseous reaction products of the coal samples, the temperature and the oxygen consumption, the signs of coal spontaneous combustion can be detected and the development trend of spontaneous combustion predicted [2]. However, owing to the limitations of experimental conditions, the heating rate of the coal differs across the different stages of the spontaneous combustion process.

    The most common approach to predicting coal spontaneous combustion is comprehensive evaluation using machine learning methods such as cluster analysis [3], neural networks [4] and support vector machines [5]. Among these, support vector machines are grounded in VC-dimension theory and the structural risk minimization principle of statistical learning theory; they generalize well from small samples and can therefore produce effective and accurate predictions. However, the traditional support vector machine does not perform well on imbalanced sample classifications [6].

    Through proximate analysis of the composition of different coal samples, Deng used multiple regression analysis to build a regression equation for prediction, tested the significance of the equation and its coefficients, and carried out a residual analysis to verify the adequacy of the equation [7]. Wang proposed a prediction method for coal spontaneous combustion that combines the grey GM(1, 1) model with a Markov model [8]. Based on an analysis of seam spontaneous combustion at the coal mining face, support vector machine prediction has been applied to assess the spontaneous combustion danger of the residual coal in the goaf [9]. Paper [10] introduced fuzzy membership and the least squares method, adopted a neighborhood rough set method to reduce the dimensionality of the input vectors of the coal spontaneous combustion data, and used a PSO algorithm to optimize the parameters of the SVM model; a fuzzy least squares spherical support vector machine was then presented, the quadratic programming problem was solved with sequential minimal optimization, and a coal spontaneous combustion forecast model was established. Building on this line of work, this paper combines the least squares method, transfer learning and the particle swarm algorithm within the support vector machine, and proposes a prediction method based on an adaptive particle swarm optimization least squares support vector machine (APSO-LSSVM).

    The computational complexity of the standard SVM algorithm grows with the number of training samples. When the number of training samples is too large, solving the corresponding quadratic programming problem becomes complex and the computation slows down accordingly, so the speed of the standard SVM is greatly restricted in practical applications. The Least Squares Support Vector Machine (LS-SVM) algorithm alleviates this problem; the main differences between LS-SVM and the standard SVM lie in the loss function and the use of equality constraints.

    In the structure of a support vector machine, the input space consists of the original observed data, which a kernel function maps into a high-dimensional feature space; in the feature space, the support vector machine classifies or fits the data with a linear function. LS-SVM for classification is derived as follows. Let the training set be $S = \{(x_i, y_i) \mid i = 1, 2, \ldots, m\}$, where $x_i \in R^n$ is the input and $y_i \in R$ the output. Unlike the classical SVM, LS-SVM uses the SRM criterion to construct the minimization objective and its constraints as follows [11]:

    $\min_{w,e} J(w, e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{m} e_i^2, \quad \text{s.t. } y_i = w^T \Phi(x_i) + b + e_i$ (1)

    Wherein $w$ is the weight vector, $\gamma$ is the regularization constant, $e_i$ are the error variables and $b$ is the constant bias.

    To solve the optimization problem (1), we translate it into the following system of linear equations:

    $\begin{bmatrix} 0 & L^T \\ L & Q + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}$ (2)

    Wherein $L = [1, 1, \ldots, 1]^T \in R^m$, $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_m]^T \in R^m$, $I$ is the identity matrix, $Q_{ij} = y_i y_j \Phi(x_i)^T \Phi(x_j) = y_i y_j K(x_i, x_j)$, with $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / 2\sigma^2)$ a kernel function satisfying the Mercer condition, and $y = [y_1, y_2, \ldots, y_m]^T \in R^m$. The classification decision function of LS-SVM is then [12]:

    $f(x) = \mathrm{sgn}\left(\sum_{i=1}^{m} \alpha_i y_i K(x, x_i) + b\right)$ (3)

    Equation (2) can be rewritten as a matrix equation $AX = z$; in the LS-SVM algorithm $X$ is solved via the least squares method, which requires inverting $A$. For large-scale problems in practical engineering, however, the dimension of $A^T A$ is large and the matrix inversion becomes difficult to carry out [13]. We can therefore solve the matrix equation by the iterative computation of a PSO algorithm.
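As a concrete illustration, here is a minimal sketch (not the authors' MATLAB implementation) of solving the LS-SVM system directly for a small sample. Since the paper's constraint in (1) is the function-estimation form, the sketch uses the matching variant of the system — kernel entries $K(x_i, x_j)$ without the $y_i y_j$ factors and decision value $\sum_i \alpha_i K(x, x_i) + b$ — so that the system and the decision function stay mutually consistent; $\gamma = 1000$ and $\sigma^2 = 0.15$ default to the settings reported later in the experiments, and plain Gaussian elimination stands in for the PSO-based iterative solution when $m$ is small.

```python
import math

def rbf(x, z, sigma2=0.15):
    # Gaussian kernel K(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / (2 * sigma2))

def gauss_solve(A, z):
    # Gauss-Jordan elimination with partial pivoting -- fine for small m,
    # exactly the regime where a direct solve is still feasible.
    n = len(A)
    M = [row[:] + [z[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def lssvm_train(X, y, gamma=1000.0, sigma2=0.15):
    # Build and solve the KKT system (2): [[0, L^T], [L, K + I/gamma]][b; a] = [0; y]
    m = len(X)
    A = [[0.0] * (m + 1) for _ in range(m + 1)]
    for i in range(m):
        A[0][i + 1] = A[i + 1][0] = 1.0
        for j in range(m):
            A[i + 1][j + 1] = rbf(X[i], X[j], sigma2)
        A[i + 1][i + 1] += 1.0 / gamma          # ridge term gamma^{-1} I
    sol = gauss_solve(A, [0.0] + list(y))
    return sol[0], sol[1:]                      # bias b, multipliers alpha

def lssvm_predict(X, alpha, b, x, sigma2=0.15):
    # Decision value sum_i alpha_i K(x, x_i) + b, then take the sign
    s = sum(a * rbf(x, xi, sigma2) for a, xi in zip(alpha, X)) + b
    return 1 if s >= 0 else -1
```

On a toy two-cluster problem this recovers the training labels and classifies nearby points consistently.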

    The LS-SVM algorithm is initially designed for binary classification problems. To handle multi-class problems, a proper multi-class classifier must be constructed, which is usually done by combining multiple binary classifiers; the common schemes are one-versus-one and one-versus-rest [14]. The one-versus-one scheme designs one LS-SVM between every two categories of samples, so $k$ categories require $k(k-1)/2$ LS-SVMs; an unknown sample is assigned to the category receiving the most votes. Since every classifier must be traversed, this scheme is complex in training and low in classification efficiency. In the one-versus-rest scheme, the samples of one category are organized as one class and all remaining samples as the other, so $k$ categories construct $k$ LS-SVMs; an unknown sample is assigned to the category with the maximum classification function value. Compared with one-versus-one, this scheme greatly reduces the number of classifiers to train and enhances the training speed.
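The classifier counts and voting rules can be illustrated with a toy sketch; the per-pair and per-class classifiers below are hypothetical distance-to-center rules standing in for trained LS-SVMs, used only to show the combination logic.

```python
from itertools import combinations

centers = [0.0, 5.0, 10.0]      # toy 1-D "class centers" (illustration only)

def make_ovo(k):
    # one-versus-one: one binary classifier per pair -> k(k-1)/2 classifiers
    def pair_clf(a, b):
        return lambda x: a if abs(x - centers[a]) < abs(x - centers[b]) else b
    return {(a, b): pair_clf(a, b) for a, b in combinations(range(k), 2)}

def ovo_predict(classifiers, x, k):
    # each pairwise classifier casts one vote; the most-voted class wins
    votes = [0] * k
    for clf in classifiers.values():
        votes[clf(x)] += 1
    return max(range(k), key=lambda c: votes[c])

def ovr_predict(x, k):
    # one-versus-rest: k classifiers; pick the largest decision value
    return max(range(k), key=lambda c: -abs(x - centers[c]))
```

For k = 3 classes the one-versus-one scheme builds 3 classifiers versus 3 for one-versus-rest; the gap widens quadratically as k grows.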

    The PSO algorithm is an evolutionary computation method proposed in 1995 by Eberhart and Kennedy, inspired by observations of bird foraging behavior. The basic idea is to seek the optimal solution through cooperation and information sharing among the individuals of a group: the system initializes a group of random particles and finds the optimal solution by iteration. In every iteration, each particle updates itself by tracking two "extreme values"; after finding these two optima, the particle adjusts its velocity in every dimension and computes its new position [15,16]. The particle evolution formula is

    $v_i^{k+1} = \omega v_i^k + c_1 r_1 (p_i^k - x_i^k) + c_2 r_2 (p_g^k - x_i^k), \quad x_i^{k+1} = x_i^k + v_i^{k+1}$ (4)

    in which $r_1, r_2 \sim U(0, 1)$; $v_i^k$ is the velocity of particle $i$ at iteration $k$, $x_i^k$ its position, $p_i^k$ its individual best position, and $p_g^k$ the current global best position of the swarm at iteration $k$; $\omega$ is the inertia weight factor, and the constants $c_1, c_2$ are the acceleration factors.
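A minimal, self-contained sketch of the update rule (4) minimizing a test function; the parameter values here are common constriction-style defaults, not the paper's settings.

```python
import random

def pso_minimize(f, dim, n=20, iters=200, w=0.729, c1=1.49445, c2=1.49445,
                 lo=-5.0, hi=5.0, seed=0):
    # Plain PSO following equation (4); seeded for reproducibility.
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    v = [[0.0] * dim for _ in range(n)]
    p = [xi[:] for xi in x]                       # personal bests p_i
    pf = [f(xi) for xi in x]
    gi = min(range(n), key=lambda i: pf[i])
    g, gf = p[gi][:], pf[gi]                      # global best p_g
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # equation (4): v <- w v + c1 r1 (p_i - x) + c2 r2 (p_g - x)
                v[i][d] = (w * v[i][d] + c1 * r1 * (p[i][d] - x[i][d])
                           + c2 * r2 * (g[d] - x[i][d]))
                x[i][d] += v[i][d]                # x <- x + v
            fx = f(x[i])
            if fx < pf[i]:                        # update personal best
                p[i], pf[i] = x[i][:], fx
                if fx < gf:                       # update global best
                    g, gf = x[i][:], fx
    return g, gf
```

On a low-dimensional sphere function this reliably converges close to the origin.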

    From the PSO model it follows that if the acceleration factors $c_1, c_2$ or the inertia weight factor $\omega$ are too large, the particle swarm may miss the optimal solution and the algorithm may fail to converge [17,18]. Even when it converges, if every particle merely tracks the gradually contracting swarm, all particles tend toward the same solution and converge prematurely, so the convergence rate slows in the later period of the algorithm and the precision declines.

    To address this, the adjustment strategy is as follows: when the search space dimension is large, the inertia weight should be increased appropriately to enhance the global search ability; when it is small, the inertia weight should be reduced appropriately to preserve the search efficiency of the algorithm. In adjusting the inertia weight, a balance between search ability and search efficiency must be struck. To this end, we adjust the inertia weight factor $\omega$ and add a velocity constraint factor, and introduce a source domain factor and a target domain factor via transfer learning [19] to improve the basic model, obtaining the self-adaptive PSO (APSO) model:

    $v_i^{k+1} = \omega v_i^k + c_1 r_1 \left[\xi_q (p_i^k - x_i^k) + \xi_{q-1} (p_i^{k-1} - x_i^{k-1})\right] + c_2 r_2 \left[\xi_q (p_g^k - x_i^k) + \xi_{q-1} (p_g^{k-1} - x_i^{k-1})\right]$ (5)
    $v_i = \begin{cases} v_{\max}, & v_i > v_{\max} \\ -v_{\max}, & v_i < -v_{\max} \end{cases}$ (6)

    in which $\xi_q, \xi_{q-1} \in R^n$ with $\xi_q + \xi_{q-1} = 1$; $\xi_q$ is the target domain factor and $\xi_{q-1}$ the source domain factor. In particular, when $\xi_{q-1} = 0$ and $\xi_q = 1$, PSO is recovered as a special case of APSO. From a psychological perspective, using knowledge from the source domain amounts to accumulating the particle's individual search experience, which favors quicker convergence of the algorithm.
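For a single coordinate, equations (5)-(6) can be sketched as below; the ξ values are illustrative, and the random numbers r1, r2 are passed in explicitly so the behavior is easy to check.

```python
def apso_velocity(v, x, p, g, x_prev, p_prev, g_prev, r1, r2,
                  w=0.7, c1=1.5, c2=1.5, xi_q=0.8, xi_q1=0.2, vmax=1.0):
    # Equation (5): blend the current attraction terms with the previous
    # iteration's bests, weighted by xi_q + xi_{q-1} = 1.
    new_v = (w * v
             + c1 * r1 * (xi_q * (p - x) + xi_q1 * (p_prev - x_prev))
             + c2 * r2 * (xi_q * (g - x) + xi_q1 * (g_prev - x_prev)))
    # Equation (6): clamp the velocity to [-vmax, vmax].
    return max(-vmax, min(vmax, new_v))
```

With xi_q = 1 and xi_q1 = 0 the update reduces exactly to the standard PSO rule (4), matching the special case noted above.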

    The inertia weight factor $\omega$ can be adjusted according to the fitness of the particles: in the initial stage of the algorithm, $\omega$ is given a larger positive value to obtain better global search ability, while in the later stage it is given a smaller value to ease convergence. The method proposed here adjusts $\omega$ dynamically according to the degree of convergence of the swarm and the individual fitness, as follows.

    When $f(x_i) < f(p_i)$, the particles meeting this condition are the ordinary particles of the swarm, which have good global and local optimizing capacity. Their inertia weight $\omega$ changes with the search according to

    $\omega = \omega_{\min} + (\omega_{\max} - \omega_{\min}) \dfrac{k_{\max} - k}{k_{\max}}$ (7)

    in which $\omega_{\max}$ is the maximum value of $\omega$ at the beginning of the search, set to 0.9; $\omega_{\min}$ is the minimum value of $\omega$ at the end of the search, set to 0.2; $k$ is the current iteration, and $k_{\max}$ is the maximum number of iterations.

    When $f(x_i) < f(p_g)$, the particles meeting this condition are the better particles of the swarm; being close to the global optimum, they are given a smaller inertia weight to accelerate convergence toward it. Their inertia weight $\omega$ changes with the search according to

    $\omega = \omega - (\omega - \omega_{\min}) \left| \dfrac{f(x_i) - f(p_i)}{f(p_g) - f(p_i)} \right|$ (8)

    in which $\omega_{\min}$ is the minimum value of $\omega$ at the end of the search, set to 0.2. The better a particle's fitness value, the smaller its inertia weight, which benefits local optimization.
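Equations (7) and (8) translate directly into two small helpers, using the values 0.9 and 0.2 given in the text.

```python
def omega_linear(k, k_max, w_max=0.9, w_min=0.2):
    # Equation (7): linear decay from w_max at k = 0 down to w_min at k = k_max
    return w_min + (w_max - w_min) * (k_max - k) / k_max

def omega_adaptive(w, f_x, f_p, f_g, w_min=0.2):
    # Equation (8): the closer a particle's fitness is to the global best,
    # the smaller its inertia weight, bottoming out at w_min
    if f_g == f_p:
        return w_min
    return w - (w - w_min) * abs((f_x - f_p) / (f_g - f_p))
```

At k = 0 the linear schedule gives 0.9 and at k = k_max it gives 0.2; the adaptive rule leaves w unchanged when f(x_i) = f(p_i) and reaches w_min when f(x_i) = f(p_g).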

    This paper proposes a self-adaptive LS-SVM algorithm that translates the least squares solution of the matrix equation AX = z into an iterative solution by the self-adaptive PSO algorithm. In this way, matrix inversion is avoided, which ensures the convergence of the algorithm, accelerates the computation and enhances the solution accuracy. The process is as follows:

    Step 1: Choose proper training samples $n_b$ and test samples, and pre-process them.

    Step 2: Initialize the particle swarm parameters, including the velocity and position of each particle. Randomly generate $m$ particles in the space $R^n$, whose positions $(x_1, x_2, \ldots, x_m)$ form the initial swarm $X(k)$ and whose initial velocities $(v_{i1}, v_{i2}, \ldots, v_{im})$ form the velocity matrix $V(k)$. The initial individual best of every particle is its initial position $x_i$.

    Step 3: Train the LS-SVM with the training samples to compute the fitness value $f(x)$ of every particle; the fitness function is $f(x_i) = \frac{1}{l} \sum_{i=1}^{l} (y_i - x_i)^2$

    In the formula, $x_i$ is the actual value of the $i$-th sample, $y_i$ is its predicted value, and $l$ is the number of test samples. Update $p_{id}$ and $p_{gd}$ according to the fitness values of the particles.
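The fitness function in Step 3 is simply a mean squared error; a one-line sketch:

```python
def fitness(actual, predicted):
    # f(x_i) = (1/l) * sum over the l test samples of (y_i - x_i)^2
    assert len(actual) == len(predicted)
    return sum((y - x) ** 2 for x, y in zip(actual, predicted)) / len(actual)
```
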

    Step 4: For each particle, compare the current fitness $f(x_i)$ with the fitness $f(p_i)$ of its best historical position: if $f(x_i) < f(p_i)$, set $p_i = x_i$ and adjust $\omega$ according to Eq. (7). Compare the current fitness $f(x_i)$ of all particles of the group with the fitness $f(p_g)$ of the best position of the group: if $f(x_i) < f(p_g)$, set $p_g = x_i$ and adjust $\omega$ according to Eq. (8).

    Step 5: For each particle $i = 1$ to $m$, update the velocity and position according to the improved PSO model $[v_i^{k+1}, x_i^{k+1}]$ to produce the new population $X(k+1)$.

    Step 6: Check whether each velocity vector satisfies the constraint $-v_{\max} \le v_i \le v_{\max}$, and if not, adjust it in accordance with Eq. (6).

    Step 7: Check whether the fitness value meets the requirements or the maximum number of iterations has been reached. If the stop condition is satisfied, optimization ends and the global optimal particle is mapped to the optimized LS-SVM model parameters; otherwise set $k = k + 1$ and go to Step 3.

    Step 8: Solve the LS-SVM using the training sample data and the parameters obtained in Step 7 to obtain the least squares solution of the matrix equation, i.e., the optimal parameters $\alpha_i$ and $b$ in Eq. (2).

    The APSO-LSSVM algorithm flow chart is as follows:

    Figure 1.  Algorithm flow chart.

    To verify the performance of APSO-LSSVM, three common benchmark functions are selected as test functions: Rosenbrock is a single-peak function with search space $[-100, 100]^D$, while Schwefel and Penalized are two-peak functions with search spaces $[-500, 500]^D$ and $[-50, 50]^D$, respectively. Figure 2 shows the convergence curves of the three test functions optimized by the different algorithms.

    Figure 2.  Convergence contrast diagrams of three test functions.

    It can be seen that APSO-LSSVM performs very well on all three functions. Because the algorithm adopts the adaptive strategy, it effectively maintains the diversity of the population while improving the population's ability to jump out of local optima and the accuracy of the solution.
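For reference, the unimodal Rosenbrock and multimodal Schwefel benchmarks used above can be written as follows (the Penalized function, which needs extra penalty terms, is omitted for brevity):

```python
import math

def rosenbrock(x):
    # Unimodal benchmark; global minimum 0 at (1, ..., 1)
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))

def schwefel(x):
    # Multimodal benchmark; minimum near x_i = 420.9687 inside [-500, 500]^D
    return 418.9829 * len(x) - sum(xi * math.sin(math.sqrt(abs(xi)))
                                   for xi in x)
```
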

    In 2013, coal samples were collected from a coalmine in Hebi, a spontaneous ignition test of the coal was carried out, and the sample data were collected. We forecast the development trend of coal spontaneous ignition by analyzing characteristic parameters such as the concentration, ratios and occurrence rate of the indicator gases produced during spontaneous ignition. The process of coal spontaneous combustion can generally be divided into three stages [2]: the preparation period, the spontaneous heating period and the burning period. To better predict the state of coal spontaneous combustion, the spontaneous heating period is further divided into three stages: the early, mid-term and later periods of spontaneous heating.

    The C-SVM, LS-SVM, PSO-LSSVM and APSO-LSSVM algorithms were used to predict the degree of danger. The experiments were programmed in MATLAB 2010 on a machine with a 2.19 GHz CPU and 2 GB of memory.

    To improve accuracy, the samples are first normalized to avoid the effect of singular data points on the performance of the support vector machine. The size of the particle swarm is set to 25, the solution space to 350 dimensions, the maximum number of iterations to 1000, and the acceleration factors to $c_1 = c_2 = 2$, together with an initial value of $\omega$. The regularization parameter is $\gamma$ = 1000, the width parameter of the radial basis function is $\sigma^2$ = 0.15, and 5 LS-SVM classifiers are established.
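The normalization step can be sketched as a per-feature min-max rescaling onto [0, 1] — a common choice, since the paper does not spell out the exact formula:

```python
def minmax_normalize(column):
    # Map one feature column linearly onto [0, 1] to limit the influence
    # of singular (outlying) values on the SVM.
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)          # constant feature: no information
    return [(v - lo) / (hi - lo) for v in column]
```
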

    Test data samples are evaluated with the prediction model obtained by iteration of the adaptive PSO; the standard support vector model C-SVM, the LS-SVM model and the LS-SVM model with standard PSO are also established, and their results are compared with those of the method proposed in this article. The C-SVM model adopts the radial basis function, and the inertia weight $\omega$ in the standard-PSO LS-SVM model is constant. The correlation curve of the training time is shown in Figure 3, with the number of training samples ranging over [50, 300]. The correlation curve of the testing time is shown in Figure 4, with the number of testing samples ranging over [50, 300]. The correlation curve of the prediction accuracy is shown in Figure 5, with the number of training samples ranging over [50, 300] and the number of testing samples fixed at 300.

    Figure 3.  Correlation curve of training time.
    Figure 4.  Correlation curve of testing time.
    Figure 5.  Correlation curve of the prediction of accuracy.

    The response curves show that, as the number of training samples increases, the training time of all four classification algorithms increases markedly; however, the training time of APSO-LSSVM is significantly shorter than that of C-SVM, LS-SVM and PSO-LSSVM, demonstrating that APSO-LSSVM adapts well to different sample sizes and test conditions and learns quickly. In terms of testing time, the processing time of all four algorithms increases linearly with the number of test samples, but that of APSO-LSSVM is clearly the shortest, showing its good real-time processing ability. Under the same conditions, the classification accuracy of all four algorithms also increases with the number of training samples, with APSO-LSSVM slightly more accurate than C-SVM, LS-SVM and PSO-LSSVM. The adaptive PSO algorithm thus attains higher accuracy during the iterative solution of the matrix equation.

    Second, to test how the performance and prediction accuracy of the four algorithms vary with the distribution of the samples, we selected coal samples from different coalmines for the spontaneous ignition test to obtain a second group of sample data, and established prediction models with different numbers of training samples. The experimental sample data are given in Table 1, and the measured training time, testing time and prediction accuracy are given in Table 2.

    Table 1.  Experimental sample data.
    Data set       Sample type       Class 1   Class 2   Class 3   Class 4   Class 5
    First group    Training sample   55        67        59        67        62
                   Test sample       33        48        31        43        45
    Second group   Training sample   35        40        36        31        36
                   Test sample       21        31        24        19        25

    Table 2.  Comparison of algorithm performance.
    Group          Algorithm     Training time (s)   Testing time (s)   Forecast accuracy (%)
    First group    C-SVM         0.282               0.325              82.38
                   LS-SVM        0.237               0.267              84.23
                   PSO-LSSVM     0.185               0.192              88.93
                   APSO-LSSVM    0.108               0.145              91.07
    Second group   C-SVM         0.188               0.253              83.57
                   LS-SVM        0.145               0.226              85.76
                   PSO-LSSVM     0.112               0.170              89.25
                   APSO-LSSVM    0.073               0.104              92.12


    The statistical results for the coal sample data from the two regions show that the training and testing times of APSO-LSSVM are significantly smaller than those of C-SVM, LS-SVM and PSO-LSSVM, demonstrating its competitive advantage on relatively complex problems and on problems demanding high real-time performance. The accuracy of APSO-LSSVM on the training and testing sets is slightly higher than that of the other three algorithms, and its error is comparatively small, indicating a better classification effect.

    Next, we consider the problem of time complexity. Let $C_h$ denote the complexity of training a classifier and $C_w$ that of updating a training sample; then the time complexities of C-SVM, LS-SVM, PSO-LSSVM and APSO-LSSVM can be approximated as $C_h O(k_{\max}) + C_w O(n_b k_{\max})$, $C_h O(k_{\max}) + C_w O(l k_{\max})$, $C_h O(N k_{\max}) + C_w O(l N k_{\max})$ and $C_h O(N k_{\max}) + C_w O(l k_{\max})$, respectively. The average training times of the four algorithms are compared in Figure 6.

    Figure 6.  Time cost comparison on different methods.

    To make the comparison objective and scientific, hypothesis testing is applied to the experimental results. Let the variables $X_1, X_2, X_3, X_4$ denote the classification error rates of the APSO-LSSVM, PSO-LSSVM, LS-SVM and C-SVM algorithms, respectively. Since the values of $X_1, X_2, X_3, X_4$ are subject to many random factors, we assume that they follow normal distributions, $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, 3, 4$. We now compare the means $\mu_i$, $i = 1, 2, 3, 4$: the smaller $\mu_i$ is, the lower the expected classification error rate and the higher the efficiency. Because the sample variance is an unbiased estimate of the population variance, the sample variance is used as the estimate of the population variance. In this experiment, the significance level $\alpha$ is set to 0.01.

    Table 3 shows the comparison of the $\mu_i$ and the related quantities. We can see that the expected classification error rate of APSO-LSSVM is significantly lower than those of the other algorithms.

    Table 3.  Hypothesis testing for experimental results.
    Hypothesis               H0: μ1 ≥ μ2 vs H1: μ1 < μ2             H0: μ1 ≥ μ3 vs H1: μ1 < μ3             H0: μ1 ≥ μ4 vs H1: μ1 < μ4
    Statistic                U1 = (X̄1 − X̄2)/√(σ1²/n1 + σ2²/n2)      U2 = (X̄1 − X̄3)/√(σ1²/n1 + σ3²/n3)      U3 = (X̄1 − X̄4)/√(σ1²/n1 + σ4²/n4)
    Rejection region         U1 ≤ −Zα = −2.325                      U2 ≤ −Zα = −2.325                      U3 ≤ −Zα = −2.325
    Value of the statistic   U1 = −52.58                            U2 = −105.64                           U3 = −124.68
    Conclusion               accept H1: μ1 < μ2                     accept H1: μ1 < μ3                     accept H1: μ1 < μ4

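The large-sample test statistic used in the hypothesis testing can be computed as follows; the numbers in the check below are synthetic (hypothetical error-rate means 0.09 vs 0.12 with equal variances), chosen only to illustrate a strongly negative statistic that rejects H0 at α = 0.01.

```python
import math

def u_statistic(mean1, var1, n1, mean2, var2, n2):
    # U = (X1bar - X2bar) / sqrt(s1^2/n1 + s2^2/n2), with sample variances
    # standing in for the (unknown) population variances.
    return (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)

def rejects_h0(u, z_alpha=2.325):
    # One-sided test of H0: mu1 >= mu2 against H1: mu1 < mu2 at alpha = 0.01:
    # reject H0 when U <= -z_alpha.
    return u <= -z_alpha
```
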

    To further verify the effectiveness of the algorithm, it is tested on the Wine and Iris datasets from the UCI database. The Wine dataset contains 3 classes of samples, 178 in total; 89 are selected as training samples and 89 as test samples. The Iris dataset contains 3 classes of samples, 150 in total; 75 are selected as training samples and 75 as test samples. To improve accuracy, the samples are first normalized to [0, 1]; the support vector machine is C-SVM with the Gaussian radial basis kernel function. The training process uses K-fold cross-validation to assess the accuracy on the judgment samples. The learning factors are $c_1 = c_2 = 2$, the number of particles is N = 40, the maximum number of iterations is M = 100, and the inertia weight is $\omega$ = 1.

    Table 4.  Data test results.
    Dataset   Parameter C   Parameter σ   Parameter-optimization time   Training sample accuracy   Test sample accuracy   Range of parameter C   Range of parameter σ
    Wine      68.94         0.01          20.36 s                       96.85%                     95.34%                 [0.1, 1000]            [0.01, 1000]
    Iris      100           0.01          13.32 s                       97.80%                     98.10%                 [0.1, 1000]            [0.01, 1000]

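The K-fold cross-validation used on the Wine and Iris data can be sketched with a generic index splitter (standard library only, not tied to any SVM implementation):

```python
import random

def kfold_indices(n, k, seed=0):
    # Shuffle the n sample indices, then split them into k (nearly) equal folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def kfold_splits(n, k, seed=0):
    # Yield (train, test) index lists: each fold serves as the test set once.
    folds = kfold_indices(n, k, seed)
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

Each split trains on k−1 folds and validates on the held-out fold, and the k validation accuracies are averaged.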

    The prediction of coal spontaneous combustion is a very complex problem, and many factors affect the prediction results. The adaptive-PSO-optimized LS-SVM algorithm proposed in this paper uses the LS-SVM to handle small samples, nonlinearity, high dimensionality and local minima, while the adaptive PSO algorithm addresses the high computational complexity and slow computation of the LS-SVM model on large-scale samples, so that the optimal solution can consistently be obtained and both training speed and accuracy are improved. In the coal spontaneous combustion experiments, the training and testing times of the proposed method are significantly smaller than those of the other examined methods, demonstrating that APSO-LSSVM has competitive advantages on relatively complex problems and on problems requiring high real-time performance. Its accuracy on the training and testing sets is slightly higher than that of the other examined methods, and its error is comparatively small, indicating a better classification effect.

    Building on this article, the following aspects remain to be studied. 1) How to solve the problem of imbalanced data in SVMs more accurately requires further research, as does the problem of SVM parameter selection: although PSO alleviates it to a certain extent, the parameters found are only optimal relative to the training set, and how to obtain the parameters theoretically is a current research direction. 2) PSO can fall into local optima, so the obtained parameters may be sub-optimal rather than globally optimal; how the particles can reach the global optimum more stably and efficiently will be studied as the next step. 3) The prediction in this paper is based only on the indicator gases and does not incorporate the surrounding environmental factors; to improve the model in practical applications, the next step will comprehensively consider other factors so that the model can give full play to its advantages in practice.

    This work was jointly supported by the National Natural Science Foundation of China (No. 61473299, No. 61876185) and the Fundamental Research Funds for the Central Universities (No. 2015QNB21).

    All authors declare no conflicts of interest in this paper.



    © 2019 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0).