Research article

Optimal voltage controls of distribution systems with OLTC and shunt capacitors by modified particle swarm optimization: A case study

  • This paper presents a new framework for determining the optimal voltage control of distribution systems based on modified particle swarm optimization. The problem is to determine the set-points of the existing regulation devices, such as on-load tap changers and shunt capacitors, that minimize a multi-objective function including power losses, voltage deviations and switching operations, subject to constraints on allowable voltage levels, switching stresses, line capacity, etc. Because of its large scale and high nonlinearity, the problem is formulated and solved by modified particle swarm optimization with trial, test and analysis techniques. In each iteration, a Newton-Raphson-based simulation is run to evaluate the performance of the regulation devices and of the distribution system as a whole. Convergence is guaranteed by defining neighborhood boundaries for each trial. The proposed method is applied in a practical case study of a 15-MVA, 22-kV, 48-bus distribution system in Vietnam. The simulation results show that the voltage profile can be improved significantly, with no bus voltage outside the boundaries, while the voltage deviation is reduced by as much as 56.5% compared with the conventional nominal setting. In the case study, the power loss is not improved much (1.21%).

    Citation: Minh Y Nguyen. Optimal voltage controls of distribution systems with OLTC and shunt capacitors by modified particle swarm optimization: A case study[J]. AIMS Energy, 2019, 7(6): 883-900. doi: 10.3934/energy.2019.6.883



    The extensive utilization of fossil fuels has given rise to profound energy crisis challenges and greenhouse gas emission issues. Consequently, energy conservation and emission reduction have emerged as topics of common concern for all countries in the present world [1]. Photovoltaic (PV) power generation is known as a clean, safe, and sustainable renewable energy source [2], which has received special attention from investors and researchers because of its low cost of use, long lifetime, no greenhouse gas emissions [3], easy accessibility, low maintenance difficulty, abundant resources, and fixed payback period [4]. However, the fluctuating and stochastic nature of solar irradiance has introduced complexities into short-term scheduling within power systems, thereby incurring substantial supplementary costs for power suppliers [5]. Therefore, accurate forecasting of PV power generation is extremely important and is of great benefit to both power suppliers and power systems. Power suppliers need accurate information about PV power generation in order to set up specialized commercial offers that reduce production costs and increase profits; power systems can mitigate the negative impacts caused by uncertainty in PV power generation and ensure stable and reliable operation [6].

    According to the classification criteria of reference [7], existing PV power generation prediction methods can be classified into four main categories: Physical models, statistical models, machine learning models and deep learning models.

    The most distinctive feature of the physical model is that it does not require any historical data [8]: it is based on solar irradiance and a complex set of mathematical equations describing the physical state of the PV system [4]. Under stable weather conditions, a physical model can achieve an acceptable level of predictive accuracy. However, its precision cannot be guaranteed when there are significant weather fluctuations. Several typical physical models are presented in the literature [9,10]. In addition, physical models can only produce meteorological data values after 6 hours, which limits their applicability for very short-term forecasting purposes [11].

    Statistical models use purely mathematical equations to create a mapping relationship between historical and target forecast data to predict future PV generations [12]. These are data-driven approaches with the advantages of easy modeling and inter-regional generalizability. As a result, various statistical methods have been widely used in recent years to forecast PV power generation, including regression methods (RM) [13], autoregressive moving average models (ARMA) [14] and its improved version seasonal autoregressive integrated moving average model (SARIMA) [15]. However, the volatility and non-periodicity of the PV power generation time series may undermine the prerequisites required for the application of statistical methods, such as a substantial amount of suitable and high-quality input data, as well as an appropriate input time range.

    Machine learning (ML) models, which have evolved from the foundation of statistical models, have been used in recent years in various fields of science and engineering, including the prediction of PV power generation. It follows the process of preparing data, training algorithms, generating models, and then making and refining predictions [16]. ML models frequently used in PV power generation prediction include artificial neural networks (ANN) [17], support vector machines (SVM) [18] and extreme learning machine (ELM) [19].

    The above models are shallow networks and are suitable for handling small-scale datasets. With the explosive growth of data volume, these methods are unable to mine effective features from massive data, so deep learning can be used to address this bottleneck. Deep learning (DL) models have been used in many research areas and have been successfully applied to PV power generation [20]. Deep learning is capable of extracting, transforming and modeling the intricate features inherent in the PV power generation time series, providing not only more effective but also superior prediction results than traditional models. LSTM, as one of the most important deep learning techniques, has been frequently applied in related work [2].

    Literature [21] compared LSTM with the persistence algorithm, the linear least squares regression algorithm and the back propagation algorithm using a dataset collected on the island of Santiago, Cape Verde, as input, and the experimental results showed that the LSTM algorithm performs better in terms of prediction accuracy and generalization ability. Although a single LSTM unit outperforms traditional physical and ML models, its prediction accuracy is still limited by its simple network structure. Literature [22] first uses the Bayesian optimization algorithm to optimize the hyperparameters of the deep neural network, which solves the problem of unidirectional data transfer in LSTM and achieves bidirectional propagation of historical and future information. However, the BiLSTM network is still essentially a simple RNN model: it can effectively extract useful historical information from the time series, but it has a weak feature extraction capability for the input data. Literature [23] uses CNN to extract the features of the influencing factors of the input data and uses BiLSTM for time series prediction. However, in order to achieve convolutions over longer time spans, an extremely deep network of convolutional layers is required. In addition, most of the above prediction models are based only on a large amount of data, ignoring real-world issues or physical implications, thereby potentially yielding irrational predictive outcomes.

    Therefore, based on the above studies, this paper proposes a PV prediction model based on dilated causal convolution network (DCCN) and stacked LSTM (SLSTM) in conjunction with the physical constraints. The main contributions of our work can be summarized as follows:

    ● The introduction of the basic characteristics of PV power plants as physical constraints ensures the rapidity of subsequent model training and the reasonableness of model output;

    ● Employing temporal convolutional network for feature extraction can fully exploit the spatial features of PV historical data and improve the prediction accuracy of the subsequent model;

    ● The stacked network structure is used for training. Compared with the original LSTM model, the superiority of SLSTM is that it can more fully learn the temporal features in the PV sequences, thus elevating the prediction accuracy of the model.

    The rest of the paper is organized as follows: Section 2 gives details of the theoretical background of the proposed prediction model; Section 3 presents the experimental results and discussion, comparing the designed model against various others; Section 4 summarizes the work of this paper and provides an outlook for future work.

    The overall framework of the proposed dilated causal convolution network-stacked LSTM model (DSLSTM) is shown in Figure 1. The structure of the method consists of a dilated causal convolutional network for feature extraction and a SLSTM layer. It consists of three main steps: Data preprocessing, training of the DL network and PV power prediction.

    Figure 1.  The proposed DSLSTM model. The PV forecasting task was divided into three steps: Input pre-processing, model training, and output prediction.

    In this subsection, physical constraints are initially extracted from the domain knowledge and physical laws of PV and then integrated into the construction of the DSLSTM model. The aim is to overcome a limitation of DL algorithms, which often yield predictions based solely on extensive data, potentially leading to results that do not adhere to the physical laws of the real world. This includes scenarios such as negative power generation during daytime hours (05:00 to 19:00) and positive power generation during nighttime hours (19:00 to 05:00), which do not align with the actual physical constraints.

    Specifically, there are two types of constraints in the structure of DSLSTM (denoted as Cons1 and Cons2 in Figure 1). The first constraint, known as the data cropping module, is designed based on physical knowledge or general knowledge of PVs [17]. Its purpose is to eliminate physically unreasonable predictions during the training and testing processes by cropping the data. For instance, it addresses the occurrence of positive power generation during the nighttime hours of PVs (from 19:00 on the first day to 05:00 on the following day). The original dataset contained data points at 15-minute intervals, totaling 96 points per day. Given that the photovoltaic power generation remains at 0 before 5:00 in the morning and after 19:00 in the evening, data for the training of the model was only selected from the period 05:00 to 19:00 each day. Consequently, the data scale reduced from 35,040 data points per year to 20,440 data points per year, representing a decrease of 41.7%. This approach not only prevents occurrences of positive power generation during nighttime but also reduces the model training time.

    The second constraint, called the data filtering module, is used to limit the output of the model to a reasonable range during training and testing. It is designed based on knowledge of the natural science of PV to eliminate physically unreasonable predictions such as negative power generation. According to the laws of physics, the generation value of PV cannot be negative in reality, so the output of the model should be non-negative. Therefore, the predicted output of DSLSTM, y_t^p, should be constrained by Eq (1):

    y_t^p = ReLU(f(x_1, x_2, ..., x_n)), (1)

    where ReLU(·) represents the rectified linear unit function: when its input is less than 0, the ReLU function returns 0; when the input is greater than 0, the original value of the input is returned. x_i represents the i-th input and f(·) represents the neural network.

    The inclusion of the above two constraints ensures the rapidity of the DSLSTM model training and the reasonableness of the output.
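As an illustration, the two constraints can be sketched in a few lines (a minimal sketch, not the authors' code; the `model_out` array is a hypothetical stand-in for raw network outputs):

```python
import datetime as dt
import numpy as np

# Cons1 -- data cropping: keep only the 05:00-19:00 samples of each day.
# At 15-minute resolution a day has 96 points; cropping leaves 56.
one_day = [dt.datetime(2017, 1, 1) + dt.timedelta(minutes=15 * i) for i in range(96)]
daytime = [t for t in one_day if 5 <= t.hour < 19]    # 05:00 ... 18:45

# Cons2 -- data filtering: clamp predictions to non-negative values,
# i.e., Eq (1): y_t^p = ReLU(f(x_1, ..., x_n)).
model_out = np.array([3.2, -0.7, 0.0, 1.5])           # hypothetical raw outputs
y_pred = np.maximum(model_out, 0.0)                   # ReLU
```

With this cropping, a year shrinks from 96 × 365 = 35,040 to 56 × 365 = 20,440 points, the 41.7% reduction reported above.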

    The principle of causal convolution is that the current state cannot contain future information. In other words, the output at time step t + 1 is only correlated with the previous time steps, i.e., t, t-1, ..., t-n (as illustrated in Figure 3, where n = 14 in this paper). This approach effectively prevents the issue of information leakage caused by regular convolution.

    Figure 2.  The architecture of dilation convolution.
    Figure 3.  The architecture of proposed DCCN.

    A significant drawback of causal convolution lies in its requirement for extremely deep networks or exceedingly large filters to achieve convolutions over longer time spans, neither of which is entirely practical. Therefore, the dilation factor, denoted as 'd', is introduced to enlarge the receptive field and admit a broader range of input information.

    A simple causal convolution can only recall a history that is linearly related to the depth of the network, which makes it difficult to apply to tasks that require longer time spans. In this paper, dilation convolution is used to enlarge the receptive field. Dilation convolution is equivalent to introducing a fixed-length interval between two neighboring elements in the convolution kernel of ordinary convolution. When the dilation factor is equal to 1, dilation convolution reduces to ordinary convolution; a larger dilation factor yields a larger receptive field. The structure of the dilated causal convolution is shown in Figure 2.
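To make the causality and receptive-field arithmetic concrete, here is a minimal single-channel NumPy sketch (illustrative, not the paper's implementation): a kernel of size k with dilation d sees (k − 1)·d + 1 past inputs, and left-padding keeps the output at step t independent of every input after t.

```python
import numpy as np

def causal_dilated_conv(x, w, d):
    """1-D causal convolution with dilation d: y[t] = sum_j w[j] * x[t - j*d]."""
    k = len(w)
    pad = (k - 1) * d                       # left padding only -> causal
    xp = np.concatenate([np.zeros(pad), np.asarray(x, float)])
    return np.array([sum(w[j] * xp[pad + t - j * d] for j in range(k))
                     for t in range(len(x))])

# An impulse input reveals the receptive field: with k = 3, d = 2 the
# output at t depends on x[t], x[t-2] and x[t-4] -- never on future samples.
impulse = np.zeros(8)
impulse[0] = 1.0
response = causal_dilated_conv(impulse, [1.0, 1.0, 1.0], d=2)
print(response)   # -> [1. 0. 1. 0. 1. 0. 0. 0.]
```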

    With the increase of convolutional layers, there is a risk of losing crucial feature information, so the residual connection that can realize the transfer of data across layers is introduced. At the same time, the residual connection can also effectively alleviate the gradient disappearance or gradient explosion problem that exists in the deep network. The structure of the feature extraction module proposed in this paper is shown in Figure 3, which consists of two parts: The left section is the dilation causal convolution, while the right section employs a 1 × 1 convolution for residual connections. The 1 × 1 convolution ensures matching tensor shapes when elements are added together [24]. The formula for the residual connection is shown in Eq (2):

    DCCN(x)=DCC(x)+Conv(x), (2)

    where x denotes the input, Conv(x) denotes the output of the 1 × 1 convolution, DCC(x) denotes the output of the dilation causal convolution and DCCN(x) denotes the output of the dilated causal convolutional network.
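Eq (2) can be sketched compactly in PyTorch (an illustrative block, not the authors' code; the channel counts and the lag-15 input window are assumptions for the example):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualDCC(nn.Module):
    """Dilated causal convolution with a 1x1-convolution skip path, as in Eq (2)."""
    def __init__(self, c_in, c_out, kernel=3, dilation=2):
        super().__init__()
        self.left_pad = (kernel - 1) * dilation           # causal (left-only) padding
        self.dcc = nn.Conv1d(c_in, c_out, kernel, dilation=dilation)
        self.skip = nn.Conv1d(c_in, c_out, 1)             # 1x1 conv matches tensor shapes

    def forward(self, x):                                 # x: (batch, c_in, time)
        y = self.dcc(F.pad(x, (self.left_pad, 0)))        # DCC(x)
        return y + self.skip(x)                           # DCC(x) + Conv(x)

block = ResidualDCC(c_in=1, c_out=16)
out = block(torch.randn(8, 1, 15))                        # e.g., lag-15 input windows
print(out.shape)                                          # (8, 16, 15)
```

The left-only padding is what distinguishes this from an ordinary `padding=same` convolution: the output length is preserved without leaking future samples into the past.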

    Different DL architectures, such as RNNs and deep neural networks (DNNs), have been used in many application areas and have produced results superior to most data modeling techniques. A distinctive aspect separating RNNs from DNNs is that RNNs generate not only outputs but also hidden states. These hidden states, subsequently coupled with the input data at the next timestamp, are employed to adapt the network weights, thereby giving rise to deep learning architectures. The hidden states have the capacity to retain prior information and deploy it in subsequent training phases [24].

    RNN is a powerful and robust neural network that uses existing time series data to predict future data for a specific length of time. In RNN, the output of the previous timestamp as well as the internal state of the current timestamp will be fed into the RNN unit. Consequently, the current output of the model depends not only on the current inputs, but also bears the influence of the previous states. Therefore, RNN is very promising in processing the historical PV data characterized by sequential characteristics.

    Although RNN can effectively extract temporal information from sequential data, it encounters vanishing and exploding gradients during training on lengthy sequences. To overcome these limitations, Hochreiter and Schmidhuber [25] proposed the LSTM architecture. LSTM adds a forget gate, an input gate, an update gate and an output gate to the RNN cell. As the names suggest, the forget gate determines what proportion of the long-term memory is forgotten, the input gate determines what proportion of the current input is written into the cell state, the update gate is used to update the cell state and the output gate produces the output at the current moment. These four components of the LSTM work and interact with each other in a specific way. The internal structure of the LSTM is illustrated in Figure 4.

    Figure 4.  The unfold structure of the LSTM unit.

    The forget gate determines the proportion by which the previous timestep cell state, serving as long-term memory, is to be forgotten. The forget gate consists of the hidden state of the previous moment and the input of the current moment, which is finally obtained by the activation function. The process of calculating the forget gate ft is shown in Eq (3):

    f_t = σ(W_{if} x_t + W_{hf} h_{t-1} + b_f), (3)

    where x_t is the input at the current moment, W_{if} is the weight of the current input, h_{t-1} denotes the hidden state functioning as short-term memory from the previous timestep, W_{hf} represents the weight of the hidden state at the previous moment, and b_f is the bias of the forget gate. The symbol σ denotes the activation function, whose outputs lie in the range [0, 1], where 0 means complete forgetting and 1 means complete retention.

    The input gate determines the proportion of short-term information at the current moment that can be updated into long-term memory; the process of calculating the input gate i_t is described in Eq (4):

    i_t = σ(W_{ii} x_t + W_{hi} h_{t-1} + b_i), (4)

    where W_{ii}, W_{hi} and b_i stand for the weight matrix of the current input of the input gate, the weight matrix of the hidden state at the previous moment and the bias of the input gate, respectively.

    The update gate is used to control the update of the candidate cell state; the candidate cell state g_t is obtained by the tanh activation function, which takes values in the range [–1, 1]. The calculation is shown in Eq (5):

    g_t = tanh(W_{ig} x_t + W_{hg} h_{t-1} + b_g), (5)

    where W_{ig}, W_{hg} and b_g, respectively, denote the weight matrix of the current input of the candidate cell state, the weight matrix of the hidden state at the previous moment and the bias of the candidate cell state.

    The cell state at the current moment is jointly determined by the forget gate and the input gate. The calculation is shown in Eq (6):

    c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t, (6)

    where ⊙ denotes elementwise (Hadamard) multiplication.

    The output gate is used to control the output of the cell state and transfer that state to the next cell. The process of calculating the output gate o_t is shown in Eq (7):

    o_t = σ(W_{io} x_t + W_{ho} h_{t-1} + b_o), (7)

    where W_{io} represents the weight matrix of the output gate, W_{ho} is the weight of the hidden state at the previous moment and b_o stands for the bias of the output gate.

    Upon calculating the forget gate, input gate, update gate, output gate and candidate cell state, LSTM will calculate the output as well as update the hidden state with the following formula:

    y_t = o_t ⊙ tanh(c_t), (8)
    h_t = y_t. (9)
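For concreteness, Eqs (3)–(9) for a single timestep can be written out directly (a NumPy sketch with randomly initialized weights, not the trained model; the parameter dictionary layout is an assumption for the example):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM timestep following Eqs (3)-(9)."""
    f_t = sigmoid(p["W_if"] @ x_t + p["W_hf"] @ h_prev + p["b_f"])   # forget gate, Eq (3)
    i_t = sigmoid(p["W_ii"] @ x_t + p["W_hi"] @ h_prev + p["b_i"])   # input gate, Eq (4)
    g_t = np.tanh(p["W_ig"] @ x_t + p["W_hg"] @ h_prev + p["b_g"])   # candidate state, Eq (5)
    o_t = sigmoid(p["W_io"] @ x_t + p["W_ho"] @ h_prev + p["b_o"])   # output gate, Eq (7)
    c_t = f_t * c_prev + i_t * g_t                                   # Eq (6)
    y_t = o_t * np.tanh(c_t)                                         # Eq (8)
    return y_t, c_t                                                  # h_t = y_t, Eq (9)

rng = np.random.default_rng(0)
n_in, n_hid = 1, 4
p = {f"W_i{g}": rng.normal(size=(n_hid, n_in)) for g in "fiog"}
p.update({f"W_h{g}": rng.normal(size=(n_hid, n_hid)) for g in "fiog"})
p.update({f"b_{g}": np.zeros(n_hid) for g in "fiog"})
h, c = lstm_step(np.array([0.5]), np.zeros(n_hid), np.zeros(n_hid), p)
```

Since o_t lies in (0, 1) and tanh(c_t) in (−1, 1), the hidden state components are always bounded in magnitude by 1.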

    Deep network architectures have demonstrated strong capabilities in dealing with complex nonlinear feature representations [26]. Research indicates that although a single LSTM unit can solve the gradient vanishing and explosion problems in RNN models, its prediction accuracy is still limited by the simple network structure [27]. Therefore, by increasing the stacking depth of the LSTMs, the features of the input sequences can be better learnt, consequently improving the network performance.

    The structure of SLSTM is shown in Figure 5. It consists of multiple LSTM layers, where the input of the first LSTM layer is the original data and the output of the last LSTM layer is the prediction result. The inputs of the other intermediate LSTM layers come from the outputs of their previous LSTM layer, and the outputs are used as inputs to the latter LSTM layer. As with the original LSTM, the hidden states and cell states in the stacked LSTM are also passed to the next moment. The difference is that the dimension is increased in this stacked structure.

    Figure 5.  The architecture of SLSTM.

    While stacking multiple LSTM layers significantly enhances the network's capacity to learn from long sequences, excessive layer stacking should be avoided. An increase in the number of layers can lead to slower model update iterations, reduced convergence effectiveness and exponential growth in temporal and memory costs during training. This can make the model susceptible to issues such as gradient vanishing. Therefore, in this paper we choose to adopt a stacked LSTM module consisting of three LSTM layers.
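In PyTorch, such a three-layer stack can be expressed directly via the `num_layers` argument (a sketch using the hidden size of 64 and dropout of 0.2 listed in Table 1; the single-step linear head is an assumed detail, not the authors' exact architecture):

```python
import torch
import torch.nn as nn

class SLSTM(nn.Module):
    """Three stacked LSTM layers followed by a linear head for one-step prediction."""
    def __init__(self, n_features=1, hidden=64, layers=3, dropout=0.2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers,
                            dropout=dropout, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, time, n_features)
        out, _ = self.lstm(x)              # layer l's output sequence feeds layer l+1
        return self.head(out[:, -1, :])    # predict from the last timestep

model = SLSTM()
y = model(torch.randn(8, 15, 1))           # batch of lag-15 input windows
print(y.shape)                             # (8, 1)
```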

    The PV data utilized in this study is from Gaoyou, Jiangsu, with the PV power plant positioned at 32 degrees, 58 minutes, 31 seconds north latitude and 119 degrees, 36 minutes, eight seconds east longitude, with an installed capacity of 10 MW. The data used for model training and validation spanned from January 1, 2017, 00:00:00 to December 31, 2017, 23:45:00, with a temporal resolution of 15 minutes and an 8:2 split ratio for the training and testing sets. As anticipated, solar irradiance is generally higher between 11:00 and 14:00, corresponding to elevated PV power generation during daylight hours. Comparatively, power generation is notably lower during the early morning hours (05:00 to 11:00) and the afternoon (14:00 to 19:00). Notably, power generation remains at zero during the night (19:00 to 05:00 the following day) due to the absence of solar irradiance.

    Recent studies have used multivariate datasets consisting of meteorological data or other environmental variables to improve the performance of prediction models [2,5,15,16]. However, in many cases, such as in short-term studies, the addition of these variables has little effect on the accuracy of the predictions due to their small variation over a short period of time (i.e., 15 minutes) [28]. Nonetheless, additional variables complicate the model and slow down the training process. Consequently, this study exclusively considers historical PV generation data as the model input to validate the superiority of the proposed DSLSTM model.
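With only the historical PV series as input, supervised samples are typically built by sliding a fixed-length window over the sequence. The sketch below is generic (the `make_windows` helper is hypothetical), with the lag of 15 matching the input size in Table 1:

```python
import numpy as np

def make_windows(series, lag=15):
    """Turn a 1-D series into (samples, lag) inputs and next-step targets."""
    series = np.asarray(series, float)
    X = np.stack([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    return X, y

series = np.arange(100.0)        # stand-in for the normalized PV series
X, y = make_windows(series)
print(X.shape, y.shape)          # (85, 15) (85,)
```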

    The normalization of data can eliminate the effect of magnitude and dimensions, and overcome the problem of overflow of individual data during the training process. Common normalization techniques are max-min normalization, mean normalization and Z-Score normalization. Considering that the PV generation data is all positive, this study uses max-min normalization that scales the PV data between [0, 1]. The formulation for this normalization process is presented as Eq (10):

    x' = (x − x_min) / (x_max − x_min), (10)

    where x and x' are the original and normalized values of the PV data, respectively; x_min denotes the minimum value of the PV data and x_max denotes the maximum value of the PV data.
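Eq (10) is a one-liner in practice (a minimal sketch; it assumes the series is not constant, so the denominator is non-zero):

```python
import numpy as np

def minmax_normalize(x):
    """Max-min normalization, Eq (10): scales the data into [0, 1]."""
    x = np.asarray(x, float)
    return (x - x.min()) / (x.max() - x.min())

scaled = minmax_normalize([0.0, 2.5, 5.0, 10.0])
print(scaled)   # [0.   0.25 0.5  1.  ]
```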

    In this paper, we employ four performance evaluation metrics to assess the predictive outcomes: Coefficient of determination (R2), mean absolute error (MAE), mean squared error (MSE) and root mean squared error (RMSE) [18,21,24]. These metrics are used to quantify the accuracy and performance of the predictive model. The MAE value represents the average magnitude of prediction errors, quantifying the average absolute difference between predicted values and actual values; the MSE reflects the average Euclidean distance between the predicted and actual values, and these two metrics are often used to gauge the overall performance of predictive models. RMSE is a widely used method in evaluating prediction errors, which defines the degree of error that exists between the prediction and the actual result, and is usually more sensitive to large deviations between measurements and predictions. Smaller values of MAE, MSE and RMSE indicate better predictive performance. The R2 reflects the correlation between inputs and outputs and is frequently used to assess the fit quality of regression models. In regression models, the closer the sum of squared residuals is to zero, the closer the R2 value is to 1, indicating higher precision in the model's predictions. It is worth noting that due to scenarios where both actual and predicted values are zero within this study, the mean absolute percentage error (MAPE) and symmetric mean absolute percentage error (SMAPE) were not adopted as evaluation metrics. Below are the specific formulas for calculating these four metrics used to measure predictive model performance:

    MAE(y_A, y_P) = (1/n) Σ_{i=1}^{n} |y_{A,i} − y_{P,i}|, (11)
    MSE(y_A, y_P) = (1/n) Σ_{i=1}^{n} (y_{A,i} − y_{P,i})², (12)
    RMSE(y_A, y_P) = √[ (1/n) Σ_{i=1}^{n} (y_{A,i} − y_{P,i})² ], (13)
    R²(y_A, y_P) = 1 − SS_res(y_A, y_P) / SS_tot(y_A, y_P), (14)

    where ȳ_A and ȳ_P denote the average value of the actual output and the average value of the predicted output, respectively; y_{A,i} and y_{P,i} denote the true value and the predicted value at moment i, respectively; n denotes the length of the training samples; SS_res stands for the sum of squared residuals and SS_tot represents the total sum of squares of the real data, which are expressed in the following formulas:

    SS_res(y_A, y_P) = Σ_{i=1}^{n} (y_{A,i} − y_{P,i})², (15)
    SS_tot(y_A, y_P) = Σ_{i=1}^{n} (y_{A,i} − ȳ_A)². (16)
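Eqs (11)–(16) translate directly into code (a NumPy sketch; scikit-learn provides equivalent functions):

```python
import numpy as np

def mae(y_a, y_p):  return np.mean(np.abs(y_a - y_p))                 # Eq (11)
def mse(y_a, y_p):  return np.mean((y_a - y_p) ** 2)                  # Eq (12)
def rmse(y_a, y_p): return np.sqrt(mse(y_a, y_p))                     # Eq (13)

def r2(y_a, y_p):                                                     # Eq (14)
    ss_res = np.sum((y_a - y_p) ** 2)                                 # Eq (15)
    ss_tot = np.sum((y_a - np.mean(y_a)) ** 2)                        # Eq (16)
    return 1.0 - ss_res / ss_tot

y_a = np.array([1.0, 2.0, 3.0, 4.0])
y_p = np.array([1.0, 2.0, 3.0, 5.0])
print(mae(y_a, y_p), mse(y_a, y_p), rmse(y_a, y_p), r2(y_a, y_p))
# MAE = 0.25, MSE = 0.25, RMSE = 0.5, R2 = 1 - 1/5 = 0.8
```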

    In the following section, a case study will be conducted to validate the feasibility and effectiveness of the proposed methodology using real-world PV datasets.

    All experiments were conducted on a desktop computing workstation running on an Intel(R) Core(TM) i5-9500F central processing unit (CPU) @ 3.00GHz, NVIDIA GeForce GTX 1060 GPU, 16GB DDR4 RAM, and the operating system is Windows 10 Professional. The network proposed in this paper is built under Python 3.10.9, Pytorch 1.12.1. The Adam optimizer with weight decay set to 0.0001 and a learning rate of 0.0001 is used for optimization and iterative training of the networks through backpropagation.
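The stated training configuration corresponds to only a few lines of PyTorch (a sketch; `nn.Linear` stands in for the actual DSLSTM network and the dummy data is illustrative):

```python
import torch
import torch.nn as nn

model = nn.Linear(15, 1)      # placeholder for the DSLSTM network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
loss_fn = nn.MSELoss()        # networks are trained via backpropagation

# One illustrative update step on dummy data:
x, y = torch.randn(8, 15), torch.randn(8, 1)
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```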

    The choice of the model's hyperparameters is essential for its correct training. The main hyperparameters of the model are set as shown in Table 1.

    Table 1.  Hyperparameters of the proposed DSLSTM model.
    Hyperparameter    Value
    Input size (lag) 15 (225 min)
    Batch size 8
    Kernel size of Conv1d 1 × 3
    Dilation step of Conv1d 2
    Layers of Conv1d 3
    Layers of LSTM 3
    Dropout rate of LSTM 0.2
    Hidden size of LSTM 64


    Theoretically, the more LSTM hidden layers there are, the stronger the model's ability to fit the prediction data. However, as the number of layers increases, the neural network structure becomes more and more complex, training takes longer and longer, overfitting becomes likely and the generalization ability deteriorates. In this paper, we compare the prediction performance when the number of hidden layers is one through seven; to reduce the influence of chance, we run ten experiments for each configuration and take the average of the best two as the final result. The results are shown in Figure 6.

    Figure 6.  Prediction error and training time for different layers of LSTMs.

    In Figure 6, the single-layer LSTM's predictions deviate from the actual values, although its training time is small; the two-, three-, four- and five-layer LSTMs show significant improvement, with the three-layer LSTM achieving the smallest prediction error; for six or seven layers, the training time increases significantly with the number of layers. A possible reason is that the increasingly complex network structure begins to overfit, so the generalization ability deteriorates. Taken together, the three-layer LSTM model is optimal for prediction.

    To validate the effectiveness of DSLSTM, the proposed predictive model is compared with seven benchmarks: MLP, RNN, GRU, LSTM, SGRU, SLSTM and DSGRU. To strictly control the variables, the same dataset is used as input for all models; to ensure the accuracy of the experimental results and avoid chance effects, each model is run several times on the test set and the average is taken as its prediction result. The performance of the various prediction algorithms is shown in Table 2 below.

    Table 2.  The numerical metrics of the proposed DSLSTM and benchmark models.

    Model     MAE     MSE      RMSE    R2
    ------    -----   ------   -----   -----
    MLP       4.531   66.575   8.159   0.898
    RNN       4.297   65.321   8.082   0.899
    GRU       4.414   65.948   8.121   0.901
    LSTM      4.154   63.264   7.954   0.903
    SGRU      4.263   63.482   7.967   0.903
    SLSTM     4.145   62.287   7.892   0.905
    DSGRU     3.872   62.400   7.899   0.905
    DSLSTM    3.728   59.094   7.687   0.910

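The metrics in Table 2 are the standard regression measures. A self-contained sketch of their definitions, with a small synthetic example (not data from the paper):

```python
def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R2 for a pair of sequences."""
    n = len(y_true)
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = mse ** 0.5
    mean_y = sum(y_true) / n
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)  # total variance
    ss_res = sum(e * e for e in errors)                # residual variance
    r2 = 1 - ss_res / ss_tot
    return mae, mse, rmse, r2

# Synthetic observed vs. predicted PV outputs
y_true = [10.0, 12.0, 15.0, 14.0]
y_pred = [11.0, 11.5, 14.0, 14.5]
mae, mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(mae, mse)  # 0.75 0.625
```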

    First and foremost, concerning individual models, it is evident from Table 2 that, compared to MLP, RNNs demonstrate superior adaptability and learning capability in time-series prediction tasks due to their capacity to retain previous information and incorporate it into the current output computation. Moreover, Table 2 shows that although GRU simplifies the LSTM computations and reduces the number of parameters, it controls the data flow less effectively and therefore performs worse than LSTM, especially on sizable datasets; this is corroborated by the comparative results between LSTM and GRU.

    Second, we can infer that stacked models outperform individual models. Single models are constrained by their simple network structures; stacking adds depth, which enhances the learning of distinctive features within the input sequences and thereby improves network performance. This is supported by the MSE of SLSTM being 1.5% lower than that of LSTM, and that of SGRU being 3.7% lower than that of GRU.

    Furthermore, composite models perform significantly better than stacked models. As Figure 7 shows, the proposed DSLSTM and DSGRU models achieve lower MAE, MSE and RMSE and a higher R-squared (R2). Specifically, compared to the stacked models without the dilated causal convolution network, DSLSTM reduces MAE, MSE and RMSE by 10, 5.13 and 2.6%, respectively, while increasing R2 by 0.55%; similarly, DSGRU reduces MAE, MSE and RMSE by 9.17, 1.7 and 0.86%, respectively, with a corresponding 0.19% increase in R2. Figure 8 illustrates that the prediction models incorporating the dilated causal convolution network perform better, primarily owing to the network's ability to capture holistic feature information from the historical data, thereby facilitating more accurate PV output predictions. This highlights the significance of feature extraction capability.
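The percentage improvements quoted above follow directly from the Table 2 values; for example, DSLSTM versus SLSTM:

```python
def relative_reduction(baseline, improved):
    """Percentage reduction of an error metric relative to the baseline."""
    return 100.0 * (baseline - improved) / baseline

# Table 2: SLSTM (baseline) vs DSLSTM (proposed)
print(round(relative_reduction(4.145, 3.728), 1))   # MAE:  10.1 (quoted as 10%)
print(round(relative_reduction(62.287, 59.094), 2)) # MSE:  5.13
print(round(relative_reduction(7.892, 7.687), 1))   # RMSE: 2.6
```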

    Figure 7.  The performance metrics of the proposed DSLSTM and benchmark models on the real dataset.
    Figure 8.  The forecasting results of the proposed DSLSTM and benchmark models on the real dataset.

    Finally, compared to DSGRU, DSLSTM reduces the MAE, MSE and RMSE metrics by 3.7, 5.3 and 2.7%, respectively, and increases R2 by 0.6%. Figure 9 also shows that the prediction deviation of the DSLSTM model is smaller than that of DSGRU, underscoring the ability of the proposed model to offer more precise and reliable PV predictions and thus its promising prospects for practical application.

    Figure 9.  Scatter plots of observed and forecasted PV power generation using DSLSTM and DSGRU.

    Naturally, the DSLSTM network proposed in this paper demands more training time. In practical applications, however, the focus is on prediction (testing) time, while training can be completed offline during idle periods.

    The prediction of PV power generation is extremely important to the development of the entire PV industry. This article presents an innovative deep learning-based framework to address the short-term prediction challenges inherent in PV power generation. Through experimental simulations and analysis, the following conclusions have been drawn:

    1) Introducing the fundamental physical constraints of PV power plants speeds up subsequent model training and keeps the model's prediction output physically reasonable.

    2) For large datasets, using the DCCN for feature extraction fully exploits the relevant features of the PV historical data, thereby enhancing the prediction accuracy of the model.

    3) Training with an SLSTM model offers an advantage over the conventional LSTM model: its deeper network architecture more comprehensively captures the patterns of variation within the solar sequence, thus enhancing the predictive accuracy of the model.

    Comparative analysis with various alternative models makes it evident that the proposed DSLSTM model outperforms them on all performance metrics. These results indicate that the proposed DSLSTM model possesses excellent overall performance and substantial feasibility for practical applications.

    In future work, we anticipate an in-depth investigation of various decomposition algorithms to improve the accuracy of short-term PV power forecasting. In addition, transfer learning will be considered to enhance the practicality of the model in response to the insufficient amount of data from PV power plants.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    The authors declare there is no conflict of interest.



  • © 2019 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)