
A novel pessimistic multigranulation roughness by soft relations over dual universe

  • A multigranulation rough set over two universes delivers a unique perspective on the combination of multigranulation information. This paper presents a pessimistic multigranulation rough set over dual universes based on soft binary relations. Firstly, a new pessimistic multigranulation rough set over dual universes based on two soft binary relations is developed, and its properties are derived. Then we extend this idea and present pessimistic multigranulation roughness over dual universes based on a finite number of soft binary relations. Finally, we present an example to illustrate our proposed multigranulation rough set model.

    Citation: Jamalud Din, Muhammad Shabir, Samir Brahim Belhaouari. A novel pessimistic multigranulation roughness by soft relations over dual universe[J]. AIMS Mathematics, 2023, 8(4): 7881-7898. doi: 10.3934/math.2023397




    The tendency toward clean power production continues to surge because of its beneficial impact on the environment. Thus, countries are embracing renewable energy by establishing supportive regulations to foster its development [1]. Solar energy is an essential renewable energy source, and its use has increased rapidly in recent years. However, one of the drawbacks of using solar electricity is its intermittent nature, which makes accurate forecasting of solar power output challenging [2,3]. Hence, reliable solar forecasting is essential for efficient solar energy integration into power networks. Solar forecasting improves power grid management, energy trading decisions, and power system planning and operation [4]. Accurate solar forecasting also encourages optimal solar energy consumption, which is critical for the global transition to a sustainable energy system. As a result, much research has been carried out to produce dependable and accurate solar forecasting models. Hence, this study focuses on forecasting Global Horizontal Irradiance (GHI).

    GHI values are greatly influenced by the weather parameters, such as humidity, pressure, air temperature, wind speed, and cloud cover. The primary determinants of these variables are the site's geographic location and climate. In addition, four primary categories are taken into account while determining the GHI forecasting horizon [5]: Ultra-short-term forecasting (1 second to < 1 hour), short-term forecasting (1−24 hours), medium-term forecasting (1 week−1 month), and long-term forecasting (1 month−1 year). The goals and applications of GHI forecasting may vary depending on the stakeholders involved and the time horizon of interest. The ultra-short-term prediction has gained constant attention in energy-based real-time applications. For instance, the goal of a 5min GHI forecast might be to enable real-time control of power generation and consumption in a microgrid or a building [6]. A 15min GHI forecast, on the other hand, might be helpful in energy trading and market participation [7]. Power generators and retailers could use the forecast to optimize their bidding strategies in a day-ahead or intraday electricity market. A 30min solar forecast could be helpful in scheduling energy resources and optimizing energy management in a building, a microgrid, or a community. In contrast, a 60min GHI forecast could be helpful for long-term energy planning and grid integration [8]. Hence, this research paper builds GHI models considering different forecasting horizons.

    In addition, regarding GHI forecasting methods, statistical techniques, machine learning algorithms, and physical models are just a few examples of forecasting algorithms that can be utilized [8]. Statistical techniques, which are frequently used in the energy sector, are divided into (i) machine learning (ML) algorithms, such as support vector regression (SVR) and artificial neural networks (ANNs), and (ii) time series models, such as autoregressive, moving average, exponential smoothing, and autoregressive moving average (ARMA) models. Physical models use mathematical equations to model the physical processes influencing GHI output [9]; an example of a physical model is numerical weather prediction (NWP). Each of these algorithms has advantages and disadvantages, and the choice of method is determined by aspects such as data availability, forecasting horizon, and the desired level of accuracy. K. Omer [10], for instance, examines the performance of the particle swarm optimization (PSO) algorithm, ANNs, and bagged tree (BT) methods in forecasting seasonal solar irradiance. Data from 2007 to 2020, encompassing variables like air temperature, precipitation, snow mass, air density, and cloud cover fraction, are used to predict solar irradiance. The findings indicate that the BT method exhibited the most favorable statistical accuracy. Specifically, the BT model showcased superior performance, revealing a coefficient of determination (R2) of 0.992, root mean square error (RMSE) of 0.00339, and mean absolute error (MAE) of 0.0199. Solano et al. [11] explored the use of ML models, namely SVR, extreme gradient boosting (XGBT), categorical boosting (CatBoost), and voting-average (VOA), for solar radiation forecasting in Brazil using input parameters such as dry bulb temperature, relative humidity, wind speed, atmospheric pressure, and time of day. Results revealed that VOA outperformed the other models in terms of accuracy, with RMSE in the winter and summer of 0.2417 and 0.2877, respectively. Lee et al. [12] also presented ensemble learning-based solar irradiance forecasting models using weather data. They used boosted trees (BT), random forest (RF), and generalized RF, and compared their performance in short-term prediction of solar irradiance with Gaussian process regression and SVR. Results indicated that ensemble approaches led to reliable forecasting outcomes for all the considered locations.

    The paper in [13] presents a method for predicting hourly GHI using extraterrestrial radiation alongside limited weather forecast data. The study compared the performance of various prediction models: the backpropagation (BP) network, the support vector machine (SVM), and the light gradient boosting machine (LightGBM). The LightGBM model demonstrated superior performance, with the lowest RMSE in the testing set of 126.1 W/m2. Moreover, the study explored the influence of weather types on the prediction outcomes. The analysis revealed that weather patterns were not the primary influencers of the LightGBM model's prediction outcomes. Interestingly, the model's accuracy remained largely unaffected even after excluding weather predictors, where the RMSE was found to be 135.2 W/m2. Furthermore, the difficulties of projecting the power generation of distributed, small-scale solar PV systems at various horizons and resolutions were examined in [14]. The authors presented and assessed several forecasting methodologies, such as PSO-based prediction combinations and base forecasters. The assessment procedure compared how well the forecasting techniques work when trained on varying datasets and tested in different environments and periods. The findings demonstrated that forecast combinations, especially at high resolutions and short horizons, can enhance the performance of forecasting models for solar PV power output. The forecasting models were assessed using the median absolute scaled error (MASE). The results demonstrated that the proposed PSO-based forecast combination approach performed better than the base forecasters and other benchmark models at all resolutions and horizons, with a 3.81% reduction in MASE.

    However, these meteorological variables may not be available due to the high cost of weather monitoring devices. This presents a significant barrier to generating an accurate forecasting model, particularly for regions with limited financial resources. Therefore, one of the significant research gaps in GHI forecasting is the possibility of relying solely on lag observations of historical GHI data without integrating any weather or environmental variables. This method is also known as persistence forecasting or naive forecasting. It assumes that the GHI's future reading will be identical to its recent historical output, without accounting for any external influences that may affect the output. In comparison to more complex algorithms that combine weather and environmental data, the use of lag observations alone for GHI forecasting has received very little attention in the literature despite its simplicity. This approach, however, may offer potential advantages in terms of computational speed and ease of implementation, particularly for short-term solar forecasting applications where the influence of external factors may be negligible. As a result, this research aims to evaluate the possibility of forecasting GHI future observations using only lag observations.
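    To make the persistence baseline concrete, it can be expressed in a few lines of code. The sketch below is purely illustrative (the function and array names are hypothetical, and Python is used here only for exposition):

        import numpy as np

        def persistence_forecast(ghi_history, horizon_steps):
            # Naive/persistence forecast: repeat the most recent GHI reading
            # over the whole horizon, ignoring all external influences.
            return np.full(horizon_steps, ghi_history[-1])

        # Example: with 5min data, forecast the next 15min (3 steps).
        history = np.array([480.0, 495.0, 502.0])   # past GHI readings, W/m2
        print(persistence_forecast(history, 3))     # -> [502. 502. 502.]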

    Furthermore, machine learning models continue to confront difficulties when processing large amounts of input data, frequently facing complications such as vanishing or exploding gradients [8,15]. The rapid expansion of artificial intelligence approaches has resulted in a continued emphasis on deep learning (DL), which is known for its excellent performance in tasks such as image recognition [16,17] and machine translation [18,19]. To address these inherent issues, deep learning has been adopted for solar prediction. Deep learning models surpass conventional machine learning models in terms of accuracy due to their greater feature learning capacity and ability to handle large datasets. S. Tajjour et al. [20] conducted a study focused on short-term solar irradiation forecasting utilizing DL models. Employing eleven years of NASA satellite data, they evaluated the effectiveness of three specific deep learning models: the multilayer perceptron (MLP), long short-term memory (LSTM), and the gated recurrent unit (GRU). The results indicated that all three models exhibited comparable accuracy levels, with a mean square error (MSE) near 0.017 kWh/m2/day. Despite containing more layers, the GRU model demonstrated higher training speed compared to LSTM. The MLP model emerged as the most efficient, attributed to its fewer parameters (49,281) when contrasted with GRU (1,025,793). In addition, M. Elizabeth et al. [21] presented a novel multistep CNN-stacked LSTM model designed for short-term solar irradiance prediction. Through comparisons with CNN and LSTM models, their proposed approach demonstrated superior performance among contemporary DL models. Moreover, they benchmarked the proposed method against traditional ML techniques like linear regression (LR), SVR, and ANN using the same dataset. In forecasting solar irradiance, their framework yielded the lowest errors, achieving an RMSE of 0.36 W/m2 and an R2 of 0.98.

    Moreover, the study in [22] presented a hybrid CNN-categorical boosting (CNN-CatBoost) model for predicting solar radiation. They used extra-atmospheric solar radiation and three weather variables (temperature, humidity, and total cloud volume) to predict solar radiation. The study compared the performance of boosting models (XGBoost and CatBoost) and recurrent neural network (RNN) models (LSTM and GRU). The results indicated that the hybrid CNN-CatBoost model provided accurate predictions of solar radiation, reducing MAE values from 0.1104 to 0.1027. V. Sansine et al. [23] also utilized a hybrid deep learning model that combined CNN and LSTM algorithms (CNN-LSTM) for predicting solar irradiance. Additionally, the study compared the performance of the hybrid model with other stand-alone models, including ANN, CNN, and LSTM. The results showed that the CNN-LSTM hybrid model outperformed the other models, with the best statistical error results for probabilistic forecasting. For test data, the CNN-LSTM model achieved an RMSE of 91.73 W/m2 and an MAE of 60.46 W/m2, with an R2 of 87%. The authors in [24] also provided a short-term PV forecasting model using the variational autoencoder (VAE). They used data from two different locations (a 243 kW parking lot installation in the US and a PV system in Algeria with a total capacity of 9 MW). For comparison purposes, they compared the VAE with seven DL methods, namely the RNN, LSTM, bidirectional LSTM, the convolutional LSTM network, gated recurrent units, the stacked autoencoder, and the restricted Boltzmann machine, and two well-known ML methods, namely LR and SVR. The findings showed that the DL techniques outperformed the other ML techniques, while the VAE consistently beat the other techniques.

    For time series forecasting, particularly GHI forecasting, the CNN has grown in popularity. Unfortunately, few research efforts have so far concentrated on improving the CNN for GHI forecasting. One of the primary disadvantages of using CNNs is their sensitivity to certain parameters. The CNN architecture is made up of stacked convolutional filters that learn and extract local patterns from sequential data, making it well suited to capturing temporal dependencies. The accuracy and dependability of GHI forecasts can be considerably impacted by the choice of CNN architecture, which entails selecting the number of convolution layers, the number of filters, and the learning rate. Increasing the number of convolution layers assists the model in learning complicated connections in the data but also raises the possibility of overfitting [25]. Similarly, optimizing the learning rate is critical for effective model convergence since it regulates the step size in weight updates during training [26]. The combination of these parameters is crucial; insufficient convolution layers or incorrectly set learning rates can impair the model's capacity to capture the temporal complexities of GHI data, resulting in suboptimal predictions. Therefore, since finding the optimal design of the CNN is fundamental for achieving accurate and reliable GHI forecasts and this aspect has received little attention in the literature, the current research identifies the CNN model architectures that yield the best forecasting results.

    Based on the discussion above, the primary objective of this research work is to offer a CNN-based framework for estimating the GHI. The framework consists of several steps: Data collection and preprocessing, data partitioning, CNN model architecture design, model training, model testing, and model deployment. This framework creates a model that accurately and reliably predicts the GHI output. The main differences between this study and other published works in the literature are as follows:

    ● Optimal selection of CNN architecture: This study considers the best CNN architecture for GHI forecasting using solely historical GHI data. This is significant because the choice of CNN's architecture significantly impacts performance. To identify a suitable design, the CNN is tested under various combinations of layers, filters per layer, and learning rates.

    ● Use of past data of GHI only: In this study, we only used past GHI data as input to the CNN model for forecasting. This differs from many other works that use both weather data and GHI data for forecasting. This approach is practical when weather data are unavailable or unreliable.

    ● Comparison with other forecasting algorithms: In this study, the effectiveness of the proposed CNN model is compared to that of several well-known forecasting methods, including the RNN, ANN, RF, and SVR. This comparison sheds light on how different algorithms compare in terms of forecasting GHI.

    ● Forecasting horizon: In this study, we concentrated on forecasting GHI over various time horizons, including 5, 15, and 30min. This is important because the accuracy of the forecasting algorithms may vary depending on the forecasting horizon.

    The structure of this study is as follows: Section 2 provides a comprehensive discussion of the problem statement, framework, CNN algorithm, and data preparation techniques utilized. Section 3 focuses on the sensitivity analysis employed. Sections 4 and 5 present the key findings and provide a thorough discussion of the study results. In Section 6, a potential real-world application of the proposed CNN-based forecasting model is explored, while Section 7 highlights the study's conclusions.

    This section includes the problem statement, a thorough explanation of the research framework, and an overview of the CNN algorithm.

    The significance of solar energy as a renewable energy source has grown, necessitating accurate projections of GHI for effective energy management. Precise estimation of the GHI can help utilities and grid operators balance the supply and demand of energy, optimize energy storage, and reduce costs associated with energy imbalance. However, forecasting GHI is challenging when meteorological data are unavailable or unreliable. This leaves an open opportunity for further research into the idea of relying purely on lag observations of past GHI data without incorporating any weather or environmental variables. In addition, traditional forecasting models, such as statistical models, have limitations in capturing the non-linear relationships between the input variables and the GHI observation. The CNN algorithm has recently shown promise in forecasting GHI. However, there is still a need for research to investigate the effectiveness of CNN-based models in GHI forecasting and to compare their performance with other forecasting models. Additionally, research is required to determine how various data sources, model architectures, and hyperparameters affect the precision and dependability of GHI forecasts. By filling in these knowledge gaps, GHI forecasting may be made more accurate and reliable, and more effective energy management tactics can be supported.

    The methodology for forecasting GHI with the CNN over different forecasting horizons, using only lag observations of GHI, is shown in Figure 1 and described below:

    Figure 1.  Framework of the developed GHI forecasting models.

    Step 1: Data Collection: The first step is to collect the historical data of GHI. The data should be collected at a high temporal resolution, such as every 5min. The data should cover a sufficiently long period to include seasonal patterns.

    Step 2: Data Preprocessing: The collected data should be preprocessed before feeding it to the CNN and other forecasting algorithms. The preprocessing steps include data cleaning and normalization and splitting the data into training, validation, and testing sets. In this study, we only use the lag observations of GHI, meaning that the model only uses past GHI values as inputs.

    Step 3: Forecasting Horizon Analysis: In this study, we evaluate the performance of the CNN model with different forecasting horizons. We generate forecasts for 5, 15, and 30min ahead. The performance metrics are calculated for each forecasting horizon, and the results are compared to identify the best forecasting horizon.

    Step 4: CNN Model Design: The CNN model is intended to capture temporal dependencies in GHI data. The model is made up of several convolutional layers followed by a fully connected layer. The number of convolutional layers, the number of filters in each layer, and the activation functions are all hyperparameters that should be tuned.
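    As a rough illustration of Step 4, the sketch below builds such a model in Keras. This is an assumed, minimal equivalent for exposition only: the study itself is implemented in MATLAB, and the layer count, filter count, kernel size, and learning rate here are placeholders for the hyperparameters tuned later.

        import tensorflow as tf
        from tensorflow.keras import layers

        def build_cnn(n_lags, horizon_steps, n_conv_layers=2, n_filters=32, learning_rate=1e-3):
            # 1D CNN mapping a window of past GHI readings to future readings.
            model = tf.keras.Sequential()
            model.add(tf.keras.Input(shape=(n_lags, 1)))          # one GHI channel
            for _ in range(n_conv_layers):
                model.add(layers.Conv1D(n_filters, kernel_size=2,
                                        padding="same", activation="relu"))
            model.add(layers.Flatten())
            model.add(layers.Dense(horizon_steps))                # one output per forecast step
            model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate), loss="mse")
            return model

        model = build_cnn(n_lags=6, horizon_steps=3)  # e.g., 30min of lags -> 15min ahead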

    Step 5: Model Training: The designed CNN model is trained on the training data set. During training, the model's weights are modified using an optimization technique such as Adam. When the validation loss stops improving, the training process ends.

    Step 6: Model Evaluation: The trained model is evaluated on the testing dataset to compare the CNN model's performance with other forecasting algorithms. The evaluation metrics used include the coefficient of determination (R2), root mean square error (RMSE), normalized root mean square error (nRMSE), mean absolute error (MAE), normalized mean absolute error (nMAE), and mean absolute percentage error (MAPE).

    Step 7: Implementation: The CNN model is implemented in a programming environment. In this study, we used the MATLAB environment to build the CNN model.

    The CNN stands as a fundamental DL algorithm that has significantly advanced the field of computer vision and image processing [27]. CNNs are specifically designed to process and analyze visual data, which renders them ideal for applications such as image recognition, object detection, and image classification [28]. One of the advantages of CNNs lies in their capacity to automatically learn hierarchical representations of features from raw data [28]. This is achieved through the use of specialized layers, including convolutional layers, pooling layers, and fully connected layers (see Figure 2). The convolutional layers play a fundamental role in feature extraction by applying adaptable filters or kernels to the input data [29]. These filters are convolved with the input to detect patterns, edges, and textures, enabling the network to capture meaningful visual information. In contrast, pooling layers execute downsampling operations on the feature maps derived from convolutional layers, reducing spatial dimensions while retaining crucial features [30]. Popular pooling techniques like max pooling and average pooling assist in minimizing computational complexity and prevent overfitting. Finally, the fully connected layers process the extracted features to perform classification or regression tasks, allowing the network to learn complex relationships in the data [31].

    Figure 2.  Architecture diagram of the CNN.

    The initial step involves feeding the input data into the input layer to initiate the process of feature transformation. Subsequently, the convolutional and pooling layers extract relevant features from the input data. These extracted details are then amalgamated through the fully connected layers. Finally, the output layer communicates the result of the feature extraction process. Each convolutional layer is specifically geared toward extracting spatial patterns from the input variables correlated with the target variable, GHI. This process is expressed as follows [22]:

    $y_{i,j}^{k} = f\left((W^{k} \times h)_{i,j} + b^{k}\right)$, (1)

    where $f$ is the specified activation function, $W^{k}$ represents the kernel weights, $h$ is the input feature map, $b^{k}$ is the bias term, and $\times$ refers to the convolution operator.
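    For concreteness, the snippet below evaluates Eq (1) for a single kernel with a ReLU activation; the variable names mirror the symbols above, and the values are hypothetical:

        import numpy as np

        def conv1d_feature_map(h, W_k, b_k):
            # y^k_j = f((W^k x h)_j + b^k) with f = ReLU, valid (no-padding) convolution.
            n_out = h.size - W_k.size + 1
            z = np.array([np.dot(W_k, h[i:i + W_k.size]) for i in range(n_out)]) + b_k
            return np.maximum(z, 0.0)   # ReLU activation f

        h = np.array([0.2, 0.5, 0.9, 0.7])   # normalized GHI lags (input)
        W_k = np.array([1.0, -1.0])          # kernel weights
        print(conv1d_feature_map(h, W_k, 0.1))   # -> [0. 0. 0.3]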

    Data cleaning is an essential step in developing a successful forecasting model. Solar datasets should be cleaned and filtered before being fed into the forecasting models. In GHI forecasting, the night hours are removed from the database, and only the readings that occur between sunrise and sunset are kept. To accomplish this, a solar elevation-based pre-processing operation is carried out, because data near sunset and dawn are frequently incorrect. Hence, solar radiation data are excluded for solar elevations less than 10° [32]. Furthermore, normalizing the input data is necessary before examining the forecasting models' performance. The objective here is to mitigate the likelihood that features with substantial numerical values outweigh those with comparatively lower numerical values. Equation (2) is used to normalize the input data between 0 and 1.

    $x_{ni} = \dfrac{x_i - x_{min}}{x_{max} - x_{min}}$, (2)

    where $x_i$ is the measured GHI value; $x_{ni}$ is the normalized GHI, while $x_{max}$ and $x_{min}$ are the highest and lowest values of the measured GHI in the input dataset, respectively.
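    A minimal sketch of this preprocessing, combining the 10° solar-elevation filter with the min-max scaling of Eq (2), is shown below (function and variable names are illustrative):

        import numpy as np

        def preprocess_ghi(ghi, solar_elevation, min_elevation_deg=10.0):
            # Drop night and low-sun samples, then min-max normalize GHI to [0, 1] (Eq (2)).
            keep = solar_elevation >= min_elevation_deg   # discard data near sunrise/sunset
            x = ghi[keep]
            x_min, x_max = x.min(), x.max()
            x_norm = (x - x_min) / (x_max - x_min)
            return x_norm, x_min, x_max                   # keep extrema to invert the scaling

        def denormalize(x_norm, x_min, x_max):
            return x_norm * (x_max - x_min) + x_min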

    The precision and effectiveness of the forecasting techniques are assessed using the following statistical indicators: R2, RMSE, nRMSE, MAE, nMAE, and MAPE. These metrics reflect the degree to which the measured values agree with the GHI values generated by the forecasting models. The formulas in Eqs (3)−(8) define these metrics [33−35].

    $R^{2} = 1 - \dfrac{\sum_{i=1}^{n}(y_i - f_i)^2}{\sum_{i=1}^{n}(y_i - \tilde{y})^2}$ (3)
    $RMSE = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}(y_i - f_i)^2}$ (4)
    $nRMSE = \dfrac{1}{y_{i,max}}\sqrt{\dfrac{1}{n}\sum_{i=1}^{n}(y_i - f_i)^2}$ (5)
    $MAE = \dfrac{1}{n}\sum_{i=1}^{n}|y_i - f_i|$ (6)
    $nMAE = \dfrac{1}{n\,y_{i,max}}\sum_{i=1}^{n}|y_i - f_i|$ (7)
    $MAPE = \dfrac{1}{n}\sum_{i=1}^{n}\dfrac{|y_i - f_i|}{y_i}.$ (8)

    In the above equations, $n$ represents the size of the testing dataset; $y_i$ denotes the measured value of the GHI; $y_{i,max}$ corresponds to the highest value within the testing dataset, while $f_i$ represents the forecasted value produced by the forecasting models. The mean of the measured GHI values $y_i$ is represented by $\tilde{y}$. In regression problems, a model's R2 indicates how well it fits a set of observations [36]. The MAE, known as the mean absolute value of the residuals (forecasting errors), measures the average magnitude of errors [37]. On the other hand, the RMSE quantifies the divergence between actual GHI readings and forecasted values by considering their squared differences, while the MAPE is frequently used to express the forecasting model's accuracy in percentage form [38].
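    The six metrics of Eqs (3)−(8) translate directly into code. The sketch below follows the definitions above (names are illustrative, and the normalized metrics are returned in percent, as reported in the tables):

        import numpy as np

        def forecast_metrics(y, f):
            # y: measured GHI, f: forecasted GHI (equal-length arrays, night hours removed).
            err = y - f
            rmse = np.sqrt(np.mean(err ** 2))
            mae = np.mean(np.abs(err))
            return {
                "R2":    1.0 - np.sum(err ** 2) / np.sum((y - y.mean()) ** 2),  # Eq (3)
                "RMSE":  rmse,                                                  # Eq (4)
                "nRMSE": 100.0 * rmse / y.max(),                                # Eq (5)
                "MAE":   mae,                                                   # Eq (6)
                "nMAE":  100.0 * mae / y.max(),                                 # Eq (7)
                "MAPE":  100.0 * np.mean(np.abs(err) / y),                      # Eq (8), y > 0
            }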

    Solcast is a company that offers solar irradiance data worldwide [39], which researchers and the public can freely access via its website (https://solcast.com/). Various meteorological variables can be acquired over a number of time intervals (5, 30, and 60 minutes), including GHI, diffuse horizontal irradiance (DIF), direct normal irradiance (DNI), air temperature, solar zenith angle, solar azimuth angle, cloud opacity (a percentage ranging from 0% to 100%, completely cloudy), pressure, wind speed, and wind direction. The solar data are collected at Riyadh, Saudi Arabia, at latitude 24.90689°N and longitude 46.39721°E (see Figure 3). The GHI data are gathered at 5min intervals for the period between Jan 1st, 2022, and Dec 31st, 2022. The maximum GHI reading was recorded on May 15th, 2022, at 10:45 A.M., with a value of 1076 W/m2, while the average of the GHI readings in 2022 was 506.75 W/m2.

    Figure 3.  Solar map of Saudi Arabia and the study site [40].

    In this section, a sensitivity analysis is conducted to examine the influence of the different lengths of the dataset, the resolution of data, and the seasonal variation of solar radiation on the future forecasting output of the GHI readings.

    Most previous studies used at least one year of data for hour-ahead solar radiation forecasting. This amount of data is helpful in training the forecasting model, yet it requires a long time to generate the ultimate GHI forecasting model, which could hinder its applicability in real-world applications. Hence, this study investigates different lengths of datasets, including 1 day, 1 week, 1 month, 2 months, and 3 months, with the goal of generating high-accuracy models in a shorter time. A thorough grasp of the temporal dynamics and patterns present in solar irradiance data is made possible by investigating several temporal spans. For instance, shorter datasets—such as those covering one day or one week—offer information on short-term patterns and instantaneous fluctuations, which are essential for comprehending the quick changes in GHI brought on by variations in the weather. Longer datasets, on the other hand, covering 1, 2, or 3 months, reflect seasonal patterns, long-term climate impacts, and possible cyclic patterns that affect solar irradiance. Hence, analyzing these different dataset lengths allows the model to learn from and adapt to a variety of temporal variables, making more resilient and flexible forecasting models possible. In this study, different combinations of historical observations of GHI were selected as the input features, as follows:

    - 5-min: Previous 5min of GHI readings

    - 15-min: Previous 15min of GHI readings at 5-minute intervals

    - 30-min: Previous 30min of GHI readings at 5-minute intervals

    - 45-min: Previous 45min of GHI readings at 5-minute intervals

    - 60-min: Previous 60min of GHI readings at 5-minute intervals

    In terms of training dataset volume, each of the above combinations of historical observations of GHI was trained using historical data of 1 day, 1 week, 1 month, 2 months, and 3 months. A comparison study was conducted in this research work to determine the optimum training dataset and feature set.

    Most of the previous studies that focus on short-term forecasts of GHI use 1-hour intervals [41]. The available data from Solcast are at 5-minute resolution, enabling exploration of the effect of shorter data resolutions on the accuracy of hour-ahead GHI forecasting. Therefore, this study investigates the accuracy of forecasting GHI values at the 5min, 15min, and 30min horizons. To accomplish this, the lag observations of GHI mentioned in Subsection 3.1 are used to create the multistep forecasting models. For the 15min and 30min forecasting horizons, only the 1-, 2-, and 3-month datasets are used, as they led to the best forecasting models in the 5min case (see Section 4).

    With all forecasting horizons, the training and testing datasets are divided using the sliding window approach. In the sliding window technique, for example, the lag of 30min at 5min intervals (the window size) is employed as the input, and the future 15min at 5min intervals (the forecast horizon) is used as the output variable (see Figure 4). The sliding window technique casts the forecasting task as a supervised learning problem for the CNN. In addition, different ML algorithms are compared with the CNN using the same set of input features (see Algorithm 1).

    Figure 4.  Sliding window approach with different input features.
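    A minimal sliding-window splitter consistent with Figure 4 is sketched below (names are hypothetical); window=6 lags of 5min GHI with horizon=3 future steps reproduces the 30min-lag, 15min-ahead example above:

        import numpy as np

        def sliding_windows(series, window, horizon):
            # Turn a GHI series into supervised (X, Y) pairs: X holds the past
            # `window` readings, Y the next `horizon` readings (5min resolution).
            X, Y = [], []
            for i in range(len(series) - window - horizon + 1):
                X.append(series[i:i + window])
                Y.append(series[i + window:i + window + horizon])
            return np.asarray(X), np.asarray(Y)

        ghi = np.arange(12, dtype=float)            # stand-in for a 5min GHI series
        X, Y = sliding_windows(ghi, window=6, horizon=3)
        print(X.shape, Y.shape)                     # -> (4, 6) (4, 3)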

    Algorithm 1 - Training and Testing Phase
    ————Training Phase————
    1: Set the length of data: LD = 1 day, 1 week, 1, 2, and 3 months
    2: Set the lag observations of GHI: Lag = 5min, 15min, 30min, 45min, and 60min
    3: Load the data: M1
    4: Load the output day: D = 288 × 1
    5: Apply the sliding window technique to divide M1 and D using LD and Lag
    6: Mark MR as the training dataset
    7: Mark MS as the testing dataset
    8: Mark MV as the validation dataset
    9: Split the target T into TR, TS, and TV for training, testing, and validation
    10: Normalize MR, MS, and MV
    11: For each algorithm R, do:
    12:   Train R using MR as input and TR as output
    13:   Validate R using MV as input and TV as output
    14:   Save the trained model TM
    15: End
    ————Testing Phase————
    16: Load R, MS, and TS
    17: For each trained model TM, do:
    18:   Test TM using MS as input
    19:   Save the estimated output PGHI
    20:   Compare PGHI and TS and save the results
    21: End
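    Read as code, Algorithm 1 reduces to the following skeleton. This is a hedged sketch: the split ratios, the model interface (scikit-learn style fit/predict), and the helper functions (the sliding_windows and forecast_metrics sketches above) are assumptions, and the actual study runs in MATLAB:

        def run_experiment(ghi_series, algorithms, window, horizon):
            # ghi_series: normalized GHI; algorithms: dict of name -> regressor object.
            X, Y = sliding_windows(ghi_series, window, horizon)
            n = len(X)
            i_tr, i_va = int(0.7 * n), int(0.85 * n)        # assumed 70/15/15 split
            results = {}
            for name, model in algorithms.items():
                model.fit(X[:i_tr], Y[:i_tr])               # training phase (MR, TR)
                _ = model.predict(X[i_tr:i_va])             # validation phase (MV, TV)
                p_ghi = model.predict(X[i_va:])             # testing phase (MS -> PGHI)
                results[name] = forecast_metrics(Y[i_va:].ravel(), p_ghi.ravel())
            return results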

    In the literature, most studies divide the yearly data into 80% for training and 20% for testing to develop the forecasting model. Such testing data does not necessarily reflect all the seasonal variations during the year, so the generated model may not generalize. Hence, the impact of seasonal change must be investigated to examine the performance of a forecasting algorithm.

    Analyzing seasonal variations in GHI values across a range of meteorological scenarios is crucial for determining the reliability of a forecasting model. These various weather scenarios illustrate the changing pattern of solar irradiance throughout the year and depict a range of meteorological circumstances that are common across seasons. It is essential to comprehend how the model reacts to and predicts GHI in various weather conditions and seasons in order to verify the model's generalizability and dependability. This study, therefore, explored the performance of the CNN algorithm with different seasonal changes in GHI observations across varied weather conditions, including rainy, cloudy, partially cloudy, partially sunny, and sunny days. In this study, therefore, a total of 25 independent models were generated for each type of day. Each day was examined with 5 different volumes of dataset (1 day, 1 week, 1 month, 2 months, and 3 months) in which there were 5 different combinations of historical observations of GHI.

    Many studies have used weather or day-type categorization to forecast GHI, aiming to organize vast datasets characterized by significant fluctuations [42,43]. Most of these studies divided the type of day according to the general meteorological conditions. In this paper, nevertheless, the seasonal variation was captured by classifying days into five groups based on the incident solar radiation (W/m2). Equation (9) determines the type of day using the ratio (Rday) that compares the daily measured GHI to the daily clear-sky GHI data, which are collected from CAMS [44]. After obtaining the value of Rday, the type of day was classified based on the Rday ranges shown in Table 1 [45].

    $R_{day} = \dfrac{\text{Daily measured GHI}}{\text{Daily clear sky GHI}} \times 100\%.$ (9)
    Table 1.  Classification of day type based on measured and clear sky GHI.
    Day type            Range of measured GHI to clear sky GHI
    Sunny               Rday > 90%
    Partially Sunny     70% < Rday ≤ 90%
    Partially Cloudy    50% < Rday ≤ 70%
    Cloudy              30% < Rday ≤ 50%
    Rainy               Rday ≤ 30%
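    Equation (9) and Table 1 transcribe directly into a small classifier (names are illustrative):

        def classify_day(daily_measured_ghi, daily_clear_sky_ghi):
            # R_day = daily measured GHI / daily clear-sky GHI, in percent (Eq (9)).
            r_day = 100.0 * daily_measured_ghi / daily_clear_sky_ghi
            if r_day > 90:
                return "Sunny"
            elif r_day > 70:
                return "Partially Sunny"
            elif r_day > 50:
                return "Partially Cloudy"
            elif r_day > 30:
                return "Cloudy"
            return "Rainy"

        print(classify_day(6.1, 7.0))   # R_day ~ 87% -> "Partially Sunny"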


    This section compares the CNN forecasting models based on a number of error metrics to assess how well they performed in estimating the GHI output. The results of the CNN models for the different forecasting horizons (5min, 15min, and 30min) are listed in Tables 3 and 7. Figures 5, 11, and 12 display graphical representations of the five selected days for the 5min-ahead models and for the multistep forecasts within the 15min and 30min forecasting horizons, respectively.

    This section discusses the results of the hour-ahead forecasting of the GHI based on 5min. This section covers the following topics: Choosing the optimal feature set, optimizing hyperparameters, comparing the proposed CNN with other widely used forecasting algorithms, predicting outcomes, and examining the execution time of the proposed CNN model.

    Variations in the number of lag observations could significantly affect the accuracy of the GHI forecasting in the future. Furthermore, the amount of trained data may result in accurate prediction and faster generation of the subsequent GHI reading, which is essential for real-time applications. Therefore, for every type of day, 25 independent models were created. Every day was analyzed using five distinct dataset volumes—1 day, 1 week, 1 month, 2 months, and 3 months—each containing five possible combinations of GHI's historical observations—lag 5min, 15min, 30min, 45min, and 60min, each at 5min intervals. The evaluation herein is accomplished with the initial hyperparameters displayed in Table 2.

    Table 2.  Initial hyperparameters used with 5, 15, and 30min.
    Name Configuration/Value
    Input Feature Training Set GHIt-1, GHIt-3, GHIt-6, GHIt-9, GHIt-12
    Volume of Dataset Previous 1 day, week, month, 2–3 months
    Number of ConvLayers 3
    Number of Filters in Each ConvLayer 100
    Learning Rate 0.001
    Epochs 100
    Optimizer Adam


    Table 3 lists the statistical error results of each type of day with different combinations and volumes of historical datasets. It can be observed from Table 3 that 2 months of data with the 5min lag observation gives the best forecasting performance for each type of day. This indicates that seasonal trends and the long-term climatic effects of solar irradiance can be reflected in the 2 months of training data. Furthermore, the preceding 5min of data provides insights into short-term trends and immediate variations in solar irradiation. According to the statistical error measurements shown in Table 3, the average value of the R2 over all the days is 0.9999, while the RMSE and MAE are found to be 2.714 W/m2 and 2.249 W/m2, respectively. In addition, 1 week and 1 month of data with the previous 5min of GHI measurements could lead to satisfactory forecasting results, where R2, RMSE, and MAE are 0.999, 2.997 W/m2, and 2.372 W/m2, respectively, for 1 week and 0.999, 4.903 W/m2, and 4.617 W/m2, respectively, for 1 month. On the other hand, 1 day of training data performs poorly regardless of the type of day and the lag combination used.

    Table 3.  Statistical results of the CNN model for 5min forecast. RMSE and MAE (W/m2); nRMSE, nMAE, and MAPE (%).
    1 Day 1 Week
    Day Type R2 RMSE nRMSE MAE nMAE MAPE R2 RMSE nRMSE MAE nMAE MAPE
    5 min Rainy 0.991 12.126 2.411 10.433 2.074 14.124 0.999 3.232 0.643 2.428 0.483 4.745
    Cloudy 0.993 27.048 2.794 18.759 1.938 36.895 1.000 3.517 0.363 2.687 0.278 3.627
    Partially Cloudy 0.997 18.216 1.971 13.533 1.465 19.835 1.000 3.786 0.410 3.510 0.380 2.152
    Partially Sunny 0.997 16.194 1.738 10.001 1.073 19.106 1.000 2.225 0.239 1.590 0.171 2.251
    Sunny 0.998 16.508 1.556 12.470 1.175 16.846 1.000 2.225 0.210 1.647 0.155 2.450
    Average 0.995 18.018 2.094 13.039 1.545 21.361 1.000 2.997 0.373 2.372 0.293 3.045
    15 min Rainy 0.984 16.517 3.284 12.925 2.570 14.246 0.997 7.230 1.437 5.378 1.069 5.718
    Cloudy 0.976 48.991 5.061 38.327 3.959 37.743 0.992 27.926 2.885 17.790 1.838 8.821
    Partially Cloudy 0.991 29.501 3.193 19.974 2.162 21.300 0.996 20.252 2.192 12.264 1.327 5.388
    Partially Sunny 0.996 19.324 2.073 13.896 1.491 20.092 0.999 9.618 1.032 7.624 0.818 4.068
    Sunny 0.997 18.878 1.779 13.902 1.310 17.990 1.000 7.040 0.664 6.231 0.587 3.106
    Average 0.989 26.642 3.078 19.804 2.298 22.274 0.997 14.413 1.642 9.857 1.128 5.420
    30 min Rainy 0.970 22.585 4.490 16.326 3.246 17.293 0.994 10.216 2.031 8.067 1.604 8.554
    Cloudy 0.934 81.516 8.421 60.322 6.232 46.864 0.991 29.764 3.075 20.495 2.117 14.148
    Partially Cloudy 0.979 45.330 4.906 30.559 3.307 23.051 0.993 26.500 2.868 17.484 1.892 7.980
    Partially Sunny 0.992 27.583 2.960 22.792 2.446 24.775 0.998 12.574 1.349 10.745 1.153 6.166
    Sunny 0.991 33.338 3.142 26.222 2.471 20.780 0.999 12.348 1.164 10.537 0.993 6.082
    Average 0.973 42.070 4.784 31.244 3.540 26.552 0.995 18.280 2.097 13.466 1.552 8.586
    45 min Rainy 0.935 33.144 6.589 26.322 5.233 23.155 0.994 10.311 2.050 8.415 1.673 9.419
    Cloudy 0.925 86.526 8.939 66.568 6.877 55.132 0.992 28.544 2.949 20.754 2.144 15.588
    Partially Cloudy 0.958 63.838 6.909 46.198 5.000 30.674 0.982 41.900 4.535 27.981 3.028 12.387
    Partially Sunny 0.988 33.378 3.581 27.887 2.992 25.047 0.998 14.664 1.573 12.285 1.318 6.497
    Sunny 0.990 35.204 3.318 27.967 2.636 20.827 0.999 13.378 1.261 12.089 1.139 6.028
    Average 0.959 50.418 5.867 38.988 4.548 30.967 0.993 21.759 2.474 16.305 1.861 9.984
    60 min Rainy 0.920 36.776 7.311 29.435 5.852 25.106 0.984 16.306 3.242 13.818 2.747 16.101
    Cloudy 0.944 74.659 7.713 53.814 5.559 33.526 0.991 30.270 3.127 23.569 2.435 19.469
    Partially Cloudy 0.958 64.056 6.932 45.788 4.955 32.365 0.975 49.311 5.337 32.434 3.510 14.275
    Partially Sunny 0.997 17.192 1.845 12.829 1.377 9.449 0.998 15.043 1.614 12.583 1.350 7.080
    Sunny 0.984 44.244 4.170 34.456 3.248 23.752 0.998 13.829 1.303 12.420 1.171 6.429
    Average 0.961 47.385 5.594 35.264 4.198 24.840 0.989 24.952 2.925 18.965 2.243 12.671
    1 Month 2 Months
    Day Type R2 RMSE nRMSE MAE nMAE MAPE R2 RMSE nRMSE MAE nMAE MAPE
    5 min Rainy 0.996 7.739 1.539 7.175 1.427 10.222 0.999 2.959 0.588 2.794 0.555 3.651
    Cloudy 1.000 6.296 0.650 6.222 0.643 4.855 1.000 3.916 0.405 3.356 0.347 4.739
    Partially Cloudy 1.000 3.583 0.388 3.223 0.349 1.986 1.000 1.202 0.130 0.606 0.066 1.476
    Partially Sunny 1.000 2.785 0.299 2.543 0.273 2.504 1.000 2.543 0.273 1.954 0.210 2.052
    Sunny 1.000 4.114 0.388 3.923 0.370 2.468 1.000 2.953 0.278 2.534 0.239 1.674
    Average 0.999 4.903 0.653 4.617 0.612 4.407 1.000 2.714 0.335 2.249 0.283 2.718
    15 min Rainy 0.995 9.012 1.792 7.074 1.406 5.736 0.996 8.616 1.713 6.784 1.349 5.494
    Cloudy 0.991 30.495 3.150 19.753 2.041 9.277 0.991 30.846 3.187 19.881 2.054 9.337
    Partially Cloudy 0.993 26.827 2.903 16.250 1.759 6.769 0.994 25.199 2.727 15.552 1.683 6.134
    Partially Sunny 0.999 8.627 0.926 7.436 0.798 4.119 0.999 7.994 0.858 6.796 0.729 3.550
    Sunny 0.999 12.525 1.180 10.297 0.971 5.123 0.999 11.530 1.087 10.345 0.975 4.279
    Average 0.995 17.497 1.990 12.162 1.395 6.205 0.996 16.837 1.914 11.872 1.358 5.759
    30 min Rainy 0.994 10.030 1.994 7.490 1.489 6.177 0.995 9.058 1.801 6.853 1.362 5.461
    Cloudy 0.991 29.271 3.024 18.843 1.947 9.026 0.991 29.529 3.051 19.578 2.023 9.597
    Partially Cloudy 0.993 25.991 2.813 16.130 1.746 6.521 0.993 26.987 2.921 17.597 1.904 7.013
    Partially Sunny 0.999 11.418 1.225 9.955 1.068 5.538 0.999 10.728 1.151 9.521 1.022 4.855
    Sunny 0.999 12.801 1.207 10.553 0.995 5.050 0.999 12.534 1.181 11.279 1.063 4.838
    Average 0.995 17.902 2.052 12.594 1.449 6.463 0.995 17.767 2.021 12.966 1.475 6.353
    45 min Rainy 0.995 9.085 1.806 7.114 1.414 6.011 0.995 9.634 1.915 6.908 1.373 5.570
    Cloudy 0.991 30.055 3.105 23.661 2.444 19.040 0.995 21.638 2.235 14.369 1.484 7.504
    Partially Cloudy 0.990 30.539 3.305 19.056 2.062 6.674 0.986 36.411 3.941 23.935 2.590 9.285
    Partially Sunny 0.998 14.462 1.552 12.793 1.373 6.895 0.998 13.473 1.446 12.105 1.299 5.834
    Sunny 0.998 14.166 1.335 12.043 1.135 5.099 0.999 13.057 1.231 11.789 1.111 5.291
    Average 0.995 19.661 2.221 14.933 1.686 8.744 0.995 18.843 2.153 13.821 1.572 6.697
    60 min Rainy 0.995 9.499 1.889 7.469 1.485 6.289 0.995 9.472 1.883 7.743 1.539 6.547
    Cloudy 0.993 27.386 2.829 21.462 2.217 17.045 0.995 21.724 2.244 15.642 1.616 8.754
    Partially Cloudy 0.991 29.400 3.182 19.302 2.089 6.689 0.986 36.779 3.980 24.659 2.669 9.643
    Partially Sunny 0.997 15.760 1.691 14.057 1.508 7.265 0.998 13.912 1.493 12.531 1.344 5.996
    Sunny 0.998 14.958 1.410 12.453 1.174 5.304 0.999 13.196 1.244 11.898 1.121 5.123
    Average 0.995 19.401 2.200 14.949 1.695 8.519 0.995 19.017 2.169 14.495 1.658 7.213
    3 Months
    Day Type R2 RMSE nRMSE MAE nMAE MAPE
    5 min Rainy 0.999 3.332 0.662 3.196 0.635 4.247
    Cloudy 0.999 9.834 1.016 9.384 0.969 6.022
    Partially Cloudy 1.000 3.038 0.329 2.523 0.273 1.556
    Partially Sunny 0.999 8.317 0.892 7.169 0.769 2.417
    Sunny 1.000 2.070 0.195 1.484 0.140 2.244
    Average 1.000 5.318 0.619 4.751 0.557 3.297
    15 min Rainy 0.994 9.715 1.931 7.185 1.428 5.563
    Cloudy 0.990 31.772 3.282 21.025 2.172 9.522
    Partially Cloudy 0.994 24.199 2.619 14.649 1.585 5.798
    Partially Sunny 0.999 8.593 0.922 7.533 0.808 3.966
    Sunny 0.999 9.670 0.911 8.890 0.838 4.296
    Average 0.995 16.790 1.933 11.856 1.366 5.829
    30 min Rainy 0.994 10.448 2.077 7.247 1.441 5.522
    Cloudy 0.991 30.167 3.116 20.599 2.128 9.822
    Partially Cloudy 0.994 24.899 2.695 15.730 1.702 6.377
    Partially Sunny 0.999 10.640 1.142 9.396 1.008 4.672
    Sunny 0.999 12.919 1.218 10.916 1.029 5.225
    Average 0.995 17.815 2.049 12.777 1.462 6.324
    45 min Rainy 0.992 11.471 2.281 8.978 1.785 7.808
    Cloudy 0.996 20.459 2.114 13.034 1.346 6.836
    Partially Cloudy 0.996 20.385 2.206 14.612 1.581 6.182
    Partially Sunny 0.998 13.562 1.455 12.012 1.289 5.635
    Sunny 0.999 13.006 1.226 11.798 1.112 4.970
    Average 0.996 15.776 1.856 12.087 1.423 6.286
    60 min Rainy 0.991 12.416 2.468 9.876 1.963 8.974
    Cloudy 0.996 18.737 1.936 12.189 1.259 6.280
    Partially Cloudy 0.996 19.641 2.126 14.246 1.542 6.135
    Partially Sunny 0.998 15.136 1.624 13.668 1.466 6.538
    Sunny 0.999 13.065 1.231 11.868 1.119 5.018
    Average 0.996 15.799 1.877 12.369 1.470 6.589


    Comparing the lag observations of data, the previous 5min of GHI readings led to the best forecasting results for all days and data volumes—1 day, 1 week, 1 month, 2 months, and 3 months. The 15min and 30min lags at 5min intervals came in second and third place, respectively, in generating models with high-accuracy outcomes. Hence, the 2 months of training data with the previous 5min of GHI readings (2M-5min) is selected as the best feature set to predict the future 5min output of the GHI. In addition, for further visualization, Figure 5 depicts the performance of the different models when the measured GHI values are plotted against the predicted values of the GHI model.

    Figure 5.  The performance of the CNN model with different data volume and historical GHI data for a 5min forecast. (1D-60min: 1 day of data with the previous 60min of GHI readings at a 5min interval).

    Hyperparameter selection is an important step when using deep learning algorithms for prediction, such as CNN. This stage helps to improve overall precision and shorten the algorithm's execution time. A comprehensive evaluation of the previously mentioned validation criteria is combined with a heuristic technique to discover the optimal set of hyperparameters for the GHI forecasting using the CNN algorithm. The best-predicting results were obtained using the 2 months of the trained dataset with the previous 5min of the GHI reading, as mentioned in Subsection 4.1.1. Therefore, this set of features was selected to carry out the hyperparameter tuning for 5min forecasting horizon at the study site.

    There are no set techniques when it comes to hyperparameter tuning. Nonetheless, based on the literature analysis and best practices, the following order was chosen for fine-tuning the hyperparameters of the GHI forecasting model: the number of convolution layers (ConvLayers), the number of filters in each ConvLayer, and the learning rate. For hyperparameter tuning, the falling leaf approach was used, as it offers a more flexible and dynamic way to explore the hyperparameter space. In this approach, after determining the ideal number of ConvLayers, the process continues with various numbers of filters. For example, it was found that two ConvLayers with 32 filters at each layer produced the best forecasting outcomes out of the (1, 2, 3) ConvLayers. The two ConvLayers were then fixed in the following phase, and various learning rates were examined.
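    This sequential search can be sketched as follows; the evaluate function (returning a validation error such as RMSE) and the candidate grids are placeholders, with the joint layer/filter stage mirroring Figure 6 and the learning-rate sweep mirroring Figure 7:

        import itertools

        def sequential_tune(evaluate,
                            layer_grid=(1, 2, 3),
                            filter_grid=(32, 64, 100, 128),
                            lr_grid=(0.1, 0.01, 0.001, 0.0001),
                            default_lr=0.001):
            # Stage 1: search ConvLayer/filter combinations at a fixed learning rate.
            best_layers, best_filters = min(
                itertools.product(layer_grid, filter_grid),
                key=lambda lf: evaluate(lf[0], lf[1], default_lr))
            # Stage 2: fix the winning architecture and sweep the learning rate.
            best_lr = min(lr_grid, key=lambda r: evaluate(best_layers, best_filters, r))
            return best_layers, best_filters, best_lr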

    Figures 6 and 7 show the performance comparisons used to obtain the optimal number of ConvLayers, filters, and learning rates. Figure 6 depicts the statistical error results of 1, 2, and 3 ConvLayers with the number of filters set to 32, 64, 100, and 128. It can be seen that a setup with 2 ConvLayers with 32 filters (2-ConvLayer (32)) had the best forecasting outcomes. Hence, 2 ConvLayers with 32 filters were selected to continue the hyperparameter tuning process.

    Figure 6.  The performance comparison to obtain the optimal number of ConvLayers and filters for a 5min forecasting horizon.
    Figure 7.  The performance comparison to obtain the optimal learning rate for a 5, 15, and 30min forecasting horizon.

    The performance comparison between two ConvLayers, each with 32 filters, is shown in Figure 7 in order to determine the ideal learning rate value. Among 0.1, 0.01, 0.001, and 0.0001, the learning rate of 0.001 produced the best forecasting outcomes. As a result, Table 4 lists the final, best CNN configurations chosen for 5min-ahead GHI forecasting at the study site.

    Table 4.  Optimal CNN configurations chosen for 5, 15, and 30min ahead of GHI forecasting.
    5min Prediction 15min Prediction 30min Prediction
    Name Configuration/Value Configuration/Value Configuration/Value
    Input Feature Training Set GHIt-1 GHIt-3 GHIt-9
    Volume of Dataset Previous 2 Months of GHI Observations Previous 2 Months of GHI Observations Previous 3 Months of GHI Observations
    Number of ConvLayers 2 3 3
    Number of Filters in Each ConvLayer 32 100 100
    Learning Rate 0.001 0.001 0.0001
    Epochs 100 100 100
    Optimizer Adam Adam Adam


    The forecasting performance of the developed forecasting model was evaluated against four popular forecasting algorithms, namely the RNN, ANN, RF, and SVR. Table 5 contains the results of the developed CNN, RNN, ANN, RF, and SVR. To ensure a fair comparison, the best input features (2M-5min) identified in Subsection 4.1.1 were used as input to the RNN, ANN, RF, and SVR. According to Table 5, the proposed forecasting model with optimal input features and configurations outperformed the other forecasting models in predicting the future values of GHI, with low RMSE, MAE, and MAPE values for all the day types. Regarding model fitting accuracy, the proposed CNN had the best prediction outcomes, where the average RMSE over the five days was found to be 2.262 W/m2, the MAE 1.794 W/m2, and the MAPE 2.17%. The RNN algorithm showed promising performance with an average RMSE value of 3.062 W/m2, MAE of 2.192 W/m2, and MAPE of 2.169%. The ANN, RF, and SVR came in third, fourth, and fifth, respectively.

    Table 5.  Statistical results of the CNN model with optimal configurations compared to other ML models for a 5min forecast. RMSE and MAE (W/m2); nRMSE, nMAE, and MAPE (%).
    CNN Day Type R2 RMSE nRMSE MAE nMAE MAPE
    Rainy 0.999554 2.744083 0.545543 2.511927 0.499389 3.468878
    Cloudy 0.999892 3.292661 0.340151 2.818895 0.291208 2.481333
    Partially Cloudy 0.999986 1.171729 0.126811 0.484412 0.052425 1.347758
    Partially Sunny 0.999963 1.862934 0.199886 1.29169 0.138593 2.084594
    Sunny 0.999959 2.237807 0.210915 1.864288 0.17571 1.46848
    Average 0.999871 2.261843 0.284661 1.794242 0.231465 2.170209
    RNN Day Type R2 RMSE nRMSE MAE nMAE MAPE
    Rainy 0.999451 3.045107 0.605389 2.320046 0.461242 2.682643
    Cloudy 0.999819 4.263271 0.440421 3.058548 0.315966 2.981755
    Partially Cloudy 0.999936 2.495266 0.27005 1.844726 0.199646 1.954493
    Partially Sunny 0.999926 2.639085 0.283164 1.804686 0.193636 1.510071
    Sunny 0.999933 2.868403 0.270349 1.933374 0.182222 1.71893
    Average 0.999813 3.062226 0.373875 2.192276 0.270542 2.169579
    ANN Day Type R2 RMSE nRMSE MAE nMAE MAPE
    Rainy 0.997896 5.959759 1.184843 5.801542 1.153388 7.458361
    Cloudy 0.999648 5.938251 0.613456 5.205368 0.537745 7.427777
    Partially Cloudy 0.999934 2.535143 0.274366 2.21285 0.239486 1.144426
    Partially Sunny 0.999835 3.957284 0.424601 3.36722 0.36129 1.275724
    Sunny 0.999907 3.368667 0.317499 2.613383 0.246313 1.265069
    Average 0.999444 4.351821 0.562953 3.840073 0.507644 3.714272
    RF Day Type R2 RMSE nRMSE MAE nMAE MAPE
    Rainy 0.996923 7.207522 1.432907 5.298681 1.053416 5.359766
    Cloudy 0.999475 7.250545 0.749023 5.075886 0.524368 4.413965
    Partially Cloudy 0.999702 5.399749 0.584388 4.015516 0.43458 2.830605
    Partially Sunny 0.999583 6.28164 0.673996 4.359316 0.467738 2.586884
    Sunny 0.999454 8.184438 0.771389 5.160276 0.48636 2.517631
    Average 0.999028 6.864779 0.842341 4.781935 0.593292 3.54177
    SVR Day Type R2 RMSE nRMSE MAE nMAE MAPE
    Rainy 0.996895 7.240124 1.439388 5.658349 1.12492 10.83215
    Cloudy 0.9984 12.65985 1.307836 10.33041 1.067191 11.6109
    Partially Cloudy 0.996976 17.21449 1.86304 14.31814 1.549583 14.65393
    Partially Sunny 0.996518 18.15265 1.94771 12.95612 1.390141 6.40973
    Sunny 0.99783 16.31258 1.537472 12.63674 1.191022 10.14963
    Average 0.997324 14.31594 1.619089 11.17995 1.264571 10.73127


    In addition, Figure 8 illustrates the efficacy of the proposed CNN model in comparison to the RNN, ANN, RF, and SVR for the five specified days. Figure 8(a) shows that when the input features and CNN hyperparameters were appropriately selected, the proposed CNN model excelled in accurately tracing the actual values of the GHI output, outperforming the other models. Furthermore, the boxplots shown in Figure 8(b) were designed to offer a more comprehensive assessment of the forecasting models' predictive performance. A box and whisker plot (BWP) shows the distribution of the mean absolute error (MAE) when all of the predicted days are combined. In the BWP, an outlier (shown by a red cross) is a data point that deviates markedly from the rest of the data. Consistent with the earlier deductions, the proposed CNN models consistently outperformed the RNN, ANN, RF, and SVR. This superior performance is also highlighted in the scatter plots presented in Figure 9, which shows the measured versus predicted GHI output values obtained by the proposed CNN model compared to the RNN and SVR models for the five simulation days.

    Figure 8.  (a) The efficacy of the proposed CNN model in comparison to RNN, ANN, RF, and SVR for the five specified days. (b) Boxplot comparing the MAE error values of the proposed CNN and other models for all the considered days.
    Figure 9.  The measured versus predicted GHI output values of proposed CNN, RNN and SVR models.

    To further examine the performance of the proposed forecasting model, a randomly selected week (December 5–11, 2022) is forecasted using the optimal set of input features and CNN configurations. The forecasting accuracy results are shown in Table 6 and Figure 10. According to Table 6, the 5min-ahead forecast led to an RMSE value of 2.2785 W/m2, while the MAE and MAPE were found to be 1.59 W/m2 and 2.913%, respectively. It can be inferred that forecasting accuracy steadily declined as the time horizon lengthened, from the best result at 5min ahead to the 30min estimate, while the uncertainty in the GHI forecasts grew.

    Table 6.  Statistical results of the CNN model with optimal configurations for one week—5, 15, and 30min forecast. RMSE and MAE (W/m2); nRMSE, nMAE, and MAPE (%).
    R2 RMSE nRMSE MAE nMAE MAPE
    5min 0.999895 2.278482 0.312121 1.509018 0.206715 2.912875
    15min 0.982755 29.24868 4.006669 15.76139 2.159094 8.80489
    30min 0.927333 60.04036 8.224707 33.48643 4.587183 19.32963

    Figure 10.  The performance of the CNN model with optimal input features and configurations for 5, 15, and 30min forecasting horizons.

    The system used for the simulations consists of an Intel Core i7-7700 @ 4.20 GHz CPU, an NVIDIA GeForce GTX 1080 GPU, and 16 GB of RAM. The simulation environment is MATLAB, which permits the use of GPUs that support the CUDA Toolkit. Utilizing a GPU greatly accelerates computation; therefore, to generate the forecasting model quickly, especially with a large training dataset, high-performance GPUs are recommended. For the 5min ahead GHI forecast, the average running time of the optimal prediction model was 57 seconds for the five selected days, with model training consuming over 95% of the entire run duration. In a real-time scenario, loading an already-trained model can therefore greatly reduce the time needed to produce a 5min GHI forecast.
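    The real-time shortcut mentioned above, training offline once and only loading the trained network at prediction time, could look like the following minimal Python/Keras sketch. The paper's implementation is in MATLAB, and the model file name "cnn_ghi_5min.h5" is hypothetical.

    ```python
    # Train once offline, then reuse the saved network for each forecast.
    import numpy as np
    from tensorflow.keras.models import load_model

    model = load_model("cnn_ghi_5min.h5")    # training cost already paid offline

    def forecast_next_5min(recent_ghi):
        """recent_ghi: the lagged GHI window the model was trained on."""
        x = np.asarray(recent_ghi, dtype=float).reshape(1, -1, 1)
        return float(model.predict(x, verbose=0)[0, 0])
    ```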

    This section covers the outcomes of hour-ahead GHI forecasting for the 15min and 30min horizons: selecting the best feature set, fine-tuning the hyperparameters, evaluating the proposed CNN against other popular forecasting algorithms, presenting the prediction results, and examining the proposed CNN model's execution time.

    As with the 5min prediction horizon, each day type was analyzed to cover the seasonal variations and examine the performance of the CNN algorithm. Regarding the data volume, however, 1 month, 2 months, and 3 months of data were used as input with the 15min and 30min multistep forecasts. Each training dataset contained one of five possible combinations of GHI's historical observations: lags of 5min, 15min, 30min, 45min, and 60min, each at 5min intervals (a sketch of this windowing is given below). Hence, in this case, a total of 15 models (3 data volumes × 5 lag settings) was created for each specific day. The analysis conducted for the 15min and 30min multistep forecasts used the first set of hyperparameters shown in Table 2.
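    As a sketch of this windowing under the stated assumptions (5min-resolution data, lag windows of 5 to 60 minutes, direct multistep targets), the following Python helper builds the (input, target) pairs for one lag/horizon combination:

    ```python
    import numpy as np

    def make_lag_dataset(ghi, lag_minutes, horizon_minutes, step=5):
        """Build (X, y) pairs from a GHI series sampled every `step` minutes."""
        n_lags = lag_minutes // step          # samples in the input window
        h = horizon_minutes // step           # samples ahead to predict
        X, y = [], []
        for t in range(n_lags, len(ghi) - h + 1):
            X.append(ghi[t - n_lags:t])       # previous readings, 5 min apart
            y.append(ghi[t + h - 1])          # direct multistep target
        # Reshape X to (samples, window, 1) for a 1-D CNN input.
        return np.asarray(X, dtype=float)[..., np.newaxis], np.asarray(y, dtype=float)
    ```

    For example, make_lag_dataset(ghi, 15, 30) would build the 15min-lag inputs for a 30min ahead forecast, corresponding to the "15 min" lag rows of the tables below.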

    Table 7 presents the 15min multistep statistical error results for each day type based on different combinations of historical data volume and lagged GHI readings. The results show that, of the developed models, 2 months of data with a 15min observation lag (2M-15min) performed best for the 15min ahead forecast scenario; the average values of R2, RMSE, MAE, and MAPE across all days were 0.9708, 35.776 W/m2, 20.684 W/m2, and 12.437%, respectively. On the other hand, 3 months of training data with the previous 45min of input GHI values (3M-45min) outperformed the other models for the 30min multistep forecasts; the R2, RMSE, MAE, and MAPE values generated with that model were 0.9276, 56.319 W/m2, 36.891 W/m2, and 19.711%, respectively. In addition, Table 7 indicates that, regardless of the amount of training data, the lag-5min input feature gave the worst accuracy for both the 15min and 30min multistep GHI forecasts. For additional visualization, Figures 11 and 12 show how the various models performed when the measured GHI values were plotted against the models' predicted values for the 15min and 30min multistep forecasts, respectively.

    Table 7.  Statistical results of the CNN model for 15min forecast. RMSE and MAE (W/m2); nRMSE, nMAE, and MAPE (%).
    1 Month
    Lag      Day Type           R2       RMSE      nRMSE    MAE       nMAE     MAPE
    5 min    Rainy              -0.341   151.974   30.213   134.780   26.795   259.437
             Cloudy             0.054    307.945   31.813   243.428   25.148   186.257
             Partially Cloudy   0.241    272.774   29.521   212.433   22.991   107.753
             Partially Sunny    -0.529   380.334   40.808   328.531   35.250   97.006
             Sunny              -0.584   440.813   41.547   383.998   36.192   119.535
             Average            -0.232   310.768   34.780   260.634   29.275   153.998
    15 min   Rainy              0.928    34.849    6.928    18.984    3.774    12.896
             Cloudy             0.961    62.132    6.419    36.612    3.782    19.552
             Partially Cloudy   0.957    64.817    7.015    34.049    3.685    10.714
             Partially Sunny    0.997    15.571    1.671    13.606    1.460    4.844
             Sunny              0.999    8.519     0.803    7.035     0.663    3.202
             Average            0.969    37.178    4.567    22.057    2.673    10.242
    30 min   Rainy              0.903    40.800    8.111    23.025    4.578    21.055
             Cloudy             0.971    54.233    5.603    31.044    3.207    15.604
             Partially Cloudy   0.945    73.707    7.977    38.577    4.175    12.058
             Partially Sunny    0.998    12.818    1.375    9.204     0.988    4.548
             Sunny              0.999    7.837     0.739    6.265     0.590    3.466
             Average            0.963    37.879    4.761    21.623    2.708    11.346
    45 min   Rainy              0.902    41.108    8.173    25.451    5.060    23.409
             Cloudy             0.966    58.486    6.042    36.243    3.744    21.060
             Partially Cloudy   0.947    71.916    7.783    41.228    4.462    13.285
             Partially Sunny    0.997    15.771    1.692    13.480    1.446    5.385
             Sunny              0.998    13.700    1.291    12.037    1.135    4.174
             Average            0.962    40.196    4.996    25.688    3.169    13.463
    60 min   Rainy              0.919    37.443    7.444    22.725    4.518    21.549
             Cloudy             0.962    61.399    6.343    37.481    3.872    23.272
             Partially Cloudy   0.952    68.303    7.392    40.576    4.391    12.987
             Partially Sunny    0.996    18.719    2.008    15.613    1.675    6.081
             Sunny              0.999    12.944    1.220    11.090    1.045    5.579
             Average            0.966    39.761    4.881    25.497    3.100    13.894

    2 Months
    Lag      Day Type           R2       RMSE      nRMSE    MAE       nMAE     MAPE
    5 min    Rainy              -0.016   132.290   26.300   107.808   21.433   185.087
             Cloudy             0.111    299.220   30.911   232.845   24.054   530.840
             Partially Cloudy   0.266    268.127   29.018   208.578   22.573   101.295
             Partially Sunny    -0.246   343.408   36.846   293.574   31.499   104.098
             Sunny              -0.389   412.770   38.904   360.339   33.962   122.886
             Average            -0.055   291.163   32.396   240.629   26.704   208.841
    15 min   Rainy              0.936    33.327    6.626    17.919    3.563    15.809
             Cloudy             0.964    60.122    6.211    34.817    3.597    27.607
             Partially Cloudy   0.956    65.310    7.068    34.563    3.741    10.991
             Partially Sunny    0.999    10.887    1.168    8.206     0.881    3.893
             Sunny              0.999    9.234     0.870    7.917     0.746    3.882
             Average            0.971    35.776    4.389    20.684    2.505    12.437
    30 min   Rainy              0.914    38.425    7.639    20.759    4.127    16.327
             Cloudy             0.975    49.996    5.165    27.662    2.858    17.823
             Partially Cloudy   0.946    72.497    7.846    36.870    3.990    12.649
             Partially Sunny    0.998    13.396    1.437    9.620     1.032    5.403
             Sunny              0.999    11.493    1.083    10.204    0.962    4.782
             Average            0.967    37.161    4.634    21.023    2.594    11.397
    45 min   Rainy              0.923    36.540    7.264    21.447    4.264    20.649
             Cloudy             0.970    54.832    5.664    32.137    3.320    18.834
             Partially Cloudy   0.949    70.883    7.671    41.031    4.441    13.138
             Partially Sunny    0.998    13.418    1.440    10.058    1.079    7.146
             Sunny              0.998    13.621    1.284    11.568    1.090    4.274
             Average            0.968    37.859    4.665    23.248    2.839    12.808
    60 min   Rainy              0.930    34.759    6.910    21.038    4.182    19.585
             Cloudy             0.968    56.676    5.855    35.954    3.714    42.472
             Partially Cloudy   0.955    66.186    7.163    38.273    4.142    13.827
             Partially Sunny    0.997    16.816    1.804    14.065    1.509    5.592
             Sunny              0.999    12.915    1.217    11.286    1.064    6.521
             Average            0.970    37.470    4.590    24.123    2.922    17.600

    3 Months
    Lag      Day Type           R2       RMSE      nRMSE    MAE       nMAE     MAPE
    5 min    Rainy              -0.646   168.403   33.480   154.267   30.669   284.358
             Cloudy             -0.049   324.220   33.494   255.725   26.418   188.281
             Partially Cloudy   0.051    304.964   33.005   226.944   24.561   96.020
             Partially Sunny    -0.487   375.088   40.245   321.385   34.483   97.544
             Sunny              -0.740   461.897   43.534   397.667   37.480   116.884
             Average            -0.374   326.914   36.752   271.198   30.722   156.617
    15 min   Rainy              0.932    34.317    6.822    18.810    3.740    13.913
             Cloudy             0.960    63.231    6.532    36.973    3.820    19.278
             Partially Cloudy   0.964    59.416    6.430    31.726    3.434    10.497
             Partially Sunny    0.999    11.253    1.207    8.045     0.863    3.653
             Sunny              0.999    12.395    1.168    10.286    0.969    3.723
             Average            0.971    36.122    4.432    21.168    2.565    10.213
    30 min   Rainy              0.878    45.755    9.096    26.549    5.278    22.744
             Cloudy             0.967    57.363    5.926    33.197    3.429    17.103
             Partially Cloudy   0.960    62.285    6.741    33.940    3.673    11.302
             Partially Sunny    0.998    13.105    1.406    10.487    1.125    4.577
             Sunny              0.999    9.498     0.895    7.852     0.740    3.787
             Average            0.961    37.601    4.813    22.405    2.849    11.903
    45 min   Rainy              0.893    42.847    8.518    26.249    5.219    20.571
             Cloudy             0.961    62.614    6.468    39.114    4.041    22.113
             Partially Cloudy   0.969    54.981    5.950    31.723    3.433    9.937
             Partially Sunny    0.994    23.007    2.469    20.362    2.185    10.605
             Sunny              0.999    12.464    1.175    10.853    1.023    4.753
             Average            0.963    39.183    4.916    25.660    3.180    13.596
    60 min   Rainy              0.905    40.448    8.041    25.105    4.991    22.392
             Cloudy             0.961    62.626    6.470    38.588    3.986    21.106
             Partially Cloudy   0.972    52.145    5.643    28.364    3.070    9.984
             Partially Sunny    0.996    19.159    2.056    16.062    1.723    5.718
             Sunny              0.998    13.567    1.279    10.638    1.003    3.610
             Average            0.967    37.589    4.698    23.751    2.955    12.562

    Figure 11.  The performance of the CNN model with different data volume and historical GHI data for a 15min forecast. (1D-60min: 1 day of data with the previous 60min of GHI readings at a 5min interval).
    Figure 12.  The performance of the CNN model with different data volume and historical GHI data for a 30min forecast. (1D-60min: 1 day of data with the previous 60min of GHI readings at a 5min interval).

    The hyperparameter tuning process used for the 5min ahead forecast was also employed for the 15min and 30min multistep GHI forecasts. Figures 13, 14, and 7 show the performance comparisons used to obtain the optimal number of ConvLayers, number of filters, and learning rate for the 15min and 30min horizon forecasts. Specifically, Figures 13 and 14 depict the statistical error results of 1, 2, and 3 ConvLayers with 32, 64, 100, and 128 filters for the 15min and 30min horizons, respectively.

    Figure 13.  The performance comparison to obtain the optimal number of ConvLayers and filters for the 15min forecasting horizon.
    Figure 14.  The performance comparison to obtain the optimal number of ConvLayers and filters for the 30min forecasting horizon.

    For the 15min and 30min forecasting horizons, the design with 3 ConvLayers, each with 100 filters (3-ConvLayer (100)), produced the best forecasting outcomes; hence, 3 ConvLayers with 100 filters were carried forward in the hyperparameter tuning process for the 15min and 30min cases. For the learning rate, Figure 7 reveals that 0.001 and 0.0001 are the optimal values for the 15min and 30min multistep forecasts, respectively. Consequently, Table 4 presents the optimal CNN configurations selected for the 15min and 30min GHI forecasts of the study site. A sketch of this tuning loop is given below.
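    The following is a minimal sketch of such a heuristic tuning loop, written in Python/Keras for illustration (the paper's implementation is in MATLAB). The kernel size, dense head, and epoch count are assumptions, not the paper's values; the actual configurations adopted are those of Tables 2 and 4.

    ```python
    import itertools
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    def build_cnn(n_conv, n_filters, window, lr):
        """1-D CNN with n_conv ConvLayers of n_filters filters each."""
        model = keras.Sequential([keras.Input(shape=(window, 1))])
        for _ in range(n_conv):
            model.add(layers.Conv1D(n_filters, kernel_size=2,
                                    padding="same", activation="relu"))
        model.add(layers.Flatten())
        model.add(layers.Dense(50, activation="relu"))   # assumed head
        model.add(layers.Dense(1))                       # GHI estimate
        model.compile(optimizer=keras.optimizers.Adam(lr), loss="mse")
        return model

    def tune(X_tr, y_tr, X_val, y_val, window):
        """Grid-search ConvLayers, filters, and learning rate by validation
        RMSE. X_tr / X_val are shaped (samples, window, 1)."""
        best_cfg, best_rmse = None, np.inf
        for n_conv, n_filters, lr in itertools.product(
                [1, 2, 3], [32, 64, 100, 128], [1e-2, 1e-3, 1e-4]):
            model = build_cnn(n_conv, n_filters, window, lr)
            model.fit(X_tr, y_tr, epochs=30, verbose=0)  # assumed epochs
            pred = model.predict(X_val, verbose=0).ravel()
            rmse = float(np.sqrt(np.mean((pred - y_val) ** 2)))
            if rmse < best_rmse:
                best_cfg, best_rmse = (n_conv, n_filters, lr), rmse
        return best_cfg, best_rmse
    ```

    Note that the paper tunes the layer/filter design first and the learning rate afterwards; the exhaustive product above is a simplification of that two-stage search.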

    Here, the proposed CNN model is compared with other forecasting algorithms. Since RNN and ANN outperformed RF and SVR in the 5min ahead forecast case, they were selected for comparison with the optimal CNN model for the 15min and 30min horizons.

    According to Tables 8 and 9, the proposed CNN forecasting models with optimal input features and configurations outperformed the RNN and ANN models in predicting future GHI values, with noticeably low statistical errors for all day types. For the 15min case, Table 8 shows that the average RMSE values of the proposed CNN, RNN, and ANN were 30.569 W/m2, 35.759 W/m2, and 43.058 W/m2, respectively. For the 30min case, Table 9 indicates that the average R2 values of the proposed CNN, RNN, and ANN were 0.933, 0.919, and 0.914, respectively.

    Table 8.  Statistical results of the CNN model with optimal configurations compared to other ML models for a 15min forecast. RMSE and MAE (W/m2); nRMSE, nMAE, and MAPE (%).
    CNN
    Day Type           R2         RMSE       nRMSE      MAE        nMAE       MAPE
    Rainy              0.930707   34.20283   6.799768   18.85526   3.748561   13.26084
    Cloudy             0.976319   48.71237   5.03227    26.70142   2.758411   12.74513
    Partially Cloudy   0.973232   51.21305   5.542538   25.72208   2.783775   10.97661
    Partially Sunny    0.998963   9.906341   1.062912   7.205727   0.773147   3.537772
    Sunny              0.999367   8.813277   0.830658   7.752396   0.730669   4.057151
    Average            0.975717   30.56957   3.853629   17.24738   2.158913   8.915503

    RNN
    Day Type           R2         RMSE       nRMSE      MAE        nMAE       MAPE
    Rainy              0.904618   40.12841   7.977815   25.0472    4.979563   16.5717
    Cloudy             0.967473   57.08935   5.89766    25.11826   2.594862   14.74114
    Partially Cloudy   0.970538   53.72851   5.814774   30.75228   3.328169   11.36507
    Partially Sunny    0.997388   15.72194   1.686904   9.222514   0.98954    6.908786
    Sunny              0.998801   12.12546   1.142834   9.387336   0.884763   5.382341
    Average            0.967764   35.75874   4.503997   19.90552   2.555379   10.99381

    ANN
    Day Type           R2         RMSE       nRMSE      MAE        nMAE       MAPE
    Rainy              0.872276   46.43584   9.231778   29.03691   5.772746   20.50227
    Cloudy             0.962532   61.27265   6.329819   35.73584   3.691719   18.87583
    Partially Cloudy   0.930234   82.67886   8.947929   48.29123   5.226324   17.57485
    Partially Sunny    0.997417   15.63435   1.677506   12.73531   1.36645    6.541495
    Sunny              0.999299   9.270155   0.873719   7.680705   0.723912   3.599812
    Average            0.952352   43.05837   5.41215    26.696     3.35623    13.41885

    Table 9.  Statistical results of the CNN model with optimal configurations compared to other ML models for a 30min forecast. RMSE and MAE (W/m2); nRMSE, nMAE, and MAPE (%).
    CNN
    Day Type           R2         RMSE       nRMSE      MAE        nMAE       MAPE
    Rainy              0.884291   44.19796   8.78687    31.1768    6.198171   24.86742
    Cloudy             0.844454   124.8436   12.89706   81.47719   8.417065   42.91131
    Partially Cloudy   0.939979   76.68758   8.299522   43.66264   4.725394   16.0358
    Partially Sunny    0.998022   13.68148   1.46797    10.31797   1.107078   8.019125
    Sunny              0.998863   11.80963   1.113066   10.22706   0.963907   5.07522
    Average            0.933122   54.24404   6.512898   35.37233   4.282323   19.38177

    RNN
    Day Type           R2         RMSE       nRMSE      MAE        nMAE       MAPE
    Rainy              0.834329   52.88608   10.51413   37.11608   7.378942   25.91648
    Cloudy             0.838339   127.2736   13.1481    67.30181   6.952666   34.58024
    Partially Cloudy   0.930889   82.28961   8.905802   48.27343   5.224397   17.30105
    Partially Sunny    0.997367   15.7834    1.693497   8.306134   0.891216   6.402013
    Sunny              0.9976     17.15679   1.617039   13.24123   1.247995   8.263789
    Average            0.919705   59.07789   7.175713   34.84773   4.339043   18.49271

    ANN
    Day Type           R2         RMSE       nRMSE      MAE        nMAE       MAPE
    Rainy              0.832421   53.18969   10.57449   40.05009   7.962244   31.48269
    Cloudy             0.820569   134.0866   13.85192   81.81275   8.451731   39.32261
    Partially Cloudy   0.92617    85.0529    9.20486    50.46915   5.46203    17.84484
    Partially Sunny    0.995512   20.60753   2.211108   14.80705   1.58874    8.625684
    Sunny              0.996983   19.23387   1.812806   14.2376    1.341904   6.303567
    Average            0.914331   62.43412   7.531038   40.27533   4.96133    20.71588


    The 15min and 30min multistep forecasts were also conducted for the randomly selected week (December 5–11, 2022) to further examine the performance of the developed CNN models. Table 6 lists the error values of the one-week results, while Figure 10 plots the forecasts against the observed GHI readings. For the 15min case, the RMSE and MAE values were 29.248 W/m2 and 15.761 W/m2, respectively, while for the 30min case they were 60.040 W/m2 and 33.486 W/m2. Comparing the 5min, 15min, and 30min ahead forecasts, Table 6 reveals that as the forecasting horizon increased, more error appeared in the forecasting results; for example, R2 was 0.999 for the 5min horizon but 0.983 and 0.927 for the 15min and 30min horizons, respectively. Hence, even higher errors are to be expected if the forecasting horizon is extended further, for example to 60min at 5min intervals.

    The system setup used for the 5min ahead forecast simulations was also used for the 15min and 30min ahead GHI forecasts. For the 15min forecast, the optimal prediction model took an average of 42 seconds to run for the five selected days, while the 30min forecast took an average of 51 seconds. As before, most of the running time was consumed by the training phase.

    Our forecasting model, using a CNN with an optimized architecture, can be deployed on hardware for energy management applications. This can be achieved by deploying the model on a microcontroller or a single-board computer, such as a Raspberry Pi, and integrating it into an energy management system. The system can receive real-time data from a solar radiation sensor and use the proposed model to forecast the future output of the photovoltaic system. The forecasted output can then be used to optimize the energy management system, for example by scheduling energy consumption and storage or by selling excess energy to the grid (a minimal sketch of such a loop is given below). Implementing the model in hardware would provide a reliable and accurate forecasting tool for energy management, leading to cost savings and more efficient use of renewable energy sources.
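    A minimal Python sketch of such a deployment loop follows. Here, read_pyranometer() and send_to_ems() are hypothetical placeholders for the site's irradiance sensor and energy-management interfaces, forecast_next_5min() is the model-loading helper sketched earlier, and the window length is an assumption.

    ```python
    import time
    from collections import deque

    WINDOW = 3                       # assumed lag window: 15 min at 5 min steps
    history = deque(maxlen=WINDOW)   # most recent GHI readings

    while True:
        history.append(read_pyranometer())            # hypothetical sensor read, W/m2
        if len(history) == WINDOW:
            ghi_ahead = forecast_next_5min(history)   # trained model from above
            send_to_ems(ghi_ahead)                    # hypothetical EMS hand-off
        time.sleep(5 * 60)                            # 5 min sampling interval
    ```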

    A number of research constraints must be considered when analyzing the study's possible limitations in creating a CNN prediction model for GHI. First, the CNN requires a significant volume of GHI data, and it can be challenging to gather trustworthy GHI data at short time resolutions because of measurement irregularities or sensor problems. Second, in this work the CNN was trained using fixed amounts of data (1 day, 1 week, 1–3 months), owing to the fact that CNNs learn features from raw data; however, determining the pertinent features or data volume can be difficult because it requires considerable thought and technical experience. Finally, when dealing with large datasets or intricate architectures, training CNN models for GHI forecasting can be computationally demanding: training the models effectively requires sufficient computational resources, including powerful GPUs, which can be costly for some users. Resolving these issues is therefore essential to guaranteeing the model's dependability and practicality in real-world applications.

    Forecasting solar irradiance has attracted considerable interest owing to the growing demand for renewable energy. However, the high cost of climate observatories makes gathering meteorological data difficult, impeding the development of precise forecasting models. This research therefore aimed to overcome this barrier by developing a framework to forecast GHI values even when meteorological data are absent or inaccurate. A forecasting model based on the convolutional neural network (CNN) algorithm was developed using merely lagged measurements of GHI as input, with no external variables. The CNN forecasting outputs with different network designs were investigated through a heuristic configuration paradigm. Furthermore, the performance of the developed model was compared with that of other popular forecasting algorithms over prediction horizons of 5, 15, and 30min. By analyzing the outcomes derived from the most effective forecasting model and evaluating the performance of the estimation algorithms, the conclusions can be summarized as follows:

    - Based on the criteria for model accuracy, two months' worth of data proves sufficient for constructing high-accuracy forecasting models for the 5min and 15min horizons. However, to achieve similarly good forecasting results for a 30min horizon, three months of data are recommended.

    - Regarding model fitting accuracy, the developed CNN forecasting models outperformed the other forecasting models (RNN, ANN, RF, and SVR) in forecasting GHI output. The average RMSE results of the CNN model under the 5min, 15min, and 30min forecasting horizons, considering the different day types (rainy, cloudy, partially cloudy, partially sunny, and sunny), were 2.262, 30.569, and 54.244 W/m2, respectively.

    - Forecasting accuracy decreased rapidly with the horizon, from its best result at 5min ahead to the 30min prediction. As the time horizon increased, the accuracy of the various models steadily fell and the uncertainty in the GHI forecasts grew.

    Finally, the framework developed in this study holds potential for predicting GHI output in other countries, offering a valuable tool for enhancing energy management strategies. However, there exist opportunities for further exploration to enhance the accuracy of GHI prediction models. Hybrid deep learning models, such as CNN-LSTM and RNN-LSTM, could be investigated to capture more spatial and temporal features in the GHI data. Furthermore, extending the duration of the available data, for example to 2 weeks or 6 months, warrants deeper examination, particularly for non-real-time applications. Another potential avenue involves leveraging metaheuristic optimization algorithms, such as particle swarm optimization and genetic algorithms, to optimize the CNN architectures and reveal the model's sensitivity to variations in CNN design.

    The authors declare that they have not used artificial intelligence (AI) tools in the creation of this article.

    The researchers would like to thank the Deanship of Scientific Research at Qassim University for funding the publication of this project.

    The authors declare that they have no competing interests.


