1. Introduction
In meteorology, wind speed is a fundamental atmospheric quantity caused by the movement of air from high to low pressure, usually due to temperature variations. Alongside wind speed, wind direction plays a pivotal role in the analysis and prediction of weather patterns and the global climate. Wind speed and direction significantly affect factors such as evaporation rates, sea surface turbulence, and the formation of oceanic waves and storms. Moreover, these factors have substantial impacts on water quality, water levels, and various fields, including weather forecasting, climatology, renewable energy, environmental monitoring, aviation, and agriculture. Therefore, a comprehensive understanding of wind speed is important for managing potential impacts. Additionally, using wind speed data for analysis can help researchers and experts understand and address issues related to wind speed across various areas. For wind speed data, a normal distribution may not be appropriate, even though the normal distribution is one of the most widely used statistical distributions. If the data exhibits skewness, it is advisable to consider alternative distributions. Many new distributions have been developed using certain transformations from the normal distribution. One such distribution is the Birnbaum-Saunders distribution. Importantly, Mohammadi, Alavi, and McGowan [1] investigated the application of the two-parameter Birnbaum-Saunders distribution for analyzing wind speed and wind energy density at ten different stations in Canada. Their results demonstrated that the Birnbaum-Saunders distribution was especially effective at all the chosen locations. The Birnbaum-Saunders distribution was introduced by Birnbaum and Saunders [2] for the purpose of modeling the fatigue life of metals subjected to periodic stress. As a result, this distribution is sometimes referred to as the fatigue life distribution. 
The Birnbaum-Saunders distribution has been applied in various contexts, such as engineering, testing, medical sciences, and environmental studies. It is well known that the Birnbaum-Saunders distribution is positively skewed and is defined only for positive values. However, some data sets contain both positive and zero values. If the number of zero observations follows a binomial distribution and the positive observations follow the Birnbaum-Saunders distribution, the resulting mixture is the zero-inflated Birnbaum-Saunders (ZIBS) distribution, which is a new distribution of interest. The ZIBS distribution was inspired by Aitchison [3], and several researchers have combined zero observations with other distributions in the same way, giving, for example, the zero-inflated lognormal distribution [4], the zero-inflated gamma distribution [5], and the zero-inflated two-parameter exponential distribution [6].
The coefficient of variation (CV) of wind speed is important for several reasons. Since the CV measures the dispersion of data relative to the mean, it is expressed as the ratio of the standard deviation to the mean. The CV assesses the variability of a dataset, regardless of the unit of measurement. Additionally, using the CV to evaluate wind speed is beneficial in various contexts. For instance, calculating the CV helps in understanding how much wind speed fluctuates compared to its average. If the CV is high, it indicates that the wind speed is highly variable, making it more difficult to predict wind conditions. In the context of wind energy, the CV can help assess the reliability of energy sources. If wind speed variability is high, it may result in inconsistent energy production, which could affect the stability of energy output from wind farms. Additionally, the coefficient of variation has been used in many fields, including life insurance, science, economics, and medicine. Importantly, many researchers have constructed confidence intervals (CIs) for the coefficient of variation, which have been applied to various distributions. For example, Vangel [7] constructed the CIs for a normal distribution coefficient of variation. Buntao and Niwitpong [8] introduced the CIs for the coefficient of variation of zero-inflated lognormal and lognormal distributions. D'Cunha and Rao [9] proposed the Bayes estimator and created CIs for the coefficient of variation of the lognormal distribution. Sangnawakij and Niwitpong [10] developed CIs for coefficients of variation in two-parameter exponential distributions. Janthasuwan, Niwitpong, and Niwitpong [11] established CIs for the coefficients of variation in the zero-inflated Birnbaum-Saunders distribution.
When wind variability is analyzed and compared across multiple weather stations or wind directions without accounting for the differences in average wind speed at each station or direction, the common CV is the appropriate quantity. The common CV provides a single indicator of the overall variability of wind speed, which is crucial when planning wind energy projects, designing wind turbines, or calculating the power production of wind farms, all of which require knowledge of wind stability across different areas. Additionally, the common CV is useful in meteorological and climatological research, as it allows wind variability to be analyzed across multiple regions simultaneously. It can also assist in examining the relationship between wind variability and long-term climate changes or recurring events, such as storms or shifts in wind patterns. The common coefficient of variation is therefore a key quantity when making inferences for more than one population, particularly when independent samples are collected from various situations. Consequently, numerous researchers have investigated methods for estimating the common coefficient of variation of several populations from a variety of distributions. For instance, Tian [12] made inferences about the common coefficient of variation of several normal populations. Then, Forkman [13] studied methods for constructing CIs and statistical tests based on McKay's approximation for the common coefficient of variation in several populations with normal distributions. Sangnawakij and Niwitpong [14] proposed the method of variance estimates recovery to construct CIs for the common coefficient of variation of several gamma distributions. Next, Singh et al. [15] used several inverse Gaussian populations to estimate the common coefficient of variation, test the homogeneity of the coefficients of variation, and test for a specified value of the common coefficient of variation.
After that, Yosboonruang, Niwitpong, and Niwitpong [16] presented methods to construct CIs for the common coefficient of variation of zero-inflated lognormal distributions, employing the method of variance estimates recovery, equal-tailed Bayesian intervals, and the fiducial generalized confidence interval. Finally, Puggard, Niwitpong, and Niwitpong [17] introduced Bayesian credible intervals, highest posterior density intervals, the method of variance estimates recovery, generalized confidence intervals, and large-sample methods to construct confidence intervals for the common coefficient of variation of several Birnbaum-Saunders distributions. To the best of our knowledge, no studies have investigated the estimation of the common coefficient of variation of several ZIBS distributions. Therefore, the primary objective of this article is to construct CIs for the common coefficient of variation of several ZIBS distributions. The article presents five methods: the generalized confidence interval, the method of variance estimates recovery, the large sample approximation, the bootstrap confidence interval, and the fiducial generalized confidence interval.
2. Materials and methods
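Let $Y_{ij}$, $i=1,2,\ldots,k$ and $j=1,2,\ldots,m_i$, be a random sample drawn from the ZIBS distributions. The density function of $Y_{ij}$ is given by
$$f(y_{ij};\vartheta_i,\alpha_i,\beta_i)=\vartheta_i I_0[y_{ij}]+(1-\vartheta_i)I_{(0,\infty)}[y_{ij}]\,\frac{1}{2\alpha_i\beta_i\sqrt{2\pi}}\left[\left(\frac{\beta_i}{y_{ij}}\right)^{1/2}+\left(\frac{\beta_i}{y_{ij}}\right)^{3/2}\right]\exp\left[-\frac{1}{2\alpha_i^2}\left(\frac{y_{ij}}{\beta_i}+\frac{\beta_i}{y_{ij}}-2\right)\right],$$
where $\vartheta_i$, $\alpha_i$, and $\beta_i$ are the proportion of zeros, the shape parameter, and the scale parameter, respectively. $I$ is an indicator function, with $I_0[y_{ij}]=1$ if $y_{ij}=0$ and $0$ otherwise, and $I_{(0,\infty)}[y_{ij}]=1$ if $y_{ij}>0$ and $0$ if $y_{ij}=0$. This distribution is a combination of the Birnbaum-Saunders and binomial distributions. Suppose that $m_i=m_{i(1)}+m_{i(0)}$ is the sample size, where $m_{i(1)}$ and $m_{i(0)}$ are the numbers of positive and zero values, respectively. The expected value and variance of $Y_{ij}$ follow from the concepts of Aitchison [3] and can be expressed as
$$E(Y_{ij})=(1-\vartheta_i)\beta_i\left(1+\frac{\alpha_i^2}{2}\right)$$
and
$$V(Y_{ij})=(1-\vartheta_i)(\alpha_i\beta_i)^2\left(1+\frac{5\alpha_i^2}{4}\right)+\vartheta_i(1-\vartheta_i)\beta_i^2\left(1+\frac{\alpha_i^2}{2}\right)^2,$$
respectively. Hence, the coefficient of variation of $Y_{ij}$ is defined as
$$\theta_i=\frac{\sqrt{V(Y_{ij})}}{E(Y_{ij})}=\frac{1}{2+\alpha_i^2}\sqrt{\frac{\alpha_i^2(4+5\alpha_i^2)+\vartheta_i(2+\alpha_i^2)^2}{1-\vartheta_i}}.$$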
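The asymptotic distribution of $\hat\vartheta_i$ is calculated by using the delta method, which is given by $\sqrt{m_i}(\hat\vartheta_i-\vartheta_i)\sim N(0,\vartheta_i(1-\vartheta_i))$, where $\hat\vartheta_i=m_{i(0)}/m_i$. According to Ng, Kundu, and Balakrishnan [18], the asymptotic joint distribution of $\hat\alpha_i$ and $\hat\beta_i$ is obtained as
where $\hat\alpha_i=\left\{2\left[\left(\bar y_i\,\frac{1}{m_{i(1)}}\sum_{j=1}^{m_{i(1)}}y_{ij}^{-1}\right)^{1/2}-1\right]\right\}^{1/2}$, $\hat\beta_i=\left\{\bar y_i\left(\frac{1}{m_{i(1)}}\sum_{j=1}^{m_{i(1)}}y_{ij}^{-1}\right)^{-1}\right\}^{1/2}$, and $\bar y_i=\frac{1}{m_{i(1)}}\sum_{j=1}^{m_{i(1)}}y_{ij}$. The estimator of $\theta_i$ is given by
$$\hat\theta_i=\frac{1}{2+\hat\alpha_i^2}\sqrt{\frac{\hat\alpha_i^2(4+5\hat\alpha_i^2)+\hat\vartheta_i(2+\hat\alpha_i^2)^2}{1-\hat\vartheta_i}}.\qquad(1)$$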
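According to Janthasuwan, Niwitpong, and Niwitpong [11], the asymptotic variance of $\hat\theta_i$, derived using the Taylor series in the delta method, is given by
where $\Psi_i=(2+\alpha_i^2)^2(1-\vartheta_i)\left[\alpha_i^2(4+5\alpha_i^2)+\vartheta_i(2+\alpha_i^2)^2\right]$. According to Graybill and Deal [19], the common CV of several ZIBS distributions can be written as
$$\hat\theta=\sum_{i=1}^{k}\frac{\hat\theta_i}{\hat V(\hat\theta_i)}\Bigg/\sum_{i=1}^{k}\frac{1}{\hat V(\hat\theta_i)},\qquad(3)$$
where $\hat V(\hat\theta_i)$ denotes the estimator of $V(\hat\theta_i)$, which is defined in Eq (2) with $\alpha_i$ and $\vartheta_i$ replaced by $\hat\alpha_i$ and $\hat\vartheta_i$, respectively. This can be expressed as follows:
where $\hat\Psi_i=(2+\hat\alpha_i^2)^2(1-\hat\vartheta_i)\left[\hat\alpha_i^2(4+5\hat\alpha_i^2)+\hat\vartheta_i(2+\hat\alpha_i^2)^2\right]$.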
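The point estimators described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `zibs_cv` and `zibs_estimates` are hypothetical, and the closed form used for the CV is the ratio of the standard deviation to the mean of the ZIBS distribution, simplified so that the scale parameter cancels.

```python
import math


def zibs_cv(alpha, zero_prop):
    """CV of a ZIBS distribution; it depends only on the shape and zero proportion."""
    a2 = alpha * alpha
    num = a2 * (4.0 + 5.0 * a2) + zero_prop * (2.0 + a2) ** 2
    return math.sqrt(num / (1.0 - zero_prop)) / (2.0 + a2)


def zibs_estimates(y):
    """Return (alpha_hat, beta_hat, zero_prop_hat, cv_hat) for one sample y."""
    m = len(y)
    positives = [v for v in y if v > 0]
    m1 = len(positives)
    if m1 == 0:
        raise ValueError("at least one positive observation is required")
    zero_prop = (m - m1) / m                      # proportion of zeros
    s = sum(positives) / m1                       # arithmetic mean of positives
    r = m1 / sum(1.0 / v for v in positives)      # harmonic mean of positives
    # Modified moment estimators; max() guards against rounding when s == r.
    alpha = math.sqrt(2.0 * max(math.sqrt(s / r) - 1.0, 0.0))
    beta = math.sqrt(s * r)
    return alpha, beta, zero_prop, zibs_cv(alpha, zero_prop)


# Small usage example with two zeros and two positive values.
alpha_hat, beta_hat, zero_hat, cv_hat = zibs_estimates([0.0, 0.0, 1.0, 4.0])
```

For a shape parameter of 2.5 and a zero proportion of 0.5, `zibs_cv` returns roughly 2.73, which is of the same order as the station-level coefficients of variation reported in the empirical application.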
The following subsection provides detailed explanations of the methods employed for constructing confidence intervals.
2.1. Generalized confidence interval
Weerahandi [20] recommended the generalized confidence interval (GCI) method for constructing confidence intervals, which is based on the concept of a generalized pivotal quantity (GPQ). To construct the confidence interval for $\theta$ using the GCI, we need the generalized pivotal quantities for the parameters $\beta_i$, $\alpha_i$, and $\vartheta_i$. Sun [21] introduced the GPQ for the scale parameter $\beta_i$, which can be derived as
where $\Lambda_i$ follows the t-distribution with $m_{i(1)}-1$ degrees of freedom, and $\beta_{i1}$ and $\beta_{i2}$ are the two solutions of the following quadratic equation:
where $\Omega_1=(m_{i(1)}-1)A_i^2-\frac{1}{m_{i(1)}}B_i\Lambda_i^2$, $\Omega_2=(m_{i(1)}-1)A_iC_i-(1-A_iC_i)\Lambda_i^2$, $A_i=\frac{1}{m_{i(1)}}\sum_{j=1}^{m_{i(1)}}\frac{1}{\sqrt{Y_{ij}}}$, $B_i=\sum_{j=1}^{m_{i(1)}}\left(\frac{1}{\sqrt{Y_{ij}}}-A_i\right)^2$, $C_i=\frac{1}{m_{i(1)}}\sum_{j=1}^{m_{i(1)}}\sqrt{Y_{ij}}$, and $D_i=\sum_{j=1}^{m_{i(1)}}\left(\sqrt{Y_{ij}}-C_i\right)^2$. Next, the GPQ for the shape parameter $\alpha_i$, as proposed by Wang [22], is derived as
where $E_{i1}=\sum_{j=1}^{m_{i(1)}}Y_{ij}$, $E_{i2}=\sum_{j=1}^{m_{i(1)}}\frac{1}{Y_{ij}}$, and $K_i$ follows the chi-squared distribution with $m_{i(1)}$ degrees of freedom. Subsequently, the GPQ for the proportion of zeros $\vartheta_i$ was recommended by Wu and Hsieh [23], who proposed using the GPQ based on the variance stabilized transformation to construct confidence intervals. Therefore, the GPQ for $\vartheta_i$ is defined as
where $W_i=2\sqrt{m_i}\left(\arcsin\sqrt{\hat\vartheta_i}-\arcsin\sqrt{\vartheta_i}\right)\sim N(0,1)$. Now, we can calculate the GPQs for $\theta_i$ and the variance of $\hat\theta_i$ using Eqs (5) and (6), resulting in
and
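where $G_{\Psi_i}=(2+G_{\alpha_i}^2)^2(1-G_{\vartheta_i})\left[G_{\alpha_i}^2(4+5G_{\alpha_i}^2)+G_{\vartheta_i}(2+G_{\alpha_i}^2)^2\right]$. Therefore, the GPQ for the common CV $\theta$ is the weighted average of the GPQs $G_{\theta_i}$ based on the $k$ individual samples, given by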
Then, the (1−ρ)100% CI for the common CV of several ZIBS distributions employing the GCI method is given by
where $G_\theta(\rho/2)$ and $G_\theta(1-\rho/2)$ denote the $100(\rho/2)$th and $100(1-\rho/2)$th percentiles of $G_\theta$, respectively.
Algorithm 1 is used to construct the GCI for the common coefficient of variation of several ZIBS distributions.
Algorithm 1.
For g=1 to n, where n is the number of generalized computations:
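1) Compute $A_i$, $B_i$, $C_i$, $D_i$, $E_{i1}$, and $E_{i2}$.
2) At the p step:
a) Generate $\Lambda_i\sim t(m_{i(1)}-1)$, and then compute $G_{\beta_i}(y_{ij};\Lambda_i)$ from Eq (4);
b) If $G_{\beta_i}(y_{ij};\Lambda_i)<0$, regenerate $\Lambda_i\sim t(m_{i(1)}-1)$;
c) Generate $K_i\sim\chi^2_{m_{i(1)}}$, and then compute $G_{\alpha_i}(y_{ij};K_i,\Lambda_i)$ from Eq (5);
d) Compute $G_{\vartheta_i}$, $G_{\theta_i}$, and $G_{V(\hat\theta_i)}$ from Eqs (6)–(8), respectively;
e) Compute $G_\theta$ from Eq (9).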
End g loop.
3) Repeat step 2, a total of G times;
4) Compute $L_{GCI}$ and $U_{GCI}$ from Eq (10).
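Steps 1 and 2 of Algorithm 1 can be sketched for a single sample as follows. The sketch rests on several assumptions that are not stated in the text: the constant term of the quadratic is taken as $(m_{i(1)}-1)C_i^2-D_i\Lambda_i^2/m_{i(1)}$ by analogy with $\Omega_1$, the larger root is used when $\Lambda_i\le 0$ and the smaller one otherwise, the GPQ for $\alpha_i$ is taken as $\sqrt{(E_{i2}G_{\beta_i}^2-2m_{i(1)}G_{\beta_i}+E_{i1})/(K_iG_{\beta_i})}$, and the GPQ for $\vartheta_i$ is taken as $\sin^2(\arcsin\sqrt{\hat\vartheta_i}-W_i/(2\sqrt{m_i}))$. All function names are hypothetical. Pooling over the $k$ samples additionally needs the variance GPQ from Eq (8), which is not reproduced here.

```python
import math
import random


def zibs_cv(alpha, zero_prop):
    """CV of a ZIBS distribution as a function of shape and zero proportion."""
    a2 = alpha * alpha
    num = a2 * (4.0 + 5.0 * a2) + zero_prop * (2.0 + a2) ** 2
    return math.sqrt(num / (1.0 - zero_prop)) / (2.0 + a2)


def rzibs(m, alpha, beta, zero_prop, rng):
    """Draw m ZIBS values: zero with probability zero_prop, else Birnbaum-Saunders."""
    out = []
    for _ in range(m):
        if rng.random() < zero_prop:
            out.append(0.0)
        else:
            h = alpha * rng.gauss(0.0, 1.0) / 2.0
            out.append(beta * (h + math.sqrt(h * h + 1.0)) ** 2)
    return out


def gpq_theta_draws(y, n_draws=2000, seed=1):
    """Sorted GPQ draws for the CV of one ZIBS sample (needs >= 2 positives)."""
    rng = random.Random(seed)
    m = len(y)
    pos = [v for v in y if v > 0]
    m1 = len(pos)
    zero_hat = (m - m1) / m
    A = sum(1.0 / math.sqrt(v) for v in pos) / m1
    B = sum((1.0 / math.sqrt(v) - A) ** 2 for v in pos)
    C = sum(math.sqrt(v) for v in pos) / m1
    D = sum((math.sqrt(v) - C) ** 2 for v in pos)
    E1 = sum(pos)
    E2 = sum(1.0 / v for v in pos)
    draws = []
    while len(draws) < n_draws:
        # Lambda ~ t(m1 - 1), built from a normal and a chi-squared draw.
        chi = rng.gammavariate((m1 - 1) / 2.0, 2.0)
        lam = rng.gauss(0.0, 1.0) / math.sqrt(chi / (m1 - 1))
        om1 = (m1 - 1) * A * A - B * lam * lam / m1
        om2 = (m1 - 1) * A * C - (1.0 - A * C) * lam * lam
        om3 = (m1 - 1) * C * C - D * lam * lam / m1   # assumed constant term
        disc = om2 * om2 - om1 * om3
        if om1 == 0.0 or disc < 0.0:
            continue
        r1 = (om2 - math.sqrt(disc)) / om1
        r2 = (om2 + math.sqrt(disc)) / om1
        g_beta = max(r1, r2) if lam <= 0 else min(r1, r2)  # assumed root rule
        if g_beta <= 0.0:
            continue                                   # step b): regenerate Lambda
        k = rng.gammavariate(m1 / 2.0, 2.0)            # K ~ chi-squared(m1)
        quad = max(E2 * g_beta ** 2 - 2.0 * m1 * g_beta + E1, 0.0)
        g_alpha = math.sqrt(quad / (k * g_beta))
        w = rng.gauss(0.0, 1.0)
        g_zero = math.sin(math.asin(math.sqrt(zero_hat)) - w / (2.0 * math.sqrt(m))) ** 2
        if g_zero >= 1.0:
            continue
        draws.append(zibs_cv(g_alpha, g_zero))
    return sorted(draws)


def percentile_interval(sorted_draws, level=0.95):
    """Equal-tailed interval from sorted draws."""
    n = len(sorted_draws)
    lo = sorted_draws[int((1.0 - level) / 2.0 * (n - 1))]
    hi = sorted_draws[int((1.0 + level) / 2.0 * (n - 1))]
    return lo, hi
```

Rejected draws are simply regenerated, which mirrors step b) of the algorithm; the chi-squared and t variates are built from the standard library so that the sketch has no external dependencies.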
2.2. Method of variance estimates recovery
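The method of variance estimates recovery (MOVER) estimates a closed-form confidence interval. Let $\hat\omega_i$ be an unbiased estimator of $\omega_i$. Furthermore, let $[l_i,u_i]$ represent the $(1-\rho)100\%$ confidence interval for $\omega_i$, $i=1,2,\ldots,k$. Assume that $\sum_{i=1}^{k}c_i\omega_i$ is a linear combination of the parameters $\omega_i$, where the $c_i$ are constants. According to Zou, Huang, and Zhang [24], the lower and upper limits of the confidence interval for $\sum_{i=1}^{k}c_i\omega_i$ are defined by
$$L=\sum_{i=1}^{k}c_i\hat\omega_i-\sqrt{\sum_{i=1}^{k}\left[c_i\hat\omega_i-\min(c_il_i,c_iu_i)\right]^2}$$
and
$$U=\sum_{i=1}^{k}c_i\hat\omega_i+\sqrt{\sum_{i=1}^{k}\left[c_i\hat\omega_i-\max(c_il_i,c_iu_i)\right]^2}.$$
Considering Eq (7), the $(1-\rho)100\%$ CI for $\theta_i$ based on the GPQs is
$$[l_i,u_i]=\left[G_{\theta_i}(\rho/2),\,G_{\theta_i}(1-\rho/2)\right],\qquad(11)$$
where $G_{\theta_i}(\rho/2)$ and $G_{\theta_i}(1-\rho/2)$ represent the $100(\rho/2)$th and $100(1-\rho/2)$th percentiles of $G_{\theta_i}$, respectively. Hence, the $(1-\rho)100\%$ CI for the common CV of several ZIBS distributions employing the MOVER method is given by
$$L_{MOVER}=\sum_{i=1}^{k}c_i^{\#}\hat\theta_i-\sqrt{\sum_{i=1}^{k}\left[c_i^{\#}\hat\theta_i-\min(c_i^{\#}l_i,c_i^{\#}u_i)\right]^2}\qquad(12)$$
and
$$U_{MOVER}=\sum_{i=1}^{k}c_i^{\#}\hat\theta_i+\sqrt{\sum_{i=1}^{k}\left[c_i^{\#}\hat\theta_i-\max(c_i^{\#}l_i,c_i^{\#}u_i)\right]^2},\qquad(13)$$
where $c_i^{\#}=\eta_i\big/\sum_{j=1}^{k}\eta_j$ and $\eta_i=1/\hat V(\hat\theta_i)$.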
Algorithm 2 is used to construct the MOVER for the common coefficient of variation of several ZIBS distributions.
Algorithm 2.
1) Compute ˆαi and ˆϑi;
2) Compute ˆθi and ˆV(ˆθi);
3) Compute li and ui from Eq (11);
4) Compute LMOVER from Eq (12);
5) Compute UMOVER from Eq (13).
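Steps 4 and 5 of Algorithm 2 amount to combining per-sample limits with inverse-variance weights. The sketch below assumes the general MOVER form of Zou, Huang, and Zhang for a linear combination, in which each squared term is the distance between the weighted estimate and the nearer weighted limit; the function names are hypothetical, and the per-sample limits and variance estimates are taken as given inputs.

```python
import math


def inverse_variance_weights(variances):
    """Weights eta_i / sum(eta_j) with eta_i = 1 / variance_i."""
    eta = [1.0 / v for v in variances]
    total = sum(eta)
    return [e / total for e in eta]


def mover_limits(estimates, lowers, uppers, weights):
    """MOVER limits for sum(c_i * omega_i) from per-parameter limits."""
    center = sum(c * e for c, e in zip(weights, estimates))
    lower_sq = sum(
        (c * e - min(c * l, c * u)) ** 2
        for c, e, l, u in zip(weights, estimates, lowers, uppers)
    )
    upper_sq = sum(
        (c * e - max(c * l, c * u)) ** 2
        for c, e, l, u in zip(weights, estimates, lowers, uppers)
    )
    return center - math.sqrt(lower_sq), center + math.sqrt(upper_sq)


# Two samples with equal variances, so each weight is 0.5.
w = inverse_variance_weights([1.0, 1.0])
low, up = mover_limits([1.0, 2.0], [0.5, 1.5], [1.5, 2.5], w)
```

Because the weights are positive here, `min` and `max` always pick the lower and upper limit, respectively; they are kept so that the function also handles negative coefficients.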
2.3. Large sample approximation
Recall that the estimator of θi from Eq (1) is
and the estimated variance of ˆθi is
where $\hat\Psi_i=(2+\hat\alpha_i^2)^2(1-\hat\vartheta_i)\left[\hat\alpha_i^2(4+5\hat\alpha_i^2)+\hat\vartheta_i(2+\hat\alpha_i^2)^2\right]$. The large sample (LS) estimate of the CV for the ZIBS distribution is a pooled estimate, as described in Eq (3). Accordingly, the $(1-\rho)100\%$ CI for the common CV of several ZIBS distributions employing the LS method is as follows:
where $\eta_i=1/\hat V(\hat\theta_i)$.
Algorithm 3 is used to construct the LS for the common coefficient of variation of several ZIBS distributions.
Algorithm 3.
1) Compute ˆαi and ˆϑi;
2) Compute ˆθi and ˆV(ˆθi);
3) Compute LLS and ULS from Eq (14).
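A minimal sketch of step 3 is given below. It assumes that the LS interval is the usual Wald-type interval around the inverse-variance pooled estimate, with standard error $\sqrt{1/\sum_i\eta_i}$; this form is an assumption made for illustration. The function name is hypothetical, and the per-sample estimates and variance estimates are taken as inputs.

```python
import math
from statistics import NormalDist


def large_sample_ci(theta_hats, var_hats, level=0.95):
    """Wald-type interval around the inverse-variance pooled estimate."""
    eta = [1.0 / v for v in var_hats]                 # eta_i = 1 / V_hat(theta_hat_i)
    pooled = sum(e * t for e, t in zip(eta, theta_hats)) / sum(eta)
    std_err = math.sqrt(1.0 / sum(eta))               # assumed standard error
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return pooled - z * std_err, pooled + z * std_err


# Two samples with estimates 2 and 4 and unit variances: the pooled estimate is 3.
ls_low, ls_up = large_sample_ci([2.0, 4.0], [1.0, 1.0])
```

Samples with smaller estimated variance receive larger weight, so a single noisy station cannot dominate the pooled estimate.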
2.4. Bootstrap confidence interval
Efron [25] introduced the bootstrap method, which involves repeated resampling of existing data. According to Lemonte, Simas, and Cribari-Neto [26], the constant-bias-correcting parametric bootstrap is the most efficient method for reducing bias. As a result, we used it to estimate the confidence interval for $\theta$. Assuming that there are $D$ bootstrap samples available, the series of estimates of $\alpha_i$ for those samples can be computed and is denoted by $\hat\alpha^{\#}_{i1},\hat\alpha^{\#}_{i2},\ldots,\hat\alpha^{\#}_{iD}$. Here, $\hat\alpha^{\#}_{ir}$ is the sequence of bootstrap maximum likelihood estimates (MLEs) of $\alpha_i$ for $i=1,2,\ldots,k$ and $r=1,2,\ldots,D$. The MLE can be calculated using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton nonlinear optimization algorithm. The bias of the estimator $\hat\alpha_i$ is defined as
and then the bootstrap expectation $E(\hat\alpha_i)$ can be approximated by the mean $\hat\alpha_i^{/}=\frac{1}{D}\sum_{r=1}^{D}\hat\alpha^{\#}_{ir}$. As a result, the bootstrap bias estimate for $D$ replications of $\hat\alpha_i$ is derived as $\hat D(\hat\alpha_i,\alpha_i)=\hat\alpha_i^{/}-\hat\alpha_i$. According to MacKinnon and Smith [27], the corrected estimate for $\hat\alpha^{\#}_i$ is obtained by applying the bootstrap bias estimate, which is
Let ˆϑ#i be observed values of ˆϑi based on bootstrap samples. In accordance with Brown, Cai, and DasGupta [28], the bootstrap estimator of ϑi is given by
By using Eqs (15) and (16), the bootstrap estimators of θi and the variance of ˆθi can be written as
and
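where $\hat\Psi_i^{*}=(2+(\hat\alpha_i^{*})^2)^2(1-\hat\vartheta_i^{*})\left[(\hat\alpha_i^{*})^2(4+5(\hat\alpha_i^{*})^2)+\hat\vartheta_i^{*}(2+(\hat\alpha_i^{*})^2)^2\right]$. Now, the common $\theta$ based on the $k$ individual samples is obtained by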
Consequently, the (1−ρ)100% CI for the common CV of several ZIBS distributions employing the bootstrap confidence interval (BCI) method is provided by
where $\hat\theta^{*}(\rho/2)$ and $\hat\theta^{*}(1-\rho/2)$ denote the $100(\rho/2)$th and $100(1-\rho/2)$th percentiles of $\hat\theta^{*}$, respectively.
Algorithm 4 is used to construct the BCI for the common coefficient of variation of several ZIBS distributions.
Algorithm 4.
For b=1 to n :
1) At the q step:
a) Generate $y^{*}_{ij}$ with replacement from $y_{ij}$, where $i=1,2,\ldots,k$ and $j=1,2,\ldots,m_i$;
b) Compute ˆα/i and ˆD(ˆαi,αi);
c) Compute ˆα∗i from Eq (15);
d) Generate ˆϑ∗i from Eq (16);
e) Compute ˆθ∗i from Eq (17);
f) Compute ˆV∗(ˆθi) from Eq (18);
g) Compute ˆθ∗ from Eq (19).
End b loop.
2) Repeat step 1, a total of B times;
3) Compute LBCI and UBCI from Eq (20).
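The resampling and bias-correction steps of Algorithm 4 can be sketched for a single sample as follows. Several simplifications are assumptions made for illustration: the shape parameter is estimated with the modified moment estimator instead of a BFGS-based MLE, each bootstrap replicate is corrected by subtracting the estimated bias (the exact form of Eq (15) may differ), and the bootstrap proportion of zeros is the raw proportion in each resample rather than the estimator of Eq (16). All function names are hypothetical.

```python
import math
import random


def moment_alpha(pos):
    """Modified moment estimator of the shape parameter from positive values."""
    s = sum(pos) / len(pos)                       # arithmetic mean
    r = len(pos) / sum(1.0 / v for v in pos)      # harmonic mean
    return math.sqrt(2.0 * max(math.sqrt(s / r) - 1.0, 0.0))


def zibs_cv(alpha, zero_prop):
    """CV of a ZIBS distribution as a function of shape and zero proportion."""
    a2 = alpha * alpha
    num = a2 * (4.0 + 5.0 * a2) + zero_prop * (2.0 + a2) ** 2
    return math.sqrt(num / (1.0 - zero_prop)) / (2.0 + a2)


def bootstrap_cv_interval(y, n_boot=500, level=0.95, seed=1):
    """Percentile interval for the CV of one sample with a bias-shifted shape."""
    rng = random.Random(seed)
    m = len(y)
    alpha_hat = moment_alpha([v for v in y if v > 0])
    reps = []
    for _ in range(n_boot):
        resample = [rng.choice(y) for _ in range(m)]   # sample with replacement
        pos = [v for v in resample if v > 0]
        if len(pos) < 2:
            continue                                   # skip degenerate resamples
        reps.append((moment_alpha(pos), (m - len(pos)) / m))
    # Bootstrap bias estimate: mean of the replicates minus the original estimate.
    bias = sum(a for a, _ in reps) / len(reps) - alpha_hat
    # Assumed correction: shift every replicate by the estimated bias.
    thetas = sorted(zibs_cv(max(a - bias, 0.0), z) for a, z in reps)
    n = len(thetas)
    lo = thetas[int((1.0 - level) / 2.0 * (n - 1))]
    hi = thetas[int((1.0 + level) / 2.0 * (n - 1))]
    return lo, hi
```

For the common CV, the same replicates would be computed for each of the $k$ samples and pooled with inverse-variance weights before taking percentiles.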
2.5. Fiducial generalized confidence interval
Hannig [29,30] introduced the concept of the generalized fiducial distribution by assuming a functional relationship $R_j=Q_j(\delta,U)$ for $j=1,2,\ldots,m$, where $Q=(Q_1,\ldots,Q_m)$ are the structural equations. Assume that $U=(U_1,\ldots,U_m)$ are independent and identically distributed samples from the uniform distribution $U(0,1)$ and that the parameter $\delta\in\Xi\subseteq\mathbb{R}^p$ is $p$-dimensional. Then, the generalized fiducial distribution is absolutely continuous with a density
where L(r,δ) represents the joint likelihood function of the observed data and
where $\frac{d}{dr}Q^{-1}(r,\delta)$ and $\frac{d}{d\delta}Q^{-1}(r,\delta)$ are $m\times m$ and $m\times p$ Jacobian matrices, respectively. In addition, Hannig [29] deduced that if the sample $r$ was independently and identically distributed from an absolutely continuous distribution with cumulative distribution function $F_\delta(r)$, then $Q^{-1}=(F_\delta(R_1),\ldots,F_\delta(R_m))$. Let $Z_{ij}$, $i=1,2,\ldots,k$, $j=1,2,\ldots,m_{i(1)}$, be a random sample drawn from the Birnbaum-Saunders distribution. The likelihood function can be written as
Therefore, from Eq (21), the generalized fiducial distribution of $(\alpha_i,\beta_i)$ is
where
as obtained by Li and Xu [31]. Let α#i and β#i be the generalized fiducial samples for αi and βi, respectively. According to Li and Xu [31], the adaptive rejection Metropolis sampling (ARMS) method was used to obtain the fiducial estimates of αi and βi from the generalized fiducial distribution. Thus, the calculation of α#i and β#i can be implemented using the function arms in the package dlm of R software. Additionally, Hannig [29] recommended methods for estimating the fiducial generalized pivotal quantities for binomial proportion ϑi, with simulation results indicating that the best option is the mixture distribution of two beta distributions with weight ½, which is
Now, the approximate fiducial generalized pivotal quantities for $\theta_i$ and the variance of $\hat\theta_i$ can be computed by
and
where $\Psi_i^{\#}=(2+(\alpha_i^{\#})^2)^2(1-\vartheta_i^{\#})\left[(\alpha_i^{\#})^2(4+5(\alpha_i^{\#})^2)+\vartheta_i^{\#}(2+(\alpha_i^{\#})^2)^2\right]$. As a result, the common $\theta$ based on the $k$ individual samples is calculated as
The (1−ρ)100% confidence interval for the common CV of several ZIBS distributions employing the fiducial generalized confidence interval (FGCI) method is obtained by
where $\theta^{\#}(\rho/2)$ and $\theta^{\#}(1-\rho/2)$ denote the $100(\rho/2)$th and $100(1-\rho/2)$th percentiles of $\theta^{\#}$, respectively.
Algorithm 5 is used to construct the FGCI for the common coefficient of variation of several ZIBS distributions.
Algorithm 5.
For g=1 to n :
1) Generate G samples of αi and βi by using the arms function in the dlm package of R software;
2) Burn-in F samples (the number of remaining samples is G−F);
3) Thin the samples by applying sampling lag L>1, and the final number of samples is G'=(G−F)/L. Because the generated samples are not independent, we must reduce the autocorrelation by thinning them;
4) Generate ϑ#i from Eq (22);
5) Compute θ#i and V#(ˆθi) from Eqs (23) and (24), respectively;
6) Compute θ# from Eq (25);
End g loop.
7) Repeat steps 1–6, a total of G times;
8) Compute LFGCI and UFGCI from Eq (26).
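Two of the steps in Algorithm 5 are easy to illustrate without the ARMS sampler itself: the burn-in and thinning of a generated chain (steps 2 and 3) and the draw of the fiducial quantity for the proportion of zeros (step 4). The sketch below assumes that the equal-weight mixture in Eq (22) is $\tfrac12\,\mathrm{Beta}(m_{i(0)},m_{i(1)}+1)+\tfrac12\,\mathrm{Beta}(m_{i(0)}+1,m_{i(1)})$, which is the usual fiducial form for a binomial proportion; this parameterization and the function names are assumptions, and the sampling of $\alpha_i$ and $\beta_i$ is not reproduced.

```python
import random


def burn_and_thin(chain, burn_in, lag):
    """Drop the first burn_in draws, then keep every lag-th draw."""
    return chain[burn_in::lag]


def fiducial_zero_proportion(m0, m1, n_draws, seed=1):
    """Draws from an equal-weight mixture of two beta distributions.

    m0 and m1 are the numbers of zero and positive observations (both >= 1).
    """
    if m0 < 1 or m1 < 1:
        raise ValueError("both counts must be at least 1")
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        if rng.random() < 0.5:
            draws.append(rng.betavariate(m0, m1 + 1))
        else:
            draws.append(rng.betavariate(m0 + 1, m1))
    return draws


# A chain of 1000 draws with 200 burn-in draws and lag 4 leaves (1000 - 200) / 4 = 200.
kept = burn_and_thin(list(range(1000)), burn_in=200, lag=4)
zero_draws = fiducial_zero_proportion(m0=30, m1=70, n_draws=4000)
```

Thinning trades sample size for lower autocorrelation, which is why the final number of draws is $(G-F)/L$ rather than $G-F$.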
3. Simulation results and discussion
To evaluate the performance of the proposed methods, Monte Carlo simulations were conducted in R under various scenarios with different sample sizes, proportions of zeros, and shape parameters, as shown in Table 1. The scale parameter was fixed at 1.0 in all scenarios. Each scenario used 1000 Monte Carlo replications, with 3000 generalized computations for the GCI and FGCI and 500 bootstrap resamples for the BCI. A method was considered to perform well if its coverage probability (CP) was greater than or equal to the nominal confidence level of 0.95 and, among such methods, if it had the narrowest average width (AW). Algorithm 6 shows the computational steps used to estimate the coverage probability and average width of all the methods.
The simulation results for k = 3 are shown in Table 2 and Figure 1. The coverage probabilities of the GCI method are greater than the nominal confidence level of 0.95 in almost all scenarios, while those of the MOVER method are close to the nominal level when the proportions of zeros all equal 0.1 (denoted $0.1_3$). The coverage probabilities of the BCI method are close to the target, especially when the sample size is large. The LS and FGCI methods provide coverage probabilities lower than 0.95 in all scenarios. In terms of average width, the LS and MOVER methods have narrower confidence intervals than the other methods in most scenarios. However, the coverage probabilities of both methods are less than 0.95 in almost all scenarios, so they do not meet the requirements. Among the remaining methods, the GCI method has the shortest average width in all scenarios studied, while the BCI method has the widest.
The simulation results for k = 5 are shown in Table 3 and Figure 2. The coverage probabilities of the LS and BCI methods are close to the nominal confidence level of 0.95 in almost all scenarios. In contrast, the MOVER and FGCI methods have values below the specified target. For the GCI method, the coverage probability meets the target when the proportions of zeros are unequal. In terms of the average width, the confidence interval of the MOVER method is the narrowest. However, this method has a coverage probability lower than 0.95 in all scenarios, thus failing to meet the criteria. The LS and BCI methods have the widest confidence intervals compared to the other methods.
The simulation results for k = 10 are shown in Table 4 and Figure 3. In almost all scenarios, the LS and BCI methods have coverage probabilities greater than or close to 0.95, except when the proportions of zeros all equal 0.5 (denoted $0.5_{10}$). Both methods have wider average widths than the other methods. The GCI method has coverage probabilities close to 0.95 when the sample size is large and the shape parameters are equal to 2.0. Although the MOVER method has the narrowest average width, its coverage probabilities are lower than 0.95 in all scenarios.
Figures 1–3 exhibit similar patterns in the average width. As the sample size increases, the average widths of all the proposed methods tend to decrease. Similarly, as the shape parameter increases, the average widths of all the proposed methods tend to decrease. Conversely, when the proportion of zeros increases, the average widths of all the proposed methods tend to increase.
Algorithm 6.
For a given (m1,m2,...,mk), (α1,α2,...,αk), (ϑ1,ϑ2,...,ϑk), and β1=β2=...=βk=1,
for r=1 to M
1) Generate sample from the ZIBS distribution;
2) Compute the unbiased estimates ˆαi and ˆϑi;
3) Compute the 95% confidence intervals for θ based on the GCI, MOVER, LS, BCI, and FGCI via Algorithms 1–5, respectively;
4) If [Lr≤θ≤Ur], set Dr=1; else set Dr=0;
End r loop.
5) The coverage probability and average width for each method are obtained by $CP=\frac{1}{M}\sum_{r=1}^{M}D_r$ and $AW=\frac{1}{M}\sum_{r=1}^{M}(U_r-L_r)$, where $U_r$ and $L_r$ are the upper and lower confidence limits in replication $r$, respectively.
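Steps 1, 4, and 5 of Algorithm 6 can be sketched as follows. The generator uses the standard normal representation of the Birnbaum-Saunders distribution, $Y=\beta\left(\alpha Z/2+\sqrt{(\alpha Z/2)^2+1}\right)^2$ with $Z\sim N(0,1)$, and the function names are hypothetical. The interval limits are taken as inputs, so any of the five methods can supply them.

```python
import math
import random


def rzibs(m, alpha, beta, zero_prop, rng):
    """Draw m ZIBS values: zero with probability zero_prop, else Birnbaum-Saunders."""
    sample = []
    for _ in range(m):
        if rng.random() < zero_prop:
            sample.append(0.0)
        else:
            h = alpha * rng.gauss(0.0, 1.0) / 2.0
            sample.append(beta * (h + math.sqrt(h * h + 1.0)) ** 2)
    return sample


def coverage_and_width(lowers, uppers, theta):
    """Coverage probability and average width over M simulated intervals."""
    M = len(lowers)
    hits = sum(1 for l, u in zip(lowers, uppers) if l <= theta <= u)
    avg_width = sum(u - l for l, u in zip(lowers, uppers)) / M
    return hits / M, avg_width


# Three toy intervals; only the second one contains theta = 2.5.
cp, aw = coverage_and_width([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], theta=2.5)
```

In a full study, the lists of limits would hold one entry per Monte Carlo replication, and `theta` would be the true common CV implied by the scenario's parameters.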
4. An empirical application
In this study, we use wind speed data from all directions to construct CIs for the common coefficient of variation of several ZIBS distributions. The data were collected from January 1 to 7, 2024, from three weather stations: Chanthaburi Weather Observing Station in Chanthaburi Province, Chumphon Weather Observing Station in Chumphon Province, and Songkhla Weather Observing Station in Songkhla Province. These three stations were selected because of their proximity to the Gulf of Thailand, which makes them directly influenced by sea breezes and tropical storms. This results in high wind speed fluctuations and also impacts the livelihoods, economy, and environment of the surrounding communities. All data were collected by the Thai Meteorological Department and are presented in Table 5 (Thai Meteorological Department Automatic Weather System, https://www.tmd.go.th/service/tmdData). To visualize the data distribution, we plotted histograms of wind speed data from all three stations, as shown in Figure 4. Table 6 provides statistical summaries for wind speed data at each station, revealing that the coefficients of variation of the wind speed data for the Chanthaburi Weather Observing Station, Chumphon Weather Observing Station, and Songkhla Weather Observing Station are 2.6799, 2.5111, and 2.7118, respectively. The entire wind speed dataset is a mixture of zero values (no wind) and positive values. For the positive values, we evaluate the suitability of candidate distributions using the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), calculated as $AIC=-2\ln(L)+2p$ and $BIC=-2\ln(L)+p\ln(o)$, respectively, where $p$ is the number of parameters estimated, $o$ is the number of observations, and $L$ is the likelihood function. From Table 7, it is evident that the Birnbaum-Saunders distribution has the lowest AIC and BIC values among the distributions considered, indicating that it provides the best fit for the positive wind speed data.
Additionally, to confirm that the positive wind speed data follow the Birnbaum-Saunders distribution, we plotted the cumulative distribution function (CDF) derived from the positive wind speed data and the estimated CDF from the Birnbaum-Saunders distribution. As shown in Figure 5, the two curves are similar, indicating a good fit. Therefore, the wind speed data comprise both positive and zero values and follow the ZIBS distribution. This distribution was thus used to compute the CIs for the common coefficient of variation of the wind speed data. Table 8 presents the 95% confidence intervals for the common coefficient of variation of wind speed data from the three weather observing stations using the GCI, MOVER, LS, BCI, and FGCI methods. We compared the wind speed data with the simulation scenario using sample sizes $m_i=100$, shape parameters $\alpha_i=2.5$, and proportions of zeros $\vartheta_i=0.5$ for all three samples (written $100_3$, $2.5_3$, and $0.5_3$), as shown in Table 2. The simulation results for this scenario indicate that the GCI and BCI methods meet the criterion of coverage probability greater than or equal to the nominal confidence level of 0.95. When considering the average width, the GCI method provides the narrowest confidence interval. The results in Table 8 show that the confidence interval for the common coefficient of variation for the wind speed data using the GCI method is [2.5001, 2.7224], with a confidence interval width of 0.2224, which is the narrowest among all methods. The method that is appropriate for the wind speed data is therefore consistent with the simulation results.
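The model comparison by AIC and BIC can be illustrated with a short sketch. It uses the standard definitions $AIC=-2\ln L+2p$ and $BIC=-2\ln L+p\ln o$ together with the two-parameter Birnbaum-Saunders log-likelihood; the function names are hypothetical, and the parameter values passed in would normally be fitted estimates for the positive wind speeds.

```python
import math


def bs_loglik(data, alpha, beta):
    """Log-likelihood of positive data under the Birnbaum-Saunders distribution."""
    total = 0.0
    for y in data:
        total += (
            -math.log(2.0 * alpha * beta * math.sqrt(2.0 * math.pi))
            + math.log(math.sqrt(beta / y) + (beta / y) ** 1.5)
            - (y / beta + beta / y - 2.0) / (2.0 * alpha ** 2)
        )
    return total


def aic(loglik, n_params):
    """Akaike Information Criterion: smaller values indicate a better fit."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: penalizes parameters by log(n_obs)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Toy positive wind speeds with illustrative (not fitted) parameter values.
speeds = [0.5, 1.0, 1.5, 2.0, 4.0]
ll = bs_loglik(speeds, alpha=1.0, beta=1.5)
scores = aic(ll, 2), bic(ll, 2, len(speeds))
```

The same two functions apply to any candidate distribution once its maximized log-likelihood is available, which is how the candidates in a comparison table would be ranked.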
5. General discussion
Based on the study results, the GCI demonstrates good performance in almost all scenarios, as its coverage probability is greater than or close to the nominal confidence level of 0.95, which is consistent with the previous research by Ye, Ma, and Wang [32], Thangjai, Niwitpong, and Niwitpong [33], and Janthasuwan, Niwitpong, and Niwitpong [11]. When k is large, both LS and BCI perform well. In most scenarios, MOVER and FGCI have coverage probabilities below acceptable levels, indicating that these methods may not be suitable for many situations, which aligns with the previous research by Puggard, Niwitpong, and Niwitpong [17]. Considering the average width, the average widths of all proposed methods tend to decrease as the sample size and shape parameters increase, which improves their efficiency. Conversely, when the proportion of zeros increases, the average widths of all proposed methods tend to increase, leading to reduced efficiency. In our case, the simulation results showed that the MOVER method provided the narrowest confidence intervals for most scenarios and performed well with small sample sizes combined with a low proportion of zeros. However, the MOVER method yielded coverage probabilities lower than the specified confidence level in almost all scenarios. Similarly, the FGCI method achieves a coverage probability close to the specified confidence level only in scenarios with a low proportion of zeros. This could be attributed to certain weaknesses that affect the fiducial generalized pivotal quantities for the proportion of zeros. Additionally, the issues with both the MOVER and FGCI methods likely arise from the upper and lower bounds for zero values used in constructing the confidence intervals and their combined effect with other parameters, which results in insufficient coverage probability. Finally, wind energy is a vital, renewable source of power that is generated by capturing the kinetic energy of wind. However, fluctuations in wind speed can introduce uncertainty. According to Lee, Fields, and Lundquist [34], understanding these variations is crucial for assessing wind resource potential.
6. Conclusions
This article presents an estimation of the common coefficient of variation of several ZIBS distributions. The methods proposed are GCI, MOVER, LS, BCI, and FGCI. The performance of each method was evaluated through Monte Carlo simulations by comparing coverage probabilities and average widths. For k = 3, the simulation results support the GCI method because of its acceptable coverage probability and narrow confidence intervals in almost all scenarios, while the BCI method is another option for large sample sizes. For k = 5, we recommend the GCI method when the $\vartheta_i$ are unequal, the LS method when $\vartheta_i$ is small and the sample size is large, and the BCI method when $\vartheta_i$ is large. For k = 10, we recommend the BCI method for small to medium sample sizes and the GCI method for large sample sizes. Additionally, in all cases (k = 3, 5, and 10), the MOVER method has the narrowest confidence intervals but a coverage probability below the acceptable level in most situations, and the FGCI method has a coverage probability below the acceptable level in almost all situations. Therefore, these two methods are not recommended. Finally, all the proposed methods were applied to wind speed data in Thailand and yielded results consistent with the simulation findings. In future research, we will explore new methods for constructing confidence intervals, potentially using Bayesian and highest posterior density (HPD) approaches to enhance their effectiveness. Additionally, we will use other real-world data to conduct a more comprehensive study.
Author contributions
Usanee Janthasuwan conducted the data analysis, drafted the initial manuscript, and contributed to the writing. Suparat Niwitpong developed the research framework, designed the experiment, and reviewed the manuscript. Sa-Aat Niwitpong provided analytical methodologies, validated the final version, and obtained funding.
Use of Generative-AI tools declaration
The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
The authors would like to express their sincere gratitude to the editor and reviewers for their valuable comments and suggestions, which have significantly improved the quality of the manuscript. This research was funded by the King Mongkut's University of Technology North Bangkok, contract no: KMUTNB-68-KNOW-17.
Conflict of interest
The authors declare no conflict of interest.