
Estimation of Shannon entropy of the inverse exponential Rayleigh model under progressively Type-Ⅱ censored test

  • In this paper, we investigate the problem of entropy estimation for the inverse exponential Rayleigh (IER) distribution under progressively Type-Ⅱ censored samples. For the Shannon entropy of the IER distribution, we study maximum likelihood estimation, Bayesian estimation, and interval estimation. Under the Bayesian approach, the Shannon entropy is discussed under three different loss functions: the weighted squared error loss function, the K-loss function, and the precautionary loss function. The Bayesian estimates of the Shannon entropy under these three loss functions are computed using the Lindley approximation and hybrid Gibbs sampling. Finally, we calculate the estimated values, mean squared errors, and variances of the estimators through Monte Carlo simulations, using the mean squared errors for a comparative analysis of the estimation methods. We then apply the estimation methods to a real dataset.

    Citation: Haiping Ren, Ziwen Zhang, Qin Gong. Estimation of Shannon entropy of the inverse exponential Rayleigh model under progressively Type-Ⅱ censored test[J]. AIMS Mathematics, 2025, 10(4): 9378-9414. doi: 10.3934/math.2025434




    Statistical inference for the entropy of distribution functions has attracted great attention in recent years. For instance, Ren et al. [1] systematically explored entropy estimation for the generalized Rayleigh distribution under progressively Type-Ⅱ censored samples. Maiti et al. [2] studied the estimation of Shannon entropy (SE) and Rényi entropy of generalized exponential distributions under progressive censoring tests. Xu and Gui [3] investigated entropy estimation for inverse Weibull distributions under adaptive Type-Ⅱ progressive hybrid censoring schemes, thus demonstrating the robustness of these methods across different models. Shrahili et al. [4] studied the entropy estimation of log-logistic distributions under progressive Type-Ⅱ censoring.

    The inverse exponential Rayleigh (IER) distribution plays a pivotal role in reliability analysis, with its theoretical significance and practical applications garnering increasing attention from scholars in recent years. Maurya et al. [5] conducted research on the Bayesian estimation (BAE) of the IER distribution under asymmetric loss functions. Yousef and Hassan [6] applied Bayesian estimation with Markov chain Monte Carlo (MCMC) methods to the system reliability of the inverted Topp–Leone distribution under ranked set sampling, highlighting the effectiveness of these techniques in complex scenarios. Mutairi et al. [7] further discussed BAE and prediction for the exponentiated inverted Topp–Leone distribution, using simulation to compare the advantages and disadvantages of several methods across different samples. Due to the IER model's suitability for independent stress and strength random variables, Panahi [8] proposed using this model to study the efficiency of display systems and obtained Bayesian estimates using the Metropolis-Hastings-Gibbs algorithm. Panahi et al. [9] used the maximum likelihood (ML) method to estimate the unknown parameters of the IER distribution and evaluated them using MCMC techniques. Rastogi et al. [10] employed various methods, including the Tierney-Kadane approximation, to obtain Bayesian estimates under the IER distribution and verified their reliability. Wang et al. [11] conducted in-depth research on the existence and uniqueness of the parameter estimates for the IER distribution and compared the performance of different methods using simulation studies. Al-Kadim et al. [12] found that the concept of length-biased distributions is well-suited for lifetime data models. They introduced a new class of length-biased weighted exponential and Rayleigh distributions and conducted a comprehensive study of their statistical properties. Their findings demonstrated that the new length-biased weighted Rayleigh distribution can effectively replace other well-known models, thus providing valuable insights for practitioners in applied sciences. Kim [13] discussed the estimation of the reliability of a mixed model using a combination of exponential and Rayleigh distributions under constant truncation and employed an imputation method to estimate the model's reliability. Shuvashree et al. [14] recently proposed a joint Type-Ⅱ generalized progressive hybrid censoring scheme, which enhances flexibility in handling complex lifetime data. Almongy and Alshenawy [15] applied similar censoring methods to Weibull extended distributions in transformer insulation, thus highlighting their practical utility in engineering contexts.

    The IER distribution has numerous practical applications in reliability assessment and lifetime evaluation across various fields. In the field of medicine, Hashem et al. [16] examined the estimation of the shape parameter of the IER distribution under Type-Ⅰ censored mixture data using real patient data they collected; they conducted an in-depth study of estimation based on various methods, including ML estimation. Sriramachandran et al. [17] investigated an acceptance sampling plan for censored life tests to determine whether the lifespan of a product follows an IER distribution. They applied appropriate methods to test different quality levels, obtained the corresponding operating characteristic values, and compared the results with those for a standard inverse Rayleigh distribution. Kayal et al. [18] noted that the IER distribution could find applications in modeling high-resolution sea clutter and developing constant false alarm rate (CFAR) detection schemes, frequency distributions of energy release from earthquakes, biological models, modeling light-variation phenomena in stars, etc. Anwar et al. [19] investigated the application of the IER distribution to stress-strength reliability assessment. Chalabi et al. [20] proposed and used a composite model with an IER distribution as a texture component for developing a CFAR detection scheme. Wang et al. [21] addressed generalized Type-Ⅱ progressive hybrid censoring under competing risks, thus expanding the applicability of IER models in multi-failure scenarios. Muhammed and Muhammed [22] compared Bayesian and non-Bayesian estimation for binary inverse Weibull distributions under progressive censoring, thus offering insights into parameter inference strategies. Raqab [23] provided a statistical framework to discriminate between generalized Rayleigh and Weibull distributions, which aids in selecting appropriate models for lifetime data. Fan et al. [24] investigated parameter estimation for the IER distribution under progressive first-failure censored samples. Gao et al. [25] proposed a pivotal quantity inference method to estimate the parameters of an IER distribution based on progressively censored data; using pivotal quantity methods, they also studied the Bayesian point estimation of the IER distribution. Ma et al. [26] discussed the use of ML and Bayesian estimation methods to estimate the parameters of the IER distribution when the data are mixed Type-Ⅰ censored. Based on the above analysis, we found that researchers in various fields have applied different methods to compare and examine the advantages and disadvantages of this distribution from multiple perspectives. However, there has been no scholarly research on the entropy of the IER distribution.

    Let X be a random variable following an IER distribution with parameters ω and υ, denoted by IER(ω, υ). The probability density function (PDF) is as follows:

    f(x; ω, υ) = 2ωυx⁻³e^(−υ/x²)(1 − e^(−υ/x²))^(ω−1), x > 0. (1)

    The cumulative distribution function (CDF) is as follows:

    F(x; ω, υ) = 1 − (1 − e^(−υ/x²))^ω, x > 0. (2)

    Here, ω > 0 and υ > 0 are the shape and scale parameters, respectively.

    The survival function (SF) for IER(ω, υ) is as follows:

    S(x; ω, υ) = (1 − e^(−υ/x²))^ω, x > 0.

    The hazard function (HF) of IER(ω, υ) is as follows:

    h(x; ω, υ) = f(x; ω, υ)/S(x; ω, υ) = 2ωυx⁻³e^(−υ/x²)/(1 − e^(−υ/x²)), x > 0.
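    Curves such as those in Figures 1 and 2 can be reproduced with a minimal plotting sketch; the formulas below are the reconstructed PDF, CDF, SF, and HF given above, and the parameter values ω = 2, υ = 1 are arbitrary illustrative choices rather than values from the paper:

```python
import numpy as np
import matplotlib.pyplot as plt

omega, upsilon = 2.0, 1.0          # illustrative parameter values
x = np.linspace(0.05, 5.0, 500)

u = 1.0 - np.exp(-upsilon / x**2)  # u(v, x) = 1 - exp(-v/x^2)
pdf = 2.0 * omega * upsilon * x**-3 * np.exp(-upsilon / x**2) * u**(omega - 1)
cdf = 1.0 - u**omega
sf = u**omega
hf = pdf / sf                      # hazard = pdf / survival

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
for ax, y, title in zip(axes.ravel(), [pdf, cdf, sf, hf], ["PDF", "CDF", "SF", "HF"]):
    ax.plot(x, y)
    ax.set_title(title)
    ax.set_xlabel("x")
plt.tight_layout()
plt.show()
```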

    Figure 1 depicts how the PDF and CDF vary across different parameter values. Each PDF curve is unimodal, with a clear peak followed by a long right tail, while the CDF curves are strictly increasing.

    Figure 1.  Curves of PDF (a) and CDF (b) for the IER distribution with parameters ω and υ.

    From Figure 2, we can observe that the SF curve is always decreasing, while the HF curve is unimodal: it rises rapidly with x and then gradually decreases toward a plateau.

    Figure 2.  Curves of SF (a) and HF (b) for the IER distribution with parameters ω and υ.

    Figures 1 and 2 show the PDF, CDF, SF, and HF curves of the IER distribution. These plots are key to understanding its entropy features. The PDF in Figure 1 gives an intuitive picture of how the probability mass is distributed, which underlies the entropy calculation. The CDF reflects the cumulative probability of the data. The SF reveals the tail behavior through its rate of decay, which is important for entropy because the tail contributes substantially to it. The HF shows the failure pattern over time, which further clarifies the entropy behavior. Overall, these figures visually support our understanding of the entropy properties of the IER distribution.

    In a more comprehensive analysis, we observe an intriguing dynamic pattern in the hazard rate function of the distribution. The function starts at a certain level and gradually increases over time until it reaches a peak; after this peak, it exhibits a clear downward trend, indicating that the risk decreases in subsequent time periods relative to the initial point. This rise and fall is a fundamental characteristic of the HF of this distribution. By examining this dynamic process, we can better understand and predict the level of risk at different points in time, which provides valuable guidance for risk assessment. This behavior makes the model suitable for describing the early failure behavior of complex systems and components, such as failures of mechanical equipment or electrical components due to increased infant mortality or fatigue during continued use. In this way, the model provides engineers with a powerful tool to evaluate risk management strategies in product design, helping to ensure that products can withstand the expected wear and aging conditions and minimize possible future safety hazards. Compared with other common models or distributions, Ghitany et al. [27] pointed out that the IER distribution has greater potential for reliably modeling extensive datasets.

    This paper investigates the SE estimation problem of IER(ω, υ). Section 2 introduces the concept of SE and derives the SE of IER distribution. Section 3 discusses the ML estimation and asymptotic confidence interval estimation of the SE, including the derivation process and results. In Section 4, the BAE of the SE is analyzed based on the three given loss functions. Section 5 uses a Monte Carlo simulation to obtain estimates under various estimation methods and compares the performance of various estimation methods. Section 6 uses real data to validate the feasibility of the estimation method in practice. Finally, Section 7 analyzes the conclusions obtained.

    The concept of SE was first proposed by Shannon in the 1940s and has become a central issue in information theory research. SE is a central concept in information theory and involves various theoretical aspects. Essentially, SE quantifies intangible information as tangible data, thus reducing uncertainty by collecting information, where the level of uncertainty can be compared through probability. For instance, in a random event with multiple possible outcomes, the greater the uncertainty of the result, the more information it carries. Additionally, SE possesses characteristics such as non-negativity, symmetry, and scalability, which provide a solid foundation for the development of information theory and have widespread applications across various fields. The expression of SE is as follows:

    SE(f) = −∫₀^∞ f(x) ln f(x) dx. (3)
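    As a quick numerical check on the entropy expression derived below, SE can be evaluated directly from Eq (3) by quadrature. The following minimal sketch assumes the reconstructed IER pdf of Eq (1); the parameter values and function names are our own illustrative choices:

```python
import numpy as np
from scipy.integrate import quad

def ier_pdf(x, omega, upsilon):
    """Reconstructed IER(omega, upsilon) pdf from Eq (1)."""
    u = 1.0 - np.exp(-upsilon / x**2)
    return 2.0 * omega * upsilon * x**-3 * np.exp(-upsilon / x**2) * u**(omega - 1)

def shannon_entropy(omega, upsilon):
    """SE(f) = -integral of f(x) ln f(x) over (0, inf), per Eq (3)."""
    def integrand(x):
        f = ier_pdf(x, omega, upsilon)
        return -f * np.log(f) if f > 0.0 else 0.0   # guard against log(0) near x = 0
    val, _ = quad(integrand, 0.0, np.inf, limit=200)
    return val

print(shannon_entropy(2.0, 1.5))   # entropy for one illustrative (omega, upsilon)
```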

    Upon reviewing existing research findings, it is evident that academia has made significant progress in estimating entropy functions for various distribution types. These advancements not only offer new theoretical frameworks for data analysis but also enhance our understanding of the underlying principles governing complex phenomena. Xu and Gui [3] discussed the entropy estimation problem of the two-parameter inverse Weibull distribution under different loss functions using ML and Bayesian estimation methods; additionally, they tested the effectiveness of the estimates and validated the conclusions with real data. This topic holds great importance in numerous scientific research fields, including clinical medicine, epidemiology, and industrial engineering. In a recent study, Shrahili et al. [28] delved into the entropy of log-logistic distributions under progressively censored tests. By developing theoretical models and employing mathematical methods, they meticulously analyzed the impact of various parameter choices on the estimations.

    Raqab et al. [29] conducted an in-depth analysis and estimation of the entropy of the generalized Bilal distribution using various methods. They utilized the Newton-Raphson method to calculate estimates of the entropy and model parameters.

    Theorem 1. Let X be a random variable distributed with IER(ω, υ), with the PDF defined in Eq (1). The SE of the IER distribution is as follows:

    SE(f) = −ln(2ωυ) + (3/2)ln υ + φ(ω + 1) − φ(1) + (ω − 1)/ω − (3/2)ω Σ_{k=0}^{∞} (−1)^k [Γ(ω)/(k!Γ(ω − k))][φ(1) − ln(k + 1)]/(k + 1), (4)

    where φ(·) and Γ(·) denote the digamma and gamma functions, respectively.

    Proof. Substitute Eq (1) into Eq (3) to obtain the following:

    Among them,

    According to the fact that , we can conclude that

    Take the derivative of both sides of the above equations with respect to ω:

    Next, let υ/x² = S; then,

    Let W be a random variable distributed with the generalized exponential distribution, whose PDF is as follows:

    Then, according to [1], we have the following:

    Let w/ω = s; then,

    Furthermore,

    Let υ/x² = r, x = (υ/r)^(1/2); then,

    Let e^(−r) = z, r = −ln z, dr = −(1/z)dz; then,

    According to the power series expansion formula,

    We have the following:

    According to the PDF of the negative log-gamma distribution and the normalization condition for a PDF, we know that

    Then,

    Take the derivative of both sides of the above equations with respect to a:

    If a = 0, then

    Finally, we obtain the following:

    To study the IER distribution, scholars mainly conduct research through data analysis, using data that are either complete or censored. They analyze and model these data to understand and predict the probability of events. This research methodology is crucial for fields such as insurance actuarial science, financial risk assessment, and radar forecasting. However, due to constraints on time and funding, we opt for censored life testing. In statistics, censored life testing is an important method to assess whether a sample of data is representative of the overall life distribution. Type-Ⅰ and Type-Ⅱ censoring are two commonly used approaches in life tests. Gao and Gui [30] mentioned that Type-Ⅰ and Type-Ⅱ censoring, as the most common censoring schemes, have been widely used by many authors in various life models. In conventional time-terminated or failure-terminated tests, researchers conclude the study at a predetermined time point or after a specified number of failures, without removing any items during the test. This method can lead to discrepancies between the experimental conclusions and the actual situation [31]. Therefore, it is prudent to adopt a progressive censoring scheme in life testing [32].

    The advantage of the progressive Type-Ⅱ censoring scheme lies in its ability to analyze product performance while removing non-failed products at each stage of testing. This allows flexible adjustment of the sample size and censoring times, leading to more efficient use of time and resources and improved data collection efficiency. From a practical perspective, this censoring scheme is more flexible and effective than the traditional Type-Ⅱ scheme, as it enables the sample size to be determined precisely in accordance with the specific research objective and practical circumstances without sacrificing accuracy. Likewise, progressively removing units maximizes the time savings without impacting the quality of the inference, reduces the time consumed, and enhances the reliability of the data analysis.

    Suppose we are conducting a lifetime experiment on n products. As soon as a failure occurs, a portion of the remaining non-failed products is randomly removed from the test [33,34]. In essence, once the time of the first failure is identified as X1:m:n, this critical information is documented; then, from the remaining n−1 non-failed products, L1 products are removed and the remaining n−1−L1 products are kept under observation. The observation is continued to determine the failure time of the second product, X2:m:n, after which L2 of the surviving products are removed. This process is repeated until the failure time of the m-th product, Xm:m:n, is recorded, at which point the experiment stops and the remaining Lm products are removed. We denote (X1:m:n, X2:m:n, ..., Xm:m:n) as the set of m progressively Type-Ⅱ censored (PC-II) samples observed in the experiment, where m represents the number of observed failures. Each observed sample follows the IER(ω, υ) distribution defined by Eq (1). In a progressively Type-Ⅱ censored test, the joint likelihood function describes the joint probability distribution of the observed data and the censoring mechanism. Assuming the observed failure times are (X1:m:n, X2:m:n, ..., Xm:m:n) and Li non-failed units are removed after the i-th failure, the joint likelihood function LF(ω, υ|x*) can be expressed as follows [23]:

    LF(ω, υ|x*) = C ∏_{i=1}^{m} f(x_i; ω, υ)[1 − F(x_i; ω, υ)]^(L_i), (5)

    where C = n(n − L1 − 1)(n − L1 − L2 − 2)⋯(n − L1 − ⋯ − L_{m−1} − m + 1), u(υ, x) = 1 − e^(−υ/x²), and x_i denotes the observed value of Xi:m:n. Substituting Eqs (1) and (2) into Eq (5) yields the following:

    LF(ω, υ|x*) = C(2ωυ)^m ∏_{i=1}^{m} x_i⁻³ e^(−υ/x_i²) [u(υ, x_i)]^(ω(L_i+1)−1). (6)

    The corresponding log-likelihood function is as follows:

    ln LF(ω, υ|x*) = ln C + m ln(2ωυ) − 3Σ_{i=1}^{m} ln x_i − υΣ_{i=1}^{m} 1/x_i² + Σ_{i=1}^{m} [ω(L_i+1) − 1] ln u(υ, x_i). (7)

    Setting the partial derivatives with respect to ω and υ equal to zero gives the likelihood equations:

    m/ω + Σ_{i=1}^{m} (L_i+1) ln u(υ, x_i) = 0, (8)
    m/υ − Σ_{i=1}^{m} 1/x_i² + Σ_{i=1}^{m} [ω(L_i+1) − 1] e^(−υ/x_i²)/[x_i² u(υ, x_i)] = 0, (9)

    where u(υ, x) = 1 − e^(−υ/x²).

    In order to improve the accuracy and reliability of the estimation methods, it is necessary to first verify the existence and uniqueness of the ML estimation solutions, thereby laying a theoretical foundation for a subsequent parameter estimation.

    Next, we present two propositions:

    Proposition 1: Let the left-hand side of Eq (8) be f1(ω) = m/ω + Σ_{i=1}^{m}(L_i+1) ln u(υ, x_i); then, the equation f1(ω) = 0 has a solution that exists and is unique on ω∈(0, +∞).

    Proof. (1) Existence.

    When ω→0+, we have m/ω → +∞, while Σ_{i=1}^{m}(L_i+1) ln u(υ, x_i) remains finite. Therefore, we have the following:

    lim_{ω→0+} f1(ω) = +∞.

    When ω→+∞, we have m/ω → 0, and since 0 < u(υ, x_i) < 1 implies ln u(υ, x_i) < 0, we have the following:

    lim_{ω→+∞} f1(ω) = Σ_{i=1}^{m}(L_i+1) ln u(υ, x_i) < 0.

    Since f1(ω) is continuous on the interval ω∈(0, +∞) and changes sign, by the intermediate value theorem there exists at least one solution ω̂ such that f1(ω̂) = 0, thus proving the existence.

    (2) Uniqueness.

    Finding the derivative of f1(ω) yields f1′(ω) = −m/ω².

    When ω∈(0, +∞), f1′(ω) = −m/ω² < 0 can be obtained, and thus f1(ω) is a monotonically decreasing function, thus proving the uniqueness.

    Proposition 2: Let the left-hand side of Eq (9) be f2(υ) = m/υ − Σ_{i=1}^{m} 1/x_i² + Σ_{i=1}^{m}[ω(L_i+1) − 1] e^(−υ/x_i²)/[x_i² u(υ, x_i)]. Then, the equation f2(υ) = 0 has a solution that exists and is unique on υ∈(0, +∞).

    Proof. (1) Existence.

    When υ→0+, we have m/υ → +∞ and u(υ, x_i) → 0, so that e^(−υ/x_i²)/[x_i² u(υ, x_i)] → +∞; indeed, the leading behavior of f2(υ) near zero is [m + Σ_{i=1}^{m}(ω(L_i+1) − 1)]/υ = ωn/υ. Therefore,

    lim_{υ→0+} f2(υ) = +∞.

    Thus, there is a neighborhood of 0 on which f2(υ) > 0.

    When υ→+∞, we have m/υ → 0 and e^(−υ/x_i²)/u(υ, x_i) → 0.

    Therefore, we can conclude that

    lim_{υ→+∞} f2(υ) = −Σ_{i=1}^{m} 1/x_i² < 0.

    Thus, there is a point at which f2(υ) < 0.

    Since f2(υ) is continuous on the interval υ∈(0, +∞) and changes sign, by the intermediate value theorem, there exists at least one solution υ̂ such that f2(υ̂) = 0, thus proving the existence.

    (2) Uniqueness.

    By taking the derivative of f2(υ), we can obtain the following:

    f2′(υ) = −m/υ² − Σ_{i=1}^{m} [ω(L_i+1) − 1] e^(−υ/x_i²)/[x_i⁴ u²(υ, x_i)].

    When υ∈(0, +∞), it can be derived that 1 > u(υ, x) = 1 − e^(−υ/x²) > 0; thus, each term above is negative, and we can obtain the following:

    f2′(υ) < 0.

    Therefore, when υ∈(0, +∞), f2(υ) is a monotonically decreasing function, thus proving the uniqueness.

    To obtain the values of ω and υ, we can solve Eqs (8) and (9). By substituting the known variables and applying the appropriate mathematical techniques, we can isolate ω and υ, thereby determining their values. However, their ML estimates are challenging to determine precisely using direct analytical methods. Therefore, numerical calculations with the assistance of mathematical tools and techniques are necessary to find accurate values of ω and υ. This approach involves utilizing either computer programs or numerical analysis software to approximate the true solution through iterative operations and constant adjustment of parameters. In the following, we describe how numerical computational methods (Algorithm 1) can be employed to solve for the ML estimates.

    Algorithm 1. Numerical ML estimation via Newton-Raphson iteration (a code sketch follows the steps).
    (1). Derive the likelihood function of the parameters, convert it to the log-likelihood function, and differentiate the resulting log-likelihood expression.
    (2). Initialize the parameter values, generally using the sample mean or other reliable values.
    (3). Plug the current parameter values into the derivative expressions obtained in step (1) and calculate the corresponding values.
    (4). Substitute the first and second derivatives into the Newton iteration formula to obtain the new parameter values.
    (5). Repeat steps (3) and (4) until the difference between two successive iterations is within the desired level of precision.
    (6). The final parameter values obtained from the iteration are the ML estimates.
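    A minimal numerical sketch of Algorithm 1 is given below. It assumes the reconstructed log-likelihood in Eq (7) (the additive constant ln C is dropped, as it does not affect the maximizer) and approximates the score and Hessian by finite differences inside the Newton iteration; x holds the observed failure times and L the removal numbers, and all names and tuning values are our own illustrative choices:

```python
import numpy as np

def loglik(theta, x, L):
    """Progressive Type-II log-likelihood of IER (Eq (7), additive constant dropped)."""
    w, v = theta
    if w <= 0.0 or v <= 0.0:
        return -np.inf
    u = 1.0 - np.exp(-v / x**2)
    return (len(x) * np.log(2.0 * w * v) - 3.0 * np.sum(np.log(x))
            - np.sum(v / x**2) + np.sum((w * (L + 1.0) - 1.0) * np.log(u)))

def newton_mle(x, L, theta0=(0.5, 0.5), tol=1e-6, max_iter=200, h=1e-5):
    """Newton-Raphson maximization of the log-likelihood."""
    theta = np.asarray(theta0, dtype=float)

    def score(t):                       # finite-difference gradient
        g = np.zeros(2)
        for i in range(2):
            e = np.zeros(2); e[i] = h
            g[i] = (loglik(t + e, x, L) - loglik(t - e, x, L)) / (2.0 * h)
        return g

    for _ in range(max_iter):
        g = score(theta)
        H = np.zeros((2, 2))            # finite-difference Hessian
        for i in range(2):
            e = np.zeros(2); e[i] = h
            H[:, i] = (score(theta + e) - score(theta - e)) / (2.0 * h)
        step = np.linalg.solve(H, g)    # Newton step: theta - H^{-1} g
        theta = np.maximum(theta - step, 1e-8)   # keep parameters positive
        if np.max(np.abs(step)) < tol:
            break
    return theta
```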

    Because the ML estimator is invariant under transformations, we can replace ω and υ in Eq (4) by the estimates ω̂ and υ̂ obtained by solving Eqs (8) and (9), so that we obtain the ML estimate of SE, which is given by the following expression:

    ŜE(f) = SE(f)|_{ω=ω̂, υ=υ̂}. (10)

    To analyze the stability and reliability of the model, we need to calculate the asymptotic confidence intervals (ACINs) for SE. We will derive the ACIN of SE using the delta method [34].

    The Fisher information matrix is as follows:

    I(ω, υ) = ( −∂²ln LF/∂ω²   −∂²ln LF/∂ω∂υ ; −∂²ln LF/∂υ∂ω   −∂²ln LF/∂υ² )|_{ω=ω̂, υ=υ̂}, (11)

    where

    ∂²ln LF/∂ω² = −m/ω²,
    ∂²ln LF/∂ω∂υ = ∂²ln LF/∂υ∂ω = Σ_{i=1}^{m}(L_i + 1) e^(−υ/x_i²)/[x_i² u(υ, x_i)],
    ∂²ln LF/∂υ² = −m/υ² − Σ_{i=1}^{m}[ω(L_i + 1) − 1] e^(−υ/x_i²)/[x_i⁴ u²(υ, x_i)].

    Since Eq (4) contains an infinite series, which is not convenient to differentiate directly, we use the following alternative expression for SE:

    From the above expression, we can obtain the first-order partial derivatives of SE with respect to ω and υ,

    Therefore, we can obtain the asymptotic variance of SE as follows:

    Var(ŜE) ≈ K I⁻¹(ω̂, υ̂) Kᵀ,

    where K = (∂SE/∂ω, ∂SE/∂υ), Kᵀ is the transpose of K, and Uα/2 is the upper α/2 percentile of the standard normal distribution N(0, 1). The 100(1−α)% ACIN of SE is as follows:

    (ŜE − Uα/2 √Var(ŜE), ŜE + Uα/2 √Var(ŜE)). (12)
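    The interval in Eq (12) can be computed numerically without the series form of Eq (4). The sketch below reuses the loglik() and shannon_entropy() helpers from the earlier sketches, replaces the expected information with the observed information at the MLE, and differentiates SE by finite differences; the function name and step size are our own choices:

```python
import numpy as np
from scipy.stats import norm

def entropy_aci(theta_hat, x, L, alpha=0.05, h=1e-5):
    """Delta-method 100(1-alpha)% ACI for SE(f), per Eq (12)."""
    th = np.asarray(theta_hat, dtype=float)

    def score(t):                      # finite-difference gradient of loglik
        g = np.zeros(2)
        for i in range(2):
            e = np.zeros(2); e[i] = h
            g[i] = (loglik(t + e, x, L) - loglik(t - e, x, L)) / (2.0 * h)
        return g

    H = np.zeros((2, 2))               # Hessian of loglik at the MLE
    for i in range(2):
        e = np.zeros(2); e[i] = h
        H[:, i] = (score(th + e) - score(th - e)) / (2.0 * h)
    cov = np.linalg.inv(-H)            # inverse observed information

    K = np.zeros(2)                    # gradient of SE w.r.t. (omega, upsilon)
    for i in range(2):
        e = np.zeros(2); e[i] = h
        K[i] = (shannon_entropy(*(th + e)) - shannon_entropy(*(th - e))) / (2.0 * h)

    se_hat = shannon_entropy(*th)
    half = norm.ppf(1.0 - alpha / 2.0) * np.sqrt(K @ cov @ K)
    return se_hat - half, se_hat + half
```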

    The BAE is a parameter estimation method that provides a powerful tool to predict and estimate unknown variables based on known a priori information by delving deeper into the principles and logic embedded in Bayes' theorem [35]. It has found extensive applications in machine learning, statistics, data mining, and other related fields.

    We calculated the BAE of SE under three different loss functions: the weighted squared error loss (WSEL) function, the precautionary loss (PL) function, and the K-loss (KL) function.

    The WSEL function is an improvement on the standard squared error loss function. It measures the difference between the predicted and actual values, thereby considering the relative weight of each data point. By applying weights, the model can focus more on the samples that are more important or influential during the training process. The WSEL function is selected for its adaptability to skewed distributions through parameter-dependent weights and was validated in generalized Rayleigh models by Han [36].

    The PL function is commonly used in risk management and decision optimization, with the goal of reducing potential risks in a given plan. Unlike traditional loss functions, the PL function calculates prediction errors while reducing the underestimation of critical events, thus ensuring more cautious decisions when the risk is higher. The PL function prioritizes asymmetric penalty terms to mitigate underestimation risks, thus aligning with reliability engineering requirements, as demonstrated in the exponential distribution parameter estimation by Rasheed and Abd [37].

    The KL function (K-loss function) measures the deviation between the estimator and the true parameter on a ratio scale, penalizing overestimation and underestimation of the ratio symmetrically. By adjusting errors according to the scale of the parameter, it ensures dimensional consistency in the entropy calculations and provides robustness against outliers, as reported for Pareto distributions by Hassan et al. [38].

    We first determine the prior distribution. A priori selection of information involves integrating the available data with observations derived from existing knowledge and experience [39]. Indeed, selecting a prior distribution is often grounded in extensive experience or empirical data, thus ensuring that it addresses the limitations of the data and leads to more representative parameters. In Bayesian modeling of non-negative continuous parameters, Gamma priors are assigned to ω and υ because of their flexibility, their support on (0, +∞), and the tractable posterior forms they often yield, while preserving the interpretability of the parameters. Collectively, Gamma distributions balance mathematical tractability and empirical fidelity, making them a natural choice for non-negative parameter estimation in Bayesian frameworks.

    We assume there are two unknown parameters, ω and υ, which are independent of each other. Both parameters follow a Gamma distribution. This assumption is based on the existing knowledge and experience, as well as the combination of available information with observations.

    Specifically, the PDFs of ω and υ are as follows:

    π(ω) = (bᵃ/Γ(a)) ω^(a−1) e^(−bω), ω > 0, and π(υ) = (dᶜ/Γ(c)) υ^(c−1) e^(−dυ), υ > 0,

    where a, b, c, d > 0 are hyperparameters.

    Through the above two expressions, we can obtain the joint prior distribution of ω and υ as follows:

    π(ω, υ) ∝ ω^(a−1) υ^(c−1) e^(−bω−dυ).

    By Bayes' theorem, we can obtain an expression for the joint posterior density of ω and υ as follows:

    π(ω, υ|x*) = π(ω, υ) LF(ω, υ|x*) / ∫₀^∞∫₀^∞ π(ω, υ) LF(ω, υ|x*) dω dυ. (13)

    The BAE of SE under three different loss functions is analyzed. The three loss functions are as follows [36,37,38]:

    WSEL: L(θ̂, θ) = (θ̂ − θ)²/θ; PL: L(θ̂, θ) = (θ̂ − θ)²/θ̂; KL: L(θ̂, θ) = (√(θ̂/θ) − √(θ/θ̂))².

    We can obtain the corresponding Bayesian estimators of SE as follows:

    ŜE_WB = {E[SE⁻¹(f|ω, υ)|x]}⁻¹, (14)
    ŜE_PB = {E[SE²(f|ω, υ)|x]}^(1/2), (15)
    ŜE_KB = {E[SE(f|ω, υ)|x]/E[SE⁻¹(f|ω, υ)|x]}^(1/2), (16)

    where ŜE_WB, ŜE_PB, and ŜE_KB stand for the Bayesian estimators of SE under the WSEL, PL, and KL functions, respectively, and E[SE⁻¹(f|ω, υ)|x] stands for the posterior expectation of SE⁻¹. The Bayesian estimates in Eqs (14)–(16) are calculated by obtaining E[SE⁻¹(f|ω, υ)|x], E[SE(f|ω, υ)|x], and E[SE²(f|ω, υ)|x] and substituting these values into Eqs (14)–(16). Next, to obtain the Bayesian estimates of the entropy under the WSEL, PL, and KL functions, this paper chooses the Lindley method for the calculation.
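    In practice, once posterior draws of SE are available (e.g., from the hybrid Gibbs sampler described later), the three estimators in Eqs (14)–(16) reduce to sample averages over the draws. A minimal sketch (the function name is our own):

```python
import numpy as np

def bayes_entropy_estimates(se_draws):
    """Bayes estimates of SE under the WSEL, PL, and KL functions (Eqs (14)-(16)),
    approximating the posterior expectations by averages over MCMC draws."""
    s = np.asarray(se_draws, dtype=float)
    mean_inv = np.mean(1.0 / s)                     # E[SE^(-1) | x]
    return {
        "WSEL": 1.0 / mean_inv,                     # Eq (14)
        "PL":   np.sqrt(np.mean(s**2)),             # Eq (15)
        "KL":   np.sqrt(np.mean(s) / mean_inv),     # Eq (16)
    }
```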

    The Lindley approximation is a method primarily used to approximate the posterior distribution of complex Bayesian models. It is an improvement upon the Laplace method. Under certain regularity conditions, as the sample size increases, the posterior distribution tends to approach a normal distribution. The Lindley approximation typically assumes that the likelihood function of the model can be approximated by a second-order Taylor expansion. By approximating the likelihood function and incorporating information from the prior distribution, it reduces the computational complexity. The Lindley approximation is particularly effective when the parameter space is small, and the likelihood function is relatively smooth over the interval.

    The Lindley approximation method has significant advantages compared to computational techniques such as MCMC. When dealing with high-dimensional parameter models, the convergence of MCMC may take several hours to several days, thus consuming a large amount of computational resources. In contrast, the Lindley approximation method can quickly provide a reasonable estimate, thus greatly shortening the research cycle and reducing the time cost. When the model is of a high complexity due to the complex likelihood functions or prior distributions, the convergence of MCMC is easily disturbed, thus making it difficult to accurately estimate the posterior distribution. Relying on reasonable approximation assumptions, the Lindley approximation method has relatively loose conditional constraints on the model. It can still maintain a good performance in complex models and demonstrate a stronger robustness.

    In this paper, after referring to Lindley's approximation, the following equation is given [40]:

    E[T(ω, υ)|x*] = ∫₀^∞∫₀^∞ T(ω, υ) e^(Q(ω, υ)+lf(ω, υ|x*)) dω dυ / ∫₀^∞∫₀^∞ e^(Q(ω, υ)+lf(ω, υ|x*)) dω dυ, (17)

    where Q(ω, υ) = ln π(ω, υ), T(ω, υ) is a function of ω and υ, and lf(ω, υ|x*) is the log-likelihood function in Eq (7). When the sample size is large enough, Eq (17) can be approximated as follows [40]:

    In the above expressions, ω̂ and υ̂ are the ML estimates of ω and υ, respectively, Tωω is the second-order partial derivative of T(ω, υ) with respect to ω, and Qω is the first-order partial derivative of Q(ω, υ) with respect to ω. Furthermore, we have the following:

    where τ(m, n) represents the (m, n)-th element of the inverse of the observed information matrix, and u(υ, x) = 1 − e^(−υ/x²).
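    As a concrete illustration, the two-parameter Lindley expansion can be evaluated generically with finite differences. The sketch below implements the standard form E[T|x] ≈ T + ½Σᵢⱼ(Tᵢⱼ + 2TᵢQⱼ)τᵢⱼ + ½ΣᵢⱼΣₖₗ lᵢⱼₖ τᵢⱼ τₖₗ Tₗ at the MLE, where lᵢⱼₖ are third derivatives of the log-likelihood; it reuses loglik() from the earlier sketch, and log_prior is the log of the joint Gamma prior (all names are our own):

```python
import numpy as np
from itertools import product

def lindley(T, log_prior, theta_hat, x, L, h=1e-4):
    """Two-parameter Lindley approximation of E[T(omega, upsilon) | x]."""
    th = np.asarray(theta_hat, dtype=float)

    def d1(f, t, i):                   # first partial derivative
        e = np.zeros(2); e[i] = h
        return (f(t + e) - f(t - e)) / (2.0 * h)

    def d2(f, t, i, j):                # second partial derivative
        e = np.zeros(2); e[j] = h
        return (d1(f, t + e, i) - d1(f, t - e, i)) / (2.0 * h)

    def d3(f, t, i, j, k):             # third partial derivative
        e = np.zeros(2); e[k] = h
        return (d2(f, t + e, i, j) - d2(f, t - e, i, j)) / (2.0 * h)

    ll = lambda t: loglik(t, x, L)
    H = np.array([[d2(ll, th, i, j) for j in range(2)] for i in range(2)])
    tau = np.linalg.inv(-H)            # tau(i, j): inverse observed information
    Q = np.array([d1(log_prior, th, j) for j in range(2)])
    Ti = np.array([d1(T, th, i) for i in range(2)])
    Tij = np.array([[d2(T, th, i, j) for j in range(2)] for i in range(2)])

    term2 = 0.5 * sum((Tij[i, j] + 2.0 * Ti[i] * Q[j]) * tau[i, j]
                      for i, j in product(range(2), repeat=2))
    term3 = 0.5 * sum(d3(ll, th, i, j, k) * tau[i, j] * tau[k, l] * Ti[l]
                      for i, j, k, l in product(range(2), repeat=4))
    return T(th) + term2 + term3
```

    For the WSEL estimate, for instance, one would call lindley with T(θ) = 1/SE(f; θ) and invert the returned value, per Eq (14).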

    The various expressions derived above are employed to articulate the BAE of SE in Eqs (14)–(16) as follows:

    (ⅰ) The following is the Bayesian estimator under the WSEL function:

    Thus, it can be obtained that

    Then, we substitute the obtained expression into Eq (14) in order to derive the Bayesian estimates under the WSEL function.

    (ⅱ) The following is the Bayesian estimator under the PL function:

    Thus, it can be obtained that

    Then, we substitute the derived expression into Eq (15) in order to obtain the Bayesian estimates under the PL function.

    (ⅲ) The following is the Bayesian estimator under the KL function:

    Under the KL function, we first list the following expression:

    In the case of (ⅰ), we get

    Thus, we also need an expression for E[SE(f)|x]:

    Thus, it can be obtained that

    By substituting the above expressions into Eq (16), we obtain the Bayesian estimate under the KL function.

    For the BAE, scholars have varying opinions, one of which is to use MCMC methods. In Bayesian statistics, it is often necessary to calculate the posterior distribution to obtain the Bayesian estimate [1]. However, due to the complexity and difficulty of integrating the posterior distribution, the MCMC method has become an effective tool to address such challenges.

    The MCMC method involves exploring the parameter space by generating Markov chains that ultimately converge to the posterior distribution. This approach not only handles complex posterior distributions effectively but also provides a comprehensive understanding of the accuracy and uncertainty associated with the parameter estimates. Consequently, employing the MCMC method enables us to effectively address computational challenges related to complex posterior distributions, thus resulting in more precise and reliable Bayesian estimates.

    Metropolis-Hastings (MH) sampling is suitable for sampling low-dimensional posterior distributions. It generates Markov chains through an acceptance-rejection mechanism designed to explore the parameter space and eventually converge to the target posterior distribution. While MH sampling performs well with simple structures and low dimensionality, more sophisticated methods may be required to improve the sampling efficiency and convergence speed when dealing with high-dimensional or non-standard posterior distributions.

    On the other hand, Gibbs sampling represents a special case of MCMC methods tailored for high-dimensional and complex posterior distributions. It utilizes the sequential sampling of conditional distributions in order to update the parameter values and construct Markov chains.

    This paper presents a hybrid Gibbs sampling approach that combines MH steps with Gibbs sampling. From Eq (13), the joint posterior distribution of the two unknown parameters satisfies

    π(ω, υ|x*) ∝ ω^(m+a−1) υ^(m+c−1) e^(−bω−dυ) ∏_{i=1}^{m} x_i⁻³ e^(−υ/x_i²) [u(υ, x_i)]^(ω(L_i+1)−1).

    Based on the above expression, we can obtain the full conditional posterior distributions of ω and υ as follows:

    π(ω|υ, x*) ∝ ω^(m+a−1) e^(−bω) ∏_{i=1}^{m} [u(υ, x_i)]^(ω(L_i+1)), (18)
    π(υ|ω, x*) ∝ υ^(m+c−1) e^(−dυ) ∏_{i=1}^{m} e^(−υ/x_i²) [u(υ, x_i)]^(ω(L_i+1)−1). (19)

    This paper employs hybrid Gibbs sampling to obtain the parameter samples. The sampler proceeds as follows (a code sketch is given after these steps):

    (1) First, the MH method is used to draw ω(i+1) from π(ω|υ(i), x*).

    The proposal for ω follows a normal distribution N(ωM, μω²), where ωM is the current state and μω² denotes the proposal variance for ω.

    (ⅰ) We draw ω′ from N(ωM, μω²), resampling whenever ω′ ≤ 0. The acceptance probability expression is as follows:

    (ⅱ) A sample λ1 is drawn from the uniform distribution U(0, 1), and the new state is set as follows:

    (ⅲ) Let M = M + 1. After obtaining the result, return to step (ⅰ).

    (2) Then, the MH method is used to draw υ(i+1) from π(υ|ω(i+1), x*).

    The proposal for υ follows a normal distribution N(υM, μυ²), where υM is the current state and μυ² denotes the proposal variance for υ.

    (ⅰ) We draw υ′ from N(υM, μυ²), resampling whenever υ′ ≤ 0. The acceptance probability expression is as follows:

    (ⅱ) A sample λ2 is drawn from the uniform distribution U(0, 1), and the new state is set as follows:

    (ⅲ) Let M = M + 1. After obtaining the result, return to step (ⅰ).
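    A compact sketch of the sampler described above is given below, reusing loglik() and shannon_entropy() from the earlier sketches. The Gamma(a, b) and Gamma(c, d) priors enter through their log densities; a = b = c = d = 1 matches the hyperparameter choice used later in the simulations, while the proposal standard deviations and function names are our own illustrative tuning choices:

```python
import numpy as np

def hybrid_gibbs(x, L, n_iter=5000, burn=500, prop_sd=(0.1, 0.1),
                 init=(0.1, 0.1), hyper=(1.0, 1.0, 1.0, 1.0), seed=None):
    """MH-within-Gibbs sampling from the joint posterior in Eq (13)."""
    rng = np.random.default_rng(seed)
    a, b, c, d = hyper

    def log_post(w, v):
        return (loglik((w, v), x, L)
                + (a - 1.0) * np.log(w) - b * w      # Gamma(a, b) prior on omega
                + (c - 1.0) * np.log(v) - d * v)     # Gamma(c, d) prior on upsilon

    w, v = init
    draws = []
    for it in range(n_iter):
        w_new = rng.normal(w, prop_sd[0])            # update omega | upsilon
        while w_new <= 0.0:                          # resample if proposal <= 0
            w_new = rng.normal(w, prop_sd[0])
        if np.log(rng.uniform()) < log_post(w_new, v) - log_post(w, v):
            w = w_new
        v_new = rng.normal(v, prop_sd[1])            # update upsilon | omega
        while v_new <= 0.0:
            v_new = rng.normal(v, prop_sd[1])
        if np.log(rng.uniform()) < log_post(w, v_new) - log_post(w, v):
            v = v_new
        if it >= burn:
            draws.append(shannon_entropy(w, v))      # posterior draw of SE
    return np.array(draws)
```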

    The Monte Carlo method encompasses a vast array of computational algorithms that leverage repeated random sampling to derive numerical results. Its widespread application across diverse fields such as statistics, physics, finance, engineering, and beyond attests to its remarkable flexibility. The fundamental principle involves generating random samples from a probability distribution and utilizing these samples either to estimate the properties of a system or to resolve a particular problem.

    At its core, the Monte Carlo method employs random sampling to simulate an experimental process, thereby yielding a numerical approximation of the solution to the problem at hand. The essence of this approach lies in the generation of random numbers that conform to a specific distribution, with an approximate solution being derived through extensive random sampling. As the quantity of the samples increases, the Monte Carlo method progressively enhances the accuracy of the approximate solution, thus drawing it ever closer to the true solution of the problem. In the subsequent sections, we will employ Monte Carlo methods to compare different estimation techniques, thus illustrating their extensive range of applications and their efficacy in addressing real-world problems.

    To compare the performance of various estimation methods based on these schemes, we used Monte Carlo simulation techniques to obtain the SE estimates for various estimation methods. After obtaining the estimates, we further calculated several key indicators such as the MSE, variance, and coverage. The MSE can intuitively display the degree of deviation between estimates and true values. Variance can reflect the dispersion of estimates (i.e., stability). Coverage reflects the frequency at which real parameters fall within the estimated interval in multiple simulations. By obtaining the MSE, variance, and coverage of the estimated values, we evaluated the performance of various estimation methods, thus providing solid and reliable decision support for subsequent research and practical applications. During the research process, we constructed various PC-II schemes (as shown in Table 1).

    Table 1.  PC-II designs.
    Scheme | Number of samples removed each time
    Ⅰ | L1 = L2 = ... = L(m−1) = 0, Lm = n − m
    Ⅱ | L1 = L2 = ... = Lm = 1
    Ⅲ | L1 = n − m, L2 = ... = Lm = 0

    The PC-II schemes set out in the table above can be used to generate PC-II data via the following algorithm (Algorithm 2):

    Algorithm 2. Generating PC-II data under IER(ω, υ) (a code sketch of the data-generation steps follows).
    (1). First, randomly draw m samples Di (1 ≤ i ≤ m) from the uniform distribution U(0, 1).
    (2). Assign the removal numbers according to the selected PC-II scheme.
    (3). Set the initial (true) values of the parameters ω and υ.
    (4). Substitute the corresponding values into Eqs (8), (9), and (12) to calculate the ML estimates of ω, υ, and SE(f), together with the ACIN.
    (5). Set the hyperparameters to a = b = c = d = 1.
    (6). Substitute the corresponding parameters into Eqs (14)–(16) and apply the Lindley approximation to evaluate them.
    (7). Repeat steps (1) through (6) a total of 1000 times; then compute the estimates, their variances, and the MSEs. The outcomes of these calculations are presented in the tables below.
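    Steps (1)–(3) of Algorithm 2 can be implemented with the classical uniform transformation for progressive Type-Ⅱ censoring; the inverse CDF below follows by inverting the reconstructed Eq (2), and the example parameter values are illustrative only:

```python
import numpy as np

def ier_ppf(u, omega, upsilon):
    """Inverse CDF of IER(omega, upsilon), obtained by inverting Eq (2)."""
    return np.sqrt(-upsilon / np.log(1.0 - (1.0 - u) ** (1.0 / omega)))

def gen_pc2_sample(n, m, L, omega, upsilon, seed=None):
    """Progressively Type-II censored IER sample via the uniform transformation."""
    rng = np.random.default_rng(seed)
    L = np.asarray(L, dtype=float)
    assert len(L) == m and L.sum() == n - m, "removal numbers must sum to n - m"
    W = rng.uniform(size=m)                       # step (1): m uniforms
    a = np.arange(1, m + 1) + np.cumsum(L[::-1])  # a_i = i + L_m + ... + L_{m-i+1}
    V = W ** (1.0 / a)
    U = 1.0 - np.cumprod(V[::-1])                 # ordered progressively censored uniforms
    return ier_ppf(U, omega, upsilon)             # transform to IER observations

# Example: scheme I with n = 30, m = 15 (L_1 = ... = L_14 = 0, L_15 = 15)
L_scheme1 = np.r_[np.zeros(14), 15]
sample = gen_pc2_sample(30, 15, L_scheme1, omega=0.5, upsilon=0.5, seed=1)
```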

    During the MCMC simulation process, we set the initial values of the parameters ω and υ as ω0 = 0.1 and υ0 = 0.1, respectively. Meanwhile, the hyperparameters a, b, c, and d of the gamma priors were all set to 1. To eliminate the influence of the initial values on the parameter estimation, we took the first 500 iterations as the warm-up period. With 5000 iterations in total, the parameter estimates became stable, as indicated by the decreasing variance. We judge convergence by monitoring the difference between successive parameter values: over the course of a large number of iterations, if the absolute difference between successive values of a parameter is consistently below a preset threshold (10⁻⁶), this indicates that the parameter has reached a relatively stable state. Regarding the selection of initial values, scholars generally determine them by combining the background knowledge and physical meaning of the problem; alternatively, at the start of the simulation, they randomly select several sets of initial values, conduct small-scale preliminary experiments or exploratory simulations, try different initial values, and observe the model's behavior and convergence before making a decision. In this paper, the initial values were selected using the latter method: several sets of initial values were randomly chosen for preliminary experiments, and it was found that with ω0 = 0.1 and υ0 = 0.1 the model converged more quickly and the results were relatively stable, so these initial values were adopted.

    Subsequently, the MCMC simulation was carried out according to the censoring scheme shown in Table 1, and the final obtained results are presented in Table 2.

    Table 2.  Estimates of SE of IER(ω, υ) and their corresponding MSEs (for each n, m, and scheme Ps, the first row gives the estimates and the second row, marked (MSE), gives the corresponding MSEs).
    n | m | Ps | ML: MML | Lindley: Mw, MP, Mk | MCMC: Mw, MP, Mk
    30 | 15 | Ⅰ | 1.8333 | 1.3379, 2.1681, 2.1428 | 4.0058, 4.2344, 4.0801
       |    | (MSE) | 0.0186 | 0.0145, 0.0215, 0.0213 | 0.0420, 0.0450, 0.0430
       |    | Ⅱ | 2.9593 | 2.6997, 3.0375, 4.8642 | 4.9937, 3.4057, 3.6990
       |    | (MSE) | 0.0295 | 0.0268, 0.0304, 0.0538 | 0.0558, 0.0346, 0.0381
       |    | Ⅲ | 2.1239 | 1.7393, 2.3461, 2.8169 | 5.7161, 7.3924, 6.2604
       |    | (MSE) | 0.0211 | 0.0177, 0.0232, 0.0280 | 0.0671, 0.0973, 0.0763
    30 | 20 | Ⅰ | 1.3387 | 0.8527, 1.7124, 1.2593 | 5.7097, 5.8981, 5.7741
       |    | (MSE) | 0.0145 | 0.0111, 0.0175, 0.0139 | 0.0670, 0.0701, 0.0680
       |    | Ⅱ | 2.4583 | 2.2396, 2.5512, 3.6842 | 5.2158, 5.2227, 5.2181
       |    | (MSE) | 0.0243 | 0.0222, 0.0253, 0.0379 | 0.0591, 0.0592, 0.0592
       |    | Ⅲ | 1.8969 | 1.5975, 2.0677, 2.4101 | 5.4683, 5.1808, 4.8803
       |    | (MSE) | 0.0191 | 0.0166, 0.0206, 0.0239 | 0.0631, 0.0586, 0.0541
    40 | 15 | Ⅰ | 2.1092 | 1.6438, 2.4046, 2.7240 | 4.3557, 4.4536, 4.3878
       |    | (MSE) | 0.0210 | 0.0170, 0.0238, 0.0270 | 0.0466, 0.0480, 0.0471
       |    | Ⅱ | 3.2181 | 3.0126, 3.2358, 5.5805 | 5.5811, 5.6058, 5.5892
       |    | (MSE) | 0.0324 | 0.0301, 0.0326, 0.0649 | 0.0649, 0.0653, 0.0650
       |    | Ⅲ | 1.8413 | 1.3439, 2.1873, 2.1591 | 5.7654, 6.6013, 6.0088
       |    | (MSE) | 0.0186 | 0.0146, 0.0217, 0.0215 | 0.0679, 0.0824, 0.0720
    40 | 20 | Ⅰ | 1.5321 | 1.0586, 1.8749, 1.5986 | 5.5001, 5.7136, 5.5712
       |    | (MSE) | 0.0160 | 0.0125, 0.0189, 0.0166 | 0.0636, 0.0670, 0.0647
       |    | Ⅱ | 2.6158 | 2.4106, 2.6915, 4.0648 | 5.4929, 7.9602, 5.9815
       |    | (MSE) | 0.0259 | 0.0239, 0.0267, 0.0428 | 0.0635, 0.1089, 0.0715
       |    | Ⅲ | 1.7374 | 1.3778, 1.9692, 2.0562 | 7.3308, 6.9897, 5.9927
       |    | (MSE) | 0.0177 | 0.0148, 0.0197, 0.0205 | 0.0961, 0.0896, 0.0717
    50 | 15 | Ⅰ | 1.7513 | 1.1619, 2.1960, 1.9163 | 4.3314, 4.4213, 4.3614
       |    | (MSE) | 0.0179 | 0.0132, 0.0218, 0.0193 | 0.0463, 0.0475, 0.0467
       |    | Ⅱ | 1.7443 | 1.1836, 2.1474, 1.9239 | 4.7887, 4.6084, 4.6402
       |    | (MSE) | 0.0178 | 0.0134, 0.0214, 0.0193 | 0.0527, 0.0502, 0.0506
       |    | Ⅲ | 2.3678 | 1.9841, 2.5906, 3.3509 | 5.6337, 7.0812, 5.7139
       |    | (MSE) | 0.0234 | 0.0199, 0.0256, 0.0339 | 0.0657, 0.0913, 0.0670
    50 | 20 | Ⅰ | 1.5720 | 1.1216, 1.8959, 1.6861 | 4.3022, 4.3850, 4.3302
       |    | (MSE) | 0.0164 | 0.0129, 0.0191, 0.0173 | 0.0459, 0.0470, 0.0463
       |    | Ⅱ | 1.2253 | 0.6985, 1.6658, 1.0498 | 5.7158, 5.7480, 5.7267
       |    | (MSE) | 0.0137 | 0.0101, 0.0171, 0.0124 | 0.0671, 0.0676, 0.0672
       |    | Ⅲ | 1.8986 | 1.5607, 2.1108, 2.3872 | 4.6716, 5.4634, 4.9216
       |    | (MSE) | 0.0191 | 0.0163, 0.0210, 0.0236 | 0.0511, 0.0630, 0.0547

    Table 2 shows the ML estimates, Bayesian estimates, and their corresponding MSEs under different PC-II schemes (Ps). We have summarized the characteristics of each method by observing the estimated values under different parameter values and sample sizes.

    From the MCMC simulation results, we finally obtained Table 2 and Figure 3; after a careful observation and analysis of Table 2 and Figure 3, we can obtain the following conclusions:

    Figure 3.  Comparison of estimation values and MSEs of ML, Lindley, and MCMC methods under different censoring schemes.

    (1) The Effects of Different Censoring Schemes on Estimates and MSEs.

    (ⅰ) Censoring Scheme Ⅰ. In the context of a specific sample combination (n = 30, m = 15), the Lindley method demonstrates precise and stable estimation characteristics under the WSEL function. Its estimated value is 1.3379, and the corresponding MSE is only 0.0145. When the total sample size n varies, the MSEs of different estimation methods exhibit inconsistent change trends. This phenomenon is closely related to the inherent distribution of the data and the censoring mechanism.

    (ⅱ) Censoring Scheme Ⅱ. Estimated values under this scheme are generally at a relatively high level. For example, by taking n = 30, m = 15, the estimated value obtained by the ML method is 2.9593, which is significantly higher than the corresponding value in censoring scheme Ⅰ. Meanwhile, its MSE is 0.0295, which is also larger than that in scheme Ⅰ. This is most likely because this censoring scheme leads to the significant loss of data information, thus resulting in larger deviations and more intense fluctuations in the estimation results. Under different sample size conditions, there are significant differences in the MSEs of the MCMC method under different loss functions.

    (ⅲ) Censoring Scheme Ⅲ. The fluctuation range of the MSEs of different estimation methods under this scheme is extremely wide. In the case of n = 30, m = 15, the MSE of the Lindley method under the WSEL loss function is 0.0177, while that of the MCMC method is as high as 0.0671, which clearly indicates the poor stability of the MCMC method.

    (2) Performance Comparison of Different Estimation Methods under Each Censoring Scheme.

    (ⅰ) ML Method. In censoring scheme Ⅰ, the overall performance is relatively stable. In some sample combinations, its MSE is at a medium level. For example, when n = 30, m = 15, the MSE is 0.0186. However, under censoring scheme Ⅱ, the estimated value is on the high side, and the MSE is relatively large. This reflects that when dealing with such censored data, this method may have certain biases and instabilities. For example, when n = 30, m = 15, the estimated value is 2.9593, and the MSE is 0.0295. Under censoring scheme Ⅲ, the fluctuation range of the MSEs is relatively large, thus indicating that this method has a weak adaptability to this censoring pattern.

    (ⅱ) Lindley Method. In censoring scheme Ⅰ, it is sometimes possible to obtain a relatively small MSE. For example, when n = 30, m = 15, the MSE under the WSEL loss function is 0.0145, thus reflecting a good estimation accuracy and stability. However, under censoring schemes Ⅱ and Ⅲ, although an estimation can be carried out, the change range of the MSEs is relatively large, and in some cases, the value is relatively high.

    (ⅲ) MCMC Method. Under multiple censoring schemes, its MSE is usually relatively large. For example, under censoring scheme Ⅰ, when n = 30, m = 15, the MSE under the WSEL function is 0.042. Under censoring scheme Ⅲ, when n = 30, m = 15, the MSE under the WSEL loss function is as high as 0.0671. This indicates that when the MCMC method deals with these censored data, the estimation stability is poor, and the error is relatively large. Under the same censoring scheme and sample size conditions, different loss functions have a very significant impact on its MSE.

    Simultaneously, while delving deeply into the influence of diverse censoring schemes on the accuracy and stability of the estimations, we amassed a set of crucial data spanning different confidence levels. The particulars of this data are collated in Table 3. This table presents a comprehensive overview of the variance and coverage figures for the distinct censoring schemes (Ⅰ, Ⅱ, Ⅲ) at different confidence levels (α = 0.1 and α = 0.05) under a wide array of combinations of the sample size (n) and the number of observed failures (m).

    Table 3.  Variance and coverage of the SE estimates under different censoring schemes at confidence levels α = 0.1 and α = 0.05.
    n | m | Ps | Coverage (α = 0.1) | Coverage (α = 0.05) | V(SE(f))
    30 | 15 | Ⅰ | 0.622 | 0.674 | 1.1774
       |    | Ⅱ | 0.728 | 0.731 | 0.5614
       |    | Ⅲ | 0.449 | 0.563 | 0.8917
    30 | 20 | Ⅰ | 0.612 | 0.686 | 1.0218
       |    | Ⅱ | 0.690 | 0.710 | 0.4660
       |    | Ⅲ | 0.431 | 0.477 | 0.6248
    40 | 15 | Ⅰ | 0.696 | 0.713 | 1.1489
       |    | Ⅱ | 0.693 | 0.720 | 0.3645
       |    | Ⅲ | 0.541 | 0.574 | 1.1969
    40 | 20 | Ⅰ | 0.796 | 0.830 | 1.0288
       |    | Ⅱ | 0.692 | 0.699 | 0.4314
       |    | Ⅲ | 0.501 | 0.530 | 0.7614
    50 | 15 | Ⅰ | 0.730 | 0.733 | 1.4971
       |    | Ⅱ | 0.710 | 0.730 | 1.3725
       |    | Ⅲ | 0.568 | 0.584 | 0.9424
    50 | 20 | Ⅰ | 0.730 | 0.778 | 0.9801
       |    | Ⅱ | 0.697 | 0.754 | 1.1402
       |    | Ⅲ | 0.541 | 0.554 | 0.7408

    From Table 3, we can observe the following:

    (1) The Impact of Censoring Schemes on Coverage and Variance.

    Censoring scheme Ⅰ typically shows a higher coverage rate, while scheme Ⅲ consistently shows the lowest coverage. The coverage rates of each scheme vary with different n and m combinations. The influence of the sample size n and the number of observed failures m is complex and lacks a unified pattern.

    For schemes Ⅱ and Ⅲ, the variance is relatively large in some situations. For example, when n = 40, m = 15, scheme Ⅲ has a variance of 1.1969, exceeding scheme Ⅰ's 1.1489, and when n = 50, m = 20, scheme Ⅱ has a variance of 1.1402, exceeding scheme Ⅰ's 0.9801, making them less stable than scheme Ⅰ in those cases. As n and m change, the variance of each scheme shows no clear regularity. The impact of the sample parameters on the variance is unpredictable, and a larger variance means a higher dispersion of estimates and a lower accuracy.

    (2) The Impact of Confidence Level on Coverage and Variance.

    Overall, as the confidence level rises from 90% (α = 0.1) to 95% (α = 0.05), the coverage rate of each censoring scheme increases across different n and m combinations. For example, when n = 30, m = 15 in censoring scheme Ⅰ, the coverage at α = 0.1 is 0.622, and it grows to 0.674 at α = 0.05, which conforms to the theoretical expectations. A higher confidence level demands a wider confidence interval, thereby increasing the probability of capturing the true parameter value.

    Based on the research findings, we can draw the following conclusions. Censoring schemes significantly affect the coverage rate and variance. If a high coverage rate is pursued, then Scheme Ⅰ is more suitable. However, if the stability of the estimation is more highly valued, attention should be paid to the problem that Schemes Ⅱ and Ⅲ may exhibit relatively large variances in certain situations.

    The conclusions drawn above are based on simulation experiments. To verify the effectiveness of the method presented in this paper, we will use real data for validation. This is a set of survival time data for cancer patients [41]. The data is as follows:

    0.122, 0.2356, 0.2374, 0.2587, 0.3198, 0.37, 0.4135, 0.4738, 0.5546, 0.5836, 0.6347, 0.6846, 0.7826, 0.7447, 0.8143, 0.84, 0.92, 0.94, 1.10, 1.12, 1.19, 1.27, 1.30, 1.33, 1.40, 1.46, 1.55, 1.59, 1.73, 1.79, 1.94, 1.95, 2.09, 2.49, 2.81, 3.19, 3.39, 4.32, 4.69, 5.19, 6.33, 7.25, 8.17, 17.76.

    This is a dataset of the survival times of 44 cancer patients, consisting of numerical records. Raqab [42] fitted the generalized Rayleigh distribution function and the Weibull distribution function to a set of real data and found, by comparing the fits of the two distributions, that the Weibull distribution fit better. Therefore, after obtaining the real data, we fit multiple distributions to compare which provides the better fit.

    The shortest survival time of the patients is 0.122 and the longest is 17.76. The large span of survival times indicates that there is a significant difference in the durations of survival among different patients.

    This dataset has a moderate scale, with a relatively wide range of patient survival times. The data are quite dispersed and appear to follow a right-skewed distribution. This real dataset can be used to verify whether the relevant methods are correct, and it can also be applied to other related research.

    We can observe the distribution and central tendency of the data through Figure 4.

    Figure 4.  Boxplot of the dataset.

    We must ascertain the appropriateness of the IER distribution for the real data in Section 6. To this end, we selected the Weibull distribution (WD) and the inverse Weibull distribution (IWD) as competing candidates to the IER distribution, employing the Kolmogorov-Smirnov (KS) test to evaluate which distribution provides the superior fit for the sample. In the context of the KS test, a larger p-value typically signifies a more favorable fit, suggesting that the data do not significantly deviate from the hypothesized distribution; that is, there is insufficient evidence to reject the null hypothesis that the data adhere to the assumed distribution.

    According to Table 4, it is evident that the p-value for the KS test of the IER distribution is markedly greater than those for the other two distributions, thereby affirming that the IER distribution is the most fitting choice for this sample.

    Table 4.  Goodness-of-fit test for real data.
    Distribution | KS statistic | p-value
    WD | 0.5733 | 2.7351e-13
    IWD | 0.3593 | 1.1660e-05
    IER | 0.1048 | 0.3800
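    The goodness-of-fit check for the IER candidate can be sketched as follows, assuming the reconstructed CDF in Eq (2). The parameters are refit to the complete data by maximum likelihood inside the sketch, so the resulting KS statistic may differ slightly from the tabled values:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import kstest

data = np.array([0.122, 0.2356, 0.2374, 0.2587, 0.3198, 0.37, 0.4135, 0.4738,
                 0.5546, 0.5836, 0.6347, 0.6846, 0.7826, 0.7447, 0.8143, 0.84,
                 0.92, 0.94, 1.10, 1.12, 1.19, 1.27, 1.30, 1.33, 1.40, 1.46,
                 1.55, 1.59, 1.73, 1.79, 1.94, 1.95, 2.09, 2.49, 2.81, 3.19,
                 3.39, 4.32, 4.69, 5.19, 6.33, 7.25, 8.17, 17.76])

def ier_cdf(x, omega, upsilon):
    """Reconstructed IER CDF from Eq (2)."""
    return 1.0 - (1.0 - np.exp(-upsilon / x**2)) ** omega

def neg_loglik(theta):
    """Negative complete-sample IER log-likelihood for the real data."""
    w, v = theta
    if w <= 0.0 or v <= 0.0:
        return np.inf
    u = 1.0 - np.exp(-v / data**2)
    return -(len(data) * np.log(2.0 * w * v) - 3.0 * np.sum(np.log(data))
             - np.sum(v / data**2) + (w - 1.0) * np.sum(np.log(u)))

res = minimize(neg_loglik, x0=[1.0, 1.0], method="Nelder-Mead")
stat, pval = kstest(data, lambda x: ier_cdf(x, *res.x))
print(res.x, stat, pval)
```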

    We processed the real data and used MATLAB to draw the empirical distribution plot, as shown in Figure 5.

    Figure 5.  Comparison of IER's empirical distribution and CDFs for IER, WD and IWD.

    By observing Figure 5 and Table 4, we can conclude that it is reasonable to use the IER distribution to analyze the real data in this section.

    ML estimation and BAE were conducted on the aforementioned real data using the PC-II scheme and the estimation methods outlined in this paper. By applying these methods to the real data, the parameter estimates ω̂ = 0.1965 and υ̂ = 0.0011 were obtained. The results are shown in Table 5 and Figure 6.

    Table 5.  Estimates of SE.
    Method | ML | Lindley: WSEL, PL, KL | MCMC: WSEL, PL, KL
    Estimate | 10.1959 | 6.6963, 6.7294, 17.9151 | 3.8127, 4.4186, 3.9618
    Figure 6.  Image of the estimation of SE.

    The introductory section of this article provided a comprehensive overview of the model and briefly summarized its characteristics. Under PC-II schemes, we derived the SE function of the IER distribution in detail and obtained its explicit mathematical expression. To improve the computational accuracy, we used the Newton-Raphson method to obtain the ML estimates of the parameters. Subsequently, we used the delta method to derive the ACIN of the entropy to evaluate the reliability of the entropy estimation. Finally, based on multiple loss functions, we performed BAE of the SE function and analyzed the resulting estimates.

    This paper discussed the BAE of the IER distribution with respect to three loss functions, computed via the Lindley approximation. Using Monte Carlo simulation and mixed Gibbs sampling, we calculated the MSE, variance, and interval coverage rate for both the ML estimation and the BAE so as to compare the performance of the different statistical methods, and we draw the following conclusions (a minimal posterior-sampling sketch follows the list):

    (1) Censoring schemes have a significant impact on the estimated values and the MSEs. Across all censoring schemes and sample sizes, no single estimation method always performs best: the ML method is relatively stable in some cases, the Lindley method achieves better accuracy under specific conditions, while the MCMC method is less stable overall. The choice of loss function is also crucial, since different loss functions can change the relative performance of the estimation methods under different censoring schemes. In practice, the estimation method and loss function should therefore be selected according to the data characteristics (such as sample size and censoring scheme) and the research objectives in order to obtain accurate and reliable estimates.

    (2) Increasing the confidence level can raise the coverage rate, but it may lead to unstable variance. In practical applications, it is necessary to strike a balance between the coverage rate and the estimation accuracy (reflected by the variance).

    (3) The influence of the sample size (n) and the number of removed samples (m) on the coverage rate and variance is rather complex. When designing a study, one should consider how different combinations of these parameters affect the performance of the censoring schemes, choose the values of n and m rationally, and match them with appropriate censoring schemes and confidence levels to obtain reliable and accurate estimates.

    (4) The analysis of the real data in Section 6 shows that, when a suitable prior distribution is chosen, the Bayesian method surpasses ML estimation in data-fitting accuracy and interpretability. Moreover, the MCMC method proved more efficient and stable than the Lindley algorithm.
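    As referenced above, the following minimal sketch illustrates one generic way such Bayes estimates of SE can be produced from posterior draws. It uses a random-walk Metropolis sampler as a simple stand-in for the mixed Gibbs sampler of the paper, independent Gamma(1, 1) priors chosen purely for illustration, and a quadrature evaluation of SE in place of the closed-form expression derived earlier; the final lines are the standard Bayes-estimator forms under the WSEL (here with weight 1/H, which may differ from the paper's weight), PL, and KL losses. It reuses neg_loglik and ier_logpdf from the ML sketch above.

        import numpy as np
        from scipy import integrate

        rng = np.random.default_rng(2025)

        def log_post(theta, x, R, a=1.0, b=1.0):
            """Unnormalized log-posterior: PC-II log-likelihood plus
            independent Gamma(a, b) log-priors (an illustrative choice)."""
            omega, upsilon = theta
            if omega <= 0 or upsilon <= 0:
                return -np.inf
            prior = ((a - 1) * np.log(omega) - b * omega
                     + (a - 1) * np.log(upsilon) - b * upsilon)
            return -neg_loglik(theta, x, R) + prior

        def shannon_entropy(omega, upsilon):
            """H = -E[log f(X)], computed by quadrature as a stand-in
            for the closed-form SE expression derived in the paper."""
            g = lambda t: -np.exp(ier_logpdf(t, omega, upsilon)) \
                          * ier_logpdf(t, omega, upsilon)
            val, _ = integrate.quad(g, 1e-6, np.inf)
            return val

        # Random-walk Metropolis over (omega, upsilon).
        theta = np.array([0.5, 0.5])
        lp = log_post(theta, x, R)
        draws = []
        for _ in range(20000):
            prop = theta + 0.05 * rng.standard_normal(2)
            lp_prop = log_post(prop, x, R)
            if np.log(rng.uniform()) < lp_prop - lp:
                theta, lp = prop, lp_prop
            draws.append(theta.copy())
        draws = np.array(draws)[5000:]                 # discard burn-in

        H = np.array([shannon_entropy(w, v) for w, v in draws])
        H_wsel = 1.0 / np.mean(1.0 / H)                # WSEL, weight w(H) = 1/H
        H_pl = np.sqrt(np.mean(H ** 2))                # precautionary loss
        H_kl = np.sqrt(np.mean(H) / np.mean(1.0 / H))  # K-loss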

    In summary, this study applied Monte Carlo simulation to evaluate the performance of various estimation methods under different censoring conditions. Many open directions remain: for example, adaptive PC-II schemes could be developed in combination with emerging artificial-intelligence algorithms, which may improve the comprehensiveness and accuracy of such evaluations.

    Haiping Ren: Methodology, Writing - review & editing; Ziwen Zhang: Methodology, Writing - original draft, Writing - review & editing; Qin Gong: Methodology, Writing - review & editing. All authors have read and approved the final version of the manuscript for publication.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This research was funded by the National Natural Science Foundation of China (grant number 71661012) and the Science and Technology Research Project of Jiangxi Provincial Department of Education (grant number GJJ2200814).

    The authors declare no conflict of interest.



    [1] H. P. Ren, Q. Gong, X. Hu, Estimation of entropy for generalized Rayleigh distribution under progressively type-Ⅱ censored samples, Axioms, 12 (2023), 776. https://doi.org/10.3390/axioms12080776
    [2] K. Maiti, S. Kayal, D. Kundu, Statistical inference on the Shannon and Rényi entropy measures of generalized exponential distribution under the progressive censoring, SN Comput. Sci., 3 (2022), 317. https://doi.org/10.1007/s42979-022-01200-2
    [3] R. Xu, W. Gui, Entropy estimation of inverse Weibull distribution under adaptive type-Ⅱ progressive hybrid censoring schemes, Symmetry, 11 (2019), 1463. https://doi.org/10.3390/sym11121463
    [4] M. Shrahili, A. R. El-Saeed, Estimation of entropy for log-logistic distribution under progressive type-Ⅱ censoring, J. Nanomater., 2022 (2022), 2739606. https://doi.org/10.1155/2022/2739606
    [5] R. K. Maurya, Y. M. Tripathi, T. Sen, On progressively censored inverted exponentiated Rayleigh distribution, J. Stat. Comput. Simul., 89 (2018), 492–518. https://doi.org/10.1080/00949655.2018.1558225
    [6] M. M. Yousef, A. S. Hassan, A. H. Al-Nefaie, Bayesian estimation using MCMC method of system reliability for inverted Topp–Leone distribution based on ranked set sampling, Mathematics, 10 (2022), 3122. https://doi.org/10.3390/math10173122
    [7] A. A. Mutairi, A. Alrashidi, N. T. Al-Sayed, S. M. Behairy, M. Elgarhy, S. G. Nassr, Bayesian and E-Bayesian estimation based on constant-stress partially accelerated life testing for inverted Topp–Leone distribution, Open Phys., 21 (2023), 20230126. https://doi.org/10.1515/phys-2023-0126
    [8] H. Panahi, Reliability analysis for stress–strength model from inverted exponentiated Rayleigh based on the hybrid censored sample, Int. J. Qual. Reliab. Ma., 40 (2022), 1412–1428. https://doi.org/10.1108/IJQRM-05-2021-0130
    [9] H. Panahi, N. Moradi, Estimation of the inverted exponentiated Rayleigh distribution based on adaptive type Ⅱ progressive hybrid censored sample, J. Comput. Appl. Math., 364 (2020), 112345. https://doi.org/10.1016/j.cam.2019.112345
    [10] M. K. Rastogi, Y. M. Tripathi, Estimation for an inverted exponentiated Rayleigh distribution under type Ⅱ progressive censoring, J. Appl. Stat., 41 (2014), 2375–2405. https://doi.org/10.1080/02664763.2014.910500
    [11] L. Wang, S.-J. Wu, Y. M. Tripathi, S. Dey, Estimation and prediction of modified progressive hybrid censored data from inverted exponentiated Rayleigh distribution, Qual. Technol. Quant. M., 21 (2023), 502–524. https://doi.org/10.1080/16843703.2023.2219556
    [12] A. K. Al-kadim, A. N. Hussein, New proposed length-biased weighted exponential and Rayleigh distribution with application, Math. Theor. Model., 4 (2014), 137–152.
    [13] H. J. Kim, Reliability estimation in the mixed distribution of exponential and Rayleigh distributions, J. Korean Data Anal. Soc., 13 (2011), 1687–1696.
    [14] S. Mondal, D. Kundu, On the joint type-Ⅱ generalized progressive hybrid censoring scheme, Commun. Stat. Theory Methods, 49 (2020), 958–976. https://doi.org/10.1080/03610926.2018.1554128
    [15] H. M. Almongy, F. Y. Alshenawy, Applying transformer insulation using Weibull extended distribution based on progressive censoring scheme, Axioms, 10 (2021), 100. https://doi.org/10.3390/axioms10020100
    [16] A. F. Hashem, S. A. Alyami, Utilizing empirical Bayes estimation to assess reliability in inverted exponentiated Rayleigh distribution with progressive hybrid censored medical data, Axioms, 12 (2023), 872. https://doi.org/10.3390/axioms12090872
    [17] G. V. Sriramachandran, M. Palanivel, Acceptance sampling plan from truncated life tests based on exponentiated inverse Rayleigh distribution, Am. J. Math. Manag. Sci., 33 (2014), 20–35. https://doi.org/10.1080/01966324.2013.877362
    [18] T. Kayal, Y. M. Tripathi, M. K. Rastogi, Estimation and prediction for an inverted exponentiated Rayleigh distribution under hybrid censoring, Commun. Stat. Theory Methods, 47 (2018), 1615–1640. https://doi.org/10.1080/03610926.2017.1322702
    [19] S. Anwar, L. S. Ahmad, A. Khan, S. Almutlak, Stress-strength reliability estimation for the inverted exponentiated Rayleigh distribution under unified progressive hybrid censoring with application, Electron. Res. Arch., 31 (2023), 4011–4033. https://doi.org/10.3934/era.2023204
    [20] I. Chalabi, High-resolution sea clutter modelling using compound inverted exponentiated Rayleigh distribution, Remote Sens. Lett., 14 (2023), 433–441. https://doi.org/10.1080/2150704X.2023.2215894
    [21] L. Wang, Y. M. Tripathi, C. Lodhi, X. J. Zuo, Inference for constant-stress Weibull competing risks model under generalized progressive hybrid censoring, Math. Comput. Simul., 192 (2021), 70–83. https://doi.org/10.1016/j.matcom.2021.08.017
    [22] H. Z. Muhammed, E. M. Almetwally, Bayesian and non-Bayesian estimation for the shape parameters of new versions of bivariate inverse Weibull distribution based on progressive type Ⅱ censoring, Ann. Data Sci., 10 (2023), 481–512. https://doi.org/10.1007/s40745-020-00316-7
    [23] M. Z. Raqab, Discriminating between the generalized Rayleigh and Weibull distributions, J. Appl. Stat., 40 (2013), 1480–1493. https://doi.org/10.1080/02664763.2013.788614
    [24] J. Fan, W. Gui, Statistical inference of inverted exponentiated Rayleigh distribution under joint progressively type-Ⅱ censoring, Entropy, 24 (2022), 171. https://doi.org/10.3390/e24020171
    [25] S. Gao, J. Yu, W. Gui, Pivotal inference for the inverted exponentiated Rayleigh distribution based on progressive type-Ⅱ censored data, Am. J. Math. Manag. Sci., 39 (2020), 315–328. https://doi.org/10.1080/01966324.2020.1762142
    [26] J. Ma, L. Wang, Y. M. Tripathi, M. K. Rastogi, Reliability inference for stress-strength model based on inverted exponential Rayleigh distribution under progressive type-Ⅱ censored data, Commun. Stat. Simul. Comput., 52 (2021), 2388–2407. https://doi.org/10.1080/03610918.2021.1908552
    [27] M. E. Ghitany, V. K. Tuan, N. Balakrishnan, Likelihood estimation for a general class of inverse exponentiated distributions based on complete and progressively censored data, J. Stat. Comput. Simul., 84 (2014), 96–106. https://doi.org/10.1080/00949655.2012.696117
    [28] M. Shrahili, A. R. El-Saeed, A. S. Hassan, I. Elbatal, M. Elgarhy, Estimation of entropy for log-logistic distribution under progressive type-Ⅱ censoring, J. Nanomater., 2022 (2022), 2739606. https://doi.org/10.1155/2022/2739606
    [29] M. Z. Raqab, Discriminating between the generalized Rayleigh and Weibull distributions, J. Appl. Stat., 40 (2013), 1480–1493. https://doi.org/10.1080/02664763.2013.788614
    [30] S. Gao, W. Gui, Parameter estimation of the inverted exponentiated Rayleigh distribution based on progressively first-failure censored samples, Int. J. Syst. Assur. Eng., 10 (2019), 925–936. https://doi.org/10.1007/s13198-019-00822-9
    [31] W. Yan, P. Li, Y. Yu, Statistical inference for the reliability of Burr-XII distribution under improved adaptive type-Ⅱ progressive censoring, Appl. Math. Model., 95 (2021), 38–52. https://doi.org/10.1016/j.apm.2021.01.050
    [32] A. F. Hashem, S. A. Alyami, Inference on a new lifetime distribution under progressive type Ⅱ censoring for a parallel-series structure, Complexity, 2021 (2021), 6684918. https://doi.org/10.1155/2021/6684918
    [33] D. A. Ramadan, Assessing the lifetime performance index of weighted Lomax distribution based on progressive type-Ⅱ censoring scheme for bladder cancer, Int. J. Biomath., 14 (2021), 2150018. https://doi.org/10.1142/S1793524521500182
    [34] J. Ren, W. Gui, Statistical analysis of adaptive type-Ⅱ progressively censored competing risks for Weibull models, Appl. Math. Model., 98 (2021), 323–342. https://doi.org/10.1016/j.apm.2021.05.008
    [35] D. Kundu, Bayesian inference and life testing plan for the Weibull distribution in presence of progressive censoring, Technometrics, 50 (2008), 144–154. https://doi.org/10.1198/004017008000000217
    [36] M. Han, E-Bayesian estimations of parameter and its evaluation standard: expected mean square error under different loss functions, Commun. Stat. Simul. Comput., 50 (2021), 1971–1988. https://doi.org/10.1080/03610918.2019.1589510
    [37] M. N. Abd, H. A. Rasheed, Bayesian estimation for the reliability function of two-parameter exponential distribution under different loss functions, AIP Conf. Proc., 3036 (2024), 040036. https://doi.org/10.1063/5.0203008
    [38] A. S. Hassan, E. A. Elsherpieny, R. E. Mohamed, Classical and Bayesian estimation of entropy for Pareto distribution in presence of outliers with application, Sankhya A, 85 (2023), 707–740. https://doi.org/10.1007/s13171-021-00274-z
    [39] A. Gelman, J. B. Carlin, A. Vehtari, D. B. Rubin, Bayesian data analysis, 3rd Ed., Boca Raton: Chapman & Hall/CRC, 2014. https://doi.org/10.1201/b16018
    [40] A. Kohansal, On estimation of reliability in a multicomponent stress-strength model for a Kumaraswamy distribution based on progressively censored sample, Stat. Pap., 60 (2019), 2185–2224. https://doi.org/10.1007/s00362-017-0916-6
    [41] B. Efron, Logistic regression, survival analysis, and the Kaplan-Meier curve, J. Amer. Statist. Assoc., 83 (1988), 414–425. https://doi.org/10.1080/01621459.1988.10478612
    [42] M. Z. Raqab, Discriminating between the generalized Rayleigh and Weibull distributions, J. Appl. Stat., 40 (2013), 1480–1493. https://doi.org/10.1080/02664763.2013.788614
    © 2025 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)