1. Introduction
Keller and Kamath [1] proposed the two-parameter inverse Weibull distribution (IWD) to model the degradation of mechanical components of diesel engines. Since then, the IWD has been regarded as an appropriate model for analyzing lifetime data. For example, Abhijit and Anindya [2] found that the IWD is superior to the normal distribution when ultrasonic pulse velocity is used to assess concrete structures. Elio et al. [3] proposed a new model, generated by an appropriate mixture of IWDs, for modeling extreme wind speeds. Langlands et al. [4] observed that breast cancer mortality data can be modeled and analyzed by the IWD. Beyond that, the IWD is widely used in reliability research. For example, Bi and Gui [5] considered the estimation of the stress-strength reliability of the IWD; they proposed an approximate maximum likelihood estimation for point and confidence interval estimation, and derived the Bayesian estimator and the highest posterior density (HPD) confidence interval using Gibbs sampling. Based on an adaptive type-Ⅰ progressive hybrid censoring scheme, Azm et al. [6] studied the estimation of the unknown parameters of the IWD from competing risks data; maximum likelihood estimation and Bayesian estimation were discussed, the asymptotic, bootstrap, and HPD confidence intervals were derived, and two sets of real data were analyzed to illustrate both estimation methods. Alslman and Helu [7] assumed that the two components were independent and identically distributed and considered the estimation of the stress-strength reliability for the IWD; the estimators were derived by maximum likelihood estimation and the maximum product of spacings method and compared via computer simulation. Shawky and Khan [8] assumed that stress and strength both followed the IWD and focused on the multi-component stress-strength model, obtaining the reliability estimate by maximum likelihood estimation. Their Monte Carlo simulation results indicated that the proposed estimation methods are effective.
The probability density function (PDF), the cumulative distribution function (CDF) and the reliability function of the IWD are respectively given by

f(t; η, λ) = ηλ t^{−λ−1} e^{−η t^{−λ}}, t > 0, (1.1)

F(t; η, λ) = e^{−η t^{−λ}}, t > 0, (1.2)

and

R(t) = 1 − F(t; η, λ) = 1 − e^{−η t^{−λ}}, t > 0.

Here, η > 0 is the rate parameter and λ > 0 is the shape parameter. For convenience, the IWD with PDF (1.1) will be denoted by IW(η,λ).
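Assuming the standard inverse Weibull form F(t) = e^{−ηt^{−λ}} with reliability R(t) = 1 − F(t), the three functions can be sketched in Python (the function names are ours, purely for illustration):

```python
import math

def iw_pdf(t, eta, lam):
    """PDF of the inverse Weibull distribution IW(eta, lam), t > 0."""
    return eta * lam * t ** (-lam - 1.0) * math.exp(-eta * t ** (-lam))

def iw_cdf(t, eta, lam):
    """CDF F(t) = exp(-eta * t^(-lam))."""
    return math.exp(-eta * t ** (-lam))

def iw_reliability(t, eta, lam):
    """Reliability (survival) function R(t) = 1 - F(t)."""
    return 1.0 - iw_cdf(t, eta, lam)

# sanity check: the PDF should integrate (numerically) to F(b) - F(a)
a, b, n = 0.1, 10.0, 20000
h = (b - a) / n
area = sum(iw_pdf(a + (k + 0.5) * h, 2.0, 2.0) for k in range(n)) * h
print(area, iw_cdf(b, 2.0, 2.0) - iw_cdf(a, 2.0, 2.0))
```

The midpoint-rule integral of the PDF matching the CDF increment is a quick consistency check between (1.1) and (1.2).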
The assessment of product reliability often relies on the collection of life data, which can be obtained through a life test. This test involves observing whether a group of test samples fail during the test and recording their corresponding failure time. If the life test continues until all the test samples fail, the failure time can be recorded for all the samples, resulting in complete data. However, if the test stops before all the test samples fail, the data collected are called censored data. With the continuous advancement of science and technology, products are becoming more reliable and have longer lifespans. Collecting complete data in a life test can be expensive, making reliability analysis based on censored data a popular research topic among scholars. A progressive type-Ⅱ censored sample can be expressed as follows: consider an experiment where n units are subjected to a life test at time zero, and the experimenter decides beforehand the number of failures to be observed, denoted by r. Upon observing the first failure time T1, Q1 out of the remaining n−1 surviving units are randomly selected and removed. At the second observed failure time T2, Q2 out of the remaining n−2−Q1 surviving units are randomly selected and removed. This process continues until the r-th failure is observed at time Tr, and the remaining Qr=n−r−Q1−Q2−...−Qr−1 surviving units are all removed. The sample T=(T1,T2,...,Tr) is referred to as a progressively type-Ⅱ censored sample of size r from a sample of size n with censoring scheme Q=(Q1,Q2,...,Qr).
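The removal mechanism just described can be mimicked directly in code. The sketch below is our own illustration (with exponential stand-in lifetimes): it observes r failures from n units on test and withdraws Q_i randomly chosen survivors after the i-th observed failure:

```python
import random

def progressive_type2_sample(lifetimes, Q):
    """Observe r = len(Q) failures, withdrawing Q[i] random survivors
    after the (i+1)-th observed failure time."""
    survivors = list(lifetimes)
    observed = []
    for q in Q:
        t = min(survivors)        # next failure among the remaining units
        survivors.remove(t)
        observed.append(t)
        # randomly withdraw q of the surviving units from the test
        for unit in random.sample(survivors, q):
            survivors.remove(unit)
    return observed               # T_1 < T_2 < ... < T_r

random.seed(1)
n, r = 20, 10
Q = [1] * r                       # removals must satisfy sum(Q) = n - r
sample = progressive_type2_sample([random.expovariate(1.0) for _ in range(n)], Q)
print(sample)
```

At each stage the pool shrinks by one failure plus Q_i removals, so the scheme must satisfy n = r + Q_1 + ... + Q_r.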
IWD is one of the commonly used lifetime distributions in reliability estimation [5,6,7,8]. Most of the estimation methods are maximum likelihood estimation and Bayesian estimation, and there is a lack of research on inverse moment estimation. Therefore, this paper aims to provide three methods to compute the point estimations and construct generalized confidence intervals of unknown parameters.
The rest of this paper is arranged as follows: In Section 2, the maximum likelihood estimators (MLEs) are obtained by the Newton-Raphson method. The Lindley approximation is used to derive the Bayesian estimators (BEs) in Section 3. The inverse moment estimators (IMEs) and the construction of the generalized confidence intervals (GCIs) are discussed in Section 4. In Section 5, a Monte Carlo simulation is conducted to evaluate the performance of these methods. Section 6 analyzes a set of real data as a demonstration. Finally, Section 7 gives the conclusions of this paper.
2. Maximum likelihood estimation
In this section, we will discuss the MLEs of η, λ and R(t). Due to the nonlinear nature of the likelihood equations, the Newton-Raphson method is considered to solve likelihood equations numerically.
Let T=(T1,T2,...,Tr) be the collected progressive type-Ⅱ sample under the censoring scheme Q=(Q1,Q2,...,Qr). Denote t=(t1,t2,...,tr) as the observation of T. We can easily get the likelihood function l(η,λ;t) as follows:
where δ=n(n−Q1−1)(n−Q1−Q2−2)...(n−Q1−Q2−...−Qr−1−r+1).
Then the log-likelihood function L(η,λ;t) is
Thus, the partial derivatives of L(η,λ;t) with respect to η and λ are respectively given by
Denote the MLEs of η and λ as ˆηML and ˆλML respectively; they are the solutions of the likelihood equations (2.5). By the invariance property of maximum likelihood estimation, the MLE ˆRML(t) is obtained by replacing the parameters with ˆηML and ˆλML. Since (2.3) and (2.4) are nonlinear, the Newton-Raphson method is used to solve the likelihood equations numerically. The elements of the Jacobian matrix are given in (2.6) to (2.9).
Define
and
where θ=(η,λ). The steps involved in Newton-Raphson iteration method for obtaining the MLEs of η and λ are given below.
Step 1. Pick an arbitrary starting estimate θ0 and a desired precision ε = 10^{−5}.
Step 2. Update θ0 as θnew=θ0−[H′(θ0)]−1H(θ0), where [H′(θ0)]−1 is the inverse matrix of H′(θ0).
Step 3. If |θnew−θ0|⩽ε, then ˆθML=θnew.
Step 4. If |θnew−θ0|>ε, set θ0=θnew and return to Step 2.
Step 5. Repeat from Step 2 to Step 4 until the condition in Step 3 is achieved.
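The steps above can be sketched generically in Python; the 2×2 Jacobian is approximated here by central differences, and the toy system (whose root is the golden ratio in each coordinate) merely stands in for the likelihood equations (2.5):

```python
def newton_raphson(H, theta0, eps=1e-5, max_iter=100):
    """Solve H(theta) = 0 for a 2-vector theta by Newton-Raphson,
    approximating the Jacobian H'(theta) by central differences."""
    x, y = theta0
    for _ in range(max_iter):
        f1, f2 = H(x, y)
        h = 1e-6
        # finite-difference 2x2 Jacobian
        j11 = (H(x + h, y)[0] - H(x - h, y)[0]) / (2 * h)
        j12 = (H(x, y + h)[0] - H(x, y - h)[0]) / (2 * h)
        j21 = (H(x + h, y)[1] - H(x - h, y)[1]) / (2 * h)
        j22 = (H(x, y + h)[1] - H(x, y - h)[1]) / (2 * h)
        det = j11 * j22 - j12 * j21
        # theta_new = theta - [H'(theta)]^{-1} H(theta)
        dx = (j22 * f1 - j12 * f2) / det
        dy = (-j21 * f1 + j11 * f2) / det
        x, y = x - dx, y - dy
        if abs(dx) <= eps and abs(dy) <= eps:
            break
    return x, y

# toy system: x^2 - y - 1 = 0, y^2 - x - 1 = 0 (root: golden ratio)
root = newton_raphson(lambda x, y: (x * x - y - 1.0, y * y - x - 1.0), (1.5, 1.5))
print(root)
```

In practice the analytic Jacobian entries (2.6)–(2.9) would replace the finite differences, but the update rule in Step 2 is identical.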
3. Bayesian estimation
Statistical inference draws on three kinds of information: population information, sample information and prior information. Inference based on the first two kinds is called classical statistics, while inference based on all three kinds together is called Bayesian statistics. Prior information exists before sampling and mostly comes from experience and historical data; the distribution obtained by processing prior information is called a prior distribution. Just as a random variable can be described by a distribution, Bayesian statisticians regard an unknown parameter as a random variable, so the parameter too can be described by a distribution, namely a prior distribution. Kundu and Howlader [9] considered the estimation of the parameters of the IWD using the Bayesian approach under the squared error loss function with type-Ⅱ censored samples. Akgul et al. [10] discussed Bayesian estimation of the step-stress partially accelerated life test model with type-Ⅰ censored samples for the IWD; the point estimators were obtained using the Lindley and Tierney-Kadane approximations, and the credible intervals were constructed using Gibbs sampling. Helu and Samawi [11] considered Bayesian inference for the IWD based on progressive first-failure censored data; the point estimators were derived under three loss functions and calculated by the Lindley approximation. Based on generalized adaptive progressively hybrid censored samples, Lee [12] considered the estimation of an uncertainty measure for the IWD, proposing the Tierney-Kadane approximation and an importance sampling technique for the BE and the HPD confidence interval.
In this section, the BEs of η, λ and R(t) are derived under the symmetric entropy (SE) loss function, scale squared error (SSE) loss function and LINEX loss function. Since there is a complex ratio of two integrals in BEs, Lindley approximation is proposed to solve this problem.
(i) The SE loss function is defined as (Xu et al. [13]):
(ii) The SSE loss function is defined as (Song et al. [14]):
(iii) The LINEX loss function is defined as (Varian [15]):
where ˆθ is an estimator of the unknown parameter θ, d is a nonnegative integer in the SSE loss function, and a≠0 is a constant in the LINEX loss function.
As an important part of Bayesian estimation, the selection of the prior distribution directly affects the final results. It usually follows two rules: making full use of prior information, such as experience and historical data, and being convenient for computation. The most widely used prior distributions are the non-informative prior, the conjugate prior and the hierarchical prior. The gamma prior is a conjugate prior, which simplifies the computation of the estimation results.
Assume the priors of η and λ are gamma distributions: η follows Gamma(a1,b1) and λ follows Gamma(a2,b2).
Based on (3.4) and (3.5), the joint prior distribution is
Using (2.1) and (3.6), the posterior distribution is
where K = [∫₀^{+∞}∫₀^{+∞} η^{r+b1−1} λ^{r+b2−1} e^{−a1η−a2λ} ∏_{i=1}^{r} t_i^{−λ−1} e^{−η t_i^{−λ}} (1−e^{−η t_i^{−λ}})^{Qi} dη dλ]^{−1}.
Based on (3.7), the margin posterior distribution of η is
The margin posterior distribution of λ is
Lindley [16] proposed an approximation algorithm to calculate the ratio of two integrals in the form:
Here W(η,λ) is a continuous function of η and λ, L(η,λ;t) is the log-likelihood function shown in (2.2), and J(η,λ) is the logarithm of the joint prior distribution (3.6). This ratio frequently occurs in Bayesian estimation, which is why the Lindley approximation is often used to calculate the Bayesian estimator.
The expression (3.10) can be approximated by (3.11) under regularity conditions and for large sample sizes. Here A, B and C are given by (3.12). Lηηη denotes the third partial derivative of the log-likelihood function (2.2) with respect to η, and ˆLηηη denotes the value of Lηηη at η=ˆηML. ψij is the (i,j) element of the inverse of the matrix (−Lij) (i,j=η,λ), and ˆψij denotes the value of ψij at η=ˆηML and λ=ˆλML. The other terms are defined analogously. The detailed expressions are presented in (3.13) to (3.18).
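The ratio of two double integrals that the Lindley method approximates can also be evaluated by brute-force quadrature, which is useful as a sanity check on the approximation. A minimal sketch with a made-up unnormalized posterior (a product of gamma kernels, so the exact posterior means are known):

```python
import math

# toy unnormalized posterior exp(L + J): eta^2 * exp(-eta) * lam * exp(-lam)
def post(eta, lam):
    return eta ** 2 * math.exp(-eta) * lam * math.exp(-lam)

def ratio(W, lo=0.0, hi=25.0, n=400):
    """Midpoint-rule approximation of the ratio of two double integrals
    E(W | T) = [integral of W * exp(L+J)] / [integral of exp(L+J)]."""
    h = (hi - lo) / n
    num = den = 0.0
    for i in range(n):
        eta = lo + (i + 0.5) * h
        for j in range(n):
            lam = lo + (j + 0.5) * h
            p = post(eta, lam)
            num += W(eta, lam) * p
            den += p
    return num / den

print(ratio(lambda eta, lam: eta))  # exact posterior mean of eta here is 3
```

For real data the grid evaluation becomes expensive, which is precisely why the closed-form Lindley approximation (3.11) is preferred.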
3.1. Bayesian estimation under SE loss function
Lemma 1. Suppose that T is a random sample. The BE ˆθSE of the unknown parameter θ under the SE loss function (3.1) for any prior distribution π(θ) is

ˆθSE = [E(θ|T)/E(θ^{−1}|T)]^{1/2},

where E(θ|T) and E(θ^{−1}|T) denote the posterior expectations of θ and θ^{−1}.
Proof. Based on the SE loss function (3.1), the Bayesian risk of ˆθSE is
To minimize R(ˆθSE), we only need to minimize E(S1(θ,ˆθSE)|T). For convenience, denote h1(ˆθSE)=E(S1(θ,ˆθSE)|T).
Because

h1(ˆθSE) = E(θ|T)/ˆθSE + ˆθSE·E(θ^{−1}|T) − 2,

the derivative is

h′1(ˆθSE) = −E(θ|T)/ˆθSE² + E(θ^{−1}|T).
The BE ˆθSE can be obtained by solving the equation h′1(ˆθSE)=0.
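Lemma 1 can be checked numerically: with stand-in posterior samples, the estimator formed from the two posterior moments should minimize the sample average of the symmetric entropy loss θ/ˆθ + ˆθ/θ − 2 (the loss form assumed here, consistent with the two posterior expectations in the lemma). A hypothetical sketch using a Gamma stand-in posterior:

```python
import random

random.seed(0)
# stand-in posterior samples of theta (Gamma(3, 1), purely illustrative)
theta = [random.gammavariate(3.0, 1.0) for _ in range(5000)]

m1 = sum(theta) / len(theta)                        # E(theta | T)
m_inv = sum(1.0 / t for t in theta) / len(theta)    # E(theta^{-1} | T)
be_se = (m1 / m_inv) ** 0.5                         # candidate BE under SE loss

def expected_se_loss(est):
    """Sample average of the symmetric entropy loss at estimator `est`."""
    return sum(t / est + est / t - 2.0 for t in theta) / len(theta)

# grid search for the minimizer of the posterior expected loss
grid = [0.5 + 0.01 * k for k in range(350)]
best = min(grid, key=expected_se_loss)
print(be_se, best)  # the two should agree up to the grid resolution
```

Since the sample expected loss is convex in the estimator, the grid minimizer lands on the grid point nearest the closed-form value.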
According to Lemma 1, the BEs ˆηSE, ˆλSE and ˆRSE(t) under the SE loss function can be written as

ˆηSE = [E(η|T)/E(η^{−1}|T)]^{1/2}, (3.20)

ˆλSE = [E(λ|T)/E(λ^{−1}|T)]^{1/2}, (3.21)

and

ˆRSE(t) = [E(R(t)|T)/E(R(t)^{−1}|T)]^{1/2}. (3.22)
From the marginal posterior distribution (3.8), the BE (3.20) may be written as
From the marginal posterior distribution (3.9), the BE (3.21) may be written as
From the posterior distribution (3.7), the BE (3.22) may be written as
It is obvious that (3.23)–(3.25) cannot be evaluated explicitly. Thus, the Lindley approximation is used to approximate them.
(i) When W(η,λ)=η, there are
Substituting W(η,λ)=η and (3.26) into the expression (3.11), the posterior expectation E(η|T) may be written as
(ii) When W(η,λ)=η−1, there are
Substituting W(η,λ)=η−1 and (3.28) into the expression (3.11), the posterior expectation E(η−1|T) may be written as
The BE ˆηSE of the rate parameter η under the SE loss function is obtained by substituting (3.27) and (3.29) into the expression (3.20).
Using the Lindley approximation, the BEs ˆλSE and ˆRSE(t) under the SE loss function can be obtained by steps similar to those above.
3.2. Bayesian estimation under SSE loss function
Lemma 2. Suppose that T is a set of simple random samples. The BE ˆθSSE of the unknown parameter θ under the SSE loss function (3.2) for any prior distribution π(θ) is

ˆθSSE = E(θ^{1−d}|T)/E(θ^{−d}|T).
Proof. The Bayesian risk of ˆθSSE based on SSE loss function (3.2) is
Denote h2(ˆθSSE)=E(S2(θ,ˆθSSE)|T); then

h2(ˆθSSE) = ˆθSSE²·E(θ^{−d}|T) − 2ˆθSSE·E(θ^{1−d}|T) + E(θ^{2−d}|T).

The derivative of h2(ˆθSSE) is

h′2(ˆθSSE) = 2ˆθSSE·E(θ^{−d}|T) − 2E(θ^{1−d}|T).
Therefore, the BE ˆθSSE under the SSE loss function is derived by solving equation h′2(ˆθSSE)=0.
According to Lemma 2, the BEs ˆηSSE, ˆλSSE and ˆRSSE(t) under the SSE loss function are

ˆηSSE = E(η^{1−d}|T)/E(η^{−d}|T), (3.31)

ˆλSSE = E(λ^{1−d}|T)/E(λ^{−d}|T), (3.32)

and

ˆRSSE(t) = E(R(t)^{1−d}|T)/E(R(t)^{−d}|T). (3.33)
From the marginal posterior distribution (3.8), the BE (3.31) may be written as
From the marginal posterior distribution (3.9), the BE (3.32) can be written as
From the posterior distribution (3.7), the BE (3.33) can be written as
Then, Lindley approximation is used to approximate (3.34) to (3.36).
(i) When W(η,λ)=η1−d, there are
Putting W(η,λ)=η1−d and (3.37) into (3.11), the posterior expectation E(η1−d|T) can be written as
(ii) When W(η,λ)=η−d, there are
Putting W(η,λ)=η−d and (3.39) into (3.11), the posterior expectation E(η−d|T) can be written as
Hence the BE ˆηSSE under the SSE loss function is obtained by substituting (3.38) and (3.40) into (3.31).
Using the Lindley approximation, the BEs ˆλSSE and ˆRSSE(t) under the SSE loss function can be obtained by similar steps.
3.3. Bayesian estimation under LINEX loss function
Lemma 3. Suppose that T is a set of simple random samples. The BE ˆθL of the unknown parameter θ under the LINEX loss function (3.3) for any prior distribution π(θ) is

ˆθL = −(1/a)·ln E(e^{−aθ}|T).
Proof. The Bayesian risk of ˆθL based on LINEX loss function (3.3) is
Denote h3(ˆθL)=E(S3(θ,ˆθL)|T); then

h3(ˆθL) = e^{aˆθL}·E(e^{−aθ}|T) − a(ˆθL − E(θ|T)) − 1.

The derivative of h3(ˆθL) is

h′3(ˆθL) = a·e^{aˆθL}·E(e^{−aθ}|T) − a.
Therefore, the BE ˆθL under the LINEX loss function is derived by solving equation h′3(ˆθL)=0.
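The LINEX minimizer admits the same style of numerical check, assuming the usual LINEX form e^{a(ˆθ−θ)} − a(ˆθ−θ) − 1: the grid minimizer of the sample expected loss should agree with −(1/a)·ln E(e^{−aθ}|T) up to the grid resolution. An illustrative sketch with stand-in posterior samples:

```python
import math
import random

random.seed(0)
a = 0.5  # illustrative LINEX constant
theta = [random.gammavariate(3.0, 1.0) for _ in range(3000)]

# candidate BE under LINEX loss: -(1/a) * ln E(exp(-a*theta) | T)
be_linex = -math.log(sum(math.exp(-a * t) for t in theta) / len(theta)) / a

def expected_linex_loss(est):
    """Sample average of the assumed LINEX loss at estimator `est`."""
    return sum(math.exp(a * (est - t)) - a * (est - t) - 1.0
               for t in theta) / len(theta)

grid = [0.5 + 0.01 * k for k in range(350)]
best = min(grid, key=expected_linex_loss)
print(be_linex, best)  # should agree up to the grid resolution
```

The sample expected LINEX loss is convex in the estimator, so the grid search recovers the closed-form minimizer.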
It follows from Lemma 3 that the BEs under the LINEX loss function are

ˆηL = −(1/a)·ln E(e^{−aη}|T), (3.42)

ˆλL = −(1/a)·ln E(e^{−aλ}|T), (3.43)

and

ˆRL(t) = −(1/a)·ln E(e^{−aR(t)}|T). (3.44)
Subsequently, (3.42)–(3.44) can be written as
Next, the explicit expressions of these BEs are obtained by using the Lindley approximation. When W(η,λ)=e−aη, there are
According to Lindley's formula (3.11), the posterior expectation E(e−aη|T) can be written as
The BE ˆηL under the LINEX loss function is derived by substituting (3.49) into (3.42). The BEs ˆλL and ˆRL(t) can be obtained by a similar method and are therefore not repeated here.
4. Estimation based on generalized pivotal quantity
In Sections 2 and 3, the MLEs and BEs have been derived. However, neither method yields explicit forms of the estimators easily, and the Newton-Raphson iteration requires suitable initial values. Therefore, in this section a generalized pivotal quantity is constructed for deriving the IMEs and GCIs. Compared with maximum likelihood estimation and Bayesian estimation, inverse moment estimation is much simpler computationally: it only requires some mathematical transformations and the solution of the resulting equations. Wang [17] proposed this method in 1992, naming it the inverse moment estimation method, and applied it to parameter estimation for the Weibull distribution. Since then, inverse moment estimation has been widely used and studied. For example, Luo et al. [18] used the inverse third-moment method when forecasting a single time series with a large number of predictors in the presence of a possibly nonlinear forecast function. Qin and Yuan [19] proposed an ensemble of IMEs to explore the central subspace. Based on progressively censored data, Gao et al. [20] proposed pivotal inference for the inverse exponentiated Rayleigh distribution, deriving point estimators by a method that combines pivotal quantities with inverse moment estimation.
In this section, the IMEs and GCIs of η, λ and R(t) are obtained by constructing the generalized pivot quantity.
4.1. Inverse moment estimation
Definition 1. Assume that T is a random variable with observation t, θ1 is the parameter of interest and θ2 is a nuisance parameter. A function G(T;t,θ1,θ2) is called a generalized pivotal quantity if it satisfies the following conditions:
(1) Given t, the distribution of G(T;t,θ1,θ2) is free of both θ1 and θ2.
(2) Given t, the observed value G(t;t,θ1,θ2) of G(T;t,θ1,θ2) does not depend on θ2.
First, let
The distribution of Xi is
Let Exp(1) be the standard exponential distribution. It is obvious that Xi∼Exp(1), and Xr<Xr−1<...<X1. Let
Then, S1,S2,...,Sr are mutually independent and Si∼Exp(1). Denote U = 2∑_{i=2}^{r} Si and V = 2S1; then U follows the χ² distribution with 2r−2 degrees of freedom and V follows the χ² distribution with 2 degrees of freedom. Finally, let
Therefore, G1 and G2 are independent; G1 follows the F distribution with (2, 2r−2) degrees of freedom and G2 follows the χ² distribution with 2r degrees of freedom. According to Definition 1, G1 is a generalized pivotal quantity for λ, but G2 is not a generalized pivotal quantity for either η or λ. Since (r−1)/(r−2) is the mean of F(2,2r−2) and 2r is the mean of χ²(2r), Theorem 1 can be derived.
Theorem 1. Let T=(T1,T2,...,Tr) be a progressive type-Ⅱ censored sample following IW(η, λ), and let Q=(Q1,Q2,...,Qr) be the censoring scheme. Denote t=(t1,t2,...,tr) as the observation of T. The IME ˆηG of η and the IME ˆλG of λ are determined by the following expressions. The IME ˆRG(t) is obtained by replacing the parameters with ˆηG and ˆλG.
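The two means invoked in Theorem 1 can be verified by simulating the pivotal construction directly: i.i.d. Exp(1) spacings give V = 2S1 ∼ χ²(2) and U = 2(S2+...+Sr) ∼ χ²(2r−2), so (V/2)/(U/(2r−2)) ∼ F(2,2r−2) and U+V ∼ χ²(2r). A quick Monte Carlo sketch:

```python
import random

random.seed(42)
r, N = 10, 100000
g1_sum = g2_sum = 0.0
for _ in range(N):
    s = [random.expovariate(1.0) for _ in range(r)]  # i.i.d. Exp(1) spacings
    v = 2.0 * s[0]                                   # ~ chi^2(2)
    u = 2.0 * sum(s[1:])                             # ~ chi^2(2r - 2)
    g1_sum += (v / 2.0) / (u / (2.0 * r - 2.0))      # ~ F(2, 2r - 2)
    g2_sum += u + v                                  # ~ chi^2(2r)
print(g1_sum / N, g2_sum / N)  # approx (r-1)/(r-2) and 2r
```

With r = 10 the simulated means settle near 9/8 and 20, matching the F(2,18) and χ²(20) means used to form the inverse moment equations.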
4.2. Generalized confidence interval
This section will discuss the GCIs by generalized pivotal quantity.
Lemma 4. Suppose that the constants ki (i=1,2,...,r) satisfy 0<k1<k2<...<kr, and denote G(λ) = r(r−1)·k_r^{−λ} / (∑_{i=1}^{r} k_i^{−λ} − r·k_r^{−λ}).
(i) G(λ) decreases monotonically when λ>0;
(ii) For any constant k>0, the equation G(λ)=k has exactly one solution.
Proof. (i) The derivative of G(λ) is
Since 0<k1<k2<...<kr, we have ln k1<ln k2<...<ln kr, and hence ln ki−ln kr<0 (i=1,2,...,r−1). Therefore, G(λ) is monotonically decreasing for λ>0.
(ii) Suppose that the equation G(λ)=k has two unequal solutions, λ1 and λ2 respectively. Based on G(λ1)=G(λ2), there is
That is
Here, (ki/kr)^{−λ} is strictly monotone in λ because kr/ki⩾1. Hence the above expression contradicts the supposition, and the equation G(λ)=k has only one solution.
Theorem 2. Let T=(T1,T2,...,Tr) with observation t=(t1,t2,...,tr) be a progressive type-Ⅱ censored sample following IW(η,λ), and let Q=(Q1,Q2,...,Qr) be the censoring scheme. Let Fω(2,2r−2) and χ²ω(2r) denote the upper ω quantiles of F(2,2r−2) and χ²(2r) respectively, with ω = (1−√(1−γ))/2. The 100(1−γ)% GCIs of η and λ are given as follows:
where φ(t1,...,tr,F1−ω(2,2r−2)) is the solution of Eq (4.8) and φ(t1,...,tr,Fω(2,2r−2)) is the solution of Eq (4.9).
Proof. From Section 4.1, G1∼F(2,2r−2) and G2∼χ²(2r), and G1 and G2 are independent, so
The inequality F1−ω(2,2r−2)<G1<Fω(2,2r−2) may be written as
Since F1−ω(2,2r−2) and Fω(2,2r−2) are positive constants, the preceding lemma implies that the above inequality is equivalent to
χ2ω(2r)<G2<χ21−ω(2r) is equivalent to
5. Monte Carlo simulation
In this section, the proposed estimation methods are compared using MATLAB. For point estimates, the mean squared errors (MSEs) are calculated by Eq (5.1). For interval estimates, the coverage probability (CP) is used to assess the performance of the GCIs. The simulation is carried out with true values (ηreal,λreal)=(2,2) and different n, r and Q, and the number of trials is N = 1000. The hyper-parameters of the prior distributions are (a1,b1)=(2,1.8) and (a2,b2)=(1.5,2), and the parameters of the SSE and LINEX loss functions are d=4 and a=4 respectively. The MSEs of η and λ are shown in Tables 1 and 2 respectively, the MSEs of R(t) with t=3 are shown in Table 3, and the CP values are shown in Table 4.
Balakrishnan and Sandhu [21] proposed an algorithm to produce progressive type-Ⅱ censored sample from any continuous distribution. The specific steps applied to the IWD are as follows:
Step 1. Generate samples w1,w2,...,wr from U(0,1), where w1,w2,...,wr are independent.
Step 2. Let vi = wi^{1/(i+Qr+Qr−1+...+Qr−i+1)} and ui = 1 − vr·vr−1·...·vr−i+1, i=1,2,...,r.
Step 3. Let ti=F−1(ui;ηreal,λreal), where F(⋅) is the CDF (1.2) of IWD, i=1,2,...,r.
Then, t1<t2<...<tr are progressive type-Ⅱ censored data from IW(ηreal,λreal) with censoring scheme Q. The calculation results can be found in Tables 1–3.
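Steps 1–3 translate directly into code. Inverting F(t)=e^{−ηt^{−λ}} gives F^{−1}(u)=(−ln u/η)^{−1/λ}; the generator below is our own sketch of the Balakrishnan-Sandhu algorithm under that assumed IWD form:

```python
import math
import random

def iw_inv_cdf(u, eta, lam):
    """Inverse CDF of IW(eta, lam): solves exp(-eta * t^(-lam)) = u."""
    return (-math.log(u) / eta) ** (-1.0 / lam)

def progressive_type2_iw(n, Q, eta, lam, rng=random):
    """Balakrishnan-Sandhu generator: progressive type-II censored
    sample of size r = len(Q) from IW(eta, lam) under scheme Q."""
    r = len(Q)
    assert n == r + sum(Q), "scheme must satisfy n = r + sum(Q)"
    w = [rng.random() for _ in range(r)]                                  # Step 1
    v = [w[i] ** (1.0 / (i + 1 + sum(Q[r - i - 1:]))) for i in range(r)]  # Step 2
    u = [1.0 - math.prod(v[r - i - 1:]) for i in range(r)]
    return [iw_inv_cdf(ui, eta, lam) for ui in u]                         # Step 3

random.seed(7)
sample = progressive_type2_iw(20, [1] * 10, 2.0, 2.0)
print(sample)  # t_1 < t_2 < ... < t_r
```

Since the ui are increasing and F^{−1} is increasing in u, the generated failure times come out already ordered.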
From Tables 1–3, we draw the following conclusions:
(i) Obviously, for the same n and r, the censoring scheme Q has a great influence on the MSE.
(ii) For the same n, r and Q, the BE of η under the SE loss function is better than the MLE and the IME.
(iii) For the same n, r and Q, the BE of λ under the SE loss function is better than the others, while the BE of λ under the SSE loss function is close to that under the LINEX loss function.
(iv) For the same n, r and Q, the MSEs of R(t) for the MLEs and for the BEs under the SE and SSE loss functions are relatively close, while the MSEs of the BEs under the LINEX loss function are larger.
From Table 4, we know that the CP values for η and λ are all close to 95%.
6. Real data analysis
There is a set of real data from Dumonceaux and Antle [22], which represents the maximum flood level (in millions of cubic feet per second) of the Susquehanna River at Harrisburg, Pennsylvania over 20 four-year periods (1890–1969) as:
0.265, 0.269, 0.297, 0.315, 0.324, 0.338, 0.379, 0.379, 0.392, 0.402, 0.412, 0.416, 0.418, 0.423, 0.449, 0.484, 0.494, 0.613, 0.654, 0.740.
According to the data, we can plot the empirical CDF and the fitted CDF of the IWD, as shown in Figure 1. For the IWD, we use the BEs under the SE loss function as the parameter values, i.e., η=0.0336 and λ=2.0431. Figure 1 shows that the IWD fits the data well, so we conclude that this distribution is a valid model for the data.
Now, the real data with censoring scheme (Q1,Q2,...,Q10)=(1,1,...,1) are as follows:
0.265, 0.297, 0.324, 0.379, 0.392, 0.412, 0.418, 0.449, 0.494, 0.654.
Before proceeding with the estimation, it is necessary to establish the existence and uniqueness of the MLE. However, proving this is complicated by the nonlinearity of the system of equations (2.5). For this reason, we visualize it in Figure 2, where L1 represents ∂L(η,λ;t)/∂η in Eq (2.3) and L2 represents ∂L(η,λ;t)/∂λ in Eq (2.4).
From Figure 2, we know that the two curves intersect at only one point, indicating the presence of a unique MLE.
The estimates and generalized confidence intervals obtained by the proposed methods are shown in Table 5.
7. Conclusions
In this paper, we have investigated the point estimation and interval estimation of the parameters of the IWD based on progressive type-Ⅱ censored samples. First, the Newton-Raphson iteration method is used to solve the likelihood equations and obtain the MLEs. Then, the BEs are derived under the SE, SSE and LINEX loss functions and computed by the Lindley approximation. Finally, the IMEs and the GCIs are constructed via a generalized pivotal quantity. A Monte Carlo simulation is used to assess the performance of the above estimators; the results reveal that the Bayesian estimators outperform the other methods in terms of MSE, with the BE under the SE loss function performing best among all the methods considered. Moreover, a real data set is analyzed, and the results coincide with the simulation.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This research was funded by National Natural Science Foundation of China (No. 71661012).
Conflict of interest
The authors declare there is no conflict of interest.