1. Introduction
The heavy-tailed distribution is an important model in extreme value statistics, with applications in finance, insurance, meteorology, and hydrology. Its primary parameter is the positive extreme value index (also known as the extreme value index of a heavy-tailed distribution, abbreviated as the heavy-tailed index), which characterizes the probability of extreme events such as the 100-year flood, enormous earthquakes, and large insurance claims (see [1,2]). Therefore, estimation of the extreme value index has become one of the main research problems in extreme value statistics.
Work on estimating the extreme value index began early, and in the 1970s researchers started developing semi-parametric estimators of the extreme value index. The seminal and most famous of these is the Hill estimator [3], favored by many scholars for its simple form; however, it is sensitive to the threshold k. In general, the Hill estimator has a large variance for small values of k, while large values of k usually induce a high bias, so an inadequate choice of k can result in large expected errors. Threshold selection is therefore also one of the most fundamental problems in extreme value statistics. Existing methods include graphical diagnostics, heuristic procedures based on the stability of the sample path as a function of k, and minimization of an estimate of the mean square error, again as a function of k. More comprehensive reviews of threshold selection can be found in [4,5].
Later, Dekkers et al. [6] proposed the moment estimator by constructing a moment statistic, improving on the Hill estimator. Many estimators of the extreme value index have since been built from the moment statistic, such as the moment ratio estimator [7], the estimator proposed by Gomes and Martins [8], the estimator proposed by Caeiro and Gomes [9], and Lehmer's mean-of-order-$p$ ($L_p$) estimator [10]; further estimators can be found in [11,12]. As a further step in this direction, this paper constructs a class of heavy-tailed index estimators with four parameters based on the moment statistic. Many estimators can be obtained from specific parameter values, including not only existing estimators from the literature but also newly derived ones. The consistency and asymptotic normality of the proposed estimators are investigated under the first-order and second-order regular variation conditions. The asymptotic unbiasedness of specific estimators is discussed, and the main components of the asymptotic variances of the asymptotically unbiased estimators are compared. In addition, their finite sample performance is analyzed through a Monte-Carlo simulation in terms of the simulated mean value and mean square error. The results show that some of the new estimators perform better.
Let $X$ be a non-negative random variable with distribution function $F(x)$ satisfying, for sufficiently large $x>0$,
$$1-F(x)=x^{-1/\gamma}L(x), \qquad \gamma>0. \qquad (1.1)$$
We call $F(x)$ a heavy-tailed distribution, and $\gamma$ is the heavy-tailed index, one of the primary parameters of extreme events, where $L(x)>0$ is a slowly varying function at infinity, that is, for all $t>0$,
$$\lim_{x\to\infty}\frac{L(tx)}{L(x)}=1.$$
If a positive measurable function $f$, defined in a neighborhood of infinity, satisfies, for all $t>0$,
$$\lim_{x\to\infty}\frac{f(tx)}{f(x)}=t^{\alpha},$$
then $f$ is said to be a regularly varying function with index $\alpha$, denoted $f\in RV_{\alpha}$.
Let $U(t):=F^{\leftarrow}(1-1/t)$, $t>1$, denote the tail quantile function, where $F^{\leftarrow}(y):=\inf\{x: F(x)\ge y\}$ is the generalized inverse function of $F$; from [13], the heavy-tailed distribution has the following equivalence relation:
$$1-F\in RV_{-1/\gamma} \iff U\in RV_{\gamma}, \qquad (1.2)$$
that is, $\lim_{t\to\infty}U(tx)/U(t)=x^{\gamma}$ for all $x>0$. In general, the equivalence relation (1.2) is called the first-order regular variation condition.
The classical heavy-tailed index estimators are all built from the $k$ largest order statistics, and their consistency is obtained under the first-order regular variation condition (1.2) and the following condition on the sequence $k=k_n$:
$$k\to\infty, \qquad k/n\to 0, \qquad \text{as } n\to\infty. \qquad (1.3)$$
In general, such a sequence $k$ is said to be intermediate if it satisfies (1.3).
In addition, in order to obtain the asymptotic normality of a heavy-tailed index estimator, a second-order regular variation condition is often needed. The tail quantile function $U$ is said to satisfy the second-order regular variation condition if there exists an eventually positive regularly varying function $A(t)$ with index $\rho$, that is, $A\in RV_{\rho}$, such that
$$\lim_{t\to\infty}\frac{\ln U(tx)-\ln U(t)-\gamma\ln x}{A(t)}=\frac{x^{\rho}-1}{\rho} \qquad (1.4)$$
for every $x>0$, where $\rho\le 0$ is the second-order parameter and the right-hand side is read as $\ln x$ when $\rho=0$. In this paper, we only consider the case $\rho<0$.
To construct new estimators, we shall consider the following moment statistic and give some heavy-tailed index estimators.
Let $X_1, X_2, \ldots, X_n$ be a sample of $n$ independent observations from a common unknown distribution function $F$, and denote by $X_{1,n}\le X_{2,n}\le\cdots\le X_{n,n}$ the ascending order statistics associated with the sample.
Let us consider the moment statistic
$$M_n^{(\alpha)}(k)=\frac{1}{k}\sum_{i=1}^{k}\left(\ln X_{n-i+1,n}-\ln X_{n-k,n}\right)^{\alpha}, \qquad \alpha>0. \qquad (1.5)$$
Now we introduce several heavy-tailed index estimators constructed from the moment statistic.
The Hill estimator is expressed as follows:
$$\hat{\gamma}^{H}(k)=M_n^{(1)}(k).$$
The moment estimator is expressed as follows:
$$\hat{\gamma}^{M}(k)=M_n^{(1)}(k)+1-\frac{1}{2}\left[1-\frac{\left(M_n^{(1)}(k)\right)^{2}}{M_n^{(2)}(k)}\right]^{-1}.$$
The moment estimator is asymptotically unbiased when $\gamma$ and $\rho$ satisfy an appropriate relation.
The moment ratio estimator is expressed as follows:
$$\hat{\gamma}^{MR}(k)=\frac{M_n^{(2)}(k)}{2M_n^{(1)}(k)}.$$
The estimators in [8] are expressed as follows:
and
where for any , when , there exists satisfying , such that is an asymptotically unbiased estimator. In addition, note that the estimator becomes the moment ratio estimator when .
The estimator in [9] is expressed as follows:
where for any , is an asymptotically unbiased estimator when .
The estimator in [10] is expressed as follows:
where the estimator becomes the moment ratio estimator when .
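For concreteness, the following minimal numerical sketch implements the moment statistic and those of the above estimators whose closed forms are standard in the cited literature (Hill, moment, moment ratio, and Lehmer's mean-of-order-$p$, the latter in its commonly used form $M_n^{(p)}(k)/(p\,M_n^{(p-1)}(k))$); the function names and the Pareto test sample are ours and purely illustrative.

```python
import numpy as np

def moment_statistic(x, k, alpha):
    """M_n^(alpha)(k): average of the alpha-th powers of the top-k log-excesses."""
    xs = np.sort(np.asarray(x, dtype=float))
    log_excesses = np.log(xs[-k:]) - np.log(xs[-k - 1])   # ln X_{n-i+1,n} - ln X_{n-k,n}, i = 1,...,k
    return np.mean(log_excesses ** alpha)

def hill(x, k):
    """Hill estimator: M_n^(1)(k)."""
    return moment_statistic(x, k, 1)

def moment_estimator(x, k):
    """Moment estimator of Dekkers et al. (standard form for positive gamma)."""
    m1 = moment_statistic(x, k, 1)
    m2 = moment_statistic(x, k, 2)
    return m1 + 1.0 - 0.5 / (1.0 - m1 ** 2 / m2)

def moment_ratio(x, k):
    """Moment ratio estimator: M_n^(2)(k) / (2 M_n^(1)(k))."""
    return moment_statistic(x, k, 2) / (2.0 * moment_statistic(x, k, 1))

def lehmer(x, k, p):
    """Lehmer mean-of-order-p estimator (assumed form); p = 2 reduces to the moment ratio."""
    return moment_statistic(x, k, p) / (p * moment_statistic(x, k, p - 1))

# Quick check on a strict Pareto sample with gamma = 0.5, i.e., 1 - F(x) = x^{-2} for x >= 1:
rng = np.random.default_rng(0)
sample = (1.0 - rng.uniform(size=5000)) ** (-0.5)
k = 500
print(hill(sample, k), moment_estimator(sample, k), moment_ratio(sample, k), lehmer(sample, k, 3))
```

All four estimates should be close to 0.5 for this Pareto sample; the new four-parameter estimators of Section 2 are built from the same moment statistic.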
2. New estimators
Based on the moment statistic (1.5), we construct a class of heavy-tailed index estimators with four parameters, whose expression is as follows:
where , and , see (1.5) and (1.6), respectively.
Before discussing the properties of the proposed estimator, the following two lemmas are introduced.
Lemma 2.1. [8] Let $X_1,\ldots,X_n$ be a sample of $n$ independent observations from a common unknown distribution function $F$, and denote by $X_{1,n}\le\cdots\le X_{n,n}$ the ascending order statistics associated with the sample. If $F$ satisfies the first-order regular variation condition (1.2) and $k$ is intermediate, then the moment statistic satisfies $M_n^{(\alpha)}(k)\xrightarrow{P}\Gamma(\alpha+1)\,\gamma^{\alpha}$ as $n\to\infty$.
Lemma 2.2. [8] Let $X_1,\ldots,X_n$ be a sample of $n$ independent observations from a common unknown distribution function $F$, and denote by $X_{1,n}\le\cdots\le X_{n,n}$ the ascending order statistics associated with the sample. If $U$ satisfies the second-order regular variation condition (1.4) and $k$ is intermediate, then the moment statistic $M_n^{(\alpha)}(k)$ has the following asymptotic distributional representation
where , , is an asymptotically standard normal random variable, , are independent and identically distributed standard exponential random variables, and the covariance of and is denoted by
The consistency and asymptotic normality of the new estimators are discussed below.
Theorem 2.1. Let $X_1,\ldots,X_n$ be a sample of $n$ independent observations from a common unknown distribution function $F$, and denote by $X_{1,n}\le\cdots\le X_{n,n}$ the ascending order statistics associated with the sample. If $F$ satisfies the first-order regular variation condition (1.2) and $k$ is intermediate, then
Proof. According to Lemma 2.1 and the continuous mapping theorem, we have , , and . Using the continuous mapping theorem again, we obtain . □
Theorem 2.2. Let $X_1,\ldots,X_n$ be a sample of $n$ independent observations from a common unknown distribution function $F$, and denote by $X_{1,n}\le\cdots\le X_{n,n}$ the ascending order statistics associated with the sample. If $U$ satisfies the second-order regular variation condition (1.4) and $k$ is intermediate, then
where
is an asymptotically standard normal random variable.
Proof. By Lemma 2.2, we obtain
and
Using , one gets
From , it follows that
Combining (2.6) and (2.7), we obtain
Applying again, we obtain
Similar to the proof of (2.8), we have
Thus, combining (2.5), (2.8), and (2.9), it follows that
From the above equation, we can see that
Let , and , we have
From Lemma 2.2, one can deduce that , are independent, identically distributed random variables. It is easy to obtain that and . Applying the Lindeberg-Lévy central limit theorem, we get that is an asymptotically standard normal random variable, and
□
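For reference, the central limit step invoked in the proof is the classical Lindeberg-Lévy theorem in its i.i.d. form, stated here for a generic i.i.d. sequence (not in the paper's notation):

```latex
% Lindeberg-Levy CLT: if Z_1, Z_2, ... are i.i.d. with E[Z_1] = \mu and 0 < Var(Z_1) = \sigma^2 < \infty, then
\sqrt{k}\,\frac{\frac{1}{k}\sum_{i=1}^{k} Z_i - \mu}{\sigma} \;\xrightarrow{d}\; N(0,1), \qquad k \to \infty .
```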
Corollary 2.1. Under the conditions of Theorem 2.2, suppose that , then
Remark 1. For every , if there exist , such that , then the corresponding estimator is an asymptotically unbiased estimator of $\gamma$, even when .
3. Specific expression of new estimators
The estimator has four parameters; to deal with practical problems, one can specify the parameters to obtain different concrete estimators. We give ten specific estimators below.
(E1) ;
(E2) ;
(E3) ;
(E4) ;
(E5) ;
(E6) ;
(E7) ;
(E8) ;
(E9) ;
(E10) .
Among the above-mentioned ten estimators, (E1)–(E5) are existing estimators and (E6)–(E10) are new. As described in [3,8,9,10], (E1), (E3), and (E5) are asymptotically biased estimators, while (E2) and (E4) are asymptotically unbiased when appropriate parameters are chosen. The asymptotic unbiasedness of the new estimators (E6)–(E10) is discussed below.
Let , then . Therefore, is an increasing function for a given .
Thus, the main component of the asymptotic bias of the new estimator in Theorem 2.2 can be rewritten as
Since the main component of the asymptotic bias of each of the estimators (E6)–(E10) involves only one parameter, it is abbreviated as for convenience. The asymptotic unbiasedness of the estimators (E6)–(E10) is discussed below.
For the estimator (E6), the main component of the asymptotic bias is . Since , is a decreasing function on for a given . Also, , and . It is easy to see that when , there exists such that , i.e., the estimator (E6) with the parameter is an asymptotically unbiased estimator.
For the estimator (E7), the main component of the asymptotic bias is . Since , and , there always exists such that , i.e., the estimator (E7) with the parameter is asymptotically unbiased, where satisfies .
For the estimator (E8), the main component of the asymptotic bias is . Since , and , there always exists such that , i.e., the estimator (E8) with the parameter is asymptotically unbiased, where .
For the estimator (E9), the main component of the asymptotic bias is . Since , and , there always exists such that , i.e., the estimator (E9) with the parameter is asymptotically unbiased, where satisfies .
For the estimator (E10), the main component of the asymptotic bias is . Since , and , there always exists such that , i.e., the estimator (E10) with the parameter is asymptotically unbiased, where satisfies .
In summary, the estimator (E6) can be asymptotically unbiased only if it satisfies and a suitable is chosen. However, for any , one can always find such that the estimators (E7)–(E10) are asymptotically unbiased and the value of depends only on .
Next, we compare the main components of the asymptotic variances of the above-mentioned asymptotically unbiased estimators. The asymptotically unbiased estimators (E2), (E4), (E7), (E8), (E9), and (E10) are considered here. For convenience, the main components of the asymptotic variances of the estimators involved are denoted as . Given different values of , we compute the ratio of to for the compared estimators (the ratio depends only on ) and the corresponding values of . The results of the calculations are shown in Table 1. From Table 1, we can draw the following conclusions.
1) All ratios decrease as decreases.
2) For a given , , and are smaller, followed by and , and finally .
3) For , and , when , ; when , .
Overall, among the asymptotically unbiased estimators compared, the estimators (E8) and (E10) perform better, with smaller asymptotic variances.
4. Comparison of finite sample properties
In order to investigate the finite-sample performance of the estimators (E2), (E4), and (E7)–(E10) mentioned in the previous section, Monte-Carlo simulations are used to generate samples of size $n$ from the following models (a sampling sketch is given after the list).
1) the Fréchet($\gamma$) model;
2) the Burr($\gamma$, $\rho$) model.
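Since the model formulas are not reproduced above, the following sketch assumes the parametrizations that are standard in this literature, namely $F(x)=\exp(-x^{-1/\gamma})$ for the Fréchet($\gamma$) model and $F(x)=1-(1+x^{-\rho/\gamma})^{1/\rho}$, $\rho<0$, for the Burr($\gamma,\rho$) model; samples are drawn by inversion, and the function names are ours.

```python
import numpy as np

def rfrechet(rng, n, gamma):
    """Fréchet(gamma) sample by inversion, assuming F(x) = exp(-x^(-1/gamma))."""
    u = rng.uniform(size=n)
    return (-np.log(u)) ** (-gamma)

def rburr(rng, n, gamma, rho):
    """Burr(gamma, rho) sample by inversion, assuming F(x) = 1 - (1 + x^(-rho/gamma))^(1/rho), rho < 0."""
    u = rng.uniform(size=n)
    return ((1.0 - u) ** rho - 1.0) ** (-gamma / rho)

rng = np.random.default_rng(1)
frechet_sample = rfrechet(rng, 1000, gamma=1.0)       # Fréchet(1) model used in Figures 1 and 2
burr_sample = rburr(rng, 1000, gamma=1.0, rho=-1.0)   # Burr(1, -1) model used in Figures 3 and 4
```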
For convenience, the estimators involved are denoted as . For each estimator, we compute the simulated mean value (E) and mean square error (MSE), namely the average of the estimates and the average of their squared deviations from the true value of $\gamma$ over the independent replications.
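A minimal sketch of this computation is given below; `estimator` and `sampler` are placeholders for any of the estimators and samplers sketched earlier.

```python
import numpy as np

def simulated_mean_and_mse(estimator, sampler, gamma_true, n, k, runs=1000, seed=0):
    """Simulated mean (E) and mean square error (MSE) of an estimator at level k,
    averaged over `runs` independent samples of size n."""
    rng = np.random.default_rng(seed)
    estimates = np.array([estimator(sampler(rng, n), k) for _ in range(runs)])
    return estimates.mean(), np.mean((estimates - gamma_true) ** 2)

# Example (hill and rfrechet as in the earlier sketches):
# e_val, mse_val = simulated_mean_and_mse(hill, lambda rng, n: rfrechet(rng, n, 1.0),
#                                         gamma_true=1.0, n=1000, k=100)
```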
From the discussion in the previous section, it can be seen that the parameter values of the asymptotically unbiased estimators depend only on the second-order parameter $\rho$. Therefore, the following simulations are divided into two cases: $\rho$ known and $\rho$ unknown. For the case where $\rho$ is unknown, we adopt the approach of [14,15] to obtain an estimator of $\rho$, which in turn is used to obtain an estimate of the parameter in the compared estimators. Specifically, the following is discussed.
We estimate the parameter $\rho$ by the following estimator proposed in [14]:
where
To decide which value (0 or 1) of the tuning parameter to take in the aforementioned estimator, we use the algorithm provided in [15]. For the estimator , following the recommendation of [15], we use .
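The exact expressions of the $\rho$-estimator in [14] and of the level at which it is computed are not reproduced above. Purely as an illustration, the sketch below implements a widely used estimator of the second-order parameter of this tuning-parameter type ($\tau\in\{0,1\}$), of the form $\hat{\rho}=-|3(T-1)/(T-3)|$; the estimator actually used in [14,15] may differ in its details.

```python
import numpy as np

def moment_statistic(x, k, alpha):
    """M_n^(alpha)(k): average of the alpha-th powers of the top-k log-excesses."""
    xs = np.sort(np.asarray(x, dtype=float))
    return np.mean((np.log(xs[-k:]) - np.log(xs[-k - 1])) ** alpha)

def rho_estimate(x, k, tau=0):
    """Second-order parameter estimate of the tau-type: rho_hat = -|3 (T - 1) / (T - 3)|."""
    m1, m2, m3 = (moment_statistic(x, k, a) for a in (1, 2, 3))
    if tau == 0:
        num = np.log(m1) - 0.5 * np.log(m2 / 2.0)
        den = 0.5 * np.log(m2 / 2.0) - np.log(m3 / 6.0) / 3.0
    else:
        num = m1 ** tau - (m2 / 2.0) ** (tau / 2.0)
        den = (m2 / 2.0) ** (tau / 2.0) - (m3 / 6.0) ** (tau / 3.0)
    t = num / den
    return -abs(3.0 * (t - 1.0) / (t - 3.0))
```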
The simulation results are shown in Figures 1–4. Figure 1 shows the simulated mean values and MSEs of the estimators under study for samples of size $n$ from the Fréchet(1) model when $\rho$ is unknown. The simulated mean values show that all estimators are almost asymptotically unbiased. Regarding the simulated MSEs, for almost all values of $k$, we can see that
and is almost equal to . This is generally consistent with the theoretical analysis.
Figure 2 is the equivalent of Figure 1 when $\rho$ is known, and is similar to Figure 7 in [9]. The simulated mean values show that all estimators are asymptotically unbiased. Regarding the simulated MSEs, for every $k$, we can see that
In addition, is almost equal to and is closer to . This is consistent with the theoretical analysis.
Figures 3 and 4 show the simulated mean values and MSEs of the estimators under study for samples of size $n$ from the Burr(1, -1) model when $\rho$ is unknown and known, respectively. The estimators perform essentially the same in terms of the simulated mean values and MSEs. The simulated mean values show that the estimators seem to be asymptotically unbiased for some of the values of $k$.
In this case, for most values of $k$, it is clear from the simulated MSEs that
For other parameter values of the Burr model, the conclusions are similar to those of the above analysis.
In the following, the above-mentioned estimators are compared in terms of the simulated mean value and mean square error at the optimal level,
where the optimal level is the value of $k$ minimizing the mean square error. The estimate of the parameter $\rho$ is given by the estimator in (4.1). The estimators computed at the optimal level are denoted accordingly. We have implemented Monte-Carlo simulation experiments of size 100 for several sample sizes $n$ from the Fréchet and Burr models. For each model, the mean values with the smallest squared bias and the smallest root mean square error (RMSE) are presented in bold, where RMSE $=\sqrt{\text{MSE}}$. The simulated results are shown in Tables 2–5.
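A sketch of how the optimal level can be located in a simulation study is given below; the optimal level is taken as the $k$ minimizing the simulated MSE, and the helper names are ours, with estimators and samplers as in the earlier sketches.

```python
import numpy as np

def optimal_level(estimator, sampler, gamma_true, n, runs=100, seed=0):
    """Optimal level k0: the k in {2, ..., n-1} minimizing the simulated MSE."""
    rng = np.random.default_rng(seed)
    samples = [sampler(rng, n) for _ in range(runs)]
    ks = np.arange(2, n)
    mses = [np.mean([(estimator(s, k) - gamma_true) ** 2 for s in samples]) for k in ks]
    best = int(np.argmin(mses))
    return int(ks[best]), float(mses[best])

# Example (hill and rburr as in the earlier sketches):
# k0, mse0 = optimal_level(hill, lambda rng, n: rburr(rng, n, 1.0, -1.0), gamma_true=1.0, n=500)
```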
From Tables 2–5, we now provide a few comments.
1) For all simulated models, the simulated values of and are almost equal, especially in terms of RMSE.
2) For models with , has a smaller squared bias than other estimators in terms of the simulated mean value.
3) For models with , has a smaller squared bias than other estimators in terms of the simulated mean value. However, has a smaller RMSE than other estimators when .
Overall, the new estimators perform well within a certain range.
5. Conclusions
In extreme value statistics, estimation of the heavy-tailed index is one of the most important current research topics. By means of the moment statistic, this paper constructs a class of heavy-tailed index estimators with four parameters. The consistency and asymptotic normality of the proposed estimators are proved under the first-order and second-order regular variation conditions. Ten specific estimators are obtained from particular parameter values, including both existing estimators from the literature and new ones. The asymptotic unbiasedness of the new specific estimators is discussed. Among the asymptotically unbiased estimators, some of the new estimators are compared with the existing ones in terms of asymptotic variance, and the new estimators perform better. In the finite sample case, the simulated mean value and mean square error of the compared estimators were calculated by Monte-Carlo simulation. The results show that the ordering of the simulated mean square errors of the asymptotically unbiased estimators is consistent with the theoretical analysis. In addition, we compared the simulated mean value and mean square error of the estimators at the optimal level. It is concluded that the new estimators perform better within a certain range. Although we propose a class of parameterized heavy-tailed index estimators, the selection of the parameters is still an open problem and will require further study.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
The authors would like to express their gratitude to the editor and two anonymous reviewers for their valuable comments and suggestions, which have greatly improved the quality of the paper. This work was supported by the National Natural Science Foundation of China (12001395).
Conflict of interest
The authors declare there is no conflict of interest.