Research article

Empirical likelihood based heteroscedasticity diagnostics for varying coefficient partially nonlinear models

  • Received: 26 September 2024 Revised: 04 November 2024 Accepted: 06 December 2024 Published: 12 December 2024
  • MSC : 62G05, 62G20, 62H15

  • Heteroscedasticity diagnostics of error variance is essential before performing some statistical inference work. This paper is concerned with the statistical diagnostics for the varying coefficient partially nonlinear model. We propose a novel diagnostic approach for heteroscedasticity of error variance in the model by combining it with the empirical likelihood method. Under some mild conditions, the nonparametric version of the Wilks theorem is obtained. Furthermore, simulation studies and a real data analysis are implemented to evaluate the performances of our proposed approaches.

    Citation: Cuiping Wang, Xiaoshuang Zhou, Peixin Zhao. Empirical likelihood based heteroscedasticity diagnostics for varying coefficient partially nonlinear models[J]. AIMS Mathematics, 2024, 9(12): 34705-34719. doi: 10.3934/math.20241652




    As one of the most important semiparametric regression models, the varying coefficient partially linear model (VCPLM) integrates the varying-coefficient model and the classic linear model. It can be described as follows:

    $$Y = X^{T}\alpha(U) + Z^{T}\beta + \varepsilon, \tag{1.1}$$

    where $Y \in \mathbb{R}$ represents the response variable; $X = (X_1, X_2, \ldots, X_q)^{T}$, $Z = (Z_1, Z_2, \ldots, Z_p)^{T}$, and $U \in \mathbb{R}$ represent the associated covariates; $\varepsilon$ is the model error; $\beta = (\beta_1, \ldots, \beta_p)^{T}$ is an unknown $p$-dimensional parameter vector; and $\alpha(\cdot) = (\alpha_1(\cdot), \ldots, \alpha_q(\cdot))^{T}$ is an unknown $q$-dimensional varying coefficient function vector. Model (1.1) has been extensively researched because it has both the interpretability of a parametric structure and the flexibility of a nonparametric structure.

    In model (1.1), the unknown $\beta$ enters linearly through the term $Z^{T}\beta$. In practical applications, a strictly linear relationship may be inappropriate. To describe the relationship between $Y$ and certain covariates more accurately, Li and Mei [1] introduced the varying coefficient partially nonlinear model (VCPNLM), which takes the following form:

    $$Y = g(Z,\beta) + X^{T}\alpha(U) + \varepsilon. \tag{1.2}$$

    The difference between these two models is that model (1.2) extends $Z^{T}\beta$ in (1.1) to $g(Z,\beta)$, where $g(\cdot,\cdot)$ is a pre-specified function. Notably, the dimensions of the parameter vector $\beta$ and the covariate $Z$ in $g(Z,\beta)$ need not coincide. Taking the generalized linear model as an example, we can write $\exp(a + Z^{T}\beta)$ as $g(Z,\tilde{\beta})$ with $g(\cdot) = \exp(\cdot)$ and $\tilde{\beta} = (a, \beta^{T})^{T}$, where both $a$ and $\beta$ are parameters.

    Compared to the VCPLM (1.1), the VCPNLM (1.2) has stronger adaptability. Therefore, it is of great significance to conduct statistical studies related to model (1.2), and the model has been extensively researched since its introduction. Li and Mei [1] presented the profile nonlinear least squares estimators for both the unknown $\beta$ and $\alpha(\cdot)$. Zhou et al. [2] constructed confidence regions for the unknown quantities by employing the empirical likelihood technique. Jiang et al. [3] put forward a robust estimation approach based on a novel loss function related to the exponential squared loss in the case where variables have measurement errors. Xiao and Chen [4] promoted the local bias-corrected empirical likelihood procedures to deal with the additive errors for the nonparametric component. Dai and Huang [5] treated the distorted measurement errors in both the response and the covariates. Qian and Huang [6] proposed the corrected profile least squares estimation procedure with measurement errors in the nonparametric part. For model (1.2) with missing data, Wang et al. [7] used the inverse probability weighted profile nonlinear least squares approach and the empirical likelihood technique to deal with the missing covariates. Xia et al. [8] developed statistical inferences with missing responses. Furthermore, Xiao and Liang [9] proposed a robust two-stage estimation method based on modal regression. Xiao and Shi [10] studied robust estimation for model (1.2) with a nonignorable missing response. Zhou and Zhao [11] studied model (1.2) under the framework of quantile regression with a censored response variable and a missing censoring indicator.

    All the aforementioned studies were conducted under the assumption of equal variances of $\varepsilon_i$, $i = 1, \ldots, n$, that is, $E\varepsilon_i = 0$ and $\mathrm{Var}(\varepsilon_i) = \sigma^2$, $i = 1, \ldots, n$. However, using procedures for homoscedastic models when the errors are heteroscedastic may lead to a loss of efficiency. Therefore, it is crucial to check for heteroscedasticity before performing statistical inference. The diagnosis of heteroscedasticity has received considerable attention from many scholars; see [12,13,14] for partial linear models and [15,16,17] for VCPLMs with measurement errors. Statistical procedures based on the empirical likelihood technique ([18,19,20]) have many advantages. A significant one is that there is no need for a variance estimation. Recently, the technique has also been shown to work well for testing heteroscedastic errors; see, for example, [12,13,14]. To our knowledge, the diagnosis of heteroscedasticity for the VCPNLM (1.2) by means of the empirical likelihood technique has not yet been studied.

    Taking the above into account, in this paper we develop a diagnostic method for heteroscedasticity in model (1.2) based on the empirical likelihood technique. Assuming that the variance of $\varepsilon_i$ satisfies $\mathrm{Var}(\varepsilon_i) = \sigma_i^2$, the hypothesis testing problem can be defined as follows:

    $$H_0: \sigma_i^2 = \sigma^2 \quad \text{vs.} \quad H_1: \sigma_i^2 \neq \sigma^2, \tag{1.3}$$

    where $\sigma^2$ denotes a constant. We are concerned with constructing a test for heteroscedasticity by invoking the empirical likelihood technique, which does not require specifying the distribution of the errors. Under several regularity conditions, we derive the corresponding Wilks theorem. Finally, we verify the feasibility of our proposed method through simulation studies.

    The remainder is described as follows: the methodology and the main results of the empirical likelihood based diagnostics method are introduced in Section 2; some simulation studies are implemented to exhibit the finite sample performances of our proposed test statistics in Section 3; we use the Boston housing price data to illustrate our proposed method in Section 4; and the conclusions and ongoing works are presented in Section 5. The proofs of the main results are presented in the Appendix.

    Denote $\{(Y_i, Z_i, X_i, U_i), i = 1, \ldots, n\}$ as i.i.d. copies of $(Y, Z, X, U)$; then, the individual form of the VCPNLM is as follows:

    $$Y_i = g(Z_i,\beta) + X_i^{T}\alpha(U_i) + \varepsilon_i, \quad i = 1, \ldots, n. \tag{2.1}$$

    Suppose that $\beta$ is known beforehand; then, (2.1) can be reexpressed as the following varying coefficient model:

    $$Y_i - g(Z_i,\beta) = \sum_{j=1}^{q}\alpha_j(U_i)X_{ij} + \varepsilon_i, \quad i = 1, \ldots, n. \tag{2.2}$$

    First, we employ the classic local linear smoothing technique to derive the estimator of $\{\alpha_j(\cdot), j = 1, \ldots, q\}$ in model (2.2). By Taylor's expansion, for $u$ in a small neighborhood of $u_0$, $\alpha_j(u)$ can be locally approximated by the following linear form:

    $$\alpha_j(u) \approx \alpha_j(u_0) + \alpha_j'(u_0)(u - u_0), \quad j = 1, \ldots, q. \tag{2.3}$$

    According to (2.3), the estimator of $\{(\alpha_j(u_0), \alpha_j'(u_0)), j = 1, \ldots, q\}$ can be obtained by minimizing the following weighted local least-squares criterion:

    $$\sum_{i=1}^{n}\Big\{Y_i - g(Z_i,\beta) - \sum_{j=1}^{q}\big[\alpha_j(u_0) + \alpha_j'(u_0)(U_i - u_0)\big]X_{ij}\Big\}^{2}K_h(U_i - u_0), \tag{2.4}$$

    where $K(\cdot)$ is the kernel function, $K_h(\cdot) = K(\cdot/h)/h$, and $h$ is the bandwidth, which can be determined by the usual methods.

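    We introduce the following matrix notation for simplicity. Let

    $$Y = (Y_1, \ldots, Y_n)^{T}, \quad M = (X_1^{T}\alpha(U_1), \ldots, X_n^{T}\alpha(U_n))^{T}, \quad g(Z,\beta) = (g(Z_1,\beta), \ldots, g(Z_n,\beta))^{T},$$
    $$K(u_0) = \mathrm{diag}(K_h(U_1 - u_0), \ldots, K_h(U_n - u_0)), \quad H(u_0) = (\alpha(u_0)^{T}, h\alpha'(u_0)^{T})^{T},$$

    and

    $$X(u_0) = \begin{pmatrix} X_1^{T} & h^{-1}(U_1 - u_0)X_1^{T} \\ \vdots & \vdots \\ X_n^{T} & h^{-1}(U_n - u_0)X_n^{T} \end{pmatrix}.$$

    With this notation, the estimator of $H(u_0)$ is given by

    $$\hat{H}(u_0) = [X(u_0)^{T}K(u_0)X(u_0)]^{-1}X(u_0)^{T}K(u_0)[Y - g(Z,\beta)]. \tag{2.5}$$

    Then, the estimator of $\alpha(\cdot)$ at $u_0$ is obtained by taking only the first $q$ components, that is,

    $$\hat{\alpha}(u_0) = (I_q, 0_q)[X(u_0)^{T}K(u_0)X(u_0)]^{-1}X(u_0)^{T}K(u_0)[Y - g(Z,\beta)], \tag{2.6}$$

    where $I_q$ represents the $q \times q$ identity matrix, and $0_q$ represents the $q \times q$ zero matrix.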
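
    The closed form (2.6) can be sketched in code for the scalar case $q = 1$. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `local_linear_alpha`, the argument `R` (holding $Y_i - g(Z_i,\beta)$ for a given $\beta$), and the choice of the Epanechnikov kernel are introduced here for illustration only.

```python
def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75 * (1 - t^2) on |t| < 1, else 0."""
    return 0.75 * (1.0 - t * t) if abs(t) < 1.0 else 0.0


def local_linear_alpha(u0, U, X, R, h):
    """Local linear estimate of (alpha(u0), alpha'(u0)) for scalar X (q = 1).

    R holds Y_i - g(Z_i, beta) for a given beta (hypothetical argument name).
    The design columns follow X(u0): X_i and (U_i - u0) / h * X_i, so the
    solved vector is H(u0) = (alpha(u0), h * alpha'(u0)).
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for u, x, r in zip(U, X, R):
        w = epanechnikov((u - u0) / h) / h      # K_h(U_i - u0)
        d1 = x
        d2 = (u - u0) / h * x
        s11 += w * d1 * d1
        s12 += w * d1 * d2
        s22 += w * d2 * d2
        t1 += w * d1 * r
        t2 += w * d2 * r
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("local design is singular; try a larger bandwidth")
    a = (s22 * t1 - s12 * t2) / det             # alpha(u0)
    hb = (s11 * t2 - s12 * t1) / det            # h * alpha'(u0)
    return a, hb / h
```

    When the true coefficient function is exactly linear and the data are noise-free, the local linear fit reproduces it, which gives a convenient sanity check for the sketch.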

    To test for heteroscedasticity in model (2.1), we first rewrite the error variance as follows:

    $$\mathrm{Var}(\varepsilon_i) = \sigma^2 m_i,$$

    where $m_i > 0$. Similar to the arguments in Liu et al. [16], we presume that $m_i$ possesses the following structure:

    $$m_i = m(U_i, \gamma). \tag{2.7}$$

    Here, $m_i$ is assumed to depend on the known covariate $U_i$ and an unknown $q \times 1$ vector $\gamma$. The structure of the function $m(\cdot,\cdot)$ is usually known in advance. In addition, we assume that $m(\cdot,\cdot)$ is differentiable with respect to $\gamma$ and that there exists a unique $\gamma^{*}$ that satisfies $m(U_i, \gamma^{*}) = 1$ for all $U_i$. Thus, (1.3) is converted to the following hypothesis testing problem:

    $$H_0: \gamma = \gamma^{*} \quad \text{vs.} \quad H_1: \gamma \neq \gamma^{*}. \tag{2.8}$$

    In order to utilize the empirical likelihood technique, we first consider the following estimating functions:

    $$\begin{cases} h_{1i} = \eta_i\big[(Y_i - X_i^{T}\alpha(U_i) - g(Z_i,\beta))^{2} - \sigma^{2}\big], \\ h_{2i} = \tilde{g}(Z_i,\beta)\big[Y_i - X_i^{T}\alpha(U_i) - g(Z_i,\beta)\big]. \end{cases} \tag{2.9}$$

    Here $\eta_i = (\dot{m}_i^{T}, 1)^{T}$, where $\dot{m}_i$ represents the derivative of $m_i$ with respect to $\gamma$ under the null hypothesis $H_0$. Write $h_i = (h_{1i}^{T}, h_{2i}^{T})^{T} \in \mathbb{R}^{p+q+1}$; then $E(h_i) = 0$ under $H_0$. Intuitively, the above heteroscedasticity test problem is converted to testing whether $E(h_i) = 0$, which can be done by means of the empirical likelihood technique.
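
    As an illustration of how the estimating functions in (2.9) are assembled, the following sketch evaluates them for scalar $Z$, $X$, and $\gamma$ with $g(Z,\beta) = \exp(Z\beta)$ and $m(U,\gamma) = \exp(\gamma U)$, the forms used later in the simulation. Two points are assumptions made here and are not stated in the text: $\tilde{g}$ is taken to be the derivative of $g$ with respect to $\beta$, and the function name `estimating_functions` and the argument `alpha_at_U` (the values $\alpha(U_i)$) are hypothetical.

```python
import math


def estimating_functions(Y, Z, X, U, alpha_at_U, beta, sigma2):
    """Return h_i = (h_1i, h_2i) from (2.9) for each observation.

    Assumptions (illustrative only): g(z, beta) = exp(z * beta),
    g-tilde = dg/dbeta = z * exp(z * beta), and m(u, gamma) = exp(gamma * u),
    whose derivative in gamma at gamma = 0 is u, so eta_i = (U_i, 1).
    """
    out = []
    for y, z, x, u, a in zip(Y, Z, X, U, alpha_at_U):
        g = math.exp(z * beta)
        g_grad = z * g                       # assumed form of g-tilde
        resid = y - x * a - g                # Y_i - X_i alpha(U_i) - g(Z_i, beta)
        eta = (u, 1.0)
        h1 = tuple(e * (resid ** 2 - sigma2) for e in eta)
        h2 = (g_grad * resid,)
        out.append(h1 + h2)                  # vector of length p + q + 1 = 3
    return out
```

    Replacing `alpha_at_U` by estimated values $\hat{\alpha}(U_i)$ gives the plug-in version of the same functions.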

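    Let $p_1, p_2, \ldots, p_n$ be nonnegative numbers whose sum is 1, that is, $\sum_{i=1}^{n}p_i = 1$. Under the null hypothesis $H_0$, we can construct the profile empirical likelihood ratio for $\gamma$, $\sigma^2$, $\beta$ as follows:

    $$R_0(\gamma;\sigma^2,\beta) = \max\Big\{\prod_{i=1}^{n}np_i \,\Big|\, \sum_{i=1}^{n}p_ih_i = 0,\ p_i \geq 0,\ \sum_{i=1}^{n}p_i = 1\Big\}. \tag{2.10}$$

    Here, $\beta$ and $\sigma^2$ are nuisance parameters. Note that $R_0(\gamma;\sigma^2,\beta)$ cannot be used to construct a test directly, because it contains the unknown $\gamma$, $\beta$, $\sigma^2$, and $\alpha(\cdot)$. One way to deal with this issue is to substitute $\hat{\alpha}(\cdot)$ for $\alpha(\cdot)$. Therefore, we define $\hat{h}_{1i}$ and $\hat{h}_{2i}$ as follows:

    $$\begin{cases} \hat{h}_{1i} = \eta_i\big[(Y_i - X_i^{T}\hat{\alpha}(U_i) - g(Z_i,\beta))^{2} - \sigma^{2}\big], \\ \hat{h}_{2i} = \tilde{g}(Z_i,\beta)\big[Y_i - X_i^{T}\hat{\alpha}(U_i) - g(Z_i,\beta)\big]. \end{cases} \tag{2.11}$$

    Denote $\hat{h}_i = (\hat{h}_{1i}^{T}, \hat{h}_{2i}^{T})^{T}$. The estimated profile empirical likelihood ratio is then expressed by the following:

    $$R(\gamma;\sigma^2,\beta) = \max\Big\{\prod_{i=1}^{n}np_i \,\Big|\, \sum_{i=1}^{n}p_i\hat{h}_i = 0,\ p_i \geq 0,\ \sum_{i=1}^{n}p_i = 1\Big\}. \tag{2.12}$$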

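    Using the method of Lagrange multipliers, we obtain the optimal value of $p_i$ as follows:

    $$p_i = \frac{1}{n(1 + \lambda^{T}\hat{h}_i)}, \tag{2.13}$$

    where $\lambda$ satisfies the following equation:

    $$\frac{1}{n}\sum_{i=1}^{n}\frac{\hat{h}_i}{1 + \lambda^{T}\hat{h}_i} = 0. \tag{2.14}$$

    Substituting (2.13) into (2.12), we have the following:

    $$-2\log R(\gamma;\sigma^2,\beta) = 2\sum_{i=1}^{n}\log(1 + \lambda^{T}\hat{h}_i). \tag{2.15}$$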
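
    The computation in (2.13)–(2.15) can be sketched for the simplest case of a scalar estimating function. This is a simplified illustration, not the procedure used in the paper: there $\hat{h}_i$ has $p+q+1$ components and $\lambda$ would be found by a multivariate Newton-type iteration, whereas the hypothetical function `el_statistic_scalar` below solves the one-dimensional version of (2.14) by bisection, using the fact that the left-hand side is strictly decreasing in $\lambda$.

```python
import math


def el_statistic_scalar(h, iters=200):
    """Empirical likelihood statistic for scalar estimating functions h_i.

    Solves sum_i h_i / (1 + lam * h_i) = 0 for lam by bisection on the
    interval where every 1 + lam * h_i stays positive, then returns
    (lam, 2 * sum_i log(1 + lam * h_i)), the scalar analogue of (2.15).
    """
    hmin, hmax = min(h), max(h)
    if not (hmin < 0.0 < hmax):
        raise ValueError("0 must lie strictly inside the range of h")
    lo, hi = -1.0 / hmax, -1.0 / hmin        # open interval of valid lam
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        score = sum(v / (1.0 + mid * v) for v in h)
        if score > 0.0:                      # score is decreasing in lam
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    stat = 2.0 * sum(math.log1p(lam * v) for v in h)
    return lam, stat
```

    For example, with `h = [-1.0, 2.0]` the multiplier is $\lambda = 1/4$, which gives weights $p = (2/3, 1/3)$ from (2.13); these sum to one and satisfy $\sum_i p_i h_i = 0$.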

    To establish the nonparametric Wilks theorem for $-2\log R(\gamma;\sigma^2,\beta)$, the following regularity conditions C1–C7 are needed (see Zhou et al. [2]), and condition C8 is needed for the proof.

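    C1 The density function $f(u)$ of $U$ is Lipschitz continuous and bounded away from zero on its bounded support $\mathcal{U}$.

    C2 $\Gamma(u)$ is a $q \times q$ nonsingular matrix for $u$ in the support. The matrices $\Gamma(u)$, $\Gamma(u)^{-1}$, and $\Phi(u)$ are Lipschitz continuous.

    C3 There exists an $s > 2$ that satisfies $E\|X\|^{2s} < \infty$, $E\|g(Z,\beta)\|^{2s} < \infty$, $E\|\varepsilon\|^{2s} < \infty$, and $E\|U\|^{2s} < \infty$, where $\|\cdot\|$ denotes the Euclidean norm. Meanwhile, for some $0 < \delta < 2 - s^{-1}$, $n^{2\delta - 1}h \to \infty$ holds.

    C4 $\{\alpha_j(u), j = 1, \ldots, q\}$ is continuous with respect to $u$ in $U \in \Omega$.

    C5 The kernel function $K(\cdot)$ is a univariate symmetric density function that satisfies the Lipschitz condition. The functions $u^{3}K(u)$ and $u^{3}K'(u)$ are bounded and $\int u^{4}K(u)\,du < \infty$.

    C6 $nh^{8} \to 0$ and $nh^{2}/(\log n)^{2} \to \infty$ hold.

    C7 $g(z,\beta)$ is continuous with respect to $\beta$ for any $z$, and the derivatives of $g(z,\beta)$ with respect to $\beta$ are all continuous, where $\beta \in \mathcal{B}$ and $\mathcal{B}$ is a compact set.

    C8

    $$\frac{1}{n}\begin{pmatrix} \sum_{i=1}^{n}\eta_i\eta_i^{T} & \sum_{i=1}^{n}\eta_i\tilde{g}(Z_i,\beta)^{T} \\ \sum_{i=1}^{n}\tilde{g}(Z_i,\beta)\eta_i^{T} & \sum_{i=1}^{n}\tilde{g}(Z_i,\beta)\tilde{g}(Z_i,\beta)^{T} \end{pmatrix} \to \Sigma \equiv \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix},$$

    and

    $$B_{11} > 0, \quad B_{22} > 0.$$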

    The following Theorem 1 describes the asymptotic behavior of $-2\log R(\gamma;\sigma^2,\beta)$.

    Theorem 1. Suppose that Conditions C1–C8 hold. Under the null hypothesis, we have the following:

    $$-2\log R(\gamma;\sigma^2,\beta) \xrightarrow{L} \chi^{2}_{p+q+1},$$

    where "$\xrightarrow{L}$" represents convergence in distribution and $\chi^{2}_{p+q+1}$ represents the chi-square distribution with $p+q+1$ degrees of freedom.

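    To deal with the nuisance parameters $\beta$ and $\sigma^2$ under the null hypothesis $H_0$, we define

    $$R(\gamma) \equiv \max_{\sigma^2,\beta}R(\gamma;\sigma^2,\beta), \tag{2.16}$$

    that is, we maximize $R(\gamma;\sigma^2,\beta)$ with respect to $\beta$ and $\sigma^2$. Then, the profiled statistic has the following asymptotic result:

    $$-2\log R(\gamma) \xrightarrow{L} \chi^{2}_{q}.$$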
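
    In practice the profiled statistic is compared with a chi-square reference distribution. The sketch below covers only the special case of a scalar $\gamma$, for which the reference distribution is $\chi^2_1$ and its upper tail has the closed form $\mathrm{erfc}(\sqrt{x/2})$. The function names `chi2_sf_df1` and `reject_homoscedasticity` are hypothetical; for other degrees of freedom a chi-square routine from a statistics library would be needed.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability P(chi2_1 > x), equal to erfc(sqrt(x / 2))."""
    if x < 0.0:
        raise ValueError("the statistic must be nonnegative")
    return math.erfc(math.sqrt(x / 2.0))


def reject_homoscedasticity(stat, level=0.05):
    """Reject H0 when the p-value of the statistic falls below the level.

    Valid only for a scalar gamma (chi-square with 1 degree of freedom).
    """
    return chi2_sf_df1(stat) < level
```

    A statistic equal to the square of the standard normal 0.975 quantile (about 3.84) gives a p-value of 0.05, which is the familiar $\chi^2_1$ critical value at the 5% level.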

    In this section, we assess the finite sample performance of the proposed method by simulation studies. Let the data be generated from the following VCPNLM:

    $$Y = g(Z,\beta) + X\alpha(U) + \varepsilon. \tag{3.1}$$

    Specifically, $g(Z,\beta) = \exp\{Z\beta\}$ with $\beta = 2$ and $\alpha(U) = \sin(2\pi U)$. The model error $\varepsilon$ comes from the normal distribution (Case 1) or the uniform distribution (Case 2), with $E(\varepsilon \mid X, Z, U) = 0$ and $\mathrm{Var}(\varepsilon \mid X, Z, U) = \sigma^2 m(U,\gamma)$, where $\sigma^2 = 1$ and $m(U,\gamma) = \exp(\gamma U)$. Obviously, $\gamma = 0$ corresponds to the null hypothesis, and $\gamma \neq 0$ corresponds to the alternative hypothesis. The covariates are generated as $Z \sim N(0,1)$, $X \sim N(0,1)$, and $U \sim U(0,1)$, and $Y$ is generated according to model (3.1). Throughout the simulation studies, we use the Epanechnikov kernel $K(u) = \frac{3}{4}(1 - u^2)_{+}$, and the bandwidth is taken as $h = cn^{-1/5}$, where the constant $c$ is chosen as the standard deviation of the covariate $U$.
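
    The data-generating design above can be sketched as follows for the normal-error case (Case 1). This is an illustrative sketch, not the authors' simulation code: the function names `simulate_vcpnlm` and `bandwidth` are hypothetical, and Case 2 is omitted because the range of the uniform errors is not stated in the text.

```python
import math
import random


def simulate_vcpnlm(n, gamma, beta=2.0, sigma2=1.0, rng=None):
    """Draw (Y, Z, X, U) from model (3.1) with normal errors (Case 1).

    Var(eps | U) = sigma2 * exp(gamma * U); gamma = 0 gives homoscedastic errors.
    """
    rng = rng or random.Random()
    Z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    X = [rng.gauss(0.0, 1.0) for _ in range(n)]
    U = [rng.random() for _ in range(n)]
    Y = []
    for z, x, u in zip(Z, X, U):
        sd = math.sqrt(sigma2 * math.exp(gamma * u))
        eps = rng.gauss(0.0, sd)
        Y.append(math.exp(z * beta) + x * math.sin(2.0 * math.pi * u) + eps)
    return Y, Z, X, U


def bandwidth(U):
    """h = c * n^(-1/5) with c the sample standard deviation of U."""
    n = len(U)
    mean = sum(U) / n
    sd = math.sqrt(sum((u - mean) ** 2 for u in U) / (n - 1))
    return sd * n ** (-0.2)
```

    Passing a seeded `random.Random` instance as `rng` makes a replication reproducible.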

    To evaluate the performance of the proposed method, the sample size is taken as $n = 200$, $400$, and $600$, and the nominal level is 0.05. For each setting, we run 1000 replications. The resulting empirical power of the proposed empirical likelihood ratio test is displayed in Table 1 and Figure 1. We make the following observations:

    Table 1.  The power of the empirical likelihood ratio test under different model errors.
    Model error   n     γ=0     γ=0.2   γ=0.4   γ=0.6   γ=0.8
    Case 1        200   0.099   0.166   0.404   0.689   0.893
    Case 1        400   0.074   0.258   0.647   0.944   0.994
    Case 1        600   0.051   0.320   0.859   0.992   1.000
    Case 2        200   0.091   0.156   0.433   0.712   0.906
    Case 2        400   0.079   0.250   0.683   0.946   0.998
    Case 2        600   0.050   0.362   0.842   0.993   1.000

    Figure 1.  The power curves of empirical likelihood ratio test under model errors.

    (ⅰ) When the null hypothesis holds ($\gamma = 0$), the empirical rejection rate decreases as the sample size grows and approaches the nominal level 0.05 (for example, from 0.099 at $n = 200$ to 0.051 at $n = 600$ in Case 1). This indicates that the proposed test controls the probability of a Type Ⅰ error in large samples.

    (ⅱ) For any given $n$, the results under the two error distributions are very similar. This indicates that the proposed test performs comparably for different model errors.

    (ⅲ) When the alternative hypothesis holds, the power quickly tends to 1 as the sample size increases. In this respect, the proposed heteroscedasticity test for the VCPNLM is effective.
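
    When reading the $\gamma = 0$ column of Table 1, it helps to keep the Monte Carlo error in mind. The sketch below uses the standard binomial standard error for a rejection rate estimated from independent replications; this check is added here for illustration and is not part of the paper's analysis, and the function names are hypothetical.

```python
import math


def mc_standard_error(p, reps):
    """Binomial standard error of a rejection rate estimated from reps runs."""
    return math.sqrt(p * (1.0 - p) / reps)


def z_from_nominal(observed, nominal, reps):
    """Distance of an observed rejection rate from the nominal level,
    in units of the Monte Carlo standard error at the nominal level."""
    return (observed - nominal) / mc_standard_error(nominal, reps)
```

    With 1000 replications and a nominal level of 0.05, the standard error is about 0.0069, so the rate 0.099 observed at $n = 200$ is roughly seven standard errors above the nominal level, while 0.051 at $n = 600$ is well within one standard error.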

    Next, we compare the proposed empirical likelihood ratio test method with the profile likelihood ratio (PLR) test method used in [21]. In this simulation, the nominal level is taken as α=0.05, the sample size is taken n=400, and the experiments are repeated 1000 times for each case. The simulation results of the 1000 replicates are shown in Figure 2, where the dashed line is the empirical power function based on the empirical likelihood ratio (ELR) test method proposed by this paper, and the dotted line is the empirical power function based on the PLR method.

    Figure 2.  The testing power of the ELR and PLR methods under n=400.

    From Figure 2, we can see that the empirical power functions obtained by the ELR test method and the PLR test method both increase rapidly as the value of $\gamma$ increases. In addition, the empirical power of the ELR test method is higher than that of the PLR test method.

    In this section, we analyse the Boston housing price data to illustrate the testing procedure proposed in this paper. The data set contains information on 506 different houses from a variety of locations in the Boston Standard Metropolitan Statistical Area in 1970. Many researchers have analyzed this data set by using the partially linear additive model, the partially linear additive spatial autoregressive model, the partially linear single-index model, and other semiparametric models (see [22,23,24]). The objective of these studies is to evaluate the factors that influence the price of owner-occupied homes, such as the per capita crime rate by town, the weighted distances to five Boston employment centres, the average number of rooms per dwelling, and other factors. For the purpose of our demonstration, we use the following variables: the pupil-teacher ratio by town (denoted by PTRATIO), the index of accessibility to radial highways (denoted by RAD), the percentage of lower status of the population (denoted by LSTAT), the per capita crime rate by town (denoted by CRIME), and the median value of owner-occupied homes in USD 1000's (denoted by MEDV).

    In addition, Zhang et al. [25] pointed out that the covariate CRIME has a nonlinear effect on the response. Hence, we fit this data set by using the following model:

    $$Y_i = \exp(Z_{i1}\beta_1 + Z_{i2}\beta_2) + X_i\alpha(U_i) + \varepsilon_i, \quad i = 1, \ldots, 506,$$

    where $Y_i$ is the response MEDV, $U_i$ is the covariate CRIME, $X_i$ is the covariate log(LSTAT), and $Z_{i1}$ and $Z_{i2}$ are the covariates RAD and PTRATIO, respectively. The logarithmic transformation of the covariate LSTAT is taken to reduce the trouble caused by large gaps in its domain.

    Here, we consider the null hypothesis $H_0: \mathrm{Var}(\varepsilon_i \mid Z_{i1}, Z_{i2}, X_i, U_i) = \sigma^2$. Using the ELR testing procedure proposed in this paper, we find that the p-value of this testing problem is 0.3484. This means that the null hypothesis cannot be rejected at the nominal level 0.05, which suggests that the variance of the model error $\varepsilon$ does not depend significantly on the covariates.

    In this paper, we were concerned with statistical inference for the VCPNLM. Using the empirical likelihood method, we proposed a diagnostic technique for heteroscedasticity in semiparametric varying-coefficient partially nonlinear models. Under some mild conditions, the nonparametric version of the Wilks theorem was derived and proven. Furthermore, simulation studies were performed to illustrate the performance of the proposed method. Missing data are common in many fields, and ignoring them reduces the effective information. Therefore, our forthcoming work is to carry out statistical inference for the VCPNLM in the case of missing data.

    Cuiping Wang: Writing-original draft, Conceptualization, Formal analysis, Methodology; Xiaoshuang Zhou: Funding acquisition, Validation and data analysis, Methodology, Supervision, Writing-review and editing; Peixin Zhao: Funding acquisition, Software, Supervision, Writing-review and editing. All authors have read and approved the final version of the manuscript for publication.

    The authors declare they have used Artificial Intelligence (AI) tools in the creation of this article.

    Xiaoshuang Zhou's research was supported by the Natural Science Foundation of Shandong Province (Grant Nos. ZR2020MA021 and ZR2022MA065); Peixin Zhao's research was supported by the National Social Science Foundation of China (Grant No. 24BTJ062).

    The authors declare no conflicts of interest in this paper.

    Several lemmas are needed to prove the main result.

    Lemma 1. Assume that Conditions C1–C8 hold. Then

    $$\max_{1 \leq j \leq q}\sup_{u \in \mathcal{U}}|\hat{\alpha}_j(u) - \alpha_j(u)| = O(d_n) \quad \text{a.s.},$$

    where $d_n = h^{2} + (\log n/(nh))^{1/2}$. If $h = dn^{-1/5}$ with a constant $d$, then we have

    $$\max_{1 \leq j \leq q}\sup_{u \in \mathcal{U}}|\hat{\alpha}_j(u) - \alpha_j(u)| = O(n^{-2/5}(\log n)^{1/2}) \quad \text{a.s.}$$

    Proof. The proof can be found in [2].
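
    The second rate in Lemma 1 follows by substituting $h = dn^{-1/5}$ into $d_n$: the bias term becomes $d^{2}n^{-2/5}$ and the stochastic term becomes $d^{-1/2}n^{-2/5}(\log n)^{1/2}$, so the latter dominates for large $n$. The short numerical sketch below illustrates this substitution; the function names `d_n` and `rate_ratio` are hypothetical.

```python
import math


def d_n(n, d=1.0):
    """d_n = h^2 + (log n / (n h))^(1/2) with bandwidth h = d * n^(-1/5)."""
    h = d * n ** (-0.2)
    return h ** 2 + math.sqrt(math.log(n) / (n * h))


def rate_ratio(n, d=1.0):
    """Ratio of d_n to n^(-2/5) * (log n)^(1/2).

    Algebraically this equals d^2 / sqrt(log n) + d^(-1/2), which stays
    bounded and decreases toward d^(-1/2) as n grows.
    """
    return d_n(n, d) / (n ** (-0.4) * math.sqrt(math.log(n)))
```

    Because the ratio is bounded in $n$, $d_n$ is of order $n^{-2/5}(\log n)^{1/2}$, in agreement with the statement of the lemma.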

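    Lemma 2. Let $B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$ be a real symmetric matrix. If $B_{22} > 0$, write $B_{11.2} \equiv B_{11} - B_{12}B_{22}^{-1}B_{21}$. Then we have

    (a) $B > 0 \Leftrightarrow B_{22} > 0,\ B_{11.2} > 0$.

    (b) If $B_{22} > 0$, then $B \geq 0 \Leftrightarrow B_{11.2} \geq 0$.

    Proof. The proof can be found in [26].

    Lemma 3. Let $\theta_i$, $i = 1, \ldots, n$, be i.i.d. random variables with $E(\theta_i) = 0$ and $\mathrm{Var}(\theta_i) = \sigma^{2} < \infty$. Then for any permutation $(l_1, l_2, \ldots, l_n)$ of $(1, 2, \ldots, n)$, we have

    $$\max_{1 \leq j \leq n}\Big|\sum_{i=1}^{j}\theta_{l_i}\Big| = O(n^{1/2}\log n) \quad \text{a.s.}$$

    Proof. For the proof of Lemma 3, see [27].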

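    Lemma 4. Assume that Conditions C1–C8 and $H_0$ hold. Then we have

    $$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_i = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}h_i + o_p(1).$$

    Proof. It suffices to prove

    (1). $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_{1i} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}h_{1i} + o_p(1)$,

    (2). $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_{2i} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}h_{2i} + o_p(1)$.

    First, we consider the component $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_{1i}$:

    $$\begin{aligned} \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_{1i} &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\eta_i\big[(Y_i - X_i^{T}\hat{\alpha}(U_i) - g(Z_i,\beta))^{2} - \sigma^{2}\big] \\ &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\eta_i(\varepsilon_i^{2} - \sigma^{2}) - \frac{2}{\sqrt{n}}\sum_{i=1}^{n}\eta_i\varepsilon_i\big[X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)\big] + \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\eta_i\big[X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)\big]^{2} \\ &\equiv \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\eta_i(\varepsilon_i^{2} - \sigma^{2}) + R_1 + R_2, \end{aligned}$$

    where $R_1 = -\frac{2}{\sqrt{n}}\sum_{i=1}^{n}\eta_i\varepsilon_i[X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)]$ and $R_2 = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\eta_i[X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)]^{2}$. Therefore, we can derive that

    $$\|R_1\| = \frac{2}{\sqrt{n}}\Big\|\sum_{i=1}^{n}\eta_i\varepsilon_i\big(X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)\big)\Big\| \leq \frac{2}{\sqrt{n}}\sup_{1 \leq i \leq n}\|\eta_i\|\max_{1 \leq k \leq n}\Big|\sum_{i=1}^{k}\varepsilon_i\Big|\sup_{1 \leq i \leq n}\|\hat{\alpha}(U_i) - \alpha(U_i)\|\max_{1 \leq i \leq n}\|X_i\| = o_p(1).$$

    Similar to the discussion of $R_1$, $R_2 = o_p(1)$ holds. Then

    $$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_{1i} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\eta_i(\varepsilon_i^{2} - \sigma^{2}) + o_p(1).$$

    Next, we consider the component $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_{2i}$:

    $$\begin{aligned} \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_{2i} &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\tilde{g}(Z_i,\beta)\big[Y_i - X_i^{T}\hat{\alpha}(U_i) - g(Z_i,\beta)\big] \\ &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\tilde{g}(Z_i,\beta)\varepsilon_i - \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\tilde{g}(Z_i,\beta)\big[X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)\big], \end{aligned}$$

    where

    $$\Big\|\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\tilde{g}(Z_i,\beta)\big[X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)\big]\Big\| \leq \sqrt{n}\max_{1 \leq i \leq n}\|\tilde{g}(Z_i,\beta)\|\max_{1 \leq i \leq n}\|X_i\|\sup_{1 \leq i \leq n}\|\hat{\alpha}(U_i) - \alpha(U_i)\| = o_p(1).$$

    So $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_{2i} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\tilde{g}(Z_i,\beta)\varepsilon_i + o_p(1)$.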

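    Lemma 5. Assume that Conditions C1–C8 and $H_0$ hold. Then

    $$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_i \xrightarrow{L} N(0,\Sigma).$$

    Proof. Denote $b = (b_1^{T}, b_2^{T})^{T}$, where $b_1 \in \mathbb{R}^{q+1}$ and $b_2 \in \mathbb{R}^{p}$, so that $b$ is a $(p+q+1)$-dimensional vector. Then

    $$(b_1^{T}, b_2^{T})\begin{pmatrix} \sum_{i=1}^{n}\eta_i(\varepsilon_i^{2} - \sigma^{2}) \\ \sum_{i=1}^{n}\tilde{g}(Z_i,\beta)\varepsilon_i \end{pmatrix} = \sum_{i=1}^{n}\big(b_1^{T}\eta_i(\varepsilon_i^{2} - \sigma^{2}) + b_2^{T}\tilde{g}(Z_i,\beta)\varepsilon_i\big) \equiv \sum_{i=1}^{n}\xi_i.$$

    Denote $\mu_k = E\varepsilon_i^{k}$; then we have $\mu_1 = E\varepsilon_i = 0$ and $\mu_2 = E\varepsilon_i^{2} = \sigma^{2}$, and

    $$\mathrm{Var}(\xi_i) = (b_1^{T}, b_2^{T})\begin{pmatrix} \eta_i\eta_i^{T}(\mu_4 - \mu_2^{2}) & \eta_i\tilde{g}(Z_i,\beta)^{T}\mu_3 \\ \tilde{g}(Z_i,\beta)\eta_i^{T}\mu_3 & \tilde{g}(Z_i,\beta)\tilde{g}(Z_i,\beta)^{T}\mu_2 \end{pmatrix}\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}.$$

    According to Condition C8, the matrix $\Sigma = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$ is nonnegative definite.

    We know that $B_{22} > 0$ and $B_{11} - B_{12}B_{22}^{-1}B_{21} \geq 0$ from Lemma 2(b). Moreover, the Cauchy-Schwarz inequality shows that

    $$\mu_3^{2} = \big(E(\varepsilon_i^{2} - \sigma^{2})\varepsilon_i\big)^{2} < E(\varepsilon_i^{2} - \sigma^{2})^{2}E\varepsilon_i^{2} = (\mu_4 - \mu_2^{2})\mu_2.$$

    Since the relationship between $\varepsilon_i$ and $\varepsilon_i^{2} - \sigma^{2}$ is nonlinear, the above inequality holds strictly. Then,

    $$B_{11}(\mu_4 - \mu_2^{2}) - B_{12}B_{22}^{-1}B_{21}\mu_3^{2}/\mu_2 > (B_{11} - B_{12}B_{22}^{-1}B_{21})\mu_3^{2}/\mu_2 \geq 0.$$

    Hence $B_{11}(\mu_4 - \mu_2^{2}) - B_{12}B_{22}^{-1}B_{21}\mu_3^{2}/\mu_2 > 0$ and $B_{22}\mu_2 > 0$, and it follows from Lemma 2 that

    $$\begin{pmatrix} B_{11}(\mu_4 - \mu_2^{2}) & B_{12}\mu_3 \\ B_{21}\mu_3 & B_{22}\mu_2 \end{pmatrix} \equiv \Sigma$$

    is a positive definite matrix. This indicates that the Lindeberg condition is met. So by means of the Lindeberg-Feller central limit theorem, we obtain

    $$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}b^{T}\hat{h}_i \xrightarrow{L} N(0, b^{T}\Sigma b).$$

    Combining this with the Cramér-Wold device, we deduce that $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_i \xrightarrow{L} N(0,\Sigma)$. The proof is finished.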

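    Lemma 6. Under the null hypothesis and Conditions C1–C6, we have

    $$\frac{1}{n}\sum_{i=1}^{n}\hat{h}_i\hat{h}_i^{T} \xrightarrow{P} \Sigma.$$

    Proof. It is easy to obtain that

    $$\frac{1}{n}\sum_{i=1}^{n}\hat{h}_i\hat{h}_i^{T} = \frac{1}{n}\begin{pmatrix} \sum_{i=1}^{n}\hat{h}_{1i}\hat{h}_{1i}^{T} & \sum_{i=1}^{n}\hat{h}_{1i}\hat{h}_{2i}^{T} \\ \sum_{i=1}^{n}\hat{h}_{2i}\hat{h}_{1i}^{T} & \sum_{i=1}^{n}\hat{h}_{2i}\hat{h}_{2i}^{T} \end{pmatrix}.$$

    Write $\Delta_i = X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)$, so that $Y_i - X_i^{T}\hat{\alpha}(U_i) - g(Z_i,\beta) = \varepsilon_i - \Delta_i$. Then

    $$\begin{aligned} \frac{1}{n}\sum_{i=1}^{n}\hat{h}_{1i}\hat{h}_{1i}^{T} &= \frac{1}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\big[(Y_i - X_i^{T}\hat{\alpha}(U_i) - g(Z_i,\beta))^{2} - \sigma^{2}\big]^{2} \\ &= \frac{1}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\big[\varepsilon_i^{2} - \sigma^{2} + \Delta_i^{2} - 2\varepsilon_i\Delta_i\big]^{2} \\ &= \frac{1}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}(\varepsilon_i^{2} - \sigma^{2})^{2} + \frac{1}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\Delta_i^{4} + \frac{4}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\varepsilon_i^{2}\Delta_i^{2} \\ &\quad + \frac{2}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}(\varepsilon_i^{2} - \sigma^{2})\Delta_i^{2} - \frac{4}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}(\varepsilon_i^{2} - \sigma^{2})\varepsilon_i\Delta_i - \frac{4}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\varepsilon_i\Delta_i^{3} \\ &= \frac{1}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}(\varepsilon_i^{2} - \sigma^{2})^{2} + \frac{1}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\Delta_i^{4} + \frac{6}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\varepsilon_i^{2}\Delta_i^{2} - \frac{2\sigma^{2}}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\Delta_i^{2} \\ &\quad - \frac{4}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\varepsilon_i^{3}\Delta_i + \frac{4\sigma^{2}}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\varepsilon_i\Delta_i - \frac{4}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\varepsilon_i\Delta_i^{3} \\ &\equiv \frac{1}{n}\sum_{i=1}^{n}h_{1i}h_{1i}^{T} + N_1 + N_2 + N_3 + N_4 + N_5 + N_6. \end{aligned}$$

    Using Condition C6 and Lemma 1, we get

    $$\|N_1\| = \Big\|\frac{1}{n}\sum_{i=1}^{n}\eta_i\eta_i^{T}\big(X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)\big)^{4}\Big\| \leq \sup_{1 \leq i \leq n}\big|X_i^{T}\hat{\alpha}(U_i) - X_i^{T}\alpha(U_i)\big|^{4} \cdot O_p(1) = O(d_n^{4})O_p(1) = o_p(1).$$

    By a similar derivation, we arrive at the following conclusion:

    $$N_i = o_p(1), \quad i = 2, \ldots, 6.$$

    The remaining blocks can be treated in the same way, so we have

    $$\frac{1}{n}\sum_{i=1}^{n}\hat{h}_i\hat{h}_i^{T} = \frac{1}{n}\sum_{i=1}^{n}h_ih_i^{T} + o_p(1).$$

    Invoking the law of large numbers, we get

    $$\frac{1}{n}\sum_{i=1}^{n}\hat{h}_i\hat{h}_i^{T} \xrightarrow{P} \Sigma.$$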

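    Lemma 7. Denote $\hat{h}_{\max} = \max\{\|\hat{h}_1\|, \ldots, \|\hat{h}_n\|\}$. Then, under the null hypothesis and Conditions C1–C8, it holds that

    $$\hat{h}_{\max} = o_p(n^{1/2}).$$

    Proof. The proof follows along the lines of [18].

    Lemma 8. The Lagrange multiplier $\lambda$ satisfies

    $$\lambda = O_p(n^{-1/2}).$$

    Proof. The result can be obtained as in [18]; thus, we omit the details here.

    Proof of Theorem 1. Based on Lemmas 7 and 8 and the Taylor expansion of (2.15), we deduce that

    $$-2\log R(\gamma;\sigma^2,\beta) = 2\sum_{i=1}^{n}\Big(\lambda^{T}\hat{h}_i - \frac{1}{2}(\lambda^{T}\hat{h}_i)^{2}\Big) + o_p(1).$$

    By Lemmas 5–8, we have

    $$-2\log R(\gamma;\sigma^2,\beta) = \Big(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_i\Big)^{T}\Big(\frac{1}{n}\sum_{i=1}^{n}\hat{h}_i\hat{h}_i^{T}\Big)^{-1}\Big(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\hat{h}_i\Big) + o_p(1).$$

    Similar to [18], $-2\log R(\gamma;\sigma^2,\beta) \xrightarrow{L} \chi^{2}_{p+q+1}$ can be derived, which proves Theorem 1.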



    [1] T. Li, C. Mei, Estimation and inference for varying coefficient partially nonlinear models, J. Stat. Plan. Infer., 143 (2013), 2023–2037. http://dx.doi.org/10.1016/j.jspi.2013.05.011 doi: 10.1016/j.jspi.2013.05.011
    [2] X. Zhou, P. Zhao, X. Wang, Empirical likelihood inferences for varying coefficient partially nonlinear models, J. Appl. Stat., 44 (2017), 474–492. http://dx.doi.org/10.1080/02664763.2016.1177496 doi: 10.1080/02664763.2016.1177496
    [3] Y. Jiang, Q. Ji, B. Xie, Robust estimation for the varying coefficient partially nonlinear models, J. Comput. Appl. Math., 326 (2017), 31–43. http://dx.doi.org/10.1016/j.cam.2017.04.028 doi: 10.1016/j.cam.2017.04.028
    [4] Y. Xiao, Z. Chen, Bias-corrected estimations in varying-coefficient partially nonlinear models with measurement error in the nonparametric part, J. Appl. Stat., 45 (2018), 586–603. http://dx.doi.org/10.1080/02664763.2017.1288201 doi: 10.1080/02664763.2017.1288201
    [5] S. Dai, Z. Huang, Estimation for varying coefficient partially nonlinear models with distorted measurement errors, J. Korean Stat. Soc., 48 (2019), 117–133. http://dx.doi.org/10.1016/j.jkss.2018.09.001 doi: 10.1016/j.jkss.2018.09.001
    [6] Y. Qian, Z. Huang, Statistical inference for a varying-coefficient partially nonlinear model with measurement errors, Stat. Methodol., 32 (2016), 122–130. http://dx.doi.org/10.1016/j.stamet.2016.05.004 doi: 10.1016/j.stamet.2016.05.004
    [7] X. Wang, P. Zhao, H. Du, Statistical inferences for varying coefficient partially nonlinear model with missing covariates, Commun. Stat.-Theor. M., 50 (2021), 2599–2618. http://dx.doi.org/10.1080/03610926.2019.1674870 doi: 10.1080/03610926.2019.1674870
    [8] L. Xia, X. Wang, P. Zhao, Y. Song, Empirical likelihood for varying coefficient partially nonlinear model with missing responses, AIMS Mathematics, 6 (2021), 7125–7152. http://dx.doi.org/10.3934/math.2021418 doi: 10.3934/math.2021418
    [9] Y. Xiao, L. Liang, Robust estimation and variable selection for varying-coefficient partially nonlinear models based on modal regression, J. Korean Stat. Soc., 51 (2020), 692–715. http://dx.doi.org/10.1007/s42952-021-00158-w doi: 10.1007/s42952-021-00158-w
    [10] Y. Xiao, Y. Shi, Robust estimation for varying-coefficient partially nonlinear model with nonignorable missing response, AIMS Mathematics, 8 (2023), 29849–29871. http://dx.doi.org/10.3934/math.20231526 doi: 10.3934/math.20231526
    [11] X. Zhou, P. Zhao, Estimation and inferences for varying coefficient partially nonlinear quantile models with censoring indicators missing at random, Comput. Stat., 37 (2022), 1727–1750. http://dx.doi.org/10.1007/s00180-021-01192-2 doi: 10.1007/s00180-021-01192-2
    [12] H. Wong, F. Liu, M. Chen, W. Ip, Empirical likelihood based diagnostics for heteroscedasticity in partial linear models, Comput. Stat. Data Anal., 53 (2009), 3466–3477. http://dx.doi.org/10.1016/j.csda.2009.02.029 doi: 10.1016/j.csda.2009.02.029
    [13] H. Wong, F. Liu, M. Chen, W. Ip, Empirical likelihood based diagnostics for heteroscedasticity in partially linear errors-in-variables models, J. Stat. Plan. Infer., 139 (2009), 916–929. http://dx.doi.org/10.1016/j.jspi.2008.05.049 doi: 10.1016/j.jspi.2008.05.049
    [14] G. Fan, H. Liang, J. Wang, Empirical likelihood for a heteroscedastic partial linear errors-in-variables model, Commun. Stat.-Theor. M., 41 (2012), 108–127. http://dx.doi.org/10.1080/03610926.2010.517357 doi: 10.1080/03610926.2010.517357
    [15] J. Lin, Y. Zhao, H. Wang, Heteroscedasticity diagnostics in varying-coefficient partially linear regression models and applications in analyzing Boston housing data, J. Applied Statistics, 42 (2015), 2432–2448. http://dx.doi.org/10.1080/02664763.2015.1043623 doi: 10.1080/02664763.2015.1043623
    [16] F. Liu, C. Li, P. Wang, X. Kang, Empirical likelihood based diagnostics for heteroscedasticity in semiparametric varying-coefficient partially linear errors-in-variables models, Commun. Stat.-Theor. M., 47 (2018), 5485–5496. http://dx.doi.org/10.1080/03610926.2017.1395050 doi: 10.1080/03610926.2017.1395050
    [17] F. Liu, W. Gao, J. He, X. Fu, X. Kang, Empirical likelihood based diagnostics for heteroscedasticity in semiparametric varying-coefficient partially linear models with missing responses, J. Syst. Sci. Complex., 34 (2021), 1175–1188. http://dx.doi.org/10.1007/s11424-020-9240-7 doi: 10.1007/s11424-020-9240-7
    [18] A. Owen, Empirical likelihood ratio confidence intervals for single functional, Biometrika, 75 (1988), 237–249. http://dx.doi.org/10.1093/biomet/75.2.237 doi: 10.1093/biomet/75.2.237
    [19] A. Owen, Empirical likelihood ratio confidence regions, Ann. Stat., 18 (1990), 90–120. http://dx.doi.org/10.1214/aos/1176347494 doi: 10.1214/aos/1176347494
    [20] A. Owen, Empirical likelihood for linear models, Ann. Stat., 19 (1991), 1725–1747. http://dx.doi.org/10.1214/aos/1176348368 doi: 10.1214/aos/1176348368
    [21] J. Fan, T. Huang, Profile likelihood inferences on semiparametric varying-coefficient partially linear models, Bernoulli, 11 (2005), 1031–1057. http://dx.doi.org/10.3150/bj/1137421639 doi: 10.3150/bj/1137421639
    [22] Y. Sun, H. Yan, W. Zhang, Z. Lu, A semiparametric spatial dynamic model, Ann. Stat., 42 (2014), 700–727. http://dx.doi.org/10.1214/13-AOS1201 doi: 10.1214/13-AOS1201
    [23] J. Du, X. Sun, R. Cao, Z. Zhang, Statistical inference for partially linear additive spatial autoregressive models, Spat. Stat., 25 (2018), 52–67. http://dx.doi.org/10.1016/j.spasta.2018.04.008 doi: 10.1016/j.spasta.2018.04.008
    [24] S. Cheng, J. Chen, Estimation of partially linear single-index spatial autoregressive model, Stat. Papers, 62 (2021), 495–531. http://dx.doi.org/10.1007/s00362-019-01105-y doi: 10.1007/s00362-019-01105-y
    [25] H. Zhang, G. Cheng, Y. Liu, Linear or nonlinear? Automatic structure discovery for partially linear models, J. Am. Stat. Assoc., 106 (2011), 1099–1112. http://dx.doi.org/10.1198/jasa.2011.tm10281 doi: 10.1198/jasa.2011.tm10281
    [26] S. Wang, M. Wu, Z. Jia, Inequalities in matrix theory (Chinese), 2 Eds., Beijing: Science Press, 2006.
    [27] J. Gao, Asymptotic theory for partly linear models, Commun. Stat.-Theor. M., 24 (1995), 1985–2009. http://dx.doi.org/10.1080/03610929508831598 doi: 10.1080/03610929508831598
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)