Review

Hybrid fuzzy differential equations

  • In this paper we study the existence of a solution for a class of hybrid differential equations with a fuzzy initial value. Some new results on generalized division are proposed and applied.

    Citation: Atimad Harir, Said Melliani, L. Saadia Chadli. Hybrid fuzzy differential equations[J]. AIMS Mathematics, 2020, 5(1): 273-285. doi: 10.3934/math.2020018



    The Poisson distribution is the standard model for count data. Its mean and variance are equal, and this property causes problems in modeling real-life data sets, which are generally over-dispersed: the empirical variance is greater than the empirical mean. Fitting the Poisson distribution to this type of data misspecifies the underlying probability distribution. The negative-binomial (NB) distribution is the first choice for modeling over-dispersed count data. However, more flexible discrete distributions are needed to model highly over-dispersed count data. In the last decade, several authors have proposed alternative discrete distributions to handle this problem, such as Shoukri et al. [28], Shmueli et al. [26], Rodríguez-Avi et al. [25], Mahmoudi and Zakerzadeh [24], Lord and Geedipally [23], Cheng et al. [11], Gómez-Déniz [17], Sáez-Castillo and Conde-Sánchez [27], Zamani et al. [30], Gencturk and Yigiter [16], Bhati et al. [9], Imoto et al. [20], Wongrin and Bodhisuwan [31], Altun [2,3,4,5,6], El-Morshedy et al. [14,15], Altun et al. [1], Eliwa et al. [13].

    Another phenomenon in count data modeling is inflation. Inflation generally occurs at zero and is then called zero-inflation. Zero-inflation means that the data set contains more zero observations than the assumed distribution, such as the Poisson or NB, can represent. This situation is common in insurance and health sciences, for example in loss frequency, the number of physician visits and daily coronavirus cases. In this case, zero-inflated versions of the Poisson and NB regression models are used. These are called the zero-inflated Poisson (ZIP) and zero-inflated negative-binomial (ZINB) models. Thanks to their software support, these models have been applied to real-life problems by many researchers. For instance, a comparison of over-dispersed count data sets was carried out by Avci et al. [8]. Ismail and Zamani [19] applied the ZIP and ZINB models to the Malaysian own damage claim data. Lord et al. [22] applied these models to crash data. Also, Ayati and Abbasi [7] investigated the suitability of these models for accidents on urban highways.

    Here, the main purpose is to develop a new model for zero-inflated and/or over-dispersed data sets. To do this, we use the Poisson generalized-Lindley (PGL) distribution introduced by Wongrin and Bodhisuwan [34]. A suitable re-parametrized version of the PGL distribution is introduced and its statistical properties are studied. The unknown model parameters are estimated by maximum likelihood estimation (MLE), and the suitability of the MLE method for the proposed model is assessed with a simulation study. Two real data sets are analyzed to demonstrate the usefulness of the proposed distribution against existing models such as the Poisson and NB regression models and their zero-inflated versions.

    The rest of the study is organized as follows. The re-parametrized PGL distribution is studied in Section 2. In Section 3, the parameter estimation problem is addressed with MLE and the simulation study is given. Section 4 introduces a new regression model for both zero-inflated and over-dispersed cases. Section 5 contains the empirical results of the study. Concluding remarks are given in Section 6.

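    Wongrin and Bodhisuwan [34] introduced the PGL distribution by using the generalized-Lindley distribution of Elbatal et al. [12]. Let the random variable Y follow the PGL distribution with probability mass function (pmf)

    \[ P(y;\alpha,\beta,\theta)=\frac{1}{y!\,(\theta+1)^{y+1}}\left[\left(\frac{\theta}{\theta+1}\right)^{\alpha}\frac{\theta\,\Gamma(y+\alpha)}{\Gamma(\alpha)}+\left(\frac{\theta}{\theta+1}\right)^{\beta}\frac{\Gamma(y+\beta)}{\Gamma(\beta)}\right],\quad y=0,1,2,\ldots, \]

    where Γ(·) is the gamma function and α, β, θ > 0. The mean and variance of Y are

    \[ E(Y)=\frac{\alpha\theta+\beta}{\theta(\theta+1)}, \]

    and

    \[ Var(Y)=\frac{\beta\left[\theta(-2\alpha+\theta+2)+1\right]+\alpha\theta\left[\alpha+(\theta+1)^{2}\right]+\beta^{2}\theta}{\theta^{2}(\theta+1)^{2}}. \]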

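    Now, we introduce a re-parametrized version of the PGL distribution to make it suitable for a count regression model.

    Proposition 1. Let β = θ(μθ + μ - α). Then the pmf of the PGL distribution is

    \[ P(y;\alpha,\theta,\mu)=\frac{1}{y!\,(\theta+1)^{y+1}}\left[\left(\frac{\theta}{\theta+1}\right)^{\alpha}\frac{\theta\,\Gamma(y+\alpha)}{\Gamma(\alpha)}+\left(\frac{\theta}{\theta+1}\right)^{\theta(\mu\theta+\mu-\alpha)}\frac{\Gamma\left(y+\theta(\mu\theta+\mu-\alpha)\right)}{\Gamma\left(\theta(\mu\theta+\mu-\alpha)\right)}\right], \tag{2.1} \]

    where y = 0, 1, 2, ..., α > 0, θ > 0 and μ > 0. The mean and variance of Y are

    \[ E(Y)=\mu, \]

    and

    \[ Var(Y)=\frac{\theta(\mu\theta+\mu-\alpha)\left[\theta(-2\alpha+\theta+2)+1\right]+\alpha\theta\left[\alpha+(\theta+1)^{2}\right]+\left\{\theta(\mu\theta+\mu-\alpha)\right\}^{2}\theta}{\theta^{2}(\theta+1)^{2}}. \]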

    Hereinafter, the random variable Y refers to the re-parametrized PGL distribution given in (2.1), denoted PGL(α,θ,μ) for short. Following the results of Wongrin and Bodhisuwan [34], the statistical properties of the re-parametrized PGL distribution can be obtained easily. The pmf shapes of the re-parametrized PGL distribution are displayed in Figure 1. From these plots, we conclude that this distribution can be used to model zero-inflated, bimodal and right-skewed count data sets.
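
    As a minimal sketch, assuming only the Python standard library, the pmf in (2.1) can be evaluated on the log scale and checked numerically. The function name pgl_pmf and the truncation of the support at 400 are illustrative choices, not part of the original study. The sketch also makes explicit that β = θ(μθ + μ - α) is positive only when μ > α/(θ+1), which follows from the requirement β > 0 in the original parametrization.

```python
import math

def pgl_pmf(y, alpha, theta, mu):
    """pmf (2.1) of the re-parametrized PGL distribution (hypothetical helper)."""
    beta = theta * (mu * theta + mu - alpha)  # Proposition 1
    if beta <= 0:
        raise ValueError("need mu > alpha / (theta + 1) so that beta > 0")
    log_r = math.log(theta) - math.log(theta + 1.0)
    # log of 1 / (y! (theta + 1)^(y + 1))
    log_front = -math.lgamma(y + 1) - (y + 1) * math.log(theta + 1.0)
    # logs of the two terms inside the square bracket of (2.1)
    t1 = alpha * log_r + math.log(theta) + math.lgamma(y + alpha) - math.lgamma(alpha)
    t2 = beta * log_r + math.lgamma(y + beta) - math.lgamma(beta)
    m = max(t1, t2)  # factor out the larger term to avoid overflow
    return math.exp(log_front + m) * (math.exp(t1 - m) + math.exp(t2 - m))

# Numerical check for PGL(2, 2, 5): total probability and mean.
probs = [pgl_pmf(y, 2.0, 2.0, 5.0) for y in range(400)]
total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))
```

    For these parameter values the truncated sums should be close to 1 and to μ = 5, in line with E(Y) = μ.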

    Figure 1.  The pmf shapes of PGL distribution.

    The dispersion index (DI) is defined as DI = Var(Y)/E(Y). The DI shows the flexibility of the distribution in modeling over- or under-dispersed data sets. A DI greater than one indicates over-dispersion; the opposite case (DI < 1) indicates under-dispersion. The variance and DI plots of the PGL distribution are displayed in Figure 2 (for fixed α=0.5). We conclude the following from Figure 2: when the parameter μ increases, the dispersion index and variance increase; when the parameter θ increases, the dispersion index and variance decrease. The DI of the PGL distribution is always greater than one, so the PGL distribution is an appropriate choice to model over-dispersed data sets. Figure 3 shows the dispersion index and variance of the PGL distribution for α=1.5. The interpretation is similar to the case α=0.5.

    Figure 2.  The dispersion index and variance of PGL distribution for α=0.5.
    Figure 3.  The dispersion index and variance of PGL distribution for α=1.5.
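
    The dispersion index discussed above can be checked numerically. The following sketch, with hypothetical helper names pgl_pmf and pgl_variance and an arbitrary parameter choice (α, θ, μ) = (0.5, 1, 2), compares the closed-form variance with the variance computed directly from the pmf and then forms DI = Var(Y)/E(Y). The closed form is written with the term -2α, which is the sign that agrees with a direct calculation from the mixture representation.

```python
import math

def pgl_pmf(y, alpha, theta, mu):
    """pmf (2.1) of the re-parametrized PGL distribution (hypothetical helper)."""
    beta = theta * (mu * theta + mu - alpha)
    log_r = math.log(theta) - math.log(theta + 1.0)
    log_front = -math.lgamma(y + 1) - (y + 1) * math.log(theta + 1.0)
    t1 = alpha * log_r + math.log(theta) + math.lgamma(y + alpha) - math.lgamma(alpha)
    t2 = beta * log_r + math.lgamma(y + beta) - math.lgamma(beta)
    m = max(t1, t2)
    return math.exp(log_front + m) * (math.exp(t1 - m) + math.exp(t2 - m))

def pgl_variance(alpha, theta, mu):
    """Closed-form variance of PGL(alpha, theta, mu)."""
    beta = theta * (mu * theta + mu - alpha)
    num = (beta * (theta * (-2.0 * alpha + theta + 2.0) + 1.0)
           + alpha * theta * (alpha + (theta + 1.0) ** 2)
           + beta ** 2 * theta)
    return num / (theta ** 2 * (theta + 1.0) ** 2)

alpha, theta, mu = 0.5, 1.0, 2.0  # illustrative values only
probs = [pgl_pmf(y, alpha, theta, mu) for y in range(300)]
mean_num = sum(y * p for y, p in enumerate(probs))
var_num = sum((y - mean_num) ** 2 * p for y, p in enumerate(probs))
var_closed = pgl_variance(alpha, theta, mu)
di = var_closed / mu  # dispersion index, since E(Y) = mu
```

    For this setting both variances should agree, and the resulting DI exceeds one, consistent with the over-dispersion property stated in the text.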

    In this section, the parameters of the PGL distribution are estimated by the MLE method, and the appropriateness of the MLE method is evaluated by a simulation study.

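    Assume that we have a random sample y_1, y_2, ..., y_n from the PGL distribution. Then, the log-likelihood function of the PGL distribution is

    \[ \ell(\boldsymbol{\tau})=\sum_{i=1}^{n}\ln\left[\left(\frac{\theta}{\theta+1}\right)^{\alpha}\frac{\theta\,\Gamma(y_i+\alpha)}{\Gamma(\alpha)}+\left(\frac{\theta}{\theta+1}\right)^{\theta(\mu\theta+\mu-\alpha)}\frac{\Gamma\left(y_i+\theta(\mu\theta+\mu-\alpha)\right)}{\Gamma\left(\theta(\mu\theta+\mu-\alpha)\right)}\right]-\sum_{i=1}^{n}\ln\left[y_i!\,(\theta+1)^{y_i+1}\right], \tag{3.1} \]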

    where τ = (α,θ,μ) is the unknown parameter vector. The score vector components can be obtained by taking partial derivatives of (3.1) with respect to α, θ and μ. The likelihood equations do not have an explicit solution, so the log-likelihood function in (3.1) is maximized directly. For this purpose, the optimization routines of R, S-Plus or Matlab can be used; the nlm function of R is used in this study. To construct the asymptotic confidence intervals, we need the observed information matrix, whose elements are given by

    \[ I_F(\boldsymbol{\tau})=\begin{pmatrix} I_{\alpha\alpha} & I_{\alpha\theta} & I_{\alpha\mu}\\ I_{\theta\alpha} & I_{\theta\theta} & I_{\theta\mu}\\ I_{\mu\alpha} & I_{\mu\theta} & I_{\mu\mu} \end{pmatrix}. \tag{3.2} \]

    Since the second derivatives of the log-likelihood function are complicated, they are omitted here; they are available upon request from the authors. The inverse of the observed information matrix evaluated at the MLE of τ gives the asymptotic variance-covariance matrix, and the asymptotic standard errors are the square roots of its diagonal elements. Then, the asymptotic confidence intervals of the parameters are given by

    \[ \hat{\alpha}\pm z_{p/2}\sqrt{Var(\hat{\alpha})},\qquad \hat{\theta}\pm z_{p/2}\sqrt{Var(\hat{\theta})},\qquad \hat{\mu}\pm z_{p/2}\sqrt{Var(\hat{\mu})}, \]

    where z_{p/2} represents the left quantile value of the standard normal distribution at p/2.
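
    To illustrate what direct maximization of (3.1) involves, the following sketch codes the negative log-likelihood and profiles it over μ on a coarse grid with α and θ held fixed. This is only a stand-in for a general-purpose optimizer such as the nlm function of R used in the study: the data are made up, the function names are hypothetical, and a grid search over one parameter is not the authors' procedure.

```python
import math

def pgl_logpmf(y, alpha, theta, mu):
    """Log of the pmf (2.1), computed stably (hypothetical helper)."""
    beta = theta * (mu * theta + mu - alpha)
    log_r = math.log(theta / (theta + 1.0))
    t1 = alpha * log_r + math.log(theta) + math.lgamma(y + alpha) - math.lgamma(alpha)
    t2 = beta * log_r + math.lgamma(y + beta) - math.lgamma(beta)
    m = max(t1, t2)
    log_bracket = m + math.log(math.exp(t1 - m) + math.exp(t2 - m))
    return log_bracket - math.lgamma(y + 1) - (y + 1) * math.log(theta + 1.0)

def neg_loglik(params, ys):
    """Negative of the log-likelihood (3.1); infinite outside the parameter space."""
    alpha, theta, mu = params
    if alpha <= 0 or theta <= 0 or mu * (theta + 1.0) <= alpha:
        return math.inf
    return -sum(pgl_logpmf(y, alpha, theta, mu) for y in ys)

ys = [0, 1, 0, 3, 7, 2, 0, 5, 12, 4, 1, 0, 6, 9, 3]  # made-up counts
grid = [0.6 + 0.1 * k for k in range(95)]            # mu from 0.6 to 10.0
profile = [(neg_loglik((1.0, 1.0, mu), ys), mu) for mu in grid]
best_nll, best_mu = min(profile)                     # grid minimizer in mu
```

    In practice all three parameters would be optimized jointly, and the returned value of neg_loglik is the quantity a minimizer would work on.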

    Now, we conduct a simulation study to assess the finite-sample performance of the MLEs of the parameters of the PGL distribution. The following simulation steps are implemented.

    (1) Determine the sample size n, simulation replication N and the parameter values of the PGL distribution, α, θ and μ.

    (2) Using the parameter settings in Step 1, generate the random variables from the PGL distribution using the inverse transform method.

    (3) Using the generated sample in Step 2, obtain the MLEs of the parameters α, θ and μ.

    (4) Repeat Steps 2 and 3 N times.

    (5) Using the MLEs and the true parameter values, compute the estimated values of biases, mean square errors (MSEs) and mean relative estimates (MREs). The required formulas for these measures can be found in Altun [5].
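
    The steps above can be sketched as follows. The sampler implements the inverse transform method of Step 2 on a truncated support, and the bias, MSE and MRE are computed with their usual definitions (MRE taken as the average of estimate/true value, which is consistent with the values in Table 1). To keep the example short and dependency-free, the sample mean is used as a stand-in estimator of μ; it is not the MLE used in the study, and the replication number and seed are arbitrary.

```python
import math
import random

def pgl_pmf(y, alpha, theta, mu):
    """pmf (2.1) of the re-parametrized PGL distribution (hypothetical helper)."""
    beta = theta * (mu * theta + mu - alpha)
    log_r = math.log(theta) - math.log(theta + 1.0)
    log_front = -math.lgamma(y + 1) - (y + 1) * math.log(theta + 1.0)
    t1 = alpha * log_r + math.log(theta) + math.lgamma(y + alpha) - math.lgamma(alpha)
    t2 = beta * log_r + math.lgamma(y + beta) - math.lgamma(beta)
    m = max(t1, t2)
    return math.exp(log_front + m) * (math.exp(t1 - m) + math.exp(t2 - m))

def rpgl(n, alpha, theta, mu, rng):
    """Inverse transform sampling: return the smallest y with F(y) >= u."""
    pmf = [pgl_pmf(y, alpha, theta, mu) for y in range(500)]  # truncated support (assumption)
    draws = []
    for _ in range(n):
        u, y, cdf = rng.random(), 0, pmf[0]
        while cdf < u and y < len(pmf) - 1:
            y += 1
            cdf += pmf[y]
        draws.append(y)
    return draws

rng = random.Random(1)
alpha, theta, mu = 2.0, 2.0, 5.0
N, n = 200, 50  # far fewer replications than in the study, for speed
# Stand-in estimator: the sample mean of each simulated sample (not the MLE).
est = [sum(s) / n for s in (rpgl(n, alpha, theta, mu, rng) for _ in range(N))]
bias = sum(e - mu for e in est) / N
mse = sum((e - mu) ** 2 for e in est) / N
mre = sum(e / mu for e in est) / N
```

    Replacing the sample mean by a numerical maximizer of (3.1) would turn this into the simulation design described in the steps.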

    The simulation results are displayed in Figure 4. We set the number of replications to N=10,000 and the sample sizes to n=50,55,60,...,500. The true parameter values are τ=(2,2,5). As the sample size becomes larger, we expect the biases and MSEs to approach zero and the MREs to approach one. From the results given in Figure 4, we conclude that the estimated biases and MSEs are near zero and, as expected, the MREs are near one. These results show that MLE is an appropriate method to estimate the unknown parameters of the PGL distribution.

    Figure 4.  The graphical simulation results of the PGL distribution.

    Additionally, two different parameter settings are evaluated to check whether similar results are obtained for different parameter vectors. The results are reported in Table 1. As in the previous simulation study, the biases approach zero, and the MSEs and MREs approach their desired values as the sample size increases. Consequently, MLE is an effective parameter estimation method for the PGL distribution.

    Table 1.  The simulation results of the PGL distribution for two different parameter settings.
    Sample size  Measure  (α=2, θ=1, μ=5)               (α=3, θ=2, μ=4)
                          α         θ         μ         α         θ         μ
    n=50         Bias     0.4814    0.1297    -0.0321   0.2415    0.1503    -0.0631
                 MSE      0.8389    0.3301    0.3394    0.9317    0.7125    0.3579
                 MRE      1.2408    1.1297    0.9936    1.0805    1.0751    0.9842
    n=250        Bias     0.0548    0.0134    -0.0202   0.0699    0.0423    -0.0172
                 MSE      0.4491    0.0746    0.0761    0.3785    0.1035    0.0685
                 MRE      1.0274    1.0134    0.9960    1.0233    1.0211    0.9957
    n=500        Bias     0.0234    0.0047    -0.0146   0.0245    0.0209    -0.0164
                 MSE      0.2180    0.0354    0.0366    0.1624    0.0445    0.0335
                 MRE      1.0117    1.0047    0.9971    1.0082    1.0105    0.9959


    As mentioned before, the Poisson regression model does not work well in the case of over-dispersion. For over-dispersed data sets, the first choice is the NB regression model. Now, we introduce an alternative to the NB regression model for modeling highly over-dispersed data sets. Assume that Y is a random variable following the PGL distribution given in (2.1). Since the mean of Y is E(Y|α,θ,μ)=μ, the log-link function can be used to link the covariates to the mean of the PGL distribution, as follows

    \[ \mu_i=\exp\left(\boldsymbol{x}_i^{T}\boldsymbol{\beta}\right),\quad i=1,\ldots,n, \tag{4.1} \]

    where x_i^T=(1,x_i1,x_i2,...,x_ik) represents the covariates and β=(β0,β1,β2,...,βk)^T represents the regression parameters. Since the mean of the response variable is defined on the positive domain, the link function should map values defined on R to R+. The log-link function is not the only option for this transformation; the softplus function, proposed by Weiss et al. [35], can be used as an alternative. Replacing μ in (2.1) with (4.1), the log-likelihood function of the PGL regression model is

    \[ \ell(\alpha,\theta,\boldsymbol{\beta})=\sum_{i=1}^{n}\ln\left[\left(\frac{\theta}{\theta+1}\right)^{\alpha}\frac{\theta\,\Gamma(y_i+\alpha)}{\Gamma(\alpha)}+\left(\frac{\theta}{\theta+1}\right)^{\theta(\mu_i\theta+\mu_i-\alpha)}\frac{\Gamma\left(y_i+\theta(\mu_i\theta+\mu_i-\alpha)\right)}{\Gamma\left(\theta(\mu_i\theta+\mu_i-\alpha)\right)}\right]-\sum_{i=1}^{n}\ln\left[y_i!\,(\theta+1)^{y_i+1}\right],\quad \mu_i=\exp\left(\boldsymbol{x}_i^{T}\boldsymbol{\beta}\right). \tag{4.2} \]
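
    The two link choices mentioned before (4.2) can be compared with a short sketch. The softplus below is the standard form ln(1+exp(η)); the cited work may use a scaled variant, so this particular form is an assumption. The coefficient and covariate values are illustrative only.

```python
import math

def log_link_mean(eta):
    """Mean under the log link of (4.1): mu = exp(eta)."""
    return math.exp(eta)

def softplus_mean(eta):
    """Standard softplus ln(1 + exp(eta)), written to avoid overflow for large |eta|."""
    return max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))

beta = [0.5, 0.5, 0.5]   # illustrative regression coefficients
x = [1.0, 0.2, 0.7]      # intercept term and two covariate values
eta = sum(b * v for b, v in zip(beta, x))  # linear predictor x^T beta

mu_log = log_link_mean(eta)
mu_softplus = softplus_mean(eta)
```

    Both functions return a strictly positive mean for any real linear predictor; the softplus grows linearly rather than exponentially for large η.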

    The parameters α and θ are the distributional parameters and β=(β0,β1,β2,...,βk)^T is the vector of unknown regression parameters. These parameters are estimated by direct maximization of (4.2). The nlm function of R is used to minimize the negative of (4.2), which is equivalent to maximizing (4.2). The standard errors of the estimated parameters are obtained from the Hessian matrix evaluated at the MLEs of the parameters. The elements of the Hessian matrix, which are the second-order partial derivatives of the log-likelihood function, are computed numerically with the fdHess function of R. Since these derivatives are complicated, they are not presented in the study.
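
    The idea of obtaining standard errors from a numerically differentiated log-likelihood can be sketched on a one-parameter toy problem. The example below uses an intercept-only Poisson model with log link (not the PGL model), a plain central-difference second derivative (not necessarily the scheme implemented by fdHess), and made-up counts. For this toy model the standard error has the closed form 1/sqrt(Σ y_i), which gives a check on the numerical result.

```python
import math
from statistics import NormalDist

def neg_loglik(b0, ys):
    """Negative log-likelihood of an intercept-only Poisson model (constants dropped)."""
    return sum(math.exp(b0) - y * b0 for y in ys)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

ys = [2, 0, 3, 1, 4, 2, 5, 1, 0, 2]       # made-up counts
b0_hat = math.log(sum(ys) / len(ys))      # closed-form MLE for this toy model
info = second_derivative(lambda b: neg_loglik(b, ys), b0_hat)  # observed information
se = math.sqrt(1.0 / info)                # standard error from the inverse information
z = NormalDist().inv_cdf(0.975)           # two-sided 95% normal quantile
ci = (b0_hat - z * se, b0_hat + z * se)   # Wald-type confidence interval
```

    With several parameters the scalar second derivative becomes a Hessian matrix, and the standard errors are the square roots of the diagonal of its inverse.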

    Now, we evaluate the suitability of the MLE method for estimating the parameters of the PGL regression model. The number of simulation replications is N=10,000 and four sample sizes are used: n=50, 250, 500, 1000. Using the log-link function, we generate the means from ln(μ_i)=β0+β1x_i1+β2x_i2 with parameters β0=0.5, β1=0.5, β2=0.5, θ=1 and α=1. The covariates are generated from the standard uniform distribution. The response variable, y_i, is generated based on the values of μ_i, α and θ. The simulation results are listed in Table 2. The averages of the estimates (AEs) of the regression coefficients are near the true values for all sample sizes, and those of θ and α approach the true values as n increases. The MSEs decrease toward zero as the sample size grows. These results are in line with the consistency property of the MLEs.

    Table 2.  The simulation results of PGL regression model.
    Sample size  Measure  β0        β1        β2        θ         α
    n=50         AE       0.4967    0.4973    0.4778    1.1607    1.3993
                 MSE      0.0860    0.1402    0.1349    0.2728    0.8066
    n=250        AE       0.5056    0.4935    0.4944    1.0620    1.1438
                 MSE      0.0270    0.0452    0.0419    0.1381    0.2302
    n=500        AE       0.4999    0.4826    0.5031    1.0359    1.0569
                 MSE      0.0135    0.0225    0.0212    0.0687    0.1110
    n=1000       AE       0.5012    0.4986    0.4953    1.0057    1.0109
                 MSE      0.0064    0.0118    0.0097    0.0293    0.0413


    Additionally, we compare the standard errors obtained from bootstrap samples with those obtained from the fdHess function. The sample function of R is used to generate the bootstrap samples, which are then used to calculate the bootstrap standard errors of the model parameters. The number of bootstrap replications is 1,000. The model parameters are β0=0.5, β1=0.5, β2=0.5, θ=2 and α=2. The simulation results are reported in Table 3. The standard errors obtained with the two approaches are generally close to each other, which supports the use of the fdHess function to obtain the asymptotic standard errors of the model parameters.

    Table 3.  Comparison of standard errors of the estimated parameters using bootstrap methodology and fdHess function.
    Sample size  Method     β0        β1        β2        θ         α
    n=50         fdHess     0.2514    0.3089    0.3157    0.6405    0.8302
                 Bootstrap  0.2834    0.3528    0.3320    0.5448    0.7137
    n=250        fdHess     0.1573    0.1900    0.1921    0.4134    0.5405
                 Bootstrap  0.1607    0.1975    0.1767    0.3969    0.5589
    n=500        fdHess     0.1035    0.1261    0.1255    0.2773    0.3624
                 Bootstrap  0.1284    0.0798    0.1555    0.1938    0.2992
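
    The bootstrap standard errors compared in Table 3 follow the usual resampling recipe: draw observations with replacement, recompute the estimator, and take the standard deviation over replications. The sketch below applies this recipe to the sample mean of made-up data, for which the textbook formula s/sqrt(n) offers a reference value. The function name bootstrap_se, the data and the seed are illustrative assumptions; in the study the estimator would be the MLE of the regression model.

```python
import math
import random
import statistics

def bootstrap_se(data, estimator, B, rng):
    """Standard deviation of the estimator over B resamples drawn with replacement."""
    n = len(data)
    reps = [estimator([data[rng.randrange(n)] for _ in range(n)]) for _ in range(B)]
    return statistics.stdev(reps)

rng = random.Random(2024)
data = [rng.expovariate(0.2) for _ in range(40)]        # made-up positive data
se_boot = bootstrap_se(data, statistics.fmean, 1000, rng)
se_formula = statistics.stdev(data) / math.sqrt(len(data))  # reference value for the mean
```

    The two standard errors should be of similar size, which mirrors the kind of agreement examined in Table 3.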


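    The ZIP and ZINB regression models are the most widely used models in the case of zero-inflation. The ZINB regression model is the more appropriate choice in most cases, since the Poisson distribution does not model over-dispersion. The zero-inflated Poisson distribution is given by

    \[ P(y;w,\lambda)=\begin{cases} w+(1-w)e^{-\lambda}, & y=0,\\[4pt] (1-w)\dfrac{\lambda^{y}e^{-\lambda}}{y!}, & y>0, \end{cases} \tag{4.3} \]

    where 0 ≤ w ≤ 1. When w=0, the zero-inflated Poisson distribution reduces to the Poisson distribution. As in the PGL regression model, the mean of the Poisson distribution, λ_i, is linked to the covariates by means of the log-link function. The probability of excess zeros, w_i, is linked to the covariates by means of the logit-link function, which is given by

    \[ \ln\left(\frac{w_i}{1-w_i}\right)=\boldsymbol{z}_i^{T}\boldsymbol{\gamma}, \tag{4.4} \]

    where z_i^T=(1,z_i1,z_i2,...,z_ik) is the vector of covariates and γ=(γ0,γ1,γ2,...,γk)^T is the unknown vector of regression coefficients for the zero process. The log-likelihood function of the ZIP regression model is given by

    \[ \ell(\boldsymbol{\beta},\boldsymbol{\gamma})=\sum_{y_i=0}\ln\left[\exp\left(\boldsymbol{z}_i^{T}\boldsymbol{\gamma}\right)+\exp\left(-\exp\left(\boldsymbol{x}_i^{T}\boldsymbol{\beta}\right)\right)\right]+\sum_{y_i>0}\left[y_i\boldsymbol{x}_i^{T}\boldsymbol{\beta}-\exp\left(\boldsymbol{x}_i^{T}\boldsymbol{\beta}\right)-\ln(y_i!)\right]-\sum_{i=1}^{n}\ln\left[1+\exp\left(\boldsymbol{z}_i^{T}\boldsymbol{\gamma}\right)\right]. \tag{4.5} \]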
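
    As a consistency check, the log-likelihood (4.5) can be coded and compared with the direct sum of log-probabilities from (4.3) under the links (4.1) and (4.4). The design matrices, responses and coefficient values below are made up for illustration, and the helper names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def zip_logpmf(y, w, lam):
    """Log of the zero-inflated Poisson pmf (4.3)."""
    if y == 0:
        return math.log(w + (1.0 - w) * math.exp(-lam))
    return math.log(1.0 - w) + y * math.log(lam) - lam - math.lgamma(y + 1)

def zip_loglik(beta, gamma, X, Z, ys):
    """Log-likelihood (4.5) with log link for lambda_i and logit link for w_i."""
    total = 0.0
    for xi, zi, yi in zip(X, Z, ys):
        eta, zeta = dot(xi, beta), dot(zi, gamma)
        if yi == 0:
            total += math.log(math.exp(zeta) + math.exp(-math.exp(eta)))
        else:
            total += yi * eta - math.exp(eta) - math.lgamma(yi + 1)
        total -= math.log1p(math.exp(zeta))
    return total

X = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.5], [1.0, 2.0]]   # made-up count-part covariates
Z = [[1.0, 1.0], [1.0, 0.0], [1.0, 0.3], [1.0, 1.5]]   # made-up zero-part covariates
ys = [0, 3, 0, 7]
beta, gamma = [0.4, 0.6], [-0.5, 0.8]

ll_formula = zip_loglik(beta, gamma, X, Z, ys)
ll_direct = sum(
    zip_logpmf(y, 1.0 / (1.0 + math.exp(-dot(z, gamma))), math.exp(dot(x, beta)))
    for x, z, y in zip(X, Z, ys)
)
```

    The two values should coincide up to floating-point error, which confirms that (4.5) is the sum of the log-probabilities in (4.3).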

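    The log-likelihood function given in (4.5) can be maximized by means of the nlm function of R. As mentioned before, when the data display over-dispersion, the negative-binomial regression model should be used. The pmf of the zero-inflated negative-binomial distribution is given by

    \[ P(y;w,\lambda,\tau)=\begin{cases} w+(1-w)\left(1+\dfrac{\lambda}{\tau}\right)^{-\tau}, & y=0,\\[6pt] (1-w)\dfrac{\Gamma(y+\tau)}{y!\,\Gamma(\tau)}\left(1+\dfrac{\lambda}{\tau}\right)^{-\tau}\left(1+\dfrac{\tau}{\lambda}\right)^{-y}, & y>0, \end{cases} \tag{4.6} \]

    where τ is the shape parameter. When w=0, the zero-inflated negative-binomial distribution reduces to the negative-binomial distribution. With λ_i=exp(x_i^Tβ) and the logit link (4.4), the log-likelihood function of the ZINB regression model is given by

    \[ \ell(\boldsymbol{\beta},\boldsymbol{\gamma},\tau)=-\sum_{i=1}^{n}\ln\left(1+\exp\left(\boldsymbol{z}_i^{T}\boldsymbol{\gamma}\right)\right)+\sum_{y_i=0}\ln\left(\exp\left(\boldsymbol{z}_i^{T}\boldsymbol{\gamma}\right)+\left(\frac{\exp\left(\boldsymbol{x}_i^{T}\boldsymbol{\beta}\right)+\tau}{\tau}\right)^{-\tau}\right)-\sum_{y_i>0}\left(\tau\ln\left(\frac{\exp\left(\boldsymbol{x}_i^{T}\boldsymbol{\beta}\right)+\tau}{\tau}\right)+y_i\ln\left(1+\exp\left(-\boldsymbol{x}_i^{T}\boldsymbol{\beta}\right)\tau\right)\right)-\sum_{y_i>0}\left(\ln\Gamma(\tau)+\ln\Gamma(y_i+1)-\ln\Gamma(y_i+\tau)\right). \tag{4.7} \]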

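    The log-likelihood given in (4.7) can be maximized with the nlm function of R. Here, an alternative zero-inflated regression model is introduced based on the zero-inflated PGL distribution. The pmf of the zero-inflated PGL distribution is given by

    \[ P(y;w,\alpha,\theta,\mu)=\begin{cases} w+(1-w)\left[\left(\dfrac{\theta}{\theta+1}\right)^{\alpha+1}+\dfrac{\theta^{\theta(\mu\theta+\mu-\alpha)}}{(\theta+1)^{\theta(\mu\theta+\mu-\alpha)+1}}\right], & y=0,\\[8pt] (1-w)\dfrac{1}{y!\,(\theta+1)^{y+1}}\left[\left(\dfrac{\theta}{\theta+1}\right)^{\alpha}\dfrac{\theta\,\Gamma(y+\alpha)}{\Gamma(\alpha)}+\left(\dfrac{\theta}{\theta+1}\right)^{\theta(\mu\theta+\mu-\alpha)}\dfrac{\Gamma\left(y+\theta(\mu\theta+\mu-\alpha)\right)}{\Gamma\left(\theta(\mu\theta+\mu-\alpha)\right)}\right], & y>0, \end{cases} \tag{4.8} \]

    where 0 ≤ w ≤ 1, α>0, θ>0 and μ>0. Inserting (4.1) and (4.4) in (4.8), the log-likelihood function of the zero-inflated PGL (ZIPGL) regression model is given by

    \[ \begin{aligned} \ell(\boldsymbol{\beta},\boldsymbol{\gamma},\alpha,\theta)={}&\sum_{y_i=0}\ln\left(\frac{\exp\left(\boldsymbol{z}_i^{T}\boldsymbol{\gamma}\right)}{1+\exp\left(\boldsymbol{z}_i^{T}\boldsymbol{\gamma}\right)}+\frac{1}{1+\exp\left(\boldsymbol{z}_i^{T}\boldsymbol{\gamma}\right)}\left[\left(\frac{\theta}{\theta+1}\right)^{\alpha+1}+\frac{\theta^{\theta(\mu_i\theta+\mu_i-\alpha)}}{(\theta+1)^{\theta(\mu_i\theta+\mu_i-\alpha)+1}}\right]\right)\\ &+\sum_{y_i>0}\ln\left(\frac{1}{\left[1+\exp\left(\boldsymbol{z}_i^{T}\boldsymbol{\gamma}\right)\right]y_i!\,(\theta+1)^{y_i+1}}\left[\left(\frac{\theta}{\theta+1}\right)^{\alpha}\frac{\theta\,\Gamma(y_i+\alpha)}{\Gamma(\alpha)}+\left(\frac{\theta}{\theta+1}\right)^{\theta(\mu_i\theta+\mu_i-\alpha)}\frac{\Gamma\left(y_i+\theta(\mu_i\theta+\mu_i-\alpha)\right)}{\Gamma\left(\theta(\mu_i\theta+\mu_i-\alpha)\right)}\right]\right),\quad \mu_i=\exp\left(\boldsymbol{x}_i^{T}\boldsymbol{\beta}\right). \end{aligned} \tag{4.9} \]

    The unknown parameters α, θ, β=(β0,β1,β2,...,βk)^T and γ=(γ0,γ1,γ2,...,γk)^T are obtained by maximizing (4.9) with the nlm function of R.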
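
    The zero-inflated pmf (4.8) can be sketched in a few lines by mixing a point mass at zero with the PGL pmf. The check below, with hypothetical helper names and arbitrary parameter values, confirms that the closed-form zero probability in the y=0 branch equals the PGL pmf at zero and that the zero-inflated probabilities sum to one on a truncated support.

```python
import math

def pgl_pmf(y, alpha, theta, mu):
    """pmf (2.1) of the re-parametrized PGL distribution (hypothetical helper)."""
    beta = theta * (mu * theta + mu - alpha)
    log_r = math.log(theta) - math.log(theta + 1.0)
    log_front = -math.lgamma(y + 1) - (y + 1) * math.log(theta + 1.0)
    t1 = alpha * log_r + math.log(theta) + math.lgamma(y + alpha) - math.lgamma(alpha)
    t2 = beta * log_r + math.lgamma(y + beta) - math.lgamma(beta)
    m = max(t1, t2)
    return math.exp(log_front + m) * (math.exp(t1 - m) + math.exp(t2 - m))

def pgl_zero_prob(alpha, theta, mu):
    """Closed-form P(Y = 0) appearing in the y = 0 branch of (4.8)."""
    beta = theta * (mu * theta + mu - alpha)
    return (theta / (theta + 1.0)) ** (alpha + 1.0) + theta ** beta / (theta + 1.0) ** (beta + 1.0)

def zipgl_pmf(y, w, alpha, theta, mu):
    """Zero-inflated PGL pmf (4.8): extra mass w at zero, PGL scaled by 1 - w elsewhere."""
    base = pgl_pmf(y, alpha, theta, mu)
    return w + (1.0 - w) * base if y == 0 else (1.0 - w) * base

w, alpha, theta, mu = 0.3, 0.5, 1.5, 3.0  # illustrative values only
p0_closed = pgl_zero_prob(alpha, theta, mu)
p0_pmf = pgl_pmf(0, alpha, theta, mu)
total = sum(zipgl_pmf(y, w, alpha, theta, mu) for y in range(400))
```

    In the regression setting, w and μ would be replaced by the observation-specific values implied by (4.4) and (4.1).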

    In this section, two real data sets are analyzed to show the flexibility of the PGL regression model against the Poisson and NB regression models and their zero-inflated counterparts. We also compare the PGL model with the NPGL model proposed by Altun [2]. In the statistics literature, there are many discrete distributions for modeling over- or under-dispersed count data sets, for example the mean Conway-Maxwell-Poisson distribution of Huang [18], the zero-modified geometric distribution of Kang et al. [21] and the generalized COM-Poisson distribution of Qian and Zhu [33]. The best model for the fitted data is chosen according to the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). The lowest values of the AIC and BIC indicate the best model for the data used.

    The first data set contains the daily number of absences of 316 high school students. We model the daily number of absences with covariates such as gender and type of instructional program. The same data set was analyzed by Altun [5]. Female individuals are coded 1 and male individuals are coded 0, and gender is represented by the covariate x1. The instructional program variable has three categories: general, academic and vocational. One of them is therefore taken as the baseline category and two dummy variables are created. The baseline category is the vocational program, and the general program (x2) and academic program (x3) are used as the two dummy variables. The following regression structure is fitted to the data set.

    \[ \mu_i=\exp\left(\beta_0+\beta_1x_{1i}+\beta_2x_{2i}+\beta_3x_{3i}\right). \tag{5.1} \]

    The empirical distribution of the response variable, the number of absences, is displayed in Figure 5(a). The mean and variance of the response variable are 5.955 and 49.518, respectively. Since the DI (49.518/5.955 ≈ 8.32) is greater than one, the response variable exhibits over-dispersion. Cameron and Trivedi [10] proposed a test to evaluate over-dispersion. The dispersiontest function of R is used to perform this test. The obtained test statistic is z=6.679 and the corresponding p-value is <0.001. This result confirms the over-dispersion in the response variable.

    Figure 5.  The probability distributions of (a) days of absence of students and (b) number of physician visits.

    The estimated parameters with corresponding standard errors (SEs) and goodness-of-fit statistics are listed in Table 4. Since the PGL regression model has the lowest values of AIC and BIC in Table 4, we conclude that the PGL model performs better than the Poisson, NB and NPGL regression models.

    Table 4.  The estimated parameters of the fitted count regression models for the number of absence data.
    Covariates  Poisson                    NB                         NPGL                       PGL
                Estimate SE       p-value  Estimate SE       p-value  Estimate SE       p-value  Estimate SE       p-value
    β0          1.323    0.089    <0.001   1.271    0.214    <0.001   1.272             <0.001   1.515    0.192    <0.001
    β1          -0.234   0.047    <0.001   -0.193   0.123    0.118    -0.195            0.011    -0.233   0.101    0.022
    β2          1.374    0.076    <0.001   1.362    0.199    <0.001   1.365             <0.001   1.206    0.156    <0.001
    β3          0.957    0.066    <0.001   0.949    0.140    <0.001   0.951             <0.001   0.708    0.152    <0.001
    τ           -        -        -        1.017    0.104    -        -        -        -        -        -        -
    α           -        -        -        -        -        -        -        -        -        0.177    0.022    -
    θ           -        -        -        -        -        -        1.059    0.136    -        1.040    0.459    -
    -ℓ          1343.250                   867.225                    881.409                    862.480
    AIC         2694.500                   1744.449                   1772.818                   1736.960
    BIC         2709.498                   1763.196                   1791.565                   1759.456
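
    The AIC and BIC rows of Table 4 can be reproduced from the negative log-likelihood values with the standard definitions AIC = 2(-ℓ) + 2k and BIC = 2(-ℓ) + k ln(n), where k is the number of estimated parameters. The parameter counts below (4, 5, 5 and 6) are read off the table. In this arithmetic the tabulated BIC values are matched with n = 314, whereas the text mentions 316 students; the sketch uses 314 only because it reproduces the table, which is an inference and not something the source states.

```python
import math

def aic(neg_loglik, k):
    """Akaike Information Criterion from the negative log-likelihood."""
    return 2.0 * neg_loglik + 2.0 * k

def bic(neg_loglik, k, n):
    """Bayesian Information Criterion from the negative log-likelihood."""
    return 2.0 * neg_loglik + k * math.log(n)

# (negative log-likelihood, number of estimated parameters) taken from Table 4
fits = {
    "Poisson": (1343.250, 4),
    "NB": (867.225, 5),
    "NPGL": (881.409, 5),
    "PGL": (862.480, 6),
}
n_obs = 314  # assumption: the value that reproduces the tabulated BIC column

aics = {name: aic(nll, k) for name, (nll, k) in fits.items()}
bics = {name: bic(nll, k, n_obs) for name, (nll, k) in fits.items()}
best_by_aic = min(aics, key=aics.get)
best_by_bic = min(bics, key=bics.get)
```

    Both criteria select the PGL model here, in agreement with the conclusion drawn from Table 4.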


    As seen from the estimated regression coefficients of the PGL regression model in Table 4, the general and academic instructional program variables have statistically significant effects at the 1% level on the days of absence of high school students, and gender is significant at the 5% level (p = 0.022). The expected number of absences for female students is exp(-0.233)=0.792 times that of male students, that is, 20.8% lower. It means that male students have more absences than female students. The expected numbers of absences for general and academic instructional program students are exp(1.206)=3.340 and exp(0.708)=2.029 times that of vocational instructional program students, that is, 234% and 102.9% higher, respectively.
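
    The percentage interpretations above follow from the log link: a coefficient b multiplies the expected count by exp(b), so the relative change is (exp(b) - 1) × 100%. A minimal sketch using the PGL coefficients reported in Table 4 (the function name percent_change is hypothetical):

```python
import math

def percent_change(coef):
    """Relative change (in %) of the expected count for a one-unit increase in a covariate."""
    return (math.exp(coef) - 1.0) * 100.0

# PGL coefficients from Table 4
effects = {
    "female vs male": percent_change(-0.233),
    "general vs vocational": percent_change(1.206),
    "academic vs vocational": percent_change(0.708),
}
```

    A negative coefficient gives a percentage decrease and a positive one a percentage increase relative to the baseline category.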

    The profile log-likelihood plots of the PGL regression model are displayed in Figure 6. These plots help to evaluate the suitability of the estimated model parameters. They indicate that the estimated parameters of the PGL model are maximizers of the log-likelihood function.

    Figure 6.  Profile log-likelihood plots of the PGL regression model.

    A residual analysis is also carried out to check the suitability of the model for the fitted data set. To do this, the randomized quantile residuals (rqs) are used. The rqs are defined as

    \[ r_{q,i}=\Phi^{-1}(u_i), \tag{5.2} \]

    where Φ is the cdf of the standard normal distribution and u_i is a uniformly distributed random variable between a_i=lim_{y↑y_i} F(y;μ̂_i) and b_i=F(y_i;μ̂_i) (Altun, [2]). Note that F(y;μ) is the cdf of the PGL distribution. If the model is statistically valid for the data set, the rqs follow the standard normal distribution. The index and quantile-quantile plots of the rqs of the PGL regression model are displayed in Figure 7. These plots show that the PGL regression model fits the data well.
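
    The construction of the randomized quantile residuals in (5.2) can be sketched as follows. For a count response, a_i is the cdf evaluated at y_i - 1 and b_i is the cdf at y_i; a uniform draw between them is mapped through the standard normal quantile function. The helper names, the fixed parameter values and the responses are illustrative; in the regression model each observation would use its own fitted mean.

```python
import math
import random
from statistics import NormalDist

def pgl_pmf(y, alpha, theta, mu):
    """pmf (2.1) of the re-parametrized PGL distribution (hypothetical helper)."""
    beta = theta * (mu * theta + mu - alpha)
    log_r = math.log(theta) - math.log(theta + 1.0)
    log_front = -math.lgamma(y + 1) - (y + 1) * math.log(theta + 1.0)
    t1 = alpha * log_r + math.log(theta) + math.lgamma(y + alpha) - math.lgamma(alpha)
    t2 = beta * log_r + math.lgamma(y + beta) - math.lgamma(beta)
    m = max(t1, t2)
    return math.exp(log_front + m) * (math.exp(t1 - m) + math.exp(t2 - m))

def pgl_cdf(y, alpha, theta, mu):
    """cdf F(y); the sum is empty, hence zero, when y < 0."""
    return sum(pgl_pmf(k, alpha, theta, mu) for k in range(y + 1))

def rq_residual(y, alpha, theta, mu, rng):
    """Randomized quantile residual (5.2) for one observation."""
    a = pgl_cdf(y - 1, alpha, theta, mu)
    b = pgl_cdf(y, alpha, theta, mu)
    u = a + (b - a) * rng.random()          # uniform draw between a and b
    u = min(max(u, 1e-12), 1.0 - 1e-12)     # keep u strictly inside (0, 1)
    return NormalDist().inv_cdf(u)

rng = random.Random(7)
ys = [0, 2, 5, 1, 9]                        # made-up responses
residuals = [rq_residual(y, 1.0, 1.0, 3.0, rng) for y in ys]
```

    Under a correctly specified model these residuals behave like a standard normal sample, which is what the index and quantile-quantile plots assess.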

    Figure 7.  Residual analysis of PGL regression model.

    The second data set comes from the US National Medical Expenditure Survey (NMES) conducted in 1987–1988. The data set has 4406 observations. It can also be found in the countreg package of R (see Zeileis et al. [32]). Here, our goal is to model the number of physician visits, y, with the following variables: number of hospital stays (x1), number of chronic conditions (x2), gender (female = 0, male = 1) (x3), number of years of education (x4) and indicator of private insurance (yes = 1, no = 0) (x5). The following model is fitted to the NMES data set using the zero-inflated Poisson, negative-binomial and PGL regression models.

    \[ \begin{aligned} \log(\mu_i)&=\beta_0+\beta_1x_{1i}+\beta_2x_{2i}+\beta_3x_{3i}+\beta_4x_{4i}+\beta_5x_{5i},\\ \operatorname{logit}(w_i)&=\gamma_0+\gamma_1z_{1i}+\gamma_2z_{2i}+\gamma_3z_{3i}+\gamma_4z_{4i}+\gamma_5z_{5i}. \end{aligned} \tag{5.3} \]

    The histogram of the number of physician visits is displayed in Figure 5(b). The mean and variance of the number of physician visits are 5.774 and 45.687, respectively. Since the DI of the response variable (45.687/5.774 ≈ 7.91) is greater than one, it exhibits over-dispersion. As in Section 5.1, the over-dispersion test of Cameron and Trivedi [10] is performed. The obtained test statistic is z=6.679 and the corresponding p-value is <0.001. Therefore, the response variable is over-dispersed. As seen from Figure 5(b), the response variable is highly peaked at zero. To assess the zero-inflation, the test proposed by Van den Broek [29] is used; the zero.test function is used for this purpose. The test statistic follows a chi-square distribution with one degree of freedom. The obtained test statistic is χ2=33438.09 and its p-value is <0.001. This result shows that the frequency of zeros in the response variable cannot be modeled by the Poisson regression model. Therefore, zero-inflated regression models are needed to model such data sets. The AIC and BIC of the fitted count regression models are listed in Table 5. Since the data exhibit both over-dispersion and zero-inflation, the Poisson and NB regression models do not perform well. According to Table 5, the ZIPGL model has the lowest AIC and the PGL model has the lowest BIC, so the PGL and ZIPGL models perform better than the other models for the NMES data set.

    Table 5.  Comparison of models for NMES data.
    Criterion  Poisson    NB         PGL        ZIP        ZINB       ZIPGL
    AIC        36314.70   24430.48   24263.38   32611.14   24286.48   24244.30
    BIC        36353.04   24475.22   24314.51   32687.83   24369.56   24333.77


    The estimated parameters of the fitted models and their standard errors are summarized in Table 6. Zero-inflated regression models have two parts to interpret, related to the non-inflated and zero-inflated processes. As mentioned before, the non-inflated process is modeled with the log-link function and the zero-inflated process with the logit-link function. Therefore, the exponentiated regression coefficients of the zero-inflated process can be interpreted as odds ratios. As seen from the estimated regression coefficients of the ZIPGL regression model, for the non-inflated process all variables have statistically significant effects (at the 1% level) on the number of physician visits. For the zero-inflated process, the number of chronic conditions and private insurance variables have statistically significant effects (at the 5% level). Having private insurance multiplies the odds of not having the opportunity of a physician visit by exp(-1.174)=0.309, a decrease of 69.1%.

    Table 6.  The estimated parameters of the fitted models for the NMES data.
    Covariates  Poisson                    NB                         PGL
                Est.     SE       p        Est.     SE       p        Est.     SE       p
    β0          0.987    0.037    <0.001   0.931    0.086    <0.001   0.897    0.074    <0.001
    β1          0.182    0.006    <0.001   0.240    0.020    <0.001   0.208    0.017    <0.001
    β2          0.175    0.004    <0.001   0.204    0.012    <0.001   0.186    0.009    <0.001
    β3          -0.116   0.013    <0.001   -0.136   0.031    <0.001   -0.135   0.026    <0.001
    β4          0.022    0.002    <0.001   0.021    0.004    <0.001   0.021    0.004    <0.001
    β5          0.183    0.017    <0.001   0.192    0.040    <0.001   0.237    0.034    <0.001
    τ           -        -        -        1.180    0.033    -        -        -        -
    α           -        -        -        -        -        -        0.228    0.008    -
    θ           -        -        -        -        -        -        1.516    0.143    -

    Covariates  ZIP                        ZINB                       ZIPGL
                Est.     SE       p        Est.     SE       p        Est.     SE       p
    β0          1.446    0.037    <0.001   1.245    0.088    <0.001   1.108    0.111    <0.001
    β1          0.175    0.006    <0.001   0.223    0.020    <0.001   0.207    0.017    <0.001
    β2          0.129    0.004    <0.001   0.155    0.012    <0.001   0.160    0.011    <0.001
    β3          -0.065   0.013    <0.001   -0.090   0.031    0.003    -0.118   0.030    <0.001
    β4          0.015    0.002    <0.001   0.016    0.004    <0.001   0.017    0.005    <0.001
    β5          0.061    0.017    <0.001   0.096    0.042    0.021    0.171    0.059    0.003
    τ           -        -        -        1.443    0.034    -        -        -        -
    α           -        -        -        -        -        -        0.233    0.009    -
    θ           -        -        -        -        -        -        1.316    0.128    -
    Zero-inflated
    γ0          0.287    0.222    0.196    0.587    0.449    0.191    0.447    0.825    0.588
    γ1          -0.310   0.091    <0.001   -0.815   0.411    0.047    -0.479   0.581    0.410
    γ2          -0.542   0.044    <0.001   -1.319   0.186    <0.001   -1.920   0.814    0.018
    γ3          0.418    0.089    <0.001   0.635    0.201    0.002    0.447    0.370    0.227
    γ4          -0.056   0.012    <0.001   -0.090   0.026    <0.001   -0.106   0.063    0.093
    γ5          -0.751   0.102    <0.001   -1.181   0.221    <0.001   -1.174   0.544    0.031


    As in the previous section, the profile log-likelihood plots are displayed to check the estimated model parameters. According to the plots in Figure 8, the estimated model parameters of the ZIPGL model are maximizers of the log-likelihood function given in (4.9).

    Figure 8.  Profile log-likelihood plots of the ZIPGL regression model.

    This study introduces a new count regression model for zero-inflated and over-dispersed count data sets based on a re-parametrization of the PGL distribution. The PGL regression model and its zero-inflated counterpart are studied. Two real data sets are analyzed to compare the PGL regression model with the Poisson and NB regression models. The empirical findings show that the PGL and ZIPGL regression models provide better fits than the Poisson, ZIP, NB and ZINB regression models. As future work, a one-inflated PGL regression model could be considered; one-inflated regression models are useful for modeling claim numbers in insurance. We hope that the PGL and ZIPGL regression models find wider application in different applied sciences.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    The second author would like to thank the Deanship of Scientific Research at Qassim University for funding the publication of this project.

    The authors have no conflicts of interest.



    [1] R. Boukezzoula, S. Galichet, L. Foulloy, Inverse arithmetic operators for fuzzy intervals, In: Proc. EUSFLAT 2007 Conf., Ostrava, 279-286.
    [2] L. S. Chadli, A. Harir, S. Melliani, Fuzzy Euler differential equation, SOP Trans. Appl. Math., 2 (2015).
    [3] L. S. Chadli, A. Harir, S. Melliani, Solutions of fuzzy heat-like equations by variational iterative method, Ann. Fuzzy Math. Inf., 10 (2015), 29-44.
    [4] L. S. Chadli, A. Harir, S. Melliani, Solutions of fuzzy wave-like equations by variational iteration method, Int. Ann. Fuzzy Math. Inf., 8 (2014), 527-547.
    [5] B. C. Dhage, D. O'Regan, A fixed point theorem in Banach algebras with applications to functional integral equations, Funct. Differ. Eq., 7 (2004), 259-267.
    [6] B. C. Dhage, On α-condensing mappings in Banach algebras, Math. Student, 63 (1994), 146-152.
    [7] B. C. Dhage, V. Lakshmikantham, Basic results on hybrid differential equations, Nonlinear Anal. Hybrid Syst., 4 (2010), 414-424. doi: 10.1016/j.nahs.2009.10.005
    [8] B. C. Dhage, A nonlinear alternative in Banach algebras with applications to functional differential equations, Nonlinear Funct. Anal. Appl., 8 (2004), 563-575.
    [9] D. Qiu, C. Lu, W. Hhang, et al. Algebraic properties and topological properties of the quotient space of fuzzy numbers based on Mares equivalence relation, Fuzzy Set. Syst., 245 (2014), 63-82. doi: 10.1016/j.fss.2014.01.003
    [10] D. Qiu, W. Hhang, C. Lu, On fuzzy differential equations in the quotient space of fuzzy numbers, Fuzzy Set. Syst., 295 (2016), 72-98. doi: 10.1016/j.fss.2015.03.010
    [11] O. Kaleva, Fuzzy differential equations, Fuzzy Set. Syst., 24 (1987), 301-317. doi: 10.1016/0165-0114(87)90029-7
    [12] M. Ma, M. Friedman, A. Kandel, A new fuzzy arithmetic, Fuzzy Set. Syst., 108 (1999), 83-90. doi: 10.1016/S0165-0114(97)00310-2
    [13] S. Seikkala, On the fuzzy initial value problem, Fuzzy Set. Syst., 24 (1987), 319-330. doi: 10.1016/0165-0114(87)90030-3
    [14] L. Stefanini, A generalization of Hukuhara difference and division for interval and fuzzy arithmetic, Fuzzy Set. Syst., 161 (2010), 1564-1584. doi: 10.1016/j.fss.2009.06.009
  • © 2020 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)