Research article

Poisson-Lindley minification INAR process with application to financial data

  • Received: 26 March 2024 Revised: 01 July 2024 Accepted: 09 July 2024 Published: 22 July 2024
  • MSC : 62M10, 60G10, 62M20, 62P05

  • This paper introduces the Poisson-Lindley minification integer-valued autoregressive (PL-MINAR) process, a novel statistical model for analyzing count time series data. The modified negative binomial thinning and the Poisson-Lindley (PL) marginal distribution served as the foundation for the model. The proposed model was examined in terms of its basic stochastic properties, especially related to conditional stochastic measures (e.g., transition probabilities, conditional mean and variance, autocorrelation function). Through comprehensive simulations, the effectiveness of various parameter estimation techniques was validated. The PL-MINAR model's practical utility was demonstrated in analyzing the number of Bitcoin transactions and stock trades, showing its superior or comparable performance to the established INAR model. By offering a robust tool for financial time series analysis, this research holds potential for significant improvements in forecasting and understanding market dynamics.

    Citation: Vladica S. Stojanović, Hassan S. Bakouch, Radica Bojičić, Gadir Alomair, Shuhrah A. Alghamdi. Poisson-Lindley minification INAR process with application to financial data[J]. AIMS Mathematics, 2024, 9(8): 22627-22654. doi: 10.3934/math.20241102




    Integer-valued time series models are widely used to model count time series. The most common models of this type are the integer-valued autoregressive (INAR) time series models, for which there is extensive literature (for the most recent contributions see, e.g., [1,2,3,4,5,6,7,8]). However, some real-world datasets cannot be modeled by ordinary INAR models, and for this reason various modifications have been introduced. One of them is the class of minification time series models, which first appeared in [9,10], where a continuous, exponential marginal distribution is considered. Thereafter, a minification model of the same structure with a Weibull marginal distribution was presented in [11]. The first discrete minification models were presented in [12], while different types of minification models with discrete marginal distributions were discussed in [13]. Geometric marginal distributions were the main focus of [14,15,16], where some new results about discrete minification processes have recently appeared. In particular, the results of Aleksić & Ristić [14], who developed the minification INAR model with a geometric marginal distribution, should be highlighted. Motivated by these results, a new minification integer-valued autoregressive process with a Poisson-Lindley marginal distribution (abbr. PL-MINAR process) is proposed here. Some preliminary assumptions necessary to introduce this stochastic process, as well as its definition and some key stochastic features, are described in Section 2. Then, Section 3 presents some estimation techniques for the PL-MINAR process parameters and Monte Carlo simulations of the obtained estimators. Practical applications of the PL-MINAR process in modeling the dynamics and empirical distribution of real-world time series of count-based financial data are given in Section 4. There, the PL-MINAR process is also compared with ordinary INAR(1) processes and shown to have the same or even better prediction efficiency and accuracy. Section 5 contains some closing thoughts, while proofs of the stated proposition and theorems are given in the Appendix.

    In this part, the definition of modified negative binomial thinning is given first. After that, some key properties of this operator are presented, as well as its specificities in the case of the Poisson-Lindley (PL) distribution.

    Definition 1. Let $\alpha>0$ and $G_j \sim \mathrm{Geom}(\alpha/(1+\alpha))$, $j=1,2,\ldots$, be independent identically distributed (IID) random variables (RVs) with a geometric distribution, whose probability mass function (PMF) is as follows:

    $$p_G(x;\alpha) := P\{G_j = x\} = \frac{\alpha^x}{(1+\alpha)^{x+1}}, \quad x=0,1,2,\ldots$$ (1)

    The modified negative binomial thinning operator "$\alpha\,\diamond$" is given by the equality:

    $$\alpha \diamond X := \sum_{j=1}^{X+1} G_j,$$ (2)

    where $X$ is an integer-valued RV, independent of the RVs $(G_j)$, which are called the counting series of the operator.

    It is worth noting that this form of thinning operator was first proposed by Zhang et al. [17]. It differs from the usual negative binomial thinning operator, introduced in Ristić et al. [18], where the upper summation bound is $X$ instead of $X+1$. The main reason is that the upper bound $X+1$ prevents the RV $\alpha \diamond X$ from "vanishing" in the case where $X=0$, because then Eq (2) yields $\alpha \diamond 0 = G_1 \geq 0$. Consequently, in line with Aleksić & Ristić [14], using this kind of modified operator avoids the problem of the process remaining constantly at zero, which may appear when considering a minification model with an ordinary binomial or negative binomial thinning operator. Furthermore, for a given $X=y$, the RV $\alpha \diamond X$ represents a sum of $y+1$ geometrically distributed RVs. Thus, as is well known (e.g., Pitman [19]), the RV $\alpha \diamond X$ has the negative binomial (NB) distribution with "the number of successes" $y+1$ and parameter $p=\alpha/(1+\alpha)$. In this way, the conditional PMF of the RV $\alpha \diamond X$ is as follows:

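    $$P\{\alpha \diamond X = x \mid X = y\} = \binom{x+y}{y}\left(\frac{\alpha}{1+\alpha}\right)^{x}\left(\frac{1}{1+\alpha}\right)^{y+1},$$ (3)

    where $x=0,1,2,\ldots$ In addition, using Equations (1) and (2), for the conditional probability generating function (PGF) of the RV $\alpha \diamond X$ one obtains:

    $$\Psi_{\alpha \diamond X \mid X}(u;\alpha) = E\left[u^{\alpha \diamond X} \mid X\right] = \left(E\left[u^{G_j}\right]\right)^{X+1} = \left[\frac{1}{1+\alpha}\sum_{x=0}^{\infty}\left(\frac{\alpha u}{1+\alpha}\right)^{x}\right]^{X+1} = (1+\alpha-\alpha u)^{-X-1},$$ (4)

    where $|u|<(1+\alpha)/\alpha$.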

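    According to this and the well-known properties of PGFs and of the negative binomial thinning operator (e.g., Ristić et al. [18]), the conditional measures of $\alpha \diamond X$ can be easily obtained. Thus, the conditional mean is

    $$E[\alpha \diamond X \mid X] = \left.\frac{\partial \Psi_{\alpha \diamond X \mid X}(u;\alpha)}{\partial u}\right|_{u=1} = \left.\alpha(X+1)(1+\alpha-\alpha u)^{-X-2}\right|_{u=1} = \alpha(X+1),$$ (5)

    as well as

    $$E[\alpha \diamond X\,(\alpha \diamond X-1) \mid X] = \left.\frac{\partial^2 \Psi_{\alpha \diamond X \mid X}(u;\alpha)}{\partial u^2}\right|_{u=1} = \alpha^2(X+1)(X+2).$$

    Based on this, the conditional second moment of $\alpha \diamond X$ is

    $$E[(\alpha \diamond X)^2 \mid X] = \alpha^2(X+1)^2+\alpha(1+\alpha)(X+1),$$

    and thus the conditional variance is obtained as:

    $$\mathrm{Var}[\alpha \diamond X \mid X] = E[(\alpha \diamond X)^2 \mid X]-\left(E[\alpha \diamond X \mid X]\right)^2 = \alpha(1+\alpha)(X+1).$$ (6)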
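    As a quick numerical illustration of Eqs (2), (5), and (6), the following minimal Python sketch draws the thinned variable for a fixed value of X and compares the sample mean and variance with α(X+1) and α(1+α)(X+1). The function names `geometric` and `thin`, the seed, and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import random

def geometric(alpha, rng):
    """Draw G with P{G = x} = alpha^x / (1 + alpha)^(x + 1), x = 0, 1, 2, ..."""
    stop = 1.0 / (1.0 + alpha)      # probability of stopping at each step
    g = 0
    while rng.random() >= stop:
        g += 1
    return g

def thin(alpha, x, rng):
    """Modified negative binomial thinning of Eq (2): sum of x + 1 geometric counts."""
    return sum(geometric(alpha, rng) for _ in range(x + 1))

rng = random.Random(1)              # fixed seed, for reproducibility only
alpha, x = 0.75, 3
draws = [thin(alpha, x, rng) for _ in range(100_000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, alpha * (x + 1))                  # sample mean vs. Eq (5)
print(var, alpha * (1 + alpha) * (x + 1))     # sample variance vs. Eq (6)
```

    Note that `thin(alpha, 0, rng)` still returns one geometric draw, which is exactly the property that distinguishes the modified operator from the ordinary negative binomial thinning.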

    Now, let us suppose that $X$ is an RV with the Poisson-Lindley (PL) distribution, whose PMF is given by:

    $$p_X(x;\theta) := P\{X=x\} = \frac{\theta^2(\theta+x+2)}{(\theta+1)^{x+3}}, \quad x=0,1,2,\ldots,$$ (7)

    where $\theta>0$ is the (unknown) parameter. It should be noted that the PL distribution, first proposed by Sankaran [20], is obtained from the Poisson distribution whose parameter follows the Lindley distribution (some more results about this distribution and its generalizations can be found, e.g., in [21,22,23,24]). Based on this, the PGF of the PL distribution is easily obtained:

    $$\Psi_X(u;\theta) := E\left[u^X\right] = \frac{\theta^2(\theta-u+2)}{(\theta+1)(\theta-u+1)^2}, \quad |u|<\theta+1,$$ (8)

    and using a procedure similar to the previous one, its mean and variance are, respectively:

    $$\mu_X := E[X] = \frac{\theta+2}{\theta(\theta+1)}, \qquad \sigma_X^2 := \mathrm{Var}[X] = \frac{\theta^3+4\theta^2+6\theta+2}{\theta^2(\theta+1)^2} = \mu_X+\frac{\theta^2+4\theta+2}{\theta^2(\theta+1)^2}.$$ (9)

    Thus, since $\sigma_X^2>\mu_X$, the PL-distributed RV $X$ is over-dispersed. Finally, the survival function of $X$, which plays an important role in the further presentation, is obtained as follows:

    $$S_X(x) := P\{X \geq x\} = \sum_{k=x}^{\infty} p_X(k;\theta) = \frac{\theta x+(\theta+1)^2}{(\theta+1)^{x+2}}.$$ (10)
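    The closed forms in Eqs (9) and (10) can be checked numerically against truncated sums of the PMF in Eq (7). The sketch below is only a sanity check; the function names, the value θ = 2, and the truncation point of the support are assumptions made for illustration.

```python
def pl_pmf(x, theta):
    """Poisson-Lindley PMF, Eq (7)."""
    return theta**2 * (theta + x + 2) / (theta + 1) ** (x + 3)

def pl_survival(x, theta):
    """Closed-form survival function P{X >= x}, Eq (10)."""
    return (theta * x + (theta + 1) ** 2) / (theta + 1) ** (x + 2)

theta = 2.0
support = range(300)   # truncation point; the neglected tail is negligible here

mean = sum(x * pl_pmf(x, theta) for x in support)
var = sum(x * x * pl_pmf(x, theta) for x in support) - mean**2
mean_closed = (theta + 2) / (theta * (theta + 1))
var_closed = (theta**3 + 4 * theta**2 + 6 * theta + 2) / (theta**2 * (theta + 1) ** 2)
tail_from_5 = sum(pl_pmf(k, theta) for k in range(5, 300))

print(mean, mean_closed)                    # both sides of the first part of Eq (9)
print(var, var_closed)                      # both sides of the second part of Eq (9)
print(tail_from_5, pl_survival(5, theta))   # both sides of Eq (10) at x = 5
```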

    Under the given assumptions, the following preliminary result can be proved.

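    Proposition 1. Let $X$ be the PL-distributed RV, with the PMF given by Eq (7), which is independent of the RVs $G_j \sim \mathrm{Geom}(\alpha/(1+\alpha))$ given in Definition 1. Then, the RV $\alpha \diamond X$ represents the mixture of the negative binomial distribution $\mathrm{NB}(\beta/(1+\beta),2)$ and the geometric distribution $\mathrm{Geom}(\beta/(1+\beta))$, with the PMF:

    $$p_{\alpha \diamond X}(x;\alpha,\theta) := P\{\alpha \diamond X=x\} = \frac{(x+1)\beta^x}{(\theta+1)^2(1+\beta)^{x+2}}+\frac{\theta(\theta+2)\beta^x}{(\theta+1)^2(1+\beta)^{x+1}},$$ (11)

    and the survival function:

    $$S_{\alpha \diamond X}(x;\alpha,\theta) := P\{\alpha \diamond X \geq x\} = \frac{\alpha^x(\theta+1)^{x-2}}{(\alpha\theta+\alpha+\theta)^{x+1}}\left[\theta\left(x+(\theta+1)^2\right)+\alpha(\theta+1)^3\right],$$ (12)

    where $x=0,1,2,\ldots$ and $\beta=\alpha(\theta+1)/\theta>0$.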

    Proof. The proof of this proposition, as well as the following theorems, is given in the Appendix.

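    Remark 1. Using Proposition 1, as well as the facts previously stated, some more stochastic properties of the RV $\alpha \diamond X$, when $X$ has the $\mathrm{PL}(\theta)$ distribution, can be highlighted. For instance, according to Equations (5), (6), and (9), and by applying the laws of total expectation and total variance, the mean and the variance of the RV $\alpha \diamond X$ are, respectively:

    $$\mu_{\alpha \diamond X} := E[\alpha \diamond X] = E\left[E[\alpha \diamond X \mid X]\right] = \alpha E[X+1] = \alpha\,\frac{(\theta+1)^2+1}{\theta(\theta+1)},$$

    $$\sigma^2_{\alpha \diamond X} := \mathrm{Var}[\alpha \diamond X] = E\left[\mathrm{Var}[\alpha \diamond X \mid X]\right]+\mathrm{Var}\left[E[\alpha \diamond X \mid X]\right] = \alpha(\alpha+1)E[X+1]+\alpha^2\,\mathrm{Var}[X]$$
    $$= \alpha(\alpha+1)\frac{(\theta+1)^2+1}{\theta(\theta+1)}+\alpha^2\,\frac{\theta^3+4\theta^2+6\theta+2}{\theta^2(\theta+1)^2} = \mu_{\alpha \diamond X}+\alpha^2\,\frac{\theta^4+4\theta^3+8\theta^2+8\theta+2}{\theta^2(\theta+1)^2}.$$

    Note that the RV $\alpha \diamond X$, like the PL-distributed RV $X$, is over-dispersed. The same results can also be obtained by applying Proposition 1, i.e., the PMF of the RV $\alpha \diamond X$ given by Eq (11). Finally, observations of the RV $\alpha \diamond X$ can be generated by applying the following simple algorithm:

    Step 1. Generate an RV $U$ from the uniform distribution over the interval $(0,1)$.

    Step 2. If $U<(\theta+1)^{-2}$, generate the RV from the negative binomial distribution $\mathrm{NB}(\beta/(1+\beta),2)$.

    Step 3. Else, generate the RV from the geometric distribution $\mathrm{Geom}(\beta/(1+\beta))$.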
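    Steps 1-3 can be sketched in Python as follows. The sketch assumes that NB(β/(1+β), 2) is sampled as the sum of two independent geometric counts with mean β each, which is consistent with the mixture PMF in Eq (11). All function names, the seed, and the parameter values are illustrative.

```python
import random

def geometric(m, rng):
    """Geometric count on {0, 1, 2, ...} with P{G = x} = m^x / (1 + m)^(x + 1)."""
    g = 0
    while rng.random() >= 1.0 / (1.0 + m):
        g += 1
    return g

def sample_thinned_pl(alpha, theta, rng):
    """One draw of the thinned PL variable, following Steps 1-3."""
    beta = alpha * (theta + 1) / theta
    u = rng.random()                                          # Step 1
    if u < (theta + 1) ** -2:                                 # Step 2: NB component
        return geometric(beta, rng) + geometric(beta, rng)
    return geometric(beta, rng)                               # Step 3: geometric component

def thinned_pl_pmf(x, alpha, theta):
    """Mixture PMF of Eq (11), written with r = beta / (1 + beta) to avoid overflow."""
    beta = alpha * (theta + 1) / theta
    r = beta / (1 + beta)
    w = (theta + 1) ** -2                                     # weight of the NB component
    return w * (x + 1) * r**x / (1 + beta) ** 2 + (1 - w) * r**x / (1 + beta)

alpha, theta = 0.75, 1.0
rng = random.Random(7)
draws = [sample_thinned_pl(alpha, theta, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
pmf_mean = sum(x * thinned_pl_pmf(x, alpha, theta) for x in range(500))
closed_mean = alpha * ((theta + 1) ** 2 + 1) / (theta * (theta + 1))
print(sample_mean, pmf_mean, closed_mean)   # three estimates of the same mean
```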

    Similarly as in Aleksić & Ristić [14], a formal definition of the integer-valued minification process (of the first order) is given here. Thereafter, some stochastic properties of this process are considered in the case when it has a PL marginal distribution.

    Definition 2. A time series $(X_t)$, $t \in \mathbb{Z}$, given by the equality:

    $$X_t := \min(\alpha \diamond X_{t-1}, \varepsilon_t),$$ (13)

    is a minification INAR model of the first order (abbr. MINAR(1) model) if it meets the following conditions:

    (i) $(\varepsilon_t)$, $t \in \mathbb{Z}$, is an innovation series, that is, a series of IID integer-valued RVs, independent of $(X_t)$,

    (ii) the counting series incorporated in $\alpha \diamond X_{t-1}$, $t \in \mathbb{Z}$, are mutually independent and also independent of the RVs $X_{t-1}$ and $\varepsilon_t$,

    (iii) the counting series incorporated in $\alpha \diamond X_{t-1}$ and $\alpha \diamond X_{k-1}$ are independent for all $t \neq k$,

    (iv) the RVs $X_{t-j}$ and $\varepsilon_t$ are independent for all $j \in \mathbb{Z}$.

    Based on this definition, it may seem at first glance that the minification process leads to relatively small values. However, due to the definition of the modified thinning operator, higher values are also possible. This behavior is similar to that of max-INAR models, discussed in [26,27,28], but the two models differ in how they reach the extreme value (for more details see Aleksić & Ristić [14]). In the following, we assume that for all $t \in \mathbb{Z}$, the RVs $X_t$ have a Poisson-Lindley distribution $\mathrm{PL}(\theta)$, whose PMF is given by Eq (7). Thus, we say that the series $(X_t)$ represents a PL-minification integer-valued autoregressive (abbr. PL-MINAR) process (of the first order). According to the definition of the PL-MINAR process and the results described in the previous section, we first determine the distribution of its innovation series $(\varepsilon_t)$.

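    Theorem 1. When the innovations $(\varepsilon_t)$ have the survival function of the form:

    $$S_\varepsilon(x;\alpha,\theta) := P\{\varepsilon_t \geq x\} = \frac{(\alpha\theta+\alpha+\theta)^{x+1}\left(\theta(\theta+x+2)+1\right)}{\alpha^x(\theta+1)^{2x}\left(\alpha(\theta+1)^3+\theta\left((\theta+1)^2+x\right)\right)},$$ (14)

    then the discrete time series $(X_t)$, given by Eq (13), has a stationary Poisson-Lindley distribution $\mathrm{PL}(\theta)$, whose PMF is given by Eq (7), if and only if the following inequality is valid:

    $$\alpha \geq \frac{1}{2}\left(\frac{1-\theta}{1+\theta}+\sqrt{\frac{\theta^2+3\theta+6}{(\theta+1)(\theta+2)}}\right).$$ (15)

    The reverse statement is also true, and the PGF of the RVs $(\varepsilon_t)$ is given as follows:

    $$\Psi_\varepsilon(u;\alpha,\theta) := E\left[u^{\varepsilon_t}\right] = 1+\gamma(u-1)\left(\frac{\lambda}{1-\gamma u}-(\lambda-1)\,\delta\,L_{\delta+1}(\gamma u)\right),$$ (16)

    wherein $\lambda=\alpha\theta+\alpha+\theta$, $\gamma=\lambda/(\alpha(\theta+1)^2)$, $\delta=\lambda(\theta+1)^2/\theta$, $|u|<1/\gamma$ and

    $$L_\delta(u) := \sum_{k=0}^{\infty}\frac{u^k}{k+\delta}$$

    is the Lerch transcendent (of the first order).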
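    The role of the restriction on α in Theorem 1 can be illustrated numerically. The sketch below evaluates the innovation survival function of Eq (14), forms the PMF by differencing, and checks whether the result is a valid distribution. It assumes that inequality (15) reads α ≥ ½[(1−θ)/(1+θ) + √((θ²+3θ+6)/((θ+1)(θ+2)))], a reading that agrees with requiring S_ε(1) ≤ S_ε(0) = 1 (for θ = 1 both give α ≥ √(5/12)). Function names and parameter values are illustrative.

```python
import math

def innovation_survival(x, alpha, theta):
    """S_eps(x) = P{eps_t >= x} from Eq (14), written with gamma^x to avoid overflow."""
    lam = alpha * theta + alpha + theta
    gamma = lam / (alpha * (theta + 1) ** 2)
    num = theta * (theta + x + 2) + 1
    den = alpha * (theta + 1) ** 3 + theta * ((theta + 1) ** 2 + x)
    return lam * gamma**x * num / den

def innovation_pmf(x, alpha, theta):
    """p_eps(x) = S_eps(x) - S_eps(x + 1)."""
    return innovation_survival(x, alpha, theta) - innovation_survival(x + 1, alpha, theta)

def alpha_lower_bound(theta):
    """Right-hand side of inequality (15), under the reading stated above."""
    root = math.sqrt((theta**2 + 3 * theta + 6) / ((theta + 1) * (theta + 2)))
    return 0.5 * ((1 - theta) / (1 + theta) + root)

theta = 1.0
print(alpha_lower_bound(theta))          # smallest admissible alpha for theta = 1
print(innovation_pmf(0, 0.75, theta))    # positive: alpha = 0.75 is admissible
print(innovation_pmf(0, 0.60, theta))    # negative: alpha = 0.60 violates the bound
```

    For α below the bound, the differenced "PMF" takes a negative value at x = 0, so no valid innovation distribution exists for that parameter pair.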

    Remark 2. Figure 1 shows the distributions of the series $(X_t)$, $(\alpha \diamond X_t)$, and $(\varepsilon_t)$ for the parameter values $\alpha=0.75$ and $\theta=1$. It is easy to verify that both of the above conditions are fulfilled, and based on them, it can be seen that increasing the value of the parameter $\theta$ also enlarges the set of allowed values for $\alpha$ (see again Figure 6(b)). It is also worth noting that the series $(\alpha \diamond X_t)$ and $(X_t)$ have quite similar distributions, which is expected because, according to the definition of the PL-MINAR process given by Eq (13), their values partially coincide. On the other hand, the distribution of the innovation series $(\varepsilon_t)$ is markedly different from them, so its properties are investigated further.

    Figure 1.  The PMFs of the RVs $(X_t)$, $(\alpha \diamond X_t)$, and $(\varepsilon_t)$.

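    Remark 3. By differentiating the PGF $\Psi_\varepsilon(u;\alpha,\theta)$ given by Eq (16) and then putting $u=1$, the mean value of the innovations $(\varepsilon_t)$ can be easily calculated as follows:

    $$\mu_\varepsilon := E[\varepsilon_t] = \left.\frac{\partial \Psi_\varepsilon(u;\alpha,\theta)}{\partial u}\right|_{u=1} = \left.\left[\frac{\lambda\gamma(1-\gamma)}{(1-\gamma u)^2}-(\lambda-1)\gamma\delta\left(L_{\delta+1}(\gamma u)+\gamma(u-1)L'_{\delta+1}(\gamma u)\right)\right]\right|_{u=1} = \gamma\left[\frac{\lambda}{1-\gamma}-(\lambda-1)\,\delta\,L_{\delta+1}(\gamma)\right].$$

    At the same time, note that under the condition $0<\gamma<1$ and the inequality:

    $$L_{\delta+1}(\gamma) = \sum_{k=0}^{\infty}\frac{\gamma^k}{k+\delta+1} < \frac{1}{\delta}\sum_{k=0}^{\infty}\gamma^k = \frac{1}{\delta(1-\gamma)}$$

    it follows that:

    $$\mu_\varepsilon > \frac{\lambda\gamma}{1-\gamma}-\frac{(\lambda-1)\gamma}{1-\gamma} = \frac{\gamma}{1-\gamma} > 0.$$

    Similarly, according to the equality:

    $$E[\varepsilon_t(\varepsilon_t-1)] = \left.\frac{\partial^2 \Psi_\varepsilon(u;\alpha,\theta)}{\partial u^2}\right|_{u=1} = 2\gamma^2\left[\frac{\lambda}{(1-\gamma)^2}-(\lambda-1)\,\delta\,L'_{\delta+1}(\gamma)\right]$$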

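    for the variance of the RVs $(\varepsilon_t)$, after some computations, one obtains:

    $$\mathrm{Var}[\varepsilon_t] = E[\varepsilon_t(\varepsilon_t-1)]+E[\varepsilon_t]-\left(E[\varepsilon_t]\right)^2 = \frac{\gamma}{(1-\gamma)^2}\left[\lambda(1+\gamma)-(\lambda-1)(1-\gamma)^2\,\delta\,L_{\delta+1}(\gamma)-\gamma\left(\lambda-(\lambda-1)(1-\gamma)\,\delta\,L_{\delta+1}(\gamma)\right)^2\right]-2\gamma^2\delta(\lambda-1)L'_{\delta+1}(\gamma).$$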
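    The innovation moments of Remark 3 can be cross-checked numerically. The sketch below computes the mean from the PGF-based expression γ[λ/(1−γ) − (λ−1)δL_{δ+1}(γ)] and the variance from the identity Var = E[ε(ε−1)] + E[ε] − (E[ε])², and compares both with direct tail sums of the survival function in Eq (14). The truncation length, function names, and parameter values are assumptions for illustration.

```python
def lerch(d, u, terms=2000):
    """Truncated series L_d(u) = sum_{k>=0} u^k / (k + d), for |u| < 1."""
    return sum(u**k / (k + d) for k in range(terms))

def lerch_prime(d, u, terms=2000):
    """Term-by-term derivative of L_d(u) with respect to u."""
    return sum(k * u ** (k - 1) / (k + d) for k in range(1, terms))

alpha, theta = 0.75, 1.0
lam = alpha * theta + alpha + theta
gamma = lam / (alpha * (theta + 1) ** 2)
delta = lam * (theta + 1) ** 2 / theta

def survival(x):
    """Innovation survival function S_eps(x), Eq (14)."""
    num = theta * (theta + x + 2) + 1
    den = alpha * (theta + 1) ** 3 + theta * ((theta + 1) ** 2 + x)
    return lam * gamma**x * num / den

# PGF-based expressions
mean_pgf = gamma * (lam / (1 - gamma) - (lam - 1) * delta * lerch(delta + 1, gamma))
fact2 = 2 * gamma**2 * (lam / (1 - gamma) ** 2
                        - (lam - 1) * delta * lerch_prime(delta + 1, gamma))
var_pgf = fact2 + mean_pgf - mean_pgf**2

# Direct tail sums: E[eps] = sum S(x), E[eps^2] = sum (2x - 1) S(x), x >= 1
mean_direct = sum(survival(x) for x in range(1, 2000))
var_direct = sum((2 * x - 1) * survival(x) for x in range(1, 2000)) - mean_direct**2

print(mean_pgf, mean_direct)
print(var_pgf, var_direct)
print(gamma / (1 - gamma))     # lower bound on the mean stated in Remark 3
```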

    Below are some important properties of the PL-MINAR series (Xt) related to its conditional stochastic measures. First, the Markov properties and the conditional PGF of the PL-MINAR series (Xt) are presented.

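    Theorem 2. Let $(X_t)$, $t \in \mathbb{Z}$, be a PL-MINAR process given by Eq (13). Then, $(X_t)$ is a strictly stationary, homogeneous Markov process with one-step transition probabilities:

    $$P\{X_t=x \mid X_{t-1}=y\} = S_\varepsilon(x;\alpha,\theta)\binom{x+y}{x}\frac{\alpha^x}{(1+\alpha)^{x+y+1}}+p_\varepsilon(x;\alpha,\theta)\sum_{k=x+1}^{\infty}\binom{k+y}{k}\frac{\alpha^k}{(1+\alpha)^{y+k+1}},$$ (17)

    where $x,y=0,1,2,\ldots$. In addition, the conditional PGF of the PL-MINAR RVs $(X_t)$, for a given value $X_{t-1}=y$, has the following form:

    $$E\left[u^{X_t} \mid X_{t-1}=y\right] = 1+\frac{\lambda\gamma(u-1)}{1-\gamma u}\left(1-\frac{1}{(1+\alpha-\alpha\gamma u)^{y+1}}\right)-(\lambda-1)\gamma\delta(u-1)\left(L_{\delta+1}(\gamma u)-\sum_{j=0}^{\infty}\binom{y+j}{j}\frac{(\alpha\gamma u)^j L_{\delta+j+1}(\gamma u)}{(1+\alpha)^{y+j+1}}\right),$$ (18)

    where $|u|<1/\gamma$ and the parameters $\lambda$, $\gamma$, $\delta$ are the same as in Theorem 1.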

    Remark 4. Based on the previous theorem, in a similar way as in [21], some additional Markovian properties of the PL-MINAR process can be observed. Note that Eq (14), under the condition given by Eq (15), ensures $S_\varepsilon(x;\alpha,\theta)>S_\varepsilon(x+1;\alpha,\theta)>0$, which implies $p_\varepsilon(x;\alpha,\theta)>0$ for all $x=0,1,2,\ldots$. Therefore, the transition probabilities given by Eq (17) are always positive, so the process $(X_t)$ is an irreducible, aperiodic and (positively or null) recurrent Markov chain.
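    The transition probabilities of Eq (17) are straightforward to evaluate once the infinite sum is written as one minus a finite negative binomial sum. The following sketch does so and checks that each row of the transition matrix is positive and sums to one (up to truncation). Function names and the parameter values α = 0.75, θ = 1 are illustrative assumptions.

```python
from math import comb

def innovation_survival(x, alpha, theta):
    """S_eps(x) from Eq (14), written with gamma^x to avoid overflow."""
    lam = alpha * theta + alpha + theta
    gamma = lam / (alpha * (theta + 1) ** 2)
    num = theta * (theta + x + 2) + 1
    den = alpha * (theta + 1) ** 3 + theta * ((theta + 1) ** 2 + x)
    return lam * gamma**x * num / den

def nb_pmf(k, y, alpha):
    """Conditional PMF of the thinned value given X = y, Eq (3)."""
    return comb(k + y, k) * alpha**k / (1 + alpha) ** (k + y + 1)

def transition_prob(x, y, alpha, theta):
    """One-step transition probability P{X_t = x | X_{t-1} = y}, Eq (17)."""
    s_x = innovation_survival(x, alpha, theta)
    p_x = s_x - innovation_survival(x + 1, alpha, theta)
    # P{thinned value > x | y}, i.e. the infinite sum in Eq (17)
    upper_tail = 1.0 - sum(nb_pmf(k, y, alpha) for k in range(x + 1))
    return s_x * nb_pmf(x, y, alpha) + p_x * upper_tail

alpha, theta = 0.75, 1.0
for y in (0, 3, 10):
    row = [transition_prob(x, y, alpha, theta) for x in range(200)]
    print(y, sum(row), row[:3])   # row sum should be close to 1
```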

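    Remark 5. Based on the previous theorem and using the well-known properties of PGFs, the conditional mathematical expectation and the conditional variance of the PL-MINAR series $(X_t)$ can be derived. Thus, by differentiating Eq (18) and setting $u=1$, for the conditional mean of the series $(X_t)$, after some computations, one obtains:

    $$E[X_t \mid X_{t-1}] = \left.\frac{\partial E\left[u^{X_t} \mid X_{t-1}\right]}{\partial u}\right|_{u=1} = \frac{\lambda\gamma}{1-\gamma}\left[1-\left(\frac{1}{1+\alpha-\alpha\gamma}\right)^{X_{t-1}+1}\right]-\delta(\lambda-1)\gamma\left[L_{\delta+1}(\gamma)-\sum_{j=0}^{\infty}\binom{X_{t-1}+j}{j}\frac{(\alpha\gamma)^j L_{\delta+j+1}(\gamma)}{(1+\alpha)^{X_{t-1}+j+1}}\right].$$ (19)

    Similarly, the conditional variance of $(X_t)$ can be obtained as:

    $$\mathrm{Var}[X_t \mid X_{t-1}] = E[X_t(X_t-1) \mid X_{t-1}]+E[X_t \mid X_{t-1}]-\left(E[X_t \mid X_{t-1}]\right)^2,$$

    where

    $$E[X_t(X_t-1) \mid X_{t-1}] = \left.\frac{\partial^2 E\left[u^{X_t} \mid X_{t-1}\right]}{\partial u^2}\right|_{u=1}.$$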

    However, due to the complexity of the calculation, a more detailed procedure will be omitted here.
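    As a numerical check of the conditional mean in Eq (19), the sketch below evaluates it with truncated series and compares it with the tail-sum representation E[X_t | X_{t-1} = y] = Σ_{x≥1} P{ε_t ≥ x} P{thinned value ≥ x | y}, which follows from the minification definition and the independence assumptions. The truncation lengths, function names, and the parameter values α = 0.5, θ = 2 are assumptions made for illustration.

```python
from math import comb

def lerch(d, u, terms=400):
    """Truncated series L_d(u) = sum_{k>=0} u^k / (k + d), for |u| < 1."""
    return sum(u**k / (k + d) for k in range(terms))

def params(alpha, theta):
    lam = alpha * theta + alpha + theta
    gamma = lam / (alpha * (theta + 1) ** 2)
    delta = lam * (theta + 1) ** 2 / theta
    return lam, gamma, delta

def cond_mean_closed(y, alpha, theta, terms=200):
    """E[X_t | X_{t-1} = y] from Eq (19), with the j-series truncated."""
    lam, gamma, delta = params(alpha, theta)
    first = lam * gamma / (1 - gamma) * (1 - (1 + alpha - alpha * gamma) ** -(y + 1))
    series = sum(comb(y + j, j) * (alpha * gamma) ** j * lerch(delta + j + 1, gamma)
                 / (1 + alpha) ** (y + j + 1) for j in range(terms))
    return first - delta * (lam - 1) * gamma * (lerch(delta + 1, gamma) - series)

def cond_mean_direct(y, alpha, theta, terms=400):
    """Same quantity via the tail-sum representation."""
    lam, gamma, _ = params(alpha, theta)
    total, cdf = 0.0, 0.0
    for x in range(1, terms):
        # accumulate P{thinned value <= x - 1 | y}, using the NB PMF of Eq (3)
        cdf += comb(x - 1 + y, x - 1) * alpha ** (x - 1) / (1 + alpha) ** (x + y)
        s_eps = lam * gamma**x * (theta * (theta + x + 2) + 1) / (
            alpha * (theta + 1) ** 3 + theta * ((theta + 1) ** 2 + x))
        total += s_eps * (1.0 - cdf)
    return total

for y in (0, 2, 7):
    print(y, cond_mean_closed(y, 0.5, 2.0), cond_mean_direct(y, 0.5, 2.0))
```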

    Finally, the correlation structure of the PL-MINAR process is examined in the following statement.

    Theorem 3. The first-order autocorrelation of the PL-MINAR series $(X_t)$ is given by the following expression:

    ρX(1):=Corr(Xt,Xt1)=θ2(θ+1)2θ3+4θ2+6θ+2[θ+2θ(θ+1)(λγ1γ1+(λ1)γδLδ+1(γ))+γδ(λ1)(αθ+α+θ)3(θθ+1)2j=0(j+1)(αγ(θ+1)(1+α)(θ+1)1)j×((1+α)(θ+1)(θ+3)θ+j1)Lj+δ+1(γ)(θ+2)2θ2(θ+1)2]. (20)

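    Remark 6. Let us briefly examine the problem of finding the correlation between the RVs $X_t$ and $X_{t-k}$, where $k=1,2,\ldots$ Using similar considerations as for the general minification process [12], we define a two-dimensional survival function:

    $$S^{(2)}_k(x,y) := P\{X_t \geq x, X_{t-k} \geq y\} = P\{\min(\alpha \diamond X_{t-1},\varepsilon_t) \geq x, X_{t-k} \geq y\} = S_\varepsilon(x)\,P\{\alpha \diamond X_{t-1} \geq x, X_{t-k} \geq y\}.$$

    From here, using the conditional probability and the conditional expectation, one obtains:

    $$E[X_t X_{t-k}] = \sum_{x=1}^{\infty}\sum_{y=1}^{\infty} S^{(2)}_k(x,y) = \sum_{x=1}^{\infty} S_\varepsilon(x)\sum_{y=1}^{\infty} P\{X_{t-k} \geq y \mid \alpha \diamond X_{t-1} \geq x\}\,P\{\alpha \diamond X_{t-1} \geq x\}$$
    $$= \sum_{x=1}^{\infty} S_\varepsilon(x)\,P\{\alpha \diamond X_{t-1} \geq x\}\,E[X_{t-k} \mid \alpha \diamond X_{t-1} \geq x] = \sum_{x=1}^{\infty} P\{\min(\alpha \diamond X_{t-1},\varepsilon_t) \geq x\}\,E[X_{t-k} \mid \alpha \diamond X_{t-1} \geq x] =: \Phi(X_{t-1},X_{t-k};\alpha),$$

    where the interchange of the order of summation is justified in a manner similar to the proof of Theorem 2 (see Appendix). Thus, the mixed moments $E[X_t X_{t-k}]$, and therefore the autocorrelation function $\rho_X(k)=\mathrm{Corr}(X_t,X_{t-k})$, $k=1,2,\ldots$, can be expressed recursively, in terms of the bivariate distribution of the RVs $(X_{t-1},X_{t-k})$.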

    In this part, some procedures for estimating the unknown parameters $\alpha \in (0,1)$ and $\theta>0$ of the PL-MINAR model are discussed. For this purpose, based on the previously obtained theoretical results, three different estimation methods are examined, and they are described in more detail below. As usual, it is assumed that the PL-MINAR series $(X_t)$ is given by one of its observed realizations $X_1,\ldots,X_T$ of length $T>0$.

    This estimation method is based on equating the theoretical and empirical moments of the PL-MINAR series $(X_t)$. According to the stationarity of the series $(X_t)$ and the first equality in Eq (9), solving $\mu_X=\overline{X}_T$ for $\theta$ gives the MM estimator of this parameter:

    $$\hat{\theta}_{MM} = \frac{1-\overline{X}_T+\sqrt{(1-\overline{X}_T)^2+8\overline{X}_T}}{2\overline{X}_T},$$ (21)

    where $\overline{X}_T := T^{-1}\sum_{t=1}^{T}X_t$ is the empirical mean of the realized series $X_1,\ldots,X_T$. By using some general results about the MM estimates of the PL distribution (e.g., [25]), the asymptotic normality of the estimator $\hat{\theta}_{MM}$ can be proven. Thereafter, by replacing $\theta$ with $\hat{\theta}_{MM}$, as well as the first-order correlation $\rho_X(1)$ with the empirical one

    $$\hat{\rho}_T(1) := \frac{\sum_{t=1}^{T-1}(X_t-\overline{X}_T)(X_{t+1}-\overline{X}_T)}{\sum_{t=1}^{T}(X_t-\overline{X}_T)^2},$$

    solving Eq (20) with respect to $\alpha$ gives the estimator $\hat{\alpha}_{MM}$ of this parameter. Because of the complexity of this equation, a numerical procedure must be used; however, after substituting $\theta=\hat{\theta}_{MM}$ and $\rho_X(1)=\hat{\rho}_T(1)$, Eq (20) becomes an equation in the single unknown $\alpha$. Moreover, since the mean and variance of the PL-MINAR series $(X_t)$ do not depend on $\alpha \in (0,1)$, to obtain this parameter we can take Eq (32) (see Appendix), i.e., the equality $E[X_t X_{t-1}]=\hat{M}^{(2)}_T$, where

    $$\hat{M}^{(2)}_T = \frac{1}{T-1}\sum_{t=2}^{T}X_t X_{t-1}$$

    is the empirical mixed moment of the observed series $X_1,\ldots,X_T$. Finally, the consistency and asymptotic normality of this estimator can be shown by using some general results on MM estimators (for more detail see, e.g., [29]).
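    The sample quantities used by the MM method are simple to compute. The sketch below implements Eq (21), the lag-one sample autocorrelation, and the empirical mixed moment; the numerical solution for α is not shown. Function names and the toy series are illustrative, and the estimator of θ is undefined when the sample mean is zero.

```python
import math

def theta_mm(xs):
    """Method-of-moments estimate of theta from the sample mean, Eq (21)."""
    m = sum(xs) / len(xs)          # requires m > 0
    return (1 - m + math.sqrt((1 - m) ** 2 + 8 * m)) / (2 * m)

def acf1(xs):
    """Lag-one sample autocorrelation."""
    m = sum(xs) / len(xs)
    num = sum((xs[t] - m) * (xs[t + 1] - m) for t in range(len(xs) - 1))
    den = sum((x - m) ** 2 for x in xs)
    return num / den

def mixed_moment(xs):
    """Empirical mixed moment: average of X_t * X_{t-1} over t = 2, ..., T."""
    return sum(xs[t] * xs[t - 1] for t in range(1, len(xs))) / (len(xs) - 1)

toy = [0, 1, 1]                    # sample mean 2/3 corresponds to theta = 2
print(theta_mm(toy), acf1(toy), mixed_moment(toy))
```

    For the toy series the sample mean equals (θ+2)/(θ(θ+1)) at θ = 2, so Eq (21) returns 2 up to rounding.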

    As is well known, for the observed values $x_1,\ldots,x_T$ of the mentioned realization, the CML method is based on maximizing the conditional log-likelihood function:

    $$\ell(\alpha,\theta) = \log L(\alpha,\theta) = \log\prod_{t=2}^{T}P\{X_t=x_t \mid X_{t-1}=x_{t-1}\} = \sum_{t=2}^{T}\log P\{X_t=x_t \mid X_{t-1}=x_{t-1}\}.$$ (22)

    Since the PL-MINAR process $(X_t)$ has the Markov properties described in Theorem 2, its conditional log-likelihood function can be easily derived. Namely, by substituting the transition probabilities given by Eq (17) into Eq (22), the following objective function in $\alpha,\theta$ is obtained:

    $$\ell(\alpha,\theta) = \sum_{t=2}^{T}\log\left[S_\varepsilon(x_t;\alpha,\theta)\binom{x_t+x_{t-1}}{x_t}\frac{\alpha^{x_t}}{(1+\alpha)^{x_t+x_{t-1}+1}}+p_\varepsilon(x_t;\alpha,\theta)\left(1-\sum_{k=0}^{x_t}\binom{k+x_{t-1}}{k}\frac{\alpha^k}{(1+\alpha)^{x_{t-1}+k+1}}\right)\right].$$ (23)

    Note that CML estimates of the parameters $\alpha,\theta$ can be computed by solving the coupled equations $\partial\ell(\alpha,\theta)/\partial\alpha=\partial\ell(\alpha,\theta)/\partial\theta=0$. However, as is common for this estimation method, the CML estimators $\hat{\alpha}_{CML}$ and $\hat{\theta}_{CML}$ cannot be obtained in closed form in this way. Therefore, it is necessary to use numerical methods for their calculation, which are also discussed below. It is worth noting that, similar to the previous method, the asymptotic properties of these estimators can be shown using general results on CML estimators (for more details see, e.g., [30]). Furthermore, using a procedure like that for the Poisson-Lindley INAR(1) model, introduced in [21], the asymptotic equivalence of the efficiency of the MM and CML estimators can also be shown.
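    A minimal sketch of the conditional log-likelihood in Eqs (22) and (23) is given below. It only evaluates the objective at given parameter values; the maximization itself (done with numerical optimizers in the paper) is not reproduced here. Function names, the toy series, and the parameter values are illustrative assumptions, and the parameters are assumed to satisfy the condition of Theorem 1 so that all transition probabilities are positive.

```python
from math import comb, log

def innovation_survival(x, alpha, theta):
    """S_eps(x) from Eq (14), written with gamma^x to avoid overflow."""
    lam = alpha * theta + alpha + theta
    gamma = lam / (alpha * (theta + 1) ** 2)
    num = theta * (theta + x + 2) + 1
    den = alpha * (theta + 1) ** 3 + theta * ((theta + 1) ** 2 + x)
    return lam * gamma**x * num / den

def transition_prob(x, y, alpha, theta):
    """Bracketed term of Eq (23) for x_t = x and x_{t-1} = y."""
    nb = [comb(k + y, k) * alpha**k / (1 + alpha) ** (k + y + 1) for k in range(x + 1)]
    s_x = innovation_survival(x, alpha, theta)
    p_x = s_x - innovation_survival(x + 1, alpha, theta)
    return s_x * nb[x] + p_x * (1.0 - sum(nb))

def cond_log_lik(xs, alpha, theta):
    """Conditional log-likelihood of Eq (22) for an observed series xs."""
    return sum(log(transition_prob(xs[t], xs[t - 1], alpha, theta))
               for t in range(1, len(xs)))

toy = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0]     # illustrative data, not from the paper
print(cond_log_lik(toy, 0.75, 1.0))
print(cond_log_lik(toy, 0.50, 2.0))
```

    In practice one would pass the negative of `cond_log_lik` to a box-constrained optimizer, restricting the search to parameter pairs that satisfy the admissibility condition.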

    The third estimation method is the CLS method, where the estimates of the parameters $\alpha \in (0,1)$ and $\theta>0$ are obtained as the values that minimize the following objective function:

    $$Q_T(\alpha,\theta) = \sum_{t=2}^{T}\left[X_t-E[X_t \mid X_{t-1}]\right]^2.$$

    By substituting here the conditional mean given by Equation (19), it follows:

    $$Q_T(\alpha,\theta) = \sum_{t=2}^{T}\left[X_t-\left(\frac{\lambda\gamma}{1-\gamma}\left[1-\left(\frac{1}{1+\alpha-\alpha\gamma}\right)^{X_{t-1}+1}\right]-\delta(\lambda-1)\gamma\left[L_{\delta+1}(\gamma)-\sum_{j=0}^{\infty}\binom{X_{t-1}+j}{j}\frac{(\alpha\gamma)^j L_{\delta+j+1}(\gamma)}{(1+\alpha)^{X_{t-1}+j+1}}\right]\right)\right]^2,$$ (24)

    wherein $\lambda=\alpha\theta+\alpha+\theta$, $\gamma=\lambda/(\alpha(\theta+1)^2)$ and $\delta=\lambda(\theta+1)^2/\theta$. Therefore, by applying the usual procedure, that is, by solving the coupled equations $\partial Q_T(\alpha,\theta)/\partial\alpha=\partial Q_T(\alpha,\theta)/\partial\theta=0$, the CLS estimators can be computed. Note that, as in the previous case, the minimization of the function $Q_T(\alpha,\theta)$ can be performed using numerical procedures, while the asymptotic properties of the obtained CLS estimates can be proved by applying some basic results of the CLS theory (for more detail see, e.g., [31]). Finally, as before, the asymptotic efficiency equivalence of the CLS estimators with the previous ones can be shown.
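    The CLS objective can be sketched as follows. For brevity, the conditional mean is evaluated through the tail-sum representation Σ_{x≥1} P{ε_t ≥ x} P{thinned value ≥ x | X_{t-1}}, which is used here as a computational stand-in for the series form of Eq (19); this substitution, the truncation length, the function names, and the toy data are assumptions made for illustration.

```python
from math import comb

def cond_mean(y, alpha, theta, terms=400):
    """E[X_t | X_{t-1} = y] as a truncated tail sum (equivalent to Eq (19))."""
    lam = alpha * theta + alpha + theta
    gamma = lam / (alpha * (theta + 1) ** 2)
    total, cdf = 0.0, 0.0
    for x in range(1, terms):
        # accumulate P{thinned value <= x - 1 | y} from the NB PMF of Eq (3)
        cdf += comb(x - 1 + y, x - 1) * alpha ** (x - 1) / (1 + alpha) ** (x + y)
        s_eps = lam * gamma**x * (theta * (theta + x + 2) + 1) / (
            alpha * (theta + 1) ** 3 + theta * ((theta + 1) ** 2 + x))
        total += s_eps * (1.0 - cdf)
    return total

def cls_objective(xs, alpha, theta):
    """Q_T(alpha, theta): sum of squared one-step prediction errors."""
    means = {y: cond_mean(y, alpha, theta) for y in set(xs[:-1])}   # cache by state
    return sum((xs[t] - means[xs[t - 1]]) ** 2 for t in range(1, len(xs)))

toy = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0]     # illustrative data, not from the paper
print(cls_objective(toy, 0.5, 2.0))
print(cls_objective(toy, 0.75, 1.0))
```

    Caching the conditional mean per distinct lagged value keeps the evaluation cheap, since count series typically visit only a few states.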

    In this part, numerical simulations of the previously described procedures for estimating the unknown parameters $(\alpha,\theta)$ of the PL-MINAR process $(X_t)$ were carried out, based on its observed realization $X_1,\ldots,X_T$. For this purpose, the innovations $(\varepsilon_t)$ are generated first; their PMF, given by Eq (28) (see Appendix), is implemented using the R-package "discreteRV", authored by Buja et al. [32]. Thereafter, using Eqs (2) and (13), the simulated values of the series $(\alpha \diamond X_t)$ and $(X_t)$, respectively, are generated. The procedure for generating the PL-MINAR process $(X_t)$ can be presented by the following pseudo-code:

    Step 1. For the innovation series $(\varepsilon_t)$, by using Eq (14), define the appropriate survival function $S_\varepsilon(x;\alpha,\theta)=P\{\varepsilon_t \geq x\}$, as well as its PMF $p_\varepsilon(x;\alpha,\theta)=S_\varepsilon(x;\alpha,\theta)-S_\varepsilon(x+1;\alpha,\theta)$, where $x=0,1,2,\ldots$

    Step 2. For $t=0,1,\ldots,T$ generate the values of $\varepsilon_t$, whose PMF is $p_\varepsilon(x;\alpha,\theta)$.

    Step 3. Put the initial values $X_0=\alpha \diamond X_0=\varepsilon_0$.

    Step 4. For $t=1,\ldots,T$, by using Eq (2), generate the values of $\alpha \diamond X_{t-1}$, and then the values of $X_t=\min\{\alpha \diamond X_{t-1},\varepsilon_t\}$.
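    Steps 1-4 can be sketched in plain Python as follows. Instead of the R-package mentioned above, this sketch samples the innovations by inverse-transform sampling from the survival function of Eq (14); that substitution, the function names, the seed, and the series length are assumptions made for illustration. Since the stationary marginal distribution is PL(θ), the sample mean and the share of zeros should be close to (θ+2)/(θ(θ+1)) and θ²(θ+2)/(θ+1)³, respectively.

```python
import random

def innovation_survival(x, alpha, theta):
    """Step 1: S_eps(x) = P{eps_t >= x}, Eq (14)."""
    lam = alpha * theta + alpha + theta
    gamma = lam / (alpha * (theta + 1) ** 2)
    num = theta * (theta + x + 2) + 1
    den = alpha * (theta + 1) ** 3 + theta * ((theta + 1) ** 2 + x)
    return lam * gamma**x * num / den

def draw_innovation(alpha, theta, rng):
    """Step 2: inverse-transform draw, eps = x iff S(x + 1) <= U < S(x)."""
    u, x = rng.random(), 0
    while innovation_survival(x + 1, alpha, theta) > u:
        x += 1
    return x

def thin(alpha, x, rng):
    """Modified negative binomial thinning, Eq (2): x + 1 geometric counts."""
    total = 0
    for _ in range(x + 1):
        while rng.random() >= 1.0 / (1.0 + alpha):
            total += 1
    return total

def simulate_pl_minar(T, alpha, theta, seed=0):
    rng = random.Random(seed)
    x = draw_innovation(alpha, theta, rng)              # Step 3: X_0 = eps_0
    path = []
    for _ in range(T):                                  # Step 4
        x = min(thin(alpha, x, rng), draw_innovation(alpha, theta, rng))
        path.append(x)
    return path

alpha, theta = 0.5, 2.0
path = simulate_pl_minar(30_000, alpha, theta, seed=1)
print(sum(path) / len(path), (theta + 2) / (theta * (theta + 1)))
print(path.count(0) / len(path), theta**2 * (theta + 2) / (theta + 1) ** 3)
```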

    Note that the values of $\alpha \diamond X_t$ can alternatively be generated according to the pseudo-code previously stated in Remark 1. As an illustration, Figure 2 shows the empirical frequency distributions of all three series, for different values of the parameters $(\alpha,\theta)$, obtained from realizations of length $T=1500$. It can be noted, as explained in the previous theoretical part, that the empirical distributions of the series $(\alpha \diamond X_t)$ and $(X_t)$ are similar to each other. In contrast, the empirical distribution of the innovations $(\varepsilon_t)$ is significantly more flexible and more sensitive to changes in the parameters, which is fully consistent with the procedure for obtaining these distributions, as stated in Theorem 1. Furthermore, by applying the described procedure, simulations of the PL-MINAR series $(X_t)$ of different lengths $T \in \{500,1500,3000\}$ were conducted. Some of these lengths were chosen in accordance with the size of the real-world data that are analyzed in the next section. The previously presented estimation methods can be applied to each of the obtained realizations $\{X_1,\ldots,X_T\}$. To this end, for each of the mentioned lengths, we generated 1000 independent simulations of the series $(X_t)$, and the results of applying the three previously described estimation procedures are shown in Table 1.

    Figure 2.  Empirical frequency distributions of three PL-MINAR series of length $T=1500$, generated using Monte Carlo simulations for different values of the parameters $\alpha,\theta$.
    Table 1.  Summary statistics, estimation errors, and AN testing of parameter estimates of the PL-MINAR process (true parameters: $\alpha=0.5$, $\theta=2$).

                         MM                              CML                             CLS
    Sample   Statistic   $\hat{\alpha}$  $\hat{\theta}$  $\hat{\alpha}$  $\hat{\theta}$  $\hat{\alpha}$  $\hat{\theta}$
    T=500    Min.        0.1073          1.550           0.3600          1.360           0.3292          1.497
             Mean        0.5078          2.068           0.5121          2.035           0.4719          2.032
             Max.        0.8404          2.817           0.6713          2.620           0.6126          2.574
             SD          0.1007          0.1747          0.0705          0.3111          0.0519          0.1956
             MSEE        0.0102          0.0351          0.0712          0.0970          0.0588          0.0457
             AD          1.015*          0.6818          0.5395          0.7330          0.4749          0.5116
             (p-value)   (0.0113)        (0.0746)        (0.1585)        (0.0525)        (0.2355)        (0.1950)
    T=1500   Min.        0.1530          1.790           0.3810          1.436           0.3918          1.535
             Mean        0.5065          2.069           0.4924          2.038           0.5030          2.013
             Max.        0.8303          2.415           0.6397          2.597           0.5911          2.540
             SD          0.0977          0.0991          0.0617          0.2915          0.0314          0.1917
             MSEE        9.92E-3         0.0145          0.0615          0.0882          0.0315          0.0379
             AD          0.5357          0.3575          0.5089          0.6041          0.3380          0.2794
             (p-value)   (0.1694)        (0.4538)        (0.1940)        (0.1136)        (0.5033)        (0.6393)
    T=3000   Min.        0.2299          1.868           0.3747          1.778           0.4415          1.594
             Mean        0.5036          2.070           0.4954          2.006           0.4998          2.050
             Max.        0.8032          2.294           0.6200          2.549           0.5946          2.482
             SD          0.0990          0.0689          0.0584          0.2753          0.0271          0.1880
             MSEE        9.65E-3         0.0100          0.0586          0.0751          0.0269          0.0375
             AD          0.5506          0.2974          0.2089          0.6328          0.3202          0.1844
             (p-value)   (0.1558)        (0.5895)        (0.8599)        (0.0964)        (0.5319)        (0.9064)
    * 0.01 < p < 0.05


    More precisely, Table 1 contains, for all considered estimation methods and different lengths of simulated series, summary statistics of the computed estimates, i.e., their minimums (Min.), mean values (Mean), maximums (Max.), and standard deviations (SD). In addition, the mean squared errors of estimation (MSEE) were computed and the Anderson-Darling normality test was performed. Note that in the case of the MM estimator, the parameter $\theta$ is estimated directly from Eq (21). Afterwards, using the procedure specified in Subsection 3.1, the MM estimates of the parameter $\alpha$ are derived. To that end, the R-procedure "nlminb" for box-constrained optimization was used (see Gay [33] for more details), but the same results are obtained using the R-procedure "optimize" for unconstrained optimization. On the other hand, the CML and CLS estimates were obtained by numerical optimization of the objective functions $\ell(\alpha,\theta)$ and $Q_T(\alpha,\theta)$, given by Eqs (23) and (24), respectively. Thereby, realizations from the uniform distribution were taken as the initial values of the parameters, taking care to satisfy the conditions of Theorem 1.

    From the results given in Table 1, we can notice that the MM estimates of $\alpha$ have a large range, which is a consequence of their calculation using the reciprocal estimated autocorrelation. In contrast, the MM estimates of $\theta$ are significantly more efficient because they are obtained directly from Eq (21). More efficient estimates are also obtained with the other two estimation methods, although the estimates of $\theta$ obtained by the CML method have higher standard deviations. By using similar considerations as Aleksić & Ristić [14], it should be noted that CML estimates require significantly more computing time, especially for large samples. Nevertheless, the SD and MSEE values decrease noticeably with increasing series length, that is, there is a pronounced convergence of all estimation methods. The asymptotic normality (AN) test results are also shown in Table 1, where the Anderson-Darling normality test was conducted. The test statistic, denoted AD, along with the corresponding p-values, was calculated using the procedure from the R-package "nortest", authored by Gross [34]. According to the values obtained, the AN property is confirmed for almost all calculated estimates; the only exception is $\hat{\alpha}_{MM}$ at $T=500$, for which $0.01<p<0.05$. This is also confirmed in Figure 3. Moreover, the AN property is most pronounced in the case of the CLS estimates, so this, along with the previously mentioned facts, indicates that these estimates may be the most adequate for practical application, which is carried out below.

    Figure 3.  Histograms of the empirical distributions of the parameter estimates of the PL-MINAR process. (The length is $T=1500$ and the parameters are the same as in Table 1.)

    This section discusses practical applications of the PL-MINAR process in modeling real-world data. To compare the efficiency of the proposed model with well-known existing models, we used two datasets. The first, labeled Series A, represents daily Bitcoin transaction volumes, calculated as the sum of all transaction outputs belonging to the blocks mined on a given day. The data cover the earliest period of transactions in this cryptocurrency, from January 9, 2009 to April 3, 2010, and are taken from the Bitcoin dashboard "Coinmetrics" [35]. This gives a time series of length T=450 and, similarly to [36,37], the last 10%, that is, 45 observations, are used to test the predictive accuracy of the PL-MINAR process. The second dataset (Series B), also considered in [36], represents the number of transactions in shares of the Empire District Electric (EDE) company, measured at 5-minute intervals between 9:45 a.m. and 4:00 p.m. from January 3 to February 18, 2005. The data were collected from the official website of the Wall Street Journal, giving a count time series of length T=2925, of which the last 225 observations were held out to check the validity of the model forecasts. The dynamics of the two series, along with their autocorrelation functions (ACFs), are shown in Figure 4. Both ACFs decay somewhat more slowly than is typical for standard INAR(1) models, especially for Series A. In accordance with the previous theoretical results, this suggests that the dynamics of both series can be modeled by the PL-MINAR process. This is further supported by Table 2, which presents the descriptive statistics of both series: each has a wide range, positive skewness, and pronounced overdispersion (the variance is about 5.1 times the mean for Series A and about 2.8 times the mean for Series B). These features are more pronounced for Series A, where the mode and median of 0 also point to substantial zero inflation. Furthermore, the Augmented Dickey-Fuller (ADF) test was conducted using the R package "aTSA" [38]; in both cases the unit-root null hypothesis was rejected (p < 0.01) in favor of the alternative that the series is stationary.
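The overdispersion and zero-inflation diagnostics mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `count_summary` and the toy series are hypothetical, and the skewness is the plain moment-based coefficient, which may differ slightly from the estimator behind Table 2.

```python
import numpy as np

def count_summary(x):
    """Basic descriptive statistics for a series of non-negative counts."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    var = x.var(ddof=1)                      # sample variance
    centered = x - mean
    m2 = np.mean(centered ** 2)
    skew = np.mean(centered ** 3) / m2 ** 1.5
    return {
        "mean": mean,
        "variance": var,
        "dispersion_index": var / mean,      # > 1 indicates overdispersion
        "zero_fraction": np.mean(x == 0),    # share of zero counts
        "skewness": skew,
    }

# Hypothetical toy series, NOT the Bitcoin or EDE data.
toy = [0, 0, 1, 0, 3, 0, 0, 7, 2, 0, 0, 1]
stats = count_summary(toy)
```

Applied to the reported moments, the same dispersion index is 7.183/1.416 ≈ 5.07 for Series A and 9.209/3.314 ≈ 2.78 for Series B.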

    Figure 4.  The dynamics of the real-world data series (plots above) and their appropriate ACFs (plots below).
    Table 2.  Summary statistics and stationarity testing of the observed data.
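    | Statistics | Series A | Series B |
    |---|---|---|
    | Minimum | 0 | 0 |
    | Maximum | 16 | 25 |
    | Mode | 0 | 1 |
    | Median | 0 | 3 |
    | Mean | 1.416 | 3.314 |
    | Variance | 7.183 | 9.209 |
    | St. deviation | 2.680 | 3.035 |
    | Skewness | 2.875 | 1.658 |
    | Kurtosis | 9.250 | 4.799 |
    | ADF test statistic | -13.46 | -22.60 |
    | ADF p-value | <0.01 | <0.01 |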


    Next, applying the previously described procedures, the parameters were estimated for both series under the assumption that their dynamics can be described by the PL-MINAR model. To verify its effectiveness, the proposed model is compared with the Poisson-Lindley INAR(1) model $X_t=\alpha\circ X_{t-1}+\varepsilon_t$, $t=1,2,\dots,T$, introduced by Mohammadpour et al. [21], where $\alpha\in(0,1)$ and $\circ$ is the binomial thinning operator (see, e.g., [39] for more details). Thus, both stochastic models, the PL-MINAR and the PL-INAR(1) process, have a Poisson-Lindley marginal distribution. In line with the results of the previous section, MM estimates were used as initial values for both models, while CLS estimates were used to fit the models and compare them with the empirical data. The estimated values for both series, together with the values of the objective function $Q_T(\alpha,\theta)$, are presented in Table 3. Then, using the estimated parameter values, 500 independent Monte Carlo simulations of both models were performed, and the agreement between the distributions of the real-world and fitted data was assessed using the MSEE statistic, Akaike's information criterion (AIC), and the Bayesian information criterion (BIC).
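To make the CLS step more concrete, the following sketch evaluates a conditional least squares objective for the PL-MINAR model. It is an illustration under stated assumptions rather than the authors' implementation: the conditional mean is computed numerically as a truncated sum of conditional survival probabilities, using the innovation survival function derived in the proof of Theorem 1; the objective is normalized as a mean of squared residuals, which is an assumption; and all function names are hypothetical.

```python
import numpy as np

def innovation_sf(x, alpha, theta):
    """S_eps(x) = P{eps_t >= x}, in the form used in the proof of Theorem 1."""
    x = np.asarray(x, dtype=float)
    lam = alpha * theta + alpha + theta
    m = (theta + 1.0) ** 2
    gamma = lam / (alpha * m)
    return lam * gamma ** x * (theta * x + m) / (theta * x + m * lam)

def thinning_sf(y, alpha, x_max):
    """P{thinned value >= x | X_{t-1} = y} for x = 0..x_max (NB(alpha/(1+alpha), y+1) tail)."""
    pmf = np.empty(x_max + 1)
    pmf[0] = (1.0 + alpha) ** (-(y + 1))
    for j in range(x_max):
        pmf[j + 1] = pmf[j] * (y + j + 1) / (j + 1) * alpha / (1.0 + alpha)
    return 1.0 - np.concatenate(([0.0], np.cumsum(pmf)[:-1]))

def cond_mean(y, alpha, theta, x_max=400):
    """E[X_t | X_{t-1} = y] as sum_{x>=1} P{X_t >= x | X_{t-1} = y}, truncated at x_max."""
    xs = np.arange(1, x_max + 1)
    return float(np.sum(innovation_sf(xs, alpha, theta) * thinning_sf(y, alpha, x_max)[1:]))

def cls_objective(series, alpha, theta):
    """Mean squared one-step residual (the normalization is an assumption)."""
    series = [int(v) for v in series]
    means = {y: cond_mean(y, alpha, theta) for y in set(series[:-1])}
    resid = np.array([series[t] - means[series[t - 1]] for t in range(1, len(series))])
    return float(np.mean(resid ** 2))
```

In practice such an objective would be passed to a numerical optimizer, started from the MM estimates and restricted to the parameter region where the innovation distribution is well defined.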

    Table 3.  Estimated parameter values of the Poisson-Lindley INAR(1) and MINAR processes, along with the corresponding estimation errors and predictive test statistics.
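    | Parameters/Statistics | Series A: PL-INAR(1) | Series A: PL-MINAR | Series B: PL-INAR(1) | Series B: PL-MINAR |
    |---|---|---|---|---|
    | α | 0.3337 | 1.5885 | 0.3211 | 1.2517 |
    | θ | 1.5587 | 1.5034 | 0.4988 | 0.4990 |
    | QT | 3.2697 | 5.6894 | 8.2238 | 13.235 |
    | MSEE | 0.0580 | 0.0523 | 0.0485 | 0.0484 |
    | AIC | 173.06 | 171.99 | 136.80 | 136.55 |
    | BIC | 174.26 | 173.29 | 136.98 | 136.73 |

    DM statistic: 1.7142* (Series A), 1.6899* (Series B)
    (p-value): (0.0464) (Series A), (0.0471) (Series B)
    *0.01 < p < 0.05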


    Table 3 shows that the estimated values of the parameter θ, as well as the fit statistics and estimation errors, are close for both estimation methods and both stochastic models. This is expected, because in both cases the series $(X_t)$ is assumed to have a PL marginal distribution. However, the MSEE, AIC, and BIC values are lower when the PL-MINAR model is applied. The forecast accuracy of both models is also analyzed below, where, as already mentioned, forecast horizons of h=50 (Series A) and h=100 (Series B) are taken. The testing procedure was performed using a one-tailed Diebold–Mariano test [40], with the null hypothesis that both models have the same predictive accuracy and the alternative that the PL-MINAR model is more accurate. The test statistic, labeled DM, and the corresponding p-values were calculated with the R package "forecast" [41] and are presented in the lower part of Table 3. In both cases the null hypothesis is rejected at the 5% significance level (0.01<p<0.05), so the PL-MINAR process has significantly better predictive accuracy. This is also illustrated in Figure 5, which shows the empirical and fitted frequencies for both time series and both models, as well as the observed and forecasted data.
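The logic of the one-tailed test can be sketched as follows. This is a simplified textbook version, assuming one-step-ahead forecasts, squared-error loss, and a standard normal reference distribution; the function name and inputs are hypothetical. The R routine cited above may apply small-sample corrections or a different reference distribution, so its p-values need not coincide with this sketch.

```python
import math
import numpy as np

def dm_test_one_sided(errors_ref, errors_new):
    """One-step-ahead Diebold-Mariano statistic with squared-error loss.

    H0: both forecasts are equally accurate.
    H1: the forecasts behind `errors_new` are more accurate.
    """
    e1 = np.asarray(errors_ref, dtype=float)
    e2 = np.asarray(errors_new, dtype=float)
    d = e1 ** 2 - e2 ** 2                      # loss differential
    n = d.size
    dm = d.mean() / math.sqrt(d.var(ddof=0) / n)
    p_value = 0.5 * math.erfc(dm / math.sqrt(2.0))   # P{Z > dm}, Z ~ N(0, 1)
    return dm, p_value

# Hypothetical forecast errors, for illustration only.
ref = [2.0, -1.0, 3.0, -2.0, 1.5, -2.5]
new = [1.0, -1.0, 1.0, -1.0, 1.0, -2.0]
dm, p = dm_test_one_sided(ref, new)
```

A large positive DM value with a small p-value favors the second model, which is the direction of the alternative hypothesis used in the text.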

    Figure 5.  Frequency distributions (plots above) and prediction features (plots below) of the observed and fitted data.

    A novel count time series minification model, named the PL-MINAR process, is presented here. The key properties of the proposed model, as well as different procedures for estimating its parameters, are discussed in detail. Through Monte Carlo simulations, the consistency of these estimators was examined, and at the end, a practical application of the PL-MINAR process in fitting real-world data was given. To verify its effectiveness, the proposed model is applied to fit the distributions and predictions of two real-world time series, representing the number of bitcoin and stock transactions. Also, the PL-MINAR model was compared with the INAR(1) model, which is based on the same Poisson-Lindley (PL) marginal distribution. According to the obtained results, i.e., fitting errors and DM statistics, it can be seen that the PL-MINAR model has the same or even better fitting and forecasting capabilities within the observed data. These results can be motivating for further applications of the PL-MINAR process in some other fields, such as epidemiology, climate change, or cell counts. On the other hand, investigations of the minification count process, where some other and different count-based distributions are used as marginals, may also be a direction of some future research.

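    Proof of Proposition 1. According to the previously obtained conditional PGF of the RV $\alpha\diamond X$, given by Eq (4), the PGF of this RV can be expressed as follows:

    $$\Psi_{\alpha\diamond X}(u;\alpha,\theta)=(1+\alpha-\alpha u)^{-1}\,\Psi_X\!\left((1+\alpha-\alpha u)^{-1}\right).$$

    Now, for the PL-distributed RV $X$, i.e., using its PGF given by Eq (8), after some computation one obtains:

    $$\begin{aligned}
    \Psi_{\alpha\diamond X}(u;\alpha,\theta)
    &=(1+\alpha-\alpha u)^{-1}\left.\frac{\theta^2(\theta-s+2)}{(\theta+1)(\theta-s+1)^2}\right|_{s=(1+\alpha-\alpha u)^{-1}}\\
    &=\frac{1}{1+\alpha-\alpha u}\cdot\frac{\theta^2\,\dfrac{(\theta+2)(1+\alpha-\alpha u)-1}{1+\alpha-\alpha u}}{(\theta+1)\left(\dfrac{(\theta+1)(1+\alpha-\alpha u)-1}{1+\alpha-\alpha u}\right)^2}\\
    &=\frac{\theta^2\left(\theta-\alpha(\theta+2)(u-1)+1\right)}{(\theta+1)\left(\theta-\alpha(\theta+1)(u-1)\right)^2},\qquad |u|\le 1.
    \end{aligned}\tag{A.1}$$

    Using the partial fraction decomposition method, Eq (A.1) can be expressed in an equivalent way as:

    $$\begin{aligned}
    \Psi_{\alpha\diamond X}(u;\alpha,\theta)
    &=\frac{\theta^2}{(\theta+1)^2\left(\theta-\alpha(\theta+1)(u-1)\right)^2}+\frac{\theta^3+2\theta^2}{(\theta+1)^2\left(\theta-\alpha(\theta+1)(u-1)\right)}\\
    &=\frac{1}{(\theta+1)^2(1+\beta-\beta u)^2}+\frac{\theta(\theta+2)}{(\theta+1)^2(1+\beta-\beta u)},
    \end{aligned}\tag{A.2}$$

    where $\beta=\alpha(\theta+1)/\theta$ and $|u|<(1+\beta)/\beta$. It is obvious that the functions

    $$\Psi_1(u;\beta):=\frac{1}{(1+\beta-\beta u)^2},\qquad \Psi_2(u;\beta):=\frac{1}{1+\beta-\beta u}$$

    are, respectively, the PGFs of the negative binomial distribution $NB(\beta/(1+\beta),2)$ and the geometric distribution $Geom(\beta/(1+\beta))$. Finally, using the PMF of the RV $\alpha\diamond X$, given by Eq (11), and after some computation, the survival function of $\alpha\diamond X$ is obtained as follows:

    $$\begin{aligned}
    S_{\alpha\diamond X}(x;\alpha,\theta)&:=P\{\alpha\diamond X\ge x\}
    =\frac{1}{(\theta+1)^2}\sum_{k=x}^{\infty}\frac{(k+1)\beta^k}{(1+\beta)^{k+2}}+\frac{\theta(\theta+2)}{(\theta+1)^2}\sum_{k=x}^{\infty}\frac{\beta^k}{(1+\beta)^{k+1}}\\
    &=\frac{1}{(\theta+1)^2}\cdot\frac{(1+\beta+x)\beta^x}{(1+\beta)^{x+1}}+\frac{\theta(\theta+2)}{(\theta+1)^2}\cdot\frac{\beta^x}{(1+\beta)^x}
    =\frac{\beta^x\left((1+\beta)(\theta+1)^2+x\right)}{(\theta+1)^2(1+\beta)^{x+1}}\\
    &=\frac{\alpha^x(\theta+1)^{x-2}}{(\alpha\theta+\alpha+\theta)^{x+1}}\left[\theta\left(x+(\theta+1)^2\right)+\alpha(\theta+1)^3\right].
    \end{aligned}$$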

    Therefore, this statement is fully proved.
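The closed-form survival function can be checked numerically against the mixture representation in Eq (A.2). The sketch below is only a sanity check under the assumption that the thinned variable is the stated mixture of a negative binomial and a geometric law; the function names and the truncation point `k_max` are hypothetical choices.

```python
import numpy as np

def thinned_sf_closed(x, alpha, theta):
    """Closed-form survival function of the thinned variable (last line of the derivation)."""
    lam = alpha * theta + alpha + theta
    return (alpha ** x * (theta + 1.0) ** (x - 2) / lam ** (x + 1)
            * (theta * (x + (theta + 1.0) ** 2) + alpha * (theta + 1.0) ** 3))

def thinned_sf_series(x, alpha, theta, k_max=2000):
    """Same quantity as a truncated sum over the NB/geometric mixture PMF."""
    beta = alpha * (theta + 1.0) / theta
    q = beta / (1.0 + beta)
    k = np.arange(x, k_max)
    nb2 = (k + 1) * q ** k / (1.0 + beta) ** 2     # NB(beta/(1+beta), 2) PMF
    geo = q ** k / (1.0 + beta)                    # Geom(beta/(1+beta)) PMF
    w = 1.0 / (theta + 1.0) ** 2
    return w * nb2.sum() + theta * (theta + 2.0) * w * geo.sum()

# Illustrative parameter values (not estimates from the paper).
gaps = [abs(thinned_sf_closed(x, 1.5, 1.5) - thinned_sf_series(x, 1.5, 1.5))
        for x in range(12)]
```

For these values the two evaluations agree to within floating-point error, and the closed form equals 1 at x = 0, as a survival function should.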

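    Proof of Theorem 1. Let us observe an arbitrary non-negative integer $x=0,1,2,\dots$. Based on the definition of the PL-MINAR process, as well as the independence of the RVs $(X_t)$ and $(\varepsilon_t)$, it follows that:

    $$P\{X_t\ge x\}=P\{\min(\alpha\diamond X_{t-1},\varepsilon_t)\ge x\}=P\{\alpha\diamond X_{t-1}\ge x\}\,P\{\varepsilon_t\ge x\}.$$

    From here, the survival function for RVs $(\varepsilon_t)$ is directly obtained:

    $$S_\varepsilon(x):=P\{\varepsilon_t\ge x\}=\frac{P\{X_t\ge x\}}{P\{\alpha\diamond X_{t-1}\ge x\}},$$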

    and using the previously obtained survival functions, given by Eqs (10) and (12), as well as after some computation, Eq (14) follows. At the same time, note that this expression is well defined if and only if the following two conditions are met:

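    I condition: The exponential part of Eq (14) must converge to 0 and 1 when $x\to+\infty$ and $x\to 0$, respectively. Since $\alpha,\theta>0$, this is fulfilled if and only if:

    $$\frac{\alpha\theta+\alpha+\theta}{\alpha(\theta+1)^2}<1\iff\alpha(\theta+1)^2-\alpha(\theta+1)-\theta>0\iff\alpha\theta(\theta+1)>\theta\iff\alpha>\frac{1}{\theta+1}.\tag{A.3}$$

    II condition: The survival function $S_\varepsilon(x)=P\{\varepsilon_t\ge x\}$ is monotonically non-increasing, i.e., for arbitrary $x=0,1,2,\dots$ the PMF of the RVs $(\varepsilon_t)$ must satisfy the condition:

    $$p_\varepsilon(x;\alpha,\theta):=S_\varepsilon(x)-S_\varepsilon(x+1)\ge 0.\tag{A.4}$$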

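    Applying Eq (14) yields the following inequality:

    $$\frac{(\alpha\theta+\alpha+\theta)^{x+1}}{\alpha^x(\theta+1)^{2x}}\left(\frac{\theta(\theta+x+2)+1}{\alpha(\theta+1)^3+\theta\left((\theta+1)^2+x\right)}-\frac{(\alpha\theta+\alpha+\theta)\left(\theta(\theta+x+3)+1\right)}{\alpha(\theta+1)^2\left(\alpha(\theta+1)^3+\theta\left((\theta+1)^2+x+1\right)\right)}\right)\ge 0,$$

    and after some computation, it can be expressed in the following equivalent way:

    $$a(\alpha,\theta)\,x^2+b(\alpha,\theta)\,x+c(\alpha,\theta)\ge 0,\tag{A.5}$$

    where:

    $$\begin{aligned}
    a(\alpha,\theta)&:=\theta^3\left(\alpha(\theta+1)-1\right),\\
    b(\alpha,\theta)&:=\theta^2\left(\alpha(\theta+1)-1\right)\left(\alpha(\theta+1)^3+\theta^3+3\theta^2+4\theta+1\right),\\
    c(\alpha,\theta)&:=\theta^2(\theta+1)^2\left(\alpha^2(\theta+2)(\theta+1)^2+\alpha(\theta+2)(\theta^2-1)-\theta(\theta+3)-1\right).
    \end{aligned}$$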

    Note that from the first condition, that is, when Eq (A.3) holds, the inequalities $a(\alpha,\theta)>0$ and $b(\alpha,\theta)>0$ obviously follow. Therefore, the second condition is fulfilled if and only if it holds:

    $$\tilde c(\alpha,\theta):=\frac{c(\alpha,\theta)}{\theta^2(\theta+1)^2}=(\theta+2)(\theta+1)^2\alpha^2+(\theta+2)(\theta^2-1)\alpha-\theta(\theta+3)-1\ge 0.\tag{A.6}$$

    To that end, notice that $\tilde c(\alpha,\theta)$ is a convex quadratic function with respect to $\alpha>0$, with leading and constant coefficients:

    $$a_1(\theta):=(\theta+2)(\theta+1)^2>0,\qquad c_1(\theta):=-(\theta^2+3\theta+1)<0,$$

    and discriminant:

    $$D(\theta)=(\theta+1)^3(\theta+2)(\theta^2+3\theta+6)>0.$$

    Therefore, the equation $\tilde c(\alpha,\theta)=0$ has two real solutions $\alpha_1=\alpha_1(\theta)$ and $\alpha_2=\alpha_2(\theta)$, with different signs because, according to Vieta's formulas, $\alpha_1\alpha_2=c_1(\theta)/a_1(\theta)<0$. This implies that the inequality (A.6) holds if and only if $\alpha>0$ is greater than or equal to the larger solution of the equation $\tilde c(\alpha,\theta)=0$ (see Figure 6(a)). In that way, the second condition is equivalent to the inequality:

    $$\alpha\ge\frac{(1-\theta)(\theta+1)(\theta+2)+\sqrt{(\theta+1)^3(\theta+2)(\theta^2+3\theta+6)}}{2(\theta+2)(\theta+1)^2},$$

    which yields, after some rearrangement, Eq (15). Finally, let us notice that, for arbitrary θ>0, the following expressions hold:

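    $$\frac12\left(\frac{1-\theta}{1+\theta}+\sqrt{\frac{\theta^2+3\theta+6}{\theta^2+3\theta+2}}\right)>\frac12\left(\frac{1-\theta}{1+\theta}+1\right)=\frac{1}{\theta+1}.$$

    Figure 6.  (a) Dependence of parameters α and θ given by the quadratic function $\tilde c(\alpha,\theta)$. (b) Conditions for a well-defined distribution of innovations $(\varepsilon_t)$ depending on the parameters α and θ.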

    This means that the second condition, given by Eq (15), implies the first condition, given by Eq (A.3), as can also be seen in Figure 6(b).
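As an illustrative check, and assuming the PL-MINAR estimates from Table 3 are plugged in directly, the lower bound on α given by the second condition can be evaluated as follows (values rounded to three decimals):

```latex
\begin{aligned}
\text{Series A } (\hat\theta=1.5034):\quad
&\tfrac12\left(\tfrac{1-\hat\theta}{1+\hat\theta}+\sqrt{\tfrac{\hat\theta^2+3\hat\theta+6}{\hat\theta^2+3\hat\theta+2}}\right)
\approx\tfrac12\left(-0.201+1.207\right)\approx 0.503<1.5885=\hat\alpha,\\
\text{Series B } (\hat\theta=0.4990):\quad
&\tfrac12\left(\tfrac{1-\hat\theta}{1+\hat\theta}+\sqrt{\tfrac{\hat\theta^2+3\hat\theta+6}{\hat\theta^2+3\hat\theta+2}}\right)
\approx\tfrac12\left(0.334+1.438\right)\approx 0.886<1.2517=\hat\alpha.
\end{aligned}
```

In both cases the bound also exceeds $1/(\hat\theta+1)$ (about 0.399 and 0.667, respectively), in line with the implication stated above, so both fitted parameter pairs lie in the region where the innovation distribution is well defined.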

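    Finally, using Eqs (14) and (A.4), for the PGF of the RVs $(\varepsilon_t)$ one obtains:

    $$\begin{aligned}
    \Psi_\varepsilon(u;\alpha,\theta)
    &=\sum_{x=0}^{\infty}u^x p_\varepsilon(x;\alpha,\theta)=\sum_{x=0}^{\infty}u^x\left[S_\varepsilon(x;\alpha,\theta)-S_\varepsilon(x+1;\alpha,\theta)\right]\\
    &=\sum_{x=0}^{\infty}u^xS_\varepsilon(x;\alpha,\theta)-\sum_{x=1}^{\infty}u^{x-1}S_\varepsilon(x;\alpha,\theta)=1+\sum_{x=1}^{\infty}\left(u^x-u^{x-1}\right)S_\varepsilon(x;\alpha,\theta)\\
    &=1+\left(1-u^{-1}\right)\sum_{x=1}^{\infty}u^x\,\frac{(\alpha\theta+\alpha+\theta)^{x+1}}{\alpha^x(\theta+1)^{2x}}\cdot\frac{\theta x+(\theta+1)^2}{\theta x+(\theta+1)^2(\alpha\theta+\alpha+\theta)}\\
    &=1+\left(1-\frac1u\right)(\alpha\theta+\alpha+\theta)\sum_{x=1}^{\infty}\left(\frac{u(\alpha\theta+\alpha+\theta)}{\alpha(\theta+1)^2}\right)^x\left(1-\frac{(\theta+1)^2(\alpha\theta+\alpha+\theta-1)}{\theta x+(\theta+1)^2(\alpha\theta+\alpha+\theta)}\right)\\
    &=1+\frac{u-1}{u}\,\lambda\left(\sum_{x=1}^{\infty}(\gamma u)^x-(\theta+1)^2(\lambda-1)\sum_{x=1}^{\infty}\frac{(\gamma u)^x}{\theta x+\lambda(\theta+1)^2}\right)\\
    &=1+\frac{u-1}{u}\,\lambda\sum_{x=1}^{\infty}(\gamma u)^x-\frac{u-1}{u}\cdot\frac{\lambda(\theta+1)^2}{\theta}(\lambda-1)\sum_{x=1}^{\infty}\frac{(\gamma u)^x}{x+\lambda(\theta+1)^2/\theta}\\
    &=1+\frac{u-1}{u}\cdot\frac{\lambda\gamma u}{1-\gamma u}-\frac{u-1}{u}\,\delta(\lambda-1)\sum_{x=1}^{\infty}\frac{(\gamma u)^x}{x+\delta}\\
    &=1+\frac{\lambda\gamma(u-1)}{1-\gamma u}-\gamma\delta(\lambda-1)(u-1)\sum_{x=0}^{\infty}\frac{(\gamma u)^x}{x+\delta+1},
    \end{aligned}$$

    where, as the substitutions above show, $\lambda=\alpha\theta+\alpha+\theta$, $\gamma=\lambda/(\alpha(\theta+1)^2)$ and $\delta=\lambda(\theta+1)^2/\theta$. The last expression is obviously the same as in Eq (16), whereby according to Eq (A.3) it follows:

    $$0<\gamma=\frac{\alpha\theta+\alpha+\theta}{\alpha(\theta+1)^2}<1.$$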

    Thus, under the condition |u|<1/γ, the PGF Ψε(u;α,θ) is properly defined, which completely proves this theorem.
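The two conditions and the resulting PGF can be examined numerically. The sketch below is an illustration under the assumption that the innovation survival function has the form appearing in the derivation above; it computes the innovation PMF as differences of that function and compares a direct evaluation of the PGF with the final closed-form expression. Function names, truncation limits, and the test point `u` are hypothetical.

```python
import numpy as np

def innovation_sf(x, alpha, theta):
    """S_eps(x) = P{eps_t >= x}, as used in the derivation of the PGF."""
    x = np.asarray(x, dtype=float)
    lam = alpha * theta + alpha + theta
    m = (theta + 1.0) ** 2
    gamma = lam / (alpha * m)
    return lam * gamma ** x * (theta * x + m) / (theta * x + m * lam)

def innovation_pmf(x_max, alpha, theta):
    """p_eps(x) = S_eps(x) - S_eps(x+1) for x = 0..x_max, as in Eq (A.4)."""
    s = innovation_sf(np.arange(x_max + 2), alpha, theta)
    return s[:-1] - s[1:]

def innovation_pgf_closed(u, alpha, theta, terms=2000):
    """Last line of the PGF derivation, with the infinite series truncated."""
    lam = alpha * theta + alpha + theta
    m = (theta + 1.0) ** 2
    gamma = lam / (alpha * m)
    delta = lam * m / theta
    x = np.arange(terms)
    series = np.sum((gamma * u) ** x / (x + delta + 1.0))
    return (1.0 + lam * gamma * (u - 1.0) / (1.0 - gamma * u)
            - gamma * delta * (lam - 1.0) * (u - 1.0) * series)

# PL-MINAR estimates for Series A from Table 3.
alpha, theta = 1.5885, 1.5034
pmf = innovation_pmf(600, alpha, theta)
u = 0.7
direct = float(np.sum(u ** np.arange(pmf.size) * pmf))
closed = innovation_pgf_closed(u, alpha, theta)
```

With parameters that violate the second condition, for example α = 0.45 and θ = 1.5, the first entry of `innovation_pmf` becomes negative, which shows why the first condition alone is not sufficient.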

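    Proof of Theorem 2. For arbitrary non-negative integers $x,y$, the transition probabilities can be expressed in terms of conditional survival functions as follows:

    $$P\{X_t=x\mid X_{t-1}=y\}=P\{X_t\ge x\mid X_{t-1}=y\}-P\{X_t\ge x+1\mid X_{t-1}=y\}.$$

    Using the independence of the RVs $\alpha\diamond X_{t-1}$ and $\varepsilon_t$, as well as the fact that the RV $\alpha\diamond X_{t-1}$, given $X_{t-1}=y$, has a negative binomial distribution $NB(\alpha/(1+\alpha),y+1)$, given by Eq (3), it follows that:

    $$\begin{aligned}
    P\{X_t\ge x\mid X_{t-1}=y\}&=P\{\min(\alpha\diamond X_{t-1},\varepsilon_t)\ge x\mid X_{t-1}=y\}=P\{\alpha\diamond X_{t-1}\ge x\mid X_{t-1}=y\}\,P\{\varepsilon_t\ge x\}\\
    &=S_\varepsilon(x;\alpha,\theta)\sum_{k=x}^{\infty}P\{\alpha\diamond X_{t-1}=k\mid X_{t-1}=y\}=S_\varepsilon(x;\alpha,\theta)\sum_{k=x}^{\infty}\binom{k+y}{k}\frac{\alpha^k}{(1+\alpha)^{k+y+1}}.
    \end{aligned}$$

    According to this, one obtains:

    $$\begin{aligned}
    P\{X_t=x\mid X_{t-1}=y\}&=S_\varepsilon(x;\alpha,\theta)\sum_{k=x}^{\infty}\binom{k+y}{k}\frac{\alpha^k}{(1+\alpha)^{k+y+1}}-S_\varepsilon(x+1;\alpha,\theta)\sum_{k=x+1}^{\infty}\binom{k+y}{k}\frac{\alpha^k}{(1+\alpha)^{k+y+1}}\\
    &=S_\varepsilon(x;\alpha,\theta)\binom{x+y}{x}\frac{\alpha^x}{(1+\alpha)^{x+y+1}}+\left[S_\varepsilon(x;\alpha,\theta)-S_\varepsilon(x+1;\alpha,\theta)\right]\sum_{k=x+1}^{\infty}\binom{k+y}{k}\frac{\alpha^k}{(1+\alpha)^{k+y+1}},
    \end{aligned}$$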

    and the last expression is obviously the same as in Eq (17).
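The two equivalent forms of the transition probabilities can be compared numerically. The following sketch is illustrative only: it assumes the innovation survival function from the proof of Theorem 1, truncates the support at a hypothetical `x_max`, and uses a hypothetical function name.

```python
import numpy as np

def transition_row(y, alpha, theta, x_max=300):
    """Return P{X_t = x | X_{t-1} = y}, x = 0..x_max, in both forms derived above."""
    xs = np.arange(x_max + 2, dtype=float)
    lam = alpha * theta + alpha + theta
    m = (theta + 1.0) ** 2
    gamma = lam / (alpha * m)
    s_eps = lam * gamma ** xs * (theta * xs + m) / (theta * xs + m * lam)

    # Negative binomial PMF of the thinned variable given X_{t-1} = y.
    nb = np.empty(x_max + 2)
    nb[0] = (1.0 + alpha) ** (-(y + 1))
    for k in range(x_max + 1):
        nb[k + 1] = nb[k] * (y + k + 1) / (k + 1) * alpha / (1.0 + alpha)
    nb_tail = 1.0 - np.concatenate(([0.0], np.cumsum(nb)[:-1]))   # P{thinned >= x}

    # Difference of conditional survival functions.
    diff_form = s_eps[:-1] * nb_tail[:-1] - s_eps[1:] * nb_tail[1:]
    # Second form: S_eps(x) * nb(x) + p_eps(x) * P{thinned >= x + 1}.
    p_eps = s_eps[:-1] - s_eps[1:]
    split_form = s_eps[:-1] * nb[:-1] + p_eps * nb_tail[1:]
    return diff_form, split_form

# Illustrative parameter values (not estimates from the paper).
diff_form, split_form = transition_row(3, 1.5, 1.5)
```

Both arrays coincide up to rounding, and each sums to one up to the truncation error, as expected for a row of a transition matrix.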

    To prove the strict stationarity of our model, we use a procedure similar to that of Aleksić & Ristić [14], that is, we show that for all $h\in\mathbb{N}_0$ and $n\in\mathbb{N}$ it holds that:

    $$P\{X_1=x_1,\dots,X_n=x_n\}=P\{X_{1+h}=x_1,\dots,X_{n+h}=x_n\}.$$

    According to the proven Markov property, this equality is equivalent to the following:

    $$P\{X_1=x_1\}\prod_{j=2}^{n}P\{X_j=x_j\mid X_{j-1}=x_{j-1}\}=P\{X_{1+h}=x_1\}\prod_{j=2}^{n}P\{X_{j+h}=x_j\mid X_{j+h-1}=x_{j-1}\},$$

    where $(X_t)$ is marginally stationary, and the proven Eq (17) ensures that the above conditional probabilities do not depend on the time $t\in\mathbb{Z}$. Thus, it follows that the left and right sides of the above equation are really equal, i.e., $(X_t)$ is indeed a strictly stationary stochastic process.
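A simulated path gives an informal check of the stationary behavior. The sketch below is a possible simulation scheme, not the authors' code: it assumes the minification recursion with modified negative binomial thinning described in the proofs, draws innovations by inverse transform from the survival function of Theorem 1, and starts the chain from a Poisson-Lindley draw obtained as a Poisson mixture over a Lindley variable. The function name, seed, and truncation `x_max` are hypothetical.

```python
import numpy as np

def simulate_pl_minar(n, alpha, theta, seed=0, x_max=2000):
    """Simulate X_t = min(thinned X_{t-1}, eps_t) for t = 1..n."""
    rng = np.random.default_rng(seed)
    lam = alpha * theta + alpha + theta
    m = (theta + 1.0) ** 2
    gamma = lam / (alpha * m)
    xs = np.arange(x_max + 1, dtype=float)
    s_eps = lam * gamma ** xs * (theta * xs + m) / (theta * xs + m * lam)
    s_eps[0] = 1.0                                   # guard against rounding

    # Inverse transform: eps = max{x : S_eps(x) > U}, so P{eps >= x} = S_eps(x).
    u = rng.random(n)
    eps = np.maximum(np.searchsorted(-s_eps, -u, side="left") - 1, 0)

    # X_0 ~ PL(theta): Poisson with a Lindley-distributed rate, where the
    # Lindley law is a mixture of Exp(theta) and Gamma(2, theta).
    shape = 1.0 if rng.random() < theta / (theta + 1.0) else 2.0
    prev = rng.poisson(rng.gamma(shape, 1.0 / theta))

    x = np.empty(n, dtype=np.int64)
    for t in range(n):
        # Thinned value given X_{t-1} = prev: NB with prev + 1 "successes".
        thinned = rng.negative_binomial(prev + 1, 1.0 / (1.0 + alpha))
        x[t] = min(thinned, eps[t])
        prev = x[t]
    return x

# Illustrative parameter values (not estimates from the paper).
theta, alpha = 1.5, 1.5
path = simulate_pl_minar(20000, alpha, theta, seed=1)
pl_mean = (theta + 2.0) / (theta * (theta + 1.0))    # Poisson-Lindley mean
```

For a long path the sample mean should be close to `pl_mean`, and the lag-one sample autocorrelation should be positive, consistent with the stationary PL marginal and the dependence structure of the model.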

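    Now, by using Eq (17), the conditional PGF of the RVs $(X_t)$ can be expressed as follows:

    $$\begin{aligned}
    E\left[u^{X_t}\mid X_{t-1}=y\right]
    &=\sum_{x=0}^{\infty}u^xP\{X_t=x\mid X_{t-1}=y\}=\sum_{x=0}^{\infty}u^x\left(P\{X_t\ge x\mid X_{t-1}=y\}-P\{X_t\ge x+1\mid X_{t-1}=y\}\right)\\
    &=\sum_{x=0}^{\infty}u^xS_\varepsilon(x;\alpha,\theta)\sum_{j=x}^{\infty}\binom{y+j}{j}\frac{\alpha^j}{(1+\alpha)^{y+j+1}}-\sum_{x=0}^{\infty}u^xS_\varepsilon(x+1;\alpha,\theta)\sum_{j=x+1}^{\infty}\binom{y+j}{j}\frac{\alpha^j}{(1+\alpha)^{y+j+1}}\\
    &=1+\left(1-\frac1u\right)\sum_{x=1}^{\infty}u^xS_\varepsilon(x;\alpha,\theta)\,S(x,y;\alpha).
    \end{aligned}\tag{A.7}$$

    In the last summation, we refer to:

    $$S(x,y;\alpha):=\sum_{j=x}^{\infty}\binom{y+j}{j}\frac{\alpha^j}{(1+\alpha)^{y+j+1}}$$

    as the survival function of the RV with negative binomial distribution $NB(\alpha/(1+\alpha),y+1)$, which implies that $S(x,y;\alpha)$ is monotonically decreasing in $x$ and $S(x,y;\alpha)\le 1$ for all $x,y=0,1,2,\dots$. In addition, similarly as in the proof of Theorem 1, we have:

    $$\begin{aligned}
    T_\varepsilon(j;\alpha,\theta)&:=\sum_{k=1}^{j}u^kS_\varepsilon(k;\alpha,\theta)=\sum_{k=1}^{j}u^k\,\frac{(\alpha\theta+\alpha+\theta)^{k+1}}{\alpha^k(\theta+1)^{2k}}\cdot\frac{\theta k+(\theta+1)^2}{\theta k+(\theta+1)^2(\alpha\theta+\alpha+\theta)}\\
    &=(\alpha\theta+\alpha+\theta)\sum_{k=1}^{j}\left(\frac{u(\alpha\theta+\alpha+\theta)}{\alpha(\theta+1)^2}\right)^k\left(1-\frac{(\theta+1)^2(\alpha\theta+\alpha+\theta-1)}{\theta k+(\theta+1)^2(\alpha\theta+\alpha+\theta)}\right)\\
    &=(\alpha\theta+\alpha+\theta)\left(\sum_{k=1}^{j}(\gamma u)^k-(\theta+1)^2(\alpha\theta+\alpha+\theta-1)\sum_{k=1}^{j}\frac{(\gamma u)^k}{\theta k+(\theta+1)^2(\alpha\theta+\alpha+\theta)}\right)\\
    &=\lambda\sum_{k=1}^{j}(\gamma u)^k-\delta(\lambda-1)\sum_{k=1}^{j}\frac{(\gamma u)^k}{k+\delta}\\
    &=\lambda\gamma u\,\frac{1-(\gamma u)^j}{1-\gamma u}-\delta(\lambda-1)\gamma u\left(L_{\delta+1}(\gamma u)-(\gamma u)^jL_{\delta+j+1}(\gamma u)\right),
    \end{aligned}$$

    by which, under the condition $|\gamma u|<1$, it follows:

    $$\lim_{j\to+\infty}T_\varepsilon(j;\alpha,\theta)=\frac{\lambda\gamma u}{1-\gamma u}-\delta(\lambda-1)\gamma u\,L_{\delta+1}(\gamma u)<+\infty.$$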

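    Therefore, according to Abel's convergence criterion (see, e.g., [42]), the double sum in Eq (A.7) is convergent, and the order of sums can be changed, which yields:

    $$\begin{aligned}
    E\left[u^{X_t}\mid X_{t-1}=y\right]
    &=1+\left(1-\frac1u\right)\sum_{j=1}^{\infty}T_\varepsilon(j;\alpha,\theta)\binom{y+j}{j}\frac{\alpha^j}{(1+\alpha)^{y+j+1}}\\
    &=1+\left(1-\frac1u\right)\Bigg[\frac{\lambda\gamma u}{1-\gamma u}\sum_{j=0}^{\infty}\left(1-(\gamma u)^j\right)\binom{y+j}{j}\frac{\alpha^j}{(1+\alpha)^{y+j+1}}\\
    &\qquad-\delta(\lambda-1)\gamma u\sum_{j=0}^{\infty}\left(L_{\delta+1}(\gamma u)-(\gamma u)^jL_{\delta+j+1}(\gamma u)\right)\binom{y+j}{j}\frac{\alpha^j}{(1+\alpha)^{y+j+1}}\Bigg]\\
    &=1+\frac{\lambda\gamma(u-1)}{1-\gamma u}\left(1-(1+\alpha-\alpha\gamma u)^{-y-1}\right)\\
    &\qquad-\delta(\lambda-1)\gamma(u-1)\left(L_{\delta+1}(\gamma u)-\sum_{j=0}^{\infty}L_{\delta+j+1}(\gamma u)\binom{y+j}{j}\frac{(\alpha\gamma u)^j}{(1+\alpha)^{y+j+1}}\right).
    \end{aligned}$$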

    Thus, the obtained expression obviously represents Eq (18), which proves this theorem.

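    Proof of Theorem 3. First, we calculate the joint two-dimensional PGF of the RVs $X_t$ and $X_{t-1}$. Using the law of total expectation, the previously obtained PGF of the PL-distributed RVs, given by Eq (8), and the conditional PGF of the first order, given by Eq (18), one obtains:

    $$\begin{aligned}
    \Psi_X^{(2)}(u,v)&:=E\left[u^{X_t}v^{X_{t-1}}\right]=E\left[v^{X_{t-1}}E\left[u^{X_t}\mid X_{t-1}\right]\right]\\
    &=E\Bigg[v^{X_{t-1}}\Bigg[1+\frac{\lambda\gamma(u-1)}{1-\gamma u}\left(1-\frac{1}{(1+\alpha-\alpha\gamma u)^{X_{t-1}+1}}\right)\\
    &\qquad-(\lambda-1)\gamma\delta(u-1)\left(L_{\delta+1}(\gamma u)-\sum_{j=0}^{\infty}\binom{X_{t-1}+j}{j}\frac{(\alpha\gamma u)^jL_{\delta+j+1}(\gamma u)}{(1+\alpha)^{X_{t-1}+j+1}}\right)\Bigg]\Bigg]\\
    &=A(u;\lambda,\gamma,\delta)\,\Psi_X(v;\theta)-B(u;\alpha,\lambda,\gamma)\,K(u,v;\alpha,\gamma,\theta)+C(u;\alpha,\lambda,\gamma,\delta)\sum_{j=0}^{\infty}\eta_j(u;\alpha,\gamma,\delta)\,M_j(v;\alpha,\theta),
    \end{aligned}$$

    where:

    $$\begin{aligned}
    A(u;\lambda,\gamma,\delta)&=1+\frac{\lambda\gamma(u-1)}{1-\gamma u}-(\lambda-1)\gamma\delta(u-1)L_{\delta+1}(\gamma u),\\
    B(u;\alpha,\lambda,\gamma)&=\frac{\lambda\gamma(u-1)}{(1-\gamma u)(1+\alpha-\alpha\gamma u)},\\
    C(u;\alpha,\lambda,\gamma,\delta)&=\frac{(\lambda-1)\gamma\delta(u-1)}{1+\alpha},\\
    \Psi_X(v;\theta)&=E\left[v^{X_{t-1}}\right]=\frac{\theta^2(\theta-v+2)}{(\theta+1)(\theta-v+1)^2},\\
    K(u,v;\alpha,\gamma,\theta)&=E\left[\left(\frac{v}{1+\alpha-\alpha\gamma u}\right)^{X_{t-1}}\right]=\Psi_X\!\left(\frac{v}{1+\alpha-\alpha\gamma u}\right)=\left.\frac{\theta^2(\theta-w+2)}{(\theta+1)(\theta-w+1)^2}\right|_{w=\frac{v}{1+\alpha-\alpha\gamma u}}\\
    &=\frac{\theta^2(1+\alpha-\alpha\gamma u)\left[(\theta+2)(1+\alpha-\alpha\gamma u)-v\right]}{(\theta+1)\left[(\theta+1)(1+\alpha-\alpha\gamma u)-v\right]^2},\\
    \eta_j(u;\alpha,\gamma,\delta)&=\left(\frac{\alpha\gamma u}{1+\alpha}\right)^jL_{\delta+j+1}(\gamma u),\\
    M_j(v;\alpha,\theta)&=E\left[\binom{X_{t-1}+j}{j}\left(\frac{v}{1+\alpha}\right)^{X_{t-1}}\right]=\frac{1}{j!}E\left[(X_{t-1}+j)(X_{t-1}+j-1)\cdots(X_{t-1}+1)\left(\frac{v}{1+\alpha}\right)^{X_{t-1}}\right]\\
    &=\frac{1}{j!}\left.\frac{\partial^jE\left[z^{X_{t-1}+j}\right]}{\partial z^j}\right|_{z=\frac{v}{1+\alpha}}=\frac{1}{j!}\left.\frac{\partial^j}{\partial z^j}\left(z^j\Psi_X(z;\theta)\right)\right|_{z=\frac{v}{1+\alpha}}\\
    &=\left.\frac{\theta^2(\theta+1)^{j-2}\left[(\theta+1)(\theta+2)-z(\theta-j+1)\right]}{(\theta-z+1)^{j+2}}\right|_{z=\frac{v}{1+\alpha}}\\
    &=\theta^2(\theta+1)^{j-2}(1+\alpha)^{j+1}\,\frac{(\theta+1)(\theta+2)(1+\alpha)-v(\theta-j+1)}{\left[(\theta+1)(1+\alpha)-v\right]^{j+2}}.
    \end{aligned}$$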

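    By differentiating the function $\Psi_X^{(2)}(u,v)$ with respect to $u$ and $v$, we get:

    $$\begin{aligned}
    \frac{\partial^2\Psi_X^{(2)}(u,v)}{\partial u\,\partial v}
    &=\frac{\partial A(u;\lambda,\gamma,\delta)}{\partial u}\cdot\frac{\partial\Psi_X(v;\theta)}{\partial v}-\frac{\partial B(u;\alpha,\lambda,\gamma)}{\partial u}\cdot\frac{\partial K(u,v;\alpha,\gamma,\theta)}{\partial v}-B(u;\alpha,\lambda,\gamma)\,\frac{\partial^2K(u,v;\alpha,\gamma,\theta)}{\partial u\,\partial v}\\
    &\quad+\sum_{j=0}^{\infty}\left[\frac{\partial C(u;\alpha,\lambda,\gamma,\delta)}{\partial u}\,\eta_j(u;\alpha,\gamma,\delta)+C(u;\alpha,\lambda,\gamma,\delta)\,\frac{\partial\eta_j(u;\alpha,\gamma,\delta)}{\partial u}\right]\times\frac{\partial M_j(v;\alpha,\theta)}{\partial v}.
    \end{aligned}$$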

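    Thereafter, putting $u=v=1$ and after some computations, the mixed moment for the RVs $X_t$ and $X_{t-1}$ can be obtained as follows:

    $$\begin{aligned}
    E\left[X_tX_{t-1}\right]&=\left.\frac{\partial^2\Psi_X^{(2)}(u,v)}{\partial u\,\partial v}\right|_{u=v=1}\\
    &=\frac{(\theta+2)\gamma\left(\lambda-(\lambda-1)(1-\gamma)\delta L_{\delta+1}(\gamma)\right)}{\theta(\theta+1)(1-\gamma)}-\frac{\lambda\gamma\theta^2\left(\alpha(1-\gamma)(\theta+3)+\theta+2\right)}{(1-\gamma)(\theta+1)\left(\alpha(1-\gamma)(\theta+1)+\theta\right)^3}\\
    &\quad+\frac{\gamma\delta(\lambda-1)}{(\alpha\theta+\alpha+\theta)^3}\left(\frac{\theta}{\theta+1}\right)^2\times\sum_{j=0}^{\infty}(j+1)\left(\frac{\alpha\gamma(\theta+1)}{(1+\alpha)(\theta+1)-1}\right)^j\left((1+\alpha)(\theta+1)(\theta+3)-\theta+j-1\right)L_{j+\delta+1}(\gamma).
    \end{aligned}\tag{A.8}$$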

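    According to this and by applying Eq (9), for the first-order correlation of the series $(X_t)$ one obtains:

    $$\rho_X(1)=\mathrm{Corr}(X_t,X_{t-1})=\frac{E\left[X_tX_{t-1}\right]-E[X_t]\,E[X_{t-1}]}{\mathrm{Var}[X_t]}=\left(E\left[X_tX_{t-1}\right]-\frac{(\theta+2)^2}{\theta^2(\theta+1)^2}\right)\frac{\theta^2(\theta+1)^2}{\theta^3+4\theta^2+6\theta+2},$$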

    which after some rearrangement yields Eq (20).

    Vladica S. Stojanović: Methodology; Conceptualization, Writing – Original Draft; Visualization; Hassan S. Bakouch: Supervision; Writing – Original Draft; Formal Analysis; Radica Bojičić: Writing – Review & Editing; Validation; Visualization; Gadir Alomair: Resources; Writing – Review & Editing; Funding Acquisition; Shuhrah A. Alghamdi: Formal Analysis, Investigation. All authors have read and approved the final version of the manuscript for publication.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This work was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Grant No. 6010]

    The authors declare no conflict of interest.



    [1] L. Guan, X. Wang, A discrete-time dual risk model with dependence based on a poisson INAR(1) process, AIMS Math., 7 (2022), 20823–20837. http://dx.doi.org/10.3934/math.20221141 doi: 10.3934/math.20221141
    [2] R. Maya, C. Chesneau, A. Krishna, M. R. Irshad, Poisson extended exponential distribution with associated INAR(1) process and applications, Stats, 5 (2022), 755–772.
    [3] W. C. Khoo, S. H. Ong, B. Atanu, Coherent forecasting for a mixed integer-valued time series model, Mathematics, 10 (2022), Article No. 2961.
    [4] K. Yu, T. Tao, An Observation-Driven random parameter INAR(1) model based on the poisson thinning operator, Entropy, 25 (2023), Article No. 859. https://doi.org/10.3390/e25060859 doi: 10.3390/e25060859
    [5] V. S. Stojanović, H. S. Bakouch, E. Ljajko, N. Qarmalah, Zero-and-One integer-valued AR(1) time series with power series innovations and probability generating function estimation approach, Mathematics, 11 (2023), Article No. 1772. https://doi.org/10.3390/math11081772 doi: 10.3390/math11081772
    [6] J. Huang, F. Zhu, D. Deng, A mixed generalized poisson INAR model with applications, J. Stat. Comput. Sim., 93 (2023), 1851–1878.
    [7] V. S. Stojanović, H. S. Bakouch, Z. Gajtanović, F. E. Almuhayfith, K. Kuk, Integer-valued Split-BREAK process with a general family of innovations and application to accident count data modeling, Axioms, 13 (2024), Article No. 40. https://doi.org/10.3390/axioms13010040 doi: 10.3390/axioms13010040
    [8] Y. Kang, F. Zhu, D. Wang, S. Wang, A zero-modified geometric INAR(1) model for analyzing count time series with multiple features, Can. J. Stat., (2024). https://doi.org/10.1002/cjs.11774
    [9] L. V. Tavares, An exponential Markovian stationary process, J. Appl. Probab., 17 (1980), 1117–1120. https://doi.org/10.2307/3213224 doi: 10.2307/3213224
    [10] L. V. Tavares, A Non-Gaussian Markovian model to simulate hydrologic processes, J. Hydrol., 46 (1980), 281–287. https://doi.org/10.1016/0022-1694(80)90081-5 doi: 10.1016/0022-1694(80)90081-5
    [11] C. H. Sim, Simulation of Weibull and Gamma autoregressive stationary process, Commun. Stat. Simul. Comput., 15 (1986), 1141–1146. https://www.tandfonline.com/doi/abs/10.1080/03610918608812565 doi: 10.1080/03610918608812565
    [12] P. A. Lewis, E. D. McKenzie, Minification processes and their transformations, J. Appl. Probab., 28 (1991), 45–57. https://doi.org/10.2307/3214739 doi: 10.2307/3214739
    [13] V. A. Kalamkar, Minification processes with discrete marginals, J. Appl. Probab., 32 (1995), 692–706. https://doi.org/10.2307/3215123 doi: 10.2307/3215123
    [14] M. Aleksić, M. Ristić, A geometric minification integer-valued autoregressive model, Appl. Math. Model., 90 (2021), 265–280. https://doi.org/10.1016/j.apm.2020.08.047 doi: 10.1016/j.apm.2020.08.047
    [15] M. Stojanović, An EM algorithm for estimation of the parameters of the geometric minification INAR model, J. Stat. Comput. Simul., 92 (2022). https://doi.org/10.1080/00949655.2022.2053125
    [16] L. Qian, F. Zhu, A new minification integer-valued autoregressive process driven by explanatory variables, Aust. N. Z. J. Stat., 64 (2022), 478–494.
    [17] Q. Zhang, D. Wang, X. Fan, A negative binomial thinning-based bivariate INAR(1) process, Stat. Neerl., 74 (2020), 517–537.
    [18] M. S. Ristić, H. S. Bakouch, A. S. Nastić, A new geometric first-order integer-valued autoregressive (NGINAR(1)) process, J. Stat. Plann. Infer., 139 (2009), 2218–2226. https://doi.org/10.1016/j.jspi.2008.10.007. doi: 10.1016/j.jspi.2008.10.007
    [19] J. Pitman, Probability, New York, NY: Springer New York. p. 372, 1993. https://doi.org/10.1007/978-1-4612-4374-8. ISBN 978-0-387-94594-1
    [20] M. Sankaran, The discrete Poisson-Lindley distribution, Biometrics, 26 (1970), 145–149.
    [21] M. Mohammadpour, H. S. Bakouch, M. Shirozhan, Poisson–Lindley INAR(1) Model with Applications, Braz. J. Probab. Stat., 32 (2018), 262–280. https://doi.org/10.1214/16-BJPS341. doi: 10.1214/16-BJPS341
    [22] Z. Mohammadi, Z. Sajjadnia, H. S. Bakouch, M. Sharafi, Zero-and-One inflated Poisson–Lindley INAR(1) process for modelling count time series with extra zeros and ones, J. Stat. Comput. Sim., 92 (2022), 2018–2040.
    [23] W. A. H. Al-Nuaami, A. A. Heydari, H. J. Khamnei, The Poisson–Lindley distribution: Some characteristics, with its application to SPC, Mathematics, 11 (2023), Article No. 2428. https://doi.org/10.3390/math11112428 doi: 10.3390/math11112428
    [24] H. S. Bakouch, F. Gharari, K. Karakaya, Y. Akdoğan, Fractional Lindley distribution generated by time scale theory, with application to discrete-time lifetime data, Math. Popul. Stud., 31 (2024), 116–146. https://doi.org/10.1080/08898480.2024.2301865 doi: 10.1080/08898480.2024.2301865
    [25] M. Ghitany, D. Al-Mutairi, Estimation methods for the discrete Poisson–Lindley distribution, J. Stat. Comput. Simul., 79 (2009), 1–9.
    [26] M. G. Scotto, C. H. Weiß, T. A. Möller, S. Gouveia, The Max-INAR(1) model for count processes, Test, 27 (2018), 850–870. https://doi.org/10.1007/s11749-017-0573-z doi: 10.1007/s11749-017-0573-z
    [27] L. Qian, G. Li, A Class of Max-INAR (1) processes with explanatory variables, J. Stat. Comput. Simul., 92 (2022), 1898–1919.
    [28] M. G. Scotto, S. Gouveia, On the extremes of the Max-INAR (1) process for time series of counts, Commun. Stat. Theory M., 52 (2023), 1136–1154.
    [29] V. L. Martin, A. R. Tremayne, R. C. Jung, Efficient method of moments estimators for integer time series models, J. Time Series Anal., 35 (2014), 491–516.
    [30] Y. Cui, Q. Zheng, Conditional maximum likelihood estimation for a class of observation-driven time series models for count data, Stat. Probab. Lett., 123 (2017), 193–201. https://doi.org/10.1016/j.spl.2016.11.002. doi: 10.1016/j.spl.2016.11.002
    [31] R. Azrak, G. Mélard, Asymptotic properties of conditional Least-Squares estimators for array time series, Stat. Inference Stoch. Process, 24 (2021), 525–547. https://doi.org/10.1007/s11203-021-09242-8 doi: 10.1007/s11203-021-09242-8
    [32] A. Buja, E. Hare, H. Hofmann, Create and manipulate discrete random variables, R package version 1.2.2, (2015). https://CRAN.R-project.org/package=discreteRV (accessed on 20 February 2024).
    [33] D. M. Gay, Usage Summary for Selected Optimization Routines, Computing Science, Technical Report 153, AT & T Bell Laboratories, Murray Hill, 1990. (accessed on 25 February 2024).
    [34] L. Gross, Tests for Normality, R package Version: 1.0.4, (2013). http://CRAN.R-project.org/package=nortest (accessed on 25 February 2024).
    [35] COINMETRICS, https://coinmetrics.io/
    [36] A. Aknouche, B. S. Almohaimeed, S. Dimitrakopoulos, Forecasting transaction counts with integer-valued GARCH models, Stud. Nonlinear Dyn. E., 26 (2021), 529–539. https://doi.org/10.1515/snde-2020-0095 doi: 10.1515/snde-2020-0095
    [37] C. H. Weiß, F. Zhu, Conditional-Mean multiplicative operator models for count time series, Comput. Stat. Data Anal., 191 (2024), Article No. 107885.
    [38] D. Qiu, Alternative Time Series Analysis, R package Version: 3.1.2.1. (2015). Available from: https://rdocumentation.org/packages/aTSA/versions/3.1.2.1. (accessed on 29 February 2024)
    [39] O. Kella, A. Löpker, On Binomial Thinning and Mixing, Indag. Math., 5 (2023), 1121–1145.
    [40] F. Diebold, R. Mariano, Comparing Predictive Accuracy, J. Bus. Econ. Stat., 13 (1995), 253–263.
    [41] R. Hyndman, Forecasting Functions for Time Series and Linear Models, R Package Version 7.1. (2016). Available from: http://CRAN.R-project.org/package=forecast (accessed on 3 March 2024).
    [42] T. M. Apostol, Mathematical Analysis (2nd ed.), Addison-Wesley, 1974. ISBN 978-0-201-00288-1
  • This article has been cited by:

    1. Predrag M. Popović, Hassan S. Bakouch, Miroslav M. Ristić, A non-linear integer-valued autoregressive model with zero-inflated data series, 2024, 0266-4763, 1, 10.1080/02664763.2024.2419495
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)