Research article Special Issues

Improved multi-label classifiers for predicting protein subcellular localization

  • Received: 24 September 2023 Revised: 13 November 2023 Accepted: 22 November 2023 Published: 11 December 2023
  • Protein functions are closely related to their subcellular locations. At present, the prediction of protein subcellular locations is one of the most important problems in protein science. The evident defects of traditional methods make it urgent to design methods with high efficiency and low costs. To date, lots of computational methods have been proposed. However, this problem is far from being completely solved. Recently, some multi-label classifiers have been proposed to identify subcellular locations of human, animal, Gram-negative bacterial and eukaryotic proteins. These classifiers adopted the protein features derived from gene ontology information. Although they provided good performance, they can be further improved by adopting more powerful machine learning algorithms. In this study, four improved multi-label classifiers were set up for identification of subcellular locations of the above four protein types. The random k-labelsets (RAKEL) algorithm was used to tackle proteins with multiple locations, and random forest was used as the basic prediction engine. All classifiers were tested by jackknife test, indicating their high performance. Comparisons with previous classifiers further confirmed the superiority of the proposed classifiers.

    Citation: Lei Chen, Ruyun Qu, Xintong Liu. Improved multi-label classifiers for predicting protein subcellular localization[J]. Mathematical Biosciences and Engineering, 2024, 21(1): 214-236. doi: 10.3934/mbe.2024010




    Infectious diseases caused by viral infections, such as influenza, AIDS, hepatitis B/C, and the current COVID-19 pandemic, have long been a major threat to both public health and the economy. Although the underlying infection mechanisms in hosts are very complex, mathematical modeling has been an effective tool for understanding infection and for guiding control. One of the simplest viral infection models,

    \[
    \begin{cases}
    \dfrac{dx}{dt} = \lambda - \mu x - \beta x v, \\[4pt]
    \dfrac{dy}{dt} = \beta x v - \delta y, \\[4pt]
    \dfrac{dv}{dt} = k y - c v,
    \end{cases} \tag{1.1}
    \]

    was proposed and investigated by Nowak et al. [1,2,3]. Here $x(t)$, $y(t)$, and $v(t)$ are the densities of uninfected target cells, infected target cells, and free viruses at time $t$, respectively. We refer the reader to the citations for the biological meanings of the positive parameters. Model (1.1) has been modified by many researchers to better understand the interaction between viruses and host cells and to evaluate the efficacy of associated therapies. Examples include studies of latent infection [4], the eclipse stage [5], immune response [6,7,8], cellular reservoirs [9,10,11], treatment [9], the effects of delay [10,12], co-infection [6,13], and the effect of drug abuse on HIV dynamics [14].

    It was pointed out in [15,16] that, for some infectious diseases induced by viruses, the infected cells may contain defective viruses, that is, these infected cells carry defective proviruses that do not produce any offspring viruses. To model this phenomenon, Nowak and May [17] divided the infected cells into three classes: longer-lived latently infected cells ($y_1$), actively infected cells ($y_2$) that produce large quantities of free viruses in a short time, and defectively infected cells ($y_3$) that contain mutated virus genomes and cannot produce new virions. They proposed the following viral infection model,

    \[
    \begin{cases}
    \dfrac{dx}{dt} = \lambda - \mu x - \beta x v, \\[4pt]
    \dfrac{dy_1}{dt} = p_1 \beta x v - (\delta_1 + \gamma) y_1, \\[4pt]
    \dfrac{dy_2}{dt} = p_2 \beta x v + \gamma y_1 - \delta_2 y_2, \\[4pt]
    \dfrac{dy_3}{dt} = p_3 \beta x v - \delta_3 y_3, \\[4pt]
    \dfrac{dv}{dt} = k_1 y_1 + k_2 y_2 - c v,
    \end{cases} \tag{1.2}
    \]

    where $p_i$ ($i=1, 2, 3$) denotes the probability that, upon infection, a cell becomes an infected cell of type $y_i$, with $\sum_{i=1}^{3} p_i = 1$; $\delta_i$ ($i=1, 2, 3$) is the death rate of the associated infected cells; $\gamma$ is the transfer rate of latently infected cells to actively infected ones; and $k_1$ and $k_2$ are the numbers of free viruses produced by a latently infected cell and an actively infected cell, respectively. Model (1.2) can also be used to describe low steady-state viral loads [18].

    Model (1.2) includes some previously studied viral infection models. For example, Korobeinikov [19] investigated the global stability of the case where $p_2 = p_3 = k_1 = 0$, that is, after being infected, susceptible cells must undergo a latent stage before producing viruses and there are no defectively infected cells. The case where, after being infected, susceptible cells become either latent or active, and only actively infected cells can produce viruses, that is, $p_3 = k_1 = 0$ and $p_1 = 1-\alpha$, $p_2 = \alpha$ with $\alpha \in (0,1)$, was studied in [20,21]. When $p_3 = \gamma = 0$ and $p_1 = 1-\alpha$, $p_2 = \alpha$ with $\alpha \in (0,1)$, the corresponding model is the same as that with treatment in [21], where $y_1$ and $y_2$ are the populations of infected cells under different drug effects. However, to the best of our knowledge, the dynamical behavior of model (1.2) with $p_1 p_2 p_3 \gamma \neq 0$ is not completely understood.

    In the analysis of viral dynamic models, the stability of equilibria plays a very important role in understanding the mechanism of virus infection and the outcome of treatment. To name a few, [19] dealt with some basic virus dynamics models, [9,20] considered models with eclipse stages of infected cells, [21,22] investigated models with nonlinear incidences, [23,24] modeled Zika virus, and [25,26] included the immune response. One of the most powerful approaches to determining the global stability of equilibria of differential equations is Lyapunov's direct method. The key to applying this method is to construct an appropriate Lyapunov function. The constructed function must be positive definite, and its derivative along solutions of the system must be negative definite or negative semi-definite. These two requirements are interrelated. In practice, for a given positive definite function it is often difficult to fulfill the second requirement, or complicated to verify it. Many techniques and methods have been developed for studying the stability of dynamic models in applied disciplines. We refer to [27] for the basic framework. In [28] we found a new type of function for constructing Lyapunov functions, while in [29] we provided a new way to do so with commonly used Volterra-type functions and quadratic functions. Moreover, to verify the negative (semi-)definiteness of the derivative of a class of Lyapunov functions along solutions of the system, a graph-theoretic approach and an algebraic approach were developed by Guo et al. [30] and Li et al. [31], respectively. Recently, for the stability of disease models with immigration of infected hosts, McCluskey [32] gave a general result on algebraic conditions under which the Lyapunov function for a model without immigration of infected hosts extends to a valid Lyapunov function for the corresponding system with immigration of infected hosts. In spite of the rich literature, the global stability of equilibria of many dynamical models still cannot be proved theoretically.

    The purpose of this paper is to provide a new approach to the global stability of equilibria of system (1.2). Our approach has three features. Firstly, there is a correlation between the two Lyapunov functions used to prove global stability of the infection-free equilibrium and of the infection equilibrium. Secondly, the specific form of the Lyapunov functions used is universal. Lastly, compared with existing approaches, our way of verifying the negative definiteness or negative semi-definiteness of the derivatives of the Lyapunov functions along solutions is relatively simple. We mention that viral dynamical models share many features with the classical compartmental models of infectious diseases (see, for example, [33] on SIR and SIRS models with nonlinear incidences, [34] for stage-structured epidemic models, [35,36] for models with asymptomatic and symptomatic infectious individuals, and [37] for some cholera models) and even with models of vector-borne diseases (to name a few, see [38,39,40] for vector-borne disease models with two transmission routes for the host population, [41] for a model on vector-borne relapsing diseases, [42] for a vector-borne disease model with immigration of humans and vectors, and references therein). We expect that the approach here can be applied to study the global stability of equilibria of such models.

    Note that $y_3$ is decoupled from the other equations in model (1.2). As a result, we only need to focus on

    \[
    \begin{cases}
    \dfrac{dx}{dt} = \lambda - \mu x - \beta x v, \\[4pt]
    \dfrac{dy_1}{dt} = p_1 \beta x v - c_1 y_1, \\[4pt]
    \dfrac{dy_2}{dt} = p_2 \beta x v + \gamma y_1 - c_2 y_2, \\[4pt]
    \dfrac{dv}{dt} = k_1 y_1 + k_2 y_2 - c v,
    \end{cases} \tag{1.3}
    \]

    where $c_1 = \delta_1 + \gamma$, $c_2 = \delta_2$, and $0 < p_1 + p_2 \le 1$. It is easy to see that every solution of (1.3) with a nonnegative initial condition exists globally and is also nonnegative. The rest of the paper is organized as follows. In the next section, we derive the expression of the basic reproduction number of viruses with the approach of the next generation matrix and determine the equilibria of (1.3). Section 3 is the main part of this paper, which is devoted to establishing a threshold dynamics for (1.3). The paper ends with a brief conclusion and discussion.

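    We first obtain the expression of the basic reproduction number (of viral particles) of model (1.3) by employing the method of the next generation matrix developed by van den Driessche and Watmough [43]. For this purpose, we denote $z = (y_1, y_2, v, x)^T$. Then model (1.3) can be rewritten as

    \[
    \frac{dz}{dt} = \mathcal{F}(z) - \mathcal{V}(z), \tag{2.1}
    \]

    where

    \[
    \mathcal{F}(z) = \begin{pmatrix} p_1 \beta x v \\ p_2 \beta x v \\ 0 \\ 0 \end{pmatrix}, \qquad
    \mathcal{V}(z) = \begin{pmatrix} c_1 y_1 \\ -\gamma y_1 + c_2 y_2 \\ -k_1 y_1 - k_2 y_2 + c v \\ -\lambda + \mu x + \beta x v \end{pmatrix}.
    \]

    Obviously, model (1.3) always has the infection-free equilibrium $P_0 = (x_0, 0, 0, 0)$, where $x_0 = \frac{\lambda}{\mu}$. Accordingly, system (2.1) has an equilibrium $\bar{P}_0 = (0, 0, 0, x_0)$ corresponding to $P_0$.

    The Jacobian matrices of $\mathcal{F}(z)$ and $\mathcal{V}(z)$ at the infection-free equilibrium $\bar{P}_0$ are

    \[
    D\mathcal{F}(\bar{P}_0) = \begin{pmatrix} F_{3\times 3} & 0 \\ 0 & 0 \end{pmatrix}
    \quad \text{and} \quad
    D\mathcal{V}(\bar{P}_0) = \begin{pmatrix} V_{3\times 3} & 0 \\ (0,\ 0,\ \beta x_0) & \mu \end{pmatrix},
    \]

    respectively, where

    \[
    F_{3\times 3} = \begin{pmatrix} 0 & 0 & p_1 \beta x_0 \\ 0 & 0 & p_2 \beta x_0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
    V_{3\times 3} = \begin{pmatrix} c_1 & 0 & 0 \\ -\gamma & c_2 & 0 \\ -k_1 & -k_2 & c \end{pmatrix}.
    \]

    Then the basic reproduction number, $R_0$, of model (1.3) is the spectral radius of the next generation matrix $F_{3\times 3} V_{3\times 3}^{-1}$, that is,

    \[
    R_0 = \frac{\beta x_0}{c} \left[ \frac{k_1 p_1}{c_1} + \frac{k_2}{c_2} \left( p_2 + \frac{p_1 \gamma}{c_1} \right) \right]. \tag{2.2}
    \]

    The three terms in $R_0$ correspond to the three ways in which viral particles are produced. Suppose a single viral particle is introduced into a population of size $x_0$ consisting entirely of uninfected target cells. During its lifespan, $\frac{1}{c}$, it infects $\frac{\beta x_0}{c}$ uninfected target cells. Among them, $\frac{p_1 \beta x_0}{c}$ become latently infected and $\frac{p_2 \beta x_0}{c}$ become actively infected. During their lifespan $\frac{1}{c_1}$, the latently infected cells produce $\frac{p_1 \beta x_0 k_1}{c c_1}$ viral particles and $\frac{\gamma p_1 \beta x_0}{c c_1}$ actively infected cells. Then, during the lifespan $\frac{1}{c_2}$, the total of $\frac{p_2 \beta x_0}{c} + \frac{\gamma p_1 \beta x_0}{c c_1}$ actively infected cells produce $\frac{k_2}{c_2} \left( \frac{p_2 \beta x_0}{c} + \frac{\gamma p_1 \beta x_0}{c c_1} \right)$ viral particles. Therefore, a total of $R_0$ viral particles are produced. This gives the biological interpretation of $R_0$ as the average number of secondary viral particles produced by introducing a typical viral particle into a population of uninfected cells.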
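    As a quick numerical cross-check of (2.2), the sketch below evaluates the closed-form expression and, separately, the nonzero eigenvalue of the next generation matrix. The parameter values and the function names are hypothetical and chosen only for illustration; they do not come from the paper. Because only the third column of the 3×3 matrix F is nonzero, F V^{-1} has rank one, so its spectral radius can be obtained by one forward substitution with the lower triangular matrix V.

```python
def r0_formula(lam, mu, beta, p1, p2, c1, c2, gamma, k1, k2, c):
    """Closed-form basic reproduction number, formula (2.2)."""
    x0 = lam / mu
    return beta * x0 / c * (k1 * p1 / c1 + (k2 / c2) * (p2 + p1 * gamma / c1))


def r0_next_generation(lam, mu, beta, p1, p2, c1, c2, gamma, k1, k2, c):
    """Spectral radius of F V^{-1} for the 3x3 blocks of the model.

    F = f e3^T with f = (p1*beta*x0, p2*beta*x0, 0)^T, so the only nonzero
    eigenvalue of F V^{-1} is the third component of w, where V w = f.
    V is lower triangular, so w follows by forward substitution.
    """
    x0 = lam / mu
    f1, f2 = p1 * beta * x0, p2 * beta * x0
    w1 = f1 / c1                      # row 1:  c1*w1 = f1
    w2 = (f2 + gamma * w1) / c2       # row 2: -gamma*w1 + c2*w2 = f2
    w3 = (k1 * w1 + k2 * w2) / c      # row 3: -k1*w1 - k2*w2 + c*w3 = 0
    return w3


# Hypothetical parameter set (illustration only).
PARAMS = dict(lam=10.0, mu=0.1, beta=0.01, p1=0.3, p2=0.6,
              c1=0.3, c2=0.5, gamma=0.1, k1=1.0, k2=5.0, c=3.0)

r0_a = r0_formula(**PARAMS)
r0_b = r0_next_generation(**PARAMS)
print(r0_a, r0_b)
```

    For this parameter set both routes give a value of 8/3 up to rounding, so the infection equilibrium exists.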

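    Next we find the equilibria of (1.3). An equilibrium satisfies

    \begin{align}
    \lambda - \mu x - \beta x v &= 0, \tag{2.3a} \\
    p_1 \beta x v - c_1 y_1 &= 0, \tag{2.3b} \\
    p_2 \beta x v + \gamma y_1 - c_2 y_2 &= 0, \tag{2.3c} \\
    k_1 y_1 + k_2 y_2 - c v &= 0. \tag{2.3d}
    \end{align}

    It follows from (2.3a) that

    \[
    x = \frac{\lambda}{\mu + \beta v}. \tag{2.4}
    \]

    Substituting it into (2.3b) gives

    \[
    y_1 = \frac{p_1 \beta \lambda v}{c_1 (\mu + \beta v)}. \tag{2.5}
    \]

    Now substituting (2.4) and (2.5) into (2.3c), we get

    \[
    y_2 = \frac{\beta \lambda v}{c_2 (\mu + \beta v)} \left( p_2 + \frac{\gamma p_1}{c_1} \right). \tag{2.6}
    \]

    Finally, substituting (2.4)–(2.6) into (2.3d) yields

    \[
    v \left\{ \frac{\beta \lambda}{\mu + \beta v} \left[ \frac{k_2 p_2}{c_2} + \frac{p_1}{c_1} \left( k_1 + \frac{k_2 \gamma}{c_2} \right) \right] - c \right\} = 0.
    \]

    Thus, for (2.3), we have

    \[
    v = 0 \quad \text{or} \quad v = \frac{\lambda}{c} \left[ \frac{k_2 p_2}{c_2} + \frac{p_1}{c_1} \left( k_1 + \frac{k_2 \gamma}{c_2} \right) \right] - \frac{\mu}{\beta} = \frac{\mu}{\beta} (R_0 - 1) \triangleq v^*.
    \]

    Obviously, $v = 0$ produces the infection-free equilibrium $P_0$. For $v^*$ to give a biologically relevant equilibrium, we require $v^* > 0$, or equivalently $R_0 > 1$. In summary, we have obtained the following result on the equilibria of (1.3).

    Theorem 1. Model (1.3) always has the infection-free equilibrium $P_0$. Furthermore, when $R_0 > 1$, there is also a unique infection equilibrium $P^* = (x^*, y_1^*, y_2^*, v^*)$, where $v^* = \frac{\mu}{\beta}(R_0 - 1)$ and

    \[
    x^* = \frac{\lambda}{\mu + \beta v^*}, \qquad
    y_1^* = \frac{p_1 \beta \lambda v^*}{c_1 (\mu + \beta v^*)}, \qquad
    y_2^* = \frac{\beta \lambda v^*}{c_2 (\mu + \beta v^*)} \left( p_2 + \frac{\gamma p_1}{c_1} \right).
    \]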
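    The formulas of Theorem 1 can be checked numerically by substituting the computed equilibrium back into the right-hand side of (1.3). The following is a minimal sketch under assumed, purely illustrative parameter values; the names infection_equilibrium and rhs are hypothetical helpers, not part of the paper.

```python
def infection_equilibrium(lam, mu, beta, p1, p2, c1, c2, gamma, k1, k2, c):
    """Return (x*, y1*, y2*, v*) from Theorem 1, or None when R0 <= 1."""
    x0 = lam / mu
    r0 = beta * x0 / c * (k1 * p1 / c1 + (k2 / c2) * (p2 + p1 * gamma / c1))
    if r0 <= 1:
        return None  # only the infection-free equilibrium exists
    v = mu / beta * (r0 - 1)
    denom = mu + beta * v
    x = lam / denom
    y1 = p1 * beta * lam * v / (c1 * denom)
    y2 = beta * lam * v / (c2 * denom) * (p2 + gamma * p1 / c1)
    return (x, y1, y2, v)


def rhs(state, lam, mu, beta, p1, p2, c1, c2, gamma, k1, k2, c):
    """Right-hand side of model (1.3) at state = (x, y1, y2, v)."""
    x, y1, y2, v = state
    infection = beta * x * v
    return (lam - mu * x - infection,
            p1 * infection - c1 * y1,
            p2 * infection + gamma * y1 - c2 * y2,
            k1 * y1 + k2 * y2 - c * v)


# Hypothetical parameter set (illustration only); it gives R0 > 1.
PARAMS = dict(lam=10.0, mu=0.1, beta=0.01, p1=0.3, p2=0.6,
              c1=0.3, c2=0.5, gamma=0.1, k1=1.0, k2=5.0, c=3.0)

p_star = infection_equilibrium(**PARAMS)
residuals = rhs(p_star, **PARAMS)
print(p_star)
print(residuals)
```

    All four residuals vanish up to rounding error, which is what (2.3a)–(2.3d) require.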

    In this section, by applying the approach of Lyapunov's direct method, we establish a threshold dynamics determined by R0 for (1.3).

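    We first show the boundedness of solutions of (1.3). Let $q \in \left(0, \min\left\{ \frac{c_1 - \gamma}{k_1}, \frac{c_2}{k_2} \right\}\right)$. A straightforward calculation yields

    \begin{align*}
    \frac{d(x + y_1 + y_2 + q v)}{dt}
    &= \lambda - \mu x - (c_1 - \gamma - q k_1) y_1 - (c_2 - q k_2) y_2 - q c v + (p_1 + p_2 - 1) \beta x v \\
    &\le \lambda - \rho (x + y_1 + y_2 + q v),
    \end{align*}

    where $\rho = \min\{\mu,\ c_1 - \gamma - q k_1,\ c_2 - q k_2,\ c\}$. It follows that

    \[
    x(t) + y_1(t) + y_2(t) + q v(t) \le \frac{\lambda}{\rho} + e^{-\rho t} \left[ x(0) + y_1(0) + y_2(0) + q v(0) - \frac{\lambda}{\rho} \right]
    \]

    and hence

    \[
    \limsup_{t \to \infty} \big( x(t) + y_1(t) + y_2(t) + q v(t) \big) \le \frac{\lambda}{\rho}.
    \]

    Similarly, it follows from $\frac{dx}{dt} \le \lambda - \mu x$ that

    \[
    x(t) \le x(0) e^{-\mu t} + \frac{\lambda}{\mu} \left( 1 - e^{-\mu t} \right)
    \quad \text{and} \quad
    \limsup_{t \to \infty} x(t) \le \frac{\lambda}{\mu}.
    \]

    The above discussion implies that solutions of (1.3) are bounded. Moreover, it is easy to see that the set

    \[
    \Omega = \left\{ (x, y_1, y_2, v) \in \mathbb{R}^4_+ : x \le \frac{\lambda}{\mu},\ x + y_1 + y_2 + q v \le \frac{\lambda}{\rho} \right\}
    \]

    is positively invariant and attracting for system (1.3).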

    The following result on the existence of solutions to a set of inequalities will be used to construct appropriate Lyapunov functions.

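    Lemma 2. For the parameters $\gamma$, $c$, $\beta$, and $p_i$, $k_i$, $c_i$ ($i = 1, 2$) of (1.3), if

    \[
    0 < \rho \le \frac{c c_2 c_1}{\beta [p_1 k_1 c_2 + k_2 (p_1 \gamma + p_2 c_1)]}, \tag{3.1}
    \]

    then the following system of inequalities on $m$, $n$, and $q$,

    \[
    \begin{cases}
    m p_1 + n p_2 - 1 \le 0, \\
    n \gamma + q k_1 - m c_1 \le 0, \\
    q k_2 - n c_2 \le 0, \\
    \beta \rho - q c \le 0,
    \end{cases} \tag{3.2}
    \]

    must have positive solutions.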

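    Proof. It follows from the first two inequalities of (3.2) that

    \[
    \frac{n \gamma + q k_1}{c_1} \le m \le \frac{1 - n p_2}{p_1}. \tag{3.3}
    \]

    Hence it is necessary that

    \[
    n p_2 < 1 \tag{3.4}
    \]

    and

    \[
    n (p_1 \gamma + p_2 c_1) + q p_1 k_1 \le c_1. \tag{3.5}
    \]

    This, combined with the last two inequalities of (3.2), gives the following system of inequalities on $n$ and $q$,

    \[
    \begin{cases}
    n (p_1 \gamma + p_2 c_1) + q p_1 k_1 \le c_1, \\
    q k_2 - n c_2 \le 0, \\
    \beta \rho - q c \le 0.
    \end{cases} \tag{3.6}
    \]

    It is easy to see that the system of linear equations on $n$ and $q$,

    \[
    \begin{cases}
    n (p_1 \gamma + p_2 c_1) + q p_1 k_1 = c_1, \\
    q k_2 - n c_2 = 0,
    \end{cases} \tag{3.7}
    \]

    has a unique solution

    \[
    n = \frac{k_2 c_1}{p_1 k_1 c_2 + k_2 (p_1 \gamma + p_2 c_1)} \triangleq n^*, \qquad
    q = \frac{c_2 c_1}{p_1 k_1 c_2 + k_2 (p_1 \gamma + p_2 c_1)} \triangleq q^*. \tag{3.8}
    \]

    Note that $q^* < \frac{c_1}{p_1 k_1}$ and that the condition (3.1) is equivalent to $\frac{\beta \rho}{c} \le q^*$. Then the solution set of (3.6) is given by

    \[
    D = \left\{ (n, q) \ \middle|\ \frac{\beta \rho}{c} \le q \le q^*,\ \frac{q k_2}{c_2} \le n \le \frac{c_1 - q p_1 k_1}{p_1 \gamma + p_2 c_1} \right\},
    \]

    where $n_1(q) = \frac{q k_2}{c_2}$ and $n_2(q) = \frac{c_1 - q p_1 k_1}{p_1 \gamma + p_2 c_1}$ are derived from system (3.7) corresponding to the first two inequalities of (3.6) (see Fig. 1).

    Figure 1.  Solution set of inequalities (3.6).

    Clearly, for $(n, q) \in D$, we have $n < \frac{1}{p_2}$ and $\frac{n \gamma + q k_1}{c_1} \le \frac{1 - n p_2}{p_1}$. Thus, for $(n, q) \in D$, we can choose $m$ according to (3.3). Then such $(m, n, q)$ is a positive solution of (3.2).

    Notice that $\frac{c c_2 c_1}{\beta [p_1 k_1 c_2 + k_2 (p_1 \gamma + p_2 c_1)]} = \frac{x_0}{R_0}$ according to the expression of $R_0$ defined by (2.2). Then the condition (3.1) can be rewritten as $0 < \rho \le \frac{x_0}{R_0}$.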

    The next two results follow from the proof of Lemma 2 and will be useful in applying Lyapunov's direct method.

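    Corollary 3. Suppose $\rho < \frac{x_0}{R_0}$, i.e., $\frac{\beta \rho}{c} < q^*$. Then there are positive numbers $m$, $n$, and $q$ satisfying the following system of inequalities,

    \[
    \begin{cases}
    m p_1 + n p_2 - 1 \le 0, \\
    n \gamma + q k_1 - m c_1 < 0, \\
    q k_2 - n c_2 < 0, \\
    \beta \rho - q c < 0.
    \end{cases} \tag{3.9}
    \]

    Corollary 4. Suppose $\rho = \frac{x_0}{R_0}$, i.e., $\frac{\beta \rho}{c} = q^*$. Then (3.2) has the unique solution

    \[
    m = \frac{1}{p_1} \left( 1 - \frac{p_2 k_2 \beta \rho}{c_2 c} \right), \qquad
    n = \frac{k_2 \beta \rho}{c_2 c}, \qquad
    q = \frac{\beta \rho}{c}, \tag{3.10}
    \]

    in other words, only the equalities hold.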
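    To illustrate Corollary 4, the sketch below computes the coefficients in (3.10) for the boundary value of ρ (the quantity x0 divided by R0) and evaluates the four left-hand sides of (3.2). The parameter values and helper names are hypothetical and serve only as an example; the expectation is that all four expressions vanish up to rounding and that the three coefficients are positive.

```python
def corollary4_coefficients(rho, beta, p1, p2, c2, k2, c):
    """Coefficients (m, n, q) from formula (3.10)."""
    q = beta * rho / c
    n = k2 * beta * rho / (c2 * c)
    m = (1.0 - p2 * n) / p1
    return m, n, q


def inequality_left_sides(m, n, q, rho, beta, p1, p2, c1, c2, gamma, k1, k2, c):
    """Left-hand sides of the four inequalities in (3.2)."""
    return (m * p1 + n * p2 - 1.0,
            n * gamma + q * k1 - m * c1,
            q * k2 - n * c2,
            beta * rho - q * c)


# Hypothetical parameter values (illustration only).
lam, mu, beta = 10.0, 0.1, 0.01
p1, p2 = 0.3, 0.6
c1, c2, gamma = 0.3, 0.5, 0.1
k1, k2, c = 1.0, 5.0, 3.0

x0 = lam / mu
r0 = beta * x0 / c * (k1 * p1 / c1 + (k2 / c2) * (p2 + p1 * gamma / c1))
rho = x0 / r0  # boundary case of condition (3.1)

m, n, q = corollary4_coefficients(rho, beta, p1, p2, c2, k2, c)
lhs = inequality_left_sides(m, n, q, rho, beta, p1, p2, c1, c2, gamma, k1, k2, c)
print((m, n, q))
print(lhs)
```

    Choosing a smaller ρ instead leaves room for strict inequalities, which is the situation of Corollary 3.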

    Denote

    \[
    \Omega_0 = \{ (x, y_1, y_2, v) \in \Omega : y_1 + y_2 + v > 0 \}.
    \]

    Let $(x(t), y_1(t), y_2(t), v(t))$ be a solution of (1.3) with initial value $(x(0), y_1(0), y_2(0), v(0)) \in \Omega$. Then $x(t) > 0$ for $t > 0$. If, in addition, $(x(0), y_1(0), y_2(0), v(0)) \in \Omega_0$, then the solution is positive for $t > 0$. Thus $\Omega_0$ is a positively invariant set of (1.3).

    Now we are ready to prove the main result of this paper, a threshold dynamics of (1.3) determined by the basic reproduction number $R_0$.

    Theorem 5. If $R_0 \le 1$, then the infection-free equilibrium $P_0$ of (1.3) is globally asymptotically stable in $\Omega$, while if $R_0 > 1$, then the infection equilibrium $P^*$ is globally asymptotically stable in $\Omega_0$.

    Proof. As mentioned earlier, the approach is Lyapunov's direct method. To construct appropriate Lyapunov functions, we need the Volterra-type function $g : (0, \infty) \ni u \mapsto u - 1 - \ln u$. Note that $g$ is nonnegative and attains its global minimum $0$ only at $u = 1$.

    We first consider the infection-free equilibrium $P_0$ with a Lyapunov function of the form

    \[
    L_1 = x_0\, g\!\left( \frac{x}{x_0} \right) + m y_1 + n y_2 + q v,
    \]

    where $m$, $n$, and $q$ are positive numbers to be determined. By the discussion above ($x(t) > 0$ for $t > 0$), $L_1$ can be regarded as well defined. Clearly, $L_1$ is positive definite about $P_0$, that is, $L_1$ is zero only at $P_0$ and positive at other points. The derivative of $L_1$ along solutions of system (1.3) is given by

    \begin{align}
    L_1' &= \left( 1 - \frac{x_0}{x} \right) \frac{dx}{dt} + m \frac{dy_1}{dt} + n \frac{dy_2}{dt} + q \frac{dv}{dt} \notag \\
    &= -\frac{\mu (x - x_0)^2}{x} + (n \gamma + q k_1 - m c_1) y_1 + (q k_2 - n c_2) y_2 + (\beta x_0 - q c) v + (m p_1 + n p_2 - 1) \beta x v. \tag{3.11}
    \end{align}

    For $L_1' \le 0$ on $\Omega$, it is sufficient that

    \[
    \begin{cases}
    m p_1 + n p_2 - 1 \le 0, \\
    n \gamma + q k_1 - m c_1 \le 0, \\
    q k_2 - n c_2 \le 0, \\
    \beta x_0 - q c \le 0.
    \end{cases} \tag{3.12}
    \]

    With the definition of $q^*$ in (3.8) and $x_0 = \frac{\lambda}{\mu}$, we see that $R_0$ can be expressed as $R_0 = \frac{\beta x_0}{c q^*}$. Then $R_0 \le 1$ is equivalent to $x_0 \le \frac{c q^*}{\beta}$. Thus (3.12) is the same as (3.2) with $\rho = x_0$. We distinguish two cases to finish this part.

    Case 1: $R_0 < 1$. Then $x_0 < \frac{c q^*}{\beta}$. By Corollary 3, we can choose positive $m$, $n$, and $q$ satisfying (3.9) with $\rho = x_0$. As a result, $L_1'$ is negative definite about $P_0$, namely, $L_1'$ is zero only at $P_0$ and negative at other points. It follows from Lyapunov's theorem [44] that $P_0$ is globally asymptotically stable in $\Omega$ if $R_0 < 1$.

    Case 2: $R_0 = 1$. Then $x_0 = \frac{c q^*}{\beta}$, and Corollary 4 tells us that the positive numbers $m$, $n$, and $q$ determined by (3.10) with $\rho = x_0$ are the unique solution of the inequalities (3.12). Consequently, $L_1' = -\frac{\mu (x - x_0)^2}{x} \le 0$ and

    \[
    M_1 = \{ (x, y_1, y_2, v) \in \Omega : L_1' = 0 \} = \{ (x, y_1, y_2, v) \in \Omega : x = x_0 \}.
    \]

    Let $(x(t), y_1(t), y_2(t), v(t))$ be a solution of (1.3) in $M_1$. Then $x(t) \equiv x_0$. It follows that $0 = \frac{dx(t)}{dt} = \lambda - \mu x(t) - \beta x(t) v(t) = -\beta x_0 v(t)$, which gives $v(t) \equiv 0$. Thus $0 = \frac{dv(t)}{dt} = k_1 y_1(t) + k_2 y_2(t) - c v(t) = k_1 y_1(t) + k_2 y_2(t)$ produces $y_1(t) \equiv y_2(t) \equiv 0$, as $y_1(t) \ge 0$ and $y_2(t) \ge 0$. This shows that the largest invariant set of (1.3) in $M_1$ is the singleton $\{P_0\}$. Therefore, by LaSalle's Invariance Principle [27], $P_0$ is globally asymptotically stable in $\Omega$ if $R_0 = 1$.

    To sum up, $P_0$ is globally asymptotically stable in $\Omega$ if $R_0 \le 1$.

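    Next, we consider the stability of the infection equilibrium $P^*$ in $\Omega_0$ with the Lyapunov function candidate

    \[
    L_2 = x^* g\!\left( \frac{x}{x^*} \right) + m y_1^* g\!\left( \frac{y_1}{y_1^*} \right) + n y_2^* g\!\left( \frac{y_2}{y_2^*} \right) + q v^* g\!\left( \frac{v}{v^*} \right), \tag{3.13}
    \]

    where $m$, $n$, and $q$ are positive numbers to be determined. Again we can assume that $L_2$ is well defined on $\Omega_0$. $L_2$ is positive definite about $P^*$, and the derivative of $L_2$ along solutions of system (1.3) is

    \begin{align}
    L_2' &= \left( 1 - \frac{x^*}{x} \right) \frac{dx}{dt} + m \left( 1 - \frac{y_1^*}{y_1} \right) \frac{dy_1}{dt} + n \left( 1 - \frac{y_2^*}{y_2} \right) \frac{dy_2}{dt} + q \left( 1 - \frac{v^*}{v} \right) \frac{dv}{dt} \notag \\
    &= \left( 1 - \frac{x^*}{x} \right) (\lambda - \mu x - \beta x v) + m \left( 1 - \frac{y_1^*}{y_1} \right) (p_1 \beta x v - c_1 y_1) \notag \\
    &\quad + n \left( 1 - \frac{y_2^*}{y_2} \right) (p_2 \beta x v + \gamma y_1 - c_2 y_2) + q \left( 1 - \frac{v^*}{v} \right) (k_1 y_1 + k_2 y_2 - c v) \notag \\
    &= C + F(x, y_1, y_2, v), \tag{3.14}
    \end{align}

    where

    \begin{align*}
    C &= \lambda + \mu x^* + m c_1 y_1^* + n c_2 y_2^* + c q v^*, \\
    F(x, y_1, y_2, v) &= (m p_1 + n p_2 - 1) \beta x^* v^* \frac{x v}{x^* v^*} + (n \gamma + q k_1 - m c_1) y_1^* \frac{y_1}{y_1^*} + (q k_2 - n c_2) y_2^* \frac{y_2}{y_2^*} + (\beta x^* - c q) v^* \frac{v}{v^*} \\
    &\quad - \lambda \frac{x^*}{x} - \mu x^* \frac{x}{x^*} - m p_1 \beta x^* v^* \frac{x v y_1^*}{x^* v^* y_1} - n p_2 \beta x^* v^* \frac{x v y_2^*}{x^* v^* y_2} \\
    &\quad - n \gamma y_1^* \frac{y_1 y_2^*}{y_1^* y_2} - q k_1 y_1^* \frac{y_1 v^*}{y_1^* v} - q k_2 y_2^* \frac{y_2 v^*}{y_2^* v}.
    \end{align*}

    Since $L_2' = 0$ for $\frac{x}{x^*} = \frac{y_1}{y_1^*} = \frac{y_2}{y_2^*} = \frac{v}{v^*} = 1$, we have $C = -F(x^*, y_1^*, y_2^*, v^*)$.

    We define a function $\tilde{F}(x, y_1, y_2, v)$ related to $F(x, y_1, y_2, v)$ by

    \begin{align*}
    \tilde{F}(x, y_1, y_2, v) &= (m p_1 + n p_2 - 1) \beta x^* v^* \ln \frac{x v}{x^* v^*} + (n \gamma + q k_1 - m c_1) y_1^* \ln \frac{y_1}{y_1^*} + (q k_2 - n c_2) y_2^* \ln \frac{y_2}{y_2^*} + (\beta x^* - c q) v^* \ln \frac{v}{v^*} \\
    &\quad - \lambda \ln \frac{x^*}{x} - \mu x^* \ln \frac{x}{x^*} - m p_1 \beta x^* v^* \ln \frac{x v y_1^*}{x^* v^* y_1} - n p_2 \beta x^* v^* \ln \frac{x v y_2^*}{x^* v^* y_2} \\
    &\quad - n \gamma y_1^* \ln \frac{y_1 y_2^*}{y_1^* y_2} - q k_1 y_1^* \ln \frac{y_1 v^*}{y_1^* v} - q k_2 y_2^* \ln \frac{y_2 v^*}{y_2^* v}.
    \end{align*}

    A straightforward calculation shows

    \begin{align*}
    \tilde{F}(x, y_1, y_2, v) &= (\lambda - \mu x^* - \beta x^* v^*) \ln \frac{x}{x^*} + m (p_1 \beta x^* v^* - c_1 y_1^*) \ln \frac{y_1}{y_1^*} \\
    &\quad + n (p_2 \beta x^* v^* + \gamma y_1^* - c_2 y_2^*) \ln \frac{y_2}{y_2^*} + q (k_1 y_1^* + k_2 y_2^* - c v^*) \ln \frac{v}{v^*}.
    \end{align*}

    According to system (2.3) satisfied by the infection equilibrium $P^* = (x^*, y_1^*, y_2^*, v^*)$, we have $\tilde{F}(x, y_1, y_2, v) = 0$. Therefore,

    \begin{align}
    L_2' &= F(x, y_1, y_2, v) - F(x^*, y_1^*, y_2^*, v^*) - \tilde{F}(x, y_1, y_2, v) \notag \\
    &= (m p_1 + n p_2 - 1) \beta x^* v^* g\!\left( \frac{x v}{x^* v^*} \right) + (n \gamma + q k_1 - m c_1) y_1^* g\!\left( \frac{y_1}{y_1^*} \right) + (q k_2 - n c_2) y_2^* g\!\left( \frac{y_2}{y_2^*} \right) + (\beta x^* - c q) v^* g\!\left( \frac{v}{v^*} \right) \notag \\
    &\quad - \lambda g\!\left( \frac{x^*}{x} \right) - \mu x^* g\!\left( \frac{x}{x^*} \right) - m p_1 \beta x^* v^* g\!\left( \frac{x v y_1^*}{x^* v^* y_1} \right) - n p_2 \beta x^* v^* g\!\left( \frac{x v y_2^*}{x^* v^* y_2} \right) \notag \\
    &\quad - n \gamma y_1^* g\!\left( \frac{y_1 y_2^*}{y_1^* y_2} \right) - q k_1 y_1^* g\!\left( \frac{y_1 v^*}{y_1^* v} \right) - q k_2 y_2^* g\!\left( \frac{y_2 v^*}{y_2^* v} \right). \tag{3.15}
    \end{align}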
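    Identity (3.15) holds for arbitrary coefficients m, n, and q, because its derivation uses only the equilibrium equations (2.3). As a sanity check, the sketch below compares the derivative obtained directly from the chain rule in (3.14) with the right-hand side of (3.15) at one sample point. The parameter values, the sample point, the coefficients, and the function names are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical parameter values (illustration only); they give R0 > 1.
LAM, MU, BETA = 10.0, 0.1, 0.01
P1, P2 = 0.3, 0.6
C1, C2, GAMMA = 0.3, 0.5, 0.1
K1, K2, C = 1.0, 5.0, 3.0


def g(u):
    """Volterra-type function g(u) = u - 1 - ln(u), defined for u > 0."""
    return u - 1.0 - math.log(u)


def equilibrium():
    """Infection equilibrium (x*, y1*, y2*, v*) via (2.3b), (2.3c), (2.4)."""
    r0 = BETA * LAM / (MU * C) * (K1 * P1 / C1 + K2 / C2 * (P2 + P1 * GAMMA / C1))
    v = MU / BETA * (r0 - 1.0)
    x = LAM / (MU + BETA * v)
    y1 = P1 * BETA * x * v / C1
    y2 = (P2 * BETA * x * v + GAMMA * y1) / C2
    return x, y1, y2, v


def derivative_chain_rule(state, eq, m, n, q):
    """Derivative of L2 along solutions, first line of (3.14)."""
    x, y1, y2, v = state
    xs, y1s, y2s, vs = eq
    inf = BETA * x * v
    dx = LAM - MU * x - inf
    dy1 = P1 * inf - C1 * y1
    dy2 = P2 * inf + GAMMA * y1 - C2 * y2
    dv = K1 * y1 + K2 * y2 - C * v
    return ((1 - xs / x) * dx + m * (1 - y1s / y1) * dy1
            + n * (1 - y2s / y2) * dy2 + q * (1 - vs / v) * dv)


def derivative_g_form(state, eq, m, n, q):
    """Right-hand side of (3.15), written with the function g."""
    x, y1, y2, v = state
    xs, y1s, y2s, vs = eq
    b = BETA * xs * vs
    return ((m * P1 + n * P2 - 1) * b * g(x * v / (xs * vs))
            + (n * GAMMA + q * K1 - m * C1) * y1s * g(y1 / y1s)
            + (q * K2 - n * C2) * y2s * g(y2 / y2s)
            + (BETA * xs - C * q) * vs * g(v / vs)
            - LAM * g(xs / x) - MU * xs * g(x / xs)
            - m * P1 * b * g(x * v * y1s / (xs * vs * y1))
            - n * P2 * b * g(x * v * y2s / (xs * vs * y2))
            - n * GAMMA * y1s * g(y1 * y2s / (y1s * y2))
            - q * K1 * y1s * g(y1 * vs / (y1s * v))
            - q * K2 * y2s * g(y2 * vs / (y2s * v)))


eq = equilibrium()
sample = (50.0, 3.0, 4.0, 10.0)   # arbitrary positive state
coeffs = (0.7, 1.1, 0.2)          # arbitrary positive m, n, q
direct = derivative_chain_rule(sample, eq, *coeffs)
via_g = derivative_g_form(sample, eq, *coeffs)
print(direct, via_g)
```

    The two numbers agree up to floating-point error, which supports the term-by-term form of (3.15).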

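    Recall that $g(u) \ge 0$ for $u > 0$ and $g(u) = 0$ if and only if $u = 1$. To make $L_2' \le 0$, it suffices that the positive numbers $m$, $n$, and $q$ satisfy the following system of inequalities,

    \[
    \begin{cases}
    m p_1 + n p_2 - 1 \le 0, \\
    n \gamma + q k_1 - m c_1 \le 0, \\
    q k_2 - n c_2 \le 0, \\
    \beta x^* - c q \le 0.
    \end{cases} \tag{3.16}
    \]

    Again, (3.16) is the same as (3.2) with $\rho = x^*$. Since $x^* = \frac{\lambda}{\mu + \beta v^*} = \frac{x_0}{R_0}$, Corollary 4 applies, and the system of inequalities (3.16) has a unique positive solution,

    \[
    m = \frac{1}{p_1} \left( 1 - \frac{p_2 k_2 \beta x^*}{c_2 c} \right), \qquad
    n = \frac{k_2 \beta x^*}{c_2 c}, \qquad
    q = \frac{\beta x^*}{c},
    \]

    and in fact all the equalities of (3.16) hold. Then with these $m$, $n$, and $q$, $L_2'$ becomes

    \begin{align*}
    L_2' &= -\lambda g\!\left( \frac{x^*}{x} \right) - \mu x^* g\!\left( \frac{x}{x^*} \right) - m p_1 \beta x^* v^* g\!\left( \frac{x v y_1^*}{x^* v^* y_1} \right) - n p_2 \beta x^* v^* g\!\left( \frac{x v y_2^*}{x^* v^* y_2} \right) \\
    &\quad - n \gamma y_1^* g\!\left( \frac{y_1 y_2^*}{y_1^* y_2} \right) - q k_1 y_1^* g\!\left( \frac{y_1 v^*}{y_1^* v} \right) - q k_2 y_2^* g\!\left( \frac{y_2 v^*}{y_2^* v} \right).
    \end{align*}

    It follows that $L_2' \le 0$ and

    \[
    M_2 = \{ (x, y_1, y_2, v) \in \Omega_0 : L_2' = 0 \} = \left\{ (x, y_1, y_2, v) \in \Omega_0 : x = x^*,\ \frac{y_1}{y_1^*} = \frac{y_2}{y_2^*} = \frac{v}{v^*} \right\}.
    \]

    Let $(x(t), y_1(t), y_2(t), v(t))$ be a solution of (1.3) in $M_2$. Then $x(t) \equiv x^*$ and $\frac{y_1(t)}{y_1^*} = \frac{y_2(t)}{y_2^*} = \frac{v(t)}{v^*} = \theta(t)$ for a positive function $\theta$. It follows from $0 = \frac{dx(t)}{dt} = \lambda - \mu x(t) - \beta x(t) v(t) = \lambda - \mu x^* - \beta x^* v(t) = \beta x^* v^* (1 - \theta(t))$ that $\theta(t) \equiv 1$, which implies that the largest invariant set of system (1.3) in $M_2$ is the singleton $\{P^*\}$. Therefore, LaSalle's Invariance Principle [27] tells us that $P^*$ is globally asymptotically stable in $\Omega_0$.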
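    The second part of the proof can be illustrated numerically. The sketch below integrates model (1.3) with a classical fourth-order Runge-Kutta scheme and records the Lyapunov function (3.13), with coefficients taken from Corollary 4, along the computed trajectory. The parameter values, step size, time horizon, and initial state are hypothetical choices for illustration only, and the integrator is a generic textbook scheme rather than anything used in the paper. With these choices the basic reproduction number exceeds one, so the recorded values are expected to decrease toward zero.

```python
import math

# Hypothetical parameter values (illustration only); they give R0 = 8/3 > 1.
P = dict(lam=10.0, mu=0.1, beta=0.01, p1=0.3, p2=0.6,
         c1=0.3, c2=0.5, gamma=0.1, k1=1.0, k2=5.0, c=3.0)


def rhs(s, P):
    """Right-hand side of model (1.3) at s = (x, y1, y2, v)."""
    x, y1, y2, v = s
    inf = P["beta"] * x * v
    return (P["lam"] - P["mu"] * x - inf,
            P["p1"] * inf - P["c1"] * y1,
            P["p2"] * inf + P["gamma"] * y1 - P["c2"] * y2,
            P["k1"] * y1 + P["k2"] * y2 - P["c"] * v)


def rk4_step(s, dt, P):
    """One classical Runge-Kutta step of size dt."""
    def shift(a, k, h):
        return tuple(ai + h * ki for ai, ki in zip(a, k))
    s1 = rhs(s, P)
    s2 = rhs(shift(s, s1, dt / 2), P)
    s3 = rhs(shift(s, s2, dt / 2), P)
    s4 = rhs(shift(s, s3, dt), P)
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, s1, s2, s3, s4))


def equilibrium(P):
    """Infection equilibrium (x*, y1*, y2*, v*), assuming R0 > 1."""
    r0 = (P["beta"] * P["lam"] / (P["mu"] * P["c"])
          * (P["k1"] * P["p1"] / P["c1"]
             + P["k2"] / P["c2"] * (P["p2"] + P["p1"] * P["gamma"] / P["c1"])))
    v = P["mu"] / P["beta"] * (r0 - 1)
    x = P["lam"] / (P["mu"] + P["beta"] * v)
    y1 = P["p1"] * P["beta"] * x * v / P["c1"]
    y2 = (P["p2"] * P["beta"] * x * v + P["gamma"] * y1) / P["c2"]
    return x, y1, y2, v


def g(u):
    return u - 1 - math.log(u)


def lyapunov_L2(s, eq, P):
    """L2 from (3.13) with m, n, q from Corollary 4 (rho = x*)."""
    xs, y1s, y2s, vs = eq
    q = P["beta"] * xs / P["c"]
    n = P["k2"] * q / P["c2"]
    m = (1 - P["p2"] * n) / P["p1"]
    x, y1, y2, v = s
    return (xs * g(x / xs) + m * y1s * g(y1 / y1s)
            + n * y2s * g(y2 / y2s) + q * vs * g(v / vs))


def simulate(s0, dt, steps, P):
    """Integrate from s0 and return the final state and the L2 history."""
    eq = equilibrium(P)
    s, values = s0, [lyapunov_L2(s0, eq, P)]
    for _ in range(steps):
        s = rk4_step(s, dt, P)
        values.append(lyapunov_L2(s, eq, P))
    return s, values


final_state, values = simulate((80.0, 1.0, 1.0, 1.0), 0.01, 30000, P)
print(final_state, values[0], values[-1])
```

    A run of this kind is only an illustration for one parameter set and one initial state; the global statement rests on the Lyapunov argument above.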

    In this paper, for a viral infection model with defectively infected cells, we obtained a threshold dynamics that is completely determined by the basic reproduction number $R_0$ of the virus. That is, the virus dies out when $R_0 \le 1$, whereas the virus persists and the viral load approaches a positive level when $R_0 > 1$. In practice, for diseases described by this model, any measure that brings $R_0$ below unity is effective. The explicit expression of $R_0$ provides guidelines on how to increase or decrease parameter values by appropriate control strategies. Even if we cannot make $R_0 \le 1$, the global stability of the infection equilibrium tells us that we can still change the values of parameters to bring the viral load below the tolerance level.

    The result is established by Lyapunov's direct method. The Lyapunov function for the infection-free equilibrium is a linear combination of the Volterra-type function (for the uninfected target cells) and linear functions (for the other three variables), whereas the one for the infection equilibrium is a linear combination of Volterra-type functions only. Surprisingly, the coefficients satisfy the same set of inequalities that make the derivatives along solutions negative (semi-)definite. This shows that there is a correlation between the two Lyapunov functions with the given forms. To a certain extent, this addresses the problem of constructing Lyapunov functions for proving the global stability of the infection equilibrium, since it is often difficult to find a suitable Lyapunov function for the positive equilibrium but easy for the boundary equilibrium.

    Furthermore, for the given form of Lyapunov function, we used the method of undetermined coefficients to determine them. By this method, all the suitable coefficients of the given form can be found. Therefore, it has the advantage of universality, which has been shown in [31,45]. But, with respect to proving the negative definiteness or negative semi-definiteness of the derivative of the Lyapunov function along solutions of the model, the approach used here is different from those in [31,45].

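    According to the algebraic approach proposed in [31,45], even if the coefficients of the Lyapunov function ($L_2$) are given, in order to show the negative definiteness or negative semi-definiteness of its derivative ($L_2'$), the derivative ($L_2'$) must be expressed in the following form

    \[
    b_1 \left( 2 - x - \frac{1}{x} \right) + b_2 \left( 3 - \frac{1}{x} - \frac{x v}{y_1} - \frac{y_1}{v} \right) + b_3 \left( 3 - \frac{1}{x} - \frac{x v}{y_2} - \frac{y_2}{v} \right) + b_4 \left( 4 - \frac{1}{x} - \frac{x v}{y_1} - \frac{y_1}{y_2} - \frac{y_2}{v} \right),
    \]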

    where the expressions of bi's (i=1, 2, 3, 4) also need to be determined. For low dimensional differential systems, the approach of rearranging the terms in the derivative is feasible, but it is not so easy for systems with higher dimensions. Thus the approach of proving the global stability of the infection equilibrium here is concise. It can also indicate that this approach is relatively simple for proving the global stability of the endemic equilibria of high dimensional epidemic models in [37,46].

    This work is supported partially by the National Natural Science Foundation of PR China (Nos. 11971281, 12071268, 12071418).

    The authors declare there is no conflict of interest.



    [1] K. C. Chou, H. B. Shen, Recent progress in protein subcellular location prediction, Anal. Biochem., 370 (2007), 1–16. https://doi.org/10.1016/j.ab.2007.07.006 doi: 10.1016/j.ab.2007.07.006
    [2] R. F. Murphy, M. V. Boland, M. Velliste, Towards a systematics for protein subcellular location: quantitative description of protein localization patterns and automated analysis of fluorescence microscope images, in Proceedings International Conference on Intelligent System Molecular Biology, 8 (2000), 251–259.
    [3] J. Cao, W. Liu, J. He, H. Gu, Mining proteins with non-experimental annotations based on an active sample selection strategy for predicting protein subcellular localization, PLoS One, 8 (2013), e67343. https://doi.org/10.1371/journal.pone.0067343 doi: 10.1371/journal.pone.0067343
    [4] H. B. Shen, J. Yang, K. C. Chou, Methodology development for predicting subcellular localization and other attributes of proteins, Expert Rev. Proteomics, 4 (2007), 453–463. https://doi.org/10.1586/14789450.4.4.453 doi: 10.1586/14789450.4.4.453
    [5] A. Reinhardt, T. Hubbard, Using neural networks for prediction of the subcellular location of proteins, Nucleic Acids Res., 26 (1998), 2230–2236. https://doi.org/10.1093/nar/26.9.2230 doi: 10.1093/nar/26.9.2230
    [6] J. Cedano, P. Aloy, J. A. Perez-Pons, E. Querol, Relation between amino acid composition and cellular location of proteins, J. Mol. Biol., 266 (1997), 594–600. https://doi.org/10.1006/jmbi.1996.0804 doi: 10.1006/jmbi.1996.0804
    [7] Y. X. Pan, Z. Z. Zhang, Z. M. Guo, G. Y. Feng, Z. D. Huang, L. He, Application of pseudo amino acid composition for predicting protein subcellular location: stochastic signal processing approach, J. Protein Chem., 22 (2003), 395–402. https://doi.org/10.1023/a:1025350409648 doi: 10.1023/a:1025350409648
    [8] J. Y. Shi, S. Zhang, Q. Pan, G. Zhou, Using pseudo amino acid composition to predict protein subcellular location: approached with amino acid composition distribution, Amino Acids, 35 (2008), 321–327. https://doi.org/10.1007/s00726-007-0623-z doi: 10.1007/s00726-007-0623-z
    [9] H. Lin, H. Ding, F. Guo, A. Zhang, J. Huang, Predicting subcellular localization of mycobacterial proteins by using Chou's pseudo amino acid composition, Protein Pept. Lett., 15 (2008), 739–744. https://doi.org/10.2174/092986608785133681 doi: 10.2174/092986608785133681
    [10] K. Chou, Prediction of protein cellular attributes using pseudo-amino acid composition, Proteins, 43 (2001), 246–255. https://doi.org/10.1002/prot.1035 doi: 10.1002/prot.1035
    [11] T. Liu, X. Zheng, C. Wang, J. Wang, Prediction of subcellular location of apoptosis proteins using pseudo amino acid composition: an approach from auto covariance transformation, Protein Pept. Lett., 17 (2010), 1263–1269. https://doi.org/10.2174/092986610792231528 doi: 10.2174/092986610792231528
    [12] Y. Shen, J. Tang, F. Guo, Identification of protein subcellular localization via integrating evolutionary and physicochemical information into Chou's general PseAAC, J. Theor. Biol., 462 (2019), 230–239. https://doi.org/10.1016/j.jtbi.2018.11.012 doi: 10.1016/j.jtbi.2018.11.012
    [13] Y. H. Yao, Z. X. Shi, Q. Dai, Apoptosis protein subcellular location prediction based on position-specific scoring matrix, J. Comput. Theor. Nanos., 11 (2014), 2073–2078. https://doi.org/10.1166/jctn.2014.3607 doi: 10.1166/jctn.2014.3607
    [14] T. Liu, P. Tao, X. Li, Y. Qin, C. Wang, Prediction of subcellular location of apoptosis proteins combining tri-gram encoding based on PSSM and recursive feature elimination, J. Theor. Biol., 366 (2015), 8–12. https://doi.org/10.1016/j.jtbi.2014.11.010 doi: 10.1016/j.jtbi.2014.11.010
    [15] S. Wang, W. Li, Y. Fei, An improved process for generating uniform PSSMs and its application in protein subcellular localization via various global dimension reduction techniques, IEEE Access, 7 (2019), 42384–42395. https://doi.org/10.1109/ACCESS.2019.2907642 doi: 10.1109/ACCESS.2019.2907642
    [16] X. Cheng, X. Xiao, K. C. Chou, pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information. Bioinformatics, 34 (2018), 1448–1456. https://doi.org/10.1093/bioinformatics/btx711 doi: 10.1093/bioinformatics/btx711
    [17] X. Cheng, S. Zhao, W. Lin, X. Xiao, K. Chou, pLoc-mAnimal: predict subcellular localization of animal proteins with both single and multiple sites, Bioinformatics, 33 (2017), 3524–3531. https://doi.org/10.1093/bioinformatics/btx476 doi: 10.1093/bioinformatics/btx476
    [18] X. Cheng, X. Xiao, K.C. Chou, pLoc-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by deep gene ontology learning via general PseAAC, Genomics, 110 (2017), 231–239. https://doi.org/10.1016/j.ygeno.2017.10.002 doi: 10.1016/j.ygeno.2017.10.002
    [19] X. Cheng, X. Xiao, K. C. Chou, pLoc-mEuk: Predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC, Genomics, 110 (2018), 50–58. https://doi.org/10.1016/j.ygeno.2017.08.005
    [20] K. Chou, Y. Cai, A new hybrid approach to predict subcellular localization of proteins by incorporating gene ontology, Biochem. Biophys. Res. Commun., 311 (2003), 743–747. https://doi.org/10.1016/j.bbrc.2003.10.062
    [21] S. Wan, M. Mak, S. Kung, GOASVM: A subcellular location predictor by incorporating term-frequency gene ontology into the general form of Chou's pseudo-amino acid composition, J. Theor. Biol., 323 (2013), 40–48. https://doi.org/10.1016/j.jtbi.2013.01.012
    [22] S. Wan, M. Mak, S. Kung, mGOASVM: Multi-label protein subcellular localization based on gene ontology and support vector machines, BMC Bioinf., 13 (2012), 290. https://doi.org/10.1186/1471-2105-13-290
    [23] K. C. Chou, Y. D. Cai, Using functional domain composition and support vector machines for prediction of protein subcellular location, J. Biol. Chem., 277 (2002), 45765–45769. https://doi.org/10.1074/jbc.M204161200
    [24] K. Chou, H. Shen, A new method for predicting the subcellular localization of eukaryotic proteins with both single and multiple sites: Euk-mPLoc 2.0, PLoS One, 5 (2010), e9931. https://doi.org/10.1371/journal.pone.0009931
    [25] Y. Cai, K. Chou, Nearest neighbour algorithm for predicting protein subcellular location by combining functional domain composition and pseudo-amino acid composition, Biochem. Biophys. Res. Commun., 305 (2003), 407–411. https://doi.org/10.1016/s0006-291x(03)00775-7
    [26] K. Chou, Y. Cai, Predicting subcellular localization of proteins by hybridizing functional domain composition and pseudo-amino acid composition, J. Cell. Biochem., 91 (2004), 1197–1203. https://doi.org/10.1002/jcb.10790
    [27] X. Pan, L. Chen, M. Liu, Z. Niu, T. Huang, Y. Cai, Identifying protein subcellular locations with embeddings-based node2loc, IEEE/ACM Trans. Comput. Biol. Bioinf., 19 (2022), 666–675. https://doi.org/10.1109/TCBB.2021.3080386
    [28] X. Pan, H. Li, T. Zeng, Z. Li, L. Chen, T. Huang, et al., Identification of protein subcellular localization with network and functional embeddings, Front. Genet., 11 (2021), 626500. https://doi.org/10.3389/fgene.2020.626500
    [29] H. Liu, B. Hu, L. Chen, Identifying protein subcellular location with embedding features learned from networks, Curr. Proteomics, 18 (2021), 646–660. https://doi.org/10.2174/1570164617999201124142950
    [30] R. Wang, L. Chen, Identification of human protein subcellular location with multiple networks, Curr. Proteomics, 19 (2022), 344–356.
    [31] R. Su, L. He, T. Liu, X. Liu, L. Wei, Protein subcellular localization based on deep image features and criterion learning strategy, Briefings Bioinf., 22 (2020), bbaa313. https://doi.org/10.1093/bib/bbaa313
    [32] M. Ullah, F. Hadi, J. Song, D. Yu, PScL-DDCFPred: an ensemble deep learning-based approach for characterizing multiclass subcellular localization of human proteins from bioimage data, Bioinformatics, 38 (2022), 4019–4026. https://doi.org/10.1093/bioinformatics/btac432
    [33] M. Ullah, K. Han, F. Hadi, J. Xu, J. Song, D. Yu, PScL-HDeep: image-based prediction of protein subcellular location in human tissue using ensemble learning of handcrafted and deep learned features with two-layer feature selection, Briefings Bioinf., 22 (2021), bbab278. https://doi.org/10.1093/bib/bbab278
    [34] G. Tsoumakas, I. Vlahavas, Random k-Labelsets: An ensemble method for multilabel classification, in Machine Learning: ECML 2007, (2007), 406–417. https://doi.org/10.1007/978-3-540-74958-5_38
    [35] L. Breiman, Random forests, Mach. Learn., 45 (2001), 5–32. https://doi.org/10.1023/A:1010933404324
    [36] K. C. Chou, Z. C. Wu, X. Xiao, iLoc-Hum: using the accumulation-label scale to predict subcellular locations of human proteins with both single and multiple sites, Mol. Biosyst., 8 (2012), 629–641. https://doi.org/10.1039/c1mb05420a
    [37] H. B. Shen, K. C. Chou, A top-down approach to enhance the power of predicting human protein subcellular localization: Hum-mPLoc 2.0, Anal. Biochem., 394 (2009), 269–274. https://doi.org/10.1016/j.ab.2009.07.046
    [38] W. Z. Lin, J. Fang, X. Xiao, K. Chou, iLoc-Animal: a multi-label learning classifier for predicting subcellular localization of animal proteins, Mol. Biosyst., 9 (2013), 634–644. https://doi.org/10.1039/c3mb25466f
    [39] H. B. Shen, K. C. Chou, Gneg-mPLoc: a top-down strategy to enhance the quality of predicting subcellular localization of Gram-negative bacterial proteins, J. Theor. Biol., 264 (2010), 326–333. https://doi.org/10.1016/j.jtbi.2010.01.018
    [40] X. Xiao, Z. C. Wu, K. C. Chou, A multi-label classifier for predicting the subcellular localization of gram-negative bacterial proteins with both single and multiple sites, PLoS One, 6 (2011), e20592. https://doi.org/10.1371/journal.pone.0020592
    [41] G. Tsoumakas, I. Katakis, Multi-label classification: An overview, Int. J. Data Warehouse. Min., 3 (2007), 1–13. https://doi.org/10.4018/jdwm.2007070101
    [42] S. Al-Maadeed, Kernel collaborative label power set system for multi-label classification, in Qatar Foundation Annual Research Forum Volume 2013 Issue 1, Hamad bin Khalifa University Press, 2013 (2013). https://doi.org/10.5339/qfarf.2013.ICTP-028
    [43] J. P. Zhou, L. Chen, Z. H. Guo, iATC-NRAKEL: An efficient multi-label classifier for recognizing anatomical therapeutic chemical classes of drugs, Bioinformatics, 36 (2020), 1391–1396. https://doi.org/10.1093/bioinformatics/btz757
    [44] J. P. Zhou, L. Chen, T. Wang, M. Liu, iATC-FRAKEL: A simple multi-label web-server for recognizing anatomical therapeutic chemical classes of drugs with their fingerprints only, Bioinformatics, 36 (2020), 3568–3569. https://doi.org/10.1093/bioinformatics/btaa166
    [45] X. Li, L. Lu, L. Chen, Identification of protein functions in mouse with a label space partition method, Math. Biosci. Eng., 19 (2022), 3820–3842. https://doi.org/10.3934/mbe.2022176
    [46] H. Li, S. Zhang, L. Chen, X. Pan, Z. Li, T. Huang, et al., Identifying functions of proteins in mice with functional embedding features, Front. Genet., 13 (2022), 909040. https://doi.org/10.3389/fgene.2022.909040
    [47] L. Chen, Z. Li, T. Zeng, Y. Zhang, H. Li, T. Huang, et al., Predicting gene phenotype by multi-label multi-class model based on essential functional features, Mol. Genet. Genomics, 296 (2021), 905–918. https://doi.org/10.1007/s00438-021-01789-8
    [48] Y. Zhu, B. Hu, L. Chen, Q. Dai, iMPTCE-Hnetwork: a multi-label classifier for identifying metabolic pathway types of chemicals and enzymes with a heterogeneous network, Comput. Math. Methods Med., 2021 (2021), 6683051. https://doi.org/10.1155/2021/6683051
    [49] J. Che, L. Chen, Z. Guo, S. Wang, Aorigele, Drug target group prediction with multiple drug networks, Comb. Chem. High Throughput Screen., 23 (2020), 274–284. https://doi.org/10.2174/1386207322666190702103927
    [50] H. Wang, L. Chen, PMPTCE-HNEA: Predicting metabolic pathway types of chemicals and enzymes with a heterogeneous network embedding algorithm, Curr. Bioinf., 18 (2023), 748–759. https://doi.org/10.2174/1574893618666230224121633
    [51] J. Read, P. Reutemann, B. Pfahringer, MEKA: A multi-label/multi-target extension to WEKA, J. Mach. Learn. Res., 17 (2016), 1–5.
    [52] B. Ran, L. Chen, M. Li, Y. Han, Q. Dai, Drug-Drug interactions prediction using fingerprint only, Comput. Math. Methods Med., 2022 (2022), 7818480. https://doi.org/10.1155/2022/7818480
    [53] M. Onesime, Z. Yang, Q. Dai, Genomic island prediction via chi-square test and random forest algorithm, Comput. Math. Methods Med., 2021 (2021), 9969751. https://doi.org/10.1155/2021/9969751
    [54] L. Chen, K. Chen, B. Zhou, Inferring drug-disease associations by a deep analysis on drug and disease networks, Math. Biosci. Eng., 20 (2023), 14136–14157. https://doi.org/10.3934/mbe.2023632
    [55] P. Chen, T. Shen, Y. Zhang, B. Wang, A sequence-segment neighbor encoding schema for protein hotspot residue prediction, Curr. Bioinf., 15 (2020), 445–454. https://doi.org/10.2174/1574893615666200106115421
    [56] Z. B. Lv, J. Zhang, H. Ding, Q. Zou, RF-PseU: A random forest predictor for RNA pseudouridine sites, Front. Bioeng. Biotechnol., 8 (2020), 134. https://doi.org/10.3389/fbioe.2020.00134
    [57] F. Huang, Q. Ma, J. Ren, J. Li, F. Wang, T. Huang, et al., Identification of smoking associated transcriptome aberration in blood with machine learning methods, Biomed. Res. Int., 2023 (2023), 5333361. https://doi.org/10.1155/2023/5333361
    [58] F. Huang, M. Fu, J. Li, L. Chen, K. Feng, T. Huang, et al., Analysis and prediction of protein stability based on interaction network, gene ontology, and KEGG pathway enrichment scores, Biochim. Biophys. Acta. Proteins Proteom., 1871 (2023), 140889. https://doi.org/10.1016/j.bbapap.2023.140889
    [59] J. Ren, Y. Zhang, W. Guo, K. Feng, Y. Yuan, T. Huang, et al., Identification of genes associated with the impairment of olfactory and gustatory functions in COVID-19 via machine-learning methods, Life (Basel), 13 (2023), 798. https://doi.org/10.3390/life13030798
    [60] K. C. Chou, C. T. Zhang, Prediction of protein structural classes, Crit. Rev. Biochem. Mol. Biol., 30 (1995), 275–349. https://doi.org/10.3109/10409239509083488
    [61] K. C. Chou, Z. C. Wu, X. Xiao, iLoc-Euk: A multi-label classifier for predicting the subcellular localization of singleplex and multiplex eukaryotic proteins, PLoS One, 6 (2011), e18258. https://doi.org/10.1371/journal.pone.0018258
    [62] S. Tang, L. Chen, iATC-NFMLP: Identifying classes of anatomical therapeutic chemicals based on drug networks, fingerprints and multilayer perceptron, Curr. Bioinf., 17 (2022), 814–824.
    [63] H. Zhao, Y. Li, J. Wang, A convolutional neural network and graph convolutional network-based method for predicting the classification of anatomical therapeutic chemicals, Bioinformatics, 37 (2021), 2841–2847. https://doi.org/10.1093/bioinformatics/btab204
    [64] W. Chen, H. Yang, P. Feng, H. Ding, H. Lin, iDNA4mC: identifying DNA N4-methylcytosine sites based on nucleotide chemical properties, Bioinformatics, 33 (2017), 3518–3523. https://doi.org/10.1093/bioinformatics/btx479
    [65] L. Wei, P. Xing, R. Su, G. Shi, Z. S. Ma, Q. Zou, CPPred-RF: A sequence-based predictor for identifying cell-penetrating peptides and their uptake efficiency, J. Proteome Res., 16 (2017), 2044–2053. https://doi.org/10.1021/acs.jproteome.7b00019
    [66] S. R. Safavian, D. Landgrebe, A survey of decision tree classifier methodology, IEEE Trans. Syst. Man Cybern., 21 (1991), 660–674. https://doi.org/10.1109/21.97458
    [67] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn., 20 (1995), 273–297. https://doi.org/10.1007/BF00994018
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)