
Influence maximization (IM), a central issue in optimizing information diffusion on social platforms, aims to spread posts or comments more widely, rapidly, and efficiently. Existing studies primarily focus on the positive effects of incorporating heuristic calculations in IM approaches. However, heuristic models fail to consider the potential enhancements that can be achieved through network representation learning techniques. Some recent work is keen to use representation learning to deal with IM issues. However, few in-depth studies have explored the existing challenges in IM representation learning, specifically regarding the role characteristics and role representations. This paper highlights the potential advantages of combining heuristic computing and role embedding to solve IM problems. First, the method introduces role granularity classification to effectively categorize users into three distinct roles: opinion leaders, structural holes and normal nodes. This classification enables a deeper understanding of the dynamics of users within the network. Second, a novel role-based network embedding (RbNE) algorithm is proposed. By leveraging the concept of node roles, RbNE captures the similarity between nodes, allowing for a more accurate representation of the network structure. Finally, a superior IM approach, named RbneIM, is recommended. RbneIM combines heuristic computing and role embedding to establish a fusion-enhanced IM solution, resulting in an improved influence analysis process. Exploratory outcomes on six social network datasets indicate that the proposed approach outperforms state-of-the-art seeding algorithms in terms of maximizing influence. This finding highlights the effectiveness and efficacy of the proposed method in achieving higher levels of influence within social networks. The code is available at https://github.com/baiyazi/IM2.
Citation: Xu Gu, Zhibin Wang, Xiaoliang Chen, Peng Lu, Yajun Du, Mingwei Tang. Influence maximization in social networks using role-based embedding[J]. Networks and Heterogeneous Media, 2023, 18(4): 1539-1574. doi: 10.3934/nhm.2023068
[1] | Ke Guo, Wanbiao Ma . Global dynamics of an SI epidemic model with nonlinear incidence rate, feedback controls and time delays. Mathematical Biosciences and Engineering, 2021, 18(1): 643-672. doi: 10.3934/mbe.2021035 |
[2] | Yu Ji . Global stability of a multiple delayed viral infection model with general incidence rate and an application to HIV infection. Mathematical Biosciences and Engineering, 2015, 12(3): 525-536. doi: 10.3934/mbe.2015.12.525 |
[3] | Jinliang Wang, Gang Huang, Yasuhiro Takeuchi, Shengqiang Liu . Sveir epidemiological model with varying infectivity and distributed delays. Mathematical Biosciences and Engineering, 2011, 8(3): 875-888. doi: 10.3934/mbe.2011.8.875 |
[4] | Pengyan Liu, Hong-Xu Li . Global behavior of a multi-group SEIR epidemic model with age structure and spatial diffusion. Mathematical Biosciences and Engineering, 2020, 17(6): 7248-7273. doi: 10.3934/mbe.2020372 |
[5] | Pengfei Liu, Yantao Luo, Zhidong Teng . Role of media coverage in a SVEIR-I epidemic model with nonlinear incidence and spatial heterogeneous environment. Mathematical Biosciences and Engineering, 2023, 20(9): 15641-15671. doi: 10.3934/mbe.2023698 |
[6] | Maoxing Liu, Yuhang Li . Dynamics analysis of an SVEIR epidemic model in a patchy environment. Mathematical Biosciences and Engineering, 2023, 20(9): 16962-16977. doi: 10.3934/mbe.2023756 |
[7] | Tingting Xue, Long Zhang, Xiaolin Fan . Dynamic modeling and analysis of Hepatitis B epidemic with general incidence. Mathematical Biosciences and Engineering, 2023, 20(6): 10883-10908. doi: 10.3934/mbe.2023483 |
[8] | N. H. AlShamrani, A. M. Elaiw . Stability of an adaptive immunity viral infection model with multi-stages of infected cells and two routes of infection. Mathematical Biosciences and Engineering, 2020, 17(1): 575-605. doi: 10.3934/mbe.2020030 |
[9] | Thomas Torku, Abdul Khaliq, Fathalla Rihan . SEINN: A deep learning algorithm for the stochastic epidemic model. Mathematical Biosciences and Engineering, 2023, 20(9): 16330-16361. doi: 10.3934/mbe.2023729 |
[10] | Xiao-Min Huang, Xiang-ShengWang . Traveling waves of di usive disease models with time delay and degeneracy. Mathematical Biosciences and Engineering, 2019, 16(4): 2391-2410. doi: 10.3934/mbe.2019120 |
Influence maximization (IM), a central issue in optimizing information diffusion on social platforms, aims to spread posts or comments more widely, rapidly, and efficiently. Existing studies primarily focus on the positive effects of incorporating heuristic calculations in IM approaches. However, heuristic models fail to consider the potential enhancements that can be achieved through network representation learning techniques. Some recent work is keen to use representation learning to deal with IM issues. However, few in-depth studies have explored the existing challenges in IM representation learning, specifically regarding the role characteristics and role representations. This paper highlights the potential advantages of combining heuristic computing and role embedding to solve IM problems. First, the method introduces role granularity classification to effectively categorize users into three distinct roles: opinion leaders, structural holes and normal nodes. This classification enables a deeper understanding of the dynamics of users within the network. Second, a novel role-based network embedding (RbNE) algorithm is proposed. By leveraging the concept of node roles, RbNE captures the similarity between nodes, allowing for a more accurate representation of the network structure. Finally, a superior IM approach, named RbneIM, is recommended. RbneIM combines heuristic computing and role embedding to establish a fusion-enhanced IM solution, resulting in an improved influence analysis process. Exploratory outcomes on six social network datasets indicate that the proposed approach outperforms state-of-the-art seeding algorithms in terms of maximizing influence. This finding highlights the effectiveness and efficacy of the proposed method in achieving higher levels of influence within social networks. The code is available at https://github.com/baiyazi/IM2.
The study of host-virus interactions using dynamical models (within-host models) has improved our understanding of the mechanistic interactions that govern chronic infections caused by pathogens such as human immunodeficiency virus [1,2] and hepatitis B virus[4], and mechanistic interactions that govern acute infections caused by pathogens such as influenza virus [5,6], dengue virus [7,8], Zika virus [10], and severe acute respiratory syndrome coronavirus 2 [11,12]. Regardless of the virus considered, the most basic within-host model has a general structure that includes the interaction between the cells susceptible to the virus, the cells infected by the virus, and the virus at short (acute) and long (chronic) timescales. The emergence of unexpected dynamics in the virus data, new information about the virus' life-cycle, data describing host immunity to the infection, or a combination of some or all of the above, may require addition of complexity into the within-host modeling process (see [13,14] for a review).
Data fitting techniques for simple or complex within-host models use (normalized) least-squares approaches, in which the Euclidean distance between the data and the mathematical model is minimized with respect to the unknown parameters. The first step in the parameter estimation algorithm is to provide an initial guess for each parameter based on prior biological knowledge, such as the duration of eclipse stages, life-span of an infected cell and/or virus in vitro, and knowledge from modeling of virus dynamics of related viruses. When prior knowledge is unknown, the user makes the assumption that any positive parameter guess is acceptable. Then, an optimization search algorithm is employed until a termination criterion is reached. For many within-host mathematical models and corresponding datasets, the optimization is ill-posed due to the structure of the model and/or the frequency of the data [15]. As a result, some parameters may be difficult or impossible to quantify. To determine whether the uncertainty in parameter estimations is due to the model or the data, both structural and practical identifiability questions need to be addressed.
Structural identifiability investigates the ability of a model to reveal its unknown parameters from noise-free infinite amount of data [16,17,18]. When nonstructural identifiability of parameters occurs, it is important to find the source of non-identifiability, such as correlation between model parameters. This allows the user to propose additional assumptions needed to make the model structurally identifiable. Only after the structural identifiability of the unknown parameters is guaranteed, can one conduct data fitting schemes to estimate parameter values.
Practical identifiability investigates the ability of a model to reveal unknown structurally identifiable parameters under scarce and noisy (among subjects) data, often examined using Monte Carlo simulations [18,19,20], the Fisher information matrix (FIM) or correlation matrix [16,21,22,23], Bayesian method [24], and the profile likelihood method [25,26,27]. As with the structural identifiability, it is important to identify whether the practical identifiability issues are due to model structure. Additionally, it is important to determine whether increased data frequency, availability of data measurements for more than one model variable, and/or relaxing restrictions imposed on the unknown parameters can improve practical identifiability issues.
To address these important considerations in model validation, one needs to compare a set of models for the same virus infection system and the same empirical data. Here, we accomplish that by investigating four previously developed models of influenza A virus (IAV) infection in mice [28]. The first three models, all validated with the same virus titer dataset, are ranging from the basic within-host model to models with increased complexity through the addition of nonlinear terms and/or the inclusion of additional variables for the host cell populations infected by the influenza virus. The fourth model is the most complex, due to the addition of both nonlinear terms and variables for the host immune system. This results in a large number of unknown parameters. To compensate for the added complexity, this model is validated with two datasets: the same virus titer data and an additional immune cell population data.
The goal of this study is to determine how model complexity and data availability induce uncertainty in parameter estimates. Using the proposed models as proof of concept, we aim to provide a framework for model validation, from structural to practical identifiability, that can be generalized to other models of virus infections.
We consider four within-host models of acute infections used to describe influenza A virus infection in mice [5]. They all describe the same influenza A virus titer data, but they account for increased biological complexity, as follows. Model 1 assumes that influenza A virus infects all available susceptible target cells before being cleared according to first order infected cells death and viral clearance rates (target cell limitation); Model 2 explains an observed viral biphasic decay in the data by assuming a second order (density dependent) infected cell killing rate; Model 3 explains an observed viral expansion delay in the data by assuming the presence of an eclipse phase; and Model 4 utilizes a secondary immune cells dataset by adding a model population that describes immune-mediated antiviral responses. With each model, we include biological realism that describes the dynamics of virus expansion and decay in more detail, while at the same time increasing model complexity through the addition of nonlinearities and increased numbers of model parameters. The flow charts of the four models are presented in Figure 1. Below we describe all models in detail, and address the ability of accounting for complexity given the available data by investigating structural and practical identifiability of each model considered.
Model 1 is the classical target-cell limitation model of viral infection, which considers the interaction between target cells, infected cells, and virus as follows [5,28]. Target cells, T, interact with the virus, V, at rate β to become infected cells I. Infected cells die at per capita rate δ and produce virus at rate π. Virus is cleared at rate c. Model 1 is described by the system of ordinary differential equations (ODE) Eq (2.1) below,
Model 1: dTdt=−βTV,dIdt=βTV−δI,dVdt=πI−cV, | (2.1) |
with initial conditions T(0)=T0, I(0)=I0, and V(0)=0.
Experimental data has shown that, following peak expansion, virus decays in a biphasic manner. To capture the dynamics of viral decay, a modified death rate has been considered. It assumes that the rate of infected cell clearance increases as the density of infected cells decreases, as described by δ/(Kδ+I), where δ is the maximum per capita death rate and Kδ is infected cell population where death rate is half-maximal [28]. This leads to the modified target-cell limitation Model 2 given by the ODE system Eq (2.2) below,
Model 2: dTdt=−βTV,dIdt=βTV−δKδ+II,dVdt=πI−cV, | (2.2) |
with initial conditions T(0)=T0, I(0)=I0, and V(0)=0.
It was observed experimentally that, following influenza A virus exposure, there is a delay between infection of target cells and viral production by infected cells [29]. The delay was accounted for by assuming that, upon infection, cells enter an eclipse phase I1, where cells are infected but do not produce virus. They become productively infected I2 after 1/k days [6], where 1/k is the average time spent in eclipse phase. This leads to the target-cell limitation model with eclipse phase Model 3 given by the ODE system Eq (2.3) below,
Model 3: dTdt=−βTV,dI1dt=βTV−kI1,dI2dt=kI1−δKδ+I2I2,dVdt=πI2−cV, | (2.3) |
with initial conditions T(0)=T0, I1(0)=I0, I2(0)=0, and V(0)=0.
The first three models do not explicitly account for any immune responses, but indirectly assume infected cell loss at nonspecific rate δ (or δ/(Kδ+I2)) and viral clearance at nonspecific rate c. The observed biphasic viral decay captured by Models 2 and 3 given by Eqs (2.2) and (2.3), however, has the additional feature that the timing of the second phase viral decay coincides with the development of adaptive immune cells in the form of CD8+ T cells, which are responsible for killing infected cells and resolving the infection [5]. To account for adaptive immunity (especially in the presence of immune cell data), an additional variable E is considered. It only accounts for the effector CD8+ T cell population (and ignores the memory CD8+ T cell population), as follows. In the absence of infection, a baseline of influenza A virus-specific effector CD8+ T cells are present, E(0)=E0. Infection results in recruitment of additional effector CD8+ T cells at a rate proportional to the productively infected cells I2. This is modeled in a density dependent manner at rate λ/(KE+E), where λ is the maximum influx and KE is the effector CD8+ T cell population where the influx is half-maximal. Effector CD8+ T cells proliferate in the presence of infection. This is modeled by a delayed term ηI2(t−τI)E, which assumes that expansion occurs following interaction between effector CD8+ T cells and cells that became productively infected τI days ago. To account for effector CD8+ T cells function, the model assumes that effector CD8+ T cells kill infected cells in a density dependent manner modeled by the term δE/(Kδ+I2), where δE is the maximum per capita killing rate and Kδ is the I2 concentration where the killing is half-maximal. A nonspecific infected cell killing rate δ is still considered. The resulting delay differential equations (DDE) immune model is described by the DDE system Eq (2.4) below,
dTdt=−βTV,dI1dt=βTV−kI1,dI2dt=kI1−δI2−δEKδ+I2EI2,dVdt=πI2−cV,dEdt=λKE+EI2+ηEI2(t−τI)−dEE, | (2.4) |
with initial conditions T(0)=T0, I1(0)=I0, V(0)=0, E(0)=E0, and I2(t)=0 for −τI≤t≤0.
To unify the goal of investigating uncertainty in parameter estimates when fitting ODE systems of virus dynamics to data, we first approximate the DDE system given by Eq (2.4) with an ODE system as follows [30]. For a delay of τI days, we incorporate n dummy variables which all span τI/n days in the variable I2's dynamics. Briefly, we let yi be the productively infected cell populations at times t−inτI days post infection, for i=1,...,n, and consider the following ODE system for dummy variables yi(t),
dy1dt=I2−nτIy1,⋮dyidt=nτIyi−1−nτIyi,⋮dyndt=nτIyn−1−nτIyn, | (2.5) |
with yi(0)=0 for i=1,...,n. Then, the delayed productively infected cell population is given by
I2(t−τI)≈yn(t). | (2.6) |
Without loss of generality, we assume n=3. The corresponding immune Model 4 is given by the ODE system Eq (2.7) below,
Model 4: dTdt=−βTV,dI1dt=βTV−kI1,dI2dt=kI1−δI2−δEKδ+I2EI2,dVdt=πI2−cV,dEdt=λKE+EI2+ηEy3−dEE,dy1dt=I2−3τIy1,dy2dt=3τIy1−3τIy2,dy3dt=3τIy2−3τIy3, | (2.7) |
with initial conditions T(0)=T0, I1(0)=I0, I2(0)=0, V(0)=0, E(0)=E0, and yi(0)=0 for i=1,2,3.
To study the structural identifiability of the Models 1–4, we rewrite them in the following general form
x′(t)=f(x,p), | (3.1) |
and the observations as
y(t)=g(x,p). | (3.2) |
Here, x denotes the state variables, p is the parameter vector, and y is the output (given by the empirical data), also called the observations. The generic model given by Eq (3.1) is called structurally identifiable if the parameter vector p can be determined uniquely from the observations given by the smooth curve y(t). Otherwise, it is said to be unidentifiable. The formal definition of structural identifiability is provided below.
Definition 1. Let p and ˆp be two distinct parameter vectors. Model Eq (3.1) is said to be globally (uniquely) structurally identifiable if and only if,
g(x,p)=g(x,ˆp)impliesp=ˆp. |
Definition 2. Model Eq (3.1) is said to be locally structurally locally identifiable if for any p within an open neighborhood of ˆp in the parameter space,
g(x,p)=g(x,ˆp)impliesp=ˆp. |
Various methods have been proposed for analyzing the structural identifiability of ODE models [16,17,31]. In this study, we employ the differential algebra approach. It performs the elimination of unobserved state variables, resulting in equations expressed as functions of model parameters and observed state variables. These are referred to as the input-output equations, and are differential-algebraic polynomials consisting of the outputs, y(t), with model parameters, p, as coefficients. The formal definition of structural identifiability within the differential algebra approach for model Eq (3.1) is provided below.
Definition 3. Let c(p) denote the coefficients of the input-output equation corresponding to model Eq (3.1). We say that model Eq (3.1) is structurally identifiable from unlimited observations y(t) if and only if,
c(p)=c(ˆp)impliesp=ˆp. |
Studying structural identifiability of ODE models using the differential algebra methods can be accomplished using several platforms and available open-source software. Here, we present three such platforms: the differential algebra for identifiability of system (DAISY) [32], the identifiable combinations web application (COMBOS) [33], and the StructuralIdentifiability.jl in JULIA [34].
There are many similarities among the three methods. All of them offer insights into the structural identifiability status of each parameter by categorizing them into locally identifiable, globally identifiable, or non-identifiable. They employ a differential elimination method to calculate input-output equations of the considered system, and test the one-to-one map between the coefficients of the input-output equations and model parameters. COMBOS and the StructuralIdentifiability.jl package in JULIA are superior to DAISY, as they provide globally identifiable parameter correlations in an otherwise non-identifiable system. Even though DAISY does not print parameter correlations, the correlations can be derived using the coefficients of the input-output equations and algebraic manipulations in software such as MATHEMATICA. Of the three software, COMBOS does not print the input-output equations, making for a faster (yet more opaque) platform. Previous studies have shown that COMBOS works best for small to medium-size models and is not assured for models with large parameter vectors [33,35,36]. While highly similar, it is up to the user to determine which software is best suited for studying the identifiability of the models considered.
To determine whether the considered models can reveal their parameters, we examine the structural identifiability of Models 1–3, given by Eqs (2.1)–(2.3), under unlimited observations of viral load and the structural identifiability of Model 4, given by Eq (2.7), under unlimited combined observations of viral load and effector CD8+ T cell concentrations. We used the differential algebra software DAISY.
We assume that all Model 1's parameters p={β,δ,π,c} are unknown and that we have unlimited empirical observations of the viral load, y(t)=V(t). Using DAISY [32], we obtain the following input-output equation in variable V and model parameters p,
0=V‴V−V″V′+V″V2β+V″V(c+δ)−V′2(c+δ)+V′V2β(c+δ)+V3βcδ. | (3.3) |
By Definition 3, we need to examine whether another set of parameters, ˆp={ˆβ,ˆδ,ˆπ,ˆc} can produce the same empirical observation V(t), making the map from the parameter space p to the coefficients of input-output equation Eq (3.3) one-to-one. The coefficients of input-output equation Eq (3.3) are c(p)={β,c+δ,cδ}. To determine whether the map from the parameter space p to the coefficients c(p) is one-to-one, we set c(p)=c(ˆp), which is the following system:
{β=ˆβ,c+δ=ˆc+ˆδ,cδ=ˆcˆδ}. | (3.4) |
Solving Eq (3.4) results in the following two sets of solutions:
S1:{β=ˆβ,c=ˆc,δ=ˆδ},S2:{β=ˆβ,c=ˆδ,δ=ˆc}. |
Hence, only the infection rate β is globally structurally identifiable, while the infected cells killing rate δ and the virus clearance rate c are locally identifiable. Lastly, the virus production rate π does not appear in the input-output equation Eq (3.3). Therefore, it is not structurally identifiable. We summarize the results for Model 1 below (see Table 1).
Model | Observed states | DAISY | JULIA | COMBOS |
Model 1 unknown initial conditions |
V(t) | β globally identifiable {c,δ} locally identifiable Correlations: c+δ=ˆc+ˆδ, cδ=ˆcˆδ |
β globally identifiable {c,δ} locally identifiable π nonidentifiable Correlations: c+δ=ˆc+ˆδ, cδ=ˆcˆδ |
β globally identifiable {c,δ} locally identifiable with 2 solutions |
Model 2 unknown initial conditions |
V(t) | {β,c} globally identifiable Correlations: δπ=ˆδˆπ, πKδ=ˆπ^Kδ |
{β,c} globally identifiable {π,Kδ,δ} nonidentifiable Correlations: δπ=ˆδˆπ, πKδ=ˆπ^Kδ |
{β,c,Kδπ,δπ} globally identifiable |
Model 3 unknown initial conditions |
V(t) | {β,c,k} globally identifiable Correlations: δKδ=ˆδ^Kδ, πKδ=ˆπ^Kδ |
{β,c,k} globally identifiable {π,Kδ,δ} nonidentifiable Correlations: δπ=ˆδˆπ, πKδ=ˆπ^Kδ |
{β,c,Kδπ,δπ} globally identifiable |
Model 4 unknown initial conditions |
V(t),E(t) | {β,c,dE,δ,k,KE,τI} globally identifiable Correlations: δEη=ˆδEˆη, Kδη=ˆKδˆη, λη=ˆλˆη, πη=ˆπˆη |
{β,c,dE,δ,k,KE,τI} globally identifiable {δE,Kδ,π,λ,η} nonidentifiable Correlations: Kδπ=ˆKδˆπ, Kδλ=ˆKδˆλ, ηKδ=ˆηˆKδ, δEπ=ˆδEˆπ |
identifiability unknown |
All models known initial conditions except for E0 |
V(t) V(t),E(t) |
Models 1–3 globally identifiable Model 4 identifiability unknown |
Models 1–4 globally identifiable |
Models 1–4 identifiability unknown |
Proposition 1. Model 1 given by Eq (2.1) is not structured to identify all of its parameters from unlimited viral load observations, V(t). More precisely, parameter β is globally structurally identifiable, parameters c and δ are locally structurally identifiable, and parameter π is not structurally identifiable. Moreover, Model 1 is globally structural identifiable under known initial conditions.
We assume that all parameters p={β,δ,Kδ,π,c} of Model 2, given by Eq (2.2), are unknown and that we have unlimited empirical observations of the viral load, y(t)=V(t). Using DAISY, we obtain the following input-output equation,
0=V‴V′2V+2V‴V′V2c+2V‴V′VKδπ+V‴V3c2+2V‴V2cKδπ+V‴VK2δπ2−V″V′3+V″V′2V2β−V″V′2Vc−2V″V′2Kδπ+2V″V′V3βc+V″V′V2(2βKδπ+c2)−V″V′K2δπ2+V″V4βc2+V″V3c(2βKδπ+c2)+V″V2Kδπ(βKδπ+2c2)+V″VKδπ2(cKδ+δ)−V′4c+V′3V2βc−2V′3Vc2+V′3π(−2cKδ−δ)+2V′2V3βc2+V′2V2(2βcKδπ+βδπ−c3)−2V′2Vcπ(cKδ+δ)−V′2Kδπ2(cKδ+δ)+V′V4βc3+2V′V3βcπ(cKδ+δ)+V′V2π(βcK2δπ+βδKδπ−c2δ)+V4βc2δπ+V3βcδKδπ2. | (3.5) |
As before, we examine whether another set of parameters, ˆp, can produce the same empirical observation V(t), making the map from the parameter space p to the coefficients of input-output equation Eq (3.5) one-to-one. If we set c(p)=c(ˆp), we obtain
c=ˆc,β=ˆβ,Kδπ=ˆKδˆπ,cπ(cKδ+δ)=ˆcˆπ(ˆc^Kδ+ˆδ),βc2δπ=ˆβˆc2ˆδˆπ,βcKδπ+βδπ−c3=ˆβˆc^Kδˆπ+ˆβˆδˆπ−ˆc3,Kδπ2(ckδ+δ)=^Kδˆπ2(ˆc^Kδ+ˆδ), |
with solutions
{c=ˆc,β=ˆβ,δπ=ˆδˆπ,πKδ=ˆπˆKδ}. |
Hence, Model 2 is not structurally identifiable. In particular, infection rate β, viral clearance rate c, and the products δπ, Kδπ (but not the individual parameters δ, π, and Kδ) are globally identifiable. Since the correlations δπ and Kδπ are known, fixing one of these parameters can make model Eq (2.2) identifiable. We summarize the structureal identifiability results for Model 2 below (see Table 1).
Proposition 2. Model 2 given by Eq (2.2) is not structured to identify all of its parameters from unlimited viral load observations, V(t). More precisely, parameters β and c are globally structurally identifiable. Moreover, the parameter products δπ and Kδπ are globally identifiable. Since the correlations are known, fixing δ, π, or Kδ makes the Model 2 globally structurally identifiable from unlimited observations V(t). Moreover, Model 2 is globally structural identifiable under known initial conditions.
We assume that all parameters p={β,δ,k,δEKδ,π,c} of Model 3, given by Eq (2.3), are unknown and that we have unlimited empirical observations of the viral load, y(t)=V(t). Using DAISY, we can derive the input-output equations (they are too messy and will not be presented here). As before, we examine whether another set of parameters, ˆp, can produce the same empirical observation V(t), making the map from parameter space p to coefficients of input-output equation (not shown) one-to-one. If we set c(p)=c(ˆp), we obtain
{c=ˆc,β=ˆβ,k=ˆk,δKδ=ˆδ^Kδ,πKδ=ˆπ^Kδ}. |
Hence, Model 3 is not structurally identifiable. In particular, the infection rate β, the eclipse parameter k, the viral clearance rate c, the ratio δ/Kδ, and the product Kδπ (but not the individual parameters δ, π, and Kδ) are globally identifiable. Since the correlations are known, fixing one of these parameters makes the model Eq (2.3) identifiable. We summarize the results for Model 3 below (see Table 1).
Proposition 3. Model 3 given by Eq (2.3) is not structured to identify all of its parameters from unlimited viral load observations, V(t). More precisely, parameters β, k, and c are globally structurally identifiable. Moreover, the parameter ratio δ/Kδ and the parameter product Kδπ are globally identifiable. Since the correlations are known, fixing the parameter δ, π, or Kδ makes Model 3 globally structurally identifiable from unlimited observations V(t). Moreover, Model 3 is globally structural identifiable under known initial conditions.
To study the structural identifiability of Model 4 (given by Eq (2.7)), we assume that all parameters, p={β,δ,k,δEKδ,π,cλ,η,dE,τI,E0}, are unknown and that we have unlimited empirical observations for the viral load y1(t)=V(t) and the effector cell CD8+ T cell data y2(t)=E(t). Using DAISY, we can obtain input-output equations (they are messy and will not be presented here). As before, we examine whether another set of parameters, ˆp, can produce the same empirical observations V(t) and E(t), making the map from the parameter space p to the coefficients of input-output equations (not shown) one-to-one. If we set c(p)=c(ˆp), we obtain
{c=ˆc,β=ˆβ,k=ˆk,dE=ˆdE,δ=ˆδ,KE=ˆKE,τI=ˆτI,δEη=ˆδEˆη,Kδη=ˆKδˆη,λη=ˆλˆη,πη=ˆπˆη}. |
Hence, Model 4 is not structurally identifiable. In particular, the infection rate β, the eclipse parameter k, the viral clearance rate c, the effector cells death rate dE, the generic killing rate δ, the half-maximal level KE, the delay τI, the ratios λ/η and π/η, and the products dEη and Kδη (but not the individual parameters δE,η,Kδ,π,λ) are globally identifiable. If the parameter η is fixed, then the model Eq (2.7) becomes identifiable. We summarize the results for Model 4 below (see Table 1).
Proposition 4. Model 4 given by Eq (2.7) is not structured to identify all of its parameters from unlimited viral load and effector cell observations, V(t) and E(t). More precisely, parameters β, k, c, dE, δ, KE, and τI are globally structurally identifiable. Moreover, the parameter ratios λ/η and π/η and the parameter products dEη and Kδη are globally identifiable. If the parameter η is fixed, then Model 4 becomes globally structurally identifiable from unlimited observations V(t) and E(t).
We do not know (from DAISY) whether knowing initial conditions guarantees global stability of Model 4 (see Table 1).
Studying structural identifiability of ODE models can be achieved using software other than DAISY To determine how these methods compare, results from three platforms, DAISY, COMBOS [33], and StructuralIdentifiability.jl in JULIA [34], for Models 1–4 are presented side by side in Table 1.
We find that all three software uncover the same structural identifiability results for Models 1–3. On the other hand, DAISY and StructuralIdentifiability.jl in JULIA uncover the same identifiability results (while COMBOS cannot find results) for Model 4 under unknown initial conditions. Even though Models 3 and 4 employ different interpretations of the parameter correlations among platforms, simple algebraic manipulations show that the obtained correlations are equivalent. Given the similarity in the results among Models 1–3, it is up to the user to decide which of the three software is best suited for their analysis. Similarly, given the similarity in the results among DAISY and StructuralIdentifiability.jl in JULIA for Model 4 with unknown initial conditions, it is up to the user to decide which of the two software is best suited for their analysis. However, only StructuralIdentifiability.jl in JULIA can be used to determine the structural identifiability of Model 4 with unknown E0 and known other initial conditions. Hence, for larger systems with nonlinear terms of interactions, this method should be employed.
We use previously published longitudinal influenza A infectious virus titer and CD8+ T cell data in mice from Smith et al. [5]. Adult mice were inoculated intranasally with 75 TCID50 of mouse adapted influenza A/Puerto Rico/8/34 (H1N1) (PR8) virus.
Total infectious virus (log10 TCID50 per lung) was measured for ten mice each day. Nine days after inoculation, the infectious virus was no longer detectable in any of the mice. Therefore, we only consider infectious virus titer data from the first nine days post inoculation in our analyses. We let E(Vdata(i)) be the mean infectious virus titer data at day i={1,...,9} and Var(Vdata(j)) be the infectious virus titer variance at days i={1,...,9} among the ten mice.
Moreover, total effector CD8+ T cells (cells per lung) were measured daily for five mice. Since influenza A-specific effector CD8+ T cells were detectable for all twelve days of the study, we consider effector CD8+ T cells data from the first twelve days post inoculation in our analyses. We let E(Edata(j)) be the mean CD8+ T cell data (per lung) at day j={1,...,12} and Var(Edata(j)) be the CD8+ T cell data variance at days j={1,...,12} among the five mice.
For all models, we assume known initial conditions T(0)=107 cells/ml, I(0)=75 cells/ml, and V(0)=0 virus/ml as in [5]. For Models 3 and 4, we additionally assume that I2(0)=0, and for Model 4, we assume yi(0)=0, for i=1,2,3. For Model 4, we assume E(0)=E0 is unknown, therefore adding E0 to the parameter vector to be estimated from the data. Lastly, we assume all parameters are unknown. When parameters are either very large or very small, we estimate their value on natural logarithmic scale. In particular, we estimate p1={ln(β),δ,π,c} for Model 1, p2={ln(β),ln(δ),ln(Kδ),π,c} for Model 2, p3={ln(β),ln(δ),ln(Kδ),π,c,k} for Model 3, and p4={ln(β),δ,ln(Kδ),π,c,k,δE,ln(η),ln(λ),dE,ln(KE),τI,ln(E0)} for Model 4.
To estimate parameters p1–p3, we fit the predicted viral load log10Vwmodel(i) given by Models 1–3 to the longitudinal mean (among the ten mice) infectious virus (log10 TCID50 per lung) E(Vdata(i)), knowing that the variance in the data at day i is Var(Vdata(i)), for i={1...9} days. We assume that the data satisfies the following statistical model [19,37]
E(Vdata(i))=log10Vwmodel(i,pw)+ϵi√Var(Vdata(i)), | (4.1) |
where Vwmodel(i) is the predicted virus trajectory given by Model w at days i={1,...,9} post infection; p1={ln(β),δ,π,c}, p2={ln(β),ln(δ),ln(Kδ),π,c}, and p3={ln(β),ln(δ),ln(Kδ),π,c,k}; and ϵi are independent and identically distributed with mean zero and standard deviation σ. Given the statistical model Eq (4.1), we assume that the measured data, E(Vdata(i)), follows a normal distribution with a mean equal to the model prediction log10Vwmodel(i) and with variance equal to σ2Var(Vdata(i)). Moreover, the availability of measurements from several animals that vary with time allows us to account for the change in data variance over time, Var(Vdata(i)). Therefore, we consider the following functional (weighted residual sum of squares), to estimate the model parameters,
RSSw(pw)=9∑i=1(log10Vwmodel(i,pw)−E(Vdata(i)))2Var(Vdata(i)). | (4.2) |
Consequently, parameters of Models 1–3 are estimated by minimizing the weighted least-squares given by
Jw(pw)=minpwRSSw(pw). | (4.3) |
Moreover, to estimate parameters p4, we fit both the predicted viral load log10V4model(i), given by Model 4, to the longitudinal mean (among ten mice) infectious virus E(Vdata(i)) (knowing that the variance in the data at days i={1...9} is Var(Vdata(i))) and the predicted effector cell population log10E4model(j), given by Model 4, to the longitudinal mean (among five mice) CD8+ T cell data E(Edata(j)) (knowing that the variance in the data at days j={1...12} is Var(Edata(j))). We assume that the data is satisfying the following statistical model [19,37]
E(Vdata(i))=log10V4model(i,p4)+ϵi√Var(Vdata(i)), | (4.4) |
E(Edata(j))=log10E4model(j,p4)+ηj√Var(Edata(j)), | (4.5) |
where V4model(i) is the predicted virus trajectory given by Model 4 at days i={1,...,9} post infection, E4model(j) is the predicted CD8+ T cell population given by Model 4 at days j={1,...,12} post infection, and p4={ln(β),δ,ln(Kδ),π,c,k,δE,ln(η),ln(λ),dE, ln(KE),τI,ln(E0)}. Here, ϵi and ηj are independent and identically distributed with mean zero and standard deviations σV and σE, respectively. As before, the measured data E(Edata(j)) follows a normal distribution whose mean is the model prediction log10Ewmodel(i) and whose variance is σ2EVar(Edata(j)). We consider the following functional (weighted residual sum of squares),
RSS4(p4)=u19∑i=1(log10V4model(i,p4)−E(Vdata(i)))2Var(Vdata(i))+u212∑j=1(log10Emodel(j,p4)−E(Edata(j)))2Var(Edata(j)). | (4.6) |
Consequently, parameters of Model 4 are estimated by minimizing the weighted least-squares given by,
J4(p4)=minp4RSS4(p4). | (4.7) |
Note that we weighted the virus and effector cells contributions, with weights u1=1 and u2=maxjVar(Edata(j))/maxiVar(Vdata(i)). We minimize all least-square functionals RSSw using the fmincon function in MATLAB with ranges for parameters pw given in Table 2.
Parameter | Model 1 | Model 2 | Model 3 | Model 4 |
β×10−5 | 0.1–10 | 0.1–10 | 0.1–10 | 0.1–10 |
δ | 0–25 | 102–108 | 102–108 | 0–15 |
Kδ | - | 103–107 | 103–107 | 101–105 |
π | 0–102 | 0–102 | 0–102 | 0–102 |
c | 0–25 | 0–25 | 0–25 | 0–25 |
k | - | - | 4–6 | 4–6 |
δE | - | - | - | 0–25 |
η×10−7 | - | - | - | 10−2–100 |
λ×103 | - | - | - | 10−2–102 |
dE | - | - | - | 0–25 |
KE×106 | - | - | - | 0.1–103 |
τI | - | - | - | 0–10 |
E0×103 | - | - | - | 0.1–10 |
To compare Models 1–4, we calculate the corrected Akaike information criteria (AICc), given below
AICc=nln(Jwn)+2(M+1)+2(M+1)(M+2)n−M, | (4.8) |
where n is the number of data points used to estimate parameters pw and M is the number of estimated parameters. In Models 1–3, n=9 and M=4,5, and 6, respectively. In Model 4, n=21 and M=13.
To quantify the uncertainty associated with predicted solutions of each model, we perform parametric bootstrapping. It is a simulation-based method which assumes that data comes from a known distribution with unknown parameters. For Models 1–3, we assume that the predicted viral population for best parameter estimates, log10Vwmodel(i,pw), is the mean of the data's normal distribution and σ2Var(Vdata(i)) is its variance (see Eq (4.1)). Then, σ can be approximated as follows,
σ2≈1n−Mn∑i=1(log10Vwmodel(i,pw)−E(Vdata(i)))2Var(Vdata(i)), |
(see Banks et al. for a full derivation [37]). Here, n=9 is the number of viral samples and M is the number of parameters (M=4, M=5, and M=6 for Models 1–3, respectively). To find a confidence region in our model predictions, we generate 1000 simulated datasets using the distribution space given by Eq (4.1), and fit Models 1–3 to each datasets.
Similarly, for Model 4, assuming that viral data and effector cell data come from distributions with means log10V4model(i,p4) and log10E4model(i,p4) (the predicted variables for best parameter fits) and that σ2VVar(Vdata(i)) and σ2EVar(Edata(j)) are the variances, then
σ2V≈1ntot−MnV∑i=1(log10V4model(i,p4)−E(Vdata(i)))2Var(Vdata(i)), |
and
σ2E≈1ntot−MnE∑j=1(log10(Emodel(j,p4)−E(Edata(j)))2Var(Edata(j)), |
as before. Here, nV=9 is the number of viral samples, nE=12 is the number of CD8+ T cell samples, ntot=nV+nE=21 is the number of total data samples, and M=13 is the number of parameters fitted.
We fitted Models 1–3 to previously published longitudinal influenza A infectious virus titer and we fitted Model 4 to both longitudinal influenza A infectious virus titer and longitudinal CD8+ T cell data in infected mice [5], using a normalized least-square optimization algorithm (see Section 4). The results from fitting V(t) given by Models 1–3 to viral load data are shown in Figure 2A–C and the best parameter fits are given in Table 3. The results from fitting both V(t) and E(t) given by Model 4 to viral titer and effector cell data are shown in Figure 2D and the best parameter fits are given in Table 3. Model selection, using the corrected AICc, predicts that Model 4 best describes the data (see Table 3). To quantify the uncertainty associated with predicted solutions of each model, we find a 90% confidence region in our model predictions (see Section 4.5), illustrated by shaded gray areas in Figure 2A–C for Models 1–3 and by gray and blue shaded regions in Figure 2D for Model 4. We see large error regions in virus population predictions for all models during the decay phase (gray regions in Figure 2A–D). Moreover, Models 2–4 better capture the virus population expansion phase compared to Model 1 (gray regions in Figure 2B–D versus gray region in Figure 2A). Lastly, the largest error in CD8+ T cell prediction in Model 4 occurs in the second week of infection (blue region in Figure 2D).
Parameter | Model 1 | Model 2 | Model 3 | Model 4 |
β×10−5 | 0.48 | 6.88 | 6.20 | 8.39 |
δ | 1.47 | 1.59×106 | 1.72×106 | 0.196 |
Kδ | - | 4.69×104 | 2.37×105 | 5.49×103 |
π | 1.18 | 0.86 | 1.69 | 0.69 |
c | 1.48 | 6.49 | 12.34 | 6.19 |
k | - | - | 4.82 | 5.02 |
δE | - | - | - | 14.20 |
η×10−7 | - | - | - | 3.59 |
λ×103 | - | - | - | 1.31 |
dE | - | - | - | 0.20 |
KE×106 | - | - | - | 9.68 |
τI | - | - | - | 1.69 |
E0×103 | - | - | - | 1.21 |
Jw | 21.12 | 2.09 | 2.39 | 2.14 |
AICc | 37.67 | 40.88 | 114.1 | 36.78 |
While structural identifiability investigates whether parameters can be uniquely determined from a model given unlimited data in the absence of measurement error or noise, practical identifiability determines whether parameters can be accurately identified in real-world scenarios, where observed discrete and variable among subject data is contaminated with measurement errors. We and others have employed several methods to study practical identifiability of within-host and infectious disease models, such as Monte Carlo simulations [18,19,20], Fisher information matrix or correlation matrix [16,21,22,23], and Bayesian methods [24]. In this study, we use the profile likelihood method [25,38] described in detail below.
Consider that the vector of parameters p is partitioned into p = (r, s), where r represents the parameter whose practical identifiability we are investigating and s represents the vector of remaining parameters. The profile likelihood of r is given by,
PL(r)=minsRSS(r,s), | (6.1) |
where RSS is the objective functional used for data fitting (in our case, Eq (4.2) for Models 1–3 and Eq (4.6) for Model 4). In other words, PL(r) finds the minimum of the objective functional RSS(r,s) for an array of fixed r values over the space of the remaining parameters s. The shape of PL(r) informs the identifiability of r, with a u-shaped PL(r) that exceeds a threshold (corresponding to a chosen confidence level) indicating practical identifiability of r and a flat PL(r) indicating nonpractical identifiability of r.
We estimate PL(r) over a mesh of equally spaced values of r, centered at the best-fit estimate ˆr, with the number of mesh points chosen to have enough data to generate a confidence interval for ˆr, as follows. If we consider a model with parameters p unknown and a model with parameters s unknown, we obtain two nested models that differ by parameter r. It has been shown that the likelihood ratio of the nested models converges to a χ2 distribution with one degree of freedom, df=1 (see Corrollary 2 in [38] for more detail). This helps us define the Δ-level likelihood-based confidence interval for parameter r to be
CI={r|PL(r)<J+Δ}, | (6.2) |
where Δ is the percentile of the χ2 distribution with df=1, and J is the weighted least-squares functional at the best parameter estimate [38]. This can be summarized, as follows.
Definition 4. Let CI={r|PL(r)<J+Δ} be the likelihood-based confidence interval for parameter r.
i. If CI⊂[r1,r2], where r1 and r2 are finite, then parameter r is practically identifiable.
ii. If either r1 or r2 is infinite, then parameter r is not practically identifiable.
A model is practically identifiable if all parameters are practically identifiable.
For Models 1–3, we generated the profile likelihoods PL(r) for each parameter r∈{pw} for w={1,2,3} by fitting the functional RSSw(pw) given by Eq (4.2) and Model w to mean population viral load data. We obtained best estimates for the remaining parameters s over a mesh of equally spaced, known r values. Similarly, for Model 4, we generated the profile likelihood PL(r) for unknown parameters r∈{p4} by fitting functional RSS4(p4) given by Eq (4.6) simultaneously to the mean population viral titer and the mean population effector cell data. We obtained best estimates for the remaining parameters s over a mesh of equally spaced, known r values.
Additionally, to further explore the relationship between data availability and practical identifiability, we generated profile likelihoods for parameters pw and Model w using simulated, noise-free, high frequency datasets. In particular, we assumed that the virus titer data was collected every fourteen minutes (for a total of 899 evenly spaced points) for Models 1–3 and that both the virus titer data and the effector cell data were collected every fourteen minutes (for a total of 899 evenly spaced points of each data type) for Model 4. We fitted each model to this high frequency data and generated profile likelihoods of the resulting parameter values. In all of our models, we chose Δ to be the 90-th percentile of the χ2-distribution, χ290 [38]. This guaranteed us a 90% confidence interval in the estimated parameter r.
Since we determined that Models 1–4 are structurally identifiable under known initial conditions and unlimited data, we were allowed to search for best estimates for all models' parameters. The resulting fitting routine may still be ill-posed, given that the data consisted of discrete (daily) datasets that varied among the infected mice, rather than the unlimited data required by the structural identifiability theory. Therefore, we performed practical identifiability for Models 1–4 under the discrete subject data in [28]. We used the Profile Likelihood practical identifiability method [23,25,26,27], which has the advantage of not only determining whether a parameter is practically identifiable, but also of determining a 90% confidence interval for the parameter space where practical identifiability occurs.
When using the empirical population mean virus titer data in [5] for Model 1, we found that β is practically identifiable with 90% confidence interval CI⊂[1.64×10−6,2.94×10−5] and π is practically identifiable with 90% confidence interval CI⊂[0.31,3.18], respectively. On the other hand, both δ and c are not practically identifiable, with identical 90% confidence intervals CI⊂[0.87,∞] (see Figure 3A). Adding high frequency data and rerunning the profile likelihood analysis resulted in practical identifiability of all four parameters, consistent with the structural identifiability results for Model 1 (see Figure 3B).
Similar to Model 1, when using the empirical population mean virus titer data in [5] for Model 2, we found that δ, Kδ, and π are practically identifiable with 90% confidence intervals CI⊂[1.41×106,2.06×106], CI⊂[1.74×104,4.19×105], and CI⊂[0.44,5.24], respectively. On the other hand, both β and c are not practically identifiable, with 90% confidence intervals CI⊂[6.62×10−6,∞] and CI⊂[3.98,∞], respectively (see Figure 3C). Adding high frequency data and rerunning the profile likelihood analysis resulted in practical identifiability of all five parameters, consistent with the structural identifiability results for Model 2 (see Figure 3D).
For Model 3, when using the empirical population mean virus titer data in [5], we found that β, δ, Kδ, and π are practically identifiable with 90% confidence intervals CI⊂[2.17×10−5,1.60×10−4], CI⊂[1.47×106,2.03×106], CI⊂[1.59×105,4.32×105], and CI⊂[1.07,4.49], respectively. On the other hand, both c and k are not practically identifiable, with 90% confidence intervals CI⊂[3.98,∞] and CI⊂[∞,∞], respectively (see Figure 3E). Adding high frequency data and rerunning the profile likelihood analysis did not result in practical identifiability of all six parameters (see Figure 3F). However, if we additionally relaxed constraints on parameters c and k to range in the [0,1000] and [0,50] intervals, compared to the constraints chosen in Table 2, we observed practical identifiability of all five parameters, consistent with the structural identifiability results for Model 3 (see Figure 4).
For Model 4, when using the discrete empirical population mean virus titer data and the empirical population mean effector cell data in [5] simultaneously, we found that π and E0 are practically identifiable with 90% confidence intervals CI⊂[0.29,2.49] and CI⊂[66,5.41×10−4]. Parameters k, λ, and KE are not practically identifiable with the same 90% confidence interval CI⊂[−∞,∞]. Parameters β, δE, Kδ, η, and c are also not practically identifiable with 90% confidence intervals CI⊂[8.00×10−6,∞], CI⊂[1.10,∞], CI⊂[42.98,∞], CI⊂[5.37×10−8,∞], and CI⊂[3.89,∞], respectively. Lastly, parameters δ, dE, and τ are not practically identifiable on the positive domain with an undefined lower bound (ULB) for the 90% confidence interval and a finite upper bound. In particular, CI⊂[ULB,0.63], CI⊂[ULB,3.20], CI⊂[ULB,6.99], for δ, dE, and τI, respectively (see Figure 5A). Adding high frequency data and rerunning the profile likelihood analysis resulted in practical identifiability of all thirteen parameters, consistent with the structural identifiability results for Model 4 (see Figure 5B).
In this study, we investigated the conditions needed to ensure model identifiability in four models of influenza A virus dynamics in infected mice. To apply the same methodology and software, all considered models were either modeled by systems of ODEs (Models 1–3 given by equations Eqs (2.1)–(2.3)) or approximated by a system of ODEs (Model 4 given by Eq (2.7)). The considered models differ in the number of equations (corresponding to the number of variables) from three in Models 1 and 2, to four in Model 3 to eight in Model 4. Consequently, the number of unknown parameters is a maximum of four in Model 1, a maximum of five in Model 2, a maximum of six in Model 3, and a maximum of thirteen in Model 4. Lastly, the terms of interaction include only mass-action and linear terms for Model 1 and mass-action, linear terms, and density dependence terms for Models 2–4.
We found that the increased complexity needed to capture biological realism comes with a cost. It resulted in increased uncertainty in parameter estimates not only when discrete and noisy virus and immune cell empirical data is used for validation but also when we assumed (hypothetically) that unlimited data is available. This means that data fitting should not be conducted until it is established that parameters can be revealed from unlimited data under the considered model structure. In other words, the first step in the model validation is determining whether all unknown parameters are structurally identifiable (see Figure 6).
When it comes to investigating the structural identifiability of systems of ODEs several software platforms are available. Here, we compared results from three of them: DAISY [32], COMBOS [33], and StructuralIdentifiability.jl in JULIA [34]. For Models 1–3 and unlimited virus titer data, we found the same classification for the structurally identifiable parameters and the same (or equivalent) correlations between the nonstructurally identifiable parameters, regardless of the software used (Table 1). For the more complex Model 4 and unlimited virus titer and effector CD8+ T cell data, however, only StructuralIdentifiability.jl in JULIA found that the model is structurally identifiable under known (with the exception of initial effector population, E0) initial conditions (Table 1). When initial conditions are unknown, we found identical classification for structurally identifiable parameters and equivalent correlations between the nonstructurally identifiable parameters among StructuralIdentifiability.jl in JULIA and DAISY (Table 1). COMBOS cannot handle the structural stability analyses for Model 4, regardless of whether initial conditions are known or not (Table 1). While increased difficulty in analyzing the structural identifiability of Model 4 is not surprising given its increased dimensionality (eight equations), multiple parameters (thirteen), and complex terms of interaction, this model is validated with two datasets (virus titer load and effector CD8+ T cells). Our analysis showed that the addition of data for another model variable did not compensate for the size of the model and number of unknown parameters.
Interestingly, we found that all parameters (for all models) are structurally identifiable under known (with the exception of initial effector population, E0) initial conditions. Given that this is an inoculation study, the assumption of known viral inoculum (set by the experiment) and of known initial target cell population (approximated based on the animal body weight) is not unreasonable. When Models 1–4 are used to model natural infection, however, such initial conditions would be unknown due to differences in individual routes of infection, heterogeneity in individual immune responses, and variability in patient susceptibility. Under such unknowns, Models 1–4 would become structurally unidentifiable. Hence, it would be impossible to estimate all parameters even in the presence of unlimited data. A reduction of the unknown parameter space (based on the reported parameter correlations) would be needed before model validation can be attempted.
We next validated Models 1–3 with discrete (daily) virus titer data (for the first nine days) and validated Model 4 with discrete (daily) virus titer data (for the first nine days) and discrete (daily) CD8+ T cell data (for the first twelve days). Model selection (based on AICc) favored Model 4 as the best model (Table 3). Interestingly, Model 1 was the second best model, even though it had the largest 90% error region around the predicted mean virus fit (Figure 2, gray shaded regions). All models perform the worst during the contraction interval (Figure 2, gray and blue shaded regions), suggesting uncertainty in death rates estimates (for the virus and infected cells).
We used the best parameter estimates obtained through data fitting for Models 1–4 to further investigate their practical identifiablity. Knowing that data used for validation was collected daily and that there was variability among the subjects at each time point, we wanted to determine whether there is uncertainty in estimated parameters. When it comes to practical identifiability, several methods are available, such as the Monte Carlo simulations [18,19,20], the Fisher information matrix or correlation matrix [16,21,22,23], and Bayesian methods [24]. In this study, we used the profile likelihood method [25,38] for two main reasons. First, it allowed us to not just classify the models as practically or non-practically identifiable, but to determine a 90% confidence interval for each practically identifiable parameter. Second, it allowed us to determine the required assumptions needed to improve practical identifiability, while maintaining biological realism (for example, by imposing positivity for all parameters).
We found that none of the models are practically identifiable for the daily empirical data collected in [28] and the parameter range restrictions imposed in Table 2 (see Figures 3A, C, E, and 5A). While Model 1, Model 2, and Model 4 become practically identifiable if we assume data is collected every fourteen minutes (see Figures 3A, D, and 5B), Model 3 does not (see Figure 3F). For this model, we can achieve practical identifiability only when we assume that the viral clearance rate can reach values as high as c=500 per day (corresponding to a life-span for the influenza virus of 2.9 minutes), and that the epithelial cells spend 1/k=1.2 hours in the eclipse phase before they become productively infected (see Figure 4). While large influenza clearance rates have been reported before [28], the eclipse phase 1/k is assumed to be conserved in a tight interval of 4–6 hours in most respiratory infections [6,11,28]. Therefore, this parameter is not practically identifiable from Model 3 even when data is collected at high frequency. This is a situation where a parameter should be removed from the data fitting routine in order to improve the uncertainty in the estimates of the remaining parameters (see Figure 6).
Our study has several limitations. First, Model 4 was originally expressed as a five order system of DDEs. Given the lack of methods that can be used to determine the structural identifiability of DDEs, we approximated it with an eight order system of ODEs. More work is needed to determine whether we maintain (or improve) the practical identifiability results when the DDE system is used in the place of the ODE system. Second, we assumed that all model parameters are unknown. It is feasible that the practical identifiability will be improved if certain parameters (such as the eclipse phase) were assumed known. Lastly, all our practical identifiability results come in the context of daily data collection. It would be interesting to see how data collected with random frequency (especially unavailability of data measurements before peak viral load) changes the results. In conclusion, we investigated the structural and practical identifiability of four nested ODE models of influenza A virus infection in mice. We determined the tradeoff between model complexity (defined as combined system dimension, number of unknown parameters, nonlinearity in model interactions), data availability, and our ability to reliably estimate model parameters. We presented solutions for improving model identifiability. While our results dependent on the structure of the models considered the available data, the methods are generalizable and their use is needed to improve reliability and reproducibility of parameter estimates in other systems of ODEs applied to discrete biological data.
Identifiability analysis has critical implications for experimental design, particularly when it comes to ensuring that the data obtained in these experiments will provide accurate estimation of parameters. For instance, if a model is not structurally identifiable even with noise-free unlimited data, then no experimental design will allow for estimation of certain parameters. Furthermore, structural identifiability analysis informs us which variables need to be measured in order to obtain reliable parameter estimates. Therefore, these experiments can be designed with some structurally identifiable models in mind. On the other hand, practical identifiability reveals the optimal data sampling frequency, where the data is more informative for certain parameters. These results will refine the experimental design to obtain data at those times and reduce the uncertainty in parameter estimates.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Conceptualization: NT and SMC; analysis: YRL and NHB; funding acquisition SMC and NT; writing - original draft preparation: YRL, NHB, NT and SMC; writing - review and editing: YRL, NHB, NT and SMC.
SMC and NHB acknowledge partial support from National Science Foundation grant No. 2051820 and NIH NIGMS 1R01GM152743-01. This research was enabled in part through the Virginia Tech Center for the Mathematics of Biosystems (VTCMB-033500). NT acknowledges partial support from National Science Foundation grant DMS 1951626. NT and YRL acknowledge support from NIH NIGMS 1R01GM152743-01.
Necibe Tuncer is one of the special issue editors for Mathematical Biosciences and Engineering (MBE) and was not involved in the editorial review or the decision to publish this article. All authors declare no conflict of interest.
[1] |
J. Gu, G. Li, N. D. Vo, J. J. Jung, Contextual Word2Vec model for understanding chinese out of vocabularies on online social media, Int. J. Semant. Web. Inf. Syst., 18 (2022), 1–14. https://doi.org/10.4018/ijswis.309428 doi: 10.4018/ijswis.309428
![]() |
[2] | G. Manal, Social media data for the conservation of historic urban landscapes: Prospects and challenges, in Culture and Computing. Design Thinking and Cultural Computing (eds. M. Rauterberg), Springer, (2021), 209–223. https://doi.org/10.1007/978-3-030-77431-8_13 |
[3] | J. Zhao, L. Yang, X. Yang, Maximum profit of viral marketing: An optimal control approach, in Proceedings of the 2019 4th International Conference on Mathematics and Artificial Intelligence, Association for Computing Machinery, (2019), 209–214. https://doi.org/10.1145/3325730.3325767 |
[4] | D. Pedro, R. Matt, Mining the network value of customers, in Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, Association for Computing Machinery, (2001), 57–66. https://doi.org/10.1145/502512.502525 |
[5] | X. Song, B. L. Tseng, C. Lin, M. Sun, Personalized recommendation driven by information flow, in Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, Association for Computing Machinery, (2006), 509–516. https://doi.org/10.1145/1148170.1148258 |
[6] |
Y. Li, D. Zhang, K. Tan, Real-time targeted influence maximization for online advertisements, Proc. VLDB Endow., 8 (2015), 1070–1081. https://doi.org/10.14778/2794367.2794376 doi: 10.14778/2794367.2794376
![]() |
[7] | L. Simone, M. Diego, R. Giuseppe, M. Maurizio, Mining micro-influencers from social media posts, in Proceedings of the 35th Annual ACM Symposium on Applied Computing, Association for Computing Machinery, (2020), 867–874. https://doi.org/10.1145/3341105.3373954 |
[8] |
X. Zhou, S. Li, Z. Li, W. Li, Information diffusion across cyber-physical-social systems in smart city: a survey, Neurocomputing, 444 (2021), 203–213. https://doi.org/10.1016/j.neucom.2020.08.089 doi: 10.1016/j.neucom.2020.08.089
![]() |
[9] |
V. Soroush, M. Mostafa, R. Deb, Rumor gauge: predicting the veracity of rumors on Twitter, ACM Trans. Knowl. Discov. Data, 11 (2017), 1–36. https://doi.org/10.1145/3070644 doi: 10.1145/3070644
![]() |
[10] |
S. R. Sahoo, B. B. Gupta, Multiple features based approach for automatic fake news detection on social networks using deep learning, Appl. Soft Comput., 100 (2021), 106983. https://doi.org/10.1016/j.asoc.2020.106983 doi: 10.1016/j.asoc.2020.106983
![]() |
[11] | D. Kempe, J. Kleinberg, E. Tardos, Maximizing the spread of influence through a social network, in Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery, (2003), 137–146. https://doi.org/10.1145/956750.956769 |
[12] |
A. G. Cecilia, B. Manuel, T. M. Valentina, An agent-based social simulation for citizenship competences and conflict resolution styles, Int. J. Semant. Web Inf. Syst., 18 (2022), 1–23. https://doi.org/10.4018/IJSWIS.306749 doi: 10.4018/IJSWIS.306749
![]() |
[13] | Y. Rong, Q. Zhu, H. Cheng, A model-free approach to infer the diffusion network from event cascade, in Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, Association for Computing Machinery, (2016), 1653–1662. https://doi.org/10.1145/2983323.2983718 |
[14] | S. Galhotra, A. Arora, S. Virinchi, S. Roy, Asim: A scalable algorithm for influence maximization under the independent cascade model, in Proceedings of the 24th International Conference on World Wide Web, Association for Computing Machinery, (2015), 35–36. https://doi.org/10.1145/2740908.2742725 |
[15] |
A. Cetto, M. Klier, A. Richter, J. F. Zolitschka, "Thanks for sharing"—Identifying users' roles based on knowledge contribution in Enterprise Social Networks, Comput. Net., 135 (2018), 275–288. https://doi.org/10.1016/j.comnet.2018.02.012 doi: 10.1016/j.comnet.2018.02.012
![]() |
[16] |
L. Sopjani, J. J. Stier, S. Ritzén, M. Hesselgren, P. Georén, Involving users and user roles in the transition to sustainable mobility systems: The case of light electric vehicle sharing in Sweden, Transp. Res. Part D: Transp. Environ., 71 (2019), 207–221. https://doi.org/10.1016/j.trd.2018.12.011 doi: 10.1016/j.trd.2018.12.011
![]() |
[17] |
L. B. Jeppesen, K. Laursen, The role of lead users in knowledge sharing, Res. Policy, 38 (2009), 1582–1589. https://doi.org/10.1016/j.respol.2009.09.002 doi: 10.1016/j.respol.2009.09.002
![]() |
[18] |
I. Singh, N. Kumar, S. K. G., T. Sharma, V. Kumar, S. Singhal, Database intrusion detection using role and user behavior based risk assessment, J. Inf. Secur. Appl., 55 (2020), 102654. https://doi.org/10.1016/j.jisa.2020.102654 doi: 10.1016/j.jisa.2020.102654
![]() |
[19] | D. Kempe, J. M. Kleinberg, E. Tardos, Maximizing the spread of influence through a social network, in Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, Association for Computing Machinery, (2003), 137–146. https://doi.org/10.1145/956750.956769 |
[20] | P. Shakarian, A. Bhatnagar, A. Aleali, E. Shaabani, R. Guo, The independent cascade and linear threshold models, in Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Association for Computing Machinery, (2015), 177–184. https://doi.org/10.1007/978-3-319-23105-1_4 |
[21] | M. Richardson, P. Domingos, Mining knowledge-sharing sites for viral marketing, in Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, Association for Computing Machinery, (2002), 61–70. https://doi.org/10.1145/775047.775057 |
[22] | D. Oriedi, C. de Runz, Z. Guessoum, A. A. Nyongesa, Influence maximization through user interaction modeling, in Proceedings of the 35th Annual ACM Symposium on Applied Computing, Association for Computing Machinery, (2020), 1888–1890. https://doi.org/10.1145/3386901.3388999 |
[23] | L. Sun, A. Chen, P. S. Yu, W. Chen, Influence maximization with spontaneous user adoption, in Proceedings of the 13th International Conference on Web Search and Data Mining, Association for Computing Machinery, (2020), 573–581. https://doi.org/10.1145/3336191.3371785 |
[24] |
J. Guo, W. Wu, Adaptive influence maximization: if influential node unwilling to be the seed, ACM Trans. Knowl. Discov. Data, 15 (2021), 1–23. https://doi.org/10.1145/3447396 doi: 10.1145/3447396
![]() |
[25] | J. Luo, X. Liu, X. Kong, Competitive opinion maximization in social networks, in Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Association for Computing Machinery, (2019), 250–257. https://doi.org/10.1145/3341161.3342899 |
[26] | Y. Zhang, Y. Zhang, Top-K influential nodes in social networks: A game perspective, in Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery, (2017), 1029–1032. https://doi.org/10.1145/3077136.3080709 |
[27] | X. Liu, X. Kong, P. S. Yu, Active ppinion maximization in social networks, Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Association for Computing Machinery, (2018), 1840–1849. https://doi.org/10.1145/3219819.3220061 |
[28] | P. Banerjee, W. Chen, L. V.S. Lakshmanan, Maximizing welfare in social networks under a utility driven influence diffusion model, in Proceedings of the 2019 International Conference on Management of Data, Association for Computing Machinery, (2019), 1078–1095. https://doi.org/10.1145/3299869.3319879 |
[29] |
M. M. Keikha, M. Rahgozar, M. Asadpour, M. F. Abdollahi, Influence maximization across heterogeneous interconnected networks based on deep learning, Expert Syst. Appl., 140 (2020). https://doi.org/10.1016/j.eswa.2019.112905 doi: 10.1016/j.eswa.2019.112905
![]() |
[30] |
Q. Zhan, W. Zhuo, Y. Liu, Social influence maximization for public health campaigns, IEEE Access, 7 (2019), 151252–151260. https://doi.org/10.1109/ACCESS.2019.2946391 doi: 10.1109/ACCESS.2019.2946391
![]() |
[31] |
S. Tian, S. Mo, L. Wang, Z. Peng, Deep reinforcement learning-based approach to tackle topic-aware influence maximization, Data Sci. Engineer., 5 (2020), 1–11. https://doi.org/10.1007/s41019-020-00117-1 doi: 10.1007/s41019-020-00117-1
![]() |
[32] | D. Li, J. Liu, J. Jeon, S. Hong, T. Le, D. Lee, et al., Large-scale data-rriven airline market influence maximization, in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Association for Computing Machinery, (2021), 914–924. https://doi.org/10.1145/3447548.3467365 |
[33] |
C. Zhang, W. Li, D. Wei, Y. Liu, Z. Li, Network dynamic GCN influence maximization algorithm with leader fake labeling mechanism, IEEE Trans. Comput. Soc. Syst., (2022), 1–9. https://doi.org/10.1109/TCSS.2022.3193583 doi: 10.1109/TCSS.2022.3193583
![]() |
[34] |
W. Li, Z. Li, A. M. Luvembe, C. Yang, Influence maximization algorithm based on Gaussian propagation model, Inf. Sci., 568 (2021), 386–402. https://doi.org/10.1016/j.ins.2021.04.061 doi: 10.1016/j.ins.2021.04.061
![]() |
[35] |
W. Li, Y. Li, W. Liu, C. Wang, An influence maximization method based on crowd emotion under an emotion-based attribute social network, Inf. Process. Manage., 59 (2022), 102818. https://doi.org/10.1016/j.ipm.2021.102818 doi: 10.1016/j.ipm.2021.102818
![]() |
[36] |
W. Li, Y. Hu, C. Jiang, S. Wu, Q. Bai, E. M. K. Lai, ABEM: an adaptive agent-based evolutionary approach for influence maximization in dynamic social networks, Appl. Soft Comput., 136 (2023), 110062. https://doi.org/10.1016/j.asoc.2023.110062 doi: 10.1016/j.asoc.2023.110062
![]() |
[37] |
H. Cai, V. W. Zheng, K. C. Chang, A comprehensive survey of graph embedding: Problems, techniques, and applications, IEEE Trans. Knowl. Data Engineer., 30 (2018), 1616–1637. https://doi.org/10.1109/TKDE.2018.2807452 doi: 10.1109/TKDE.2018.2807452
![]() |
[38] | B. Perozzi, R. Al-Rfou, S. Skiena, Deepwalk: online learning of social representations, in Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, Association for Computing Machinery, (2014), 701–710. https://doi.org/10.1145/2623330.2623732 |
[39] | J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, Q. Mei, Line: large-scale information network embedding, in Proceedings of the 24th international conference on World Wide Web, Association for Computing Machinery, (2015), 1067–1077. https://doi.org/10.1145/2736277.2741093 |
[40] | A. Grover, J. Leskovec, node2vec: scalable feature learning for networks, in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, Association for Computing Machinery, (2016), 855–864. https://doi.org/10.1145/2939672.2939754 |
[41] | C. McCormick, Word2vec tutorial-the skip-gram model, 2016. Available from: http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model. |
[42] |
M. M. Keikha, M. Rahgozar, M. Asadpour, Community aware random walk for network embedding, Knowledge-Based Syst., 148 (2018), 47–54. https://doi.org/10.1016/j.knosys.2018.02.028 doi: 10.1016/j.knosys.2018.02.028
![]() |
[43] | T. Lou, J. Tang, Mining structural hole spanners through information diffusion in social networks, in Proceedings of the 22nd international conference on World Wide Web, Association for Computing Machinery, (2013), 825–836. https://doi.org/10.1145/2488388.2488461 |
[44] |
R. S. Burt, Structural holes and good ideas, Am. J. Soc., 110 (2004), 349–399. https://doi.org/10.1086/421787 doi: 10.1086/421787
![]() |
[45] | S. Wu, J. M. Hofman, W. A. Mason, D. J. Watts, Who says what to whom on Twitter, in Proceedings of the 20th International Conference on World Wide Web, WWW 2011, Association for Computing Machinery, (2011). https://doi.org/10.1145/1963405.1963504 |
[46] | T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, 2013. Available from: http://arXiv.org/abs/1301.3781. |
[47] |
D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, S. M. Dawson, The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations, Behav. Ecol. Soc., 54 (2003), 396–405. https://doi.org/10.1007/s00265-003-0651-y doi: 10.1007/s00265-003-0651-y
![]() |
[48] | A. L. Traud, P. J. Mucha, M. A. Porter, Social structure of Facebook networks, Phys. A, 391 (2012), 4165–4180. https://arXiv.org/1102.2166 |
[49] | M. J. Newman, Finding community structure in networks using the eigenvectors of matrices, Phys. Rev. E, 74 (2006), 036104. https://arXiv.org/abs/physics/0605087v3 |
[50] |
J. Leskovec, J. Kleinberg, C. Faloutsos, Graph evolution: Densification and shrinking diameters, ACM Trans. Knowl. Discovery Data, 1 (2007). https://doi.org/10.1145/1217299.1217301 doi: 10.1145/1217299.1217301
![]() |
[51] | B. Rozemberczki, R. Davies, R. Sarkar, C. Sutton, Gemsec: graph embedding with self clustering, in Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2019, Association for Computing Machinery, (2019), 65–72. https://doi.org/10.1145/3341161.3342890 |
[52] | J. Zhang, Y. Luo, Degree centrality, betweenness centrality, and closeness centrality in social network, in Proceedings of the 2017 2nd International Conference on Modelling, Simulation and Applied Mathematics (MSAM2017), 132 (2017), 300–303. https://doi.org/10.2991/msam-17.2017.68 |
[53] |
M. E. J. Newman, A measure of betweenness centrality based on random walks, Soc. Net., 27 (2005), 39–54. https://doi.org/10.1016/j.socnet.2004.11.009 doi: 10.1016/j.socnet.2004.11.009
![]() |
[54] |
D. F. Gleich, PageRank beyond the Web, SIAM Rev., 57 (2015), 321–363. https://doi.org/10.1137/140976649 doi: 10.1137/140976649
![]() |
1. | Mourad Chamekh, Mohamed Ali Latrach, Fadel Jday, Multi-step semi-analytical solutions for a chikungunya virus system, 2023, 2731-6734, 10.1007/s43994-023-00027-8 | |
2. | Muhammad Farman, Aamir Shehzad, Ali Akgül, Dumitru Baleanu, Manuel De la Sen, Modelling and Analysis of a Measles Epidemic Model with the Constant Proportional Caputo Operator, 2023, 15, 2073-8994, 468, 10.3390/sym15020468 | |
3. | Olumuyiwa James Peter, Sania Qureshi, Mayowa M. Ojo, Ratchada Viriyapong, Amanullah Soomro, Mathematical dynamics of measles transmission with real data from Pakistan, 2022, 2363-6203, 10.1007/s40808-022-01564-7 | |
4. | Salah Alsahafi, Stephen Woodcock, Exploring HIV Dynamics and an Optimal Control Strategy, 2022, 10, 2227-7390, 749, 10.3390/math10050749 | |
5. | Zakaria Yaagoub, Karam Allali, Global Stability of Multi-Strain SEIR Epidemic Model with Vaccination Strategy, 2023, 28, 2297-8747, 9, 10.3390/mca28010009 | |
6. | Ahmed Alshehri, Miled El Hajji, Mathematical study for Zika virus transmission with general incidence rate, 2022, 7, 2473-6988, 7117, 10.3934/math.2022397 | |
7. | Nikhila Yaladanda, Rajasekhar Mopuri, Hari Prasad Vavilala, Srinivasa Rao Mutheneni, Modelling the impact of perfect and imperfect vaccination strategy against SARS CoV-2 by assuming varied vaccine efficacy over India, 2022, 15, 22133984, 101052, 10.1016/j.cegh.2022.101052 | |
8. | Mohammed H. Alharbi, Global investigation for an "SIS" model for COVID-19 epidemic with asymptomatic infection, 2023, 20, 1551-0018, 5298, 10.3934/mbe.2023245 | |
9. | Amer Hassan Albargi, Miled El Hajji, Mathematical analysis of a two-tiered microbial food-web model for the anaerobic digestion process, 2023, 20, 1551-0018, 6591, 10.3934/mbe.2023283 | |
10. | Yantao Luo, Yunqiang Yuan, Zhidong Teng, ANALYSIS OF A DEGENERATED DIFFUSION SVEQIRV EPIDEMIC MODEL WITH GENERAL INCIDENCE IN A SPACE HETEROGENEOUS ENVIRONMENT, 2024, 14, 2156-907X, 2704, 10.11948/20230346 | |
11. | Rukhsar Ikram, Amir Khan, Mostafa Zahri, Aeshah A. Raezah, The impact of Lévy noise on a stochastic measles model, 2023, 138, 2190-5444, 10.1140/epjp/s13360-023-04625-7 | |
12. | Maoxing Liu, Yuhang Li, Dynamics analysis of an SVEIR epidemic model in a patchy environment, 2023, 20, 1551-0018, 16962, 10.3934/mbe.2023756 | |
13. | Dilber Uzun Ozsahin, Najeeb Alam Khan, Araib Aqeel, Hijaz Ahmad, Maged F. Alotaibi, Muhammad Ayaz, Junyuan Yang, Mathematical modeling and dynamics of immunological exhaustion caused by measles transmissibility interaction with HIV host, 2024, 19, 1932-6203, e0297476, 10.1371/journal.pone.0297476 | |
14. | Liang’an 良安 Huo 霍, Yue 跃 Yu 于, Impact of individual behavior adoption heterogeneity on epidemic transmission in multiplex networks, 2023, 32, 1674-1056, 108703, 10.1088/1674-1056/acea65 | |
15. | Jiandong Nie, Qiaoling Chen, Zhidong Teng, Yihan Zhang, Ramziya Rifhat, Dynamics of a Stochastic Measles Model with General Incidence Rate and Black–Karasinski Process, 2024, 47, 0126-6705, 10.1007/s40840-024-01771-8 | |
16. | Miled El Hajji, Periodic solutions for chikungunya virus dynamics in a seasonal environment with a general incidence rate, 2023, 8, 2473-6988, 24888, 10.3934/math.20231269 | |
17. | Miled El Hajji, Dalal M. Alshaikh, Nada A. Almuallem, Periodic Behaviour of an Epidemic in a Seasonal Environment with Vaccination, 2023, 11, 2227-7390, 2350, 10.3390/math11102350 | |
18. | Young Rock Kim, Youngho Min, Joy Nana Okogun-Odompley, Rehana Naz, A mathematical model of COVID-19 with multiple variants of the virus under optimal control in Ghana, 2024, 19, 1932-6203, e0303791, 10.1371/journal.pone.0303791 | |
19. | Marlon Nunes Gonzaga, Marcelo Martins de Oliveira, Allbens Picardi Faria Atman, Role of Vaccination Strategies to Host-Pathogen Dynamics in Social Interactions, 2024, 26, 1099-4300, 739, 10.3390/e26090739 | |
20. | Amar Nath Chatterjee, Santosh Kumar Sharma, Fahad Al Basir, A Compartmental Approach to Modeling the Measles Disease: A Fractional Order Optimal Control Model, 2024, 8, 2504-3110, 446, 10.3390/fractalfract8080446 | |
21. | Amer Hassan Albargi, Miled El Hajji, Bacterial Competition in the Presence of a Virus in a Chemostat, 2023, 11, 2227-7390, 3530, 10.3390/math11163530 | |
22. | W. Ahmad, A. I. K. Butt, M. Rafiq, Z. Asif, T. Ismaeel, N. Ahmad, Modeling, analyzing and simulating the Measles transmission dynamics through efficient computational optimal control technique, 2024, 139, 2190-5444, 10.1140/epjp/s13360-024-05368-9 | |
23. | Zakaria Yaagoub, Fractional two-strain SVLIR epidemic model with vaccination and quarantine strategies, 2025, 13, 2195-268X, 10.1007/s40435-024-01561-x | |
24. | Tianqi Song, Yishi Wang, Chuncheng Wang, Qi An, Modeling and analyzing COVID-19 infections in South Africa, 2025, 18, 1793-5245, 10.1142/S1793524523501036 | |
25. | Nassira Madani, Zakia Hammouch, EL-Houssine Azroul, New model of HIV/AIDS dynamics based on Caputo–Fabrizio derivative order: Optimal strategies to control the spread, 2025, 18777503, 102612, 10.1016/j.jocs.2025.102612 |
Model | Observed states | DAISY | JULIA | COMBOS |
Model 1 unknown initial conditions |
V(t) | β globally identifiable {c,δ} locally identifiable Correlations: c+δ=ˆc+ˆδ, cδ=ˆcˆδ |
β globally identifiable {c,δ} locally identifiable π nonidentifiable Correlations: c+δ=ˆc+ˆδ, cδ=ˆcˆδ |
β globally identifiable {c,δ} locally identifiable with 2 solutions |
Model 2 unknown initial conditions |
V(t) | {β,c} globally identifiable Correlations: δπ=ˆδˆπ, πKδ=ˆπ^Kδ |
{β,c} globally identifiable {π,Kδ,δ} nonidentifiable Correlations: δπ=ˆδˆπ, πKδ=ˆπ^Kδ |
{β,c,Kδπ,δπ} globally identifiable |
Model 3 unknown initial conditions |
V(t) | {β,c,k} globally identifiable Correlations: δKδ=ˆδ^Kδ, πKδ=ˆπ^Kδ |
{β,c,k} globally identifiable {π,Kδ,δ} nonidentifiable Correlations: δπ=ˆδˆπ, πKδ=ˆπ^Kδ |
{β,c,Kδπ,δπ} globally identifiable |
Model 4 unknown initial conditions |
V(t),E(t) | {β,c,dE,δ,k,KE,τI} globally identifiable Correlations: δEη=ˆδEˆη, Kδη=ˆKδˆη, λη=ˆλˆη, πη=ˆπˆη |
{β,c,dE,δ,k,KE,τI} globally identifiable {δE,Kδ,π,λ,η} nonidentifiable Correlations: Kδπ=ˆKδˆπ, Kδλ=ˆKδˆλ, ηKδ=ˆηˆKδ, δEπ=ˆδEˆπ |
identifiability unknown |
All models known initial conditions except for E0 |
V(t) V(t),E(t) |
Models 1–3 globally identifiable Model 4 identifiability unknown |
Models 1–4 globally identifiable |
Models 1–4 identifiability unknown |
Parameter | Model 1 | Model 2 | Model 3 | Model 4 |
β×10−5 | 0.1–10 | 0.1–10 | 0.1–10 | 0.1–10 |
δ | 0–25 | 102–108 | 102–108 | 0–15 |
Kδ | - | 103–107 | 103–107 | 101–105 |
π | 0–102 | 0–102 | 0–102 | 0–102 |
c | 0–25 | 0–25 | 0–25 | 0–25 |
k | - | - | 4–6 | 4–6 |
δE | - | - | - | 0–25 |
η×10−7 | - | - | - | 10−2–100 |
λ×103 | - | - | - | 10−2–102 |
dE | - | - | - | 0–25 |
KE×106 | - | - | - | 0.1–103 |
τI | - | - | - | 0–10 |
E0×103 | - | - | - | 0.1–10 |
Parameter | Model 1 | Model 2 | Model 3 | Model 4 |
β×10−5 | 0.48 | 6.88 | 6.20 | 8.39 |
δ | 1.47 | 1.59×106 | 1.72×106 | 0.196 |
Kδ | - | 4.69×104 | 2.37×105 | 5.49×103 |
π | 1.18 | 0.86 | 1.69 | 0.69 |
c | 1.48 | 6.49 | 12.34 | 6.19 |
k | - | - | 4.82 | 5.02 |
δE | - | - | - | 14.20 |
η×10−7 | - | - | - | 3.59 |
λ×103 | - | - | - | 1.31 |
dE | - | - | - | 0.20 |
KE×106 | - | - | - | 9.68 |
τI | - | - | - | 1.69 |
E0×103 | - | - | - | 1.21 |
Jw | 21.12 | 2.09 | 2.39 | 2.14 |
AICc | 37.67 | 40.88 | 114.1 | 36.78 |
Model | Observed states | DAISY | JULIA | COMBOS |
Model 1 unknown initial conditions |
V(t) | β globally identifiable {c,δ} locally identifiable Correlations: c+δ=ˆc+ˆδ, cδ=ˆcˆδ |
β globally identifiable {c,δ} locally identifiable π nonidentifiable Correlations: c+δ=ˆc+ˆδ, cδ=ˆcˆδ |
β globally identifiable {c,δ} locally identifiable with 2 solutions |
Model 2 unknown initial conditions |
V(t) | {β,c} globally identifiable Correlations: δπ=ˆδˆπ, πKδ=ˆπ^Kδ |
{β,c} globally identifiable {π,Kδ,δ} nonidentifiable Correlations: δπ=ˆδˆπ, πKδ=ˆπ^Kδ |
{β,c,Kδπ,δπ} globally identifiable |
Model 3 unknown initial conditions |
V(t) | {β,c,k} globally identifiable Correlations: δKδ=ˆδ^Kδ, πKδ=ˆπ^Kδ |
{β,c,k} globally identifiable {π,Kδ,δ} nonidentifiable Correlations: δπ=ˆδˆπ, πKδ=ˆπ^Kδ |
{β,c,Kδπ,δπ} globally identifiable |
Model 4 unknown initial conditions |
V(t),E(t) | {β,c,dE,δ,k,KE,τI} globally identifiable Correlations: δEη=ˆδEˆη, Kδη=ˆKδˆη, λη=ˆλˆη, πη=ˆπˆη |
{β,c,dE,δ,k,KE,τI} globally identifiable {δE,Kδ,π,λ,η} nonidentifiable Correlations: Kδπ=ˆKδˆπ, Kδλ=ˆKδˆλ, ηKδ=ˆηˆKδ, δEπ=ˆδEˆπ |
identifiability unknown |
All models known initial conditions except for E0 |
V(t) V(t),E(t) |
Models 1–3 globally identifiable Model 4 identifiability unknown |
Models 1–4 globally identifiable |
Models 1–4 identifiability unknown |
Parameter | Model 1 | Model 2 | Model 3 | Model 4 |
β×10−5 | 0.1–10 | 0.1–10 | 0.1–10 | 0.1–10 |
δ | 0–25 | 102–108 | 102–108 | 0–15 |
Kδ | - | 103–107 | 103–107 | 101–105 |
π | 0–102 | 0–102 | 0–102 | 0–102 |
c | 0–25 | 0–25 | 0–25 | 0–25 |
k | - | - | 4–6 | 4–6 |
δE | - | - | - | 0–25 |
η×10−7 | - | - | - | 10−2–100 |
λ×103 | - | - | - | 10−2–102 |
dE | - | - | - | 0–25 |
KE×106 | - | - | - | 0.1–103 |
τI | - | - | - | 0–10 |
E0×103 | - | - | - | 0.1–10 |
Parameter | Model 1 | Model 2 | Model 3 | Model 4 |
β×10−5 | 0.48 | 6.88 | 6.20 | 8.39 |
δ | 1.47 | 1.59×106 | 1.72×106 | 0.196 |
Kδ | - | 4.69×104 | 2.37×105 | 5.49×103 |
π | 1.18 | 0.86 | 1.69 | 0.69 |
c | 1.48 | 6.49 | 12.34 | 6.19 |
k | - | - | 4.82 | 5.02 |
δE | - | - | - | 14.20 |
η×10−7 | - | - | - | 3.59 |
λ×103 | - | - | - | 1.31 |
dE | - | - | - | 0.20 |
KE×106 | - | - | - | 9.68 |
τI | - | - | - | 1.69 |
E0×103 | - | - | - | 1.21 |
Jw | 21.12 | 2.09 | 2.39 | 2.14 |
AICc | 37.67 | 40.88 | 114.1 | 36.78 |