
In this work, a novel approach based on a single-layer machine learning Legendre spectral neural network (LSNN) method is used to solve an elliptic partial differential equation. A Legendre polynomial based approach is utilized to generate neurons that fulfill the boundary conditions. The loss function is computed by using the error back-propagation principles and a feed-forward neural network model combined with automatic differentiation. The main advantage of using this methodology is that it does not need to solve a system of nonlinear and nonsparse equations compared with other traditional numerical schemes, which makes this algorithm more convenient for solving higher-dimensional equations. Further, the hidden layer is eliminated with the help of a Legendre polynomial to enlarge the input pattern. The neural network's training accuracy and efficiency were significantly enhanced by the innovative sampling technique and neuron architecture. Moreover, the Legendre spectral approach can handle equations on more complex domains because of numerous networks. Several test problems were used to validate the proposed scheme, and a comparison was made with other neural network schemes consisting of the physics-informed neural network (PINN) scheme. We found that our proposed scheme has a very good agreement with PINN, which further enhances the reliability and efficiency of our proposed method. The absolute and relative error in both L2 and L∞ between exact and numerical solutions are provided, which shows that our numerical method converges exponentially.
Citation: Ishtiaq Ali. Advanced machine learning technique for solving elliptic partial differential equations using Legendre spectral neural networks[J]. Electronic Research Archive, 2025, 33(2): 826-848. doi: 10.3934/era.2025037
The onset of large epidemics in recent decades, such as the severe acute respiratory syndrome (SARS) epidemic, avian influenza ([1], [2]), Ebola ([3]), and the COVID-19 pandemic ([4], [5]), has led to the development of many mathematical models for studying infectious diseases (see, for example, [6], [7], [8], [9], [10]). The main aim of such studies is to forecast the dynamics of the disease, also by using suitable inference techniques, in order to adopt appropriate containment policies (see [11], [12], [13]).
Generally in epidemic models, the population is divided into compartments that constitute a partition of it, that is, each compartment constitutes a subset of the population disjoint from the others and the union of all the compartments returns the whole population. Among the models that contemplate this compartmentalized philosophy, the susceptible-infectious-recovered (SIR) type and its derivatives stand out.
The study of disease propagation through these models, both in their deterministic and stochastic versions, has experienced a great boom in recent decades, giving rise to an extensive literature. The range of models considered is very wide, taking into account different points of view. For example, in [14] and [15], Kalman filtering techniques are applied to estimate the states of a discrete nonlinear compartmental model. In the stochastic environment, Markovian models occupy a prominent place. Within them, if we consider models with discrete states, continuous-time Markov chains have been widely used (see, for instance, [16], [17], [18], [19], [20], [21]). As for continuous-time Markovian models with continuous state space, diffusion processes arise naturally by introducing random environments (through a multidimensional Wiener process) into the systems of ordinary differential equations governing deterministic models.
From the classical SIR model, a number of increasingly complex variants have emerged. Among the lines on which the evolution of the models has been based, we can highlight the following:
● The partitioning of the total population into an increasing number of compartments, each of which obeys a particular situation of the individuals. Thus, in addition to the usual susceptible (S), infected (I) and recovered (R) individuals, there are others such as those with passive immunity (M), the exposed and uninfected (E) and deceased (D), among others. This has led to an increased complexity and difficulty in treating compartment models (see, for instance, [22,23]).
● The inclusion of the mechanisms of contagion of the disease, the so-called incidence function. The most usual function of this type is the one that relates susceptible individuals to infected individuals according to the law of mass action, but considering a constant contagion rate and/or rational functions of both groups of individuals to regulate the contagion scheme ([24,25]).
● The inclusion of terms describing the effect of vaccination of individuals (including the possibility of restrictions in the vaccination process), as well as the possible effects of cross-infection (see, for instance, [26], [27], [28], [29]).
● The emergence of new diseases. The effect of COVID-19 on the growth of the literature on this type of model cannot be denied. Indeed, the analysis of the effects of the pandemic has given rise to numerous publications focusing on two lines of action: on the one hand, the study of the evolution of the disease in specific locations using existing models (see [30]) and, on the other hand, the appearance of new models (see, for example, [31]).
It should be noted, however, that a large share of these publications focus on the description of the model and its theoretical analysis (existence, uniqueness, non-explosion of the solution, stability, equilibrium problems and extinction of the disease). Especially for stochastic models, the estimation of the models is not yet well developed, and simulations and/or Monte Carlo type methods are typically used to illustrate the validity of the proposed models. The main reason is the difficulty of determining the probability distribution governing the dynamic evolution of the model, a difficulty that grows with the complexity of the model. Some progress has been made on simpler models (such as the susceptible-infectious-susceptible (SIS) model) by means of approximations derived from Euler-Maruyama type discretizations (see, for instance, [32]), although this requires certain conditions on the model components (e.g., the incidence function). The situation can become even more complex if the rates involved in the model are not constant but time-dependent.
This paper therefore falls within the line of inference in epidemic models. The aim is inference for an SI-type model, which is essentially a "simplified" version of an SIR-type model. Precisely, in the SI model it is assumed that an individual can be in only one of two states, either susceptible (S) or infectious (I). Although this model is quite simple, it captures several types of diseases in which individuals remain infected for life (e.g., brucellosis in domestic and wild populations, fox rabies). Even AIDS has been modeled in the literature using an SI-type model (see, for example, [33]).
For t≥t0, we denote by S(t) the number of susceptible individuals, by I(t) the number of individuals infected and by K the total population size, where K=S(t)+I(t) is constant. In similar models, infected individuals are lifetime infectious. We point out that here the size of the population is assumed to be constant, i.e., the birth and death rates of both populations S and I are assumed to be negligible. This is a very strong assumption, but reasonable if one observes the phenomenon for a limited time.
A model with two states is described via ordinary differential equations in the deterministic dynamics. More general models can be built making use of stochastic approaches based on the birth and death processes or diffusion processes (see, for instance, [6], [34,35], and references therein); they are more realistic but more complicated to analyze. Such models try to forecast the spread of the disease in terms of the total number of infected people and the duration of an epidemic. Moreover, they allow us to estimate suitable epidemiological parameters, as the transmission rate of the disease measured via the basic reproduction number, i.e., the expected number of infected cases directly generated by one case in a population where all individuals are susceptible or infected.
Let $I(t)$ and $S(t)$ be the sizes of the infected and the susceptible populations, respectively. We consider the deterministic SI model described by the following system:
$$\frac{dS(t)}{dt}=-\frac{\lambda(t)}{K}\,S(t)\,I(t),\qquad \frac{dI(t)}{dt}=\frac{\lambda(t)}{K}\,S(t)\,I(t),$$
where the transmission intensity function $\lambda(t)$ is a positive, bounded and continuous function of $t$. Since $S(t)=K-I(t)$, the population dynamics of the infected $I(t)$ can be described by the Pearl-Verhulst logistic growth differential equation:
$$\frac{dI(t)}{dt}=\frac{\lambda(t)}{K}\,[K-I(t)]\,I(t),\qquad t>t_0. \tag{1.1}$$
The solution of (1.1) is
$$I(t)=\frac{K\,I(t_0)}{I(t_0)+[K-I(t_0)]\,e^{-\Lambda(t|t_0)}},\qquad t\ge t_0, \tag{1.2}$$
with
$$\Lambda(t|t_0)=\int_{t_0}^{t}\lambda(\theta)\,d\theta. \tag{1.3}$$
We note that $\lim_{t\to\infty}I(t)=K$, so the whole population is destined to become infected; hence the parameter $K$ identifies the carrying capacity of the infected population. We point out that the susceptible-infectious epidemic model is an extreme case of the more general models including a recovered population, and can be obtained from them by assuming that the time required to reach immunity is infinitely long (see, for instance, [36]). The state $K$ is not reachable in finite time, so it is interesting to consider the time $T^*_m$ required to reach a fixed threshold $m$. When the process is time-homogeneous, i.e., the function $\lambda(t)=\lambda$ is constant, one can determine the time $T^*_m$ such that $I(T^*_m)=m$. In particular, solving (1.2) for $t$ one obtains
$$T^*_m=t_0+\frac{1}{\lambda}\,\ln\frac{m\,[K-I(t_0)]}{(K-m)\,I(t_0)},$$
where $I(t_0)$ is the initial size of the infected population and $m\in(0,K)$.
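As an illustration of the deterministic solution (1.2) and of the threshold time in the time-homogeneous case, the following minimal sketch evaluates both for a constant rate. The function names are hypothetical, and the numerical values ($K=200$, $I(t_0)=20$, $\lambda=0.4$) are borrowed from the simulation setting used later in the paper purely as an example.

```python
import math

def infected(t, t0, i0, K, lam):
    """Deterministic solution (1.2) for a constant rate lam,
    for which Lambda(t|t0) = lam * (t - t0)."""
    big_lambda = lam * (t - t0)
    return K * i0 / (i0 + (K - i0) * math.exp(-big_lambda))

def time_to_threshold(m, t0, i0, K, lam):
    """Time at which I(t) = m, obtained by solving (1.2) for t (0 < m < K)."""
    return t0 + math.log(m * (K - i0) / ((K - m) * i0)) / lam

# Illustrative values only.
K, i0, lam, t0 = 200.0, 20.0, 0.4, 0.0
t_half = time_to_threshold(100.0, t0, i0, K, lam)
print(t_half, infected(t_half, t0, i0, K, lam))
```

Plugging the returned time back into the solution recovers the threshold, which is a quick consistency check of the closed form.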
To generate a stochastic diffusion process for I(t), several approaches can be used. Generally, they introduce stochastic elements either into Eq. (1.1) or into its solution (1.2). These two approaches lead to different processes, described by stochastic differential equations of Itô or Stratonovich type, respectively (see [37]). In this work, the first of the two approaches has been chosen.
In [35], a time-inhomogeneous diffusion process to model the size of the infected population in a stochastic environment is provided by starting from (1.2). For such a process, the authors study the probability distribution and derive closed form results for the first-passage time problem through a constant boundary to obtain the stochastic counterpart of the parameter T∗m. However, such important issues as that of model inference were not addressed in this study.
The aim of the present paper is to provide inference for the stochastic diffusion process built from (1.2). Indeed, this issue is fundamental for studying the dynamics of infectious diseases and for measuring their aggressiveness so as to make predictions about future infections.
The paper is organized as follows. Section 2 provides the stochastic model and its probability distribution, i.e. the transition probability density function (pdf) and the related moments. Section 3 contains the inference procedure based on the Generalized Method of Moments (GMM) for fitting the parameters and the unknown functions of the model. In Section 4, some simulation experiments are performed to validate the proposed procedure. An application to real data is also considered in Section 5. Some conclusions close the paper.
Under the assumption of a random environment, let $\{X(t),t\ge t_0\}$ be the stochastic process describing the size of the infected population at time $t$, and let us interpret $\Lambda(t|t_0)$ as the mean of a time-inhomogeneous Wiener process $\{Z(t),t\ge t_0\}$, described by the stochastic equation
$$Z(t)=\Lambda(t|t_0)+W[V(t|t_0)],\qquad t\ge t_0, \tag{2.1}$$
where $W(t)$ is the standard Wiener process and
$$V(t|t_0)=\int_{t_0}^{t}\sigma^2(\theta)\,d\theta,\qquad t\ge t_0, \tag{2.2}$$
with $\sigma(t)$ being a positive, bounded and continuous function of $t$ describing the breadth of the random oscillations. Hence, Eq. (1.2) is generalized by the following stochastic equation:
$$X(t)=\frac{K\,X(t_0)}{X(t_0)+[K-X(t_0)]\,e^{-Z(t)}},\qquad t\ge t_0. \tag{2.3}$$
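Equations (2.1)-(2.3) suggest a direct way to simulate a sample path: accumulate $Z(t)$ from independent Gaussian increments with mean $\Delta\Lambda$ and variance $\Delta V$, then map it through (2.3). The sketch below is one possible implementation under stated assumptions: the function name is hypothetical, and the integrals $\Lambda$ and $V$ are approximated with the midpoint rule on each step, which is a choice of this sketch rather than something prescribed by the paper.

```python
import math
import random

def simulate_path(x0, K, lam, sigma2, t0, T, n_steps, rng):
    """Simulate X(t) on an equally spaced grid via (2.1)-(2.3).

    lam and sigma2 are callables of t. Increments of Lambda(t|t0) and
    V(t|t0) are approximated with the midpoint rule (sketch assumption).
    """
    dt = (T - t0) / n_steps
    z = 0.0                      # Z(t0) = 0
    path = [x0]
    for k in range(n_steps):
        t_mid = t0 + (k + 0.5) * dt
        d_lambda = lam(t_mid) * dt       # increment of Lambda(t|t0)
        d_v = sigma2(t_mid) * dt         # increment of V(t|t0)
        # W[V(t|t0)] has independent N(0, d_v) increments.
        z += d_lambda + math.sqrt(d_v) * rng.gauss(0.0, 1.0)
        path.append(K * x0 / (x0 + (K - x0) * math.exp(-z)))
    return path

# Illustrative call with constant lam = 0.4 and sigma2 = 0.1.
rng = random.Random(0)
path = simulate_path(20.0, 200.0, lambda t: 0.4, lambda t: 0.1, 0.0, 10.0, 1000, rng)
print(path[0], path[-1])
```

Because the path is obtained by transforming $Z(t)$, every simulated value stays inside $(0,K)$ by construction, which a naive Euler discretization of the stochastic differential equation would not guarantee.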
As proved in [35], $\{X(t),t\ge t_0\}$ is a time-inhomogeneous diffusion process defined in $(0,K)$, described by the following stochastic differential equation:
$$dX(t)=A_1(x,t)\,dt+\sqrt{A_2(x,t)}\,dW(t),\qquad X(t_0)=x_0, \tag{2.4}$$
where
$$A_1(x,t)=\frac{\lambda(t)}{K}(K-x)\,x+\frac{1}{4}\frac{\partial A_2(x,t)}{\partial x}=\frac{(K-x)\,x}{K}\left[\lambda(t)+\frac{K-2x}{2K}\,\sigma^2(t)\right],\qquad A_2(x,t)=\sigma^2(t)\,\frac{(K-x)^2x^2}{K^2}, \tag{2.5}$$
are the infinitesimal drift and the infinitesimal variance of $X(t)$, respectively, and $x_0\in(0,K)$ is the initial size of the infected population.
As shown in [35], the process $X(t)$ can be transformed into a time-inhomogeneous Wiener process $Y(t)$ with drift and infinitesimal variance
$$B_1(t)=\lambda(t),\qquad B_2(t)=\sigma^2(t), \tag{2.6}$$
by using the transformation:
$$y=\int_{x_0}^{x}\frac{K\,dz}{z\,(K-z)}=\ln\left[\frac{x\,(K-x_0)}{x_0\,(K-x)}\right],\qquad y_0=0. \tag{2.7}$$
We point out that $Y(t)$ is a Gauss-Markov process with mean
$$E[Y(t)]=\Lambda(t|t_0) \tag{2.8}$$
and covariance function
$$\mathrm{cov}[Y(s),Y(t)]=V(s|t_0),\qquad t_0\le s\le t. \tag{2.9}$$
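Further, since the transition pdf $f_Y(y,t|y_0,t_0)$ of $Y(t)$ is normal with mean $y_0+\Lambda(t|t_0)$ and variance $V(t|t_0)$, we can obtain the transition pdf of the process $X(t)$:
$$f_X(x,t|x_0,t_0)=\frac{K}{x\,(K-x)}\,\frac{1}{\sqrt{2\pi V(t|t_0)}}\exp\left\{-\frac{\left[\ln\left(\frac{x\,(K-x_0)}{x_0\,(K-x)}\right)-\Lambda(t|t_0)\right]^2}{2V(t|t_0)}\right\}. \tag{2.10}$$
Moreover, the transition distribution function of $X(t)$ is
$$F_X(x,t|x_0,t_0)=\frac{1}{2}\left\{1+\mathrm{Erf}\left[\frac{\ln\left(\frac{x\,(K-x_0)}{x_0\,(K-x)}\right)-\Lambda(t|t_0)}{\sqrt{2V(t|t_0)}}\right]\right\},\qquad x,x_0\in(0,K), \tag{2.11}$$
where $\mathrm{Erf}(z)=\frac{2}{\sqrt{\pi}}\int_0^z e^{-u^2}\,du$ is the error function.
The conditional median $\mu[X(t)|X(t_0)=x_0]$ of the process $X(t)$ can be obtained from (2.11) by imposing that $F_X(x,t|x_0,t_0)=1/2$ for $t\ge t_0$, so one has:
$$\mu[X(t)|X(t_0)=x_0]=\frac{K\,x_0}{x_0+(K-x_0)\,e^{-\Lambda(t|t_0)}},\qquad t\ge t_0. \tag{2.12}$$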
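The transformation (2.7), the transition pdf (2.10), the distribution function (2.11) and the median (2.12) can be evaluated with a few lines of code. The following sketch uses hypothetical function names and takes $\Lambda(t|t_0)$ and $V(t|t_0)$ as precomputed numbers; the values in the example ($\Lambda=2$, $V=0.5$) are illustrative only.

```python
import math

def transform(x, x0, K):
    """Transformation (2.7): maps x in (0, K) to the Wiener scale."""
    return math.log(x * (K - x0) / (x0 * (K - x)))

def transition_pdf(x, x0, K, big_lambda, v):
    """Transition pdf (2.10); big_lambda = Lambda(t|t0), v = V(t|t0)."""
    y = transform(x, x0, K)
    jacobian = K / (x * (K - x))
    return jacobian * math.exp(-(y - big_lambda) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

def transition_cdf(x, x0, K, big_lambda, v):
    """Transition distribution function (2.11)."""
    u = (transform(x, x0, K) - big_lambda) / math.sqrt(2.0 * v)
    return 0.5 * (1.0 + math.erf(u))

def conditional_median(x0, K, big_lambda):
    """Conditional median (2.12)."""
    return K * x0 / (x0 + (K - x0) * math.exp(-big_lambda))

# Illustrative values: K = 200, x0 = 20, Lambda = 2, V = 0.5.
med = conditional_median(20.0, 200.0, 2.0)
print(med, transition_cdf(med, 20.0, 200.0, 2.0, 0.5))
```

Evaluating the distribution function at the median should return one half, and a finite-difference derivative of the distribution function should agree with the pdf.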
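Moreover, for $m=1,2,\ldots$, the $m$-th conditional moment of $X(t)$ is:
$$E[X^m(t)|X(t_0)=x_0]=\frac{K^m}{\sqrt{\pi}}\int_{-\infty}^{+\infty}\left[1+\frac{K-x_0}{x_0}\exp\left\{-z\sqrt{2V(t|t_0)}-\Lambda(t|t_0)\right\}\right]^{-m}e^{-z^2}\,dz. \tag{2.13}$$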
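The integral in (2.13) has no closed form but is easy to approximate numerically, since the integrand decays like $e^{-z^2}$. The sketch below uses a plain trapezoidal rule on a truncated interval; the function name, the truncation at $|z|\le 8$ and the number of nodes are assumptions of this sketch, and a Gauss-Hermite rule would be a natural alternative.

```python
import math

def conditional_moment(m, x0, K, big_lambda, v, n_nodes=2001, z_max=8.0):
    """Approximate the m-th conditional moment (2.13) with the
    trapezoidal rule on [-z_max, z_max] (sketch assumption)."""
    h = 2.0 * z_max / (n_nodes - 1)
    total = 0.0
    for k in range(n_nodes):
        z = -z_max + k * h
        weight = 0.5 if k in (0, n_nodes - 1) else 1.0
        ratio = (K - x0) / x0 * math.exp(-z * math.sqrt(2.0 * v) - big_lambda)
        total += weight * (1.0 + ratio) ** (-m) * math.exp(-z * z)
    return K ** m / math.sqrt(math.pi) * total * h

# Illustrative values: K = 200, x0 = 20, Lambda = 2, V = 0.5.
mean = conditional_moment(1, 20.0, 200.0, 2.0, 0.5)
second = conditional_moment(2, 20.0, 200.0, 2.0, 0.5)
print(mean, second - mean ** 2)   # conditional mean and variance
```

As $V(t|t_0)\to 0$ the randomness vanishes and the $m$-th moment reduces to the $m$-th power of the deterministic solution, which gives a simple check of the quadrature.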
Such analytical results will be used in the following section to make inference for the process X(t), to provide a fitting procedure based on GMM.
The transformation (2.7) is the basis of our estimation procedure for the functions $\lambda(t)$ and $\sigma^2(t)$. The idea is to estimate these functions by making inference on the transformed Wiener process $Y(t)$. In particular, from (2.8) and (2.9) we obtain:
$$\lambda(t)=\frac{dE[Y(t)]}{dt}, \tag{3.1}$$
$$\sigma^2(s)=\frac{d\,\mathrm{cov}[Y(s),Y(t)]}{ds},\qquad t_0\le s\le t. \tag{3.2}$$
In the following, we assume that the carrying capacity $K$ is known, since it represents the total size of the population, whereas the functions to be estimated are $\lambda(t)$ and $\sigma^2(t)$, for $t$ in the observation interval $[t_0,T]$.
We consider a discrete sampling of the process (2.4) based on d sample paths observed at the times tj, with j=1,…,n. For i=1,…,d, let xij be the observed values of the i-th path at times tj, and the values xi1 represent the initial point of the sample paths.
The inference procedure is illustrated in the following:
● From the observed data $x_{ij}$, for $i=1,\ldots,d$ and $j=1,\ldots,n$, of the process $X(t)$, obtain the points $y_{ij}$ as follows:
$$y_{ij}=\ln\left[\frac{x_{ij}\,(K-x_{i1})}{x_{i1}\,(K-x_{ij})}\right]. \tag{3.3}$$
Such points can be considered as observations of the Wiener process $Y(t)$ given in (2.6).
● From the data $y_{ij}$, for $i=1,\ldots,d$, obtain the sample mean $\mu_j$
$$\mu_j=\frac{1}{d}\sum_{i=1}^{d}y_{ij},\qquad j=1,\ldots,n, \tag{3.4}$$
and the sample covariance $\nu_j$ between two subsequent observations $Y(t_{j-1})$ and $Y(t_j)$:
$$\nu_j=\frac{1}{d-1}\sum_{i=1}^{d}(y_{ij}-\mu_j)(y_{i,j-1}-\mu_{j-1}),\qquad j=2,\ldots,n. \tag{3.5}$$
● Interpolate the values $\mu_j$ for $j=1,\ldots,n$ and $\nu_j$ for $j=2,\ldots,n$. Let $\hat{M}(t)$ and $\hat{\Sigma}(t)$ be the functions obtained.
● Evaluate the derivatives of $\hat{M}(t)$ and $\hat{\Sigma}(t)$.
● From Eqs. (3.1) and (3.2), obtain the estimates of $\lambda(t)$ and $\sigma^2(t)$ as follows:
$$\hat{\lambda}(t)=\frac{d\hat{M}(t)}{dt},\qquad \hat{\sigma}^2(t)=\frac{d\hat{\Sigma}(t)}{dt}. \tag{3.6}$$
We point out that the procedure just illustrated combines a GMM estimation with the interpolation of the sample mean and covariance points. Hence, the consistency of the estimators (3.6) of λ(t) and σ2(t) derives from the consistency of the GMM estimator and from the uniform convergence of the interpolation method. In our analysis, for example, we can consider cubic spline interpolation.
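The steps of the procedure can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the observation times are assumed equally spaced with step `dt`, and the interpolation-plus-differentiation step is replaced by simple finite differences on the grid instead of the cubic spline mentioned in the text, so that only the standard library is needed.

```python
import math

def to_wiener(paths, K):
    """Step (3.3): map each observed path x_i1..x_in to y_i1..y_in."""
    return [[math.log(x * (K - p[0]) / (p[0] * (K - x))) for x in p] for p in paths]

def sample_mean(ys):
    """Step (3.4): sample mean at each observation time."""
    d, n = len(ys), len(ys[0])
    return [sum(y[j] for y in ys) / d for j in range(n)]

def lag_covariance(ys, mu):
    """Step (3.5): sample covariance between consecutive observations.
    Entry k (0-based) pairs the observations at times t_k and t_{k+1}."""
    d, n = len(ys), len(ys[0])
    return [sum((y[j] - mu[j]) * (y[j - 1] - mu[j - 1]) for y in ys) / (d - 1)
            for j in range(1, n)]

def derivative(values, dt):
    """Finite differences on an equally spaced grid: central inside,
    one-sided at the two ends (stand-in for spline differentiation)."""
    n = len(values)
    out = []
    for j in range(n):
        lo, hi = max(j - 1, 0), min(j + 1, n - 1)
        out.append((values[hi] - values[lo]) / ((hi - lo) * dt))
    return out

def estimate_functions(paths, K, dt):
    """Return grid values of lambda-hat and sigma^2-hat as in (3.6)."""
    ys = to_wiener(paths, K)
    mu = sample_mean(ys)
    nu = lag_covariance(ys, mu)
    return derivative(mu, dt), derivative(nu, dt)
```

Here `paths` is a list of $d$ lists, each holding the $n$ observations of one sample path. Replacing `derivative` with the derivative of a fitted cubic spline would bring the sketch closer to the procedure described above and would smooth the estimates.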
Finally, we note that in real applications, it could happen that the observed paths do not reach the carrying capacity since the phenomenon is observed before K is achieved. In these cases, we argue that an a priori rough estimate of the parameter K can be obtained by starting from the maximum observed point. Such an estimate can be used to obtain the points yij for i=1,…,d and j=1,…,n from which, by using the previous procedure, we fit the functions λ(t) and σ2(t). An improvement of the estimate of K could be then obtained by looking at the conditional median function (2.12) and by implementing a step-by-step procedure similar to that proposed in [11]. This topic will be the subject of future investigations.
In order to evaluate the suggested procedure, in the following we consider several simulation experiments in which $d$ sample paths of $X(t)$ are simulated, each including $n$ equally spaced observations in $[t_0,T]$ with $t_0=0$, $T=50$, $t_j-t_{j-1}=\Delta=0.01$ ($j=2,\ldots,n$) and $x_0=20$. The estimation of the unknown functions is replicated $N=500$ times. In all the cases we choose the carrying capacity $K=200$. In Section 4.1 we consider the case in which both the functions $\lambda(t)$ and $\sigma^2(t)$ are time independent, whereas in Section 4.2 one or both of them are continuous functions of time.
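Suppose that $\lambda(t)$ and $\sigma^2(t)$ in (2.5) are constant functions. In particular, in this simulation study, we consider
$$\lambda(t)=\lambda=0.4,\qquad \sigma^2(t)=\sigma^2=0.1.$$
In this case, the process $X(t)$ in (2.4) is time homogeneous, so the inference for the parameters $\lambda$ and $\sigma^2$ can be made by means of classical maximum likelihood estimation (MLE). Indeed, it is easy to obtain the MLE for $\lambda$ and $\sigma^2$ by looking at the log-likelihood function of the process $Y(t)$ in (2.6):
$$\log L_Y(\lambda,\sigma^2)=-\frac{d(n-1)}{2}\log(2\pi\Delta)-\frac{d(n-1)}{2}\log\sigma^2-\frac{1}{2\sigma^2\Delta}\sum_{i=1}^{d}\sum_{j=2}^{n}\left(y_{ij}-y_{i,j-1}-\lambda\Delta\right)^2. \tag{4.1}$$
In particular, we obtain the following estimators:
$$\hat{\lambda}_{MLE}=\frac{1}{d(n-1)\Delta}\sum_{i=1}^{d}\sum_{j=2}^{n}\left(y_{ij}-y_{i,j-1}\right),\qquad \hat{\sigma}^2_{MLE}=\frac{1}{d(n-1)\Delta}\sum_{i=1}^{d}\sum_{j=2}^{n}\left(y_{ij}-y_{i,j-1}-\hat{\lambda}_{MLE}\Delta\right)^2. \tag{4.2}$$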
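For the time-homogeneous case, the maximum likelihood estimators follow directly from the log-likelihood (4.1): the increments of $Y(t)$ over a step $\Delta$ are independent normal variables with mean $\lambda\Delta$ and variance $\sigma^2\Delta$. A minimal sketch, with a hypothetical function name and with `ys` denoting the transformed paths $y_{ij}$ stored as a list of lists:

```python
def mle_constant(ys, dt):
    """Maximum likelihood estimates of constant lambda and sigma^2
    from the increments of the transformed paths, maximizing (4.1)."""
    d, n = len(ys), len(ys[0])
    increments = [y[j] - y[j - 1] for y in ys for j in range(1, n)]
    count = d * (n - 1)
    lam_hat = sum(increments) / (count * dt)
    sigma2_hat = sum((inc - lam_hat * dt) ** 2 for inc in increments) / (count * dt)
    return lam_hat, sigma2_hat

# Noise-free illustration: increments equal to 0.4 * dt exactly.
dt = 0.01
ys = [[0.4 * j * dt for j in range(11)] for _ in range(3)]
print(mle_constant(ys, dt))
```

Note that the estimate of $\lambda$ telescopes to the average of $(y_{in}-y_{i1})$ over the paths divided by the total observation time, so it depends only on the end points of each path.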
In Figure 1 we compare the results obtained via the MLE method with those obtained via our procedure, denoted in the following by GMM. In particular, the top of Figure 1 shows the box-plots of the 500 estimates of λ (on the left) and of σ2 (on the right). We can see that the estimates of λ by means of MLE and GMM have nearly the same distribution, ranging from 0.46 to 0.51. In contrast, the estimates of σ2 via MLE present a very low variability compared to those of the GMM, so the MLE is to be preferred in this case. However, we point out that the MLE assumes that the parameters λ and σ2 are constant, and so it looks at the log-likelihood as a function of such parameters; our procedure, instead, can be applied in the general case in which we do not have such information and the parameters generally depend on time. The bottom of Figure 1 shows the kernel densities of the MLE and GMM standardized estimates of λ (on the left) and σ2 (on the right). The red curves represent the density of the MLE, whereas the black curves refer to the GMM. We can observe that the sampling distributions of both estimators under both methods are quite symmetric and nearly superimposable.
Finally, let
$$\mathrm{MRE}(\hat{\lambda})=\frac{1}{500}\sum_{i=1}^{500}\frac{|\hat{\lambda}_i-\lambda|}{\lambda},\qquad \mathrm{MRE}(\hat{\sigma}^2)=\frac{1}{500}\sum_{i=1}^{500}\frac{|\hat{\sigma}^2_i-\sigma^2|}{\sigma^2}$$
be the mean relative errors (MREs) of the estimators $\hat{\lambda}$ and $\hat{\sigma}^2$. In Table 1 the MREs are shown for different choices of the parameters $\lambda$ and $\sigma^2$, i.e., $\lambda=0.4,0.6,0.8,1.0,1.2$ and $\sigma^2=0.05,0.1$, and for the two considered methods. The MREs obtained via the MLE and via the GMM are generally comparable, especially for $\lambda$, even though the MLE method provides lower errors, as expected from the box-plots in Figure 1. This is due to the strong assumption of the constancy of the parameters. Moreover, the MREs increase as $\sigma^2$ increases for both the estimators and for both the methods.
| $\lambda$ | $\sigma^2$ | MRE($\hat{\lambda}_{MLE}$) | MRE($\hat{\lambda}_{GMM}$) | MRE($\hat{\sigma}^2_{MLE}$) | MRE($\hat{\sigma}^2_{GMM}$) |
|---|---|---|---|---|---|
| 0.4 | 0.05 | 0.113 | 0.110 | 0.010 | 0.123 |
| 0.6 | 0.05 | 0.078 | 0.075 | 0.014 | 0.115 |
| 0.8 | 0.05 | 0.116 | 0.119 | 0.213 | 0.175 |
| 1.0 | 0.05 | 0.282 | 0.284 | 0.474 | 0.299 |
| 1.2 | 0.05 | 0.396 | 0.398 | 0.712 | 0.389 |
| 0.4 | 0.1 | 0.225 | 0.223 | 0.011 | 0.149 |
| 0.6 | 0.1 | 0.131 | 0.128 | 0.023 | 0.114 |
| 0.8 | 0.1 | 0.103 | 0.106 | 0.131 | 0.190 |
| 1.0 | 0.1 | 0.271 | 0.273 | 0.225 | 0.315 |
| 1.2 | 0.1 | 0.387 | 0.389 | 0.303 | 0.406 |
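The mean relative error reported in the table is straightforward to compute from the replicated estimates. A minimal sketch with a hypothetical function name, where `estimates` holds the values obtained in the replications and `true_value` is the (positive) parameter used in the simulation:

```python
def mean_relative_error(estimates, true_value):
    """Mean relative error of a list of estimates of a positive parameter."""
    return sum(abs(e - true_value) for e in estimates) / (len(estimates) * true_value)

# Toy illustration with two replications of an estimate of lambda = 0.4.
print(mean_relative_error([0.3, 0.5], 0.4))
```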
In this section we consider the general case in which one or both of the functions $\lambda(t)$ and $\sigma^2(t)$ are time dependent. In the following we analyze three different cases:
a. $\lambda(t)=0.4+\sin t$, $\sigma^2(t)=\sigma^2=0.1$;
b. $\lambda(t)=0.4+\sin t$, $\sigma^2(t)=0.1+0.01\,(1-e^{-2t})^2$;
c. $\lambda(t)=\lambda=0.4$, $\sigma^2(t)=0.01\,(1.2+\sin t)$.
We note that in all the cases the chosen functions are continuous and bounded. In particular, we choose $\lambda(t)$ constant or periodic to account for a seasonality effect of the infectious disease. Further, in Case a and Case b, $\lambda(t)$ periodically becomes negative, which covers situations in which a period of growth in the infection rate is followed by a period of regression. Three different choices are instead made for the function $\sigma^2(t)$: constant, asymptotically constant and periodic.
The results for Case a are shown in Figure 2 for the function $\lambda(t)$ (on the left) and for $\sigma^2(t)$ (on the right). The red curve is the true function and the blue curve is the mean of the 500 estimates $\hat{\lambda}_i(t)$ and $\hat{\sigma}^2_i(t)$, denoted by $\hat{\lambda}(t)$ and $\hat{\sigma}^2(t)$. The black lines represent the observed confidence intervals for the functions $\lambda(t)$ and $\sigma^2(t)$, respectively, obtained as
$$\hat{\lambda}(t)\pm\mathrm{sd}(\hat{\lambda}(t)),\qquad \hat{\sigma}^2(t)\pm\mathrm{sd}(\hat{\sigma}^2(t)),$$
where
$$\mathrm{sd}(\hat{\lambda}(t))=\sqrt{\frac{1}{500}\sum_{i=1}^{500}\left[\hat{\lambda}_i(t)-\hat{\lambda}(t)\right]^2},\qquad \mathrm{sd}(\hat{\sigma}^2(t))=\sqrt{\frac{1}{500}\sum_{i=1}^{500}\left[\hat{\sigma}^2_i(t)-\hat{\sigma}^2(t)\right]^2}$$
are the standard deviations of the estimators. We can observe on the left of Figure 2 that $\hat{\lambda}(t)$ fits the true function $\lambda(t)$ very well, and that the confidence interval is narrow owing to the low values of the standard deviation. For the function $\sigma^2(t)$, we can see that the estimate $\hat{\sigma}^2(t)$ decreases to the true value 0.1 starting from time $t=10$, and the confidence interval always contains the true value.
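The mean curve and the one-standard-deviation band described above can be computed pointwise from the replicated estimates. In this sketch the function name is hypothetical and `curves` is assumed to be a list of replications, each a list of estimated values on a common time grid; the standard deviation uses the divisor equal to the number of replications, as in the formula above.

```python
import math

def pointwise_band(curves):
    """For each grid point return (mean - sd, mean, mean + sd),
    with sd computed across replications using divisor len(curves)."""
    n_rep = len(curves)
    band = []
    for values in zip(*curves):          # values at one grid point
        mean = sum(values) / n_rep
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n_rep)
        band.append((mean - sd, mean, mean + sd))
    return band

# Toy illustration with two replications on a two-point grid.
print(pointwise_band([[1.0, 2.0], [3.0, 2.0]]))
```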
In the Case b we consider the same periodic function of the Case a for λ(t), whereas σ2(t) is a time dependent function that asymptotically tends to the constant value 0.11 and this value is quickly reached. This is the reason for which the results for the Case b, illustrated in Figure 3 for the function λ(t) (on the left) and for σ2(t) (on the right), seem similar to the results in the previous case (Figure 2), although both the functions depend on time.
The results for Case c, shown in Figure 4, are particularly interesting. Here, the constant function λ(t)=0.4 is estimated by our procedure with a function ˆλ(t) that shows a periodicity. Clearly, this periodicity is due to the presence of sin t in the function σ2(t). In this regard, we point out that the proposed procedure for estimating λ(t) and σ2(t) is nonparametric and the only assumption on the unknown functions is their boundedness. So, in Case c, the procedure, which looks at the moments, and in particular at the mean and covariance functions, captures the periodicity in the model and attributes it to both the functions λ(t) and σ2(t). In any case, the mean estimate ˆλ(t) is very close to the constant function λ(t)=0.4. Also, the function ˆσ2(t) fits the true function σ2(t)=0.01(1.2+sin t) very well, as shown on the right of Figure 4, although the confidence bands move further apart as time increases. These observations open the way to the possibility of finding a tool to discriminate between different estimated models. From this perspective, informational divergence could be a criterion to be used in a future study.
In this section, we apply our estimation procedure to the dataset twentymeas included in the R package tsiR (see [38]). It contains biweekly data (IP = 2) related to measles infection for twenty locations in England from 1944 to 1964 and was studied in [39]. We point out that the application presented here is primarily an illustration of how the proposed procedure can be used with real data. People infected by measles become immune, so more complicated models, such as SIR and susceptible-infected-recovered-susceptible (SIRS), may be more adequate. However, even if simplified, we think that the SI model can also be used in this case. In fact, individuals who recover from measles become immune, but in a certain sense they remain in the infected class; the only difference is that they cannot infect other individuals. In our opinion, this aspect is taken into account by the fact that the transmission intensity function depends on time; in particular, we expect that λ(t) becomes very small as time increases, showing that the disease is transmitted less frequently as recovered people become immunized.
For our analysis we first consider the cumulative number of the infected for each location and we normalize this number by the maximum population size, identified by the variable "pop" of the dataset. In Figure 5, the sample paths of the infected population and the normalized sample paths are shown. Here, we consider each normalized time series of the infected people as a sample path of the same diffusion process X(t) modeled via (2.3). From the normalized process data (on the right of Figure 5), we fix K=0.25, i.e., the asymptotic infected population is 25% of the total population. In Figure 6, the estimated functions ˆλ(t) (on the top) and ˆσ2(t) (on the bottom) are plotted for the whole period of observation on the left, and for the period 1946 to 1965 on the right. We can see that ˆλ(t) presents a sharp decrease in the first period of observation, due to its high initial value, followed by a rapid tendency toward constant behavior over time. Further, in this second period, the behavior of ˆλ(t) shows a certain seasonality, as expected in infectious diseases. On the bottom, a higher value of the estimate ˆσ2(t) is also observed in the initial period, after which it continues to decrease asymptotically to 0.
We have considered a stochastic diffusion process for the time-inhomogeneous deterministic SI epidemic model and we have proposed an estimation procedure for inference on this process. A relevant issue concerning the model is that it works quite well when the total population size is large. When the size is small, several authors recommend the use of discrete state space processes. However, it should be noted that the proposed estimation procedure can also be applied when the population size is not too large. An example is the simulation setting described at the beginning of Section 4, where the total size does not differ much from the initial one (200 vs. 20).
Another interesting aspect is related to the assumption that the total population size is constant, which is usually a generalized assumption in this type of model. This is due to the fact that they reflect a study carried out over a period of time that is not too long to consider relevant variations in the size of the population. One line of future work is to consider a population growth pattern, either logistic, Gompertz, or other growth models.
To estimate the functions λ(t) and σ2(t) that characterize the process, we have proposed a procedure based on the GMM combined with the interpolation of the sample mean and covariance points, so the consistency of the estimators derives from the consistency of the GMM estimator and the uniform convergence of the interpolation method used. It should be noted that the proposed procedure does not make any assumptions about the functional form of the unknown functions. The only necessary assumption is that the functions involved are continuous and bounded in the observation interval. Several simulation studies have been carried out to assess the validity of the proposed methodology.
The estimates obtained in the considered simulation studies are very close to the "true" functions. In particular, for the time-homogeneous case, we have compared the results obtained via the MLE procedure with those obtained with our procedure: for the parameter λ, representative of the infection rate, the MLE estimates are very close to those obtained via our method. Some differences are instead found for the parameter σ2, related to environmental variability, for which the MLE performs better owing to the strong assumption of the constancy of the parameters. However, the MREs for the two methods and for the parameters are in all the cases comparable. Other simulation cases have been analyzed in which the functions λ(t) and/or σ2(t) are time-dependent, with particular reference to situations in which seasonality effects of the dynamics of the infection are included. Finally, we have applied our estimation procedure to the dataset twentymeas included in the R package tsiR (see [38]), which contains biweekly data (IP = 2) related to measles infection for twenty locations in England from 1944 to 1964. Regarding the estimates obtained for λ(t) and σ2(t) via our method, we observe a sharp decrease in the first period of observation due to their high initial values, followed by a rapid tendency toward constant behavior over time. Further, in this second period, the behavior of the transmission intensity function shows a certain seasonality, as expected in infectious diseases. Instead, the estimate ˆσ2(t) rapidly decreases to a constant value without showing any periodicity.
The authors are special issue editors for Mathematical Biosciences and Engineering and were not involved in the editorial review or the decision to publish this article. All authors declare that there are no competing interests.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
This research is partially supported by MUR-PRIN 2022, project 2022XZSAFN "Anomalous Phenomena on Regular and Irregular Domains: Approximating Complexity for the Applied Sciences" (Italy), by MUR-PRIN 2022 PNRR, project P2022XSF5H "Stochastic Models in Biomathematics and Applications" (Italy), by PID2020-1187879GB-100 and CEX2020-001105-M grants, funded by MCIN/AEI/10.13039/501100011033 (Spain). G. Albano and V. Giorno are members of the GNCS-INdAM.
[1] | Y. Pinchover, J. Rubinstein, An Introduction to Partial Differential Equations, Cambridge University Press, 2005. https://doi.org/10.1017/CBO9780511801228 |
[2] | J. Douglas, B. F. Jones, On predictor-corrector methods for nonlinear parabolic differential equations, J. Soc. Ind. Appl. Math., 11 (1963), 195–204. https://doi.org/10.1137/0111015 |
[3] | A. Wambecq, Rational Runge-Kutta methods for solving systems of ordinary differential equations, Computing, 20 (1978), 333–342. https://doi.org/10.1007/BF02252381 |
[4] | J. N. Reddy, An Introduction to the Finite Element Method, McGraw-Hill, 1993. |
[5] | R. J. LeVeque, Finite Difference Methods for Ordinary and Partial Differential Equations: Steady-state and Time-dependent Problems, SIAM, 2007. https://doi.org/10.1137/1.9780898717839 |
[6] | I. E. Lagaris, A. Likas, D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Trans. Neural Networks, 9 (1998), 987–1000. https://doi.org/10.1109/72.712178 |
[7] | S. Mall, S. Chakraverty, Application of Legendre neural network for solving ordinary differential equations, Appl. Soft Comput., 43 (2016), 347–356. https://doi.org/10.1016/j.asoc.2015.10.069 |
[8] | T. T. Dufera, Deep neural network for system of ordinary differential equations: Vectorized algorithm and simulation, Mach. Learn. Appl., 5 (2021), 100058. https://doi.org/10.1016/j.mlwa.2021.100058 |
[9] | J. A. Rivera, J. M. Taylor, A. J. Omella, D. Pardo, On quadrature rules for solving partial differential equations using neural networks, Comput. Methods Appl. Mech. Eng., 393 (2022), 114710. https://doi.org/10.1016/j.cma.2022.114710 |
[10] | L. S. Tan, Z. Zainuddin, P. Ong, Wavelet neural networks based solutions for elliptic partial differential equations with improved butterfly optimization algorithm training, Appl. Soft Comput., 95 (2020), 106518. https://doi.org/10.1016/j.asoc.2020.106518 |
[11] | Z. Sabir, S. A. Bhat, M. A. Z. Raja, S. E. Alhazmi, A swarming neural network computing approach to solve the Zika virus model, Eng. Appl. Artif. Intell., 126 (2023), 106924. https://doi.org/10.1016/j.engappai.2023.106924 |
[12] | M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys., 378 (2019), 686–707. https://doi.org/10.1016/j.jcp.2018.10.045 |
[13] | E. Schiassi, R. Furfaro, C. Leake, M. De Florio, H. Johnston, D. Mortari, Extreme theory of functional connections: A fast physics-informed neural network method for solving ordinary and partial differential equations, Neurocomputing, 457 (2021), 334–356. https://doi.org/10.1016/j.neucom.2021.06.015 |
[14] | S. Dong, Z. Li, Local extreme learning machines and domain decomposition for solving linear and nonlinear partial differential equations, Comput. Methods Appl. Mech. Eng., 387 (2021), 114129. https://doi.org/10.1016/j.cma.2021.114129 |
[15] | S. Dong, J. Yang, On computing the hyperparameter of extreme learning machines: Algorithm and application to computational PDEs, and comparison with classical and high-order finite elements, J. Comput. Phys., 463 (2022), 111290. https://doi.org/10.1016/j.jcp.2022.111290 |
[16] | G. B. Huang, Q. Y. Zhu, C. K. Siew, Extreme learning machine: Theory and applications, Neurocomputing, 70 (2006), 489–501. https://doi.org/10.1016/j.neucom.2005.12.126 |
[17] | V. Dwivedi, B. Srinivasan, Physics-informed extreme learning machine (PIELM) – A rapid method for the numerical solution of partial differential equations, Neurocomputing, 391 (2020), 96–118. https://doi.org/10.1016/j.neucom.2019.12.099 |
[18] | F. Calabrò, G. Fabiani, C. Siettos, Extreme learning machine collocation for the numerical solution of elliptic PDEs with sharp gradients, Comput. Methods Appl. Mech. Eng., 387 (2021), 114188. https://doi.org/10.1016/j.cma.2021.114188 |
[19] | S. M. Sivalingam, P. Kumar, V. Govindaraj, A Chebyshev neural network-based numerical scheme to solve distributed-order fractional differential equations, Comput. Math. Appl., 164 (2024), 150–165. https://doi.org/10.1016/j.camwa.2024.04.005 |
[20] | A. Jafarian, M. Mokhtarpour, D. Baleanu, Artificial neural network approach for a class of fractional ordinary differential equations, Neural Comput. Appl., 28 (2017), 765–773. https://doi.org/10.1007/s00521-015-2104-8 |
[21] | I. Ali, S. U. Khan, A dynamic competition analysis of stochastic fractional differential equation arising in finance via the pseudospectral method, Mathematics, 11 (2023), 1328. https://doi.org/10.3390/math11061328 |
[22] | S. U. Khan, M. Ali, I. Ali, A spectral collocation method for stochastic Volterra integro-differential equations and its error analysis, Adv. Differ. Equ., 2019 (2019), 161. https://doi.org/10.1186/s13662-019-2096-2 |
[23] | I. Ali, S. U. Khan, Dynamics and simulations of stochastic COVID-19 epidemic model using Legendre spectral collocation method, AIMS Math., 8 (2023), 4220–4236. https://doi.org/10.3934/math.2023210 |
[24] | S. U. Khan, I. Ali, Application of Legendre spectral-collocation method to delay differential and stochastic delay differential equations, AIP Adv., 8 (2018), 035301. https://doi.org/10.1063/1.5016680 |
[25] | C. Canuto, M. Y. Hussaini, A. Quarteroni, T. A. Zang, Spectral Methods: Fundamentals in Single Domains, Springer, 2006. https://doi.org/10.1007/978-3-540-30726-6 |
[26] | G. Mastroianni, D. Occorsio, Optimal systems of nodes for Lagrange interpolation on bounded intervals: A survey, J. Comput. Appl. Math., 134 (2001), 325–341. https://doi.org/10.1016/S0377-0427(00)00557-4 |
[27] | D. Gottlieb, S. A. Orszag, Numerical Analysis of Spectral Methods: Theory and Applications, SIAM, 1977. https://doi.org/10.1137/1.9781611970425 |
[28] | J. P. Boyd, Chebyshev and Fourier Spectral Methods, Dover Publications, 2001. |
[29] | J. S. Hesthaven, S. Gottlieb, D. Gottlieb, Spectral Methods for Time-dependent Problems, Cambridge University Press, 2007. https://doi.org/10.1017/CBO9780511618352 |
[30] | C. Canuto, M. Y. Hussaini, A. Quarteroni, T. A. Zang, Spectral Methods in Fluid Dynamics, Springer, 2012. https://doi.org/10.1007/978-3-642-84108-8 |
[31] | J. Shen, T. Tang, L. L. Wang, Spectral Methods: Algorithms, Analysis and Applications, Springer, 2011. https://doi.org/10.1007/978-3-540-71041-7 |
[32] | N. Liu, Theory and Applications of Legendre Polynomials and Wavelets, Ph.D. thesis, The University of Toledo, 2008. |
[33] | S. Wang, X. Yu, P. Perdikaris, When and why PINNs fail to train: A neural tangent kernel perspective, J. Comput. Phys., 449 (2022), 110768. https://doi.org/10.1016/j.jcp.2021.110768 |
[34] | P. Yin, S. Ling, W. Ying, Chebyshev spectral neural networks for solving partial differential equations, preprint, arXiv: 2407.03347. |
[35] | Y. Yang, M. Hou, H. Sun, T. Zhang, F. Weng, J. Luo, Neural network algorithm based on Legendre improved extreme learning machine for solving elliptic partial differential equations, Soft Comput., 24 (2020), 1083–1096. https://doi.org/10.1007/s00500-019-03944-1 |
[36] | M. Xia, X. Li, Q. Shen, T. Chou, Learning unbounded-domain spatiotemporal differential equations using adaptive spectral methods, J. Appl. Math. Comput., 70 (2024), 4395–4421. https://doi.org/10.1007/s12190-024-02131-2 |
[37] | Y. Ye, Y. Li, H. Fan, X. Liu, H. Zhang, SLeNN-ELM: A shifted Legendre neural network method for fractional delay differential equations based on extreme learning machine, Netw. Heterog. Media, 18 (2023), 494–512. https://doi.org/10.3934/nhm.2023020 |
[38] | Y. Yang, M. Hou, J. Luo, A novel improved extreme learning machine algorithm in solving ordinary differential equations by Legendre neural network methods, Adv. Differ. Equ., 2018 (2018), 469. https://doi.org/10.1186/s13662-018-1927-x |
[39] | Y. Wang, S. Dong, An extreme learning machine-based method for computational PDEs in higher dimensions, Comput. Methods Appl. Mech. Eng., 418 (2024), 116578. https://doi.org/10.1016/j.cma.2023.116578 |
[40] | D. Yuan, W. Liu, Y. Ge, G. Cui, L. Shi, F. Cao, Artificial neural networks for solving elliptic differential equations with boundary layer, Math. Methods Appl. Sci., 45 (2022), 6583–6598. https://doi.org/10.1002/mma.8192 |
[41] | H. Liu, B. Xing, Z. Wang, L. Li, Legendre neural network method for several classes of singularly perturbed differential equations based on mapping and piecewise optimization technology, Neural Process. Lett., 51 (2020), 2891–2913. https://doi.org/10.1007/s11063-020-10232-9 |
[42] | X. Li, J. Wu, X. Tai, J. Xu, Y. Wang, Solving a class of multi-scale elliptic PDEs by Fourier-based mixed physics-informed neural networks, J. Comput. Phys., 508 (2024), 113012. https://doi.org/10.1016/j.jcp.2024.113012 |
[43] | S. Zhang, J. Deng, X. Li, Z. Zhao, J. Wu, W. Li, et al., Solving the one-dimensional vertical suspended sediment mixing equation with arbitrary eddy diffusivity profiles using temporal normalized physics-informed neural networks, Phys. Fluids, 36 (2024), 017132. https://doi.org/10.1063/5.0179223 |
[44] | X. Li, J. Deng, J. Wu, S. Zhang, W. Li, Y. Wang, Physics-informed neural networks with soft and hard boundary constraints for solving advection-diffusion equations using Fourier expansions, Comput. Math. Appl., 159 (2024), 60–75. https://doi.org/10.1016/j.camwa.2024.01.021 |
[45] | X. Li, J. Wu, L. Zhang, X. Tai, Solving a class of high-order elliptic PDEs using deep neural networks based on its coupled scheme, Mathematics, 10 (2022), 4186. https://doi.org/10.3390/math10224186 |
[46] | X. Li, J. Wu, Y. Huang, Z. Ding, X. Tai, L. Liu, et al., Augmented physics informed extreme learning machine to solve the biharmonic equations via Fourier expansions, preprint, arXiv: 2310.13947. |
[47] | Z. Fu, W. Xu, S. Liu, Physics-informed kernel function neural networks for solving partial differential equations, Neural Networks, 172 (2024), 106098. https://doi.org/10.1016/j.neunet.2024.106098 |
[48] | J. Bai, G. Liu, A. Gupta, L. Alzubaidi, X. Feng, Y. Gu, Physics-informed radial basis network (PIRBN): A local approximating neural network for solving nonlinear partial differential equations, Comput. Methods Appl. Mech. Eng., 415 (2023), 116290. https://doi.org/10.1016/j.cma.2023.116290 |
[49] | A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, et al., Automatic differentiation in PyTorch, in NIPS 2017 Workshop on Autodiff, (2017), 1–4. |
[50] | D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, preprint, arXiv: 1412.6980. |
λ | σ² | MRE(λ̂_MLE) | MRE(λ̂_GMM) | MRE(σ̂²_MLE) | MRE(σ̂²_GMM) |
0.4 | 0.05 | 0.113 | 0.110 | 0.010 | 0.123 |
0.6 | 0.05 | 0.078 | 0.075 | 0.014 | 0.115 |
0.8 | 0.05 | 0.116 | 0.119 | 0.213 | 0.175 |
1.0 | 0.05 | 0.282 | 0.284 | 0.474 | 0.299 |
1.2 | 0.05 | 0.396 | 0.398 | 0.712 | 0.389 |
0.4 | 0.1 | 0.225 | 0.223 | 0.011 | 0.149 |
0.6 | 0.1 | 0.131 | 0.128 | 0.023 | 0.114 |
0.8 | 0.1 | 0.103 | 0.106 | 0.131 | 0.190 |
1.0 | 0.1 | 0.271 | 0.273 | 0.225 | 0.315 |
1.2 | 0.1 | 0.387 | 0.389 | 0.303 | 0.406 |