
This work investigates the minimum eradication time in a controlled susceptible-infectious-recovered model with constant infection and recovery rates. The eradication time is defined as the earliest time the infectious population falls below a prescribed threshold and remains below it. Leveraging the fact that this problem reduces to solving a Hamilton-Jacobi-Bellman (HJB) equation, we propose a mesh-free framework based on a physics-informed neural network to approximate the solution. Moreover, leveraging the well-known structure of the optimal control of the problem, we efficiently obtain the optimal vaccination control from the minimum eradication time using the dynamic programming principle. To improve training stability and accuracy, we incorporate a variable scaling method and provide theoretical justification through a neural tangent kernel analysis. Numerical experiments show that this technique significantly enhances convergence, reducing the mean squared residual error by approximately 80% compared with standard physics-informed approaches. Furthermore, the method accurately identifies the optimal switching time. These results demonstrate the effectiveness of the proposed deep learning framework as a computational tool for solving optimal control problems in epidemic modeling as well as the corresponding HJB equations.
Citation: Minseok Kim, Yeongjong Kim, Yeoneung Kim. Physics-informed neural networks for optimal vaccination plan in SIR epidemic models[J]. Mathematical Biosciences and Engineering, 2025, 22(7): 1598-1633. doi: 10.3934/mbe.2025059
[1] | Han Ma, Qimin Zhang . Threshold dynamics and optimal control on an age-structured SIRS epidemic model with vaccination. Mathematical Biosciences and Engineering, 2021, 18(6): 9474-9495. doi: 10.3934/mbe.2021465 |
[2] | Yoon-gu Hwang, Hee-Dae Kwon, Jeehyun Lee . Feedback control problem of an SIR epidemic model based on the Hamilton-Jacobi-Bellman equation. Mathematical Biosciences and Engineering, 2020, 17(3): 2284-2301. doi: 10.3934/mbe.2020121 |
[3] | Youqiong Liu, Li Cai, Yaping Chen, Bin Wang . Physics-informed neural networks based on adaptive weighted loss functions for Hamilton-Jacobi equations. Mathematical Biosciences and Engineering, 2022, 19(12): 12866-12896. doi: 10.3934/mbe.2022601 |
[4] | Heman Shakeri, Faryad Darabi Sahneh, Caterina Scoglio, Pietro Poggi-Corradini, Victor M. Preciado . Optimal information dissemination strategy to promote preventivebehaviors in multilayer epidemic networks. Mathematical Biosciences and Engineering, 2015, 12(3): 609-623. doi: 10.3934/mbe.2015.12.609 |
[5] | Dan Shi, Shuo Ma, Qimin Zhang . Sliding mode dynamics and optimal control for HIV model. Mathematical Biosciences and Engineering, 2023, 20(4): 7273-7297. doi: 10.3934/mbe.2023315 |
[6] | Haoyu Wang, Xihe Qiu, Jinghan Yang, Qiong Li, Xiaoyu Tan, Jingjing Huang . Neural-SEIR: A flexible data-driven framework for precise prediction of epidemic disease. Mathematical Biosciences and Engineering, 2023, 20(9): 16807-16823. doi: 10.3934/mbe.2023749 |
[7] | Holly Gaff, Elsa Schaefer . Optimal control applied to vaccination and treatment strategies for various epidemiological models. Mathematical Biosciences and Engineering, 2009, 6(3): 469-492. doi: 10.3934/mbe.2009.6.469 |
[8] | Thomas Torku, Abdul Khaliq, Fathalla Rihan . SEINN: A deep learning algorithm for the stochastic epidemic model. Mathematical Biosciences and Engineering, 2023, 20(9): 16330-16361. doi: 10.3934/mbe.2023729 |
[9] | Pannathon Kreabkhontho, Watchara Teparos, Thitiya Theparod . Potential for eliminating COVID-19 in Thailand through third-dose vaccination: A modeling approach. Mathematical Biosciences and Engineering, 2024, 21(8): 6807-6828. doi: 10.3934/mbe.2024298 |
[10] | ZongWang, Qimin Zhang, Xining Li . Markovian switching for near-optimal control of a stochastic SIV epidemic model. Mathematical Biosciences and Engineering, 2019, 16(3): 1348-1375. doi: 10.3934/mbe.2019066 |
This work investigates the minimum eradication time in a controlled susceptible-infectious-recovered model with constant infection and recovery rates. The eradication time is defined as the earliest time the infectious population falls below a prescribed threshold and remains below it. Leveraging the fact that this problem reduces to solving a Hamilton-Jacobi-Bellman (HJB) equation, we propose a mesh-free framework based on a physics-informed neural network to approximate the solution. Moreover, leveraging the well-known structure of the optimal control of the problem, we efficiently obtain the optimal vaccination control from the minimum eradication time using the dynamic programming principle. To improve training stability and accuracy, we incorporate a variable scaling method and provide theoretical justification through a neural tangent kernel analysis. Numerical experiments show that this technique significantly enhances convergence, reducing the mean squared residual error by approximately 80% compared with standard physics-informed approaches. Furthermore, the method accurately identifies the optimal switching time. These results demonstrate the effectiveness of the proposed deep learning framework as a computational tool for solving optimal control problems in epidemic modeling as well as the corresponding HJB equations.
The study of vaccination strategies and eradication times in susceptible-infectious-recovered (SIR) models has a long history, beginning with the seminal work of Kermack and McKendrick [1], and its variants have received a great deal of attention during and after the outbreak of COVID-19. The controlled SIR model is a cornerstone in mathematical epidemiology and is frequently employed to study the dynamics of disease transmission and control strategies. Various optimization problems based on the SIR model, where the vaccination strategy is treated as a control, have been extensively studied within the framework of optimal control theory [2,3,4,5].
The controlled SIR model is given by
{˙S=−β(t)SI−r(t)S,˙I=β(t)SI−γ(t)I, |
for t>0, with the initial conditions S(0)=x and I(0)=y≥μ for some μ>0. Here, β(t) and γ(t) denote the infection and recovery rates, respectively, and r(t) represents a vaccination control that takes values in [0,1].
Recently, Bolzoni et al. [4] introduced the notion of minimum eradication time, defined as the first time I falls below a given threshold μ>0. For mathematical treatments, the eradication time in controlled SIR models with constant infection and recovery rates was first studied as a viscosity solution to a static first-order Hamilton-Jacobi-Bellman (HJB) equation in [6]. Additionally, two critical times in SIR dynamics were studied in [7]: the point at which the infected population begins to decrease and the first time that this population falls below a given threshold. Both studies [6,7] focused on SIR models with a constant β and γ. To identify the optimal controls, the Pontryagin maximum principle (PMP) [8] was applied, confirming that a bang-bang control (i.e., taking values of 0 or 1) is optimal. The authors of [9] extended the notion of eradication time to cases involving time-inhomogeneous dynamics. However, numerical treatments for solving Hamilton–Jacobi equations and determining the optimal vaccination controls remain relatively unexplored.
Motivated by the fact that the minimum eradication time satisfies an HJB equation in the viscosity sense [6,10], we introduce a novel framework based on physics-informed neural networks (PINNs) to solve this problem in a mesh-free and scalable manner. Traditional numerical schemes often suffer from discretization errors and are limited in their flexibility, especially when handling high-dimensional or nonlinear systems. In contrast, our approach embeds the HJB's dynamics directly into the training of a neural network, enabling efficient approximation without spatial discretization. Moreover, by incorporating the dynamic programming principle (DPP) into a neural optimization loop, we compute both the value function and the associated optimal bang-bang vaccination control within a unified PINN framework. This integration offers a new perspective for solving optimal control problems in the study of controlled epidemic models using deep learning.
Unlike existing deep learning approaches for epidemic modeling, which typically require retraining for each initial condition, our PINN-based framework computes the minimum eradication time uniformly across a range of initial states. This is made possible by the structural property of the optimal control in our setting. Specifically, the optimal control exhibits a bang-bang form, which is a unique property in our problem's setting. Leveraging this structure, we apply the DPP to recover the optimal vaccination strategy directly from the computed eradication time, without solving the control problem separately.
In addition, we provide rigorous theoretical support to explain why it works. While PINNs have shown success in various applications, their training is often unstable, particularly when applied to stiff or degenerate partial differential equations (PDEs). To address this issue, we leverage a variable scaling (VS) technique originally proposed in [11] and develop a theoretical foundation for solving the HJB equation arising in the minimum eradication time problem. Specifically, using neural tangent kernel (NTK) theory [12], we quantify how VS affects the convergence rate during the training of PINNs. This constitutes the first NTK-based convergence analysis associated with HJB equations arising in the controlled SIR problem and provides new insights into the theoretical tractability of PINN-based optimal control. We further analyze the impact of applying distinct scaling factors to each variable. This variable-specific scaling effectively captures the anisotropic behavior inherent in the HJB equation.
PINNs [13] have gained significant attention as a powerful and flexible framework for solving differential equations, and have been widely adopted in various fields, including epidemic modeling [14,15], fluid mechanics [13,16,17,18], finance [19,20], and biomedical engineering [21,22], where understanding the underlying physical models is crucial.
In the framework of PINNs, we solve differential equations by training a neural network. The loss function consists of initial and boundary conditions, along with residual terms derived from the governing equations. However, the training results are highly sensitive to the choice of boundary condition settings, requiring the introduction of a penalty coefficient to balance the boundary loss term. Heuristic adjustments to the penalty coefficient can accelerate convergence. However, if not properly set, these values may lead to inaccurate solutions. To address these challenges, various adaptive methods have been proposed. One example is a learning rate annealing algorithm [23]. This algorithm dynamically adjusts the weights assigned to each term in the loss function. PINNs with adaptive weighted loss functions have been introduced for the efficient training of Hamilton-Jacobi (HJ) equations [24]. To further enhance the stability of PINNs, the authors of [25] proposed an adaptive training strategy that ensures stable convergence through the lens of NTK theory [12]. Recently, the failure of PINNs in stiff ordinary differential equation (ODE) systems was observed [26], and stiff-PINN was proposed as an improvement. Subsequently, various methods have been introduced, such as self-adaptive PINNs [27] and variable scaling PINNs [11]. Among these methods, we employ the variable scaling technique [11], as it is simple and effective.
While PINNs have been successfully applied to a wide range of differential equation problems, their application to optimal control, particularly in solving HJB equations, remains relatively underexplored. The key challenge lies in ensuring stability and accuracy when approximating the value functions and control policies. This has motivated recent studies investigating the interplay between deep learning and optimal control, aiming to develop computationally efficient methods that leverage the advantages of PINNs for solving partial differential equation (PDE) and optimal control problems.
There is a rich body of literature exploring the interplay between PINNs and optimal control. By leveraging the ability of PINNs to solve partial differential equations (PDEs) and the scalability of deep neural networks, researchers have developed computationally efficient methods for solving optimal control problems. For instance, a training procedure for obtaining optimal control in PDE-constrained problems was presented in [28]. Similarly, the authors of [29] utilized a Lyapunov-type PDE for efficient policy iteration in control-affine problems. Slightly later, a deep operator learning framework was introduced to solve high-dimensional optimal control problems [30], building on the policy iteration scheme developed in [31]. Most recently, [32] demonstrated the application of deep learning in controlled epidemic models.
In parallel, recent studies have explored the integration of artificial intelligence (AI) with mechanistic epidemiological models to enhance predictive capabilities and inform public health interventions [33,34]. These approaches highlight the potential of AI to address complexities in disease dynamics that traditional models may not fully capture.
Building on these advancements, our work focuses on leveraging PINNs for solving optimal control problems, specifically in the context of SIR models with vaccination strategies. The proposed approach not only provides an effective approximation of the minimum eradication time but also facilitates the synthesis of optimal control policies in a computationally efficient manner.
This paper makes the following contributions:
● We propose a PINN-based framework to approximate the minimum eradication time in the controlled SIR model by solving the associated HJB equation in a mesh-free manner. Our approach enables the simultaneous computation of eradication times across all initial conditions, eliminating the need for retraining.
● By leveraging the known structural property that the optimal control in the controlled SIR model is a bang-bang control with at most one switching, we recover the optimal vaccination strategy efficiently via the DPP, thereby avoiding the complex control synthesis procedures commonly required in general optimal control problems.
● We provide a theoretical analysis based on NTK theory, which explains the effectiveness of the variable scaling method in training PINNs for solving the HJB equation. Furthermore, we introduce variable-specific (nonuniform) scaling to address the anisotropic behavior inherent in the PDE, improving both training stability and convergence.
The remainder of this paper is organized as follows: Section 2 presents preliminary results on the minimum eradication time in the context of HJB equations. Section 3 reviews variable scaling PINNs and includes an error analysis specific to our HJB equation. Section 4 details the training procedure, while Section 5 presents the experimental results. Finally, Section 6 concludes the paper by summarizing the key findings, discussing the limitations of the approach, and providing directions for future research.
Throughout the paper, we consider a time-homogeneous controlled SIR model where the infection and recovery rates are constant:
β(t)≡βandγ(t)≡γ. |
We consider a threshold μ>0 and the initial conditions x≥0 and y≥μ. Let r(t)∈U={r:[0,∞)→[0,1]} be a vaccination control. Under these settings, we define the eradication time as
ur(x,y):=min{t>0:I(t)=μ}, |
where Sr and Ir satisfy
{˙Sr=−βSrIr−rSr,˙Ir=βSrIr−γIr, | (2.1) |
with (Sr(0),Ir(0))=(x,y).
A crucial property of ur is that for each t∈[0,ur(x,y)],
ur(x,y)=t+ur(Sr(t),Ir(t)), | (2.2) |
which is known as the DPP. This relationship can be interpreted as follows: at time t, the remaining eradication time from the state (Sr(t),Ir(t)) is ur(x,y).
Finally, the minimum eradication time is defined as
u(x,y):=minr∈Uur(x,y). | (2.3) |
The mathematical properties of this value function have been extensively studied in [6]. For the convenience of readers, we summarize the theoretical results provided in [6].
Thanks to (2.2), it is known that u is the unique viscosity solution to the following HJB equation.
Theorem 1 (Theorem 1.2 of [6]). For μ>0, the value function u defined in (2.3) is the unique viscosity solution to
βxy∂xu+x(∂xu)++(γ−βx)y∂yu=1in(0,∞)×(μ,∞), | (2.4) |
with the boundary conditions
u(0,y)=1γln(yμ)fory≥μ, |
and
u(x,μ)=0for0≤x≤γβ. |
In the next section, we proceed by identifying the optimal control that minimizes the eradication time.
With the value function u(x,y), characterized as the unique viscosity solution to the HJB equation (2.4), we now turn our attention to identifying the corresponding optimal control. In the HJB framework, the optimal cost is encoded in the value function, and the structure of the optimal control is identified through minimization of the Hamiltonian. In our setting, the dynamics are affine in the control variable, which implies that the optimal control takes the form of bang-bang [35].
To make this connection more explicit, we invoke the PMP [8], which provides a necessary condition for optimality in terms of the dynamics of adjoint variables. In our model, the PMP reveals that any optimal vaccination strategy r(t)∈[0,1] must satisfy a bang-bang structure, meaning that r(t)∈{0,1} for almost every t∈[0,ur]. The switching behavior is determined by the sign of the adjoint variable, which evolves according to a differential equation coupled with the state dynamics [4,6,35].
Moreover, in the time-homogeneous controlled SIR model with constant infection and recovery rates, it is known [4] that the optimal control switches at most once from r(t)=0 to r(t)=1. This result follows from a careful analysis of the monotonicity properties of the adjoint state. Consequently, we restrict our attention to switching controls of the form
rτ(t)={0,t<τ,1,t≥τ. | (2.5) |
Here, rτ and τ are referred to as the switching control and switching time, respectively.
We now recall some well-known results on the minimum eradication time and the optimal switching time.
Theorem 2 (Theorem 1.4 of [6]). Let μ>0, x≥0, and y≥μ. Then,
u(x,y)=minτ≥0{τ+ur0(S(τ),I(τ))}, | (2.6) |
where S(t) and I(t) satisfy the uncontrolled SIR model:
{˙S=−βSI,˙I=βSI−γI, | (2.7) |
with (S(0),I(0))=(x,y). Moreover, any τ for which the minimum in (2.6) is achieved is the switching time of the optimal switching control.
To solve for the optimal switching time τ from (2.6), it is necessary to compute ur0. Recalling the DPP (2.2), when r≡1, i.e., r=r0, we have the following identity:
ur0(x,y)=t+ur0(Sr0(t),Ir0(t)), | (2.8) |
where Sr0 and Ir0 satisfy
{˙Sr0=−βSr0Ir0−Sr0,˙Ir0=βSr0Ir0−γIr0, | (2.9) |
with (Sr0(0),Ir0(0))=(x,y). Taking the time derivative of both sides of (2.8), we deduce that
0=1+ddtur0(Sr0(t),Ir0(t))=1+˙Sr0(t)∂xur0+˙Ir0(t)∂yur0=1+(−βSr0(t)Ir0(t)−Sr0(t))∂xur0+(βSr0(t)Ir0(t)−γIr0(t))∂yur0. |
Setting t=0 and using the dynamics (2.9), we obtain:
βxy∂xur0+x∂xur0+(γ−βx)y∂yur0=1in (0,∞)×(μ,∞), |
where ur0 satisfies the same boundary conditions as u.
We finish this section with well-established properties of the optimal switching time τ∗ such that u(x,y)=uτ∗(x,y). We define the following:
S:={(x,y):u(x,y)=ur0(x,y)}, |
Corollary 6.2 in [6] yields
{∂xur0(x,y)≥0for(x,y)∈S,∂xur0(S(τ∗),I(τ∗))=0for(x,y)∈SC, | (2.10) |
where (S,I) satisfies (2.7) with (S(0),I(0))=(x,y).
Furthermore, by Corollary 6.4 in [6], we have
{∂xu(x,y)≥0for(x,y)∈S,∂xu(x,y)≤0for(x,y)∈SC. | (2.11) |
On the basis of the theoretical properties of the minimum eradication time, we propose a training procedure to solve (2.4) and compute the optimal switching time through a PINN framework, which does not require spatial discretization.
In this section, we explain the variable scaling physics-informed neural network (VS-PINN), which is a crucial component of our method. For completeness, we begin with an overview of PINNs.
PINNs are trained using a loss function that enables the neural network to approximate a solution satisfying both the differential equation and the initial or boundary conditions. Specifically, we focus on solving a PDE with a boundary condition, as the problem we address falls into this category. Suppose we have a bounded open domain Ω⊂Rd and the following equations:
D[u](x)=f(x)inΩ,u(x)=g(x)on∂Ω, |
where D is a differential operator.
We train a neural network u(x;θ) using the loss function
L=λrLr+λbLb, | (3.1) |
where the residual loss Lr and the boundary loss Lb are defined as
Lr=1NrNr∑i=1|D[u](xir)−f(xir)|2,Lb=1NbNb∑j=1|u(xjb)−g(xjb)|2. |
Here, the residual data points xir∈Ω and the boundary data points xjb∈∂Ω are typically sampled randomly from uniform distributions. The weights λr, λb, and the number of data points Nr, Nb are tunable parameters.
In our problem, we consider Ω=(0,∞)×(μ,∞) for μ>0 and define the operators D and D0 as
D=βxy∂x+x(∂x)++(γ−βx)y∂y,D0=βxy∂x+x∂x+(γ−βx)y∂y, | (3.2) |
where (∂x)+u=(∂xu)+. We solve for u and ur0, satisfying
D[u]=1andD0[ur0]=1inΩ, | (3.3) |
with the boundary conditions
{u(0,y)=ur0(0,y)=1γln(yμ)for y≥μ,u(x,μ)=ur0(x,μ)=0for 0≤x≤γβ. | (3.4) |
A schematic diagram of the framework is presented in Figure 1.
In [11], Ko and Park proposed a simple method that improves the performance of PINNs. The idea is to scale the variables so that the domain of the target function is magnified, making the function less stiff.
Let us begin with a change of variable x=ˆx/N and setting
ˆu(ˆx):=u(ˆx/N). |
The function ˆu is a scaled version of the function u, with its domain expanded by a factor of N, causing its stiffness to decrease by a factor of N compared with u. Thus, our goal is to train ˆu instead of directly training u. After training ˆu, we can simply recover our original target u by substituting
u(x)=ˆu(Nx). |
Unlike [11], we use different scaling parameters Nx, Ny for each component x,y and also apply translations, which means we use
ˆx=Nxx+bx,ˆy=Nyy+by. |
We employ this approach to address the anisotropic behavior in PDEs such as our HJB equation, where the dynamics exhibit asymmetry across different variables x and y. By doing so, we effectively mitigate the imbalance in gradient magnitudes during training and improve the optimization's stability. Thus, after training ˆu, we recover u by
u(x,y)=ˆu(Nxx+bx,Nyy+by). |
Recall that the differential operator of our problem in (3.2) takes the form of D=F(x,y,∂x,∂y). The modified version of our problem has the domain ˆΩ=(bx,∞)×(Nyμ+by,∞).
By the chain rule,
∂xu(x,y)=∂ˆxu(x,y)⋅∂xˆx=∂ˆxˆu(ˆx,ˆy)⋅Nx,∂yu(x,y)=∂ˆyu(x,y)⋅∂yˆy=∂ˆyˆu(ˆx,ˆy)⋅Ny, |
we define the operators ˆD and ^D0 as
ˆD=β(ˆx−bx)(ˆy−by)Ny∂ˆx+(ˆx−bx)(∂ˆx)++(γ−βˆx−bxNx)(ˆy−by)∂ˆy, |
and
^D0=β(ˆx−bx)(ˆy−by)Ny∂ˆx+(ˆx−bx)∂ˆx+(γ−βˆx−bxNx)(ˆy−by)∂ˆy. |
We then solve for ˆu and ˆur0, satisfying
ˆD[ˆu]=1and^D0[ˆur0]=1inˆΩ, |
with
{ˆu(0,ˆy)=ˆur0(0,ˆy)=1γln(ˆy−byNyμ)forˆy≥Nyμ+by,ˆu(ˆx,μ)=ˆur0(ˆx,μ)=0forbx≤ˆx≤Nxγβ+bx. | (3.5) |
As a result, we should use the loss function in (3.1) modified by replacing D (or D0) and f,g,u(x;θ) (or ur0(x;θ)) with ˆD (or ^D0) and ˆf,ˆg,ˆu(ˆx;θ) (or ˆur0(ˆx;θ)) in order to train ˆu(ˆx;θ) (or ˆur0(ˆx;θ)), where f(x)=ˆf(ˆx)=1, g and ˆg are given in (3.4) and (3.5), respectively, that is
{g(0,y)=1γln(yμ)for y≥μ,g(x,μ)=0for 0≤x≤γβ, |
and
{ˆg(0,ˆy)=1γln(ˆy−byNyμ)forˆy≥Nyμ+by,ˆg(ˆx,μ)=0forbx≤ˆx≤Nxγβ+bx. |
In the following section, we analyze the effect of the scaling factors.
We now establish the theoretical foundation for the efficiency of the variable scaling method in our problem through the lens of NTK [12], a widely used framework for analyzing the training dynamics of deep neural networks. In [25], the NTK theory was extended to PINNs.
Applying this theory, the authors of [11] demonstrated that the variable scaling method in PINNs applied to a simple one-dimensional Poisson equation enhances training efficiency. In this section, we establish a proof for Eq (2.4), which is more complex than the one-dimensional Poisson equation. To avoid redundancy, we focus on solving for u.
Given that the parameter θ of a PINN u(x;θ) is trained through the gradient flow
dθdt=−∇θL |
with respect to the loss (3.1) with λr=λb=1/2, it is proved in [25] that the evolution of u and D[u] follows
[du(xb;θ(t))dtdD[u](xr;θ(t))dt]=−[Kuu(t)Kur(t)Kru(t)Krr(t)][u(xb;θ(t))−g(xb)D[u](xr;θ(t))−f(xr)], |
where Kuu(t)∈RNb×Nb, Krr(t)∈RNr×Nr, and Kru(t)=[Kur(t)]⊤∈RNr×Nb, whose (i,j)th entries are given by
(Kuu)ij(t)=⟨du(xib;θ(t))dθ,du(xjb;θ(t))dθ⟩,(Kru)ij(t)=⟨dD[u](xir;θ(t))dθ,du(xjb;θ(t))dθ⟩,(Krr)ij(t)=⟨dD[u](xir;θ(t))dθ,dD[u](xjr;θ(t))dθ⟩. |
We call the matrix K(t)=[Kuu(t)Kur(t)Kru(t)Krr(t)] the NTK of the training dynamics of u(x;θ) via PINN.
In [25], it is proved that the NTK of the PINN at initialization, i.e., the kernel at t=0, converges to a deterministic kernel and remains the same in the infinite-width limit when θ(0) is assumed to follow the standard normal distribution. In [11], it is shown that variable scaling can enhance the performance of PINNs by analyzing how the initial NTK of the one-dimensional Poisson equation evolves with variable scaling. We follow their argument to deduce a similar result for Eq (2.4).
By definition, K(t) is positive semi-definite. The eigenvalues of K(t) are related to the convergence rate of the training dynamics. Since it is difficult to directly compute all eigenvalues of K(t), we instead consider the average of the eigenvalues Tr(K(t))Nr+Nb. It is a weighted average of Tr(Kuu(t))Nb and Tr(Krr(t))Nr as follows:
Tr(K(t))Nr+Nb=NbNr+NbTr(Kuu(t))Nb+NrNr+NbTr(Krr(t))Nr. |
For notational simplicity, we set the translation parameters bx=by=0 and omit the hat (^) notation in this subsection as follows:
D=βxyNy∂x+x(∂x)++(γ−βxNx)y∂y, |
and let our neural network be
u(x,y;θ)=1√d1d1∑k=1Wk2σ(W1k1x+W2k1y+bk1)+b2, | (3.6) |
with σ=max{0,x3}, since the activation function is required to be twice differentiable in NTK theory [12,25] to ensure stationarity of NTK. Among the twice differentiable activation functions, we focused on the cubic rectified linear unit (ReLU) activation function in our theoretical analysis, primarily for reasons of analytical tractability. Specifically, for commonly used activation functions σ such as a sigmoid or hyperbolic tangent, computing expectations like E[σ(X)] when X∼N(0,δ2) becomes complex and often intractable in closed form. Although our theoretical analysis is based on a strong assumption, our experimental results indicate that the variable scaling method remains effective for a broader class of activation functions. We initialize the parameters Wik1,Wk2,bk1,b2 from the standard normal distribution N(0,1).
We focus on training a PINN within the rectangular domain
[0,Nxℓx]×[Nyμ,Ny(μ+ℓy)], | (3.7) |
and introduce an augmented boundary given by
[0,Nxℓx]×{Nyμ}∪{0}×[Nyμ,Ny(μ+ℓy)]∪[0,Nxℓx]×{Ny(μ+ℓy)}. | (3.8) |
Although this approach necessitates additional simulations to obtain ground-truth data for the augmented boundary given by (3.8), the associated computational cost is significantly lower compared with simulating the entire two-dimensional domain. To proceed, we sample the boundary points {xjb,yjb}Nbj=1 uniformly from the augmented boundary (3.8) and sample the residual points {xir,yir}Nri=1 uniformly from the restricted domain (3.7). Note that the domain (3.7) is obtained by scaling the domain [0,ℓx]×[μ,μ+ℓy] (ℓx,ℓy>0).
We now present the main result, which demonstrates that the average of eigenvalues of the deterministic kernel grows in the scaling factors.
Theorem 3. Under the sampling regime of data points and the differential operator as described above, the following convergences hold as the width d1 of (3.6) goes to infinity:
Tr(Kuu(0))NbP→O((N2x+N2y+1)3)andTr(Krr(0))NrP→O(P(Nx,Ny)), |
where P→ represents the convergence in probability and P(x,y) is a degree 3 homogeneous polynomial in x2 and y2 with positive coefficients. Moreover, if Nx=Ny=N, then
Tr(K(0))Nr+NbP→O(N6)∗. |
*f(x)=O(g(x)) implies that constant C>0 and x0 exist such that |f(x)|≤C|g(x)| for all x≥x0.
Before presenting the proof, we state a useful lemma regarding the moment bounds of Gaussian distributions, which will be used repeatedly throughout the proof. The proof of the lemma is omitted and can be found in [36].
Lemma 1. If X∼N(0,δ2), then
E[X2n]=(2n)!2nn!δ2nandE[X2n−1]=0∀n∈N. |
Proof of Theorem 3. We first compute the limit of Tr(Kuu(0))/Nb as the width d1 goes to infinity. From the definition, we see that
Tr(Kuu(0))Nb=1NbNb∑j=1⟨du(xjb,yjb;θ)dθ,du(xjb,yjb;θ)dθ⟩, |
and each summand is decomposed as
⟨du(xjb,yjb;θ)dθ,du(xjb,yjb;θ)dθ⟩=⟨du(xjb,yjb;θ)dW11,du(xjb,yjb;θ)dW11⟩+⟨du(xjb,yjb;θ)dW21,du(xjb,yjb;θ)dW21⟩+⟨du(xjb,yjb;θ)dW2,du(xjb,yjb;θ)dW2⟩+⟨du(xjb,yjb;θ)db1,du(xjb,yjb;θ)db1⟩+⟨du(xjb,yjb;θ)db2,du(xjb,yjb;θ)db2⟩. | (3.9) |
For notational convenience, we write
X1:=W1k1∼N(0,1),Y1:=W2k1∼N(0,1),X2:=Wk2∼N(0,1),Z:=bk1∼N(0,1). |
Since X1,Y1, and Z are independent, we have
S:=X1xjb+Y1yjb+Z∼N(0,(xjb)2+(yjb)2+1). |
The first term of (3.9) is then expressed as
⟨du(xjb,yjb;θ)dθ,du(xjb,yjb;θ)dθ⟩=1d1d1∑k=1(Wk2σ′(W1k1xjb+W2k1yjb+bk1)xjb)2. |
By the law of large numbers, this term converges in probability to
(xjb)2E[X22σ′(S)2]=(xjb)2E[X22]E[σ′(S)2]=(xjb)2E[σ′(S)2] |
as d1 goes to infinity, where the first equality holds, since X2 and S are independent. As
σ′(S)={3S2,if S≥0,0,if S<0, |
we have
E[σ′(S)2]=92E[S4]=O(((xjb)2+(yjb)2+1)2), | (3.10) |
where the last equality follows from Lemma 1 and S∼N(0,(xjb)2+(yjb)2+1). Thus, the first term of (3.9) is the order of
(xjb)2O(((xjb)2+(yjb)2+1)2). |
The second term of (3.9) is
⟨du(xjb,yjb;θ)dW21,du(xjb,yjb;θ)dW21⟩=1d1d1∑k=1(Wk2σ(W1k1xjb+W2k1yjb+bk1)yjb)2. |
By the law of large numbers and (3.10), this term converges in probability to
(yjb)2E[X22σ′(S)2]=(yjb)2E[X22]E[σ′(S)2]=(yjb)2E[σ′(S)2]=(yjb)2O(((xjb)2+(yjb)2+1)2) |
as d1 goes to infinity.
The third term of (3.9) is
⟨du(xjb,yjb;θ)dW2,du(xjb,yjb;θ)dW2⟩=1d1d1∑k=1(σ(W1k1xjb+W2k1yjb+bk1))2. |
By the law of large numbers, this term converges in probability to E[σ(S)2] as d1 goes to infinity. Noticing that
σ(S)2={S6,if S≥0,0,if S<0, |
we deduce
E[σ(S)2]=12E[S6]=O(((xjb)2+(yjb)2+1)3) |
by Lemma 1.
The fourth term of (3.9) is
⟨du(xjb,yjb;θ)db1,du(xjb,yjb;θ)db1⟩=1d1d1∑k=1(Wk2σ′(W1k1xjb+W2k1yjb+bk1))2. |
By the law of large numbers and (3.10), this term converges in probability to
E[σ′(S)2]=O(((xjb)2+(yjb)2+1)2) |
as d1 goes to infinity.
Finally, the last term in (3.9) is
⟨du(xjb,yjb;θ)db2,du(xjb,yjb;θ)db2⟩=1. |
Combining them all together, we deduce that
⟨du(xjb,yjb;θ)dθ,du(xjb,yjb;θ)dθ⟩=O(((xjb)2+(yjb)2+1)3). |
Note that we sample (xjb,yjb) uniformly from the scaled boundary (3.8), and thus xjb and yjb are scaled with Nx and Ny, respectively. Thus, we conclude that
Tr(Kuu(0))Nb=1NbNb∑j=1⟨du(xjb,yjb;θ)dθ,du(xjb,yjb;θ)dθ⟩=O((N2x+N2y+1)3). | (3.11) |
Next, we compute the limit of Tr(Krr(0))/Nr as the width d1 goes to infinity. From the definition,
Tr(Krr(0))Nr=1NrNr∑i=1⟨dD[u(xir,yir;θ)]dθ,dD[u(xir,yir;θ)]dθ⟩, |
and each summand is decomposed as
⟨dD[u(xir,yir;θ)]dθ,dD[u(xir,yir;θ)]dθ⟩=⟨dD[u(xir,yir;θ)]dW11,dD[u(xir,yir;θ)]dW11⟩+⟨dD[u(xir,yir;θ)]dW21,dD[u(xir,yir;θ)]dW21⟩+⟨dD[u(xir,yir;θ)]dW2,dD[u(xir,yir;θ)]dW2⟩+⟨dD[u(xir,yir;θ)]db1,dD[u(xir,yir;θ)]db1⟩+⟨dD[u(xir,yir;θ)]db2,dD[u(xir,yir;θ)]db2⟩. | (3.12) |
Recall that
D[u]=βxyNy∂xu+x(∂xu)++(γ−βxNx)y∂yu. |
Computing the first derivatives of u, we get
∂xu(xir,yir;θ)=1√d1d1∑k=1Wk2W1k1σ′(W1k1xir+W2k1yir+bk1) |
and
∂yu(xir,yir;θ)=1√d1d1∑k=1Wk2W2k1σ′(W1k1xir+W2k1yir+bk1). |
Thus, we get
D[u(xir,yir;θ)]=βxiryirNy√d1d1∑k=1Wk2W1k1σ′(W1k1xir+W2k1yir+bk1)+xir√d1(d1∑k=1Wk2W1k1σ′(W1k1xir+W2k1yir+bk1))++(γ−βxirNx)yir√d1d1∑k=1Wk2W2k1σ′(W1k1xir+W2k1yir+bk1). |
The first term of (3.12) is
⟨dD[u(xir,yir;θ)]dW11,dD[u(xir,yir;θ)]dW11⟩=1d1d1∑k=1[βxiryirNyWk2(W1k1σ″(W1k1xir+W2k1yir+bk1)xir+σ′(W1k1xir+W2k1yir+bk1))+δxirWk2(W1k1σ″(W1k1xir+W2k1yir+bk1)xir+σ′(W1k1xir+W2k1yir+bk1))+(γ−βxirNx)yirWk2W1k1σ″(W1k1xir+W2k1yir+bk1)xir]2, |
where
δ={1 if d1∑k=1Wk2W1k1σ′(W1k1xir+W2k1yir+bk1)≥0,0 if d1∑k=1Wk2W1k1σ′(W1k1xir+W2k1yir+bk1)<0. |
Temporarily denoting
Ak=βxiryirNyWk2W1k1σ″(W1k1xir+W2k1yir+bk1)xir,Bk=βxiryirNyWk2σ′(W1k1xir+W2k1yir+bk1),Ck=xirWk2W1k1σ″(W1k1xir+W2k1yir+bk1)xir,Dk=xirWk2σ′(W1k1xir+W2k1yir+bk1),Ek=(γ−βxirNx)yirWk2W1k1σ″(W1k1xir+W2k1yir+bk1)xir, |
and applying the power-mean inequality, we achieve
⟨dD[u(xir,yir;θ)]dW11,dD[u(xir,yir;θ)]dW11⟩≤1d1d1∑k=15(A2k+B2k+C2k+D2k+E2k). |
By the law of large numbers,
1d1d1∑k=1A2kP→β2(xir)4(yir)2N2yE[X21σ″(S)2],1d1d1∑k=1B2kP→β2(xir)2(yir)2N2yE[σ′(S)2],1d1d1∑k=1C2kP→(xir)2E[X21σ″(S)2],1d1d1∑k=1D2kP→(xir)2E[σ′(S)2],1d1d1∑k=1E2kP→(γ−βxirNx)2(xir)2(yir)2E[X21σ″(S)2]. |
where S=X1xir+Y1yir+Z and P→ denotes the convergence in probability.
By (3.10), we have E[σ′(S)2]=O(((xir)2+(yir)2+1)2).
Using the fact
|σ″(X1xir+Y1yir+Z)|2≤36|X1xir+Y1yir+Z|2, |
we have
E[X21σ″(S)2]≤36(2π)3/2∫∞−∞∫∞−∞x2(x2(xir)2+y2(yir)2+z2+2xyxiryir+2yzyir+2xzxir)exp(−x2+y2+z22)dzdydx=36(|xir|2E[X41]+(yir)2E[X21Y21]+E[X21Z2])=36(3(xir)2+(yir)2+1). | (3.13) |
Since we sample (xir,yir) from the scaled domain (3.7), xir and yir are scaled with Nx and Ny, respectively, as we adjust Nx,Ny. Thus, we have
⟨dD[u(xir,yir;θ)]dW11,dD[u(xir,yir;θ)]dW11⟩=O((N4x+N2x+N2xN2y)(3N2x+N2y+1)+N2x(N2x+N2y+1)2), | (3.14) |
where we omitted other variables such as β and γ, as our primary focus is on the dependency on Nx and Ny.
To proceed, we now observe that the second term in (3.12) is written as
⟨dD[u(xir,yir;θ)]dW21,dD[u(xir,yir;θ)]dW21⟩=1d1d1∑k=1[βxiryirNyWk2W1k1σ″(W1k1xir+W2k1yir+bk1)yir+δxirWk2W1k1σ″(W1k1xir+W2k1yir+bk1)yir+(γ−βxirNx)yirWk2(W1k1σ″(W1k1xir+W2k1yir+bk1)yir+σ′(W1k1xir+W2k1yir+bk1))]2. |
Similarly, we temporarily denote
Ak=βxiryirNyWk2W1k1σ″(W1k1xir+W2k1yir+bk1)yir,Bk=xirWk2W1k1σ″(W1k1xir+W2k1yir+bk1)yir,Ck=(γ−βxirNx)yirWk2W2k1σ″(W1k1xir+W2k1yir+bk1)yir,Dk=(γ−βxirNx)yirWk2σ′(W1k1xir+W2k1yir+bk1)yir, |
and apply the power mean inequality to deduce
⟨dD[u(xir,yir;θ)]dW21,dD[u(xir,yir;θ)]dW21⟩≤1d1d1∑k=14(A2k+B2k+C2k+D2k). |
By the law of large numbers, (3.10), (3.13), and the symmetry between E[X21σ″(S)2] and E[Y21σ″(S)2], we have
1d1d1∑k=1A2kP→β2(xir)2(yir)4N2yE[X21σ″(S)2]=O(N2xN2y(3N2x+N2y+1)),1d1d1∑k=1B2kP→(xir)2(yir)2E[X21σ″(S)2]=O(N2xN2y(3N2x+N2y+1)),1d1d1∑k=1C2kP→(γ−βxirNx)2(yir)4E[Y21σ″(S)2]=O(N4y(N2x+3N2y+1)),1d1d1∑k=1D2kP→(γ−βxirNx)2(yir)2E[σ′(S)2]=O(N2y(N2x+N2y+1)2). |
Thus, we get
⟨dD[u(xir,yir;θ)]dW21,dD[u(xir,yir;θ)]dW21⟩=O(N2xN2y(3N2x+N2y+1)+N4y(N2x+3N2y+1)+N2y(N2x+N2y+1)2). | (3.15) |
The third term of (3.12) is
⟨dD[u(xir,yir;θ)]dW2,dD[u(xir,yir;θ)]dW2⟩=1d1d1∑k=1[βxiryirNyW1k1σ′(W1k1xir+W2k1yir+bk1)+δxirW1k1σ′(W1k1xir+W2k1yir+bk1)+(γ−βxirNx)yirW2k1σ′(W1k1xir+W2k1yir+bk1)]2. |
Denoting
Ak=βxiryirNyW1k1σ′(W1k1xir+W2k1yir+bk1),Bk=xirW1k1σ′(W1k1xir+W2k1yir+bk1),Ck=(γ−βxirNx)yirW2k1σ′(W1k1xir+W2k1yir+bk1), |
we have
⟨dD[u(xir,yir;θ)]dW2,dD[u(xir,yir;θ)]dW2⟩≤1d1d1∑k=13(A2k+B2k+C2k). |
By the law of large numbers,
1d1d1∑k=1A2kP→β2(xir)2(yir)2N2yE[X21σ′(S)2],1d1d1∑k=1B2kP→(xir)2E[X21σ′(S)2],1d1d1∑k=1C2kP→(γ−βxirNx)2(yir)2E[Y21σ′(S)2]. |
By Lemma 1 and the fact that σ′(x)2≤9x4, we have
E[X21σ′(S)2]≤9(2π)3/2∫∞−∞∫∞−∞∫∞−∞x2(xxir+yyir+z)4dzdydx=9(2π)3/2∫∞−∞∫∞−∞∫∞−∞x2(x4(xir)4+y4(yir)4+z4+6x2y2(xiryir)2+6y2z2(yir)2+6x2z2(xir)2)dzdydx=9(15(xir)4+3(yir)4+3+18(xiryir)2+6(yir)2+18(xir)2)=O(5(xir)4+(yir)4+6(xiryir)2). |
Similarly,
E[Y21σ′(S)2]≤9(15(yir)4+3(xir)4+3+18(xiryir)2+6(xir)2+18(yir)2)=O((xir)4+5(yir)4+6(xiryir)2). |
Thus, we have
⟨dD[u(xir,yir;θ)]dW2,dD[u(xir,yir;θ)]dW2⟩=O(N2x(5N4x+N4y+6N2xN2y)+N2y(N4x+5N4y+6N2xN2y))=O(5N6x+7N4xN2y+7N2xN4y+5N6y). | (3.16) |
The fourth term of (3.12) is
⟨dD[u(xir,yir;θ)]db1,dD[u(xir,yir;θ)]db1⟩=1d1d1∑k=1[βxiryirNyWk2W1k1σ″(W1k1xir+W2k1yir+bk1)+δxirWk2W1k1σ″(W1k1xir+W2k1yir+bk1)+(γ−βxirNx)yirWk2W2k1σ″(W1k1xir+W2k1yir+bk1)]2. |
Denoting
Ak=βxiryirNyWk2W1k1σ″(W1k1xir+W2k1yir+bk1),Bk=xirWk2W1k1σ″(W1k1xir+W2k1yir+bk1),Ck=(γ−βxirNx)yirWk2W2k1σ″(W1k1xir+W2k1yir+bk1), |
we have
⟨dD[u(xir,yir;θ)]db1,dD[u(xir,yir;θ)]db1⟩≤1d1d1∑k=13(A2k+B2k+C2k). |
By the law of large numbers,
1d1d1∑k=1A2kP→β2(xir)2(yir)2N2yE[X21σ″(S)2],1d1d1∑k=1B2kP→(xir)2E[X21σ″(S)2],1d1d1∑k=1C2kP→(γ−βxirNx)2(yir)2E[Y21σ″(S)2]. |
By (3.13), we have
⟨dD[u(xir,yir;θ)]db1,dD[u(xir,yir;θ)]db1⟩=O(N2x(3N2x+N2y+1)+N2y(N2x+3N2y+1)). | (3.17) |
Lastly, the fifth term of (3.12) is simply
⟨dD[u(xir,yir;θ)]db2,dD[u(xir,yir;θ)]db2⟩=0. | (3.18) |
Combining (3.14)–(3.18), we obtain
Tr(Krr(0))Nr=O(P(Nx,Ny)), | (3.19) |
where P(x,y) is a degree 3 homogeneous polynomial in x2 and y2 with positive coefficients.
By (3.11) and (3.19), we finally conclude that the average convergence rate of NTK increases as the variable scaling factors Nx and Ny increase. In the special case where we set Nx=Ny=N, the right-hand sides of both (3.11) and (3.19) become O(N6).
In practice, we use gradient descent in training, which is not identical to the ideal gradient flow. Consequently, arbitrarily increasing Nx and Ny may lead to excessively large parameter updates during a single step of gradient descent, potentially destabilizing the training process. Therefore, it is crucial to empirically determine an appropriate value for Nx and Ny to ensure stable learning.
Our goal is to approximate u and ur0 using the PINN framework without discretizing the spatial domain and to use these results to estimate the optimal switching control via (2.6) with high accuracy. To achieve this, we train u and ur0 separately by solving (3.3) within a fixed domain D:=[0,ℓx]×[μ,μ+ℓy], where ℓx,ℓy and μ>0 are given. While the original domain is unbounded, we restrict it to a bounded subset for training, since sampling can only be performed over a finite region. To reduce potential boundary effects near the truncated region along the positive y-axis, we impose suitable boundary conditions. This approximation is physically justified, as the total population is finite and the state variables x and y are inherently bounded in real-world epidemic settings.
We then identify the optimal control r via (2.6), which minimizes the eradication time, by modeling τ as a neural network function. For the ease of training the switching time τ, we introduce another neural network to approximate the uncontrolled dynamics (S(t),I(t)) satisfying (2.7).
Figure 2 outlines the full workflow, where we first train u and ur0 using PINNs, then learn the uncontrolled system dynamics, and finally optimize τ using the DPP framework.
Let us begin by solving
βxy∂xu+x(∂xu)++(γ−βx)y∂yu=1in[0,ℓx]×[μ,μ+ℓy], |
and
βxy∂xur0+x∂xur0+(γ−βx)y∂yin[0,ℓx]×[μ,μ+ℓy], |
via PINNs, or equivalently,
D[u]=1andD0[ur0]=1[0,ℓx]×[μ,μ+ℓy], |
where
D=βxy∂x+x(∂x)++(γ−βx)y∂yandD0=βxy∂x+x∂x+(γ−βx)y∂y. |
To leverage the idea of the VS-PINN, we set
u(x,y;θ1)=NN(Nxx+bx,Nyy+by;θ1)andur0(x,y;θ2)=NN(Nxx+bx,Nyy+by;θ2), |
where NN(⋅;θ1) and NN(⋅;θ2) represent fully connected neural networks parametrized by θ1 and θ2, respectively. Note that Nx,Ny,bx and by are not trained during training. We then explicitly define the loss function.
Residual loss: with the prespecified training data {(xir,yir)}Nri=1∈[0,ℓx]×[μ,ℓy+μ] for Nr∈N, the residual loss associated with u is given as
Lr[u(⋅;θ1)]=1NrNr∑i=1(D[u(xir,yir;θ1)]−1)2. |
Similarly, the residual loss corresponding to ur0 is defined as
L0r[ur0(⋅;θ2)]=1NrNr∑i=1(D0[ur0(xir,yir;θ2)]−1)2. |
Boundary loss: we define the boundary loss function on
Γ:=[0,ℓx]×{μ}⏟=:D1∪{0}×[μ,μ+ℓy]⏟=:D2∪[0,ℓx]×{μ+ℓy}⏟=:D3⊂∂D, | (4.1) |
where ∂D denotes the boundary of the domain D. Given Nkb∈N, we randomly sample Nkb points from Dk and denote them by {(xjb,k,yjb,k)}Nkbj=1 for k=1,2,3. With Nb=N1b+N2b+N3b, let
Lb[u(⋅;θ1)]=1Nb3∑k=1Nkb∑j=1(u(xjb,k,yjb,k;θ1)−u(xjb,k,yjb,k))2. | (4.2) |
We note that the boundary loss for learning ur0 is identical to that used for learning u.
One key challenge is the approximation of u(xjb,k,yjb,k) at the boundary points for each j and k. To address this, we revisit an intrinsic property of optimal control for the minimum eradication time, demonstrating that the optimal vaccination strategy takes the form of (2.5). Accordingly, we propose Algorithm 1 for numerical implementation. Leveraging the structure of the optimal control, we compute the first time when I falls below μ by using the controls rsdτ for s∈N∪{0} and dτ>0. The solution to the controlled SIR model (2.1) is approximated using the fourth-order Runge-Kutta method with a step size dt>0.
Algorithm 1 Find the minimum eradication time |
1: Input: Time resolution of the optimal control dτ, the maximum switching time L, the discretization size dt, the maximum iteration step M, the eradication threshold μ, and the initial susceptible and infectious population x,y∈D.
2: Output: τ>0 such that rτ is an optimal control. 3: Set (S0,I0)=(x,y) and τ=0. 4: while True do 5: for m=0,1,2,…,M do 6: Compute Sm+1,Im+1 using the Runge-Kutta method corresponding to {˙S=−βSI−rτS,˙I=βSI−γI. 7: end for 8: mτ:=min{m′:Im′≤μ}. 9: τ=τ+dτ. 10: Break if τ>L. 11: end while 12: Return u(x,y)=minτ{mτ}. |
The recovery of the optimal control is significantly simplified by the theoretical result from [4] that the optimal policy is of bang-bang form with at most one switch, as we pointed out in (2.5). This allows us to avoid general policy parameterization and instead directly optimize over a scalar switching time, as formalized via the dynamic programming principle.
The optimal control can be recovered efficiently by exploiting its known structure: as shown in [4] and (2.5), the policy takes a bang-bang form with at most one switch. This allows us to reduce the control synthesis to a scalar optimization over the switching time, bypassing general policy parameterization.
With u and ur0 computed in the previous section, we now solve for the optimal switching time τ satisfying (2.6),
u(x,y)=minτ≥0{τ+ur0(S(τ),I(τ))}, |
which involves a complex and nonlinear optimization process. To avoid solving this optimization problem directly, we introduce a neural network τ(x,y;θ) and propose optimizing the following problem:
minθ(u(⋅)−τ(⋅;θ)−ur0(S(τ(⋅;θ)),I(τ(⋅;θ))))2, |
where S and I satisfy the uncontrolled system of ODEs (2.7), that is,
{˙S=−βSI,˙I=βSI−γI, |
with (S(0),I(0))=(x,y). Since the closed-form solutions of S and I are unavailable, we reduce the above optimization problem to
minθ((u(⋅)−τ(⋅;θ)−ur0(S(τ(⋅;θ);ω),I(τ(⋅;θ);ω)))2), |
where we approximate (S(t),I(t)) using a neural network w(x,y,t;ω), that is,
(S(t),I(t))≈w(x,y,t;ω). |
This approximation corresponds to training a neural network w(x,y,t;ω) with w(x,y,0;ω)=(x,y) that best mimics the uncontrolled dynamics (S(t),I(t)) satisfying (2.7) with (S(0),I(0))=(x,y).
Algorithm 2 Training of the dynamics |
1: Input: The number of initial samples Np, the number of interior collocation points Nint, the number of boundary points Nbdry, the number of temporal discretization steps NT, the total time horizon T, and the timestep size dt=T/NT.
2: Output: Trained neural network w(x,y,t;ω)≈(S(t),I(t)), approximating the solution of (2.7) with initial condition (S(0),I(0))=(x,y). 3: Sample the boundary data {(xkbdry,ykbdry,tk)}Nbdryk=1⊂∂D×T. 4: for k=1 to Nbdry do 5: Initialize (S0,I0)←(xkbdry,ykbdry). 6: for m=0 to tk do 7: Compute (Stk,Itk) using the Runge-Kutta method applied to: {˙S=−βSI,˙I=βSI−γI. 8: end for 9: Assign w(xkbdry,ykbdry,tk)←(Stk,Itk). 10: end for 11: while training has not converged do 12: Sample the interior collocation points {(xjint,yjint,tj)}Nintj=1⊂D×[0,T]. 13: Sample the initial points {(xip,yip)}Npi=1⊂D. 14: Compute the loss function L[w(⋅;ω)] as defined in (4.3). 15: Update the parameters: ω←Adam(L). 16: end while |
We propose to train a neural network w(x,y,t;ω):(x,y,t)↦R2 that approximates the flow of uncontrolled dynamics (S(t),I(t)) with (S(0),I(0))=(x,y) for all points (x,y)∈D. To train such a w, we first randomly choose interior points from the domain D=[0,ℓx]×[μ,μ+ℓy]. For the stability of training, we further sample points from the boundary of D given by
∂D=[0,ℓx]×{μ}∪{0}×[μ,μ+ℓy]∪[0,ℓx]×{μ+ℓy}∪{ℓx}×[μ,μ+ℓy]. |
Let T, NT, Np, Nint, and Nbdry be given and set dt=T/NT and T:={mdt:m=0,1,2,...,NT}. With random samples {(xip,yip)}Npi=1⊂D, {(xjint,yjint,tj)}Nintj=1⊂D×[0,T], {(xkbdry,ykbdry,tk)}Nbdryk=1⊂∂D×T, and Ntot=Np+Nint+Nbdry, let us define the loss function as
L[w(⋅;ω)]:=1Ntot(Np∑i=1L0[w(xip,yip,0;ω)]⏟initial loss+Nint∑j=1Lp[w(xjint,yjint,tj;ω)]⏟ODE system loss+Nbdry∑k=1Lbdry[w(xkbdry,ykbdry,tk;ω)]⏟boundary loss) | (4.3) |
for
L0[w(x,y,0;ω)]:=(w(x,y,0;ω)[1]−x)2+(w(x,y,0;ω)[2]−y)2, |
and
Lp[w(⋅;ω)]:=(∂w(⋅;ω)[1]∂t+βw(⋅;ω)[2]w(⋅;ω)[1])2+(∂w(⋅;ω)[2]∂t−βw(⋅;ω)[1]w(⋅)[2]+γw(⋅;ω)[2])2, |
where w(⋅)[1] and w(⋅)[2] denote the first and second components of w.
Lastly, Lbdry is defined via
Lbdry[w(x,y,kdt;ω)]:=‖w(x,y,kdt;ω)−g(x,y,kdt)‖22, |
where the reference vector g(x,y,kdt) is computed as follows:
g(x,y,kdt)=(Sk,Ik), |
where the reference vector g(x,y,kdt) is obtained by numerically solving the uncontrolled dynamics (2.7) using the fourth-order Runge-Kutta method, with the initial conditions (S0,I0)=(x,y) and a step size dt>0. The training details are provided in Algorithm 2, and an overview of the framework is illustrated in Figure 3.
Given u, ur0, and a neural network w(x,y,t;ω) that approximates the uncontrolled SIR dynamics satisfying (2.7) with (S(0),I(0))=(x,y), we minimize the following loss function:
L[τ(⋅;θ)]:=1NN∑i=1{(u(xi,yi)−τ(xi,yi;θ)−ur0(S(τ(xi,yi;θ)),I(τ(xi,yi;θ))))2+pen(xi,yi,τ(⋅;θ))}, | (4.4) |
over θ for the randomly sampled points {(xi,yi)}Ni=1∈D where
pen(x,y,τ)={(∂xur0(w(x,y,τ)[1],w(x,y,τ)[2]))2if(x,y)∈SC,0otherwise. |
Here, w(x,y,τ) approximates the solution (S(τ),I(τ)) to the uncontrolled SIR with (S(0),I(0))=(x,y) and is learned from the previous section. The penalty term ensures that τ satisfies (2.10).
In all experiments presented in this paper, the parameters of the controlled SIR model are set as follows: β=2, γ=10, and μ=0.01. We optimize using the Adam optimizer [37] with a fixed learning rate of 0.0001. For hardware, we used a single local GPU (RTX 3090) for all experiments.
Various model parameters are explored to investigate the optimal eradication time. Since the focus of this study is to characterize the optimal control, we focus on identifying model parameters that yield non-trivial control, excluding scenarios in which the optimal control is identically one, i.e., r(t)≡1.
To train models, we employ uniform sampling over the bounded computational domain. This choice is supported by the semiconcavity of the viscosity solution to the HJB equation [6] and the known structure of the optimal control, which is bang-bang with at most one switching time. Variations in hyperparameters, including batch sizes and learning rates, did not produce significant differences in performance. Notably, the incorporation of skip connections led to improvements in the prediction of u and uncontrolled dynamics, but had a negative effect on τ. Detailed discussions are provided in Appendix A.
We employ feed-forward neural networks with skip connections where the width is fixed as 50 and the depth consists of five hidden layers unless otherwise stated, as illustrated in Figure 4. Regarding the activation functions, we use the hyperbolic tangent function. Our experimental results indicate that the variable scaling method remains effective for the hyperbolic tangent activation function.
To test the efficiency of our method for approximating u and ur0, we consider the domain [0,20]×[0.01,1] by setting ℓx=20 and ℓy+μ=1, while varying the scaling factors Nx and Ny in Section 4.1. During training, we sample 1000 interior collocation points at each iteration.
To define the boundary loss (4.2), we randomly select 100 data points from each boundary Di defined in (4.1) for i=1,2,3, resulting in a total of 300 points, as illustrated in Figure 5. For the reference values obtained by Algorithm 1, we set dt=dτ=0.001.
Additional experiments validating the robustness of the variable scaling method are presented in Appendix B.
To learn the uncontrolled dynamics, we use the same neural network architecture and activation functions as described above but consider a slightly larger spatio-temporal sampling domain given by [0,20]×[0.01,2]×[0,2.5], setting ℓx=20, ℓy+μ=2, and T=2.5 in Section 4.2.1. The value of T is determined heuristically on the basis of the observation that the susceptible and infectious populations approach zero after a sufficiently long duration, which we set to T=2.5.
To define the boundary loss, we set NT=250, and hence, dt=0.01 in Algorithm 2 and sample 4000 points from ∂D×T by setting Nbdry=4000 in (4.3).
For the initial loss, we sample 1000 points per batch from D by setting Np=1000, shown as green dots in the figure. Additionally, to define the ODE system loss in (4.3), 1000 collocation points (S(0),I(0),t) are sampled per batch uniformly by setting Nint=1000.
We use a neural network without skip connections, with a width of 200 and five hidden layers, where leaky ReLU functions are applied in all hidden layers, and the Softplus function is employed in the final layer.
The domain of interest is given by [0,20]×[0.01,1]. Recalling the loss function (4.4), we note that the values of u are required in the same domain, while a larger domain must be considered for ur0 and w since I(τ(x,y;θ)) may exceed 1. Therefore, we approximate ur0 in the domain [0,20]×[0.01,10], using variable scaling factors of Nx=1 and Ny=4.
Similar to the previous experiments, we randomly sample 1,000 collocation points per batch uniformly, as shown in Figure 5 (right), and train using the DPP loss function (4.4).
We use the minimum eradication time u obtained from Algorithm 1 as a reference and compare it with the approximate minimum eradication time computed by our PINN algorithm. Similarly, we compute the eradication time under the control ur0 using the Runge-Kutta method with a timestep of dt=0.001. The mean square error (MSE) is then evaluated over the entire set of reference points obtained from Algorithm 1 across the domain.
Table 1 summarizes the MSE values corresponding to different choices of the scaling factors Nx and Ny. As observed, applying variable scaling with various scaling parameters Nx and Ny consistently improves the MSE of the base case (Nx=Ny=1), which aligns with the error analysis presented in Section 3.3. Among the many choices of scaling parameters, we empirically found that the result with Nx=1,Ny=20 results in the lowest MSE, which reduced the MSE of the baseline setting by approximately 3.743×10−5−5.893×10−63.743×10−5×100%≃84.3%.
Nx | Ny | MSE of u | MSE of ur0 |
1 | 1 | 3.743×10−5 | 7.067×10−6 |
1 | 20 | 5.893×10−6 | 3.883×10−6 |
1 | 40 | 7.992×10−6 | 5.490×10−5 |
1 | 80 | 4.608×10−5 | 2.217×10−5 |
2 | 1 | 1.488×10−5 | 7.130×10−6 |
2 | 20 | 9.979×10−6 | 7.257×10−6 |
2 | 40 | 8.175×10−6 | 3.403×10−5 |
4 | 80 | 5.372×10−5 | 1.917×10−5 |
5 | 1 | 7.730×10−6 | 1.046×10−5 |
We also investigate the effect of network depth in the domain [0,20]×[0.01,10]. Table 2 presents the results for different numbers of hidden layers. Training with a model containing five hidden layers yields a lower MSE compared with training with a single hidden layer, demonstrating that deeper networks improve the approximation accuracy. The level set for values of u and ur0 is presented in Figure 6.
Number of hidden layers | MSE of u | MSE of ur0 |
5 | 5.893×10−6 | 3.883×10−6 |
1 | 3.111×10−4 | 4.006×10−5 |
Furthermore, we provide a comparison with the finite difference method (FDM) for solving ur0 in Appendix C, where FDM provides inaccurate estimates of the solution, supporting the idea that our method is more efficient.
To evaluate the trained uncontrolled SIR dynamics w, we use the Runge-Kutta method. For reference data, we first sample {(xi,yi)}1000i=1⊂[0,20]×[0.01,1] and approximate (S(tk),I(tk)), satisfying the uncontrolled dynamics (2.7) with the initial conditions (S(0),I(0))=(xi,yi) for all i=1,…,1000 and tk∈{0.025k∣0≤k≤100}, using the Runge-Kutta method.
The MSE of w in estimating the population dynamics is 7.557×10−3, indicating that the neural network effectively estimates the uncontrolled dynamics. Figure 7 illustrates a comparison between the trajectories approximated by the trained neural network and those obtained via the Runge-Kutta method.
Recalling (2.11) and (2.10), the region where the optimal switching time is nonzero is characterized by the inequality ∂xu(x,y)≤0, as demonstrated in Figure 8.
To evaluate the accuracy of the estimated switching time, we compute the MSE of the estimated switching time τ using the samples {(xi,yi)}1000i=1⊂SC.
The optimal switching time for the initial conditions (S(0),I(0)):{(0.01i,0.01j)∣0≤i≤2000,0≤j≤100} is approximated using the reference values obtained from Algorithm 1, with a time discretization of dt=0.001 and dτ=0.001.
Furthermore, we assess the performance of our model using different numbers of hidden layers. When using a single hidden layer, the model achieves an MSE of 6.662×10−4. However, increasing the depth to five hidden layers results in a slightly higher MSE of 6.667×10−4, indicating that additional depth does not necessarily improve accuracy in this setting.
Figure 9 confirms that our trained model achieves accurate estimation of τ with minimal error.
We proposed a novel approach for approximating the minimum eradication time and synthesizing optimal vaccination strategies for a controlled SIR model via variable scaling PINNs and the dynamic programming principle. A salient feature of our method is that it does not require domain discretization and offers a computationally efficient solution to HJB equations related to the minimum eradication time. The experimental results confirmed the accuracy and robustness of our method in approximating the eradication time and deriving optimal vaccination strategies. Furthermore, the theoretical justification for our approach is established on the basis of NTK theory.
Despite these contributions, our study has several limitations. The current framework assumes a time-homogeneous controlled SIR model with constant infection and recovery rates, which might limit its applicability to real-world scenarios with time-inhomogeneous dynamics or heterogeneous populations. Additionally, while our method improves computational efficiency, the training of physics-informed neural networks remains sensitive to the hyperparameter choices and boundary conditions, which require further investigation.
Future work will focus on extending this framework to time-inhomogeneous SIR models, for which theoretical support has been proposed in [9]. In addition, exploring systems with state-dependent infection rates β=β(S,I), as considered in nonlinear incidence models [5], is also a promising direction. Another important extension is to adapt the framework to impose the minimal, non-augmented boundary conditions that ensure the well-posedness of the HJB equation. These directions would broaden the applicability of the proposed approach and address key theoretical and practical challenges in controlled SIR models.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
This research was supported by Seoul National University of Science and Technology.
The authors declare there is no conflict of interest.
The distribution of collocation points significantly affects PINNs' performance. We tested both uniform and exponential sampling strategies, and observed that both are effective under our experimental setup.
For the training of the value functions u and ur0, we use 1000 collocation points per batch, together with 300 fixed boundary points to ensure sufficient coverage near the domain boundaries. To train a neural network w(x,y,t;ω) that approximates the uncontrolled SIR dynamics, we use 1000 collocation and initial points per batch, along with 4000 fixed boundary points to accurately capture the evolution under the uncontrolled dynamics. For the training of the switching time function τ(x,y;θ), we use 1000 collocation points per batch sampled from the SC.
These choices provide a balance between numerical accuracy and computational efficiency, and are consistent with the stability and convergence behavior reported in the main paper.
We also note that the choice of timestep size in learning the uncontrolled SIR model is critical for obtaining accurate numerical solutions. To determine appropriate values, we first conduct preliminary experiments by plotting the SIR trajectories. We generate (S,I) trajectories with different timestep sizes and confirm that using dt=0.01 and dt=0.001 leads to consistent results, as seen in Figure A1. On the other hand, we choose a smaller timestep dt=dτ=0.001 when approximating the minimum eradication time and optimal switching time for better accuracy.
To further investigate the robustness of the variable scaling method, we conducted additional experiments under various parameter settings. Specifically, we examined how changes in the infection rate β, the recovery rate γ, and the domain scaling factors affect the error.
As seen in Table A1, the mean squared residual errors for different scaling parameters (Nx,Ny) across several combinations of (β,γ) and domain widths (ℓx,ℓy). In all cases, the appropriate choice of variable scaling leads to a significant reduction in error, which confirms the robustness and generalizability of the proposed scaling scheme.
β | γ | x | y | Nx | Ny | MSE of u |
2 | 4 | [0, 5] | [μ, 5] | 1 | 1 | 7.488×10−5 |
2 | 4 | [0, 5] | [μ, 5] | 2 | 2 | 4.350×10−5 |
2 | 4 | [0, 15] | [μ, 8] | 1 | 1 | 4.049×10−5 |
2 | 4 | [0, 15] | [μ, 8] | 1 | 2 | 3.923×10−5 |
2 | 2 | [0, 5] | [μ, 10] | 1 | 1 | 2.983×10−5 |
2 | 2 | [0, 5] | [μ, 10] | 1 | 2 | 2.694×10−5 |
2 | 2 | [0, 5] | [μ, 10] | 1 | 8 | 2.407×10−5 |
To compare our method with the standard finite difference method (FDM), we use the same experimental parameters (β=2 and γ=10) as in Figure 6(b) and consider the following equation:
βxy∂xur0+x∂xur0+(γ−βx)y∂yur0=1in[0,ℓx]×[μ,μ+ℓy], | (6.1) |
with boundary conditions
ur0(0,y)=1γln(yμ)forμ≤y≤μ+ℓy,ur0(x,μ)=0for0≤x≤γβ,ur0(x,μ)=b2(x)forγβ≤x≤ℓx,ur0(x,μ+ℓy)=b1(x)for0≤x≤ℓx, |
where the boundary functions b1(x) and b2(x) are obatined via Algorithm 1.
To implement the standard finite difference scheme, we discretize the domain [0,ℓx]×[μ,μ+ℓy] using Δx=Δy=0.01, and obtain the discretized domain
G:={(xi=iΔx,yj=μ+jΔy):0≤i≤ℓx/Δx,0≤j≤ℓy/Δy}. |
For all (xi,yj)∈G, letting
Ui,j≈ur0(xi,yj), |
the first-order scheme is given by
(βxiyj+xi)Uj,i−Uj,i−1Δx+(γ−βxi)yjUj,i−Uj−1,iΔy=1, |
where
∂xur0(xi,yj)≈Uj,i−Uj,i−1Δx,∂yur0(xi,yj)≈Uj,i−Uj−1,iΔy. |
However, as shown in Figure A2, while the finite difference method provides accurate estimates of the minimum eradication time in the region S(0)≤γβ=5, it fails to yield stable results beyond this threshold. In particular, for S(0)>5, where the value function exhibits steep gradients and rapid transitions (see Figure A2), the FDM scheme produces numerical instabilities and diverging solutions.
In contrast, the PINN approach demonstrates superior robustness and stability, successfully capturing the solution structure even in such challenging regions. This comparison highlights the advantages of the proposed PINN framework, especially in handling domains with sharp transitions or lacking reliable stability theory for traditional schemes.
[1] |
W. O. Kermack, A. G. McKendrick, A contribution to the mathematical theory of epidemics, Math. Phys. Eng. Sci., 115 (1927), 700–721. https://doi.org/10.1098/rspa.1927.0118 doi: 10.1098/rspa.1927.0118
![]() |
[2] |
M. Barro, A. Guiro, D. Quedraogo, Optimal control of a SIR epidemic model with general incidence function and a time delays, CUBO A Math. J., 20 (2018), 53–66. https://doi.org/10.4067/S0719-06462018000200053 doi: 10.4067/S0719-06462018000200053
![]() |
[3] |
P. A. Bliman, M. Duprez, Y. Privat, N. Vauchelet, Optimal immunity control and final size minimization by social distancing for the SIR epidemic model, J. Optim. Theory Appl., 189 (2021), 408–436. https://doi.org/10.1007/s10957-021-01830-1 doi: 10.1007/s10957-021-01830-1
![]() |
[4] |
L. Bolzoni, E. Bonacini, C. Soresina, M. Groppi, Time-optimal control strategies in SIR epidemic models, Math. Biosci., 292 (2017), 86–96. https://doi.org/10.1016/j.mbs.2017.07.011 doi: 10.1016/j.mbs.2017.07.011
![]() |
[5] |
E. V. Grigorieva, E. N. Khailov, A. Korobeinikov, Optimal control for a SIR epidemic model with nonlinear incidence rate, Math. Model. Nat. Phenom., 11 (2016), 89–104. https://doi.org/10.1051/mmnp/201611407 doi: 10.1051/mmnp/201611407
![]() |
[6] |
R. Hynd, D. Ikpe, T. Pendleton, An eradication time problem for the SIR model, J. Differ. Equations, 303 (2021), 214–252. https://doi.org/10.1016/j.jde.2021.09.001 doi: 10.1016/j.jde.2021.09.001
![]() |
[7] |
R. Hynd, D. Ikpe, T. Pendleton, Two critical times for the SIR model, J. Math. Anal. Appl., 505 (2022), 125507. https://doi.org/10.1016/j.jmaa.2021.125507 doi: 10.1016/j.jmaa.2021.125507
![]() |
[8] | L. S. Pontryagin, Mathematical Theory of Optimal Processes, Routledge, 2018. |
[9] | J. Jang, Y. Kim, On a minimum eradication time for the SIR model with time-dependent coefficients, preprint, arXiv: 2311.14657. https://doi.org/10.48550/arXiv.2311.14657 |
[10] | H. V. Tran, Hamilton-Jacobi equations: Theory and applications, Am. Math. Soc., 213 (2021), 322. |
[11] |
S. Ko, S. Park, VS-PINN: A fast and efficient training of physics-informed neural networks using variable-scaling methods for solving PDEs with stiff behavior, J. Comput. Phys., 529 (2025), 113860. https://doi.org/10.1016/j.jcp.2025.113860 doi: 10.1016/j.jcp.2025.113860
![]() |
[12] | A. Jacot, F. Gabriel, C. Hongler, Neural tangent kernel: Convergence and generalization in neural networks, in STOC 2021: Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, (2021). https://doi.org/10.1145/3406325.3465355 |
[13] |
M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys., 378 (2019), 686–707. https://doi.org/10.1016/j.jcp.2018.10.045 doi: 10.1016/j.jcp.2018.10.045
![]() |
[14] |
E. Kharazmi, M. Cai, X. Zheng, G. Lin, G. E. Karniadakis, Identifiability and predictability of integer-and fractional-order epidemiological models using physics-informed neural networks, Nat. Comput. Sci., 1 (2021), 744–753. https://doi.org/10.1038/s43588-021-00158-0 doi: 10.1038/s43588-021-00158-0
![]() |
[15] |
A. Yazdani, L. Lu, M. Raissi, G. E. Karniadakis, Systems biology informed deep learning for inferring parameters and hidden dynamics, PLoS Comput. Biol., 16 (2020), e1007575. https://doi.org/10.1371/journal.pcbi.1007575 doi: 10.1371/journal.pcbi.1007575
![]() |
[16] |
S. Cai, Z. Mao, Z. Wang, M. Yin, G. E. Karniadakis, Physics-informed neural networks (PINNs) for fluid mechanics: A review, Acta Mech. Sin., 37 (2021), 1727–1738. https://doi.org/10.1007/s10409-021-01148-1 doi: 10.1007/s10409-021-01148-1
![]() |
[17] |
M. Raissi, A. Yazdani, G. E. Karniadakis, Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations, Science, 367 (2020), 1026–1030. https://doi.org/10.1126/science.aaw4741 doi: 10.1126/science.aaw4741
![]() |
[18] |
X. Jin, S. Cai, H. Li, G. E. Karniadakis, NSFnets (Navier-Stokes flow nets): Physics-informed neural networks for the incompressible Navier-Stokes equations, J. Comput. Phys., 426 (2021), 109951. https://doi.org/10.1016/j.jcp.2020.109951 doi: 10.1016/j.jcp.2020.109951
![]() |
[19] |
X. Wang, J. Li, J. Li, A deep learning based numerical PDE method for option pricing, Comput. Econ., 62 (2023), 149–164. https://doi.org/10.1007/s10614-022-10279-x doi: 10.1007/s10614-022-10279-x
![]() |
[20] |
Y. Bai, T. Chaolu, S. Bilige, The application of improved physics-informed neural network (IPINN) method in finance, Nonlinear Dyn., 107 (2022), 3655–3667. https://doi.org/10.1007/s11071-021-07146-z doi: 10.1007/s11071-021-07146-z
![]() |
[21] |
G. Kissas, Y. Yang, E. Hwuang, W. R. Witschey, J. A. Detre, P. Perdikaris, Machine learning in cardiovascular flows modeling: Predicting arterial blood pressure from non-invasive 4d flow MRI data using physics-informed neural networks, Comput. Methods Appl. Mech. Eng., 358 (2020), 112623. https://doi.org/10.1016/j.cma.2019.112623 doi: 10.1016/j.cma.2019.112623
![]() |
[22] |
F. Sahli C., Y. Yang, P. Perdikaris, D. E. Hurtado, E. Kuhl, Physics-informed neural networks for cardiac activation mapping, Front. Phys., 8 (2020), 42. https://doi.org/10.3389/fphy.2020.00042 doi: 10.3389/fphy.2020.00042
![]() |
[23] |
S. Wang, Y. Teng, P. Perdikaris, Understanding and mitigating gradient flow pathologies in physics-informed neural networks, SIAM J. Sci. Comput., 43 (2021), A3055–A3081. https://doi.org/10.1137/20M1318043 doi: 10.1137/20M1318043
![]() |
[24] |
Y. Liu, L. Cai, Y. Chen, B. Wang, Physics-informed neural networks based on adaptive weighted loss functions for Hamilton–Jacobi equations, Math. Biosci. Eng., 19 (2022), 12866–12896. https://doi.org/10.3934/mbe.2022601 doi: 10.3934/mbe.2022601
![]() |
[25] |
S. Wang, X. Yu, P. Perdikaris, When and why PINNs fail to train: A neural tangent kernel perspective, J. Comput. Phys., 449 (2022), 110768. https://doi.org/10.1016/j.jcp.2021.110768 doi: 10.1016/j.jcp.2021.110768
![]() |
[26] |
W. Ji, W. Qiu, Z. Shi, S. Pan, S. Deng, Stiff-pinn: Physics-informed neural network for stiff chemical kinetics, J. Phys. Chem. A, 125 (2021), 8098–8106. https://doi.org/10.1021/acs.jpca.1c05102 doi: 10.1021/acs.jpca.1c05102
![]() |
[27] |
L. D. McClenny, U. M. Braga-Neto, Self-adaptive physics-informed neural networks, J. Comput. Phys., 474 (2023), 111722. https://doi.org/10.1016/j.jcp.2022.111722 doi: 10.1016/j.jcp.2022.111722
![]() |
[28] |
S. Mowlavi, S. Nabi, Optimal control of PDEs using physics-informed neural networks, J. Comput. Phys., 473 (2023), 111731. https://doi.org/10.1016/j.jcp.2022.111731 doi: 10.1016/j.jcp.2022.111731
![]() |
[29] | Y. Meng, R. Zhou, A. Mukherjee, M. Fitzsimmons, C. Song, J. Liu, Physics-informed neural network policy iteration: Algorithms, convergence, and verification, preprint, arXiv: 2402.10119. https://doi.org/10.48550/arXiv.2402.10119 |
[30] | J. Y. Lee, Y. Kim, Hamilton-Jacobi based policy-iteration via deep operator learning, preprint, arXiv: 2406.10920. https://doi.org/10.48550/arXiv.2406.10920 |
[31] |
W. Tang, H. Tran, Y. Zhang, Policy iteration for the deterministic control problems—A viscosity approach, SIAM J. Control Optim., 63 (2025), 375–401. https://doi.org/10.1137/24M1631602 doi: 10.1137/24M1631602
![]() |
[32] |
S. Yin, J. Wu, P. Song, Optimal control by deep learning techniques and its applications on epidemic models, J. Math. Biol., 86 (2023), 36. https://doi.org/10.1007/s00285-023-01873-0 doi: 10.1007/s00285-023-01873-0
![]() |
[33] |
Y. Ye, A. Pandey, C. Bawden, D. Sumsuzzman, R. Rajput, A. Shoukat, et al., Integrating artificial intelligence with mechanistic epidemiological modeling: A scoping review of opportunities and challenges, Nat. Commun., 16 (2025), 581. https://doi.org/10.1038/s41467-024-55461-x doi: 10.1038/s41467-024-55461-x
![]() |
[34] |
Z. Yang, Z. Zeng, K. Wang, S. Wong, W. Liang, M. Zanin, et al., Modified SEIR and AI prediction of the epidemics trend of COVID-19 in china under public health interventions, J. Thorac. Dis., 12 (2020), 165. https://doi.org/10.21037/jtd.2020.02.64 doi: 10.21037/jtd.2020.02.64
![]() |
[35] | L. Evans, Partial differential equations: 2nd edition, in Graduate Studies in Mathematics, Am. Math. Soc., 19 (2010), 749. |
[36] | R. V. Hogg, J. W. McKean, A. T. Craig, Introduction to Mathematical Statistics, 8th edition, 2013. |
[37] | D. Kingma, Adam: A method for stochastic optimization, preprint, arXiv: 1412.6980. |
Nx | Ny | MSE of u | MSE of ur0 |
1 | 1 | 3.743×10−5 | 7.067×10−6 |
1 | 20 | 5.893×10−6 | 3.883×10−6 |
1 | 40 | 7.992×10−6 | 5.490×10−5 |
1 | 80 | 4.608×10−5 | 2.217×10−5 |
2 | 1 | 1.488×10−5 | 7.130×10−6 |
2 | 20 | 9.979×10−6 | 7.257×10−6 |
2 | 40 | 8.175×10−6 | 3.403×10−5 |
4 | 80 | 5.372×10−5 | 1.917×10−5 |
5 | 1 | 7.730×10−6 | 1.046×10−5 |
Number of hidden layers | MSE of u | MSE of ur0 |
5 | 5.893×10−6 | 3.883×10−6 |
1 | 3.111×10−4 | 4.006×10−5 |
β | γ | x | y | Nx | Ny | MSE of u |
2 | 4 | [0, 5] | [μ, 5] | 1 | 1 | 7.488×10−5 |
2 | 4 | [0, 5] | [μ, 5] | 2 | 2 | 4.350×10−5 |
2 | 4 | [0, 15] | [μ, 8] | 1 | 1 | 4.049×10−5 |
2 | 4 | [0, 15] | [μ, 8] | 1 | 2 | 3.923×10−5 |
2 | 2 | [0, 5] | [μ, 10] | 1 | 1 | 2.983×10−5 |
2 | 2 | [0, 5] | [μ, 10] | 1 | 2 | 2.694×10−5 |
2 | 2 | [0, 5] | [μ, 10] | 1 | 8 | 2.407×10−5 |
Nx | Ny | MSE of u | MSE of ur0 |
1 | 1 | 3.743×10−5 | 7.067×10−6 |
1 | 20 | 5.893×10−6 | 3.883×10−6 |
1 | 40 | 7.992×10−6 | 5.490×10−5 |
1 | 80 | 4.608×10−5 | 2.217×10−5 |
2 | 1 | 1.488×10−5 | 7.130×10−6 |
2 | 20 | 9.979×10−6 | 7.257×10−6 |
2 | 40 | 8.175×10−6 | 3.403×10−5 |
4 | 80 | 5.372×10−5 | 1.917×10−5 |
5 | 1 | 7.730×10−6 | 1.046×10−5 |
Number of hidden layers | MSE of u | MSE of ur0 |
5 | 5.893×10−6 | 3.883×10−6 |
1 | 3.111×10−4 | 4.006×10−5 |
β | γ | x | y | Nx | Ny | MSE of u |
2 | 4 | [0, 5] | [μ, 5] | 1 | 1 | 7.488×10−5 |
2 | 4 | [0, 5] | [μ, 5] | 2 | 2 | 4.350×10−5 |
2 | 4 | [0, 15] | [μ, 8] | 1 | 1 | 4.049×10−5 |
2 | 4 | [0, 15] | [μ, 8] | 1 | 2 | 3.923×10−5 |
2 | 2 | [0, 5] | [μ, 10] | 1 | 1 | 2.983×10−5 |
2 | 2 | [0, 5] | [μ, 10] | 1 | 2 | 2.694×10−5 |
2 | 2 | [0, 5] | [μ, 10] | 1 | 8 | 2.407×10−5 |