This paper mainly discusses non-zero-sum Nash differential games for stochastic differential equations (SDEs) involving time-varying coefficients and infinite Markov jumps. First, a necessary and sufficient condition for the existence of Nash equilibrium strategies is given, which reduces the non-zero-sum Nash differential game to solving a system of countable coupled generalized differential Riccati equations (CGDREs). As an application, a unified treatment is presented for H2, H∞, and H2/H∞ control by the Nash game approach, which reveals the relationship among these three problems. Finally, the theoretical results are illustrated by a numerical example.
Citation: Yueying Liu, Mengping Sun, Zhen Wang, Xiangyun Lin, Cuihua Zhang. Nash equilibrium strategies for non-zero-sum differential games of SDEs with time-varying coefficient and infinite Markov jumps[J]. Electronic Research Archive, 2025, 33(4): 2525-2542. doi: 10.3934/era.2025112
Applications of Markov jump stochastic systems have been found in a variety of fields, such as robotics, economics, and fault detection. For Markov jump stochastic systems, many problems have been studied in the literature, such as stability and stabilization, see [1,2,3,4,5,6], reinforcement learning-based optimization, see [7,8], H2 optimal control, see [9,10], H∞ control, see [11,12,13,14], H2/H∞ control, see [15,16], and game problems, see [17,18,19]. In most existing results, however, the Markov chain takes values in a finite state space, and few are based on the assumption that the Markov chain is valued in a countable state space. Thus, this is a significant research topic.
From an applicable point of view, a countable Markov chain may be better suited to describe sudden changes in many practical scenarios, such as modern queueing theory and solar thermal receivers [20]. From a theoretical point of view, in terms of stability, stochastic systems with finite and infinite Markov jumps are fundamentally different. The essential reason is that the causal and anticausal Lyapunov operators of infinite Markov jump systems are no longer adjoint. As a special hybrid system, infinite Markov jump stochastic systems contain two kinds of mixed dynamics. One is called the mode, which is described by a Markov process with countably many discrete states. The other, called the state, is described by stochastic differential equations (SDEs) for each mode. To be specific, [21] clarified the relationship among four kinds of stability for stochastic systems with countable Markov chains. Further, [22] took into account the effects of time delay and parametric uncertainties and also summarized the relationship among the above four stabilities. On this basis, some controller synthesis problems have been solved in [23,24,25]. Therefore, it is of significance both in theory and in practice to consider stochastic systems with countable Markov chains.
Differential games are widely applied in many fields, such as engineering, finance, and biology. For the special case where the state equations are linear and the payoff functionals are quadratic, [26] considered the linear quadratic (LQ) stochastic zero-sum differential game for the Markov jump system driven by Brownian motion and obtained a linear feedback saddle point characterized by a set of coupled Riccati differential equations. Under a more general functional, both open-loop and closed-loop solvability were discussed in [27], where the solvability of the associated Riccati equations under the uniform convexity-concavity condition was studied. Besides, differential equation characterizations of the lower and upper value functions of the game under a rather general setup were obtained in [28]. Further, [29] extended differential games for Markov jump-diffusion models to the leader-follower Stackelberg game framework. For non-zero-sum differential games, games of regime-switching diffusions with mean-field interactions were considered in [30]. However, for such systems with countable Markov chains, few results on the Nash game problem have been reported. In [31], although a unified treatment of the three control design problems, that is, H2, H∞, and H2/H∞ control, was presented via the Nash equilibrium solution, a countable Markov chain was not involved. In [32], the Nash game problem was studied, but the effect of the control term on the noise was neglected; that work focused on revealing the relationship between Nash equilibrium strategies and H2/H∞ control, which is basically different from the starting point of our research emphasizing a unified treatment of the three control problems. In [33], although the finite horizon H2/H∞ controller design was investigated for a system with a countable Markov chain, the Nash equilibrium points and the finite horizon H2/H∞ controller design are not equivalent for system (2.1); see Remark 4.3.
Hence, it is necessary to investigate the non-zero-sum Nash differential games for SDEs involving time-varying coefficient and infinite Markov jumps, which is the main motivation of the paper.
In this paper, we discuss the non-zero-sum Nash differential games for SDEs involving time-varying coefficients and infinite Markov jumps, in which the Markov chain takes values in a countable state space. The contribution of our paper rests on four aspects. First, since finite and infinite Markov jump systems differ essentially in stability, countably infinite dimensional Banach spaces are introduced, whose elements are linear and bounded operators. Next, with tools from stochastic analysis, the Nash equilibrium strategies are obtained from coupled generalized differential Riccati equations (CGDREs), which form a countable family of coupled Riccati equations; this makes the equations more difficult to handle than those in [31,32]. Specifically, a necessary and sufficient condition for the existence of Nash equilibrium strategies is given based on the pseudo-inverse matrix. Moreover, to demonstrate the theoretical value of the above game result, a unified treatment of the three control problems is presented with corresponding choices of parameters. Last but not least, to overcome the difficulty of solving the CGDREs analytically, a discretization method and a backward recursive algorithm are applied to solve the CGDREs approximately.
The structure of this article is as follows: Preliminary discussions are included in Section 2. We show in Section 3 that Nash equilibrium strategies can be obtained by the CGDREs. Based on this result, in Section 4, a unified treatment is presented for the three control problems. In Section 5, one numerical example is given. Section 6 describes the conclusions.
The following symbols are used. Rn (Rl×m) is the n-dimensional real Euclidean space (the linear space of all l×m real matrices). A′ and A† stand for the transpose and pseudo-inverse of the matrix A, respectively. The totality of P-null sets is denoted by N, Fς:=σ(w(s),0≤s≤ς)∨σ(ϖ(s),0≤s≤ς)∨N. l2([0,T];Rl)={e∈Rl | e is Fς-measurable and ∫₀ᵀE‖e(ς)‖²dς<∞}. D:={1,2,…}.
Consider the following linear SDEs with time-varying coefficients and infinite Markov jumps:
dx(ς)=[C1(ς,ϖς)x(ς)+D1(ς,ϖς)η(ς)+E1(ς,ϖς)σ(ς)]dς+[C2(ς,ϖς)x(ς)+D2(ς,ϖς)η(ς)+E2(ς,ϖς)σ(ς)]dw(ς),
z(ς)=[A(ς,ϖς)x(ς); B(ς,ϖς)η(ς)], B(ς,ϖς)′B(ς,ϖς)=Inη,   (2.1)
where x(0)=x0∈Rn, ϖ(0)=ϖ0∈D, x(ς)∈Rn represents the system state, z(ς)∈Rnz stands for the measurement output, and η(ς)∈Rnη and σ(ς)∈Rnσ are the control processes of the two players, respectively. w(ς) represents a standard one-dimensional Brownian motion. {ϖς}ς∈[0,T] is a right continuous and homogeneous Markov process taking values in the countable state space D. P=[pℵj(ς)] is the transition probability matrix of {ϖς}ς∈[0,T] with pℵj(ς)=P(ϖs+ς=j|ϖs=ℵ), which is assumed to be stationary. The infinitesimal matrix of {ϖς}ς∈[0,T] is defined as Φ=(ϕℵj)ℵ,j∈D, where ϕℵj=lim_{ς→0}[pℵj(ς)−pℵj(0)]/ς, pℵj(0)=δ(ℵ−j), ℵ,j∈D. It should be noted that ϕℵj≥0 when ℵ≠j, and for all ℵ∈D and some c>0, 0≤−ϕℵℵ=∑_{j∈D,j≠ℵ}ϕℵj<c<∞. In addition, P is nondegenerate, and for s∈[0,T], πs(ℵ):=P(ϖs=ℵ)>0, ℵ∈D.
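As a rough numerical illustration of such a mode process, the generator description above can be simulated by the standard hold-and-jump construction after truncating the countable state space. The truncation level N, the bounded birth-death rates, and the seed below are illustrative assumptions (not quantities from the paper), chosen so that 0 ≤ −ϕℵℵ ≤ λ+μ < c as required above.

```python
import numpy as np

# Hold-and-jump simulation of the mode process on a truncated state space.
def simulate_modes(Phi, t_end, i0, rng):
    t, i, path = 0.0, i0, [(0.0, i0)]
    while True:
        rate = -Phi[i, i]                    # total jump intensity out of state i
        if rate <= 0.0:                      # absorbing state (none in this example)
            break
        t += rng.exponential(1.0 / rate)     # exponential holding time
        if t >= t_end:
            break
        p = np.maximum(Phi[i], 0.0) / rate   # jump law: phi_ij / (-phi_ii), j != i
        i = int(rng.choice(len(p), p=p))
        path.append((t, i))
    return path

# Illustrative truncated birth-death generator with bounded rates
N, lam, mu = 60, 1.2, 0.8
Phi = np.zeros((N, N))
for i in range(N):
    if i + 1 < N:
        Phi[i, i + 1] = lam
    if i >= 1:
        Phi[i, i - 1] = mu
    Phi[i, i] = -Phi[i].sum()                # each generator row sums to zero

rng = np.random.default_rng(0)
path = simulate_modes(Phi, t_end=2.0, i0=5, rng=rng)
```

The simulated `path` records the jump times and modes on [0, t_end]; the off-diagonal entries of a generator row, normalized by the exit rate, give the post-jump distribution.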
Am×n1 (Am×n∞) represents the real Banach space of {A | A=(A(1),A(2),⋯), A(ℵ)∈Rm×n} with the norm ‖A‖1=∑_{ℵ=1}^∞‖A(ℵ)‖<∞ (‖A‖∞=sup_{ℵ∈D}‖A(ℵ)‖<∞). When m=n, Am×n1 (Am×n∞) will be expressed as An1 (An∞). For A∈An+1 (An+∞), A≥0 if and only if A(ℵ)≥0 for all ℵ∈D. A∈˜An+1 (˜An+∞) means A>0. By C1([0,T],An+∞) (Cb([0,T],An+∞)), we denote all continuously differentiable (bounded) mappings g, and by C1b([0,T],An+∞), we denote all mappings g such that g(ς) and dg(ς)/dς are bounded.
Given two parameters with α>0 and β≥0, the relevant cost functionals are defined as follows:
J1(x0,ϖ0,η(⋅),σ(⋅))=E{∫₀ᵀ[α²‖σ(ς)‖²−‖z(ς)‖²]dς|ϖ0=ℵ},   (2.2)
J2(x0,ϖ0,η(⋅),σ(⋅))=E{∫₀ᵀ[‖z(ς)‖²−β²‖σ(ς)‖²]dς|ϖ0=ℵ}.   (2.3)
We will look for the optimal Nash equilibrium strategies (η∗(⋅),σ∗(⋅)) to minimize cost functionals (2.2) and (2.3) subject to system (2.1).
Definition 2.1. For all admissible (η(⋅),σ(⋅))∈l2([0,T];Rnη)×l2([0,T];Rnσ), if
J1(x0,ϖ0,η∗(⋅),σ∗(⋅))≤J1(x0,ϖ0,η∗(⋅),σ(⋅)),   (2.4)
J2(x0,ϖ0,η∗(⋅),σ∗(⋅))≤J2(x0,ϖ0,η(⋅),σ∗(⋅)),   (2.5)
then (η∗(⋅),σ∗(⋅))∈l2([0,T];Rnη)×l2([0,T];Rnσ) is called the Nash equilibrium strategy.
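The two inequalities of Definition 2.1 can be illustrated on a toy static quadratic game, where each player's best response follows from a linear stationarity system. The cost functions below are hypothetical and serve only to make the unilateral-deviation check concrete; they are not the functionals (2.2)-(2.3).

```python
import numpy as np

def J1(u, v):
    return u**2 + 0.5 * u * v + u          # player 1 minimizes over u (convex in u)

def J2(u, v):
    return v**2 + 0.3 * u * v - v          # player 2 minimizes over v (convex in v)

# Stationarity: dJ1/du = 2u + 0.5v + 1 = 0 and dJ2/dv = 0.3u + 2v - 1 = 0
A = np.array([[2.0, 0.5],
              [0.3, 2.0]])
b = np.array([-1.0, 1.0])
u_star, v_star = np.linalg.solve(A, b)

# No unilateral deviation improves either cost: the analogue of (2.4)-(2.5)
for d in np.linspace(-2.0, 2.0, 81):
    assert J1(u_star + d, v_star) >= J1(u_star, v_star) - 1e-12
    assert J2(u_star, v_star + d) >= J2(u_star, v_star) - 1e-12
```

Because each cost is strictly convex in the player's own decision, the stationary point is the unique Nash equilibrium of this toy game.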
Lemma 2.1 ([33]). For system (2.1) with Dl(ς,ϖς)=0, El(ς,ϖς)=0, l=1,2, if G(ς,ϖς)∈C1b([0,T],An+∞), then we have
E[x(T)′G(T,ϖT)x(T)−x′0G(0,ϖ0)x0|ϖ0=ℵ]=E{∫₀ᵀ[x(ς)′(˙G(ς,ϖς)+G(ς,ϖς)C1(ς,ϖς)+C1(ς,ϖς)′G(ς,ϖς)+∑_{j=1}^∞ϕϖςjG(ς,j)+C2(ς,ϖς)′G(ς,ϖς)C2(ς,ϖς))x(ς)]dς|ϖ0=ℵ}   (2.6)
for (x0,ℵ)∈Rn×D.
Lemma 2.2 ([36]). Let T1, T2, T3, and F be given matrices of appropriate sizes. Then the matrix equation T1YT2=T3 admits a solution Y if and only if T1T†1T3T†2T2=T3. Moreover, the solution can be represented by Y=T†1T3T†2+F−T†1T1FT2T†2.
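A quick numerical check of Lemma 2.2 can be sketched with NumPy's Moore-Penrose pseudo-inverse. The matrices below are randomly generated illustrations: a consistent equation T1 Y T2 = T3 is built by construction, the solvability condition is verified, and a solution is recovered for an arbitrary F.

```python
import numpy as np

rng = np.random.default_rng(1)
T1 = rng.standard_normal((4, 3)) @ rng.standard_normal((3, 4))   # rank <= 3, singular
T2 = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 5))   # rank <= 2, singular
T3 = T1 @ rng.standard_normal((4, 5)) @ T2                       # consistent by construction

T1p, T2p = np.linalg.pinv(T1), np.linalg.pinv(T2)

# Solvability condition of Lemma 2.2: T1 T1† T3 T2† T2 = T3
assert np.allclose(T1 @ T1p @ T3 @ T2p @ T2, T3)

# General solution Y = T1† T3 T2† + F - T1† T1 F T2 T2† for an arbitrary F
F = rng.standard_normal((4, 5))
Y = T1p @ T3 @ T2p + F - T1p @ T1 @ F @ T2 @ T2p
assert np.allclose(T1 @ Y @ T2, T3)
```

The second identity holds because T1T†1T1 = T1 and T2T†2T2 = T2, so the F-dependent terms cancel after multiplication by T1 and T2.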
The purpose of this section is to obtain necessary and sufficient conditions for the existence of Nash equilibrium strategies. To this end, suppose that the feedback strategies take the following form [37]:
η(ς)=Γ2(ς,ϖς)x(ς), σ(ς)=Γ1(ς,ϖς)x(ς).   (3.1)
Theorem 3.1. Under the assumptions that Cm(ς,ϖς)∈Cb([0,T],An∞), Dm(ς,ϖς)∈Cb([0,T],An×nη∞), Em(ς,ϖς)∈Cb([0,T],An×nσ∞), m=1,2, and A(ς,ϖς)∈Cb([0,T],Anz×n∞) in (2.1), for the Nash game problem (2.4) and (2.5), a unique Nash equilibrium strategy
(η∗(ς)=Γ2(ς,ϖς)x(ς),σ∗(ς)=Γ1(ς,ϖς)x(ς))
exists iff the following CGDREs:
−˙G1(ς,ℵ)=[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)+G1(ς,ℵ)[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]+[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]−A(ς,ℵ)′A(ς,ℵ)−Γ2(ς,ℵ)′Γ2(ς,ℵ)+∑_{j=1}^∞ϕℵjG1(ς,j)−H1(ς,ℵ)M1(ς,ℵ)†H1(ς,ℵ)′,
M1(ς,ℵ)M1(ς,ℵ)†H1(ς,ℵ)′−H1(ς,ℵ)′=0,
G1(T,ℵ)=0, M1(ς,ℵ)≥0, (ς,ℵ)∈[0,T]×D,   (3.2)
−˙G2(ς,ℵ)=[C1(ς,ℵ)+E1(ς,ℵ)Γ1(ς,ℵ)]′G2(ς,ℵ)+G2(ς,ℵ)[C1(ς,ℵ)+E1(ς,ℵ)Γ1(ς,ℵ)]+[C2(ς,ℵ)+E2(ς,ℵ)Γ1(ς,ℵ)]′G2(ς,ℵ)[C2(ς,ℵ)+E2(ς,ℵ)Γ1(ς,ℵ)]+A(ς,ℵ)′A(ς,ℵ)−β²Γ1(ς,ℵ)′Γ1(ς,ℵ)+∑_{j=1}^∞ϕℵjG2(ς,j)−H2(ς,ℵ)M2(ς,ℵ)†H2(ς,ℵ)′,
M2(ς,ℵ)M2(ς,ℵ)†H2(ς,ℵ)′−H2(ς,ℵ)′=0,
G2(T,ℵ)=0, M2(ς,ℵ)≥0, (ς,ℵ)∈[0,T]×D,   (3.3)
admit solutions (G1(ς,ℵ),G2(ς,ℵ)) for (ς,ℵ)∈[0,T]×D, where
M1(ς,ℵ)=α²Inσ+E2(ς,ℵ)′G1(ς,ℵ)E2(ς,ℵ),
M2(ς,ℵ)=Inη+D2(ς,ℵ)′G2(ς,ℵ)D2(ς,ℵ),
H1(ς,ℵ)=G1(ς,ℵ)E1(ς,ℵ)+[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)E2(ς,ℵ),
H2(ς,ℵ)=G2(ς,ℵ)D1(ς,ℵ)+[C2(ς,ℵ)+E2(ς,ℵ)Γ1(ς,ℵ)]′G2(ς,ℵ)D2(ς,ℵ),
Γ1(ς,ℵ)=−M1(ς,ℵ)†H1(ς,ℵ)′,
Γ2(ς,ℵ)=−M2(ς,ℵ)†H2(ς,ℵ)′.
Proof. Sufficiency: Since CGDREs (3.2) and (3.3) have solutions G1(ς,ℵ)≤0, G2(ς,ℵ)≥0, (ς,ℵ)∈[0,T]×D, it follows from Γ2(ς,ϖς)=−M2(ς,ϖς)†H2(ς,ϖς)′ that η(ς) can be replaced by η∗(ς)=Γ2(ς,ϖς)x(ς) in system (2.1):
dx(ς)={[C1(ς,ϖς)+D1(ς,ϖς)Γ2(ς,ϖς)]x(ς)+E1(ς,ϖς)σ(ς)}dς+{[C2(ς,ϖς)+D2(ς,ϖς)Γ2(ς,ϖς)]x(ς)+E2(ς,ϖς)σ(ς)}dw(ς),
z(ς)=[A(ς,ϖς)x(ς); B(ς,ϖς)Γ2(ς,ϖς)x(ς)], B(ς,ϖς)′B(ς,ϖς)=Inη.   (3.4)
By now applying the method of completing the square, it can be obtained from Lemma 2.1 and Eq (3.2) that
J1(x0,ϖ0,η∗(⋅),σ(⋅))
=E[x′0G1(0,ℵ)x0]+E{∫₀ᵀ[α²‖σ(ς)‖²−‖z(ς)‖²+d(x(ς)′G1(ς,ϖς)x(ς))]dς|ϖ0=ℵ}
=E[x′0G1(0,ℵ)x0]+E{∫₀ᵀ[α²‖σ(ς)‖²−‖z(ς)‖²]dς|ϖ0=ℵ}
+E{∫₀ᵀ{x(ς)′{˙G1(ς,ℵ)+[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)+G1(ς,ℵ)[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]+[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]+∑_{j=1}^∞ϕℵjG1(ς,j)}x(ς)+σ(ς)′H1(ς,ℵ)′x(ς)+x(ς)′H1(ς,ℵ)σ(ς)+σ(ς)′E2(ς,ℵ)′G1(ς,ℵ)E2(ς,ℵ)σ(ς)}dς|ϖ0=ℵ}
=E[x′0G1(0,ℵ)x0]+E∫₀ᵀ[σ(ς)−Γ1(ς,ϖς)x(ς)]′M1(ς,ϖς)[σ(ς)−Γ1(ς,ϖς)x(ς)]dς
+E{∫₀ᵀ{x(ς)′{˙G1(ς,ℵ)+[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)+G1(ς,ℵ)[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]+[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]−A(ς,ℵ)′A(ς,ℵ)−Γ2(ς,ℵ)′Γ2(ς,ℵ)+∑_{j=1}^∞ϕℵjG1(ς,j)−H1(ς,ℵ)M1(ς,ℵ)†H1(ς,ℵ)′}x(ς)}dς|ϖ0=ℵ}
=E∫₀ᵀ[σ(ς)−Γ1(ς,ϖς)x(ς)]′M1(ς,ϖς)[σ(ς)−Γ1(ς,ϖς)x(ς)]dς+∑_{ℵ=1}^∞π0(ℵ)x′0G1(0,ℵ)x0
≥J1(x0,ϖ0,η∗(⋅),σ∗(⋅))=∑_{ℵ=1}^∞π0(ℵ)x′0G1(0,ℵ)x0,   (3.5)
where σ∗(ς)=Γ1(ς,ϖς)x(ς)=−M1(ς,ϖς)†H1(ς,ϖς)′x(ς), π0(ℵ)=P(ϖ0=ℵ), ℵ∈D. This means that inequality (2.4) of Definition 2.1 holds.
To verify the other inequality (2.5), put σ∗(ς)=Γ1(ς,ϖς)x(ς) into system (2.1), which gives
dx(ς)={[C1(ς,ϖς)+E1(ς,ϖς)Γ1(ς,ϖς)]x(ς)+D1(ς,ϖς)η(ς)}dς+{[C2(ς,ϖς)+E2(ς,ϖς)Γ1(ς,ϖς)]x(ς)+D2(ς,ϖς)η(ς)}dw(ς),
z(ς)=[A(ς,ϖς)x(ς); B(ς,ϖς)η(ς)], B(ς,ϖς)′B(ς,ϖς)=Inη.   (3.6)
Then, under the constraint attached to (2.5), we can compute
J2(x0,ϖ0,η(⋅),σ∗(⋅))=E{∫₀ᵀ{x(ς)′[A(ς,ϖς)′A(ς,ϖς)−β²Γ1(ς,ϖς)′Γ1(ς,ϖς)]x(ς)+η(ς)′η(ς)}dς|ϖ0=ℵ}.   (3.7)
Further, a combination of Lemma 2.1 and system (3.6) gives
J2(x0,ϖ0,η(⋅),σ∗(⋅))
=E{∫₀ᵀ{x(ς)′[A(ς,ϖς)′A(ς,ϖς)−β²Γ1(ς,ϖς)′Γ1(ς,ϖς)]x(ς)+η(ς)′η(ς)+d(x(ς)′G2(ς,ϖς)x(ς))}dς|ϖ0=ℵ}+E[x′0G2(0,ℵ)x0]
=E{∫₀ᵀ{x(ς)′[˙G2(ς,ϖς)+[C1(ς,ϖς)+E1(ς,ϖς)Γ1(ς,ϖς)]′G2(ς,ϖς)+G2(ς,ϖς)[C1(ς,ϖς)+E1(ς,ϖς)Γ1(ς,ϖς)]+A(ς,ϖς)′A(ς,ϖς)−β²Γ1(ς,ϖς)′Γ1(ς,ϖς)+[C2(ς,ϖς)+E2(ς,ϖς)Γ1(ς,ϖς)]′G2(ς,ϖς)[C2(ς,ϖς)+E2(ς,ϖς)Γ1(ς,ϖς)]+∑_{j=1}^∞ϕϖςjG2(ς,j)]x(ς)+x(ς)′H2(ς,ϖς)η(ς)+η(ς)′H2(ς,ϖς)′x(ς)+η(ς)′M2(ς,ϖς)η(ς)}dς|ϖ0=ℵ}+E[x′0G2(0,ℵ)x0].   (3.8)
Via Eq (3.3), completing the square yields
J2(x0,ϖ0,η(⋅),σ∗(⋅))=E∫₀ᵀ[η(ς)−Γ2(ς,ϖς)x(ς)]′M2(ς,ϖς)[η(ς)−Γ2(ς,ϖς)x(ς)]dς+∑_{ℵ=1}^∞π0(ℵ)x′0G2(0,ℵ)x0≥J2(x0,ϖ0,η∗(⋅),σ∗(⋅))=∑_{ℵ=1}^∞π0(ℵ)x′0G2(0,ℵ)x0,   (3.9)
where η∗(ς)=Γ2(ς,ϖς)x(ς)=−M2(ς,ϖς)†H2(ς,ϖς)′x(ς). Thus, (η∗(⋅),σ∗(⋅)) is a Nash equilibrium strategy pair for system (2.1).
Necessity: For the Nash game problem (2.4) and (2.5), by inequality (2.4) of Definition 2.1, we find that σ∗(ς)=Γ1(ς,ϖς)x(ς) solves the following LQ optimal control problem:
min_{σ(⋅)∈l2([0,T];Rnσ)}{J1(x0,ϖ0,η∗(⋅),σ(⋅))=E{∫₀ᵀ[α²‖σ(ς)‖²−‖z(ς)‖²]dς|ϖ0=ℵ}}, subject to (3.4).   (3.10)
In fact, it is an indefinite problem on account of
J1(x0,ϖ0,η∗(⋅),σ(⋅)) =E{∫T0{x(ς)′[−A(ς,ϖς)′A(ς,ϖς)−Γ2(ς,ϖς)′Γ2(ς,ϖς)]x(ς) +α2σ(ς)′σ(ς)}dς|ϖ0=ℵ}. |
Of course, the above problem is well-posed. We now show that the well-posed indefinite LQ optimal control problem (3.10) leads to the following CGDREs:
˙˜G1(ς,ℵ)+[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]′˜G1(ς,ℵ)+˜G1(ς,ℵ)[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]+[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]′˜G1(ς,ℵ)[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]−A(ς,ℵ)′A(ς,ℵ)−Γ2(ς,ℵ)′Γ2(ς,ℵ)+∑_{j=1}^∞ϕℵj˜G1(ς,j)−˜H1(ς,ℵ)˜M1(ς,ℵ)†˜H1(ς,ℵ)′=0,
˜M1(ς,ℵ)˜M1(ς,ℵ)†˜H1(ς,ℵ)′−˜H1(ς,ℵ)′=0,
˜G1(T,ℵ)=0, ˜M1(ς,ℵ)≥0, (ς,ℵ)∈[0,T]×D,   (3.11)
admit a solution ˜G1(ς,ℵ) for (ς,ℵ)∈[0,T]×D, where ˜M1(ς,ℵ),˜H1(ς,ℵ),˜Γ1(ς,ℵ) can be obtained by replacing G1(ς,ℵ) with ˜G1(ς,ℵ) in M1(ς,ℵ),H1(ς,ℵ),Γ1(ς,ℵ). In this connection, define the value function as
V(x(ς),ς,ϖς)=min_{σ(⋅)∈l2([0,T];Rnσ)}J1(x0,ϖ0,η∗(⋅),σ(⋅)).   (3.12)
Since the problem (3.10) is well-posed, similar to [36], by a simple adaptation, (3.12) has the form
V(x(ς),ς,ℵ)=x(ς)′˜G1(ς,ℵ)x(ς), ℵ∈D,   (3.13)
where ˜G1(ς,ℵ) is a symmetric matrix. Further, for ℵ∈D, applying the dynamic programming method and considering (3.13), we obtain
x(ς)′[˙˜G1(ς,ℵ)+˜G1(ς,ℵ)(C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)) ⋅(C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ))′˜G1(ς,ℵ) +(C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ))′˜G1(ς,ℵ)(C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)) −A(ς,ℵ)′A(ς,ℵ)−Γ2(ς,ℵ)′Γ2(ς,ℵ)+∞∑j=1ϕℵj˜G1(ς,j)]x(ς) +minΓ1(ς,ℵ){x(ς)′[Γ1(ς,ℵ)′˜M1(ς,ℵ)Γ1(ς,ℵ)+2˜H1(ς,ℵ)Γ1(ς,ℵ)]x(ς)}=0. | (3.14) |
To minimize the above equation, the following condition is required:
∂/∂Γ1(ς,ℵ){Γ1(ς,ℵ)′˜M1(ς,ℵ)Γ1(ς,ℵ)+2˜H1(ς,ℵ)Γ1(ς,ℵ)}|_{Γ1(ς,ℵ)=˜Γ1(ς,ℵ)}=0,   (3.15)
and (3.15) is equivalent to
˜M1(ς,ℵ)˜Γ1(ς,ℵ)+˜H1(ς,ℵ)′=0.   (3.16)
At present, let T1=˜M1(ς,ℵ), T2=Inσ, T3=−˜H1(ς,ℵ)′; via Lemma 2.2, we can obtain
˜M1(ς,ℵ)˜M1(ς,ℵ)†˜H1(ς,ℵ)′=˜H1(ς,ℵ)′
and
˜Γ1(ς,ℵ)=−˜M1(ς,ℵ)†˜H1(ς,ℵ)′   (3.17)
with F=0. Plugging (3.17) into (3.14), it follows that Eq (3.11) has a solution ˜G1(ς,ℵ), (ς,ℵ)∈[0,T]×D. Besides, Lemma 3 in [38] can be generalized to the infinite Markov jump case in a similar manner, which yields ˜M1(ς,ℵ)≥0. So far, the CGDREs (3.11) admit a solution ˜G1(ς,ℵ) for (ς,ℵ)∈[0,T]×D; then the solution of the indefinite LQ optimal control problem (3.10) is σ∗(ς)=−˜M1(ς,ℵ)†˜H1(ς,ℵ)′x(ς) with ˜Γ1(ς,ℵ)=−˜M1(ς,ℵ)†˜H1(ς,ℵ)′.
Similarly, by inequality (2.5) of Definition 2.1, η∗(ς)=Γ2(ς,ϖς)x(ς) is the solution of the following indefinite LQ optimal control problem:
min_{η(⋅)∈l2([0,T];Rnη)}{J2(x0,ϖ0,η(⋅),σ∗(⋅))=E{∫₀ᵀ[‖z(ς)‖²−β²‖σ(ς)‖²]dς|ϖ0=ℵ}}, subject to (3.6).   (3.18)
An analogous argument shows that the following CGDREs
−˙˜G2(ς,ℵ)=[C1(ς,ℵ)+E1(ς,ℵ)Γ1(ς,ℵ)]′˜G2(ς,ℵ)+˜G2(ς,ℵ)[C1(ς,ℵ)+E1(ς,ℵ)Γ1(ς,ℵ)]+[C2(ς,ℵ)+E2(ς,ℵ)Γ1(ς,ℵ)]′˜G2(ς,ℵ)[C2(ς,ℵ)+E2(ς,ℵ)Γ1(ς,ℵ)]+A(ς,ℵ)′A(ς,ℵ)−β²Γ1(ς,ℵ)′Γ1(ς,ℵ)+∑_{j=1}^∞ϕℵj˜G2(ς,j)−˜H2(ς,ℵ)˜M2(ς,ℵ)†˜H2(ς,ℵ)′,
˜M2(ς,ℵ)˜M2(ς,ℵ)†˜H2(ς,ℵ)′−˜H2(ς,ℵ)′=0,
˜G2(T,ℵ)=0, ˜M2(ς,ℵ)≥0, (ς,ℵ)∈[0,T]×D,   (3.19)
admit a solution ˜G2(ς,ℵ), (ς,ℵ)∈[0,T]×D, where ˜M2(ς,ℵ), ˜H2(ς,ℵ), ˜Γ2(ς,ℵ) can be obtained by replacing G2(ς,ℵ) with ˜G2(ς,ℵ) in M2(ς,ℵ), H2(ς,ℵ), Γ2(ς,ℵ), and the solution of the indefinite LQ optimal problem (3.18) is η∗(ς)=−˜M2(ς,ℵ)†˜H2(ς,ℵ)′x(ς) with ˜Γ2(ς,ℵ)=−˜M2(ς,ℵ)†˜H2(ς,ℵ)′. To sum up, combining (3.11) and (3.19) yields ˜G1(ς,ℵ)=G1(ς,ℵ), ˜G2(ς,ℵ)=G2(ς,ℵ). This shows that the CGDREs admit solutions (G1(ς,ℵ),G2(ς,ℵ)) for (ς,ℵ)∈[0,T]×D. This ends the proof.
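For a single frozen (time, mode) pair, the gain formula Γ1 = −M1†H1′ of Theorem 3.1 can be evaluated directly with a pseudo-inverse. The small matrices below are hypothetical data, not from the paper, with G1 ≤ 0 as in the sufficiency proof and chosen so that M1 is positive definite, in which case the solvability condition of (3.2) holds automatically.

```python
import numpy as np

alpha = 2.0
E2 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [0.5, 0.5]])
G1 = -np.eye(3)                              # negative semidefinite, G1 <= 0
H1 = np.array([[0.3, -0.1],
               [0.2,  0.4],
               [-0.5, 0.1]])

M1 = alpha**2 * np.eye(2) + E2.T @ G1 @ E2   # M1 = a^2 I + E2' G1 E2
M1_pinv = np.linalg.pinv(M1)
Gamma1 = -M1_pinv @ H1.T                     # Gamma1 = -M1† H1'

# Here M1 is positive definite, so M1 M1† H1' = H1' holds trivially.
assert np.all(np.linalg.eigvalsh(M1) > 0)
assert np.allclose(M1 @ M1_pinv @ H1.T, H1.T)
```

When M1 loses rank, the same `pinv`-based formula still applies, but the condition M1M1†H1′ = H1′ must be checked explicitly, as required in (3.2).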
Remark 3.1. Based on Theorem 3.1, the key to obtaining Nash equilibrium strategies for system (2.1) is solving the CGDREs (3.2) and (3.3). Since the CGDREs here are a countably infinite set of equations, they are harder and more complex to solve than those in [32] and the discrete-time case of [23,31]. Actually, a discretization method is presented in [33], which can be applied to compute (3.2) and (3.3).
Remark 3.2. Indeed, the solvability of (3.2) and (3.3) is crucial. However, it is worth noting that even for LQ problems without Markov jumps, the related Riccati equations remain unsolved in general, and their solvability can only be guaranteed under certain specific conditions; see, for example, LQ optimal control in [34], the LQ zero-sum game in [35], and the LQ non-zero-sum game in [30]. The main difficulty lies in the fact that the Riccati equations are highly coupled. In future research, we will focus on the existence of solutions to these Riccati equations.
In the previous section, the existence of Nash equilibrium strategies for system (2.1) was discussed. This section focuses on a unified treatment of the three control problems with different values of α and β, or with σ(ς) regarded as an exogenous disturbance.
Letting α→∞ and β=0 in (2.2) and (2.3), we obtain the following LQ optimal control problem:
min_{η(⋅)∈l2([0,T];Rnη)}{J2(x0,ϖ0,η(⋅))=E[∫₀ᵀ‖z(ς)‖²dς|ϖ0=ℵ]},   (4.1)
subject to
dx(ς)=[C1(ς,ϖς)x(ς)+D1(ς,ϖς)η(ς)]dς+[C2(ς,ϖς)x(ς)+D2(ς,ϖς)η(ς)]dw(ς),
z(ς)=[A(ς,ϖς)x(ς); B(ς,ϖς)η(ς)], B(ς,ϖς)′B(ς,ϖς)=Inη,
x(0)=x0∈Rn, ϖ(0)=ϖ0∈D, ς∈[0,T].   (4.2)
The weighting matrices of the state and control in the cost function (4.1) of the above LQ optimal control problem are A(ς,ϖς)′A(ς,ϖς) and Inη, respectively. Moreover, it can be computed by Theorem 3.1 that M1(ς,ℵ)†→0, H1(ς,ℵ)=0, Γ1(ς,ℵ)→0, H2(ς,ℵ)→G2(ς,ℵ)D1(ς,ℵ)+C2(ς,ℵ)′G2(ς,ℵ)D2(ς,ℵ), and G2(ς,ℵ) is the solution of the following CGDREs:
˙G2(ς,ℵ)+C1(ς,ℵ)′G2(ς,ℵ)+G2(ς,ℵ)C1(ς,ℵ)+A(ς,ℵ)′A(ς,ℵ)+C2(ς,ℵ)′G2(ς,ℵ)C2(ς,ℵ)+∑_{j=1}^∞ϕℵjG2(ς,j)−[G2(ς,ℵ)D1(ς,ℵ)+C2(ς,ℵ)′G2(ς,ℵ)D2(ς,ℵ)][Inη+D2(ς,ℵ)′G2(ς,ℵ)D2(ς,ℵ)]−1[G2(ς,ℵ)D1(ς,ℵ)+C2(ς,ℵ)′G2(ς,ℵ)D2(ς,ℵ)]′=0,
G2(T,ℵ)=0, Inη+D2(ς,ℵ)′G2(ς,ℵ)D2(ς,ℵ)>0, (ς,ℵ)∈[0,T]×D.   (4.3)
Via Theorem 3.1, we can further obtain that the optimal control is η∗(ς)=Γ2(ς,ϖς)x(ς) with
Γ2(ς,ℵ)=−[Inη+D2(ς,ℵ)′G2(ς,ℵ)D2(ς,ℵ)]−1⋅[G2(ς,ℵ)D1(ς,ℵ)+C2(ς,ℵ)′G2(ς,ℵ)D2(ς,ℵ)]′
for ϖς=ℵ, and the optimal value function is
min_{η(⋅)∈l2([0,T];Rnη)}J2(x0,ϖ0,η(⋅))=J2(x0,ϖ0,η∗(⋅))=∑_{ℵ=1}^∞π0(ℵ)x′0G2(0,ℵ)x0.   (4.4)
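As a sketch of the discretization idea mentioned in Remark 3.1, the CGDRE (4.3) can be solved approximately by a backward-Euler recursion from the terminal condition G2(T,ℵ)=0 after truncating the mode set. The scalar state (n = 1), truncation level, Poisson-type generator, and bounded coefficients below are illustrative assumptions rather than the paper's data.

```python
import numpy as np

N, T, h, psi = 30, 2.0, 1e-3, 1.0            # mode truncation, horizon, step, rate
K = int(round(T / h))

def coeffs(i):
    # hypothetical time-invariant coefficients C1, C2, D1, D2, A for mode i
    return 1.0 / (i + 1), 0.5, 1.0, 1.0 / (i + 1), 1.0

G = np.zeros(N)                              # terminal condition G2(T, i) = 0
for _ in range(K):
    Gn = np.empty(N)
    for i in range(N):
        C1, C2, D1, D2, A = coeffs(i)
        # truncated coupling term sum_j phi_ij G(j) = psi * (G(i+1) - G(i))
        coupling = psi * ((G[i + 1] - G[i]) if i + 1 < N else 0.0)
        M = 1.0 + D2 * G[i] * D2             # I + D2' G D2  (scalar)
        H = G[i] * D1 + C2 * G[i] * D2       # G D1 + C2' G D2
        rhs = 2.0 * C1 * G[i] + A * A + C2 * G[i] * C2 + coupling - H * H / M
        Gn[i] = G[i] + h * rhs               # -dG/dt = rhs, stepped backward in time
    G = Gn
# G now approximates G2(0, i) for the truncated modes i = 0, ..., N-1
```

Each backward step evaluates the right-hand side of (4.3) at the previously computed slice, which is exactly the backward recursive structure that makes the countable coupling tractable after truncation.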
Remark 4.1. Significantly different from (3.3), the positive definiteness of M2(ς,ℵ) can be guaranteed here; in other words, M2(ς,ℵ)†=M2(ς,ℵ)−1. In fact, the main reason is that, by (4.4), one can easily prove G2(ς,ℵ)≥0 for (ς,ℵ)∈[0,T]×D.
Setting α=β in (2.2) and (2.3), it is clear that
J1(x0,ϖ0,η(⋅),σ(⋅))+J2(x0,ϖ0,η(⋅),σ(⋅))=0. |
Furthermore, Eq (3.3) can be rewritten in the following form:
−˙G2(ς,ℵ)=[C1(ς,ℵ)+E1(ς,ℵ)Γ1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]′G2(ς,ℵ) +G2(ς,ℵ)[C1(ς,ℵ)+E1(ς,ℵ)Γ1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)] +A(ς,ℵ)′A(ς,ℵ)−α2Γ1(ς,ℵ)′Γ1(ς,ℵ) +[C2(ς,ℵ)+E2(ς,ℵ)Γ1(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]′ ⋅G2(ς,ℵ)[C2(ς,ℵ)+E2(ς,ℵ)Γ1(ς,ℵ) +D2(ς,ℵ)Γ2(ς,ℵ)]+∞∑j=1ϕℵjG2(ς,j)+Γ2(ς,ℵ)′Γ2(ς,ℵ), | (4.5) |
and it is equivalent to
−˙G2(ς,ℵ)=[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]′G2(ς,ℵ)+G2(ς,ℵ)[C1(ς,ℵ) +D1(ς,ℵ)Γ2(ς,ℵ)]+A(ς,ℵ)′A(ς,ℵ) +[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]′G2(ς,ℵ)[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)] +Γ1(ς,ℵ)′[−α2Inσ+E2(ς,ℵ)′G2(ς,ℵ)E2(ς,ℵ)]Γ1(ς,ℵ) +Γ1(ς,ℵ)′[E1(ς,ℵ)′G2(ς,ℵ)+E2(ς,ℵ)′G2(ς,ℵ)(C2(ς,ℵ) +D2(ς,ℵ)Γ2(ς,ℵ))]+[G2(ς,ℵ)E1(ς,ℵ) +(C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ))′G2(ς,ℵ)E2(ς,ℵ)]Γ1(ς,ℵ) +∞∑j=1ϕℵjG2(ς,j)+Γ2(ς,ℵ)′Γ2(ς,ℵ). | (4.6) |
Now, plugging G2(ς,ℵ)=−G1(ς,ℵ) into (4.6), we have
˙G1(ς,ℵ)=−[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)−G1(ς,ℵ)[C1(ς,ℵ) +D1(ς,ℵ)Γ2(ς,ℵ)]+A(ς,ℵ)′A(ς,ℵ)−∞∑j=1ϕℵjG1(ς,j) +Γ2(ς,ℵ)′Γ2(ς,ℵ)−[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ) ⋅[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]−Γ1(ς,ℵ)′M1(ς,ℵ)Γ1(ς,ℵ) −Γ1(ς,ℵ)′H1(ς,ℵ)′−H1(ς,ℵ)Γ1(ς,ℵ). | (4.7) |
Noting that Γ1(ς,ℵ)=−M1(ς,ℵ)†H1(ς,ℵ)′, Eq (4.7) coincides with (3.2). On the other hand, it should be noted that, by the definition of H∞ control, ‖LT‖<γ is the premise [33]. Hence, under the condition ‖LT‖<γ, following the line of Lemma 8.1.2 in [39], it can be deduced that M1(ς,ℵ)>0, which leads to M1(ς,ℵ)†=M1(ς,ℵ)−1. At this point, by Theorem 3.1, we can obtain that the H∞ optimal controller is η∗(ς)=Γ_2(ς,ℵ)x(ς), and σ∗(ς) is the corresponding worst-case disturbance, where G(ς,ℵ) is the solution of (4.7) and satisfies the following CGDREs:
−˙G(ς,ℵ)=[C1(ς,ℵ)+D1(ς,ℵ)Γ_2(ς,ℵ)]′G(ς,ℵ)+G(ς,ℵ)[C1(ς,ℵ)+D1(ς,ℵ)Γ_2(ς,ℵ)]+[C2(ς,ℵ)+D2(ς,ℵ)Γ_2(ς,ℵ)]′G(ς,ℵ)[C2(ς,ℵ)+D2(ς,ℵ)Γ_2(ς,ℵ)]−A(ς,ℵ)′A(ς,ℵ)−Γ_2(ς,ℵ)′Γ_2(ς,ℵ)+∑_{j=1}^∞ϕℵjG(ς,j)−H1(ς,ℵ)M1(ς,ℵ)−1H1(ς,ℵ)′,
G(T,ℵ)=0, M1(ς,ℵ)>0, (ς,ℵ)∈[0,T]×D,   (4.8)
where
M1(ς,ℵ)=α²Inσ+E2(ς,ℵ)′G(ς,ℵ)E2(ς,ℵ),
M2(ς,ℵ)=Inη−D2(ς,ℵ)′G(ς,ℵ)D2(ς,ℵ),
H1(ς,ℵ)=G(ς,ℵ)E1(ς,ℵ)+[C2(ς,ℵ)+D2(ς,ℵ)Γ_2(ς,ℵ)]′G(ς,ℵ)E2(ς,ℵ),
H2(ς,ℵ)=−G(ς,ℵ)D1(ς,ℵ)−[C2(ς,ℵ)+E2(ς,ℵ)Γ_1(ς,ℵ)]′G(ς,ℵ)D2(ς,ℵ),
Γ_1(ς,ℵ)=−M1(ς,ℵ)−1H1(ς,ℵ)′,
Γ_2(ς,ℵ)=−M2(ς,ℵ)−1H2(ς,ℵ)′.
Remark 4.2. Through the above analysis, to ensure the existence of an H∞ optimal controller, the premise is M1(ς,ℵ)>0. Besides, the solution to the CGDREs (4.8) is G1(ς,ℵ)≤0; the reason for G1(ς,ℵ)≤0 is
J1(x0,ϖ0,η∗(⋅),σ∗(⋅))=∞∑ℵ=1π0(ℵ)x′0G1(0,ℵ)x0 ≤J1(x0,ϖ0,η∗(⋅),0)=E{∫T0[−‖z(ς)‖2]dς|ϖ0=ℵ}≤0. |
If we set β=0 in (2.2) and (2.3), then we have the following new cost functionals:
J1(x0,ϖ0,η(⋅),σ(⋅))=E{∫₀ᵀ[α²‖σ(ς)‖²−‖z(ς)‖²]dς|ϖ0=ℵ},   (4.9)
J2(x0,ϖ0,η(⋅),σ(⋅))=E{∫₀ᵀ‖z(ς)‖²dς|ϖ0=ℵ}.   (4.10)
In light of the definition of H2/H∞ control in [33], it can be concluded that
J1(x0,ϖ0,η∗(⋅),σ∗(⋅))≤J1(x0,ϖ0,η∗(⋅),σ(⋅)),
J2(x0,ϖ0,η∗(⋅),σ∗(⋅))≤J2(x0,ϖ0,η(⋅),σ∗(⋅)).
Consequently, it can be deduced from M1(ς,ℵ)>0 and Theorem 3.1 that the following CGDREs
−˙G1(ς,ℵ)=[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)+G1(ς,ℵ)[C1(ς,ℵ)+D1(ς,ℵ)Γ2(ς,ℵ)]+[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]′G1(ς,ℵ)[C2(ς,ℵ)+D2(ς,ℵ)Γ2(ς,ℵ)]−A(ς,ℵ)′A(ς,ℵ)−Γ2(ς,ℵ)′Γ2(ς,ℵ)+∑_{j=1}^∞ϕℵjG1(ς,j)−H1(ς,ℵ)M1(ς,ℵ)−1H1(ς,ℵ)′,
G1(T,ℵ)=0, M1(ς,ℵ)>0, (ς,ℵ)∈[0,T]×D,   (4.11)
Γ1(ς,ℵ)=−M1(ς,ℵ)−1H1(ς,ℵ)′,   (4.12)
−˙G2(ς,ℵ)=[C1(ς,ℵ)+E1(ς,ℵ)Γ1(ς,ℵ)]′G2(ς,ℵ)+G2(ς,ℵ)[C1(ς,ℵ)+E1(ς,ℵ)Γ1(ς,ℵ)]+[C2(ς,ℵ)+E2(ς,ℵ)Γ1(ς,ℵ)]′G2(ς,ℵ)[C2(ς,ℵ)+E2(ς,ℵ)Γ1(ς,ℵ)]+A(ς,ℵ)′A(ς,ℵ)+∑_{j=1}^∞ϕℵjG2(ς,j)−H2(ς,ℵ)M2(ς,ℵ)−1H2(ς,ℵ)′,
G2(T,ℵ)=0, M2(ς,ℵ)>0, (ς,ℵ)∈[0,T]×D,   (4.13)
Γ2(ς,ℵ)=−M2(ς,ℵ)−1H2(ς,ℵ)′,   (4.14)
admit solutions G1(ς,ℵ)≤0, G2(ς,ℵ)≥0 for (ς,ℵ)∈[0,T]×D; indeed, as described in Remarks 4.1 and 4.2, G1(ς,ℵ)≤0 and G2(ς,ℵ)≥0 hold. Meanwhile, η∗(ς)=Γ2(ς,ℵ)x(ς), σ∗(ς)=Γ1(ς,ℵ)x(ς) is the H2/H∞ optimal controller.
Remark 4.3. It is important to note that the CGDREs (4.11)–(4.14) are the same as (5.2)–(5.5) in Theorem 5.1 of [33]. By contrast, the two groups of equations in Theorem 3.1 and Theorem 5.1 are not equivalent, which indicates that although we deal with H2/H∞ control by making use of Theorem 3.1 for system (2.1), the equivalence between Nash equilibrium points and H2/H∞ control is not valid. This is fundamentally different from the discussion in [40].
Remark 4.4. The main reason for the inequivalence between Nash equilibrium points and H2/H∞ control is that the conditions M1(ς,ℵ)>0 in (4.11) and M2(ς,ℵ)>0 in (4.13) are not necessarily satisfied at Nash equilibrium points. In fact, the root cause lies in whether the diffusion term contains the disturbance.
This part presents a numerical example, which illustrates the validity of the proposed method.
Example 5.1. Consider the linear SDEs with time-varying coefficients and infinite Markov jumps (2.1) with
ς=0: C1(0,ℵ)=ℵ/(ℵ+1), D1(0,ℵ)=1, E1(0,ℵ)=1, C2(0,ℵ)=1, D2(0,ℵ)=1, E2(0,ℵ)=1, A(0,ℵ)=√(ℵ/(ℵ+1)), B(0,ℵ)=1;
ς=1: C1(1,ℵ)=2/(7(ℵ+1)), D1(1,ℵ)=1/(ℵ+1), E1(1,ℵ)=1, B(1,ℵ)=1, C2(1,ℵ)=−1/(ℵ+1), D2(1,ℵ)=1, E2(1,ℵ)=1/2, A(1,ℵ)=1/√(7(ℵ+1));
ς=2: C1(2,ℵ)=ℵ/(7(ℵ+1)), D1(2,ℵ)=−1, E1(2,ℵ)=−1/(ℵ+1), B(2,ℵ)=1, C2(2,ℵ)=ℵ/(2(ℵ+1)), D2(2,ℵ)=1/(ℵ+1), E2(2,ℵ)=1/(3(ℵ+1)²), A(2,ℵ)=1.
Set T=2. It should be noted that a homogeneous Poisson process can be regarded as an infinite-state Markov process. Thus let {ϖς}ς∈[0,T] be a homogeneous Poisson process with parameter ψ>0, whose infinitesimal matrix is Φ=(ϕℵj)ℵ,j∈D with −ϕℵℵ=ϕℵ,ℵ+1=ψ and ϕℵj=0 for ℵ∈D, j∈D∖{ℵ,ℵ+1}.
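A truncated version of this Poisson-type infinitesimal matrix can be assembled and checked against the generator conditions of Section 2. The truncation level N and the rate ψ below are illustrative choices.

```python
import numpy as np

# Truncated infinitesimal matrix of the Poisson mode process of Example 5.1:
# -phi_ii = phi_{i,i+1} = psi and all other entries zero.
N, psi = 40, 1.5
Phi = np.zeros((N, N))
rows = np.arange(N - 1)
Phi[rows, rows] = -psi
Phi[rows, rows + 1] = psi

# Generator conditions from Section 2: nonnegative off-diagonals,
# 0 <= -phi_ii < c, and zero row sums (up to the truncation boundary).
assert np.all(Phi - np.diag(np.diag(Phi)) >= 0.0)
assert np.all((0.0 <= -np.diag(Phi)) & (-np.diag(Phi) <= psi))
assert np.allclose(Phi[:-1].sum(axis=1), 0.0)
```

The last truncated row is left at zero; in any computation over [0,T], N should be taken large enough that the chain is unlikely to reach the boundary before time T.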
Observe that the discretization method and backward recursive algorithm are the key to solving the corresponding CGDREs, as described in Remark 3.1. In view of the discussion in the previous section, first of all, the LQ optimal controller design reduces to solving Eq (4.3). Further, we can compute the following approximate solutions:
G2(0,ℵ)=1+4/(7(ℵ+1))+8/(7(ℵ+2)²)≥0,
Γ2(0,ℵ)=−(7(ℵ+1)²+4ℵ+12)/(7(ℵ+1)²+2ℵ+6).
The LQ optimal control and optimal value function can be immediately obtained by (4.4). Then, letting α=1, the H∞ optimal controller design problem is translated into the solvability of Eq (4.8). Additionally, the following approximate solutions can be given:
G(0,ℵ)=−15/7−10/(7(ℵ+1)²)+76/(49(ℵ+1))≤0,
Γ_2(0,ℵ)=−6(ℵ+1)2G(0,ℵ)+2G(0,ℵ)2(ℵ+1)2(1−G(0,ℵ)+G(0,ℵ)2)+G(0,ℵ)−G(0,ℵ)2.
We can then obtain the H∞ optimal control. Finally, when β=0, the finite horizon H2/H∞ optimal control is transformed into the existence of a solution to the CGDREs (4.11)–(4.14). Going a step further, we obtain the approximate solutions below:
G1(0,ℵ)=−15/7−10/(7(ℵ+1)²)+76/(49(ℵ+1))≤0,
G2(0,ℵ)=809/196−25/(196(ℵ+1))−697/(784(ℵ+1)²)≥0,
Γ1(0,ℵ)=−(ℵ+1)2(ℵ+1)2+G1(0,ℵ)[2G1(0,ℵ)+(ℵ+1)2(G1(0,ℵ)2−2G1(0,ℵ)G2(0,ℵ))−2G1(0,ℵ)2G2(0,ℵ)(ℵ+1)2(1+G2(0,ℵ)−G1(0,ℵ)G2(0,ℵ))+G1(0,ℵ)+G1(0,ℵ)G2(0,ℵ),
Γ2(0,ℵ)=−2(ℵ+1)2(G1(0,ℵ)−2G2(0,ℵ))−2G1(0,ℵ)G2(0,ℵ)(ℵ+1)2(1+G2(0,ℵ)−G1(0,ℵ)G2(0,ℵ))+G1(0,ℵ)+G1(0,ℵ)G2(0,ℵ).
The corresponding finite-horizon H2/H∞ optimal controller can be derived naturally.
In this note, we studied non-zero-sum Nash differential games for SDEs involving time-varying coefficients and infinite Markov jumps. By means of a pseudo-inverse matrix, a necessary and sufficient condition for the existence of Nash equilibrium strategies is given via the solvability of CGDREs. As an application, by the Nash game approach, we present a unified treatment of H2, H∞, and H2/H∞ control with corresponding choices of parameters. Finally, the theoretical results are illustrated by a numerical example. Several interesting problems deserve further investigation, in particular, how to generalize our results to the infinite horizon Nash game problem.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
This research was funded by the Natural Science Foundation of Qingdao under Grant 23-2-1-7-zyyd-jch, the National Natural Science Foundation of China under Grant 62273212, the Natural Science Foundation of Shandong Province under grant ZR202103030362, the Science Research Project of the Hebei Education Department under Grant QN2022077, the S&T Program of Hebei under Grant 236Z1603G, and the Hebei Natural Science Foundation under Grant F2022203097.
The authors declare there are no conflicts of interest.
[1] H. Shen, M. Xing, S. Xu, M. V. Basin, J. H. Park, H∞ stabilization of discrete-time nonlinear semi-Markov jump singularly perturbed systems with partially known semi-Markov kernel information, IEEE T. Circuits Syst. I, Reg. Papers, 68 (2021), 818–828. doi: 10.1109/TCSI.2020.3034897
[2] Q. Zhu, Event-triggered sampling problem for exponential stability of stochastic nonlinear delay systems driven by Levy processes, IEEE Trans. Autom. Control, 70 (2025), 1176–1183. doi: 10.1109/TAC.2024.3448128
[3] Z. Chen, F. Li, D. Luo, J. Wang, H. Shen, Stabilization of discrete-time semi-Markov jump singularly perturbed systems subject to actuator saturation and partially known semi-Markov kernel information, J. Frankl. Inst., 359 (2022), 6043–6060. doi: 10.1016/j.jfranklin.2022.06.011
[4] S. Jiao, H. Shen, Y. Wei, X. Huang, Z. Wang, Further results on dissipativity and stability analysis of Markov jump generalized neural networks with time-varying interval delays, Appl. Math. Comput., 336 (2018), 338–350. doi: 10.1016/j.amc.2018.05.013
[5] B. Wang, Q. Zhu, S. Li, Stabilization of discrete-time hidden semi-Markov jump linear systems with partly unknown emission probability matrix, IEEE Trans. Autom. Control, 69 (2024), 1952–1959. doi: 10.1109/TAC.2023.3272190
[6] B. Wang, Q. Zhu, The stabilization problem for a class of discrete-time semi-Markov jump singular systems, Automatica, 171 (2025), 111960. doi: 10.1016/j.automatica.2024.111960
[7] H. Shen, Y. Wang, J. Wang, J. H. Park, A fuzzy-model-based approach to optimal control for nonlinear Markov jump singularly perturbed systems: a novel integral reinforcement learning scheme, IEEE Trans. Fuzzy Syst., 31 (2023), 3734–3740. doi: 10.1109/TFUZZ.2023.3265666
[8] K. Zhang, S. Luo, H. Wu, R. Su, Data-driven tracking control for non-affine yaw channel of helicopter via off-policy reinforcement learning, IEEE Trans. Aerosp. Electron. Syst., (2025), 1–13. doi: 10.1109/TAES.2025.3539264
[9] X. Song, S. Ma, Indefinite linear quadratic optimal control problem for continuous-time linear descriptor Markov jump systems, Int. J. Control Autom., 21 (2023), 485–498. doi: 10.1007/s12555-021-0778-5
![]() |
[10] |
J. Wu, M. Tang, Q. Meng, A stochastic linear-quadratic optimal control problem with jumps in an infinite horizon, AIMS Math., 8 (2023), 4042–4078. http://doi.org/10.3934/math.2023202 doi: 10.3934/math.2023202
![]() |
[11] |
Y. Zhu, N. Xu, X. Chen, W. Zheng, H∞ control for continuous-time Markov jump nonlinear systems with piecewise-affine approximation, Automatica, 141 (2022), 110300. https://doi.org/10.1016/j.automatica.2022.110300 doi: 10.1016/j.automatica.2022.110300
![]() |
[12] |
H. Zhang, J. Xia, G. Zhuang, H. Shen, Robust interval stability/stabilization and H∞ feedback control for uncertain stochastic Markovian jump systems based on the linear operator, Sci. China Inf. Sci., 65 (2022), 142202. http://doi.org/10.1007/s11432-020-3087-1 doi: 10.1007/s11432-020-3087-1
![]() |
[13] |
S. Xing, W. Zheng, F. Deng, C. Chang, H∞ control for stochastic singular systems with time-varying delays via sampled-data controller, IEEE Trans. Cybern., 53 (2022), 7048–7057. http://doi.org/10.1109/TCYB.2022.3168273 doi: 10.1109/TCYB.2022.3168273
![]() |
[14] |
R. Dong, Z. Li, H. Shen, J. Wang, L. Su, Finite-time asynchronous H∞ control for Markov jump singularly perturbed systems with partially known probabilities, Appl. Math. Comput., 457 (2023), 128193. http://doi.org/10.1016/j.amc.2023.128193 doi: 10.1016/j.amc.2023.128193
![]() |
[15] |
M. Gao, L. Sheng, W. Zhang, Finite horizon H2/H∞ control of time-varying stochastic systems with Markov jumps and (x,u,v)-dependent noise, IET Control Theory Appl., 8 (2014), 1354–1363. http://doi.org/10.1049/iet-cta.2013.1070 doi: 10.1049/iet-cta.2013.1070
![]() |
[16] |
M. Wang, Q. Meng, Y. Shen, H2/H∞ control for stochastic jump-diffusion systems with Markovian switching, J. Syst. Sci. Complex., 34 (2021), 924–954. http://doi.org/10.1007/s11424-020-9131-y doi: 10.1007/s11424-020-9131-y
![]() |
[17] | X. Gao, F. Deng, P. Zeng, Zero-sum game-based security control of unknown nonlinear Markov jump systems under false data injection attacks, Int. J. Robust Nonlinear Control, 2022. http://doi.org/10.1002/rnc.6418 |
[18] |
K. Zhang, Z. Zhang, X. Xie, J. D. J. Rubio, An unknown multiplayer nonzero-sum game: prescribed-time dynamic event-triggered control via adaptive dynamic programming, IEEE Trans. Autom. Sci. Eng., 22 (2024), 8317–8328. http://doi.org/10.1109/TASE.2024.3484412 doi: 10.1109/TASE.2024.3484412
![]() |
[19] |
Y. Liu, Z. Wang, X. Lin, Non-zero sum Nash game for discrete-time infinite Markov jump stochastic systems with applications, Axioms, 12 (2023), 882. http://doi.org/10.3390/axioms12090882 doi: 10.3390/axioms12090882
![]() |
[20] |
O. L. V. Costa, D. Z. Figueiredo, Stochastic stability of jump discrete-time linear systems with Markov chain in a general Borel space, IEEE T. Autom. Control, 59 (2014), 223–227. http://doi.org/10.1109/TAC.2013.2270031 doi: 10.1109/TAC.2013.2270031
![]() |
[21] |
H. Ma, Y. Jia, Stability analysis for stochastic differential equations with infinite Markovian switchings, J. Math. Anal. Appl., 435 (2016), 593–605. http://doi.org/10.1016/j.jmaa.2015.10.047 doi: 10.1016/j.jmaa.2015.10.047
![]() |
[22] |
T. Hou, Y. Liu, F. Deng, Stability for discrete-time uncertain systems with infinite Markov jump and time-delay, Sci. China Inf. Sci., 64 (2021), 152202. http://doi.org/10.1007/s11432-019-2897-9 doi: 10.1007/s11432-019-2897-9
![]() |
[23] |
Y. Liu, T. Hou, Infinite horizon LQ Nash Games for SDEs with infinite jumps, Asian J. Control, 23 (2021), 2431–2443. http://doi.org/10.1002/asjc.2371 doi: 10.1002/asjc.2371
![]() |
[24] |
Y. Liu, T. Hou, Robust H2/H∞ fuzzy filtering for nonlinear stochastic systems with infinite Markov jump, J. Syst. Sci. Complex., 33 (2020), 1023–1039. http://doi.org/10.1007/s11424-020-8364-0 doi: 10.1007/s11424-020-8364-0
![]() |
[25] | Y. Liu, T. Hou, LQ optimal control for stochastic system with infinite Markovian jumps, in 2017 Chinese Automation Congress (CAC), (2017), 7107–7111. http://doi.org/10.1109/CAC.2017.8244060 |
[26] |
J. Moon, A sufficient condition for linear-quadratic stochastic zero-sum differential games for Markov jump systems, IEEE Trans. Autom. Control, 64 (2019), 1619–1626. http://doi.org/10.1109/TAC.2018.2849945 doi: 10.1109/TAC.2018.2849945
![]() |
[27] | F. Wu, X. Li, X. Zhang, Open-loop and closed-loop solvabilities for zero-sum stochastic linear quadratic differential games of Markovian regime switching system, preprint, arXiv: 2409.01973. https://doi.org/10.48550/arXiv.2409.01973 |
[28] |
S. Lv, Two-player zero-sum stochastic differential games with regime switching, Automatica, 114 (2020), 108819. http://doi.org/10.1016/j.automatica.2020.108819 doi: 10.1016/j.automatica.2020.108819
![]() |
[29] |
J. Moon, Linear–quadratic stochastic leader–follower differential games for Markov jump-diffusion models, Automatica, 147 (2023), 110713. http://doi.org/10.1016/j.automatica.2022.110713 doi: 10.1016/j.automatica.2022.110713
![]() |
[30] |
S. Lv, Z Wu, J Xiong, Linear quadratic nonzero-sum mean-field stochastic differential games with regime switching, Appl. Math. Optim., 90 (2024), 44. http://doi.org/10.1007/s00245-024-10188-5 doi: 10.1007/s00245-024-10188-5
![]() |
[31] |
T. Hou, W. Zhang, A game-based control design for discrete-time Markov jump systems with multiplicative noise, IET Control Theory Appl., 7 (2013), 773–783. http://doi.org/10.1049/iet-cta.2012.1018 doi: 10.1049/iet-cta.2012.1018
![]() |
[32] |
L. Sheng, W. Zhang, M. Gao, Relationship between Nash equilibrium strategies and H2/H∞ control of stochastic Markov jump systems with multiplicative noise, IEEE Trans. Autom. Control, 59 (2014), 2592–2597. http://doi.org/10.1109/TAC.2014.2309274 doi: 10.1109/TAC.2014.2309274
![]() |
[33] |
T. Hou, Y. Liu, F. Deng, Finite horizon H2/H∞ control for SDEs with infinite Markovian jumps, Nonlinear Anal. Hybrid Syst., 34 (2019), 108–120. http://doi.org/10.1016/j.nahs.2019.05.009 doi: 10.1016/j.nahs.2019.05.009
![]() |
[34] |
Y. Hu, X. Zhou, Indefinite stochastic Riccati equations, SIAM J. Control Optim., 42 (2003), 123–137. http://doi.org/10.1137/S0363012901391330 doi: 10.1137/S0363012901391330
![]() |
[35] |
Z. Yu, An optimal feedback control-strategy pair for zero-sum linear-quadratic stochastic differential game: the Riccati equation approach, SIAM J. Control Optim., 53 (2015), 2141–2167. http://doi.org/10.1137/130947465 doi: 10.1137/130947465
![]() |
[36] |
M. A. Rami, J. B. Moore, X. Zhou, Indefinite stochastic linear quadratic control and generalized differential Riccati equation, SIAM J. Control Optim., 40 (2001), 1296–1311. http://doi.org/10.1137/S0363012900371083 doi: 10.1137/S0363012900371083
![]() |
[37] | T. Basar, G. J. Olsder, Dynamic Noncooperative Game Theory, 2nd edition, SIAM, Philadelphia, 1999. http://doi.org/10.1137/1.9781611971132 |
[38] |
X. Li, X. Zhou, Indefinite stochastic LQ control with Markovian jumps in a finite time horizon, Commun. Inf. Syst., 2 (2002), 265–282. http://doi.org/10.4310/CIS.2002.v2.n3.a4 doi: 10.4310/CIS.2002.v2.n3.a4
![]() |
[39] | V. Dragan, T. Morozan, A. M. Stoica, Mathematical Methods in Robust Control of Linear Stochastic Systems, 2nd edition, Springer, New York, 2013. http://doi.org/10.1007/978-1-4614-8663-3 |
[40] |
B. S. Chen, W. Zhang, Stochastic H2/H∞ control with state-dependent noise, IEEE Trans. Automat. Control, 49 (2004), 45–57. http://doi.org/10.1109/TAC.2003.821400 doi: 10.1109/TAC.2003.821400
![]() |