
This paper studies mean-field linear-quadratic-Gaussian (LQG) games with a major agent and a large number of minor agents, where each agent's state process is driven by a Poisson random measure and an independent Brownian motion. The major and minor agents are coupled through both their state dynamics and their individual cost functionals. By the Nash certainty equivalence (NCE) methodology, two limiting control problems are constructed and the decentralized strategies are derived through a consistency condition. The ϵ-Nash equilibrium property of the obtained decentralized strategies is established for the finite-N population system, where ϵ=O(1/√N). A numerical example is presented to illustrate the consistency of the mean-field estimation and the impact of the population's collective behavior.
Citation: Ruimin Xu, Kaiyue Dong, Jingyu Zhang, Ying Zhou. Linear-quadratic-Gaussian mean-field games driven by Poisson jumps with major and minor agents[J]. AIMS Mathematics, 2025, 10(5): 11086-11110. doi: 10.3934/math.2025503
Mean-field games for large-population systems have attracted consistent and intense attention in recent years (see, e.g., [1,2,3,4,5,6,7,8,9,10]) due to their wide applicability in fields such as finance, economics, engineering, biological science, and social science. The agents in mean-field games are individually insignificant, while their aggregated behavior has a substantial effect on each agent. This collective influence can be captured by mean-field couplings in the individual dynamics and/or individual cost functionals. In such games, it is unrealistic for a given agent to collect detailed state information of all agents because of the highly complex interactions among its peers. To tackle the resulting dimensionality difficulty, Huang, Caines, and Malhamé [11], Huang [12], and Nourian and Caines [13] developed a powerful approach, the Nash certainty equivalence (NCE) methodology. Its key idea is to establish a consistency relationship between the individual strategies and the mass effect (i.e., the asymptotic limit of the state average) as the population size goes to infinity. Based on this analytical tool, one can construct a set of decentralized strategies for the agents in the mean-field game and verify their asymptotic Nash equilibrium property (namely, the ϵ-Nash equilibrium property), where the individual optimality loss level ϵ depends on the population size N. A closely related method for solving mean-field games was independently developed by Lasry and Lions [14,15,16]. For a comprehensive survey of mean-field game theory and its applications, one is referred to [11,12,14,16,17,18,19,20,21] and the references therein.
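To make the consistency idea concrete, the following is a minimal numerical sketch of the NCE fixed-point loop for a toy scalar LQ mean-field game without jumps; the coefficients a, b, q, r, m_T, the initial mean, and the damping factor are hypothetical placeholders, not the model studied in this paper.

```python
import numpy as np

# Toy NCE loop: freeze the mass trajectory z, solve the individual LQ tracking
# problem, propagate the closed-loop mean, and iterate until z reproduces itself.
a, b, q, r, m_T = 1.0, 1.0, 1.0, 1.0, 1.0   # dx = (a x + b u) dt, cost int q(x-z)^2 + r u^2 dt + m_T x(T)^2
T, K = 1.0, 200
dt = T / K

def closed_loop_mean(z):
    """Given a frozen mass trajectory z, solve the Riccati equation P and the
    offset s backward, then propagate the closed-loop mean m forward under
    the feedback u = -(b/r)(P m + s)."""
    P = np.zeros(K + 1); s = np.zeros(K + 1)
    P[-1] = m_T                              # terminal conditions P(T) = m_T, s(T) = 0
    for k in range(K, 0, -1):                # backward Euler sweeps
        P[k - 1] = P[k] + (2 * a * P[k] - (b ** 2 / r) * P[k] ** 2 + q) * dt
        s[k - 1] = s[k] + ((a - (b ** 2 / r) * P[k]) * s[k] - q * z[k]) * dt
    m = np.zeros(K + 1); m[0] = 0.5          # initial population mean (placeholder)
    for k in range(K):                       # forward closed-loop mean
        u = -(b / r) * (P[k] * m[k] + s[k])
        m[k + 1] = m[k] + (a * m[k] + b * u) * dt
    return m

z = np.zeros(K + 1)                          # initial guess for the mass effect
for it in range(100):                        # damped fixed-point (consistency) iteration
    m = closed_loop_mean(z)
    gap = np.max(np.abs(m - z))
    z = 0.5 * z + 0.5 * m
    if gap < 1e-8:
        break
print(f"consistency gap {gap:.2e} after {it + 1} iterations")
```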
The consideration of major-minor agent game problems in a large-population framework has been well studied in [3,12,13,21,22]. Huang [12] investigated a class of stochastic dynamic linear-quadratic-Gaussian mean-field game models involving a major agent interacting with a large number of minor agents. The major agent has a significant influence on the minor agents, while each minor agent individually has a negligible impact on others; their collective behavior, however, imposes a significant impact on all agents through mean-field coupling terms in the individual dynamics and costs. Applications of this type of mean-field game appear in many socio-economic problems, such as economic and social opinion models with an influential leader (e.g., [23]) and the charging control of plug-in electric vehicles [24]. Xu and Wu [21] studied large-population dynamic games for an LQG system with an exponential cost functional, whose parameter describes an investor's risk attitude; the game again involves a major agent and a population of N minor agents with N very large. Wang and Xu [22] investigated a time-inconsistent linear-quadratic game involving a major agent as well as numerous minor agents.
Motivated by the absence of relevant theory and by some practical applications, this paper studies mean-field LQG games with random jumps involving a major agent and a large number of minor agents. Specifically, we consider mean-field games with agents of the following mixed types: (i) a major agent and (ii) a population of N minor agents, where N is very large. The dynamics of each agent follow a linear stochastic differential equation driven by both Brownian motions and Poisson random measures. Moreover, the present study considers mean-field LQG mixed games in which the diffusion term depends on the major agent's and the minor agents' states as well as the individual control strategy. Stochastic processes with random jumps can be used to model fluctuations in financial markets, both for option pricing purposes and for risk management (see [20,25,26,27]). As for mean-field LQG games with random jumps, Benazzoli, Campi, and Di Persio [1] studied a symmetric n-player nonzero-sum stochastic differential game with jump-diffusion dynamics and mean-field-type interaction among the players, and constructed an approximate Nash equilibrium for the n-player game with n sufficiently large. Xu and Shi [20] investigated LQG games of a stochastic large-population system with jump-diffusion processes. It is worth noting that in the existing research on mean-field games driven by jump-diffusion processes, all agents are comparably small and may be regarded as peers.
To obtain the asymptotic Nash equilibrium property (i.e., the ϵ-Nash equilibrium property) for the original mean-field game, we apply the NCE approach to establish a consistency relationship between all minor agents and the mass effect. First, we construct two auxiliary stochastic control problems governed by stochastic differential equations with Poisson jumps (SDEPs), which depict the states of the major agent and of a generic minor agent, and we obtain the corresponding optimal controls in feedback form. Next, to devise the decentralized strategies of the individual agents, we formulate a fully coupled forward-backward stochastic differential equation driven by Poisson jumps, called the consistency condition (CC) system. Then, a set of decentralized strategies is constructed using the solution of the CC system and is shown to be an ϵ-Nash equilibrium.
The main contributions of this paper can be summarized as follows:
● A new class of LQG mean-field games involving major and minor agents is investigated. The dynamics of each agent follow a linear stochastic differential equation driven by both Brownian motions and Poisson random measures, in which the diffusion terms of the major and minor agents depend on their states and control strategies.
● The average state of all minor agents, $x^{(N)}(\cdot)$, appears in the drift and diffusion terms of the state equations of both the major agent and all the minor agents, as well as in their cost functionals.
● The consistency condition system, called the NCE equation, is represented through a fully coupled two-point boundary value problem; based on this equation, we design a set of decentralized feedback control strategies for the $N+1$ agents by means of two limiting control systems.
● By the approximation relationship between the closed-loop mean-field game system and the limiting systems, the set of NCE-based decentralized control strategies is shown to be an ϵ-Nash equilibrium for the finite $(N+1)$-agent population system, where $\epsilon=O(1/\sqrt{N})$.
This paper is organized as follows. In Section 2, we formulate the LQG mean-field games driven by Poisson random jumps involving a major agent and many minor agents. Section 3 introduces two auxiliary optimization problems for the major agent and each minor agent, respectively, and the consistency condition system is derived. Section 4 aims to present the ϵ-Nash equilibrium property of the decentralized control strategies. A numerical example is given in Section 5. Finally, Section 6 concludes the paper.
Throughout this paper, we denote by $\mathbb{R}^n$ the $n$-dimensional Euclidean space. For a given Euclidean space, we denote by $|\cdot|$ (respectively, $\langle\cdot,\cdot\rangle$) the standard Euclidean norm (respectively, inner product). The transpose of a matrix (or vector) $X$ is denoted by $X^\top$. Let $(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{0\le t\le T},P)$ be a complete filtered probability space with a fixed time horizon $T>0$, and let $N$ denote the population size of the minor agents. Denote by $\mathcal{N}$ the index set $\{1,2,\cdots,N\}$. Let $\mathcal{F}_t$ be the filtration generated by the following mutually independent processes:
(i) $(N+1)$ independent one-dimensional standard Brownian motions $\{W_i(t),\ i=0,1,\cdots,N\}_{0\le t\le T}$;
(ii) $(N+1)$ independent Poisson random measures $\{\tilde G_i,\ i=0,1,\cdots,N\}$ on $E_i\times\mathbb{R}_+$, where $E_i\subset\mathbb{R}$ is a nonempty open set equipped with its Borel field $\mathcal{B}(E_i)$, with compensator $\hat G_i(de\,dt)=\pi_i(de)\,dt$, such that $G_i(S\times[0,t])=(\tilde G_i-\hat G_i)(S\times[0,t])$, $t\ge 0$, is a martingale for all $S\in\mathcal{B}(E_i)$. Here $\pi_i$ is a $\sigma$-finite measure on $(E_i,\mathcal{B}(E_i))$, called the characteristic measure. Moreover, for all $S\in\mathcal{B}(E_i)$, $C_0:=\sup_{0\le i\le N}\pi_i(S)<+\infty$ is a positive constant independent of the population size $N$.
We also set
$$
\begin{aligned}
\mathcal{F}_t^0 &:= \sigma\{W_0(s),\,0\le s\le t\}\vee\sigma\{G_0(S_0\times[0,s]),\,0\le s\le t,\ \forall S_0\in\mathcal{B}(E_0)\},\\
\mathcal{F}_t^i &:= \sigma\{W_i(s),\,0\le s\le t\}\vee\sigma\{G_i(S_i\times[0,s]),\,0\le s\le t,\ \forall S_i\in\mathcal{B}(E_i)\},\\
\mathcal{F}_t^{0,i} &:= \sigma\{W_0(s),W_i(s),\,0\le s\le t\}\vee\sigma\{G_0(S_0\times[0,s]),G_i(S_i\times[0,s]),\,0\le s\le t,\ \forall S_0\in\mathcal{B}(E_0),\,S_i\in\mathcal{B}(E_i)\},
\end{aligned}
$$
where $\bigvee_{\alpha}\mathcal{F}_{\alpha}:=\sigma\big(\bigcup_{\alpha}\mathcal{F}_{\alpha}\big)$. Here, $\{\mathcal{F}_t^0\}_{0\le t\le T}$ represents the information of the major agent, whereas for given $i\in\mathcal{N}$, $\{\mathcal{F}_t^i\}_{0\le t\le T}$ stands for the individual information of the $i$th minor agent.
Denote by $\mathbb{S}^n$ the set of symmetric $n\times n$ matrices with real elements. If $M\in\mathbb{S}^n$ is positive (semi-)definite, we write $M>0$ ($M\ge 0$). We also introduce the following spaces:
$$
\begin{aligned}
L^2_{\mathcal{G}}(\mathbb{R}^n) &:= \Big\{\zeta:\Omega\to\mathbb{R}^n \ \Big|\ \zeta \text{ is } \mathcal{G}\text{-measurable and } E[|\zeta|^2]<+\infty\Big\};\\
S^2_{\mathcal{G}}([0,T];\mathbb{R}^n) &:= \Big\{\phi(\cdot):[0,T]\times\Omega\to\mathbb{R}^n \ \Big|\ \phi(\cdot) \text{ is } \mathcal{G}_t\text{-adapted and } E\big[\sup_{0\le t\le T}|\phi(t)|^2\big]<+\infty\Big\};\\
L^2_{\mathcal{G}}([0,T];\mathbb{R}^n) &:= \Big\{\phi(\cdot):[0,T]\times\Omega\to\mathbb{R}^n \ \Big|\ \phi(\cdot) \text{ is a } \mathcal{G}_t\text{-progressively measurable process and } E\Big[\int_0^T|\phi(t)|^2dt\Big]<+\infty\Big\}.
\end{aligned}
$$
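Since every state equation below integrates against the compensated measures $G_i$, it may help to record how a Poisson random measure with finite characteristic measure can be simulated. The sketch below assumes a Merton-type measure $\pi(de)=\lambda\,\mathcal{N}(\mu,\sigma^2)(de)$, matching the specification used for the minor agents in the numerical example of Section 5; it samples the atoms of $\tilde G$ on $[0,T]$ and checks empirically that the compensated integral is centered.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sampling a Poisson random measure with characteristic measure
# pi(de) = lam * N(mu, sig^2)(de) on E = R (finite total mass lam), and an
# empirical check of the martingale (zero-mean) property of the compensated measure.
lam, mu, sig, T = 5.0, 1.0, 0.05, 1.0

def compensated_integral(f, f_mean):
    """One sample of int_0^T int_E f(e) (G~ - pi(de)dt); f_mean = int f d(pi/lam)."""
    n_jumps = rng.poisson(lam * T)             # number of atoms on [0, T]
    marks = rng.normal(mu, sig, size=n_jumps)  # iid marks e_k ~ pi / lam
    return f(marks).sum() - lam * T * f_mean   # raw integral minus compensator

samples = np.array([compensated_integral(lambda e: e, mu) for _ in range(20000)])
print("sample mean of compensated integral:", samples.mean())  # close to 0
```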
Let us consider an LQG mean-field game involving a major agent $\mathcal{A}_0$ and a population of $N$ minor agents $\{\mathcal{A}_i,\ i=1,2,\cdots,N\}$. For the major agent $\mathcal{A}_0$, $\mathcal{U}^{c,0}_{ad}:=\{u(\cdot)\mid u(\cdot)\in L^2_{\mathcal{F}}([0,T];\mathbb{R}^k)\}$ denotes the centralized admissible control set, and $\mathcal{U}^0_{ad}:=\{u(\cdot)\mid u(\cdot)\in L^2_{\mathcal{F}^0}([0,T];\mathbb{R}^k)\}$ the corresponding decentralized admissible control set. For each $i\in\mathcal{N}$, the centralized admissible control set of minor agent $\mathcal{A}_i$ is $\mathcal{U}^{c,i}_{ad}:=\{u_i(\cdot)\mid u_i(\cdot)\in L^2_{\mathcal{F}}([0,T];\mathbb{R}^k)\}$, while the corresponding decentralized admissible control set is $\mathcal{U}^i_{ad}:=\{u_i(\cdot)\mid u_i(\cdot)\in L^2_{\mathcal{F}^{0,i}}([0,T];\mathbb{R}^k)\}$. Note that $\mathcal{U}^i_{ad}\subset\mathcal{U}^{c,i}_{ad}$ for $i=0,1,\cdots,N$.
The dynamics of the major agent $\mathcal{A}_0$ are given as follows:
$$
\begin{cases}
dx_0(t) = [A_0x_0(t)+B_0u_0(t)+b_0x^{(N)}(t)+f_0(t)]\,dt + [C_0x_0(t)+D_0u_0(t)+l_0x^{(N)}(t)+\sigma_0(t)]\,dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
x_0(0) = a_0\in\mathbb{R}^n,
\end{cases}\tag{1}
$$
and the state of the minor agent $\mathcal{A}_i$ is described by
$$
\begin{cases}
dx_i(t) = [Ax_i(t)+Bu_i(t)+b_1x^{(N)}(t)+f(t)]\,dt + [Cx_i(t)+Du_i(t)+b_2x^{(N)}(t)+Hx_0(t)+\sigma(t)]\,dW_i(t) + F\displaystyle\int_{E_i}G_i(de\,dt),\\
x_i(0) = a_i\in\mathbb{R}^n,\quad i=1,\cdots,N,
\end{cases}\tag{2}
$$
where $x^{(N)}(t)=\frac{1}{N}\sum_{j=1}^{N}x_j(t)$ represents the average state of all minor agents. Here, $A_0\in\mathbb{R}^{n\times n}$, $B_0\in\mathbb{R}^{n\times k}$, $C_0\in\mathbb{R}^{n\times n}$, $D_0\in\mathbb{R}^{n\times k}$, $b_0\in\mathbb{R}^{n\times n}$, $l_0\in\mathbb{R}^{n\times n}$, $F_0\in\mathbb{R}^n$, $A\in\mathbb{R}^{n\times n}$, $B\in\mathbb{R}^{n\times k}$, $C\in\mathbb{R}^{n\times n}$, $D\in\mathbb{R}^{n\times k}$, $b_1\in\mathbb{R}^{n\times n}$, $b_2\in\mathbb{R}^{n\times n}$, $H\in\mathbb{R}^{n\times n}$, and $F\in\mathbb{R}^n$ are given constant matrices (vectors), and $f_0(\cdot)$, $\sigma_0(\cdot)$, $f(\cdot)$, and $\sigma(\cdot)$ are given deterministic $\mathbb{R}^n$-valued functions. For given admissible controls $u_0$ and $u_i$, the systems (1) and (2) admit unique solutions $x_0(\cdot),x_i(\cdot)\in S^2_{\mathcal{F}}([0,T];\mathbb{R}^n)$.
Let $u=(u_0,u_1,\ldots,u_i,\ldots,u_N)$ denote the set of control strategies of all $N+1$ agents, and $u_{-i}=(u_0,u_1,\ldots,u_{i-1},u_{i+1},\ldots,u_N)$ for $i=0,1,\cdots,N$. The cost functional for the major agent $\mathcal{A}_0$ is
$$
J_0(u_0,u_{-0}) = \frac{1}{2}E\Big\{\int_0^T\big[\big\langle Q_0\big(x_0(t)-\beta_0x^{(N)}(t)\big),\,x_0(t)-\beta_0x^{(N)}(t)\big\rangle + \langle R_0u_0(t),u_0(t)\rangle\big]dt + \langle M_0x_0(T),x_0(T)\rangle\Big\}.\tag{3}
$$
The cost functional for minor agent $\mathcal{A}_i$, $1\le i\le N$, is
$$
J_i(u_i,u_{-i}) = \frac{1}{2}E\Big\{\int_0^T\big[\big\langle Q\big(x_i(t)-\beta_1x^{(N)}(t)-\beta_2x_0(t)\big),\,x_i(t)-\beta_1x^{(N)}(t)-\beta_2x_0(t)\big\rangle + \langle Ru_i(t),u_i(t)\rangle\big]dt + \langle Mx_i(T),x_i(T)\rangle\Big\}.\tag{4}
$$
The coefficients of the cost functionals satisfy $Q_0,Q\in\mathbb{S}^n$, $Q_0\ge 0$, $Q\ge 0$, $\beta_0,\beta_1,\beta_2\in\mathbb{R}^{n\times n}$, $R_0,R\in\mathbb{S}^k$, $R_0>0$, $R>0$, and $M_0,M\in\mathbb{S}^n$, $M_0\ge 0$, $M\ge 0$.
Parallel to (2), the cost functional (4) contains the term $\beta_2x_0(t)$ to capture the strong influence of the major agent. Note that the state dynamics (1) and (2) and the cost functionals (3) and (4) indicate that the major agent $\mathcal{A}_0$ has a significant influence on the minor agents, while each minor agent $\mathcal{A}_i$, $i\in\mathcal{N}$, has a negligible impact on the other agents in a large-$N$ population system.
Now, we propose the following LQG mean-field game.
Problem (LP): Find an admissible strategy $\bar u=(\bar u_0,\bar u_1,\ldots,\bar u_i,\ldots,\bar u_N)$, where $\bar u_i(\cdot)\in\mathcal{U}^{c,i}_{ad}$, $i=0,1,\cdots,N$, such that
$$
J_i(\bar u_i,\bar u_{-i}) = \inf_{u_i(\cdot)\in\,\mathcal{U}^{c,i}_{ad}}J_i(u_i,\bar u_{-i}),\quad i=0,1,\cdots,N.
$$
We call $\bar u$ a Nash equilibrium strategy for Problem (LP).
Remark 2.1. It should be noted that this paper only addresses the existence of Nash equilibrium strategies and does not treat their uniqueness. The study of the uniqueness of Nash equilibrium strategies is also an active research topic; the variational inequality approach proposed in He and Wang [28] provides a feasible methodology for studying it.
In this section, we first construct two auxiliary stochastic optimal control problems, which are called limiting systems, for the major and a generic minor agent in Sections 3.1 and 3.2, respectively. Then we present the approximations between the limiting systems and the corresponding mean-field system in Section 3.3.
For any $v_0(\cdot)\in\mathcal{U}^0_{ad}$, the state $y_0(\cdot)$ of agent $\mathcal{A}_0$ satisfies the following stochastic differential equation:
$$
\begin{cases}
dy_0(t) = [A_0y_0(t)+B_0v_0(t)+b_0x^{(0)}(t)+f_0(t)]\,dt + [C_0y_0(t)+D_0v_0(t)+l_0x^{(0)}(t)+\sigma_0(t)]\,dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
y_0(0) = a_0,
\end{cases}\tag{5}
$$
where the function $x^{(0)}(\cdot)$ will be specified later.
The corresponding cost functional is given by
$$
\tilde J_0(v_0) = \frac{1}{2}E\Big\{\int_0^T\big[\big\langle Q_0\big(y_0(t)-\beta_0x^{(0)}(t)\big),\,y_0(t)-\beta_0x^{(0)}(t)\big\rangle + \langle R_0v_0(t),v_0(t)\rangle\big]dt + \langle M_0y_0(T),y_0(T)\rangle\Big\}.
$$
Problem (LM1): The objective is to find $\bar v_0(\cdot)\in\mathcal{U}^0_{ad}$ such that
$$
\tilde J_0(\bar v_0) = \inf_{v_0\in\,\mathcal{U}^0_{ad}}\tilde J_0(v_0).
$$
Let $P_0(\cdot)$ be the solution of the following Riccati equation:
$$
\begin{cases}
-\dot P_0(t) = P_0(t)A_0 + A_0^\top P_0(t) + C_0^\top P_0(t)C_0 + Q_0 - \big(B_0^\top P_0(t)+D_0^\top P_0(t)C_0\big)^\top\big(R_0+D_0^\top P_0(t)D_0\big)^{-1}\big(B_0^\top P_0(t)+D_0^\top P_0(t)C_0\big),\\
R_0+D_0^\top P_0(t)D_0 \ge 0,\\
P_0(T) = M_0.
\end{cases}
$$
Let $\eta_0(\cdot)$ denote the solution of
$$
\begin{cases}
\dot\eta_0(t) = -\Big\{\big[A_0-B_0\big(R_0+D_0^\top P_0(t)D_0\big)^{-1}\big(B_0^\top P_0(t)+D_0^\top P_0(t)C_0\big)\big]^\top\eta_0(t) + \big[C_0-D_0\big(R_0+D_0^\top P_0(t)D_0\big)^{-1}\big(B_0^\top P_0(t)+D_0^\top P_0(t)C_0\big)\big]^\top P_0(t)\big(l_0x^{(0)}(t)+\sigma_0(t)\big)\\
\qquad\qquad + \big[P_0(t)\big(b_0x^{(0)}(t)+f_0(t)\big)-\beta_0Q_0x^{(0)}(t)\big]\Big\},\\
\eta_0(T) = 0.
\end{cases}
$$
The following result presents the optimal control of Problem (LM1).
Theorem 3.1. Define
$$
\begin{cases}
\Lambda_0(t) := -\big(R_0+D_0^\top P_0(t)D_0\big)^{-1}\big(B_0^\top P_0(t)+D_0^\top P_0(t)C_0\big),\\
\Theta_0(t) := -\big(R_0+D_0^\top P_0(t)D_0\big)^{-1}\big[B_0^\top\eta_0(t)+D_0^\top P_0(t)\big(l_0x^{(0)}(t)+\sigma_0(t)\big)\big].
\end{cases}
$$
Then the optimal control strategy of Problem (LM1) is
$$
\bar v_0(t) = \Lambda_0(t)\bar y_0(t)+\Theta_0(t),
$$
where ˉy0(⋅) satisfies
$$
\begin{cases}
d\bar y_0(t) = \big[(A_0+B_0\Lambda_0(t))\bar y_0(t)+B_0\Theta_0(t)+b_0x^{(0)}(t)+f_0(t)\big]dt + \big[(C_0+D_0\Lambda_0(t))\bar y_0(t)+D_0\Theta_0(t)+l_0x^{(0)}(t)+\sigma_0(t)\big]dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
\bar y_0(0) = a_0.
\end{cases}\tag{6}
$$
Proof. Let $\hat b(t):=b_0x^{(0)}(t)+f_0(t)$ and $\hat\sigma(t):=l_0x^{(0)}(t)+\sigma_0(t)$. Then the state equation (5) can be written as
$$
\begin{cases}
dy_0(t) = [A_0y_0(t)+B_0v_0(t)+\hat b(t)]\,dt + [C_0y_0(t)+D_0v_0(t)+\hat\sigma(t)]\,dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
y_0(0) = a_0.
\end{cases}
$$
For simplicity, we denote $\hat R_0(t):=R_0+D_0^\top P_0(t)D_0$ and $\hat B_0(t):=B_0^\top P_0(t)+D_0^\top P_0(t)C_0$. Applying Itô's formula to $\frac12 y_0^\top(t)P_0(t)y_0(t)+y_0^\top(t)\eta_0(t)$, we obtain
$$
\begin{aligned}
&E\Big\{\tfrac12 y_0^\top(T)P_0(T)y_0(T)-\tfrac12 y_0^\top(0)P_0(0)y_0(0)+y_0^\top(T)\eta_0(T)-y_0^\top(0)\eta_0(0)\Big\}\\
=\;& E\Big\{\tfrac12 M_0y_0^2(T)-\tfrac12 y_0^\top(0)P_0(0)y_0(0)-y_0^\top(0)\eta_0(0)\Big\}\\
=\;& E\int_0^T\Big[-\tfrac12 Q_0y_0^2-\tfrac12 y_0^2\hat B_0^2\hat R_0^{-1}+P_0y_0v_0^\top B_0^\top+P_0C_0y_0v_0^\top D_0^\top+\eta_0v_0^\top B_0^\top+\eta_0\hat b^\top\Big]dt + E\int_0^T\Big(\tfrac12 P_0D_0^2v_0^2+P_0D_0v_0\hat\sigma+\tfrac12 P_0\hat\sigma^2\Big)dt\\
&+ E\int_0^T\big[B_0\hat R_0^{-1}\big(B_0^\top P_0+D_0^\top P_0C_0\big)\big]^\top\eta_0y_0^\top dt + E\int_0^T\Big\{\big[D_0\hat R_0^{-1}\hat B_0\big]^\top P_0\hat\sigma y_0^\top+\beta_0Q_0x^{(0)}(t)y_0^\top\Big\}dt + \tfrac12 P_0F_0^2\int_0^T\!\int_{E_0}\pi_0(de)\,dt.
\end{aligned}
$$
Combining the above equation with the definition of $\tilde J_0(v_0)$, it follows that
$$
\begin{aligned}
\tilde J_0(v_0) =\;& E\Big\{\int_0^T\Big(\tfrac12 Q_0\big(y_0-\beta_0x^{(0)}(t)\big)^2+\tfrac12 R_0v_0^2\Big)dt+\tfrac12 M_0y_0^2(T)\Big\}\\
=\;& E\Big\{\int_0^T\Big[-\beta_0Q_0x^{(0)}(t)y_0+\tfrac12 Q_0\big(\beta_0x^{(0)}(t)\big)^2+\tfrac12 R_0v_0^2+\tfrac12 P_0D_0^2v_0^2+y_0v_0^\top P_0B_0^\top+y_0v_0^\top P_0C_0D_0^\top+\beta_0Q_0x^{(0)}(t)y_0^\top\\
&\quad+\tfrac12 y_0^2\hat B_0^2\hat R_0^{-1}+P_0D_0v_0\hat\sigma+\eta_0v_0^\top B_0^\top+\big[B_0\hat R_0^{-1}\hat B_0\big]^\top\eta_0y_0^\top+\big[D_0\hat R_0^{-1}\hat B_0\big]^\top P_0\hat\sigma y_0^\top+\tfrac12 P_0\hat\sigma^2+\eta_0\hat b^\top\Big]dt\Big\}\\
&+\tfrac12 P_0F_0^2\int_0^T\!\int_{E_0}\pi_0(de)\,dt+\tfrac12 a_0^2P_0(0)+a_0\eta_0(0)\\
=\;& E\Big\{\int_0^T\Big[\tfrac12\hat R_0^{-1}\Big\{\big[\hat R_0v_0+\hat B_0y_0\big]^2+2\big(B_0^\top\eta_0+D_0^\top P_0\hat\sigma\big)\big(\hat R_0v_0+\hat B_0y_0\big)\Big\}+\tfrac12 P_0\hat\sigma^2+\eta_0\hat b^\top\Big]dt\Big\}\\
&+\tfrac12 P_0F_0^2\int_0^T\!\int_{E_0}\pi_0(de)\,dt+\tfrac12 a_0^2P_0(0)+a_0\eta_0(0)\\
=\;& E\Big\{\int_0^T\Big[\tfrac12\hat R_0^{-1}\big\|\hat R_0v_0+\hat B_0y_0+\big(B_0^\top\eta_0+D_0^\top P_0\hat\sigma\big)\big\|^2-\tfrac12\hat R_0^{-1}\big(B_0^\top\eta_0+D_0^\top P_0\hat\sigma\big)^2+\tfrac12 P_0\hat\sigma^2+\eta_0\hat b^\top\Big]dt\Big\}\\
&+\tfrac12 P_0F_0^2\int_0^T\!\int_{E_0}\pi_0(de)\,dt+\tfrac12 a_0^2P_0(0)+a_0\eta_0(0).
\end{aligned}
$$
Hence we obtain the optimal control
$$
\bar v_0(t) = -\hat R_0^{-1}(t)\hat B_0(t)\bar y_0(t)-\hat R_0^{-1}(t)\big(B_0^\top\eta_0(t)+D_0^\top P_0(t)\hat\sigma(t)\big) = \Lambda_0(t)\bar y_0(t)+\Theta_0(t).
$$
The proof is therefore complete.
For any $i\in\mathcal{N}$, the limiting state of minor agent $\mathcal{A}_i$ is
$$
\begin{cases}
dy_i(t) = [Ay_i(t)+Bv_i(t)+b_1x^{(0)}(t)+f(t)]\,dt + [Cy_i(t)+Dv_i(t)+b_2x^{(0)}(t)+Hy_0(t)+\sigma(t)]\,dW_i(t) + F\displaystyle\int_{E_i}G_i(de\,dt),\\
y_i(0) = a_i.
\end{cases}
$$
The limiting cost functional is given by
$$
\tilde J_i(v_i) = \frac12 E\Big\{\int_0^T\big[\big\langle Q\big(y_i(t)-\beta_1x^{(0)}(t)-\beta_2y_0(t)\big),\,y_i(t)-\beta_1x^{(0)}(t)-\beta_2y_0(t)\big\rangle + \langle Rv_i(t),v_i(t)\rangle\big]dt + \langle My_i(T),y_i(T)\rangle\Big\}.
$$
Problem (LM2): Find a control strategy $\bar v_i(\cdot)\in\mathcal{U}^i_{ad}$, $1\le i\le N$, such that
$$
\tilde J_i(\bar v_i) = \inf_{v_i\in\,\mathcal{U}^i_{ad}}\tilde J_i(v_i).
$$
Let $P_1(\cdot)$ be the solution of the following Riccati equation:
$$
\begin{cases}
-\dot P_1(t) = P_1(t)A + A^\top P_1(t) + C^\top P_1(t)C + Q - \big(B^\top P_1(t)+D^\top P_1(t)C\big)^\top\big(R+D^\top P_1(t)D\big)^{-1}\big(B^\top P_1(t)+D^\top P_1(t)C\big),\\
R+D^\top P_1(t)D \ge 0,\\
P_1(T) = M.
\end{cases}
$$
$\eta_1(\cdot)$ satisfies
$$
\begin{cases}
\dot\eta_1(t) = -\Big\{\big[A-B\big(R+D^\top P_1(t)D\big)^{-1}\big(B^\top P_1(t)+D^\top P_1(t)C\big)\big]^\top\eta_1(t) + \big[C-D\big(R+D^\top P_1(t)D\big)^{-1}\big(B^\top P_1(t)+D^\top P_1(t)C\big)\big]^\top P_1(t)\big(b_2x^{(0)}(t)+Hy_0(t)+\sigma(t)\big)\\
\qquad\qquad + \big[P_1(t)\big(b_1x^{(0)}(t)+f(t)\big)-\beta_1Qx^{(0)}(t)-\beta_2Qy_0(t)\big]\Big\},\\
\eta_1(T) = 0.
\end{cases}
$$
Denote
$$
\begin{cases}
\Lambda_1(t) := -\big(R+D^\top P_1(t)D\big)^{-1}\big(B^\top P_1(t)+D^\top P_1(t)C\big),\\
\Theta_1(t) := -\big(R+D^\top P_1(t)D\big)^{-1}\big[B^\top\eta_1(t)+D^\top P_1(t)\big(b_2x^{(0)}(t)+Hy_0(t)+\sigma(t)\big)\big],\\
\bar\Theta_1(t) := -\big(R+D^\top P_1(t)D\big)^{-1}\big[B^\top\eta_1(t)+D^\top P_1(t)\big(b_2x^{(0)}(t)+H\bar y_0(t)+\sigma(t)\big)\big].
\end{cases}
$$
Using a proof similar to that of Theorem 3.1, we have the following result.
Theorem 3.2. The optimal control strategy of Problem (LM2) is
$$
\bar v_i(t) = \Lambda_1(t)\bar y_i(t)+\bar\Theta_1(t),
$$
where ˉyi(⋅) satisfies
$$
\begin{cases}
d\bar y_i(t) = \big[(A+B\Lambda_1(t))\bar y_i(t)+B\bar\Theta_1(t)+b_1x^{(0)}(t)+f(t)\big]dt + \big[(C+D\Lambda_1(t))\bar y_i(t)+D\bar\Theta_1(t)+b_2x^{(0)}(t)+H\bar y_0(t)+\sigma(t)\big]dW_i(t) + F\displaystyle\int_{E_i}G_i(de\,dt),\\
\bar y_i(0) = a_i.
\end{cases}\tag{7}
$$
In this subsection, we design a closed-loop mean-field system, and show the approximations between the limiting system and the corresponding closed-loop system.
Based on the feedback formulation of the optimal controls for the major agent $\mathcal{A}_0$ and the minor agents $\mathcal{A}_i$, $1\le i\le N$, we obtain
$$
\begin{cases}
d\bar x_0(t) = \big[(A_0+B_0\Lambda_0(t))\bar x_0(t)+B_0\Theta_0(t)+b_0\bar x^{(N)}(t)+f_0(t)\big]dt + \big[(C_0+D_0\Lambda_0(t))\bar x_0(t)+D_0\Theta_0(t)+l_0\bar x^{(N)}(t)+\sigma_0(t)\big]dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
\bar x_0(0) = a_0,
\end{cases}\tag{8}
$$
and
$$
\begin{cases}
d\bar x_i(t) = \big[(A+B\Lambda_1(t))\bar x_i(t)+B\bar\Theta_1(t)+b_1\bar x^{(N)}(t)+f(t)\big]dt + \big[(C+D\Lambda_1(t))\bar x_i(t)+D\bar\Theta_1(t)+b_2\bar x^{(N)}(t)+H\bar x_0(t)+\sigma(t)\big]dW_i(t) + F\displaystyle\int_{E_i}G_i(de\,dt),\\
\bar x_i(0) = a_i.
\end{cases}\tag{9}
$$
By $\bar x^{(N)}(t)=\frac1N\sum_{j=1}^N\bar x_j(t)$, the function $x^{(0)}(t)$ fulfills
$$
\begin{cases}
dx^{(0)}(t) = \big[(A+B\Lambda_1(t)+b_1)x^{(0)}(t)+B\bar\Theta_1(t)+f(t)\big]dt,\\
x^{(0)}(0) = \frac1N\sum_{j=1}^N a_j.
\end{cases}\tag{10}
$$
Now, we introduce the following NCE equation:
$$
\begin{cases}
d\bar y_0(t) = \big[(A_0+B_0\Lambda_0(t))\bar y_0(t)+B_0\Theta_0(t)+b_0x^{(0)}(t)+f_0(t)\big]dt + \big[(C_0+D_0\Lambda_0(t))\bar y_0(t)+D_0\Theta_0(t)+l_0x^{(0)}(t)+\sigma_0(t)\big]dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
\dot x^{(0)}(t) = (A+B\Lambda_1(t)+b_1)x^{(0)}(t) - B\big(R+D^\top P_1(t)D\big)^{-1}\big[B^\top\eta_1(t)+D^\top P_1(t)\big(b_2x^{(0)}(t)+H\bar y_0(t)+\sigma(t)\big)\big] + f(t),\\
-\dot\eta_1(t) = [A+B\Lambda_1(t)]^\top\eta_1(t) + [C+D\Lambda_1(t)]^\top P_1(t)\big[b_2x^{(0)}(t)+H\bar y_0(t)+\sigma(t)\big] + P_1(t)\big(b_1x^{(0)}(t)+f(t)\big) - \beta_1Qx^{(0)}(t) - \beta_2Q\bar y_0(t),\\
-\dot\eta_0(t) = [A_0+B_0\Lambda_0(t)]^\top\eta_0(t) + [C_0+D_0\Lambda_0(t)]^\top P_0(t)\big(l_0x^{(0)}(t)+\sigma_0(t)\big) + P_0(t)\big(b_0x^{(0)}(t)+f_0(t)\big) - \beta_0Q_0x^{(0)}(t),\\
\bar y_0(0)=a_0,\quad \eta_0(T)=\eta_1(T)=0,\quad x^{(0)}(0)=\frac1N\sum_{j=1}^N a_j,
\end{cases}
$$
which can be written as
$$
\begin{cases}
d\bar y_0(t) = \big[\hat A_0(t)\bar y_0(t)+G_0(t)x^{(0)}(t)-B_0\hat R_0^{-1}(t)B_0^\top\eta_0(t)+\hat G_0(t)\big]dt + \big[C_0(t)\bar y_0(t)+H_0(t)x^{(0)}(t)-D_0\hat R_0^{-1}(t)B_0^\top\eta_0(t)+\hat H_0(t)\big]dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
\dot x^{(0)}(t) = G_1(t)x^{(0)}(t) - B\hat R^{-1}(t)\big[B^\top\eta_1(t)+D^\top P_1(t)H\bar y_0(t)+D^\top P_1(t)\sigma(t)\big] + f(t),\\
-\dot\eta_1(t) = \hat A^\top(t)\eta_1(t) + L_1(t)x^{(0)}(t) + H_1(t)\bar y_0(t) + K_1(t),\\
-\dot\eta_0(t) = \hat A_0^\top(t)\eta_0(t) + L_0(t)x^{(0)}(t) + K_0(t),\\
\bar y_0(0)=a_0,\quad \eta_0(T)=\eta_1(T)=0,\quad x^{(0)}(0)=\frac1N\sum_{j=1}^N a_j,
\end{cases}\tag{11}
$$
where
$$
\begin{aligned}
&\hat A_0(t):=A_0+B_0\Lambda_0(t), &&G_0(t):=-B_0\hat R_0^{-1}(t)D_0^\top P_0(t)l_0+b_0,\\
&\hat R_0(t):=R_0+D_0^\top P_0(t)D_0, &&\hat G_0(t):=f_0(t)-B_0\hat R_0^{-1}(t)D_0^\top P_0(t)\sigma_0(t),\\
&C_0(t):=C_0+D_0\Lambda_0(t), &&H_0(t):=-D_0\hat R_0^{-1}(t)D_0^\top P_0(t)l_0+l_0,\\
&\hat H_0(t):=\sigma_0(t)-D_0\hat R_0^{-1}(t)D_0^\top P_0(t)\sigma_0(t), &&G_1(t):=\hat A(t)+b_1-B\hat R^{-1}(t)D^\top P_1(t)b_2,\\
&\hat A(t):=A+B\Lambda_1(t), &&L_1(t):=[C+D\Lambda_1(t)]^\top P_1(t)b_2+P_1(t)b_1-\beta_1Q,\\
&\hat R(t):=R+D^\top P_1(t)D, &&H_1(t):=[C+D\Lambda_1(t)]^\top P_1(t)H-\beta_2Q,\\
&K_1(t):=[C+D\Lambda_1(t)]^\top P_1(t)\sigma(t)+P_1(t)f(t), &&K_0(t):=[C_0+D_0\Lambda_0(t)]^\top P_0(t)\sigma_0(t)+P_0(t)f_0(t),\\
&L_0(t):=[C_0+D_0\Lambda_0(t)]^\top P_0(t)l_0+P_0(t)b_0-\beta_0Q_0.
\end{aligned}
$$
The above NCE equation is a coupled two-point boundary value problem; its well-posedness can be established as in Theorem 4.2 of Hu et al. [3] under suitable monotonicity assumptions, which we do not repeat here.
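Although the CC system here is stochastic, the mechanics of solving a coupled forward-backward two-point boundary value problem can be illustrated on a deterministic toy analogue with one forward component $x$ and one backward component $\eta$. In the sketch below, the coefficients g1, beta, alpha, l1 and the initial value are hypothetical placeholders (not the paper's $G_1$, $L_1$, etc.), and the forward-backward Picard sweep is only one of several possible schemes.

```python
import numpy as np

# Forward-backward sweep for a toy two-point boundary value problem:
#   dx/dt = g1 x + beta eta,  x(0) = x0     (forward)
#   -d(eta)/dt = alpha eta + l1 x, eta(T) = 0   (backward)
K, T = 400, 1.0
dt = T / K
g1, beta, alpha, l1 = -1.0, 0.5, -1.0, 0.3
x0 = 1.0

x = np.zeros(K + 1)
eta = np.zeros(K + 1)                  # initial guess for the backward component
for sweep in range(200):
    x[0] = x0
    for k in range(K):                 # forward sweep with frozen eta
        x[k + 1] = x[k] + (g1 * x[k] + beta * eta[k]) * dt
    eta_new = np.zeros(K + 1)          # terminal condition eta(T) = 0
    for k in range(K, 0, -1):          # backward sweep with frozen x
        eta_new[k - 1] = eta_new[k] + (alpha * eta_new[k] + l1 * x[k]) * dt
    gap = np.max(np.abs(eta_new - eta))
    eta = eta_new
    if gap < 1e-10:
        break
print(f"sweeps: {sweep + 1}, residual: {gap:.1e}")
```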
Next, we establish the approximation relationship between the closed-loop mean-field game system and the limiting system.
Proposition 3.3. The following estimates hold:
$$
\begin{aligned}
&(i)\ \sup_{0\le t\le T}E\big|\bar x^{(N)}(t)-x^{(0)}(t)\big|^2 = O\Big(\tfrac1N\Big), &&(ii)\ \sup_{0\le t\le T}E\Big|\,|\bar x^{(N)}(t)|^2-|x^{(0)}(t)|^2\,\Big| = O\Big(\tfrac{1}{\sqrt N}\Big),\\
&(iii)\ \sup_{0\le t\le T}E\big|\bar x_0(t)-\bar y_0(t)\big|^2 = O\Big(\tfrac1N\Big), &&(iv)\ \sup_{0\le t\le T}E\Big|\,|\bar x_0(t)|^2-|\bar y_0(t)|^2\,\Big| = O\Big(\tfrac{1}{\sqrt N}\Big),\\
&(v)\ \sup_{0\le t\le T}E\big|\bar x_i(t)-\bar y_i(t)\big|^2 = O\Big(\tfrac1N\Big),\ 1\le i\le N, &&(vi)\ \sup_{0\le t\le T}E\Big|\,|\bar x_i(t)|^2-|\bar y_i(t)|^2\,\Big| = O\Big(\tfrac{1}{\sqrt N}\Big),\ 1\le i\le N.
\end{aligned}
$$
Proof. Let $\bar z(t):=\bar x^{(N)}(t)-x^{(0)}(t)$, $\bar z_0(t):=\bar x_0(t)-\bar y_0(t)$, and $\bar z_i(t):=\bar x_i(t)-\bar y_i(t)$, $1\le i\le N$. Combining (9) with (10), we derive
$$
\begin{cases}
d\bar z(t) = (A+B\Lambda_1(t)+b_1)\bar z(t)\,dt + \frac1N\displaystyle\sum_{j=1}^N\big[(C+D\Lambda_1(t))\bar x_j(t)+D\bar\Theta_1(t)+b_2\bar x^{(N)}(t)+H\bar x_0(t)+\sigma(t)\big]dW_j(t) + \frac1N\displaystyle\sum_{j=1}^N F\int_{E_j}G_j(de\,dt),\\
\bar z(0)=0.
\end{cases}
$$
Define $\chi(t):=b_2\bar x^{(N)}(t)+H\bar x_0(t)+\sigma(t)$. Applying Itô's formula to $\bar z^2(t)$, we obtain
$$
\begin{aligned}
E[\bar z^2(t)] =\;& 2\int_0^t(A+B\Lambda_1(s)+b_1)E[\bar z^2(s)]\,ds + \frac{1}{N^2}\sum_{j=1}^N E\int_0^t\big[(C+D\Lambda_1(s))\bar x_j(s)+D\bar\Theta_1(s)+\chi(s)\big]^2ds + \frac{F^2}{N^2}\sum_{j=1}^N E\int_0^t\!\int_{E_j}\pi_j(de)\,ds\\
\le\;& 2\sup_{0\le t\le T}\big(A+B\Lambda_1(t)+b_1\big)\int_0^t E[\bar z^2(s)]\,ds + \frac TN\max_{1\le j\le N}\sup_{0\le t\le T}E\big[(C+D\Lambda_1(t))\bar x_j(t)+D\bar\Theta_1(t)+\chi(t)\big]^2 + \frac{F^2}{N}\max_{1\le j\le N}E\int_0^t\!\int_{E_j}\pi_j(de)\,ds.
\end{aligned}
$$
According to Gronwall's inequality, it follows that
$$
\sup_{0\le t\le T}E\big|\bar x^{(N)}(t)-x^{(0)}(t)\big|^2 = O\Big(\frac1N\Big).\tag{12}
$$
For (ii), according to Hölder's inequality, we have
$$
E\Big|\,|\bar x^{(N)}(t)|^2-|x^{(0)}(t)|^2\,\Big| = E\Big|\,|\bar x^{(N)}(t)-x^{(0)}(t)|^2+2x^{(0)}(t)\big(\bar x^{(N)}(t)-x^{(0)}(t)\big)\Big| \le E\big[|\bar x^{(N)}(t)-x^{(0)}(t)|^2\big] + 2\,|x^{(0)}(t)|\Big(E\big[|\bar x^{(N)}(t)-x^{(0)}(t)|^2\big]\Big)^{\frac12}.
$$
By (12) and the boundedness of |x(0)(t)|, one has
$$
\sup_{0\le t\le T}E\Big|\,|\bar x^{(N)}(t)|^2-|x^{(0)}(t)|^2\,\Big| = O\Big(\frac{1}{\sqrt N}\Big).
$$
We now prove (iii). According to (6) and (8), it follows that
$$
\begin{cases}
d\bar z_0(t) = \big[(A_0+B_0\Lambda_0(t))\bar z_0(t)+b_0\bar z(t)\big]dt + \big[(C_0+D_0\Lambda_0(t))\bar z_0(t)+l_0\bar z(t)\big]dW_0(t),\\
\bar z_0(0)=0.
\end{cases}
$$
Applying Itô's formula to $\bar z_0^2(t)$, we obtain
$$
\begin{aligned}
E[\bar z_0^2(t)] =\;& 2\int_0^t E\big[(A_0+B_0\Lambda_0(s))\bar z_0^2(s)+b_0\bar z(s)\bar z_0(s)\big]ds + \int_0^t E\big[(C_0+D_0\Lambda_0(s))\bar z_0(s)+l_0\bar z(s)\big]^2ds\\
\le\;& 2\int_0^t\big[(A_0+B_0\Lambda_0(s))+(C_0+D_0\Lambda_0(s))^2+b_0^2\big]E\bar z_0^2(s)\,ds + \int_0^t\Big(\tfrac12+2l_0^2\Big)E\bar z^2(s)\,ds.
\end{aligned}
$$
By (12) and Gronwall's inequality, we have
$$
\sup_{0\le t\le T}E|\bar x_0(t)-\bar y_0(t)|^2 = O\Big(\frac1N\Big).\tag{13}
$$
Note that
$$
E\Big|\,|\bar x_0(t)|^2-|\bar y_0(t)|^2\,\Big| = E\Big|\,|\bar x_0(t)-\bar y_0(t)|^2+2\bar y_0(t)\big(\bar x_0(t)-\bar y_0(t)\big)\Big| \le E\big[|\bar x_0(t)-\bar y_0(t)|^2\big] + 2\big(E[|\bar y_0(t)|^2]\big)^{\frac12}\big(E[|\bar x_0(t)-\bar y_0(t)|^2]\big)^{\frac12}.
$$
According to (13) and the boundedness of $E|\bar y_0(t)|^2$, we obtain
$$
\sup_{0\le t\le T}E\Big|\,|\bar x_0(t)|^2-|\bar y_0(t)|^2\,\Big| = O\Big(\frac{1}{\sqrt N}\Big).
$$
Next, we prove (v). Combining (7) with (9), we have
$$
\begin{cases}
d\bar z_i(t) = \big[(A+B\Lambda_1(t))\bar z_i(t)+b_1\bar z(t)\big]dt + \big[(C+D\Lambda_1(t))\bar z_i(t)+b_2\bar z(t)+H\bar z_0(t)\big]dW_i(t),\\
\bar z_i(0)=0.
\end{cases}
$$
Applying Itô's formula to $\bar z_i^2(t)$, we obtain
$$
\begin{aligned}
E[\bar z_i^2(t)] =\;& 2\int_0^t E\big[(A+B\Lambda_1(s))\bar z_i^2(s)+b_1\bar z(s)\bar z_i(s)\big]ds + \int_0^t E\big[(C+D\Lambda_1(s))\bar z_i(s)+b_2\bar z(s)+H\bar z_0(s)\big]^2ds\\
\le\;& \int_0^t\big[2(A+B\Lambda_1(s))+b_1^2+3(C+D\Lambda_1(s))^2\big]E\bar z_i^2(s)\,ds + \int_0^t\big(1+3b_2^2\big)E\bar z^2(s)\,ds + 3H^2\int_0^t E\bar z_0^2(s)\,ds.
\end{aligned}
$$
By Gronwall's inequality and the estimates (12) and (13), we obtain
$$
\sup_{0\le t\le T}E|\bar x_i(t)-\bar y_i(t)|^2 = O\Big(\frac1N\Big).\tag{14}
$$
Finally, we prove (vi). Note that
$$
E\Big|\,|\bar x_i(t)|^2-|\bar y_i(t)|^2\,\Big| \le E\big[|\bar x_i(t)-\bar y_i(t)|^2\big] + 2E\big[|\bar y_i(t)||\bar x_i(t)-\bar y_i(t)|\big] \le E\big[|\bar x_i(t)-\bar y_i(t)|^2\big] + 2\big(E[|\bar y_i(t)|^2]\big)^{\frac12}\big(E[|\bar x_i(t)-\bar y_i(t)|^2]\big)^{\frac12}.
$$
According to (14) and the boundedness of $E|\bar y_i(t)|^2$, we get
$$
\sup_{0\le t\le T}E\Big|\,|\bar x_i(t)|^2-|\bar y_i(t)|^2\,\Big| = O\Big(\frac{1}{\sqrt N}\Big).
$$
The proof is then complete.
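The mechanism behind these $O(1/N)$ rates is the $L^2$ law of large numbers applied to the averaged martingale terms. The following Monte Carlo sketch illustrates the rate on a toy scalar population with hypothetical coefficients (not the closed-loop system (8) and (9)): each agent follows an independent jump-diffusion whose mean solves a deterministic ODE, and the worst-case squared gap between the empirical average and the limit roughly halves as $N$ doubles.

```python
import numpy as np

rng = np.random.default_rng(1)

# Empirical check of the O(1/N) rate in Proposition 3.3 (i) for a toy scalar
# population: dx_i = a x_i dt + c x_i dW_i + f dJ_i (compensated jumps), whose
# mean-field limit x0 solves dx0/dt = a x0. All coefficients are placeholders.
a, c, f_jump, lam, jump_mark = -0.5, 0.4, 0.2, 2.0, 0.3
T, K, n_mc = 1.0, 200, 200
dt = T / K

def worst_gap(N):
    gaps = np.zeros(K + 1)
    for _ in range(n_mc):
        x = np.ones(N)
        x0 = 1.0
        for k in range(K):
            dW = rng.normal(0.0, np.sqrt(dt), size=N)
            dJ = rng.poisson(lam * dt, size=N) * jump_mark - lam * jump_mark * dt
            x = x + a * x * dt + c * x * dW + f_jump * dJ
            x0 = x0 + a * x0 * dt
            gaps[k + 1] += (x.mean() - x0) ** 2
    return gaps.max() / n_mc   # estimate of sup_t E|xbar^(N)(t) - x^(0)(t)|^2

for N in (50, 100, 200, 400):
    print(N, worst_gap(N))     # gap roughly halves as N doubles
```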
Define the control strategy for the major agent as
$$
\bar u_0(t) = \Lambda_0(t)\bar x_0(t)+\Theta_0(t),\tag{15}
$$
and the control strategy for minor agents as
$$
\bar u_i(t) = \Lambda_1(t)\bar x_i(t)+\bar\Theta_1(t).\tag{16}
$$
Based on the approximation relationship between the closed-loop mean-field system and the limiting system, the following approximation relationship between the cost functionals can be derived.
Proposition 3.4. For any i=0,1,⋯,N, we have
$$
\big|J_i(\bar u_i,\bar u_{-i})-\tilde J_i(\bar v_i)\big| = O\Big(\frac{1}{\sqrt N}\Big).
$$
Proof. Based on the definitions of the cost functionals, we obtain
$$
\begin{aligned}
&\big|J_i(\bar u_i,\bar u_{-i})-\tilde J_i(\bar v_i)\big|\\
=\;&\Big|\tfrac12 E\int_0^T\Big\{Q\big[\big(\bar x_i(t)-\beta_1\bar x^{(N)}(t)-\beta_2\bar x_0(t)\big)^2-\big(\bar y_i(t)-\beta_1x^{(0)}(t)-\beta_2\bar y_0(t)\big)^2\big]+R\big[\bar u_i^2(t)-\bar v_i^2(t)\big]\Big\}dt+\tfrac12 E\big[M\bar x_i^2(T)-M\bar y_i^2(T)\big]\Big|\\
=\;&\Big|\tfrac12 E\int_0^T\Big\{Q\big[\big(\bar x_i(t)-\beta_1\bar x^{(N)}(t)-\beta_2\bar x_0(t)\big)+\big(\bar y_i(t)-\beta_1x^{(0)}(t)-\beta_2\bar y_0(t)\big)\big]\big[\big(\bar x_i(t)-\beta_1\bar x^{(N)}(t)-\beta_2\bar x_0(t)\big)-\big(\bar y_i(t)-\beta_1x^{(0)}(t)-\beta_2\bar y_0(t)\big)\big]\\
&\quad+R\big[\big(\Lambda_1(t)\bar x_i(t)+\bar\Theta_1(t)\big)^2-\big(\Lambda_1(t)\bar y_i(t)+\bar\Theta_1(t)\big)^2\big]\Big\}dt+\tfrac12 E\big[M\bar x_i^2(T)-M\bar y_i^2(T)\big]\Big|\\
=\;&\Big|\tfrac12 E\int_0^T\Big\{Q\big[\big(2\bar x_i(t)-2\beta_1\bar x^{(N)}(t)-2\beta_2\bar x_0(t)\big)-L(t)\big]L(t)+R\big[\Lambda_1^2(t)\big(\bar x_i^2(t)-\bar y_i^2(t)\big)+2\Lambda_1(t)\bar\Theta_1(t)\big(\bar x_i(t)-\bar y_i(t)\big)\big]\Big\}dt+\tfrac12 E\big[M\bar x_i^2(T)-M\bar y_i^2(T)\big]\Big|\\
\le\;&\tfrac12\int_0^T\Big\{Q\,E\big[\big|\big(2\bar x_i(t)-2\beta_1\bar x^{(N)}(t)-2\beta_2\bar x_0(t)\big)L(t)\big|\big]+Q\,E\big[|L(t)|^2\big]+R\Lambda_1^2(t)E\big[|\bar x_i^2(t)-\bar y_i^2(t)|\big]+2R\,E\big[|\Lambda_1(t)\bar\Theta_1(t)||\bar x_i(t)-\bar y_i(t)|\big]\Big\}dt+\tfrac12 M\,E\big[|\bar x_i^2(T)-\bar y_i^2(T)|\big]\\
\le\;&\tfrac12 QT\sup_{0\le t\le T}E\big[\big|\big(2\bar x_i(t)-2\beta_1\bar x^{(N)}(t)-2\beta_2\bar x_0(t)\big)L(t)\big|\big]+\tfrac12 QT\sup_{0\le t\le T}E\big[|L(t)|^2\big]+\tfrac12 RT\sup_{0\le t\le T}\Lambda_1^2(t)\sup_{0\le t\le T}E\big[|\bar x_i^2(t)-\bar y_i^2(t)|\big]\\
&\quad+RT\sup_{0\le t\le T}E\big[|\Lambda_1(t)\bar\Theta_1(t)||\bar x_i(t)-\bar y_i(t)|\big]+\tfrac12 M\,E\big[|\bar x_i^2(T)-\bar y_i^2(T)|\big],
\end{aligned}
$$
where $L(t):=(\bar x_i(t)-\bar y_i(t))-\beta_1(\bar x^{(N)}(t)-x^{(0)}(t))-\beta_2(\bar x_0(t)-\bar y_0(t))$. According to Proposition 3.3, we have $E[|L(t)|^2]=O(1/N)$. Therefore, it follows that $|J_i(\bar u_i,\bar u_{-i})-\tilde J_i(\bar v_i)|=O(1/\sqrt N)$. The proof is then complete.
This section verifies the asymptotic Nash equilibrium property of the decentralized control strategies $\bar u=(\bar u_0,\bar u_1,\cdots,\bar u_N)$ specified by (15) and (16).
Let the major agent $\mathcal{A}_0$ take an alternative control strategy $u_0$, and let each minor agent $\mathcal{A}_i$ take the control law (16). Then the state system with the major agent's perturbation is
$$
\begin{cases}
d\tilde x_0(t) = [A_0\tilde x_0(t)+B_0u_0(t)+b_0\tilde x^{(N)}(t)+f_0(t)]\,dt + [C_0\tilde x_0(t)+D_0u_0(t)+l_0\tilde x^{(N)}(t)+\sigma_0(t)]\,dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
d\tilde x_i(t) = \big[(A+B\Lambda_1(t))\tilde x_i(t)+B\bar\Theta_1(t)+b_1\tilde x^{(N)}(t)+f(t)\big]dt + \big[(C+D\Lambda_1(t))\tilde x_i(t)+D\bar\Theta_1(t)+b_2\tilde x^{(N)}(t)+H\tilde x_0(t)+\sigma(t)\big]dW_i(t) + F\displaystyle\int_{E_i}G_i(de\,dt),\\
\tilde x_0(0)=a_0,\quad \tilde x_i(0)=a_i,\ i=1,\cdots,N,
\end{cases}\tag{17}
$$
where $\tilde x^{(N)}(t)=\frac1N\sum_{k=1}^N\tilde x_k(t)$. The cost functional for the major agent $\mathcal{A}_0$ is
$$
J_0(u_0,\bar u_{-0}) = \frac12 E\Big\{\int_0^T\big[\big\langle Q_0\big(\tilde x_0(t)-\beta_0\tilde x^{(N)}(t)\big),\,\tilde x_0(t)-\beta_0\tilde x^{(N)}(t)\big\rangle + \langle R_0u_0(t),u_0(t)\rangle\big]dt + \langle M_0\tilde x_0(T),\tilde x_0(T)\rangle\Big\}.
$$
The corresponding limiting state equation with the major agent's perturbation control is
$$
\begin{cases}
d\tilde y_0(t) = [A_0\tilde y_0(t)+B_0u_0(t)+b_0x^{(0)}(t)+f_0(t)]\,dt + [C_0\tilde y_0(t)+D_0u_0(t)+l_0x^{(0)}(t)+\sigma_0(t)]\,dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
d\tilde y_i(t) = \big[(A+B\Lambda_1(t))\tilde y_i(t)+B\bar\Theta_1(t)+b_1x^{(0)}(t)+f(t)\big]dt + \big[(C+D\Lambda_1(t))\tilde y_i(t)+D\bar\Theta_1(t)+b_2x^{(0)}(t)+H\tilde y_0(t)+\sigma(t)\big]dW_i(t) + F\displaystyle\int_{E_i}G_i(de\,dt),\\
\tilde y_0(0)=a_0,\quad \tilde y_i(0)=a_i,\ i=1,\cdots,N.
\end{cases}
$$
The cost functional is
$$
\tilde J_0(u_0) = \frac12 E\Big\{\int_0^T\big[\big\langle Q_0\big(\tilde y_0(t)-\beta_0x^{(0)}(t)\big),\,\tilde y_0(t)-\beta_0x^{(0)}(t)\big\rangle + \langle R_0u_0(t),u_0(t)\rangle\big]dt + \langle M_0\tilde y_0(T),\tilde y_0(T)\rangle\Big\}.
$$
The following result presents an approximation relationship between two perturbation systems.
Proposition 4.1. We have the following conclusion:
$$
\begin{aligned}
&(i)\ \sup_{0\le t\le T}E\big|\tilde x^{(N)}(t)-x^{(0)}(t)\big|^2 = O\Big(\tfrac1N\Big), &&(ii)\ \sup_{0\le t\le T}E\Big|\,|\tilde x^{(N)}(t)|^2-|x^{(0)}(t)|^2\,\Big| = O\Big(\tfrac{1}{\sqrt N}\Big),\\
&(iii)\ \sup_{0\le t\le T}E\big|\tilde x_0(t)-\tilde y_0(t)\big|^2 = O\Big(\tfrac1N\Big), &&(iv)\ \sup_{0\le t\le T}E\Big|\,|\tilde x_0(t)|^2-|\tilde y_0(t)|^2\,\Big| = O\Big(\tfrac{1}{\sqrt N}\Big).
\end{aligned}
$$
Proof. We only need to prove the first estimate; the other three can be obtained by a proof similar to that of Proposition 3.3.
Define $\Phi(t):=\tilde x^{(N)}(t)-x^{(0)}(t)$. Combining (10) with (17), we have
$$
\begin{cases}
d\Phi(t) = (A+B\Lambda_1(t)+b_1)\Phi(t)\,dt + \frac1N\displaystyle\sum_{k=1}^N\big[(C+D\Lambda_1(t))\tilde x_k(t)+D\bar\Theta_1(t)+b_2\tilde x^{(N)}(t)+H\tilde x_0(t)+\sigma(t)\big]dW_k(t) + \frac1N\displaystyle\sum_{k=1}^N F\int_{E_k}G_k(de\,dt),\\
\Phi(0)=0.
\end{cases}
$$
Define $L_k(t):=(C+D\Lambda_1(t))\tilde x_k(t)+D\bar\Theta_1(t)+b_2\tilde x^{(N)}(t)+H\tilde x_0(t)+\sigma(t)$. Therefore,
$$
\begin{aligned}
E\int_0^t|L_k(s)|^2ds =\;& E\int_0^t\big[(C+D\Lambda_1(s))\tilde x_k(s)+D\bar\Theta_1(s)+b_2\big(\tilde x^{(N)}(s)-x^{(0)}(s)\big)+b_2x^{(0)}(s)+H\tilde x_0(s)+\sigma(s)\big]^2ds\\
\le\;& C\,E\int_0^t\big[|\tilde x_k(s)|^2+1+|\tilde x^{(N)}(s)-x^{(0)}(s)|^2+|x^{(0)}(s)|^2+|\tilde x_0(s)|^2+|\sigma(s)|^2\big]ds\\
\le\;& C\,E\int_0^t\big|\tilde x^{(N)}(s)-x^{(0)}(s)\big|^2ds + C_1,
\end{aligned}
$$
where
$$
C:=\max\Big\{\sup_{t\in[0,T]}|C+D\Lambda_1(t)|,\ \sup_{t\in[0,T]}|D\bar\Theta_1(t)|,\ |b_2|,\ |H|,\ 1\Big\},\qquad C_1:=C\,E\int_0^T\big[|\tilde x_k(s)|^2+1+|x^{(0)}(s)|^2+|\tilde x_0(s)|^2+|\sigma(s)|^2\big]ds
$$
are constants independent of N.
Furthermore,
$$
\begin{aligned}
E\Phi^2(t) \le\;& 2E\Big\{\int_0^t(A+B\Lambda_1(s)+b_1)\Phi(s)\,ds\Big\}^2 + \frac{2}{N^2}E\Big\{\int_0^t\sum_{j=1}^N L_j^2(s)\,ds\Big\} + 2E\Big\{\int_0^t\frac1N\sum_{k=1}^N F\int_{E_k}G_k(de\,ds)\Big\}^2\\
\le\;& 2E\int_0^T\Big[T\big|(A+B\Lambda_1(s)+b_1)\Phi(s)\big|^2 + \frac1N\max_{1\le k\le N}|L_k(s)|^2\Big]ds + \frac{2}{N^2}E\int_0^t\sum_{k=1}^N\int_{E_k}|F|^2\,\pi_k(de)\,ds.
\end{aligned}
$$
By Gronwall's inequality, we have
$$
\sup_{0\le t\le T}E\big|\tilde x^{(N)}(t)-x^{(0)}(t)\big|^2 = O\Big(\frac1N\Big).
$$
Then, the proof is complete.
Similarly to the proof of Proposition 3.4, we can obtain the following result.
Proposition 4.2. For any $u_0(\cdot)\in\mathcal{U}^{c,0}_{ad}$, we have
$$
\big|J_0(u_0,\bar u_{-0})-\tilde J_0(u_0)\big| = O\Big(\frac{1}{\sqrt N}\Big).
$$
Now, let us consider the following case: a given minor agent $\mathcal{A}_i$ takes an alternative control strategy $u_i(\cdot)\in\mathcal{U}^{c,i}_{ad}$, the major agent uses the control strategy $\bar u_0(\cdot)$ defined by (15), and the other minor agents $\mathcal{A}_j$, $j\ne i$, $1\le j\le N$, take the control strategies $\bar u_j(\cdot)$ defined by (16). Then the dynamics of the agents with the given minor agent's perturbation can be written as
$$
\begin{cases}
d\hat x_0(t) = \big[(A_0+B_0\Lambda_0(t))\hat x_0(t)+B_0\Theta_0(t)+b_0\hat x^{(N)}(t)+f_0(t)\big]dt + \big[(C_0+D_0\Lambda_0(t))\hat x_0(t)+D_0\Theta_0(t)+l_0\hat x^{(N)}(t)+\sigma_0(t)\big]dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
d\hat x_i(t) = [A\hat x_i(t)+Bu_i(t)+b_1\hat x^{(N)}(t)+f(t)]\,dt + [C\hat x_i(t)+Du_i(t)+b_2\hat x^{(N)}(t)+H\hat x_0(t)+\sigma(t)]\,dW_i(t) + F\displaystyle\int_{E_i}G_i(de\,dt),\\
d\hat x_j(t) = \big[(A+B\Lambda_1(t))\hat x_j(t)+B\bar\Theta_1(t)+b_1\hat x^{(N)}(t)+f(t)\big]dt + \big[(C+D\Lambda_1(t))\hat x_j(t)+D\bar\Theta_1(t)+b_2\hat x^{(N)}(t)+H\hat x_0(t)+\sigma(t)\big]dW_j(t) + F\displaystyle\int_{E_j}G_j(de\,dt),\\
\hat x_0(0)=a_0,\quad \hat x_i(0)=a_i,\quad \hat x_j(0)=a_j,\ j=1,2,\cdots,N,\ j\ne i,
\end{cases}\tag{18}
$$
where $\hat x^{(N)}(t)=\frac1N\sum_{k=1}^N\hat x_k(t)$. The cost functional is
$$
J_i(u_i,\bar u_{-i}) = \frac12 E\Big\{\int_0^T\big[\big\langle Q\big(\hat x_i(t)-\beta_1\hat x^{(N)}(t)-\beta_2\hat x_0(t)\big),\,\hat x_i(t)-\beta_1\hat x^{(N)}(t)-\beta_2\hat x_0(t)\big\rangle + \langle Ru_i(t),u_i(t)\rangle\big]dt + \langle M\hat x_i(T),\hat x_i(T)\rangle\Big\}.
$$
The corresponding limiting system with the minor agent's perturbation strategy is
$$
\begin{cases}
d\hat y_0(t) = \big[(A_0+B_0\Lambda_0(t))\hat y_0(t)+B_0\Theta_0(t)+b_0x^{(0)}(t)+f_0(t)\big]dt + \big[(C_0+D_0\Lambda_0(t))\hat y_0(t)+D_0\Theta_0(t)+l_0x^{(0)}(t)+\sigma_0(t)\big]dW_0(t) + F_0\displaystyle\int_{E_0}G_0(de\,dt),\\
d\hat y_i(t) = [A\hat y_i(t)+Bu_i(t)+b_1x^{(0)}(t)+f(t)]\,dt + [C\hat y_i(t)+Du_i(t)+b_2x^{(0)}(t)+H\hat y_0(t)+\sigma(t)]\,dW_i(t) + F\displaystyle\int_{E_i}G_i(de\,dt),\\
d\hat y_j(t) = \big[(A+B\Lambda_1(t))\hat y_j(t)+B\bar\Theta_1(t)+b_1x^{(0)}(t)+f(t)\big]dt + \big[(C+D\Lambda_1(t))\hat y_j(t)+D\bar\Theta_1(t)+b_2x^{(0)}(t)+H\hat y_0(t)+\sigma(t)\big]dW_j(t) + F\displaystyle\int_{E_j}G_j(de\,dt),\\
\hat y_0(0)=a_0,\quad \hat y_i(0)=a_i,\quad \hat y_j(0)=a_j,\ j=1,2,\cdots,N,\ j\ne i.
\end{cases}
$$
The cost functional is
$$
\tilde J_i(u_i) = \frac12 E\Big\{\int_0^T\big[\big\langle Q\big(\hat y_i(t)-\beta_1x^{(0)}(t)-\beta_2\hat y_0(t)\big),\,\hat y_i(t)-\beta_1x^{(0)}(t)-\beta_2\hat y_0(t)\big\rangle + \langle Ru_i(t),u_i(t)\rangle\big]dt + \langle M\hat y_i(T),\hat y_i(T)\rangle\Big\}.
$$
Now, we are in a position to state the following approximation results.
Proposition 4.3. For the fixed $i$, we have
$$
\begin{aligned}
&(i)\ \sup_{0\le t\le T}E\big|\hat x^{(N)}(t)-x^{(0)}(t)\big|^2 = O\Big(\tfrac1N\Big), &&(ii)\ \sup_{0\le t\le T}E\Big|\,|\hat x^{(N)}(t)|^2-|x^{(0)}(t)|^2\,\Big| = O\Big(\tfrac{1}{\sqrt N}\Big),\\
&(iii)\ \sup_{0\le t\le T}E\big|\hat x_i(t)-\hat y_i(t)\big|^2 = O\Big(\tfrac1N\Big), &&(iv)\ \sup_{0\le t\le T}E\Big|\,|\hat x_i(t)|^2-|\hat y_i(t)|^2\,\Big| = O\Big(\tfrac{1}{\sqrt N}\Big).
\end{aligned}
$$
Proof. We prove only the first estimate; the other three can be obtained by a proof similar to that of Proposition 3.3.
Define $\tilde z(t):=\hat x^{(N)}(t)-x^{(0)}(t)$. According to (10) and (18), we get
$$
\begin{cases}
d\tilde z(t) = (A+B\Lambda_1(t)+b_1)\tilde z(t)\,dt + S(t)\,dt + dL(t) + \frac1N\displaystyle\sum_{k=1}^N F\int_{E_k}G_k(de\,dt),\\
\tilde z(0)=0,
\end{cases}
$$
where
$$
\begin{aligned}
S(t) &= \frac BN\big[u_i(t)-\Lambda_1(t)\hat x_i(t)-\bar\Theta_1(t)\big],\\
L(t) &= \frac1N\sum_{k=1,k\ne i}^N\int_0^t\big[(C+D\Lambda_1(r))\hat x_k(r)+D\bar\Theta_1(r)+b_2\hat x^{(N)}(r)+H\hat x_0(r)+\sigma(r)\big]dW_k(r) + \frac1N\int_0^t\big[C\hat x_i(r)+Du_i(r)+b_2\hat x^{(N)}(r)+H\hat x_0(r)+\sigma(r)\big]dW_i(r).
\end{aligned}
$$
Since
$$
\int_0^t E|S(r)|^2dr \le \frac{3B^2}{N^2}\Big(\int_0^t E[u_i^2(r)]\,dr + \int_0^t E[\Lambda_1^2(r)\hat x_i^2(r)]\,dr + \int_0^t E[\bar\Theta_1^2(r)]\,dr\Big),
$$
we get
$$
\int_0^t E|S(r)|^2dr = O\Big(\frac{1}{N^2}\Big).\tag{19}
$$
Note that
$$
\begin{aligned}
V(t) :=\;& E\int_0^t(dL(r))^2\\
=\;& \frac{1}{N^2}\sum_{k=1,k\ne i}^N\int_0^t E\big|(C+D\Lambda_1(r))\hat x_k(r)+D\bar\Theta_1(r)+b_2\hat x^{(N)}(r)+H\hat x_0(r)+\sigma(r)\big|^2dr + \frac{1}{N^2}\int_0^t E\big|C\hat x_i(r)+Du_i(r)+b_2\hat x^{(N)}(r)+H\hat x_0(r)+\sigma(r)\big|^2dr\\
\le\;& \frac TN\max_{1\le k\le N}\sup_{0\le t\le T}E\big|(C+D\Lambda_1(t))\hat x_k(t)+D\bar\Theta_1(t)+b_2\hat x^{(N)}(t)+H\hat x_0(t)+\sigma(t)\big|^2 + \frac{T}{N^2}\sup_{0\le t\le T}E\big|C\hat x_i(t)+Du_i(t)+b_2\hat x^{(N)}(t)+H\hat x_0(t)+\sigma(t)\big|^2.
\end{aligned}
$$
Thus
$$
V(t) = O\Big(\frac1N\Big).\tag{20}
$$
Applying Itô's formula to $\tilde z^2(t)$, we obtain
$$
\begin{aligned}
E[\tilde z^2(t)] =\;& 2\int_0^t(A+B\Lambda_1(r)+b_1)E[\tilde z^2(r)]\,dr + 2\int_0^t E[\tilde z(r)S(r)]\,dr + V(t) + \frac{F^2}{N^2}\sum_{i=1}^N E\int_0^t\!\int_{E_i}\pi_i(de)\,dr\\
\le\;& \sup_{0\le t\le T}\big(|2A+2B\Lambda_1(t)+2b_1|+1\big)\int_0^t E[\tilde z^2(r)]\,dr + \int_0^t E[S^2(r)]\,dr + V(t) + \frac{F^2}{N}\max_{1\le i\le N}E\int_0^t\!\int_{E_i}\pi_i(de)\,dr.
\end{aligned}
$$
Combining (19) and (20) with Gronwall's inequality, we get
$$
\sup_{0\le t\le T}E\big|\hat x^{(N)}(t)-x^{(0)}(t)\big|^2 = O\Big(\frac1N\Big).
$$
This completes the proof.
By using similar arguments as in Proposition 3.4, we can obtain the following conclusion.
Proposition 4.4. For any $u_i(\cdot)\in\mathcal{U}^{c,i}_{ad}$, $1\le i\le N$, one has
$$
\big|J_i(u_i,\bar u_{-i})-\tilde J_i(u_i)\big| = O\Big(\frac{1}{\sqrt N}\Big).\tag{21}
$$
In this subsection, we will verify the ϵ-Nash equilibrium property of the decentralized control strategies (15) and (16).
Before presenting the main result, we give the definition of ϵ-Nash equilibrium in the following manner.
Definition 4.5. A set of control strategies $\bar u=(\bar u_0,\bar u_1,\cdots,\bar u_N)$, where $\bar u_i(\cdot)\in\mathcal{U}^{c,i}_{ad}$, $i=0,1,\cdots,N$, is called an ϵ-Nash equilibrium with respect to the costs $J_i$, $i=0,1,\cdots,N$, if there exists an $\epsilon\ge 0$ such that for any $i=0,1,\cdots,N$, we have
$$
J_i(\bar u_i,\bar u_{-i}) \le J_i(u_i,\bar u_{-i})+\epsilon\tag{22}
$$
whenever an alternative strategy $u_i(\cdot)\in\mathcal{U}^{c,i}_{ad}$ is applied by agent $\mathcal{A}_i$.
Based on the above results, we obtain the following main result.
Theorem 4.6. Suppose that $\bar x_i(\cdot)$, $i=0,1,\cdots,N$, is the solution of the systems (8) and (9). Then the set of control strategies $\bar u=(\bar u_0,\bar u_1,\cdots,\bar u_N)$ defined by (15) and (16) is an ϵ-Nash equilibrium of Problem (LP), where $\epsilon=O(1/\sqrt N)\to 0$ as $N\to+\infty$.
Proof. Combining Propositions 3.4 and 4.2 with Proposition 4.4, and using the optimality of $\bar v_i$ for the limiting problems (LM1) and (LM2) (Theorems 3.1 and 3.2), we obtain
$$
J_i(\bar u_i,\bar u_{-i}) = \tilde J_i(\bar v_i)+O\Big(\tfrac{1}{\sqrt N}\Big) \le \tilde J_i(u_i)+O\Big(\tfrac{1}{\sqrt N}\Big) \le J_i(u_i,\bar u_{-i})+O\Big(\tfrac{1}{\sqrt N}\Big),\quad i=0,1,\cdots,N.
$$
Therefore, the conclusion holds with $\epsilon=O(1/\sqrt N)$.
This section demonstrates the consistency of the mean-field estimation as well as the influence of the population's collective behavior $\bar x^{(N)}(\cdot)$ on the state trajectories of the agents through a numerical example.
Consider a mean-field game system with one major agent and $N=500$ minor agents. For any $u_j\in\mathcal{U}^{c,j}_{ad}$, $j=0,1,\cdots,N$, the dynamics of the major agent and the minor agents are given by
$$
\begin{cases}
dx_0(t) = \big(\tfrac12 x_0(t)+u_0(t)+x^{(N)}(t)\big)dt + \big(x_0(t)+u_0(t)+x^{(N)}(t)\big)dW_0(t) + 2\displaystyle\int_{E_0}G_0(de\,dt),\\
dx_i(t) = \big(3x_i(t)+5u_i(t)+x^{(N)}(t)\big)dt + \big(2x_i(t)+u_i(t)+x^{(N)}(t)+x_0(t)\big)dW_i(t) + \displaystyle\int_{E_i}G_i(de\,dt),\\
x_0(0)=5,\quad x_i(0)=a_i,\ i=1,\cdots,N,
\end{cases}\tag{23}
$$
where $t\in[0,T]$ with $T=1$. The initial states $\{a_i,\ i=1,\cdots,N\}$ of the minor agents are independent and identically distributed random variables with normal distribution $\mathcal{N}(-5,1)$.
The cost functional of the major agent A0 is
$$
J_0(u_0,u_{-0}) = \frac12 E\Big\{\int_0^T\big[3\big(x_0(t)-x^{(500)}(t)\big)^2+u_0^2(t)\big]dt + 3x_0^2(1)\Big\},\tag{24}
$$
and the cost functional of the minor agent $\mathcal{A}_i$, $i=1,\cdots,500$, is
$$
J_i(u_i,u_{-i}) = \frac12 E\Big\{\int_0^T\big[2\big(x_i(t)-x^{(500)}(t)-x_0(t)\big)^2+u_i^2(t)\big]dt + x_i^2(1)\Big\}.\tag{25}
$$
It is easy to check that $P_0(t)\equiv 3$, $t\in[0,1]$, is the unique solution of the following Riccati equation:
$$
\begin{cases}
\dot P_0(t)+2P_0(t)-4P_0^2(t)\big(1+P_0(t)\big)^{-1}+3=0,\\
P_0(1)=3.
\end{cases}
$$
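Indeed, substituting $P_0(t)\equiv 3$ into the left-hand side gives
$$
\dot P_0(t)+2P_0(t)-\frac{4P_0^2(t)}{1+P_0(t)}+3 = 0+6-\frac{36}{4}+3 = 0,
$$
and the terminal condition $P_0(1)=3$ holds trivially.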
Suppose that $P_1(\cdot)$ fulfills
$$
\begin{cases}
\dot P_1(t)+10P_1(t)-49P_1^2(t)\big(1+P_1(t)\big)^{-1}+2=0,\\
P_1(1)=1.
\end{cases}
$$
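Unlike $P_0$, $P_1$ admits no obvious constant solution. As a numerical sketch (assuming SciPy is available), one can integrate the equation backward from the terminal condition $P_1(1)=1$:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Backward integration of the Riccati equation for P1:
#   dP1/dt = -10 P1 + 49 P1^2 / (1 + P1) - 2,   P1(1) = 1,
# integrated from t = 1 down to t = 0.
def rhs(t, p):
    return -10.0 * p + 49.0 * p ** 2 / (1.0 + p) - 2.0

sol = solve_ivp(rhs, t_span=(1.0, 0.0), y0=[1.0], dense_output=True,
                rtol=1e-10, atol=1e-12)
ts = np.linspace(0.0, 1.0, 5)
print(np.column_stack([ts, sol.sol(ts)[0]]))   # P1 stays positive on [0, 1]
```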
Then the NCE equation (11) becomes
$$
\begin{cases}
d\bar y_0(t) = \big[-\bar y_0(t)+\tfrac14 x^{(0)}(t)\big]dt + \big[-\tfrac12\bar y_0(t)+\tfrac14 x^{(0)}(t)\big]dW_0(t) + 2\displaystyle\int_{E_0}G_0(de\,dt),\\
\dot x^{(0)}(t) = \big(4+40P(t)\big)x^{(0)}(t) - 25\big(1+P_1(t)\big)^{-1}\eta_1(t) + 5P(t)\bar y_0(t),\\
-\dot\eta_1(t) = \big(3+35P(t)\big)\eta_1(t) + P_1(t)x^{(0)}(t) + \big(2P_1(t)+7P(t)P_1(t)-2\big)\big(\bar y_0(t)+x^{(0)}(t)\big),\\
\bar y_0(0)=a_0,\quad \eta_1(T)=0,\quad x^{(0)}(0)=\frac1N\sum_{j=1}^N a_j,\quad \eta_0(t)\equiv 0,\ t\in[0,T],
\end{cases}\tag{26}
$$
where $P(t)=-\big(1+P_1(t)\big)^{-1}P_1(t)$.
According to Theorem 4.6, the set of control strategies $\bar u=(\bar u_0,\bar u_1,\cdots,\bar u_N)$ defined by
$$
\begin{aligned}
\bar u_0(t) &= -\tfrac32\bar x_0(t)-\tfrac34 x^{(0)}(t),\\
\bar u_i(t) &= P(t)\big(7\bar x_i(t)+x^{(0)}(t)+\bar y_0(t)\big) - 5\big(1+P_1(t)\big)^{-1}\eta_1(t),\quad i=1,2,\cdots,N,
\end{aligned}
$$
is an ϵ-Nash equilibrium for the game given by the dynamics (23) and the cost functionals (24) and (25), where $\bar x_0(\cdot)$ and $\bar x_i(\cdot)$ satisfy
$$
\begin{cases}
d\bar x_0(t) = \big(-\bar x_0(t)-\tfrac34 x^{(0)}(t)+\bar x^{(500)}(t)\big)dt + \big(-\tfrac12\bar x_0(t)-\tfrac34 x^{(0)}(t)+\bar x^{(500)}(t)\big)dW_0(t) + 2\displaystyle\int_{E_0}G_0(de\,dt),\\
d\bar x_i(t) = \big[\big(3+35P(t)\big)\bar x_i(t)+5P(t)x^{(0)}(t)+5P(t)\bar y_0(t)+\bar x^{(N)}(t)-25\big(1+P_1(t)\big)^{-1}\eta_1(t)\big]dt\\
\qquad\quad + \big[\big(2+7P(t)\big)\bar x_i(t)+P(t)x^{(0)}(t)+P(t)\bar y_0(t)+\bar x^{(N)}(t)+\bar x_0(t)-5\big(1+P_1(t)\big)^{-1}\eta_1(t)\big]dW_i(t) + \displaystyle\int_{E_i}G_i(de\,dt),\\
\bar x_0(0)=a_0,\quad \bar x_i(0)=a_i,\ i=1,\cdots,N,
\end{cases}\tag{27}
$$
where $\bar x^{(500)}(t)=\frac{1}{500}\sum_{j=1}^{500}\bar x_j(t)$.
In this example, Merton's jump model (see Merton [29], as well as Platen and Bruti-Liberati [30, pg. 37]) is applied to describe the jump-diffusion processes. Assume that $\int_{E_0}G_0(de\,dt)=Q_0(\mu_0,\sigma_0)\,d\Pi_0(\lambda_0)$, where the jump size $Q_0(\mu_0,\sigma_0)$ is normally distributed with mean $\mu_0\sim\mathcal{N}(2,1)$ and standard deviation $\sigma_0=0.1$, and the Poisson process $\Pi_0(\lambda_0)$ has jump intensity $\lambda_0=2$. For agent $\mathcal{A}_i$, $i=1,\cdots,500$, let $\int_{E_i}G_i(de\,dt)=Q_i(\mu_1,\sigma_1)\,d\Pi_i(\lambda)$, where the jump size $Q_i(\mu_1,\sigma_1)$ is normally distributed with mean $\mu_1\sim\mathcal{N}(1,1)$ and standard deviation $\sigma_1=0.05$, and the Poisson process $\Pi_i(\lambda)$ has jump intensity $\lambda=5$.
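As a sketch of how such a jump term can be simulated, the following Euler-Maruyama loop generates one path of a Merton-type jump-diffusion with the major agent's jump specification ($\lambda_0=2$, jump sizes $\mathcal{N}(2,0.1^2)$). The drift and diffusion coefficients alpha and beta below are placeholder values, not the closed-loop coefficients of (27); the compensated version of the jump integral would subtract $\lambda_0\mu_0\,dt$ at each step.

```python
import numpy as np

rng = np.random.default_rng(2)

# Euler-Maruyama for a Merton-type jump-diffusion on [0, 1]:
#   dx = alpha x dt + beta x dW + 2 dJ,  J = compound Poisson(lam0) with N(mu0, sig0^2) marks.
T, K = 1.0, 1000
dt = T / K
lam0, mu0, sig0 = 2.0, 2.0, 0.1
alpha, beta = 0.5, 1.0            # toy linear drift/diffusion coefficients (placeholders)

x = np.zeros(K + 1)
x[0] = 5.0                        # x_0(0) = 5 as in (23)
for k in range(K):
    dW = rng.normal(0.0, np.sqrt(dt))
    n = rng.poisson(lam0 * dt)    # number of jumps in (t_k, t_{k+1}]
    dJ = rng.normal(mu0, sig0, size=n).sum()
    x[k + 1] = x[k] + alpha * x[k] * dt + beta * x[k] * dW + 2.0 * dJ
print("x(T) =", x[-1])
```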
Figure 1 shows the consistency of the mean-field estimation and the interactive influence between the mean-field term $\bar x^{(500)}(\cdot)$ and the major agent's state $\bar x_0(\cdot)$. With $N=500$ minor agents, as shown in Figure 1, the curves of $\bar x^{(500)}(\cdot)$ and $x^{(0)}(\cdot)$ coincide well, which illustrates the consistency of the mean-field estimation indicated by Proposition 3.3.
Figure 2 illustrates the state trajectories of the major agent and all the minor agents. As shown in Figure 2, for each fixed $i$, the trajectory $\bar x_i(\cdot)$ of $\mathcal{A}_i$ is influenced not only by its own initial value and parameters but also by the major agent and the collective behavior of all the minor agents.
To illustrate how the key parameters in the control strategies (15) and (16) influence the system's dynamic behavior, we choose another set of initial values for the $N+1$ agents, with $x_0(0)=-5$ and independent and identically distributed initial states $\{a_i\sim\mathcal{N}(5,1),\ i=1,\cdots,500\}$. Figures 3 and 4 show the consistency of the mean-field estimation and the trajectories of $\bar x_i$, $i=0,1,2,\cdots,500$, in this setting.
Motivated by the lack of relevant theory and by some practical applications, this paper is concerned with linear-quadratic-Gaussian mean-field games involving mixed agents of a stochastic large-population system with random jumps. There are two types of agents: (i) a major agent and (ii) a population of N minor agents, where N is very large. The coupling between the major and minor agents appears in both their state dynamics and their individual cost functionals. To deal with the dimensionality difficulty and obtain decentralized strategies, the NCE methodology is applied to yield a set of decentralized strategies, which is verified to be an ϵ-Nash equilibrium. We provide numerical examples to illustrate both the consistency of the mean-field estimation and the impact of the population's collective behavior. In the future, an interesting research direction is to extend the modeling and analysis to the social optima case, which may admit more applications in practice and generate more challenges in theory. Another potential direction is to study the uniqueness of the equilibrium strategy, which may be more valuable and challenging.
Conceptualization and methodology, R. X.; writing-original draft, review and editing, K. D., J. Z. and Y. Z.; supervision, R. X. All authors have read and agreed to the published version of the manuscript.
The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.
This research is partially supported by the Natural Science Foundation of Shandong Province of China (Grant no. ZR2020MA031, ZR2021MA049), Qilu University of Technology (Shandong Academy of Sciences) Major Innovation Project of Science, Education and Industry Integration Pilot Project (2024ZDZX11), National Natural Science Foundation of China (11971266), and the Colleges and Universities Twenty Terms Foundation of Jinan City (2021GXRC100).
All authors declare no conflicts of interest.
[1] C. Benazzoli, L. Campi, L. Di Persio, ϵ-Nash equilibrium in stochastic differential games with mean-field interaction and controlled jumps, Stat. Probabil. Lett., 154 (2019), 108522. https://doi.org/10.1016/j.spl.2019.05.021
[2] R. Carmona, F. Delarue, Probabilistic analysis of mean-field games, SIAM J. Control Optim., 51 (2013), 2705–2734. https://doi.org/10.1137/120883499
[3] Y. Hu, J. Huang, T. Nie, Linear-quadratic-Gaussian mixed mean-field games with heterogeneous input constraints, SIAM J. Control Optim., 56 (2018), 2835–2877. https://doi.org/10.1137/17M1151420
[4] M. Huang, P. E. Caines, R. P. Malhamé, Social optima in mean field LQG control: centralized and decentralized strategies, IEEE T. Automat. Contr., 57 (2012), 1736–1751. https://doi.org/10.1109/TAC.2012.2183439
[5] J. Huang, Z. Qiu, S. Wang, Z. Wu, A unified relation analysis of linear-quadratic mean-field game, team, and control, IEEE T. Automat. Contr., 69 (2024), 3325–3332. https://doi.org/10.1109/TAC.2023.3323576
[6] B. Wang, J. Zhang, Social optima in mean field linear-quadratic-Gaussian models with Markov jump parameters, SIAM J. Control Optim., 55 (2017), 429–456. https://doi.org/10.1137/15M104178X
[7] H. Wang, R. Xu, Time-inconsistent LQ games for large-population systems and applications, J. Optim. Theory Appl., 197 (2023), 1249–1268. https://doi.org/10.1007/s10957-023-02223-2
[8] R. Xu, F. Zhang, ϵ-Nash mean-field games for general linear-quadratic systems with applications, Automatica, 114 (2020), 108835. https://doi.org/10.1016/j.automatica.2020.108835
[9] H. Yuan, Q. Zhu, The well-posedness and stabilities of mean-field stochastic differential equations driven by G-Brownian motion, SIAM J. Control Optim., 63 (2025), 596–624. https://doi.org/10.1137/23M1593681
[10] H. Yuan, Q. Zhu, The stabilities of delay stochastic McKean-Vlasov equations in the G-framework, Sci. China Inf. Sci., 68 (2025), 112203. https://doi.org/10.1007/s11432-024-4075-2
[11] M. Huang, P. E. Caines, R. P. Malhamé, Large-population cost-coupled LQG problems with nonuniform agents: individual-mass behavior and decentralized ϵ-Nash equilibria, IEEE T. Automat. Contr., 52 (2007), 1560–1571. https://doi.org/10.1109/TAC.2007.904450
[12] M. Huang, Large-population LQG games involving a major player: the Nash certainty equivalence principle, SIAM J. Control Optim., 48 (2010), 3318–3353. https://doi.org/10.1137/080735370
[13] M. Nourian, P. E. Caines, ϵ-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents, SIAM J. Control Optim., 51 (2013), 3302–3331. https://doi.org/10.1137/120889496
[14] J. M. Lasry, P. L. Lions, Jeux à champ moyen. I – Le cas stationnaire, C. R. Math., 343 (2006), 619–625. https://doi.org/10.1016/j.crma.2006.09.019
[15] J. M. Lasry, P. L. Lions, Jeux à champ moyen. II – Horizon fini et contrôle optimal, C. R. Math., 343 (2006), 679–684. https://doi.org/10.1016/j.crma.2006.09.018
[16] J. M. Lasry, P. L. Lions, Mean field games, Jpn. J. Math., 2 (2007), 229–260. https://doi.org/10.1007/s11537-007-0657-8
[17] A. Bensoussan, J. Frehse, P. Yam, Mean field games and mean field type control theory, New York: Springer, 2013. https://doi.org/10.1007/978-1-4614-8508-7
[18] J. Huang, S. Wang, Z. Wu, Backward mean-field linear-quadratic-Gaussian (LQG) games: full and partial information, IEEE T. Automat. Contr., 61 (2016), 3784–3796. https://doi.org/10.1109/TAC.2016.2519501
[19] T. Nie, S. Wang, Z. Wu, Linear-quadratic delayed mean-field social optimization, Appl. Math. Optim., 89 (2024), 4. https://doi.org/10.1007/s00245-023-10067-5
[20] R. Xu, J. Shi, ϵ-Nash mean-field games for linear-quadratic systems with random jumps and applications, Int. J. Control, 94 (2021), 1415–1425. https://doi.org/10.1080/00207179.2019.1651940
[21] R. Xu, T. Wu, Risk-sensitive large-population linear-quadratic-Gaussian games with major and minor agents, Asian J. Control, 25 (2023), 4391–4403. https://doi.org/10.1002/asjc.3106
[22] H. Wang, R. Xu, Time-inconsistent large-population linear-quadratic games with major and minor agents, Int. J. Control, 2025. https://doi.org/10.1080/00207179.2025.2491823
[23] B. Düring, P. Markowich, J. F. Pietschmann, M. T. Wolfram, Boltzmann and Fokker-Planck equations modelling opinion formation in the presence of strong leaders, P. Roy. Soc. A Math. Phy., 465 (2009), 3687–3708. https://doi.org/10.1098/rspa.2009.0239
[24] Z. Ma, D. S. Callaway, I. A. Hiskens, Decentralized charging control of large populations of plug-in electric vehicles, IEEE T. Control Syst. Technol., 21 (2011), 67–78. https://doi.org/10.1109/TCST.2011.2174059
[25] J. Shi, Z. Wu, Maximum principle for forward-backward stochastic control system with random jumps and applications to finance, J. Syst. Sci. Complex., 23 (2010), 219–231. https://doi.org/10.1007/s11424-010-7224-8
[26] J. Shi, Z. Wu, A risk-sensitive stochastic maximum principle for optimal control of jump diffusions and its applications, Acta Math. Sci., 31 (2011), 419–433. https://doi.org/10.1016/S0252-9602(11)60242-7
[27] R. Cont, P. Tankov, Financial modelling with jump processes, Chapman and Hall/CRC, 2003.
[28] W. He, Y. Wang, Distributed optimal variational GNE seeking in merely monotone games, IEEE/CAA J. Automatic., 11 (2024), 1621–1630. https://doi.org/10.1109/JAS.2024.124284
[29] R. Merton, Option pricing when underlying stock returns are discontinuous, J. Financ. Econ., 3 (1976), 125–144. https://doi.org/10.1016/0304-405X(76)90022-2
[30] E. Platen, N. Bruti-Liberati, Numerical solution of stochastic differential equations with jumps in finance, Berlin: Springer-Verlag, 2010. https://doi.org/10.1007/978-3-642-13694-8