Citation: Francesco Cordoni, Luca Di Persio. A maximum principle for a stochastic control problem with multiple random terminal times[J]. Mathematics in Engineering, 2020, 2(3): 557-583. doi: 10.3934/mine.2020025
In recent decades, stochastic optimal control theory has received increasing attention from the mathematical community, also in connection with several concrete applications, ranging from industry to finance and from biology to crowd dynamics. In all of these applications, a rigorous theory of stochastic optimal control (SOC), under suitable assumptions on the source of random noise, has proved to be fundamental.
To this end, different theoretical approaches have been developed. They can be broadly divided into two classes: partial differential equation (PDE) methods via the Hamilton-Jacobi-Bellman (HJB) equation, and methods based on the maximum principle via backward stochastic differential equations (BSDEs), see, e.g., [12,24,27].
In particular, BSDE methods have proved to be well suited to a large set of SOC problems, as reported, e.g., in [25]. Among these, a particular role is played by SOC problems characterized by a random terminal time. This is a classical task in finance, at least since the recent credit crunch, which imposed the need to model possible defaults and credit risks. When dealing with optimal control with random terminal time, two main approaches are possible. The first considers the random terminal time to be completely inaccessible to the reference filtration. The related classical approach consists in enlarging the reference filtration, see, e.g., [20]. In this way, via a suitable density assumption on the conditional law of the random time, the original problem is converted into a control problem with fixed terminal time with respect to the new enlarged filtration; see, e.g., [11,23] for more theoretical insights and [2,3,5,18] for some concrete applications.
A second, alternative approach assumes that the stopping times are accessible from the reference filtration, hence implying perfect information about the triggered random times. The typical assumption in this setting is that the stopping time τ is defined as the first hitting time of a barrier v for a reference system whose dynamics are given by a stochastic differential equation (SDE). In a credit risk setting, such an approach is known as the structural approach, and it has a long-standing financial literature whose first results date back to [21]. It is worth stressing that this last scenario does not reduce to the previous one, where inaccessible stopping times are considered. In fact, if the stopping time is defined as a first hitting time, it does not satisfy the above-mentioned density hypothesis.
The present paper investigates a SOC problem with multiple random events of the latter type. Therefore, differently from [18,23], we will not assume random events to be totally inaccessible; instead, they will be defined as first hitting times, against a predetermined boundary, of the driving process.
In particular, we will consider a controlled system of n∈N SDEs of the general form
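$$
\begin{cases}
dX^i(t) = \mu^i(t, X^i(t), \alpha^i(t))\,dt + \sigma^i(t, X^i(t), \alpha^i(t))\,dW^i(t), \quad i = 1, \dots, n,\\
X^i(0) = x^i,
\end{cases}
\tag{1}
$$
under standard assumptions of Lipschitz continuity and at most linear growth on the coefficients $\mu^i$ and $\sigma^i$, where $\alpha^i$ denotes the control. The notation will be specified in detail in the subsequent sections.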
We aim at minimizing the following functional up to a given stopping time τ,
$$
J(x,\alpha) = \mathbb{E}\int_0^{\tau} L(t, X(t), \alpha(t))\,dt + G(\tau, X(\tau)),
$$
for some suitable functions L and G, where we have denoted by X(t)=(X1(t),…,Xn(t)) and α(t)=(α1(t),…,αn(t)).
Then we assume that the system, instead of being stopped as soon as the stopping time τ is triggered, continues to evolve according to a new system of SDEs written as follows
$$
\begin{cases}
dX^i_1(t) = \mu^i_1(t, X^i_1(t), \alpha^i_1(t))\,dt + \sigma^i_1(t, X^i_1(t), \alpha^i_1(t))\,dW^i(t), \quad i = 1, \dots, n-1,\\
X^i_1(\tau) = x^i_1,
\end{cases}
$$
for some new coefficients $\mu^i_1$ and $\sigma^i_1$, again satisfying standard assumptions of linear growth and Lipschitz continuity. In particular, we will assume that, depending on which stopping time has been triggered, the k-th component in Eq (1) has been set to 0, according to rigorous definitions specified later. Then, we again aim at minimizing a functional of the form
$$
J_1(x_1,\alpha) = \mathbb{E}\int_{\tau}^{\tau_1} L_1(t, X_1(t), \alpha_1(t))\,dt + G_1(\tau_1, X_1(\tau_1)),
$$
with the same notation used before, $\tau_1$ being a new stopping time. We repeat such a scheme for a series of n stopping times. Moreover, in complete generality, we assume that the order of the random times is not known a priori, hence forcing us to consider all possible combinations of random events together with all the associated combinations of driving SDEs.
The main result of the present paper consists in deriving a stochastic maximum principle, in both necessary and sufficient form, for the whole series of control problems stated above. To the best of our knowledge, this is the first work that derives a maximum principle for a series of interconnected optimal control problems. The maximum principle can in turn help in deriving a closed-loop optimal control. This will be shown later in the paper for the relevant case of a linear–quadratic optimal control problem whose closed-form solution is of affine form.
Clearly, we cannot expect the global optimal solution to be obtained by gluing together the optimal controls between each pair of consecutive stopping times. Instead, we will tackle the problem following a dynamic programming principle approach, as exploited, e.g., in [23]. In particular, we will solve the problem backward. Therefore, the case in which all stopping times but one have been triggered is considered first, then we consider the problem with two random events left, and so on, until the very first control problem. Following this scheme, we are able to provide the global optimal solution recursively, so that the k-th optimal control problem depends on the (k+1)-th optimal solution. We remark that, although the backward approach has been used in the literature, see, e.g., [18,23], to the best of our knowledge the present work is the first one using such techniques where stopping times are defined as hitting times.
After having derived the main result, i.e., the aforementioned maximum principle, we will consider the particular case of a linear–quadratic control problem, that is, we assume the underlying dynamics to be linear in both the state variable and the control, with quadratic costs to be minimized. Problems of this type have been widely studied from both a theoretical and a practical point of view, since they often allow one to obtain a closed-form solution for the optimal control.
In particular, usually one can write the solution to a linear–quadratic control problem in terms of the solution of a Riccati backward ordinary differential equation (ODE), hence reducing the original linear–quadratic stochastic control problem to the solution of a simpler ODE, see, e.g., [27] and [24,Section 6.6], for possible financial applications. Let us recall that, considering either random coefficients for the driving equation or random terminal time in the control problem, the latter case being the one here treated, the backward Riccati ODE becomes a Riccati BSDE, see, e.g., [13,14,16,17].
We stress that the results derived in the present paper find natural applications in many areas related to mathematical finance, mainly in connection with systemic risk, where, after the recent credit crisis, the assumption of possible failures has become the main ingredient of many robust financial models. Also, network models have received increasing mathematical attention in recent years, as witnessed by the development of several ad hoc techniques to treat general dynamics on networks. We refer the interested reader to [6,7,9] for general results on network models, and to [15] for a financially oriented treatment.
In particular, these models have proved to be particularly suitable for describing a system of interconnected banks. Following the approach of [4,10,19], the results derived in the present work can be applied to a system of n interconnected banks, lending and borrowing money. As in [4,8], one can assume the presence of an external controller, typically called the lender of last resort (LOLR), who actively supervises the banks' system and possibly lends money to actors in need. A standard assumption is that the LOLR lends money in order to optimize a given quadratic functional. Therefore, modelling the system as in [8], we recover a linear–quadratic setting, allowing us to apply the results obtained in the present work.
The paper is organized as follows: in Section 2 we introduce the general setting, clarifying the main assumptions; then, Section 2.1 is devoted to the proof of the necessary maximum principle, whereas in Section 2.2 we prove the sufficient maximum principle; finally, in Section 3, we apply the previous results to the case of a linear–quadratic control problem, also deriving the global solution by an iterative scheme to solve a system of Riccati BSDEs.
Let n∈N and T<∞ a fixed terminal time and let us consider a standard complete filtered probability space (Ω,F,(Ft)t∈[0,T],P) satisfying usual assumptions.
In what follows we are going to consider a controlled system of n SDEs, for t∈[0,T] and i=1,…,n, evolving as follows
$$
\begin{cases}
dX^{i;0}(t) = \mu^{i;0}(t, X^{i;0}(t), \alpha^{i;0}(t))\,dt + \sigma^{i;0}(t, X^{i;0}(t), \alpha^{i;0}(t))\,dW^i(t),\\
X^{i;0}(0) = x^{i;0}_0,
\end{cases}
\tag{2}
$$
where Wi(t) is a standard Brownian motion, αi;0 being the control. In particular, we assume
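$$
\mathcal{A}^i := \left\{ \alpha^{i;0} \in L^2_{ad}([0,T];\mathbb{R}) \,:\, \alpha^{i;0}(t) \in A^i, \ \text{a.e. } t \in [0,T] \right\},
$$
where $A^i \subset \mathbb{R}$ is assumed to be convex and closed, and we have denoted by $L^2_{ad}([0,T];\mathbb{R})$ the space of $(\mathcal{F}_t)_{t\in[0,T]}$-adapted processes α such that
$$
\mathbb{E}\int_0^T |\alpha^{i;0}(t)|^2\,dt < \infty,
$$
while $\mathcal{A} := \otimes_{i=1}^n \mathcal{A}^i$.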
In what follows we will assume the following assumptions to hold.
Assumptions 2.1. Let μ:[0,T]×R×A→R and σ:[0,T]×R×A→R be measurable functions and suppose that there exists a constant C>0 such that, for any x, y∈R, for any a∈A and for any t∈[0,T], it holds
$$
\begin{aligned}
|\mu(t,x,a) - \mu(t,y,a)| + |\sigma(t,x,a) - \sigma(t,y,a)| &\le C|x-y|,\\
|\mu(t,x,a)| + |\sigma(t,x,a)| &\le C(1 + |x| + |a|).
\end{aligned}
$$
We thus assume that the coefficients $\mu^{i;0}$ and $\sigma^{i;0}$, for i=1,…,n, in Eq (2) satisfy Assumptions 2.1. Therefore, there exists a unique strong solution to Eq (2), see, e.g., [12,24].
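As an illustration only, a single component of Eq (2) can be approximated numerically with the Euler-Maruyama scheme. The sketch below is not part of the paper: the function name `euler_maruyama`, the time grid, and the choice of a feedback control `alpha(t, x)` (a special case of an adapted control) are assumptions made for the example.

```python
import math
import random


def euler_maruyama(mu, sigma, alpha, x0, T, n_steps, rng):
    """Approximate one path of dX = mu(t, X, a) dt + sigma(t, X, a) dW on [0, T].

    mu, sigma: callables (t, x, a) -> float, assumed Lipschitz with linear growth.
    alpha: hypothetical feedback control (t, x) -> a, used only for illustration.
    """
    dt = T / n_steps
    x = x0
    path = [x0]
    for j in range(n_steps):
        t = j * dt
        a = alpha(t, x)
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        x = x + mu(t, x, a) * dt + sigma(t, x, a) * dw
        path.append(x)
    return path


# Usage: linear drift and volatility, control pushing the state upward.
rng = random.Random(0)
path = euler_maruyama(
    mu=lambda t, x, a: 0.1 * x + a,
    sigma=lambda t, x, a: 0.2 * x,
    alpha=lambda t, x: 0.5,
    x0=1.0, T=1.0, n_steps=250, rng=rng,
)
```

The returned list has `n_steps + 1` entries, one per grid point, and can be fed to any discrete hitting-time detection.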
Remark 2.2. In Eq (2) we have considered an R−valued SDE, nevertheless what follows still holds if we consider a system of SDEs, each of which takes values in Rmi, mi∈N, i=1,…,n.
Let us denote by
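$$
X^0(t) = (X^{1;0}(t), \dots, X^{n;0}(t)), \qquad \alpha^0(t) = (\alpha^{1;0}(t), \dots, \alpha^{n;0}(t)),
$$
then define the coefficients
$$
B^0 : [0,T] \times \mathbb{R}^n \times A \to \mathbb{R}^n, \qquad \Sigma^0 : [0,T] \times \mathbb{R}^n \times A \to \mathbb{R}^{n\times n},
$$
as
$$
B^0(t, X^0(t), \alpha^0(t)) := \left( \mu^{1;0}(t, X^{1;0}(t), \alpha^{1;0}(t)), \dots, \mu^{n;0}(t, X^{n;0}(t), \alpha^{n;0}(t)) \right)^T,
$$
and
$$
\Sigma^0(t, X^0(t), \alpha^0(t)) := \mathrm{diag}\left[ \sigma^{1;0}(t, X^{1;0}(t), \alpha^{1;0}(t)), \dots, \sigma^{n;0}(t, X^{n;0}(t), \alpha^{n;0}(t)) \right],
$$
that is, the matrix with entries $\sigma^{i;0}(t,x,a)$ on the diagonal and zeros off the diagonal.
Let us also denote $x^0_0 = (x^{1;0}_0, \dots, x^{n;0}_0)$ and $W(t) = (W^1(t), \dots, W^n(t))^T$. Hence, system (2) can be compactly rewritten as follows
$$
\begin{cases}
dX^0(t) = B^0(t, X^0(t), \alpha^0(t))\,dt + \Sigma^0(t, X^0(t), \alpha^0(t))\,dW(t),\\
X^0(0) = x^0_0.
\end{cases}
\tag{3}
$$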
We will minimize the following functional
$$
J(x,\alpha) = \mathbb{E}\int_0^{\hat\tau^1} L^0(t, X^0(t), \alpha^0(t))\,dt + G^0(\hat\tau^1, X^0(\hat\tau^1)),
\tag{4}
$$
where L0 and G0 are assumed to satisfy the following assumptions:
Assumptions 2.3. Let L0:[0,T]×Rn×A0→R and G0:[0,T]×Rn→R be two measurable and continuous functions such that there exist two constants K, k>0 such that, for any t∈[0,T], x∈Rn and a∈A0, it holds
$$
|L^0(t,x,a)| \le K(1 + |x|^k + |a|^k), \qquad |G^0(t,x)| \le K(1 + |x|^k).
$$
Let us underline that in the cost functional defined by (4), the terminal time ˆτ1 is assumed to be triggered as soon as X0 reaches a given boundary v0. In particular, we assume the stopping boundary to be of the form
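$$
v^0 = (v^{1;0}, \dots, v^{n;0}),
$$
for some given constants $v^{i;0} \in \mathbb{R}$, i=1,…,n. We thus denote by
$$
\tau^{i;0} := T \wedge \min\{ t \ge 0 : X^{i;0}(t) = v^{i;0} \}, \quad i = 1, \dots, n,
\tag{5}
$$
the first time $X^{i;0}$ reaches the boundary $v^{i;0}$ and we set
$$
\hat\tau^1 := \tau^{1;0} \wedge \dots \wedge \tau^{n;0},
$$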
the first stopping time to happen.
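On a discretized path, definition (5) and the minimum over the nodes can be sketched as follows. This is only an illustrative approximation under our own assumptions: the helper names are hypothetical, paths are sampled on a uniform grid, and a barrier is considered hit at the first grid point where the path equals or has crossed it.

```python
def first_hitting_time(path, barrier, T):
    """Discrete analogue of (5): T ∧ min{t >= 0 : X(t) = v} on a uniform grid."""
    n_steps = len(path) - 1
    dt = T / n_steps
    start_side = path[0] - barrier
    for j, x in enumerate(path):
        # Hit if the path sits on the barrier or has changed side since t = 0.
        if x == barrier or (x - barrier) * start_side < 0:
            return j * dt
    return T


def first_trigger(paths, barriers, T):
    """Return (tau_hat_1, index of the triggering node, all individual times)."""
    times = [first_hitting_time(p, v, T) for p, v in zip(paths, barriers)]
    tau_hat = min(times)
    # If no barrier is hit, tau_hat equals T and the returned index is not meaningful.
    return tau_hat, times.index(tau_hat), times


# Usage: node 0 crosses its barrier at the third grid point, node 1 never does.
paths = [[1.0, 0.5, -0.1, -0.3], [2.0, 1.5, 1.2, 1.1]]
tau_hat, k, times = first_trigger(paths, barriers=[0.0, 0.0], T=3.0)
```

Node indices are 0-based here, whereas the text numbers the nodes from 1.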
We stress that, in what follows we will denote by ˆτ the ordered stopping times. In particular, ˆτ1≤⋯≤ˆτn, where ˆτk denotes the k−th stopping time to happen. On the contrary, the notation τk indicates that the stopping time has been triggered by the k-th node. In what follows, we will use the convention that, if ˆτ1=τk, then τj=T, for j≠k.
Remark 2.4. From a practical point of view, we are considering a controller that aims at supervising n different elements defining a system, up to the first time one of its elements exits from a given domain. From a financial perspective, each element represents a financial agent, while the stopping time denotes its failure time. Hence, a possible objective, as we shall see in Section 3, is to maximize the distance of each element/financial agent from the associated stopping/default boundary.
As briefly mentioned in the introduction, instead of stopping the overall control problem when the first stopping time is triggered, we assume that the system continues to evolve according to a (possibly) new dynamic. As an example, let us consider the case $\hat\tau^1 \equiv \hat\tau^{k;0}$, that is, the first process to hit the stopping boundary is $X^{k;0}$. We thus set to 0 the k-th component of $X^0$, and consider the new process
$$
X^k(t) = (X^{1;k}(t), \dots, X^{k-1;k}(t), 0, X^{k+1;k}(t), \dots, X^{n;k}(t)),
$$
with control given by
$$
\alpha^k(t) = (\alpha^{1;k}(t), \dots, \alpha^{k-1;k}(t), 0, \alpha^{k+1;k}(t), \dots, \alpha^{n;k}(t)),
$$
where the superscript k denotes that the k-th component hit the stopping boundary and therefore has been set to 0.
Then, we consider the n−dimensional system, for t∈[ˆτ1,T], defined by
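$$
\begin{cases}
dX^{i;k}(t) = \mu^{i;k}(t, X^{i;k}(t), \alpha^{i;k}(t))\,dt + \sigma^{i;k}(t, X^{i;k}(t), \alpha^{i;k}(t))\,dW^i(t),\\
X^{i;k}(\hat\tau^1) = X^{i;0}(\hat\tau^1) =: x^{i;k}, \quad i = 1, \dots, k-1, k+1, \dots, n,
\end{cases}
$$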
where the coefficients μi;k and σi;k satisfy assumptions 2.1 and we have also set Xk;k(t)=0.
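We therefore define $B^k : [\hat\tau^1, T] \times \mathbb{R}^n \times A \to \mathbb{R}^n$ and $\Sigma^k : [\hat\tau^1, T] \times \mathbb{R}^n \times A \to \mathbb{R}^{n\times n}$ as
$$
\begin{aligned}
B^k(t, X^k(t), \alpha^k(t)) &:= \left( \mu^{1;k}(t, X^{1;k}(t), \alpha^{1;k}(t)), \dots, \mu^{n;k}(t, X^{n;k}(t), \alpha^{n;k}(t)) \right)^T,\\
\Sigma^k(t, X^k(t), \alpha^k(t)) &:= \mathrm{diag}\left[ \sigma^{1;k}(t, X^{1;k}(t), \alpha^{1;k}(t)), \dots, \sigma^{n;k}(t, X^{n;k}(t), \alpha^{n;k}(t)) \right],
\end{aligned}
$$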
which allows us to rewrite the above system as
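$$
\begin{cases}
dX^k(t) = B^k(t, X^k(t), \alpha^k(t))\,dt + \Sigma^k(t, X^k(t), \alpha^k(t))\,dW(t), \quad t \ge \hat\tau^1,\\
X^k(\tau^k) = \Phi^k(\tau^k) X^0(\tau^k) =: x^k,
\end{cases}
\tag{6}
$$
where $\Phi^k$ is the diagonal n×n matrix defined as
$$
\Phi^k(\tau^k) = \mathrm{diag}[1, \dots, 1, 0, 1, \dots, 1],
$$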
the null-entry being in the k−th position.
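The restart condition in (6) simply zeroes the component that hit its boundary and leaves the others unchanged. A minimal sketch, with hypothetical helper names and 0-based node indices (the text counts nodes from 1):

```python
def phi(k, n):
    """Diagonal n x n matrix with ones on the diagonal except a zero in position k."""
    return [[1.0 if (i == j and i != k) else 0.0 for j in range(n)] for i in range(n)]


def restart_state(x, k):
    """Initial state of the post-trigger system: Phi^k applied to the state at the trigger time."""
    n = len(x)
    p = phi(k, n)
    return [sum(p[i][j] * x[j] for j in range(n)) for i in range(n)]


# Usage: the second node (index 1) has hit its barrier.
x_restart = restart_state([1.0, 2.0, 3.0], k=1)
```

Since $\Phi^k$ is diagonal, the matrix product could be replaced by a direct assignment; the explicit matrix is kept here only to mirror the notation of (6).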
Then we minimize the following functional
$$
J^k(x,\alpha) = \mathbb{E}\int_{\hat\tau^1}^{\hat\tau^2} L^k(t, X^k(t), \alpha^k(t))\,dt + G^k(\hat\tau^2, X^k(\hat\tau^2)),
$$
where Lk and Gk are assumed to satisfy assumptions 2.3, while ˆτ2 is a stopping time triggered as soon as Xk hits a defined boundary. In particular, we define the stopping boundary
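$$
v^k = (v^{1;k}, \dots, v^{k-1;k}, 1, v^{k+1;k}, \dots, v^{n;k}), \quad t \in [\hat\tau^1, T],
$$
and, following the same scheme as before, we denote by
$$
\tau^{i;k} := T \wedge \min\{ t \ge \hat\tau^1 : X^{i;k}(t) = v^{i;k} \}, \quad i = 1, \dots, k-1, k+1, \dots, n,
$$
the first time $X^{i;k}$ reaches the boundary $v^{i;k}$, denoting
$$
\hat\tau^2 := \tau^{1;k} \wedge \dots \wedge \tau^{k-1;k} \wedge \tau^{k+1;k} \wedge \dots \wedge \tau^{n;k}.
$$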
It follows that, considering for instance the case τl;k has been triggered by Xl;k, we have ˆτ2≡ˆτl;k, meaning that vl;k has been hit. Iteratively proceeding, we consequently define
$$
X^{(k,l)}(t) = \left( X^{1;(k,l)}(t), \dots, X^{k-1;(k,l)}(t), 0, X^{k+1;(k,l)}(t), \dots, X^{l-1;(k,l)}(t), 0, X^{l+1;(k,l)}(t), \dots, X^{n;(k,l)}(t) \right)^T,
$$
again assuming X(k,l)(t) evolves according to a system as in (3), and so on until either no nodes are left or the terminal time T is reached.
As mentioned above, one of the major novelties of the present work consists in not assuming knowledge of the order of the stopping times. From a mathematical point of view, this implies that we have to consider all the possible combinations of such critical points during a given time interval [0,T]. Let us note that this is in fact the natural setting to work with when modelling concrete scenarios, e.g., possible multiple failures happening within a system of interconnected banks.
Therefore, in what follows we are going to denote by $C_{n,k}$ the combinations of k elements from a set of n, while $\pi_k \in C_{n,k}$ stands for one of those elements. Hence, exploiting the notation introduced above, we define the process $X = (X(t))_{t\in[0,T]}$ as
$$
X(t) = X^0(t)\mathbb{1}_{\{t < \hat\tau^1\}} + \sum_{k=1}^{n-1} \sum_{\pi_k \in C_{n,k}} X^{\pi_k}(t)\mathbb{1}_{\{\tau^{\pi_k} < t < \hat\tau^{k+1}\}},
\tag{7}
$$
where each $X^{\pi_k}(t)$ is defined as above and, consequently, the global control reads as follows
$$
\alpha(t) = \alpha^0(t)\mathbb{1}_{\{t < \hat\tau^1\}} + \sum_{k=1}^{n-1} \sum_{\pi_k \in C_{n,k}} \alpha^{\pi_k}(t)\mathbb{1}_{\{\tau^{\pi_k} < t < \hat\tau^{k+1}\}}.
\tag{8}
$$
Remark 2.5. Let us underline that, within the setting defined so far, each stopping time $\hat\tau^k$ depends on the previously triggered stopping times $\tau^{\pi_j}$, j=1,…,k−1. As a consequence, the solution $X^{\pi_k}$ in (7) also depends on the triggered stopping times as well as on their order. To simplify notation, we have avoided writing such dependencies explicitly, defining for short
$$
\hat\tau^k := \hat\tau^k(\hat\tau^1, \dots, \hat\tau^{k-1}).
$$
By Eq (7) we have that the dynamic for X is given by
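$$
dX(t) = B(t, X(t), \alpha(t))\,dt + \Sigma(t, X(t), \alpha(t))\,dW(t),
\tag{9}
$$
where, according to the above introduced notation, we have defined
$$
\begin{aligned}
B(t, X(t), \alpha(t)) &= B^0(t, X^0(t), \alpha^0(t))\mathbb{1}_{\{t < \hat\tau^1\}} + \sum_{k=1}^{n-1} \sum_{\pi_k \in C_{n,k}} B^{\pi_k}(t, X^{\pi_k}(t), \alpha^{\pi_k}(t))\mathbb{1}_{\{\tau^{\pi_k} < t < \hat\tau^{k+1}\}},\\
\Sigma(t, X(t), \alpha(t)) &= \Sigma^0(t, X^0(t), \alpha^0(t))\mathbb{1}_{\{t < \hat\tau^1\}} + \sum_{k=1}^{n-1} \sum_{\pi_k \in C_{n,k}} \Sigma^{\pi_k}(t, X^{\pi_k}(t), \alpha^{\pi_k}(t))\mathbb{1}_{\{\tau^{\pi_k} < t < \hat\tau^{k+1}\}},
\end{aligned}
\tag{10}
$$
aiming at minimizing the following functional
$$
J(x,\alpha) := \mathbb{E}\int_0^{\hat\tau^n} L(t, X(t), \alpha(t))\,dt + G(\hat\tau^n, X(\hat\tau^n)),
\tag{11}
$$
L and G being defined as
$$
\begin{aligned}
L(t, X(t), \alpha(t)) &= L^0(t, X^0(t), \alpha^0(t))\mathbb{1}_{\{t < \hat\tau^1\}} + \sum_{k=1}^{n-1} \sum_{\pi_k \in C_{n,k}} L^{\pi_k}(t, X^{\pi_k}(t), \alpha^{\pi_k}(t))\mathbb{1}_{\{\tau^{\pi_k} < t < \hat\tau^{k+1}\}},\\
G(\hat\tau^n, X(\hat\tau^n)) &= G^0(\hat\tau^1, X^0(\hat\tau^1))\mathbb{1}_{\{\hat\tau^1 \le T\}} + \sum_{k=1}^{n} \sum_{\pi_k \in C_{n,k}} G^{\pi_k}(\tau^{\pi_k}, X^{\pi_k}(\tau^{\pi_k}))\mathbb{1}_{\{\tau^{\pi_k} < T \le \hat\tau^{k+1}\}}.
\end{aligned}
$$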
Remark 2.6. It is worth mentioning that the sums stated above are taken over all possible combinations, which implies that we are not considering the order of the components, namely $X^{(k,l)} = X^{(l,k)}$. Dropping such an assumption implies that the sums in Eqs (7), (8) and (10) have to be taken over the dispositions (ordered arrangements) $D_{n,k}$.
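To get a feeling for how many regimes enter the sums in (7), (8) and (10), one can enumerate the index sets directly. The sketch below is only illustrative: the function name `regimes` is hypothetical, nodes are labelled 1,…,n, and the empty tuple stands for the initial regime in which no boundary has been hit.

```python
from itertools import combinations, permutations


def regimes(n, ordered=False):
    """List the index sets pi_k, k = 1..n-1, preceded by () for the initial regime.

    ordered=False uses combinations (order of triggers ignored);
    ordered=True uses ordered arrangements (order of triggers tracked).
    """
    picker = permutations if ordered else combinations
    out = [()]
    for k in range(1, n):
        out.extend(picker(range(1, n + 1), k))
    return out


# Usage: three nodes give 1 + 3 + 3 = 7 regimes, or 1 + 3 + 6 = 10 if order matters.
unordered = regimes(3)
ordered = regimes(3, ordered=True)
```

For n=2 the unordered list is `[(), (1,), (2,)]`, i.e., the three regimes of a two-node system.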
In what follows we shall give an example of the theory developed so far, as to better clarify our approach as well as its concrete applicability.
Example 2.1. Let us consider the case of a system constituted by just n=2 components. Then Eq (7) becomes
$$
X(t) = X^0(t)\mathbb{1}_{\{t < \hat\tau^1\}} + X^1(t)\mathbb{1}_{\{\tau^1 < t < \hat\tau^2\}} + X^2(t)\mathbb{1}_{\{\tau^2 < t < \hat\tau^2\}},
$$
where X0(t), resp. X1(t), resp. X2(t), denotes the dynamics in case neither 1 nor 2 has hit the stopping boundary, resp. 1 has, resp. 2 has.
Then, denoting by α0(t), α1(t) and α2(t) the respective associated controls, we have that the functional (11) reads
$$
\begin{aligned}
J(x,\alpha) := {} & \mathbb{E}\int_0^{\hat\tau^1} L^0(t, X^0(t), \alpha^0(t))\,dt + G^0(\hat\tau^1, X^0(\hat\tau^1))\\
& + \mathbb{E}\int_{\tau^1}^{\hat\tau^2} L^1(t, X^1(t), \alpha^1(t))\,dt + G^1(\hat\tau^2, X^1(\hat\tau^2))\\
& + \mathbb{E}\int_{\tau^2}^{\hat\tau^2} L^2(t, X^2(t), \alpha^2(t))\,dt + G^2(\hat\tau^2, X^2(\hat\tau^2)).
\end{aligned}
$$
The main issue in solving the optimal control problem defined in Section 2 consists in solving a series of connected optimal control problems, each of which may depend on the previous ones. Moreover, we do not assume any a priori knowledge of the order of the stopping times.
To overcome such issues, we consider a backward approach. In particular, we first solve the last control problem, then proceed with the penultimate one, and so on, until the first one, via backward induction. Let us underline that assuming perfect knowledge of the order of the stopping times would simplify the backward scheme, since only n control problems would have to be solved, sparing us the need to take into account all the combinations. Nevertheless, in both cases the backward procedure runs analogously.
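The order in which the sub-problems are visited by the backward procedure can be sketched as follows. This is a schematic illustration under our own naming (`backward_schedule` is hypothetical): regimes are labelled by the set of nodes that have already hit their boundary, and each regime lists the regimes with one more triggered node, whose solutions enter its terminal condition.

```python
from itertools import combinations


def backward_schedule(n):
    """Return [(regime, successors)] ordered from the last control problem to the first."""
    nodes = range(1, n + 1)
    schedule = []
    for k in range(n - 1, -1, -1):  # k = number of nodes already stopped
        for pi in combinations(nodes, k):
            if k < n - 1:
                successors = [tuple(sorted(pi + (j,))) for j in nodes if j not in pi]
            else:
                successors = []  # a single node is left: standard problem
            schedule.append((pi, successors))
    return schedule


# Usage: with two nodes, regimes (1,) and (2,) are solved before the initial regime ().
schedule = backward_schedule(2)
```

Every successor of a regime appears earlier in the returned list, so a solver walking the list front to back always finds the data it needs for the terminal condition.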
Aiming at deriving a global maximum principle, in what follows we denote by ∂x the partial derivative w.r.t. the space variable x∈Rn and by ∂a the partial derivative w.r.t. the control a∈An. Moreover we assume
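Assumptions 2.7. (ⅰ) For any $\pi_k \in C_{n,k}$, k=1,…,n, it holds that $B^{\pi_k}$ and $\Sigma^{\pi_k}$ are continuously differentiable w.r.t. both $x \in \mathbb{R}^n$ and $a \in A$. Furthermore, there exists a constant $C_1 > 0$ such that for any t∈[0,T], $x \in \mathbb{R}^n$ and $a \in A$, it holds
$$
|\partial_x B^{\pi_k}(t,x,a)| + |\partial_a B^{\pi_k}(t,x,a)| \le C_1, \qquad |\partial_x \Sigma^{\pi_k}(t,x,a)| + |\partial_a \Sigma^{\pi_k}(t,x,a)| \le C_1.
$$
(ⅱ) For any $\pi_k \in C_{n,k}$, k=1,…,n, it holds that $L^{\pi_k}$, resp. $G^{\pi_k}$, is continuously differentiable w.r.t. both $x \in \mathbb{R}^n$ and $a \in A^n$, resp. only w.r.t. $x \in \mathbb{R}^n$. Furthermore, there exists a constant $C_2 > 0$ such that for any t∈[0,T], $x \in \mathbb{R}^n$ and $a \in A^n$, it holds
$$
|\partial_x L^{\pi_k}(t,x,a)| + |\partial_a L^{\pi_k}(t,x,a)| \le C_2(1 + |x| + |a|), \qquad |\partial_x G^{\pi_k}(t,x)| \le C_2.
$$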
We thus have the following result.
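Theorem 2.8. [Necessary Maximum Principle] Let Assumptions 2.1, 2.3 and 2.7 hold and let $(\bar X, \bar\alpha)$ be an optimal pair for the problem (9)–(11). Then it holds
$$
\left\langle \partial_a H(t, \bar X(t), \bar\alpha(t), \bar Y(t), \bar Z(t)), (\bar\alpha(t) - \tilde\alpha) \right\rangle \le 0, \quad \text{a.e. } t \in [0, \hat\tau^n], \ \mathbb{P}\text{-a.s.}, \ \forall \tilde\alpha \in A,
\tag{12}
$$
equivalently
$$
\bar\alpha(t) = \arg\min_{\tilde\alpha \in A} H(t, \bar X(t), \tilde\alpha(t), Y(t), Z(t)),
$$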
where the pair (Y(t),Z(t)) solves the following dual backward equation
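$$
\begin{aligned}
Y(t) &= Y^0(t)\mathbb{1}_{\{t < \hat\tau^1\}} + \sum_{k=1}^{n-1} \sum_{\pi_k \in C_{n,k}} Y^{\pi_k}(t)\mathbb{1}_{\{\tau^{\pi_k} < t < \hat\tau^{k+1}\}},\\
Z(t) &= Z^0(t)\mathbb{1}_{\{t < \hat\tau^1\}} + \sum_{k=1}^{n-1} \sum_{\pi_k \in C_{n,k}} Z^{\pi_k}(t)\mathbb{1}_{\{\tau^{\pi_k} < t < \hat\tau^{k+1}\}},
\end{aligned}
$$
the pairs $(Y^{\pi_k}(t), Z^{\pi_k}(t))$ being solutions of the following system of interconnected BSDEs
$$
\begin{gathered}
\begin{cases}
-dY^{\pi_{n-1}}(t) = \partial_x H^{\pi_{n-1}}(t, X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t), Y^{\pi_{n-1}}(t), Z^{\pi_{n-1}}(t))\,dt - Z^{\pi_{n-1}}\,dW(t),\\
Y^{\pi_{n-1}}(\hat\tau^n) = \partial_x G^{\pi_{n-1}}(\hat\tau^n, X^{\pi_{n-1}}(\hat\tau^n)),
\end{cases}\\
\begin{cases}
-dY^{\pi_k}(t) = \partial_x H^{\pi_k}(t, X^{\pi_k}(t), \alpha^{\pi_k}(t), Y^{\pi_k}(t), Z^{\pi_k}(t))\,dt - Z^{\pi_k}\,dW(t),\\
Y^{\pi_k}(\hat\tau^{k+1}) = \partial_x G^{\pi_k}(\hat\tau^{k+1}, X^{\pi_k}(\hat\tau^{k+1})) + \bar Y^{k+1}(\hat\tau^{k+1}),
\end{cases}\\
\begin{cases}
-dY^0(t) = \partial_x H^0(t, X^0(t), \alpha^0(t), Y^0(t), Z^0(t))\,dt - Z^0\,dW(t),\\
Y^0(\tau^1) = \partial_x G^0(\tau^1, X^0(\tau^1)) + \bar Y^1(\tau^1),
\end{cases}
\end{gathered}
\tag{13}
$$
having denoted by
$$
\bar Y^{\pi_{k+1}}(\hat\tau^{k+1}) := \sum_{\pi_{k+1} \in C_{n,k+1}} Y^{\pi_{k+1}}(\tau^{\pi_{k+1}})\mathbb{1}_{\{\hat\tau^{k+1} = \tau^{\pi_{k+1}}\}},
$$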
where Hπk is the generalized Hamiltonian
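$$
H^{\pi_k} : [0,T] \times \mathbb{R}^n \times A \times \mathbb{R}^n \times \mathbb{R}^{n\times n} \to \mathbb{R},
$$
defined as
$$
H^{\pi_k}(t, x^{\pi_k}, a^{\pi_k}, y^{\pi_k}, z^{\pi_k}) := B^{\pi_k}(t, x^{\pi_k}, a^{\pi_k}) \cdot y^{\pi_k} + \mathrm{Tr}\left[ (\Sigma^{\pi_k}(t, x^{\pi_k}, a^{\pi_k}))^* z^{\pi_k} \right] + L^{\pi_k}(t, x^{\pi_k}, a^{\pi_k}),
\tag{14}
$$
and H represents the global generalized Hamiltonian defined as
$$
H(t,x,a,y,z) = H^0(t,x,a,y,z)\mathbb{1}_{\{t < \hat\tau^1\}} + \sum_{k=1}^{n-1} \sum_{\pi_k \in C_{n,k}} H^{\pi_k}(t,x,a,y,z)\mathbb{1}_{\{\tau^{\pi_k} < t < \hat\tau^{k+1}\}}.
$$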
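For a single regime, the generalized Hamiltonian (14) can be evaluated pointwise once the coefficients are given. The sketch below is illustrative only: the function name and calling convention are assumptions, and it exploits the fact that Σ is diagonal in the present setting, so that the trace term reduces to the sum of the products of the diagonal entries of Σ and z.

```python
def hamiltonian(B, sigma_diag, L, t, x, a, y, z):
    """Evaluate H = B . y + Tr[Sigma^* z] + L for a diagonal Sigma.

    B: callable (t, x, a) -> list of n drift components.
    sigma_diag: callable (t, x, a) -> list of the n diagonal entries of Sigma.
    L: callable (t, x, a) -> float, the running cost.
    y: list of length n; z: n x n nested list.
    """
    n = len(x)
    drift = B(t, x, a)
    sig = sigma_diag(t, x, a)
    drift_term = sum(drift[i] * y[i] for i in range(n))
    trace_term = sum(sig[i] * z[i][i] for i in range(n))  # Tr[diag(sig)^T z]
    return drift_term + trace_term + L(t, x, a)


# Usage with hypothetical two-node coefficients.
h = hamiltonian(
    B=lambda t, x, a: [x[0] + a[0], x[1] + a[1]],
    sigma_diag=lambda t, x, a: [1.0, 2.0],
    L=lambda t, x, a: 0.5 * (a[0] ** 2 + a[1] ** 2),
    t=0.0, x=[1.0, 2.0], a=[0.0, 1.0], y=[1.0, 1.0],
    z=[[1.0, 0.0], [0.0, 1.0]],
)
```

With these inputs the drift term is 4, the trace term is 3 and the running cost is 0.5.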
Remark 2.9. Before entering into the details of the proof of Theorem 2.8, let us underline some of its characteristics. In particular, the main idea here is to find a solution iteratively, acting backward in time. Therefore, starting from the very last control problem, namely the case where a single node is left in the system, we consider a standard maximum principle. Indeed, $Y^{\pi_{n-1}}$ in (13) represents a classical dual BSDE associated with the standard stochastic maximum principle, see, e.g., [27,Th. 3.2]. Then, we can consider the second-to-last control problem. At this point, a naive attempt to obtain a global solution could be to first solve such a penultimate problem and then glue together the obtained solutions. Nevertheless, such a method only produces a suboptimal solution. Instead, the right approach, similarly to what happens when applying the standard dynamic programming principle, consists in treating the solution to the last control problem as the terminal cost for the subsequent (second-to-last) control problem, and so on for the remaining ones.
It follows that, in deriving the global optimal solution, one takes into account the cost coming from the future evolution of the system. Mathematically, this is clearly expressed by the terminal condition that $Y^{\pi_k}$ in Eq (13) is endowed with. Therefore, the solution scheme results in a global connection of all the control problems we have to consider, from the very last of them backward to the first one.
Proof. [Necessary Maximum Principle.] We proceed according to a backward induction technique. In particular, for $t_0 > \hat\tau^{n-1}$ the proof follows from the standard stochastic necessary maximum principle, see, e.g., [27,Th. 3.2]. Then we consider the case of $\hat\tau^{n-2} < t_0 < \hat\tau^{n-1}$, and we define
$$
\bar\alpha := \begin{cases}
\bar\alpha^{\pi_{n-2}}(t), & t_0 < t < \hat\tau^{n-2},\\
\bar\alpha^{\pi_{n-1}}(t), & \hat\tau^{n-2} < t < \hat\tau^{n-1},
\end{cases}
$$
to be the optimal control, α being another admissible control, and we further set $\alpha^h$ as
$$
\alpha^h := \bar\alpha + h\alpha, \quad h > 0.
$$
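Since in the present case the cost functional reads as follows
$$
\begin{aligned}
J(x,\alpha) := {} & \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} L^{\pi_{n-2}}(t, X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t))\,dt + G^{\pi_{n-2}}(\hat\tau^{n-1}, X(\hat\tau^{n-1}))\\
& + \sum_{\pi_{n-1} \in C_{n,n-1}} \mathbb{E}\int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} L^{\pi_{n-1}}(t, X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t))\,dt + G^{\pi_{n-1}}(\hat\tau^n, X(\hat\tau^n)),
\end{aligned}
$$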
we can choose α=ˉα−˜α, ˜α∈A. Then, by the optimality of ˉα and via a standard variational argument, see, e.g., [1,22,27], we have
$$
J(x,\bar\alpha) - J(x,\alpha^h) \le 0,
$$
which implies
$$
\lim_{h \to 0} \frac{J(x,\bar\alpha) - J(x,\alpha^h)}{h} \le 0.
$$
In what follows, for the sake of clarity, we will denote by Xα the solution X with control α. Thus, from the optimality of ˉα, we have
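$$
\begin{aligned}
& \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} L^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}_{\bar\alpha}(t), \bar\alpha^{\pi_{n-2}}(t))\,dt + G^{\pi_{n-2}}(\hat\tau^{n-1}, \bar X^{\pi_{n-2}}_{\bar\alpha}(\hat\tau^{n-1}))\\
& \quad + \sum_{\pi_{n-1} \in C_{n,n-1}} \int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} L^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}_{\bar\alpha}(t), \bar\alpha^{\pi_{n-1}}(t))\,dt + G^{\pi_{n-1}}(\hat\tau^n, \bar X^{\pi_{n-1}}_{\bar\alpha}(\hat\tau^n))\\
& \le \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} L^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}_{\alpha^h}(t), \bar\alpha^{\pi_{n-2}}(t))\,dt + G^{\pi_{n-2}}(\hat\tau^{n-1}, \bar X^{\pi_{n-2}}_{\alpha^h}(\hat\tau^{n-1}))\\
& \quad + \sum_{\pi_{n-1} \in C_{n,n-1}} \int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} L^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}_{\alpha^h}(t), \bar\alpha^{\pi_{n-1}}(t))\,dt + G^{\pi_{n-1}}(\hat\tau^n, \bar X^{\pi_{n-1}}_{\alpha^h}(\hat\tau^n)).
\end{aligned}
\tag{15}
$$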
Then, for any α∈A, by (15), we obtain
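$$
\begin{aligned}
& \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} \partial_x L^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}_{\bar\alpha}(t), \bar\alpha^{\pi_{n-2}}(t)) Z^{\pi_{n-2}}(t)\,dt
+ \partial_x G^{\pi_{n-2}}(\hat\tau^{n-1}, \bar X^{\pi_{n-2}}_{\bar\alpha}(\hat\tau^{n-1})) Z^{\pi_{n-2}}(\hat\tau^{n-1})\\
& \quad + \sum_{\pi_{n-1} \in C_{n,n-1}} \mathbb{E}\int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} \partial_x L^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}_{\bar\alpha}(t), \bar\alpha^{\pi_{n-1}}(t)) Z^{\pi_{n-1}}(t)\,dt\\
& \quad + \sum_{\pi_{n-1} \in C_{n,n-1}} \partial_x G^{\pi_{n-1}}(\hat\tau^n, \bar X^{\pi_{n-1}}_{\bar\alpha}(\tau^{\pi_n})) Z^{\pi_{n-1}}(\hat\tau^n) \le 0,
\end{aligned}
\tag{16}
$$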
where Zπn−1 and Zπn−2 solve the first variation process
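$$
\begin{gathered}
\begin{cases}
\begin{aligned}
dZ^{\pi_{n-1}}(t) = {} & \partial_x B^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t)) Z^{\pi_{n-1}}(t)\,dt + \partial_a B^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t)) \alpha^{\pi_{n-1}}(t)\,dt\\
& + \partial_x \Sigma^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t)) Z^{\pi_{n-1}}(t)\,dW(t) + \partial_a \Sigma^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t)) \alpha^{\pi_{n-1}}(t)\,dW(t),
\end{aligned}\\
Z^{\pi_{n-1}}(\hat\tau^{n-1}) = \bar Z^{\pi_{n-2}}(\hat\tau^{n-1}), \quad t \in [\hat\tau^{n-1}, \hat\tau^n],
\end{cases}\\
\begin{cases}
\begin{aligned}
dZ^{\pi_{n-2}}(t) = {} & \partial_x B^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) Z^{\pi_{n-2}}(t)\,dt + \partial_a B^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) \alpha^{\pi_{n-2}}(t)\,dt\\
& + \partial_x \Sigma^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) Z^{\pi_{n-2}}(t)\,dW(t) + \partial_a \Sigma^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) \alpha^{\pi_{n-2}}(t)\,dW(t),
\end{aligned}\\
Z^{\pi_{n-2}}(t_0) = 0, \quad t \in [t_0, \hat\tau^{n-1}].
\end{cases}
\end{gathered}
$$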
Applying Itô formula to Yπn−2⋅Zπn−2, we have
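$$
\begin{aligned}
& \mathbb{E}\left( \partial_x G^{\pi_{n-2}}(\hat\tau^{n-1}, X^{\pi_{n-2}}(\hat\tau^{n-1})) + \bar Y^{n-1}(\hat\tau^{n-1}) \right) \cdot Z^{\pi_{n-2}}(\hat\tau^{n-1})
= \mathbb{E}\, Y^{\pi_{n-2}}(\hat\tau^{n-1}) \cdot Z^{\pi_{n-1}}(\hat\tau^{n-1})\\
& = -\mathbb{E}\int_{t_0}^{\tau^{\pi_{n-1}}} \partial_x H^{\pi_{n-2}}(t, X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t), Y^{\pi_{n-2}}(t), Z^{\pi_{n-2}}(t)) \cdot Z^{\pi_{n-2}}(t)\,dt\\
& \quad + \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} \left( \partial_x B^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) Z^{\pi_{n-2}}(t) \right) \cdot Y^{\pi_{n-2}}(t)\,dt\\
& \quad + \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} \left( \partial_a B^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) \alpha^{\pi_{n-2}}(t) \right) \cdot Y^{\pi_{n-2}}(t)\,dt\\
& \quad + \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} \left( \partial_x \Sigma^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) Z^{\pi_{n-2}}(t) \right) \cdot Z^{\pi_{n-2}}(t)\,dt\\
& \quad + \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} \left( \partial_a \Sigma^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) \alpha^{\pi_{n-2}}(t) \right) \cdot Z^{\pi_{n-2}}(t)\,dt\\
& = -\mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} \partial_x L^{\pi_{n-2}}(t, X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) Z^{\pi_{n-2}}(t)\,dt\\
& \quad + \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} \left( \partial_a B^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) Y^{\pi_{n-2}}(t) \right) \cdot \alpha^{\pi_{n-2}}(t)\,dt\\
& \quad + \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} \left( \partial_a \Sigma^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t)) Z^{\pi_{n-2}}(t) \right) \cdot \alpha^{\pi_{n-2}}(t)\,dt,
\end{aligned}
\tag{17}
$$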
and similarly for Yπn−1⋅Zπn−1, we obtain
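$$
\begin{aligned}
& \mathbb{E}\left( \partial_x G^{\pi_{n-1}}(\hat\tau^n, X^{\pi_{n-1}}(\hat\tau^n)) \right) \cdot Z^{\pi_{n-1}}(\hat\tau^n) = \mathbb{E}\, Y^{\pi_{n-1}}(\hat\tau^n) \cdot Z^{\pi_{n-1}}(\hat\tau^n)\\
& = \mathbb{E}\, Y^{\pi_{n-1}}(\tau^{\pi_{n-1}}) \cdot Z^{\pi_{n-1}}(\tau^{\pi_{n-1}})
- \mathbb{E}\int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} \partial_x L^{\pi_{n-1}}(t, X^{\pi_{n-1}}(t), \bar\alpha^{\pi_{n-1}}(t)) Z^{\pi_{n-1}}(t)\,dt\\
& \quad + \mathbb{E}\int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} \left( \partial_a B^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t)) Y^{\pi_{n-1}}(t) \right) \cdot \alpha^{\pi_{n-1}}(t)\,dt\\
& \quad + \mathbb{E}\int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} \left( \partial_a \Sigma^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t)) Z^{\pi_{n-1}}(t) \right) \cdot \alpha^{\pi_{n-1}}(t)\,dt.
\end{aligned}
\tag{18}
$$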
Exploiting Eq (16), together with Eqs (17) and (18), we thus have
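$$
\begin{aligned}
& \int_{t_0}^{\hat\tau^{n-1}} \left( \partial_a H^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \bar\alpha^{\pi_{n-2}}(t), \bar Y^{\pi_{n-2}}(t), \bar Z^{\pi_{n-2}}(t)) \right) \alpha^{\pi_{n-2}}(t)\,dt\\
& \quad + \sum_{\pi_{n-1} \in C_{n,n-1}} \int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} \left( \partial_a H^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \bar\alpha^{\pi_{n-1}}(t), \bar Y^{\pi_{n-1}}(t), \bar Z^{\pi_{n-1}}(t)) \right) \alpha^{\pi_{n-1}}(t)\,dt \le 0,
\end{aligned}
$$
for all $\alpha = \bar\alpha - \tilde\alpha$, and thus we eventually obtain, for $t_0 > \hat\tau^{n-2}$,
$$
\partial_a H(t, \bar X(t), \bar\alpha(t), \bar Y(t), \bar Z(t)) (\bar\alpha(t) - \tilde\alpha) \le 0, \quad \text{a.e. } t \in [t_0, \hat\tau^n], \ \mathbb{P}\text{-a.s.}, \ \forall \tilde\alpha \in A,
$$
which is the desired local form of the optimality condition (12). Proceeding analogously via backward induction, we derive that the same results also hold for any $\pi_k \in C_{n,k}$, hence obtaining the system (13) and concluding the proof.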
In this section we consider a generalization of the classical sufficient maximum principle, see, e.g., [24,Th. 6.4.6], for the present setting of interconnected multiple optimal control problems with random terminal time. To this end, we assume
Assumptions 2.10. For any $\pi_k \in C_{n,k}$ the derivatives w.r.t. x of $B^{\pi_k}$, $\Sigma^{\pi_k}$ and $L^{\pi_k}$ are continuous and there exists a constant $L_a > 0$ such that, for any $a_1, a_2 \in A$,
$$
|B^{\pi_k}(t,x,a_1) - B^{\pi_k}(t,x,a_2)| + |\Sigma^{\pi_k}(t,x,a_1) - \Sigma^{\pi_k}(t,x,a_2)| + |L^{\pi_k}(t,x,a_1) - L^{\pi_k}(t,x,a_2)| \le L_a |a_1 - a_2|.
$$
Theorem 2.11 (Sufficient maximum principle). Let Assumptions 2.1, 2.3, 2.7 and 2.10 hold, let (Y,Z) be the solution to the dual BSDE (13), and suppose the following conditions hold true
(ⅰ) the maps x↦Gπk(x) are convex for any πk;
(ⅱ) the maps (x,a)↦Hπk(x,a,Yπk,Zπk) are convex for a.e. t∈[0,T] and for any πk;
(ⅲ) for a.e. t∈[0,T] and P−a.s. it holds
$$
\bar\alpha^{\pi_k}(t) = \arg\min_{\tilde\alpha^{\pi_k} \in A^{\pi_k}} H^{\pi_k}(t, X^{\pi_k}(t), \tilde\alpha(t), Y^{\pi_k}, Z^{\pi_k}),
$$
then (ˉα,ˉX) is an optimal pair for the problem (9)–(11).
Proof. Let us proceed as in the proof of Theorem 2.8, namely via backward induction. For t0>ˆτn−1 the proof follows from the standard sufficient stochastic maximum principle, see, e.g., [27,Th. 5.2].
Let us then consider the case of $\hat\tau^{n-2} < t_0 < \hat\tau^{n-1}$, denoting $\Delta X^{\pi_k}(t) := \bar X^{\pi_k}(t) - X^{\pi_k}(t)$ and, for the sake of clarity, using similar notation for any other function.
The convexity of Gπn−1, together with the terminal condition
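$$
Y^{\pi_{n-1}}(\hat\tau^n) = \partial_x G^{\pi_{n-1}}(\hat\tau^n, \bar X^{\pi_{n-1}}(\hat\tau^n)),
$$
yields
$$
\mathbb{E}\,\Delta G^{\pi_{n-1}}(\hat\tau^n, \bar X^{\pi_{n-1}}(\hat\tau^n)) \le \mathbb{E}\left[ \Delta X^{\pi_{n-1}}(\hat\tau^n)\, \partial_x G^{\pi_{n-1}}(\hat\tau^n, \bar X^{\pi_{n-1}}(\hat\tau^n)) \right] = \mathbb{E}\left[ \Delta X^{\pi_{n-1}}(\hat\tau^n)\, Y^{\pi_{n-1}}(\hat\tau^n) \right].
\tag{19}
$$
Applying the Itô formula to $\Delta X^{\pi_{n-1}} Y^{\pi_{n-1}}(\hat\tau^n)$, we obtain
$$
\begin{aligned}
& \mathbb{E}\left[ \Delta X^{\pi_{n-1}}(\hat\tau^n)\, Y^{\pi_{n-1}}(\hat\tau^n) \right] = \mathbb{E}\left[ \Delta X^{\pi_{n-1}}(\hat\tau^{n-1})\, Y^{\pi_{n-1}}(\hat\tau^{n-1}) \right]\\
& \quad + \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \Delta X^{\pi_{n-1}}(t)\,dY^{\pi_{n-1}}(t) + \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} Y^{\pi_{n-1}}(t)\,d\Delta X^{\pi_{n-1}}(t)\\
& \quad + \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \mathrm{Tr}\left[ \Delta\Sigma^{\pi_{n-1}}(t, X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t)) Z^{\pi_{n-1}}(t) \right] dt\\
& = \mathbb{E}\left[ \Delta X^{\pi_{n-1}}(\hat\tau^{n-1})\, Y^{\pi_{n-1}}(\hat\tau^{n-1}) \right]\\
& \quad - \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \Delta X^{\pi_{n-1}}(t)\, \partial_x H^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \bar\alpha^{\pi_{n-1}}(t), Y^{\pi_{n-1}}(t), Z^{\pi_{n-1}}(t))\,dt\\
& \quad + \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \left( \Delta B^{\pi_{n-1}}(t) Y^{\pi_{n-1}}(t) + \Delta\Sigma^{\pi_{n-1}}(t) Z^{\pi_{n-1}}(t) \right) dt.
\end{aligned}
\tag{20}
$$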
Similarly, from the convexity of the Hamiltonian, we also have
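$$
\begin{aligned}
& \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \left[ \Delta L^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \bar\alpha^{\pi_{n-1}}(t)) \right] dt\\
& = \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \left[ \Delta H^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \bar\alpha^{\pi_{n-1}}(t), Y^{\pi_{n-1}}(t), Z^{\pi_{n-1}}(t)) \right] dt\\
& \quad - \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \left( \Delta B^{\pi_{n-1}}(t) Y^{\pi_{n-1}}(t) + \Delta\Sigma^{\pi_{n-1}}(t) Z^{\pi_{n-1}}(t) \right) dt\\
& \le \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \left[ \Delta X^{\pi_{n-1}}(t)\, \partial_x H^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \bar\alpha^{\pi_{n-1}}(t), Y^{\pi_{n-1}}(t), Z^{\pi_{n-1}}(t)) \right] dt\\
& \quad - \mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \left( \Delta B^{\pi_{n-1}}(t) Y^{\pi_{n-1}}(t) + \Delta\Sigma^{\pi_{n-1}}(t) Z^{\pi_{n-1}}(t) \right) dt,
\end{aligned}
\tag{21}
$$
so that, for any $\pi_{n-1}$, by combining Eqs (19)–(21), we derive
$$
\mathbb{E}\int_{\hat\tau^{n-1}}^{\hat\tau^n} \left[ \Delta L^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \bar\alpha^{\pi_{n-1}}(t)) \right] dt + \mathbb{E}\,\Delta G^{\pi_{n-1}}(\hat\tau^n, \bar X^{\pi_{n-1}}(\hat\tau^n)) \le \mathbb{E}\left[ \Delta X^{\pi_{n-1}}(\hat\tau^{n-1})\, Y^{\pi_{n-1}}(\hat\tau^{n-1}) \right].
\tag{22}
$$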
Analogously, for t0∈[ˆτn−2,ˆτn−1], and since
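$$
Y^{\pi_{n-2}}(\hat\tau^{n-1}) = \partial_x G^{\pi_{n-2}}(\hat\tau^{n-1}, \bar X^{\pi_{n-2}}(\hat\tau^{n-1})) + \bar Y^{\pi_{n-1}}(\hat\tau^{n-1}),
$$
together with the convexity of $G^{\pi_{n-2}}$, we have
$$
\begin{aligned}
\mathbb{E}\,\Delta G^{\pi_{n-2}}(\hat\tau^{n-1}, \bar X^{\pi_{n-2}}(\hat\tau^{n-1})) & \le \mathbb{E}\left[ \Delta X^{\pi_{n-2}}(\hat\tau^{n-1})\, \partial_x G^{\pi_{n-2}}(\hat\tau^{n-1}, \bar X^{\pi_{n-2}}(\hat\tau^{n-1})) \right]\\
& = \mathbb{E}\left[ \Delta X^{\pi_{n-2}}(\hat\tau^{n-1})\, Y^{\pi_{n-2}}(\hat\tau^{n-1}) - \Delta X^{\pi_{n-2}}(\hat\tau^{n-1})\, \bar Y^{\pi_{n-1}}(\hat\tau^{n-1}) \right].
\end{aligned}
$$
Similar computations also give us
$$
\mathbb{E}\int_{\hat\tau^{n-2}}^{\hat\tau^{n-1}} \left[ \Delta L^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \bar\alpha^{\pi_{n-2}}(t)) \right] dt + \mathbb{E}\,\Delta G^{\pi_{n-2}}(\hat\tau^{n-1}, \bar X^{\pi_{n-2}}(\hat\tau^{n-1})) \le -\mathbb{E}\,\Delta X^{\pi_{n-2}}(\hat\tau^{n-1})\, \bar Y^{\pi_{n-1}}(\hat\tau^{n-1}),
\tag{23}
$$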
so that, for t0∈[ˆτn−2,ˆτn−1], by Eqs (22) and (23), we infer that
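$$
\begin{aligned}
J(t_0,x,\bar\alpha) - J(t_0,x,\alpha) := {} & \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} L^{\pi_{n-2}}(t, \bar X^{\pi_{n-2}}(t), \bar\alpha^{\pi_{n-2}}(t))\,dt + \mathbb{E}\,G^{\pi_{n-2}}(\hat\tau^{n-1}, \bar X^{\pi_{n-2}}(\hat\tau^{n-1}))\\
& + \sum_{\pi_{n-1} \in C_{n,n-1}} \mathbb{E}\int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} L^{\pi_{n-1}}(t, \bar X^{\pi_{n-1}}(t), \bar\alpha^{\pi_{n-1}}(t))\,dt + G^{\pi_{n-1}}(\hat\tau^n, \bar X^{\pi_{n-1}}(\hat\tau^n))\\
& - \mathbb{E}\int_{t_0}^{\hat\tau^{n-1}} L^{\pi_{n-2}}(t, X^{\pi_{n-2}}(t), \alpha^{\pi_{n-2}}(t))\,dt - \mathbb{E}\,G^{\pi_{n-2}}(\hat\tau^{n-1}, X(\hat\tau^{n-1}))\\
& - \sum_{\pi_{n-1} \in C_{n,n-1}} \mathbb{E}\int_{\tau^{\pi_{n-1}}}^{\hat\tau^n} L^{\pi_{n-1}}(t, X^{\pi_{n-1}}(t), \alpha^{\pi_{n-1}}(t))\,dt - G^{\pi_{n-1}}(\hat\tau^n, X(\hat\tau^n)) \le 0,
\end{aligned}
\tag{24}
$$
which implies that
$$
J(t_0,x,\bar\alpha) \le J(t_0,x,\alpha),
$$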
and the optimality of (ˉα,ˉX).
Proceeding backward, previously exploited arguments allow us to show the same results for any πk∈Cn,k, hence ending the proof.
In the present section we consider a particular case for the control problem stated in Sections 2.1 and 2.2. In particular, we will assume that the dynamic of the state equation is linear in both the space and the control variable. Moreover, we impose that the control enters (linearly) only in the drift and that the cost functional is quadratic and of a specific form. More precisely, let us first consider μ0(t) as the n×n matrix defined as follows
μ0(t):=diag[μ1;0(t),…,μn;0(t)], |
that is, the diagonal matrix with entries μi;0(t) on the diagonal and zeros elsewhere, μi;0:[0,T]→R being a deterministic and bounded function of time. Also let
b0(t)=(b1;0(t),…,bn;0(t))T, |
where again bi;0:[0,T]→R is a deterministic and bounded function of time. Then we set
B0(t,X0(t),α(t))=μ0(t)X0(t)+b0(t)+α(t). | (25) |
Let us also define the n×n matrix Σ0, to be independent of the control, as follows
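$$
\Sigma^{0}(t,X^{0}(t)) :=
\begin{pmatrix}
\sigma^{1;0}(t)X^{1;0}(t)+\nu^{1;0}(t) & 0 & 0 \\
0 & \ddots & 0 \\
0 & 0 & \sigma^{n;0}(t)X^{n;0}(t)+\nu^{n;0}(t)
\end{pmatrix},
\tag{26}
$$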
σi;0, νi;0:[0,T]→R being deterministic and bounded functions of time.
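To make the structure of the drift (25) and of the diagonal volatility (26) concrete, the following is a minimal Euler-Maruyama sketch of the resulting component-wise dynamics. It is only an illustration under stated assumptions: the driving Brownian motions are taken independent, the stopping boundaries are ignored, and the function name `simulate_linear_state` as well as the feedback map `alpha` are hypothetical, not part of the original text.

```python
import math
import random


def simulate_linear_state(x0, mu, b, sigma, nu, alpha, T=1.0, steps=1000, seed=0):
    """Euler-Maruyama path of
        dX_i = (mu_i(t) X_i + b_i(t) + alpha_i) dt + (sigma_i(t) X_i + nu_i(t)) dW_i.

    mu, b, sigma, nu: lists of callables t -> float, one per node.
    alpha: callable (t, x) -> list of floats (hypothetical control, assumed given).
    Assumption: the Brownian motions W_i are independent; no stopping is applied.
    """
    rng = random.Random(seed)
    dt = T / steps
    x = list(x0)
    path = [list(x)]
    for step in range(steps):
        t = step * dt
        a = alpha(t, x)
        new_x = []
        for i, xi in enumerate(x):
            dw = rng.gauss(0.0, math.sqrt(dt))          # Brownian increment
            drift = mu[i](t) * xi + b[i](t) + a[i]      # i-th entry of (25)
            diffusion = sigma[i](t) * xi + nu[i](t)     # i-th diagonal entry of (26)
            new_x.append(xi + drift * dt + diffusion * dw)
        x = new_x
        path.append(list(x))
    return path
```

With `sigma` and `nu` identically zero the scheme reduces to the explicit Euler method for the linear ODE, which gives a simple sanity check of the drift term.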
The same linearity assumptions hold for all the other coefficients Bπk and Σπk, so that, using the notation introduced in the previous sections, we consider the system
dX(t)=B(t,X(t),α(t))dt+Σ(t,X(t))dW(t), | (27) |
where both the drift and the volatility coefficients are now assumed to be linear. In the present (particular) setting, both the running and the terminal cost are assumed to be suitable quadratic weighted averages of the distance from the stopping boundaries, namely we set
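$$
L^{\pi^k}(t,x,a) = \sum_{i=1}^{n}\Big(\gamma^{\pi^k}_i\,\frac{|x_i - v^{i;\pi^k}|^2}{2} + \frac12\,|a^{i;\pi^k}|^2\Big),
\qquad
G^{\pi^k}(t,x) = \sum_{i=1}^{n}\gamma^{\pi^k}_i\,\frac{|x_i - v^{i;\pi^k}|^2}{2},
\tag{28}
$$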
for some given weights γπk such that
γπk=(γπk1,…,γπkn)T. |
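As a small illustration of the cost structure in (28), the sketch below evaluates the running and terminal costs for given vectors. The function names `running_cost` and `terminal_cost` are hypothetical, and the boundary values `v` and the weights `gamma` are passed as plain lists for one fixed regime.

```python
def running_cost(x, a, v, gamma):
    """Running cost of (28): sum_i gamma_i |x_i - v_i|^2 / 2 + |a_i|^2 / 2."""
    return sum(
        g * (xi - vi) ** 2 / 2 + ai ** 2 / 2
        for xi, ai, vi, g in zip(x, a, v, gamma)
    )


def terminal_cost(x, v, gamma):
    """Terminal cost of (28): sum_i gamma_i |x_i - v_i|^2 / 2."""
    return sum(g * (xi - vi) ** 2 / 2 for xi, vi, g in zip(x, v, gamma))
```

A larger weight in `gamma` makes the distance of the corresponding node from its boundary count more, while the control effort is always penalized with unit weight.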
Remark 3.1. From a financial perspective, after converting the minimization problem into a maximization one, the above cost functional can be interpreted as the objective of a financial supervisor, such as the one introduced in [4,8], who lends money to each node (e.g., a bank, a financial player, an institution, etc.) in the system in order to keep it away from the corresponding (default) boundary. Continuing the financial interpretation, different weights γ can be used to assign a relative importance to each node. This allows one to establish a hierarchy of (financial) relevance within the system, resulting in a priority scale related to the systemic (monetary) importance of each node. For example, a systematic procedure to obtain the overall importance of any node in a financial network has been derived in [8].
In what follows, we derive a set of Riccati BSDEs that provide the global optimal control in feedback form. For the sake of notational clarity, we denote by Xk;−k(t) the dynamics when only the k−th node is left. Similarly, Xk;−(k,l)(t), resp. Xl;−(k,l)(t), denotes the evolution of the node k, resp. of the node l, when only the pair (k,l) survives. Analogously, we will use a component-wise notation, namely Xi;−k will denote the i−th component of the n−dimensional vector X−k. With this notation, we have the following result.
Theorem 3.2. The optimal control problem (27), with associated costs given by (28), has an optimal feedback control solution given by
$$
\bar\alpha(t) = P(t)X(t) - \varphi(t),
$$
where P and φ are defined as follows
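$$
\begin{aligned}
P(t) &= P^{0}(t)\,\mathbb{1}_{\{t<\hat\tau^{1}\}} + \sum_{k=1}^{n-1}\sum_{\pi^k\in C_{n,k}} P^{\pi^k}(t)\,\mathbb{1}_{\{\tau^{\pi^k}<t<\hat\tau^{k+1}\}}, \\
\varphi(t) &= \varphi^{0}(t)\,\mathbb{1}_{\{t<\hat\tau^{1}\}} + \sum_{k=1}^{n-1}\sum_{\pi^k\in C_{n,k}} \varphi^{\pi^k}(t)\,\mathbb{1}_{\{\tau^{\pi^k}<t<\hat\tau^{k+1}\}},
\end{aligned}
\tag{29}
$$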
Pπk and φπk being solution to the following recursive system of Riccati BSDEs
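$$
\begin{cases}
-dP^{\pi^{n-1}}(t) = \big((P^{\pi^{n-1}}(t))^2 + (\sigma^{\pi^{n-1}}(t))^2 P^{\pi^{n-1}}(t) + 2Z^{\pi^{n-1};P}(t)\,\sigma^{\pi^{n-1}}(t) - 1\big)\,dt - Z^{\pi^{n-1};P}(t)\,dW^{\pi^{n-1}}(t), \\
P^{\pi^{n-1}}(\hat\tau^{n}) = 1,
\end{cases}
$$

$$
\begin{cases}
-d\varphi^{\pi^{n-1}}(t) = \big((P^{\pi^{n-1}}(t) - \mu^{\pi^{n-1}}(t))\,\varphi^{\pi^{n-1}}(t) + \sigma^{\pi^{n-1}}(t)\,Z^{\pi^{n-1};\varphi}(t) - h^{\pi^{n-1}}(P(t),v(t))\big)\,dt - Z^{\pi^{n-1};\varphi}(t)\,dW^{\pi^{n-1}}(t), \\
\varphi^{\pi^{n-1}}(\hat\tau^{n}) = -v^{\pi^{n-1}}(\hat\tau^{n}),
\end{cases}
$$

$$
\begin{cases}
-dP^{\pi^{k}}(t) = \big(-(P^{\pi^{k}}(t))^2 + (\sigma^{\pi^{k}})^2 P^{\pi^{k}}(t)\big)\,dt + \big(Z^{\pi^{k};P}_{j}(t)\,\sigma^{\pi^{k}}(t) - \gamma^{\pi^{k}}\big)\,dt - Z^{\pi^{k};P}(t)\,dW^{\pi^{k}}(t), \\
P^{\pi^{k}}(\hat\tau^{n-1}) = \gamma^{\pi^{k}} - \sum_{\pi^{k+1}\in C_{n,k+1}} P^{\pi^{k+1}}(\hat\tau^{n-1})\,\mathbb{1}_{\{\hat\tau^{k+1}=\tau^{\pi^{k+1}}\}},
\end{cases}
$$

$$
\begin{cases}
-d\varphi^{\pi^{k}}(t) = \big((\mu^{\pi^{k}}(t) - P^{\pi^{k}}(t))\,\varphi^{\pi^{k}}(t) + \sigma^{\pi^{k}}(t)\,Z^{\pi^{k};\varphi}(t)\big)\,dt - h^{\pi^{k}}(P^{\pi^{k}}(t),v^{\pi^{k}}(t))\,dt - Z^{\pi^{k};\varphi}(t)\,dW^{\pi^{k}}(t), \\
\varphi(\hat\tau^{n-1}) = -\gamma\,v^{\hat\tau^{n-1}}(\hat\tau^{n-1}) + \sum_{\pi^{k+1}\in C_{n,k+1}} \varphi^{\pi^{k+1}}(\hat\tau^{n-1})\,\mathbb{1}_{\{\hat\tau^{k+1}=\tau^{\pi^{k+1}}\}},
\end{cases}
$$

$$
\begin{cases}
-dP^{0}(t) = \big(-(P^{0}(t))^2 + (\sigma^{0}(t))^2 P^{0}(t) + Z^{0;P}(t)\,\sigma^{0}(t) - \gamma^{0}\big)\,dt - Z^{0;P}(t)\,dW(t), \\
P^{0}(\hat\tau^{1}) = \gamma^{0} - \sum_{\pi\in C_{n,1}} P^{1}(\hat\tau^{1})\,\mathbb{1}_{\{\hat\tau^{1}=\tau^{\pi}\}},
\end{cases}
$$

$$
\begin{cases}
-d\varphi^{0}(t) = \big((\mu^{0}(t) - P^{0}(t))\,\varphi^{0}(t) + \sigma^{0}(t)\,Z^{0;\varphi}(t) - \gamma^{0}v^{0}(t)\big)\,dt - Z^{0;\varphi}(t)\,dW(t), \\
\varphi^{0}(\hat\tau^{1}) = \gamma^{0}v^{0}(\hat\tau^{1}) - \sum_{\pi\in C_{n,1}} \varphi^{1}(\hat\tau^{1})\,\mathbb{1}_{\{\hat\tau^{1}=\tau^{\pi}\}}.
\end{cases}
$$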
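The indicator sums in (29) simply select, along each realized path, the pair of coefficients belonging to the regime that is currently active. A minimal sketch of this selection follows. It assumes that the ordered exit times of one realized path are already known and that the coefficient functions of the single combination active in each regime have already been computed; the name `piecewise_coefficients` and its arguments are hypothetical.

```python
from bisect import bisect_left


def piecewise_coefficients(t, ordered_exit_times, P_regimes, phi_regimes):
    """Return (P(t), phi(t)) assembled as in (29) along one realized path.

    ordered_exit_times: increasing list of realized exit times.
    P_regimes, phi_regimes: lists of callables t -> float; entry k is the
        coefficient of the regime active after k exits (assumed precomputed).
    The indicators in (29) are strict, so the exit times themselves form a
    null set; here they are attributed to the earlier regime.
    """
    # Number of exit times strictly smaller than t gives the regime index.
    k = bisect_left(ordered_exit_times, t)
    k = min(k, len(P_regimes) - 1)
    return P_regimes[k](t), phi_regimes[k](t)
```

For instance, with two realized exit times the function switches between three regimes, one before the first exit, one between the two exits, and one after the second.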
Proof. Let us thus first consider the last control problem, recalling that H−k(t,x,a,y,z) is the generalized Hamiltonian defined in (14), where B−k, resp. Σ−k, resp. L−k, is given in Eq (25), resp. Eq (26), resp. Eq (28). An application of the stochastic maximum principle, see Theorems 2.8–2.11, leads us to consider the following adjoint BSDE
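$$
Y^{-k}(t) = \partial_x G^{-k}(X^{-k}(\hat\tau^{n})) + \int_t^{\hat\tau^{n}} \partial_x H^{-k}(X^{-k}(s),\alpha^{-k}(s),Y^{-k}(s),Z^{-k}(s))\,ds - \int_t^{\hat\tau^{n}} Z^{-k}(s)\,dW(s), \qquad t\in[0,\hat\tau^{n}],
\tag{30}
$$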
Y−k being a n−dimensional vector, whereas Z−k is a n×n matrix whose (i,j)−entry is denoted by Z−ki,j. Then, considering the particular form for B−k(t,x,a), Σ−k(t,x), L−k(t,x,a) and G−k(t,x), in Eqs (25), (26) and (28), we have
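$$
\partial_{x_k}H^{-k}(t,x,a,y,z) = \mu^{k;-k}(t)\,y_k + \sigma^{k;-k}(t)\,z_{k,k} + \gamma^{-k}_k\,(x_k - v^{k;-k}),
\qquad
\partial_{x_k}G^{-k}(t,x) = \gamma^{-k}_k\,(x_k - v^{k;-k}),
$$
and
$$
\partial_{x_i}H^{-k}(t,x,a,y,z) = 0 = \partial_{x_i}G^{-k}(t,x), \qquad \text{if } i\neq k,
$$
where ∂xi denotes the derivative w.r.t. the i−th component of x∈Rn.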
Thus we have that the k−th component of the BSDE (30) now reads
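$$
\begin{aligned}
Y^{k;-k}(t) &= \gamma^{-k}_k X^{k;-k}(\hat\tau^{n}) - \gamma^{-k}_k v^{k;-k}(\hat\tau^{n}) \\
&\quad + \int_t^{\hat\tau^{n}} \big(\mu^{k;-k}(s)\,Y^{k;-k}(s) + \sigma^{k;-k}(s)\,Z^{-k}_{k,k}(s) + \gamma^{-k}_k X^{k;-k}(s) - \gamma^{-k}_k v^{k;-k}(s)\big)\,ds \\
&\quad - \int_t^{\hat\tau^{n}} Z^{-k}_{k,k}(s)\,dW^{k}(s), \qquad t\in[0,\hat\tau^{n}].
\end{aligned}
\tag{31}
$$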
Analogously, we have that the second last control problem is associated to the following system of BSDEs
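$$
\begin{aligned}
Y^{i;-(k,l)}(t) &= \gamma^{-(k,l)}_i X^{i;-(k,l)}(\hat\tau^{n-1}) - \gamma^{-(k,l)} v^{i;-(k,l)}(\tau^{\pi^{n-1}}) + \bar Y^{i;n-1}(\hat\tau^{n-1}) \\
&\quad + \int_t^{\hat\tau^{n-1}} \Big(\mu^{i;-(k,l)}(s)\,Y^{i;-(k,l)}(s) + \sum_{j=k}^{l}\sigma^{j;-(k,l)}(s)\,Z^{-(k,l)}_{j,j}(s)\Big)\,ds \\
&\quad + \int_t^{\hat\tau^{n-1}} \big(\gamma^{-(k,l)}_i X^{i;-(k,l)}(s) - \gamma^{-(k,l)}_i v^{i;-(k,l)}(s)\big)\,ds \\
&\quad - \sum_{j=k}^{l}\int_t^{\hat\tau^{n-1}} Z^{-(k,l)}_{i,j}(s)\,dW^{j}(s), \qquad t\in[0,\tau^{\pi^{n-1}}],\quad i=k,l,
\end{aligned}
\tag{32}
$$
and so on for any πk, until we reach the first control problem, which is associated with the following system of BSDEs
$$
\begin{aligned}
Y^{i;0}(t) &= \gamma^{0}_i X^{i;0}(\hat\tau^{1}) - \gamma^{0}_i v^{0}(\hat\tau^{1}) + \bar Y^{i;1}(\hat\tau^{1}) \\
&\quad + \int_t^{\hat\tau^{1}} \Big(\mu^{i;0}(s)\,Y^{i;0}(s) + \sum_{j=1}^{n}\sigma^{j;0}(s)\,Z^{0}_{j,j}(s) + \gamma^{0}_i X^{i;0}(s) - \gamma^{0}_i v^{0}(s)\Big)\,ds \\
&\quad - \sum_{j=1}^{n}\int_t^{\hat\tau^{1}} Z^{0}_{i,j}(s)\,dW^{j}(s), \qquad t\in[0,\tau^{0}],\quad i=1,\dots,n.
\end{aligned}
\tag{33}
$$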
Therefore, for t∈[0,ˆτn], we are left with the minimization problem for
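$$
J(x,t) := \mathbb{E}_t\int_t^{\hat\tau^{n}} \Big(|X^{k;-k}(s) - v^{k;-k}(s)|^2 + \frac12\,|\alpha^{k;-k}(s)|^2\Big)\,ds + \mathbb{E}_t\,|X^{k;-k}(\hat\tau^{n}) - v^{k;-k}(\hat\tau^{n})|^2.
$$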
Exploiting Theorem 2.8, we have that, on the interval [τπn−1,ˆτn], the above control problem is associated to the following forward–backward system
$$
\begin{cases}
dX^{k;-k}(t) = \big(\mu^{k;-k}(t)X^{k;-k}(t) + b^{k;-k}(t) + \alpha^{k;-k}(t)\big)\,dt + \big(\sigma^{k;-k}(t)X^{k;-k}(t) + \nu^{k;-k}(t)\big)\,dW^{k}(t), \\
X^{k;-k}(\tau^{\pi^{n-1}}) = X^{k;n-1}(\tau^{\pi^{n-1}}), \\
-dY^{k;-k}(t) = \big(\mu^{k;-k}(t)Y^{k;-k}(t) + \sigma^{k;-k}(t)Z^{-k}_{k,k}(t) + X^{k;-k}(t) - v^{k;-k}(t)\big)\,dt - Z^{-k}_{k,k}(t)\,dW^{k}(t), \\
Y^{k;-k}(\hat\tau^{n}) = X^{k;-k}(\hat\tau^{n}) - v^{k;-k}(\hat\tau^{n}).
\end{cases}
\tag{34}
$$
In what follows, for the sake of brevity, we will drop the index (k;−k). Therefore, unless otherwise specified, we will write X instead of Xk;−k, and similarly for all the other coefficients. We also recall that system (34) has to be solved for any k=1,…,n.
We thus guess the solution of the backward component Y in Eq (34) to be of the form
−Y(t)=P(t)X(t)−φ(t), | (35) |
for P and φ two R−valued processes to be determined.
Notice that in standard cases, that is when the coefficients are not random or the terminal time is deterministic, P and φ solve a backward ODE, while in the present case, because of the terminal time randomness, P and φ will solve a BSDE.
Let us thus assume that (P(t),ZP(t)) is the solution to
−dP(t)=FP(t)dt−ZP(t)dW(t),P(ˆτn)=1, | (36) |
and that (φ(t),Zφ(t)) solves
$$
-d\varphi(t) = F^{\varphi}(t)\,dt - Z^{\varphi}(t)\,dW(t), \qquad \varphi(\hat\tau^{n}) = -v(\hat\tau^{n}).
\tag{37}
$$
From the first order condition, namely ∂aH(t,x,a,y,z)=0, we have that the optimal control is given by
ˉα=−Y(t)=P(t)X(t)−φ(t). | (38) |
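As a worked check of the first order condition, the following sketch assumes that the generalized Hamiltonian of (14), which is not restated in this section, has the usual form "drift times y, plus volatility times z, plus running cost". It is written for the scalar problem with the index dropped, unit weight, and the factor 1/2 of (28); these simplifications are assumptions made only for illustration.

```latex
\begin{aligned}
H(t,x,a,y,z) &= \big(\mu(t)x + b(t) + a\big)\,y + \big(\sigma(t)x + \nu(t)\big)\,z
               + \tfrac12\,|x - v(t)|^2 + \tfrac12\,a^2, \\
\partial_a H(t,x,a,y,z) &= y + a = 0 \;\Longrightarrow\; \bar\alpha(t) = -Y(t), \\
\partial_x H(t,x,a,y,z) &= \mu(t)\,y + \sigma(t)\,z + x - v(t).
\end{aligned}
```

Under this assumption, the last line reproduces the drift of the backward equation in (34), and the second line combined with the ansatz (35) gives (38).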
An application of Itô formula yields
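$$
\begin{aligned}
&\big(\mu(t)Y(t) + \sigma(t)Z(t) + X(t) - v(t)\big)\,dt - Z(t)\,dW(t) = -dY(t) = d\big(P(t)X(t)\big) - d\varphi(t) \\
&\quad = \big(-F^{P}(t)X(t) + P(t)\mu(t)X(t) + P(t)\alpha(t) + Z^{P}(t)\sigma(t)X(t) + Z^{P}(t)\nu(t) + P(t)b(t) + F^{\varphi}(t)\big)\,dt \\
&\qquad + \big(Z^{P}(t)X(t) + P(t)\sigma(t)X(t) + P(t)\nu(t) - Z^{\varphi}(t)\big)\,dW(t) \\
&\quad = \big(-F^{P}(t) + P(t)\mu(t) + Z^{P}(t)\sigma(t)\big)X(t)\,dt + P(t)\alpha(t)\,dt + \big(Z^{P}(t)\nu(t) + P(t)b(t) + F^{\varphi}(t)\big)\,dt \\
&\qquad + \big(Z^{P}(t) + P(t)\sigma(t)\big)X(t)\,dW(t) + \big(P(t)\nu(t) - Z^{\varphi}(t)\big)\,dW(t).
\end{aligned}
\tag{39}
$$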
Therefore, equating the left hand side and the right hand side of Eq (39), we derive
−Z(t)=(ZP(t)+P(t)σ(t))X(t)+(P(t)ν(t)−Zφ(t)), | (40) |
moreover, by substituting Eq (40) into the left hand side of Eq (39), exploiting the first order optimality condition (38), and equating again the left hand side and the right hand side of Eq (39), we obtain
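$$
\begin{aligned}
&\big(\mu(t)P(t) - \sigma(t)Z^{P}(t) - \sigma^2(t)P(t) + 1\big)X(t) - \big(\mu(t)\varphi(t) + \sigma(t)P(t)\nu(t) - \sigma(t)Z^{\varphi}(t) + v(t)\big) \\
&\quad = \big(-F^{P}(t) + P(t)\mu(t) + Z^{P}(t)\sigma(t) + P^2(t)\big)X(t) + \big(Z^{P}(t)\nu(t) + P(t)b(t) + F^{\varphi}(t) - P(t)\varphi(t)\big).
\end{aligned}
\tag{41}
$$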
Since Eq (41) has to hold for any X(t), we have
μ(t)P(t)−σ(t)ZP(t)−σ2(t)P(t)+1=−FP(t)+P(t)μ(t)+ZP(t)σ(t)+P(t)2, | (42) |
which, after some computations, leads to
FP(t)=P(t)2+σ2(t)P(t)+2ZP(t)σ(t)−1. | (43) |
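The computation behind (43) is short. The term P(t)μ(t) appears on both sides of (42) and cancels, so that, omitting the time argument and writing F^P, Z^P for the generator and the martingale integrand of P, one is left with

```latex
\begin{aligned}
-\sigma Z^{P} - \sigma^{2} P + 1 &= -F^{P} + Z^{P}\sigma + P^{2}, \\
F^{P} &= P^{2} + \sigma^{2} P + 2\,Z^{P}\sigma - 1 .
\end{aligned}
```

The factor 2 in front of Z^P σ thus collects one contribution from the substitution of (40) and one from the covariation term in the Itô formula.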
Similarly, we also have that
Fφ(t)=(P(t)−μ(t))φ(t)+σ(t)Zφ(t)−v(t)−σ(t)ν(t)P(t)−ZP(t)ν(t)−P(t)b(t), | (44) |
hence, substituting the generators FP and Fφ given in Eqs (43) and (44) into Eqs (36) and (37), respectively, and reintroducing, for the sake of clarity, the index k, the last optimal control ˉαk;−k(t) reads as follows
$$
\bar\alpha^{k;-k}(t) = P^{k;-k}(t)X^{k;-k}(t) - \varphi^{k;-k}(t),
$$
Pk;−k(t) and φk;−k(t) being solutions to the BSDEs
$$
\begin{cases}
-dP^{k;-k}(t) = \big((P^{k;-k}(t))^2 + (\sigma^{k;-k}(t))^2 P^{k;-k}(t) + 2Z^{-k;P}_{k,k}(t)\,\sigma^{k;-k}(t) - 1\big)\,dt - Z^{-k;P}_{k,k}(t)\,dW^{k}(t), \\
P^{k;-k}(\hat\tau^{n}) = 1,
\end{cases}
\tag{45}
$$
$$
\begin{cases}
-d\varphi^{k;-k}(t) = \big((P^{k;-k}(t) - \mu^{k;-k}(t))\,\varphi^{k;-k}(t) + \sigma^{k;-k}(t)\,Z^{-k;\varphi}_{k,k}(t) - h^{k;-k}(P(t),v(t))\big)\,dt - Z^{-k;\varphi}_{k,k}(t)\,dW^{k}(t), \\
\varphi^{k;-k}(\hat\tau^{n}) = -v^{k;-k}(\hat\tau^{n}),
\end{cases}
\tag{46}
$$
where we have introduced the function
hk;−k(P(t),v(t)):=v(t)+σ(t)ν(t)P(t)+ZP(t)ν(t)+P(t)b(t). |
Notice that, by Eq (46), φ solves a BSDE with linear generator, so that its solution is explicitly given by
$$
\varphi^{k;-k}(t) = -\Gamma^{-1}(t)\,\mathbb{E}_t\Big[\Gamma(\hat\tau^{n})\,v^{k;-k}(\hat\tau^{n}) - \int_t^{\hat\tau^{n}} \Gamma(s)\,h^{k;-k}(P(s),v(s))\,ds\Big],
$$
where Γ solves
dΓ(t)=Γ(t)[(Pk;−k(t)−μk;−k(t))dt+σk;−k(t)dW(t)],Γ(0)=1. |
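The process Γ above is a stochastic exponential, hence strictly positive. The sketch below simulates it through its logarithm, under the simplifying assumption that the drift coefficient, i.e. the difference P − μ, and the volatility σ are given deterministic functions of time (in the paper P is itself a stochastic process). The function name `gamma_path` and its arguments are hypothetical.

```python
import math
import random


def gamma_path(drift, sigma, T=1.0, steps=1000, seed=0):
    """Simulate dGamma = Gamma [drift(t) dt + sigma(t) dW], Gamma(0) = 1.

    drift: callable t -> float, standing in for P(t) - mu(t) (assumed deterministic).
    sigma: callable t -> float.
    The scheme updates log(Gamma) with the Ito correction -sigma^2/2, so that
    positivity is preserved at every step.
    """
    rng = random.Random(seed)
    dt = T / steps
    log_gamma = 0.0
    path = [1.0]
    for step in range(steps):
        t = step * dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        log_gamma += (drift(t) - 0.5 * sigma(t) ** 2) * dt + sigma(t) * dw
        path.append(math.exp(log_gamma))
    return path
```

With zero volatility the path reduces to the deterministic exponential of the integrated drift, which is a convenient check of the implementation.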
Moreover, by [26,Th. 5.2,Th. 5.3], it follows that Eq (45) admits a unique adapted solution on [0,ˆτn]. Therefore, iterating the above analysis for any k=1,…,n, we obtain the optimal solution to the last control problem. Having solved the last control problem, we can consider the second last control problem. Assuming, with no loss of generality, that nodes (k,l) are left, all subsequent computations have to be carried out for every possible pair k=1,…,n, l=k+1,…,n.
By Theorem 2.8, the optimal pair (ˉXi,ˉαi), i=k,l, satisfies, component–wise, the following forward–backward system for i=k,l,
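$$
\begin{cases}
dX^{i;-(k,l)}(t) = \big(\mu^{i;-(k,l)}(t)X^{i;-(k,l)}(t) + b^{i;-(k,l)}(t) + \alpha^{i;-(k,l)}(t)\big)\,dt + \big(\sigma^{i;-(k,l)}(t)X^{i;-(k,l)}(t) + \nu^{i;-(k,l)}(t)\big)\,dW^{i}(t), \\
X^{i;-(k,l)}(\tau^{\pi^{n-2}}) = X^{i;n-2}(\tau^{\pi^{n-2}}), \\
-dY^{i;-(k,l)}(t) = \big(\mu^{i;-(k,l)}(t)Y^{i;-(k,l)}(t) + \sigma^{i;-(k,l)}Z^{i;-(k,l)}_{i,i}(t)\big)\,dt + \big(\gamma^{i;-(k,l)}X^{i;-(k,l)}(t) - \gamma^{i;-(k,l)}v^{i;-(k,l)}(t)\big)\,dt - \sum_{j=k}^{l} Z^{i;-(k,l)}_{i,j}(t)\,dW^{j}(t), \\
Y^{i;-(k,l)}(\hat\tau^{n-1}) = \gamma^{i;-(k,l)}X^{i;-(k,l)}(\hat\tau^{n-1}) - \gamma^{i;-(k,l)}v^{i;-(k,l)}(\hat\tau^{n-1}) + \bar Y^{k;n-1}(\hat\tau^{n-1});
\end{cases}
\tag{47}
$$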
in what follows we will denote by Zj the j−th n−dimensional column of Z in Eq (32). Note that the only non-null entries of Z are Zi,j, for i,j=k,l. Also, for the sake of simplicity, we will write Xi, i=k,l, instead of Xi;−(k,l), i=k,l.
Following the same method used earlier, we again guess the solution of the backward component Yi to be of the form
−Yi(t)=Pi(t)Xi(t)−φi(t),i=k,l, | (48) |
for Pi and φi, i=k,l, R−valued processes to be determined.
Because of the particular form of Eq (47), the i−th component of the BSDE for Y depends only on the i−th component of the forward SDE for X. Hence the matrix P has null entries off the main diagonal, namely it has the form
$$
P(t) = \operatorname{diag}\big(0,\dots,0,\,P^{k}(t),\,0,\dots,0,\,P^{l}(t),\,0,\dots,0\big),
$$
similarly for φ.
Let us assume that (Pi(t),Zi;P(t)), i=k,l solves
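$$
-dP^{i}(t) = F^{i;P}(t)\,dt - \sum_{j=k}^{l} Z^{i;P}_{j}(t)\,dW^{j}(t), \qquad
P^{i}(\hat\tau^{n-1}) = \gamma^{i} - P^{i}(\hat\tau^{n-1})\,\mathbb{1}_{\{\hat\tau^{n-1}=\tau^{-i}\}},
$$
and that (φi(t),Zi;φ(t)) solves
$$
-d\varphi^{i}(t) = F^{i;\varphi}(t)\,dt - \sum_{j=k}^{l} Z^{i;\varphi}_{j}(t)\,dW^{j}(t), \qquad
\varphi^{i}(\hat\tau^{n-1}) = -\gamma^{i}v^{i}(\hat\tau^{n-1}) + \varphi^{i}(\hat\tau^{n-1})\,\mathbb{1}_{\{\hat\tau^{n-1}=\tau^{-i}\}}.
$$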
From the first order condition we have that the optimal control is of the form
ˉαi=−Yi(t)=Pi(t)Xi(t)−φi(t). | (49) |
Then, again applying the Itô formula, we have
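$$
\begin{aligned}
&\big(\mu^{i}(t)\varphi^{i}(t) - \mu^{i}(t)P^{i}(t)X^{i}(t) + \sigma^{i}(t)Z^{i}_{i}(t) + \gamma^{i}X^{i}(t) - \gamma^{i}v^{i}(t)\big)\,dt - \sum_{j=k}^{l} Z^{i}_{j}(t)\,dW^{j}(t) \\
&\quad = -dY^{i}(t) = d\big(P^{i}(t)X^{i}(t)\big) - d\varphi^{i}(t) \\
&\quad = \big(-F^{i;P}(t)X^{i}(t) + P^{i}(t)\mu^{i}(t)X^{i}(t) + P^{i}(t)b^{i}(t) + P^{i}(t)\alpha^{i}(t) + F^{i;\varphi}(t)\big)\,dt \\
&\qquad + \sum_{j=k}^{l}\big(Z^{i;P}_{j}(t)\rho_{ij}\sigma^{i}(t)X^{i}(t) + Z^{i;P}_{j}(t)\rho_{ij}\nu^{i}(t)\big)\,dt
 + \sum_{j=k}^{l} Z^{i;P}_{j}(t)X^{i}(t)\,dW^{j}(t) \\
&\qquad + P^{i}(t)\big(\sigma^{i}(t)X^{i}(t) + \nu^{i}(t)\big)\,dW^{i}(t) - \sum_{j=k}^{l} Z^{i;\varphi}_{j}(t)\,dW^{j}(t) \\
&\quad = \Big(-F^{i;P}(t) + P^{i}(t)\mu^{i}(t) + \sum_{j=k}^{l} Z^{i;P}_{j}(t)\rho_{ij}\sigma^{i}(t)\Big)X^{i}(t)\,dt + P^{i}(t)\alpha^{i}(t)\,dt \\
&\qquad + F^{i;\varphi}(t)\,dt + \sum_{j=k}^{l} Z^{i;P}_{j}(t)\rho_{ij}\nu^{i}(t)\,dt + P^{i}(t)b^{i}(t)\,dt \\
&\qquad + \big(Z^{i;P}_{i}(t) + P^{i}(t)\sigma^{i}(t)\big)X^{i}(t)\,dW^{i}(t) + \sum_{\substack{j=k \\ j\neq i}}^{l} Z^{i;P}_{j}(t)X^{i}(t)\,dW^{j}(t) \\
&\qquad + P^{i}(t)\nu^{i}(t)\,dW^{i}(t) - \sum_{j=k}^{l} Z^{i;\varphi}_{j}(t)\,dW^{j}(t).
\end{aligned}
\tag{50}
$$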
Thus, substituting Eq (49) into Eq (50), and proceeding as for (42), we have
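$$
F^{i;P}(t) = -(P^{i}(t))^2 + (\sigma^{i}(t))^2 P^{i}(t) + \sum_{j=k}^{l} Z^{i;P}_{j}(t)\,\ell_{ij}\,\sigma^{i}(t) - \gamma^{i},
\tag{51}
$$
with
$$
\ell_{ij} :=
\begin{cases}
\rho_{ij}, & i\neq j, \\
2, & i=j,
\end{cases}
$$
together with
$$
F^{i;\varphi}(t) = \big(\mu^{i}(t) - P^{i}(t)\big)\varphi^{i}(t) + \sigma^{i}(t)Z^{i;\varphi}_{i}(t) - \sum_{j=k}^{l} Z^{i;P}_{j}(t)\rho_{ij}\nu^{i}(t) - P^{i}(t)\nu^{i}(t) - \gamma^{i}v^{i}(t) - \sigma^{i}(t)\nu^{i}(t)P^{i}(t).
$$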
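Returning, for the sake of clarity, to the extended notation dropped before, we have that ˉαi;−(k,l)(t), i=k,l, is given by
$$
\bar\alpha^{i;-(k,l)}(t) = P^{i;-(k,l)}(t)X^{i;-(k,l)}(t) - \varphi^{i;-(k,l)}(t),
$$
where Pi;−(k,l) and φi;−(k,l) are solutions, for i=k,l, to the BSDEs
$$
\begin{cases}
-dP^{i;-(k,l)}(t) = \big(-(P^{i;-(k,l)}(t))^2 + (\sigma^{i;-(k,l)}(t))^2 P^{i;-(k,l)}(t)\big)\,dt + \Big(\sum_{j=k}^{l} Z^{-(k,l);P}_{j}(t)\,\ell_{ij}\,\sigma^{i;-(k,l)}(t) - \gamma^{i;-(k,l)}\Big)\,dt - Z^{-(k,l);P}_{i,i}(t)\,dW^{i}(t), \\
P^{i;-(k,l)}(\hat\tau^{n-1}) = \gamma^{i;-(k,l)} - P^{i,-i}(\hat\tau^{n-1})\,\mathbb{1}_{\{\hat\tau^{n-1}=\tau^{-i}\}},
\end{cases}
\tag{52}
$$
$$
\begin{cases}
-d\varphi^{i;-(k,l)}(t) = \big((\mu^{i;-(k,l)}(t) - P^{i;-(k,l)}(t))\,\varphi^{i;-(k,l)}(t) + \sigma^{i;-(k,l)}(t)\,Z^{-(k,l);\varphi}_{i,i}(t)\big)\,dt - h^{i;-(k,l)}(P^{i;-(k,l)}(t),v^{i;-(k,l)}(t))\,dt - Z^{-(k,l);\varphi}_{i,i}(t)\,dW^{i}(t), \\
\varphi^{i;-(k,l)}(\hat\tau^{n-1}) = -\gamma^{i;-(k,l)}v^{i;-(k,l)}(\hat\tau^{n-1}) + \varphi^{i,-i}(\hat\tau^{n-1})\,\mathbb{1}_{\{\hat\tau^{n-1}=\tau^{-i}\}},
\end{cases}
\tag{53}
$$
with
$$
h^{i;-(k,l)}(P^{i;-(k,l)}(t),v^{i;-(k,l)}(t)) = \sum_{j=k}^{l} Z^{i;P}_{j}(t)\rho_{ij}\nu^{i;-(k,l)}(t) + P^{i;-(k,l)}(t)\nu^{i;-(k,l)}(t) + \gamma^{i;-(k,l)}v^{i;-(k,l)}(t) + \sigma^{i}(t)\nu^{i;-(k,l)}(t)P^{i;-(k,l)}(t).
$$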
Let us underline that Eqs (52) and (53) have to be solved for any possible couple k=1,…,n, l=k+1,…,n. As before, by the linearity of the generator of φi in Eq (53), we have
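$$
\begin{aligned}
\varphi^{i;-(k,l)}(t) &= -(\Gamma^{i}(t))^{-1}\,\mathbb{E}_t\Big[\Gamma^{i}(\hat\tau^{n-1})\big(\varphi^{i,-i}(\hat\tau^{n-1})\,\mathbb{1}_{\{\hat\tau^{n-1}=\tau^{-i}\}}\big) - \gamma^{i}v^{i}(\hat\tau^{n-1})\Big] \\
&\quad - (\Gamma^{i}(t))^{-1}\,\mathbb{E}_t\Big[\int_t^{\hat\tau^{n-1}} \Gamma^{i}(s)\,h^{i;-(k,l)}(P^{i;-(k,l)}(s),v^{i;-(k,l)}(s))\,ds\Big],
\end{aligned}
$$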
where Γi is the solution to
dΓi(t)=Γi(t)[μi(t)dt+σi(t)dWi(t)],Γi(0)=1. |
Hence, Eq (52) admits a unique adapted solution on [0,ˆτn−1], see [26,Th.5.2,Th. 5.3].
Analogously, via a backward induction, we can solve the first control problem, that is we solve, for i=1,…,n,
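$$
\begin{cases}
dX^{i;0}(t) = \big(\mu^{i;0}(t)X^{i;0}(t) + b^{i;0}(t) + \alpha^{i;0}(t)\big)\,dt + \big(\sigma^{i;0}(t)X^{i;0}(t) + \nu^{i;0}(t)\big)\,dW^{i}(t), \\
X^{i;0}(0) = x^{i}_{0}, \\
-dY^{i;0}(t) = \big(\mu^{i;0}(t)Y^{i;0}(t) + \sigma^{i;0}Z^{0}_{i,i}(t) + \gamma^{i;0}X^{i;0}(t) - \gamma^{i;0}v^{i;0}(t)\big)\,dt - \sum_{j=1}^{n} Z^{0}_{i,j}(t)\,dW^{j}(t), \\
Y^{i;0}(\hat\tau^{1}) = \gamma^{i;0}X^{i;0}(\hat\tau^{1}) - \gamma^{i;0}v^{i;0}(\hat\tau^{1}) + Y^{i;1}(\hat\tau^{1})\,\mathbb{1}_{\{\hat\tau^{1}\neq\tau^{i}\}},
\end{cases}
\tag{54}
$$
which, repeating exactly the arguments used so far, leads to an optimal control of the form
$$
\alpha^{i;0}(t) = -Y^{i;0}(t) = P^{i;0}(t)X^{i;0}(t) - \varphi^{i;0}(t),
$$
and
$$
\begin{cases}
-dP^{i;0}(t) = \Big(-(P^{i;0}(t))^2 + (\sigma^{i;0}(t))^2 P^{i;0}(t) + \sum_{j=1}^{n} Z^{0;P}_{j}(t)\,\ell_{ij}\,\sigma^{i;0}(t) - \gamma^{i;0}\Big)\,dt - Z^{0;P}_{i,i}(t)\,dW^{i}(t), \\
P^{i;0}(\hat\tau^{1}) = \gamma^{i;0} - P^{i;1}(\hat\tau^{1})\,\mathbb{1}_{\{\hat\tau^{1}\neq\tau^{i}\}},
\end{cases}
\tag{55}
$$
$$
\begin{cases}
-d\varphi^{i;0}(t) = \big((\mu^{i;0}(t) - P^{i;0}(t))\,\varphi^{i;0}(t) + \sigma^{i;0}(t)\,Z^{0;\varphi}_{i,i}(t) - \gamma^{i;0}v^{i;0}(t)\big)\,dt - Z^{0;\varphi}_{i,i}(t)\,dW^{i}(t), \\
\varphi^{i;0}(\hat\tau^{1}) = \varphi^{i;1}(\hat\tau^{1})\,\mathbb{1}_{\{\hat\tau^{1}\neq\tau^{i}\}} - \gamma^{i;0}v^{i;0}(\hat\tau^{1}),
\end{cases}
\tag{56}
$$
with
$$
h^{i;0}(P^{i;0}(t),v^{i;0}(t)) = \sum_{j=1}^{n} Z^{i;P}_{j}(t)\rho_{ij}\nu^{i;0}(t) + P^{i;0}(t)\nu^{i;0}(t) + \gamma^{i;0}v^{i;0}(t) + \sigma^{i;0}(t)\nu^{i;0}(t)P^{i;0}(t),
$$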
which concludes the proof.
[1] Barbu V, Cordoni F, Di Persio L (2016) Optimal control of stochastic FitzHugh-Nagumo equation. Int J Control 89: 746-756. doi: 10.1080/00207179.2015.1096023
[2] Bielecki TR, Jeanblanc M, Rutkowski M (2004) Modeling and Valuation of Credit Risk, In: Stochastic Methods in Finance, Berlin: Springer, 27-126.
[3] Bielecki TR, Rutkowski M (2013) Credit Risk: Modeling, Valuation and Hedging, Springer Science & Business Media.
[4] Capponi A, Chen PC (2015) Systemic risk mitigation in financial networks. J Econ Dynam Control 58: 152-166. doi: 10.1016/j.jedc.2015.06.008
[5] Cordoni F, Di Persio L (2016) A BSDE with delayed generator approach to pricing under counterparty risk and collateralization. Int J Stoch Anal 2016: 1-10.
[6] Cordoni F, Di Persio L (2017) Gaussian estimates on networks with dynamic stochastic boundary conditions. Infin Dimens Anal Qu 20: 1750001. doi: 10.1142/S0219025717500011
[7] Cordoni F, Di Persio L (2017) Stochastic reaction-diffusion equations on networks with dynamic time-delayed boundary conditions. J Math Anal Appl 451: 583-603. doi: 10.1016/j.jmaa.2017.02.008
[8] Cordoni F, Di Persio L, Prezioso L. A lending scheme for a system of interconnected banks with probabilistic constraints of failure. Available from: https://arxiv.org/abs/1903.06042.
[9] Di Persio L, Ziglio G (2011) Gaussian estimates on networks with applications to optimal control. Net Het Media 6: 279-296.
[10] Eisenberg L, Noe TH (2001) Systemic risk in financial systems. Manage Sci 47: 236-249. doi: 10.1287/mnsc.47.2.236.9835
[11] El Karoui N, Jeanblanc M, Jiao Y (2010) What happens after a default: The conditional density approach. Stoch Proc Appl 120: 1011-1032. doi: 10.1016/j.spa.2010.02.003
[12] Fleming WH, Soner HM (2006) Controlled Markov Processes and Viscosity Solutions, Springer Science & Business Media.
[13] Guatteri G, Tessitore G (2008) Backward stochastic Riccati equations and infinite horizon LQ optimal control with infinite dimensional state space and random coefficients. Appl Math Opt 57: 207-235. doi: 10.1007/s00245-007-9020-y
[14] Guatteri G, Tessitore G (2005) On the backward stochastic Riccati equation in infinite dimensions. SIAM J Control Opt 44: 159-194. doi: 10.1137/S0363012903425507
[15] Hurd TR (2015) Contagion! The Spread of Systemic Risk in Financial Networks, Springer.
[16] Kohlmann M, Zhou XY (2000) Relationship between backward stochastic differential equations and stochastic controls: A linear-quadratic approach. SIAM J Control Opt 38: 1392-1407. doi: 10.1137/S036301299834973X
[17] Kohlmann M, Tang S (2002) Global adapted solution of one-dimensional backward stochastic Riccati equations, with application to the mean-variance hedging. Stoch Proc Appl 97: 255-288. doi: 10.1016/S0304-4149(01)00133-8
[18] Jiao Y, Kharroubi I, Pham H (2013) Optimal investment under multiple defaults risk: A BSDE-decomposition approach. Ann Appl Probab 23: 455-491. doi: 10.1214/11-AAP829
[19] Lipton A (2016) Modern monetary circuit theory, stability of interconnected banking network, and balance sheet optimization for individual banks. Int J Theor Appl Financ 19: 1650034. doi: 10.1142/S0219024916500345
[20] Mansuy R, Yor M (2006) Random Times and Enlargements of Filtrations in a Brownian Setting, Berlin: Springer.
[21] Merton RC (1974) On the pricing of corporate debt: The risk structure of interest rates. J Financ 29: 449-470.
[22] Mou L, Yong J (2007) A variational formula for stochastic controls and some applications. Pure Appl Math Q 3: 539-567. doi: 10.4310/PAMQ.2007.v3.n2.a7
[23] Pham H (2010) Stochastic control under progressive enlargement of filtrations and applications to multiple defaults risk management. Stoch Proc Appl 120: 1795-1820. doi: 10.1016/j.spa.2010.05.003
[24] Pham H (2009) Continuous-Time Stochastic Control and Optimization with Financial Applications, Springer Science & Business Media.
[25] Pham H (2005) On some recent aspects of stochastic control and their applications. Probab Surv 2: 506-549. doi: 10.1214/154957805100000195
[26] Tang S (2003) General linear quadratic optimal stochastic control problems with random coefficients: Linear stochastic Hamilton systems and backward stochastic Riccati equations. SIAM J Control Opt 42: 53-75. doi: 10.1137/S0363012901387550
[27] Yong J, Zhou XY (1999) Stochastic Controls: Hamiltonian Systems and HJB Equations, Springer Science & Business Media.