
Optimal strategy analysis for adversarial differential games


  • Received: 25 May 2022 Revised: 04 July 2022 Accepted: 13 July 2022 Published: 05 August 2022
• Optimal decision-making and winning-region analysis in adversarial differential games are challenging theoretical problems because of the complex interactions between players. To address these problems, we present an organized review of pursuit-evasion games, reach-avoid games and capture-the-flag games, and outline recent developments in these three types of games. First, we summarize recent results for pursuit-evasion games and classify them according to the number of players. As a special kind of pursuit-evasion game, target-attacker-defender games with an active target are analyzed from the perspective of different speed ratios between players. Second, related works on reach-avoid games and capture-the-flag games are compared in terms of analytical methods and geometric methods, respectively. These methods have different effects on the barrier and optimal strategy analysis between players. Future directions for pursuit-evasion games, reach-avoid games, capture-the-flag games and their applications are discussed at the end.

    Citation: Jiali Wang, Xin Jin, Yang Tang. Optimal strategy analysis for adversarial differential games[J]. Electronic Research Archive, 2022, 30(10): 3692-3710. doi: 10.3934/era.2022189




Over the past decades, the use of biochemical reactors and related techniques has increased greatly because of their fruitful applications in converting biomass or cells into pharmaceutical or chemical products, such as vaccines [1], antibiotics [2], beverages [3], and industrial solvents [4]. Among the various classes and operating regimes of bioreactors, fed-batch modes have been used extensively in the biotechnological industry due to their considerable economic profits [5,6,7]. The main objective of these reactors is to achieve a given or maximum product concentration at the end of the operation, which can be accomplished by using suitable feed rates [8,9,10]. Thus, in order to ensure the economic benefit and product quality of fed-batch processes, the process control of these units is a very important topic for engineers [11,12,13].

Switched dynamical systems provide a flexible modeling framework for a variety of engineering systems, such as financial systems [14], train control systems [15], hybrid electric vehicles [16], chemical process systems [17], and biological systems [18,19,20,21]. Generally speaking, a switched dynamical system is formed by a collection of continuous-time or discrete-time subsystems together with a switching rule [22]. Four types of switching rules are commonly used: time-dependent switching [23], state-dependent switching [24], average dwell time switching [25], and minimum dwell time switching [26]. Recently, optimal control problems for switched dynamical systems have become increasingly attractive due to their significance in theory and industrial production [27,28,29,30]. Because of the discrete nature of switching rules, it is very challenging to solve switched dynamical system optimal control problems directly with classical optimal control approaches such as the maximum principle and dynamic programming [31,32,33,34]. In addition, analytical methods cannot be applied to obtain a solution of switched dynamical system optimal control problems because of their nonlinear nature [35,36,37]. Thus, in recent work, two kinds of well-known numerical optimization algorithms have been developed to obtain numerical solutions of switched dynamical system optimal control problems: the bi-level algorithm [38,39] and the embedding algorithm [40,41]. Besides these two families, many other numerical optimization algorithms have also been developed for such problems [42]. Unfortunately, most of these algorithms depend on the following assumption: the switching rule is designed by a time-dependent switching strategy, which implies that the system dynamics must be continuously differentiable with respect to the system state [43,44,45]. However, this assumption is not always reasonable, since small perturbations of the system state may cause the dynamic equations to change discontinuously. Thus, the solution obtained is usually not optimal. In addition, although these approaches have been demonstrated to be effective on many practical problems, they only produce open-loop controls [46,47,48,49,50,51,52,53]. Unfortunately, such open-loop controls are usually not robust in practice. Thus, optimal feedback controllers are increasingly popular.

In this paper, we consider an optimal feedback control problem for a class of fed-batch fermentation processes by using a switched dynamical system approach. Our main contributions are as follows. First, a dynamic optimization problem for a class of fed-batch fermentation processes is modeled as a switched dynamical system optimal control problem, and a general state-feedback controller is designed for this problem. Unlike existing works, the state-dependent switching method is applied to design the switching rule, and the structure of the state-feedback controller is not restricted to a particular form. In general, the traditional methods for obtaining an optimal feedback control require solving the well-known Hamilton-Jacobi-Bellman partial differential equation, which is very difficult even for unconstrained optimal control problems. To overcome this difficulty, the problem is transformed into a mixed-integer optimal control problem by introducing a discrete-valued function. Furthermore, each of the discrete variables is represented by a set of 0-1 variables. Then, by using a quadratic constraint, these 0-1 variables are relaxed so that they are continuous on the closed interval [0,1]. Accordingly, the original mixed-integer optimal control problem is transformed into a nonlinear parameter optimization problem, which can be solved by any gradient-based numerical optimization algorithm. Unlike existing works, the constraint introduced for the 0-1 variables is at most quadratic, so it does not increase the number of locally optimal solutions of the original problem. During the past decades, many iterative approaches have been proposed for solving nonlinear parameter optimization problems by using information about the objective function. The idea of these iterative approaches is usually to generate an iterative sequence such that the corresponding sequence of objective function values is monotonically decreasing. However, the existing algorithms have the following disadvantage: if an iterate is trapped in a curved narrow valley bottom of the objective function, then the requirement of monotonically decreasing objective values may lead to very short iterative steps, and the methods lose their efficiency. To overcome this challenge, an improved gradient-based algorithm is developed based on a novel search approach in which the sequence of objective function values is not required to be monotonically decreasing. A large number of numerical experiments shows that this novel search approach can effectively improve the convergence speed of the algorithm when an iterate is trapped in a curved narrow valley bottom of the objective function. Finally, an optimal feedback control problem of 1,3-propanediol fermentation processes is provided to illustrate the effectiveness of the method developed in this paper. Numerical simulation results show that the proposed method is less time-consuming, has faster convergence speed, and obtains a better result than the existing approaches.

The rest of this paper is organized as follows. Section 2 presents the optimal feedback control problem for a class of fed-batch fermentation processes. In Section 3, by introducing a discrete-valued function and using a relaxation technique, this problem is transformed into a nonlinear parameter optimization problem, which can be solved by any gradient-based numerical optimization algorithm. An improved gradient-based numerical optimization algorithm is developed in Section 4. In Section 5, the convergence results of this numerical optimization algorithm are established. In Section 6, an optimal feedback control problem of 1,3-propanediol fermentation processes is provided to illustrate the effectiveness of the algorithm developed in this paper.

    In this section, a general state-feedback controller is proposed for a class of fed-batch fermentation process dynamic optimization problems, which will be modeled as an optimal control problem of switched dynamical systems under state-dependent switching.

Let $\alpha_1=[\alpha_{11},\ldots,\alpha_{1r_1}]^T\in\mathbb{R}^{r_1}$ and $\alpha_2=[\alpha_{21},\ldots,\alpha_{2r_2}]^T\in\mathbb{R}^{r_2}$ be two parameter vectors satisfying

$$\underline{a}_i\leq\alpha_{1i}\leq\bar{a}_i,\quad i=1,\ldots,r_1, \qquad (2.1)$$

    and

$$\underline{b}_j\leq\alpha_{2j}\leq\bar{b}_j,\quad j=1,\ldots,r_2, \qquad (2.2)$$

respectively, where $\underline{a}_i$, $\bar{a}_i$, $i=1,\ldots,r_1$, and $\underline{b}_j$, $\bar{b}_j$, $j=1,\ldots,r_2$, denote given constants. Suppose that $t_f>0$ denotes a given terminal time. Then, a class of fed-batch fermentation process dynamic optimization problems can be described as follows: choose two parameter vectors $\alpha_1\in\mathbb{R}^{r_1}$, $\alpha_2\in\mathbb{R}^{r_2}$, and a general state-feedback controller

$$u(t)=\Upsilon(x(t),\vartheta),\quad t\in[0,t_f], \qquad (2.3)$$

    to minimize the objective function

$$J(u(t),\alpha_1,\alpha_2)=\phi(x(t_f)), \qquad (2.4)$$

    subject to the switched dynamical system under state-dependent switching

$$\begin{cases}\text{Subsystem 1}:\ \dfrac{dx(t)}{dt}=f_1(x(t),t), & \text{if } g_1(x(t),\alpha_1,t)=0,\\[2mm] \text{Subsystem 2}:\ \dfrac{dx(t)}{dt}=f_2(x(t),u(t),t), & \text{if } g_2(x(t),\alpha_2,t)=0,\end{cases}\quad t\in[0,t_f], \qquad (2.5)$$

    with the initial condition

$$x(0)=x_0, \qquad (2.6)$$

where $x(t)\in\mathbb{R}^n$ denotes the system state; $x_0$ denotes a given initial system state; $u(t)\in\mathbb{R}^m$ denotes the control input; $\vartheta=[\vartheta_1,\ldots,\vartheta_{r_3}]^T\in\mathbb{R}^{r_3}$ denotes a state-feedback parameter vector satisfying

$$\underline{c}_k\leq\vartheta_k\leq\bar{c}_k,\quad k=1,\ldots,r_3, \qquad (2.7)$$

where $\underline{c}_k$ and $\bar{c}_k$, $k=1,\ldots,r_3$, denote given constants; $\Upsilon:\mathbb{R}^n\times\mathbb{R}^{r_3}\to\mathbb{R}^m$, $\phi:\mathbb{R}^n\to\mathbb{R}$, $f_1:\mathbb{R}^n\times[0,t_f]\to\mathbb{R}^n$, $f_2:\mathbb{R}^n\times\mathbb{R}^m\times[0,t_f]\to\mathbb{R}^n$, $g_1:\mathbb{R}^n\times\mathbb{R}^{r_1}\times[0,t_f]\to\mathbb{R}^n$, and $g_2:\mathbb{R}^n\times\mathbb{R}^{r_2}\times[0,t_f]\to\mathbb{R}^n$ denote given continuously differentiable functions. For convenience, this problem is referred to as Problem 1.

Remark 1. In the switched dynamical system (2.5), Subsystem 1 represents the batch mode, during which there is no input feed (i.e., no control input $u(t)$), and Subsystem 2 represents the feeding mode, during which the input feed (i.e., control input) $u(t)$ is applied. The fed-batch fermentation process oscillates between Subsystem 1 (the batch mode) and Subsystem 2 (the feeding mode), and $g_1(x(t),\alpha_1,t)=0$ and $g_2(x(t),\alpha_2,t)=0$ represent the active conditions of Subsystems 1 and 2, respectively.
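To make the switching mechanism concrete, the following minimal Python sketch simulates a hypothetical one-dimensional instance of (2.5) in which switching is triggered by lower and upper thresholds on the state, playing the role of the active conditions $g_1=0$ and $g_2=0$; the dynamics, thresholds, and parameter values are illustrative assumptions, not the fermentation model used later in the paper.

```python
import numpy as np

# Hypothetical one-dimensional illustration of the switched system (2.5):
# the state x decays in "batch" mode and is driven up in "feeding" mode;
# switching occurs when x crosses the thresholds alpha_low/alpha_high
# (stand-ins for the active conditions g_1 = 0 and g_2 = 0).
def simulate_switched(x0=5.0, alpha_low=2.0, alpha_high=8.0,
                      t_f=10.0, dt=1e-3, k_feed=5.0):
    n_steps = int(t_f / dt)
    x, mode = x0, 1                              # mode 1 = batch, mode 2 = feeding
    traj = np.empty((n_steps, 2))
    for i in range(n_steps):
        if mode == 1 and x <= alpha_low:         # active condition of Subsystem 2
            mode = 2
        elif mode == 2 and x >= alpha_high:      # active condition of Subsystem 1
            mode = 1
        dx = -0.5 * x if mode == 1 else -0.5 * x + k_feed   # f1 vs. f2
        x += dt * dx                             # explicit Euler step
        traj[i] = (i * dt, x)
    return traj

if __name__ == "__main__":
    traj = simulate_switched()
    print("final state:", traj[-1, 1])
```

With these illustrative values the state oscillates between the two thresholds, mimicking the alternation between batch and feeding modes described in Remark 1.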

Remark 2. Note that an integral term, which measures the system running cost, can easily be incorporated into the objective function (2.4) by augmenting the switched dynamical system (2.5) with an additional state variable (see Chapter 8 of [54]). Thus, the absence of an integral term in the objective function (2.4) is not a serious restriction.

Remark 3. The structure of the general state-feedback controller (2.3) is governed by the given continuously differentiable function $\Upsilon$, and the state-feedback parameter vector $\vartheta$ is a decision variable vector to be chosen optimally. For example, the linear state-feedback controller $u(t)=Kx(t)$ is a very common choice, where $K\in\mathbb{R}^{m\times n}$ denotes a state-feedback gain matrix to be determined optimally.

In Problem 1, the state-dependent switching strategy is adopted to design the switching rule, which distinguishes it from existing switched dynamical system optimal control problems. Consequently, the solution of Problem 1 cannot be obtained by directly applying existing numerical computation approaches for switched dynamical system optimal control, in which the switching rule is designed by a time-dependent strategy. To overcome this difficulty, the problem is transformed in this subsection into an equivalent nonlinear dynamical system optimal control problem with discrete and continuous variables by introducing a discrete-valued function.

First, by substituting the general state-feedback controller (2.3) into the switched dynamical system (2.5), Problem 1 can be equivalently written as the following problem:

Problem 2. Choose $(\alpha_1,\alpha_2,\vartheta)\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}$ to minimize the objective function

$$\bar{J}(\alpha_1,\alpha_2,\vartheta)=\phi(x(t_f)), \qquad (3.1)$$

    subject to the switched dynamical system under state-dependent switching

$$\begin{cases}\text{Subsystem 1}:\ \dfrac{dx(t)}{dt}=f_1(x(t),t), & \text{if } g_1(x(t),\alpha_1,t)=0,\\[2mm] \text{Subsystem 2}:\ \dfrac{dx(t)}{dt}=\bar{f}_2(x(t),\vartheta,t), & \text{if } g_2(x(t),\alpha_2,t)=0,\end{cases}\quad t\in[0,t_f], \qquad (3.2)$$

and the three bound constraints (2.1), (2.2) and (2.7), where $\bar{f}_2(x(t),\vartheta,t)=f_2(x(t),\Upsilon(x(t),\vartheta),t)$.

Next, note that the solution of Problem 1 cannot be obtained by directly using existing numerical computation approaches for switched dynamical system optimal control, in which the switching rule is designed by a time-dependent strategy rather than a state-dependent one. To overcome this difficulty, a discrete-valued function $y(t)$ is introduced as follows:

$$y(t)=\begin{cases}1, & \text{if } g_1(x(t),\alpha_1,t)=0,\\ 2, & \text{if } g_2(x(t),\alpha_2,t)=0,\end{cases}\quad t\in[0,t_f]. \qquad (3.3)$$

    Then, Problem 2 can be transformed into the following equivalent optimization problem with discrete and continuous variables:

Problem 3. Choose $(\alpha_1,\alpha_2,\vartheta,y(t))\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\{1,2\}$ to minimize the objective function

$$\tilde{J}(\alpha_1,\alpha_2,\vartheta,y(t))=\phi(x(t_f)), \qquad (3.4)$$

    subject to the nonlinear dynamical system

$$\frac{dx(t)}{dt}=(2-y(t))\,y(t)\,f_1(x(t),t)+(y(t)-1)\,\bar{f}_2(x(t),\vartheta,t),\quad t\in[0,t_f], \qquad (3.5)$$

    the equality constraint

$$(2-y(t))\,y(t)\,g_1(x(t),\alpha_1,t)+(y(t)-1)\,g_2(x(t),\alpha_2,t)=0,\quad t\in[0,t_f], \qquad (3.6)$$

    and the three bound constraints (2.1), (2.2), and (2.7).

Note that standard nonlinear numerical optimization algorithms, such as the sequential quadratic programming algorithm and the interior-point method, are usually developed for nonlinear optimization problems with only continuous variables. Thus, the solution of Problem 3, which has both discrete and continuous variables, cannot be obtained by directly using these standard algorithms. To overcome this difficulty, this subsection introduces a relaxation problem that has only continuous variables.

    Define

$$P(\sigma(t))=\sum_{i=1}^{2}i^{2}\sigma_i(t)-\left(\sum_{i=1}^{2}i\,\sigma_i(t)\right)^{2}, \qquad (3.7)$$

where $\sigma(t)=[\sigma_1(t),\sigma_2(t)]^T$. Then, the following theorem can be established.

Theorem 1. If the nonnegative functions $\sigma_1(t)$ and $\sigma_2(t)$ satisfy the following equality:

$$\sigma_1(t)+\sigma_2(t)=1,\quad t\in[0,t_f], \qquad (3.8)$$

    then two results can be obtained as follows:

(1) For any $t\in[0,t_f]$, the function $P(\sigma(t))$ is nonnegative;

(2) For any $t\in[0,t_f]$, $P(\sigma(t))=0$ if and only if $\sigma_{i^*}(t)=1$ for some $i^*\in\{1,2\}$ and $\sigma_i(t)=0$ for the other index $i\neq i^*$.

    Proof. (1) By using the equality (3.8) and the Cauchy-Schwarz inequality, we have

$$\sum_{i=1}^{2}i\,\sigma_i(t)=\sum_{i=1}^{2}\left(i\sqrt{\sigma_i(t)}\right)\sqrt{\sigma_i(t)}\leq\sqrt{\sum_{i=1}^{2}i^{2}\sigma_i(t)}\,\sqrt{\sum_{i=1}^{2}\sigma_i(t)}=\sqrt{\sum_{i=1}^{2}i^{2}\sigma_i(t)}. \qquad (3.9)$$

Note that the functions $\sigma_1(t)$ and $\sigma_2(t)$ are nonnegative. Then, squaring both sides of the inequality (3.9) yields

$$\left(\sum_{i=1}^{2}i\,\sigma_i(t)\right)^{2}\leq\sum_{i=1}^{2}i^{2}\sigma_i(t),$$

which implies that, for any $t\in[0,t_f]$, the function $P(\sigma(t))$ is nonnegative.

(2) To prove the second part of Theorem 1, it suffices to show that, for any $t\in[0,t_f]$, $P(\sigma(t))=0$ holds only if $\sigma_{i^*}(t)=1$ for some $i^*\in\{1,2\}$ and $\sigma_i(t)=0$ for the other index $i\neq i^*$.

    Define

$$v_1(t)=\left[\sqrt{\sigma_1(t)},\,2\sqrt{\sigma_2(t)}\right],\qquad v_2(t)=\left[\sqrt{\sigma_1(t)},\,\sqrt{\sigma_2(t)}\right].$$

Then, the inequality (3.9) can be equivalently written as

$$v_1(t)\cdot v_2(t)\leq\|v_1(t)\|\,\|v_2(t)\|, \qquad (3.10)$$

where $\cdot$ and $\|\cdot\|$ denote the vector dot product and the Euclidean norm, respectively. Note that the equality

$$v_1(t)\cdot v_2(t)=\|v_1(t)\|\,\|v_2(t)\| \qquad (3.11)$$

holds if and only if there exists a constant $\beta\in\mathbb{R}$ such that

$$v_1(t)=\beta v_2(t). \qquad (3.12)$$

By using the equality (3.8), one obtains $v_1(t)\neq\mathbf{0}$ and $v_2(t)\neq\mathbf{0}$, where $\mathbf{0}$ denotes the zero vector. Then, $\beta$ is a nonzero constant and the equality (3.12) implies

$$(1-\beta)\sqrt{\sigma_1(t)}=0, \qquad (3.13)$$
$$(2-\beta)\sqrt{\sigma_2(t)}=0. \qquad (3.14)$$

Furthermore, the constant $\beta$ must equal one of the integers $i^*\in\{1,2\}$, and for the other integer $i\in\{1,2\}$ one has

$$\sigma_i(t)=0,\quad i\neq i^*. \qquad (3.15)$$

From the two equalities (3.8) and (3.15), we obtain $\sigma_{i^*}(t)=1$. This completes the proof of Theorem 1.
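As a quick numerical sanity check of Theorem 1 (not part of the original paper), the following Python snippet evaluates $P(\sigma)$ from (3.7) on a grid of feasible points with $\sigma_1+\sigma_2=1$ and confirms that it is nonnegative and vanishes only at the binary points $(1,0)$ and $(0,1)$.

```python
import numpy as np

def P(sigma):
    """Quadratic relaxation penalty from (3.7): sum i^2*sigma_i - (sum i*sigma_i)^2."""
    i = np.array([1.0, 2.0])
    return np.dot(i ** 2, sigma) - np.dot(i, sigma) ** 2

# Feasible points: sigma_1 + sigma_2 = 1, 0 <= sigma_i <= 1.
for s1 in np.linspace(0.0, 1.0, 11):
    sigma = np.array([s1, 1.0 - s1])
    val = P(sigma)
    assert val >= -1e-12                 # part (1): nonnegativity
    if abs(val) < 1e-12:                 # part (2): zero only at binary points
        assert s1 in (0.0, 1.0)
    print(f"sigma = ({s1:.1f}, {1.0 - s1:.1f}), P = {val:.4f}")
```

On this feasible set $P$ reduces to $\sigma_1(1-\sigma_1)$, which makes both parts of the theorem visible at a glance.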

    Now, Problem 3 can be rewritten as a relaxation problem as follows:

Problem 4. Choose $(\alpha_1,\alpha_2,\vartheta,\sigma(t))\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\mathbb{R}^{2}$ to minimize the objective function

$$J_{\mathrm{relax}}(\alpha_1,\alpha_2,\vartheta,\sigma(t))=\phi(x(t_f)), \qquad (3.16)$$

    subject to the nonlinear dynamical system

$$\frac{dx(t)}{dt}=(2-\bar{y}(t))\,\bar{y}(t)\,f_1(x(t),t)+(\bar{y}(t)-1)\,\bar{f}_2(x(t),\vartheta,t),\quad t\in[0,t_f], \qquad (3.17)$$

    the two equality constraints

$$(2-\bar{y}(t))\,\bar{y}(t)\,g_1(x(t),\alpha_1,t)+(\bar{y}(t)-1)\,g_2(x(t),\alpha_2,t)=0,\quad t\in[0,t_f], \qquad (3.18)$$
$$P(\sigma(t))=0,\quad t\in[0,t_f], \qquad (3.19)$$

    the bound constraint

$$0\leq\sigma_i(t)\leq 1,\quad i=1,2,\quad t\in[0,t_f], \qquad (3.20)$$

    the equality constraint (3.8), and the three bound constraints (2.1), (2.2), and (2.7), where

$$\bar{y}(t)=1\times\sigma_1(t)+2\times\sigma_2(t). \qquad (3.21)$$

    By using Theorem 1, one can derive that Problems 3 and 4 are equivalent.

Note that the bound constraint (3.20) is essentially a set of continuous-time inequality constraints. Thus, the solution of Problem 4 also cannot be obtained by directly using the existing standard algorithms. In order to obtain the solution of Problem 4, this subsection introduces a nonlinear parameter optimization problem with a number of continuous-time equality constraints and several bound constraints.

Suppose that $\tau_i$ denotes the $i$th switching time. Then, one has

$$0=\tau_0\leq\tau_1\leq\tau_2\leq\cdots\leq\tau_{M-1}\leq\tau_M=t_f, \qquad (3.22)$$

where $M\geq 1$ denotes a given fixed integer. It is important to note that the switching times are not independent optimization variables; their values are obtained indirectly from the state trajectory of the switched dynamical system (2.5). Then, Problem 4 can be transformed into an equivalent optimization problem as follows:

Problem 5. Choose $(\alpha_1,\alpha_2,\vartheta,\xi)\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\mathbb{R}^{2M}$ to minimize the objective function

$$\bar{J}_{\mathrm{relax}}(\alpha_1,\alpha_2,\vartheta,\xi)=\phi(x(t_f)), \qquad (3.23)$$

    subject to the nonlinear dynamical system

$$\frac{dx(t)}{dt}=\sum_{i=1}^{M}\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,f_1(x(t),t)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,\bar{f}_2(x(t),\vartheta,t)\Big)\,\chi_{[\tau_{i-1},\tau_i)}(t),\quad t\in[0,t_f], \qquad (3.24)$$

    the equality constraints

$$\sum_{i=1}^{M}\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,g_1(x(t),\alpha_1,t)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,g_2(x(t),\alpha_2,t)\Big)\,\chi_{[\tau_{i-1},\tau_i)}(t)=0,\quad t\in[0,t_f], \qquad (3.25)$$
$$\bar{P}(\xi,t)=0,\quad t\in[0,t_f], \qquad (3.26)$$
$$\xi_{1i}+\xi_{2i}=1,\quad i=1,\ldots,M, \qquad (3.27)$$

    the bound constraint

$$0\leq\xi_{ji}\leq 1,\quad j=1,2,\quad i=1,\ldots,M, \qquad (3.28)$$

and the three bound constraints (2.1), (2.2), and (2.7), where $\xi_{1i}$ and $\xi_{2i}$ denote, respectively, the values of $\sigma_1(t)$ and $\sigma_2(t)$ on the $i$th subinterval $[\tau_{i-1},\tau_i)$, $i=1,\ldots,M$; $\xi=[(\xi^1)^T,(\xi^2)^T]^T$, $\xi^1=[\xi_{11},\ldots,\xi_{1M}]^T$, $\xi^2=[\xi_{21},\ldots,\xi_{2M}]^T$; $\bar{P}(\xi,t)=\sum_{i=1}^{M}\Big(\sum_{j=1}^{2}j^{2}\xi_{ji}-\big(\sum_{j=1}^{2}j\,\xi_{ji}\big)^{2}\Big)\chi_{[\tau_{i-1},\tau_i)}(t)$; and $\chi_I(t)$ is given by

$$\chi_I(t)=\begin{cases}1, & \text{if } t\in I,\\ 0, & \text{otherwise},\end{cases} \qquad (3.29)$$

which is the indicator function of the subinterval $I\subset[0,t_f]$.
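For illustration only (a sketch under the assumptions of this section, not code from the paper), the piecewise-constant relaxed switching signal $\bar{y}(t)=\xi_{1i}+2\xi_{2i}$ on $[\tau_{i-1},\tau_i)$ can be evaluated directly from the indicator functions in (3.29); the switching times and $\xi$ values below are hypothetical.

```python
import numpy as np

# Illustrative helper: evaluate the piecewise-constant relaxed switching signal
# y_bar(t) = xi_1i + 2*xi_2i on [tau_{i-1}, tau_i), built from the indicator
# functions chi_I(t) defined in (3.29).
def y_bar(t, tau, xi1, xi2):
    """tau: switching times [tau_0, ..., tau_M]; xi1, xi2: values per subinterval."""
    for i in range(len(xi1)):
        if tau[i] <= t < tau[i + 1]:          # chi_{[tau_{i-1}, tau_i)}(t) = 1
            return 1.0 * xi1[i] + 2.0 * xi2[i]
    return 1.0 * xi1[-1] + 2.0 * xi2[-1]      # convention at t = tau_M

# Example with M = 3 subintervals; binary xi encodes the mode sequence 1, 2, 1.
tau = [0.0, 2.0, 5.0, 10.0]
xi1 = [1.0, 0.0, 1.0]
xi2 = [0.0, 1.0, 0.0]
for t in (1.0, 3.0, 7.0):
    print(t, y_bar(t, tau, xi1, xi2))
```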

Since the switching times are unknown, it is very challenging to compute the gradient of the objective function (3.23). To overcome this challenge, the following time-scaling transformation is applied to map the variable switching times to fixed times.

Suppose that the function $t(s):[0,M]\to\mathbb{R}$ is continuously differentiable and is governed by the following equation:

$$\frac{dt(s)}{ds}=\sum_{i=1}^{M}\theta_i\,\chi_{[i-1,i)}(s), \qquad (3.30)$$

    with the boundary condition

$$t(0)=0, \qquad (3.31)$$

where $\theta_i$ is the dwell time of the active subsystem on the $i$th subinterval $[i-1,i)\subset[0,M]$. In general, the transformation (3.30)-(3.31) is referred to as a time-scaling transformation.

Define $\theta=[\theta_1,\ldots,\theta_M]^T$, where

$$0\leq\theta_i\leq t_f,\quad i=1,\ldots,M. \qquad (3.32)$$
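As a small illustration (a sketch assuming piecewise-constant dwell times, not code from the paper), the time-scaling map $t(s)$ defined by (3.30)-(3.31) can be evaluated by accumulating the dwell times $\theta_i$ over completed subintervals plus a fractional contribution from the current one.

```python
import numpy as np

def time_scaling(s, theta):
    """Evaluate t(s) from (3.30)-(3.31): integrate dt/ds = theta_i on [i-1, i)."""
    theta = np.asarray(theta, dtype=float)
    i = int(np.floor(s))                      # index of the current subinterval
    i = min(i, len(theta) - 1)                # clamp s = M to the last subinterval
    return theta[:i].sum() + theta[i] * (s - i)

# Example: M = 3 subintervals with hypothetical dwell times 2, 5 and 3 (t(3) = t_f = 10).
theta = [2.0, 5.0, 3.0]
for s in (0.5, 1.0, 2.5, 3.0):
    print(f"t({s}) = {time_scaling(s, theta)}")
```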

    Then, by using the time-scaling transform (3.30) and (3.31), we can rewrite Problem 5 as the following equivalent nonlinear parameter optimization problem, which has fixed switching times.

Problem 6. Choose $(\alpha_1,\alpha_2,\vartheta,\xi,\theta)\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\mathbb{R}^{2M}\times\mathbb{R}^{M}$ to minimize the objective function

$$\hat{J}_{\mathrm{relax}}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)=\phi(\hat{x}(M)), \qquad (3.33)$$

    subject to the nonlinear dynamical system

$$\frac{d\hat{x}(s)}{ds}=\sum_{i=1}^{M}\theta_i\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,f_1(\hat{x}(s),s)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,\bar{f}_2(\hat{x}(s),\vartheta,s)\Big)\,\chi_{[i-1,i)}(s),\quad s\in[0,M], \qquad (3.34)$$

    the continuous-time equality constraints

$$\sum_{i=1}^{M}\theta_i\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,g_1(\hat{x}(s),\alpha_1,s)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,g_2(\hat{x}(s),\alpha_2,s)\Big)\,\chi_{[i-1,i)}(s)=0,\quad s\in[0,M], \qquad (3.35)$$
$$\hat{P}(\xi,s)=0,\quad s\in[0,M], \qquad (3.36)$$

the linear equality constraint (3.27), and the bound constraints (2.1), (2.2), (2.7), (3.28), and (3.32), where $\hat{x}(s)=x(t(s))$ and $\hat{P}(\xi,s)=\sum_{i=1}^{M}\theta_i\Big(\sum_{j=1}^{2}j^{2}\xi_{ji}-\big(\sum_{j=1}^{2}j\,\xi_{ji}\big)^{2}\Big)\chi_{[i-1,i)}(s)$.

    In this section, an improved gradient-based numerical optimization algorithm will be proposed for obtaining the solution of Problem 1.

To handle the continuous-time equality constraints (3.35) and (3.36), the idea of the $l_1$ penalty function [55] is adopted in this subsection to rewrite Problem 6 as a nonlinear parameter optimization problem with a linear equality constraint and several simple bound constraints.

Problem 7. Choose $(\alpha_1,\alpha_2,\vartheta,\xi,\theta)\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\mathbb{R}^{2M}\times\mathbb{R}^{M}$ to minimize the objective function

$$J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)=\phi(\hat{x}(M))+\gamma\int_{0}^{M}L(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s)\,ds, \qquad (4.1)$$

subject to the nonlinear dynamical system (3.34), the linear equality constraint (3.27), and the bound constraints (2.1), (2.2), (2.7), (3.28) and (3.32), where

$$L(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s)=\hat{P}(\xi,s)+\Big\|\sum_{i=1}^{M}\theta_i\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,g_1(\hat{x}(s),\alpha_1,s)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,g_2(\hat{x}(s),\alpha_2,s)\Big)\,\chi_{[i-1,i)}(s)\Big\|,$$

and $\gamma>0$ denotes the penalty parameter.

The theory of the $l_1$ penalty function [47] indicates that any solution of Problem 7 is also a solution of Problem 6. In addition, the gradient of the linear function in the equality constraint (3.27) is straightforward to obtain, and the gradient of the objective function (4.1) is presented in Section 4.2. Thus, the solution of Problem 7 can be obtained by applying any gradient-based numerical computation method.
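To illustrate the exact-penalty idea on a toy problem (a hypothetical scalar example unrelated to Problem 7), the following sketch minimizes $f(x)+\gamma|h(x)|$ for increasing $\gamma$ and shows that, once $\gamma$ is large enough, the penalized minimizer coincides with the constrained one.

```python
from scipy.optimize import minimize_scalar

# Toy illustration of the l1 exact-penalty idea (not the paper's Problem 7):
# minimize f(x) = (x - 2)^2 subject to h(x) = x - 1 = 0.
# The constrained minimizer is x* = 1; for gamma large enough, the
# unconstrained penalized problem has the same minimizer.
f = lambda x: (x - 2.0) ** 2
h = lambda x: x - 1.0

for gamma in (0.5, 2.0, 10.0):
    res = minimize_scalar(lambda x: f(x) + gamma * abs(h(x)),
                          bounds=(-5.0, 5.0), method="bounded")
    print(f"gamma = {gamma:5.1f}  ->  penalized minimizer x = {res.x:.4f}")
```

For $\gamma=0.5$ the minimizer is still biased toward the unconstrained optimum, while for $\gamma\geq 2$ it sits at the feasible point $x=1$, which is the behavior the exactness property of the $l_1$ penalty predicts.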

To obtain the solution of Problem 7, the gradient formulae of the objective function (4.1) are presented in the following theorem.

Theorem 2. For any $s\in[0,M]$, the gradients of the objective function (4.1) with respect to the decision variables $\alpha_1$, $\alpha_2$, $\vartheta$, $\xi$, and $\theta$ are given by

$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\alpha_1}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\alpha_1}\,ds, \qquad (4.2)$$
$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\alpha_2}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\alpha_2}\,ds, \qquad (4.3)$$
$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\vartheta}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\vartheta}\,ds, \qquad (4.4)$$
$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\xi}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\xi}\,ds, \qquad (4.5)$$
$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\theta}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\theta}\,ds, \qquad (4.6)$$

where $H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))$ denotes the Hamiltonian function defined by

$$H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))=L(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s)+(\lambda(s))^T\bar{f}(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s), \qquad (4.7)$$
$$\bar{f}(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s)=\sum_{i=1}^{M}\theta_i\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,f_1(\hat{x}(s),s)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,\bar{f}_2(\hat{x}(s),\vartheta,s)\Big)\,\chi_{[i-1,i)}(s), \qquad (4.8)$$

and the function $\lambda(s)$ denotes the costate, which satisfies the costate system

$$\left(\frac{d\lambda(s)}{ds}\right)^T=-\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\hat{x}(s)}, \qquad (4.9)$$

    with the terminal condition

$$(\lambda(M))^T=\frac{\partial\phi(\hat{x}(M))}{\partial\hat{x}(M)}. \qquad (4.10)$$

Proof. Similar to the proof of Theorem 5.2.1 in [56], the gradient formulae (4.2)-(4.6) can be obtained. This completes the proof of Theorem 2.
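To illustrate the costate-based gradient formulae of Theorem 2 on a much simpler problem (a hypothetical scalar example, not the fermentation model), the following sketch computes $dJ/d\theta$ for $dx/ds=\theta f(x)$ with $J=\phi(x(1))$ by integrating the state forward and the costate backward, and checks the result against a finite-difference estimate and the analytical value.

```python
import numpy as np

# Toy illustration of the costate formulae (a hypothetical scalar example):
#   dx/ds = theta * f(x),  x(0) = 1,  J(theta) = phi(x(1)) = x(1)^2,  f(x) = -x.
# The Hamiltonian is H = lambda * theta * f(x); the costate solves
#   dlambda/ds = -dH/dx = theta * lambda,  lambda(1) = dphi/dx = 2 x(1);
# and dJ/dtheta = int_0^1 dH/dtheta ds = int_0^1 lambda * f(x) ds.
def objective_and_gradient(theta, n=20000):
    ds = 1.0 / n
    x = np.empty(n + 1); x[0] = 1.0
    for k in range(n):                                  # forward state sweep
        x[k + 1] = x[k] + ds * theta * (-x[k])
    lam = np.empty(n + 1); lam[-1] = 2.0 * x[-1]
    for k in range(n, 0, -1):                           # backward costate sweep
        lam[k - 1] = lam[k] - ds * theta * lam[k]
    grad = float(np.sum(lam[:-1] * (-x[:-1])) * ds)     # integral of dH/dtheta
    return x[-1] ** 2, grad

theta, eps = 0.7, 1e-6
J, grad = objective_and_gradient(theta)
J_eps, _ = objective_and_gradient(theta + eps)
print("adjoint gradient :", grad)
print("finite difference:", (J_eps - J) / eps)
print("analytical value :", -2.0 * np.exp(-2.0 * theta))
```

The three printed values agree to several digits, which is exactly the forward-state/backward-costate pattern used to evaluate (4.2)-(4.6) numerically.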

For notational simplicity, let $g(\eta)=\nabla J_{\gamma}(\eta)$ denote the gradient of the objective function $J_{\gamma}$ described by (4.1) at $\eta$, where $\eta=[(\alpha_1)^T,(\alpha_2)^T,\vartheta^T,\xi^T,\theta^T]^T$. In addition, let $\|\cdot\|$ and $\|\cdot\|_{\infty}$ denote the Euclidean norm and the infinity norm, respectively, and let the subscript $k$ denote the value of a quantity at the point $\eta_k$ or in the $k$th iteration, for instance, $g_k$ and $(J_{\gamma})_k$. Based on the above discussion, an improved gradient-based numerical optimization algorithm is now provided to obtain the solution of Problem 1.

    Algorithm 1. An improved gradient-based numerical optimization algorithm for solving Problem 1.
    01. Initial: η0Rr1+r2+r3+3M, 0<μ<1, 0<ϖ<1, ρmax, , ;
    02. begin
    03.   calculate the objective function and the gradient at the point ;
    04.   , , ;
    05.   while do
    06.     , , ;
    07.     while do
    08.     , ;
    09.     end
    10.     , ;
    11.     calculate by using the following equality:
                   
            where , ;
    12.     if then
    13.     ;
    14.     otherwise
    15.     ;
    16.     end
    17.     calculate ;
    18.     calculate by using the following equality:
                   
          in ;
    19.     update by using the following equality:
                   
           which satisfying the following inequality:
                   
    20.     k := k +1;
    21.   end
    22. , ;
    23. end
    24. Output: , .
    25. construct the optimal solution and optimal value of Problem 1 by using and .


Remark 4. During the past decades, many iterative approaches have been proposed for solving nonlinear parameter optimization problems by using information about the objective function [57]. The idea of these iterative approaches is usually to generate an iterative sequence such that the corresponding sequence of objective function values is monotonically decreasing. However, the existing algorithms have the following disadvantage: if an iterate is trapped in a curved narrow valley bottom of the objective function, then the requirement of monotonically decreasing objective values may lead to very short iterative steps, and the methods lose their efficiency. To overcome this challenge, an improved gradient-based algorithm is developed in Algorithm 1 based on a novel search approach, in which the sequence of objective function values is not required to be monotonically decreasing. In addition, an improved adaptive strategy for the memory element described by (4.12), which is used in (4.13), is adopted in the iterations of Algorithm 1. The interpretation of (4.12) is as follows. If the first condition in (4.12) holds, the iterate is trapped in a curved narrow valley bottom of the objective function; thus, in order to avoid creeping along the bottom of this narrow curved valley, the value of the memory element should be increased. If the second condition in (4.12) holds, the value of the memory element is better left unchanged. If the third condition in (4.12) holds, the iterate is in a flat region; thus, in order to decrease the objective function value, the value of the memory element will be decreased. The above discussion shows that the novel search approach described in Algorithm 1 is also an adaptive method.
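To convey the idea of Remark 4, the following sketch implements a generic non-monotone Armijo-type line search (in the spirit of the approach described here, but not the paper's Algorithm 1) and applies it to the Rosenbrock function, whose minimizer lies at the bottom of a curved narrow valley; the memory length, constants, and starting point are illustrative assumptions.

```python
import numpy as np

# Generic non-monotone backtracking line search: a step is accepted if the new
# value lies below the maximum of the last `memory` objective values minus an
# Armijo term, so the objective sequence need not decrease monotonically.
def rosenbrock(z):
    return (1.0 - z[0]) ** 2 + 100.0 * (z[1] - z[0] ** 2) ** 2

def rosenbrock_grad(z):
    return np.array([-2.0 * (1.0 - z[0]) - 400.0 * z[0] * (z[1] - z[0] ** 2),
                     200.0 * (z[1] - z[0] ** 2)])

def nonmonotone_descent(z, memory=8, c=1e-4, max_iter=20000, tol=1e-6):
    history = [rosenbrock(z)]
    for _ in range(max_iter):
        g = rosenbrock_grad(z)
        if np.linalg.norm(g) < tol:
            break
        ref = max(history[-memory:])          # non-monotone reference value
        step = 1.0
        while rosenbrock(z - step * g) > ref - c * step * np.dot(g, g):
            step *= 0.5                       # backtracking
        z = z - step * g
        history.append(rosenbrock(z))
    return z, history[-1]

z_opt, f_opt = nonmonotone_descent(np.array([-1.2, 1.0]))
print("approximate minimizer:", z_opt, "objective:", f_opt)
```

Because the acceptance test compares against the maximum of the recent history rather than the current value, occasional increases of the objective are allowed, which lets the iterates take longer steps along the valley instead of creeping along its bottom.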

Remark 5. The sufficient descent condition is extremely important for the convergence of any gradient-based numerical optimization algorithm. Thus, the goal of lines 12-16 of Algorithm 1 is to avoid uphill directions and to keep the search directions uniformly bounded. In fact, for any iteration, there exist two constants such that the search direction satisfies the following two conditions:

    (4.15)
    (4.16)

This section establishes the convergence results of Algorithm 1 developed in Section 4. To this end, we suppose that the following two conditions hold:

Assumption 1.  is a continuously differentiable function and bounded below on .

    Assumption 2. For any and , there is a constant such that

    (5.1)

where  denotes an open set and  denotes the gradient of .

    Theorem 3. Suppose that Assumptions 1 and 2 hold. Let be a sequence obtained by using Algorithm 1. Then, there is a constant such that the following inequality holds:

    (5.2)

    Proof. Let be defined by .

    If , then by using the inequalities (4.14) and (4.15), one can obtain

    (5.3)

Let  be defined by . Then, the proof of Theorem 3 is complete for .

    If , then there is a subset such that the following equality holds:

    (5.4)

    which indicates that there exists a such that the following inequality holds:

    (5.5)

    for any and . Let . Then, the inequality (4.14) does not hold. That is, one can obtain

    (5.6)

    which implies

    (5.7)

    Applying the mean value theorem to the left-hand side of the inequality (5.7) yields

    (5.8)

where . From the inequality (5.8), one obtains

    (5.9)

By using Assumption 2 and the Cauchy-Schwarz inequality, from (4.15) and (5.9), we have

    (5.10)

Furthermore, applying  and the inequality (4.16) to the inequality (5.10), one obtains

    (5.11)

    for any and . Clearly, the inequalities (5.4) and (5.11) are contradictory. Thus, . This completes the proof of Theorem 3.

    Lemma 1. Suppose that Assumptions 1 and 2 hold. Let be a sequence obtained by using Algorithm 1. Then, the following inequalities

    (5.12)
    (5.13)

    are true, where .

    Proof. Note that if the following inequality

    (5.14)

    is true, then the inequality (5.12) also holds. Here, the inequality (5.14) will be proved by using mathematical induction.

    Firstly, Theorem 3 indicates

    (5.15)

    where . By using and the inequality (5.15), one can derive that the inequality (5.14) is true for .

    Suppose that the inequality (5.14) is true for . Note that and the term described in (5.14) is nonnegative. Then, one can obtain

    (5.16)

    for .

    Next, by using , the inequality (5.2), and the inequality (5.16), one can derive

    (5.17)

    which implies that the inequality (5.14) is also true for . Then, the inequality (5.14) is true for by using mathematical induction. Thus, the inequality (5.12) holds.

In addition, Assumption 1 shows that  is continuously differentiable and bounded below on , which indicates that

    (5.18)

    Then, summing the inequality (5.12) over yields

    Thus, the inequality (5.13) holds. This completes the proof of Lemma 1.

Theorem 4. Suppose that the conditions of Theorem 3 hold. Then, the following equality holds:

    (5.19)

    where presents the gradient of the objective function described by (4.1) at the point .

    Proof. Firstly, the following result will be proved: there is a constant such that

    (5.20)

    By using Assumptions 1 and 2, one can obtain

    (5.21)

    Let the constant be defined by . Then, the inequality (5.21) implies that the inequality (5.20) is true.

    Define the function by

    (5.22)

    Then, Lemma 1 indicates that the following equality holds:

    (5.23)

By using the inequality (5.20), one can obtain

    (5.24)

    Thus, from (5.23) and (5.24), one can deduce that the equality (5.19) is true. This completes the proof of Theorem 4.

In this section, an optimal feedback control problem of 1,3-propanediol fermentation processes is provided to illustrate the effectiveness of the approach developed in Sections 2-5. All numerical simulations are implemented on a personal computer with an Intel Skylake dual-core i5-6200U CPU (2.3 GHz).

The 1,3-propanediol fermentation process can be described as switching between two subsystems: a batch subsystem and a feeding subsystem. There is no input feed during the batch subsystem, while alkali and glycerol are added to the fermentor during the feeding subsystem. In general, a subsystem switch occurs when the glycerol concentration reaches the given upper or lower threshold. By using the results of [58], the 1,3-propanediol fermentation process can be modeled as the following switched dynamical system under state-dependent switching:

    (6.1)

where  denotes the given terminal time; the system states , , ,  denote the volume of fluid (), the concentration of biomass (), the concentration of glycerol (), and the concentration of 1,3-propanediol (), respectively; the control input  denotes the feeding rate ();  denotes the system state vector; Subsystem 1 and Subsystem 2 denote the batch subsystem and the feeding subsystem, respectively;  and  (two parameters to be optimized) denote the upper and lower thresholds of the glycerol concentration, respectively; and the functions ,  are given by

    (6.2)
    (6.3)

Subsystem 1 is essentially a natural fermentation process because there is no input feed. The functions , , and  are defined by

    (6.4)
    (6.5)
    (6.6)

which denote the growth rate of the cells, the consumption rate of the substrate, and the formation rate of 1,3-propanediol, respectively. In the equality (6.4), the parameters  and  denote the critical concentrations of glycerol and 1,3-propanediol, respectively; , , , , , , , , , and  are given parameters.

Note that the feeding subsystem does not consist only of the natural fermentation process. Thus, the function  is introduced to describe the process dynamics arising from the control input feed in Subsystem 2. In the equality (6.3), the given parameters  and  denote the proportion and concentration of glycerol in the control input feed, respectively.

In general, as the biomass increases, the consumption of glycerol also increases. Then, during Subsystem 1 (the batch subsystem), the concentration of glycerol eventually becomes too low because no new glycerol is added. Thus, Subsystem 1 switches to Subsystem 2 (the feeding subsystem) when the equality  (the active condition of Subsystem 2) is satisfied. On the other hand, during Subsystem 2 (the feeding subsystem), the concentration of glycerol eventually becomes too high because new glycerol is added, which inhibits the growth of the cells. Thus, Subsystem 2 switches to Subsystem 1 (the batch subsystem) when the equality  (the active condition of Subsystem 1) is satisfied.

Suppose that the feeding rate , the upper threshold of the glycerol concentration , and the lower threshold of the glycerol concentration  satisfy the following bound constraints:

    (6.7)
    (6.8)
    (6.9)

    respectively.

    The model parameters of the dynamic optimization problem for the 1, 3-propanediol fermentation process are presented by

Suppose that the control input takes the form of the piecewise state-feedback controller . Our main objective is to maximize the concentration of 1,3-propanediol at the terminal time . Thus, the optimal feedback control problem of 1,3-propanediol fermentation processes can be stated as follows: choose a control input  to minimize the objective function  subject to the switched dynamical system (6.1) with the initial condition  and the bound constraints (6.7)-(6.9). The improved gradient-based numerical optimization algorithm (Algorithm 1 described in Section 4.3) is then adopted to solve this problem using Matlab 2010a. The optimal objective function value is , and the optimal values of the parameters  and  are  and , respectively. The optimal feedback gain matrices ,  are presented by

and the corresponding numerical simulation results are presented in Figures 1-4.

Figure 1.  The optimal volume of fluid.
Figure 2.  The optimal concentration of biomass.
Figure 3.  The optimal concentration of glycerol.
Figure 4.  The optimal concentration of 1,3-propanediol.

Note that Problem 6 is an optimal control problem of nonlinear dynamical systems with state constraints. Thus, the finite difference approximation approach developed by Nikoobin and Moradi [59] can also be applied to this dynamic optimization problem of 1,3-propanediol fermentation processes. To compare with the improved gradient-based numerical optimization algorithm (Algorithm 1 described in Section 4.3), the finite difference approximation approach of Nikoobin and Moradi [59] is also adopted to solve this problem with the same model parameters under the same conditions, and the numerical comparison results are presented in Figure 5 and Table 1.

    Figure 5.  Convergence rates for the finite difference approximation approach developed by Nikoobin and Moradi [59] and the improved gradient-based numerical optimization algorithm (Algorithm 1 described by Section 4.3).
    Table 1.  The comparison results between the finite difference approximation approach developed by Nikoobin and Moradi [59] and the improved gradient-based numerical optimization algorithm (Algorithm 1 described by Section 4.3).
    Algorithm Computation time (second)
    The finite difference approximation approach developed by Nikoobin and Moradi [59] 1165.3872 1052.9140
    The improved gradient-based numerical optimization algorithm (Algorithm 1 described by Section 4.3) 439.1513 1265.5597


Figure 5 shows that the improved gradient-based numerical optimization algorithm (Algorithm 1 described in Section 4.3) takes only 67 iterations to obtain the satisfactory result , while the finite difference approximation approach developed by Nikoobin and Moradi [59] takes 139 iterations to achieve the satisfactory result . That is, the number of iterations of the improved gradient-based numerical optimization algorithm is reduced by . In addition, Table 1 also shows that the result obtained by the finite difference approximation approach of Nikoobin and Moradi [59] is not superior to the result () obtained by the improved gradient-based numerical optimization algorithm, which also saves computation time.

In conclusion, the above numerical simulation results show that the improved gradient-based numerical optimization algorithm (Algorithm 1 described in Section 4.3) is less time-consuming, has faster convergence speed, and obtains a better numerical solution than the finite difference approximation approach developed by Nikoobin and Moradi [59]. That is, an effective numerical optimization algorithm has been presented for solving the dynamic optimization problem of the 1,3-propanediol fermentation process.

In this paper, the dynamic optimization problem for a class of fed-batch fermentation processes is modeled as an optimal control problem of switched dynamical systems under state-dependent switching, and a general state-feedback controller is designed for this problem. Then, by introducing a discrete-valued function and using a relaxation technique, this problem is transformed into a nonlinear parameter optimization problem. Next, an improved gradient-based algorithm is developed based on a novel search approach, and a large number of numerical experiments show that this novel search approach can effectively improve the convergence speed of the algorithm when an iterate is trapped in a curved narrow valley bottom of the objective function. Finally, an optimal feedback control problem of 1,3-propanediol fermentation processes is provided to illustrate the effectiveness of the proposed method, and the numerical simulation results show that this method is less time-consuming, has faster convergence speed, and obtains a better result than the existing approaches. In the future, we will continue to study the dynamic optimization problem for a class of fed-batch fermentation processes with uncertainty constraints.

The authors express their sincere gratitude to the anonymous reviewers for their constructive comments in improving the presentation and quality of this manuscript. This work was supported by the National Natural Science Foundation of China under Grant Nos. 61963010 and 61563011, and the Special Project for Cultivation of New Academic Talent and Innovation Exploration of Guizhou Normal University in 2019 under Grant No. 11904-0520077.

    The authors declare no conflicts of interest.



    [1] R. Yan, Z. Shi, Y. Zhong, Task assignment for multiplayer reach–avoid games in convex domains via analytical barriers, IEEE Trans. Rob., 36 (2019), 107–124. https://doi.org/10.1109/TRO.2019.2935345 doi: 10.1109/TRO.2019.2935345
    [2] E. Garcia, I. Weintraub, D. W. Casbeer, M. Pachter, Optimal strategies for the game of protecting a plane in 3-d, preprint, arXiv: 2202.01826.
    [3] E. Garcia, D. W. Casbeer, M. Pachter, Optimal strategies of the differential game in a circular region, IEEE Control Syst. Lett., 4 (2019), 492–497. https://doi.org/10.1109/LCSYS.2019.2963173 doi: 10.1109/LCSYS.2019.2963173
    [4] J. Chen, W. Zha, Z. Peng, D. Gu, Multi-player pursuit–evasion games with one superior evader, Automatica, 71 (2016), 24–32. https://doi.org/10.1016/j.automatica.2016.04.012 doi: 10.1016/j.automatica.2016.04.012
    [5] K. Chen, W. He, Q. L. Han, M. Xue, Y. Tang, Leader selection in networks under switching topologies with antagonistic interactions, Automatica, 142 (2022), 110334. https://doi.org/10.1016/j.automatica.2022.110334 doi: 10.1016/j.automatica.2022.110334
    [6] Z. Li, X. Yu, J. Qiu, H. Gao, Cell division genetic algorithm for component allocation optimization in multifunctional placers, IEEE Trans. Ind. Inf., 18 (2021), 559–570. https://doi.org/10.1109/TⅡ.2021.3069459 doi: 10.1109/TⅡ.2021.3069459
    [7] Y. Tang, C. Zhao, J. Wang, C. Zhang, Q. Sun, W. Zheng, et al., An overview of perception and decision-making in autonomous systems in the era of learning, IEEE Trans. Neural Networks Learn. Syst., 2022. https://doi.org/10.1109/TNNLS.2022.3167688 doi: 10.1109/TNNLS.2022.3167688
    [8] E. Garcia, D. W. Casbeer, A. V. Moll, M. Pachter, Multiple pursuer multiple evader differential games, IEEE Trans. Autom. Control, 66 (2020), 2345–2350. https://doi.org/10.1109/TAC.2020.3003840 doi: 10.1109/TAC.2020.3003840
    [9] E. Garcia, D. W. Casbeer, M. Pachter, Optimal strategies for a class of multi-player reach-avoid differential games in 3d space, IEEE Rob. Autom. Lett., 5 (2020), 4257–4264, https://doi.org/10.1109/LRA.2020.2994023 doi: 10.1109/LRA.2020.2994023
    [10] H. Huang, J. Ding, W. Zhang, C. J. Tomlin, Automation-assisted capture-the-flag: A differential game approach, IEEE Trans. Control Syst. Technol., 23 (2014), 1014–1028. https://doi.org/10.1109/TCST.2014.2360502 doi: 10.1109/TCST.2014.2360502
    [11] Z. Zhou, J. Huang, J. Xu, Y. Tang, Two-phase jointly optimal strategies and winning regions of the capture-the-flag game, in IECON 2021 – 47th Annual Conference of the IEEE Industrial Electronics Society, (2021), 1–6. https://doi.org/10.1109/IECON48115.2021.9589624
    [12] E. Garcia, A. V. Moll, D. W. Casbeer, M. Pachter, Strategies for defending a coastline against multiple attackers, in 2019 IEEE 58th Conference on Decision and Control (CDC), (2019), 7319–7324. https://doi.org/10.1109/CDC40024.2019.9029340
    [13] I. E. Weintraub, M. Pachter, E. Garcia, An introduction to pursuit-evasion differential games, in 2020 American Control Conference (ACC), (2020), 1049–1066. https://doi.org/10.23919/ACC45564.2020.9147205
    [14] T. Başar, A tutorial on dynamic and differential games, Dyn. Games Appl. Econ., (1986), 1–25. https://doi.org/10.1007/978-3-642-61636-5_1 doi: 10.1007/978-3-642-61636-5_1
    [15] S. S. Kumkov, S. L. Ménec, V. S. Patsko, Zero-sum pursuit-evasion differential games with many objects: survey of publications, Dyn. Games Appl., 7 (2017), 609–633. https://doi.org/10.1007/s13235-016-0209-z doi: 10.1007/s13235-016-0209-z
    [16] R. Yan, Z. Shi, Y. Zhong, Defense game in a circular region, in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (2017), 5590–5595. https://doi.org/10.1109/CDC.2017.8264502
    [17] I. E. Weintraub, A. V. Moll, E. Garcia, D. Casbeer, Z. J. Demers, M. Pachter, Maximum observation of a faster non-maneuvering target by a slower observer, in 2020 American Control Conference (ACC), (2020), 100–105. https://doi.org/10.23919/ACC45564.2020.9147340
    [18] J. Wang, Y. Hong, J. Wang, J. Xu, Y. Tang, Q. L. Han, et al., Cooperative and competitive multi-agent systems:from optimization to games, IEEE/CAA J. Autom. Sin., 9 (2022), 763–783. https://doi.org/10.1109/JAS.2022.105506 doi: 10.1109/JAS.2022.105506
    [19] A. A. Al-Talabi, Multi-player pursuit-evasion differential game with equal speed, in 2017 International Automatic Control Conference (CACS), (2017), 1–6. https://doi.org/10.1109/CACS.2017.8284276
    [20] D. Shishika, J. Paulos, V. Kumar, Cooperative team strategies for multi-player perimeter-defense games, IEEE Rob. Autom. Lett., 5 (2020), 2738–2745. https://doi.org/10.1109/LRA.2020.2972818 doi: 10.1109/LRA.2020.2972818
    [21] E. Garcia, Z. E. Fuchs, D. Milutinovic, D. W. Casbeer, M. Pachter, A geometric approach for the cooperative two-pursuer one-evader differential game, IFAC-PapersOnLine, 50 (2017), 15209–15214. https://doi.org/10.1016/j.ifacol.2017.08.2366 doi: 10.1016/j.ifacol.2017.08.2366
    [22] A. V. Moll, D. Casbeer, E. Garcia, D. Milutinović, M. Pachter, The multi-pursuer single-evader game, J. Intell. Rob. Syst., 96 (2019), 193–207. https://doi.org/10.1007/s10846-018-0963-9 doi: 10.1007/s10846-018-0963-9
    [23] E. Garcia, S. D. Bopardikar, Cooperative containment of a high-speed evader, in 2021 American Control Conference (ACC), (2021), 4698–4703. https://doi.org/10.23919/ACC50511.2021.9483097
    [24] E. Garcia, D. W. Casbeer, D. Tran, M. Pachter, A differential game approach for beyond visual range tactics, in 2021 American Control Conference (ACC), (2021), 3210–3215. https://doi.org/10.23919/ACC50511.2021.9482650
    [25] Y. Xu, H. Yang, B. Jiang, M. M. Polycarpou, Multi-player pursuit-evasion differential games with malicious pursuers, IEEE Trans. Autom. Control, 2022. https://doi.org/10.1109/TAC.2022.3168430 doi: 10.1109/TAC.2022.3168430
    [26] W. Lin, Z. Qu, M. A. Simaan, Nash strategies for pursuit-evasion differential games involving limited observations, IEEE Trans. Aerosp. Electron. Syst., 51 (2015), 1347–1356. https://doi.org/10.1109/TAES.2014.130569 doi: 10.1109/TAES.2014.130569
    [27] M. Pachter, E. Garcia, D. W. Casbeer, Active target defense differential game, in 2014 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton), (2014), 46–53. https://doi.org/10.1109/ALLERTON.2014.7028434
    [28] E. Garcia, D. W. Casbeer, M. Pachter, Active target defense using first order missile models, Automatica, 78 (2017), 139–143. https://doi.org/10.1016/j.automatica.2016.12.032 doi: 10.1016/j.automatica.2016.12.032
    [29] M. Coon, D. Panagou, Control strategies for multiplayer target-attacker-defender differential games with double integrator dynamics, in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (2017), 1496–1502. https://doi.org/10.1109/CDC.2017.8263864
    [30] I. E. Weintraub, E. Garcia, M. Pachter, A kinematic rejoin method for active defense of non-maneuverable aircraft, in 2018 Annual American Control Conference (ACC), (2018), 6533–6538. https://doi.org/10.23919/ACC.2018.8431129
    [31] E. Garcia, D. W. Casbeer, M. Pachter, Design and analysis of state-feedback optimal strategies for the differential game of active defense, IEEE Trans. Autom. Control, 64 (2018), 553–568. https://doi.org/10.1109/TAC.2018.2828088 doi: 10.1109/TAC.2018.2828088
    [32] E. Garcia, D. W. Casbeer, M. Pachter, Optimal target capture strategies in the target-attacker-defender differential game, in 2018 Annual American Control Conference (ACC), (2018), 68–73. https://doi.org/10.23919/ACC.2018.8431715
    [33] E. Garcia, D. W. Casbeer, M. Pachter, The complete differential game of active target defense, J. Optim. Theory Appl., 191 (2021), 675–699. https://doi.org/10.1007/s10957-021-01816-z doi: 10.1007/s10957-021-01816-z
    [34] E. Garcia, D. W. Casbeer, M. Pachter, Pursuit in the presence of a defender, Dyn. Games Appl., 9 (2019), 652–670. https://doi.org/10.1007/s13235-018-0271-9 doi: 10.1007/s13235-018-0271-9
    [35] M. Pachter, E. Garcia, D. W. Casbeer, Toward a solution of the active target defense differential game, Dyn. Games Appl., 9 (2019), 165–216. https://doi.org/10.1007/s13235-018-0250-1 doi: 10.1007/s13235-018-0250-1
    [36] E. Garcia, Cooperative target protection from a superior attacker, Automatica, 131 (2021), 109696. https://doi.org/10.1016/j.automatica.2021.109696 doi: 10.1016/j.automatica.2021.109696
    [37] M. Pachter, E. Garcia, R. Anderson, D. W. Casbeer, K. Pham, Maximizing the target's longevity in the active target defense differential game, in 2019 18th European Control Conference (ECC), (2019), 2036–2041. https://doi.org/10.23919/ECC.2019.8795650
    [38] E. Garcia, D. W. Casbeer, M. Pachter, Defense of a target against intelligent adversaries: A linear quadratic formulation, in 2020 IEEE Conference on Control Technology and Applications (CCTA), (2020), 619–624. https://doi.org/10.1109/CCTA41146.2020.9206368
    [39] E. Garcia, D. W. Casbeer, M. Pachter, Cooperative strategies for optimal aircraft defense from an attacking missile, J. Guid., Control, Dyn., 38 (2015), 1510–1520. https://doi.org/10.2514/1.G001083 doi: 10.2514/1.G001083
    [40] L. Liang, F. Deng, Z. Peng, X. Li, W. Zha, A differential game for cooperative target defense, Automatica, 102 (2019), 58–71. https://doi.org/10.1016/j.automatica.2018.12.034 doi: 10.1016/j.automatica.2018.12.034
    [41] Z. Zhou, J. Ding, H. Huang, R. Takei, C. Tomlin, Efficient path planning algorithms in reach-avoid problems, Automatica, 89 (2018), 28–36. https://doi.org/10.1016/j.automatica.2017.11.035 doi: 10.1016/j.automatica.2017.11.035
    [42] P. Shi, W. Sun, X. Yang, I. J. Rudas, H. Gao, Master-slave synchronous control of dual-drive gantry stage with cogging force compensation, IEEE Trans. Syst. Man Cybern.: Syst., https://doi.org/10.1109/TSMC.2022.3176952
    [43] J. Lorenzetti, M. Chen, B. Landry, M. Pavone, Reach-avoid games via mixed-integer second-order cone programming, in 2018 IEEE Conference on Decision and Control (CDC), (2018), 4409–4416. https://doi.org/10.1109/CDC.2018.8619382
    [44] R. Isaacs, Differential games: Their scope, nature, and future, J. Optim. Theory Appl., 3 (1969), 283–295. https://doi.org/10.1007/BF00931368 doi: 10.1007/BF00931368
    [45] R. Yan, Z. Shi, Y. Zhong, Guarding a subspace in high-dimensional space with two defenders and one attacker, IEEE Trans. Cybern., 2020. https://doi.org/10.1109/TCYB.2020.3015031 doi: 10.1109/TCYB.2020.3015031
    [46] R. Yan, Z. Shi, Y. Zhong, Construction of the barrier for reach-avoid differential games in three-dimensional space with four equal-speed players, in 2019 IEEE 58th Conference on Decision and Control (CDC), (2019), 4067–4072. https://doi.org/10.1109/CDC40024.2019.9029495
    [47] K. Margellos, J. Lygeros, Hamilton–jacobi formulation for reach–avoid differential games, IEEE Trans. Autom. Control, 56 (2011), 1849–1861. https://doi.org/10.1109/TAC.2011.2105730 doi: 10.1109/TAC.2011.2105730
    [48] J. F. Fisac, M. Chen, C. J. Tomlin, S. S. Sastry, Reach-avoid problems with time-varying dynamics, targets and constraints, in HSCC '15: Proceedings of the 18th International Conference on Hybrid Systems: Computation and Control, (2015), 11–20. https://doi.org/10.1145/2728606.2728612
    [49] M. Chen, Z. Zhou, C. J. Tomlin, Multiplayer reach-avoid games via pairwise outcomes, IEEE Trans. Autom. Control, 62 (2016), 1451–1457. https://doi.org/10.1109/TAC.2016.2577619 doi: 10.1109/TAC.2016.2577619
    [50] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, et al., Playing atari with deep reinforcement learning, preprint, arXiv: 1312.5602.
    [51] S. Bansal, C. J. Tomlin, Deepreach: A deep learning approach to high-dimensional reachability, in 2021 IEEE International Conference on Robotics and Automation (ICRA), (2021), 1817–1824. https://doi.org/10.1109/ICRA48506.2021.9561949
    [52] J. Li, D. Lee, S. Sojoudi, C. J. Tomlin, Infinite-horizon reach-avoid zero-sum games via deep reinforcement learning, preprint, arXiv: 2203.10142.
    [53] K. C. Hsu, V. R. Royo, C. J. Tomlin, J. F. Fisac, Safety and liveness guarantees through reach-avoid reinforcement learning, preprint, arXiv: 2112.12288.
    [54] E. Garcia, D. W. Casbeer, A. V. Moll, M. Pachter, Cooperative two-pursuer one-evader blocking differential game, in 2019 American Control Conference (ACC), (2019), 2702–2709. https://doi.org/10.23919/ACC.2019.8814294
    [55] R. Yan, X. Duan, Z. Shi, Y. Zhong, F. Bullo, Matching-based capture strategies for 3d heterogeneous multiplayer reach-avoid differential games, Automatica, 140 (2022), 110207. https://doi.org/10.1016/j.automatica.2022.110207 doi: 10.1016/j.automatica.2022.110207
    [56] J. Selvakumar, E. Bakolas, Feedback strategies for a reach-avoid game with a single evader and multiple pursuers, IEEE Trans. Cybern., 51 (2019), 696–707. https://doi.org/10.1109/TCYB.2019.2914869 doi: 10.1109/TCYB.2019.2914869
    [57] E. Garcia, D. W. Casbeer, M. Pachter, J. W. Curtis, E. Doucette, A two-team linear quadratic differential game of defending a target, in 2020 American Control Conference (ACC), (2020), 1665–1670. https://doi.org/10.23919/ACC45564.2020.9147665
    [58] S. D. Bopardikar, F. Bullo, J. P. Hespanha, A cooperative homicidal chauffeur game, Automatica, 45 (2009), 1771–1777. https://doi.org/10.1016/j.automatica.2009.03.014 doi: 10.1016/j.automatica.2009.03.014
    [59] R. Lopez-Padilla, R. Murrieta-Cid, I. Becerra, G. Laguna, S. M. LaValle, Optimal navigation for a differential drive disc robot: A game against the polygonal environment, J. Intell. Rob. Syst., 89 (2018), 211–250. https://doi.org/10.1007/s10846-016-0433-1 doi: 10.1007/s10846-016-0433-1
    [60] A. Pierson, Z. Wang, M. Schwager, Intercepting rogue robots: An algorithm for capturing multiple evaders with multiple pursuers, IEEE Rob. Autom. Lett., 2 (2016), 530–537. https://doi.org/10.1109/LRA.2016.2645516 doi: 10.1109/LRA.2016.2645516
    [61] Z. Zhou, W. Zhang, J. Ding, H. Huang, D. M. Stipanović, C. J. Tomlin, Cooperative pursuit with voronoi partitions, Automatica, 72 (2016), 64–72. https://doi.org/10.1016/j.automatica.2016.05.007 doi: 10.1016/j.automatica.2016.05.007
    [62] E. Bakolas, P. Tsiotras, Relay pursuit of a maneuvering target using dynamic voronoi diagrams, Automatica, 48 (2012), 2213–2220. https://doi.org/10.1016/j.automatica.2012.06.003 doi: 10.1016/j.automatica.2012.06.003
    [63] R. Yan, Z. Shi, Y. Zhong, Reach-avoid games with two defenders and one attacker: An analytical approach, IEEE Trans. Cybern., 49 (2018), 1035–1046. https://doi.org/10.1109/TCYB.2018.2794769 doi: 10.1109/TCYB.2018.2794769
    [64] R. Yan, Z. Shi, Y. Zhong, Cooperative strategies for two-evader-one-pursuer reach-avoid differential games, Int. J. Syst. Sci., 52 (2021), 1894–1912. https://doi.org/10.1080/00207721.2021.1872116 doi: 10.1080/00207721.2021.1872116
    [65] J. Wang, J. Huang, Y. Tang, Swarm intelligence capture-the-flag game with imperfect information based on deep reinforcement learning, Sci. Sin. Technol., 2021. https://doi.org/10.1360/SST-2021-0382 doi: 10.1360/SST-2021-0382
    [66] I. M. Mitchell, A. M. Bayen, C. J. Tomlin, A time-dependent hamilton-jacobi formulation of reachable sets for continuous dynamic games, IEEE Trans. Autom. Control, 50 (2005), 947–957. https://doi.org/10.1109/TAC.2005.851439 doi: 10.1109/TAC.2005.851439
    [67] E. Garcia, D. W. Casbeer, M. Pachter, The capture-the-flag differential game, in 2018 IEEE Conference on Decision and Control (CDC), (2018), 4167–4172. https://doi.org/10.1109/CDC.2018.8619026
    [68] M. Pachter, D. W. Casbeer, E. Garcia, Capture-the-flag: A differential game, in 2020 IEEE Conference on Control Technology and Applications (CCTA), (2020), 606–610. https://doi.org/10.1109/CCTA41146.2020.9206333
    [69] Z. Liu, W. Lin, X. Yu, J. J. Rodríguez-Andina, H. Gao, Approximation-free robust synchronization control for dual-linear-motors-driven systems with uncertainties and disturbances, IEEE Trans. Ind. Electron., 69 (2021), 10500–10509. https://doi.org/10.1109/TIE.2021.3137619 doi: 10.1109/TIE.2021.3137619
    [70] Y. Tang, X. Jin, Y. Shi, W. Du, Event-triggered attitude synchronization of multiple rigid body systems with velocity-free measurements, Automatica, in press.
    [71] X. Jin, Y. Shi, Y. Tang, X. Wu, Event-triggered attitude consensus with absolute and relative attitude measurements, Automatica, 122 (2020), 109245. https://doi.org/10.1016/j.automatica.2020.109245 doi: 10.1016/j.automatica.2020.109245
    [72] R. R. Brooks, J. E. Pang, C. Griffin, Game and information theory analysis of electronic countermeasures in pursuit-evasion games, IEEE Trans. Syst. Man Cybern. Part A Syst. Humans, 38 (2008), 1281–1294. https://doi.org/10.1109/TSMCA.2008.2003970 doi: 10.1109/TSMCA.2008.2003970
    [73] J. Ni, S. X. Yang, Bioinspired neural network for real-time cooperative hunting by multirobots in unknown environments, IEEE Trans. Neural Networks, 22 (2011), 2062–2077. https://doi.org/10.1109/TNN.2011.2169808 doi: 10.1109/TNN.2011.2169808
    [74] J. Poropudas, K. Virtanen, Game-theoretic validation and analysis of air combat simulation models, IEEE Trans. Syst. Man Cybern. Part A Syst. Humans, 40 (2010), 1057–1070. https://doi.org/10.1109/TSMCA.2010.2044997 doi: 10.1109/TSMCA.2010.2044997
    [75] Z. E. Fuchs, P. P. Khargonekar, J. Evers, Cooperative defense within a single-pursuer, two-evader pursuit evasion differential game, in 49th IEEE Conference on Decision and Control (CDC), (2010), 3091–3097. https://doi.org/10.1109/CDC.2010.5717894
    [76] B. Goode, A. Kurdila, M. Roan, Pursuit-evasion with acoustic sensing using one step nash equilibria, in Proceedings of the 2010 American Control Conference, (2010), 1925–1930. https://doi.org/10.1109/ACC.2010.5531356
    [77] Y. Tang, D. Zhang, P. Shi, W. Zhang, F. Qian, Event-based formation control for nonlinear multiagent systems under DoS attacks, IEEE Trans. Autom. Control, 66 (2020), 452–459. https://doi.org/10.1109/TAC.2020.2979936 doi: 10.1109/TAC.2020.2979936
    [78] S. Wang, X. Jin, S. Mao, A. V. Vasilakos, Y. Tang, Model-free event-triggered optimal consensus control of multiple Euler-Lagrange systems via reinforcement learning, IEEE Trans. Network Sci. Eng., 8 (2020), 246–258. https://doi.org/10.1109/TNSE.2020.3036604 doi: 10.1109/TNSE.2020.3036604
    [79] H. Gao, Z. Li, X. Yu, J. Qiu, Hierarchical multiobjective heuristic for PCB assembly optimization in a beam-head surface mounter, IEEE Trans. Cybern., 2021. https://doi.org/10.1109/TCYB.2020.3040788 doi: 10.1109/TCYB.2020.3040788
    [80] Y. Tang, X. Wu, P. Shi, F. Qian, Input-to-state stability for nonlinear systems with stochastic impulses, Automatica, 113 (2020), 108766. https://doi.org/10.1016/j.automatica.2019.108766 doi: 10.1016/j.automatica.2019.108766
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)