
Optimal strategy analysis for adversarial differential games


  • Received: 25 May 2022 Revised: 04 July 2022 Accepted: 13 July 2022 Published: 05 August 2022
• Optimal decision-making and winning-region analysis in adversarial differential games are challenging theoretical problems because of the complex interactions between players. To address these problems, we present an organized review of pursuit-evasion games, reach-avoid games and capture-the-flag games, and outline recent developments in these three types of games. First, we summarize recent results for pursuit-evasion games and classify them according to the number of players. As a special kind of pursuit-evasion game, target-attacker-defender games with an active target are analyzed from the perspective of different speed ratios between players. Second, related works on reach-avoid games and capture-the-flag games are compared in terms of analytical methods and geometric methods, respectively. These methods have different effects on the barrier and optimal strategy analysis between players. Future directions for pursuit-evasion games, reach-avoid games, capture-the-flag games and their applications are discussed at the end.

    Citation: Jiali Wang, Xin Jin, Yang Tang. Optimal strategy analysis for adversarial differential games[J]. Electronic Research Archive, 2022, 30(10): 3692-3710. doi: 10.3934/era.2022189




Over the past decades, the use of biochemical reactors and related techniques has increased greatly because of their fruitful applications in converting biomass or cells into pharmaceutical or chemical products, such as vaccines [1], antibiotics [2], beverages [3], and industrial solvents [4]. Among the various classes and operating regimes of bioreactors, fed-batch modes have been used extensively in the biotechnological industry due to their considerable economic profits [5,6,7]. The main objective of these reactors is to achieve a given or maximum product concentration at the end of the operation, which can be accomplished by using suitable feed rates [8,9,10]. Thus, in order to ensure the economic benefit and product quality of fed-batch processes, the process control of these units is a very important topic for engineers [11,12,13].

Switched dynamical systems provide a flexible modeling framework for a variety of engineering systems, such as financial systems [14], train control systems [15], hybrid electric vehicles [16], chemical process systems [17], and biological systems [18,19,20,21]. Generally speaking, a switched dynamical system is formed by a collection of continuous-time or discrete-time subsystems together with a switching rule [22]. Four types of switching rules are commonly used: time-dependent switching [23], state-dependent switching [24], average dwell time switching [25], and minimum dwell time switching [26]. Recently, optimal control problems for switched dynamical systems have become increasingly attractive due to their significance in theory and industrial production [27,28,29,30]. Because of the discrete nature of switching rules, it is very challenging to solve switched dynamical system optimal control problems directly with classical optimal control approaches such as the maximum principle and dynamic programming [31,32,33,34]. In addition, analytical methods cannot be applied to obtain a solution of switched dynamical system optimal control problems because of their nonlinear nature [35,36,37]. Thus, in recent work, two kinds of well-known numerical optimization algorithms have been developed to obtain numerical solutions of switched dynamical system optimal control problems: the bi-level algorithm [38,39] and the embedding algorithm [40,41]. Besides these two families, many other numerical optimization algorithms have also been developed for such problems [42]. Unfortunately, most of these algorithms depend on the following assumption: the switching rule is designed by a time-dependent switching strategy, which implies that the system dynamics must be continuously differentiable with respect to the system state [43,44,45]. However, this assumption is not always reasonable, since small perturbations of the system state may cause the dynamic equations to change discontinuously. Thus, the solution obtained is usually not optimal. In addition, although these approaches have been demonstrated to be effective on many practical problems, they only produce open-loop controls [46,47,48,49,50,51,52,53]. Unfortunately, such open-loop controls are usually not robust in practice. Thus, optimal feedback controllers are increasingly popular.

In this paper, we consider an optimal feedback control problem for a class of fed-batch fermentation processes by using a switched dynamical system approach. Our main contributions are as follows. First, a dynamic optimization problem for a class of fed-batch fermentation processes is modeled as a switched dynamical system optimal control problem, and a general state-feedback controller is designed for this problem. Unlike existing works, the state-dependent switching method is applied to design the switching rule, and the structure of the state-feedback controller is not restricted to a particular form. In general, the traditional methods for obtaining an optimal feedback control require solving the well-known Hamilton-Jacobi-Bellman partial differential equation, which is very difficult even for unconstrained optimal control problems. To overcome this difficulty, the problem is transformed into a mixed-integer optimal control problem by introducing a discrete-valued function. Furthermore, each of the discrete variables is represented by a set of 0-1 variables. Then, by using a quadratic constraint, these 0-1 variables are relaxed so that they are continuous on the closed interval [0,1]. Accordingly, the original mixed-integer optimal control problem is transformed into a nonlinear parameter optimization problem, which can be solved by any gradient-based numerical optimization algorithm. Unlike existing works, the constraint introduced for the 0-1 variables is at most quadratic, so it does not increase the number of locally optimal solutions of the original problem. During the past decades, many iterative approaches have been proposed for solving nonlinear parameter optimization problems by using information about the objective function. The idea of these iterative approaches is usually to generate an iterative sequence such that the corresponding sequence of objective function values is monotonically decreasing. However, the existing algorithms have the following disadvantage: if an iterate is trapped in a curved narrow valley bottom of the objective function, then the requirement of monotonically decreasing objective values may lead to very short iterative steps, and the methods lose their efficiency. To overcome this challenge, an improved gradient-based algorithm is developed based on a novel search approach in which the sequence of objective function values is not required to be monotonically decreasing. A large number of numerical experiments shows that this novel search approach can effectively improve the convergence speed of the algorithm when an iterate is trapped in a curved narrow valley bottom of the objective function. Finally, an optimal feedback control problem of 1,3-propanediol fermentation processes is provided to illustrate the effectiveness of the method developed in this paper. Numerical simulation results show that the proposed method is less time-consuming, has faster convergence speed, and obtains a better result than the existing approaches.

The rest of this paper is organized as follows. Section 2 presents the optimal feedback control problem for a class of fed-batch fermentation processes. In Section 3, by introducing a discrete-valued function and using a relaxation technique, this problem is transformed into a nonlinear parameter optimization problem, which can be solved by any gradient-based numerical optimization algorithm. An improved gradient-based numerical optimization algorithm is developed in Section 4. In Section 5, the convergence results of this numerical optimization algorithm are established. In Section 6, an optimal feedback control problem of 1,3-propanediol fermentation processes is provided to illustrate the effectiveness of the algorithm developed in this paper.

    In this section, a general state-feedback controller is proposed for a class of fed-batch fermentation process dynamic optimization problems, which will be modeled as an optimal control problem of switched dynamical systems under state-dependent switching.

Let $\alpha_1=[\alpha_{11},\ldots,\alpha_{1r_1}]^T\in\mathbb{R}^{r_1}$ and $\alpha_2=[\alpha_{21},\ldots,\alpha_{2r_2}]^T\in\mathbb{R}^{r_2}$ be two parameter vectors satisfying

$$\underline{a}_i\leq\alpha_{1i}\leq\bar{a}_i,\quad i=1,\ldots,r_1, \qquad (2.1)$$

    and

$$\underline{b}_j\leq\alpha_{2j}\leq\bar{b}_j,\quad j=1,\ldots,r_2, \qquad (2.2)$$

respectively, where $\underline{a}_i$, $\bar{a}_i$, $i=1,\ldots,r_1$, and $\underline{b}_j$, $\bar{b}_j$, $j=1,\ldots,r_2$, denote given constants. Suppose that $t_f>0$ denotes a given terminal time. Then, a class of fed-batch fermentation process dynamic optimization problems can be described as follows: choose two parameter vectors $\alpha_1\in\mathbb{R}^{r_1}$, $\alpha_2\in\mathbb{R}^{r_2}$, and a general state-feedback controller

$$u(t)=\Upsilon(x(t),\vartheta),\quad t\in[0,t_f], \qquad (2.3)$$

    to minimize the objective function

$$J(u(t),\alpha_1,\alpha_2)=\phi(x(t_f)), \qquad (2.4)$$

    subject to the switched dynamical system under state-dependent switching

$$\begin{cases}\text{Subsystem 1}:\ \dfrac{dx(t)}{dt}=f_1(x(t),t), & \text{if } g_1(x(t),\alpha_1,t)=0,\\[2mm] \text{Subsystem 2}:\ \dfrac{dx(t)}{dt}=f_2(x(t),u(t),t), & \text{if } g_2(x(t),\alpha_2,t)=0,\end{cases}\quad t\in[0,t_f], \qquad (2.5)$$

    with the initial condition

$$x(0)=x_0, \qquad (2.6)$$

where $x(t)\in\mathbb{R}^n$ denotes the system state; $x_0$ denotes a given initial system state; $u(t)\in\mathbb{R}^m$ denotes the control input; $\vartheta=[\vartheta_1,\ldots,\vartheta_{r_3}]^T\in\mathbb{R}^{r_3}$ denotes a state-feedback parameter vector satisfying

$$\underline{c}_k\leq\vartheta_k\leq\bar{c}_k,\quad k=1,\ldots,r_3, \qquad (2.7)$$

where $\underline{c}_k$ and $\bar{c}_k$, $k=1,\ldots,r_3$, denote given constants; $\Upsilon:\mathbb{R}^n\times\mathbb{R}^{r_3}\to\mathbb{R}^m$, $\phi:\mathbb{R}^n\to\mathbb{R}$, $f_1:\mathbb{R}^n\times[0,t_f]\to\mathbb{R}^n$, $f_2:\mathbb{R}^n\times\mathbb{R}^m\times[0,t_f]\to\mathbb{R}^n$, $g_1:\mathbb{R}^n\times\mathbb{R}^{r_1}\times[0,t_f]\to\mathbb{R}^n$, and $g_2:\mathbb{R}^n\times\mathbb{R}^{r_2}\times[0,t_f]\to\mathbb{R}^n$ denote given continuously differentiable functions. For convenience, this problem is referred to as Problem 1.

Remark 1. In the switched dynamical system (2.5), Subsystem 1 represents the batch mode, during which there is no input feed (i.e., no control input $u(t)$), and Subsystem 2 represents the feeding mode, during which the input feed (i.e., control input) $u(t)$ is applied. The fed-batch fermentation process oscillates between Subsystem 1 (the batch mode) and Subsystem 2 (the feeding mode), and $g_1(x(t),\alpha_1,t)=0$ and $g_2(x(t),\alpha_2,t)=0$ represent the active conditions of Subsystems 1 and 2, respectively.
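To make the switching mechanism concrete, the following minimal Python sketch simulates a hypothetical one-dimensional instance of (2.5) in which switching is triggered by lower and upper thresholds on the state, playing the role of the active conditions $g_1=0$ and $g_2=0$; the dynamics, thresholds, and parameter values are illustrative assumptions, not the fermentation model used later in the paper.

```python
import numpy as np

# Hypothetical one-dimensional illustration of the switched system (2.5):
# the state x decays in "batch" mode and is driven up in "feeding" mode;
# switching occurs when x crosses the thresholds alpha_low/alpha_high
# (stand-ins for the active conditions g_1 = 0 and g_2 = 0).
def simulate_switched(x0=5.0, alpha_low=2.0, alpha_high=8.0,
                      t_f=10.0, dt=1e-3, k_feed=5.0):
    n_steps = int(t_f / dt)
    x, mode = x0, 1                              # mode 1 = batch, mode 2 = feeding
    traj = np.empty((n_steps, 2))
    for i in range(n_steps):
        if mode == 1 and x <= alpha_low:         # active condition of Subsystem 2
            mode = 2
        elif mode == 2 and x >= alpha_high:      # active condition of Subsystem 1
            mode = 1
        dx = -0.5 * x if mode == 1 else -0.5 * x + k_feed   # f1 vs. f2
        x += dt * dx                             # explicit Euler step
        traj[i] = (i * dt, x)
    return traj

if __name__ == "__main__":
    traj = simulate_switched()
    print("final state:", traj[-1, 1])
```

With these illustrative values the state oscillates between the two thresholds, mimicking the alternation between batch and feeding modes described in Remark 1.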

Remark 2. Note that an integral term, which measures the system running cost, can easily be incorporated into the objective function (2.4) by augmenting the switched dynamical system (2.5) with an additional state variable (see Chapter 8 of [54]). Thus, the absence of an integral term in the objective function (2.4) is not a serious restriction.

Remark 3. The structure of the general state-feedback controller (2.3) is governed by the given continuously differentiable function $\Upsilon$, and the state-feedback parameter vector $\vartheta$ is a decision variable vector to be chosen optimally. For example, the linear state-feedback controller $u(t)=Kx(t)$ is a very common choice, where $K\in\mathbb{R}^{m\times n}$ denotes a state-feedback gain matrix to be determined optimally.

In Problem 1, the state-dependent switching strategy is adopted to design the switching rule, which distinguishes it from existing switched dynamical system optimal control problems. Consequently, the solution of Problem 1 cannot be obtained by directly applying existing numerical computation approaches for switched dynamical system optimal control, in which the switching rule is designed by a time-dependent strategy. To overcome this difficulty, the problem is transformed in this subsection into an equivalent nonlinear dynamical system optimal control problem with discrete and continuous variables by introducing a discrete-valued function.

First, by substituting the general state-feedback controller (2.3) into the switched dynamical system (2.5), Problem 1 can be equivalently written as the following problem:

Problem 2. Choose $(\alpha_1,\alpha_2,\vartheta)\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}$ to minimize the objective function

$$\bar{J}(\alpha_1,\alpha_2,\vartheta)=\phi(x(t_f)), \qquad (3.1)$$

    subject to the switched dynamical system under state-dependent switching

$$\begin{cases}\text{Subsystem 1}:\ \dfrac{dx(t)}{dt}=f_1(x(t),t), & \text{if } g_1(x(t),\alpha_1,t)=0,\\[2mm] \text{Subsystem 2}:\ \dfrac{dx(t)}{dt}=\bar{f}_2(x(t),\vartheta,t), & \text{if } g_2(x(t),\alpha_2,t)=0,\end{cases}\quad t\in[0,t_f], \qquad (3.2)$$

and the three bound constraints (2.1), (2.2) and (2.7), where $\bar{f}_2(x(t),\vartheta,t)=f_2(x(t),\Upsilon(x(t),\vartheta),t)$.

Next, note that the solution of Problem 1 cannot be obtained by directly using existing numerical computation approaches for switched dynamical system optimal control, in which the switching rule is designed by a time-dependent strategy rather than a state-dependent one. To overcome this difficulty, a discrete-valued function $y(t)$ is introduced as follows:

$$y(t)=\begin{cases}1, & \text{if } g_1(x(t),\alpha_1,t)=0,\\ 2, & \text{if } g_2(x(t),\alpha_2,t)=0,\end{cases}\quad t\in[0,t_f]. \qquad (3.3)$$

    Then, Problem 2 can be transformed into the following equivalent optimization problem with discrete and continuous variables:

Problem 3. Choose $(\alpha_1,\alpha_2,\vartheta,y(t))\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\{1,2\}$ to minimize the objective function

$$\tilde{J}(\alpha_1,\alpha_2,\vartheta,y(t))=\phi(x(t_f)), \qquad (3.4)$$

    subject to the nonlinear dynamical system

$$\frac{dx(t)}{dt}=(2-y(t))\,y(t)\,f_1(x(t),t)+(y(t)-1)\,\bar{f}_2(x(t),\vartheta,t),\quad t\in[0,t_f], \qquad (3.5)$$

    the equality constraint

$$(2-y(t))\,y(t)\,g_1(x(t),\alpha_1,t)+(y(t)-1)\,g_2(x(t),\alpha_2,t)=0,\quad t\in[0,t_f], \qquad (3.6)$$

    and the three bound constraints (2.1), (2.2), and (2.7).

Note that standard nonlinear numerical optimization algorithms, such as the sequential quadratic programming algorithm and the interior-point method, are usually developed for nonlinear optimization problems with only continuous variables. Thus, the solution of Problem 3, which has both discrete and continuous variables, cannot be obtained by directly using these standard algorithms. To overcome this difficulty, this subsection introduces a relaxation problem that has only continuous variables.

    Define

$$P(\sigma(t))=\sum_{i=1}^{2}i^{2}\sigma_i(t)-\left(\sum_{i=1}^{2}i\,\sigma_i(t)\right)^{2}, \qquad (3.7)$$

where $\sigma(t)=[\sigma_1(t),\sigma_2(t)]^T$. Then, the following theorem can be established.

Theorem 1. If the nonnegative functions $\sigma_1(t)$ and $\sigma_2(t)$ satisfy the following equality:

$$\sigma_1(t)+\sigma_2(t)=1,\quad t\in[0,t_f], \qquad (3.8)$$

    then two results can be obtained as follows:

(1) For any $t\in[0,t_f]$, the function $P(\sigma(t))$ is nonnegative;

(2) For any $t\in[0,t_f]$, $P(\sigma(t))=0$ if and only if $\sigma_{i^*}(t)=1$ for some $i^*\in\{1,2\}$ and $\sigma_i(t)=0$ for the other index $i\neq i^*$.

    Proof. (1) By using the equality (3.8) and the Cauchy-Schwarz inequality, we have

$$\sum_{i=1}^{2}i\,\sigma_i(t)=\sum_{i=1}^{2}\left(i\sqrt{\sigma_i(t)}\right)\sqrt{\sigma_i(t)}\leq\sqrt{\sum_{i=1}^{2}i^{2}\sigma_i(t)}\,\sqrt{\sum_{i=1}^{2}\sigma_i(t)}=\sqrt{\sum_{i=1}^{2}i^{2}\sigma_i(t)}. \qquad (3.9)$$

Note that the functions $\sigma_1(t)$ and $\sigma_2(t)$ are nonnegative. Then, squaring both sides of the inequality (3.9) yields

$$\left(\sum_{i=1}^{2}i\,\sigma_i(t)\right)^{2}\leq\sum_{i=1}^{2}i^{2}\sigma_i(t),$$

which implies that, for any $t\in[0,t_f]$, the function $P(\sigma(t))$ is nonnegative.

(2) To prove the second part of Theorem 1, it suffices to show that, for any $t\in[0,t_f]$, $P(\sigma(t))=0$ holds only if $\sigma_{i^*}(t)=1$ for some $i^*\in\{1,2\}$ and $\sigma_i(t)=0$ for the other index $i\neq i^*$.

    Define

$$v_1(t)=\left[\sqrt{\sigma_1(t)},\,2\sqrt{\sigma_2(t)}\right],\qquad v_2(t)=\left[\sqrt{\sigma_1(t)},\,\sqrt{\sigma_2(t)}\right].$$

Then, the inequality (3.9) can be equivalently written as

$$v_1(t)\cdot v_2(t)\leq\|v_1(t)\|\,\|v_2(t)\|, \qquad (3.10)$$

where $\cdot$ and $\|\cdot\|$ denote the vector dot product and the Euclidean norm, respectively. Note that the equality

$$v_1(t)\cdot v_2(t)=\|v_1(t)\|\,\|v_2(t)\| \qquad (3.11)$$

holds if and only if there exists a constant $\beta\in\mathbb{R}$ such that

$$v_1(t)=\beta v_2(t). \qquad (3.12)$$

By using the equality (3.8), one obtains $v_1(t)\neq\mathbf{0}$ and $v_2(t)\neq\mathbf{0}$, where $\mathbf{0}$ denotes the zero vector. Then, $\beta$ is a nonzero constant and the equality (3.12) implies

$$(1-\beta)\sqrt{\sigma_1(t)}=0, \qquad (3.13)$$
$$(2-\beta)\sqrt{\sigma_2(t)}=0. \qquad (3.14)$$

Furthermore, the constant $\beta$ must equal one of the integers $i^*\in\{1,2\}$, and for the other integer $i\in\{1,2\}$ one has

$$\sigma_i(t)=0,\quad i\neq i^*. \qquad (3.15)$$

From the two equalities (3.8) and (3.15), we obtain $\sigma_{i^*}(t)=1$. This completes the proof of Theorem 1.
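As a quick numerical sanity check of Theorem 1 (not part of the original paper), the following Python snippet evaluates $P(\sigma)$ from (3.7) on a grid of feasible points with $\sigma_1+\sigma_2=1$ and confirms that it is nonnegative and vanishes only at the binary points $(1,0)$ and $(0,1)$.

```python
import numpy as np

def P(sigma):
    """Quadratic relaxation penalty from (3.7): sum i^2*sigma_i - (sum i*sigma_i)^2."""
    i = np.array([1.0, 2.0])
    return np.dot(i ** 2, sigma) - np.dot(i, sigma) ** 2

# Feasible points: sigma_1 + sigma_2 = 1, 0 <= sigma_i <= 1.
for s1 in np.linspace(0.0, 1.0, 11):
    sigma = np.array([s1, 1.0 - s1])
    val = P(sigma)
    assert val >= -1e-12                 # part (1): nonnegativity
    if abs(val) < 1e-12:                 # part (2): zero only at binary points
        assert s1 in (0.0, 1.0)
    print(f"sigma = ({s1:.1f}, {1.0 - s1:.1f}), P = {val:.4f}")
```

On this feasible set $P$ reduces to $\sigma_1(1-\sigma_1)$, which makes both parts of the theorem visible at a glance.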

    Now, Problem 3 can be rewritten as a relaxation problem as follows:

Problem 4. Choose $(\alpha_1,\alpha_2,\vartheta,\sigma(t))\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\mathbb{R}^{2}$ to minimize the objective function

$$J_{\mathrm{relax}}(\alpha_1,\alpha_2,\vartheta,\sigma(t))=\phi(x(t_f)), \qquad (3.16)$$

    subject to the nonlinear dynamical system

$$\frac{dx(t)}{dt}=(2-\bar{y}(t))\,\bar{y}(t)\,f_1(x(t),t)+(\bar{y}(t)-1)\,\bar{f}_2(x(t),\vartheta,t),\quad t\in[0,t_f], \qquad (3.17)$$

    the two equality constraints

$$(2-\bar{y}(t))\,\bar{y}(t)\,g_1(x(t),\alpha_1,t)+(\bar{y}(t)-1)\,g_2(x(t),\alpha_2,t)=0,\quad t\in[0,t_f], \qquad (3.18)$$
$$P(\sigma(t))=0,\quad t\in[0,t_f], \qquad (3.19)$$

    the bound constraint

$$0\leq\sigma_i(t)\leq 1,\quad i=1,2,\quad t\in[0,t_f], \qquad (3.20)$$

    the equality constraint (3.8), and the three bound constraints (2.1), (2.2), and (2.7), where

$$\bar{y}(t)=1\times\sigma_1(t)+2\times\sigma_2(t). \qquad (3.21)$$

    By using Theorem 1, one can derive that Problems 3 and 4 are equivalent.

Note that the bound constraint (3.20) is essentially a set of continuous-time inequality constraints. Thus, the solution of Problem 4 also cannot be obtained by directly using the existing standard algorithms. In order to obtain the solution of Problem 4, this subsection introduces a nonlinear parameter optimization problem with a number of continuous-time equality constraints and several bound constraints.

Suppose that $\tau_i$ denotes the $i$th switching time. Then, one has

$$0=\tau_0\leq\tau_1\leq\tau_2\leq\cdots\leq\tau_{M-1}\leq\tau_M=t_f, \qquad (3.22)$$

where $M\geq 1$ denotes a given fixed integer. It is important to note that the switching times are not independent optimization variables; their values are obtained indirectly from the state trajectory of the switched dynamical system (2.5). Then, Problem 4 can be transformed into an equivalent optimization problem as follows:

Problem 5. Choose $(\alpha_1,\alpha_2,\vartheta,\xi)\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\mathbb{R}^{2M}$ to minimize the objective function

$$\bar{J}_{\mathrm{relax}}(\alpha_1,\alpha_2,\vartheta,\xi)=\phi(x(t_f)), \qquad (3.23)$$

    subject to the nonlinear dynamical system

$$\frac{dx(t)}{dt}=\sum_{i=1}^{M}\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,f_1(x(t),t)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,\bar{f}_2(x(t),\vartheta,t)\Big)\,\chi_{[\tau_{i-1},\tau_i)}(t),\quad t\in[0,t_f], \qquad (3.24)$$

    the equality constraints

$$\sum_{i=1}^{M}\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,g_1(x(t),\alpha_1,t)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,g_2(x(t),\alpha_2,t)\Big)\,\chi_{[\tau_{i-1},\tau_i)}(t)=0,\quad t\in[0,t_f], \qquad (3.25)$$
$$\bar{P}(\xi,t)=0,\quad t\in[0,t_f], \qquad (3.26)$$
$$\xi_{1i}+\xi_{2i}=1,\quad i=1,\ldots,M, \qquad (3.27)$$

    the bound constraint

$$0\leq\xi_{ji}\leq 1,\quad j=1,2,\quad i=1,\ldots,M, \qquad (3.28)$$

and the three bound constraints (2.1), (2.2), and (2.7), where $\xi_{1i}$ and $\xi_{2i}$ denote, respectively, the values of $\sigma_1(t)$ and $\sigma_2(t)$ on the $i$th subinterval $[\tau_{i-1},\tau_i)$, $i=1,\ldots,M$; $\xi=[(\xi^1)^T,(\xi^2)^T]^T$, $\xi^1=[\xi_{11},\ldots,\xi_{1M}]^T$, $\xi^2=[\xi_{21},\ldots,\xi_{2M}]^T$; $\bar{P}(\xi,t)=\sum_{i=1}^{M}\Big(\sum_{j=1}^{2}j^{2}\xi_{ji}-\big(\sum_{j=1}^{2}j\,\xi_{ji}\big)^{2}\Big)\chi_{[\tau_{i-1},\tau_i)}(t)$; and $\chi_I(t)$ is given by

$$\chi_I(t)=\begin{cases}1, & \text{if } t\in I,\\ 0, & \text{otherwise},\end{cases} \qquad (3.29)$$

which is the indicator function of the subinterval $I\subset[0,t_f]$.
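For illustration only (a sketch under the assumptions of this section, not code from the paper), the piecewise-constant relaxed switching signal $\bar{y}(t)=\xi_{1i}+2\xi_{2i}$ on $[\tau_{i-1},\tau_i)$ can be evaluated directly from the indicator functions in (3.29); the switching times and $\xi$ values below are hypothetical.

```python
import numpy as np

# Illustrative helper: evaluate the piecewise-constant relaxed switching signal
# y_bar(t) = xi_1i + 2*xi_2i on [tau_{i-1}, tau_i), built from the indicator
# functions chi_I(t) defined in (3.29).
def y_bar(t, tau, xi1, xi2):
    """tau: switching times [tau_0, ..., tau_M]; xi1, xi2: values per subinterval."""
    for i in range(len(xi1)):
        if tau[i] <= t < tau[i + 1]:          # chi_{[tau_{i-1}, tau_i)}(t) = 1
            return 1.0 * xi1[i] + 2.0 * xi2[i]
    return 1.0 * xi1[-1] + 2.0 * xi2[-1]      # convention at t = tau_M

# Example with M = 3 subintervals; binary xi encodes the mode sequence 1, 2, 1.
tau = [0.0, 2.0, 5.0, 10.0]
xi1 = [1.0, 0.0, 1.0]
xi2 = [0.0, 1.0, 0.0]
for t in (1.0, 3.0, 7.0):
    print(t, y_bar(t, tau, xi1, xi2))
```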

Since the switching times are unknown, it is very challenging to compute the gradient of the objective function (3.23). To overcome this challenge, the following time-scaling transformation is applied to map the variable switching times to fixed times.

Suppose that the function $t(s):[0,M]\to\mathbb{R}$ is continuously differentiable and is governed by the following equation:

$$\frac{dt(s)}{ds}=\sum_{i=1}^{M}\theta_i\,\chi_{[i-1,i)}(s), \qquad (3.30)$$

    with the boundary condition

$$t(0)=0, \qquad (3.31)$$

where $\theta_i$ is the dwell time of the active subsystem on the $i$th subinterval $[i-1,i)\subset[0,M]$. In general, the transformation (3.30)-(3.31) is referred to as a time-scaling transformation.

Define $\theta=[\theta_1,\ldots,\theta_M]^T$, where

$$0\leq\theta_i\leq t_f,\quad i=1,\ldots,M. \qquad (3.32)$$
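As a small illustration (a sketch assuming piecewise-constant dwell times, not code from the paper), the time-scaling map $t(s)$ defined by (3.30)-(3.31) can be evaluated by accumulating the dwell times $\theta_i$ over completed subintervals plus a fractional contribution from the current one.

```python
import numpy as np

def time_scaling(s, theta):
    """Evaluate t(s) from (3.30)-(3.31): integrate dt/ds = theta_i on [i-1, i)."""
    theta = np.asarray(theta, dtype=float)
    i = int(np.floor(s))                      # index of the current subinterval
    i = min(i, len(theta) - 1)                # clamp s = M to the last subinterval
    return theta[:i].sum() + theta[i] * (s - i)

# Example: M = 3 subintervals with hypothetical dwell times 2, 5 and 3 (t(3) = t_f = 10).
theta = [2.0, 5.0, 3.0]
for s in (0.5, 1.0, 2.5, 3.0):
    print(f"t({s}) = {time_scaling(s, theta)}")
```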

    Then, by using the time-scaling transform (3.30) and (3.31), we can rewrite Problem 5 as the following equivalent nonlinear parameter optimization problem, which has fixed switching times.

Problem 6. Choose $(\alpha_1,\alpha_2,\vartheta,\xi,\theta)\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\mathbb{R}^{2M}\times\mathbb{R}^{M}$ to minimize the objective function

$$\hat{J}_{\mathrm{relax}}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)=\phi(\hat{x}(M)), \qquad (3.33)$$

    subject to the nonlinear dynamical system

$$\frac{d\hat{x}(s)}{ds}=\sum_{i=1}^{M}\theta_i\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,f_1(\hat{x}(s),s)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,\bar{f}_2(\hat{x}(s),\vartheta,s)\Big)\,\chi_{[i-1,i)}(s),\quad s\in[0,M], \qquad (3.34)$$

    the continuous-time equality constraints

$$\sum_{i=1}^{M}\theta_i\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,g_1(\hat{x}(s),\alpha_1,s)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,g_2(\hat{x}(s),\alpha_2,s)\Big)\,\chi_{[i-1,i)}(s)=0,\quad s\in[0,M], \qquad (3.35)$$
$$\hat{P}(\xi,s)=0,\quad s\in[0,M], \qquad (3.36)$$

the linear equality constraint (3.27), and the bound constraints (2.1), (2.2), (2.7), (3.28), and (3.32), where $\hat{x}(s)=x(t(s))$ and $\hat{P}(\xi,s)=\sum_{i=1}^{M}\theta_i\Big(\sum_{j=1}^{2}j^{2}\xi_{ji}-\big(\sum_{j=1}^{2}j\,\xi_{ji}\big)^{2}\Big)\chi_{[i-1,i)}(s)$.

    In this section, an improved gradient-based numerical optimization algorithm will be proposed for obtaining the solution of Problem 1.

To handle the continuous-time equality constraints (3.35) and (3.36), the idea of the $l_1$ penalty function [55] is adopted in this subsection to rewrite Problem 6 as a nonlinear parameter optimization problem with a linear equality constraint and several simple bound constraints.

Problem 7. Choose $(\alpha_1,\alpha_2,\vartheta,\xi,\theta)\in\mathbb{R}^{r_1}\times\mathbb{R}^{r_2}\times\mathbb{R}^{r_3}\times\mathbb{R}^{2M}\times\mathbb{R}^{M}$ to minimize the objective function

$$J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)=\phi(\hat{x}(M))+\gamma\int_{0}^{M}L(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s)\,ds, \qquad (4.1)$$

subject to the nonlinear dynamical system (3.34), the linear equality constraint (3.27), and the bound constraints (2.1), (2.2), (2.7), (3.28) and (3.32), where

$$L(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s)=\hat{P}(\xi,s)+\Big\|\sum_{i=1}^{M}\theta_i\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,g_1(\hat{x}(s),\alpha_1,s)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,g_2(\hat{x}(s),\alpha_2,s)\Big)\,\chi_{[i-1,i)}(s)\Big\|,$$

and $\gamma>0$ denotes the penalty parameter.

The theory of the $l_1$ penalty function [47] indicates that any solution of Problem 7 is also a solution of Problem 6. In addition, the gradient of the linear function in the equality constraint (3.27) is straightforward to obtain, and the gradient of the objective function (4.1) is presented in Section 4.2. Thus, the solution of Problem 7 can be obtained by applying any gradient-based numerical computation method.
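To illustrate the exact-penalty idea on a toy problem (a hypothetical scalar example unrelated to Problem 7), the following sketch minimizes $f(x)+\gamma|h(x)|$ for increasing $\gamma$ and shows that, once $\gamma$ is large enough, the penalized minimizer coincides with the constrained one.

```python
from scipy.optimize import minimize_scalar

# Toy illustration of the l1 exact-penalty idea (not the paper's Problem 7):
# minimize f(x) = (x - 2)^2 subject to h(x) = x - 1 = 0.
# The constrained minimizer is x* = 1; for gamma large enough, the
# unconstrained penalized problem has the same minimizer.
f = lambda x: (x - 2.0) ** 2
h = lambda x: x - 1.0

for gamma in (0.5, 2.0, 10.0):
    res = minimize_scalar(lambda x: f(x) + gamma * abs(h(x)),
                          bounds=(-5.0, 5.0), method="bounded")
    print(f"gamma = {gamma:5.1f}  ->  penalized minimizer x = {res.x:.4f}")
```

For $\gamma=0.5$ the minimizer is still biased toward the unconstrained optimum, while for $\gamma\geq 2$ it sits at the feasible point $x=1$, which is the behavior the exactness property of the $l_1$ penalty predicts.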

To obtain the solution of Problem 7, the gradient formulae of the objective function (4.1) are presented in the following theorem.

Theorem 2. For any $s\in[0,M]$, the gradients of the objective function (4.1) with respect to the decision variables $\alpha_1$, $\alpha_2$, $\vartheta$, $\xi$, and $\theta$ are given by

$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\alpha_1}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\alpha_1}\,ds, \qquad (4.2)$$
$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\alpha_2}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\alpha_2}\,ds, \qquad (4.3)$$
$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\vartheta}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\vartheta}\,ds, \qquad (4.4)$$
$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\xi}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\xi}\,ds, \qquad (4.5)$$
$$\frac{\partial J_{\gamma}(\alpha_1,\alpha_2,\vartheta,\xi,\theta)}{\partial\theta}=\int_{0}^{M}\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\theta}\,ds, \qquad (4.6)$$

where $H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))$ denotes the Hamiltonian function defined by

$$H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))=L(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s)+(\lambda(s))^T\bar{f}(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s), \qquad (4.7)$$
$$\bar{f}(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,s)=\sum_{i=1}^{M}\theta_i\Big(\big(2-(\xi_{1i}+2\xi_{2i})\big)\,(\xi_{1i}+2\xi_{2i})\,f_1(\hat{x}(s),s)+\big((\xi_{1i}+2\xi_{2i})-1\big)\,\bar{f}_2(\hat{x}(s),\vartheta,s)\Big)\,\chi_{[i-1,i)}(s), \qquad (4.8)$$

and the function $\lambda(s)$ denotes the costate, which satisfies the costate system

$$\left(\frac{d\lambda(s)}{ds}\right)^T=-\frac{\partial H(\hat{x}(s),\alpha_1,\alpha_2,\vartheta,\xi,\theta,\lambda(s))}{\partial\hat{x}(s)}, \qquad (4.9)$$

    with the terminal condition

$$(\lambda(M))^T=\frac{\partial\phi(\hat{x}(M))}{\partial\hat{x}(M)}. \qquad (4.10)$$

Proof. Similar to the proof of Theorem 5.2.1 in [56], the gradient formulae (4.2)-(4.6) can be obtained. This completes the proof of Theorem 2.
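To illustrate the costate-based gradient formulae of Theorem 2 on a much simpler problem (a hypothetical scalar example, not the fermentation model), the following sketch computes $dJ/d\theta$ for $dx/ds=\theta f(x)$ with $J=\phi(x(1))$ by integrating the state forward and the costate backward, and checks the result against a finite-difference estimate and the analytical value.

```python
import numpy as np

# Toy illustration of the costate formulae (a hypothetical scalar example):
#   dx/ds = theta * f(x),  x(0) = 1,  J(theta) = phi(x(1)) = x(1)^2,  f(x) = -x.
# The Hamiltonian is H = lambda * theta * f(x); the costate solves
#   dlambda/ds = -dH/dx = theta * lambda,  lambda(1) = dphi/dx = 2 x(1);
# and dJ/dtheta = int_0^1 dH/dtheta ds = int_0^1 lambda * f(x) ds.
def objective_and_gradient(theta, n=20000):
    ds = 1.0 / n
    x = np.empty(n + 1); x[0] = 1.0
    for k in range(n):                                  # forward state sweep
        x[k + 1] = x[k] + ds * theta * (-x[k])
    lam = np.empty(n + 1); lam[-1] = 2.0 * x[-1]
    for k in range(n, 0, -1):                           # backward costate sweep
        lam[k - 1] = lam[k] - ds * theta * lam[k]
    grad = float(np.sum(lam[:-1] * (-x[:-1])) * ds)     # integral of dH/dtheta
    return x[-1] ** 2, grad

theta, eps = 0.7, 1e-6
J, grad = objective_and_gradient(theta)
J_eps, _ = objective_and_gradient(theta + eps)
print("adjoint gradient :", grad)
print("finite difference:", (J_eps - J) / eps)
print("analytical value :", -2.0 * np.exp(-2.0 * theta))
```

The three printed values agree to several digits, which is exactly the forward-state/backward-costate pattern used to evaluate (4.2)-(4.6) numerically.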

For notational simplicity, let $g(\eta)=\nabla J_{\gamma}(\eta)$ denote the gradient of the objective function $J_{\gamma}$ described by (4.1) at $\eta$, where $\eta=[(\alpha_1)^T,(\alpha_2)^T,\vartheta^T,\xi^T,\theta^T]^T$. In addition, let $\|\cdot\|$ and $\|\cdot\|_{\infty}$ denote the Euclidean norm and the infinity norm, respectively, and let the subscript $k$ denote the value of a quantity at the point $\eta_k$ or in the $k$th iteration, for instance, $g_k$ and $(J_{\gamma})_k$. Based on the above discussion, an improved gradient-based numerical optimization algorithm is now provided to obtain the solution of Problem 1.

    Algorithm 1. An improved gradient-based numerical optimization algorithm for solving Problem 1.
    01. Initial: η0Rr1+r2+r3+3M, 0<μ<1, 0<ϖ<1, ρmax, , ;
    02. begin
    03.   calculate the objective function and the gradient at the point ;
    04.   , , ;
    05.   while do
    06.     , , ;
    07.     while do
    08.     , ;
    09.     end
    10.     , ;
    11.     calculate by using the following equality:
                   
            where , ;
    12.     if then
    13.     ;
    14.     otherwise
    15.     ;
    16.     end
    17.     calculate ;
    18.     calculate by using the following equality:
                   
          in ;
    19.     update by using the following equality:
                   
           which satisfying the following inequality:
                   
    20.     k := k +1;
    21.   end
    22. , ;
    23. end
    24. Output: , .
    25. construct the optimal solution and optimal value of Problem 1 by using and .


Remark 4. During the past decades, many iterative approaches have been proposed for solving nonlinear parameter optimization problems by using information about the objective function [57]. The idea of these iterative approaches is usually to generate an iterative sequence such that the corresponding sequence of objective function values is monotonically decreasing. However, the existing algorithms have the following disadvantage: if an iterate is trapped in a curved narrow valley bottom of the objective function, then the requirement of monotonically decreasing objective values may lead to very short iterative steps, and the methods lose their efficiency. To overcome this challenge, an improved gradient-based algorithm is developed in Algorithm 1 based on a novel search approach, in which the sequence of objective function values is not required to be monotonically decreasing. In addition, an improved adaptive strategy for the memory element described by (4.12), which is used in (4.13), is adopted in the iterations of Algorithm 1. The interpretation of (4.12) is as follows. If the first condition in (4.12) holds, the iterate is trapped in a curved narrow valley bottom of the objective function; thus, in order to avoid creeping along the bottom of this narrow curved valley, the value of the memory element should be increased. If the second condition in (4.12) holds, the value of the memory element is better left unchanged. If the third condition in (4.12) holds, the iterate is in a flat region; thus, in order to decrease the objective function value, the value of the memory element will be decreased. The above discussion shows that the novel search approach described in Algorithm 1 is also an adaptive method.
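To convey the idea of Remark 4, the following sketch implements a generic non-monotone Armijo-type line search (in the spirit of the approach described here, but not the paper's Algorithm 1) and applies it to the Rosenbrock function, whose minimizer lies at the bottom of a curved narrow valley; the memory length, constants, and starting point are illustrative assumptions.

```python
import numpy as np

# Generic non-monotone backtracking line search: a step is accepted if the new
# value lies below the maximum of the last `memory` objective values minus an
# Armijo term, so the objective sequence need not decrease monotonically.
def rosenbrock(z):
    return (1.0 - z[0]) ** 2 + 100.0 * (z[1] - z[0] ** 2) ** 2

def rosenbrock_grad(z):
    return np.array([-2.0 * (1.0 - z[0]) - 400.0 * z[0] * (z[1] - z[0] ** 2),
                     200.0 * (z[1] - z[0] ** 2)])

def nonmonotone_descent(z, memory=8, c=1e-4, max_iter=20000, tol=1e-6):
    history = [rosenbrock(z)]
    for _ in range(max_iter):
        g = rosenbrock_grad(z)
        if np.linalg.norm(g) < tol:
            break
        ref = max(history[-memory:])          # non-monotone reference value
        step = 1.0
        while rosenbrock(z - step * g) > ref - c * step * np.dot(g, g):
            step *= 0.5                       # backtracking
        z = z - step * g
        history.append(rosenbrock(z))
    return z, history[-1]

z_opt, f_opt = nonmonotone_descent(np.array([-1.2, 1.0]))
print("approximate minimizer:", z_opt, "objective:", f_opt)
```

Because the acceptance test compares against the maximum of the recent history rather than the current value, occasional increases of the objective are allowed, which lets the iterates take longer steps along the valley instead of creeping along its bottom.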

Remark 5. The sufficient descent condition is extremely important for the convergence of any gradient-based numerical optimization algorithm. Thus, the goal of lines 12-16 of Algorithm 1 is to avoid uphill directions and to keep the search directions uniformly bounded. In fact, for any iteration, there exist two constants such that the search direction satisfies the following two conditions:

    (4.15)
    (4.16)

This section establishes the convergence results of Algorithm 1 developed in Section 4. To this end, we suppose that the following two conditions hold:

Assumption 1.  is a continuously differentiable function and bounded below on .

    Assumption 2. For any and , there is a constant such that

    (5.1)

where  denotes an open set and  denotes the gradient of .

    Theorem 3. Suppose that Assumptions 1 and 2 hold. Let be a sequence obtained by using Algorithm 1. Then, there is a constant such that the following inequality holds:

    (5.2)

    Proof. Let be defined by .

    If , then by using the inequalities (4.14) and (4.15), one can obtain

    (5.3)

Let  be defined by . Then, the proof of Theorem 3 is complete for .

    If , then there is a subset such that the following equality holds:

    (5.4)

    which indicates that there exists a such that the following inequality holds:

    (5.5)

    for any and . Let . Then, the inequality (4.14) does not hold. That is, one can obtain

    (5.6)

    which implies

    (5.7)

    Applying the mean value theorem to the left-hand side of the inequality (5.7) yields

    (5.8)

where . From the inequality (5.8), one obtains

    (5.9)

By using Assumption 2 and the Cauchy-Schwarz inequality, from (4.15) and (5.9), we have

    (5.10)

Furthermore, applying  and the inequality (4.16) to the inequality (5.10), one obtains

    (5.11)

    for any and . Clearly, the inequalities (5.4) and (5.11) are contradictory. Thus, . This completes the proof of Theorem 3.

    Lemma 1. Suppose that Assumptions 1 and 2 hold. Let be a sequence obtained by using Algorithm 1. Then, the following inequalities

    (5.12)
    (5.13)

    are true, where .

    Proof. Note that if the following inequality

    (5.14)

    is true, then the inequality (5.12) also holds. Here, the inequality (5.14) will be proved by using mathematical induction.

    Firstly, Theorem 3 indicates

    (5.15)

    where . By using and the inequality (5.15), one can derive that the inequality (5.14) is true for .

    Suppose that the inequality (5.14) is true for . Note that and the term described in (5.14) is nonnegative. Then, one can obtain

    (5.16)

    for .

    Next, by using , the inequality (5.2), and the inequality (5.16), one can derive

    (5.17)

    which implies that the inequality (5.14) is also true for . Then, the inequality (5.14) is true for by using mathematical induction. Thus, the inequality (5.12) holds.

In addition, Assumption 1 shows that  is continuously differentiable and bounded below on , which indicates that

    (5.18)

    Then, summing the inequality (5.12) over yields

    Thus, the inequality (5.13) holds. This completes the proof of Lemma 1.

Theorem 4. Suppose that the conditions of Theorem 3 hold. Then, the following equality holds:

    (5.19)

    where presents the gradient of the objective function described by (4.1) at the point .

    Proof. Firstly, the following result will be proved: there is a constant such that

    (5.20)

    By using Assumptions 1 and 2, one can obtain

    (5.21)

    Let the constant be defined by . Then, the inequality (5.21) implies that the inequality (5.20) is true.

    Define the function by

    (5.22)

    Then, Lemma 1 indicates that the following equality holds:

    (5.23)

By using the inequality (5.20), one can obtain

    (5.24)

    Thus, from (5.23) and (5.24), one can deduce that the equality (5.19) is true. This completes the proof of Theorem 4.

In this section, an optimal feedback control problem of 1,3-propanediol fermentation processes is provided to illustrate the effectiveness of the approach developed in Sections 2-5. All numerical simulations are implemented on a personal computer with an Intel Skylake dual-core i5-6200U CPU (2.3 GHz).

The 1,3-propanediol fermentation process can be described as switching between two subsystems: a batch subsystem and a feeding subsystem. There is no input feed during the batch subsystem, while alkali and glycerol are added to the fermentor during the feeding subsystem. In general, a subsystem switch occurs when the glycerol concentration reaches the given upper or lower threshold. By using the results of [58], the 1,3-propanediol fermentation process can be modeled as the following switched dynamical system under state-dependent switching:

    (6.1)

where  denotes the given terminal time; the system states , , ,  denote the volume of fluid (), the concentration of biomass (), the concentration of glycerol (), and the concentration of 1,3-propanediol (), respectively; the control input  denotes the feeding rate ();  denotes the system state vector; Subsystem 1 and Subsystem 2 denote the batch subsystem and the feeding subsystem, respectively;  and  (two parameters to be optimized) denote the upper and lower thresholds of the glycerol concentration, respectively; and the functions ,  are given by

    (6.2)
    (6.3)

Subsystem 1 is essentially a natural fermentation process because there is no input feed. The functions , , and  are defined by

    (6.4)
    (6.5)
    (6.6)

which denote the growth rate of the cells, the consumption rate of the substrate, and the formation rate of 1,3-propanediol, respectively. In the equality (6.4), the parameters  and  denote the critical concentrations of glycerol and 1,3-propanediol, respectively; , , , , , , , , , and  are given parameters.

Note that the feeding subsystem does not consist only of the natural fermentation process. Thus, the function  is introduced to describe the process dynamics arising from the control input feed in Subsystem 2. In the equality (6.3), the given parameters  and  denote the proportion and concentration of glycerol in the control input feed, respectively.

In general, as the biomass increases, the consumption of glycerol also increases. Then, during Subsystem 1 (the batch subsystem), the concentration of glycerol eventually becomes too low because no new glycerol is added. Thus, Subsystem 1 switches to Subsystem 2 (the feeding subsystem) when the equality  (the active condition of Subsystem 2) is satisfied. On the other hand, during Subsystem 2 (the feeding subsystem), the concentration of glycerol eventually becomes too high because new glycerol is added, which inhibits the growth of the cells. Thus, Subsystem 2 switches to Subsystem 1 (the batch subsystem) when the equality  (the active condition of Subsystem 1) is satisfied.

Suppose that the feeding rate , the upper threshold of the glycerol concentration , and the lower threshold of the glycerol concentration  satisfy the following bound constraints:

    (6.7)
    (6.8)
    (6.9)

    respectively.

    The model parameters of the dynamic optimization problem for the 1, 3-propanediol fermentation process are presented by

Suppose that the control input takes the form of the piecewise state-feedback controller . Our main objective is to maximize the concentration of 1,3-propanediol at the terminal time . Thus, the optimal feedback control problem of 1,3-propanediol fermentation processes can be stated as follows: choose a control input  to minimize the objective function  subject to the switched dynamical system (6.1) with the initial condition  and the bound constraints (6.7)-(6.9). The improved gradient-based numerical optimization algorithm (Algorithm 1 described in Section 4.3) is then adopted to solve this problem using Matlab 2010a. The optimal objective function value is , and the optimal values of the parameters  and  are  and , respectively. The optimal feedback gain matrices ,  are presented by

and the corresponding numerical simulation results are presented in Figures 1-4.

Figure 1.  The optimal volume of fluid.
Figure 2.  The optimal concentration of biomass.
Figure 3.  The optimal concentration of glycerol.
Figure 4.  The optimal concentration of 1,3-propanediol.

Note that Problem 6 is an optimal control problem of nonlinear dynamical systems with state constraints. Thus, the finite difference approximation approach developed by Nikoobin and Moradi [59] can also be applied to this dynamic optimization problem of 1,3-propanediol fermentation processes. To compare with the improved gradient-based numerical optimization algorithm (Algorithm 1 described in Section 4.3), the finite difference approximation approach of Nikoobin and Moradi [59] is also adopted to solve this problem with the same model parameters under the same conditions, and the numerical comparison results are presented in Figure 5 and Table 1.

    Figure 5.  Convergence rates for the finite difference approximation approach developed by Nikoobin and Moradi [59] and the improved gradient-based numerical optimization algorithm (Algorithm 1 described by Section 4.3).
    Table 1.  The comparison results between the finite difference approximation approach developed by Nikoobin and Moradi [59] and the improved gradient-based numerical optimization algorithm (Algorithm 1 described by Section 4.3).
    Algorithm Computation time (second)
    The finite difference approximation approach developed by Nikoobin and Moradi [59] 1165.3872 1052.9140
    The improved gradient-based numerical optimization algorithm (Algorithm 1 described by Section 4.3) 439.1513 1265.5597


Figure 5 shows that the improved gradient-based numerical optimization algorithm (Algorithm 1 described in Section 4.3) takes only 67 iterations to obtain the satisfactory result , while the finite difference approximation approach developed by Nikoobin and Moradi [59] takes 139 iterations to achieve the satisfactory result . That is, the number of iterations of the improved gradient-based numerical optimization algorithm is reduced by . In addition, Table 1 also shows that the result obtained by the finite difference approximation approach of Nikoobin and Moradi [59] is not superior to the result () obtained by the improved gradient-based numerical optimization algorithm, which also saves computation time.

In conclusion, the above numerical simulation results show that the improved gradient-based numerical optimization algorithm (Algorithm 1 described in Section 4.3) is less time-consuming, has faster convergence speed, and obtains a better numerical solution than the finite difference approximation approach developed by Nikoobin and Moradi [59]. That is, an effective numerical optimization algorithm has been presented for solving the dynamic optimization problem of the 1,3-propanediol fermentation process.

In this paper, the dynamic optimization problem for a class of fed-batch fermentation processes is modeled as an optimal control problem of switched dynamical systems under state-dependent switching, and a general state-feedback controller is designed for this problem. Then, by introducing a discrete-valued function and using a relaxation technique, this problem is transformed into a nonlinear parameter optimization problem. Next, an improved gradient-based algorithm is developed based on a novel search approach, and a large number of numerical experiments show that this novel search approach can effectively improve the convergence speed of the algorithm when an iterate is trapped in a curved narrow valley bottom of the objective function. Finally, an optimal feedback control problem of 1,3-propanediol fermentation processes is provided to illustrate the effectiveness of the proposed method, and the numerical simulation results show that this method is less time-consuming, has faster convergence speed, and obtains a better result than the existing approaches. In the future, we will continue to study the dynamic optimization problem for a class of fed-batch fermentation processes with uncertainty constraints.

The authors express their sincere gratitude to the anonymous reviewers for their constructive comments in improving the presentation and quality of this manuscript. This work was supported by the National Natural Science Foundation of China under Grant Nos. 61963010 and 61563011, and the Special Project for Cultivation of New Academic Talent and Innovation Exploration of Guizhou Normal University in 2019 under Grant No. 11904-0520077.

    The authors declare no conflicts of interest.



    [1] R. Yan, Z. Shi, Y. Zhong, Task assignment for multiplayer reach–avoid games in convex domains via analytical barriers, IEEE Trans. Rob., 36 (2019), 107–124. https://doi.org/10.1109/TRO.2019.2935345 doi: 10.1109/TRO.2019.2935345
    [2] E. Garcia, I. Weintraub, D. W. Casbeer, M. Pachter, Optimal strategies for the game of protecting a plane in 3-d, preprint, arXiv: 2202.01826.
    [3] E. Garcia, D. W. Casbeer, M. Pachter, Optimal strategies of the differential game in a circular region, IEEE Control Syst. Lett., 4 (2019), 492–497. https://doi.org/10.1109/LCSYS.2019.2963173 doi: 10.1109/LCSYS.2019.2963173
    [4] J. Chen, W. Zha, Z. Peng, D. Gu, Multi-player pursuit–evasion games with one superior evader, Automatica, 71 (2016), 24–32. https://doi.org/10.1016/j.automatica.2016.04.012 doi: 10.1016/j.automatica.2016.04.012
    [5] K. Chen, W. He, Q. L. Han, M. Xue, Y. Tang, Leader selection in networks under switching topologies with antagonistic interactions, Automatica, 142 (2022), 110334. https://doi.org/10.1016/j.automatica.2022.110334 doi: 10.1016/j.automatica.2022.110334
    [6] Z. Li, X. Yu, J. Qiu, H. Gao, Cell division genetic algorithm for component allocation optimization in multifunctional placers, IEEE Trans. Ind. Inf., 18 (2021), 559–570. https://doi.org/10.1109/TⅡ.2021.3069459 doi: 10.1109/TⅡ.2021.3069459
    [7] Y. Tang, C. Zhao, J. Wang, C. Zhang, Q. Sun, W. Zheng, et al., An overview of perception and decision-making in autonomous systems in the era of learning, IEEE Trans. Neural Networks Learn. Syst., 2022. https://doi.org/10.1109/TNNLS.2022.3167688 doi: 10.1109/TNNLS.2022.3167688
    [8] E. Garcia, D. W. Casbeer, A. V. Moll, M. Pachter, Multiple pursuer multiple evader differential games, IEEE Trans. Autom. Control, 66 (2020), 2345–2350. https://doi.org/10.1109/TAC.2020.3003840 doi: 10.1109/TAC.2020.3003840
    [9] E. Garcia, D. W. Casbeer, M. Pachter, Optimal strategies for a class of multi-player reach-avoid differential games in 3d space, IEEE Rob. Autom. Lett., 5 (2020), 4257–4264, https://doi.org/10.1109/LRA.2020.2994023 doi: 10.1109/LRA.2020.2994023
    [10] H. Huang, J. Ding, W. Zhang, C. J. Tomlin, Automation-assisted capture-the-flag: A differential game approach, IEEE Trans. Control Syst. Technol., 23 (2014), 1014–1028. https://doi.org/10.1109/TCST.2014.2360502 doi: 10.1109/TCST.2014.2360502
    [11] Z. Zhou, J. Huang, J. Xu, Y. Tang, Two-phase jointly optimal strategies and winning regions of the capture-the-flag game, in IECON 2021 – 47th Annual Conference of the IEEE Industrial Electronics Society, (2021), 1–6. https://doi.org/10.1109/IECON48115.2021.9589624
    [12] E. Garcia, A. V. Moll, D. W. Casbeer, M. Pachter, Strategies for defending a coastline against multiple attackers, in 2019 IEEE 58th Conference on Decision and Control (CDC), (2019), 7319–7324. https://doi.org/10.1109/CDC40024.2019.9029340
    [13] I. E. Weintraub, M. Pachter, E. Garcia, An introduction to pursuit-evasion differential games, in 2020 American Control Conference (ACC), (2020), 1049–1066. https://doi.org/10.23919/ACC45564.2020.9147205
    [14] T. Başar, A tutorial on dynamic and differential games, Dyn. Games Appl. Econ., (1986), 1–25. https://doi.org/10.1007/978-3-642-61636-5_1 doi: 10.1007/978-3-642-61636-5_1
    [15] S. S. Kumkov, S. L. Ménec, V. S. Patsko, Zero-sum pursuit-evasion differential games with many objects: survey of publications, Dyn. Games Appl., 7 (2017), 609–633. https://doi.org/10.1007/s13235-016-0209-z doi: 10.1007/s13235-016-0209-z
    [16] R. Yan, Z. Shi, Y. Zhong, Defense game in a circular region, in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (2017), 5590–5595. https://doi.org/10.1109/CDC.2017.8264502
    [17] I. E. Weintraub, A. V. Moll, E. Garcia, D. Casbeer, Z. J. Demers, M. Pachter, Maximum observation of a faster non-maneuvering target by a slower observer, in 2020 American Control Conference (ACC), (2020), 100–105. https://doi.org/10.23919/ACC45564.2020.9147340
    [18] J. Wang, Y. Hong, J. Wang, J. Xu, Y. Tang, Q. L. Han, et al., Cooperative and competitive multi-agent systems:from optimization to games, IEEE/CAA J. Autom. Sin., 9 (2022), 763–783. https://doi.org/10.1109/JAS.2022.105506 doi: 10.1109/JAS.2022.105506
    [19] A. A. Al-Talabi, Multi-player pursuit-evasion differential game with equal speed, in 2017 International Automatic Control Conference (CACS), (2017), 1–6. https://doi.org/10.1109/CACS.2017.8284276
    [20] D. Shishika, J. Paulos, V. Kumar, Cooperative team strategies for multi-player perimeter-defense games, IEEE Rob. Autom. Lett., 5 (2020), 2738–2745. https://doi.org/10.1109/LRA.2020.2972818 doi: 10.1109/LRA.2020.2972818
    [21] E. Garcia, Z. E. Fuchs, D. Milutinovic, D. W. Casbeer, M. Pachter, A geometric approach for the cooperative two-pursuer one-evader differential game, IFAC-PapersOnLine, 50 (2017), 15209–15214. https://doi.org/10.1016/j.ifacol.2017.08.2366 doi: 10.1016/j.ifacol.2017.08.2366
    [22] A. V. Moll, D. Casbeer, E. Garcia, D. Milutinović, M. Pachter, The multi-pursuer single-evader game, J. Intell. Rob. Syst., 96 (2019), 193–207. https://doi.org/10.1007/s10846-018-0963-9 doi: 10.1007/s10846-018-0963-9
    [23] E. Garcia, S. D. Bopardikar, Cooperative containment of a high-speed evader, in 2021 American Control Conference (ACC), (2021), 4698–4703. https://doi.org/10.23919/ACC50511.2021.9483097
    [24] E. Garcia, D. W. Casbeer, D. Tran, M. Pachter, A differential game approach for beyond visual range tactics, in 2021 American Control Conference (ACC), (2021), 3210–3215. https://doi.org/10.23919/ACC50511.2021.9482650
    [25] Y. Xu, H. Yang, B. Jiang, M. M. Polycarpou, Multi-player pursuit-evasion differential games with malicious pursuers, IEEE Trans. Autom. Control, 2022. https://doi.org/10.1109/TAC.2022.3168430 doi: 10.1109/TAC.2022.3168430
    [26] W. Lin, Z. Qu, M. A. Simaan, Nash strategies for pursuit-evasion differential games involving limited observations, IEEE Trans. Aerosp. Electron. Syst., 51 (2015), 1347–1356. https://doi.org/10.1109/TAES.2014.130569 doi: 10.1109/TAES.2014.130569
    [27] M. Pachter, E. Garcia, D. W. Casbeer, Active target defense differential game, in 2014 52nd Annual Allerton Conference on Communication, Control, and Computing (Allerton), (2014), 46–53. https://doi.org/10.1109/ALLERTON.2014.7028434
    [28] E. Garcia, D. W. Casbeer, M. Pachter, Active target defense using first order missile models, Automatica, 78 (2017), 139–143. https://doi.org/10.1016/j.automatica.2016.12.032 doi: 10.1016/j.automatica.2016.12.032
    [29] M. Coon, D. Panagou, Control strategies for multiplayer target-attacker-defender differential games with double integrator dynamics, in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (2017), 1496–1502. https://doi.org/10.1109/CDC.2017.8263864
    [30] I. E. Weintraub, E. Garcia, M. Pachter, A kinematic rejoin method for active defense of non-maneuverable aircraft, in 2018 Annual American Control Conference (ACC), (2018), 6533–6538. https://doi.org/10.23919/ACC.2018.8431129
    [31] E. Garcia, D. W. Casbeer, M. Pachter, Design and analysis of state-feedback optimal strategies for the differential game of active defense, IEEE Trans. Autom. Control, 64 (2018), 553–568. https://doi.org/10.1109/TAC.2018.2828088 doi: 10.1109/TAC.2018.2828088
    [32] E. Garcia, D. W. Casbeer, M. Pachter, Optimal target capture strategies in the target-attacker-defender differential game, in 2018 Annual American Control Conference (ACC), (2018), 68–73. https://doi.org/10.23919/ACC.2018.8431715
    [33] E. Garcia, D. W. Casbeer, M. Pachter, The complete differential game of active target defense, J. Optim. Theory Appl., 191 (2021), 675–699. https://doi.org/10.1007/s10957-021-01816-z doi: 10.1007/s10957-021-01816-z
    [34] E. Garcia, D. W. Casbeer, M. Pachter, Pursuit in the presence of a defender, Dyn. Games Appl., 9 (2019), 652–670. https://doi.org/10.1007/s13235-018-0271-9 doi: 10.1007/s13235-018-0271-9
    [35] M. Pachter, E. Garcia, D. W. Casbeer, Toward a solution of the active target defense differential game, Dyn. Games Appl., 9 (2019), 165–216. https://doi.org/10.1007/s13235-018-0250-1 doi: 10.1007/s13235-018-0250-1
    [36] E. Garcia, Cooperative target protection from a superior attacker, Automatica, 131 (2021), 109696. https://doi.org/10.1016/j.automatica.2021.109696 doi: 10.1016/j.automatica.2021.109696
    [37] M. Pachter, E. Garcia, R. Anderson, D. W. Casbeer, K. Pham, Maximizing the target's longevity in the active target defense differential game, in 2019 18th European Control Conference (ECC), (2019), 2036–2041. https://doi.org/10.23919/ECC.2019.8795650
    [38] E. Garcia, D. W. Casbeer, M. Pachter, Defense of a target against intelligent adversaries: A linear quadratic formulation, in 2020 IEEE Conference on Control Technology and Applications (CCTA), (2020), 619–624. https://doi.org/10.1109/CCTA41146.2020.9206368
    [39] E. Garcia, D. W. Casbeer, M. Pachter, Cooperative strategies for optimal aircraft defense from an attacking missile, J. Guid., Control, Dyn., 38 (2015), 1510–1520. https://doi.org/10.2514/1.G001083 doi: 10.2514/1.G001083
    [40] L. Liang, F. Deng, Z. Peng, X. Li, W. Zha, A differential game for cooperative target defense, Automatica, 102 (2019), 58–71. https://doi.org/10.1016/j.automatica.2018.12.034 doi: 10.1016/j.automatica.2018.12.034
    [41] Z. Zhou, J. Ding, H. Huang, R. Takei, C. Tomlin, Efficient path planning algorithms in reach-avoid problems, Automatica, 89 (2018), 28–36. https://doi.org/10.1016/j.automatica.2017.11.035 doi: 10.1016/j.automatica.2017.11.035
    [42] P. Shi, W. Sun, X. Yang, I. J. Rudas, H. Gao, Master-slave synchronous control of dual-drive gantry stage with cogging force compensation, IEEE Trans. Syst. Man Cybern.: Syst., https://doi.org/10.1109/TSMC.2022.3176952
    [43] J. Lorenzetti, M. Chen, B. Landry, M. Pavone, Reach-avoid games via mixed-integer second-order cone programming, in 2018 IEEE Conference on Decision and Control (CDC), (2018), 4409–4416. https://doi.org/10.1109/CDC.2018.8619382
    [44] R. Isaacs, Differential games: Their scope, nature, and future, J. Optim. Theory Appl., 3 (1969), 283–295. https://doi.org/10.1007/BF00931368 doi: 10.1007/BF00931368
    [45] R. Yan, Z. Shi, Y. Zhong, Guarding a subspace in high-dimensional space with two defenders and one attacker, IEEE Trans. Cybern., 2020. https://doi.org/10.1109/TCYB.2020.3015031 doi: 10.1109/TCYB.2020.3015031
    [46] R. Yan, Z. Shi, Y. Zhong, Construction of the barrier for reach-avoid differential games in three-dimensional space with four equal-speed players, in 2019 IEEE 58th Conference on Decision and Control (CDC), (2019), 4067–4072. https://doi.org/10.1109/CDC40024.2019.9029495
    [47] K. Margellos, J. Lygeros, Hamilton–jacobi formulation for reach–avoid differential games, IEEE Trans. Autom. Control, 56 (2011), 1849–1861. https://doi.org/10.1109/TAC.2011.2105730 doi: 10.1109/TAC.2011.2105730
    [48] J. F. Fisac, M. Chen, C. J. Tomlin, S. S. Sastry, Reach-avoid problems with time-varying dynamics, targets and constraints, in HSCC '15: Proceedings of the 18th International Conference on Hybrid Systems: Computation and Control, (2015), 11–20. https://doi.org/10.1145/2728606.2728612
    [49] M. Chen, Z. Zhou, C. J. Tomlin, Multiplayer reach-avoid games via pairwise outcomes, IEEE Trans. Autom. Control, 62 (2016), 1451–1457. https://doi.org/10.1109/TAC.2016.2577619 doi: 10.1109/TAC.2016.2577619
    [50] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, et al., Playing atari with deep reinforcement learning, preprint, arXiv: 1312.5602.
    [51] S. Bansal, C. J. Tomlin, Deepreach: A deep learning approach to high-dimensional reachability, in 2021 IEEE International Conference on Robotics and Automation (ICRA), (2021), 1817–1824. https://doi.org/10.1109/ICRA48506.2021.9561949
    [52] J. Li, D. Lee, S. Sojoudi, C. J. Tomlin, Infinite-horizon reach-avoid zero-sum games via deep reinforcement learning, preprint, arXiv: 2203.10142.
    [53] K. C. Hsu, V. R. Royo, C. J. Tomlin, J. F. Fisac, Safety and liveness guarantees through reach-avoid reinforcement learning, preprint, arXiv: 2112.12288.
    [54] E. Garcia, D. W. Casbeer, A. V. Moll, M. Pachter, Cooperative two-pursuer one-evader blocking differential game, in 2019 American Control Conference (ACC), (2019), 2702–2709. https://doi.org/10.23919/ACC.2019.8814294
    [55] R. Yan, X. Duan, Z. Shi, Y. Zhong, F. Bullo, Matching-based capture strategies for 3d heterogeneous multiplayer reach-avoid differential games, Automatica, 140 (2022), 110207. https://doi.org/10.1016/j.automatica.2022.110207 doi: 10.1016/j.automatica.2022.110207
    [56] J. Selvakumar, E. Bakolas, Feedback strategies for a reach-avoid game with a single evader and multiple pursuers, IEEE Trans. Cybern., 51 (2019), 696–707. https://doi.org/10.1109/TCYB.2019.2914869 doi: 10.1109/TCYB.2019.2914869
    [57] E. Garcia, D. W. Casbeer, M. Pachter, J. W. Curtis, E. Doucette, A two-team linear quadratic differential game of defending a target, in 2020 American Control Conference (ACC), (2020), 1665–1670. https://doi.org/10.23919/ACC45564.2020.9147665
    [58] S. D. Bopardikar, F. Bullo, J. P. Hespanha, A cooperative homicidal chauffeur game, Automatica, 45 (2009), 1771–1777. https://doi.org/10.1016/j.automatica.2009.03.014 doi: 10.1016/j.automatica.2009.03.014
    [59] R. Lopez-Padilla, R. Murrieta-Cid, I. Becerra, G. Laguna, S. M. LaValle, Optimal navigation for a differential drive disc robot: A game against the polygonal environment, J. Intell. Rob. Syst., 89 (2018), 211–250. https://doi.org/10.1007/s10846-016-0433-1 doi: 10.1007/s10846-016-0433-1
    [60] A. Pierson, Z. Wang, M. Schwager, Intercepting rogue robots: An algorithm for capturing multiple evaders with multiple pursuers, IEEE Rob. Autom. Lett., 2 (2016), 530–537. https://doi.org/10.1109/LRA.2016.2645516 doi: 10.1109/LRA.2016.2645516
    [61] Z. Zhou, W. Zhang, J. Ding, H. Huang, D. M. Stipanović, C. J. Tomlin, Cooperative pursuit with voronoi partitions, Automatica, 72 (2016), 64–72. https://doi.org/10.1016/j.automatica.2016.05.007 doi: 10.1016/j.automatica.2016.05.007
    [62] E. Bakolas, P. Tsiotras, Relay pursuit of a maneuvering target using dynamic voronoi diagrams, Automatica, 48 (2012), 2213–2220. https://doi.org/10.1016/j.automatica.2012.06.003 doi: 10.1016/j.automatica.2012.06.003
    [63] R. Yan, Z. Shi, Y. Zhong, Reach-avoid games with two defenders and one attacker: An analytical approach, IEEE Trans. Cybern., 49 (2018), 1035–1046. https://doi.org/10.1109/TCYB.2018.2794769 doi: 10.1109/TCYB.2018.2794769
    [64] R. Yan, Z. Shi, Y. Zhong, Cooperative strategies for two-evader-one-pursuer reach-avoid differential games, Int. J. Syst. Sci., 52 (2021), 1894–1912. https://doi.org/10.1080/00207721.2021.1872116 doi: 10.1080/00207721.2021.1872116
    [65] J. Wang, J. Huang, Y. Tang, Swarm intelligence capture-the-flag game with imperfect information based on deep reinforcement learning, Sci. Sin. Technol., 2021. https://doi.org/10.1360/SST-2021-0382 doi: 10.1360/SST-2021-0382
    [66] I. M. Mitchell, A. M. Bayen, C. J. Tomlin, A time-dependent hamilton-jacobi formulation of reachable sets for continuous dynamic games, IEEE Trans. Autom. Control, 50 (2005), 947–957. https://doi.org/10.1109/TAC.2005.851439 doi: 10.1109/TAC.2005.851439
    [67] E. Garcia, D. W. Casbeer, M. Pachter, The capture-the-flag differential game, in 2018 IEEE Conference on Decision and Control (CDC), (2018), 4167–4172. https://doi.org/10.1109/CDC.2018.8619026
    [68] M. Pachter, D. W. Casbeer, E. Garcia, Capture-the-flag: A differential game, in 2020 IEEE Conference on Control Technology and Applications (CCTA), (2020), 606–610. https://doi.org/10.1109/CCTA41146.2020.9206333
    [69] Z. Liu, W. Lin, X. Yu, J. J. Rodríguez-Andina, H. Gao, Approximation-free robust synchronization control for dual-linear-motors-driven systems with uncertainties and disturbances, IEEE Trans. Ind. Electron., 69 (2021), 10500–10509. https://doi.org/10.1109/TIE.2021.3137619 doi: 10.1109/TIE.2021.3137619
    [70] Y. Tang, X. Jin, Y. Shi, W. Du, Event-triggered attitude synchronization of multiple rigid body systems with velocity-free measurements, Automatica, in press.
    [71] X. Jin, Y. Shi, Y. Tang, X. Wu, Event-triggered attitude consensus with absolute and relative attitude measurements, Automatica, 122 (2020), 109245. https://doi.org/10.1016/j.automatica.2020.109245 doi: 10.1016/j.automatica.2020.109245
    [72] R. R. Brooks, J. E. Pang, C. Griffin, Game and information theory analysis of electronic countermeasures in pursuit-evasion games, IEEE Trans. Syst. Man Cybern. Part A Syst. Humans, 38 (2008), 1281–1294. https://doi.org/10.1109/TSMCA.2008.2003970 doi: 10.1109/TSMCA.2008.2003970
    [73] J. Ni, S. X. Yang, Bioinspired neural network for real-time cooperative hunting by multirobots in unknown environments, IEEE Trans. Neural Networks, 22 (2011), 2062–2077. https://doi.org/10.1109/TNN.2011.2169808 doi: 10.1109/TNN.2011.2169808
    [74] J. Poropudas, K. Virtanen, Game-theoretic validation and analysis of air combat simulation models, IEEE Trans. Syst. Man Cybern. Part A Syst. Humans, 40 (2010), 1057–1070. https://doi.org/10.1109/TSMCA.2010.2044997 doi: 10.1109/TSMCA.2010.2044997
    [75] Z. E. Fuchs, P. P. Khargonekar, J. Evers, Cooperative defense within a single-pursuer, two-evader pursuit evasion differential game, in 49th IEEE Conference on Decision and Control (CDC), (2010), 3091–3097. https://doi.org/10.1109/CDC.2010.5717894
    [76] B. Goode, A. Kurdila, M. Roan, Pursuit-evasion with acoustic sensing using one step nash equilibria, in Proceedings of the 2010 American Control Conference, (2010), 1925–1930. https://doi.org/10.1109/ACC.2010.5531356
    [77] Y. Tang, D. Zhang, P. Shi, W. Zhang, F. Qian, Event-based formation control for nonlinear multiagent systems under DoS attacks, IEEE Trans. Autom. Control, 66 (2020), 452–459. https://doi.org/10.1109/TAC.2020.2979936 doi: 10.1109/TAC.2020.2979936
    [78] S. Wang, X. Jin, S. Mao, A. V. Vasilakos, Y. Tang, Model-free event-triggered optimal consensus control of multiple Euler-Lagrange systems via reinforcement learning, IEEE Trans. Network Sci. Eng., 8 (2020), 246–258. https://doi.org/10.1109/TNSE.2020.3036604 doi: 10.1109/TNSE.2020.3036604
    [79] H. Gao, Z. Li, X. Yu, J. Qiu, Hierarchical multiobjective heuristic for PCB assembly optimization in a beam-head surface mounter, IEEE Trans. Cybern., 2021. https://doi.org/10.1109/TCYB.2020.3040788 doi: 10.1109/TCYB.2020.3040788
    [80] Y. Tang, X. Wu, P. Shi, F. Qian, Input-to-state stability for nonlinear systems with stochastic impulses, Automatica, 113 (2020), 108766. https://doi.org/10.1016/j.automatica.2019.108766 doi: 10.1016/j.automatica.2019.108766
  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)