Research article Special Issues

Fault-tolerant control for nonlinear systems with a dead zone: Reinforcement learning approach


  • Received: 02 December 2022 Revised: 31 December 2022 Accepted: 17 January 2023 Published: 01 February 2023
  • This paper focuses on the adaptive reinforcement learning-based optimal control problem for standard nonstrict-feedback nonlinear systems with an actuator fault and an unknown dead zone. To simultaneously reduce the computational complexity and avoid the local-optimum problem, a novel neural network weight-updating algorithm is presented to replace the classic gradient descent method. By utilizing the backstepping technique, an actor-critic-based reinforcement learning control strategy is developed for high-order nonlinear nonstrict-feedback systems. In addition, two auxiliary parameters are introduced to deal with the input dead zone and the actuator fault, respectively. All signals in the system are proven to be semi-globally uniformly ultimately bounded by Lyapunov theory analysis. At the end of the paper, simulation results are shown to illustrate the effectiveness of the proposed approach.

    Citation: Zichen Wang, Xin Wang. Fault-tolerant control for nonlinear systems with a dead zone: Reinforcement learning approach[J]. Mathematical Biosciences and Engineering, 2023, 20(4): 6334-6357. doi: 10.3934/mbe.2023274




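    Let $\mathbb{C}$ be the complex plane. Denote by $\mathbb{C}^N$ the $N$-dimensional complex Euclidean space with the inner product $\langle z,w\rangle=\sum_{j=1}^{N}z_j\overline{w_j}$; by $|z|^2=\langle z,z\rangle$; by $H(\mathbb{C}^N)$ the set of all holomorphic functions on $\mathbb{C}^N$; and by $I$ the identity operator on $\mathbb{C}^N$.

    The Fock space $\mathcal{F}^2(\mathbb{C}^N)$ is a Hilbert space of all holomorphic functions $f\in H(\mathbb{C}^N)$ with the inner product

    $$\langle f,g\rangle=\frac{1}{(2\pi)^N}\int_{\mathbb{C}^N}f(z)\overline{g(z)}e^{-\frac{1}{2}|z|^2}\,d\nu(z),$$

    where $\nu(z)$ denotes Lebesgue measure on $\mathbb{C}^N$. To simplify notation, we will often use $\mathcal{F}^2$ instead of $\mathcal{F}^2(\mathbb{C}^N)$, and we will denote by $\|f\|$ the corresponding norm of $f$. The reproducing kernel functions of the Fock space are given by

    $$K_w(z)=e^{\frac{\langle z,w\rangle}{2}},\quad z\in\mathbb{C}^N,$$

    which means that if $f\in\mathcal{F}^2$, then $f(z)=\langle f,K_z\rangle$ for all $z\in\mathbb{C}^N$. It is easy to see that $\|K_w\|=e^{|w|^2/4}$. Therefore, the following estimate holds:

    $$|f(z)|\leq e^{\frac{|z|^2}{4}}\|f\|$$

    for $f\in\mathcal{F}^2$ and $z\in\mathbb{C}^N$. If $k_w$ is the normalization of $K_w$, then

    $$k_w(z)=e^{\frac{\langle z,w\rangle}{2}-\frac{|w|^2}{4}},\quad z\in\mathbb{C}^N.$$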
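    The kernel norm and the pointwise estimate above can be checked numerically in the one-variable case. The sketch below is not from the source: it assumes $N=1$ and uses the standard fact that the monomials $e_n(z)=z^n/\sqrt{2^n n!}$ are orthonormal for this inner product, so that $\|K_w\|^2=\sum_n|e_n(w)|^2$. The function names, sample points and coefficients are illustrative choices.

```python
import math

def basis(n, z):
    """Orthonormal monomial e_n(z) = z^n / sqrt(2^n n!) (assumes N = 1)."""
    return z**n / math.sqrt(2.0**n * math.factorial(n))

def kernel_norm_sq(w, terms=60):
    """||K_w||^2 = sum_n |e_n(w)|^2, truncated after `terms` terms."""
    return sum(abs(basis(n, w))**2 for n in range(terms))

def evaluate(coeffs, z):
    """f(z) for f = sum_n coeffs[n] * e_n."""
    return sum(c * basis(n, z) for n, c in enumerate(coeffs))

# Compare the series for ||K_w||^2 with the closed form (e^{|w|^2/4})^2.
w = 1.2 - 0.7j
norm_sq_series = kernel_norm_sq(w)
norm_sq_closed = math.exp(abs(w)**2 / 2)

# Check |f(z)| <= e^{|z|^2/4} ||f|| for an arbitrary sample polynomial.
coeffs = [1.0, -0.5j, 0.25, 0.1 + 0.3j]
f_norm = math.sqrt(sum(abs(c)**2 for c in coeffs))
z = -0.8 + 1.5j
bound_holds = abs(evaluate(coeffs, z)) <= math.exp(abs(z)**2 / 4) * f_norm
```

    Working with coefficients in an orthonormal basis avoids numerical integration against the Gaussian weight, which is why the sketch is restricted to $N=1$.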

    In physics, $\mathcal{F}^2$ is used to describe systems with varying numbers of particles in the states of quantum harmonic oscillators, and the reproducing kernels in $\mathcal{F}^2$ are used to describe the coherent states. See [17] for more about the Fock space, and see [1,7,11] for studies of some operators on the Fock space.

    For a given holomorphic mapping $\varphi:\mathbb{C}^N\to\mathbb{C}^N$ and $u\in H(\mathbb{C}^N)$, the weighted composition operator, usually denoted by $W_{u,\varphi}$, on or between some subspaces of $H(\mathbb{C}^N)$ is defined by

    $$W_{u,\varphi}f(z)=u(z)f(\varphi(z)).$$

    When $u=1$, it is the composition operator, usually denoted by $C_\varphi$. When $\varphi(z)=z$, it is the multiplication operator, usually denoted by $M_u$.

    Forelli in [8] proved that the isometries on the Hardy space $H^p$ defined on the open unit disk (for $p\neq 2$) are certain weighted composition operators, which can be regarded as the earliest appearance of weighted composition operators. Weighted composition operators have also been used in descriptions of adjoints of composition operators (see [4]). An elementary problem is to provide function-theoretic characterizations of the symbols $u$ and $\varphi$ that induce a bounded or compact weighted composition operator on various holomorphic function spaces. There have been many studies of weighted composition operators and composition operators on holomorphic function spaces. For instance, several authors have recently worked on composition operators and weighted composition operators on Fock space. For the one-variable case, Ueki [13] characterized the boundedness and compactness of weighted composition operators on Fock space. As a continuation of [13], Le [10] found simpler criteria for the boundedness and compactness of weighted composition operators. Recently, Bhuia in [2] characterized a class of $C$-normal weighted composition operators on Fock space.

    For the several-variable case, Carswell et al. [3] studied the boundedness and compactness of composition operators. From [3], we see that in the one-variable case the composition operator $C_\varphi$ is bounded on Fock space if and only if $\varphi(z)=az+b$, where $|a|\leq 1$, and if $|a|=1$, then $b=0$. Let $A:\mathbb{C}^N\to\mathbb{C}^N$ be a linear operator. Zhao [14,15,16] characterized the unitary, invertible, and normal weighted composition operators $W_{u,\varphi}$ on Fock space when $\varphi(z)=Az+b$ and $u=k_c$. Interestingly enough, Zhao [15] proved that for $\varphi(z)=Az+b$ and $u(z)=K_c(z)$, the weighted composition operator $W_{u,\varphi}$ is bounded on Fock space if and only if $\|A\|\leq 1$ and $\langle A\zeta,b+Ac\rangle=0$ whenever $|A\zeta|=|\zeta|$ for $\zeta\in\mathbb{C}^N$.

    Motivated by the above-mentioned works, for the special symbols $\varphi(z)=Az+b$ and $u=K_c$, here we study the adjoint, self-adjointness, and hyponormality of weighted composition operators on Fock space. Such properties of abstract or concrete operators (for example, Toeplitz operators, Hankel operators, and composition operators) have been extensively studied on some other holomorphic function spaces. This paper can be regarded as a continuation of the study of weighted composition operators on Fock space.

    In this section, we characterize the adjoints of weighted composition operators $W_{u,\varphi}$ on Fock space, where $\varphi(z)=Az+b$ and $u=K_c$.

    We first have the following result:

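    Lemma 2.1. Let $A,B:\mathbb{C}^N\to\mathbb{C}^N$ be linear operators with $\|A\|\leq 1$ and $\|B\|\leq 1$, $\varphi(z)=Az+a$, $\psi(z)=Bz+b$ for $a,b\in\mathbb{C}^N$, and the operators $C_\varphi$ and $C_\psi$ be bounded on $\mathcal{F}^2$. Then

    $$C_\varphi^*C_\psi=W_{K_a,BA^*z+b},$$

    where $A^*$ is the adjoint operator of $A$.

    Proof. From Lemma 2 in [3], it follows that

    $$C_\varphi^*C_\psi=M_{K_a}C_{A^*z}C_{Bz+b}=M_{K_a}C_{(Bz+b)\circ A^*z}=M_{K_a}C_{BA^*z+b}=W_{K_a,BA^*z+b},$$

    from which the result follows. The proof is complete.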
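    Lemma 2.1 can be sanity-checked numerically in one variable. The following sketch is not part of the source: it assumes $N=1$, represents operators by their matrices in the orthonormal monomial basis $e_n(z)=z^n/\sqrt{2^n n!}$, and compares $C_\varphi^*C_\psi$ with $W_{K_a,B\bar{A}z+b}$ entry by entry. The sample values of `A`, `a`, `B`, `b` and the helper names are arbitrary illustrative choices.

```python
import math

def comp_matrix(alpha, beta, size):
    """Matrix of C_phi, phi(z) = alpha*z + beta, in the basis e_n = z^n / sqrt(2^n n!)."""
    m = [[0j] * size for _ in range(size)]
    for n in range(size):
        for k in range(n + 1):
            scale = math.sqrt(2.0**k * math.factorial(k) / (2.0**n * math.factorial(n)))
            m[k][n] = math.comb(n, k) * alpha**k * beta**(n - k) * scale
    return m

def weighted_matrix(a, gamma, b, size):
    """Matrix of W_{K_a, gamma*z + b}, with K_a(z) = exp(z * conj(a) / 2)."""
    m = [[0j] * size for _ in range(size)]
    for n in range(size):
        for j in range(size):
            # Coefficient of z^j in exp(z*conj(a)/2) * (gamma*z + b)^n.
            coeff = 0j
            for k in range(min(j, n) + 1):
                coeff += (math.comb(n, k) * gamma**k * b**(n - k)
                          * (a.conjugate() / 2) ** (j - k) / math.factorial(j - k))
            m[j][n] = coeff * math.sqrt(2.0**j * math.factorial(j) / (2.0**n * math.factorial(n)))
    return m

def adjoint_times(m1, m2):
    """Return m1^* m2 (conjugate transpose of m1 times m2)."""
    size = len(m1)
    return [[sum(m1[k][j].conjugate() * m2[k][n] for k in range(size))
             for n in range(size)] for j in range(size)]

A, a = 0.5 + 0.25j, 0.3 - 0.2j   # phi(z) = A*z + a
B, b = 0.4j, -0.1 + 0.5j         # psi(z) = B*z + b
size = 8
lhs = adjoint_times(comp_matrix(A, a, size), comp_matrix(B, b, size))
rhs = weighted_matrix(a, B * A.conjugate(), b, size)
max_err = max(abs(lhs[j][n] - rhs[j][n]) for j in range(size) for n in range(size))
```

    Because the matrix of a composition operator with affine symbol is upper triangular in this basis, the truncated product is exact in every computed entry, so `max_err` should only reflect floating-point rounding.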

    In Lemma 2.1, we prove that the product of the adjoint of a composition operator and another composition operator is expressed as a weighted composition operator. Next, we will see that in some sense, the converse of Lemma 2.1 is also true. Namely, we will prove that if $\varphi(z)=Az+b$, where $A:\mathbb{C}^N\to\mathbb{C}^N$ is a linear operator with $\|A\|<1$, and $u=K_c$, then the operator $W_{u,\varphi}$ on $\mathcal{F}^2$ can be written as the product of the adjoint of a composition operator and another composition operator.

    Lemma 2.2. Let $A:\mathbb{C}^N\to\mathbb{C}^N$ be a linear operator with $\|A\|<1$. If $A$ and $c$ satisfy the condition $\langle A\zeta,c\rangle=0$ whenever $|A\zeta|=|\zeta|$, then there exists a positive integer $n$ such that the operator $W_{u,\varphi}$ on $\mathcal{F}^2$ defined by $\varphi(z)=Az+b$ and $u(z)=K_c(z)$ is expressed as

    $$W_{u,\varphi}=C^*_{\frac{n+1}{n}A^*z+c}C_{\frac{n}{n+1}z+b}.$$

    Proof. From Theorem 2 in [3], we see that the operator $C_{A^*z+c}$ is bounded on $\mathcal{F}^2$. Since $\|A\|<1$, there exists a large enough positive integer $n$ such that

    $$\Big(1+\frac{1}{n}\Big)\|A\|\leq 1.$$

    Also, by Theorem 2 in [3], the operator $C_{\frac{n+1}{n}A^*z+c}$ is bounded on $\mathcal{F}^2$, which implies that the operator $C^*_{\frac{n+1}{n}A^*z+c}$ is also bounded on $\mathcal{F}^2$. Since $|\frac{n}{n+1}I\zeta|=|\zeta|$ if and only if $\zeta=0$, we have $\langle\frac{n}{n+1}I\zeta,b\rangle=0$ whenever $|\frac{n}{n+1}I\zeta|=|\zeta|$. By Theorem 2 in [3], the operator $C_{\frac{n}{n+1}Iz+b}$ is bounded on $\mathcal{F}^2$. Then, it follows from Lemma 2.1 that

    $$C^*_{\frac{n+1}{n}A^*z+c}C_{\frac{n}{n+1}Iz+b}=W_{K_c,Az+b}.$$

    The proof is complete.
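    As an illustration of the factorization (a hand-picked special case, not taken from the source), take $N=1$, $A=\frac{1}{2}$, and $n=2$. Both composition symbols then have linear coefficient of modulus less than $1$, so both composition operators are bounded by the one-variable criterion from [3], and Lemma 2.1 gives

    $$\Big(1+\frac{1}{2}\Big)\|A\|=\frac{3}{4}\leq 1,\qquad C^*_{\frac{3}{4}z+c}C_{\frac{2}{3}z+b}=W_{K_c,\frac{2}{3}\cdot\frac{3}{4}z+b}=W_{K_c,\frac{1}{2}z+b}.$$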

    Now, we can obtain the adjoint for some weighted composition operators.

    Theorem 2.1. Let $\varphi(z)=Az+b$, $u(z)=K_c(z)$, and $A$ and $c$ satisfy $\langle A\zeta,c\rangle=0$ whenever $|A\zeta|=|\zeta|$. Then it holds that

    $$W_{u,\varphi}^*=W_{K_b,A^*z+c}.$$

    Proof. In Lemma 2.2, we have

    $$W_{u,\varphi}=C^*_{\frac{n+1}{n}A^*z+c}C_{\frac{n}{n+1}Iz+b}.\tag{2.1}$$

    It follows from (2.1) that

    $$W_{u,\varphi}^*=C^*_{\frac{n}{n+1}Iz+b}C_{\frac{n+1}{n}A^*z+c}.\tag{2.2}$$

    Therefore, from (2.2) and Lemma 2.1, the desired result follows. The proof is complete.

    By using the kernel functions, we can obtain the following result:

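    Lemma 2.3. Let the operator $W_{u,\varphi}$ be a bounded operator on $\mathcal{F}^2$. Then it holds that

    $$W_{u,\varphi}^*K_w=\overline{u(w)}K_{\varphi(w)}.$$

    Proof. Let $f$ be an arbitrary function in $\mathcal{F}^2$. We see that

    $$\langle W_{u,\varphi}^*K_w,f\rangle=\langle K_w,W_{u,\varphi}f\rangle=\overline{\langle W_{u,\varphi}f,K_w\rangle}=\overline{u(w)f(\varphi(w))}=\langle\overline{u(w)}K_{\varphi(w)},f\rangle.$$

    From this, we deduce that $W_{u,\varphi}^*K_w=\overline{u(w)}K_{\varphi(w)}$. The proof is complete.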

    Here, we characterize the self-adjoint weighted composition operators.

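    Theorem 2.2. Let $A:\mathbb{C}^N\to\mathbb{C}^N$ be a linear operator, $b,c\in\mathbb{C}^N$, $\varphi(z)=Az+b$, $u(z)=K_c(z)$, and the operator $W_{u,\varphi}$ be bounded on $\mathcal{F}^2$. Then the operator $W_{u,\varphi}$ is self-adjoint on $\mathcal{F}^2$ if and only if $A:\mathbb{C}^N\to\mathbb{C}^N$ is self-adjoint and $b=c$.

    Proof. In Lemma 2.3, we have

    $$W_{u,\varphi}^*K_w(z)=\overline{u(w)}K_{\varphi(w)}(z)=\overline{K_c(w)}e^{\frac{\langle z,\varphi(w)\rangle}{2}}=e^{\frac{\langle c,w\rangle}{2}}e^{\frac{\langle z,Aw+b\rangle}{2}}.\tag{2.3}$$

    On the other hand,

    $$W_{u,\varphi}K_w(z)=u(z)K_w(\varphi(z))=e^{\frac{\langle z,c\rangle}{2}}e^{\frac{\langle Az+b,w\rangle}{2}}.\tag{2.4}$$

    It is clear that the operator $W_{u,\varphi}$ is self-adjoint on $\mathcal{F}^2$ if and only if, for all $w\in\mathbb{C}^N$,

    $$W_{u,\varphi}^*K_w=W_{u,\varphi}K_w.$$

    From (2.3) and (2.4), it follows that

    $$e^{\frac{\langle c,w\rangle}{2}}e^{\frac{\langle z,Aw+b\rangle}{2}}=e^{\frac{\langle z,c\rangle}{2}}e^{\frac{\langle Az+b,w\rangle}{2}}.\tag{2.5}$$

    Letting $z=0$ in (2.5), we obtain that $e^{\frac{\langle c,w\rangle}{2}}=e^{\frac{\langle b,w\rangle}{2}}$, which implies that

    $$\langle c,w\rangle-\langle b,w\rangle=4k\pi i,\tag{2.6}$$

    where $k\in\mathbb{Z}$. Also, letting $w=0$ in (2.6), we see that $k=0$. This shows that $\langle c,w\rangle-\langle b,w\rangle=0$, that is, $\langle c,w\rangle=\langle b,w\rangle$. From this, we deduce that $b=c$. Therefore, (2.5) becomes $e^{\frac{\langle z,Aw\rangle}{2}}=e^{\frac{\langle Az,w\rangle}{2}}$. From this, we obtain that $\langle z,Aw\rangle=\langle Az,w\rangle$, which implies that $\langle A^*z,w\rangle=\langle Az,w\rangle$. This shows that $A=A^*$, that is, $A:\mathbb{C}^N\to\mathbb{C}^N$ is self-adjoint.

    Now, assume that $A$ is a self-adjoint operator on $\mathbb{C}^N$ and $b=c$. A direct calculation shows that (2.5) holds. Then $W_{u,\varphi}$ is a self-adjoint operator on $\mathcal{F}^2$. The proof is complete.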
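    The "direct calculation" in the last step can be spot-checked numerically. The sketch below is an illustration and not the author's code: it evaluates both sides of (2.5) in $\mathbb{C}^2$ for a sample Hermitian matrix with $b=c$, and again for a non-Hermitian matrix, where the two sides are expected to differ. All matrices, vectors and helper names are arbitrary assumptions.

```python
import cmath

def inner(z, w):
    """<z, w> = sum_j z_j * conj(w_j)."""
    return sum(zj * wj.conjugate() for zj, wj in zip(z, w))

def apply(A, z):
    """Matrix-vector product A z for A given as a list of rows."""
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]

def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]

def adjoint_side(A, b, c, z, w):
    """W* K_w (z) as in (2.3)."""
    return cmath.exp(inner(c, w) / 2) * cmath.exp(inner(z, add(apply(A, w), b)) / 2)

def direct_side(A, b, c, z, w):
    """W K_w (z) as in (2.4)."""
    return cmath.exp(inner(z, c) / 2) * cmath.exp(inner(add(apply(A, z), b), w) / 2)

z = [1 + 0.5j, -0.2 + 1j]
w = [0.3 - 1j, 0.7 + 0.4j]
b = c = [0.1 + 0.2j, -0.3j]

A_hermitian = [[0.5 + 0j, 0.2 + 0.1j], [0.2 - 0.1j, -0.3 + 0j]]
A_other = [[0.5 + 0j, 0.2 + 0.1j], [0j, -0.3 + 0j]]

gap_hermitian = abs(adjoint_side(A_hermitian, b, c, z, w) - direct_side(A_hermitian, b, c, z, w))
gap_other = abs(adjoint_side(A_other, b, c, z, w) - direct_side(A_other, b, c, z, w))
```

    A single pair $(z,w)$ only illustrates the identity; it does not replace the proof, which requires (2.5) for all $z,w\in\mathbb{C}^N$.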

    In [14], Zhao proved that the operator $W_{u,\varphi}$ on $\mathcal{F}^2$ is unitary if and only if there exist a unitary operator $A:\mathbb{C}^N\to\mathbb{C}^N$, a vector $b\in\mathbb{C}^N$, and a constant $\alpha$ with $|\alpha|=1$ such that $\varphi(z)=Az-b$ and $u(z)=\alpha K_{A^{-1}b}(z)$. Without loss of generality, here we characterize the self-adjoint unitary operator $W_{u,\varphi}$ on $\mathcal{F}^2$ for the case $\alpha=1$ and obtain the following result from Theorem 2.2.

    Corollary 2.1. Let $A:\mathbb{C}^N\to\mathbb{C}^N$ be a unitary operator and $b\in\mathbb{C}^N$ such that $\varphi(z)=Az-b$ and $u(z)=K_{A^{-1}b}(z)$. Then the operator $W_{u,\varphi}$ is self-adjoint on $\mathcal{F}^2$ if and only if $A:\mathbb{C}^N\to\mathbb{C}^N$ is self-adjoint and $Ab+b=0$.

    First, we recall the definition of hyponormal operators. An operator $T$ on a Hilbert space $H$ is said to be hyponormal if $\|Tx\|\geq\|T^*x\|$ for all vectors $x\in H$. $T$ is called co-hyponormal if $T^*$ is hyponormal. In 1950, Halmos, in his attempt to solve the invariant subspace problem, extended the notion of normal operators to two new classes, one of which is now known as the hyponormal operator (see [9]). Clearly, every normal operator is hyponormal. From the proof in [6], it follows that $T$ is hyponormal if and only if there exists a linear operator $C$ with $\|C\|\leq 1$ such that $T^*=CT$. This result is a useful tool for characterizing the hyponormality of some operators. For example, Sadraoui in [12] used this result to characterize the hyponormality of composition operators defined by linear fractional symbols on Hardy space. On the other hand, some scholars studied the hyponormality of composition operators on Hardy space by using the fact that the operator $C_\varphi$ on Hardy space is hyponormal if and only if

    $$\|C_\varphi f\|^2\geq\|C_\varphi^*f\|^2$$

    for all $f$ in Hardy space. For example, Dennis in [5] used this fact to study the hyponormality of composition operators on Hardy space. In particular, this inequality for norms is used when $f$ is a reproducing kernel function $K_w$ for any $w\in\mathbb{C}^N$. To the best of our knowledge, there are few studies on the hyponormality of weighted composition operators. Here, we consider this property of weighted composition operators on Fock space.

    First, we have the following result, which can be proved by using the reproducing kernel functions.

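    Lemma 3.1. Let $w\in\mathbb{C}^N$ and the operator $W_{u,\varphi}$ be bounded on $\mathcal{F}^2$. Then

    $$\|W_{u,\varphi}K_w\|^2=W_{u,\varphi}^*W_{u,\varphi}K_w(w).$$

    Proof. From the inner product, we have

    $$\|W_{u,\varphi}K_w\|^2=\langle W_{u,\varphi}K_w,W_{u,\varphi}K_w\rangle=\langle W_{u,\varphi}^*W_{u,\varphi}K_w,K_w\rangle=W_{u,\varphi}^*W_{u,\varphi}K_w(w).$$

    The proof is complete.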

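    Theorem 3.1. Let $A:\mathbb{C}^N\to\mathbb{C}^N$ be a linear operator, $\varphi(z)=Az+b$, $u=k_c$, and the operator $W_{u,\varphi}$ be bounded on $\mathcal{F}^2$. If the operator $W_{u,\varphi}$ is hyponormal on $\mathcal{F}^2$, then $A^*b-b=Ac-c$ and $|b|\leq|c|$.

    Proof. From a direct calculation, we have

    $$W_{u,\varphi}K_w(z)=u(z)K_w(\varphi(z))=k_c(z)K_w(Az+b)=e^{\frac{\langle z,c\rangle}{2}-\frac{|c|^2}{4}}e^{\frac{\langle Az+b,w\rangle}{2}}=e^{\frac{\langle z,A^*w+c\rangle+\langle b,w\rangle}{2}-\frac{|c|^2}{4}}=e^{\frac{\langle b,w\rangle}{2}-\frac{|c|^2}{4}}K_{A^*w+c}(z).\tag{3.1}$$

    From (3.1), it follows that

    $$W_{u,\varphi}^*W_{u,\varphi}K_w(z)=e^{\frac{\langle b,w\rangle}{2}-\frac{|c|^2}{4}}W_{u,\varphi}^*K_{A^*w+c}(z)=e^{\frac{\langle b,w\rangle}{2}-\frac{|c|^2}{4}}\overline{u(A^*w+c)}K_{\varphi(A^*w+c)}(z)=e^{\frac{\langle b,w\rangle}{2}+\frac{\langle c,A^*w+c\rangle}{2}+\frac{\langle z,AA^*w+Ac+b\rangle}{2}-\frac{|c|^2}{2}}=e^{\frac{\langle b+Ac,w\rangle}{2}+\frac{\langle z,AA^*w\rangle}{2}+\frac{\langle z,Ac+b\rangle}{2}}.\tag{3.2}$$

    On the other hand, we also have

    $$W_{u,\varphi}W_{u,\varphi}^*K_w(z)=\overline{u(w)}W_{u,\varphi}K_{\varphi(w)}(z)=\overline{u(w)}u(z)K_{\varphi(w)}(\varphi(z))=e^{\frac{\langle c,w\rangle}{2}+\frac{\langle z,c\rangle}{2}+\frac{\langle Az+b,Aw+b\rangle}{2}-\frac{|c|^2}{2}}=e^{\frac{\langle c+A^*b,w\rangle}{2}+\frac{|b|^2}{2}+\frac{\langle z,A^*Aw\rangle}{2}+\frac{\langle z,c+A^*b\rangle}{2}-\frac{|c|^2}{2}}.\tag{3.3}$$

    From Lemma 3.1, (3.2), and (3.3), it follows that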

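    $$\|W_{u,\varphi}^*K_w\|^2=W_{u,\varphi}W_{u,\varphi}^*K_w(w)=e^{\frac{\langle c+A^*b,w\rangle}{2}+\frac{|b|^2}{2}+\frac{|Aw|^2}{2}+\frac{\langle w,c+A^*b\rangle}{2}-\frac{|c|^2}{2}}$$

    and

    $$\|W_{u,\varphi}K_w\|^2=W_{u,\varphi}^*W_{u,\varphi}K_w(w)=e^{\frac{\langle b+Ac,w\rangle}{2}+\frac{|Aw|^2}{2}+\frac{\langle w,Ac+b\rangle}{2}}.$$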

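    Then, we have

    $$\|W_{u,\varphi}^*K_w\|^2-\|W_{u,\varphi}K_w\|^2=e^{\frac{|Aw|^2}{2}}\Big(e^{\frac{\langle c+A^*b,w\rangle}{2}+\frac{|b|^2}{2}+\frac{\langle w,c+A^*b\rangle}{2}-\frac{|c|^2}{2}}-e^{\frac{\langle b+Ac,w\rangle}{2}+\frac{\langle w,Ac+b\rangle}{2}}\Big),$$

    which shows that

    $$\|W_{u,\varphi}^*K_w\|^2-\|W_{u,\varphi}K_w\|^2\leq 0$$

    for all $w\in\mathbb{C}^N$ if and only if

    $$e^{\frac{\langle c+A^*b,w\rangle}{2}+\frac{|b|^2}{2}+\frac{\langle w,c+A^*b\rangle}{2}-\frac{|c|^2}{2}}\leq e^{\frac{\langle b+Ac,w\rangle}{2}+\frac{\langle w,Ac+b\rangle}{2}}.\tag{3.4}$$

    It is clear that (3.4) holds if and only if

    $$\langle c+A^*b,w\rangle+|b|^2+\langle w,c+A^*b\rangle-|c|^2\leq\langle b+Ac,w\rangle+\langle w,Ac+b\rangle.\tag{3.5}$$

    From (3.5), we see that (3.4) holds if and only if

    $$\langle A^*b-Ac+c-b,w\rangle+\langle w,A^*b-Ac+c-b\rangle+|b|^2-|c|^2\leq 0.\tag{3.6}$$

    Therefore, we deduce that (3.4) holds for all $w\in\mathbb{C}^N$ if and only if $|b|\leq|c|$ and $A^*b-b=Ac-c$. The proof is complete.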

    If $b=c=0$ in Theorem 3.1, then $W_{u,\varphi}$ reduces to the composition operator $C_{Az}$. For this case, Theorem 3.1 does not provide any useful information on the operator $A:\mathbb{C}^N\to\mathbb{C}^N$ when $C_{Az}$ is hyponormal on $\mathcal{F}^2$. However, we have the following result, which completely characterizes the hyponormal composition operators:

    Theorem 3.2. Let $A:\mathbb{C}^N\to\mathbb{C}^N$ be a linear operator such that $C_{Az}$ is bounded on $\mathcal{F}^2$. Then the operator $C_{Az}$ is hyponormal on $\mathcal{F}^2$ if and only if $A:\mathbb{C}^N\to\mathbb{C}^N$ is co-hyponormal.

    Proof. Assume that $A:\mathbb{C}^N\to\mathbb{C}^N$ is co-hyponormal. Then there exists an operator $B:\mathbb{C}^N\to\mathbb{C}^N$ with $\|B\|\leq 1$ such that $A=BA^*$. We therefore have

    $$C_{Az}^*=C_{A^*z}=C_{AB^*z}=C_{B^*z}C_{Az}.$$

    Next, we want to show that $\|C_{B^*z}\|=1$. By Theorem 4 in [3], we have

    $$\|C_{B^*z}\|=e^{\frac{1}{4}(|w_0|^2-|B^*w_0|^2)},\tag{3.7}$$

    where $w_0$ is any solution to $(I-BB^*)w=0$. From this, we obtain that $w_0=BB^*w_0$, and then

    $$|B^*w_0|^2=\langle B^*w_0,B^*w_0\rangle=\langle w_0,BB^*w_0\rangle=\langle w_0,w_0\rangle=|w_0|^2.\tag{3.8}$$

    Thus, by considering (3.7) and (3.8), we see that $\|C_{B^*z}\|=1$. It follows that the operator $C_{Az}$ is hyponormal on $\mathcal{F}^2$.

    Now, assume that the operator $C_{Az}$ is hyponormal on $\mathcal{F}^2$. Then there exists a linear operator $C$ on $\mathcal{F}^2$ with $\|C\|\leq 1$ such that $C_{Az}^*=CC_{Az}$. By Lemma 2 in [3], we have $C_{Az}^*=C_{A^*z}$. This shows that $CC_{Az}$ is a composition operator, and hence that there exists a holomorphic mapping $\varphi:\mathbb{C}^N\to\mathbb{C}^N$ such that $C=C_\varphi$. So $A^*z=A(\varphi(z))$ for all $z\in\mathbb{C}^N$, which implies that there exists a linear operator $B:\mathbb{C}^N\to\mathbb{C}^N$ such that $\varphi(z)=B^*z$, and then $C=C_{B^*z}$. Therefore, $A^*=AB^*$, that is, $A=BA^*$. Since $\|C\|\leq 1$, the operator $C=C_{B^*z}$ is bounded on $\mathcal{F}^2$. From Lemma 2.3 in [15], we obtain that $\|B^*\|\leq 1$, which also shows that $\|B\|\leq 1$ since $\|B^*\|=\|B\|$. This proves that $A:\mathbb{C}^N\to\mathbb{C}^N$ is co-hyponormal. The proof is complete.
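    As a consistency check of Theorem 3.2 in the simplest setting (a worked special case added for illustration, not taken from the source), take $N=1$ and $A=a$ with $|a|\leq 1$. Assuming the standard fact that the monomials $e_n(z)=z^n/\sqrt{2^n n!}$ form an orthonormal basis of $\mathcal{F}^2(\mathbb{C})$, and writing $f=\sum_n f_ne_n$, one has

    $$C_{az}e_n=a^ne_n,\qquad C_{az}^*e_n=C_{\bar{a}z}e_n=\bar{a}^ne_n,\qquad \|C_{az}f\|^2=\sum_{n\geq 0}|a|^{2n}|f_n|^2=\|C_{az}^*f\|^2.$$

    Hence $C_{az}$ is normal, and in particular hyponormal. On the other side, the scalar operator $\zeta\mapsto a\zeta$ satisfies $|a\zeta|=|\bar{a}\zeta|$ and is therefore co-hyponormal, in agreement with the theorem.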

    Remark 3.1. In this paper, we only obtain a necessary condition for the hyponormality of weighted composition operators on Fock space. We hope that readers will continue to study this problem on Fock space.

    In this paper, I give a proper description of the adjoint $W_{u,\varphi}^*$ on Fock space for the special symbol functions $u(z)=K_c(z)$ and $\varphi(z)=Az+b$. However, it is difficult to give a proper description for general symbols. On the other hand, I consider the hyponormal weighted composition operators on Fock space and completely characterize the hyponormal composition operators on this space. I hope that readers will find the research in this paper of interest.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This study was supported by Sichuan Science and Technology Program (2024NSFSC0416).

    The author declares that he has no competing interests.



    [1] J. B. Du, W. J. Cheng, G. Y. Lu, H. T. Gao, X. L. Chu, Z. C. Zhang, et al., Resource pricing and allocation in MEC enabled blockchain systems: An A3C deep reinforcement learning approach, IEEE Trans. Network Sci. Eng., 9 (2022), 33–44. https://10.1109/TNSE.2021.3068340 doi: 10.1109/TNSE.2021.3068340
    [2] H. X. Peng, X. M. Shen, Deep reinforcement learning based resource management for multi-access edge computing in vehicular networks, IEEE Trans. Network Sci. Eng., 7 (2021), 2416–2428. https://10.1109/TNSE.2020.2978856 doi: 10.1109/TNSE.2020.2978856
    [3] D. C. Chen, X. L. Liu, W. W. Yu, Finite-time fuzzy adaptive consensus for heterogeneous nonlinear multi-agent systems, IEEE Trans. Network Sci. Eng., 7 (2021), 3057–3066. https://10.1109/TNSE.2020.3013528 doi: 10.1109/TNSE.2020.3013528
    [4] J. Wang, Q. Wang, H. Wu, T. Huang, Finite-time consensus and finite-time H∞ consensus of multi-agent systems under directed topology, IEEE Trans. Network Sci. Eng., 7 (2020), 1619–1632. https://10.1109/TNSE.2019.2943023 doi: 10.1109/TNSE.2019.2943023
    [5] T. Gao, T. Li, Y. J. Liu, S. Tong, IBLF-based adaptive neural control of state-constrained uncertain stochastic nonlinear systems, IEEE Trans. Neural Networks Learn. Syst., 33 (2022), 7345–7356. https://10.1109/TNNLS.2021.3084820 doi: 10.1109/TNNLS.2021.3084820
    [6] T. T. Gao, Y. J. Liu, D. P. Li, S. C. Tong, T. S. Li, Adaptive neural control using tangent time-varying BLFs for a class of uncertain stochastic nonlinear systems with full state constraints, IEEE Trans. Cybern., 51 (2021), 1943–1953. https://10.1109/TCYB.2019.2906118 doi: 10.1109/TCYB.2019.2906118
    [7] P. J. Werbos, Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences, Ph. D. dissertation, Harvard University, 1974.
    [8] Y. Tang, D. D. Zhang, P. Shi, W. B. Zhang, F. Qian, Event-based formation control for nonlinear multiagent systems under DOS attacks, IEEE Trans. Autom. Control., 66 (2021) 452–459. https://10.1109/TAC.2020.2979936 doi: 10.1109/TAC.2020.2979936
    [9] Y. Tang, X. T. Wu, P. Shi, F. Qian, Input-to-state stability for nonlinear systems with stochastic impulses, Automatica, 113 (2020), 0005–1098. https://doi.org/10.1016/j.automatica.2019.108766 doi: 10.1016/j.automatica.2019.108766
    [10] X. T. Wu, Y. Tang, J. D. Cao, X. R. Mao, Stability analysis for continuous-time switched systems with stochastic switching signals, IEEE Trans. Autom. Control., 63 (2018), 3083–3090. https://10.1109/TAC.2017.2779882 doi: 10.1109/TAC.2017.2779882
    [11] B. Kiumarsi, K. G. Vamvoudakis, H. Modares, F. L. Lewis, Optimal and autonomous control using reinforcement learning: A survey, IEEE Trans. Neural Networks Learn. Syst., 29 (2018), 2042–2062. https://10.1109/TNNLS.2017.2773458 doi: 10.1109/TNNLS.2017.2773458
    [12] V. Narayanan, S. Jagannathan, Event-triggered distributed control of nonlinear interconnected systems using online reinforcement learning with exploration, IEEE Trans. Cybern., 48 (2018), 2510–2519. https://10.1109/TCYB.2017.2741342 doi: 10.1109/TCYB.2017.2741342
    [13] B. Luo, H. N. Wu, T. Huang, Off-policy reinforcement learning for H∞ control design, IEEE Trans. Cybern., 45 (2015), 65–76. https://10.1109/TCYB.2014.2319577 doi: 10.1109/TCYB.2014.2319577
    [14] R. Song, F. L. Lewis, Q. Wei, Off-policy integral reinforcement learning method to solve nonlinear continuous-time multiplayer nonzero sum games, IEEE Trans. Neural Networks Learn. Syst., 28 (2017), 704–713. https://10.1109/TNNLS.2016.2582849 doi: 10.1109/TNNLS.2016.2582849
    [15] X. Yang, D. Liu, B. Luo, C. Li, Data-based robust adaptive control for a class of unknown nonlinear constrained-input systems via integral reinforcement learning, Inf. Sci., 369 (2016), 731–747. https://doi.org/10.1016/j.ins.2016.07.051 doi: 10.1016/j.ins.2016.07.051
    [16] H. Zhang, K. Zhang, Y. Cai, J. Han, Adaptive fuzzy fault-tolerant tracking control for partially unknown systems with actuator faults via integral reinforcement learning method, IEEE Trans. Fuzzy Syst., 27 (2019), 1986–1998. https://10.1109/TFUZZ.2019.2893211 doi: 10.1109/TFUZZ.2019.2893211
    [17] W. Bai, Q. Zhou, T. Li, H. Li, Adaptive reinforcement learning neural network control for uncertain nonlinear system with input saturation, IEEE Trans. Cybern., 50 (2020), 3433–3443. https://10.1109/TCYB.2019.2921057 doi: 10.1109/TCYB.2019.2921057
    [18] Y. Li, S. Tong, Adaptive neural networks decentralized FTC design for nonstrict-feedback nonlinear interconnected large-scale systems against actuator faults, IEEE Trans. Neural Networks Learn. Syst., 28 (2017), 2541–2554. https://10.1109/TNNLS.2016.2598580 doi: 10.1109/TNNLS.2016.2598580
    [19] Q. Chen, H. Shi, M. Sun, Echo state network-based backstepping adaptive iterative learning control for strict-feedback systems: An error-tracking approach, IEEE Trans. Cybern., 50 (2020), 3009–3022. https://10.1109/TCYB.2019.2931877 doi: 10.1109/TCYB.2019.2931877
    [20] S. Tong, Y. Li, S. Sui, Adaptive fuzzy tracking control design for SISO uncertain nonstrict feedback nonlinear systems, IEEE Trans. Fuzzy Syst., 24 (2016), 1441–1454. https://10.1109/TFUZZ.2016.2540058 doi: 10.1109/TFUZZ.2016.2540058
    [21] W. Bai, T. Li, S. Tong, NN reinforcement learning adaptive control for a class of nonstrict-feedback discrete-time systems, IEEE Trans. Cybern., 50 (2020), 4573–4584. https://10.1109/TCYB.2020.2963849 doi: 10.1109/TCYB.2020.2963849
    [22] Y. Li, K. Sun, S. Tong, Observer-based adaptive fuzzy fault-tolerant optimal control for SISO nonlinear systems, IEEE Trans. Cybern., 49 (2019), 649–661. https://10.1109/TCYB.2017.2785801 doi: 10.1109/TCYB.2017.2785801
    [23] H. Modares, F. L. Lewis, M. B. Naghibi-Sistani, Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems, Automatica, 50 (2014), 193–202. https://doi.org/10.1016/j.automatica.2013.09.043 doi: 10.1016/j.automatica.2013.09.043
    [24] Z. Wang, L. Liu, Y. Wu, H. Zhang, Optimal fault-tolerant control for discrete-time nonlinear strict-feedback systems based on adaptive critic design, IEEE Trans. Neural Networks Learn. Syst., 29 (2018), 2179–2191. https://10.1109/TNNLS.2018.2810138 doi: 10.1109/TNNLS.2018.2810138
    [25] H. Li, Y. Wu, M. Chen, Adaptive fault-tolerant tracking control for discrete-time multiagent systems via reinforcement learning algorithm, IEEE Trans. Cybern., 51 (2021), 1163–1174. https://10.1109/TCYB.2020.2982168 doi: 10.1109/TCYB.2020.2982168
    [26] Y. J. Liu, L. Tang, S. Tong, C. L. P. Chen, D. J. Li, Reinforcement learning design-based adaptive tracking control with less learning parameters for nonlinear discrete-time MIMO systems, IEEE Trans. Neural Networks Learn. Syst., 26 (2015), 165–176. https://10.1109/TNNLS.2014.2360724 doi: 10.1109/TNNLS.2014.2360724
    [27] W. Bai, T. Li, Y. Long, C. L. P. Chen, Event-triggered multigradient recursive reinforcement learning tracking control for multiagent systems, IEEE Trans. Neural Networks Learn. Syst., 34 (2023), 355–379. https://10.1109/TNNLS.2021.3094901 doi: 10.1109/TNNLS.2021.3094901
    [28] H. Wang, G. H. Yang, A finite frequency domain approach to fault detection for linear discrete-time systems, Int. J. Control., 81 (2008), 1162–1171. https://doi.org/10.1080/00207170701691513 doi: 10.1080/00207170701691513
    [29] C. Tan, G. Tao, R. Qi, A discrete-time parameter estimation based adaptive actuator failure compensation control scheme, Int. J. Control., 86 (2013), 276–289. https://doi.org/10.1080/00207179.2012.723828 doi: 10.1080/00207179.2012.723828
    [30] J. Na, X. Ren, G. Herrmann, Z. Qiao, Adaptive neural dynamic surface control for servo systems with unknown dead-zone, Control Eng. Pract., 19 (2011), 1328–1343. https://doi.org/10.1016/j.conengprac.2011.07.005 doi: 10.1016/j.conengprac.2011.07.005
    [31] Y. J. Liu, S. Li, S. Tong, C. L. P. Chen, Adaptive reinforcement learning control based on neural approximation for nonlinear discrete-time systems with unknown nonaffine dead-zone input, IEEE Trans. Neural Networks Learn. Syst., 30 (2019), 295–305. https://10.1109/TNNLS.2018.2844165 doi: 10.1109/TNNLS.2018.2844165
    [32] S. S. Ge, J. Zhang, T. H. Lee, Adaptive neural network control for a class of MIMO nonlinear systems with disturbances in discrete time, IEEE Trans. Syst., Man Cybern. B, Cybern., 34 (2004), 1630–1645. https://10.1109/TSMCB.2004.826827 doi: 10.1109/TSMCB.2004.826827
    [33] Y. J. Liu, Y. Gao, S. Tong, Y. Li, Fuzzy approximation-based adaptive backstepping optimal control for a class of nonlinear discrete time systems with dead-zone, IEEE Trans. Fuzzy Syst., 24 (2016), 16–28. https://10.1109/TFUZZ.2015.2418000 doi: 10.1109/TFUZZ.2015.2418000
    [34] S. S. Ge, G. Y. Li, T. H. Lee, Adaptive NN control for a class of strict-feedback discrete-time nonlinear systems, Automatica, 39 (2003), 807–819. https://doi.org/10.1016/S0005-1098(03)00032-3 doi: 10.1016/S0005-1098(03)00032-3
    [35] Q. Yang, S. Jagannathan, Reinforcement learning controller design for affine nonlinear discrete-time systems using online approximators, IEEE Trans. Syst., Man, Cybern. B, Cybern., 42 (2012), 377–390. https://10.1109/TSMCB.2011.2166384 doi: 10.1109/TSMCB.2011.2166384
    [36] Y. J. Liu, Y. Gao, S. Tong, Y. Li, Fuzzy approximation-based adaptive backstepping optimal control for a class of nonlinear discrete time systems with dead-zone, IEEE Trans. Fuzzy Syst., 24 (2016), 16–28. https://10.1109/TFUZZ.2015.2418000 doi: 10.1109/TFUZZ.2015.2418000
    [37] S. Ferrari, J. E. Steck, R. Chandramohan, Adaptive feedback control by constrained approximate dynamic programming, IEEE Trans. Syst., Man, Cybern. B, Cybern., 38 (2008), 982–987. https://10.1109/TSMCB.2008.924140 doi: 10.1109/TSMCB.2008.924140
    [38] S. Tong, Y. Li, S. Sui, Adaptive fuzzy tracking control design for SISO uncertain nonstrict feedback nonlinear systems, IEEE Trans. Fuzzy Syst., 24 (2016), 1441–1454. https://10.1109/TFUZZ.2016.2540058 doi: 10.1109/TFUZZ.2016.2540058
  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)