Research article

Model-free optimal consensus control for multi-agent systems via DDPG-based event-triggered adaptive dynamic programming method

  • Published: 05 March 2026
  • This paper addresses the design of distributed optimal cooperative controllers and a reinforcement learning (RL)-based event-triggered mechanism for multi-agent systems (MASs) with unknown dynamics. By introducing an extra compensator, an augmented system is constructed to remove the dependence on the system dynamics. Then, to reduce the computational burden, we use an event-triggered mechanism based on RL and neural networks (NNs) to implement the adaptive dynamic programming (ADP) algorithm. Additionally, we account for the trade-off between computational burden and consensus performance by introducing a weighting factor into the reward design for the MASs. With this reward design, we present an algorithm based on the deep deterministic policy gradient (DDPG) algorithm that learns the event-triggered condition for the MASs and balances these two objectives (see the illustrative sketch below). The event-triggered mechanism can also accommodate constraints such as time limits or computational resource restrictions, achieving consensus control without violating them. We prove the absence of Zeno behavior and the uniform ultimate boundedness (UUB) of both the local consensus error and the weight estimation error. Finally, simulation results illustrate the effectiveness of the control algorithm and the weighting factor.
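
    A minimal sketch of the reward design described above, assuming a simple threshold-style trigger; the function names, the weighting factor `lam`, and the exact form of the trade-off are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def should_trigger(error, last_error, threshold):
    """Event-triggered condition: update the controller only when the
    gap between the current local consensus error and the error at the
    last triggering instant exceeds a (learned) threshold."""
    return np.linalg.norm(error - last_error) > threshold

def trigger_reward(error, triggered, lam):
    """Reward trading consensus accuracy against computational burden:
    a larger weighting factor `lam` penalizes controller updates more,
    so the learned trigger policy fires less often."""
    tracking_cost = float(error @ error)  # squared consensus error
    return -(tracking_cost + lam * float(triggered))

# Example: the gap (~0.316) exceeds the threshold, so the controller
# updates and the reward pays both the tracking and triggering costs.
e, e_last = np.array([0.3, -0.1]), np.zeros(2)
fire = should_trigger(e, e_last, threshold=0.2)   # True
r = trigger_reward(e, fire, lam=0.5)              # -(0.10 + 0.5) = -0.6
```

    In this hypothetical setup, the DDPG actor would output the trigger threshold (or the trigger decision itself), and `lam` is the knob the abstract describes for balancing consensus control against computational resources.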

    Citation: Pengfei Zhu, Xiaolin Wang, Fangfei Li, Siyu Qian, Haitao Li. Model-free optimal consensus control for multi-agent systems via DDPG-based event-triggered adaptive dynamic programming method[J]. Mathematical Modelling and Control, 2026, 6(1): 97-110. doi: 10.3934/mmc.2026008



  • © 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
