Due to the complexity of the driving environment and the dynamics of the behavior of traffic participants, self-driving in dense traffic flow is very challenging. Traditional methods usually rely on predefined rules, which are difficult to adapt to various driving scenarios. Deep reinforcement learning (DRL) shows advantages over rule-based methods in complex self-driving environments, demonstrating the great potential of intelligent decision-making. However, one of the problems of DRL is the inefficiency of exploration; typically, it requires a lot of trial and error to learn the optimal policy, which leads to its slow learning rate and makes it difficult for the agent to learn well-performing decision-making policies in self-driving scenarios. Inspired by the outstanding performance of supervised learning in classification tasks, we propose a self-driving intelligent control method that combines human driving experience and adaptive sampling supervised actor-critic algorithm. Unlike traditional DRL, we modified the learning process of the policy network by combining supervised learning and DRL and adding human driving experience to the learning samples to better guide the self-driving vehicle to learn the optimal policy through human driving experience and real-time human guidance. In addition, in order to make the agent learn more efficiently, we introduced real-time human guidance in its learning process, and an adaptive balanced sampling method was designed for improving the sampling performance. We also designed the reward function in detail for different evaluation indexes such as traffic efficiency, which further guides the agent to learn the self-driving intelligent control policy in a better way. The experimental results show that the method is able to control vehicles in complex traffic environments for self-driving tasks and exhibits better performance than other DRL methods.
Citation: Jin Zhang, Nan Ma, Zhixuan Wu, Cheng Wang, Yongqiang Yao. Intelligent control of self-driving vehicles based on adaptive sampling supervised actor-critic and human driving experience[J]. Mathematical Biosciences and Engineering, 2024, 21(5): 6077-6096. doi: 10.3934/mbe.2024267
DNA segregation is a complicated process that is critical for cell proliferation and survival [16,30]. Failures during segregation can result in aberrant DNA contents (aneuploidy), a phenomenon which is prevalent in human cancers [24,28]. The fidelity of chromosome segregation during cell division is monitored by control mechanisms called checkpoints, which ensure that particular criteria are met before moving on irreversibly to the next phase [30].
In mitosis, the Spindle Assembly Checkpoint (SAC; [33]) ensures that all chromosomes are properly attached to spindle microtubules via their kinetochores. Even a single unattached or misattached chromosome is sufficient to keep the checkpoint active [32,31]. (A human mitotic cell has 46 chromosomes and 92 kinetochores.) In budding yeast (Saccharomyces cerevisiae) and in Drosophila males, an additional checkpoint ensures that the correct DNA is placed into the right cell during asymmetric cell division. This regulation is known as the Spindle Position Checkpoint (SPOC); it delays mitotic progression until the spindle is correctly aligned along the polarity axis [2].
SAC and SPOC share prominent similarities, although they constitute different mitotic checkpoints. Both broadcast a 'wait' signal to the environment, and both rely on turnover of the inhibitor and activator at an organelle (the kinetochore for SAC and the spindle pole body for SPOC). The SAC integrates signaling information about the attachment of the individual kinetochores, each of which broadcasts a 'wait' signal until a correct attachment is established. Many core SAC components, such as Mad2, are recruited to unattached kinetochores, where they broadcast a 'wait' signal through the nucleoplasm and inhibit Cdc20, the APC/C activator. Upon kinetochore-microtubule attachment, these components are rapidly removed from the kinetochores and APC/C:Cdc20 formation (SAC silencing) is turned on (see Fig. 1A). Similarly, the central SPOC components Bfa1:Bub2 are localized and regulated at the spindle pole bodies (SPBs), from which they are broadcast throughout the cytosol and inhibit the downstream pathway, Tem1. Signaling from the SPBs is shut down once the spindle is correctly aligned with the polarity axis (see Fig. 1B).
The checkpoint mechanisms, SAC and SPOC, are hard to observe experimentally in living cells, due to a number of technical challenges. For example, even a small number of components can have various localizations and states upon which the interactions depend. Another issue is that the average diameter of many proteins at the kinetochore is about 40 Å, making connections between them invisible to current microscopy techniques. Likewise, the activity of the SPOC protein Tem1 (whether it is GTP- or GDP-bound) and the phosphorylation of Bfa1 by various kinases (or lack thereof) are not observable in the wet lab. These limitations can be addressed by employing mathematical models and numerical simulations, which can improve our understanding of the mechanics of cell division. However, mathematical methods can be hindered by a combinatorial explosion in the number of intermediate components (complexes) and explicit representations. Also, the different components often interact nonlinearly in time and space; in the presence of various feedback loops, these interactions lead to phenomena that are difficult to predict [10,0,25,11,38]. A combination of experimental work and rigorous mathematical models was central to excluding some hypothesized SAC architectures and to elucidating how the elaborate SPOC system functions (e.g. [4]). All mathematical models in the literature study either human SAC or yeast SPOC activation, at a detailed molecular level or as abstract models that distinguish between different pathways [7,34,14,13,15,21,26,19,9,0,18,23,27,29]. However, none of these models addresses SAC or SPOC silencing. Moreover, no rigorous mathematical analysis of properties such as bifurcation, stability, or the existence of feedback loops has been performed for either SAC or SPOC to date.
The research groups of Novak and Tyson work intensively on cell cycle-related modeling for yeast. Their recent small model consists of five reactions, five species, and two ODEs based on Michaelis-Menten kinetics, with a double-negative and two double-positive feedback loops [39]. The smallest bistable chemical reaction system in the literature contains four reactions and two ODEs based on mass-action kinetics [41]. It has a double-positive and a single negative feedback loop. Unfortunately, no regeneration of the reactants is possible in this model.
The purpose of this paper is fourfold. First, a minimal bistable SAC model for activation and silencing is constructed. The model is based on mass-action kinetics and comprises four reactions, a double-negative and two double-positive feedback loops, and two ODEs. It is structurally distinct from the smallest known biochemical model [41] and structurally comparable to the yeast mitotic model [39,17]. Second, the same model structure is applied to the SPOC, with both SPBs included. Third, a one-parameter bifurcation diagram is computed for these models in order to demonstrate the realistic biochemical switches. Finally, numerical simulations are carried out for the system as ordinary differential equations (ODEs) and as partial differential equations (PDEs; reaction-diffusion systems) with various parameters.
Initial amount |
APC/C | 0.09 | [37] |
MCC | 0.15 | [15] |
Tem1 | 0.06 | [4] |
Bfa1 | 0.04 | [4] |
Bub2 | 0.04 | [4] |
Diffusion constants |
MCC | 1-20 | This study |
APC/C | 1.8 | [40] |
Cdc20 | 19.5 | [40] |
Mad2 | 5 | [13] |
Environment |
Radius of the kinetochore | 0.1 | 0.01 | [5] |
Radius of the cell | 10 | 4 | [21] |
Rate constants |
kinetochores or SPBs | 0-92 | 0-2 | [16] |
| | This study |
| | This study |
| | This study |
| | This study |

The reaction rules governing the Spindle Assembly Checkpoint (SAC) system are (cf. Fig. 2A):

$$\mathrm{Cdc20{:}C\text{-}Mad2} \xrightarrow{\;k_1,\ \text{Unattached Kin}\;} \mathrm{MCC} \tag{1}$$

$$\mathrm{MCC} + \mathrm{APC/C} \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} \mathrm{MCC{:}APC/C} \tag{2}$$

$$\mathrm{MCC{:}APC/C} \xrightarrow{\;k_3,\ \text{Attached Kin},\ \mathrm{APC/C}\;} \mathrm{Cdc20{:}C\text{-}Mad2} + \mathrm{APC/C} \tag{3}$$
The Spindle Position Checkpoint (SPOC) mechanism has very similar reaction rules (see below), but differs significantly in the initial concentrations, the rate constants, and the signal, which comes from exactly two SPBs (cf. Fig. 4A):
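$$\mathrm{Bub2} \xrightarrow{\;k_1,\ \text{misaligned SPB}\;} \mathrm{Bfa1{:}Bub2} \tag{4}$$

$$\mathrm{Bfa1{:}Bub2} + \mathrm{Tem1} \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} \mathrm{Bfa1{:}Bub2{:}Tem1} \tag{5}$$

$$\mathrm{Bfa1{:}Bub2{:}Tem1} \xrightarrow{\;k_3,\ \text{aligned SPB},\ \mathrm{Tem1}\;} \mathrm{Bub2} + \mathrm{Tem1} \tag{6}$$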
By applying the law of mass-action kinetics, the reaction rules (Eqs. (1)-(3) and Eqs. (4)-(6)) can be translated into sets of time-dependent nonlinear ordinary differential equations (ODEs). The translation is done by computing
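The translation from reaction rules to mass-action ODEs can be sketched in Python for the SAC rules in Eqs. (1)-(3). This is a minimal illustration, not the authors' implementation: the stoichiometric-matrix layout, the function names, and the rate-constant values in the usage lines are assumptions made for the example. The attached and unattached kinetochore counts enter as multiplicative signals on the rates of Eqs. (1) and (3).

```python
import numpy as np

# Species order used throughout this sketch.
SPECIES = ["Cdc20:C-Mad2", "MCC", "APC/C", "MCC:APC/C"]

# Stoichiometric matrix: rows follow SPECIES,
# columns are Eq. (1), Eq. (2) forward, Eq. (2) reverse, Eq. (3).
STOICH = np.array([
    [-1,  0,  0,  1],   # Cdc20:C-Mad2
    [ 1, -1,  1,  0],   # MCC
    [ 0, -1,  1,  1],   # APC/C
    [ 0,  1, -1, -1],   # MCC:APC/C
], dtype=float)


def reaction_rates(conc, k1, k2, k_minus2, k3, unattached, attached):
    """Mass-action rate of each reaction for the given concentrations."""
    c, m, a, x = conc
    return np.array([
        k1 * unattached * c,     # Eq. (1): driven by unattached kinetochores
        k2 * m * a,              # Eq. (2), forward binding
        k_minus2 * x,            # Eq. (2), reverse dissociation
        k3 * attached * a * x,   # Eq. (3): driven by attached kinetochores and APC/C
    ])


def sac_rhs(conc, **params):
    """Right-hand side d[conc]/dt = STOICH @ rates."""
    return STOICH @ reaction_rates(np.asarray(conc, dtype=float), **params)


# Usage with hypothetical rate constants (placeholders, not values from the paper).
params = dict(k1=0.1, k2=10.0, k_minus2=0.01, k3=1.0, unattached=2, attached=90)
derivatives = sac_rhs([0.02, 0.15, 0.09, 0.03], **params)
```

Because each column of the matrix sums to zero over the species that contain Cdc20:C-Mad2, and likewise over those that contain APC/C, the two conservation relations used later to reduce the model hold by construction.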
Adding a diffusion term as a second spatial derivative transforms the system into one of coupled partial differential equations (PDEs), which is known as a reaction-diffusion system and has the following general form:
$$\frac{\partial [C_i]}{\partial t} = \underbrace{D_i \nabla^2 [C_i]}_{\text{Diffusion}} + \underbrace{R_j(\{[C_i]\}; P)}_{\text{Reaction}}, \tag{7}$$
where
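$$\frac{\partial [C_i]}{\partial t} = \underbrace{\frac{D_i}{r^2}\frac{\partial}{\partial r}\left(r^2 \frac{\partial [C_i]}{\partial r}\right)}_{\text{Diffusion}} + \underbrace{R_j(\{[C_i]\}; P)}_{\text{Reaction}}. \tag{8}$$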
The systems of ODEs were implemented in the freely available software package XPPAUT [8] and integrated using the Rosenbrock method (a stiff solver). The bifurcation analyses and the related numerical integrations were conducted with AUTO [6] via the XPPAUT interface.
For the spatial simulations, the mitotic cell is considered as a 3-sphere with radius
The reaction-diffusion system of PDEs was solved numerically in MATLAB (MathWorks) using its built-in pdepe solver, which solves systems of parabolic and elliptic PDEs in one space variable and time.
The pdepe solver converts the PDEs to ODEs using a second-order accurate spatial discretization based on a fixed set of user-specified nodes [36]. This is done using a piecewise non-linear Galerkin method (regular case) or an implicit Petrov-Galerkin method (singular case, second-order accurate). The ordinary differential equations resulting from the discretization in space are integrated with the multistep solver ode15s, a variable-order solver based on the numerical differentiation formulas (NDFs), which are related to Gear's method [35]. To check that our results were not influenced by the spatial discretization used in pdepe, we repeated all simulations with 50, 100, and 1000 grid cells.
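The idea of discretizing in space and handing the resulting ODEs to a stiff multistep solver (the method of lines) can be sketched in Python. This is not the Galerkin scheme that pdepe uses: it is a simpler finite-volume discretization of the spherical diffusion operator in Eq. (8) with zero-flux (reflective) boundaries, and SciPy's variable-order BDF method stands in for ode15s. The function names, the grid, and the pure-diffusion usage example are assumptions made for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp


def make_grid(r_inner, r_outer, n_cells):
    """Uniform radial shells between r_inner and r_outer."""
    edges = np.linspace(r_inner, r_outer, n_cells + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    volumes = (edges[1:] ** 3 - edges[:-1] ** 3) / 3.0  # shell volume / (4*pi)
    return edges, centers, volumes


def diffusion_term(c, D, edges, volumes):
    """Finite-volume form of (D / r^2) d/dr (r^2 dc/dr) with zero-flux ends."""
    dr = edges[1] - edges[0]
    flux = np.zeros(edges.size)            # flux[0] = flux[-1] = 0 (Neumann)
    flux[1:-1] = edges[1:-1] ** 2 * np.diff(c) / dr
    return D * np.diff(flux) / volumes


def simulate(c0, D, reaction, edges, volumes, t_end):
    """Integrate dc/dt = diffusion + reaction(c) with a stiff BDF solver."""
    def rhs(t, c):
        return diffusion_term(c, D, edges, volumes) + reaction(c)
    return solve_ivp(rhs, (0.0, t_end), c0, method="BDF", rtol=1e-8, atol=1e-10)


# Usage: pure diffusion of a pulse released next to the inner boundary.
# Radii follow Table 1 (0.1 and 10); D = 5 lies in the range tested for MCC.
edges, centers, volumes = make_grid(0.1, 10.0, 50)
c0 = np.zeros(50)
c0[0] = 1.0
sol = simulate(c0, 5.0, lambda c: np.zeros_like(c), edges, volumes, 1.0)
```

Rerunning the same call with a different `n_cells` mirrors the grid-refinement check described above. The flux form is chosen because it conserves the total amount of substance when the reaction term is zero.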
The wiring diagram of the SAC mechanism (Fig. 2A) was translated into a set of reaction equations (see Methods), which were then translated into a set of coupled ordinary differential equations (ODEs) under the assumption of mass-action kinetics for all reactions. Biochemically, the total concentration of Cdc20:C-Mad2 is constant in the system and can be expressed as $[\mathrm{Cdc20{:}C\text{-}Mad2}_T] = [\mathrm{Cdc20{:}C\text{-}Mad2}] + [\mathrm{MCC}] + [\mathrm{MCC{:}APC/C}]$. The same is true for total APC/C, thus $[\mathrm{APC/C}_T] = [\mathrm{APC/C}] + [\mathrm{MCC{:}APC/C}]$. For simplicity, the total amount of MCC is defined as $[\mathrm{MCC}_T] = [\mathrm{MCC}] + [\mathrm{MCC{:}APC/C}]$. Under these assumptions, the reduced system can be written as the following nonlinear ODEs:
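$$\frac{d[\mathrm{MCC}_T]}{dt} = k_1 U\,\bigl([\mathrm{Mad}_T] - [\mathrm{MCC}_T]\bigr) - k_3 A\,[\mathrm{APC/C}]\,[\mathrm{MCC{:}APC/C}] \tag{9}$$

$$\frac{d[\mathrm{APC/C}]}{dt} = -k_2\,[\mathrm{MCC}]\,[\mathrm{APC/C}] + \bigl(k_{-2} + A\,[\mathrm{APC/C}]\bigr)\,[\mathrm{MCC{:}APC/C}] \tag{10}$$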
The parameter
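$$\frac{d[\mathrm{MCC}_T]}{dt} = k_1 U\,\bigl([\mathrm{Mad}_T] - [\mathrm{MCC}_T]\bigr) - k_3 A\,\bigl([\mathrm{APC/C}_T] - [\mathrm{MCC{:}APC/C}]\bigr)\,[\mathrm{MCC{:}APC/C}] \tag{11}$$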
$$a\,[\mathrm{MCC{:}APC/C}]^2 + b\,[\mathrm{MCC{:}APC/C}] + c = 0 \tag{12}$$
where the parameters
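One way to arrive at a quadratic of the form of Eq. (12) is to assume that the binding equilibrium in Eq. (10) is reached quickly, that is, to set $d[\mathrm{APC/C}]/dt = 0$, and to eliminate $[\mathrm{MCC}]$ and $[\mathrm{APC/C}]$ with the conservation relations. The coefficients below follow from that assumption and from Eq. (10) as written, with $A$ taken to be the number of attached kinetochores; they are a sketch and may differ from the authors' definitions of $a$, $b$, and $c$.

$$
\begin{aligned}
0 &= -k_2\,\bigl([\mathrm{MCC}_T] - x\bigr)\bigl([\mathrm{APC/C}_T] - x\bigr) + \bigl(k_{-2} + A\,([\mathrm{APC/C}_T] - x)\bigr)\,x, \qquad x = [\mathrm{MCC{:}APC/C}],\\
a &= -(k_2 + A),\\
b &= k_2\,\bigl([\mathrm{MCC}_T] + [\mathrm{APC/C}_T]\bigr) + k_{-2} + A\,[\mathrm{APC/C}_T],\\
c &= -k_2\,[\mathrm{MCC}_T]\,[\mathrm{APC/C}_T].
\end{aligned}
$$

Under this reading, solving the quadratic for $x$ and substituting the result into Eq. (11) leaves a single ODE for $[\mathrm{MCC}_T]$.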
First, a one-parameter bifurcation analysis was performed for the nonlinear system (Eqs. 11 and 12). The aim is to demonstrate the bistable switch in total MCC as kinetochores are gradually attached. The simulations were conducted using AUTO (see Methods). The results in Fig. 2B display a typical S-shape, plotting the number of attached kinetochores against the total concentration of the MCC inhibitor. Stable steady states (nodes) are shown as solid lines and unstable steady states (saddles) as dashed lines. Stable and unstable steady states meet at saddle-node bifurcation points, which are indicated by solid circles. When the number of attached kinetochores reaches 91.98 (i.e., nearly all), the SAC switches off and APC/C is activated rapidly. The total MCC drops back to zero as the cell enters anaphase. The flip from the SAC-active state to the SAC-inactive state as the number of attached kinetochores rises is indicated by a black dashed line. Slight movements of the bifurcation curve to the left or right are possible, depending on the values of the parameters
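The anatomy of such an S-shaped diagram (two stable branches, one unstable branch, and two saddle-node points) can be illustrated with a toy example in Python. The system below, $dx/dt = p + x - x^3$, is a textbook bistable ODE and is not the SAC model; it is used here only because its steady states can be found by brute-force root finding, whereas AUTO traces branches by numerical continuation. The function names are hypothetical.

```python
import numpy as np


def steady_states(p, tol=1e-9):
    """Real steady states of dx/dt = p + x - x**3 and their stability."""
    roots = np.roots([-1.0, 0.0, 1.0, p])
    real = np.sort(roots[np.abs(roots.imag) < tol].real)
    # A steady state is stable when the slope of the right-hand side is negative.
    return [(float(x), "stable" if 1.0 - 3.0 * x ** 2 < 0 else "unstable")
            for x in real]


def bifurcation_branches(p_values):
    """Steady states for each value of the bifurcation parameter p."""
    return {float(p): steady_states(p) for p in p_values}


# Usage: sweep p across both saddle-node points, located at p = +/- 2 / (3 * sqrt(3)).
branches = bifurcation_branches(np.linspace(-1.0, 1.0, 41))
```

Between the two saddle-node points the sweep returns three steady states (stable, unstable, stable); outside that interval only one stable state remains, which is the switch-like behavior that the bifurcation diagram makes visible.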
Additionally, the change of the concentrations over time was simulated and plotted (Fig. 2C) using XPPAUT (see Methods). The concentration of APC/C (Fig. 2C, pink line) remains very low as long as no kinetochores are attached. After approximately 25 minutes, the APC/C activity increases quickly and reaches its maximum. This result is consistent with the experimental findings reported in the literature [1,27]. The inhibitor complex MCC:APC/C, which sequesters APC/C, behaves in exactly the opposite way to APC/C (Fig. 2C, brown line). The MCC concentration behaves similarly to MCC:APC/C, apart from the difference in the initial amount (Fig. 2C, blue line).
Also of interest is the effect of the diffusion coefficient on the SAC system, particularly since the MCC diffusion constant is unknown. To investigate this, a second-derivative diffusion term was added to the original system (Eqs. 9 and 10). The resulting system of coupled PDEs is known as a reaction-diffusion system:
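$$\frac{\partial [\mathrm{MCC}_T]}{\partial t} = \frac{D}{r^2}\frac{\partial}{\partial r}\left(r^2 \frac{\partial [\mathrm{MCC}_T]}{\partial r}\right) + k_1 U\,\bigl([\mathrm{Mad}_T] - [\mathrm{MCC}_T]\bigr) - k_3 A\,[\mathrm{APC/C}]\,[\mathrm{MCC{:}APC/C}] \tag{13}$$

$$\frac{\partial [\mathrm{APC/C}]}{\partial t} = \frac{D}{r^2}\frac{\partial}{\partial r}\left(r^2 \frac{\partial [\mathrm{APC/C}]}{\partial r}\right) - k_2\,[\mathrm{MCC}]\,[\mathrm{APC/C}] + \bigl(k_{-2} + A\,[\mathrm{APC/C}]\bigr)\,[\mathrm{MCC{:}APC/C}] \tag{14}$$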
This system was subjected to the initial conditions given in Table 1, with reflective (Neumann) boundary conditions and the same cell and kinetochore geometry as specified in Methods. The reaction-diffusion system (Eqs. 13 and 14) was implemented in MATLAB and simulated with various diffusion constant values (Table 1). No qualitative changes were recorded over a wide range of MCC diffusion coefficients, covering the values from the literature and higher; all runs behave like the typical curves in Fig. 3A. To ensure that the assumptions used to reduce the model had no influence on this result, the full system of three PDEs was re-simulated using various diffusion values for MCC. Again, no effects were observed (Fig. 3B). We conclude that diffusion has no major influence on the SAC model, and that the use of ODEs is in principle sufficient. This conclusion cannot be generalized to other model structures, particularly to high-dimensional systems or to systems with low diffusion constants.
Following the same steps as in the SAC analysis of the previous section, the SPOC wiring diagram (Fig. 4A) can be translated into a set of coupled ODEs. Accordingly, the total concentrations of Bub2, Tem1, and Bfa1:Bub2 can be expressed as $[\mathrm{Bub2}_T] = [\mathrm{Bub2}] + [\mathrm{Bfa1{:}Bub2}] + [\mathrm{Bfa1{:}Bub2{:}Tem1}]$, $[\mathrm{Tem1}_T] = [\mathrm{Tem1}] + [\mathrm{Bfa1{:}Bub2{:}Tem1}]$, and $[\mathrm{Bfa1{:}Bub2}_T] = [\mathrm{Bfa1{:}Bub2}] + [\mathrm{Bfa1{:}Bub2{:}Tem1}]$. The SPOC system is governed by the following equations, under the aforementioned assumptions (see also Methods):
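$$\frac{d[\mathrm{Bfa1{:}Bub2}_T]}{dt} = k_1 X\,\bigl([\mathrm{Bub2}_T] - [\mathrm{Bfa1{:}Bub2}_T]\bigr) - k_3 Y\,\bigl([\mathrm{Tem1}_T] - [\mathrm{Bfa1{:}Bub2{:}Tem1}]\bigr)\,[\mathrm{Bfa1{:}Bub2{:}Tem1}_T] \tag{15}$$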
$$a\,[\mathrm{Bfa1{:}Bub2{:}Tem1}_T]^2 + b\,[\mathrm{Bfa1{:}Bub2{:}Tem1}_T] + c = 0 \tag{16}$$
where the parameters
Again, AUTO was used to compute the bifurcation curve (see Methods for details). The curve (Fig. 4B) plots the number of misaligned SPBs against the total concentration of the Bfa1:Bub2:Tem1 inhibitor. The system switches between its two stable states at the value 1.99; SPOC is then turned off and Tem1 is rapidly activated.
The numerical simulations of the ODEs, obtained with a stiff solver in XPPAUT, are depicted in Fig. 4C. Tem1 (Fig. 4C, pink line) is inactive until both SPBs are correctly aligned, which takes place after about 7 minutes. The inhibitor complexes Bfa1:Bub2 and Bfa1:Bub2:Tem1 show the opposite behavior to Tem1 (Fig. 4C, gray and brown lines).
In eukaryotic cells, mitotic control prevents DNA missegregation and aneuploidy. The evolutionarily conserved SAC mechanism guarantees that each chromosome has established its attachment to the spindle apparatus before sister-chromatid separation begins, while the SPOC mechanism ensures correct spindle alignment in some asymmetric cell divisions. The complexity of the mitotic control system arises from its fundamentally spatial nature. In SAC, a single unattached or incorrectly attached kinetochore (out of 92) has to inhibit all APC/Cs of the cell, and only after the last proper attachment does the inhibitor have to be switched off rapidly. This behavior likely implies the contribution of a feedback loop. The same applies to the SPOC mechanism, with the difference that the signal comes from two SPBs.
Mathematical modeling can help to improve our molecular-level understanding of the interplay of SAC and SPOC components, clarifying the requirements that the system has to meet. To that end, a mathematical framework containing all signals in SAC (or SPOC) was studied. The network models were constructed from biochemical reaction rules and spatial characteristics, such as diffusion coefficients and cell size. Kinetochores and SPBs act as the sensory signals in SAC and SPOC regulation, respectively. The simulation results (as ODEs or PDEs) capture the desired behavior of both SAC and SPOC. In addition to the simulations, a crucial feedback loop was built around the core components APC/C and MCC (or Tem1 and Bfa1 in SPOC) and their interplay. A one-parameter bifurcation diagram clearly depicts the bistability of the system and the realistic switch from the active to the inactive control state.
The mathematical models presented here can be extended in future work to include various cell cycle checkpoints. Additionally, this approach can serve as a basis for designing experiments and evaluating novel hypotheses related to mitosis and the cell cycle. The results provide systems-level insight into the DNA segregation control mechanism and demonstrate that combining mathematical analysis with experimental data is a powerful tool for investigating complex biomedical systems.
The author gratefully acknowledges the visiting fund of the Institute for Numerical Simulation (INS) at Bonn University.
[1] | B. R. Kiran, I. Sobh, V. Talpaert, P. Mannion, A. A. A. Sallab, S. Yogamani, et al., Deep reinforcement learning for autonomous driving: A survey, IEEE Trans. Intell. Transp. Syst., 23 (2022), 4909–4926. https://doi.org/10.1109/TITS.2021.3054625 |
[2] | J. Chen, B. Yuan, M. Tomizuka, Model-free deep reinforcement learning for urban autonomous driving, in 2019 IEEE intelligent transportation systems conference (ITSC), (2019), 2765–2771. https://doi.org/10.1109/ITSC.2019.8917306 |
[3] | M. Panzer, B. Bender, Deep reinforcement learning in production systems: a systematic literature review, Int. J. Prod. Res., 60 (2022), 4316–4341. https://doi.org/10.1080/00207543.2021.1973138 |
[4] | N. Ma, Y. Gao, J. Li, D. Li, Interactive cognition in self-driving, Sci. Sin. Inf., 48 (2018), 1083–1096. https://doi.org/10.1360/N112018-00028 |
[5] | H. Shi, D. Chen, N. Zheng, X. Wang, Y. Zhou, B. Ran, A deep reinforcement learning based distributed control strategy for connected automated vehicles in mixed traffic platoon, Transp. Res. Part C: Emerging Technol., 148 (2023), 104019. https://doi.org/10.1016/j.trc.2023.104019 |
[6] | Y. Zhao, K. Wu, Z. Xu, Z. Che, Q. Lu, J. Tang, et al., Cadre: A cascade deep reinforcement learning framework for vision-based autonomous urban driving, preprint, arXiv:2202.08557. |
[7] | S. Feng, H. Sun, X. Yan, H. Zhu, Z. Zou, S. Shen, et al., Dense reinforcement learning for safety validation of autonomous vehicles, Nature, 615 (2023), 620–627. https://doi.org/10.1038/s41586-023-05732-2 |
[8] | S. B. Prathiba, G. Raja, K. Dev, N. Kumar, M. Guizani, A hybrid deep reinforcement learning for autonomous vehicles smart-platooning, IEEE Trans. Veh. Technol., 70 (2021), 13340–13350. https://doi.org/10.1109/TVT.2021.3122257 |
[9] | Y. Yao, N. Ma, C. Wang, Z. Wu, C. Xu, J. Zhang, Research and implementation of variable-domain fuzzy pid intelligent control method based on q-learning for self-driving in complex scenarios, Math. Biosci. Eng., 20 (2023), 6016–6029. https://doi.org/10.3934/mbe.2023260 |
[10] | Z. Cao, S. Xu, X. Jiao, H. Peng, D. Yang, Trustworthy safety improvement for autonomous driving using reinforcement learning, Transp. Res. Part C: Emerging Technol., 138 (2022), 103656. https://doi.org/10.1016/j.trc.2022.103656 |
[11] | D. Rempe, J. Philion, L. J. Guibas, S. Fidler, O. Litany, Generating useful accident-prone driving scenarios via a learned traffic prior, preprint, arXiv:2112.05077. |
[12] | G. Li, Y. Yang, S. Li, X. Qu, N. Lyu, S. E. Li, Decision making of autonomous vehicles in lane change scenarios: Deep reinforcement learning approaches with risk awareness, Transp. Res. Part C: Emerging Technol., 134 (2022), 103452. https://doi.org/10.1016/j.trc.2021.103452 |
[13] | P. Bhattacharyya, C. Huang, K. Czarnecki, Ssl-lanes: Self-supervised learning for motion forecasting in autonomous driving, preprint, arXiv:2206.14116. |
[14] | Y. Du, J. Chen, C. Zhao, C. Liu, F. Liao, C. Chan, Comfortable and energy-efficient speed control of autonomous vehicles on rough pavements using deep reinforcement learning, Transp. Res. Part C: Emerging Technol., 134 (2022), 103489. https://doi.org/10.1016/j.trc.2021.103489 |
[15] | B. Zou, J. Peng, S. Li, Y. Li, J. Yan, H. Yang, Comparative study of the dynamic programming-based and rule-based operation strategies for grid-connected pv-battery systems of office buildings, Appl. Energy, 305 (2022), 117875. https://doi.org/10.1016/j.apenergy.2021.117875 |
[16] | B. Du, B. Lin, C. Zhang, B. Dong, W. Zhang, Safe deep reinforcement learning-based adaptive control for usv interception mission, Ocean Eng., 246 (2022), 110477. https://doi.org/10.1016/j.oceaneng.2021.110477 |
[17] | P. R. Wurman, S. Barrett, K. Kawamoto, J. MacGlashan, K. Subramanian, T. J. Walsh, et al., Outracing champion gran turismo drivers with deep reinforcement learning, Nature, 602 (2022), 223–228. https://doi.org/10.1038/s41586-021-04357-7 |
[18] | J. Duan, D. Shi, R. Diao, H. Li, Z. Wang, B. Zhang, et al., Deep-reinforcement-learning-based autonomous voltage control for power grid operations, IEEE Trans. Power Syst., 35 (2020), 814–817. https://doi.org/10.1109/TPWRS.2019.2941134 |
[19] | S. Aradi, Survey of deep reinforcement learning for motion planning of autonomous vehicles, IEEE Trans. Intell. Transp. Syst., 23 (2022), 740–759. https://doi.org/10.1109/TITS.2020.3024655 |
[20] | H. An, J. Jung, Decision-making system for lane change using deep reinforcement learning in connected and automated driving, Electronics, 8 (2019), 543. https://doi.org/10.3390/electronics8050543 |
[21] | Y. Du, J. Chen, C. Zhao, F. Liao, M. Zhu, A hierarchical framework for improving ride comfort of autonomous vehicles via deep reinforcement learning with external knowledge, Comput.-Aided Civ. Infrastruct. Eng., 38 (2023), 1059–1078. https://doi.org/10.1111/mice.12934 |
[22] | K. Jo, Y. Jo, J. K. Suhr, H. G. Jung, M. Sunwoo, Precise localization of an autonomous car based on probabilistic noise models of road surface marker features using multiple cameras, IEEE Trans. Intell. Transp. Syst., 16 (2015), 3377–3392. https://doi.org/10.1109/TITS.2015.2450738 |
[23] | B. Okumura, M. R. James, Y. Kanzawa, M. Derry, K. Sakai, T. Nishi, et al., Challenges in perception and decision making for intelligent automotive vehicles: A case study, IEEE Trans. Intell. Veh., 1 (2016), 20–32. https://doi.org/10.1109/TIV.2016.2551545 |
[24] | R. Guidolini, L. G. Scart, L. F. R. Jesus, V. B. Cardoso, C. Badue, T. Oliveira-Santos, Handling pedestrians in crosswalks using deep neural networks in the iara autonomous car, in 2018 International Joint Conference on Neural Networks (IJCNN), (2018), 1–8. https://doi.org/10.1109/IJCNN.2018.8489397 |
[25] | A. Sadat, M. Ren, A. Pokrovsky, Y. Lin, E. Yumer, R. Urtasun, Jointly learnable behavior and trajectory planning for self-driving vehicles, preprint, arXiv:1910.04586. |
[26] | A. Bacha, C. Bauman, R. Faruque, M. Fleming, C. Terwelp, C. Reinholtz, et al., Odin: Team victortango's entry in the darpa urban challenge, J. Field Rob., 25 (2008), 467–492. https://doi.org/10.1002/rob.20248 |
[27] | R. Kala, K. Warwick, Multi-level planning for semi-autonomous vehicles in traffic scenarios based on separation maximization, J. Intell. Rob. Syst., 72 (2013), 559–590. https://doi.org/10.1007/s10846-013-9817-7 |
[28] | X. Li, Z. Sun, D. Cao, Z. He, Q. Zhu, Real-time trajectory planning for autonomous urban driving: Framework, algorithms, and verifications, IEEE/ASME Trans. Mechatron., 21 (2016), 740–753. https://doi.org/10.1109/TMECH.2015.2493980 |
[29] | S. Xie, J. Hu, P. Bhowmick, Z. Ding, F. Arvin, Distributed motion planning for safe autonomous vehicle overtaking via artificial potential field, IEEE Trans. Intell. Transp. Syst., 23 (2022), 21531–21547. https://doi.org/10.1109/TITS.2022.3189741 |
[30] | A. E. Sallab, M. Abdou, E. Perot, S. Yogamani, Deep reinforcement learning framework for autonomous driving, preprint, arXiv:1704.02532. |
[31] | A. E. Sallab, M. Abdou, E. Perot, S. Yogamani, End-to-end deep reinforcement learning for lane keeping assist, preprint, arXiv:1612.04340. |
[32] | H. Chae, C. M. Kang, B. Kim, J. Kim, C. C. Chung, J. W. Choi, Autonomous braking system via deep reinforcement learning, in 2017 IEEE 20th International conference on intelligent transportation systems (ITSC), (2017), 1–6. https://doi.org/10.1109/ITSC.2017.8317839 |
[33] | M. Zhu, Y. Wang, Z. Pu, J. Hu, X. Wang, R. Ke, Safe, efficient, and comfortable velocity control based on reinforcement learning for autonomous driving, Transp. Res. Part C: Emerging Technol., 117 (2020), 102662. https://doi.org/10.1016/j.trc.2020.102662 |
[34] | M. Jaritz, R. De Charette, M. Toromanoff, E. Perot, F. Nashashibi, End-to-end race driving with deep reinforcement learning, in 2018 IEEE international conference on robotics and automation (ICRA), (2018), 2070–2075. https://doi.org/10.1109/ICRA.2018.8460934 |
[35] | L. Qian, X. Xu, Y. Zeng, J. Huang, Deep, consistent behavioral decision making with planning features for autonomous vehicles, Electronics, 8 (2019), 1492. https://doi.org/10.3390/electronics8121492 |
[36] | N. K. Ure, M. U. Yavas, A. Alizadeh, C. Kurtulus, Enhancing situational awareness and performance of adaptive cruise control through model predictive control and deep reinforcement learning, in 2019 IEEE Intelligent Vehicles Symposium (IV), (2019), 626–631. https://doi.org/10.1109/IVS.2019.8813905 |
[37] | S. Feng, X. Yan, H. Sun, Y. Feng, H. X. Liu, Intelligent driving intelligence test for autonomous vehicles with naturalistic and adversarial environment, Nature Commun., 748 (2021). https://doi.org/10.1038/s41467-021-21007-8 |
[38] | S. Minaee, N. Kalchbrenner, E. Cambria, N. Nikzad, M. Chenaghlu, J. Gao, Deep learning-based text classification: A comprehensive review, ACM Comput. Surv., 54 (2021), 1–40. https://doi.org/10.1145/3439726 |
[39] | G. Li, S. Lin, S. Li, X. Qu, Learning automated driving in complex intersection scenarios based on camera sensors: A deep reinforcement learning approach, IEEE Sens. J., 22 (2022), 4687–4696. https://doi.org/10.1109/JSEN.2022.3146307 |