Review

A comprehensive review of graph convolutional networks: approaches and applications

  • Received: 03 October 2022 Revised: 04 March 2023 Accepted: 28 March 2023 Published: 31 May 2023
  • Convolutional neural networks (CNNs) exploit local translation invariance in the Euclidean domain and have achieved remarkable results in computer vision tasks. However, many data types have non-Euclidean structures, such as social networks, chemical molecules and knowledge graphs, which are crucial to real-world applications. The graph convolutional neural network (GCN) was established as a derivative of CNNs for such non-Euclidean graph data. In this paper, we survey the progress of GCNs and introduce in detail several basic models based on GCNs. First, we review the challenges in building GCNs, including large-scale graph data, directed graphs and multi-scale graph tasks. We also briefly discuss some applications of GCNs, including computer vision, transportation networks and other fields. Furthermore, we point out some open issues and highlight future research trends for GCNs.

    Citation: Xinzheng Xu, Xiaoyang Zhao, Meng Wei, Zhongnian Li. A comprehensive review of graph convolutional networks: approaches and applications[J]. Electronic Research Archive, 2023, 31(7): 4185-4215. doi: 10.3934/era.2023213




    The unprecedented growth of sensing and computational capabilities has advanced the development of real-time traffic estimation techniques. However, many estimation algorithms proposed to monitor traffic conditions are only verified through experiments (e.g., [5,13,26,27,42,43]), while a theoretical analysis of the estimator performance is lacking. This is mainly due to the complexity of the models describing traffic (e.g., non-linearity and non-differentiability) [2] and, more importantly, the non-observability [30,37] of the systems subject to conservation laws. When a system is not observable, the available sensor measurements (in conjunction with the model describing the system dynamics) are insufficient to correctly reconstruct the full state to be estimated. In traffic estimation problems, the non-observability issue is inevitable when a shock exists and the sensors cannot measure every state variable in the road network of interest.

    This work addresses the lack of a theoretical performance analysis for traffic estimation problems under unobservable scenarios occurring at junctions. Note that in classical estimation / filtering theory, an unobservable system is very likely to result in an estimation error that diverges [1,17]. Nevertheless, in this work we derive theoretical error bounds for traffic estimation on unobservable transportation networks (with junctions), where the state estimate is provided by the Kalman filter (KF). Specifically, when the system is not observable, the error bounds are derived by leveraging the intrinsic properties of traffic models (e.g., mass conservation and the flow-density relationship), and by exploring the interactions between these intrinsic properties and the update scheme of the Kalman filtering algorithm.

    The classical conservation law describing the evolution of traffic density on a road link (i.e., a road section without junctions) is the Lighthill-Whitham-Richards partial differential equation (LWR PDE) [25,32]. The cell transmission model (CTM) [7,8,23] is a discretization of the LWR PDE using the Godunov scheme [11]. In [30,35], the CTM is transformed to a hybrid linear system known as the switching mode model (SMM) that switches among different linear modes, and the observability of each mode is analyzed.

    To extend the traffic model on road links to networks, a model is needed to describe how the traffic exiting the road links on the upstream side of a junction is received by the road links on the downstream side of the junction. A well known issue is that the conservation of vehicles across the junction is insufficient to uniquely define the flows at the junction. To address this issue, a number of junction models [8,9,10,12,14,15,18,19,20,22] have been proposed via additional rules governing the distribution or priority of the flows on different road links.

    In parallel to the ongoing development of traffic models, a number of sequential traffic estimation algorithms have been proposed to integrate model predictions with real-time sensor measurements. For example, the mixture Kalman filter is applied to the SMM in [35] to estimate traffic densities for ramp metering. The parallelized particle filters and the parallelized Gaussian sum particle filter are designed in [27] for computational scalability. In [40], an efficient multiple model particle filter is proposed for joint traffic estimation and incident detection on freeways. Other treatments of traffic estimation include [5,26,29,38,41,42,43]. A comprehensive survey of sequential estimation techniques for traffic models can be found in [34].

    Although many traffic estimation algorithms proposed in the existing literature are verified experimentally, very few theoretical results exist that analyze the performance of traffic estimators (e.g., bounds on the estimation error), especially under unobservable scenarios. The main results to date are as follows. In [16], the KF is applied to a Gaussian approximation of a stochastic traffic model, and the stochastic observability of the system is proved. To ensure observability of the system, a warm-up period is required where the initial conditions are restricted to free-flow traffic conditions. In [6], the local observability around an equilibrium traffic state is studied using a Lagrangian formulation of the traffic model. The authors of [28] prove performance guarantees for a noise-free Luenberger observer for traffic estimation based on the SMM, which is the first work to provide a theoretical performance analysis for a traffic estimator under both observable and unobservable scenarios. The Luenberger observer in [28] discards measurement feedback in unobservable modes, ensuring that the spatial integral of the estimation error is conserved. Similarly, in [39] the estimator runs an open-loop predictor in unobservable cases to ensure that the estimation error does not diverge. Although dropping measurement feedback conserves the spatial integral of the estimation error, it is illustrated in [36] that doing so can lead to physically unreasonable estimates. In our earlier work [37], we showed that incorporating measurement feedback under unobservable scenarios is essential for maintaining physically reasonable estimates, i.e., due to the interactions of the model prediction and the measurement feedback in the filtering algorithm, the mean estimates of all state variables are guaranteed to be ultimately bounded inside the physically meaningful interval under unobservable modes.
Note that the results in [6,16,28,37,39] are restricted to road stretches without considering junction dynamics, which is the focus of the work in this article.

    The main contribution of this article is the theoretical analysis of the estimation error bounds when using the KF to estimate traffic densities on transportation networks with junctions. To support a performance study of the KF, we first propose a switched linear system to model the evolution of traffic densities on road networks with junctions, namely the switching mode model for junctions (SMM-J). The SMM-J is an extension of the SMM, in the sense that the SMM-J describes the traffic density evolution on a road section with a junction, while the SMM only considers one-dimensional (i.e., without junctions) road sections. The SMM-J combines a switched linear system representation of the CTM with the junction model developed in [24]; however, other junction models can also be incorporated in a similar fashion to derive linear traffic models at junctions. We also provide the observability result for each mode of the SMM-J, and show that nearly all modes are unobservable due to the irreversibility of the conservation laws in the presence of shocks and junctions. Compared to one-dimensional road sections, the issue of non-observability is encountered more frequently when junctions exist, motivating attention to the error bounds under unobservable scenarios. Next, we prove the estimation error properties of the KF that uses the SMM-J to estimate an unobservable freeway section with a junction inside. Although an unobservable system typically results in diverging estimation errors [17], we show that by combining the update scheme of the KF with the physical properties embedded in the traffic model, the following properties can be derived for traffic estimation under unobservable modes:

    1. For an unobservable road section, the Kalman gain has a bounded infinity norm. This is a necessary condition to ensure a bounded mean estimation error for the KF.

    2. When a road section stays in an unobservable mode, the mean estimate of each state variable is ultimately bounded inside a physically meaningful interval; thus the mean estimation error of the entire state is also ultimately bounded.

    This work complements our previous work studying the performance of traffic estimators that use the SMM to monitor unobservable one-dimensional freeway traffic conditions [37]. Thus, when estimating the traffic conditions in a large-scale road network, we can partition the traffic network into local sections which are either one-dimensional or contain a junction. The traffic condition in each local section is estimated by a local estimator based on the KF and the SMM-J (or the SMM). In this distributed manner, the estimation errors for the sections without junctions reside below the bounds derived in [37], and the error bounds studied in this work justify the estimation accuracy in the sections with junctions.

    This work is organized as follows. Section 2 reviews the KF and its error properties under observable and unobservable systems, and briefly summarizes the CTM. Section 3 introduces the SMM-J, its observability under different modes, and the properties of the state transition matrices that reflect the intrinsic physical properties of the traffic model. The properties of the state transition matrices of the SMM-J are applied in Section 4, where we prove the boundedness of the Kalman gain and the ultimate bound of the mean estimation error under unobservable scenarios. Finally, some concluding remarks are provided in Section 5.

    Notations. Let $I_n$ and $0_{n,m}$ be the $n\times n$ identity matrix and the $n\times m$ zero matrix, respectively. The subscripts of $I_n$ and $0_{n,m}$ are sometimes omitted when the dimensions are clear from the context. Denote by $E[\cdot]$ the expectation operator; the "bar" symbol denotes the expected value of a random vector $x$, i.e., $\bar{x}=E[x]$. The symbol $^\top$ denotes the transpose operator.

    In this section, we first review the Kalman filter and the properties of its error covariance as well as mean estimation error under observable and unobservable scenarios. We also provide a brief overview of the cell transmission model used to construct the SMM-J.

    In this subsection, we briefly review the KF and introduce necessary notations in this article. Consider a general linear time-varying system

    $$\rho_{k+1}=A_k\rho_k+u_k+\omega_k, \qquad \rho_k\in\mathbb{R}^n, \tag{1}$$
    $$z_k=H_k\rho_k+v_k, \qquad z_k\in\mathbb{R}^m, \tag{2}$$

    where $\rho_k$ and $z_k$ are the state vector and the sensor measurement vector at time $k\in\mathbb{N}$, respectively. The matrices $A_k$ and $H_k$ are the state transition matrix and the observation matrix at time $k$. The term $u_k$ in (1) is a deterministic system input. The noise terms $\omega_k\sim\mathcal{N}(0,Q_k)$ and $v_k\sim\mathcal{N}(0,R_k)$ are the white Gaussian model and measurement noise, where $Q_k$ and $R_k$ denote the model and the measurement error covariance matrices at time $k$. Given the sensor data up to time $k$, denoted by $Z_k=\{z_0,\ldots,z_k\}$, the prior estimate and posterior estimate of the state can be expressed as $\rho_{k|k-1}=E[\rho_k|Z_{k-1}]$ and $\rho_{k|k}=E[\rho_k|Z_k]$, respectively. Let $\eta_{k|k-1}=\rho_{k|k-1}-\rho_k$ and $\eta_{k|k}=\rho_{k|k}-\rho_k$ denote the prior and posterior estimation errors. The estimation error covariance matrices associated with $\rho_{k|k-1}$ and $\rho_{k|k}$ are given by $\Gamma_{k|k-1}=E[\eta_{k|k-1}\eta_{k|k-1}^\top|Z_{k-1}]$ and $\Gamma_{k|k}=E[\eta_{k|k}\eta_{k|k}^\top|Z_k]$. The KF sequentially computes $\rho_{k|k}$ from $\rho_{k-1|k-1}$ as follows:

    $$\text{Prediction:}\quad\begin{cases}\rho_{k|k-1}=A_{k-1}\rho_{k-1|k-1}+u_k\\ \Gamma_{k|k-1}=A_{k-1}\Gamma_{k-1|k-1}A_{k-1}^\top+Q_{k-1},\end{cases} \tag{3}$$
    $$\text{Correction:}\quad\begin{cases}\rho_{k|k}=\rho_{k|k-1}+K_k(z_k-H_k\rho_{k|k-1})\\ \Gamma_{k|k}=\Gamma_{k|k-1}-K_kH_k\Gamma_{k|k-1}\\ K_k=\Gamma_{k|k-1}H_k^\top(R_k+H_k\Gamma_{k|k-1}H_k^\top)^{-1}.\end{cases} \tag{4}$$

    In (4), the matrix $K_k$ is the Kalman gain at time $k$. Note that for all $k$, the state estimates $\rho_{k|k-1}$ and $\rho_{k|k}$ are random vectors. The mean posterior estimate and the mean posterior estimation error¹ are denoted by $\bar\rho_{k|k}$ and $\bar\eta_{k|k}$, respectively.

    1For the remainder of this article, the term state estimates/estimation errors/error covariance refers to the posterior estimates, unless specified otherwise.
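    To make the recursion concrete, one prediction-correction cycle (3)-(4) can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the paper; the function name and toy dimensions are our own.

```python
import numpy as np

def kf_step(rho, Gamma, A, u, Q, z, H, R):
    """One prediction-correction cycle of the Kalman filter, Eqs. (3)-(4).

    rho, Gamma : posterior mean and covariance at time k-1
    A, u, Q    : state transition matrix, input, model noise covariance
    z, H, R    : measurement, observation matrix, measurement noise covariance
    """
    # Prediction, Eq. (3)
    rho_prior = A @ rho + u
    Gamma_prior = A @ Gamma @ A.T + Q
    # Correction, Eq. (4)
    S = R + H @ Gamma_prior @ H.T             # innovation covariance
    K = Gamma_prior @ H.T @ np.linalg.inv(S)  # Kalman gain
    rho_post = rho_prior + K @ (z - H @ rho_prior)
    Gamma_post = Gamma_prior - K @ H @ Gamma_prior
    return rho_post, Gamma_post, K
```

    Note that if $H_k$ measures only part of the state, the correction step reduces the covariance only of the components coupled to the measurement, which is the mechanism behind the unobservable-mode analysis later in the article.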

    Observability. The discrete system (1)-(2) is uniformly completely observable if there exist a positive integer $N$ and positive constants $\alpha$, $\beta$ such that

    $$\alpha I_n < \mathcal{I}_{k,k-N} < \beta I_n, \qquad \text{for all } k\ge N, \tag{5}$$

    where $\mathcal{I}_{k_1,k_0}$ is defined as the information matrix for the time interval $k\in[k_0,k_1]$:

    $$\mathcal{I}_{k_1,k_0}=\sum_{k=k_0}^{k_1}\Xi_{k,k_1}^\top H_k^\top R_k^{-1}H_k\,\Xi_{k,k_1},$$

    where

    $$\Xi_{k,k_1}=\prod_{\kappa=k}^{k_1-1}A_\kappa^{-1}, \quad\text{and}\quad \Xi_{k_1,k}=\Xi_{k,k_1}^{-1}=\prod_{\kappa=k_1-1}^{k}A_\kappa. \tag{6}$$

    The observability of a system characterizes whether the sensor measurements of the system are sufficient for the KF to correctly estimate the state vector. Given the positive definiteness of the information matrix, the observability Gramian linking the initial state $\rho_0$ to the sensor measurements up to time $N$ (i.e., $Z_N$) is invertible. The next lemma states the boundedness (from both below and above) of the estimation error covariance and the convergence (to zero) of the mean estimation error, when using the KF (3)-(4) to estimate a uniformly completely observable system.
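    The test (5)-(6) can be checked numerically. The sketch below is our own illustration for a time-invariant toy system (the matrices `A_obs`, `A_unobs`, the window length, and all names are assumptions, not the traffic model): measuring only the first state variable yields a positive definite information matrix when the dynamics couple the two states, and a singular one when they do not.

```python
import numpy as np

def information_matrix(A, H, R, k0, k1):
    """Information matrix I_{k1,k0} of Eq. (6) for time-invariant (A, H, R)."""
    n = A.shape[0]
    info = np.zeros((n, n))
    R_inv = np.linalg.inv(R)
    for k in range(k0, k1 + 1):
        # Xi_{k,k1} = prod_{kappa=k}^{k1-1} A^{-1}  (time-invariant case)
        Xi = np.linalg.matrix_power(np.linalg.inv(A), k1 - k)
        info += Xi.T @ H.T @ R_inv @ H @ Xi
    return info

H = np.array([[1.0, 0.0]])   # only the first state variable is measured
R = np.array([[0.1]])

A_obs = np.array([[1.0, 1.0],    # measured state couples to the second state:
                  [0.0, 1.0]])   # information matrix is positive definite
A_unobs = np.eye(2)              # decoupled states: one zero eigenvalue
```

    Here `np.linalg.eigvalsh(information_matrix(A_unobs, H, R, 0, 5))` has a zero eigenvalue, so no $\alpha>0$ satisfies (5) and the system is unobservable.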

    Lemma 1 (Chapter 7.6 in [17]). If the dynamical system (1)-(2) is uniformly completely observable and the following conditions hold:

    (C.1): there exist positive constants $q_1$, $q_2$, $r_1$ and $r_2$ such that $q_1 I_n < Q_k < q_2 I_n$ and $r_1 I_m < R_k < r_2 I_m$ for all $k$;

    (C.2): the initial error covariance is positive definite, i.e., $\Gamma_{0|0}>0$;

    (C.3): the state transition matrix $A_k$ is nonsingular for all $k$,

    then there exist positive constants $c_1>0$ and $c_2>0$ such that the error covariance of the KF (3)-(4) satisfies

    $$c_1 I_n < \Gamma_{k|k} < c_2 I_n, \qquad \text{for all } k\ge 0.$$

    Moreover, there exist positive constants $a>0$ and $0<q<1$ such that the 2-norm² of the mean estimation error satisfies

    ²For the remainder of this article, we denote by $\|\cdot\|$ the 2-norm of a matrix or a vector.

    $$\|\bar\eta_{k|k}\| < a\,q^k\,\|\bar\eta_{0|0}\|, \qquad \text{for all } k\ge 0.$$

    When system (1)-(2) is not observable, the mean estimation error $\bar\eta_{k|k}$ will diverge unless the unobservable part of the state is bounded or converges to zero automatically. In Appendix 1, we present an example illustrating the evolution of the mean estimation error given by the KF when tracking an unobservable system.

    The classical conservation law describing the evolution of the traffic density $\rho(t,x)$ on a road at location $x$ and time $t$ is the Lighthill-Whitham-Richards partial differential equation (LWR PDE) [25,32]:

    $$\partial_t\rho+\partial_x F(\rho)=0. \tag{7}$$

    The function $F(\rho)=\rho\,v(\rho)$ is called the flux function, where $v(\rho)$ is an empirical velocity function used to close the model.

    The cell transmission model (CTM) [7,8,23] is a discretization of (7) using a Godunov scheme [11]. Consider a uniformly sized discretization grid defined by a space step $\Delta x>0$ and a time step $\Delta t>0$. Let $l$ index the cell defined by $x\in[l\Delta x,(l+1)\Delta x)$, and denote by $\rho_k^l$ the density at time $k\Delta t$ in cell $l$, where $k\in\mathbb{N}$ and $l\in\mathbb{N}_+$. Moreover, denote by $f(\rho_k^{l-1},\rho_k^l)$ the flux between cells $l-1$ and $l$. In the CTM, the discretized model (7) becomes

    $$\rho_{k+1}^l=\rho_k^l+\frac{\Delta t}{\Delta x}\left(f(\rho_k^{l-1},\rho_k^l)-f(\rho_k^l,\rho_k^{l+1})\right), \tag{8}$$

    where $f(\rho_k^{l-1},\rho_k^l)$ is computed by

    $$f(\rho_k^{l-1},\rho_k^l)=\min\{s(\rho_k^{l-1}),\,r(\rho_k^l)\}. \tag{9}$$

    In (9), $s(\rho_k^{l-1})$ is the sending capacity (i.e., maximum sending flow) of cell $l-1$ at time $k$, which is a function of $\rho_k^{l-1}$, and $r(\rho_k^l)$ is the receiving capacity (i.e., maximum receiving flow) of cell $l$ at time $k$, which is a function of $\rho_k^l$. Note that if the Courant-Friedrichs-Lewy (CFL) condition is satisfied, the solution of the CTM converges in $L^1$ to the weak solution of the LWR PDE as $\Delta x\to 0$ [33].

    Remark 1. Note that the terminologies sending capacity and receiving capacity are equivalent to the notions of demand and supply. Both sending/receiving and demand/supply terminologies are widely used in the traffic community, with the former introduced in [7,8], and the latter introduced in [23]. In this work we use the sending/receiving terminology, which is consistent throughout the article, and is also consistent with our previous work [37].

    The flux function [8] used in this work is the triangular flux function (shown in Figure 1) given by

    Figure 1.  The triangular fundamental diagram in (10).
    $$F(\rho)=\begin{cases}\rho v_m & \text{if } \rho\in[0,\varrho_c]\\ w(\varrho_m-\rho) & \text{if } \rho\in[\varrho_c,\varrho_m]\end{cases} \tag{10}$$

    where $w=\frac{\varrho_c v_m}{\varrho_m-\varrho_c}$, $v_m$ denotes the freeflow speed, and $\varrho_m$ denotes the maximum density. The parameter $\varrho_c$ is the critical density at which the maximum flux is realized. For the triangular fundamental diagram, the flux function has different slopes in freeflow ($0<\rho\le\varrho_c$) and congestion ($\varrho_c<\rho\le\varrho_m$): in freeflow the slope is $v_m$, and in congestion it is $-w$. Under the triangular flux function, the sending and receiving capacities are determined by:

    $$s(\rho)=\begin{cases}\rho v_m & \text{if } \rho\in[0,\varrho_c]\\ q_m & \text{if } \rho\in[\varrho_c,\varrho_m]\end{cases} \qquad r(\rho)=\begin{cases}q_m & \text{if } \rho\in[0,\varrho_c]\\ w(\varrho_m-\rho) & \text{if } \rho\in[\varrho_c,\varrho_m]\end{cases} \tag{11}$$

    where $q_m$ is the maximum flow, given by $q_m=v_m\varrho_c$.
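    As an illustration, one CTM update (8)-(9) under the triangular sending/receiving capacities (11) can be sketched as follows. The parameter values and the boundary-flow handling are our own illustrative assumptions (chosen so that the CFL condition $v_m\Delta t/\Delta x\le 1$ holds), not values from the paper.

```python
import numpy as np

# Triangular fundamental diagram parameters (illustrative values only)
VM, W = 30.0, 10.0          # freeflow speed v_m, congestion wave speed w
RHO_C, RHO_M = 0.05, 0.2    # critical and maximum densities
QM = VM * RHO_C             # maximum flow, q_m = v_m * rho_c

def sending(rho):
    """Sending capacity s(rho), Eq. (11)."""
    return np.where(rho <= RHO_C, rho * VM, QM)

def receiving(rho):
    """Receiving capacity r(rho), Eq. (11)."""
    return np.where(rho <= RHO_C, QM, W * (RHO_M - rho))

def ctm_step(rho, dt, dx, inflow, outflow):
    """One CTM update, Eqs. (8)-(9), with prescribed boundary flows."""
    # Interior fluxes: f(rho^{l-1}, rho^l) = min{ s(rho^{l-1}), r(rho^l) }
    f = np.minimum(sending(rho[:-1]), receiving(rho[1:]))
    flux = np.concatenate(([inflow], f, [outflow]))
    return rho + dt / dx * (flux[:-1] - flux[1:])
```

    Because the update only moves flux between neighboring cells, the total mass changes only through the boundary flows, which is the discrete counterpart of the conservation law (7).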

    In this section, we derive the hybrid linear model describing traffic density evolution on a road section with a junction, namely the switching mode model for junctions (SMM-J). The SMM-J combines a switched linear system representation of the CTM (i.e., the CTM where the flux function is the triangular fundamental diagram) with a junction solver. This section starts with a review of the applied junction solver. Next, we introduce the SMM-J and provide examples regarding its explicit formulas. Finally, the observability of the SMM-J is discussed.

    This subsection introduces the junction solver proposed in [24]. As shown in Figure 2a, the junction solver computes the fluxes $f(\rho_k^l,\rho_k^i)$ and $f(\rho_k^l,\rho_k^j)$ between the connecting cells $l$, $i$ and $j$ forming a diverge junction. For the merge junction shown in Figure 2b, the junction solver computes $f(\rho_k^i,\rho_k^l)$ and $f(\rho_k^j,\rho_k^l)$ between the connecting cells. This solver is applied in Section 3.2 to develop the SMM-J.

    Figure 2.  A diverge and a merge junction connected by three cells indexed by i, j, and l.

    At a diverge junction in Figure 2a, the junction solver is governed by the following rules:

    (R.1): The mass across the junction is conserved.

    (R.2): The throughput flow $f(\rho_k^l,\rho_k^i)+f(\rho_k^l,\rho_k^j)$, i.e., the outgoing flow of cell $l$, is maximized subject to the maximum flow that can be sent or received on each connecting cell.

    (R.3): The distribution of the fluxes $f(\rho_k^l,\rho_k^i)$ and $f(\rho_k^l,\rho_k^j)$ satisfies $f(\rho_k^l,\rho_k^j)=\alpha_d f(\rho_k^l,\rho_k^i)$, where $\alpha_d$ is the prescribed distribution parameter that models the routing preference to the downstream cells. When (R.3) conflicts with (R.2), i.e., the flow solution that satisfies the distribution equation does not maximize the throughput, then (R.3) is relaxed, such that the solution satisfies (R.2) and minimizes the deviation from the prescribed distribution parameter, e.g., $|f(\rho_k^l,\rho_k^j)/f(\rho_k^l,\rho_k^i)-\alpha_d|$.

    The diverge junction solver is posed as a convex program with a carefully constructed objective function to accommodate the throughput maximization (R.2) and the flow distribution (R.3). The mathematical formulation of the diverge junction solver is given below (see [24, Section 3.2] for more details).

    Definition 1 (Convex program for the diverge junction solver). Define the objective function $J(f_1,f_2)$ as:

    $$J(f_1,f_2)=(1-\lambda)(f_1+f_2)-\lambda(f_2-\alpha_d f_1)^2,$$

    where $0<\lambda<1$ and $\lambda$ is chosen such that $\frac{\partial J}{\partial f_1}>0$ and $\frac{\partial J}{\partial f_2}>0$³. The conditions on $\lambda$ are used to prioritize maximizing the throughput $f(\rho_k^l,\rho_k^i)+f(\rho_k^l,\rho_k^j)$ (as stated in (R.2)), and then minimizing the deviation from the prescribed distribution parameter $\alpha_d$ (as stated in (R.3)). The fluxes $f(\rho_k^l,\rho_k^i)$ and $f(\rho_k^l,\rho_k^j)$ are obtained by solving the following convex program:

    ³One possible choice of $\lambda$ is $\lambda=\min\{(1+2\alpha_d^2 q_m+\varepsilon)^{-1},\,(1+2q_m+\varepsilon)^{-1}\}$, where $\varepsilon>0$ can be any positive value.

    $$f(\rho_k^l,\rho_k^i),\ f(\rho_k^l,\rho_k^j)=\operatorname*{argmax}_{f_1,f_2}\ J(f_1,f_2)$$
    $$\text{s.t.}\quad f_1\le r(\rho_k^i), \tag{12}$$
    $$\qquad\ \ f_2\le r(\rho_k^j), \tag{13}$$
    $$\qquad\ \ f_1+f_2\le s(\rho_k^l). \tag{14}$$

    Figure 3 provides a graphical illustration of the solutions of the convex program defined in Definition 1. The blue vertical solid line denotes the receiving capacity of cell $i$, i.e., $r(\rho_k^i)$, and the blue horizontal solid line denotes the receiving capacity of cell $j$, i.e., $r(\rho_k^j)$. The intercepts of the blue dashed line (with slope $-1$) denote the sending capacity of cell $l$, i.e., $s(\rho_k^l)$. The shaded area denotes the feasible values of the flux from cell $l$ to $i$ and the flux from cell $l$ to $j$; the feasible area is obtained from (12)-(14). The slope of the black dotted line is the prescribed distribution ratio $\alpha_d$. The fluxes computed by the junction solver in Definition 1 are marked by the red dot, whose horizontal and vertical axis values are the obtained $f(\rho_k^l,\rho_k^i)$ and $f(\rho_k^l,\rho_k^j)$, respectively.

    Figure 3.  Three scenarios in the junction solver [24] for the diverge junction shown in Figure 2a, where cell $l$ diverges to cells $i$ and $j$. The blue vertical (resp. horizontal) solid line denotes the receiving capacity of cell $i$ (resp. cell $j$). The intercepts of the blue dashed line denote the sending capacity of cell $l$. The shaded area denotes the feasible values of the flux from cell $l$ to $i$ and the flux from cell $l$ to $j$. The slope of the black dotted line is the prescribed distribution ratio $\alpha_d$. The fluxes computed by the junction solver are marked by the red dot, whose horizontal and vertical axis values are the obtained $f(\rho_k^l,\rho_k^i)$ and $f(\rho_k^l,\rho_k^j)$, respectively. Note that in diverge cases Ⅱ and Ⅲ, the receiving capacities of cell $i$ and cell $j$ are not necessarily smaller than the sending capacity of cell $l$, and the graphical illustration of the flux solutions is also applicable for $r(\rho_k^i)\ge s(\rho_k^l)$ and/or $r(\rho_k^j)\ge s(\rho_k^l)$.

    According to the convex program in Definition 1, to obtain $f(\rho_k^l,\rho_k^i)$ and $f(\rho_k^l,\rho_k^j)$ we need to find the solution point (i.e., the red dot) in Figure 3 that satisfies the following requirements:

    ● The point lies in the shaded feasible area, so that conditions (12)-(14) are satisfied;

    ● The point is as close as possible to the blue dashed line (with slope $-1$), so that the throughput $f(\rho_k^l,\rho_k^i)+f(\rho_k^l,\rho_k^j)$ is maximized; this is due to the fact that the distance $d_0$ between the point and the blue dashed line is proportional to the disparity between the sending capacity of cell $l$ and the throughput, i.e., $|s(\rho_k^l)-(f(\rho_k^l,\rho_k^i)+f(\rho_k^l,\rho_k^j))|=\sqrt{2}\,d_0$ (as illustrated in diverge case Ⅰ of Figure 3);

    ● Conditioned on prioritizing throughput maximization, the point is as close as possible to the black dotted line (with slope $\alpha_d$), so that the distribution ratio is respected; this means that when maximizing the throughput conflicts with the distribution ratio, the requirement $f(\rho_k^l,\rho_k^j)=\alpha_d f(\rho_k^l,\rho_k^i)$ can be relaxed.

    As shown in Figure 3, there are in total three scenarios depending on the values of $s(\rho_k^l)$, $r(\rho_k^i)$ and $r(\rho_k^j)$. The three scenarios are: (i) diverge case Ⅰ, when $s(\rho_k^l)\ge r(\rho_k^i)+r(\rho_k^j)$; (ii) diverge case Ⅱ, when $s(\rho_k^l)<r(\rho_k^i)+r(\rho_k^j)$ and the prescribed distribution ratio $\alpha_d$ can be followed exactly; and (iii) diverge case Ⅲ, when $s(\rho_k^l)<r(\rho_k^i)+r(\rho_k^j)$ but (due to throughput maximization) the prescribed distribution ratio $\alpha_d$ cannot be followed exactly.

    Under diverge case Ⅰ, the fluxes $f(\rho_k^l,\rho_k^i)$ and $f(\rho_k^l,\rho_k^j)$ are computed by:

    $$f(\rho_k^l,\rho_k^i)=r(\rho_k^i),\qquad f(\rho_k^l,\rho_k^j)=r(\rho_k^j). \tag{15}$$

    Under diverge case Ⅱ, the fluxes $f(\rho_k^l,\rho_k^i)$ and $f(\rho_k^l,\rho_k^j)$ are given as follows:

    $$f(\rho_k^l,\rho_k^i)=\frac{1}{\alpha_d+1}\,s(\rho_k^l),\qquad f(\rho_k^l,\rho_k^j)=\frac{\alpha_d}{\alpha_d+1}\,s(\rho_k^l). \tag{16}$$

    Under diverge case Ⅲ, the junction solver first finds the two vertices of the shaded area that lie on the dashed line, and then takes the vertex closer to the dotted line as the flux solution $\big(f(\rho_k^l,\rho_k^i),\,f(\rho_k^l,\rho_k^j)\big)$. Hence, depending on the magnitudes of $s(\rho_k^l)$, $r(\rho_k^i)$ and $r(\rho_k^j)$, the solution could be either

    $$f(\rho_k^l,\rho_k^i)=r(\rho_k^i),\qquad f(\rho_k^l,\rho_k^j)=s(\rho_k^l)-r(\rho_k^i), \tag{17}$$

    or

    $$f(\rho_k^l,\rho_k^i)=s(\rho_k^l)-r(\rho_k^j),\qquad f(\rho_k^l,\rho_k^j)=r(\rho_k^j). \tag{18}$$

    Note that diverge case Ⅲ as shown in Figure 3 illustrates the flux solutions presented in (17).
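    Since the convex program in Definition 1 reduces to the closed-form cases (15)-(18), the solver can be sketched as a simple case analysis. The helper below is our own illustration (it implicitly assumes nonnegative capacities and fluxes, and the names are not from the paper).

```python
def diverge_fluxes(s_l, r_i, r_j, alpha_d):
    """Closed-form diverge junction fluxes, following cases I-III, Eqs. (15)-(18).

    s_l      : sending capacity of the upstream cell l
    r_i, r_j : receiving capacities of the downstream cells i and j
    alpha_d  : prescribed distribution ratio, f_lj = alpha_d * f_li when feasible
    """
    if s_l >= r_i + r_j:
        # Case I: both downstream cells are saturated, Eq. (15)
        return r_i, r_j
    # Candidate respecting the distribution ratio exactly, Eq. (16)
    f_i = s_l / (alpha_d + 1.0)
    f_j = alpha_d * s_l / (alpha_d + 1.0)
    if f_i <= r_i and f_j <= r_j:
        # Case II: the ratio alpha_d can be followed exactly
        return f_i, f_j
    # Case III: relax the ratio and pick the feasible vertex on the line
    # f_i + f_j = s_l closest to f_j = alpha_d * f_i, Eqs. (17)-(18)
    if f_i > r_i:
        return r_i, s_l - r_i    # Eq. (17)
    return s_l - r_j, r_j        # Eq. (18)
```

    In every case the returned fluxes sum to $\min\{s(\rho_k^l),\,r(\rho_k^i)+r(\rho_k^j)\}$, i.e., the throughput is maximized as required by (R.2).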

    Remark 2. The diverge junction solver described above is a non-First-In-First-Out (FIFO) model [8,10]. The FIFO diverge model maximizes the outgoing flow of cell $l$ subject to the distribution ratio $f(\rho_k^l,\rho_k^j)=\alpha_d f(\rho_k^l,\rho_k^i)$. Although the FIFO model circumvents the conflict between throughput maximization and flow distribution, it produces unrealistic solutions in some circumstances. Several diverge junction models have been proposed to resolve this issue [14,20,22]. In the same spirit as these models, the diverge junction solver applied in this article is developed to produce similar traffic-condition-dependent solutions without introducing additional complexity to the traffic dynamics [24]. As a related note, the results proved in this article can be extended to FIFO models with minor changes to the proof.

    At a merge junction in Figure 2b, the junction solver conserves mass, and maximizes the throughput while minimizing the deviation from a prescribed priority parameter $\alpha_p$ denoting the flow assignment ratio $f(\rho_k^j,\rho_k^l)=\alpha_p f(\rho_k^i,\rho_k^l)$. This priority equation is relaxed if it conflicts with flow maximization. The reader is referred to [10,12] for a detailed description of the merge model.

    The structures of the diverge and merge models are similar, in the sense that both maximize the throughput while minimizing the deviation from the prescribed distribution or priority parameters. Therefore, the remainder of this article focuses on deriving the linear traffic model and analyzing the performance of the KF on a road network with a diverge junction. The analysis can be transferred to the merge case by combining the merge junction solver with the switched linear representation of the CTM, exploring the properties of the resulting state transition matrices as in the diverge case, and analyzing the effect of these properties on the boundedness of the Kalman gain and the mean estimation error.
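    For illustration only, the analogous merge behavior can be sketched with the same maximize-throughput-then-relax-the-ratio logic. This closed form is our own sketch of the behavior described above, not the model from [10,12], which the reader should consult for the authoritative formulation.

```python
def merge_fluxes(s_i, s_j, r_l, alpha_p):
    """Sketch of a merge junction: maximize throughput, then track the
    priority ratio f_j = alpha_p * f_i as closely as the capacities allow.

    s_i, s_j : sending capacities of the upstream cells i and j
    r_l      : receiving capacity of the downstream cell l
    alpha_p  : prescribed priority parameter
    """
    if s_i + s_j <= r_l:
        # Downstream can absorb all demand: both cells send at capacity
        return s_i, s_j
    # Downstream is the bottleneck: split r_l according to the priority ratio
    f_i = r_l / (alpha_p + 1.0)
    f_j = alpha_p * r_l / (alpha_p + 1.0)
    if f_i > s_i:       # cell i cannot supply its share; cell j sends the rest
        return s_i, r_l - s_i
    if f_j > s_j:       # cell j cannot supply its share; cell i sends the rest
        return r_l - s_j, s_j
    return f_i, f_j
```

    The symmetry with the diverge case (sending capacities now play the role the receiving capacities played before) is exactly what allows the analysis to be transferred as described above.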

    As stated in Section 1.2, when monitoring traffic on large-scale networks, the road network is partitioned into local sections which are either one-dimensional or contain a junction. The traffic states on the one-dimensional local sections evolve according to the SMM, and the SMM-J is used to describe the evolution of traffic states on local sections with junctions.

    As shown in Figure 4, consider a local section with $n$ cells, three links and a junction. The numbers of cells on the links are $n_1$, $n_2$, and $n_3$, respectively, with $n_1+n_2+n_3=n$. The state variable at time $k\in\mathbb{N}$ is constructed as $\rho_k=(\rho_k^1,\ldots,\rho_k^{n_1},\rho_k^{n_1+1},\ldots,\rho_k^{n_1+n_2},\rho_k^{n_1+n_2+1},\ldots,\rho_k^n)$. As a common treatment [28,30,35,38,39], the boundary flows, denoted by $\phi_k^1$, $\phi_k^2$, and $\phi_k^3$, are considered to be deterministic system inputs (please refer to [3] for the concept of using ghost cells to compute boundary flows from boundary state measurements). The SMM-J describes the evolution of $\rho_k$ using switched linear dynamics, and is derived under the following assumptions:

    Figure 4.  A local section with n cells, three links and a junction.

    (A.1): For each local section, there is at most one transition between freeflow and congestion in each of the three links.

    (A.2): The three connecting cells forming the junction (i.e., cells $n_1$, $n_1+1$ and $n_1+n_2+1$ in Figure 4) are either all in freeflow or all in congestion.

    Assumption (A.1) is practically meaningful since the local sections are usually partitioned to be sufficiently short that at most one queue is growing or dissipating within each link; this is also a commonly used assumption in the SMM [28,30,35,37,39]. Assumption (A.2) is imposed to simplify the model by reducing the number of modes considered. When (A.2) is relaxed, the number of modes grows greatly without yielding new insights into the estimation performance at junctions; the analysis in this work still holds, and the only difference is that a much larger number of modes must be considered.

    Given the assumptions stated above, on each local section with a junction, the SMM-J may switch among 32 modes (defined in Table 1) depending on (i) the freeflow/congestion status of the boundary cells and the connecting cells near the junction, and (ii) the flux solution of the junction solver characterized by the three scenarios shown in Figure 3.

    Table 1.  Mode definition and observability of the SMM-J.
    Mode | F/C¹ status of: cell 1, cell n1+n2, cell n, cells near junction² | Transition³ on: link 1, link 2, link 3 | Diverge case | Observability⁴
    1 F F F F none none none O
    2 F F F C Sh. Ep. Ep. U
    3 F F F C Sh. Ep. Ep. U
    4 F F F C Sh. Ep. Ep. U
    5 C C C C none none none U
    6 C C C C none none none U
    7 C C C C none none none U
    8 C C C F Ep. Sh. Sh. U
    9 F C C C Sh. none none U
    10 F C C C Sh. none none U
    11 F C C C Sh. none none U
    12 F C C F none Sh. Sh. U
    13 C C F C none none Ep. U
    14 C C F C none none Ep. U
    15 C C F C none none Ep. U
    16 C C F F Ep. Sh. none U
    17 C F C C none Ep. none U
    18 C F C C none Ep. none U
    19 C F C C none Ep. none U
    20 C F C F Ep. none Sh. U
    21 C F F F Ep. none none O
    22 C F F C none Ep. Ep. U
    23 C F F C none Ep. Ep. U
    24 C F F C none Ep. Ep. U
    25 F C F F none Sh. none U
    26 F C F C Sh. none Ep. U
    27 F C F C Sh. none Ep. U
    28 F C F C Sh. none Ep. U
    29 F F C F none none Sh. U
    30 F F C C Sh. Ep. none U
    31 F F C C Sh. Ep. none U
    32 F F C C Sh. Ep. none U
    1 "F" and "C" stand for freeflow and congestion, respectively.
    2 Cells indexed by n1, n1+1 and n1+n2+1.
    3 "Sh." and "Ep." stand for shock (i.e., transition from freeflow to congestion) and expansion fan (i.e., transition from congestion to freeflow), respectively.
    4 "O" stands for uniformly completely observable and "U" stands for unobservable. Note that the observability results are derived under sensor locations shown in Figure 4.


    In the SMM-J, the density update scheme for the interior cells in each link (i.e., cells indexed by $l\in\{2,\ldots,n_1-1\}\cup\{n_1+2,\ldots,n_1+n_2-1\}\cup\{n_1+n_2+2,\ldots,n-1\}$) is given by (8), where the flow between two adjacent cells is computed according to (9). For the three boundary cells, their density updates are given as follows:

    \[
    \begin{aligned}
    \rho_1^{k+1}&=\rho_1^k+\frac{\Delta t}{\Delta x}\left(\phi_1^k-f(\rho_1^k,\rho_2^k)\right),\\
    \rho_{n_1+n_2}^{k+1}&=\rho_{n_1+n_2}^k+\frac{\Delta t}{\Delta x}\left(f(\rho_{n_1+n_2-1}^k,\rho_{n_1+n_2}^k)-\phi_2^k\right),\\
    \rho_n^{k+1}&=\rho_n^k+\frac{\Delta t}{\Delta x}\left(f(\rho_{n-1}^k,\rho_n^k)-\phi_3^k\right),
    \end{aligned}
    \]

    where $f(\rho_1^k,\rho_2^k)$, $f(\rho_{n_1+n_2-1}^k,\rho_{n_1+n_2}^k)$ and $f(\rho_{n-1}^k,\rho_n^k)$ are also obtained from (9). The density update scheme for the three cells near the junction reads:

    \[
    \begin{aligned}
    \rho_{n_1}^{k+1}&=\rho_{n_1}^k+\frac{\Delta t}{\Delta x}\left(f(\rho_{n_1-1}^k,\rho_{n_1}^k)-f(\rho_{n_1}^k,\rho_{n_1+1}^k)-f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)\right),\\
    \rho_{n_1+1}^{k+1}&=\rho_{n_1+1}^k+\frac{\Delta t}{\Delta x}\left(f(\rho_{n_1}^k,\rho_{n_1+1}^k)-f(\rho_{n_1+1}^k,\rho_{n_1+2}^k)\right),\\
    \rho_{n_1+n_2+1}^{k+1}&=\rho_{n_1+n_2+1}^k+\frac{\Delta t}{\Delta x}\left(f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)-f(\rho_{n_1+n_2+1}^k,\rho_{n_1+n_2+2}^k)\right),
    \end{aligned}
    \]

    where $f(\rho_{n_1-1}^k,\rho_{n_1}^k)$, $f(\rho_{n_1+1}^k,\rho_{n_1+2}^k)$, and $f(\rho_{n_1+n_2+1}^k,\rho_{n_1+n_2+2}^k)$ are computed by (9), and the flows between neighboring cells across the junction (i.e., $f(\rho_{n_1}^k,\rho_{n_1+1}^k)$ and $f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)$) are computed based on the junction solver discussed in Section 3.1.
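    The flux computations above can be sketched in code. The following minimal Python sketch assumes the standard Godunov/CTM sending-receiving flux form $\min(s,r)$ for (9) and the three diverge cases of the junction solver; all parameter values are illustrative, not taken from the paper:

```python
# Illustrative triangular-fundamental-diagram parameters (assumptions, not from the paper).
vm, w, rho_max = 1.0, 0.5, 1.0          # freeflow speed, congestion wave speed, jam density
qm = vm * w * rho_max / (vm + w)        # capacity of the triangular diagram
alpha_d = 0.5                            # prescribed distribution ratio at the diverge

def sending(rho):
    """Sending capacity s(rho)."""
    return min(vm * rho, qm)

def receiving(rho):
    """Receiving capacity r(rho)."""
    return min(qm, w * (rho_max - rho))

def link_flux(rho_up, rho_dn):
    """Godunov flux between two adjacent cells within a link, as in (9)."""
    return min(sending(rho_up), receiving(rho_dn))

def junction_fluxes(rho_in, rho_out2, rho_out3):
    """Diverge solver: fluxes from cell n1 into links 2 and 3 (cases I-III)."""
    s = sending(rho_in)
    r2, r3 = receiving(rho_out2), receiving(rho_out3)
    if s >= r2 + r3:                     # case I: both outgoing links restrict the flow
        return r2, r3
    f2, f3 = s / (1 + alpha_d), alpha_d * s / (1 + alpha_d)
    if f2 <= r2 and f3 <= r3:            # case II: the ratio alpha_d is followed exactly
        return f2, f3
    if f2 > r2:                          # case III, as in (17)/(24)
        return r2, s - r2
    return s - r3, r3                    # case III, as in (18)/(25)
```

    In every case the two outgoing fluxes sum to $\min(s, r_2+r_3)$, which is the vehicle-conservation property later exploited in properties (P.2)-(P.3).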

    When applying the triangular fundamental diagram (10) to compute the flow across the cells, the traffic state ρk in a local section evolves with linear dynamics in each mode stated in Table 1, forming a hybrid system:

    \[
    \rho^{k+1}=A_k\rho^k+B_\rho^k\mathbf{1}\varrho_m+B_q^k\mathbf{1}q_m+B_\phi^k\phi^k, \quad (19)
    \]

    where 1 is the vector of all ones, the vector ϕk=(ϕ1k,ϕ2k,ϕ3k), and AkRn×n, BρkRn×n, BqkRn×n, BϕkRn×3 are constructed based on the current mode of the local section, the locations of the shocks and expansion fans (if they exist), and the moving direction of the shocks.

    The next two examples demonstrate the construction of matrices Ak, Bρk, Bqk and Bϕk. Specifically, the construction of matrices Ak provides insights on the properties of the state transition matrices of the SMM-J that reflect the intrinsic physical properties of the traffic model. These properties are critical in proving the estimation error bounds of the KF on traffic networks. Before showing the examples, we first introduce some notation which will be used as elements of the matrices to be constructed. For pN+, define ΘpRp×p and ΔpRp×p by their (i,j)th entries as

    \[
    \Theta_p(i,j)=\begin{cases}1-\frac{v_m\Delta t}{\Delta x} & \text{if } i=j\\[2pt] \frac{v_m\Delta t}{\Delta x} & \text{if } i=j+1\\[2pt] 0 & \text{otherwise,}\end{cases} \quad (20)
    \]
    \[
    \Delta_p(i,j)=\begin{cases}1-\frac{w\Delta t}{\Delta x} & \text{if } i=j\\[2pt] \frac{w\Delta t}{\Delta x} & \text{if } i=j-1\\[2pt] 0 & \text{otherwise.}\end{cases} \quad (21)
    \]

    For $p_1,p_2,p_3,p_4\in\mathbb{N}^+$, define $E^{p_1,p_2}_{p_3,p_4}\in\mathbb{R}^{p_1\times p_2}$ as the $p_1\times p_2$ matrix with all entries zero except its $(p_3,p_4)$th entry, which is one. Explicitly,

    \[
    E^{p_1,p_2}_{p_3,p_4}(i,j)=\begin{cases}1 & \text{if } i=p_3 \text{ and } j=p_4\\ 0 & \text{otherwise.}\end{cases} \quad (22)
    \]

    Moreover, define $\tilde\Theta_p\in\mathbb{R}^{p\times p}$ and $\tilde\Delta_p\in\mathbb{R}^{p\times p}$ as:

    \[
    \tilde\Theta_p=\begin{cases}\begin{pmatrix}\Theta_{p-1} & 0_{p-1,1}\\ \frac{v_m\Delta t}{\Delta x}E^{1,p-1}_{1,p-1} & 1\end{pmatrix} & \text{if } p\ge 2\\ 1 & \text{if } p=1\end{cases}
    \]

    and

    \[
    \tilde\Delta_p=\begin{cases}\begin{pmatrix}1 & \frac{w\Delta t}{\Delta x}E^{1,p-1}_{1,1}\\ 0_{p-1,1} & \Delta_{p-1}\end{pmatrix} & \text{if } p\ge 2\\ 1 & \text{if } p=1.\end{cases}
    \]
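    A small numpy sketch of the building blocks above, under the reconstruction given here; the values of $v_m$, $w$, $\Delta t$, $\Delta x$ are illustrative and chosen to satisfy the CFL condition:

```python
import numpy as np

vm, w, dt, dx = 1.0, 0.5, 0.5, 1.0   # illustrative values; vm*dt/dx <= 1 (CFL)

def Theta(p):
    """Freeflow block (20): 1 - vm*dt/dx on the diagonal, vm*dt/dx below it."""
    return (1 - vm * dt / dx) * np.eye(p) + (vm * dt / dx) * np.eye(p, k=-1)

def Delta(p):
    """Congestion block (21): 1 - w*dt/dx on the diagonal, w*dt/dx above it."""
    return (1 - w * dt / dx) * np.eye(p) + (w * dt / dx) * np.eye(p, k=1)

def E(p1, p2, p3, p4):
    """Selector matrix (22): p1 x p2, single 1 at entry (p3, p4) (1-indexed)."""
    M = np.zeros((p1, p2))
    M[p3 - 1, p4 - 1] = 1.0
    return M

def Theta_tilde(p):
    """As reconstructed above: the last cell's outflow is a boundary input."""
    if p == 1:
        return np.ones((1, 1))
    M = np.zeros((p, p))
    M[:p - 1, :p - 1] = Theta(p - 1)
    M[p - 1, p - 2] = vm * dt / dx
    M[p - 1, p - 1] = 1.0
    return M
```

    Every column of `Theta(p)` and `Delta(p)` sums to at most one, which is the substance of properties (P.1)-(P.2) below.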

    Example 1 (System dynamics of the SMM-J under Mode 1). Consider the local section in Figure 4 where all the cells are in freeflow. Within a single link, the flux between two neighboring cells indexed by $i$ and $i+1$ is given by $f(\rho_i^k,\rho_{i+1}^k)=v_m\rho_i^k$. At the junction, it holds that $s(\rho_{n_1}^k)=v_m\rho_{n_1}^k<r(\rho_{n_1+1}^k)+r(\rho_{n_1+n_2+1}^k)=2q_m$. Also, noting that $s(\rho_{n_1}^k)\le r(\rho_{n_1+1}^k)=q_m$ and $s(\rho_{n_1}^k)\le r(\rho_{n_1+n_2+1}^k)=q_m$, the distribution ratio $\alpha_d$ can be followed exactly. Hence, the state is under Mode 1 at time $k$, and the junction solver computes fluxes $f(\rho_{n_1}^k,\rho_{n_1+1}^k)$ and $f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)$ according to diverge case Ⅱ (16) as follows:

    \[
    f(\rho_{n_1}^k,\rho_{n_1+1}^k)=\frac{\rho_{n_1}^k v_m}{1+\alpha_d},\qquad f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)=\frac{\alpha_d\,\rho_{n_1}^k v_m}{1+\alpha_d}.
    \]

    Substituting the flows computed above into the update scheme of the traffic density on each cell, it follows that the explicit forms of $A_k$, $B_\rho^k$, $B_q^k$, and $B_\phi^k$ in (19) are

    \[
    A_k=\begin{pmatrix}\Theta_{n_1} & & \\ \frac{v_m\Delta t}{(1+\alpha_d)\Delta x}E^{n_2,n_1}_{1,n_1} & \tilde\Theta_{n_2} & \\ \frac{\alpha_d v_m\Delta t}{(1+\alpha_d)\Delta x}E^{n_3,n_1}_{1,n_1} & & \tilde\Theta_{n_3}\end{pmatrix},\qquad
    B_\rho^k=B_q^k=0_{n,n},\qquad
    B_\phi^k=\frac{\Delta t}{\Delta x}\left(E^{n,3}_{1,1}-E^{n,3}_{n_1+n_2,2}-E^{n,3}_{n,3}\right) \quad (23)
    \]

    in Mode 1. Note that in the above definitions and for the remainder of this subsection, blocks in the matrices which are left blank are zeros everywhere.
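    To make the block structure of (23) concrete, the following sketch assembles a Mode-1 $A_k$ for a tiny section ($n_1=n_2=n_3=3$) and checks properties (P.1)-(P.2) numerically; the parameter values and the form of $\tilde\Theta$ follow the reconstruction above and are assumptions, not values from the paper:

```python
import numpy as np

vm, dt, dx, alpha_d = 1.0, 0.5, 1.0, 0.5
n1 = n2 = n3 = 3
n = n1 + n2 + n3
c = vm * dt / dx                       # Courant number, <= 1 by the CFL condition

def theta(p):                          # freeflow block (20)
    return (1 - c) * np.eye(p) + c * np.eye(p, k=-1)

def theta_tilde(p):                    # last cell's outflow is a boundary input
    M = theta(p)
    M[p - 1, p - 1] = 1.0
    return M

A = np.zeros((n, n))
A[:n1, :n1] = theta(n1)
A[n1:n1 + n2, n1:n1 + n2] = theta_tilde(n2)
A[n1 + n2:, n1 + n2:] = theta_tilde(n3)
A[n1, n1 - 1] = c / (1 + alpha_d)                  # junction flux into link 2
A[n1 + n2, n1 - 1] = alpha_d * c / (1 + alpha_d)   # junction flux into link 3

# (P.1): every entry lies in [0, 1]; (P.2): every column sums to at most 1.
assert A.min() >= 0 and A.max() <= 1
assert (A.sum(axis=0) <= 1 + 1e-12).all()
```

    The column of the junction cell $n_1$ sums to exactly one: the $1-c$ on the diagonal plus the two coupling entries $c/(1+\alpha_d)$ and $\alpha_d c/(1+\alpha_d)$, reflecting vehicle conservation through the diverge.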

    Example 2 (System dynamics of the SMM-J under Modes 2-4). Consider the local section in Figure 4 where the three boundary cells indexed by 1, n1+n2 and n are all in freeflow, and the three cells near the junction, i.e., the cells indexed by n1, n1+1 and n1+n2+1 are in congestion. Given the assumption that there is at most one transition between freeflow and congestion in each of the three links connecting the junction, it can be concluded that there is a shock (i.e., transition from freeflow to congestion) on link 1, while link 2 and link 3 each has an expansion fan. Let l1 be the location of the shock on link 1, i.e., the transition from freeflow to congestion on link 1 occurs between cell l1 and l1+1. Moreover, we define

    \[
    \tilde l_1=\begin{cases}l_1 & \text{if the shock has positive velocity or is stationary}\\ l_1-1 & \text{if the shock has negative velocity,}\end{cases}
    \]

    and

    \[
    \hat l_1=n_1-1-\tilde l_1,
    \]

    which are later used to simplify the notation when defining matrices $A_k$, $B_\rho^k$, $B_q^k$, and $B_\phi^k$. Similarly, denote by $l_2$ (resp. $l_3$) the location of the expansion fan on link 2 (resp. link 3), i.e., the transition from congestion to freeflow on link 2 (resp. link 3) occurs between cells $l_2$ and $l_2+1$ (resp. cells $l_3$ and $l_3+1$). To simplify the notation, we also define

    \[
    \tilde l_2=l_2-n_1,\quad \hat l_2=n_2-\tilde l_2,\quad \tilde l_3=l_3-n_1-n_2,\quad \hat l_3=n_3-\tilde l_3.
    \]

    At the junction, the sending capacity of cell $n_1$ is $s(\rho_{n_1}^k)=q_m$, and the receiving capacities of cell $n_1+1$ and cell $n_1+n_2+1$ are $r(\rho_{n_1+1}^k)=w(\varrho_m-\rho_{n_1+1}^k)$ and $r(\rho_{n_1+n_2+1}^k)=w(\varrho_m-\rho_{n_1+n_2+1}^k)$, respectively. Depending on the magnitudes of $s(\rho_{n_1}^k)$, $r(\rho_{n_1+1}^k)$, and $r(\rho_{n_1+n_2+1}^k)$, the junction solver follows one of the three possible scenarios shown in Figure 3.

    1. Diverge case Ⅰ: when $s(\rho_{n_1}^k)\ge r(\rho_{n_1+1}^k)+r(\rho_{n_1+n_2+1}^k)$. In this case, the state is under Mode 2 at time $k$, and the junction solver computes fluxes $f(\rho_{n_1}^k,\rho_{n_1+1}^k)$ and $f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)$ according to diverge case Ⅰ (15) as follows:

    \[
    \begin{aligned}
    f(\rho_{n_1}^k,\rho_{n_1+1}^k)&=r(\rho_{n_1+1}^k)=w(\varrho_m-\rho_{n_1+1}^k),\\
    f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)&=r(\rho_{n_1+n_2+1}^k)=w(\varrho_m-\rho_{n_1+n_2+1}^k).
    \end{aligned}
    \]

    Hence in Mode 2, the explicit forms of Ak, Bρk, Bqk, and Bϕk in (19) are

    Ak=(Θ˜l1vmΔtΔxE1,˜l11,˜l11wΔtΔxE1,11,ˆl1Δˆl1wΔtΔxEˆl1,1ˆl1,˜l2wΔtΔxEˆl1,1ˆl1,˜l3Δ˜l2˜Θˆl2Δ˜l3˜Θˆl3),
    Bρk=wΔtΔx(E˜l1+1,˜l1+2n,nEn1,n1+n2+1n,n+El2,l2n,n+El3,l3n,n),Bqk=ΔtΔx(El2,l2n,n+El2+1,l2n,nEl3,l3n,n+El3+1,l3n,n),Bϕk=ΔtΔx(E1,1n,3En1+n2,2n,3En,3n,3).

    2. Diverge case Ⅱ: when $s(\rho_{n_1}^k)<r(\rho_{n_1+1}^k)+r(\rho_{n_1+n_2+1}^k)$, and the prescribed distribution ratio $\alpha_d$ can be followed exactly. In this case, the state is under Mode 3 at time $k$, and the junction solver computes fluxes $f(\rho_{n_1}^k,\rho_{n_1+1}^k)$ and $f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)$ according to diverge case Ⅱ (16) as follows:

    \[
    f(\rho_{n_1}^k,\rho_{n_1+1}^k)=\frac{1}{\alpha_d+1}s(\rho_{n_1}^k)=\frac{q_m}{\alpha_d+1},\qquad
    f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)=\frac{\alpha_d}{\alpha_d+1}s(\rho_{n_1}^k)=\frac{\alpha_d\,q_m}{\alpha_d+1}.
    \]

    In Mode 3, the explicit forms of Ak, Bρk, Bqk, and Bϕk in (19) are

    Ak=(Θ˜l1vmΔtΔxE1,˜l11,˜l11wΔtΔxE1,11,ˆl1Δˆl1˜Δ˜l2˜Θˆl2˜Δ˜l3˜Θˆl3),
    Bρk=wΔtΔx(E˜l1+1,˜l1+2n,n+En1,n1n,nEn1+1,n1+2n,n+El2,l2n,nEn1+n2+1,n1+n2+2n,n+El3,l3n,n),Bqk=ΔtΔx(En1+n1n,n+11+αdEn1+1,n1n,n+αd1+αdEn1+n2+1,n1n,nEl2,l2n,n+El2+1,l2n,nEl3,l3n,n+El3+1,l3n,n),Bϕk=ΔtΔx(E1,1n,3En1+n2,2n,3En,3n,3).

    3. Diverge case Ⅲ: when $s(\rho_{n_1}^k)<r(\rho_{n_1+1}^k)+r(\rho_{n_1+n_2+1}^k)$, but the prescribed distribution ratio $\alpha_d$ cannot be followed exactly. In this case, the state is under Mode 4 at time $k$. Depending on the magnitudes of $s(\rho_{n_1}^k)$, $r(\rho_{n_1+1}^k)$ and $r(\rho_{n_1+n_2+1}^k)$, the fluxes $f(\rho_{n_1}^k,\rho_{n_1+1}^k)$ and $f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)$ computed by the junction solver are either obtained from (17), i.e.,

    \[
    \begin{aligned}
    f(\rho_{n_1}^k,\rho_{n_1+1}^k)&=r(\rho_{n_1+1}^k)=w(\varrho_m-\rho_{n_1+1}^k),\\
    f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)&=s(\rho_{n_1}^k)-r(\rho_{n_1+1}^k)=q_m-w(\varrho_m-\rho_{n_1+1}^k),
    \end{aligned} \quad (24)
    \]

    or obtained from (18), i.e.,

    \[
    \begin{aligned}
    f(\rho_{n_1}^k,\rho_{n_1+1}^k)&=s(\rho_{n_1}^k)-r(\rho_{n_1+n_2+1}^k)=q_m-w(\varrho_m-\rho_{n_1+n_2+1}^k),\\
    f(\rho_{n_1}^k,\rho_{n_1+n_2+1}^k)&=r(\rho_{n_1+n_2+1}^k)=w(\varrho_m-\rho_{n_1+n_2+1}^k).
    \end{aligned} \quad (25)
    \]

    For conciseness of the presentation, we provide next the explicit formulas of Ak, Bρk, Bqk, and Bϕk when the fluxes are computed according to (24), and the construction of the system dynamics when the fluxes are given by (25) can be done in a similar fashion. The matrices Ak, Bρk, Bqk, and Bϕk read

    Ak=(Θ˜l1vmΔtΔxE1,˜l11,˜l11wΔtΔxE1,11,ˆl1Δˆl1wΔtΔxEˆl1,1ˆl1,˜l2Δ˜l2˜Θˆl2wΔtΔxE1,1˜l3,˜l2˜Δ˜l3˜Θˆl3),
    Bρk=wΔtΔx(E˜l1+1,˜l1+2n,n+El2,l2n,nEn1+n2+1,n1+n2+2n,nEn1+n2+1,n1+1n,n+El3,l3n,n),Bqk=ΔtΔx(El2,l2n,n+El2+1,l2n,n+En1+n2+1,n1+1n,nEl3,l3n,n+El3+1,l3n,n),Bϕk=ΔtΔx(E1,1n,3En1+n2,2n,3En,3n,3).

    Some properties of the state transition matrix Ak are summarized below; as demonstrated in Section 4.1, they play an important role in proving the boundedness of the Kalman gain (a necessary condition for the ultimate boundedness of the mean estimation error) when using the KF to estimate traffic conditions based on the SMM-J.

    (P.1): For $A_k$ in all modes, each element satisfies

    \[
    0\le A_k(r,c)\le 1,\quad\text{for all } k\in\mathbb{N} \text{ and } r,c\in\{1,\ldots,n\}.
    \]

    This property is due to the CFL condition [33] in the discretization scheme (8).

    (P.2): When $A_k$ is derived under diverge case Ⅰ and diverge case Ⅱ, the sum of the elements of $A_k$ in the same column $c$ satisfies

    \[
    \sum_{r=1}^{n}A_k(r,c)\le 1,\quad\text{for all } k\in\mathbb{N} \text{ and } c\in\{1,\ldots,n\}.
    \]

    This property is due to the CFL condition as in (P.1) and the conservation law embedded in the traffic model.

    (P.3): When $A_k$ is derived under diverge case Ⅲ, the sum of the elements of $A_k$ in the same column $c$ satisfies

    \[
    \sum_{r=1}^{n}A_k(r,c)\le 1,\quad\text{for all } k\in\mathbb{N} \text{ and } c\in\{1,\ldots,n\}\setminus\{\ell\},
    \]

    where $\ell=n_1+1$ if $f(\rho_{n_1}^k,\rho_{n_1+1}^k)=r(\rho_{n_1+1}^k)$ and $\ell=n_1+n_2+1$ if $f(\rho_{n_1}^k,\rho_{n_1+1}^k)=s(\rho_{n_1}^k)-r(\rho_{n_1+n_2+1}^k)$. Moreover, it also holds that for $A_k$ under diverge case Ⅲ,

    \[
    A_k(r,c)=0,\quad\text{for all } k\in\mathbb{N},\ r\in\{n_1+1,\ldots,n\} \text{ and } c\in\{1,\ldots,n_1\}.
    \]

    This is due to the facts that (i) the flows from cell n1 at the junction to the two downstream cells (i.e., cell n1+1 and cell n1+n2+1) do not depend on the densities of the cells on link 1, as shown in (24)-(25), and (ii) the internal flows between adjacent cells on link 2 and link 3 are also independent of the densities of the cells on link 1.

    Based on the above properties, the following lemma derives the bounds on each entry of the product of the state transition matrices, which will later be applied to prove the boundedness of the Kalman gain (see Lemma 3).

    Lemma 2. Consider the local section in Figure 4 that stays inside a SMM-J mode while $k\in(k_0,k_1]$,4 where $0\le k_0<k_1\le+\infty$. Recall from (6) that the product of the state transition matrices is defined as

    4Recall that the time instant $k\in\mathbb{N}$; hence $k\in(k_0,k_1]$ denotes $k\in\{k_0+1,\ldots,k_1\}$.

    \[
    \Xi_{k+1,k_0+1}=\prod_{\kappa=k}^{k_0+1}A_\kappa,\quad\text{for } k\in(k_0,k_1]. \quad (26)
    \]

    The $(i,j)$th entry of $\Xi_{k+1,k_0+1}$ satisfies

    \[
    0\le\Xi_{k+1,k_0+1}(i,j)\le 1\quad\text{for all } k\in(k_0,k_1] \text{ and } i,j\in\{1,\ldots,n\}. \quad (27)
    \]

    Proof. The proof applies properties (P.1)-(P.3), and is reported in Appendix.2.
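    Lemma 2 can also be sanity-checked numerically: products of nonnegative matrices whose columns sum to at most one (properties (P.1)-(P.2)) keep every entry in $[0,1]$. A sketch with a synthetic column-substochastic matrix (not an actual SMM-J mode):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 9
A = rng.uniform(size=(n, n))
# Scale each column so its sum lies in (0, 1]: A becomes column-substochastic,
# mimicking properties (P.1)-(P.2) of the SMM-J state transition matrices.
A /= A.sum(axis=0) / rng.uniform(0.7, 1.0, size=n)

P = np.eye(n)                  # Xi_{k+1, k0+1} built up one factor at a time
for _ in range(50):
    P = A @ P
    assert P.min() >= 0.0 and P.max() <= 1.0 + 1e-12
```

    The argument is the one in Appendix.2: each entry of the product is a (sub-)convex combination of entries of the previous product, so it can never leave $[0,1]$.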

    From an estimation point of view, it is assumed that the sensors are located on the far ends of the three links connecting the junction (as illustrated in Figure 4), measuring the densities of the boundary cells of the local section (i.e., ρ1k, ρn1+n2k and ρnk).

    Incorporating model noise in the SMM-J (19) yields:

    \[
    \rho^{k+1}=A_k\rho^k+u^k+\omega^k,\qquad \rho^k\in\mathbb{R}^n, \quad (28)
    \]

    where $\omega^k\sim\mathcal{N}(0,Q_k)$ is the white Gaussian model noise, and the deterministic system input is defined as:

    \[
    u^k=B_\rho^k\mathbf{1}\varrho_m+B_q^k\mathbf{1}q_m+B_\phi^k\phi^k.
    \]

    The sensor measurements are modeled as follows:

    \[
    z^k=H_k\rho^k+v^k,\qquad z^k\in\mathbb{R}^3, \quad (29)
    \]

    where the observation matrix $H_k=E^{3,n}_{1,1}+E^{3,n}_{2,n_1+n_2}+E^{3,n}_{3,n}$, and $v^k\sim\mathcal{N}(0,R_k)$ is the white Gaussian measurement noise. Hence, as shown in (28)-(29), the system dynamics of the SMM-J is rewritten in the form of (1)-(2).
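    System (28)-(29) is now in standard linear-Gaussian form, so a textbook KF applies directly. A generic sketch follows; the matrices at the bottom are illustrative stand-ins, not an actual SMM-J mode, and the infinity norm tracked at the end is the quantity bounded in Lemma 3 below:

```python
import numpy as np

def kf_step(A, H, Q, R, u, z, x, P):
    """One KF predict/update cycle for x_{k+1} = A x_k + u_k + w_k, z_k = H x_k + v_k."""
    x_prior = A @ x + u                          # model prediction, as in (28)
    P_prior = A @ P @ A.T + Q
    S = H @ P_prior @ H.T + R                    # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)         # Kalman gain
    x_post = x_prior + K @ (z - H @ x_prior)     # measurement update, as in (29)
    P_post = (np.eye(len(x)) - K @ H) @ P_prior
    return x_post, P_post, K

# Illustrative two-state system with one measurement (an assumption for the demo).
A = np.array([[0.9, 0.1], [0.0, 0.9]])
H = np.array([[1.0, 0.0]])
Q, R = 0.01 * np.eye(2), np.array([[0.1]])
x, P = np.zeros(2), np.eye(2)
for z in [1.0, 1.1, 0.9]:
    x, P, K = kf_step(A, H, Q, R, np.zeros(2), np.array([z]), x, P)
gain_inf_norm = np.abs(K).sum(axis=1).max()      # ||K_k||_inf, cf. Lemma 3's footnote
```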

    The observability of system (28)-(29) under the different modes is listed in Table 1; it can be derived directly from the definition of observability stated in Section 2.1, i.e., by checking the boundedness of the information matrix. According to Table 1, most of the modes are not observable, except (i) when all cells in the local section are in freeflow, and (ii) when an expansion fan sits on link 1 and no other transitions between freeflow and congestion exist in the local section. From a physical viewpoint, the non-observability of the SMM-J is due to the irreversibility of the vehicle conservation law given the available sensor measurements in the presence of shocks, and to the presence of the junction. This indicates that, compared to the observability of the SMM [30], the issue of non-observability is more critical when junctions exist. For example, a one-dimensional road section where the traffic is in congestion everywhere is observable given measurements of the upstream boundary cell [30], while a congested local section with a junction is not observable even with measurements of all three boundary cells (as shown by the cases for Modes 5-7). The performance of the KF under uniformly completely observable systems is widely studied (as summarized in Lemma 1). In this article, we focus on analyzing the theoretical performance of the KF under the unobservable modes of the SMM-J.

    Challenges for estimating an unobservable system stem from the fact that the estimation error covariance can grow unbounded, and thus the mean estimation error also potentially diverges (as shown in Example 3 in the appendix). In this subsection, we show that when combining the physical properties of the traffic model (i.e., vehicle conservation and the flow-density relationship) with the update scheme of the KF, the mean estimates of all the cells in an unobservable local section are ultimately bounded inside $(-\epsilon,\varrho_m+\epsilon)$ for all $\epsilon>0$, provided that the density measurements of the three boundary cells are available. This ensures that the mean estimates of the KF for unobservable modes are always physically meaningful to within $\epsilon$. By comparison, it is shown in [36] that an open-loop observer may result in non-physical state estimates in unobservable modes.

    Note that the three conditions (C.1)-(C.3) in Lemma 1 are necessary for proving the properties of the KF for traffic estimation under unobservable systems. For system (28)-(29), we assume that conditions (C.1) and (C.2) can be ensured when setting up the parameters in the KF. It can also be directly verified that condition (C.3) always holds for all the modes of the SMM-J.

    Let $(k_0^U,k_1^U]$ be the time interval inside which the local section shown in Figure 4 stays in an unobservable mode of the SMM-J. In this subsection, we present a lemma stating the boundedness of the Kalman gain for $k\in(k_0^U,k_1^U]$, which is obtained based on the boundedness of the cross-covariance of the observable and unobservable subsystems in the Kalman observability canonical form (see Appendix.3). According to the KF update scheme in (4), the boundedness of the Kalman gain is a necessary condition for the boundedness of the state estimate.

    Lemma 3. Consider an unobservable local section shown in Figure 4. The local section stays inside an unobservable mode of the SMM-J while $k\in(k_0^U,k_1^U]$, where $0\le k_0^U<k_1^U\le+\infty$. Given density measurements of the three boundary cells, the infinity norm5 of the Kalman gain computed by the KF (3)-(4) satisfies

    5Recall that for a matrix $M\in\mathbb{R}^{p\times q}$, its infinity norm is defined as $\|M\|_\infty=\max_{r\in\{1,\ldots,p\}}\sum_{c=1}^{q}|M(r,c)|$.

    \[
    \|K_k\|_\infty\le h_k\!\left(\Gamma_{k_0^U|k_0^U}\right),\quad\text{for all } k\in(k_0^U,k_1^U], \quad (30)
    \]

    where $h_k(\cdot)$ is a function of the error covariance at time $k_0^U$.

    Proof. The explicit formula of $h_k(\Gamma_{k_0^U|k_0^U})$ is presented in Appendix.4.1, and is derived in Appendix.4.2.

    In the proof of Lemma 3, the Kalman gain is partitioned into blocks corresponding to the observable and unobservable subsystems (as shown in (49)). The part corresponding to the observable subsystem is a function of the estimation error covariance of the observable subsystem (see (50)), thus its boundedness is relatively straightforward to justify. On the other hand, the block of the Kalman gain that corresponds to the unobservable subsystem is a function of the cross-covariance of the observable and the unobservable subsystems (see (51)). By exploring the interaction between the evolution equation of the cross-covariance (shown in (48)) and the physical properties of the traffic model (reflected in (P.1)-(P.3)), we also derive the boundedness of the unobservable block of the Kalman gain. In summary, the combination of the update scheme of the KF and the intrinsic properties of the traffic model is critical in showing the boundedness of the Kalman gain under unobservable modes.

    In this subsection, we show that for an unobservable local section, the mean estimate of each cell ultimately reaches the physically meaningful interval; thus the mean estimation error is also ultimately bounded. Unlike typical unobservable scenarios where the mean estimation error diverges (as shown in Example 3), the boundedness of the mean error here is ensured by the intrinsic physical properties of the traffic model, i.e., the relationship between the density and the sending/receiving capacity of each cell, as illustrated in the proof of the next proposition.

    Proposition 1. Consider an unobservable local section as shown in Figure 4 that stays inside an unobservable mode while $k\in(k_0^U,\infty)$. For all $\epsilon>0$, a finite time $t(\epsilon)$ exists such that $\bar\rho^{l}_{k|k}\in(-\epsilon,\varrho_m+\epsilon)$ for all $k>k_0^U+t(\epsilon)$ and for all $l\in\{1,\ldots,n\}$, independent of the initial estimate. Moreover, the mean estimation error satisfies $\|\bar\eta_{k|k}\|<n(\varrho_m+\epsilon)$ for all $k>k_0^U+t(\epsilon)$, independent of the initial estimate.

    Proof. Denote by $\bar\eta^{b}_{k|k}=(\bar\eta^{1}_{k|k},\bar\eta^{n_1+n_2}_{k|k},\bar\eta^{n}_{k|k})$ the mean error of the three boundary cells. Since the three boundary cells are all inside the observable subsystem, it follows that $\bar\eta^{b}_{k|k}\to 0$ as $k\to\infty$.

    The proof is by induction. In Step 1, we use an induction from cell 1 to the downstream cells to show that if the estimate of cell 1 converges to the true state, then the estimates of all cells will ultimately be greater than $-\epsilon$ for all $\epsilon>0$. In Step 2, we use an induction from the two downstream boundary cells (i.e., cell $n_1+n_2$ and cell $n$) to their upstream cells to show that if the estimates of the two downstream boundary cells converge to the true state, then the estimates of all cells will ultimately be smaller than $\varrho_m+\epsilon$ for all $\epsilon>0$. In Step 3, we combine Step 1 and Step 2 to derive an ultimate bound for the mean estimation error.

    Step 1. We use induction to show that for all $\epsilon>0$ and $l\in\{1,\ldots,n\}$, there exists a finite time $t_1^l(\epsilon)$ such that $\bar\rho^{l}_{k|k}>-\epsilon$ for all $k>k_0^U+t_1^l(\epsilon)$.

    Since the upstream boundary cell (i.e., cell 1) is in the observable subsystem, we have $\bar\eta^{1}_{k|k}\to 0$ and $\bar\rho^{1}_{k|k}\to\rho_1^k$, where $\rho_1^k\ge 0$. Hence a finite time $t_1^1(\epsilon)$ exists such that $\bar\rho^{1}_{k|k}>-\frac{\epsilon}{n}$ for all $k>k_0^U+t_1^1(\epsilon)$.

    For all interior cells on link 1, i.e., cells indexed by $l\in\{2,3,\ldots,n_1\}$, suppose $\bar\rho^{l-1}_{k|k}>-\frac{(l-1)\epsilon}{n}$; if $\bar\rho^{l}_{k|k}<-\frac{(l-1)\epsilon}{n}$, we obtain from (9) that

    \[
    f(\bar\rho^{l-1}_{k|k},\bar\rho^{l}_{k|k})=v_m\bar\rho^{l-1}_{k|k}>-\frac{v_m(l-1)\epsilon}{n}, \quad (31)
    \]
    \[
    f(\bar\rho^{l}_{k|k},\bar\rho^{l+1}_{k|k})\le v_m\bar\rho^{l}_{k|k}. \quad (32)
    \]

    It follows that the estimate of cell l satisfies

    \[
    \begin{aligned}
    \bar\rho^{l}_{k+1|k+1}&=\bar\rho^{l}_{k|k}+\frac{\Delta t}{\Delta x}\left(f(\bar\rho^{l-1}_{k|k},\bar\rho^{l}_{k|k})-f(\bar\rho^{l}_{k|k},\bar\rho^{l+1}_{k|k})\right)-K_{k+1}(l,1)\bar\eta^{1}_{k+1|k}-K_{k+1}(l,2)\bar\eta^{n_1+n_2}_{k+1|k}-K_{k+1}(l,3)\bar\eta^{n}_{k+1|k}\\
    &>\bar\rho^{l}_{k|k}+\frac{v_m\Delta t}{\Delta x}\left|\bar\rho^{l}_{k|k}+\frac{(l-1)\epsilon}{n}\right|-h_k\!\left(\Gamma_{k_0^U|k_0^U}\right)\|\bar\eta^{b}_{k|k}\|, \quad (33)
    \end{aligned}
    \]

    where the inequality is due to $\|K_k\|_\infty\le h_k(\Gamma_{k_0^U|k_0^U})$ given in Lemma 3. Thus there exists a scalar $v_1>\frac{\Delta x\,h_k(\Gamma_{k_0^U|k_0^U})}{v_m\Delta t}$ such that

    \[
    \bar\rho^{l}_{k+1|k+1}-\bar\rho^{l}_{k|k}>v_0\left|\bar\rho^{l}_{k|k}+\frac{(l-1)\epsilon}{n}\right|,\quad\text{for all } \left|\bar\rho^{l}_{k|k}+\frac{(l-1)\epsilon}{n}\right|\ge v_1\|\bar\eta^{b}_{k|k}\|, \quad (34)
    \]

    where $v_0=\frac{v_m\Delta t}{\Delta x}-\frac{h_k(\Gamma_{k_0^U|k_0^U})}{v_1}>0$. Also note that $\bar\eta^{b}_{k|k}\to 0$ as $k\to\infty$, which indicates that the one-step change of the estimates is ultimately positive, and large enough so that a finite time $t_1^l(\epsilon)$ exists such that $\bar\rho^{l}_{k|k}>-\frac{l\epsilon}{n}>-\epsilon$ for all $k>k_0^U+t_1^l(\epsilon)$ [21].

    We now show that for the two cells on the downstream side of the junction, i.e., cell $n_1+1$ and cell $n_1+n_2+1$, there exist finite times $t_1^{n_1+1}(\epsilon)$ and $t_1^{n_1+n_2+1}(\epsilon)$ such that $\bar\rho^{n_1+1}_{k|k}>-\frac{(n_1+1)\epsilon}{n}>-\epsilon$ for all $k>k_0^U+t_1^{n_1+1}(\epsilon)$ and $\bar\rho^{n_1+n_2+1}_{k|k}>-\frac{(n_1+1)\epsilon}{n}>-\epsilon$ for all $k>k_0^U+t_1^{n_1+n_2+1}(\epsilon)$. Suppose $\bar\rho^{n_1}_{k|k}>-\frac{n_1\epsilon}{n}$; if $\bar\rho^{n_1+1}_{k|k}<-\frac{n_1\epsilon}{n}$, the junction solver follows diverge case Ⅱ or diverge case Ⅲ. Hence, the flow from cell $n_1$ to cell $n_1+1$ satisfies:

    \[
    f(\bar\rho^{n_1}_{k|k},\bar\rho^{n_1+1}_{k|k})=\begin{cases}\frac{1}{\alpha_d+1}s(\bar\rho^{n_1}_{k|k})>-\frac{v_m n_1\epsilon}{n} & \text{diverge case II}\\[2pt] s(\bar\rho^{n_1}_{k|k})-r(\bar\rho^{n_1+n_2+1}_{k|k})>-\frac{v_m n_1\epsilon}{n} & \text{diverge case III,}\end{cases}
    \]

    and the outgoing flow for cell n1+1 satisfies:

    \[
    f(\bar\rho^{n_1+1}_{k|k},\bar\rho^{n_1+2}_{k|k})\le v_m\bar\rho^{n_1+1}_{k|k}.
    \]

    Following similar arguments as in (33)-(34), it can be concluded that there exists a finite time $t_1^{n_1+1}(\epsilon)$ such that $\bar\rho^{n_1+1}_{k|k}>-\frac{(n_1+1)\epsilon}{n}>-\epsilon$ for all $k>k_0^U+t_1^{n_1+1}(\epsilon)$. Applying the same analysis to cell $n_1+n_2+1$, it can be concluded that there exists a finite time $t_1^{n_1+n_2+1}(\epsilon)$ such that $\bar\rho^{n_1+n_2+1}_{k|k}>-\frac{(n_1+1)\epsilon}{n}>-\epsilon$ for all $k>k_0^U+t_1^{n_1+n_2+1}(\epsilon)$. Continuing the induction on link 2 from cell $n_1+1$ to cell $n_1+n_2$, we obtain that for all $\epsilon>0$ and $l\in\{n_1+1,n_1+2,\ldots,n_1+n_2\}$, there exists a finite time $t_1^l(\epsilon)$ such that $\bar\rho^{l}_{k|k}>-\frac{l\epsilon}{n}>-\epsilon$ for all $k>k_0^U+t_1^l(\epsilon)$. As for the cells on link 3, we proceed with the same induction from cell $n_1+n_2+1$ to cell $n$, which yields that for all $\epsilon>0$ and $l\in\{n_1+n_2+1,n_1+n_2+2,\ldots,n\}$, there exists a finite time $t_1^l(\epsilon)$ such that $\bar\rho^{l}_{k|k}>-\frac{(l-n_2)\epsilon}{n}>-\epsilon$ for all $k>k_0^U+t_1^l(\epsilon)$.

    Let $t_1(\epsilon)=\max_{l\in\{1,\ldots,n\}}\{t_1^l(\epsilon)\}$; it is concluded that $\bar\rho^{l}_{k|k}>-\epsilon$ for all $k>k_0^U+t_1(\epsilon)$ and $l\in\{1,\ldots,n\}$. This proves the ultimate lower bound of the estimates.

    Step 2. We use induction to show that for all $\epsilon>0$ and $l\in\{1,\ldots,n\}$, there exists a finite time $t_2^l(\epsilon)$ such that $\bar\rho^{l}_{k|k}<\varrho_m+\epsilon$ for all $k>k_0^U+t_2^l(\epsilon)$.

    Since the two downstream boundary cells (indexed by $n_1+n_2$ and $n$) are in the observable subsystem, we have $\bar\eta^{n_1+n_2}_{k|k}\to 0$ and $\bar\rho^{n_1+n_2}_{k|k}\to\rho_{n_1+n_2}^k$, as well as $\bar\eta^{n}_{k|k}\to 0$ and $\bar\rho^{n}_{k|k}\to\rho_n^k$. Given the facts that $\rho_{n_1+n_2}^k\le\varrho_m$ and $\rho_n^k\le\varrho_m$, there exist finite times $t_2^{n_1+n_2}(\epsilon)$ and $t_2^{n}(\epsilon)$ such that $\bar\rho^{n_1+n_2}_{k|k}<\varrho_m+\frac{\epsilon}{n}$ for all $k>k_0^U+t_2^{n_1+n_2}(\epsilon)$, and $\bar\rho^{n}_{k|k}<\varrho_m+\frac{\epsilon}{n}$ for all $k>k_0^U+t_2^{n}(\epsilon)$.

    For all interior cells on link 3, i.e., cells indexed by $l\in\{n_1+n_2+1,n_1+n_2+2,\ldots,n-1\}$, suppose $\bar\rho^{l+1}_{k|k}<\varrho_m+\frac{(n-l)\epsilon}{n}$; if $\bar\rho^{l}_{k|k}>\varrho_m+\frac{(n-l)\epsilon}{n}$, we obtain from (9) that

    \[
    f(\bar\rho^{l-1}_{k|k},\bar\rho^{l}_{k|k})\le w(\varrho_m-\bar\rho^{l}_{k|k}), \quad (35)
    \]
    \[
    f(\bar\rho^{l}_{k|k},\bar\rho^{l+1}_{k|k})=w(\varrho_m-\bar\rho^{l+1}_{k|k})>-\frac{w(n-l)\epsilon}{n}. \quad (36)
    \]

    It follows that the estimate of cell l satisfies

    \[
    \begin{aligned}
    \bar\rho^{l}_{k+1|k+1}&=\bar\rho^{l}_{k|k}+\frac{\Delta t}{\Delta x}\left(f(\bar\rho^{l-1}_{k|k},\bar\rho^{l}_{k|k})-f(\bar\rho^{l}_{k|k},\bar\rho^{l+1}_{k|k})\right)-K_{k+1}(l,1)\bar\eta^{1}_{k+1|k}-K_{k+1}(l,2)\bar\eta^{n_1+n_2}_{k+1|k}-K_{k+1}(l,3)\bar\eta^{n}_{k+1|k}\\
    &<\bar\rho^{l}_{k|k}-\frac{w\Delta t}{\Delta x}\left|\bar\rho^{l}_{k|k}-\varrho_m-\frac{(n-l)\epsilon}{n}\right|+h_k\!\left(\Gamma_{k_0^U|k_0^U}\right)\|\bar\eta^{b}_{k|k}\|. \quad (37)
    \end{aligned}
    \]

    Thus there exist scalars $w_1>\frac{\Delta x\,h_k(\Gamma_{k_0^U|k_0^U})}{w\Delta t}$ and $w_0=\frac{w\Delta t}{\Delta x}-\frac{h_k(\Gamma_{k_0^U|k_0^U})}{w_1}>0$ such that

    \[
    \bar\rho^{l}_{k+1|k+1}-\bar\rho^{l}_{k|k}<-w_0\left|\bar\rho^{l}_{k|k}-\varrho_m-\frac{(n-l)\epsilon}{n}\right|,\quad\text{for all } \left|\bar\rho^{l}_{k|k}-\varrho_m-\frac{(n-l)\epsilon}{n}\right|\ge w_1\|\bar\eta^{b}_{k|k}\|. \quad (38)
    \]

    Also note that $\bar\eta^{b}_{k|k}\to 0$ as $k\to\infty$, which indicates that the one-step change of the estimates is ultimately negative, and large enough in magnitude so that a finite time $t_2^l(\epsilon)$ exists such that $\bar\rho^{l}_{k|k}<\varrho_m+\frac{(n-l+1)\epsilon}{n}<\varrho_m+\epsilon$ for all $k>k_0^U+t_2^l(\epsilon)$.

    The above arguments can be generalized to all cells on link 2. Hence for all $\epsilon>0$ and $l\in\{n_1+1,n_1+2,\ldots,n_1+n_2-1\}$, there exists a finite time $t_2^l(\epsilon)$ such that $\bar\rho^{l}_{k|k}<\varrho_m+\frac{(n_1+n_2-l+1)\epsilon}{n}<\varrho_m+\epsilon$ for all $k>k_0^U+t_2^l(\epsilon)$.

    We now show that for the cell on the upstream side of the junction, i.e., cell $n_1$, there exists a finite time $t_2^{n_1}(\epsilon)$ such that $\bar\rho^{n_1}_{k|k}<\varrho_m+\frac{(n-n_1+1)\epsilon}{n}<\varrho_m+\epsilon$ for all $k>k_0^U+t_2^{n_1}(\epsilon)$. Suppose $\bar\rho^{n_1+1}_{k|k}<\varrho_m+\frac{n_2\epsilon}{n}$ and $\bar\rho^{n_1+n_2+1}_{k|k}<\varrho_m+\frac{n_3\epsilon}{n}$; if $\bar\rho^{n_1}_{k|k}>\varrho_m+\frac{(n-n_1)\epsilon}{n}=\varrho_m+\frac{(n_2+n_3)\epsilon}{n}$, the incoming and outgoing flows of cell $n_1$ satisfy

    \[
    f(\bar\rho^{n_1-1}_{k|k},\bar\rho^{n_1}_{k|k})\le w(\varrho_m-\bar\rho^{n_1}_{k|k}),
    \]

    and

    \[
    f(\bar\rho^{n_1}_{k|k},\bar\rho^{n_1+1}_{k|k})+f(\bar\rho^{n_1}_{k|k},\bar\rho^{n_1+n_2+1}_{k|k})=\begin{cases}s(\bar\rho^{n_1}_{k|k})=q_m>-\frac{w(n_2+n_3)\epsilon}{n} & \text{diverge case II or III}\\[2pt] r(\bar\rho^{n_1+1}_{k|k})+r(\bar\rho^{n_1+n_2+1}_{k|k})>-\frac{w(n_2+n_3)\epsilon}{n} & \text{diverge case I.}\end{cases}
    \]

    Following similar arguments as in (37)-(38), it can be concluded that there exists a finite time $t_2^{n_1}(\epsilon)$ such that $\bar\rho^{n_1}_{k|k}<\varrho_m+\frac{(n-n_1+1)\epsilon}{n}<\varrho_m+\epsilon$ for all $k>k_0^U+t_2^{n_1}(\epsilon)$. Continuing the induction from cell $n_1$ to cell 1, it follows that for all $l\in\{1,2,\ldots,n_1\}$ there exists a finite time $t_2^l(\epsilon)$ such that $\bar\rho^{l}_{k|k}<\varrho_m+\frac{(n-l+1)\epsilon}{n}\le\varrho_m+\epsilon$ for all $k>k_0^U+t_2^l(\epsilon)$.

    Let $t_2(\epsilon)=\max_{l\in\{1,\ldots,n\}}\{t_2^l(\epsilon)\}$; we obtain $\bar\rho^{l}_{k|k}<\varrho_m+\epsilon$ for all $k>k_0^U+t_2(\epsilon)$ and $l\in\{1,\ldots,n\}$. This proves the ultimate upper bound of the estimates.

    Step 3. Combining Steps 1 and 2 and defining $t(\epsilon)=\max\{t_1(\epsilon),t_2(\epsilon)\}$, we obtain $\bar\rho^{l}_{k|k}\in(-\epsilon,\varrho_m+\epsilon)$ for all $l\in\{1,\ldots,n\}$ and $k>k_0^U+t(\epsilon)$. Consequently, $\|\bar\eta_{k|k}\|<n(\varrho_m+\epsilon)$ for all $k>k_0^U+t(\epsilon)$.

    Proposition 1 indicates that when the mean estimation error of the three boundary cells converges to zero, the state estimates of all the interior cells are driven towards $[0,\varrho_m]$ due to the conservation law and the flow-density relationship embedded in the traffic model. For example, when the state estimate $\bar\rho^{l}_{k|k}$ is smaller than zero, the sending capacity of cell $l$ is much smaller than its receiving capacity (as shown in equations (31)-(32)). Consequently, the update equation of the estimate (33) ensures that the one-step change of the state estimate of cell $l$ is always positive, with magnitude proportional to the distance between zero and the current state estimate $\bar\rho^{l}_{k|k}$. This ensures that the estimate of cell $l$ is constantly pushed towards zero unless it is already sufficiently close to zero. The ultimate upper bound can be derived in the same fashion.

    In this article, we establish the theoretical performance of the KF applied to estimate the traffic density on transportation networks under unobservable scenarios. To facilitate the performance analysis of the KF, a linear SMM-J model is introduced which combines a junction solver with the switched linear system representation of the CTM. It is shown that in addition to the existence of shocks, the presence of junctions contributes significantly to the non-observability of the system.

    To derive the error bounds for the KF under unobservable traffic estimation problems, we analyze several properties of the state transition matrices of the SMM-J, which reflect the intrinsic physical properties (e.g., vehicle conservation and the CFL condition in the discretization scheme) of the traffic model. Based on the above properties of the SMM-J, we show that the infinity norm of the Kalman gain is uniformly bounded under unobservable modes. Finally, we show that the mean estimate of each cell is ultimately bounded inside the physically meaningful interval, provided that the density measurements of the boundary cells are available. The ultimate lower and upper bounds are derived based on the convergence (to zero) of the mean estimation error of the boundary cells, the boundedness of the Kalman gain, and the flow-density relationship embedded in the model prediction step of the KF. As indicated in the proof, feeding sensor data back to the estimator is critical to ensure physically meaningful estimates under unobservable systems, which cannot be naturally achieved by an open-loop observer. These results provide some theoretical insights into the performance of sequential estimation algorithms widely used in traffic monitoring applications.

    Example 3. Consider a linear discrete system describing the evolution of a moving object. The state vector is constructed as $\rho^k=(\rho_1^k,\rho_2^k,\rho_3^k)$, where $\rho_1^k$, $\rho_2^k$, and $\rho_3^k$ are the location, speed, and acceleration of the moving object at time $k\in\mathbb{N}$, respectively. The moving object travels with a constant acceleration. The system dynamics is given by

    \[
    \rho^{k+1}=A_k\rho^k+\omega^k,\qquad \rho^k\in\mathbb{R}^3, \quad (39)
    \]

    where $\omega^k\sim\mathcal{N}(0,Q_k)$; the state transition matrix and the model error covariance matrix are given as follows:

    \[
    A_k=\begin{pmatrix}1 & 1 & 0.5\\ 0 & 1 & 1\\ 0 & 0 & 1\end{pmatrix},\qquad Q_k=\begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix},\quad\text{for all } k.
    \]

    The initial state is $\rho^0=(2,1,0.05)$. The sensor measures the acceleration of the moving object, i.e., the measurement is modeled by

    \[
    z^k=H_k\rho^k+v^k,\qquad z^k\in\mathbb{R}, \quad (40)
    \]

    where

    \[
    H_k=\begin{pmatrix}0 & 0 & 1\end{pmatrix},\qquad v^k\sim\mathcal{N}(0,R_k)\ \text{with}\ R_k=1,\quad\text{for all } k.
    \]

    We use the KF (3)-(4) to estimate the state, where the initial condition is set to be $\bar\eta_{0|0}=(3,2,0.2)$ and $\Gamma_{0|0}=I_3$.

    The system (39)-(40) is not observable, which can be concluded by computing its observability matrix [4, Theorem 6.O1] and showing that the observability matrix is not full rank. The mean estimation error evolves according to the following equation:

    \[
    \bar\eta_{k|k}=(I-K_kH_k)A_{k-1}\bar\eta_{k-1|k-1}. \quad (41)
    \]

    In Figure 5a, the solid curve shows the analytical evolution of $\bar\eta_{k|k}$, which follows (41). A Monte Carlo test of $N_r=10{,}000$ realizations of the KF is also conducted, and the dashed curve in Figure 5a shows the empirical evolution of the estimation error $\hat\eta_{k|k}$, where $\hat\eta_{k|k}=\frac{1}{N_r}\sum_{r=1}^{N_r}\eta_{r,k|k}$, and $\eta_{r,k|k}$ is the posterior estimation error at time $k$ in the $r$th realization. We also plot in Figure 5b the trace of the estimation error covariance $\mathrm{tr}(\Gamma_{k|k})$. Unlike the observable scenarios described in Lemma 1, the error covariance and the estimation error diverge as $k$ increases in this example, which is typical for unobservable systems.

    Figure 5.  The evolutions of the estimation errors (A) and the trace of the error covariance (B) when using the KF to track the unobservable system (39)-(40).
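    Example 3 is easy to reproduce numerically. The sketch below builds the system of (39)-(40), confirms the rank-deficient observability matrix, and iterates the covariance recursion of the KF; the trace of the posterior covariance grows without bound, matching Figure 5b:

```python
import numpy as np

# Constant-acceleration motion observed only through its acceleration, Eqs. (39)-(40).
A = np.array([[1.0, 1.0, 0.5],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
H = np.array([[0.0, 0.0, 1.0]])
Q, R = np.eye(3), np.array([[1.0]])

# Observability matrix [H; HA; HA^2] has rank 1 < 3: the system is unobservable.
O = np.vstack([H, H @ A, H @ A @ A])
assert np.linalg.matrix_rank(O) == 1

# Riccati recursion for the posterior error covariance; its trace diverges.
P = np.eye(3)
traces = []
for _ in range(100):
    P_prior = A @ P @ A.T + Q
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
    P = (np.eye(3) - K @ H) @ P_prior
    traces.append(np.trace(P))
```

    The location and speed components are never directly corrected by the acceleration measurement, so their variances accumulate the model noise at every step.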

    The proof is divided into two steps. In Step 1, we apply properties (P.1) and (P.2) to show that (27) holds for the modes where the junction solver follows diverge case Ⅰ or diverge case Ⅱ. In Step 2, properties (P.1) and (P.3) are applied to show that (27) holds for the modes where the junction solver follows diverge case Ⅲ.

    For all $k_0+1\le\ell<k\le k_1$ and $i,j\in\{1,\ldots,n\}$, the $(i,j)$th entry of $\prod_{\kappa=k}^{\ell}A_\kappa$ is given by

    \[
    \left(\prod_{\kappa=k}^{\ell}A_\kappa\right)(i,j)=\sum_{r=1}^{n}\left(\left(\prod_{\kappa=k}^{\ell+1}A_\kappa\right)(i,r)\right)\left(A_\ell(r,j)\right). \quad (42)
    \]

    Step 1. Suppose $A_k$ is under a mode where the junction solver follows diverge case Ⅰ or diverge case Ⅱ. Recall from (P.2) that

    \[
    \sum_{r=1}^{n}A_k(r,j)\le 1,\quad\text{for all } k\in(k_0,k_1] \text{ and } j\in\{1,\ldots,n\}.
    \]

    Hence, the $(i,j)$th entry of $\prod_{\kappa=k}^{\ell}A_\kappa$ is no greater than the convex combination of all the entries in the $i$th row of $\prod_{\kappa=k}^{\ell+1}A_\kappa$. Moreover, recall from (P.1) that

    \[
    0\le A_k(r,c)\le 1,\quad\text{for all } k\in(k_0,k_1] \text{ and } r,c\in\{1,\ldots,n\};
    \]

    it follows that

    \[
    0\le\left(\prod_{\kappa=k}^{\ell}A_\kappa\right)(i,j)\le 1,\quad\text{for all } k\in(k_0,k_1],\ \ell\in[k_0+1,k) \text{ and } i,j\in\{1,\ldots,n\};
    \]

    thus (27) follows directly by setting $\ell=k_0+1$ in the above equation.

    Step 2. Suppose $A_k$ is under a mode where the junction solver follows diverge case Ⅲ. We prove the case where $f(\rho_{n_1}^k,\rho_{n_1+1}^k)=r(\rho_{n_1+1}^k)$; the proof for the case where $f(\rho_{n_1}^k,\rho_{n_1+1}^k)=s(\rho_{n_1}^k)-r(\rho_{n_1+n_2+1}^k)$ follows by symmetry.

    Recall from (P.3) that

    \[
    \sum_{r=1}^{n}A_k(r,j)\le 1,\quad\text{for all } k\in(k_0,k_1] \text{ and } j\in\{1,\ldots,n\}\setminus\{n_1+1\}. \quad (43)
    \]

    For $j=n_1+1$, the sum of all entries of $A_k$ in column $j$ is given by

    \[
    \sum_{r=1}^{n}A_k(r,n_1+1)=A_k(n_1,n_1+1)+A_k(n_1+1,n_1+1)+A_k(n_1+n_2+1,n_1+1)=\frac{w\Delta t}{\Delta x}+\left(1-\frac{w\Delta t}{\Delta x}\right)+\frac{w\Delta t}{\Delta x}=1+\frac{w\Delta t}{\Delta x},\quad\text{for all } k\in(k_0,k_1].
    \]

    Additionally, one may note that for all $k\in(k_0,k_1]$,

    \[
    A_k(r,n_1+n_2+1)=\begin{cases}1 & \text{if } r=n_1+n_2+1\\ 0 & \text{otherwise.}\end{cases}
    \]

    It follows that for all $k_0+1\le\ell<k\le k_1$,

    \[
    \left(\prod_{\kappa=k}^{\ell}A_\kappa\right)(r,n_1+n_2+1)=\begin{cases}1 & \text{if } r=n_1+n_2+1\\ 0 & \text{otherwise.}\end{cases} \quad (44)
    \]

    Combining (43) and (44) with (42), we obtain that for all $(i,j)\neq(n_1+n_2+1,n_1+1)$, the $(i,j)$th entry of $\prod_{\kappa=k}^{\ell}A_\kappa$ is no greater than the convex combination of all the (non-zero) entries in the $i$th row of $\prod_{\kappa=k}^{\ell+1}A_\kappa$. Also recall from (P.3) that for all $k\in(k_0,k_1]$,

    \[
    A_k(r,c)=0,\quad\text{for all } r\in\{n_1+1,\ldots,n\} \text{ and } c\in\{1,\ldots,n_1\};
    \]

    which yields that for all $k_0+1\le\ell<k\le k_1$,

    \[
    \left(\prod_{\kappa=k}^{\ell+1}A_\kappa\right)(r,c)=0,\quad\text{for all } r\in\{n_1+1,\ldots,n\} \text{ and } c\in\{1,\ldots,n_1\};
    \]

    thus

    \[
    \left(\prod_{\kappa=k}^{\ell+1}A_\kappa\right)(n_1+n_2+1,n_1)=0,\quad\text{for all } k_0+1\le\ell<k\le k_1.
    \]

    Hence for $(i,j)=(n_1+n_2+1,n_1+1)$, the $(i,j)$th entry of $\prod_{\kappa=k}^{\ell}A_\kappa$ is also no greater than the convex combination of all the (non-zero) entries in the $i$th row of $\prod_{\kappa=k}^{\ell+1}A_\kappa$. Moreover, according to (P.1) it holds that

    \[
    0\le A_k(r,c)\le 1,\quad\text{for all } k\in(k_0,k_1] \text{ and } r,c\in\{1,\ldots,n\}.
    \]

    It can be concluded that

    \[
    0\le\left(\prod_{\kappa=k}^{\ell}A_\kappa\right)(i,j)\le 1,\quad\text{for all } k\in(k_0,k_1],\ \ell\in[k_0+1,k) \text{ and } i,j\in\{1,\ldots,n\};
    \]

    thus (27) follows directly by setting $\ell=k_0+1$ in the above equation.

    In an unobservable mode, the SMM-J can be transformed to the Kalman observability canonical form. The transformed state consists of the observable and the unobservable parts of the system, i.e.,

    $\rho_k^{(t)} = U\rho_k = \begin{pmatrix} \rho_k^{(1)} \\ \rho_k^{(2)} \end{pmatrix},$

    where $U$ is an orthogonal matrix, and $\rho_k^{(1)} \in \mathbb{R}^{d_1}$ and $\rho_k^{(2)} \in \mathbb{R}^{d_2}$ are the states of the observable and unobservable subsystems, respectively, with $d_1 + d_2 = n$. Moreover, since the densities of the three boundary cells are directly measured, it holds that $d_1 \ge 3$. As a consequence, system (28)–(29) is transformed to the following form:

    $\rho_{k+1}^{(t)} = A_k^{(t)}\rho_k^{(t)} + u_k^{(t)} + \omega_k^{(t)}, \quad \rho_k^{(t)} \in \mathbb{R}^n, \qquad z_k = H_k^{(t)}\rho_k^{(t)} + v_k, \quad z_k \in \mathbb{R}^3,$

    where the transformed state transition matrix $A_k^{(t)}$ and the transformed observation matrix $H_k^{(t)}$ can also be partitioned according to the observable and unobservable subsystems, i.e.,

    $A_k^{(t)} = U A_k U^\top = \begin{pmatrix} A_k^{(1)} & 0_{d_1,d_2} \\ A_k^{(21)} & A_k^{(2)} \end{pmatrix}, \qquad H_k^{(t)} = H_k U^\top = \begin{pmatrix} H_k^{(1)} & 0 \end{pmatrix},$ (45)

    with $H_k^{(1)} \in \mathbb{R}^{3 \times d_1}$ defined as follows:

    $H_k^{(1)} = \begin{cases} I_3 & \text{if } d_1 = 3, \\ \begin{pmatrix} I_3 & 0_{3,d_1-3} \end{pmatrix} & \text{if } d_1 > 3, \end{cases} \quad \text{for all } k.$

    Moreover, the transformed system input is given by $u_k^{(t)} = U u_k$, and the transformed model noise is given by $w_k^{(t)} = U w_k \sim \mathcal{N}(0, Q_k^{(t)})$, where the transformed model error covariance $Q_k^{(t)}$ can be partitioned into blocks corresponding to the observable and unobservable subsystems, i.e.,

    $Q_k^{(t)} = U Q_k U^\top = \begin{pmatrix} Q_k^{(1)} & Q_k^{(12)} \\ Q_k^{(21)} & Q_k^{(2)} \end{pmatrix}.$

    The prior estimation error covariance matrix, partitioned into the observable and unobservable subsystems, is constructed as follows:

    $\Gamma_{k|k-1}^{(t)} = \begin{pmatrix} \Gamma_{k|k-1}^{(1)} & \Gamma_{k|k-1}^{(12)} \\ \Gamma_{k|k-1}^{(21)} & \Gamma_{k|k-1}^{(2)} \end{pmatrix}.$

    In the KF, the prior error covariance matrix is computed recursively by the Riccati equation

    $\Gamma_{k+1|k}^{(t)} = A_k^{(t)}\Big(\Gamma_{k|k-1}^{(t)} - \Gamma_{k|k-1}^{(t)}(H_k^{(t)})^\top\big(H_k^{(t)}\Gamma_{k|k-1}^{(t)}(H_k^{(t)})^\top + R_k\big)^{-1} H_k^{(t)}\Gamma_{k|k-1}^{(t)}\Big)(A_k^{(t)})^\top + Q_k^{(t)}.$ (46)

    Define

    $\Upsilon_k^{(1)} = A_k^{(1)} - A_k^{(1)}\Gamma_{k|k-1}^{(1)}(H_k^{(1)})^\top\big(H_k^{(t)}\Gamma_{k|k-1}^{(t)}(H_k^{(t)})^\top + R_k\big)^{-1}H_k^{(1)} = A_k^{(1)} - A_k^{(1)}K_k^{(1)}H_k^{(1)},$

    and apply the partition into observable and unobservable subsystems to both sides of (46); we obtain the following two blocks of equations describing the evolution of $\Gamma_{k+1|k}^{(1)}$ and $\Gamma_{k+1|k}^{(12)}$:

    $\Gamma_{k+1|k}^{(1)} = \Upsilon_k^{(1)}\Gamma_{k|k-1}^{(1)}(A_k^{(1)})^\top + Q_k^{(1)},$ (47)
    $\Gamma_{k+1|k}^{(12)} = \Upsilon_k^{(1)}\Gamma_{k|k-1}^{(12)}(A_k^{(2)})^\top + \Upsilon_k^{(1)}\Gamma_{k|k-1}^{(1)}(A_k^{(21)})^\top + Q_k^{(12)}.$ (48)
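    The block recursions (47)–(48) can be sanity-checked against one full Riccati step (46) on a small synthetic system with the block-triangular structure of (45). All dimensions and matrix values below are illustrative assumptions, not taken from the SMM-J itself.

```python
import numpy as np

rng = np.random.default_rng(2)
d1, d2 = 3, 2
n = d1 + d2

A1, A2, A21 = rng.random((d1, d1)), rng.random((d2, d2)), rng.random((d2, d1))
At = np.block([[A1, np.zeros((d1, d2))], [A21, A2]])  # block-triangular A^(t), as in (45)
H1 = np.eye(3)                                        # H^(1) = I_3 (the d1 = 3 case)
Ht = np.hstack([H1, np.zeros((3, d2))])               # H^(t) = (H^(1)  0)
R = 0.1 * np.eye(3)
Qt = np.eye(n)
B = rng.random((n, n))
G = B @ B.T + n * np.eye(n)                           # a positive-definite prior covariance

# one full Riccati step (46)
S = Ht @ G @ Ht.T + R
G_next = At @ (G - G @ Ht.T @ np.linalg.inv(S) @ Ht @ G) @ At.T + Qt

# the same blocks recovered via (47) and (48)
G1, G12 = G[:d1, :d1], G[:d1, d1:]
Q1, Q12 = Qt[:d1, :d1], Qt[:d1, d1:]
K1 = G1 @ H1.T @ np.linalg.inv(R + H1 @ G1 @ H1.T)    # K^(1), as in (50)
Ups1 = A1 - A1 @ K1 @ H1                              # Υ^(1)
assert np.allclose(G_next[:d1, :d1], Ups1 @ G1 @ A1.T + Q1)                        # (47)
assert np.allclose(G_next[:d1, d1:], Ups1 @ G12 @ A2.T + Ups1 @ G1 @ A21.T + Q12)  # (48)
```

    The check works because the zero block $0_{d_1,d_2}$ in (45) decouples the top row of blocks of the Riccati update from the unobservable error covariance $\Gamma^{(2)}$.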

    In this section, we present the explicit formula of $\mathcal{K}(\Gamma_{k_{U_0}|k_{U_0}})$ and prove Lemma 3. As detailed in Appendix.3, we transform the state vector according to the observable and unobservable subsystems, i.e.,

    $\rho_k^{(t)} = U\rho_k = \begin{pmatrix} \rho_k^{(1)} \\ \rho_k^{(2)} \end{pmatrix},$

    where $\rho_k^{(1)} \in \mathbb{R}^{d_1}$ and $\rho_k^{(2)} \in \mathbb{R}^{d_2}$ are the states of the observable and unobservable subsystems, respectively. The transformed Kalman gain is given by

    $K_k^{(t)} = U K_k = \begin{pmatrix} K_k^{(1)} \\ K_k^{(21)} \end{pmatrix},$ (49)

    where $K_k^{(1)}$ and $K_k^{(21)}$ correspond to the observable and unobservable subsystems, respectively, with

    $K_k^{(1)} = \Gamma_{k|k-1}^{(1)}(H_k^{(1)})^\top\big(R_k + H_k^{(1)}\Gamma_{k|k-1}^{(1)}(H_k^{(1)})^\top\big)^{-1},$ (50)
    $K_k^{(21)} = \Gamma_{k|k-1}^{(21)}(H_k^{(1)})^\top\big(R_k + H_k^{(1)}\Gamma_{k|k-1}^{(1)}(H_k^{(1)})^\top\big)^{-1}.$ (51)

    Appendix.4.1. Explicit formula of the Kalman gain bound. Define

    $a_1 = \max_{k \in (k_{U_0},k_{U_1}]} \|A_k\|_\infty, \qquad a_2 = \max_{k \in (k_{U_0},k_{U_1}]} \|A_k^\top\|_\infty,$

    and

    $\tilde a_1 = \max_{k \in (k_{U_0},k_{U_1}]} \|A_k^{(1)}\|_\infty, \quad \tilde a_2 = \max_{k \in (k_{U_0},k_{U_1}]} \|(A_k^{(1)})^\top\|_\infty, \quad \tilde a_3 = \max_{k \in (k_{U_0},k_{U_1}]} \sigma_{\max}\big((A_k^{(1)})^\top A_k^{(1)}\big), \quad \tilde a_4 = \max_{k \in (k_{U_0},k_{U_1}]} \|A_k^{(21)}\|_\infty.$

    Moreover, define as $\tilde c_1$ and $\tilde c_2$ the lower and upper bounds of the error covariance of the observable subsystem, i.e.,

    $\tilde c_1 I < \Gamma_{k|k}^{(1)} < \tilde c_2 I, \quad \text{for all } k \in (k_{U_0},k_{U_1}],$ (52)

    and let

    $\tilde c_3 = \big(\tilde c_2 + q_1^{-1}\tilde c_2^2\tilde a_3\big)^{-1}, \quad \tilde t = n^2 d_1\big(\tilde c_2\tilde c_1^{-1}\big)^{\frac12}, \quad \tilde q = \big(1 - \tilde c_3\tilde c_1\big)^{\frac12}, \quad \tilde p = d_1\tilde c_2\tilde a_4\big(\tilde a_1\tilde a_2\tilde c_2 + q_2\big)q_1^{-1} + q_2, \quad \tilde\gamma = n^2\|\Gamma_{k_{U_0}|k_{U_0}}\|_2\,(a_1a_2)^2 + n^2 a_1a_2 q_2 + n q_2.$

    The upper bound of $\|K_k\|_\infty$ for $k \in (k_{U_0},k_{U_1}]$ in (30) is defined as

    $\mathcal{K}(\Gamma_{k_{U_0}|k_{U_0}}) = 3d_1 r_1^{-1}\max\Big\{ n d_1^{-1}\big(\|\Gamma_{k_{U_0}|k_{U_0}}\|_2\,a_1a_2 + q_2\big),\ \big(\tilde a_1\tilde a_2\tilde c_2 + q_2\big),\ \tilde\gamma,\ \tilde t\tilde q\tilde\gamma + \tilde p,\ \frac{\tilde t\tilde p\tilde q}{1 - \tilde q} + \tilde p \Big\}.$

    Appendix.4.2. Proof of Lemma 3. The proof consists of five steps. Step 1 derives an upper bound for $\|K_{k_{U_0}+1}\|_\infty$. Step 2 derives an upper bound of $\|K_k^{(1)}\|_\infty$ for $k \in (k_{U_0}+1,k_{U_1}]$. In Step 3, we study the convergence rate of the error dynamics of the observable subsystem, which is also related to the boundedness of $K_k^{(21)}$. Based on the convergence rate obtained in Step 3, Step 4 derives an upper bound of $\|K_k^{(21)}\|_\infty$ for $k \in (k_{U_0}+1,k_{U_1}]$. Step 5 combines the above steps and concludes the proof.

    Step 1. At time step $k_{U_0}+1$, the Kalman gain is computed as follows:

    $K_{k_{U_0}+1} = \Gamma_{k_{U_0}+1|k_{U_0}}H_{k_{U_0}+1}^\top\big(R_{k_{U_0}+1} + H_{k_{U_0}+1}\Gamma_{k_{U_0}+1|k_{U_0}}H_{k_{U_0}+1}^\top\big)^{-1},$

    where $\Gamma_{k_{U_0}+1|k_{U_0}} = A_{k_{U_0}}\Gamma_{k_{U_0}|k_{U_0}}A_{k_{U_0}}^\top + Q_{k_{U_0}}$. Given that $\|\Gamma_{k_{U_0}|k_{U_0}}\|_\infty \le n\|\Gamma_{k_{U_0}|k_{U_0}}\|_2$ and $\|Q_k\|_\infty < nq_2$, and defining

    $a_1 = \max_{k \in (k_{U_0},k_{U_1}]} \|A_k\|_\infty, \qquad a_2 = \max_{k \in (k_{U_0},k_{U_1}]} \|A_k^\top\|_\infty,$

    the prior error covariance at time $k_{U_0}+1$ satisfies

    $\|\Gamma_{k_{U_0}+1|k_{U_0}}\|_\infty \le \|A_{k_{U_0}}\|_\infty\|\Gamma_{k_{U_0}|k_{U_0}}\|_\infty\|A_{k_{U_0}}^\top\|_\infty + \|Q_{k_{U_0}}\|_\infty < n\|\Gamma_{k_{U_0}|k_{U_0}}\|_2\,a_1a_2 + nq_2.$

    Moreover, since

    $\big\|\big(R_{k_{U_0}+1} + H_{k_{U_0}+1}\Gamma_{k_{U_0}+1|k_{U_0}}H_{k_{U_0}+1}^\top\big)^{-1}\big\|_\infty \le 3\big\|\big(R_{k_{U_0}+1} + H_{k_{U_0}+1}\Gamma_{k_{U_0}+1|k_{U_0}}H_{k_{U_0}+1}^\top\big)^{-1}\big\|_2 = 3\big(\sigma_{\min}\big(R_{k_{U_0}+1} + H_{k_{U_0}+1}\Gamma_{k_{U_0}+1|k_{U_0}}H_{k_{U_0}+1}^\top\big)\big)^{-1} \le 3\big(\sigma_{\min}(R_{k_{U_0}+1})\big)^{-1} < 3r_1^{-1},$

    it follows that

    $\|K_{k_{U_0}+1}\|_\infty \le \|\Gamma_{k_{U_0}+1|k_{U_0}}\|_\infty\,\|H_{k_{U_0}+1}^\top\|_\infty\,\big\|\big(R_{k_{U_0}+1} + H_{k_{U_0}+1}\Gamma_{k_{U_0}+1|k_{U_0}}H_{k_{U_0}+1}^\top\big)^{-1}\big\|_\infty < 3nr_1^{-1}\big(\|\Gamma_{k_{U_0}|k_{U_0}}\|_2\,a_1a_2 + q_2\big).$
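    The two norm facts used in Step 1 — submultiplicativity of the induced ∞-norm, and the bound of the inverted 3×3 innovation covariance by $3\,\sigma_{\min}(R)^{-1}$ — can be exercised numerically. Dimensions and values below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.random((n, n))
Q = 0.01 * np.eye(n)
H = np.hstack([np.eye(3), np.zeros((3, n - 3))])  # three boundary cells measured
R = np.diag([0.2, 0.3, 0.25])
r1 = R.diagonal().min()                            # sigma_min(R) for a diagonal R
G0 = np.eye(n)                                     # posterior covariance

Gp = A @ G0 @ A.T + Q                              # prior covariance
S = R + H @ Gp @ H.T                               # 3x3 innovation covariance
K = Gp @ H.T @ np.linalg.inv(S)                    # Kalman gain

inf_norm = lambda M: np.linalg.norm(M, np.inf)
# submultiplicativity: ||K|| <= ||Γ|| ||H'|| ||S^{-1}||
assert inf_norm(K) <= inf_norm(Gp) * inf_norm(H.T) * inf_norm(np.linalg.inv(S)) + 1e-10
# ||S^{-1}||_inf <= 3 ||S^{-1}||_2 = 3 / sigma_min(S) <= 3 / sigma_min(R)
assert inf_norm(np.linalg.inv(S)) <= 3.0 / r1 + 1e-10
```

    The second inequality holds because $H\Gamma H^\top$ is positive semidefinite, so adding it to $R$ can only increase the smallest eigenvalue of the innovation covariance.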

    Step 2. As stated in (50), the Kalman gain associated with the observable subsystem is given by

    $K_k^{(1)} = \Gamma_{k|k-1}^{(1)}(H_k^{(1)})^\top\big(R_k + H_k^{(1)}\Gamma_{k|k-1}^{(1)}(H_k^{(1)})^\top\big)^{-1}.$

    According to Lemma 1, there exist constants $\tilde c_1$ and $\tilde c_2$ such that the error covariance of the observable subsystem satisfies

    $\tilde c_1 I < \Gamma_{k|k}^{(1)} < \tilde c_2 I, \quad \text{for all } k \in (k_{U_0},k_{U_1}].$ (53)

    Given that

    $\Gamma_{k|k-1}^{(1)} = A_{k-1}^{(1)}\Gamma_{k-1|k-1}^{(1)}(A_{k-1}^{(1)})^\top + Q_{k-1}^{(1)},$

    we have

    $\|\Gamma_{k|k-1}^{(1)}\|_\infty \le \tilde a_1\tilde a_2\|\Gamma_{k-1|k-1}^{(1)}\|_\infty + \|Q_{k-1}^{(1)}\|_\infty < d_1\big(\tilde a_1\tilde a_2\tilde c_2 + q_2\big),$

    with $\tilde a_1$ and $\tilde a_2$ defined as

    $\tilde a_1 = \max_{k \in (k_{U_0},k_{U_1}]} \|A_k^{(1)}\|_\infty, \qquad \tilde a_2 = \max_{k \in (k_{U_0},k_{U_1}]} \|(A_k^{(1)})^\top\|_\infty.$

    Following a similar argument as in Step 1, we obtain

    $\|K_k^{(1)}\|_\infty < 3d_1r_1^{-1}\big(\tilde a_1\tilde a_2\tilde c_2 + q_2\big), \quad \text{for all } k \in (k_{U_0}+1,k_{U_1}].$

    Step 3. Define the Lyapunov function of the observable subsystem as

    $V_k^{(1)} = (\bar\eta_{k|k}^{(1)})^\top(\Gamma_{k|k}^{(1)})^{-1}\bar\eta_{k|k}^{(1)}.$

    According to Lemma 3 in [31], the one-step change of $V_k^{(1)}$ is given by

    $V_{k+1}^{(1)} - V_k^{(1)} = -(\bar\eta_{k|k}^{(1)})^\top\Big(\Gamma_{k|k}^{(1)} + \Gamma_{k|k}^{(1)}(A_k^{(1)})^\top\big(Q_k^{(1)} + \Gamma_{k+1|k}^{(1)}(H_k^{(1)})^\top R_{k+1}^{-1}H_k^{(1)}\Gamma_{k+1|k}^{(1)}\big)^{-1}A_k^{(1)}\Gamma_{k|k}^{(1)}\Big)^{-1}\bar\eta_{k|k}^{(1)} \le -\|\bar\eta_{k|k}^{(1)}\|^2\,\Big\|\Gamma_{k|k}^{(1)} + \Gamma_{k|k}^{(1)}(A_k^{(1)})^\top\big(Q_k^{(1)} + \Gamma_{k+1|k}^{(1)}(H_k^{(1)})^\top R_{k+1}^{-1}H_k^{(1)}\Gamma_{k+1|k}^{(1)}\big)^{-1}A_k^{(1)}\Gamma_{k|k}^{(1)}\Big\|^{-1},$

    where

    $\Big\|\Gamma_{k|k}^{(1)} + \Gamma_{k|k}^{(1)}(A_k^{(1)})^\top\big(Q_k^{(1)} + \Gamma_{k+1|k}^{(1)}(H_k^{(1)})^\top R_{k+1}^{-1}H_k^{(1)}\Gamma_{k+1|k}^{(1)}\big)^{-1}A_k^{(1)}\Gamma_{k|k}^{(1)}\Big\| < \tilde c_2 + q_1^{-1}\big\|\Gamma_{k|k}^{(1)}(A_k^{(1)})^\top A_k^{(1)}\Gamma_{k|k}^{(1)}\big\| \le \tilde c_2 + q_1^{-1}\|\Gamma_{k|k}^{(1)}\|^2\,\sigma_{\max}\big((A_k^{(1)})^\top A_k^{(1)}\big) < \tilde c_2 + q_1^{-1}\tilde c_2^2\tilde a_3,$

    with $\tilde a_3$ defined as

    $\tilde a_3 = \max_{k \in (k_{U_0},k_{U_1}]} \sigma_{\max}\big((A_k^{(1)})^\top A_k^{(1)}\big),$

    and $\sigma_{\max}(M)$ is the maximum singular value of matrix $M$. It follows that for all $k \in (k_{U_0},k_{U_1}]$, the Lyapunov function $V_k^{(1)}$ satisfies

    $\tilde c_2^{-1}\|\bar\eta_{k|k}^{(1)}\|^2 < V_k^{(1)} < \tilde c_1^{-1}\|\bar\eta_{k|k}^{(1)}\|^2, \qquad \text{and} \qquad V_{k+1}^{(1)} - V_k^{(1)} < -\tilde c_3\|\bar\eta_{k|k}^{(1)}\|^2,$

    where

    $\tilde c_3 = \big(\tilde c_2 + q_1^{-1}\tilde c_2^2\tilde a_3\big)^{-1}.$

    Hence, for all $k \in (k_{U_0}+1,k_{U_1}]$, the 2-norm of the mean estimation error of the observable subsystem satisfies

    $\|\bar\eta_{k|k}^{(1)}\| < \big(\tilde c_2 V_k^{(1)}\big)^{\frac12} < \Big(\tilde c_2 V_{k_{U_0}+1}^{(1)}\big(1-\tilde c_3\tilde c_1\big)^{k-k_{U_0}-1}\Big)^{\frac12} < \Big(\tilde c_2\tilde c_1^{-1}\|\bar\eta_{k_{U_0}+1|k_{U_0}+1}^{(1)}\|^2\big(1-\tilde c_3\tilde c_1\big)^{k-k_{U_0}-1}\Big)^{\frac12} = \big(\tilde c_2\tilde c_1^{-1}\big)^{\frac12}\|\bar\eta_{k_{U_0}+1|k_{U_0}+1}^{(1)}\|\Big(\big(1-\tilde c_3\tilde c_1\big)^{\frac12}\Big)^{k-k_{U_0}-1}.$ (54)

    Moreover, for all $k \in (k_{U_0}+1,k_{U_1}]$, the mean estimation error of the observable subsystem is given as follows:

    $\bar\eta_{k|k}^{(1)} = \Big(\prod_{\kappa=k}^{k_{U_0}+2}\Upsilon_\kappa^{(1)}\Big)\bar\eta_{k_{U_0}+1|k_{U_0}+1}^{(1)},$ (55)

    where $\Upsilon_\kappa^{(1)} = \Gamma_{\kappa|\kappa}^{(1)}\big(\Gamma_{\kappa|\kappa-1}^{(1)}\big)^{-1}A_{\kappa-1}^{(1)}$. Combining (54) and (55), it is concluded based on the definition of the matrix induced norm that

    $\Big\|\prod_{\kappa=k}^{k_{U_0}+2}\Upsilon_\kappa^{(1)}\Big\|_2 \le \big(\tilde c_2\tilde c_1^{-1}\big)^{\frac12}\Big(\big(1-\tilde c_3\tilde c_1\big)^{\frac12}\Big)^{k-k_{U_0}-1}, \quad \text{for } k \in (k_{U_0}+1,k_{U_1}].$ (56)
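    The geometric contraction in (56) can be observed on a small synthetic Kalman filter: with the system observable, the norm of the running product of the matrices $\Upsilon_\kappa = \Gamma_{\kappa|\kappa}\Gamma_{\kappa|\kappa-1}^{-1}A_{\kappa-1}$ decays geometrically. The system below (constant $A$, $H = I$, scalar noise levels) is an illustrative assumption, not the SMM-J itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = 0.9 * np.eye(n) + 0.05 * rng.random((n, n))  # stable, time-invariant stand-in
H = np.eye(n)                                    # fully observed, hence observable
Q, R = 0.01 * np.eye(n), 0.04 * np.eye(n)

Gamma = np.eye(n)        # posterior covariance at k = 0
prod = np.eye(n)         # running product of the Υ_κ
norms = []
for _ in range(30):
    Gamma_prior = A @ Gamma @ A.T + Q                      # prior covariance
    K = Gamma_prior @ H.T @ np.linalg.inv(H @ Gamma_prior @ H.T + R)
    Gamma = (np.eye(n) - K @ H) @ Gamma_prior              # posterior covariance
    Upsilon = Gamma @ np.linalg.inv(Gamma_prior) @ A       # Υ_κ = Γ_{κ|κ} Γ_{κ|κ-1}^{-1} A
    prod = Upsilon @ prod
    norms.append(np.linalg.norm(prod, 2))

assert norms[-1] < norms[0]      # the product contracts
assert norms[-1] < 1e-4          # and does so geometrically fast here
```

    The decay rate of `norms` plays the role of $\big(1-\tilde c_3\tilde c_1\big)^{\frac12}$ in (56).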

    Step 4. Vectorizing both sides of (48) yields that, for $k \in (k_{U_0},k_{U_1}]$,

    $\mathrm{vec}\{\Gamma_{k+1|k}^{(12)}\} = \big(A_k^{(2)} \otimes \Upsilon_k^{(1)}\big)\mathrm{vec}\{\Gamma_{k|k-1}^{(12)}\} + \mathrm{vec}\{\Upsilon_k^{(1)}\Gamma_{k|k-1}^{(1)}(A_k^{(21)})^\top\} + \mathrm{vec}\{Q_k^{(12)}\},$

    which implies that for all $k \in (k_{U_0}+1,k_{U_1}]$,

    $\mathrm{vec}\{\Gamma_{k+1|k}^{(12)}\} = \Big(\prod_{\kappa=k}^{k_{U_0}+2}\big(A_\kappa^{(2)} \otimes \Upsilon_\kappa^{(1)}\big)\Big)\mathrm{vec}\{\Gamma_{k_{U_0}+2|k_{U_0}+1}^{(12)}\} + \Phi_k,$ (57)

    where

    $\Phi_k = \mathrm{vec}\{\Upsilon_k^{(1)}\Gamma_{k|k-1}^{(1)}(A_k^{(21)})^\top + Q_k^{(12)}\} + \big(A_k^{(2)} \otimes \Upsilon_k^{(1)}\big)\mathrm{vec}\{\Upsilon_{k-1}^{(1)}\Gamma_{k-1|k-2}^{(1)}(A_{k-1}^{(21)})^\top + Q_{k-1}^{(12)}\} + \big(A_k^{(2)} \otimes \Upsilon_k^{(1)}\big)\big(A_{k-1}^{(2)} \otimes \Upsilon_{k-1}^{(1)}\big)\mathrm{vec}\{\Upsilon_{k-2}^{(1)}\Gamma_{k-2|k-3}^{(1)}(A_{k-2}^{(21)})^\top + Q_{k-2}^{(12)}\} + \cdots + \prod_{\kappa=k}^{k_{U_0}+3}\big(A_\kappa^{(2)} \otimes \Upsilon_\kappa^{(1)}\big)\mathrm{vec}\{\Upsilon_{k_{U_0}+2}^{(1)}\Gamma_{k_{U_0}+2|k_{U_0}+1}^{(1)}(A_{k_{U_0}+2}^{(21)})^\top + Q_{k_{U_0}+2}^{(12)}\}.$

    The explicit form of $A_k^{(2)} \otimes \Upsilon_k^{(1)}$ reads

    $A_k^{(2)} \otimes \Upsilon_k^{(1)} = \begin{pmatrix} A_k^{(2)}(1,1)\Upsilon_k^{(1)} & \cdots & A_k^{(2)}(1,d_2)\Upsilon_k^{(1)} \\ \vdots & \ddots & \vdots \\ A_k^{(2)}(d_2,1)\Upsilon_k^{(1)} & \cdots & A_k^{(2)}(d_2,d_2)\Upsilon_k^{(1)} \end{pmatrix},$

    hence

    $\prod_{\kappa=k}^{k_{U_0}+2}\big(A_\kappa^{(2)} \otimes \Upsilon_\kappa^{(1)}\big) = \begin{pmatrix} \vartheta_k(1,1)\prod_{\kappa=k}^{k_{U_0}+2}\Upsilon_\kappa^{(1)} & \cdots & \vartheta_k(1,d_2)\prod_{\kappa=k}^{k_{U_0}+2}\Upsilon_\kappa^{(1)} \\ \vdots & \ddots & \vdots \\ \vartheta_k(d_2,1)\prod_{\kappa=k}^{k_{U_0}+2}\Upsilon_\kappa^{(1)} & \cdots & \vartheta_k(d_2,d_2)\prod_{\kappa=k}^{k_{U_0}+2}\Upsilon_\kappa^{(1)} \end{pmatrix},$

    where $\vartheta_k(i,j)$ is the $(i,j)$th element of $\prod_{\kappa=k}^{k_{U_0}+2}A_\kappa^{(2)}$. Define

    $P = \begin{pmatrix} 0_{d_2,n-d_2} & I_{d_2} \end{pmatrix},$

    and given that the top right block of $A_k^{(t)}$ is a zero matrix $0_{d_1,d_2}$ (as shown in (45)), it can be concluded that

    $\prod_{\kappa=k}^{k_{U_0}+2}A_\kappa^{(2)} = P\Big(\prod_{\kappa=k}^{k_{U_0}+2}A_\kappa^{(t)}\Big)P^\top = P\Big(\prod_{\kappa=k}^{k_{U_0}+2}UA_\kappa U^\top\Big)P^\top = PU\Big(\prod_{\kappa=k}^{k_{U_0}+2}A_\kappa\Big)U^\top P^\top.$ (58)

    Based on Lemma 2, the element of satisfies

    Hence, it can be derived from (58) that

    Consequently,

    (59)

    where the last inequality is due to (56). Recall from (53) that for . Since

    it follows that for all , and the prior error covariance of the observable subsystem satisfies for . As a consequence,

    (60)

    Define as

    it follows that6 for ,

    6Recall that for matrix , .

    (61)

    Substituting (59) and (61) into (57), we obtain that for ,

    where is either a non-increasing or a non-decreasing function of . Hence, we obtain that for ,

    where

    Also since

    it follows that for ,

    Step 5. Combining Steps 1, 2 and 4, it can be concluded that for

    which completes the proof.



    [1] Z. Zhang, P. Cui, W. Zhu, Deep Learning on Graphs: A Survey, IEEE Trans. Knowl. Data Eng., 34 (2022), 249–270. https://doi.org/10.1109/TKDE.2020.2981333 doi: 10.1109/TKDE.2020.2981333
    [2] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, P. Vandergheynst, The emerging field of signal processing on graphs: Extending high dimensional data analysis to networks and other irregular domains, IEEE Signal Process. Mag., 30 (2013), 83–98. https://doi.org/10.1109/MSP.2012.2235192 doi: 10.1109/MSP.2012.2235192
    [3] A. Sandryhaila, J. M. F. Moura, Big data analysis with signal processing on graphs: Representation and processing of massive data sets with irregular structure, IEEE Signal Process. Mag., 31 (2014), 80–90. https://doi.org/10.1109/MSP.2014.2329213 doi: 10.1109/MSP.2014.2329213
    [4] A. Sandryhaila, J. M. F. Moura, Discrete signal processing on graphs, IEEE Trans. Signal Process., 61 (2013), 1644–1656. https://doi.org/10.1109/TSP.2013.2238935 doi: 10.1109/TSP.2013.2238935
    [5] J. Bruna, W. Zaremba, A. Szlam, Y. Lecun, Spectral networks and locally connected networks on graphs, arXiv preprint, (2013), arXiv: 1312.6203. https://doi.org/10.48550/arXiv.1312.6203
    [6] D. Duvenaud, D. Maclaurin, J. Iparraguirre, R. Bombarell, T. Hirzel, A. Aspuru-Guzik, et al., Convolutional networks on graphs for learning molecular fingerprints, Adv. Neural Inf. Process. Syst., 28 (2015), 2224–2232.
    [7] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks, arXiv preprint, (2016), arXiv: 1609.02907.
    [8] J. Atwood, D. Towsley, Diffusion-convolutional neural networks, Adv. Neural Inf. Process. Syst., 29 (2016), 1993–2001.
    [9] M. Defferrard, X. Bresson, P. Vandergheynst, Convolutional neural networks on graphs with fast localized spectral filtering, Adv. Neural Inf. Process. Syst., 29 (2016), 3837–3845.
    [10] R. Levie, F. Monti, X. Bresson, M. M. Bronstein, CayleyNets: Graph convolutional neural networks with complex rational spectral filters, IEEE Trans. Signal Process., 67 (2019), 97–109. https://doi.org/10.1109/TSP.2018.2879624 doi: 10.1109/TSP.2018.2879624
    [11] R. Levie, W. Huang, L. Bucci, M. Bronstein, G. Kutyniok, Transferability of Spectral Graph Convolutional Neural Networks, J. Mach. Learn. Res., 22 (2021), 12462–12520.
    [12] F. Monti, D. Boscaini, J. Masci, E. Rodola, J. Svoboda, M. M. Bronstein, Geometric deep learning on graphs and manifolds using mixture model CNNs, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Honolulu, HI, USA, (2017), 5425–5434. https://doi.org/10.1109/CVPR.2017.576
    [13] M. Fey, J. E. Lenssen, F. Weichert, H. Müller, SplineCNN: fast geometric deep learning with continuous b-spline kernels, Proc. IEEE Conf. Comput. Vis. Pattern Recognit., (2018), 869–877. https://doi.org/10.1109/CVPR.2018.00097 doi: 10.1109/CVPR.2018.00097
    [14] W. L. Hamilton, Z. Ying, J. Leskovec, Inductive representation learning on large graphs, Adv. Neural Inf. Process. Syst., 30 (2017), 1024–1034.
    [15] Y. Zhao, J. Qi, Q. Liu, R. Zhang, WGCN: Graph Convolutional Networks with Weighted Structural Features, in 2021 SIGIR, (2021), 624–633. https://doi.org/10.1145/3404835.3462834
    [16] H. Wu, C. Wang, Y. Tyshetskiy, A. Docherty, K. Lu, L. Zhu, Adversarial examples for graph data: Deep insights into attack and defense, arXiv preprint, (2019), arXiv: 1903.01610. https://doi.org/10.48550/arXiv.1903.01610
    [17] D. Zügner, S. Günnemann, Adversarial attacks on graph neural networks via meta learning, arXiv preprint, (2019), arXiv: 1902.08412. https://doi.org/10.48550/arXiv.1902.08412
    [18] K. Xu, H. Chen, S. Liu, P. Chen, T. Weng, M. Hong, et al., Topology attack and defense for graph neural networks: An optimization perspective, in Proc. Int. Joint Conf. Artif. Intell., (2019), 3961–3967. https://doi.org/10.24963/ijcai.2019/550
    [19] L. Chen, J. Li, J. Peng, A survey of adversarial learning on graph, arXiv preprint, (2020), arXiv: 2003.05730. https://doi.org/10.48550/arXiv.2003.05730
    [20] L. Chen, J. Li, J. Peng, Y. Liu, Z. Zheng, C. Yang, Understanding Structural Vulnerability in Graph Convolutional Networks, in Proc. Int. Joint Conf. Artif. Intell., (2021), 2249–2255. https://doi.org/10.24963/ijcai.2021/310
    [21] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio, Y. Bengio, Graph attention networks, arXiv preprint, (2017), arXiv: 1710.10903. https://doi.org/10.48550/arXiv.1710.10903
    [22] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit. L. Jones, A. N. Gomez, et al., Attention is all you need, Adv. Neural Inf. Process. Syst., 30 (2017), 5998–6008.
    [23] C. Zhuang, Q. Ma, Dual Graph Convolutional Networks for Graph-Based Semi-Supervised Classification, in Proc. Int. Conf. World Wide Web, (2018), 499–508. https://doi.org/10.1145/3178876.3186116
    [24] F. Hu, Y. Zhu, S. Wu, L. Wang, T. Tan, Hierarchical Graph Convolutional Networks for Semi-supervised Node Classification, in Proc. Int. Joint Conf. Artif. Intell., (2019), 4532–4539. https://doi.org/10.24963/ijcai.2019/630
    [25] Y. Zhang, S. Pal, M. Coates, D. Üstebay, Bayesian graph convolutional neural networks for semi-supervised classification, in Proc. AAAI Conf. Artif. Intell., 33 (2019), 5829–5836. https://doi.org/10.1609/aaai.v33i01.33015829
    [26] Y. Luo, R. Ji, T. Guan, J. Yu, P. Liu, Y. Yang, Every node counts: Self-ensembling graph convolutional networks for semi-supervised learning, Pattern Recognit., 106 (2020), 107451. https://doi.org/10.1016/j.patcog.2020.107451 doi: 10.1016/j.patcog.2020.107451
    [27] P. Gong, L. Ai, Neighborhood Adaptive Graph Convolutional Network for Node Classification, IEEE Access, 7 (2019), 170578–170588. https://doi.org/10.1109/ACCESS.2019.2955487 doi: 10.1109/ACCESS.2019.2955487
    [28] I. Chami, Z. Ying, C. Ré, J. Leskovec, Hyperbolic graph convolutional neural networks, in Proc. Adv. Neural Inf. Process. Syst., (2019), 4868–4879.
    [29] J. Dai, Y. Wu, Z. Gao, Y. Jia, A Hyperbolic-to-Hyperbolic Graph Convolutional Network, in 2021 IEEE/CVF Conf. Computer Vision Pattern Recogn. (CVPR), (2021), 154–163. https://doi.org/10.1109/CVPR46437.2021.00022
    [30] S. Rhee, S. Seo, S. Kim, Hybrid Approach of Relation Network and Localized Graph Convolutional Filtering for Breast Cancer Subtype Classification, in 2018 Int. Joint Conf. Artif. Intell., (2018), 3527–3534. https://doi.org/10.24963/ijcai.2018/490
    [31] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, G. E. Dahl, Neural Message Passing for Quantum Chemistry, in 2017 Int. Conf. Machine Learn., (2017), 1263–1272.
    [32] M. Zhang, Z. Cui, M. Neumann, Y. Chen, An End-to-End Deep Learning Architecture for Graph Classification, in Proc. Artif. Intell., (2018), 4438–4445. https://doi.org/10.1609/aaai.v32i1.11782
    [33] R. Ying, J. You, C. Morris, Hierarchical graph representation learning with differentiable pooling, in Proc. 32nd Int. Conf. Neural Inf. Process. Syst., (2018), 4805–4815.
    [34] Y. Ma, S. Wang, C. C Aggarwal, J. Tang, Graph convolutional networks with eigenpooling, in Proc. 25th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., (2019), 723–731. https://doi.org/10.1145/3292500.3330982
    [35] J. Lee, I. Lee, J. Kang, Self-attention graph pooling, in Proc. 36th Int. Conf. Machine Learn., (2019), 3734–3743. Available from: http://proceedings.mlr.press/v97/lee19c/lee19c.pdf
    [36] C. Cangea, P. Velickovic, N. Jovanovic, T. Kipf, P. Lio, Towards sparse hierarchical graph classifiers, in Proc. Adv. Neural Inf. Process. Syst., (2018). https://doi.org/10.48550/arXiv.1811.01287
    [37] H. Gao, S. Ji, Graph U-Nets, in Proc. 36th Int. Conf. Machine Learn., (2019), 2083–2092. https://doi.org/10.1109/TPAMI.2021.3081010
    [38] H. Gao, Z. Wang, S. Ji, Large-Scale Learnable Graph Convolutional Networks, in Proc. Knowl. Disc. Data Min., (2018), 1416–1424. https://doi.org/10.1145/3219819.3219947
    [39] W. Chiang, X. Liu, S. Si, Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks, in Proc. Knowl. Disc. Data Min., (2019), 257–266. https://doi.org/10.1145/3292500.3330925
    [40] D. Zou, Z. Hu, Y. Wang, S. Jiang, Y. Sun, Q. Gu, Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks, in Proc. Adv. Neural Inf. Process. Syst., (2019), 11249–11259.
    [41] J. Wang, Y. Wang, Z. Yang, Bi-GCN: Binary Graph Convolutional Network, in 2021 IEEE/CVF Conf. Comput. Vision Pattern Recogn. (CVPR), (2021), 1561–1570. https://doi.org/10.1109/CVPR46437.2021.00161
    [42] F. Monti, K. Otness, M. M. Bronstein, MOTIFNET: A Motif-Based Graph Convolutional Network for Directed Graphs, in Proc. IEEE Data Sci. Workshop, (2018), 225–228. https://doi.org/10.1109/DSW.2018.8439897
    [43] J. Du, S. Zhang, G. Wu, J. M. F. Moura, S. Kar, Topology adaptive graph convolutional networks, arXiv preprint, (2017), arXiv: 1710.10370.
    [44] E. Yu, Y. Wang, Y. Fu, D. B. Chen, M. Xie, Identifying critical nodes in complex networks via graph convolutional networks, Knowl.-Based Syst., 198 (2020), 105893. https://doi.org/10.1016/j.knosys.2020.105893 doi: 10.1016/j.knosys.2020.105893
    [45] C. Li, X. Qin, X. Xu, D. Yang, G. Wei, Scalable Graph Convolutional Networks with Fast Localized Spectral Filter for Directed Graphs, IEEE Access, 8 (2020), 105634–105644. https://doi.org/10.1109/ACCESS.2020.2999520 doi: 10.1109/ACCESS.2020.2999520
    [46] S. Abu-El-Haija, A. Kapoor, B. Perozzi, J. Lee, N-GCN: Multi-scale Graph Convolution for Semi-supervised Node Classification, in Proc. Conf. Uncertainty in Artif. Intell., (2019), 841–851.
    [47] S. Wan, C. Gong, P. Zhong, B. Du, L. Zhang, J. Yang, Multiscale Dynamic Graph Convolutional Network for Hyperspectral Image Classification, IEEE Trans. Geosci. Remote. Sens., 58 (2020), 3162–3177. https://doi.org/10.1109/TGRS.2019.2949180 doi: 10.1109/TGRS.2019.2949180
    [48] R. Liao, Z. Zhao, R. Urtasun, R. S. Zemel, LanczosNet: Multi-Scale Deep Graph Convolutional Networks, arXiv preprint, (2019), arXiv: 1901.01484. Available from: https://openreview.net/pdf?id=BkedznAqKQ
    [49] S. Luan, M. Zhao, X. Chang, D. Precup, Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks, in Proc. Conf. Workshop on Neural Inform. Process. Syst., 32 (2019), 10943–10953. Available from: https://proceedings.neurips.cc/paper_files/paper/2019/file/ccdf3864e2fa9089f9eca4fc7a48ea0a-Paper.pdf
    [50] F. Manessi, A. Rozza, M. Manzo, Dynamic Graph Convolutional Networks, Pattern Recogn., 97 (2020), 107000. https://doi.org/10.1016/j.patcog.2019.107000 doi: 10.1016/j.patcog.2019.107000
    [51] A. Pareja, G. Domeniconi, J. Chen, T. Ma, T. Suzumura, EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs, in Proc. Int. Joint Conf. Artif. Intell., (2020), 5363–5370. https://doi.org/10.1609/aaai.v34i04.5984
    [52] Z. Qiu, K. Qiu, J. Fu, D. Fu, DGCN: Dynamic Graph Convolutional Network for Efficient Multi-Person Pose Estimation, in Proc. Int. Joint Conf. Artif. Intell., (2020), 11924–11931. https://doi.org/10.1609/aaai.v34i07.6867
    [53] T. Song, Z. Cui, Y. Wang, W. Zheng, Q. Ji, Dynamic Probabilistic Graph Convolution for Facial Action Unit Intensity Estimation, in Proc. IEEE Conf. Comput. Vision Pattern Recogn., (2021), 4845–4854. https://doi.org/10.1109/CVPR46437.2021.00481
    [54] M. S. Schlichtkrull, T. N. Kipf, P. Bloem, R. Berg, I. Titov, M. Welling, Modeling Relational Data with Graph Convolutional Networks, In The Semantic Web: 15th Int. Conf., ESWC 2018, Heraklion, Crete, Greece, June 3–7, 2018, 593–607. https://doi.org/10.1007/978-3-319-93417-4_38
    [55] Z. Huang, X. Li, Y. Ye, M. K. Ng, MR-GCN: Multi-Relational Graph Convolutional Networks based on Generalized Tensor Product, in Proc. Int. Joint Conf. Artif. Intell., (2020), 1258–1264. https://doi.org/10.24963/ijcai.2020/175
    [56] J. Chen, L. Pan, Z. Wei, X. Wang, C. W. Ngo, T. S. Chua, Zero-Shot Ingredient Recognition by Multi-Relational Graph Convolutional Network, in Proc. Int. Joint Conf. Artif. Intell., 34 (2020), 10542–10550. https://doi.org/10.1609/aaai.v34i07.6626
    [57] P. Gopalan, S. Gerrish, M. Freedman, D. Blei, D. Mimno, Scalable inference of overlapping communities, in Proc. Conf. Workshop on Neural Inform. Process. Syst., (2012), 2249–2257.
    [58] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, S. Süsstrunk, SLIC superpixels compared to state-of-the-art superpixel methods, IEEE Trans. Pattern Anal. Mach. Intell., 34 (2012), 2274–2282. https://doi.org/10.1109/TPAMI.2012.120 doi: 10.1109/TPAMI.2012.120
    [59] W. Zheng, P. Jing, Q. Xu, Action Recognition Based on Spatial Temporal Graph Convolutional Networks, in Proc. 3rd Int. Conf. Comput. Sci. Appl. Eng., 118 (2019), 1–5. https://doi.org/10.1145/3331453.3361651
    [60] D. Tian, Z. Lu, X. Chen, L. Ma, An attentional spatial temporal graph convolutional network with co-occurrence feature learning for action recognition, Multimed. Tools Appl., 79 (2020), 12679–12697. https://doi.org/10.1007/s11042-020-08611-4 doi: 10.1007/s11042-020-08611-4
    [61] Y. Chen, G. Ma, C. Yuan, B. Li, H. Zhang, F. Wang, et al., Graph convolutional network with structure pooling and joint-wise channel attention for action recognition, Pattern Recogn., 103 (2020), 107321. https://doi.org/10.1016/j.patcog.2020.107321 doi: 10.1016/j.patcog.2020.107321
    [62] J. Dong, Y. Gao, H. J. Lee, H. Zhou, Y. Yao, Z. Fang, et al., Action Recognition Based on the Fusion of Graph Convolutional Networks with High Order Features, Appl. Sci., 10 (2020), 1482. https://doi.org/10.3390/app10041482 doi: 10.3390/app10041482
    [63] Z. Chen, S. Li, B. Yang, Q. Li, H. Liu, Multi-Scale Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition, in Proc. Int. Joint Conf. Artif. Intell., 35 (2021), 1113–1122. https://doi.org/10.1609/aaai.v35i2.16197
    [64] Y. Bin, Z. Chen, X. Wei, X. Chen, C. Gao, N. Sang, Structure-aware human pose estimation with graph convolutional networks, Pattern Recogn., 106 (2020), 107410. https://doi.org/10.1016/j.patcog.2020.107410 doi: 10.1016/j.patcog.2020.107410
    [65] R. Wang, C. Huang, X. Wang, Global Relation Reasoning Graph Convolutional Networks for Human Pose Estimation, IEEE Access, 8 (2020), 38472–38480. https://doi.org/10.1109/ACCESS.2020.2973039 doi: 10.1109/ACCESS.2020.2973039
    [66] T. Sofianos, A. Sampieri, L. Franco, F. Galasso, Space-Time-Separable Graph Convolutional Network for Pose Forecasting, in Proc. IEEE/ICCV Int. Conf. Comput. Vision, (2021), 11209–11218. https://doi.org/10.48550/arXiv.2110.04573
    [67] Z. Zou, W. Tang, Modulated Graph Convolutional Network for 3D Human Pose Estimation, in Proc. ICCV, (2021), 11457–11467. https://doi.org/10.1109/ICCV48922.2021.01128
    [68] B. Yu, H. Yin, Z. Zhu, Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting, in Proc. Int. Joint Conf. Artif. Intell., (2018), 3634–3640. https://doi.org/10.24963/ijcai.2018/505
    [69] Y. Han, S. Wang, Y. Ren, C. Wang, P. Gao, G. Chen, Predicting Station-Level Short-Term Passenger Flow in a Citywide Metro Network Using Spatiotemporal Graph Convolutional Neural Networks, ISPRS Int. J. Geo-Inform., 8 (2019), 243. https://doi.org/10.3390/ijgi8060243 doi: 10.3390/ijgi8060243
    [70] B. Zhao, X. Gao, J. Liu, J. Zhao, C. Xu, Spatiotemporal Data Fusion in Graph Convolutional Networks for Traffic Prediction, IEEE Access, 8 (2020), 76632–76641. https://doi.org/10.1109/ACCESS.2020.2989443 doi: 10.1109/ACCESS.2020.2989443
    [71] L. Ge, H. Li, J. Liu, A. Zhou, Temporal Graph Convolutional Networks for Traffic Speed Prediction Considering External Factors, in Proc. Int. Conf. Mobile Data Manag., (2019), 234–242. https://doi.org/10.1109/MDM.2019.00-52
    [72] L. Ge, S. Li, Y. Wang, F. Chang, K. Wu, Global Spatial-Temporal Graph Convolutional Network for Urban Traffic Speed Prediction, Appl. Sci.-basel, 10 (2020), 1509. https://doi.org/10.3390/app10041509 doi: 10.3390/app10041509
    [73] P. Han, P. Yang, P. Zhao, S. Shang, Y. Liu, J. Zhou, et al., GCN-MF: Disease-Gene Association Identification by Graph Convolutional Networks and Matrix Factorization, Knowl. Disc. Data Min., (2019), 705–713. https://doi.org/10.1145/3292500.3330912 doi: 10.1145/3292500.3330912
    [74] J. Li, Z. Li, R. Nie, Z. You, W. Bao, FCGCNMDA: predicting miRNA-disease associations by applying fully connected graph convolutional networks, Mol. Genet. Genom., 295 (2020), 1197–1209. https://doi.org/10.1007/s00438-020-01693-7 doi: 10.1007/s00438-020-01693-7
    [75] L. Wang, Z. You, Y. Li, K. Zhang, Y. Huang, GCNCDA: A new method for predicting circRNA-disease associations based on Graph Convolutional Network Algorithm, PLoS Comput. Biol., 16 (2020), e1007568. https://doi.org/10.1371/journal.pcbi.1007568 doi: 10.1371/journal.pcbi.1007568
    [76] C. Wang, J. Guo, N. Zhao, Y. Liu, X. Liu, G. Liu, et al., A Cancer Survival Prediction Method Based on Graph Convolutional Network, IEEE Trans. NanoBiosci., 19 (2019), 117–126. https://doi.org/10.1109/TNB.2019.2936398 doi: 10.1109/TNB.2019.2936398
    [77] H. Chen, F. Zhuang, L. Xiao, L. Ma, H. Liu, R. Zhang, et al., AMA-GCN: Adaptive Multi-layer Aggregation Graph Convolutional Network for Disease Prediction, in Proc. IJCAI, (2021), 2235–2241. https://doi.org/10.24963/ijcai.2021/308
    [78] K. Gopinath, C. Desrosiers, H. Lombaert, Learnable Pooling in Graph Convolutional Networks for Brain Surface Analysis, IEEE Trans. Pattern Anal. Mach. Intell., 44 (2022), 864–876. https://doi.org/10.1109/TPAMI.2020.3028391 doi: 10.1109/TPAMI.2020.3028391
    [79] R. Ying, R. He, K. Chen, Graph Convolutional Neural Networks for Web-Scale Recommender Systems, in Proc. Knowl. Disc. Data Min., (2018), 974–983. https://doi.org/10.1145/3219819.3219890
    [80] X. Xia, H. Yin, J. Yu, Q. Wang, L. Cui, X. Zhang, Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation, in Proc. Int. Joint Conf. Artif. Intell., 35 (2021), 4503–4511. https://doi.org/10.1609/aaai.v35i5.16578
    [81] H. Chen, L. Wang, Y. Lin, C. Yeh, F. Wang, H. Yang, Structured Graph Convolutional Networks with Stochastic Masks for Recommender Systems, in Proc. SIGIR, (2021), 614–623. https://doi.org/10.1145/3404835.3462868
    [82] L. Chen, Y. Xie, Z. Zheng, H. Zheng, J. Xie, Friend Recommendation Based on Multi-Social Graph Convolutional Network, IEEE Access, 8 (2020), 43618–43629. https://doi.org/10.1109/ACCESS.2020.2977407 doi: 10.1109/ACCESS.2020.2977407
    [83] T. Zhong, S. Zhang, F. Zhou, K. Zhang, G. Trajcevski, J. Wu, Hybrid graph convolutional networks with multi-head attention for location recommendation, World Wide Web, 23 (2020), 3125–3151. https://doi.org/10.1007/s11280-020-00824-9 doi: 10.1007/s11280-020-00824-9
    [84] T. H. Nguyen, R. Grishman, Graph Convolutional Networks with Argument-Aware Pooling for Event Detection, in Proc. AAAI Confer. Artif. Intell., 32 (2018). https://doi.org/10.1609/aaai.v32i1.12039
    [85] Z. Guo, Y. Zhang, W. Lu, Attention Guided Graph Convolutional Networks for Relation Extraction, Ann. Meet. Assoc. Comput. Linguist., (2019), 241–251. https://doi.org/10.18653/v1/P19-1024 doi: 10.18653/v1/P19-1024
    [86] Y. Hong, Y. Liu, S. Yang, K. Zhang, A. Wen, J. Hu, Improving Graph Convolutional Networks Based on Relation-Aware Attention for End-to-End Relation Extraction, IEEE Access, 8 (2020), 51315–51323. https://doi.org/10.1109/ACCESS.2020.2980859 doi: 10.1109/ACCESS.2020.2980859
    [87] Z. Meng, S. Tian, L. Yu, Y. Lv, Joint extraction of entities and relations based on character graph convolutional network and Multi-Head Self-Attention Mechanism, J. Exp. Theor. Artif. Intell., 33 (2021), 349–362. https://doi.org/10.1080/0952813X.2020.1744198 doi: 10.1080/0952813X.2020.1744198
    [88] L. Yao, C. Mao, Y. Luo, Graph Convolutional Networks for Text Classification, Artif. Intell., (2019), 7370–7377. https://doi.org/10.1609/aaai.v33i01.33017370 doi: 10.1609/aaai.v33i01.33017370
    [89] M. Chandra, D. Ganguly, P. Mitra, B. Pal, J. Thomas, NIP-GCN: An Augmented Graph Convolutional Network with Node Interaction Patterns, in Proc. SIGIR, (2021), 2242–2246. https://doi.org/10.1145/3404835.3463082
    [90] L. Xiao, X. Hu, Y. Chen, Y. Xue, D. Gu, B. Chen, et al., Targeted Sentiment Classification Based on Attentional Encoding and Graph Convolutional Networks, Appl. Sci., 10 (2020), 957. https://doi.org/10.3390/app10030957 doi: 10.3390/app10030957
    [91] P. Zhao, L. Hou, O. Wu, Modeling sentiment dependencies with graph convolutional networks for aspect-level sentiment classification, Knowl.-Based Syst., 193 (2020), 105443. https://doi.org/10.1016/j.knosys.2019.105443 doi: 10.1016/j.knosys.2019.105443
    [92] S. Jiang, Q. Chen, X. Liu, B. Hu, L. Zhang, Multi-hop Graph Convolutional Network with High-order Chebyshev Approximation for Text Reasoning, arXiv preprint, (2021), arXiv: 2106.05221. https://doi.org/10.18653/v1/2021.acl-long.513
    [93] R. Li, H. Chen, F. Feng, Z. Ma, X. Wang, E. Hovy, Dual Graph Convolutional Networks for Aspect-based Sentiment Analysis, in Proc. 59 Ann. Meet. Assoc. Comput. Linguist. And 11th Int. joint Conf. Nat. Language process., 1 (2021), 6319–6329.
    [94] L. Lv, J. Cheng, N. Peng, M. Fan, D. Zhao, J. Zhang, Auto-encoder based Graph Convolutional Networks for Online Financial Anti-fraud, IEEE Comput. Intell. Financ. Eng. Econ., (2019), 1–6. https://doi.org/10.1109/CIFEr.2019.8759109 doi: 10.1109/CIFEr.2019.8759109
    [95] C. Li, D. Goldwasser, Encoding Social Information with Graph Convolutional Networks for Political Perspective Detection in News Media, in Proc. 57th Ann. Meet. Assoc. Comput. Linguist., (2019), 2594–2604. https://doi.org/10.18653/v1/p19-1247
    [96] Y. Sun, T. He, J. Hu, H. Hang, B. Chen, Socially-Aware Graph Convolutional Network for Human Trajectory Prediction, in 2019 IEEE 3rd Inf. Technol. Network. Electron. Autom. Control Conf. (ITNEC), (2019), 325–333. https://doi.org/10.1109/ITNEC.2019.8729387
    [97] J. Chen, J. Li, M. Ahmed, J. Pang, M. Lu, X. Sun, Next Location Prediction with a Graph Convolutional Network Based on a Seq2seq Framework, KSII Trans. Internet Inf. Syst., 14 (2020), 1909–1928. https://doi.org/10.3837/tiis.2020.05.003 doi: 10.3837/tiis.2020.05.003
    [98] X. Li, Y. Xin, C. Zhao, Y. Yang, Y. Chen, Graph Convolutional Networks for Privacy Metrics in Online Social Networks, Appl. Sci.-Basel, 10 (2020), 1327. https://doi.org/10.3390/app10041327 doi: 10.3390/app10041327
  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)