
A full convolutional network based on DenseNet for remote sensing scene classification

  • Convolutional neural networks (CNNs) applied to remote sensing scene classification have two common problems. One is that these models have a large number of parameters, which easily causes over-fitting. The other is that the networks are not deep enough, so more abstract semantic information cannot be extracted. To solve these two problems, we propose a simple and efficient full convolutional network based on DenseNet for remote sensing scene classification. We construct a small number of convolutional kernels to generate a large number of reusable feature maps by dense connections, which makes the network deeper but does not increase the number of parameters significantly. Our network has more than 100 layers, yet it has only about 7 million parameters, far fewer than VGG. Then we incorporate an adaptive average 3D pooling operation in our network. This operation reduces feature maps of size 7 × 7 from the last DenseBlock to 1 × 1 and decreases the number of channels from 1024 to 512, so the whole network can accept input images of different sizes. Furthermore, we use a convolutional layer instead of the fully connected layer that is usually used as a classifier, so that the output features of the network can be classified without a flattening operation, which simplifies classification. Finally, a good model is trained by exploiting pre-trained weights and data augmentation. Compared with several state-of-the-art algorithms, our algorithm improves classification performance significantly on the UCM, AID, OPTIMAL-31, and NWPU-RESISC45 datasets.

    Citation: Jianming Zhang , Chaoquan Lu , Xudong Li , Hye-Jin Kim, Jin Wang. A full convolutional network based on DenseNet for remote sensing scene classification[J]. Mathematical Biosciences and Engineering, 2019, 16(5): 3345-3367. doi: 10.3934/mbe.2019167



    Estimating urban traffic flow is a long-standing concern and a core issue in addressing traffic congestion [1,2,3]. As urban traffic demand increasingly exceeds the capacity of existing infrastructure, merely expanding roadways is inadequate for mitigating congestion [4,5,6]. Instead, more sustainable and efficient solutions depend on the strategic planning and management of the current network [7]. Central to this effort is traffic assignment, which determines the flow distribution pattern in a given network [8,9,10]. Accurate assignment relies on understanding travelers' route-choice behavior [11,12], since individual decisions shape the overall traffic flow distribution [13,14]. User Equilibrium (UE), based on Wardrop's first principle [15], provides the theoretical framework: in equilibrium, no traveler can unilaterally switch routes to reduce their travel cost. The UE traffic assignment problem (UE-TAP) can be formulated as a convex nonlinear program [16] and solved by the Frank-Wolfe algorithm [17]. However, owing to the classical method's high complexity and limited efficiency, three main streams of solution algorithms have emerged: link-based algorithms [18,19,20], bush-based methods [21,22,23], and path-based algorithms [24,25,26]. More recently, parallel computing strategies have been explored to enhance both efficiency and scalability in solving UE-TAP [27,28,29,30,31].

    In addition to the above mechanism-based methods, data-driven deep learning methods have been increasingly adopted in transportation research, demonstrating significant potential for addressing complex and nonlinear traffic problems [32,33,34,35,36,37,38,39,40,41,42]. Rahman and Hasan [43] employed the graph convolutional network (GCN) to address stochastic flow diffusion under traffic randomness. However, capturing flow patterns purely from observations is constrained by data availability and quality, resulting in limited generalization to new networks. To address this limitation, physical information such as UE theory can be embedded directly in the learning process to enhance behavior realism [44,45]. For example, Fang et al. [46] proposed a long short-term memory (LSTM)-based traffic assignment model that retains the initialization step of the Frank-Wolfe algorithm.

    This study offers a different perspective on the generalization challenge, especially across different network structures. Liu et al. [47] explored out-of-distribution (OOD) challenges by testing their model on networks with added links or adjusted capacities; however, these networks still share broadly similar topologies. The central question we address is how to design a UE-learning model that, once trained, can be applied to networks with significantly different node and link structures. If this generalization can be achieved, the dependence on data availability and accessibility could be significantly reduced in future studies. To the best of our knowledge, this study is the first to develop an end-to-end learning-based framework for user equilibrium prediction under variable network topologies. Compared to previous work [48], our approach models the propagation of node information across the graph while preserving both node and edge attributes, thereby capturing the underlying network topology more comprehensively. In addition to origin-destination (OD) demand, our model also incorporates link attributes and network topology as inputs, enabling a more detailed representation of traffic assignment dynamics. Table 1 summarizes recent machine learning-based UE-TAP literature, highlighting the innovation of this study.

    Table 1.  Recent UE-TAP solved by machine learning.

    | Literature | Method | Test network | Application scenario: road failure | Application scenario: topology change |
    |---|---|---|---|---|
    | Fang, Cheng [46] | LSTM | SF, Winnipeg | × | × |
    | Fan, Tang [49] | CNN | Chicago, Dazhou | × | × |
    | Liu, Yin [47] | DNN | SF | √ | × |
    | Rahman and Hasan [43] | GCN | SF, EMA | × | × |
    | Hu and Xie [50] | GAT | SF, EMA, Anaheim, Barcelona, Winnipeg | × | × |
    | Liu and Yin [48] | NN + VI formulation | Braess, SF, Chicago | × | × |
    | This paper | GCN | SF, EMA | √ | √ |

    Notes. √ denotes applicable and × denotes not applicable. SF is the abbreviation for Sioux-Falls and EMA is the abbreviation for Eastern Massachusetts.


    More specifically, the main contributions of this study are as follows:

    (1) Existing research has typically relied on classical optimization methods or data-driven approaches that lack behavioral interpretability to address the UE-TAP. This study introduces a GCN-based model that explicitly encodes the travelers' cost-minimizing behavior into its learning framework and trains it to directly predict equilibrium traffic flows from OD demand inputs, thereby bypassing the need for iterative optimization.

    (2) Previous studies primarily assume fixed network topologies, limiting the applicability of these methods when faced with changes or disruptions in the network structure. This study utilizes network partitioning and subgraph training, explicitly incorporating network attributes such as free-flow travel time and link capacity, and employing a variable adjacency matrix as input, thereby making the model adapt dynamically to varying network topologies and significantly improving model robustness.

    (3) This study generates and utilizes multiple randomized datasets based on two widely used benchmark networks: SF and EMA. Extensive numerical experiments demonstrate the proposed model's robustness and generalizability across different network scales and topologies. Notably, a model pretrained on data-rich networks can be effectively transferred to improve training efficiency and accuracy on networks with limited data, highlighting the potential of transferable learning in traffic assignment modeling.

    The remainder of this study is organized as follows. Section 2 introduces the UE-TAP formulation and the proposed model. In Section 3, the methodology of the proposed model is described. Subsequently, Section 4 introduces the study network and the results obtained using the proposed model. Finally, Section 5 concludes the paper and outlines directions for future research.

    The equilibrium flow pattern throughout a traffic network can be seen as the outcome of traffic assignment. In this section, we define the UE-TAP and establish its role in obtaining equilibrium flows. Since real-world link flows are often difficult to obtain, we first employ the traditional optimization-based UE model to compute equilibrium flows as a reliable approximation (Section 2.1). We then leverage a deep learning-based approach to learn the underlying mapping between the input attributes and the equilibrium flow distribution (Section 2.2). Table 2 summarizes the notation used in this study.

    Table 2.  Problem notations.

    | Notation | Description |
    |---|---|
    | UE-TAP | |
    | Sets: | |
    | $N$ | Set of nodes of the road network, where $n \in N$. |
    | $M$ | Set of links of the road network, where $m \in M$. |
    | $O$ | Set of OD pairs, where $o \in O$. |
    | $K_o$ | Set of paths between OD pair $o$, where $k \in K_o$. |
    | Parameters: | |
    | $q_o$ | Total demand for OD pair $o$, where $o \in O$. |
    | $f_m$ | Traffic flows on the road link $m$, where $m \in M$. |
    | $t_m$ | Travel time on the road link $m$, where $m \in M$. |
    | Decision variables: | |
    | $\delta_{m,k}$ | Binary indicator: 1 if link $m$ belongs to path $k$; 0 otherwise. |
    | $f_k^o$ | Traffic flows on the path $k \in K_o$ for OD pair $o \in O$. |
    | GCN | |
    | $G$ | Road network. |
    | $Q$ | OD demand matrix. |
    | $A$ | Adjacency matrix of the road network. |
    | $A$ | Adjacency matrix for road segments. |
    | $\tilde{A}$ | Adjacency matrix augmented with the identity matrix to incorporate self-loops. |
    | $\tilde{D}$ | Degree matrix of $\tilde{A}$. |
    | $W^{l}$ | Trainable weight matrix at layer $l$. |
    | $\theta$ | Set of all learnable parameters. |
    | $h(\cdot)$ | Propagation function. |
    | $\sigma(\cdot)$ | Activation function. |
    | $y$ | Observed equilibrium flows. |
    | $\hat{y}$ | Predicted equilibrium flows. |
    | Data Generation | |
    | $\mu$ | Scaling factor for demand generation, $\mu \sim U(0.1, 1)$. |
    | $\eta$ | Scaling factor for capacity generation, $\eta \sim U(0.1, 1)$. |


    The road network is modeled as a directed graph $G(N, M, A)$, where each element of $A$ equals 1 when the two nodes in the road network are adjacent and 0 otherwise. We illustrate an example in Figure 1, consisting of four nodes and five links. For travelers who travel from node 1 to node 4 (OD pair $o$), $K_o = \{0, 1, 2\}$ (path 0 denotes nodes 1–2–4, path 1 denotes nodes 1–4, and path 2 denotes nodes 1–3–4). The total demand of this OD pair is denoted as $q_o$, with $f_0^o$, $f_1^o$, and $f_2^o$ representing the traffic flows on paths 0, 1, and 2, respectively. In this graph, we denote the traffic flow on link 1 as $f_1$ and the travel time on this link as $t_1$. We represent the formal UE formulation using Beckmann's transformation in Eqs (1)–(5).

    Figure 1.  Example of a directed graph for describing the UE-TAP.
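    $$\min \sum_{m \in M} \int_{0}^{f_m} t_m(x)\,\mathrm{d}x \tag{1}$$

    Subject to:

    $$f_m = \sum_{o \in O} \sum_{k \in K_o} \delta_{m,k} f_k^o, \quad \forall m \in M \tag{2}$$
    $$\sum_{k \in K_o} f_k^o = q_o, \quad \forall o \in O \tag{3}$$
    $$f_k^o \ge 0, \quad \forall o \in O, \ \forall k \in K_o \tag{4}$$
    $$\delta_{m,k} \in \{0, 1\}, \quad \forall m \in M, \ \forall k \in K_o \tag{5}$$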

    where Eq (1) is the convex objective, Eq (2) is the flow conservation constraint (traffic flow on links is determined by path flows), Eq (3) is the OD demand constraint ensuring that all demand is assigned, and Eq (4) is the nonnegativity constraint. Finally, the binary indicators in Eq (5) define the association between links and paths, indicating whether a specific link belongs to a given path. We then use a traditional algorithm (e.g., the Frank-Wolfe algorithm) to iteratively update flows until convergence, at which point the solution satisfies:

    $$t_m(f_m) \le t_k, \quad \forall m \in M, \ \forall k \in K_o \tag{6}$$

    Equation (6) means that all paths have equal travel time for each OD pair and no traveler can reduce their travel time by switching routes. Thus, the UE condition is satisfied, and we get the optimal equilibrium flows of our UE-TAP.
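
    As a minimal sketch of how Eqs (1)–(5) can be solved in practice, the following Python snippet runs a Frank-Wolfe loop on the three-path network of Figure 1. The BPR link-cost function, the free-flow times, the capacities, and the demand value are illustrative assumptions and are not taken from this study.

```python
# Illustrative Frank-Wolfe sketch for the Figure 1 network (all numbers are assumed).
LINKS = [(1, 2), (2, 4), (1, 4), (1, 3), (3, 4)]
PATHS = [[0, 1], [2], [3, 4]]            # link indices of paths 0, 1, 2
T0 = [2.0, 2.0, 5.0, 3.0, 2.0]           # assumed free-flow travel times
CAP = [40.0, 40.0, 30.0, 50.0, 50.0]     # assumed link capacities
DEMAND = 100.0                           # assumed demand q_o from node 1 to node 4


def link_time(m, f):
    """Travel time t_m(f) on link m; the BPR form is an assumption."""
    return T0[m] * (1.0 + 0.15 * (f / CAP[m]) ** 4)


def path_times(flows):
    return [sum(link_time(m, flows[m]) for m in path) for path in PATHS]


def all_or_nothing(flows):
    """Load the whole demand onto the currently cheapest path."""
    times = path_times(flows)
    best = times.index(min(times))
    target = [0.0] * len(LINKS)
    for m in PATHS[best]:
        target[m] = DEMAND
    return target


def frank_wolfe(iterations=2000):
    flows = all_or_nothing([0.0] * len(LINKS))
    for _ in range(iterations):
        target = all_or_nothing(flows)
        direction = [y - x for x, y in zip(flows, target)]
        # Bisection on the derivative of the Beckmann objective along the direction.
        lo, hi = 0.0, 1.0
        for _ in range(50):
            mid = 0.5 * (lo + hi)
            slope = sum(d * link_time(m, flows[m] + mid * d)
                        for m, d in enumerate(direction))
            if slope > 0:
                hi = mid
            else:
                lo = mid
        step = 0.5 * (lo + hi)
        flows = [x + step * d for x, d in zip(flows, direction)]
    return flows


flows = frank_wolfe()
print("link flows:", [round(f, 2) for f in flows])
print("path times:", [round(t, 3) for t in path_times(flows)])
```

    With these assumed inputs all three paths carry flow, so their travel times should be nearly equal at convergence, which is the UE condition described above.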

    Traditional optimization-based methods are computationally expensive and inefficient when applied to large-scale transportation networks. We therefore propose a data-driven method for solving the UE-TAP using deep learning. The goal is to develop a model that can efficiently learn the mapping between OD demand and equilibrium traffic flows in the network, thereby reducing computational time while maintaining high accuracy. We represent the problem as:

    $$\hat{y} = f(Q, A) \tag{7}$$

    Here, $f(\cdot)$ is the GCN-based model used in this study, which learns the relationship between the input data (OD demand matrix $Q$ and adjacency matrix $A$) and the output $\hat{y}$ (the predicted set of equilibrium flows).

    Given OD demand and link attribute information, we apply a GCN [51] to estimate the equilibrium traffic flow distribution. We first present the basic GCN model for a fixed network in Section 3.1 and then transfer this approach to variable network topologies in Section 3.2. The whole workflow of this study is shown in Figure 2.

    Figure 2.  Workflow of the proposed GCN-based model.

    GCN relies on message passing and neighbor aggregation to capture spatial dependencies in graph-structured data. In the context of UE-TAP, GCN is used to estimate the equilibrium traffic flows on each road segment, given a specific travel demand distribution. The hidden representation at layer l+1 in the GCN can be expressed as:

    $$H^{l+1} = h(A, H^{l}, W^{l}) \tag{8}$$

    where $H^{l}$ is the feature matrix at layer $l$, $W^{l}$ is the trainable weight matrix at layer $l$, and $h(\cdot)$ is the propagation function that aggregates information from neighboring nodes.

    The GCN propagation rule follows:

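    $$H^{l+1} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{l} W^{l}\right) \tag{9}$$

    where $\tilde{A} = A + I$ denotes the adjacency matrix augmented with the identity matrix to incorporate self-loops, $\tilde{D}$ is the degree matrix of $\tilde{A}$, and $\sigma(\cdot)$ is the activation function.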
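
    The propagation step can be illustrated with a short NumPy sketch. It assumes the symmetric normalization of the standard GCN formulation [51]; the example graph, the feature sizes, and the random weights are placeholders rather than the configuration used in this study.

```python
import numpy as np


def normalized_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2}, where D~ holds the row sums of A + I."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]


def gcn_layer(A, H, W, activation=None):
    """One propagation step; activation=None gives a linear (regression) output."""
    Z = normalized_adjacency(A) @ H @ W
    return Z if activation is None else activation(Z)


def relu(Z):
    return np.maximum(Z, 0.0)


# Undirected 4-node example with random features and weights (placeholders).
A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [1, 1, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
H0 = rng.normal(size=(4, 4))   # one feature row per node
W0 = rng.normal(size=(4, 8))   # maps 4 input features to 8 hidden features
H1 = gcn_layer(A, H0, W0, activation=relu)
print(H1.shape)  # (4, 8)
```

    Each row of the output mixes a node's own features with those of its neighbors, weighted by the normalized adjacency matrix.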

    In our UE-TAP, we predict equilibrium traffic flows, so no activation function is applied to the last layer. Instead, we use a regression-based approach, and the predicted equilibrium flow $\hat{y}$ is given by:

    $$\hat{y} = f(Q, A; \theta) \tag{10}$$

    where $\theta$ denotes the set of all learnable parameters in the GCN-based model, including the weights and biases of both the graph convolution layers and the fully connected layers.

    The model is trained using mean squared error (MSE) to minimize the difference between the predicted and actual equilibrium flows, with loss function:

    $$\mathrm{Loss} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 \tag{11}$$

    In Figure 2(a), OD demand features are passed through GCN to derive new node features that incorporate graph structure information. When the topology of the traffic network is fixed, the focus is solely on the network itself, without the need to consider additional information. The model takes the OD demand between nodes, the adjacency matrix A of nodes, and link attributes as input.

    In this study, the goal is to obtain the approximate equilibrium flow on each edge; therefore, node features must be converted into edge features in the feature processing stage. The transformation process follows these rules: the features of two nodes connected by an edge are concatenated, and feature fusion is performed by combining the edge features with additional edge information. After processing through the GCN, the edge features encapsulate node information, structural details, and intrinsic attributes of the edge. Finally, after passing through the fully connected layer, the model predicts the equilibrium flows for the traffic network.
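
    The node-to-edge conversion described above can be sketched as follows. The function name `edge_features`, the 0-based node indices, and the array sizes are illustrative assumptions; the sizes are chosen so that two 64-dimensional node embeddings plus a 32-dimensional link embedding give 160 features per edge, which would be consistent with the 160-dimensional prediction input reported in the experiments.

```python
import numpy as np


def edge_features(node_emb, links, link_attrs):
    """Concatenate tail-node, head-node, and link embeddings for every link.

    node_emb:   (num_nodes, d_node) array of node embeddings from the GCN
    links:      list of (tail, head) node indices, 0-based (assumed convention)
    link_attrs: (num_links, d_link) array of embedded link attributes
    """
    tails = np.array([tail for tail, _ in links])
    heads = np.array([head for _, head in links])
    return np.concatenate([node_emb[tails], node_emb[heads], link_attrs], axis=1)


# Five links of a four-node network with placeholder embeddings.
links = [(0, 1), (1, 3), (0, 3), (0, 2), (2, 3)]
node_emb = np.ones((4, 64))
link_attrs = np.zeros((5, 32))
fused = edge_features(node_emb, links, link_attrs)
print(fused.shape)  # (5, 160)
```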

    The GCN-based traffic assignment model proposed in Section 3.1 cannot be directly applied to variable topologies, as deep learning models typically require fixed input and output dimensions. To overcome this limitation, we introduce traffic network partitioning, as illustrated in Figure 2(b), enabling our approach to be adaptable and transferable across different traffic networks. Once trained, the model can be fine-tuned and applied to new networks while maintaining predictive accuracy.

    The road network is initially partitioned into subnetworks, each containing an equal number of edges. If a subnetwork has fewer edges than required, a zero-padding operation is applied to maintain a consistent input dimension. The different colors of the arrows in the subsequent figures represent distinct subnetworks identified through the partitioning process, helping to visualize the structural segmentation more clearly. The partitioned subnetworks are then used to train the traffic assignment model. For a new traffic network, we only need to partition the network into subnetworks with an equal number of edges and either fine-tune the trained model or train a new deep learning model, ensuring predictive consistency in UE-TAP.
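
    A minimal sketch of the zero-padding step is given below. The helper name `pad_subnetwork` and the returned validity mask are assumptions added for illustration; the text above only specifies that subnetworks with too few edges are padded with zeros.

```python
import numpy as np


def pad_subnetwork(features, n_edges):
    """Zero-pad a (k, d) edge-feature array to (n_edges, d).

    Also returns a boolean mask marking the real (non-padded) edges; the mask
    is an illustrative addition, not part of the described method.
    """
    k, d = features.shape
    if k > n_edges:
        raise ValueError("subnetwork has more edges than the fixed size")
    padded = np.zeros((n_edges, d), dtype=features.dtype)
    padded[:k] = features
    mask = np.zeros(n_edges, dtype=bool)
    mask[:k] = True
    return padded, mask


# A subnetwork with 3 edges and 5 features per edge, padded to 4 edges.
sub = np.arange(15, dtype=float).reshape(3, 5)
padded, mask = pad_subnetwork(sub, n_edges=4)
print(padded.shape, mask.tolist())  # (4, 5) [True, True, True, False]
```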

    (1) Traffic network partitioning based on dual graph representation

    The traffic network is partitioned such that adjacent edges in close proximity are grouped into the same subnetwork, ensuring each subnetwork remains a contiguous unit. To achieve this, we consider the dual graph representation of the traffic network, where the roles of edges and nodes are reversed, transforming the network into an undirected graph. The nodes in the dual graph (representing links in the original network) are then clustered into partitions, which are subsequently mapped back to the original traffic network to define subnetwork boundaries. Notably, all edges connecting the same pair of nodes are assigned to the same subnetwork to preserve network continuity.

    (2) Adjacency matrix for road segments

    To enable traffic assignment across variable topologies, we redefine the adjacency matrix to represent road segment connectivity rather than traditional node connectivity. This allows the GCN to process edge-level interactions, capturing the spatial dependencies among road segments. The adjacency matrix A for road segments is formulated as:

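    $$A(i, j) = \begin{cases} 1, & \text{if } i_s = j_r \text{ or } j_s = i_r \\ 0, & \text{otherwise} \end{cases} \tag{12}$$

    where the subscripts $r$ and $s$ denote the starting point and end point of a segment, so that $i_s = j_r$ means segment $i$ ends where segment $j$ starts.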
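
    The rule in Eq (12) can be checked on the five links of Figure 1 with a short sketch. The function name `segment_adjacency` and the link ordering are illustrative assumptions.

```python
def segment_adjacency(links):
    """Build the segment adjacency matrix of Eq (12).

    links: list of (start, end) node pairs, one per road segment.
    Entry [i][j] is 1 when segment i ends where j starts, or j ends where i starts.
    """
    n = len(links)
    A = [[0] * n for _ in range(n)]
    for i, (i_r, i_s) in enumerate(links):
        for j, (j_r, j_s) in enumerate(links):
            if i_s == j_r or j_s == i_r:
                A[i][j] = 1
    return A


# Links of the Figure 1 example: 1-2, 2-4, 1-4, 1-3, 3-4 (ordering assumed).
links = [(1, 2), (2, 4), (1, 4), (1, 3), (3, 4)]
A_seg = segment_adjacency(links)
for row in A_seg:
    print(row)
```

    In this example only the pairs (1-2, 2-4) and (1-3, 3-4) are adjacent, because the direct link 1-4 shares no intermediate node with any other segment.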

    (3) Model construction

    Unlike the fixed-topology traffic assignment model, the proposed variable-topology model directly accounts for the characteristics of road segments. To ensure input consistency across networks, we first integrate OD demand information into the road network using an "all-or-nothing" assignment [52]. For each OD pair, we assign the entire demand to the shortest path (i.e., the path with the lowest travel cost based on free-flow travel times). This process is repeated for all OD pairs, and the resulting link flows are aggregated across all shortest paths to obtain a single edge-level demand vector. The demand information, originally defined at the OD level, is thereby projected onto road segments. Each road segment is then represented by a feature vector that combines both its intrinsic attributes (link capacity, free-flow travel time, and initial link flow from the "all-or-nothing" operation) and the derived demand features (FromNode: the total outgoing demand from the tail node; ToNode: the total incoming demand to the head node). These vectors are subsequently processed through a feature embedding layer to obtain low-dimensional representations. Since this model operates at the road-segment level, a GCN layer with the segment adjacency matrix $A$ is employed to aggregate information from adjacent links. This enables the model to capture spatial dependencies among neighboring road segments, ultimately facilitating accurate traffic flow estimation under variable network topologies.
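
    The "all-or-nothing" projection of OD demand onto links can be sketched as below, using Dijkstra's algorithm on free-flow travel times. The function names, the 0-based node indices, and the example times and demands are assumptions for illustration only.

```python
import heapq


def shortest_path_links(n_nodes, links, t0, origin, dest):
    """Dijkstra on free-flow times; returns the link indices of the shortest path."""
    out = {u: [] for u in range(n_nodes)}
    for m, (u, v) in enumerate(links):
        out[u].append((v, m))
    dist = {origin: 0.0}
    prev = {}
    heap = [(0.0, origin)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dest:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, m in out[u]:
            nd = d + t0[m]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = (u, m)
                heapq.heappush(heap, (nd, v))
    path = []
    node = dest
    while node != origin:
        node, m = prev[node]  # raises KeyError if dest is unreachable
        path.append(m)
    return path[::-1]


def all_or_nothing(n_nodes, links, t0, od_demand):
    """Assign each OD demand entirely to its free-flow shortest path."""
    flows = [0.0] * len(links)
    for (o, d), q in od_demand.items():
        for m in shortest_path_links(n_nodes, links, t0, o, d):
            flows[m] += q
    return flows


# Four-node example (nodes 0..3) with assumed free-flow times and demands.
links = [(0, 1), (1, 3), (0, 3), (0, 2), (2, 3)]
t0 = [2.0, 2.0, 5.0, 3.0, 2.0]
od_demand = {(0, 3): 100.0, (0, 1): 20.0}
aon_flows = all_or_nothing(4, links, t0, od_demand)
print(aon_flows)  # [120.0, 100.0, 0.0, 0.0, 0.0]
```

    The resulting vector has one entry per link regardless of how many OD pairs exist, which is what makes it usable as a fixed-size edge-level input.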

    To evaluate the effectiveness of our proposed data-driven UE-TAP model, we constructed and utilized three datasets for model training and validation. Since the problem described in Section 2 is based on supervised learning, labeled data is essential for training the model. In this context, we employed the improved gradient projection (iGP) algorithm [26] to compute traffic equilibrium flows under varying demand conditions and road segment attributes.

    We generated datasets using two benchmark networks: the SF network in Figure 3 and the EMA network in Figure 4. A comparison of the network sizes is provided in Table 3, where the EMA network has significantly more nodes and links than the SF network, enabling evaluation under both small-scale and large-scale network scenarios.

    Figure 3.  SF network.
    Figure 4.  EMA network.
    Table 3.  Comparison of SF and EMA networks.

    | Networks | Number of nodes | Number of links |
    |---|---|---|
    | SF | 24 | 76 |
    | EMA | 74 | 258 |


    To introduce variability in OD demand and road segment capacities, we applied randomization techniques to generate a diverse dataset for training and evaluation. The baseline OD demand matrix $Q$ and the initial road segment capacities $C$ were obtained from the Transportation Networks for Research Core Team (2023). To create multiple samples, we introduced random scaling factors drawn from a uniform distribution:

    $$Q_i = \mu_i \odot Q \tag{13}$$

    where $Q_i$ denotes the randomized OD demand matrix for the $i$th sample, $\mu_i \sim U(0.1, 1)$ is the random scaling factor vector, and each OD pair $o$ has an individual scaling coefficient $\mu_o$, ensuring diverse demand patterns across samples.

    Similarly, the initial road segment capacities are randomized using a uniform distribution:

    $$C_i = \eta_i \odot C \tag{14}$$

    where $C_i$ denotes the randomized road segment capacities for the $i$th sample, $\eta_i \sim U(0.1, 1)$ is the random scaling factor vector, and each link $m$ has an individual capacity coefficient $\eta_m$, ensuring heterogeneous capacity constraints across samples.
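
    A minimal sketch of this sampling step follows, assuming the scaling in Eqs (13) and (14) is applied element-wise with an independent factor per OD pair and per link. The function name `sample_scenario` and the small baseline arrays are placeholders, not the benchmark data.

```python
import numpy as np


def sample_scenario(Q, C, rng, low=0.1, high=1.0):
    """Draw one randomized sample: element-wise scaled demand and capacity."""
    mu = rng.uniform(low, high, size=Q.shape)    # one factor per OD pair
    eta = rng.uniform(low, high, size=C.shape)   # one factor per link
    return mu * Q, eta * C


# Placeholder baseline data for a 3-node, 4-link network.
Q = np.array([[0.0, 100.0, 50.0],
              [80.0, 0.0, 30.0],
              [60.0, 40.0, 0.0]])
C = np.array([2000.0, 1500.0, 1800.0, 2200.0])
rng = np.random.default_rng(seed=0)
Q_i, C_i = sample_scenario(Q, C, rng)
print(Q_i.shape, C_i.shape)  # (3, 3) (4,)
```

    Using one shared factor for all OD pairs or links instead would correspond to the "same" setting listed for Dataset A, under the same assumption about how the factors are applied.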

    Note that the choice of a uniform distribution is not a modeling assumption but a data generation strategy. It provides a simple and unbiased way to simulate diverse traffic scenarios within realistic bounds. The proposed framework does not rely on any specific distributional form and can be adapted to empirical data or alternative distributions as they become available. Figures 5 and 6 provide examples of the OD demand matrix and the adjacency matrix in the generated dataset. The demand matrix has an N × N shape; for example, the entry 72.7 in the first row and second column in Figure 5 represents a demand of 72.7 trips from node 1 to node 2. The adjacency matrix is also N × N. Figure 7 provides an overview of road segment capacities and traffic flows, highlighting the diversity of network conditions introduced through randomization.

    Figure 5.  Illustration of demand matrix.
    Figure 6.  Illustration of adjacency matrix.
    Figure 7.  Statistics on capacity and traffic flows.

    The generated dataset is divided into three subsets as shown in Table 4, each designed to test different aspects of the proposed model.

    Table 4.  Datasets.

    | Datasets | Size | Network | OD | Capacity | Topology |
    |---|---|---|---|---|---|
    | A: Validate feasibility | 10,000 | SF | same μ | same η | fixed |
    | A: Validate feasibility | 10,000 | EMA | same μ | same η | fixed |
    | B: Test adaptability in general scenarios | 20,000 | SF | distinct μ | distinct η | fixed |
    | C: Evaluate robustness under disruptions | 100,000 | SF | distinct μ | distinct η | random link removals |


    To explore the feasibility of our proposed model in UE-TAPs, we first use Dataset A to construct a basic GCN model consisting of a single graph convolutional layer and two fully connected layers (input-output structure: (N, N) → (N, M) → (N, 1)). Before the final fully connected layer, the data must be transposed to transform node features into road segment features. Of the generated data, we use 9000 samples for training, 500 for validation, and the remaining 500 for testing. The performance indicators include mean absolute percentage error (MAPE), mean absolute error (MAE), and root mean square error (RMSE), as defined in Eqs (15)–(17).

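    $$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\% \tag{15}$$
    $$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{16}$$
    $$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{17}$$

    where $y_i$ denotes the actual equilibrium traffic flow on segment $i$, $\hat{y}_i$ represents the estimated traffic flow on segment $i$, and $n$ is the total number of road segments in the network.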
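
    For reference, the three indicators can be computed as in the following NumPy sketch. The function names and the sample flow values are illustrative, and MAPE as written assumes that no actual flow is zero.

```python
import numpy as np


def mape(y, y_hat):
    """Mean absolute percentage error in percent (requires all y != 0)."""
    return float(np.mean(np.abs((y - y_hat) / y)) * 100.0)


def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - y_hat)))


def rmse(y, y_hat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


# Made-up flows on three segments.
y = np.array([100.0, 200.0, 400.0])
y_hat = np.array([110.0, 190.0, 400.0])
print(mape(y, y_hat), mae(y, y_hat), rmse(y, y_hat))
```

    Because RMSE squares the errors before averaging, it weights large deviations on a few segments more heavily than MAE does.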

    The loss curves for the training and validation sets on the SF network are shown in Figure 8. As summarized in Table 5, experiments conducted on both SF and EMA networks demonstrate that MAPE remains stable at approximately 1%, while $R^2$ consistently reaches 0.99. The difference in RMSE and MAE between the two networks can be attributed to the larger scale and the greater number of road segments in the EMA network: once equilibrium is achieved, the estimated flow on each segment in EMA is generally lower than that in the SF network.

    Figure 8.  Training results of SF network on dataset A.
    Table 5.  Experimental results of GCNs on dataset A.

    | Networks | Epoch | Learning rate | RMSE | MAPE (%) | MAE | $R^2$ |
    |---|---|---|---|---|---|---|
    | SF | 200 | 0.01 | 57.28 | 1.14 | 40.64 | 0.9997 |
    | EMA | 200 | 0.01 | 5.19 | 0.86 | 2.27 | 0.9999 |


    Experiments conducted on this dataset demonstrate that the GCN effectively captures the distribution pattern of OD demand. However, the basic GCN struggles to integrate edge attributes, particularly when the traffic network edges contain critical information, making the simple GCN ineffective in such scenarios.

    To address this limitation, we introduce a traffic assignment model that incorporates link features. This model consists of three main components: a three-layer GCN with the architecture (N, 64) → (64, 64) → (64, 64); two fully connected layers for processing road segment features, structured as (2, 16) → (16, 32); and three fully connected layers for final prediction, structured as (160, 256) → (256, 128) → (128, 1). To evaluate model performance, we utilized Dataset B, where 90% of the samples were used for training, 5% for validation, and 5% for testing. The results of the proposed GCN-based deep learning model across varying sample sizes are summarized in Table 6. MAPE stabilized at approximately 14%, while $R^2$ remained around 0.90, indicating strong model fitting capabilities. Additionally, the loss function curves for training with 10,000 and 20,000 samples are presented in Figure 9 and Figure 10, respectively.

    Table 6.  Experimental results of GCNs on dataset B.

    | Samples | Epoch | Learning rate | RMSE | MAPE (%) | MAE | $R^2$ |
    |---|---|---|---|---|---|---|
    | 10,000 | 200 | 0.010 | 1128.90 | 14.70 | 827.34 | 0.8917 |
    | 10,000 | 200 | 0.001 | 1141.17 | 14.76 | 837.96 | 0.8894 |
    | 20,000 | 200 | 0.010 | 1105.82 | 14.41 | 809.88 | 0.8957 |
    | 20,000 | 200 | 0.001 | 1080.60 | 14.17 | 790.06 | 0.9004 |

    Figure 9.  Training results on dataset B (10,000 samples).
    Figure 10.  Training results on dataset B (20,000 samples).

    The proposed model integrates road segment features and effectively captures traffic flow variations induced by changes in link characteristics. To demonstrate the advantages of our approach, we designed two baseline models: a standard artificial neural network (ANN) and a conventional GCN. Table 7 shows the results of comparison experiments. Traditional deep learning models such as ANN are unable to capture the topological structure of the traffic network or incorporate the intrinsic attributes of road segments, making them unsuitable for solving the UE-TAP. The GCN model, while capable of modeling topological relationships, does not adequately incorporate edge-level features. Therefore, these two baselines represent a step-by-step progression toward the motivation of this study: to develop a GCN-based framework that effectively leverages both the structural and physical characteristics of traffic networks for efficient UE flow estimation. By leveraging graph-based deep learning techniques, our model effectively learns spatial dependencies and complex feature interactions, thereby delivering improved prediction accuracy in diverse traffic assignment scenarios.

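    Table 7.  Comparison experiments.

    | Models | Epoch | Learning rate | RMSE | MAPE (%) | MAE | $R^2$ |
    |---|---|---|---|---|---|---|
    | ANN | 200 | 0.01 | 2488.39 | 41.05 | 1966.58 | $7.0 \times 10^{-12}$ |
    | GCN | 200 | 0.01 | 2526.36 | 41.76 | 1999.56 | 0.4717 |
    | This model | 200 | 0.01 | 1128.90 | 14.70 | 827.34 | 0.8917 |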


    We examine the scenario of road segment failures in the traffic network and assess the model's adaptability to topological disruptions. During the data generation process, road segments were randomly removed in pairs, meaning that both roads connecting a node pair fail simultaneously. Although these failures introduce minor structural modifications, the overall network connectivity remains intact, ensuring that no isolated nodes emerge. For the experiment, we generated 100 network topologies using the random road segment failure method, with each topology producing 1000 random OD demand samples (Dataset C). The resulting randomly generated network topology is illustrated in Figure 11, where red dotted lines indicate the failed road segments.

    Figure 11.  An illustration of the SF network with random road failures (the red dashed line indicates the failure of the road segment).

    To assess the robustness of the model under road segment failures, the features of all failed road segments are replaced with zero vectors after feature fusion. The newly generated data were fed directly into the model, which had been trained on the fixed topology. Table 8 summarizes the performance of the GCN-based model across different sample sizes. As shown in this table, the model demonstrates improved adaptability to variable topologies under the current parameter settings. Additionally, increasing the training sample size significantly enhances model performance. With only 10,000 samples, the model achieves an estimated R2 of 0.73. As the sample size increases to 100,000, R2 improves to 0.85, and all performance metrics show notable improvement. Because failed road segments vary randomly in location and frequency, a small dataset does not provide sufficient training diversity for the model to generalize effectively. To further enhance estimation performance, a larger dataset should be generated to capture a wider range of failure scenarios.
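The zero-vector replacement described above can be sketched as follows. This is a minimal illustration, assuming edge features are stored in a dictionary keyed by directed segment; the helper names `fail_in_pairs` and `mask_failed_edges` are hypothetical and do not come from the article.

```python
def fail_in_pairs(node_pairs):
    """Expand failed node pairs into both directed road segments between them."""
    return [(u, v) for a, b in node_pairs for (u, v) in ((a, b), (b, a))]


def mask_failed_edges(edge_features, failed_edges):
    """Return a copy of edge_features in which failed segments carry zero vectors."""
    failed = set(failed_edges)
    return {
        edge: [0.0] * len(vec) if edge in failed else list(vec)
        for edge, vec in edge_features.items()
    }


# Hypothetical fused features for three directed segments.
features = {(1, 2): [1.0, 2.0], (2, 1): [3.0, 4.0], (2, 3): [5.0, 6.0]}
masked = mask_failed_edges(features, fail_in_pairs([(1, 2)]))
```

Zeroing the features keeps the input shape unchanged, which is why a model trained on the fixed topology can accept the disrupted samples without any architectural change.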

    Table 8.  Model performance of random road segment failure.
    Samples Epoch Learning rate RMSE MAPE (%) MAE R2
    10,000 500 0.01 2068.56 26.51 1477.84 0.7276
    100,000 500 0.01 1559.11 19.75 1115.22 0.8456


    The graph convolution-based neural network model, as constructed above, is designed to process inputs with a fixed number of nodes. However, it encounters scalability challenges when applied to networks with varying node counts. In particular, when severe disruptions occur in the traffic network and result in disconnected subgraphs, the existing model fails to accommodate these structural changes. To overcome this limitation, we propose an algorithm for subgraph training, which modifies the input representation to enhance the model's adaptability and scalability. In our extended experiments, Dataset A is used to test the feasibility of the proposed approach.

    (1) Graph partitioning

    In this experiment, we apply the multilevel k-way partitioning algorithm provided by Metis [53] to divide the traffic network into multiple subgraphs. Specifically, the SF network is partitioned into 4 blocks. The original traffic network is first transformed into its dual graph representation, in which nodes represent road segments and edges indicate adjacency between segments. The partitioning is then performed on the dual graph to ensure spatially coherent subgraphs. The results of the partitioning on both the dual graph and the corresponding original traffic network are shown in Figure 12 and Figure 13. Since the resulting subgraphs may contain different numbers of nodes, we apply a zero-padding operation to standardize the input dimensions, thereby ensuring compatibility with the neural network architecture.
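The dual graph transformation can be sketched in a few lines. The example below is an illustration under one stated assumption: two road segments are treated as adjacent when they share at least one endpoint node, since the text does not spell out the adjacency rule. The function name `dual_graph` is hypothetical, and the partitioning step itself (handled by Metis in the study) is not reproduced here.

```python
from itertools import combinations


def dual_graph(segments):
    """Build the dual graph of a road network.

    segments: list of (tail, head) directed road segments.
    Returns a dict mapping each segment index to the set of indices of
    adjacent segments. Assumption: segments are adjacent when they share
    at least one endpoint node.
    """
    adjacency = {i: set() for i in range(len(segments))}
    for i, j in combinations(range(len(segments)), 2):
        if set(segments[i]) & set(segments[j]):
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


# Hypothetical four-segment network: 1 <-> 2 -> 3 -> 4.
dual = dual_graph([(1, 2), (2, 1), (2, 3), (3, 4)])
```

The resulting adjacency structure is what a graph partitioner would receive, so that each block groups neighboring road segments rather than neighboring intersections.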

    Figure 12.  Dual graph of SF network.
    Figure 13.  Result of the graph partitioning.

    (2) Edge attribute model considered in this study

    This study aims to train an end-to-end deep learning model capable of approximating equilibrium traffic flows under varying OD demand conditions. To facilitate this, the OD demand matrix is first transformed into edge-level features using an "all-or-nothing" traffic assignment operation. The edge attribute set includes the resulting all-or-nothing flow, road capacity, and free-flow time. In addition, node-level features are incorporated to enhance spatial representation. For each edge, we compute node demand features from its origin and destination nodes, defined as the total outgoing demand from the origin node and the total incoming demand to the destination node. These features are appended to the edge feature set, forming a comprehensive input representation. The adjacency matrix used in the model is constructed from edge-to-edge relationships, enabling the GCN layer to propagate information across adjacent road segments.
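The feature construction described above can be sketched as follows. This is a minimal illustration that assumes all-or-nothing assignment loads each OD pair onto its shortest path by free-flow time; the function names, the dictionary layout, and the feature ordering are hypothetical choices, not the authors' implementation.

```python
import heapq


def shortest_path_edges(adj, source, target):
    """Dijkstra on free-flow time; returns the edges on the shortest path."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, t in adj.get(u, []):
            nd = d + t
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return []  # unreachable destination: no flow is loaded
    path, node = [], target
    while node != source:
        path.append((prev[node], node))
        node = prev[node]
    return path[::-1]


def edge_features(links, od_demand):
    """links: {(u, v): (capacity, free_flow_time)}; od_demand: {(o, d): trips}.

    Returns {(u, v): [aon_flow, capacity, free_flow_time,
                      outgoing demand at u, incoming demand at v]}.
    """
    adj = {}
    for (u, v), (_, fft) in links.items():
        adj.setdefault(u, []).append((v, fft))
    aon = {e: 0.0 for e in links}
    out_demand, in_demand = {}, {}
    for (o, d), trips in od_demand.items():
        out_demand[o] = out_demand.get(o, 0.0) + trips
        in_demand[d] = in_demand.get(d, 0.0) + trips
        for e in shortest_path_edges(adj, o, d):
            aon[e] += trips  # all trips of the OD pair use one path
    return {
        (u, v): [aon[(u, v)], cap, fft, out_demand.get(u, 0.0), in_demand.get(v, 0.0)]
        for (u, v), (cap, fft) in links.items()
    }


# Hypothetical three-link network and two OD pairs.
links = {(1, 2): (100.0, 1.0), (2, 3): (100.0, 1.0), (1, 3): (50.0, 5.0)}
features = edge_features(links, {(1, 3): 10.0, (1, 2): 4.0})
```

Because all-or-nothing assignment ignores congestion, its output is only a starting point; the network is trained to map these features to the equilibrium flows.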

    (3) Evaluation of subgraph-based training on the SF network

    We evaluate the model's performance under subgraph-based training using the SF network, as summarized in Table 9. The input features include both intrinsic edge attributes and, in some cases, node-level demand characteristics, such as the total outbound and inbound demand at each node. First, the comparison between training on 2 subgraphs and 4 subgraphs shows that as the number of partitions increases, the model's prediction accuracy declines. This suggests that while subgraph partitioning helps standardize input dimensions, it also leads to a loss of global structural information, which negatively affects model performance. Second, incorporating node-level demand features mitigates this information loss under subgraph-based training: with 2 subgraphs, RMSE falls from 170.71 to 96.69 and MAE from 82.16 to 55.10, although MAPE rises from 1.75% to 2.46%. Third, to further address the limitations of partition-based modeling, future research will explore mechanisms for inter-subgraph communication, aiming to recover lost global context and improve the robustness of subgraph-trained models. The training loss curve for this experiment is shown in Figure 14.

    Table 9.  SF subgraph training results.
    Features Subgraphs Epochs Learning rate RMSE MAPE (%) MAE R2
    Edge 2 200 0.01 170.71 1.75 82.16 0.9975
    Edge 4 200 0.01 313.14 7.04 236.02 0.9920
    Edge, demand 2 200 0.01 96.69 2.46 55.10 0.9994

    Figure 14.  Loss curve for subgraph training (2 subgraphs).

    (4) Evaluation of subgraph results on the EMA network

    To evaluate the generalizability of the proposed model across different traffic networks, we further analyzed the EMA network and compared its structural characteristics to those of the SF network, as discussed in the previous subsection. Ensuring consistent input dimensions across networks requires partitioning each network into subgraphs of uniform size. The partitioning results for both networks are illustrated in Figures 15–18. Specifically, the SF network is divided into 2 subgraphs, containing 36 and 40 edges, respectively. In contrast, the larger EMA network is partitioned into 7 subgraphs, each containing either 36 or 38 edges, thereby maintaining input shape consistency while preserving topological structure.
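Since the subgraphs contain 36, 38 or 40 edges, their feature matrices need a common row count before they can share one model. A minimal zero-padding sketch is shown below; the function name `pad_subgraphs` and the accompanying validity mask are illustrative assumptions rather than details given in the article.

```python
def pad_subgraphs(subgraph_features, target_size=None):
    """Zero-pad per-subgraph edge-feature matrices to a common row count.

    subgraph_features: list of matrices, each a list of equal-length rows.
    Returns (padded, masks); masks[i][r] is 1 for a real edge, 0 for padding.
    """
    size = target_size or max(len(m) for m in subgraph_features)
    width = len(subgraph_features[0][0])
    padded, masks = [], []
    for m in subgraph_features:
        if len(m) > size:
            raise ValueError("subgraph larger than target_size")
        pad_rows = size - len(m)
        padded.append([list(row) for row in m] + [[0.0] * width for _ in range(pad_rows)])
        masks.append([1] * len(m) + [0] * pad_rows)
    return padded, masks


# Hypothetical SF-style split: one subgraph with 36 edges, one with 40.
sub_a = [[1.0, 2.0] for _ in range(36)]
sub_b = [[3.0, 4.0] for _ in range(40)]
padded, masks = pad_subgraphs([sub_a, sub_b])
```

Keeping a mask alongside the padded matrices makes it possible to exclude the padding rows from the loss and from the reported metrics.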

    Figure 15.  SF network dual graph partition results.
    Figure 16.  Edge division results of the original SF network graph.
    Figure 17.  Dual graph partitioning of EMA networks.
    Figure 18.  Edge partitioning results of EMA original graph.

    We first evaluated the performance of subgraph-based training on the EMA network, using 7 subgraphs. Two experimental settings were tested: one with node-level features included, and the other without. As shown in Table 10, incorporating node demand features lowers RMSE from 17.41 to 10.68 and MAE from 10.00 to 6.12 and slightly raises R2, although MAPE increases from 6.26% to 8.26%. However, due to the relatively large number of subgraphs, the overall estimation performance did not reach optimal levels.

    Table 10.  Results of subgraph training on EMA network (7 subgraphs).
    Features Epochs Learning rate RMSE MAPE (%) MAE R2
    Edge 200 0.01 17.41 6.26 10.00 0.9992
    Edge, demand 200 0.01 10.68 8.26 6.12 0.9997


    The OD demand distributions in the SF and EMA networks differ substantially, as do road segment capacities and free-flow travel times. As a result, a model trained on the SF network cannot be applied directly to the EMA network. To investigate whether such a model can nevertheless be transferred, we conducted an experiment in which a model pretrained on 1000 samples from the SF network (with node-level features included) was fine-tuned using 1000 samples from the EMA network. The results are summarized in Table 11, which compares two scenarios: one where the model is initialized with pretrained weights from the SF network, and another where training starts from random initialization. After 100 epochs, the pretrained model reaches an RMSE of 20.38 and a MAPE of 15.42%, compared with 22.50 and 16.91% for random initialization, indicating that loading the pretrained model improves training efficiency and accuracy under the current settings. This suggests that models trained on networks with abundant data can be leveraged to accelerate and enhance learning on networks with limited data. Naturally, the closer the demand patterns and OD distributions of the two networks, the greater the expected benefit of pretraining.

    Table 11.  EMA subgraph training results based on SF pretrained model.
    Pretraining model Epochs Learning rate RMSE MAPE (%) MAE R2
    Yes 100 0.01 20.38 15.42 13.10 0.9976
    No 100 0.01 22.50 16.91 13.49 0.9972


    In summary, this section constructs a diverse set of datasets and proposes two deep learning-based traffic assignment models built upon GCN. The proposed models demonstrate strong performance on data with fixed network topologies, but encounter limitations when applied to scenarios involving random link failures. Due to the high degree of uncertainty in Dataset C, accurate flow estimation cannot be achieved solely based on link-level features. To improve model performance in such cases, a larger volume of topologically diverse training data is required. Meanwhile, the subgraph-based deep learning model introduced in this study presents a promising and scalable solution, as it is capable of adapting to networks with varying node counts, thereby extending the applicability of GCNs to more complex and dynamic transportation systems.

    Conventional traffic assignment models often rely on complex and computationally intensive algorithms. This study proposes a novel approach to the UE-TAP based on GCN, offering an end-to-end deep learning solution that bypasses the intricacies of traditional methods. The proposed model directly maps OD demand, link attributes, and network topology to the equilibrium traffic flow, enabling fast and accurate prediction. First, we demonstrate the feasibility of using deep learning to address UE-TAPs and introduce two GCN-based model frameworks. The baseline model is designed for fixed network topologies, incorporating intrinsic link features to effectively estimate equilibrium flows under varying road segment conditions. Building upon this, we develop an enhanced framework that employs dual graph partitioning and subgraph-based training, enabling the model to adapt to networks with varying node counts and topologies. This flexible model structure also supports transfer learning, allowing a pretrained model on one network to be fine-tuned for another with limited data availability.

    To validate our approach, multiple datasets were generated based on the SF network, and experiments were conducted across various scenarios, including fixed topology, randomized OD demand, and random link failures. The results confirm that the proposed model can accurately approximate equilibrium flows and exhibits robustness in the face of network disruptions. Notably, the subgraph-based model for variable topologies remains effective even when link failures lead to significant changes in network structure. However, this flexibility may come at the cost of reduced prediction accuracy due to the loss of global structural information.

    While the proposed GCN-based framework demonstrates strong performance in estimating equilibrium flows under both fixed and variable network topologies, several limitations remain. First, the model has primarily been validated in synthetic environments, where the network topology and intrinsic road segment attributes (e.g., capacity, free-flow time) are readily available and well-defined. In real-world applications, however, both the abstraction of traffic network topology and the collection of intrinsic link properties present significant challenges due to data heterogeneity, incompleteness, and noise. These factors may affect model generalizability and reliability in practice. Second, the current subgraph-based framework does not yet support explicit communication between subgraphs, which may lead to information loss and reduced accuracy, especially as the number of subgraphs increases. Future work will address these limitations by developing methods for inter-subgraph information exchange and by improving the model's adaptability to real-world network complexities. Additionally, while the proposed method offers substantial gains in inference speed compared to traditional traffic assignment algorithms, a comprehensive evaluation of its end-to-end computational efficiency, including training cost, inference time, and scalability to large-scale networks, remains an important direction for future work.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This study is supported by the National Natural Science Foundation of China (52131203, 5240120175, 72471057), the Natural Science Foundation of Jiangsu Province (BK20232019), the Jiangsu Provincial Scientific Research Center of Applied Mathematics (BK20233002), the Humanities and Social Sciences Research Project of the Ministry of Education (24YJCZH357), and the China Postdoctoral Science Foundation (2024M761430).

    The authors declare there is no conflict of interest.



    [1] X. W. Yao, J. W. Han, G. Cheng, et al., Semantic annotation ofhigh-resolution satellite images via weakly supervised learning, IEEE Trans. Geosci. Remote Sens., 54 (2016), 3660–3671.
    [2] S. Y.Cui and M. H. Datcu, Comparison of approximation methods to Kullback–Leiblerdivergence between Gaussian mixture models for satellite image retrieval, Remote Sens. Lett., 7 (2016), 651–660.
    [3] Y. B.Wang, L. Q. Zhang, X.H. Tong, et al., A three-layered graph-basedlearning approach for remote sensing image retrieval, IEEE Trans. Geosci. Remote Sens., 54 (2016), 6020–6034.
    [4] J.Muñoz-Marí, F. Bovolo, L.Gómez-Chova, et al., Semisupervised one-class support vector machines forclassification of remote sensing data, IEEETrans. Geosci. Remote Sens., 48 (2010), 3188–3197.
    [5] L. Y.Xiang, Y. Li, W.Hao, et al., Reversible natural language watermarking using synonymsubstitution and arithmetic coding, Comput. Mater. Continua, 55 (2018), 541–559.
    [6] Y. Tu,Y. Lin, J. Wang, et al., Semisupervised learningwith generative adversarial networks on digital signal modulationclassification. Comput. Mater. Continua, 55 (2018), 243–254.
    [7] D. J.Zeng, Y. Dai, F.Li, et al., Adversarial learning for distant supervised relation extraction, Comput. Mater. Continua, 55 (2018), 121–136.
    [8] J. M.Zhang, X. K. Jin, J.Sun, et al., Spatial and semantic convolutionalfeatures for robust visual object tracking, MultimediaTools Appl., Forthcoming 2018. Available at https://doi.org/ 10.1007/s11042-018-6562-8.
    [9] S. R.Zhou, W. L. Liang, J.G. Li, et al., Improved VGG model for road traffic sign recognition, Comput. Mater. Continua, 57 (2018), 11–24.
    [10] S. Karimpouli and P. Tahmasebi, Image-basedvelocity estimation of rock using convolutionalneural networks, Neural Netw., 111 (2019), 89–97.
    [11] S. Karimpouli and P. Tahmesbi, Segmentationof digital rock images using deep convolutional autoencoder networks, Comput. Geosci-UK, 126 (2019), 142–150.
    [12] P. Tahmasebi and A. Hezarkhani, Applicationof a modular feedforward neural network for grade estimation, Nat. Resour. Res., 20 (2011), 25–32.
    [13] O. Russakovsky, J. Deng, H. Su, et al., ImageNet large scale visualrecognition challenge, Int. J. Comput.Vision, 115 (2015), 211–252.
    [14] G. S.Xia, J. W. Hu, F.Hu, et al., AID: a benchmark data set for performance evaluation of aerial sceneclassification, IEEE Trans. Geosci.Remote Sens., 55 (2017), 3965–3981.
    [15] J. M.Zhang, Y. Wu, X.K. Jin, et al., A fast object tracker based on integrated multiple features anddynamic learning rate, Math. Probl. Eng.,2018 (2018), Article ID 5986062, 14 pages.
    [16] Y.Yang and N. Shawn, Comparing sift descriptors and gabor texture features forclassification of remote sensed imagery, 15thIEEE International Conference on Image Processing,(2008), 1852–1855.
    [17] B.Luo, S. J. Jiang and L. P. Zhang, Indexing of remote sensing images withdifferent resolutions by multiple features, IEEEJ. Sel. Top. Appl. Earth Obs. Remote Sens., 6 (2013), 1899–1912.
    [18] A.Avramović and V. Risojević, Block-based semantic classification ofhigh-resolution multispectral aerial images, Signal Image Video Process., 10 (2016), 75–84.
    [19] X.Chen, T. Fang, H.Huo, et al., Measuring the effectiveness of various features for thematicinformation extraction from very high resolution remote sensing imagery, IEEE Trans. Geosci. Remote Sens., 53 (2015), 4837–4851.
    [20] J. A. dos Santos, O. A. B. Penatti and R.da Silva Torres, Evaluating the potential of texture and color descriptors forremote sensing image retrieval and classification, 5th International Conference on Computer Vision Theory and Applications,(2010), 203–208.
    [21] Y.Yang and N. Shawn, Geographic image retrieval using local invariant features, IEEE Trans. Geosci. Remote Sens., 5 (2013), 818–832.
    [22] Y.Yang and N. Shawn, Bag-of-visual-words and spatial extensions for land-useclassification, 18th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems,(2010), 270–279.
    [23] Y.Yang and N. Shawn, Spatial pyramid co-occurrence for image classification, IEEE International Conference on ComputerVision, (2011), 1465–1472.
    [24] W. Shao, W. Yang, G.S. Xia, et al., A hierarchical scheme of multiple feature fusion forhigh-resolution satellite scene categorization, IEEE InternationalConference on Computer Vision Systems, (2013), 324–333.
    [25] W. Shao, W. Yang and G. S. Xia, Extremevalue theory-based calibration for the fusion of multiple features inhigh-resolution satellite scene classification, Int. J. Remote Sens., 34 (2013), 8588–8602.
    [26] N. Romain, P. David and G. Philippe-Henri,Evaluation of second-order visual features for land-use classification, 12th International Workshop on Content-BasedMultimedia Indexing, (2014), 1–5.
    [27] L. J.Chen, W. Yang, K.Xu, et al., Evaluation of local features for scene classification using VHRsatellite images, 2011 Joint Urban RemoteSensing Event, (2011), 385–388.
    [28] F. Hu,G. S. Xia, J. W. Hu, et al., Transferring deepconvolutional neural networks for the scene classification of high-resolutionremote sensing imagery, Remote Sens., 7 (2015), 14680–14707.
    [29] M. Castelluccio, G. Poggi, C. Sansone, et al., Land useclassification in remotesensing imagesby convolutional neuralnetworks, preprint, arXiv:1508.00092.
    [30] O. A. B. Penatti, K. Nogueira and J. A. dos Santos, Do deep features generalize from everydayobjects to remote sensing and aerial scenes domains?, IEEE Conference on Computer Vision and Pattern Recognition Workshops, (2015), 44–51.
    [31] F. P.S. Luus, B. P. Salmon, F.Van den Bergh, et al.,Multiview deep learning for land-use classification, IEEE Geosci. Remote Sens. Lett., 12 (2015), 2448–2452.
    [32] F. Zhang, B. Du and L. P. Zhang, Sceneclassification via a gradient boosting random convolutional network framework, IEEE Trans. Geosci. Remote Sens., 54 (2016), 1793–1802.
    [33] K.Nogueira, O. A. B. Penatti and J. A. dos Santos, Towards better exploitingconvolutional neural networks for remote sensing scene classification, Pattern Recogn., 61 (2017), 539–556.
    [34] G. Cheng, P. C. Zhou and J. W. Han,Learning rotation-invariant convolutional neural networks for objectdetection in VHR optical remote sensing images, IEEE Trans. Geosci. Remote Sens., 54 (2016), 7405–7415.
    [35] X. W.Yao, J. W. Han, G. Cheng, et al., Semantic annotation of high-resolutionsatellite images via weakly supervised learning, IEEE Trans. Geosci. Remote Sens., 54 (2016), 3660–3671.
    [36] G.Cheng, C. Y. Yang, X.W. Yao, et al., When deep learning meets metric learning: remote sensing imagescene classification via learning discriminative CNNs, IEEE Trans. Geosci. Remote Sens., 56 (2018), 2811–2821.
    [37] S.Chaib, H. Liu,Y. F. Gu, et al., Deep feature fusion for VHR remote sensing sceneclassification, IEEE Trans. Geosci.Remote Sens., 55 (2017), 4775–4784.
    [38] Q.Wang, S. T. Liu,J. Chanussot, et al., Scene classification with recurrent attention of VHR remotesensing images, IEEE Trans. Geosci.Remote Sens., 99 (2018), 1–13.
    [39] Y. L.Yu and F. X. Liu, Dense connectivity based two-stream deep feature fusion frameworkfor aerial scene classification, RemoteSens., 10 (2018), 1158.
    [40] Y. T. Chen, W. H. Xu, J. W. Zuo, et al., The fire recognition algorithmusing dynamic feature fusion and IV-SVM classifier, Cluster Comput., Forthcoming 2018. Available at https://doi.org/10.1007/s10586-018-2368-8.
    [41] G.Huang, Z. Liu, L.van derMaaten, et al., Densely connected convolutional networks, IEEE Conferenceon Computer Vision and Pattern Recognition,(2017), 4700–4708.
    [42] G.Huang, Y. Sun,Z. Liu, et al., Deep networks with stochastic depth, European Conferenceon Computer Vision,(2016), 646–661.
    [43] S.loffe and C. Szegedy, Batch normalization: acceleratingdeep network training by reducing internal covariate shift, 32nd International Conference on Machine Learning,(2015), 448–456.
    [44] P. Tahmasebi, F. Javadpour and M. Sahimi,Data mining and machine learning for identifying sweet spots in shalereservoirs, Expert Sys. Appl., 88 (2017), 435–447.
    [45] G.Cheng, J. W. Han and X. Q. Lu, Remote sensing image scene classification: benchmark and state of the art, Proc. IEEE, 105 (2017), 1865–1883.
    [46] L. H.Huang, C. Chen,W. Li, et al., Remote sensing image scene classification using multi-scalecompleted local binary patterns and fisher vectors, Remote Sens., 8 (2016),483.
    [47] X. Y.Bian, C. Chen,L. Tian, et al., Fusing local and global features for high-resolution sceneclassification, IEEE J. Sel. Top. Appl.Earth Obs. Remote Sens., 10 (2017), 2889–2901.
  • © 2019 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)