1. Introduction
Digital finance is an innovative form of Internet finance built on artificial intelligence, mobile payment, big data, cloud computing, blockchain and other information technologies. With the rapid development of digital finance in China, cities are becoming increasingly networked with each other, and the networked development model of urban digital finance is a structural model for achieving "internal stability" in a region [1,2]. In the context of the shift from traditional finance to digital finance, analyzing the core-edge structure of the spatially linked networks of urban digital finance and studying the factors that drive these networks is important for the stable development of digital finance networks.
Digital finance is a new type of financial service. Traditional financial services include core businesses such as credit and payment [3,4]. As information technology has developed, it has been combined with these traditional businesses to form new kinds of financial services. For example, in the traditional credit business, assets must be pledged as collateral and evaluated during the loan process [5]. After the digitalization of the credit business, artificial intelligence technology is integrated into credit risk assessment and approval, and unsecured credit is offered in settings such as digital financial consumption [6]. Through blockchain and other technologies, payment can be decentralized in the local area and made more efficient, thus forming a new form of business [7].
The rapid development of digital finance has lowered financial access barriers and improved the inclusiveness, convenience and value for money of financial services, and it has been especially effective in promoting innovative entrepreneurship. Studies have found that the development of digital finance has had a significant impact on entrepreneurship, with a stronger encouraging effect in provinces with low urbanization rates and on micro-enterprises with low registered capital. Corporate technological innovation is influenced by the "structural" driving effect of digital finance development [8,9]. The development of digital finance can effectively correct the "attribute mismatch", "sector mismatch" and "stage mismatch" that existed in traditional finance [10,11]. By alleviating financing constraints and optimizing the industrial structure, digital finance has significantly improved the level of regional technological innovation [12,13].
The further development of digital finance can help alleviate credit constraints and smooth intertemporal consumption. At the same time, the popularity of mobile payment platforms such as WeChat and Alipay has not only greatly increased the convenience of payment and reduced the cost of shopping, but also increased the speed of circulation and the efficiency of exchange. Studies have found that the development of digital finance has significantly increased the effectiveness of households' financial portfolios and reduced the likelihood of extremely risky investments [14,15]. The easing of liquidity constraints is not the main reason for the increase in consumption; digital finance has contributed mainly by making payment easier [16,17].
The biggest advantage of digital finance is that it breaks geographical constraints and makes the free flow of resource elements between regions possible [18,19]. However, the centralized and clustered network characteristics of financial development have not changed, and the regional imbalance of digital financial development has become increasingly prominent [20,21]. A digital financial network with a "core-periphery" structure has gradually taken shape. As inter-regional connections become closer, disadvantaged marginal areas are not attractive enough to resource elements, which will further widen the regional gap.
The Digital Financial Inclusion Index is widely used in related research, but it lacks an urban digital finance perspective. Peking University, with the assistance of Ant Financial Services Group, constructed the Digital Financial Inclusion Index based on a large amount of online transaction data [22]. Feng et al. [23] studied the relationship between digital finance and green technology innovation based on a digital financial index. Chen and Zhang [24] examined the causal effect of digital finance on manufacturing servitization based on a digital financial index. In contrast to the Peking University Digital Financial Inclusion Index, Liao et al. [25] constructed a digital finance index system for Chinese cities along three dimensions (service, technology, and operational environment) and used a combination of subjective and objective methods to measure urban digital finance indicators. Ye et al. [26] constructed an index system for digital finance risk and used the Lagrange multiplier method to obtain the optimal integrated weights of the cascade analysis method and the entropy weight method to measure digital financial risk indicators. Firms use digital finance to increase their ability to resist risks [27,28].
Digital finance strengthens the connections between different subjects, and urban digital financial networks deserve further research. Liu et al. [29] constructed a spatial correlation network for the development of digital financial inclusion in China, and investigated the structure, locational characteristics and influencing factors of the network using the network analysis method and the quadratic assignment procedure (QAP) method. Lin and Zhang [30] found that the positive spatial externality of digital finance exists for all household economic variables. Spatial spillover models show that digital finance is spatially correlated [31,32]. Dong et al. [33] examined the regional gap in inclusive digital finance and its structural decomposition in the Yangtze River Delta city cluster from three perspectives: time trend, spatial structure, and dynamic evolution.
The effects of digital finance have been widely studied, whereas its driving factors have received little attention. Li et al. [34] explored the influencing factors of digital finance. Liu et al. [29] used the QAP method to study the factors influencing the spatial association network of digital financial inclusion development in China, revealing the influence of Internet and economic development levels, industrial structure and spatial adjacency on spatial association. Wang et al. [35] used spatial econometric methods to identify the factors that are significantly associated with financial inclusion. Ye et al. [26], after constructing a digital finance risk indicator system, used their model results to analyze the factors influencing digital financial risk in economically developed regions of China during the COVID-19 epidemic. Yao et al. [36] showed that the urban fintech level has a significant promoting effect on green total factor productivity.
To date, quantitative studies of the factors influencing digital financial networks have mainly used the QAP to run correlation and regression analyses on certain types of factors. This approach can examine only a limited number of variables, which makes it difficult to cover the multiple, complex factors behind digital financial networks.
Based on the modified gravity model, this paper constructs a digital financial network of Chinese cities, identifies core and edge cities in different periods using the core-edge structure model, and analyzes the evolutionary characteristics of the core-edge structure from multiple perspectives. It then applies four nonlinear machine learning methods, Decision Tree (DT), Random Forest (RF), Adaboost, and LightGBM, to identify and empirically analyze the drivers of urban digital financial networks.
This paper makes the following academic contributions. First, the urban network is a new form of urban spatial organization that has emerged against the background of informatization and globalization. Connections between cities in a region tend to become networked, and urban networking is both a structural mode for achieving "internal stability" and an inevitable process of urban development in a region. Exploring the spatial structure of China's urban digital financial network and the evolution of its core-periphery structure helps to deepen the understanding of how China's urban network evolves and provides a scientific basis for strategies that promote coordinated development between cities. Second, the paper uses machine learning methods to enrich and expand quantitative research on network drivers. Most quantitative research on urban digital finance uses the QAP, a non-parametric test, to perform correlation and regression analysis on drivers. In this paper, the influencing factors of urban digital finance are re-identified using several nonlinear machine learning methods, so that the influence of more factors can be examined and the nonlinear influence of these factors can be explored more comprehensively. Third, the influencing factors of the urban digital finance network from 2010 to 2020 are compared and analyzed to dissect the driving factors behind urban digital finance, which helps to provide a more accurate reference for decision-making.
2. Materials and methods
2.1. Network construction based on the gravity model
Gravity models use cross-sectional data to describe trends in spatial correlations and network structures, and are widely used in the social sciences in areas such as trade [37], land use efficiency [38], low-carbon energy technology [39], and construction and demolition waste [40]. To determine the strength of the digital financial linkage between each pair of cities, this paper uses the following modified gravity model:
where i and j index the 278 Chinese cities studied; $ {R}_{ij} $ denotes the strength of the association effect of city i on digital finance in city j; $ {P}_{i} $ and $ {P}_{j} $ denote the year-end total population of city $ i $ and city $ j $; and $ {G}_{i} $ and $ {G}_{j} $ denote the GDP of city i and city j. The product of population size $ {P}_{i} $ and annual regional GDP $ {G}_{i} $ is used to represent the scale of digital finance development in a city. $ {d}_{ij} $ is the geographical distance between city i and city j, and its ratio to the difference in per capita GDP between the two cities ($ {g}_{i}-{g}_{j} $) is used as the corrected distance $ {D}_{ij} $ between the cities. $ {F}_{i} $ and $ {F}_{j} $ denote the digital finance indices of city i and city j, respectively. The strength of the digital finance association between cities is asymmetric. To reflect the directionality of the association network, the modified empirical constant $ {k}_{i} $ is defined as the share of city i's digital finance level in the sum of the digital finance levels of city i and city j, that is, the contribution of city i to the digital finance development of the two cities. Based on this formula, we calculate the digital finance association strength of city i to city j, construct the gravity matrix $ {R}_{ij} $ among the 278 cities for the three periods 2010, 2015 and 2020, and binarize it. The mean of each row of the gravity matrix is chosen as the threshold for that row, and each gravity value in the row is compared with it. A value above the row threshold is recorded as 1, meaning that the row city is associated with the digital finance of the column city; a value below the row threshold is recorded as 0, meaning that the row city is not associated with the digital finance of the column city, as expressed by the formula:
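The construction described above can be sketched in Python. The row-mean binarization follows the text directly. The functional form inside `gravity_matrix` is an assumption: it uses a cube-root mass term $ \sqrt[3]{{P}_{i}{G}_{i}{F}_{i}} $, $ {k}_{ij}={F}_{i}/({F}_{i}+{F}_{j}) $ and $ {D}_{ij}={d}_{ij}/({g}_{i}-{g}_{j}) $, a form often seen in modified gravity models, and may differ from the paper's own equation. Function names, the treatment of ties and the inclusion of the diagonal in the row mean are likewise illustrative choices.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def gravity_matrix(P: Sequence[float], G: Sequence[float], F: Sequence[float],
                   g: Sequence[float], d: Sequence[Sequence[float]]) -> Matrix:
    """Directed association strengths R[i][j] under an ASSUMED functional form:
    R_ij = k_ij * cbrt(P_i*G_i*F_i) * cbrt(P_j*G_j*F_j) / D_ij**2,
    with k_ij = F_i / (F_i + F_j) and D_ij = d_ij / (g_i - g_j)."""
    n = len(P)
    mass = [(P[i] * G[i] * F[i]) ** (1.0 / 3.0) for i in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            gap = g[i] - g[j]
            # No self-links; when g_i == g_j the corrected distance is
            # infinite, so the strength is left at 0.
            if i == j or gap == 0 or d[i][j] == 0:
                continue
            k = F[i] / (F[i] + F[j])      # directional weight of city i
            D = d[i][j] / gap             # corrected distance
            R[i][j] = k * mass[i] * mass[j] / D ** 2
    return R


def binarize_by_row_mean(R: Matrix) -> List[List[int]]:
    """A[i][j] = 1 if R[i][j] is above the mean of row i, else 0.
    Assumption: the diagonal is included in the mean and ties map to 0."""
    A = []
    for row in R:
        threshold = sum(row) / len(row)
        A.append([1 if value > threshold else 0 for value in row])
    return A


# Toy inputs (not data from the paper).
R_toy = gravity_matrix(P=[1, 1], G=[1, 1], F=[1, 3], g=[1, 2], d=[[0, 1], [1, 0]])
A_toy = binarize_by_row_mean([[0, 4, 2], [1, 0, 1], [3, 3, 0]])
```

Because $ {k}_{ij} $ depends on the sending city, `R_toy[1][0]` differs from `R_toy[0][1]`, which is what makes the resulting network directed.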
2.2. Identification of core and edge blocks
2.2.1. Discrete models
In this paper, we adopt the discrete model of core-edge structure proposed by Borgatti et al. [41] to identify the network status of different cities, and divide the urban digital financial network structure into two parts: a core block and an edge block. The observation matrix is obtained by first binarizing the weighted network with 1 as the critical value; a pattern matrix that is closest to the observation matrix is then sought to identify the core and edge blocks. The closeness of the pattern matrix and the observation matrix is measured as:
In the formula, $ \mathrm{\rho } $ is the correlation coefficient between the observation matrix and the pattern matrix; $ {a}_{ij} $ indicates whether there is a connection between city $ i $ and city $ j $ in the observation matrix, with $ {a}_{ij} = $ 1 if there is a connection and $ {a}_{ij} = $ 0 otherwise; $ {\mathrm{\delta }}_{\mathrm{i}\mathrm{j}} $ indicates whether there is a connection between city $ i $ and city $ j $ in the pattern matrix, with $ {\mathrm{\delta }}_{\mathrm{i}\mathrm{j}} = $ 1 if there is a connection and $ {\mathrm{\delta }}_{\mathrm{i}\mathrm{j}} = $ 0 otherwise; and $ {c}_{i} $ ($ {c}_{j} $) indicates the type (core or edge) to which city $ i $ ($ j $) belongs. Because the exact density of the connections between the core and edge blocks is difficult to determine, the off-diagonal blocks linking the core and edge areas are usually treated as missing values (.). This model is a core-edge linkage-deficit model: it identifies the core-edge structure of digital financial networks in Chinese cities by finding the pattern matrix that maximizes the density of the core block, minimizes the density of the edge block, and thereby maximizes the correlation coefficient $ \rho $ between the observation matrix and the pattern matrix.
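As an illustration of the fit criterion described above, the following Python sketch scores one candidate partition. It computes the Pearson correlation between observed ties and the ideal pattern, using only core-to-core cells (ideal value 1) and edge-to-edge cells (ideal value 0) and skipping the core-to-edge cells as missing. The function names are hypothetical, and the search over partitions that a tool such as Ucinet performs is not shown.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def core_periphery_fit(A: List[List[int]], is_core: Sequence[bool]) -> float:
    """Correlation between the observed matrix A and the ideal pattern for
    the partition is_core, ignoring the diagonal and core-edge cells."""
    observed, ideal = [], []
    n = len(A)
    for i in range(n):
        for j in range(n):
            if i == j or is_core[i] != is_core[j]:
                continue  # treated as missing values
            observed.append(A[i][j])
            ideal.append(1 if is_core[i] else 0)
    return pearson(observed, ideal)


# Toy network: cities 0 and 1 are tied to each other, 2 and 3 are not.
A_cp = [[0, 1, 1, 0],
        [1, 0, 0, 1],
        [1, 0, 0, 0],
        [0, 1, 0, 0]]
good_fit = core_periphery_fit(A_cp, [True, True, False, False])
poor_fit = core_periphery_fit(A_cp, [True, False, True, False])
```

A partition is preferred when its score is higher, so a search procedure would keep the assignment of cities to the core block that maximizes this value.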
2.2.2. Continuous model
The continuous model of core-edge structure measures the core degree of each city and thus identifies the network power of the city. The pattern matrix of the continuous model can be defined as:
In the above equation, $ {c}_{i} $($ {c}_{j} $) is a non-negative vector consisting of the core degree of each city node. The core degree takes values in the range of [0, 1], and the closer the core degree of a city is to 1, the greater the power of the city in the network.
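A minimal Python sketch of the continuous fit follows. It assumes the usual formulation of the continuous core-periphery model, in which the ideal tie between two cities is the product of their core degrees, $ {\mathrm{\delta }}_{ij}={c}_{i}{c}_{j} $, and the fit is the correlation between observed and ideal off-diagonal cells. The function name is illustrative, and the optimization that finds the best core degree vector is omitted.

```python
from math import sqrt
from typing import List, Sequence


def continuous_fit(A: List[List[float]], coreness: Sequence[float]) -> float:
    """Correlation between observed ties a_ij and the ideal pattern
    c_i * c_j over off-diagonal cells (assumed continuous model)."""
    observed, ideal = [], []
    n = len(A)
    for i in range(n):
        for j in range(n):
            if i != j:
                observed.append(A[i][j])
                ideal.append(coreness[i] * coreness[j])
    m = len(observed)
    mean_o, mean_i = sum(observed) / m, sum(ideal) / m
    cov = sum((o - mean_o) * (d - mean_i) for o, d in zip(observed, ideal))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_i = sum((d - mean_i) ** 2 for d in ideal)
    if var_o == 0 or var_i == 0:
        return 0.0
    return cov / sqrt(var_o * var_i)


# Toy network in which only cities 0 and 1 are tied to each other.
A_cont = [[0, 1, 0],
          [1, 0, 0],
          [0, 0, 0]]
fit_right = continuous_fit(A_cont, [1.0, 1.0, 0.0])
fit_wrong = continuous_fit(A_cont, [0.0, 1.0, 1.0])
```

Core degree vectors that place high values on the densely connected cities give a higher correlation, which is the sense in which a core degree close to 1 signals greater network power.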
2.3. Measure of the network structure
2.3.1. Individual network indicators
This paper selects betweenness centrality (referred to below as intermediary centrality) and the effective size of structural holes to measure the network position and power of each node in the urban network. Intermediary centrality is measured as follows:
In the above equation, $ {\mathrm{g}}_{\mathrm{k}\mathrm{l}} $ represents the number of shortest paths of spatial association between city $ \mathrm{k} $ and city $ \mathrm{l} $; $ {\mathrm{g}}_{\mathrm{k}\mathrm{l}}\left(\mathrm{i}\right) $ represents the number of those paths connecting city $ \mathrm{k} $ and city $ \mathrm{l} $ that pass through city $ \mathrm{i} $; and n is the number of nodes (cities) in the network. The intermediary centrality value measures the size of the intermediary role played by a city in the digital financial network. The higher the intermediary centrality value, the more the city occupies a key position for controlling the flow of resources and information, the greater its control over other cities, and the higher the dependence of other cities on it. The effective size of structural holes captures the non-redundant ties in a city's individual network and hence its ability to exploit structural holes; the higher the effective size, the richer the network capital of the city.
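The intermediary centrality described above can be illustrated with a short Python sketch that sums $ {g}_{kl}(i)/{g}_{kl} $ over ordered pairs of cities. It is written for clarity rather than speed and returns unnormalized scores; any normalization by n that the paper's formula applies is not reproduced. The adjacency-dictionary input format and function names are assumptions for illustration.

```python
from collections import deque
from typing import Dict, Hashable, List, Tuple

Graph = Dict[Hashable, List[Hashable]]


def shortest_path_counts(adj: Graph, source: Hashable) -> Tuple[dict, dict]:
    """Breadth-first search from source: distance to, and number of
    shortest paths to, every reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj: Graph) -> Dict[Hashable, float]:
    """Unnormalized betweenness: sum over ordered pairs (k, l) of the share
    of shortest k-to-l paths that pass through node i."""
    nodes = list(adj)
    info = {s: shortest_path_counts(adj, s) for s in nodes}
    score = {v: 0.0 for v in nodes}
    for k in nodes:
        dist_k, sigma_k = info[k]
        for l in nodes:
            if l == k or l not in dist_k:
                continue
            for i in nodes:
                if i == k or i == l or i not in dist_k:
                    continue
                dist_i, sigma_i = info[i]
                # i lies on a shortest k-to-l path iff the distances add up.
                if l in dist_i and dist_k[i] + dist_i[l] == dist_k[l]:
                    score[i] += sigma_k[i] * sigma_i[l] / sigma_k[l]
    return score


# Toy directed network: two equally short routes from "a" to "d".
diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
diamond_scores = betweenness(diamond)
```

In the toy network, each of the two intermediate cities carries half of the shortest paths from "a" to "d". For an undirected network stored with symmetric adjacency lists, every pair is counted in both directions, so the scores would be halved to match the usual undirected definition.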
2.3.2. Overall network indicators
The average path length and the clustering coefficient are used to measure the reachability and aggregation in the network. The average path length is the average length of the shortest path connecting any two points, and the formula is as follows:
In the above equation, d is the shortest path distance from node i to node j in the network. The average path length measures the overall transmission efficiency and performance of the network and reflects the size of the network to some extent. The clustering coefficient is an indicator of local network structure and can be defined in two ways: as the average local density or as the transitivity (transmissibility) ratio. The clustering coefficient based on the average local density equals the mean of the density coefficients of the individual nodes; the clustering coefficient based on the transitivity ratio equals the ratio of the number of closed triads to the total number of triads. When a network has both a relatively small average path length and a relatively large clustering coefficient, it has small-world characteristics.
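The three overall indicators can be sketched in Python as follows. The sketch assumes an undirected (symmetrized) network stored as a dictionary of neighbor sets and averages path lengths over reachable pairs only; both choices, and the function names, are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def average_path_length(adj: Graph) -> float:
    """Mean shortest-path length over ordered pairs of distinct nodes
    that can reach each other."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def local_density(adj: Graph, v: Hashable) -> float:
    """Share of possible ties among the neighbors of v that exist."""
    nbrs = [u for u in adj[v] if u != v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj: Graph) -> float:
    """Clustering coefficient as the mean local density."""
    return sum(local_density(adj, v) for v in adj) / len(adj)


def transitivity(adj: Graph) -> float:
    """Clustering coefficient as closed triads over all connected triads."""
    closed, triads = 0, 0
    for v in adj:
        nbrs = [u for u in adj[v] if u != v]
        for a, b in combinations(nbrs, 2):
            triads += 1
            if b in adj[a]:
                closed += 1
    return closed / triads if triads else 0.0


# Toy network: a triangle a-b-c with a pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
apl = average_path_length(toy)
cc_density = average_clustering(toy)
cc_transitivity = transitivity(toy)
```

The two clustering definitions generally give different values for the same network, which is why the results section reports both.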
The Herfindahl-Hirschman Index ($ HHI $) and the Zipf index were selected as measures of the power distribution of the urban network. The $ HHI $ index is used to calculate the distribution of the urban core degree and is calculated as follows:
In the above equation, $ HHI $ is the Herfindahl-Hirschman index and $ {S}_{i} $ is the share of city $ i $ in the total core degree of the city network. $ HHI $ ranges from 0 to 1; an increase in $ HHI $ means that the distribution of power in the city network tends to be concentrated, while a decrease in $ HHI $ means that the distribution of power in the city network tends to be decentralized. The Zipf index is used to measure the distribution of connectivity in the city network and is calculated by the formula:
In the above equation, $ \mathrm{N}\mathrm{N}\mathrm{C} $ is the network connectivity of cities; $ \alpha $ is a constant term; $ r $ is the rank of a city in the network connectivity ranking; $ q $ is the Zipf index to be estimated; and $ u $ is the residual. If the index $ q $ is close to or equal to 1, the distribution of city network connectivity obeys Zipf's law; when $ q $ > 1, network connectivity is concentrated mainly in the upper tail of the ranking, and the influence of small and medium-sized cities is insufficient; when $ q $ < 1, the distribution of city network connectivity is relatively balanced.
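Both power distribution measures can be sketched in a few lines of Python. The sketch assumes that $ HHI $ is the sum of squared shares $ {S}_{i} $ and that the Zipf index is estimated by ordinary least squares from $ \ln NNC=\alpha -q\ln r+u $; the sign convention and the function names are assumptions, chosen so that $ q $ is positive when connectivity falls with rank.

```python
from math import log
from typing import Sequence


def hhi(core_degrees: Sequence[float]) -> float:
    """Sum of squared shares of each city in the total core degree."""
    total = sum(core_degrees)
    return sum((c / total) ** 2 for c in core_degrees)


def zipf_index(connectivity: Sequence[float]) -> float:
    """OLS estimate of q in ln(NNC) = alpha - q * ln(rank) + u.
    Non-positive values are dropped because their logarithm is undefined."""
    values = sorted((v for v in connectivity if v > 0), reverse=True)
    xs = [log(rank) for rank in range(1, len(values) + 1)]
    ys = [log(v) for v in values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


# Toy inputs (not data from the paper).
hhi_equal = hhi([1, 1, 1, 1])
hhi_skewed = hhi([3, 1])
q_zipf = zipf_index([100.0 / r for r in range(1, 11)])
```

Equal shares give the lowest possible $ HHI $ for a given number of cities, and a connectivity series that is exactly proportional to 1/rank returns $ q $ equal to 1 up to rounding.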
2.4. Selection of machine learning methods
This paper attempts to take full account of the nonlinear effects of each factor on the urban digital financial network. Although linear machine learning methods such as logistic regression can partially capture nonlinear relationships between the independent and dependent variables through the link function, the nonlinear relationships they can capture are usually limited. Nonlinear machine learning methods can explore nonlinear effects between variables more comprehensively and are therefore, in theory, more suitable for this study. Since it is difficult to incorporate the time dimension with continuous variables, this paper follows Ng's practice of discretizing the explained variable, using the core-edge structure of Chinese urban digital finance identified from the gravity model [42]. When a city is in the core block of the digital finance network in a given year, the explained variable is assigned a value of 1; otherwise, it is assigned a value of 0. Following common practice in the existing literature, four nonlinear methods, DT, RF, Adaboost, and LightGBM, are selected in this paper.
2.4.1. Decision tree model
The decision tree is a basic classification and regression algorithm. In classification problems, it represents the process of classifying instances based on features. The decision tree learning algorithm usually selects the optimal feature recursively and partitions the training data according to that feature so that each subset is classified as well as possible; this process corresponds both to the partitioning of the feature space and to the construction of the decision tree.
First, the root node is constructed and all training data are placed on it. An optimal feature is selected, and the training data set is divided into subsets according to that feature so that each subset has the best classification under the current conditions. If a subset is already basically correctly classified, a leaf node is constructed and the subset is assigned to it. If a subset is still not basically correctly classified, a new optimal feature is selected for it, the subset is partitioned again, and the corresponding nodes are constructed. This recursion continues until all training subsets are basically correctly classified or no suitable features remain. Finally, each subset is assigned to a leaf node with a clear class, and a decision tree is thereby generated. The decision tree algorithm is easy to read and implement [43].
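The "select an optimal feature and partition the data" step can be illustrated with a minimal Python sketch. It assumes a CART-style criterion (weighted Gini impurity with binary threshold splits) for 0/1 labels; the text does not state which splitting criterion is used, so this choice and the function names are assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(X: List[Sequence[float]],
               y: Sequence[int]) -> Optional[Tuple[int, float, float]]:
    """Return (feature index, threshold, weighted Gini) of the binary split
    'feature <= threshold' with the lowest impurity, or None if no split
    separates the data."""
    best = None
    n = len(y)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


# Toy data: feature 0 separates the two classes perfectly at threshold 2.
X_toy = [[1, 5], [2, 3], [3, 8], [4, 1]]
y_toy = [0, 0, 1, 1]
split = best_split(X_toy, y_toy)
```

A full tree would apply `best_split` recursively to the left and right subsets until a stopping rule, such as a maximum depth, is reached.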
The hyperparameters of the DT method include the maximum depth, the minimum number of divided samples, the minimum number of leaf nodes and the maximum number of leaf nodes. In this paper, when training the DT method, the values of these four hyperparameters are determined to be 10, 1, 1, and 50.
2.4.2. Random forest theory
Random forest is based on decision trees: the variables and data used are randomized to generate multiple classification trees, and the results of the trees are then aggregated. It is an ensemble classification model in which a forest is constructed by a random method and is composed of many decision trees with no correlation between the individual trees. Once the random forest model is obtained, each decision tree in the forest makes its own judgment when a new sample enters. Using the bootstrap resampling technique, K samples are randomly drawn from the original training sample set of size N (K and N are generally the same) to generate a new training sample set, and n classification trees are then generated from the bootstrap sample sets to form a random forest. In essence, the algorithm is an improvement of the decision tree algorithm that combines several decision trees, each of which depends on a separately drawn set of samples.
Assuming that the input sample size is N, the size of each bootstrap sample is also N. Because sampling is done with replacement, the input samples of each tree at training time are not all of the original samples, so overfitting is less likely to occur. Then, m features are selected from the M features ($ m \ll M $). A decision tree is then built on the sampled data using full splitting, until a leaf node either cannot be split further or contains only samples of the same class. Pruning is an important step in many decision tree algorithms. In random forests, however, the two random sampling processes ensure randomness, so overfitting is unlikely even without pruning.
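The two sources of randomness and the final aggregation can be sketched with the Python standard library. The helper names are hypothetical, majority voting is assumed as the aggregation rule, and the tree-fitting step itself is left out.

```python
import random
from collections import Counter
from typing import Hashable, List, Sequence


def bootstrap_indices(n: int, rng: random.Random) -> List[int]:
    """Draw n row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]


def random_feature_subset(n_features: int, m: int,
                          rng: random.Random) -> List[int]:
    """Pick m distinct feature indices out of n_features."""
    return sorted(rng.sample(range(n_features), m))


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Aggregate the class predictions of the individual trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)            # fixed seed so the sketch is repeatable
rows = bootstrap_indices(10, rng)
features = random_feature_subset(8, 3, rng)
label = majority_vote([1, 0, 1, 1, 0])
```

Each tree would be trained on the rows in `rows` restricted to the columns in `features`, and a new city would receive the class that most trees predict.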
The hyperparameters of the RF method include the number of decision trees and the maximum depth of each decision tree. In this paper, when training the RF method, the values of these two hyperparameters are determined to be 100 and 10.
2.4.3. Adaboost algorithm
The Adaboost algorithm is based on a reasonable combination of multiple weak classifiers (weak classifiers are generally chosen as single-level decision trees) to make them strong classifiers.
Adaboost is iterative: only one weak classifier is trained in each iteration, and the weak classifiers already trained are carried into the next iteration. That is, in the nth iteration there are n weak classifiers, of which the first n-1 have already been trained and their parameters are no longer changed, and only the nth classifier is trained in this round. In this process, the nth weak classifier concentrates on the data that the previous n-1 weak classifiers did not classify correctly, and the final classification result depends on the combined effect of the n classifiers.
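The way each round shifts attention to previously misclassified samples can be illustrated with one reweighting step in Python. The sketch assumes the classic discrete AdaBoost update for labels coded as -1 and +1, with sample weights that sum to 1; the variant implemented in the software used for the paper may differ, and the function name is illustrative.

```python
from math import exp, log
from typing import List, Sequence, Tuple


def adaboost_round(weights: Sequence[float], y_true: Sequence[int],
                   y_pred: Sequence[int]) -> Tuple[float, List[float]]:
    """One boosting round: return the classifier weight alpha and the
    renormalized sample weights (labels must be -1 or +1)."""
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-12), 1.0 - 1e-12)   # guard against log(0)
    alpha = 0.5 * log((1.0 - err) / err)
    # Correct predictions (t * p = +1) are down-weighted,
    # mistakes (t * p = -1) are up-weighted.
    updated = [w * exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Four samples with equal weight; the weak classifier gets the last one wrong.
alpha, new_weights = adaboost_round([0.25] * 4, [1, 1, -1, -1], [1, 1, -1, 1])
```

After the update, the single misclassified sample carries half of the total weight, so the next weak classifier is pushed to classify it correctly.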
The hyperparameters of the Adaboost method include the number of base classifiers and the learning rate. In this paper, when training the Adaboost method, the values of these two hyperparameters are determined to be 100 and 1.
2.4.4. The LightGBM model
LightGBM grows trees leaf-wise. At each step, it finds the leaf with the greatest splitting gain among all current leaves and splits it, and then repeats. For the same number of splits, leaf-wise growth can therefore reduce more error and achieve better accuracy. However, if the sample size is small, leaf-wise growth may lead to overfitting, so LightGBM provides the additional parameter max_depth to limit the depth of the tree and avoid overfitting.
The hyperparameters of the LightGBM method include the maximum tree depth, learning rate, L1 regularization, L2 regularization, sample sampling rate, and tree feature sampling rate. In this paper, when training the LightGBM method, the values of these six hyperparameters are determined to be 10, 0.1, 0, 1, 1 and 1, respectively.
2.5. Identification of driving factors
To calculate the magnitude of the influence of each factor on urban digital finance, three steps are required: (i) the models are trained in the SPSSPRO software with the DT, RF, Adaboost, and LightGBM methods as base learners, respectively; (ii) the prediction performance of the four nonlinear machine learning methods is evaluated using a combination of four indicators: accuracy, precision, recall, and F1; (iii) the stability selection method is used to calculate the importance score of each factor, from which the contribution of each influencing factor to urban digital finance is calculated:
Next, this paper describes how to calculate the importance score of each influencing factor of urban digital finance using the stability selection method. The stability selection method proposed by Meinshausen and Bühlmann [44] follows the idea of resampling: the model is trained on different subsamples to obtain multiple variable-screening results, and these results are then used to calculate the importance score of each feature. Because the stability selection method trains multiple models through repeated resampling, the final integrated ranking is more robust than that of a single model. In view of this, this paper uses the stability selection method to calculate the importance score of each influencing factor of urban digital finance.
Specifically, the importance score of each factor is calculated by counting the number of times that factor is selected from the full set of influencing factors of urban digital finance. Since factors with high importance tend to be selected every time, their scores will be close to the number of resamplings, i.e., 50. Influences that are less important but still relevant will have a score between 0 and 50, while irrelevant influences will not be selected in any resampling and therefore have a score of 0 [44]. The calculated importance scores of the individual factors are substituted into Eq (10) to obtain the contribution of each factor to urban digital finance. The contributions are then ranked to determine the main influencing factors of urban digital finance.
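The counting procedure can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the text: each resample is a random half of the data, `select_features` is a hypothetical callable that fits a model on the subsample and returns the indices of the factors it keeps, and the contribution in Eq (10) is taken to be each factor's score divided by the sum of all scores.

```python
import random
from typing import Callable, List, Sequence


def stability_scores(X: Sequence[Sequence[float]], y: Sequence[int],
                     select_features: Callable[[list, list], Sequence[int]],
                     n_resamples: int = 50, sample_fraction: float = 0.5,
                     seed: int = 0) -> List[int]:
    """Count how often each feature index is selected across subsamples."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    size = max(1, int(n * sample_fraction))
    counts = [0] * n_features
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)          # subsample without replacement
        chosen = select_features([X[i] for i in idx], [y[i] for i in idx])
        for f in set(chosen):
            counts[f] += 1
    return counts


def contribution_shares(scores: Sequence[float]) -> List[float]:
    """Assumed form of the contribution: score divided by the total score."""
    total = sum(scores)
    return [s / total if total else 0.0 for s in scores]


# Toy selector that always keeps factor 0 and never keeps factor 1.
X_demo = [[i, 0] for i in range(10)]
y_demo = [i % 2 for i in range(10)]
scores = stability_scores(X_demo, y_demo, lambda Xs, ys: [0])
shares = contribution_shares([50, 25, 0, 25])
```

With 50 resamples, a factor that is always selected scores 50 and a factor that is never selected scores 0, matching the range described above.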
3. Urban digital financial network characteristics
The core-edge structure of the urban digital financial network shows gradual deepening and orderly distribution. The core blocks identified by the discrete model of the core-edge structure are shown in Table 1. From 2010 to 2020, the number of cities in the core block expanded from 83 to 113, so the core block in the core-edge structure of China's urban digital financial network kept growing. The average value of the urban digital finance index increased from 103.65 in 2010 to 106.24 in 2020, an average annual growth rate of 0.25%. The evolution of core-block cities presents two significant features. First, the growth of core cities is clearly path dependent: cities that have historically been in core positions tend to keep them. The cities in core positions in 2010 include 83 cities such as Guangzhou, Ordos, and Beijing, and the core cities in 2020 add 53 cities such as Nantong, Fuzhou, and Zhengzhou, which means that the development of digital finance strengthens the status of existing centers and begins to expand outward gradually. Second, with the deepening of digital finance linkages, the concentration of urban cores in China has tended to decline over the past decade. On the one hand, the HHI index of urban centrality in the core block decreased from 0.0214 in 2010 to 0.01395 in 2020, and the centrality of the top-ranked city in the network power system decreased, while the centrality of 53.98% of the cities in the core block increased, which may mean that the distribution of urban network power is shifting to a functional polycentric pattern. 
The skewness and kurtosis of the core degree of the whole sample cities decreased from 4.037 and 23.068 in 2010 to 2.55 and 7.902 in 2020, respectively, and the development of the core-edge structure of China's urban digital financial network is accompanied by the decentralization of urban network power. Third, further study of the topology of Chinese urban digital financial networks reveals that the average path length of Chinese urban networks decreased from 3.672 in 2010 to 2.294 in 2020, and the clustering coefficients calculated based on local density and transmissibility increased from 0.323 and 0.083 in 2010 to 0.400 and 0.212 in 2020, respectively, indicating that the small-world characteristics of urban digital financial networks are becoming more and more obvious. Among them, core cities such as Beijing, Wuxi and Nanjing have network connections with a large number of edge cities, and the connections between core cities and edge cities are getting closer.
Core cities have reciprocal relationships with each other, whereas peripheral cities lack connections among themselves. Based on the spatial association links of digital finance development among cities determined by the improved gravity model, a relationship matrix is established, and the spatial association network maps of digital finance among China's core cities in 2010 and 2020 are drawn using Netdraw, a visualization tool in the Ucinet software, as shown in the following figure. From 2010 to 2020, the network links among cities increased significantly, and the spatial linkages became denser and deeper. On the one hand, the network density D of digital financial spatial linkages among Chinese cities rose from 0.0324 in 2010 to 0.1008 in 2020, an increase of 211% over the whole study period, indicating that the interconnection among cities in the spatial structure of digital financial networks is gradually increasing and that spatial digital financial interactions among cities are becoming more frequent. The increase in inter-city interconnection helps to promote the overall digital financial strength of China. On the other hand, the inter-city network density D remains relatively low, with a maximum of only 0.1008 during 2010–2020, indicating that the spatial digital financial linkages among cities are still weakly connected and that inter-city linkages need further strengthening. Geographically dispersed core cities form cohesive subgroups through reciprocal network links, which differs from Friedmann's proposal that core areas are urban agglomerations or metropolitan clusters based on geographical proximity. In addition, the overall network linkage consists of long-distance economic ties, and the numerous, widely distributed peripheral cities have network links mainly with the core cities. 
The economic linkages between cities have transcended the limits of geographical distance, but the network links are overly dependent on the core cities, and the lack of economic linkages between the peripheral cities makes the overall network structure unstable.
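The density figures above can be checked with a short calculation. This is a minimal sketch, assuming the relationship matrix is binary and directed so that density is the number of observed links divided by n(n - 1); the 3-city matrix and the function names are hypothetical, and only the values 0.0324 and 0.1008 come from the text.

```python
def directed_density(matrix):
    """Observed off-diagonal links divided by the n * (n - 1) possible directed links."""
    n = len(matrix)
    links = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return links / (n * (n - 1))

def relative_increase(start, end):
    """Growth of `end` over `start`, expressed as a fraction of `start`."""
    return (end - start) / start

# Hypothetical 3-city relationship matrix with 2 of 6 possible directed links.
toy_matrix = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
toy_density = directed_density(toy_matrix)

# Densities reported in the text for 2010 and 2020.
growth = relative_increase(0.0324, 0.1008)
growth_percent = round(growth * 100)
```

With the reported densities, `growth` is about 2.11, which corresponds to the 211% increase stated above; even after this growth, only about one in ten possible inter-city links is present.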
Core cities match the distribution of structural holes, and the development of edge cities is limited by network capital. Further study of the effective size and betweenness (intermediary) centrality of structural holes reveals that core cities tend to match the distribution of structural holes, whereas the development of edge cities is limited by network capital. On the one hand, the top ten cities by effective size in 2010, 2015 and 2020 all belong to the core cities and have high betweenness centrality (Table 2), which indicates that cities occupying structural holes are more likely to evolve into core cities. Core cities such as Beijing, Ningbo and Guangzhou assume the function of national resource bridging hubs; Fuzhou and Nantong have significantly increased their effective size but have relatively low betweenness centrality, indicating that these cities have abundant non-redundant links but weak resource bridging capacity. On the other hand, the Zipf index decreased from 0.9013 to 0.8429 from 2010 to 2020 (Figure 2); over this period the effective size of 271 cities improved, but the effective size of cities in the upper tail of the ranking is generally higher than that of cities in the lower tail. This means that the distribution of effective size converges to Zipf's law: cities in the most peripheral positions, such as Hegang and Tongchuan, tend to lack network capital and develop slowly, while the core cities occupying structural hole positions can further strengthen their network location advantages by virtue of their advantages in resource control and information. The enhancement of cities' network status depends on their influence in the network, and the core-edge structure is likely to persist in the future.
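To make the notion of non-redundant links concrete, the effective size of a city's ego network can be sketched as follows. This is an illustration only, assuming an undirected, unweighted network and the simplified form of Burt's measure (number of contacts minus twice the number of ties among those contacts divided by the number of contacts); the toy network and function name are hypothetical, and the Ucinet routine applied to the directed city network may use a different variant.

```python
def effective_size(adj, ego):
    """Simplified effective size for an undirected, unweighted ego network.

    n is the number of contacts of `ego`, t the number of ties among those
    contacts (each counted once); the result is n - 2 * t / n.
    """
    alters = adj[ego]
    n = len(alters)
    if n == 0:
        return 0.0
    ties = sum(1 for a in alters for b in adj[a] if b in alters and a < b)
    return n - 2 * ties / n

# Hypothetical network: hub H bridges three otherwise unconnected cities,
# while X, Y and Z form a closed triangle with fully redundant contacts.
toy = {
    "H": {"P", "Q", "R"},
    "P": {"H"}, "Q": {"H"}, "R": {"H"},
    "X": {"Y", "Z"}, "Y": {"X", "Z"}, "Z": {"X", "Y"},
}
hub_size = effective_size(toy, "H")       # all three contacts are non-redundant
triangle_size = effective_size(toy, "X")  # the two contacts know each other
```

In this toy case the hub keeps an effective size equal to its number of contacts, whereas the triangle member's two contacts count as roughly one, which mirrors the distinction drawn above between cities with abundant non-redundant links and cities with redundant ones.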
4.
Urban digital financial network characteristics
4.1. Selection of drivers
From the above analysis, it can be seen that the overall spatially connected network of digital finance in Chinese cities has become denser over time, but the strength of the connections varies considerably, showing a pronounced Matthew effect. Based on existing studies and considering data availability, this paper considers the influence of four types of factors on urban digital finance, namely economic development, government intervention, technological innovation and social construction, and sets the sample period to 2010–2020 [45,46]. The specific indicators are shown in Table 3. To eliminate the influence of differing units of measurement, all indicators are standardized using the Z-Score method in the following calculations [47,48,49].
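The Z-Score standardization mentioned above subtracts the mean of an indicator and divides by its standard deviation. A minimal sketch follows, assuming the population standard deviation is used (the paper does not state which variant was applied); the indicator values and the function name are hypothetical.

```python
from statistics import mean, pstdev

def z_score(values):
    """Standardize a list of indicator values to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("indicator is constant and cannot be standardized")
    return [(v - mu) / sigma for v in values]

# Hypothetical raw values of one indicator for five cities.
raw = [5304.0, 20000.0, 53000.0, 120000.0, 355301.0]
standardized = z_score(raw)
```

After this step, indicators measured in very different units, such as loan balances and enrolment ratios, are on a comparable scale.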
4.2. Performance evaluation of different machine learning methods
This paper focuses on four nonlinear machine learning methods: DT, RF, Adaboost, and LightGBM. Two main conclusions can be drawn from Table 5 below. First, all four nonlinear methods exhibit good prediction performance: each method scored more than 70% on all four metrics, namely accuracy, precision, recall, and F1. Second, LightGBM and RF performed even better, outperforming the other two methods on almost all metrics. In view of this, the empirical analysis mainly refers to the results of the LightGBM and RF methods. To simplify the analysis while keeping the results reliable, the contribution rates of each influencing factor calculated by LightGBM and RF are averaged, hereinafter referred to as the "average results of the two methods". The average results of DT, RF, Adaboost, and LightGBM, referred to as the "average results of the four methods", are also reported as supporting evidence for the reliability of the benchmark results.
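For reference, the four evaluation metrics can be computed from predicted and true labels as sketched below. This is a minimal illustration that assumes a binary classification target; the label vectors are invented for the example and are not the paper's data, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 3 true positives, 3 true negatives,
# 1 false positive and 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

For multi-class targets, precision, recall and F1 would additionally require an averaging rule (for example macro or weighted averaging), which the text does not specify.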
4.3. Empirical analysis of the main drivers
Based on the gravity model constructed in this paper, the factors influencing the digital financial network of Chinese cities in 2010–2020 are identified using the SPSSPRO software, with attention to the robustness of the results. The following main conclusions are obtained.
The empirical analysis below is carried out with SPSSPRO. Because the classification models are stochastic, each output differs; to ensure robust results, each of the four methods (DT, RF, Adaboost and LightGBM) is therefore run repeatedly, and the final results are calculated by averaging 50 outputs of each method.
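The averaging procedure can be sketched in a few lines. The example below is an assumption-laden illustration: the factor names and contribution values are invented, only two or one outputs per method are shown instead of 50, and the helper names are hypothetical; in practice the per-run contribution rates would be exported from SPSSPRO.

```python
def mean_contribution(outputs):
    """Average each factor's contribution over a list of outputs (dicts)."""
    factors = outputs[0].keys()
    return {f: sum(run[f] for run in outputs) / len(outputs) for f in factors}

def average_over_methods(per_method, methods):
    """Average the per-method mean contributions over the chosen methods."""
    return mean_contribution([mean_contribution(per_method[m]) for m in methods])

# Hypothetical contribution rates (fractions summing to 1 within each output).
per_method = {
    "DT":       [{"f1": 0.2, "f2": 0.8}],
    "RF":       [{"f1": 0.6, "f2": 0.4}, {"f1": 0.8, "f2": 0.2}],
    "Adaboost": [{"f1": 0.4, "f2": 0.6}],
    "LightGBM": [{"f1": 0.5, "f2": 0.5}, {"f1": 0.5, "f2": 0.5}],
}

two_method_avg = average_over_methods(per_method, ["RF", "LightGBM"])
four_method_avg = average_over_methods(
    per_method, ["DT", "RF", "Adaboost", "LightGBM"])
```

Averaging first within each method and then across methods gives every method equal weight regardless of how many outputs were collected for it.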
The urban digital financial network is significantly influenced by year-end loan balance, science and technology expenditure and per capita gross regional product. This paper first examines the five most important influencing factors of the urban digital financial network. Panel A of the table presents the average results of the two methods, RF and LightGBM, and Panel B presents the average results of the four methods, DT, RF, Adaboost, and LightGBM.
Considering that the prediction performance of RF and LightGBM is significantly better than the other methods, this paper focuses on the average results of these two methods, i.e., the results in Panel A part of the table. The five important influencing factors of urban digital finance are total year-end loans, per capita gross regional product, science and technology expenditure, number of students in higher education as a proportion of the total population, and total retail sales of consumer goods, and the sum of their contributions reaches 55.37%. The top 5 factors in Panel B are nearly consistent with those in Panel A (only the ranking order is slightly different).
Economic development factors are the most important influencing factors for the development of digital financial networks. Table 8 shows the influence of the individual economic development factors on urban digital finance. According to the average results of the RF and LightGBM methods, the total contribution of economic development factors to the urban digital financial network reaches 46.17%; the contributions of the year-end loan balance and per capita GDP each exceed 12% and rank first and second, respectively. In contrast, the contribution rates of disposable income per capita and the share of tertiary industry in regional GDP are relatively low and rank around tenth. The average results of DT, RF, Adaboost, and LightGBM are basically consistent with the average results of the RF and LightGBM methods.
The contribution of the year-end loan balance is 13.38%, ranking first. The year-end loan balance of financial institutions can reflect the investment status of the region to a certain extent. The average level of the loan balance at the end of the year is 325, 53, 685.695, with the lowest level being 325, 53, 685.695 in Guyuan City in 2010 and the highest level being 810,000,000,000 in Beijing City in 2020. Financial institutions granting loans in the region imply an increase in regional investment, which ultimately promotes regional economic development and, to a certain extent, reflects the strength of financial support for economic development. By introducing diversified credit products, moderately lowering the loan threshold, lowering the loan interest rate, and extending the loan term, greater credit support can be provided for urban digital finance. The contribution rate of per capita GDP is 12.51%, ranking second. A city's real economy is an important foundation for the development of digital finance, and a city's steadily improving economic development and higher per capita GDP will attract more fintech enterprises, talents, high-tech as well as domestic and overseas investment and other resources to gather there, thus further promoting the development of the city's digital finance network. The average level of GDP per capita is 53,607.731, with the lowest level being 5304 in Dingxi City in 2010 and the highest level being 355,301 in Ordos City in 2019.
Social construction factors have a significant impact on urban digital financial networks. The impact of the individual social construction factors on urban digital finance is listed in Table 9. According to the average results of the RF and LightGBM methods, the total contribution of social construction factors to the urban digital finance network reaches 27.77%, of which the ratio of students in higher education to the total population contributes 8.90%, ranking fourth. The contributions of digital finance concern and the Internet are also reported in Table 9. The average results of DT, RF, Adaboost, and LightGBM are basically consistent with the average results of the RF and LightGBM methods.
The contribution of the ratio of students enrolled in higher education to the total population is 8.90%, ranking fourth. The level of education can be converted to a certain extent into financial literacy, which is an intrinsic driver of digital finance development in cities. The average ratio of students in institutions of higher learning to the total population is 0.01867, with the lowest level being 0.00005 in Bazhong in 2013 and the highest level being 0.13269 in Guangzhou in 2020. First, generally speaking, the higher the education level, the stronger the cognitive ability and the greater the acceptance of technological development and financial participation. Second, a good education makes people more capable of using their own knowledge to enhance their financial participation and thus to use diversified financial products to meet their asset management needs. Stimulated by these demands, the providers of relevant financial products and services can improve their product innovation capabilities more efficiently and precisely, so that both the demand side and the supply side support the development of digital finance in cities. Cultivating talent for the digital economy requires a more complete education system that emphasizes basic education on the one hand and innovation and creativity on the other, providing the intellectual foundation for the development of digital finance and thus promoting the further development of urban digital financial networks.
The influence of government intervention factors and science and technology innovation (STI) factors on urban digital financial networks is relatively small. Table 10 shows that, according to the average results of the RF and LightGBM methods, the contribution rate of government intervention factors is 16.34% and the contribution rate of STI factors is 9.72%, ranking third and fourth among the four major categories of factors, respectively. The contribution rate of science and technology expenditure is 12.15%, ranking first among these factors. The contribution rates of digital finance policy support and the number of fintech enterprises are lower, ranking 12th and 13th, respectively. The average results of the four methods, DT, RF, Adaboost, and LightGBM, are basically consistent with the average results of the two methods, RF and LightGBM.
The contribution rate of science and technology expenditure is 12.5%, ranking third. On the one hand, the development of science and technology is a prerequisite for the development of a digital economy, which can directly influence the innovative development of digital technology; on the other hand, the development of science and technology can promote the effective concentration of innovation resources, cultivate an innovation-driven atmosphere and drive the development of the digital economy. The average level of science and technology expenditure is 110,518.427, with the lowest level being 190 in Xinzhou City in 2020 and the highest level being 5,549,817 in Shenzhen City in 2018. Science and technology is the key influence on the level of digital finance, and the use of digital technology promotes the continuous development of digital finance, gradually changing the way people enjoy financial services and greatly reducing the cost of access to finance; therefore, it is important to increase the investment in digital finance research and scientific and technological innovation for the development of the urban digital financial network.
5.
Conclusions and policy recommendations
There are few existing empirical studies on urban digital financial networks in China, and they mainly apply the non-parametric QAP test to conduct correlation and regression analysis of the driving factors. This approach makes it difficult to fully examine the influence of each factor on urban digital financial networks, and the number of factors examined is relatively limited. Based on the modified gravity model, this paper constructs the digital financial network of Chinese cities from 2010 to 2020, identifies core and edge cities in different periods using the core-edge structure model, and analyzes the evolution of the core-edge structure from multiple perspectives; it then uses four nonlinear machine learning methods, DT, RF, Adaboost and LightGBM, to identify the drivers of the urban digital financial network. The main conclusions are as follows.
First, the core-edge structure of the urban digital financial network shows the characteristics of gradual deepening and orderly distribution. Between 2010 and 2020, the core block of the core-edge structure of China's urban digital financial network kept expanding, with the number of cities in the core block rising from 83 to 113. With the deepening of digital financial linkages, the concentration of urban cores in China tended to decline over the decade, and the distribution of urban network power shifted toward a functionally polycentric pattern. The small-world characteristics of urban digital financial networks are becoming more pronounced, and the connections between core and peripheral cities are getting closer.
Second, the core cities show reciprocal relationships and the peripheral cities lack connections with each other. From 2010 to 2020, inter-city network links increased significantly, and the spatial association relationships became denser and deeper. The economic linkages between cities have transcended the limits of geographical distance, but the network links are overly dependent on the core cities, and the lack of economic linkages between the peripheral cities makes the overall network structure unstable.
Third, core cities match the distribution of structural holes, and the development of edge cities is limited by network capital. Core cities occupying structural hole positions can further strengthen their network location advantages by virtue of their resource control advantages and information advantages, while cities at the most peripheral positions often lack network capital and develop slowly. The enhancement of cities' network status depends on their influence in the network, and the core-edge structure is likely to persist in the future.
Fourth, urban digital financial networks are significantly influenced by year-end loan balances, science and technology expenditures, and per capita gross regional product. Both the average results of the two methods and the average results of the four methods place the year-end loan balance, per capita gross regional product, and science and technology expenditure in the top three by importance, with contributions exceeding 10%, and these factors drive the development of urban digital financial networks. Therefore, it is important to invest in digital economy research and technological innovation, to continuously raise the level of regional economic development with attention to the economic growth effect, and to improve credit support, thus further promoting the development of urban digital financial networks.
Based on the above findings, this paper has five policy recommendations.
First, promote economic development and play the role of the economy in driving digital finance. From the above empirical results, we can see that GDP can significantly promote the development of digital finance. The higher the economic level, the higher the quality of people's lives, and the higher the demand for financial services, which will promote the development of digital finance. Therefore, we should focus on improving the level of economic development, speeding up the flow of capital, improving the supply and demand of financial services, and providing good economic conditions for the development of digital finance.
Second, improve the credit risk management system of digital finance to support the healthy development of digital finance. Strengthen the construction of internal risk control mechanism of digital finance enterprises, set up special risk control teams and attach great importance to risk management; improve the construction of personal credit system, take advantage of big data of digital finance enterprises, unify credit management methods and introduce credit evaluation mechanism into digital finance business.
Third, promote the development of digital finance with the power of science and technology. The use of digital technology substantially lowers the costs borne by financial service providers, lowers the cost of access to finance for users, and gradually changes the way people enjoy financial services, ultimately promoting the continuous deepening of financial development. The government should introduce policies to encourage the development of digital technology, increase science and technology spending, and drive the development of digital finance with technological progress.
Fourth, improve the level of financial services in less developed cities. Since the economically developed core cities hold advantages in resource control and information, the level of financial services in the marginal cities of less developed regions should be improved, the excessive tilting of resources toward developed cities should be corrected, and the less developed cities should be given the necessary support in terms of policy, technology and capital, so as to avoid a widening economic gap between developed and less developed regions and to promote the coordinated development of the regional economy.
Fifth, enhance the awareness of "sharing and synergy" in development and promote spatial linkage among cities. Cities should strengthen the positive interaction of digital finance development, enhance mutual communication and cooperation, and minimize the obstacles and barriers caused by geographical characteristics and differences in endowments. A mechanism of targeted assistance from central cities to other cities can be established so that central cities effectively play a radiating and linking role, from far to near and from point to point, thereby promoting the coordinated development of urban digital finance networks nationwide.
Conflict of interest
The authors declare there is no conflict of interest.
Acknowledgments
The authors would like to acknowledge the support of China Postdoctoral Science Foundation (No. 2022M720879).