Research article

Machine learning model of tax arrears prediction based on knowledge graph

  • Most of the existing research on enterprise tax arrears prediction is based on the financial situation of enterprises. The influence of various relationships among enterprises on tax arrears is not considered. This paper integrates multivariate data to construct an enterprise knowledge graph. Then, the correlations between different enterprises and risk events are selected as the prediction variables from the knowledge graph. Finally, a tax arrears prediction machine learning model is constructed and implemented with better prediction power than earlier studies. The results show that the correlations between enterprises and tax arrears events through the same telephone number, the same E-mail address and the same legal person commonly exist. Based on these correlations, potential tax arrears can be effectively predicted by the machine learning model. A new method of tax arrears prediction is established, which provides new ideas and analysis frameworks for tax management practice.

    Citation: Jie Zheng, Yijun Li. Machine learning model of tax arrears prediction based on knowledge graph[J]. Electronic Research Archive, 2023, 31(7): 4057-4076. doi: 10.3934/era.2023206




Why is it so difficult to be transparent about measurement errors in macroeconomic statistics? Measurement error is pervasive in macroeconomics, as in most applied sciences; however, the explicit recognition of macroeconomic inaccuracies is by and large limited to extraordinary times, such as the lockdown periods of the COVID-19 pandemic, or to isolated cases where specific political and policy demands stood at the basis of evaluations of measurement error. These surges of recognition and the ensuing methodological improvements do not have a lasting impact. Indeed, measurement errors and biases remain at similar levels. Of course, the registration of certain items improved due to the enormous increase in computing power, the development of new and advanced methodologies, and the advent of new and exciting modes of observation. On balance, however, the accuracy of economic observations does not show a tendency to improve. This sobering observation reflects Simpson's Paradox: even when observations of all parts of an economic system improve, the aggregates can deteriorate due to structural changes. In this case, the shifts are from easily observed and accurately measured activities to invisibles and difficult-to-measure activities (Griliches, 1994). Moreover, statistics are social constructs, the underlying concepts change due to macro-behavioral and institutional change, and the dynamics of modern economies mean that new activities constantly emerge; new activities are by their very nature more difficult to observe. Therefore, macro-measurement error was, is, and will be a significant and economically relevant problem. Progress is slow, as illustrated by the perspective that Zoltan Kenessey (Director of the International Statistical Institute) provided on Morgenstern's (1950) On the Accuracy of Economic Observations, a classic but forgotten and ignored diagnosis of macroeconomic and macrosocial statistical observations: 'at the minimum, one feels that the increases in accuracy and timeliness of economic statistics are less than one would have expected about fifty years ago when Morgenstern was working on his book. Moreover, this is more surprising in light of the profound electronic revolution which occurred over this period' (Kenessey, 1997, p. 248).

In the quarter of a century since this appraisal, much has been achieved by the statistical community in terms of timeliness. However, on the eve of the 75th anniversary of Morgenstern's On the Accuracy of Economic Observations, macro-measurement errors continue to be at a similar level as observed in the 1950s. Two important and widely discussed cases may illustrate this point: first, the consumer price index (CPI) and its derivative, inflation, which, according to Reiss (2008, p. 61), provides the only systematic discussion of measurement issues in economics; and second, the balance of payments registration problems that in the 1980s led to a concerted action for improved measurement under the aegis of the IMF (International Monetary Fund).

The influential Boskin (1996) report on the US CPI appeared in a period when central banks were moving towards inflation targeting, and policy makers wanted to reevaluate Social Security spending and retirement and compensation programs that were based on an upwardly biased CPI. The interest of academics in consumer price statistics reflects the core role of price in economic theory; the official inquiry responded to the fact that the CPI is an important input of economic policy and wage policy, and to the fact that price indices are necessary for the deflation of nominal data into 'real' data. The Boskin committee's finding of an upward bias of ¾ to 1½ percentage points and its proposals for improvement were influential, stimulating best practices both in the US and abroad (Moulton, 2018). The enthusiastic engagement of statisticians, professional economists, and academic researchers seemed hopeful. However, according to Feldstein (2017), '(…) despite the attention for this subject in the professional literature, there remains an insufficient understanding of just how imperfect the existing official estimates actually are'. While helping to address and reduce sources of measurement error, the Boskin report did not eliminate the CPI bias. Indeed, the upward bias of the US CPI persists at comparable levels due to new measurement challenges related to (de)globalization, the changing role of services, and the growing share of internet transactions (Goolsbee and Klenow, 2018; Matolcsy et al., 2020).

The IMF (1983, 1987, 1992) reports on the accounting problems of international financial flows in the balance of payments appeared at a time when the size of the item 'errors and omissions' made it difficult, if not impossible, to glean whether the capital account of many OECD countries was in surplus or in deficit. In 1992, the IMF observed that 'statistical problems have worsened dramatically and may well continue to worsen in the absence of a major effort to improve the data' (IMF, 1992, p. 1). This sense of urgency made a concrete impact; unfortunately, the improvement did not last, as illustrated in Figure 1. Figure 1 reports the world's current account as reported in the most recent vintage of the IMF's World Economic Outlook database. Since our planet does not trade with the Moon or Mars (yet), the world's current account should logically be zero. Therefore, its statistical imbalance in the face of non-existing extraterrestrial trade reflects the underlying measurement errors (Grassman, 1983; van Bergeijk, 2013, 2024). The implicit measurement errors reveal two phases: first, up to and including 2003, a negative world current account of on average −0.4 percent with periodic movements towards zero; and second, a phase starting in 2004 of a persistent surplus of similar size, but with no apparent drift towards its logical value of zero1.

1This issue can also be analysed from the perspective of the world's saving-investment balance (the mirror of the current account in an autarkic planetary system), with similar results.

Figure 1.  World current account in percent of Gross Planetary Product (1980–2023). Note: period averages 1980–2004 and 2005–2023, respectively. Source: IMF World Economic Outlook database (April 2024 version), available at: https://www.imf.org/en/Publications/WEO/weo-database/2024/April, accessed June 1, 2024.
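To make the logic behind Figure 1 concrete, the following minimal Python sketch uses purely hypothetical country balances and a hypothetical GPP value (not WEO data): because the true world current account must sum to zero, whatever residual remains after aggregation is a lower bound on the underlying measurement error.

```python
# Sketch: the world current account must sum to zero, so any non-zero aggregate
# is attributable to measurement error. All numbers below are hypothetical.

# Hypothetical reported current account balances, billions of US dollars
reported_current_accounts = {
    "economy_A": 350.0,
    "economy_B": -420.0,
    "economy_C": 180.0,
    "economy_D": -60.0,
}
world_gdp = 105_000.0  # hypothetical Gross Planetary Product, billions of US dollars

world_balance = sum(reported_current_accounts.values())  # should be zero in theory
implied_error = 100.0 * world_balance / world_gdp        # residual in percent of GPP

print(f"World current account residual: {world_balance:+.0f} bn "
      f"({implied_error:+.2f}% of GPP)")
```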

The two cases illustrate the persistent and non-transitory nature of macroeconomic measurement error2. These cases are special because of the professional recognition of measurement error. However, broad-based awareness typically appears to be mainly limited to the macroeconomic statistics of the Global South, in particular Africa (Jerven, 2013), and to a critical emerging literature on 'incredible certitude' (Manski, 2011, 2015, 2020), 'pseudo accuracy' (de Jonge, 2020), and 'mock accuracy' (Linsi et al., 2023). A symptom is the remarkable phenomenon of 'digit-inflation' (van Bergeijk, 2024), which is the growing tendency to report statistics with many more digits than can be justified on the basis of the accuracy of the data. The lack of transparency regarding the measurement error and digit-inflation go hand in hand. This article argues that their self-reinforcing cycle can only be broken if our profession starts investigating and transparently reporting the inaccuracies of observations on a regular basis. This needs to happen in statistical institutions, in academia, and in policy circles. While the focus of the present article is on National Accounting, the problems in other fields of macroeconomic observation, such as (un)employment and inflation, are of a similar size and importance.

    2In van Bergeijk (2024) I review the areas studied in Morgenstern (1963) and find that macroeconomic measurement error is still of similar size as in his case studies.

    The remainder of this article is structured as follows. Section 2 starts with a discussion of National Accounting from the perspective that the approach was originally intended to deal with measurement errors and that best practices initially included the explicit and transparent reporting of margins of error of the reported aggregates and/or components. Based on concrete examples, the section argues that it is possible to return to the transparency of the pioneering era of national accounting. Section 3 presents the case of the growth rate of Gross Planetary Product (GPP, aka world production) to illustrate the persistence of non-transient data inaccuracies based on 37 vintages of the IMF's World Economic Outlook database. The section zooms in on the measurements regarding three important global recessions: the Volcker recession at the start of the 1980s, the Great Recession of the Financial Crisis in 2007–2009, and the Great Lockdown of the COVID-19 pandemic in 2020. Section 4 argues for a coherent strategy based on the duty to inform (a professional standard for data producers), a return to transparency on inaccuracies, and empowering data users. Section 5 offers some concluding observations.

For the pioneers of national accounting, the inaccuracy, imprecision, and incompleteness of (macro)economic data to the tune of ten to twenty percent were realities, much more so than for the current user of the accounts and their most important headline statistic, national income. This awareness is clear from the start in the combination of three lenses or measurements as a method to reduce measurement errors (Stone et al., 1942; see also Comim, 2001), via the use of three different accounting approaches (the income, expenditure, and production accounts). Importantly, the teams that developed the first national accounts in the United Kingdom and the United States frankly provided expert opinions on the reliability of (components of) national income (Mitchell et al., 1922; Kuznets, 1941; see also Biddle and Boumans, 2021). These publicly available expert opinions on inaccuracies are best practices that also accompanied the earliest publications of the national accounts in other countries (see Table 1 on the self-reported error ranges that accompanied the National Accounts in the Netherlands).

Table 1.  Self-reported error ranges of National Accounts line items (Netherlands, 1950).

Grade | Examples                                    | Self-reported error range
a     | Indirect taxes (receipts), wages            | 2%–5%
b     | Capital goods, profits, current account     | 5%–10%
c     | Household saving, depreciation, inventories | 10%–20%
d     | Reserves (larger firms)                     | > 20%

Source: Centraal Bureau voor de Statistiek (1953).


This transparency is in sharp contrast with current practices. Notifications such as the following explanation from the UK Office for National Statistics are common: "The estimate of GDP (…) is currently constructed from a wide variety of data sources, some of which are not based on random samples or do not have published sampling and non-sampling errors available. As such, it is very difficult to measure both the error aspects and their impact on the GDP. While development work continues in this area, similar to all other G7 national statistical institutes, we do not publish a measure of the sampling error or non-sampling error associated with GDP."3

    3https://www.ons.gov.uk/economy/grossdomesticproductgdp/methodologies/grossdomesticproductgdpqmi accessed August 1, 2023.

The choice not to publish a measure of inaccuracy had already become common practice in the early 1960s (Morgenstern, 1963, pp. 252–253). This reflects the hesitance of the statistical community to provide error estimates unless these are themselves accurate. However, in this case, 'better is the enemy of good'. Indeed, the illusion of accuracy simply continues if the measurement error is only reported for cases where the inaccuracy could be attributed and estimated with precision. Of course, there is a well-developed standard practice to provide the audience with the headline figures and a discussion of uncertainty, though this suffers from the Caveat Fallacy. After all, if the audience sees or hears a number, then the number is the message it gets, whatever the data supplier says and writes about its uncertainty and inaccuracy (Guberek, 2019). Additionally, the inaccuracy of economic statistics is not a regular concern in academia, perhaps because the national accounting procedures have become too complex and difficult to grasp for outsiders (Coyle, 2017), or because of a disinterest in the exact definitions, data collection, measurement issues, methods, and their exact documentation (Knibbe, 2019). Anyhow, for most economists, national income is, in the words of O'Brien (1998, p. 2), a fait accompli, which reflects the general assumption that the definition and measurement of the GDP were settled long ago on the basis of the 'fundamental notions of economic theory'.

In recent discussions on the reporting of inaccuracies and uncertainties by an official institution, the Bank of England often features as a prominent example (e.g., Manski, 2015; van der Bles et al., 2019). The Bank regularly publishes a band for historical GDP data in its Monetary Policy Report. While this at least helps to communicate that the recent past is, statistically speaking, more fluid than users assume, the problem with the graphical representation of the GDP inaccuracy (incidentally, already pioneered by Morgenstern, 1963, Figure 10) is that it does not address the important 'Why?' and 'Does it matter?' questions. Moreover, there is a fundamental problem with the band because it assumes time stationarity of the data measurement and revision processes (Manski, 2019, p. 7639). This is problematic because the reported historical data are in different stages of revision (it is also unclear which data vintage is actually being reported). As will become clear in Section 3, some years are much more likely to be revised than others and often show no convergence towards a 'final estimate'.

Answering 'Why?' and 'Does it matter?' are important steps towards understanding macroeconomic data uncertainty. The essential first step is to make available the material that is already collected and evaluated during the process of National Accounting but typically remains under the hood. Best practice examples that provide much needed transparency are Akritidis (2002), Harchaoui et al. (2004), and Meader and Tily (2008). Akritidis (2002) reported on the accuracy assessment by the UK Office for National Statistics of the 2000 GDP, detailing the data sources (surveys, administrative data, models), adjustments (validation, conceptual issues, balancing), quality indicators (response rate, error, coverage), and the GDP components that were impacted, and noting, amongst others, a 3.5% adjustment in the income approach. Harchaoui et al. (2004) discussed the industry data quality in the input-output tables of Statistics Canada and exemplified how and why reliability issues arose from imputed data, the lack of deflators, or series breaks. They provided comparative analyses of the data series and found that the labor data was mostly reliable, while the capital data was less so, and that lower aggregation levels were associated with lower reliability. As a part of the plans for the modernisation of the UK National Accounts, Meader and Tily (2008) monitored the quality of quarterly estimates through 2007 and 2008. They provided, inter alia, information on price coherence adjustments, the size and development of revisions, and real-time alignment adjustments and residual errors. They discussed the main surveys and sources that feed into GDP assessments and provided benchmarks against external data. These three examples show that a lot of information is already available. There is no good reason why this kind of essential metadata should not become a regular element of standard reporting in the framework of National Accounting.

Discomfort with timeliness and accuracy, especially during the 'data fog' of the COVID-19 pandemic, has led academics on a quest for new modes of observation. These new modes include satellite observation of night light intensity (Nordhaus and Chen, 2015; Hu and Yao, 2022), social media (Indaco, 2020), and even seismometers (Tiozzo-Pezzoli and Tosetti, 2022). The increase in measurement rods holds potential for improving the measurements, though it is worrying that conceptual and measurement errors are typically overlooked. At the same time, the academic discourse is strange because it challenges a statistical system that has clear strengths. National accounting has a long history of reliable application, is solidly grounded in theory, and aims to reduce measurement errors by design. Compare this to the new modes, which consist of blended opportunity data collected without a theory of measurement, quality standards, or transparency, and more often than not without a history of application. There is still a lot of simplicity in papers that use administrative data and Big Data. All too often, these contributions focus only on (assumed) promises and opportunities without consideration of the drawbacks, which include privacy issues, discontinuities, erratic access, inconsistent methodologies, and bias. Big Data users (academics and policy makers alike) assume that Big Data's coverage is better and much more complete. Practice provides sobering lessons, a prime example of which is the so-called Big Data Paradox (Meng, 2018): increasing the sample size suggests increased accuracy, as it narrows the confidence intervals; however, at the same time, it magnifies the effect of survey bias: 'the more the data, the surer we fool ourselves'. A simulation of this point is sketched below. Additionally, there is a lot of naivete in the hope that artificial intelligence (AI) in combination with the new modes of observation can provide better and more timely estimates of key macroeconomic statistics. The training material for AI models would a priori seem to be compromised because measurement errors are currently not reported; thus, AI models will tend to simply repeat the current bad practices. Related risks are the dearth of training materials for AI models and AI hallucination when data are lacking.
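The following minimal Python simulation makes the Big Data Paradox concrete under an assumed, purely illustrative selection bias: for a fixed bias, a larger sample narrows the nominal confidence interval around the wrong value, so the interval is ever less likely to cover the true mean.

```python
# Sketch of the Big Data Paradox (Meng, 2018): under a fixed selection bias, a
# larger sample narrows the nominal confidence interval around the wrong value,
# so the interval is ever less likely to cover the true mean. Illustrative only.
import random
import statistics

random.seed(42)
TRUE_MEAN = 0.0
BIAS = 0.02  # hypothetical tilt introduced by non-random inclusion

def biased_sample(n):
    """Draw n observations whose recording is tilted towards higher values."""
    return [random.gauss(TRUE_MEAN + BIAS, 1.0) for _ in range(n)]

for n in (1_000, 100_000, 1_000_000):
    sample = biased_sample(n)
    mean = statistics.fmean(sample)
    half_width = 1.96 * statistics.pstdev(sample) / n ** 0.5  # nominal 95% CI
    covers = (mean - half_width) <= TRUE_MEAN <= (mean + half_width)
    print(f"n={n:>9,}  estimate={mean:+.4f}  +/-{half_width:.4f}  "
          f"CI covers true mean: {covers}")
```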

I am motivated to investigate the growth rate of world production by the impact of the IMF World Economic Outlook (WEO) on academics, policy makers, and the public at large,4 by the fact that these data are produced in a highly professional organization with well-trained and competent staff, and by the Fund's detailed knowledge about the quality of the underlying national statistics, thanks to its regular Article IV consultations and its Dissemination Data Bulletin Board initiative. The use of a planetary variable, rather than a national number, reflects the general nature of the topic that I want to address and helps, to some extent, to avoid the imbalance in the debate on measurement error that focusses on problems in the Global South. Additionally, the choice is guided by an ex ante error prognosis that basically recognizes that studying a real annual growth rate of a key concept at a very high level of aggregation, with data that are at least three years old and produced by an international institution and an academic consortium, can be expected to substantially reduce the measurement error compared to an alternative analysis of a monthly level of a nominal component of GPP that includes the most recent data and is produced by a national government (institute). Basically, this choice avoids six important problems that have been identified in the literature: the choice of the numeraire, that is, currency conversion issues (e.g., Obstfeld et al., 2015); seasonal adjustment (e.g., Manski, 2015; Abeln and Jacobs, 2022); delays in information availability; classification error by country and component (e.g., van Delden et al., 2023; Du Rietz, 2024); depreciation (e.g., Bos, 2009); and strategic over- and/or underreporting by national governments (e.g., Fariss et al., 2022). Thus, we can reasonably expect an investigation of the measurement error in the real annual GPP growth to arrive at a conservative estimate of the measurement error. Figure 2 reports the largest and smallest reported annual growth rate for each year in the period from 1991 to 2019 across the 37 bi-annual vintages of the WEO database published from April 2006 to April 2024, inclusive. As a practical measure of inaccuracy, I define the Implicit Minimal measurement Error (IMME) for each year as a ratio with the average μ of the observations Mi (the 'best guess') in the denominator and the average distance of the observations Mi from μ in the numerator; therefore, for N vintages, we have the following:

4World production is mainly used in the publications of international organizations, including the World Bank, WTO, and UNCTAD (sometimes in slightly different versions). Substantial academic use of GPP occurs in World System Theory, in macroeconomic modelling where the world's business cycle is important, and in particular as one of the control variables in empirical studies on world trade and investment.

$$\mathrm{IMME}=\frac{\sum_{i=1}^{N}\left|M_i-\mu\right|/N}{\left|\sum_{i=1}^{N}M_i\right|/N} \tag{1}$$
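As a minimal sketch of Equation (1), the following Python function computes the IMME for a single year from the growth rates reported in successive vintages; the vintage values shown are illustrative placeholders, not actual WEO entries.

```python
# Implicit Minimal measurement Error (IMME), Equation (1): the average absolute
# distance of the vintage estimates from their mean, divided by the absolute mean.

def imme(vintage_estimates):
    """Compute the IMME for one year from the growth rates reported in N vintages."""
    n = len(vintage_estimates)
    mu = sum(vintage_estimates) / n                              # the 'best guess'
    avg_distance = sum(abs(m - mu) for m in vintage_estimates) / n
    return avg_distance / abs(mu)

# Placeholder GPP growth rates (percent) for one year across successive vintages
vintages = [2.4, 2.2, 1.9, 1.5, 2.2, 2.5, 2.7]
print(f"IMME: {imme(vintages):.2f}")  # a value of 0.10 would mean a ten percent error
```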
Figure 2.  Annual GPP growth rate (largest and smallest estimate) in IMF WEO vintages and benchmark against latest PWT-based estimates (1991–2019). Sources: Calculations based on 37 vintages of the IMF WEO database, available from https://www.imf.org/en/Publications/SPROLLs/world-economic-outlook-databases, and PWT 10.01, available from https://www.rug.nl/ggdc/productivity/pwt/. Data accessed June 3, 2024.

Obviously, the IMME is a conservative estimate of the measurement error compared to other practical measurement error indicators, as it does not assume that one of the sources is correct and because it evaluates the average error at an average observation. For the period 1991–2019, the average IMME was ten percent (median six percent), which, against a median growth rate across years and vintages of four percent, indicates a meaningful measurement error of 0.2 percentage points. Among these years, 1991 and 2009 stood out as years with particularly large deviations. By way of an external benchmark, the figure also reports two alternative estimates of GPP growth that can be derived from the Penn World Table (PWT), another heavily used global data resource, which indicates the Great Recession year 2009 as a year for which even the sign of GPP growth has been difficult to establish5.

5PWT, a data set that is regularly updated and widely available to and used by the economics profession, is currently maintained by a consortium of the universities of California and Groningen. The key variable of interest is GPP at Purchasing Power Parity (PPP). Unlike the IMF's WEO, the PWT does not provide an estimate of GPP, but this measure can be derived by aggregating the 182 countries for which data are provided into a world total. Since 2009, PWT distinguishes and reports real GDP measured from the expenditure side and from the output side, respectively. For an individual country, the difference between the two measures is related to the terms of trade effect, an effect which logically does not exist at the aggregate level of the world, and thus the two measures for the world should coincide. The analogy is the world current account balance: the world cannot have a current account deficit or surplus, and therefore the world's current account imbalances in the IMF WEO represent measurement error (Figure 1). The world likewise logically cannot have a terms of trade effect, and therefore the imbalances between expenditure-side GPP and output-side GPP are useful indications of measurement error for PWT.
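The footnote's benchmark calculation can be sketched as follows, assuming a local copy of the PWT country file with its documented 'year', 'rgdpe' (expenditure-side), and 'rgdpo' (output-side) real GDP columns; the file name is hypothetical and no actual PWT figures are reproduced here.

```python
# Sketch: derive world (GPP) growth from the Penn World Table by aggregating
# country-level real GDP, and compare the expenditure-side and output-side
# measures, whose world totals should coincide in the absence of measurement error.
import pandas as pd

# Hypothetical local copy of the PWT country file; 'rgdpe' and 'rgdpo' are the
# documented expenditure-side and output-side real GDP columns.
pwt = pd.read_csv("pwt1001.csv")

world = pwt.groupby("year")[["rgdpe", "rgdpo"]].sum()   # world totals per year
growth = world.pct_change() * 100                        # annual growth, percent

gap = (growth["rgdpe"] - growth["rgdpo"]).abs()          # world terms-of-trade gap
print(growth.round(1).tail(15))
print("Largest expenditure/output gap:", gap.idxmax(), round(gap.max(), 2), "pp")
```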

Figures 3 and 4 illustrate some of the patterns of the data vintages for specific years. Figure 3 reports the growth rates for the years 1991, 1995, 2000, 2005, 2010, and 2015. For the year 1991 (dark triangles), the pattern is not monotonic: up to April 2010, the reported GPP growth rate for 1991 decreased from 2.4 to 1.5 percent, then jumped to 2.2, and from October 2014 moved in a fluctuating pattern towards a value of 2.7. For the other years, revisions also followed haphazard patterns. It is worth noting that fluctuations continued even more than five years after the fact.

    Figure 3.  Pattern of revision for six years in 37 IMF WEO vintages. Source: see Figure 2.
Figure 4.  Revisions of GPP growth for three major post-Second World War recessions. Sources: For the Volcker recession, data were collected from the 13 spring and autumn issues of the World Economic Outlook, typically from Table A1 in the statistical appendix; since early editions sometimes report a G7 or advanced economies number rather than a GPP estimate in the statistical appendix, data were occasionally collected from a table in the main text when the statistical appendix did not provide the world aggregate (for example, Table I-1 in the April 1984 issue). The data for 2008 and 2020 are derived from 23 post-2008 biannual World Economic Outlook Database vintages. Note: * The data for the Volcker recession are not available in a WEO database and were collected from the print versions of each of the issues that appeared from April 1983 to October 1990, inclusive. No number for world production growth in 1982 is reported in the April 1984 issue. In 1983, no estimate for GPP growth was published.

Figure 4 zooms in on how statistical reporting on three major post-Second World War recessions changed following the first estimate of the recession year. The historical GPP growth rates continued to fluctuate. Note that Figure 4 reports fluctuations during the first seven and a half years only (i.e., 15 revisions on the horizontal axis), thus underreporting fluctuations in revision 16 and higher. For example, in the 37 vintages published since October 2006, the reported GPP growth rate for the Volcker recession year 1982 fluctuated between 0.5 and 1.2. Figure 4 suggests that the three major recessions of the world economy were initially overestimated. Upward adjustments of the initial estimate by half a percentage point materialized in the 6th revision (Volcker recession) and the 7th revision (Great Recession; in the sixth year the estimate even moved into positive territory); for the Great Lockdown, an upward adjustment was already evident from the 2nd revision. The revision for the Great Lockdown year 2020 between April 2021 (−3.3) and April 2024 (−2.7) amounted to 0.6 percentage points (an IMME of ten percent). Despite these revisions and possible future adjustments, the Great Lockdown growth rate can be expected to remain in negative territory. For the Great Recession, the jury is still out, with the most recent vintage reporting a shrinkage of 0.1 percent, which cannot be considered significantly different from zero against the background of the history of revisions. Nowadays, the Volcker recession appears to be a year of low growth rather than the originally reported recession.

It is worth repeating that the measurement inaccuracy illustrated in this section should probably be seen as the minimal kind of measurement error that economists should expect. First, the data producers that we studied are held in high regard for their professionalism and experience; other, less advanced data producers can be expected not to perform as well. Second, the analysis was based on an ex ante error prognosis that reduced many problems of measurement, including seasonal adjustment, choice of numeraire, delays in information availability, and classification. For other expressions of GPP (e.g., quarterly, nominal, very recent statistics, or components), data discrepancies will be much larger.

Our analysis shows that the IMF's statistics on the historical real growth rate of GPP certainly do not live up to the accuracy implied by the use of three decimals in the WEO database. If anything, a best practice response to the estimated IMMEs would be to report at most one decimal and to work with quarters and halves of a percentage point. A band of 0.2 percentage points would be appropriate to take the implied minimal measurement error into account (a sketch of such a reporting convention follows below). Reporting this kind of indeterminacy is perfectly doable at low cost, would do justice to the extent of the measurement errors inherent to the aggregation of national account estimates, and would not undermine the economic narrative of world growth over the past decades. For a limited number of individual years, the story line could be unclear, as illustrated by the example of the GPP growth rate for the Great Recession year 2009. Additionally, turning points in the cycle may occasionally be associated with different years. These indeterminacies should actually be considered important benefits of the proposed procedure, because it avoids representing these events as established facts in a manner that ignores their fundamental indeterminacy.
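A minimal sketch of what such a reporting convention could look like in practice; the function and the input values are illustrative only, not part of any official procedure.

```python
# Sketch of the proposed reporting convention: publish the GPP growth rate
# rounded to the nearest quarter of a percentage point together with an explicit
# inaccuracy band, instead of spuriously precise three-decimal values.

def report_growth(estimate, band=0.2, step=0.25):
    """Round to the nearest `step` and attach a +/- `band` percentage-point range."""
    rounded = round(estimate / step) * step
    return f"{rounded:.2f}% (range {rounded - band:.2f}% to {rounded + band:.2f}%)"

print(report_growth(3.372))   # a three-decimal WEO-style input becomes a banded statement
print(report_growth(-0.083))  # near-zero growth: the band makes even the sign indeterminate
```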

Moreover, the findings are sobering from a general statistical viewpoint. The measurement errors do not follow classical assumptions. Estimates of growth rates for a specific year often continue to fluctuate without an apparent convergence to a 'final estimate'. Additionally, nodes in the time series can be identified where both internal and external validations of the data indicate that specific years are measured with more problems than other years. A large component of the measurement error appears to be non-transient.

However, the key take-away from this section is that reporting assessments of the measurement error based on vintages could, without much effort, become a standard part of the release of new data on national income and production.

None of the above should have come as a big surprise, as the stylized facts of macroeconomic data inaccuracies are well known to the statistical community, to experienced policy makers, and to well-versed applied researchers in the social sciences. Despite this general awareness, producers of statistics worldwide do not report on inaccuracies and suffer from digit-inflation, which is an unwarranted situation that makes data reliability opaque and hampers the very basis of evidence-based policy making. Neither the extent of the measurement error nor 'the everyday practice of ignoring it seems sustainable even though this situation has lasted for decades' (Coyle, 2017, p. S228). The key question is how to organize a change for the better. This task should not be underestimated. It would be unrealistic to ignore the powerful forces that hinder the reporting of measurement errors as a standard item. The proposed new best practice would be a paradigmatic change, as many supposed 'facts' may be either non-existent or turn out to be insignificant. Older generations of economists may oppose it because established procedures and findings will be challenged, with the risk of jeopardizing their reputations. This in itself creates a strong countervailing force. Additionally, interest-based statistics have become powerful policy and lobbying tools and may generate resistance to change. In short, vested interests in theories, analyses, policies, and politics based on questionable 'facts' can be expected to oppose a change for the better. For three quarters of a century, critics of economic data inaccuracies have asked for fundamental reform in an attempt to make governments and international organizations problem-owners of measurement errors. Morgenstern (1963, p. 120) insisted that governments should be forced to stop publishing figures 'with the pretence that they are free from error'. Moreover, others have put the responsibility to report the measurement error on data producers (Manski, 2015). Unfortunately, this first-best solution does not work: aggregate inaccuracies continue, digit-inflation increases, and measurement errors are hardly ever provided. Recognizing that we live in a second-best world opens up the possibility that data users can become a force for the good.

Moreover, from this data user perspective, the task still amounts to a fundamental transformation. To alter the incentives, funding agencies, academics, editors, reviewers, institutions, journalists, and educators have to change how they deal with statistics. Funding agencies should require transparent reporting of measurement errors in funded research projects. Editors and academics must insist that findings without measurement error indicators are unacceptable. Institutions should lead by example, and journalists should extend their work against fake news to include a hunt for misleading statistics. The economic curriculum should emphasize the inaccuracies of available data from start to finish of the educational journey, and students should receive training on managing imprecision in their professional lives. This new paradigm would provide incentives and opportunities for a new generation of economists who are aware of, report, and, where possible, reduce measurement errors, thus ultimately enhancing the reliability of our economic knowledge and understanding. The power to enforce change might seem unbalanced: data producers hold a monopoly, since others rarely have the capability to collect data of a comparable quality and scope. However, the monopoly on inaccuracies (which leads to the non-reporting of measurement errors) can be challenged, as illustrated in the previous section. The key issue is to move beyond the quest for the single most precise observation towards combinations of measurements of an economic phenomenon, and to provide a simple method to assess inaccuracies that is within reach also for those who are not skilled in advanced econometrics. This might look like a step backward from the frontiers of current achievements, but it is a step forward because it will provide insight into the extent and behavior of the measurement error (which, as we discovered in the previous section, runs counter to mainstream assumptions, with the risk of exacerbating bias). The IMME proposed in Equation (1) would provide such a tool, and Figure 5 provides a rough and somewhat subjective snapshot of the current state of the (in)accuracy. Judged by this standard, the IMF data on GPP (Figures 2 and 3) would be rated as 'fair', but for individual global recession years as 'poor'.

    Figure 5.  General overview of macroeconomic data by accuracy. Source: van Bergeijk (2024).

    Empowering data users has not been explored as a strategy in the literature concerning the (non)communication of uncertainty and measurement errors. For instance, Van der Bles et al. (2019) only acknowledged a role for experts and official entities that 'own' the uncertainty, along with professional communicators (experts, communication departments, press).

This strategy can be significantly strengthened by incorporating three key elements. First, it is crucial to address the frequent lack of essential information in papers, even those published in leading academic journals; details on basic data characteristics, such as the number of observations, the standard deviations of variables, and specifics about the data series, are often missing. To remedy this, economics should adopt a standardized data reporting format that must always be adhered to, thus ensuring that data sections in papers are explicit, detailed, and comprehensive. Second, change logs and accuracy assessments of statistics should be made fully available and easily accessible. This is a critical responsibility for statistical offices, as well as for academic data collectors. Third, integrating ex ante error prognoses into the research strategy is essential.

    For data producers, the incentive is the fulfilment of their duty to inform, not just as a moral obligation, but as a professional standard. Transparency about uncertainties and inaccuracies is a fundamental aspect of knowledge. For policymakers, understanding the extent of error enhances the robustness and reliability of the evidence base, thus leading to better decision-making. Academic data users will be motivated by the opportunity to conduct superior research and contribute to the advancement of knowledge.

It is worth thinking about how behavioral and institutional influences have created a low-transparency equilibrium. Macroeconomic statistics are influenced by academic research and by a striving for best practices at the statistical institutions. Past efforts, such as the Boskin report on the CPI and the IMF's work on international financial flows, have influenced current practices, but in the end did not successfully reduce macroeconomic inaccuracies. This kind of hysteresis relates to the constant emergence of new economic activities in a dynamic economy that are difficult to measure accurately, which adds to the challenge of maintaining accurate statistics; in a very fundamental way, however, it is also a consequence of the non-reporting of measurement errors. It is time to change this bad practice.

The author declares that no Artificial Intelligence (AI) tools were used in the creation of this article.

The author declares no conflict of interest in this paper.



  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
