Research article

Validation of corporate probability of default models considering alternative use cases and the quantification of model risk

  • Received: 29 January 2022 Revised: 25 March 2022 Accepted: 31 March 2022 Published: 12 April 2022
  • JEL Codes: G21, G28, M40, E47

  • In this study we consider the construction of through-the-cycle ("TTC") probability-of-default ("PD") models designed for credit underwriting uses and point-in-time ("PIT") PD models suitable for early warning uses, and consider which validation elements should be emphasized in each case. We build PD models using a long history of large corporate firms sourced from Moody's, with a large number of financial, equity market and macroeconomic candidate explanatory variables. We construct a Merton model-style distance-to-default ("DTD") measure and build hybrid structural and reduced-form models to compare with the financial ratio and macroeconomic variable-only models. In the hybrid models, the financial and macroeconomic explanatory variables still enter significantly and improve the predictive accuracy of the TTC models, which generally lag behind the PIT models in that performance measure. While all classes of models have high discriminatory power by most measures, on an out-of-sample basis the TTC models perform better than the PIT models. We measure the model risk attributable to various model assumptions according to the principle of relative entropy and observe that omitted variable bias with respect to the DTD risk factor, neglect of interaction effects and incorrect link function specification have the greatest, intermediate and least impacts, respectively. We conclude that care must be taken to judiciously choose how we validate TTC vs. PIT models, as the criteria may be rather different, apart from standards such as discriminatory power. This study contributes to the literature by providing expert guidance to credit risk modeling, model validation and supervisory practitioners in managing model risk.

    Citation: Michael Jacobs Jr. Validation of corporate probability of default models considering alternative use cases and the quantification of model risk[J]. Data Science in Finance and Economics, 2022, 2(1): 17-53. doi: 10.3934/DSFE.2022002




    It is expected that financial market participants have accurate measures of a counterparty's capacity to fulfill future debt obligations, conventionally measured by a credit rating or a score, and typically associated with a probability of default ("PD"). Most extant risk rating methodologies distinguish model outputs considered point-in-time ("PIT") vs. through-the-cycle ("TTC"). Although these terminologies are widely used in the credit risk modeling community, there is some confusion about what these terms precisely mean. In our view, based upon first-hand experience in this domain and a comprehensive literature review, at present a generally accepted definition for these concepts remains elusive, apart from two points of common understanding. First, PIT PD models should leverage all available information, borrower-specific and macroeconomic, which most accurately reflect default risk at any point of time. Second, TTC PD models abstract from cyclical effects and measure credit risk over a longer time period encompassing a mix of economic conditions, exhibiting "stability" of ratings wherein dramatic changes are related mainly to fundamental and not transient economic fluctuations. However, in reality this distinction is not so well defined, as idiosyncratic factors can influence systematic conditions (e.g., credit contagion), and macroeconomic conditions can influence obligors' fundamental creditworthiness.

    There is an understanding in the industry of what distinguishes PIT and TTC constructs, typically defined by how PD estimates behave with respect to the business cycle. However, how this degree of "TTC-ness" vs. "PIT-ness" is defined varies considerably across institutions and applications, and there is no consensus around what thresholds should be established for certain metrics, such as measures of ratings volatility. As a result, most institutions characterize their rating systems as "Hybrid". While this may be a reasonable description, as arguably the TTC and PIT constructs are ideals, this argument fails to justify the use cases of a PD model where there may be expectations that the model is closer to either one of these poles.

    In this study, we develop empirical models that avoid formal definitions of PIT and TTC PDs, rather deriving constructs based upon common sense criteria prevalent in the industry, and illustrating which validation techniques are applicable to these approaches. Based upon this empirical approach, we characterize PIT and TTC credit risk measures and discuss the key differences between both rating philosophies. In the process, we address the validation of PD models under both rating philosophies, highlighting that the validation of either system exhibits a particular set of challenges. In the case of the TTC PD models, in addition to flexibility in determining measurement of the cycle, there are unsettled questions around the rating stability metric thresholds. In the case of PIT PD models, there is the additional question of demonstrating the accuracy of PD estimates at the borrower level, which may not be obvious from observing average PD estimates versus default rates over time. Finally, considering both types of models, there is the question of whether the relative contributions of risk factors are conceptually intuitive, as we would expect that certain variables would dominate in either of these constructs.

    Some additional comments are in order to motivate this research. First, there is a misguided perception in the literature and industry that PIT models contain only macroeconomic factors and that TTC models contain only financial ratios, whereas from a modeling perspective there are other dimensions that define this distinction, which we elaborate upon in this research. Furthermore, it may be argued that the validation of a TTC or PIT PD model involves assessing the validity of the cyclical factor, which, if not available to the validator, may be accounted for only implicitly. One possibility is for the underlying cycle to be estimated from historical data based upon some theoretical framework, but in this study we prefer commonly used macroeconomic factors in conjunction with obligor level default data, in line with industry practice. Related to this point, we do not explicitly address how TTC PD models can be transformed into PIT PD rating models, or vice versa. While the advantage of such alternative constructs is that they can be validated, based upon an assumption regarding the systematic factor, using the methodologies applicable to each type of PD model, we prefer to validate each as specifically appropriate. The rationale for our approach is that the alternative runs the risk of introducing significant model risk, thereby rendering the validity of such validation questionable as compared to testing a pure PIT or TTC PD model.

    We employ a long history of borrower level data sourced from Moody's, around 200,000 quarterly observations from a large population of rated larger corporate borrowers (at least USD 1 billion in sales and domiciled in the U.S. or Canada), spanning the period from 1990 to 2015. The dataset is comprised of an extensive set of financial ratios, macroeconomic and equity market variables as candidate explanatory variables. We build a set of PIT models with a 1-year default horizon and macroeconomic variables, and a set of TTC models with a 3-year default horizon and only financial risk factors.

    The position of this research in the academic literature is at the intersection of two streams of inquiry. First, there are a series of empirical studies that focus on the factors that determine corporate default and the forecasting of this phenomenon, which include Altman (1968), Jarrow and Turnbull (1995) and Duffie and Singleton (1999b). At the other end of the spectrum, there are mainly theoretical studies that focus on modeling frameworks, either for understanding corporate default (e.g., Merton, 1974) or for perspectives on the TTC vs. PIT dichotomy (e.g., Kiff et al., 2004; Aguais et al., 2008; Cesaroni, 2015). In this paper, we blend these considerations of theory and empirics, while also addressing the prediction of default and the TTC/PIT construct.

    We would like to emphasize what we believe to be the principal contributions of this paper. First, in terms of methodology, we assert this to be mainly in the domain of practical application rather than methodological innovation. Many practitioners, especially in the wholesale credit and banking book space, still use the techniques employed in this paper. We see our contribution as proposing a structured approach to constructing a suite of TTC and PIT models, combining reduced form and structural modeling aspects, and then further proposing a framework for model validation. We would note that many financial institutions in this space do not have such a framework. For example, many banks are still using TTC Basel models that are modified for PIT uses, such as stress testing or portfolio management. Furthermore, a preponderance of banks in this space do not employ hybrid financial and Merton-style models for credit underwriting. In sum, our contribution transcends the academic literature to address issues relevant to financial institution practitioners in the credit risk modeling space, which we believe uniquely positions this research. Second, we would further like to emphasize our contribution in terms of modeling data, which we believe to be more extensive and richer than most of the prior literature, in that we combine a variety of datasets over a rather long historical period and have a very large number of candidate explanatory variables. Finally, we believe that we have made a contribution in terms of our conclusions, which are rather multifaceted and nuanced in contrast with the majority of the prior literature. The implications of our study span the domains of prudential supervisory guidance and financial theory, as well as tools for bank risk modelers and validators.

    A summary of our empirical results is as follows. We present the leading two models each in the classes of PIT and TTC design, all having favorable rank ordering power, intuitive relative weights on explanatory variables and rating mobility metrics. We also perform predictive accuracy analysis and specification testing, where we observe that the TTC designs are more challenged than the PIT designs in performance, and that unfortunately all designs show some signs of model misspecification. This observation argues for the consideration of alternative risk factors, such as equity market information. In view of this, from the market value of equity and accounting measures of debt for these firms, we are able to construct a Merton model-style distance-to-default ("DTD") measure and build hybrid structural and reduced-form models, which we compare with the financial ratio and macroeconomic variable-only models. We show that adding DTD measures to our leading models does not invalidate the other variables chosen, significantly augments model performance and in particular increases the obligor-level predictive accuracy of the TTC models. We also find that while all classes of models have high discriminatory power by all measures, there are some conflicting results regarding predictive accuracy depending upon the measure employed, and that on an out-of-sample basis the TTC models actually perform better than the PIT models. Finally, we perform an exercise in which we measure the model risk attributable to violating various model assumptions according to the principle of relative entropy. In the latter experiment we observe that omitted variable bias (with respect to the DTD) has the greatest impact on measured model risk, the incorrect specification of the link function has the least impact, and the neglect of interaction effects amongst risk factors has an intermediate impact.
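
    Although the relative entropy framework is applied later in the paper, the core calculation admits a compact illustration. The following Python sketch computes the average Kullback-Leibler divergence between the Bernoulli default distributions implied by a baseline model and an alternative specification; the function and variable names are illustrative assumptions, not the study's actual code.

```python
import numpy as np

def bernoulli_relative_entropy(p_base, p_alt, eps=1e-12):
    """Average KL divergence between the Bernoulli default distributions
    implied by baseline PDs (p_base) and the PDs from an alternative model
    specification (p_alt), one PD per obligor-quarter observation."""
    p = np.clip(np.asarray(p_base, dtype=float), eps, 1 - eps)
    q = np.clip(np.asarray(p_alt, dtype=float), eps, 1 - eps)
    kl = p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))
    return kl.mean()

# Hypothetical usage: quantify the impact of omitting the DTD risk factor.
# model_risk_dtd = bernoulli_relative_entropy(pd_full_model, pd_without_dtd)
```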

    Finally, let us introduce the remainder of this paper, which will proceed as follows. In Section 2, we review the relevant literature, where we address a survey of PD modeling in general, as well as issues around rating philosophy in particular. In Section 3, we address modeling methodology, which we partition into the domains of econometric modeling and statistical assumptions. Section 4 encompasses the empirical analysis of this study, a description of the modeling data, estimation, validation results and the quantification of model risk. In Section 5, we conclude and summarize the study, discuss policy implications and provide thoughts on avenues for future research.

    Traditional credit risk models focus on estimating the PD, rather than on the magnitude of potential losses in the event of default (or loss-given-default – "LGD"), and typically specify "failure" to be bankruptcy filing, default, or liquidation, thereby ignoring consideration of the downgrades and upgrades in credit quality that are measured in mark-to-market ("MTM") credit models. Such default mode ("DM") models estimate credit losses resulting from default events only, whereas MTM models classify any change in credit quality as a credit event. There are three broad categories of traditional models used to estimate PD: expert systems, including artificial neural networks; rating systems; and credit scoring models.

    The most commonly used traditional credit risk measurement methodology is the PD scoring model. The seminal model in this domain is the multiple discriminant analysis ("MDA") of Altman (1968). While MDA is computationally convenient, as it relies on the assumption of normal error terms and a linear model equation, with the recent increases in computational power this benefit has become marginal. Mester (1997) documents the widespread use of credit scoring models amongst banks in the U.S., with 97% and 70% of them using such models to approve credit card and small business loan applications, respectively. We are not surprised by this rapid spread that she documents, as credit scoring models are relatively inexpensive to implement and do not suffer from the subjectivity and inconsistency of expert systems. The spread of these models throughout the world was first surveyed by Altman and Narayanan (1997). Departing slightly from the conclusions of Mester (1997), the authors find that it is not so much the models' differences across countries of diverse sizes and in various stages of development that stand out, but rather their similarities. An example of a popular vended PD scoring model in the industry is the private firm model of Moody's Analytics ("MA"; Dwyer et al., 2004), whose flexibility and economy explain why so many banks use it.

    In a departure from the credit scoring approach, Merton (1974) models equity in a levered firm as a call option on the firm's assets with a strike price equal to the debt repayment amount, basing the framework on financial contingent claims theory rather than a purely empirical construct. The PD is determined by valuing the call option, using an iterative method to estimate the unobserved variables that determine it (the market value and volatility of the firm's assets), combined with the amount of debt liabilities that have to be repaid at a given credit horizon, in order to calculate the firm's distance-to-default ("DTD"). DTD is the number of standard deviations between the current asset value and the debt repayment amount, so the higher it is, the lower the PD. While this is indeed an elegant construct, there are some very restrictive assumptions in play, which have been explored both in the subsequent academic literature and in practical implementations of this construct. In an important example of the latter, the CreditEdge™ ("CE") public firm model of MA uses historical default experience to estimate an empirical measure of the PD, denoted the expected default frequency ("EDF"). As CE EDF scores are obtained from equity prices, they are more sensitive to changing financial circumstances than external credit ratings, which rely predominately on credit underwriting data.
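
    To make the DTD construction concrete, the following Python sketch implements the standard two-equation iterative scheme for backing out the unobserved asset value and volatility from observable equity inputs. This is a minimal illustration under textbook assumptions (e.g., using the risk-free rate as the asset drift); the DTD proxy constructed later in this study may differ in detail, and all inputs shown are hypothetical.

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import norm

def merton_dtd(E, sigma_E, D, r, T=1.0):
    """Back out the unobserved asset value V and asset volatility sigma_V
    from the Merton equations, then return the distance-to-default. E is
    the market value of equity, sigma_E its volatility, D the debt
    repayment amount due at horizon T, and r the risk-free rate."""
    def equations(params):
        V, sigma_V = params
        d1 = (np.log(V / D) + (r + 0.5 * sigma_V**2) * T) / (sigma_V * np.sqrt(T))
        d2 = d1 - sigma_V * np.sqrt(T)
        # Equity as a call option on assets, and the equity-asset volatility link.
        eq_value = V * norm.cdf(d1) - D * np.exp(-r * T) * norm.cdf(d2) - E
        eq_vol = (V / E) * norm.cdf(d1) * sigma_V - sigma_E
        return [eq_value, eq_vol]

    V, sigma_V = fsolve(equations, x0=[E + D, sigma_E * E / (E + D)])
    # DTD: standard deviations separating the asset value from the default point.
    return (np.log(V / D) + (r - 0.5 * sigma_V**2) * T) / (sigma_V * np.sqrt(T))

# Hypothetical usage: merton_dtd(E=3.0e9, sigma_E=0.35, D=2.0e9, r=0.02)
```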

    Similar to the previous construct in drawing from financial contingent claims theory, which is sometimes called the option-theoretic structural approach, other modern methods of credit risk measurement can be traced to alternative branches of the asset pricing literature of academic finance. In contrast to the structural approach, the reduced form approach utilizing intensity-based models to estimate stochastic hazard rates follows a line of research pioneered by Jarrow and Turnbull (1995) and Duffie and Singleton (1999b). This school of thought offers a differing methodology for the task of estimating PDs. While the structural approach models the economic process of default, the reduced form models decompose risky debt prices in order to estimate the random intensity process underlying default. While this has the advantage of not relying on a description of the economy that is bound by strong assumptions that introduce model risk, the reliance on risky debt prices has been criticized as not purely measuring credit risk, as there are elements of market liquidity that confound the relationship. The proprietary model Kamakura Risk Manager™, where the econometric approach (the so-called Jarrow-Chava Model) is a reduced-form model based upon the research of Chava and Jarrow (2004), attempts to explicitly adjust for such liquidity effects. However, noise from embedded options and other structural anomalies in the default risk-free market further distorts risky debt prices, thereby impacting the results of such intensity-based models.

    One of the key motivations behind the new generation of PD models being developed in the industry, as well as in this research, is to provide a suite of models that can accommodate multiple uses, such as TTC models for credit underwriting or risk weighted assets ("RWA"), as well as PIT models for credit portfolio management or early warning. One point to highlight is that despite the growing literature on TTC credit ratings, there is still no consensus on the precise definition of this concept, except the general agreement that TTC ratings do not reflect cyclical effects. The Basel guidelines (BIS 2006) describe a PIT rating system as a construct that uses all currently available obligor-specific and aggregate information to estimate an obligor's PD, in contrast to a TTC rating system that, while using obligor-specific information, tends not to adjust ratings in response to changes in macroeconomic conditions. However, the types of such cyclical effects and how they are measured differ considerably in the literature as well as in practice.

    First, a number of studies have proposed formal definitions of PIT and TTC PD estimates and rating systems. These include Loeffler (2004), who explores the TTC methodology in a structural credit risk model based on Merton (1974), in which a firm's asset value is separated into a permanent and a cyclical component. In this model, TTC credit ratings are based on forecasting the future asset value of a firm under a stress scenario for the cyclical component. While a downside of this approach is that it relies on a stress scenario, which is a subjective construct, it has the benefit of making a model robust to a downturn. Kiff et al. (2004) investigate the TTC approach also in a structural framework in which the definition of TTC ratings follows the one applied by Hamilton et al. (2011), emphasizing that while anecdotal evidence from credit rating agencies confirms their use of this TTC approach, it turns out that there is no single and simple definition of what a TTC rating actually means. In contrast to studies such as these that define PIT and TTC credit measures on the basis of a decomposition of credit risk into idiosyncratic and systematic risk factors, Aguais et al. (2008) follow a frequency decomposition view in which a firm's credit measure is split into a long-term credit quality trend and a cyclical component, which are filtered from the firm's original credit measure by using a smoothing technique based on the filter in Hodrick and Prescott (1997). Furthermore, the authors argue that in the existing literature there has been little discussion about whether the C in TTC refers to the business cycle or the credit cycle, and highlight that these cycles differ considerably from each other regarding their length. They describe a practical framework for banks to compute PIT and TTC PDs through converting PIT PDs into TTC PDs based on sector-specific credit cycle adjustments to the DTD credit measures of the Merton (1974) model, derived from a credit rating agency's rating or MA's CE model. Furthermore, they qualitatively discuss key components of PIT-TTC default rating systems and how these systems can be implemented in banks. This approach offers practitioners much flexibility in adaptation to multiple uses, but is computationally intense and has many sub-dependencies in the model components, which make it susceptible to model risk. In contrast, Cesaroni (2015) analyzes PIT and TTC default probabilities of large credit portfolios in a Merton single-factor model, where the author defines the TTC PD as the expected PIT PD, with the expectation taken over all possible states of a systematic risk factor. This more stylized construct has the benefit of being more parsimonious and easier to implement than some of the literature just described, but gives rise to model risk through relying on more restrictive assumptions. Finally, along this line of research, Repullo et al. (2010) propose translating PIT PDs into TTC PDs by ex post smoothing the estimated PIT PDs with countercyclical scaling factors, which is similar to the previously described papers in relying on some kind of model-based translation between PIT and TTC designs. This is connected with the industry's next-generation PD model redevelopment efforts and with this research, as it aligns with the objective of supporting TTC vs. PIT ratings without formal definitions of what TTC or PIT means.

    Second, several studies analyze the ratings of major rating agencies regarding their PIT vs. TTC orientation. These include Altman and Rijken (2004), who find, based on credit scoring models, that major credit rating agencies pursue a long-term view when assigning ratings, putting less weight on short-term default indicators, which indicates a TTC orientation. In relation to this argument, Loeffler (2013) shows for Standard and Poor's and Moody's rating data that these agencies have a policy of changing a rating only if it is unlikely to be reversed in the future, and argues that this can explain the empirical finding that rating changes lag changes in an obligor's default risk, consistent with the general view of TTC ratings. While Altman and Rijken (2006) also analyze the TTC methodology of rating agencies, they take an investor's PIT perspective and quantify the effects of this methodology on the objectives of rating stability, rating timeliness, and performance in predicting defaults. Among other results, they find that TTC rating procedures delay migration in agency ratings on average by ½ a year on the downgrade side and ¾ of a year on the upgrade side, and that from the perspective of an investor's one-year horizon, TTC ratings significantly reduce the short-term predictive power for defaults. Several papers, such as Amato and Furfine (2004) and Topp and Perl (2010), take this line of inquiry a step further by analyzing actual rating data, showing that these ratings vary with the business cycle, even though they are supposed to be TTC according to the policies of the credit rating agencies. Returning to Loeffler (2013), in relation to this thesis the author estimates long-run trends in market-based measures of one-year PDs using different filtering techniques, showing that agency ratings contribute to the identification of these long-run trends, thus providing evidence that credit rating agencies follow to some extent a TTC rating philosophy. In summary of this stream of research, many studies find that the ratings of major rating agencies show both PIT and TTC characteristics, which is consistent with the notion of hybrid rating systems. In connection with this research and industry redevelopment efforts, with the objective of supporting TTC vs. PIT ratings, these results support not having "hard" mobility metric thresholds in evaluating the model output.

    Third, rating philosophy is important from a regulatory and supervisory perspective, as well as from a credit underwriting perspective, not least because capital requirements for banks and insurance firms depend upon credit risk measures. Studies that discuss TTC PDs in the context of Basel II (Bank for International Settlements, 2006 – "BIS"), or as a remedy for the potential procyclical nature of Basel II, include Repullo et al. (2010), who compare smoothing the input of the Basel II formula by using TTC PDs or smoothing its output with a multiplier based on GDP growth. They prefer the GDP growth multiplier because TTC PDs are worse in terms of simplicity, transparency, cost of implementation, and consistency with banks' risk pricing and risk management systems. Cyclicality of credit risk measures also plays an important role in the context of Basel III (BIS 2011), which states that institutions should have sound internal standards for situations where realized default rates deviate significantly from estimated PDs, and that these standards should take account of business cycles and similar systematic variability in default experience. In two separate consultation papers issued in 2016, the European Banking Authority (2016) proposes to explicitly leave the selection of the rating philosophy to the banks, whereas the Basel Committee for Banking Supervision (BIS, 2016; "BCBS") proposes requiring banks to follow a TTC approach to reduce the variability in PDs and thus RWAs across banks.

    Finally, while it is widely accepted that the rating philosophy should influence the validation of rating systems, the challenges of validating TTC models have been largely ignored in the academic and practitioner literature. The BCBS (BIS 2005) further stresses that in order to evaluate the accuracy of PDs reported by banks, supervisors need to adapt their PD validation techniques to the specific types of banks' credit rating systems, in particular with respect to their PIT vs. TTC orientation. However, methods to validate rating systems have paid very little attention to rating philosophy or have focused on PIT models. For example, Cesaroni (2015) observes that predicted default rates are PIT, and thus the validation of a rating system "should" operate on PIT PDs from a theoretical perspective. In relation to this argument, Petrov and Rubtsov (2016) explicitly mention that they have not yet developed a validation framework consistent with their PIT/TTC methodology.

    In this section, we outline our econometric technique and statistical PD modeling methodology. In principle, for classification tasks including default prediction, while one could use the same loss functions as those used for regression (i.e., the least squares criterion) in order to optimize the design of the classifier, this would not be the most reasonable way to approach such problems. This is because in classification the target variable is discrete in nature, hence measures other than those employed in regression are more appropriate for quantifying the quality of model fit. This discussion could be motivated by framing the classification problem for default prediction in terms of Bayesian decision theory, which has the benefits of conceptual simplicity and alignment with common sense, and possesses a strong optimality flavor with respect to the probability of a classification error. However, given that the focus and contribution of this paper do not lie in the domain of econometric technique, we will defer such discussion and focus on the logistic regression modeling ("LRM") technique, as it is widely understood in the literature and applied by practitioners.

    Considering the two-class case $\{\omega_i\}_{i=1}^2$ for the LRM that is relevant to PD modeling, the first step is to express the log-odds (or the logit function) of the posterior probabilities as a linear function of the risk factors:

    $$\ln\left(\frac{P(\omega_1|x)}{P(\omega_2|x)}\right) = \theta^T x, \qquad (1)$$

    where $x = (x_1, \ldots, x_k)^T \in \mathbb{R}^k$ is a $k$-dimensional feature vector and $\theta = (\theta_1, \ldots, \theta_k)^T \in \mathbb{R}^k$ is a vector of coefficients, and we define $x_1 = 1$ so that the intercept is subsumed into $\theta$. In that $P(\omega_1|x) + P(\omega_2|x) = 1$:

    $$P(\omega_1|x) = \frac{1}{1 + \exp(-\theta^T x)} = \sigma(\theta^T x), \qquad (2)$$

    where the function $\sigma(\theta^T x)$ is known as the logistic sigmoid (or sigmoid link) and has the mathematical properties of a cumulative distribution function, ranging between 0 and 1 with a domain on the real line. Intuitively, this can be viewed as the conditional PD of a score $\theta^T x$, where higher values indicate greater default risk.

    We may estimate the parameter vector $\theta$ by the method of maximum likelihood estimation ("MLE") given a set of training samples, with observations of explanatory variables $\{x_n\}_{n=1}^N$ and binary dependent variables $\{y_n\}_{n=1}^N$, where $y_n \in \{0, 1\}$. The likelihood function is given by:

    $$P(y_1, \ldots, y_N | \theta) = \prod_{n=1}^{N} \left(\sigma(\theta^T x_n)\right)^{y_n} \left(1 - \sigma(\theta^T x_n)\right)^{1 - y_n}. \qquad (3)$$

    The practice is to consider the negative log-likelihood function (or the cross-entropy error), a monotonically decreasing transformation of (3) whose minimization is therefore equivalent to maximization of the likelihood, for the purposes of computational convenience:

    $$L(\theta) = -\sum_{n=1}^{N} \left[ y_n \ln\left(\sigma(\theta^T x_n)\right) + (1 - y_n) \ln\left(1 - \sigma(\theta^T x_n)\right) \right]. \qquad (4)$$

    The expression in Equation (4) is minimized with respect to $\theta$ using iterative methods such as steepest descent or Newton's scheme.
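
    For concreteness, a minimal numerical sketch of this estimation follows, written in Python (the language of our estimation work); BFGS, a quasi-Newton scheme, stands in for the steepest descent or Newton iterations mentioned above, and the routine is illustrative rather than the study's actual code.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    """Logistic sigmoid link of Equation (2)."""
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_likelihood(theta, X, y):
    """Cross-entropy error of Equation (4); X carries a leading column of
    ones so that the intercept is subsumed into theta."""
    p = np.clip(sigmoid(X @ theta), 1e-12, 1.0 - 1e-12)
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def fit_lrm(X, y):
    """Maximum likelihood estimate of theta by numerical minimization."""
    theta0 = np.zeros(X.shape[1])
    return minimize(neg_log_likelihood, theta0, args=(X, y), method="BFGS").x
```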

    We note an important property of this model that is computationally convenient and leads to stable estimation under most circumstances. Since $\sigma(\theta^T x_n) \in (0, 1)$ according to the properties of the sigmoid link function, the diagonal weighting matrix $R$ with entries $\sigma(\theta^T x_n)\left(1 - \sigma(\theta^T x_n)\right)$ is positive definite, which implies that the Hessian matrix $\nabla^2 L(\theta)$ is positive definite. In turn this implies that the negative log-likelihood function $L(\theta)$ is convex, and as such this guarantees the existence of a unique minimum to this optimization. However, maximizing the likelihood function may be problematic in the case where the development dataset is linearly separable. In such a case, any hyperplane $\hat{\theta}_{MLE}^T x = 0$ (out of an infinite number of such hyperplanes) that solves the classification task separates the training samples in each class perfectly, which means that every training point is assigned a posterior probability of class membership approaching one (with $\sigma(\hat{\theta}_{MLE}^T x) = \frac{1}{2}$ only on the hyperplane itself). In this case, the MLE procedure forces the parameter estimate to be infinite ($\|\hat{\theta}_{MLE}\| \to \infty$), which means geometrically that the sigmoid link function approaches a step function rather than an s-curve as a function of the score. This basically is a case of overfitting the development sample, which can be controlled by techniques such as k-fold cross-validation, or by including a regularization term in a corresponding cost function that controls the magnitudes of the parameter estimates (e.g., LASSO techniques for a linear penalty function $C(\theta|\lambda) = \lambda \|\theta\|_1$ with a cost parameter $\lambda$).
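
    The overfitting controls just mentioned can be sketched with standard library calls, where X and y are assumed to be the development-sample design matrix and default indicator vector:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# An L1 (LASSO-style) penalty keeps coefficient magnitudes finite even when
# the development sample is linearly separable; C is the inverse of the
# cost parameter lambda in the penalty discussed above.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)

# k-fold (here 5-fold) cross-validation of discriminatory power guards
# against overfitting the development sample.
# auc_scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
```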

    We conclude this section by discussing the statistical assumptions underlying the LRM. Logistic regression does not make many of the key assumptions of ordinary least squares ("OLS") regression regarding linearity, normality of error terms, homoscedasticity of the error variance and the measurement level. Firstly, the LRM does not assume a linear relationship between the dependent variable and the estimator1, which implies that we can accommodate non-linear relationships between the independent and dependent variables without non-linear transformations of the former (although we may choose to apply such transformations for other reasons, such as treating outliers), which yields more parsimonious and more intuitive models. Another way to look at this is that since we are applying the log-odds transformation to the posterior probabilities in (1), by construction we have a linear relationship in the risk drivers and do not require additional transformations. Secondly, the independent variables need not be multivariate normal, which equivalently means that the error terms need not be multivariate normal either. While there is an argument that if the error terms are actually multivariate normal (which is probably not true in practice), then imposing this assumption leads to efficiency gains and possibly a more stable solution, at the same time there are many more parameters to be estimated. That is because in the normal case we not only have to estimate the $k$ regression coefficients $\theta = (\theta_1, \ldots, \theta_k)^T \in \mathbb{R}^k$, but we also have to estimate the entire covariance matrix (whereas in the LRM the covariance matrix is a function of $\theta$), which entails $O(k^2/2)$ additional parameters and could lead to a more unstable model, depending upon data availability, as well as more computational overhead. Thirdly, since the covariance matrix also depends on $x$ by construction through the sigmoid link function, variances need not be homoscedastic for each level of the independent variables (whereas if we imposed a normality assumption, we would require homoscedasticity to hold as well). Lastly, the LRM can handle ordinal and nominal independent variables, as they need not be metric (i.e., interval or ratio scaled), which leads to more flexibility in model construction and again avoids counterintuitive transformations and additional parameters to be estimated.

    1 Note that linearity does not mean that the dependent variable has a linear relationship with the explanatory variables (i.e., we can have non-linear transformations of the latter), but rather that the estimator is a linear function (or weighted average) of the dependent variable, which implies that we can obtain our estimator analytically using linear algebra operations as opposed to iterative techniques such as in the LRM.

    However, some other assumptions still apply in the LRM setting. First, the LRM requires the dependent variable to be binary, while other approaches (e.g., ordinal logistic regression – "OLR" – or the multinomial regression model – "MRM") allow the dependent variable to be polytomous, which implies more granularity in modeling. Reducing an ordinal or even metric variable to a dichotomous level loses a lot of information, which makes this methodology inferior to OLR or MRM in those cases. In the case of PD modeling, if credit states other than default are relevant (e.g., a significant downgrade short of default, or prepayment), then this could result in biased estimates and mismeasurement of default risk. However, we note in this regard that for many portfolios, data limitations (especially for large corporate or commercial & industrial portfolios) prevent application of OLR to more states than default (e.g., prepayment events may not be identifiable in the data), and conceptually we may argue that observations of ratings have elements of expert judgment and are not "true" events (although in wholesale credit, the definition of default is partly subjective). A related assumption is the independence of irrelevant alternatives, which states that the relative odds of a binary outcome should not depend on other possible outcomes under consideration. In the statistics and econometrics literature, there is debate not only about how critical this assumption is, but also about ways to test it and the value of such tests (Cheng and Long, 2006; Fry and Harris, 1996; Hausman and McFadden, 1984; Small and Hsiao, 1985).

    Another important assumption is that the LRM requires the observations to be independent, which means that the data points should not come from any dependent samples design (e.g., matched pairings or panel data). While this is obviously not completely the case in PD modeling, in that we have dependent observations, in practice this may not be a very material violation: if we are capturing most or all of the relevant factors influencing default, then anything remaining is likely to be idiosyncratic (especially if we are including macroeconomic factors).

    While we are not in this implementation assuming a parametric distribution for the error terms in the LRM, there are still certain properties that the errors should exhibit in order that we have some assurance that the model is not grossly misspecified (e.g., symmetry around zero, lack of outliers). However, there is some debate in the literature on the criticality of this assumption, as well as on the best way to evaluate LRM residuals (Li and Shepherd, 2012; Liu and Zhang, 2017).

    Finally, we conclude this section with a discussion of the model methodology within the empirical context. The modeling approach as outlined in this section, and the model selection process as elaborated upon in subsequent sections, is common to both the PIT and TTC constructs. However, we impose the constraint that only financial factors are considered in the TTC construct, while macroeconomic variables are additionally considered for the PIT models. This is in addition to the difference in default horizon and other model selection criteria, which results in a differentiation of the TTC and PIT outcomes in terms of rating mobility and the relative factor weights considered intuitive in each construct – i.e., higher (lower) rating mobility, and greater (lower) weight on shorter (longer) term financial factors, for the PIT (TTC) models.

    The following data is used for the development of the models in this study:

    ●    Compustat™: Standardized fundamental and market data for publicly-traded companies, including financial statement line items and industry classifications (Global Industry Classification Standards – "GICS" – and North American Industry Classification System – "NAICS") over multiple economic cycles from 1979 onward. This data includes default types such as bankruptcy, liquidation and rating agency default ratings, all of which are part of the industry standard default definitions.

    ●    Moody's Default Risk Service™ ("DRS") Rating History: An extensive database of rating migrations, default and recovery rates across geographies, regions, industries and sectors.

    ●    Bankruptcydata.com: A service provided by New Generation Research, Inc. providing information on corporate bankruptcies.

    ●    The Center for Research in Security Prices™ ("CRSP") U.S. Stock Databases: A database of historical daily and monthly market and corporate action data for over 32,000 active and inactive securities with primary listings on the NYSE, NYSE American, NASDAQ, NYSE Arca and Bats exchanges, including CRSP broad market indexes.

    A series of filters is applied to this Moody's population to construct a population that is closely aligned with the North American large corporate segment of companies that are publicly rated and have publicly traded equity. In order to achieve this using Moody's data, the following combination of NAICS and GICS industry code, regional and historical yearly Net Sales restrictions is applied:

    1.    Non-C & I obligors, defined by the NAICS codes listed below, are not included in the population (see Table 2 below for the NAICS industry composition):

    Table 1.  Large corporate modeling data - GICS industry segment composition for all Moody's obligors vs. defaulted Moody's obligors (1991-2015).
    GICS Industry Segment All Moody's Obligors Defaulted Moody's Obligors
    Consumer Discretionary 19.6% 30.9%
    Consumer Staples 8.4% 6.4%
    Energy 7.6% 5.9%
    Healthcare Equipment & Services 2.9% 2.9%
    Industrials 31.6% 15.1%
    Materials 10.5% 11.3%
    Pharmaceuticals & Biotechnology 2.7% 0.2%
    Software & IT Services 2.5% 1.8%
    Technology Hardware & Communications 4.3% 11.3%
    Utilities 7.6% 5.6%


    ●    Financials

    ●    Commercial Real Estate and Real Estate Investment Trusts

    ●    Government

    ●    Dealer Finance

    ●    Not-for-Profit

    2.    A similar filter is performed according to the GICS classification (see Table 1 above for the GICS industry composition):

    Table 2.  Large corporate modeling data - NAICS industry segment composition for all Moody's obligors vs. defaulted Moody's obligors (1991–2015).
    NAICS Industry Segment All Moody's Obligors Defaulted Moody's Obligors
    Agriculture, Forestry, Hunting & Fishing 0.2% 0.4%
    Accommodation & Food Services 2.3% 2.9%
    Waste Management & Remediation Services 2.4% 2.1%
    Arts, Entertainment & Recreation 0.7% 1.0%
    Construction 1.7% 2.5%
    Educational Services 0.1% 0.2%
    Healthcare & Social Assistance 1.6% 1.6%
    Information Services 11.5% 12.1%
    Management of Companies & Enterprises 0.1% 0.1%
    Manufacturing 37.7% 34.4%
    Mining, Oil & Gas 6.8% 8.6%
    Other Services (ex Public Administration) 0.4% 0.6%
    Professional, Scientific & Technological Services 2.3% 2.5%
    Real Estate, Rentals & Leasing 0.9% 1.6%
    Retail Trade 9.6% 12.4%
    Transportation & Warehousing 5.4% 7.0%
    Utilities 8.3% 5.4%
    Wholesale Trade 7.0% 2.7%


    ●    Education

    ●    Financials

    ●    Real Estate

    3.    Only obligors based in the U.S. and Canada are included.

    4.    Only obligors with maximum historical yearly Net Sales of at least $1B are included.

    5.    There are exclusions for obligors with missing GICS or NAICS codes, and for modeling purposes obligors are categorized into different industry segments on this basis.

    6.    Records prior to 1Q91 are excluded, the rationale being that capital markets and accounting rules were different before the 1990's, and the macroeconomic data used in the model development is only available beginning in 1990. As one-year change transformations are amongst those applied to the macroeconomic variables, this cutoff is advanced a year from 1990 to 1991.

    7.    Records that are too close to a default event are not included in the development dataset, which is an industry standard approach; the rationale is that the records of an obligor in this time window do not provide information about future defaults, but rather reflect the obligor's existing problems. Furthermore, a more effective practice is to base this on data that are 6–18 (rather than 1–12) months prior to the default date, as this typically reflects the range of timing between when statements are issued and when ratings are updated (i.e., it usually takes up to six months, depending on the time to complete financials, receive them, input them, and finalize the ratings). These exclusion windows are illustrated in the sketch following this list.

    8.    In general, the defaulted obligors' financial statements after the default date are not included in the modeling dataset. However, in some cases obligors may exit a default state or "cure" (e.g., emerge from bankruptcy), in which case only the statements between the default date and the cure date are excluded.
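
    To make exclusion items 7 and 8 concrete, the following pandas sketch applies both windows. The column names (statement_date, default_date, cure_date) and the exact window logic are illustrative assumptions, not the actual field names and code used in the study.

```python
import pandas as pd

def apply_default_exclusions(df):
    """Sketch of exclusion items 7 and 8: statement_date, default_date and
    cure_date are pandas Timestamps, with NaT where no default / cure occurred."""
    six_months = pd.DateOffset(months=6)

    # Item 7: drop statements within six months of default, so that surviving
    # pre-default records sit roughly 6-18 months before the default date.
    too_close = df["default_date"].notna() & \
        (df["statement_date"] > df["default_date"] - six_months) & \
        (df["statement_date"] <= df["default_date"])

    # Item 8: drop post-default statements; for cured obligors, only the
    # statements between the default date and the cure date are removed.
    in_default = df["default_date"].notna() & \
        (df["statement_date"] > df["default_date"]) & \
        (df["cure_date"].isna() | (df["statement_date"] <= df["cure_date"]))

    return df[~(too_close | in_default)]
```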

    In our opinion, these data exclusions are reasonable and in line with industry standards, sufficiently documented and supported and do not compromise the integrity of the modeling dataset.

    The time period considered for the Moody's data is the development period 1Q91–4Q15. Shown in Table 1 above is the composition of the modeling population by GICS industry sector, where the defaulted obligors column represents the percentage of defaulted obligors in each sector out of the entire population. The data are concentrated in Consumer Discretionary (20%), Industrials (17%), Tech Hardware and Communications (12%), and Energy except E & P (11%). A similar industry composition is shown above in Table 2 according to the NAICS classification system.

    The model development dataset contains financial ratios and default information based upon the most recent data available from DRS™, Compustat™ and bankruptcydata.com, so that the data is timely and a priori should be given the benefit of the doubt with respect to quality. Furthermore, the model development time period of 1Q91–4Q15 spans two economic downturn periods and a complete business cycle, the length of which is another factor supporting a verdict of good quality. Related to this point, we plot the yearly one- and three-year default rates in the model development dataset, as shown in Figure 1 below. As the goal of model development is to establish for each risk driver that the preliminary trends observed match our expectations, there is sufficient variation in this data to support quantitative methods of parameter estimation, further supporting the suitability of the data from a quality perspective.

    Figure 1.  PD model large corporate modeling data – Moody's obligors one and three year horizon default rates over time (1991–2015).

    In the subsequent tables we present the summary statistics for the variables that appear in our final models. These final models were chosen based upon an exhaustive search algorithm in conjunction with 5-fold cross-validation, and we have chosen the leading two models in each of the PIT and TTC designs, with and without the DTD risk factor2. The counts and statistics vary slightly across models, as the Python libraries that we utilize do not accommodate missing values, but nonetheless the differences in these statistics across models are minimal. The counts of observations vary narrowly from about 150K to 165K. The default rate is consistently about 1% (3%) for the PIT (TTC) models. The following are the categories and names of the explanatory variables appearing in the final candidate models3:

    2 Clarifying our model selection criteria and process, we balance multiple criteria, both in terms of statistical performance as well as some qualitative considerations. Firstly, all models have to exhibit the stability of factor selection (where the signs on coefficient estimates are constrained to be economically intuitive) and statistical significance in k-fold cross validation sub-sample estimation. However, this is constrained by the requirement that we have only a single financial factor chosen from each category. Then the models that meet these criteria are evaluated according to statistical performance metrics such as AIC and AUC, as well as other considerations such as rating mobility and relative factor weights.

    3 All candidate explanatory variables are Winsorized at either the 10th, 5th or 1st percentile levels at either tails of the sample distribution, in order to mitigate the influence of outliers or contamination in data, according to a customized algorithm that analyzes the gaps between these percentiles and caps / floors where these are maximal.

    ●    Size Change in Total Assets ("CTA"), Total Liabilities ("TL")

    ●    Leverage Total Liabilities to Total Assets Ratio ("TLTAR")

    ●    Coverage Cash Use Ratio ("CUR"), Debt Service Coverage Ratio ("DSCR")

    ●    Efficiency Net Accounts Receivables Days Ratio ("NARDR")

    ●    Liquidity Net Quick Ratio ("NQR"), Net Working Capital to Tangible Assets Ratio ("NWCTAR")

    ●    Profitability Before Tax Profit Margin ("BTPM")

    ●    Macroeconomic S&P 500 Equity Price Index Quarterly Average Annual Change ("SP500EPIQAAC"), Consumer Confidence Index Annual Change ("CCIAC")

    ●    Merton Structural Distance-to-Default ("DTD")
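
    A rough sketch of the tail treatment described in footnote 3 above follows; the gap-based rule for selecting among the 1st, 5th and 10th percentile levels is our illustrative stand-in for the customized algorithm, not the algorithm itself.

```python
import numpy as np

def adaptive_winsorize(x, candidate_levels=(0.01, 0.05, 0.10)):
    """Winsorize a candidate explanatory variable, choosing per tail the
    candidate percentile adjacent to the widest gap between successive
    candidates (a rough proxy for where outlier contamination begins),
    then capping / flooring at that level."""
    x = np.asarray(x, dtype=float)
    lo_candidates = [np.nanpercentile(x, 100 * p) for p in candidate_levels]
    hi_candidates = [np.nanpercentile(x, 100 * (1 - p)) for p in candidate_levels]
    # Select the candidate where the gap between successive percentiles is maximal.
    lo = lo_candidates[int(np.argmax(np.abs(np.diff(lo_candidates))))]
    hi = hi_candidates[int(np.argmax(np.abs(np.diff(hi_candidates))))]
    return np.clip(x, lo, hi)
```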

    The Area under the Receiver Operating Characteristic Curve ("AUC") statistics and missing rates for the explanatory variables are summarized in Table 7 at the end of this section4. The univariate AUCs range from 0.6 to 0.8 across risk factors, with some expected deterioration when going from the 1- to the 3-year default horizon, which is indicative of strong default rank ordering capability amongst these explanatory variables. The missing rates are generally between 5 and 10%, which is indicative of favorable data quality to support model development.

    4 The plots are omitted for the sake of brevity and are available upon request.
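
    The univariate AUCs reported in Table 7 can be reproduced in spirit with a short routine such as the following, where the data frame and column names are hypothetical:

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def univariate_auc(df: pd.DataFrame, factor: str, default_col: str = "default_1y") -> float:
    """AUC of a single candidate risk factor against the default indicator,
    skipping missing values and orienting the score so that 0.5 is
    uninformative regardless of the factor's sign convention."""
    mask = df[factor].notna()
    auc = roc_auc_score(df.loc[mask, default_col], df.loc[mask, factor])
    return max(auc, 1.0 - auc)
```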

    Table 3.  Summary statistics – Moody's large corporate financial and macroeconomic explanatory variables and default indicators: 1-year PIT model 1.
    Variable Count Mean Standard Deviation Minimum 25th Percentile Median 75th Percentile Maximum
    Default Indicator 157,353 0.01 0.10 0.00 0.00 0.00 0.00 1.00
    Change in Total Assets 0.14 0.35 −0.40 −0.01 0.06 0.17 3.21
    Total Liabilities to Total Assets 0.60 0.23 0.12 0.45 0.59 0.71 1.53
    Cash Use Ratio 1.90 2.84 −22.43 1.41 2.06 2.65 19.00
    Net Accounts Receivables Days 130.25 101.44 11.26 68.98 106.74 159.43 754.09
    Net Quick Ratio 0.34 1.07 −0.85 −0.28 0.06 0.59 6.11
    Before Tax Profit Margin 5.94 21.00 −146.67 1.85 7.09 12.85 48.70
    S&P 500 Equity Price Index 1.91 6.09 −27.33 −0.19 2.19 5.68 12.81
    Consumer Confidence Index 2.34 21.58 −60.97 −7.02 4.89 15.35 73.21

    Table 4.  Summary statistics – Moody's large corporate financial, macroeconomic, Merton / structural model distance-to-default proxy measure explanatory variables and default indicators: 1-year PIT model 2.
    Variable Count Mean Standard Deviation Minimum 25th Percentile Median 75th Percentile Maximum
    Default Indicator 160,002 0.01 0.10 0.00 0.00 0.00 0.00 1.00
    Change in Total Assets 0.14 0.35 −0.40 −0.01 0.06 0.17 3.21
    Total Liabilities to Total Assets 0.60 0.23 0.12 0.45 0.60 0.71 1.53
    Cash Use Ratio 1.90 2.83 −22.43 1.40 2.06 2.64 19.00
    Net Quick Ratio 0.34 1.06 −0.85 −0.28 0.06 0.59 6.11
    Before Tax Profit Margin 5.98 20.93 −146.67 1.86 7.10 12.88 48.70
    S&P 500 Equity Price Index 1.93 6.08 −27.33 −0.19 2.19 5.68 12.81
    Consumer Confidence Index 2.37 21.56 −60.97 −7.02 4.89 15.35 73.21
    Distance-to-Default 0.20 0.43 −1.32 0.02 0.07 0.18 5.26

    Table 5.  Summary statistics – Moody's large corporate financial explanatory variables and default indicators: 3-year TTC model 1.
    Variable Count Mean Standard Deviation Minimum 25th Percentile Median 75th Percentile Maximum
    Default Indicator 150,064 0.03 0.17 0.0 0.0 0.0 0.0 1.0
    Total Liabilities 3,640.65 6,741.93 8.86 422.60 1,170.45 3,374.12 41,852.00
    Total Liabilities to Total Assets 0.62 0.22 0.12 0.49 0.61 0.72 1.53
    Debt Service Ratio 16.44 52.82 −25.07 1.74 4.09 9.80 409.64
    Net Quick Ratio 0.24 0.93 −0.85 −0.30 0.02 0.47 6.11
    Before Tax Profit Margin 5.50 21.08 −146.67 1.57 6.72 12.40 48.70

    Table 6.  Summary statistics – Moody's large corporate financial and Merton / structural model distance-to-default proxy measure explanatory variables and default indicators: 3-year TTC model 2.
    Variable Count Mean Standard Deviation Minimum 25th Percentile Median 75th Percentile Maximum
    Default Indicator 150,064 0.03 0.17 0.0 0.0 0.0 0.0 1.0
    Total Liabilities 3,640.65 6,741.93 8.86 422.60 1,170.45 3,374.12 41,852.00
    Total Liabilities to Total Assets 0.62 0.22 0.12 0.49 0.61 0.72 1.53
    Debt Service Ratio 16.44 52.82 −25.07 1.74 4.09 9.80 409.64
    Net Quick Ratio 0.24 0.93 −0.85 −0.30 0.02 0.47 6.11
    Before Tax Profit Margin 5.50 21.08 −146.67 1.57 6.72 12.40 48.70
    Distance-to-Default 0.20 0.42 −1.32 0.02 0.07 0.28 5.26

    Table 7.  Moody's large corporate financial and macroeconomic explanatory variables areas under the receiver operating characteristic curve (AUC) and missing rates for 1-year default horizon PIT and 3-year default horizon TTC default indicators.
    PIT 1-Year Default Horizon TTC 3-Year Default Horizon
    Category Explanatory Variables AUC Missing Rate AUC Missing Rate
    Size Change in Total Assets 0.726 8.52% – –
    Size Total Liabilities – – 0.582 4.64%
    Leverage Total Liabilities to Total Assets Ratio 0.843 4.65% 0.783 4.65%
    Coverage Cash Use Ratio 0.788 7.94% – –
    Coverage Debt Service Coverage Ratio – – 0.796 17.0%
    Efficiency Net Accounts Receivables Days Ratio 0.615 8.17% – –
    Liquidity Net Quick Ratio 0.653 7.71% 0.617 7.17%
    Profitability Before Tax Profit Margin 0.827 2.40% 0.768 2.40%
    Macroeconomic S&P 500 Equity Price Index Quarterly Average Annual Change 0.603 0.00% – –
    Macroeconomic Consumer Confidence Index Annual Change 0.607 0.00% – –
    Merton Structural Distance-to-Default 0.730 4.65% 0.669 4.65%


    In the subsequent tables we present the estimation results and in-sample performance statistics for our final models.

    We shall first discuss general features of the model estimation results. Across models, signs of coefficient estimates are in line with economic intuition, and significance levels are indicative of very precisely estimated parameters. AUC statistics indicate that the models have a strong ability to rank order default risk, and while as expected this level of discriminatory power declines somewhat at the longer default horizon, in all cases the levels are in line with favorable performance by industry standards.

Regarding measures of predictive accuracy, the Hosmer-Lemeshow ("HL") tests show that the PIT models fit the data well while the TTC models fail to do so. However, we observe that when we introduce DTD into the TTC models, predictive accuracy increases markedly, as the p-values of the HL statistics increase significantly to the point where there is marginal evidence of adequate fit (i.e., the p-values indicate that the TTC models are rejected only at significance levels greater than 5%). AIC measures are also much higher in the TTC vs. the PIT models, but decline materially when the DTD risk factors are introduced, consistent with the HL statistics.
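To make the calibration assessment concrete, the following is a minimal Python sketch of an HL-style goodness-of-fit test as it is conventionally computed for PD models; the function name and the synthetic data are illustrative assumptions, not the author's code.

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y_true, p_hat, n_groups=10):
    """HL chi-squared statistic and p-value for a binary outcome vs. fitted PDs."""
    order = np.argsort(p_hat)
    y, p = np.asarray(y_true)[order], np.asarray(p_hat)[order]
    stat = 0.0
    for g in np.array_split(np.arange(len(p)), n_groups):  # risk deciles
        obs, exp, n = y[g].sum(), p[g].sum(), len(g)       # observed vs. expected defaults
        pbar = exp / n
        stat += (obs - exp) ** 2 / (n * pbar * (1.0 - pbar))
    return stat, chi2.sf(stat, n_groups - 2)               # conventional df = groups - 2

# A well-calibrated model should yield a high p-value (no evidence of misfit).
rng = np.random.default_rng(0)
p_hat = rng.uniform(0.001, 0.05, size=10_000)
y = rng.binomial(1, p_hat)
print(hosmer_lemeshow(y, p_hat))
```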

We next discuss general features of the estimation that speak to the TTC or PIT qualities of the models. As expected, the TTC models have much lower Singular Value Decomposition ("SVD") rating mobility metrics than the PIT models, in the range of about 30-35% for the former as compared with a 70-80% neighborhood for the latter. The relative magnitudes of the factor contribution ("FC") measures, which quantify the proportion of the total score that is accounted for by an explanatory variable, also support that the models exhibit TTC and PIT characteristics. Intuitively, we observe that in the TTC models there are greater weights on categories considered more important in credit underwriting (i.e., Size, Leverage and Coverage), whereas in the PIT models this pattern is reversed and there is greater emphasis on factors considered more critical to early warning or credit portfolio management (i.e., Liquidity, Profitability and Efficiency).
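The paper does not reproduce its exact SVD mobility formula, so the sketch below assumes the well-known Jafry-Schuermann-style construction, the average singular value of the mobility matrix P − I of a rating transition matrix P; treat it as an illustrative proxy rather than the author's metric.

```python
import numpy as np

def svd_mobility(P: np.ndarray) -> float:
    """Average singular value of P - I; 0 for a perfectly static rating system."""
    return np.linalg.svd(P - np.eye(P.shape[0]), compute_uv=False).mean()

# A sticky (TTC-like) transition matrix vs. a more responsive (PIT-like) one.
P_ttc = np.array([[0.95, 0.05, 0.00],
                  [0.03, 0.94, 0.03],
                  [0.00, 0.05, 0.95]])
P_pit = np.array([[0.70, 0.25, 0.05],
                  [0.15, 0.70, 0.15],
                  [0.05, 0.25, 0.70]])
print(svd_mobility(P_ttc), svd_mobility(P_pit))  # the PIT value is larger
```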

In Table 8 below we show the estimation results and in-sample performance measures for PIT Model 1, which has both financial and macroeconomic explanatory variables for a 1-year default horizon. FCs are higher on more PIT-relevant factors as contrasted with factors considered more salient to TTC constructs. Financial risk factors carry a super-majority of the FC relative to the macroeconomic factors, about 90% as compared with about 10%, which is a common observation in the industry for PD scorecard models. The model estimation results provide evidence of high discriminatory power, as the AUC is 0.8894. The AIC is 7,231.9, which relative to the TTC models is indicative of favorable predictive accuracy, corroborated by the very high HL p-value of 0.5945. Finally, the SVD mobility metric of 0.7184 supports that this model exhibits PD rating volatility consistent with a PIT model. (A sketch of this estimation workflow follows the table below.)

    Table 8.  Logistic regression estimation results – Moody's large corporate financial and macroeconomic explanatory variables 1-year default horizon PIT reduced form model 1.
    Explanatory Variable Parameter Estimate P-Value Factor Weight AIC AUC HL P-Value Mobility Index
    Change in Total Assets −0.4837 0.0000 0.0455
    Total Liabilities to Total Assets 2.6170 0.0104 0.1091
    Cash Use Ratio −0.0428 0.0000 0.1545
    Net Accounts Receivables Days Ratio 0.0005 0.0000 0.2273
    Net Quick Ratio −0.4673 0.0000 0.0909
    Before Tax Profit Margin −0.0161 0.0000 0.2736
Moody's Equity Price Index Quarterly Average −0.0189 0.0000 0.0759
    Consumer Confidence Index Year-on-Year Change −0.0099 0.0000 0.0232 7,231.00 0.8894 0.5945 0.7184

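A hedged sketch of the estimation workflow behind a table such as the above (a logit model of the default indicator, reporting coefficients, AIC and AUC) follows; the data and column names are synthetic placeholders, not the Moody's dataset.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the obligor panel; column names are hypothetical.
rng = np.random.default_rng(1)
n = 5_000
df = pd.DataFrame({
    "chg_total_assets": rng.normal(0.1, 0.3, n),
    "tl_to_ta": rng.uniform(0.1, 1.2, n),
    "profit_margin": rng.normal(6.0, 20.0, n),
})
lin = -4.5 + 2.0 * df["tl_to_ta"] - 0.02 * df["profit_margin"]
df["default"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin)))

X = sm.add_constant(df[["chg_total_assets", "tl_to_ta", "profit_margin"]])
fit = sm.Logit(df["default"], X).fit(disp=0)
pd_hat = fit.predict(X)                       # fitted 1-year PDs
print(fit.params)                             # signs should match economic intuition
print("AIC:", fit.aic, "AUC:", roc_auc_score(df["default"], pd_hat))
```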

In Table 9 below we show the estimation results and in-sample performance measures for PIT Model 2, which has the financial, macroeconomic and structural-Merton DTD explanatory variables for a 1-year default horizon. Results are similar to PIT Model 1 in terms of signs of coefficient estimates, statistical significance and relative FCs of financial and macroeconomic variables. DTD enters the model without any deleterious effect on the statistical significance of the other variables, although its relative contribution of 0.17 absorbs a fair amount of the other variables' FCs and eclipses that of the macroeconomic variables. That said, we observe that collectively, financial and Merton DTD risk factors carry a super-majority of the FC relative to the macroeconomic factors, about 89% as compared with about 11%, which is a common observation in the industry for PD scorecard models. The model estimation results provide evidence of high discriminatory power, as the AUC is 0.8895, which is immaterially different from the Model 1 version not having DTD. The AIC is 7,290.0, which relative to the TTC models is indicative of favorable predictive accuracy, albeit marginally higher than in the Model 1 version not having the structural model DTD variable; the very high HL p-value of 0.5782 likewise supports adequate fit. Finally, the SVD mobility metric of 0.7616 supports that this model exhibits PD rating volatility consistent with a PIT model, and moreover the addition of the DTD variable improves the PIT aspect of this model relative to its Model 1 counterpart not having this feature. (An illustrative DTD construction follows the table below.)

    Table 9.  Logistic regression estimation results – Moody's large corporate financial, macroeconomic and distance-to-default explanatory variables 1-year default horizon PIT hybrid reduced form / structural-Merton model 2.
    Explanatory Variable Parameter Estimate P-Value Factor Weight AIC AUC HL P-Value Mobility Index
    Change in Total Assets −0.4664 0.0000 0.0485
    Total Liabilities to Total Assets 2.5385 0.0000 0.1165
    Cash Use Ratio −0.0428 0.0000 0.1650
    Net Quick Ratio −0.0169 0.0000 0.0971
    Before Tax Profit Margin −0.0169 0.0000 0.2913
Moody's Equity Price Index Quarterly Average −0.0186 0.0000 0.0801
    Consumer Confidence Index Year-on-Year Change −0.0100 0.0000 0.0267
    Distance to Default −0.1913 0.0052 0.1748 7,290.00 0.8895 0.5782 0.7617

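The DTD risk factor entering the hybrid models above can be illustrated with a textbook Merton-style proxy computed from market equity and accounting debt; the KMV-style default point and the crude asset-value and volatility proxies below are assumptions for illustration, not the author's exact construction.

```python
import numpy as np

def dtd_proxy(equity_mv, debt_st, debt_lt, equity_vol, mu=0.0, horizon=1.0):
    """Merton-style distance-to-default with a KMV-style default point."""
    default_point = debt_st + 0.5 * debt_lt          # short-term + half long-term debt
    asset_value = equity_mv + default_point          # crude asset-value proxy
    asset_vol = equity_vol * equity_mv / asset_value # naive de-levering of equity vol
    return (np.log(asset_value / default_point)
            + (mu - 0.5 * asset_vol ** 2) * horizon) / (asset_vol * np.sqrt(horizon))

# A hypothetical firm: USD 800MM market cap, USD 300MM/400MM short/long-term debt.
print(dtd_proxy(equity_mv=800.0, debt_st=300.0, debt_lt=400.0, equity_vol=0.35))
```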

In Table 10 below we show the estimation results and in-sample performance measures for TTC Model 1, which has financial explanatory variables for a 3-year default horizon. The signs of coefficient estimates are all intuitive and all highly statistically significant. FCs are higher on more TTC-relevant factors as contrasted with the factors considered more salient to PIT constructs. The model estimation results provide evidence of high discriminatory power, as the AUC is 0.8232, although as expected this is somewhat lower than in the comparable PIT models not containing DTD, where AUCs are in the range of 0.88-0.89. The AIC is 17,751.6, which relative to the comparable PIT models is indicative of rather worse predictive accuracy, corroborated by the very low HL p-value of 0.0039, which rejects the null hypothesis that the model is properly specified with respect to a "saturated model" that perfectly fits the data. Finally, the SVD mobility metric of 0.3295 supports that this model exhibits PD rating volatility consistent with a TTC model.

Table 10.  Logistic regression estimation results – Moody's large corporate financial explanatory variables 3-year default horizon TTC reduced form model 1.
    Explanatory Variable Parameter Estimate P-Value Factor Weight AIC AUC HL P-Value Mobility Index
    Value of Total Liabilities −6.97E−06 0.0000 0.1773
    Total Liabilities to Total Assets 2.0239 0.0030 0.3133
    Debt Service Coverage Ratio −0.0431 0.0000 0.2332
    Net Quick Ratio −0.2412 0.0000 0.1372
    Before Tax Profit Margin −0.0129 0.0000 0.1390 17,751.00 0.8232 0.0039 0.3295


In Table 11 below we show the estimation results and in-sample performance measures for TTC Model 2, which has the financial and structural-Merton DTD explanatory variables for a 3-year default horizon. The signs of coefficient estimates are all intuitive and all highly statistically significant. FCs are higher on more TTC-relevant factors as contrasted with the factors considered more salient to PIT constructs. Note that in this model, adding the DTD explanatory variable results in Total Liabilities not being statistically significant, and we drop it from this specification; also, the FC of DTD is 0.17, so that the financial factors still carry most of the relative weight. The model estimation results provide evidence of high discriminatory power, as the AUC is 0.8226, although as expected this is somewhat lower than in the comparable PIT models containing DTD, where AUCs are in the range of 0.88-0.89. The AIC is 11,834.6, which relative to the comparable PIT models containing DTD is indicative of rather worse predictive accuracy, corroborated by the somewhat low HL p-value of 0.0973, which rejects the null hypothesis that the model is properly specified with respect to a "saturated model" that perfectly fits the data at the 10% (though not the 5%) significance level; we would note that this marginal rejection is an improvement over the comparable TTC version of this model not having the DTD variable. Finally, the SVD mobility metric of 0.3539 supports that this model exhibits PD rating volatility consistent with a TTC model, although we note that this rating volatility measure is somewhat higher than in the comparable TTC model not containing the DTD variable.

Table 11.  Logistic regression estimation results – Moody's large corporate financial and distance-to-default explanatory variables 3-year default horizon TTC hybrid reduced form / structural-Merton model 2.
    Explanatory Variable Parameter Estimate P-Value Factor Weight AIC AUC HL P-Value Deviance / Degrees of Freedom Pseudo R-Squared Mobility Index
    Total Liabilities to Total Assets 2.9580 0.0000 0.3707 11,834.00 0.8226 0.0973 0.2365 0.1491 0.3539
    Debt Service Coverage Ratio −0.0428 0.0000 0.2917
    Net Quick Ratio −0.2403 0.0000 0.0808
    Before Tax Profit Margin −0.0129 0.0000 0.0902
    Distance to Default −0.1541 0.0000 0.1666


In the subsequent figures we present additional in-sample and out-of-sample performance statistics and diagnostic plots for our final models. We observe in-sample that the graphical diagnostics (time series and calibration plots; fit histograms) and additional GOF statistics (e.g., Binomial and Jeffreys p-values; OLS R-squared) confirm the previously discussed results, in that the PIT models show much better fit to the data than the TTC models. However, the improvement in predictive accuracy in the TTC models from including the DTD risk factor is not as evident from these measures as it was from the HL statistics previously discussed. Furthermore, other graphical residual diagnostics (e.g., residual vs. fitted values, quantile-quantile plots, residual histograms and leverage plots) show that the PIT and TTC models exhibit issues in predictive accuracy or model specification to an almost equal degree. Finally, the out-of-sample analysis shows that the PIT models do not perform well, while the TTC models perform much better, and that the latter outperformance is augmented when including the structural DTD explanatory variable.

    Figure 2.  Logistic regression estimation in-sample performance measures, time series accuracy plot, fit histogram and calibration curve – Moody's large corporate financial and macroeconomic explanatory variables 1-year default horizon PIT reduced form model 1.
    Figure 3.  Logistic regression estimation in-sample residual diagnostics (residual vs. fitted values, quantile-quantile plot, residual histogram and leverage plots) – Moody's large corporate financial and macroeconomic explanatory variables 1-year default horizon PIT reduced form model 1.
    Figure 4.  Logistic regression estimation in-sample performance measures, time series accuracy plot, fit histogram and calibration curve – Moody's large corporate financial, macroeconomic and distance-to-default explanatory variables 1-year default horizon PIT hybrid reduced form / structural-Merton model 2.
    Figure 5.  Logistic regression estimation in-sample residual diagnostics (residual vs. fitted values, quantile-quantile plot, residual histogram and leverage plots) – Moody's large corporate financial, macroeconomic and distance-to-default explanatory variables 1-year default horizon PIT hybrid reduced form / structural-Merton model 2.
    Figure 6.  Logistic regression estimation in-sample performance measures, time series accuracy plot, fit histogram and calibration curve – Moody's large corporate financial explanatory variables 3-year default horizon TTC reduced form model 1.
    Figure 7.  Logistic regression estimation in-sample residual diagnostics (residual vs. fitted values, quantile-quantile plot, residual histogram and leverage plots) – Moody's large corporate financial explanatory variables 3-year default horizon TTC reduced form model 1.
    Figure 8.  Logistic regression estimation in-sample performance measures, time series accuracy plot, fit histogram and calibration curve – Moody's large corporate financial and distance-to-default explanatory variables 3-year default horizon TTC hybrid reduced form / structural-Merton model 2.
    Figure 9.  Logistic regression estimation in-sample residual diagnostics (residual vs. fitted values, quantile-quantile plot, residual histogram and leverage plots) – Moody's large corporate financial and distance-to-default explanatory variables 3-year default horizon TTC hybrid reduced form / structural-Merton model 2.
    Figure 10.  Logistic regression estimation out-of-sample performance measures, time series accuracy plot, fit histogram and calibration curve – Moody's large corporate financial and macroeconomic explanatory variables 1-year default horizon PIT reduced form model 1.
    Figure 11.  Logistic regression estimation out-of-sample performance measures, time series accuracy plot, fit histogram and calibration curve – Moody's large corporate financial, macroeconomic and distance-to-default explanatory variables 1-year default horizon PIT hybrid reduced form / structural-Merton model 2.
    Figure 12.  Logistic regression estimation out-of-sample performance measures, time series accuracy plot, fit histogram and calibration curve – Moody's large corporate financial explanatory variables 3-year default horizon TTC reduced form model 1.
    Figure 13.  Logistic regression estimation out-of-sample performance measures, time series accuracy plot, fit histogram and calibration curve – Moody's large corporate financial and distance-to-default explanatory variables 3-year default horizon TTC hybrid reduced form / structural-Merton model 2.

In building risk models we are subject to errors from model risk, one source of which is the violation of modeling assumptions. In this section we apply a methodology for the quantification of model risk that is a tool in building models robust to such errors. A key objective of model risk management is to assess the likelihood, exposure and severity of model error, given that all models rely upon simplifying assumptions. It follows that a critical component of an effective model risk framework is the development of bounds upon model error resulting from the violation of modeling assumptions. This measurement is based upon a reference or nominal risk model and is capable of rank ordering the various model risks, as well as indicating which perturbation of the model has a maximal effect upon some risk measure.

In line with the objective of managing model risk in the context of obligor-level PD modeling, we calculate confidence bounds around forecasted PDs spanning model errors in the vicinity of a nominal or reference model, defined by a set of alternative models. These bounds can be likened to confidence intervals that quantify sampling error in parameter estimation; however, they are instead a measure of model robustness that captures model error due to the violation of modeling assumptions. In contrast, a standard error estimate conventionally employed in credit risk modeling does not achieve this objective, as that construct relies on an assumed joint distribution of asset returns or correlation in defaults.

We meet this objective in the context of PD modeling by bounding a measure of loss, in this case the AIC, within a plausible range of model error. We have observed that, while amongst practitioners one alternative means of measuring model risk is to consider challenger models, a more prevalent approach is to assess estimation error or sensitivity to perturbed parameters, which captures only a very narrow dimension of model risk. In contrast, our methodology transcends the latter aspect to quantify potential model errors such as incorrect specification of the probability law governing the model (e.g., the distribution of error terms, or the specification of a link function in generalized linear regression, of which logistic regression is a sub-class), variables belonging in the model (e.g., omitted variable bias) or the functional form of the model equations (e.g., neglected transformations or interaction terms).

As the commonality amongst these types of model errors is that they all relate to the likelihood of such error, which in turn is connected to the perturbation of probability laws governing the entire modeling construct, we apply the principle of relative entropy (Hansen and Sargent, 2007; Glasserman and Xu, 2014). Relative entropy between a posterior and a prior distribution is a measure of information gain when incorporating incremental data in Bayesian statistical inference. In the context of quantifying model error, relative entropy has the interpretation of a measure of the additional information required for a perturbed model to be considered superior to a champion or null model. Said differently, relative entropy may be interpreted as measuring the credibility of a challenger model. Another useful feature of this construct is that within a relative entropy constraint the so-called worst-case alternative (e.g., in our case the upper bound on an AIC measure due to ignoring some feature of the alternative model) can be expressed as an exponential change of measure.

Model risk with respect to a champion model y = f(x) is quantified by the Kullback–Leibler relative entropy divergence to a challenger model y = g(x), expressed as follows:

$$D(f,g)=\int \frac{g(x)}{f(x)}\,\log\!\left(\frac{g(x)}{f(x)}\right) f(x)\,dx. \tag{5}$$

In this construct, the mapping g(x) is an alternative PD model, and the mapping f(x) is some kind of baseline, the latter being the base PD models which we have estimated in this paper that may be violating some model assumption. In a model validation context this is a critical construct, as the implication of these relations is robustness to model misspecification with respect to the alternative model – i.e., we do not have to assume that either the reference or alternative models are correct, and we need only quantify the distance of the alternative from the reference model to assess the impact of the modeling assumption at play.
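As a small numerical illustration of Eq 5, the sketch below evaluates D(f, g) for obligor-level Bernoulli default models, where with f as the reference density Eq 5 reduces to the KL divergence of the challenger from the champion; the PD vectors are hypothetical.

```python
import numpy as np

def d_f_g_bernoulli(p_f: np.ndarray, p_g: np.ndarray) -> np.ndarray:
    """Eq 5 per obligor: sum over y in {0, 1} of g(y) log(g(y)/f(y))."""
    return (p_g * np.log(p_g / p_f)
            + (1.0 - p_g) * np.log((1.0 - p_g) / (1.0 - p_f)))

p_champion = np.array([0.010, 0.020, 0.050])    # PDs under the reference model f
p_challenger = np.array([0.012, 0.018, 0.065])  # PDs under the alternative g
print(d_f_g_bernoulli(p_champion, p_challenger).mean())  # small value => models close
```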

Define the likelihood ratio m(f,g), which characterizes our modeling choice, expressed as follows:

$$m(f,g)=\frac{g(x)}{f(x)}. \tag{6}$$

As is standard in the literature, Equation 6 may be re-expressed as a constraint on the expectation of the relative deviation in likelihood:

$$E_f\left[m\log(m)\right]=D(f,g)<\delta, \tag{7}$$

where δ is an upper bound on deviations in model risk (which should be small on a relative basis) that may be determined by the model risk tolerance of an institution for a certain model type, interpretable as a threshold for model performance.

A property of relative entropy dictates that D(f,g) ≥ 0, with D(f,g) = 0 only if f(x) = g(x). Given a relative distance measure D(f,g) < δ and a set of alternative models g(x), model error can be quantified by the following change of measure:

$$m_\theta(f,g)=\frac{\exp\left(\theta f(x)\right)}{E_f\left[\exp\left(\theta f(x)\right)\right]}, \tag{8}$$

where the solution (or inner supremum) to Eq 8 is formulated in the following optimization:

$$m_\theta(f,g)=\inf_{\theta>0}\,\sup_{m(x)}\,E_f\left[m(x)f(x)-\frac{1}{\theta}\big(m(x)\log(m(x))-\delta\big)\right]. \tag{9}$$

Equation 9 features the parameterization of model risk by θ ∈ [0,1], where θ = 0 is the best case of no model risk and θ = 1 is the worst case of model risk in extremis.

The change of measure in Eq 8 has the important property of being model-free, i.e., not dependent upon the specification of the challenger model g(x). As mentioned previously, this reflects the robustness to misspecification of the alternative model that is a key feature of this construct, which from a model validation perspective is a desirable property – i.e., we do not have to assume that either the champion or alternative models are correct, and only have to quantify the distance of the alternative from the base model to assess the impact of violating modeling assumptions.
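A minimal simulation sketch of this exponential change of measure follows: given loss draws under the nominal model, the likelihood ratio of Eq 8 tilts the distribution toward adverse outcomes, and the largest tilt whose relative entropy (Eq 7) stays within a budget δ yields a worst-case expected loss. The lognormal losses and the value of δ are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
loss = rng.lognormal(mean=0.0, sigma=0.5, size=100_000)  # draws of f(x) under the nominal model

def tilted_stats(loss, theta):
    w = np.exp(theta * loss)
    m = w / w.mean()                                  # likelihood ratio m_theta of Eq 8
    return (m * loss).mean(), (m * np.log(m)).mean()  # worst-case mean; entropy spent (Eq 7)

delta = 0.05                        # assumed model risk tolerance
bound = loss.mean()
for theta in np.linspace(0.0, 0.5, 51):
    worst_case, entropy = tilted_stats(loss, theta)
    if entropy <= delta:
        bound = worst_case          # keep the largest admissible tilt's expected loss
print("nominal mean loss:", loss.mean(), "worst-case within entropy budget:", bound)
```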

    We study the quantification of model risk with respect to the following modeling assumptions:

    ●    Misspecification According to Omitted Variable Bias

    ●    Misspecification According to Neglected Interaction Effects

    ●    Misspecification According to an Incorrect Link Function

Omitted variable bias is analyzed by consideration of the DTD risk factor as discussed in the main estimation results of this paper, where we saw that including this variable in the model specification did not result in other financial or macroeconomic variables falling out of the model, and improved model performance. The second assumption is based upon estimation of alternative specifications that include interaction effects amongst the explanatory variables. Finally, we analyze the third assumption through estimation of these specifications with the complementary log-log ("CLL") as opposed to the logit link function5.

5 The logit link function commonly used in logistic regression is symmetric, whereas the CLL link is asymmetric. When the probability of the binary or binomial response approaches 0 at a different rate than it approaches 1 (as a function of a covariate), symmetric link functions may be inappropriate and may not provide the best fit for a given dataset. This is the case for unbalanced data such as ours, where defaults are very rare events, so that asymmetric link functions such as the CLL are sometimes good alternatives.
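The link-function experiment can be sketched as follows, re-estimating a toy PD specification under both the logit and CLL links via the statsmodels GLM interface; the data and column names are synthetic placeholders.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 20_000
X = pd.DataFrame({"leverage": rng.uniform(0.1, 1.2, n),
                  "profit_margin": rng.normal(6.0, 20.0, n)})
p = 1.0 / (1.0 + np.exp(4.5 - 2.0 * X["leverage"] + 0.02 * X["profit_margin"]))
y = rng.binomial(1, p)
Xc = sm.add_constant(X)

# Same binomial model, two different link functions; compare via AIC.
logit_fit = sm.GLM(y, Xc, family=sm.families.Binomial(
    link=sm.families.links.Logit())).fit()
cll_fit = sm.GLM(y, Xc, family=sm.families.Binomial(
    link=sm.families.links.CLogLog())).fit()
print("logit AIC:", logit_fit.aic, "CLL AIC:", cll_fit.aic)
```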

The loss metric that we consider is the AIC, and we develop a distribution of the relative proportional deviation in AIC ("RPD-AIC") from the base specifications through a simulation exercise, as follows (we take the negative of the deviations, since lower AICs are associated with a better-fitting model specification). In each iteration, we resample the data with replacement (stratified so that the history of each obligor is preserved) and re-estimate the models considered in the main body of the paper, as well as three variants that either include or exclude DTD, interaction effects or a CLL link function. In the case of the DTD risk factor, we compare the variants as considered in the main results, which have already been estimated, except that in each run the results are perturbed according to the different bootstraps of the dataset; in the other two cases there are alternative estimations6. A sketch of this resampling exercise is given after the footnote below.

    6 We do not include those results for the sake of brevity, but they are available upon request. Across 100,000 iterations results are stable and robust across the base as well as alternative specifications.
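A hedged sketch of the resampling exercise follows: a stratified bootstrap by obligor (preserving each obligor's full history), refitting base and alternative specifications, and recording the negative relative proportional deviation in AIC. The helper names and synthetic panel are assumptions rather than the author's implementation.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def rpd_aic(df, base_cols, alt_cols, y_col="default", n_boot=200, seed=0):
    """Bootstrap (stratified by obligor) the negative relative deviation in AIC."""
    rng = np.random.default_rng(seed)
    obligors = df["obligor_id"].unique()
    grouped = dict(tuple(df.groupby("obligor_id")))
    out = []
    for _ in range(n_boot):
        sampled = rng.choice(obligors, size=len(obligors), replace=True)
        boot = pd.concat([grouped[o] for o in sampled])   # each obligor's full history
        base = sm.Logit(boot[y_col], sm.add_constant(boot[base_cols])).fit(disp=0)
        alt = sm.Logit(boot[y_col], sm.add_constant(boot[alt_cols])).fit(disp=0)
        out.append(-(alt.aic - base.aic) / base.aic)      # negative so better fit => positive
    return np.array(out)

# Usage on a synthetic panel; a positive mean says the richer specification fits better.
rng = np.random.default_rng(1)
n = 4_000
df = pd.DataFrame({"obligor_id": rng.integers(0, 400, n),
                   "x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["default"] = rng.binomial(1, 1.0 / (1.0 + np.exp(3.0 - df["x1"] - 0.5 * df["x2"])))
dist = rpd_aic(df, base_cols=["x1"], alt_cols=["x1", "x2"], n_boot=50)
print(dist.mean(), np.percentile(dist, [25, 50, 75]))
```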

The results of the model risk quantification exercise are shown in Table 12 below, where we tabulate the sample moments of the bootstrapped RPD-AIC, as well as in the figures that follow, where we plot the corresponding histograms. Note that we show the results for a second, next-best challenger model alongside the models described previously in this paper, to demonstrate the robustness of our results to an alternative model specification. We observe that omitted variable bias with respect to DTD results in the highest model risk (mean RPD-AICs in the range of 0.20-0.23 and 0.15-0.17 for the TTC and PIT models, respectively), an incorrectly specified link function has the lowest measured model risk (mean RPD-AICs in the range of 0.09-0.10 and 0.05-0.06 for the TTC and PIT models, respectively), and neglected interaction effects are intermediate in the quantity of measured model risk (mean RPD-AICs in the range of 0.13-0.18 and 0.11-0.13 for the TTC and PIT models, respectively). The other conclusion that we reach is that, across violations of model assumptions, the PIT models are more robust than the TTC models in terms of lower measured model risk, which is at variance with the observation that the PIT models showed worse out-of-sample accuracy performance, and illustrates that in validating these constructs we should be looking at diverse dimensions of model performance. We further note that the distribution of the RPD-AIC is rather volatile relative to the mean and highly skewed to the right, with values in the tails of the distributions that are orders of magnitude greater than the measures of central tendency. This exercise shows that we should be cautious about over-reliance on measures of model fit derived from a single historical dataset, even if out-of-sample performance is favorable, as we could be unpleasantly surprised when we add to our reference datasets and re-estimate our models.

Table 12.  Quantification of model risk according to the principle of relative entropy: resampled distribution of the relative proportional deviation of the in-sample AIC performance measure – Moody's large corporate financial, macroeconomic and distance-to-default explanatory variables, 1- and 3-year default horizon TTC and PIT models.
    Type of Model Model Specification Model Assumption Min. 25th Prcntl. Median Mean 75th Prcntl. Max. Std. Dev.
    Through-the-Cycle Model 1 Omitted Variable Bias 0.0093 0.1137 0.2009 0.2290 0.3208 0.8328 0.1461
    Neglected Interaction Effects 0.0221 0.1116 0.1626 0.1759 0.2267 0.5262 0.0861
    Incorrectly Specified Link Function 0.0134 0.0721 0.0960 0.1005 0.1233 0.2714 0.0380
    Model 2 Omitted Variable Bias 0.0079 0.1010 0.1746 0.1962 0.2687 0.7362 0.1251
Neglected Interaction Effects 0.0081 0.0830 0.1203 0.1339 0.1719 0.5239 0.0699
Incorrectly Specified Link Function 0.0158 0.0606 0.0821 0.0866 0.1077 0.2406 0.0354
    Point-in-Time Model 1 Omitted Variable Bias 0.0044 0.0816 0.1306 0.1759 0.2149 0.5528 0.0995
    Neglected Interaction Effects 0.0123 0.0572 0.0876 0.0978 0.1266 0.4128 0.0543
    Incorrectly Specified Link Function 0.0062 0.0352 0.0486 0.0635 0.0685 0.1783 0.0256
    Model 2 Omitted Variable Bias 0.0113 0.0873 0.1414 0.1587 0.2118 0.5911 0.0945
    Neglected Interaction Effects 0.0033 0.0500 0.0765 0.0869 0.1131 0.3436 0.0505
    Incorrectly Specified Link Function 0.0077 0.0304 0.0414 0.0461 0.0580 0.1621 0.0222

Figure 14.  Quantification of model risk according to the principle of relative entropy: resampled distribution of the relative deviation of the in-sample AIC performance measure – Moody's large corporate financial and distance-to-default explanatory variables 3-year default horizon TTC model 1.
Figure 15.  Quantification of model risk according to the principle of relative entropy: resampled distribution of the relative deviation of the in-sample AIC performance measure – Moody's large corporate financial and distance-to-default explanatory variables 3-year default horizon TTC model 2.
Figure 16.  Quantification of model risk according to the principle of relative entropy: resampled distribution of the relative deviation of the in-sample AIC performance measure – Moody's large corporate financial, macroeconomic and distance-to-default explanatory variables 1-year default horizon PIT model 1.
Figure 17.  Quantification of model risk according to the principle of relative entropy: resampled distribution of the relative deviation of the in-sample AIC performance measure – Moody's large corporate financial, macroeconomic and distance-to-default explanatory variables 1-year default horizon PIT model 2.

In this study, we have developed alternative simple and general econometrically estimated PD models of both TTC and PIT designs. We have avoided formal definitions of PIT vs. TTC PDs, and rather derived constructs based upon common-sense criteria prevalent in the industry, and in the process have illustrated which validation techniques are applicable to these different approaches. Based upon this empirical approach to modeling, we have characterized PIT and TTC credit risk measures and have discussed the key differences between the two rating philosophies. In the process, we have addressed the validation of PD models under both rating philosophies, highlighting that the validation of either rating system exhibits particular challenges. In the case of TTC PD rating models, we have addressed unsettled questions around the thresholds for rating stability metrics and the level of PD accuracy performance. In the case of PIT PD rating models, we have spoken to questions around the rigorous demonstration that PDs are accurately estimated at the borrower level, which may not be obvious from visually observing the degree to which average PD estimates track default rates over time. Finally, this study has looked into the debate around the challenges in demonstrating the robust performance of PD models on an out-of-sample basis, which may differ between the PIT and TTC frameworks.

    We have observed that the validation of a TTC or PIT PD model involves assessing the economic validity of the cyclical factor, which depending upon modeling methodology may not be available to the validator, or else may be accounted for only implicitly. One possibility is for the underlying cycle of the PD rating model to be estimated from historical data based upon some theoretical framework. However, in this study we have chosen to propose commonly used macroeconomic factors in conjunction with obligor-level default data, in line with the industry practice in building such models.

We have highlighted features of PIT vs. TTC model design in our empirical experiment, yet have not explicitly addressed how TTC PD models can be transformed into corresponding PIT PD models, or vice versa. While the advantage of such a construct is that it can be validated, based upon an assumption regarding the systematic factor, using a common methodology applicable to both types of PD models, we have chosen to validate each as specifically appropriate. The rationale for our approach is that the alternative runs the risk of introducing significant additional model risk (i.e., if the theoretical model is mis-specified).

We have employed a long history of borrower-level data sourced from Moody's, around 200,000 quarterly observations from a population of rated larger corporate borrowers (at least USD 1 billion in sales and domiciled in North America), spanning the period from 1990 to 2015. The dataset is comprised of an extensive set of financial, equity market and macroeconomic variables to form the basis of candidate explanatory variables. We built a set of PIT models with a 1-year default horizon and macroeconomic variables, and a set of TTC models with a 3-year default horizon and only financial ratio risk factors. We presented the leading two models in each class of PIT and TTC designs, both having favorable rank ordering power, which were chosen based upon the relative weights on explanatory variables (i.e., certain variables are expected to have different relative contributions in TTC vs. PIT constructs), as well as rating mobility metrics (e.g., PIT models are expected to show more responsive ratings and TTC models more stable ratings). We also performed specification testing, where we observed that the TTC designs were more challenged than the PIT designs in predictive performance. The latter observation argues for the consideration of alternative risk factors, such as equity market information. In view of this, from the market value of equity and accounting measures of debt, we constructed a Merton model-style DTD measure and built hybrid structural-reduced form models, which we compared with the models containing only financial and macroeconomic variables.

We showed that adding DTD measures to our leading models did not invalidate the other variables chosen, significantly augmented model performance, and in particular improved the predictive accuracy of the TTC models. We also found that, while all classes of models have high discriminatory power by all measures, there are some conflicting results regarding predictive accuracy depending upon the measure employed, and also that on an out-of-sample basis the TTC models actually perform better than the PIT models in this regard.

Finally, we measured the model risk attributable to various modeling assumptions according to the principle of relative entropy. We observed that for both TTC and PIT designs, omitted variable bias (with respect to the DTD risk factor) had the greatest impact upon model risk, the incorrect specification of the link function the least, and the neglect of interaction effects amongst risk factors an intermediate impact. Another key finding in this exercise is that the PIT models had on the whole lower measured model risk across these assumptions, which is at variance with the observation that the PIT models had worse out-of-sample performance, which may be evidence of over-fitting. The implication of these findings is that we should be cautious in drawing strong conclusions from analysis based on limited out-of-time data, and that we are advised to view model performance through alternative lenses, such as this approach to quantifying model risk.

There are various implications for model development and validation practice, as well as supervisory policy, which can be gleaned from this study. First, it is a better practice to take into consideration the use case for a PD model in establishing the model design, from a fitness-for-purpose perspective. That said, we believe that a balance must be struck, since it would be infeasible to have separate PD models for every single use, and what we are arguing for is a parsimonious number of separate designs for major classes of use that satisfy a set of common requirements. Second, in validating PD models that are designed according to TTC or PIT constructs, we should have different emphases on which model performance metrics are scrutinized. In light of these observations and contributions to the literature, we believe that this study provides valuable guidance to model development, model validation and supervisory practitioners. Additionally, we believe that our discourse has contributed to resolving the debates around which class of PD models is best fit for purpose in large corporate credit risk applications, providing evidence that reduced form and Merton structural models can be combined in hybrid frameworks in order to achieve superior performance. This better performance is manifest in a broad sense, in both better fit to the data and lower measured model risk due to model mis-specification.

We would like to emphasize that we believe the principal contribution of this paper to be mainly in the domain of practical application rather than methodological innovation. Many practitioners, especially in the wholesale credit and banking book space, still use the techniques employed in this paper. We see our main contribution as proposing a structured approach to constructing a suite of TTC and PIT models, combining reduced form and structural modeling aspects, and then further proposing a framework for model validation. We understand that many financial institutions in this space do not have such a framework. For example, many banks are still using TTC Basel models that are modified for PIT uses, such as stress testing or portfolio management. Furthermore, a preponderance of banks in this space do not employ hybrid financial and Merton-style models for credit underwriting. We believe that our contribution spans both the academic and practitioner streams of the literature to address issues relevant to financial institution practitioners, while also being informed by the leading thought about credit risk in financial economics, which uniquely positions this research.

    Given the wide relevance and scope of the topics addressed in this study, there is no shortage of fruitful avenues along which we could extend this research. Some proposals include but are not limited to:

●    alternative econometric techniques, such as various classes of machine learning models, including non-parametric alternatives;

    ●    asset classes beyond the large corporate segments, such as small business, real estate or even retail;

    ●    applications to stress testing of credit risk portfolios7;

7 Refer to Jacobs Jr. et al (2015), Jacobs Jr. (2020a), Jacobs Jr. (2020b) and Jacobs Jr. (2022) for studies that address model validation and model risk quantification methodologies. These studies include supervisory applications such as comprehensive capital analysis and review ("CCAR") and current expected credit loss ("CECL"), and further feature alternative credit risk model specifications (including machine learning models), macroeconomic scenario generation techniques, as well as the quantification and aggregation of model risk (including the principle of relative entropy as studied in this paper).

    ●    the consideration of industry specificity in model specification;

    ●    investigation of machine learning or non-linear techniques;

    ●    different modeling methodologies such as, ratings migration or hazard rate models; and,

    ●    datasets in jurisdictions apart from the U.S., else pooled data encompassing different countries with a consideration of geographical effects.



    [1] Aguais S (2008) Designing and implementing a Basel Ⅱ compliant PIT-TTC ratings framework.
    [2] Altman EI (1968) Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. J Finance 23: 589–609. https://doi.org/10.2307/2978933 doi: 10.1111/j.1540-6261.1968.tb00843.x
    [3] Altman EI, Narayanan P (1997) An international survey of business failure classification models. Financ Mark Inst Instrum 6: 1–57. https://onlinelibrary.wiley.com/doi/abs/10.1111/1468-0416.00010
    [4] Altman EI, Rijken HA (2004) How rating agencies achieve rating stability. J Bank Financ 28: 2679–2714. https://doi.org/10.1016/j.jbankfin.2004.06.006 doi: 10.1016/j.jbankfin.2004.06.006
    [5] Altman EI, Rijken HA (2006) A point-in-time perspective on through-the-cycle ratings. Financ Anal J 62: 54–70. https://doi.org/10.2469/faj.v62.n1.4058 doi: 10.2469/faj.v62.n1.4058
    [6] Amato JD, Furfine CH (2004) Are credit ratings procyclical? J Bank Financ 28: 2641–2677. https://doi.org/10.1016/j.jbankfin.2004.06.005 doi: 10.1016/j.jbankfin.2004.06.005
    [7] Cesaroni T (2015) Procyclicality of credit rating systems: how to manage it. J Econ and Bus 82: 62–83. https://doi.org/10.1016/j.jeconbus.2015.09.001 doi: 10.1016/j.jeconbus.2015.09.001
    [8] Chava S, Jarrow RA (2004) Bankruptcy prediction with industry effects. Rev Financ 8: 537–569. https://doi.org/10.1093/rof/8.4.537 doi: 10.1093/rof/8.4.537
    [9] Cheng S, Long JS (2007) Testing for ⅡA in the multinomial logit model. Sociol Methods Res 35: 583–600. https://journals.sagepub.com/doi/abs/10.1177/0049124106292361 doi: 10.1177/0049124106292361
    [10] Duffie D, Singleton K (1998) Simulating correlated defaults. Paper presented at the Bank of England Conference on Credit Risk Modeling and Regulatory Implications Working Paper, Stanford University. https://kenneths.people.stanford.edu/sites/g/files/sbiybj3396/f/duffiesingleton1999.pdf
    [11] Duffie D, Singleton KJ (1999) Modeling term structures of defaultable bonds. Rev Financ Stud 12: 687–720. https://doi.org/10.1093/rfs/12.4.687 doi: 10.1093/rfs/12.4.687
[12] Dwyer D, Kocagil AE, Stein RM (2004) Moody's KMV RiskCalc™ v2.1 Model. Moody's Analytics. Available from: https://www.moodys.com/sites/products/productattachments/riskcalc%202.1%20whitepaper.pdf
    [13] Fry TRL, Harris MN (1996) A Monte Carlo study of tests for the independence of irrelevant alternatives property. Transp Res Part B: Methodol 31: 19–32. https://journals.sagepub.com/doi/abs/10.1177/0049124106292361 doi: 10.1177/0049124106292361
    [14] Glasserman P, Xu X (2014) Robust risk measurement and model risk. Quant Financ 14: 29–58. https://doi.org/10.1080/14697688.2013.822989 doi: 10.1080/14697688.2013.822989
    [15] Hamilton DT, Sun Z, Ding M (2011) Through-the-cycle EDF credit measures. Moody Anal. https://ssrn.com/abstract=1921419
    [16] Hausman J, McFadden D (1984) Specification tests for the multinomial logit model. Econometrica J Econometric Soc 1219–1240. https://doi.org/10.2307/1910997 doi: 10.2307/1910997
    [17] Hansen LP, Sargent TJ (2007) Robustness. Princeton: Princeton University Press, Available from: http://www.library.fa.ru/files/Robustness.pdf
    [18] Hodrick RJ, Prescott EC (1997) Postwar US business cycles: an empirical investigation. J Money, Credit Bank 1–16. https://doi.org/10.2307/2953682 doi: 10.2307/2953682
[19] Jacobs Jr. M (2020a) The accuracy of alternative supervisory methodologies for the stress testing of credit risk. Int J Financ Eng Risk Manage 3: 254–296. http://michaeljacobsjr.com/files/Jacobs_2020_AccAltSupMdlsStrTstCrRisk_IJE_RM_vol3no3_pp254-296.pdf
[20] Jacobs Jr. M (2020b) A holistic model validation framework for current expected credit loss (CECL) model development and implementation. Int J Financ Stud 8: 1–36. https://doi.org/10.3390/ijfs8020027
    [21] Jacobs Jr. M (2022) Borrower level models for stress testing corporate probability of default and the quantification of model risk. Int J Econ Finance 14: 75–99. https://doi.org/10.5539/ijef.v14n4p75 doi: 10.5539/ijef.v14n4p75
    [22] Jacobs Jr. M, Karagozoglu AK, Sensenbrenner F (2015) Stress testing and model validation: application of the Bayesian approach to a credit risk portfolio. J Risk Model Validation 9: 41–70. https://ssrn.com/abstract=2684227
    [23] Jarrow RA, Turnbull SM (1995) Pricing derivatives on financial securities subject to credit risk. J Financ 50: 53–85. https://doi.org/10.1111/j.1540-6261.1995.tb05167.x doi: 10.1111/j.1540-6261.1995.tb05167.x
    [24] Kiff MJ, Kisser M, Schumacher ML (2013) Rating through-the-cycle: what does the concept imply for rating stability and accuracy? International Monetary Fund, 2013. https://www.imf.org/external/pubs/ft/wp/2013/wp1364.pdf
    [25] Li C, Shepherd BE (2012) A new residual for ordinal outcomes. Biometrika 99: 473–480. http://dx.doi.org/10.1093/biomet/asr073 doi: 10.1093/biomet/asr073
    [26] Liu D, Zhang H (2018) Residuals and diagnostics for ordinal regression models: a surrogate approach. J Am Stat Assoc 113: 845–854. https://doi.org/10.1080/01621459.2017.1292915 doi: 10.1080/01621459.2017.1292915
    [27] Löffler G (2004) An anatomy of rating through the cycle. J Bank Financ 28: 695–720. https://doi.org/10.1016/S0378-4266(03)00041-4 doi: 10.1016/S0378-4266(03)00041-4
    [28] Löffler G (2013) Can rating agencies look through the cycle? Rev Quant Financ Account 40: 623–646. https://doi.org/10.1007/s11156-012-0289-9 doi: 10.1007/s11156-012-0289-9
    [29] Merton RC (1974) On the pricing of corporate debt: The risk structure of interest rates. Journal Financ 29: 449–470. https://doi.org/10.2307/2978814 doi: 10.2307/2978814
[30] Mester LJ (1997) What's the point of credit scoring? Federal Reserve Bank of Philadelphia Business Review, 3–16. https://fraser.stlouisfed.org/files/docs/historical/frbphi/businessreview/frbphil_rev_199709.pdf
    [31] Rubtsov M, Petrov A (2016) A point-in-time-through-the-cycle approach to rating assignment and probability of default calibration. J Risk Model Validation 10: 83–112. doi: 10.21314/JRMV.2016.154
    [32] Repullo R, Saurina J, Trucharte C (2010) Mitigating the Pro-cyclicality of Basel Ⅱ. Econ Policy 25: 659–702. https://doi.org/10.1111/j.1468-0327.2010.00252.x doi: 10.1111/j.1468-0327.2010.00252.x
    [33] Small K A, Hsiao C (1985) Multinomial logit specification tests[J]. Int Econ Rev 26: 619–627. https://doi.org/10.2307/2526707 doi: 10.2307/2526707
    [34] The Bank for International Settlements—Basel Committee on Banking Supervision (BIS) (2005) Studies on the Validation of Internal Rating Systems. Working Paper 14. Basel: The Bank for International Settlements—Basel Committee on Banking Supervision. Available from: https://www.bis.org/publ/bcbs_wp14.htm.
    [35] The Bank for International Settlements—Basel Committee on Banking Supervision (BIS) (2006) International Convergence of Capital Measurement and Capital Standards: A Revised Framework. Basel: The Bank for International Settlements—Basel Committee on Banking Supervision. Available from: https://www.bis.org/publ/bcbsca.htm.
[36] The Bank for International Settlements—Basel Committee on Banking Supervision (BIS) (2011) Basel Ⅲ: A Global Regulatory Framework for More Resilient Banks and Banking Systems. Basel: The Bank for International Settlements—Basel Committee on Banking Supervision. Available from: https://www.bis.org/publ/bcbs189.htm.
    [37] The Bank for International Settlements—Basel Committee on Banking Supervision (BIS) (2016) Reducing Variation in Credit Risk-Weighted Assets—Constraints on the Use of Internal Model Approaches. Consultative Document. Basel: The Bank for International Settlements—Basel Committee on Banking Supervision. Available from: http://www.bis.org/bcbs/publ/d362.htm.
    [38] The European Banking Authority (2016) The European Banking Authority. Guidelines on PD Estimation, LGD Estimation and the Treatment of Defaulted Exposures. Consultation Paper. Paris: The European Banking Authority. Available from: https://www.eba.europa.eu/regulation-and-policy/model-validation/guidelines-on-pd-lgd-estimation-and-treatment-of-defaulted-assets.
    [39] Topp R, Perl R (2010) Through-the-Cycle Ratings Versus Point-in-Time Ratings and Implications of the Mapping Between Both Rating Types. Financ Mark Inst Instrum 19: 47–61. https://doi.org/10.1111/j.1468-0416.2009.00154.x doi: 10.1111/j.1468-0416.2009.00154.x
© 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)