Predicting bankruptcy in the telecommunications industry
Understanding and predicting bankruptcy has always been important — now more than
ever, given the failure of many multibillion dollar enterprises. Effective bankruptcy prediction
is useful for investors and analysts, allowing for accurate evaluation of a firm’s prospects.
Roughly forty years ago Ed Altman showed that publicly available financial ratios can be
used to distinguish between firms that are about to go bankrupt and those that are not. He
did this using discriminant analysis, but this is a natural logistic regression problem.
The following data and discussion are based on material from an honors thesis of Jeffrey
Lui. The data represent a retrospective sample of the 25 telecommunications firms that
declared bankruptcy between May 2000 and January 2002 and that had issued financial statements
for at least two years, together with information from the December 2000 financial statements of 25
telecommunications firms that did not declare bankruptcy. The nonbankrupt firms were chosen
to try to match asset sizes with the bankrupt firms. Remember, this is a retrospective study
because sampling is based on the response itself (bankruptcy); this has
nothing to do with any time-related aspect of the problem.
Five financial ratios were chosen as potential predictors of bankruptcy:
1. Working capital as a percentage of total assets (WC/TA, expressed as a percentage).
Working capital is the difference between current assets and current liabilities, and is thus
a measure of liquidity as it relates to total capitalization. Firms on the road to
bankruptcy would be expected to have less liquidity.
2. Retained earnings as a percentage of total assets (RE/TA, expressed as a percentage).
This is a measure of cumulative profitability over time, and is thus an indicator of
profitability, and also of age. A younger firm is less likely to be able to retain earnings,
since it would reinvest most, if not all, of its earnings in order to stimulate growth.
Both youth and less profitability would be expected to be associated with an increased
risk of insolvency.
3. Earnings before interest and taxes as a percentage of total assets (EBIT/TA, expressed
as a percentage). This is a measure of the productivity of a firm’s assets, with higher
productivity expected to be associated with a healthy firm.
4. Sales as a percentage of total assets (S/TA, expressed as a percentage). This is the
standard capital turnover ratio, indicating the ability of a firm’s assets to generate
sales; lower sales would be expected to be associated with unhealthy prospects for a
firm.
c©2018, Jeffrey S. Simonoff 1
5. Book value of equity divided by book value of total liabilities (BVE/BVL). This ratio
measures financial leverage, being the inverse of the debt to equity ratio. A smaller
value is indicative of the decline of a firm’s assets relative to its liabilities, presumably
an indicator of unhealthiness. While it is typical in bankruptcy studies to use the
market value of equity in this ratio, the “Internet bubble” of the late 1990s makes this
problematic. It was not at all unusual during this time period for so-called dot-coms
to have very high stock prices that collapsed within a matter of months, making the
market value of equity unrealistically high before the collapse.
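As a rough illustration of how the five ratios are formed, here is a minimal Python sketch. The function name financial_ratios, its argument names, and the balance-sheet figures in the example are all hypothetical (they do not come from the data set); the first four ratios are scaled to percentages, matching the definitions above.

```python
def financial_ratios(current_assets, current_liabilities, retained_earnings,
                     ebit, sales, total_assets, book_equity, total_liabilities):
    """Return the five candidate predictors for one firm.

    The first four are percentages of total assets; the last is a plain ratio.
    """
    working_capital = current_assets - current_liabilities
    return {
        "WC/TA": 100.0 * working_capital / total_assets,
        "RE/TA": 100.0 * retained_earnings / total_assets,
        "EBIT/TA": 100.0 * ebit / total_assets,
        "S/TA": 100.0 * sales / total_assets,
        "BVE/BVL": book_equity / total_liabilities,
    }


# Hypothetical firm (figures in $ millions, invented for illustration only).
example = financial_ratios(
    current_assets=300.0, current_liabilities=200.0, retained_earnings=-500.0,
    ebit=-50.0, sales=250.0, total_assets=1000.0,
    book_equity=400.0, total_liabilities=600.0,
)
print(example)
```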
Here are the data:
Row Company WC/TA RE/TA EBIT/TA S/TA BVE/BVL Bankrupt
1 360Networks 9.3 -7.7 1.6 9.1 3.726 1
2 Advanced Radio Telecom 42.6 -60.1 -10.1 0.3 4.130 1
3 Ardent Communications -28.8 -203.2 -51.0 14.7 0.111 1
4 At Home Corp. 2.5 -433.1 -6.0 29.3 1.949 1
5 Convergent Communications 26.1 -57.4 -23.5 54.2 0.855 1
6 Covad Communications 39.2 -111.8 -77.8 10.5 0.168 1
7 e.spire -5.4 -105.2 -5.8 38.9 0.028 1
8 eGlobe -35.2 -92.4 -32.5 48.5 11.280 1
9 Exodus Communications 10.5 -12.4 -2.3 21.0 2.500 1
10 General Datacomm Industries -22.4 -124.5 -7.9 125.6 1.595 1
11 Global Telesystems 24.6 -29.0 -2.0 21.3 1.968 1
12 GST Telecom 6.6 -50.9 -2.6 28.9 0.258 1
13 Metricom 33.9 -46.5 -17.5 0.9 0.828 1
14 Net2000 Communications 19.1 -66.3 -25.5 22.3 0.460 1
15 NetVoice Technologies -21.1 -46.0 -26.8 81.4 0.698 1
16 PSINet 2.5 -228.7 -6.7 38.6 0.030 1
17 Rhythms NetConnections 47.0 -78.2 -42.0 4.4 0.168 1
18 RSL Communications 9.1 -40.2 -0.7 81.5 0.522 1
19 SSE Telecom 43.0 -49.2 -87.4 119.9 2.919 1
20 Startec Global Communications -34.9 -79.0 -13.5 127.8 0.197 1
21 Teligent 20.6 -146.3 -36.0 12.6 0.075 1
22 U.S. Wireless -51.6 -326.1 -98.7 0.9 2.402 1
23 Viatel -93.0 -95.2 -7.3 34.8 0.071 1
24 WebLink Wireless -127.5 -121.3 6.4 65.7 0.248 1
25 Winstar -1.2 -47.5 -9.7 14.5 0.456 1
26 Aether Systems 30.6 -14.4 -4.9 2.2 3.482 0
27 Akamai Technologies 9.8 -33.8 -7.1 3.2 5.965 0
28 Allegiance Telecom 37.8 -45.4 -7.1 17.1 3.450 0
29 ALLTEL Corp. 2.2 31.6 22.0 58.0 2.758 0
30 BellSouth -11.5 27.6 24.4 51.4 2.266 0
31 Broadwing -4.1 -5.8 7.7 31.6 1.222 0
32 CenturyTel -5.7 21.1 14.3 28.9 1.153 0
33 Citizens Communications 25.2 0.0 7.9 25.9 0.717 0
34 Commonwealth Telephone -5.7 0.0 16.3 49.9 1.517 0
35 Conestoga Enterprises 1.7 11.4 14.1 48.0 1.347 0
36 Digex 13.8 -37.6 -13.0 32.3 13.768 0
37 Equant 8.2 -15.6 0.3 87.7 5.444 0
38 Garmin 74.7 54.6 27.9 74.6 3.720 0
39 Gilat Satellite Networks 43.1 -0.4 3.4 40.0 0.925 0
40 IDT Corp. 48.7 38.4 -9.2 65.4 0.705 0
41 Infonet 40.7 0.6 1.8 49.2 7.497 0
42 Openwave Systems 20.3 -61.3 1.9 27.0 35.178 0
43 Price Communications 16.1 3.6 11.6 21.4 0.856 0
44 Qwest -6.1 0.0 9.4 22.6 2.123 0
45 SBC Communications -7.2 18.6 20.8 52.2 2.413 0
46 Telephone and Data Systems -5.3 31.0 4.9 26.9 1.362 0
47 Time Warner Telecom 4.1 -10.1 9.7 36.0 7.623 0
48 U.S. Cellular 0.3 27.9 16.1 49.5 4.357 0
49 Verizon Communications -7.4 8.9 15.3 39.3 1.273 0
50 Western Wireless 1.1 -39.5 15.7 41.8 1.449 0
A good way to get a feeling for the predictive power of individual variables is to construct
side–by–side boxplots, to see if there is separation between the two groups on the variables.
This does not take into account the variables having joint effects, and doesn’t necessarily
imply that a linear logistic model is appropriate, but is still helpful. Here are the plots:
The working capital, retained earnings, and earnings variables all show clear separation
between bankrupt and nonbankrupt firms, in the ways that would have been expected.
The sales variable shows less predictive power, with the bankrupt firms actually having the
highest values of sales as a percentage of assets. Nonbankrupt firms have a generally higher
equity to liabilities ratio (lower debt to equity), although the long tails of the variable make
this a little harder to see. Note that while there is no assumption being made here on the
distributions of the predictors, the long tails in this variable do suggest that logging this
variable might be helpful. In fact, I did this, but the unlogged variable worked just as well,
so I won’t pursue that further. Remember, of course, that these are not plots of a response
versus a categorical predictor in an OLS model, so any notion of nonconstant variance is not
relevant here.
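The numerical counterpart of a side-by-side boxplot is a five-number summary computed separately for each group. The sketch below does this for EBIT/TA, with the values copied from the data table above. The helper name five_number_summary is an illustrative choice, and the quartile convention used by Python's statistics.quantiles may differ slightly from the one Minitab uses when drawing boxplots.

```python
from statistics import median, quantiles

# EBIT/TA (percent) from the data table: rows 1-25 (bankrupt), rows 26-50 (nonbankrupt).
ebit_bankrupt = [1.6, -10.1, -51.0, -6.0, -23.5, -77.8, -5.8, -32.5, -2.3, -7.9,
                 -2.0, -2.6, -17.5, -25.5, -26.8, -6.7, -42.0, -0.7, -87.4, -13.5,
                 -36.0, -98.7, -7.3, 6.4, -9.7]
ebit_nonbankrupt = [-4.9, -7.1, -7.1, 22.0, 24.4, 7.7, 14.3, 7.9, 16.3, 14.1,
                    -13.0, 0.3, 27.9, 3.4, -9.2, 1.8, 1.9, 11.6, 9.4, 20.8,
                    4.9, 9.7, 16.1, 15.3, 15.7]


def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, _, q3 = quantiles(values, n=4)
    return min(values), q1, median(values), q3, max(values)


for label, values in (("bankrupt", ebit_bankrupt), ("nonbankrupt", ebit_nonbankrupt)):
    print(label, five_number_summary(values))
```

The bankrupt firms have a median EBIT/TA of -10.1, compared with 9.4 for the nonbankrupt firms, which is the kind of separation the boxplots are meant to reveal.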
Here is an attempt to fit a logistic regression model to these data.
Binary Logistic Regression: Bankrupt versus WC/TA, RE/TA, EBIT/TA, S/TA, ...
* ERROR * The model could not be fit. Maximum likelihood estimates of
parameters may not exist due to quasi-complete separation of data
points. Please refer to help or StatGuide for more information about
quasi-complete separation.
Strangely enough, Minitab has refused to fit the model. The reason for this is that the
program is signaling that maximum likelihood estimates are potentially unstable because of
the configuration of the data. This is often because the model fits “too well”; that is, the
predictors separate the data into successes and failures (almost) perfectly, a condition called
(quasi)-complete separation. This results in estimated coefficients being driven to ±∞, and
the program refuses to let that happen. The fact is that Minitab has relatively stringent criteria
related to identifying potential (quasi-complete) separation, and will sometimes refuse
to consider models that can actually be fit successfully by other packages (such as R). What
can we do? We can try to find a simpler model that fits (almost) as well, which based on
principles of parsimony would be preferable in any event.
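To see why separation pushes coefficients toward ±∞, consider a toy example with one predictor that separates the two groups completely. The data below are invented for illustration (they are not from the bankruptcy sample), and log_likelihood is an illustrative helper. With the intercept held at zero, the log-likelihood keeps increasing toward zero as the slope grows, so no finite maximum likelihood estimate exists.

```python
import math

# Invented toy data: every failure (0) has x < 0 and every success (1) has x > 0.
x = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
y = [0, 0, 0, 1, 1, 1]


def log_likelihood(intercept, slope):
    """Bernoulli log-likelihood of a logistic model with a single predictor."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = intercept + slope * xi
        # log p = -log(1 + exp(-eta)) and log(1 - p) = -log(1 + exp(eta))
        total += -math.log1p(math.exp(-eta)) if yi == 1 else -math.log1p(math.exp(eta))
    return total


# The fit keeps improving as the slope increases; there is no finite maximum.
for slope in (1.0, 5.0, 25.0):
    print(slope, log_likelihood(0.0, slope))
```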
Unfortunately Minitab does not provide best subset regression for logistic regression.
It is possible, of course, to manually look at every possible logistic model (there would be
31 here). We can then compare different models using a model selection criterion, such
as AIC. Corrected AIC (AICC) is technically not valid in the logistic regression context,
but it still seems to be a useful way of trading off fit versus complexity. In this context,
AICC = AIC + 2(ν + 1)(ν + 2)/(n − ν − 2), where ν is the number of coefficients in the
model (k + 1 for all of the models discussed here, where k is the number of predictors). As
always, the goal is to minimize AIC or AICC.
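A small sketch of this bookkeeping in Python follows. The function name aicc is illustrative, and the formula is simply the one quoted above. The AIC values in the example are taken from the Minitab output shown later in this handout (32.41 for the two-predictor stepwise model and 23.89 for the three-predictor model), with n = 50.

```python
from itertools import combinations

predictors = ["WC/TA", "RE/TA", "EBIT/TA", "S/TA", "BVE/BVL"]

# Every non-empty subset of the five predictors is a candidate model.
candidate_models = [subset
                    for size in range(1, len(predictors) + 1)
                    for subset in combinations(predictors, size)]
print(len(candidate_models))  # 2**5 - 1 = 31


def aicc(aic, n_predictors, n_obs):
    """AICC = AIC + 2(nu + 1)(nu + 2)/(n - nu - 2), with nu = k + 1 coefficients."""
    nu = n_predictors + 1
    return aic + 2.0 * (nu + 1) * (nu + 2) / (n_obs - nu - 2)


aicc_stepwise = aicc(32.41, n_predictors=2, n_obs=50)  # WC/TA, EBIT/TA
aicc_three = aicc(23.89, n_predictors=3, n_obs=50)     # RE/TA, EBIT/TA, BVE/BVL
print(round(aicc_stepwise, 2), round(aicc_three, 2))
```

The correction grows with the number of coefficients, but here it is far too small to change the ordering of the two models.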
Minitab does offer a different model selection method, stepwise regression, which “steps
in” variables one at a time if they help the model, and “steps out” ones that are no longer
needed, stopping when no additional variables seem to help or stepping in another variable
would violate the (quasi-)complete separation restrictions. Unfortunately this is a suboptimal
approach that often results in models that are not the ones we want, which can be easily
seen in these data. I do not recommend its use, but for illustrative purposes, first, here is
stepwise logistic regression output:
Binary Logistic Regression: Bankrupt versus WC/TA, RE/TA, EBIT/TA, S/TA, ...
* ERROR * The model could not be fit. Maximum likelihood estimates of
parameters may not exist due to quasi-complete separation of data
points. Please refer to help or StatGuide for more information about
quasi-complete separation.
* NOTE * Results are displayed for the last successful step of term selection.
The algorithm failed when it attempted to enter the following terms:
BVE/BVL
Method
Link function Logit
Rows used 50
Stepwise Selection of Terms
Candidate terms: WC/TA, RE/TA, EBIT/TA, S/TA, BVE/BVL
-----Step 1---- -----Step 2----
Coef P Coef P
Constant -0.449 0.100
EBIT/TA -0.1741 0.001 -0.2461 0.002
WC/TA -0.0656 0.054
Deviance R-Sq 48.09% 61.90%
Deviance R-Sq(adj) 46.65% 59.02%
AIC 39.98 32.41
α to enter = 0.15, α to remove = 0.15
Response Information
Variable Value Count
Bankrupt 1 25 (Event)
0 25
Total 50
Deviance Table
Source DF Adj Dev Adj Mean Chi-Square P-Value
Regression 2 42.908 21.4538 42.91 0.000
WC/TA 1 9.572 9.5715 9.57 0.002
EBIT/TA 1 39.772 39.7721 39.77 0.000
Error 47 26.407 0.5619
Total 49 69.315
Model Summary
Deviance Deviance
R-Sq R-Sq(adj) AIC
61.90% 59.02% 32.41
Coefficients
Term Coef SE Coef VIF
Constant 0.100 0.652
WC/TA -0.0656 0.0340 1.42
EBIT/TA -0.2461 0.0797 1.42
Odds Ratios for Continuous Predictors
Odds Ratio 95% CI
WC/TA 0.9365 (0.8760, 1.0011)
EBIT/TA 0.7819 (0.6688, 0.9140)
Regression Equation
P(1) = exp(Y’)/(1 + exp(Y’))
Y’ = 0.100 -0.0656WC/TA -0.2461EBIT/TA
Goodness-of-Fit Tests
Test DF Chi-Square P-Value
Deviance 47 26.41 0.993
Pearson 47 28.27 0.986
Hosmer-Lemeshow 8 5.75 0.676
Measures of Association
Pairs Number Percent Summary Measures Value
Concordant 599 95.8 Somers D 0.92
Discordant 26 4.2 Goodman-Kruskal Gamma 0.92
Ties 0 0.0 Kendalls Tau-a 0.47
Total 625 100.0
Association is between the response variable and predicted probabilities
The algorithm stepped in EBIT/TA first, followed by WC/TA, and then stopped due to
detection of quasi-complete separation. On the face of it the results look okay; Hosmer-
Lemeshow indicates a good fit (remember, the Pearson and deviance goodness-of-fit tests
are not valid here), both predictors are highly statistically significant, and Somers’ D is a
robust 0.92.
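The association measures in the output can be reproduced from the pair counts. In the sketch below the function names are illustrative. A pair consists of one bankrupt and one nonbankrupt firm (25 × 25 = 625 pairs) and is concordant when the bankrupt firm has the higher estimated probability. The formulas are the standard definitions of these measures, assumed here to match what Minitab computes; they do reproduce the rounded values shown above.

```python
def count_pairs(probs_event, probs_nonevent):
    """Count concordant, discordant and tied (event, non-event) pairs."""
    concordant = discordant = ties = 0
    for p1 in probs_event:
        for p0 in probs_nonevent:
            if p1 > p0:
                concordant += 1
            elif p1 < p0:
                discordant += 1
            else:
                ties += 1
    return concordant, discordant, ties


def association_measures(concordant, discordant, ties, n_obs):
    """Somers' D, Goodman-Kruskal gamma and Kendall's tau-a from pair counts."""
    diff = concordant - discordant
    return {
        "somers_d": diff / (concordant + discordant + ties),
        "gamma": diff / (concordant + discordant),
        "tau_a": diff / (n_obs * (n_obs - 1) / 2),
    }


# Pair counts reported for the stepwise model (WC/TA and EBIT/TA), n = 50.
measures = association_measures(599, 26, 0, 50)
print({name: round(value, 2) for name, value in measures.items()})
```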
So what’s the problem? Stepwise regression does not account for how predictors might
work together in a very effective way, the way best subsets can. We can’t easily get best
subsets here, but a quick-and-dirty approach is to use ordinary (least squares) best subsets
regression (with the 0/1 bankruptcy variable as the response) to sort through the models.
This is, of course, not at all technically valid, but it can at least help provide guidance. Here
is the resultant best subsets output:
Best Subsets Regression: Bankrupt versus WC/TA, RE/TA, ...
Response is Bankrupt
E B
B V
W R I E
C E T S /
/ / / / B
R-Sq R-Sq Mallows T T T T V
Vars R-Sq (adj) (pred) Cp S A A A A L
1 35.9 34.6 30.1 14.1 0.40849 X
1 35.9 34.5 24.3 14.2 0.40872 X
2 46.4 44.1 35.6 6.3 0.37773 X X
2 41.9 39.4 32.6 10.5 0.39314 X X
3 51.5 48.3 34.4 3.5 0.36303 X X X
3 48.1 44.7 35.5 6.7 0.37562 X X X
4 52.6 48.4 33.6 4.4 0.36268 X X X X
4 52.3 48.0 32.9 4.8 0.36415 X X X X
5 53.1 47.7 31.2 6.0 0.36511 X X X X X
The best subsets points to a model with the three predictors RE/TA, EBIT/TA, and
BVE/BVL (of course, only EBIT/TA of those three predictors was in the stepwise-generated
model). Let’s try it out.
Binary Logistic Regression: Bankrupt versus RE/TA, EBIT/TA, BVE/BVL
* WARNING * When the data are in the Response/Frequency format, the Residuals
versus fits plot is unavailable.
Method
Link function Logit
Residuals for diagnostics Pearson
Rows used 50
Response Information
Variable Value Count
Bankrupt 1 25 (Event)
0 25
Total 50
Deviance Table
Source DF Adj Dev Adj Mean Chi-Square P-Value
Regression 3 53.428 17.8092 53.43 0.000
RE/TA 1 9.190 9.1896 9.19 0.002
EBIT/TA 1 4.573 4.5727 4.57 0.032
BVE/BVL 1 8.645 8.6450 8.64 0.003
Error 46 15.887 0.3454
Total 49 69.315
Model Summary
Deviance Deviance
R-Sq R-Sq(adj) AIC
77.08% 72.75% 23.89
Coefficients
Term Coef SE Coef VIF
Constant -0.29 1.12
RE/TA -0.0563 0.0275 1.24
EBIT/TA -0.1676 0.0927 1.44
BVE/BVL -0.630 0.394 1.62
Odds Ratios for Continuous Predictors
Odds Ratio 95% CI
RE/TA 0.9453 (0.8958, 0.9975)
EBIT/TA 0.8457 (0.7052, 1.0142)
BVE/BVL 0.5327 (0.2459, 1.1539)
Regression Equation
P(1) = exp(Y’)/(1 + exp(Y’))
Y’ = -0.29 -0.0563RE/TA -0.1676EBIT/TA -0.630BVE/BVL
Goodness-of-Fit Tests
Test DF Chi-Square P-Value
Deviance 46 15.89 1.000
Pearson 46 20.77 1.000
Hosmer-Lemeshow 8 2.99 0.935
Measures of Association
Pairs Number Percent Summary Measures Value
Concordant 615 98.4 Somers D 0.97
Discordant 10 1.6 Goodman-Kruskal Gamma 0.97
Ties 0 0.0 Kendalls Tau-a 0.49
Total 625 100.0
Association is between the response variable and predicted probabilities
The Hosmer-Lemeshow test indicates a good fit to the data. More interestingly, this
model is clearly better than the one generated by stepwise logistic regression; its Somers’ D
value is higher (0.97 versus 0.92) and its AIC value is smaller (23.89 versus 32.41). With
98.4% concordant pairs and only 1.6% discordant ones, we’ve identified our groups very well.
Each of the coefficients is statistically significant, and the overall test on all three coefficients
is very significant. The RE/TA coefficient says that an increase of one percentage point in
the retained earnings (as a percentage of total assets) is associated with a decrease in the
odds of going bankrupt in the next year by 5%; the EBIT/TA coefficient says that an increase
of one percentage point in the earnings before interest and taxes (as a percentage of total
assets) is associated with a decrease in the odds of going bankrupt by 15%; an increase by
one of book value of equity divided by book value of liabilities is associated with a decrease
in the odds of going bankrupt by 47% (all of these are holding all else in the model fixed,
of course). Note, by the way, that just as was true for least squares coefficients we need to
be aware of the possibility that the magnitude of these coefficient estimates could be biased
upwards by the act of performing variable selection, although the overall strength of the
associations here would tend to diminish that effect.
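The percentage changes quoted above come from exponentiating the coefficients. Here is a minimal sketch (function name illustrative) using the rounded coefficients and standard errors from the output, with a Wald-type interval exp(coef ± 1.96 SE). Because the inputs are rounded, the results agree with Minitab's odds ratio table only approximately.

```python
import math


def odds_ratio_summary(coef, se, z=1.96):
    """Odds ratio, Wald 95% interval, and percent change in odds per unit increase."""
    ratio = math.exp(coef)
    return {
        "odds_ratio": ratio,
        "ci": (math.exp(coef - z * se), math.exp(coef + z * se)),
        "pct_change": 100.0 * (ratio - 1.0),
    }


# Rounded coefficients and standard errors from the three-predictor output.
fitted = {
    "RE/TA": (-0.0563, 0.0275),
    "EBIT/TA": (-0.1676, 0.0927),
    "BVE/BVL": (-0.630, 0.394),
}
summaries = {name: odds_ratio_summary(coef, se) for name, (coef, se) in fitted.items()}
for name, summary in summaries.items():
    print(name, summary)
```

The percent changes come out at roughly -5.5%, -15.4% and -46.7%, consistent with the rounded figures in the text.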
Here are regression diagnostics.
Row SPEARRES1 HI1 COOK1
1 3.58221 0.073000 0.252631
2 0.36434 0.147700 0.005751
3 0.00005 0.000000 0.000000
4 0.00001 0.000000 0.000000
5 0.04238 0.013737 0.000006
6 0.00008 0.000000 0.000000
7 0.03750 0.012448 0.000004
8 0.24180 0.336022 0.007397
9 1.56452 0.103529 0.070669
10 0.02987 0.008786 0.000002
11 0.87070 0.144438 0.031997
12 0.25983 0.136854 0.002676
13 0.09584 0.042617 0.000102
14 0.02454 0.005977 0.000001
15 0.04220 0.015679 0.000007
16 0.00107 0.000044 0.000000
17 0.00401 0.000352 0.000000
18 0.46737 0.209067 0.014435
19 0.00048 0.000014 0.000000
20 0.04337 0.013074 0.000006
21 0.00095 0.000025 0.000000
22 0.00000 0.000000 0.000000
23 0.04447 0.015083 0.000008
24 0.07258 0.053751 0.000075
25 0.16228 0.076798 0.000548
26 -0.69888 0.130070 0.018257
27 -0.74764 0.314433 0.064092
28 -2.06528 0.159031 0.201651
29 -0.02359 0.003789 0.000001
30 -0.02522 0.004384 0.000001
31 -0.38098 0.093989 0.003764
32 -0.10130 0.025594 0.000067
33 -0.37459 0.101263 0.003953
34 -0.13980 0.046244 0.000237
35 -0.12792 0.034748 0.000147
36 -0.10430 0.138712 0.000438
37 -0.24762 0.098685 0.001678
38 -0.00555 0.000405 0.000000
39 -0.52289 0.120129 0.009332
40 -0.85414 0.647216 0.334611
41 -0.06982 0.027497 0.000034
42 -0.00006 0.000001 0.000000
43 -0.23347 0.069065 0.001011
44 -0.20663 0.052391 0.000590
45 -0.04203 0.009107 0.000004
46 -0.15972 0.048305 0.000324
47 -0.04650 0.016506 0.000009
48 -0.02595 0.004457 0.000001
49 -0.12716 0.036617 0.000154
50 -0.57936 0.408364 0.057920
We see one clear outlier, 360Networks. This firm was in the business of building computer
networks, and was one of only two firms that ultimately went bankrupt that had positive
earnings the year before insolvency. Its value of RE/TA was also not very negative, but part
of this could be from the nature of its business; the thousands of miles of cable that it owned
resulted in the firm having $6.3 billion in total assets only three months before it declared
bankruptcy, making RE/TA less negative.
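The size of the 360Networks residual can be checked approximately by hand. The sketch below plugs the firm's ratios into the fitted equation (using the rounded coefficients from the output, so the result is only approximate) and standardizes the Pearson residual with the leverage value 0.073 from the diagnostics table. The function name is illustrative.

```python
import math


def standardized_pearson_residual(y, prob, leverage):
    """(y - p) / sqrt(p(1 - p)(1 - h)) for a single 0/1 observation."""
    return (y - prob) / math.sqrt(prob * (1.0 - prob) * (1.0 - leverage))


# 360Networks: RE/TA = -7.7, EBIT/TA = 1.6, BVE/BVL = 3.726, Bankrupt = 1.
logit = -0.29 - 0.0563 * (-7.7) - 0.1676 * 1.6 - 0.630 * 3.726
prob = 1.0 / (1.0 + math.exp(-logit))
resid = standardized_pearson_residual(1, prob, leverage=0.073)
print(round(prob, 3), round(resid, 2))
```

The estimated bankruptcy probability is only about 0.08 for a firm that did go bankrupt, which is what produces a standardized residual near 3.6.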
Here are residual plots. Note that for 0/1 response data Minitab won’t produce a plot
of residuals versus fitted probabilities, but you could create that by hand if you want (the
estimated probabilities take the role of fitted values here). Given that for 0/1 response data
residuals can be very far from normally distributed (especially if you have many estimated
probabilities close to 0 or 1) it’s probably not worth the trouble; we can only use these plots
as a rough guideline to what is going on. The positive outlier is very obvious, and there is a
marginally unusual nonbankrupt company as well (Allegiance Telecom, which had relatively
low earnings for a nonbankrupt company).
If we omit the outlier 360Networks and try to fit a model using all of the predictors (which
is the correct thing to do, since we can’t be sure what is the best model to use now) the
program once again won’t allow it; in fact, it turns out that the model fits the data perfectly
(there is complete separation). In this situation the goodness-of-fit statistics all equal zero
and Somers’ D equals 1. A model using only one predictor that fits perfectly cannot be
simplified, and separation in this context is not a bad thing; the proper implication is to just
say that the variable perfectly splits the successes from the failures based on an observation
being greater than or less than some cutoff value (which can be identified from the data).
If this occurs with two predictors it means that in a scatter plot of the two variables there
is a straight line such that all of the observations are successes on one side of the line and
failures on the other side of the line.
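For a single predictor, checking for complete separation amounts to checking whether the two groups' ranges overlap. The sketch below uses invented values (not the bankruptcy data) and an illustrative function name. It returns the interval in which any cutoff splits the groups perfectly, or None if the groups overlap.

```python
def separating_interval(x, y):
    """Return (low, high) such that any cutoff strictly between them separates
    the 0s from the 1s perfectly, or None if no such cutoff exists."""
    zeros = [xi for xi, yi in zip(x, y) if yi == 0]
    ones = [xi for xi, yi in zip(x, y) if yi == 1]
    if max(zeros) < min(ones):   # all successes lie above all failures
        return max(zeros), min(ones)
    if max(ones) < min(zeros):   # all successes lie below all failures
        return max(ones), min(zeros)
    return None


# Invented example: successes (1) all have smaller x than failures (0).
x = [-40.0, -25.0, -12.0, -3.0, 4.0, 9.0, 15.0]
y = [1, 1, 1, 1, 0, 0, 0]
print(separating_interval(x, y))
```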
What about here? We should explore the possibility of whether a simpler model is good
enough. It is only when a model using fewer predictors recovers virtually all of the fit that
we can comfortably move away from the current model we are examining.
Here is output from least squares best subsets:
Best Subsets Regression: Bankrupt versus WC/TA, RE/TA, ...
Response is Bankrupt
E B
B V
W R I E
C E T S /
/ / / / B
R-Sq R-Sq Mallows T T T T V
Vars R-Sq (adj) (pred) Cp S A A A A L
1 38.1 36.8 26.1 16.2 0.40166 X
1 37.6 36.3 31.8 16.6 0.40305 X
2 48.9 46.7 37.8 7.4 0.36864 X X
2 44.1 41.7 26.6 12.2 0.38577 X X
3 54.3 51.3 35.4 4.1 0.35253 X X X
3 50.8 47.5 37.0 7.6 0.36607 X X X
4 55.8 51.7 34.5 4.7 0.35085 X X X X
4 55.5 51.4 34.5 5.0 0.35196 X X X X
5 56.5 51.4 32.3 6.0 0.35212 X X X X X
The three-predictor model we used before still seems to be a good choice, but unfortunately
Minitab won’t let us fit it because of (quasi-)complete separation. It will let us run
stepwise regression, which suggests the model just based on RE/TA:
Binary Logistic Regression: Bankrupt versus RE/TA, EBIT/TA
* WARNING * When the data are in the Response/Frequency format, the Residuals
versus fits plot is unavailable.
Method
Link function Logit
Residuals for diagnostics Pearson
Rows used 49
Stepwise Selection of Terms
Candidate terms: RE/TA, EBIT/TA
-----Step 1----
Coef P
Constant -3.05
RE/TA -0.0828 0.001
Deviance R-Sq 64.28%
Deviance R-Sq(adj) 62.81%
AIC 28.26
α to enter = 0.15, α to remove = 0.15
Response Information
Variable Value Count
Bankrupt 1 24 (Event)
0 25
Total 49
Deviance Table
Source DF Adj Dev Adj Mean Chi-Square P-Value
Regression 1 43.65 43.6522 43.65 0.000
RE/TA 1 43.65 43.6522 43.65 0.000
Error 47 24.26 0.5161
Total 48 67.91
Model Summary
Deviance Deviance
R-Sq R-Sq(adj) AIC
64.28% 62.81% 28.26
Coefficients
Term Coef SE Coef VIF
Constant -3.05 1.09
RE/TA -0.0828 0.0259 1.00
Odds Ratios for Continuous Predictors
Odds Ratio 95% CI
RE/TA 0.9206 (0.8750, 0.9685)
Regression Equation
P(1) = exp(Y’)/(1 + exp(Y’))
Y’ = -3.05 -0.0828RE/TA
Goodness-of-Fit Tests
Test DF Chi-Square P-Value
Deviance 47 24.26 0.998
Pearson 47 26.29 0.994
Hosmer-Lemeshow 8 5.19 0.737
Measures of Association
Pairs Number Percent Summary Measures Value
Concordant 579 96.5 Somers D 0.93
Discordant 21 3.5 Goodman-Kruskal Gamma 0.93
Ties 0 0.0 Kendalls Tau-a 0.47
Total 600 100.0
Association is between the response variable and predicted probabilities
The least squares best subsets adds EBIT/TA as the best two-predictor regression, and
both likelihood ratio tests and AIC agree with that in the logistic regression fit:
Binary Logistic Regression: Bankrupt versus RE/TA, EBIT/TA
* WARNING * When the data are in the Response/Frequency format, the Residuals
versus fits plot is unavailable.
Method
Link function Logit
Residuals for diagnostics Pearson
Rows used 49
Response Information
Variable Value Count
Bankrupt 1 24 (Event)
0 25
Total 49
Deviance Table
Source DF Adj Dev Adj Mean Chi-Square P-Value
Regression 2 47.738 23.8692 47.74 0.000
RE/TA 1 13.464 13.4638 13.46 0.000
EBIT/TA 1 4.086 4.0863 4.09 0.043
Error 46 20.170 0.4385
Total 48 67.908
Model Summary
Deviance Deviance
R-Sq R-Sq(adj) AIC
70.30% 67.35% 26.17
Coefficients
Term Coef SE Coef VIF
Constant -2.62 1.08
RE/TA -0.0542 0.0251 1.09
EBIT/TA -0.1172 0.0739 1.09
Odds Ratios for Continuous Predictors
Odds Ratio 95% CI
RE/TA 0.9472 (0.9017, 0.9950)
EBIT/TA 0.8894 (0.7696, 1.0279)
Regression Equation
P(1) = exp(Y’)/(1 + exp(Y’))
Y’ = -2.62 -0.0542RE/TA -0.1172EBIT/TA
Goodness-of-Fit Tests
Test DF Chi-Square P-Value
Deviance 46 20.17 1.000
Pearson 46 18.64 1.000
Hosmer-Lemeshow 8 8.44 0.391
Measures of Association
Pairs Number Percent Summary Measures Value
Concordant 584 97.3 Somers D 0.95
Discordant 16 2.7 Goodman-Kruskal Gamma 0.95
Ties 0 0.0 Kendalls Tau-a 0.48
Total 600 100.0
Association is between the response variable and predicted probabilities
The RE/TA coefficient says that an increase of one percentage point in the retained earnings
(as a percentage of total assets) is associated with a decrease in the odds of going
bankrupt in the next year by 5.3% holding EBIT/TA fixed; the EBIT/TA coefficient says that
an increase of one percentage point in the earnings before interest and taxes (as a percentage
of total assets) is associated with a decrease in the odds of going bankrupt by 11.1% holding
RE/TA fixed. I’m still not happy that Minitab wouldn’t let me try the three-predictor model,
but there isn’t anything we can do about it in the current version of Minitab.
Here are diagnostics for this two-predictor model; there is still an indication of some
unusual observations, but at this point the fit is almost perfect, and omitting anything
doesn’t change things materially.
Row Company FITS SPEARRES HI COOK
1 Advanced Radio Telecom 0.86072 0.41982 0.081909 0.005241
2 Ardent Communications 1.00000 0.00076 0.000012 0.000000
3 At Home Corp. 1.00000 0.00002 0.000000 0.000000
4 Convergent Communications 0.96251 0.20570 0.079425 0.001217
5 Covad Communications 1.00000 0.00187 0.000097 0.000000
6 e.spire 0.97732 0.15779 0.067850 0.000604
7 eGlobe 0.99797 0.04531 0.010011 0.000007
8 Exodus Communications 0.15712 2.43405 0.094499 0.206099
9 General Datacomm Industries 0.99367 0.08104 0.029981 0.000068
10 Global Telesystems 0.30689 1.56837 0.081809 0.073054
11 GST Telecom 0.60905 0.84943 0.110361 0.029835
12 Metricom 0.87556 0.40479 0.132587 0.008348
13 Net2000 Communications 0.98134 0.14140 0.048926 0.000343
14 NetVoice Technologies 0.95320 0.23786 0.132154 0.002872
15 PSINet 0.99997 0.00508 0.000586 0.000000
16 Rhythms NetConnections 0.99856 0.03818 0.010851 0.000005
17 RSL Communications 0.41104 1.26010 0.097616 0.057256
18 SSE Telecom 0.99997 0.00583 0.001263 0.000000
19 Startec Global Communications 0.96249 0.20281 0.052416 0.000758
20 Teligent 0.99993 0.00852 0.000698 0.000000
21 U.S. Wireless 1.00000 0.00000 0.000000 0.000000
22 Viatel 0.96760 0.18978 0.070358 0.000909
23 WebLink Wireless 0.96108 0.22970 0.232501 0.005328
24 Winstar 0.74858 0.60835 0.092490 0.012572
25 Aether Systems 0.21983 -0.56599 0.120402 0.014617
26 Akamai Technologies 0.51083 -1.07664 0.099086 0.042496
27 Allegiance Telecom 0.66205 -1.46367 0.085560 0.066816
28 ALLTEL Corp. 0.00099 -0.03160 0.005195 0.000002
29 BellSouth 0.00093 -0.03060 0.005170 0.000002
30 Broadwing 0.03880 -0.20620 0.050696 0.000757
31 CenturyTel 0.00431 -0.06627 0.014028 0.000021
32 Citizens Communications 0.02798 -0.17340 0.042532 0.000445
33 Commonwealth Telephone 0.01064 -0.10523 0.028683 0.000109
34 Conestoga Enterprises 0.00744 -0.08750 0.020391 0.000053
35 Digex 0.71929 -1.73818 0.151897 0.180373
36 Equant 0.14051 -0.42127 0.078789 0.005060
37 Garmin 0.00014 -0.01196 0.001182 0.000000
38 Gilat Satellite Networks 0.04748 -0.22982 0.056198 0.001048
39 IDT Corp. 0.02593 -0.17343 0.115087 0.001304
40 Infonet 0.05389 -0.24651 0.062638 0.001354
41 Openwave Systems 0.61774 -1.45768 0.239472 0.223020
42 Price Communications 0.01512 -0.12588 0.031301 0.000171
43 Qwest 0.02358 -0.15855 0.039471 0.000344
44 SBC Communications 0.00231 -0.04835 0.009889 0.000008
45 Telephone and Data Systems 0.00756 -0.08832 0.023555 0.000063
46 Time Warner Telecom 0.03875 -0.20653 0.054837 0.000825
47 U.S. Cellular 0.00242 -0.04948 0.009487 0.000008
48 Verizon Communications 0.00741 -0.08731 0.020908 0.000054
49 Western Wireless 0.08950 -0.35166 0.205149 0.010639
The output above also gave the estimated probabilities from Minitab. Since this is a
retrospective study, these cannot be interpreted as genuine probabilities that (going forward)
a firm with those values for the financial ratios will go bankrupt in the next year; since we
have oversampled bankrupt firms, the probabilities given are too high. They can still be
used to classify observations as bankrupt or nonbankrupt, however. Here is a classification
matrix, based on whether the estimated probability is above or below .5 (see the end of this
handout for how to get a classification table using Minitab):
Rows: Bankrupt Columns: Predict
0 1 All
0 21 4 25
42.86 8.16 51.02
1 3 21 24
6.12 42.86 48.98
All 24 25 49
48.98 51.02 100.00
85.7% of the firms were correctly classified, much higher than
Cpro = (1.25)[(.5102)(.4898) + (.4898)(.5102)] = 62.5%
and Cmax = 51%, reinforcing the strength of the logistic regression (the three-predictor
model that Minitab won’t let me fit does even better, getting 95.5% of the 49 observations
right). Thus, the two (three) financial ratios do a very good (excellent) job of classifying
the firms into the bankrupt and nonbankrupt groups. Note that this correct classification
proportion is actually overstated; 360Networks would have been misclassified, so the actual
correct classification rate was 84% of the original firms. That is, you must include the
outliers among the misclassifications. Note that while a cutoff of .5 seems reasonable here,
in other circumstances a different cutoff value might be more logical; for example, in a
prospective study with a success probability of (say) .1, a cutoff lower than .5 might be a
choice that better balances the two different kinds of misclassifications. Of course, this is
using the data twice; what we would really like to do would be to apply these models to new
data to see how they compare. See the appendix for information on how to do that.
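The classification rate and the benchmarks Cpro and Cmax quoted above can be reproduced in a few lines of Python. The function name is illustrative; the counts are those in the classification matrix above, indexed as table[actual][predicted].

```python
def classification_summary(table):
    """Correct classification rate, C_pro and C_max for a 2x2 table of counts.

    table[i][j] is the number of observations in actual group i predicted as group j.
    """
    n = sum(sum(row) for row in table)
    correct = (table[0][0] + table[1][1]) / n
    actual = [sum(row) / n for row in table]                         # row proportions
    predicted = [(table[0][j] + table[1][j]) / n for j in range(2)]  # column proportions
    c_pro = 1.25 * sum(a * p for a, p in zip(actual, predicted))
    c_max = max(actual)
    return correct, c_pro, c_max


table = [[21, 4],   # nonbankrupt: 21 predicted nonbankrupt, 4 predicted bankrupt
         [3, 21]]   # bankrupt: 3 predicted nonbankrupt, 21 predicted bankrupt
correct, c_pro, c_max = classification_summary(table)
print(round(correct, 3), round(c_pro, 3), round(c_max, 3))

# Counting the omitted outlier (360Networks) as a misclassification:
correct_all = (21 + 21) / 50
print(correct_all)
```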
This fitted logistic regression can be used to estimate prospective probabilities of bankruptcy
by adjusting the constant term of the regression using prior probabilities of bankruptcy. I will
use a 10% bankruptcy probability, which is roughly consistent with what would be expected
for firms with a corporate bond rating of B, yielding an adjusted intercept
β̃0 = β̂0 + ln[(.10)(25) / ((.90)(24))] = β̂0 − 2.1564.
I can then convert the original probability estimates to adjusted ones, by first converting the
original probabilities to logits, adjusting as above, and then converting back to probabilities.
You can do this using the calculator to create the variable newprob using the commands
below:
exp(LOGE('FITS'/(1-'FITS'))-2.1564)/(1+exp(LOGE('FITS'/(1-'FITS'))-2.1564))
Here are the estimated probabilities:
Row Company newprob
1 Advanced Radio Telecom 0.41700
2 Ardent Communications 1.00000
3 At Home Corp. 1.00000
4 Convergent Communications 0.74819
5 Covad Communications 0.99997
6 e.spire 0.83298
7 eGlobe 0.98274
8 Exodus Communications 0.02112
9 General Datacomm Industries 0.94783
10 Global Telesystems 0.04875
11 GST Telecom 0.15277
12 Metricom 0.44884
13 Net2000 Communications 0.85889
14 NetVoice Technologies 0.70214
15 PSINet 0.99978
16 Rhythms NetConnections 0.98769
17 RSL Communications 0.07474
18 SSE Telecom 0.99971
19 Startec Global Communications 0.74808
20 Teligent 0.99937
21 U.S. Wireless 1.00000
22 Viatel 0.77563
23 WebLink Wireless 0.74081
24 Winstar 0.25629
25 Aether Systems 0.03158
26 Akamai Technologies 0.10783
27 Allegiance Telecom 0.18483
28 ALLTEL Corp. 0.00011
29 BellSouth 0.00011
30 Broadwing 0.00465
31 CenturyTel 0.00050
32 Citizens Communications 0.00332
33 Commonwealth Telephone 0.00124
34 Conestoga Enterprises 0.00087
35 Digex 0.22873
36 Equant 0.01857
37 Garmin 0.00002
38 Gilat Satellite Networks 0.00574
39 IDT Corp. 0.00307
40 Infonet 0.00655
41 Openwave Systems 0.15757
42 Price Communications 0.00177
43 Qwest 0.00279
44 SBC Communications 0.00027
45 Telephone and Data Systems 0.00088
46 Time Warner Telecom 0.00464
47 U.S. Cellular 0.00028
48 Verizon Communications 0.00086
49 Western Wireless 0.01125
Note that GST Telecom has gone from an estimated probability of bankruptcy of .61 to
one of .15, but given the prior bankruptcy probability of .1, this firm would still have been
classified to the bankrupt group.
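The adjustment can also be sketched outside of Minitab. The short Python example below is only an illustration under the assumptions stated in its comments: the function name adjust_probability is made up here, and the input .61 is the rounded original estimate for GST Telecom quoted above, so the result matches the table value only to about two decimal places.

```python
import math

# Offset from the prior probability (.10) and the sample group sizes (25 and 24),
# as in the intercept adjustment above; this works out to about -2.1564.
offset = math.log((0.10 * 25) / (0.90 * 24))

def adjust_probability(p, offset):
    """Convert a fitted probability to a logit, shift it, and convert back.

    Hypothetical helper for illustration. p must be strictly between 0 and 1;
    a stored value of exactly 0 or 1 would make the logit undefined.
    """
    logit = math.log(p / (1.0 - p))
    adjusted_logit = logit + offset
    return math.exp(adjusted_logit) / (1.0 + math.exp(adjusted_logit))

# GST Telecom: original estimate quoted in the text as .61 (rounded)
gst_adjusted = adjust_probability(0.61, offset)
print(round(gst_adjusted, 2))  # prints 0.15
```

Because the adjusted value is still above the prior probability of .1, the sketch agrees with the classification described in the note above.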
Constructing a classification matrix using Minitab
Minitab does not give a classification matrix as automatic output from a logistic regres-
sion, but it can be calculated reasonably easily. Here are the steps:
1. Say your 0/1 response variable is named Response. When fitting the logistic regression
model, click on Storage and click on Fits (event probabilities). This will create
a variable called FITS that contains the estimated probabilities of success for each
observation. If you do this more than once, the variable will be called FITS2, FITS3,
and so on.
2. Now go to the calculator (Calc→ Calculator) and create a variable with the predicted
group for each covariate pattern (say Predict), based on the construction
FITS > .5.
3. Click on Stat → Tables → Cross Tabulation and Chi-Square. Enter Response
and Predict under Categorical variables (one for Rows, the other for Columns).
Under Display click Counts and Total percents.
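For readers who want to check the logic of these steps outside of Minitab, here is a minimal Python sketch of the same cross tabulation. The function name classification_matrix and the six observations are invented for illustration; they are not the telecommunications data.

```python
def classification_matrix(response, fits, cutoff=0.5):
    """Count observations by (actual group, predicted group).

    response: 0/1 values (the variable called Response in the steps above)
    fits:     estimated probabilities of success (the stored fits)
    cutoff:   predict group 1 when the estimated probability exceeds this value
    """
    counts = {(actual, predicted): 0 for actual in (0, 1) for predicted in (0, 1)}
    for actual, prob in zip(response, fits):
        predicted = 1 if prob > cutoff else 0
        counts[(actual, predicted)] += 1
    return counts

# Made-up example values, for illustration only
response = [1, 1, 1, 0, 0, 0]
fits = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1]

matrix = classification_matrix(response, fits)
# Correct classifications sit on the diagonal: (0, 0) and (1, 1)
proportion_correct = (matrix[(0, 0)] + matrix[(1, 1)]) / len(response)
print(matrix)
print(proportion_correct)
```

The keys of the returned dictionary play the role of the row and column labels in the Minitab table, and the diagonal entries give the correctly classified observations.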
Using .5 to define the classifying groups is based on the natural idea of classifying an
observation to its most probable group, but that is not the only possible strategy that you
could take. For example, consider a medical triage situation, where patients are either low
risk (0) or high risk (1). It might be that the cost of misclassifying a person who is actually
at high risk to be low risk (C(0|1)) is viewed as much higher than misclassifying a person
who is actually at low risk to be high risk (C(1|0)), since the latter situation only involves
giving unnecessary treatment, while the former involves failing to give necessary treatment.
Say that it was viewed that C(0|1) is ten times bigger than C(1|0) (that is, it is ten times
worse to fail to give necessary treatment than to give unnecessary treatment). In that case,
the appropriate cutoff is not .5, but rather
$$\frac{C(1|0)}{C(1|0) + C(0|1)} = \frac{1}{1 + 10} = .091.$$
That is, any observation with estimated probability of being at high risk greater than .091
is classified as being at high risk. This will reduce the probability of mistaking a high risk
patient for low risk, but will of course increase the chances of mistaking a low risk patient
for a high risk one. Note that if the costs are viewed as equal, you get back to the .5 cutoff.
Another issue is that in a prospective study, if one group is much less likely than the other, it
might be sensible to assign observations to that group based on a smaller than 50% estimated
likelihood of being from that group.
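One way to see where the cost-based cutoff comes from, assuming the goal is to minimize expected misclassification cost: if p is the estimated probability of being high risk, classifying the observation as low risk has expected cost pC(0|1), while classifying it as high risk has expected cost (1 − p)C(1|0). The high risk classification is cheaper when pC(0|1) > (1 − p)C(1|0), which rearranges to p > C(1|0)/[C(1|0) + C(0|1)]. The Python sketch below simply evaluates that ratio; the function name cost_based_cutoff is hypothetical.

```python
def cost_based_cutoff(cost_1_given_0, cost_0_given_1):
    """Cutoff on the estimated probability of group 1 that minimizes expected cost.

    cost_1_given_0: cost of classifying a true 0 as 1, written C(1|0) in the text
    cost_0_given_1: cost of classifying a true 1 as 0, written C(0|1) in the text
    Only the ratio of the two costs matters, not their units.
    """
    return cost_1_given_0 / (cost_1_given_0 + cost_0_given_1)

# Triage example from the text: C(0|1) is ten times C(1|0)
triage_cutoff = cost_based_cutoff(1, 10)
print(round(triage_cutoff, 3))  # prints 0.091

# Equal costs give back the usual .5 cutoff
equal_cost_cutoff = cost_based_cutoff(1, 1)
print(equal_cost_cutoff)  # prints 0.5
```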
Validating a model on new data
An excellent way to assess the usefulness of the logistic regression model as a classifier is
to validate it on new data. In order to do this, follow these steps:
1. Write down your final logistic regression model.
2. Bring up as active the worksheet that contains the new data (you could also have this
in the same worksheet as a separate set of variables).
3. Click on Calc→ Calculator. Create a new variable logit that equals β̂0+β̂1x1+· · ·+
β̂pxp, where the β̂’s are the estimated coefficients, and {x1, . . . , xp} are the predictors
(that is, the variable logit contains the estimated logits for the new data).
4. Bring up the calculator again, and create the variable prob, the estimated probabilities
of success for the new data; this equals exp(logit)/(1 + exp(logit)).
5. You can now evaluate the effectiveness of the model in classifying new observations by
constructing a classification matrix on the new data in exactly the same way as was
described earlier (where prob takes the place of FITS there).
Note that since you are not using the same data to construct and evaluate the model,
you don’t need to multiply Cpro by 1.25 in your assessment. If your model includes factor
(categorical) variables, you have to convert to the underlying indicator or effect coding
variables and fit the model that way to assess the classification performance of the model on
new data.
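The validation steps above can be sketched in Python as follows. This is only an illustration: the coefficients and the new observations are made up (they are not the fitted bankruptcy model), and the function name predicted_probability is hypothetical.

```python
import math

def predicted_probability(intercept, slopes, x):
    """Estimated probability of success for one new observation.

    Computes the estimated logit b0 + b1*x1 + ... + bp*xp and then
    converts it to a probability with exp(logit) / (1 + exp(logit)).
    """
    logit = intercept + sum(b * xi for b, xi in zip(slopes, x))
    return math.exp(logit) / (1.0 + math.exp(logit))

# Made-up coefficients standing in for "your final logistic regression model"
intercept = -1.0
slopes = [0.5, -0.25]

# Made-up new data: predictor values and the actual 0/1 responses
new_x = [[4.0, 1.0], [1.0, 2.0], [3.0, 0.0], [0.0, 4.0]]
new_response = [1, 0, 1, 0]

prob = [predicted_probability(intercept, slopes, x) for x in new_x]
predicted = [1 if p > 0.5 else 0 for p in prob]

# Proportion of new observations classified correctly
correct = sum(1 for a, p in zip(new_response, predicted) if a == p)
print(correct / len(new_response))
```

The resulting prob values can then be cross tabulated against the actual responses in the same way as for the original data.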
Comparing a subset model to a more complex model
As we know, the need for a more complex model versus a simpler one that is a special
case can be assessed for least squares models using a partial F -test. The analogous test in
the logistic regression context is a chi-squared test (this is not the same as the chi-squared
goodness-of-fit tests given by Minitab, but the tests given under Chi-Square in the deviance
table given in the output are examples of this kind of test). To construct the test, take
the value under Chi-Square in the first row of the deviance table in the Minitab output
for the more complicated model, and subtract from it the corresponding value for the
subset model; call this difference g. This is then compared to a χ² distribution, with the number of degrees of
freedom d being the difference between the number of parameters in the two models. The
null hypothesis being tested is that the subset model provides a good enough fit. To get the
p-value for the test, click on Calc → Probability Distributions → Chi-Square. Click
the button next to Cumulative probability, put the value d in the box next to Degrees
of freedom, click the button next to Input constant, and put the value g in the box there. The
p-value is 1 − q, where q is the value given under P(X<=x). Remember, this is only valid if
the simpler model is a special case of the more complicated model (that is, a subset of it).
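As a cross-check on the Minitab steps, the p-value 1 − q can be sketched in Python. The two Chi-Square values below are invented for illustration, and chi2_upper_tail is a hypothetical helper that uses only the standard library (it relies on a standard recurrence for the incomplete gamma function and assumes a positive integer d and g ≥ 0). If SciPy is available, scipy.stats.chi2.sf(g, d) would be the more usual way to get the same tail probability.

```python
import math

def chi2_upper_tail(g, d):
    """P(X > g) when X is chi-squared with d degrees of freedom.

    Starts from the closed forms for d = 1 (erfc) and d = 2 (exponential),
    then adds one term for each further two degrees of freedom.
    """
    x = g / 2.0
    if d % 2 == 1:
        a = 0.5
        q = math.erfc(math.sqrt(x))  # tail probability for d = 1
    else:
        a = 1.0
        q = math.exp(-x)             # tail probability for d = 2
    for _ in range((d - 1) // 2):
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Made-up values from the first row of two deviance tables
chi_square_complex = 45.2   # more complicated model
chi_square_subset = 38.9    # subset model
g = chi_square_complex - chi_square_subset
d = 2                       # difference in the number of parameters

p_value = chi2_upper_tail(g, d)
print(round(p_value, 3))  # prints 0.043
```

A small p-value is evidence against the null hypothesis that the subset model provides a good enough fit.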