
Predicting bankruptcy in the telecommunications industry

Understanding and predicting bankruptcy has always been important — now more than

ever, given the failure of many multibillion dollar enterprises. Effective bankruptcy prediction

is useful for investors and analysts, allowing for accurate evaluation of a firm’s prospects.

Roughly forty years ago Ed Altman showed that publicly available financial ratios can be

used to distinguish between firms that are about to go bankrupt and those that are not. He

did this using discriminant analysis, but this is a natural logistic regression problem.

The following data and discussion are based on material from an honors thesis of Jeffrey

Lui. The data represent a retrospective sample of the 25 telecommunications firms that de-

clared bankruptcy between May 2000 and January 2002 that had issued financial statements

for at least two years, and information from the December 2000 financial statements of 25

telecommunications firms that did not declare bankruptcy. The nonbankrupt firms were chosen

to try to match asset sizes with the bankrupt firms. Remember, this is a retrospective study

because of the fact that sampling is based on the response itself (bankruptcy); this has

nothing to do with any time-related aspect to the problem.

Five financial ratios were chosen as potential predictors of bankruptcy:

1. Working capital as a percentage of total assets (WC/TA, expressed as a percentage).

Working capital is the difference between current assets and current liabilities, and is thus

a measure of liquidity as it relates to total capitalization. Firms on the road to

bankruptcy would be expected to have less liquidity.

2. Retained earnings as a percentage of total assets (RE/TA, expressed as a percentage).

This is a measure of cumulative profitability over time, and is thus an indicator of

profitability, and also of age. A younger firm is less likely to be able to retain earnings,

since it would reinvest most, if not all, of its earnings in order to stimulate growth.

Both youth and less profitability would be expected to be associated with an increased

risk of insolvency.

3. Earnings before interest and taxes as a percentage of total assets (EBIT/TA, expressed

as a percentage). This is a measure of the productivity of a firm’s assets, with higher

productivity expected to be associated with a healthy firm.

4. Sales as a percentage of total assets (S/TA, expressed as a percentage). This is the

standard capital turnover ratio, indicating the ability of a firm’s assets to generate

sales; lower sales would be expected to be associated with unhealthy prospects for a

firm.

© 2018, Jeffrey S. Simonoff

5. Book value of equity divided by book value of total liabilities (BVE/BVL). This ratio

measures financial leverage, being the inverse of the debt to equity ratio. A smaller

value is indicative of the decline of a firm’s assets relative to its liabilities, presumably

an indicator of unhealthiness. While it is typical in bankruptcy studies to use the

market value of equity in this ratio, the “Internet bubble” of the late 1990s makes this

problematic. It was not at all unusual during this time period for so–called dot–coms

to have very high stock prices that collapsed within a matter of months, making the

market value of equity unrealistically high before the collapse.

Here are the data:

Row Company WC/TA RE/TA EBIT/TA S/TA BVE/BVL Bankrupt

1 360Networks 9.3 -7.7 1.6 9.1 3.726 1

2 Advanced Radio Telecom 42.6 -60.1 -10.1 0.3 4.130 1

3 Ardent Communications -28.8 -203.2 -51.0 14.7 0.111 1

4 At Home Corp. 2.5 -433.1 -6.0 29.3 1.949 1

5 Convergent Communications 26.1 -57.4 -23.5 54.2 0.855 1

6 Covad Communications 39.2 -111.8 -77.8 10.5 0.168 1

7 e.spire -5.4 -105.2 -5.8 38.9 0.028 1

8 eGlobe -35.2 -92.4 -32.5 48.5 11.280 1

9 Exodus Communications 10.5 -12.4 -2.3 21.0 2.500 1

10 General Datacomm Industries -22.4 -124.5 -7.9 125.6 1.595 1

11 Global Telesystems 24.6 -29.0 -2.0 21.3 1.968 1

12 GST Telecom 6.6 -50.9 -2.6 28.9 0.258 1

13 Metricom 33.9 -46.5 -17.5 0.9 0.828 1

14 Net2000 Communications 19.1 -66.3 -25.5 22.3 0.460 1

15 NetVoice Technologies -21.1 -46.0 -26.8 81.4 0.698 1

16 PSINet 2.5 -228.7 -6.7 38.6 0.030 1

17 Rhythms NetConnections 47.0 -78.2 -42.0 4.4 0.168 1

18 RSL Communications 9.1 -40.2 -0.7 81.5 0.522 1

19 SSE Telecom 43.0 -49.2 -87.4 119.9 2.919 1

20 Startec Global Communications -34.9 -79.0 -13.5 127.8 0.197 1

21 Teligent 20.6 -146.3 -36.0 12.6 0.075 1

22 U.S. Wireless -51.6 -326.1 -98.7 0.9 2.402 1

23 Viatel -93.0 -95.2 -7.3 34.8 0.071 1

24 WebLink Wireless -127.5 -121.3 6.4 65.7 0.248 1

25 Winstar -1.2 -47.5 -9.7 14.5 0.456 1

26 Aether Systems 30.6 -14.4 -4.9 2.2 3.482 0

27 Akamai Technologies 9.8 -33.8 -7.1 3.2 5.965 0

28 Allegiance Telecom 37.8 -45.4 -7.1 17.1 3.450 0

29 ALLTEL Corp. 2.2 31.6 22.0 58.0 2.758 0

30 BellSouth -11.5 27.6 24.4 51.4 2.266 0

31 Broadwing -4.1 -5.8 7.7 31.6 1.222 0


32 CenturyTel -5.7 21.1 14.3 28.9 1.153 0

33 Citizens Communications 25.2 0.0 7.9 25.9 0.717 0

34 Commonwealth Telephone -5.7 0.0 16.3 49.9 1.517 0

35 Conestoga Enterprises 1.7 11.4 14.1 48.0 1.347 0

36 Digex 13.8 -37.6 -13.0 32.3 13.768 0

37 Equant 8.2 -15.6 0.3 87.7 5.444 0

38 Garmin 74.7 54.6 27.9 74.6 3.720 0

39 Gilat Satellite Networks 43.1 -0.4 3.4 40.0 0.925 0

40 IDT Corp. 48.7 38.4 -9.2 65.4 0.705 0

41 Infonet 40.7 0.6 1.8 49.2 7.497 0

42 Openwave Systems 20.3 -61.3 1.9 27.0 35.178 0

43 Price Communications 16.1 3.6 11.6 21.4 0.856 0

44 Qwest -6.1 0.0 9.4 22.6 2.123 0

45 SBC Communications -7.2 18.6 20.8 52.2 2.413 0

46 Telephone and Data Systems -5.3 31.0 4.9 26.9 1.362 0

47 Time Warner Telecom 4.1 -10.1 9.7 36.0 7.623 0

48 U.S. Cellular 0.3 27.9 16.1 49.5 4.357 0

49 Verizon Communications -7.4 8.9 15.3 39.3 1.273 0

50 Western Wireless 1.1 -39.5 15.7 41.8 1.449 0

A good way to get a feeling for the predictive power of individual variables is to construct

side–by–side boxplots, to see if there is separation between the two groups on the variables.

This does not take into account the variables having joint effects, and doesn’t necessarily

imply that a linear logistic model is appropriate, but is still helpful. Here are the plots:



The working capital, retained earnings, and earnings variables all show clear separation

between bankrupt and nonbankrupt firms, in the ways that would have been expected.

The sales variable shows less predictive power, with the bankrupt firms actually having the

highest values of sales as a percentage of assets. Nonbankrupt firms have a generally higher

equity to liabilities ratio (lower debt to equity), although the long tails of the variable make

this a little harder to see. Note that while there is no assumption being made here on the

distributions of the predictors, the long tails in this variable do suggest that logging this

variable might be helpful. In fact, I did this, but the unlogged variable worked just as well,

so I won’t pursue that further. Remember, of course, that these are not plots of a response

versus a categorical predictor in an OLS model, so any notion of nonconstant variance is not

relevant here.


Here is an attempt to fit a logistic regression model to these data.

Binary Logistic Regression: Bankrupt versus WC/TA, RE/TA, EBIT/TA, S/TA, ...

* ERROR * The model could not be fit. Maximum likelihood estimates of

parameters may not exist due to quasi-complete separation of data

points. Please refer to help or StatGuide for more information about

quasi-complete separation.

Strangely enough, Minitab has refused to fit the model. The reason for this is that the

program is signaling that maximum likelihood estimates are potentially unstable because of

the configuration of the data. This is often because the model fits “too well”; that is, the

predictors separate the data into successes and failures (almost) perfectly, a condition called

(quasi)-complete separation. This results in estimated coefficients being driven to ±∞, and

the program refuses to let that happen. The fact is that Minitab has relatively stringent cri-

teria related to identifying potential (quasi-complete) separation, and will sometimes refuse

to consider models that can actually be fit successfully by other packages (such as R). What

can we do? We can try to find a simpler model that fits (almost) as well, which based on

principles of parsimony would be preferable in any event.

Unfortunately Minitab does not provide best subset regression for logistic regression.

It is possible, of course, to manually look at every possible logistic model (there would be

31 here). We can then compare different models using a model selection criterion, such

as AIC. Corrected AIC (AICC) is technically not valid in the logistic regression context,

but it still seems to be a useful way of trading off fit versus complexity. In this context,

AICC = AIC + 2(ν + 1)(ν + 2)/(n − ν − 2), where ν is the number of coefficients in the

model (k + 1 for all of the models discussed here, where k is the number of predictors). As

always, the goal is to minimize AIC or AICC.
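As a small worked illustration of the correction formula above, here is a Python sketch that computes AICC from a reported AIC value. The function name aicc is a hypothetical helper introduced only for this example; the AIC values 32.41 (two predictors) and 23.89 (three predictors) are the ones Minitab reports for the models fit in this handout, with n = 50 firms.

```python
def aicc(aic: float, nu: int, n: int) -> float:
    """Corrected AIC: AIC + 2(nu + 1)(nu + 2) / (n - nu - 2).

    nu is the number of coefficients in the model (k + 1 for k predictors),
    and n is the number of observations.
    """
    if n - nu - 2 <= 0:
        raise ValueError("n must exceed nu + 2 for the correction to be defined")
    return aic + 2 * (nu + 1) * (nu + 2) / (n - nu - 2)


if __name__ == "__main__":
    # AIC values as reported in the Minitab output of this handout (n = 50).
    two_predictor = aicc(32.41, nu=3, n=50)    # WC/TA and EBIT/TA
    three_predictor = aicc(23.89, nu=4, n=50)  # RE/TA, EBIT/TA and BVE/BVL
    print(f"AICC, two-predictor model:   {two_predictor:.2f}")
    print(f"AICC, three-predictor model: {three_predictor:.2f}")
```

The correction grows with the number of coefficients, so AICC penalizes the larger model a little more than AIC does; with n = 50 the ordering of these two models does not change.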

Minitab does offer a different model selection method, stepwise regression, which “steps

in” variables one at a time if they help the model, and “steps out” ones that are no longer

needed, stopping when no additional variables seem to help or stepping in another variable

would violate the (quasi-)complete separation restrictions. Unfortunately this is a suboptimal

approach that often results in models that are not the ones we want, which can be easily

seen in these data. I do not recommend its use, but for illustrative purposes, first, here is

stepwise logistic regression output:


Binary Logistic Regression: Bankrupt versus WC/TA, RE/TA, EBIT/TA, S/TA, ...

* ERROR * The model could not be fit. Maximum likelihood estimates of

parameters may not exist due to quasi-complete separation of data

points. Please refer to help or StatGuide for more information about

quasi-complete separation.

* NOTE * Results are displayed for the last successful step of term selection.

The algorithm failed when it attempted to enter the following terms:

BVE/BVL

Method

Link function Logit

Rows used 50

Stepwise Selection of Terms

Candidate terms: WC/TA, RE/TA, EBIT/TA, S/TA, BVE/BVL

-----Step 1---- -----Step 2----

Coef P Coef P

Constant -0.449 0.100

EBIT/TA -0.1741 0.001 -0.2461 0.002

WC/TA -0.0656 0.054

Deviance R-Sq 48.09% 61.90%

Deviance R-Sq(adj) 46.65% 59.02%

AIC 39.98 32.41

α to enter = 0.15, α to remove = 0.15

Response Information

Variable Value Count

Bankrupt 1 25 (Event)

0 25

Total 50

Deviance Table


Source DF Adj Dev Adj Mean Chi-Square P-Value

Regression 2 42.908 21.4538 42.91 0.000

WC/TA 1 9.572 9.5715 9.57 0.002

EBIT/TA 1 39.772 39.7721 39.77 0.000

Error 47 26.407 0.5619

Total 49 69.315

Model Summary

Deviance Deviance

R-Sq R-Sq(adj) AIC

61.90% 59.02% 32.41

Coefficients

Term Coef SE Coef VIF

Constant 0.100 0.652

WC/TA -0.0656 0.0340 1.42

EBIT/TA -0.2461 0.0797 1.42

Odds Ratios for Continuous Predictors

Odds Ratio 95% CI

WC/TA 0.9365 (0.8760, 1.0011)

EBIT/TA 0.7819 (0.6688, 0.9140)

Regression Equation

P(1) = exp(Y’)/(1 + exp(Y’))

Y’ = 0.100 -0.0656WC/TA -0.2461EBIT/TA

Goodness-of-Fit Tests

Test DF Chi-Square P-Value

Deviance 47 26.41 0.993

Pearson 47 28.27 0.986

Hosmer-Lemeshow 8 5.75 0.676


Measures of Association

Pairs Number Percent Summary Measures Value

Concordant 599 95.8 Somers D 0.92

Discordant 26 4.2 Goodman-Kruskal Gamma 0.92

Ties 0 0.0 Kendalls Tau-a 0.47

Total 625 100.0

Association is between the response variable and predicted probabilities

The algorithm stepped in EBIT/TA first, followed by WC/TA, and then stopped due to

detection of quasi-complete separation. On the face of it the results look okay; Hosmer-

Lemeshow indicates a good fit (remember, the Pearson and deviance goodness-of-fit tests

are not valid here), both predictors are highly statistically significant, and Somers’ D is a

robust 0.92.
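The summary measures in the Measures of Association table follow directly from the pair counts. The sketch below recomputes them from the 599 concordant and 26 discordant pairs reported above. The function name is a hypothetical helper, and the formulas are the usual textbook definitions; I am assuming, rather than confirming from the output, that these are the definitions Minitab uses.

```python
def association_measures(concordant: int, discordant: int, ties: int, n_obs: int) -> dict:
    """Rank-association summaries from concordant/discordant pair counts.

    A pair consists of one event (bankrupt) and one non-event firm; it is
    concordant when the event firm has the higher estimated probability.
    """
    pairs = concordant + discordant + ties      # all event/non-event pairs
    diff = concordant - discordant
    return {
        "somers_d": diff / pairs,
        "gamma": diff / (concordant + discordant),
        "kendall_tau_a": diff / (n_obs * (n_obs - 1) / 2),  # all pairs of firms
    }


# Counts from the stepwise model output (25 x 25 = 625 pairs, 50 firms).
measures = association_measures(concordant=599, discordant=26, ties=0, n_obs=50)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With no ties, Somers' D and the Goodman-Kruskal gamma coincide, which is why both are reported as 0.92.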

So what’s the problem? Stepwise regression does not account for how predictors might

work together in a very effective way, the way best subsets can. We can’t easily get best

subsets here, but a quick-and-dirty approach is to use ordinary (least squares) best subsets

regression (with the 0/1 bankruptcy variable as the response) to sort through the models.

This is, of course, not at all technically valid, but it can at least help provide guidance. Here

is the resultant best subsets output:

Best Subsets Regression: Bankrupt versus WC/TA, RE/TA, ...

Response is Bankrupt

E B

B V

W R I E

C E T S /

/ / / / B

R-Sq R-Sq Mallows T T T T V

Vars R-Sq (adj) (pred) Cp S A A A A L

1 35.9 34.6 30.1 14.1 0.40849 X

1 35.9 34.5 24.3 14.2 0.40872 X

2 46.4 44.1 35.6 6.3 0.37773 X X

2 41.9 39.4 32.6 10.5 0.39314 X X

3 51.5 48.3 34.4 3.5 0.36303 X X X

3 48.1 44.7 35.5 6.7 0.37562 X X X


4 52.6 48.4 33.6 4.4 0.36268 X X X X

4 52.3 48.0 32.9 4.8 0.36415 X X X X

5 53.1 47.7 31.2 6.0 0.36511 X X X X X

The best subsets output points to a model with the three predictors RE/TA, EBIT/TA, and

BVE/BVL (of course, only EBIT/TA of those three predictors was in the stepwise-generated

model). Let’s try it out.

Binary Logistic Regression: Bankrupt versus RE/TA, EBIT/TA, BVE/BVL

* WARNING * When the data are in the Response/Frequency format, the Residuals

versus fits plot is unavailable.

Method

Link function Logit

Residuals for diagnostics Pearson

Rows used 50

Response Information

Variable Value Count

Bankrupt 1 25 (Event)

0 25

Total 50

Deviance Table

Source DF Adj Dev Adj Mean Chi-Square P-Value

Regression 3 53.428 17.8092 53.43 0.000

RE/TA 1 9.190 9.1896 9.19 0.002

EBIT/TA 1 4.573 4.5727 4.57 0.032

BVE/BVL 1 8.645 8.6450 8.64 0.003

Error 46 15.887 0.3454

Total 49 69.315

Model Summary


Deviance Deviance

R-Sq R-Sq(adj) AIC

77.08% 72.75% 23.89

Coefficients

Term Coef SE Coef VIF

Constant -0.29 1.12

RE/TA -0.0563 0.0275 1.24

EBIT/TA -0.1676 0.0927 1.44

BVE/BVL -0.630 0.394 1.62


Odds Ratios for Continuous Predictors

Odds Ratio 95% CI

RE/TA 0.9453 (0.8958, 0.9975)

EBIT/TA 0.8457 (0.7052, 1.0142)

BVE/BVL 0.5327 (0.2459, 1.1539)

Regression Equation

P(1) = exp(Y’)/(1 + exp(Y’))

Y’ = -0.29 -0.0563RE/TA -0.1676EBIT/TA -0.630BVE/BVL



Goodness-of-Fit Tests

Test DF Chi-Square P-Value

Deviance 46 15.89 1.000

Pearson 46 20.77 1.000







Total 625 100.0



The Hosmer-Lemeshow test indicates a good fit to the data. More interestingly, this

model is clearly better than the one generated by stepwise logistic regression; its Somers’ D

value is higher (0.97 versus 0.92) and its AIC value is smaller (23.89 versus 32.41). With

98.4% concordant pairs and only 1.6% discordant ones, we’ve identified our groups very well.

Each of the coefficients is statistically significant, and the overall test on all three coefficients

is very significant. The RE/TA coefficient says that an increase of one percentage point in

the retained earnings (as a percentage of total assets) is associated with a decrease in the

odds of going bankrupt in the next year by 5%; the EBIT/TA coefficient says that an increase

of one percentage point in the earnings before interest and taxes (as a percentage of total

assets) is associated with a decrease in the odds of going bankrupt by 15%; an increase by

one of book value of equity divided by book value of liabilities is associated with a decrease

in the odds of going bankrupt by 47% (all of these are holding all else in the model fixed,

of course). Note, by the way, that just as was true for least squares coefficients we need to

be aware of the possibility that the magnitude of these coefficient estimates could be biased

upwards by the act of performing variable selection, although the overall strength of the

associations here would tend to diminish that effect.
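The percentage statements above come from exponentiating the coefficients. As a rough check, this Python sketch converts each estimated slope of the three-predictor model into an odds ratio and a percentage change in the odds; the helper name percent_change_in_odds is hypothetical, and the coefficients are the rounded values from the Minitab output, so the results agree with the reported odds ratios only up to rounding.

```python
import math


def percent_change_in_odds(coef: float) -> float:
    """Percentage change in the odds for a one-unit increase in the predictor,
    holding the other predictors fixed: 100 * (exp(coef) - 1)."""
    return 100.0 * (math.exp(coef) - 1.0)


# Rounded slope estimates from the three-predictor logistic regression.
coefficients = {"RE/TA": -0.0563, "EBIT/TA": -0.1676, "BVE/BVL": -0.630}

for name, b in coefficients.items():
    print(f"{name}: odds ratio {math.exp(b):.4f}, "
          f"change in odds {percent_change_in_odds(b):+.1f}%")
```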

Here are regression diagnostics.

Row SPEARRES1 HI1 COOK1

1 3.58221 0.073000 0.252631

2 0.36434 0.147700 0.005751

3 0.00005 0.000000 0.000000

4 0.00001 0.000000 0.000000

5 0.04238 0.013737 0.000006

6 0.00008 0.000000 0.000000

7 0.03750 0.012448 0.000004

8 0.24180 0.336022 0.007397

9 1.56452 0.103529 0.070669

10 0.02987 0.008786 0.000002

11 0.87070 0.144438 0.031997

12 0.25983 0.136854 0.002676

13 0.09584 0.042617 0.000102

14 0.02454 0.005977 0.000001

15 0.04220 0.015679 0.000007

16 0.00107 0.000044 0.000000

17 0.00401 0.000352 0.000000


18 0.46737 0.209067 0.014435

19 0.00048 0.000014 0.000000

20 0.04337 0.013074 0.000006

21 0.00095 0.000025 0.000000

22 0.00000 0.000000 0.000000

23 0.04447 0.015083 0.000008

24 0.07258 0.053751 0.000075

25 0.16228 0.076798 0.000548

26 -0.69888 0.130070 0.018257

27 -0.74764 0.314433 0.064092

28 -2.06528 0.159031 0.201651

29 -0.02359 0.003789 0.000001

30 -0.02522 0.004384 0.000001

31 -0.38098 0.093989 0.003764

32 -0.10130 0.025594 0.000067

33 -0.37459 0.101263 0.003953

34 -0.13980 0.046244 0.000237

35 -0.12792 0.034748 0.000147

36 -0.10430 0.138712 0.000438

37 -0.24762 0.098685 0.001678

38 -0.00555 0.000405 0.000000

39 -0.52289 0.120129 0.009332

40 -0.85414 0.647216 0.334611

41 -0.06982 0.027497 0.000034

42 -0.00006 0.000001 0.000000

43 -0.23347 0.069065 0.001011

44 -0.20663 0.052391 0.000590

45 -0.04203 0.009107 0.000004

46 -0.15972 0.048305 0.000324

47 -0.04650 0.016506 0.000009

48 -0.02595 0.004457 0.000001

49 -0.12716 0.036617 0.000154

50 -0.57936 0.408364 0.057920

We see one clear outlier, 360Networks. This firm was in the business of building computer

networks, and was one of only two firms that ultimately went bankrupt that had positive

earnings the year before insolvency. Its value of RE/TA was also not very negative, but part

of this could be from the nature of its business; the thousands of miles of cable that it owned

resulted in the firm having $6.3 billion in total assets only three months before it declared

bankruptcy, making RE/TA less negative.

Here are residual plots. Note that for 0/1 response data Minitab won’t produce a plot

of residuals versus fitted probabilities, but you could create that by hand if you want (the


estimated probabilities take the role of fitted values here). Given that for 0/1 response data

residuals can be very far from normally distributed (especially if you have many estimated

probabilities close to 0 or 1) it’s probably not worth the trouble; we can only use these plots

as a rough guideline to what is going on. The positive outlier is very obvious, and there is a

marginally unusual nonbankrupt company as well (Allegiance Telecom, which had relatively

low earnings for a nonbankrupt company).

If we omit the outlier 360Networks and try to fit a model using all of the predictors (which

is the correct thing to do, since we can’t be sure what is the best model to use now) the

program once again won’t allow it; in fact, it turns out that the model fits the data perfectly

(there is complete separation). In this situation the goodness-of-fit statistics all equal zero

and Somers’ D equals 1. A model using only one predictor that fits perfectly cannot be

simplified, and separation in this context is not a bad thing; the proper implication is to just

say that the variable perfectly splits the successes from the failures based on an observation

being greater than or less than some cutoff value (which can be identified from the data).

If this occurs with two predictors it means that in a scatter plot of the two variables there

is a straight line such that all of the observations are successes on one side of the line and

failures on the other side of the line.
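For the one-predictor case just described, checking for complete separation is a matter of comparing the ranges of the two groups. Here is a minimal sketch, assuming the predictor values have already been split by response group; the function name separating_cutoff and the example numbers are made up for illustration and are not taken from the bankruptcy data.

```python
from typing import Optional, Sequence


def separating_cutoff(successes: Sequence[float],
                      failures: Sequence[float]) -> Optional[float]:
    """Return a cutoff that perfectly splits the two groups, or None.

    Complete separation on a single predictor means every success lies
    strictly on one side of some value and every failure strictly on the
    other. Ties at the boundary (quasi-complete separation) return None.
    """
    if max(successes) < min(failures):
        return (max(successes) + min(failures)) / 2
    if max(failures) < min(successes):
        return (max(failures) + min(successes)) / 2
    return None


# Made-up predictor values, for illustration only.
print(separating_cutoff([-5.0, -3.0], [1.0, 2.0]))  # groups do not overlap
print(separating_cutoff([1.0, 5.0], [3.0, 7.0]))    # groups overlap
```

Any value between the two groups works as a cutoff; the midpoint is simply a convenient choice.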

What about here? We should explore the possibility of whether a simpler model is good

enough. It is only when a model using fewer predictors recovers virtually all of the fit that

we can comfortably move away from the current model we are examining.

Here is output from least squares best subsets:


Best Subsets Regression: Bankrupt versus WC/TA, RE/TA, ...

Response is Bankrupt

E B

B V

W R I E

C E T S /

/ / / / B

R-Sq R-Sq Mallows T T T T V

Vars R-Sq (adj) (pred) Cp S A A A A L

1 38.1 36.8 26.1 16.2 0.40166 X

1 37.6 36.3 31.8 16.6 0.40305 X

2 48.9 46.7 37.8 7.4 0.36864 X X

2 44.1 41.7 26.6 12.2 0.38577 X X

3 54.3 51.3 35.4 4.1 0.35253 X X X

3 50.8 47.5 37.0 7.6 0.36607 X X X

4 55.8 51.7 34.5 4.7 0.35085 X X X X

4 55.5 51.4 34.5 5.0 0.35196 X X X X

5 56.5 51.4 32.3 6.0 0.35212 X X X X X

The three-predictor model we used before still seems to be a good choice, but unfortu-

nately Minitab won’t let us fit it because of (quasi-)complete separation. It will let us run

stepwise regression, which suggests the model just based on RE/TA:

Binary Logistic Regression: Bankrupt versus RE/TA, EBIT/TA



Method

Link function Logit


Rows used 49

Stepwise Selection of Terms

Candidate terms: RE/TA, EBIT/TA


-----Step 1----

Coef P

Constant -3.05

RE/TA -0.0828 0.001

Deviance R-Sq 64.28%

Deviance R-Sq(adj) 62.81%

AIC 28.26

α to enter = 0.15, α to remove = 0.15

Response Information

Variable Value Count

Bankrupt 1 24 (Event)

0 25

Total 49

Deviance Table

Source DF Adj Dev Adj Mean Chi-Square P-Value

Regression 1 43.65 43.6522 43.65 0.000

RE/TA 1 43.65 43.6522 43.65 0.000

Error 47 24.26 0.5161

Total 48 67.91

Model Summary

Deviance Deviance

R-Sq R-Sq(adj) AIC

64.28% 62.81% 28.26

Coefficients

Term Coef SE Coef VIF

Constant -3.05 1.09

RE/TA -0.0828 0.0259 1.00



Odds Ratios for Continuous Predictors

Odds Ratio 95% CI

RE/TA 0.9206 (0.8750, 0.9685)

Regression Equation

P(1) = exp(Y’)/(1 + exp(Y’))

Y’ = -3.05 -0.0828RE/TA



Goodness-of-Fit Tests

Test DF Chi-Square P-Value

Deviance 47 24.26 0.998

Pearson 47 26.29 0.994







Total 600 100.0


The least squares best subsets adds EBIT/TA as the best two-predictor regression, and

both likelihood ratio tests and AIC agree with that in the logistic regression fit:

Binary Logistic Regression: Bankrupt versus RE/TA, EBIT/TA



Method

Link function Logit



Rows used 49

Response Information

Variable Value Count

Bankrupt 1 24 (Event)

0 25

Total 49

Deviance Table

Source DF Adj Dev Adj Mean Chi-Square P-Value

Regression 2 47.738 23.8692 47.74 0.000

RE/TA 1 13.464 13.4638 13.46 0.000

EBIT/TA 1 4.086 4.0863 4.09 0.043

Error 46 20.170 0.4385

Total 48 67.908

Model Summary

Deviance Deviance

R-Sq R-Sq(adj) AIC

70.30% 67.35% 26.17

Coefficients

Term Coef SE Coef VIF

Constant -2.62 1.08

RE/TA -0.0542 0.0251 1.09

EBIT/TA -0.1172 0.0739 1.09


Odds Ratios for Continuous Predictors

Odds Ratio 95% CI

RE/TA 0.9472 (0.9017, 0.9950)

EBIT/TA 0.8894 (0.7696, 1.0279)

Regression Equation


P(1) = exp(Y’)/(1 + exp(Y’))

Y’ = -2.62 -0.0542RE/TA -0.1172EBIT/TA



Goodness-of-Fit Tests

Test DF Chi-Square P-Value

Deviance 46 20.17 1.000

Pearson 46 18.64 1.000







Total 600 100.0


The RE/TA coefficient says that an increase of one percentage point in the retained

earnings (as a percentage of total assets) is associated with a decrease in the odds of going

bankrupt in the next year by 5.3% holding EBIT/TA fixed; the EBIT/TA coefficient says that

an increase of one percentage point in the earnings before interest and taxes (as a percentage

of total assets) is associated with a decrease in the odds of going bankrupt by 11.1% holding

RE/TA fixed. I’m still not happy that Minitab wouldn’t let me try the three-predictor model,

but there isn’t anything we can do about it in the current version of Minitab.

Here are diagnostics for this two-predictor model; there is still an indication of some

unusual observations, but at this point the fit is almost perfect, and omitting anything

doesn’t change things materially.

Row Company FITS SPEARRES HI COOK

1 Advanced Radio Telecom 0.86072 0.41982 0.081909 0.005241

2 Ardent Communications 1.00000 0.00076 0.000012 0.000000

3 At Home Corp. 1.00000 0.00002 0.000000 0.000000

4 Convergent Communications 0.96251 0.20570 0.079425 0.001217


5 Covad Communications 1.00000 0.00187 0.000097 0.000000

6 e.spire 0.97732 0.15779 0.067850 0.000604

7 eGlobe 0.99797 0.04531 0.010011 0.000007

8 Exodus Communications 0.15712 2.43405 0.094499 0.206099

9 General Datacomm Industries 0.99367 0.08104 0.029981 0.000068

10 Global Telesystems 0.30689 1.56837 0.081809 0.073054

11 GST Telecom 0.60905 0.84943 0.110361 0.029835

12 Metricom 0.87556 0.40479 0.132587 0.008348

13 Net2000 Communications 0.98134 0.14140 0.048926 0.000343

14 NetVoice Technologies 0.95320 0.23786 0.132154 0.002872

15 PSINet 0.99997 0.00508 0.000586 0.000000

16 Rhythms NetConnections 0.99856 0.03818 0.010851 0.000005

17 RSL Communications 0.41104 1.26010 0.097616 0.057256

18 SSE Telecom 0.99997 0.00583 0.001263 0.000000

19 Startec Global Communications 0.96249 0.20281 0.052416 0.000758

20 Teligent 0.99993 0.00852 0.000698 0.000000

21 U.S. Wireless 1.00000 0.00000 0.000000 0.000000

22 Viatel 0.96760 0.18978 0.070358 0.000909

23 WebLink Wireless 0.96108 0.22970 0.232501 0.005328

24 Winstar 0.74858 0.60835 0.092490 0.012572

25 Aether Systems 0.21983 -0.56599 0.120402 0.014617

26 Akamai Technologies 0.51083 -1.07664 0.099086 0.042496

27 Allegiance Telecom 0.66205 -1.46367 0.085560 0.066816

28 ALLTEL Corp. 0.00099 -0.03160 0.005195 0.000002

29 BellSouth 0.00093 -0.03060 0.005170 0.000002

30 Broadwing 0.03880 -0.20620 0.050696 0.000757

31 CenturyTel 0.00431 -0.06627 0.014028 0.000021

32 Citizens Communications 0.02798 -0.17340 0.042532 0.000445

33 Commonwealth Telephone 0.01064 -0.10523 0.028683 0.000109

34 Conestoga Enterprises 0.00744 -0.08750 0.020391 0.000053

35 Digex 0.71929 -1.73818 0.151897 0.180373

36 Equant 0.14051 -0.42127 0.078789 0.005060

37 Garmin 0.00014 -0.01196 0.001182 0.000000

38 Gilat Satellite Networks 0.04748 -0.22982 0.056198 0.001048

39 IDT Corp. 0.02593 -0.17343 0.115087 0.001304

40 Infonet 0.05389 -0.24651 0.062638 0.001354

41 Openwave Systems 0.61774 -1.45768 0.239472 0.223020

42 Price Communications 0.01512 -0.12588 0.031301 0.000171

43 Qwest 0.02358 -0.15855 0.039471 0.000344

44 SBC Communications 0.00231 -0.04835 0.009889 0.000008

45 Telephone and Data Systems 0.00756 -0.08832 0.023555 0.000063

46 Time Warner Telecom 0.03875 -0.20653 0.054837 0.000825

47 U.S. Cellular 0.00242 -0.04948 0.009487 0.000008

48 Verizon Communications 0.00741 -0.08731 0.020908 0.000054

49 Western Wireless 0.08950 -0.35166 0.205149 0.010639


The output above also gave the estimated probabilities from Minitab. Since this is a retrospective study, these cannot be interpreted as genuine probabilities that (going forward) a firm with those values for the financial ratios will go bankrupt in the next year; since we have oversampled bankrupt firms, the probabilities given are too high. They can still be used to classify observations as bankrupt or nonbankrupt, however. Here is a classification matrix, based on whether the estimated probability is above or below .5 (see the end of this handout for how to get a classification table using Minitab):

Rows: Bankrupt Columns: Predict

0 1 All

0 21 4 25

42.86 8.16 51.02

1 3 21 24

6.12 42.86 48.98

All 24 25 49

48.98 51.02 100.00

85.7% of the firms were correctly classified, much higher than

Cpro = (1.25)[(.5102)(.4898) + (.4898)(.5102)] = 62.5%

and Cmax = 51%, reinforcing the strength of the logistic regression (the three-predictor

model that Minitab won’t let me fit does even better, getting 95.5% of the 49 observations

right). Thus, the two (three) financial ratios do a very good (excellent) job of classifying

the firms into the bankrupt and nonbankrupt groups. Note that this correct classification

proportion is actually overstated; the omitted outlier 360Networks would have been misclassified, so the actual

correct classification rate was 84% (42 of the original 50 firms). That is, you must include the

outliers among the misclassifications. Note that while a cutoff of .5 seems reasonable here,

in other circumstances a different cutoff value might be more logical; for example, in a

prospective study with a success probability of (say) .1, a cutoff lower than .5 might be a

choice that better balances the two different kinds of misclassifications. Of course, this is


using the data twice; what we would really like to do would be to apply these models to new

data to see how they compare. See the appendix for information on how to do that.
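The summary figures quoted above can be reproduced from the counts in the classification matrix. Here is a short Python sketch; the function name classification_summary and its inflation argument are hypothetical choices for this example, and the 1.25 factor is the one used in the Cpro calculation above for a model evaluated on the same data it was fit to.

```python
def classification_summary(table, inflation=1.25):
    """Summarize a square classification table.

    table[i][j] is the number of observations whose actual group is i and
    whose predicted group is j. Returns (proportion correct, C_pro, C_max).
    Pass inflation=1.0 when the table comes from genuinely new data.
    """
    groups = range(len(table))
    n = sum(sum(row) for row in table)
    correct = sum(table[i][i] for i in groups) / n
    actual = [sum(table[i]) / n for i in groups]
    predicted = [sum(table[i][j] for i in groups) / n for j in groups]
    c_pro = inflation * sum(a * p for a, p in zip(actual, predicted))
    c_max = max(actual)
    return correct, c_pro, c_max


# Counts from the classification matrix (rows: actual 0/1, columns: predicted 0/1).
table = [[21, 4], [3, 21]]
correct, c_pro, c_max = classification_summary(table)
print(f"correct = {correct:.1%}, Cpro = {c_pro:.1%}, Cmax = {c_max:.1%}")
```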

This fitted logistic regression can be used to estimate prospective probabilities of bankruptcy

by adjusting the constant term of the regression using prior probabilities of bankruptcy. I will

use a 10% bankruptcy probability, which is roughly consistent with what would be expected

for firms with a corporate bond rating of B, yielding an adjusted intercept

β̃0 = β̂0 + ln[(.10)(25) / ((.90)(24))] = β̂0 − 2.1564.

I can then convert the original probability estimates to adjusted ones, by first converting the

original probabilities to logits, adjusting as above, and then converting back to probabilities.

You can do this using the calculator to create the variable newprob using the commands

below:

exp(LOGE('FITS'/(1-'FITS'))-2.1564)/(1+exp(LOGE('FITS'/(1-'FITS'))-2.1564))

Here are the estimated probabilities:

Row Company newprob

1 Advanced Radio Telecom 0.41700

2 Ardent Communications 1.00000

3 At Home Corp. 1.00000

4 Convergent Communications 0.74819

5 Covad Communications 0.99997

6 e.spire 0.83298

7 eGlobe 0.98274

8 Exodus Communications 0.02112

9 General Datacomm Industries 0.94783

10 Global Telesystems 0.04875

11 GST Telecom 0.15277

12 Metricom 0.44884

13 Net2000 Communications 0.85889

14 NetVoice Technologies 0.70214

15 PSINet 0.99978

16 Rhythms NetConnections 0.98769

17 RSL Communications 0.07474

18 SSE Telecom 0.99971


19 Startec Global Communications 0.74808

20 Teligent 0.99937

21 U.S. Wireless 1.00000

22 Viatel 0.77563

23 WebLink Wireless 0.74081

24 Winstar 0.25629

25 Aether Systems 0.03158

26 Akamai Technologies 0.10783

27 Allegiance Telecom 0.18483

28 ALLTEL Corp. 0.00011

29 BellSouth 0.00011

30 Broadwing 0.00465

31 CenturyTel 0.00050

32 Citizens Communications 0.00332

33 Commonwealth Telephone 0.00124

34 Conestoga Enterprises 0.00087

35 Digex 0.22873

36 Equant 0.01857

37 Garmin 0.00002

38 Gilat Satellite Networks 0.00574

39 IDT Corp. 0.00307

40 Infonet 0.00655

41 Openwave Systems 0.15757

42 Price Communications 0.00177

43 Qwest 0.00279

44 SBC Communications 0.00027

45 Telephone and Data Systems 0.00088

46 Time Warner Telecom 0.00464

47 U.S. Cellular 0.00028

48 Verizon Communications 0.00086

49 Western Wireless 0.01125

Note that GST Telecom has gone from an estimated probability of bankruptcy of .61 to

one of .15, but given the prior bankruptcy probability of .1, this firm would still have been

classified to the bankrupt group.
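The intercept adjustment used above is easy to reproduce outside Minitab. The following Python sketch applies the same three steps (probability to logit, shift by the log term, back to probability); the function name adjust_probability and its argument names are hypothetical, and the inputs are the rounded FITS values printed earlier, so results match the newprob column only up to rounding.

```python
import math


def adjust_probability(p: float, prior: float,
                       n_nonevents: int, n_events: int) -> float:
    """Convert a retrospective estimated probability to a prospective one.

    The logit of p is shifted by ln[(prior * n_nonevents) /
    ((1 - prior) * n_events)], which is -2.1564 for a 10% prior with
    25 nonbankrupt and 24 bankrupt firms. Requires 0 < p < 1.
    """
    shift = math.log((prior * n_nonevents) / ((1.0 - prior) * n_events))
    logit = math.log(p / (1.0 - p)) + shift
    return 1.0 / (1.0 + math.exp(-logit))


# Rounded retrospective probabilities (FITS) for two of the firms.
for company, fit in [("GST Telecom", 0.60905), ("Advanced Radio Telecom", 0.86072)]:
    print(f"{company}: {adjust_probability(fit, 0.10, 25, 24):.5f}")
```

Fitted values that print as 1.00000 are only rounded to 1; the raw stored value would be needed to avoid a division by zero in the logit step.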


Constructing a classification matrix using Minitab

Minitab does not give a classification matrix as automatic output from a logistic regres-

sion, but it can be calculated reasonably easily. Here are the steps:

1. Say your 0/1 response variable is named Response. When fitting the logistic regression

model, click on Storage and click on Fits (event probabilities). This will create

a variable called FITS that contains the estimated probabilities of success for each

observation. If you do this more than once, the variable will be called FITS2, FITS3,

and so on.

2. Now go to the calculator (Calc → Calculator) and create a variable with the predicted

group for each covariate pattern (say Predict), based on the construction

FITS > .5.

3. Click on Stat → Tables → Cross Tabulation and Chi-Square. Enter Response

and Predict under Categorical variables (one for Rows, the other for Columns).

Under Display click Counts and Total percents.

Using .5 to define the classifying groups is based on the natural idea of classifying an

observation to its most probable group, but that is not the only possible strategy that you

could take. For example, consider a medical triage situation, where patients are either low

risk (0) or high risk (1). It might be that the cost of misclassifying a person who is actually

at high risk to be low risk (C(0|1)) is viewed as much higher than misclassifying a person

who is actually at low risk to be high risk (C(1|0)), since the latter situation only involves

giving unnecessary treatment, while the former involves failing to give necessary treatment.

Say that it was viewed that C(0|1) is ten times bigger than C(1|0) (that is, it is ten times

worse to fail to give necessary treatment than to give unnecessary treatment). In that case,

the appropriate cutoff is not .5, but rather

C(1|0) / [C(1|0) + C(0|1)] = 1 / (1 + 10) = .091.

That is, any observation with estimated probability of being at high risk greater than .091

is classified as being at high risk. This will reduce the probability of mistaking a high risk

patient for low risk, but will of course increase the chances of mistaking a low risk patient

for a high risk one. Note that if the costs are viewed as equal, you get back to the .5 cutoff.

Another issue is that in a prospective study, if one group is much less likely than the other, it


might be sensible to assign observations to that group based on a smaller than 50% estimated

likelihood of being from that group.
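The cost-based cutoff described above can be sketched in a few lines of Python. Both function names are hypothetical helpers, and the probabilities in the example are made up purely to show how lowering the cutoff changes the classification.

```python
def cost_based_cutoff(cost_1_given_0: float, cost_0_given_1: float) -> float:
    """Cutoff C(1|0) / (C(1|0) + C(0|1)).

    cost_1_given_0 is the cost of classifying an actual 0 as a 1;
    cost_0_given_1 is the cost of classifying an actual 1 as a 0.
    """
    return cost_1_given_0 / (cost_1_given_0 + cost_0_given_1)


def classify(probabilities, cutoff=0.5):
    """Assign group 1 when the estimated probability exceeds the cutoff."""
    return [1 if p > cutoff else 0 for p in probabilities]


# Triage example: missing a high-risk patient is ten times as costly.
cutoff = cost_based_cutoff(1.0, 10.0)
made_up_probabilities = [0.05, 0.20, 0.60]  # illustrative values only
print(f"cutoff = {cutoff:.3f}")
print(classify(made_up_probabilities, cutoff))
print(classify(made_up_probabilities))      # default .5 cutoff for comparison
```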

Validating a model on new data

An excellent way to assess the usefulness of the logistic regression model as a classifier is

to validate it on new data. In order to do this, follow these steps:

1. Write down your final logistic regression model.

2. Bring up as active the worksheet that contains the new data (you could also have this

in the same worksheet as a separate set of variables).

3. Click on Calc → Calculator. Create a new variable logit that equals β̂0 + β̂1x1 + ··· +

β̂pxp, where the β̂’s are the estimated coefficients, and {x1, ..., xp} are the predictors

(that is, the variable logit contains the estimated logits for the new data).

4. Bring up the calculator again, and create the variable prob, the estimated probabilities

of success for the new data; this equals exp(logit)/(1 + exp(logit)).

5. You can now evaluate the effectiveness of the model in classifying new observations by

constructing a classification matrix on the new data in exactly the same way as was

described earlier (where prob takes the place of FITS there).

Note that since you are not using the same data to construct and evaluate the model,

you don’t need to multiply Cpro by 1.25 in your assessment. If your model includes factor

(categorical) variables, you have to convert to the underlying indicator or effect coding

variables and fit the model that way to assess the classification performance of the model on

new data.
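The steps above can also be sketched in Python. This example uses the rounded coefficients of the two-predictor model (RE/TA and EBIT/TA) reported in this handout; the function names are hypothetical, and the three "new" firms are invented solely to show the mechanics, so the resulting table says nothing about real performance.

```python
import math

# Rounded coefficients of the two-predictor model reported in this handout.
INTERCEPT, B_RE_TA, B_EBIT_TA = -2.62, -0.0542, -0.1172


def predicted_probability(re_ta: float, ebit_ta: float) -> float:
    """Estimated probability of bankruptcy: exp(logit) / (1 + exp(logit))."""
    logit = INTERCEPT + B_RE_TA * re_ta + B_EBIT_TA * ebit_ta
    return math.exp(logit) / (1.0 + math.exp(logit))


def classification_table(firms, cutoff=0.5):
    """2 x 2 table of counts: rows are actual 0/1, columns are predicted 0/1."""
    table = [[0, 0], [0, 0]]
    for re_ta, ebit_ta, actual in firms:
        predicted = 1 if predicted_probability(re_ta, ebit_ta) > cutoff else 0
        table[actual][predicted] += 1
    return table


# Made-up validation firms, for illustration only: (RE/TA, EBIT/TA, bankrupt).
new_firms = [(-80.0, -20.0, 1), (15.0, 10.0, 0), (-10.0, 2.0, 0)]
print(classification_table(new_firms))
```

As a sanity check on the coefficients, plugging in the Exodus Communications ratios (RE/TA = -12.4, EBIT/TA = -2.3) gives a probability close to the fitted value 0.15712 shown in the diagnostics table, with the small difference due to rounding of the coefficients.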

Comparing a subset model to a more complex model

As we know, the need for a more complex model versus a simpler one that is a special

case can be assessed for least squares models using a partial F-test. The analogous test in

the logistic regression context is a chi-squared test (this is not the same as the chi-squared

goodness-of-fit tests given by Minitab, but the tests given under Chi-Square in the deviance

table given in the output are examples of this kind of test). To construct the test, subtract

the value under Chi-Square in the first row of the deviance table from the Minitab output

for the subset model from the corresponding value for the more complicated model; call

this value g. This is then compared to a χ² distribution, with the number of degrees of


freedom d being the difference between the number of parameters in the two models. The

null hypothesis being tested is that the subset model provides a good enough fit. To get the

p-value for the test, click on Calc → Probability Distributions → Chi-Square. Click

the button next to Cumulative probability, put the value d in the box next to Degrees

of freedom, click the button and put the value g in the box next to Input constant. The

p-value is 1 − q, where q is the value given under P(X<=x). Remember, this is only valid if

the simpler model is a special case of the more complicated model (that is, a subset of it).
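The same calculation can be sketched in Python using only the standard library. The function names chi2_sf and subset_test are hypothetical; chi2_sf computes the upper-tail probability directly (so there is no need to form 1 − q) using a standard recurrence that is valid for integer degrees of freedom. The example compares the RE/TA-only model with the RE/TA and EBIT/TA model, using the Regression chi-square values from the two deviance tables in this handout.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variable, integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Start from df = 1 (complementary error function) or df = 0 (zero tail),
    # then add one term for each increase of 2 in the degrees of freedom.
    if df % 2 == 1:
        q, nu = math.erfc(math.sqrt(half)), 1
    else:
        q, nu = 0.0, 0
    while nu < df:
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q


def subset_test(chisq_full, chisq_subset, params_full, params_subset):
    """Return (g, d, p-value) for testing a subset model against a fuller one."""
    g = chisq_full - chisq_subset
    d = params_full - params_subset
    return g, d, chi2_sf(g, d)


# Regression chi-square: 47.738 (RE/TA and EBIT/TA) versus 43.652 (RE/TA only).
g, d, p = subset_test(47.738, 43.652, params_full=3, params_subset=2)
print(f"g = {g:.3f}, d = {d}, p-value = {p:.3f}")
```

Because the two models differ by a single predictor, this reproduces the test for EBIT/TA shown in the deviance table of the two-predictor fit.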

