Logistic Regression Cases(4-6) Muhammad Akram Naseem ([email protected]) Presenter: Research Centre for Training and Development(RCTD) 06/23/2022 1 Unlock the Potential of Data analysis

Logistic Regression(Cases 4-6)

Page 1: Logistic Regression(Cases 4-6)

04/21/2023 Unlock the Potential of Data Analysis

Logistic Regression Cases (4-6)

Muhammad Akram Naseem ([email protected])

Presenter: Research Centre for Training and Development (RCTD)

Page 2: Logistic Regression(Cases 4-6)

Logistic Regression

What is it? Logistic regression analysis is used when the dependent variable is categorical.

Binary logistic regression is most useful when you want to model the event probability for a categorical response variable with two outcomes.

For example: a doctor wants to accurately diagnose a possibly cancerous tumor.

Page 3: Logistic Regression(Cases 4-6)

Binary logistic regression

Example: whether a doctor diagnoses accurately depends on the doctor's experience.

Dependent variable: Accurate diagnosis (AD) (Yes, No)

Independent variable: Experience

Page 4: Logistic Regression(Cases 4-6)

Binary logistic regression

Example: a loan officer wants to know whether the next customer is likely to default.

Default from loan (DFL) depends on monthly income.

Dependent variable: Default from loan (DFL) (Yes, No)

Independent variable: Monthly income

Page 5: Logistic Regression(Cases 4-6)

Binary logistic regression

Example: the satisfaction of employees of a certain organization depends on workload.

Dependent variable: Satisfaction (Yes, No)

Independent variable: Workload

Page 6: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 4)

Case: A study is conducted to determine the impact of age on coronary heart disease (CHD).

Dependent variable: CHD (1 = yes, 0 = no)

Independent variable: Age

Data file: logistic regression

Page 7: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 4)

Steps in SPSS:

1. Click on Analyze.
2. Click on Binary Logistic (under Regression).
3. Move the dependent variable (CHD) into the Dependent box.
4. Move the independent variable (Age) into the Covariates box.
5. Click on OK.
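Outside SPSS, the same kind of model can be fitted by maximum likelihood. The sketch below is one possible way to do it with Newton-Raphson and the Python standard library only. The `ages` and `chd` values are made-up toy data (not the contents of the course data file), and `fit_logistic` is a hypothetical helper name.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix [[h00, h01], [h01, h11]].
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data for illustration only (hypothetical, not the course data).
ages = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
chd = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(ages, chd)
print(f"Z = {b0:.3f} + {b1:.4f} * age")
```

The fitted `b1` plays the role of the B column in the SPSS output, and `math.exp(b1)` plays the role of Exp(B).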

Page 8: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 4)

Omnibus Tests of Model Coefficients

        Chi-square   df   Sig.
Step    29.31        1    0.00
Block   29.31        1    0.00
Model   29.31        1    0.00

The p-value indicates that the model is significant.

Model Summary

Step   -2 Log likelihood   Cox & Snell R Square
1      107.35              0.25

Cox & Snell R Square indicates the explanatory power of the model.

Classification Table

                Predicted CHD
Observed CHD    NO     YES
NO              45     12
YES             14     29

Proportion correctly classified: (45 + 29) / 100 = 0.74.
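The proportion of correctly classified cases can be recomputed directly from the four cells of the classification table. The sketch below does this; sensitivity and specificity are extra summaries that are not reported on the slide, and the variable names are illustrative.

```python
# Cells of the Case 4 classification table (rows = observed, columns = predicted).
tn, fp = 45, 12   # observed NO:  predicted NO, predicted YES
fn, tp = 14, 29   # observed YES: predicted NO, predicted YES

total = tn + fp + fn + tp
accuracy = (tn + tp) / total          # proportion correctly classified
sensitivity = tp / (tp + fn)          # observed YES that were predicted YES
specificity = tn / (tn + fp)          # observed NO that were predicted NO

print(f"accuracy    = {accuracy:.2f}")
print(f"sensitivity = {sensitivity:.3f}")
print(f"specificity = {specificity:.3f}")
```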

Page 9: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 4)

Variables in the Equation

           B       S.E.   Wald    df   Sig.   Exp(B) = OR
AGE        0.11    0.02   21.25   1    0.00   1.117
Constant   -5.31   1.13   21.94   1    0.00   0.01

Age is a significant variable.

Z = -5.31 + 0.1109 (age)

exp(0.11) = 1.12 means that with each one-year increase in age, the odds of CHD are multiplied by 1.12, provided all other factors are kept constant.
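The odds ratio is the exponential of the coefficient, so the effect of several years compounds multiplicatively. A small sketch, using the AGE coefficient 0.1109 from the equation above; `odds_multiplier` is a hypothetical helper name.

```python
import math

b_age = 0.1109  # coefficient of AGE from Z = -5.31 + 0.1109 (age)

def odds_multiplier(b, delta=1.0):
    """Factor by which the odds change when the predictor rises by `delta` units."""
    return math.exp(b * delta)

print(round(odds_multiplier(b_age), 3))      # one extra year of age
print(round(odds_multiplier(b_age, 10), 2))  # ten extra years of age
```

Under this model, ten extra years multiply the odds of CHD by roughly three, not by ten times 1.12.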

Page 10: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 4)

Z = -5.31 + 0.1109 (age)

$P(\text{CHD}) = \dfrac{1}{1 + e^{-Z}}$

Age   Z       P(CHD)
30    -2.01   0.12
35    -1.46   0.19
40    -0.91   0.29
45    -0.36   0.41
50    0.19    0.55
55    0.74    0.68
60    1.29    0.78
65    1.84    0.86
70    2.39    0.92
75    2.94    0.95

[Chart: P(CHD) plotted against age (30 to 75), rising from 0.12 to 0.95]
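The Z and P(CHD) columns can be reproduced by plugging each age into the equation and then into the logistic function. The sketch below assumes the rounded coefficient B = 0.11 from the output table, which is what matches the tabulated Z values; `p_chd` is a hypothetical helper name.

```python
import math

def p_chd(age, b0=-5.31, b1=0.11):
    """Return (Z, P(CHD)) for a given age, using the rounded Case 4 coefficients."""
    z = b0 + b1 * age
    return z, 1.0 / (1.0 + math.exp(-z))

print("Age      Z  P(CHD)")
for age in range(30, 80, 5):
    z, p = p_chd(age)
    print(f"{age:>3}  {z:5.2f}    {p:.2f}")
```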

Page 11: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 5)

In Case 5, our objective is to determine the impact of a categorical (binary) explanatory variable on a categorical (binary) dependent variable, and to see how the analysis is performed and how the findings are interpreted.

File used: Case5.sav

Dependent variable: Baby Birth Weight

Independent variable: Smoking Status

Page 12: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 5)

Dependent variable: Baby Birth Weight, classified as low weight (1) or not low weight (0).

Independent variable: Smoking Status, classified as smoker (1) or non-smoker (0).

Page 13: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 5)

Omnibus Tests of Model Coefficients

        Chi-square   df   Sig.
Step    4.867        1    0.027
Block   4.867        1    0.027
Model   4.867        1    0.027

The p-value indicates that the model is significant at the 5% level.

Model Summary

Step   -2 Log likelihood   Cox & Snell R Square
1      229.805             0.025

Cox & Snell R Square indicates the explanatory power of the model.

Classification Table

                        Predicted Birth Weight
Observed Birth Weight   Not low   Low
Not low                 130       0
Low                     59        0

Proportion correctly classified: (130 + 0) / 189 = 0.69.
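The Sig. value of an omnibus test is the upper-tail chi-square probability of the test statistic. For 1 and 2 degrees of freedom this tail has a closed form, so it can be checked with the standard library alone. The sketch below covers only those two cases; `chi2_sf` is a hypothetical helper name. For a chi-square of 4.867 with 1 df the tail probability is about 0.027.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square statistic (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")

# Omnibus statistics reported for Cases 4, 5 and 6.
for label, chi_square, df in [("Case 4", 29.31, 1), ("Case 5", 4.867, 1), ("Case 6", 4.636, 2)]:
    print(f"{label}: chi-square = {chi_square}, df = {df}, p = {chi2_sf(chi_square, df):.3f}")
```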

Page 14: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 5)

Variables in the Equation

           B        S.E.    Wald     df   Sig.    Exp(B) = OR
SS         0.704    0.320   4.852    1    0.028   2.022
Constant   -1.087   0.215   25.627   1    0.00    0.337

Smoking status (SS) is a significant variable.

Z = -1.087 + 0.704 (SS)

exp(0.704) = 2.02 means that the odds of a low-birth-weight baby for mothers who smoke during pregnancy are 2.02 times the odds for mothers who do not smoke, provided all other factors are kept constant.

Page 15: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 5)

Z = -1.087 + 0.704 (SS)

$P(\text{LBW}) = \dfrac{1}{1 + e^{-Z}}$

SS    Z        Prob
Yes   -0.383   0.41
No    -1.087   0.25

[Chart: probability of LBW by smoking status (SS) of mothers: Yes 0.41, No 0.25]
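The two probabilities follow from the Case 5 equation, and they also show why Exp(B) is a ratio of odds rather than a ratio of probabilities. A minimal sketch; the names `p_lbw`, `odds`, and `risk_ratio` are illustrative, and the risk ratio is an extra quantity not shown on the slides.

```python
import math

B0, B_SS = -1.087, 0.704  # Case 5: Z = -1.087 + 0.704 (SS)

def p_lbw(ss):
    """Probability of low birth weight for smoker (ss=1) or non-smoker (ss=0)."""
    z = B0 + B_SS * ss
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

p_smoker, p_non_smoker = p_lbw(1), p_lbw(0)
odds_ratio = odds(p_smoker) / odds(p_non_smoker)   # equals exp(0.704)
risk_ratio = p_smoker / p_non_smoker               # ratio of probabilities

print(f"P(LBW | smoker)     = {p_smoker:.2f}")
print(f"P(LBW | non-smoker) = {p_non_smoker:.2f}")
print(f"odds ratio = {odds_ratio:.3f}, risk ratio = {risk_ratio:.2f}")
```

The odds double for smokers, while the probability itself rises from about 0.25 to about 0.41.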

Page 16: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 6)

In this case, we study the impact of race (black, white, other) on birth weight (low, not low).

Dependent variable: Birth Weight Status (BWS)

Explanatory variable: Race

Page 17: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 6)

We create two dummy variables for race, with "other" as the reference category:

Race1: white = 1, black = 0, other = 0

Race2: white = 0, black = 1, other = 0
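The dummy coding can be written as a small function. This is only a sketch of the coding scheme stated above; the function name `race_dummies` and the text labels are assumptions, since the slides do not show how race is stored in the data file.

```python
def race_dummies(race):
    """Return (race1, race2) for a race label; 'other' is the reference category."""
    label = race.strip().lower()
    if label not in {"white", "black", "other"}:
        raise ValueError(f"unexpected race label: {race!r}")
    race1 = 1 if label == "white" else 0
    race2 = 1 if label == "black" else 0
    return race1, race2

for race in ["White", "Black", "Other"]:
    print(race, race_dummies(race))
```

A variable with three categories needs only two dummies, because the third category is identified by both dummies being 0.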

Page 18: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 6)

Omnibus Tests of Model Coefficients

        Chi-square   df   Sig.
Step    4.636        2    0.098
Block   4.636        2    0.098
Model   4.636        2    0.098

The p-value (0.098) indicates that the model is significant at the 10% level but not at the 5% level.

Model Summary

Step   -2 Log likelihood   Cox & Snell R Square
1      230.036             0.024

Cox & Snell R Square indicates the explanatory power of the model.

Classification Table

                        Predicted Birth Weight
Observed Birth Weight   Not low   Low
Not low                 130       0
Low                     59        0

Proportion correctly classified: (130 + 0) / 189 = 0.69.

Page 19: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 6)

Variables in the Equation

           B        S.E.    Wald    df   Sig.    Exp(B) = OR
race1      -0.599   0.347   2.973   1    0.085   0.549
race2      0.232    0.470   0.244   1    0.621   1.261
Constant   -0.542   0.252   4.650   1    0.031   0.581

The Exp(B) column gives the odds ratios. Neither race dummy is significant at the 5% level (Sig. = 0.085 and 0.621).

Z = -0.542 - 0.599 race1 + 0.232 race2

Page 20: Logistic Regression(Cases 4-6)

Binary logistic regression (Case 6)

Z = -0.542 - 0.599 race1 + 0.232 race2

$P(\text{LBW}) = \dfrac{1}{1 + e^{-Z}}$

Race    Z       Prob
White   -1.14   0.24
Black   -0.31   0.42
Other   -0.54   0.37

[Chart: probability of LBW between different races: White 0.242, Black 0.423, Other 0.368]
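The three probabilities can be reproduced by setting the two dummies for each race and applying the logistic function. A minimal sketch using the Case 6 coefficients; the names `DUMMIES` and `p_lbw_by_race` are illustrative.

```python
import math

B0, B_RACE1, B_RACE2 = -0.542, -0.599, 0.232  # Case 6 coefficients

# (race1, race2) for each category; "Other" is the reference category.
DUMMIES = {"White": (1, 0), "Black": (0, 1), "Other": (0, 0)}

def p_lbw_by_race(race):
    """Return (Z, P(LBW)) for one of the three race categories."""
    race1, race2 = DUMMIES[race]
    z = B0 + B_RACE1 * race1 + B_RACE2 * race2
    return z, 1.0 / (1.0 + math.exp(-z))

for race in DUMMIES:
    z, p = p_lbw_by_race(race)
    print(f"{race:<6} Z = {z:5.2f}  P(LBW) = {p:.3f}")
```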

Page 21: Logistic Regression(Cases 4-6)

Classical regression vs logistic regression

Classical Regression

1. Model significance is tested by the t or F test.
2. Coefficients are estimated by least squares.
3. Interpretation is straightforward.
4. Explanatory power of the model is determined by R².

Logistic Regression

1. Model significance is tested by chi-square.
2. Coefficients are estimated by maximum likelihood.
3. Interpretation is through the odds ratio.
4. Explanatory power of the model is determined by pseudo R².