Logit powerpoint

  • 7/30/2019 Logit powerpoint

    Introduction to Logistic Regression

    John Whitehead

    Department of Economics

    Appalachian State University

  • Outline

    Introduction and Description

    Some Potential Problems and Solutions

    Writing Up the Results

  • Introduction and Description

    Why use logistic regression?

    Estimation by maximum likelihood

    Interpreting coefficients

    Hypothesis testing

    Evaluating the performance of the model

  • Why use logistic regression?

    There are many important research topics for which the dependent variable is "limited."

    For example: voting, morbidity or mortality, and participation data are not continuous or normally distributed.

    Binary logistic regression is a type of regression analysis where the dependent variable is a dummy variable: coded 0 (did not vote) or 1 (did vote).

  • The Linear Probability Model

    In the OLS regression:

    Y = α + βX + e ; where Y = (0, 1)

    The error terms are heteroskedastic

    e is not normally distributed because Y takes on only two values

    The predicted probabilities can be greater than 1 or less than 0
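
The last problem is easy to reproduce. The sketch below fits the linear probability model by closed-form OLS on a small made-up data set (the `X` and `Y` values are hypothetical, not the hurricane data) and lists the fitted values that fall outside the 0 to 1 range.

```python
# Hypothetical toy data (not from the slides): one regressor X and a 0/1 outcome Y.
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Closed-form OLS estimates for Y = alpha + beta*X + e.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
s_xx = sum((x - mean_x) ** 2 for x in X)
beta = s_xy / s_xx
alpha = mean_y - beta * mean_x

# Fitted "probabilities" from the linear probability model.
fitted = [alpha + beta * x for x in X]
out_of_range = [round(f, 3) for f in fitted if f < 0 or f > 1]

print("alpha, beta:", round(alpha, 3), round(beta, 3))
print("fitted values outside [0, 1]:", out_of_range)
```

With these toy values the smallest and largest X produce fitted values below 0 and above 1, which cannot be read as probabilities.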

  • An Example: Hurricane Evacuations

    Q: EVAC

    Did you evacuate your home to go someplace safer before Hurricane Dennis (Floyd) hit?

    1 YES

    2 NO

    3 DON'T KNOW

    4 REFUSED

  • The Data

    EVAC  PETS  MOBLHOME  TENURE  EDUC
    0     1     0         16      16
    0     1     0         26      12
    0     1     1         11      13
    1     1     1         1       10
    1     0     0         5       12
    0     0     0         34      12
    0     0     0         3       14
    0     1     0         3       16
    0     1     0         10      12
    0     0     0         2       18
    0     0     0         2       12
    0     1     0         25      16
    1     1     1         20      12

  • OLS Results

    Dependent Variable: EVAC

    Variable       B        t-value
    (Constant)     0.190     2.121
    PETS          -0.137    -5.296
    MOBLHOME       0.337     8.963
    TENURE        -0.003    -2.973
    EDUC           0.003     0.424
    FLOYD          0.198     8.147

    R²             0.145
    F-stat        36.010

  • Problems:

    Predicted Values outside the 0,1 range

    Descriptive Statistics

                                      N      Minimum    Maximum   Mean
    Unstandardized Predicted Value    1070   -.08498    .76027    .2429907

  • Heteroskedasticity

    [Figure: scatter plot of Unstandardized Residual (vertical axis, -20 to 10) against TENURE (horizontal axis, 0 to 100)]

    Park Test

    Dependent Variable: LNESQ

    Variable       B        t-stat
    (Constant)    -2.34    -15.99
    LNTNSQ        -0.20     -6.19

  • The Logistic Regression Model

    The "logit" model solves these problems:

    ln[p/(1-p)] = α + βX + e

    p is the probability that the event Y occurs, p(Y=1)

    p/(1-p) is the odds of the event

    ln[p/(1-p)] is the log odds, or "logit"
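
As a small illustration of these two transformations, the sketch below computes the odds and the logit for a few probabilities. The function names `odds` and `logit` are chosen here for illustration only.

```python
import math

def odds(p):
    """Odds of the event: p / (1 - p)."""
    return p / (1.0 - p)

def logit(p):
    """Log odds (logit): ln[p / (1 - p)]."""
    return math.log(odds(p))

# The logit is negative below p = .5, zero at p = .5, and positive above it.
for p in (0.1, 0.5, 0.9):
    print(p, round(odds(p), 4), round(logit(p), 4))
```

The logit is unbounded in both directions, which is what allows it to be modeled as a linear function of X.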

  • More:

    The logistic distribution constrains the estimated probabilities to lie between 0 and 1.

    The estimated probability is:

    p = 1/[1 + exp(-α - βX)]

    if you let α + βX = 0, then p = .50

    as α + βX gets really big, p approaches 1

    as α + βX gets really small (large and negative), p approaches 0
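
These three properties can be checked numerically. The sketch below assumes hypothetical values α = 0 and β = 1, so that α + βX is simply X.

```python
import math

def logistic_p(alpha, beta, x):
    """Estimated probability: p = 1 / [1 + exp(-alpha - beta*x)]."""
    return 1.0 / (1.0 + math.exp(-alpha - beta * x))

alpha, beta = 0.0, 1.0  # hypothetical coefficients, chosen so that alpha + beta*x = x

p_zero = logistic_p(alpha, beta, 0.0)    # alpha + beta*x = 0
p_big = logistic_p(alpha, beta, 20.0)    # alpha + beta*x large and positive
p_small = logistic_p(alpha, beta, -20.0) # alpha + beta*x large and negative

print(p_zero, p_big, p_small)
```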

  • Comparing LP and Logit Models

    [Figure: LP Model and Logit Model curves plotted against the 0 and 1 bounds]

  • Maximum Likelihood Estimation (MLE)

    MLE is a statistical method for estimating the coefficients of a model.

    The likelihood function (L) measures the probability of observing the particular set of dependent variable values (p1, p2, ..., pn) that occur in the sample:

    L = Prob (p1 * p2 * ... * pn)

    The higher the L, the higher the probability of observing the ps in the sample.
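
A minimal sketch of this idea, assuming a binary outcome and the logistic form for p: each observation contributes p when Y = 1 and 1 - p when Y = 0, and the log likelihood is the sum of the logs of those contributions. The data and the two coefficient pairs are hypothetical.

```python
import math

# Hypothetical toy data (not from the slides).
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

def prob(alpha, beta, x):
    """Logistic probability that Y = 1 given x."""
    return 1.0 / (1.0 + math.exp(-alpha - beta * x))

def log_likelihood(alpha, beta):
    """Log of the product of the per-observation probabilities of the observed outcomes."""
    ll = 0.0
    for x, y in zip(X, Y):
        p = prob(alpha, beta, x)
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return ll

ll_flat = log_likelihood(0.0, 0.0)    # every p equals .5
ll_slope = log_likelihood(-5.5, 1.0)  # p rises with x, as the data suggest

print(round(ll_flat, 3), round(ll_slope, 3))
```

Both values are negative, and the coefficients that track the data give the larger (less negative) log likelihood.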

  • MLE involves finding the coefficients (α, β) that make the log of the likelihood function (LL < 0) as large as possible

    Or, finds the coefficients that make -2 times the log of the likelihood function (-2LL) as small as possible

    The maximum likelihood estimates solve the following condition:

    Σ {Y - p(Y=1)}Xi = 0

    summed over all observations, i = 1, ..., n
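
One common way to solve this condition is Newton-Raphson iteration. The sketch below is an illustration on hypothetical toy data with an intercept and one regressor, not the routine SPSS uses; after fitting, it checks that the condition holds for both the constant and X.

```python
import math

# Hypothetical toy data (not from the slides); the outcomes overlap, so the MLE is finite.
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

def prob(a, b, x):
    """Logistic probability that Y = 1 given x."""
    return 1.0 / (1.0 + math.exp(-a - b * x))

def fit_logit(X, Y, iterations=25):
    """Newton-Raphson for a logit model with an intercept (a) and one slope (b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [prob(a, b, x) for x in X]
        # Score: sums of (y - p) and (y - p) * x; both are zero at the maximum.
        g0 = sum(y - pi for y, pi in zip(Y, p))
        g1 = sum((y - pi) * x for x, y, pi in zip(X, Y, p))
        # Information matrix (negative Hessian of LL) with weights p(1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, X))
        h11 = sum(wi * x * x for wi, x in zip(w, X))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

a_hat, b_hat = fit_logit(X, Y)
score_const = sum(y - prob(a_hat, b_hat, x) for x, y in zip(X, Y))
score_x = sum((y - prob(a_hat, b_hat, x)) * x for x, y in zip(X, Y))

print(round(a_hat, 3), round(b_hat, 3), score_const, score_x)
```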

  • Interpreting Coefficients

    Since:

    ln[p/(1-p)] = α + βX + e

    The slope coefficient (β) is interpreted as the rate of change in the "log odds" as X changes, which is not very useful.

    Since:

    p = 1/[1 + exp(-α - βX)]

    The marginal effect of a change in X on the probability is:

    ∂p/∂X = f(α + βX)β

    where f is the logistic density, so that ∂p/∂X = βp(1-p)
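
For the logistic model the marginal effect works out to βp(1 - p), so it depends on where it is evaluated. The sketch below uses hypothetical coefficients and compares the formula with a numerical derivative.

```python
import math

def prob(alpha, beta, x):
    """Logistic probability: p = 1 / [1 + exp(-alpha - beta*x)]."""
    return 1.0 / (1.0 + math.exp(-alpha - beta * x))

def marginal_effect(alpha, beta, x):
    """dp/dX = beta * p * (1 - p): the logistic density at alpha + beta*x, times beta."""
    p = prob(alpha, beta, x)
    return beta * p * (1.0 - p)

alpha, beta = -1.0, 0.5  # hypothetical coefficients

# Central-difference check of the analytical marginal effect at x = 3.
h = 1e-6
numeric = (prob(alpha, beta, 3.0 + h) - prob(alpha, beta, 3.0 - h)) / (2 * h)
analytic = marginal_effect(alpha, beta, 3.0)

print(analytic, numeric)
print(marginal_effect(alpha, beta, 2.0))  # evaluated where p = .5, the effect is beta / 4
```

The effect is largest at p = .5 and shrinks toward zero as p approaches 0 or 1, which is why it is usually reported at a stated point such as the sample means.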

  • An interpretation of the logit coefficient which is usually more intuitive is the "odds ratio"

    Since:

    [p/(1-p)] = exp(α + βX)

    exp(β) is the multiplicative effect of a one-unit change in the independent variable on the odds p/(1-p), i.e. the "odds ratio"

  • From SPSS Output:

    Variable       B         Exp(B)    1/Exp(B)
    PETS          -0.6593    0.5172    1.933
    MOBLHOME       1.5583    4.7508
    TENURE        -0.0198    0.9804    1.020
    EDUC           0.0501    1.0514
    Constant      -0.916

    The PETS coefficient is negative, so the odds of evacuating are 1.933 times higher for households without pets than for households with pets.
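
The Exp(B) and 1/Exp(B) columns can be reproduced from the B column. The sketch below uses the rounded coefficients shown in the table, so the last digit can differ slightly from the SPSS printout.

```python
import math

# Logit coefficients (B) as reported in the SPSS output above.
coefficients = {
    "PETS": -0.6593,
    "MOBLHOME": 1.5583,
    "TENURE": -0.0198,
    "EDUC": 0.0501,
}

# exp(B) is the odds ratio for a one-unit increase in the variable.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, ratio in odds_ratios.items():
    line = f"{name:9s} Exp(B) = {ratio:.4f}"
    if ratio < 1.0:
        # For negative coefficients, 1/Exp(B) expresses the effect as a ratio above 1.
        line += f"   1/Exp(B) = {1.0 / ratio:.3f}"
    print(line)
```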

  • Hypothesis Testing

    The Wald statistic for the β coefficient is:

    Wald = [β/s.e.B]²

    which is distributed chi-square with 1 degree of freedom.

    The "Partial R" (in SPSS output) is

    R = {[(Wald-2)/(-2LL(α))]}^(1/2)

  • An Example:

    Variable     B         S.E.     Wald      R         Sig      t-value
    PETS        -0.6593    0.2012   10.732   -0.1127    0.0011   -3.28
    MOBLHOME     1.5583    0.2874   29.39     0.1996    0         5.42
    TENURE      -0.0198    0.008     6.1238  -0.0775    0.0133   -2.48
    EDUC         0.0501    0.0468    1.1483   0.0000    0.2839    1.07
    Constant    -0.916     0.69      1.7624   1         0.1843   -1.33
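
The Wald, Sig, R, and t-value columns can be approximately recomputed from B and S.E. The sketch below makes two assumptions that are inferred from the table rather than stated in the deck: R takes the sign of B, and R is set to 0 when Wald is below 2. It also uses 687.35714, the beginning -2 log likelihood reported elsewhere in the deck, as -2LL(α). Because B and S.E. are rounded, results match the printout only to two or three digits.

```python
import math

NEG2LL_CONSTANT_ONLY = 687.35714  # -2LL(alpha): beginning -2 log likelihood from the deck

# (B, S.E.) pairs from the example table.
estimates = {
    "PETS": (-0.6593, 0.2012),
    "MOBLHOME": (1.5583, 0.2874),
    "TENURE": (-0.0198, 0.008),
    "EDUC": (0.0501, 0.0468),
}

def wald(b, se):
    """Wald statistic: the squared ratio of the coefficient to its standard error."""
    return (b / se) ** 2

def wald_p_value(w):
    """Upper-tail probability of a chi-square variable with 1 df: erfc(sqrt(w / 2))."""
    return math.erfc(math.sqrt(w / 2.0))

def partial_r(b, w, neg2ll0=NEG2LL_CONSTANT_ONLY):
    """Partial R with the sign of B; 0 when Wald < 2 (assumed convention, see lead-in)."""
    if w < 2.0:
        return 0.0
    return math.copysign(math.sqrt((w - 2.0) / neg2ll0), b)

results = {}
for name, (b, se) in estimates.items():
    w = wald(b, se)
    results[name] = {"t": b / se, "wald": w, "sig": wald_p_value(w), "r": partial_r(b, w)}
    print(name, {key: round(value, 4) for key, value in results[name].items()})
```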

  • Evaluating the Performance of the Model

    There are several statistics which can be used for comparing alternative models or evaluating the performance of a single model:

    Model Chi-Square

    Percent Correct Predictions

    Pseudo-R²

  • Model Chi-Square

    The model likelihood ratio (LR) statistic is

    LR[i] = -2[LL(α) - LL(α, β)]

    {Or, as you are reading SPSS printout:

    LR[i] = [-2LL (of beginning model)] - [-2LL (of ending model)]}

    The LR statistic is distributed chi-square with i degrees of freedom, where i is the number of independent variables

    Use the Model Chi-Square statistic to determine if the overall model is statistically significant.

  • An Example:

    Beginning Block Number 1. Method: Enter

    -2 Log Likelihood 687.35714

    Variable(s) Entered on Step Number
    1..      PETS      PETS
             MOBLHOME  MOBLHOME
             TENURE    TENURE
             EDUC      EDUC

    Estimation terminated at iteration number 3 because
    Log Likelihood decreased by less than .01 percent.

    -2 Log Likelihood 641.842

             Chi-Square   df   Sign.
    Model    45.515       4    0.0000
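
The model chi-square in this printout is the difference between the two -2 log likelihood values. The sketch below recomputes it and, as an illustration, obtains the significance level from a closed-form chi-square tail probability that is valid only for even degrees of freedom. The helper name `chi2_sf_even_df` is made up here.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^j / j! for j = 0 .. df/2 - 1.
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

neg2ll_beginning = 687.35714  # constant-only model
neg2ll_ending = 641.842       # model with PETS, MOBLHOME, TENURE, EDUC

model_chi_square = neg2ll_beginning - neg2ll_ending
p_value = chi2_sf_even_df(model_chi_square, df=4)

print(round(model_chi_square, 3), p_value)
```

The p-value is far below .0001, which is why the printout shows 0.0000.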

  • Percent Correct Predictions

    The "Percent Correct Predictions" statistic assumes that if the estimated p is greater than or equal to .5 then the event is expected to occur and not occur otherwise.

    By assigning these probabilities 0s and 1s and comparing these to the actual 0s and 1s, the % correct Yes, % correct No, and overall % correct scores are calculated.

  • An Example:

                 Predicted
    Observed     0      1      % Correct
    0            328    24     93.18%
    1            139    44     24.04%
    Overall                    69.53%
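
The three percentages follow directly from the four cell counts. The sketch below recomputes them; the `classify` helper shows the .5 cutoff rule and is included only for illustration.

```python
def classify(p, cutoff=0.5):
    """Predict 1 when the estimated probability is at least the cutoff, else 0."""
    return 1 if p >= cutoff else 0

# Classification table from the example: rows are observed, columns are predicted.
table = {
    0: {0: 328, 1: 24},   # observed 0 (did not evacuate)
    1: {0: 139, 1: 44},   # observed 1 (evacuated)
}

def percent_correct(table):
    """Return (% correct No, % correct Yes, overall % correct)."""
    correct_no = table[0][0] / (table[0][0] + table[0][1])
    correct_yes = table[1][1] / (table[1][0] + table[1][1])
    total = sum(sum(row.values()) for row in table.values())
    overall = (table[0][0] + table[1][1]) / total
    return 100 * correct_no, 100 * correct_yes, 100 * overall

pct_no, pct_yes, pct_overall = percent_correct(table)
print(round(pct_no, 2), round(pct_yes, 2), round(pct_overall, 2))
```

The model classifies non-evacuees well but misses most evacuees, which the overall figure alone would hide.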

  • Pseudo-R²

    One pseudo-R² statistic is the McFadden's-R² statistic:

    McFadden's-R² = 1 - [LL(α,β)/LL(α)]

    {= 1 - [-2LL(α,β)/-2LL(α)] (from SPSS printout)}

    where the R² is a scalar measure which varies between 0 and (somewhat close to) 1, much like the R² in a LP model.

  • An Example:

    Beginning -2 LL        687.36
    Ending -2 LL           641.84
    Ending/Beginning       0.9338
    McF. R² = 1 - E./B.    0.0662
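
The same arithmetic as a short sketch, using the two -2 LL values from the example. The function name `mcfadden_r2` is chosen here for illustration.

```python
def mcfadden_r2(neg2ll_ending, neg2ll_beginning):
    """McFadden's R-squared: 1 - [-2LL(ending model) / -2LL(beginning model)]."""
    return 1.0 - neg2ll_ending / neg2ll_beginning

ratio = 641.84 / 687.36          # Ending / Beginning
r2 = mcfadden_r2(641.84, 687.36)

print(round(ratio, 4), round(r2, 4))
```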

  • Some potential problems and solutions

    Omitted Variable Bias

    Irrelevant Variable Bias

    Functional Form

    Multicollinearity

    Structural Breaks

  • Omitted Variable Bias

    Omitted variable(s) can result in bias in the coefficient estimates. To test for omitted variables you can conduct a likelihood ratio test:

    LR[q] = {[-2LL(constrained model, i=k-q)] - [-2LL(unconstrained model, i=k)]}

    where LR is distributed chi-square with q degrees of freedom, with q = 1 or more omitted variables

    {This test is conducted automatically by SPSS if you specify "blocks" of independent variables}

  • An Example:

    Variable       B        Wald      Sig
    PETS          -0.699    10.968    0.001
    MOBLHOME       1.570    29.412    0.000
    TENURE        -0.020     5.993    0.014
    EDUC           0.049     1.079    0.299
    CHILD          0.009     0.011    0.917
    WHITE          0.186     0.422    0.516
    FEMALE         0.018     0.008    0.928
    Constant      -1.049     2.073    0.150

    Beginning -2 LL    687.36
    Ending -2 LL       641.41

  • Constructing the LR Test

    Ending -2 LL Partial Model    641.84
    Ending -2 LL Full Model       641.41
    Block Chi-Square              0.43
    DF                            3
    Critical Value                11.345

    Since the chi-squared value is less than the critical value the set of coefficients is not statistically significant. The full model is not an improvement over the partial model.

  • Irrelevant Variable Bias

    The inclusion of irrelevant variable(s) can result in poor model fit.

    You can consult your Wald statistics or conduct a likelihood ratio test.

  • Functional Form

    Errors in functional form can result in biased coefficient estimates and poor model fit.

    You should try different functional forms by logging the independent variables, adding squared terms, etc.

    Then consult the Wald statistics and model chi-square statistics to determine which model performs best.

  • Multicollinearity

    The presence of multicollinearity will not lead to biased coefficients.

    But the standard errors of the coefficients will be inflated.

    If a variable which you think should be statistically significant is not, consult the correlation coefficients.

    If two variables have a correlation coefficient greater than about .6, .7, or .8, then try dropping the least theoretically important of the two.

  • Structural Breaks

    You may have structural breaks in your data. Pooling the data imposes the restriction that an independent variable has the same effect on the dependent variable for different groups of data when the opposite may be true.

    You can conduct a likelihood ratio test:

    LR[i+1] = -2LL(pooled model) - [-2LL(sample 1) + -2LL(sample 2)]

    where samples 1 and 2 are pooled, and i is the number of independent variables.

  • An Example

    Is the evacuation behavior from Hurricanes Dennis and Floyd statistically equivalent?

                        Floyd     Dennis    Pooled
    Variable            B         B         B
    PETS               -0.66     -1.20     -0.79
    MOBLHOME            1.56      2.00      1.62
    TENURE             -0.02     -0.02     -0.02
    EDUC                0.05     -0.04      0.02
    Constant           -0.92     -0.78     -0.97

    Beginning -2 LL     687.36    440.87    1186.64
    Ending -2 LL        641.84    382.84    1095.26
    Model Chi-Square    45.52     58.02     91.37

  • Constructing the LR Test

                      Floyd     Dennis    Pooled
    Ending -2 LL      641.84    382.84    1095.26

    Chi-Square        70.58     [Pooled - (Floyd + Dennis)]
    DF                5
    Critical Value    15.086    p = .01

    Since the chi-squared value is greater than the critical value the sets of coefficients are statistically different. The pooled model is inappropriate.
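
The test statistic can be rebuilt from the three ending -2 LL values. In the sketch below the degrees of freedom are the four independent variables plus the constant, and 15.086 is the tabulated chi-square critical value at p = .01 for 5 degrees of freedom; the variable names are chosen here for illustration.

```python
# Ending -2 log likelihood values from the Floyd, Dennis, and pooled models.
neg2ll = {"Floyd": 641.84, "Dennis": 382.84, "Pooled": 1095.26}

n_independent_variables = 4  # PETS, MOBLHOME, TENURE, EDUC

# LR[i+1] = -2LL(pooled) - [-2LL(sample 1) + -2LL(sample 2)]
lr = neg2ll["Pooled"] - (neg2ll["Floyd"] + neg2ll["Dennis"])
df = n_independent_variables + 1  # one restriction per coefficient, including the constant

CRITICAL_01_DF5 = 15.086  # tabulated chi-square critical value, p = .01, 5 df
reject_pooling = lr > CRITICAL_01_DF5

print(round(lr, 2), df, reject_pooling)
```

The statistic is several times the critical value, so the conclusion that the pooled model is inappropriate does not hinge on the exact cutoff.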

  • What should you do?

    Try adding a dummy variable:

    FLOYD = 1 if Floyd, 0 if Dennis

    Variable       B        Wald     Sig
    PETS          -0.85     27.20    0.000
    MOBLHOME       1.75     65.67    0.000
    TENURE        -0.02      8.34    0.004
    EDUC           0.02      0.27    0.606
    FLOYD          1.26     59.08    0.000
    Constant      -1.68      8.71    0.003

  • Writing Up Results

    Present descriptive statistics in a table

    Make it clear that the dependent variable is discrete (0, 1) and not continuous and that you will use logistic regression.

    Logistic regression is a standard statistical procedure so you don't (necessarily) need to write out the formula for it. You also (usually) don't need to justify that you are using Logit instead of the LP model or Probit (similar to logit but based on the normal distribution [the tails are less fat]).

  • An Example:

    "The dependent variable which measures the willingness to evacuate is EVAC. EVAC is equal to 1 if the respondent evacuated their home during Hurricanes Floyd and Dennis and 0 otherwise. The logistic regression model is used to estimate the factors which influence evacuation behavior."

  • Organize your regression results in a table:

    In the heading, state your dependent variable (dependent variable = EVAC) and that these are "logistic regression results."

    Present coefficient estimates, t-statistics (or Wald, whichever you prefer), and (at least the) model chi-square statistic for overall model fit

    If you are comparing several model specifications you should also present the % correct predictions and/or Pseudo-R² statistics to evaluate model performance

    If you are comparing models with hypotheses about different blocks of coefficients or testing for structural breaks in the data, you could present the ending log-likelihood values.

  • An Example:

    Table 2. Logistic Regression Results

    Dependent Variable = EVAC

    Variable             B          B/S.E.
    PETS                -0.6593    -3.28
    MOBLHOME             1.5583     5.42
    TENURE              -0.0198    -2.48
    EDUC                 0.0501     1.07
    Constant            -0.916     -1.33

    Model Chi-Squared    45.515

  • When describing the statistics in the tables, point out the highlights for the reader.

    What are the statistically significant variables?

    "The results from Model 1 indicate that coastal residents behave according to risk theory. The coefficient on the MOBLHOME variable is positive and statistically significant at the p < .01 level (t-value = 5.42). Mobile home residents are 4.75 times more likely to evacuate.

  • Is the overall model statistically significant?

    The overall model is significant at the .01 level according to the Model chi-square statistic. The model predicts 69.5% of the responses correctly. The McFadden's R² is .066."

  • Which model is preferred?

    "Model 2 includes three additional independent variables. According to the likelihood ratio test statistic, the partial model is superior to the full model in terms of overall model fit. The block chi-square statistic is not statistically significant at the .01 level (critical value = 11.35 [df=3]). The coefficients on the children, gender, and race variables are not statistically significant at standard levels."

  • Also

    You usually don't need to discuss the magnitude of the coefficients, just the sign (+ or -) and statistical significance.

    If your audience is unfamiliar with the extensions (beyond SPSS or SAS printouts) to logistic regression, discuss the calculation of the statistics in an appendix or footnote or provide a citation.

    Always state the degrees of freedom for your likelihood-ratio (chi-square) test.

  • References

    http://personal.ecu.edu/whiteheadj/data/logit/

    http://personal.ecu.edu/whiteheadj/data/logit/logitpap.htm

    E-mail: [email protected]