Page 1:

7.4 DV's and Groups

Often we want to know whether two different groups follow the same or different regression functions.

-One way to test this is to use dummy variables (DV's) to allow ALL intercepts and slopes to differ between the two groups, and test whether all DV terms equal 0:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \delta_0 DV + \delta_1 DV x_1 + \delta_2 DV x_2 + \dots + \delta_k DV x_k + u$$

$$H_0: \delta_0 = 0, \ \delta_1 = 0, \ \delta_2 = 0, \ \dots, \ \delta_k = 0$$

$$H_a: H_0 \text{ is not true}$$

Page 2:

7.4 DV's and Groups

As we've already seen in our discussion of F tests, the significance or insignificance of one variable is quite separate from the joint significance of a set of variables.

-Therefore an F test must be done using the restricted model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$

However, with many x variables, this approach requires a large number of interaction variables, and may be difficult
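The F test described above can be sketched numerically. The following is a minimal illustration on simulated data, not part of the original slides: the helper `ssr`, the variable names, and the data-generating values are all assumptions chosen for the example. It fits the restricted model (no DV terms) and the unrestricted model (DV plus every DV-by-x interaction) and forms the usual F statistic from the two SSRs.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from OLS of y on X (X already holds a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

rng = np.random.default_rng(0)
n, k = 200, 2                                   # hypothetical sample size and regressors
x = rng.normal(size=(n, k))
dv = (np.arange(n) % 2).astype(float)           # hypothetical group dummy
u = rng.normal(size=n)

# Simulated data: the groups differ in the intercept and in the slope on x1.
y = 1.0 + 0.5 * x[:, 0] - 0.3 * x[:, 1] + 0.8 * dv + 0.6 * dv * x[:, 0] + u

const = np.ones((n, 1))
X_r = np.hstack([const, x])                             # restricted: no DV terms
X_ur = np.hstack([X_r, dv[:, None], dv[:, None] * x])   # unrestricted: DV + interactions

q = k + 1                                       # number of restrictions tested
df_ur = n - 2 * (k + 1)                         # residual df of the unrestricted model
F_dv = ((ssr(X_r, y) - ssr(X_ur, y)) / q) / (ssr(X_ur, y) / df_ur)
print(F_dv)
```

The statistic would be compared with the critical value for k+1 and n-2(k+1) degrees of freedom. With k regressors the unrestricted design already has 2(k+1) columns, which is why this approach becomes unwieldy as k grows.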

Page 3:

7.4 DV's and Groups

Alternately, one can run the regression in question (without DV's) separately for the first and second group, and record SSR1 and SSR2

-a full regression is then run with all observations included to find SSRP

-a test F statistic is then formed as:

$$F = \frac{[SSR_P - (SSR_1 + SSR_2)] / (k+1)}{(SSR_1 + SSR_2) / [n - 2(k+1)]}$$

Which is usually called the CHOW STATISTIC and is only valid under homoskedasticity
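The three regressions behind the Chow statistic can be sketched as follows. This is a minimal illustration on simulated data: the function names `ssr` and `chow_statistic`, the group indicator `in_group_1`, and all numeric values are assumptions made for the example.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from OLS of y on X (X already holds a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def chow_statistic(X, y, in_group_1):
    """Chow F statistic; X has k+1 columns, in_group_1 is a boolean array."""
    n, p = X.shape                                   # p = k + 1
    ssr_p = ssr(X, y)                                # all observations together
    ssr_1 = ssr(X[in_group_1], y[in_group_1])        # first group only
    ssr_2 = ssr(X[~in_group_1], y[~in_group_1])      # second group only
    numerator = (ssr_p - (ssr_1 + ssr_2)) / p
    denominator = (ssr_1 + ssr_2) / (n - 2 * p)
    return numerator / denominator

rng = np.random.default_rng(0)
n, k = 200, 2                                        # hypothetical sizes
x = rng.normal(size=(n, k))
in_group_1 = np.arange(n) % 2 == 0                   # hypothetical group split
X = np.column_stack([np.ones(n), x])
# Simulated data: the groups share slopes but differ in the intercept.
y = X @ np.array([1.0, 0.5, -0.3]) + 0.8 * in_group_1 + rng.normal(size=n)

F_chow = chow_statistic(X, y, in_group_1)
print(F_chow)
```

Because SSR1 + SSR2 equals the SSR of a single regression that includes the DV and all of its interactions, this statistic is numerically the same as the F statistic from the DV approach; only the bookkeeping differs.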

Page 4:

7.4 DV's and Groups

This F value is compared to F* from our tables with k+1 and n-2(k+1) degrees of freedom.

-Note that no valid R2 form of this equation exists

-the null hypothesis as listed allows for no difference between groups

-if it is not rejected, the two groups are statistically indistinguishable (at the chosen α)

-if one wants to allow for an INTERCEPT difference between the two groups, the full regression is run with a single DV to distinguish the groups

Page 5:

7.5 Binary Dependent Variables

-Thus far we have only considered QUANTITATIVE values for our dependent variable, but we can also have dependent variables analyzing a QUALITATIVE event

-ie: failing or passing the Midterm

-in the simplest case, y has 2 outcomes

-ie: MT=1 if passed, =0 otherwise

$$MT = \beta_0 + \beta_1 Study + \beta_2 Intel + \beta_3 Bribe + u$$

Page 6:

7.5 Binary Dependent Variables

-Here the expected value of y is the probability of "success", that is,

$$P(y=1|X) = E(y|X) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$$

-This is also called the RESPONSE PROBABILITY

-Since probabilities must sum to one, we also have that

$$P(y=0|X) = 1 - P(y=1|X)$$

-the regression with a binary dependent variable is also called the LINEAR PROBABILITY MODEL (LPM)

Page 7:

7.5 Binary Dependent Variables

-in the LPM, βj can no longer be interpreted as the change in y given a one unit change in xj. Instead,

$$\Delta P(y=1|X) = \beta_j \Delta x_j \qquad (7.28)$$

-Our estimated regression becomes:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k$$

-Where yhat is our predicted probability of "success" and βjhat predicts the change in the probability of success due to a one unit increase in xj

Page 8:

7.5 Binary Example

-Assume that our above example regressed as:

$$\widehat{MT} = 0.38 + 0.01\,Study + 0.005\,Intel + 0.04\,Bribe$$

-This reflects some LIMITATIONS of the LPM:

1) If Bribe (expressed in tens of thousands of dollars)=100 ($1 million), then MThat>4 (a more than 400% chance of passing). Ie: estimated probabilities can be negative or over 1.

2) This assumes that the probability increase from the first hour of studying (1 percentage point) is the same as the probability increase from the 49th hour (1 percentage point).
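Both limitations can be checked with simple arithmetic. The sketch below reads the fitted equation as an intercept of 0.38 with coefficients 0.01 on Study, 0.005 on Intel and 0.04 on Bribe; the positive signs are an assumption, although the Study and Bribe signs agree with the numbers quoted in the two limitations. The function name `predict_mt` and the input values are hypothetical.

```python
def predict_mt(study, intel, bribe):
    """Predicted probability of passing from the assumed fitted LPM."""
    return 0.38 + 0.01 * study + 0.005 * intel + 0.04 * bribe

# Limitation 1: a $1 million bribe (Bribe = 100) gives a "probability" above 1.
p_big_bribe = predict_mt(study=0, intel=0, bribe=100)

# Limitation 2: the 1st and the 49th hour of study add the same amount.
first_hour = predict_mt(1, 0, 0) - predict_mt(0, 0, 0)
hour_49 = predict_mt(49, 0, 0) - predict_mt(48, 0, 0)

print(p_big_bribe, first_hour, hour_49)
```

Holding Study and Intel at zero is only for illustration; any non-negative values would push the first prediction even further above 1.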

Page 9:

7.5 LPM Fixing

-One way around this is to redefine predicted values:

$$\tilde{y} = \begin{cases} 1 & \text{if } \hat{y} \ge 0.5 \\ 0 & \text{if } \hat{y} < 0.5 \end{cases}$$

-One advantage of this redefinition is we can obtain a new goodness-of-fit measure as PERCENT CORRECTLY PREDICTED

-as now true and predicted values are both either 0 or 1

-the number of matches over the number of observations is our goodness of fit
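The percent correctly predicted can be computed in a few lines. This is a minimal sketch with made-up numbers: the function name `percent_correctly_predicted` and the five observations are illustrative assumptions, and fitted values at exactly 0.5 are counted as a predicted success.

```python
import numpy as np

def percent_correctly_predicted(y, y_hat, cutoff=0.5):
    """Share of observations whose thresholded prediction matches the outcome."""
    y_tilde = (np.asarray(y_hat) >= cutoff).astype(int)   # redefined prediction
    return float(np.mean(y_tilde == np.asarray(y)))

# Hypothetical outcomes and LPM fitted values (note two lie outside [0, 1]).
y = np.array([1, 0, 1, 1, 0])
y_hat = np.array([0.8, 0.3, 0.4, 1.2, -0.1])

pcp = percent_correctly_predicted(y, y_hat)
print(pcp)  # 4 of the 5 observations match, so 0.8
```

Thresholding also sidesteps fitted values below 0 or above 1, since they are mapped to 0 and 1 respectively.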

Page 10:

7.5 LPM and Heteroskedasticity

-Because y is binary, the LPM does violate one Gauss-Markov assumption:

$$Var(y|X) = p(X)[1 - p(X)] \qquad (7.30)$$

-where p(X) is short for the probability of success

-therefore, heteroskedasticity exists and MLR.5 is violated

-therefore t and F statistics are not valid until heteroskedasticity is corrected for, as discussed in chapter 8
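The variance formula follows in two steps from the fact that y only takes the values 0 and 1. A short derivation, writing p(X) for P(y=1|X):

```latex
% y is 0 or 1, so y^2 = y and therefore E(y^2 | X) = E(y | X) = p(X).
\begin{aligned}
Var(y|X) &= E(y^2|X) - [E(y|X)]^2 \\
         &= p(X) - p(X)^2 \\
         &= p(X)\,[1 - p(X)]
\end{aligned}
```

Since p(X) changes with X whenever any slope coefficient is nonzero, the variance cannot be constant across observations, which is exactly the violation of MLR.5.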

Page 11:

7.5 DV Party

-It is also possible to include Dummy Variables as both the dependent variable and as independent variables, for example,

$$MT = \beta_0 + \beta_1 Study + \beta_2 Intel + \beta_3 CheatSheet + u$$

-Where CheatSheet=1 if a cheat sheet is prepared, =0 otherwise

-in this case the estimated coefficient of the independent DV gives the increase in the probability of success relative to the base case

Page 12:

7.6 Policy Analysis and Program Evaluation

-Policy Analysis and Program Evaluation is generally done using the regression and hypothesis test:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\delta}_{DV} DV$$

$$H_0: \delta_{DV} = 0 \qquad H_a: \delta_{DV} \neq 0$$

-Where the DV represents the group possibly needing a program or participation in the program

-If H0 is not rejected, the program is not needed or is not effective

Page 13:

7.6 Evaluation and Analysis Difficulties

This evaluation and analysis process has two inherent difficulties:

1) Omitted Variables

-if a variable correlated with the DV is omitted, the DV's estimated coefficient and its test are invalid

-due to past group discrimination, group membership is often correlated with factors such as income, education, etc.

2) Self-Selection Problem

-often participation in a program is not random

Page 14:

7.6 Omitted Example

Consider the following equation:

$$MT = \beta_0 + \beta_1 Study + \beta_2 Intel + \beta_3 ClassSection + u$$

-Where we are testing whether midterm achievement is a function of which class section one is in

-However, the choice of class can depend on when you eat lunch (among other factors), which can affect Midterm achievement

-Therefore by not including eating (which is correlated with our DV), our estimate is biased

Page 15:

7.6 Sample Selection Problems

Program evaluation MUST assume that participation in the program or group (and thus inclusion in the control group) is random

-however, people often CHOOSE inclusion or non-inclusion (ie: people choose to study, people choose to speed)

-since these decisions are influenced by parts of the error term, our OLS estimation is biased:

$$E(u \mid DV = 1) \neq E(u \mid DV = 0)$$

Ie: one doesn't study due to their drug (Survivor) addiction, something that they may not report and thus cannot be included in the regression
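A small simulation can make the self-selection bias concrete. Everything in this sketch is an assumption made for illustration: the program has a true effect of zero, participation is driven partly by the unobserved error u, and a randomly assigned dummy is included for comparison.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
u = rng.normal(size=n)                 # unobserved error term
true_effect = 0.0                      # the program does nothing in this simulation

# Self-selection: a high unobserved u makes participation more likely.
dv_selected = (u + rng.normal(size=n) > 0).astype(float)
# Random assignment, for comparison.
dv_random = (rng.random(n) < 0.5).astype(float)

def dv_coefficient(dv):
    """OLS coefficient on the dummy in a regression of y on a constant and dv."""
    y = 1.0 + true_effect * dv + u
    X = np.column_stack([np.ones(n), dv])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(beta[1])

b_selected = dv_coefficient(dv_selected)   # far from 0: picks up E(u|DV=1) - E(u|DV=0)
b_random = dv_coefficient(dv_random)       # close to 0: the true effect

print(b_selected, b_random)
```

With random assignment the estimate stays near the true effect, while under self-selection OLS attributes the difference in the unobserved u between participants and non-participants to the program.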