

Who Does Not Respond in the Household Expenditure Survey: An Exercise in Extended Gini Regressions

Edna SCHECHTMAN
Department of Industrial Engineering and Management, Ben-Gurion University, Beer Sheva, Israel ([email protected])

Shlomo YITZHAKI
Hebrew University and Central Bureau of Statistics, Jerusalem 95464, Israel ([email protected])

Yevgeny ARTSEV
Central Bureau of Statistics, Jerusalem 95464, Israel ([email protected])

A nonparametric multiple regression method, based on the extended Gini that depends on one parameter, ν, is investigated. The parameter ν enables production of an infinite number of alternative linear approximations to the regression curve, which differ in the weighting schemes applied to the slopes of the curve. The method allows the investigator to stress different sections of one independent variable while keeping the treatment of the other independent variables intact. As an application we investigate nonresponse patterns in a survey of household expenditures to learn about the relationship between nonresponse and income. The empirical results show that the higher the income, the higher the response rate, and the larger the household, the higher the response rate. The Arab population tends to respond more than the Jewish one, whereas the ultrareligious group tends to respond less than the rest of the population. The implications for the bias in the estimates are discussed.

KEY WORDS: Extended Gini; Nonresponse; Regression.

1. INTRODUCTION

The purpose of this article is to present a new, nonparametric multiple regression method, based on the extended Gini (EG) measures of dispersion, for investigating the curvature of a regression curve. The method is capable of estimating a series of linear approximations of the regression curve, allowing the investigator to stress different sections along the range of one independent variable, while keeping the weighting scheme of other independent variables intact. This property enables the researcher to learn whether the conditional regression curve is linear, convex, or concave in one independent variable, given the weighting scheme applied to the slopes of the other independent variables. The method is based on the extended Gini family that depends on one parameter, ν. The choice of this parameter enables the user to produce an infinite number of alternative estimators of the regression curve. The main purpose of the method from an econometric point of view is descriptive. One major advantage is that it can treat each independent variable individually, concentrating on the (conditional) relationship between the dependent variable and one independent variable, given the other independent variables. All regressions rely on all observations, eliminating the need to arbitrarily define windows or omit observations (but there is arbitrariness in the choice of ν). The drawback of the EG approach is that it is not based on a direct derivation as are other derivative estimators (Powell, Stock, and Stoker 1989).

The methodology is illustrated by investigating the tendency not to respond to questionnaires on finances of the household in official surveys. The common wisdom with respect to this issue is that either or both rich and poor people tend to respond less than ordinary people. Since we have no firm priors with respect to the kind of relationship we expect to see, the need for a nonparametric method that can analyze the data arises.

The main conclusion of the empirical application is that nonresponse to the survey of family expenditures in Israel is a decreasing convex function of income, and almost reaches a plateau when high-income groups are stressed. Nonresponse tends to be negatively related to household size. The nonresponse rate differs among ethnic groups: the Arab population shows a below-average nonresponse rate, while the ultrareligious Jewish group has an above-average nonresponse rate. This result holds even when the response rate is adjusted for income and household size.

Because the estimators in this article are based on the sample’s analogs of the population parameters, we will use capital letters to represent population parameters and lowercase letters to represent the estimators.

The article is structured as follows: Section 2 presents the nonparametric regression coefficients in the simple regression case. Section 3 extends it to a multiple regression framework, and Section 4 presents the estimators and their standard errors. Readers who are not interested in the exact derivation of the estimates and their standard errors can skip Section 4. Section 5 presents the data and the research question, Section 6 presents the empirical results, and Section 7 evaluates the implication of nonresponse for the measurement of inequality. Section 8 concludes.

© 2008 American Statistical Association
Journal of Business & Economic Statistics
July 2008, Vol. 26, No. 3
DOI 10.1198/00


2. SIMPLE REGRESSION CASE

Let $(Y, X)$ be a bivariate random variable with expected values $\mu_Y$ and $\mu_X$ and finite variances $\sigma_Y^2$ and $\sigma_X^2$, respectively, and let $g(x) = E\{Y \mid X = x\}$ be the regression curve. The error term at $(Y_i, X_i)$ is defined as the deviation of $Y_i$ from a linear approximation $\alpha + \beta X_i$, that is, $\varepsilon_i = Y_i - \alpha - \beta X_i$, where α and β are parameters to be defined later. Aside from the properties implied by the assumptions on $(X_i, Y_i)$, no additional assumptions are imposed on the error term, and no structure is imposed on the regression curve. In particular, this means that $E\{\varepsilon \mid X = x\} = g(x) - \alpha - \beta x$, which is equal to zero for all values of x only if g(x) is a linear function of x.

We are interested in estimating a linear approximation to the regression curve, g(x). We start with the parameter representing the slope, β, and only later deal with the constant term. The parameter representing the slope of the linear approximation of the regression curve will be referred to as the EGRC (extended Gini regression coefficient). The slope will be defined as a weighted average of the derivatives of g(x), with the weights being derived from the extended Gini variability index. In some sense, the approach in this article can be viewed as similar to the one presented in Angrist, Chernozhukov, and Fernández-Val (2004), who analyzed the linear approximation of a misspecified model in a quantile regression approach.

The extended Gini variability index is a member of a family of indices defined by

$$G(X, \nu) = -(\nu + 1)\,\mathrm{COV}\bigl(X, [1 - F(X)]^{\nu}\bigr), \qquad \nu > -1,\ \nu \neq 0.$$

By determining ν, the investigator introduces his preference concerning the measurement of variability of the independent variable. The role of ν in the extended Gini variability index is to reflect the investigator’s attitude toward variability. The higher ν is, the more stress is put on the lower portion of the distribution of the independent variable. In the extreme case (ν → ∞), the investigator cares only about the lowest part of the cumulative distribution, as if he is guided by the max–min criterion. If ν = 1, then the investigator measures variability according to Gini’s mean difference, implying a symmetric weighting scheme around the median. If ν → 0, the investigator does not care about variability; the range −1 < ν < 0 reflects giving higher weights to the upper side of the distribution of the independent variable, while ν → −1 implies an investigator whose attitude to variability follows the max–max strategy, that is, caring about variability around the highest part of the distribution only. It is worth noting that when −1 < ν < 0, the index of variability is negative. (See Donaldson and Weymark 1983; Yitzhaki 1983; and Chakravarty 1988, chap. 3, pp. 82–102, for a description of the properties of the extended Gini index. Garner 1993, Lerman and Yitzhaki 1994, and Wodon and Yitzhaki 2002 are examples of its decomposition and use in welfare economics; see Araar and Duclos 2003 for a possible extension; see Davidson and Duclos 1997 for statistical inference, and Millimet and Slottje 2002 for an application in environmental economics.) However, in the above-mentioned literature, the parameter is restricted to ν > 0. Schechtman and Yitzhaki (1987, 1999, 2003) defined and investigated the properties of the equivalents of the covariance and the correlation. Yitzhaki and Schechtman (2005) offered a survey of the properties of the EG family and, in particular, showed the metric that leads to the EG. The decomposition of the extended Gini of a sum of random variables into the contributions of the extended Ginis of the individual random variables and the (equivalents of) correlations among them can also be found there. Olkin and Yitzhaki (1992) defined the simple Gini regression coefficients and investigated their properties. Serfling and Xiao (2006) defined and investigated the properties of multivariate L-moments, which include the EG moments as a special case.
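The index defined above can be computed from a sample in a few lines. The following is a minimal sketch, not the authors' estimator: it assumes no ties, replaces F by the midpoint empirical distribution (rank − 0.5)/n so that 1 − F stays strictly inside (0, 1) when ν is negative, and uses a covariance with divisor n. The function names and the income figures are hypothetical.

```python
def survival(x):
    """Empirical 1 - F(x_i) from ranks; assumes no ties in x."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    s = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Midpoint convention (an assumption): keeps 1 - F inside (0, 1),
        # so a negative nu never raises 0 to a negative power.
        s[i] = 1.0 - (rank - 0.5) / n
    return s


def cov(a, b):
    """Sample covariance with divisor n."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / n


def extended_gini(x, nu):
    """G(X, nu) = -(nu + 1) * COV(X, [1 - F(X)]**nu), nu > -1, nu != 0."""
    if nu <= -1 or nu == 0:
        raise ValueError("nu must satisfy nu > -1 and nu != 0")
    return -(nu + 1) * cov(x, [s ** nu for s in survival(x)])


incomes = [12.0, 15.0, 19.0, 24.0, 31.0, 45.0, 80.0]  # made-up data
for nu in (-0.5, 0.5, 1.0, 3.0):
    print(nu, extended_gini(incomes, nu))
```

On this sample the index is positive for ν > 0 and negative for −1 < ν < 0, in line with the sign remark in the text.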

It is worth mentioning that the parameters of the simple regression have already been in use for almost two decades in two major economic fields: welfare economics and risk aversion. In these fields the investigator who cares about the effect of a policy on the margin is interested in a weighted average of slopes between adjacent observations, weighted by the marginal utility (or the social evaluation of the marginal utility in welfare economics) of income on a curve. There is no need, from the economic model point of view, to assume or to require linearity, and the assumption of linearity is introduced in the estimation process by the econometrician (who may be the same person). In those cases the weighting scheme (i.e., the choice of ν) should be determined by the social views (or risk aversion) that guide the investigator. For a survey on the applications in the area of welfare economics, see Wodon and Yitzhaki (2002), whereas the applications in finance are demonstrated in Shalit and Yitzhaki (2002).

Definition and Properties of the Extended Gini Regression Coefficients (following Yitzhaki 1996, prop. 3).

(a) The extended Gini regression coefficients (EGRC) are defined as

$$b_{yx}(\nu) = \frac{-(\nu + 1)\,\mathrm{COV}\bigl(Y, [1 - F_x(X)]^{\nu}\bigr)}{-(\nu + 1)\,\mathrm{COV}\bigl(X, [1 - F_x(X)]^{\nu}\bigr)}, \qquad \nu > -1,\ \nu \neq 0, \tag{1}$$

where ν is a parameter chosen by the investigator to determine the weighting scheme. [The factor −(ν + 1) cancels out. It is presented here to emphasize that both numerator and denominator are based on extended Ginis.] The subscript yx will be omitted in the simple regression case.

(b) Equation (2) presents the EGRC as a function of the derivatives of g(x) and shows how ν determines the weighting scheme. That is,

$$\beta(\nu) = \int W(x, \nu)\, g'(x)\, dx, \tag{2}$$

with $W(x, \nu) \geq 0$ and $\int_{-\infty}^{\infty} W(x, \nu)\, dx = 1$, where

$$W(x, \nu) = \frac{[1 - F_x(x)] - [1 - F_x(x)]^{\nu + 1}}{\int_{-\infty}^{\infty} \bigl\{[1 - F_x(t)] - [1 - F_x(t)]^{\nu + 1}\bigr\}\, dt}. \tag{3}$$

Yitzhaki (1996) presented the weights for specific distributions of the independent variable.

(c) Let $x_{[i]}$ be the ith-order statistic, and let $y_i$ be the observation of y that accompanies $x_{[i]}$. Then, the estimators of β(ν), for all ν, can be expressed as weighted averages of slopes defined by pairs of adjacent observations:

$$b(\nu) = \sum_{i=1}^{n-1} w_i b_i, \tag{4}$$

where

$$b_i = \frac{y_{i+1} - y_i}{x_{[i+1]} - x_{[i]}}, \quad i = 1, \ldots, n - 1; \qquad w_i > 0, \quad \sum_{i=1}^{n-1} w_i = 1,$$

and

$$w_i = w_i(x, \nu) = \frac{[n^{\nu}(n - i) - (n - i)^{\nu + 1}]\, \Delta x_i}{\sum_{k=1}^{n-1} [n^{\nu}(n - k) - (n - k)^{\nu + 1}]\, \Delta x_k}, \tag{5}$$

where $\Delta x_i = x_{[i+1]} - x_{[i]}$ is the spacing that appears in the denominator of $b_i$.

(d) The estimators b(ν) can be expressed as ratios of U-statistics. (Generally speaking, a U-statistic is an unbiased estimator for the parameter, which is created by forming an average of symmetric functions, called kernels. The interested reader is referred to Randles and Wolfe 1979, chap. 3, for details.) As such, b(ν) are consistent estimators of β(ν); for large samples, the distributions of the estimators converge to the normal distribution under regularity conditions.

(e) Suppose that $E(Y \mid X) = \alpha + \beta X$ and $\mathrm{var}(Y \mid X) = \sigma^2 < \infty$; then (1) following property (b), β(ν) = β for all ν, and (2) all extended Gini estimators b(ν) (i.e., regardless of ν) are consistent estimators of the same β.

Proof. See Yitzhaki (1996, prop. 3).
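Properties (c) and (e) can be checked numerically. The sketch below is illustrative only: it assumes no ties in x, takes the spacing in the weights to be the gap between the same two adjacent order statistics that define each slope, and uses a hypothetical function name and made-up data. On exactly linear data every adjacent slope equals β, so b(ν) returns β for any ν.

```python
def egrc_from_slopes(x, y, nu):
    """b(nu) as a weighted average of adjacent slopes, following (4)-(5)."""
    pairs = sorted(zip(x, y))              # order observations by x (no ties assumed)
    n = len(pairs)
    raw, slopes = [], []
    for i in range(1, n):                  # i = 1, ..., n-1 as in the text
        (x_lo, y_lo), (x_hi, y_hi) = pairs[i - 1], pairs[i]
        dx = x_hi - x_lo
        slopes.append((y_hi - y_lo) / dx)  # b_i
        raw.append((n ** nu * (n - i) - (n - i) ** (nu + 1)) * dx)
    total = sum(raw)
    weights = [r / total for r in raw]     # positive and summing to one
    return sum(w * b for w, b in zip(weights, slopes)), weights


x = [1.0, 2.0, 4.0, 7.0, 11.0]             # made-up data
y_linear = [3.0 + 2.0 * v for v in x]
for nu in (-0.5, 0.5, 1.0, 4.0):
    slope, w = egrc_from_slopes(x, y_linear, nu)
    print(nu, slope, w)
```

The printed weights show how a larger ν shifts mass toward the slopes at the low end of x, while the slope itself stays at 2 because the data are exactly linear.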

The properties above show that all EGRC can be expressed as weighted averages of slopes defined between adjacent observations, with the weighting scheme being determined by the ν attached to the independent variable, and by the distribution of the independent variable.

By changing ν and reestimating the model, the investigator can learn about the curvature of the regression curve. The higher ν is, the higher is the weight that is given to the slopes of the regression curve at the lower end of the range of the independent variable. In case the curve is linear, the estimates of the regression coefficient, for all ν, do not differ significantly. If b(ν) turns out to be a declining (increasing) function of ν, then the regression curve is convex (concave). But of course, it may be that it does not show a specific pattern. Also, because it is based on all observations, it is clearly not able to detect small local deviations. (In the Conclusions section, we offer an extension to deal with this issue as well.) Note that an alternative approach is to use kernel estimation. Both methods have some arbitrariness involved in them. In the kernel estimation, one chooses the width of the window and the kernel function, whereas in EG one chooses the parameter ν. The important issue is that the choice of ν generates a weighting scheme instead of a window width and/or kernel function.
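The curvature diagnostic can be tried on noise-free curves. This is a minimal sketch under stated assumptions: the grid, the two test curves, and the function name are hypothetical, ties are excluded, and the slope is computed in the adjacent-slope form with weights proportional to [n^ν(n − i) − (n − i)^(ν+1)] times the spacing.

```python
def egrc(x, y, nu):
    """b(nu) as a weighted average of adjacent slopes; assumes no ties in x."""
    pts = sorted(zip(x, y))
    n = len(pts)
    num = den = 0.0
    for i in range(1, n):
        c = n ** nu * (n - i) - (n - i) ** (nu + 1)
        num += c * (pts[i][1] - pts[i - 1][1])   # c_i * dx_i * b_i = c_i * dy_i
        den += c * (pts[i][0] - pts[i - 1][0])   # c_i * dx_i
    return num / den


x = [0.1 * k for k in range(1, 41)]      # hypothetical grid on (0, 4]
convex = [v ** 2 for v in x]             # slopes rise with x
concave = [v ** 0.5 for v in x]          # slopes fall with x
nus = (-0.5, 0.5, 1.0, 3.0, 6.0)
b_convex = [egrc(x, convex, nu) for nu in nus]
b_concave = [egrc(x, concave, nu) for nu in nus]
print(b_convex)    # declines as nu grows
print(b_concave)   # rises as nu grows
```

Because a larger ν stresses the lower end of x, the estimated slope falls with ν on the convex curve and rises with ν on the concave one. With noisy data the comparison would also require the standard errors, which this sketch does not compute.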

Although it was not produced through an optimization, each extended Gini regression produces a normal equation. To see this, define

$$\varepsilon = \varepsilon(\nu) = Y - \alpha - \beta(\nu) X. \tag{6}$$

Then,

$$\mathrm{COV}\bigl(\varepsilon, [1 - F(X)]^{\nu}\bigr) = 0. \tag{7}$$

Equation (7) can be proved by plugging (6) and (1) into (7). By the same reasoning, it can be shown that the sample’s version of (7) holds.
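The sample version of (7) is easy to confirm numerically. In this sketch the helper names and data are hypothetical, F is replaced by the midpoint empirical distribution (an assumption that keeps 1 − F positive), and the constant term is left out because it does not affect a covariance.

```python
def survival(x):
    """Empirical 1 - F(x_i) using midpoint ranks; assumes no ties."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    s = [0.0] * n
    for rank, i in enumerate(order, start=1):
        s[i] = 1.0 - (rank - 0.5) / n
    return s


def cov(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / n


def normal_equation_gap(x, y, nu):
    """Return (b(nu), COV(e, [1 - F(x)]**nu)) with e = y - b(nu) * x."""
    h = [s ** nu for s in survival(x)]
    b = cov(y, h) / cov(x, h)                     # sample analog of (1)
    resid = [yi - b * xi for xi, yi in zip(x, y)]
    return b, cov(resid, h)


x = [1.0, 2.5, 3.0, 4.2, 6.0, 7.1, 9.0, 12.0]     # made-up data
y = [2.0, 2.2, 4.1, 3.9, 6.5, 6.0, 9.2, 9.8]
for nu in (-0.5, 0.5, 1.0, 3.0):
    print(nu, normal_equation_gap(x, y, nu))
```

The second element of each printed pair is zero up to floating-point rounding, whatever ν is chosen.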

We now move to define the constant term. Unlike other regression methods, the constant term is not estimated simultaneously with the regression coefficients, but follows it. Hence the constant term need not be selected according to the methodology used to derive the regression coefficients. The constant term depends on the function of the residuals that is being minimized. To do that one first derives an error term without taking into account the constant term. Minimizing the sum of squared deviations of the error term from a constant yields an estimated linear approximation that passes through the means of the variables, whereas minimizing the sum of the absolute deviations of the error term from a constant forms a constant term so that the estimated linear approximation will pass through the medians, and so on.
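The two-step treatment of the constant term can be sketched as follows. The slope is taken as given (here a hypothetical value for made-up data), the error term is first formed without a constant, and the constant is then the mean or the median of that error term, depending on the loss being minimized. The function name is an assumption.

```python
import statistics


def constant_terms(x, y, b):
    """Constants implied by squared-error and absolute-error criteria, given slope b."""
    r = [yi - b * xi for xi, yi in zip(x, y)]   # error term without a constant
    a_mean = statistics.mean(r)                 # minimizes sum of squared deviations
    a_median = statistics.median(r)             # minimizes sum of absolute deviations
    return a_mean, a_median


x = [1.0, 2.0, 4.0, 7.0, 11.0]                  # made-up data
y = [2.0, 3.0, 3.0, 8.0, 9.0]
b = 0.76                                        # slope assumed already estimated
a_mean, a_median = constant_terms(x, y, b)
print(a_mean, a_median)
```

With the mean-based constant the fitted line passes through the point of means; with the median-based constant the residuals have median zero.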

Finally, we mention the following:

(a) The OLS can be presented in the same way, except that the weighting scheme is derived from the variance of the independent variable (Yitzhaki 1998). See Heckman, Urzua, and Vytlacil (2006) for a comprehensive study of the properties, applications, and implications of the weighting scheme in an OLS context.

(b) As far as we know, the EG regression is the only regression method in which the investigator controls the weights attached to each part of the distribution of the independent variable. For example, under quantile regression (Koenker and Bassett 1978), the weighting scheme is applied to the residual $e_i$. (To see this, note that $e_i = Y_i - \alpha - \beta X_i$ is the residual, and hence the optimization must be applied to a function of the residuals. Therefore, the quantile is a quantile of the residual.) As a result, the EG regression can trace the curvature of a curve, whereas quantile regression cannot.

(c) Numerically, the extended Gini regression coefficient, when ν = 1, is identical to Durbin’s (1954) suggested estimator, which is based on using the rank of the independent variable as an instrumental variable. In the sample, the rank is given by the empirical distribution multiplied by the sample size, hence the identity. However, the motivation, the distributions of the estimators, and other properties are totally different. One can take advantage of this analogy and extend it to all EGRCs, and use it to calculate the estimates (but not the standard errors) of the EGRC using standard regression software. Note, however, that this interpretation does not apply without qualification to the multiple regression case (as discussed at the end of the next section).

(d) Having offered an infinite number of variability measures, and, as a result, an infinite number of correlations and regression coefficients between a dependent variable and an independent variable, the natural question that arises is how to choose the appropriate one, that is, how to choose ν. The answer to this question relies on the specific application. If the investigator believes that the model is truly linear, or that the underlying distribution is the bivariate normal distribution (so that the slope of $E\{Y \mid X = x\}$ with respect to x is a constant), then the suggested methodology is redundant. If, on the other hand, the investigator is interested in applying a theoretical model that does not require the assumption of a linear model but an asymmetric summary statistic that represents a weighted average of slopes, weighted by his views, then the choice of ν should come from that theory. As mentioned earlier, there are at least two fields that fit the above description. When dealing with optimal taxation, the effect of a change of a tax on the social welfare function depends on the social evaluation of the marginal utility of income. Then, the regression coefficient is a weighted average of the slopes defined between adjacent observations of the Engel curve, weighted by the social evaluation of the marginal utility of income (Wodon and Yitzhaki 2002). Another area in which the above description holds is that of risk aversion, where the relevant coefficients are weighted averages of slopes of the returns of individual assets with respect to the return of the portfolio, whereas the weighting scheme is derived from the risk aversion of the investor (Shalit and Yitzhaki 2002). Note, however, that a full theory that presents the application in finance is yet to be presented. Another possibility is that the investigator is interested in performing some sensitivity analysis. In this case the natural choice is to start with ν = 1 (the symmetric Gini mean difference, GMD) and then to choose several ν’s to perform some sensitivity analysis.
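The instrumental-variable reading in item (c) can be illustrated with a small sketch. It assumes no ties and a midpoint empirical distribution, and the helper names and data are hypothetical. The simple IV slope with instrument z is COV(y, z)/COV(x, z); taking z to be the rank of x reproduces the ν = 1 coefficient, and taking z = [1 − F(x)]^ν gives the point estimate for any other ν (not its standard error).

```python
def cov(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / n


def ranks(x):
    """Rank of each observation (1 = smallest); assumes no ties."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def iv_slope(x, y, z):
    """Simple instrumental-variable slope with instrument z."""
    return cov(y, z) / cov(x, z)


def eg_instrument(x, nu):
    """Instrument [1 - F(x)]**nu with F taken as (rank - 0.5)/n (an assumption)."""
    n = len(x)
    return [(1.0 - (r - 0.5) / n) ** nu for r in ranks(x)]


x = [1.0, 2.5, 3.0, 4.2, 6.0, 7.1, 9.0, 12.0]   # made-up data
y = [2.0, 2.2, 4.1, 3.9, 6.5, 6.0, 9.2, 9.8]
b_rank = iv_slope(x, y, ranks(x))                # rank of x as the instrument
b_eg1 = iv_slope(x, y, eg_instrument(x, 1.0))    # extended Gini slope, nu = 1
b_eg3 = iv_slope(x, y, eg_instrument(x, 3.0))    # same recipe, different weighting
print(b_rank, b_eg1, b_eg3)
```

The first two numbers coincide because 1 − F is an affine function of the rank, and a covariance ratio is unchanged by an affine transformation of the instrument.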

3. MULTIPLE REGRESSION CASE

The aim of this section is to develop an extension of the simple regression coefficients suggested in Yitzhaki (1996) to the multiple regression case. This is the main theoretical contribution of the article.

Let $(Y, X_1, \ldots, X_K)$ be a (K + 1)-variate random variable with expected values $(\mu_Y, \mu_1, \ldots, \mu_K)$ and a finite variance–covariance matrix Σ.

As in the simple (univariate) regression case, we assume that the investigator is interested in estimating a linear approximation of the regression curve. However, the concept, and hence the solution, are different. In the univariate case, the linear approximation is a straight line, determined by a constant term and one slope. The slope can be expressed as the weighted average of slopes between adjacent observations of the scattered data (Yitzhaki 1996). In this sense, it is like a differential of a function dy/dx, because it is a weighted average of the derivatives defined between adjacent observations. However, in the multiple regression case, there exists a set of slopes, which are conditional slopes. These slopes may vary along a curve if we change any other independent variable (i.e., the slope of Y on $x_i$ is conditional on the other x’s). Therefore, we suggest the use of a “chain rule,” as we show next.

Assume we have a regression curve defined by

$$y = g(x_1, \ldots, x_K) = E\{Y \mid X_1 = x_1, \ldots, X_K = x_K\}.$$

We are seeking a linear approximation to the curve $y = g(x_1, \ldots, x_K)$ in the K-dimensional space. That is, we are looking for a set of constant slopes $(\beta_1, \ldots, \beta_K)$ and a given point $(y^0, x_1^0, \ldots, x_K^0)$ so that the function $y = g(x_1, \ldots, x_K)$ is approximated by

$$y^A = g^A(x_1, \ldots, x_K) = y^0 + \sum_{k=1}^{K} \beta_k (x_k - x_k^0) = a + \sum_{k=1}^{K} \beta_k x_k,$$

where $a = y^0 - \sum_{k=1}^{K} \beta_k x_k^0$ and $y^0 = g(x_1^0, \ldots, x_K^0)$. The slopes $\beta_i$ are conditional slopes, and can be more precisely written as $\beta_{0i.1,2,\ldots,i-1,i+1,\ldots,K}$, where 0 stands for the dependent variable.

Note that in general, $y^A \neq E\{Y \mid X_1 = x_1, \ldots, X_K = x_K\}$, except for the point $y^0 = g(x_1^0, \ldots, x_K^0)$, and equality holds for all other points only if the model is linear.

As mentioned earlier, the univariate case is different. In the univariate case, a linear approximation of the curve $x_j = h(x_i)$ is a constant slope $\beta_{ji}$ and a given point $(x_j^0, x_i^0)$ so that $x_j^A = x_j^0 + \beta_{ji}(x_i - x_i^0) = a_{ji} + \beta_{ji} x_i$, where $\beta_{ji}$ can be expressed as the weighted average of slopes (i.e., derivatives) between adjacent observations (Yitzhaki 1996).

To sum up, we have a set of data, following an unknown relationship $y = g(x_1, \ldots, x_K)$. We want to approximate the unknown relationship by a linear approximation, which is a constant and a set of slopes, and estimate these parameters. We first estimate the slopes, and the constant is determined later.

The difference between a Taylor expansion and a linear approximation in a regression context is that in the former, the simple and partial derivatives are defined at one point, the point $x^0$, whereas in the latter, estimators of the simple derivatives are based on weighted averages of slopes (i.e., derivatives) defined on the entire range of the independent variable. (This makes them represent a differential and not a derivative.)

For the sake of internal consistency, that is, consistency in the results obtained from the univariate slopes and the multiple regression slopes, we require that the estimators obey a “chain rule.” (This requirement is met in the OLS case, as discussed below.) The following proposition interprets the estimators of the partial derivatives (i.e., conditional slopes).

Proposition. Assume that all unidimensional slopes $\beta_{ji}$, $j = 0, 1, \ldots, K$, $i = 1, \ldots, K$, between $(y, x_1, \ldots, x_K)$ are given. Then, the set of (conditional) slopes $(\beta_1, \ldots, \beta_K)$ is determined by solving a set of linear equations with the (known) constants $\beta_{ji}$.

Proof. Because $y = g(x_1, \ldots, x_K)$, we apply a “chain rule” to get

$$\frac{dy}{dx_i} = \sum_{k=1}^{K} \frac{\partial g}{\partial x_k} \frac{dx_k}{dx_i}, \qquad i = 1, \ldots, K. \tag{8}$$

The univariate slopes, $\beta_{ji} = dx_j/dx_i$, have been defined (as weighted averages of slopes between adjacent observations) in Yitzhaki (1996). Inserting them in the set of equations (8), one gets

$$\beta_{0i} = \sum_{k=1}^{K} \frac{\partial g}{\partial x_k} \beta_{ki}, \qquad i = 1, \ldots, K. \tag{9}$$

We note that in OLS, the partial derivatives $\partial g/\partial x_k$ are in fact the (conditional) regression coefficients, $\beta_k$, as shown in Appendix B. These results are obtained mathematically, by minimizing the variance of the error term. This is the rationale behind the idea to obtain the implied coefficients in the extended Gini regression by solving for $\partial g/\partial x_k$. It turns out, as will be shown later, that the implied coefficients have the same structure as the OLS coefficients, with the obvious adjustments into “Gini” language.

We refer to the estimators of the partial derivatives as “implied partial derivatives.” We do not argue that they represent partial derivatives at the point $x^0$, but rather, that if one interprets the estimators of the simple regression coefficients as representing a weighted average of simple derivatives, then because of consistency considerations implied by (9), the estimators of the partial derivatives are naturally implied.

Consider now a first-order Taylor expansion around zero of the regression curve. By construction, the expansion is linear.

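The slopes of the linear approximation can be written as

$$\begin{pmatrix} \dfrac{dy}{dx_1} \\ \vdots \\ \dfrac{dy}{dx_i} \\ \vdots \\ \dfrac{dy}{dx_K} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial g}{\partial x_1} + \dfrac{\partial g}{\partial x_2}\dfrac{dx_2}{dx_1} + \cdots + \dfrac{\partial g}{\partial x_K}\dfrac{dx_K}{dx_1} \\ \vdots \\ \dfrac{\partial g}{\partial x_1}\dfrac{dx_1}{dx_i} + \cdots + \dfrac{\partial g}{\partial x_K}\dfrac{dx_K}{dx_i} \\ \vdots \\ \dfrac{\partial g}{\partial x_1}\dfrac{dx_1}{dx_K} + \dfrac{\partial g}{\partial x_2}\dfrac{dx_2}{dx_K} + \cdots + \dfrac{\partial g}{\partial x_K} \end{pmatrix}. \tag{10}$$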

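Using the simple regression coefficients developed in Section 2 to represent the simple slopes in the Taylor expansion presented in (10), we now replace $dx_j/dx_k$ by $\beta_{jk}(\nu_k)$, and $dy/dx_k$ by $\beta_{0k}(\nu_k)$, where $k = 1, \ldots, K$ indicate the independent variables. Having done that, we define $Z_1, \ldots, Z_K$ by

$$\begin{pmatrix} \beta_{01}(\nu_1) \\ \vdots \\ \beta_{0i}(\nu_i) \\ \vdots \\ \beta_{0K}(\nu_K) \end{pmatrix} = \begin{pmatrix} Z_1 + \cdots + Z_K \beta_{K1}(\nu_1) \\ \vdots \\ Z_1 \beta_{1i}(\nu_i) + \cdots + Z_K \beta_{Ki}(\nu_i) \\ \vdots \\ Z_1 \beta_{1K}(\nu_K) + \cdots + Z_K \end{pmatrix} = \begin{pmatrix} 1 & \beta_{21}(\nu_1) & \cdots & \beta_{K1}(\nu_1) \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{1K}(\nu_K) & \beta_{2K}(\nu_K) & \cdots & 1 \end{pmatrix} \begin{pmatrix} Z_1 \\ \vdots \\ Z_K \end{pmatrix}. \tag{11}$$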

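Using (11), one can solve for the estimators of the partial derivatives $\partial g/\partial x_k$:

$$\begin{pmatrix} Z_1 \\ \vdots \\ Z_i \\ \vdots \\ Z_K \end{pmatrix} = \begin{pmatrix} 1 & \beta_{21}(\nu_1) & \cdots & \beta_{K1}(\nu_1) \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{1K}(\nu_K) & \beta_{2K}(\nu_K) & \cdots & 1 \end{pmatrix}^{-1} \begin{pmatrix} \beta_{01}(\nu_1) \\ \vdots \\ \beta_{0i}(\nu_i) \\ \vdots \\ \beta_{0K}(\nu_K) \end{pmatrix}. \tag{12}$$

Note that in (12), the vector on the right side depends on all the $\nu_i$. Also, the denominator of row k in the matrix (before inverting it) is

$$G(X_k, \nu_k) = G_k(X) = -(\nu_k + 1)\,\mathrm{COV}\bigl(X_k, [1 - F_k(X_k)]^{\nu_k}\bigr)$$

(i.e., the denominator in each row is the extended Gini of the appropriate variable).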

Note that the presentation is similar in structure to the OLS solution, with each variance replaced by Gini variance, and each covariance replaced by Gini covariance. If (10) represents slopes of a truly linear model, then all the coefficients on the right side of (12) are constants, and the left side must represent by construction the partial derivative of the regression curve, independent of ν. On the other hand, if the regression curve is not linear, then by changing $\nu_i$, one can trace the change in $\partial g/\partial x_i$, other things being equal, by changing the weighting scheme attached to the slopes of variable i. Note that by other things being equal, it is meant that all rows except row i in the matrix of regression coefficients in (12) remain unaffected, and all elements in the vector of simple regression coefficients of the dependent variable on the independent variables except element i do not change. This is a unique property of the EG, which is due to the fact that there are two covariances and two correlations between each pair of random variables (see comment at the end of this section). Therefore, $\beta_{ij}$ can be changed without affecting $\beta_{ji}$. There is one difference with respect to OLS that is worth mentioning: the equivalent of the variance–covariance matrix ceases to be symmetric. We will discuss further adjustments of the estimators to represent changing derivatives along the range of the independent variable in the Conclusions section.
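For K = 2 the system in (12) can be solved by hand, which makes the chain-rule construction concrete. The sketch below is illustrative, not the authors' code: the helper names and data are hypothetical, ties are excluded, and F is replaced by the midpoint empirical distribution. Each simple slope of a variable on x_i uses [1 − F_i(x_i)]^(ν_i) as the weighting variable.

```python
def survival(x):
    """Empirical 1 - F(x_i) using midpoint ranks; assumes no ties."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    s = [0.0] * n
    for rank, i in enumerate(order, start=1):
        s[i] = 1.0 - (rank - 0.5) / n
    return s


def cov(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / n


def simple_slope(target, x, nu):
    """Simple EG slope of `target` on x: COV(target, h) / COV(x, h)."""
    h = [s ** nu for s in survival(x)]
    return cov(target, h) / cov(x, h)


def implied_slopes(y, x1, x2, nu1, nu2):
    """Solve the 2x2 version of (12) for (Z1, Z2)."""
    b01 = simple_slope(y, x1, nu1)
    b02 = simple_slope(y, x2, nu2)
    b21 = simple_slope(x2, x1, nu1)   # slope of x2 on x1, weighted by nu1
    b12 = simple_slope(x1, x2, nu2)   # slope of x1 on x2, weighted by nu2
    det = 1.0 - b21 * b12
    return (b01 - b21 * b02) / det, (b02 - b12 * b01) / det


x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]          # made-up data
x2 = [2.0, 1.0, 4.0, 3.5, 7.0, 5.0, 6.5, 9.0]
y = [1.0 + 2.0 * a - 3.0 * b for a, b in zip(x1, x2)]  # exactly linear
print(implied_slopes(y, x1, x2, 1.0, 3.0))
print(implied_slopes(y, x1, x2, 0.5, 2.0))
```

Because the data are exactly linear, both calls recover the coefficients (2, −3) up to rounding, whichever pair of ν's is used. On a nonlinear surface the two calls would generally differ.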

The first step in this section was to define the theoretical linear approximation of the unknown regression curve. We now move to the estimation procedure. Before we turn to the estimation stage, we shall rewrite the parameters in matrix notation. Let Y be the dependent variable and let $X_1, \ldots, X_K$ be the independent variables. Let V be an (n × K) matrix of power functions of (one minus) the cumulative distributions of $X_1, \ldots, X_K$ (in deviations from their expected values), multiplied by $-(\nu_k + 1)$. That is, a typical element in V is

$$V_{ik} = -(\nu_k + 1)\left\{[1 - F_k(x_{ik})]^{\nu_k} - \frac{1}{\nu_k + 1}\right\}.$$

The vector of regression coefficients, β(ν), is defined by

$$\beta(\nu) = [V'X]^{-1}V'Y, \tag{13}$$

where $\beta(\nu) = \{\beta_1(\nu_1), \ldots, \beta_K(\nu_K)\}$ is a (K × 1) column vector, V is an (n × K) matrix defined above, Y is an (n × 1) column vector of the dependent variable, and X is an (n × K) matrix of the deviations of the independent variables from their expected values. The vector V′Y is a column vector, the elements of which are $-(\nu_k + 1)\,\mathrm{COV}(Y, [1 - F(X_k)]^{\nu_k})$, that is, the EG covariances between the dependent variable and the independent variables. The matrix A = V′X is a matrix with the elements $-(\nu_k + 1)\,\mathrm{COV}(X_j, [1 - F(X_k)]^{\nu_k})$; that is, its elements are EG equivalents of variances and covariances of the independent variables: the diagonal elements are the EG's of the independent variables, and the off-diagonal elements are EG covariances between pairs of independent variables. It is assumed that rank [V′X] equals K, the number of independent variables. The requirement that one can invert the matrix [V′X] implies a restriction on the choice of the independent variables that does not exist in OLS. If the ν's assigned to different independent variables are all equal, then no independent variable can be a monotonic transformation of another independent variable, because it will imply identical rows in the matrix [which depends on X via F(X)]. However, if the ν's are different, then one independent variable can be a monotonic transformation of another independent variable without causing collinearity in the [V′X] matrix. For example, X and $e^X$ cannot appear simultaneously as independent variables in EG regressions if they have the same ν, but they can participate with different ν's.
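The rank restriction can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses NumPy, simulated data without ties, ν > 0, and the rank-based sample analogue of V; the helper name `eg_instruments` is hypothetical and does not come from the article.

```python
import numpy as np

def eg_instruments(x, nu):
    """Rank-based sample analogue of V (hypothetical helper; assumes no ties, nu > 0)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    nu = np.asarray(nu, dtype=float)
    ranks = x.argsort(axis=0).argsort(axis=0) + 1        # ranks 1..n within each column
    return -(nu + 1.0) * (((n - ranks) / n) ** nu - 1.0 / (nu + 1.0))

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x = np.column_stack([x1, np.exp(x1)])                    # second regressor is a monotone transform
xc = x - x.mean(axis=0)                                  # deviations from the means

# Same nu for both variables: identical ranks give identical instruments,
# so the two rows of v'x coincide and the matrix is singular.
rank_same = np.linalg.matrix_rank(eg_instruments(x, [2.0, 2.0]).T @ xc, tol=1e-8)

# Different nu's: the instruments differ and the matrix is invertible.
rank_diff = np.linalg.matrix_rank(eg_instruments(x, [1.0, 3.0]).T @ xc, tol=1e-8)

print(rank_same, rank_diff)
```

With equal ν's the two instrument columns are built from the same ranks, which is exactly the source of the singularity described above.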

We now turn to the estimation step. First, one estimates the regression coefficients, and given the estimated regression coefficients, one moves to estimate the constant term. The natural estimators of the regression coefficients are based on replacing the cumulative distributions by the empirical distributions (which are calculated using ranks):

$$b(\nu) = [v'x]^{-1}v'y, \tag{13$'$}$$


where v is a matrix with elements $-(\nu_k + 1)\left[n^{-\nu_k}(n - r(x_{ik}))^{\nu_k} - 1/(\nu_k + 1)\right]$, and $r(x_{ik})$ is the rank of $x_{ik}$ among $x_{1k}, \ldots, x_{nk}$. As in the simple regression, the multiple regression procedure, although it is not based on an optimization procedure, generates equivalents to the OLS's normal equations. By defining the error term, and substituting for the multiple regression coefficients, it can be shown that

$$\mathrm{COV}(\varepsilon, [1 - F_k(X_k)]^{\nu_k}) = 0 \quad \text{for } k = 1, \ldots, K. \tag{14}$$

Lemma 3.1. Define the vector ε(ν) = Y − Xβ(ν). Then, V′ε(ν) = o, where o is a vector of zeros.

Proof. $V'\varepsilon(\nu) = V'Y - V'X\beta(\nu) = V'Y - V'X[V'X]^{-1}V'Y = o$. This property holds in the sample as v′e(ν) = o, where e(ν) = y − xb(ν).
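The estimator (13′) and the sample version of Lemma 3.1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes NumPy, continuous data without ties, and ν > 0, and the names `eg_instruments` and `eg_regression` as well as the simulated data are hypothetical.

```python
import numpy as np

def eg_instruments(x, nu):
    """Matrix v of (13'): -(nu_k+1) * [((n - r)/n)^nu_k - 1/(nu_k+1)], r = ranks."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    nu = np.asarray(nu, dtype=float)
    ranks = x.argsort(axis=0).argsort(axis=0) + 1        # assumes no ties
    return -(nu + 1.0) * (((n - ranks) / n) ** nu - 1.0 / (nu + 1.0))

def eg_regression(y, x, nu):
    """Return slopes b = [v'x]^{-1} v'y, the instruments v, and residuals e."""
    xc = x - x.mean(axis=0)                              # deviations from the means
    yc = y - y.mean()
    v = eg_instruments(x, nu)
    b = np.linalg.solve(v.T @ xc, v.T @ yc)
    return b, v, yc - xc @ b

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 2))
y = 1.5 * x[:, 0] - 0.7 * x[:, 1] + rng.normal(scale=0.1, size=n)

b, v, e = eg_regression(y, x, nu=[1.0, 3.0])             # a different nu for each regressor
print(b)                                                 # slopes of a truly linear model
print(v.T @ e)                                           # numerically zero, as in Lemma 3.1
```

Because the simulated model is linear, the slopes should be close to the true values whatever ν's are chosen; the orthogonality v′e = o holds by construction.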

As pointed out in the simple regression case, $b(\nu) = \{b_1(\nu_1), \ldots, b_K(\nu_K)\}$ is identical to an instrumental variable estimator in an OLS regression, with a power function of rank X serving as the instrumental variable. This interpretation provides an alternative method for calculating b(ν) by using any standard regression software. Note, however, that (a) because no assumptions are imposed on the model, the sampling distribution of b(ν) still needs to be investigated for the present setup (as detailed in Sec. 4), and (b) each independent variable is substituted by only one instrumental variable, which makes the reliance on a standard regression package problematic.

It is worth emphasizing an important property that is unique to the EG multiple regression. The EG has two correlations defined between each pair of variables, so that in contrast to the OLS, a symmetric correlation is not imposed on the independent variables. When the parameter ν of one independent variable is changed, only one of the two correlations defined with any other independent variable is affected. This property allows us to refer to the partial regression coefficients as representing partial derivatives because changing ν for one independent variable may only change its own slope, and one set of the asymmetric EG correlation coefficients which accompanies it.

Once the EG regression coefficients are estimated, the constant term will be estimated in a way that is identical to the simple regression case. That is, the investigator can minimize a function of the error terms. The exact function used determines whether the regression passes through the mean, the median, or a quantile.

Before we proceed with the properties of the EG regressions, it is worth comparing it to other regression methods. Readers who are not interested in such a comparison can skip the rest of this section. To simplify the comparison we restrict the formal comparison to the Gini mean difference (GMD) regression (i.e., for ν = 1), whereas the extension to the general EG will be done verbally.

Generally speaking, the properties of a regression coefficient can be derived from the variability measure and the target function that were used to derive it. The property that characterizes the (extended) Gini index of variability is that there are two covariances and two correlations between every two variables (for details see Yitzhaki 2003 for the GMD, and Yitzhaki and Schechtman 2005 for the EG). To see this note that if we have two variables, X and Y, then we can have two covariances: $\mathrm{cov}(X, -[1 - F(Y)]^{\nu})$ and $\mathrm{cov}(Y, -[1 - F(X)]^{\nu})$. Therefore, we can apply two sets of constraints that may lead to two different regressions.

Minimizing the Gini of the error term leads to the following set of normal equations:

$$\mathrm{COV}(x_j, F(e)) = 0, \quad j = 1, \ldots, K, \tag{15.M}$$

where F(e) is the cumulative distribution of the error term. It can be solved numerically, leading to a regression known as R-regression (McKean and Hettmansperger 1978; Hettmansperger 1984). However, like quantile and MAD (Koenker and Bassett 1978) regressions, the estimators do not have explicit presentations and one has to solve them numerically. Moreover, it is shown in Olkin and Yitzhaki (1992) that the GMD of the error term is a weighted average of all quantiles of the error term, so that the regression coefficients that can be derived from (15.M) can be compared with quantile and MAD regressions.

The path that we followed in this article is different. Instead of using an optimization method to derive the EG estimators, we simply reverse the role of the variables in the normal equations (15.M). That is, we set

$$\mathrm{COV}(e, F(x_j)) = 0, \quad j = 1, \ldots, K, \tag{15.M1}$$

which is the sample version of (14) for the Gini mean difference (ν = 1). In (15.M1), one can simply imitate the derivation of the partial regression coefficient, as is done in the OLS. Hence, (15.M) is the Gini version of the normal equations that can imitate (and has similar properties to) the MAD and quantile regressions, whereas (15.M1) imitates the normal equations of the OLS.

To illustrate, note that in the simple regression case, the OLS estimator is

$$b^{OLS}_{Y.X} = \frac{\mathrm{cov}(Y, X)}{\mathrm{cov}(X, X)},$$

while the Gini estimator [from (15.M1)] is

$$b^{G}_{Y.X} = \frac{\mathrm{cov}(Y, F(X))}{\mathrm{cov}(X, F(X))}.$$
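The two simple-regression estimators can be compared directly on simulated data. The sketch below is illustrative only: it assumes NumPy, data without ties, and the empirical distribution (ranks divided by n) in place of F; the function names and the lognormal design are hypothetical choices, not taken from the article.

```python
import numpy as np

def ols_slope(y, x):
    """cov(Y, X) / cov(X, X)."""
    return np.cov(y, x)[0, 1] / np.var(x, ddof=1)

def gini_slope(y, x):
    """cov(Y, F(X)) / cov(X, F(X)), with F replaced by the empirical distribution."""
    F = (x.argsort().argsort() + 1) / len(x)             # assumes no ties
    return np.cov(y, F)[0, 1] / np.cov(x, F)[0, 1]

rng = np.random.default_rng(2)
x = rng.lognormal(size=1000)                             # right-skewed regressor

# Linear regression curve: both estimators recover the same slope.
y_lin = 2.0 * x + rng.normal(scale=0.1, size=1000)
ols_lin, gini_lin = ols_slope(y_lin, x), gini_slope(y_lin, x)

# Convex regression curve: the two weighting schemes give different
# linear approximations of the same curve.
y_cvx = x ** 2
ols_cvx, gini_cvx = ols_slope(y_cvx, x), gini_slope(y_cvx, x)

print(ols_lin, gini_lin)
print(ols_cvx, gini_cvx)
```

In the linear case the two slopes agree up to sampling noise; in the convex case they differ because each estimator averages the local slopes with its own weights.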

The OLS does not have the property of having two regression coefficients between Y and X (one as a result of minimization of the variance of the error term, and one by the decomposition of a linear combination of random variables; see Yitzhaki and Schechtman 2005 for details) because the use of the variance imposes a symmetric correlation between random variables even if the distributions are not symmetric, that is, because cov(e, x) = cov(x, e) by construction.

The advantage of the method that we follow over the minimization of the EG of the error term (and quantile and MAD regressions) is that because we did not use optimization, we can treat each independent variable differently and therefore, we can change the weighting scheme of the slopes of one independent variable without changing the weights of the others. Had we used an optimization method, we would have had to treat all independent variables in a symmetric way because the function to be optimized is defined on the residuals. As far as we can see, the approach taken in this article has the advantage that it enables us to mix different approaches. That is, some of the normal equations in the linear equations can be derived using OLS, and others using Gini or extended Gini. This is an advantage because it enables the investigator to check the robustness of the results with respect to the regression method in a gradual instead of an all-or-nothing way. In this article, we use different extended Gini's to derive the “normal equations” of different independent variables.

Another implication of the above approach that seems to be missing in the literature is the issue of the constant term. The determination of the estimator of the constant term that establishes the point through which the regression curve passes can be separated from the determination of the estimators of the slopes. For example, using the above approach, one can produce a regression that is based on OLS slopes but passes through the median (or another quantile). To do this, one minimizes the variance of the error term to get the normal equations of the OLS. Then, having estimated the slopes, one can define a new variable $e_i^* = y_i - (b_1x_{i1} + \cdots + b_Kx_{iK})$ and minimize

$$\min_a \sum_{i=1}^{n} |e_i^* - a|.$$

The constant term will then ensure that the regression line passes through the median.
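The separation of slopes and constant can be sketched as follows. This is a minimal illustration under stated assumptions (NumPy, simulated data with skewed errors, hypothetical variable names); it relies on the standard fact that the sample median minimizes the sum of absolute deviations.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 401                                                  # odd n, so the median is an observation
x = rng.normal(size=(n, 2))
y = 1.0 + x @ np.array([2.0, -1.0]) + rng.exponential(size=n)   # skewed error term

# Step 1: OLS slopes from the normal equations (variables in deviations from means).
xc = x - x.mean(axis=0)
b = np.linalg.lstsq(xc, y - y.mean(), rcond=None)[0]

# Step 2: choose the constant separately from e*_i = y_i - (b_1 x_i1 + ... + b_K x_iK).
e_star = y - x @ b
a_median = np.median(e_star)                             # minimizes sum |e*_i - a|
a_mean = e_star.mean()                                   # the usual OLS constant

def abs_loss(a):
    return np.abs(e_star - a).sum()

above = int((e_star > a_median).sum())
below = int((e_star < a_median).sum())
print(a_mean, a_median, above, below)
```

With the median constant, half of the residuals lie above the fitted line and half below, while the slopes remain the OLS slopes.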

Hence, we do not need to derive the regression coefficients through MAD to produce a regression that passes through the median. (The explanation of this property is that the constant is a location parameter and not a variability measure.) Section 4 derives the asymptotic properties of the estimators. Readers who are mainly interested in the application can skip Section 4.

4. ASYMPTOTIC BEHAVIOR OF THE ESTIMATORS

For simplicity, the presentation of the asymptotic properties will be restricted to the two independent variables case, with different values of ν. All the results can be extended to the K-variable case.

A natural way to estimate the regression coefficients is based on replacing the cumulative distributions by the empirical distributions. Let the estimated equation be

$$y = b(\nu_1, \nu_2)_{01.2}\,x_1 + b(\nu_1, \nu_2)_{02.1}\,x_2 + e(\nu_1, \nu_2), \tag{16}$$

where $b_{0i.j}$ represents the nonparametric EG regression coefficient of the implied partial effect of $X_i$ on the conditional mean of Y, given that $X_j$ is in the model, but held fixed. The constant term is not needed for estimating (16) and it will be estimated later, according to the requirements on where the linear approximation should pass (mean, median, quantile). For simplicity of presentation we will omit, whenever possible, the parameters $\nu_1$ and $\nu_2$, keeping in mind that the partial regression coefficients, b, are functions of those parameters. The vector b, expressed in matrix notation in (13′), can be explicitly written as (for K = 2):

$$b = b(\nu_1, \nu_2) = \begin{pmatrix} b_{01.2} \\ b_{02.1} \end{pmatrix} = \frac{1}{D}\begin{pmatrix} c_{22}c_{01} - c_{21}c_{02} \\ c_{11}c_{02} - c_{01}c_{12} \end{pmatrix}, \tag{17}$$

where $c_{0i} = -(\nu_i + 1)\,\mathrm{cov}(y, [1 - r_i/n]^{\nu_i})$ (i = 1, 2); $c_{ij} = -(\nu_j + 1)\,\mathrm{cov}(x_i, [1 - r_j/n]^{\nu_j})$; $D = c_{11}c_{22} - c_{12}c_{21}$, and $r_i$ is the vector of ranks of $x_i$.
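As a consistency check, the closed form (17) can be compared with the matrix solution of the same two normal equations. The sketch assumes NumPy, simulated data without ties, and ν > 0; the helper `eg_cov` and the data-generating process are hypothetical.

```python
import numpy as np

def eg_cov(a, x, nu):
    """Sample -(nu+1) * cov(a, [1 - r/n]^nu), where r are the ranks of x (no ties assumed)."""
    n = len(x)
    r = x.argsort().argsort() + 1
    return -(nu + 1.0) * np.cov(a, (1.0 - r / n) ** nu)[0, 1]

rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = x1 + 2.0 * x2 + rng.normal(scale=0.2, size=n)
nu1, nu2 = 1.0, 2.0

# Elements of (17): c_0i uses the ranks of x_i; c_ij uses the ranks of x_j.
c01, c02 = eg_cov(y, x1, nu1), eg_cov(y, x2, nu2)
c11, c12 = eg_cov(x1, x1, nu1), eg_cov(x1, x2, nu2)
c21, c22 = eg_cov(x2, x1, nu1), eg_cov(x2, x2, nu2)

D = c11 * c22 - c12 * c21
b_closed = np.array([c22 * c01 - c21 * c02, c11 * c02 - c01 * c12]) / D

# Matrix form: row k holds the EG covariances taken with the ranks of x_k.
A = np.array([[c11, c21], [c12, c22]])
b_matrix = np.linalg.solve(A, np.array([c01, c02]))

print(b_closed, b_matrix)
```

Note that the matrix A is not symmetric ($c_{12} \neq c_{21}$ in general), which is the asymmetry discussed in Section 3.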

In what follows, we introduce an alternative estimator of β, based on functions of U-statistics. We show that its limiting distribution is normal, and that the differences between the b of (17) and the estimators based on U-statistics are negligible; hence, the limiting distribution of b is also normal. The advantages of using the U-statistics as estimators are that they provide unbiased estimators of the individual parameters, they have minimum variance among all unbiased estimators, and their asymptotic distributions are well known (Randles and Wolfe 1979, chap. 3). Also, the functions of U-statistics are consistent estimators of the respective parameters, and their limiting distributions are normal (under regularity conditions). The structure of this section is as follows: first, kernels are found for the relevant parameters (defined below). Then, a U-statistic based on the kernel is obtained for each of the parameters. Using the above U-statistics, an estimator $b_U$ is suggested for β, based on a function of dependent U-statistics, and it is shown that it is a consistent estimator for β. The next step is to find its asymptotic distribution. This is done using asymptotic results from U-statistics theory. Finally, it is shown that $\sqrt{n}\,b_U$ and $\sqrt{n}\,b$ have the same limiting distribution. Once the asymptotic distribution is known to be normal, and the asymptotic variance can be calculated using jackknife, for example (see Shao and Tu 1996), inference can be drawn (confidence intervals and hypothesis tests) for each parameter. We note that an alternative approach would be to replace the cumulative distribution function by the empirical one, and derive the asymptotic distribution of β from there.

Let

$$\theta = -(\nu + 1)\,\mathrm{COV}(Y, [1 - F(X)]^{\nu}). \tag{18}$$

The parameter θ is the EG equivalent of the covariance between Y and X, and is a typical element of V′Y [see (13)] and of V′X (when X replaces Y).

The following theorems are proved only for the case where ν is an integer. [The EG is a continuous function of ν so that one should be able to revise and prove a more general theorem, for a continuous ν. Further research is needed to overcome this obstacle. However, for practical purposes one can follow the curvature of the function by looking at several (integer) values of ν.]

Theorem 4.1. Let $(X_1, Y_1), \ldots, (X_{\nu+1}, Y_{\nu+1})$ be a random sample of size (ν + 1) from a continuous bivariate distribution $F_{X,Y}$ with finite second moments. Let

$$h((x_1, y_1), \ldots, (x_{\nu+1}, y_{\nu+1})) = \bar{y}_{\nu+1} - y_{x(1)}, \tag{19}$$

where $\bar{y}_{\nu+1}$ is the average of $y_1, \ldots, y_{\nu+1}$ and $y_{x(1)}$ is the y that belongs to $x_{(1)}$, the minimum of $x_1, \ldots, x_{\nu+1}$. Then $h((x_1, y_1), \ldots, (x_{\nu+1}, y_{\nu+1}))$ is a symmetric kernel of degree ν + 1 for the parameter θ of (18). That is, $h((x_1, y_1), \ldots, (x_{\nu+1}, y_{\nu+1}))$ is an unbiased estimator of the parameter θ, based on ν + 1 observations. (See Randles and Wolfe 1979, p. 61.)

Proof. The parameter θ of (18) can be expressed as follows:

$$\begin{aligned}
\theta &= -(\nu + 1)\,\mathrm{COV}(Y, [1 - F(X)]^{\nu}) \\
&= (\nu + 1)E\{Y\}E\{[1 - F(X)]^{\nu}\} - (\nu + 1)E\{Y[1 - F(X)]^{\nu}\} \\
&= \mu_Y - (\nu + 1)E\{Y[1 - F(X)]^{\nu}\}.
\end{aligned}$$

Therefore, we need to show that $E\{Y_{X(1)}\} = (\nu + 1)E\{Y[1 - F(X)]^{\nu}\}$.

Claim. $E\{Y_{X(1)} \mid X_{(1)} = x\} = E\{Y \mid X = x\}$.


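The proof of the claim is restricted to the discrete case.

Proof of the Claim.

$$\begin{aligned}
E\{Y_{X(1)} \mid X_{(1)} = x\}
&= \sum_{i=1}^{\nu+1} y_i\, P\big(Y_{X(1)} = y_i \mid X_{(1)} = x\big) \\
&= \sum_{j=1}^{\nu+1} \sum_{i=1}^{\nu+1} y_i\, P\big(Y_{X(1)} = y_i \mid X_{(1)} = x, X_j = X_{(1)}\big)\, P\big(X_j = X_{(1)}\big) \\
&= \sum_{j=1}^{\nu+1} \sum_{i=1}^{\nu+1} y_i\, P(Y_j = y_i \mid X_j = x)\, \frac{1}{\nu + 1} \\
&= \sum_{j=1}^{\nu+1} E(Y_j \mid X_j = x)\, \frac{1}{\nu + 1} \\
&= E(Y \mid X = x).
\end{aligned}$$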

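Using the claim,

$$\begin{aligned}
E\{Y_{X(1)}\}
&= E_{X(1)}\big\{E(Y_{X(1)} \mid X_{(1)} = x)\big\} \\
&= \int E(Y_{X(1)} \mid X_{(1)} = x)\, f_{X(1)}(x)\, dx \\
&= (\nu + 1) \int E(Y_{X(1)} \mid X_{(1)} = x)\, [1 - F(x)]^{\nu} f(x)\, dx \\
&= (\nu + 1) \int E(Y \mid X = x)\, [1 - F(x)]^{\nu} f(x)\, dx \\
&= (\nu + 1) \int\!\!\int y\, f(y \mid x)\, [1 - F(x)]^{\nu} f(x)\, dy\, dx \\
&= (\nu + 1) \int\!\!\int y\, [1 - F(x)]^{\nu} f(x, y)\, dy\, dx \\
&= (\nu + 1)E\{Y[1 - F(X)]^{\nu}\}.
\end{aligned}$$

The symmetry of $h((x_1, y_1), \ldots, (x_{\nu+1}, y_{\nu+1}))$ is obvious.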

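Let $h((x_1, y_1), \ldots, (x_{\nu+1}, y_{\nu+1})) = \bar{y}_{\nu+1} - y_{x(1)}$, as in (19), and let

$$\begin{aligned}
U &= \frac{1}{\binom{n}{\nu+1}} \sum \cdots \sum_{i_1 < \cdots < i_{\nu+1}} h\big((X_{i_1}, Y_{i_1}), \ldots, (X_{i_{\nu+1}}, Y_{i_{\nu+1}})\big) \\
&= \frac{1}{\binom{n}{\nu+1}} \sum \cdots \sum_{i_1 < \cdots < i_{\nu+1}} \left( \frac{\sum_{j=1}^{\nu+1} Y_{i_j}}{\nu + 1} - Y_{\min(X_{i_1}, \ldots, X_{i_{\nu+1}})} \right),
\end{aligned}$$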

where $(i_1, \ldots, i_{\nu+1})$ is a permutation of (ν + 1) indices chosen from (1, . . . , n). Then, U is a U-statistic for the parameter θ, and is therefore an unbiased and consistent estimator of θ (Randles and Wolfe 1979, cor. 3.2.5).

Using combinatorial arguments, U can be simplified and written as a linear combination of concomitants of the order statistics as follows:

$$U = \frac{1}{\binom{n}{\nu+1}} \sum_{i=1}^{n} \left[ \frac{1}{\nu + 1}\binom{n-1}{\nu} - \binom{n-i}{\nu} \right] Y_{X(i)}. \tag{20}$$

[Note that if ν > (n − i), then $\binom{n-i}{\nu} = 0$.]
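The equivalence between the definition of U as an average of the kernel (19) over all subsets and the concomitant form (20) can be verified by brute force on a small sample. The sketch assumes NumPy and the Python standard library, integer ν, and data without ties; the function names are hypothetical.

```python
import numpy as np
from itertools import combinations
from math import comb

def u_brute(x, y, nu):
    """Average of the kernel (19) over all subsets of size nu + 1."""
    vals = []
    for subset in combinations(range(len(x)), nu + 1):
        idx = list(subset)
        j = idx[int(np.argmin(x[idx]))]                  # observation with the smallest x
        vals.append(y[idx].mean() - y[j])
    return float(np.mean(vals))

def u_concomitant(x, y, nu):
    """Formula (20): linear combination of the concomitants Y_{X(i)}."""
    n = len(x)
    y_sorted = y[np.argsort(x)]                          # Y_{X(1)}, ..., Y_{X(n)}
    w = np.array([comb(n - 1, nu) / (nu + 1) - comb(n - i, nu)   # comb returns 0 if nu > n - i
                  for i in range(1, n + 1)])
    return float(w @ y_sorted / comb(n, nu + 1))

rng = np.random.default_rng(7)
x = rng.normal(size=10)
y = x + rng.normal(size=10)

for nu in (1, 2, 3):
    print(nu, u_brute(x, y, nu), u_concomitant(x, y, nu))
```

The brute-force version enumerates $\binom{n}{\nu+1}$ subsets and is only feasible for small n, whereas (20) requires a single sort.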

Following Theorem 4.1, all the elements of (13) can be estimated by U-statistics and hence, for K = 2, β can be estimated by a vector of size 2, whose elements are functions of several (dependent) U-statistics.

Let

$$\mu' = (\mu_1, \mu_2, \ldots, \mu_6) = (C_{01}, C_{02}, C_{11}, C_{12}, C_{21}, C_{22})$$

be the vector of the parameters, where $C_{0i} = \mathrm{COV}(Y, [1 - F(X_i)]^{\nu_i})$ and $C_{ij} = \mathrm{COV}(X_i, [1 - F(X_j)]^{\nu_j})$, and let

$$U' = (U_1, U_2, \ldots, U_6) = (U_{01}, U_{02}, U_{11}, U_{12}, U_{21}, U_{22}) = (c_{01}, c_{02}, c_{11}, c_{12}, c_{21}, c_{22})$$

be the corresponding vector of U-statistics [whose elements are given in (17)]. The estimator of β, based on functions of U-statistics, can be written as

$$b_U = \begin{pmatrix} b^{U}_{01.2} \\ b^{U}_{02.1} \end{pmatrix} = \frac{1}{D}\begin{pmatrix} U_{22}U_{01} - U_{21}U_{02} \\ U_{11}U_{02} - U_{01}U_{12} \end{pmatrix}, \tag{21}$$

where $U_{0i}$ is the U-statistic based on the kernel $\bar{y}_{\nu_i+1} - y_{x_i(1)}$, i = 1, 2, and $U_{ij}$ is the U-statistic based on the kernel $\bar{x}_{i,\nu_j+1} - x_{i,x_j(1)}$, where $\bar{x}_{i,\nu_j+1}$ is the average of $(\nu_j + 1)$ observations of $X_i$, $x_{i,x_j(1)}$ is the value of $X_i$ which belongs to the smallest value of $X_j$ (out of the $\nu_j + 1$ values), and $D = U_{11}U_{22} - U_{12}U_{21}$. Using the above notation, we can obtain the following results:

Theorem 4.2. Let $(y_i, x_{i1}, x_{i2},\ i = 1, 2, \ldots, n)$ be a sample drawn from a continuous multivariate distribution with finite second moments and such that $\mu_3\mu_6 - \mu_4\mu_5 \neq 0$. Then, $b_U$ in (21) is a consistent estimator of β in (13) (i.e., each component of $b_U$ is a consistent estimator of the respective element of β).

Proof. Since each U-statistic converges in quadratic mean, and thus in probability, to the parameter it estimates, it follows by Slutzky's theorem (Randles and Wolfe 1979, thm. A.3.1.3) that

$$\frac{U_6U_1 - U_5U_2}{U_3U_6 - U_4U_5} \quad \text{converges in probability to} \quad \frac{\mu_6\mu_1 - \mu_5\mu_2}{\mu_3\mu_6 - \mu_4\mu_5}$$

and thus the former is a consistent estimator of the latter.

Theorem 4.3, due to Hoeffding (1948, thm. 7.1), and Theorem 4.4, due to Serfling (1980, thm. 3.3.A), are needed for the derivation of the asymptotic distribution of $b_U$.

Theorem 4.3. Under the assumptions of Theorem 4.2, the vector U has an asymptotic normal distribution with mean μ and a variance–covariance matrix $d_n^2\Sigma$, where $d_n = 2/\sqrt{n}$. That is,

$$\sqrt{n}\,(U - \mu) \xrightarrow{D} N(0, 4\Sigma).$$

Theorem 4.4. Let $U_n = (U_{1n}, \ldots, U_{6n})$ be asymptotically normally distributed with mean vector μ and a variance–covariance matrix Σ. Let $g(U) = (g_1(U), g_2(U))$ be a vector-valued function for which each component function $g_i(U)$ is real-valued and has a nonzero differential $g(\mu; t)$, $t = (t_1, \ldots, t_6)$, at U = μ. Let

$$M = \left[ \left( \frac{\partial g_i}{\partial U_j} \right)_{U = \mu} \right], \quad i = 1, 2;\ j = 1, \ldots, 6.$$

Then $b_U = g(U_n)$ is $AN(g(\mu), d_n^2 M\Sigma M')$, where AN is asymptotic normal and $d_n \to 0$ as $n \to \infty$.


Proof. By Theorem 4.3 the vector U is $AN(\mu, d_n^2\Sigma)$, with Σ a variance–covariance matrix and $d_n = 2/\sqrt{n} \to 0$ as $n \to \infty$. Let

$$g(U) = \begin{pmatrix} g_1(U) \\ g_2(U) \end{pmatrix} = \frac{1}{D_U}\begin{pmatrix} U_6U_1 - U_5U_2 \\ U_3U_2 - U_1U_4 \end{pmatrix},$$

where $D_U = U_3U_6 - U_4U_5$, be the vector-valued function. The proof follows Serfling (1980, thm. 3.3.A). The explicit form of Σ is omitted since it is complicated. (The matrix Σ involves variances of the individual U-statistics, as well as covariances between each pair of U-statistics. Using Slutzky's theorem, a consistent estimate of Σ can be obtained by replacing each parameter by its consistent estimate.) A practical way to estimate it, using jackknife, is given in Yitzhaki (1991). Theorem 4.4 states that $\sqrt{n}\,b_U$ is asymptotically normal. In order to obtain the same result for $\sqrt{n}\,b$, the estimator based on replacing the cumulative distribution by the empirical one, it is required to show that $\sqrt{n}\,(b - b_U) \to 0$ as $n \to \infty$. This is shown in Appendix A.
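A generic delete-one jackknife for the standard errors of b can be sketched as follows. This is an illustration under stated assumptions, not necessarily the procedure of Yitzhaki (1991): it assumes NumPy, data without ties, and ν > 0, and the function names and simulated data are hypothetical.

```python
import numpy as np

def eg_slopes(y, x, nu):
    """Rank-based EG slopes b = [v'x]^{-1} v'y (no ties assumed, nu > 0)."""
    n = x.shape[0]
    nu = np.asarray(nu, dtype=float)
    r = x.argsort(axis=0).argsort(axis=0) + 1
    v = -(nu + 1.0) * (((n - r) / n) ** nu - 1.0 / (nu + 1.0))
    xc, yc = x - x.mean(axis=0), y - y.mean()
    return np.linalg.solve(v.T @ xc, v.T @ yc)

def jackknife_se(y, x, nu):
    """Delete-one jackknife standard errors; ranks are recomputed in every replicate."""
    n = len(y)
    loo = np.array([eg_slopes(np.delete(y, i), np.delete(x, i, axis=0), nu)
                    for i in range(n)])
    return np.sqrt((n - 1) / n * ((loo - loo.mean(axis=0)) ** 2).sum(axis=0))

rng = np.random.default_rng(6)
n = 200
x = rng.normal(size=(n, 2))
beta_true = np.array([1.0, -0.5])
y = x @ beta_true + rng.normal(scale=0.5, size=n)

nu = [1.0, 2.0]
b = eg_slopes(y, x, nu)
se = jackknife_se(y, x, nu)
ci = np.column_stack([b - 1.96 * se, b + 1.96 * se])     # normal-approximation intervals
print(b, se)
print(ci)
```

The normal-approximation intervals rely on the asymptotic normality established in this section; for small samples they should be read with caution.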

5. AN APPLICATION: WHO DOES NOT RESPOND TO QUESTIONNAIRES?

Surveys suffer from nonreporting, even when refusal to respond is illegal, as is the case in official surveys in Israel. If nonreporting is correlated with income, then the estimates of the mean income and the index of income inequality may be biased. Nonreporting can occur for various reasons; some of them depend on the individual (refusal, not-at-home, and so on), whereas others may be due to problems at the collecting agency (the interviewer did not find the dwelling or did not approach the respondent at an appropriate time, there were errors and omissions at the agency, and so on). In this article we do not investigate the causes of nonresponse. We will be interested in describing it as a function of several demographic variables (which can be used later in designing the sample) and one major variable, income. In general, the experience concerning nonreporting is that the propensity not to respond is a U-shaped function with respect to income, because the rich tend not to participate, while the poor and the young cannot be found easily at home. A recent study by Mistiaen and Ravallion (2003) presented a model in which compliance can either decrease or increase with income, and also be of an inverted U-shape. Moreover, adding other arguments such as the ability to find the members of the households at home, finding the address, viewing participation as a democratic value, and so on, can lead to almost all kinds of patterns. Mistiaen and Ravallion (2003) found that the nonresponse problem is not ignorable, and that there is a highly negative significant income effect on compliance. Deaton (2003) raised the plausible conjecture that richer households are less likely to participate in surveys, in order to explain the gap between growth estimates based on households' surveys and those that are based on national accounts. (Comprehensive studies, dealing with almost all aspects of nonresponse, are detailed in Groves and Couper 1998 and in Groves, Dillman, Eltinge, and Little 2002.) The main conclusion from reading the literature is that nonresponse is a serious issue that may bias the estimates, but we do not have enough knowledge to justify making the assumptions needed for running OLS or other parametric regressions.

Investigating the magnitude and the effect of nonresponse on the results is a bit complicated because one is dealing with missing observations. The direct way to learn about the problem is to analyze the properties of nonrespondents from the scattered information known about them, such as the location of the dwelling, and other direct or indirect information that can be taken from the files used for the sampling. Such an approach suffers from two major problems: (a) the information that one can gather is not sufficient; (b) the response rate in the sample we are dealing with is around 90%, so that the sample of nonresponses is relatively small. The main idea in this empirical illustration is to use the sample of the respondents to learn about the effect of nonresponse. For this purpose, we rely on a common procedure used in many official statistical offices, which is detailed next.

To overcome biases that are caused by the sample being a nonrepresentative one and to reduce standard errors, many statistical agencies adjust the distribution of the sample to fit known marginal distributions of current demographic estimates that are based on the census. The outcome of this adjustment is a weighting scheme: a weight is attached to each observation. (A necessary condition to be able to perform such an adjustment is having detailed census data. Also, there are other reasons for using those procedures; among them is to ensure that different samples, performed by different units of the agency, report the same demographic structure so that official statistics will not be blamed by the media for publishing contradicting estimates. This may explain why the adjustment to given margins is performed mainly by producers of official statistics.) For a survey of the different methodologies used to construct weighting schemes, see Kalton and Flores-Cervantes (2003). A detailed description of the method used in Israel is offered in Kantorowitz (2002). For the purpose of this article, it is sufficient to say that the above-mentioned procedures change the weight of each observation, so that it adds up to given marginal demographic and geographic distributions.

The sample we are dealing with is a sample of dwellings. It is a stratified sample according to geographical areas and types of dwellings, but the probability of each dwelling to be included in the sample is the same. Since the probability of each dwelling to be included in the sample is the same, the expected value of the weight of each observation is equal to the ratio of the overall population to the sample size. When nonresponse occurs in a certain group, it will be underrepresented in the sample, so that the weight that will be assigned to those who responded in that group will be higher than its expected value in case of equal tendency to respond.
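The link between nonresponse and weights can be illustrated with a deliberately simplified calibration to a single known margin per group. Everything in this sketch is hypothetical (group labels, population sizes, response propensities, and the one-constraint adjustment); the actual calibration described in Kantorowitz (2002) imposes several hundred constraints.

```python
import numpy as np

rng = np.random.default_rng(5)

population = {"A": 100_000, "B": 100_000}                # hypothetical census margins
response_rate = {"A": 0.95, "B": 0.80}                   # hypothetical response propensities
one_in = 100                                             # equal inclusion probability: 1 in 100

weights = {}
for group, size in population.items():
    sampled = size // one_in                             # dwellings drawn in the group
    responded = rng.binomial(sampled, response_rate[group])
    # Respondents are weighted up so that the group total matches the known margin.
    weights[group] = size / responded

print(weights)
```

With full response every weight would equal the inverse sampling fraction (here 100); the group with the lower response rate ends up with the larger weight, which is why the weight can serve as an indicator of nonresponse.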

From the point of view of our investigation, the important fact is that if the propensity of nonresponse is equal among all potential respondents, then the expected weights of all observations will be equal. Moreover, the higher the nonresponse, the higher the weight assigned to this type of observation. Hence, the weight attached to an observation can serve as an indicator of nonresponse, and will serve as the dependent variable in our analysis. If nonreporting is random, then we should expect the weight to be uncorrelated with other characteristics of the population. If the slope of the regression curve of weight on income is positive, then we conclude that nonreporting increases with income. If it is positive, but declining when high incomes are stressed, then we conclude that nonreporting increases with income but the propensity not to respond declines with income.

The weighting scheme of the sample is produced by an algorithm for calibration, with several hundreds of constraints imposed, and is intended to make the sample representative (Kantorowitz 2002). In particular, a constraint is imposed on the maximal weight assigned to each observation, so that standard errors do not increase unnecessarily. The constraints ensure that the reported age structure, geographic distributions, and household sizes will add up to given margins of the distributions of the population.

In some sense our purpose is to summarize the effect of the several hundreds of constraints imposed on weights to add up to given demographic margins, into the effect on the variable of interest, which is income. It is important to note that the income is not involved at all in the derivation of the weights. Hence, there is no built-in correlation (i.e., spurious correlation) between the weight of each observation and its income. (For additional information on the sample and the way the weighting scheme was created, see http://www.cbs.gov.il/publications/expenditure_survey04/pdf/e_intro.pdf, p. XXIII.)

The survey of household expenditures in Israel has been conducted every year since 1997. Because in some years observations from East Jerusalem were missing, we have omitted those observations from all years to get an identical coverage. The probability of each household to be included in the sample is equal. Hence, if the propensity of the population to respond and the surveying process are not correlated with demographic properties, then the expected values of all weights should be equal. The sum of the weights represents the whole population of the country. Since the size of the sample relative to the overall population changes over the years, the average value of the weights and the slope of the regression can change between years. (It is as if one multiplies the weights in each year by a different constant.) We did not correct for this problem, because its effect is small and it is important only when comparing results from different years. For the sake of simplicity, we preferred to concentrate and present the results for one representative year, and checked whether the main conclusions reached are sensitive to the selected year. Our purpose is to find out the relationship between the weight of each observation and some demographic characteristics and income. The higher the weight is, the higher is the nonresponse of the part of the population that the observation represents.

Table 1 provides descriptive statistics of the weights according to different years and ethnic groupings. The population is divided into three groups: two minority groups, Arabs and ultrareligious Jews (an ultrareligious household is defined according to the school attended by the head of the household), and the rest, referred to as the majority group. The reason for this distinction is that experienced enumerators reported that those groups tend to have different patterns of nonresponse.

In general, one can observe from Table 1 that for all years the average weight of the Arab population is the lowest, meaning that they have fewer cases of nonresponse. The maximum weight attached to an observation should be viewed with caution because some of the programs assigning the weights may restrict the weight to be not greater than a certain value. It is worth noting that although the order of average weights according to the groups remains the same over the years (except for 1999), the order of the standard deviations of the weights changes.

Table 1. Descriptive statistics of household weights by ethnic grouping

| Year | Ethnic group | N | Average weight | Max weight | Min weight | Std. dev. of weight |
|---|---|---|---|---|---|---|
| 1997 | Majority | 4,942 | 283 | 1,196 | 20 | 166 |
| 1997 | Arabs | 529 | 267 | 1,051 | 19 | 199 |
| 1997 | Ultrareligious Jews | 90 | 398 | 1,127 | 45 | 245 |
| 1998 | Majority | 5,068 | 286 | 1,196 | 18 | 169 |
| 1998 | Arabs | 606 | 256 | 1,049 | 24 | 176 |
| 1998 | Ultrareligious Jews | 98 | 321 | 766 | 26 | 166 |
| 1999 | Majority | 5,114 | 291 | 1,134 | 13 | 154 |
| 1999 | Arabs | 597 | 269 | 1,129 | 14 | 169 |
| 1999 | Ultrareligious Jews | 105 | 292 | 639 | 20 | 115 |
| 2000 | Majority | 5,146 | 301 | 1,195 | 22 | 170 |
| 2000 | Arabs | 629 | 260 | 959 | 43 | 142 |
| 2000 | Ultrareligious Jews | 89 | 310 | 1,017 | 33 | 148 |
| 2001 | Majority | 5,049 | 314 | 1,185 | 18 | 152 |
| 2001 | Arabs | 662 | 285 | 1,902 | 31 | 171 |
| 2001 | Ultrareligious Jews | 76 | 341 | 834 | 104 | 145 |

Source: HES 1997–2001, excluding the observations of East Jerusalem in 1997–1999.

Since the groups differ in household size, which may affect the probability of finding someone at home, Table 2 presents the average weights according to household size. It can be seen that for the majority, households of size 1 have the highest weight, and the rest are similar (year 2001 is different). This may be a result of small households not being at home whereas the elderly, although being at home, do not have the patience to complete the questionnaire. (For Arabs, there is no obvious pattern. Nothing can be said about the ultrareligious Jews, since the sample sizes are quite small.)

To summarize: the dependent variable is the weight assigned to each observation by a calibration procedure, intended to represent the entire population. The sample is a stratified sample, but the probability of each dwelling and each person living in a dwelling to be included in the sample is equal. Hence, if the propensity not to be included in the sample is equal, because of either nonresponse or errors on behalf of the agency, the expected weight assigned to each observation should be equal. It may differ between years, if the ratio of sample size to population changes between years. The weight is treated as an indicator of nonresponse. Having described the dependent variable, we now move to describe the results concerning the regression coefficients.

6. EMPIRICAL RESULTS

We turn first to simple regression coefficients. These simple regression coefficients are used in finance (Gregory-Allen and Shalit 1999; Shalit and Yitzhaki 2002), whereas a variation of them has been used for almost 20 years in analyzing the income elasticity of consumption goods (see the survey in Wodon and Yitzhaki 2002). We have estimated the regression coefficients for the years 1997–2001. Since they present a stable picture, only the results for the year 2001 are presented here.


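Table 2. Means of household weights by ethnic grouping and household size

| Year | Ethnic group | Statistic | Size 1 | Size 2 | Size 3 | Size 4 | Size 5+ |
|---|---|---|---|---|---|---|---|
| 1997 | Majority | Mean | 325 | 278 | 278 | 276 | 266 |
| 1997 | Majority | N | 840 | 1,199 | 807 | 910 | 1,186 |
| 1997 | Arabs | Mean | 288 | 246 | 255 | 236 | 281 |
| 1997 | Arabs | N | 12 | 48 | 54 | 102 | 313 |
| 1997 | Ultrareligious Jews | Mean | 273 | 363 | 294 | 315 | 488 |
| 1997 | Ultrareligious Jews | N | 2 | 18 | 16 | 13 | 41 |
| 1998 | Majority | Mean | 330 | 280 | 290 | 276 | 268 |
| 1998 | Majority | N | 869 | 1,281 | 788 | 964 | 1,166 |
| 1998 | Arabs | Mean | 316 | 338 | 241 | 252 | 247 |
| 1998 | Arabs | N | 13 | 49 | 70 | 98 | 376 |
| 1998 | Ultrareligious Jews | Mean | 235 | 247 | 385 | 329 | 325 |
| 1998 | Ultrareligious Jews | N | 3 | 12 | 13 | 12 | 58 |
| 1999 | Majority | Mean | 333 | 291 | 276 | 297 | 266 |
| 1999 | Majority | N | 892 | 1,258 | 878 | 919 | 1,167 |
| 1999 | Arabs | Mean | 230 | 248 | 321 | 302 | 256 |
| 1999 | Arabs | N | 15 | 49 | 64 | 93 | 376 |
| 1999 | Ultrareligious Jews | Mean | 176 | 293 | 291 | 333 | 288 |
| 1999 | Ultrareligious Jews | N | 3 | 20 | 11 | 12 | 59 |
| 2000 | Majority | Mean | 351 | 291 | 310 | 280 | 282 |
| 2000 | Majority | N | 875 | 1,296 | 867 | 965 | 1,143 |
| 2000 | Arabs | Mean | 323 | 223 | 272 | 381 | 237 |
| 2000 | Arabs | N | 17 | 58 | 64 | 78 | 412 |
| 2000 | Ultrareligious Jews | Mean | 239 | 327 | 254 | 336 | 301 |
| 2000 | Ultrareligious Jews | N | 3 | 21 | 1 | 13 | 51 |
| 2001 | Majority | Mean | 363 | 294 | 326 | 291 | 308 |
| 2001 | Majority | N | 886 | 1,325 | 838 | 949 | 1,051 |
| 2001 | Arabs | Mean | 521 | 261 | 290 | 304 | 271 |
| 2001 | Arabs | N | 20 | 64 | 74 | 116 | 388 |
| 2001 | Ultrareligious Jews | Mean | 393 | 346 | 483 | 366 | 319 |
| 2001 | Ultrareligious Jews | N | 3 | 16 | 4 | 8 | 45 |

Source: HES 1997–2001, excluding the observations of East Jerusalem in 1997–1999.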

Table 3 presents the regression coefficients of weight on household income, for different values of ν. The higher the parameter ν is, the more the regression stresses the slopes of the regression curve at the lower end of the income distribution. (It is worth noting that the weights given to adjacent slopes are solely determined by the distribution of the independent variable and ν. They should not be confused with the dependent variable of the regression, which is the weight assigned to an observation.) As can be seen, the regression coefficients are negative, which means that the higher the income is, the

Table 3. Regression coefficients of household weight on gross income per household, by extended Gini parameter (ν)

                         ν for gross income
Coefficient         3         2         1       −.5
b              −.0025*   −.0022*   −.0017*   −.0007*
SE(b)           .0003     .0003     .0002     .0001
a(mean)         348.3     342.8     335.7     321.0
a(median)       320.7     314.9     307.8     293.2

Source: HES 2001. Units of income: New Israeli Shekel per month.
* Indicates a value significantly different from 0 (at α = .05).

lower is the weight assigned to observations, implying that nonresponse declines with income. For example, a value of −.0022 (obtained for ν = 2) means that a unit increase in income will decrease the dependent variable (weight) by .0022. The average weight is the ratio of the population size to the sample size. Therefore, the value of (−.0022) divided by the average weight gives the percentage rate of the increase in response rate with a unit increase in income. Even when high-income groups are stressed (ν = −.5), we still have a significant negative regression coefficient. The interpretation of this finding is that we have a monotonic relationship between nonresponse and income. We have checked the pattern for the years 1997–2000 and found the same pattern. To save space the results are not presented.
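How the slope changes with ν can be illustrated with a minimal sketch. It assumes that the simple extended Gini regression coefficient has the ratio-of-covariances form b(ν) = cov(y, [1 − F_n(x)]^ν) / cov(x, [1 − F_n(x)]^ν), which is what the parameter θ of Appendix A suggests once the common factor −(ν + 1) cancels. The function name `eg_simple_slope`, the synthetic data, and the restriction to ν > 0 are assumptions of this illustration and are not taken from the survey analysis.

```python
import numpy as np


def eg_simple_slope(x, y, nu):
    """Simple extended Gini regression slope, sketched as a ratio of covariances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if nu <= 0:
        # With F_n = rank / n the largest observation gets 1 - F_n = 0, so a
        # negative power (e.g. nu = -0.5) would need another rank convention.
        raise ValueError("this sketch only handles nu > 0")
    n = x.size
    # Ranks 1..n of x; ties are broken by order of appearance.
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable") + 1
    w = (1.0 - ranks / n) ** nu
    wc = w - w.mean()
    # Centring w makes sum(wc * y) proportional to cov(y, w); same for x.
    return float(np.sum(wc * y) / np.sum(wc * x))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical incomes and a weight that is decreasing and convex in income.
    income = rng.lognormal(mean=9.3, sigma=0.7, size=2000)
    weight = 400.0 - 30.0 * np.log(income / 1000.0) + rng.normal(0.0, 40.0, 2000)
    for nu in (3, 2, 1):
        print(nu, eg_simple_slope(income, weight, nu))
```

On exactly linear data the sketch returns the same slope for every ν, so any variation across ν reflects curvature (or noise) in the relationship rather than the estimator itself.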

The rest of Table 3 presents the two versions of the constant term. One presents the constant term when the regression line passes through the median and the other when it passes through the mean. It is interesting to note that the difference between the two is around 28 with a(mean) higher than a(median). This is an indication that the error term tends to be asymmetric. It is not clear to us whether this kind of result, that is, that the difference in the constant terms is independent of the slopes, is a coincidence or is a property of the extended Gini regression procedure.
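The two constant terms can be sketched as follows. The code assumes one plausible reading of the text: a(mean) is the intercept that makes the residuals average to zero, and a(median) is the intercept that leaves half of the residuals on each side of the line. The function name `constant_terms` and the toy data are hypothetical.

```python
import numpy as np


def constant_terms(x, y, slope):
    """Return (a_mean, a_median) for a line with the given slope."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    partial = y - slope * x          # what is left after removing the slope
    a_mean = float(np.mean(partial))      # line passes through the means
    a_median = float(np.median(partial))  # assumed meaning of "through the median"
    return a_mean, a_median


if __name__ == "__main__":
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    # One large positive error makes the error term right-skewed.
    y = 2.0 * x + 10.0 + np.array([0.0, 0.0, 0.0, 0.0, 5.0])
    print(constant_terms(x, y, slope=2.0))
```

Under this reading, a(mean) − a(median) equals the mean minus the median of `y - slope * x`, so a positive gap points to a right-skewed error term.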

Table 4 presents the simple regression coefficients of weight on household size. As in the regression on income, the more the lower end of the distribution (here, small households) is stressed, the higher the value of the regression coefficient (in absolute value), and in all cases, the signs of the regression coefficients are negative. This means that nonresponse is higher among small households. Note that as before, the constant term of the regression passing through the mean is larger than the constant term of the regression passing through the median, and again, the difference between the two constants is around 28.

It is interesting that although the differences in the regression coefficients are relatively large, the differences in the standard errors are relatively small. Further research and additional results from different datasets are needed to form an opinion regarding this issue.

Tables 5a and 5b present the multiple regression coefficients with gross income, household size, and dummy variables for being a member of a minority group (Arabs, ultrareligious Jews) as the independent variables. In Table 5a, the Gini regression method is used with a choice of values for ν, whereas in Table 5b the OLS method is used for the entire sample and then for each quartile (by income) separately.

Table 4. Regression coefficients of household weight on household size, by extended Gini parameter (ν)

                        ν for household size
Coefficient       3.0       2.0       1.0       −.5
b              −12.2*    −10.6*     −8.5*     −3.5*
SE(b)             1.4       1.3       1.1       1.2
a(mean)         354.1     347.5     339.3     321.4
a(median)       326.7     320.0     311.1     293.2

Source: HES 2001.
* Indicates a value significantly different from 0 (at α = .05).

340 Journal of Business & Economic Statistics, July 2008

Table 5a. Gini multiple regression coefficients of household weight (0) on gross income per household (1), household size (2), and ethnic grouping dummy variables (3, 4)^a, by extended Gini parameter (ν)

                          ν for gross income
Coefficient        3.0        2.0        1.0        −.5
b01            −.0026^b   −.0021^b   −.0016^b   −.0006^b
                 (.000)     (.000)     (.000)     (.000)
b02              −1.41      −2.54     −3.95^b    −6.32^b
                 (1.44)     (1.39)     (1.35)     (1.31)
b03             −40.8^b    −36.0^b    −30.1^b    −20.1^b
                  (8.1)      (8.0)      (7.9)      (7.8)
b04               18.8       23.3       28.9      38.3^b
                 (17.4)     (17.2)     (17.1)     (16.9)
a(mean)          357.9      354.6      350.5      343.6
a(median)        330.2      326.6      322.6      316.1

Source: HES 2001. Units of income: New Israeli Shekel per month. Standard errors in parentheses.
^a Dummy variable No. 3 has the following values: 1 = “Arab,” 0 = “Other”; dummy variable No. 4 has the following values: 1 = “Ultrareligious Jew,” 0 = “Other.”
^b Indicates a value significantly different from 0 (α = .05).

The Gini Regression

The parameter ν is set to 1 (symmetric around the median) for household size and minority groups (represented by dummy variables), and it varies for income only. As can be seen, the regression coefficients of weight on income decline in absolute value as ν declines (i.e., stressing higher incomes) by up to .001, so that the patterns detected in the simple regression continue to hold. However, the sign of the regression coefficients of household size remains the same (negative), indicating that larger households respond in greater proportion to the questionnaires. Since the only difference between the regressions is the change in the parameter of income, the decline in absolute value of the coefficient of household size should be attributed to a change in the pattern of association between income and household size. The higher the stress on high-income groups is, the lower is the absolute value of the effect of household size on nonresponse. Also, the magnitude of the regression coefficient of household size has changed from (−8.5) in the simple regression case, to (−4.0), which may be an indicator of the magnitude of association between income and household size. Given income and household size, Arabs tend to respond in higher proportion than the majority group, but the more we stress high income, the lower is the effect (this may be due to small sample size in the upper range of incomes). One possible interpretation is that the higher the income is, the lower is the difference in response rates between the majority group and Arabs. On the other hand, the effect of stressing high-income range on ultrareligious Jews is the opposite. The more high incomes are stressed, the lower is the response rate. Since it is a group with a low response rate on average, and seems to be motivated by an ideology, it is reasonable to conclude that the difference in response rate between this group and the rest of the population increases with income. However, the high standard errors show that only when high income is stressed, the dummy variable for ultrareligious Jews is significant. As before, the difference between the constant terms is approximately 28.

All in all, we can conclude that given congregation and household size, the lower the income, the lower the response

Table 5b. OLS multiple regression coefficients of household weight (0) on gross income per household (1), household size (2), and ethnic grouping dummy variables (3, 4)^a

                                          OLS
                 Entire       First      Second       Third      Fourth
Coefficient  population    quartile    quartile    quartile    quartile
b01            −.0008^b    −.0094^b     −.00447      −.0003    −.0006^b
                (.0001)      (.003)     (.0033)     (.0016)     (.0002)
b02            −4.384^b    −11.10^b       −1.55     3.868^b       −1.48
                (1.154)      (3.56)     (2.275)     (1.928)      (2.31)
b03           −25.039^b      16.856    −48.36^b   −53.164^b     −47.4^b
                 (6.83)    (14.972)     (12.05)     (12.73)     (20.68)
b04              33.128    78.977^b       −4.01        .793        46.2
                (17.92)    (37.895)     (31.17)      (30.4)      (56.2)
a(mean)         340.615      404.02       363.2       290.2       322.4

Source: HES 2001. Units of income: New Israeli Shekel per month. Standard errors in parentheses.
^a Dummy variable No. 3 has the following values: 1 = “Arab,” 0 = “Other”; dummy variable No. 4 has the following values: 1 = “Ultrareligious Jew,” 0 = “Other.”
^b Indicates a value significantly different from 0 (α = .05).

rate, and given income and congregation, the smaller the household, the lower the response rate. The indications of the simple descriptive statistics that Arabs tend to respond more than the majority, and ultrareligious Jews respond less than the rest of the population, remained intact.

The OLS Regression

In order to see the difference between Gini and OLS regression, we replicated the estimation procedure using OLS. OLS was used for the entire sample (n = 5,787), and then for each quartile separately (n = 1,446–1,448). The quartiles are: Q1 = 25th percentile = 6,599 Israeli (monthly) Shekels (IS), Q2 = median = 10,927 IS, and Q3 = 75th percentile = 18,534 IS. Note, however, that under the OLS, only one-fourth of the observations participate in the regression of each quartile, whereas in the Gini regression, all observations participate in the regressions. The pattern found in the Gini regression appears, in part, when using OLS. The regression coefficients of weight on income are all negative, but among the quartile regressions only two are significantly different from zero: the coefficients for the first and fourth quartiles. However, moving from the first quartile to the third, there seems to be a trend of decrease, in absolute value. Concerning the effect of demographic characteristics, in the OLS, the magnitude and the effect of the demographic characteristics change between quartiles, whereas under the Gini they are forced to be the same. Looking carefully at the slopes of income (in absolute values), there is a slight indication of a U-shaped pattern, going down from .0094 to .00447 and .0003, and then up to .0006.
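The quartile-by-quartile OLS procedure described above can be sketched as follows. The function name `ols_by_income_quartile`, the cut-point convention (observations equal to a quartile boundary go to the lower group), and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def ols_by_income_quartile(income, X, y):
    """Fit y on [1, X] separately within each income quartile.

    Returns a (4, 1 + k) array: one row of [intercept, slopes...] per quartile.
    """
    income = np.asarray(income, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cuts = np.quantile(income, [0.25, 0.5, 0.75])
    # Group 0: income <= Q1, group 1: Q1 < income <= Q2, and so on.
    group = np.searchsorted(cuts, income, side="left")
    rows = []
    for q in range(4):
        mask = group == q
        design = np.column_stack([np.ones(mask.sum()), X[mask]])
        coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        rows.append(coef)
    return np.array(rows)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    income = rng.lognormal(9.3, 0.7, n)                 # hypothetical incomes
    size = rng.integers(1, 7, n).astype(float)          # hypothetical household sizes
    weight = 400.0 - 30.0 * np.log(income / 1000.0) - 4.0 * size + rng.normal(0, 40, n)
    print(ols_by_income_quartile(income, np.column_stack([income, size]), weight))
```

Each row uses only the observations of one quartile, which is the contrast with the Gini regression drawn in the text: there every observation enters every regression and only the weighting of the slopes changes.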

7. EFFECT OF WEIGHTING ON MEAN INCOME AND INEQUALITY

We turn now to answer directly the research question raised by Deaton’s (2003) conjecture concerning the impact of nonresponse on reported average income and on the measurement of poverty. Having found that there is a nonlinear systematic


Table 6. Mean values and extended Gini coefficients of gross household income

                        With weighting                      Without weighting
                  Mean      ν of gross income         Mean      ν of gross income
Year      N     income     3.0     2.0     1.0      income     3.0     2.0     1.0
2001  5,787     14,110     .62     .55     .42      14,758     .61     .55     .42
                         (.005)  (.005)  (.005)              (.005)  (.005)  (.006)
2000  5,864     13,273     .61     .54     .41      13,978     .61     .54     .40
                         (.005)  (.005)  (.005)              (.004)  (.004)  (.005)
1999  5,816     12,837     .62     .55     .41      13,300     .61     .54     .41
                         (.005)  (.005)  (.005)              (.005)  (.005)  (.006)
1998  5,772     11,336     .61     .54     .41      12,383     .61     .54     .40
                         (.005)  (.005)  (.006)              (.005)  (.005)  (.005)
1997  5,561     10,724     .62     .55     .41      11,494     .61     .53     .40
                         (.006)  (.006)  (.007)              (.005)  (.005)  (.005)

NOTE: The case ν = −.5 is omitted because we are assuming inequality aversion. Standard errors in parentheses.

Source: HES 1997–2001, excluding the observations of East Jerusalem in 1997–1999. Units of income: New Israeli Shekels per month.

relationship between income and the response rate, one wonders whether this relationship can seriously bias measures of inequality like the Gini coefficient.

Table 6 presents the weighted mean of income (and extended Gini’s) and the simple (unweighted) mean of income (and extended Gini’s). Presumably, the former represent unbiased estimates of the population parameters, whereas the latter represent the biased estimates, due to nonresponse. As expected from the regression results, weighted mean incomes are lower than unweighted, which means that nonresponse tends to increase average income by a magnitude of up to 10%. To verify this conclusion, Table 7 presents the weighted and unweighted mean incomes by the different demographic groups and household sizes. In only two out of 15 cases we get that unweighted average income is lower than weighted average income, and in these cases the sample sizes and the differences are small.
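The size of the overstatement can be checked directly from the mean incomes reported in Table 6. The short sketch below only repeats that arithmetic; the variable and function names are ours.

```python
# Mean gross household income (New Israeli Shekels per month), from Table 6.
WEIGHTED_MEAN = {2001: 14110, 2000: 13273, 1999: 12837, 1998: 11336, 1997: 10724}
UNWEIGHTED_MEAN = {2001: 14758, 2000: 13978, 1999: 13300, 1998: 12383, 1997: 11494}


def relative_overstatement(weighted, unweighted):
    """Unweighted mean relative to the weighted mean, minus one, per year."""
    return {year: unweighted[year] / weighted[year] - 1.0 for year in weighted}


if __name__ == "__main__":
    for year, gap in sorted(relative_overstatement(WEIGHTED_MEAN, UNWEIGHTED_MEAN).items()):
        print(f"{year}: unweighted mean exceeds weighted mean by {100 * gap:.1f}%")
```

The gap is positive in every year and largest in 1998, at a little over 9%, which is consistent with the "up to 10%" stated in the text.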

The second and more complicated issue is the effect of nonresponse on inequality and poverty measures. First, one has to distinguish between absolute and relative poverty lines. If the poverty line is an absolute one, then we get a clear answer: an underrepresentation of the number of poor people is in contrast to Deaton’s conjecture. On the other hand, if the poverty line is relative, or if we are concerned with the effect on inequality, then the direction of the bias is not clear, because nonresponse will affect both the numerator and the denominator of the inequality index.

To see this, let us evaluate the effect on the Gini index of inequality. The Gini coefficient can be written as

$$G = \frac{2\,\mathrm{COV}(Y, F(Y))}{\mu},$$

where Y is income, F(Y) is the cumulative distribution of income, and μ is mean income. We have already established that nonresponse tends to bias mean income upward, leading to a downward bias in inequality measurement. However, omitting an observation can increase or decrease the numerator. Lower participation rates by the poor may increase or decrease the numerator, depending on the shape of the distribution, whereas it will certainly increase the denominator. By stochastic dominance considerations, it can be shown that omitting an observation of a poor person will shift the cumulative distribution to the right. Hence, higher nonresponse rates among the poor will

Table 7. Estimates of sample and population-adjusted means of gross income per household, by household size and ethnic grouping

                                                  Household size
Ethnic group         Statistic              1       2       3       4      5+   Total
Majority             N                    886   1,325     838     949   1,051   5,049
                     Unweighted mean    7,136  12,613  17,471  19,477  20,943  15,482
                     Weighted mean      6,846  12,447  16,929  18,934  20,027  14,762
Arabs                N                     20      64      74     116     388     662
                     Unweighted mean    3,846   6,192   7,961   9,545  10,916   9,676
                     Weighted mean      3,598   5,755   7,579   8,861  10,608   9,121
Ultrareligious Jews  N                      3      16       4       8      45      76
                     Unweighted mean    5,829  10,681  14,683   7,374  11,648  10,925
                     Weighted mean      4,385  10,794  11,180   6,740  11,776  10,617
Total                N                    909   1,405     916   1,073   1,484   5,787
                     Unweighted mean    7,060  12,299  16,690  18,313  18,040  14,758
                     Weighted mean      6,736  12,153  16,214  17,690  17,530  14,110

Source: HES 2001.


increase the mean income and will also tend to increase social welfare indicators such as $\mu(1 - G)$. [The explanation for this argument is that $\mu_1(1 - G_1) > \mu_2(1 - G_2)$ is a necessary condition for distribution 1 to dominate distribution 2 according to second-degree stochastic dominance; see Yitzhaki 1982.] This means that the bias in the mean imposes a constraint on the magnitude of the bias in the Gini coefficient. However, it does not impose a constraint on the direction of the bias.

Instead of trying to solve the problem theoretically, we approach the question by comparing the effect of weighting on inequality. In any case, since nonresponse is not a simple function of income, and it can affect the numerator in both directions, we cannot expect a clear theoretical answer.

The rest of Table 6 presents the extended Gini’s of gross income, with the case ν = 1 being the simple Gini, which is the most relevant. In all years, the simple Gini inequality index calculated with weights is higher than the Gini index without weights. Hence, nonresponse tends to bias inequality downward. However, in three years (out of five) the effect is relatively small (less than one standard error), and in one year, 1997, the difference is about 3.5%, which is quite large. Turning to the extended Gini, we see that in all cases the extended Gini, which relies on a weighted sample, is higher than the extended Gini based on the biased (unweighted) sample. Hence, we may safely conclude that nonresponse tends to bias inequality downward.
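A weighted and an unweighted simple Gini can be compared with a few lines of code. The sketch below uses the covariance form G = 2 COV(Y, F(Y)) / μ with a mid-point empirical cdf and treats the calibration weights as frequency weights; both choices, the function name `gini`, and the toy numbers are assumptions of this illustration rather than the computation behind Table 6.

```python
import numpy as np


def gini(y, weights=None):
    """Simple Gini coefficient via 2 * cov(y, F(y)) / mean(y), optionally weighted."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if weights is None else np.asarray(weights, dtype=float)
    order = np.argsort(y, kind="stable")
    y, w = y[order], w[order]
    p = w / w.sum()                       # share of the population each row stands for
    cdf = np.cumsum(p) - 0.5 * p          # mid-point empirical cdf
    mu = np.sum(p * y)
    cov = np.sum(p * (y - mu) * (cdf - np.sum(p * cdf)))
    return float(2.0 * cov / mu)


if __name__ == "__main__":
    income = np.array([3000.0, 6000.0, 9000.0, 15000.0, 40000.0])   # hypothetical
    weight = np.array([380.0, 340.0, 310.0, 300.0, 290.0])          # poorer rows weigh more
    print("unweighted:", gini(income))
    print("weighted:  ", gini(income, weight))
```

With integer frequency weights the function gives the same value as repeating each row that many times, which is a convenient sanity check on the weighting.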

8. CONCLUSIONS AND RECOMMENDATIONS FOR FURTHER RESEARCH

This article presents a descriptive method that enables the researcher to trace the curvature of the regression curve by changing the weights assigned to different sections of the distributions of the independent variables. One major advantage of the method is that the researcher can use a different weighting scheme for each independent variable.

Although descriptive in nature, it can be turned into a standard analytical regression technique. By selecting the same weighting scheme for all independent variables, one can have the structure of the OLS, with one simple modification: each variance is substituted by an extended Gini, and each covariance is substituted by the appropriate extended Gini covariance. The only difference is that the method offers an infinite number of alternative regression coefficients. Clearly, the method enables the investigator to verify whether the results are sensitive to the specific index of variability (weighting scheme) used.
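The substitution described above can be sketched as a set of normal equations in which the covariance of each regressor with x_j is replaced by its covariance with [1 − F_n(x_j)]^{ν_j}. This is a minimal illustration under that assumption, with one ν per regressor and ν > 0; the function name `eg_multiple_regression` and the synthetic data are hypothetical and this is not presented as the estimator actually used for Table 5a.

```python
import numpy as np


def eg_multiple_regression(X, y, nus):
    """Solve sum_i b_i * cov(x_i, w_j) = cov(y, w_j), with w_j = (1 - F_n(x_j)) ** nu_j."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    W = np.empty_like(X)
    for j in range(k):
        # Ranks 1..n of column j; ties are broken by order of appearance.
        ranks = np.argsort(np.argsort(X[:, j], kind="stable"), kind="stable") + 1
        W[:, j] = (1.0 - ranks / n) ** nus[j]
    Wc = W - W.mean(axis=0)
    # Because Wc is centred, Wc.T @ X and Wc.T @ y are proportional to covariances.
    return np.linalg.solve(Wc.T @ X, Wc.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    income = rng.lognormal(9.3, 0.7, 1000)               # hypothetical incomes
    size = rng.integers(1, 7, 1000).astype(float)        # hypothetical household sizes
    weight = 400.0 - 30.0 * np.log(income / 1000.0) - 4.0 * size + rng.normal(0, 40, 1000)
    X = np.column_stack([income, size])
    for nu_income in (3.0, 2.0, 1.0):
        # nu varies for income only; household size keeps nu = 1.
        print(nu_income, eg_multiple_regression(X, weight, (nu_income, 1.0)))
```

Changing only the first entry of `nus` mirrors the design of stressing different sections of one variable while the treatment of the other variables stays intact.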

Turning to our illustration of nonresponse, we have found that in the survey of household expenditure in Israel, nonresponse decreases with income, decreases with household size, and differs among ethnic groups. The Arab population tends to respond more than the majority, while the ultrareligious Jewish population tends to respond less than the majority group. These results are in contrast with Deaton’s (2003) conjecture that high-income groups tend to respond less to surveys. However, one should be aware that nonresponse is a survey-specific, not to mention the possibility of a country-specific, phenomenon. Preliminary tests that we conducted have shown that the nonresponse in the income survey, which is a panel in the labor force survey, demonstrates a totally different response behavior. Also, additional variables may be relevant in explaining nonresponse behavior; schooling is a primary candidate. The significance of these findings depends on the type of user: for the consumer of data it points out the biases that can be expected from surveys on the same topic; for producers of data (i.e., statistical bureaus) it points out weakness in survey data so that they can improve the sampling strategy so it takes possible biases due to nonresponse into account.

In its present form, the regression method does not offer estimates of partial derivatives of the regression curve, but it seems that one can overcome this deficiency. Yitzhaki (2002) showed that if one divides the range of an independent variable into two sections, then the Gini regression coefficient (and OLS) can be presented as a weighted average of the two within-section regression coefficients and a between-section regression coefficient. The weights are the relative contribution of each section to intra- and intergroup Gini (variance, in OLS) of the independent variable. This decomposition can be easily expanded to an arbitrary number of sections. Further research is needed to apply this additional decomposition to get a piecewise linear approximation to the regression curve that is based on a between-section component and within-section components of the approximation to allow the estimate of the partial derivative to vary over sections of the independent variables.

ACKNOWLEDGMENTS

We are grateful to Shaul Lach, Malka Kantorowitz, Gideon Schechtman, Aryeh Reiter, and Dmitri Romanov for helpful comments and discussions. We thank the associate editor and the anonymous referees for many critical comments that improved the paper.

APPENDIX A: PROOF OF CONVERGENCE

This appendix shows that the difference between the two proposed estimators of $\beta$ is negligible, and hence, $b_N$ has a limiting normal distribution (as was shown for $b_U$). (The proof is limited to the case where $\nu$ is an integer.)

The proof proceeds in two steps. In the first step, we express $b_N$ as a linear combination of concomitants of order statistics; in the second step, we show that the difference between the coefficients of $y_{x(i)}$, using the two presentations, is negligible, for all $i$.

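Let $\theta = -(\nu + 1)\,\mathrm{COV}(Y, [1 - F(X)]^{\nu})$ [see (18)] be the parameter of interest. As mentioned in (20), a U-statistic for estimating $\theta$ is

$$U = \frac{1}{\binom{n}{\nu+1}} \sum_{i=1}^{n} \left[ \frac{1}{\nu+1}\binom{n-1}{\nu} - \binom{n-i}{\nu} \right] Y_{X(i)} = \sum_{i=1}^{n} a_i Y_{X(i)}.$$

[Note that if $\nu > (n - i)$, then $\binom{n-i}{\nu} = 0$.] Hence, the coefficient of $Y_{X(i)}$ is

$$a_i = \frac{1}{\binom{n}{\nu+1}} \left[ \frac{1}{\nu+1}\binom{n-1}{\nu} - \binom{n-i}{\nu} \right] = \frac{1}{n} - \frac{(n-i)!\,(n-\nu-1)!\,(\nu+1)}{(n-i-\nu)!\,n!}. \tag{A.1}$$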


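Using the semiparametric approach, that is, replacing $F$ by $r(x)/n = F_n$ (the empirical cdf), the estimator of $\theta$ (18) is given by

$$\begin{aligned}
\theta_N &= -(\nu+1)\,\mathrm{cov}\left(y, \left(1 - \frac{r(x)}{n}\right)^{\nu}\right) \\
&= -\frac{\nu+1}{n-1} \sum_{i=1}^{n} y_i \left[\left(1 - \frac{r_i}{n}\right)^{\nu} - \mathrm{AVE}\left(\left(1 - \frac{r_i}{n}\right)^{\nu}\right)\right] \\
&= -\frac{\nu+1}{n-1} \sum_{i=1}^{n} \left[\left(1 - \frac{i}{n}\right)^{\nu} - \mathrm{AVE}\left(\left(1 - \frac{i}{n}\right)^{\nu}\right)\right] y_{x(i)} \\
&= \sum_{i=1}^{n} b_i\, y_{x(i)},
\end{aligned}$$

where $\mathrm{AVE}\left((1 - i/n)^{\nu}\right)$ is the average of $(1 - i/n)^{\nu}$.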

Hence, the coefficient of $y_{x(i)}$ is

$$b_i = -\frac{\nu+1}{n-1}\left[\left(1-\frac{i}{n}\right)^{\nu} - \mathrm{AVE}\left(\left(1-\frac{i}{n}\right)^{\nu}\right)\right].$$

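Note that

$$\mathrm{AVE}\left(\left(1-\frac{i}{n}\right)^{\nu}\right) = \frac{1}{n}\sum_{j=1}^{n-1}\left(\frac{j}{n}\right)^{\nu} = \frac{1}{n^{\nu+1}}\sum_{j=1}^{n-1} j^{\nu}.$$

Using Riemann approximation, then

$$\int_0^1 x^{\nu}\,dx = \frac{1}{\nu+1}$$

implies that

$$\frac{1}{\nu+1} - \frac{1}{n} \le \frac{1}{n^{\nu+1}}\sum_{j=1}^{n-1} j^{\nu} \le \frac{1}{\nu+1}.$$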

Using these bounds, the difference between the coefficients is

$$a_i - b_i = \frac{1}{n} - \frac{(n-i)(n-i-1)\cdots(n-i-(\nu-1))\,(\nu+1)}{n(n-1)\cdots(n-\nu)} + \frac{\nu+1}{n-1}\left(\frac{(n-i)^{\nu}}{n^{\nu}} - \frac{1}{\nu+1}\right) + O\left(\frac{1}{n^2}\right).$$

Using the common denominator $n^{\nu}(n-1)\cdots(n-\nu)$, of order $n^{2\nu}$, it is easy to see that in the numerator, the highest power of $n$, which is $2\nu - 1$, cancels out, so that the numerator will be of order $n^{2\nu-2}$. Therefore, the difference is $O(n^{-2})$ whereas each coefficient is of order $1/n$. This implies that

$$\sqrt{n}\,(a_i - b_i) \to 0$$

as $n \to \infty$, which completes the proof.
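The convergence claim can be checked numerically for small integer ν. The sketch below evaluates the U-statistic coefficients a_i from (A.1) and the plug-in coefficients b_i as defined in this appendix, then reports the largest gap; the function names are ours and the check is an illustration, not a substitute for the proof.

```python
from math import comb, sqrt


def u_stat_coeffs(n, nu):
    """a_i = 1/n - C(n-i, nu) / C(n, nu+1), for i = 1..n (integer nu)."""
    # math.comb returns 0 when nu > n - i, matching the note in the text.
    return [1.0 / n - comb(n - i, nu) / comb(n, nu + 1) for i in range(1, n + 1)]


def plug_in_coeffs(n, nu):
    """b_i = -(nu+1)/(n-1) * [(1 - i/n)^nu - AVE((1 - i/n)^nu)], for i = 1..n."""
    powers = [(1.0 - i / n) ** nu for i in range(1, n + 1)]
    ave = sum(powers) / n
    return [-(nu + 1) / (n - 1) * (p - ave) for p in powers]


def max_gap(n, nu):
    """Largest |a_i - b_i| over i."""
    return max(abs(a - b) for a, b in zip(u_stat_coeffs(n, nu), plug_in_coeffs(n, nu)))


if __name__ == "__main__":
    for nu in (1, 2, 3):
        for n in (50, 500, 5000):
            gap = max_gap(n, nu)
            print(f"nu={nu} n={n}: max|a_i - b_i|={gap:.3e}, sqrt(n)*gap={sqrt(n) * gap:.3e}")
```

For ν = 1 the two sets of coefficients coincide up to rounding, and for ν = 2 the largest gap equals 2.5/n², so that √n times the gap shrinks as n grows.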

APPENDIX B: PROOF OF THE CHAIN RULE IN THE OLS CASE

Proposition. The OLS estimators in a multiple regression obey the chain rule. That is: let $y = g(x_1, \ldots, x_K)$, and apply a “chain rule” to get

$$\frac{dy}{dx_i} = \sum_{k=1}^{K} \frac{\partial g}{\partial x_k}\,\frac{dx_k}{dx_i}.$$

Then, the partial derivatives are indeed the (conditional) regression coefficients.

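Proof. Assume that all variables are presented as deviations from the mean. The OLS estimator is $b = (X'X)^{-1}(X'Y)$. By simple algebra it can be written as $(X'X)b = X'Y$. Writing it explicitly, we have

$$\begin{pmatrix}
\mathrm{cov}(x_1, x_1) & \mathrm{cov}(x_2, x_1) & \cdots & \mathrm{cov}(x_k, x_1) \\
\mathrm{cov}(x_1, x_2) & \mathrm{cov}(x_2, x_2) & \cdots & \mathrm{cov}(x_k, x_2) \\
\cdots & \cdots & \cdots & \cdots \\
\mathrm{cov}(x_1, x_k) & \mathrm{cov}(x_2, x_k) & \cdots & \mathrm{cov}(x_k, x_k)
\end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ \cdots \\ b_k \end{pmatrix}
=
\begin{pmatrix} \mathrm{cov}(y, x_1) \\ \mathrm{cov}(y, x_2) \\ \cdots \\ \mathrm{cov}(y, x_k) \end{pmatrix},$$

where $b_1, \ldots, b_k$ are the multiple regression coefficients. Dividing each line by the variance of the corresponding independent variable and denoting simple regression coefficients by $b_{ij}$, we get

$$\begin{pmatrix}
1 & b_{21} & \cdots & b_{k1} \\
b_{12} & 1 & \cdots & b_{k2} \\
\cdots & \cdots & \cdots & \cdots \\
b_{1k} & b_{2k} & \cdots & 1
\end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ \cdots \\ b_k \end{pmatrix}
=
\begin{pmatrix} b_{y1} \\ b_{y2} \\ \cdots \\ b_{yk} \end{pmatrix}.$$

Viewing $b_{ij}$ as the differentials (e.g., $dx_i/dx_j$) and $b_i$ as the partial derivatives, we get that OLS estimators obey the chain rule.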
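The identity in the proof can be verified numerically: the matrix of simple regression coefficients times the vector of multiple regression coefficients should reproduce the simple regression coefficients of y on each regressor. The sketch below does this on synthetic data; the function name `chain_rule_sides` and the data are assumptions for illustration.

```python
import numpy as np


def chain_rule_sides(X, y):
    """Return (B @ b, b_y): the two sides of the chain-rule identity for OLS."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)              # deviations from the mean
    yc = y - y.mean()
    b, *_ = np.linalg.lstsq(Xc, yc, rcond=None)   # multiple regression coefficients
    C = Xc.T @ Xc                        # proportional to the covariance matrix
    var = np.diag(C)
    # Row j holds b_ij = cov(x_i, x_j) / var(x_j): slope of x_i regressed on x_j.
    B = C / var[:, None]
    b_y = (Xc.T @ yc) / var              # simple slopes of y on each x_j
    return B @ b, b_y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    X[:, 1] += 0.5 * X[:, 0]             # make the regressors correlated
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=500)
    lhs, rhs = chain_rule_sides(X, y)
    print(lhs)
    print(rhs)
```

The common factor 1/(n − 1) is omitted on both sides because it cancels in every ratio of a covariance to a variance.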

[Received November 2004. Revised October 2006.]

REFERENCES

Angrist, J., Chernozhukov, V., and Fernández-Val, I. (2004), “Quantile Regression Under Misspecification, With an Application to the U.S. Wage Structure,” Working Paper Series 10428, National Bureau of Economic Research, http://www.nber.org/papers/w10428.

Araar, A., and Duclos, J. Y. (2003), “An Atkinson–Gini Type Family of Social Evaluation Functions,” Economics Bulletin, 3, 1–16.

Chakravarty, S. R. (1988), Ethical Social Index Numbers, Berlin: Springer-Verlag.

Davidson, R., and Duclos, J. Y. (1997), “Statistical Inference for the Measurement of the Incidence of Taxes and Transfers,” Econometrica, 65, 1453–1465.

Deaton, A. (2003), “Measuring Poverty in a Growing World (or Measuring Growth in a Poor World),” Working Paper 9822, National Bureau of Economic Research.

Donaldson, D., and Weymark, J. A. (1983), “Ethically Flexible Gini Indices for Income Distributions in the Continuum,” Journal of Economic Theory, 29, 353–358.

Durbin, J. (1954), “Errors in Variables,” Review of International Statistical Institute, 22, 23–32.

Garner, T. I. (1993), “Consumer Expenditures and Inequality: An Analysis Based on Decomposition of the Gini Coefficient,” Review of Economics and Statistics, 75, 134–138.

Gregory-Allen, R., and Shalit, H. (1999), “The Estimation of Systematic Risk Under Differentiated Risk Aversion: A Mean-Extended Gini Approach,” Review of Quantitative Finance and Accounting, 12, 135–157.


Groves, R. M., and Couper, M. P. (1998), Non Response in Household Interview Surveys, New York: Wiley.

Groves, R. M., Dillman, D. A., Eltinge, J. L., and Little, J. A. (eds.) (2002), Survey Non Response, New York: Wiley.

Heckman, J. J., Urzua, S., and Vytlacil, E. (2006), “Understanding Instrumental Variables in Models With Essential Heterogeneity,” Review of Economics and Statistics, 88, 389–432.

Hettmansperger, T. P. (1984), Statistical Inference Based on Ranks, New York: John Wiley & Sons.

Hoeffding, W. (1948), “A Class of Statistics With Asymptotically Normal Distribution,” The Annals of Mathematical Statistics, 19, 293–325.

Kalton, G., and Flores-Cervantes, I. (2003), “Weighting Methods,” Journal of Official Statistics, 19, 81–97.

Kantorowitz, M. (2002), “Reducing Nonresponse Bias of Income Estimates by Integrating Data From Households’ Expenditure and Income Surveys,” paper presented at the International Conference on Improving Surveys (ICIS), Copenhagen, Denmark.

Koenker, R., and Bassett, G., Jr. (1978), “Regression Quantiles,” Econometrica, 46, 33–50.

Lerman, R., and Yitzhaki, S. (1994), “The Effect of Marginal Changes in Income Sources on U.S. Income Inequality,” Public Finance Quarterly, 22, 403–417.

McKean, J. W., and Hettmansperger, T. P. (1978), “A Robust Analysis of the General Linear Model Based on One Step R-estimates,” Biometrika, 65, 571–579.

Millimet, D. L., and Slottje, D. (2002), “An Environmental Paglin–Gini,” Applied Economics Letters, 9, 271–274.

Mistiaen, J. A., and Ravallion, M. (2003), Survey Compliance and the Distribution of Income, Washington, DC: The World Bank.

Olkin, I., and Yitzhaki, S. (1992), “Gini Regression Analysis,” International Statistical Review, 60, 185–196.

Powell, J. L., Stock, J. H., and Stoker, T. S. (1989), “Semiparametric Estimation of Index Coefficient,” Econometrica, 57, 1403–1430.

Randles, R. H., and Wolfe, D. A. (1979), Introduction to the Theory of Nonparametric Statistics, New York: Wiley.

Schechtman, E., and Yitzhaki, S. (1987), “A Measure of Association Based on Gini’s Mean Difference,” Communications in Statistics, Theory and Methods, 16, 207–231.

Schechtman, E., and Yitzhaki, S. (1999), “On the Proper Bounds of the Gini Correlation,” Economics Letters, 63, 133–138.

Schechtman, E., and Yitzhaki, S. (2003), “A Family of Correlation Coefficients Based on Extended Gini,” Journal of Economic Inequality, 1, 129–146.

Serfling, R. J. (1980), Approximation Theorems of Mathematical Statistics, New York: Wiley.

Serfling, R., and Xiao, P. (2006), “Multivariate L-Moments,” mimeo, revised version, Journal of Multivariate Analysis, to appear.

Shalit, H., and Yitzhaki, S. (2002), “Estimating Beta,” Review of Quantitative Finance and Accounting, 18, 95–118.

Shao, J., and Tu, D. (1996), The Jackknife and Bootstrap, New York: Springer.

Wodon, Q., and Yitzhaki, S. (2002), “Inequality and Social Welfare,” in PRSP Sourcebook, ed. J. Klugman, Washington DC: The World Bank, pp. 75–104.

Yitzhaki, S. (1982), “Stochastic Dominance, Mean Variance, and the Gini’s Mean Difference,” American Economic Review, 72, 178–185.

Yitzhaki, S. (1983), “On an Extension of the Gini Inequality Index,” International Economic Review, 24, 617–628.

Yitzhaki, S. (1991), “Calculating Jackknife Variance Estimators for Parameters of the Gini Method,” Journal of Business & Economic Statistics, 9, 235–239.

Yitzhaki, S. (1996), “On Using Linear Regressions in Welfare Economics,” Journal of Business & Economic Statistics, 14, 478–486.

Yitzhaki, S. (1998), “More Than a Dozen Alternative Ways of Spelling Gini,” Research on Economic Inequality, 8, 13–30.

Yitzhaki, S. (2002), “Do We Need a Separate Poverty Measurement,” European Journal of Political Economy, 18, 61–85.

Yitzhaki, S. (2003), “Gini’s Mean Difference: A Superior Measure of Variability for Non-Normal Distributions,” Metron, 61, 285–316.

Yitzhaki, S., and Schechtman, E. (2005), “The Properties of the Extended Gini Measures of Variability and Inequality,” Metron, 63, 401–433.