46
REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild and Florian Scheuer We consider optimal redistribution in a model where individuals can self-select into one of several possible sectors based on heterogeneity in a multi- dimensional skill vector. We first show that when the government does not observe the sectoral choice or underlying skills of its citizens, the constrained Pareto frontier can be implemented with a single nonlinear income tax. We then characterize this optimal tax schedule. If sectoral inputs are comple- ments, a many-sector model with self-selection leads to optimal income taxes that are less progressive than the corresponding taxes in a standard single- sector model under natural conditions. However, they are more progressive than in canonical multisector economies with discrete types and without occu- pational choice or overlapping sectoral wage distributions. JEL Codes: H2, D5, D8, E2, E6, J3, J6. I. Introduction The Roy (1951) model of self-selection is one of the workhorse models in labor economics. It has been used to study immigration and locational choice (Borjas 1987; Dahl 2002), schooling (Willis and Rosen 1979), choice of occupation or industry (Heckman and Sedlacek 1985, 1990), employment in union versus nonunion (Lee, 1978) and private versus public sectors (Borjas 2002), female labor force participation (Heckman 1974), training program participation (Ham and LaLonde 1996), and the growth-retarding impact of racial and gender discrimination in labor markets (Hsieh et al. 2011), for example. Its essential feature is that individuals optimally self-select into one of several sectors based on which one affords them the highest returns. One would expect this sort of self-selection to have important implications for the design of redistributive income taxes. It is surprising, then, that these implications have not been *We are grateful to the editor, Robert Barro, three anonymous referees, as well as Philippe Chone ´, Craig Brett, Emmanuel Farhi, John Weymark, and seminar participants at the 2012 Taxation Theory Conference (Vanderbilt), SED Annual Meeting (Limassol), NTA Annual Meeting (Providence), 2013 ASSA Meetings (San Diego), University of California Berkeley, University of Georgia, University of Michigan, and LMU Munich for helpful comments. All remaining errors are our own. This article is a significantly revised version of the earlier working paper that circulated under the title ‘‘Entrepreneurial Taxation and Occupational Choice.’’ ! The Author(s) 2012. Published by Oxford University Press, on behalf of President and Fellows of Harvard College. All rights reserved. For Permissions, please email: [email protected] The Quarterly Journal of Economics (2013), 623–668. doi:10.1093/qje/qjs076. Advance Access publication on December 19, 2012. 623 at Fondation Nationale Des Sciences Politiques on January 30, 2015 http://qje.oxfordjournals.org/ Downloaded from

I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Embed Size (px)

Citation preview

Page 1: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

REDISTRIBUTIVE TAXATION IN THE ROY MODEL*

Casey Rothschild and Florian Scheuer

We consider optimal redistribution in a model where individuals canself-select into one of several possible sectors based on heterogeneity in a multi-dimensional skill vector. We first show that when the government does notobserve the sectoral choice or underlying skills of its citizens, the constrainedPareto frontier can be implemented with a single nonlinear income tax.We then characterize this optimal tax schedule. If sectoral inputs are comple-ments, a many-sector model with self-selection leads to optimal income taxesthat are less progressive than the corresponding taxes in a standard single-sector model under natural conditions. However, they are more progressivethan in canonical multisector economies with discrete types and without occu-pational choice or overlapping sectoral wage distributions. JEL Codes: H2, D5,D8, E2, E6, J3, J6.

I. Introduction

The Roy (1951) model of self-selection is one of the workhorsemodels in labor economics. It has been used to study immigrationand locational choice (Borjas 1987; Dahl 2002), schooling (Willisand Rosen 1979), choice of occupation or industry (Heckman andSedlacek 1985, 1990), employment in union versus nonunion(Lee, 1978) and private versus public sectors (Borjas 2002),female labor force participation (Heckman 1974), trainingprogram participation (Ham and LaLonde 1996), and thegrowth-retarding impact of racial and gender discrimination inlabor markets (Hsieh et al. 2011), for example. Its essentialfeature is that individuals optimally self-select into one of severalsectors based on which one affords them the highest returns.One would expect this sort of self-selection to have importantimplications for the design of redistributive income taxes.It is surprising, then, that these implications have not been

*We are grateful to the editor, Robert Barro, three anonymous referees,as well as Philippe Chone, Craig Brett, Emmanuel Farhi, John Weymark, andseminar participants at the 2012 Taxation Theory Conference (Vanderbilt), SEDAnnual Meeting (Limassol), NTA Annual Meeting (Providence), 2013 ASSAMeetings (San Diego), University of California Berkeley, University of Georgia,University of Michigan, and LMU Munich for helpful comments. All remainingerrors are our own. This article is a significantly revised version of the earlierworking paper that circulated under the title ‘‘Entrepreneurial Taxation andOccupational Choice.’’

! The Author(s) 2012. Published by Oxford University Press, on behalf of Presidentand Fellows of Harvard College. All rights reserved. For Permissions, please email:[email protected] Quarterly Journal of Economics (2013), 623–668. doi:10.1093/qje/qjs076.Advance Access publication on December 19, 2012.

623

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 2: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

studied formally heretofore. This article takes a step towardunderstanding them by analyzing optimal Mirrleesian incometaxation in a two-sector Roy model.

Incorporating self-selection a la Roy in an optimal taxationframework a la Mirrlees raises some challenges. In theMirrleesian approach, the government effectively uses incometaxes to screen individuals based on their unobserved skill (orwage). When individuals can choose among multiple sectors,the underlying skill is naturally multidimensional: Each individ-ual has a skill in each possible sector. It is well known that multi-dimensional screening problems are typically challenging(Rochet and Chone 1998). We show that the particular screeningproblem that arises in a many-sector model of optimal incometaxation is tractable despite the underlying multidimensionalheterogeneity.

We do this by demonstrating that a single nonlinear incometax schedule is sufficient for implementing any incentive compat-ible allocation if the government cannot or does not want tocondition taxes on sectoral choice. With a single tax schedule,allocations only depend on the realized wage, not on sectoralchoice per se, and hence the multidimensional screening prob-lem ‘‘collapses’’ into an almost standard single-dimensional‘‘screening on wages’’ problem. The tools developed by Mirrlees(1971) and others therefore apply, with one important difference:The wage distribution will typically be endogenous when thereare many sectors to choose from. This is because the productivityof effort in any given sector will, in general, depend on the aggre-gate effort expended in this and in other sectors. We characterizeoptimal taxes accounting for this endogeneity and show thatunder natural and general assumptions, it implies a force forless progressive taxation relative to a world with a single sectorand an exogenous wage distribution.

The basic intuition for this result can be understood asfollows. Suppose that there are two sectors, a ‘‘blue-collar’’ anda ‘‘white-collar’’ sector, and individuals are free to choose to workin either. The government aims at redistributing from high- tolow-income individuals and designs an income tax system accord-ingly. For administrative, informational, or political reasons, itmay not distinguish between the two sectors: A white-collar andblue-collar worker earning the same income y pay the same taxT(y). Even if sectoral choice is observable, such a restriction couldresult from the fact that individuals can rather easily relabel

QUARTERLY JOURNAL OF ECONOMICS624

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 3: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

their type of occupation and would have strong incentives to do soin the presence of differential taxation, leading to distortions.Unless the government collects detailed information about thekind of tasks that individuals do, such a reclassification may behard or costly to prevent, which motivates our focus on uniformincome taxation. It also corresponds to the actual tax systems inmany countries, where the income tax is typically not sector- oroccupation-specific.1

With a linear production technology (i.e., when equivalentunits of white- and blue-collar efforts are perfect substitutes),the fact that there are two sectors would be irrelevant (unlessthe government has an intrinsic sectoral preference).Individuals would choose to work in the sector in which theyare more productive, as reflected in their wage, and this choiceand the resulting wage distribution would be independent oftax policy. The optimal tax would therefore be exactly the sameas it would be in a single-sector Mirrlees model with the samewage distribution.

Contrast this with the case in which the two sectors are grosscomplements. In this case, sectoral choices and wages will beendogenous to tax policy. Lowering taxes at income levels thatare dominated by white-collar workers, for example, will differ-entially encourage white-collar effort. This will reduce the mar-ginal productivity of white-collar effort (assuming diminishingmarginal products within a given sector) and raise the marginalproductivity of blue-collar effort (by complementarity across sec-tors). It therefore indirectly redistributes from white- toblue-collar workers.

Suppose now that the blue-collar sector is the low-incomesector, that is, that there are disproportionately more white-collar workers at higher incomes. Then this indirect redistribu-tion channel will lead the government to choose a tax system thatis less progressive than in a Mirrleesian world with exogenouswages: Lowering taxes on high earners will disproportionatelyspur effort in the white-collar sector, which will indirectly redis-tribute from the relatively high-income white-collar workersto the lower-income blue-collar workers by raising blue-collarwages and lowering white-collar wages. Similarly, raising taxes

1. There are exceptions to this pattern, for example, the tax treatment ofthe self-employed versus employed in some countries. See Scheuer (2012a,2012b) for an analysis of differential taxation of entrepreneurs versus workers.

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 625

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 4: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

on lower earners will differentially discourage effort in theblue-collar sector, again increasing their wage. This indirectredistribution channel (sometimes referred to as ‘‘trickle-down’’effects, because lower earners can benefit from tax cuts on higherearners) therefore leads to less progressive taxes. For example, itimplies an optimal top income tax rate that will generally benegative when the skill distribution is bounded—that is, strictlybelow the well-known zero top rate result.

This result does not say taxes should be regressive per se.Rather, it says that optimal taxes will be less progressive thanthey would be in the alternative allocation that would obtain ifthe endogeneity of wages implied by a multisector Roy modelwere neglected. Making such a comparison requires formalizingthis alternative allocation. We use the notion of a self-confirmingpolicy equilibrium (SCPE), developed for a different context byRothschild and Scheuer (2011), for this. An SCPE describes thetax system that would emerge in the same economy if the govern-ment assumed that it was operating in a standard exogenous-wage world. In such a world, a government would, following thestandard approach in public finance pioneered by Saez (2001),infer an underlying skill distribution from the income distribu-tion that it observes given an existing tax system. Taking thisskill distribution as given, it would then design the optimalincome tax system. In an SCPE, this newly computed optimalincome tax system would coincide with the existing tax system,thus ‘‘confirming’’ its optimality. Our results show that taxes insuch an SCPE are not, in fact, optimal in a multisector economy,since the wage-cum-skill distribution is not, in fact, exogenous.In particular, the optimal taxes would be less progressive.

I.A. Related Literature

Most closely related to our analysis is Stiglitz (1982), whoconsiders optimal nonlinear taxation in a two-type model withendogenous wages but without occupational choice. He alsoshows that progressive redistributional motives will lead the op-timal top marginal tax rate to be negative when the two types’efforts are complements. The indirect redistribution channeldriving his results are similar to those driving ours. Our modeldiffers in two significant ways, however. First, our continuoustype model allows us to study the progressivity of the entire taxschedule, rather than just the top marginal tax rate. Second, we

QUARTERLY JOURNAL OF ECONOMICS626

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 5: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

identify several extra effects that arise in a general Roy model,effects which result from (1) endogenous occupational choice and(2) the fact that the sectoral wage distributions will typicallyoverlap in a general model with continuous types, whereasStiglitz’s discrete type model generically—and somewhat unreal-istically—rules out workers in different sectors earning the samewage.

We show that these extra effects mitigate the general equi-librium effects of taxation found in Stiglitz (1982) and thereforemake optimal taxes more progressive than in a discrete typemodel without occupational choice. To understand why, supposewe reduce taxes at the top to increase the effort of the top earners.This is desirable insofar as it indirectly redistributes from high tolow incomes by raising the wages of workers in the low-wagesector and lowering the wages of workers in the high-wagesector. When there is endogenous occupational choice, however,there is an additional effect: The change in wages leads someindividuals to shift out of the high-wage into the low-wagesector. This undoes some of the original increase in aggregateeffort in the high-wage sector and blunts the desirable effects ofthe original reduction in taxes. As a result, the optimal progres-sivity of the tax schedule in our general Roy model is boundedbetween a standard Mirrlees model at the progressive end andStiglitz’s (1982) model at the regressive end.

Naito (1999) has studied the role of sector-specific incometaxes in Stiglitz’s (1982) two-type model and observed that intro-ducing additional commodity taxes or production inefficienciescan be desirable when it relaxes incentive constraints by manip-ulating relative wages, thus contradicting the results by Atkinsonand Stiglitz (1976) and Diamond and Mirrlees (1971). We do notconsider sector-specific or commodity taxation in this article. Saez(2004) has contrasted Stiglitz’s model (with fixed occupations butflexible hours) to a model with fixed hours but flexible occupa-tional choice and shown that the standard Diamond and Mirrlees(1971) optimal tax formulas apply and that Naito’s results ceaseto hold in the latter setting. Saez’s results depend on the assump-tion that each wage corresponds to a unique occupation and thereis no intensive margin. In contrast, we allow the wage distribu-tions of different occupations to overlap and consider bothendogenous occupational choice and intensive labor supply.

Our article also relates to earlier research on optimal incometaxation in models with endogenous wages and occupational

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 627

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 6: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

choice, such as Feldstein (1973), Zeckhauser (1977), Allen (1982),Boadway, Marceau, and Pestieau (1991), and Parker (1999). Thisliterature has largely restricted attention to linear taxation. Anexception is the work by Moresi (1997), who considers nonlineartaxation of profits in a model of occupational choice betweenworkers and entrepreneurs. The occupational choice margin inhis model is considerably simplified, however, and heterogeneityis confined to affect one occupation only, not the other.

Restricting heterogeneity to affect one occupation only or taxschedules to be linear sidesteps the complexities of multidimen-sional screening, which emerge naturally in the present model.Few studies in the optimal taxation literature have attempted todeal with multidimensional screening problems until recently.Kleven, Kreiner, and Saez (2009) have made progress alongthese lines in a study of the optimal income taxation of couples,and Scheuer (2012b) has considered the taxation of entrepre-neurs and workers with occupational choice, but the seconddimension of heterogeneity in their models takes a specializedadditive form.2 Chone and Laroque (2010) and Brett andWeymark (2003) use an exogenous ‘‘type aggregator’’ akin toour technique, which can be interpreted as an endogenous typeaggregator. Moreover, the types of heterogeneity they study(tastes for labor and education, respectively) are again quite dis-tinct from the multidimensional skill types that arise naturallyin our Roy model.

The methodological approach that we pursue here to charac-terize redistributive taxation in the Roy model builds onRothschild and Scheuer (2011), who consider optimal income tax-ation in an economy with a productive and a rent-seeking sector.While that paper shares the overall structure of two-dimensionalskill heterogeneity and occupational choice between two sectors,the emphasis is on the corrective role of income taxation in asetting where wages deviate from the social marginal product ofeffort due to rent-seeking externalities, issues that are absentfrom the present framework.

More generally, this article follows the large optimal incometaxation literature building on the seminal contributionsof Mirrlees (1971) and Diamond (1998). Until recently, the theor-etical literature focused on deriving results for a given skill

2. See also Brett (2007) for a four-type model of family taxation withtwo-dimensional heterogeneity.

QUARTERLY JOURNAL OF ECONOMICS628

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 7: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

distribution and social welfare function. Saez (2001) insteadinferred skills and optimal taxes from the observed income dis-tribution, and Laroque (2005), Werning (2007), and Chone andLaroque (2010) explored the conditions under which an existingset of taxes is potentially Pareto efficient. In the same spirit, wecharacterize Pareto efficient tax policies rather than focusing on aparticular social welfare function. With multiple complementarysectors, however, the wage distribution is endogenous to the taxcode, so existing tests for optimality—for example, Werning(2007), who infers wage-cum-skill distributions from income dis-tributions as a test of optimality—are potentially misleading. Onemight conclude that the tax code is indeed Pareto efficient giventhe inferred skill distribution under the (implicit and incorrect)assumption that the skill distribution is independent of taxes.Our concept of an SCPE captures this situation. It is related tothe literature on self-confirming equilibria in learning models(e.g., Sargent 2008; and Fudenberg and Levine 2009).

The article proceeds as follows. Section II describes the basicmodel and shows that a single nonlinear income tax is a fullygeneral policy tool for a government that observes income butnot effort, wage, or sectoral choice. Section III characterizes andcompares optimal and SCPE nonlinear taxes. Section IV com-pares our results to Stiglitz (1982) and points out the novel roleof occupational choice and overlapping wage distributions in ourmodel. Section V illustrates our theoretical results with a simpleempirical calibration. Section VI extends our results to allow forunbounded skill distributions and additional across-occupationcost heterogeneity. Section VII concludes. All proofs are in theAppendix.

II. The Model

II.A. Setup

We consider an economy where a unit mass of individuals canchoose between working in either of two sectors. Accordingly, in-dividuals have a two-dimensional skill type ð�, ’Þ 2 ��� with� ¼ ½�, �� and � ¼ ½’, ’�. � captures an individual’s productivitywhen working in the �-sector and ’ captures her �-sector skill.These skills are distributed in the population according to a con-tinuous two-dimensional cumulative distribution function (cdf)Fð�, ’Þ with density f ð�, ’Þ. In Section VI, we demonstrate how

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 629

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 8: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

our results extend to the case where individuals also differ intheir cost of working in one of the sectors, capturing, for example,differential education requirements or tastes for differentoccupations.

Individuals have preferences over consumption c andeffort e captured by the strictly concave utility function Uðc, eÞwith Uc > 0, Ue < 0. We respectively denote the consumption,effort, utility, and sector assigned to an individual of type ð�, ’Þ by

cð�, ’Þ, eð�, ’Þ, Vð�, ’Þ � Uðcð�, ’Þ, eð�, ’ÞÞ, and Sð�, ’Þ 2 f�, �g:

The technology in the economy is described by a constantreturns to scale aggregate production function YðE�, E’Þ that com-bines the skill-weighted aggregate effort in the two sectors toproduce the consumption good. Formally, aggregate efforts aregiven by

E� �

ZP����

�eð�, ’ÞdFð�, ’Þ and E’ �

Z���nP

’eð�, ’ÞdFð�, ’Þ

with P � ð�, ’ÞjSð�, ’Þ ¼ �� �

: Because the technology is linearhomogeneous, the marginal products only depend on the ratioof aggregate effort in the two sectors x � E�

E’and are therefore

denoted by Y�ðxÞ and Y’ðxÞ. We define an individual’s wage w ashis marginal return to effort, so that

wð�, ’Þ ¼Y�ðxÞ� if Sð�, ’Þ ¼ �,Y’ðxÞ’ if Sð�, ’Þ ¼ �:

�ð1Þ

An individual’s income is then given by yð�, ’Þ � wð�, ’Þeð�, ’Þ.As is standard, we assume Uðc, eÞ to satisfy the single-crossingproperty, that is, the marginal rate of substitution between

income and consumption,�Ueðc, y

wUcðc, ywÞ

, is decreasing in w.

II.B. Implementation

We start by characterizing a general, direct implementation,where individuals announce their privately known type ð�, ’Þand then get assigned consumption cð�, ’Þ, income yð�, ’Þ and asector to work in Sð�, ’Þ. Assuming that income and consumptionare observable but not sectoral choice, wage, or effort, theresulting incentive constraints that guarantee truth-telling ofthe agents are as follows. First, suppose Sð�, ’Þ ¼ �, that is,

QUARTERLY JOURNAL OF ECONOMICS630

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 9: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

type ð�, ’Þ is assigned to the �-sector. Then incentive compatibil-ity requires that

U cð�, ’Þ,yð�, ’Þ

Y�ðxÞ�

� �� max U cð�0,’0Þ,

yð�0,’0Þ

Y�ðxÞ�

� �, U cð�0,’0Þ,

yð�0,’0Þ

Y’ðxÞ’

� �� �8�0, ’0

because there are two ways for type ð�, ’Þ to imitate another typeð�0, ’0Þ, namely by earning ð�0, ’0Þ’s income either in the �- or the�-sector. Analogously, if Sð�, ’Þ ¼ �, we need

U cð�, ’Þ,yð�,’Þ

Y’ðxÞ’

� �� max U cð�0,’0Þ,

yð�0, ’0Þ

Y�ðxÞ�

� �, U cð�0,’0Þ,

yð�0, ’0Þ

Y’ðxÞ’

� �� �8�0,’0:

The assumption that only income but not the wage or effort isobservable has been standard in optimal income tax models sinceMirrlees (1971). Even though sectoral choice may be observablein some applications, it is worth emphasizing again that condi-tioning allocations on occupations may be hard to enforce in prac-tice. This motivates our approach of treating occupational choiceas de facto noncontractible for the government. The followinglemma shows that under these assumptions, any incentivecompatible allocation can be decentralized by offering a singlenonlinear income tax TðyÞ:

LEMMA 1. Suppose that only income y and consumption c areobservable, whereas an individual’s skill type ð�, ’Þ, effort eand sectoral choice S are private information. Then for anyincentive compatible allocation fcð�, ’Þ, yð�, ’Þ, Sð�, ’Þ, xg, thefollowing properties hold:

i wð�, ’Þ ¼ maxfY�ðxÞ�, Y’ðxÞ’g and

Sð�, ’Þ ¼� if Y�ðxÞ� > Y’ðxÞ’,� if Y’ðxÞ’ > Y�ðxÞ�;

�ð2Þ

ii U�cð�, ’Þ, yð�,’Þ

w

¼ U

�cð�0, ’0Þ, yð�0, ’0Þ

w

for all ð�, ’Þ, ð�0, ’0Þ such

that wð�, ’Þ¼wð�0, ’0Þ ¼ w;iii fcð�, ’Þ, yð�, ’Þ, Sð�, ’Þ, xg can be implemented by offering the

single nonlinear income tax schedule T�ðyÞ correspondingto the retention function R�ðyÞ ¼ y� T�ðyÞ defined by

R�ðyÞ � maxc

c U cð�,’Þ,yð�, ’Þ

wð�,’Þ

� �� U c,

y

wð�, ’Þ

� �8ð�, ’Þ 2 ���

� �ð3Þ

and letting all agents choose one of their most preferred (c, y)-bundles from the resulting budget set B� ¼ fðc, yÞjc y� T�ðyÞg.

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 631

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 10: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Proof. See Appendix. #

In other words, individuals work in the sector in which theyearn a higher wage, any two individuals with the same wageachieve the same utility, and the so-called principle of taxationextends to our setting. The first two properties follow directlyfrom incentive compatibility. For part iii, the retention functionR�ðyÞ constructed in (3) to implement a given incentive compat-ible allocation is the lower envelope of the indifference curves ofall individuals through the consumption-income bundle assignedto them. Given these properties, the direct mechanism boils downto allocating (c, y)-pairs to a set of individuals with differentwages, as in a standard optimal taxation problem.

Lemma 1 does not rule out the possibility that two individ-uals with the same wage (e.g., in different sectors) choose differ-ent consumption-income bundles, even though, by property ii,these bundles must be located on the same indifference curve inthe (c, y)-space. Because the type distribution F(�, ’) is continu-ous, we can nevertheless restrict attention to allocations that poolall same-wage individuals at the same allocation. To see why,observe that for any w such that two types who both earn wagew receive distinct allocations ðc1, y1Þ and ðc2, y2Þ, the retentionfunction R�ðyÞ described in (3) and the wage-w indifferencecurve must coincide between y1 and y2. The number of suchwages w must therefore be countable and the measure of typesat such wages must be zero. Pooling all individuals at each suchw at a bundle ðR�ðyÞ, yÞ with y between y1 and y2 is thereforeincentive compatible and resource feasible and, moreover, affectsneither the aggregate effort in either sector nor, a fortiori, x.Because pooling in this manner does not alter any type’s utility,the Pareto frontier can be traced out by using only allocationsfcðwÞ, yðwÞ, xg that pool all same-wage individuals and wherewages w and sectoral choice Sð�, ’Þ are as described inLemma 1. We focus on such allocations henceforth.

III. Characterizing Optimal Income Taxes

III.A. Pareto Optima and Self-Confirming Equilibria

For any given x, the marginal productivities in the two sec-tors, Y�ðxÞ and Y’ðxÞ, the two-dimensional skill distribution,Fð�, ’Þ, and the implied sectoral choice described in (2) together

QUARTERLY JOURNAL OF ECONOMICS632

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 11: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

induce a one-dimensional wage distribution characterized by thecdf and sectoral densities

FxðwÞ � Fw

Y�ðxÞ,

w

Y’ðxÞ

� �and

f �x ðwÞ ¼1

Y�ðxÞ

Z wY’ðxÞ

fw

Y�ðxÞ,’

� �d’, f ’x ðwÞ ¼

1

Y’ðxÞ

Z wY� ðxÞ

f �,w

Y’ðxÞ

� �d�,

with associated cdfs F�xðwÞ and F’

x ðwÞ and with fxðwÞ ¼ f �x ðwÞþf ’x ðwÞ. We also define the bottom and top wages given x as

wx ¼ max Y�ðxÞ�, Y’ðxÞ’n o

, wx ¼ max Y�ðxÞ�, Y’ðxÞ’� �

:

To trace out the Pareto frontier, we assign Pareto weights Gð�, ’Þin the two-dimensional skill space. As with the type distributionFð�, ’Þ, we can integrate the Pareto weights Gð�, ’Þ to obtain adistribution of Pareto weights over wages given x, denoted byGxðwÞ, with the corresponding densities gxðwÞ ¼ g�xðwÞ þ g’xðwÞ(and cdfs G�

xðwÞ and G’xðwÞ).

Fixing x, the optimal income tax problem in a Roy model,thus collapses to a one-dimensional screening problem despitethe underlying two-dimensional heterogeneity in the economy.In particular, the Pareto problem is an almost standardMirrlees problem with the additional constraint that the individ-uals’ efforts and sectoral choices must be consistent with x.Formally, we must require that

~xðxÞ �

Rwx

w x

wY�ðxÞ

eðwÞ f �x ðwÞdwRwx

w x

wY’ðxÞ

eðwÞ f ’x ðwÞdw¼ x, or equivalentlyð4Þ

ð1� �ðxÞÞ

Z wx

w x

weðwÞ f �x ðwÞdw� �ðxÞ

Z wx

wx

weðwÞ f ’x ðwÞdw ¼ 0,ð5Þ

where �ðxÞ denotes the aggregate income share of the �-sector:

�ðxÞ �xY�ðxÞ

xY�ðxÞ þ Y’ðxÞ¼

Y�ðxÞE�

YðE�, E’Þ:

We employ the standard approach of optimizing directlyover cðwÞ, yðwÞ-allocations. In fact, it is equivalent but formally

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 633

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 12: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

more useful to optimize over eðwÞ, VðwÞ-bundles, whereVðwÞ ¼ UðcðwÞ, eðwÞÞ and eðwÞ ¼ yðwÞ

w . With this, we can write thePareto problem for income taxation in the Roy model as

maxx

WðxÞ � maxeðwÞ, VðwÞ

Z wx

w x

VðwÞdGxðwÞð6Þ

subject to (5),

V 0ðwÞ þUeðcðVðwÞ, eðwÞÞ, eðwÞÞeðwÞ

w¼ 0 8w 2 ½wx, wx�,ð7Þ

and Z wx

wx

weðwÞ � cðVðwÞ, eðwÞÞð Þ fxðwÞdw � 0,ð8Þ

where cðV, eÞ denotes the inverse of Uð:Þ with respect to its firstargument. We refer to the three constraints (5), (7), and (8) as theconsistency condition, the incentive constraints and the resourceconstraint, respectively.3

As suggested by the notation in (6), it is useful to decomposethe Pareto problem (5) to (8) into an ‘‘inner’’ problem, given x,and an ‘‘outer’’ problem, which maximizes over x. Formally, fixsome x and let W(x) denote the value of the objective (6) whenmaximizing over V(w), e(w) subject to (5), (7), and (8) (the innerproblem). Then the outer problem is simply maxx WðxÞ.

For some of the subsequent analysis, it will be useful torestrict attention to solutions of the Pareto problem when thePareto weights have the special form Gð�, ’Þ ¼ �ðFð�, ’ÞÞ: It is

straightforward to show that g�xðwÞf �x ðwÞ¼

g’x ðwÞf ’x ðwÞ

and GxðwÞ ¼ �ðFxðwÞÞ

for such weights and any given x. The same welfare weight isthus assigned to any two individuals with the same (endogen-ously determined) wage, and we can think of � as a measure ofhow much social welfare weight is attached to different quantiles

3. Note that we have made use of the local version of the incentive constraintsin (7). It is a standard result (see e.g. Fudenberg and Tirole 1991, Theorems 7.2 and7.3) that, under the single-crossing condition, the local incentive constraints (7)together with the monotonicity constraint that income y(w) must be nondecreasingin w are equivalent to the global incentive constraints VðwÞ ¼ UðcðwÞ, eðwÞÞ ¼maxw0 Uðcðw0Þ,

eðw0 Þw0

w Þ for all w, w0. We follow the usual approach of droppingthe monotonicity constraint and verifying ex post that the solution satisfies it,abstracting from issues of bunching.

QUARTERLY JOURNAL OF ECONOMICS634

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 13: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

of the (endogenous) wage distribution. We refer to such weightsas relative welfare weights and use � instead of G to distinguishthis special case from the general one.

To compare the Pareto optimal tax schedules described so farwith the corresponding optimal tax schedule in a standardMirrlees model, we solve the following benchmark inner-outerproblem decomposition using relative welfare weights �. For agiven x, we consider the relaxed inner problem:

maxeðwÞ, VðwÞ

Z wx

wx

VðwÞd�ðFxðwÞÞð9Þ

subject to (7) and (8)—that is, we drop the consistency constraint(5) and solve a standard Mirrleesian problem taking the wagedistribution Fx(w) as given. To ensure that the resulting alloca-tion is consistent with the original economy, however, we need torequire that the ~x induced by the solution to the inner problem isequal to the original x that was taken as given to solve the stand-ard Mirrlees problem. The ‘‘outer’’ problem thus requires that xis a fixed point of the mapping x! ~x�ðxÞ, where ~x�ðxÞ is definedby (4), evaluated at the effort allocation e(w) that solves therelaxed inner problem (7) to (9).

The solution ðTðyÞ, x�Þ to this benchmark problem is inter-pretable as SCPE as developed in a different context inRothschild and Scheuer (2011). Suppose that taxes happen tobe set at TðyÞ (thereby inducing x ¼ x�) and consider a social plan-ner with Pareto weights �ðFx� ðwÞÞ who takes the wage distribu-tion to be exogenous to the tax code, as in a standard Mirrleesmodel. This social planner would, following Saez (2001), observethe income distribution induced by TðyÞ and infer the underlyingwage distribution Fx� ðwÞ from it. It would then compute the opti-mal Mirrleesian taxes given � and Fx� ðwÞ and ‘‘confirm’’ theoptimality of the existing tax code TðyÞ. TðyÞ would not, in fact,be optimal in light of the endogeneity of wages to the tax code, buta social planner who takes wages to be exogenous would have noreason to explore other regions of the tax-policy space, given the‘‘confirmed’’ optimality of the status quo.4

4. Note the importance of using relative welfare weights here: The SCPE socialplanner is unaware of the multisector character of the economy, so using generalPareto weights Gð�, ’Þ would be inconsistent with the planner’s incorrect model ofthe economy.

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 635

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 14: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

DEFINITION 1. An SCPE is a value of x and an allocation V(w), e(w)such that (i) x is a fixed point of ~x�ðxÞ and (ii) V(w) and e(w)solve the inner SCPE problem given x.

Hence, our definition of an SCPE is the most straightforwardway of capturing standard Mirrleesian taxation in our frameworkin a consistent way. We compare SCPE and Pareto optima forgiven weights � in our numerical explorations in Section V.

III.B. Inner Problem

We start by using the inner problem, taking x as given, toderive formulas for marginal tax rates that have to be satisfied inany Pareto optimum.

PROPOSITION 1. Let � denote the multiplier on the resource con-straint (8), let �� denote the multiplier on the consistencycondition (5), and use "uðwÞ and "cðwÞ to denote the uncom-pensated and compensated labor supply elasticities, respect-ively. For any given Pareto weights G, optimal marginal taxrates satisfy

1� T0ðyðwÞÞ ¼ 1þ �f �x ðwÞ

fxðwÞ� �ðxÞ

� �� �1þ

�ðwÞ

wfxðwÞ

1þ "uðwÞ

"cðwÞ

� ��1

ð10Þ

where

�ðwÞ ¼

Z wx

w1�

gxðzÞ

fxðzÞ

UcðzÞ

� �exp

Z z

w1�

"uðsÞ

"cðsÞ

� �dyðsÞ

yðsÞ

� �fxðzÞdz:

ð11Þ

Proof. See Appendix. #

Since the inner SCPE problem differs from the inner Paretoproblem only by the absence of the consistency condition (5), themarginal tax rates for an SCPE can be found by using relativePareto weights � and setting � = 0 in Proposition 1, as in thefollowing Corollary.

COROLLARY 1. For any given relative Pareto weights �, marginaltax rates in the resulting SCPE satisfy

1� T0ðyðwÞÞ ¼ 1þ�ðwÞ

wfxðwÞ

1þ "uðwÞ

"cðwÞ

� ��1

,ð12Þ

QUARTERLY JOURNAL OF ECONOMICS636

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 15: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

where

�ðwÞ ¼

Z wx

w1��0ðFxðzÞÞ

UcðzÞ

� �exp

Z z

w1�

"uðsÞ

"cðsÞ

� �dyðsÞ

yðsÞ

� �fxðzÞdz:

ð13Þ

The formulas for an SCPE are the same as for a standardMirrlees model (see e.g. Saez 2001). In contrast, in any Paretooptimum, the formula for marginal keep shares 1� T0 is adjustedcompared to the SCPE by a correction factor that depends on �and a comparison between the aggregate income share of the

�-sector, given by a(x), with its local income share yðwÞ f �x ðwÞyðwÞ fxðwÞ

¼f �x ðwÞfxðwÞ

.5

This is intuitive. For instance, suppose � >0—which we showin Section III.D corresponds to the case where � is thehigh-income sector. Then the marginal keep share is scaled upin the Pareto problem relative to the SCPE whenever, at thegiven wage (or equivalently income) level, the local incomeshare of the �-sector exceeds its aggregate income share. Thisdisproportionately encourages �-sector effort and thereforeraises wages in the �-sector relative to the �-sector. Hence, thesolution to the Pareto problem uses this ‘‘trickle-down’’ channelthrough wages to redistribute to the low-income sector, which isdesirable for relative Pareto weights with �ðFÞ � F for allF 2 ½0, 1�. Note that this implies a force toward less progressivityin the Pareto optimum relative to the SCPE: If � is thehigh-income sector, marginal tax rates will be scaled up (down)in the Pareto problem compared to the SCPE for low (high)income levels.6

As usual, Z(w) captures the redistributive motives of thesocial planner as well as income effects. The optimal tax formulatherefore simplifies considerably if income effects disappearand preferences are of the quasilinear form Uðc, eÞ ¼ c� hðeÞ.

5. The formulas in Proposition 1 and Corollary 1 will eventually be evaluatedat the optimal x-values from solving the respective outer problems. Since, in gen-eral, the level of x in the SCPE and the Pareto optimum will differ for a giveneconomy, even when based on the same Pareto weights �, the formulas do notpermit a direct comparison of tax rates at the two solutions. One interpretation,however, is as a comparison of the tax rates in two different economies with thesame endogenous wage distributions.

6. The same would be true if � were the high-income sector. We show belowthat in this case, � is negative and hence, per equation (10), marginal tax rates willagain be higher (lower) in the Pareto optimum for low (high) income levels.

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 637

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 16: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Then UcðwÞ ¼ � ¼ 1 and "cðwÞ ¼ "uðwÞ � "ðwÞ, which leads to thefollowing corollary:

COROLLARY 2. With quasilinear preferences, the marginal tax ratein any Pareto optimum satisfies

1� T0ðyðwÞÞ ¼ 1þ �f �x ðwÞ

fxðwÞ� �ðxÞ

� �� �1þ

GxðwÞ � FxðwÞ

wfxðwÞ1þ

1

"ðwÞ

� �� ��1

:

Without income effects, the redistributive effect of taxationis simply given by �ðwÞ ¼ GxðwÞ � FxðwÞ, which is increasing inthe degree to which Gx(w) puts weight on low-wage earners overand above their population share Fx(w). The marginal tax rate isalso decreasing in the elasticity "(w), which captures the distor-tionary effects of taxation as in Diamond (1998). Again, the cor-rection factor �ðf �x ðwÞ=fxðwÞ � �ðxÞÞ comparing aggregate and localincome shares of the �-sector is applied to marginal keep shares.

Independent of whether preferences exhibit income effects,the top marginal tax rate is generally not zero in a Pareto opti-mum, as the following result demonstrates.

COROLLARY 3. The top marginal tax rate is zero in any SCPE andgiven by

T0ðyðwxÞÞ ¼ � �ðxÞ �f �x ðwxÞ

fxðwxÞ

� �in any Pareto optimum. In particular, if f �x ðwxÞ

fxðwxÞ¼ 0

ðrespectively f �x ðwxÞ

fxðwxÞ¼ 1Þ, then T0ðyðwxÞÞ ¼ ��ðxÞ respectivelyð

T0ðyðwxÞÞ ¼ ��ð1� �ðxÞÞÞ.

In the next subsections, we use the outer problem to deter-mine the sign of �, which will generally be such that this topmarginal tax rate is negative.

III.C. Outer Problem

Denoting the substitution elasticity of the production func-tion YðE�, E’Þ by �ðxÞ, we can derive the following decompositionof the welfare effect of a marginal change in x.

LEMMA 2. For any Pareto weights G, the welfare effect of a mar-ginal change in x can be decomposed as follows:

W0ðxÞ ¼ �1

x���ðxÞY’ðxÞE’ þ

1

�ðxÞI þ Rþ ��ðSþ CÞð Þ

� �,ð14Þ

QUARTERLY JOURNAL OF ECONOMICS638

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 17: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

where

S �1

Y�ðxÞY’ðxÞ

Z wx

w x

w2eðwÞ fw

Y�ðxÞ,

w

Y’ðxÞ

� �dw > 0,ð15Þ

I � �

Z wx

w x

�ðwÞwV 0ðwÞ

UcðwÞ

d

dw

f ’x ðwÞ

fxðwÞ

� �dw,ð16Þ

R �

Z wx

wx

wV 0ðwÞf �x ðwÞf

’x ðwÞ

fxðwÞ

g�xðwÞ

f �x ðwÞ�

g’xðwÞ

f ’x ðwÞ

� �dw,ð17Þ

and

C �

Z wx

w x

w2e0ðwÞf ’x ðwÞf

�x ðwÞ

fxðwÞdw:ð18Þ

Proof. See Appendix. #

We provide a heuristic derivation to illustrate the intuitionbehind this lemma and the terms in (15) to (18). Notice first thatif technology is linear, so that s(x) =1, then setting W0ðxÞ ¼ 0in (14) immediately implies � = 0. By Proposition 1, the marginaltax formulas for the SCPE and Pareto problems coincide. This isintuitive: In this case, wages are exogenous to the tax code, sothe fact that there are two sectors is irrelevant. It is only theadditional effects driven by the endogeneity of wages in thefinite s(x) case that provide scope for using additional tools foraccomplishing redistributive objectives.

In particular, observe that when s(x)<1, a small increase inx has several effects. First, it has a direct effect on the consistencycondition (5), or equivalently,

E�

E’� x ¼ 0:ð19Þ

Second, it has wage effects, because type (�,’)’s wage is given byw ¼ max �Y�ðxÞ, ’Y’ðxÞ

� �and hence depends on x. Finally, it has a

sectoral shift effect since it will induce some individuals to migratefrom the �- to the �-sector due to the decreased wages in the�-sector and increased wages in the �-sector. The term S in equa-tion (15) in Lemma 2 captures this sectoral shift effect, which we

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 639

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 18: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

derive first. We then return to a discussion of the other terms inequations (16) to (18), which are attributable to the direct andwage effects.

1. The Sectoral Shift Effect. To derive the sectoral shift effect,

it is useful to write the consistency condition as ð1� �ðxÞÞZ�ðxÞ�

�ðxÞZ’ðxÞ ¼ 0, where Z�ðxÞ �Rw

w weðwÞ f �x ðwÞdw is the income

earned in the �-sector, and similarly for Z’(x). Consider a smallincrease �x in x, holding efforts and wages constant. This willlead some individuals to shift from the �- to the �-sector, asillustrated in Figure I. Let �Z�ðxÞ denote the resulting changein �-sector income. Because there is an equal and oppositechange in �-sector income, the sectoral shift effect can be writtenas S ¼ �Z�ðxÞ: Figure I illustrates the computation of �Z�(x). Itconsiders the mass element of individuals with �-sector skillsbetween � and � þ d� who are in the �-sector at x but in the�-sector at x +� x. The height of this element is

d Y�ðxÞY’ðxÞ

� dx

��x ¼Y 0�ðxÞY’ðxÞ � Y�ðxÞY 0’ðxÞ

Y’ðxÞ2

!��x:

FIGURE I

Illustrating the Computation of the Sectoral Shift Effect

QUARTERLY JOURNAL OF ECONOMICS640

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 19: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

The income earned by each individual in that element is�Y�ðxÞe �Y�ðxÞð Þ, and the density of individuals is f �, �Y�ðxÞY’ðxÞ

� .

Multiplying the width (d�) by the height, the density, andthe per capita income, and then integrating over � gives:

�Z�ðxÞ ¼ �x

Z �

Y 0�ðxÞY’ðxÞ � Y�ðxÞY 0’ðxÞ

Y’ðxÞ2

!Y�ðxÞ�

2e �Y�ðxÞð Þ f �,�Y�ðxÞ

Y’ðxÞ

� �d�

¼ �x�1=ðx�ðxÞÞ

Y�ðxÞY’ðxÞ

Z w

ww2e wð Þ f

w

Y�ðxÞ,

w

Y’ðxÞ

� �dw,

where the second step involves changing variables to w ¼ �Y�ðxÞ

and using x�ðxÞ ¼Y 0’ðxÞ

Y’ðxÞ�

Y 0�ðxÞY�ðxÞ

(viz Lemma 3 in the Appendix). The

sectoral shift term S defined in expression (15) follows directly.

2. The Direct and Wage Effects. Toward understanding theother terms in Lemma 2, first consider the thought experimentof changing x—and hence the wage distribution—while holdingeffort and sector constant for each individual. The first term inexpression (14) is equal to the effect of this change on the consist-ency condition (5), which can be seen by observing that the effecton the reformulated constraint (19) is –1 and that the original andreformulated constraints differ by a factor of �ðxÞY’ðxÞE’

1x :

This thought experiment is not actually feasible, however,because it requires assigning two individuals with the samewage (but in different sectors) different eðwÞ, VðwÞ-bundles afterthe change in x. The approach we take in the formal proof in theAppendix is instead motivated by a related but feasible thoughtexperiment: increase x by a small amount while holding the e(w)and V(w) schedules fixed at each wage w rather than for eachtype.7

This thought experiment leads to changes in aggregateefforts and utilities, eaggðwÞ � eðwÞfxðwÞ and VaggðwÞ � VðwÞfxðwÞ,at any given w. The algebraic manipulations in our proof aredriven by thinking of the thought experiment in two steps: first,a change in the schedules e(w) and V(w) at the original x and wagedistribution that absorbs the aggregate change in eaggðwÞ andVaggðwÞ induced by the change in x at each wage; and second, a

7. An alternative approach, leading to the same results, is to break the incen-tive constraint into two sector-specific constraints. See Rothschild and Scheuer(2011) for an example of this approach in a related problem.

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 641

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 20: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

change in x—and the wage distribution—coupled with a rever-sion to the original e(w) and V(w) schedules. This second part, byconstruction, holds eaggðwÞ and VaggðwÞ at each w constant, butreallocates that effort and utility across individuals who origin-ally earned the same wage but in different sectors. By an enve-lope argument, only the second, reallocating step has welfareeffects. Because the resource constraint (8) depends only oneagg(w) and Vagg(w) at each wage, it is unaffected in this secondstep. The terms C, I, and R arise from the effects on the otherconstraints, (5) and (7), and the objective (6).

The effect of the reallocating change on the reformulatedconsistency constraint (19) has two components: the –1 termfrom the (infeasible) thought experiment considered above, anda second term—labeled C in expression (18)—which arisesbecause the across-sector reallocation of effort at each wageaffects E�

E’. Intuitively, an increase in x raises the wage of

�-sector wage-w types and lowers the wage of �-sector wage-wtypes. If e0ðwÞ > 0, this effectively reallocates effort from the �- tothe �-sector because �-workers move down and �-workers moveup along the e(w) schedule. It therefore reinforces the reallocationof effort across sectors from the sectoral shift effect S.

The term R in (17) arises analogously: The reallocation of Vacross sectors at each w affects the objective insofar as the plan-ner assigns different welfare weights to workers in different sec-tors but at the same wage. It therefore disappears when there

is no intrinsic sectoral preference—that is, when g�xðwÞg’x ðwÞ¼

f �x ðwÞf ’x ðwÞ

for

all w, as will be the case with relative welfare weights Gð�, ’Þ ¼�ðFð�, ’ÞÞ. In contrast, when the social planner has an intrinsicpreference for the �-sector individuals at wage w, the reallocationof utility from the �- to the �-sector is welfare reducing.

Finally, the term I in equation (16) arises from the incentiveeffect of the reallocating step of the thought experiment.To understand it, suppose that the share of �-sector workers islocally increasing at some w. Then an increase in x, which raises�-sector wages and lowers �-sector wages, leads to a local com-pression of the wage distribution. Such a compression eases thelocal incentive compatibility constraints if they are downward-binding, in which case Z(w)> 0. The increase in x therefore

leads to a welfare improvement insofar as �ðwÞ ddw

f ’x ðwÞfxðwÞ

� < 0,

and the magnitude of this improvement will be related to how

QUARTERLY JOURNAL OF ECONOMICS642

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 21: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

steeply increasing the utility distribution is. As we will formalizein the subsequent section, I can be thought of as a (generalized)Stiglitz (1982) effect: With endogenous wages, increasing(decreasing) effort at high (low) wages will raise (lower) wagesat low (high) wages. In contrast, the sectoral shift and other wageeffects, captured by S, C, and R, are not present in Stiglitz’s(1982) framework.

III.D. Marginal Tax Rate Results

We can use the decomposition in Lemma 2 to sign the multi-plier on the consistency condition � at an optimal x by settingW0ðxÞ ¼ 0 in equation (14):

� ¼ �IþR��ðxÞ

�ðxÞY’ðxÞE’ þCþS�ðxÞ

:ð20Þ

We summarize the resulting conditions for the sign of � in the fol-lowing corollary:

COROLLARY 4. With a linear technology (s(x) =1), � = 0.For �ðxÞ 2 ð0,1Þ, the following holds for any Pareto opti-

mum with (i) increasing effort (e0ðwÞ � 0) and (ii) down-wards-binding incentive constraints (�ðwÞ � 0 for all w):

(1) � � 0 if f �x ðwÞfxðwÞ

is increasing in w and g�xðwÞf �x ðwÞ

g’x ðwÞf ’x ðwÞ

8w,

(2) � 0 if f ’x ðwÞfxðwÞ

is increasing in w and g�xðwÞf �x ðwÞ�

g’x ðwÞf ’x ðwÞ

8w.

The inequalities in 1 and 2 are strict if �(w) is not identicallyzero.

Conditions i and ii are sufficient, but not necessary. Theformer ensures that the wage shift term C reinforces thesectoral shift effect. By equation (11), the latter holds wheneverthe social marginal value of consumption, given by UcðwÞ

gxðwÞfxðwÞ

,is decreasing in w. This is guaranteed with quasilinear-in-consumption preferences and weakly progressive welfare weights(i.e., decreasing gxðwÞ

fxðwÞ), for example. It ensures that a compression

of the wage distribution eases the incentive compatibility con-straints. If f �x ðwÞ

fxðwÞis increasing in w, then � is the high-skill

sector, and an increase in x, by raising wages in the �-sector

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 643

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 22: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

and lowering them in the �-sector, has desirable wage compres-sion effects, as in Stiglitz (1982). This desirable effect is reinforcedby the redistribution effect R whenever g’x ðwÞ

f ’x ðwÞ�

g�xðwÞf �x ðwÞ8w. In this

case, the social planner puts higher social welfare weight on

�-sector workers than on �-sector workers at any given wage,and the wage changes induced by an increase in x also have directwelfare benefits.

Combining these results from the outer problem with themarginal tax rate results from the inner problem has crispimplications for the comparison between Pareto optimal andSCPE tax schedules. For instance, suppose � is thehigh-skilled sector, that is, f ’x ðwÞ

fxðwÞis decreasing, so that � > 0

under conditions i and ii in Corollary 4 (and assuming no in-trinsic redistributive motives across sectors). Then by equation(10) in Proposition 1, the marginal keep share in the Paretooptimum is scaled down relative to the SCPE wherever thelocal income share in the �-sector is higher than in aggregate.This disproportionately reduces �-sector effort and thereforeindirectly increases wages in the �-sector, achieving redistri-bution to the low-skilled sector. In particular, because f ’x ðwÞ

fxðwÞis

decreasing, this means that marginal keep shares are scaleddown for low wages and scaled up for high wages, and the topmarginal tax rate is negative.

On the other hand, suppose f ’x ðwÞfxðwÞ

is increasing, so that � <0.

Marginal keep shares will be scaled down whenever f ’x ðwÞfxðwÞ

is

low, that is, again for high wages. The top marginal tax rateis also again negative. In both of the two cases the generalequilibrium effects in the Roy model work in favor of less progres-sive taxation. We summarize these insights in the followingproposition:

PROPOSITION 2. If �ðxÞ 2 ð0,1Þ, then the top marginal tax rate isnegative in any Pareto optimum with

i a decreasing i-sector share of workers f ixðwÞ

fxðwÞ, i 2 �, ’

� �,

ii an increasing effort schedule e(w),iii a decreasing social marginal utility of consumption

schedule UcðwÞgxðwÞfxðwÞ

, and

iv a weak intrinsic social preference for the i-sector, that is,gi

xðwÞf ixðwÞ�

gjxðwÞ

f jxðwÞ

for all w, j 6¼ i 2 f�, ’g.

QUARTERLY JOURNAL OF ECONOMICS644

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 23: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Notably, consider the special case with relative welfareweights. Then GxðwÞ ¼ �ðFxðwÞÞ, gxðwÞ ¼ �0ðFxðwÞÞfxðwÞ, andhence

g�xðwÞ ¼ �0ðFxðwÞÞf�x ðwÞ and g’xðwÞ ¼ �0ðFxðwÞÞf

’x ðwÞ:

This immediately implies g�xðwÞf �x ðwÞ¼

g’x ðwÞf ’x ðwÞ

8w and thus R = 0. Withrelative welfare weights, condition iv can therefore be dropped.

Hence, these results reveal the following intuitive separ-ation: Per Corollary 4, the sign of the multiplier � on the consist-ency constraint accounts for the overall redistributive motiveacross sectors, that is, whether redistribution from � to � orvice versa is desirable. Then, conditional on this direction, thenonlinear marginal tax rate correction in the Pareto optimumrelative to the SCPE is determined by comparing local and aggre-gate income shares between sectors, per equation (10) inProposition 1.

IV. The Role of Occupational Choice and

Continuous Types

In this section, we relate our results to those in Stiglitz(1982), who considers optimal nonlinear taxation in a two-typemodel with endogenous wages but without occupational choice.We demonstrate that the general Roy model, with continuoustypes and occupational choice, features three extra effects, ascaptured by S, C, and R in the previous section, that do notappear in Stiglitz’s model. The disappearance of the sectoralshift effect S in a model without occupational choice is obvious.In addition, the Roy model with continuous types generates over-lapping wage distributions in the two sectors, which gives riseto the effects C and R. In contrast, in Stiglitz’s discrete typemodel, generically—and somewhat unrealistically—there are noworkers in different sectors earning the same wage.

The extra Roy effects that emerge in our model do not changethe sign of the general equilibrium effects found in Stiglitz (1982),but they mitigate them. In this sense, optimal redistributive tax-ation in the Roy model involves a less progressive tax schedulethan a standard Mirrlees model (as captured by an SCPE) but amore progressive tax schedule than a discrete type model withoutoccupational choice.

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 645

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 24: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

We start by reformulating Stiglitz’s (1982) model in terms ofthe decomposition into an inner problem (for fixed x) and outerproblem (optimizing over x) as before. Let there be two types withskills � and ’ and with fractions f� and f’ ¼ 1� f� in the popula-tion. We put (relative) Pareto weights � and ’ on them suchthat f� � þ f’ ’ ¼ 1. Without loss of generality, we will think of� as the high wage sector and ’ as the low-wage sector, so thatregular welfare weights satisfy � 1 and ’ � 1. As in Stiglitz(1982), we therefore focus on the case where only the �-type’sincentive constraint binds.

IV.A. Inner Problem

Individuals are paid their marginal products, w� ¼ �Y�ðxÞ,and w’ ¼ ’Y’ðxÞ: Hence, we can write the inner problem forfixed x as

WðxÞ ¼ maxe� , e’, V� , V’

f� �V� þ f’ ’V’ð21Þ

subject to

V� � U cðV’, e’Þ, e’w’

w�

� �,ð22Þ

ð1� �ðxÞÞ f�w�e� � �ðxÞ f’w’e’ ¼ 0,ð23Þ

f�w�e� þ f’w’e’ � f�cðV�, e�Þ þ f’c V’, e’ �

:ð24Þ

As before, the outer problem is just maxx WðxÞ.8

We focus on the top marginal tax rate (i.e., the optimalallocation for type �). Denoting by m and �m the multipliers on(24) and (23), the first-order condition with respect to e� is

�f�� @cðV�, e�Þ=@e� �w�ð Þ þ ð1� �ðxÞÞf���w� ¼ 0:

8. Observe that we again pool all individuals of a given type at the sameallocation (V, e). This is without loss of generality, but the reasoning is somewhatdifferent from the continuous type case in Section II.B. In this discrete-type setting,as in Blackorby, Brett, and Cebreiro (2007), any allocation that treats individualswho earn the same wage differently can be replaced by an alternative incentivecompatible allocation that keeps each individual’s utility unchanged and requiresfewer resources. Importantly for our endogenous wage setting, this alternativeallocation also does not change aggregate effort in either sector or x.

QUARTERLY JOURNAL OF ECONOMICS646

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 25: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Using @c@e ¼ �

Ue

Uc¼MRS, this simplifies to MRS� ¼ w�ð1þ

ð1� �ðxÞÞ�Þ. By the first-order condition for the worker’s utilitymaximization problem, that is, MRS

w ¼ 1� T0ðyÞ, this implies thatthe marginal tax rate for the high-wage, �-sector individual,is �ð1� �ðxÞÞ� as in Corollary 3.

IV.B. Outer Problem

We next turn to the outer problem to determine �. By theenvelope theorem, we can compute the wage shift effect byholding Vi and ei constant for i 2 f�, ’g, which, in contrast toSection III.C, is a feasible thought experiment here as eachwage corresponds to a single sector. Because sectors do notoverlap at any wage, the reallocation across sectors that ledto the terms R and C in the continuous model in Section III.Cis absent here. There is also no sectoral shift effect S becauseoccupational choice is fixed. As before, the effect of the wageshift induced by the change in x on the objective is identicallyzero, and the effect on the resource constraint (24) is zero by

constant returns to scale. Only the direct effect ����ðxÞY’ðxÞE’1x

and the incentive effects remain.

Specifically, putting a multiplier �� on (22) (and usingthe same algebraic steps employed in the proof of Lemma 2)we find that the effect of the wage shift on the incentiveconstraint is:

���Ue c’, e’w’

w�

� �e’’

Y 0’ðxÞ

Y�ðxÞ�

Y’ðxÞY 0�ðxÞ

Y�ðxÞY�ðxÞ

� �¼ �

1

x�ðxÞI,

where I � ��Ue c’, e’w’

w�

� e’

w’

w�is the discrete incentive constraint

analog of I from the general Roy model.9

Combining the effects yields

W0ðxÞ ¼ �1

x

����ðxÞY’ðxÞE’ðxÞ þ

1

�ðxÞI�:

9. To see this, observe that in the limit where f �x ðwÞfxðwÞ

is 0 up until w�, and

1 thereafter, ddw

f �x ðwÞfxðwÞ

h iis a Dirac d-function and the integral in (16) evaluates

to �V 0ðw�Þw��ðw�Þ ¼ Ueðc� , e�Þe��ðw�Þ by the incentive constraint (7). The only dif-

ference from I is that it has e’w’

w�instead of e� and c’ rather than c� (and � ¼ �

Ucis

discrete rather than continuous). In the density limit, the �-type would be imitating

an infinitesimally close individual. If we let w’ be arbitrarily close to w� , then

we would get e� and c� , as in the limit case of I.

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 647

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 26: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

IV.C. Marginal Tax Rates

At an optimum, W0ðxÞ ¼ 0, so

� ¼ �I

��ðxÞ

�ðxÞY’ðxÞE’,ð25Þ

which coincides with the general formula (20) if S = C = R = 0 andwhen we replace I with I. Moreover, I and � have opposite signs.This means that � > 0 (and hence top marginal taxes are negative)precisely when redistribution occurs from the �- to the ’-types, sothat the downward incentive constraint binds.

This is analogous to our results in the general Roy model, butcomparing equations (20) and (25) reveals that the addition of theeffects S and C will make �, and hence top marginal taxes, smallerin absolute value. To understand the intuition behind this,suppose we lower taxes at the top to increase the effort of thetop earners. This is welfare enhancing because it raises thewages of the low-wage sector workers and lowers the wages ofhigh-wage sector workers and thus relaxes the downward incen-tive constraint. However, when there is endogenous occupationalchoice, the sectoral shift effect works against this, because thischange in wages leads some individuals to shift out of thehigh-wage into the low-wage sector, undoing some of the originalincrease in aggregate effort in the high-wage sector. The indirectwage shift effect C reinforces the sectoral shift effect (when effortis increasing in wage), because, at any given wage where thesectors overlap, it involves a reallocation of effort (among individ-uals who do not shift sectors) from workers in the high-wagesector to workers in the low-wage sector. Optimal taxes in thegeneral Roy model with continuous types will therefore be lessprogressive than in a Mirrlees model but more progressive thanin a Stiglitz model with two types and exogenous occupations.

In fact, the result that the optimal progressivity of taxes in aRoy model is ‘‘sandwiched’’ between Mirrlees at the progressiveend and Stiglitz at the regressive end is not particular to ourenvironment with continuous types. As we show in Rothschildand Scheuer (2012), � is also bounded between Mirrlees (i.e., 0)and Stiglitz (i.e., formula [25]) in a general model with a discrete,two-dimensional type-distribution and with allocations that con-dition on wages only. This means that the Stiglitz formula (25) isa special case even within the class of discrete-type models.

QUARTERLY JOURNAL OF ECONOMICS648

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 27: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

V. A Numerical Example

The purpose of this section is to illustrate the role of sectoraland wage shift effects for the progressivity of optimal incometaxes and verify the consistency of the conditions in Corollary 4and Proposition 2. We use data from the 2011 Current PopulationSurvey (CPS) rotating March sample to calibrate a simpletwo-sector Roy model of the U.S. economy and to compute opti-mal tax schedules. We assume quasilinear preferences

Uðc, eÞ ¼ c� hðeÞ with isoelastic disutility hðeÞ ¼ e1þ1"

1þ1"

. We usea labor elasticity "= 0.5 and Cobb-Douglas technology,Y ¼ E�

�E1��’ .10 We remain deliberately agnostic about the

nature of the two latent sectors. Instead, we build on Basu andGhosh (1978) and Heckman and Honore (1990), who show thatthe parameters of an underlying bivariate normal distributionover (x1, x2) can be identified by observing the single-dimensionaldistribution of the maximum of x1 and x2 (up to a permutationof indices).

Specifically, following Mankiw, Weinzierl, and Yagan (2009),we use the CPS data on weekly earnings and hours to generate asample of hourly wages wi for the U.S. population. We assumethat this wage distribution is generated from a two-sector Roymodel with individuals whose skills (� i,’i) are drawn from abivariate lognormal distribution so that, for a given x ¼ E�

E’, the

distribution across individuals of possible wages ð�iY�ðxÞ, ’iY’ðxÞÞis also bivariate lognormal. We estimate the means ��,�’,variances �2

� , �2’ , and covariance ��’ of this bivariate wage distri-

bution by maximum likelihood. The likelihood of an observation~wi � logðwiÞ ¼ max logð�iY�ðxÞÞ, logð’iY’ðxÞÞ

� �is given by11

‘i ¼ ��� � ewi

��

� �1��

~�’ � ewi

~�’

� �� �þ �

�’ � ewi

�’

� �1��

~�� � ewi

~��

� �� �,

where �ðÞ and �ðÞ denote the density and cumulative distribu-tion of the standard normal distribution, respectively, and where

~�� � �� ���’�2’

�’

!1

1� ��’�2’

and ~�� ���

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1� �2

�’��2� ��2

q1� ��’

�2’

10. This implies a constant substitution elasticity s= 1. We explore the effect

of varying s in Section VI.11. See Basu and Ghosh (1978), expressions (2.5) and (6.2).

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 649

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 28: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

if ��’�2’6¼ 1 (and ~�� � �� � �’, ~�� � ��

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1� �2

�’��2� ��2

qotherwise),

with symmetric expressions for ~�’ and ~�’.The income share of the �-sector, given by a, can be inferred

from this estimated bivariate wage distribution by using theestimated parameters m� , m’, �2

� , �2’ , and ��’ to draw a (large)

sample of ðw�, w’Þ. We can infer from this sample both sectoralchoices and wages w ¼ maxfw�, w’g. From these and the individ-ual optimization condition e

1" ¼ ð1� Þw (where t is the marginal

tax rate, which we take to be 25% for our calibration), we cancompute the sectoral incomes Y� and Y’ and hence � ¼ Y�

Y�þY’ :Finally, with Cobb-Douglas technology, we can, without loss

of generality, take the underlying skills (� ,’) to coincide withðw�, w’Þ—that is, we can take Y�ðxÞ ¼ Y’ðxÞ ¼ 1. To wit: Notethat Y� ¼

�YE�

. Since scaling all �-skills by k> 0 scales E� by k,this implies that Y� scales by k�1. This rescaling leaves wagesw = � Y� , efforts, and incomes unchanged. In other words, for agiven economy, the underlying skills are only defined up to sucha rescaling.

Our baseline estimates (standard errors) are m� = 2.81 (.029),�’ ¼ 1:74 ð:714Þ, �� ¼ 0:647 ð:015Þ, �’ ¼ 0:637 ð:369Þ, and �’ ¼�:030 ð:630Þ, where �’ ¼

��’���’

is the correlation between thetwo dimensions. The corresponding mean wages are approxi-mately 7 and 20 for the two sectors, and 12.3% of the workersare estimated to work in the �-sector, which has an incomeshare 1 – a of only 5.8%. The left panel of Figure II comparesthe estimated to the empirical wage distribution; it shows a rea-sonably good fit.

The �-sector is the low-income sector here, and it is quitesmall in quantitative terms. We therefore use a likelihood ratiotest to see whether we can reject the two-sector model in favor ofa simpler one-sector model. This likelihood ratio test is compli-cated by the fact that a single-sector model is a ‘‘singularity’’ inparameter space (�’ ceases to be well defined under the restric-tion). We perform a two-step version of the test: First, we observethat �’ ¼ 0 is not rejected. Reestimating the model with �’ � 0leads to imperceptible changes in the remaining coefficients.A standard likelihood ratio test for the restriction of thismodel to a single sector yields �2ð2Þ ¼ 244:2, easily rejecting thesingle-sector restriction. We employ the �’ ¼ 0 estimates in thefollowing tax computations.

To compute the Pareto optimal and SCPE taxes for this econ-omy, only the Pareto weights remain to be specified. We use

QUARTERLY JOURNAL OF ECONOMICS650

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 29: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

−2 0 2 4 60

0.1

0.2

0.3

0.4

0.5

0.6

0.7

log(wage)

Empirical and Fitted Wage Distribution

0 50 100 150 200 250 300−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

wage

Marginal Tax Rate

empirical fitted

Pareto Optimum SCPE

FIGURE II

Empirical/Estimated Wage Distributions and Optimal/SCPE Tax Schedules

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 651

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 30: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

relative weights � ¼ 1� ð1� FÞr, where r characterizes themagnitude of the government’s desire for redistribution fromhigh to low wages. With the quasilinear preferences thatwe use here, r = 1 implies no redistributive motives, and r!1for a Rawlsian social planner. We take r = 1.3, so that there issome intermediate desire for redistribution from high- tolow-wage earners.

The right panel of Figure II shows the marginal tax scheduleT0ðyðwÞÞ as a function of w both for the Pareto optimum andthe SCPE resulting from our parameterization.12 The two taxschedules are similar: Both display U-shaped marginal rates atmodest wages, and then falling rates in the upper tail of thedistribution. The optimal tax schedule is modestly less progres-sive than the SCPE, in accord with the theory, and the topmarginal tax rate is negative, albeit small in magnitude atabout –2% due to the small size of the low-wage sector � (recallthat, by Corollary 3, the top marginal tax rate is givenby �ð1� �Þ�). Finally, Figure III demonstrates that the assump-tions in Proposition 2 are satisfied in our calibrated example:Individual effort e(w) is increasing in the wage, and the sharesof �- and �-sector workers are monotone. A fortiori, incomey(w) = we(w) is increasing in w, so that bunching does not needto be considered.

VI. Extensions

VI.A. Unbounded Skill Distribution

We have focused attention on bounded skill distributionsfor simplicity and to facilitate a comparison with Stiglitz (1982),but the preceding analysis does not rely on this assumption.In particular, recent studies have emphasized that the top endof the empirical wage distribution is better described by anunbounded Pareto distribution (Saez 2001). The analysis of theouter problem in Section III.C can be extended in a straightfor-ward way to such unbounded distributions, and the methodsdeveloped in Diamond (1998) and Saez (2001) can be used to com-pute asymptotic marginal tax rates T0ðyðwÞÞ for w !1. Theseare particularly transparent in the following case:

12. We truncate the distribution at the 99.99th percentile so that the toprate is well defined.

QUARTERLY JOURNAL OF ECONOMICS652

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 31: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

0 50 100 150 200 250 3000

5

10

15

20Effort

wage

0 50 100 150 200 250 300

0.4

0.5

0.6

0.7

0.8

0.9

1Θ−Share

wage

Pareto Optimum SCPE

FIGURE III

Effort and Share of Workers in the �-Sector for Pareto Optimum/SCPE

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 653

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 32: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

PROPOSITION 3. Consider any Pareto optimum (respectively,SCPE) such that

i preferences are quasilinear and isoelastic: Uðc, eÞ ¼ c� e1þ1"

1þ1"

ii the top earners are all in the �-sector: limw!1f �x ðwÞfxðwÞ¼ 1

iii the �-sector skill distribution has a Pareto tail withparameter k, so limw!1

1�FxðwÞwfxðwÞ

¼ �

iv Pareto weights are relative and progressive: GxðwÞ ¼�ðFxðwÞÞ with �00ðxÞ < 0 and

v zero welfare weight is put on the top earners:�0ð1Þ ¼ 0.

Then the asymptotic marginal tax rate is

� 1þ 1"

�� �ð1� �ðxÞÞ

� 1þ 1"

�þ 1

respectively,� 1þ 1

"

�� 1þ 1

"

�þ 1

!:

Moreover, � > 0 whenever e(w) and f �x ðwÞfxðwÞ

are increasing in w.

Proof. See Appendix #

This implies that asymptotic marginal tax rates are scaleddown in the Pareto optimum relative to the SCPE, just like topmarginal tax rates in the case of a bounded skill distribution. Toillustrate this downscaling numerically, we have to replace thethinner-tailed lognormal distributions from the calibration inSection V with Pareto tails. The primary advantage of employinga bivariate lognormal distribution was that it could be identified byobserving only the empirical wage distribution. This allowed us tostudy the role of Roy effects without taking a stand on the natureof the underlying sectors. Unfortunately, a bivariate Pareto distri-bution is not identified without additional sectoral information(Heckman and Honore 1990). We therefore use a simple numericalexample that is not explicitly calibrated to the U.S. economy to geta sense for the implications of these thicker tails.

In particular, we consider a skill distribution given by twoindependent Pareto distributions with support (1,1) and param-eters k� = 2 and k’= 4, respectively. As a result, there is moremass on lower skills in the ’-dimension compared to �, and � is

again the low skill sector with limw!1f ’x ðwÞfxðwÞ¼ 0. We again start

with the Cobb-Douglas case and set the aggregate income share ofthe high skill sector � to a =0.2. All other parameters are asin Section V. The left panel in Figure IV shows the resulting

QUARTERLY JOURNAL OF ECONOMICS654

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 33: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

0 20 40 60 800.1

0.2

0.3

0.4

0.5

0.6Marginal Tax Rate

wage

0 10 20 30 40 50 60

0.2

0.25

0.3

0.35

0.4Marginal Tax Rate

wage

Pareto Optimum SCPE

ρ=.5 ρ=0 ρ=−.5

FIGURE IV

Pareto Optimal/SCPE Tax Rates, and Pareto Optimal TaxRates for Varying s

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 655

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 34: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

marginal tax schedules in the Pareto optimum and SCPE. Itillustrates Proposition 3 and shows that in principle, the optimalasymptotic marginal tax rate can be considerably lower thanin the SCPE, indicating strong general equilibrium effects inthe Roy model.

In the right panel, we show how the importance of theseeffects varies with the elasticity of substitution of the productionfunction. We generalize the technology considered so in Section Vto a constant elasticity of substitution production function

YðE�, E’Þ ¼ �E� þ ð1� �ÞE

h i1

, where the elasticity of substitu-

tion is constant with � ¼ 11� (Cobb-Douglas obtains as a special

case for r= 0). The figure shows that the optimal asymptotic mar-ginal tax rates fall as we move to lower substitution elasticities.This is because the general equilibrium effects from the endo-geneity of wages become more pronounced as we move awayfrom linear technology, with s=1 (r= 1) and fixed wages.

VI.B. Differential Sectoral Costs

In the preceding analysis, individuals based their sectoralchoice exclusively on whether the �- or �-sector afforded thema higher wage. In many applications, however, it is reasonable toassume that occupational choice is also affected by direct costs ortastes for entering specific sectors. For instance, some occupa-tions may require higher levels of education, so that individualswho have a high cost of achieving such education may not selectinto them even if they could earn a higher wage there. It isstraightforward to extend the model to allow for such differentialcosts of working in the two sectors.

Let the types be described by a triple t = (� ,’,b), where bparameterizes the cost of �-sector effort relative to �-sectoreffort and, as before, � and ’ measure the �- and �-sectorskills. Let the general cdfs Fð�, ’, Þ and Gð�, ’, Þ, with supports½�, �� � ½’, ’�� ½ , �, denote the type distribution and cumulativewelfare weights, respectively, and take preferences to be separ-able, isoelastic in effort, and dependent on sector S as follows:

Uðc, e; t, SÞ ¼ uðcÞ � e1þ1

"

1þ1"

if S ¼ �

uðcÞ � e1þ1"

1þ1"

if S ¼ �:

8<:In particular, with constant absolute risk aversion utility of

consumption uðcÞ ¼ � expð�rcÞ, we can interpret ~ � � logð Þ=r

QUARTERLY JOURNAL OF ECONOMICS656

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 35: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

as the consumption cost of working in the �-sector. (With con-stant relative risk adversion, a similarly transformed is inter-pretable as a proportional consumption cost.)

Within the �-sector, there will generally be two dimensionsof heterogeneity—� and . For any x, however,

U c,y

�Y�ðxÞ; ð�, ’, Þ, �

� �¼ uðcÞ �

1

1þ 1"

y~�ð�, ÞY�ðxÞ

� �1þ1"

" #,

where ~�ð�, Þ � � "

1þ". Conditional on S =�, ~� is thus a sufficientstatistic for preferences over (c, y)-bundles (as in Chone andLaroque 2010). Moreover, any two types ð�1, ’1, 1Þ andð�2, ’2, 2Þ with ~�1ð�1, 1Þ ¼ ~�2ð�2, 2Þ and ’1 ¼ ’2 make the samesectoral choice. This means that we can ‘‘collapse’’ the policy-relevant type distribution into the two-dimensional distributionof ð ~�, ’Þ-types, with cumulative distribution function

Fð ~�, ’Þ �

Z �

Z ’

Z �= ~�ð Þ1þ""

dFð�0, ’0, 0Þ,

and we can collapse the welfare weights to Gð ~�, ’Þ analogously.By interpreting w as an effective wage, given by max ~�Y�ðxÞ,

�’Y’ðxÞ

�, the Pareto optimal and SCPE tax rates are characterized

exactly as in Section III. In particular, marginal taxes aregiven by Proposition 1, with � as in equation (20), and, as inProposition 2, top marginal tax rates will be negative whenever(1) the i-sector is the low w sector, (2) effort is increasing in w,(3) marginal social utility of consumption is decreasing in w,and (4) there is a weak preference for i-sector workers at anygiven w.

VII. Conclusion

We view this article as making a dual contribution. The firstis methodological: We provide a technique for solving a multidi-mensional screening problem in an important class of contexts.Specifically, we show that the multidimensional screening prob-lem that arises in designing optimal taxation in a multiple-sectoreconomy can be reduced to a single dimensional optimal incometax problem a la Mirrlees. This basic technique is likely to beapplicable more broadly.

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 657

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 36: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Our second contribution is to derive some of the implicationsthat self-selection into occupational sectors can have for optimalincome taxation. In particular, we show that the presence of sev-eral complementary sectors in an economy provides a force push-ing toward less progressive taxation. This force is a naturalextension of Stiglitz’s (1982) results to a more general frameworkwith a continuous distribution of types and sectoral mobility, andwe show that the extra effects arising in this more general settingmitigate the regressive forces in the basic Stiglitz model.

We also demonstrated through a simple empirical calibrationthat computing the practical implications of occupational endo-geneity for taxation is tractable. Through a theoretical simulationwith unbounded skill distributions and fat tails, we also demon-strated that in principle these implications could be quantita-tively significant. More detailed empirical calibrations are animportant next step.

Appendix

A. Proof of Lemma 1

PART I. Consider an incentive compatible allocation cð�, ’Þ,�

yð�, ’Þ, Sð�, ’Þ, x�. To obtain a contradiction, suppose there

exists a type (� ,’) such that (e.g.) �Y�ðxÞ < ’Y’ðxÞ, butSð�, ’Þ ¼ � and hence (by equation (1)) wð�, ’Þ ¼ �Y�ðxÞ. Thentype ð�, ’Þ’s incentive constraint must be violated, since hecould do better by earning the same income yð�, ’Þ in the �-rather than the �-sector, which would result in utility

U cð�, ’Þ,yð�, ’Þ

’Y’ðxÞ

� �> U cð�, ’Þ,

yð�, ’Þ

�Y�ðxÞ

� �:

PART II. Consider two types ð�, ’Þ and ð�0, ’0Þ such that wð�, ’Þ ¼wð�0, ’0Þ ¼ w. Again to obtain a contradiction, assume withoutloss of generality,

U cð�, ’Þ,yð�, ’Þ

w

� �< U cð�0, ’0Þ,

yð�0, ’0Þ

w

� �:

Then type ð�, ’Þ’s incentive constraint must be violated: Hecould mimic type ð�0, ’0Þ by earning yð�0, ’0Þ, staying in hisoriginal sector. His utility from this deviation would begiven by the right-hand side of the above inequality.

QUARTERLY JOURNAL OF ECONOMICS658

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 37: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

PART III. We claim that, for each type ð�, ’Þ, the consumptionbundle ðcð�, ’Þ, yð�, ’ÞÞ is a solution to maxðc, yÞ2B� Uðc; y

wð�;’ÞÞ:First, when faced with the budget set B*, each individualwill, taking x as given, choose his sector such that wð�, ’Þ ¼max �Y�ðxÞ, ’Y’ðxÞ

� �and Sð�, ’Þ as given by equation (2).

Second, each bundle ðcð�, ’Þ, yð�, ’ÞÞ from the original alloca-tion is included in the budget set B* by the constructionin (3) and incentive compatibility. To see this, supposethere was some ðcð�, ’Þ, yð�, ’ÞÞ such that cð�, ’Þ > R�ðyð�, ’ÞÞ.Then by (3) this would imply that there exists some ð�0, ’0Þsuch that

U cð�0, ’0Þ,yð�0, ’0Þ

wð�0, ’0Þ

� �< U cð�, ’Þ,

yð�, ’Þ

wð�0, ’0Þ

� �,

contradicting type ð�0, ’0Þ’s incentive constraint and thusincentive compatibility of the original allocation. Finally,each type (� ,’) at least weakly prefers the bundle ðcð�, ’Þ,yð�, ’ÞÞ to any other bundle available in B* by (3).

B. Proof of Proposition 1

Putting multipliers � on (8), �� on (5) and �ðwÞ� on (7),the Lagrangian corresponding to (6)–(8) is, after integrating byparts (7),

L ¼

Z wx

w x

VðwÞgxðwÞdw�

Z wx

wx

VðwÞ�0ðwÞ�dw

þ

Z wx

wx

UeðcðVðwÞ, eðwÞÞ, eðwÞÞeðwÞ

w�ðwÞ�dw

þ ��ð1� �ðxÞÞ

Z wx

wx

weðwÞf �x ðwÞdw� ���ðxÞ

Z wx

wx

weðwÞf ’x ðwÞdw

þ �

Z wx

wx

weðwÞfxðwÞdw� �

Z wx

w x

cðVðwÞ, eðwÞÞfxðwÞdw:

ð26Þ

Using @c@V ¼

1Uc

and compressing notation, the first-order conditionfor V(w) is

�0ðwÞ� ¼ gxðwÞ � �fxðwÞ1

UcðwÞþ �ðwÞ�

UecðwÞ

UcðwÞ

eðwÞ

w:ð27Þ

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 659

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 38: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Defining �ðwÞ � �ðwÞUcðwÞ, this becomes

�0ðwÞ ¼ gxðwÞUcðwÞ

�� fxðwÞ þ �ðwÞ

UccðwÞc0ðwÞ þUceðwÞe0ðwÞ þUceðwÞeðwÞ

w

UcðwÞ:

ð28Þ

Using the first-order condition corresponding to the optimizationproblem for an individual worker,

UcðwÞc0ðwÞ þUeðwÞe

0ðwÞ þUeðwÞeðwÞ

w¼ 0,

the fraction in (28) can be written as � @MRSðwÞ@c

y0ðwÞw where

MRSðwÞ � �UeðcðwÞ, eðwÞÞ

UcðcðwÞ, eðwÞÞ

is the marginal rate of substitution between effort and consump-tion. Substituting in (28) and rearranging yields

�@MRSðwÞ

@ceðwÞ

y0ðwÞ

yðwÞ�ðwÞ ¼ fxðwÞ � gxðwÞ

UcðwÞ

�þ �0ðwÞ:ð29Þ

Integrating this ordinary differential equation gives

�ðwÞ ¼

Z wx

wfxðwÞ � gxðzÞ

UcðzÞ

� �exp

Z z

w

@MRSðsÞ

@ceðsÞ

y0ðsÞ

yðsÞds

� �dz

¼

Z wx

w1�

gxðzÞ

fxðzÞ

UcðzÞ

� �exp

Z z

w1�

"uðsÞ

"cðsÞ

� �dyðsÞ

yðsÞ

� �fxðzÞdz,

ð30Þ

where the last step follows from eðwÞ @MRSðwÞ@c ¼ 1� "uðwÞ

"cðwÞ aftertedious algebra (e.g., using equations (23) and (24) in Saez 2001).

Using @c@e ¼MRS, the first-order condition for e(w) is

�wfxðwÞ 1�MRSðwÞ

w

� �þ ��w ð1� �ðxÞÞf �x ðwÞ � �ðxÞf

’x ðwÞ

�¼ ��ðwÞ�

ð�UecðwÞUeðwÞ=UcðwÞ þUeeðwÞÞ eðwÞ

UeðwÞ

w

� �,

QUARTERLY JOURNAL OF ECONOMICS660

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 39: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

which after some algebra can be rewritten as

wfxðwÞ 1�MRSðwÞ

w

� �þ �w ð1� �ðxÞÞf �x ðwÞ � �ðxÞf

’x ðwÞ

�¼ �ðwÞ

@MRSðwÞ

@e

e

MRSðwÞ

w

� �:

ð31Þ

Noting that MRSðwÞw ¼ 1� T0ðyðwÞÞ from the first-order condition of

the workers’ utility maximization problem and using the defin-ition of Z(w), this becomes

1þ �ð1� �ðxÞÞ f �x ðwÞ � �ðxÞ f

’x ðwÞ

fxðwÞ

¼ ð1� T0ðyðwÞÞÞ 1þ�ðwÞ

wfxðwÞ1þ

@MRSðwÞ

@e

e

MRSðwÞ

� �� �:

ð32Þ

Simple algebra again shows that 1þ @ log MRSðwÞ@ log e ¼

1þ"uðwÞ"cðwÞ , and that

ð1� �ðxÞÞ f �x ðwÞ � �ðxÞ f’x ðwÞ

fxðwÞ¼ 1� �ðxÞ �

f ’x ðwÞ

fxðwÞ¼

f �x ðwÞ

fxðwÞ� �ðxÞ:

The proposition follows from (30) and (32).

C. Proof of Lemma 2

We begin with the following two technical lemmas, whichwill be useful for the proof of Lemma 2.

LEMMA 3. The substitution elasticity of YðE�, E’Þ is given by

�ðxÞ ¼ � 1x�ðxÞ with �ðxÞ �

Y 0�ðxÞY�ðxÞ�

Y 0’ðxÞ

Y’ðxÞ.

Proof. The substitution elasticity is defined as �ðxÞ �

dxx

Y’ ðxÞ

Y� ðxÞ

dY’ ðxÞ

Y� ðxÞ

� ¼ 1x

d logY’ ðxÞ

Y� ðxÞ

� dx

0@ 1A�1

, from which the lemma follows

directly. #

LEMMA 4.

dF�xðwÞ

dx¼ �

Y 0�ðxÞ

Y�ðxÞwf �x ðwÞ þ�xðwÞ and

dF’x ðwÞ

dx¼ �

Y 0’ðxÞ

Y’ðxÞwf ’x ðwÞ ��xðwÞ

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 661

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 40: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

with

�xðwÞ �1

Y�ðxÞY’ðxÞ�ðxÞ

Z w

w x

w0fw0

Y�ðxÞ,

w0

Y’ðxÞ

� �dw0:

Completely analogous expressions hold for G�xðwÞ and G’

xðwÞ.

The proof of Lemma 4 involves nothing more than tedi-ous algebra. We now turn to proving Lemma 2 and use (26) tocompute

W0ðxÞ ¼

Z wx

wx

VðwÞdgxðwÞ

dxdw

� �

Z wx

wx

cðVðwÞ, eðwÞÞdfxðwÞ

dxdw� ���0ðxÞYðE�, E’Þ

þ �� ð1� �ðxÞÞ

Z wx

w x

weðwÞdf �x ðwÞ

dxdw� �ðxÞ

Z wx

wx

weðwÞdf ’x ðwÞ

dxdw

!

þ �

Z wx

wx

weðwÞdfxðwÞ

dxdwþ B1

with

B1 �dwx

dx

hVðwxÞgxðwxÞ � �cðVðwxÞ, eðwxÞÞfxðwxÞ

þ� fxðwxÞ þ � ð1� �ðxÞÞf�x ðeðwxÞÞ � �ðxÞf

’x ðeðwxÞÞ

� �wxeðwxÞ

i�

dwx

dx

hVðwxÞgxðwxÞ � �cðVðwxÞ, eðwxÞÞfEðwxÞ

þ� fxðwxÞ þ � ð1� �ðxÞÞf�x ðeðwxÞÞ � �ðxÞf

’x ðeðwxÞÞ

� �wxeðwxÞ

i:

Integrating by parts the five integral terms yields

W 0ðxÞ ¼ B1 þ B2 �

Z wx

w x

V 0ðwÞdGxðwÞ

dxdw

þ�

Z wx

wx

V 0ðwÞ

UcðwÞþMRSðwÞe0ðwÞ

� �dFxðwÞ

dxdw� ���0ðxÞYðE�, E’Þ

þ��

Z wx

w x

ðwe0ðwÞ þ eðwÞÞ �ðxÞdF’

x ðwÞ

dx� ð1� �ðxÞÞ

dF�xðwÞ

dx

� �dw

!

��

Z wx

wx

ðwe0ðwÞ þ eðwÞÞdFxðwÞ

dxdw

ð33Þ

QUARTERLY JOURNAL OF ECONOMICS662

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 41: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

with

B2 �

"VðwÞ

dGxðwÞ

dx� �cðVðwÞ, eðwÞÞ

dFxðwÞ

dx

þ��weðwÞ ð1� �ðxÞÞdF�

xðwÞ

dx� �ðxÞ

dF’x ðwÞ

dx

� �

þ�weðwÞdFxðwÞ

dx

#wx

w x

:

By the first-order conditions (29) and (31) with respect to V(w)and e(w) from the inner problem, the terms

Z wx

w x

e0ðwÞ

fxðwÞ

"wfxðwÞ 1�

MRSðwÞ

w

� �þ �w ð1� �ðxÞÞf �x � �ðxÞf

’x ðwÞ

���ðwÞ

@MRSðwÞ

@e

eðwÞ

MRSðwÞ

w

� �#dFxðwÞ

dxdw

and

Z wx

w x

V 0ðwÞ

UcðwÞfxðwÞ

"gxðwÞ

UcðwÞ

�� fxðwÞ � �

0ðwÞ

��ðwÞ@MRSðwÞ

@ceðwÞ

y0ðwÞ

yðwÞ

#dFxðwÞ

dxdw

are both equal to zero. Adding them to (33), using (7), canceling,and rearranging yields

W0ðxÞ ¼ B1 þ B2 � ���0ðxÞYðE�, E’Þ

þ

Z wx

wx

V 0ðwÞgxðwÞ

fxðwÞ

dFxðwÞ

dx�

dGxðwÞ

dx

� �dw

� �

Z wx

wx

eðwÞdFxðwÞ

dxdw

� �

Z wx

wx

�ðwÞ

w

d MRSðwÞeðwÞ½ �

dwþ �0ðwÞ

V 0ðwÞ

UcðwÞ

� �1

fxðwÞ

dFxðwÞ

dxdw

þ ��

Z wx

wx

�ðeðwÞ þwe0ðwÞÞ

��ðxÞ

dF’x ðwÞ

dx� ð1� �ðxÞÞ

dF�xðwÞ

dx

þwe0ðwÞ ð1� �ðxÞÞf �x ðwÞ

fxðwÞ� �ðxÞ

f ’x ðwÞ

fxðwÞ

� �dFxðwÞ

dx

�dw:

ð34Þ

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 663

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 42: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

From Lemma 4,

gxðwÞ

fxðwÞ

dFxðwÞ

dx�

dGxðwÞ

dx¼ �w

Y 0�ðxÞ

Y�ðxÞ

gxðwÞ

fxðwÞf �x ðwÞ � g�xðwÞ

� �

þwY 0’ðxÞ

Y’ðxÞ

gxðwÞ

fxðwÞf ’x ðwÞ � g’xðwÞ:

� �

¼ wY 0�ðxÞ

Y�ðxÞ�

Y 0’ðxÞ

Y’ðxÞ

� �gxðwÞ

fxðwÞf ’x ðwÞ � g’xðwÞ

� �

¼ w�ðxÞf ’x ðwÞf

�x ðwÞ

fxðwÞ

g�xðwÞ

f �x ðwÞ�

g’xðwÞ

f ’x ðwÞ

� �:

The first integral in (34) is therefore

Z wx

wx

V 0ðwÞgxðwÞ

fxðwÞ

dFxðwÞ

dx�

dGxðwÞ

dx

� �dw ¼ �ðxÞRð35Þ

Combining the terms with e(w) on the second and thirdline of (34) and using Lemma 4 and the identityY 0�ðxÞE� þ Y 0’ðxÞE’ � 0 gives:

� �

Z wx

wx

eðwÞdFxðwÞ

dxdwþ ��

Z wx

wx

eðwÞ

�ðxÞ

dF’x ðwÞ

dx

� ð1� �ðxÞÞdF�

xðwÞ

dx

!dw

¼ �Y 0�ðxÞ

Y�ðxÞ

Z wx

w x

weðwÞf �x ðwÞdwþY 0’ðxÞ

Y’ðxÞ

Z wx

w x

weðwÞf ’x ðwÞdw

" #

þ ��

Z wx

w x

weðwÞ ð1� �ðxÞÞY 0�ðxÞ

Y�ðxÞf �x ðwÞ � �ðxÞ

Y 0’ðxÞ

Y’ðxÞf ’x ðwÞ

� �dw

���

Z wx

w x

eðwÞ�xðwÞdw

¼ 0þ ��Y 0�E� � ��

Z wx

wx

eðwÞ�xðwÞdw:

ð36Þ

QUARTERLY JOURNAL OF ECONOMICS664

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 43: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

The terms with we0ðwÞ in the last line (34) can be written,using Lemma 4 again, as

��

Z wx

wx

we0ðwÞ

�ðxÞ

dF’x ðwÞ

dx� ð1� �ðxÞÞ

dF�xðwÞ

dx

þ ð1� �ðxÞÞf �x ðwÞ

fxðwÞ� �ðxÞ

f ’x ðwÞ

fxðwÞ

� �dFxðwÞ

dx

!dw

¼ ��

Z wx

wx

w2e0ðwÞ

"ð1� �ðxÞÞ

Y 0�ðxÞ

Y�ðxÞf �x ðwÞ � �ðxÞ

Y 0’ðxÞ

Y’ðxÞf ’x ðxÞ

� ð1� �ðxÞÞf �x ðwÞ

fxðwÞ� �ðxÞ

f ’x ðwÞ

fxðwÞ

� �Y 0�ðxÞ

Y�ðxÞf �x ðwÞ þ

Y 0’ðxÞ

Y’ðxÞf ’x ðxÞ

� �#dw

� ��

Z wx

wx

we0ðwÞ�xðwÞdw

¼ �ðxÞ��C� ��

Z wx

w x

we0ðwÞ�xðwÞdw,

ð37Þ

where the first term in the last step follows after some tediousalgebra. Combining the terms with �x(w) from (36) and (37) gives���

Rwx

wxðweðwÞÞ0�xðwÞdw, which can be integrated by parts to

yield:

�B3 þ ���ðxÞ

Y�ðxÞY’ðxÞ

Z wx

wx

w2eðwÞfw

Y�ðxÞ,

w

Y’ðxÞ

� �dw ¼ ���ðxÞS,ð38Þ

with B3 ¼ ��wxeðwxÞ�xðwxÞ since �xðwxÞ ¼ 0. Finally, use theincentive constraint (7), rewritten as wV 0ðwÞ

UcðwÞ¼MRSðwÞeðwÞ, to

write the second integral in the second line of (34) as

� �

Z wx

w x

��ðwÞw

d½V 0ðwÞ=UcðwÞ�

dwþ �0ðwÞw

V 0ðwÞ

UcðwÞ

þ�ðwÞV 0ðwÞ

UcðwÞ

�1

wfxðwÞ

dFxðwÞ

dxdw,

or, recognizing the sum of the bracketed terms as ddw

�ðwÞwV 0ðwÞUcðwÞ

h i,

integrating by parts, and using the transversality condition�ðwxÞ ¼ �ðwxÞ ¼ 0 and Lemma 4,

Z wx

wx

�ðwÞwV 0ðwÞ

UcðwÞ

d

dw�

Y 0�ðxÞ

Y�ðxÞ

f �x ðwÞ

fxðwÞ�

Y 0’ðxÞ

Y’ðxÞ

f ’x ðwÞ

fxðwÞ

� �� �dw ¼ �ðxÞIð39Þ

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 665

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 44: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Define ~Fðw, xÞ � FxðwÞ. Since ~Fðwx, xÞ � 1 for all x,

d ~Fðwx, xÞ

dx¼@ ~Fðwx, xÞ

@xþ@ ~Fðwx, xÞ

@w

dwx

dx¼

dFxðwxÞ

dxþ fxðwxÞ

dwx

dx¼ 0:ð40Þ

Together with an analogous expression at wx, the fact that�xðwxÞ ¼ 0, and Lemma 4, this yields

B1 þ B2 ¼ ���wxeðwxÞ�xðwxÞ ¼ B3:

Using (35), (36), (37), (38), and (39) in (34) yields

W0ðxÞ ¼ �ðxÞ I þRþ �� �ðxÞY’ðxÞE’�ðxÞ þ Sþ C� � �

,ð41Þ

where we have used ��0ðxÞYðE�, E’Þ þ Y 0�ðxÞE� ¼ ��ðxÞY’ðxÞE’

x :

D. Proof of Proposition 3

limw!1

1� T0ðyðwÞÞ ¼ limw!1

1þ � 1� �ðxÞ � f ’x ðwÞfxðwÞ

� 1þ �ðwÞ

wfxðwÞ1þ"uðwÞ"cðwÞ

¼

1þ � 1� �ðxÞ � limw!1

f ’x ðwÞfxðwÞ

� � 1þ 1þ 1

"

�lim

w!1

1�FxðwÞwfxðwÞ

� �ðFxðwÞÞ�FxðwÞ

1�FxðwÞ

� ¼

1þ � 1� �ðxÞð Þ

1þ 1þ 1"

��:

The first equality is from (10). The second uses (iv) and (i), whichimplies �ðwÞ ¼ �ðFxðwÞÞ � FxðwÞ. The third uses (ii) to simplify thenumerator, and (iii) and (v) to take the limits of the two terms inthe denominator. The top tax rate result for the Pareto optimumfollows with a little rearranging. Setting � = 0 yields the result forthe SCPE. Corollary 4 and (iv) imply � >0, since �00ðxÞ < 0 impliesthat gxðwÞ

fxðwÞ¼ �0ðFxðwÞÞ is decreasing in w.

Wellesley College

Stanford University and NBER

References

Allen, F., ‘‘Optimal Linear Income Taxation with General Equilibrium Effects onWages,’’ Journal of Public Economics, 17 (1982), 135–143.

Atkinson, A., and J. Stiglitz, ‘‘The Design of Tax Structure: Direct versus IndirectTaxation,’’ Journal of Public Economics, 6 (1976), 55–75.

QUARTERLY JOURNAL OF ECONOMICS666

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 45: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Basu, A., and J. Ghosh, ‘‘Identifiability of the Multinormal and OtherDistributions under the Competing Risks Model,’’ Journal of MultivariateAnalysis, 8 (1978), 413–429.

Blackorby, C., C. Brett, and A. Cebreiro, ‘‘Nonlinear Taxes for Spatially MobileWorkers,’’ International Journal of Economic Theory, 3 (2007), 57–74.

Boadway, R., N. Marceau, and P. Pestieau, ‘‘Optimal Linear Income Taxation inModels with Occupational Choice,’’ Journal of Public Economics, 46 (1991),133–162.

Borjas, G., ‘‘Self-Selection and the Earnings of Immigrants,’’ American EconomicReview, 77 (1987), 531–553.

———, ‘‘The Wage Structure and the Sorting of Workers into the Public Sector,’’NBER Working Paper 9313, 2002.

Brett, C., ‘‘Optimal Nonlinear Taxes for Families,’’ International Tax and PublicFinance, 14 (2007), 225–261.

Brett, C., and J. Weymark, ‘‘Financing Education Using Optimal RedistributiveTaxation,’’ Journal of Public Economics, 87 (2003), 2549–2569.

Chone, P., and G. Laroque, ‘‘Negative Marginal Tax Rates and Heterogeneity,’’American Economic Review, 100 (2010), 2532–2547.

Dahl, G. B., ‘‘Mobility and the Return to Education: Testing a Roy Model withMultiple Markets,’’ Econometrica, 70 (2002), 2367–2420.

Diamond, P., ‘‘Optimal Income Taxation: An Example with a U-shaped Pattern ofOptimal Tax Rates,’’ American Economic Review, 88 (1998), 83–95.

Diamond, P., and J. Mirrlees, ‘‘Optimal Taxation in Public Production, I:Production Efficiency,’’ American Economic Review, 61 (1971), 8–27.

Feldstein, M., ‘‘On the Optimal Progressivity of the Income Tax,’’ Journal ofPublic Economics, 2 (1973), 356–376.

Fudenberg, D., and D. Levine, ‘‘Self-Confirming Equilibrium and the LucasCritique,’’ Journal of Economic Theory, 144 (2009), 2354–2371.

Fudenberg, D., and J. Tirole, Game Theory (Cambridge, MA: MIT Press, 1991).Ham, J. C., and R. J. Lalonde, ‘‘The Effect of Sample Selection and Initial

Conditions in Duration Models: Evidence and Experimental Data onTraining,’’ Econometrica, 64 (1996), 175–205.

Heckman, J., ‘‘Shadow Prices, Market Wages, and Labor Supply,’’ Econometrica,42 (1974), 679–693.

Heckman, J., and B. Honore, ‘‘The Empirical Content of the Roy Model,’’Econometrica, 58 (1990), 1121–1149.

Heckman, J., and G. Sedlacek, ‘‘Heterogeneity, Aggregation and Market WageFunctions: An Empirical Model of Self-Selection in the Labor Market,’’Journal of Political Economy, 93 (1985), 1077–1125.

———, ‘‘Self-Selection and the Distribution of Hourly Wages,’’ Journal of LaborEconomics, 8 (1990), S329–S363.

Hsieh, C., E. Hurst, C. Jones, and P. Klenow, ‘‘The Allocation of Talent and U.S.Economic Growth,’’ (Mimeo: Stanford University, 2011).

Kleven, H. J., C. T. Kreiner, and E. Saez, ‘‘The Optimal Income Taxation ofCouples,’’ Econometrica, 77 (2009), 537–560.

Laroque, G., ‘‘Income Maintenance and Labor Force Participation,’’ Econometrica,73 (2005), 341–376.

Lee, L., ‘‘Unionism and Wage Rates: A Simultaneous Equation Model withQualitative and Limited Dependent Variables,’’ International EconomicReview, 19 (1978), 415–433.

Mankiw, G., M. Weinzierl, and D. Yagan, ‘‘Optimal Taxes in Theory and Practice,’’Journal of Economic Perspectives, 23 (2009), 147–174.

Mirrlees, J., ‘‘An Exploration in the Theory of Optimum Income Taxation,’’ Reviewof Economic Studies, 38, (1971), 175–208.

Moresi, S., ‘‘Optimal Taxation and Firm Formation: A Model of AsymmetricInformation,’’ European Economic Review, 42 (1997), 1525–1551.

Naito, H., ‘‘Re-Examination of Uniform Commodity Taxes under a Non-LinearIncome Tax System and Its Implication for Production Efficiency,’’ Journalof Public Economics, 71 (1999), 165–188.

Parker, S. C., ‘‘The Optimal Linear Taxation of Employment andSelf-Employment Incomes,’’ Journal of Public Economics, 73 (1999),107–123.

REDISTRIBUTIVE TAXATION IN THE ROY MODEL 667

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from

Page 46: I. Introduction - Department of Economics Sciences Poecon.sciences-po.fr/sites/default/files/file/laroque/Rothschild... · REDISTRIBUTIVE TAXATION IN THE ROY MODEL* Casey Rothschild

Rochet, J.-C., and P. Chone, ‘‘Ironing, Sweeping and MultidimensionalScreening,’’ Econometrica, 66 (1998), 783–826.

Rothschild, C., and F. Scheuer, ‘‘Optimal Taxation with Rent-Seeking,’’ NBERWorking Paper 17035, 2011.

———, ‘‘Redistributive Taxation in the Roy Model,’’ NBER Working Paper 18228,2012.

Roy, A., ‘‘Some Thoughts on the Distribution of Earnings,’’ Oxford EconomicPapers, 3 (1951), 135–146.

Saez, E., ‘‘Using Elasticities to Derive Optimal Tax Rates,’’ Review of EconomicStudies, 68 (2001), 205–229.

———, ‘‘Direct or Indirect Tax Instruments for Redistributiopn: Short-run versusLong-run,’’ Journal of Public Economics, 88 (2004), 503–518.

Sargent, T., ‘‘Evolution and Intelligent Design,’’ American Economic Review, 98(2008), 5–37.

Scheuer, F., ‘‘Adverse Selection in Credit Markets and Regressive ProfitTaxation,’’ NBER Working Paper 18406, 2012a.

———, ‘‘Entrepreneurial Taxation with Endogenous Entry,’’ (Mimeo: StanfordUniversity, 2012b).

Stiglitz, J., ‘‘Self-Selection and Pareto Efficient Taxation,’’ Journal of PublicEconomics, 17 (1982), 213–240.

Werning, I., ‘‘Pareto-Efficient Income Taxation,’’ (Mimeo: MIT, 2007).Willis, R., and S. Rosen, ‘‘Education and Self-Selection,’’ Journal of Political

Economy, 87 (1979), S7–S36.Zeckhauser, R., ‘‘Taxes in Fantasy, or Most Any Tax on Labor Can Turn Out to

Help the Laborers,’’ Journal of Public Economics, 8 (1977), 133–150.

QUARTERLY JOURNAL OF ECONOMICS668

at Fondation Nationale D

es Sciences Politiques on January 30, 2015http://qje.oxfordjournals.org/

Dow

nloaded from