Income Dynamics and Inequality: The Case of Mexico*

Daniela Puggioni†ᵃᵇ, Mariana Calderónᵃ, Alfonso Cebreros Zuritaᵃ, León Fernández Bujandaᵃ, José Antonio Inguanzo Gonzálezᵇ, and David Jaumeᶜ

ᵃ Banco de México, Dirección General de Investigación Económica, Mexico City, Mexico
ᵇ U.C. Los Angeles, Department of Economics, Los Angeles, California, USA
ᶜ Banco de México, Dirección General de Estabilidad Financiera, Mexico City, Mexico

March 2021

Abstract

We characterize the salient features of the distribution of (log) earnings of formal workers in Mexico based on social security records for the period 2005–2019. The analysis is based on a non-parametric approach and is focused, primarily, on the properties of the distribution of earnings changes. Similar to Guvenen, Karahan, Ozkan, and Song [2021], we find strong evidence of deviations from normality of this distribution in terms of negative skewness and high kurtosis, with these deviations varying with income and along the worker’s life cycle. Due to the relative size of the informal sector in the Mexican economy, we also study the impact of transitions out of and into formal employment on wages earned in the formal sector and the impact of early exposure to informality on future earnings. We document that workers who exited formal employment can take, on average, three or more years upon re-entry to achieve comparable pre-exit wage levels. In addition, we provide evidence of the facts that the rate of informality matters more than the unemployment rate for impacting long-term earnings trajectories of cohorts of new workers entering the labor market, and that having the first job in the informal sector has a negative and significant impact on future earnings.

JEL Codes: E24, J24, J31, J46.
Keywords: Earnings dynamics, higher-order earnings risk, inequality, kurtosis, skewness, informal labor markets.

* The views and conclusions presented in this paper are exclusively those of the authors and do not necessarily reflect those of Banco de México and its Board of Governors. The IMSS data used in this paper are confidential and were made available through the Econlab at Banco de México. Inquiries regarding the terms and conditions for accessing these data should be directed to: [email protected]. We are grateful to the participants of the Global Income Dynamics conferences (at Stanford and University of Minneapolis) and the participants of the Informal Seminar at Banco de México’s research department for insightful comments and suggestions. We also thank Alejandro Trujillo for valuable research assistance. All remaining errors are our own.
† Corresponding author: [email protected]


1 Introduction

There is a large and growing literature studying the salient features of the distribution and dynamics of earnings and life-time incomes that has focused primarily on advanced economies such as the United States, the United Kingdom, Germany, and Spain, to name a few (see, for example, Bonhomme and Hospido [2017] and Guvenen et al. [2021]). Comparatively little attention has been paid to emerging and developing countries. The goal of this paper is to contribute to this literature along two dimensions. First, we characterize the defining properties of the distribution of earnings and of transitory earnings changes and the extent to which they display deviations from normality. We explore how these properties vary across genders and age groups, and along the income distribution. Second, we analyze the effect of transitions out of and back into formal employment on the earnings of workers, and the role of informality in shaping long-term labor market outcomes and the dynamics of earnings. On both fronts, to the best of our knowledge, we are the first to explore these issues in the Mexican context.

Regarding the distribution of earnings, we find that overall inequality (P90–P10 dispersion) remained fairly stable during the 2005–2015 decade. Starting in 2016, inequality has been decreasing due to revisions to the minimum wage that led to an increase in the real earnings of the lower percentiles of the earnings distribution, while the upper percentiles maintained relatively steady earnings. We also find that inequality increased during the period of the 2008–2009 financial crisis, more so for men than for women, and for younger workers than for older ones. Initial inequality (P90–P10 dispersion at age 25) has also been fairly stable between 2005 and 2019, with the exception of a marginal increase registered during the financial crisis. Within the limited time span covered by our data, we observe relatively stable earnings mobility patterns, with lower income workers moving upward in the earnings distribution and higher income workers being more likely to move downward. These patterns tend to be more pronounced for younger workers. The exception is the top 0.1% of the distribution, where there is essentially no mobility.

With respect to the distribution of one-year earnings changes, we encounter evidence of strong deviations from normality whose extent varies across income levels, age groups, and genders. Similar to the findings of Guvenen et al. [2021], we document very high levels of kurtosis across all demographic groups, implying that most workers experience innovations to their earnings of a relatively small magnitude, but a non-negligible mass of them faces extreme earnings shocks. Kurtosis is noticeably higher for women, even though it has been steadily decreasing for both men and women since 2009, shrinking the gap in kurtosis levels across genders in the process. While the distribution of transitory earnings innovations turns out to be asymmetric, we find that the direction of skewness differs across age, gender, and income. In particular, this distribution is positively skewed for lower income men of all age groups, for lower income women aged 34 or less, and for women of all income levels aged 35 or more. This means that, in comparison with higher-income men, the earnings of these groups of workers have more room to move up and less room to fall. This contrasts with what Guvenen et al. [2021] document for the United States, where this distribution is negatively skewed for all workers.
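To make the notion of deviations from normality concrete, the sketch below computes two percentile-based statistics on simulated one-year log earnings changes. It is only an illustration under stated assumptions: the text above does not name the exact statistics used, so the choice of Kelley skewness and Crow-Siddiqui kurtosis (measures commonly used in this literature) is an assumption, and the simulated mixture and its parameters are hypothetical rather than calibrated to the IMSS records.

```python
import random


def percentile(sorted_values, p):
    """Linear-interpolation percentile of an already sorted list, p in [0, 100]."""
    position = (len(sorted_values) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight


def kelley_skewness(changes):
    """((P90 - P50) - (P50 - P10)) / (P90 - P10); zero for a symmetric distribution."""
    s = sorted(changes)
    p10, p50, p90 = (percentile(s, p) for p in (10, 50, 90))
    return ((p90 - p50) - (p50 - p10)) / (p90 - p10)


def crow_siddiqui_kurtosis(changes):
    """(P97.5 - P2.5) / (P75 - P25); about 2.91 for a Gaussian distribution."""
    s = sorted(changes)
    p2_5, p25, p75, p97_5 = (percentile(s, p) for p in (2.5, 25, 75, 97.5))
    return (p97_5 - p2_5) / (p75 - p25)


rng = random.Random(0)
n = 100_000

# Hypothetical mixture: 95% of workers draw small shocks, 5% draw large ones.
mixture = [
    rng.gauss(0.0, 0.80) if rng.random() < 0.05 else rng.gauss(0.0, 0.05)
    for _ in range(n)
]
# Gaussian benchmark (the standard deviation is an arbitrary illustrative value).
gaussian = [rng.gauss(0.0, 0.20) for _ in range(n)]

kurtosis_mixture = crow_siddiqui_kurtosis(mixture)
kurtosis_gaussian = crow_siddiqui_kurtosis(gaussian)
skewness_gaussian = kelley_skewness(gaussian)
```

In this simulated example the mixture has a larger Crow-Siddiqui statistic than the Gaussian benchmark, which is the pattern described above: many small changes combined with a small mass of extreme ones.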

To document these new facts regarding the distribution of earnings and earnings changes for Mexican workers, our analysis relies on social security records from the Instituto Mexicano de Seguridad Social (IMSS) covering the period from 2005 to 2019 and the universe of (private sector) formal workers. Unlike previous work based on Mexican survey data (see, for example, Binelli and Attanasio [2010]), our social security records have large sample sizes and allow for following workers continuously throughout their employment history in the formal sector.

Social security data have the advantage of providing accurate, reliable, and consistent information on millions of Mexican workers,¹ but they also pose several important challenges. First, as is commonly the case with administrative data, earnings are top- and bottom-coded. Second, it is not possible to separate wage effects from labor-supply effects on earnings since no information on the number of hours worked or the full- versus part-time status of a worker is available. Third, Mexican social security records do not provide important information on worker characteristics such as educational attainment or occupation of employment, precluding the possibility of exploring how the distribution of earnings or earnings changes varies along these dimensions.² The main limitation of our data, however, is that they only cover workers employed in the formal private sector. Hence, while the power of millions of administrative records can be exploited to establish key features of the distribution of earnings and earnings changes for workers in the formal labor market, this source of information has nothing to say about these distributions for a large fraction of the Mexican labor market: informal workers.

¹ It has been documented that employers used to under-report wages to IMSS; this practice, however, has substantially declined since the 1997 reform of the pension system in Mexico. Hence, under-reporting does not appear to be a problem for the period covered in our study. See Kumler, Verhoogen, and Frías [2020].

² An additional issue is that, due to the relatively short time span covered in our data, we are not able to speak to the impact that major events, such as the enactment of NAFTA, the severe financial crisis that followed the peso devaluation of 1995, or China’s entry into the WTO (a major competitor in the U.S. market, Mexico’s main export destination), may have had on the distribution of earnings and on earnings dynamics.

Motivated by this lack of information on the informal sector, which we consider to be a very significant issue, in the second part of our analysis we study how time away from formal employment and early exposure to informality shape labor market trajectories and earnings dynamics of Mexican workers. We provide empirical evidence that workers who exit formal employment and subsequently re-enter formal labor relationships experience a penalty upon re-entry, with earnings taking up to three years to gradually grow toward pre-exit levels. We also document that the earnings of women seem to recover somewhat faster than those of men. This suggests that the ability to maintain a continuous attachment to the formal labor market is likely to be an important source of heterogeneity in the distribution of earnings changes across Mexican workers. Due to the limitations of our data, we are unable to determine whether this time away from formal employment was spent in public sector formal employment, in informal employment, in unemployment, or out of the labor force; therefore, we cannot shed further light on the actual source of this penalty.

Lastly, based on household survey data that provide information on the employment of informal sector workers, we find that for cohorts of workers entering the labor market for the first time, entry during periods of relatively higher rates of informality is associated with an adverse effect on future labor force participation and earnings. We also exploit a special module of the Mexican household survey that tracks individual labor market trajectories, and find that, when a worker’s first job is in the informal sector, this is associated with a long-lasting negative effect on his/her earnings. This last set of results indicates that another important determinant of the heterogeneity in earnings levels and earnings changes that we document based on social security records can be traced back to initial conditions and the extent to which informality shaped workers’ early labor market experiences.

Related literature. Methodologically, the first half of this paper is related to the work of Arellano, Blundell, and Bonhomme [2017], Guvenen, Kaplan, Song, and Weidner [2017], and Guvenen et al. [2021], among others. These authors propose non-parametric methods to characterize key features of the distribution of earnings shocks and of life-time incomes. The strength of these methods is that non-linearities and non-normalities, which may be important attributes of the earnings process, can be uncovered more easily by avoiding strong parametric assumptions.

An early, important paper in this strand of the literature is Guvenen, Ozkan, and Song [2014], which documents that the distribution of earnings changes is negatively skewed and becomes more so during recessions. Thus, during recessions large upward earnings movements become less likely, while large drops in earnings become more likely. Guvenen et al. [2021] go further by analyzing life-cycle variation in skewness and other properties of the distribution of earnings changes, such as kurtosis, and how they vary with earnings levels and age.
Relative to these papers, our contribution is to analyze the same issues in the context of the Mexican labor market, a market that differs significantly from the American labor market studied by those authors and from the labor markets in more advanced economies. Unlike in the United States and other developed countries, the labor market experience of Mexican workers is heavily shaped by the lack of a strong social safety net, such as unemployment insurance, and by the prevalence of informality, both in terms of informal jobs at formal firms and informal jobs at informal firms.

Our work is also related to Binelli and Attanasio [2010], who analyze the level and dispersion of earnings using data from Mexican household surveys for the period 1987–2002. These authors document a significant increase in wage and income inequality (P90–P10 dispersion) during the first half of the 1990s. For hourly wages, they find that the increase was characterized by inequality growing faster at the top (P90–P50) than at the bottom (P50–P10) of the distribution. For the second half of the 1990s, they observe a drop or stable trend, with the slowdown in income and wage inequality being explained primarily by the top of the distribution: top-tail dispersion (P90–P50) of hourly wages and household income dropped significantly, while bottom-tail dispersion (P50–P10) decreased slightly for wages and maintained an upward trend for income. Relative to these authors, we examine a later period in the Mexican economy (2005 to 2019) and base our analysis on social security records, rather than on household surveys. Also, our primary focus is on the distribution of earnings changes rather than on the distribution of earnings levels. Other contributions to the study of income inequality in the Mexican context are Esquivel, Lustig, and Scott [2010], Lustig, Lopez-Calva, and Ortiz-Juarez [2013], and Campos-Vazquez and Lustig [2017]. The first two find that inequality decreased in Mexico from the mid-1990s to 2006, mainly due to equalizing changes in the distribution of labor income attributable to a systematic fall in the skill premium, measured as the gap between the wages of workers with tertiary (or secondary) education and workers with no schooling or incomplete primary school. The third one argues that issues such as non-response, under-representation of high-wage earners, and weight assignment in survey data are relevant and can lead to mixed results in terms of evaluating the evolution of inequality. Unlike these authors, we quantify inequality using exclusively administrative data that, while lacking possibly valuable information on education, do not suffer from the same, possibly severe, problems of survey data.

The second half of our study is related to Oreopoulos, Von Wachter, and Heisz [2012] and Schwandt and Von Wachter [2019], who find that initial economic conditions for labor market entrants can have persistent effects on their earnings. For instance, entering the labor market for the first time during a recession is associated with a persistent reduction in earnings and wages. For both Canada (Oreopoulos et al. [2012]) and the United States (Schwandt and Von Wachter [2019]), reductions in earnings from entering the labor market during a recession proved to be quite persistent, lasting at least 10 years. Given the peculiar dual (formal/informal) nature of the Mexican labor market, we add to this literature by extending the approach of these authors to include the role of informality as part of the initial economic conditions faced by labor market entrants. Our evidence suggests that, more than the unemployment rate, early exposure to relatively higher informality matters the most for future outcomes of Mexican workers.

2 Part I. Inequality, Mobility and Income Dynamics: Evidence from Social Security Data

In this section we present the first part of our study, which corresponds to a descriptive analysis of inequality, earnings dynamics, and mobility based on Mexican administrative records. The analysis is based on a non-parametric approach and is closely related to the work of Bonhomme and Hospido [2017] for Spain and Guvenen et al. [2017] and Guvenen et al. [2021] for the United States. To the best of our knowledge, we are the first to provide such a characterization of the earnings distribution of private sector formal workers in Mexico.³

The rest of this section is organized as follows: section 2.1 discusses the data underlying the analysis; section 2.2 provides a brief overview of the macroeconomic context of the Mexican economy for the period 2005–2019, which corresponds to the period covered by our administrative records; and sections 2.3–2.5 present the analysis regarding income inequality, the distribution of earnings changes, and income mobility over time. Note that all the results presented in sections 2.3–2.5 distinguish between men and women; results for the whole sample are presented in appendix B.

³ Binelli and Attanasio [2010] present some results that are related to the analysis presented in this section and in appendix B, particularly concerning income inequality. These authors base their analysis on Mexican household surveys, rather than administrative data, and their results are focused on the period 1987 to 2002, a period that is not covered by our administrative records.

2.1 Data

The analysis conducted in this section of the paper is based on social security records from the Instituto Mexicano de Seguridad Social (IMSS), one of the main Mexican social security institutions. All formal private sector workers who receive a salary are required, by law, to register with IMSS. The set of workers covered by IMSS does not include government workers or workers employed in the informal sector.⁴ Since informality is prevalent in Mexico, a large portion of the labor force is not covered by the social security data.⁵ Later in the paper we address some potentially important issues regarding the interplay between formality and informality in the Mexican labor market.

The social security data have a monthly frequency for the period January 2005 to December 2019, and cover approximately 13 million workers at the start of the sample and 20 million workers toward the end. For the purposes of the first half of the analysis, the key variable contained in the social security data is the information on wages, reported as a worker’s daily taxable income (“salario base de cotización”).⁶ This means that the data on daily wages can include various forms of compensation received by the worker other than wages (e.g., paid vacation, end-of-year bonus), but may exclude others (in general, any additional benefit or compensation that is not subject to labor income taxation), hence not necessarily reflecting the total labor income a worker receives. The data are bottom- and top-coded. At the bottom, the threshold is set by the minimum wage, while at the top the cap is set at 25 minimum wages before February 2017, and 25 UMAs (unit of measure and update) afterward.⁷ There is, however, no information on the number of hours, days, etc. worked or on whether employment was part time or full time.⁸

⁴ IMSS-insured workers account for approximately 80% of formal sector workers with access to social security, according to estimates from the Secretaría de Trabajo y Previsión Social (the Mexican Ministry of Labor). Other formal sector workers obtain their social security through other institutions. For example, Petróleos Mexicanos (PEMEX), the state-owned oil company, provides its own social security for its workers. Most government workers are registered with the Instituto de Seguridad y Servicios Sociales de los Trabajadores del Estado (ISSSTE). A small fraction of government workers, such as Banco de México employees, are actually registered with IMSS rather than with ISSSTE.

⁵ It should be noted, however, that self-employed workers (individuals who work on their own and without employees) can register with IMSS to obtain access to some parts of the social security system (these workers are not eligible for work compensation and labor accident insurance). By default, self-employed workers are recorded in the social security data with a wage equal to the minimum wage. For any given month, the share of enrolled workers who are “self-employed” is roughly 0.1% of the observations.

⁶ In Mexico it is not customary to contract on hourly wages (e.g., the minimum wage is established as a daily wage, unlike in the USA, where the minimum wage is defined on a per-hour basis).

⁷ https://www.inegi.org.mx/temas/uma/

⁸ Another relevant issue pertaining to the measurement of a worker’s labor income is that, in a given month, the data may contain multiple observations for the same social security number (SSN), corresponding to different jobs held by the same worker (possibly with different employers). There can also be multiple observations for pairs of SSN and employer id (“registro patronal”) for which the value of the wage variable does not coincide. Multiple SSNs, nevertheless, represent no more than 2% of the observations; hence this particular issue is confined to at most 2% of the sample, and it affects, on average, just 1.4% of the original records.

Since the information on wages is available as a daily wage with monthly frequency, we first express daily wages as monthly wages by multiplying the “salario base de cotización” by 30 and then add the monthly wages up to obtain the annual labor income for each worker in a given year. For the period 2005–2019, this results in over 315 million worker-year pairs, with observations per year ranging between 17 and 26 million for workers aged 14–75 years old. This constitutes our universe of potential observations. By imposing two “admissibility” conditions on the population, we construct the master sample for the analysis carried out in sections 2.3–2.5. In particular, we impose that: (i) individuals must be between 25 and 55 years of age (i.e., the prime-age labor force), and (ii) individuals must display meaningful attachment to the labor force, in the sense that their earnings must be above a minimum earnings threshold $Y_{\min,t}$. Since in Mexico the minimum wage is defined as a daily wage, rather than as an hourly wage as is common in other countries, we set $Y_{\min,t}$ equal to 45 days of minimum wage, which corresponds to half a quarter of full-time minimum wage employment.⁹ Within the subset of the sample that satisfies the first admissibility condition, the fraction of observations that are above the minimum earnings threshold varies between 97.5% and 98.5% throughout the sample period.
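The construction of annual earnings and of the master sample can be sketched in a few lines. This is a minimal illustration of the steps described above, not the authors' code: the record layout `(worker_id, year, daily_wage)`, the helper names, and all numeric values in the example (wages, ages, and the daily minimum wage) are hypothetical; only the factor of 30, the 45-day threshold, and the 25–55 age range come from the text.

```python
from collections import defaultdict

DAYS_PER_MONTH = 30         # from the text: daily wage multiplied by 30
MIN_DAYS_THRESHOLD = 45     # from the text: Y_min,t equals 45 days of minimum wage
AGE_MIN, AGE_MAX = 25, 55   # from the text: prime-age workers


def annual_earnings(monthly_records):
    """Sum monthly wages (daily wage * 30) by (worker_id, year)."""
    totals = defaultdict(float)
    for worker_id, year, daily_wage in monthly_records:
        totals[(worker_id, year)] += daily_wage * DAYS_PER_MONTH
    return dict(totals)


def master_sample(annual, ages, daily_min_wage):
    """Keep worker-years that satisfy both admissibility conditions."""
    kept = {}
    for (worker_id, year), income in annual.items():
        age = ages[(worker_id, year)]
        y_min = MIN_DAYS_THRESHOLD * daily_min_wage[year]
        if AGE_MIN <= age <= AGE_MAX and income > y_min:
            kept[(worker_id, year)] = income
    return kept


# Hypothetical example data (not taken from the IMSS records).
records = (
    [("A", 2019, 200.0)] * 12    # twelve monthly records, prime-age worker
    + [("B", 2019, 120.0)] * 1   # a single month: weak attachment
    + [("C", 2019, 300.0)] * 12  # twelve months, but outside the age range
)
ages = {("A", 2019): 30, ("B", 2019): 40, ("C", 2019): 60}
daily_min_wage = {2019: 100.0}   # hypothetical value, in the same units as wages

annual = annual_earnings(records)
sample = master_sample(annual, ages, daily_min_wage)
```

In this example only worker A is retained: worker B earns 3,600, below the threshold of 45 × 100 = 4,500, and worker C fails the age condition.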

Table 1: Descriptive statistics for selected cross-sectional samples

Year   Obs.         Mean income        Women      Age shares (%)
       (millions)   Men      Women     (% share)  [25–35]  [36–45]  [46–55]
2005   12.43        6,114    5,006     34.75      52.41    31.24    16.35
2019   19.58        6,543    5,564     38.70      46.03    31.64    22.33

Year   P1      P5      P10     P25       P50       P75       P90        P95        P99        P99.9
2005   230.14  401.89  682.54  1,584.90  3,191.88  6,483.40  12,492.76  19,602.04  53,289.61  56,754.57
2019   320.49  523.28  850.17  1,939.84  3,442.34  7,093.79  13,490.63  20,679.88  53,427.10  56,708.68

Note: Based on authors’ calculations with IMSS data. The table shows summary statistics and demographic characteristics in selected years for the cross-sectional sample used to carry out the analysis presented in sections 2.3–2.5. The mean and percentiles of the income distribution are calculated using raw real earnings deflated with the Mexican CPI for 2018 and then converted to US dollars to facilitate comparison across countries. Since the data are top-coded, the percentiles above P95 are imputed by fitting a Pareto distribution around the top code. The Mexican administrative data do not contain information about the educational level of the worker.

Table 1 presents some basic descriptive statistics and demographic characteristics in selected years for the cross-sectional sample used to carry out the analysis presented in sections 2.3–2.5. Additional details regarding the IMSS data and some relevant summary statistics for the master sample are provided in appendix A.
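The note to Table 1 mentions that percentiles above the top code are imputed from a fitted Pareto distribution. The sketch below shows one simple way such an imputation can work; it is an assumption-laden illustration, not the procedure used for the table. It assumes that the upper tail follows a Pareto law above a hypothetical lower cutoff, estimates the tail index from the counts of observations above that cutoff and at the top code, and then extrapolates quantiles into the censored region. The data are simulated and all parameter values are hypothetical.

```python
import math
import random


def pareto_tail_index(earnings, lower_cutoff, top_code):
    """Tail index implied by P(Y >= top_code | Y >= lower_cutoff) = (lower_cutoff / top_code) ** alpha."""
    n_lower = sum(1 for y in earnings if y >= lower_cutoff)
    n_top = sum(1 for y in earnings if y >= top_code)
    return math.log(n_lower / n_top) / math.log(top_code / lower_cutoff)


def impute_quantile(p, share_at_top, top_code, alpha):
    """Quantile at rank p inside the censored region (1 - share_at_top < p < 1)."""
    return top_code * (share_at_top / (1.0 - p)) ** (1.0 / alpha)


# Simulated latent earnings with a known Pareto tail (hypothetical parameters).
rng = random.Random(0)
TRUE_ALPHA, SCALE, TOP_CODE = 2.5, 1000.0, 4000.0
latent = [SCALE * rng.paretovariate(TRUE_ALPHA) for _ in range(200_000)]
observed = [min(y, TOP_CODE) for y in latent]  # top-coded data, as in the records

alpha_hat = pareto_tail_index(observed, lower_cutoff=2000.0, top_code=TOP_CODE)
share_at_top = sum(1 for y in observed if y >= TOP_CODE) / len(observed)

p99_imputed = impute_quantile(0.99, share_at_top, TOP_CODE, alpha_hat)
p99_true = SCALE * (1.0 - 0.99) ** (-1.0 / TRUE_ALPHA)  # known only in a simulation
```

Because the simulated data are exactly Pareto, the imputed P99 lands close to the true value; with real earnings data the quality of the imputation depends on how well the Pareto assumption holds near the top code and on the choice of the lower cutoff.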

2.2 Macroeconomic Context in Mexico

Before moving to the core analysis, we provide a brief overview of the macroeconomic context in Mexico during the period covered in our administrative data, 2005–2019. This overview is intended to offer minimal relevant background that may aid in the interpretation of the inequality and income dynamics results presented in the upcoming sections.

⁹ Note that for most other countries in the Global Income Dynamics Project, the threshold $Y_{\min,t}$ is defined as 13 weeks of part-time (20 hours a week) minimum wage employment. We chose the threshold for Mexico to be as close as possible to this common definition.

The first panel of figure 1 presents the evolution of real GDP growth. Unsurprisingly, there is a brief but noticeable contraction in GDP during the period of the 2008–2009 financial crisis. After the recovery from that recession, we observe a slowdown in growth during 2013 and a moderate contraction in 2019. Outside of those years, the 2005–2019 period can be characterized overall as one of weak but steady growth. The second panel of figure 1 presents a Macroeconomic Uncertainty Index (MUI) for Mexico derived using the methodology outlined in Jurado, Ludvigson, and Ng [2015].¹⁰ The economic slowdown during the financial crisis is paired with increased levels of uncertainty, which also steadily increased from mid-2016 onward. This recent trend can be associated with both external and internal factors. On the external side, the renegotiation of NAFTA, along with a generally more protectionist stance from the United States, represented a large and significant increase in trade policy uncertainty for Mexico vis-à-vis its largest trading partner. This source of uncertainty did not fully subside until early 2020 with the enactment of the USMCA. On the internal side, a period of increasing uncertainty commenced with the presidential election and the sudden cancellation of the new Mexico City airport in the second half of 2018.

Figure 1: Aggregate activity and economic uncertainty

[Figure omitted. Panel (a): Real GDP growth, 2005–2019; series: GDP growth, average GDP growth. Panel (b): Macroeconomic Uncertainty Index (MUI), 2005–2019; series: MUI 1 month, MUI 12 months.]

Note: Based on authors’ calculations with data from INEGI (the Mexican statistical agency). In panel (a), the red dashed linerefers to the average GDP growth during the sample period. In panel (b) the MUI is calculated based on information from 125economic monthly series using the methodology outlined in Jurado et al. [2015]. The MUI 1-month (left axis) refers to theuncertainty index constructed based on one-month ahead forecast errors and the MUI 12-months (right axis) refers to the indexconstructed based on 12-month ahead forecast errors.Regarding labor market aggregate conditions, figure 2 presents the unemployment and informalityrates.11 The unemployment rate increased during the financial crisis, reaching 5.5% in 2009. This number

10In the presence of adjustment costs in both labor and capital inputs, uncertainty can have a significant impact on firm’sinvestment, job creation/destruction decisions and, hence, on labor market outcomes. For example, Bloom [2009], Jurado et al.[2015], and Baker, Bloom, and Davis [2016] have shown that increases in uncertainty can be associated with reductions inemployment and investment, while Mathy [2020] finds evidence suggesting that an increase in uncertainty can negatively affectwages.11The rate of informality corresponds to the percentage of the employed population (15 years and older) that is employed in8

is low compared to the United States, where it increased to 10% in the same period. Between 2012 and2014, unemployment remained stable before resuming its decreasing trajectory toward pre-financial crisislevels. By the end of the period under consideration, it started to moderately increase. Differentiatingby gender, we see that the unemployment rate is fairly similar for men and women, with the exceptionof the 2005–2008 and 2016–2018 periods when unemployment for women was noticeably higher thanfor men. The relatively low unemployment rate in Mexico, even in the middle of a crisis of significantmagnitude, is consisten with this macroeconomic indicator not being able to fully capture the true state ofthe economy. The rate of informality is quite high ranging between 56 and 60% during the entire periodwith the exception of the years 2008–2012 when it displayed a steady downward trend. Informality ishigher for women than for men. Over the period considered, there is a strong correlation between therate of informality for men and women, except in 2017–2019 when the rate of informality for men kepttrending downward, while that for women showed a moderate increase.Figure 2: Unemployment and informality

(a) Unemployment rate

33.

54

4.5

55.

56

Une

mpl

oym

ent R

ate

2005 2007 2009 2011 2013 2015 2017 2019

TotalMenWomen

(b) Informality rate55

5657

5859

6061

62In

form

ality

Rat

e

2005 2007 2009 2011 2013 2015 2017 2019

TotalMenWomen

Note: The unemployment rate in year t corresponds to the average quarterly unemployment rate in that year. The rate of informality for year t corresponds to the average monthly rate of informality throughout that year. The official statistics published by INEGI are the quarterly and monthly rate, respectively.

The first panel of figure 3 shows the evolution of labor productivity, based on the employed population and the total hours worked. During the financial crisis there was a sharp drop in labor productivity, and pre-crisis levels were not reached again until 2014. Starting in 2013 there was a strong increase that peaked in 2017. From 2017 labor productivity started trending downward. The sharp increase in labor productivity between 2013 and 2017 coincides with a broad labor market reform that was approved at the end of 2012.12 The right panel in figure 3 illustrates the evolution of the minimum (real) wage, which remained largely unchanged for most of the period of analysis. Starting in 2014 the minimum wage has been increasing steadily and substantially. The evolution of the minimum wage will be particularly useful for understanding the time-series patterns of income measures for the lowest percentiles of the income distribution, since the social security data are bottom coded at the minimum wage. Remarkably, the sharpest increases in the minimum wage have occurred from 2017 onward, a period that also coincides with the decrease observed in aggregate labor productivity.

an informal job. A job is classified as informal if it is performed without legal or institutional protection, regardless of the nature of the economic unit/firm in which said job is carried out. In Mexico a worker is classified as informal if he/she is a wage earner who does not have access to social security and/or a self-employed worker who does not follow a formal accounting system.

12 The labor reform was broad and touched on issues such as restrictions on outsourcing, conditions for trial hires, paternity leave, strengthened union transparency, and regulation of under-age work, among others. This reform allowed, for the first time, for

Figure 3: Labor market conditions

[Panels: (a) Labor productivity, with series for employed population and total hours worked; (b) Minimum wage. Horizontal axis: years 2005–2019.]

Note: Based on authors’ calculations with data from INEGI. Labor productivity in year t corresponds to the average, seasonally adjusted, quarterly labor productivity index for all the quarters of year t. The minimum wage, in logs, is annualized, deflated with the 2018 Mexican CPI, and normalized to 0 in 2005.

2.3 Income Inequality

We begin our descriptive analysis by characterizing the most salient properties of the distribution of (log) real earnings. Figure 4 presents the evolution of the percentiles (relative to 2005) of the earnings distribution for both men and women. Notably, the 2008–2009 financial crisis had a negative impact on earnings across the whole earnings distribution, but especially so for men and for workers at the bottom of the distribution. The earnings of workers at the very top proved to be more resilient to the negative shock associated with the crisis. Between 2013 and 2014 there was a decrease in the earnings of the lowest percentiles, particularly sharp for men. This is likely associated with the fact that 2013 was a year of below-average GDP growth and stalled growth in labor productivity (see figures 1 and 3). Outside of these two periods, real earnings have shown an upward trend that is more noticeable for women than for men. Male workers showed more ups and downs at the bottom of the earnings distribution and flatter profiles at the top. In contrast, female workers displayed a steadier upward trend across the entire earnings distribution. One important observation is that starting in 2016 the lowest percentiles of the earnings distribution (P10 and P25), for both men and women, display a significant upward trend relative to previous years, with log earnings increasing by roughly 20 log points between 2016 and 2019. This pattern can be traced back to the evolution of the minimum wage (see right panel of figure 3).13

workers to be hired on a trial basis and on flexible schedules (i.e. only for a few hours per day and/or discontinuous shifts). Proponents of this reform argued that some of its provisions would make labor markets more flexible and foster job creation, while protecting and strengthening workers’ rights (https://aristeguinoticias.com/0110/mexico/asi-quedo-la-reforma-laboral/).

Figure 4: Evolution of the percentiles of the log real earnings distribution

[Panels: (a) Men and (b) Women, series p10, p25, p50, p75, p90; (c) Men and (d) Women, series p90, p95, p99, p99.9, p99.99. Vertical axis: percentiles relative to 2005. Horizontal axis: years 2005–2019.]

Note: Based on authors’ calculations with data from IMSS. Using the CS+TMax sample, this figure plots against time the following percentiles of the distribution of log (real) earnings: (a) Men: P10, P25, P50, P75, P90; (b) Women: P10, P25, P50, P75, P90; (c) Men: P90, P95, P99, P99.9, P99.99; (d) Women: P90, P95, P99, P99.9, P99.99. Since the data are top coded, the percentiles above P95 are imputed by fitting a Pareto distribution around the top code. All percentiles are normalized to 0 in 2005, the first available year. Shaded areas are recessions.

13 Campos Vázquez and Rodas Milián [2020] use IMSS data and show that the minimum wage increases of 2012 and 2015 translated into almost one-to-one increases in earnings for the lowest percentiles of the earnings distribution. Their results suggest that the positive effect of the minimum wage on earnings for IMSS-affiliated jobs is strongest in the percentiles up to P10. They additionally find that the earnings of the percentiles beyond P10 and up to approximately P75 also increase in response to minimum wage increases, albeit to a lesser extent, due to the so-called “lighthouse effect” as documented by several authors in the Mexican context (see, for example, Kaplan and Novaro [2006]).

The evolution of several measures of dispersion in the distribution of real earnings is documented in figure 5. Panels (a) and (b) show that the overall dispersion in this distribution is very similar for men and women and seems to closely match the dispersion that would be observed in a normal distribution. The dispersion is, on average, slightly higher for men, and men are also those who experienced a more noticeable increase in earnings dispersion during the crisis of 2008–2009. Outside of this recessionary period, the dispersion remained fairly stable between 2005 and 2015, but started to display a steady decreasing trend from 2015 onward.14 The behavior of the P90–P10 measure presented in panels (a) and (b) can be further understood by looking at the evolution of the upper and lower tails of the distribution (P90–P50 and P50–P10, respectively) presented in panels (c) and (d). The relative stability of P90–P10 from 2005 through 2015 is the result of a slight downward trend in the lower tail of the distribution that is barely offset by a slight upward trend in the upper tail. The downward trend of P90–P10 from 2015 onward can be associated with a downward trend in both the lower (P50–P10) and upper (P90–P50) tails of the distribution. Referring back to figure 4, the reduction in the inequality of log earnings is mainly driven by the growth of the lower percentiles paired with the relative stability in the highest percentiles.

Finally, figures 6 and 7 describe how initial inequality, for workers aged 25, has evolved over time and how life-cycle inequality differs across different cohorts of workers. In figure 6 we see that the lower tail (P50–P10) experiences a modest downward trend, while the upper tail (P90–P50) experiences a very slight upward trend. These trends are similar for men and women, but are indeed very marginal, as new cohorts of young workers faced relatively stable earnings dispersion throughout our sample period. The patterns of life-cycle inequality across different cohorts are also very similar for men and women, as illustrated in figure 7. Specifically, for any given cohort, dispersion of log earnings increases until the last years of the sample (until 2016 or 2018 depending on the cohort) and displays a downward trend thereafter. Once again, this is likely associated with the increase in earnings for minimum wage workers starting around 2016 and the relative stability of the real earnings for workers at the top of the earnings distribution. Across cohorts, the dispersion of earnings for workers aged 25 shows a very slight downward trend, except for those aged 25 around the time of the 2008–2009 crisis, when dispersion increased. In contrast, there is a more noticeable downward trend in the dispersion of earnings across cohorts for workers 30 and 35 years of age.
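The note to figure 4 states that percentiles above P95 are imputed by fitting a Pareto distribution around the top code, without spelling out the fitting procedure. As a rough illustration only, the following Python sketch assumes a two-quantile fit: the tail index is backed out from two observed percentiles at or below the top code and then used to extrapolate higher percentiles. The function names, the anchor percentiles, the numerical values, and the fitting method itself are assumptions made for this example and are not taken from the paper.

```python
import math

def pareto_tail_index(y_lo, p_lo, y_hi, p_hi):
    """Tail index alpha implied by two quantiles of a Pareto tail.

    Under a Pareto tail, P(Y > y) is proportional to y**(-alpha), so
    (1 - p_lo) / (1 - p_hi) = (y_hi / y_lo)**alpha.
    Earnings are in levels (not logs); p_lo and p_hi are in (0, 1).
    """
    return math.log((1 - p_lo) / (1 - p_hi)) / math.log(y_hi / y_lo)

def impute_percentile(y_hi, p_hi, alpha, p):
    """Extrapolate the p-th quantile (p > p_hi) from the anchor (y_hi, p_hi)."""
    return y_hi * ((1 - p_hi) / (1 - p)) ** (1 / alpha)

# Hypothetical anchors: P90 and P95 of earnings in levels.
alpha = pareto_tail_index(y_lo=3.0, p_lo=0.90, y_hi=4.5, p_hi=0.95)
p99 = impute_percentile(y_hi=4.5, p_hi=0.95, alpha=alpha, p=0.99)
log_p99 = math.log(p99)  # the figures report percentiles of log earnings
```

A two-point fit is only one of several possible choices; a maximum likelihood fit on all observations above a cutoff would be a natural alternative if the micro data were available.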

2.4 Income Dynamics

The results of this section constitute the core of our analysis of administrative data. We characterize the most relevant properties of individual earnings dynamics, with an emphasis on the statistics that provide a diagnostic of the extent to which the distribution of earnings changes deviates from normality. As in Guvenen et al. [2021], the approach is non-parametric to avoid the strong assumptions embedded in benchmark econometric models of earnings dynamics, as those could mask important features of the

14 Figures B.4 and B.5 in appendix B are also consistent with decreasing earnings inequality from 2016 onward. In particular, B.4 shows that the share of income that accrues to the bottom 50% of the earnings distribution grew during that period, while the income share of the top 10% of the distribution decreased. Similarly, B.5 shows that the Gini coefficient of the earnings distribution decreased substantially between 2016 and 2019, relative to the trend it had displayed in previous years.

Figure 5: Evolution of the percentiles of the log real earnings distribution

[Panels: (a) Men and (b) Women, series 2.56*σ and P90–P10; (c) Men and (d) Women, series P90–P50 and P50–P10. Vertical axis: dispersion of log earnings. Horizontal axis: years 2005–2019.]

Note: Based on authors’ calculations with data from IMSS. Using the CS+TMax sample, this figure plots against time the following measures of overall, top- and bottom-tail dispersion of the distribution of log earnings: (a) Men: P90–P10 and 2.56*σ (σ is the standard deviation); (b) Women: P90–P10 and 2.56*σ of log income; (c) Men: P90–P50 and P50–P10; (d) Women: P90–P50 and P50–P10. Shaded areas are recessions. 2.56*σ corresponds to the P90–P10 differential for a Gaussian distribution.

distribution of earnings changes. Deviations from normality are important to study because they have direct implications for the kind of income shocks a worker may experience, such as their persistence, frequency, direction, and magnitude.

Defining (residualized) log earnings changes as $g^k_{it} = \Delta^k \varepsilon_{it} = \varepsilon_{i,t+k} - \varepsilon_{it}$,15 Guvenen et al. [2021] show that as $k$ becomes larger, the distribution of earnings changes $\Delta^k \varepsilon_{it}$ reflects more closely the distribution of the permanent component of earnings changes rather than that of transitory innovations. Here we focus on one-year earnings changes (i.e. $k = 1$), which mainly reflect transitory innovations to earnings, and present results for $k = 5$, the more permanent innovations, in appendix B.
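The Gaussian benchmark in the note to figure 5 can be checked directly: for a normal distribution, P90 and P10 lie about 1.28 standard deviations above and below the mean, so their difference is roughly 2.56σ. The short Python sketch below verifies the constant and compares P90–P10 with 2.56σ on a simulated sample. The helper name `p90_p10` and the simulated data are illustrative assumptions, not part of the paper's procedure.

```python
import random
import statistics

# P90 - P10 of a standard normal equals 2 * z_0.90.
z90 = statistics.NormalDist().inv_cdf(0.90)
gaussian_factor = 2 * z90  # approximately 2.563

def p90_p10(xs):
    """P90-P10 differential computed from decile cut points."""
    deciles = statistics.quantiles(xs, n=10)  # 9 cut points: P10, ..., P90
    return deciles[8] - deciles[0]

# Simulated log earnings (hypothetical): normal with sigma = 1.1.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.1) for _ in range(200_000)]
gap = p90_p10(sample) - gaussian_factor * statistics.stdev(sample)
# For Gaussian data the gap is close to zero; a sizeable gap in real data
# would indicate that overall dispersion departs from the normal benchmark.
```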

15 $\varepsilon_{it}$ are the residuals of a regression where log earnings are regressed against a full set of age dummies, separately by gender and year.
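Because the regression in footnote 15 uses a full set of age dummies and is run separately by gender and year, its residuals coincide with log earnings minus the mean within each gender-year-age cell. The following Python sketch uses that equivalence to build the residuals and the k-year changes $g^k_{it}$. The record layout and the function names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def residualize(records):
    """Return {(worker, year): residual} by demeaning within gender-year-age cells.

    Each record is a dict with keys: worker, year, gender, age, log_earn.
    """
    cells = defaultdict(list)
    for r in records:
        cells[(r["gender"], r["year"], r["age"])].append(r["log_earn"])
    means = {cell: sum(v) / len(v) for cell, v in cells.items()}
    return {
        (r["worker"], r["year"]):
            r["log_earn"] - means[(r["gender"], r["year"], r["age"])]
        for r in records
    }

def earnings_changes(resid, k=1):
    """g^k_it = eps_{i,t+k} - eps_it for workers observed in both t and t+k."""
    return {
        (w, t): resid[(w, t + k)] - eps
        for (w, t), eps in resid.items()
        if (w, t + k) in resid
    }

# Toy data: two men of the same age observed in two consecutive years.
records = [
    {"worker": "A", "year": 2005, "gender": "M", "age": 25, "log_earn": 1.0},
    {"worker": "B", "year": 2005, "gender": "M", "age": 25, "log_earn": 3.0},
    {"worker": "A", "year": 2006, "gender": "M", "age": 26, "log_earn": 2.0},
    {"worker": "B", "year": 2006, "gender": "M", "age": 26, "log_earn": 2.0},
]
resid = residualize(records)
g1 = earnings_changes(resid, k=1)
```

Setting `k=5` in `earnings_changes` gives the longer-horizon changes that the text associates with more permanent innovations.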

Figure 6: Initial inequality: dispersion of log earnings for workers at age 25

[Panels: (a) Men; (b) Women. Series: P90–P50 and P50–P10. Vertical axis: dispersion of log earnings. Horizontal axis: years 2005–2019.]

Note: Based on authors’ calculations with data from IMSS. Using the CS+TMax sample, this figure plots against time the following measures of top- and bottom-tail dispersion of the log earnings distribution: (a) Men: P90–P50 and P50–P10 for workers at age 25; (b) Women: P90–P50 and P50–P10 for workers at age 25. Shaded areas are recessions.

Figure 7: Life-cycle inequality across cohorts

[Panels: (a) Men; (b) Women. Series: Cohort 2005, Cohort 2008, Cohort 2011, Cohort 2015, with age markers labeled 25 yrs old and 35 yrs old. Vertical axis: P90–P10 of log earnings. Horizontal axis: years 2005–2019.]

Note: Based on authors’ calculations with data from IMSS. Using the CS+TMax sample, this figure plots against time measures of overall dispersion of the log earnings distribution: (a) Men: P90–P10 over the life cycle for the 2005, 2008, 2011, and 2015 cohorts; (b) Women: P90–P10 over the life cycle for the 2005, 2008, 2011, and 2015 cohorts. The grey dashed lines link across cohorts the years corresponding to ages 25, 30, and 35.

Figure 8 illustrates the dispersion in the upper (P90–P50) and lower (P50–P10) tails of the distribution of one-year log earnings changes. For both men and women we see that between 2008 and 2009, that is, between the onset and bottoming out of the depressed economic conditions caused by the financial crisis, there was an increase in dispersion in the lower tail and a decrease in the upper tail of the distribution. The magnitude of these opposing movements suggests that, during the time of the financial crisis, overall dispersion (P90–P10) increased for men and remained relatively stable for women. Outside

Figure 8: Dispersion of one-year log earnings changes

[Panels: (a) Men; (b) Women. Series: P90–P50 and P50–P10. Vertical axis: dispersion of $g^1_{it}$. Horizontal axis: years 2005–2019.]

Note: Based on authors’ calculations with data from IMSS. Using the LS+TMax sample, this figure plots against time the following measures of top- and bottom-tail dispersion of the distribution of one-year earnings changes: (a) Men: P90–P50 and P50–P10 differentials; (b) Women: P90–P50 and P50–P10 differentials. Shaded areas are recessions.

of this recessionary period, we observe only modest movements in P90–P50 and P50–P10. These two measures of dispersion move up and down in opposing directions for men, although the movements are small, so that P90–P10 remains relatively stable. In contrast, from 2011 onward women experience a slight upward trend in both P90–P50 and P50–P10, resulting in a greater overall dispersion in their distribution of one-year earnings changes. This period of increasing dispersion for women coincides with a post-labor market reform period in which female labor force participation grew in Mexico, likely due, at least in part, to the more flexible conditions encouraged by the reform.16 Thus, it is possible that the greater flexibility in hiring practices induced the entry of women into the labor market, particularly those who may demand more flexible work arrangements.17 In turn, this demand for flexibility may entail more volatile earnings due to, for example, volatility in the number of hours worked.

For the purpose of detecting significant deviations from normality, figure 9 plots measures of skewness and kurtosis for the distribution of one-year earnings changes. Notice that, other than the particularly sharp decrease and rebound of skewness during the period of the financial crisis observed across genders, the evolution of Kelley skewness for men and women is quite different.18 For men, skewness moves back and forth between positive (one-third of the years) and negative (two-thirds of the years) values. In

16 De la Cruz Toledo [2020] finds that the higher preschool enrollment associated with the rollout of a universal preschool policy led to an increase in female labor force participation in Mexico. Growing female labor force participation is also reflected in our administrative data, with the share of women in the IMSS master sample increasing from 36.4 to 38.7% between 2011 and 2019.

17 Prevailing social norms in Mexico imply that women are the members of the household who devote more time to home production and to caring for other members such as children and the elderly.

18 Recall that the Kelley skewness measures the relative fraction of the total dispersion (P90–P10) represented by the upper and lower tails. A negative coefficient implies that the lower tail (P50–P10) is longer than the upper tail (P90–P50). In turn, the negative skewness of the distribution entails that the mass of the distribution is concentrated to the right of the mean.

Figure 9: Skewness and kurtosis of one-year log earnings changes

[Panels: (a) Kelley skewness, vertical axis: skewness of $g^1_{it}$; (b) Excess Crow-Siddiqui kurtosis, vertical axis: excess kurtosis of $g^1_{it}$. Series: Women and Men. Horizontal axis: years 2005–2019.]

Note: Based on authors’ calculations with data from IMSS. Using the LS+TMax sample, this figure plots against time the following higher order moments of the distribution of one-year earnings changes: (a) Men and Women: Kelley skewness calculated as $\frac{(P90 - P50) - (P50 - P10)}{P90 - P10}$; (b) Men and Women: Excess Crow-Siddiqui kurtosis calculated as $\frac{P97.5 - P2.5}{P75 - P25} - 2.91$, where the first term is the Crow-Siddiqui measure of kurtosis and 2.91 corresponds to the value of this measure for the Normal distribution. Shaded areas are recessions.

particular, it is positive at the beginning of the sample and turns negative at the end. In contrast, it is almost always positive for women. From 2009 onward, skewness remains fairly stable at levels similar to those observed before the financial crisis. Regarding kurtosis, we see that, qualitatively, its evolution is fairly similar for both men and women, although its level is significantly higher for women.19 Kurtosis had been trending slightly upward before the financial crisis, but increased significantly during this recession, particularly so for women. From 2011 onward, nonetheless, we see both a strong downward trend and a significant reduction in the gap in its level between men and women. Despite this significant downward trend in the kurtosis of one-year earnings changes, by the end of our period of study its level remains quite high.20

It is also of interest to understand how the properties of the distribution of one-year earnings changes may vary along the permanent income distribution21 and along the life-cycle of workers. To this end, figure 10 presents dispersion, skewness, and kurtosis conditional on age (for age groups 25 to 34, 35 to 44, and 45 to 55) and conditional on the percentile of a worker’s permanent income. Several relevant patterns are qualitatively similar across genders, but there are also important differences. Specifically, dispersion is monotonically decreasing with both age and permanent income. Skewness seems to be increasing with age. This is evident in the case of women, and for men at the bottom and top of the permanent income distribution; between P30 and P70 skewness is fairly similar across age groups. Along the distribution of permanent income, skewness decreases monotonically up to approximately the 55th percentile and then remains relatively stable, with a slight upward trend from P85 in the case of men. Men also display skewness that is positive for the lower part of the permanent income distribution and negative for the upper percentiles, and this applies to all age groups. In contrast, for women aged 35 and older, skewness is positive along the entire distribution of permanent income. For young women, as was the case with young men, skewness is positive at the bottom of the permanent income distribution but becomes negative from around the 40th percentile onward. Finally, kurtosis is monotonically increasing with age for men. The same is true for women as well, but only up to about the 40th percentile of the permanent income distribution. For the youngest workers, kurtosis increases sharply up to the 15th percentile and then continues to increase, but very gradually, up to roughly the 85th percentile, after which it trends slightly downward. In contrast, for the oldest workers kurtosis increases up to the 25th and 8th percentile for men and women, respectively, and then trends downward. Men experience this downward trend gradually, while for women the decrease is sharp between P8 and P10 and then kurtosis starts a gradual downward trend along the rest of the permanent income distribution.

19 Note that, intuitively, kurtosis looks at how much of the weight of the distribution is sitting in the tails as opposed to the middle. For example, for symmetric unimodal distributions positive kurtosis indicates heavy tails and peakedness relative to the Normal distribution. But, since kurtosis represents a “movement of mass” that does not affect the variance and since this “movement of mass” can be formalized in more than one way, the meaning of kurtosis can be somewhat vague. For symmetric distributions, on the other hand, positive kurtosis indicates an excess in either the tails, the center, or both. DeCarlo [1997] shows that kurtosis primarily reflects the tails of the distribution, with the center having a smaller influence.

20 The levels of excess Crow-Siddiqui kurtosis of one-year earnings changes in the Mexican administrative data are not out of line with what Guvenen et al. [2021] report for the United States, particularly those that prevail in the latter part of our sample period. While our results suggest an average kurtosis for one-year earnings changes that is higher than the one for the United States, Guvenen et al. [2021] also find high levels of excess Crow-Siddiqui kurtosis of up to 11 for workers aged 45 to 54 in the 60th percentile of the distribution of recent earnings.

21 Permanent income is defined as $P_{it-1} = \frac{\sum_{s=t-3}^{t-1} y_{is}}{3}$. This measure takes average raw earnings $y_{is}$ (including zeros or earnings below the threshold for meaningful attachment to the labor force) over the previous three years. It is constructed only for those individuals who have at least two years of earnings above the threshold.

2.5 Income Mobility

We now turn to the analysis of mobility over time in the distribution of earnings. Figure 11 shows long-term mobility for workers aged 25–34 and workers aged 35–44, tracking their movements along the distribution of permanent income over a 10-year time horizon. We observe upward mobility at the bottom and downward mobility at the top of the permanent income distribution, except at the very top. These patterns of mobility are qualitatively similar for men and women. For both age groups under consideration, there is upward mobility up to about P45 and P35 for men and women, respectively, with upward mobility being slightly higher for younger workers. In contrast, individuals located in the top percentiles of the permanent income distribution experience downward mobility that is also higher for younger workers. At the very top of the permanent income distribution (the top 0.1%), there is essentially no income mobility.

Looking at long-term mobility over time, figure 12 shows the evolution of 10-year mobility for two different starting years, 2007 and 2009. The patterns are again very similar across men and women, but it is worth noting that 10-year mobility within the distribution of permanent income is essentially unchanged between 2007 and 2009, despite 2007 being a pre-crisis year and 2009 coinciding with the large macroeconomic shock associated with the financial crisis (see figure 1).
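The permanent income measure defined in footnote 21 underlies the mobility analysis. A minimal Python sketch of that definition follows, assuming that years with no record count as zero earnings and that the labor-force attachment threshold is passed in as a parameter. The threshold value, the data layout, and the function name are assumptions made for illustration.

```python
from typing import Dict, Optional

def permanent_income(earnings: Dict[int, float], t: int,
                     threshold: float) -> Optional[float]:
    """P_{i,t-1}: average raw earnings over years t-3, t-2, and t-1.

    `earnings` maps year -> raw annual earnings for one worker; missing
    years are treated as zeros (an assumption). Returns None unless at
    least two of the three years have earnings above `threshold`.
    """
    window = [earnings.get(s, 0.0) for s in range(t - 3, t)]
    if sum(y > threshold for y in window) < 2:
        return None
    return sum(window) / 3

# Hypothetical worker with one year outside formal employment.
worker = {2005: 90.0, 2006: 0.0, 2007: 120.0}
p_2007 = permanent_income(worker, t=2008, threshold=50.0)  # (90 + 0 + 120) / 3
```

Keeping the zero in the average, rather than dropping it, is what lets spells outside formal employment pull the measure down.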

Figure 10: Dispersion, skewness and kurtosis of one-year log earnings changes conditioning on permanent income for workers of different ages

[Panels: (a) Men and (b) Women, P90–P10 differential of $g_{it}$; (c) Men and (d) Women, Kelley skewness of $g_{it}$; (e) Men and (f) Women, excess Crow-Siddiqui kurtosis of $g_{it}$. Series: age groups [25-34], [35-44], [45-55]. Horizontal axis: quantiles of permanent income $P_{it-1}$.]

Note: Based on authors’ calculations with data from IMSS. Using the H+TMax sample, this figure plots against percentiles of the permanent income distribution, and for three different age groups, the following moments of the distribution of one-year earnings changes: (a) and (b) Men and Women: P90–P10 differential; (c) and (d) Men and Women: Kelley skewness; (e) and (f) Men and Women: Excess Crow-Siddiqui kurtosis. Since the data are top coded, the percentiles of the permanent income distribution are plotted only until P95.
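The quantile-based moments reported in figures 9 and 10 follow the formulas given in the note to figure 9. The Python sketch below computes Kelley skewness and excess Crow-Siddiqui kurtosis from a list of earnings changes. The linear-interpolation percentile rule and the function names are assumptions, since the paper does not state which percentile convention it uses.

```python
from statistics import NormalDist

def percentile(sorted_xs, p):
    """Percentile p in [0, 100] by linear interpolation between order statistics."""
    pos = (len(sorted_xs) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])

def kelley_skewness(xs):
    """((P90 - P50) - (P50 - P10)) / (P90 - P10)."""
    s = sorted(xs)
    p10, p50, p90 = (percentile(s, p) for p in (10, 50, 90))
    return ((p90 - p50) - (p50 - p10)) / (p90 - p10)

def excess_crow_siddiqui(xs):
    """(P97.5 - P2.5) / (P75 - P25) - 2.91, where 2.91 is the Normal value."""
    s = sorted(xs)
    p2_5, p25, p75, p97_5 = (percentile(s, p) for p in (2.5, 25, 75, 97.5))
    return (p97_5 - p2_5) / (p75 - p25) - 2.91

# Sanity check on an evenly spaced grid of standard normal quantiles:
# both measures should be close to zero for Gaussian data.
grid = [NormalDist().inv_cdf((i + 0.5) / 20001) for i in range(20001)]
normal_skew = kelley_skewness(grid)
normal_excess = excess_crow_siddiqui(grid)
```

Because both measures depend only on percentiles, they are insensitive to a handful of extreme observations, which is a common reason for preferring them to moment-based skewness and kurtosis.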

Figure 11: Evolution of 10-year mobility over the life cycle

[Panels: (a) Men; (b) Women. Series: age groups [25-34] and [35-44], with the top 0.1% of $P_{it-1}$ marked separately. Vertical axis: mean percentiles of $P_{it+10}$. Horizontal axis: percentiles of permanent income $P_{it}$.]

Note: Based on authors’ calculations with data from IMSS. The figure shows average rank-rank long-term (10-year) mobility for male (a) and female (b) workers of different ages.
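As an illustration of the average rank-rank measure described in the note, the Python sketch below assigns each worker a percentile rank in the permanent income distribution at the start and at the end of the horizon, and then averages the end rank within each starting percentile. The tie-breaking rule, the 1 to 100 rank scale, and the function names are assumptions. The paper's exact ranking conventions, including the separate treatment of the top 0.1%, are not reproduced here.

```python
import math
from collections import defaultdict

def percentile_ranks(values):
    """Percentile rank in 1..100 for each observation (ties broken by order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for pos, i in enumerate(order):
        ranks[i] = math.ceil(100 * (pos + 1) / n)
    return ranks

def rank_rank_mobility(p_start, p_end):
    """Mean end-of-horizon percentile for each starting percentile.

    p_start[i] and p_end[i] are permanent income of worker i at t and t+10.
    """
    start_ranks = percentile_ranks(p_start)
    end_ranks = percentile_ranks(p_end)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for s, e in zip(start_ranks, end_ranks):
        totals[s] += e
        counts[s] += 1
    return {s: totals[s] / counts[s] for s in sorted(totals)}

# Toy example: 200 workers whose ordering is fully preserved over time.
start = [float(i) for i in range(200)]
no_mobility = rank_rank_mobility(start, start)
```

In a plot of this mapping, points above the 45-degree line at low starting percentiles correspond to upward mobility and points below it at high percentiles to downward mobility.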

Figure 12: Evolution of 10-year mobility over time

[Panels: (a) Men; (b) Women. Series: starting years 2007 and 2009, with the top 0.1% of $P_{it-1}$ marked separately. Vertical axis: mean percentiles of $P_{it+10}$. Horizontal axis: percentiles of permanent income $P_{it}$.]

Note: Based on authors’ calculations with data from IMSS. The figure shows average rank-rank long-term (10-year) mobility for male (a) and female (b) workers in selected years of the sample, 2007 and 2009.

2.6 Summary and Discussion of the Main Findings of Part I

Following a non-parametric approach, and based on millions of social security records for the period 2005–2019, we establish the following facts regarding the distribution and dynamics of (real) log earnings of Mexican formal sector workers.22 Inequality, measured as the dispersion of earnings, has been decreasing since 2016. This trend can be associated with the relative stability of the (real) earnings of the upper percentiles of the earnings distribution, and the strong increase in the earnings of the lowest percentiles due to revisions of the minimum wage that have exceeded the inflation rate. We also find that the financial crisis was a period of increased dispersion in earnings, particularly so for men, driven by an increase in the dispersion of the lower tail of the distribution (P50–P10). Across cohorts of workers, inequality along the life-cycle seems to increase sharply starting at age 25 and eventually to decrease gradually from age 35 onward. Given the relatively short window covered by our social security records, it is, however, unclear to what extent the eventual decrease in dispersion documented in figure 7 can be attributed to features of a worker’s life-cycle as opposed to the overall decrease in dispersion that occurs in the latter part of our sample period.

Within the distribution of earnings, we find that there is upward mobility for the lower tail of the distribution of permanent income, particularly so for the lowest percentiles, and downward mobility for the upper percentiles, except at the very top (i.e. top 0.1%). Both upward and downward mobility are greater for the youngest group of workers under consideration (25 to 34 years old). Thus, young and lower-earnings workers are, on average, more likely to move upward in the earnings distribution, while young and higher-earnings workers are, on average, more likely to move downward in the distribution. These mobility patterns appear to be relatively stable through time.

Finally, regarding the distribution of (one-year residualized) log earnings changes, we find strong deviations from normality, with the extent of these deviations varying with a worker’s income, along his/her life-cycle, and to a lesser extent across genders. We document that this distribution is asymmetric and displays very high kurtosis. While the peakedness and tailedness of this distribution are present for workers of all ages and income levels, we find that skewness can be either positive or negative. For example, positive skewness is greatest for older and lower-income women, while negative skewness is greatest for younger and higher-income men. This contrasts with the findings of Guvenen et al. [2021], who find that one-year earnings changes are negatively skewed for all workers in the USA. In Mexico, we find that, for example, this distribution is almost always positively skewed for women, with the exception of the 2008–2009 financial crisis. The positive skewness of one-year earnings changes for women implies that their earnings have more room to move up and less room to fall. The same can be said for men of all ages, but only conditioning on them being in the lower part of the permanent income distribution, while for women this is true regardless of their permanent income. Notably, skewness became very negative, relative to the entire period under analysis, during the financial crisis, meaning that this recessionary episode, albeit brief, temporarily increased the chances of experiencing larger and more frequent negative shocks for all Mexican workers. Regarding kurtosis, we find that the distribution is leptokurtic, meaning that most workers experience transitory earnings changes of a small magnitude, but that a non-negligible mass experience extreme transitory shocks to earnings. We also find that kurtosis has been decreasing since the 2008–2009 period, but remains at overall high levels, and that this downward trend has been more pronounced for women than for men. Older and lower-income workers face the most peaked distribution of transitory income shocks, while the opposite is true for young and higher-income workers. Transitory shocks are the most volatile for younger and lower-income workers, but their volatility decreases with age and along the distribution of permanent income.

22 In appendix C we present the results of a comparison of statistics based on the administrative (IMSS) data and statistics based on household surveys. The Mexican household survey (ENOE for its acronym in Spanish) is the only source of information regarding informal workers in the Mexican economy. Based on comparable samples of formal sector workers, we find that IMSS- and ENOE-based statistics can differ substantially. For completeness, we also present a comparison of relevant statistics between formal and informal workers based on data from the household survey. The findings of this validation exercise indicate that a comparison of the IMSS-based statistics of section 2 and ENOE-based statistics for informal workers may provide quite limited information.

3 Part II. Income Dynamics and the Role of the Informal Sector

An important limitation of the results presented in section 2 for understanding inequality and the dynamicsof income in the Mexican labor market, is that the social security data from IMSS only cover workers inthe formal sector. In contrast to developed countries, a notorious characteristic of developing and emergingeconomies is the prevalence of informality. In these countries, the informal sector is responsible for alarge share of all economic activity and commands a significant proportion of the economy’s productiveresources. During the period considered in our analysis, 2005–2019, the quarterly rate of informality inMexico ranged between 56 and 60% at the aggregate level (see figure 2), with this rate being somewhatlarger for women than for men. That is, more than half of Mexico’s labor market is informal. Furthermore,Levy [2018] shows that informal firms in Mexico, often operating under constraints that imply lowerproductivity than formal firms, represent 90% of all firms and absorb 40% of the economy’s capital stock.23These firms are present throughout the economy and are not confined to activities deemed “traditional”or “less modern” implying that the informal economy fully coexists alongside the formal economy.24Informality is an established structural trait in developing countries, with Mexico being no exception,and its effects and implications permeate the economy beyond the labor market. 
It has typically been associated with financial underdevelopment and the excessive burden of taxes and regulation, which entail significant losses in terms of aggregate productivity, investment and output.25 The reasons behind its existence and size are nonetheless complex, with firms and workers often choosing informality optimally given the incentives that they face.26 The presence of the informal labor market can also provide a flexible margin of adjustment when the economy is beset by shocks, even if it often comes at the expense of lower productivity.27 This margin of adjustment is particularly important in emerging economies in which social safety nets may be weak or absent altogether. The lack of a social safety net in Mexico implies that the unemployment rate is low and consists mainly of frictional unemployment and the fraction of cyclical unemployment that is not absorbed by the informal sector.

23 Informal firms are defined as firms whose workforce is composed entirely of non-salaried workers, for which the firm faces no obligations related to social security nor is constrained by dismissal regulations, or firms that hire salaried workers but evade all legal obligations associated with employing these workers.
24 See La Porta and Shleifer [2014] and Ulyssea [2020] for an overview of the role of informality in the process of economic growth and development.
25 Levy [2018] argues that one of the main reasons behind Mexico’s poor performance in terms of growth in aggregate productivity and per capita GDP can be traced to a misallocation of resources, with too many resources being allocated to informal sector firms, most of which do not ever grow or improve.

Due to the pervasiveness of informality in the Mexican economy and the fact that the relative size of the informal sector affects the overall functioning of the labor market and the opportunities it can offer to workers in terms of job security, social mobility, and lifetime earnings, we dedicate this second part of our analysis to studying the interplay between formal and informal employment. In particular, we study the impact of transitions out of and back into formal employment on wages earned in the formal sector and the impact of early exposure to informality on future earnings. To this end, we perform two empirical exercises. In the first exercise we study the wage dynamics of workers who exit and subsequently re-enter the IMSS database. Even though we cannot track workers upon exit from the social security data, the very low levels of unemployment and absence of a social safety net in Mexico imply that the majority of these workers must be transitioning in and out of formal employment, alternating with job spells in the informal sector. This is especially likely to be the case for men, whose participation rate is above 90% for the age group considered in the analysis.
In the second exercise, we quantify the impact of aggregate economic conditions upon entry into the labor market on future earnings, focusing on the impact that having the first job in the informal sector has on earnings.

Our main results can be summarized as follows. First, workers who exited the formal sector can take, on average, three or more years upon re-entry to achieve yearly wages that are comparable to those earned before exit. This suggests that most workers do not transition out of formal employment because of better opportunities elsewhere, otherwise they would be unlikely to re-enter at lower wages. The earnings recovery process appears to be faster for women than for men. Second, for cohorts of new workers entering the labor market, the dimension of the state of the economy that matters the most for the trajectory of future earnings is the rate of informality, rather than the rate of unemployment. Finally, there is a negative and significant impact on future earnings when a worker’s first job happens to be in the informal sector. These findings lend support to the view that the dual (formal/informal) nature of the Mexican labor market is crucial for understanding the income dynamics of workers. In fact, our evidence indicates that spells out of formal employment are costly for workers because they negatively impact lifetime earnings, as they are associated with wage penalties upon re-entry that may last for several years before wages regain pre-exit levels. In addition, early exposure to informality can have long-lasting effects on the labor market outcomes, including earnings, of Mexican workers, and this is especially relevant since informality is so prevalent and proves to be the most relevant aggregate economic condition for new cohorts of workers.28

The rest of this section is organized as follows. Section 3.1 analyzes the impact of transitioning in and out of IMSS-affiliated employment on the wages earned within the formal sector. In section 3.2 we document the long-term impact of initial aggregate labor market conditions, such as the rates of unemployment and informality, on various labor market outcomes across heterogeneous workers.

26 See Günther and Launov [2012], who develop an econometric model to formalize this hypothesis and test its empirical relevance in the urban labor market in Côte d’Ivoire. Alcaraz, Chiquiar, and Salcedo [2015] estimate that among Mexican male urban workers aged 23 to 60, for which non-participation in the labor market is very low, only 10 to 20% of informal workers are involuntarily informal (i.e., they would prefer a job in the formal sector if they could find one). This result provides some evidence regarding the existence of barriers to entry into formal employment, but also suggests that an important proportion of workers may self-select into informal employment.
27 Fiess, Fugazza, and Maloney [2010] show that the relative size of the informal sector can display both counter-cyclical and pro-cyclical behavior, consistent with both the traditional view of the informal sector as a shock absorber (counter-cyclical) and with sectoral expansion driven by relative demand or productivity shocks to those sectors that more intensively use informal labor (pro-cyclical). Thus, the pro- or counter-cyclicality may depend on the origin of shocks and the presence of binding wage rigidities that limit the extent to which wages in the formal sector can adjust in response to shocks.

3.1 Transitions In and Out of Formal Employment and their Impact on Wage Dynamics

A peculiar feature of the formal labor market in Mexico is that there is continuous entry and exit of workers into and out of IMSS-affiliated jobs. For example, during the sample period covered by our database, only 8.8% of workers maintained an IMSS-affiliated job throughout.29 Additionally, about one fourth of workers have two or more spells of formal employment.30 Thus, in general, most Mexican workers do not seem to have a very strong attachment to formal employment. This tenuous bond can potentially have an important impact on the lifetime earnings and welfare of workers, given that informal employment typically implies lower average wages and neither entails social security benefits nor grants any of the employment protection associated with formal jobs.

The movements in and out of the formal labor market of IMSS workers can be associated with transitions between formal and non-formal employment, with the latter being employment in either the formal public sector or the informal sector, unemployment, or exit from the labor force. It is important to emphasize that, once workers leave the IMSS database, we are unable to track them and thus cannot ascertain the state into which they transition (nor the state from which they transition back into formal employment). Given that we are focusing on prime age workers, it is most likely that the bulk of these transitions occur between formal and informal employment, as the informal labor market is one of the most important mechanisms for Mexican workers to smooth income shocks.
As this entry and exit can affect the dynamics of income, we now turn the focus of our analysis to verifying whether spells out of formal employment imply a penalty on earnings upon re-entry and, if so, how large this penalty is and how long it takes for workers to recoup their previous wages.

We restrict our sample to the group of workers with two spells of formal employment.31 Almost one fifth of all workers who have been formally employed during the sample period have only two spells of formal employment (see appendix A), which implies that they have exited and re-entered after a break that lasted at least one year and hence have only one gap in formal employment. On average, these workers are present in the database 3.2 years before leaving, stay out of formal employment 2.6 years, and then come back for another 3.6 years. The distinguishing features of this subsample are: (i) the proportion of males is slightly larger than for the whole sample (61.2% vs 60.4%); (ii) their average age at the beginning of their first spell of formal employment is slightly lower (31.7 years old versus 32.6 years old); and (iii) their average age at the conclusion of their second spell of formal employment is 40.4 years old, substantially lower than the retirement age.

To analyze the potential penalty on earnings implied by this mechanism of entry, exit and subsequent re-entry into formal employment, we compare pre-exit wage trajectories with the post re-entry trajectories. In particular, we adopt an event-study approach around the beginning and the end of the gap in formal employment. In our baseline specification we examine a 3-year event window before and after each worker exits the database. We balance the panel by keeping only workers that can be observed for three consecutive years before leaving the dataset and for three consecutive years after re-entry. The 3-year event window is chosen as the benchmark as it maximizes sample size while mimicking the average duration of these workers’ active spells.32

The comparison is performed by estimating the following specification, separately by gender:

$$\ln(w_{it}) = \beta_0 + \sum_{\tau=-2}^{3} \beta_\tau I_\tau + \sum_{k=1}^{5} \beta_k I_k + \sum_{\tau=-2}^{3} \sum_{k=1}^{5} \beta_\tau^k I_\tau I_k + \gamma_g I_g + \alpha_e + \alpha_s + \alpha_t + \varepsilon_{it} \tag{3.1}$$

Here $\ln(w_{it})$ is the logarithm of the average monthly wage of worker $i$ in year $t$. $I_\tau = 1[\text{event} = \tau]$ are dummy variables for the number of years $\tau$ before or after leaving the sample: $\tau = 0$ is the year right before leaving the sample (i.e. the last year of a worker’s first job spell in the formal sector), $\tau = -1$ refers to two years before leaving the sample, $\tau = 1$ is the first year upon re-entering (i.e. the first year of a worker’s second job spell in the formal sector), and so on. $I_k = 1[\text{duration} = k]$ is a set of dummy variables that take the value of 1 depending on the number of years $k$ during which the worker was out of formal employment. $I_g = 1[\text{age}_{it} = g]$ are dummy variables for the age group to which worker $i$ belongs in year $t$. $\alpha_e$, $\alpha_s$, and $\alpha_t$ are fixed effects for sector of economic activity $e$, state $s$, and year $t$, respectively, and $\varepsilon_{it}$ is the error term.

28 This result echoes the findings of Aguilar-Argaez, Alcaraz, Ramírez, and Rodríguez-Pérez [2020], who show that slack measures that do not consider the informal labor market may provide an inadequate measure of the true amount of slack in the economy.
29 We refer to IMSS-affiliated jobs as formal employment as they comprise the vast majority of jobs in the formal sector.
30 See appendix A for descriptive statistics regarding active employment spells. A spell is defined as a sequence of contiguous years in which we observe a worker’s income within our data.
31 The descriptive statistics and estimates relative to this part of the analysis are based on a random sample of 4 million workers obtained from the universe of workers that were present at least one year in the master sample used to carry out the analysis presented in section 2.
32 As we widen the analysis window, we lose observations since it is difficult to find many workers that are present in the data many years before leaving as well as after re-entering. As a robustness check, we also analyze a 5-year event window to assess whether wage trajectories change as the event window widens. Inevitably, the 3-year event panel is comprised of a different set of workers than those included in the 5-year event window. Results hold when using the specification with the wider event window.

The coefficients of interest are the $\beta_\tau$, which capture how wages differ when compared to the base year, $\tau = 0$, the year right before leaving the sample. A graphical representation of the estimates of these


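Figure 13: Estimates of wage trajectories (log differences) of workers who exit and re-enter the formal labor market

[Figure: estimated differences in log wages (vertical axis, ticks from -.15 to .05) by event year t = -2 to t = 3 (horizontal axis; t = 0 is the year before leaving the sample), shown separately for men and women.]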

Note: Based on authors’ estimates with data from IMSS. The figure plots differences of log wages obtained by estimating equation (3.1) using a subsample of workers with only two spells of formal employment.

coefficients is shown in figure 13.33 On average, wages decrease one year before leaving the sample, both for male and female workers. Men appear to be experiencing a downward trend two years before leaving, while women seem to have more stable wages in the years before exit. Upon re-entry, our results indicate that, in the first year of their second formal job spell, workers suffer a wage penalty of around 15% compared with the wage they earned in the year right before concluding their first spell of formal employment. This holds for men and women, although men seem to fare worse than women. Wages start to recover in the second year after re-entry, but men are still not able to fully regain their pre-exit wages in the third year after re-entering, whereas women are able to recoup them almost entirely.

The fact that wages are lower upon re-entry suggests that the wage penalty is likely associated with most of these workers landing lower-paying jobs, possibly in the informal sector, after leaving the sample. As a consequence, these workers may suffer a negative income shock that the analysis in section 2 is not able to capture. On the other hand, regaining formal employment may also entail a positive income shock. Note, however, that since our results suggest that the unobserved positive (re-entry) income shock is smaller than the initial negative (exit) one, some of the measures of income dynamics discussed in section 2 may be biased in a specific way.

We also characterize the behavior of wages, in levels, during the event window under analysis by defining $\beta(\tau, k) = \beta_0 + \beta_\tau + \beta_k + \beta_\tau^k$, which captures the level of the dependent variable for each pair of event $\tau$ and duration $k$. Coefficients $\beta(\tau, k)$ are plotted in figure 14. In addition to the patterns that we already discussed in relation to figure 13, we now see that, on average, women have higher wages than men and wages for both men and women tend to decrease with the number of years they remained out of the formal sector between their first and second spell of formal employment. Panel (b) of figure 14 shows more clearly that in the third year upon re-entry, women whose stint out of formal employment was relatively brief are able to fully catch up with the wage they earned in the year right before exiting, while men, even those with the shortest duration between spells of formal employment, are still lagging behind.

33 The complete estimation output for this regression is provided in appendix D.

Figure 14: Estimates of wage trajectories (levels) of workers who exit and re-enter the formal labor market

[Figure: two panels, (a) Men and (b) Women. Each panel plots estimated log wage levels (vertical axis, ticks from 7.8 to 8.3) by event year t = -2 to t = 3 (horizontal axis; t = 0 is the year before leaving the sample).]

Note: Based on authors’ estimates with data from IMSS. The figure plots the evolution of wage levels obtained by estimating the following equation: $\ln(w_{it}) = \sum_{\tau=-2}^{3} \sum_{k=1}^{5} \beta_\tau^k I_\tau I_k + \gamma_g I_g + \alpha_e + \alpha_s + \alpha_t + \varepsilon_{it}$ using the subsample of workers with only two spells of formal employment. The units for these estimates should be understood in relation to the logarithm of wages. To facilitate the computation of standard errors and confidence intervals, the estimates reported here reflect the estimated coefficients from the equation mentioned above, but the point estimate for $\beta_\tau^k$ in that equation is equivalent to $\beta(\tau, k) = \beta_0 + \beta_\tau + \beta_k + \beta_\tau^k$, which can be calculated from the estimates of equation (3.1).

As a robustness check, we assess whether our results could be driven by the pre-exit and post re-entry wage trajectories of workers whose first spell of formal employment came to an end due to the 2009 financial crisis. We address this concern by estimating the following alternative specification:

$$\ln(w_{it}) = \sum_{\tau=-2}^{3} \beta_\tau^c I_\tau I_{crisis} + \gamma_g I_g + \alpha_e + \alpha_s + \alpha_t + \varepsilon_{it} \tag{3.2}$$

In this case, $I_{crisis}$ is an indicator variable that equals 1 if the last year we observe the worker in the database before exit is 2008 or 2009. The coefficients $\beta_\tau^c$ capture the average wage in every year of the event window for workers that exited during the financial crisis as compared to the average wages of those who left in any other year, and their estimates are shown in figure 15. For both genders, our results suggest that two years before exit occurred, workers who left during the financial crisis had, on average, slightly higher wages than those who left in any other year. In contrast, the average wages of these workers are slightly lower upon returning. Note that the wage patterns documented with our benchmark specification still hold.34

34 The estimation output for regression (3.2) is reported in appendix D.
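As an illustration of how the indicators entering equations (3.1) and (3.2) could be built from a worker’s record, the following minimal sketch derives the event time τ, the gap duration k, and the crisis indicator from the list of calendar years in which a worker is observed. It is not the authors’ code: the names `EventInfo`, `event_info`, and `in_balanced_window` are hypothetical, and the sketch assumes that a spell is a run of contiguous observed years, as in the definition given in the footnotes.

```python
from typing import Dict, List, NamedTuple, Optional


class EventInfo(NamedTuple):
    """Hypothetical container for the event-study indicators of one worker."""
    last_pre_exit_year: int       # calendar year with tau = 0
    gap_years: int                # duration k of the spell out of the data
    event_time: Dict[int, int]    # calendar year -> event time tau
    crisis_exit: bool             # True if tau = 0 falls in 2008 or 2009


def event_info(years_observed: List[int]) -> Optional[EventInfo]:
    """Return event-study indicators for a worker with exactly two spells.

    Workers with one spell, or with more than two spells, return None.
    """
    ys = sorted(set(years_observed))
    # Positions at which the sequence of observed years is interrupted.
    gaps = [i for i in range(1, len(ys)) if ys[i] - ys[i - 1] > 1]
    if len(gaps) != 1:
        return None
    last_pre = ys[gaps[0] - 1]    # last year of the first spell (tau = 0)
    first_post = ys[gaps[0]]      # first year of the second spell (tau = 1)
    k = first_post - last_pre - 1
    tau = {
        y: (y - last_pre) if y <= last_pre else (y - first_post + 1)
        for y in ys
    }
    return EventInfo(last_pre, k, tau, last_pre in (2008, 2009))


def in_balanced_window(info: EventInfo, lo: int = -2, hi: int = 3) -> bool:
    """Check that every event time from lo to hi is observed for the worker."""
    observed = set(info.event_time.values())
    return all(t in observed for t in range(lo, hi + 1))


if __name__ == "__main__":
    info = event_info([2005, 2006, 2007, 2008, 2011, 2012, 2013])
    print(info, in_balanced_window(info))
```

In this sketch a worker observed in 2005–2008 and again in 2011–2013 has τ = 0 in 2008, τ = 1 in 2011, a gap of k = 2 years, and counts as a crisis exit under the rule stated for equation (3.2).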


Figure 15: Estimates of wage trajectories of workers who left the formal labor market during the 2009 financial crisis

[Figure: two panels, (a) Men and (b) Women. Each panel plots estimates (vertical axis, ticks from 7.9 to 8.3) for each event year t = -2 to t = 3, separately for 2009 Crisis = 0 and 2009 Crisis = 1 (horizontal axis; t = 0 is the year before leaving the sample).]

Note: Based on authors’ estimates with data from IMSS. The figure plots differences of log wages obtained by estimating equation (3.2) using a subsample of workers with only two spells of formal employment. The units for these estimates should be understood in relation to the logarithm of wages.

As an additional robustness exercise, we corroborate the findings from our baseline approach by comparing the wage trajectories of workers with two spells of formal employment with the trajectories of the group of workers who stayed in the database the entire period, in a fashion very similar to a difference-in-differences event study design. This design is based on assigning placebo exits and durations of spells out of the formal labor market to individuals who remained formally employed throughout the whole period. Even though we do not intend to interpret these results as causal, they serve as a useful tool for assessing whether the features that we find in the group of workers who leave and re-enter the formal labor market are peculiar to this group.

We construct a control group comprised of workers who are present in the IMSS data the whole 15 years for which information is available and compare their wage dynamics with those of workers who leave the formal labor market only once and then come back, who constitute our treatment group. We use the Coarsened Exact Matching (CEM) method described in Iacus, King, and Porro [2012] to obtain a balanced sample of the treatment and control groups. Through this methodology we find, for 66.3% of the workers who could potentially be in the treatment group, an exact match in terms of age, gender, sector of economic activity and locality (state), all observed in the last year before exiting formal employment.35 Consistent with this methodology, the exact match is chosen randomly among potential candidates, so that we can construct placebo exit events for the individuals in the control group.
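The matching step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors’ implementation nor the interface of any CEM software package: workers are plain dictionaries with hypothetical keys (`id`, `age`, `gender`, `sector`, `state`, `exit_year`, `gap`), age is coarsened into 5-year bins by integer division (the bin edges are an assumption), control characteristics are taken as already measured in the relevant year, and controls are drawn with replacement.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def stratum(worker: Dict, age_bin_width: int = 5) -> Tuple:
    """Coarsened matching cell: 5-year age bin, gender, sector, and state."""
    return (
        worker["age"] // age_bin_width,
        worker["gender"],
        worker["sector"],
        worker["state"],
    )


def match_with_placebo(
    treated: List[Dict], controls: List[Dict], seed: int = 0
) -> List[Dict]:
    """Pair each treated worker with one random control from the same cell.

    The matched control inherits the exit year and the gap duration of the
    treated worker as placebo events. Treated workers without any candidate
    in their cell are left unmatched and dropped.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for c in controls:
        pool[stratum(c)].append(c)

    matches = []
    for t in treated:
        candidates = pool.get(stratum(t))
        if not candidates:
            continue
        c = rng.choice(candidates)  # random pick among exact matches
        matches.append(
            {
                "treated_id": t["id"],
                "control_id": c["id"],
                "placebo_exit_year": t["exit_year"],
                "placebo_gap": t["gap"],
            }
        )
    return matches


if __name__ == "__main__":
    treated = [
        {"id": 1, "age": 33, "gender": "M", "sector": "manufacturing",
         "state": "NL", "exit_year": 2010, "gap": 2},
    ]
    controls = [
        {"id": 10, "age": 31, "gender": "M", "sector": "manufacturing",
         "state": "NL"},
        {"id": 11, "age": 36, "gender": "M", "sector": "manufacturing",
         "state": "NL"},
    ]
    print(match_with_placebo(treated, controls))
```

The share of treated workers for whom `match_with_placebo` returns a pair plays the role of the match rate reported in the text.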
Specifically, we assume that each worker in the control group left the database in the same year as his/her match in the treatment group, and that he/she was out of formal employment for the same number of years as his/her treatment group counterpart. Having defined the placebo events for the control group, we then proceed to estimate the following specification:

$$\ln(w_{it}) = \sum_{\tau=-2}^{3} \beta_\tau^T I_\tau I_{i,treated} + \gamma_g I_g + \alpha_e + \alpha_s + \alpha_t + \varepsilon_{it} \tag{3.3}$$

where $I_{i,treated}$ is a dummy variable that takes the value of 1 if worker $i$ is part of the treatment group and 0 otherwise.

35 Age was coarsened into 5-year age groups.

Figure 16: Estimates of wage trajectories: treatment vs control group

[Figure: two panels, (a) Men and (b) Women. Each panel plots estimates (vertical axis, ticks from 7.7 to 8.6) for each event year t = -2 to t = 3, separately for Treated = 0 and Treated = 1.]

Note: Based on authors’ estimates with data from IMSS. The figure plots differences of log wages obtained by estimating equation (3.3) using a control group of workers randomly selected among those who are always present in the IMSS data and a subsample of workers with only two spells of formal employment as the treatment group. The units for these estimates should be understood in relation to the logarithm of wages.

The estimated coefficients $\beta_\tau^T$ are displayed in figure 16.36 On average, workers in the control group have higher wages in each event year and show a slightly upward trend in their wages, as opposed to the fall observed in the year before exit for workers in the treatment group. These results suggest that the patterns that we estimate in our benchmark specification are not driven by the treatment group merely reflecting a pattern that is also present for workers who do not experience the exit event (i.e. the control group). That is, the wage dynamics for workers who exit and re-enter formal employment can be associated with the fact that these workers spent some time out of the formal sector. For instance, workers who suffer an exit event have lower average pre-exit wages than those who remained continuously attached to a formal job. This indicates that workers at the bottom of the residualized earnings distribution described in section 2.4 are more likely to exit formal employment. In turn, lower income workers face both a temporary income shock upon exiting formal employment and a more persistent effect on future earnings due to the wage penalty upon re-entry. Thus, entry and exit into and out of formal employment is likely to exacerbate residualized earnings inequality.

36 The estimation output is available in appendix D together with the estimates for a specification similar to (3.3) where the time window has been widened to 5 years.

3.2 Long-term Effects of Initial Labor Market Conditions on Employment and Earnings

Initial conditions have often been documented to have long-lasting effects on economic outcomes. For example, Chetty, Friedman, and Rockoff [2014] find that being assigned to high value-added teachers improves students’ long-term outcomes in terms of being more likely to attend college and earn higher salaries. Chetty, Hendren, Lin, Majerovitz, and Scuderi [2016] argue that differences in childhood environments play a significant role in shaping gender gaps in adulthood for employment rates and earnings. Oreopoulos et al. [2012] show that the aggregate labor market conditions faced by workers when first entering the labor market can affect employment and earnings outcomes in the long term. Motivated by these findings, we investigate in this section to what extent aggregate labor market conditions, such as the unemployment rate and the rate of informality, shape the long-term labor market outcomes of Mexican workers. In particular, we ask whether workers face any long-term negative consequences for starting their employment life in an informal job. To the best of our knowledge, we are the first to estimate the long-run effects of unemployment in Mexico and to evaluate the role of early exposure to informality for developing countries.

Given how ubiquitous informal labor arrangements are in Mexico, it is of particular interest to understand to what extent early exposure to informality has an effect on the long-term labor market outcomes of Mexican workers. On the one hand, based on standard views of the informal labor market (see, for example, La Porta and Shleifer [2014]), it could be anticipated that early exposure to informality may have significant negative effects on workers’ long-term outcomes, given that informal employment typically occurs in micro and small enterprises that operate with relatively unproductive technologies and where the prospect of career development is limited.
Thus, the accumulation of firm-specific and/or sector-specific human capital may have relatively low returns in the context of informal employment. On the other hand, in labor markets where opportunities for young and/or inexperienced workers are scarce, the informal sector may serve for many workers as a stepping stone toward more productive employment in the formal labor market (see Tümen [2016]).

3.2.1 Cohort Analysis

Following Oreopoulos et al. [2012], we proceed to study the effect that the aggregate labor market conditions faced by cohorts of workers entering the labor market for the first time may have on several outcomes throughout their employment trajectory. Specifically, we analyze the long-run effects that initial unemployment and informality rates may have on the earnings, unemployment and labor force participation rate of Mexican workers, as well as on the rates of formal vs informal employment. In all cases, we analyze the heterogeneous impact that initial conditions may have on long-term outcomes by providing separate estimates for different groups of workers differentiated by gender and level of educational attainment.

We use household surveys to measure early exposure to unemployment and informality and current labor outcomes. For early exposure to unemployment, we combine two sources of information: the Encuesta Nacional de Empleo (ENE) and the Encuesta Nacional de Ocupación y Empleo (ENOE). The ENE was a household survey run from 1995 to 2004. In 2005, this survey was updated and replaced by the ENOE.37 Both surveys focus on employment outcomes and are representative of the population at the state level within Mexico.

For each worker in the survey, the data do not provide the exact time they entered the labor force when young. Hence, we estimate the year of entry into the labor market as the sum of the year of birth and the years of education plus 5. If the implied age at entry is below 15, the legal working age in Mexico, we assume it to be equal to 15. We split the sample in two groups according to educational attainment: low educated, corresponding to workers with less than 12 years of education, and high educated, corresponding to workers with 12 or more years of schooling. The unit of analysis is then defined by the year a person is supposed to have entered the labor force (cohort $c$) in each birth state (region $r$), and for each educational group. We end up with cells at the cohort-region-education-gender level and provide estimates for four distinct demographic groups: low-educated males $M_l$, low-educated females $F_l$, high-educated males $M_h$, and high-educated females $F_h$.

Our employment outcomes come from ENOE between 2005–2019 (second quarter, to have one data point for each state per year).38 For each cohort, we keep workers with 3 or more years of potential experience,39 since our treatment consists of exposure to informality and unemployment during years 0 through 2 of potential experience. We further restrict the sample to prime-age workers (15–60 years old) and drop employed workers that do not report earnings.
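The entry-year rule and the sample restriction described above can be written out as a short sketch. It is an illustration rather than the authors’ code: the function names are hypothetical, the floor of 15 is applied to the implied age at entry (one reading of the rule stated in the text), and the initial rate is taken as the simple average over the estimated entry year and the two following years.

```python
from typing import Dict

LEGAL_WORKING_AGE = 15  # floor applied to the implied age at entry


def entry_year(birth_year: int, years_education: int) -> int:
    """Estimated year of entry: birth year + years of education + 5,
    with the implied age at entry floored at the legal working age."""
    age_at_entry = max(years_education + 5, LEGAL_WORKING_AGE)
    return birth_year + age_at_entry


def potential_experience(survey_year: int, birth_year: int,
                         years_education: int) -> int:
    """Years elapsed between the estimated entry year and the survey year."""
    return survey_year - entry_year(birth_year, years_education)


def education_group(years_education: int) -> str:
    """Low educated: fewer than 12 years of education; high educated: 12+."""
    return "high" if years_education >= 12 else "low"


def initial_rate(rate_by_year: Dict[int, float], entry: int) -> float:
    """Average of a regional rate over years 0 through 2 of potential
    experience (the entry year and the two following years)."""
    return sum(rate_by_year[entry + j] for j in range(3)) / 3


def keep_observation(survey_year: int, birth_year: int,
                     years_education: int) -> bool:
    """Keep workers with 3 or more years of potential experience."""
    return potential_experience(survey_year, birth_year, years_education) >= 3


if __name__ == "__main__":
    e = entry_year(1980, 12)
    print(e, potential_experience(2010, 1980, 12), education_group(12))
    print(initial_rate({1997: 0.55, 1998: 0.57, 1999: 0.59}, e))
```

The rates passed to `initial_rate` in the usage lines are made-up numbers used only to exercise the function.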
We also remove potential outliers, which could partly reflect measurement error, by trimming the top 1% of labor incomes every year. For the specifications where the dependent variable is a measure of the cohort’s earnings, labor earnings are computed with zeros to avoid conditioning on a potential outcome (labor force participation). Omitting zeros could allow for selection bias to arise since the composition of the labor force may change precisely because of the early exposure to informality and unemployment.

Unemployment and informality rates represent the aggregate labor market conditions faced by new cohorts of workers entering the labor market and are calculated from ENE and ENOE at the regional level for workers aged under 30 years old.40 Initial informality and unemployment rates are computed as the average of the first 3 years after a worker with a given educational level is supposed to have entered the labor force.41 Finally, we keep cohorts for which we have information regarding their initial labor market conditions and observe at least 10 years of potential experience, so our cohorts include workers who entered the labor market between 1995 and 2010.

For each demographic group $d \in \{M_l, F_l, M_h, F_h\}$, we borrow from Oreopoulos et al. [2012] and estimate the following specification:

$$y_{crt}^{d} = \sum_{e=1}^{3} \beta_{eU}^{d}\, UR_{cr0}\, 1[\text{experience}_{crt}^{d} \in e] + \sum_{e=1}^{3} \beta_{eI}^{d}\, IR_{cr0}\, 1[\text{experience}_{crt}^{d} \in e] + \gamma_{e}^{d} + \gamma_{r}^{d} + \gamma_{c}^{d} + \gamma_{t}^{d} + \varepsilon_{crt}^{d} \tag{3.4}$$

where $y_{crt}^{d}$ is a labor market outcome of demographic group $d$, belonging to cohort $c$, in region $r$, at time $t$. $UR_{cr0}$ and $IR_{cr0}$ are the unemployment and informality rates, respectively, at the time of entry into the labor force for cohort $c$ in region $r$. $\text{experience}_{crt}^{d}$ is the number of years of potential experience of a given group, which varies over time, and $e$ are dummy variables grouping potential years of experience (less than 10 years, between 10 and 14, and 15 years or more). $\gamma_{e}^{d}$, $\gamma_{r}^{d}$, $\gamma_{c}^{d}$, and $\gamma_{t}^{d}$ are fixed effects by years of potential experience, region, cohort and year, respectively.42 The coefficients $\beta_{1U}^{d}$ and $\beta_{1I}^{d}$ capture the common impact of initial labor market conditions across workers’ life cycle, while $\{\beta_{2U}^{d}, \beta_{3U}^{d}\}$ and $\{\beta_{2I}^{d}, \beta_{3I}^{d}\}$ capture whether the effects of these initial conditions are amplified or attenuated as workers gain additional potential experience.

Our outcomes of interest are: probability of working in a formal or informal job, being unemployed, or being out of the labor force,43 and (log) average labor earnings.44 The results are reported in tables 2–6 for each labor market outcome and separately by demographic group. The point estimates of coefficients $\beta_{1U}^{d}$ and $\beta_{1I}^{d}$ should be interpreted as measuring the effect of a one percentage point increase in exposure to initial informality or initial unemployment on the fraction of individuals in a cohort exposed to the outcome of interest.

37 ENOE is the primary official source of employment and occupation data in Mexico. It is a quarterly household survey at the national level run by INEGI and designed to collect information on the employment situation of individuals in rural and urban areas. The survey has a rotating panel structure: every quarter, INEGI replaces one-fifth of the sample (i.e. each household is followed for five consecutive quarters before being dropped from the survey). The ENOE is representative at the national and state level, and for selected cities, one for each state. The survey is conducted continuously throughout the year, visiting approximately 120 thousand homes in each quarterly sample.
38 Although employment variables, such as informality and unemployment, are comparable across ENE and ENOE, labor earnings and other labor outcomes are not fully comparable due to methodological changes across these two surveys (see INEGI [2008]).
39 We refer to potential experience because the year of entry in the labor market is estimated. Potential experience simply measures the number of years elapsed between the estimated year of entry and year t.
40 We assume that the labor market conditions faced by new cohorts entering the labor market are better measured by the informality rate and unemployment rate of young workers instead of those of the entire labor force. Results are robust to considering the overall informality and unemployment rate instead.
41 We choose the first three years as opposed to just the first one because we are uncertain about the exact year in which a worker enters the labor force. We keep workers with 3 or more years of potential experience, given that our treatment variables (early exposure to unemployment and informality) are defined for potential year of entry and the following two years.
42 Our benchmark results are robust to the inclusion of a region-by-time fixed effect that controls for any variable affecting all cohorts in region r in a given year, such as local labor market conditions. See appendix D for details.
43 Recall that, since variables are defined at the cohort level, these outcomes correspond to the fraction of the cohort that at time t is found in each of those four states.
44 This variable is constructed as the natural logarithm of the mean total income for each cell of cohort c, demographic group d, region r, and time t.

The estimates in tables 2–6 suggest that the initial unemployment rate in the local labor market where cohort $c$ enters plays essentially no role in shaping long-term outcomes. For high-educated workers, both men and women, initial unemployment has no impact on long-term earnings or employment status, while for low-educated workers it has no impact on employment status but a positive and significant impact on


Table 2: Impact of initial labor market conditions on long-term formal employment

Dependent variable: Fraction of cohort formally employed

                                           Low-educated            High-educated
Independent variables                      Men         Women       Men         Women
Initial informality                        0.369***    0.127*      0.260***    –0.044
                                           (0.112)     (0.068)     (0.090)     (0.094)
Initial informality × Experience (10–15)   –0.061**    –0.034      –0.033      –0.010
                                           (0.025)     (0.022)     (0.032)     (0.036)
Initial informality × Experience (>15)     –0.298***   –0.133***   –0.215***   –0.139***
                                           (0.035)     (0.025)     (0.033)     (0.034)
Initial unemployment                       –0.071      0.265       –0.212      –0.377
                                           (0.325)     (0.255)     (0.306)     (0.321)
Initial unemployment × Experience (10–15)  0.167       –0.166      0.215       0.078
                                           (0.277)     (0.181)     (0.273)     (0.240)
Initial unemployment × Experience (>15)    0.033       –0.099      –0.084      0.137
                                           (0.277)     (0.193)     (0.258)     (0.231)
N. of observations                         5,155       5,123       5,883       5,928

Note: Based on authors’ estimates with data from ENE and ENOE. All specifications include potential experience, region, cohort, and year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Point estimates measure the effect of a one percentage point increase in exposure to initial informality or initial unemployment on the fraction of formally employed individuals in a cohort. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).

earnings for low-educated women. The fact that initial unemployment conditions turn out to be essentially irrelevant for long-term outcomes may seem at odds with the findings of, for example, Oreopoulos et al. [2012] and Schwandt and Von Wachter [2019]. These studies focus on Canada and the United States, respectively, countries where the labor market is characterized by little to no informal employment and by strong labor market institutions that provide unemployment insurance and other forms of government support during spells of unemployment. In contrast, unemployment in Mexico is typically low and less sensitive to the cycle than in more advanced economies, and the lack of a social safety net and of a more robust formal labor market implies that informality is large and an important mechanism through which workers can smooth income shocks, increase the number of hours worked, and find more flexible labor arrangements. Thus, it is unsurprising that we find the rate of unemployment to be a poor signal of the true state of the local labor market for new entering cohorts of workers.

Going over the results by demographic group, we find that for high-educated men the initial rate of informality has a significant impact on long-term formal and informal employment, positive in the first case and negative in the second. Both are level effects that are mitigated by potential experience, but only for workers with more than 15 years of potential experience.
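To make the reading of level and interaction coefficients concrete, the following minimal sketch adds the two for high-educated men using the Table 2 point estimates. It assumes, as the table layout suggests but the text does not state explicitly, that the interaction coefficients are increments relative to the baseline effect for workers with fewer than 10 years of potential experience. The function and dictionary names are hypothetical, and the calculation ignores sampling uncertainty.

```python
# Table 2 point estimates for high-educated men (formal employment).
# The key names are illustrative labels, not identifiers from the paper.
HIGH_EDU_MEN_FORMAL = {
    "level": 0.260,         # Initial informality
    "exp_10_15": -0.033,    # Initial informality x Experience (10-15)
    "exp_over_15": -0.215,  # Initial informality x Experience (>15)
}


def net_effect(coefs, experience_bin):
    """Effect of initial informality for a given experience bin.

    Assumes interaction terms are increments over the baseline level effect.
    """
    if experience_bin == "under_10":
        return coefs["level"]
    return coefs["level"] + coefs[experience_bin]


for bin_name in ("under_10", "exp_10_15", "exp_over_15"):
    print(bin_name, round(net_effect(HIGH_EDU_MEN_FORMAL, bin_name), 3))
```

Under this assumption, most of the initial level effect is offset for the most experienced group, which is consistent with the statement that the effect is mitigated only beyond 15 years of potential experience.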
Initial rates of informality have a negative effect on all employment margins for high-educated women (i.e., lower chances of employment, both formal and informal, and higher chances of being unemployed and out of the labor force) and on earnings as well, with potential experience somewhat reinforcing this negative effect.

For low-educated men, initial informality in the local labor market has a significant effect on employment status, positive for formal employment, and negative for informal employment and labor force


Table 3: Impact of initial labor market conditions on long-term informal employment

Dependent variable: Fraction of cohort informally employed

                                           Low-educated            High-educated
Independent variables                      Men         Women       Men         Women
Initial informality                        –0.607***   –0.078      –0.403***   –0.223***
                                           (0.137)     (0.076)     (0.108)     (0.084)
Initial informality × Experience (10–15)   0.086***    –0.036      0.023       –0.057*
                                           (0.031)     (0.023)     (0.035)     (0.032)
Initial informality × Experience (>15)     0.275***    0.021***    0.201***    –0.001
                                           (0.041)     (0.026)     (0.034)     (0.029)
Initial unemployment                       0.347       0.168       0.306       0.234
                                           (0.415)     (0.242)     (0.340)     (0.318)
Initial unemployment × Experience (10–15)  0.277       –0.044      –0.108      –0.219
                                           (0.290)     (0.211)     (0.283)     (0.278)
Initial unemployment × Experience (>15)    0.296       0.324       0.104       –0.298
                                           (0.331)     (0.207)     (0.267)     (0.259)
N. of observations                         5,155       5,123       5,883       5,928

Note: Based on authors’ estimates with data from ENE and ENOE. All specifications include potential experience, region, cohort, and year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Point estimates measure the effect of a one percentage point increase in exposure to initial informality or initial unemployment on the fraction of informally employed individuals in a cohort. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).

participation. The interaction between initial informality and potential experience mutes the effect for formal and informal employment but exacerbates it for labor force participation. This interaction, especially for very experienced workers, also has a favorable effect on unemployment (i.e., it is negatively associated with unemployment) and a negative one on earnings. For low-educated women, initial informality has a positive and significant effect on formal employment, which is however diminished as initial informality interacts with experience, and a favorable level effect on unemployment (i.e., it is associated with a lower fraction of the cohort being unemployed). Conversely, the effect of initial informality is negative in terms of labor force participation and earnings, but only as low-educated female workers accumulate experience.

While the nature of the household survey data in Mexico imposes certain limitations for understanding long-term labor market outcomes, our results suggest that initial exposure to informality plays a relevant role. In particular, the rate of informality in local labor markets upon entry seems to be associated with a significant negative effect on long-term labor force participation for all workers, except for highly educated men.
Our estimates also imply that the initial informality rate in the local labor market positively affects the probability of being formally employed and negatively affects the probability of being informally employed. Additionally, we find evidence that long-term earnings for men are relatively unaffected by initial rates of informality, but the same cannot be said for women, whose long-term earnings appear to be negatively and significantly associated with initial rates of informality.

Our reduced-form approach does not allow for discerning the mechanism(s) underlying our results. However, they are suggestive of the possible stepping-stone role that informal jobs may play in labor markets where opportunities for young and inexperienced workers are scarce. That is, while initial exposure to


Table 4: Impact of initial labor market conditions on long-term unemployment

Dependent variable: Fraction of cohort unemployed

                                           Low-educated            High-educated
Independent variables                      Men         Women       Men         Women
Initial informality                        –0.013      –0.079***   0.015       0.012
                                           (0.047)     (0.028)     (0.051)     (0.041)
Initial informality × Experience (10–15)   –0.049***   0.001       –0.027      –0.023
                                           (0.017)     (0.011)     (0.019)     (0.016)
Initial informality × Experience (>15)     –0.037**    –0.002      –0.025      –0.026*
                                           (0.017)     (0.013)     (0.019)     (0.015)
Initial unemployment                       0.119       0.024       0.210       0.033
                                           (0.191)     (0.112)     (0.182)     (0.166)
Initial unemployment × Experience (10–15)  –0.199      –0.051      –0.202      –0.170
                                           (0.163)     (0.089)     (0.172)     (0.134)
Initial unemployment × Experience (>15)    –0.073      –0.041      –0.188      –0.116
                                           (0.177)     (0.098)     (0.151)     (0.126)
N. of observations                         5,155       5,123       5,883       5,928

Note: Based on authors’ estimates with data from ENE and ENOE. All specifications include potential experience, region, cohort, and year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Point estimates measure the effect of a one percentage point increase in exposure to initial informality or initial unemployment on the fraction of unemployed individuals in a cohort. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).

high levels of informality may have an adverse effect on long-term labor force participation, it may result in a higher probability of being formally employed and, hence, being entitled to social security benefits and certain forms of employment protection for those workers who manage to remain attached to the labor force. We explore this issue in more depth in the next section.

3.2.2 Individual Analysis

The rotating panel structure of the Mexican household surveys does not permit the construction of a panel of formal and informal workers with the level of detail on employment trajectories required to study long-term labor market outcomes as in, for example, Schwandt and Von Wachter [2019]. We attempt to partially address this limitation by exploiting a special module of the ENOE, the Labor Trajectory Module (MOTRAL, for its acronym in Spanish). The MOTRAL was specifically designed to collect information on labor trajectories of Mexican workers. It was conducted on a subsample from ENOE in the second quarter of both 2012 and 2015. Each round of the MOTRAL has the objective of reconstructing employment trajectories over the previous 5 years (i.e., the 2012 round covers the period 2007–2012 and the 2015 round covers the period 2010–2015). As with the standard household survey, these trajectories are entirely self-reported. In this last part of our analysis we use the MOTRAL to address the following question: how does the formal versus informal status of a worker’s first job affect his/her future wages?45

45 While the MOTRAL has its own limitations, the survey does ask respondents about specific details of their first job, even if it took place before the years covered by the survey.


Table 5: Impact of initial labor market conditions on long-term labor force participation

Dependent variable: Fraction of cohort not in the labor force

                                           Low-educated            High-educated
Independent variables                      Men         Women       Men         Women
Initial informality                        0.250***    0.030       0.128       0.254**
                                           (0.090)     (0.084)     (0.114)     (0.105)
Initial informality × Experience (10–15)   0.025       0.068*      0.038       0.090**
                                           (0.025)     (0.030)     (0.044)     (0.039)
Initial informality × Experience (>15)     0.059**     0.094***    0.039       0.166***
                                           (0.028)     (0.036)     (0.039)     (0.038)
Initial unemployment                       –0.395      –0.457      –0.305      0.110
                                           (0.290)     (0.329)     (0.424)     (0.409)
Initial unemployment × Experience (10–15)  –0.245      0.261       0.095       0.311
                                           (0.236)     (0.255)     (0.385)     (0.299)
Initial unemployment × Experience (>15)    –0.255      –0.184      0.167       0.276
                                           (0.237)     (0.275)     (0.326)     (0.297)
N. of observations                         5,155       5,123       5,883       5,928

Note: All specifications include potential experience, region, cohort, and year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Point estimates measure the effect of a one percentage point increase in exposure to initial informality or initial unemployment on the fraction of individuals out of the labor force in a cohort. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).

Table 6: Impact of initial labor market conditions on long-term labor earnings

Dependent variable: Log total earnings

                                           Low-educated            High-educated
Independent variables                      Men         Women       Men         Women
Initial informality                        –0.453      –0.259      –0.240      –1.415***
                                           (0.336)     (0.473)     (0.315)     (0.396)
Initial informality × Experience (10–15)   –0.031      –0.413**    –0.001      0.039
                                           (0.111)     (0.163)     (0.116)     (0.114)
Initial informality × Experience (>15)     –0.393***   –0.753***   0.031       –0.042
                                           (0.101)     (0.168)     (0.107)     (0.124)
Initial unemployment                       0.349       3.090***    –0.439      –1.998
                                           (0.860)     (1.548)     (0.534)     (1.331)
Initial unemployment × Experience (10–15)  1.033       –1.688      0.530       0.443
                                           (0.842)     (1.214)     (0.835)     (1.068)
Initial unemployment × Experience (>15)    1.113*      –0.238      0.874       1.528
                                           (0.642)     (1.250)     (0.770)     (1.076)
N. of observations                         5,119       5,070       5,630       5,674

Note: Based on authors’ estimates with data from ENE and ENOE. Labor earnings are computed with zeros to avoid conditioning on a potential outcome (labor force participation). Omitting zeros could result in selection bias since the composition of the labor force may change due to the treatment. The variable is constructed as the natural logarithm of the mean total income for each cell of cohort c, demographic group d, region r, and time t. Point estimates measure the effect of a one percentage point increase in exposure to initial informality or initial unemployment on the percentage change in average labor earnings of a cohort. All specifications include potential experience, region, cohort, and year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).
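The construction of the earnings variable described in the note to Table 6 can be illustrated with a small sketch. It contrasts the log of a cell's mean income with zeros included, as the note describes, against the alternative that conditions on positive earnings. The function names and the example incomes are hypothetical and are not taken from the ENE or ENOE data.

```python
import math


def log_mean_income(cell_incomes):
    """Log of the mean income in a cell, keeping individuals with zero income."""
    if not cell_incomes:
        raise ValueError("cell is empty")
    mean_income = sum(cell_incomes) / len(cell_incomes)
    if mean_income <= 0:
        raise ValueError("log is undefined when the cell mean is not positive")
    return math.log(mean_income)


def log_mean_income_earners_only(cell_incomes):
    """Same statistic, but conditioning on positive income (for comparison)."""
    earners = [income for income in cell_incomes if income > 0]
    return log_mean_income(earners)


# Hypothetical cell: two people without labor income and two earners.
cell = [0.0, 0.0, 5000.0, 7000.0]
print(log_mean_income(cell))               # log of 3000
print(log_mean_income_earners_only(cell))  # log of 6000
```

Because the first statistic moves when people leave employment, it reflects changes in participation as well as in wages, which is the reason the note gives for keeping zeros.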


A relevant feature of the MOTRAL is that information is reported at the level of an individual’s job. That is, the respondent is asked to provide information regarding each of the jobs held during the five-year window covered by the survey. For each job reported by an individual in the MOTRAL, we can observe the month and year in which the individual began that job, his/her position within the job (i.e., employer, employee, self-employed), the monthly wage, the sector of economic activity associated with the job, the social security institution the individual was affiliated with (if any),46 and the month and year in which the job ended. In addition, socio-demographic variables can be retrieved since each individual in the MOTRAL is present and can be matched in the corresponding round of the ENOE. Importantly, the information collected for each individual’s job is independent of the employment duration in that job, so that within-job dynamics are not reported. This implies, for example, that wage dynamics within a job are not observed; we only observe one self-reported income for each individual’s job, regardless of the specific spell of employment. This, in turn, requires making some assumptions to construct a panel from the MOTRAL.

We use both rounds of the MOTRAL to construct a panel and define the following key variables at the monthly frequency for the period from January 2007 to June 2015. (i) Working status: we assume that an individual is working in a specific month if he/she has at least one job spell in that month; (ii) Monthly wage: calculated as total earnings across all active job spells in a given month; and (iii) Sector of economic activity: we assign a sector of economic activity to an individual based on the sector associated with the job in which he/she earned the most monthly income.
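The three monthly definitions above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (keys `person`, `start`, `end`, `wage`, `sector`), the integer month index, and the example spells are hypothetical and do not reproduce the MOTRAL file structure. Each job carries a single wage for its whole spell, in line with the one self-reported income per job noted above.

```python
from collections import defaultdict


def month_index(year, month):
    """Map a calendar month to a running integer index."""
    return year * 12 + (month - 1)


def build_monthly_panel(job_spells, first_month, last_month):
    """Return {(person, month): {"working", "wage", "sector"}} from job spells."""
    spells_by_person = defaultdict(list)
    for spell in job_spells:
        spells_by_person[spell["person"]].append(spell)

    panel = {}
    for person, spells in spells_by_person.items():
        for month in range(first_month, last_month + 1):
            active = [s for s in spells if s["start"] <= month <= s["end"]]
            top_job = max(active, key=lambda s: s["wage"]) if active else None
            panel[(person, month)] = {
                # (i) working if at least one job spell covers the month
                "working": bool(active),
                # (ii) total earnings across all active job spells
                "wage": sum(s["wage"] for s in active),
                # (iii) sector of the job with the highest monthly income
                "sector": top_job["sector"] if top_job else None,
            }
    return panel


# Hypothetical individual with two overlapping jobs in 2007.
example_spells = [
    {"person": "A", "start": month_index(2007, 1), "end": month_index(2007, 6),
     "wage": 4000.0, "sector": "retail"},
    {"person": "A", "start": month_index(2007, 5), "end": month_index(2007, 12),
     "wage": 6000.0, "sector": "manufacturing"},
]
panel = build_monthly_panel(example_spells, month_index(2007, 1), month_index(2008, 1))
print(panel[("A", month_index(2007, 5))])
```

In months where the two spells overlap, the wage is the sum of both jobs and the sector is that of the better-paid job. In months with no active spell, the individual is recorded as not working.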
Since some of these variables in the monthly panel display little month-to-month variation, we aggregate the monthly panel to a yearly frequency and redefine the following variables. (i) Working status: dichotomous variable equal to 1 if the individual worked for at least one month within a given year; (ii) Income: calculated as the average monthly wage for those months in which the individual was an active worker; and (iii) Sector of economic activity: we set a worker’s sector of economic activity as the sector of the job in which he/she earned the most income during the year. We also include the year of entry into the labor market, the status (formal or informal) of the first job, as well as socio-demographic variables.

Using the annual panel just described, we estimate the following specification, separately for each demographic group $d \in \{M_l, F_l, M_h, F_h\}$:

$$\ln(y_{it}) = \beta_0 + \beta_1 age_{it} + \beta_2 age_{it}^2 + \beta_3 formal_{i0} + \beta_4 (formal_{i0} \times experience_{it}) + \beta_5 recession_{i0} + \gamma_t + \gamma_s + \lambda_{it} + \varepsilon_{it} \tag{3.5}$$

where $y_{it}$ is the average monthly income, $age_{it}$ is the age of individual $i$ at time $t$, $experience_{it}$ are years of potential experience of individual $i$ at time $t$, $formal_{i0}$ is a dummy variable that equals one if individual $i$’s first job was formal and zero otherwise, and $recession_{i0}$ is a dummy variable equal to one if individual $i$ entered the labor force for the first time in a local labor market facing adverse economic conditions. We specifically control for the state of the overall economy in local labor markets to avoid the formal first

46 Access to social security benefits is used to determine whether a job can be categorized as formal or informal.

job variable picking up aggregate economic conditions. $\gamma_t$ and $\gamma_s$ are year and sector of economic activity fixed effects, $\lambda_{it}$ is a correction term that addresses estimation bias due to possible non-random selection into the sample, and $\varepsilon_{it}$ is the error term. The coefficients of interest are $\beta_3$ and $\beta_4$, which capture the effect that a worker’s first job being formal has on future earnings, both directly and through the interaction with years of potential experience.47

The formal status of a worker’s first job may be correlated with other unobserved individual characteristics that could also affect labor earnings, such as unobserved ability. Thus, an OLS regression may not recover the true parameters $\beta_3$ and $\beta_4$ because of omitted variable bias. We correct for this bias by instrumenting $formal_{i0}$ using measures of local labor market informality rates for young workers48 for the year in which they were first employed. The identifying assumption is that the informality rate is uncorrelated with these unobserved characteristics of the individual entering the labor market, but it negatively affects the likelihood of starting in a formal job. The results of a battery of tests that corroborate the validity of our instruments of choice are reported in appendix D. Since these rates of informality cannot be calculated prior to 1995, we restrict our panel to workers who entered the labor market in or after 1995.49

To construct the correction term, $\lambda_{it}$, we follow Heckman [1979] and characterize the selection equation as:

$$l_{it} = \Phi(\alpha_0 + \alpha_1 experience_{it} + \alpha_2 experience_{it}^2 + outwork_{it} + \psi_t + \psi_r) + \nu_{it} \tag{3.6}$$

where $l_{it}$ is a dummy variable that equals one if individual $i$ is working in year $t$, $\psi_t$ and $\psi_r$ are fixed effects for year and region of birth, respectively, and $\nu_{it}$ is the error term. The variable $outwork_{it}$ counts the number of consecutive years individual $i$ spent without working prior to year $t$. More specifically, this variable is constructed recursively as $outwork_{it} = [outwork_{it-1} + 1] \times 1[l_{it-1} = 0]$, with $outwork_{it} = 0$ for the first year in which we have information regarding individual $i$.

The estimation results of the probit model used to construct the Heckman sample selection correction term are reported in table 7.50 The results show that: (i) potential years of experience have a positive marginal effect on an individual’s work status that dissipates as experience is accumulated, with this

47 Potential years of experience, $experience_{it}$, is simply the number of years that have elapsed between the year of entry into the labor force and year $t$. The variable $recession_{i0}$ is an indicator that equals one if the annual growth rate of economic activity in the state in which individual $i$ was born is negative in the year this individual became employed for the first time. Notice that, for lack of more accurate information, we implicitly assume that individuals enter the local labor market for the first time in the state in which they were born. Economic activity is measured with the ITAEE (Indicador Trimestral de la Actividad Económica Estatal), a quarterly index that is a timely macroeconomic indicator at the state level and can be thought of as a proxy for a state’s GDP. The index is first annualized and then its growth rate is calculated as the annual percentage change in the index.
48 Recall that our analysis in section 3.2.1 revealed that informality is indeed the aggregate labor market feature that matters the most for young workers entering the labor force for the first time.
49 Approximately 43% of our sample enters the labor market in or after 1995. Figure D.2 in appendix D presents the distribution by year of entry in the whole sample. We also restrict the sample to exclude individuals who are employers, self-employed, and unpaid employees. These observations are excluded due to potential concerns regarding heterogeneity in the quality of jobs for the workers present in the panel. Since the MOTRAL does not provide additional information regarding job quality, other than the wages earned, the formal/informal nature of the job, and the sector of economic activity, we address this concern by restricting our sample. These excluded observations account for around 20% of the total observations in the panel we construct.
50 Even if the sample for the main regression only considers individuals who entered the labor market in 1995 or after, the probit model in equation (3.6) is estimated with the whole sample.

effect being stronger for men than for women; and (ii) the longer an individual has remained in a spell of non-employment, the less likely it is that he/she will be employed in the current period, with this effect being stronger for men than for women. The results also suggest that women have a lower average probability of being employed in any given period. Figure 17 aids in the visualization of these patterns.

Table 7: Heckman selection into employment

Dependent variable: Labor market participation

Independent variables  Men         Women
Experience             0.058***    0.023***
                       (0.005)     (0.004)
Experience^2           –0.001***   –0.000***
                       (·)         (·)
Out of work            –0.856***   –0.733***
                       (0.019)     (0.014)
Constant               0.011       –0.197***
                       (0.106)     (0.080)
N. of observations     20,382      20,058

Note: Based on authors’ estimates with data from ENE and ENOE. All specifications include region and year fixed effects. Robust standard errors are reported in parentheses. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).
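The "Out of work" regressor in Table 7 corresponds to the variable outwork, defined recursively below equation (3.6). A minimal sketch of that recursion follows, assuming a hypothetical input format in which an individual's yearly working indicators are listed in chronological order starting from the first observed year. The function name is illustrative.

```python
def consecutive_years_out_of_work(worked):
    """Apply outwork_t = (outwork_{t-1} + 1) * 1[l_{t-1} = 0], with outwork = 0 initially.

    `worked` is a chronological list of 0/1 working indicators for one individual.
    """
    outwork = []
    for t in range(len(worked)):
        if t == 0:
            # First year with information on the individual.
            outwork.append(0)
        else:
            did_not_work_last_year = 1 if worked[t - 1] == 0 else 0
            outwork.append((outwork[t - 1] + 1) * did_not_work_last_year)
    return outwork


# Hypothetical history: works, two years out, works again, then out.
print(consecutive_years_out_of_work([1, 0, 0, 1, 0]))
```

The counter resets to zero in the year after any year worked, so it measures only the current uninterrupted spell of non-employment preceding year t.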

Figure 17: Probability of participating in the labor market

[Figure omitted. Panels: (a) Men; (b) Women. Vertical axis: Probability of Working, from 0 to 1. Horizontal axis: Potential years of experience, from 1 to 20. Series: 0 years without working; 1 year without working; 2 years without working.]

Note: Based on authors’ estimates with data from MOTRAL and ENOE. The figure plots the probability of being employed as a function of accumulated years of experience. Probabilities are calculated assuming average values for fixed effects variables. 95% confidence intervals were estimated using the delta method.

The estimated coefficients of equation (3.5) are presented in table 8. The main takeaways from this last part of our analysis are the following. Having the first job in the formal sector has a positive level effect on the future wages of high-educated workers, both men and women, which increases with the accumulation of experience. The level effect is of a significant magnitude and particularly so for women: workers that enter the labor market for the first time into a formal job have future wages that, on average,


are 45% higher for men and 54% higher for women, relative to the wages of workers whose first job was in the informal sector. Notice also that entering the labor market during a period of contracting economic activity has a long-term negative impact only for high-educated men (about a 5% decrease in future earnings, on average). In contrast, we find that the same effect is positive (10% increase in future earnings, on average) for high-educated women.

Table 8: Impact of formal first job on future wages

Dependent variable: Log wages

                               Low-educated                        High-educated
Independent variables          Men         Women       All         Men         Women       All
Formal first job               0.022       0.743***    0.185*      0.455***    0.541**     0.257**
                               (0.140)     (0.206)     (0.111)     (0.130)     (0.250)     (0.108)
Formal first job × Experience  0.041***    0.011       0.042***    0.037**     0.103***    0.066***
                               (0.015)     (0.014)     (0.010)     (0.015)     (0.020)     (0.011)
Age                            0.021*      –0.044**    –0.018*     0.027       –0.008      0.027
                               (0.012)     (0.022)     (0.011)     (0.023)     (0.031)     (0.017)
Age^2                          –0.001***   0.000       –0.000      –0.000      0.000       –0.001**
                               (·)         (·)         (·)         (·)         (·)         (·)
Recession                      0.041       –0.028      –0.012      –0.051*     0.100*      0.005
                               (0.034)     (0.041)     (0.026)     (0.037)     (0.053)     (0.029)
λ                              –0.114***   –0.081***   –0.236***   –0.145***   –0.239***   –0.301***
                               (0.038)     (0.031)     (0.024)     (0.037)     (0.053)     (0.029)
Constant                       8.212***    9.435***    8.881***    7.732***    9.346***    8.488***
                               (0.181)     (0.554)     (0.186)     (0.384)     (0.533)     (0.306)
N. of observations             2,141       1,808       3,949       3,513       3,330       6,843

Note: Based on authors’ estimates with data from ENE and ENOE. All specifications include sector of economic activity and year fixed effects. Robust standard errors are reported in parentheses. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).
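The text reads the log-wage coefficients in Table 8 directly as approximate percentage differences (for instance, 0.455 as 45%). As a side calculation that is not part of the authors' reported results, the sketch below combines the level and interaction coefficients from equation (3.5) as β3 + β4 × experience and applies the standard exact conversion from log points, exp(β) − 1. The function names are hypothetical and the calculation ignores the standard errors.

```python
import math


def formal_first_job_effect(beta3, beta4, experience_years):
    """Log-point wage effect of a formal first job at a given level of experience."""
    return beta3 + beta4 * experience_years


def log_points_to_percent(log_points):
    """Exact percentage difference implied by a log-point difference."""
    return (math.exp(log_points) - 1.0) * 100.0


# Table 8 point estimates for high-educated men.
BETA3_MEN_HIGH, BETA4_MEN_HIGH = 0.455, 0.037

for years in (0, 5, 10):
    effect = formal_first_job_effect(BETA3_MEN_HIGH, BETA4_MEN_HIGH, years)
    print(years, round(effect, 3), round(log_points_to_percent(effect), 1))
```

For coefficients of this size the exact conversion is noticeably larger than the log-point reading, so the two should not be compared without noting which convention is used.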

Looking at low-educated workers, we find that men can benefit from having a formal first job only through the accumulation of experience. Conversely, having a first job in the formal sector entails a positive level effect on future wages for low-educated women, with no added benefit stemming from the accumulation of potential experience. The effect on average wages for low-educated women is stronger than that estimated for high-educated ones. The fact that potential experience seems to have no relevance in the wage trajectories of low-educated women may be related to potential experience generally overstating women’s work experience, since it does not account for the time that women spend out of the labor market for child rearing (see De la Cruz Toledo [2014]). Due to strong gender roles in Mexico, women also tend to have more career interruptions than men since they are more heavily burdened with the care of the family. Additionally, our results suggest that entering the labor market during a period of negative growth has no significant effect on future earnings for low-educated workers. This does not imply that these workers do not face depressed wages during the year in which they enter the labor market, but that any potential negative effect on earnings does not persist in the future.


3.3 Summary and Discussion of the Main Findings of Part II

In sum, the empirical evidence presented in this section suggests that the interplay between formality and informality in the Mexican labor market significantly affects the income dynamics of workers. In particular, our estimates suggest that transitions out of and into formal employment can be costly, not only because, on average, informal jobs pay lower wages and do not entail social security benefits or legal employment protections, but also because spells out of formal employment are associated with wage penalties upon re-entry that may last for several years. Our findings also point to the fact that early exposure to informality, whether through the aggregate labor market conditions under which a new cohort of workers enters the labor force or through a worker’s first job being informal, can have persistent effects on labor market outcomes. For instance, an individual entering the labor market as a formal worker can see increases in future wages by 50 to 80%, on average, relative to a worker whose first job was informal, depending on gender and level of education. We also document that, when a new cohort of workers enters the labor market during times of higher rates of informality, this can be associated with positive long-term outcomes, such as a higher fraction of the cohort being formally employed or a lower fraction of the cohort being unemployed, but also with negative long-term outcomes, such as a higher fraction of the cohort being out of the labor force.

4 Concluding Remarks

Using social security records for millions of Mexican formal sector workers, we have studied the distribution of earnings, mobility patterns in this distribution, and the distribution of earnings changes that characterize the dynamics of earnings. Following a non-parametric approach, we reach the following conclusions: the distribution of one-year earnings changes displays significant deviations from normality, with these deviations varying over the life cycle, across the permanent income distribution, and, to a lesser extent, across genders. We find that lower-income and younger workers face, on average, more dispersion in earnings changes, while lower-income and older workers face a distribution of one-year earnings changes with a more pronounced peak. The distribution of earnings changes is not symmetric, but whether it is left or right skewed depends on gender, age, and income: the distribution is most right skewed for lower-income and older women, and it is most left skewed for higher-income and younger men. Additionally, the distribution of log earnings displays decreasing dispersion (or inequality) starting in 2015, a trend that is largely related to the upward revisions of the minimum wage from that year onward. Finally, we find that upward mobility within the earnings distribution is highest for lower-income and younger workers, while downward mobility is highest for higher-income and younger workers. At the very top of the distribution (top 0.1%) we find little to no mobility.

After establishing these facts, we extend our analysis of the dynamics of earnings of Mexican workers by studying how transitions out of and into formal employment affect earnings and by focusing on the likely role that informality plays in shaping earnings dynamics in Mexico. In this regard, we find that

workers that transition out of formal employment are subject to an earnings penalty upon re-entry. This penalty is a cost that workers must bear for three years or more before achieving a level of earnings that is comparable to pre-exit levels. We also find that for cohorts of new workers entering the labor market, higher levels of informality upon entry have significant and persistent effects on long-term labor outcomes, especially in terms of employment status. Early exposure to informality, in the form of getting the first job in the informal sector, also proves to have a negative and sizable impact on future earnings.

We hope that our findings can inform future research and policy analysis regarding the Mexican labor market and that studying its structure and peculiar traits in more depth can shed light on the key factors that are germane to the distribution of earnings and earnings shocks in Mexico. Understanding and giving context to these factors is also crucial for performing meaningful cross-country comparisons of these distributions. Future research could benefit from access to tax records that could help overcome the limitations of our analysis for the very top of the earnings distribution. Further work regarding the dual nature of the Mexican labor market, a feature that is also important in other developing countries, and the ways in which earnings dynamics are shaped by workers’ transitions across formal and informal employment is also a promising and relevant avenue for future research.


References

Aguilar-Argaez, A. M., Alcaraz, C., Ramírez, C. B., & Rodríguez-Pérez, C. A. (2020). The NAIRU and informality in the Mexican labor market. Banco de México Working Paper Series No. 2020-09.
Alcaraz, C., Chiquiar, D., & Salcedo, A. (2015). Informality and segmentation in the Mexican labor market. Banco de México Working Paper Series No. 2015-25.
Arellano, M., Blundell, R., & Bonhomme, S. (2017). Earnings and consumption dynamics: a nonlinear panel data framework. Econometrica, 85(3), 693–734.
Baker, S. R., Bloom, N., & Davis, S. J. (2016). Measuring economic policy uncertainty. The Quarterly Journal of Economics, 131(4), 1593–1636.
Binelli, C., & Attanasio, O. (2010). Mexico in the 1990s: the main cross-sectional facts. Review of Economic Dynamics, 13(1), 238–264.
Bloom, N. (2009). The impact of uncertainty shocks. Econometrica, 77(3), 623–685.
Bonhomme, S., & Hospido, L. (2017). The cycle of earnings inequality: evidence from Spanish social security data. The Economic Journal, 127(603), 1244–1278.
Campos-Vazquez, R. M., & Lustig, N. (2017). Labour income inequality in Mexico: Puzzles solved and unsolved. Journal of Economic and Social Measurement, 1–17.
Campos Vázquez, R. M., & Rodas Milián, J. A. (2020). El efecto faro del salario mínimo en la estructura salarial: evidencias para México. El Trimestre Económico, 87(345), 51–97.
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014). Measuring the impacts of teachers II: Teacher value-added and student outcomes in adulthood. American Economic Review, 104(9), 2633–79.
Chetty, R., Hendren, N., Lin, F., Majerovitz, J., & Scuderi, B. (2016). Childhood environment and gender gaps in adulthood. American Economic Review, 106(5), 282–88.
DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological Methods, 2(3), 292.
De la Cruz Toledo, E. (2014). Women’s employment in Mexico. Doctoral thesis, Graduate School of Arts and Sciences, Columbia University.
De la Cruz Toledo, E. (2020). Universal preschool and mother’s employment. Working paper.
Esquivel, G., Lustig, N., & Scott, J. (2010). A decade of falling inequality in Mexico: market forces or state action? In L. F. Lopez Calva & N. Lustig (Eds.), Declining inequality in Latin America: A decade of progress? (pp. 175–217). Brookings Institution and UNDP, Washington D.C.
Fiess, N. M., Fugazza, M., & Maloney, W. F. (2010). Informal self-employment and macroeconomic fluctuations. Journal of Development Economics, 91(2), 211–226.
Günther, I., & Launov, A. (2012). Informal employment in developing countries: Opportunity or last resort? Journal of Development Economics, 97(1), 88–98.
Guvenen, F., Kaplan, G., Song, J., & Weidner, J. (2017). Lifetime incomes in the United States over six decades. National Bureau of Economic Research Working Paper No. 23371.
Guvenen, F., Karahan, F., Ozkan, S., & Song, J. (2021). What do data on millions of US workers reveal about life-cycle earnings dynamics? Econometrica, forthcoming.


Guvenen, F., Ozkan, S., & Song, J. (2014). The nature of countercyclical income risk. Journal of Political Economy, 122(3), 621–660.

Heckman, J. J. (1979). Sample selection bias as a specification error. Econometrica, 153–161.

Iacus, S. M., King, G., & Porro, G. (2012). Causal inference without balance checking: Coarsened exact matching. Political Analysis, 1–24.

INEGI. (2008). Homologación de la serie de indicadores estratégicos ENE–ENOE. Documento técnico.

Jurado, K., Ludvigson, S. C., & Ng, S. (2015). Measuring uncertainty. American Economic Review, 105(3), 1177–1216.

Kaplan, D. S., & Novaro, F. P. A. (2006). El efecto de los salarios mínimos en los ingresos laborales de México. El Trimestre Económico, 139–173.

Kumler, T., Verhoogen, E., & Frías, J. A. (2020). Enlisting employees in improving payroll-tax compliance: Evidence from Mexico. The Review of Economics and Statistics, 102(5), 881–896.

La Porta, R., & Shleifer, A. (2014). Informality and development. Journal of Economic Perspectives, 28(3), 109–26.

Levy, S. A. (2018). Under-rewarded efforts: The elusive quest for prosperity in Mexico. Inter-American Development Bank.

Lustig, N., Lopez-Calva, L. F., & Ortiz-Juarez, E. (2013). Declining inequality in Latin America in the 2000s: The cases of Argentina, Brazil, and Mexico. World Development, 44, 129–141.

Mathy, G. P. (2020). How much did uncertainty shocks matter in the Great Depression? Cliometrica, 14(2), 283–323.

Oreopoulos, P., Von Wachter, T., & Heisz, A. (2012). The short- and long-term career effects of graduating in a recession. American Economic Journal: Applied Economics, 4(1), 1–29.

Schwandt, H., & Von Wachter, T. (2019). Unlucky cohorts: Estimating the long-term effects of entering the labor market in a recession in large cross-sectional data sets. Journal of Labor Economics, 37(S1), S161–S198.

Stock, J. H., Yogo, M., et al. (2005). Testing for weak instruments in linear IV regression. Identification and inference for econometric models: Essays in honor of Thomas Rothenberg, 80(4.2), 1.

Tümen, S. (2016). Informality as a stepping stone: A search-theoretical assessment of informal sector and government policy. Central Bank Review, 16(3), 109–117.

Ulyssea, G. (2020). Informality: Causes and consequences for development. Annual Review of Economics, 12, 525–546.


A Appendix: IMSS Data and Descriptive Statistics from the Master Sample

In this appendix we provide additional information on the structure and the specific features of the administrative data used to carry out the descriptive analysis in section 2, and we report summary statistics of our master sample that should facilitate the comparison of results across countries.

Our administrative data, the IMSS data, are available monthly from January 2005 to December 2019 and cover approximately 13 million workers at the start of the sample period and 20 million workers toward the end. The information available for each worker is:⁵¹

• Social Security Number (SSN)
• Unique population registry* (CURP for its acronym in Spanish)
• Gender
• Type of employment (permanent vs temporary contract)⁵²
• Daily (taxable) wage
• Employer id
• Firm tax id* (RFC for its acronym in Spanish)
• Sector of economic activity
• Geographic location of employer (county where the employer registers the employees with IMSS)

Although not directly provided by IMSS, the worker’s SSN provides enough information to infer the year of birth (age) and the year of first enrollment in social security.⁵³

While our social security data contain sufficient information to characterize the patterns of income dynamics and inequality for Mexican workers, they have limitations that are relevant for understanding how they compare to, or differ from, administrative records and employer-employee matched data in other countries. First, the IMSS data do not contain any information on workers employed in the informal sector, which means, as already mentioned, that they miss a very substantial fraction of the Mexican labor force. Second, two additional issues are worth mentioning:

a. Employer id vs Firm id. In the IMSS data, the employer id does not correspond to the firm id, unlike what is usually the case in employer-employee matched datasets. The Mexican social security system allows firms to have multiple “registros patronales” (i.e. employer ids) that are used to register their workers with IMSS. The same firm could use multiple employer ids for several reasons, such as operating multiple plants or employing groups of workers with different risk profiles, and there is no official source of information providing a concordance between employer ids and the firms they belong to.⁵⁴ The variable firm id in the data corresponds to the id with which each firm is registered in the “Registro Federal de Contribuyentes” (RFC), a tax identity code assigned by the Servicio de Administración Tributaria, the Mexican tax authority. This code is used by both firms and individuals engaging in economic activities subject to taxes. Analogous to the case of the employer id, firms may legally register multiple RFCs and there is no information regarding which RFCs belong to the same firm.⁵⁵

b. Demographics. Mexican social security data do not provide information on a worker’s educational attainment, occupation, or foreign-born status, nor do they allow for identifying households (partners, parents, or children of a worker). IMSS data cannot be linked with other sources of information, such as household surveys, that contain some of these demographic characteristics.

We now turn to the specifics of the sample construction and sample statistics. Original IMSS records are collapsed to a yearly frequency by summing all the wage observations for a given worker within the year.⁵⁶ For the period 2005–2019, this results in over 315 million worker-year pairs, with between 17 and 26 million observations per year for workers aged 14–75. Imposing age restrictions on the sample to include only workers aged 25–55, which is the relevant group for all the results of section 2, drops between 23 and 26% of yearly observations and 24% of the observations in the whole sample. Excluded observations mainly consist of workers aged 24 and below (see figure A.1).

⁵¹ These fields of information correspond to those that IMSS has agreed to share with the General Directorate of Economic Research at Banco de México. The fields identified with asterisks are only available from November 2018 onward.
⁵² Temporary contracts are those that are specified with start and end dates, while permanent contracts do not include a pre-specified end date.
⁵³ For some of the observations, the age variable (inferred from the SSN) corresponds to an age that cannot possibly be correct. These observations represent a negligible fraction of monthly observations and are eliminated once the age restrictions are applied for constructing the master sample.
⁵⁴ In Mexico, social security contributions are paid based on the risk profile of the worker’s occupation.

Figure A.1: Age distribution in the master sample

[Figure A.1 (histogram). Horizontal axis: Age, 15 to 75. Vertical axis: Percent, 0 to 4. Annotations: until age 24, 17.3% of the total observations present in the IMSS data; age 25–55, 75.9%; above age 55, 6.8%.]

Note: Based on authors’ calculations with data from IMSS. The distribution is calculated over the entire sample, without age restrictions, consisting of 315 million worker-year pairs.

Regarding the composition by gender, the age-restricted sample is consistent with the composition of the original data: on average, throughout the sample period, 63% of observations are men, with the share of women rising steadily from 35% in 2005 to 39% in 2019. Over the same period, the absolute number of men grew by 50%, while the absolute number of women grew by 76%.⁵⁷

Within the group of workers in question (25–55 years of age), the bulk of the observations is concentrated among workers aged 30 to 44, who on average account for 55% of yearly observations. Splitting the observations into three age groups shows a significant change in the distribution of yearly observations across these groups, with a noticeable increase in the share of the oldest workers (45–55) at the expense of both the youngest workers (25–29) and workers aged 30 to 44.

⁵⁵ For example, it could be the case that a firm uses one RFC for its taxable domestic operations and another RFC for its foreign sector operations. But there is no information on how many RFCs each firm possesses and how it uses them.
⁵⁶ At this stage, the only observations that are dropped from the original records are those with a missing value for wage. These observations represent a negligible share of monthly observations.

Table A.1: Gender and age composition of the age-restricted sample

Year    By gender (% share)    By age group (% share)
        Men      Women         [25–29]   [30–44]   [45–55]
2005    65.2     34.8          25.1      56.3      18.6
2010    63.6     36.4          23.3      55.8      20.9
2015    63.0     37.0          22.9      54.1      23.0
2019    61.3     38.7          22.6      52.4      25.0

Note: Based on authors’ calculations with IMSS data.
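The sample construction described in this appendix (collapsing monthly records to a yearly frequency, applying the 25–55 age restriction, and computing composition shares such as those in table A.1) can be sketched as follows. This is a minimal illustration and not the authors' code: the record layout `(worker_id, year, month, gender, age, wage)`, the function names, and the toy data are all assumptions. The appendix states only that wage observations are summed within the year, so the sketch sums whatever wage value each monthly record carries.

```python
from collections import defaultdict

# Hypothetical monthly record: (worker_id, year, month, gender, age, wage).
# This layout is an assumption for illustration, not the IMSS file format.

def collapse_to_yearly(monthly_records):
    """Sum wage observations per worker-year, skipping records with a missing wage."""
    totals = defaultdict(float)
    attributes = {}
    for worker_id, year, _month, gender, age, wage in monthly_records:
        if wage is None:  # only records with a missing wage are dropped
            continue
        totals[(worker_id, year)] += wage
        attributes[(worker_id, year)] = (gender, age)  # keeps the last value seen
    return [(w, y, *attributes[(w, y)], total)
            for (w, y), total in sorted(totals.items())]

def restrict_age(yearly_records, min_age=25, max_age=55):
    """Keep worker-year pairs whose age lies in [min_age, max_age]."""
    return [r for r in yearly_records if min_age <= r[3] <= max_age]

def gender_shares(yearly_records):
    """Percent share of worker-year observations by gender, for each year."""
    counts = defaultdict(lambda: defaultdict(int))
    for _worker, year, gender, _age, _total in yearly_records:
        counts[year][gender] += 1
    return {year: {g: 100.0 * n / sum(by_gender.values())
                   for g, n in by_gender.items()}
            for year, by_gender in counts.items()}

# Toy data (invented): worker B is below the age cutoff, worker C has one missing wage.
toy_monthly = [
    ("A", 2005, 1, "M", 30, 100.0),
    ("A", 2005, 2, "M", 30, 110.0),
    ("B", 2005, 1, "F", 22, 80.0),
    ("C", 2005, 1, "F", 40, None),
    ("C", 2005, 2, "F", 40, 90.0),
]
toy_yearly = collapse_to_yearly(toy_monthly)
toy_master = restrict_age(toy_yearly)
toy_shares = gender_shares(toy_master)
```

In the toy data, the age restriction removes worker B, so the remaining worker-year pairs are split evenly between men and women.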

An important characteristic of our administrative data is the frequent movement of workers in and out of jobs affiliated with social security. These movements represent transitions between formal employment and non-formal employment, with this latter state being either employment in the government, employment in the informal sector, unemployment, or exit from the labor force. We highlight some relevant statistics regarding these transitions. Based on an analysis conducted with a random sample of 4 million workers aged 25 to 55, we document that (see also table A.2):

i. 8.8% are present during the entire sample period from 2005 to 2019.
ii. 75.4% have only one active spell that has an average duration of 5.9 years.⁵⁸
iii. 18.9% have only two active spells, the first of these having an average duration of 3.2 years, and the second having an average duration of 3.6 years.⁵⁹
iv. 24.6% of workers have at least one inactive spell during the sample period. Among these, 76.9% have only one inactive spell that has an average duration of 2.9 years.
v. The average duration of the first active spell is 5.2 years, regardless of the total number of active spells.
vi. The average duration of the first inactive spell is 2.7 years, regardless of the total number of inactive spells.

Table A.2: Distribution of workers by number of active job spells in the formal sector

N. of spells    Share (%)
                Men      Women    All
1               74.5     76.8     75.4
2               19.2     18.5     18.9
3               5.1      4.0      4.7
4               1.1      0.6      0.9
5 or more       0.2      0.1      0.1

Note: Based on authors’ calculations with a random sample from IMSS data consisting of 4 million workers aged 25 to 55.

The share of individuals with only one active spell as formal workers, 75.4%, is the result of a combination of: workers that stayed in the database during the whole sample period (8.8%); workers who entered the formal labor market after 2005 and continuously kept a formal job until 2019 (29.2%); and workers who entered in or after 2005, ended their formal employment relationship before 2019, and did not regain formal employment before or in 2019, i.e. they did not come back into the database (37.4%). We refer to the first two groups of workers as “stayers” and to the third group as “leavers”. The distribution of workers that have only one formal spell according to these two categories is characterized in table A.3. Stayers and leavers are almost equally split in the sample and, in general, Mexican workers tend to have a tenuous connection with employment in the formal sector, with this being more evident for women.

Table A.3: Distribution of workers with only one job spell in the formal sector

            Share (%)
            Men      Women    All
Stayers     51.9     48.2     50.4
Leavers     48.1     51.8     49.6

Note: Based on authors’ calculations with a random sample from IMSS data consisting of 4 million workers aged 25 to 55.

Figure A.2 shows the distribution of leavers by age of exit. About 16.9% of these workers leave the IMSS dataset when they are 55 years old. That is, they exit either because they reach the upper age limit we imposed to be included in the master sample or because they retire altogether. At the other extreme, a significant share of workers exit formal employment at a young age: 30.1% of all workers that leave are 30 years old or younger.

⁵⁷ In the original monthly records spanning January 2005 to December 2019, roughly 64% of observations are men, with the share of women rising steadily throughout the sample, from about 35% at the outset to 37% by the end of the period.
⁵⁸ Recall that a spell is defined as a sequence of contiguous years in which we observe the worker’s income (wage).
⁵⁹ 94% of workers have at most two active spells.


Figure A.2: Distribution of leavers by age of exit

[Figure A.2 (histogram). Horizontal axis: Exit Age, 25 to 55. Vertical axis: Percent, 0 to 20.]

Note: Based on authors’ calculations with data from IMSS.
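The spell statistics reported in this appendix rest on the definition of a spell as a sequence of contiguous years in which the worker's wage is observed. A minimal sketch of that bookkeeping is given below. The function names are hypothetical. The sketch also treats an inactive spell as a gap between two active spells, which is an assumption; it is consistent with the reported shares (24.6% of workers with at least one inactive spell equals 100% minus the 75.4% with a single active spell).

```python
def active_spells(years_observed):
    """Return maximal runs of contiguous years as (start_year, end_year) tuples."""
    spells = []
    for year in sorted(set(years_observed)):
        if spells and year == spells[-1][1] + 1:
            spells[-1] = (spells[-1][0], year)  # extend the current spell
        else:
            spells.append((year, year))         # start a new spell
    return spells

def inactive_spells(spells):
    """Gaps between consecutive active spells, as (first_missing, last_missing)."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(spells, spells[1:])]

def classify_single_spell(spells, last_year=2019):
    """For workers with one active spell: 'stayer' if it runs through last_year,
    'leaver' otherwise. Returns None for workers with several spells."""
    if len(spells) != 1:
        return None
    return "stayer" if spells[0][1] == last_year else "leaver"

# Toy example: a worker observed in 2005-2007 and again in 2010-2011.
example = active_spells([2005, 2006, 2007, 2010, 2011])
example_gaps = inactive_spells(example)
```

Spell durations then follow as `end_year - start_year + 1`, and averaging them over workers would give statistics of the kind listed in items ii to vi.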


B Appendix: Additional Results for Inequality, Mobility, and Income Dynamics

In this appendix we present additional results that complement those presented in section 2 of the main text.

Figure B.1: Distribution of earnings in the population

[Figure B.1, four panels, each plotted against year (2005–2019). (a) Percentiles: percentiles relative to 2005; series p10, p25, p50, p75, p90. (b) Top percentiles: percentiles relative to 2005; series p90, p95, p99, p99.9, p99.99. (c) Overall dispersion: dispersion of log earnings; series 2.56σ and P90–P10. (d) Right- and left-tail dispersion: dispersion of log earnings; series P90–P50 and P50–P10.]

Note: Based on authors’ calculations with data from IMSS. Using the CS+TMax sample, this figure plots against time the following statistics of the distribution of log (real) earnings for the whole population: (a) P10, P25, P50, P75, P90; (b) P90, P99, P99.9, P99.99; (c) P90–P10 and 2.56σ, which corresponds to the P90–P10 differential for a Gaussian distribution; (d) P90–P50 and P50–P10. Since the data are top coded, the percentiles above P95 are imputed by fitting a Pareto distribution around the top code. All percentiles are normalized to 0 in 2005, the first available year. Shaded areas are recessions.
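The 2.56σ benchmark in panel (c) follows from the standard normal quantile function: for a Gaussian distribution, P90–P10 equals 2 × 1.2816σ ≈ 2.56σ. The sketch below checks this and computes the percentile-based dispersion measures of panels (c) and (d) on simulated data. The function names and the linear-interpolation percentile rule are illustrative assumptions, since the paper does not state which percentile convention it uses.

```python
import random
from statistics import NormalDist, pstdev

def percentile(sorted_x, p):
    """Percentile p in [0, 100], by linear interpolation between order statistics."""
    pos = (len(sorted_x) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_x) - 1)
    return sorted_x[lo] + (pos - lo) * (sorted_x[hi] - sorted_x[lo])

def dispersion_measures(log_earnings):
    """Percentile differentials plus the Gaussian benchmark 2.56 * sigma."""
    x = sorted(log_earnings)
    p10, p50, p90 = (percentile(x, p) for p in (10, 50, 90))
    return {
        "P90-P10": p90 - p10,
        "P90-P50": p90 - p50,
        "P50-P10": p50 - p10,
        "2.56*sigma": 2.56 * pstdev(x),
    }

# P90-P10 of a standard normal, in units of sigma (about 2.563).
gaussian_factor = 2 * NormalDist().inv_cdf(0.90)

# Simulated Gaussian log earnings (invented data): the two measures nearly coincide.
rng = random.Random(0)
simulated = [rng.gauss(0.0, 1.0) for _ in range(20000)]
measures = dispersion_measures(simulated)
```

A gap between P90–P10 and 2.56σ in the actual data therefore signals a departure of the log earnings distribution from normality.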


Figure B.2: Distribution of residual earnings in the population after controlling for age

[Figure B.2, four panels, each plotted against year (2005–2019). (a) Percentiles: percentiles relative to 2005; series p10, p25, p50, p75, p90. (b) Top percentiles: percentiles relative to 2005; series p90, p95, p99, p99.9, p99.99. (c) Overall dispersion: dispersion of residual log earnings; series 2.56σ and P90–P10. (d) Right- and left-tail dispersion: dispersion of residual log earnings; series P90–P50 and P50–P10.]

Note: Based on authors’ calculations with data from IMSS. Using the CS+TMax sample, this figure plots against time the following statistics of the distribution of residual earnings for the whole population: (a) P10, P25, P50, P75, P90; (b) P90, P99, P99.9, P99.99; (c) P90–P10 and 2.56σ, which corresponds to the P90–P10 differential for a Gaussian distribution; (d) P90–P50 and P50–P10. Residual earnings are obtained by regressing log earnings on a full set of age dummies, separately by gender and year, and are computed to avoid trends being affected by individuals being at different stages of their life cycles, or by the business cycle. Shaded areas are recessions.
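Because a regression on a full set of age dummies, run separately by gender and year, is saturated, its fitted values are the mean log earnings of each gender-year-age cell, and the residuals are deviations from those cell means. The residualization in the note can therefore be sketched without a regression routine. The tuple layout `(gender, year, age, log_earnings)` and the function name are assumptions made for illustration.

```python
from collections import defaultdict

def residualize(records):
    """records: iterable of (gender, year, age, log_earnings) tuples.
    Returns residuals in input order: log earnings minus the mean of the
    (gender, year, age) cell, which equals the OLS residual from a regression
    on a full set of age dummies estimated separately by gender and year."""
    records = list(records)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for gender, year, age, value in records:
        sums[(gender, year, age)] += value
        counts[(gender, year, age)] += 1
    return [value - sums[(g, y, a)] / counts[(g, y, a)]
            for g, y, a, value in records]

# Toy data (invented): two men aged 30, one man aged 31, one woman aged 30.
toy = [("M", 2005, 30, 10.0), ("M", 2005, 30, 12.0),
       ("M", 2005, 31, 11.0), ("F", 2005, 30, 9.0)]
toy_residuals = residualize(toy)
```

By construction the residuals average to zero within each cell, so differences in age composition across years do not drive the trends in the residual distribution.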


Figure B.3: Top income inequality: Pareto tail at top 5%

[Figure B.3, two panels. Horizontal axis: log y_it. Vertical axis: log(1−CDF). (a) Men: 2005 level (slope: −1.61) and 2015 level (slope: −1.74). (b) Women: 2005 level (slope: −2.10) and 2015 level (slope: −2.11).]

Note: Based on authors’ calculations with data from IMSS. Using the CS+TMax sample, this figure plots against log earnings and for selected years in the sample the following variables: (a) Men: log counter cumulative distribution of earnings; (b) Women: log counter cumulative distribution of earnings. The log counter cumulative distribution is calculated as log(1−CDF). The estimated tail index for a power law distribution in the upper tail is reported in parentheses. Since the data are top coded and the top percentiles imputed, the figure reports top income inequality at the top 5% of the distribution of log earnings.
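Under a power law, log(1−CDF) is linear in log earnings with slope equal to minus the tail index, which is why the figure reports slopes. The sketch below estimates that slope by ordinary least squares on the top 5% of a sample. The note does not say how the reported slopes were estimated, so the OLS fit, the function name, and the simulated Pareto data are assumptions.

```python
import math
import random

def tail_slope(earnings, top_share=0.05):
    """OLS slope of log(1 - CDF) on log earnings within the top `top_share`."""
    x = sorted(earnings)
    n = len(x)
    k = max(2, int(n * top_share))  # number of tail observations
    tail = x[n - k:]
    log_y = [math.log(y) for y in tail]
    # Empirical counter-cumulative distribution: share of observations >= y.
    log_ccdf = [math.log((k - i) / n) for i in range(k)]
    mean_y = sum(log_y) / k
    mean_c = sum(log_ccdf) / k
    sxy = sum((a - mean_y) * (b - mean_c) for a, b in zip(log_y, log_ccdf))
    sxx = sum((a - mean_y) ** 2 for a in log_y)
    return sxy / sxx

# Simulated Pareto earnings with tail index 2 (invented data): slope close to -2.
rng = random.Random(1)
sample = [rng.paretovariate(2.0) for _ in range(200000)]
slope = tail_slope(sample)
```

A flatter slope (smaller in absolute value) corresponds to a thicker upper tail, that is, to more top income inequality.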

Figure B.4: Changes in income share relative to 2005

[Figure B.4, two panels, each plotted against year (2005–2019). Vertical axis: change in income shares relative to 2005. Panel (a), labeled “Men” in the figure: series Q1, Q2, Q3, Q4, Q5. Panel (b), labeled “Women” in the figure: series Bottom 50%, Top 10%, Top 5%, Top 1%.]

Note: Based on authors’ calculations with data from IMSS. Using the CS+TMax sample, this figure plots against time changes in the distribution of income shares relative to 2005 for the whole population: (a) changes in the quintiles of the income shares distribution; (b) changes in selected portions of the income shares distribution. Quintiles are normalized to 0 in 2005, the first available year. Since the data are top coded and the top percentiles imputed, the figure reports changes in income shares only up until the top 1% of the distribution. Shaded areas are recessions.


Figure B.5: Evolution of the Gini coefficient

[Figure B.5. Horizontal axis: year, 2005 to 2019. Vertical axis: Gini coefficient, 0.54 to 0.57.]

Note: Based on authors’ calculations with data from IMSS. Using the CS+TMax sample, this figure plots against time the Gini coefficient for the whole population. A Gini coefficient equal to 0 expresses perfect equality in the income distribution, while a Gini coefficient equal to 1 expresses maximal inequality. Shaded areas are recessions.
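For reference, a Gini coefficient can be computed from sorted incomes with the standard rank-weighted formula, G = 2 Σ i·x_i / (n Σ x_i) − (n + 1)/n. The sketch below is a generic textbook implementation and not necessarily the estimator used for the figure; the function name is hypothetical.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes via the rank-weighted formula.
    In a finite sample of size n the maximum attainable value is (n - 1) / n."""
    x = sorted(incomes)
    n = len(x)
    total = sum(x)
    if n == 0 or total <= 0:
        raise ValueError("need at least one observation and a positive total")
    weighted = sum(rank * value for rank, value in enumerate(x, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

equal = gini([5, 5, 5, 5])         # perfect equality
concentrated = gini([0, 0, 0, 1])  # one person holds all income
```

With four observations, full concentration gives 0.75 rather than 1; the upper bound of 1 mentioned in the note is approached as the sample size grows.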

Figure B.6: Dispersion of five-year earnings changes

[Figure B.6, two panels: (a) Men and (b) Women. Horizontal axis: year, 2007 to 2015. Vertical axis: dispersion of g^5_it, 0.8 to 1.2. Series: P90–P50 and P50–P10.]

Note: Based on authors’ calculations with data from IMSS. Using the LS+TMax sample, this figure plots against time the following measures of top- and bottom-tail dispersion of the distribution of five-year earnings changes: (a) Men: P90–P50 and P50–P10 differentials; (b) Women: P90–P50 and P50–P10 differentials. Shaded areas are recessions.


Figure B.7: Skewness and kurtosis of five-year earnings changes

[Figure B.7, two panels, each plotted against year (2007–2015). (a) Kelley skewness: skewness of g^5_it; series Women and Men. (b) Excess Crow-Siddiqui kurtosis: excess kurtosis of g^5_it; series Women and Men.]

Note: Based on authors’ calculations with data from IMSS. Using the LS+TMax sample, this figure plots against time the following higher order moments of the distribution of five-year earnings changes: (a) Men and Women: Kelley skewness calculated as $\frac{(P90 - P50) - (P50 - P10)}{P90 - P10}$; (b) Men and Women: Excess Crow-Siddiqui kurtosis calculated as $\frac{P97.5 - P2.5}{P75 - P25} - 2.91$, where the first term is the Crow-Siddiqui measure of kurtosis and 2.91 corresponds to the value of this measure for the Normal distribution. Shaded areas are recessions.
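The two percentile-based measures defined in the note translate directly into code. The sketch below is illustrative: the function names and the linear-interpolation percentile rule are assumptions, and the last lines verify the 2.91 Gaussian benchmark from the standard normal quantile function.

```python
from statistics import NormalDist

def percentile(sorted_x, p):
    """Percentile p in [0, 100], by linear interpolation between order statistics."""
    pos = (len(sorted_x) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_x) - 1)
    return sorted_x[lo] + (pos - lo) * (sorted_x[hi] - sorted_x[lo])

def kelley_skewness(changes):
    """((P90 - P50) - (P50 - P10)) / (P90 - P10); negative when the lower tail is longer."""
    x = sorted(changes)
    p10, p50, p90 = (percentile(x, p) for p in (10, 50, 90))
    return ((p90 - p50) - (p50 - p10)) / (p90 - p10)

def excess_crow_siddiqui(changes):
    """(P97.5 - P2.5) / (P75 - P25) minus 2.91, the Gaussian value of the ratio."""
    x = sorted(changes)
    p2_5, p25, p75, p97_5 = (percentile(x, p) for p in (2.5, 25, 75, 97.5))
    return (p97_5 - p2_5) / (p75 - p25) - 2.91

# Gaussian benchmark for the Crow-Siddiqui ratio (about 2.906).
z = NormalDist().inv_cdf
normal_benchmark = (z(0.975) - z(0.025)) / (z(0.75) - z(0.25))

symmetric = kelley_skewness([-2, -1, 0, 1, 2])     # no skew
left_skewed = kelley_skewness([-10, -1, 0, 1, 2])  # long lower tail
```

Kelley skewness lies between −1 and 1 and measures how much of the P90–P10 range comes from the upper half rather than the lower half, which makes it robust to extreme observations.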


Figure B.8: Dispersion, skewness, and kurtosis of five-year earnings changes

[Figure B.8, six panels. Horizontal axis in all panels: quantiles of permanent income P_{it−1}, 0 to 95. Series in all panels: age groups [25–34], [35–44], [45–55]. (a) Men and (b) Women: P90–P10 differential of g^5_it. (c) Men and (d) Women: Kelley skewness of g^5_it. (e) Men and (f) Women: excess Crow-Siddiqui kurtosis of g^5_it.]

Note: Based on authors’ calculations with data from IMSS. Using the H+TMax sample, this figure plots against percentiles of the permanent income distribution, and for three different age groups, the following moments of the distribution of five-year earnings changes: (a) and (b) Men and Women: P90–P10 differential; (c) and (d) Men and Women: Kelley skewness; (e) and (f) Men and Women: Excess Crow-Siddiqui kurtosis. Since the data are top coded, the percentiles of the permanent income distribution are plotted only until P95.

Figure B.9: Standardized moments of earnings changes

[Figure B.9, six panels. Horizontal axis in all panels: quantiles of permanent income P_{it−1}, 0 to 95. Series in all panels: age groups [25–34], [35–44], [45–55]. (a) Men and (b) Women: standard deviation of g_it. (c) Men and (d) Women: skewness of g_it. (e) Men and (f) Women: excess kurtosis of g_it.]

Note: Based on authors’ calculations with data from IMSS. Using the H+TMax sample, this figure plots against percentiles of the permanent income distribution, and for three different age groups, the following standardized moments of the distribution of one-year earnings changes: (a) and (b) Men and Women: Standard deviation; (c) and (d) Men and Women: Coefficient of skewness; (e) and (f) Men and Women: Excess kurtosis. Excess kurtosis is calculated as γ − 3, where γ is the standard measure of kurtosis (i.e. the fourth standardized moment) and 3 corresponds to the value of this measure for the Normal distribution. Since the data are top coded, the percentiles of the permanent income distribution are plotted only until P95.
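The standardized moments in this figure are the conventional ones: the standard deviation, the third standardized moment (coefficient of skewness), and the fourth standardized moment minus 3 (excess kurtosis). A minimal sketch using population (divide-by-n) moments follows; whether the paper applies any small-sample correction is not stated, so that choice and the function name are assumptions.

```python
def standardized_moments(changes):
    """Standard deviation, skewness, and excess kurtosis from central moments."""
    n = len(changes)
    mean = sum(changes) / n
    m2 = sum((v - mean) ** 2 for v in changes) / n  # variance
    m3 = sum((v - mean) ** 3 for v in changes) / n
    m4 = sum((v - mean) ** 4 for v in changes) / n
    sd = m2 ** 0.5
    return {
        "sd": sd,
        "skewness": m3 / sd ** 3,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # gamma - 3
    }

# A symmetric two-point distribution has kurtosis 1, hence excess kurtosis -2.
two_point = standardized_moments([-1.0, 1.0, -1.0, 1.0])
```

Unlike the percentile-based measures of figure B.8, these moments are sensitive to a small number of extreme earnings changes, which is one reason to report both sets.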

Figure B.10: Evolution of 5-year mobility over the life cycle

[Figure B.10, two panels: (a) Men and (b) Women. Horizontal axis: percentiles of permanent income, P_it, 0 to 99.9. Vertical axis: mean percentiles of P_{it+5}, 0 to 100. Series: age groups [25–34], [35–44], [45–55]. Annotation in both panels: “Top 0.1% of P_{it−1}”.]

Note: Based on authors’ calculations with data from IMSS. The figure shows average rank-rank short-term (5-year) mobility for male (a) and female (b) workers of different ages.
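An average rank-rank mobility curve of this kind assigns each worker a percentile rank in the permanent income distribution at t and at t+5, and then averages the future rank within bins of the current rank. The sketch below illustrates the idea with ten bins. The function names, the binning, and the tie-breaking rule are assumptions and do not reproduce the paper's exact procedure.

```python
from collections import defaultdict

def percentile_ranks(values):
    """Rank each value on a 0-100 scale (ties broken by input order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = 100.0 * (position + 0.5) / len(values)
    return ranks

def mean_future_rank(income_now, income_future, n_bins=10):
    """Mean percentile rank at t+5 within each bin of the percentile rank at t."""
    now = percentile_ranks(income_now)
    future = percentile_ranks(income_future)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r_now, r_future in zip(now, future):
        bin_index = min(int(r_now * n_bins / 100.0), n_bins - 1)
        sums[bin_index] += r_future
        counts[bin_index] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}

# Two extreme toy cases (invented data) for 100 workers.
incomes = list(range(100))
no_mobility = mean_future_rank(incomes, incomes)           # ranks unchanged
full_reversal = mean_future_rank(incomes, incomes[::-1])   # ranks reversed
```

With no mobility the curve coincides with the 45-degree line, while a flatter curve indicates more reshuffling of ranks over the five-year horizon.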

Figure B.11: Evolution of 5-year mobility over time

[Figure B.11, two panels: (a) Men and (b) Women. Horizontal axis: percentiles of permanent income, P_it, 0 to 99.9. Vertical axis: mean percentiles of P_{it+5}, 0 to 100. Series: 2007 and 2014. Annotation in both panels: “Top 0.1% of P_{it−1}”.]

Note: Based on authors’ calculations with data from IMSS. The figure shows average rank-rank short-term (5-year) mobility for male (a) and female (b) workers in selected years of the sample, 2007 and 2014.


Figure B.12: Empirical log-densities of one-year earnings changes

[Figure B.12, two panels. Horizontal axis: One-Year Log Earnings Growth, −2.5 to 2.5. Vertical axis: Density. Series: data density and Gaussian density. (a) Men: St. Dev. 0.66, Skewness −0.32, Kurtosis 8.17; Gaussian benchmark N(0, 0.66²). (b) Women: St. Dev. 0.65, Skewness −0.36, Kurtosis 8.15; Gaussian benchmark N(0, 0.65²).]

Note: Based on authors’ calculations with data from IMSS. The figure shows the log-density of the distribution of one-year earnings changes for men (a) and women (b) in 2005.

Figure B.13: Empirical log-densities of five-year earnings changes

[Figure B.13, two panels. Horizontal axis: Log Earnings Growth, −2.5 to 2.5. Vertical axis: Density. Series: data density and Gaussian density. (a) Men: St. Dev. 0.91, Skewness −0.41, Kurtosis 6.03; Gaussian benchmark N(0, 0.91²). (b) Women: St. Dev. 0.91, Skewness −0.41, Kurtosis 6.03; Gaussian benchmark N(0, 0.91²).]

Note: Based on authors’ calculations with data from IMSS. The figure shows the log-density of the distribution of five-year earnings changes for men (a) and women (b) in 2005.


Figure B.14: Empirical log-densities of one-year earnings changes

[Figure B.14, two panels. Horizontal axis: One-Year Log Earnings Growth, −2.5 to 2.5. Vertical axis: Density. Series: data density and Gaussian density. (a) Men: St. Dev. 0.65, Skewness −0.30, Kurtosis 8.38; Gaussian benchmark N(0, 0.65²). (b) Women: St. Dev. 0.63, Skewness −0.38, Kurtosis 8.55; Gaussian benchmark N(0, 0.63²).]

Note: Based on authors’ calculations with data from IMSS. The figure shows the log-density of the distribution of one-year earnings changes for men (a) and women (b) in 2010.

Figure B.15: Empirical log-densities of five-year earnings changes

[Figure B.15, two panels. Horizontal axis: Log Earnings Growth, −2.5 to 2.5. Vertical axis: Density. Series: data density and Gaussian density. (a) Men: St. Dev. 0.90, Skewness −0.29, Kurtosis 6.04; Gaussian benchmark N(0, 0.90²). (b) Women: St. Dev. 0.90, Skewness −0.29, Kurtosis 6.04; Gaussian benchmark N(0, 0.90²).]

Note: Based on authors’ calculations with data from IMSS. The figure shows the log-density of the distribution of five-year earnings changes for men (a) and women (b) in 2010.


C Appendix: Comparison of Administrative Data with Household Survey Data

As the administrative data from IMSS only provide information on formal workers and since a large share of the labor force in Mexico is employed in the informal sector, it is likely that the results of section 2 are not representative of all Mexican workers. Information on informal workers is limited to what is collected through the Encuesta Nacional de Ocupación y Empleo (ENOE), the main household survey on employment in Mexico.

While the underlying structure of this survey does not allow for perfectly replicating the statistics of section 2, we adopt the same criteria and restrictions to construct a sample of workers present in the ENOE that is as close as possible to the one we build with the administrative data, and we calculate statistics analogous to those of sections 2.3 and 2.4. These statistics are presented in figure C.1 for the distribution of log earnings and in figure C.2 for the distribution of one-year earnings changes.

The results of the comparison between IMSS- and ENOE-based statistics for formal sector workers are somewhat mixed. For the evolution of the log earnings distribution, we find that IMSS-based percentiles are stable throughout the sample period, with the exception of the lowest percentiles, which show an upward trend from 2016 onward. The ENOE-based distribution, on the other hand, displays a generalized downward trend, with a tendency to level off from 2016 on. For the dispersion of log earnings, both IMSS-based and ENOE-based statistics are quite stable until 2016 and then trend slightly downward. However, all the measures of dispersion under consideration (overall, top, and bottom dispersion) register substantially higher levels when calculated with administrative data. Regarding the distribution of one-year (residualized) income changes, IMSS-based percentiles are stable throughout the period, with the exception of a noticeable decrease during the financial crisis and a downward trend from 2013 onward for the highest percentiles.
ENOE-based percentiles are also stable throughout the period, with the exception of the highest percentiles, which display ups and downs over time. Dispersion of one-year (residualized) income changes calculated with the IMSS data is quite stable, except during the financial crisis, when we observe a brief but sharp increase in dispersion of the lower tail and a sharp decrease in dispersion of the upper tail of the distribution. The same pattern is not observed in the measures of dispersion calculated with the ENOE data, as they remain lower and fairly stable throughout the sample period.

There are certain general features of the distribution of earnings and earnings changes that are indeed common to both IMSS and ENOE, such as the decrease in earnings dispersion from 2016 or the relative stability in the dispersion of transitory income shocks for most of the sample period. But there are also significant differences across the two data sources. For instance, the shock associated with the financial crisis is evident in a number of IMSS-based statistics (i.e. percentiles of log earnings, percentiles of earnings changes, and dispersion of earnings changes), while ENOE-based statistics do not display notable changes around this recessionary period. Regarding the discrepancies between IMSS- and ENOE-based statistics, we emphasize that several potential issues underlying the calculation of the statistics based on the household survey data may be responsible for these discrepancies. Specifically, it is important to consider that: (i) measurement error could be present because income is self-reported and households could be reporting a measure of income broader than the one captured by the IMSS data; (ii) the share of observations with missing incomes in ENOE is, on average, around 25%, and non-response is not random but concentrated among formally employed workers and workers with higher levels of education; and (iii) ENOE is specifically designed to collect information on employment outcomes, not income.

For the sake of providing a picture of income dynamics and inequality that is as accurate and comprehensive as possible, we also contrast the evolution of the relevant statistics that can be constructed using only the household survey data across formal and informal workers during the period 2005–2019. Once again, we want to stress that we do not consider the household survey data to be the most accurate and best-suited source of information for characterizing the income distribution of Mexican workers and its dynamics. Hence, the differences between the statistics we calculate for these comparative exercises and those calculated in section 2 do not invalidate, and should not cast doubt on, the quality and the value of the analysis carried out in the first part of the paper. Figures C.3 and C.4 illustrate these statistics.

For the percentiles of log earnings, we find that for both formal and informal workers there is a downward trend in real earnings across the entire distribution, and this trend is more pronounced for the top percentiles. Toward the end of the sample period, earnings start to level out and, for the lowest percentiles of the distribution, even show a slight upward trend. For informal workers the leveling out begins somewhat earlier and the upward trend at the bottom of the distribution is more sudden and substantial.
As for measures of earnings dispersion, we see a qualitatively similar pattern for both formal and informal workers, with a downward trend over time for all measures under consideration that is slightly more evident for informal workers. Informal workers also display marginally lower levels of earnings dispersion, overall and at the top and bottom of the distribution.

Looking at the distribution of one-year (residualized) income changes, we see very stable patterns for formal workers, with little change through time and across the entire distribution, except in the very top percentiles where we observe noticeable increases and decreases in earnings. For informal workers the top percentiles (P90 and P95) display a downward trend, while the lower percentiles (P5 through P75) remain fairly stable over time with a slight increase since 2015 at the very bottom of the distribution (P5 and P10). The very top percentiles (P99 and P99.9) vary significantly between 2005 and 2019 for informal workers as well. As for the dispersion of one-year (residualized) income changes, we find that, for both formal and informal workers, lower and upper tail dispersion has remained relatively stable over time, with a slight downward trend since 2013 for informal workers, suggesting that overall dispersion (P90–P10) has decreased modestly for these workers at the end of our period of study. On average, dispersion of transitory shocks is higher for informal workers than for formal ones.

The formal versus informal comparison with the household survey data reveals three main patterns. Specifically, informal workers experience: a more pronounced upward trend in (log) earnings for the lowest percentiles of the earnings distribution; a slightly lower earnings dispersion; and a higher dispersion of transitory income shocks. Taken together, these results suggest that the employment opportunities offered in the informal sector provide more homogeneous (less unequal and less dispersed) earnings than those in the formal sector, but these earnings are subject to more volatile shocks.


Figure C.1: Comparison between administrative and survey data: distribution of log real earnings

[Figure: six panels, each plotted over 2005–2019. (a) IMSS and (b) ENOE formal workers: percentiles of log earnings relative to 2005 (P5, P10, P25, P50, P75, P90, P95). (c) IMSS and (d) ENOE formal workers: dispersion of log earnings (2.56*σ and P90–P10). (e) IMSS and (f) ENOE formal workers: dispersion of log earnings (P90–P50 and P50–P10).]

Note: Based on authors’ calculations with data from IMSS and ENOE. Using the CS+TMax sample for the IMSS data and a sample from ENOE (only formal workers with access to social security) constructed to match the CS+TMax sample as closely as possible, this figure plots against time the following statistics of the distribution of log earnings: (a) IMSS: P5, P10, P25, P50, P75, P90, P95; (b) ENOE formal workers: P5, P10, P25, P50, P75, P90, P95 (all percentiles are normalized to 0 in 2005, the first available year); (c) IMSS: P90–P10 and 2.56*σ; (d) ENOE formal workers: P90–P10 and 2.56*σ; (e) IMSS: P90–P50 and P50–P10; (f) ENOE formal workers: P90–P50 and P50–P10. Shaded areas are recessions.
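The dispersion measures listed in the note to Figure C.1 can be illustrated with a minimal sketch. For a normal distribution, P90–P10 equals roughly 2.56 standard deviations, which is why 2.56*σ serves as a Gaussian benchmark for P90–P10. The function name `dispersion_measures` and the simulated sample are assumptions made for illustration only; they are not the IMSS or ENOE data, nor the code used in the paper.

```python
import random
import statistics


def dispersion_measures(log_earnings):
    """Compute percentile-based and sigma-based dispersion of log earnings."""
    # quantiles(n=100) returns 99 cut points; index i holds percentile i + 1.
    cuts = statistics.quantiles(log_earnings, n=100, method="inclusive")
    p10, p50, p90 = cuts[9], cuts[49], cuts[89]
    sigma = statistics.pstdev(log_earnings)
    return {
        "P90-P10": p90 - p10,        # overall dispersion
        "P90-P50": p90 - p50,        # upper-tail dispersion
        "P50-P10": p50 - p10,        # lower-tail dispersion
        "2.56*sigma": 2.56 * sigma,  # Gaussian benchmark for P90-P10
    }


# Hypothetical, simulated log earnings (normal by construction).
rng = random.Random(0)
simulated = [rng.gauss(8.0, 0.7) for _ in range(50_000)]
measures = dispersion_measures(simulated)

for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

On normally distributed data the two overall measures nearly coincide and the two tail measures are nearly equal, so gaps between them in actual data point to departures from normality.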

Figure C.2: Comparison between administrative and survey data: distribution of one-year log earnings changes

[Figure: four panels, each plotted over 2005–2017. (a) IMSS and (b) ENOE formal workers: percentiles of one-year log earnings changes ($g^{1}_{it}$) relative to 2005 (P5, P10, P25, P50, P75, P90, P95, P99, P99.9). (c) IMSS and (d) ENOE formal workers: dispersion of $g^{1}_{it}$ (P90–P50 and P50–P10).]

Note: Based on authors’ calculations with data from IMSS and ENOE. Using the CS+TMax sample for the IMSS data and a sample from ENOE (only formal workers with access to social security) constructed to match the CS+TMax sample as closely as possible, this figure plots against time the following statistics of the distribution of one-year log earnings changes: (a) IMSS: P5, P10, P25, P50, P75, P90, P95, P99, P99.9; (b) ENOE formal workers: P5, P10, P25, P50, P75, P90, P95, P99, P99.9 (all percentiles are normalized to 0 in 2005, the first available year); (c) IMSS: P90–P50 and P50–P10; (d) ENOE formal workers: P90–P50 and P50–P10. Shaded areas are recessions.


Figure C.3: Comparison between formal and informal workers in the survey data: distribution of log real earnings

[Figure: six panels, each plotted over 2005–2019. (a) ENOE formal workers and (b) ENOE informal workers: percentiles of log earnings relative to 2005 (P5, P10, P25, P50, P75, P90, P95). (c) ENOE formal workers and (d) ENOE informal workers: dispersion of log earnings (2.56*σ and P90–P10). (e) ENOE formal workers and (f) ENOE informal workers: dispersion of log earnings (P90–P50 and P50–P10).]

Note: Based on authors’ calculations with data from ENOE. Using a sample from ENOE (with both formal workers with access to social security and informal workers) constructed to match the CS+TMax sample as closely as possible, this figure plots against time the following statistics of the distribution of log earnings: (a) ENOE formal workers: P5, P10, P25, P50, P75, P90, P95; (b) ENOE informal workers: P5, P10, P25, P50, P75, P90, P95 (all percentiles are normalized to 0 in 2005, the first available year); (c) ENOE formal workers: P90–P10 and 2.56*σ; (d) ENOE informal workers: P90–P10 and 2.56*σ; (e) ENOE formal workers: P90–P50 and P50–P10; (f) ENOE informal workers: P90–P50 and P50–P10. Shaded areas are recessions.

Figure C.4: Comparison between formal and informal workers in the survey data: distribution of one-year log earnings changes

[Figure: four panels, each plotted over 2005–2017. (a) ENOE formal workers and (b) ENOE informal workers: percentiles of one-year log earnings changes ($g^{1}_{it}$) relative to 2005 (P5, P10, P25, P50, P75, P90, P95, P99, P99.9). (c) ENOE formal workers and (d) ENOE informal workers: dispersion of $g^{1}_{it}$ (P90–P50 and P50–P10).]

Note: Based on authors’ calculations with data from ENOE. Using a sample from ENOE (with both formal workers with access to social security and informal workers) constructed to match the CS+TMax sample as closely as possible, this figure plots against time the following statistics of the distribution of one-year log earnings changes: (a) ENOE formal workers: P5, P10, P25, P50, P75, P90, P95, P99, P99.9; (b) ENOE informal workers: P5, P10, P25, P50, P75, P90, P95, P99, P99.9 (all percentiles are normalized to 0 in 2005, the first available year); (c) ENOE formal workers: P90–P50 and P50–P10; (d) ENOE informal workers: P90–P50 and P50–P10. Shaded areas are recessions.


D Appendix: Additional Results for Income Dynamics and the Role of the Informal Sector

This appendix includes additional information and results that complement what we presented in section 3 (Part II) of the main text.

D.1 Transitions In and Out of Formal Employment

The following tables report selected estimated coefficients for the regressions presented and discussed in section 3.1.

Table D.1: Estimates of wages trajectories (log differences) for workers who exit and re-enter the formal labor market

Dependent variable: Log wages

| Independent variables | Men | Women |
|---|---|---|
| t = −2 | 0.062*** (0.003) | 0.034*** (0.004) |
| t = −1 | 0.047*** (0.002) | 0.032*** (0.002) |
| t = 0 | 0.000 (·) | 0.000 (·) |
| t = 1 | –0.154*** (0.003) | –0.147*** (0.004) |
| t = 2 | –0.081*** (0.004) | –0.063*** (0.005) |
| t = 3 | –0.045*** (0.004) | –0.013** (0.006) |
| Constant | 8.164*** (0.016) | 8.230*** (0.022) |
| N. of Observations | 682,248 | 389,856 |

Note: Based on authors’ estimates with data from IMSS. The table reports estimates of the coefficient $\beta_{\tau}$ from equation (3.1). All specifications include sector of economic activity, state, and year fixed effects. Standard errors (in parentheses) are clustered at the worker level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).
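Because the coefficients in Table D.1 are log differences relative to the reference period t = 0, they can be read as approximate percentage gaps. The sketch below converts them into exact percentage changes using exp(β) − 1. The helper name `log_points_to_percent` is an assumption introduced for illustration; the conversion is standard arithmetic on the reported point estimates and ignores their standard errors.

```python
import math

# Point estimates from Table D.1 (log points relative to t = 0).
coefficients = {
    "Men": {1: -0.154, 2: -0.081, 3: -0.045},
    "Women": {1: -0.147, 2: -0.063, 3: -0.013},
}


def log_points_to_percent(beta):
    """Exact percentage change implied by a log difference of `beta`."""
    return 100.0 * (math.exp(beta) - 1.0)


percent_gaps = {
    group: {t: log_points_to_percent(beta) for t, beta in betas.items()}
    for group, betas in coefficients.items()
}

for group, gaps in percent_gaps.items():
    for t, gap in gaps.items():
        print(f"{group}, t = {t}: {gap:.1f}% relative to t = 0")
```

The gaps shrink in absolute value between t = 1 and t = 3 for both men and women but remain negative, in line with the gradual recovery described in section 3.1.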


Table D.2: Estimates of wages trajectories (levels) for workers who exit and re-enter the formal labor market

Dependent variable: Log wages

| Independent variables | Men | Women |
|---|---|---|
| t = −2 × 1 year | 8.226*** (0.015) | 8.264*** (0.021) |
| t = −2 × 2 years | 8.206*** (0.016) | 8.225*** (0.021) |
| t = −2 × 3 years | 8.199*** (0.016) | 8.215*** (0.022) |
| t = −2 × 4 years | 8.178*** (0.017) | 8.209*** (0.022) |
| t = −2 × 5 years | 8.167*** (0.018) | 8.199*** (0.023) |
| t = −2 × 6 years | 8.178*** (0.019) | 8.193*** (0.024) |
| t = −2 × 7 years | 8.179*** (0.021) | 8.175*** (0.026) |
| t = −2 × 8 years | 8.167*** (0.026) | 8.189*** (0.029) |
| t = −2 × 9 years | 8.189*** (0.038) | 8.188*** (0.038) |
| t = −1 × 1 year | 8.211*** (0.016) | 8.261*** (0.022) |
| t = −1 × 2 years | 8.190*** (0.016) | 8.226*** (0.022) |
| t = −1 × 3 years | 8.180*** (0.017) | 8.210*** (0.023) |
| t = −1 × 4 years | 8.162*** (0.018) | 8.213*** (0.023) |
| t = −1 × 5 years | 8.160*** (0.018) | 8.193*** (0.024) |
| t = −1 × 6 years | 8.171*** (0.020) | 8.188*** (0.025) |
| t = −1 × 7 years | 8.160*** (0.022) | 8.166*** (0.027) |
| t = −1 × 8 years | 8.144*** (0.027) | 8.174*** (0.030) |
| t = −1 × 9 years | 8.183*** (0.038) | 8.169*** (0.039) |
| t = 0 × 1 year | 8.164*** (0.016) | 8.230*** (0.022) |
| t = 0 × 2 years | 8.145*** (0.017) | 8.195*** (0.023) |
| t = 0 × 3 years | 8.134*** (0.017) | 8.183*** (0.023) |
| t = 0 × 4 years | 8.116*** (0.018) | 8.182*** (0.024) |
| t = 0 × 5 years | 8.118*** (0.019) | 8.170*** (0.024) |
| t = 0 × 6 years | 8.124*** (0.020) | 8.150*** (0.026) |
| t = 0 × 7 years | 8.113*** (0.023) | 8.145*** (0.028) |
| t = 0 × 8 years | 8.067*** (0.027) | 8.125*** (0.030) |
| t = 0 × 9 years | 8.156*** (0.039) | 8.127*** (0.041) |
| t = 1 × 1 year | 8.010*** (0.017) | 8.083*** (0.023) |
| t = 1 × 2 years | 7.992*** (0.017) | 8.054*** (0.023) |
| t = 1 × 3 years | 7.968*** (0.018) | 8.027*** (0.024) |
| t = 1 × 4 years | 7.943*** (0.019) | 8.007*** (0.025) |
| t = 1 × 5 years | 7.932*** (0.019) | 8.000*** (0.025) |
| t = 1 × 6 years | 7.921*** (0.021) | 7.996*** (0.027) |
| t = 1 × 7 years | 7.897*** (0.022) | 7.967*** (0.027) |
| t = 1 × 8 years | 7.909*** (0.027) | 8.023*** (0.031) |
| t = 1 × 9 years | 7.933*** (0.037) | 7.951*** (0.036) |
| t = 2 × 1 year | 8.083*** (0.017) | 8.167*** (0.023) |
| t = 2 × 2 years | 8.059*** (0.018) | 8.136*** (0.024) |
| t = 2 × 3 years | 8.040*** (0.018) | 8.106*** (0.024) |
| t = 2 × 4 years | 8.011*** (0.019) | 8.092*** (0.025) |
| t = 2 × 5 years | 7.996*** (0.020) | 8.081*** (0.026) |
| t = 2 × 6 years | 7.985*** (0.021) | 8.078*** (0.027) |
| t = 2 × 7 years | 7.965*** (0.023) | 8.043*** (0.028) |
| t = 2 × 8 years | 7.984*** (0.028) | 8.087*** (0.032) |
| t = 2 × 9 years | 7.995*** (0.039) | 8.016*** (0.038) |
| t = 3 × 1 year | 8.119*** (0.017) | 8.216*** (0.024) |
| t = 3 × 2 years | 8.097*** (0.018) | 8.182*** (0.024) |
| t = 3 × 3 years | 8.070*** (0.018) | 8.155*** (0.025) |
| t = 3 × 4 years | 8.055*** (0.019) | 8.130*** (0.025) |
| t = 3 × 5 years | 8.041*** (0.020) | 8.137*** (0.026) |
| t = 3 × 6 years | 8.028*** (0.022) | 8.135*** (0.028) |
| t = 3 × 7 years | 7.985*** (0.023) | 8.095*** (0.029) |
| t = 3 × 8 years | 8.014*** (0.028) | 8.138*** (0.032) |
| t = 3 × 9 years | 8.020*** (0.038) | 8.089*** (0.039) |
| N. of Observations | 682,248 | 389,856 |

Note: Based on authors’ estimates with data from IMSS. The table reports estimates of the coefficient $\beta^{k}_{\tau}$ from equation $\ln(w_{it}) = \sum_{\tau=-2}^{3} \sum_{k=1}^{5} \beta^{k}_{\tau} I_{\tau} I_{k} + \gamma_{g} I_{g} + \alpha_{e} + \alpha_{s} + \alpha_{t} + \varepsilon_{it}$. All specifications include sector of economic activity, state, and year fixed effects. Standard errors (in parentheses) are clustered at the worker level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).


Table D.3: Estimates of wages trajectories of workers who left the formal labor market during the 2009 financial crisis

Dependent variable: Log wages

| Independent variables | Men | Women |
|---|---|---|
| t = −2 × 2009crisis = 0 | 8.202*** (0.015) | 8.224*** (0.021) |
| t = −2 × 2009crisis = 1 | 8.205*** (0.015) | 8.226*** (0.020) |
| t = −1 × 2009crisis = 0 | 8.179*** (0.015) | 8.214*** (0.021) |
| t = −1 × 2009crisis = 1 | 8.201*** (0.015) | 8.232*** (0.021) |
| t = 0 × 2009crisis = 0 | 8.128*** (0.016) | 8.182*** (0.022) |
| t = 0 × 2009crisis = 1 | 8.162*** (0.016) | 8.197*** (0.022) |
| t = 1 × 2009crisis = 0 | 7.980*** (0.017) | 8.040*** (0.023) |
| t = 1 × 2009crisis = 1 | 7.968*** (0.016) | 8.027*** (0.023) |
| t = 2 × 2009crisis = 0 | 8.055*** (0.017) | 8.126*** (0.023) |
| t = 2 × 2009crisis = 1 | 8.039*** (0.017) | 8.113*** (0.023) |
| t = 3 × 2009crisis = 0 | 8.095*** (0.017) | 8.179*** (0.023) |
| t = 3 × 2009crisis = 1 | 8.081*** (0.017) | 8.166*** (0.023) |
| N. of Observations | 682,248 | 389,856 |

Note: Based on authors’ estimates with data from IMSS. The table reports estimates of the coefficient $\beta^{c}_{\tau}$ from equation (3.2). All specifications include sector of economic activity, state, and year fixed effects. Standard errors (in parentheses) are clustered at the worker level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).

Table D.4: Estimates of wages trajectories: treatment vs control group with a 3-year window

Dependent variable: Log wages

| Independent variables | Men | Women |
|---|---|---|
| t = −2 × treated = 0 | 8.288*** (0.015) | 8.449*** (0.027) |
| t = −2 × treated = 1 | 8.038*** (0.015) | 8.134*** (0.027) |
| t = −1 × treated = 0 | 8.295*** (0.016) | 8.455*** (0.027) |
| t = −1 × treated = 1 | 8.011*** (0.016) | 8.115*** (0.027) |
| t = 0 × treated = 0 | 8.306*** (0.016) | 8.463*** (0.027) |
| t = 0 × treated = 1 | 7.955*** (0.016) | 8.070*** (0.027) |
| t = 1 × treated = 0 | 8.305*** (0.016) | 8.483*** (0.027) |
| t = 1 × treated = 1 | 7.765*** (0.016) | 7.867*** (0.027) |
| t = 2 × treated = 0 | 8.311*** (0.016) | 8.488*** (0.027) |
| t = 2 × treated = 1 | 7.829*** (0.016) | 7.940*** (0.027) |
| t = 3 × treated = 0 | 8.315*** (0.016) | 8.492*** (0.027) |
| t = 3 × treated = 1 | 7.863*** (0.016) | 7.981*** (0.027) |
| N. of Observations | 1,705,014 | 1,029,924 |

Note: Based on authors’ estimates with data from IMSS. The table reports estimates of the coefficient $\beta^{T}_{\tau}$ from equation (3.3). All specifications include sector of economic activity, state, and year fixed effects. Standard errors (in parentheses) are clustered at the worker level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).
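One way to read Table D.4 is to track how the wage gap between treated and control workers moves relative to its value at t = 0. The sketch below does this back-of-the-envelope arithmetic for men using the reported point estimates. It is an illustrative calculation only, not the estimator in equation (3.3), and it ignores the standard errors; the names `control`, `treated`, and `gap_change` are assumptions introduced here.

```python
# Point estimates for men from Table D.4 (log wages by period).
control = {-2: 8.288, -1: 8.295, 0: 8.306, 1: 8.305, 2: 8.311, 3: 8.315}
treated = {-2: 8.038, -1: 8.011, 0: 7.955, 1: 7.765, 2: 7.829, 3: 7.863}


def gap_change(t, base=0):
    """Change in the treated-minus-control gap between `base` and `t`."""
    gap_t = treated[t] - control[t]
    gap_base = treated[base] - control[base]
    return gap_t - gap_base


changes = {t: round(gap_change(t), 3) for t in sorted(treated)}

for t, change in changes.items():
    print(f"t = {t:+d}: gap change relative to t = 0 is {change:+.3f} log points")
```

Under these assumptions the gap widens sharply at t = 1 and narrows afterwards without returning to its t = 0 level within the three-year window.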


Table D.5: Estimates of wages trajectories: treatment vs control group with a 5-year window

Dependent variable: Log wages

| Independent variables | Men | Women |
|---|---|---|
| t = −4 × treated = 0 | 8.239*** (0.021) | 8.417*** (0.035) |
| t = −4 × treated = 1 | 8.063*** (0.021) | 8.206*** (0.036) |
| t = −3 × treated = 0 | 8.246*** (0.021) | 8.425*** (0.036) |
| t = −3 × treated = 1 | 8.053*** (0.021) | 8.202*** (0.036) |
| t = −2 × treated = 0 | 8.250*** (0.021) | 8.430*** (0.036) |
| t = −2 × treated = 1 | 8.037*** (0.021) | 8.184*** (0.036) |
| t = −1 × treated = 0 | 8.255*** (0.021) | 8.437*** (0.036) |
| t = −1 × treated = 1 | 8.003*** (0.022) | 8.156*** (0.037) |
| t = 0 × treated = 0 | 8.263*** (0.022) | 8.104*** (0.037) |
| t = 0 × treated = 1 | 7.942*** (0.022) | 8.452*** (0.037) |
| t = 1 × treated = 0 | 8.259*** (0.023) | 7.890*** (0.038) |
| t = 1 × treated = 1 | 7.736*** (0.024) | 8.456*** (0.039) |
| t = 2 × treated = 0 | 8.263*** (0.024) | 7.970*** (0.039) |
| t = 2 × treated = 1 | 7.806*** (0.024) | 8.458*** (0.039) |
| t = 3 × treated = 0 | 8.267*** (0.025) | 8.492*** (0.040) |
| t = 3 × treated = 1 | 7.846*** (0.025) | 8.015*** (0.040) |
| t = 4 × treated = 0 | 8.270*** (0.025) | 8.462*** (0.040) |
| t = 4 × treated = 1 | 7.873*** (0.026) | 8.044*** (0.041) |
| t = 5 × treated = 0 | 8.272*** (0.026) | 8.466*** (0.041) |
| t = 5 × treated = 1 | 7.890*** (0.026) | 8.069*** (0.042) |
| N. of Observations | 1,705,014 | 1,029,924 |

Note: Based on authors’ estimates with data from IMSS. The table reports estimates of the coefficient $\beta^{T}_{\tau}$ from a specification similar to equation (3.3), widening the time window to 5 years. All specifications include sector of economic activity, state, and year fixed effects. Standard errors (in parentheses) are clustered at the worker level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).


D.2 Cohort Analysis

In this section we present additional results from the specifications in section 3.2.1. Here we replace the separate region and year fixed effects with region-year fixed effects, keeping the rest of the specifications unchanged. As shown in tables D.6–D.10, the results from these alternative specifications are largely consistent with those reported in tables 2–6 in the main text.

Table D.6: Impact of initial labor market conditions on long-term formal employment

Dependent variable: Fraction of cohort formally employed

| Independent variables | Low-educated Men | Low-educated Women | High-educated Men | High-educated Women |
|---|---|---|---|---|
| Initial informality | 0.437*** (0.097) | 0.166** (0.058) | 0.257*** (0.089) | –0.031 (0.090) |
| Initial informality × Experience (10–15) | –0.281*** (0.034) | –0.168*** (0.027) | –0.106*** (0.039) | –0.065 (0.045) |
| Initial informality × Experience (>15) | –0.682*** (0.055) | –0.351*** (0.037) | –0.318*** (0.048) | –0.233*** (0.055) |
| Initial unemployment | 0.037 (0.324) | 0.354 (0.255) | –0.134 (0.323) | –0.243 (0.326) |
| Initial unemployment × Experience (10–15) | –0.108 (0.275) | –0.341* (0.198) | 0.048 (0.295) | –0.057 (0.249) |
| Initial unemployment × Experience (>15) | –0.036 (0.335) | –0.167 (0.220) | –0.108 (0.316) | –0.066 (0.279) |
| N. of Observations | 5,155 | 5,123 | 5,883 | 5,928 |

Note: Based on authors’ estimates with data from ENE and ENOE. All specifications include potential experience, cohort and region-year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).
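With interaction terms of this kind, the implied association between initial informality and the outcome at a given experience band is the sum of the main coefficient and the corresponding interaction. The sketch below works this out for low-educated men in Table D.6. It assumes that the main coefficient refers to the omitted (lowest) experience band, which is labeled `baseline` here as a hypothetical name; standard errors for the sums would require covariances that the table does not report.

```python
# Low-educated men, Table D.6: main effect of initial informality
# and its interactions with the experience bands.
main_effect = 0.437
interactions = {
    "baseline": 0.0,   # omitted experience band (assumed reference group)
    "10-15": -0.281,
    ">15": -0.682,
}

# Implied coefficient on initial informality within each experience band.
implied = {
    band: round(main_effect + shift, 3) for band, shift in interactions.items()
}

for band, value in implied.items():
    print(f"Experience {band}: implied coefficient {value:+.3f}")
```

Under this reading, the positive association in the reference band falls at 10–15 years of experience and turns negative beyond 15 years.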


Table D.7: Impact of initial labor market conditions on long-term informal employment

Dependent variable: Fraction of cohort informally employed

| Independent variables | Low-educated Men | Low-educated Women | High-educated Men | High-educated Women |
|---|---|---|---|---|
| Initial informality | –0.661*** (0.129) | –0.082 (0.078) | –0.427*** (0.108) | –0.213*** (0.083) |
| Initial informality × Experience (10–15) | 0.250*** (0.044) | –0.002 (0.029) | 0.082* (0.045) | –0.054 (0.041) |
| Initial informality × Experience (>15) | 0.593*** (0.073) | 0.085** (0.041) | 0.285*** (0.061) | 0.015 (0.049) |
| Initial unemployment | 0.171 (0.443) | 0.072 (0.264) | 0.175 (0.358) | 0.236 (0.349) |
| Initial unemployment × Experience (10–15) | 0.518 (0.344) | 0.123 (0.237) | 0.100 (0.318) | –0.285 (0.318) |
| Initial unemployment × Experience (>15) | 0.508 (0.445) | 0.443* (0.264) | 0.312 (0.360) | –0.272 (0.349) |
| N. of Observations | 5,155 | 5,123 | 5,883 | 5,928 |

Note: Based on authors’ estimates with data from ENE and ENOE. All specifications include potential experience, cohort and region-year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).

Table D.8: Impact of initial labor market conditions on long-term unemployment

Dependent variable: Fraction of cohort unemployed

| Independent variables | Low-educated Men | Low-educated Women | High-educated Men | High-educated Women |
|---|---|---|---|---|
| Initial informality | –0.002 (0.048) | –0.071** (0.028) | 0.020 (0.051) | 0.018 (0.040) |
| Initial informality × Experience (10–15) | –0.052*** (0.020) | –0.008 (0.012) | –0.024 (0.024) | –0.015 (0.019) |
| Initial informality × Experience (>15) | –0.077** (0.026) | –0.029* (0.017) | –0.011 (0.032) | –0.015 (0.025) |
| Initial unemployment | 0.147 (0.217) | 0.011 (0.123) | 0.201 (0.191) | 0.002 (0.174) |
| Initial unemployment × Experience (10–15) | –0.156 (0.192) | –0.037 (0.098) | –0.190 (0.175) | –0.139 (0.147) |
| Initial unemployment × Experience (>15) | –0.146 (0.234) | –0.021 (0.113) | –0.177 (0.179) | –0.054 (0.161) |
| N. of Observations | 5,155 | 5,123 | 5,883 | 5,928 |

Note: Based on authors’ estimates with data from ENE and ENOE. All specifications include potential experience, cohort and region-year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).


Table D.9: Impact of initial labor market conditions on long-term labor force participation

Dependent variable: Fraction of cohort not in the labor force

| Independent variables | Low-educated Men | Low-educated Women | High-educated Men | High-educated Women |
|---|---|---|---|---|
| Initial informality | 0.227** (0.089) | –0.014 (0.083) | 0.150 (0.111) | 0.226** (0.104) |
| Initial informality × Experience (10–15) | 0.083** (0.032) | 0.178*** (0.035) | 0.047 (0.053) | 0.134*** (0.050) |
| Initial informality × Experience (>15) | 0.166*** (0.049) | 0.295*** (0.048) | 0.045 (0.066) | 0.233*** (0.066) |
| Initial unemployment | –0.355 (0.328) | –0.437 (0.354) | –0.241 (0.447) | 0.005 (0.431) |
| Initial unemployment × Experience (10–15) | –0.254 (0.279) | 0.256 (0.269) | 0.043 (0.410) | 0.480 (0.329) |
| Initial unemployment × Experience (>15) | –0.326 (0.328) | –0.255 (0.327) | –0.027 (0.414) | 0.392 (0.396) |
| N. of Observations | 5,155 | 5,123 | 5,883 | 5,928 |

Note: All specifications include potential experience, cohort and region-year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).

Table D.10: Impact of initial labor market conditions on long-term labor earnings

Dependent variable: Log total earnings

| Independent variables | Low-educated Men | Low-educated Women | High-educated Men | High-educated Women |
|---|---|---|---|---|
| Initial informality | –0.325 (0.333) | –0.101 (0.463) | –0.351 (0.306) | –1.438*** (0.385) |
| Initial informality × Experience (10–15) | –0.458*** (0.124) | –1.020*** (0.178) | 0.048 (0.138) | 0.049 (0.155) |
| Initial informality × Experience (>15) | –1.085*** (0.186) | –1.732*** (0.270) | 0.221 (0.194) | 0.093 (0.254) |
| Initial unemployment | –0.165 (0.975) | 2.200 (1.721) | –1.375 (1.092) | –2.609* (1.436) |
| Initial unemployment × Experience (10–15) | 0.675 (0.748) | –1.070 (1.295) | 0.919 (0.873) | 0.759 (1.149) |
| Initial unemployment × Experience (>15) | 1.672 (1.068) | 1.133 (1.630) | 2.361** (1.143) | 2.792 (1.488) |
| N. of Observations | 5,119 | 5,070 | 5,630 | 5,674 |

Note: Based on authors’ estimates with data from ENE and ENOE. Labor earnings are computed with zeros to avoid conditioning on a potential outcome (labor force participation). Omitting zeros could result in selection bias since the composition of the labor force may change due to the treatment. The variable is constructed as the natural logarithm of the mean total income for each cell of cohort c, demographic group d, region r, and time t. All specifications include potential experience, cohort and region-year fixed effects. Standard errors (in parentheses) are clustered at the cohort-region level. Stars indicate significance levels (*p < 0.10, **p < 0.05, ***p < 0.01).


D.3 Individual Analysis

To correct for potential endogeneity, we instrument the formal status of a worker’s first job in (3.5) with local labor market informality rates for workers 30 years old and younger. Table D.11 reports the results of a battery of tests that corroborate the validity of our instruments of choice.

Table D.11: Validity of the instruments used in the first-stage regression

| Test | Low-educated Men | Low-educated Women | Low-educated All | High-educated Men | High-educated Women | High-educated All |
|---|---|---|---|---|---|---|
| Underidentification test | | | | | | |
| Kleibergen-Paap rk LM statistic | 146.70 (0.000) | 53.41 (0.000) | 186.27 (0.000) | 70.99 (0.000) | 31.37 (0.000) | 102.84 (0.000) |
| Weak identification test | | | | | | |
| Cragg-Donald Wald F statistic | 69.01 | 26.35 | 92.83 | 35.71 | 15.87 | 52.30 |
| Weak-instrument-robust inference | | | | | | |
| Anderson-Rubin Wald test | 26.57 (0.000) | 33.06 (0.000) | 64.78 (0.000) | 30.82 (0.000) | 64.84 (0.000) | 57.11 (0.000) |
| Stock-Wright LM S statistic | 29.38 (0.000) | 31.51 (0.000) | 64.92 (0.000) | 31.41 (0.000) | 61.66 (0.000) | 56.41 (0.000) |

Note: Underidentification test. H0: the matrix of reduced form coefficients has rank 1 (underidentified); H1: the matrix has rank 2 (identified). The Kleibergen-Paap rk LM statistic has a chi-square asymptotic distribution with 1 degree of freedom. Weak identification test. H0: the equation is weakly identified. The critical value for the Cragg-Donald Wald F statistic for weak instruments based on LIML with 10% maximal IV size is 7.03 for the case of 2 endogenous regressors and 2 excluded instruments (see Stock, Yogo, et al. [2005]). Weak-instrument-robust inference. Tests of joint significance of endogenous regressors in the main equation. H0: β3 = β4 = 0 and orthogonality conditions are valid. The Anderson-Rubin Wald test and Stock-Wright LM S statistics both have a chi-square asymptotic distribution with 2 degrees of freedom. p-values are reported in parentheses.

Using data from ENE and ENOE, figure D.1 shows the distribution of informality rates in each Mexican state for the period 1995 to 2019, demonstrating that there is substantial cross-state heterogeneity in terms of the mean and the dispersion of informality rates.

Since information on local labor market informality rates is available only from 1995 onward, the panel that we build from the MOTRAL has to be constrained to include only workers who entered the labor market for the first time in or after 1995. This means that roughly 57% of the observations have to be excluded from the estimation of equation (3.5). Figure D.2 depicts the distribution of year of first entry into the labor market for the whole panel.
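As a cross-check on Table D.11, the reported p-values can be recovered from the test statistics and the asymptotic distributions stated in the table note. The sketch below uses closed-form upper-tail probabilities for chi-square distributions with 1 and 2 degrees of freedom and compares the Cragg-Donald statistics with the 7.03 critical value. The helper names `chi2_sf_1df` and `chi2_sf_2df` are assumptions introduced for illustration; this is not the authors' code.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


# Statistics from Table D.11, in column order (low-educated, then high-educated).
kleibergen_paap = [146.70, 53.41, 186.27, 70.99, 31.37, 102.84]  # chi-square(1)
anderson_rubin = [26.57, 33.06, 64.78, 30.82, 64.84, 57.11]      # chi-square(2)
stock_wright = [29.38, 31.51, 64.92, 31.41, 61.66, 56.41]        # chi-square(2)
cragg_donald = [69.01, 26.35, 92.83, 35.71, 15.87, 52.30]
CRITICAL_VALUE = 7.03  # LIML, 10% maximal IV size, as stated in the table note

kp_p = [chi2_sf_1df(s) for s in kleibergen_paap]
ar_p = [chi2_sf_2df(s) for s in anderson_rubin]
sw_p = [chi2_sf_2df(s) for s in stock_wright]
all_above_critical = all(f > CRITICAL_VALUE for f in cragg_donald)

print("Largest p-value:", max(kp_p + ar_p + sw_p))
print("All Cragg-Donald statistics exceed 7.03:", all_above_critical)
```

Every p-value computed this way rounds to 0.000 at three decimals, matching the entries in parentheses in the table.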


Figure D.1: Distribution of Informality Rates per State

[Figure: two panels of state-level histograms; horizontal axis: informality rate; vertical axis: percent. (a) States with low informality rate (informality rate axis from .3 to .7): Aguascalientes, Baja California, Baja California Sur, Coahuila, Colima, Chihuahua, México, D. F., Durango, Jalisco, Estado de México, Nuevo León, Querétaro, Quintana Roo, Sinaloa, Sonora, Tamaulipas. (b) States with high informality rate (informality rate axis from .5 to .9): Campeche, Chiapas, Guanajuato, Guerrero, Estado de Hidalgo, Michoacán, Morelos, Nayarit, Oaxaca, Puebla, San Luis Potosí, Tabasco, Tlaxcala, Veracruz, Yucatán, Zacatecas.]

Note: Based on authors’ estimates with data from ENE and ENOE. The figure plots the average rate of informality for the Mexican states with (a) low levels of informality and (b) high levels of informality during the period 1995–2019.


Figure D.2: Distribution of the Year of Entry into the Labor Market

[Figure: histogram; horizontal axis: year of entry into the labor market (1960–2010); vertical axis: percent; observations with year of entry >= 1995 are highlighted.]

Note: Based on authors’ estimates with data from MOTRAL. The figure plots the frequency of year of entry for all the individuals present in the panel we constructed from the two rounds of the MOTRAL and highlights the observations that were actually included in the analysis presented in section 3.2.2.
