

The Size Distribution of Firms and Aggregate Industrial Water Pollution*

By JI QI, XIN TANG AND XICAN XI†

July 12, 2017

We show that misallocation across firms amplifies industrial pollution by distorting the firm size distribution in China. Using unique firm-level data from China, we find that larger firms are more likely to use clean technology and emit fewer pollutants per unit of output. We also provide evidence of size-dependent distortions that reallocate factors away from large productive firms. In a heterogeneous firm model with an endogenous choice of pollution treatment technology, we show that size-dependent distortions lower the adoption rate of clean technology, amplify aggregate pollution intensity, and lower aggregate output. Our quantitative results show that eliminating size-dependent distortions would increase mean firm size by 130% and aggregate output by 30%. Meanwhile, the fraction of firms using clean technology would increase by 27% and aggregate pollution would decrease by 20%. In contrast, tightening environmental regulations would result in a sizable decrease in aggregate pollution, but have a negligible effect on aggregate output. (JEL E01, E23, O44, Q52, Q53, Q56)

Severe pollution has accompanied China’s remarkable economic growth over the last few decades, causing environmental degradation, public health damage and millions of premature deaths.¹ It has therefore become a major concern for the public and policymakers. Understanding the driving forces behind the severe pollution is key to the design and evaluation of anti-pollution policies. Specifically, is the severe pollution an inevitable consequence of rapid economic growth, or is it largely exacerbated by inappropriate economic policies and/or market inefficiencies?

* Disclaimer: Research results and conclusions expressed herein are those of the authors and do not necessarily reflect the views of the China Ministry of Environmental Protection. The latest version and an online appendix (still in preparation) containing additional materials are available at the author’s website: https://sites.google.com/site/zjutangxin/.

† Ji Qi: Chinese Academy for Environmental Planning and Tsinghua University (e-mail: [email protected]); Xin Tang: Wuhan University (e-mail: [email protected]); Xican Xi: International Monetary Fund (e-mail: [email protected]). We thank Alexis Anagnostopoulos, Marina Azzimonti, Juan Carlos Conesa, Berthold Herrendorf, Nicolai Kuminoff, Mark Montgomery, Pietro Peretto, Edward Prescott, Diego Restuccia, Todd Schoellman, Kerry Smith, Daniel Yi Xu and Gustavo Ventura for comments and discussions. We have also benefited from comments received at Arizona State University, the Chinese Academy of Environmental Planning, Duke University, the IMF, National School of Development of Peking University, Shanghai University of Finance and Economics (SHUFE), Stony Brook University, Wuhan University, and Zhejiang University. We sincerely thank the China Ministry of Environmental Protection for granting us access to the National General Survey of Pollution Sources data. This paper has been screened to ensure that no confidential data are revealed.

¹ World Bank (2007) estimates that the total cost of air and water pollution in China is about 6% of GDP. The health cost of pollution is estimated to be 4% of GDP, the majority of which is associated with premature deaths caused by pollution. Recently, Ebenstein (2012) finds that industrial water pollution has a large positive effect on deaths from digestive cancers in China, and Ebenstein et al. (2015) find that air pollution is negatively associated with life expectancy. See also Vennemo et al. (2009), Zheng and Kahn (2013) and Greenstone and Jack (2015) for surveys of studies on pollution in China.

In this paper, we show that misallocation across firms amplifies industrial water pollution by distorting the firm size distribution in China. Using representative firm data on water pollution and pollution treatment technologies from the First National General Survey of Pollution Sources, and firm production data from the First China National Economic Census, we document two novel facts on firm size and firms’ pollution intensity:

(i) Large firms have lower pollution intensity (pollutants per unit value of output) than small firms. We find 7- to 32-fold differences in pollution intensity between firms in the largest and smallest quartiles of the firm size distribution for the top-5 polluting industries in China.² Further, we find that an important reason for this is that large firms are more likely to use advanced pollution treatment technologies that require a fixed installation cost.³

(ii) Large firms account for a smaller fraction of total employment in China than in the U.S. For the top-5 polluting industries in China, firms with more than 400 employees account for 40% of total employment, while for their American counterparts the number is close to 70%.⁴ This is suggestive of size-dependent distortions that reallocate factors away from large productive firms in China if we take the U.S. economy as a relatively distortion-free benchmark. We find this is indeed the case by showing that firm-level variations in the average products of labor and capital increase with firm size and productivity [Hsieh and Klenow (2009, 2014)].⁵

Size-dependent distortions limit the operations of large productive firms and allow too many small unproductive firms to survive, which lowers both mean firm size and aggregate output. A new insight we provide in this paper is that size-dependent distortions also increase aggregate pollution intensity, because they reduce the factors allocated to large productive firms, which are more likely to use advanced pollution treatment technologies and therefore have lower pollution intensity. However, aggregate pollution is the product of aggregate pollution intensity and aggregate output. Since size-dependent distortions increase the former while reducing the latter, it is a quantitative question whether they also amplify aggregate pollution.

² The negative correlation between firm size/productivity and pollution intensity is also found in other countries. For example, Dasgupta, Lucas and Wheeler (1998) find that small firms have higher air pollution intensity in Brazil and Mexico; Shapiro and Walker (2015) find that more productive firms have lower pollution intensity in the U.S.; and Bloom et al. (2010) find that better managed firms have lower energy intensity in the UK.

³ Our data reveal that large firms not only are more likely to use advanced end-of-pipe treatment technology to remove a larger proportion of the pollutants, but also generate fewer pollutants in production. This indicates that the production technologies used by large firms are also more environmentally friendly. While we cannot directly measure firms’ production technologies, we do have data on firms’ end-of-pipe pollution treatment technologies.

⁴ See Axtell (2001), Luttmer (2007) and Rossi-Hansberg and Wright (2007) for theories and evidence regarding the heavy right tail of the U.S. firm size distribution.

⁵ It is well understood in the literature that this measure of distortions captures the effects of many factors that distort factor allocation across firms, including taxes and subsidies, regulations, transportation costs and financial frictions, among others. A prominent example of such size-dependent distortions is internal trade barriers that impede the flow of goods across regions, which presumably affect large firms more, since they are more likely to sell goods across regions. Tombe and Zhu (2015) find that internal trade barriers have a large impact on China’s aggregate productivity. In addition, as is the case in many other developing countries, small firms are more likely to evade taxes due to imperfect tax enforcement.

We use a quantitative model to organize our empirical findings, and to quantify the effects of size-dependent distortions on aggregate output and pollution. In particular, we extend the classic Lucas (1978) span-of-control model to include size-dependent distortions, an endogenous choice of pollution treatment technologies, and imperfect environmental regulations. In our model, there is a stand-in household with a continuum of members. Household members are endowed with different managerial talents and make occupational choices based on their talents. If a member chooses to be an entrepreneur, she uses her managerial talent, capital and labor to produce output. She also has to pay implicit taxes that depend on her talent, which are intended to capture the size-dependent distortions in an intuitive way. The size of the firm is therefore determined by the managerial talent and the taxes.

A new feature of our model is that the entrepreneurs also make decisions on treatment technologies. The installation of clean technology requires a fixed cost, and firms without clean technology will be shut down with some probability, capturing the imperfect environmental regulations in China. The fixed costs associated with clean technology lead to increasing returns to scale, which implies that in our model, only firms above a certain size threshold choose to install clean technology. For a given distribution of managerial talents and distortions, the model generates endogenously a distribution of firm sizes, and a negative association between firm size and pollution intensity.
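The threshold logic described above can be illustrated with a minimal sketch. It is not the paper's model: the profit function, its parameters (`scale`, `curvature`), and the flat `tax_rate` are hypothetical simplifications chosen only to show why a fixed installation cost combined with a shutdown risk produces a size cutoff. An entrepreneur adopts clean technology when profit net of the fixed cost is at least the expected profit of operating dirty.

```python
def operating_profit(talent, scale=1.0, curvature=2.0, tax_rate=0.0):
    """Hypothetical after-tax profit, assumed increasing in managerial talent."""
    return (1.0 - tax_rate) * scale * talent ** curvature


def adopts_clean(talent, fixed_cost, shutdown_prob, **profit_kwargs):
    """Adopt if profit net of the fixed cost beats the expected dirty profit."""
    profit = operating_profit(talent, **profit_kwargs)
    clean_payoff = profit - fixed_cost
    dirty_payoff = (1.0 - shutdown_prob) * profit  # survives with prob. 1 - p
    return clean_payoff >= dirty_payoff


def adoption_threshold(fixed_cost, shutdown_prob, scale=1.0, curvature=2.0,
                       tax_rate=0.0):
    """Talent level at which shutdown_prob * profit equals the fixed cost."""
    base = fixed_cost / (shutdown_prob * (1.0 - tax_rate) * scale)
    return base ** (1.0 / curvature)


# Illustrative numbers only (assumptions, not calibrated values).
undistorted = adoption_threshold(fixed_cost=1.0, shutdown_prob=0.25)
distorted = adoption_threshold(fixed_cost=1.0, shutdown_prob=0.25, tax_rate=0.5)
print(undistorted, distorted)
```

In this toy setting a higher implicit tax raises the cutoff, so fewer firms adopt, while a higher shutdown probability (stricter enforcement) lowers it. Both comparative statics are properties of the assumed functional form, not results taken from the paper.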

To discipline our quantitative analysis, we require our benchmark model to match the observed pollution intensity, clean technology adoption rate, and firm size distribution in China. The model fits the firm-level data well. We then use the calibrated model to evaluate the effects of removing the size-dependent distortions and tightening environmental regulations.

Our quantitative results show that eliminating size-dependent distortions completely would increase mean firm size by 130% and aggregate output by 30%. Meanwhile, the fraction of firms using clean technology would increase by 27% and aggregate pollution would decrease by 20%. The drop in pollution comes from both the reduction in pollutants generated during the production stage and the increase in the adoption rate of clean technologies at the treatment stage. Each stage contributes about 50% of the total reduction.⁶ The expansion of productive firms is key to both channels. To isolate the importance of the size-dependency of distortions, we solve a version of the model where all firms in the economy face the same level of distortions. In our model, the size-dependency of the distortions does not imply a large output loss. However, it plays a central role in determining the pollution level.

We also study the effects of tightening environmental regulations. Specifically, we tighten the regulation such that the fraction of firms adopting clean technology is the same as in the first experiment. We find that it reduces aggregate pollution by about 10% and has very little effect on output. Moreover, we find that the environmental policy improves resource allocation on the extensive margin by driving small unproductive firms out of the economy. However, the allocation worsens on the intensive margin in the sense that, among the remaining active firms, the production of medium-sized firms expands at the expense of large firms. By contrast, the removal of the size-dependent distortions improves the allocation on both margins.

⁶ As mentioned above, there are two stages at which firms can take action to cut their emission levels in reality. Firms can reduce the total quantity of pollutants generated during the production stage by using environmentally friendly production technologies, or reduce end-of-pipe emissions by adopting more advanced treatment equipment for a given amount of pollutants generated. In the model, we focus mainly on firms’ choice of end-of-pipe treatment technologies, which we can directly observe and measure in the data. We capture the decrease in pollution intensity during the production stage in a reduced-form way, which is calibrated to the data.

Related Literature. Our paper is closely related to studies on the aggregate consequences of misallocation across heterogeneous firms. Important papers in the literature, such as Guner, Ventura and Xu (2008), Restuccia and Rogerson (2008), and Hsieh and Klenow (2009), establish the importance of idiosyncratic policy distortions for aggregate productivity and output, especially those correlated with firm size and productivity.⁷ Our paper shows that the impact of distortions goes beyond aggregate economic output: they cause not only a large decrease in aggregate output, but also a large increase in aggregate pollution. By considering both aggregate output and pollution, our paper provides a more complete understanding of the welfare consequences of policy distortions, given the large costs of pollution in developing countries such as China.

We contribute to a large literature on the relationship between economic growth and the environment. At the heart of the literature is the environmental Kuznets curve (EKC henceforth). Grossman and Krueger (1993, 1995), followed by many others, establish empirically a hump-shaped relationship between a country’s per-capita income and its level of pollution.⁸ Many interpret this fact as evidence of an unavoidable tradeoff between economic growth and the environment when income per capita is low, and there is a popular belief that growth-promoting policies necessarily lower environmental quality in China and other developing countries. We challenge this view by showing that removing distortions leads to both higher aggregate output and lower aggregate pollution in China.⁹ A broad message from our findings is that, due to policy distortions and market inefficiencies, developing economies are usually operating within the production possibility frontier between economic output and environmental quality. Therefore, both an increase in output and a reduction in pollution can be attained through reducing policy distortions and market inefficiencies.¹⁰

⁷ For more recent developments, see Bartelsman, Haltiwanger and Scarpetta (2013), Hsieh and Klenow (2014) and Adamopoulos and Restuccia (2014). Restuccia and Rogerson (2013) and Hopenhayn (2014a) provide reviews of this literature.

⁸ Copeland and Taylor (2004) provide a thorough survey of the early contributions, and a unifying theoretical framework for understanding the links between economic growth and the environment. They identify three channels through which economic growth could affect environmental outcomes: scale (total production), technology and industry composition. Our paper emphasizes the importance of a new channel, namely factor allocation across firms, and its interaction with the scale and technology channels. Also, unlike in most papers in the literature, pollution treatment technology is endogenous in our paper, and we use the model and data to quantify the role of endogenous technology adoption.

⁹ Several recent papers also study how factor allocation across firms affects aggregate pollution. Using Indian manufacturing data, Barrows and Ollivier (2016) find that the reallocation of resources across firms explains more than 50% of the decline in aggregate CO2 emission intensity in India between 1990 and 2010. Using U.S. manufacturing data, Shapiro and Walker (2015) find that the emissions reductions in the U.S. are primarily driven by a decline in within-product emissions intensity. They further use a quantitative model to show that the increasing stringency of environmental regulation explains most of the emissions reductions. Martin (2013) estimates empirically the relative contributions of changes in different policies for India. Our work complements these papers in that we emphasize the importance of size-dependent distortions, and our data allow us to quantify directly the importance of endogenous treatment technology adoption.

This paper also provides a novel mechanism that could potentially rationalize the declining part of the EKC. One of the key findings in this paper is that large firms have lower pollution intensity. Recent empirical studies such as Poschke (2015) and Bento and Restuccia (2016) find that mean firm size in the manufacturing sector increases with a country’s per-capita income, with the latter and Hsieh and Klenow (2014) attributing the cross-country differences in firm size largely to the cross-country differences in distortions. These two facts together imply that the decrease in distortions would contribute to the decrease in aggregate pollution as income per capita increases.

Finally, our results highlight the role of firm size and misallocation for technology adoption in developing countries. As made clear by Parente and Prescott (1994, 1999), an important task in development economics is to understand the slow technology adoption in developing countries. Recent contributions include Acemoglu et al. (2012), who focus on taxes and subsidies, and Cole, Greenwood and Sanchez (2016), who emphasize the importance of contractual frictions and financial market inefficiencies. Using direct observations on the adoption of pollution treatment technology, we establish empirically that firm size plays a key role in technology adoption. Therefore, policy distortions and market inefficiencies that distort firm sizes impede technology adoption. If we view tariffs as a particular type of policy distortion, our results share a similar increasing-returns-to-scale intuition with Bustos (2011), who shows, in a different context, that a bilateral reduction in tariffs induces more firms to adopt a new technology with a fixed cost. Our paper also differs from hers in that the quantitative framework we adopt allows us to evaluate the effects of regulation policies in the presence of distortions and to analyze the economic mechanism.

The rest of the paper proceeds as follows. The next section documents facts pertaining to pollution intensity differences across firms and the comparison of firm-size distributions between China and the U.S. We describe the model in Section II and calibrate its benchmark version in Section III. In Section IV we perform several policy experiments to study the interaction between size-dependent distortions and environmental policies. We conclude in Section V.

I. Empirical Evidence

In this section, we document the key empirical findings that motivate our study: the size-intensity relationship and the comparison of firm size distributions between China and the U.S. We start with a brief introduction of the data that we use. We then explain the empirical findings. In the last subsection, we use an accounting exercise to show that the firm size distribution has a sizable effect on aggregate pollution.

¹⁰ A large literature in environmental economics discusses the possibility of a double dividend from environmental taxes, that is, a simultaneous improvement in the environment and in economic efficiency from the use of environmental taxes to replace other distortionary taxes. Goulder (1994) and Fullerton and Metcalf (1997) review early work on this question.


A. Data Sources

There are three major data sources that we draw upon in this paper: (i) the First National General Survey of Pollution Sources, (ii) the First China National Economic Census and (iii) the Statistics of U.S. Businesses. These three data sources are used to calculate the pollution intensity of Chinese manufacturing firms and the firm size distribution of manufacturing firms in China and in the U.S. They are referred to in the remainder of this paper respectively by their acronyms NGSPS, CNEC and SUSB.¹¹

National General Survey of Pollution Sources. The NGSPS is a joint effort of multiple national ministries in China. The survey records data for the year 2007. It is designed to cover all entities and self-employed households that emit pollutants in China. The complete survey consists of four components: industrial pollution sources, agricultural pollution sources, domestic pollution sources, and facilities for centralized treatment of pollution. For the purpose of this paper, we use only the industrial pollution data, which include all polluting production entities that belong to any of the 39 manufacturing industries. Moreover, the NGSPS contains information on the discharge of multiple air, water, and solid waste pollutants. Here we focus on water pollution because the data are more accurately measured. The variables we use are: the quantity of major pollutants generated and discharged, the total value of production, the type, book value and annual operating costs of pollutant treatment equipment, the firm’s industry (four-digit GB/T 4754-2002), the ownership classification, and the province.

It is well established in environmental science that industrial waste is typically concentrated in a handful of sectors. Even within narrowly defined manufacturing sectors, pollutant emissions are usually concentrated among firms that engage in some particular manufacturing processes. To address this issue, the NGSPS divides the complete sample into two large groups, key sources and regular sources, where firms identified as “key sources” are those that are most polluting. We focus on the key firms in this paper¹² because the quality of their data is higher and most regular firms emit very few pollutants, meaning that the key firms are more representative of the polluting manufacturing firms in China.

Among all the pollutants, we use Chemical Oxygen Demand (COD, henceforth), which measures the amount of oxygen consumed when a chemical oxidant is added to a sample of water. It is an indirect measure of the overall quantity of contaminants that will eventually cause oxygen loss and thus the death of living creatures. Table 1 lists the percentage of key and regular firms that have positive emissions of different pollutants. We choose COD because it allows us to keep most observations from the data. Other pollutants are discharged by significantly fewer firms, which raises sample selection concerns. Moreover, COD emission is highly correlated with the emission of other pollutants.¹³ Finally, we focus on the measured end-of-pipe discharges. Online Appendix A.2 explains in detail how the data are collected. In the interest of space, here we note only that the information contained is different from the mix of energy sources and intermediate goods in the production process. We present results for the top-5 polluting industries. Altogether, this leaves us with 29,019 firms.

¹¹ In the interest of space, we leave a more detailed description of these data to the Online Appendix.

¹² Online Appendix A.1 contains a detailed description of the definition of the key sources.

¹³ Take the Paper and Paper Product industry for example: the correlation between the emission of COD and that of NH₄⁺ is corr(COD, NH₄⁺) = 0.82, and that between COD and BOD is corr(COD, BOD) = 0.94.

TABLE 1. PERCENTAGE OF FIRMS WITH POSITIVE EMISSION BY POLLUTANTS

       Waste   COD  Petro  NH₄⁺   BOD    CN  Cr⁶⁺  Phenol    As    Cr    Total
Key     76.2  73.2   31.4  25.2  17.5  4.90  4.86    2.42  2.27  2.01  106,067
Reg     35.2  28.3   7.91  6.49  2.56  0.13   N/A    0.04  0.07   N/A  814,937

† Data Source: National General Survey of Pollution Sources. The acronyms refer respectively to: Wastewater, Chemical Oxygen Demand, Petrochemicals, Ammonium, Biochemical Oxygen Demand, Cyanide, Hexavalent Chromium, Volatile Phenols, Arsenic and Chromium. Total is the number of firms in each group; all other entries are percentages.

TABLE 2. STATISTICS OF TOP-10 POLLUTING INDUSTRIES BY COD

                  Paper  Agri   Tex  Chem  Bever   Med   Fer  Petro  Food   Fib
Fraction (a)       33.4  15.2  14.0  10.4   4.27  2.98  2.49   2.32  2.30  2.15
% Emission (b)     99.6  91.8  91.1  99.7   65.1  92.9  99.9   99.9  96.4  97.8
% Production (c)   87.2  69.3  48.3  98.6   88.1  95.7  99.3   99.7  98.5  91.9

† Data Source: National General Survey of Pollution Sources. The acronyms refer respectively to (with the two-digit GB/T 4754-2002 classification code in parentheses): Paper and Paper Products (C22); Processing of Food from Agricultural Products (C13); Textile (C17); Raw Chemical Materials and Chemical Products (C26); Beverages (C15); Medicines (C27); Mining, Smelting and Pressing of Ferrous Metals (C32); Processing of Petroleum, Coking, Processing of Nuclear Fuel (C25); Foods (C14); Chemical Fibers (C28).
(a) Relative contribution to total COD emissions by sector, in percent.
(b) Percentage of total COD emissions accounted for by key firms.
(c) Percentage of total production accounted for by key firms.

Table 2 contains basic statistics about these industries (here we include all top-10 polluting industries ranked according to the amount of COD emitted). We see from it that the key firms in the top-5 polluting industries are fairly representative of China’s industrial pollution situation: these industries combined contribute to 77% of the total industrial COD emission; the key firms are responsible on average for more than 90% of the within-sector emission; and for more than 80% of the within-sector output.
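The emission figures quoted here can be checked against the top-5 columns of Table 2 with a few lines of arithmetic. The sketch below hard-codes those published values. The text does not say how the "on average" figure is weighted, so treating it as an emission-weighted average is an assumption, and the simple mean is computed alongside for comparison. The output shares are not checked because the appropriate output weights are not reported in the table.

```python
# Top-5 columns of Table 2: industry -> (sector share of total industrial COD
# emissions in %, share of within-sector COD emitted by key firms in %).
TABLE2_TOP5 = {
    "Paper": (33.4, 99.6),
    "Agri": (15.2, 91.8),
    "Tex": (14.0, 91.1),
    "Chem": (10.4, 99.7),
    "Bever": (4.27, 65.1),
}


def combined_share(table):
    """Sum of the sector shares of total industrial COD emissions."""
    return sum(sector_share for sector_share, _ in table.values())


def simple_mean_key_share(table):
    """Unweighted mean of the key-firm emission shares."""
    return sum(key_share for _, key_share in table.values()) / len(table)


def emission_weighted_key_share(table):
    """Key-firm emission share, weighting each sector by its COD share
    (the weighting scheme is an assumption, not stated in the text)."""
    total = combined_share(table)
    return sum(sector * key for sector, key in table.values()) / total


print(round(combined_share(TABLE2_TOP5), 2))
print(round(simple_mean_key_share(TABLE2_TOP5), 1))
print(round(emission_weighted_key_share(TABLE2_TOP5), 1))
```

The combined share reproduces the 77% figure. The "more than 90%" statement holds under emission weighting, while the unweighted mean falls slightly below 90% because the Beverages sector has a low key-firm share but also a small weight.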

China National Economic Census. The CNEC was conducted by the National Bureau of Statistics (NBS, henceforth) for the year 2004. It is designed to cover all legal entities, industrial entities, and privately-owned businesses that undertake economic activities in secondary and tertiary industries in China. We use observations that belong to the manufacturing sector. The variables we use are: the total value of production, the labor compensation, the book value of capital stock, the number of employees, the firm’s industry (four-digit GB/T 4754-2002), the ownership classification, and the province.¹⁴

¹⁴ We emphasize that it is important to use the CNEC rather than the Annual Surveys of Industrial Production, for which data for 2007 (the year that the NGSPS covers) are available. The reason is that the CNEC surveys firms of all sizes, whereas the annual surveys cover only firms with revenue above CNY 5 million. In 2004, the numbers of firms and employees covered by the annual survey are 276,410 and 66,725,059, respectively, while those covered by the census are 1,375,148 and 93,541,923. Therefore we would be missing 28.7% of employment and 79.9% of


TABLE 3. POLLUTION INTENSITY AND PRODUCTION LEVEL

                        Quartile of Firm Sales
Industry              QU1    QU2    QU3    QU4
Paper                 6.7    3.2    2.0    1.0
Agricultural Food    20.8    7.6    3.6    1.0
Textile               8.3    3.6    2.4    1.0
Chemical Materials    6.7    3.8    2.7    1.0
Beverage             31.4   18.7    4.7    1.0

† Data Source: National General Survey of Pollution Sources. QU1 to QU4 represent respectively the bottom to the top quartile. The pollution intensity of the top quartile of each industry is normalized to one.

Statistics of U.S. Businesses. The SUSB is conducted by the U.S. Census Bureau. It is an annual series that provides national and subnational data on the distribution of economic data by enterprise size and industry. We use the number of firms and total employment by sector (up to six-digit 2002 NAICS) and enterprise size group.

B. Firm Size and Pollution Intensity

We define pollution intensity as follows:

$$\text{Intensity} = \frac{\text{Total COD Emission}}{\text{Total Value of Production}}.$$

We group the firms into quartiles based on their total value of output. For each industry, we calculate the output-weighted average pollution intensity of the firms in each quartile. Table 3 reports the results. For the Paper and Paper Product industry, the pollution intensity of firms in the bottom quartile is 6.7 times that of firms in the top quartile. The difference can be as large as 31.4 times, as is the case for the Beverage Manufacturing industry. Moreover, pollution intensity decreases monotonically from the bottom to the top quartile. This can also be seen from the scatter plot of the logarithm of intensity against that of the total value of production. We plot the Paper and Paper Product industry in Figure 1 as an example. Scatter plots for the other four industries are left to the Online Appendix. A significant negative correlation between log-intensity and log-production can be seen in Figure 1.
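The quartile comparison can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the function name, the input layout (one `(output_value, cod_emission)` pair per firm), and the toy numbers are all hypothetical. It relies on the fact that an output-weighted average of firm-level intensities within a group equals the group's total COD divided by its total output.

```python
def quartile_intensity(firms):
    """Normalized output-weighted pollution intensity by output quartile.

    firms: list of (output_value, cod_emission) pairs for one industry,
    with at least four firms. Returns four numbers, bottom to top quartile,
    scaled so that the top quartile equals 1.0.
    """
    ranked = sorted(firms, key=lambda firm: firm[0])
    n = len(ranked)
    intensities = []
    for q in range(4):
        group = ranked[q * n // 4:(q + 1) * n // 4]
        total_output = sum(output for output, _ in group)
        total_cod = sum(cod for _, cod in group)
        # Output-weighted mean of cod/output reduces to this ratio of sums.
        intensities.append(total_cod / total_output)
    top = intensities[-1]
    return [value / top for value in intensities]


# Toy data (made up): every firm emits the same COD, but output differs.
toy_firms = [(1, 8), (1, 8), (2, 8), (2, 8), (4, 8), (4, 8), (8, 8), (8, 8)]
print(quartile_intensity(toy_firms))
```

With the toy data, the normalized intensities fall from the bottom to the top quartile, mirroring the layout of Table 3. Real data would also need a rule for ties and for firms with zero output, which the sketch ignores.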

To further examine the statistical properties of the relationship between intensity and production level, we regress log-emissions on log-sales, including a complete set of dummies for two-digit industry ($X_s$), province ($X_p$), and ownership rights ($X_o$):

$$\log(\mathrm{COD}_i) = \underset{(0.37)}{-3.36} + \underset{(0.01)}{0.62} \times \log(\mathrm{Sales}_i) + X_s\gamma_1 + X_p\gamma_2 + X_o\gamma_3 + \varepsilon_i. \tag{1}$$

firms had we used only the annual survey. The numbers of firms covered in the NGSPS and the CNEC, 921,004 and 1,375,148, are broadly consistent given that the NGSPS further requires a production entity to have pollution sources in order to be included. The basic features, such as variable definitions, are essentially the same in these two datasets. We therefore refer interested readers to Brandt, Biesebroeck and Zhang (2012), which contains a detailed description of the annual surveys.


[Figure 1: scatter plot for the Paper industry. Horizontal axis: Log Production. Vertical axis: Log Intensity.]

FIGURE 1. POLLUTION INTENSITY AGAINST PRODUCTION

Source: National General Survey of Pollution Sources. Line: least squares fit.

The estimates are all statistically significant at the 0.1% level, with standard errors reported in parentheses below the estimates. The estimate implies that as total sales increase by 1%, total emissions increase by 0.62%, which is less than 1%. This suggests that emission intensity decreases as the sales of the firm increase. More specifically, subtracting $\log(\mathrm{Sales}_i)$ from both sides of Equation (1) gives an elasticity of pollution intensity with respect to total sales of −0.38, which means that, other things equal, a 1% increase in total sales is associated with a 0.38% decrease in pollution intensity; a doubling of total sales is associated with a decrease of roughly 23% (since $2^{-0.38} \approx 0.77$). The estimation has an $R^2$ of 0.55, which suggests that a fair amount of variation can be explained by variations in total sales and in the three sets of dummies.15
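The arithmetic behind this interpretation can be checked directly. The sketch below only uses the coefficient 0.62 reported in Equation (1); the function names are illustrative. Under the log-linear form, a finite change such as a doubling of sales maps into intensity through $2^{-0.38}$, which is smaller in magnitude than the linear extrapolation $-0.38 \times 100\%$.

```python
BETA_SALES = 0.62  # coefficient on log(Sales) reported in Equation (1)


def intensity_elasticity(beta=BETA_SALES):
    # log(COD / Sales) = log(COD) - log(Sales), so the slope on log(Sales)
    # in the intensity equation is beta - 1.
    return beta - 1.0


def intensity_change(sales_ratio, beta=BETA_SALES):
    """Proportional change in intensity when sales are multiplied by sales_ratio."""
    return sales_ratio ** intensity_elasticity(beta) - 1.0


print(round(intensity_elasticity(), 2))         # -0.38
print(round(intensity_change(1.01) * 100, 2))   # -0.38 (percent, for a 1% rise in sales)
print(round(intensity_change(2.0) * 100, 1))    # -23.2 (percent, for a doubling of sales)
```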

C. Firm Size and Treatment Technologies

The negative size-intensity relationship we document in Section I.B does not explain why larger firms pollute with less intensity. To answer this question, we exploit the detailed information in the NGSPS on the end-of-pipe wastewater treatment equipment that firms use. The NGSPS groups wastewater treatment technologies into five categories: physical, chemical, physio-chemical, biological and combined technologies. In the subsequent analysis, we drop physio-chemical technologies because fewer than 0.5% of firms adopt this type of equipment. The combined technologies are different combinations of biological technologies with other technologies. They demonstrate features very similar to those of biological technologies. We therefore group them with biological technologies.16 We

15 We have also estimated the relationship using other econometric specifications. For instance, we estimated versions of Equation (1) for each industry, and with robust standard errors clustered on different groups. All the regressions suggest qualitatively the same negative relationship between intensity and size. The estimation results of the other specifications, as well as interpretations of the coefficients on the dummies, are included in the Online Appendix.

16 Several examples of the actual technologies attributed to the three base categories (physical, chemical and biological) are as follows. Physical: Filtering, Centrifuging, Precipitation Separation, etc. Chemical: Oxidation-reduction,


TABLE 4—FIRM SIZE AND TREATMENT TECHNOLOGIES

Technology    Mean Efficiency    Adoption Rates    Median Costs(a)    Median Sales(b)
Physical          63.37%             25.79%              100                100
Chemical          69.77%             34.50%              360                270
Biological        80.90%             39.71%             1200                820

† Note: The numbers reported are for the Paper and Paper Product (C22) industry. Treatment efficiency is defined as 1 − COD Emitted/COD Generated.
a,b The median installation costs of physical technologies and the median sales of firms adopting them are normalized to 100.

are interested in the processing efficiency and installation costs of these technologies.17

To show the difference in the technologies adopted by firms of different sizes, we use the Paper and Paper Product industry as an example.18 Table 4 shows, for each technology, the mean processing efficiency, the median installation costs, and the median sales of the firms that adopt it. We proxy the processing efficiency using one minus the ratio of emitted COD to generated COD. We normalize both the median installation costs of physical technologies and the median sales of firms using physical technologies to 100. We find that on average, biological equipment is about 17 percentage points more efficient than physical equipment. Meanwhile, it is also more costly: its median installation cost is 12 times that of physical equipment. Further, the median sales of firms that adopt biological technologies are about 8 times those of firms adopting physical technologies. Taken together, the evidence points to increasing returns to scale of the clean technologies. It indicates that small firms lack the profit margins that are needed to take advantage of the increasing returns to scale exhibited by biological technologies, while large firms are more likely to adopt these more advanced technologies.

Environmentally Friendly Production Technology.—Notice that the above results are all about end-of-pipe treatment technologies, and we have made no statement about factors that could lead to less COD being generated. In fact, in the data the COD generated per unit value of production is also decreasing in the total value of output. It is possible that larger firms use environmentally friendlier production technology, that they sell products with higher markups, or that they produce products that are technologically less polluting. An example from the Handbook of Emission Coefficients published by the Chinese Academy of Sciences is as follows. Two technologies in paper pulp manufacturing use different inputs: bagasse and wood. While bagasse which generates 140-

neutralization, etc. Biological: Aerobic Biological Treatment, Activated Sludge Process, etc.

17 We do not include the annual operating costs in our analysis because, on average, the ratio of the operating costs of the treatment equipment to the annual value of production is about 1.5%. Furthermore, the median of this ratio is less than 0.5%, suggesting that operating costs are almost negligible for more than 50% of the firms. Therefore, the operating costs alone are unlikely to affect a firm’s treatment technology adoption decision. Adding the operating costs to the installation costs does not change the results. Since we do not model the operating costs in our theory, we exclude them here for consistency.

18 Here we want to control for the potential heterogeneity in production processes across industries; we therefore focus on one industry. However, pooling all polluting industries together yields very similar results, which are left to the Online Appendix.


180 kg COD per ton is used mostly by firms with annual production of less than 100 k-tons, wood that generates 30-55 kg COD per ton is used mostly by firms with annual production of more than 100 k-tons. Another example is from Bloom et al. (2010). They use data on more than 300 manufacturing firms in the UK and find that better management practices are associated with both improved productivity and lower greenhouse gas emissions. Unfortunately, we cannot test these hypotheses directly with our data. Therefore, in this paper, we focus on firms’ decisions on treatment equipment adoption, which we observe directly, and model the intensity reduction during the production stage in a reduced-form way.

D. Firm-Size Distribution

The negative correlation between pollution intensity and production scale implies that, ceteris paribus, the shares of total output accounted for by large and small firms could have a large impact on aggregate industrial pollution. An important task then is to understand the firm size distribution in China. Specifically, we choose the firm size distribution in the U.S. as a benchmark and compare the firm size distribution in China to it. We choose the U.S. as the benchmark for two reasons. First, it is reasonable to treat the U.S. economy as relatively distortion-free compared with other economies. Second, China and the U.S. are both large economies with complete sets of industrial sectors. For a study like ours, it is important that for each sector we study, we can find a comparable counterpart in the benchmark country. Had we instead contrasted the industries in China with those in advanced European economies, either comparable counterparts would be hard to find or the corresponding industries would be significantly smaller. Notice that firm size in the U.S. and China could differ for many reasons, and we do not intend to explain why they differ. The primary purpose of this section is to motivate the accounting exercise in Section I.E and the investigation of size-dependent distortions that disproportionately affect large firms in Section II.B.

Ideally, we would like to compare the shares of total output accounted for by firms of different sizes. However, the SUSB does not report information on output value. We therefore focus instead on the shares of employment accounted for by firms of different sizes. It is the closest measure that relates to our analysis, and it has been firmly established that firm employment is strongly correlated with firm production. We use the International Standard Industrial Classification of All Economic Activities, Rev. 3.1 (ISIC Rev 3.1), published by the United Nations, to bridge the different industrial classification systems adopted by China (GB/T 4754-2002) and the U.S. (NAICS 2002). More specifically, crosswalks from GB/2002 at the four-digit level and from NAICS/2002 at the six-digit level to ISIC Rev 3.1 are issued by China’s NBS and the U.S. Census Bureau. The results presented in this section are from matching at the disaggregated level (four-digit GB with six-digit NAICS).19

The firm size distributions for each of the top polluting industries and all industries pooled together are shown in Figure 2.20 For all panels in Figure 2, we see that the share of employment of firms with more than 400 employees in the U.S. is significantly higher than that in China. For

19 Matching at a more aggregated level (two-digit GB with three-digit NAICS) yields very similar results.

20 The details of the calculation are contained in Appendix C.


[Figure 2: six bar-chart panels (Paper, Agricultural Food, Textile, Chemical Materials, Beverage, Pooled Polluting) comparing China and the U.S. Horizontal axis: Firm Size (1−19, 20−99, 100−399, 400+ employees). Vertical axis: Employment Share (0.0 to 1.0).]

FIGURE 2. EMPLOYMENT DISTRIBUTION


TABLE 5—SIZE DISTRIBUTION ON POLLUTION

                     Paper    Agricultural Food    Textile    Chemistry    Beverage    Average
Average Intensity    43.5%         61.1%            97.5%      101.2%       89.0%       67.0%

† Note: Please see the notes of Table 2 for acronyms of industries. For individual industries, the numbers reported are the aggregate pollution from the artificial U.S. production structure as a percentage of that of China. Column 6 (Average) calculates the weighted average of these ratios using the percentage contribution in row one of Table 2 as weights.

example, for the paper manufacturing industry, more than 90% of the workers in the U.S. are hired by firms with more than 400 employees, while in China the number is less than 40%. Overall, pooling these industries together, approximately 70% of employment is in large firms in the U.S., while in China the number is only 20%. These findings indicate that, compared to the U.S., a much larger portion of production is done by small firms in China. Hence the underlying difference in industry structure could be a candidate for explaining the high industrial pollution emissions in China. The results are consistent with Wang and Whalley (2014), where the authors compare the manufacturing concentration ratio (the share of the market occupied by the largest firms) between China and the U.S. According to Table 1 in their paper, the ratios of the concentration indicators of the U.S. over China for all five top polluting industries are higher than the overall average, which suggests that in the polluting industries, large firms in the U.S. take a larger share of markets.

E. Size Distribution and Aggregate Pollution

To gain an understanding of how the firm size distribution affects aggregate pollution quantitatively, in this section we conduct an accounting exercise. In this exercise, for each polluting industry in China, we fix the level of total output, but replace the employment distribution with that from the U.S., and calculate the implied level of aggregate pollution using the size-intensity relationship estimated in Section I.B. This simple exercise is complicated by the fact that the NGSPS reports only the firm-level total value of production but not the number of employees. We construct the employment-production relationship using a linear regression with CNEC data.21

The results are shown in Table 5. The numbers reported are the ratio of the aggregate pollution level produced with the U.S. employment share distribution over that with the original Chinese distribution. The results imply that by changing the employment share distribution to that of the U.S., while keeping production at the same level, the aggregate discharge in the Paper and Paper Product industry falls to 43.5% of the original level. On average, for the top-5 polluting industries, the change in size distribution reduces discharge to 67% of the original level. The average is calculated using the relative size of each industry in pollution as weights, which is the first row of Table 2. Changing the size distributions of the five industries together while keeping those of all the other industries untouched would reduce total emissions by about 25.5%. Although the exercise here is a crude approximation, it nevertheless shows that size distribution

21 There are many ways to construct the employment-production relationship using the CNEC, and each method has its own advantages and disadvantages. Calculations using alternative methods give similar results. We leave the details of these alternative methods to Appendix A.


could have a significant impact on the level of aggregate industrial pollution.
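The logic of the accounting exercise can be sketched in a few lines. Everything numerical below is a hypothetical illustration, not the data or the procedure behind Table 5: the representative firm sizes per bin, the employment shares, and the proportional mapping from employment to output (the text instead estimates this relationship by regression on CNEC data) are all assumptions. Only the intensity elasticity of −0.38 comes from Equation (1).

```python
INTENSITY_ELASTICITY = -0.38   # implied by Equation (1): 0.62 - 1


def aggregate_pollution(emp_shares, firm_employment, total_output=1.0,
                        output_per_worker=1.0):
    """Aggregate pollution for given employment shares, holding total output fixed."""
    # Assumed mapping from firm employment to firm output (proportional).
    firm_output = [output_per_worker * n for n in firm_employment]
    # Number of firms in a bin is proportional to share / employment per firm,
    # so the bin's output is proportional to (share / n) * firm output.
    raw = [s / n * y for s, n, y in zip(emp_shares, firm_employment, firm_output)]
    scale = total_output / sum(raw)            # rescale so total output is fixed
    bin_output = [scale * r for r in raw]
    # Pollution = output x intensity, with intensity proportional to
    # (firm output) ** elasticity.
    return sum(q * y ** INTENSITY_ELASTICITY
               for q, y in zip(bin_output, firm_output))


def pollution_ratio(counterfactual_shares, baseline_shares, firm_employment):
    return (aggregate_pollution(counterfactual_shares, firm_employment)
            / aggregate_pollution(baseline_shares, firm_employment))


# Hypothetical representative employment for bins 1-19, 20-99, 100-399, 400+.
SIZES = [10, 50, 200, 1000]
# Hypothetical employment shares; only the 400+ shares (about 20% and 70%)
# echo the pooled figures quoted in the text.
CHINA_SHARES = [0.15, 0.35, 0.30, 0.20]
US_SHARES = [0.05, 0.10, 0.15, 0.70]

print(pollution_ratio(US_SHARES, CHINA_SHARES, SIZES))
```

A ratio below one means that shifting employment toward larger firms lowers aggregate pollution at unchanged total output, which is the direction of the effect reported in Table 5.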

II. The Model

The accounting exercise in the last section has several limitations. First, aggregate output is fixed. It is possible that when the size distribution changes, pollution intensity decreases but aggregate output increases by more, so that aggregate pollution increases as a result. We would like to allow for such a scenario in our analysis. Second, the firm size distribution is mechanically changed to that in the U.S. From the accounting exercise alone, we do not know which factors drive the difference between the firm size distributions of China and the U.S., nor do we know whether changing these factors would in fact make the implied employment distribution become that of the U.S. Third, the relationship between firm size and emission intensity is taken as exogenous and invariant. It is possible that changes in the factors that affect the employment distribution also affect the technology choice decisions of the firms, which makes the size-intensity relationship endogenous. Therefore, to better evaluate the environmental consequences of distortions to firm size, we need a model that (i) contains economic factors affecting both aggregate output and pollution; (ii) reveals which factors affect firm size and how; and (iii) provides an explanation for the size-intensity relationship.

For this purpose, we consider a one-sector neoclassical growth model with heterogeneous production units featuring size-dependent distortions, imperfect environmental monitoring, and endogenous treatment technology choice. We assume that there are two types of treatment technologies: dirty and clean. In the context of our model, the two technologies are interpreted as the physical and biological technologies which we discussed in Section I.C.

A. Setup

Household.—There is a representative household with a continuum of members. Each household member is endowed with $z$ units of managerial talent, $z \sim G(z)$ with support $\mathcal{Z} \equiv [0, \bar{z}]$, where $G(z)$ is the cumulative distribution and $g(z)$ is the probability density. We assume the support and distribution of $z$ are exogenous. Further, we assume that $z$ is fixed once drawn. Household members face an occupational choice between worker and entrepreneur. A worker supplies one unit of labor inelastically in exchange for wage income, and an entrepreneur rents capital and labor to run a neoclassical firm and earns profits. Let the final product be the numeraire, and let $R$ and $W$ be the rental prices of capital and labor respectively. Firms and capital are owned by the household.

Firms.—Firms combine managerial talent $z$, capital $k$, and labor $n$ to produce output $y$ according to the technology

$$y = F(z, k, n) = z^{1-\gamma}\left(k^{\alpha} n^{1-\alpha}\right)^{\gamma},$$

where $\gamma < 1$ is the span-of-control parameter. The assumption of decreasing returns to scale with respect to $k$ and $n$ supports a non-degenerate distribution of firms.22

22 We build our model based on Lucas (1978) here. However, all the qualitative properties of our model remain valid if instead we use a model with monopolistic competition [Melitz (2003)] since the two models are isomorphic [see Appendix I of Hsieh and Klenow (2009)]. In the Melitz model, the decreasing returns to scale come from the


The production process generates pollutants $e$ as by-products. The total emission depends on the production scale $y$ and the treatment technology firms use:

$$e = E(i, y), \tag{2}$$

where $i = 1$ indicates the adoption of clean technology and $i = 0$ otherwise. The installation of the clean equipment incurs a fixed cost $Rk_E$, where we assume that the equipment is also rented from the market, just as the production capital $k$ is.23

Regulators.—We assume that the environmental authority monitors the adoption of clean technology by firms with probability $p$. When a firm using dirty technology gets inspected, we assume that a fraction $\xi$ of its total profits is confiscated by the regulating agency. As a result, firms using dirty technology lose a fraction $p\xi$ of their total profits in expectation. The confiscated profits are distributed to the household as lump-sum transfers, so they do not affect the decision problem of household members. This reduced-form way of modeling monitoring policy could, for instance, be rationalized by a mixed-strategy Nash equilibrium of a behind-the-scenes “monitoring game.”

The current industrial pollution management and control system in China consists of economic incentives and command-and-control instruments.24 The pollution levy system is the most widely used economic instrument in China. However, it has been widely documented that it places very limited constraints on the pollution emissions of firms because the penalty imposed is very low. Firms only have to pay for the pollutant discharges that exceed the national standard. The pollutant discharges are self-reported, and the truthfulness of the reported discharges is imperfectly examined by the regulators.25 Further, firms that discharge multiple pollutants, with more than one of them above the national standards, only have to pay for the one that leads to the highest penalty. We calculate from the CNEC the pollution fees levied on firms as a fraction of total labor compensation. We find that for firms with strictly positive emission fees, these fees account for only 0.06% (median) and 0.3% (mean) of labor compensation.

Therefore, in practice, the environmental agencies rely mostly on the command-and-control instruments. To implement the regulation, field inspections are done by the staff of local environmental agencies. At the firm level, field staff typically check the type of treatment equipment firms have installed and test the emission intensity of major pollutants. Firms that are found at fault during the field inspection are usually suspended from production for an extended period of time until the issues are resolved.26 In our model, the fraction $\xi$ of the profits confiscated is used to approximate these costs. Since according to Table 4, the treatment technology used by firms is highly

concavity in the utility function.

23 We choose to model the installation costs as a one-time fixed cost, as opposed to a fixed cost plus operating cost or a size-dependent fixed cost, because the latter two are not supported by empirical evidence. We also assume that the fixed cost is only associated with clean technology. Qualitatively, assuming that dirty technology also requires a fixed cost will not affect the properties of the model. It is equivalent to a decrease in the cost of clean technology, since what matters for the firm’s decision is the difference between the two costs. See Section B of the Online Appendix for further details.

24 See Chapter 5 of World Bank (2001) for a detailed description.

25 It is possible that this is a common problem for developing countries in general. For example, Duflo et al. (2013) conducted a field experiment in India and find that the emission reporting system in India is largely corrupted, with auditors systematically reporting plant emissions just below the standard.

26 See Dasgupta et al. (2001) for a case study of Zhenjiang.


correlated with the pollution intensity, we assume that the regulator in our model checks only the treatment technologies. Although the local environmental agencies also monitor the total amount of discharges, these regulations are usually implemented at a more aggregated level, in most cases at the provincial level. They are thus less relevant to the firm-level decision that we study here.27

B. Firm-level Distortions

Recent studies on the Chinese economy have documented large distortions at both the sector and firm level, which lead to sizable negative effects on aggregate productivity and output.28 Following the seminal approach developed by Hsieh and Klenow (2009, 2014), we model and estimate firm-level distortions using variations in the average products of capital and labor across firms. More specifically, if we let $\tau_{zi}$, $\tau_{ki}$ and $\tau_{li}$ be respectively the wedges firm $i$ faces on the product, capital, and labor markets, the profit maximization problem of firm $i$ is

$$\pi_i = \max_{k_i,\, l_i} \left\{ (1-\tau_{zi})\, z_i^{1-\gamma}\left(k_i^{\alpha} l_i^{1-\alpha}\right)^{\gamma} - (1+\tau_{ki})\, R k_i - (1+\tau_{li})\, W l_i \right\}.$$

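Using the first-order conditions, the average product of capital $\phi_k$, the average product of labor $\phi_l$, and the capital-labor ratio $\kappa$ can be expressed as

$$\phi_k = \frac{y}{k} = \frac{(1+\tau_{ki})\,R}{\alpha\gamma\,(1-\tau_{zi})}, \tag{3}$$

$$\phi_l = \frac{y}{l} = \frac{(1+\tau_{li})\,W}{(1-\alpha)\gamma\,(1-\tau_{zi})}, \tag{4}$$

$$\kappa = \frac{k}{l} = \frac{\alpha}{1-\alpha}\cdot\frac{(1+\tau_{li})\,W}{(1+\tau_{ki})\,R}. \tag{5}$$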

The above equations show that in the absence of any market friction ($\tau_z = \tau_k = \tau_l = 0$), $\phi_k$, $\phi_l$ and $\kappa$ should be equalized across all firms. Equations (3) and (4) say that firms that face higher distortions on the capital (labor) and/or product market will exhibit a higher average product of capital (labor). In addition, according to Equation (5), the capital-labor ratio increases with the size of the labor market wedge relative to the capital market wedge. Using firm-level data on total value of production, book value of capital stock and labor compensation from the CNEC, we calculate $z$, $\phi_k$, $\phi_l$ and $\kappa$ for each firm in our sample. Figure 3 shows, on log scales, the scatter plots of $\phi_k$, $\phi_l$ and $\kappa$ against firm-level productivity $z$ for the Paper industry.
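For readers who want the intermediate step, the first-order conditions behind Equations (3) to (5) can be written out as follows. This is a sketch of the standard derivation under the profit maximization problem stated above, with $y_i = z_i^{1-\gamma}(k_i^{\alpha} l_i^{1-\alpha})^{\gamma}$ denoting output before the product market wedge; it is not reproduced from the paper's appendix.

```latex
\begin{align*}
\frac{\partial \pi_i}{\partial k_i} = 0
  &\;\Longrightarrow\; (1-\tau_{zi})\,\alpha\gamma\,\frac{y_i}{k_i} = (1+\tau_{ki})\,R
  \;\Longrightarrow\; \phi_k = \frac{y_i}{k_i} = \frac{(1+\tau_{ki})\,R}{\alpha\gamma\,(1-\tau_{zi})}, \\[4pt]
\frac{\partial \pi_i}{\partial l_i} = 0
  &\;\Longrightarrow\; (1-\tau_{zi})\,(1-\alpha)\gamma\,\frac{y_i}{l_i} = (1+\tau_{li})\,W
  \;\Longrightarrow\; \phi_l = \frac{y_i}{l_i} = \frac{(1+\tau_{li})\,W}{(1-\alpha)\gamma\,(1-\tau_{zi})}, \\[4pt]
\kappa = \frac{k_i}{l_i} = \frac{\phi_l}{\phi_k}
  &= \frac{\alpha}{1-\alpha}\cdot\frac{(1+\tau_{li})\,W}{(1+\tau_{ki})\,R}.
\end{align*}
```

The product market wedge cancels in the ratio $\phi_l/\phi_k$, which is why the capital-labor ratio is informative only about the relative factor market wedges.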

Two patterns emerge from Figure 3. First, from the two upper panels, we see that both $\phi_k$ and $\phi_l$ are positively correlated with $z$, which suggests that more productive firms have higher average products of both capital and labor. Expressed in wedges, this means that both $(1+\tau_k)/(1-\tau_z)$ and $(1+\tau_l)/(1-\tau_z)$ are higher for more productive firms. If we assume $\gamma = 0.85$ following

27 Firm-level inspections in the U.S. also target mainly the adopted treatment technologies. See Becker and Henderson (2000) and the references therein for more details.

28 See Hsieh and Klenow (2009), Song, Storesletten and Zilibotti (2011), Brandt, Tombe and Zhu (2013), and Tombe and Zhu (2015), among others.


[Figure 3: three scatter-plot panels against Productivity (horizontal axis): Average Product of Capital (vertical axis: ARK), Average Product of Labor (vertical axis: ARL), and Capital-labor Ratio (vertical axis: K/L Ratio).]

FIGURE 3. FACTOR AND PRODUCT MARKET DISTORTIONS

Source: China National Economic Census. All panels are plotted in log scale. Lines are least squares fits.

Atkeson and Kehoe (2005), then the elasticities of $\phi_k$ and $\phi_l$ with respect to $z$ are both about 0.25, meaning that a 1% increase in firm productivity is associated with a 0.25% increase in the average revenue product of factor inputs (a doubling of productivity thus raises it by roughly 19%, since $2^{0.25} \approx 1.19$).29 This could be because more productive firms are subject to higher factor market distortions, higher product market distortions, or both. Second, from the lower panel, we see that the capital-labor ratio is at best weakly negatively correlated with $z$. The least squares estimate of the elasticity is

29 Generally speaking, the estimated elasticity depends on the value of $\gamma$, since $z$ is calculated for a given $\gamma$. Because we use a Lucas (1978) model, while Hsieh and Klenow (2014) and Bento and Restuccia (2016) both use the Melitz (2003) model, our results are not directly comparable with theirs. Yet broadly speaking, our estimates are quantitatively consistent with theirs. Using comprehensive micro data, Hsieh and Klenow (2014) find that the elasticity is 0.1 for the U.S. and between 0.5 and 0.6 for India and Mexico. Using data from the World Bank’s Enterprise Surveys, Bento and Restuccia (2016) find that the cross-country evidence suggests that the elasticities range from 0.22 to 0.74, averaging 0.52.


−0.0057, and the $R^2$ is only 0.053. This indicates that the relative wedge firms face on the capital and labor markets does not depend strongly on the idiosyncratic productivity of firms, which in the context of our model implies $1+\tau_k \approx 1+\tau_l$. Since we cannot separately identify the three wedges, for simplicity we assume $\tau_k = \tau_l = 0$ and attribute all the variation in the average products of factors to wedges in the product market $\tau_z$. Whether we assume $\tau_k = \tau_l = 0$ or alternatively $\tau_z = 0$ will not affect our results, but the interpretations need to be changed accordingly.30

In the spirit of Adamopoulos and Restuccia (2014), we implement these idiosyncratic wedges in the model by positing a generic “tax” function that specifies the wedge as a function of the firm’s productivity $z$:31

$$\tau_z = \max\left\{0,\; 1 - \phi_0 z^{\phi_1}\right\}. \tag{6}$$

We assume the taxes collected are returned to the household as lump-sum transfers. Anticipating the benchmark calibration in the next section, the wedge function specified in Equation (6) is increasing and concave in $z$, with lower and upper bounds of 0 and 1 respectively. The shape of the function captures the size dependency of the product market distortions, where the wedges are higher for larger firms.32

The idiosyncratic $\tau_z$ is meant to capture implicitly a variety of policies and institutions that reallocate factors away from large productive firms. For example, it is consistent with the stories that large productive firms face transportation costs, additional management costs, or local protectionism and trade barriers that impede the inter-regional flow of goods when delivering their products to a wider range of areas [Young (2000), Hsieh and Klenow (2014), and Tombe and Zhu (2015)]. It could also be that smaller firms are subject to preferential tax treatment.33 It is of great interest to follow the so-called direct approach in Restuccia and Rogerson (2013) and study specific policies and regulations that we observe in the economy [Adamopoulos and Restuccia (2014)]. We leave this important task for future work.

30 For example, we cannot distinguish between the data generating process we use here and another process where $\tau_k$ and $\tau_l$ increase simultaneously while $\tau_z$ is equal to zero. Also notice that when estimating the elasticity, since $y$, $k$ and $l$ enter the regression on both sides, measurement error in any of these variables could drive a spurious correlation. We cannot rule out this possibility completely. However, we argue that this does not seem to be the case here. In particular, if $y$ is measured with extreme measurement error, the regression coefficient of $\phi_k$ on $z$ will be $1-\gamma$. Similarly, if instead $k$ is measured with extreme measurement error, the regression coefficient will be $(1-\gamma)/\gamma$. We calculate $\phi_k$ and $z$ using different values of $\gamma$, and the regression coefficients do not vary as predicted by either case.

31 Papers that adopt similar assumptions include Hsieh and Klenow (2014), Buera and Shin (2016 forthcoming), and Bento and Restuccia (2016).

32 There is one difference between Equation (6) and the tax function used by Adamopoulos and Restuccia (2014). To model the size dependency, the authors use an exponential function in their specification, as opposed to the power function here. We choose the power function because it is consistent with the log-linearity of $\phi_k$ ($\phi_l$) in $z$, while the exponential function implies a tax scheme that increases much more sharply with productivity than its empirical counterpart in our case.

33 For instance, the value added tax rate for firms with an annual value of industrial output below CNY 1 million is 3%, while firms with a production scale larger than CNY 1 million are subject to a 13% tax rate.


C. Firm’s Problem

Entrepreneurs first decide on which type of treatment technology to use and then on how much to produce. The business profit of a type-$z$ entrepreneur, $\pi(z)$, is the maximum of the profit from producing with dirty technology, $\pi_0(z)$, and that from producing with clean technology, $\pi_1(z)$:

$$\pi(z) = \max\left\{\pi_0(z),\; \pi_1(z)\right\}, \tag{7}$$

where the subscript indicates the treatment equipment choice decision.

Firms using clean technology are not subject to environmental penalties, hence their profits are just revenues less costs:

$$\pi_1(z) = \max_{k,\,n} \left\{ (1-\tau_z)\, z^{1-\gamma}\left(k^{\alpha} n^{1-\alpha}\right)^{\gamma} - Wn - R(k + k_E) \right\}. \tag{8}$$

Notice that here the treatment equipment $k_E$ cannot be used to produce the final product. This assumption is based on the empirical finding by Shadbegian and Gray (2005).

On the other hand, firms using dirty technology will be inspected by the environmental authoritywith probability p. Under such circumstances, a fraction ξ of their annual profits will be confiscated.Hence, the profit function is

πC0 (z) = (1− ξ)[(1− τz)z1−γ(kαn1−α)γ −Wn−Rk

],

where the superscript C indicates “caught.” While if the firm succeeds in evading the inspection,the profit function is

πE0 (z) = (1− τz)z1−γ(kαn1−α)γ −Wn−Rk,

where the superscript E indicates “evaded.” Because we assume perfect risk sharing within thehousehold, these entrepreneurs will not have precautionary motives and will simply maximize theexpected profits over πC0 and πE0 :

π0(z) = maxk,n

(1− p)πE0 (z) + pπC0 (z)

.

Some algebra yields

(9) π0(z) = maxk,n

(1− pξ)

[(1− τz)z1−γ(kαn1−α)γ −Wn−Rk

],

where pξ is the fraction of profits that is confiscated in expectation for a firm using dirty technology.
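The firm's problem in Equations (8) and (9) has a closed-form solution once $W$, $R$ and $\tau_z$ are taken as given, because the first-order conditions imply that labor and production capital absorb a share $\gamma$ of after-tax revenue. The sketch below is only an illustration under that partial-equilibrium assumption; the function names and all parameter values are hypothetical and are not taken from the paper.

```python
def input_choices(z, W, R, alpha, gamma, tau=0.0):
    """Profit-maximizing capital k and labor n from the first-order conditions."""
    c = (gamma * alpha / R) ** alpha * (gamma * (1 - alpha) / W) ** (1 - alpha)
    # After-tax revenue at the optimum: solves Y = (1 - tau) z^(1-gamma) (c Y)^gamma.
    revenue = ((1 - tau) * z ** (1 - gamma)) ** (1 / (1 - gamma)) * c ** (gamma / (1 - gamma))
    k = gamma * alpha * revenue / R
    n = gamma * (1 - alpha) * revenue / W
    return k, n


def operating_profit(z, k, n, W, R, alpha, gamma, tau=0.0):
    """After-tax revenue minus labor and production-capital costs."""
    return (1 - tau) * z ** (1 - gamma) * (k ** alpha * n ** (1 - alpha)) ** gamma - W * n - R * k


def technology_profits(z, W, R, alpha, gamma, k_E, p, xi, tau=0.0):
    """Return (pi_0, pi_1): expected dirty-technology and clean-technology profits."""
    k, n = input_choices(z, W, R, alpha, gamma, tau)
    base = operating_profit(z, k, n, W, R, alpha, gamma, tau)
    pi_caught = (1 - xi) * base                      # inspected: a fraction xi is confiscated
    pi_evaded = base                                 # not inspected
    pi_dirty = (1 - p) * pi_evaded + p * pi_caught   # equals (1 - p*xi) * base, as in (9)
    pi_clean = base - R * k_E                        # treatment equipment is a fixed cost, as in (8)
    return pi_dirty, pi_clean


if __name__ == "__main__":
    params = dict(W=1.0, R=0.2, alpha=0.5, gamma=0.9)   # hypothetical values
    print(technology_profits(2.0, k_E=0.1, p=0.5, xi=0.4, **params))
```

Because the penalty scales operating profit and $k_E$ enters only as a fixed cost, the same $k$ and $n$ are optimal under either technology, which is why one helper serves both cases.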

D. Size-Dependent Distortions and Technology Adoption

To clarify the basic mechanics of the model, in this section we analyze the firm’s optimization problem when $R$ and $W$ are given exogenously. We prove two results. First, we show that there exists a threshold $\hat{z}$ such that firms with $z > \hat{z}$ adopt clean technology, while firms with $z \le \hat{z}$ do not. Second, if we denote this threshold in environments with and without size-dependent distortions by $z_f$ and $z_n$ respectively, we show that $z_f > z_n$. The first result says that there are returns to scale embedded in the clean technology that are only exploited when firms are large enough. The second result says that when size-dependent distortions are introduced, a positive measure of firms that would adopt clean technology in the absence of distortions no longer have the profit margin to benefit from it, and hence choose to enter the market with dirty technology. Throughout, we assume $0 < \alpha < 1$, $0 < \gamma < 1$, $\phi_0 = 1$ and $1 - \gamma - \phi_1 > 0$. We impose the last inequality because the tax specified in (6) is imposed on firm-level TFP $z$. In order for the benefits of higher talent $z$ (the elasticity of profits to TFP is $1-\gamma$) to always outweigh the costs (the elasticity of the tax to TFP is $\phi_1$), $1 - \gamma - \phi_1 > 0$ must be satisfied. All proofs are left to the appendix.

Lemma 1 characterizes firms’ profit functions in the absence of size-dependent distortions.

Lemma 1. In an economy with no size-dependent distortions, $\pi_0(z)$ and $\pi_1(z)$ are both increasing and linear with respect to $z$. In addition, the slope of $\pi_1(z)$ is steeper than that of $\pi_0(z)$:

$$\frac{\partial \pi_0(z)}{\partial z} = (1-p\xi)\,\frac{\partial \pi_1(z)}{\partial z}, \quad \forall z \in Z. \tag{10}$$

Lemma 1 highlights the core trade-off of adopting clean technology in our model. Although the up-front fixed costs shift the overall profit function down by $Rk_E$, the profits of firms with clean technology will not be confiscated by the regulators. With constant elasticity between capital and labor, the optimal capital-to-labor ratio is constant in the absence of factor market frictions, so entrepreneurs reap economic rents from managerial talent $z$. These economic rents increase linearly in $z$, because we assume a constant returns to scale production function.

Since the tax in (6) is size-dependent in the sense that more talented entrepreneurs are subject to higher distortions, it can be shown that in an economy with size-dependent distortions, both $\pi_0(z)$ and $\pi_1(z)$ are concave.

Corollary 1. Suppose the size-dependent distortions are specified as $\max\{0,\ 1 - z^{\phi_1}\}$ with $1 - \gamma + \phi_1 > 0$. Then $\pi_0(z)$ and $\pi_1(z)$ are both increasing and concave with respect to $z$. In addition, the slope of $\pi_1(z)$ is steeper than that of $\pi_0(z)$:

$$\frac{\partial \pi_0(z)}{\partial z} = (1-p\xi)\,\frac{\partial \pi_1(z)}{\partial z}, \quad \forall z \in Z.$$

On the other hand, if the taxes are uniformly imposed, meaning that $\tau_{z_i}$ is the same for all $i$, then $\pi_0(z)$ and $\pi_1(z)$ will remain linear.

Since the wage income associated with being a worker is fixed at $W$, the monotonicity of the profit functions implies that there is a threshold $\underline{z}$ such that all household members with talent higher than $\underline{z}$ choose to become entrepreneurs. Put differently, household members choose their occupations according to their comparative advantages. This is the standard result from the Lucas model. We summarize it below in Proposition 1.

Proposition 1. There exists a unique threshold $\underline{z}$ such that all household members with $z < \underline{z}$ choose to be workers and those with $z \ge \underline{z}$ become entrepreneurs. Further, $\underline{z}$ is the solution of $W = \pi(\underline{z})$.


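[Figure 4 plots profits $\pi$ against talent $z$. It shows the dirty- and clean-technology profit functions without distortions ($\pi_0^n$, $\pi_1^n$) and with distortions ($\pi_0^f$, $\pi_1^f$), the intercept $-Rk_E$ of the clean-technology profit functions, the adoption thresholds $z_n$ and $z_f$, and the measure $\Delta z(\tau) = G(z_f) - G(z_n)$.]

FIGURE 4. THE EFFECT OF SIZE-DEPENDENT DISTORTIONS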

Finally, Proposition 2 summarizes the main result of this section: larger firms adopt clean technology and size-dependent distortions impede technology upgrade.

Proposition 2. Given $k_E$, $W$, $\tau_z$ and $R$, there exist unique thresholds $z_n$ and $z_f$ such that:

(i) In the economy with no size-dependent distortions, entrepreneurs with $z \le z_n$ produce using dirty technology while those with $z > z_n$ produce using clean technology.

(ii) In the economy with size-dependent distortions, entrepreneurs with $z \le z_f$ produce using dirty technology while those with $z > z_f$ produce using clean technology.

(iii) $z_n < z_f$, that is, size-dependent distortions impede technology upgrade.

A graphical illustration of Proposition 2 is shown in Figure 4. There are four profit functions in the figure, $\pi_0^n$, $\pi_0^f$, $\pi_1^n$ and $\pi_1^f$, where superscripts $n$ and $f$ indicate the economy without and with size-dependent distortions, and subscripts 0 and 1 represent firms using dirty or clean technology respectively. Notice that although the installation cost $Rk_E$ is fixed, the expected loss $p\xi\pi_0^E(z)$ is increasing in $z$. Therefore, although for firms with lower $z$ the fixed installation costs of clean technology are not justified, for those with higher $z$ they will eventually pay off. Here the elasticity of profits to managerial talent is $1-\gamma$ when there are no distortions, which is larger than the elasticity $1-\gamma-\phi_1$ in a market with distortions. Since the distortions decrease the rate at which profits increase with $z$ by $\phi_1$, for some firms the “pre-tax” profits make it profitable to adopt the clean technology while the “after-tax” profits do not. The ultimate result is that a positive measure $\Delta z(\tau) = G(z_f) - G(z_n)$ of firms that would produce using clean technology in an environment with no distortions now produce using dirty technology when size-dependent distortions exist.
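The threshold comparison in Proposition 2 can be illustrated numerically. The sketch below holds $W$ and $R$ fixed, as in this section, and finds the talent level at which $\pi_1(z) - \pi_0(z)$ changes sign. Equation (6) is not restated here, so the code assumes, purely for illustration, that the after-tax share is $1 - \tau_z = \min\{1, z^{-\phi_1}\}$ with $\phi_0 = 1$, which delivers the profit elasticity $1 - \gamma - \phi_1$ discussed above. All parameter values and function names are hypothetical.

```python
def adoption_gain(z, W, R, alpha, gamma, k_E, p_xi, phi1=0.0):
    """pi_1(z) - pi_0(z) = p*xi * (operating profit) - R * k_E at the optimal k and n."""
    keep = min(1.0, z ** (-phi1))   # ASSUMED schedule for 1 - tau_z (not the paper's Equation (6))
    c = (gamma * alpha / R) ** alpha * (gamma * (1 - alpha) / W) ** (1 - alpha)
    revenue = (keep * z ** (1 - gamma)) ** (1 / (1 - gamma)) * c ** (gamma / (1 - gamma))
    return p_xi * (1 - gamma) * revenue - R * k_E


def adoption_threshold(lo=1e-9, hi=1e9, **params):
    """Bisection on a log scale for the z at which adoption_gain crosses zero."""
    if not (adoption_gain(lo, **params) < 0 < adoption_gain(hi, **params)):
        raise ValueError("no sign change on [lo, hi]")
    for _ in range(200):
        mid = (lo * hi) ** 0.5      # geometric midpoint
        if adoption_gain(mid, **params) < 0:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    base = dict(W=1.0, R=0.2, alpha=0.5, gamma=0.9, k_E=5.0, p_xi=0.2)   # hypothetical values
    z_n = adoption_threshold(phi1=0.0, **base)     # no size-dependent distortions
    z_f = adoption_threshold(phi1=0.03, **base)    # with size-dependent distortions
    print(z_n, z_f)
```

Under this assumed schedule the two thresholds are linked by $z_f = z_n^{(1-\gamma)/(1-\gamma-\phi_1)}$ whenever $z_n > 1$, so the distorted threshold is strictly higher, in line with part (iii) of the proposition.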

E. Steady State Equilibrium

In this section, we specify the household problem and define the general equilibrium to close the model. In particular, we focus on the case of steady state equilibrium.

The household engages in a simple consumption-saving problem:

$$\max_{\{C_t,\,K_{t+1}\}} \sum_{t=0}^{\infty} \beta^t U(C_t) \tag{11}$$

$$\text{s.t.}\quad C_t + K_{t+1} - (1-\delta)K_t = I_t,$$

where $C_t$ is consumption, $K_t$ is aggregate capital, $\beta$ is the discount factor, $\delta$ is the depreciation rate, and $I_t$ is household income, which we will specify in detail shortly.34 The solution to (11) is the standard intertemporal Euler equation

$$U'(C_t) = \beta U'(C_{t+1})(1 + R_{t+1} - \delta), \tag{12}$$

which pins down the equilibrium interest rate.

Household income $I_t$ comes from four sources: capital rental income, wage income, firms’ profits, and lump-sum transfers from taxes $\tau_z$ and environmental penalties $p\xi$. To characterize $I_t$, we need some additional notation. We denote by $Z_0 = \{z \in [\underline{z}_t, \bar{z}] \mid \pi_0(z) \ge \pi_1(z)\}$ the set of firms operating under dirty technology, and by $Z_1 = \{z \in [\underline{z}_t, \bar{z}] \mid \pi_0(z) < \pi_1(z)\}$ the set of firms using clean technology. Notice that for the intermediate case where $0 < \underline{z} < \hat{z} < \bar{z}$ (with $\underline{z}$ the occupational threshold and $\hat{z}$ the technology-adoption threshold), Proposition 2 implies $Z_0 = [\underline{z}, \hat{z}]$ and $Z_1 = (\hat{z}, \bar{z}]$. If we let $T$ denote the transfers, Equation (9) and Proposition 1 then yield:

$$I_t = R_t K_t + W_t G(\underline{z}_t) + \int_{z \in Z_0} \pi_0(z)\,dG(z) + \int_{z \in Z_1} \pi_1(z)\,dG(z) + T,$$

where the five terms are respectively capital rental income, wage income, profits from dirty and clean firms, and government transfers. A law of large numbers guarantees that the ex ante probability of being inspected equals the ex post fraction of firms that actually get inspected.
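Returning to the Euler equation (12): in a steady state consumption is constant, so the condition reduces to $1 = \beta(1 + R - \delta)$ and the rental rate follows directly from $\beta$ and $\delta$. A minimal sketch, using the benchmark values of $\beta$ and $\delta$ reported in Table 6 (the function name is ours and purely illustrative):

```python
def steady_state_rental_rate(beta, delta):
    """Rental rate R solving 1 = beta * (1 + R - delta), i.e. Equation (12) with C_t = C_{t+1}."""
    return 1.0 / beta - 1.0 + delta


if __name__ == "__main__":
    R = steady_state_rental_rate(beta=0.875, delta=0.10)
    print(f"steady-state R = {R:.4f}")   # about 0.2429
```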

Now we are ready to define the equilibrium. Let $Y$ be aggregate output and $E$ be aggregate pollution. The steady state equilibrium of the model is defined as follows.

Definition 1. A steady state equilibrium in this model consists of prices $\{W, R\}$, allocations $\{C, K, Y\}$, firms’ policy functions $\{k(z), n(z), y(z), \pi(z)\}$, the household’s occupational choice $\underline{z}$, firms’ technology choice $\hat{z}$, and aggregate pollutant emissions $E$ such that:

34 We assume here that the household values only consumption and not environmental quality. This assumption is innocuous in the competitive equilibrium, since an individual household member has no control over aggregate environmental quality. However, the assumption will affect the results if a planner’s problem is studied, or if we want to evaluate the welfare effect of different policies.


(i) Given factor prices $\{W, R\}$, $\{C, K, \underline{z}\}$ solve the household optimization problem;

(ii) Given factor prices $\{W, R\}$, $\{k(z), n(z), y(z), \pi(z)\}$ and $\hat{z}$ solve firms’ optimization problems;

(iii) Factor prices $\{W, R\}$ clear all markets:

• Labor Market:

$$G(\underline{z}) = \int_{\underline{z}}^{\bar{z}} n(z)\,dG(z),$$

• Capital Market:

$$K = \int_{\underline{z}}^{\bar{z}} k(z)\,dG(z) + k_E \int_{z \in Z_1} dG(z),$$

• Product Market:

$$C + K - (1-\delta)K = \int_{\underline{z}}^{\bar{z}} y(z)\,dG(z);$$

(iv) The aggregate pollutant emissions are

$$E = \int_{z \in Z_0} e(0, y(z))\,dG(z) + \int_{z \in Z_1} e(1, y(z))\,dG(z).$$

III. Calibration

We calibrate our model to the Chinese data. The model period is set to be one year.

Calibration.—Motivated by the empirical evidence in Section I, we assume that the pollution intensity function of firms with treatment technology $i$ and production level $y$ is log-linear:

$$\log\frac{e}{y} = \psi_0^{(i)} + \psi_1^{(i)} \log y. \tag{13}$$

This specification implies that conditional on the treatment technology adopted, there is still “within group” intensity reduction as production scale increases. Here $\psi_1^{(i)}$ captures, for instance, the intensity reduction during the production stage mentioned in Section I.C. Equation (13) implies that the actual emissions are

$$e = E(i, y) = \exp\left(\psi_0^{(i)}\right) y^{1+\psi_1^{(i)}}. \tag{14}$$

Because the firm size distribution in the model is affected by both the talent distribution $G(\cdot)$ and the product market frictions $\tau_z$, our identification assumption is that the parameters governing $\tau_z$ [$\phi_0$ and $\phi_1$ in Equation (6)] are calibrated according to the empirical regularities in Section II.B (explained in detail below) and, given $\tau_z$, $G(\cdot)$ is set to match the firm size distributions in China. We choose the pooled polluting industries as our calibration targets. We ask the model to match two aspects of the firm size distribution: the total number and the share of employment of firms of a given size. It is well documented in the literature that the commonly used log-normal distribution does a reasonably good job of matching the distribution of the bulk of small and medium-sized firms, but does not generate the concentration of employment that we observe in Figure 2. The heavy right tail is crucial to our evaluation because these are the firms that are producing with clean technology. Since $\tau_z$ is levied based on the productivity $z$, we assume that the distribution of the “after-tax” productivity $z' = (1-\tau_z)z^{1-\gamma}$ is a combination of two components. The first is a log-normal distribution with mean $\mu$, standard deviation $\sigma$, and total probability mass $1 - g_{\max}$ that accounts for the bulk of small and medium firms. The second is an atom with value $z'_{\max}$ and measure $g_{\max}$, which accounts for the very large firms.35 The “before-tax” productivity $z$ is then calculated by

$$z = \left(z'\,\phi_0^{1/(\gamma-1)}\right)^{\frac{1-\gamma}{1-\gamma+\phi_1}},$$

which gives us $G(z)$.

Therefore, we are left with a total of 17 parameters to calibrate: the discount factor $\beta$, production technology parameters $\{A, \delta, \alpha, \gamma\}$, treatment technology parameters $\{\psi_0^{(0)}, \psi_1^{(0)}, \psi_0^{(1)}, \psi_1^{(1)}, k_E, p\xi\}$, size-dependent distortions $\{\phi_0, \phi_1\}$ and distributional parameters $\{\mu, \sigma, z'_{\max}, g_{\max}\}$. The general strategy of our calibration involves assigning values to some parameters based on a priori information in the data, and calibrating the rest jointly such that the distance between the moments from the model and the data is minimized.
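The two-component distribution of after-tax productivity described above (a log-normal body plus an atom for very large firms) can be sketched by simulation. This is only an illustration: it assumes that $\mu$ and $\sigma$ are the mean and standard deviation of $\log z'$, which the text does not state explicitly, and it uses the benchmark values from Table 6 as defaults. The function names are hypothetical.

```python
import random


def draw_after_tax_productivity(rng, mu, sigma, z_max, g_max):
    """One draw of z': the atom z_max with probability g_max, otherwise a log-normal draw."""
    if rng.random() < g_max:
        return z_max
    return rng.lognormvariate(mu, sigma)   # ASSUMES log z' ~ Normal(mu, sigma^2)


def simulate(n_draws, seed=0, mu=-2.4567, sigma=4.0020, z_max=10820.4, g_max=0.00048):
    """Simulate n_draws values of z' from the mixture (defaults: Table 6 values)."""
    rng = random.Random(seed)
    return [draw_after_tax_productivity(rng, mu, sigma, z_max, g_max) for _ in range(n_draws)]


if __name__ == "__main__":
    draws = simulate(100_000)
    share_at_atom = sum(d == 10820.4 for d in draws) / len(draws)
    print(f"share of draws at the atom: {share_at_atom:.5f}")
```

A simulation is used here only for readability; the paper itself approximates $G(z)$ on a grid.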

Eight of the seventeen parameters can be determined exogenously. We set the depreciation rate $\delta$ to 10% [Song, Storesletten and Zilibotti (2011)]. To get estimates of $\{\psi_0^{(0)}, \psi_1^{(0)}, \psi_0^{(1)}, \psi_1^{(1)}\}$, we repeat the exercises in Section I.B for firms using physical and biological equipment separately. The estimates are $\psi_0^{(0)} = -3.5795$, $\psi_1^{(0)} = -0.4149$, $\psi_0^{(1)} = -4.4270$ and $\psi_1^{(1)} = -0.3410$. In the context of our model, these estimates suggest that on average, for two firms with the same level of production but different treatment technology, the firm that uses clean technology discharges 40% to 60% less pollutants than the firm equipped with dirty technology. We use information on the average products of capital and labor to calibrate the tax function. Equation (3) suggests that the elasticity of $\phi_k$ to $1-\tau_z$ is equal to unity. Therefore $\phi_1$ is equal to the elasticity of $\phi_k$ to $z$. We therefore calculate $\phi_k$ and $z$ according to Section II.B, with $R = 0.1$ [Hsieh and Klenow (2009)] and the same $\gamma$ used later when we calibrate the model to match the firm size and employment distributions. We then run the regression

$$\log \phi_{ki} = \phi_0 + \phi_1 \log z_i + \varepsilon_{zi},$$

which gives us the value $\phi_1 = 0.03$.36 Given $\phi_1$, we then calibrate $\phi_0$ such that the average tax burden in the economy equals the value added tax imposed on Chinese manufacturing firms in the data, which is 13%. This gives us the value $\phi_0 = 1.15$.37 We set $A = 1$ as a normalization.
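To see what the estimated intensity parameters imply, the sketch below evaluates Equations (13) and (14) with the reported $\psi$ values and compares clean and dirty technology at the same output level. The output levels used are hypothetical, since the units of $y$ in the underlying regressions are not restated here, and the function names are ours.

```python
import math

# (psi_0, psi_1) by treatment technology: 0 = dirty (physical), 1 = clean (biological)
PSI = {0: (-3.5795, -0.4149), 1: (-4.4270, -0.3410)}


def intensity(i, y):
    """Pollution intensity e / y from Equation (13)."""
    psi0, psi1 = PSI[i]
    return math.exp(psi0 + psi1 * math.log(y))


def emissions(i, y):
    """Emissions e from Equation (14): exp(psi_0) * y ** (1 + psi_1)."""
    return intensity(i, y) * y


def clean_reduction(y):
    """Fraction by which clean technology lowers emissions at a given output level y."""
    return 1.0 - intensity(1, y) / intensity(0, y)


if __name__ == "__main__":
    for y in (1.0, 10.0, 100.0):   # hypothetical output levels
        print(y, round(clean_reduction(y), 3))
```

Because $\psi_1^{(1)} > \psi_1^{(0)}$, the proportional gap narrows as $y$ grows: it is about 57% at $y = 1$ and about 40% at $y = 100$ in these units, which is consistent with the 40% to 60% range quoted above.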

35 This strategy follows Guner, Ventura and Xu (2008) and is quite popular among macroeconomic studies on wealth distribution; see for example Castañeda, Díaz-Giménez and Ríos-Rull (2003).

36 The elasticity here is lower than what we have in Section II.B, because here $z$ is calculated using $\gamma = 0.93$, while previously we set $\gamma = 0.85$.

37 In general, it is very difficult to estimate the average distortions in an economy, because many factors that lead to resource misallocation are not observable [Restuccia and Rogerson (2013)]. Our choice of the average tax rate as the calibration target follows Bento and Restuccia (2016). As is shown in Proposition 2, the gains in output (and capital and consumption as well) from eliminating the size-dependent distortions are increasing in the average level of the distortions, hence targeting a higher average “tax” rate increases such gains monotonically, which will not affect the results qualitatively. However, quantitatively we cannot rule out the possibility that the removal of extremely large distortions leads to a larger increase in output than the decrease in pollution intensity, thereby raising aggregate pollution.

The remaining parameters are calibrated jointly. The calibration involves two layers: an outer layer loops over the parameterization of $G(z)$ and an inner layer solves the model given $G(z)$. In the inner layer, we first approximate $G(z)$ with 5,000 grid points. We then choose $\beta$ and $\alpha$ to match respectively the capital-output ratio of 1.65 and the capital share of 0.5 in China [Bai, Hsieh and Qian (2006)]. We set $p\xi$ and $k_E$ such that total treatment equipment investment is equal to 1% of total output, and the fraction of firms adopting clean technologies equals the empirically observed level of 57%. The value of $\gamma$ is set to minimize the difference between the numbers of firms that fall in each bin of the employment and firm size distributions generated by the model and those in the data. More specifically, if we let $s_q$ and $\hat{s}_q$ be the number of firms in each bin $q$ (ten in total) calculated from the data and from the model respectively, $\gamma^*$ solves

$$\gamma^* = \arg\min_{\gamma} \sum_{q=1}^{10} \left(s_q - \hat{s}_q\right)^2. \tag{15}$$

Notice that for each combination of $\{\mu, \sigma, z'_{\max}, g_{\max}\}$, there is one corresponding $\gamma^*$. Therefore, in the outer layer, we use a multi-dimensional search process to choose the combination of $\{\mu, \sigma, z'_{\max}, g_{\max}\}$ that minimizes the minimum distances from the inner layer. Furthermore, we require that $\gamma^*$ is the same one used in calculating $G(z)$. The model parameters along with their targets and calibrated values are listed in Table 6.

Discussion.—The calibrated model matches very well the capital share, the capital-output ratio, the fraction of total treatment equipment expenditure in total output, and the fraction of firms adopting clean technology. The calibrated value of returns to scale $\gamma$ lies within the empirically estimated range.38 The calibrated value of $p\xi$ should be interpreted as the implied “tax” rate for using dirty technology, which combines the effect of many disparate and overlapping policies and adds up to 20.5% of a firm’s annual output value.

Figure 5 shows graphically the firm size (left panel) and employment share (right panel) distributions in the model and in the data. Overall the model does a reasonable job of matching the two distributions given that there are five degrees of freedom. The mean (59.27) and median (22.95) of the firm size distribution, which are not directly targeted in the calibration, match well with their empirical counterparts, which equal 59.05 and 20 respectively. The challenges of calibrating the model to simultaneously match the two aspects of the firm size distribution are as follows. In order to create the concentration of employment among large firms, we need not only very talented entrepreneurs who are willing to hire a lot of employees, but also a wage low enough that these entrepreneurs can actually hire the desired number of workers. This implies that the profits for small firms are also low. Furthermore, we also need the returns to scale ($\gamma$) to be high. A higher $\gamma$ raises the factor demand from the large firms, which pushes up factor prices and drives out small firms. The latter jeopardizes the fit of the number of small firms in the model. The performance of our model is thus the compromise of these two forces.39

38 The values previously used in the macro literature range from 0.85 [Atkeson and Kehoe (2005)] at the lower end to 0.95 [Bartelsman, Haltiwanger and Scarpetta (2013)] at the upper end. Estimates from micro-level data yield similar results; for example, Olley and Pakes (1996) estimated the value to be between 0.8 and 0.9 for the U.S. telecommunications equipment industry, depending on the particular econometric specification.

TABLE 6—PARAMETERIZATION

Group         Parameter            Value        Target
Production    $A$                  1            Normalization
              $\delta$             0.1000       Depreciation Rate
              $\alpha$             0.5376       Capital Share 0.5
              $\gamma$             0.9300       Size Distribution†
Treatment     $\psi_0^{(0)}$       −3.5795      Physical Intensity-output Elasticity
              $\psi_1^{(0)}$       −0.4149
              $\psi_0^{(1)}$       −4.4270      Biological Intensity-output Elasticity
              $\psi_1^{(1)}$       −0.3410
              $k_E$                4.1500       Envir. Capital-output Ratio 1%
              $p\xi$               0.205        Frac. Firms Use Bio 57%
Distortions   $\phi_0$             1.15         Average Value Added Tax 13%
              $\phi_1$             0.03         Avg. Factor Prod.-Prod. Elasticity
Preference    $\beta$              0.8750       Capital-output Ratio 1.65
Talents       $\mu$                −2.4567      Size Distribution†
              $\sigma$             4.0020
              $z'_{\max}$          10820.4
              $g_{\max}$           0.00048

† Note: Jointly calibrated.
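The distance criterion in Equation (15) and the inner-layer choice of $\gamma$ can be sketched as follows. The sketch reads the ten bins as the five size groups of Figure 5 taken once for the firm size distribution and once for the employment distribution, and it compares shares rather than raw counts; both readings are assumptions. The callable `simulate_sizes` is a hypothetical stand-in for solving the model at a given $\gamma$.

```python
SIZE_GROUP_LOWER_BOUNDS = [1, 20, 50, 100, 400]   # groups 1-19, 20-49, 50-99, 100-399, 400+


def size_moments(firm_sizes):
    """Ten moments: share of firms, then share of employment, in each size group."""
    firms = [0.0] * 5
    workers = [0.0] * 5
    for size in firm_sizes:
        group = max(sum(size >= lb for lb in SIZE_GROUP_LOWER_BOUNDS) - 1, 0)
        firms[group] += 1
        workers[group] += size
    n_firms, n_workers = sum(firms), sum(workers)
    return [f / n_firms for f in firms] + [w / n_workers for w in workers]


def distance(data_moments, model_moments):
    """Sum of squared differences between data and model moments, as in Equation (15)."""
    return sum((s - s_hat) ** 2 for s, s_hat in zip(data_moments, model_moments))


def best_gamma(candidates, data_moments, simulate_sizes):
    """Pick the candidate gamma with the smallest distance (simulate_sizes is HYPOTHETICAL)."""
    return min(candidates,
               key=lambda g: distance(data_moments, size_moments(simulate_sizes(g))))
```

In the outer layer described above, a search over $\{\mu, \sigma, z'_{\max}, g_{\max}\}$ would wrap a call such as `best_gamma` and keep the combination with the smallest minimized distance.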

[Figure 5 has two bar-chart panels comparing the model with Chinese data across firm size groups (1–19, 20–49, 50–99, 100–399 and 400+ employees). Left panel: “Firm Size Distribution: Benchmark” (vertical axis: fraction of firms). Right panel: “Employment Share Distribution: Benchmark” (vertical axis: fraction of workers).]

FIGURE 5. MODEL FIT OF THE BENCHMARK CALIBRATION

The identification assumption of our benchmark calibration is that given taxes $\tau_z$, the talent distribution is identified by the size distribution in the data. Another commonly used calibration strategy in the literature takes the U.S. as an undistorted economy (which means $\tau_z = 0$ in our model) and calibrates $G(z)$ by matching the size distribution from the tax-free economy to the U.S. data. Taxes are then introduced such that certain moments of the size distribution of the country of interest are matched.40 If we calibrated our model in this way, the underlying assumption would be that the talent distribution of entrepreneurs in China is the same as that in the U.S., and that all the difference between the size distributions of China and the U.S. is due to the taxes.41 From our perspective, both strategies have their strengths and weaknesses, and compromise with data limitations in different ways. The quantitative results of our paper are largely driven by the size and shape of $\tau_z$. Therefore, as long as alternative calibrations imply tax schemes that are similar to what we use in our benchmark calibration, the quantitative aspects of our results will hold as well.42 We choose to calibrate our model to the Chinese economy directly because better empirical evidence (Section II.B) is available to us.

39 Models where these two forces are not very antagonistic to each other usually have a much better fit. For instance, Guner, Ventura and Xu (2008) study the whole U.S. business sector, which has a narrower employment span and lighter employment concentration. In particular, the largest group they target is firms with more than 100 employees. In contrast, the largest group we target is firms with 400+ employees, which is significantly larger. In another paper, Adamopoulos and Restuccia (2014) assume away the selection mechanism in the model which, put differently, mutes the general equilibrium feedback through wages. This relaxes the restriction on the average talents considerably.

40 See for example Guner, Ventura and Xu (2008), Bartelsman, Haltiwanger and Scarpetta (2013), Adamopoulos and Restuccia (2014) and Bento and Restuccia (2016), among others.

41 The evidence regarding this point is mixed; see Figure 2 in Bloom and Van Reenen (2010).

IV. Quantitative Results

We use the calibrated model as a framework for understanding the effects of the firm size distribution on industrial pollution. We conduct two experiments. In experiment (i), we eliminate all the distortions by setting $\tau_z = 0$. The experiment could be interpreted as reductions in inter-regional trade barriers, improvements in transportation infrastructure, decreases in tax burdens, etc. Since we are following the indirect approach, we do not have empirical evidence on the size of the decrease in $\tau_z$ caused by a given reduction in some observable distortion.43 The effects of the policy are then assessed by the changes in the average firm size across steady states, since according to Poschke (2015) and Bento and Restuccia (2016), average firm size is positively correlated with economic development. In experiment (ii), we increase the monitoring $p\xi$ such that the fraction of firms using clean technology reaches the same level as in experiment (i). With this experiment, we approximate the current environmental policy that punishes firms using dirty technology. We contrast results from the two experiments to illustrate different mechanisms and effects of these two types of policies.

A. Counterfactual Experiments

In this section, we compare the effects of the two experiments. We start by describing their results, which are summarized in Table 7. Columns (i) and (ii) refer to experiments (i) and (ii) respectively.

42 In an earlier version of the paper where endogenous treatment technology choice is not explicitly modeled, we find that this is indeed the case.

43 Adamopoulos and Restuccia (2014) provide an example of observable farm-level price distortions.


TABLE 7—THE EFFECTS OF DISTORTIONS AND REGULATIONS

Statistics                    Benchmark      (i)       (ii)

Aggregate Output                 100.00    129.72    100.12
Capital                          100.00    161.25    100.14
Consumption                      100.00    123.45    100.11
Output per Worker (Wage)         100.00    128.49     99.97
Output per Firm                  100.00    298.78    109.76
Average Talent                   100.00    224.67    109.49
TFP                              100.00    102.00    100.14

Number of Firms                  100.00     43.42     91.22
Mean Size                         59.27    137.81     65.07
Median Size                       22.95     41.83     26.72

Aggregate Pollution              100.00     78.95     90.75
Average Intensity                100.00     60.86     90.65
Clean Share                       57.40     85.29     85.15
Monitoring                        20.50     20.50     32.50

† Note: All values reported are in percentage points except mean and median size, which are numbers of workers in absolute terms.
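A quick consistency check on Table 7: if average intensity is aggregate pollution per unit of aggregate output (our reading of the table, stated here as an assumption), then the pollution index should equal the product of the output and intensity indices divided by 100. The sketch below verifies this for each column and computes the percentage change in mean firm size; the helper names are ours.

```python
TABLE_7 = {
    # column: (aggregate output, average intensity, aggregate pollution, mean size)
    "benchmark": (100.00, 100.00, 100.00, 59.27),
    "(i)": (129.72, 60.86, 78.95, 137.81),
    "(ii)": (100.12, 90.65, 90.75, 65.07),
}


def implied_pollution(output_index, intensity_index):
    """Pollution index implied by output and intensity indices (each with benchmark = 100)."""
    return output_index * intensity_index / 100.0


def percent_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new / old - 1.0)


if __name__ == "__main__":
    benchmark_size = TABLE_7["benchmark"][3]
    for name, (output, intensity, pollution, mean_size) in TABLE_7.items():
        gap = implied_pollution(output, intensity) - pollution
        growth = percent_change(mean_size, benchmark_size)
        print(f"{name}: pollution {pollution:.2f}, implied gap {gap:+.2f}, mean size {growth:+.1f}%")
```

The identity holds to within rounding in all three columns, and mean firm size rises by roughly 130% in experiment (i) and roughly 10% in experiment (ii).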

Eliminating Size-Dependent Distortions.—We begin with the effects on resource allocation. The core mechanism that generates the results of experiment (i) is the change in the size distribution driven by the general equilibrium wage effect. Since the tax scheme $\tau_z$ is assumed to be size-dependent, that is, larger on more productive firms, in the benchmark economy the market share of these firms is severely restrained. The elimination of $\tau_z$ removes these constraints. As a result, the previously suppressed factor demand increases considerably. In column (i) of Table 7, this shows up as a 61% increase in aggregate capital and a 28% increase in the wage (output per worker). The size-dependency of $\tau_z$ also implies that the situation of small unproductive firms improves to a lesser extent than that of the large productive ones. Many small unproductive firms that previously survived because of the low prevailing wage now lose their profit margins. The owners of these firms therefore find it more profitable to work for the more productive firms. This selection mechanism explains the 125% increase in the average talent of active entrepreneurs and the roughly 130% increase in the mean firm size (from 59.27 to 137.81 workers, a gain of about 78 workers). Furthermore, production is also more concentrated at firms with high productivity among the remaining firms. We define the extensive margin as the selection of active entrepreneurs, and the intensive margin as the production distribution among the active firms. The first two rows of Table 8, which report the output share accounted for by firms with productivity in different quintiles, show the intensive margin. We see clearly that firms in the top two groups expand considerably at the expense of firms in the bottom three groups. Therefore, elimination of $\tau_z$ improves resource allocation on both the extensive and intensive margins. The overall changes in the size and employment distributions can be seen in the left two panels of Figure 6.

Defining aggregate TFP $= Y/(K^{\alpha}N^{1-\alpha})^{1-\gamma}$, which is the Solow residual following the standard growth accounting literature [Hall and Jones (1999)], we find that size distributions in models with firm selection exert very limited influence over TFP. This finding is consistent with that in Guner, Ventura and Xu (2008).


TABLE 8—RESOURCE ALLOCATION ALONG THE INTENSIVE MARGIN

Economy         QU1      QU2      QU3      QU4      QU5

Benchmark       2.69     4.19     7.29    16.55    69.28
Case (i)        1.50     2.83     6.34    18.06    71.27
Case (ii)       2.93     4.47     7.77    17.37    67.46

† Note: QU1 to QU5 represent respectively the first to the fifth quintile.
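The reallocation pattern in Table 8 can be read off by summing output shares across quintiles. The short sketch below checks that each row sums to 100 and compares the combined share of the top two quintiles across economies; the helper name is ours.

```python
TABLE_8 = {
    # economy: output shares of productivity quintiles QU1..QU5, in percent
    "Benchmark": [2.69, 4.19, 7.29, 16.55, 69.28],
    "Case (i)": [1.50, 2.83, 6.34, 18.06, 71.27],
    "Case (ii)": [2.93, 4.47, 7.77, 17.37, 67.46],
}


def top_two_share(shares):
    """Combined output share of the fourth and fifth quintiles."""
    return shares[3] + shares[4]


if __name__ == "__main__":
    for economy, shares in TABLE_8.items():
        print(f"{economy}: total {sum(shares):.2f}, top two quintiles {top_two_share(shares):.2f}")
```

The top two quintiles gain output share in case (i) relative to the benchmark and lose share in case (ii).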

Now we examine the effects on pollution. Although aggregate output increases by 30%, because the average pollution intensity decreases more (by 40%), aggregate pollution decreases by 20%. The decline in average intensity comes both from the production stage and the treatment stage. In both stages, changes in the size distribution play a key role. Since production in the no-distortions economy is more concentrated in large firms, the fact that the two within-group elasticities $\psi_1^{(0)}$ and $\psi_1^{(1)}$ are negative mechanically implies a decrease in pollution intensity. The effect of the size distribution on a firm’s choice of the end-of-pipe treatment technology works exactly as described in Proposition 2. The elimination of $\tau_z$ significantly increases the profits of firms, which strengthens the economic incentive to adopt clean technology.

To evaluate the relative contribution of reductions in the production stage and in the treatment stage, we assume artificially that $\psi_0^{(0)} = \psi_0^{(1)}$ and $\psi_1^{(0)} = \psi_1^{(1)}$, which means that from a technological point of view, clean and dirty technologies are the same. We apply this modification to both the benchmark case and the no tax case. We set the pollution intensity and aggregate pollution in the first case to be the new benchmark. The ratio of the new no tax case over the new benchmark shows the effect of purely changing the size distribution, which is equal to 61% for the intensity and 79% for the aggregate pollution. We interpret these numbers as the reduction of industrial pollution in the production stage. We then express the intensity and pollution in case (i) as a percentage of the new benchmark. The numbers are respectively 45% for intensity and 58% for aggregate pollution. Therefore, in the context of our model, about 30% ($1 - 39/55$) of the decrease in pollution intensity and 50% ($1 - 21/42$) of the decrease in aggregate pollution are from the treatment stage.
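The decomposition arithmetic above can be reproduced directly from the four percentages quoted in the text. The function name below is ours; the inputs are the values relative to the new benchmark (= 100).

```python
def treatment_stage_share(production_only_pct, full_effect_pct):
    """Share of the total decline attributed to the treatment stage.

    production_only_pct: level (benchmark = 100) when only the size distribution changes.
    full_effect_pct: level (benchmark = 100) when technology adoption also responds.
    """
    production_drop = 100.0 - production_only_pct
    total_drop = 100.0 - full_effect_pct
    return 1.0 - production_drop / total_drop


if __name__ == "__main__":
    print(f"intensity: {treatment_stage_share(61, 45):.2f}")   # 1 - 39/55
    print(f"pollution: {treatment_stage_share(79, 58):.2f}")   # 1 - 21/42
```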

Tightening Environmental Regulations.—The tightening of regulations affects the decisions of firms directly through the technology adoption requirement and indirectly through the general equilibrium wage feedback. Unlike in experiment (i), these two effects do not affect the size and employment distributions in equilibrium by much, as can be seen from the right two panels in Figure 6. There are two reasons behind this. First, large firms that have already adopted the clean technology in the benchmark are not directly affected by the changes in the policy. Second, the installation costs of the clean technology in our benchmark calibration are small. As a result, they do not divert a significant portion of the firms’ resources from productive use, and hence do not affect the optimal operating scale of firms by much. Therefore, in column (ii) of the top panel of Table 7, the macro aggregates barely change compared to the benchmark case.

[Figure 6 has four bar-chart panels over firm size groups (1–19, 20–49, 50–99, 100–399 and 400+ employees). Top row (vertical axis: fraction of firms): “Firm Size Distribution: Benchmark vs No Tax” on the left and “Firm Size Distribution: Benchmark vs Regulation” on the right. Bottom row (vertical axis: fraction of workers): “Employment Share Distribution: Benchmark vs No Tax” on the left and “Employment Share Distribution: Benchmark vs Regulation” on the right.]

FIGURE 6. THE EFFECTS OF DISTORTIONS AND REGULATIONS ON THE FIRM SIZE DISTRIBUTION

However, despite a tiny decrease (0.03%) in wages, there is selection which increases the average size and productivity of active firms. The underlying mechanism here is that an increase in regulation decreases the expected returns from being an entrepreneur, which drives out the least productive firms that cannot afford the installation of a clean technology.44 These household members then choose to become workers, which increases the labor supply and suppresses equilibrium wage rates. The remaining large firms benefit from the reduction in wage rates and expand their operating scale, which explains the increases in size and productivity. Graphically this is reflected in Figure 6 as a contraction of the smallest group in the firm size distribution and an expansion in the other four groups.

Firms in the four expanding groups are not affected equally, though. To see this, we refer to row (ii) in Table 8. As shown in experiment (i), the most efficient way of allocating talent is to shift production to the most productive firms, increasing the share of output accounted for by firms with productivity in the top quintile. However, row (ii) shows the opposite. Compared with the benchmark case, the proportion of production accounted for by the lower quintiles increases while that of the top quintile decreases. Therefore, although strengthening the monitoring improves resource allocation on the extensive margin, the allocation on the intensive margin worsens. This is because the size-dependent $\tau_z$ limits the extent to which a declining wage can benefit firms in the top quintile. Both effects are small here because $k_E$ is small in our benchmark calibration. In Section IV.C, we discuss the case where $k_E$ is set to a higher level.

44 This prediction matches policy practice in China well. For example, from 2004 to 2008, emissions of major air pollutants, together with industrial production, declined significantly. The reductions in emissions per unit of GDP for these pollutants were 35% for SO2, 29% for black carbon and 31% for CO [Lin et al. (2014)]. During that period, 34 million kW of coal-burning electric generating sets were directly shut down, which amounts to 6.18% of total electric production in 2013 (National Development and Reform Commission [2009] Decree 4).

Since aggregate output increases only slightly, the decreases in aggregate pollution and intensity are quantitatively almost the same. In this experiment, they decrease by about 10%. Because the size and employment distributions do not change much here, most of the decrease comes from the treatment stage. If we repeat the decomposition exercise we did for experiment (i), 92% of the reductions in both intensity and aggregate pollution stem from the adoption of clean technologies.

Comparing the Two Policies.—The above two experiments show that removing distortions and intensifying regulations affect resource allocation through different mechanisms. In particular, removing distortions increases labor demand from firms, while regulations increase labor supply from household members. Because distortions are size-dependent, large firms benefit more from their removal. On the other hand, an increase in labor supply suppresses the equilibrium wage, which raises firms’ profits. However, the extra profits that firms actually earn are net of the effect of distortions. Again, due to the size-dependency of the distortions, large firms benefit less from the decrease in wages. This explains why the intensive margin improves when distortions are eliminated but worsens when stronger regulations are imposed. Put differently, distortions shape the effect of environmental policies on different firms. This complements the analysis by Bustos (2011), who empirically studies the effect of the elimination of tariffs (a particular form of size-dependent distortions) on technology upgrading. The results of our first experiment are consistent with hers. Further, we examine how the economy is affected if the government implements policies that directly target technology adoption in the presence of size-dependent distortions.

A recent literature has argued that industrial pollution in China comes from unanticipated responses of economic agents to different policies and institutional features.45 In reality, regulation policies such as government campaigns are often ineffective because their effects rebound easily. Our results suggest that policies and institutional features could have worked to amplify other existing distortions in the economy, and that eliminating those distortions could at the same time reduce the unanticipated environmental consequences of these policies and institutional features. Of course, it could be that reducing economic distortions costs more, so that in the short run strengthening regulations pays off better.46 The implementation of specific policies is beyond the scope of the current paper; we leave it to future research.

45 For instance, Jia (2014) studies the role of the political incentives and career concerns of provincial governors; Jiang, Lin and Lin (2014) emphasize the differential policy treatment of firms with different ownership structures; and Kahn, Li and Zhao (2015) and Cai, Chen and Gong (2016) estimate the difference in the reduction efforts of provincial governments between upstream and downstream counties along rivers in response to the pollution reduction mandates imposed by the central government.

46 Another concern is that if small firms have higher growth rates, the removal of size-dependent distortions could potentially hamper economic growth in the long run. However, a recent study of India’s manufacturing sector by Martin, Nataraj and Harrison (2017) suggests that, as in the U.S., larger and younger factories in India grow more quickly and create more jobs than smaller, older factories.


TABLE 9—THE EFFECT OF THE SIZE-DEPENDENCY OF DISTORTIONS

Statistics                   Benchmark      (i)     (i’)

Aggregate Output                100.00   129.72   105.33
Capital                         100.00   161.25   106.38
Consumption                     100.00   123.45   105.12
Output per Worker (Wage)        100.00   128.49   104.34
Output per Firm                 100.00   298.78   242.61

Aggregate Pollution             100.00    78.95    70.60
Average Intensity               100.00    60.86    67.02
Clean Share                      57.40    85.29    72.21

† Note: All values reported are in percentage points.

B. The Effect of the Size-Dependency of Distortions

To further isolate the effect of the size distribution, we solve a version of the model in which the distortions are imposed uniformly on all firms in the economy. More specifically, in this exercise we set $\tau_z$ such that the total amount of taxes collected is the same as in the benchmark case with size-dependent $\tau_z$. This experiment reveals the quantitative effect of the size-dependency of $\tau_z$.47 The implied tax rate, $\tau_z = 18\%$, is higher than the average tax rate in the size-dependent case, 13%. The results of the experiment are summarized in Table 9. Columns benchmark and (i) report the results of the benchmark calibration and those from setting $\tau_z = 0$, both from Table 7. We label the uniform tax case (i’). By comparing the benchmark with (i’), we can assess the effect of the size-dependency of $\tau_z$. Similarly, a comparison of (i) with (i’) reveals the effect of levying a flat tax. Notice that a uniform $\tau_z$ imposed on all firms does not change the extensive margin compared with the zero-$\tau_z$ situation; therefore, measures of average talent, number of firms, mean/median size of firms, and TFP are not affected [Guner, Ventura and Xu (2008)]. This means that all the effects of the size-dependency of the distortions work through the intensive margin.

Aggregate output, capital, consumption, output per worker, and output per firm in case (i’) all increase compared with the benchmark case, which is consistent with findings about size-dependent distortions in the literature. What is different here is that, since the uniform tax rate needed to generate the same tax revenue is relatively high (the tax rate for the largest firm in the size-dependent case is 25%), the tax itself still results in a considerable output loss. The source of output loss in our model is the misallocation of the entrepreneurial talent $z$. However, for the average pollution intensity, much of the reduction is achieved through the elimination of the size-dependency of $\tau_z$. The adoption rate of the clean technology increases by 15 percentage points in case (i’), which is 53% of the total increase resulting from the complete elimination of $\tau_z$. Likewise, 84% of the total decline in average pollution intensity in the zero-$\tau_z$ case is achieved by simply removing the size-dependent feature of $\tau_z$.
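The shares quoted in this paragraph can be recomputed directly from Table 9. The short sketch below is only an arithmetic check on the published table values; the dictionary keys and function name are illustrative choices, not notation from the paper. It also checks that each pollution index is, up to rounding, the product of the output and intensity indices.

```python
# Values copied from Table 9: (benchmark, (i) = no distortions, (i') = uniform tax).
table9 = {
    "output":      (100.00, 129.72, 105.33),
    "pollution":   (100.00, 78.95, 70.60),
    "intensity":   (100.00, 60.86, 67.02),
    "clean_share": (57.40, 85.29, 72.21),
}


def share_of_full_effect(row):
    """Change from benchmark to (i') as a fraction of the change from benchmark to (i)."""
    bench, no_tax, flat_tax = row
    return (flat_tax - bench) / (no_tax - bench)


# Gain in the clean-technology adoption rate under the uniform tax, in percentage points.
clean_gain_pp = table9["clean_share"][2] - table9["clean_share"][0]

# Fraction of the full (zero-tax) effect already obtained by removing size-dependency.
clean_share_ratio = share_of_full_effect(table9["clean_share"])
intensity_ratio = share_of_full_effect(table9["intensity"])

# Consistency check: pollution index is approximately output index * intensity index / 100.
implied_pollution = [
    out * inten / 100.0
    for out, inten in zip(table9["output"], table9["intensity"])
]

if __name__ == "__main__":
    print(f"clean share gain: {clean_gain_pp:.2f} pp, ratio {clean_share_ratio:.2f}")
    print(f"intensity ratio: {intensity_ratio:.2f}")
    print("implied pollution:", [round(x, 2) for x in implied_pollution])
```

The ratio computed on the clean-share row is about 0.53 and the ratio computed on the average-intensity row is about 0.84.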

47 In this respect, the experiment resembles those in the taxation literature, where progressive taxes are compared with flat taxes. See, for example, Ventura (1999) and Conesa and Krueger (2006).

Discussion.—The finding that the size-dependency of $\tau_z$ affects economic efficiency only moderately is consistent with Hopenhayn (2014b). In particular, he shows that the size-dependency of a policy does not necessarily imply large distortions. What matters for the size of the distortion is the total amount of resources affected, not which firms these resources belong to. Studies such as Restuccia and Rogerson (2008) and Guner, Ventura and Xu (2008) find that size-dependent policies affect economic efficiency more than flat taxes because those size-dependent policies happen to affect a large amount of resources. The fact that we impose a fairly large $\tau_z$ in the flat tax case explains why removing the size-dependency of $\tau_z$ does not improve the efficiency of the economy by much.

However, the size-dependency of $\tau_z$ does matter for pollution, since it affects the size distribution. The reduction in pollution comes from two sources. One is the mechanical decrease in the production stage, and the other is the increase in the adoption rate of clean technology. The latter is due to the fact that the intensity of regulation, $p\xi\pi_0(z)$, is increasing in $z$. Therefore, since eliminating the size-dependency of $\tau_z$ improves resource allocation on the intensive margin, which increases the profits of large firms, the punishment for violating the environmental regulation actually increases, giving firms a stronger incentive to choose the clean technology.
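One way to see this mechanism, using the frictionless profit functions stated in the proof of Lemma 1 in Appendix B, is to write the gain from adopting the clean technology as the expected penalty avoided minus the installation cost. The shorthand $A$ below is introduced here only for compactness and is not the paper's notation.

```latex
\begin{align*}
A &\equiv \left(\Omega - \frac{1}{1-\alpha}\Phi_1\right)\kappa, \\
\pi_1(z) - \pi_0(z) &= \bigl(A z - R k_E\bigr) - (1 - p\xi)\,A z
                     = p\xi\,A z - R k_E .
\end{align*}
```

The first term, the expected penalty avoided, rises with $z$, while the cost $R k_E$ is fixed, so adoption pays only for sufficiently productive firms. Anything that raises the profits of large firms therefore strengthens their incentive to adopt.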

C. Environmental Regulations and Size Distribution

In our benchmark calibration, we choose a relatively small $k_E$, about 2.5 times the equilibrium wage of a typical worker. This limits the extent to which environmental policy can affect the real economy. In this section, we show that when $k_E$ is increased to ten times the value used in the benchmark calibration, environmental policy has a sizable effect on resource allocation. We consider two scenarios. In the first, we solve the model again with all parameters remaining at their benchmark calibration levels, but increase $k_E$ to 41.5. With no change in the regulation intensity, the adoption rate of clean technology drops. In the second, we increase $p\xi$ such that the adoption rate in the benchmark case (57%) is restored. We label these two experiments (iii) and (iv).

The results of experiments (iii) and (iv) are shown in Table 10. We begin with case (iii). First, we notice that the size distribution and the efficiency of the economy stay virtually the same as in the benchmark case. The clean technology adoption rate falls to 9%, since a large number of firms now find it unprofitable to use clean technology, and a higher level of pollution follows. Because the benchmark case can be understood as a decrease in the price of clean technology relative to case (iii), this result suggests that subsidies to clean technology can work as a substitute for stronger regulation.

If we increase the regulation to the level at which the original technology adoption rate is restored, the effects of the regulation in experiment (iv) are much larger than in case (ii), where size distributions and the allocation of resources are only mildly affected. The size and employment distributions for case (iv) are shown in Figure 7. In contrast to the size and employment distributions in the benchmark case, all firms in the first group are driven out. This is also reflected in column (iv) of Table 10 as increases in average talent and in the mean and median size of firms. Therefore, environmental policy improves resource allocation at the extensive margin by forcing small unproductive firms to quit the market. However, the gains from these improvements are limited because the allocation of resources at the intensive margin worsens. Table 11 shows the allocation of resources at the intensive margin. The efficient allocation at the intensive margin is achieved in the case with no size-dependent $\tau_z$ [case (i)]. Instead of a reallocation of production toward larger firms, as in the efficient case, here the production of firms in the bottom four quintiles expands significantly at the expense of the production share of the most productive firms. As a result, much of the gain in economic efficiency is offset by the worsening of resource allocation at the intensive margin. Qualitatively, this is the same as in case (ii). However, by comparing cases (ii) and (iv), we see that the larger the fixed cost $k_E$ is, the higher the degree of resource misallocation on the intensive margin will be. Although the level of industrial pollution gets lower, the reduction could be much larger if the allocation at the intensive margin were also improved, as in the case of the elimination of $\tau_z$.

TABLE 10—THE EFFECT OF REGULATIONS: HIGHER $k_E$

Statistics                   Benchmark    (iii)     (iv)

Aggregate Output                100.00   100.00   100.41
Capital                         100.00   100.28   101.45
Consumption                     100.00    99.94   100.21
Output per Worker (Wage)        100.00   100.00    99.63
Output per Firm                 100.00   100.00   187.70
Average Talent                  100.00   100.00   184.14
TFP                             100.00   100.00   100.81

Number of Firms                 100.00   100.00    53.50
Mean Size                        59.27    59.27   111.65
Median Size                      22.95    22.95    63.24

Aggregate Pollution             100.00   125.64    87.56
Average Intensity               100.00   125.64    87.20
Clean Share                      57.40     8.89    57.06
Monitoring                       20.50    20.50    73.50

† Note: All values reported are in percentage points except mean and median size, which are numbers of workers in absolute terms.

TABLE 11—RESOURCE ALLOCATION ALONG THE INTENSIVE MARGIN: HIGH $k_E$

Economy        QU1     QU2     QU3     QU4     QU5

Benchmark     2.69    4.19    7.29   16.55   69.28
Case (iv)     4.44    6.63   10.98   21.86   56.10
Case (i)      1.50    2.83    6.34   18.06   71.27
Case (ii)     2.93    4.47    7.77   17.37   67.46

† Note: QU1 to QU5 represent respectively the first to the fifth quintile.
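The intensive-margin comparison can be read off Table 11 with a few lines of arithmetic. The sketch below is only a check on the published table values; the dictionary keys and helper names are illustrative. It computes how the top-quintile output share moves relative to the benchmark in each case, and confirms that the case (iv) pollution index in Table 10 is close to the product of the output and intensity indices.

```python
# Output shares by productivity quintile (QU1..QU5), copied from Table 11, in percent.
table11 = {
    "benchmark": (2.69, 4.19, 7.29, 16.55, 69.28),
    "case_iv":   (4.44, 6.63, 10.98, 21.86, 56.10),
    "case_i":    (1.50, 2.83, 6.34, 18.06, 71.27),
    "case_ii":   (2.93, 4.47, 7.77, 17.37, 67.46),
}


def top_quintile_change(case):
    """Change in the top-quintile output share relative to the benchmark (pp)."""
    return table11[case][4] - table11["benchmark"][4]


def bottom_four_share(case):
    """Combined output share of the bottom four quintiles (percent)."""
    return sum(table11[case][:4])


# Table 10, column (iv): aggregate output index times average intensity index.
implied_pollution_iv = 100.41 * 87.20 / 100.0

if __name__ == "__main__":
    for case in ("case_i", "case_ii", "case_iv"):
        print(case, round(top_quintile_change(case), 2), round(bottom_four_share(case), 2))
    print("implied pollution (iv):", round(implied_pollution_iv, 2))
```

The top-quintile share rises only when the distortions are removed (case (i)); it falls under both regulation experiments, and by far more in case (iv) than in case (ii).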

[Figure 7 comprises two bar charts over the firm size groups 1−19, 20−49, 50−99, 100−399 and 400+: “Firm Size Distribution: Benchmark vs High Costs” (vertical axis: fraction of firms) and “Employment Share Distribution: Benchmark vs High Costs” (vertical axis: fraction of workers).]

FIGURE 7. ENVIRONMENTAL POLICY AND SIZE DISTRIBUTION

V. Conclusion

This paper proposes a new explanation for the severe industrial pollution in China. Due to increasing returns to scale in the adoption of pollution treatment technologies and imperfect environmental regulations, large productive firms are more likely to use clean technology and have lower pollution intensity. As a result, size-dependent distortions that reallocate factors away from large productive firms reduce aggregate output while increasing aggregate pollution intensity. We support our theory using unique firm-level data on firm production, emissions and pollution treatment technologies. Quantitatively, we find that eliminating size-dependent distortions would raise aggregate output by 30% and lower aggregate pollution by 20%.

Our findings have novel implications for both misallocation and pollution. They imply that the welfare losses caused by misallocation include both lower economic output, which is the focus of the literature, and more pollution. The latter could have important welfare consequences given the severity of pollution in China and other developing countries, as documented in Greenstone and Jack (2015). Our findings also challenge the popular view that economic growth necessarily comes at the cost of more pollution in developing countries, and suggest an attainable double dividend of higher output and lower pollution through removing policy distortions and market inefficiencies.

In our paper, we measure the idiosyncratic distortions using the indirect method. From a policy perspective, an important question is therefore whether there are empirically observable size-dependent distortions that can generate quantitatively large effects on both firm size and pollution. Identifying observable policy distortions would also make it easier to collect more information on the costs of particular policy instruments, which would allow a more complete cost-benefit analysis.

Further, we have restricted our attention to industrial water pollution in China. It is of interest to extend our analysis to other pollutants and other countries. China also has a severe air pollution problem, and it has overtaken the U.S. as the world’s biggest emitter of greenhouse gases. Casual observation suggests that small and medium-size firms in high-polluting industries such as steel are an important source of sulfur dioxide, hazardous haze and greenhouse gas emissions in China. For other developing countries, Hsieh and Klenow (2014) and Bento and Restuccia (2016) find that size-dependent distortions have a large negative impact on average firm size, and Dasgupta, Lucas and Wheeler (1998) find that small firms have higher air pollution intensity in Brazil and Mexico. It would be interesting to see whether size-dependent distortions amplify pollution in these countries as well. We leave these important extensions for future research.

APPENDIX

A. Accounting Exercises

This appendix provides robustness calculations for the accounting exercises.

Estimation Strategies.—Ideally, we would like to have pollution intensity by firm size. However, such data do not exist, since the NGSPS reports only the total value of production and the total amount of pollution. Therefore, we need to construct pollution intensity over the number of employees. We use the CNEC for this purpose. In particular, we estimate the production bin corresponding to each employment bin in the U.S. data.

SUSB reports firm size in 22 bins. For the U.S. size bins, we construct the corresponding production bins to be used in the NGSPS. The CNEC is used to bridge the employment bins (SUSB) to the production bins (NGSPS).

1. Non-parametric:

• For each U.S. employment bin, we compute the 1st-quartile and 3rd-quartile production levels for Chinese firms within that employment bin. The two quartiles are used as the lower and upper bounds for the production bins in the NGSPS.

• We then use the median pollution intensity of firms within the newly defined production bins as the average pollution intensity for those bins.

• Lastly, we calculate aggregate pollution by assigning to each bin the corresponding share of production. The NGSPS production bins are used for China and the employment bins are used for the U.S.

2. Piecewise Linear:

• For each U.S. employment bin, we regress log-production on log-employment using the subset of Chinese firms within that employment bin. The lower and upper bounds for the production bins in this case are calculated as the predicted values of this regression.

• We then run a piecewise log-linear regression of pollution intensity on production within each new production bin. The average pollution intensity is chosen to be the predicted intensity at the midpoint of the new log-production bin.

• Lastly, the average intensity is applied to the production share distributions. The U.S. distribution does not change; however, a new distribution for China is calculated, since the endpoints of the production bins are different.

3. Parametric:

• Using the CNEC, we regress log-production on the log-number of workers, which yields a parametric relationship between the number of workers and production.

• Using the NGSPS, we regress log-intensity on log-production, which yields a parametric relationship between intensity and production. From these two relationships, we can subsequently construct a new parametric relationship between intensity and the number of employees. The average intensity is evaluated at the midpoint of each U.S. employment bin. (Notice that in this case we have a direct functional form relating employment and intensity.)

• Lastly, the average intensity is applied to the production share distributions. The U.S. distribution does not change, but a new distribution for China is calculated, since the endpoints of the production bins are different. Notice that this distribution for China is the one we use in Section I.C.

TABLE A.1—SIZE DISTRIBUTION ON POLLUTION

Methods             Paper   Agricultural Food   Textile   Chemistry   Beverage   Average

Non-parametric      39.8%               60.7%     81.6%      102.5%     103.7%     63.5%
Piecewise-linear    34.8%               69.4%     93.5%      180.1%      N/A a     75.4%
Parametric          43.5%               61.1%     97.5%      101.2%      89.0%     67.0%

† Note: Please see the notes of Table 2 for acronyms of industries. For individual industries, the numbers reported are the aggregate pollution from the artificial U.S. production structure as a percentage of that of China. We use the 1st and 3rd quartiles in the non-parametric calculation. Column 6 (Average) calculates the weighted average of these ratios using the percentage contribution in row one of Table 2 as weights.

a Since the beverage industry has far fewer firms than the others, there are employment size bins with no corresponding firms in China, which invalidates the method. We set the ratio to 100% in the calculation of the last two averages.
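The parametric strategy (item 3 above) amounts to composing two log-linear fits. The sketch below illustrates that composition under stated assumptions: the data are synthetic and noise-free, the coefficients and employment bins are hypothetical, and `fit_loglinear` and `intensity_by_employment` are illustrative names rather than anything used by the authors. It is not a reproduction of the estimates in Table A.1.

```python
import numpy as np


def fit_loglinear(x, y):
    """OLS fit of log(y) = intercept + slope * log(x); returns (intercept, slope)."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    return intercept, slope


def intensity_by_employment(emp_prod_fit, prod_int_fit, employment):
    """Compose the two fitted relationships to map employment into pollution intensity."""
    a0, a1 = emp_prod_fit
    b0, b1 = prod_int_fit
    log_production = a0 + a1 * np.log(employment)
    return np.exp(b0 + b1 * log_production)


# Hypothetical, noise-free data standing in for the two firm-level sources.
rng = np.random.default_rng(0)
workers = rng.uniform(5, 2000, size=500)              # census-style sample: employment
production_census = 3.0 * workers ** 1.1              # ... and production
production_survey = rng.uniform(10, 20000, size=500)  # pollution-survey sample: production
intensity_survey = 50.0 * production_survey ** -0.4   # ... and pollution per unit of output

# Step 1: production as a function of employment.
emp_prod_fit = fit_loglinear(workers, production_census)
# Step 2: intensity as a function of production.
prod_int_fit = fit_loglinear(production_survey, intensity_survey)

# Step 3: evaluate the composed relationship at the midpoint of each employment bin.
bins = [(1, 19), (20, 49), (50, 99), (100, 399)]      # hypothetical bins
midpoints = np.array([(lo + hi) / 2 for lo, hi in bins])
bin_intensity = intensity_by_employment(emp_prod_fit, prod_int_fit, midpoints)

if __name__ == "__main__":
    for (lo, hi), value in zip(bins, bin_intensity):
        print(f"{lo}-{hi}: intensity {value:.3f}")
```

With real data the two regressions would carry noise, and the bin-level intensities would then be applied to the production share distributions as described in the last bullet.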

The estimation results are shown in Table A.1. Each of these three methods has its own advantages and disadvantages. The two non-parametric methods capture more of the variation at the local level, which could be washed out in a parametric estimation across the whole state space. However, this local nature also introduces considerable instability into the estimates. Further, there are situations in which gaps are left uncovered by adjacent production bins, and situations in which production bins overlap. Under these conditions, some information is lost while other information is used multiple times. Nevertheless, the results are robust across the different estimation strategies.

B. Proofs

In this section we provide formal proofs of the results in Section II.C. For convenience, we restate those results here.

PROOF OF LEMMA 1:

Lemma 1. In an economy with no product market frictions, $\pi_0(z)$ and $\pi_1(z)$ are both increasing and linear with respect to $z$. In addition, the slope of $\pi_1(z)$ is steeper than that of $\pi_0(z)$:

$$
\frac{\partial \pi_0(z)}{\partial z} = (1 - p\xi)\,\frac{\partial \pi_1(z)}{\partial z}, \quad \forall z \in Z. \tag{B.1}
$$


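Proof. Since $k_E$ is a sunk cost, it does not affect firms’ decisions once it is paid. The factor demand decisions for the two types of firms are therefore the same. The first-order conditions for capital and labor are, respectively,

$$
\frac{\partial \pi_i(z)}{\partial k}: \quad \alpha\gamma\, z^{1-\gamma} k^{\alpha\gamma-1} n^{(1-\alpha)\gamma} = R, \tag{B.2}
$$

$$
\frac{\partial \pi_i(z)}{\partial n}: \quad (1-\alpha)\gamma\, z^{1-\gamma} k^{\alpha\gamma} n^{(1-\alpha)\gamma-1} = W, \quad i = 0, 1. \tag{B.3}
$$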

Dividing (B.2) by (B.3) yields a constant capital-to-labor ratio $h$,

$$
h = \frac{k}{n} = \frac{\alpha W}{(1-\alpha) R}, \tag{B.4}
$$

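which says that more capital is demanded when the technology is capital intensive (higher $\alpha$) or when the capital rental price $R$ is low. Notice that the system of equations (B.2) and (B.3) is log-linear and thus has a closed-form solution. With some algebra, the solutions are characterized by

$$
n(z) = \Phi_1 R^{\frac{\alpha\gamma}{\gamma-1}} W^{\frac{1-\alpha\gamma}{\gamma-1}} \cdot z, \qquad \Phi_1 = \left[\frac{(1-\alpha)^{\alpha\gamma}}{(1-\alpha)\gamma\,\alpha^{\alpha\gamma}}\right]^{\frac{1}{\gamma-1}}, \tag{B.5}
$$

$$
k(z) = \Phi_2 R^{\frac{1+\gamma(\alpha-1)}{\gamma-1}} W^{\frac{\gamma(1-\alpha)}{\gamma-1}} \cdot z, \qquad \Phi_2 = \frac{\alpha}{1-\alpha}\,\Phi_1. \tag{B.6}
$$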

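Substituting the optimal solutions (B.5) and (B.6) back into the definitions of the profit functions (8) and (9), we have

$$
\pi_0(z) = (1 - p\xi)\left(\Omega - \frac{1}{1-\alpha}\Phi_1\right)\kappa z, \qquad \Omega = \left(\frac{\alpha}{1-\alpha}\right)^{\alpha\gamma}\Phi_1^{\gamma} \quad \text{and} \quad \kappa = W^{\frac{\gamma(1-\alpha)}{\gamma-1}} R^{\frac{\alpha\gamma}{\gamma-1}},
$$

$$
\pi_1(z) = \left(\Omega - \frac{1}{1-\alpha}\Phi_1\right)\kappa z - R k_E,
$$

where it is clear that both functions are increasing and linear in $z$ and that (B.1) holds.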
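The closed-form factor demands can be checked numerically. The sketch below evaluates (B.5) and (B.6) and verifies that they satisfy the first-order conditions (B.2) and (B.3) and that operating profit is linear in $z$. It assumes the production function $z^{1-\gamma}(k^{\alpha}n^{1-\alpha})^{\gamma}$ implied by those first-order conditions; the parameter values are hypothetical and chosen only for illustration.

```python
import math


def factor_demands(z, alpha, gamma, R, W):
    """Labor and capital demands from (B.5) and (B.6)."""
    phi1 = ((1 - alpha) ** (alpha * gamma)
            / ((1 - alpha) * gamma * alpha ** (alpha * gamma))) ** (1 / (gamma - 1))
    phi2 = alpha / (1 - alpha) * phi1
    n = (phi1 * R ** (alpha * gamma / (gamma - 1))
         * W ** ((1 - alpha * gamma) / (gamma - 1)) * z)
    k = (phi2 * R ** ((1 + gamma * (alpha - 1)) / (gamma - 1))
         * W ** (gamma * (1 - alpha) / (gamma - 1)) * z)
    return n, k


def marginal_products(z, n, k, alpha, gamma):
    """Left-hand sides of the first-order conditions (B.2) and (B.3)."""
    mpk = alpha * gamma * z ** (1 - gamma) * k ** (alpha * gamma - 1) * n ** ((1 - alpha) * gamma)
    mpn = (1 - alpha) * gamma * z ** (1 - gamma) * k ** (alpha * gamma) * n ** ((1 - alpha) * gamma - 1)
    return mpk, mpn


def operating_profit(z, alpha, gamma, R, W):
    """Output minus factor payments at the optimal inputs (before any penalty or k_E)."""
    n, k = factor_demands(z, alpha, gamma, R, W)
    output = z ** (1 - gamma) * k ** (alpha * gamma) * n ** ((1 - alpha) * gamma)
    return output - R * k - W * n


# Hypothetical parameter values, for illustration only.
ALPHA, GAMMA, R, W = 0.33, 0.8, 0.1, 1.0

n, k = factor_demands(2.0, ALPHA, GAMMA, R, W)
mpk, mpn = marginal_products(2.0, n, k, ALPHA, GAMMA)

if __name__ == "__main__":
    print("MPK vs R:", mpk, R)
    print("MPN vs W:", mpn, W)
    print("k/n vs (B.4):", k / n, ALPHA * W / ((1 - ALPHA) * R))
```

At the optimum the marginal products equal the factor prices, the capital-labor ratio matches (B.4), and doubling $z$ doubles operating profit.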

PROOF OF COROLLARY 1:

Corollary 1. Suppose the product market frictions are specified as $\max\{0, 1 - z^{\phi_1}\}$ with $1 - \gamma + \phi_1 > 0$. Then $\pi_0(z)$ and $\pi_1(z)$ are both increasing and concave with respect to $z$. In addition, the slope of $\pi_1(z)$ is steeper than that of $\pi_0(z)$:

$$
\frac{\partial \pi_0(z)}{\partial z} = (1 - p\xi)\,\frac{\partial \pi_1(z)}{\partial z}, \quad \forall z \in Z.
$$

Proof. The proof is straightforward given Lemma 1. Substituting in the tax function, $\pi_0(z)$ and $\pi_1(z)$ now become

$$
\pi_0(z) = (1 - p\xi)\left(\Omega - \frac{1}{1-\alpha}\Phi_1\right)\kappa\, z^{\frac{1-\gamma+\phi_1}{1-\gamma}},
$$

$$
\pi_1(z) = \left(\Omega - \frac{1}{1-\alpha}\Phi_1\right)\kappa\, z^{\frac{1-\gamma+\phi_1}{1-\gamma}} - R k_E,
$$

where $\Omega$, $\Phi_1$ and $\kappa$ are defined as in Lemma 1.

The assumption $1 - \gamma + \phi_1 > 0$ guarantees the monotonicity of the profit functions. Concavity is easily verified by taking second-order derivatives.

PROOF OF PROPOSITION 1:

Proposition 1. There exists a unique threshold $\hat{z}$ such that all household members with $z \le \hat{z}$ choose to be workers and those with $z \ge \hat{z}$ become entrepreneurs. Further, $\hat{z}$ is the solution of $W = \pi(\hat{z})$.

Proof. Since the overall profit function $\pi(z)$ is the upper envelope of $\pi_0(z)$ and $\pi_1(z)$, from Lemma 1 and Corollary 1 we know that $\pi(z)$ is monotonically increasing. It is easy to verify that $\pi(0) = 0$. Therefore, as long as $0 < W < \pi(z)$ for some $z \in Z$, we can find a unique $\hat{z}$ such that $\pi(\hat{z}) = W$, where uniqueness follows from monotonicity. This condition is guaranteed in the general equilibrium version of our model by the Inada conditions on the production function.

PROOF OF PROPOSITION 2:

Proposition 2. Given $k_E$, $W$, $\tau_z$ and $R$, there exist unique thresholds $z_n$ and $z_f$ such that:

(i) In the economy with no product market frictions, entrepreneurs with $z \le z_n$ produce using the dirty technology while those with $z > z_n$ produce using the clean technology.

(ii) In the economy with product market frictions, entrepreneurs with $z \le z_f$ produce using the dirty technology while those with $z > z_f$ produce using the clean technology.

(iii) $z_n < z_f$; that is, product market frictions impede the technology upgrade.

Proof. Uniqueness follows from

$$
\frac{\partial \pi_0(z)}{\partial z} = (1 - p\xi)\,\frac{\partial \pi_1(z)}{\partial z}, \quad \forall z \in Z,
$$

and from monotonicity, both with and without frictions.

We can solve for an analytical expression for $z_n$:

$$
z_n = \frac{R k_E}{p\xi\left[\Omega - \frac{1}{1-\alpha}\Phi_1\right]\kappa}, \tag{B.7}
$$

where $z_n$ is simply the “distance” ($R k_E$) over the “speed” ($p\xi\left[\Omega - \frac{1}{1-\alpha}\Phi_1\right]\kappa$). The “distance” is the same in both cases, so whether $z_f$ lies to the left or to the right of $z_n$ ultimately depends on the “speed” of convergence.

Using the expressions for the profit functions with frictions, we can show that

$$
z_n = \frac{R k_E}{p\xi\left[\Omega - \frac{1}{1-\alpha}\Phi_1\right]\kappa} < \left(\frac{R k_E}{p\xi\left[\Omega - \frac{1}{1-\alpha}\Phi_1\right]\kappa}\right)^{\frac{1-\gamma}{1-\gamma+\phi_1}} = z_f, \tag{B.8}
$$

which proves the proposition.

One caveat is that the inequality holds only if the number in the parentheses is greater than 1. We verify this in our quantitative analysis but refrain from discussing extreme cases where the condition does not hold.
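The threshold comparison and its caveat can be illustrated numerically. In the sketch below, `A` is shorthand (not the paper's notation) for the common coefficient $\left(\Omega - \frac{1}{1-\alpha}\Phi_1\right)\kappa$ in the profit functions, and all parameter values are hypothetical. A negative $\phi_1$ is assumed so that the exponent in (B.8) exceeds one.

```python
def adoption_thresholds(R, k_E, p_xi, A, gamma, phi1):
    """Return (z_n, z_f): adoption thresholds without and with frictions, as in (B.8)."""
    base = R * k_E / (p_xi * A)
    return base, base ** ((1 - gamma) / (1 - gamma + phi1))


def gain_from_clean(z, R, k_E, p_xi, A, gamma, phi1):
    """pi_1(z) - pi_0(z); phi1 = 0 gives the frictionless (linear) case."""
    theta = (1 - gamma + phi1) / (1 - gamma)
    return p_xi * A * z ** theta - R * k_E


# Hypothetical parameters for which the term in parentheses in (B.8) exceeds 1.
params = dict(R=0.1, k_E=40.0, p_xi=0.2, A=0.5, gamma=0.8, phi1=-0.05)
z_n, z_f = adoption_thresholds(**params)

# Hypothetical parameters for which that term is below 1: the ordering flips.
small_cost = dict(params, k_E=0.5)
z_n_small, z_f_small = adoption_thresholds(**small_cost)

if __name__ == "__main__":
    print("base > 1:", z_n, z_f)
    print("base < 1:", z_n_small, z_f_small)
```

In the first parameter set the frictionless threshold lies below the threshold with frictions, as in part (iii) of the proposition; in the second, where the term in parentheses is below 1, the ordering reverses, which is the extreme case the proof sets aside.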


REFERENCES

Acemoglu, Daron, Philippe Aghion, Leonardo Bursztyn, and David Hemous. 2012. “The Environment and Directed Technical Change.” American Economic Review, 102(1): 131–166.

Adamopoulos, Tasso, and Diego Restuccia. 2014. “The Size Distribution of Farms and International Productivity Differences.” American Economic Review, 104(6): 1667–1697.

Atkeson, Andrew, and Patrick J. Kehoe. 2005. “Modeling and Measuring Organization Capital.” Journal of Political Economy, 113(5): 1026–1053.

Axtell, Robert L. 2001. “Zipf Distribution of U.S. Firm Sizes.” Science, 293(5536): 1818–1820.

Bai, Chong-En, Chang-Tai Hsieh, and Yingyi Qian. 2006. “The Return to Capital in China.” Brookings Papers on Economic Activity, 2006(2): 61–88.

Barrows, Geoffrey, and Helene Ollivier. 2016. “Emission Intensity and Firm Dynamics: Reallocation, Product Mix, and Technology in India.” Manuscript.

Bartelsman, Eric, John Haltiwanger, and Stefano Scarpetta. 2013. “Cross-Country Differences in Productivity: The Role of Allocation and Selection.” American Economic Review, 103(1): 305–334.

Becker, Randy, and Vernon Henderson. 2000. “Effects of Air Quality Regulations on Polluting Industries.” Journal of Political Economy, 108(2): 379–421.

Bento, Pedro, and Diego Restuccia. 2016. “Misallocation, Establishment Size, and Productivity.” NBER Working Paper No. 22809.

Bloom, Nicholas, and John Van Reenen. 2010. “Why Do Management Practices Differ across Firms and Countries?” Journal of Economic Perspectives, 24(1): 203–224.

Bloom, Nicholas, Christos Genakos, Ralf Martin, and Raffaella Sadun. 2010. “Modern Management: Good for the Environment or Just Hot Air?” Economic Journal, 120(544): 551–572.

Brandt, Loren, Johannes Van Biesebroeck, and Yifan Zhang. 2012. “Creative Accounting or Creative Destruction? Firm-level Productivity Growth in Chinese Manufacturing.” Journal of Development Economics, 97(2): 339–351.

Brandt, Loren, Trevor Tombe, and Xiaodong Zhu. 2013. “Factor Market Distortions across Time, Space and Sectors in China.” Review of Economic Dynamics, 16(1): 39–58.

Buera, Francisco J., and Yongseok Shin. 2016 forthcoming. “Productivity Growth and Capital Flows: The Dynamics of Reforms.” American Economic Review.

Bustos, Paula. 2011. “Trade Liberalization, Exports, and Technology Upgrading: Evidence on the Impact of MERCOSUR on Argentinian Firms.” American Economic Review, 101(1): 304–340.

Cai, Hongbin, Yuyu Chen, and Qing Gong. 2016. “Polluting Thy Neighbor: Unintended Consequences of China’s Pollution Reduction Mandates.” Journal of Environmental Economics and Management, 76: 86–104.

Castañeda, Ana, Javier Díaz-Giménez, and José-Víctor Ríos-Rull. 2003. “Accounting for Earnings and Wealth Inequality.” Journal of Political Economy, 111(4): 818–857.

Cole, Harold L., Jeremy Greenwood, and Juan M. Sanchez. 2016. “Why Doesn’t Technology Flow from Rich to Poor Countries?” Econometrica, 84(4): 1477–1521.

Conesa, Juan Carlos, and Dirk Krueger. 2006. “On the Optimal Progressivity of the Income Tax Code.” Journal of Monetary Economics, 53(7): 1425–1450.

Copeland, Brian R., and M. Scott Taylor. 2004. “Trade, Growth, and the Environment.” Journal of Economic Literature, 42(1): 7–71.

Dasgupta, Susmita, Benoit Laplante, Nlandu Mamingi, and Hua Wang. 2001. “Inspections, Pollution Prices and Environmental Performance: Evidence from China.” Ecological Economics, 36(3): 487–498.

Dasgupta, Susmita, Robert E.B. Lucas, and David Wheeler. 1998. “Small Plants, Pollution and Poverty: New Evidence from Brazil and Mexico.” World Bank.

Duflo, Esther, Michael Greenstone, Rohini Pande, and Nicholas Ryan. 2013. “Truth-telling by Third-party Auditors and the Response of Polluting Firms: Experimental Evidence from India.” The Quarterly Journal of Economics, 128(4): 1499–1545.

Ebenstein, Avraham. 2012. “The Consequences of Industrialization: Evidence from Water Pollution and Digestive Cancers in China.” The Review of Economics and Statistics, 94(1): 186–201.

Ebenstein, Avraham, Maoyong Fan, Michael Greenstone, Guojun He, Peng Yin, and Maigeng Zhou. 2015. “Growth, Pollution, and Life Expectancy: China from 1991–2012.” American Economic Review: Papers and Proceedings, 105(5): 226–231.

Fullerton, Don, and Gilbert E. Metcalf. 1997. “Environmental Taxes and the Double-Dividend Hypothesis: Did You Really Expect Something for Nothing.” Chicago-Kent Law Review, 73(1): 221–256.

Goulder, Lawrence H. 1994. “Environmental Taxation and the “Double Dividend”: A Reader’s Guide.” NBER Working Paper No. 4896.

Greenstone, Michael, and B. Kelsey Jack. 2015. “Envirodevonomics: A Research Agenda for an Emerging Field.” Journal of Economic Literature, 53(1): 5–42.

Grossman, Gene M., and Alan B. Krueger. 1993. “Environmental Impacts of A North American Free Trade Agreement.” In The U.S.-Mexico Free Trade Agreement, ed. Peter M. Garber. Cambridge, Massachusetts: MIT Press.

Grossman, Gene M., and Alan B. Krueger. 1995. “Economic Growth and the Environment.” The Quarterly Journal of Economics, 110(2): 353–377.

Guner, Nezih, Gustavo Ventura, and Yi Xu. 2008. “Macroeconomic Implications of Size-dependent Policies.” Review of Economic Dynamics, 11(4): 721–744.

Hall, Robert E., and Charles I. Jones. 1999. “Why Do Some Countries Produce So Much More Output Per Worker Than Others?” The Quarterly Journal of Economics, 114(1): 83–116.

Hopenhayn, Hugo A. 2014a. “Firms, Misallocation, and Aggregate Productivity: A Review.” Annual Review of Economics, 6(1): 735–770.

Hopenhayn, Hugo A. 2014b. “On the Measure of Distortions.” NBER Working Paper No. 20404.

Hsieh, Chang-Tai, and Peter J. Klenow. 2009. “Misallocation and Manufacturing TFP in China and India.” The Quarterly Journal of Economics, 124(4): 1403–1448.

Hsieh, Chang-Tai, and Peter J. Klenow. 2014. “The Life-cycle of Plants in India and Mexico.” The Quarterly Journal of Economics, 129(3): 1035–1084.

Jiang, Liangliang, Chen Lin, and Ping Lin. 2014. “The Determinants of Pollution Levels: Firm-level Evidence from Chinese Manufacturing.” Journal of Comparative Economics, 42(1): 118–142.

Jia, Ruixue. 2014. “Pollution for Promotion.” Manuscript.

Kahn, Matthew E., Pei Li, and Daxuan Zhao. 2015. “Water Pollution Progress at Borders: The Role of Changes in China’s Political Promotion Incentives.” American Economic Journal: Economic Policy, 7(4): 223–242.

Lin, Jintai, Da Pan, Steven J. Davis, Qiang Zhang, Kebin He, Can Wang, David G. Streets, Donald J. Wuebbles, and Dabo Guan. 2014. “China’s International Trade and Air Pollution in the United States.” Proceedings of the National Academy of Sciences, 111(5): 1736–1741.

Lucas, Robert E. Jr. 1978. “On the Size Distribution of Business Firms.” The Bell Journal of Economics, 9(2): 508–523.

Luttmer, Erzo G. J. 2007. “Selection, Growth and the Size Distribution of Firms.” The Quarterly Journal of Economics, 122(3): 1103–1144.

Martin, Leslie A. 2013. “Energy Efficiency Gains from Trade: Greenhouse Gas Emission and India’s Manufacturing Firms.” Manuscript.

Martin, Leslie A., Shanthi Nataraj, and Ann E. Harrison. 2017. “In with the Big, Out with the Small: Removing Small-Scale Reservations in India.” American Economic Review, 107(2): 354–386.

Melitz, Marc J. 2003. “The Impact of Trade on Intra-Industry Reallocations and Aggregate Industry Productivity.” Econometrica, 71(6): 1695–1725.

Olley, George Steven, and Ariel Pakes. 1996. “The Dynamics of Productivity in the Telecommunications Equipment Industry.” Econometrica, 64(6): 1263–1297.

Parente, Stephen L., and Edward C. Prescott. 1994. “Barriers to Technology Adoption and Development.” The Journal of Political Economy, 102(2): 298–321.

Parente, Stephen L., and Edward C. Prescott. 1999. “Monopoly Rights: A Barrier to Riches.” The American Economic Review, 89(5): 1216–1233.

Poschke, Markus. 2015. “The Firm Size Distribution Across Countries and Skill-biased Change in Entrepreneurial Technology.” Manuscript.

Restuccia, Diego, and Richard Rogerson. 2008. “Policy Distortions and Aggregate Productivity with Heterogeneous Establishments.” Review of Economic Dynamics, 11(4): 707–720.

Restuccia, Diego, and Richard Rogerson. 2013. “Misallocation and Productivity.” Review of Economic Dynamics, 16(1): 1–10.

Rossi-Hansberg, Esteban, and Mark L. J. Wright. 2007. “Establishment Size Dynamics in the Aggregate Economy.” American Economic Review, 97(5): 1639–1666.

Shadbegian, Ronald J., and Wayne B. Gray. 2005. “Pollution Abatement Expenditures and Plant-level Productivity: A Production Function Approach.” Ecological Economics, 54(2): 196–208.

Shapiro, Joseph S., and Reed Walker. 2015. “Why is Pollution from U.S. Manufacturing Declining? The Roles of Trade, Regulation, Productivity and Preferences.” NBER Working Paper No. 20879.

Song, Zheng, Kjetil Storesletten, and Fabrizio Zilibotti. 2011. “Growing Like China.” American Economic Review, 101(1): 196–233.

Tombe, Trevor, and Xiaodong Zhu. 2015. “Trade, Migration and Productivity: A Quantitative Analysis of China.” Manuscript.

Vennemo, Haakon, Kristin Aunan, Henrik Lindhjem, and Hans Martin Seip. 2009. “Environmental Pollution in China: Status and Trends.” Review of Environmental Economics and Policy, 3(2): 209–230.

Ventura, Gustavo. 1999. “Flat Tax Reform: A Quantitative Exploration.” Journal of Economic Dynamics and Control, 23(9-10): 1425–1458.

Wang, Jun, and John Whalley. 2014. “Are Chinese Markets for Manufactured Products More Competitive than in the US? A Comparison of China-US Industrial Concentration Ratios.” NBER Working Paper No. 19898.

World Bank. 2001. China Air, Land and Water: Environmental Priorities for a New Millennium. Washington D.C.: World Bank Publications.

World Bank. 2007. Cost of Pollution in China: Economic Estimates of Physical Damages. Washington D.C.: World Bank Publications.

Young, Alwyn. 2000. “The Razor’s Edge: Distortions and Incremental Reform in the People’s Republic of China.” The Quarterly Journal of Economics, 115(4): 1091–1135.

Zheng, Siqi, and Matthew E. Kahn. 2013. “Understanding China’s Urban Pollution Dynamics.” Journal of Economic Literature, 51(3): 731–772.


ONLINE APPENDIX: NOT FOR PUBLICATION

[TO BE COMPLETED.] This section is a backup of the contents to be contained in the online appendix only and will be removed from the paper upon the completion of the first draft.

A. Detailed Descriptions of Data Sources

The data that we draw upon in Sections I.B and I.C of the main text come from the National General Survey of Pollution Sources (henceforth, NGSPS). Since, to the best of our knowledge, these data have not been used before in economic studies, we provide a detailed description of the dataset in this section. The content of this section is compiled from:

• Census Program of the National General Survey of Pollution Sources;

• Regulation on National General Survey of Pollution Sources [Decree of the State Council of the People’s Republic of China (No. 508)];48

• Technical Specifications Requirements of the First National General Survey of Pollution Sources.

A.1 General Introduction

The purpose of the NGSPS is to determine the number of pollution sources and their distribution across industries and regions; to understand the generation, discharge and treatment of major pollutants; to establish records for key pollution sources; to build a pollution source information database and an environmental statistics platform; and to provide a basis for formulating policies and plans for economic and social development and environmental protection.

The term “pollution source” here refers to premises, facilities and equipment that discharge pollutants into the environment in the process of production, living or other activities, or that have an adverse impact on the environment, as well as other sources that result in pollution.

The NGSPS surveys all entities and self-employed households that have pollution sources within the borders of the People’s Republic of China. The scope of the survey includes industrial pollution sources, agricultural pollution sources, domestic pollution sources, facilities for centralized treatment of pollution, and other facilities generating or discharging pollutants. In this paper, we focus on the industrial sources. The industrial sources include all production entities that belong to any of the 39 manufacturing industries according to the Industrial Classification for National Economic Activities (GB/T 4754-2002). A complete list of the industries is provided in Table .

48 An official English translation is provided at the end of the current document.

Key Sources.—The observations in the industrial sources are divided into two groups: key sources and regular sources. Key firms are required to file questionnaires that are more detailed than those filed by the regular firms. A firm is categorized as a key source if any one of the following conditions is satisfied:

1. All production entities that discharge pollutants containing heavy metals, hazardous waste, and radioactive substances.

2. All production entities that belong to the 11 heavily polluting industries, which include: Paper and Paper Products; Food Processing; Raw Chemical Materials and Chemical Products; Textile; Ferrous Metal Smelting and Rolling Processing; Food Manufacturing; Production and Supply of Electric and Heating Power; Manufacturing of Leather, Fur, and Feather; Processing of Petroleum, Coking, and Nuclear Fuel; Manufacturing of Non-metallic Mineral Goods; Non-ferrous Metal Smelting and Rolling Processing.

3. Firms with more than CNY 5 million in revenue that belong to the 16 key industries, which include: Beverage Manufacturing; Medicine Manufacturing; Chemical Fibers Manufacturing; Transportation Equipment Manufacturing; Coal Mining and Washing; Non-ferrous Metal Mining; Processing of Timber, and Manufacture of Wood, Bamboo, Rattan, Palm and Straw Products; Petroleum and Natural Gas Exploitation; General Purpose Machinery Manufacturing; Ferrous Metal Mining; Non-metal Mining; Apparel, Footwear and Caps Manufacturing; Water Production and Supply; Metal Products Manufacturing; Special Purpose Machinery Manufacturing; Communication Equipment, Computers and Other Electronic Equipment.

A firm is categorized as a regular firm if none of the above conditions holds.

Variables.—The industrial sources contain information on the following variables:

1. The firm’s basic registration information, geographic latitude and longitude, the receiving water body of the firm’s discharged wastewater, etc.

2. The consumption of raw and intermediate inputs, including: water consumption, energy consumption (coal, petroleum, gas, electricity, etc.), sulfur content of fuel, the consumption of hazardous intermediate inputs, etc.

3. The quantity of each of the products produced by the firm.

4. The type, size and number of the pollutant treatment equipment that the firm owns.

5. The generation, control, discharge and comprehensive utilization of various kinds of pollutants, and the operation of various kinds of pollution prevention and control facilities.

6. The monitoring of pollutant emissions, including: the date and frequency of the monitoring; the type, quantity, and concentration of the pollutant.

Pollutants Included.—The pollutants included for the industrial sources are those that have general implications for pollution control. In particular, they are:

1. Wastewater: Chemical Oxygen Demand (COD), Ammonia Nitrogen, Petrochemicals, Volatile Phenols, Mercury, Cadmium, Lead, Arsenic, Hexavalent Chromium, Cyanide. For the Paper and Paper Product, Food Processing, Food Manufacturing and Beverage Manufacturing industries, Five-day Biochemical Oxygen Demand (BOD5) is added. For Urban Sewage Treatment Plants, Total Phosphorus, Total Nitrogen, and BOD5 are added.

2. Exhaust: Soot, Industrial Dust, Sulfur Dioxide. For the Electrolytic Aluminium, Cement, Ceramic, and Frosted Glass industries, Fluoride is added. For the Vehicle Exhaust Census, Carbon Monoxide and Hydrocarbon are added.

3. Industrial Solid Waste: Hazardous Waste (according to the National Catalogue of Hazardous Wastes), Smelting Waste, Fly Ash, Slag, Coal Refuse, Gangue, Radioactive Slag.

4. Plaster discharged by desulfurization facilities, sludge generated by wastewater treatment facilities, and residue from burning hazardous waste.

5. Radioactive pollution sources from the utilization of Concomitant Radioactive Minerals and from Civil Nuclear Power Generation.

A.2 Methods of Measurement

Data on pollutant discharge are collected by three methods, which, in order of priority when data from multiple sources are available, are Monitoring, the Method of Emission Coefficient, and the Material Balance Method. In this section, we go over these three methods and the principles governing their use in practice.

Monitoring.—Monitored data are collected by measuring the actual quantity and pollutant concentration of wastewater and exhaust in order to infer the annual generation and discharge of certain types of pollutants. Monitored data are obtained from three sources, which, in order of priority when data from multiple sources are available, are: NGSPS monitoring, historical monitoring, and online monitoring. The NGSPS monitoring data are measured by the field staff of the survey in the survey year. The historical monitoring data are those measured and recorded in the past three years, with the latest data having the highest priority.49 The online monitoring data are those that are automatically uploaded by computerized and Internet-connected pollutant treatment equipment. Before historical and online monitoring data are accepted, the field staff of the survey verify that the production and pollutant treatment technology have not undergone substantial changes.

49 There are, in addition, four different sources of historical data. In order of adoption when more than one source is available, these are: the historical data monitored by the local environmental authorities, the data monitored upon the completion of a newly constructed project, the data monitored by a third-party agency, and the data self-reported by the firms.

A pollution source is subject to reporting monitored data if any of the following four criteria is satisfied:

1. National Monitored Key Pollution Sources: all firms listed in the List of National Intensively Monitored Firms ([2007] No. 93).

2. Facilities for Centralized Treatment of Pollution.

3. Provincial Monitored Key Pollution Sources that in sum account for 65% of total provincial emission as recorded in the 2005 Environmental Statistic Yearbook. That is, for each of the major pollutants, firms are ranked from the most polluting downward and their emissions are added up; all firms are recorded until the cumulative sum reaches 65% of the total provincial emission. The union of the resulting lists of firms over all major pollutants constitutes this category.

4. Newly established projects since 2005 whose pollutant discharge exceeds that of the least polluting firm among the Provincial Monitored Key Pollution Sources.

These firms are later referred to as the monitored firms in this document.

Starting from 2007Q1, for all National Monitored Key Pollution Sources, Centralized Treatment Facilities, and Provincial Monitored Key Pollution Sources, wastewater sources have to be monitored at least once per quarter, and exhaust sources have to be monitored at least once every six months.
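As an illustration, the 65% coverage rule for the Provincial Monitored Key Pollution Sources described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the procedure actually implemented by the survey: the function name, the input layout, and the use of the sum of the listed firms' emissions as a stand-in for the yearbook's provincial total are all hypothetical.

```python
def select_provincial_key_sources(emissions, threshold=0.65):
    """Return the set of firms selected by a cumulative-share rule.

    emissions: dict mapping pollutant -> {firm_id: annual emission}
    (hypothetical layout). For each pollutant, firms are ranked from
    most to least polluting and added until their cumulative emission
    reaches `threshold` of the total. The result is the union of the
    per-pollutant lists.
    """
    selected = set()
    for by_firm in emissions.values():
        # Assumption: the provincial total equals the sum over listed firms.
        total = sum(by_firm.values())
        if total <= 0:
            continue
        running = 0.0
        ranked = sorted(by_firm.items(), key=lambda kv: kv[1], reverse=True)
        for firm, amount in ranked:
            if running >= threshold * total:
                break  # coverage target already reached
            selected.add(firm)
            running += amount
    return selected


# Made-up numbers for two pollutants in one province.
example = {
    "COD": {"A": 50, "B": 30, "C": 15, "D": 5},
    "SO2": {"A": 5, "C": 25, "D": 70},
}
key_firms = select_provincial_key_sources(example)
```

In this made-up example, firms A and B are selected for COD and firm D for SO2, so the category contains A, B and D. A firm that is minor for one pollutant can still enter the list through another.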

The actual monitoring practice is subject to the following guidelines:

1. The monitoring of Mercury, Cadmium, Hexavalent Chromium, Lead, and Arsenic must be conducted at the discharge outlets of the individual factory workshops, with each outlet monitored separately.

2. Other water-based pollutants are monitored at the outlets of the factory.

3. The flow rates of wastewater and exhaust are monitored at the same time as the pollutant discharge.

4. The monitoring of pollution sources is conducted during representative polluting time periods.

5. All monitoring technical standards for wastewater pollutants follow Technical Specifications Requirements for the Monitoring of Surface Water and Wastewater (HJ/T91-2002) and Technical Specifications Requirements for the Monitoring of Wastewater Pollutants Emission (HJ/T92-2002).

6. For unstable wastewater sources, the samples are collected using a Water Ratio Automatic Sampler.

7. All monitoring technical standards for exhaust pollutants follow The Determination of Particulates and Sampling Methods of Gaseous Pollutants from Exhaust Gas of Stationary Sources (GB/T16157-1996).

Method of Emission Coefficient.—The generation and discharge of pollutants are calculated according to the Handbook of Emission Coefficient, edited by the Chinese Academy of Sciences. The benchmark coefficients are estimated based on the firm’s production technology, production scale, etc. The actual coefficients used are modified from the benchmark coefficients according to the firm’s use of intermediate inputs, managerial practice, and pollution treatment equipment installed. The actual pollutant generation and discharge are then calculated according to the actual production scale of the firm in year 2007.
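The logic of the method of emission coefficient can be illustrated with a short Python sketch. The Handbook's actual formulas are not reproduced here, so the multiplicative form, the `adjustment` factor, and the `removal_rate` parameter are illustrative assumptions only.

```python
def discharge_by_coefficient(output_quantity, benchmark_coefficient,
                             adjustment=1.0, removal_rate=0.0):
    """Hypothetical coefficient-based calculation.

    output_quantity:       actual production scale (units of product)
    benchmark_coefficient: pollutant generated per unit of product
    adjustment:            assumed multiplicative correction for inputs
                           and managerial practice
    removal_rate:          assumed share of the generated pollutant
                           removed by installed treatment equipment
    Returns (generation, discharge) in the coefficient's mass unit.
    """
    if not 0.0 <= removal_rate <= 1.0:
        raise ValueError("removal_rate must lie between 0 and 1")
    generation = output_quantity * benchmark_coefficient * adjustment
    discharge = generation * (1.0 - removal_rate)
    return generation, discharge


# Made-up numbers: 1,000 units of output, 0.5 kg generated per unit,
# a 20% upward adjustment, and equipment that removes 75% of the load.
generation, discharge = discharge_by_coefficient(1000, 0.5, 1.2, 0.75)
```

With these made-up numbers the firm generates about 600 kg of the pollutant and discharges about 150 kg.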

Material Balance Method.—The Material Balance Method is an application of conservation of mass to the analysis of physical systems. By accounting for material entering and leaving a system, mass flows can be identified that might otherwise have been unknown or difficult to measure. The exact conservation law used in the analysis of the system depends on the context of the problem, but all revolve around mass conservation, i.e., that matter cannot disappear or be created spontaneously. When the method is applied, a firm’s usage of intermediate inputs, energy consumption, water consumption, and production technology are considered simultaneously, so that the calculated pollutant generation and discharge realistically reflect the firm’s actual production and emission.
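A stylized version of the mass-conservation accounting is sketched below in Python. The breakdown of outflows into products, recovered material, and material removed by treatment is an assumed simplification for illustration; the survey's technical specifications may use a different decomposition.

```python
def material_balance_discharge(mass_in, mass_in_products,
                               mass_recovered=0.0, mass_removed=0.0):
    """Infer discharge of a substance as the residual of a mass balance.

    All arguments are masses of the same substance in the same unit
    (hypothetical breakdown):
      mass_in          - amount entering with raw materials and fuel
      mass_in_products - amount leaving embodied in products
      mass_recovered   - amount recycled or otherwise recovered
      mass_removed     - amount captured by treatment equipment
    """
    discharge = mass_in - mass_in_products - mass_recovered - mass_removed
    if discharge < 0:
        # Outflows cannot exceed inflows if mass is conserved.
        raise ValueError("reported outflows exceed inflows")
    return discharge


# Made-up numbers: 10 t enter, 2 t end up in products, 1 t is
# recovered and 4 t are captured by treatment equipment.
residual = material_balance_discharge(10.0, 2.0, 1.0, 4.0)
```

With these made-up numbers the residual of 3 t is attributed to discharge, without any direct measurement at the outlet.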

Principles of Data Source Selection.—The choice among the methods of measurement follows these guidelines:

1. The data of the Key Pollution Sources are obtained mainly through monitoring and the method of emission coefficient, while the material balance method is only used when the other two methods are not feasible. More specifically, the data of all firms that are identified as the monitored firms are obtained by monitoring. The data of other firms in the key sources are obtained by the method of emission coefficient.

2. The data of the Regular Pollution Sources are obtained mainly through the method of emission coefficient, while the material balance method is only used when the method of emission coefficient is not feasible.

3. Before being accepted, the monitored data are compared with those calculated by the method of emission coefficient. If the discrepancy is less than 20% of the monitored value, the monitored value is used. If the discrepancy is larger than 20%, the production technology and operating status of the firm are examined. If the operating status is in compliance with that specified in the technical regulations of the monitoring practice (for example, the firm has to reach at least 75% of its production capacity when the emission is monitored), the monitored data are accepted. Otherwise, the emission is calculated using the method of emission coefficient.
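The acceptance rule in the third guideline can be written as a small decision function. The Python sketch below is only an illustration: it reduces the examination of operating status to a single capacity-utilization check (the example given in the text), and it treats a discrepancy of exactly 20% as acceptable, a boundary case the text leaves open. All names are hypothetical.

```python
def choose_emission_value(monitored, coefficient_based, capacity_utilization,
                          tolerance=0.20, min_utilization=0.75):
    """Return (value, source) following the cross-check described above.

    monitored:            emission inferred from monitoring
    coefficient_based:    emission from the method of emission coefficient
    capacity_utilization: share of production capacity in use when the
                          emission was monitored (assumed proxy for
                          compliant operating status)
    """
    discrepancy = abs(monitored - coefficient_based)
    if discrepancy <= tolerance * monitored:
        return monitored, "monitoring"
    # Large discrepancy: accept monitoring only if the firm was
    # operating in compliance with the monitoring regulations.
    if capacity_utilization >= min_utilization:
        return monitored, "monitoring"
    return coefficient_based, "emission coefficient"


close = choose_emission_value(100.0, 90.0, 0.50)
compliant = choose_emission_value(100.0, 150.0, 0.80)
rejected = choose_emission_value(100.0, 150.0, 0.60)
```

In the three made-up cases, the monitored value is kept when the two figures are close or when the firm operated at 80% of capacity, and it is replaced by the coefficient-based value when the firm operated at only 60% of capacity.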

A.3 Comparisons of Statistical Features

In this section, we present detailed evidence to show that the samples covered in the CNEC and those in the NGSPS are broadly comparable.

B. Modeling Strategies

In the paper, we model the cost of adopting abatement technology as a fixed cost kE, regardless of the production scale of firms. Readers may be concerned that this is a misspecification. Two alternatives arise naturally: fixed costs plus operating costs, and scale-related fixed costs. In this section, we show that each alternative contradicts some of the empirical evidence we document from the data.

Operating Costs.—We start with the case of operating costs. After trimming the top 1% of observations as outliers, the distribution of the ratio of operating costs to total value of production is shown in Table B.1. On average, the operating costs of abatement equipment account for about 1.5% of a firm’s annual value of production. In addition, the median of the ratio is less than 0.5%, suggesting that operating costs are negligible for more than 50% of the firms. For conciseness, we choose to omit them from the model.

TABLE B.1—OPERATING COSTS AS A FRACTION OF OUTPUT

Minimum   25%      50%      75%      Maximum   Mean
0         0.0013   0.0049   0.0143   0.2084    0.0150

Fixed Costs Proportional to Production Level.—We do observe in the data that firms with a larger production scale tend to make larger investments in abatement equipment. For instance, the correlation between the log value of production and abatement equipment investment is equal to 0.64. However, the relationship weakens significantly once we remove the percentage interpretation implied by taking logarithms. In fact, if we calculate the ratio of abatement equipment investment to total production for firms below certain size quantiles, the ratio decreases gradually as we include more large firms in the calculation. Table B.2 reports aggregate treatment equipment installation costs as a fraction of the aggregate value of output when firms with size below certain quantiles are included. Recall that we have documented that the unit cost per processing capacity is also decreasing in equipment capacity. Together, these facts suggest the existence of increasing returns to scale in treatment equipment.

TABLE B.2—INSTALLATION COSTS AS A FRACTION OF OUTPUT

25%      40%      60%      80%      95%
0.6195   0.2475   0.1151   0.0668   0.0433
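The calculation behind Table B.2 can be sketched as follows. This Python snippet is an illustration with made-up data, not the code used for the paper: it assumes that firm size is measured by the value of output, that all outputs are positive, and that "below the q-th quantile" means the smallest fraction q of firms by count.

```python
def cumulative_cost_ratio(firms, quantiles=(0.25, 0.40, 0.60, 0.80, 0.95)):
    """Aggregate installation cost over aggregate output, by size cutoff.

    firms: list of (output_value, installation_cost) pairs
    (hypothetical layout). For each quantile q, the ratio is computed
    over the smallest share q of firms, ranked by output value.
    """
    ranked = sorted(firms, key=lambda firm: firm[0])
    n = len(ranked)
    ratios = {}
    for q in quantiles:
        k = max(1, int(q * n))  # number of smallest firms included
        subset = ranked[:k]
        total_output = sum(output for output, _ in subset)
        total_cost = sum(cost for _, cost in subset)
        ratios[q] = total_cost / total_output
    return ratios


# Made-up firms whose installation cost rises with output,
# but less than proportionally.
toy_firms = [(10, 5), (20, 6), (100, 10), (1000, 20)]
toy_ratios = cumulative_cost_ratio(toy_firms, quantiles=(0.5, 1.0))
```

In the toy data the ratio falls from about 0.37 for the smaller half of firms to about 0.04 once all firms are included, the same qualitative pattern as in Table B.2.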

Relative Prices of Physical versus Biological Equipment.—We assume in the main text that only the installation of biological equipment requires a fixed cost. In this paragraph, we provide evidence in support of this choice. The distributions of prices for physical and biological equipment in absolute terms (CNY 10,000 in year 2007) are listed in Table B.3. As Table B.3 shows, the cost of biological equipment is 7 to 15 times that of physical equipment. We further calculate aggregate treatment equipment installation costs as a fraction of the aggregate value of output for the different technologies. We find that the ratio is 3.07% for physical technology and 12.83% for biological technology. In addition, the ratio of total investment in physical equipment to that in biological equipment is 0.087, meaning that the installation costs of physical equipment, when aggregated across the economy, are only about 9% of those of biological equipment.

TABLE B.3—RELATIVE PRICES OF TREATMENT EQUIPMENT

Technologies   Minimum   25%     50%     75%      Maximum   Mean
Physical       0.002     2.000   6.000   25.000   800.000   36.140
Biological     0.04      30.00   85.00   260.00   4000.00   249.90
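As an arithmetic check on the price comparison, the ratio of biological to physical equipment prices can be computed at each statistic reported in Table B.3. The short Python snippet below only re-uses the table's entries; the dictionary keys are labels chosen here for illustration.

```python
# Entries of Table B.3 (CNY 10,000 in year 2007).
physical = {"min": 0.002, "p25": 2.0, "p50": 6.0,
            "p75": 25.0, "max": 800.0, "mean": 36.14}
biological = {"min": 0.04, "p25": 30.0, "p50": 85.0,
              "p75": 260.0, "max": 4000.0, "mean": 249.90}

# Biological price divided by physical price at each statistic.
ratios = {stat: biological[stat] / physical[stat] for stat in physical}

for stat, value in ratios.items():
    print(f"{stat}: {value:.1f}")
```

The mean and the three quartiles give ratios of roughly 6.9, 15.0, 14.2 and 10.4, which is the source of the "7 to 15 times" range. The ratios at the minimum and maximum (20 and 5) fall outside that range.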

This evidence suggests that, compared with biological equipment, the costs of installing physical treatment equipment are nearly negligible. Therefore, in our later calculation we assume that only the adoption of biological equipment is costly.

In summary, the evidence suggests that firms’ investment in abatement equipment increases with production scale, but to a much lesser extent in absolute terms. Again for simplicity and without loss of generality, we choose the fixed-cost modeling strategy.

C. Intensity-Size Correlation

D. Size-Distribution

E. Accounting Exercise

F. Calibration

G. The Partial Equilibrium Model