ABSTRACT

The Health-Income Gradient in the Early 20th Century

by Sarah Combelles Siegel

The positive relationship between socioeconomic status and health has been observed and established across many fields. This paper has two goals: (1) to find a relationship between health and income through child mortality rates in the 1900 and 1910 censuses and (2) to check for omitted variable bias from the mother’s childhood socioeconomic status. Previous research has established the impact of different fathers’ occupations on the child mortality index and finds a relationship between higher-earning occupations and a lower child mortality rate. I construct a panel dataset that links mothers back to their childhood households and test whether omitting the mother’s childhood household socioeconomic status creates an omitted variable bias. I find that a one standard deviation increase in the husband’s wealth is associated with a decrease in the child mortality rate of 2.03 deaths per 1000 children ever born. The mother’s childhood socioeconomic status also plays a role: a one standard deviation increase in her childhood household’s income is associated with a decrease in the child mortality rate of 1.36 deaths per 1000 children ever born. In general, I find that the omission of the mother’s childhood socioeconomic status creates a slight omitted variable bias. Further work will try to improve the matching procedure to find enough sisters in the sample to run household fixed effects and improve the identification strategy.


THE HEALTH-INCOME GRADIENT IN THE EARLY 20TH CENTURY

Thesis

Submitted to the

Faculty of Miami University

by

Sarah Combelles Siegel

Miami University

Oxford, Ohio

2020

Advisor: Dr. Gregory Niemesh

© 2020 Sarah Combelles Siegel


Contents

1 Introduction
2 Background
3 Data
    3.1 Methodology
    3.2 Record Linkages
        3.2.1 First Link: Marriage licenses to early censuses
        3.2.2 Second link: early censuses to late censuses
    3.3 Variables of Interest
    3.4 Sample Characteristics
4 Results
    4.1 Robustness Checks
5 Conclusion


List of Figures

1 Overview of Matching Process
2 Overview of Matching Process (Includes all combinations of matches)
3 TPR and PPV Rates with beta=1.375

List of Tables

1 Probit Scores for Matching Marriage Licenses to Early Censuses
2 Matching Table: Marriage Licenses to 1870/80 Censuses
3 Probit Scores for Matching Early Censuses to Late Censuses
4 Matching Table: 1870/80 Censuses to 1900/10 Censuses
5 Summary Statistics of Matches Sample in Late Census
6 1910 US Census Summary Statistics
7 Regression Results
8 Robustness Check: Different Occupation Score Calculations


1 Introduction

The relationship between socioeconomic status and health has been observed and established across many fields. Wealthier individuals usually have better health outcomes, both in subjective health measures and in indicators such as the presence of chronic diseases and longevity. Given the vast inequality around the world, it is important to understand this relationship. It is also important to understand how it is transmitted across generations: are children endowed with this disparity, or does it develop during their childhood? If it is endowed, is it because of their parents’ health statuses?

This paper has two goals: (1) to test whether there is a relationship between health and income through child mortality rates in the 1900 and 1910 censuses and (2) to test whether the mother’s childhood socioeconomic status plays a role in this relationship.

Previous research, such as Preston and Haines (1991), has established the impact of different fathers’ occupations on the child mortality index and finds a relationship between higher-earning occupations and a lower child mortality rate. This paper extends their work, and thus the literature, by constructing a panel dataset that links mothers back to their childhood households and testing whether omitting the mother’s childhood household socioeconomic status creates an omitted variable bias.

My results align with the literature in finding a relationship between health and income: a one standard deviation increase in the husband’s wealth is associated with a decrease in the child mortality rate of 2.03 deaths per 1000 children ever born. Furthermore, the findings suggest that the mother’s childhood socioeconomic status plays a role in the child mortality rate: a one standard deviation increase in her childhood household’s income is associated with a decrease in the child mortality rate of 1.36 deaths per 1000 children ever born. In general, I find that the omission of the mother’s childhood socioeconomic status from the initial research creates a slight omitted variable bias.

The next section provides background on the subjects at hand. Section 3 details the data used, including the methodology used to create the sample. Section 4 details the results, and the paper concludes in Section 5.

2 Background

A positive relationship between socioeconomic status (SES) and health outcomes is widely established across multiple fields and persists no matter how SES is measured: by income, educational attainment, or occupation. As SES increases, health outcomes improve, and conversely, as SES decreases, health outcomes worsen. This relationship is not limited to the extremes of SES but can be seen throughout the SES continuum, creating a gradient. When SES is measured by income, the relationship is called the health-income gradient. Even though the relationship is well established, the mechanisms underlying it and any potential causality are not well understood, even with research focused on trying to identify them.

A problem when examining the health-income gradient is simultaneity. Individuals who earn less may have worse health outcomes, and their worse health may in turn impede their ability to work and earn income. Researchers have tried to address this problem by investigating children’s health outcomes in relation to their family’s socioeconomic status. By focusing on children, researchers limit the way health directly affects the ability to earn income, which helps reduce the simultaneity in the question at hand. Case et al. (2002) use a nationally representative sample of children in the United States and find that health outcomes are positively related to household income, a relationship that increases in strength as children grow older. Not only does their paper provide support for the presence of the health-income gradient, but it also establishes a potential mechanism for the intergenerational impact of the gradient: children’s health outcomes are related to their childhood socioeconomic status, and their health outcomes stay relatively similar as they transition into adulthood.

Currie and Stabile (2003) use panel data on Canadian children to further examine the health-income gradient in children. They confirm Case et al.’s finding of its presence and seek to identify why the disparity increases in older children. They find that the primary mechanism behind this increase is that children in low-income households have a higher probability of facing health shocks. This support, together with a plausible mechanism for how the health-income gradient shifts over a person’s childhood, suggests that a person’s childhood socioeconomic status can stay with them and affect them even as adults.

A large body of other research on the health-income gradient follows the lead of Case et al. (2002) and looks at how children are affected by their household SES. Papers using data from Australia (Khenan et al., 2009), Germany (Reinhold and Jurges, 2012), and England (Currie et al., 2007; Case et al., 2008) all find evidence of the health-income gradient, although they vary in how they measure both health outcomes and SES. A discussion paper by Johnston et al. (2010) summarizes the literature on whether the income gradient is present in child health outcomes. The largest problem they find with this field of research is that many of the health indicators used are subjective measures of health, i.e. self-reported scales of general health, rather than doctor diagnoses or other objective measures. Research has found that responses to subjective health questions differ across groups, depending both on the respondent’s relationship to the child and on their socioeconomic status, which creates issues with using these measures (d’Uva et al., 2008). The discussion paper therefore reviews the literature in light of how different groups self-report differently. Even after updating the studies to correct for this, they continue to find a positive health-income gradient, but the magnitude and significance found in each individual study is “highly dependent upon the choice of respondent and the measure of child health” (Johnston et al., 2010).

Since this paper uses child mortality as the health indicator, it is important to compare it to studies with similar measures of health. Finch (2003) finds evidence of the health-income gradient in birth outcomes. Specifically, as household income decreases, the likelihood of an infant being born at a low birthweight increases. Importantly for this paper, low birthweight is linked to further health complications, including a higher risk of infant (and child) mortality. Additionally, a medical study in Tanzania found varying child mortality rates among different classifications of household socioeconomic status (Nattey et al., 2013). That paper looked at SES in categories and thus did not establish a gradient, but it did find that the child mortality rate was highest among the least well-off.

In establishing the health-income gradient in children, there is also the potential mechanism of the parents’ health, specifically the mother’s health, being transmitted to the child. A mother’s general health may be correlated with her child’s health at birth. For example, in Finch (2003), could the outcome be tied to the mother’s health at the time of birth (which is also linked to household SES)? Propper et al. (2007) use a data set from the UK that contains information on maternal health and maternal health behaviors along with the child’s health. Consistent with the rest of the literature, they find a health-income gradient in children’s health outcomes. When conducting further analysis on how maternal health affects the child’s health outcomes, they find that it is not the mother’s health behaviors that affect child health but rather the mother’s own health and mental health. Dowd (2007) also establishes that maternal health behaviors do not explain the income gradient’s presence in children’s health outcomes and suggests that there are other salient pathways through which this relationship operates. Propper et al. (2007) even find a strong association between “adverse events in early childhood of the mother and maternal anxiety and poor child health.” Furthermore, after controlling for the maternal health variables, they find that income is no longer significant. The important conclusion from their findings is that these issues are interconnected, which provides a potential explanation for how household SES establishes itself in children’s health outcomes.

Another potential pathway that could affect child health outcomes is maternal educational attainment. Research in contemporary settings consistently supports the conclusion that higher maternal education leads to improved child health outcomes (Caldwell, 1994). These findings may not translate to a historical setting, which lacked the developed health system that most people have access to today, but they do provide another pathway through which to consider the intergenerational impact of SES on children, as SES is correlated with educational attainment.

Most of these previous papers establish their findings using data collected in a contemporary setting. Thus, it is important to understand the historical setting of the early 20th century to see whether the findings of today’s literature can be substantiated in the past. Haines (2010) looks at how SES is related to infant and child mortality rates in England and Wales in the period 1890-1911. His work demonstrates differences in mortality rates across occupations and social class groupings, providing evidence of a relationship between SES and child mortality rates during that time period. The work also looks at differences between urban and rural outcomes. Compared to my analysis, Haines uses a more complex calculation of child mortality to account for underrepresented groups in the data.

Preston and Haines (1991) published a book summarizing their findings on child mortality in the United States using the 1900 census. The 1900 census asks respondents the number of children they have ever given birth to and the number of children they have surviving. From this information, the authors back out child mortality and compute child mortality rates for women who have been married for different lengths of time, in order to ensure parity among different social groups. With these estimates of child mortality rates, they then examine how child mortality is associated with many other household factors, including the father’s occupation. They find some significant occupational effects on child mortality that are generally consistent with the health-income gradient (occupations that earn more have lower rates of child mortality). My study extends the findings of Preston and Haines (1991) by using the income associated with each occupation to estimate the linear effect of income on child mortality rates in the 1900 and 1910 censuses. I also test whether maternal childhood SES has an effect. This last point is important because, as I have established throughout this background, there are many mechanisms through which the child health-income gradient may operate, and I may be able to isolate some of those pathways by including the mother’s childhood SES.

3 Data

The final dataset consists of 20,418 women, each found across three time periods: (1) in her childhood household in either the 1870 or 1880 census, (2) in her marriage license, and (3) in her married household in the 1900 or 1910 census (Ruggles et al., 2017). Following the use of marriage licenses to link women from their maiden names to their married names in Craig, Erikson, and Niemesh (2019), I conduct a two-step matching process. The first link is from the marriage licenses to the 1870 and 1880 censuses, where each woman is found as a child with her parents. The second link is from the 1870 and 1880 censuses to the 1900 and 1910 censuses, where I find each woman with her husband. Figure 1 provides an overview of the matching process.

3.1 Methodology

There are different ways of completing the matching process. The simplest, but most stringent, way to identify matches would be to require unique perfect matches across both datasets. Abramitzky et al. (2012) use age bands to increase the flexibility of their matching process while still requiring perfect name matches. Long and Ferrie (2013) find that this process should generally lead to an 18.8%-37.6% matching rate. Feigenbaum (2016) proposes using machine learning techniques, specifically a probit model, to allow for a higher rate of matches, as the model accounts for slight misspellings in names. I follow the machine learning process developed by Feigenbaum (2016) while being conscious of its potentially high false-positive rate.
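As a rough illustration of how a probit model turns name and age comparisons into a match probability, the sketch below evaluates the standard normal CDF at a linear index for one candidate pair. The feature names and coefficient values are hypothetical placeholders chosen for illustration; they are not the estimates reported in Table 1.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(features: dict, coefs: dict, intercept: float) -> float:
    """Probability that a candidate pair is a true match under a probit model."""
    index = intercept + sum(coefs[name] * value for name, value in features.items())
    return normal_cdf(index)


# Hypothetical coefficients: larger name distances and age gaps lower the score.
EXAMPLE_COEFS = {"first_name_dist": -6.0, "last_name_dist": -8.0, "age_gap": -0.4}
EXAMPLE_INTERCEPT = 1.5

# One hypothetical candidate pair with small name distances and a one-year age gap.
candidate = {"first_name_dist": 0.05, "last_name_dist": 0.02, "age_gap": 1.0}
score = probit_probability(candidate, EXAMPLE_COEFS, EXAMPLE_INTERCEPT)
```

The resulting score lies between 0 and 1 and can be compared against a cutoff to decide whether the pair counts as a match.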

3.2 Record Linkages

3.2.1 First Link: Marriage licenses to early censuses

I use a digitized data set that contains all marriages that occurred in the Commonwealth of Massachusetts from 1841-1915 (FamilySearch, 2016). To begin, I identify all women who are 0-16 years old in either the 1870 or 1880 census, which yields 259,628 women who meet the criteria. Then I identify the potential matches of girls to these women based on the distances between the woman’s last name, maiden name, father’s first name, and mother’s first name, and an age band of 5 years for the reported age in both datasets. To calculate name distances, I use the Jaro-Winkler string distance (Winkler, 1999). Once all the potential matches are found, I manually review approximately 10,000 marriage licenses and their hits, identifying whether each hit is a match or not. The hand-matching process introduces human error and hard codes any human biases into the process. I then estimate a probit model on the hand-matched sample; the results of the model are listed in Table 1. Applying the probit model to the rest of the data gives the probability that each potential match is a true match.
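For readers unfamiliar with the name comparison, the following is a minimal sketch of the Jaro-Winkler measure in its common textbook form (prefix scale 0.1, prefix length capped at four characters). These parameter values, the conversion to a distance as one minus the similarity, and the reading of the age band as plus or minus five years of implied birth year are assumptions; the thesis does not state which variant or software was used.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(s1: str, s2: str,
                            prefix_scale: float = 0.1,
                            max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings sharing a common prefix."""
    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)


def jaro_winkler_distance(s1: str, s2: str) -> float:
    """Distance form: 0 for identical strings, larger for dissimilar ones."""
    return 1.0 - jaro_winkler_similarity(s1, s2)


def within_age_band(birth_year_a: int, birth_year_b: int, band: int = 5) -> bool:
    """Assumed reading of the age band: implied birth years within +/- band."""
    return abs(birth_year_a - birth_year_b) <= band
```

A slightly misspelled name such as "MARHTA" for "MARTHA" receives a small distance under this measure, which is what lets the matching tolerate transcription errors.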

Given the probability, I need to determine which potential matches are matches. There are two cutoffs to consider: (1) at what probability (a number ranging from 0 to 1) a potential match can be considered a match (called alpha) and (2) if the marriage license has multiple potential matches, how close the ratio of the top two scores can be to one before the two are indistinguishable (called beta). For example, if the ratio is equal to one, the top two potential matches have the same probit score, meaning that I cannot distinguish which one is the true match. There are a variety of ways to determine which cutoffs to use. I consider two standard machine learning measures: efficiency, measured by the true positive rate (TPR), and accuracy, measured by the positive predictive value (PPV). TPR reflects Type II errors (false negatives) and PPV reflects Type I errors (false positives). The measures are calculated by:

\[
\mathrm{TPR} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}
\qquad \text{and} \qquad
\mathrm{PPV} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}.
\]
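The two rates can be computed directly from a hand-labeled training sample. The sketch below assumes a list of (probit score, is true match) pairs and treats any pair scoring at or above alpha as a predicted match; the data are made up for illustration and the beta rule is left out to keep the example short.

```python
def confusion_counts(scored_pairs, alpha):
    """Count true positives, false positives, and false negatives at cutoff alpha."""
    tp = sum(1 for score, is_match in scored_pairs if score >= alpha and is_match)
    fp = sum(1 for score, is_match in scored_pairs if score >= alpha and not is_match)
    fn = sum(1 for score, is_match in scored_pairs if score < alpha and is_match)
    return tp, fp, fn


def tpr_and_ppv(tp, fp, fn):
    """True positive rate (efficiency) and positive predictive value (accuracy)."""
    tpr = tp / (tp + fn)
    ppv = tp / (tp + fp)
    return tpr, ppv


# Hypothetical hand-labeled pairs: (probit score, whether the reviewer called it a match).
labeled = [(0.9, True), (0.8, True), (0.6, False),
           (0.45, True), (0.3, False), (0.2, True)]

rates_at_half = tpr_and_ppv(*confusion_counts(labeled, alpha=0.5))
rates_at_lower = tpr_and_ppv(*confusion_counts(labeled, alpha=0.4))
```

Lowering alpha in this toy sample recovers one more true match, which raises TPR; in larger samples the same move typically admits more false positives as well.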

There is an inherent tradeoff between the two measures: if I value efficiency and aim to maximize it, then accuracy will decrease, and vice versa. After determining that beta should be set equal to 1.375, I looked at the rates of TPR and PPV to determine alpha (as seen in Figure 3). I set alpha equal to 0.5, which gives TPR = 85.9% and PPV = 83.1%. These rates are two percentage points lower than those in Feigenbaum’s (2016) matching process. After choosing these cutoffs, I apply the probit model and the chosen alpha and beta levels to the rest of the potential matches. If I match the same girl in the census to two different women in the marriage licenses dataset, I drop both since I cannot distinguish which one is the better match. At the end of this process, I have 39,665 women linked from their marriage licenses back to their childhood homes in either the 1870 or 1880 census. Table 2 shows how the matching process worked for the full sample of marriage licenses back to the early censuses.
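The selection rule described above can be sketched as follows. The function names, the identifiers, and the exact inequalities (at or above alpha; ratio of best to second-best score at or above beta) are assumptions made for illustration, since the text does not spell out how ties at the cutoffs are handled.

```python
from collections import defaultdict

ALPHA = 0.5    # minimum probit score for the best candidate (first-link value)
BETA = 1.375   # minimum ratio of the best score to the second-best score


def pick_match(candidates, alpha=ALPHA, beta=BETA):
    """candidates: list of (census_id, score). Return the chosen census_id or None."""
    if not candidates:
        return None
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    best_id, best_score = ranked[0]
    if best_score < alpha:
        return None
    if len(ranked) > 1:
        second_score = ranked[1][1]
        # Reject when the runner-up is too close to tell the two apart.
        if second_score > 0 and best_score / second_score < beta:
            return None
    return best_id


def link_records(candidates_by_license, alpha=ALPHA, beta=BETA):
    """Apply the cutoffs, then drop census records claimed by several licenses."""
    chosen = {}
    for license_id, candidates in candidates_by_license.items():
        match = pick_match(candidates, alpha, beta)
        if match is not None:
            chosen[license_id] = match
    claims = defaultdict(list)
    for license_id, census_id in chosen.items():
        claims[census_id].append(license_id)
    return {lic: cen for lic, cen in chosen.items() if len(claims[cen]) == 1}


# Hypothetical candidate lists keyed by marriage license identifier.
example = {
    "L1": [("C1", 0.90), ("C2", 0.30)],   # clear winner
    "L2": [("C3", 0.80), ("C4", 0.70)],   # top two too close
    "L3": [("C5", 0.40)],                 # below alpha
    "L4": [("C6", 0.95)],                 # same census girl as L5
    "L5": [("C6", 0.70)],
}
links = link_records(example)
```

In this toy input only the first license survives: the second fails the beta rule, the third fails alpha, and the last two are dropped because they claim the same census record.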

As is clear in Figure 1, the overall matching rate for the first link is 15.28%, which is low compared to the literature. There are a few reasons for this: (1) three individuals must be found in the childhood household (daughter, mother, and father); (2) matching is based on five names and only one numerical variable; and (3) state of birth or other easy identifiers are not available to simplify the process and reduce the number of census observations that are double matched. Although the matching rate is disappointing, it reflects these data constraints rather than the matching procedure itself.

3.2.2 Second link: early censuses to late censuses

After successfully linking women from the marriage licenses to the early censuses, I repeat a similar process to link the matched women to the 1900 and 1910 censuses. The 1870 and 1880 censuses provide each woman’s state of birth, which helps to identify potential matches. I also use the woman’s first name and age, her husband’s first name and age, and their married last name to find potential matches. This time, I run a combination of different census years to find all the possible matches: from the 1870 census to the 1900 and 1910 censuses and from the 1880 census to the 1900 and 1910 censuses. It is possible to find matches for a woman across all four censuses, although this occurs only five times. Table 4 lists the matching rate for each of the combinations.

The quality of matches in the second link is higher than in the first link. Almost 75% of the potential matches have only one hit, signaling a higher-quality group of potential matches compared to the first link.


After finding the potential matches, I follow the same process: (1) create a probit model, (2) evaluate the TPR and PPV, (3) establish the cutoffs, and (4) find the matches. I manually matched a sample of 4,000 observations drawn from potential matches linking the marriage licenses to the 1880 census and then to the 1900 census. As in the first round, I set beta equal to 1.375 and then look at the rates of TPR and PPV to pick alpha (Figure 1 shows the rates). This time, I set alpha equal to 0.4, which gives TPR = 96.36% and PPV = 89.08%, both higher than in the first link. This suggests that the quality of the second link is much higher.

Using these cutoffs, I complete the final dataset, which includes 20,418 women found across all three time periods. For women found in both the 1870 and 1880 censuses, I use their information from the 1880 census, and for women found in both the 1900 and 1910 censuses, I use their information from the 1910 census in the hope of more accurately capturing the child mortality rates for those families.

Some limitations of this process might lead to biases within the matched sample. First, I require the presence of three individuals in the childhood household (woman, mother, and father). Second, the couple must be found together in the late censuses, which favors couples who stay together and are not widowed. Given that I am linking child mortality to income levels, I must be cognizant that some occupations carry higher probabilities of death than others and that the final sample may underrepresent those occupations.

3.3 Variables of Interest

The variables of interest are child mortality (the dependent variable) and the husband’s income and maternal grandfather’s income (the independent variables). As mentioned in the background, the census does not directly ask about child mortality but asks women (1) how many children they have ever given birth to and (2) how many of those children are still surviving. From these two variables, I can calculate the rate of child mortality.

\[
\text{child mortality}_i = \frac{\text{number of children ever born}_i - \text{number of children surviving}_i}{\text{number of children ever born}_i} \tag{1}
\]
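Equation (1) translates into a short helper. The function names and the input checks are illustrative additions rather than part of the thesis; the second function only rescales the rate to the deaths per 1000 children ever born used when reporting results.

```python
def child_mortality_rate(children_ever_born: int, children_surviving: int) -> float:
    """Share of a woman's children ever born who are no longer surviving."""
    if children_ever_born <= 0:
        raise ValueError("rate is undefined for women with no children ever born")
    if not 0 <= children_surviving <= children_ever_born:
        raise ValueError("surviving children must lie between 0 and children ever born")
    return (children_ever_born - children_surviving) / children_ever_born


def deaths_per_thousand(rate: float) -> float:
    """Rescale a mortality share to deaths per 1000 children ever born."""
    return 1000.0 * rate


# Hypothetical woman reporting four children ever born and three surviving.
example_rate = child_mortality_rate(4, 3)
```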

A limitation is the lack of any indication of when the children died. Contemporary child mortality rates only include deaths before the age of five; my measure, however, does not provide any indication of age at death. For women who are old enough, I may capture children who died as adults.

My key independent variables, the husband’s and the maternal grandfather’s income, are used as indicators of socioeconomic status. Given that the 1900 census asks about a respondent’s occupation and does not directly ask respondents their incomes or their wealth, approximations based on the respondent’s occupation must suffice. I use two measures of median occupational income or wealth to approximate household income. The two measures are constructed similarly but use different years as their base. The first is a variable created by Ruggles et al. (2017) that assigns each occupation an income score. The score is scaled from 0-80 and is a continuous measure representing the median total income (in hundreds of 1950 dollars) of all individuals nationally with that occupation in 1950. The higher the score, the higher the occupation’s income.

The second measure uses the 1870 census, which asks about wealth, so I can estimate each occupation’s SES standing from these data. To construct this estimate, I find the median wealth for each occupation and, where possible, do so separately by region and immigrant status. Then, I assign that amount to all individuals with that occupation at the closest level possible. Unlike the occupational income score, this is a direct measure in dollars (1870 dollars) and is not scaled to a certain maximum, which makes it easier to interpret.
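One way to picture the "closest level possible" assignment is a lookup that tries the most detailed cell first and falls back to the occupation-wide median. The field names, the two-level fallback order, and the absence of a minimum cell size are assumptions for this sketch; the thesis does not describe these details.

```python
from collections import defaultdict
from statistics import median


def build_median_tables(records):
    """records: iterable of dicts with occupation, region, immigrant, wealth."""
    detailed = defaultdict(list)
    by_occupation = defaultdict(list)
    for r in records:
        detailed[(r["occupation"], r["region"], r["immigrant"])].append(r["wealth"])
        by_occupation[r["occupation"]].append(r["wealth"])
    detailed_medians = {key: median(values) for key, values in detailed.items()}
    occupation_medians = {key: median(values) for key, values in by_occupation.items()}
    return detailed_medians, occupation_medians


def assign_wealth(person, detailed_medians, occupation_medians):
    """Use the occupation-region-immigrant median if it exists, else the occupation median."""
    key = (person["occupation"], person["region"], person["immigrant"])
    if key in detailed_medians:
        return detailed_medians[key]
    # Returns None when the occupation never appears in the 1870 records.
    return occupation_medians.get(person["occupation"])


# Hypothetical 1870 records (wealth in 1870 dollars).
records_1870 = [
    {"occupation": "farmer", "region": "NE", "immigrant": False, "wealth": 1000},
    {"occupation": "farmer", "region": "NE", "immigrant": False, "wealth": 3000},
    {"occupation": "farmer", "region": "MW", "immigrant": True, "wealth": 500},
]
detailed_tbl, occupation_tbl = build_median_tables(records_1870)
```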

My analysis relies primarily on the 1870 wealth indicators for both the husband’s occupation and the maternal grandfather’s occupation because I believe they more accurately reflect occupational standing for both generations. Robustness checks are done using the 1950 occupational income scores as well.

3.4 Sample Characteristics

To check that the final matched sample of 20,418 women is representative, I compare it to all couples in the full count 1910 census. Table 5 reports the summary statistics for the final matched sample, while Table 6 reports them for couples in the full count 1910 census. A couple of summary statistics are unique to the final matched sample, such as the grandfather’s occupational income score and wealth, because they come from the early censuses.

There are some important differences between the matched sample and the full count 1910 census. The husband’s occupational income score is almost 11 points higher, close to a full standard deviation above the full US census, meaning that I am dealing with higher socioeconomic statuses than in the full population. The ages of the men and women are also lower and have a smaller standard deviation. Given that I am limiting the couples to women who were children in either 1870 or 1880, this makes sense. It also contributes to the generally lower values of the variables relating to children. Younger women have not necessarily completed their childbearing years, and since I am capturing younger women, it makes sense that the average number of children ever born is lower than in the US census by almost 1.5 children. Given that this base is lower, it makes sense that the number of children surviving is lower too. However, it is important to note that the average number of deaths (per woman) is less than half of the US average (0.3 compared to 0.75). Although this gap may seem concerning, the child mortality rate is not as far from the US average. In the sample, the child mortality rate is 134 deaths per 1000 children born, while the rate is 173 deaths per 1000 children born for the US. In generalizing the findings, my sample might provide a smaller estimate than the true estimate because (1) the sample is wealthier and (2) the child mortality rate is lower.

4 Results

The regressions are weighted by the number of children born and run with robust clustered standard errors. The sample is restricted to women who have given birth and whose husbands and fathers (the maternal grandfathers) have a reported income, which leaves a final sample of 11,131 women. Controls include the woman’s age, to account for the ability to have more children as age increases; her husband’s age and the maternal grandfather’s age, to capture how income changes over one’s lifetime; the size of the location, to account for differences between rural and urban areas; and the duration of the marriage, to account for how long couples have had to have children.

Table 7 reports the regression output. The key regression to estimate the presence of the health-income gradient is:

\[
\text{child mortality}_i = \beta_0 + \beta_1 \cdot \text{husband's income}_i + \beta_2 \cdot \text{controls}_i \tag{2}
\]

For the health-income gradient to be present, I would expect to see a negative β1: as the family’s income increases, child mortality rates decrease. I test this relationship in regressions (1), (2), and (3) in Table 7. I first run a simple regression with no controls, then add the size of the location, and then add the rest of the controls. In all three regressions, the household’s income has a negative correlation with the child mortality rate. Given the consistent statistical significance and the similar estimates, I have confidence in the results.
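To make the weighting concrete, the sketch below fits the no-controls version of the regression by weighted least squares, using the number of children born as weights. It is written in plain Python on synthetic numbers that are not from the thesis sample, and it omits the controls and the clustered standard errors, so it only illustrates how the point estimate of β1 is formed.

```python
def weighted_simple_regression(x, y, weights):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1)."""
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    if sxx == 0:
        raise ValueError("x has no weighted variation")
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Synthetic, illustrative data only (not from the thesis sample).
husband_wealth = [500.0, 1500.0, 3000.0, 6000.0]   # 1870 dollars
children_born = [2, 5, 3, 4]                       # regression weights
mortality_rate = [0.20 - 0.00001 * w for w in husband_wealth]

intercept, slope = weighted_simple_regression(
    husband_wealth, mortality_rate, children_born
)
```

Because the synthetic outcome is an exact linear function of wealth, the fitted slope reproduces the negative coefficient built into the data; with real data the weights shift the estimate toward families with more children.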

The coefficient on the key regressor in (3) means that an increase of $100 in income (1870 dollars) is associated with a decrease of 0.82 deaths per 1000 children ever born. Given that the standard deviation of the wealth score is $2834, a one standard deviation increase in the husband’s wealth is associated with a decrease of about 23.2 deaths per 1000 children ever born (28.34 × 0.82), which is an important decline to note. For comparison, the average child mortality rate is 134 deaths per 1000 children ever born. To provide context, $100 in 1870 dollars is equivalent to $1969.24 in 2020 dollars.
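The unit conversions behind these statements can be checked with a few lines of arithmetic. The sketch below takes the column (3) coefficient from Table 7 and the standard deviation from Table 5; the function and variable names are illustrative only.

```python
# Table 7, column (3): change in the child mortality rate (a share between
# 0 and 1) per additional 1870 dollar of husband's occupation wealth.
BETA_1 = -0.00000818
# Table 5: standard deviation of husband's occupation wealth (1870 dollars).
SD_HUSBAND_WEALTH = 2834.123


def deaths_per_1000(coefficient, change_in_dollars):
    """Convert a per-dollar effect on the rate into deaths per 1000 children."""
    return coefficient * change_in_dollars * 1000


per_100_dollars = deaths_per_1000(BETA_1, 100)
per_std_dev = deaths_per_1000(BETA_1, SD_HUSBAND_WEALTH)

print(f"{per_100_dollars:.2f}")  # -0.82
print(f"{per_std_dev:.1f}")      # -23.2
```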

Having established the presence of a health-income gradient in the child mortality rate, I now examine whether adding the income of the woman’s childhood household affects the size or significance of the gradient. The aim is to check whether the estimate suffers from omitted variable bias. Thus, the new regression is:

$$ \text{childmortality}_i = \beta_0 + \beta_1 \cdot \text{husband's income}_i + \beta_2 \cdot \text{grandfather's income}_i + \beta_3 \cdot \text{controls}_i \tag{3} $$

The most important part of this second regression is to see how the estimate of β1 changes and, thus, how including the previously omitted variable affects the estimate. I can predict the sign of the omitted variable bias by looking at (1) the correlation between grandfather’s income and husband’s income and (2) the sign of β2, the relationship between grandfather’s income and the child mortality rate. The correlation between grandfather’s income and husband’s income is likely positive: a woman’s childhood socioeconomic class is positively correlated with her husband’s socioeconomic class. I also expect β2 to be negative. Given the intergenerational impact of the health-income gradient, the higher a woman’s childhood socioeconomic status, the healthier she is likely to be as an adult and, thus, the healthier her children. I therefore hypothesize that omitting grandfather’s income creates a negative omitted variable bias in the coefficient on husband’s income: without it, β1 is estimated to be more negative than it should be.
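The sign argument follows the textbook omitted variable bias formula. As a sketch that ignores the other controls, let \tilde{\beta}_1 denote the large-sample coefficient on husband's income when grandfather's income is left out, and let \delta_1 denote the slope from regressing grandfather's income on husband's income. Both symbols are introduced here for illustration and do not appear elsewhere in the paper.

```latex
\tilde{\beta}_1 = \beta_1 + \beta_2 \, \delta_1,
\qquad
\delta_1 = \frac{\operatorname{Cov}(\text{husband's income}_i,\ \text{grandfather's income}_i)}
                {\operatorname{Var}(\text{husband's income}_i)}
```

With \beta_2 < 0 and \delta_1 > 0, the bias term \beta_2 \delta_1 is negative, so the regression without grandfather's income makes the husband's coefficient look more negative than \beta_1.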

This is supported by regression (4), which adds the maternal grandfather’s income. There is a slight reduction in the magnitude of the coefficient on the husband’s occupation wealth: an increase of $100 in income is now associated with a decrease of 0.72 deaths per 1000 children ever born. In addition, every extra $100 (in 1870 dollars) of the maternal grandfather’s occupation wealth, which proxies for the environment the woman grew up in, is associated with a decrease of 0.59 deaths per 1000 children ever born. Given the intergenerational impact of the health-income gradient, this could reflect the mother’s health, which could affect the health of her children at birth and throughout their childhood.
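A small simulation can illustrate why adding a positively correlated regressor with a negative effect shrinks the husband's coefficient. Everything below is invented for illustration (standardized variables, made-up effect sizes, hypothetical function names); it is not a reconstruction of the paper's data or estimates.

```python
import random


def cov(a, b):
    """Sample covariance (population form; the scaling cancels in the slopes)."""
    a_bar, b_bar = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)


def short_slope(x1, y):
    """OLS slope of y on x1 alone (x2 omitted)."""
    return cov(x1, y) / cov(x1, x1)


def long_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 with an intercept, via the normal equations."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


random.seed(0)
n = 20000
TRUE_B1, TRUE_B2 = -0.8, -0.6  # made-up effects, both negative

grandfather = [random.gauss(0, 1) for _ in range(n)]
# Husband's income is positively correlated with grandfather's income.
husband = [0.5 * g + random.gauss(0, 1) for g in grandfather]
mortality = [
    TRUE_B1 * h + TRUE_B2 * g + random.gauss(0, 1)
    for h, g in zip(husband, grandfather)
]

b1_short = short_slope(husband, mortality)
b1_long, b2_long = long_slopes(husband, grandfather, mortality)
delta = cov(husband, grandfather) / cov(husband, husband)

print(b1_short < b1_long)  # omitting grandfather's income overstates the gradient
print(abs(b1_short - (b1_long + b2_long * delta)) < 1e-9)  # bias identity holds
```

In the simulation the coefficient from the short regression is more negative than the one from the long regression, which is the same direction of change as between regressions (3) and (4).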

Although I find some omitted variable bias, this does not give a complete picture, and the relationship could be capturing other stories, which is why I do not use causal language. For example, it could reflect how class and marriage interact. A woman is more likely to marry a man of equivalent occupational standing, which means that the two income variables are highly correlated and the regression may not identify what I have assumed it identifies.


4.1 Robustness Checks

I conduct robustness checks in Table 8 to check whether using the 1950 occupational income score changes the results. Because the score is a scale, it is difficult to interpret. A 1-point increase in the husband’s occupational income score is associated with a decrease of 1.05 deaths per 1000 children ever born. Given that the standard deviation of the occupational income score is 12.57, a one standard deviation increase in the husband’s score is associated with a decrease of 13.2 deaths per 1000 children ever born. This magnitude differs from the main results; however, as discussed in the methodology section, I believe that the 1870 wealth scores provide a more accurate picture. Thus, the important conclusion from these robustness checks is that they support the initial finding of a health-income gradient.

When checking for omitted variable bias, I use both the 1870 wealth score and the 1950 occupational income score for the grandfather. Given that the grandfather’s income is captured in 1870 or 1880, the 1870 wealth score is a much more reasonable and accurate measure than the 1950 occupational income score. In Table 8, regression (2) shows that omitted variable bias does play a role: a 1-point increase in the husband’s occupational income score is now associated with a decrease of 0.81 deaths per 1000 children ever born. Furthermore, a $100 increase in grandfather’s income (in 1870 dollars) is associated with a decrease of 0.73 deaths per 1000 children ever born.

When using the 1950 occupational income score for the grandfather’s income, as in (4), there is a slight decrease in the magnitude of β1: a 1-point increase in the husband’s occupational income score is associated with a decrease of 0.98 deaths per 1000 children ever born. However, I do not find a statistically significant relationship between grandfather’s income (using the 1950 occupational income score) and the child mortality rate. As previously noted, this most likely reflects the measure itself, which is far removed from the grandfather’s occupational income in 1870.

5 Conclusion

In conclusion, this paper finds evidence that the child mortality rate is associated with the husband’s income. Furthermore, when adding a control for the mother’s childhood socioeconomic status, the income gradient does not disappear but lessens slightly, and I find a significant relationship with the grandfather’s income.

For the previous work on the child mortality rate in the 1900 and 1910 censuses by Preston and Haines (1991), this indicates that omitting the woman’s childhood socioeconomic status created some omitted variable bias. It also indicates that other factors are potentially at play in these relationships.

Although I was able to identify an omitted variable bias and improve my understanding of the health-income gradient, this work is not causal and thus has limited applications. To try to isolate causality, the goal is to enlarge the matched sample enough to include household fixed effects, a stronger identification strategy that could support more discussion of causality than this study. The key to this will be finding enough sisters in the matched dataset.


References

[1] Abramitzky, Ran, Leah Platt Boustan, and Katherine Eriksson. “Europe’s Tired,

Poor, Huddled Masses: Self-Selection and Economic Outcomes in the Age of

Mass Migration.” American Economic Review 102, no. 5 (2012): 1832–56.

https://doi.org/10.1257/aer.102.5.1832.

[2] Abramitzky, Ran, Leah Platt Boustan, Katherine Eriksson, James J. Feigenbaum, and Santiago Perez. “Automated Linking of Historical Data.” National Bureau of Economic Research, no. w25825 (May 10, 2019).

[3] Bailey, Martha, Connor Cole, Morgan Henderson, and Catherine Massey. “How Well Do

Automated Linking Methods Perform? Lessons from U.S. Historical Data.” National

Bureau of Economic Research, no. w24019 (May 2019).

[4] Caldwell, John C. “How Is Greater Maternal Education Translated into Lower Child

Mortality?” Health Transition Review 4, no. 2 (1994): 224–29.

[5] Case, Anne, Diana Lee, and Christina Paxson. “The Income Gradient

in Children’s Health: A Comment on Currie, Shields and Wheatley

Price.” Journal of Health Economics 27, no. 3 (May 1, 2008): 801–7.

https://doi.org/10.1016/j.jhealeco.2007.10.005.

[6] Case, Anne, Darren Lubotsky, and Christina Paxson. “Economic Status and Health in

Childhood: The Origins of the Gradient.” American Economic Review 92, no. 5

(December 2002): 1308–34. https://doi.org/10.1257/000282802762024520.

[7] Craig, Jacqueline, Katherine Eriksson, and Gregory T Niemesh. “Marriage and the Intergenerational Mobility of Women: Evidence from Marriage Certificates 1850-1910,” n.d., 24.

[8] Currie, Alison, Michael A. Shields, and Stephen Wheatley Price. “The

Child Health/Family Income Gradient: Evidence from England.”

Journal of Health Economics 26, no. 2 (March 1, 2007): 213–32.

https://doi.org/10.1016/j.jhealeco.2006.08.003.

[9] Currie, Janet, and Mark Stabile. “Socioeconomic Status and Child Health: Why Is the

Relationship Stronger for Older Children?” American Economic Review 93, no. 5

(December 2003): 1813–23. https://doi.org/10.1257/000282803322655563.

[10] Deaton, Angus. “Health, Inequality, and Economic Development.”

Journal of Economic Literature 41, no. 1 (March 2003): 113–58.

https://doi.org/10.1257/002205103321544710.


[11] Dowd, Jennifer Beam. “Early Childhood Origins of the Income/Health Gradient: The

Role of Maternal Health Behaviors.” Social Science and Medicine 65, no. 6 (September

1, 2007): 1202–13. https://doi.org/10.1016/j.socscimed.2007.05.007.

[12] FamilySearch (2016, June). “Massachusetts Marriages, 1695-1910.” Database: http://FamilySearch.org. Index based upon data collected by the Genealogical Society of Utah, Salt Lake City.

[13] Feigenbaum, James J. “A Machine Learning Approach to Census Record Linking,” n.d.,

34.

[14] Feigenbaum, James J. “Multiple Measures of Historical Intergenerational Mobility:

Iowa 1915 to 1940.” The Economic Journal 128, no. 612 (July 1, 2018): F446–81.

https://doi.org/10.1111/ecoj.12525.

[15] Finch, Brian Karl. “Socioeconomic Gradients and Low Birth-Weight: Empirical and

Policy Considerations.” Health Services Research 38, no. 6p2 (2003): 1819–42.

https://doi.org/10.1111/j.1475-6773.2003.00204.x.

[16] Haines, Michael R. “Socio-Economic Differentials in Infant and Child Mortality during

Mortality Decline: England and Wales, 1890–1911.” Population Studies, June 4, 2010.

https://doi.org/10.1080/0032472031000148526.

[17] Johnston, David W., Carol Propper, Stephen E. Pudney, and Michael A. Shields. “The

Income Gradient in Childhood Mental Health: All in the Eye of the Beholder?”

Journal of the Royal Statistical Society: Series A (Statistics in Society) 177, no. 4

(October 1, 2014): 807–27. https://doi.org/10.1111/rssa.12038.

[18] Johnston, David W., Carol Propper, Stephen Pudney, and Michael A. Shields. “Is

There an Income Gradient in Child Health? It Depends Whom You Ask.” SSRN

Scholarly Paper. Rochester, NY: Social Science Research Network, March 22, 2010.

https://papers.ssrn.com/abstract=1575883.

[19] Khanam, Rasheda, Hong Son Nghiem, and Luke B. Connelly. “Child Health and the

Income Gradient: Evidence from Australia.” Journal of Health Economics 28, no. 4

(July 1, 2009): 805–17. https://doi.org/10.1016/j.jhealeco.2009.05.001.

[20] Kippersluis, Hans van, Tom Van Ourti, Owen O’Donnell, and Eddy van Doorslaer. “Health and Income across the Life Cycle and Generations in Europe.” Journal of Health Economics 28, no. 4 (July 1, 2009): 818–30. https://doi.org/10.1016/j.jhealeco.2009.04.001.

[21] Long, Jason, and Joseph Ferrie. “Intergenerational Occupational Mobility in Great

Britain and the United States Since 1850.” American Economic Review 103, no.

4 (June 2013): 1109–37. https://doi.org/10.1257/aer.103.4.1109.


[22] Nattey, Cornelius, Honorati Masanja, and Kerstin Klipstein-Grobusch. “Relationship between Household Socio-Economic Status and under-Five Mortality in Rufiji DSS, Tanzania.” Global Health Action 6 (January 24, 2013). https://doi.org/10.3402/gha.v6i0.19278.

[23] Newcombe, H. G. NYSIIS Algorithm Handbook of Record Linkage. New York, NY:

Oxford University Press, 1988.

[24] Propper, Carol, John Rigg, and Simon Burgess. “Child Health: Evidence on the Roles

of Family Income and Maternal Mental Health from a UK Birth Cohort.” Health

Economics 16, no. 11 (2007): 1245–69. https://doi.org/10.1002/hec.1221.

[25] Reinhold, Steffen, and Hendrik Jürges. “Parental Income and Child Health in Germany.”

Health Economics 21, no. 5 (2012): 562–79. https://doi.org/10.1002/hec.1732.

[26] Ruggles, Steven, Kate Genadek, Ronald Goeken, Josiah Grover, and Matthew Sobek. “IPUMS.” Text. IPUMS, April 18, 2019. https://ipums.org/projects/ipums-usa/d010.v6.0.

[27] Uva, Teresa Bago d’, Eddy Van Doorslaer, Maarten Lindeboom, and Owen O’Donnell.

“Does Reporting Heterogeneity Bias the Measurement of Health Disparities?” Health

Economics 17, no. 3 (2008): 351–75. https://doi.org/10.1002/hec.1269.

[28] Winkler, William E. “The State of Record Linkage and Current Research Problems.”

Statistical Research Division, U.S. Census Bureau, 1999.


Figure 1: Overview of Matching Process

Note: This figure gives an overview of the two links. Women matched to multiple censuses are counted only once.

Figure 2: Overview of Matching Process (Includes all combinations of matches)

Note: This figure provides a more complex look at the two links by accounting for the different combinations of the census matches.

Figure 3: TPR and PPV Rates with beta=1.375

(a) From Marriage License to 1880 Census. Panel title: Rate of PPV and TPR for Marriage Licenses to 1880 Census. The plot shows TPR and PPV (y-axis: percent) against alpha (x-axis: 0 to 1) when beta=1.375. Picked alpha=0.5 (dashed line). PPV = positive predictive value and TPR = true positive rate.

(b) From 1880 to 1900 Census. Panel title: Rate of PPV and TPR for Early Census to Late Census Matches. The plot shows TPR and PPV (y-axis: percent) against alpha (x-axis: 0 to 1) when beta=1.375. Picked alpha=0.4 (dashed line). PPV = positive predictive value and TPR = true positive rate.

Note: This figure looks at the TPR and PPV rates for both linking stages when beta is set to 1.375.
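For readers unfamiliar with the two rates, the sketch below shows how TPR and PPV could be computed on a hand-checked sample for a given alpha and beta. The acceptance rule is an assumption made for illustration: a link is accepted when its best score is at least alpha and is at least beta times the runner-up score. The records and function names are hypothetical.

```python
def accept(best_score, second_score, alpha, beta):
    """Assumed rule: best score clears alpha and beats the runner-up by ratio beta."""
    if best_score < alpha:
        return False
    if second_score is None or second_score == 0:
        return True  # no competing candidate
    return best_score / second_score >= beta


def tpr_ppv(records, alpha, beta):
    """records: (best_score, second_score, is_true_match) for hand-checked links."""
    tp = fp = fn = 0
    for best, second, is_true in records:
        accepted = accept(best, second, alpha, beta)
        if accepted and is_true:
            tp += 1
        elif accepted:
            fp += 1
        elif is_true:
            fn += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0  # share of true matches recovered
    ppv = tp / (tp + fp) if tp + fp else 0.0  # share of accepted links that are true
    return tpr, ppv


hand_checked = [
    (0.95, 0.30, True),
    (0.80, None, True),
    (0.70, 0.60, True),   # ratio below 1.375: rejected, a false negative
    (0.45, 0.10, True),   # rejected when alpha = 0.5: a false negative
    (0.90, 0.20, False),  # accepted but wrong: a false positive
    (0.30, 0.25, False),
]

for alpha in (0.2, 0.5, 0.8):
    print(alpha, tpr_ppv(hand_checked, alpha, beta=1.375))
```

Raising alpha trades recovered matches (TPR) against the accuracy of accepted links (PPV), which is the trade-off the two panels display.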


Table 1: Probit Scores for Matching Marriage Licenses to Early Censuses

Predictors: (1) Match (hits=1); (2) Match (hits>1)
All strings match perfectly and age difference is 0: 1.294∗∗∗ (0.252); 1.521∗∗∗ (0.188)
perfect all and 0: -0.0197 (0.384); 0.0842 (0.136)
First name matches perfectly: -0.936∗∗ (0.329); -0.172 (0.299)
Last name matches perfectly: -1.004∗ (0.480); -0.255 (0.468)
Mother’s first name matches perfectly: -0.0421 (0.279); -0.333 (0.288)
Father’s first name matches perfectly: -1.205∗ (0.533); 1.531∗ (0.701)
Father’s last name matches perfectly: 0.490 (0.476); 0.261 (0.471)
String distance of first name: -3.356 (2.459); -3.651 (2.659)
String distance of last name: -11.65∗∗∗ (1.820); -1.105 (3.684)
String distance of mother’s first name: -2.616 (1.938); -4.796∗ (2.396)
String distance of father’s first name: -11.25∗∗ (3.583); -1.265 (4.390)
String distance of father’s last name: -0.836 (1.217); -10.51∗∗ (3.566)
Last letter of first name matches: 0.283 (0.254); 0.489 (0.271)
Last letter of first name matches: -0.553 (0.474); -0.901 (0.505)
First letter of mother’s first name matches: 0.111 (0.249); -0.101 (0.253)
Last letter of mother’s first name matches: 0.0963 (0.197); 0.193 (0.243)
Last letter of father’s first name matches: 0.214 (0.368); -0.472 (0.505)
First letter of father’s last name matches: -0.137 (0.518); -3.487∗∗∗ (0.875)
Last letter of father’s last name matches: 0.667 (0.465); 1.552∗∗ (0.510)
First name NYSIIS match: 0.849∗∗ (0.287); 0.448 (0.289)
Last name NYSIIS match: 0.304 (0.558); -1.065∗ (0.538)
Mother’s first name NYSIIS match: 0.365 (0.227); 0.482 (0.267)
Father’s first name NYSIIS match: 0.466 (0.469); -0.582 (0.666)
Father’s last name NYSIIS match: 0.809 (0.553); 1.719∗∗ (0.537)
Absolute value of woman’s age difference = 0: 0 (.); 0 (.)
Absolute value of woman’s age difference = 1: -0.204 (0.127); -0.346∗∗ (0.109)
Absolute value of woman’s age difference = 2: -0.512∗∗∗ (0.148); -0.877∗∗∗ (0.118)
Absolute value of woman’s age difference = 3: -1.072∗∗∗ (0.166); -1.526∗∗∗ (0.141)
Absolute value of woman’s age difference = 4: -4.411∗∗∗ (0.331)
Absolute value of woman’s age difference = 5: -4.286∗∗∗ (0.324); -4.228∗∗∗ (0.423)
Multiple perfect string matches: -2.424∗∗∗ (0.0979)
Hits: -0.0348∗∗∗ (0.00275)
Hits-squared: 0.0000906∗∗∗ (0.00000854)
Constant: 0.894 (0.897); 1.374 (1.310)
Observations: 2767; 22410

Standard errors in parentheses. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001

Note: The sample size is one-half of the number of handmatched observations from the marriage licenses to the 1880 census.

String distances are calculated using the Jaro-Winkler string distance method (Winkler 1999). NYSIIS is a code assigned to

each general name (Newcombe, 1988). Number of hits represents the number of potential matches with (1) less than a

Jaro-Winkler distance of 0.2 for all strings, (2) the absolute value of distance of age less than 5 years and (3) first initials for

father’s first name, and woman’s first and last names must match.
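The screening rule in this note can be sketched in code. The implementation below is a plain version of the standard Jaro-Winkler similarity (prefix scale 0.1, prefix capped at four characters), with distance taken as one minus similarity. The record layout, field names, and helper names are hypothetical, and the ages are assumed to be already adjusted to the census year.

```python
def jaro(s, t):
    """Jaro similarity between two strings, from 0 (no match) to 1 (identical)."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit = [False] * len(s)
    t_hit = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_seq = [c for c, hit in zip(s, s_hit) if hit]
    t_seq = [c for c, hit in zip(t, t_hit) if hit]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) / 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_distance(s, t, prefix_scale=0.1):
    """One minus the Jaro-Winkler similarity (0 means identical strings)."""
    sim = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:4], t[:4]):
        if a != b:
            break
        prefix += 1
    return 1.0 - (sim + prefix * prefix_scale * (1.0 - sim))


# Hypothetical field names for the strings compared in the note.
NAME_FIELDS = ("first_name", "last_name", "father_first_name",
               "father_last_name", "mother_first_name")
INITIAL_FIELDS = ("first_name", "last_name", "father_first_name")


def is_potential_hit(license_rec, census_rec, max_distance=0.2, max_age_gap=5):
    """Apply the three screening conditions from the table note."""
    close_names = all(
        jaro_winkler_distance(license_rec[f], census_rec[f]) < max_distance
        for f in NAME_FIELDS
    )
    close_age = abs(license_rec["age"] - census_rec["age"]) < max_age_gap
    same_initials = all(license_rec[f][:1] == census_rec[f][:1]
                        for f in INITIAL_FIELDS)
    return close_names and close_age and same_initials


bride = {"first_name": "MARTHA", "last_name": "SMITH", "father_first_name": "JOHN",
         "father_last_name": "SMITH", "mother_first_name": "ANN", "age": 22}
census = {"first_name": "MARHTA", "last_name": "SMYTH", "father_first_name": "JOHN",
          "father_last_name": "SMYTH", "mother_first_name": "ANN", "age": 23}
print(is_potential_hit(bride, census))  # True
```

The number of census records passing this screen for a given marriage license is what the table calls the number of hits.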


Table 2: Matching Table: Marriage Licenses to 1870/80 Censuses

Category                      To 1870 Census                To 1880 Census
Total Number of Marriages       102906              100%      232404              100%
No Potential Hits                62177     100%   60.42%      141191     100%   60.75%
Causes of Match Failure          27148     100%   26.38%       62121     100%   26.73%
  Score too low (hits==1)         7496   27.61%                15495   24.94%
  Score too low (hits>1)         16049   59.12%                34786   56.00%
  Ratio too low                     88    0.32%                  205    0.33%
  Censuses double matched         3515    12.9%                11635   18.73%
Matched                          13581     100%   13.20%       29092     100%   12.52%
  Matches (hits==1)              10092   74.31%                21824   75.02%
  Matches (hits>1)                3489   25.69%                 7268   24.98%

For each census, the columns give the count, the percentage within the category, and the percentage of all marriages.


Table 3: Probit Scores for Matching Early Censuses to Late Censuses

Predictors: (1) Match (hits=1); (2) Match (hits>1)
All names match perfectly: 1.107∗∗ (0.379); 0.587 (1.177)
Wife’s first name matches perfectly: -0.985 (0.610); 0.982 (1.358)
Husband’s first name matches perfectly: -2.369 (2.383); 1.231 (2.764)
Last name matches perfectly: -0.690 (0.413); 0.314 (1.209)
Wife’s first name distance: -1.233 (5.005); 1.612 (11.88)
Last name distance: -11.49∗∗∗ (1.982); -18.96∗∗∗ (5.012)
Husband’s first name distance: -3.019 (5.926); 13.23 (16.18)
Last letter of wife’s last letter matches: -0.297 (0.494); -0.0514 (1.155)
First letter of husband’s first name matches: 0.509 (0.908); -1.579 (2.004)
Last letter of husband’s first name matches: 1.438∗ (0.634); 2.527 (1.605)
Last letter of husband’s first name matches: 0.622∗∗ (0.201); 0.599 (0.386)
Wife’s first name NYSIIS match: 1.115 (0.623); 0.916 (0.934)
Husband’s first name NYSIIS match: 1.227 (2.183)
Last name NYSIIS match: 0.742∗∗ (0.233); 0.137 (0.407)
Absolute value of Wife’s age difference = 0: 0 (.); 0 (.)
Absolute value of Wife’s age difference = 1: -0.275 (0.183); -0.265 (0.217)
Absolute value of Wife’s age difference = 2: -0.440 (0.226); -0.755∗∗ (0.247)
Absolute value of Wife’s age difference = 3: -1.626∗∗∗ (0.250); -1.512∗∗∗ (0.329)
Absolute value of Wife’s age difference = 4: -3.899∗∗∗ (0.452); -2.925∗∗∗ (0.559)
Absolute value of Wife’s age difference = 5: -3.813∗∗∗ (0.534)
Absolute value of Husband’s age difference = 0: 0 (.); 0 (.)
Absolute value of Husband’s age difference = 1: -0.319 (0.190); -0.698∗∗ (0.230)
Absolute value of Husband’s age difference = 2: -0.430 (0.237); -1.091∗∗∗ (0.297)
Absolute value of Husband’s age difference = 3: -1.091∗∗∗ (0.282); -1.114∗∗∗ (0.279)
Absolute value of Husband’s age difference = 4: -2.862∗∗∗ (0.306)
Absolute value of Husband’s age difference = 5: -3.669∗∗∗ (0.418)
Multiple matches with all names perfect matches: -1.965∗∗∗ (0.216)
All names match perfectly and age differences = 0: 0.134 (0.428)
Hits: -0.0873∗∗∗ (0.0226)
Hits-squared: 0.000610∗∗ (0.000229)
Constant: 0.631 (1.624); -2.956 (4.644)
Observations: 1242; 1430

Standard errors in parentheses. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001

Note: The sample size is one-half of the number of handmatched observations from the 1880 census to the 1900 census. String

distances are calculated using the Jaro-Winkler string distance method (Winkler 1999). NYSIIS is a code assigned to each

general name (Newcombe, 1988). Number of hits represents the number of potential matches with (1) less than a

Jaro-Winkler distance of 0.2 for all strings, (2) the absolute value of distance of woman’s and husband’s ages less than 5 years and (3) first initials for woman’s first and last names and state of birth must match.


Table 4: Matching Table: 1870/80 Censuses to 1900/10 Censuses

Category                      From 1870 Matches             From 1870 Matches
                              to 1900 Census                to 1910 Census
Total Number of Marriages        13581              100%       13581              100%
No Potential Hits                 7157     100%   52.70%        6836     100%   50.34%
Causes of Match Failure           7392     100%   18.67%        2138     100%   15.76%
  Score too low (hits==1)          824   38.54%                  834   40.94%
  Score too low (hits>1)          1296   60.62%                 1184   58.13%
  Ratio too low                     12    0.56%                    5    0.25%
  Censuses double matched            6    0.28%                   14    0.69%
Matched                           4268     100%   31.47%        4695     100%   34.60%
  Matches (hits==1)               3955   92.66%                 4348   92.61%
  Matches (hits>1)                 313    7.33%                  347    7.39%

Category                      From 1880 Matches             From 1880 Matches
                              to 1900 Census                to 1910 Census
Total Number of Marriages        29092              100%       29092              100%
No Potential Hits                20912     100%   71.88%       12003     100%   41.26%
Causes of Match Failure           3358     100%   11.55%        4718     100%   16.24%
  Score too low (hits==1)         1668   49.20%                 1848   39.17%
  Score too low (hits>1)          1652   49.20%                 2819   59.75%
  Ratio too low                     12    0.36%                   27    0.57%
  Censuses double matched           26    0.77%                   24    0.51%
Matched                           4808     100%   16.53%       12332     100%   42.45%
  Matches (hits==1)               4454   92.64%                11432   92.70%
  Matches (hits>1)                 354    7.36%                  900    7.30%


Table 5: Summary Statistics of Matched Sample in Late Census

Variable                                      Mean  Std. Dev.     Min.       Max.       N
Woman’s Age                                 36.623      7.419       19         64   20416
Husband’s Age                               39.974      9.268       19         86   20416
Children ever born                           1.915      2.063        0         18   20416
Children surviving                           1.602      1.712        0         13   20416
Child Mortality Rate                         0.134      0.255        0          1   14539
Size of place                               14.423     22.731        1         90   20416
Husband’s Occupation Income (1950)          29.201     12.573        4         80   15636
Husband’s Occupation Wealth (1870)        3445.179   2834.123        0    15792.5   15492
Grandfather’s Occupation Income (1950)      24.101      9.638        3         80   19813
Grandfather’s Occupation Wealth (1870)    2751.244   2301.376   21.024  16205.667   20048

Note: Only includes couples that are matched across all three periods. Uses information from the 1910 census if found, otherwise the 1900 census. Occupation Income Score (1950) assigns an occupational income score to each occupation in all years, which represents the median total income (in hundreds of 1950 dollars) of all persons with that particular occupation in 1950 (IPUMS). Occupation Wealth (1870) is based on the median wealth of that occupation from the 1870 census and is calculated in 1870 dollars.

Table 6: 1910 US Census Summary Statistics

Variable                                      Mean  Std. Dev.     Min.       Max.         N
Woman’s Age                                 38.041     12.745       19        114  16548698
Husband’s Age                               42.503     13.397        0        130  16548698
Children ever born                           3.459      3.135        0         86  16548555
Children surviving                           2.713      2.435        0         85  16548555
Child Mortality Rate                         0.173      0.256        0          1  13964034
Size of place                               13.899      26.05        1         90  16548698
Husband’s Occupation Income (1950)          18.837     13.442        0         80  16548698

Note: Only includes couples with both wife and husband present in the 1910 US Census. Husband’s Occupation Income Score (1950) assigns an occupational income score to each occupation in all years, which represents the median total income (in hundreds of 1950 dollars) of all persons with that particular occupation in 1950 (IPUMS).


Table 7: Regression Results

Dependent variable in all columns: Child Mortality Rate

                                                      (1)              (2)              (3)              (4)
Husband’s Occupation Wealth (1870)         -0.00000753∗∗∗   -0.00000748∗∗∗   -0.00000818∗∗∗   -0.00000717∗∗∗
                                            (0.000000930)    (0.000000928)    (0.000000934)    (0.000000962)
Grandfather’s Occupation Wealth (1870)                                                        -0.00000590∗∗∗
                                                                                                (0.00000119)
Controls:
Size of Place                                                            X                X                X
Ages                                                                                      X                X
Length of Marriage (years)                                                                X                X
Observations                                        11131            11131            11131            10943

Note: Robust standard errors reported in parentheses with levels: ∗ p<0.05, ∗∗ p<0.01, ∗∗∗ p<0.001. Weighted by number of children born. Occupation Wealth (1870) is based on the median wealth of that income from the 1870 census and is calculated in 1870 dollars. Age controls include quadratic estimates for woman’s age, husband’s age, and grandfather’s age. Sources: Complete count 1870, 1880, 1900 and 1910 Federal Census data from Ruggles et al. (2017). Marriage certificates from FamilySearch.org.


Table 8: Robustness Check: Different Occupation Score Calculations

Dependent variable in all columns: Child Mortality Rate

                                                      (1)              (2)              (3)              (4)
Husband’s Occupation Income (1950)            -0.00105∗∗∗     -0.000807∗∗∗      -0.00105∗∗∗     -0.000985∗∗∗
                                               (0.000207)       (0.000210)       (0.000207)       (0.000217)
Grandfather’s Occupation Wealth (1870)                      -0.00000726∗∗∗
                                                              (0.00000117)
Grandfather’s Income (1950)                                                                       -0.0000654
                                                                                                  (0.000321)
Observations                                        11201            11013            11201            10890

Note: Robust standard errors reported in parentheses with levels: ∗ p<0.05, ∗∗ p<0.01, ∗∗∗ p<0.001. Weighted by number of children born. Occupation Income Score (1950) assigns an occupational income score to each occupation in all years, which represents the median total income (in hundreds of 1950 dollars) of all persons with that particular occupation in 1950 (IPUMS). Occupation Wealth (1870) is based on the median wealth of that income from the 1870 census and is calculated in 1870 dollars. Controls include ages, size of place, and marriage length. Age controls include quadratic estimates for woman’s age, husband’s age, and grandfather’s age. Sources: Complete count 1870, 1880, 1900 and 1910 Federal Census data from Ruggles et al. (2017). Marriage certificates from FamilySearch.org.
