Deriving and Modelling Fertility Variables in the NCDS and BCS70 Dylan Kneale, Institute of Education Supervisors: Professor Heather Joshi & Dr Jane Elliott


Page 1: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Deriving and Modelling Fertility Variables in the NCDS and BCS70

Dylan Kneale, Institute of Education

Supervisors: Professor Heather Joshi & Dr Jane Elliott

Page 2: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Pathways to Parenthood: Exploring the influence of Context as a Predictor of Timing to Parenthood

• Overall Aims

1. Define early parenthood… teenage?

2. Explore strength of different ‘known’ sets of predictors of early parenthood

3. Explore the influence of context as a predictor of early parenthood

4. Explore the influence of context on the other side of the spectrum: postponement and childlessness

Page 3: Deriving and Modelling Fertility Variables in the NCDS and BCS70

[Timeline of data collection sweeps in the two cohorts]

NCDS: 1958 (Birth), 1965 (Age 7), 1969 (Age 11), 1974 (Age 16), 1981 (Age 23), 1991 (Age 33), 2000 (Age 42), 2004 (Age 46)

BCS70: 1970 (Birth), 1975 (Age 5), 1980 (Age 10), 1986 (Age 16), 1996 (Age 26), 2000 (Age 30), 2004 (Age 34)

i. Deriving fertility variables (NCDS)

ii. Modelling fertility variables (NCDS & BCS70)

Page 4: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Deriving fertility variables (NCDS) I

• Fertility variables collected at all waves since childhood (Age 23, 33, 41/42, 46 years)

• 2004 sweep allows for analysis of full fertility schedule, adding to previous analyses of NCDS cohort, e.g. Holdsworth & Elliott (2001)

• Want to create variables for Event History Analysis

• First attempt to create variable for modelling entry into parenthood in Event History terms could work as:

Time to first parenthood (Event) = Minimum Recorded Child’s Date of Birth (Age 23, 33, 41/42, 46 years)

Childless cohort members (Censored) = Maximum Recorded Interview Date (Age 23, 33, 41/42, 46 years)
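The first-attempt rule above (event at the earliest recorded child's date of birth, otherwise censoring at the latest interview date) can be sketched in a few lines. This is a minimal illustration, not the code used for the NCDS derivation: the function name, the argument names and all dates below are hypothetical.

```python
from datetime import date


def first_parenthood_record(birth_date, child_dobs, interview_dates):
    """Return (age_in_years, event) for one cohort member.

    event is True when a first birth is observed (time = earliest recorded
    child's date of birth) and False when the member is censored as
    childless at the latest recorded interview date.
    """
    if child_dobs:
        end, event = min(child_dobs), True
    elif interview_dates:
        end, event = max(interview_dates), False
    else:
        raise ValueError("no child dates of birth and no interview dates")
    age = (end - birth_date).days / 365.25
    return age, event


# Hypothetical cohort members (all dates invented for illustration only).
born = date(1958, 3, 5)
parent = first_parenthood_record(
    born,
    child_dobs=[date(1985, 6, 1), date(1982, 3, 5)],
    interview_dates=[date(1981, 9, 1), date(1991, 9, 1)],
)
childless = first_parenthood_record(
    born,
    child_dobs=[],
    interview_dates=[date(1981, 9, 1), date(2004, 9, 1)],
)
```

Note that the rule trusts every recorded history equally; it has no way of knowing whether a later sweep collected a complete fertility history.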

Page 5: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Deriving fertility variables (NCDS) II

• Using this method produces the following summary KM statistics:

| | Median Age 1st Parenthood | % Childless at last observation (46) |
|---|---|---|
| ♂ | 30.6 years | 26.7% |
| ♀ | 27.0 years | 20.4% |

• Median estimates of entry to parenthood are higher than other sources for NCDS.

• Of more concern, however: estimates of childlessness using data up to 46 years don’t differ significantly from those up to 33 years.

• Equivalent to only an additional 3.3% of women becoming mothers (Holdsworth & Elliott 2001).

• ONS estimates the transition between 33 and 46 years at twice this rate.
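For readers unfamiliar with how the two KM summary statistics are obtained, here is a minimal product-limit (Kaplan-Meier) sketch in plain Python. It is an illustration under assumed inputs, not the software used for the slides; the ages and event flags are invented.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : age at first birth, or age at censoring
    events : True if a first birth was observed, False if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        births = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - births / at_risk
        curve.append((t, survival))
    return curve


def median_age(curve):
    """Earliest age at which the estimated survivor function is <= 0.5."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached within the observation window


# Invented ages, for illustration only.
ages = [19.0, 22.5, 24.0, 27.0, 29.0, 31.0, 33.0, 46.0, 46.0]
observed = [True, True, True, True, True, True, False, False, False]
curve = kaplan_meier(ages, observed)
median = median_age(curve)
childless_at_end = curve[-1][1]  # estimated proportion still childless
```

The median is the age at which half the sample has entered parenthood, and the final value of the survivor function is the estimated proportion childless at last observation.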

Page 6: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Deriving fertility variables (NCDS) III

• Possible discrepancy at age 41/42 years: Partially complete fertility history collected

• ^ Symbol reflects a ‘text fill’ – used in CAPI questionnaires.

• Text fill used to tailor questionnaire to respondent. “Since 1991” meant to be applicable to all those present at age 33 years but not those missing.

• Possible that this filter was used for those rejoining the study at age 41/42 and 46 years when not needed?

• Build up evidence for this:

Page 7: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Deriving fertility variables (NCDS) IV

• Evidence 1: Those rejoining the study had a lower number of births recorded before 1991 than those continuing.

6.3% of births recorded at 41/42 years occurred before the previous interview for those continuing in the study

3.7% of births for those re-entering the study occurred before 1991

• Evidence 2: Information from other sources shows that some of those recorded as childless had children:

Of 880 cohort members recorded as being childless at 41/42 years and not present at data collection at 33 years

12% had children living elsewhere (natural?)

Conclusive proof:

44% had natural children over 9 years old living with them in household

Page 8: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Deriving fertility variables (NCDS) V

• Evidence suggests that filter applied to both those continuing study and rejoining study.

• In which case, fertility histories collected that do not include age 33 years may have to be capped or excluded:

| Present | Number | Truncation/Adjustment |
|---|---|---|
| Ages 23, 33, 41-42, 46 years | 7138 | Censored at 46 years |
| Ages 33, 41-42, 46 years | 947 | Censored at 46 years |
| Ages 41-42, 46 years | 294 | Not used |
| Ages 23, 46 years | 104 | Censored at 23 years |
| Ages 23, 41-42 years | 383 | Censored at 23 years |
| Ages 23, 33 years | 887 | Censored at 33 years |
| Ages 23, 33, 41-42 years | 1444 | Censored at 42 years |
| Ages 33 and 46 years | 63 | Censored at 46 years |
| Age 23 years | 1591 | Censored at 23 years |
| Age 33 years | 310 | Censored at 33 years |
| Age 41-42 years | 203 | Not used |
| Ages 33, 41-42 years | 320 | Censored at 42 years |
| Ages 23, 33 and 46 years | 298 | Censored at 46 years |
| Ages 23, 41-42, 46 years | 690 | Censored at 23 years |
| Total Potentially Included | 14672 | |
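The adjustments in the table appear to follow one simple rule, which can be written as a short function. The rule below is inferred from the table rows rather than stated on the slide, so treat it as an assumption: the function name and the encoding of sweeps as the set {23, 33, 42, 46} are hypothetical.

```python
def censoring_age(present):
    """Map the set of adult sweeps attended to a censoring age.

    present : set of sweep ages drawn from {23, 33, 42, 46}
    Returns the age at which the fertility history is censored, or None
    when the history is not used.
    """
    if 33 in present:
        # History is complete up to 1991, so later "since 1991" reports
        # can be appended: censor at the last sweep attended.
        return max(present)
    if 23 in present:
        # Later sweeps only asked about births "since 1991", leaving a
        # gap between 23 and 33: cap the history at age 23.
        return 23
    return None  # no sweep at 23 or 33: no usable early history


# A few participation patterns from the table.
examples = {
    "23, 33, 41-42, 46": censoring_age({23, 33, 42, 46}),
    "23, 41-42, 46": censoring_age({23, 42, 46}),
    "41-42, 46": censoring_age({42, 46}),
}
```

Under this reading, participation at age 33 is what decides whether any later information can be used.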

Page 9: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Deriving fertility variables (NCDS) VI

• New method gives following summary KM statistics:

| | Median Age 1st Parenthood | % Childless at last observation (46) |
|---|---|---|
| ♂ | 29.4 years | 20.7% |
| ♀ | 26.5 years | 15.6% |

• More importantly, those detected as being possible parents at a later wave of data collection but with no accurate fertility history are censored at an earlier point – this applies to 430+ cohort members. Has implications for the whole fertility schedule for men and women.

• This factor could be responsible for inflated estimates of childlessness among NCDS cohort members found in other sources.

• The method used here results in a slightly smaller sample but one that errs on the side of caution

Page 10: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility variables (NCDS & BCS70)

Can see how highest hazard of entry into parenthood is reached among NCDS earlier than among BCS70 cohort.

Inverse bathtub shape of hazard for NCDS. For BCS70, shape is a little more variable. However, this applies to whole distribution. Interest in my particular case is entry to early parenthood

Page 11: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility (NCDS & BCS70) I

• Strategies for event history modelling using parenthood data:

• Have continuous data (as opposed to discrete) – first stage in guiding model selection

• Began with a Cox’s Proportional Hazards Model as find it intuitive and easier to compute and interpret. Also can use same model when the hazard differs between datasets, as no assumption is made about the shape of the baseline hazard.

• Basic model:

$h(t \mid x_j) = h_0(t) \exp(x_j \beta_x)$

• At each event time, the model is estimated by comparing the characteristics of the individual experiencing the event with those of the individuals who remain in the risk set.

• Used Tenure as an example to assess suitability. Tenure is a universal predictor in other models.
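The risk-set comparison described on this slide is what the Cox partial likelihood formalises. The sketch below computes the log partial likelihood for a single covariate, assuming no tied event times. The function name and the toy data are hypothetical, and real software would also handle ties and maximise over the coefficient.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Log partial likelihood for one covariate, assuming no tied event times."""
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored members contribute only through risk sets
        # Risk set: everyone who has not yet had the event just before t_i.
        risk = [math.exp(beta * x[k]) for k, t_k in enumerate(times) if t_k >= t_i]
        total += beta * x[i] - math.log(sum(risk))
    return total


# Invented data: age at first birth/censoring, event flag, and a binary
# indicator (1 = council tenure, 0 = owner occupation).
times = [18.0, 20.0, 22.0, 25.0]
events = [True, True, False, True]
council = [1, 0, 1, 0]

ll_null = cox_partial_loglik(0.0, times, events, council)
ll_half = cox_partial_loglik(0.5, times, events, council)
```

The baseline hazard cancels out of each term, which is why the Cox model needs no assumption about its shape.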

Page 12: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility (NCDS & BCS70) II

• A fundamental assumption of the Cox Proportional Hazards Model is that the hazards of the groups being compared remain proportional throughout the observation period.

• Assessed validity of assumption graphically and through statistical test.

[Figure: Difference in Log Cumulative Hazard Rate (y-axis, -4 to 4) plotted against x-axis values 14 to 45, with series for NCDS Males, NCDS Females, BCS70 Males and BCS70 Females]

Numerous ways of assessing graphically.

According to the PH assumption, while difference in cumulative hazard would vary absolutely, difference in log cumulative hazard should remain constant with no systematic variation with time (Singer and Willett 2003)
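One way to produce the quantity plotted in this kind of check is to estimate the cumulative hazard in each group and take the difference of the logs at a grid of ages. The sketch below uses the Nelson-Aalen estimator, which is an assumption (the slide does not say which estimator was used). The function names and the data are hypothetical.

```python
import math


def cumulative_hazard(times, events, at):
    """Nelson-Aalen estimate of the cumulative hazard H(at)."""
    total = 0.0
    for t in sorted({t for t, e in zip(times, events) if e and t <= at}):
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if e and u == t)
        total += d / at_risk
    return total


def log_cumhaz_difference(group_a, group_b, ages):
    """ln H_a(t) - ln H_b(t) at each age where both estimates are positive."""
    out = []
    for age in ages:
        h_a = cumulative_hazard(*group_a, at=age)
        h_b = cumulative_hazard(*group_b, at=age)
        if h_a > 0 and h_b > 0:
            out.append((age, math.log(h_a) - math.log(h_b)))
    return out


# Invented (times, events) pairs for two tenure groups.
council = ([17, 18, 19, 21, 23], [True, True, True, True, False])
owner = ([19, 21, 22, 23, 23], [True, True, False, False, False])
diffs = log_cumhaz_difference(council, owner, ages=[18, 19, 21])
```

Under proportional hazards the second element of each pair in `diffs` should stay roughly constant across ages; a steady drift towards zero, as in this toy example, is the pattern that signals a violation.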

Page 13: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility (NCDS & BCS70) III

• Also tested PH assumption through Schoenfeld residual test – examining departure from 0. A result significantly different from 0 suggests the hazards are not proportional, e.g.:

BCS70 Female model (entry up to 23 yrs)

| Tenure | ρ | χ² | p-value |
|---|---|---|---|
| Owner Occ | - | - | - |
| Council | -0.20 | 109.92 | 0.00 |
| Private/Oth | -0.04 | 4.71 | 0.03 |
| Full Model Test | - | 111.82 | 0.00 |

Possible solutions

1. Limit observation time

2. Consider Using Time Varying Covariates

3. Consider using a different model

Page 14: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility (NCDS & BCS70) IV

• Limiting observation time to between 16-20 years did work, but goes against the message and evidence presented in the rest of the thesis.

• Using a time varying covariate is okay for Tenure as the data support this. However, may be a poorer strategy in terms of data for larger models. Plus computationally difficult.

[Figure: Tenure at age 23 (Remained in same tenure, Owner Occupied, Council, Private, Tied and Other) as a percentage (axis 0% to 80%) of cohort members in each tenure at age 16: Owner Occupied (Age 16; n=4,668), Council (Age 16; n=3,663), Private (Age 16; n=440), Tied and Other (Age 16; n=343)]

Page 15: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility (NCDS & BCS70) V

• Stratification not really an option with my data – know that numerous factors predict early parenthood, and stratifying on them would potentially split the sample.

• Tried interacting Tenure with time to make the model explicitly non-proportional:

$h(t \mid x_j) = h_0(t) \exp[\beta_1 x_{1j} + \beta_2 x_{2j}]$

where

$x_2 = x_1 \times t$

• Interaction terms are significant. However, in extended models interacting time with covariates will be computationally difficult and also difficult to interpret.

• Use the AIC from these models to compare with other modelling strategies.
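A short derivation shows why interacting a covariate with time makes the model explicitly non-proportional. It assumes, purely for illustration, that $x_1$ is a single binary tenure indicator, and it writes the hazard as $h$ with coefficients $\beta_1$ and $\beta_2$.

```latex
\begin{aligned}
h(t \mid x_j) &= h_0(t)\,\exp[\beta_1 x_{1j} + \beta_2 x_{1j}\, t] \\[4pt]
\frac{h(t \mid x_{1j} = 1)}{h(t \mid x_{1j} = 0)}
  &= \frac{h_0(t)\,\exp(\beta_1 + \beta_2 t)}{h_0(t)}
   = \exp(\beta_1 + \beta_2 t)
\end{aligned}
```

The hazard ratio depends on $t$ unless $\beta_2 = 0$, so a significant interaction term is itself evidence against proportional hazards.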

Page 16: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility (NCDS & BCS70) VI

• Alternative modelling strategies.

Want an alternative that:

- Can be used for both genders and both cohorts

- Can deal with non-monotonic hazards, as know that the hazards here are not monotonic.

Can rule out PH models.

Can rule out models that allow only monotonic hazards – Weibull, Gompertz and Exponential distributions (Wu and Chuang 2002)

Left with 3 types of Accelerated Failure Time Models – Gamma, Log-logistic and Lognormal models

Page 17: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility (NCDS & BCS70) VII

• Accelerated Failure Time Models are analogous to a simple linear model and do not model the hazard directly but model survival time.

• Specification (distribution) for the δ term and intercept distinguishes between models – follow the normal, logistic or gamma distribution

• Test these models using tenure and compare results using AIC (Akaike’s Information Criterion) to find best fit.

• For NCDS, all three models produced very similar results. Little differentiation either in parameter values or model fitting statistics, as in other studies (Kwong and Hutton 2003; Cleves, Gould et al. 2004; Ghilagaber 2005). AIC estimates for all three distributions are similar and all substantially lower than the AIC for the best fitting Cox model constructed (inc. the Time interacted model).

$\ln T = \ln T_0 + \beta z + \delta$

Page 18: Deriving and Modelling Fertility Variables in the NCDS and BCS70

| | Lognormal ♂ | Lognormal ♀ | Log-logistic ♂ | Log-logistic ♀ | Generalised Gamma ♂ | Generalised Gamma ♀ |
|---|---|---|---|---|---|---|
| Baseline: Owner Occupation | | | | | | |
| Council | -0.097** | -0.154** | -0.093** | -0.155 | -0.096** | -0.152** |
| Private | -0.038* | -0.106** | -0.038* | -0.106 | -0.038* | -0.106** |
| Tied and Other | -0.083** | -0.072** | -0.083** | -0.072 | -0.083** | -0.072** |
| Log-Likelihood | -1065.2 | -1256.8 | -1065.6 | -1261.1 | -1064.6 | -1256.1 |
| Akaike Information Criteria | 2140.5 | 2523.7 | 2141.2 | 2532.3 | 2141.2 | 2524.2 |
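The AIC row can be related to the log-likelihood row through AIC = -2 ln L + 2k, where k is the number of estimated parameters. The sketch below uses the reported log-likelihoods for men. The parameter counts are an assumption inferred from the reported figures (three tenure coefficients, an intercept and a scale parameter, plus one extra shape parameter for the generalised gamma), not something stated on the slide.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate better fit."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Log-likelihoods as reported for men in the table above.
# Parameter counts (5, 5, 6) are assumed, see the lead-in.
models = {
    "lognormal": (-1065.2, 5),
    "log-logistic": (-1065.6, 5),
    "generalised gamma": (-1064.6, 6),
}
aics = {name: aic(ll, k) for name, (ll, k) in models.items()}
best = min(aics, key=aics.get)
```

With these counts the computed values agree with the reported AICs to within the rounding of the log-likelihoods, and the extra parameter is what stops the generalised gamma from winning despite its slightly higher log-likelihood.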

Page 19: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility (NCDS & BCS70) X

• When examining differences in AIC, Log-logistic gives marginally poorer fitting values consistently, leaving choice between Gamma and Lognormal models.

• Gamma model is particularly suitable for “bath-tub” shape distribution and used often in Demography for modelling mortality. Inverse is suitable for fertility and would be suitable for modelling whole NCDS fertility distribution.

• However, as I am modelling early fertility then Gamma model not as suitable – tries to model concave shape when one not always present.

• Therefore using Lognormal models to model entry into first parenthood in early adulthood

Page 20: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Modelling fertility (NCDS & BCS70) XI

• Univariate results (Time Ratio) for 16-23 years:

| | NCDS ♂ | NCDS ♀ | BCS70 ♂ | BCS70 ♀ |
|---|---|---|---|---|
| Baseline (Owner Occupation) | | | | |
| Council | 0.907** | 0.858** | 0.849** | 0.821** |
| Private | 0.963* | 0.900** | 0.968 | 0.919* |
| Tied and Other | 0.921** | 0.930** | 0.971 | 0.924 |
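A time ratio is the exponentiated coefficient of an Accelerated Failure Time model: values below 1 mean a shorter expected time to first parenthood than the baseline group. The sketch below exponentiates the lognormal coefficients reported earlier for NCDS men. That those coefficients come from the same 16-23 model as this table is an assumption, although the results agree with the NCDS ♂ column to within rounding.

```python
import math


def time_ratio(coefficient):
    """Exponentiated AFT coefficient: ratio of survival times versus baseline."""
    return math.exp(coefficient)


# Lognormal coefficients reported for NCDS men (baseline: owner occupation).
coefficients = {"Council": -0.097, "Private": -0.038, "Tied and Other": -0.083}
ratios = {tenure: round(time_ratio(b), 3) for tenure, b in coefficients.items()}

# Percentage reduction in time to first parenthood relative to baseline.
reduction = {tenure: round(100 * (1 - r), 1) for tenure, r in ratios.items()}
```

Read this way, a time ratio of about 0.91 for council tenure corresponds to a time to first fatherhood roughly 9% shorter than for owner occupiers.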

Page 21: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Tenure (Baseline: Owner Occupation Only)

| Cohort / Definition | Mixed Owner Occupation Tenure | Only Council Tenure | Some Council, no owner occupation tenure | Other |
|---|---|---|---|---|
| NCDS | | | | |
| Early Fatherhood | 1.300 | 1.797** | 2.172** | 1.527 |
| Very Early Fatherhood | 1.493* | 1.522** | 1.550 | 1.449 |
| Teenage Fatherhood | Not significant in full model | | | |
| Lognormal time to first fatherhood (16-23) | 0.977 | 0.952** | 0.947* | 0.954* |
| Lognormal time to first fatherhood (16-30) | 0.960** | 0.961** | 0.9637 | 0.955** |
| BCS70 | | | | |
| Early Fatherhood | 1.170 | 1.504** | 1.294 | 1.369 |
| Very Early Fatherhood | 1.250 | 1.538* | 0.836 | 1.140 |
| Teenage Fatherhood | Not significant in full model | | | |
| Lognormal time to first fatherhood (16-23) | Not significant in full model | | | |
| Lognormal time to first fatherhood (16-30) | 0.988 | 0.956* | 1.004 | 0.974 |

** p < 0.01; * p < 0.05

Page 22: Deriving and Modelling Fertility Variables in the NCDS and BCS70

Conclusions – challenges I found when modelling fertility

• When deriving NCDS fertility variables need to acknowledge that participation at Wave 5 (Age 33 years) is crucial in determining inclusion criteria.

• Failure to adjust for this leads to modest change in median survival time and larger changes in estimates of childlessness

• CAPI filters?

• Traditional Cox model was not suited to my data even after allowing for Time varying covariates etc.

• Final choice between Gamma and Lognormal Accelerated Failure Time models. Gamma more suitable for whole distribution; Lognormal for early parenthood