
Multistate Duration Models

James J. Heckman, The University of Chicago

This draft, January 12, 2008

1 Single Spell Models

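A nonnegative random variable $T$ with absolutely continuous distribution function $G(t)$ and density $g(t)$ may be uniquely characterized by its hazard function. The hazard for $T$ is the conditional density of $T$ given $T > t \ge 0$, i.e.,

$$h(t) = f(t \mid T > t) = \frac{g(t)}{1 - G(t)} \ge 0. \tag{1}$$

Knowledge of $G$ determines $h$.

Conversely, knowledge of $h$ determines $G$ because by integration of (1)

$$\int_0^t h(u)\,du = -\ln(1 - G(x))\Big|_0^t + c, \tag{2}$$

$$G(t) = 1 - \exp\left[-\int_0^t h(u)\,du\right]; \tag{3}$$

$c = 0$ since $G(0) = 0$. The density of $T$ is

$$g(t) = h(t)\exp\left[-\int_0^t h(u)\,du\right]. \tag{4}$$

Equivalently, the hazard is the limit

$$h(t) = \lim_{\Delta \to 0}\frac{\Pr(t < T < t + \Delta \mid T > t)}{\Delta} = \lim_{\Delta \to 0}\left[\frac{G(t + \Delta) - G(t)}{\Delta}\right]\frac{1}{1 - G(t)} = \frac{g(t)}{1 - G(t)}. \tag{5}$$

The survivor function is the probability that a duration exceeds $t$. Thus

$$S(t) = P(T > t) = 1 - G(t) = \exp\left[-\int_0^t h(u)\,du\right]. \tag{6}$$

In terms of the survivor function we may write the density $g(t)$ as

$$g(t) = h(t)\,S(t).$$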
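The one-to-one mapping between a hazard and its distribution can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes a hypothetical Weibull-type hazard $h(t) = a t^{a-1}$ with $a = 1.5$, integrates it with the trapezoid rule, and recovers the survivor function, distribution function and density. The function and variable names are illustrative.

```python
import numpy as np

def survivor_from_hazard(hazard, t_grid):
    """Return S(t) = exp(-int_0^t h(u) du) on t_grid (trapezoid rule)."""
    h = hazard(t_grid)
    # Cumulative trapezoid integral of the hazard, starting at zero.
    increments = 0.5 * (h[1:] + h[:-1]) * np.diff(t_grid)
    cum_hazard = np.concatenate(([0.0], np.cumsum(increments)))
    return np.exp(-cum_hazard)

# Hypothetical hazard used only for illustration: h(t) = a * t**(a - 1).
a = 1.5

def weibull_hazard(t):
    return a * t ** (a - 1.0)

t = np.linspace(0.0, 3.0, 3001)
S = survivor_from_hazard(weibull_hazard, t)   # survivor function
G = 1.0 - S                                   # distribution function
g = weibull_hazard(t) * S                     # density: hazard times survivor

# For this hazard the closed form is S(t) = exp(-t**a).
max_error = np.max(np.abs(S - np.exp(-t ** a)))
```

For hazards without a closed-form integral the same routine applies unchanged; only the grid resolution governs the accuracy.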

Note that there is no requirement that

$$\lim_{t \to \infty}\int_0^t h(u)\,du \to \infty \tag{7}$$

or equivalently that $S(\infty) = 0$. Duration dependence is said to exist if

$$\frac{dh(t)}{dt} \ne 0.$$

If $dh(t)/dt > 0$ at $t = t_0$, there is said to be positive duration dependence at $t_0$.

If $dh(t)/dt < 0$ at $t = t_0$, there is said to be negative duration dependence at $t_0$.

In job search models of unemployment, positive duration dependence arises in the case of a “declining reservation wage” (see, e.g., Lippman and McCall, 1976).

We define the conditional hazard as

$$h(t \mid x(t), \theta(t)) = \lim_{\Delta \to 0}\frac{\Pr(t < T < t + \Delta \mid T > t, x(t), \theta(t))}{\Delta}. \tag{8}$$

The dating on regressor $x(t)$ is an innocuous convention. $x(t)$ may include functions of the entire past or future or the entire paths of some variables, e.g.

$$x_1(t) = \int_t^{\infty} k_1(z_1(u))\,du,$$

$$x_2(t) = \int_{-\infty}^{t} k_2(z_2(u))\,du,$$

$$x_3(t) = \int_{-\infty}^{\infty} k_3(z_3(u), t)\,du,$$

where the $z_i(u)$ are underlying time-dated regressor variables.

(A-1) $x(t)$ is distributed independently of $\theta(t')$ for all $t$, $t'$. The distribution of $x$ is $D(x)$. The distribution of $\theta$ is $\mu(\theta)$.

(A-2) There are no functional restrictions connecting the conditional distribution of $T$ given $\theta$ and $x$ and the marginal distributions of $\theta$ and $x$.

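By analogy with the definitions presented for the raw duration models, we may integrate (8) to produce the conditional duration distribution

$$G(t \mid x, \theta) = 1 - \exp\left[-\int_0^t h(u \mid x(u), \theta(u))\,du\right], \tag{9}$$

the conditional survivor function

$$S(t \mid x, \theta) = P(T > t \mid x, \theta) = \exp\left[-\int_0^t h(u \mid x(u), \theta(u))\,du\right],$$

and the conditional density

$$g(t \mid x, \theta) = h(t \mid x(t), \theta(t))\,S(t \mid x, \theta). \tag{10}$$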

One specification of conditional hazard (8) that has received much attention in the literature is the proportional hazard specification

$$h(t \mid x(t), \theta(t)) = \psi(t)\,\phi(x(t))\,\eta(\theta(t)), \tag{11}$$

which postulates that the log of the conditional hazard is linear in functions of $t$, $x$ and $\theta$, and that

$$\psi(t) > 0, \quad \phi(x(t)) \ge 0, \quad \eta(\theta(t)) > 0 \quad \text{for all } t.$$

2 Multiple Spell Models

Let $\{Y(\tau), \tau > 0\}$, $Y(\cdot) \in \bar{N}$ where $\bar{N} = \{1, \ldots, C\}$, $C < \infty$, be a finite state continuous time stochastic process. We define random variable $R(j)$, $j \in \{1, 2, \ldots\}$, as the value assumed by $Y$ at the $j$th transition time. $Y(\tau)$ or $R(j)$ is generated by the following sequence.

(i) An individual begins his evolution in a state $Y(0) = R(0) = r(0)$ and waits there for a random length of time $T_1$ governed by a conditional survivor function

$$P(T_1 > t_1 \mid r(0)) = \exp\left[-\int_0^{t_1} h(u \mid x(u), r(0))\,du\right].$$

As before $h(u \mid x(u), r(0))$ is a calendar time (or age) dependent function and we now make explicit the origin state of the process.

(ii) At time $T(1) = \tau(1)$, the individual moves to a new state $R(1) = r(1)$ governed by a conditional probability law

$$P(R(1) = r(1) \mid \tau(1), r(0)),$$

which may also be age dependent.

(iii) The individual waits in state $R(1)$ for a random length of time $T_2$ governed by

$$P(T_2 > t_2 \mid \tau(1), r(1), r(0)) = \exp\left[-\int_0^{t_2} h(u \mid x(u + \tau(1)), r(1), r(0))\,du\right].$$

Note that one coordinate of $x(u)$ may be $u + \tau(1)$, and that $\tau(2) - \tau(1) = t_2$.

At the transition time $T(2) = \tau(2)$ he switches to a new state $R(2) = r(2)$ where the transition probability

$$P(R(2) = r(2) \mid \tau(1), \tau(2), r(1), r(0))$$

may be calendar time dependent.

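Continuing this sequence of waiting times and moves to new states gives rise to a sequence of random variables

$$R(0) = r(0),\quad T(1) = \tau(1),\quad R(1) = r(1),\quad T(2) = \tau(2),\quad R(2) = r(2),\ \ldots$$

and suggests the definitions

$$Y(\tau) = R(k) \quad \text{for } \tau(k) \le \tau < \tau(k + 1),$$

where $R(k)$, $k = 0, 1, 2, \ldots$ is a discrete time stochastic process governed by the conditional probabilities

$$P(R(k) = r(k) \mid \mathbf{t}_k, \mathbf{r}_{k-1}),$$

where $\mathbf{t}_k = (t_1, \ldots, t_k)$ and $\mathbf{r}_{k-1} = (r(0), \ldots, r(k-1))$.

$T_k = \tau(k) - \tau(k-1)$ is governed by the conditional survivor function

$$P(T_k > t_k \mid \mathbf{t}_{k-1}, \mathbf{r}_{k-1}) = \exp\left[-\int_0^{t_k} h(u \mid x(u + \tau(k-1)), \mathbf{t}_{k-1}, \mathbf{r}_{k-1})\,du\right].$$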

2.1 Specializations of Interest

2.1.1 Repeated events of the same kind.

This is a one state process, e.g. births in a fertility history. $R(\cdot)$ is a degenerate process and attention focuses on the sequence of waiting times $T_1, T_2, \ldots$. One example of such a process writes

$$P(T_k > t_k) = \exp\left[-\int_0^{t_k} h_k(u \mid x(u + \tau(k-1)))\,du\right].$$

The hazard for the $k$th interval depends on the number of previous spells. This special form of dependence is referred to as occurrence dependence. In a study of fertility, $k - 1$ corresponds to birth parity for a woman at risk. Heckman and Borjas (1980) consider such models for the analysis of unemployment.

Another variant writes the hazard of a current spell as a function of the mean duration of previous spells, i.e. for spell $k > 1$

$$h(u \mid x(u + \tau(k-1)), \mathbf{t}_{k-1}) = h\left(u \,\Big|\, \frac{1}{k-1}\sum_{j=1}^{k-1} t_j,\ x(u + \tau(k-1))\right)$$

(see, e.g., Braun and Hoem (1978)).

Yet another version of the general model writes for the $k$th spell

$$h(u \mid x(u + \tau(k-1)), \mathbf{t}_{k-1}) = h_k(u \mid x, t_1, t_2, \ldots, t_{k-1}).$$

This is a model with both occurrence dependence and lagged duration dependence, where the latter is defined as dependence on lengths of preceding spells.

A final specification writes

$$h(u \mid x(u + \tau(k-1)), \mathbf{t}_{k-1}) = h(x(u + \tau(k-1))).$$

For spell $k$ this is a model for independent non-identically distributed durations; and $Y(\tau)$ is a nonstationary renewal process.

2.1.2 Multistate Processes

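Let

$$P(R(k) = r(k) \mid \mathbf{t}_k, \mathbf{r}_{k-1}) = m_{r(k-1),\,r(k)},$$

where

$$\| m_{ij} \| = M$$

is a finite stochastic matrix with $m_{ii} = 0$, and

$$P(T_k > t_k \mid \mathbf{t}_{k-1}, \mathbf{r}_{k-1}) = \exp\left(-\lambda_{r(k-1)}\, t_k\right),$$

where the elements of $\{\lambda_i\}$ are positive constants. Then $Y(\tau)$ is a time homogeneous Markov chain with constant intensity matrix

$$Q = \Lambda(M - I),$$

where

$$\Lambda = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_C \end{pmatrix}$$

and $C$ is the number of states in the chain.

In a dynamic McFadden model for a stationary environment, $M$ has the special structure $m_{ij} = m_{lj} = m_j$ for all $i$ and $l$, i.e. the origin state is irrelevant in determining the destination state. This restricted model can be tested against a more general specification.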
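A small numerical illustration may help fix ideas about the time homogeneous Markov chain described above. The sketch below is only an illustration under assumed values: the three-state jump matrix `M`, the exit rates `lam`, and the helper `simulate_chain` are hypothetical and do not come from the original notes. It builds the intensity matrix as the diagonal matrix of exit rates times (jump matrix minus identity) and simulates one event history with a right-censored final spell.

```python
import numpy as np

# Hypothetical three-state example (illustrative values only).
M = np.array([[0.0, 0.7, 0.3],
              [0.5, 0.0, 0.5],
              [0.2, 0.8, 0.0]])       # stochastic jump matrix, zero diagonal
lam = np.array([0.5, 1.0, 2.0])       # positive exit rates, one per state

# Intensity matrix: off-diagonal entries lam_i * m_ij, diagonal -lam_i.
Q = np.diag(lam) @ (M - np.eye(3))

def simulate_chain(M, lam, start, horizon, rng):
    """Simulate visited states and spell lengths up to calendar time `horizon`."""
    states, durations = [start], []
    clock, state = 0.0, start
    while True:
        wait = rng.exponential(1.0 / lam[state])   # P(T > t) = exp(-lam_i * t)
        if clock + wait > horizon:
            durations.append(horizon - clock)      # right-censored last spell
            break
        durations.append(wait)
        clock += wait
        state = int(rng.choice(len(lam), p=M[state]))
        states.append(state)
    return states, durations

rng = np.random.default_rng(0)
states, durations = simulate_chain(M, lam, start=0, horizon=50.0, rng=rng)
```

Each simulated history has one duration per visited state, and the durations sum to the observation window, which is the data layout the likelihoods in these notes take as given.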

A time inhomogeneous semi-Markov process emerges as a special case of the general model if we let

$$P(R(k) = r(k) \mid \mathbf{t}_k, \mathbf{r}_{k-1}, \tau(k-1)) = m_{r(k-1),\,r(k)}(\tau(k), t_k),$$

where

$$\| m_{ij}(\tau, u) \| = M(\tau, u)$$

is a two parameter family of time ($\tau$) and duration ($u$) dependent stochastic matrices with each element a function of $\tau$ and $u$, and $m_{ii} = 0$.

We further define

$$P(T_k > t_k \mid \mathbf{t}_{k-1}, \mathbf{r}_{k-1}, \tau(k-1)) = \exp\left[-\int_0^{t_k} h(u \mid r(k-1), \tau(k-1))\,du\right].$$

With this restricted form of dependence, $Y(\tau)$ is a time inhomogeneous semi-Markov process. (Hoem, 1972, provides a nice expository discussion of such processes.) Moore and Pyke (1968) consider the problem of estimating a time inhomogeneous semi-Markov model without observed or unobserved explanatory variables. The natural estimator for a model without restrictions connecting the parameters of

$$P(R(k) = r(k) \mid \mathbf{t}_k, \mathbf{r}_{k-1}, \tau(k-1))$$

and

$$P(T_k > t_k \mid \mathbf{t}_{k-1}, \mathbf{r}_{k-1}, \tau(k-1))$$

breaks the estimation into two components.

a. Estimate $M$ by using data on transitions from $i$ to $j$ for observations with transitions having identical (calendar time $\tau$, duration $u$) pairs. A special case of this procedure for a model with no duration dependence in a time homogeneous environment pools $i$ to $j$ transitions for all spells to estimate the components of $M$ (see also Billingsley, 1961). Another special case for a model with duration dependence in a time homogeneous environment pools $i$ to $j$ transitions for all spells of a given duration.

b. Estimate $P(T_k > t_k \mid \mathbf{t}_{k-1}, \mathbf{r}_{k-1}, \tau(k-1))$ using standard survival methods on times between transitions.
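The two-component estimator can be sketched for the simplest special case mentioned above: no duration dependence, a time homogeneous environment, and only completed spells. Under those assumptions (which are ours, for illustration), component (a) reduces to pooled transition shares and component (b) to the exponential maximum likelihood estimate, exits divided by total time at risk. The function name and the toy data are hypothetical.

```python
import numpy as np

def estimate_markov_chain(origins, destinations, durations, n_states):
    """Pooled estimates from completed spells of a time-homogeneous chain.

    M_hat[i, j] : share of exits from state i that go to state j.
    lam_hat[i]  : number of exits from i divided by total time spent in i
                  (exponential maximum likelihood estimate of the exit rate).
    States with no observed spells get NaN.
    """
    origins = np.asarray(origins)
    destinations = np.asarray(destinations)
    durations = np.asarray(durations, dtype=float)

    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (origins, destinations), 1.0)   # transition counts n_ij
    exits = counts.sum(axis=1)                        # n_i
    exposure = np.bincount(origins, weights=durations, minlength=n_states)

    with np.errstate(divide="ignore", invalid="ignore"):
        M_hat = counts / exits[:, None]
        lam_hat = exits / exposure
    return M_hat, lam_hat

# Hypothetical data: four completed spells (origin, destination, duration).
M_hat, lam_hat = estimate_markov_chain(
    origins=[0, 0, 1, 0],
    destinations=[1, 2, 0, 1],
    durations=[2.0, 1.0, 4.0, 1.0],
    n_states=3,
)
```

With duration or calendar-time dependence, the same counting logic would be applied within cells of identical (calendar time, duration) pairs, and the exit-rate step would be replaced by a survival estimator suited to the assumed hazard.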

3 General Duration Models For The Analysis of Event History Data

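An individual event history is assumed to evolve according to the following steps.

(i) At time $\tau = 0$, an individual is in state $i$; there are $C - 1$ possible destinations. The limit (as $\Delta t \to 0$) of the probability that a person who starts in $i$ at calendar time $\tau = 0$ leaves the state in interval $(t_1, t_1 + \Delta t)$ given regressor path $\{x(u)\}_0^{t_1 + \Delta t}$ and unobservable $\theta$ is the conditional hazard or escape rate

$$\lim_{\Delta t \to 0}\frac{P(t_1 < T_1 < t_1 + \Delta t \mid T_1 \ge t_1, r(0) = i, \tau(0) = 0, x(t_1), \theta)}{\Delta t} \tag{12}$$

$$= h(t_1 \mid r(0) = i, \tau(0) = 0, x(t_1), \theta). \tag{13}$$

This limit is assumed to exist. (Assumed to be “regular”.) The limit (as $\Delta t \to 0$) of the probability that a person starting in $r(0) = i$ at time $\tau(0)$ leaves to go to $j \ne i$, in interval $(t_1, t_1 + \Delta t)$ given regressor path $\{x(u)\}_0^{t_1 + \Delta t}$ and $\theta$ is

$$\lim_{\Delta t \to 0}\frac{P(t_1 < T_1 < t_1 + \Delta t, R(1) = j \mid T_1 \ge t_1, r(0) = i, \tau(0) = 0, x(t_1), \theta)}{\Delta t} = h_j(t_1 \mid r(0) = i, \tau(0) = 0, x(t_1), \theta). \tag{14}$$

From the laws of conditional probability

$$\sum_{j=1,\, j \ne i}^{C} h_j(t_1 \mid r(0) = i, \tau(0) = 0, x(t_1), \theta) = h(t_1 \mid r(0) = i, \tau(0) = 0, x(t_1), \theta).$$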

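(ii) The probability that a person starting in state $i$ at calendar time $\tau = 0$ survives to $T_1 = t_1$ is (from the definition of the survivor function in (8) and from hazard (69))

$$P(T_1 > t_1 \mid r(0) = i, \tau(0) = 0, \{x(u)\}_0^{t_1}, \theta) = \exp\left[-\int_0^{t_1} h(u \mid r(0) = i, \tau(0) = 0, x(u), \theta)\,du\right].$$

Thus the density of $T_1$ is

$$f(t_1 \mid r(0) = i, \tau(0) = 0, \{x(u)\}_0^{t_1}, \theta) = -\frac{\partial P(T_1 > t_1 \mid r(0) = i, \tau(0) = 0, \{x(u)\}_0^{t_1}, \theta)}{\partial t_1}$$

$$= h(t_1 \mid r(0) = i, \tau(0) = 0, x(t_1), \theta)\,P(T_1 > t_1 \mid r(0) = i, \tau(0) = 0, \{x(u)\}_0^{t_1}, \theta).$$

The density of the joint event $R(1) = j$ and $T_1 = t_1$ is

$$f(t_1, j \mid r(0) = i, \tau(0) = 0, \{x(u)\}_0^{t_1}, \theta) = h_j(t_1 \mid r(0) = i, \tau(0) = 0, x(t_1), \theta)\,P(T_1 > t_1 \mid r(0) = i, \tau(0) = 0, \{x(u)\}_0^{t_1}, \theta).$$

The density is sometimes called a subdensity. Note that

$$\sum_{j=1,\, j \ne i}^{C} f(t_1, j \mid r(0) = i, \tau(0) = 0, \{x(u)\}_0^{t_1}, \theta) = f(t_1 \mid r(0) = i, \tau(0) = 0, \{x(u)\}_0^{t_1}, \theta).$$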

Proceeding in this fashion, one can define densities corresponding to each duration in the individual’s event history. Thus, for an individual who starts in state $r(k)$ after his $k$th transition, the subdensity for $T_{k+1} = t_{k+1}$ and $R(k + 1) = j$, $j = 1, \ldots, C$, is

$$f(t_{k+1}, j \mid r(k), \tau(k), \{x(u)\}_0^{\tau(k+1)}, \theta),$$

where

$$\tau(k + 1) = \sum_{l=1}^{k+1} t_l. \tag{15}$$

The conditional density of completed spells 1 and right censored spell

+1 given {˜( )} ( )+ +1

0 assuming that (0) = 0 is the exogenous start date

of the event history (and so corresponds to the origin date of the sample) is,allowing for more general forms of dependence,

( 1 (1) 2 (2) ( ) +1 | {˜( )} ( )+ +1

0 ) = (16)Z ½( +1 +1 |

˜( ) ˜( )( ) {

˜( )} ( )+

( )

¾( )

As noted in section 5, it is unlikely that the origin date of the sample coincideswith the start date of the event history. Let

( (0) (0) = 0 (1) 1 {˜( )} (1)

)

be the probability density for the random variables describing the events thata appropriate as defined in (8.2). The joint density of ( (0) 1 (1)) thecompleted spell density sampled at (0) = 0 terminating in state (1) is definedanalogously. For such spells we write the density as

( (0) (0) = 0 1 (1) {˜( )} (1)

)

In a multiple spell model setting in which it is plausible that the process hasbeen in operation prior to the origin date of the sample, intake rate is thedensity of the random variable describing the event “entered the state ( )at time = 0 and did not leave the state until 0 ” The expressionfor in terms of the exit rate depends on (i) presample values of

ãnd (ii)

the date at which the process began. Thus in principle given (i) and (ii) it ispossible to determine the functional form of . In this context it is plausiblethat depends on .

The joint likelihood for (0) 1 ( = ) (1) 2 ( ) +1

conditional on and {˜( )} ( )+ +1 for a right censored + 1 st spell is

( (0) 1 (1) 2 (2) ( ) +1

¯̄̄{˜( )} ( )+ +1 )

= ( (0) (0) = 0 1 (1) | {˜( )} (1)

)

"Y=2

( ( ) |˜( 1) ˜( 1)

( 1)˜( )} ( )

( 1)

#( +1 +1 |

˜ ˜( 1)( 1) {

˜( )} ( )+ +1

0 ) (17)

The marginal likelihood obtained by integrating out is,

( (0) 1 (1) 2 ( ) +1 | {˜( )} ( )+ +1) =Z

=

( (0) 1 (1) 2 ( ) +1 | {˜( )} ( )+ +1 ) ( ) (18)

Using (16) and conditioning on 1 = 1 produces conditional likelihood

( (0) 1 (1) 2 ( ) +1 | {˜( )} ( )+ +1

1 = 1 )

=Y=2

( ( ) |˜( 1) ˜( 1)

( 1) { ( ) ( )( 1) )·

( +1 +1 |˜ ˜

( ) { ( )} ( )+ +1

( ) )

3.1 Competing Risk Specifications

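Let there be $N$ states the individual can occupy at any moment of time. If the individual begins “life” in state $i$ there are $N - 1$ “latent times” with densities

$$f_{ij}(t_{ij}) = h_{ij}(t_{ij})\exp\left[-\int_0^{t_{ij}} h_{ij}(u)\,du\right], \quad (j = 1, \ldots, N;\ j \ne i), \tag{A1}$$

where $f_{ij}(\cdot)$ is the density function of exit times from state $i$ into state $j$, and $h_{ij}(\cdot)$ is the associated hazard function. The joint density of the $N - 1$ latent exit times is given by

$$\prod_{j=1,\, j \ne i}^{N} h_{ij}(t_{ij})\exp\left[-\int_0^{t_{ij}} h_{ij}(u)\,du\right]. \tag{A2}$$

An individual exits from state $i$ to state $j'$ if the $j'$th first passage time is the smallest of the $N - 1$ potential first passage times, i.e., if

$$t_{ij'} < t_{ij}, \quad (j = 1, \ldots, N;\ j \ne j';\ j' \ne i).$$

Let the probability that the individual leaves state $i$ and then directly enters state $j'$ be denoted by $P_{ij'}$. Then

$$P_{ij'} = \int_0^{\infty}\left[\int_{t_{ij'}}^{\infty}\cdots\int_{t_{ij'}}^{\infty}\ \prod_{j \ne i,\, j \ne j'} h_{ij}(t_{ij})\exp\left(-\int_0^{t_{ij}} h_{ij}(u)\,du\right)dt_{ij}\right] h_{ij'}(t_{ij'})\exp\left(-\int_0^{t_{ij'}} h_{ij'}(u)\,du\right)dt_{ij'}$$

$$= \int_0^{\infty} h_{ij'}(t_{ij'})\exp\left[-\int_0^{t_{ij'}}\sum_{j=1,\, j \ne i}^{N} h_{ij}(u)\,du\right]dt_{ij'}. \tag{A3}$$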

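The conditional density of exit times from state $i$ into state $j'$ given that $t_{ij'} < t_{ij}$ $(\forall j : j \ne j';\ j' \ne i)$ is

$$g(t_{ij'} \mid t_{ij'} < t_{ij},\ \forall j : j \ne j';\ j' \ne i) = \frac{h_{ij'}(t_{ij'})\exp\left[-\int_0^{t_{ij'}}\sum_{j=1,\, j \ne i}^{N} h_{ij}(u)\,du\right]}{P_{ij'}}. \tag{A4}$$

It follows that the density of exit times from state $i$ into all other states combined can be written

$$f_{i\cdot}(t_{i\cdot}) = \sum_{j'=1,\, j' \ne i}^{N} P_{ij'}\; g(t_{ij'} \mid t_{ij'} < t_{ij},\ \forall j : j \ne j';\ j' \ne i)$$

$$= \sum_{j=1,\, j \ne i}^{N} h_{ij}(t_{i\cdot})\exp\left[-\int_0^{t_{i\cdot}}\sum_{l=1,\, l \ne i}^{N} h_{il}(u)\,du\right]. \tag{A5}$$

The probability that the spell is not complete by time $t$ is simply

$$\text{Prob}(T_i > t) = \int_t^{\infty} f_{i\cdot}(u)\,du = \exp\left[-\int_0^{t}\sum_{j=1,\, j \ne i}^{N} h_{ij}(u)\,du\right]. \tag{A6}$$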
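The competing risk construction can be checked by simulation in the special case of constant latent hazards, where the destination probabilities reduce to each hazard's share of the total and the survivor function is exponential in the summed hazard. The sketch below is illustrative only: the three hazard values, the sample size and the evaluation time are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical constant latent hazards out of the origin state,
# one per destination (illustrative values only).
hazards = np.array([0.2, 0.5, 0.3])
n = 200_000

# Independent latent exit times, one column per destination.
latent = rng.exponential(1.0 / hazards, size=(n, hazards.size))

exit_time = latent.min(axis=1)        # observed spell length
destination = latent.argmin(axis=1)   # destination with smallest latent time

# Share of exits to each destination versus hazard share of the total.
p_hat = np.bincount(destination, minlength=hazards.size) / n
p_theory = hazards / hazards.sum()

# Probability that the spell is not complete by t = 1 versus exp(-sum of hazards).
surv_hat = (exit_time > 1.0).mean()
surv_theory = np.exp(-hazards.sum() * 1.0)
```

With time-varying hazards the shares would no longer be constant ratios, and the destination probabilities would require the integral over exit times rather than this shortcut.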

This term enters the likelihood function for incomplete spells of at least $t$ in length. In this manner all spells, not only completed ones, are used in the estimation of the parameters of the hazard function. This is not the case in regression analyses of durations in a state (or some transformation of duration) on exogenous variables, where only completed spells can be used in a straightforward fashion. (Obviously a nonlinear regression procedure can account for censoring.)

[ 1 ( + ) ( 2) ( + ) ]

Parameter vectors are indexed by transition. is ×1 vector of coefficients of explanatory variables in the hazard function. To be specific

= [ 0 1 ( 2) ] (A7)

As discussed in Flinn and Heckman (1982), we impose a one factor specification, so that is the factor loading associated with the to transition. The usual normalizations required in factor analysis in discrete data models are imposed. (See Heckman, 1981, pp. 167-174.)

We write the hazard function for the mth spell and rth individual as

( ) = exp£ 0 ( + )

¤=

( )

1 ( )(A8)

where is the cdf associated with (A1). From state into state isgiven by expression (A4). The density of a duration in the mth spell thatbegins in and terminates in state is the product of these two terms andis written as

( ) = ( ) exp

Z0

X=16=

( ) (A9)

Densities of this type are called subdensities in the duration analysis literature(see Kalbfleisch and Prentice, 1980, p. 167). It is notationally convenient towrite the exp term in (A9) as

( ) = exp

Z0

X=16=

( ) (A10)

so( ) = ( )

Note that from (A6) the probability that the mth spell lasts more than is

( ) = ( ) (A11)

so has a substantive probabilistic interpretation. It is called the survivorfunction in the duration analysis literature.Consider an individual’s contribution to the likelihood function (suppressingthe individual’s subscript for notational convenience)

( ) =

"1X

=1

( )

#( ) (A12)

where the $M$th (and final) censored spell is assumed to last longer than $t_M$. Treating $\theta$ as a random effect, the integrated likelihood is

( ) =

Z( ) ( ) (A13)

where ( ) is the density of . Now define

$( ) = ln[¯( )]

Note that$( )

=1¯( )

¯( )=

1¯( )

Z ¯( )( ) (A14)

a. It allows for a flexible Box-Cox hazard with scalar heterogeneity

$$h(t \mid x, \theta) = \exp\left(x(t)\beta + \left(\frac{t^{\lambda_1} - 1}{\lambda_1}\right)\gamma_1 + \left(\frac{t^{\lambda_2} - 1}{\lambda_2}\right)\gamma_2 + c\,\theta\right), \quad \lambda_1 < \lambda_2, \tag{130}$$

where $\beta$, $\gamma_1$, $\gamma_2$, $\lambda_1$, $\lambda_2$ and $c$ are permitted to depend on the origin state, the destination state and the serial order of the spell. Lagged durations may be included among the $x$. Using maximum likelihood procedures it is possible to estimate all of these parameters except for one normalization of $c$.

b. It allows for general time varying variables and right censoring. The regressors may include lagged durations.

c. $\mu(\theta)$ can be specified as either normal, log normal or gamma, or the NPMLE procedure can be used.

d. It solves the left censoring or initial conditions problem by assuming that the functional form of the initial duration distribution for each origin state is different from that of the other spells.

4 Conventional Reduced Form Models

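Proposition 1 Uncontrolled unobservables bias estimated hazards towards negative duration dependence.

The proof is a straightforward application of the Cauchy-Schwarz theorem. Let $h(t \mid x, \theta)$ be the hazard conditional on $x$ and $\theta$, and let $h(t \mid x)$ be the hazard conditional only on $x$. These hazards are associated respectively with conditional distributions $G(t \mid x, \theta)$ and $G(t \mid x)$, with densities $g(t \mid x, \theta)$ and $g(t \mid x)$.

From the definition,

$$h(t \mid x) = \frac{\int g(t \mid x, \theta)\,d\mu(\theta)}{\int (1 - G(t \mid x, \theta))\,d\mu(\theta)}.$$

Thus

$$\frac{\partial h(t \mid x)}{\partial t} = \frac{\int (1 - G(t \mid x, \theta))\,\dfrac{\partial h(t \mid x, \theta)}{\partial t}\,d\mu(\theta)}{\int (1 - G(t \mid x, \theta))\,d\mu(\theta)} + \frac{\left[\int g(t \mid x, \theta)\,d\mu(\theta)\right]^2 - \int \dfrac{g^2(t \mid x, \theta)}{1 - G(t \mid x, \theta)}\,d\mu(\theta)\int (1 - G(t \mid x, \theta))\,d\mu(\theta)}{\left[\int (1 - G(t \mid x, \theta))\,d\mu(\theta)\right]^2}.$$

The second term on the right hand side is always nonpositive as a consequence of the Cauchy-Schwarz theorem. ∎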

Intuitively, more mobility prone persons are the ones likely to leave first.
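The selection effect behind this result is easy to see numerically. The sketch below is an illustration under assumed values, not taken from the notes: two types of individuals each have a constant hazard (no true duration dependence), yet the hazard of the pooled population declines with duration because high-hazard individuals leave first. The type hazards and population shares are hypothetical.

```python
import numpy as np

# Hypothetical two-type population; each type has a constant hazard.
thetas = np.array([0.5, 2.0])    # individual hazards (no duration dependence)
weights = np.array([0.5, 0.5])   # population shares of each type

t = np.linspace(0.0, 5.0, 501)

surv = np.exp(-np.outer(t, thetas))   # survivor function of each type, shape (501, 2)
dens = thetas * surv                  # density of each type: hazard times survivor

# Hazard of the mixture: mixed density over mixed survivor function.
aggregate_hazard = (dens @ weights) / (surv @ weights)
```

The aggregate hazard starts at the population mean of the individual hazards and falls toward the smallest individual hazard as the survivors become increasingly dominated by the low-hazard type.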

There are many possible conditional hazard functions (see, e.g., Lawless (1982)). One class of proportional hazard models that nests many previous models as a special case and therefore might be termed “flexible” is the Box-Cox conditional hazard

$$h(t \mid x, \theta) = \exp\left\{x(t)\beta + \left(\frac{t^{\lambda_1} - 1}{\lambda_1}\right)\gamma_1 + \left(\frac{t^{\lambda_2} - 1}{\lambda_2}\right)\gamma_2 + \theta(t)\right\}. \tag{19}$$

Setting $\gamma_2 = 0$ and $\lambda_1 = 0$ gives a Weibull hazard; setting $\gamma_2 = 0$ and $\lambda_1 = 1$ gives a Gompertz hazard.

The conventional approach does, however, allow for right censored spells assuming an independent censoring mechanism. We consider two such schemes.
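The nesting of the Weibull and Gompertz cases in the Box-Cox hazard can be made concrete with a few lines of code. This is a minimal sketch with assumed parameter names: `xb` stands for the regressor index evaluated at time `t`, `gamma1`, `lam1`, `gamma2`, `lam2` for the duration-term parameters, and `theta` for the heterogeneity term; none of these names comes from the original notes. The Box-Cox transform is evaluated by its logarithmic limit when the exponent is zero.

```python
import numpy as np

def box_cox(t, lam):
    """Box-Cox transform (t**lam - 1) / lam, with limit ln(t) at lam = 0."""
    t = np.asarray(t, dtype=float)
    return np.log(t) if lam == 0 else (t ** lam - 1.0) / lam

def box_cox_hazard(t, xb, gamma1, lam1, gamma2=0.0, lam2=1.0, theta=0.0):
    """exp{xb + gamma1 * BC(t, lam1) + gamma2 * BC(t, lam2) + theta}."""
    return np.exp(xb + gamma1 * box_cox(t, lam1)
                  + gamma2 * box_cox(t, lam2) + theta)

t = 2.0
# Weibull case: second term off, first exponent zero -> exp(xb) * t**gamma1.
weibull = box_cox_hazard(t, xb=0.3, gamma1=0.5, lam1=0.0)
# Gompertz case: second term off, first exponent one -> exp(xb + gamma1 * (t - 1)).
gompertz = box_cox_hazard(t, xb=0.3, gamma1=0.5, lam1=1.0)
```

In the Weibull case the sign of `gamma1` gives the direction of duration dependence, which is how the "ln duration" coefficient in the table later in this section should be read.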

Let ( ) be the probability that a spell is censored at duration or later. If

( ) = 0

( ) = 1 (20)

there is censoring at fixed duration This type of censoring is common inmany economic data sets. More generally, for continuous censoring times let( ) be the density associated with ( ). Let = 1 if a spell is not rightcensored and = 0 if it is. Let denote an observed spell length. Then thejoint frequency of ( ) conditional on

˜for the case of absolutely continuous

distribution ( ) is

(¯̄̄˜) = ( )(1 )

Z=

[ ( |˜( ) )] ( )] ( |

˜( ) ) ( ) (21)

= { ( )1 ( ) }Z=

[ ( |˜( ) )] ( ) |

˜( ) ) ( ). (22)

For a Dirac censoring distribution, the density of observed durations is

( |˜) =

Z=

[ ( |˜( ) )] ( |

˜( ) ) ( ) (23)

Except for special time paths of variables the term

$$\int_0^t h(u \mid x(u), \theta)\,du,$$

which appears in the survivor function, does not have a closed form expression. To evaluate it requires numerical integration.

To circumvent this difficulty, one of two expedients is often adopted (see, e.g. Lundberg, 1981, Cox and Lewis, 1966).

(i) Replacing time trended variables with their within spell average

$$\bar{x}(t) = \frac{1}{t}\int_0^t x(u)\,du, \quad t > 0.$$

(ii) Using beginning of spell values $x(0)$.

Expedient (i) has the undesirable effect of building spurious dependence between duration time and the manufactured regressor variable. To see this most clearly, suppose that $x$ is a scalar and $x(u) = a + bu$. Then clearly

$$\bar{x}(t) = a + \frac{b}{2}t,$$

and $t$ and $\bar{x}(t)$ are clearly linearly dependent. Expedient (ii) ignores the time inhomogeneity in the environment.
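The spurious dependence created by within-spell averaging can be verified directly. The following sketch assumes a hypothetical linearly trending regressor with intercept `a` and slope `b` (illustrative values) and computes its within-spell average for several spell lengths by numerical integration; the helper name is ours.

```python
import numpy as np

def within_spell_average(x_path, t, n_grid=10_001):
    """Average of x(u) over [0, t], computed with the trapezoid rule."""
    u = np.linspace(0.0, t, n_grid)
    x = x_path(u)
    integral = np.sum(0.5 * (x[1:] + x[:-1]) * np.diff(u))
    return integral / t

# Hypothetical linear trend in the regressor: x(u) = a + b * u.
a, b = 1.0, 0.4

def x_path(u):
    return a + b * u

durations = np.array([0.5, 1.0, 2.0, 4.0])
x_bar = np.array([within_spell_average(x_path, t) for t in durations])

# The manufactured regressor is an exact linear function of duration.
corr = np.corrcoef(durations, x_bar)[0, 1]
```

Because the averaged regressor is perfectly collinear with spell length in this example, its coefficient cannot be separated from duration dependence, which is the point made in the text.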

Table 1. Weibull Model: Employment to Nonemployment Transitions
(Absolute Value of Normal Statistics in Parentheses)

Columns: (1) Regressors fixed at average value over the spell (expedient i); (2) Regressors fixed at value as of start of spell (expedient ii); (3) Regressors vary freely.

Variable                                   (1)                (2)                (3)
Intercept                                  .971 (1.535)       -3.743 (12.074)    -3.078 (8.670)
ln duration ($\gamma_1$)                   -.137 (1.571)      -.230 (2.888)      -.341 (3.941)
Married with spouse present?
  (= 1 if yes; = 0 otherwise)              -1.093 (2.679)     -.921 (2.310)      -.610 (1.971)
National unemployment rate                 -1.800 (6.286)     .569 (3.951)       .209 (1.194)

Source: See Flinn and Heckman, 1982b, p. 69.

These empirical results are typical. Introducing time varying variables into single spell duration models is inherently dangerous and ad hoc methods for doing so can produce wildly misleading results. More basically, separating the effect of time varying variables from duration dependence is only possible if there is “sufficient” independent variation in $x(t)$. To see this, consider hazard (19) with $\gamma_2 = 0$ and $x(t)$ scalar. Taking logs, we reach

$$\ln h(t \mid x, \theta) = x(t)\beta + \left(\frac{t^{\lambda_1} - 1}{\lambda_1}\right)\gamma_1 + \theta(t).$$

5 Identification and Estimation Strategies

(A) What features, if any, of $h(t \mid x, \theta)$ and/or $\mu(\theta)$ are identified from the “raw data”, i.e., $G(t \mid x)$?

(B) Under what conditions are $h(t \mid x, \theta)$ and $\mu(\theta)$ identified? i.e., how much a priori information has to be imposed on the model before these functions are identified?

(C) What empirical strategies exist for estimating $h(t \mid x, \theta)$ and/or $\mu(\theta)$ nonparametrically and what is their performance?

5.1 Nonparametric Procedures to Assess The Structural Hazard $h(t \mid x, \theta)$

In order to test whether or not an empirical $G(t \mid x)$ exhibits positive duration dependence, it is possible to use the total time on test statistic (Barlow et al., 1972, p. 267). This statistic is briefly described here. For each set of $x$ values, constituting a sample of $I_x$ durations, order the first $k$ durations starting with the smallest

$$t_1 \le t_2 \le \cdots \le t_k, \quad 1 \le k \le I_x.$$

Let $D_{i:I_x} = [I_x - (i + 1)](t_i - t_{i-1})$ where $t_0 \equiv 0$. Define

$$V_k = \frac{k^{-1}\displaystyle\sum_{i=1}^{k-1}\left[\sum_{j=1}^{i} D_{j:I_x}\right]}{k^{-1}\displaystyle\sum_{i=1}^{k} D_{i:I_x}}.$$

$V_k$ is called the cumulative total time on test statistic. If the observations are from a distribution with an increasing hazard rate, $V_k$ tends to be large. Intuitively, if $G(t \mid x)$ is a distribution that exhibits positive duration dependence, $D_{1:I_x}$ stochastically dominates $D_{2:I_x}$, $D_{2:I_x}$ stochastically dominates $D_{3:I_x}$, and so forth. Critical values for testing the null hypothesis.

Let $\mathcal{G}_1 = \{G : -\ln[1 - G(t \mid x)] \text{ is concave in } t \text{ holding } x \text{ fixed}\}$. Membership in this class can be determined from the total time on test statistic. If $G_1$ is log concave, the $D_{i:I_x}$ defined earlier are stochastically increasing in $i$ for fixed $I_x$ and $x$. Ordering the observations from the largest to the smallest and changing the subscripts appropriately, we can use $V_k$ to test for log concavity.

Next let

$$\mathcal{G}_2 = \left\{G : G(t \mid x) = \int \left(1 - \exp\left(-t\,\phi(x)\,\eta(\theta)\right)\right)d\mu(\theta) \text{ for some probability measure } \mu \text{ on } [0, \infty]\right\}.$$

It is often erroneously suggested that $\mathcal{G}_1 = \mathcal{G}_2$, i.e. that negative duration dependence by a homogeneous population ($\mathcal{G}_1$) cannot be distinguished from a pure heterogeneity explanation ($\mathcal{G}_2$).

In fact, by virtue of Bernstein’s theorem (see, e.g. Feller, 1971, p. 439-440) if2 it is completely monotone i.e.

( 1) (1 ( |˜)) 0 for 1 and all 0 (24)

and if ( |˜) satisfies (23), ( |

˜) 2

Setting = 3, (23) is violated if (-1)33

3(1 ( |

˜)) 0 i.e. if for some

= 0 " 2 ( |˜)

2+ 3 ( |

˜)

( |˜)

3( |˜)

#= 0

0

Formal verification of (23) requires uncensored data su ciently rich to supportnumerical di erentiation twice. Note that if the data are right censored at= , we may apply (23) over the interval 0 provided that we define

1 ( |˜) =

Z ³1

(˜) ( )

´( )Z ³

1(˜) ( )

´( )

and test whether

( 1) (1 ( |˜)) 0 for 1 and 0 (25)

The key insight in his test is as follows. For 2 the probability thatis the survivor function

( |˜) =

Z0

exp³

(˜) ( ))

´( ) (26)

By a transformation of variables = exp³

(˜) ( )

´we may transform

(25) for fixed˜to

( |˜) =

1Z0

( )

i.e. as the moment of a random variable defined on the unit interval.

From the solution to the classical Hausdor moment problem (see, e.g., Shohatand Tamarkin, 1943, p. 9) it is known that there exists a ( ) that satisfies(23) if

( |˜) 0 = 0 1 (27)

where

0 ( |˜) = ( |

˜)

1 ( |˜) = ( |

˜) ( + 1 |

˜)

( |˜) = ( |

˜)

μ1

¶(¯̄̄˜) +

μ2

¶( + 2 |

˜) + + ( 1) ( + |

˜)

.

Choosing equispaced intervals (0 1 [ ]) where [ ] is the nearest whole in-teger less than , form the ( |

˜) functions = 0 [ ]. Compute the

survivor functions so defined and test a subset of the necessary conditions.( = 1 )It is important to note that these are rejection criteria. There are other modelsthat may satisfy (23). For example

( ) =

Z0

( ) (28)

for 1 is completely monotone. By Bernstein’s theorem this distributionhas one representation in 2 but it is not unique.

5.2 Identifiability

$$h(t \mid x, \theta) = \psi(t)\,\phi(x)\,\theta \tag{29}$$

Before stating identifiability conditions, it is useful to define

$$Z(t) = \int_0^t \psi(u)\,du.$$

Then for the proportional hazard model we have the following proposition due to Elbers and Ridder (1982).

Proposition 2 If (i) $E(\theta) = 1$, (ii) $Z(t)$ defined on $[0, \infty)$ can be written as the integral of a nonnegative integrable function $\psi(t)$ defined on $[0, \infty)$, $Z(t) = \int_0^t \psi(u)\,du$, (iii) the set $S$, $x \in S$, is an open set in $R^k$ and the function $\phi$ is defined on $S$ and is nonnegative, differentiable and nonconstant on $S$, then $Z$, $\phi$ and $\mu(\theta)$ are identified. ∎

A general strategy of proof for this case is as follows (for details see Heckmanand Singer (1984a)) Assume that 0

˜

( ) is a member of a parametric family

of nonnegative functions and that the pair (˜

) is not identified. Assuming

that 0˜

is di erentiable to order , nonidentifiability implies the identities

1 =1( )

0( )=

0˜1

( )

Z0

0˜1

( )

1( )

0˜0

( )

Z0

0˜0

( )

0( )

· · ·1 =

( )( )

( )0 ( )

Proposition 3 For the true value of 0, defined so that 0 0 if ( )for all admissible , and for all bounded , then the triple ( 0 0 0) is

uniquely identified.¥ (For proof, see Heckman and Singer 1984a).

Proposition 4 For the true value of 0, such that 0 0 1, if all admis-sible are restricted to have a common finite mean that is assumed to be knowna priori ( ( ) = 1) and a bounded (but not necessarily common) second mo-ment ( 2) , and all admissible are bounded, then the triple ( 0 0 0)is uniquely identified.¥ (For proof see Heckman and Singer, 1984a).

Proposition 5 For the true value of 0 restricted so that 0 0 ,a positive integer, if all admissible are restricted to have a common finitemean that is assumed to be known a priori ( ( ) = 1) and a bounded (butnot necessarily common) + 1 moment ( ( +1) ), and all admissibleare bounded, then the triple ( 0 0 0) is uniquely identified.¥ (For proof

see Heckman and Singer, 1984a).

It is interesting that each integer increase in the value of 0 0 requires aninteger increase in the highest moment that must be assumed finite for alladmissible .The general strategy of specifying a flexible functional form for the hazardand placing moment restrictions on the admissible works in other modelsbesides the Box-Cox class of hazards. For example consider a nonmonotoniclog logistic model used by Trussell and Richards (1983).

0( ) =( )( ) 1

1 + ( )0 (30)

Proposition 6 For hazards model (4.8), the triple ( 0 0 0) is identifiedprovided that the admissible are restricted to have a common finite mean( ) = 1 ¥ (For proof, see Heckman and Singer, 1984a).

6 Sampling Plans and Initial Conditions Problems

Begin after the date of the sample. For interrupted spells one of the following duration times may be observed: (1) time in the state up to the sampling date ($T_b$), (2) time in the state after the sampling date ($T_a$), or (3) total time in a completed spell observed at the origin of the sample ($T_c = T_a + T_b$). Durations of spells that begin after the origin date of the sample are denoted $T_d$.

6.1 Time Homogeneous Environments and Models Without Observed and Unobserved Explanatory Variables

Time 0. Looking backward, a spell of length $t_b$ interrupted at 0 began $t_b$ periods ago. Looking forward, the spell lasts $t_a$ periods after the sampling date. The completed spell is $t_c = t_a + t_b$ in length. We ignore right censoring and assume that the underlying distribution is nondefective. (These assumptions are relaxed below.)

Let $k(-t_b)$ be the intake rate, i.e. $t_b$ periods before the sample begins, $k(-t_b)$ is the proportion of the population that enters the state of interest at time $\tau = -t_b$. The time homogeneity assumption implies that

$$k(-t_b) = k.$$

Let $g(t) = h(t)\exp\left[-\int_0^t h(u)\,du\right]$ be the density of completed durations in the population. The associated survivor function is

$$S(t) = 1 - G(t) = \exp\left[-\int_0^t h(u)\,du\right].$$

The proportion of the population experiencing a spell at calendar time $\tau = 0$, $P_0$, is obtained by integrating over the survivors from each cohort, i.e.

$$P_0 = \int_0^{\infty} k(-t_b)(1 - G(t_b))\,dt_b = \int_0^{\infty} k(-t_b)\exp\left[-\int_0^{t_b} h(u)\,du\right]dt_b. \tag{31}$$

Thus the density of an interrupted spell of length $t_b$ is the ratio of the proportion surviving from those who entered $t_b$ periods ago to the total stock

$$f(t_b) = \frac{k(-t_b)(1 - G(t_b))}{P_0} = \frac{k(-t_b)\exp\left[-\int_0^{t_b} h(u)\,du\right]}{P_0}.$$

This rules out defective distributions. Assuming $m = \int_0^{\infty} x\,g(x)\,dx < \infty$ and integrating the denominator of the preceding expression by parts, we reach the familiar expression (see, e.g. Cox and Lewis (1966))

$$f(t_b) = \frac{1 - G(t_b)}{m} = \frac{S(t_b)}{m} = \frac{1}{m}\exp\left[-\int_0^{t_b} h(u)\,du\right].$$

The density of sampled interrupted spells is not the same as the population density of completed spells.

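The density of sampled completed spells is obtained by the following straightforward argument. In the population, the conditional density of $t_c$ given $0 < t_b < t_c$ is

$$g(t_c \mid t_b) = \frac{g(t_c)}{1 - G(t_b)} = h(t_c)\exp\left[-\int_{t_b}^{t_c} h(u)\,du\right]. \tag{32}$$

Using the density of $f(t_b)$ the marginal density of $t_c$ in the sample is

$$f(t_c) = \int_0^{t_c} g(t_c \mid t_b)\,f(t_b)\,dt_b = \int_0^{t_c}\frac{g(t_c)}{m}\,dt_b, \tag{33}$$

so

$$f(t_c) = \frac{t_c\,g(t_c)}{m}.$$

The density of the forward time $t_a$ can be derived in a similar fashion.

$$f(t_a) = \int_0^{\infty} g(t_a + t_b \mid t_b)\,f(t_b)\,dt_b = \int_0^{\infty}\frac{g(t_a + t_b)}{m}\,dt_b = \frac{1}{m}\int_{t_a}^{\infty} g(u)\,du = \frac{1 - G(t_a)}{m} = \frac{S(t_a)}{m} = \frac{\exp\left[-\int_0^{t_a} h(u)\,du\right]}{m}. \tag{34}$$

So in a time homogeneous environment the functional form of $f(t_b)$ is identical to $f(t_a)$.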

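The following results are well known about the distributions of the random variables $T_a$, $T_b$ and $T_c$.

(a) If $g(t)$ is exponential with parameter $\theta$ (i.e. $g(t) = \theta e^{-t\theta}$) then so are $f(t_b)$ and $f(t_a)$. The proof is immediate.

(b) $E(T_b) = \dfrac{m}{2}\left(1 + \dfrac{\sigma^2}{m^2}\right)$ where $\sigma^2 = E(T - m)^2 = \displaystyle\int_0^{\infty}(t - m)^2 g(t)\,dt$.

(c) $E(T_a) = \dfrac{m}{2}\left(1 + \dfrac{\sigma^2}{m^2}\right)$ (since $T_a$ and $T_b$ have the same density).

(d) $E(T_c) = m\left(1 + \dfrac{\sigma^2}{m^2}\right)$, so $E(T_c) = 2E(T_a) = 2E(T_b)$ and $E(T_c) > m$ unless $\sigma^2 = 0$.

(e) If $-\ln(1 - G(t))/t \downarrow$ in $t$, $\sigma^2/m^2 \ge 1$. (This condition is implied if $h(t) = g(t)/(1 - G(t))$ is decreasing in $t$, i.e., $h'(t) < 0$.) In this case, $E(T_a) = E(T_b) \ge m$. (See Barlow and Proschan 1975 for proof.)

(f) If $-\ln(1 - G(t))/t \uparrow$ in $t$, $\sigma^2/m^2 \le 1$. (This condition is implied if $h'(t) > 0$.) In this case $E(T_a) = E(T_b) \le m$. (See Barlow and Proschan 1975 for proof.)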
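The moment results for sampled spells can be checked numerically for any population density with a known survivor function. The sketch below assumes, purely for illustration, a Gamma population density with shape 2 and scale 1 (mean 2, variance 2, increasing hazard); the backward-time density is the survivor function divided by the mean, and the sampled completed-spell density is the length-biased population density. The integration helper and all variable names are ours.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoid-rule integral of y over the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

t = np.linspace(0.0, 60.0, 60_001)

# Assumed population of completed durations: Gamma(shape 2, scale 1).
g = t * np.exp(-t)              # population density
S = (1.0 + t) * np.exp(-t)      # its survivor function (closed form)

m = trapezoid(t * g, t)                 # population mean
var = trapezoid((t - m) ** 2 * g, t)    # population variance

f_b = S / m        # density of interrupted (backward) spell lengths
f_c = t * g / m    # density of sampled completed spells (length biased)

E_b = trapezoid(t * f_b, t)
E_c = trapezoid(t * f_c, t)
formula_b = 0.5 * m * (1.0 + var / m ** 2)
```

In this example the population mean is 2 while sampled completed spells average 3, which illustrates how strongly stock sampling over-represents long spells; because the hazard here is increasing, the mean backward time (1.5) lies below the population mean.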

We next present the distribution of the duration time for spells that begin after the origin date of the sample. Let $\mathcal{T}$ denote the time a spell begins. The density of $\mathcal{T}$ is $k(\tau)$. Assuming that $\mathcal{T}$ and $T_d$ are independent, the joint probability that a spell begins at $\mathcal{T} = \tau$ and lasts less than $t_d$ periods is

$$\Pr\{\mathcal{T} = \tau \text{ and } T_d < t_d\} = k(\tau)\,G(t_d).$$

Thus the density of $T_d$ in a time homogeneous environment is

$$f(t_d) = g(t_d). \tag{35}$$

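It is common to “solve” the left censoring problem by assuming that $g(t)$ is exponential. The bias that results from invoking this assumption when it is false can be severe. As an example suppose that the population distribution of $T$ is Weibull so

$$g(t) = \alpha\phi\,t^{\alpha - 1}\exp(-\phi t^{\alpha}), \quad \phi > 0,\ \alpha > 0.$$

For $\alpha = 2$, $\text{plim}\ \hat{\theta} = \phi^{1/2}\,\Gamma(1/2)$, where $\hat{\theta}$ is the parameter of the falsely assumed exponential model estimated by maximum likelihood from sampled interrupted spells.

As another example, suppose the sample being analyzed consists of complete spells sampled at time zero (i.e. $T_c$) generated by an underlying population exponential density

$$g(t) = \theta e^{-t\theta}.$$

Then from (33)

$$f(t_c) = \theta^2\,t_c\,e^{-\theta t_c}.$$

If it is falsely assumed that $g(t)$ characterizes the duration data and $\theta$ is estimated by maximum likelihood, $\text{plim}\ \hat{\theta} = \theta/2$.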
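The size of the bias in the second example can be reproduced by simulation. The sketch below assumes an illustrative population exponential rate of 1.5: completed spells in progress at the sampling date are length biased, so they are drawn here from a Gamma distribution with shape 2 and scale equal to the reciprocal of the rate, and the exponential maximum likelihood estimator (the reciprocal of the sample mean) is then applied as if the data were a random sample of spells. The seed, rate and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 1.5       # assumed population exponential rate (illustrative)
n = 200_000

# Completed spells sampled at time zero are length biased:
# density theta**2 * t * exp(-theta * t), i.e. Gamma(shape 2, scale 1/theta).
t_c = rng.gamma(shape=2.0, scale=1.0 / theta, size=n)

# Exponential MLE applied as if t_c were drawn from the population density.
theta_hat = 1.0 / t_c.mean()
```

The estimate settles near half the true rate, so the implied mean duration is overstated by a factor of about two.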

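Continuing this example, suppose instead that a Weibull model is falsely assumed, i.e.

$$g(t) = \alpha\phi\,t^{\alpha - 1}\exp(-\phi t^{\alpha}),$$

and the parameters $\alpha$ and $\phi$ are estimated by maximum likelihood. With $I$ observations $t_1, \ldots, t_I$, the maximum likelihood estimator solves the following equations,

$$\frac{1}{\hat{\phi}} = \frac{1}{I}\sum_{i=1}^{I} t_i^{\hat{\alpha}},$$

$$\frac{1}{\hat{\alpha}} + \frac{1}{I}\sum_{i=1}^{I}\ln t_i = \frac{\hat{\phi}}{I}\sum_{i=1}^{I}(\ln t_i)\,t_i^{\hat{\alpha}},$$

so

$$\frac{1}{\hat{\alpha}} + \frac{1}{I}\sum_{i=1}^{I}\ln t_i = \frac{\sum_{i=1}^{I} t_i^{\hat{\alpha}}\ln t_i}{\sum_{i=1}^{I} t_i^{\hat{\alpha}}}. \tag{36}$$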

Using the easily verified result thatZ0

1 ln¡ ¢

=

½( )

ln ( ( ))

¾and that fact that in large samples plim ˆ = is the value of that solves(31), is the solution to

1+ (ln ) =

( ln )

( )

we obtain the equation

1+

μ( ) | =2 ln

¶=

μln ( ) | = +2 ln

¶(37)

Using the fact that( + 1)

( + 1)=1+

0( )

( )

and collecting terms, w may rewrite (32) as

1

( + 1)+

( ) | =2 =1

( )

( ) | = +1 (38)

Since (2) = 1, it is clear that = 1 is never a solution of this equation.

In fact, since the left hand side is monotone decreasing in and the righthand side is monotone increasing in , and since at = 1, the left hand side

1.It can also be shown that

plim ˆ =1

( + 2)

6.2 The Densities of $T_a$, $T_b$, $T_c$ and $T_d$ in Time Inhomogeneous Environments For Models With Observed and Unobserved Explanatory Variables

We define $k(\tau \mid x(\tau), \theta)$ to be the intake rate into a given state at calendar time $\tau$. We assume that $\theta$ is a scalar heterogeneity component and $x(\tau)$ is a vector of explanatory variables. It is convenient and correct to think of $k(\tau \mid x(\tau), \theta)$ as the density associated with the random variable $\mathcal{T}$ for a person with characteristics $(x(\tau), \theta)$. We continue the useful convention that spells are sampled at $\tau = 0$. The densities of $T_a$, $T_b$ and $T_c$ are derived for two cases: (a) conditional on a sample path $\{x(\tau)\}_{-\infty}^{\infty}$ and (b) marginally on the sample path $\{x(\tau)\}_{-\infty}^{\infty}$ (i.e. integrating it out). We denote the distribution of $\{x(\tau)\}_{-\infty}^{\infty}$ as $D(x)$ with associated density $dD(x)$.

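The derivation of the density of $T_b$ conditional on $\{x(\tau)\}_{-\infty}^{0}$ is as follows.

The proportion of the population in the state at time $\tau = 0$ is obtained by integrating over the survivors of each cohort of entrants. Thus

$$P_0(x) = \int_0^\infty \int_\theta k(-t_b \mid x(-t_b), \theta) \exp\left( -\int_0^{t_b} h(u \mid x(u - t_b), \theta) du \right) d\mu(\theta) dt_b$$

The proportion of people in the state with sample path $\{x(\tau)\}_{-\infty}^{0}$ whose spells are exactly of length $t_b$ is the set of survivors from a spell that initiated at $\tau = -t_b$, or

$$\int_\theta k(-t_b \mid x(-t_b), \theta) \exp\left( -\int_0^{t_b} h(u \mid x(u - t_b), \theta) du \right) d\mu(\theta)$$

Thus the density of $T_b$ conditional on $\{x(\tau)\}_{-\infty}^{0}$ is

$$f(t_b \mid \{x(\tau)\}_{-\infty}^{0}) = \frac{\int_\theta k(-t_b \mid x(-t_b), \theta) \exp\left( -\int_0^{t_b} h(u \mid x(u - t_b), \theta) du \right) d\mu(\theta)}{P_0(x)} \quad (39)$$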

The marginal density of $T_b$ (integrating out $x$) is obtained by an analogous argument: divide the marginal flow rate as of time $\tau = -t_b$ (the integrated flow rate) by the marginal (integrated) proportion of the population in the state at $\tau = 0$.

Thus defining

$$P_0 = \int_X P_0(x) dD(x)$$

where $X$ is the domain of integration for $x$, we write

$$f(t_b) = \frac{\int_X \int_\theta k(-t_b \mid x(-t_b), \theta) \exp\left( -\int_0^{t_b} h(u \mid x(u - t_b), \theta) du \right) d\mu(\theta) dD(x)}{P_0} \quad (40)$$

Note that we use a function space integral to integrate out $\{x(\tau)\}_{-\infty}^{0}$. (See Kac (1959) for a discussion of such integrals.) Note further that one obtains an incorrect expression for the marginal density of $T_b$ if one integrates (39) against the population density of $x$, $dD(x)$. The error in this procedure is that the appropriate density for $x$ against which (39) should be integrated is a density of $x$ conditional on the event that an observation is in the sample at $\tau = 0$. By Bayes' theorem this density is

$$f(x \mid T_b > 0) = \left[ \int_0^\infty f(t_b \mid \{x(\tau)\}_{-\infty}^{0}) dt_b \right] \frac{dD(x) P_0(x)}{P_0}$$

which is not in general the same as the density $dD(x)$. For proper distributions for $T_b$,

$$f(x \mid T_b > 0) = \frac{dD(x) P_0(x)}{P_0}$$
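A small discrete illustration of this reweighting may help. The sketch assumes a regressor with finitely many values, a constant intake rate, and exponential durations, so that the stock of each type at the sampling date is proportional to its mean duration; the type labels, population shares and hazards below are hypothetical.

```python
def sampled_distribution(pop_shares, hazards):
    """Distribution of x among spells in progress at the sampling date.

    Assumes a constant intake rate and an exponential duration with hazard
    hazards[x], so P_0(x) is proportional to the mean duration 1 / hazards[x].
    """
    p0_x = {x: 1.0 / hazards[x] for x in pop_shares}        # P_0(x) up to scale
    p0 = sum(pop_shares[x] * p0_x[x] for x in pop_shares)   # P_0
    return {x: pop_shares[x] * p0_x[x] / p0 for x in pop_shares}

population = {"short": 0.5, "long": 0.5}   # population density of x
hazard = {"short": 1.0, "long": 0.25}      # exit rate for each type
print(sampled_distribution(population, hazard))
```

Types with long durations are over-represented in the stock relative to the population, which is why integrating the conditional density against the population distribution of the regressors gives the wrong marginal density.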

The derivation of the density of $T_c$, the completed length of a spell sampled at $\tau = 0$, is equally straightforward. For simplicity we ignore right censoring problems, so we assume that the sampling frame is of sufficient length that no spell is censored, and we further assume that the underlying duration distribution is not defective. (But see the remarks at the conclusion of this section.) Conditional on $\{x(\tau)\}$ and $\theta$, the probability that the spell began at $\tau$ is

$$k(\tau \mid x(\tau), \theta)$$

The conditional density of a completed spell of length $t$ that begins at $\tau$ is

$$h(t \mid x(\tau + t), \theta) \exp\left( -\int_0^{t} h(u \mid x(\tau + u), \theta) du \right)$$

For a fixed $\tau \le 0$, $T_c$ by definition exceeds $-\tau$. Conditional on $x$, the probability that $T_c$ exceeds $-\tau$ is $P_0(x)$.

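Thus, integrating out $\tau$, respecting the fact that $t_c > -\tau$,

$$f(t_c \mid \{x(\tau)\}_{-\infty}^{t_c}) = \frac{\int_{-t_c}^{0} \int_\theta k(\tau \mid x(\tau), \theta) h(t_c \mid x(\tau + t_c), \theta) \exp\left[ -\int_0^{t_c} h(u \mid x(\tau + u), \theta) du \right] d\mu(\theta) d\tau}{P_0(x)} \quad (41)$$

The marginal density of $T_c$ is

$$f(t_c) = \frac{\int_{-t_c}^{0} \int_X \int_\theta k(\tau \mid x(\tau), \theta) h(t_c \mid x(\tau + t_c), \theta) \exp\left[ -\int_0^{t_c} h(u \mid x(\tau + u), \theta) du \right] d\mu(\theta) dD(x) d\tau}{P_0} \quad (42)$$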

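Ignoring right censoring, the derivation of the density of $T_a$ proceeds by recognizing that $T_a$ conditional on $\tau \le 0$ is the right tail portion of the random variable $-\tau + T_a$, the duration of a completed spell that begins at $\mathcal{T} = \tau$. The probability that the spell is sampled is $P_0(x)$. Thus the conditional density of $T_a = t_a$ given $\{x(\tau)\}$ is obtained by integrating out $\tau$ and correctly conditioning on the event that the spell is sampled, i.e.

$$f(t_a \mid \{x(\tau)\}) = \frac{\int_{-\infty}^{0} \int_\theta k(\tau \mid x(\tau), \theta) h(t_a - \tau \mid x(t_a), \theta) \exp\left( -\int_0^{t_a - \tau} h(u \mid x(\tau + u), \theta) du \right) d\mu(\theta) d\tau}{P_0(x)} \quad (43)$$

and the corresponding marginal density is

$$f(t_a) \quad (44)$$

$$= \frac{\int_{-\infty}^{0} \int_X \int_\theta k(\tau \mid x(\tau), \theta) h(t_a - \tau \mid x(t_a), \theta) \exp\left[ -\int_0^{t_a - \tau} h(u \mid x(\tau + u), \theta) du \right] d\mu(\theta) dD(x) d\tau}{P_0} \quad (45)$$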

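Of special interest is the case $k(\tau \mid x) = k(x)$, in which the intake rate does not depend on unobservables and is constant for all $\tau$ given $x$, and in which $x$ is time invariant. Then (39) specializes to

$$f(t_b \mid x) = \frac{1}{m(x)} \int_\theta \exp\left[ -\int_0^{t_b} h(u \mid x, \theta) du \right] d\mu(\theta) \quad (46)$$

where

$$m(x) = \int_0^\infty \int_\theta \exp\left[ -\int_0^{t_b} h(u \mid x, \theta) du \right] d\mu(\theta) dt_b$$

This density is essentially of the same functional form as the density after (30).

Under the same restrictions on $k$ and $x$, (41) and (43) specialize respectively to

$$f(t_c \mid x) = \frac{t_c \int_\theta h(t_c \mid x, \theta) \exp\left[ -\int_0^{t_c} h(u \mid x, \theta) du \right] d\mu(\theta)}{m(x)} \quad (47)$$

and

$$f(t_a \mid x) = \frac{\int_\theta \exp\left[ -\int_0^{t_a} h(u \mid x, \theta) du \right] d\mu(\theta)}{m(x)} \quad (48)$$

For this special case all of the results (i)-(vi) stated earlier go through with obvious redefinition of the densities to account for observed and unobserved variables.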

It is only for this special case of $k(\tau \mid x)$ with time invariant regressors that the densities of $T_a$, $T_b$ and $T_c$ do not depend on the parameters of $k$. The common expedient for “solving” the initial conditions problem for the density of $T_a$ (assuming that $G(t \mid x)$ is exponential) does not avoid the dependence of the density of $T_a$ on $k$, even if $k$ does not depend on $\theta$, as long as it depends on $\tau$ or $x(\tau)$ where $x(\tau)$ is not time invariant.

Thus in the exponential case in which ( |˜( + ) ) = (

˜( + ) ), we

may write (40) for the case = ( |˜( )) as

( | {˜( )} )

=

0Z Z=

( |˜( )) 0 (

˜|( + ) )

(˜( ) ) 0 (

˜( ) )

( )

0Z Z=

( |˜( )) 0 (

˜( + ) )

( )

Only if (˜( + ) ) = (

˜( + )) so that unobservables do not enter the

model (or equivalently that the distribution of is degenerate), does disap-pear from the expression. In this case the numerator factors into two compo-nents, one of which is the denominator of the density. “ ” also disappears ifit is a time invariant constant that is functionally independent of

a. The functional form of $k(\tau \mid x(\tau), \theta)$ is not in general known. This includes as a special case the possibility that for some known $\tau^* < 0$, $k(\tau \mid x(\tau), \theta) = 0$ for $\tau < \tau^*$. In addition, the value of $\tau^*$ may vary among individuals so that if it is unknown it must be treated as another unobservable.

b. If $x(\tau)$ is not time invariant, its value may not be known for $\tau < 0$ so that even if the functional form of $k$ is known, the correct conditional duration densities cannot be constructed.

The initial conditions problem stated in its most general form is intractable. However, various special cases of it can be solved. For example, suppose that the functional form of $k$ is known up to some finite number of parameters, but presample values of $x(\tau)$ are not. If the distribution of these presample values is known or can be estimated, one method of solution to the initial conditions problem is to define duration distributions conditional on post sample values of $x(\tau)$, integrating the presample values out of the model using the distribution of their values.

This approach suggests using³| {

˜( )}0

´=

0Z Z=

Z{˜( ): 0}

( |˜( ) ) ( |

˜( + ) ) exp

Z0

( |˜( + ) )

¸( ) ( )

Recall, however, that the distribution of˜within the sample is not the distri-

bution of˜in the population, (

˜). This is a consequence of the impact of

the sample selection rule on the joint distribution ofãnd The distribution

of the˜within sample depends on the distribution of , and the parameters

of ( |˜) and the presample distribution of

˜. Thus, for example, the joint

density of and˜for 0 is

(˜( ) | 0)

=

(˜)

Z 0 Z=

Z{˜: 0}

( |˜( ) ) ( + |

˜( + ) ) 0

( |˜( + ) )

(˜) ( )

0

so, the density of within sample˜( ) is

(˜( ) | 0) =

0(

˜( ))

=(˜)

0 0

0

={˜: 0}

( |˜( ) ) ( + |

˜( + ) ) 0

( |˜( + ) )

(˜) ( )

It is this density and not (˜) that is estimated using within sample data on

˜.

A partial avenue of escape from the initial conditions problem exploits i.e.durations for spells initiated after the origin date of the sample. The densityof conditional on {

˜( )} +

0 where 0 is the start date of the spell is

( | {˜( )} +

0 )

=

Z0

Z=

( |˜( ) ) ( |

˜( + ) ) 0

( |˜( + ) )

( )

Z0

Z=

( |˜( ) ) ( )

The denominator is the probability that 0. Only if does not depend onwill be the density of not depend on the parameters of . More e cient

inference is based on the joint density of and that

( | {˜( )} +

0 )

=

Z=

( |˜( ) ) ( |

˜( + ) ) exp

R0

( |˜( + ) )

¸( )

Z0

Z=

( |˜( ) ) ( )

For example, the density of measured completed spells that begin after thestart date of the sample incorporates the facts that 0 andi.e. that the onset of the spell occurs after = 0 and that all completed spellsmust be length or less. Thus we write (recalling that is the startdate of the spell)

( | {˜( )} +

0 0)

=

Z0

Z=

( |˜( ) ) ( |

˜( + ) ) 0 ( |

˜( + ) )

( )

Z0

Z0

Z=

( |˜( ) ) ( |

˜( + ) )

( |˜( + ) )

( )

The density of right censored spells that begin after the start date of the sampleis simply the joint probability of the events 0 and i.e.(0 | {

˜( )}0 ) =

=

Z0

Z Z=

( |˜( ) ) exp

" R0

( |˜( + ) )

#( )

The modification required in the other formulae presented in this subsection toaccount for the finiteness of the sampling plan are equally straightforward. Forspells sampled at = 0 for which we observe presample values of the durationand post sample completed durations ( ), it must be the case that (a) 0and (b) where 0 is the length of the sampling plan.Thus in place of (41) we write

( | {˜( )} 0)

=

Z Z=

( |˜( ) ) ( |

˜( + ) ) 0 ( |

˜( + ) )

( )

0Z Z Z=

( |˜( ) ) ( |

˜( + ) ) 0 ( |

˜( + ) )

( )

The denominator of this expression is the joint probability of the events thatand 0. For spells sampled at = 0 for which

we observe presample values of the duration and post sample right censoreddurations, it must be the case that (a) 0 and (b) so thedensity for such spell is

( |˜( )o

0)

=

0Z Z Z=

( |˜( ) ) ( |

˜( + ) ) 0

( |˜( + ) )

( )

7 Examples of Duration Models Produced by Economic Theory

Example A

7.1 A Dynamic Model of Labor Force Participation

The consumer works at age if the marginal rate of substitution between goodsand leisure evaluated at the no work position (also known as the nonmarketwage)

( ( )) =2( ( ) 1)

1( ( ) 1)(49)

function ( ) written as

( ) = ( ) ( ( )) (50)

If ( ) 0, the consumer works at age and we record this event by setting( ) = 1 If ( ) 0, ( ) = 0.

For each person successive values of ( ) may be correlated but it is assumedthat ( ) is independent of ( ) and ( ). We define the index functioninclusive of ( ) as

( ) = ( ) ( ( )) + ( ). (51)

The probability that an employed person does not leave the employed state is

1 ( ) (52)

where = ( ) The probability of receiving new values of in intervalis

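$$P_j = \binom{t_e}{j} P^j (1 - P)^{t_e - j}$$

where $P$ is the per period probability that a new value of $\varepsilon$ arrives.

The probability that a spell is longer than $t_e$ is the sum over $j$ of the products of the probability of receiving $j$ innovations in $\varepsilon(a)$ and the probability that the person does not leave the employed state on each of the $j$ occasions, $(1 - F(\psi))^j$. Thus

$$\Pr(T_e > t_e) = \sum_{j=0}^{t_e} \binom{t_e}{j} P^j (1 - P)^{t_e - j} (1 - F(\psi))^j = (1 - P F(\psi))^{t_e}$$

Thus the probability that an employment spell terminates at $t_e$ is

$$\Pr(T_e = t_e) = \Pr(T_e > t_e - 1) - \Pr(T_e > t_e) = (1 - P F(\psi))^{t_e - 1} (P F(\psi)) \quad (53)$$

By similar reasoning it can be shown that the probability that a non-employment spell terminates in $t_n$ periods is

$$\Pr(T_n = t_n) = [1 - P(1 - F(\psi))]^{t_n - 1} P (1 - F(\psi)) \quad (54)$$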
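The collapse of the binomial sum to a geometric survivor function follows from the binomial theorem, and it is easy to confirm numerically. In the sketch below `p` stands for the per period arrival probability of a new shock and `f` for the probability that an arriving shock ends the employment spell; both names and the numerical values are illustrative assumptions.

```python
from math import comb

def survivor_by_sum(t, p, f):
    """Sum over the number of shock arrivals j in t periods."""
    return sum(
        comb(t, j) * p**j * (1 - p) ** (t - j) * (1 - f) ** j
        for j in range(t + 1)
    )

def survivor_closed_form(t, p, f):
    """Geometric form implied by the binomial theorem."""
    return (1 - p * f) ** t

def exit_probability(t, p, f):
    """Probability that the spell ends exactly in period t."""
    return survivor_closed_form(t - 1, p, f) - survivor_closed_form(t, p, f)

print(survivor_by_sum(6, 0.3, 0.4), survivor_closed_form(6, 0.3, 0.4))
print(exit_probability(6, 0.3, 0.4))
```

Only the product `p * f` enters the closed form, which is the identification problem for single spell data in compact form.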

From a single spell, one can only estimate $P F(\psi)$.

In place of the Bernoulli assumption for the arrival of fresh values of $\varepsilon$, suppose instead that a Poisson process governs the arrival of shocks. As is well known (see, e.g., Feller (1970)) the Poisson distribution is the limit of a Bernoulli trial process in which the probability of success $P$ in each interval of length $\eta$ goes to zero in such a way that $\lim_{\eta \to 0} P/\eta = \lambda \neq 0$.

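For a time homogeneous environment the probability of receiving $j$ offers in time period $t_e$ is

$$P(j \mid t_e) = \exp(-\lambda t_e) \frac{(\lambda t_e)^j}{j!} \quad (55)$$

Thus for the continuous time model the probability that a person who begins employment at $a = a_1$ will stay in the employed state at least $t_e$ periods is, by reasoning analogous to that used to derive the discrete time survivor probability,

$$\Pr(T_e > t_e) = \sum_{j=0}^{\infty} \exp(-\lambda t_e) \frac{(\lambda t_e)^j}{j!} (1 - F(\psi))^j = \exp[-\lambda F(\psi) t_e] \quad (56)$$

so the density of spell lengths is

$$g(t_e) = \lambda F(\psi) \exp[-\lambda F(\psi) t_e]$$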
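The Poisson mixture above sums to an exponential survivor function because $\sum_j z^j/j! = e^z$. The sketch below checks this numerically by truncating the series; `lam` is the arrival rate of shocks, `f` the probability that an arrival ends the spell, and the truncation point is an arbitrary illustrative choice.

```python
import math

def survivor_poisson_sum(t, lam, f, terms=200):
    """Truncated sum over the number of arrivals j in an interval of length t."""
    total = 0.0
    term = math.exp(-lam * t)  # j = 0 term: no arrivals
    for j in range(terms):
        total += term
        # Move from the j-th to the (j+1)-th term of the series.
        term *= lam * t * (1.0 - f) / (j + 1)
    return total

def survivor_exponential(t, lam, f):
    """Closed form with constant hazard lam * f."""
    return math.exp(-lam * f * t)

print(survivor_poisson_sum(3.0, 1.2, 0.35), survivor_exponential(3.0, 1.2, 0.35))
```

The spell length is therefore exponential with hazard `lam * f`, and single spell data identify only that product.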

A more direct way to derive this notes that, from the definition of a Poisson process, the probability of receiving a new value of $\varepsilon$ in interval $(a, a + \Delta)$ is

$$P = \lambda \Delta + o(\Delta)$$

where $\lim_{\Delta \to 0} o(\Delta)/\Delta \to 0$, and the probability of exiting the employment state conditional on an arrival of $\varepsilon$ is $F(\psi)$. Hence the exit rate or hazard rate from the employment state is

$$h_e = \lim_{\Delta \to 0} \frac{\lambda \Delta F(\psi) + o(\Delta)}{\Delta} = \lambda F(\psi).$$

Using (4) relating the hazard function and the survivor function we conclude that

$$\Pr(T_e > t_e) = \exp\left( -\int_0^{t_e} h_e du \right) = \exp(-\lambda F(\psi) t_e)$$

By similar reasoning, the probability that a person starting in the nonemployed state will stay on in that state for at least duration $t_n$ is

$$\Pr(T_n > t_n) = \exp(-\lambda (1 - F(\psi)) t_n)$$

Analogous to the identification result already presented for the discrete time model, it is impossible using single spell employment or nonemployment data to separate $\lambda$ from $F(\psi)$ or $1 - F(\psi)$ respectively. However, access to data on both employment and nonemployment spells makes it possible to identify both $\lambda$ and $F(\psi)$.

The assumption of time homogeneity of the environment is only made to sim-plify the argument. Suppose that nonmarket time arrives via a nonhomoge-neous Poisson process so that the probability of receiving one nonmarket drawin interval ( + + ) is

( ) = ( ) + ( ) (57)

Assuming that and remain constant, the hazard rate for exit from em-ployment at time period for a spell that begins at 1 is

( | 1) = ( ) ( ) (58)

so that the survivor function for the spell is

( | 1) = exp ( )

Z1+

1

( )

¸(59)

By similar reasoning

( | 1) = exp (1 ( ))

Z1+

1

( )

¸

Example B

7.2 A One State Model of Search Unemployment

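$V$ is the value of search. Using Bellman's optimality principle for dynamic programming [see, e.g., Ross (1970)] $V$ may be decomposed into three components plus a negligible component [of order $o(\Delta t)$].

$$V = -\frac{c \Delta t}{1 + r \Delta t} + \frac{(1 - \lambda \Delta t)}{1 + r \Delta t} V + \frac{\lambda \Delta t}{1 + r \Delta t} E \max\left[ \frac{x}{r}; V \right] + o(\Delta t) \quad \text{for } V > 0,$$

$$V = 0 \text{ otherwise.} \quad (60)$$

• $\lambda$ is the rate of arrival of job offers (externally specified).

• $r$ is the discount rate, $c$ is the cost of search per instant, $\Delta t$ is the time interval.

Collecting terms in (60) and passing to the limit, we reach the familiar formula [Lippman and McCall (1976a)]

$$c + rV = \frac{\lambda}{r} \int_{rV}^{\infty} (x - rV) dF(x) \quad \text{for } rV \ge 0 \quad (61)$$

To calculate the probability that an unemployment spell $T_u$ exceeds $t_u$, we note that the probability of receiving an offer in time interval $(t, t + \Delta t)$ is

$$p = \lambda \Delta t + o(\Delta t) \quad (62)$$

and further note that the probability that an offer is accepted is $(1 - F(rV))$, so

$$h_u = \lambda (1 - F(rV)) \quad (63)$$

and

$$\Pr(T_u > t_u) = \exp(-\lambda (1 - F(rV)) t_u) \quad (64)$$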
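To make the reservation wage equation concrete, the sketch below solves $c + rV = (\lambda/r)\int_{rV}^{\infty}(x - rV)dF(x)$ by bisection under the added assumption that wage offers are exponential with rate `mu`, in which case the integral equals $\exp(-\mu rV)/\mu$. The exponential offer distribution, the parameter values and the function names are illustrative assumptions only.

```python
import math

def reservation_wage(c, r, lam, mu, tol=1e-12):
    """Solve c + w = (lam / r) * exp(-mu * w) / mu for w = rV >= 0."""
    def excess(w):
        return c + w - (lam / r) * math.exp(-mu * w) / mu
    if excess(0.0) >= 0.0:
        raise ValueError("no positive reservation wage for these parameters")
    lo, hi = 0.0, 1.0
    while excess(hi) < 0.0:       # excess is increasing, so expand the bracket
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def unemployment_hazard(lam, mu, w):
    """Offer arrival rate times the acceptance probability 1 - F(w)."""
    return lam * math.exp(-mu * w)

w = reservation_wage(c=1.0, r=0.05, lam=0.5, mu=1.0)
h = unemployment_hazard(0.5, 1.0, w)
print(w, h, math.exp(-h * 4.0))   # reservation wage, hazard, Pr(T_u > 4)
```

The hazard is constant here because the environment is stationary, so unemployment durations are exponential, as in (64).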

For discussion of the economic content of this model, see, e.g., Lippman and McCall (1976) or Flinn and Heckman (1982a). Accepted wages are truncated random variables with $rV$ as the lower point of truncation. The density of accepted wages is

$$g(x \mid x > rV) = \frac{f(x)}{1 - F(rV)}, \quad x \ge rV$$

From the assumption that wages are distributed independently of wage arrival times, the joint density of duration times $t_u$ and accepted wages $x$ is the product of the density of each random variable,

$$m(t_u, x) = \{\lambda (1 - F(rV)) \exp[-\lambda (1 - F(rV)) t_u]\} \frac{f(x)}{1 - F(rV)} = \lambda \exp[-\lambda (1 - F(rV)) t_u] f(x), \quad x \ge rV \quad (65)$$

For simplicity we assume that a reservation wage property characterizes the optimal policy, noting that for general time inhomogeneous models it need not. We denote the reservation wage at time $\tau$ as $rV(\tau)$. The probability that an individual receives a wage offer in time period $(\tau, \tau + \Delta \tau)$ is

$$p(\tau) = \lambda(\tau) \Delta \tau + o(\Delta \tau) \quad (66)$$

The probability that it is accepted is $(1 - F(rV(\tau)))$. Thus the hazard rate at time $\tau$ for exit from an unemployment spell is

$$h(\tau) = \lambda(\tau)(1 - F(rV(\tau))) \quad (67)$$

so that the probability that a spell that began at $\tau_1$ lasts at least $t_u$ is

$$\Pr(T_u > t_u \mid \tau_1) = \exp\left[ -\int_{\tau_1}^{\tau_1 + t_u} \lambda(z)(1 - F(rV(z))) dz \right] \quad (68)$$

The associated density is

$$g(t_u \mid \tau_1) = \lambda(\tau_1 + t_u)(1 - F(rV(\tau_1 + t_u))) \exp\left[ -\int_{\tau_1}^{\tau_1 + t_u} \lambda(z)(1 - F(rV(z))) dz \right]$$

Example C

7.3 A Dynamic McFadden Model

$$h(j \mid \tau) = \lambda(\tau) P_j(\tau) \quad (69)$$

so that the probability that the next purchase is item $j$ at a time $\tau = t + \tau_1$ or later is

$$\Pr(T_j > t \mid \tau_1) = \exp\left[ -\int_{\tau_1}^{\tau_1 + t} \lambda(u) P_j(u) du \right] \quad (70)$$

The $P_j$ may be specified using one of the many discrete choice models discussed in Amemiya's survey (1981). For the McFadden random utility model with Weibull errors (1973), the $P_j$ are multinomial logit. For the Domencich-McFadden (1975) random coefficients preference model with normal coefficients the $P_j$ are specified by multivariate probit.

Following McFadden (1974), the utility associated with each of $J$ possible choices at time $\tau$ is written as

$$U_j(\tau) = V(s, x_j(\tau)) + \varepsilon(s, x_j(\tau))$$

where $s$ denotes vectors of measured attributes of individuals, $x_j(\tau)$ represents vectors of attributes of choices, $V$ is a nonstochastic function and the $\varepsilon(s, x_j(\tau))$ are iid Weibull, i.e.,

$$\Pr(\varepsilon(s, x_j(\tau)) \le \varepsilon) = \exp[-\exp(-\varepsilon)]$$

Then as demonstrated by McFadden (p. 110),

$$P_j(s, x(\tau)) = \frac{\exp V(s, x_j(\tau))}{\sum_{l=1}^{J} \exp V(s, x_l(\tau))}$$

Adopting a linear specification for $V$ we write

$$V(s, x_j(\tau)) = x_j'(\tau) \beta(s)$$

so

$$P_j(s, x(\tau)) = \frac{\exp[x_j'(\tau) \beta(s)]}{\sum_{l=1}^{J} \exp[x_l'(\tau) \beta(s)]}$$
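As a rough sketch of how the multinomial logit choice probabilities feed the purchase hazard, the code below computes the logit probabilities for a linear utility index and multiplies them by an arrival rate. The attribute rows, coefficient vector and arrival rate are hypothetical, and the max-subtraction is only a standard numerical safeguard, not something taken from the text.

```python
import math

def logit_probabilities(attribute_rows, beta):
    """Multinomial logit probabilities for linear utilities x_j' beta."""
    scores = [sum(x * b for x, b in zip(row, beta)) for row in attribute_rows]
    top = max(scores)                      # subtract the max for stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def purchase_hazard(arrival_rate, probabilities, j):
    """Hazard for the next purchase being item j: arrival rate times P_j."""
    return arrival_rate * probabilities[j]

rows = [[0.0, 1.0], [1.0, 0.5], [2.0, 0.0]]   # attributes of three choices
beta = [0.4, -0.2]
probs = logit_probabilities(rows, beta)
print(probs, purchase_hazard(1.5, probs, 0))
```

With time varying attributes the probabilities would be recomputed at each date and the hazard integrated over the spell, as in (70).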

8 New Issues That Arise in Formulating and Estimating Choice Theoretic Duration Models

1. Without data on accepted wages, the models of previous sections are underidentified even if there are no regressors or unobservables in the model.

2. Even with data on accepted wages, the model is not identified unless the distribution of wage offers satisfies a recoverability condition to be defined below.

3. For models without unobserved variables, the asymptotic theory for estimators of the model is non-standard.

4. Allowing for individuals to differ in observed and unobserved variables injects an element of arbitrariness into model specification, creates new identification and computational problems, and virtually guarantees that the hazard is not of the proportional hazards functional form.

5. A new feature of duration models with unobservables produced by optimizing theory is that the support of $\theta$ now depends on parameters of the model.

Point A

From a random sample of durations of unemployment spells in a model without observed or unobserved explanatory variables, it is possible to estimate $h_u = \lambda(1 - F(rV))$ via maximum likelihood or Kaplan-Meier procedures (see, e.g., Kalbfleisch and Prentice (1980), pp. 10-16). It is obviously not possible using such data alone to separate $\lambda$ from $(1 - F(rV))$, much less to estimate the reservation wage $rV$.

Point B

Given access to data on accepted wage offers it is possible to estimate the reservation wage $rV$. A strongly consistent estimator of $rV$ is the minimum of the accepted wages observed in the sample

$$\widehat{rV} = \min\{x_i\}_{i=1}^{I} \quad (71)$$

For proof see Flinn and Heckman (1982a).
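The behaviour of the minimum accepted wage can be illustrated by simulation. The sketch below assumes exponential wage offers with rate `mu` and a known true reservation wage, accepts only offers at or above it, and reports the sample minimum; the offer distribution, parameter values and function names are illustrative assumptions.

```python
import random

def simulate_accepted_wages(n, mu, reservation_wage, seed=0):
    """Draw offers until n of them are at or above the reservation wage."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n:
        offer = rng.expovariate(mu)
        if offer >= reservation_wage:   # offers below rV are rejected
            accepted.append(offer)
    return accepted

def estimate_reservation_wage(accepted_wages):
    """Order statistic estimator: the smallest accepted wage."""
    return min(accepted_wages)

wages = simulate_accepted_wages(n=5000, mu=1.0, reservation_wage=1.5)
print(estimate_reservation_wage(wages))
```

The estimate can never fall below the true reservation wage and approaches it from above as the sample grows, provided the offer density is positive at the truncation point.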

Access to accepted wages does not secure identification of $F$. Only the truncated wage offer distribution can be estimated

$$F(x \mid x \ge rV) = \frac{F(x) - F(rV)}{1 - F(rV)}, \quad x \ge rV.$$

To recover an untruncated distribution from a truncated distribution with a known point of truncation requires further conditions. If $F$ is normal, such recovery is possible. If it is Pareto, it is not. A sufficient condition that

Point C

Using density (65) in a maximum likelihood procedure creates a non-standard statistical problem. The range of random variable $x$ depends on a parameter of the model ($x \ge rV$). For a model without observed or unobserved explanatory variables, the maximum likelihood estimator of $rV$ is in fact the order statistic estimator (71). The likelihood based on (65) is monotonically increasing in $rV$, so that imposing the restriction that $x \ge rV$ is essential in securing maximum likelihood estimates of the model. Assuming that the density of $x$ is such that $f(rV) \neq 0$, the consistent maximum likelihood estimator of the remaining parameters of the model can be obtained by inserting $\widehat{rV}$ in place of $rV$ everywhere in (65), and the sampling distribution of this estimator is the same whether or not $rV$ is known a priori or estimated. For a proof, see Flinn and Heckman (1982a). In a model with observed explanatory variables but without unobserved explanatory variables, a similar phenomenon occurs. However, at the time of this writing, a rigorous asymptotic distribution theory is only available for models with discrete valued regressor variables which assume a finite number of values.

i. Economic theory provides no guidance on the functional form of the component functions of the model (other than the restriction given by (61)). Estimates secured from these models are very sensitive to the choice of these functional forms. Model identification is difficult to check and is very functional form dependent.

ii. In order to impose the restrictions produced by economic theory to secure estimates, it is necessary to solve nonlinear equation (61).

iii. Because of restrictions like (61), proportional hazard specifications are rarely produced by economic models.

Point D

In the search model without observed variables, the restriction that $x \ge rV$ is an essential piece of identifying information. In a model with an unobservable $\theta$ introduced in $c$, $r$ or $\lambda$, $rV = rV(\theta)$ as a consequence of functional restriction (61). In this model, the restriction that $x \ge rV$ is replaced with an implicit equation restriction on the support of $\theta$. For an observation with accepted wage $x$ and reservation wage $rV(\theta)$, the admissible support set for $\theta$ is

$$\{\theta : 0 \le rV(\theta) \le x\}.$$

9 Pitfalls In Using Regression Methods To Analyze Duration Data

To focus on essential ideas, consider a regression analysis of duration data for a particular type of event, e.g., the lengths of time spent in consecutive jobs. To simplify the analysis we assume that no time elapses between consecutive jobs. The density of duration in a given job for an individual with fixed characteristics $Z$ is

$$f(t \mid Z).$$

Unobserved heterogeneity components are assumed to be absent from the model. The expected length of $t$ given $Z$ is

$$E(t \mid Z) = \int_0^\infty t f(t \mid Z) dt = g(Z) \quad (72)$$

From a regression analysis, we seek to estimate the parameters of $g(Z)$. For example, if

$$f(t \mid Z) = h(Z) \exp[-h(Z) t], \quad h(Z) > 0$$

$$E(t \mid Z) = \frac{1}{h(Z)} \quad (73)$$

Defining $h(Z) = (Z\beta)^{-1}$,

$$E(t \mid Z) = Z\beta \quad (74)$$

Under ideal conditions, a regression of $t$ on $Z$ will estimate $\beta$. We now specify those conditions.

Suppose that the data at our disposal come from a panel data set of length $T$. To avoid inessential detail suppose that at the origin of the sample, $\tau = 0$, everyone begins a spell of the event. This assumption enables us to ignore problems with initial conditions.

We would like to use this data to estimate $E(t \mid Z)$. But in our panel sample, the expected value of the length of the first spell is not $E(t \mid Z)$ but is rather

$$E(t_1 \mid Z, T) = \int_0^T t f(t \mid Z) dt + T \int_T^\infty f(t \mid Z) dt \quad (75)$$

Thus, in the exponential example

$$E(t_1 \mid Z, T) = Z\beta \{1 - \exp(-T/Z\beta)\} \quad (76)$$

Clearly, a least squares regression of $t_1$ on $Z$ will not estimate $\beta$. As $T \to \infty$ the bias disappears. In the exponential example, as $T$ becomes big relative to the mean duration, $1/h(Z)$, the bias becomes small.

One widely used method utilizes only completed first spells. This results in another type of selection bias. The expected value of $t_1$ given that $t_1 < T$ is

$$E(t_1 \mid Z, t_1 < T) = \frac{\int_0^T t f(t \mid Z) dt}{\int_0^T f(t \mid Z) dt} \quad (77)$$

In our exponential example

$$E(t_1 \mid Z, t_1 < T) = \left\{ \frac{Z\beta - (T + Z\beta) \exp(-T/Z\beta)}{1 - \exp(-T/Z\beta)} \right\} \quad (78)$$

Again, a simple least squares regression of $t_1$ on $Z$ does not estimate $\beta$ for this sample. As $T \to \infty$, the bias disappears. (Note that the model could be consistently estimated by nonlinear least squares.)
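The two selection biases can be quantified for a given panel length. The sketch below assumes an exponential duration with mean `zb` (standing in for $Z\beta$) and compares the closed forms for the censored mean and the completed-spell mean with direct numerical integration; the midpoint rule, the parameter values and the function names are illustrative assumptions.

```python
import math

def mean_censored(zb, T):
    """E[min(t, T)] for an exponential duration with mean zb."""
    return zb * (1.0 - math.exp(-T / zb))

def mean_completed(zb, T):
    """E[t | t < T] for an exponential duration with mean zb."""
    e = math.exp(-T / zb)
    return (zb - (T + zb) * e) / (1.0 - e)

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

zb, T = 2.0, 3.0
density = lambda t: math.exp(-t / zb) / zb
numeric_censored = midpoint(lambda t: t * density(t), 0.0, T) + T * math.exp(-T / zb)
numeric_completed = midpoint(lambda t: t * density(t), 0.0, T) / midpoint(density, 0.0, T)
print(mean_censored(zb, T), numeric_censored)
print(mean_completed(zb, T), numeric_completed)
```

Both means lie below `zb` for any finite `T`, so a linear regression on either sample understates the mean duration by an amount that varies with the regressors.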

Selection bias also arises when we analyze the expected duration of a completed second spell of the event. Denote the length of spell $j$ by $t_j$. The expected length of the second spell is

$$E(t_2 \mid Z, t_1 + t_2 < T) = \frac{\int_0^T \int_0^{T - t_2} t_2 f(t_2 \mid Z) f(t_1 \mid Z) dt_1 dt_2}{\int_0^T \int_0^{T - t_2} f(t_2 \mid Z) f(t_1 \mid Z) dt_1 dt_2} \quad (79)$$

Because $t_1$ and $t_2$ are conditionally independent, and hence the subscripts 1 and 2 can be interchanged without affecting the validity of the expression, this is also the conditional expectation of the length of the first spell. For a sample of individuals with at least two completed spells of the event

$$E(t_2 \mid Z, T, t_1 + t_2 < T) = Z\beta \left[ \frac{1 - \exp(-T/Z\beta)\left(1 + \frac{T}{Z\beta}\right) - \frac{T^2}{2 (Z\beta)^2} \exp(-T/Z\beta)}{1 - \exp(-T/Z\beta) - \frac{T}{Z\beta} \exp(-T/Z\beta)} \right] \quad (80)$$

Note further that

$$E(t_1 \mid t_1 + t_2 < T) \neq E(t_1 \mid t_1 < T).$$

The key point to extract from this discussion is that for short panels in which $T$ is “small,” regression estimators do not estimate the parameters of regression function (72). Least squares estimators are critically dependent on both the sample selection rule and the length of the panel.

A common functional form for the hazard function, $h(\cdot)$, is assumed for all spells. $V$ is a heterogeneity component common across all spells. The density of duration time in the first spell, $t_1$, is

First spell:

$$h[t_1 \mid x(t_1), V] \exp\left\{ -\int_0^{t_1} h[u \mid x(u), V] du \right\}$$

The density for the duration time in the second spell $t_2$ given that the first spell ends at calendar time $t_1$ is

9.1 Conditional second spell density:

$$h[t_2 \mid x(t_2 + t_1), V] \exp\left\{ -\int_0^{t_2} h[u \mid x(u + t_1), V] du \right\}$$

The marginal second spell density is obtained by integrating out $t_1$. Thus

$$f(t_2 \mid V) = \int_0^\infty h[t_2 \mid x(t_2 + t_1), V] \exp\left\{ -\int_0^{t_2} h[u \mid x(u + t_1), V] du \right\} h[t_1 \mid x(t_1), V] \exp\left\{ -\int_0^{t_1} h[u \mid x(u), V] du \right\} dt_1$$

In the case in which the distribution of $x(t)$ does not depend on time (time stationarity in the exogenous variables),

$$f(t_2 \mid V) = h[t_2 \mid x(t_2 + t_1), V] \exp\left\{ -\int_0^{t_2} h[u \mid x(u + t_1), V] du \right\}$$

Otherwise the marginal second spell density will be of a different functional form than the marginal first spell density, and the regression function for the second spell will have a functional form different from that of the first spell regression function.

A simple example may serve to clarify the main points. We first demonstrate that the functional form of the regression will depend on the time path of the exogenous variables that drive the model. Consider the following exponential model for the first spell of an event

$$f(t_1 \mid Z, V) = h(Z, V) \exp[-h(Z, V) t_1], \quad 0 < t_1 < \infty$$

where $h(Z, V) = 1/(Z\beta + V)$, and $Z$ remains constant over the entire spell. The regression function for duration in the first spell is

$$E(t_1 \mid Z, V) = \frac{1}{h(Z, V)} = Z\beta + V \quad (81)$$

Suppose we consider another individual who is subject to a different value of $Z$ before and after calendar time $\tau_1$. The density of $t_1$ for this person is derived most simply from the conditional density before and after $\tau_1$:

$$f(t_1 \mid t_1 < \tau_1) = \frac{h(Z_1, V) \exp[-h(Z_1, V) t_1]}{1 - \exp[-h(Z_1, V) \tau_1]}, \quad 0 < t_1 < \tau_1$$

and

$$f(t_1 \mid t_1 > \tau_1) = \frac{h(Z_2, V) \exp[-h(Z_2, V) t_1]}{\exp[-h(Z_2, V) \tau_1]}, \quad t_1 > \tau_1$$

The conditional expectation of duration in the first spell is

$$E(t_1 \mid Z_1, Z_2, \tau_1, V) = \frac{1}{h(Z_1, V)} + \exp[-h(Z_1, V) \tau_1] \left( \frac{1}{h(Z_2, V)} - \frac{1}{h(Z_1, V)} \right) \quad (82)$$

Next consider the second spell. Assume the same functional form for the hazard function in all spells of the event. The conditional expectation of duration in the second spell, given values of the exogenous variables that confront the individual after the end of the first spell, is, for a case of no time varying variables

$$E(t_2 \mid Z, V) = \frac{1}{h(Z, V)} \quad (83)$$
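The first spell expectation (82), for a regressor that changes value at calendar time $\tau_1$, can be checked by integrating the survivor function of a piecewise constant hazard. In the sketch below `h1` and `h2` stand for the hazards before and after the switch date `tau`; the numerical values, the integration cutoff and the function names are illustrative assumptions.

```python
import math

def expected_duration_closed_form(h1, h2, tau):
    """Mean duration when the hazard is h1 before tau and h2 afterwards."""
    return 1.0 / h1 + math.exp(-h1 * tau) * (1.0 / h2 - 1.0 / h1)

def survivor(t, h1, h2, tau):
    """Survivor function of the piecewise constant hazard."""
    if t < tau:
        return math.exp(-h1 * t)
    return math.exp(-h1 * tau - h2 * (t - tau))

def expected_duration_numeric(h1, h2, tau, n=200_000):
    """Integrate the survivor function: E[t] equals its integral over [0, inf)."""
    def midpoint(a, b):
        step = (b - a) / n
        return step * sum(survivor(a + (i + 0.5) * step, h1, h2, tau) for i in range(n))
    upper = tau + 60.0 / h2   # the tail beyond this point is negligible
    return midpoint(0.0, tau) + midpoint(tau, upper)

print(expected_duration_closed_form(0.5, 0.2, 1.5))
print(expected_duration_numeric(0.5, 0.2, 1.5))
```

The mean is a nonlinear function of both regressor values and of the switch date, so a regression that is linear in a single regressor value is misspecified for such a person.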

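For the case of time varying variables, the conditional expectation depends on whether or not $t_1 < \tau_1$. If $t_1 < \tau_1$, the conditional expectation is

$$E(t_2 \mid Z_1, Z_2, \tau_1, t_1, V) = \frac{1}{h(Z_1, V)} + \exp[-h(Z_1, V)(\tau_1 - t_1)] \left[ \frac{1}{h(Z_2, V)} - \frac{1}{h(Z_1, V)} \right], \quad t_1 < \tau_1 \quad (84)$$

For $t_1 > \tau_1$, the conditional expectation is

$$E(t_2 \mid Z_1, Z_2, \tau_1, t_1, V) = \frac{1}{h(Z_2, V)}, \quad t_1 > \tau_1 \quad (85)$$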

Although equations (82) and (84) are of the same functional form, there is one important difference: in equation (84) $t_1$ is an explanatory variable. Since the unobserved heterogeneity component $V$ is correlated across spells, $t_1$ is an endogenous variable in a regression model that treats $V$ as a component of the error term of the model (i.e. a model that is not computed conditional on $V$). Partitioning the data on the basis of $t_1$ raises further problems. By Bayes' theorem, the conditional mean of $V$ in equations (84) and (85) depends on $t_1$ and the explanatory variables, so that the error term (inclusive of $V$) associated with regression specifications for equations (84) or (85) does not in general have a zero mean. A standard least squares assumption is violated and least squares estimators of duration equations will be biased and inconsistent.

The main point is quite general: whenever there are time trended or nonstationary explanatory variables in the model, conditioning the durations of subsequent spells on explanatory variables measured from the onset of those spells induces correlation between the explanatory variables and the heterogeneity component in the model.

One solution to these problems is to use the marginal second spell density and compute the conditional expectation of $t_2$ with respect to it. For the case of no time varying variables, and in the more general case of time stationary exogenous variables, the marginal and conditional densities coincide so that the right-hand side of equation (68) is the density. In the presence of nonstationary explanatory variables, the two distributions differ.

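In our example, the conditional expectation of $t_2$ computed with respect to the marginal distribution of $t_2$ is

$$E(t_2 \mid Z_1, Z_2, \tau_1, V) = \frac{1}{h(Z_1, V)} + \exp[-h(Z_1, V) \tau_1] \left[ \frac{h(Z_1, V) - h(Z_2, V)}{h(Z_2, V)} \right] \left[ \frac{1}{h(Z_1, V)} + \tau_1 \right] \quad (86)$$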
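Expression (86) can be reproduced by averaging the conditional second spell expectation over the first spell distribution. The sketch below assumes exponential spells whose hazard is `h1` before calendar time `tau` and `h2` afterwards, with the first spell starting at time zero; the numerical values and function names are illustrative assumptions.

```python
import math

def marginal_closed_form(h1, h2, tau):
    """Second spell mean after integrating out the first spell length."""
    return 1.0 / h1 + math.exp(-h1 * tau) * ((h1 - h2) / h2) * (1.0 / h1 + tau)

def conditional_mean(t1, h1, h2, tau):
    """Second spell mean given that the first spell ended at t1."""
    if t1 >= tau:
        return 1.0 / h2
    return 1.0 / h1 + math.exp(-h1 * (tau - t1)) * (1.0 / h2 - 1.0 / h1)

def marginal_numeric(h1, h2, tau, n=100_000):
    """Average the conditional mean over the first spell density."""
    step = tau / n
    before = step * sum(
        h1 * math.exp(-h1 * t) * conditional_mean(t, h1, h2, tau)
        for t in ((i + 0.5) * step for i in range(n))
    )
    # First spells that outlast tau have probability exp(-h1 * tau).
    after = math.exp(-h1 * tau) * conditional_mean(tau, h1, h2, tau)
    return before + after

print(marginal_closed_form(0.5, 0.2, 1.5), marginal_numeric(0.5, 0.2, 1.5))
```

Unlike the conditional expectation, this marginal expectation does not involve the first spell length, so the endogeneity problem described above does not arise.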

Table 1: Ln Employment Durations (Based on Two Completed Spells)

                          Within spell averages of           Start of spell
                          exogenous variables                exogenous variables
                          Spell one (t)    Spell two (t)     Spell one (t)    Spell two (t)
Intercept                 1.232 (2.16)     -1.052 (1.25)     7.094 (2.1)      -2.048 (3.2)
Marital status
(1 if married)            .624 (1.43)      -.449 (.72)       -.409 (.81)      -.490 (.75)
National unemployment     -2.020 (5.40)    -.975 (.21)       -6.523 (2.5)     .531 (1.88)

Difference specifications
Intercept                 2.502 (1.95)                       12.423 (2.5)
Marital status            -.138 (.17)                        0.112 (.01)
Unemployment              185 (.56)                          .858 (.337)
Marital status
(first spell)             -.570 (.49)                        -.496 (1.11)
National unemployment
(first spell)             1.946 (2.41)                       8.951 (2.43)

                          = 1.08                             =.65

Table 2: Ln Employment Durations (Based on Two Completed Spells)

                          Within spell averages of           Start of spell
                          exogenous variables                exogenous variables
                          Spell one (t)    Spell two (t)     Spell one (t)    Spell two (t)
Intercept                 -.398 (.97)      .44 (.65)         -1.051 (2.28)    -1.011 (1.61)
Marital status
(1 if married,
spouse present)           -.117 (.24)      .16 (.44)         -.375 (.764)     -.093 (.25)
National unemployment     -.335 (1.46)     -.53 (1.74)       .057 (.21)       .151 (.51)

Difference specifications
Intercept                 -.074 (0.84)                       12.423 (2.5)
Marital status            -.342 (.645)                       0.112 (.01)
Unemployment              -.402 (.85)                        .858 (.337)
Marital status
(first spell)             .091 (-.13)                        -.496 (1.11)
Unemployment
(first spell)             -.252 (.60)                        -.653 (1.42)

                          = 1 23                             = 74

Table 3. Maximum Likelihood Estimates - Weibull Model [1]

Panel A: Regressors Fixed at Average Value Over Spell

                     Employment to Nonemployment    Nonemployment to Employment
Intercept            .971 (1.535)                   -.093 (.221)
ln Duration ( )      -.137 (1.571)                  -.287 (2.976)
MSP                  -1.093 (2.679)                 .347 (1.134)
Unemployment         -1.800 (6.286)                 -.577 (3.119)

$ [2] = 711 457

Panel B: Regressors Fixed at Value for First Month of Spell

                     Employment to Nonemployment    Nonemployment to Employment
Intercept            -3.743 (12.074)                -1.054 (3.464)
ln Duration ( )      -.230 (2.888)                  -.363 (4.049)
MSP                  -.921 (2.310)                  .297 (.902)
Unemployment         .569 (3.951)                   -.130 (.900)

$ = 740 998

Panel C: Regressors Free to Vary Over the Spell

                     Employment to Nonemployment    Nonemployment to Employment
Intercept            -3.078 (8.670)                 -.899 (2.742)
ln Duration ( )      -.341 (3.941)                  -.316 (3.279)
MSP                  -.610 (1.971)                  .362 (1.131)
Unemployment         .209 (1.194)                   -.204 (1.321)

$ = 746 515

[1] Absolute value of asymptotic normal statistics in parentheses.
[2] $ denotes the value of the log likelihood function.

Table 4. Maximum Likelihood Estimates with Time Varying Variables and Heterogeneity [1]

                     Employment to Nonemployment    Nonemployment to Employment
Intercept            -3.600 (8.395)                 -.879 (2.525)
ln duration          .015 (.121)                    -.312 (3.170)
MSP                  -.498 (1.384)                  .320 (.961)
Unemployment         -.017 (.101)                   -.172 (1.056)
                     1.196 (4.651)                  -.133 (.756)

$ = 740 126

[1] Absolute value of asymptotic normal statistics in parentheses.

Table 5. Maximum Likelihood Estimates with Time Varying Variables, Heterogeneity, and General Duration Dependence [1]

                     Without heterogeneity               With heterogeneity
Const.               -3.271 (8.901)    -.762 (2.425)     -3.565 (8.537)    -.748 (2.247)
Tenure 10            -.806 (1.858)     -1.714 (2.731)    .045 (.085)       -1.704 (2.685)
Tenure2 100          .028 (.145)       .602 (1.673)      -.120 (.607)      .603 (1.666)
MSP                  -.568 (1.731)     .349 (1.089)      -.490 (1.353)     .313 (.956)
Unemployment         .329 (1.865)      -.192 (1.271)     .075 (.392)       -.164 (1.042)
                                                         1.088 (3.572)     -.118 (.650)

$ =                  742 334                             739 177
