Psychological Methods
2001, Vol. 6, No. 1, 18-34

Copyright 2001 by the American Psychological Association, Inc.
1082-989X/01/$5.00 DOI: 10.1037//1082-989X.6.1.18

Analyzing Developmental Trajectories of Distinct but Related Behaviors: A Group-Based Method

Daniel S. Nagin
Carnegie Mellon University

Richard E. Tremblay
University of Montreal

This article presents a group-based method to jointly estimate developmental trajectories of 2 distinct but theoretically related measurement series. The method will aid the analysis of comorbidity and heterotypic continuity. Three key outputs of the model are (a) for both measurement series, the form of the trajectory of distinctive subpopulations; (b) the probability of membership in each such trajectory group; and (c) the joint probability of membership in trajectory groups across behaviors. This final output offers 2 novel features. First, the joint probabilities can characterize the linkage in the developmental course of distinct but related behaviors. Second, the joint probabilities can measure differences within the population in the magnitude of this linkage. Two examples are presented to illustrate the application of the method.

Daniel S. Nagin, H. J. Heinz III School of Public Policy and Management, Carnegie Mellon University; Richard E. Tremblay, Research Unit on Children's Psychosocial Maladjustment, University of Montreal, Montreal, Quebec, Canada.

This research was supported by the National Consortium on Violence Research; National Science Foundation Grants SBR-9511412 and SES-9911370; the Conseil Quebecois de la Recherche Sociale, Quebec; the Fonds Pour la Formation de Chercheurs et L'Aide a la Recherche, Quebec; the National Health Research and Development Program; the Social Sciences and Humanities Research Council of Canada; and the Molson Foundation.

We thank Bobby Brame, Lisa Broidy, Shawn Bushway, Avshalom Caspi, Laura Dugan, Anne Garvin, Brandi Marin-Pearce, Terrie Moffitt, and Steve Raudenbush for helpful comments.

Correspondence concerning this article should be addressed to Daniel S. Nagin, H. J. Heinz III School of Public Policy and Management, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213-2890. Electronic mail may be sent to [email protected].

Two prominent themes in developmental psychology, developmental psychopathology, and developmental criminology are comorbidity and heterotypic continuity. Comorbidity refers to the contemporaneous occurrence of two or more undesirable conditions, such as conduct disorder and hyperactivity during childhood (Angold, Costello, & Erkanli, 1999; Nagin & Tremblay, 1999) or anxiety and depression in adulthood (Kessler et al., 1994). Heterotypic continuity is the manifestation over time of a latent individual trait in different but analogous behaviors (Caspi, 1998; Kagan, 1969). For example, a propensity for violence may reveal itself as kicking and biting siblings during early childhood, gang fighting during adolescence, and spouse abuse during adulthood. The form and target of the aggression differ, but the constant is physical violence. Because the form of the manifestation changes, the same measurement scale cannot appropriately capture such a tendency at different stages of life.

Comorbidity and heterotypic continuity are typically represented by a single summary statistic, usually a correlation or odds ratio, that measures the co-occurrence of the two behaviors or symptoms of interest (e.g., hyperactivity and conduct disorder at age 6) or, alternatively, relates the two distinct behaviors measured at different life stages (e.g., physical aggression at age 5 and violent delinquency at age 15). Examples of research using this conventional measurement strategy in comorbidity analysis include work by Costello et al. (1988); Fergusson, Horwood, and Lynskey (1993); Haapasalo, Tremblay, Boulerice, and Vitaro (in press); Lewinsohn, Hops, Roberts, Seeley, and Andrews (1993); and Valez, Johnson, and Cohen (1989). Examples in heterotypic continuity analysis include Backteman and Magnusson (1981); Caspi (1998); Farrington (1990); Huesmann, Eron, Lefkowitz, and Walder (1984); Loeber and LeBlanc (1990); and Olweus (1979).


Figure 1 depicts the essence of the data summary problem in such analyses. Figure 1A characterizes the conventional approach to summarizing the co-occurrence of two behaviors. For a comorbidity analysis, up to T summary measures of association can be computed, one for each of the T measurement periods. For example, if series X and Y, respectively, measured depression and anxiety over T periods, up to T correlation coefficients could be computed to represent comorbidity for each period. For heterotypic continuity analysis, the number of combinations is potentially even larger because each of the T measurements of behavior X can be related to any of the K measurements of behavior Y from period T onward.

The conventional approach to measuring behavioral overlap and stability suffers from several important limitations. Most important, it makes inefficient use of longitudinal data because measures of association only use two assessment periods. This is especially problematic in light of the enormous cost of conducting longitudinal studies. Also, it is paradoxical because a key rationale for tracking individuals for more than two assessment periods is to provide the capacity to trace more than the linear change in the developmental course, yet this capacity is greatly dissipated by conventional two-period-based summary statistics for measuring comorbidity and heterotypic continuity. Second, the customary interpretation of a summary statistic is that its magnitude applies equally to all individuals within the population under study. For example, suppose the correlation between involvement in antisocial behavior at ages 8 and 14 was found to be 0.6. Most commonly, this correlation is interpreted as applying to all population members. However, there is a more complicated (and probably more realistic) alternative: It is an average correlation calculated over heterogeneous subpopulations. For some subpopulations, there may be very little association, whereas for other subpopulations, the association may be much larger (e.g., Magnusson & Bergman, 1990; Pulkkinen & Tremblay, 1992).

Figure 1. Measuring comorbidity and heterotypic continuity. A: Conventional approach. B: Joint trajectory approach. (Diagram not reproduced. In each panel the series for Behavior X, X1 through XT, is linked to the series for Behavior Z: Z1 through ZT for comorbidity, and ZT through ZT+K for heterotypic continuity.)
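The point that a single correlation can average over heterogeneous subpopulations can be illustrated with a small simulation. The sketch below is not from the article: it assumes two equally sized hypothetical subpopulations with standardized scores and within-group correlations of 0.3 and 0.9, chosen only so that the pooled correlation comes out near the 0.6 used in the example above.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_group(n, rho, rng):
    """Draw n standardized (x, y) pairs whose population correlation is rho."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * noise)
    return xs, ys


rng = random.Random(2001)  # fixed seed so the run is repeatable

# Hypothetical subpopulations: weak versus strong continuity.
x_weak, y_weak = simulate_group(5000, 0.3, rng)
x_strong, y_strong = simulate_group(5000, 0.9, rng)

r_weak = pearson(x_weak, y_weak)
r_strong = pearson(x_strong, y_strong)
r_pooled = pearson(x_weak + x_strong, y_weak + y_strong)

print(f"weak group r   = {r_weak:.2f}")
print(f"strong group r = {r_strong:.2f}")
print(f"pooled r       = {r_pooled:.2f}")
```

Under these assumptions the pooled coefficient sits between the two within-group values, so reading it as a property of every individual would misdescribe both groups.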

Two other limitations of the conventional approach are directly related to the use of correlation coefficients. The first is that the correlation's magnitude is not readily interpretable. For example, what does a correlation of +0.6 imply about the proportion of individuals who exhibit behavioral continuity compared to the proportion of individuals who exhibit behavioral change? The second limitation is that the correlation's size can be quite sensitive to the inclusion of outlier data from highly skewed distributions, such as those often encountered in research on mental health problems, crime, and antisocial behavior (Moffitt, 1993).

In this article, we develop and demonstrate a statistical model that relates the entire longitudinal course of the two behaviors of interest. The essence of the approach is depicted in Figure 1B. It aims to relate all measurements of the two behaviors of interest in a single summary statistical model. The model is a generalization of a group-based method described in Nagin (1999) for identifying distinctive groupings of individual-level trajectories within the population. Such trajectory groups describe the course of a behavior over age or time and might include "increasers," "decreasers," and "no-changers." The generalization provides the capacity for the joint estimation of trajectory models for two distinct but theoretically related measurement series.

The three key outputs of the joint model are illustrated in Figure 2 and Table 1. The table and figure contain examples of output from an analysis, to be developed in detail here, which examines the comorbidity of hyperactivity and physical aggression from ages 6 to 15 years in a large sample of boys from Montreal, Canada. One output is the identification of the optimal number of trajectory groups for each measurement series. Figure 2 depicts the form of the trajectory groups identified for these two externalizing behaviors. For each behavior, a four-group model was found to be optimal. A second key output is the probability of membership in each trajectory group. These probabilities are displayed in the first part of Table 1. For example, 3.9% of the population is estimated to belong to the chronic physical aggression trajectory group and 10.2% to a counterpart hyperactivity group. The third key output is the joint probability of membership in trajectory groups across behaviors. These probabilities are displayed in the second, third, and fourth parts of Table 1. For example, the estimated probability of belonging to the chronic physical aggression trajectory group given membership in the counterpart chronic hyperactivity group is .275 (second part of Table 1), whereas the converse conditional probability estimated is much higher, .722 (third part of Table 1). Finally, the joint probability of belonging to both chronic trajectory groups is .011 (fourth part of Table 1). These probabilities, each of which is important in its own right, provide the capacity to characterize the linkage in the developmental course of distinct but related behaviors.

Figure 2. A: Trajectories of physical aggression: Ages 6 to 15. B: Trajectories of hyperactivity: Ages 6 to 15. (Plots not reproduced. Each panel plots four trajectory groups against age: low, moderate desist, high declining, and chronic.)

The joint trajectory model advances conventional approaches to measuring comorbidity or heterotypic continuity by providing the capability to examine the linkage between the dynamic unfolding of the two behaviors over the entire period of observation. Thus, it makes use of all the longitudinal measurements of the behaviors of interest. In addition, it captures population differences in the strength and form of the comorbidity or heterotypic continuity.

Table 1
Comorbidity of Hyperactivity and Physical Aggression

Behavior or group                      Low     Moderate desist   High declining   Chronic

Probability estimates for joint physical aggression and hyperactivity model
Physical aggression                    .283    .425              .253             .039
Hyperactivity                          .220    .401              .277             .102

Probability of physical aggression group conditional on hyperactivity group
Low hyperactivity                      1.000   .000              .000             .000
Moderate desist hyperactivity          .158    .803              .038             .000
High declining hyperactivity           .000    .367              .594             .039
Chronic hyperactivity                  .000    .013              .711             .275

Probability of hyperactivity group conditional on physical aggression group
Low physical aggression                .777    .223              .000             .000
Moderate desist physical aggression    .000    .758              .239             .003
High declining physical aggression     .000    .062              .652             .286
Chronic physical aggression            .000    .000              .278             .722

Joint probability of hyperactivity and physical aggression trajectory group
Low physical aggression                .223    .063              .000             .000
Moderate desist physical aggression    .000    .332              .101             .001
High declining physical aggression     .000    .016              .165             .072
Chronic physical aggression            .000    .002              .025             .011

Model

This section is intended to provide a conceptual description of the joint trajectory model. A detailed derivation of the likelihood function that is maximized for the purpose of model estimation is provided in the Appendix. First, the statistical model underlying the estimation of a group-based trajectory model for a single behavior is summarized. For an in-depth discussion of the estimation of the univariate trajectory model, see Nagin (1999). The approach used to link two univariate models to form a joint model is then described.

A developmental trajectory describes the developmental course of a behavior over age or time. The group-based model assumes that, as an approximation, the population is composed of a mixture of groups with distinctive developmental trajectories (Land, McCall, & Nagin, 1996; Nagin & Land, 1993; Nagin & Tremblay, 1999). However, at the level of the individual, trajectory group membership is not observed. Thus, individual-level data cannot be sorted ex ante for the purpose of estimating each group's trajectory. Instead, the estimation procedure, which is based on mixture modeling, identifies the shape of the trajectory for each group and the proportion of the population that constitutes each such group.
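In symbols, the mixture assumption corresponds to a likelihood of the following general form. This is a sketch of a standard finite mixture, not a quotation of the Appendix: the symbol $P^{j}(Y_i)$, standing for the probability of individual $i$'s observed measurement sequence $Y_i$ given membership in group $j$, is notation introduced here for illustration.

```latex
% Finite mixture over J unobserved trajectory groups (illustrative notation).
P(Y_i) = \sum_{j=1}^{J} \pi^{j} \, P^{j}(Y_i),
\qquad \pi^{j} \ge 0, \qquad \sum_{j=1}^{J} \pi^{j} = 1 .
```

Because group membership is unobserved, each individual contributes a weighted sum over all groups rather than a single group-specific term, which is why the group shapes and the proportions $\pi^{j}$ are estimated together.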

The technical specifics of the statistical model used to identify and estimate the trajectory groups depend on the form of the response variable under investigation. For a count variable, it is assumed that the data are generated by an underlying Poisson process, in which the Poisson rate parameter for each group j comprising the population follows up to a third-order polynomial in age:

$\log(\lambda_{it}^{j}) = \beta_0^{j} + \beta_1^{j}\,\mathrm{Age}_{it} + \beta_2^{j}\,\mathrm{Age}_{it}^{2} + \beta_3^{j}\,\mathrm{Age}_{it}^{3},$

where $\lambda_{it}^{j}$ is individual i's rate of the behavior of interest at time t conditional on membership in group j, and $\mathrm{Age}_{it}$, $\mathrm{Age}_{it}^{2}$, and $\mathrm{Age}_{it}^{3}$ are, respectively, subject i's age, age squared, and age cubed at time t.¹,² The model's coefficients, $\beta_0^{j}$, $\beta_1^{j}$, $\beta_2^{j}$, and $\beta_3^{j}$, determine the shape of the trajectory and are superscripted by j to denote that the coefficients are not constrained to be the same across the J groups.³ This absence of constraint is key. It allows each group's trajectory to have a distinct shape. Thereby, the model has the capacity to capture unusual mixtures of developmental trajectories in the population.

1 The model does not require that all individuals be the same age at each assessment period t. For example, the model can accommodate data from an overlapping cohort design. Thus, age is not synonymous with time. An alternative formulation of a trajectory is in terms of time. In this formulation, it is important that time be measured from a natural starting point like commencement of treatment for depression. Otherwise, a trajectory in time is not interpretable.

2 As described in Jones, Nagin, and Roeder (in press), the Poisson-based model for count data is embedded in the more general Zero-Inflated Poisson. As implied by its name, this generalization of the Poisson model includes a "zero-inflation" factor to account for the possibility of a greater frequency of zero realizations than is predicted by the standard Poisson distribution. Because our focus is on group-based modeling, not on the analysis of count data, for ease of exposition we use the more familiar Poisson distribution.

For psychometric scale data, which are commonly censored at scale minimums and maximums, and for binary data (e.g., accident or not in period t), we assume that the data for each group j are generated by an underlying latent variable, $y_{it}^{*j}$, where

$y_{it}^{*j} = \beta_0^{j} + \beta_1^{j}\,\mathrm{Age}_{it} + \beta_2^{j}\,\mathrm{Age}_{it}^{2} + \beta_3^{j}\,\mathrm{Age}_{it}^{3} + \varepsilon_{it}.$

For the censored normal model, $\varepsilon$ is a disturbance assumed to be independently normally distributed with zero mean and constant variance $\sigma^{2}$. For the binary model, it is assumed to follow the extreme value distribution, which underlies the logistic distribution (Maddala, 1983).

In the case of the censored normal latent variable, $y_{it}^{*j}$ is linked to its observed but censored counterpart, $y_{it}$, as follows: Let $S_{\min}$ and $S_{\max}$, respectively, denote the minimum and maximum possible score on the measurement scale. The model assumes

$y_{it} = S_{\min}$ if $y_{it}^{*j} < S_{\min}$,

$y_{it} = y_{it}^{*j}$ if $S_{\min} \le y_{it}^{*j} \le S_{\max}$,

and

$y_{it} = S_{\max}$ if $y_{it}^{*j} > S_{\max}$.

For binary data, the observed quantity is again denoted by $y_{it}$, and the model assumes that $y_{it} = 1$ if $y_{it}^{*j} > 0$ and $y_{it} = 0$ otherwise.

Key outputs of the univariate model are estimates of the coefficients defining the shape of the trajectory for each group j and also of $\pi^{j}$, the proportion of the population belonging to each group j. These are also included among the outputs of the joint model. The key additional output of the joint model is the estimation of probabilities linking the two behaviors under investigation.

Formally, let $Y_1$ and $Y_2$ denote the two longitudinal series of measurements for each individual i that are to be modeled in a joint trajectory format, where $Y_1$ is measured over $T_1$ periods, $Y_2$ is measured over $T_2$ periods, and the index i designating the individual has been suppressed for notational convenience. Note that the model does not require that measurements be contemporaneous or that the length of the measurement period be the same (i.e., $T_1 \ne T_2$ is permitted). For example, in an illustration of heterotypic continuity described below, oppositional behavior and property delinquency were measured over two age intervals of differing length that were also partially overlapping: 6-15 years old and 11-17 years old, respectively. Moreover, the form of the data need not be the same for the two series. The joint model can accommodate any combination of censored normal, count, or binary data across the two measurement series under investigation.
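The three group-specific response models described above (Poisson, censored normal, and binary) can be sketched as functions that return the contribution of one observation, given one group's polynomial coefficients. This is a minimal illustration, not the SAS procedure the article refers to: all function names are hypothetical, and the binary case assumes the usual logit form implied by a logistic disturbance.

```python
import math


def polynomial(betas, age):
    """beta0 + beta1*age + beta2*age**2 + ... for one group's coefficients."""
    return sum(b * age ** power for power, b in enumerate(betas))


def poisson_log_prob(count, betas, age):
    """log P(count) when log(lambda) equals the group's polynomial in age."""
    log_rate = polynomial(betas, age)
    return count * log_rate - math.exp(log_rate) - math.lgamma(count + 1)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_normal_log_lik(y, betas, age, sigma, s_min, s_max):
    """Log-likelihood of one observed score on a scale censored at s_min and s_max.

    Note: math.log raises ValueError if a tail probability underflows to 0.
    """
    mu = polynomial(betas, age)
    if y <= s_min:
        # Latent score at or below the scale minimum.
        return math.log(normal_cdf((s_min - mu) / sigma))
    if y >= s_max:
        # Latent score at or above the scale maximum.
        return math.log(1.0 - normal_cdf((s_max - mu) / sigma))
    # Uncensored score: ordinary normal log-density.
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def binary_prob_one(betas, age):
    """P(y = 1) = P(latent > 0), assuming a logistic disturbance (logit model)."""
    return 1.0 / (1.0 + math.exp(-polynomial(betas, age)))


# Hypothetical coefficients for a single group, for illustration only.
example_betas = [3.0, 0.0, 0.0, 0.0]
print(censored_normal_log_lik(3.0, example_betas, age=10, sigma=1.0, s_min=0.0, s_max=6.0))
```

Summing such terms over an individual's assessment periods gives the log of a group-specific sequence probability, on the assumption (made here for illustration) that observations are independent conditional on group membership.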

Figure 1B suggests two conceptual models for joining the trajectories for $Y_1$ and $Y_2$. For heterotypic continuity, we are interested in linking an earlier behavior denoted by x (e.g., childhood opposition) with a later behavior denoted by z (e.g., adolescent property delinquency). Let j and k index the trajectory groups associated with behaviors x and z, respectively. Formally, we are interested in estimating the conditional probability of transiting to trajectory k (e.g., a trajectory of chronic property offending in adolescence) given membership in trajectory group j (e.g., a trajectory of chronic opposition in childhood). This probability is denoted by $\pi^{k|j}$.

For comorbidity analysis, the arrows linking behaviors x and z are double headed to suggest an interest in measuring their joint occurrence. Angold et al. (1999) emphasize the importance of measuring both $\pi^{k|j}$ (e.g., the probability of chronic childhood physical aggression given chronic childhood hyperactivity) and the converse conditional probability $\pi^{j|k}$ in describing comorbidity. In addition, the joint probability of belonging to both trajectory groups (e.g., following trajectories of chronic hyperactivity and chronic physical aggression) is of interest. This probability is denoted by $\pi^{jk}$.
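One way to see how the joint probabilities enter estimation is to write the likelihood for a pair of series as a double mixture. The expression below is a sketch under the assumption that the two series are independent conditional on membership in groups j and k; the symbols $f^{j}$ and $h^{k}$ for the group-specific probabilities of $Y_1$ and $Y_2$ are introduced here for illustration, and the Appendix should be consulted for the derivation actually used.

```latex
% Joint mixture over pairs of trajectory groups (illustrative notation,
% assuming conditional independence of Y_1 and Y_2 given groups j and k).
P(Y_1, Y_2) = \sum_{j=1}^{J} \sum_{k=1}^{K} \pi^{jk} \, f^{j}(Y_1) \, h^{k}(Y_2)
            = \sum_{j=1}^{J} \pi^{j} f^{j}(Y_1) \sum_{k=1}^{K} \pi^{k|j} \, h^{k}(Y_2).
```

The second form follows from substituting $\pi^{jk} = \pi^{k|j}\pi^{j}$, so the same likelihood can be written in terms of either set of probabilities.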

As it turns out, these two alternative representations of the joint occurrence of trajectory groups are analytically equivalent. Specifically, the likelihood function can be specified such that the direct outputs of estimation are either estimates of (a) $\pi^{jk}$ for j = 1, 2, . . . , J and k = 1, 2, . . . , K or (b) $\pi^{j}$ and $\pi^{k|j}$ for j = 1, 2, . . . , J and k = 1, 2, . . . , K.⁴ These two alternatives are formally equivalent because if we know $\pi^{jk}$, we can calculate $\pi^{j}$, $\pi^{k|j}$, $\pi^{k}$, and $\pi^{j|k}$ as follows:

$\pi^{j} = \sum_{k} \pi^{jk}, \quad j = 1, \ldots, J,$   (1)

$\pi^{k|j} = \pi^{jk} / \pi^{j},$   (2)

$\pi^{k} = \sum_{j} \pi^{jk}, \quad k = 1, \ldots, K,$   (3)

and

$\pi^{j|k} = \pi^{jk} / \pi^{k}.$   (4)

Alternatively, if we know $\pi^{j}$ and $\pi^{k|j}$, we can calculate $\pi^{jk}$, $\pi^{k}$, and $\pi^{j|k}$ as follows:

$\pi^{jk} = \pi^{k|j}\,\pi^{j},$   (5)

$\pi^{k} = \sum_{j} \pi^{k|j}\,\pi^{j},$   (6)

and

$\pi^{j|k} = \dfrac{\pi^{k|j}\,\pi^{j}}{\sum_{j} \pi^{k|j}\,\pi^{j}}.$   (7)

3 In principle, any order polynomial of age can be used to model $\lambda_{it}^{j}$, subject to identifiability.

4 As discussed in the Appendix, an equivalent version of the second parameterization is in terms of $\pi^{k}$ and $\pi^{j|k}$.

We bring these calculations to the reader's attention not only to make clear that the "comorbidity" and "heterotypic" models are analytically equivalent models but also because the calculations themselves can be usefully employed in analysis, a point we hope to demonstrate in the next section where we provide illustrative applications. In both these illustrations, we use the second, "heterotypic continuity" parameterization in estimation and thus use Equations 5-7 to calculate other probabilities of interest.
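As a worked check on Equations 6 and 7, the sketch below starts from the rounded hyperactivity group probabilities and the conditional probabilities of physical aggression given hyperactivity reported in Table 1, and recovers the physical aggression group probabilities and the converse conditional probabilities. It assumes hyperactivity plays the role of the behavior indexed by j; the function names are hypothetical, and agreement with the published table is only to within rounding of the inputs.

```python
# Group order throughout: low, moderate desist, high declining, chronic.

# pi^j: hyperactivity group probabilities (first part of Table 1).
PI_HYPER = [0.220, 0.401, 0.277, 0.102]

# pi^{k|j}: rows are hyperactivity groups (j), columns are physical
# aggression groups (k) (second part of Table 1).
PA_GIVEN_HYPER = [
    [1.000, 0.000, 0.000, 0.000],
    [0.158, 0.803, 0.038, 0.000],
    [0.000, 0.367, 0.594, 0.039],
    [0.000, 0.013, 0.711, 0.275],
]


def marginal_k(pi_j, k_given_j):
    """Equation 6: pi^k = sum over j of pi^{k|j} * pi^j."""
    n_j, n_k = len(pi_j), len(k_given_j[0])
    return [sum(k_given_j[j][k] * pi_j[j] for j in range(n_j)) for k in range(n_k)]


def j_given_k(pi_j, k_given_j):
    """Equation 7: pi^{j|k}, returned with rows indexed by k and columns by j."""
    pi_k = marginal_k(pi_j, k_given_j)
    n_j = len(pi_j)
    return [
        [k_given_j[j][k] * pi_j[j] / pi_k[k] for j in range(n_j)]
        for k in range(len(pi_k))
    ]


pi_pa = marginal_k(PI_HYPER, PA_GIVEN_HYPER)
hyper_given_pa = j_given_k(PI_HYPER, PA_GIVEN_HYPER)

print("physical aggression group probabilities:", [round(p, 3) for p in pi_pa])
print("P(chronic hyperactivity | chronic physical aggression):",
      round(hyper_given_pa[3][3], 3))
```

The recovered values can be compared with the first and third parts of Table 1; small differences in the third decimal place reflect the rounding of the published inputs.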

Two Illustrative Examples

In this section, the application of the joint models is illustrated with two examples. The first is a joint trajectory model of the comorbidity of hyperactivity and physical aggression. The second is a joint trajectory linking childhood opposition to adolescent property delinquency. An SAS-based procedure for estimating joint models such as these is available by request. This procedure extends the univariate trajectory estimation software described in Jones, Nagin, and Roeder (in press).

Both examples use data from a Montreal-based prospective longitudinal study. This study tracks 1,037 white males of French ancestry. Subjects were selected in 1984 from kindergarten classes in low-socioeconomic Montreal neighborhoods. Following the assessment at age 6, the boys and other informants were interviewed annually from ages 10 to 17. Assessments were made on a wide range of factors. Among these were physical aggression, hyperactivity, and opposition, which were measured at age 6 and annually from ages 10 to 15, based on teacher ratings using the Social Behavior Questionnaire (Tremblay, Desmarais-Gervais, Gagnon, & Charlebois, 1987). Physical aggression was assessed with three items: kicks, bites, hits other children; fights with other children; and bullies or intimidates other children. Its minimum and maximum scores are, respectively, 0 and 6. Hyperactivity was assessed with two items: squirmy, fidgety; does not keep still. Its minimum and maximum scores are 0 and 4. Opposition, which ranges from a score of 0 to 10, was measured by teacher ratings of five items: does not share materials; irritable; disobedient; blames others; and inconsiderate. The property offense scale, which was based on self-reports of activity over the past year, included the following items: stealing from a store, keeping objects worth more than $10, stealing something worth more than $100, entering without paying admission, stealing money from home, stealing a bicycle, stealing something worth between $10 and $100, buying stolen goods, being in an unauthorized place, and breaking and entering (Nagin & Tremblay, 1999).

Example 1: Comorbidity Analysis

Much research has documented the overlap of physical aggression and hyperactivity in children (Angold et al., 1999; Hinshaw, Lahey, & Hart, 1993; Hinshaw, Zupan, Simmel, Nigg, & Melnick, 1997; Kerr, Tremblay, Pagani-Kurtz, & Vitaro, 1997; Lahey, McBurnett, & Loeber, in press; Tremblay, Masse, Pagani, & Vitaro, 1996). This example is intended to illustrate how a joint trajectory analysis can illuminate the nature of this overlap by measuring it from a dynamic perspective.

Figure 2 reports a joint trajectory model of the comorbidity of physical aggression and hyperactivity among the subjects of the Montreal study. Figures 2A and 2B report the trajectory model for physical aggression and hyperactivity, respectively. For both of these psychometric scales, the censored normal distribution is used to model the trajectories to account for the censoring at the lower and upper bounds of the scale. A discussion of model selection, that is, the choice of the optimal number of groups and the order of the polynomial characterizing each group, follows in a separate section.

The trajectories for hyperactivity and physical aggression are very similar. For both behaviors, there is a group called "lows," whose individuals rarely display the behavior. A second group is best characterized as moderate-level desisters. At age 6, they manifested modest levels of the behavior, but by ages 10-12 they had largely desisted from displays of that behavior. A third group, called "high-level decliners," start off scoring relatively high on the problem behavior at age 6, but by age 15 score far lower. Finally, there is a group of chronics that start off scoring high on the behavior and continue to score high throughout the observation period.

The first part of Table 1 reports estimates of the probability of membership in each trajectory group. As previously noted, one such set of marginal probability estimates, $\pi^{j}$, emerges directly from estimation. Estimates of the marginal probabilities for the other behavior, $\pi^{k}$, can be calculated by Equation 6 based on the estimates of $\pi^{k|j}$ and $\pi^{j}$ produced directly from model estimation. For both hyperactivity and physical aggression, group membership probabilities are about the same: 20%-30% for the lows, approximately 40% for the moderate-level desisters, 25%-30% for high-level decliners, and 5%-10% for the chronics.

Thus far, the analysis suggests that hyperactivity and physical aggression follow nearly identical developmental courses. However, the comorbidity probabilities reported in Table 1 suggest the overlap in their developmental courses is not nearly complete. Specifically, two sets of conditional probabilities are reported: (a) the probability of membership in each of the physical aggression trajectories conditional on membership in a given hyperactivity trajectory group (second part of Table 1), and (b) the converse set of probabilities for each hyperactivity group conditional on a given physical aggression group (third part of Table 1). In addition, estimates of the joint probabilities of trajectory group membership are reported in the fourth part of Table 1. The probabilities in the third part of Table 1 were calculated with Equation 7 and those in the fourth part with Equation 5.

Both sets of conditional probabilities show a high level of overlap in similar counterpart trajectory groups. This is suggested by the generally large diagonal elements of the probability matrices in the second and third parts of Table 1, which indicate substantial comorbidity. Still there are important differences, particularly for the extreme groups. Nearly 100% of those in the low-hyperactivity group are estimated to belong to the low-physical-aggression group. By contrast, it is estimated that 22% of those in the low-physical-aggression group belong to the moderate hyperactivity group. Thus, while virtually all low-hyperactivity boys are also low-physical-aggression boys, included among the low-physical-aggression boys is a sizeable minority of moderately hyperactive boys. This suggests that an observation of low hyperactivity is synonymous with low physical aggression, but that the reverse is not necessarily true. Some children who are not physically aggressive may nonetheless display modest hyperactivity. Similarly, when considering the chronic trajectories, only 28% of the chronic-hyperactive group are also members of the chronic-physical-aggression group. By contrast, 72% of the chronic-physical-aggression group are estimated to belong to the chronic-hyperactive group. Thus, most chronically physically aggressive boys are also chronically hyperactive; however, the reverse is not necessarily true: most chronically hyperactive boys are not chronically physically aggressive. Combined, these results suggest that the overlap between physical aggression and hyperactivity is complex. Low hyperactivity predicts low physical aggression better than low physical aggression predicts low hyperactivity. However, at high levels of these behaviors the reverse is true. Chronic physical aggression better predicts chronic hyperactivity than chronic hyperactivity predicts chronic physical aggression.

The fourth part of Table 1 reports the joint probabilities of trajectory group membership. Due to the high comorbidity of these two externalizing behaviors, most of the probability mass is concentrated on the diagonal elements of the matrix. The modal group is composed of those belonging to the two moderate desisting trajectory groups. This group accounts for 33.2% of the population. In total, 61.8% belong to trajectory groups that are low or moderate on both behaviors. In contrast, only 1.1% are estimated to follow a joint trajectory of both chronic physical aggression and chronic hyperactivity.
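The way the two sets of conditional probabilities relate to the joint probabilities can be sketched numerically. The example below is a minimal illustration, assuming a hypothetical 4 × 4 joint probability matrix (the values are invented for illustration and are not the estimates in Table 1). It divides each joint probability by the appropriate marginal to obtain both conditional matrices, and shows how the same joint cell can yield quite different conditional probabilities in the two directions.

```python
import numpy as np

# Hypothetical joint probabilities of group membership (illustrative only;
# these are NOT the estimates reported in Table 1).
# Rows: hyperactivity groups; columns: physical aggression groups.
groups = ["low", "moderate", "high", "chronic"]
joint = np.array([
    [0.14, 0.01, 0.00, 0.00],
    [0.10, 0.33, 0.08, 0.01],
    [0.02, 0.12, 0.10, 0.01],
    [0.00, 0.03, 0.04, 0.01],
])

# Marginal probabilities of membership in each group.
p_hyper = joint.sum(axis=1)
p_aggr = joint.sum(axis=0)

# Conditional probabilities: divide each joint cell by the relevant marginal.
aggr_given_hyper = joint / p_hyper[:, None]  # each row sums to 1
hyper_given_aggr = joint / p_aggr[None, :]   # each column sums to 1

c = groups.index("chronic")
print("P(chronic aggression | chronic hyperactivity):",
      round(float(aggr_given_hyper[c, c]), 3))
print("P(chronic hyperactivity | chronic aggression):",
      round(float(hyper_given_aggr[c, c]), 3))
```

Because the two chronic groups have different marginal sizes in this made-up matrix, the two conditional probabilities for the chronic-chronic cell differ, which is the kind of asymmetry discussed above.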

Example 2: Heterotypic Continuity Analysis

In this example, a joint trajectory model is estimated to examine the heterotypic continuity between opposition and property offending. As discussed in Nagin and Tremblay (1999), a key question in developmental criminology is whether there is a single pathway to all criminal behavior (e.g., Gottfredson & Hirschi, 1990) or multiple pathways. Loeber and colleagues (Loeber, 1991; Loeber et al., 1993) propose a multiple pathway model in which different childhood developmental pathways lead to different types of offending in adolescence and adulthood. One such developmental pattern is a covert behavior-problem pathway that starts with minor problems such as lying, which leads to property damage, and is then followed by serious covert delinquent acts, such as fraud and burglary. The joint model of opposition and property delinquency is intended to explore this linkage.

Specifically, this analysis relates trajectories of opposition from ages 6 to 15 to trajectories of property delinquency from ages 11 to 17. Figure 3A displays four trajectories of opposition. Accompanying probabilities of group membership are reported in Table 2. The opposition trajectory groups are very similar to those for hyperactivity and physical aggression both in terms of their shapes and group membership probabilities: a low group, a moderate desisting group, a high declining group, and a chronic group that are respectively estimated to account for 25%, 46%, 24%, and 5% of the population. Figure 3B reports six distinct trajectories of property delinquency. One group, which we call lows, comprises about 30% of the population. This group has a rate of offending that is near zero over the entire observation period, which indicates that they engage in virtually no property delinquency. Three of the group trajectories are rising steadily. One of these, the rising chronics, is small in size, an estimated 5.9% of the population, but follows a very interesting trajectory for delinquency researchers: it starts high and rises steeply to a very high level. The other two, the low-rising 1s and 2s, start near zero, with the low-rising 2s rising to a distinctly higher level than do the low-rising 1s. Both of these are examples of late-onset property delinquency. Combined, they are estimated to comprise 41.3% of the population. The remaining two trajectory groups display greater stability, although both show some clear evidence of decline. One group, the "medium decliners," accounts for an estimated 14.6% of the population. The other group, labeled "medium chronics," comprises an estimated 7.3% of the population.

Figure 3. A: Trajectories of opposition: Ages 6 to 15. B: Trajectories of property delinquency: Ages 11 to 17.

Table 2
Property Delinquency Group Membership Probabilities Conditional on Opposition Group Membership

                                      Property delinquency group
Opposition group    Low     Low-rising 1    Low-rising 2    Medium declining    Rising chronic    Medium chronic
Low                 .532    .259            .084            .093                .009              .023
Moderate desist     .274    .280            .173            .158                .051              .064
High declining      .164    .222            .191            .189                .110              .124
Chronic             .204    .245            .164            .087                .160              .139

Table 2 reports transition probabilities linking these two sets of developmental trajectories.5 Observe that a majority of the low-opposition boys are in the low-property delinquency trajectory group (.532), whereas the probability of a boy in the chronic-opposition group following this trajectory is only .204. By contrast, the chronic-opposition group has a combined probability of .299 (= .139 + .160) of following the rising chronic or medium-chronic trajectories, whereas for the low-opposition group this probability is only .03. Thus, as argued by Loeber and colleagues (1993), opposition has a clear linkage to property offending. Still, the link falls far short of certainty. Nearly half of the low-opposition boys follow one of the five more-elevated property delinquency trajectories, and more than 10% follow the three most elevated trajectories: the low-rising 2, rising chronic, or medium-chronic trajectories. On the other hand, not all the chronic-opposition boys engage in high levels of property delinquency; about 45% follow the low or low-rising 1 trajectories.
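The combined probabilities quoted in this paragraph are simple sums over cells of Table 2. As a check, the short sketch below enters the Table 2 values and recomputes them; the group labels and the helper `prob` are names introduced here for illustration only.

```python
import numpy as np

# Column and row labels (hypothetical identifiers for the Table 2 groups).
delinquency = ["low", "low_rising_1", "low_rising_2",
               "medium_declining", "rising_chronic", "medium_chronic"]
opposition = ["low", "moderate_desist", "high_declining", "chronic"]

# Table 2: P(property delinquency group | opposition group).
# Rows follow `opposition`, columns follow `delinquency`.
table2 = np.array([
    [.532, .259, .084, .093, .009, .023],
    [.274, .280, .173, .158, .051, .064],
    [.164, .222, .191, .189, .110, .124],
    [.204, .245, .164, .087, .160, .139],
])

def prob(opp_group, delinq_groups):
    """Probability of following any of `delinq_groups` given `opp_group`."""
    row = opposition.index(opp_group)
    cols = [delinquency.index(g) for g in delinq_groups]
    return float(table2[row, cols].sum())

chronic_to_chronic = prob("chronic", ["rising_chronic", "medium_chronic"])
low_to_chronic = prob("low", ["rising_chronic", "medium_chronic"])
low_to_elevated = prob("low", delinquency[1:])
low_to_top_three = prob("low", ["low_rising_2", "rising_chronic", "medium_chronic"])
chronic_to_low = prob("chronic", ["low", "low_rising_1"])

for name, value in [("chronic -> chronic property", chronic_to_chronic),
                    ("low -> chronic property", low_to_chronic),
                    ("low -> any elevated", low_to_elevated),
                    ("low -> three most elevated", low_to_top_three),
                    ("chronic -> low or low-rising 1", chronic_to_low)]:
    print(f"{name}: {value:.3f}")
```

Each row of the table sums to 1 up to rounding, as expected for conditional probabilities.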

Model Selection

One of the most technically challenging problems in mixture modeling is determining the optimal number of groups to include in the model. One possible choice for testing the optimality of a specified number of groups is the likelihood ratio test. The likelihood ratio test is only suitable for model selection problems in which the alternative models are nested linear subspaces. In mixture models, a k-group model is not a nested linear subspace of a (k + 1)-group model, and, therefore, it is not appropriate to use the likelihood ratio test for model selection (Erdfelder, 1990; Ghosh & Sen, 1985; Titterington, Smith, & Makov, 1985). Instead, use of the Bayesian Information Criterion (BIC) is recommended as a basis for selecting the optimal model (D'Unger, Land, McCall, & Nagin, 1998). For a given model, BIC is calculated as

$$\mathrm{BIC} = \log(L) - 0.5\,k\,\log(n),$$

where L is the value of the model's maximized likelihood, n is the sample size, and k is the number of parameters in the model. Kass and Raftery (1995) and Raftery (1995) argue that BIC can be used for comparison of both nested and unnested models under fairly general circumstances. When prior information on the correct model is limited, they recommend selection of the model with the maximum BIC. In this application, model selection includes not only the determination of the number of groups, but also the order of the polynomial used to capture the shape of each trajectory group.
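The selection rule can be sketched in a few lines. The example below assumes hypothetical maximized log-likelihoods and parameter counts for models with two to five groups and a hypothetical sample size of 1,000; none of these numbers come from the analyses reported here. It computes BIC in the form given above and picks the model with the maximum value.

```python
import math

def bic(log_likelihood, n_obs, n_params):
    """BIC in the form used in the text: log(L) - 0.5 * k * log(n)."""
    return log_likelihood - 0.5 * n_params * math.log(n_obs)

# Hypothetical candidates: number of groups -> (maximized log-likelihood,
# number of parameters). Values are invented for illustration.
candidates = {
    2: (-5200.0, 7),
    3: (-5100.0, 11),
    4: (-5060.0, 15),
    5: (-5055.0, 19),
}
n = 1000  # hypothetical sample size

scores = {g: bic(loglik, n, k) for g, (loglik, k) in candidates.items()}
best = max(scores, key=scores.get)

for g in sorted(scores):
    print(f"{g} groups: BIC = {scores[g]:.2f}")
print("Model with maximum BIC:", best, "groups")
```

In this made-up example the fifth group raises the log-likelihood only slightly, so the penalty term outweighs the gain and the four-group model is preferred.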

There is, however, one practical complication in applying BIC to model selection in the joint trajectory format. The number of models that must be estimated grows with the square of the number of models that are considered under the univariate trajectory format. If N1 and N2 models are considered for Y1 and Y2, respectively, an exhaustive model search requires estimating N1 × N2 joint models. Instead, it is recommended that model selection be based on searches of the two univariate model spaces, which thereby reduces the number of models searched to N1 + N2. The final joint model is estimated with the number and shapes of trajectories found to be optimal, based on the two univariate model searches. Experience has shown that trajectories emerging from joint estimation differ little from their univariate counterparts.

Application of the maximum BIC criterion for model selection generally leads to a clear-cut choice of the best model. For example, BIC-based calculations reported in Nagin (1999) strongly support the four-group physical aggression model shown in Figure 2 being the "best" model. However, in some applications, the BIC score continues to improve as more groups are added. In this circumstance, the addition of a new group to the model most commonly results in the splitting of a large group into two smaller groups with parallel trajectories. This in fact occurred for the property delinquency trajectory analysis reported in Figure 3. We stopped at six groups because the addition of more groups only split the large groups of low delinquency boys, but left the high delinquency trajectory groups unchanged. Given that the latter were of greatest interest, the addition of more groups was not informative.

5 It is also possible to calculate the probability of membership in each opposition trajectory group conditional upon membership in each delinquency group. These probabilities are not reported because they seem less useful given the temporal sequencing of the behaviors.

Confidence Intervals for Group Membership Probabilities

An alternative to the joint model for joining trajectories of different measurement series relies only on the univariate trajectory method. This can be accomplished as follows: (a) estimate separate univariate trajectory models for Y1 and Y2; (b) based on the posterior probabilities of group membership, sort the estimation sample members into the trajectory groups that they most likely belong to for Y1 and Y2; and (c) cross-tabulate the group membership counts to estimate $\pi^{k|j}$, $\pi^{j|k}$, and $\pi^{jk}$.6 This approach has two shortcomings that are avoided by the joint estimation procedure. First, as described in Roeder, Lynch, and Nagin (1999), such a "classify-analyze" strategy will not produce consistent estimates of the above probabilities. Second, it provides no valid basis for computing the standard errors of the estimates of $\pi^{k|j}$, $\pi^{j|k}$, and $\pi^{jk}$. Consequently, it is not possible to calculate confidence intervals for the probability estimates or to conduct hypothesis tests about the estimates. In contrast, the joint model provides consistent and efficient estimates of all required standard errors.
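For concreteness, steps (b) and (c) of the classify-analyze strategy can be sketched as follows. This is only an illustration of the procedure that the text cautions against: the posterior probabilities are simulated at random rather than taken from fitted univariate models, and the sample size and group counts are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical sample size

# Stand-ins for posterior probabilities of group membership from two
# separately estimated univariate models (simulated, not estimated).
post_y1 = rng.dirichlet(np.ones(4), size=n)  # 4 trajectory groups for Y1
post_y2 = rng.dirichlet(np.ones(6), size=n)  # 6 trajectory groups for Y2

# (b) Assign each individual to the most likely group for each series.
g1 = post_y1.argmax(axis=1)
g2 = post_y2.argmax(axis=1)

# (c) Cross-tabulate the assignments.
counts = np.zeros((4, 6), dtype=int)
np.add.at(counts, (g1, g2), 1)

# Naive estimates of the joint and conditional probabilities.
joint_est = counts / n
y2_given_y1 = counts / counts.sum(axis=1, keepdims=True)

print(counts)
print(np.round(y2_given_y1, 2))
```

The hard assignment in step (b) discards the uncertainty in the posterior probabilities, which is one way to see why the resulting proportions come with no valid standard errors.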

The only obstacle to calculating confidence intervals with the joint model is a practical one. Specifically, one must ensure that the estimates of $\pi^{j}$ and $\pi^{k|j}$ remain in the 0-1 interval. This is accomplished by estimating $\pi^{j}$ and $\pi^{k|j}$ as multinomial logistic probabilities:

$$\pi^{j} = \frac{e^{\theta^{j}}}{\sum_{j} e^{\theta^{j}}} \quad \text{and} \quad \pi^{k|j} = \frac{e^{\gamma^{k|j}}}{\sum_{k} e^{\gamma^{k|j}}},$$

where $\theta^{j}$ and $\gamma^{k|j}$ are parameters to be estimated.

The downside of this approach is that the estimates of $\pi^{j}$ and $\pi^{k|j}$ are nonlinear functions of the parameter estimates. Therefore, confidence intervals for $\pi^{j}$ and $\pi^{k|j}$ are not directly computable by the usual calculations based on the standard errors of $\theta^{j}$ and $\gamma^{k|j}$. Two alternatives are available for calculating confidence intervals. One can use the first term of a Taylor expansion on the logistic equation above to form an equation that is linear in the estimated parameters (Greene, 1990); or one can use the parametric bootstrap technique, which was used here, to estimate the standard errors of $\hat{\pi}^{j}$ and $\hat{\pi}^{k|j}$. This procedure, which was first proposed by Efron (1979), simulates the sampling distribution of $\hat{\pi}^{j}$ and $\hat{\pi}^{k|j}$ as follows: (a) Drawing on the result that $\hat{\theta}^{j}$ and $\hat{\gamma}^{k|j}$ are asymptotically multivariate normally distributed with mean and covariance $(\theta^{j}, \Sigma^{j})$ and $(\gamma^{k|j}, \Sigma^{k|j})$, respectively, and that the point estimates themselves are consistent estimates of the population quantities, a simulated random sample of 10,000 estimates of $\theta^{j}$ and $\gamma^{k|j}$ is generated; (b) these estimates are then used to generate 10,000 estimates of $\pi^{j}$ and $\pi^{k|j}$, which, in turn, are rank ordered to create simulated sampling distributions of $\hat{\pi}^{j}$ and $\hat{\pi}^{k|j}$. Thus, a 95% confidence interval is bounded by the 2.5th and 97.5th percentiles of this distribution. More generally, this same approach can be used to calculate confidence intervals for other more-complicated nonlinear functions such as $\pi^{j}\pi^{k|j} = \pi^{jk}$.
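The two bootstrap steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code used for the article: the parameter estimates `theta_hat` and their covariance `cov_hat` are hypothetical values, and the first group is treated as a reference category with its parameter fixed at zero, a common normalization for the multinomial logistic form described above that is assumed here rather than taken from the text.

```python
import numpy as np

def softmax(theta):
    """Multinomial logistic probabilities along the last axis."""
    z = np.exp(theta - theta.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(42)

# Hypothetical estimates for a four-group model. The first group is the
# reference category (parameter fixed at 0), so three parameters are free.
theta_hat = np.array([0.6, -0.1, -1.7])
cov_hat = np.diag([0.04, 0.05, 0.15])  # hypothetical covariance matrix

# (a) Simulate 10,000 parameter vectors from the asymptotic normal
#     distribution centered on the point estimates.
draws = rng.multivariate_normal(theta_hat, cov_hat, size=10_000)
draws_full = np.column_stack([np.zeros(len(draws)), draws])

# (b) Convert each draw to group membership probabilities and take the
#     2.5th and 97.5th percentiles of the simulated distribution.
pi_draws = softmax(draws_full)
pi_hat = softmax(np.concatenate([[0.0], theta_hat]))
lower, upper = np.percentile(pi_draws, [2.5, 97.5], axis=0)

for j in range(4):
    print(f"group {j + 1}: estimate {pi_hat[j]:.3f}, "
          f"95% CI {lower[j]:.3f}-{upper[j]:.3f}")
```

Because the probabilities are nonlinear in the parameters, the interval for a small group need not be symmetric around its point estimate. The same loop extends to products such as a group probability times a conditional probability by applying the function of interest to each simulated draw.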

Table 3 illustrates the application of this approach to the opposition-property delinquency trajectory model. Point estimates and 95% confidence intervals are reported for three probabilities of potential interest to researchers. The first is the probability of belonging to the chronic-opposition trajectory group, $\pi^{3}_{op}$. The point estimate of this quantity is .047 with a 95% confidence interval of .021-.101. Note that, unlike the typical confidence interval, this bootstrapped interval is asymmetric. This asymmetry makes sense because the point estimate of $\pi^{3}_{op}$, .047, is close to the lower theoretical bound of a probability, 0; therefore, there is more upside error potential. Also reported is the probability and associated confidence interval for the linear sum of two probabilities, the probability of belonging to the chronic or high-desister group. This probability is estimated at .290 with a 95% confidence interval of .239-.353. The third probability quantity is a joint probability: the probability of belonging to the chronic childhood-opposition group and then proceeding to belong to one of the two most theft-prone groups in adolescence, the rising- and medium-chronic groups. Like the first example, this probability is small, .014, and consequently, its 95% confidence interval, .005-.033, is substantially asymmetric. Each of these illustrative confidence intervals suggests that despite the large number of parameters estimated in a joint model, it is still possible to obtain reasonably precise estimates of important probabilities.

6 Following model estimation, the parameter estimates can be used to calculate the probability that each individual's observed pattern of behavior was generated by each trajectory group. These probabilities are called the posterior probabilities of group membership.

Table 3
Confidence Intervals for Various Probabilities: Opposition-Property Delinquency Model

Probability                                        Calculation                                   Point estimate    95% confidence interval
Chronic opposition                                 $\pi^{3}_{op}$                                .047              .021-.101
Chronic or high desister opposition                Sum of two opposition group probabilities     .290              .239-.353
Chronic opposition and rising chronic property     Sum of two joint probabilities                .014              .005-.033
  or moderate chronic property
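As a rough consistency check on the third entry of Table 3, the joint probability can be approximated from quantities reported elsewhere in the article. The calculation below assumes the product form in which a group membership probability is multiplied by a conditional probability, and it uses the chronic-opposition point estimate of .047 together with the two chronic-opposition conditional probabilities from Table 2 (.160 and .139). The group labels used as indices are shorthand introduced here, not the authors' notation.

```latex
\pi^{\text{chronic}}_{op}
\left(
  \pi^{\text{rising chronic}\,|\,\text{chronic}}
  + \pi^{\text{medium chronic}\,|\,\text{chronic}}
\right)
= .047 \times (.160 + .139)
= .047 \times .299
\approx .014
```

The result agrees with the reported point estimate of .014. In the same spirit, .290 - .047 = .243 for the second entry is in line with the roughly 24% of the population attributed to the high declining opposition group.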

Strengths and Limitations of a Group-Based Modeling Approach

Our statistical depiction of comorbidity and heterotypic continuity requires a group-based modeling approach. Without it, we would not be able to probabilistically link the developmental course of two different behaviors. Use of probabilities to characterize alternative developmental courses requires the designation of discrete states of the behaviors, namely developmental trajectories. In this respect, our modeling strategy has its roots in the tradition of characterizing comorbidity and heterotypic continuity with odds ratios rather than correlations. Use of the odds ratio also requires the creation of behavioral states such as highly anxious or a diagnosis of depression.

As discussed in Nagin (1999), Nagin and Land (1993), and Nagin and Tremblay (1999), the assumption that the population is composed of distinct groups is unlikely to be strictly correct. Instead, the groups are intended as an approximation of an underlying continuous process. In so doing, we adopt a standard procedure in nonparametric and semiparametric statistics of approximating a continuous distribution with a discrete mixture (Follman & Lambert, 1989; Heckman & Singer, 1984; Lindsay, 1995; Manski, 1995).

This statistical approximation has its analog in the taxonomic-based theories that are quite common in developmental psychology. Examples include theories of personality development (Caspi, 1998), drug use (Kandel, 1975), learning (Holyoak & Spellman, 1993), language and conceptual development (Markman, 1989), development of prosocial behaviors such as altruism, and development of antisocial behaviors such as delinquency (Loeber, 1991; Moffitt, 1993; Patterson, DeBaryshe, & Ramsey, 1989). Taxonomic theories predict different trajectories of development across subpopulations, but the purpose of such taxonomies is generally to draw attention to differences in the causes and consequences of different developmental trajectories within the population rather than to suggest that the population is composed of distinct groups.

The idea of using groups to approximate a continuous distribution is easily illustrated with an example. Suppose Figure 4A depicts the population distribution of some behavior z. In Figure 4B, this same distribution is replicated and overlaid with a histogram that approximates its shape. This panel illustrates that any continuous distribution with finite endpoints can be approximated by a discrete distribution composed of a finite number of "points of support" (i.e., the shaded "pillars"). For any given number of points of support, maximum likelihood estimation can be used to estimate two sets of parameters. The first identifies the location on the horizontal axis of each point of support. In Figure 4, these points are denoted by $z_1, z_2, z_3, \ldots$, where $z_j$ measures the "average" behavior of individuals at the jth point of support. A second set of parameters measures the proportion of the population $\pi_j$ at each point of support. These proportions must sum to 1 but in general will not be equal. If a time-age dimension were added to Figure 4 to measure the developmental trend of $z_j$, each of these points of support would correspond to the trajectory groups depicted. For example, in Figures 2 and 3, the estimates of $\pi_j$ would correspond to the proportion of the population whose developmental trajectory is best approximated by group j.

Figure 4. Using groups to approximate a continuous distribution. A: An unknown continuous distribution. B: An unknown continuous distribution approximated by a finite number of points of support.
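The idea of points of support can be illustrated with a small simulation. The sketch below is a simplification under stated assumptions: the behavior scores are drawn from an arbitrary skewed distribution on the interval 0 to 20, and the points of support are obtained from fixed equal-width intervals rather than by the maximum likelihood estimation described in the text. It reports a location and a population proportion for each point of support.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical behavior scores from a skewed continuous distribution on [0, 20].
z = 20 * rng.beta(2, 3, size=5000)

# Five points of support defined by equal-width intervals (a simplification;
# the text describes estimating locations and proportions by maximum likelihood).
n_points = 5
edges = np.linspace(0, 20, n_points + 1)
idx = np.clip(np.digitize(z, edges) - 1, 0, n_points - 1)

# Proportion of the population at each point of support.
proportions = np.bincount(idx, minlength=n_points) / z.size
# Location of each point of support: the average behavior of its members.
locations = np.array([z[idx == j].mean() for j in range(n_points)])

for j in range(n_points):
    print(f"point {j + 1}: location {locations[j]:.2f}, "
          f"proportion {proportions[j]:.3f}")
```

The proportions sum to 1 but are unequal, and the discrete approximation reproduces the overall mean of the simulated scores because each location is the mean of the observations assigned to it.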

When is a group-based approach a useful approximation of an underlying continuous phenomenon? The answer to this question depends on many issues, but we emphasize two here. First, if the behavior under study is reasonably approximated by a known continuous distribution that is also tractable (e.g., the multivariate normal distribution), an analysis based on that distribution is obviously the preferred approach. Alternatively, if a suitable continuous distribution is not known or tractable, the group-based, semiparametric approach is an attractive alternative.

A second and related rationale for group-based modeling is that it is well suited for the study of a behavior that does not vary regularly throughout the population, but instead tends to reveal itself in markedly different intensities in clusters of individuals. Figure 5 displays two contrasting possibilities. In Figure 5A, the behavior varies uniformly across the population, whereas in Figure 5B there are two distinct modes of behavior. A group-based modeling approach makes little sense for the behavior depicted in Figure 5A: there are no distinct groups. However, for the behavior in Figure 5B, it makes a great deal of sense because of the bimodality of the behavior. A two-group model would capture the clusters at each mode.

Figure 5. Two hypothetical examples of the population distribution of a behavior. A: Uniformly distributed behavior. B: Bimodally distributed behavior. Axis label in both panels: Behavior Intensity.

The examples in Figure 5 are in one dimension. In one dimension, clear-cut evidence of clustering requires unusual examples such as bimodality. However, trajectories of development are in two dimensions: Behavior × Time. In this circumstance, the possibilities for distinctive groups of trajectories are not hard to imagine. As previously noted, developmental psychology is filled with examples of theories predicting distinctive patterns of development. We make this point because the real world is mostly composed of phenomena in which clustering or regular variation is not clear-cut, particularly if a time dimension is a component of the analysis.

Our experience with model selection for property delinquency illustrates this ambiguity. On the one hand, the group-based strategy was successful in identifying distinctive trajectories of serious property delinquency that would be difficult to identify with modeling approaches based on continuous distributions. These groups are the two-dimensional analog of the Figure 5B distribution. On the other hand, the group-based modeling strategy successively split moderate-level delinquency trajectory groups into similar parallel groups. For these groups, trajectories varied more regularly in the population. Such groups are the two-dimensional analog to the Figure 5A distribution. This example illustrates that the choice between the multinomial modeling strategy that underlies the group-based approach and the modeling strategy based on a multivariate continuous distribution will rarely be clear cut ex ante. Only the analysis can reveal whether variation is highly regular or clustered into distinctive groups or both.

The successive splitting of the moderate-level delinquency groups also provides a concrete illustration of a fundamental statistical problem that attends the use of a finite number of support points to approximate a continuous distribution. Perfect conformance to the distribution requires an infinite number of points of support. Thus, even though statisticians have made important progress in demonstrating the utility of the BIC in identifying the optimal number of groups in mixture problems, this theoretical work begins with the assumption that the population is composed of a finite, albeit unknown, number of groups.

Thus, it must be recognized that unless one holds to the position that the population is strictly composed of discrete groups, use of any goodness-of-fit measure, BIC or otherwise, will not formally identify the correct model. We do not hold the view that the population is strictly composed of distinct groups. One practical consequence of this view is that as sample size increases, models will tend to include more groups because ever more information will become available for identifying more subtle features of the response surface.7 However, in our view, this is not a fatal shortcoming of semiparametric or nonparametric modeling strategies such as that used here. Rather, it is a frank acknowledgment that all models are approximations and thus, literally speaking, incorrect. Still more work is needed for developing methods to calibrate the adequacy of group-based models. Such methods should include not only mathematical statistical criteria but also visual representations of the response surface.

One of psychometrics' most important contributions to statistical theory was the development of methodology for linking test item responses to latent constructs such as intelligence. Structural equation modeling has its roots in this tradition. The modeling strategy used here and the predecessor work on univariate trajectory estimation are not in this tradition. Instead, they are rooted in the classical statistical and econometric tradition in which the quantity to be measured is the quantity of interest. Thus, in the model section, the term latent variable was used to describe $y^{*j}_{it}$ because it is not fully observed. This use of the term "latent" is different from that in the psychometric literature, in which the term latent factor refers to an unobservable construct that is assumed to give rise to multiple manifest variables. Thus, within the formal structure of the model, we do not attempt to formally link the response variable to some more fundamental but not directly measured construct.

7 Note, however, that D'Unger et al. (1998) found that the optimal number of groups as determined by BIC was largely insensitive to sample size. This finding reflects the fact that BIC's penalty for adding more parameters grows with sample size.

The primary advantage of this approach is that it helps streamline the statistical model. This is a critical advantage because the model is already complex. Furthermore, we suspect that adapting this modeling framework to a structural equation modeling framework would make it difficult to retain two key strengths of the framework: the flexibility to handle a variety of different data types and to accommodate missing data.

Balanced against these advantages are obvious limitations. The modeling framework provides no formal basis for combining items intended to be indicators of an unobserved latent construct and does not provide the capacity to explicitly account for measurement error. For an excellent account of the application of structural equation modeling methods to group-based modeling, see Muthen and Muthen (1999).

Conclusion

This article has demonstrated a group-based method for joining developmental trajectories of distinct but theoretically related behaviors. The objective was to provide a method that improves on conventional approaches for measuring comorbidity and heterotypic continuity. The principal advantages of this approach are as follows: (a) It links the entire developmental course of the two behaviors of interest rather than relating single measurements of each behavior made at a particular time, and in so doing makes use of all the data assembled over the course of the longitudinal study; (b) it provides three readily interpretable metrics for describing comorbidity and heterotypic continuity, $\pi^{k|j}$, $\pi^{j|k}$, and $\pi^{jk}$; and (c) the group-based method is specifically intended for problems in which theory or practice suggests distinctive developmental courses within the population. As the examples illustrate, the joint model is well suited for assessing population heterogeneity in the form of the comorbidity or heterotypic continuity under study.

A particularly valuable next step in the development of the joint model methodology is generalizing the specification of $\pi^{k|j}$ to incorporate covariates. This generalization has already been demonstrated for the univariate model (Nagin, 1999; Roeder et al., 1999). The obstacle to adding covariates to the joint model is developing a workable approach to limit the number of parameters to be estimated. Consider the opposition-property delinquency model, which includes four opposition trajectories and six delinquency trajectories. Without any constraints on the specification of $\pi^{k|j}$, 20 (= 4 × (6 - 1)) parameters would be required for each covariate. This is plainly too many, and actual experience confirms this prediction: the standard errors of the estimated parameters are extremely large. Thus, as a practical matter, the fully general model is not identified, except perhaps in extremely large data sets. A much more constrained but still sensible model is required.
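The parameter count in this paragraph follows from the multinomial logistic form: for each group of the first behavior, one set of coefficients is needed for all but one group of the second behavior. A small sketch of that arithmetic is shown below; the function name and the example with three covariates are illustrative assumptions.

```python
def conditional_params_per_covariate(n_groups_first, n_groups_second):
    """Unconstrained parameters per covariate: one coefficient for each
    group of the first behavior and each non-reference group of the second."""
    return n_groups_first * (n_groups_second - 1)

# Opposition-property delinquency model: four and six trajectory groups.
per_covariate = conditional_params_per_covariate(4, 6)
print("Parameters per covariate:", per_covariate)

# Hypothetical example: the total grows linearly with the number of covariates.
n_covariates = 3
print("Parameters for", n_covariates, "covariates:", per_covariate * n_covariates)
```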

References

Angold, A., Costello, E. J., & Erkanli, A. (1999). Comorbidity. Journal of Child Psychology and Psychiatry, 40, 57-87.

Backteman, G., & Magnusson, D. (1981). Longitudinal stability of personality characteristics. Journal of Personality, 49, 148-160.

Brame, B., Nagin, D., & Tremblay, R. (in press). Developmental trajectories of physical aggression from school entry to late adolescence. Journal of Child Psychology and Psychiatry.

Caspi, A. (1998). Personality development across the life course. In W. Damon (Series Ed.) & N. Eisenberg (Vol. Ed.), Handbook of child psychology: Vol. 3. Social, emotional, and personality development (pp. 311-388). New York: Wiley.

Costello, E. J., Costello, A. J., Edelbrock, C., Burns, B. J., Dulcan, M. K., Brent, D., & Janiszewski, S. (1988). Psychiatric disorders in pediatric primary care: Prevalence and risk factors. Archives of General Psychiatry, 45, 1107-1116.

D'Unger, A., Land, K., McCall, P., & Nagin, D. (1998). How many latent classes of delinquent/criminal careers? Results from mixed Poisson regression analyses of the London, Philadelphia, and Racine cohorts studies. American Journal of Sociology, 103, 1593-1630.


Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7, 1-26.

Erdfelder, E. (1990). Deterministic developmental hypotheses, probabilistic rules of manifestation, and the analysis of finite mixture distributions. In A. von Eye (Ed.), Statistical methods in longitudinal research: Time series and categorical longitudinal data: Vol. 2 (pp. 471-509). Boston: Academic Press.

Farrington, D. P. (1990). Childhood aggression and adult violence: Early precursors and later-life outcomes. In D. J. Pepler & K. H. Rubin (Eds.), The development of childhood aggression. Hillsdale, NJ: Erlbaum.

Fergusson, D. M., Horwood, L. J., & Lynskey, M. T. (1993). Prevalence and comorbidity of DSM-III-R diagnoses in a birth cohort of 15 year olds. Journal of the American Academy of Child and Adolescent Psychiatry, 32, 1127-1134.

Follman, D. A., & Lambert, D. (1989). Generalized logistic regression by non-parametric mixing. Journal of the American Statistical Association, 84, 295-300.

Ghosh, J. K., & Sen, P. K. (1985). On the asymptotic performance of the log-likelihood ratio statistic for the mixture model and related results. In L. M. LeCam & R. A. Olshen (Eds.), Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer: Vol. II (pp. 789-806). Monterey, CA: Wadsworth.

Gottfredson, M. R., & Hirschi, T. (1990). A general theory of crime. Stanford, CA: Stanford University Press.

Greene, W. H. (1990). Econometric analysis. New York: Macmillan.

Haapasalo, J., Tremblay, R. E., Boulerice, B., & Vitaro, F. (in press). Predicting preadolescents' delinquency, social withdrawal and school placement from kindergarten behavior: Person and variable approaches. Quantitative Criminology.

Heckman, J., & Singer, B. (1984). A method for minimizing the impact of distributional assumptions in econometric models for duration data. Econometrica, 52, 271-320.

Hinshaw, S. P., Lahey, B. B., & Hart, E. L. (1993). Issues in taxonomy and comorbidity in the development of conduct disorder. Development and Psychopathology, 5, 31-49.

Hinshaw, S. P., Zupan, B. A., Simmel, C., Nigg, J. T., & Melnick, S. (1997). Peer status in boys with and without attention-deficit hyperactivity disorder: Prediction from overt and covert antisocial behavior, social isolation, and authoritative parenting beliefs. Child Development, 68, 800-896.

Holyoak, K., & Spellman, B. (1993). Thinking. Annual Review of Psychology, 44, 265-315.

Huesmann, L. R., Eron, L. D., Lefkowitz, M. M., & Walder,

L. O. (1984). Stability of aggression over time and gen-erations. Developmental Psychology, 20(6), 1120-1134.

Jones, B., Nagin, D. S., & Roeder, K. (in press). A SASprocedure based on mixture models for estimating devel-opmental trajectories. Sociological Research & Methods.

Kagan, J. (1969). The three faces of continuity in humandevelopment. In D. A. Goslin (Ed.), Handbook of social-ization theory and research (pp. 983-1002). Chicago, IL:Rand McNally.

Kandel, D. B. (1975). Stages in adolescent involvement in drug use. Science, 190, 912-914.

Kass, R. E., & Raftery, A. E. (1995). Bayes factor. Journal of the American Statistical Association, 90, 773-795.

Kerr, M., Tremblay, R. E., Pagani-Kurtz, L., & Vitaro, F. (1997). Boys' behavioral inhibition and the risk of later delinquency. Archives of General Psychiatry, 54, 809-816.

Kessler, R. C., McGonagle, K. A., Zhao, S., Nelson, C. B., Hughes, M., Eshleman, S., Wittchen, H. U., & Kendler, K. S. (1994). Lifetime and 12-month prevalence of DSM-III-R psychiatric disorders in the United States: Results from the National Comorbidity Study. Archives of General Psychiatry, 51, 8-19.

Lahey, B. B., McBurnett, K., & Loeber, R. (in press). Are Attention-Deficit/Hyperactivity Disorder and Oppositional Defiant Disorder developmental precursors to Conduct Disorder? In M. Lewis & A. Sameroff (Eds.), Handbook of developmental psychopathology. New York: Plenum Press.

Land, K., McCall, P., & Nagin, D. (1996). A comparison of Poisson, negative binomial, and semiparametric mixed Poisson regression models with empirical applications to criminal careers data. Sociological Methods & Research, 24, 387-440.

Lewinsohn, P. M., Hops, H., Roberts, R. E., Seeley, J. R., & Andrews, J. A. (1993). Adolescent psychopathology: I. Prevalence and incidence of depression and other DSM-III-R disorders in high school students. Journal of Abnormal Psychology, 102, 133-144.

Lindsay, B. G. (1995). Mixture models: Theory, geometry, and applications. Hayward, CA: The Institute of Mathematical Statistics.

Loeber, R. (1991). Questions and advances in the study of developmental pathways. In D. Cicchetti & S. Toth (Eds.), Models and integrations. Rochester Symposium on Developmental Psychopathology (Vol. 3, pp. 97-115). Rochester, NY: University of Rochester Press.

Loeber, R., & LeBlanc, M. (1990). Toward a developmental criminology. Crime and Justice, 12, 375-437.

Loeber, R., Wung, P., Keenan, K., Giroux, B., Southamer-Loeber, M., Van Kammen, W. B., & Maughan, B. (1993). Developmental pathways in disruptive child behavior. Development and Psychopathology, 5, 103-133.

Maddala, G. S. (1983). Limited dependent and qualitative variables in econometrics. New York: Cambridge University Press.

Magnusson, D., & Bergman, L. R. (1990). A pattern approach to the study of pathways from childhood to adulthood. In L. N. Robins & M. Rutter (Eds.), Straight and devious pathways from childhood to adulthood (pp. 101-116). New York: Cambridge University Press.

Manski, C. F. (1995). Identification problems in the social sciences. Cambridge, MA: Harvard University Press.

Markman, E. M. (1989). Categorization and naming in children: Problems of induction. Cambridge, MA: MIT Press.

Moffitt, T. E. (1993). Adolescence-limited and life-course-persistent antisocial behavior: A developmental taxonomy. Psychological Review, 100(4), 674-701.

Muthen, B., & Muthen, L. (1999). Integrating person-centered and variable-centered analyses: Growth mixture modeling with latent trajectory classes. Manuscript in preparation.

Nagin, D. (1999). Analyzing developmental trajectories: Semi-parametric, group-based approach. Psychological Methods, 4, 139-177.

Nagin, D. S., & Land, K. C. (1993). Age, criminal careers, and population heterogeneity: Specification and estimation of a nonparametric, mixed Poisson model. Criminology, 31, 327-362.

Nagin, D. S., & Tremblay, R. E. (1999). Trajectories of boys' physical aggression, opposition, and hyperactivity on the path to physically violent and nonviolent juvenile delinquency. Child Development, 70, 1181-1196.

Olweus, D. (1979). Stability of aggressive reaction patterns in males: A review. Psychological Bulletin, 86, 852-875.

Patterson, G. R., DeBaryshe, B. D., & Ramsey, E. (1989). A developmental perspective on antisocial behavior. American Psychologist, 44, 329-335.

Pulkkinen, L., & Tremblay, R. E. (1992). Patterns of boys' social adjustment in two cultures and at different ages: A longitudinal perspective. International Journal of Behavioural Development, 15(4), 527-553.

Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111-164.

Roeder, K., Lynch, K., & Nagin, D. (1999). Modeling uncertainty in latent class membership: A case study in criminology. Journal of the American Statistical Association, 94, 766-776.

Titterington, D. M., Smith, A. F. M., & Makov, U. E. (1985). Statistical analysis of finite mixture distributions. New York: Wiley.

Tremblay, R. E., Desmarais-Gervais, L., Gagnon, C., & Charlebois, P. (1987). The preschool behavior questionnaire: Stability of its factor structure between culture, sexes, ages and socioeconomic classes. International Journal of Behavioral Development, 10, 467-484.

Tremblay, R. E., Masse, L. C., Pagani, L., & Vitaro, F. (1996). From childhood physical aggression to adolescent maladjustment: The Montreal Prevention Experiment. In R. D. Peters & R. J. McMahon (Eds.), Preventing childhood disorders, substance abuse and delinquency (pp. 268-298). Thousand Oaks, CA: Sage.

Valez, C. N., Johnson, J., & Cohen, P. (1989). A longitudinal analysis of selected risk factors of childhood psychopathology. Journal of American Academy of Child and Adolescent Psychiatry, 28, 861-864.

(Appendix follows)


Appendix

Specification of Likelihood Function

This appendix specifies the form of the likelihood function used to estimate the joint trajectory model. Given that the joint model is an elaboration of the univariate model, we begin with a brief summary of the form of its likelihood function. For each individual $i$, the likelihood of observing the longitudinal sequence of behavioral measurements, $Y_i = \{y_{i1}, y_{i2}, y_{i3}, \ldots, y_{iT}\}$, denoted by $P(Y_i)$, is specified as a mixture of $J$ underlying latent groups:

$$P(Y_i) = \sum_{j=1}^{J} \pi^j P^j(Y_i),$$

where $P^j(Y_i)$ is the probability of $Y_i$ given membership in group $j$ and $\pi^j$ is the probability of group $j$. As described in the main text, the form of $P^j(Y_i)$ is selected to conform with the type of data under analysis. In work to date, the zero-inflated Poisson distribution has been used in the analysis of count data, the censored normal distribution in the analysis of psychometric data, and the logit distribution in the analysis of binary data. See Jones et al. (in press) or Nagin (1999) for a full discussion of these alternatives. The basic model also assumes that conditional on membership in group $j$, the random variables $y_{it}$ are independent. Thus,

$$P^j(Y_i) = \prod_{t=1}^{T} p^j(y_{it}),$$

where $p^j(y_{it})$ denotes the probability distribution of $y_{it}$ given membership in group $j$.
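The univariate likelihood just described can be sketched in a few lines of Python. This is a minimal illustration only, assuming binary data with a logit link and a polynomial in age; the function names and all parameter values are hypothetical and are not taken from the article or from the SAS procedure of Jones et al.

```python
import math

def logit_prob(y, age, beta):
    """p^j(y_it): probability of binary outcome y under a logit model whose
    linear predictor is a polynomial in age with coefficients beta."""
    eta = sum(b * age ** power for power, b in enumerate(beta))
    p_one = 1.0 / (1.0 + math.exp(-eta))
    return p_one if y == 1 else 1.0 - p_one

def group_likelihood(ys, ages, beta):
    """P^j(Y_i): product over periods, using within-group conditional independence."""
    return math.prod(logit_prob(y, a, beta) for y, a in zip(ys, ages))

def mixture_likelihood(ys, ages, pis, betas):
    """P(Y_i) = sum_j pi^j * P^j(Y_i)."""
    return sum(pi * group_likelihood(ys, ages, beta)
               for pi, beta in zip(pis, betas))

# Hypothetical two-group example; the values are illustrative, not estimates.
ages = [0.0, 1.0, 2.0]
pis = [0.7, 0.3]                    # group membership probabilities
betas = [(-2.0, 0.0), (0.5, 0.3)]   # (intercept, linear term) for each group
print(mixture_likelihood([0, 1, 1], ages, pis, betas))
```

Because each group-specific distribution is a proper probability distribution and the membership probabilities sum to 1, the mixture likelihood summed over all possible response sequences equals 1, which is a convenient sanity check on any implementation.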

The assumption of conditional independence is a strong one, but not nearly as strong as it may seem at first blush. As described in the main text, the parameters of the polynomial defining each trajectory group are allowed to vary freely. Thus, in the population, individual-level behavior will be correlated over time even though within-group, it is independent over time. Thus, the model does conform with the empirical reality that at the population level there is serial correlation in behavior. Further, we note that the standard fixed or random effect model generally assumes that conditional on the fixed or random effect (and other relevant covariates), individual-level behavior is independent over time. In principle, the conditional independence assumption could be relaxed to allow for within-group serial correlation in behavior, but the cost in terms of complexity would be considerable.
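A small exact calculation illustrates why within-group independence still produces serial correlation at the population level. The sketch below makes a simplifying assumption that is not in the article: outcomes are binary and each group has a flat trajectory, so that within group $j$ the outcome in every period is Bernoulli with the same probability. The function name and values are hypothetical.

```python
def population_covariance(pis, probs):
    """Covariance of y_s and y_t (s != t) in the population when, within
    group j, outcomes are independent Bernoulli(probs[j]) in every period."""
    mean = sum(pi * p for pi, p in zip(pis, probs))
    # E[y_s * y_t] factorizes within each group by conditional independence.
    joint = sum(pi * p * p for pi, p in zip(pis, probs))
    return joint - mean * mean

# Two equally sized hypothetical groups with low and high probabilities.
print(population_covariance([0.5, 0.5], [0.1, 0.9]))
# With a single group, nothing differs across individuals and the covariance vanishes.
print(population_covariance([1.0], [0.3]))
```

In this simplified setting the covariance equals the variance of the group-specific probabilities across groups, so it is positive whenever the groups differ and zero only when they coincide.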

The joint trajectory model builds from the univariate model as follows: Let $Y_1$ and $Y_2$ denote the two longitudinal series to be modeled in a joint trajectory format, where $Y_1$ is measured over $T_1$ periods, $Y_2$ is measured over $T_2$ periods, and the index $i$ has been suppressed for notational convenience. We continue the maintained assumption of conditional independence given group membership. Thus,

$$f^j(Y_1) = \prod_{t=1}^{T_1} f^j(y_{1t})$$

and

$$h^k(Y_2) = \prod_{t=1}^{T_2} h^k(y_{2t}),$$

where $f(\cdot)$ and $h(\cdot)$ are suitably defined probability distributions given the form of the data.

We next add a new layer to the conditional independence assumption with the assumption that conditional on $j$ and $k$, $Y_1$ and $Y_2$ are independently distributed, $P^{jk}(Y_1, Y_2) = f^j(Y_1)\,h^k(Y_2)$. Thus, the unconditional likelihood function of $Y_1$ and $Y_2$ sums across $P^{jk}(Y_1, Y_2)$, with each such conditional distribution weighted by the joint probability of membership in trajectory group $j$ for $Y_1$ and trajectory group $k$ for $Y_2$, $\pi^{jk}$:

$$P(Y_1, Y_2) = \sum_{j} \sum_{k} \pi^{jk} f^j(Y_1)\, h^k(Y_2).$$

An alternative and equivalent form of the likelihood function builds from the result that $\pi^{jk} = \pi^{k|j}\pi^{j}$. Thus,

$$P(Y_1, Y_2) = \sum_{j} \pi^{j} f^j(Y_1) \sum_{k} \pi^{k|j} h^k(Y_2).$$

The reader will observe that this second likelihood function has a sequential construction: each group $j$ of $Y_1$ is linked to each group $k$ of $Y_2$ via a conditional probability $\pi^{k|j}$. For problems in which $Y_1$ temporally precedes $Y_2$, this formulation is natural. However, regardless of temporal sequence, still another equivalent formulation conditionally links each group $k$ to each group $j$ via the conditional probability $\pi^{j|k}$. For this formulation, the likelihood function for each individual $i$ is:

$$P(Y_1, Y_2) = \sum_{k} \pi^{k} h^k(Y_2) \sum_{j} \pi^{j|k} f^j(Y_1).$$
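The equivalence of the three formulations can be checked numerically. The following is a minimal sketch, assuming the group-specific likelihoods $f^j(Y_1)$ and $h^k(Y_2)$ have already been evaluated for one individual and that every marginal membership probability is strictly positive; the function names and numbers are hypothetical.

```python
def joint_form(pi_jk, f, h):
    """Sum over j and k of pi^{jk} * f^j(Y_1) * h^k(Y_2)."""
    return sum(pi_jk[j][k] * f[j] * h[k]
               for j in range(len(f)) for k in range(len(h)))

def sequential_from_j(pi_jk, f, h):
    """Sum over j of pi^j * f^j(Y_1) * [sum over k of pi^{k|j} * h^k(Y_2)]."""
    total = 0.0
    for j in range(len(f)):
        pi_j = sum(pi_jk[j])  # marginal probability of group j (row sum)
        inner = sum((pi_jk[j][k] / pi_j) * h[k] for k in range(len(h)))
        total += pi_j * f[j] * inner
    return total

def sequential_from_k(pi_jk, f, h):
    """Sum over k of pi^k * h^k(Y_2) * [sum over j of pi^{j|k} * f^j(Y_1)]."""
    total = 0.0
    for k in range(len(h)):
        pi_k = sum(row[k] for row in pi_jk)  # marginal probability of group k (column sum)
        inner = sum((pi_jk[j][k] / pi_k) * f[j] for j in range(len(f)))
        total += pi_k * h[k] * inner
    return total

# Hypothetical joint membership probabilities (rows: j, columns: k; entries sum to 1)
# and hypothetical group-specific likelihood values for one individual.
pi_jk = [[0.40, 0.10],
         [0.05, 0.45]]
f = [0.02, 0.30]
h = [0.10, 0.005]
print(joint_form(pi_jk, f, h),
      sequential_from_j(pi_jk, f, h),
      sequential_from_k(pi_jk, f, h))
```

The three values agree up to floating-point rounding, since the sequential forms only factor the same joint probabilities into a marginal and a conditional.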

A1 Brame et al. (in press) use a distinctly different approach. Instead of linking trajectories across $Y_1$ and $Y_2$ via $\pi^{jk}$, the univariate approach is expanded by estimating trajectory groups that combine parameters for each behavior. Specifically, in this formulation a group is defined by three sets of parameters, $\pi^m$, $\beta_1^m$, and $\beta_2^m$, where $m$ indexes the combined trajectory, $\pi^m$ is the proportion of the population in each combined group, and $\beta_1^m$ and $\beta_2^m$ are vectors of parameters specifying the shape of group $m$'s trajectory for behaviors $Y_1$ and $Y_2$, respectively. For this formulation, the form of the likelihood function for a given individual $i$ is $P(Y_1, Y_2) = \sum_m \pi^m f^m(Y_1)\, h^m(Y_2)$.

Received February 17, 2000
Revision received July 10, 2000
Accepted November 7, 2000