

Social Networks 25 (2003) 309–332

Group composition and network structure in school classes: a multilevel application of the p∗ model☆

Miranda J. Lubbers∗

Institute for Educational Research, GION, University of Groningen, Grote Rozenstraat 3, 9712 TG Groningen, The Netherlands

Abstract

This paper describes the structure of social networks of students within school classes and examines differences in network structure between classes. In order to examine the network structure within school classes, we focused in particular on the principle of homophily, i.e. the tendency that people associate with similar others. When differences between classes were observed, it was investigated whether these were related to group compositional characteristics. A two-stage regression procedure is proposed to analyze social networks of multiple groups. The random coefficient model is discussed briefly as an alternative to the two-stage method.

© 2003 Elsevier B.V. All rights reserved.

JEL classification: C51

Keywords: Social networks; School classes; Logit p∗ model; Homophily

1. Introduction

In educational research, little attention is paid to the school class as a social environment. Classes are usually represented as groups of isolated individuals, whose learning processes are influenced by personal attributes and family background characteristics as well as characteristics of instruction, teachers and school organizations. Nevertheless, evidence showed that social relationships between peers also influence students’ learning processes at school. Peer acceptance gives students a sense of belonging at school and access to a number of social resources (such as social companionship, help and behavioral confirmation), which increases their motivation and school success (Wigfield et al., 1998) and decreases the probability that they drop out (Hymel et al., 1996). Relationships among classmates are thought to be particularly important. Students learn skills by observing and modeling other students’ learning styles (Bandura, 1977), they evaluate their achievement by comparing it with that of their classmates and they are rewarded for behavior that is valued by their classmates (Wentzel, 1996). Not all classmates are equally important to a student’s learning process, but primarily those classmates whose learning styles are visible to the student and with whom the student has a certain extent of communication (Bandura, 1977). Since a student’s position within the class network of social relationships is thought to be related to his educational performance, it is important to study the structure of social networks within classes and its determinants.

☆ An earlier version of this paper was presented at SUNBELT XXI (2001) in Budapest, Hungary.

∗ Tel.: +31-50-363-6680; fax: +31-50-363-6670.

E-mail address: [email protected] (M.J. Lubbers).

0378-8733/$ – see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0378-8733(03)00013-3

This paper concerns the first measurement of a Dutch longitudinal study that describes the school class as a social environment and investigates its impact on students’ educational performance. The purpose of this paper is two-fold. The first objective is to describe the structure of social networks of students within classes, and to examine differences in network structure between classes. In order to describe the network structure, we focus in particular on a strong tendency observed in many types of networks, the tendency that people associate with similar others. Since the presumption that classmates influence each other’s educational performance is a major reason for studying class networks, we are especially interested in whether similarity in educational performance at the entry to secondary education structures relationships among students. In addition, similarity on a few other characteristics that were found to be important in other studies was included. When differences between classes in network structure or in the tendency to relate to similar others are observed, we also investigate whether these differences can be explained by the group composition.

The second objective is a methodological one. Traditionally, social network research has focused on describing the pattern of interdependencies within a single group of actors. Although descriptive studies of single networks give valuable insight into group processes, it is questionable to what extent findings from these studies can be generalized to other groups. As we are particularly interested in differences between school classes and the effect of group composition on the social networks of students, we need to examine the networks of many classes. We used data of students in 57 school classes, who had entered secondary education 4 months before the first measurement took place. As research of multiple networks is rather rare, statistical methods that are suitable for simultaneous analysis of multiple complete networks, or in other words methods for ‘multilevel social network analysis’ (Snijders and Baerveldt, 2003), are not available. Therefore, we will propose an approach that applies a statistical model for single networks, the logit p∗ model, in a multilevel context by means of a two-stage regression procedure.

The first objective is discussed in Section 2, in which three research questions are specified. We elaborate on the methodological objective in Section 3, where the two-stage procedure is described. In Sections 4 and 5, the data are introduced and the results presented. Before the conclusion, we will briefly discuss an alternative to the two-stage procedure, the random coefficient model, and describe its correspondence with, and its advantages and disadvantages vis-à-vis, the two-stage procedure used in this paper.


2. Dyadic similarity, group composition and social structure in school classes

According to social psychological studies, interpersonal attraction is strongly based on similarity between self and other (Byrne, 1971). Several explanations for this tendency are given (for a review see Hinde, 1997, pp. 123–134). A major explanation is that associating with similar others (homophily) is rewarding in multiple ways. Similar others are more likely to provide validation of one’s own behavior, values and beliefs than dissimilar others. Moreover, shared activities are thought to be more enjoyable when others have similar interests (Aboud and Mendelson, 1996). Besides, sharing similar values eases communication and minimizes tension. These arguments are even applied to similarity in demographic characteristics, which is thought to create shared tastes and knowledge (Mayhew et al., 1995).

A large body of empirical evidence (see McPherson et al., 2001) suggests that homophily is a pervasive organizing principle of many kinds of social networks. Homophily involves a variety of characteristics. These are presumed to be important in different stages of tie formation (Aboud and Mendelson, 1996). With regard to interpersonal attraction, people initially select each other on the basis of visible or ‘surface’ characteristics. Research showed that children and adolescents tend to exclude others who are dissimilar in gender, age and ethnicity (see Hartup, 1993). As people learn more about each other, they subsequently choose friends from the resulting group of similar others. In this second selection stage, similarity in ‘deeper’ characteristics becomes more important. In this respect, similarities in adolescents’ friendships are consistently found in activity preferences, but findings about cognitive ability, personality, values and attitudes are mixed (Aboud and Mendelson, 1996). It is assumed that those characteristics become more and more important as children grow older.

Although the tendency that ties between similar people occur at a higher rate than ties between dissimilar people may be produced by personal preferences, it may also be induced by the shared social environment (i.e. similarity as a consequence of being grouped together) (Jussim and Osgood, 1989; Leenders, 1995). When studying sociometric choice within relatively small groups, one should consider the limitations the group imposes on interpersonal selection. Classrooms constrain the alternatives since they constrain both the number of others whom students can choose between, and the set of attributes these others have (see Leenders, 1995; Jussim and Osgood, 1989). For example, schools usually group students together who are similar in age and often in cognitive ability. Then similarities in age and cognitive ability are a consequence of the shared social environment rather than a focus of interpersonal attraction. The extent to which classes limit sociometric choice may vary from class to class: some classes are for example ethnically diverse whereas others are completely homogeneous in this respect. These constraints on the composition of a class determine the baseline component of homophily (McPherson et al., 2001), i.e. the expected degree of similarity in relationships given the relative size of categorical groups within the classroom. Then, selective attraction should explicitly be studied as a deviation from this baseline component.
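The baseline component can be made concrete with a small calculation. The sketch below is only an illustration: the function name and the two class compositions are hypothetical and not taken from the study. It computes the share of same-category pairs that would be expected if pairs of classmates were formed at random, given nothing but the category sizes.

```python
from math import comb

def baseline_same_category_share(category_sizes):
    """Expected share of same-category pairs under random pairing.

    category_sizes: counts per category (e.g. ethnic groups) in one class.
    """
    n = sum(category_sizes)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        raise ValueError("need at least two students")
    same_pairs = sum(comb(size, 2) for size in category_sizes)
    return same_pairs / total_pairs

# Hypothetical classes of 24 students each (illustrative numbers only)
homogeneous = baseline_same_category_share([24])   # every pair is same-category
mixed = baseline_same_category_share([18, 6])      # fewer same-category pairs
```

Selective attraction would then show up as a rate of same-category ties that differs from this baseline share.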

Variations in group composition thus determine variations in the baseline component of homophily. In other words, in relatively homogeneous classes a randomly chosen pair of students is expected to be more similar than in heterogeneous classes. However, several theoretical notions suggest that the group composition may also affect selective attraction. This would imply that group composition does not only determine the baseline component of homophily (in other words contact opportunities), but that deviations from this baseline caused by selective attraction (preference) are also group-dependent. For example, Vermeij and Baerveldt (2001) compare the implications of two theoretical notions with respect to ethnic heterogeneity in classes. On one hand, the contact hypothesis (Allport, 1954) suggests that mere exposure to dissimilar others increases familiarity and friendliness among dissimilar people. Accordingly, as dissimilar students have more contact opportunities in heterogeneous classes, one could presume that similarity is a less important selection criterion than in relatively homogeneous classes. On the other hand, they state that according to social identity theory (Tajfel and Turner, 1979), identity becomes more salient as the heterogeneity increases, and as a result, similar students may be more attracted to each other than is expected on the basis of their relative numbers. Similar expectations can be expressed for heterogeneity of classes with respect to educational performance level. In addition to the possible effect of heterogeneity in performance level, Hallinan and Smith (1989) expected that in classes with a higher average level, academic success would be more salient to students and therefore homophily on performance level would be stronger than in lower level classes. In conclusion, it is necessary and worthwhile to explore to what extent network structures differ between classes and whether the composition of classes influences the network structures that arise. In fact, if groups differ significantly with respect to the network structure, one should be cautious to generalize findings from single network studies.

In this paper, we first describe the social network structure within school classes, and specifically the extent to which sociometric choice in school classes is based on similarity. The first research question is: Within the setting of their school class, do students relate to others who are relatively similar? The second research question is whether classes differ in network structure, and specifically in the extent to which sociometric choices of students are based on similarity. If they do, the third research question is to what extent the observed differences can be explained by compositional features of classes.

Results are presented of students in 57 school classes who had entered secondary education 4 months before the first measurement took place. It is studied whether prior educational performance, measured in the final year of primary education, predicts sociometric choice on two relations, liking and co-operation. Besides educational performance, a few other important predictors of adolescents’ network ties are included. As we mentioned earlier in this section, similarity in the demographic variables gender, ethnicity and age was found to be important in other studies. The effect of gender appeared to be so strong that the networks of boys and girls were almost completely separate. Therefore, we chose to study girls’ and boys’ networks separately. Ethnicity is included in the analyses, but age is excluded because its range within classes is very narrow: 91% of the pairs of classmates differed less than 1 year in age. Besides performance level, gender and ethnicity, we will consider the primary school students had attended. McPherson et al. (2001) stress that the effect of the shared environment lasts far beyond the individual’s actual embeddedness. Besides, students who knew each other from primary school may already have influenced each other’s learning behavior before entering secondary education. It is therefore important to distinguish the effect of same primary school from similarity in performance level.

3. Method of analysis

The research questions of this paper were formulated at two levels. One is the dyad level, at which the first research question was formulated and the other is the class level, at which the second and third question were formulated. In order to predict the structure of social relationships in classes from characteristics of both dyads and classes, a two-stage regression procedure has been applied. In the first stage, each class network has been analyzed separately using a logit p∗ model. From these analyses, we obtained a set of regression coefficients for each of the classes representing the occurrence of structural effects in the class network. These coefficients become the dependent variables in the second stage, when they are related to compositional characteristics of classes.

3.1. Modeling the variability within classes

In the first stage, each network was analyzed using a statistical model for social networks, the logit p∗ model. The family of p∗ models was developed by Wasserman and Pattison (1996), Pattison and Wasserman (1999) and Robins et al. (1999) and generalizes the earlier $p_1$ model introduced by Holland and Leinhardt (1981), elaborating earlier work on Markov graphs by Frank and Strauss (1986) and Strauss and Ikeda (1990). Below, the model will be summarized. The reader is referred to Wasserman and Pattison (1996) and Anderson et al. (1999) for a more complete introduction.

As the two sexes will be treated separately and the two relations univariately, the term ‘class’ in this section could refer to the set of either boys or girls in a class, and likewise the term ‘relation’ could refer to either liking or co-operation. The random univariate network of possible ties on a relation within a class is denoted by its adjacency matrix $Y = (Y_{ij})_{1 \le i,j \le n}$, where $n$ is the number of students in the group. The binary random variable $Y_{ij}$ thus represents a possible tie directed from student $i$ (the sender) to student $j$ (the receiver). A realization of $Y$ is denoted by $y = (y_{ij})_{1 \le i,j \le n}$, with $y_{ij} = 1$ if a tie from $i$ to $j$ is observed, and $y_{ij} = 0$ if no tie from $i$ to $j$ is observed.

In p∗ models, the probability that a tie exists is modeled as a function of parameters that represent structural tendencies (such as mutuality and transitivity) and potential actor or dyadic attributes (such as similarity in performance level). The unit of analysis is the ordered pair of actors $(i, j)$ in a network, which we will refer to as a couple (Robins et al., 2001). The dependent variable is the observed value of a relational tie $Y_{ij}$.

As each network member is part of multiple couples, the common assumption of statistical independence is not tenable for p∗. Markov models for random graphs of Frank and Strauss (1986) and Strauss and Ikeda (1990) have made it possible to relax this limiting assumption. They proposed to estimate the probability that a tie exists conditional on the rest of the network. Thus, the dependent variable $Y_{ij}$ is modeled as the conditional probability $P_{ij}$ that a tie exists and a residual $R_{ij}$:

$$Y_{ij} = P_{ij} + R_{ij}, \quad \text{where } P_{ij} = \Pr(Y_{ij} = 1 \mid \text{rest of } Y) \tag{1}$$

Subsequently, the conditional probability can be elaborated as follows:

$$P_{ij} = \Pr(Y_{ij} = 1 \mid Y^{c}_{(ij)}) = \frac{\Pr(Y = y^{+}_{(ij)})}{\Pr(Y = y^{-}_{(ij)}) + \Pr(Y = y^{+}_{(ij)})} \tag{2}$$

where $y^{c}_{(ij)}$ (the realization of $Y^{c}_{(ij)}$) denotes the adjacency matrix $y$ of the network in which the tie from $i$ to $j$ is missing, $y^{+}_{(ij)}$ denotes the adjacency matrix of the network in which there is a tie from $i$ to $j$ and $y^{-}_{(ij)}$ the adjacency matrix in which there is no tie from $i$ to $j$.

Logistic regression is used to estimate the p∗ model. Therefore, the probability that a tie exists is first transformed into the natural logarithm of the odds that a tie is present versus absent, in order to produce a variable that ranges from negative to positive infinity rather than from 0 to 1. Then, we can equate it to a linear combination of explanatory variables.

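$$\operatorname{logit}(P_{ij}) = \log\left(\frac{\Pr(Y_{ij} = 1 \mid Y^{c}_{(ij)})}{\Pr(Y_{ij} = 0 \mid Y^{c}_{(ij)})}\right) = \log\left(\frac{\Pr(Y = y^{+}_{(ij)})}{\Pr(Y = y^{-}_{(ij)})}\right) = \theta'\left(z(y^{+}_{(ij)}) - z(y^{-}_{(ij)})\right) \tag{3}$$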

where $\theta$ denotes the vector of model parameters and $z(y)$ the vector of network statistics, which are functions of $y$. The difference $z(y^{+}_{(ij)}) - z(y^{-}_{(ij)})$ thus expresses the vector of changes in network statistics when variable $Y_{ij}$ changes from 1 to 0, while the rest of the network is unaltered.
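The last equality in Eq. (3) follows from the exponential-family form of the p∗ model (Wasserman and Pattison, 1996). As a brief sketch, with $\kappa(\theta)$ denoting a normalizing constant that is not written out in the text above:

```latex
\Pr(Y = y) = \frac{\exp\{\theta' z(y)\}}{\kappa(\theta)}
\quad\Longrightarrow\quad
\frac{\Pr(Y = y^{+}_{(ij)})}{\Pr(Y = y^{-}_{(ij)})}
= \frac{\exp\{\theta' z(y^{+}_{(ij)})\}}{\exp\{\theta' z(y^{-}_{(ij)})\}}
= \exp\left\{\theta'\left(z(y^{+}_{(ij)}) - z(y^{-}_{(ij)})\right)\right\}
```

Taking logarithms gives the right-hand side of Eq. (3). The constant $\kappa(\theta)$ cancels in the ratio, which is why the logit form can be handled with ordinary logistic regression software.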

The network effects included in the model reflect the assumed dependence structure. For Markov graphs, conditional dependencies are assumed between any two pairs of actors that have an actor in common. This means that all dependencies can be defined at the level of threesomes. Therefore, the sufficient statistics for Markov graphs are triangles and stars of order 3 or more.

In this study, 10 effects were estimated in two subsequent models that are common to all classes. In the first model, structural parameters were estimated which represent dyadic and triadic configurations. We restrict ourselves to the primary parameters choice (denoted $\phi$), mutuality ($\rho$), 2-out-star ($\sigma_O$), 2-in-star ($\sigma_I$), 2-mixed-star ($\sigma_M$), transitivity ($\tau_T$), and cyclicity ($\tau_C$). The corresponding network statistics are defined as follows:

$$\text{choice:}\quad z_0(y) = L = \sum_{i,j} y_{ij} \tag{4}$$

$$\text{mutuality:}\quad z_1(y) = M = \sum_{i<j} y_{ij} y_{ji} \tag{5}$$

$$\text{2-out-star:}\quad z_2(y) = S_O = \sum_{i,j,k} y_{ij} y_{ik} \tag{6}$$

$$\text{2-in-star:}\quad z_3(y) = S_I = \sum_{i,j,k} y_{ji} y_{ki} \tag{7}$$

$$\text{2-mixed-star:}\quad z_4(y) = S_M = \sum_{i,j,k} y_{ji} y_{ik} \tag{8}$$

$$\text{transitivity:}\quad z_5(y) = T_T = \sum_{i,j,k} y_{ij} y_{jk} y_{ik} \tag{9}$$

$$\text{cyclicity:}\quad z_6(y) = T_C = \sum_{i,j,k} y_{ij} y_{jk} y_{ki} \tag{10}$$

where $i$, $j$ and $k$ denote distinct students in a class. In fact, these network statistics are simply subgraph counts. The network change statistics for these configurations are the changes between the statistic calculated for the network where the tie $y_{ij}$ is present and the statistic for the network in which the tie is absent. So for example, the change statistic for choice is by definition a constant.
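The subgraph counts of Eqs. (4)-(10) and the corresponding change statistics can be sketched in a few lines of Python. This is an illustrative reading of the formulas rather than the software used in the study: the function names and the example network are hypothetical, and the sums are taken literally over ordered pairs and triples of distinct students, so symmetric configurations such as 2-out-stars are counted once per ordering.

```python
from itertools import permutations

def network_statistics(y):
    """Subgraph counts of Eqs. (4)-(10) for a binary adjacency matrix y.

    y is a list of lists; sums run over distinct students, so the diagonal
    is ignored.
    """
    nodes = range(len(y))
    pairs = list(permutations(nodes, 2))
    triples = list(permutations(nodes, 3))
    return {
        "choice": sum(y[i][j] for i, j in pairs),
        "mutuality": sum(y[i][j] * y[j][i] for i, j in pairs if i < j),
        "out2star": sum(y[i][j] * y[i][k] for i, j, k in triples),
        "in2star": sum(y[j][i] * y[k][i] for i, j, k in triples),
        "mixed2star": sum(y[j][i] * y[i][k] for i, j, k in triples),
        "transitivity": sum(y[i][j] * y[j][k] * y[i][k] for i, j, k in triples),
        "cyclicity": sum(y[i][j] * y[j][k] * y[k][i] for i, j, k in triples),
    }

def change_statistics(y, i, j):
    """z(y+) - z(y-) for the couple (i, j); the rest of the network is unchanged."""
    y_plus = [row[:] for row in y]
    y_minus = [row[:] for row in y]
    y_plus[i][j] = 1
    y_minus[i][j] = 0
    z_plus = network_statistics(y_plus)
    z_minus = network_statistics(y_minus)
    return {name: z_plus[name] - z_minus[name] for name in z_plus}

# Hypothetical 4-student network with ties 0->1, 0->2, 1->0, 1->2
y = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
stats = network_statistics(y)
delta_01 = change_statistics(y, 0, 1)
```

Recomputing all counts twice per couple is wasteful but keeps the sketch close to the definition; the change statistic for choice comes out as 1 for every couple, in line with the remark that it is a constant.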

In the second model, differential effects of choice are added to the first model for three dyadic attributes. These are same primary school attended ($\mathit{samesch}_{ij}$, parameter $\phi_{\text{school}}$), difference in educational performance ($\mathit{difPEFT}_{ij}$, $\phi_{\text{PEFT}}$) and difference of ethnic background ($\mathit{difethnic}_{ij}$, $\phi_{\text{ethnic}}$).¹ The variables, which are described in Section 4, are multiplied with the tie indicator to produce the network statistics.

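$$\text{Same primary school attended:}\quad z_7(y) = D_S = \sum_{i,j} y_{ij}\,\mathit{samesch}_{ij} \tag{11}$$

$$\text{Difference in performance:}\quad z_8(y) = D_P = \sum_{i,j} y_{ij}\,\mathit{difPEFT}_{ij} \tag{12}$$

$$\text{Difference of ethnicity:}\quad z_9(y) = D_E = \sum_{i,j} y_{ij}\,\mathit{difethnic}_{ij} \tag{13}$$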

Since the log odds are conditional, the method described earlier is a pseudo-likelihood method rather than maximum likelihood. According to Strauss and Ikeda (1990), the pseudo-likelihood estimation procedure performs as well as maximum likelihood. Nevertheless, the statistical properties of the estimator are unknown (Snijders, 2002). Although Markov Chain Monte Carlo techniques are being developed that may provide an alternative estimation procedure, these too have limitations as yet (Snijders, 2002).

The network data were preprocessed in PREPSTAR (Ardu, 1995) to produce matrices of network change statistics per class. Subsequently, parameters for each class were estimated using the logistic regression option in SPSS (see Crouch et al., 1998).
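The estimation step can be illustrated with a deliberately small sketch. It is not the PREPSTAR/SPSS workflow described above: the model is restricted to the choice and mutuality effects, a hand-written gradient ascent stands in for a logistic regression routine, and the function names and the example network are hypothetical.

```python
import math

def couple_rows(y):
    """One row per ordered couple (i, j): the tie y_ij and its change statistics.

    Only choice (a constant 1) and mutuality (y_ji) are included here.
    """
    n = len(y)
    rows = []
    for i in range(n):
        for j in range(n):
            if i != j:
                rows.append((y[i][j], [1.0, float(y[j][i])]))
    return rows

def fit_logistic(rows, steps=5000, rate=1.0):
    """Maximize the pseudo-likelihood by plain gradient ascent (illustrative only)."""
    theta = [0.0] * len(rows[0][1])
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for tie, x in rows:
            linear = sum(t * v for t, v in zip(theta, x))
            p = 1.0 / (1.0 + math.exp(-linear))
            for k, v in enumerate(x):
                grad[k] += (tie - p) * v
        # Step along the average gradient over all couples
        theta = [t + rate * g / len(rows) for t, g in zip(theta, grad)]
    return theta

# Hypothetical 5-student network (three mutual dyads, two asymmetric dyads)
y = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
theta_choice, theta_mutuality = fit_logistic(couple_rows(y))
```

With only these two effects the fit reproduces the observed proportions of ties among couples with and without a reciprocating tie. Because the couples are not independent observations, the standard errors reported by a logistic regression routine for such a fit should be treated with the caution noted above.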

3.2. Modeling the variability between classes

The logit p∗ analyses yield a set of parameter estimates and associated standard errors for each of the $N$ classes, separately for boys and girls and for the two relations. We first need to summarize these findings over classes. If it appears that the structural effects vary between classes, then the next step is to explore to what extent compositional features can explain this variance.

¹ The operationalization of the dyadic attributes as ‘similar’ or ‘different’ is data driven. The average dyad within classes consists of two students who attended different primary schools, are relatively similar in performance and belong to the same ethnic category. This represents the baseline. The categories other than 0 are specified for those dyads that deviate from the average dyad.

To summarize the effects, the coefficients of the classes were first split into an average coefficient and a class-dependent deviation. For convenience, let $\theta$ now denote a single parameter of the vector. The regression equation can then be written as:

$$\hat{\theta}_m = \mu_\theta + U_m + E_m \tag{14}$$

where $\hat{\theta}_m$ denotes the estimated parameter value for class $m$, $\mu_\theta$ denotes the average coefficient and $U_m$ is the true deviation of class $m$, which has a mean of 0 and a variance $\sigma^2_\theta$. $E_m$ denotes the estimation error associated with the true parameter value $\theta_m$. To differentiate between true variance and error variance, and thus to obtain more precise estimators for $\mu_\theta$ and $\sigma^2_\theta$, we have to account for the considerable differences in standard errors between classes (see Snijders and Baerveldt, 2003). The program MLwiN (Goldstein et al., 1998) was used for an iterated weighted least squares estimation. Similarly to the procedure of modeling a meta-analysis with known level 1 variation (e.g. Bryk and Raudenbush, 1992; Goldstein, 1995), this was done by creating a pseudo-level in such a way that each unit at level 2 has only one unit at level 1. Then, the error variance was specified at level 1 and the true variance was modeled at level 2. As a result of this procedure, classes with large standard errors have less influence on the average effect size than classes with small errors. Nevertheless, classes with extremely high standard errors (≥4) due to high levels of collinearity were removed from the analyses, since the regression coefficients in these cases are usually unreasonably high as well.

The average effect size of a variable, $\mu_\theta$, indicates to what extent the network effect occurs in classes and, specifically for the dyadic attributes, to what extent sociometric choice is based on similarity (research question 1). To test whether the average effect size is zero, a t-ratio of the average parameter estimate $\hat{\mu}^{\text{WLS}}_\theta$ and the associated standard error $\text{S.E.}(\hat{\mu}^{\text{WLS}}_\theta)$ is used. This statistic has approximately a standard normal distribution.

$$t_{\mu_\theta} = \frac{\hat{\mu}^{\text{WLS}}_\theta}{\text{S.E.}(\hat{\mu}^{\text{WLS}}_\theta)} \tag{15}$$

To test whether an effect is zero in all groups, the following statistic is used (Snijders and Baerveldt, 2003):

$$T^2 = \sum_m \left(\frac{\hat{\theta}_m}{\text{S.E.}(\hat{\theta}_m)}\right)^2 \tag{16}$$

where $\text{S.E.}(\hat{\theta}_m)$ denotes the standard error associated with the estimated parameter value of class $m$. The statistic has approximately a $\chi^2$ distribution with $N$ degrees of freedom, where $N$ is the number of classes.

The variance of an effect, $\sigma^2_\theta$, indicates whether classes differ in the extent to which network effects occur and similarity matters (research question 2). To test whether the variance is zero, the following statistic is used (Snijders and Baerveldt, 2003), which has approximately a $\chi^2$ distribution and $N - 1$ degrees of freedom, where $N$ is the number of classes:

$$Q = T^2 - \left(\frac{\sum_m \hat{\theta}_m / \text{S.E.}(\hat{\theta}_m)^2}{\sqrt{\sum_m 1 / \text{S.E.}(\hat{\theta}_m)^2}}\right)^2 \tag{17}$$

Note that the $T^2$- and $Q$-statistics should be interpreted a bit conservatively, since the standard errors in the logit p∗ model are known to be underestimated (Snijders, 2002).
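The two test statistics of Eqs. (16) and (17) are straightforward to compute from the per-class estimates and standard errors. The sketch below uses hypothetical function names and made-up numbers for three classes; it does not reproduce any result from the study.

```python
import math

def t_squared(estimates, std_errors):
    """Eq. (16): sum over classes of the squared ratio estimate / standard error."""
    return sum((est / se) ** 2 for est, se in zip(estimates, std_errors))

def q_statistic(estimates, std_errors):
    """Eq. (17): T^2 minus the squared, precision-weighted sum term."""
    weighted_sum = sum(est / se ** 2 for est, se in zip(estimates, std_errors))
    total_precision = sum(1.0 / se ** 2 for se in std_errors)
    correction = (weighted_sum / math.sqrt(total_precision)) ** 2
    return t_squared(estimates, std_errors) - correction

# Made-up estimates and standard errors for three classes
theta_hat = [1.0, 2.0, 3.0]
se = [0.5, 0.5, 1.0]
t2 = t_squared(theta_hat, se)   # compare with chi-square, N = 3 degrees of freedom
q = q_statistic(theta_hat, se)  # compare with chi-square, N - 1 = 2 degrees of freedom
```

When all classes share exactly the same estimate, the correction term equals $T^2$ and $Q$ is zero, which is a quick sanity check on an implementation.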

So far, the variability between classes is considered to be random. However, the third research question is whether the class-dependent deviations can be explained by relevant class characteristics. Therefore, the coefficients of relevant structural tendencies and the dyadic similarity covariates were regressed on characteristics of school classes, which expands Eq. (14) for these parameters with one or more fixed effects.

$$\hat{\theta}_m = \gamma_0 + \sum_h \gamma_h W_{hm} + U_m + E_m \tag{18}$$

where $\gamma_0$ is the intercept and $\gamma_h$ are the coefficients of the explanatory variables at class level, $W_{hm}$. Again, $U_m$ is the group-dependent deviation and $E_m$ denotes the estimation error associated with the true parameter value $\theta_m$. The estimated coefficients of mutuality, transitivity and cyclicity were regressed on five class characteristics (group size, average performance level in the class, heterogeneity of performance level within the class, the percentage of non-Dutch students and the percentage of dyads who attended the same primary school). Section 4 describes these class characteristics in detail. For the coefficients of the dyadic similarity variables, only their class level counterparts were used as explanatory variables (so for example, only the percentage of non-Dutch students in a class is used to explain the variance in the effect of ethnicity on sociometric choice).

From these analyses we obtain coefficients for the compositional variables, which indicate whether the composition of the class explains differences in network structure between classes. Again, all of the regressions were weighted by the standard errors, and a t-ratio of a parameter estimate and its standard error can be calculated to test whether the effect is zero.
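The second-stage regression of Eq. (18) can be sketched for the case of a single class-level covariate. This is a simplification under stated assumptions rather than the MLwiN procedure: the between-class variance is treated as a known input instead of being estimated iteratively, the function name is hypothetical, and the data are made up.

```python
def weighted_regression(theta_hat, std_errors, covariate, between_var=0.0):
    """Weighted least squares fit of theta_hat on one class-level covariate.

    Each class is weighted by 1 / (S.E.^2 + between_var). The value of
    between_var is assumed known here; the text estimates it iteratively.
    Returns (intercept, slope).
    """
    weights = [1.0 / (se ** 2 + between_var) for se in std_errors]
    total = sum(weights)
    mean_x = sum(wt * x for wt, x in zip(weights, covariate)) / total
    mean_t = sum(wt * t for wt, t in zip(weights, theta_hat)) / total
    sxx = sum(wt * (x - mean_x) ** 2 for wt, x in zip(weights, covariate))
    sxy = sum(
        wt * (x - mean_x) * (t - mean_t)
        for wt, x, t in zip(weights, covariate, theta_hat)
    )
    slope = sxy / sxx
    return mean_t - slope * mean_x, slope

# Made-up data for four classes; covariate = percentage of non-Dutch students
pct_non_dutch = [0.0, 10.0, 20.0, 40.0]
theta_hat = [0.5, 0.7, 0.9, 1.3]   # constructed to lie on 0.5 + 0.02 * covariate
se = [0.2, 0.4, 0.3, 0.8]
gamma0, gamma1 = weighted_regression(theta_hat, se, pct_non_dutch, between_var=0.05)
```

Classes with large standard errors receive small weights, so they pull less on the fitted coefficients, which mirrors the weighting described in the text.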

Sections 4 and 5 present the data and the results obtained with the method described above. Subsequently, in Section 6, we briefly discuss an alternative to the two-stage model, where the two stages are integrated into a full multilevel model, and describe its correspondence, advantages and disadvantages compared to the two-stage procedure used in this paper.

4. Data

The results are part of an empirical study, which describes the school class as a social environment and explores its impact on educational performance. For this aim, sociometric data were collected within the framework of a large-scale study in The Netherlands, the ‘Longitudinal Cohort Studies in Secondary Education’ (VOCL), which are carried out jointly by Statistics Netherlands (CBS) and the Groningen Institute for Educational Research (GION). These cohort studies follow cohorts of students from the first grade in junior high until they leave full-time secondary education. The sociometric data were collected on the 1999 cohort (CBS/GION, 1999). All students (average age 13) who were then in the first grade of a sample of 126 schools belong to the cohort; this amounts to about 20,000 students in 800 classes, or a sampling fraction of approximately 1:9 at student level.

In this paper, results are presented of a selection of 57 classes (1466 students) with (1) a class size of at least 20 students, of whom (2) no student was missing on the social network questionnaire, and (3) no more than two students were missing on the measure for prior educational performance.² Since the explanatory variables exhibited few missing data in this selection, missing data imputations were executed as discussed further on in this section, and hence all couples of students in the selected classes could be included in the analyses. In this selection, higher achieving classes are over-represented.³ Moreover, the selection contains no classes in the four main cities Amsterdam, Rotterdam, Utrecht and The Hague. The results are based on the measurement in the first grade.

Since students only rarely make other-gender choices, we decided to consider the networks of boys and girls separately. Nine classes that had fewer than 10 boys were excluded from the analyses of boys’ networks (remaining N = 48 classes) and similarly, three classes with fewer than 10 girls were excluded from the analyses of girls’ networks (remaining N = 54 classes).

4.1. Measures

The measures used in this paper are derived from several sources. In January 2000, the students completed a general questionnaire and a sociometric questionnaire while in their regular classes. The students were assured that the information in the survey would be kept confidential. Students who were absent on the day the questionnaires were administered are missing cases. Data on students’ gender and prior performance level were collected from their school records; data on ethnicity were collected from the parents’ questionnaires that were administered in the same month.

Social network. To measure the social relationships between classmates, a sociometric questionnaire was designed, which contained five social network items. The dependent variables in this paper consist of the two relations liking and co-operation. To measure the first relation, the students were asked to nominate the classmates they liked best. The second item used in this study is task-related. Students were asked with which of their classmates they preferred to work, for example on homework or an assignment in class. The number of classmates that could be mentioned in response to either network item was limited to three.

Table 1 presents the number of names that were reported by the students. It appears that the majority of the students nominated three classmates (the maximum by design): 80 and 59%, respectively, for the relations liking and co-operation. Boys tend to nominate fewer classmates than girls. The category ‘0 ties’ also contains students who explicitly mentioned they did not like or know anyone or they preferred to work alone, and students who mentioned names of others than classmates.

Table 1
Frequencies of the social network variables liking and co-operation at student level, overall and for boys and girls separately (number of classmates nominated per student)

| Relation | 0 (%) | 1 (%) | 2 (%) | 3 (%) |
|---|---|---|---|---|
| Liking: Overall (N = 1466) | 3.1 | 4.3 | 12.6 | 79.9 |
| Liking: Boys (N = 692) | 3.3 | 5.8 | 16.2 | 74.4 |
| Liking: Girls (N = 774) | 3.0 | 3.0 | 9.4 | 84.6 |
| Co-operation: Overall (N = 1466) | 7.1 | 14.9 | 19.2 | 58.9 |
| Co-operation: Boys (N = 692) | 8.4 | 16.8 | 21.5 | 53.3 |
| Co-operation: Girls (N = 774) | 5.0 | 13.2 | 17.1 | 63.8 |

2 This predictor was subject to a large amount of school level non-response.

3 The average performance level in the selection (539.2, standard deviation 7.4) is more than half a standard deviation higher than the national average in 1999 (534.6, standard deviation 9.9), because students in higher level classes had better response rates. The performance level test is described in more detail further below in this section.

For each ordered pair of students (i, j), the dependent variables liking and co-operation are coded 1 if student i nominated student j and 0 otherwise. Applying a fixed maximum of three names in classes of 20–32 students obviously results in relatively sparse class networks: in total 11% of the 36,700 ordered pairs of classmates had a tie on the first relation and 9% on the second. However, ties in mixed-gender couples are even more sparse. On each relation, less than 1% of the mixed-gender couples had a tie. In 21 classes, mixed-gender choices were not made at all. After separating boys’ and girls’ networks, 20% of all ordered pairs of boys had a tie on the first relation and 17% on the second. The same percentages apply to girls’ couples.
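The coding of ordered pairs can be illustrated with a minimal sketch. The student labels and nominations are made up, and the function names `tie_matrix` and `density` are hypothetical helpers, not part of the study's software. Names that do not belong to a classmate are ignored, in line with the treatment of the ‘0 ties’ category.

```python
def tie_matrix(nominations, students):
    """Return Y with Y[i][j] = 1 if student i nominated student j, else 0."""
    index = {s: k for k, s in enumerate(students)}
    n = len(students)
    y = [[0] * n for _ in range(n)]
    for sender, named in nominations.items():
        for receiver in named[:3]:  # at most three names by design
            # Ignore self-nominations and names of non-classmates.
            if receiver in index and receiver != sender:
                y[index[sender]][index[receiver]] = 1
    return y


def density(y):
    """Share of ordered pairs (i, j), i != j, that have a tie."""
    n = len(y)
    ties = sum(sum(row) for row in y)
    return ties / (n * (n - 1))


# Hypothetical class of four students.
students = ["a", "b", "c", "d"]
nominations = {"a": ["b", "c"], "b": ["a"], "c": ["a", "b", "d"], "d": []}
y = tie_matrix(nominations, students)
print(density(y))
```

With at most three nominations per student, the density of a class of n students cannot exceed 3/(n − 1), which is why larger classes yield sparser networks.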

The overlap between the two relations is quite extensive: 69% of the couples of students having a tie on relation 1 also had a tie on relation 2.

Entry level of educational performance. The Dutch secondary education system is still largely based on tracking (i.e. placing students in classes of similar ability). Yearly, about 80% of the primary schools in The Netherlands take part in the assessment of the Primary Education Final Test, developed by the National Institute for Educational Measurement (CITO). This standardized test assesses the performance level of children in the final year of primary education in arithmetic, language and information processing and is aimed at determining which track in secondary education would be most appropriate. The test score, along with the more subjective opinion of the primary school teacher, results in a final recommendation for an appropriate track. Schools for secondary education use these recommendations in their decision to admit students and to place them in a particular track. The test scores range theoretically between 501 and 550.

For half of the students in the cohort, the test scores were provided by the administration of their schools. The selection in this paper only contains classes in which at most two students do not have test scores (this amounts to 40 missing cases in the entire selection). For these 40 students, the average test score in their category of recommendation was imputed.

Absolute differences between students’ test scores were calculated for each pair of students within a class and were used as a dyadic attribute. The differences ranged from 0 to 29. Differences greater than 10 (8% of all dyads) were recoded to 10 in order to prevent outliers from having too much impact. The average test score and its standard deviation within the group are used as class level variables.
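The construction of the capped dyadic attribute and the two class level variables can be sketched as follows. The scores are invented for illustration, the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which variant was used.

```python
from itertools import combinations
from statistics import mean, stdev


def capped_abs_differences(scores, cap=10):
    """Absolute score difference per unordered pair, recoded to `cap` above it."""
    return {
        (i, j): min(abs(scores[i] - scores[j]), cap)
        for i, j in combinations(sorted(scores), 2)
    }


def class_level_variables(scores):
    """Class mean and (sample) standard deviation of the test scores."""
    values = list(scores.values())
    return mean(values), stdev(values)


# Hypothetical test scores within the theoretical range 501-550.
scores = {"a": 545, "b": 540, "c": 521}
diffs = capped_abs_differences(scores)
class_mean, class_sd = class_level_variables(scores)
print(diffs, class_mean, class_sd)
```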

Ethnicity. The parent questionnaire included several items on the country of birth of the child that participates in the cohort and of his or her parents. An ethnicity variable was constructed from these data. Data on ethnicity were missing for 13 students in the selection. Values for these students were imputed on the basis of the child’s country of birth as reported by their school administrations.

Since the non-Dutch categories all contain relatively few cases, certainly when studied per class, they were grouped together. The similarity in ethnicity will be used as a dyadic attribute, coded 1 if the two students in a dyad are either both Dutch or both non-Dutch and 0 otherwise. Note that, for example, a Moroccan and a Turkish student are defined as similar. The percentage of ethnic minority students in the group will be used as an explanatory variable at class level.4

Primary school attended. In the student questionnaire, students were asked to write down the name and the address of the primary school they had attended. Data were missing or could not be identified for 52 students. For these cases, data were imputed on the basis of the zip code of the student’s private address, the schools situated in this zip code area and the schools of classmates who knew the student longer than a year.5 At dyad level, a dummy was constructed that was coded 1 if the two students in a dyad had attended the same primary school and 0 otherwise. At class level, the percentage of dyads in the group who had attended the same primary school is used as an explanatory variable.
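The two similarity dummies and their class level counterparts can be sketched as below. The category labels and school names are invented, and the helper names are hypothetical. Note that the ethnicity dummy only distinguishes Dutch from non-Dutch, so two different minority backgrounds count as similar.

```python
from itertools import combinations


def ethnic_similarity(eth_i, eth_j):
    """1 if both students are Dutch or both are non-Dutch, else 0."""
    return 1 if (eth_i == "Dutch") == (eth_j == "Dutch") else 0


def same_school(school_i, school_j):
    """1 if both students attended the same primary school, else 0."""
    return 1 if school_i == school_j else 0


def pct_dyads_same_school(schools):
    """Percentage of unordered pairs in a group from the same primary school."""
    pairs = list(combinations(schools, 2))
    return 100.0 * sum(same_school(a, b) for a, b in pairs) / len(pairs)


def pct_non_dutch(ethnicities):
    """Percentage of students in the group who are not Dutch."""
    return 100.0 * sum(e != "Dutch" for e in ethnicities) / len(ethnicities)


# Hypothetical group of four students.
schools = ["S1", "S1", "S2", "S3"]
ethnicities = ["Dutch", "Dutch", "Moroccan", "Turkish"]
print(pct_dyads_same_school(schools), pct_non_dutch(ethnicities))
```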

4.2. Descriptive statistics of dyadic and group characteristics

The dyadic characteristics and the compositional characteristics of classes are summarized in Table 2. The column labeled ‘overall’ shows the characteristics for all of the 36,700 couples in the selection (mixed-gender couples included) and the characteristics of entire classes. This column is added for reference only. As we examine girls’ and boys’ networks separately, we defined the group compositional covariates for both sexes separately. After all, for understanding how group composition constrains sociometric choice, it is not fruitful to include the characteristics of classmates of the other gender when the networks are almost completely separate with respect to gender.

Table 2 gives some insight into the differences between classes regarding the baseline component of homophily. The numbers of boys and girls in a class vary between 10 and 20 in the selected classes, with an average of 13 and 14, respectively. Consequently, in some classes students can only interact with at most 9 classmates of the same gender, whereas in other classes there may be up to 19 others. Also, in some classes the majority of dyads are unfamiliar with each other from primary school, whereas in other classes up to 63% of the dyads had attended the same primary school. Similarly, the heterogeneity of the class with respect to ethnicity varies considerably between classes, although in all classes the majority is Dutch. Finally, the table shows that the standard deviation of the Primary Education Final Test scores within classes (and within the boys’ and girls’ groups in classes) is on average lower than the standard deviation in the entire selection (standard deviation 7.4). This indicates that students are usually grouped together based on their cognitive ability. However, the range of the standard deviations shows that some classes are more homogeneous than others.

Table 2
Descriptive statistics of explanatory variables at dyad and class level (average and range), overall and for boys’ and girls’ networks separately

| | Overall | Boys | Girls |
|---|---|---|---|
| Variables at dyad level | N = 36700 couples | N = 7602 boys’ couples | N = 9964 girls’ couples |
| Same primary school | 13.6% | 16.3% | 16.6% |
| Difference in PEFT^a | Mean 4.5 (S.D. 3.8) | Mean 4.3 (S.D. 3.1) | Mean 4.3 (S.D. 3.1) |
| Different ethnic category | 18.3% | 18.8% | 18.4% |
| Variables at class level | N = 57 class networks | N = 48 boys’ networks | N = 54 girls’ networks |
| Number of students in class | 26 (20–32) | 13 (10–20) | 14 (10–20) |
| Average PEFT in class^a | 538.8 (521.5–548.4) | 539.5 (524.8–548.6) | 538.9 (522.4–548.2) |
| Standard deviation PEFT within class^a | 4.0 (1.5–6.5) | 3.9 (1.3–7.5) | 4.1 (1.3–6.6) |
| Non-Dutch students in class (%) | 10.9 (0.0–42.9) | 10.8 (0.0–33.3) | 11.5 (0.0–50.0) |
| Dyads from same school in class (%) | 13.8 (4.6–63.2) | 16.8 (3.6–62.2) | 17.1 (3.8–62.2) |

^a Primary Education Final Test score (PEFT).

4 As this percentage is lower than 50% in all classes, and in almost all classes considerably lower, it correlates highly with the percentage of dyads in the same ethnic category (r = 0.99 for boys and 0.97 for girls), yet its interpretation is more comprehensible.

5 This information was only available for children who nominated or were nominated by the missing student and concerns responses to the variables ‘How long do you know him/her (the nominated student)?’ in the sociometric questionnaire.

5. Results

5.1. Model 1: network structure

In the first step, a p∗ model was estimated with the network effects choice, mutuality, all 2-stars, cyclicity and transitivity (see Tables 3 and 4). Due to high levels of collinearity or occasional estimation problems in at least one of the models, 13 classes were removed from the analyses for boys (remaining N = 35 classes) and 20 classes for girls (remaining N = 34 classes) for the relation liking. The problems were less severe for co-operation: six and seven classes were removed for boys and girls, respectively (42 boys’ networks and 47 girls’ networks remaining). High collinearity usually concerned at least the variables 2-out-star and density (correlations up to 0.99).
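To make the network effects concrete, the sketch below computes the change statistic of each effect for one ordered pair (i, j), that is, the change in the corresponding network count when the tie from i to j is switched from absent to present. The definitions follow the common textbook form of the logit p∗ model and are an assumption here; the output of PREPSTAR may be defined or scaled differently. The function name `change_statistics` is hypothetical.

```python
def change_statistics(y, i, j):
    """Change statistics for the tie i -> j in the directed 0/1 matrix y."""
    others = [k for k in range(len(y)) if k != i and k != j]
    return {
        "choice": 1,
        "mutuality": y[j][i],
        # Other students who also choose j.
        "2-in-star": sum(y[k][j] for k in others),
        # Other students who are also chosen by i.
        "2-out-star": sum(y[i][k] for k in others),
        # Two-paths i -> j -> k and k -> i -> j.
        "2-mixed-star": sum(y[j][k] + y[k][i] for k in others),
        # Transitive triads in which i -> j is the first, last or shortcut tie.
        "transitivity": sum(
            y[i][k] * y[k][j] + y[i][k] * y[j][k] + y[k][i] * y[k][j]
            for k in others
        ),
        # Cycles i -> j -> k -> i.
        "cyclicity": sum(y[j][k] * y[k][i] for k in others),
    }


# Hypothetical three-student network with ties 0->2, 1->0, 1->2 and 2->1.
y = [
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
stats = change_statistics(y, 0, 1)
print(stats)
```

In a logistic regression of the observed ties on these statistics, each ordered pair contributes one row, so the tie itself never enters its own predictors.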

When we compare the results for the two relations and the two sexes, it appears that the broad outlines are quite similar. For both relations and for both sexes, mutuality, 2-out-stars and transitivity have the strongest effects (considering the value of $T^2$). Accordingly, these tendencies best describe the structure of the networks. The negative average effect ($\hat{\mu}_{WLS}$) of 2-out-stars is substantively not very interesting; it simply reflects the design of the sociometric questionnaire, with a fixed maximum on the number of nominations.
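The quantities reported in Tables 3 and 4 can be illustrated with a small sketch of how per-class estimates might be combined. Section 2 is not reproduced here, so the formulas below are an assumption: a standard random-effects combination with a moment estimator for the between-class variance, which may differ in detail from the procedure actually used. All input values are made up and the function name is hypothetical.

```python
def two_stage_summary(estimates, std_errors):
    """Combine per-class coefficients and standard errors (assumed formulas)."""
    n = len(estimates)
    # T2: tests whether the effect is zero in all classes.
    t2 = sum((e / s) ** 2 for e, s in zip(estimates, std_errors))
    # Between-class variance: observed variance minus mean error variance.
    mean_est = sum(estimates) / n
    observed_var = sum((e - mean_est) ** 2 for e in estimates) / (n - 1)
    mean_error_var = sum(s ** 2 for s in std_errors) / n
    sigma2 = max(0.0, observed_var - mean_error_var)
    # Weighted least squares mean with weights 1 / (sigma2 + s_m^2).
    weights = [1.0 / (sigma2 + s ** 2) for s in std_errors]
    mu = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    se_mu = (1.0 / sum(weights)) ** 0.5
    # Q: tests whether the true effects vary between classes.
    precision = [1.0 / s ** 2 for s in std_errors]
    mu_fixed = sum(p * e for p, e in zip(precision, estimates)) / sum(precision)
    q = sum(p * (e - mu_fixed) ** 2 for p, e in zip(precision, estimates))
    return {"T2": t2, "mu": mu, "se": se_mu, "sigma2": sigma2, "Q": q}


# Hypothetical mutuality estimates and standard errors for five classes.
estimates = [2.1, 2.6, 1.8, 3.0, 2.4]
std_errors = [0.4, 0.5, 0.3, 0.6, 0.45]
print(two_stage_summary(estimates, std_errors))
```

Under these assumptions, $T^2$ would be compared with a chi-squared distribution with as many degrees of freedom as there are classes, and Q with one degree of freedom fewer.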

Table 3
Structure of the relation liking

| Effect | Boys’ networks (N = 35): $T^2$ | $\hat{\mu}_{WLS}$ | S.E. | $\hat{\sigma}^2$ | Q | Girls’ networks (N = 34): $T^2$ | $\hat{\mu}_{WLS}$ | S.E. | $\hat{\sigma}^2$ | Q |
|---|---|---|---|---|---|---|---|---|---|---|
| Choice | 74∗∗ | 0.414 | 0.286 | 1.382 | 73∗∗ | 112∗∗ | 1.362 | 0.354 | 2.414 | 89∗∗ |
| Mutuality | 430∗∗ | 2.380 | 0.209 | 0.932 | 94∗∗ | 368∗∗ | 2.116 | 0.169 | 0.421 | 70∗∗ |
| 2-In-star | 36 | −0.085 | 0.040 | 0.000 | 31 | 46 | −0.012 | 0.041 | 0.007 | 46 |
| 2-Out-star | 371∗∗ | −1.246 | 0.126 | 0.350 | 103∗∗ | 478∗∗ | −1.600 | 0.136 | 0.392 | 110∗∗ |
| 2-Mixed-star | 95∗∗ | −0.276 | 0.054 | 0.031 | 55 | 144∗∗ | −0.321 | 0.059 | 0.052 | 72∗∗ |
| Transitivity | 364∗∗ | 0.819 | 0.063 | 0.053 | 60∗ | 321∗∗ | 0.704 | 0.047 | 0.012 | 47 |
| Cyclicity | 83∗∗ | −0.342 | 0.189 | 0.643 | 77∗∗ | 73∗∗ | −0.074 | 0.158 | 0.396 | 73∗∗ |

Statistic for testing whether the effect is zero in all groups ($T^2$), estimated average effect size ($\hat{\mu}_{WLS}$), standard error associated with the estimated average effect size (S.E.), estimated variance of the effect size between classes ($\hat{\sigma}^2$), statistic for testing whether the variance of the effect is zero (Q) (the parameters and the test statistics are described in Section 2).

∗ P < 0.01. ∗∗ P < 0.001.

Table 4
Structure of the relation co-operation

| Effect | Boys’ networks (N = 42): $T^2$ | $\hat{\mu}_{WLS}$ | S.E. | $\hat{\sigma}^2$ | Q | Girls’ networks (N = 46): $T^2$ | $\hat{\mu}_{WLS}$ | S.E. | $\hat{\sigma}^2$ | Q |
|---|---|---|---|---|---|---|---|---|---|---|
| Choice | 92∗∗ | −0.426 | 0.201 | 0.685 | 77∗∗ | 98∗∗ | 0.596 | 0.213 | 0.850 | 90∗∗ |
| Mutuality | 498∗∗ | 2.132 | 0.141 | 0.373 | 78∗∗ | 461∗∗ | 1.895 | 0.147 | 0.498 | 99∗∗ |
| 2-In-star | 72∗ | −0.173 | 0.042 | 0.011 | 55 | 130∗∗ | −0.233 | 0.060 | 0.090 | 121∗∗ |
| 2-Out-star | 293∗∗ | −0.844 | 0.068 | 0.060 | 71∗ | 516∗∗ | −1.201 | 0.103 | 0.315 | 141∗∗ |
| 2-Mixed-star | 95∗∗ | −0.234 | 0.036 | 0.000 | 54 | 200∗∗ | −0.305 | 0.055 | 0.075 | 108∗∗ |
| Transitivity | 425∗∗ | 0.851 | 0.061 | 0.064 | 80∗∗ | 528∗∗ | 0.827 | 0.065 | 0.112 | 122∗∗ |
| Cyclicity | 88∗∗ | −0.551 | 0.145 | 0.342 | 70∗ | 143∗∗ | −0.120 | 0.173 | 0.881 | 143∗∗ |

Symbols: see footnote Table 3.

∗ P < 0.01. ∗∗ P < 0.001.

For mutuality, the large positive values of the estimated average effect size indicate that there is a strong tendency toward mutuality. The effect seems to be a bit stronger for liking than for co-operation and a bit stronger for boys than for girls. The odds that a tie is present versus absent are on average between $e^{1.895} \approx 7$ (in girls’ co-operation networks) and $e^{2.380} \approx 11$ times greater (in boys’ liking networks) when the potential choice is reciprocated (i.e. when $y_{ji} = 1$), controlled for the other effects. When we do not control for other effects, but consider the bivariate association only, it appears that ‘being liked’ multiplies the odds six to seven times. The variance of the mutuality parameter between classes is considerable, suggesting that mutual relationships are more common in some classes than in others. Nevertheless, the 95% confidence intervals ($\hat{\mu} \pm 2\sqrt{\hat{\sigma}^2}$) show that the mutuality effect is positive in at least 97.5% of the classes.

In all four analyses presented in Tables 3 and 4, we find quite strong positive average effect sizes of transitivity and (weaker) negative average effect sizes of cyclicity and 2-mixed-stars. The strong positive effect of transitivity in the presence of the negative lower order effect of mixed-stars suggests that two-paths as part of transitive triads are far more likely to occur than intransitive two-paths. Intransitive two-paths in their turn seem to be more likely than two-paths as part of cycles, as the negative effect of cyclicity indicates. These results suggest that relationships between students are hierarchically structured, which seems to be most perceptible for the co-operation networks of boys.
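The odds ratios and the approximate ranges quoted for mutuality can be reproduced from the table entries. The sketch below uses the mutuality rows of Tables 3 and 4; the helper names are hypothetical, and the interval is simply the average effect plus or minus two between-class standard deviations.

```python
import math


def odds_ratio(coefficient):
    """Multiplicative change in the odds of a tie for a one-unit increase."""
    return math.exp(coefficient)


def approximate_range(mu, sigma2):
    """Interval mu +/- 2 * sqrt(sigma2) for the class-specific effects."""
    half_width = 2.0 * math.sqrt(sigma2)
    return mu - half_width, mu + half_width


# Mutuality: girls' co-operation (Table 4) and boys' liking (Table 3).
girls_cooperation = {"mu": 1.895, "sigma2": 0.498}
boys_liking = {"mu": 2.380, "sigma2": 0.932}

for label, row in [("girls, co-operation", girls_cooperation),
                   ("boys, liking", boys_liking)]:
    low, high = approximate_range(row["mu"], row["sigma2"])
    print(label, round(odds_ratio(row["mu"]), 1), round(low, 2), round(high, 2))
```

In both cases the lower bound stays above zero, which is what the statement about at least 97.5% of the classes relies on.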

However, this general pattern is not observed across all classrooms, since there is some variability in the structure. In particular, the effect of cyclicity varies considerably between classes. The variance of transitivity and mixed-stars is less substantial but it deviates significantly from zero as well, as the Q-statistic indicates. The confidence intervals show that the coefficients of transitivity have positive values in at least 97.5% of the classes, so the direction of the effect is the same in the greater part of the classes. Cyclicity on the other hand has a negative value in some classes but a positive one in a considerable number of others. If we look at the results of the logit p∗ analyses in individual classes, it turns out that only about 40% of the networks display the full hierarchical pattern described above, i.e. the presence of at the same time a positive coefficient for transitivity and negative coefficients for cyclicity and mixed-stars. This percentage is only considerably higher for the co-operation networks of boys (where this pattern is observed in 64% of the classes).

Table 5
Effects of dyadic similarity on liking, controlled for network effects

| Effect | Boys’ networks (N = 35): $T^2$ | $\hat{\mu}_{WLS}$ | S.E. | $\hat{\sigma}^2$ | Q | Girls’ networks (N = 34): $T^2$ | $\hat{\mu}_{WLS}$ | S.E. | $\hat{\sigma}^2$ | Q |
|---|---|---|---|---|---|---|---|---|---|---|
| Network effects | | | | | | | | | | |
| Choice | 66∗ | 0.434 | 0.279 | 0.683 | 63∗ | 140∗∗ | 2.350 | 0.509 | 5.327 | 106∗∗ |
| Mutuality | 319∗∗ | 2.084 | 0.202 | 0.770 | 80∗∗ | 287∗∗ | 1.839 | 0.185 | 0.519 | 76∗∗ |
| 2-In-star | 33 | −0.089 | 0.042 | 0.000 | 29 | 52 | −0.030 | 0.046 | 0.012 | 52 |
| 2-Out-star | 352∗∗ | −1.279 | 0.131 | 0.360 | 101∗∗ | 483∗∗ | −1.857 | 0.159 | 0.525 | 118∗∗ |
| 2-Mixed-star | 93∗∗ | −0.276 | 0.057 | 0.034 | 57∗ | 151∗∗ | −0.388 | 0.068 | 0.070 | 76∗∗ |
| Transitivity | 330∗∗ | 0.803 | 0.062 | 0.045 | 55 | 285∗∗ | 0.702 | 0.055 | 0.025 | 56∗ |
| Cyclicity | 79∗∗ | −0.368 | 0.189 | 0.600 | 72∗∗ | 72∗∗ | 0.006 | 0.168 | 0.428 | 72∗∗ |
| Dyadic attributes | | | | | | | | | | |
| Same school | 63∗ | 0.748 | 0.146 | 0.033 | 35 | 65∗ | 0.803 | 0.133 | 0.000 | 27 |
| Difference in performance | 35 | 0.016 | 0.018 | 0.000 | 35 | 65∗ | −0.043 | 0.026 | 0.006 | 62∗ |
| Difference in ethnicity^a | 26 | 0.009 | 0.157 | 0.000 | 26 | 36 | −0.257 | 0.149 | 0.000 | 33 |

Symbols: see footnote Table 3.

^a For the attribute ‘different ethnicity’ N = 23 boys’ networks and N = 24 girls’ networks.

∗ P < 0.01. ∗∗ P < 0.001.
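The criterion for the ‘full hierarchical pattern’ in Model 1 (a positive transitivity coefficient together with negative coefficients for cyclicity and mixed-stars) is easy to apply to per-class estimates. The sketch below uses invented coefficients for three classes; the function and key names are hypothetical.

```python
def is_full_hierarchical(coefs):
    """True if a class shows the full hierarchical pattern described in the text."""
    return (
        coefs["transitivity"] > 0
        and coefs["cyclicity"] < 0
        and coefs["2-mixed-star"] < 0
    )


def share_hierarchical(classes):
    """Fraction of classes that display the full hierarchical pattern."""
    return sum(is_full_hierarchical(c) for c in classes) / len(classes)


# Hypothetical per-class coefficients (not taken from the study).
classes = [
    {"transitivity": 0.8, "cyclicity": -0.4, "2-mixed-star": -0.3},
    {"transitivity": 0.7, "cyclicity": 0.2, "2-mixed-star": -0.2},
    {"transitivity": 0.9, "cyclicity": -0.1, "2-mixed-star": 0.1},
]
print(share_hierarchical(classes))
```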

5.2. Model 2: structural effects and dyadic attributes

Next, the dyadic similarity attributes (primary school attended, performance level and ethnicity) were added to the logistic regression models. The network effects introduced in the first model now serve as control variables. The results are presented in Tables 5 and 6 for the relations liking and co-operation, respectively.

The tables show that the average effect of same primary school is substantial for both boys’ and girls’ networks on both relations. Its significant positive value indicates that students who have attended the same primary school are more attracted to each other than students who have not. The odds of a tie being present rather than absent for couples of students who have attended the same primary school are about twice as high as the odds for those who have not, controlled for the other effects (bivariate association showed that the odds are about three times greater without controlling for other effects).

Table 6
Effects of dyadic similarity on co-operation, controlled for network effects

| Effect | Boys’ networks (N = 42): $T^2$ | $\hat{\mu}_{WLS}$ | S.E. | $\hat{\sigma}^2$ | Q | Girls’ networks (N = 46): $T^2$ | $\hat{\mu}_{WLS}$ | S.E. | $\hat{\sigma}^2$ | Q |
|---|---|---|---|---|---|---|---|---|---|---|
| Network effects | | | | | | | | | | |
| Choice | 86∗∗ | −0.448 | 0.237 | 0.906 | 75∗∗ | 120∗∗ | 1.116 | 0.279 | 1.663 | 101∗∗ |
| Mutuality | 386∗∗ | 1.900 | 0.156 | 0.498 | 86∗∗ | 350∗∗ | 1.639 | 0.156 | 0.561 | 101∗∗ |
| 2-In-star | 76∗ | −0.183 | 0.048 | 0.022 | 61 | 128∗∗ | −0.243 | 0.063 | 0.097 | 118∗∗ |
| 2-Out-star | 282∗∗ | −0.877 | 0.073 | 0.069 | 73∗ | 525∗∗ | −1.351 | 0.114 | 0.389 | 142∗∗ |
| 2-Mixed-star | 97∗∗ | −0.241 | 0.042 | 0.008 | 58 | 218∗∗ | −0.358 | 0.061 | 0.095 | 113∗∗ |
| Transitivity | 383∗∗ | 0.847 | 0.072 | 0.107 | 90∗∗ | 481∗∗ | 0.830 | 0.069 | 0.127 | 122∗∗ |
| Cyclicity | 82∗∗ | −0.583 | 0.144 | 0.286 | 62 | 142∗∗ | −0.095 | 0.181 | 0.958 | 141∗∗ |
| Dyadic attributes | | | | | | | | | | |
| Same school | 93∗∗ | 0.886 | 0.117 | 0.000 | 36 | 113∗∗ | 0.958 | 0.109 | 0.000 | 35 |
| Difference in performance | 37 | 0.008 | 0.017 | 0.000 | 37 | 58 | −0.037 | 0.016 | 0.000 | 52 |
| Difference in ethnicity^a | 29 | −0.078 | 0.140 | 0.000 | 29 | 43 | −0.244 | 0.145 | 0.081 | 39 |

Symbols: see footnote Table 3.

^a For the attribute ‘different ethnicity’ N = 28 boys’ networks and N = 30 girls’ networks.

∗ P < 0.01. ∗∗ P < 0.001.

The two other dyadic attributes, difference in performance level and ethnicity, appeared to be far less important. For girls, there is a small but significant average effect of difference in performance level on co-operation ($|t| = 0.037/0.016 = 2.3$, P < 0.05). This result suggests that girls prefer to co-operate with others who are similar to themselves in prior performance. The odds of a tie being present decrease on average by 4% ($e^{-0.037} = 0.96$) for every unit increase in difference in performance level. A difference of five score points thus leads to a total decrease of 17% (the odds multiplied by $e^{-0.185} = 0.83$). We find no effect of performance level on girls’ liking relations or in the two types of networks of boys. Bivariate association confirms that a dyad’s similarity in performance level is hardly related to the presence of a tie. Dyads that have a tie are on average slightly more similar in performance level than dyads that are not related. Even though this is a significant association (P < 0.001) in the (large) overall selection of same gender couples, it is not significant in individual classes.
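The arithmetic behind the effect of difference in performance level on girls’ co-operation (average effect −0.037, S.E. 0.016, Table 6) can be written out step by step. The five-point case simply multiplies the coefficient by five before exponentiating.

```latex
\begin{align*}
|t| &= \frac{0.037}{0.016} \approx 2.3, \\
e^{-0.037} &\approx 0.96 \quad \text{(a decrease in the odds of about 4\% per score point)}, \\
e^{-5 \times 0.037} &= e^{-0.185} \approx 0.83 \quad \text{(a decrease of about 17\% for five score points)}.
\end{align*}
```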

Since the former model displayed a tendency toward transitivity, and toward hierarchical relations particularly in the co-operation networks of boys, it would be a plausible assumption that it is not the absolute difference in performance level between student i and student j that matters, but the direction of this difference. For instance, students might prefer co-operation partners who are at a higher level of educational performance. We investigated this in a separate analysis for the co-operation networks of boys, by including the real (signed) instead of the absolute difference in performance level between student i and student j. It appeared that there is no tendency to nominate ‘upward’ (nor ‘downward’) with respect to educational performance. This is confirmed by bivariate association: about 5% of the present ties concern couples of students who performed equally well, 48% concern couples of which the receiver of the tie had a lower test score than the sender, and 46% concern couples of which the receiver had a higher test score.

In contrast to other studies (e.g. Baerveldt et al., 1999; Foster et al., 1996; Shrum et al., 1988), the effect of ethnicity is not significant. Note that the number of classes on which the estimation of the ethnicity parameter is based is lower than the number of classes for the other parameters. Classes that were excluded from the analyses of ethnicity were totally homogeneous with respect to ethnicity (all students were Dutch). Obviously, the ethnicity parameter is redundant for these classes and was left out.

The variance of all three dyadic covariates is very small, suggesting that the effects (or the lack of effects) are consistent over classes. Only the effect of difference in performance level on girls’ liking relations differed significantly between classes (P < 0.01), although the average fixed effect was not significant and the variance is small. Inspection of the results from logit p∗ analyses for individual classes shows that this is due to the significance of the fixed effect in four classes (three classes P < 0.01; one class P < 0.05).

5.3. Model 3: group composition and network structure

Finally, the group compositional variables were added as class level covariates for explaining the effects of mutuality, transitivity and cyclicity and the dyadic covariates (see Eq. (18) in Section 2) that were reported in Tables 5 and 6. All class level covariates were standardized.

The results of the analyses of mutuality, transitivity and cyclicity are presented in Table 7. Table 7 shows that none of the group composition variables have a structural impact on the social networks of boys or girls. The variance decreases by about 20% in comparison with the variance of the parameters in model 2 (presented in Tables 5 and 6), but the individual effects are all insignificant.

Table 7
Effects of group composition (regression coefficients and standard errors) on mutuality, transitivity and cyclicity

| | Liking: boys’ networks (N = 35) | Liking: girls’ networks (N = 34) | Co-operation: boys’ networks (N = 42) | Co-operation: girls’ networks (N = 46) |
|---|---|---|---|---|
| Mutuality; intercept | 2.084 (0.195) | 1.840 (0.175) | 1.896 (0.156) | 1.607 (0.150) |
| Group size | −0.204 (0.186) | 0.218 (0.182) | −0.034 (0.158) | 0.169 (0.157) |
| Average PEFT^a | −0.148 (0.225) | 0.119 (0.214) | −0.053 (0.171) | 0.066 (0.188) |
| Heterogeneity PEFT^a | −0.154 (0.255) | 0.161 (0.208) | −0.014 (0.205) | 0.073 (0.184) |
| Non-Dutch students (%) | 0.244 (0.226) | −0.236 (0.171) | 0.140 (0.191) | −0.148 (0.160) |
| Dyads from same primary school (%) | −0.035 (0.199) | −0.318 (0.202) | −0.059 (0.169) | −0.240 (0.174) |
| Variance | 0.657 | 0.388 | 0.479 | 0.469 |
| Transitivity; intercept | 0.794 (0.058) | 0.718 (0.057) | 0.843 (0.068) | 0.818 (0.066) |
| Group size | 0.059 (0.056) | −0.024 (0.058) | 0.131 (0.069) | −0.008 (0.071) |
| Average PEFT^a | −0.090 (0.070) | 0.007 (0.071) | 0.079 (0.077) | 0.105 (0.087) |
| Heterogeneity PEFT^a | −0.067 (0.074) | 0.011 (0.067) | −0.001 (0.091) | −0.011 (0.082) |
| Non-Dutch students (%) | −0.058 (0.071) | −0.025 (0.055) | 0.042 (0.083) | 0.046 (0.073) |
| Dyads from same primary school (%) | −0.069 (0.060) | 0.029 (0.068) | 0.113 (0.080) | 0.111 (0.073) |
| Variance | 0.028 | 0.026 | 0.081 | 0.103 |
| Cyclicity; intercept | −0.335 (0.177) | −0.004 (0.166) | −0.597 (0.141) | −0.070 (0.164) |
| Group size | −0.086 (0.164) | −0.064 (0.173) | 0.065 (0.142) | 0.088 (0.176) |
| Average PEFT^a | 0.387 (0.206) | 0.224 (0.202) | 0.123 (0.157) | −0.254 (0.208) |
| Heterogeneity PEFT^a | 0.190 (0.229) | 0.086 (0.198) | 0.168 (0.185) | 0.131 (0.201) |
| Non-Dutch students (%) | −0.106 (0.206) | 0.130 (0.172) | −0.005 (0.172) | −0.014 (0.180) |
| Dyads from same primary school (%) | −0.040 (0.177) | 0.105 (0.173) | −0.148 (0.158) | −0.362 (0.187) |
| Variance | 0.429 | 0.375 | 0.229 | 0.672 |

^a Primary Education Final Test (PEFT).

We may conclude that although classes vary considerably in the occurrence of mutuality, cyclicity and to a lesser extent transitivity, the variance cannot be explained by group size and composition in terms of performance, ethnicity and familiarity from primary school. Furthermore, when we differentiated between the classes with a hierarchical relational pattern (i.e. a positive coefficient for transitivity and negative ones for cyclicity and mixed-stars) and the classes in which such a pattern was not observed (using independent samples t-tests), no significant differences in the composition variables were found.

For both relations, none of the interaction effects of the group composition variables with their dyad level counterparts are significant. This is hardly surprising, since Tables 5 and 6 already showed that the variance between classes of the dyadic attributes is negligible. Thus, similarity in performance level is as unimportant for sociometric choice in classes that are heterogeneous with respect to prior performance as it is in homogeneous classes. Likewise, interethnic choices are as likely to occur in classes with relatively many ethnic minority students as in classes with few. Furthermore, students are not only more attracted to children they knew from primary school when most of their classmates are new to them, but also when many faces in their new class are already familiar to them. It can be concluded that group composition influences the baseline component of similarity, but it does not affect the preferences of students.

6. The two stages integrated: a random coefficient model

So far, the logit p∗ model was estimated using a two-stage procedure in order to model the variability within classes as well as between classes. First, each class network was analyzed separately (Eq. (3)); then, the estimated coefficients $\hat{\theta}_m$ from the separate analyses were the outcomes in a regression analysis at class level (Eqs. (14) and (18)). These estimated coefficients include a statistical error $E_m$, and a rather technical procedure was needed to differentiate between the true variability and the error variability. Had Eqs. (14) and (18) been specified for the true coefficients $\theta_m$ (for Eq. (14), $\theta_m = \mu_\theta + U_m$ and for Eq. (18), $\theta_m = \gamma_0 + \sum_h \gamma_h W_{hm} + U_m$), substituting them into Eq. (3) would yield an integrative multilevel model (see Bryk and Raudenbush, 1992; Snijders and Bosker, 1999) that can be estimated in a single stage. The result is a kind of logistic regression model, where the data of all classes are analyzed jointly. The dependent variable is $Y_{ijm}$, the observed value of a relational tie directed from student i to student j in class m. To represent the class level, each class is characterized by its own values for the intercept and the regression slopes in this model; hence the within-group variability can be distinguished from the between-group variability.
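The substitution described above can be written out as a sketch. Eq. (3) is not reproduced in this section, so the logit form in the first line is an assumption based on the usual statement of the logit p∗ model. The index $k$ for the separate effects and the symbol $d_{k,ijm}$ for the change statistic or dyadic covariate of effect $k$ for the pair $(i, j)$ in class $m$ are notation introduced here for illustration only.

```latex
\begin{align*}
\operatorname{logit} P\bigl(Y_{ijm} = 1 \mid \text{rest of network } m\bigr)
  &= \sum_{k} \theta_{km}\, d_{k,ijm}, \\
\theta_{km} &= \gamma_{k0} + \sum_{h} \gamma_{kh} W_{hm} + U_{km}, \\
\operatorname{logit} P\bigl(Y_{ijm} = 1 \mid \text{rest of network } m\bigr)
  &= \sum_{k} \gamma_{k0}\, d_{k,ijm}
   + \sum_{k} \sum_{h} \gamma_{kh}\, W_{hm}\, d_{k,ijm}
   + \sum_{k} U_{km}\, d_{k,ijm}.
\end{align*}
```

In the last line the first sum holds the fixed average effects, the second the cross-level interactions with the class level covariates $W_{hm}$, and the third the random slopes that capture the remaining between-class variability.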

In principle, it is possible to estimate these models using a random-effects logistic regression model such as implemented in the program MIXOR (Hedeker and Gibbons, 1996). The estimation of the regression coefficients would be statistically more efficient, and the model would solve some problems we encountered in the two-stage procedure. First, high collinearity within some of the groups is not a problem in a full multilevel model, so the data of all classes could be used for the analyses. Secondly, the number of parameters in the reported models was relatively large in proportion to the number of couples in smaller classes. The random coefficient model enables us to achieve a higher level of statistical power.

Table 8
Estimates for random coefficient model, with a random slope for mutuality

| | Liking, boys | Liking, girls | Co-operation, boys | Co-operation, girls |
|---|---|---|---|---|
| N networks | 48 | 54 | 48 | 54 |
| N couples | 7602 | 9964 | 7602 | 9964 |
| Network effects | | | | |
| Choice | 0.366 (0.189) | 1.736 (0.153) | −0.924 (0.145) | 0.011 (0.175) |
| Mutuality | 2.222 (0.153) | 1.994 (0.131) | 1.996 (0.119) | 1.721 (0.120) |
| 2-In-star | −0.007 (0.033) | 0.008 (0.030) | −0.120 (0.048) | −0.066 (0.025) |
| 2-Out-star | −1.186 (0.034) | −1.561 (0.025) | −0.703 (0.046) | −0.888 (0.034) |
| 2-Mixed-star | −0.265 (0.028) | −0.318 (0.021) | −0.199 (0.033) | −0.308 (0.023) |
| Transitivity | 0.712 (0.034) | 0.695 (0.026) | 0.796 (0.034) | 0.704 (0.028) |
| Cyclicity | −0.380 (0.055) | −0.283 (0.050) | −0.494 (0.086) | −0.089 (0.050) |
| Dyadic attributes | | | | |
| Same primary school | 0.667 (0.120) | 0.800 (0.128) | 0.706 (0.128) | 0.841 (0.159) |
| Difference of performance | −0.002 (0.016) | −0.036 (0.014) | 0.000 (0.017) | −0.019 (0.017) |
| Different ethnicity | −0.104 (0.143) | −0.191 (0.094) | −0.106 (0.142) | −0.270 (0.139) |
| Random effects | | | | |
| Random intercept | 0.245 | 0.405 | 0.141 | 0.256 |
| Random slope mutuality | 0.339 | 0.366 | 0.148 | 0.353 |
| Intercept–slope covariance | −0.200 | −0.289 | −0.137 | −0.206 |

For the fixed effects, regression coefficients and standard errors (in parentheses) are presented; for the random effects, the variance components.
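The intercept–slope covariances in Table 8 are easier to compare across columns when rescaled to correlations. The following is a minimal sketch, assuming that the rows "Random intercept" and "Random slope mutuality" are variances (as the table note suggests). The function and variable names are illustrative and not part of the original analysis.

```python
import math

# Variance components copied from Table 8, per relation and gender:
# (intercept variance, mutuality-slope variance, intercept-slope covariance).
# Treating the first two entries as variances is an assumption based on the table note.
TABLE8_RANDOM = {
    ("liking", "boys"): (0.245, 0.339, -0.200),
    ("liking", "girls"): (0.405, 0.366, -0.289),
    ("co-operation", "boys"): (0.141, 0.148, -0.137),
    ("co-operation", "girls"): (0.256, 0.353, -0.206),
}


def intercept_slope_correlation(var_intercept, var_slope, covariance):
    """Rescale a covariance to a correlation using the two variances."""
    return covariance / math.sqrt(var_intercept * var_slope)


correlations = {
    key: intercept_slope_correlation(*components)
    for key, components in TABLE8_RANDOM.items()
}

for (relation, gender), r in correlations.items():
    print(f"{relation:13s} {gender:5s} r = {r:.2f}")
```

Under this reading, all four correlations are negative, which would mean that classes with a higher random intercept tend to show a weaker mutuality effect. This interpretation is only as reliable as the variance estimates themselves.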

The disadvantage of this model is its complexity. Estimating the variance of all of the specified network configurations and dyadic similarity covariates in a single analysis would yield a model incorporating 7 (model 1) or even 10 (model 2) random slopes, which may be correlated within groups. Such models are very computer-intensive (see Hedeker and Gibbons, 1996). Although MIXOR allows specification of up to eight random effects, models incorporating three or more random effects could not be estimated for these data.
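To make the growth in complexity concrete, a generic random coefficient logistic model can be written as follows. This is a sketch in our own notation, which the paper does not use: $Y_{ijk}$ is the tie from student $i$ to student $j$ in class $k$, $d_{h,ijk}$ is the $h$-th change statistic or dyadic covariate, $u_{hk}$ is the class-level deviation for effect $h$ (set to zero for effects kept fixed), and $q$ is the number of random effects.

```latex
\begin{align}
\operatorname{logit}\Pr(Y_{ijk}=1)
  &= (\beta_0 + u_{0k}) + \sum_{h=1}^{H} (\beta_h + u_{hk})\, d_{h,ijk}, \\
\mathbf{u}_k
  &\sim N(\mathbf{0}, \boldsymbol{\Omega}), \\
\#\{\text{free elements of } \boldsymbol{\Omega}\}
  &= \frac{q(q+1)}{2}.
\end{align}
```

With $q = 2$ (a random intercept and one random slope) there are 3 variance and covariance parameters. With $q = 7$ or $q = 10$ the count rises to 28 or 55, and at the MIXOR maximum of $q = 8$ it is 36.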

As an illustration of the random coefficient model, we present a model incorporating the fixed effects of model 2 and only two random effects: the random intercept and a random slope for mutuality, as this variable had the largest fixed effect (see Table 8). For this aim, the observed ties, the network change statistics produced by PREPSTAR and the dyadic similarity variables of the couples in all classes were arranged in a single file, along with a class indicator.
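The data arrangement described above can be sketched as follows. This is not PREPSTAR and does not reproduce its output format. It is a hypothetical re-creation for four of the simpler change statistics, with two made-up toy classes. The dyadic similarity variables and the triadic statistics (transitivity, cyclicity) are omitted for brevity, and all function and column names are assumptions.

```python
import csv
import io


def change_statistics(adj, i, j):
    """Change statistics for the tie i -> j in a directed 0/1 adjacency matrix.

    Each value is the increase in a network statistic when the tie i -> j
    goes from absent to present, with all other ties held fixed.
    """
    others = [k for k in range(len(adj)) if k not in (i, j)]
    return {
        "choice": 1,                                  # one extra tie
        "mutuality": adj[j][i],                       # reciprocated if j -> i exists
        "out2star": sum(adj[i][k] for k in others),   # other choices made by i
        "in2star": sum(adj[k][j] for k in others),    # other choices received by j
    }


def stack_classes(classes):
    """Return one row per ordered couple, over all classes, with a class indicator."""
    rows = []
    for class_id, adj in classes.items():
        n = len(adj)
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                row = {"class_id": class_id, "sender": i, "receiver": j,
                       "tie": adj[i][j]}
                row.update(change_statistics(adj, i, j))
                rows.append(row)
    return rows


# Hypothetical toy networks, not data from the study.
toy_classes = {
    "A": [[0, 1, 1],
          [1, 0, 0],
          [0, 0, 0]],
    "B": [[0, 1],
          [0, 0]],
}

rows = stack_classes(toy_classes)

# Write the stacked rows to a single CSV "file" (kept in memory here).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

A file of this shape, with `tie` as the response, the change statistics as predictors and `class_id` as the grouping variable, is the kind of input a random-effects logistic regression program expects.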

First, it should be noted that the results in Table 8 are based on considerably larger numbers of classes than the numbers of classes presented in Tables 5 and 6. As mentioned before, the reason is that no classes had to be excluded due to collinearity or estimation problems. The fact that the model was estimated on a larger number of classes, substantially larger for the relation liking, may obviously affect the effect sizes.

Furthermore, it can be noticed from Table 8 that the standard errors are much smaller than those presented in former tables. The standard errors in the logit p∗ model are known to be underestimated and hence significance of effects overestimated (Snijders, 2002). In Tables 3–6, standard errors were not related to the estimation of p∗ coefficients themselves, but to the estimation of the average effect size. This is not the case for the standard errors in Table 8, which are presumably underestimated.

However, in broad outlines, the findings seem to be quite consistent with those from the two-stage procedure. Again, mutuality has the strongest average effect size, and the positive effect of transitivity in the presence of the negative effect of cyclicity indicates a tendency toward hierarchical relationships. Furthermore, having attended the same primary school increases the probability of a tie being present. In contrast to the former tables, ethnicity and performance level are now significant for girls on the relation liking (P < 0.05). This seems to be due to the underestimation of standard errors in Table 8, since the average effect sizes did not increase. Similarity in performance level is no longer a significant predictor for girls’ co-operation relations.
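The significance statements above can be checked roughly against the coefficients and standard errors in Table 8. The sketch below assumes a two-sided Wald test with a standard normal reference distribution, which the paper does not state explicitly. Because the standard errors are presumably underestimated, the resulting P-values are likely too small.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided normal P-value."""
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# (estimate, standard error) pairs copied from Table 8, girls only.
TABLE8_GIRLS = {
    ("liking", "difference of performance"): (-0.036, 0.014),
    ("liking", "different ethnicity"): (-0.191, 0.094),
    ("co-operation", "difference of performance"): (-0.019, 0.017),
}

results = {key: wald_test(*pair) for key, pair in TABLE8_GIRLS.items()}

for (relation, effect), (z, p_value) in results.items():
    flag = "P < 0.05" if p_value < 0.05 else "n.s."
    print(f"{relation:13s} {effect:26s} z = {z:6.2f}  {flag}")
```

Under this assumption, both dyadic effects for girls' liking fall below the 0.05 threshold, while difference of performance for girls' co-operation does not, which matches the description in the text.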

7. Conclusion and discussion

The empirical objective of this paper was to describe the structure of social relationships within school classes and to study differences in network structure between classes. For this aim, three research questions were formulated. The first question was whether similarity in performance level, gender, ethnicity and same primary school attended affect sociometric choice of 13-year-old students within school classes. For the two relations observed, liking and co-operation, similarity in performance level appeared to be unimportant, except for a small effect on co-operation choices of girls, yet this effect was not confirmed when the random coefficient model was used based on more classes. Apparently, association with classmates is not based on similarity in performance level. When data from more measurements become available, it will be possible to investigate whether related students influence each other’s performance level over time. Sociometric choices appeared to be strongly based on similarity in gender, and same primary school turned out to be quite important as well. In comparison with other studies, it is remarkable that ethnicity was not important. Unfortunately, the set of non-Dutch students is too small to distinguish effects for different ethnic groups. This is due to the fact that the selection contained no schools in the four main cities of The Netherlands, and relatively few classes in the lower levels of secondary education, which are ethnically more diverse. We need to examine the relation between ethnicity and network positions more closely in the larger sample in order to reach a meaningful interpretation of this effect.

Secondly, we investigated whether classes differ in the extent to which network effects occur and similarity matters, and thirdly, if they do, whether the differences between classes can be explained by compositional features of classes. To start with similarity, the effects (or lack thereof) of performance level, ethnicity and primary school attended are extremely consistent over classes. As there is hardly any variance between classes, the group composition appeared to be unimportant for explaining the effects. This finding implies that although the group composition clearly affects the baseline component of homophily in classes, it does not affect students’ preference for similarity in terms of the investigated variables. Thus, the two tendencies expressed in the contact hypothesis (which predicts that similarity becomes a less important selection criterion with increasing diversity) and the social identity hypothesis (similarity becomes a more important selection criterion) discussed in Section 2 are either both present, suggesting that the two tendencies counteract each other, or they are both simply absent. Also, the average performance level in classes does not affect the salience of performance level.

With respect to the other structural effects, it appeared that for both liking and co-operation, mutual and transitive relations are very common in school classes. Interestingly, the extent to which mutual and transitive relationships occur varies substantially between classes. Again, the variance could not be explained by characteristics of the group composition. Apparently, the differences between classes are not determined by group composition in socio-demographic or educational performance terms. In order to understand the meaning of these structural tendencies and of differences between classes in the occurrence of these tendencies, it would be useful to relate these findings to characteristics of students’ personality and measures of the social climate in classes. This would illuminate whether hierarchical relations in classes are associated with negative or positive classroom interactions.

The methodological objective of this paper was to apply the logit p∗ model in a multilevel context. The two-stage method seems to be fruitful for answering the complex research questions. In broad outlines, the results of the p∗ model as summarized over all the classes matched the impression we had from simple bivariate associations. The fact that the findings from different models are quite consistent also adds to our confidence in the method. However, the use of pseudo-likelihood for the estimation of the p∗ model has been criticized for its statistical properties, which are said to be not well understood. It is known that the standard errors estimated in p∗ models are too low, which means that significance is overestimated in individual classes. It has been suggested that maximum likelihood estimation may be possible by using Markov chain Monte Carlo methods. When such a method is available, we intend to use it in the future.

Applying p∗ models to sociometric data with a fixed maximum number of names may produce large negative coefficients for the statistic of 2-out-stars. Relatively many classes had to be excluded for this reason. Also, the models that are tested in this paper are rather simple and it would be useful to extend them, but the number of parameters in proportion to the number of same-gender couples in classes limits the possibilities to include more variables. It would be worthwhile to further develop a random coefficient model for p∗.

Acknowledgements

Special thanks go to Tom Snijders for many valuable suggestions and comments during the preparation of this paper, and to Bert Creemers, Greetje Van Der Werf and two anonymous reviewers for helpful comments on an earlier draft. This work was supported by The Netherlands Organisation for Scientific Research (NWO) grant 411-21-703.

References

Aboud, F.E., Mendelson, M.J., 1996. Determinants of friendship selection and quality: developmental perspectives. In: Bukowski, W.M., Newcomb, A.F., Hartup, W.W. (Eds.), The Company They Keep. Friendship in Childhood and Adolescence. Cambridge University Press, Cambridge, pp. 87–112.


Allport, G.W., 1954. The Nature of Prejudice. Addison-Wesley, Cambridge.

Anderson, C.J., Wasserman, S., Crouch, B., 1999. A p∗ primer: logit models for social networks. Social Networks 21, 37–66.

Ardu, S., 1995. PREPSTAR Manual. Unpublished Document.

Baerveldt, C., Van Hemert, D., Van Duijn, M., 1999. Etnische grenzen in de klas? De invloed van etniciteit en de etnische samenstelling van MAVO-4 leerjaren op onderlinge sociale relaties van leerlingen. (Ethnic boundaries in the class? The influence of ethnicity and ethnic composition on interpersonal social relationships within schools.) In: Proceedings of the Paper Presented at the Tweede Marktdag Sociologie in Utrecht.

Bandura, A., 1977. Social Learning Theory. Prentice Hall, New York.

Bryk, A.S., Raudenbush, S.W., 1992. Hierarchical Linear Models: Applications and Data Analysis Methods. Sage, Newbury Park, CA.

Byrne, D., 1971. The Attraction Paradigm. Academic Press, New York.

CBS/GION, 1999. Masterplan VOCL’99. Unpublished Document.

Crouch, B., Wasserman, S., Contractor, N., 1998. A practical guide to fitting p∗ social network models via logistic regression. Connections 21, 87–101.

Foster, S.L., Martinez Jr., C.R., Kulberg, A.M., 1996. Race, ethnicity, and children’s peer relations. In: Ollendick, Th.H., Prinz, R.J. (Eds.), Advances in Clinical Child Psychology. Plenum Press, New York, pp. 133–172.

Frank, O., Strauss, D., 1986. Markov graphs. Journal of the American Statistical Association 81, 832–842.

Goldstein, H., 1995. Multilevel Statistical Models. Edward Arnold, London.

Goldstein, H., Rasbash, J., Plewis, I., Draper, D., Browne, W., Yang, M., Woodhouse, G., Healy, M., 1998. A User’s Guide to MLwiN. Multilevel Models Project. Institute of Education, University of London, London.

Hallinan, M.T., Smith, S.S., 1989. Classroom characteristics and student friendship cliques. Social Forces 67, 4.

Hartup, W.W., 1993. Adolescents and their friends. In: Laursen, B. (Ed.), Close Friendships in Adolescence. New Directions for Child Development, vol. 60. Jossey-Bass, San Francisco, pp. 3–22.

Hedeker, D., Gibbons, R.D., 1996. MIXOR: a computer program for mixed-effects ordinal regression analysis. Computer Methods and Programs in Biomedicine 49, 157–176.

Hinde, R.A., 1997. Relationships: A Dialectical Perspective. Psychology Press, Hove, UK.

Holland, P.W., Leinhardt, S., 1981. An exponential family of probability densities for directed graphs. Journal of the American Statistical Association 76, 33–51.

Hymel, S., Comfort, C., Schonert-Reichl, K., McDougall, P., 1996. Academic failure and school dropout: the influence of peers. In: Juvonen, J., Wentzel, K.R. (Eds.), Social Motivation: Understanding Children’s School Adjustment. Cambridge University Press, Cambridge, pp. 313–345.

Jussim, L., Osgood, D.W., 1989. Influence and similarity among friends: an integrative model applied to incarcerated adolescents. Social Psychology Quarterly 52, 98–112.

Leenders, R.Th.A., 1995. Structure and Influence. Statistical Models for the Dynamics of Actor Attributes, Network Structure and Their Interdependence. Thesis Publishers, Amsterdam.

Mayhew, B.H., McPherson, M., Rotolo, T., Smith-Lovin, L., 1995. Sex and ethnic heterogeneity in face-to-face groups in public places: an ecological perspective on social interaction. Social Forces 74, 15–52.

McPherson, M., Smith-Lovin, L., Cook, J.M., 2001. Birds of a feather: homophily in social networks. Annual Review of Sociology 27, 415–444.

Pattison, P., Wasserman, S., 1999. Logit models and logistic regressions for social networks. II. Multivariate relations. British Journal of Mathematical and Statistical Psychology 52, 169–193.

Robins, G., Elliot, P., Pattison, P., 2001. Network models for social selection processes. Social Networks 23, 1–30.

Robins, G., Pattison, P., Wasserman, S., 1999. Logit models and logistic regressions for social networks. III. Valued relations. Psychometrika 64, 371–394.

Shrum, W., Cheek Jr., N.H., Hunter, S.M., 1988. Friendship in school: gender and racial homophily. Sociology of Education 61, 227–239.

Snijders, T.A.B., 2002. Markov Chain Monte Carlo estimation of exponential random graph models. Journal of Social Structure 3, 2.

Snijders, T.A.B., Baerveldt, C., 2003. A multilevel network study of the effects of delinquent behavior on friendship evolution. Available from: http://www.stat.gamma.rug.nl/snijders/Snba.pdf. Journal of Mathematical Sociology 27, 123–151.

Snijders, T.A.B., Bosker, R.J., 1999. Multilevel Analysis. An Introduction to Basic and Advanced Multilevel Modeling. Sage, London.


Strauss, D., Ikeda, M., 1990. Pseudo-likelihood estimation for social networks. Journal of the American Statistical Association 85, 204–212.

Tajfel, H., Turner, J.C., 1979. An integrative theory of intergroup conflict. In: Austin, W.G., Worchel, S. (Eds.), The Social Psychology of Intergroup Relations. Brooks/Cole, Oxford, pp. 33–47.

Vermeij, L., Baerveldt, C., 2001. Social ethnic segregation in Dutch high school classes: contact versus competition. Paper Presented at SUNBELT XXI in Budapest, Hungary.

Wasserman, S., Pattison, P., 1996. Logit models and logistic regressions for social networks. I. An introduction to Markov graphs and p∗. Psychometrika 61, 401–425.

Wentzel, K.R., 1996. Introduction. New perspectives on motivation at school. In: Juvonen, J., Wentzel, K.R. (Eds.), Social Motivation: Understanding Children’s School Adjustment. Cambridge University Press, Cambridge, pp. 1–8.

Wigfield, A., Eccles, J.S., Rodriguez, D., 1998. The development of children’s motivation in school contexts. Review of Research in Education 23, 73–118.